Critical Care Explorations. 2025 Sep 5;7(9):e1306. doi: 10.1097/CCE.0000000000001306

Progesterone for Traumatic Brain Injury, Experimental Clinical Treatment III Trial Revisited: Objective Classification of Traumatic Brain Injury With Brain Imaging Segmentation and Biomarker Levels

Scarlett Cheong 1,2, Rishabh Gupta 3, Sharada Kadaba Sridhar 1,2, Alex J Hall 4, Michael Frankel 5,6, David W Wright 4,7, Yuk Y Sham 2,8, Uzma Samadani 1,2
PMCID: PMC12417008  PMID: 40911759

Abstract

OBJECTIVE:

This post hoc study of the Progesterone for Traumatic Brain Injury, Experimental Clinical Treatment (ProTECT) III trial investigates whether improving traumatic brain injury (TBI) classification, using serum biomarkers (glial fibrillary acidic protein [GFAP] and ubiquitin carboxyl-terminal esterase L1 [UCH-L1]) and algorithmically assessed total lesion volume, could identify a subset of responders to progesterone treatment, beyond broad measures like the Glasgow Coma Scale (GCS) and Glasgow Outcome Scale-Extended (GOS-E), which may fail to capture subtle changes in TBI recovery.

DESIGN:

Brain lesion volumes on CT scans were quantified using Brain Lesion Analysis and Segmentation Tool for CT. Patients were classified into true-positive and true-negative groups based on an optimization scheme to determine a threshold that maximizes agreement between radiological assessment and objectively measured lesion volume. True-positives were further categorized into low (> 0.2–10 mL), medium (> 10–50 mL), and high (> 50 mL) lesion volumes for analysis with protein biomarkers and injury severity. Correlation analyses linked Rotterdam scores (RSs) with biomarker levels and lesion volumes, whereas Welch’s t-test evaluated biomarker differences between groups and progesterone’s effects.

SETTING:

Forty-nine level 1 trauma centers in the United States.

PATIENT:

Patients with moderate-to-severe TBI.

INTERVENTIONS:

Progesterone.

MEASUREMENTS AND MAIN RESULTS:

GFAP and UCH-L1 levels were significantly higher in true-positive cases with low to medium lesion volume. Only UCH-L1 differed between progesterone and placebo groups at 48 hours. Both biomarkers and lesion volume in the true-positive group correlated with the RS. No sex-specific or treatment differences were found.

CONCLUSIONS:

This study reaffirms elevated levels of GFAP and UCH-L1 as biomarkers for detecting TBI in patients with brain lesions and for predicting clinical outcomes. Despite improved classification using CT-imaging segmentation and serum biomarkers, we did not identify a subset of progesterone responders within 24 or 48 hours of progesterone treatment. More rigorous and quantifiable measures for classifying the nature of injury may be needed to enable development of therapeutics, as neither serum markers nor algorithmic CT analysis performed better than the older Rotterdam or GCS metrics.

Keywords: biomarkers, brain lesion analysis and segmentation tool for CT, deep-learning, glial fibrillary acidic protein, progesterone, traumatic brain injury, ubiquitin carboxyl-terminal esterase L1


KEY POINTS

Question: Does improving traumatic brain injury (TBI) classification using serum biomarkers (glial fibrillary acidic protein [GFAP], ubiquitin carboxyl-terminal esterase L1 [UCH-L1]) and CT lesion volume improve the identification of responders to progesterone treatment in the Progesterone for Traumatic Brain Injury, Experimental Clinical Treatment III trial?

Findings: Progesterone treatment does not improve TBI outcomes. Serum biomarkers GFAP and UCH-L1 were elevated in patients with larger lesions, with UCH-L1 showing a significant difference between progesterone and placebo groups at 48 hours. However, no significant differences were found between progesterone and placebo groups in terms of biomarkers or lesion volume.

Meaning: Improving TBI classification with biomarkers and lesion volume did not identify progesterone responders.

Progesterone has been shown to reduce cerebral edema, neuronal loss, and behavioral deficits in animal models of traumatic brain injury (TBI) (1–6), which led to the Progesterone for Traumatic Brain Injury, Experimental Clinical Treatment (ProTECT) III clinical trial. The trial was discontinued for futility at the final preplanned interim analysis, after enrolling 882 of the planned 1140 participants (7–10). Overly broad classification of injuries and outcomes due to the use of the Glasgow Coma Scale (GCS) and the Glasgow Outcome Scale-Extended (GOS-E) for inclusion and outcome assessment, respectively, was cited as one of the reasons for the potential failure of the ProTECT III trial (1, 6, 7), with similar findings reported elsewhere (11). Furthermore, the broad inclusion criteria limit the ability to detect treatment effects (7, 12). Given the well-known heterogeneity of TBI (13), there is a growing need to identify objective, quantifiable measures that can help elucidate therapeutic response.

In the ProTECT III trial, serum protein biomarker levels of glial fibrillary acidic protein (GFAP) and ubiquitin carboxyl-terminal esterase L1 (UCH-L1) did not differ following progesterone treatment vs controls after brain injury (6). These biomarkers were recently cleared by the Food and Drug Administration (FDA) for identification of patients with a positive head CT (11), and many clinicians intend to use them to help classify the nature of brain injury for clinical trials and as part of clinical care (14, 15). However, their utility in identifying treatment responders has not been explored. Functional classification with GCS and GOS-E lacks granularity and can obscure meaningful clinical change (16, 17). This further motivates the need to incorporate alternative metrics that can provide more detailed injury characterization.

To determine whether we could identify a subset of responders to progesterone in a post hoc analysis of the ProTECT III trial, we analyzed serum protein biomarkers and applied the Brain Lesion Analysis and Segmentation Tool for CT (BLAST-CT), a deep-learning image segmentation tool (18), to improve brain injury classification. Artificial intelligence tools like BLAST-CT enable objective, high-fidelity quantification of lesion burden and heterogeneity, allowing stratification by injury severity, time since injury, and sex (19, 20).

We hypothesized that patients with larger lesion volumes would exhibit higher biomarker levels and might respond to progesterone, given its neuroprotective mechanism. By refining classification using lesion volume and biomarkers, we aimed to identify a subgroup that might benefit from progesterone treatment, which was not evident from the original trial.

METHODS

Ethics Statement

This retrospective analysis used data from the ProTECT trial, which had institutional review board approval at all 49 participating institutions as required under FDA 21 CFR 50.24 (12). When available, written informed consent was obtained from a legally authorized representative before enrollment (12). For patients enrolled under exemption, notification and consent to continue were obtained as soon as practicable (12). Study safety was overseen by a National Institute of Neurological Disorders and Stroke-appointed data and safety monitoring board and two independent monitors (12).

ProTECT Dataset

The ProTECT III trial included 882 participants with moderate to severe nonpenetrating TBI, enrolled within four hours of injury and randomized to receive IV progesterone or placebo for 96 hours (6, 12). The primary outcome was the GOS-E at 6 months, with secondary outcomes including mortality and Disability Rating Score (6, 12). CT scans were classified by radiologists as abnormal or normal based on visible injury.

In the ancillary BIO-ProTECT III study, serum biomarkers GFAP, UCH-L1, S100 calcium-binding protein B (S100B), and α-II-spectrin breakdown products 150 (SBDP150) were collected at baseline, 24, and 48 hours post-injury (21). These biomarkers associated with neuronal and glial injury were chosen for their temporal peak expression at 6–8 hours (UCH-L1, S100B, and SBDP) and at 24–48 hours (GFAP) (21).

Outcome Measures: GOS-E, GCS, and Rotterdam Score

The GCS assesses consciousness based on eye, verbal, and motor response (score range 3–15). The GOS-E expands on the original GOS, categorizing functional recovery into eight levels, from death (1) to upper good recovery (8). The Rotterdam score (RS) classifies injury severity using CT findings (e.g., basal cistern status, midline shift, epidural hematoma, and intraventricular/subarachnoid hemorrhage) to predict mortality and early outcomes. All scores were obtained from the ProTECT dataset.

Cohort Selection

ProTECT III data were obtained from FITBIR (22). Patients received progesterone or a placebo within 4 hours of TBI. Exclusion criteria included missing biomarker profiles (GFAP, S100B, SBDP150, UCH-L1) at any time point, an incomplete transverse brain CT scan, or absent radiographic assessments. Of 882 patients, 778 had complete CT scans and 507 had complete biomarker profiles. To reduce bias, only patients with both imaging and biomarker data were included (23).

Brain Lesion Assessment

Lesion volumes were measured using BLAST-CT (24), a validated deep-learning tool that segments intraparenchymal hemorrhage (IPH), extra-axial hemorrhage (EAH), perilesional edema, and intraventricular hemorrhage (IVH) from CT scans. Digital Imaging and Communications in Medicine (DICOM) images were converted to Neuroimaging Informatics Technology Initiative (NIfTI) format using “dcm2nii” (Columbia, SC) (25). Total volumes were computed by summing all lesion types. Segmentation ran on a Linux workstation (Nvidia RTX 2080), with each scan processed in under 2 minutes (18).
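The volume summation described above can be illustrated with a minimal sketch. It assumes a segmentation has already been flattened into a list of integer voxel labels; the label codes, the function name lesion_volumes_ml, and the toy voxel spacing are hypothetical and are not taken from BLAST-CT itself.

```python
# Hypothetical label codes; the actual BLAST-CT output encoding may differ.
LESION_LABELS = {1: "IPH", 2: "EAH", 3: "edema", 4: "IVH"}


def lesion_volumes_ml(label_voxels, voxel_size_mm):
    """Return per-type and total lesion volume in mL.

    label_voxels: iterable of integer labels, one per voxel (0 = background).
    voxel_size_mm: (dx, dy, dz) voxel spacing in millimetres.
    """
    dx, dy, dz = voxel_size_mm
    voxel_ml = dx * dy * dz / 1000.0  # 1 mL = 1000 cubic millimetres
    counts = {name: 0 for name in LESION_LABELS.values()}
    for label in label_voxels:
        name = LESION_LABELS.get(label)
        if name is not None:  # background and unknown labels are ignored
            counts[name] += 1
    volumes = {name: n * voxel_ml for name, n in counts.items()}
    # Total lesion volume is the sum over all lesion types.
    volumes["total"] = sum(volumes.values())
    return volumes


# Toy input: nine voxels of 1 x 1 x 5 mm, seven of which carry a lesion label.
example = lesion_volumes_ml([0, 1, 1, 2, 3, 3, 3, 4, 0], (1.0, 1.0, 5.0))
```

In practice the voxel spacing would be read from the NIfTI header of each scan rather than hard-coded.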

CT Imaging-Based TBI Group Classification

Machine learning determined the optimal lesion volume threshold using the area under the receiver operating characteristic curve with Youden’s Index (Sensitivity + Specificity − 1). Patients with normal CT findings and BLAST-CT lesion volume less than or equal to 0.2 mL were classified as true negatives. All others were true positives, further stratified into low-, medium-, and high-volume subgroups based on hemorrhage volume.
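As a rough illustration of threshold selection with Youden’s Index, the sketch below scans candidate cutoffs and keeps the one with the largest value of sensitivity + specificity − 1, treating the radiologist read as the reference label. The function name, the exhaustive scan over observed volumes, and the toy data are assumptions made for illustration; the article does not describe its optimization code.

```python
def youden_threshold(volumes_ml, ct_abnormal):
    """Pick the volume cutoff that maximizes Youden's J.

    volumes_ml: lesion volumes, one per patient.
    ct_abnormal: matching booleans from the radiologist read (True = abnormal).
    A patient is called positive when volume > cutoff.
    """
    positives = sum(ct_abnormal)
    negatives = len(ct_abnormal) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one normal and one abnormal scan")
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(volumes_ml)):
        tp = sum(1 for v, y in zip(volumes_ml, ct_abnormal) if y and v > cutoff)
        tn = sum(1 for v, y in zip(volumes_ml, ct_abnormal) if not y and v <= cutoff)
        j = tp / positives + tn / negatives - 1.0  # sensitivity + specificity - 1
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Synthetic volumes (mL) and reads, chosen only to exercise the function.
toy_volumes = [0.0, 0.1, 0.2, 0.15, 0.5, 3.0, 12.0, 60.0]
toy_labels = [False, False, False, True, True, True, True, True]
cutoff, j = youden_threshold(toy_volumes, toy_labels)
```

On this toy input the selected cutoff is 0.2 mL with J = 0.8; the values are synthetic and are not the trial data.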

TBI Group Classification

A lesion volume threshold of 0.2 mL optimized classification performance (Youden’s Index-maximized F1 score = 0.84; accuracy = 75%), identifying 334 true-positive and 49 true-negative cases based on radiology assessment. False negatives (n = 110) and false positives (n = 14) were excluded to enable robust comparisons between true-positive and true-negative groups (Supplemental Table 1, https://links.lww.com/CCX/B540).
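The reported performance figures can be checked against the four cell counts given above. The sketch below is a plain confusion-matrix calculation; the function name is hypothetical, and only the counts come from the article.

```python
def classification_metrics(tp, tn, fp, fn):
    """Standard confusion-matrix summaries from the four cell counts."""
    total = tp + tn + fp + fn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "n": total,
        "accuracy": (tp + tn) / total,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "youden_j": sensitivity + specificity - 1.0,
    }


# Counts reported in the text: 334 TP, 49 TN, 14 FP, 110 FN.
metrics = classification_metrics(tp=334, tn=49, fp=14, fn=110)
```

With these counts the cohort size is 507, the F1 score is about 0.84, and the accuracy is about 0.755, in line with the reported 75%.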

True positives were subdivided by lesion volume: low (> 0.2–10 mL, n = 232), medium (> 10–50 mL, n = 89), and high (> 50 mL, n = 13). These thresholds were determined through visual inspection of Figure 1 to ensure meaningful cohort sizes. Most patients (73.76%) had lesion volumes less than or equal to 5 mL, with the full range spanning 0–150 mL. Although statistical comparisons for the high-volume group remain limited due to small sample size, the low- and medium-volume groups provide the most reliable comparisons.
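The volume bands can be expressed as a small lookup, sketched here under the assumption that each upper bound is inclusive (> 0.2–10 mL, > 10–50 mL, > 50 mL), as stated in the abstract. The function name and the "below threshold" label are illustrative only.

```python
def volume_group(volume_ml):
    """Map a total lesion volume (mL) to the band used for stratification."""
    if volume_ml <= 0.2:
        return "below threshold"  # at or under the 0.2 mL classification cutoff
    if volume_ml <= 10:
        return "low"
    if volume_ml <= 50:
        return "medium"
    return "high"


groups = [volume_group(v) for v in (0.2, 0.3, 10, 10.1, 50, 75)]
```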

Figure 1.

Histogram distribution of total brain lesion volume from 0 to 140 mL (A) and a closeup from 0 to 4 mL (B).

Statistical Analysis

Biomarker levels were compared between true-positive and true-negative groups at baseline, 24 hours, and 48 hours. Subgroup analyses included comparisons among low-, medium-, and high-volume true positives and true negatives, as well as between progesterone and placebo groups within each TBI classification. Additional comparisons were conducted within the true-positive group to assess severity- and sex-specific treatment effects. Two-tailed Welch’s t-tests, with significance set at p < 0.05, were performed in RStudio (version 2021.09.1; R Foundation for Statistical Computing, Vienna, Austria). We acknowledge the risk of type I error from multiple comparisons and consider false discovery rate correction.
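For readers who want to see the mechanics, the sketch below computes Welch’s t statistic with the Welch-Satterthwaite degrees of freedom, followed by a Benjamini-Hochberg step as one common form of false discovery rate control. It is not the authors' R code: the function names and toy samples are hypothetical, and treating four p values from Table 1 (true-positive group, 48 hours) as a complete family of tests is an assumption made purely for illustration.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    # statistics.variance is the sample variance (divides by n - 1).
    se2_a = variance(sample_a) / na
    se2_b = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


def benjamini_hochberg(p_values, alpha=0.05):
    """Return booleans marking which hypotheses are rejected at FDR level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            largest_rank = rank  # keep the largest rank that passes its bound
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= largest_rank:
            rejected[idx] = True
    return rejected


# Synthetic samples, chosen only to exercise the function.
t_stat, dof = welch_t([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 6.0, 8.0, 10.0])

# p values for GFAP, UCH-L1, S100B, SBDP (Table 1, true-positive group, 48 hr).
flags = benjamini_hochberg([0.930, 0.008, 0.042, 0.153])
```

The p value itself requires the t distribution with the computed degrees of freedom; in R, t.test uses the Welch form by default, and in Python scipy.stats.ttest_ind with equal_var=False does the same.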

RESULTS

Biomarker Levels Between True-Negative and True-Positive Groups

Biomarker levels (GFAP, UCH-L1, S100B, and SBDP) were compared between true-negative and true-positive groups, as well as within low-, medium-, and high-volume subgroups of the true-positive group, at baseline, 24 hours, and 48 hours (Fig. 2 and Table 2).

At baseline, the true-positive group had significantly higher biomarker levels. GFAP was approximately eightfold higher in the true-positive group (11.118 ng/mL vs. 1.347 ng/mL, p < 0.001), and UCH-L1 was nearly double (7.387 ng/mL vs. 3.760 ng/mL, p < 0.001). S100B and SBDP were also significantly elevated in the true-positive group (S100B: 0.462 ng/mL vs. 0.221 ng/mL, p < 0.001; SBDP: 0.335 ng/mL vs. 0.213 ng/mL, p = 0.049). The GFAP, UCH-L1, and S100B levels in the medium-volume subgroup were significantly higher than those in the low-volume subgroup, whereas the high-volume group did not consistently show statistical significance.

At 24 hours, GFAP and UCH-L1 remained significantly higher in the true-positive group (GFAP: 10.547 ng/mL vs. 1.434 ng/mL, p < 0.001; UCH-L1: 0.708 ng/mL vs. 0.324 ng/mL, p < 0.001). Similar trends were observed for S100B and SBDP. Low- and medium-volume subgroups had significant differences compared with the true-negative group, whereas the high-volume group showed inconsistent results.

At 48 hours, the true-positive group continued to exhibit elevated biomarker levels. GFAP (7.116 ng/mL vs. 0.611 ng/mL, p < 0.001) and UCH-L1 (0.334 ng/mL vs. 0.149 ng/mL, p < 0.001) remained significantly higher. Low- and medium-volume groups maintained these differences, whereas the high-volume group showed less consistent results.

Figure 2.

Boxplots illustrating temporal variations in the levels of glial fibrillary acidic protein (GFAP) (A), ubiquitin carboxyl-terminal esterase L1 (UCH-L1) (B), α-II-spectrin breakdown products (SBDP) (C), and S100 calcium-binding protein B (S100B) (D) in the true-negative and true-positive groups, shown separately for placebo and progesterone treatment at baseline, 24 hours, and 48 hours. ***p < 0.001.

TABLE 1.

Biomarker Levels of Subjects From the Established True-Negative and True-Positive Groups

Biomarker | True-Negative Placebo (Mean) | True-Negative Progesterone (Mean) | p | True-Positive Placebo (Mean) | True-Positive Progesterone (Mean) | p
Baseline (ng/mL) | n = 21 | n = 28 | | n = 166 | n = 168 |
GFAP | 0.639 | 1.852 | 0.057 | 10.580 | 11.650 | 0.627
UCH-L1 | 3.127 | 4.236 | 0.325 | 7.160 | 7.613 | 0.694
S100B | 0.217 | 0.224 | 0.925 | 0.472 | 0.452 | 0.678
SBDP | 0.195 | 0.225 | 0.630 | 0.331 | 0.338 | 0.946
24 hr (ng/mL) | n = 18 | n = 25 | | n = 163 | n = 160 |
GFAP | 0.965 | 1.771 | 0.043 | 10.348 | 10.750 | 0.872
UCH-L1 | 0.334 | 0.317 | 0.881 | 0.692 | 0.724 | 0.806
S100B | 0.041 | 0.034 | 0.494 | 0.101 | 0.093 | 0.670
SBDP | 0.071 | 0.052 | 0.127 | 0.075 | 0.089 | 0.438
48 hr (ng/mL) | n = 14 | n = 24 | | n = 158 | n = 154 |
GFAP | 0.270 | 0.809 | 0.003 | 7.192 | 7.037 | 0.930
UCH-L1 | 0.148 | 0.150 | 0.971 | 0.424 | 0.261 | 0.008
S100B | 0.027 | 0.022 | 0.397 | 0.077 | 0.050 | 0.042
SBDP | 0.060 | 0.047 | 0.154 | 0.084 | 0.061 | 0.153

GFAP = glial fibrillary acidic protein, SBDP150 = α-II-spectrin breakdown products 150, S100B = S100 calcium-binding protein B, UCH-L1 = ubiquitin carboxyl-terminal esterase L1.

GOS-E Scores and Biomarker Levels

Correlation analysis between GOS-E score and the serum protein biomarker levels (GFAP, UCH-L1, S100B, SBDP) at baseline for the true-positive group is shown in Figure 3, A and B. GFAP showed a strong inverse correlation with the GOS-E score (R² = 0.54), consistent with the expectation that higher biomarker levels indicate greater glial and neuronal injury and are therefore associated with poorer clinical outcomes and a lower GOS-E score. Interestingly, both SBDP and S100B, which were also used in the BIO-ProTECT secondary analysis, exhibited weak correlations with GOS-E (R² = 0.11 and 0.02, respectively). Of the four protein biomarkers, GFAP and UCH-L1 were more reliable than SBDP and S100B for assessing TBI severity and predicting functional outcomes.
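To make the R² values concrete, the sketch below computes the Pearson correlation coefficient and its square for a simple linear relationship. The function name and the toy data are hypothetical; the article does not state how biomarker values were aggregated before fitting, so this shows only the general calculation.

```python
import math


def pearson_r_squared(x, y):
    """Return (r, r squared) for paired observations x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    return r, r * r


# Synthetic example: an outcome score that falls as a biomarker level rises.
r, r2 = pearson_r_squared([1.0, 2.0, 3.0, 4.0, 5.0], [8.0, 7.0, 5.0, 4.0, 1.0])
```

A negative r with a large R² corresponds to the kind of inverse relationship described for GFAP and GOS-E, although the toy numbers here are not trial data.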

Figure 3:

Scatterplots illustrating the relationship of biomarker level (ng/mL) or total Brain Lesion Analysis and Segmentation Tool for CT (BLAST-CT) volume (mL) to the Glasgow Outcome Scale-Extended (GOS-E) or Rotterdam scores in the true-positive group. A, The relationship between glial fibrillary acidic protein (GFAP) and ubiquitin carboxyl-terminal esterase L1 (UCH-L1) biomarker levels and GOS-E scores. B, The relationship between total BLAST-CT volume and GOS-E scores. C, The relationship between GFAP and UCH-L1 biomarker levels and Rotterdam scores. D, The relationship between total BLAST-CT volume and Rotterdam scores.

Rotterdam Score, Biomarker Levels, and Total Lesion Volume

The RS is an objective classification system using a non-contrast head CT scan aimed at predicting mortality and early functional outcomes in TBI patients (26). To further establish the reliability of using the serum protein biomarkers and total lesion volume as alternative and objective quantifiable measures for assessing TBI severity and for predicting clinical outcome, correlation analysis was performed for the true-positive group between the RS and the protein biomarker levels and total lesion volumes at baseline, as shown in Figure 3, C and D. GFAP, UCH-L1, and total lesion volume showed strong positive correlations with the RS, with observed R² values of 0.92, 0.85, and 0.80, respectively.

Correlation Between Lesion Volume and Biomarker Expression

To explore whether larger brain lesion volume was associated with higher protein biomarker levels, we examined the total lesion volume as a continuous variable in relation to baseline biomarker expression (GFAP, UCH-L1, SBDP, and S100B). Patients with low lesion volumes (< 10 mL) had large variability in protein biomarker expression levels. No direct correlation was observed between any of the four biomarkers and total lesion volume.

Temporal Effect of Progesterone on TBI

Biomarker levels in the true-negative and true-positive groups were compared between the placebo and progesterone arms at baseline, 24 hours, and 48 hours (Table 1). At baseline, GFAP, UCH-L1, S100B, and SBDP levels showed no significant differences between placebo and progesterone groups. At 24 hours, GFAP levels in the true-negative group were significantly higher in the progesterone group (1.771 ng/mL) vs. placebo (0.965 ng/mL, p = 0.043), whereas other biomarkers showed no significant differences. In the true-positive group, no significant differences were observed for any biomarkers. At 48 hours, GFAP levels remained higher in the progesterone group (0.809 ng/mL) vs. placebo (0.270 ng/mL, p = 0.003) for true negatives. In the true-positive group, UCH-L1 and S100B levels were lower in the progesterone group (p = 0.008 and p = 0.042, respectively), whereas GFAP and SBDP showed no significant differences.

TABLE 2.

Comparison of Biomarker Levels Across True-Negative and True-Positive Groups Stratified by Lesion Volume at Baseline, 24 Hours, and 48 Hours

Biomarker | True Negative (Mean) | True Positive (Mean) | p | True-Positive Low-Volume Group (Mean) | p | True-Positive Medium-Volume Group (Mean) | p | True-Positive High-Volume Group (Mean) | p
Baseline (ng/mL) | n = 49 | n = 334 | | n = 232 | | n = 89 | | n = 13 |
GFAP | 1.347 | 11.118 | < 0.001 | 7.003 | < 0.001 | 21.006 | < 0.001 | 16.852 | 0.057
UCH-L1 | 3.760 | 7.387 | < 0.001 | 6.956 | < 0.001 | 8.683 | < 0.001 | 6.200 | 0.413
S100B | 0.221 | 0.462 | < 0.001 | 0.444 | < 0.001 | 0.522 | < 0.001 | 0.382 | 0.316
SBDP | 0.213 | 0.335 | 0.049 | 0.387 | 0.032 | 0.227 | 0.730 | 0.126 | 0.059
24 hr (ng/mL) | n = 43 | n = 323 | | n = 227 | | n = 85 | | n = 11 |
GFAP | 1.434 | 10.547 | < 0.001 | 7.481 | < 0.001 | 19.183 | < 0.001 | 6.810 | 0.054
UCH-L1 | 0.324 | 0.708 | < 0.001 | 0.656 | 0.001 | 0.866 | < 0.001 | 0.529 | 0.604
S100B | 0.037 | 0.097 | < 0.001 | 0.074 | < 0.001 | 0.161 | < 0.001 | 0.0778 | 0.141
SBDP | 0.060 | 0.082 | 0.035 | 0.086 | 0.040 | 0.073 | 0.265 | 0.055 | 0.801
48 hr (ng/mL) | n = 38 | n = 312 | | n = 221 | | n = 80 | | n = 11 |
GFAP | 0.611 | 7.116 | < 0.001 | 4.531 | < 0.001 | 13.634 | < 0.001 | 11.143 | 0.083
UCH-L1 | 0.149 | 0.334 | < 0.001 | 0.293 | < 0.001 | 0.451 | < 0.001 | 0.278 | 0.312
S100B | 0.024 | 0.064 | < 0.001 | 0.054 | < 0.001 | 0.089 | < 0.001 | 0.075 | 0.079
SBDP | 0.052 | 0.071 | 0.022 | 0.069 | 0.078 | 0.082 | 0.129 | 0.073 | 0.518

GFAP = glial fibrillary acidic protein, SBDP150 = α-II-spectrin breakdown products 150, S100B = S100 calcium-binding protein B, UCH-L1 = ubiquitin carboxyl-terminal esterase L1.

Effects of Progesterone on Biomarker Levels Across Lesion Volume and Time Since Injury

We analyzed GFAP, UCH-L1, S100B, and SBDP levels at baseline, 24 hours, and 48 hours in the true-positive group by lesion volume (low, medium, high) and treatment (placebo, progesterone) (Table 3). At baseline, GFAP levels in the low-volume progesterone group were higher than placebo (7.468 ng/mL vs. 6.514 ng/mL), although not significantly (p = 0.484). At 24 hours, GFAP levels in the low-volume subgroup had risen slightly in both arms and remained higher with progesterone (7.810 ng/mL vs. 7.158 ng/mL), but not significantly (p = 0.704). At 48 hours, UCH-L1 levels showed a significant decrease in the low-volume progesterone group compared with placebo (p = 0.038). S100B levels showed no significant differences across timepoints. SBDP levels also showed no significant change until 48 hours, when a significant decrease was observed in the progesterone group (p = 0.046).

TABLE 3.

Biomarker Levels Across Volume Categories and Treatment Groups Over Time

Biomarker | Low-Volume Placebo (Mean) | Low-Volume Progesterone (Mean) | p | Medium-Volume Placebo (Mean) | Medium-Volume Progesterone (Mean) | p | High-Volume Placebo (Mean) | High-Volume Progesterone (Mean) | p
Baseline (ng/mL) | n = 113 | n = 119 | | n = 46 | n = 43 | | n = 7 | n = 6 |
GFAP | 6.514 | 7.468 | 0.484 | 19.239 | 22.897 | 0.588 | 19.307 | 13.989 | 0.728
UCH-L1 | 6.640 | 7.256 | 0.665 | 8.496 | 8.884 | 0.847 | 6.724 | 5.588 | 0.848
S100B | 0.435 | 0.452 | 0.760 | 0.586 | 0.452 | 0.197 | 0.319 | 0.457 | 0.690
SBDP | 0.394 | 0.380 | 0.926 | 0.207 | 0.248 | 0.409 | 0.121 | 0.132 | 0.876
24 hr (ng/mL) | n = 114 | n = 113 | | n = 43 | n = 42 | | n = 6 | n = 5 |
GFAP | 7.158 | 7.810 | 0.704 | 19.483 | 18.877 | 0.940 | 5.505 | 8.377 | 0.598
UCH-L1 | 0.637 | 0.676 | 0.820 | 0.856 | 0.877 | 0.895 | 0.558 | 0.494 | 0.861
S100B | 0.081 | 0.067 | 0.336 | 0.160 | 0.162 | 0.975 | 0.082 | 0.074 | 0.885
SBDP | 0.080 | 0.094 | 0.547 | 0.065 | 0.080 | 0.448 | 0.065 | 0.044 | 0.523
48 hr (ng/mL) | n = 110 | n = 111 | | n = 42 | n = 38 | | n = 6 | n = 5 |
GFAP | 4.425 | 4.641 | 0.861 | 13.694 | 13.568 | 0.982 | 12.411 | 9.620 | 0.805
UCH-L1 | 0.350 | 0.235 | 0.038 | 0.555 | 0.338 | 0.116 | 0.318 | 0.218 | 0.663
S100B | 0.065 | 0.043 | 0.181 | 0.111 | 0.066 | 0.081 | 0.069 | 0.086 | 0.773
SBDP | 0.082 | 0.057 | 0.174 | 0.089 | 0.075 | 0.718 | 0.095 | 0.046 | 0.428

GFAP = glial fibrillary acidic protein, SBDP150 = α-II-spectrin breakdown products 150, S100B = S100 calcium-binding protein B, UCH-L1 = ubiquitin carboxyl-terminal esterase L1.

Sex-Specific Effects of Progesterone on TBI

Biomarker levels (GFAP, UCH-L1, S100B, and SBDP) were analyzed by sex and treatment (placebo, progesterone) at baseline, 24 hours, and 48 hours (Fig. 3 and Supplemental Table 2, https://links.lww.com/CCX/B540). At baseline, no significant differences were found between the female placebo and progesterone groups for any biomarkers. GFAP levels were 10.886 ng/mL (placebo) vs. 15.447 ng/mL (progesterone) in females (p = 0.340) and 10.450 ng/mL vs. 10.502 ng/mL in males (p = 0.998). At 24 hours, GFAP levels were not significantly different in females on progesterone (7.072 ng/mL) compared with placebo (9.808 ng/mL) (p = 0.271). Male GFAP levels were not significantly different from female levels with progesterone (p = 0.683). By 48 hours, UCH-L1 levels were higher in males vs. females on progesterone (0.277 ng/mL vs. 0.202 ng/mL, p = 0.048). Females on progesterone trended toward a reduction in UCH-L1 compared with placebo, but this did not reach significance (p = 0.083). Other biomarkers showed no significant differences between sexes or treatments.

Biomarker Levels in False-Negative and False-Positive Groups

To further explore potential biochemical differences missed by volumetric CT classification, we examined serum biomarker concentrations in the false-negative (FN) and false-positive (FP) cohorts, stratified by treatment group and time point (Supplemental Table 3, https://links.lww.com/CCX/B540). At baseline, FN patients had comparable levels of GFAP, UCH-L1, S100B, and SBDP between the placebo and progesterone arms, with no significant differences observed. This pattern persisted at 24 hours and 48 hours post-injury. In contrast, FP patients demonstrated numerically higher GFAP and UCH-L1 levels in the progesterone group across all time points, although the differences did not reach statistical significance, likely owing to small sample sizes. These findings suggest that, despite being misclassified by CT volume thresholds, both FN and FP groups may exhibit biomarker signatures indicative of brain injury, supporting the utility of multi-modal assessment.

Associations Between Biomarker Levels, BLAST-CT Volume, and Clinical Scores in Misclassified Patients

We evaluated the relationships between serum biomarkers, total BLAST-CT lesion volume, and clinical outcome scores in the combined false-negative and false-positive cohorts (Supplemental Fig. 1, https://links.lww.com/CCX/B540). GFAP and UCH-L1 levels showed inverse associations with GOS-E scores, suggesting that higher biomarker concentrations were associated with poorer neurologic outcomes (Supplemental Fig. 2A, https://links.lww.com/CCX/B540). Total BLAST-CT volume, however, exhibited no significant relationship with GOS-E (R² = 0.02; Supplemental Fig. 2B, https://links.lww.com/CCX/B540). In contrast, both GFAP and UCH-L1 levels demonstrated a positive correlation with Rotterdam CT scores (Supplemental Fig. 2C, https://links.lww.com/CCX/B540), indicating alignment with more severe radiological presentation. Similarly, total lesion volume increased in parallel with higher RSs (R² = 0.50; Supplemental Fig. 2D, https://links.lww.com/CCX/B540). These results suggest that, in cases where volumetric thresholds misclassify injury severity, biomarker levels may offer more reliable correlation with clinical and radiologic severity indices.

Clinical Outcomes by Lesion Type and Treatment Group

To further investigate how lesion types interact with treatment response, we stratified mean clinical outcomes by lesion presence and treatment arm (Supplemental Table 4, https://links.lww.com/CCX/B540). Across the entire cohort, patients with EAH and edema had relatively higher mean outcome scores in both treatment arms, with minimal differences between groups. Among patients with only a single lesion type, we observed similar trends, although sample sizes were limited. Notably, patients with only edema or only IVH demonstrated numerically higher outcome scores compared with those with only IPH, regardless of treatment assignment. However, no statistically significant differences were observed between the placebo and progesterone groups for any lesion type.

DISCUSSION

The biomarkers studied (GFAP, UCH-L1, S100B, and SBDP) are associated with different aspects of neuronal and glial injury (6, 21). GFAP, elevated after TBI due to astrocytic damage (6, 21, 27), showed significant differences between true-positive and true-negative groups, except in the high-volume lesion group, possibly due to the small sample size (n = 13). UCH-L1, which indicates neuronal damage, along with S100B and SBDP, markers of glial and cytoskeletal damage, respectively, also showed changes consistent with brain injury (6, 21, 27). Of the four protein biomarkers, only GFAP and UCH-L1 correlated with GOS-E and RS. GFAP showed a stronger correlation to RS than UCH-L1, whereas UCH-L1 showed a stronger correlation to GOS-E than GFAP. Previous analyses have shown that GFAP and UCH-L1 were independent of other variables in predicting outcome (21). In this study, we examined only the CT-positive cohort with clinical, radiographic, and biomarker assessment. Our results indicate that, even with better classification of injury, outcomes did not differ between patients receiving placebo or progesterone at any time point, further solidifying the conclusion of ProTECT III that progesterone has a limited clinical role in treating TBI (6, 12, 21).

One major challenge that affected the ProTECT III trial was the use of broad inclusion (GCS) and outcome measures (GOS-E). In this study, we quantified brain lesion volumes from available CT scans, using machine learning to classify patients with detectable brain injury confirmed by radiological assessment. We further assessed the extent of physical brain injury using serum protein biomarker levels (GFAP, UCH-L1) and determined that lesion volumes are directly correlated with the RS, which is based on a non-contrast CT scan for predicting mortality and early functional outcomes in TBI patients. Ultimately, using these newer and more objective techniques for classification of injury still did not identify a subset of responders to progesterone. As more and more clinical trials plan to use serum biomarkers and algorithmic image analysis to better classify patients for inclusion, the ProTECT III trial serves as a caveat that such measures have not yet performed better than the RS or GCS score.

The ProTECT data did not include detailed clinical variables such as surgical decompression, intracranial pressure monitoring and management, use of osmotic therapy, duration of coma, ventilator days, or the need for tracheostomy and percutaneous endoscopic gastrostomy placement. The inclusion of these secondary clinical outcomes may reveal subtle treatment effects of progesterone in TBI-positive patients. Future trials incorporating such clinical parameters may better elucidate progesterone’s role in this population.

The analysis revealed that there were no significant differences in GOS-E scores between the two groups in any of the combinations of BLAST volume and biomarker levels. For instance, in the low BLAST and low GFAP category, GOS-E scores for the placebo and progesterone groups were comparable, with p values confirming the lack of statistically significant variation in either total BLAST volume or biomarker levels. This trend persisted across all categories, including medium and high BLAST volumes paired with varying biomarker levels. Even in subgroups with elevated BLAST volumes and higher levels of GFAP or UCH-L1, where greater injury severity might have been expected to elicit differential treatment effects, no meaningful differences in GOS-E scores were observed. This consistency suggests that progesterone treatment does not significantly influence clinical outcomes as measured by functional recovery in any of the evaluated subsets. The uniformity of results across these stratified combinations highlights a broader conclusion: although the stratification by BLAST and biomarker levels provides a nuanced framework for analysis, progesterone appears to have no measurable impact on the functional outcomes of patients in this cohort.

These findings underscore the limited efficacy of progesterone in altering recovery trajectories as measured by GOS-E scores, regardless of the injury severity (as indicated by BLAST volume) or the associated serum biomarker levels. This observation suggests that other factors, possibly unrelated to these stratifications, may play a more pivotal role in determining patient outcomes in this context.

Our study aimed to determine whether refined classification using biomarkers and lesion volume could identify a subgroup of patients responsive to progesterone treatment. However, our findings reinforce that progesterone does not improve clinical outcomes in any subgroup. We do not assume progesterone’s efficacy but rather sought to test whether improved classification could uncover responders. Given the lack of signal, our findings further support that TBI recovery improvements are more likely due to best supportive care and rehabilitation rather than pharmacologic intervention.

Limitations and Future Directions

This study had several limitations. First, brain lesions were segmented using BLAST-CT rather than manual radiological assessment. BLAST-CT provides high specificity at the expense of sensitivity, particularly for hemorrhages, which may have weakened associations between lesion volume and biomarker levels. Second, our analysis focused solely on vascular injury, excluding diffuse axonal injury (DAI), a key contributor to poor neurologic outcomes that is not easily detectable on CT scans.

Furthermore, this study used a true-negative subgroup with diagnosed TBI rather than healthy controls, potentially elevating baseline biomarker levels. The small sample size in the high-volume group limited statistical comparisons. Volume thresholds for defining the low-, medium-, and high-volume subgroups were selected empirically and may benefit from further refinement. Finally, BLAST-CT does not account for lesion heterogeneity (e.g., intra-axial vs. extra-axial lesions), which may affect biomarker expression. Future work could incorporate lesion specificity and non-vascular injury, such as DAI, to enhance TBI classification.
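
Because the volume thresholds were chosen empirically, it can help to keep them as adjustable parameters. The following is a minimal sketch, assuming the cutoffs reported in the abstract (low: > 0.2–10 mL, medium: > 10–50 mL, high: > 50 mL). The function name, the labels, and the treatment of volumes at or below 0.2 mL as "below threshold" are illustrative assumptions inferred from the lower bound of the low category, not a description of the study's code.

```python
from bisect import bisect_left

# Upper-inclusive cutoffs in mL, taken from the abstract's category bounds.
DEFAULT_CUTOFFS = (0.2, 10.0, 50.0)
# The "below threshold" label is a hypothetical name for volumes <= 0.2 mL.
LABELS = ("below threshold", "low", "medium", "high")


def volume_category(volume_ml, cutoffs=DEFAULT_CUTOFFS):
    """Map a total lesion volume (mL) to a category label."""
    if volume_ml < 0:
        raise ValueError("lesion volume cannot be negative")
    # bisect_left places a value equal to a cutoff in the lower bin,
    # so each bin is (previous cutoff, cutoff].
    return LABELS[bisect_left(cutoffs, volume_ml)]


for v in (0.1, 0.2, 5.0, 10.0, 25.0, 50.0, 80.0):
    print(v, volume_category(v))
```

Passing a different `cutoffs` tuple makes it straightforward to check how sensitive subgroup sizes are to the threshold choice.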

CONCLUSIONS

Improved TBI classification is needed to identify treatment responders for therapeutic development. In this post hoc analysis of the ProTECT III trial, we combined brain imaging segmentation with protein biomarker levels to identify potential progesterone responders. GFAP and UCH-L1 were validated as biomarkers of CT-visible brain injury. No subgroup defined by injury severity or sex benefited from progesterone treatment. These findings reaffirm that progesterone has limited therapeutic value in TBI.

Ms. Cheong performed computational experiments, conducted data analysis, interpreted results, and wrote the article. Dr. Sham designed computational experiments, supervised data analysis, interpreted results, and wrote the article. Dr. Samadani designed computational experiments, interpreted results, and wrote the article. Drs. Wright, Frankel, and Hall designed clinical experiments, collected data, and wrote the article. Mrs. Gupta and Ms. Kadaba Sridhar wrote the article.

Supplementary Material

cc9-7-e1306-s001.pdf (1.1MB, pdf)

Footnotes

This study was funded by the Minnesota Office of Higher Education (grant number: 165646); by grants from the National Institute of Neurological Disorders and Stroke of the National Institutes of Health (NS062778, 5U10NS059032, U01NS056975) and the National Center for Advancing Translational Sciences of the National Institutes of Health (UL1TR000454); and by the Emory Emergency Neurosciences Laboratory in the Department of Emergency Medicine, Emory School of Medicine, and Grady Memorial Hospital.

The authors have disclosed that they do not have any potential conflicts of interest.

Supplemental digital content is available for this article. Direct URL citations appear in the printed text and are provided in the HTML and PDF versions of this article on the journal’s website (http://journals.lww.com/ccejournal).

Contributor Information

Scarlett Cheong, Email: cheon068@umn.edu.

Rishabh Gupta, Email: gupta324@umn.edu.

Sharada Kadaba Sridhar, Email: kadab004@umn.edu.

Alex J. Hall, Email: alex.hall@emory.edu.

Michael Frankel, Email: mfranke@emory.edu.

David W. Wright, Email: david.wright@emory.edu.


Articles from Critical Care Explorations are provided here courtesy of Wolters Kluwer Health
