Function. 2026 Jan 19;7(2):e090-2025. doi: 10.1152/function.090.2025

Machine learning integrated extracellular vesicle proteome analysis for early markers of bronchopulmonary dysplasia

Shaili Amatya 1, Shawn Rice 2, Anne Stanley 3, Han Chen 4, Ann Donnelly 1, Heather Stephens 1, Roopa Siddaiah 5, Chandra P Belani 6, Zissis C Chroneos 1,7
PMCID: PMC12934772  PMID: 41552916

Abstract

Bronchopulmonary dysplasia (BPD) is a serious and often lethal complication of preterm birth that typically manifests about 1 mo after preterm delivery. The lungs of premature infants are underdeveloped and vulnerable to mechanical damage, inflammation, and oxidative stress. Collectively, these stressors impair the normal alveolarization of the premature lungs after birth. The multifactorial pathophysiology of BPD necessitates the identification of the molecular factors that mediate cell-to-cell communication that discriminates normal lung development from progression to BPD. Extracellular vesicles (EVs) mediate intercellular cross talk by transporting functional molecules, including proteins and nucleic acids, to recipient cells through biological fluids. This feasibility study determined the utility of profiling the discarded plasma-derived EV proteome to predict BPD susceptibility risk in extremely preterm infants. Discarded plasma was obtained from routine laboratory draws from infants born at less than 32 wk of gestation and weighing less than 1,500 g. Plasma EVs were captured using a magnetic bead-based immunoaffinity method. Subsequently, a mass spectrometry and differential protein content analysis workflow identified a novel nine-EV-protein signature [apolipoprotein D (APOD), heterogeneous nuclear ribonucleoprotein M (HNRNPM), high-mobility group nucleosome-binding domain-containing protein 2 (HMGN2), intelectin-1 (ITLN1), proteinase 3 (PRTN3), RNA-binding protein 4 (RBM4), RNA-binding motif protein, X chromosome (RBMX), TATA-binding protein-associated factor 2N (TAF15), and transcription elongation regulator 1 (TCERG1)] that distinguished preterm infants who developed BPD from those who did not. Application of machine learning statistical modeling using the Promor tool trained on the nine-protein signature template identified a high specificity and selectivity prognostic threshold for the development of BPD.
HNRNPM emerged as the most consistent biological response component predicting development of BPD in our patient cohort. Our study suggests that circulating EVs derived from discarded plasma are a suitable “liquid biopsy” to help stratify the vulnerability risk for BPD in preterm infants.

Keywords: BPD, bronchopulmonary dysplasia, extracellular vesicles, machine learning, proteomics

INTRODUCTION

The incidence of preterm birth remains significant, affecting 1 in every 10 infants born in the United States (1), despite advancements in perinatal care practices. Improved survival rates for extremely preterm infants have also led to a rise in the incidence of bronchopulmonary dysplasia (BPD). The global incidence of BPD varies widely, ranging from 10%–73% in Europe, 18%–89% in North America, and 18%–82% in Asia (2). Preterm infants with BPD often have prolonged hospital stays, high healthcare costs, and long-term pulmonary morbidity (3–12).

Preterm infants who develop BPD experience disrupted lung development and persistent or chronic inflammation and scarring of the lung. A combination of factors, including genetic predisposition, prenatal inflammation, oxidative stress, ventilation-induced lung injury, and sepsis in the postnatal period, contributes to the development of BPD (8, 13–16). BPD is only diagnosed at 36 wk postmenstrual age (PMA) (17), which is far too late for preventive therapies. Integrating early molecular predictors may enhance the accuracy of clinical risk predictors such as the National Institute of Child Health and Human Development (NICHD) BPD estimator (15) and the ability to develop precision medicine for individualized care of patients with BPD. The study of biomarkers in BPD (15) may not fully represent the complex intercellular communication that occurs during lung development and the ongoing chronic inflammation in the body.

Extracellular vesicles (EVs) are emerging as important mediators of cell-cell communication. EVs are cell-derived, nanosized, membrane-bound vesicles that play a key role in intercellular communication within and among tissues. In addition, EVs can cross physiological barriers and influence recipient cells by transferring functional molecules between different tissues and tissue compartments.

EVs carry multiple types of bioactive cargo encompassing lipids, proteins (including enzymes, chaperones, chemokines, receptors, signaling molecules, and tetraspanins), nucleic acids (including mRNA, miRNA, and lncRNA), and peptides. Exosomes are ∼30–100 nm in size, formed within multivesicular endosomes inside cells, and are released when the endosomes fuse with the plasma membrane. Microvesicles, on the other hand, range from 100 to 1,000 nm and bud directly from the plasma membrane. The formation of EVs in nearly all cell types and their detection in various body fluids, including blood, urine, saliva, breast milk, cerebrospinal fluid, and respiratory aspirates, underscore the biological significance of EVs (18). As membrane-bound structures, EVs can be stored long term and repeatedly processed for analysis of different molecule types after repeated freeze-thaw cycles.

An emerging body of evidence indicates that EVs collected from different tissue sites can provide clinically relevant information on susceptibility to BPD. Studies on EVs derived from tracheal aspirates have identified microRNA 876-3p as a protective molecule against BPD, with experimental studies supporting the notion that microRNA 876-3p improves alveologenesis and reduces inflammation in hyperoxia mouse models of BPD (19). In addition, studies examining umbilical cord blood-derived EVs have shown decreased expression of microRNA 103a-3p and microRNA 185-5p, whereas the level of microRNA 200a-3p increased in cases of BPD (20). Furthermore, serum-derived microRNA 21 was associated with a higher risk of developing BPD (21). On the other hand, protein-based studies reported that tracheal fluid-derived EVs contain increased levels of CD24 and CD14, which originate from epithelial and immune cells, respectively (22). These findings suggest that the EV proteome may reflect inflammatory exposure in the lungs of preterm infants who develop BPD. These promising studies highlight the need for comprehensive omics investigations to better understand the EV molecular composition that signifies BPD risk and to gain insights into potential mechanisms for the development of BPD.

Here, we have established a standardized and automated workflow to investigate the early proteomic signatures of BPD in preterm infants. Our results demonstrate that EVs derived from discarded blood samples collected during routine laboratory draws can be used to identify molecular predictors of BPD. We identified a novel nine-protein signature that predicts increased risk for BPD in a small cohort of preterm infants.

MATERIALS AND METHODS

Sample Collection

This study included preterm infants born at less than 32 wk of gestation and weighing under 1,500 g admitted to Penn State Children’s Hospital neonatal intensive care unit. Infants with chromosomal abnormalities, significant congenital anomalies, or multisystem organ failure were excluded. Blood samples were collected after informed consent within 7 days of life and stored at −80°C. The study was approved by the Institutional Review Board of the Penn State College of Medicine. BPD was diagnosed as Grade 2 and 3 based on respiratory support at 36 wk of postmenstrual age, according to the NICHD 2019 guidelines (23). Patients with BPD Grade 0 and 1 were used as control patients.

Characterization of EVs

EVs were characterized by transmission electron microscopy, nanoparticle tracking analysis, and Western blotting in accordance with the International Society for Extracellular Vesicles (ISEV) guidelines (18).

Transmission electron microscopy.

About 10 µL of the isolated sample was pipetted onto a 400-mesh copper grid with a carbon-coated formvar film and incubated for 1 min. Excess liquid was removed by paper blotting. The grid was then placed on 10 µL of 1% uranyl acetate for 1 min, followed by paper blotting to remove excess liquid. The grid was allowed to dry and was examined the same day in a JEOL JEM1400 transmission electron microscope (TEM) (JEOL USA Inc., Peabody, MA). Isolated EVs were deposited on formvar/carbon-coated copper EM grids (10 μL on each grid) for 20 min. The vesicle-coated grids were washed three times with PBS and then fixed with 2.5% glutaraldehyde for 10 min. After washing in three drops of distilled water, the grids were negatively stained with 2% uranyl acetate for 15 min and air-dried for 20 min. TEM was performed using the JEM1400 TEM (JEOL Ltd., Tokyo, Japan) at the Penn State College of Medicine Transmission Electron Microscopy Facility (RRID: SCR_021200).

Western blotting.

Isolated EVs (10 μL) were mixed with LDS sample buffer (NP0007, Thermo Fisher Scientific, CA) and 0.1 M dithiothreitol (except when probing for CD63). Samples were heated to 70°C for 15 min, then separated on a NuPAGE 4%–12% bis-Tris gel (NP0323BOX, Thermo Fisher Scientific, CA) run for 15 min at 50 V and 60 min at 100 V in MOPS SDS running buffer (NP0001, Thermo Fisher Scientific). Proteins were transferred from the gel, using the BioRad transfer system, to a PVDF membrane (Immobilon-FL) activated in 100% methanol. The membrane was first blocked in LI-COR blocking buffer in TBS (LI-COR) at room temperature for 1 h, then incubated with primary antibodies (antihuman CD63 antibody, clone H5C6, Cat. No. 100-0139; antihuman CD9 antibody, clone HI9a, Cat. No. 100-0138; Stem Cell Technologies) diluted in LI-COR antibody diluent at 4°C overnight. Next, the membrane was washed four times for 5 min each in TBST, incubated for 1.5 h at room temperature with secondary antibody (IRDye 680CW goat anti-mouse IgG, LI-COR) diluted in blocking buffer, then washed another four times in TBST for 10 min each. Signals were detected with the Odyssey CLx Imaging System (LI-COR), and images were acquired with Image Studio software Ver 5.2 (LI-COR). EVs derived from the adult retinal pigment epithelium cell line 19 (ARPE-19), provided by Dr. Jeffrey Sundstrom’s laboratory (24), and adult plasma EVs served as positive controls.

Automated EV Capture and Proteomic Sample Preparation

Automated EV capture, using the EasySep Human Pan-Extracellular Vesicle Positive Selection Kit (Stem Cell Technologies, Cat. No. 17891), and proteomic sample preparation were performed in a 96-well format in two steps on a KingFisher Flex system (Thermo Fisher Scientific) in one batch. Plasma (100 μL) was mixed with 10 μL of capture antibodies and incubated with mixing at room temperature for 10 min. Twenty microliters of capture beads were added to the plasma and antibody mixture and incubated with shaking for 10 min at room temperature. The beads were washed three times with PBS, and the EVs were then lysed, reduced, and alkylated on the beads with 100 μL of lysis buffer (50 mM Tris, pH 8.5, 1% SDS, 40 mM 2-chloroacetamide, 10 mM tris(2-carboxyethyl)phosphine (TCEP), and 8 pg/μL of yeast enolase as a quality control protein) and heated to 47°C for 15 min. Samples were then subjected to protein aggregation capture (PAC) on the EasySep beads by the addition of acetonitrile to 70% final concentration and incubation for 15 min at room temperature to remove contaminating species. The beads were washed three times with 95% acetonitrile and digested for 2 h in 50 mM ammonium bicarbonate at 47°C. Digests were stopped by adding TFA to 0.1% and digested equine apo myoglobin to 50 fmol/μL (as a quality assurance standard), and the plate was spun thrice at 200 g for 15 min to remove any residual beads. Peptide concentrations were determined with a NanoDrop OneC instrument using the A205 quantification method.
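The acetonitrile addition in the PAC step follows from a simple dilution relation. The sketch below assumes that the full 100 μL of lysate is carried into the PAC step, that pure acetonitrile is added, and that volumes are additive; none of these details is stated in the protocol, and the function name is illustrative only.

```python
def solvent_volume_to_add(sample_ul: float, target_fraction: float) -> float:
    """Volume of pure solvent (in uL) that makes up target_fraction of the final mix."""
    if not 0.0 <= target_fraction < 1.0:
        raise ValueError("target_fraction must be in [0, 1)")
    # Solve v / (sample + v) = f for v, giving v = sample * f / (1 - f).
    return sample_ul * target_fraction / (1.0 - target_fraction)


lysate_ul = 100.0  # assumption: the whole lysis volume is carried forward
acn_ul = solvent_volume_to_add(lysate_ul, 0.70)
print(f"Add {acn_ul:.1f} uL acetonitrile to {lysate_ul:.0f} uL lysate")
```

Under these assumptions, roughly 233 μL of acetonitrile per well would bring a 100-μL lysate to 70% (vol/vol).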

Liquid Chromatography and Mass Spectrometry

Tandem LC-MS analysis was performed using a NanoElute LC connected to a timsTOF Flex (Bruker) mass spectrometer. The NanoElute used a 50-sample-per-day method with an ∼20-min gradient from 2% to 35% B (acetonitrile with 0.1% formic acid) and a 28-min cycle time. The samples (∼100 ng) were loaded onto a micro-Precolumn with PepMap C18 (5 × 0.3 mm, Thermo Scientific) trap and then separated with a 150 × 0.15 mm C18 PepSep column (Bruker). The mass spectrometer was operated in data-independent acquisition parallel accumulation–serial fragmentation (diaPASEF) mode. Twenty variable DIA isolation windows across 10 ion-mobility scans per cycle were optimized to fit the expected precursor ion cloud from our in silico peptide library using the pydiAID tool (25). The proteomic mass spectrometry work was performed at the Mass Spec and Proteomics Core Facility at the Penn State University College of Medicine (RRID: SCR_017831).

Data Searches

An experimental spectral library was generated by searching 16 samples, 8 from each group, and the four QC samples using Data Independent Analysis Neural Network (DIA-NN) (v.1.9) against an in silico library generated by DIA-NN from the human proteome (∼20,000 proteins) (26). The resulting experiment-specific library included 5,594 protein isoforms, 5,483 protein groups, and 45,678 precursor ions in 40,984 elution groups. This library was used to search all the patients’ data files. Searches allowed one missed cleavage and one variable modification, with C-carbamidomethylation set as a fixed modification. Mass accuracy and MS1 accuracy were set to 15, the scan window was set to 9, and the match-between-runs function was activated. All other settings used default values.

Data Analysis

The protein group result files from DIA-NN were imported into FragPipeAnalyst for processing and summarizing the dataset (27). After import, proteins were filtered for 80% valid values globally and in at least one group, normalized by variance-stabilized normalization (VSN), and missing values were imputed using the Perseus method. The resulting dataset was exported from FragPipeAnalyst and used as input for analysis, modeling, and visualization using Promor (28). A panel of proteins was established based on a differential expression (DE) analysis performed with the Promor tool using an adjusted P value cutoff of 0.05 and a log2 fold change cutoff of 1. The subjects were split into training and test sets to develop and test a list of default models for the ability of the protein panel to distinguish patients without BPD from those who will develop the condition. The Promor default parameters included k-fold cross validation with the cross-validation resample method. Receiver operating characteristic (ROC) curves and feature importance lollipop plots were generated using the Promor tool. Ingenuity pathway analysis (IPA, Qiagen) was used to construct protein interaction networks and identify functional annotations.
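The filtering and cutoff logic described above can be sketched in a few lines. This is an illustration only, not the FragPipeAnalyst or Promor implementation: the function names, the placeholder protein labels, and the toy values are hypothetical, and the treatment of boundary cases (strict versus inclusive cutoffs) is an assumption.

```python
from typing import Dict, List, Optional, Tuple


def valid_fraction(values: List[Optional[float]]) -> float:
    """Fraction of non-missing intensities (None marks a missing value)."""
    return sum(v is not None for v in values) / len(values)


def passes_valid_filter(bpd: List[Optional[float]],
                        ctl: List[Optional[float]],
                        min_fraction: float = 0.8) -> bool:
    """Keep a protein quantified often enough overall and in at least one group."""
    overall_ok = valid_fraction(bpd + ctl) >= min_fraction
    group_ok = max(valid_fraction(bpd), valid_fraction(ctl)) >= min_fraction
    return overall_ok and group_ok


def select_de(results: Dict[str, Tuple[float, float]],
              p_cutoff: float = 0.05,
              lfc_cutoff: float = 1.0) -> List[str]:
    """Names whose adjusted P value and |log2 fold change| meet both cutoffs."""
    return [name for name, (log2_fc, adj_p) in results.items()
            if adj_p < p_cutoff and abs(log2_fc) >= lfc_cutoff]


# Hypothetical results: protein -> (log2 fold change BPD vs. control, adjusted P).
toy_results = {
    "PROT_A": (1.6, 0.010),   # up, significant -> kept
    "PROT_B": (0.4, 0.001),   # significant but small effect -> dropped
    "PROT_C": (-1.2, 0.030),  # down, significant -> kept
    "PROT_D": (2.1, 0.200),   # large effect but not significant -> dropped
}
print(select_de(toy_results))
```

Requiring both cutoffs at once is what produces the two upper corner regions of a volcano plot: a protein needs a large effect and statistical support to enter the panel.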

RESULTS

This pilot study included 30 preterm infants who were admitted to the Neonatal Intensive Care Unit (NICU) at Penn State Children’s Hospital between 2020 and 2023. Among these infants, 15 developed bronchopulmonary dysplasia (BPD), classified as Grade 2 and 3, whereas the other 15 served as controls, representing infants without significant BPD. Table 1 presents the demographic characteristics of the BPD and non-BPD groups. The infants who developed BPD had a significantly lower gestational age, with a median of 25 wk [interquartile range (IQR), 22–29 wk] compared with 29 wk (24–31 wk) in the control group. In addition, the BPD group had a lower mean birth weight of 765 g [± 215 g, standard deviation (SD)] versus 1,152 g (± 202 g) in the control group. There were no significant differences between the groups regarding sex, race, or ethnicity.

Table 1.

Demographic characteristics of patients with BPD and without BPD (control) were compared

Demographic Characteristics   Control (n = 15)   BPD (n = 15)    P Value
GA
 Median (IQR), wk             29 (24–31)         25 (22–29)      <0.0001
BW
 Mean ± SD, g                 1152.3 ± 202.1     764.7 ± 215.1   <0.0001
Sex, n (%)
 Female                       4 (27%)            6 (40%)         0.69
Race, n (%)
 White                        8 (53%)            10 (67%)        0.87
 Black                        4 (27%)            2 (13%)
 Other                        3 (20%)            3 (20%)
Ethnicity, n (%)
 Hispanic                     5 (33%)            4 (27%)         1.0
 Non-Hispanic                 10 (67%)           11 (73%)

Patients with BPD were included as Grade 2 and 3 as diagnosed at 36 wk postmenstrual age based on respiratory support as per NICHD 2019 guidelines. Control patients were included as BPD Grade 0 and 1. BPD, bronchopulmonary dysplasia; BW, birth weight; GA, gestational age; IQR, interquartile range; SD, standard deviation.

Effect of Automated EV Proteomic Workflow on Sample Variability

We designed an automated EV capture and LC-MS workflow for processing, acquisition, and analysis to minimize sample variability. Plasma EVs were isolated within 1 wk (1w) of preterm delivery from infants who either did (BPD.1w) or did not develop BPD (CTL.1w). EV capture was first verified by transmission electron microscopy (TEM) visualization and by the presence of the standard EV markers CD63 and CD9 on Western blotting (Fig. 1, A and B). Furthermore, the levels of the EV marker proteins CD63, CD81, and CD9 were similar (Fig. 1C), indicating consistent capture of EVs across all samples. EV proteins were then processed through an enzymatic digestion, extraction, LC-MS fractionation, and analysis pipeline. Mass analysis of peptide precursor ions identified 12,747 ± 3,532 (mean ± SD) and 14,271 ± 4,135 precursor ions (Fig. 1D) that mapped to 2,182 ± 436 and 2,332 ± 489 identified proteins (Fig. 1D) in EVs from the BPD.1w and CTL.1w groups, respectively. Although the number of precursor ions and corresponding proteins was higher in CTL.1w EVs compared with BPD.1w EVs, the differences did not reach statistical significance with the number of samples studied in this cohort.

Figure 1.


Extracellular vesicle (EV) proteomic quality control workflow. Plasma-derived EVs were characterized by transmission electron microscopy (TEM) following negative staining (A) and Western blotting (B). A: representative TEM images from a patient with bronchopulmonary dysplasia (BPD) and a non-BPD control (CTL). B: representative Western blots of EV extract from preterm infant and adult plasma. An ARPE-19 cell-derived EV extract was used as a control for CD63. Visualization and acquisition of peptide MS raw data were performed using FragPipeAnalyst software. C: EV lysates showed similar levels of the EV markers CD63, CD81, and CD9. D: the number of precursor ions and proteins was similar between BPD.1w and CTL.1w samples (ANOVA P value is shown in the boxes). E: the coefficient of variation between samples was similar for each group, 43%–49%. F: the density distribution of raw data was not affected by sample processing.

The relative variability between the samples, as measured by the coefficient of variation, was comparable between groups: 43% in the BPD.1w group and 49% in the CTL.1w group (Fig. 1E). The intensity and density distribution plots generated by filtering, normalization, and imputation of the mass spectrometry data indicate minimal contribution of sample processing to variability (Fig. 1F). These results indicate technical consistency in sample processing, enhancing the reliability of the findings.
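For reference, the coefficient of variation is the standard deviation expressed as a percentage of the mean. The sketch below is a minimal illustration using the sample standard deviation and made-up intensities; whether the plotted values were computed per protein on raw or log-transformed intensities is not stated in the text.

```python
from statistics import mean, stdev
from typing import Sequence


def coefficient_of_variation(values: Sequence[float]) -> float:
    """Percent CV: sample standard deviation divided by the mean, times 100."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * stdev(values) / m


# Hypothetical intensities of one protein across three samples.
toy_intensities = [10.0, 20.0, 30.0]
print(f"CV = {coefficient_of_variation(toy_intensities):.1f}%")
```

Because the CV is scale-free, it allows variability to be compared between groups whose mean intensities differ.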

Machine Learning-Assisted Discovery of EV Proteins That Predict Development of BPD

We first used the Promor label-free proteome analysis and modeling R package (28) to identify candidate proteins with predictive potential for the development of BPD. The volcano plot depicts differential protein expression between the BPD and control groups with a log2 fold change cutoff of 1 and a P value cutoff of 0.05 (Fig. 2A), corrected for gestational age and weight. Heatmap density plots (Fig. 2B) and Promor modeling of the top 20 EV proteins (Fig. 2C) identified a nine-protein signature that differentiates the BPD.1w and CTL.1w groups: heterogeneous nuclear ribonucleoprotein M (HNRNPM), high mobility group nucleosome-binding domain-containing protein 2 (HMGN2), proteinase 3 (PRTN3), RNA-binding protein 4 (RBM4), TATA-binding protein-associated factor 2N (TAF15), intelectin-1 (ITLN1), apolipoprotein D (APOD), transcription elongation regulator 1 (TCERG1), and RNA-binding motif protein, X chromosome (RBMX). PRTN3, TAF15, and RBM4 presented notably higher expression levels in BPD.1w EVs compared with CTL.1w. Conversely, ITLN1 and APOD were expressed at lower levels in BPD.1w EVs compared with CTL.1w. HNRNPM, HMGN2, TCERG1, and RBMX displayed more modest differences between the two groups.

Figure 2.


Comparative proteome and machine learning analysis of plasma extracellular vesicles (EVs) from BPD.1w compared with control CTL.1w plasma. A: volcano plot depicting significantly upregulated and downregulated proteins (orange dots) with fold change cutoff = 1, P value cutoff = 0.05 in BPD.1w vs. CTL.1w. B: heatmap comparison of the top 20 EV proteins with differential abundance across BPD.1w and CTL.1w samples. C: boxplot comparison of Promor tool-selected proteins differentiating BPD.1w and CTL.1w EVs. Each box represents the interquartile range, with the median indicated; whiskers represent 1.5 times the interquartile range. D and E: machine learning model-assisted classification of gene importance scores (D) and predictive performance (E) of EV candidate biomarkers to identify infants at risk for development of BPD. BPD, bronchopulmonary dysplasia.

To identify the most influential genes distinguishing 1-wk-old infants at risk for BPD, we performed a comparative analysis of gene importance using six machine learning models (Fig. 2, D and E): generalized linear model (GLM), Naive Bayes, Bayesian generalized linear model (BayesGLM), random forest (RF), support vector machine with radial kernel (SVM Radial), and extreme gradient boosting with linear booster (XGB Linear). Each model generated a ranking of gene importance scores for comparison (Fig. 2D). Naive Bayes achieved an area under the curve (AUC) of 100%. SVM with radial kernel (SVM Radial) achieved an AUC of 88.9%. Both Bayesian GLM (BayesGLM) and Random Forest (RF) models achieved moderate performance with AUC 77.8%. XGB Linear and GLM showed lower AUC values of 66.7% and 61.1%, respectively (Fig. 2E).
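As a reminder of what an AUC measures, it equals the fraction of (case, control) pairs in which the case receives the higher predicted score, with ties counted as one half. The sketch below computes it directly from that definition; the scores and the three-versus-three test set are hypothetical, since the size of the held-out set is not reported in the text.

```python
from typing import Sequence


def roc_auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    if not pos_scores or not neg_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical predicted BPD probabilities for a 3 vs. 3 held-out set.
bpd_scores = [0.9, 0.7, 0.4]
ctl_scores = [0.6, 0.3, 0.2]
print(f"AUC = {100 * roc_auc(bpd_scores, ctl_scores):.1f}%")  # 8 of 9 pairs correct
```

With so few pairs the AUC can only change in coarse steps (here 1/9 per misordered pair), which is one reason estimates from small test sets should be interpreted cautiously.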

HNRNPM emerged as the most important predictor of BPD across all models. Similarly, PRTN3, RBM4, TAF15, and HMGN2 were frequently ranked among the top disease contributors. In contrast, the contribution of TCERG1, RBMX, and ITLN1 varied depending on the model applied, reflecting differences in model sensitivity. Notably, APOD was identified as highly important by the nonlinear models, RF and XGB Linear, but was not prioritized by the linear models GLM and Naive Bayes.

We used the curated ingenuity pathway analysis (IPA) database tool to obtain functional insights into the nine-protein signature panel. IPA network analysis linked HNRNPM, RBM4, TAF15, RBMX, and TCERG1 to mRNA splicing and RNA metabolism; PRTN3 and APOD to inflammation; ITLN1 to mitochondrial and protein aggregation; and HMGN2 to nucleosomes (Supplemental Fig. S1).

DISCUSSION

The lack of molecular tools to predict the development and severity of BPD in low-birth-weight preterm infants hampers prevention, treatment, and management of this debilitating disease. We report the application of an automated EV proteomic and machine learning-assisted workflow that effectively minimizes technical variability and ensures consistency in sample processing from discarded blood plasma obtained during routine blood draws in the first week of life. To our knowledge, this is the first study to identify a plasma-derived EV proteome signature that discriminates infants at high risk for development of BPD.

Our proteomic analysis used Promor, an R software package designed to analyze label-free proteomics quantification. It streamlines the process of differential expression analysis and machine learning-based predictive modeling, enhancing the efficiency and accuracy of proteomics studies. The machine learning predictive modeling determined that a core differential expression signature of nine EV proteins, APOD, HNRNPM, HMGN2, ITLN1, PRTN3, RBM4, RBMX, TAF15, and TCERG1, indicates high risk for development of BPD. The function of these proteins includes mRNA splicing, RNA metabolism, and inflammation. In this regard, alternative splicing of VEGFA is critical for development of aerocytes, a subset of endothelial cells that are critical for adaptation of the lung to gas exchange. Disruption of aerocytes contributes to development of BPD (29, 30). Additional studies are needed to determine whether the EV signature reported here reflects impaired adaptation of gas exchange in the BPD-prone lungs. One of the nine proteins, PRTN3, has been previously implicated in BPD (31). Notably, ITLN1, a protein that contributes to airway mucous plugging (32), was higher in BPD EVs. Airway mucous plugging was a pathologic feature in “old” BPD that is caused by oxygen therapy (33, 34).

HNRNPM emerged as the most consistent and robust feature across all machine learning models. HNRNPM is involved in the regulation of pre-mRNA splicing and metabolism (35–37), innate sensing, and expression of viral genes (38–45). Notably, Naive Bayes achieved an AUC of 100% for HNRNPM, suggesting excellent discrimination on the dataset. This near-perfect performance, however, may reflect overfitting or overestimation of predictive ability, particularly in the absence of independent validation. The SVM Radial model also performed strongly, indicating its capability to capture complex nonlinear relationships among protein features. The differential importance of other proteins across model types reflects the added sensitivity of nonlinear models in capturing complex relationships not detected by linear approaches. Together, these results highlight the promise of integrating EV proteomics with machine learning to identify early biomarkers to stratify risk for the development of BPD. Future studies that apply these models to larger cohorts, together with external validation, are warranted to confirm these protein signatures and assess their clinical utility.

Previous investigations of EVs isolated from plasma or other biological fluids have reported variability in EV yield, purity, and protein content, often attributable to differences in isolation methods, sample handling, or analytical pipelines (18, 46). By using an automated EV proteomic workflow (automated sample preparation with bead-based EV capture, binding, washing, and digestion for a large number of samples in a single batch) combined with rigorous quality control, our study minimized these sources of technical variability, as evidenced by the comparable numbers of precursor ions, quantified proteins, and EV marker levels across BPD and control groups. Furthermore, the use of discarded clinical plasma samples demonstrates the practicality of applying high-throughput EV proteomics in a real-world neonatal intensive care setting with low sample volume. Collectively, our results provide the technical foundation for future studies aimed at discovering EV protein biomarkers of BPD disease severity and other neonatal conditions while also highlighting the importance of standardization in EV isolation and proteomic analysis to enhance reproducibility across studies.

Despite the strengths of our automated EV proteomic workflow, several limitations should be acknowledged. First, although our analysis demonstrated minimal technical variability and consistent EV marker levels, the sample size was relatively small, limiting the generalizability of the findings and the ability to detect subtle biological differences between groups and to distinguish infants who developed moderate BPD from those who developed severe BPD. Second, this work used data-independent acquisition (DIA), which reduces precursor sampling variation and increases the dynamic range of quantified values relative to data-dependent acquisition (DDA). The pipeline presented here minimized variability introduced during mass spectrometry data handling. Still, potential biases in protein quantification arising from differential ionization efficiencies or dynamic range limitations inherent to mass spectrometry may persist. As plasma has an extensive dynamic range, high-abundance proteins can mask low-abundance EV proteins. Beyond the depletion of abundant proteins, several enrichment strategies have been developed to address this limitation. For example, enrichment methods such as Mag-Net (ReSynBio) (47, 48) use strong anion exchange to enrich EVs based on surface charge, reducing background plasma proteins. In contrast, the EasySep immunoprecipitation approach isolates EVs through antibody-based capture of EV markers, providing greater specificity but potentially lower overall yield. Although both strategies are expected to mitigate dynamic range constraints, they may differ in selectivity and recovery (49).

Among the 30 infants enrolled, however, those who developed BPD (Grade 2 and 3) had significantly lower gestational age and birth weight compared with controls without significant BPD. These findings are consistent with established risk factors for BPD, as prematurity and low birth weight are known contributors to BPD pathogenesis, largely due to the susceptibility of immature lungs to injury from mechanical ventilation, oxygen toxicity, and inflammation (13). Notably, we did not find significant differences in sex, race, or ethnicity between groups, suggesting that these demographic variables did not confound the observed relationships in this cohort. This small, single-center study underscores the feasibility of recruiting preterm infants at high risk for BPD for EV-based biomarker discovery that will inform the design of therapeutic intervention strategies for the prevention and treatment of BPD in the future.

Thus, circulating EVs derived from as little as 100 µL of discarded plasma are a suitable “liquid biopsy” to help identify early risk targets of BPD in vulnerable preterm infants and provide new avenues of investigation into the pathogenesis of BPD. We have applied machine learning-assisted modeling that can discern disease-selective biomarkers in the heterogeneous proteome of circulating EVs.

ACKNOWLEDGMENTS

We acknowledge Dr. Jeffrey Sundstrom’s Lab at Penn State College of Medicine for providing the ARPE-19 cell-derived EV as a positive control used in Western blotting.

DATA AVAILABILITY

The proteomics data supporting the findings of this study will be made available from the corresponding author upon request.

SUPPLEMENTAL MATERIAL

GRANTS

This research was funded by a Center for Women and Newborn Health (CROWN) grant.

DISCLOSURES

Z. C. Chroneos is the founder of Respana Therapeutic, Inc. (http://respana-therapeutics.com/), an early-stage company developing therapeutics targeting SPR210 isoforms, and is coinventor on associated patents. None of the other authors has any conflicts of interest, financial or otherwise, to disclose.

AUTHOR CONTRIBUTIONS

S.A.: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Writing—original draft, Writing—review & editing; S.R.: Formal analysis, Methodology, Software, Visualization, Writing—review & editing; A.S.: Investigation, Methodology, Writing—review & editing; H.C.: Investigation, Methodology, Visualization, Writing—review & editing; A.D.: Data curation, Resources, Writing—review & editing; H.S.: Data curation, Resources, Writing—review & editing; R.S.: Data curation, Resources, Writing—review & editing; C.P.B.: Methodology, Supervision, Writing—review & editing; Z.C.C.: Conceptualization, Supervision, Writing—original draft, Writing—review & editing.

REFERENCES

  • 1. Hamilton BE, Martin JA, Osterman MJK. Births: Provisional Data for 2024 In Vital Statistics Rapid Release. Vital Statistics Rapid Release, 2025, p. 1–10.
  • 2. Siffel C, Kistler KD, Lewis JFM, Sarda SP. Global incidence of bronchopulmonary dysplasia among extremely preterm infants: a systematic literature review. J Matern Fetal Neonatal Med 34: 1721–1731, 2021. doi: 10.1080/14767058.2019.1646240.
  • 3. Gough A, Linden MA, Spence D, Halliday HL, Patterson CC, McGarvey L. Executive functioning deficits in young adult survivors of bronchopulmonary dysplasia. Disabil Rehabil 37: 1940–1945, 2015. doi: 10.3109/09638288.2014.991451.
  • 4. Natarajan G, Pappas A, Shankaran S, Kendrick DE, Das A, Higgins RD, Laptook AR, Bell EF, Stoll BJ, Newman N, Hale EC, Bara R, Walsh MC. Outcomes of extremely low birth weight infants with bronchopulmonary dysplasia: impact of the physiologic definition. Early Hum Dev 88: 509–515, 2012. doi: 10.1016/j.earlhumdev.2011.12.013.
  • 5. Vollsæter M, Røksund OD, Eide GE, Markestad T, Halvorsen T. Lung function after preterm birth: development from mid-childhood to adulthood. Thorax 68: 767–776, 2013. doi: 10.1136/thoraxjnl-2012-202980.
  • 6. Ronkainen E, Dunder T, Peltoniemi O, Kaukola T, Marttila R, Hallman M. New BPD predicts lung function at school age: follow-up study and meta-analysis. Pediatr Pulmonol 50: 1090–1098, 2015. doi: 10.1002/ppul.23153.
  • 7. Stoll BJ, Hansen NI, Bell EF, Shankaran S, Laptook AR, Walsh MC, Hale EC, Newman NS, Schibler K, Carlo WA, Kennedy KA, Poindexter BB, Finer NN, Ehrenkranz RA, Duara S, Sánchez PJ, O'Shea TM, Goldberg RN, Van Meurs KP, Faix RG, Phelps DL, Frantz ID, Watterberg KL, Saha S, Das A, Higgins RD; Eunice Kennedy Shriver National Institute of Child Health and Human Development Neonatal Research Network. Neonatal outcomes of extremely preterm infants from the NICHD Neonatal Research Network. Pediatrics 126: 443–456, 2010. doi: 10.1542/peds.2009-2959.
  • 8. Tin W, Wiswell TE. Adjunctive therapies in chronic lung disease: examining the evidence. Semin Fetal Neonatal Med 13: 44–52, 2008. doi: 10.1016/j.siny.2007.09.008.
  • 9. Islam JY, Keller RL, Aschner JL, Hartert TV, Moore PE. Understanding the short- and long-term respiratory outcomes of prematurity and bronchopulmonary dysplasia. Am J Respir Crit Care Med 192: 134–156, 2015. doi: 10.1164/rccm.201412-2142PP.
  • 10. Vom Hove M, Prenzel F, Uhlig HH, Robel-Tillig E. Pulmonary outcome in former preterm, very low birth weight children with bronchopulmonary dysplasia: a case-control follow-up at school age. J Pediatr 164: 40–45.e44, 2014. doi: 10.1016/j.jpeds.2013.07.045.
  • 11. Lapcharoensap W, Gage SC, Kan P, Profit J, Shaw GM, Gould JB, Stevenson DK, O'Brodovich H, Lee HC. Hospital variation and risk factors for bronchopulmonary dysplasia in a population-based cohort. JAMA Pediatr 169: e143676, 2015. doi: 10.1001/jamapediatrics.2014.3676.
  • 12. Jensen EA, Schmidt B. Epidemiology of bronchopulmonary dysplasia. Birth Defects Res A Clin Mol Teratol 100: 145–157, 2014. doi: 10.1002/bdra.23235.
  • 13. Jobe AH. The new bronchopulmonary dysplasia. Curr Opin Pediatr 23: 167–172, 2011. doi: 10.1097/MOP.0b013e3283423e6b.
  • 14. Higgins RD, Jobe AH, Koso-Thomas M, Bancalari E, Viscardi RM, Hartert TV, Ryan RM, Kallapur SG, Steinhorn RH, Konduri GG, Davis SD, Thebaud B, Clyman RI, Collaco JM, Martin CR, Woods JC, Finer NN, Raju TNK. Bronchopulmonary dysplasia: executive summary of a workshop. J Pediatr 197: 300–308, 2018. doi: 10.1016/j.jpeds.2018.01.043.
  • 15. Abman SH, Bancalari E, Jobe A. The evolution of bronchopulmonary dysplasia after 50 years. Am J Respir Crit Care Med 195: 421–424, 2017. doi: 10.1164/rccm.201611-2386ED.
  • 16. Thébaud B, Goss KN, Laughon M, Whitsett JA, Abman SH, Steinhorn RH, Aschner JL, Davis PG, McGrath-Morrow SA, Soll RF, Jobe AH. Bronchopulmonary dysplasia. Nat Rev Dis Primers 5: 78, 2019. doi: 10.1038/s41572-019-0127-7.
  • 17. Amatya S, Corr TE, Gandhi CK, Glass KM, Kresch MJ, Mujsce DJ, Oji-Mmuo CN, Mola SJ, Murray YL, Palmer TW, Singh M, Fricchione A, Arnold J, Prentice D, Bridgeman CR, Smith BM, Gavigan PJ, Ericson JE, Miller JR, Pauli JM, Williams DC, McSherry GD, Legro RS, Iriana SM, Kaiser JR. Management of newborns exposed to mothers with confirmed or suspected COVID-19. J Perinatol 40: 987–996, 2020. doi: 10.1038/s41372-020-0695-0.
  • 18. Welsh JA, Goberdhan DCI, O'Driscoll L, Buzas EI, Blenkiron C, Bussolati B, et al. Minimal information for studies of extracellular vesicles (MISEV2023): From basic to advanced approaches. J Extracell Vesicles 13: e12404, 2024. [Erratum in J Extracell Vesicles 13: e12451, 2024]. doi: 10.1002/jev2.12404.
  • 19. Lal CV, Olave N, Travers C, Rezonzew G, Dolma K, Simpson A, Halloran B, Aghai Z, Das P, Sharma N, Xu X, Genschmer K, Russell D, Szul T, Yi N, Blalock JE, Gaggar A, Bhandari V, Ambalavanan N. Exosomal microRNA predicts and protects against severe bronchopulmonary dysplasia in extremely premature infants. JCI Insight 3: e93994, 2018. doi: 10.1172/jci.insight.93994.
  • 20. Zhong X-Q, Yan Q, Chen Z-G, Jia C-H, Li X-H, Liang Z-Y, Gu J, Wei H-L, Lian C-Y, Zheng J, Cui Q-L. Umbilical cord blood-derived exosomes from very preterm infants with bronchopulmonary dysplasia impaired endothelial angiogenesis: roles of exosomal microRNAs. Front Cell Dev Biol 9: 637248, 2021. doi: 10.3389/fcell.2021.637248.
  • 21. Go H, Maeda H, Miyazaki K, Maeda R, Kume Y, Namba F, Momoi N, Hashimoto K, Otsuru S, Kawasaki Y, Hosoya M, Dennery PA. Extracellular vesicle miRNA-21 is a potential biomarker for predicting chronic lung disease in premature infants. Am J Physiol Lung Cell Mol Physiol 318: L845–L851, 2020. doi: 10.1152/ajplung.00166.2019.
  • 22. Ransom MA, Bunn KE, Negretti NM, Jetter CS, Bressman ZJ, Sucre JMS, Pua HH. Developmental trajectory of extracellular vesicle characteristics from the lungs of preterm infants. Am J Physiol Lung Cell Mol Physiol 324: L385–L392, 2023. doi: 10.1152/ajplung.00389.2022.
  • 23. Jensen EA, Dysart K, Gantz MG, McDonald S, Bamat NA, Keszler M, Kirpalani H, Laughon MM, Poindexter BB, Duncan AF, Yoder BA, Eichenwald EC, DeMauro SB. The diagnosis of bronchopulmonary dysplasia in very preterm infants. An evidence-based approach. Am J Respir Crit Care Med 200: 751–759, 2019. doi: 10.1164/rccm.201812-2348OC.
  • 24. Zhou M, Zhao Y, Weber SR, Gates C, Carruthers NJ, Chen H, Liu X, Wang H-G, Ford M, Swulius MT, Barber AJ, Grillo SL, Sundstrom JM. Extracellular vesicles from retinal pigment epithelial cells expressing R345W-Fibulin-3 induce epithelial-mesenchymal transition in recipient cells. J Extracell Vesicles 12: e12373, 2023. doi: 10.1002/jev2.12373.
  • 25. Skowronek P, Thielert M, Voytik E, Tanzer MC, Hansen FM, Willems S, Karayel O, Brunner A-D, Meier F, Mann M. Rapid and in-depth coverage of the (phospho-)proteome with deep libraries and optimal window design for dia-PASEF. Mol Cell Proteomics 21: 100279, 2022. doi: 10.1016/j.mcpro.2022.100279.
  • 26. Demichev V, Messner CB, Vernardis SI, Lilley KS, Ralser M. DIA-NN: neural networks and interference correction enable deep proteome coverage in high throughput. Nat Methods 17: 41–44, 2020. doi: 10.1038/s41592-019-0638-x.
  • 27. Hsiao Y, Zhang H, Li GX, Deng Y, Yu F, Kahrood HV, Steele JR, Schittenhelm RB, Nesvizhskii AI. Analysis and visualization of quantitative proteomics data using FragPipe-analyst (Preprint). bioRxiv 2024. doi: 10.1101/2024.03.05.583643.
  • 28. Ranathunge C, Patel SS, Pinky L, Correll VL, Chen S, Semmes OJ, Armstrong RK, Combs CD, Nyalwidhe JO. promor: a comprehensive R package for label-free proteomics data analysis and predictive modeling. Bioinform Adv 3: vbad025, 2023. [Erratum in Bioinform Adv 39: vbad041, 2023]. doi: 10.1093/bioadv/vbad025.
  • 29. Fidalgo MF, Fonseca CG, Caldas P, Raposo AA, Balboni T, Henao-Mišíková L, Grosso AR, Vasconcelos FF, Franco CA. Aerocyte specification and lung adaptation to breathing is dependent on alternative splicing changes. Life Sci Alliance 5: e202201554, 2022. doi: 10.26508/lsa.202201554.
  • 30. Vila Ellis L, Cain MP, Hutchison V, Flodby P, Crandall ED, Borok Z, Zhou B, Ostrin EJ, Wythe JD, Chen J. Epithelial Vegfa specifies a distinct endothelial population in the mouse lung. Dev Cell 52: 617–630.e6, 2020. doi: 10.1016/j.devcel.2020.01.009.
  • 31. Guo R, Zheng Q, Zhang L. Identification and validation of diagnostic markers and drugs for pediatric bronchopulmonary dysplasia based on integrating bioinformatics and molecular docking analysis. PLoS One 20: e0323006, 2025. doi: 10.1371/journal.pone.0323006.
  • 32. Everman JL, Sajuthi SP, Liegeois MA, Jackson ND, Collet EH, Peters MC, et al. A common polymorphism in the Intelectin-1 gene influences mucus plugging in severe asthma. Nat Commun 15: 3900, 2024. doi: 10.1038/s41467-024-48034-5.
  • 33. Tanswell AK, Jankov RP. Bronchopulmonary dysplasia: one disease or two? Am J Respir Crit Care Med 167: 1–2, 2003. doi: 10.1164/rccm.2210005.
  • 34. Chang L-YL, Subramaniam M, Yoder BA, Day BJ, Ellison MC, Sunday ME, Crapo JD. A catalytic antioxidant attenuates alveolar structural remodeling in bronchopulmonary dysplasia. Am J Respir Crit Care Med 167: 57–64, 2003. doi: 10.1164/rccm.200203-232OC.
  • 35. Akinyemi AR, Li D, Zhang J, Liu Q. hnRNPM deficiency leads to cognitive deficits via disrupting synaptic plasticity. Neurosci Lett 751: 135824, 2021. doi: 10.1016/j.neulet.2021.135824.
  • 36. Lv P, Xu W, Xin S, Deng Y, Yang B, Xu D, Bai J, Ma D, Wang T, Liu J, Liu X. HnRNPM modulates alternative splicing in germ cells by recruiting PTBP1. Reprod Biol Endocrinol 23: 3, 2025. doi: 10.1186/s12958-024-01340-5.
  • 37. Harvey SE, Xu Y, Lin X, Gao XD, Qiu Y, Ahn J, Xiao X, Cheng C. Coregulation of alternative splicing by hnRNPM and ESRP1 during EMT. RNA 24: 1326–1338, 2018. doi: 10.1261/rna.066712.118.
  • 38. Zheng R, Dunlap M, Bobkov GOM, Gonzalez-Figueroa C, Patel KJ, Lyu J, Harvey SE, Chan TW, Quinones-Valdez G, Choudhury M, Le Roux CA, Bartels MD, Vuong A, Flynn RA, Chang HY, Van Nostrand EL, Xiao X, Cheng C. hnRNPM protects against the dsRNA-mediated interferon response by repressing LINE-associated cryptic splicing. Mol Cell 84: 2087–2103.e8, 2024. doi: 10.1016/j.molcel.2024.05.004.
  • 39. Zhong H, Li Q, Pei S, Wu Y, Li Z, Liu X, Peng Y, Zheng T, Xiao J, Feng H. hnRNPM suppressed IRF7-mediated IFN signaling in the antiviral innate immunity in triploid hybrid fish. Dev Comp Immunol 148: 104915, 2023. doi: 10.1016/j.dci.2023.104915.
  • 40. Zheng R, Dunlap M, Lyu J, Gonzalez-Figueroa C, Bobkov G, Harvey SE, Chan TW, Quinones-Valdez G, Choudhury M, Vuong A, Flynn RA, Chang HY, Xiao X, Cheng C. LINE-associated cryptic splicing induces dsRNA-mediated interferon response and tumor immunity (Preprint). bioRxiv 2023. doi: 10.1101/2023.02.23.529804.
  • 41. Zhao X, Qiao Y, Fan S, Chang X, Zhao J, Zhong K, Han Y, Zhu H, Zhang C. HnRNPM inhibits pseudorabies virus replication by inducing apoptosis in infected cells. Vet Microbiol 304: 110455, 2025. doi: 10.1016/j.vetmic.2025.110455.
  • 42. Kirchhoff A, Herzner A-M, Urban C, Piras A, Düster R, Mahlberg J, Grünewald A, Schlee-Guimarães TM, Ciupka K, Leka P, Bootz RJ, Wallerath C, Hunkler C, de Regt AK, Kümmerer BM, Christensen MH, Schmidt FI, Lee-Kirsch MA, Günther C, Kato H, Bartok E, Hartmann G, Geyer M, Pichlmair A, Schlee M. RNA-binding proteins hnRNPM and ELAVL1 promote type-I interferon induction downstream of the nucleic acid sensors cGAS and RIG-I. EMBO J 44: 824–853, 2025. doi: 10.1038/s44318-024-00331-x.
  • 43. Andoh K, Nishimori A, Matsuura Y. The bovine leukemia virus-derived long non-coding RNA AS1-S binds to bovine hnRNPM and alters the interaction between hnRNPM and host mRNAs. Microbiol Spectr 11: e0085523, 2023. doi: 10.1128/spectrum.00855-23.
  • 44. Cao P, Luo W-W, Li C, Tong Z, Zheng Z-Q, Zhou L, Xiong Y, Li S. The heterogeneous nuclear ribonucleoprotein hnRNPM inhibits RNA virus-triggered innate immunity by antagonizing RNA sensing of RIG-I-like receptors. PLoS Pathog 15: e1007983, 2019. doi: 10.1371/journal.ppat.1007983.
  • 45. Bortz E, Westera L, Maamary J, Steel J, Albrecht RA, Manicassamy B, Chase G, Martínez-Sobrido L, Schwemmle M, García-Sastre A. Host- and strain-specific regulation of influenza virus polymerase activity by interacting cellular proteins. mBio 2: e00151, 2011. doi: 10.1128/mBio.00151-11.
  • 46. Yáñez-Mó M, Siljander PR-M, Andreu Z, Zavec AB, Borràs FE, Buzas EI, et al. Biological properties of extracellular vesicles and their physiological functions. J Extracell Vesicles 4: 27066, 2015. doi: 10.3402/jev.v4.27066.
  • 47. Roger K, Metatla I, Ceccacci S, Wahbi K, Motté L, Chhuon C, Guerrera IC. Mining the plasma proteome: Evaluation of enrichment methods for depth and reproducibility. J Proteomics 321: 105519, 2025. doi: 10.1016/j.jprot.2025.105519.
  • 48. Wu CC, Tsantilas KA, Park J, Plubell D, Sanders JA, Naicker P, Govender I, Buthelezi S, Stoychev S, Jordaan J, Merrihew G, Huang E, Parker ED, Riffle M, Hoofnagle AN, Noble WS, Poston KL, Montine TJ, MacCoss MJ. Enrichment of extracellular vesicles using Mag-Net for the analysis of the plasma proteome. Nat Commun 16: 5447, 2025. doi: 10.1038/s41467-025-60595-7.
  • 49. Beimers WF, Overmyer KA, Sinitcyn P, Lancaster NM, Quarmby ST, Coon JJ. Technical evaluation of plasma proteomics technologies. J Proteome Res 24: 3074–3087, 2025. doi: 10.1021/acs.jproteome.5c00221.

Articles from Function are provided here courtesy of American Physiological Society
