Dear Editor,
Sepsis, the disease with the highest mortality among critically ill patients, is clinically diagnosed by a dysregulated systemic inflammatory response to infection in the presence of organ dysfunction. 1 , 2 , 3 No effective biomarkers or approved molecular therapies have been developed to diagnose and treat the immune response state of patients with sepsis, so the management of these critically ill patients relies only on early recognition by experience and on supportive care. 4 , 5 Long noncoding RNAs (lncRNAs) are implicated in a wide variety of biological processes, and accumulating studies have demonstrated that several dysregulated lncRNAs play important roles in tumorigenesis and tumor progression. 6 , 7 , 8 However, lncRNA signatures have not been studied for the rapid diagnosis of sepsis, owing to limited data sources and a lack of RNA‐seq datasets. 3 Hence, we analyzed three whole blood transcriptome cohorts of critically ill adult patients and identified a 28‐lncRNA signature for sepsis diagnosis, which computes a score to assess the risk of sepsis.
The expression profiles of 3745 lncRNAs in three cohorts, GSE95233, GSE28750, and GSE57065, were normalized and reannotated for the investigation 6 , 9 (Table S1). The largest cohort, GSE95233, served as the discovery dataset, while the other two independent cohorts served as validation datasets. To select lncRNAs for the predictive signature, we first identified 84 differentially expressed (DE) lncRNAs between sepsis patients and healthy individuals in the discovery dataset. We then applied least absolute shrinkage and selection operator (LASSO) regression to further identify 28 predictive lncRNAs, named SepSig28, which serve as a molecular diagnostic signature: a risk score computed from them predicts whether an individual has sepsis. Finally, we validated the diagnostic signature in the two independent datasets and demonstrated the high performance of the 28 lncRNAs in predicting the risk of sepsis (Figure 1A).
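LASSO can select features because its L1 penalty shrinks uninformative coefficients to exactly zero. The sketch below illustrates that behavior on a toy linear model; it is not the authors' pipeline. The letter does not state the LASSO variant, penalty strength, or software used, so the function names, the penalty `lam`, and the coordinate-descent solver are all assumptions made for illustration.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam; values within [-lam, lam] become 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent.

    X is a list of rows (samples), y a list of responses. Illustrative only:
    a hypothetical helper, not the method reported in the letter.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # residual = y - X @ beta, with beta = 0 initially
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(v * v for v in col) / n
            if z == 0:
                continue  # an all-zero feature can never enter the model
            # Correlation of feature j with the partial residual.
            rho = sum(v * (r + v * beta[j]) for v, r in zip(col, residual)) / n
            new_beta = soft_threshold(rho, lam) / z
            delta = new_beta - beta[j]
            if delta:
                residual = [r - v * delta for r, v in zip(residual, col)]
                beta[j] = new_beta
    return beta


# Toy data: the response depends only on the first feature.
X_toy = [[1, 1], [-1, 1], [1, -1], [-1, -1]]
y_toy = [2, -2, 2, -2]
beta_toy = lasso_coordinate_descent(X_toy, y_toy, lam=0.5)
```

In this toy case the second coefficient stays at exactly zero and is thereby dropped, which is the sense in which LASSO reduces a larger candidate set to a smaller signature.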
Risk score = (BOLA3.AS1 × 0.254) + (LINC00354 × 0.1996) + (C5orf27 × 0.1537) + (RP1.187B23.1 × −0.1427) + (MBNL1.AS1 × −0.1419) + (LINC01420 × −0.1140) + (RP13.436F16.1 × 0.1060) + (CTB.31O20.2 × 0.1023) + (LINC01425 × 0.0949) + (C10orf25 × −0.0763) + (RP11.111M22.3 × 0.0743) + (LAMTOR5.AS1 × 0.0739) + (FLJ37453 × 0.0713) + (AX746755 × −0.0690) + (TTTY12 × 0.0678) + (ASMTL.AS1 × −0.0535) + (LOC101928491 × 0.0461) + (RBM26.AS1 × −0.0438) + (ANP32A.IT1 × 0.0437) + (LOC101060691 × 0.0319) + (MSH5 × −0.0311) + (LOC100507221 × 0.0289) + (RP11.1137G4.3 × −0.0245) + (LOC100506457 × 0.0237) + (MIR612 × −0.0189) + (AC114730.11 × 0.0079) + (LOC101927526 × 0.0026) + (LINC01019 × −0.0020). The value following each symbol is the importance weight applied to the expression abundance of that lncRNA. The lncRNAs are listed in order of decreasing importance, that is, decreasing absolute weight.
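The weighted sum can be sketched in a few lines of Python. The weights are copied from the formula above; the function name `risk_score` and the dictionary-based input format are assumptions for illustration. The letter gives no decision threshold for the score, so none is applied here.

```python
# Weights copied from the risk-score formula (lncRNA symbol -> weight).
SEPSIG28_WEIGHTS = {
    "BOLA3.AS1": 0.254, "LINC00354": 0.1996, "C5orf27": 0.1537,
    "RP1.187B23.1": -0.1427, "MBNL1.AS1": -0.1419, "LINC01420": -0.1140,
    "RP13.436F16.1": 0.1060, "CTB.31O20.2": 0.1023, "LINC01425": 0.0949,
    "C10orf25": -0.0763, "RP11.111M22.3": 0.0743, "LAMTOR5.AS1": 0.0739,
    "FLJ37453": 0.0713, "AX746755": -0.0690, "TTTY12": 0.0678,
    "ASMTL.AS1": -0.0535, "LOC101928491": 0.0461, "RBM26.AS1": -0.0438,
    "ANP32A.IT1": 0.0437, "LOC101060691": 0.0319, "MSH5": -0.0311,
    "LOC100507221": 0.0289, "RP11.1137G4.3": -0.0245, "LOC100506457": 0.0237,
    "MIR612": -0.0189, "AC114730.11": 0.0079, "LOC101927526": 0.0026,
    "LINC01019": -0.0020,
}


def risk_score(expression):
    """Return the weighted sum of expression values for one sample.

    `expression` maps lncRNA symbol -> normalized expression value
    (hypothetical input format). All 28 symbols must be present.
    """
    missing = sorted(set(SEPSIG28_WEIGHTS) - set(expression))
    if missing:
        raise KeyError(f"missing expression values for: {missing}")
    return sum(w * expression[name] for name, w in SEPSIG28_WEIGHTS.items())
```

Expression values passed to such a function would need to be normalized in the same way as the cohorts used to fit the weights; otherwise the scores are not comparable.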
When tuned in the discovery dataset using fivefold cross‐validation, SepSig28 perfectly separated sepsis patient samples from healthy control samples, with the area under the curve (AUC), accuracy, sensitivity, and specificity all equal to 1 (Figure 1B,C). To test whether this performance could arise by chance, we randomly selected an equal number of lncRNAs 1000 times and evaluated each random set using the same procedure as for SepSig28. No random combination achieved an AUC as high as 1 (Figure 1D). In addition, we constructed all 28 possible 27‐lncRNA signatures (28 minus 1) by excluding one lncRNA at a time, to evaluate the contribution of each lncRNA to the predictive capability of the SepSig28 model. In the discovery dataset, two lncRNA members are not strictly necessary, as the model performs equally well without either of them (Figure 1C). We retained these two as supplementary features to make the model more robust.
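The two control experiments, random signatures of equal size and leave-one-out signatures, amount to generating alternative feature sets and scoring each with the same procedure. A minimal sketch of the set generation is shown below. The function names, the fixed random seed, and the placeholder lncRNA names are assumptions, and the evaluation step is omitted because the letter does not detail it.

```python
import random


def leave_one_out_signatures(signature):
    """Return every signature obtained by dropping exactly one member."""
    signature = list(signature)
    return [signature[:i] + signature[i + 1:] for i in range(len(signature))]


def random_signatures(pool, size, n_draws=1000, seed=0):
    """Draw n_draws random signatures, each with `size` distinct lncRNAs."""
    rng = random.Random(seed)  # seeded only to make the sketch reproducible
    return [rng.sample(pool, size) for _ in range(n_draws)]


# Placeholder names standing in for the 3745 profiled lncRNAs.
lncrna_pool = [f"lncRNA_{i}" for i in range(3745)]
signature = lncrna_pool[:28]  # stand-in for the 28 SepSig28 members

reduced = leave_one_out_signatures(signature)       # 28 sets of 27 lncRNAs
background = random_signatures(lncrna_pool, 28)     # 1000 sets of 28 lncRNAs
```

Each set in `reduced` or `background` would then be evaluated exactly as the full signature is, so that the resulting AUC values are directly comparable.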
In the independent cohorts GSE28750 and GSE57065, hierarchical clustering shows that the altered expression pattern of the SepSig28 lncRNAs alone cannot clearly distinguish sepsis patient samples from normal ones (Figure 2A,B). Using risk scores computed as the weighted sum, however, SepSig28 achieves AUCs of 0.9712 for GSE57065 and 0.95 for GSE28750 (Figure 2C,D), outperforming almost all of the 27‐lncRNA (28 minus 1) combinations. Overall, SepSig28 has the best classification performance across all three cohorts according to AUC, accuracy, sensitivity, and specificity (Figure 2E).
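For readers less familiar with the metric, the AUC of a risk score equals the probability that a randomly chosen sepsis sample scores higher than a randomly chosen control, with ties counted as one half. The sketch below computes it directly from that definition. The function name and the 1/0 label encoding are assumptions, and the toy numbers are not taken from the cohorts.

```python
def auc_from_scores(scores, labels):
    """AUC via pairwise comparison of positive and negative scores.

    labels: 1 for sepsis, 0 for control (assumed encoding).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one sample from each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Toy example: one sepsis sample scores below one control.
toy_auc = auc_from_scores([0.9, 0.2, 0.3, 0.1], [1, 1, 0, 0])
```

The pairwise form is quadratic in the number of samples, which is acceptable for cohorts of this size; rank-based implementations give the same value more efficiently.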
To investigate the biological functions in which SepSig28 is involved, we associated its lncRNAs with their co‐expressed genes across the sepsis samples of each cohort. Genes whose expression correlated with that of the lncRNAs in all cohorts (Pearson correlation coefficient > 0.7) were considered co‐expressed. Gene Ontology (GO) and KEGG pathway enrichment analyses were performed separately on the set of co‐expressed genes. 10 GO enrichment analysis showed that the SepSig28 lncRNAs are mainly involved in three biological processes: hormone‐mediated signaling pathway, RNA splicing, and histone modification (Figure S1A). KEGG analysis showed that the SepSig28‐associated genes are significantly enriched in pathways known to be related to sepsis pathogenesis, including the Wnt signaling pathway, Th17 cell differentiation, and the Notch signaling pathway (Figure S1B). Interestingly, both GO and KEGG enrichment indicated that SepSig28 lncRNAs tend to participate in hormone signaling‐related pathways, suggesting an underlying association between hormone signaling and sepsis.
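The co‐expression criterion can be sketched as a filter that keeps a gene only if its Pearson correlation with a given lncRNA exceeds the threshold in every cohort. The function names and the input layout (one expression vector per cohort for the lncRNA, one dictionary of gene vectors per cohort) are assumptions for illustration, and the toy values are invented.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant profile: treated here as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def coexpressed_genes(lncrna_by_cohort, genes_by_cohort, threshold=0.7):
    """Genes with Pearson r > threshold against the lncRNA in every cohort."""
    kept = None
    for lnc_profile, gene_profiles in zip(lncrna_by_cohort, genes_by_cohort):
        passing = {gene for gene, profile in gene_profiles.items()
                   if pearson(lnc_profile, profile) > threshold}
        kept = passing if kept is None else kept & passing
    return kept or set()


# Toy data: two cohorts, one lncRNA, two candidate genes.
lnc = [[1, 2, 3, 4], [1, 3, 2, 5]]
genes = [{"GENE_A": [2, 4, 6, 8], "GENE_B": [4, 3, 2, 1]},
         {"GENE_A": [1, 3, 2, 6], "GENE_B": [1, 3, 2, 5]}]
shared = coexpressed_genes(lnc, genes)
```

Here `GENE_B` is dropped because it is anticorrelated with the lncRNA in the first cohort, even though it correlates perfectly in the second; requiring agreement across cohorts is what makes the filter conservative.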
In conclusion, we identified and validated the first noncoding signature, consisting of 28 lncRNAs, that can effectively distinguish adult sepsis patients from healthy controls. Despite limitations such as the limited number of lncRNA features and the small sample size, we provide evidence that lncRNAs could be adopted as markers for the diagnosis of critical diseases. The proposed model could be used as an alternative or complementary diagnostic metric for sepsis.
AUTHOR CONTRIBUTIONS
LC conceived the idea, performed data analysis, and drafted the manuscript. XL, XZ, JW, NZ, and RW performed data management and analysis. XL, KL, and XY helped interpret the results and gave suggestions. All authors read and approved the final manuscript.
CONFLICT OF INTEREST
The authors declare no conflict of interest.
Contributor Information
Xiufeng Ye, Email: szrosexiu@126.com.
Lixin Cheng, Email: easonlcheng@gmail.com.
REFERENCES
- 1. Angus DC, van der Poll T. Severe sepsis and septic shock. N Engl J Med. 2013;369:840‐851.
- 2. Fleischmann C, Scherag A, Adhikari NK, et al; International Forum of Acute Care Trialists. Assessment of global incidence and mortality of hospital‐treated sepsis: current estimates and limitations. Am J Respir Crit Care Med. 2016;193:259‐272.
- 3. Cheng L, Nan C, Kang L, et al. Whole blood transcriptomic investigation identifies long non‐coding RNAs as regulators in sepsis. J Transl Med. 2020;18:217.
- 4. Scicluna BP, van Vught LA, Zwinderman AH, et al. Classification of patients with sepsis according to blood genomic endotype: a prospective cohort study. Lancet Respir Med. 2017;5:816‐826.
- 5. Sutherland A, Thomas M, Brandon RA, et al. Development and validation of a novel molecular biomarker diagnostic test for the early detection of sepsis. Crit Care. 2011;15:R149.
- 6. Zhou M, Zhao H, Wang X, Sun J, Su J. Analysis of long noncoding RNAs highlights region‐specific altered expression patterns and diagnostic roles in Alzheimer's disease. Brief Bioinform. 2019;20:598‐608.
- 7. Liu X, Xu Y, Wang R, et al. A network‐based algorithm for the identification of moonlighting noncoding RNAs and its application in sepsis. Brief Bioinform. 2020.
- 8. Cheng L, Leung KS. Identification and characterization of moonlighting long non‐coding RNAs based on RNA and protein interactome. Bioinformatics. 2018;34:3519‐3528.
- 9. Cheng L, Lo LY, Tang NL, Wang D, Leung KS. CrossNorm: a novel normalization strategy for microarray data in cancers. Sci Rep. 2016;6:18898.
- 10. Cheng L, Leung K‐S. Quantification of non‐coding RNA target localization diversity and its application in cancers. J Mol Cell Biol. 2018;10:130‐138.