Genomic technologies, specifically microarrays and high-throughput DNA and RNA sequencing (RNA-seq), have had an enormous impact on basic and translational research in lung biology. The findings from some of these studies have now been translated into diagnostic tests, which are rapidly changing our clinical practice (1). Advances in high-throughput single-cell genomics and multiomics approaches enable investigators to study organs and tissues at the level of the fundamental units of life: cells. Single-cell technologies allow investigators to overcome “the averaging problem” inherent to analysis at the whole-tissue level or to pooling specific cell populations selected on a handful of markers, an approach that is subject to composition bias. Hence, single-cell technologies have rapidly become a logical method of choice to study lung biology and disease (2). Specifically, single-cell RNA-seq has been applied to create reference atlases of normal human lung tissue (3–6), to study cellular crosstalk in multicellular lung cancer niches (7, 8), to discover novel cell types in the normal lung (9) or in pulmonary fibrosis (10–16), to evaluate remodeling of the airway epithelium in asthma (17) or as a result of smoking (18), and to study the immune response in cystic fibrosis (19) and in coronavirus disease (COVID-19) (20–22). By applying the power of single-cell techniques to causal experiments in model organisms or in vitro systems, investigators have been able to provide mechanistic insights into the role of specific cell types in disease pathogenesis (23–25).
In line with previous studies, in this issue of the Journal, Carraro, Mulay, Yao, and colleagues (pp. 1540–1550) applied single-cell RNA-seq to evaluate the hierarchical relationship between airway epithelial cells of patients with idiopathic pulmonary fibrosis (IPF) and those of control subjects (26). Performing single-cell RNA-seq on epithelial cells from the airways of six control subjects and seven patients with IPF, the authors used computational approaches to resolve previously reported major subsets of airway epithelial cells, such as basal, club, goblet, ciliated, mucous, and serous cells, and ionocytes (4, 9). The authors focused on the analysis of basal epithelial cells, which play an important role as a progenitor population in the airways, contributing to normal maintenance and repair after injury (27, 28). Several transcriptionally distinct subsets were identified within basal cells and were named multipotent basal, proliferating basal, activated basal, and secretory-primed basal cells. Previous lineage-tracing studies in animal models identified basal cells as local progenitor cells capable of giving rise to secretory and ciliated cells in the airways after injury and revealed the crucial role of the Wnt and Notch pathways in this process (27, 28). In agreement with previous studies (4, 18), this work from Carraro and colleagues confirmed the existence of multiple transitional cell types between basal and secretory cells. The authors identified transcriptionally similar cell types in patients with IPF and reported a substantial expansion of the secretory-primed basal cell type in IPF. Moreover, guided by the results of the single-cell transcriptomic profiling, a screen of commercially available antibodies to antigens expressed on the surface of these basal cell subsets identified an anti-CD66 antibody that separates bona fide basal cells (EPCAM+NGFR+CD66−) from secretory-primed basal cells (EPCAM+NGFR+CD66+).
This allowed the sorting of live basal and secretory-primed basal cells and the evaluation of their differentiation potential in a series of in vitro assays. In contrast to true basal cells, secretory-primed basal cells had limited capacity for self-renewal. Using specific blocking antibodies against NOTCH1, NOTCH2, or NOTCH3, the authors validated the distinct roles of these receptors in the maintenance of basal cells and secretory-primed basal cells. Thus, the work by Carraro and colleagues provides an example of how single-cell transcriptomic analyses, together with orthogonal validation techniques, uncover novel disease mechanisms and reconcile findings from model organisms and human subjects.
The wealth and depth of data generated by single-cell genomic techniques demand a different approach to manuscript submission, evaluation, and publishing, with increased responsibility for authors, reviewers, editors, and publishers, among others. Will the reader be able to find the same reported cell type in several different papers that use similar tissues yet different analytical approaches? Interestingly, in another manuscript recently published in the Journal, Deprez and colleagues (6) performed single-cell RNA-seq of airway biopsies from healthy volunteers. The authors resolved several transcriptionally distinct subsets of basal cells, which they referred to as basal, cycling basal, and suprabasal, among others. Although cluster labels and marker genes between these two studies do not directly match, this does not imply that one group is correct and the other wrong. Slight differences in nomenclature, methodology, and tissue sources and, especially, differences in computational approaches explain why cell types and clusters in those datasets differ. An integrated analysis of these and other existing datasets in which single-cell RNA-seq data are uniformly processed will reveal stable cell types and states reproducibly observed in human airways, and new computational approaches such as transfer learning will enable rapid iterative validation in newly generated datasets and reconcile differences in nomenclature or annotations (29).
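As a rough illustration of how cluster annotations from two studies might be compared, the sketch below scores the overlap between marker-gene sets with the Jaccard index and reports, for each cluster in one study, its closest counterpart in the other. This is a deliberately simple stand-in chosen for illustration; it is not the integration or transfer-learning approach cited above (29). The function names, cluster labels, and gene identifiers are hypothetical placeholders and are not taken from either dataset.

```python
def jaccard(markers_a, markers_b):
    """Jaccard index of two marker-gene collections (0.0 if both are empty)."""
    a, b = set(markers_a), set(markers_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_matches(study_a, study_b):
    """For each cluster in study_a, find the study_b cluster with the
    highest marker overlap. Assumes study_b has at least one cluster.

    Both arguments map a cluster label to a collection of marker genes.
    Returns {label_a: (label_b, jaccard_score)}.
    """
    matches = {}
    for label_a, markers_a in study_a.items():
        label_b = max(study_b, key=lambda lb: jaccard(markers_a, study_b[lb]))
        matches[label_a] = (label_b, jaccard(markers_a, study_b[label_b]))
    return matches


# Hypothetical placeholder data: labels and gene names are invented.
study_a = {"A1": {"g1", "g2", "g3"}, "A2": {"g4", "g5"}}
study_b = {"B1": {"g2", "g3", "g6"}, "B2": {"g4", "g5", "g7"}}

for label_a, (label_b, score) in best_matches(study_a, study_b).items():
    print(f"{label_a} -> {label_b} (Jaccard = {score:.2f})")
```

A low best score for a cluster would suggest that it has no clear counterpart in the other study. In practice, the outcome depends strongly on how marker genes were selected in each analysis, which is one reason uniform reprocessing is preferable.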
Such integrative analysis, as well as the validation of existing cellular populations and discovery of novel cellular populations, demands appropriate sharing of single-cell genomic data and accompanying metadata. Genomics data-sharing is mandated by American Thoracic Society Journal guidelines and funding agencies, including the NIH. Authors should strive to share as much data, metadata, and protocols as possible and to do so in a manner that facilitates retrieval and reuse of the data by the community. For an in-depth discussion of the current challenges in genomic data-sharing and paths for mitigating these challenges, we recommend an outstanding recent review by Byrd and colleagues (30). Peer review of manuscripts reporting findings from single-cell genomics assays also brings unique challenges for the reviewer and editor alike. Authors can mitigate these challenges by taking simple actions, such as providing a statement about the availability of the raw data and metadata on controlled repositories, including the Sequence Read Archive via the Database of Genotypes and Phenotypes, the European Genome-Phenome Archive, or the Chinese Genome Sequence Archive for Human. Authors should ensure that reviewers have access to processed data via public repositories (Gene Expression Omnibus or The European Bioinformatics Institute Archive) at the time of the manuscript submission for review. Reviewers can then be provided with access to detailed and well-annotated code, which allows reproduction of the analysis performed by the authors. This is usually achieved on platforms such as GitHub (https://github.com/) or Code Ocean (https://codeocean.com/); the latter takes reproducible analysis one step further and allows fully reproducible reanalysis in a preserved computational environment in the cloud (“containers”), thus alleviating potential issues related to outdated software packages and dependencies.
Finally, authors can facilitate peer review and data dissemination upon publication by providing an interactive web tool for easy and intuitive dataset exploration by reviewers and readers without computational expertise. Setting up such tools is not a complicated task, and the cost of maintaining such websites is low. Moreover, two popular platforms, CellBrowser (https://cells.ucsc.edu/) and Cellxgene (https://github.com/chanzuckerberg/cellxgene), offer dataset hosting on their websites, thus alleviating concerns about preserving the anonymity of the peer review.
Undoubtedly, sharing the code and presenting the analysis of single-cell genomics datasets in this manner requires substantial effort from the authors. On the positive side, this comes with the benefit of fast and transparent peer review, in which reviewers do not have to possess specialized skills or knowledge of software packages but can rather fully focus on the authors’ interpretation of the results and their relevance to pulmonary biology or disease. It is worth mentioning that peer review is a lengthy process, and sharing data, code, or analysis results up front poses the risk of “being scooped.” These risks are usually mitigated by depositing the manuscript linked to a specific dataset and analysis in one of the available and reliable preprint servers, such as arXiv.org or bioRxiv.org. A great example of such an approach is recent work from Habermann and colleagues, who made their high-quality, well-annotated dataset and accompanying code publicly available at the time of publishing the preprint, making it an invaluable resource for the community 10 months before its publication in a peer-reviewed journal (11).
In conclusion, single-cell genomics is rapidly becoming a standard tool for basic, clinical, and translational research. These changes call for updates to guidelines for sharing genomics and other big data. Because even publicly shared data and code can be challenging to reanalyze (data could be mislabeled, deposited in an unusual format, missing, or removed after upload), journals may eventually assume the role of a “data guardian” by cloning a version of the authors’ code to a specific GitHub repository and providing a snapshot (via checksum) of the files deposited in public repositories. Data and code integrity during and after the publication process would therefore be ensured. In this editorial, we have focused on a specific aspect related to single-cell genomics data. These ideas and suggestions, however, can and should be applied to other types of high-content data, such as imaging, metabolomics, proteomics, or mass cytometry, in the near future.
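A checksum snapshot of deposited files could be produced in many ways. The following is a minimal sketch, assuming the deposited files are available in a local directory and that SHA-256 is an acceptable digest; the function names and the manifest layout are illustrative assumptions and do not describe an existing journal or repository workflow.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory):
    """Map each file's path (relative to `directory`) to its SHA-256 digest."""
    root = Path(directory)
    return {
        path.relative_to(root).as_posix(): sha256_of_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(recorded, current):
    """List paths that were added, removed, or altered since the snapshot."""
    paths = set(recorded) | set(current)
    return sorted(p for p in paths if recorded.get(p) != current.get(p))
```

A manifest recorded at the time of acceptance can later be compared with a freshly computed one; an empty list from `changed_files` indicates that the deposited files are byte-for-byte identical to the snapshot.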
Acknowledgments
The authors thank Dr. Benjamin D. Singer and Dr. Paul T. Schumacker for productive discussions on sharing and reviewing genomics data in American Thoracic Society journals that helped to frame this editorial.
Footnotes
O.E. is supported by NIH grants R01HL146519 and 1R25HL146166-01, LongFonds grant LF20146.1.14.009, and CZF2019-002438 from the Chan Zuckerberg Initiative Foundation awarded to the HCA Lung Seed Network. A.V.M. is supported by NIH grants U19AI135964, P01AG049665, R56HL135124, and R01HL153312; Clinical and Translational Sciences Institute, Northwestern University COVID-19 Rapid Response grant; and CZF2019-002438 from the Chan Zuckerberg Initiative Foundation awarded to the Human Cell Atlas Lung Seed Network. P.J.S. is supported by NIH grant R01HL127001.
Originally Published in Press as DOI: 10.1164/rccm.202008-3073ED on September 11, 2020
Author disclosures are available with the text of this article at www.atsjournals.org.
References
- 1. Silvestri GA, Vachani A, Whitney D, Elashoff M, Porta Smith K, Ferguson JS, et al. AEGIS Study Team. A bronchial genomic classifier for the diagnostic evaluation of lung cancer. N Engl J Med. 2015;373:243–251. doi: 10.1056/NEJMoa1504601.
- 2. Alexander MJ, Budinger GRS, Reyfman PA. Breathing fresh air into respiratory research with single-cell RNA sequencing. Eur Respir Rev. 2020;29:200060. doi: 10.1183/16000617.0060-2020.
- 3. Travaglini KJ, Nabhan AN, Penland L, Sinha R. A molecular cell atlas of the human lung from single cell RNA sequencing [preprint]. bioRxiv. 2020. doi: 10.1038/s41586-020-2922-4. [accessed on 2020 Oct 8]. Available from: https://www.biorxiv.org/content/10.1101/742320v2.abstract.
- 4. Vieira Braga FA, Kar G, Berg M, Carpaij OA, Polanski K, Simon LM, et al. A cellular census of human lungs identifies novel cell states in health and in asthma. Nat Med. 2019;25:1153–1163. doi: 10.1038/s41591-019-0468-5.
- 5. Madissoon E, Wilbrey-Clark A, Miragaia RJ, Saeb-Parsy K, Mahbubani KT, Georgakopoulos N, et al. scRNA-seq assessment of the human lung, spleen, and esophagus tissue stability after cold preservation. Genome Biol. 2019;21:1. doi: 10.1186/s13059-019-1906-x.
- 6. Deprez M, Zaragosi L-E, Truchi M, Becavin C, Ruiz García S, Arguel M-J, et al. A single-cell atlas of the human healthy airways. Am J Respir Crit Care Med. [online ahead of print] 29 Jul 2020. doi: 10.1164/rccm.201911-2199OC.
- 7. Lambrechts D, Wauters E, Boeckx B, Aibar S, Nittner D, Burton O, et al. Phenotype molding of stromal cells in the lung tumor microenvironment. Nat Med. 2018;24:1277–1289. doi: 10.1038/s41591-018-0096-5.
- 8. Laughney AM, Hu J, Campbell NR, Bakhoum SF, Setty M, Lavallée V-P, et al. Regenerative lineages and immune-mediated pruning in lung cancer metastasis. Nat Med. 2020;26:259–269. doi: 10.1038/s41591-019-0750-6.
- 9. Montoro DT, Haber AL, Biton M, Vinarsky V, Lin B, Birket SE, et al. A revised airway epithelial hierarchy includes CFTR-expressing ionocytes. Nature. 2018;560:319–324. doi: 10.1038/s41586-018-0393-7.
- 10. Reyfman PA, Walter JM, Joshi N, Anekalla KR, McQuattie-Pimentel AC, Chiu S, et al. Single-cell transcriptomic analysis of human lung provides insights into the pathobiology of pulmonary fibrosis. Am J Respir Crit Care Med. 2019;199:1517–1536. doi: 10.1164/rccm.201712-2410OC.
- 11. Habermann AC, Gutierrez AJ, Bui LT, Yahn SL, Winters NI, Calvi CL, et al. Single-cell RNA sequencing reveals profibrotic roles of distinct epithelial and mesenchymal lineages in pulmonary fibrosis. Sci Adv. 2020;6:eaba1972. doi: 10.1126/sciadv.aba1972.
- 12. Adams TS, Schupp JC, Poli S, Ayaub EA, Neumark N, Ahangari F, et al. Single-cell RNA-seq reveals ectopic and aberrant lung-resident cell populations in idiopathic pulmonary fibrosis. Sci Adv. 2020;6:eaba1983. doi: 10.1126/sciadv.aba1983.
- 13. Valenzi E, Bulik M, Tabib T, Morse C, Sembrat J, Trejo Bittar H, et al. Single-cell analysis reveals fibroblast heterogeneity and myofibroblasts in systemic sclerosis-associated interstitial lung disease. Ann Rheum Dis. 2019;78:1379–1387. doi: 10.1136/annrheumdis-2018-214865.
- 14. Tsukui T, Sun K-H, Wetter JB, Wilson-Kanamori JR, Hazelwood LA, Henderson NC, et al. Collagen-producing lung cell atlas identifies multiple subsets with distinct localization and relevance to fibrosis. Nat Commun. 2020;11:1920. doi: 10.1038/s41467-020-15647-5.
- 15. Xu Y, Mizuno T, Sridharan A, Du Y, Guo M, Tang J, et al. Single-cell RNA sequencing identifies diverse roles of epithelial cells in idiopathic pulmonary fibrosis. JCI Insight. 2016;1:e90558. doi: 10.1172/jci.insight.90558.
- 16. Morse C, Tabib T, Sembrat J, Buschur KL, Bittar HT, Valenzi E, et al. Proliferating SPP1/MERTK-expressing macrophages in idiopathic pulmonary fibrosis. Eur Respir J. 2019;54:1802441. doi: 10.1183/13993003.02441-2018.
- 17. Jackson ND, Everman JL, Chioccioli M, Feriani L, Goldfarbmuren KC, Sajuthi SP, et al. Single-cell and population transcriptomics reveal pan-epithelial remodeling in type 2-high asthma. Cell Rep. 2020;32:107872. doi: 10.1016/j.celrep.2020.107872.
- 18. Goldfarbmuren KC, Jackson ND, Sajuthi SP, Dyjack N, Li KS, Rios CL, et al. Dissecting the cellular specificity of smoking effects and reconstructing lineages in the human airway epithelium. Nat Commun. 2020;11:2485. doi: 10.1038/s41467-020-16239-z.
- 19. Schupp JC, Khanal S, Gomez JL, Sauler M, Adams TS, Chupp GL, et al. Single-cell transcriptional archetypes of airway inflammation in cystic fibrosis. Am J Respir Crit Care Med. [online ahead of print] 30 Jun 2020. doi: 10.1164/rccm.202004-0991OC.
- 20. Liao M, Liu Y, Yuan J, Wen Y, Xu G, Zhao J, et al. Single-cell landscape of bronchoalveolar immune cells in patients with COVID-19. Nat Med. 2020;26:842–844. doi: 10.1038/s41591-020-0901-9.
- 21. Chua RL, Lukassen S, Trump S, Hennig BP, Wendisch D, Pott F, et al. COVID-19 severity correlates with airway epithelium-immune cell interactions identified by single-cell analysis. Nat Biotechnol. 2020;38:970–979. doi: 10.1038/s41587-020-0602-4.
- 22. He J, Cai S, Feng H, Cai B, Lin L, Mai Y, et al. Single-cell analysis reveals bronchoalveolar epithelial dysfunction in COVID-19 patients. Protein Cell. 2020;11:680–687. doi: 10.1007/s13238-020-00752-4.
- 23. Strunz M, Simon LM, Ansari M, Kathiriya JJ, Angelidis I, Mayr CH, et al. Alveolar regeneration through a Krt8+ transitional stem cell state that persists in human lung fibrosis. Nat Commun. 2020;11:3559. doi: 10.1038/s41467-020-17358-3.
- 24. Kobayashi Y, Tata A, Konkimalla A, Katsura H, Lee RF, Ou J, et al. Persistence of a regeneration-associated, transitional alveolar epithelial cell state in pulmonary fibrosis. Nat Cell Biol. 2020;22:934–946. doi: 10.1038/s41556-020-0542-8.
- 25. Jiang P, Gil de Rubio R, Hrycaj SM, Gurczynski SJ, Riemondy KA, Moore BB, et al. Ineffectual type 2-to-type 1 alveolar epithelial cell differentiation in idiopathic pulmonary fibrosis: persistence of the KRT8hi transitional state. Am J Respir Crit Care Med. 2020;201:1443–1447. doi: 10.1164/rccm.201909-1726LE.
- 26. Carraro G, Mulay A, Yao C, Mizuno T, Konda B, Petrov M, et al. Single-cell reconstruction of human basal cell diversity in normal and idiopathic pulmonary fibrosis lungs. Am J Respir Crit Care Med. 2020;202:1540–1550. doi: 10.1164/rccm.201904-0792OC.
- 27. Tata PR, Rajagopal J. Plasticity in the lung: making and breaking cell identity. Development. 2017;144:755–766. doi: 10.1242/dev.143784.
- 28. Zepp JA, Morrisey EE. Cellular crosstalk in the development and regeneration of the respiratory system. Nat Rev Mol Cell Biol. 2019;20:551–566. doi: 10.1038/s41580-019-0141-3.
- 29. Lotfollahi M, Naghipourfar M, Luecken MD, Khajavi M, Büttner M, Avsec Z, et al. Query to reference single-cell integration with transfer learning [preprint]. bioRxiv. 2020 [accessed on 2020 Oct 8]. Available from: https://www.biorxiv.org/content/10.1101/2020.07.16.205997v1.
- 30. Byrd JB, Greene AC, Prasad DV, Jiang X, Greene CS. Responsible, practical genomic data sharing that accelerates research. Nat Rev Genet. [online ahead of print] 21 Jul 2020. doi: 10.1038/s41576-020-0257-5.