Abstract
Bioinformatic analysis of large and complex omics datasets has become increasingly useful in modern-day biology by providing a great depth of information, with its application to neuroscience termed neuroinformatics. Data mining of omics datasets has enabled the generation of new hypotheses based on differentially regulated biological molecules associated with disease mechanisms, which can be tested experimentally for improved diagnostic and therapeutic targeting of neurodegenerative diseases. Importantly, integrating multi-omics data using a systems bioinformatics approach will advance the understanding of the layered and interactive network of biological regulation that exchanges systemic knowledge to facilitate the development of a comprehensive human brain profile. In this review, we first summarize data mining studies utilizing datasets from individual types of omics analysis, including epigenetics/epigenomics, transcriptomics, proteomics, metabolomics, lipidomics, and spatial omics, pertaining to Alzheimer's disease, Parkinson's disease, and multiple sclerosis. We then discuss multi-omics integration approaches, including independent biological integration and unsupervised integration methods, for more intuitive and informative interpretation of the biological data obtained across different omics layers. We further assess studies that integrate multi-omics in data mining, which provide intricate biological insights and offer proof-of-concept support for systems bioinformatics in the reconstruction of brain networks. Finally, we recommend a combination of high-dimensional bioinformatics analysis with experimental validation to achieve translational neuroscience applications including biomarker discovery, therapeutic development, and elucidation of disease mechanisms.
We conclude by providing future perspectives and opportunities in applying integrative multi-omics and systems bioinformatics to achieve precision phenotyping of neurodegenerative diseases and towards personalized medicine.
Keywords: Multi-omics integration, Systems bioinformatics, Data mining, Human brain profile reconstruction, Translational neuroscience
Highlights
• Significance of data mining of multi-omics datasets in translational neuroscience.
• Multi-omics data integration approaches: independent biological integration and unsupervised integration.
• Systems bioinformatics approach: towards the reconstruction of a comprehensive human brain profile.
• Applications in biomarker discovery, therapeutic development, and elucidation of pathogenic mechanisms of neurodegeneration.
1. Introduction
The beginnings of bioinformatics date back more than five decades, and the field has since witnessed the parallel advances of computer science and experimental biology [1], including the advent of next-generation sequencing and omics technologies [2]. Omics technologies and analyses have been widely applied in neuroscience studies, ranging from the detection of alterations in genes (epigenetics/epigenomics), mRNA (transcriptomics), proteins (proteomics), and metabolites (metabolomics/lipidomics) at the molecular scale in the brain, to their localization in different anatomical regions (spatial omics) for constructing brain atlases, and the study of dynamics in biological processes by monitoring changes in individual cells (single-cell/single-nucleus trajectory inference) [[3], [4], [5], [6], [7], [8]]. Importantly, these omics studies provide a great depth of information and contain datasets that can be further analyzed and interpreted to support separate experimental observations or generate new hypotheses in neuroscience research [9].
Data mining of existing omics datasets to obtain novel biological insights opens new avenues to deepen the understanding of the clinical phenotypes, neuropathological features, disease progression, and pathogenic mechanisms of neurological disorders including Alzheimer's disease (AD), Parkinson's disease (PD) and multiple sclerosis (MS) [10]. In addition, data mining overcomes the technical challenges in conducting the initial experiments such as difficulty in obtaining precious tissue samples, stringent requirements in sample preparation, and high cost, while creating opportunities for new diagnostic and therapeutic strategies targeting neurodegenerative diseases. While single omics analyses have provided valuable disease mechanistic insights, recent studies have shown that integrative multi-omics analysis can help to define the connection and relationship among the different types of omics datasets to unravel brain networks regulating transitions from health to the development of neurological diseases and to classify clinically relevant subgroups to identify potential biomarkers [11,12].
It is hence crucial to understand the neurodegenerative pathology from the perspective of each individual type of omics data obtained from human samples, followed by adopting the systems bioinformatics approach to integrate the multiscale and multisource big data from each omics layer to allow for a holistic analysis of the complex brain system (Fig. 1A). Importantly, there are heterogeneous omics profiles under various disease conditions, such as neuroinflammation, neurodegeneration, and neuroimmune dysregulation, which can be characterized by different combinations of alterations in the omics layers (Fig. 1B). In addition, further development and optimization of multi-omics integration approaches and pipelines will potentially enable the reconstruction of comprehensive brain networks and pathological profiles reflective of the biological systems and the microenvironments under specific disease states (Fig. 1C) [5,[13], [14], [15], [16]]. Furthermore, there is a need for thorough interpretation of the outcomes from data mining and their relevance to the true biological observations.
In this review, we first summarize data mining based translational neuroscience studies that performed secondary bioinformatics analysis utilizing deposited datasets from a spectrum of omics technologies including epigenetics, transcriptomics, proteomics, metabolomics, lipidomics, and spatial omics. We then discuss current methods of multi-omics integration and propose a systems bioinformatics approach to work towards the reconstruction of brain networks and pathological profiles for increased accuracy and reliability in recapitulating the true biological systems of the brain. We further recommend combining high dimensional bioinformatics analysis with experimental validation in biomarker discovery, therapeutic development, and elucidation of disease mechanisms. We conclude by providing future perspectives and opportunities in utilizing integrative multi-omics and systems bioinformatics to achieve targeted therapies for neurodegenerative diseases and advance towards personalized medicine.
2. Omics spectrum-based data mining in translational neuroscience
Many factors contribute to neurodegeneration, including, but not limited to, pathogenic mutations leading to mutant protein production and toxic protein aggregation [[17], [18], [19], [20]], as well as altered cytokine production and dysregulated cellular signaling pathways [21,22]. As opposed to conventional interventions aimed at a specific toxic protein or aberrant receptor signaling, interventions at the gene level can more effectively reduce the negative downstream cellular pathogenic effects that arise from the malfunctioning of a single protein. Gene targeting strategies can be particularly beneficial in neurodegenerative diseases that result from the toxic gain-of-function of a protein where a significant loss of its normal function does not have adverse effects on the cells [23]. Furthermore, there is a need for better understanding of the regional effects of a gene or protein in the brain as well as the time-dependent changes in their expression or functions to enable specific targeting and more effective treatments. In this section, we will summarize the individual omics-based data mining studies in translational neuroscience and their applications in guiding biomarker and therapeutic development.
2.1. Epigenetics/epigenomics analysis
Despite recent advancements in experimental and computational tools to analyze neurodegenerative diseases, a genetic basis for these diseases has yet to be fully elucidated [24]. In this section, we will discuss the gene-centric view of understanding the role of epigenetics/epigenomics in the pathogenesis of neurodegenerative diseases. Epigenetics can be summarized as heritable changes in gene function which are not encoded by nucleotide sequences in DNA, but influence gene expression and subsequent protein expression levels without altering the DNA sequence [25,26]. These changes can be due to major epigenetic mechanisms such as DNA methylation, histone modifications, chromatin remodeling, and non-coding RNA regulation, as well as environmental factors such as diet and exposure to chemicals, which are dynamic and reversible [[24], [25], [26], [27]], making them viable targets for therapeutic developments [28]. Experimental studies exploring the role of epigenetic mechanisms in neurodegenerative diseases have found that DNA methylation variations may affect beta-amyloid (Aβ) and tau deposition in AD, and expression of α-synuclein in PD [29,30]. While epigenetics/epigenomics datasets are deposited in various databases, including DeepBlue [31], EWAS Open Platform [32], Genomic Expression Archive (GEA) [33], Genome-Wide Association Studies (GWAS) Catalog [34], IHEC Data Portal [35], National Cell Repository for Alzheimer's Disease (NCRAD) [36], Roadmap Epigenomics [37], and Gene Expression Omnibus (GEO) [38] (Table 1, Epigenetics/Epigenomics database), there is a limited number of data mining studies.
Table 1.
Databases and repositories | Data types | Refs. |
---|---|---|
DeepBlue Epigenomic Data Server | Epigenetics/Epigenomics | [31] |
Epigenome-Wide Association Study (EWAS) Open Platform | Epigenetics/Epigenomics | [32] |
Genomic Expression Archive (GEA) | Epigenetics/Epigenomics | [33] |
Genome-Wide Association Studies (GWAS) Catalog | Epigenetics/Epigenomics | [34] |
International Human Epigenome Consortium (IHEC) Portal | Epigenetics/Epigenomics | [35] |
National Cell Repository for Alzheimer's Disease (NCRAD) | Epigenetics/Epigenomics | [36] |
Roadmap Epigenomics | Epigenetics/Epigenomics | [37] |
Gene Expression Omnibus (GEO) Database | Epigenetics/Epigenomics/Bulk RNA-seq | [38] |
Mount Sinai Brain Bank | Bulk RNA-seq | [54] |
ROSMAP Database | Bulk RNA-seq | [55] |
ARCHS4 Database | Bulk RNA-seq/sc/snRNA-seq | [56] |
DDBJ Sequence Read Archive (SRA) | Bulk RNA-seq/sc/snRNA-seq | [57] |
Synapse Database | Bulk RNA-seq/sc/snRNA-seq | [58] |
scREAD Database | sc/snRNA-seq | [69] |
Allen Brain Map | sc/snRNA-seq | [70] |
DRscDB Database | sc/snRNA-seq | [71] |
Global Proteome Machine Database (GPMDB) | Proteomics | [87] |
jPOSTrepo (Japan ProteOme STandard Repository) | Proteomics | [88] |
MassIVE | Proteomics | [89] |
Proteomic Data Commons | Proteomics | [90] |
ProteomeXchange | Proteomics | [91] |
PRoteomics IDEntifications Database (PRIDE) | Proteomics/Spatial Omics | [92] |
ProteomicsDB | Proteomics/Spatial Omics | [93] |
Alzheimer's Disease Neuroimaging Initiative (ADNI) | Metabolomics/Lipidomics | [110] |
Cerebrospinal Fluid Metabolome Database | Metabolomics/Lipidomics | [111] |
Human Metabolome Database | Metabolomics/Lipidomics | [112] |
Lipid Bank | Metabolomics/Lipidomics | [117] |
LipidBlast | Metabolomics/Lipidomics | [118] |
Lipid MAPS | Metabolomics/Lipidomics | [119] |
MetaboAge Database | Metabolomics/Lipidomics | [120] |
MetaboLights | Metabolomics/Lipidomics | [121] |
Metabolomics Workbench | Metabolomics/Lipidomics | [122] |
MetabolomeXchange | Metabolomics/Lipidomics | [123] |
Serum Metabolome Database | Metabolomics/Lipidomics | [124] |
Omics Discovery Index (OmicsDI) | All Omics | [127] |
Dynamic Proteomics | Spatial Omics | [165] |
Giotto | Spatial Omics | [166] |
Spatial TranscriptOmics DataBase (STOmicsDB) | Spatial Omics | [167] |
SpatialDB | Spatial Omics | [168] |
This has led to the need to overcome practical impediments, including developing algorithms and models for large-scale data mining of epigenetics/epigenomics datasets [39]. One data mining study utilizing DNA methylation data obtained from the GEO database implemented a supervised machine learning algorithm, including the construction of a differential network related to aging acceleration and the use of the Markov Chain Monte Carlo method of global sensitivity analysis, to better understand the accelerated epigenetic aging mechanisms of various neurodegenerative diseases [40]. Their results indicated that individuals with neurodegenerative diseases exhibited a significantly accelerated aging pattern. Specifically, they found that CDCA7L and EFNB2 differed significantly from other genes in AD and PD, respectively. CDCA7L is involved in neuronal death, while EFNB2 is involved in apoptosis, the development of the nervous system, and neuronal migration. Their analysis further revealed that DUSP12 had the largest betweenness (a network centrality measure) across different disease types, and that DUSP12 may regulate the c-Jun N-terminal kinase signaling pathway by dephosphorylating its substrate, which is critical to cell differentiation, apoptosis, and other neural functions in the progression of neurodegenerative diseases [40].
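To make the notion of betweenness in a gene network concrete, the following is a minimal sketch of Brandes' algorithm for an undirected, unweighted graph. It is not the pipeline used in [40]; the function name `betweenness_centrality`, the toy network, and its node names are hypothetical placeholders and do not represent real gene interactions.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Unnormalised betweenness for an undirected, unweighted graph (Brandes)."""
    scores = {node: 0.0 for node in adjacency}
    for source in adjacency:
        # Breadth-first search from `source`, counting shortest paths.
        order = []
        predecessors = {node: [] for node in adjacency}
        n_paths = {node: 0 for node in adjacency}
        distance = {node: -1 for node in adjacency}
        n_paths[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    n_paths[w] += n_paths[v]
                    predecessors[w].append(v)
        # Back-propagate pair dependencies, farthest nodes first.
        dependency = {node: 0.0 for node in adjacency}
        for w in reversed(order):
            for v in predecessors[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1.0 + dependency[w])
            if w != source:
                scores[w] += dependency[w]
    # Each undirected pair was counted once from each endpoint.
    return {node: score / 2.0 for node, score in scores.items()}


# Hypothetical toy network (placeholder node names, assumed edges).
toy_network = {
    "hub": ["g1", "g2", "g3"],
    "g1": ["hub"],
    "g2": ["hub"],
    "g3": ["hub", "g4"],
    "g4": ["g3"],
}
centrality = betweenness_centrality(toy_network)
top_node = max(centrality, key=centrality.get)
```

In this toy graph the node lying on the most shortest paths between other nodes receives the highest score, which is the sense in which a gene with the largest betweenness acts as a connector in a differential network.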
Another data mining study implemented a new computational framework, including the use of the DBSCAN algorithm and Limma statistical methods, to analyze GEO datasets and identified 21 and 89 differentially methylated genes for AD and Down syndrome, respectively [41]. Their evaluation indicated high classification accuracy of these two methylation signatures, at 92% for AD and 70% for Down syndrome. Their framework is capable of detecting outlier-free epigenetic signatures in complex diseases, with applications to analyze various epigenetic signatures throughout disease pathogenesis [41]. Studies performing meta-analyses of epigenomic datasets have found differentially methylated genes in varying brain regions [41], as well as age-associated methylation patterns concurrent with epigenetic dysregulation observed in AD [42].
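As an illustration of how a density-based method such as DBSCAN can flag outlying measurements, the sketch below applies a plain one-dimensional DBSCAN to methylation values at a single site. This is an assumption about how such a step might look, not the framework of [41]; the function `dbscan_1d`, the parameter values, and the beta values are hypothetical.

```python
def dbscan_1d(values, eps, min_samples):
    """Label each value with a cluster id (0, 1, ...) or -1 for noise."""
    n = len(values)
    labels = [None] * n

    def neighbours(i):
        # All points (including i itself) within eps of point i.
        return [j for j in range(n) if abs(values[i] - values[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = neighbours(j)
            if len(j_neighbours) >= min_samples:
                queue.extend(j_neighbours)  # j is a core point: keep expanding
    return labels


# Hypothetical methylation beta values for one CpG site across six samples.
beta_values = [0.41, 0.43, 0.42, 0.44, 0.40, 0.95]
labels = dbscan_1d(beta_values, eps=0.05, min_samples=3)
inliers = [v for v, label in zip(beta_values, labels) if label != -1]
```

Samples labelled -1 would be treated as outliers and excluded before a downstream differential methylation test; the choice of `eps` and `min_samples` here is arbitrary and would need tuning on real data.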
It is important to note that epigenetic features and RNA expression as well as the subsequent protein expression level do not necessarily have a direct correlation [43]. With most drugs targeting proteins, it is vital to take a systems bioinformatics approach and integrate multiple types of omics data to holistically understand the disease mechanisms and regulation of protein synthesis at all levels of the central dogma [44]. There is also a need to characterize epigenetic changes at the cell-specific level with spatial resolution [[45], [46], [47], [48], [49]]. Going forward, personalized medicine aimed at targeting specific epigenetic changes in patients with neurodegenerative diseases is poised to drastically transform diagnostic and therapeutic strategies [50].
2.2. Bulk and single-cell/single-nucleus RNA sequencing analysis
One of the most commonly used methods in transcriptomics for gaining insight into which genes are differentially expressed in varying disease states is RNA sequencing (RNA-seq). Transcriptomics analysis has been used to produce data representative of mRNA expression levels of tens of thousands of genes, as well as to identify the differentially expressed genes (DEGs) in various biological samples between patients and healthy controls. RNA-seq allows for the sequencing of the whole transcriptome [51] and provides additional information on splice variants or non-coding RNA [52,53]. The databases that archive RNA-seq datasets include GEO [38], Mount Sinai Brain Bank [54], ROSMAP database [55], ARCHS4 [56], DDBJ Sequence Read Archive (SRA) [57] and Synapse database [58] (Table 1, Bulk RNA-seq database).
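The idea of a DEG can be sketched with a log2 fold change cutoff combined with Welch's t-statistic. This is a deliberately simplified illustration under stated assumptions: the function names, thresholds, gene labels, and expression values are hypothetical, and a real analysis would typically rely on a dedicated package (for example DESeq2, edgeR, or limma) with normalization and multiple-testing correction.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t-statistic for two samples with possibly unequal variances."""
    standard_error = math.sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error


def call_degs(expression, min_abs_log2fc=1.0, pseudocount=1.0):
    """Return (gene, log2 fold change, t-statistic) for genes passing the cutoff."""
    hits = []
    for gene, (patients, controls) in expression.items():
        # The pseudocount avoids division by zero for unexpressed genes.
        log2fc = math.log2(
            (mean(patients) + pseudocount) / (mean(controls) + pseudocount)
        )
        if abs(log2fc) >= min_abs_log2fc:
            hits.append((gene, log2fc, welch_t(patients, controls)))
    # Strongest evidence first, judged by the absolute t-statistic.
    return sorted(hits, key=lambda hit: abs(hit[2]), reverse=True)


# Hypothetical normalized expression values: (patients, controls) per gene.
toy_expression = {
    "gene_up": ([40.0, 44.0, 36.0], [10.0, 12.0, 8.0]),
    "gene_flat": ([20.0, 22.0, 18.0], [21.0, 19.0, 20.0]),
    "gene_down": ([5.0, 7.0, 6.0], [24.0, 26.0, 22.0]),
}
degs = call_degs(toy_expression)
```

A positive log2 fold change marks a gene as upregulated in patients and a negative value as downregulated; genes with no within-group variance would need separate handling, which this sketch omits.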
A data mining study focusing on biomarker discovery for AD utilized RNA-seq datasets stored in the DDBJ SRA and identified the gene NEUROD6 to be downregulated in the brain tissue of AD patients [59]. Another study utilizing RNA-seq datasets obtained from the ROSMAP database and Mayo Clinic studies found that disease pseudotime (an arbitrary unit of time to measure a cell's progression) in AD is significantly concordant with the burden of tau, Aβ, and cognitive diagnosis of late-onset AD [60]. Additionally, it was reported that early-stage disease pseudotime samples show changes in basic cellular functions, while the late-stage disease pseudotime samples show changes in neuroinflammation and amyloid pathologic processes [60].
Another data mining study used the DDBJ SRA to obtain RNA-seq datasets from 26 different studies involving brain tissues and blood samples of AD and PD patients for meta-analysis [61]. By applying a random forest-based machine learning algorithm to analyze existing central and peripheral transcriptomic data, it was found that there is little overlap between the transcriptomic signatures of AD and PD. Interestingly, the study revealed an overlap between central and peripheral transcriptomic signatures in PD that are characterized by anomalies in exocytosis and specific genes related to the SNARE complex including vesicle-associated membrane protein 2 (VAMP2), syntaxin 1A (STX1A), and p21-activated kinase 1 (PAK1) [61]. In a separate PD study making use of RNA-seq datasets from the GEO database, several genes including RPL21, RPL34, CKS2, B2M, TNFRSF10A, DTX2, and HLA-B were shown to be upregulated in PD brain tissues [62]. In MS, a data mining study analyzing RNA-seq datasets from the GEO database reported that inactive MS brain lesions contain significantly more M2 macrophages compared to normal white matter controls [63].
Besides bulk RNA-seq, studies using single-cell RNA-seq (scRNA-seq) and single-nucleus RNA-seq (snRNA-seq) techniques are of high interest because they provide not only the average expression level for an ensemble of cells such as in the typical RNA-seq analysis [64,65], but also the ability to quantify gene expression levels in specific cell types [[64], [65], [66], [67], [68]]. Both techniques provide greater depth and insight into the analyzed data when compared to bulk RNA-seq. The scRNA-seq and snRNA-seq approaches are important in revealing cell subpopulations and intercellular heterogeneity, understanding regulatory relationships between genes, and tracking trajectories of distinct cell lineages in development, to understand disease pathogenesis [64,65]. In particular, the ability of scRNA-seq and snRNA-seq techniques to dissect the functional changes of highly heterogeneous cells in the brain at the single-cell level can significantly improve our understanding of the vulnerability of particular cell types in certain neurodegenerative diseases [69]. Some of the common databases and repositories used for data mining of scRNA-seq/snRNA-seq datasets include ARCHS4 [56], DDBJ SRA [57], Synapse database [58], Allen Brain Map [70], DRscDB database [71], and scREAD database [69] (Table 1, sc/snRNA-seq database).
A study making use of scRNA-seq datasets from scREAD to analyze the entorhinal cortex of AD brains found that phosphoinositide 3‑kinase (PI3K)/protein kinase B (AKT) signaling, Wnt signaling, neuroactive ligand-receptor interaction pathways, and neurodegeneration pathways were significantly impaired in astrocytes from the entorhinal cortex of AD patients [72]. A similar data mining study using scRNA-seq data in combination with bulk RNA-seq data obtained from the Synapse database found significant upregulation of PLCG2 expression in AD patients, which positively correlates with amyloid plaque density [73]. This finding was validated using an AD mouse model, which showed increased PLCG2 expression associated with amyloid pathology and disease progression, and in which reducing microglia reversed the disease pathology [73]. A different study analyzed three snRNA-seq datasets [[74], [75], [76]] and showed that LINGO1 is upregulated in both excitatory neurons and oligodendrocytes, together with indications of mitochondrial and estrogen signaling dysfunction in AD [77].
Finally, a study integrated both scRNA-seq datasets from scREAD and bulk RNA-seq datasets obtained from the Mount Sinai Brain Bank and ROSMAP databases for a comprehensive drug repositioning analysis [78]. They identified multiple new candidates for AD treatment, such as trichostatin, which was predicted to be broadly applicable to different AD subtypes, and vorinostat, which was specific for one subtype of AD; both are histone deacetylase inhibitors [78]. It is important to note the lack of scRNA-seq and snRNA-seq data mining studies for PD and MS, which might be due to the limited number of databases dedicated to compiling scRNA-seq/snRNA-seq data for these diseases. While providing greater depth, scRNA-seq and snRNA-seq lack the spatial information that is achievable with image-based transcriptomics, which we discuss in a subsequent section (Section 2.5).
2.3. Proteomics analysis
Functional proteins and their interactions with other molecules are essential for biological processes and cellular functions which govern the disease mechanisms in neurodegenerative diseases [79]. Importantly, the relationship between RNA and protein levels is not linear, and some genes may not even be translated into functional proteins [[80], [81], [82]]. Therefore, it is important to characterize protein expression in addition to RNA quantification when investigating disease states [79]. Proteomics provides additional biological insights such as protein-abundance differences in proteomes, time-dependent expression patterns, post-translational modifications, and protein-protein interactions (PPIs) that otherwise could not be obtained from transcriptomics [79]. These parameters have been shown to have paramount effects on the pathogenesis and progression of neurodegenerative diseases [83].
Proteins rarely act as isolated machinery; rather, their functionality is highly related to the proteins they interact with. Therefore, proteins whose function is well understood may be used to predict the function of unidentified proteins [84]. Both experimental and computational studies of PPIs have enabled and expedited the modeling of functional pathways to elucidate the pathogenic mechanisms of cellular processes and identify their translational applications [85,86]. There is currently a lack of data mining of proteomics studies, although there have been efforts to compile mass spectrometry proteomics datasets into databases such as Global Proteome Machine Database (GPMDB) [87], jPOSTrepo [88], MassIVE [89], Proteomic Data Commons [90], ProteomeXchange [91], PRoteomics IDEntifications Database (PRIDE) [92], and ProteomicsDB [93] (Table 1, Proteomics database). ProteomicsDB is an example of a database that allows users to explore and retrieve protein abundance values across different tissues, cell lines, and body fluids [93].
A neurodegenerative disease-focused data mining study compiled proteomics studies of post-mortem brain tissue and used a meta-analysis approach to discover that biological processes related to the organization of the extracellular matrix, metabolism of glycosaminoglycans and proteoglycans, blood coagulation, response to injury, and oxidative stress were highly dysregulated in AD, PD, and Huntington's disease through PPI network analysis and Gene Ontology (GO) enrichment analysis [94]. Another study utilized the PRIDE database to study post-translational modifications in AD patients. They reported 103 proteins with post-translational modifications that are uniquely expressed across brain regions with no tangles, intermediate tangles, and severe tangles [95]. The bioinformatics analysis suggested the association of these proteins with AD progression through platelet activation, and they were found to be enriched for the tricarboxylic acid cycle (Krebs cycle), respiratory electron cycle, and detoxification of reactive oxygen species [95]. Another proteomics study making use of meta-analysis found that pathways related to synaptic signaling, oxidative phosphorylation, immune response, and extracellular matrix were commonly dysregulated in AD through bioinformatic gene set enrichment analysis with the Enrichr web server [96,97].
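Enrichment analyses of the kind mentioned above are often framed as an over-representation test. The sketch below computes a one-sided hypergeometric p-value for the overlap between a list of dysregulated proteins and an annotated term. It is a generic illustration, not necessarily the statistic used by the cited studies or by Enrichr, and the function name and example counts are hypothetical.

```python
from math import comb


def enrichment_p_value(n_background, n_in_term, n_selected, n_overlap):
    """One-sided hypergeometric p-value: P(overlap >= n_overlap).

    n_background: all proteins/genes that could have been selected
    n_in_term:    background members annotated to the term (e.g. a GO term)
    n_selected:   size of the dysregulated list
    n_overlap:    dysregulated members annotated to the term
    """
    upper = min(n_in_term, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_in_term, k) * comb(n_background - n_in_term, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 10 background proteins, 4 annotated to the term,
# 3 dysregulated proteins of which 2 carry the annotation.
p_value = enrichment_p_value(n_background=10, n_in_term=4, n_selected=3, n_overlap=2)
```

In practice, the test is repeated for every term and the resulting p-values are corrected for multiple testing before a pathway is reported as enriched.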
Proteomic expression data provide insight into the involvement of post-transcriptional editing and quantify the protein-coding mRNAs that make it through translation and ultimately play critical roles as functional proteins [80,81,98]. As bioinformatics analysis is becoming more refined and expansive, it is vital for technological advances to keep up with the growing need for specificity. The importance of combining transcriptomics and proteomics is now well established [99,100]. Single-cell proteomics analysis techniques are now bridging the gap by meeting the growing need for specificity in understanding cell heterogeneity in neurodegenerative disease tissues [99]. Single-cell proteomics makes use of mass spectrometry techniques for proteome quantification by analyzing individual cells one at a time [99,100]. This new and growing field will give novel insights into cell heterogeneity in tissue samples, as well as provide information on transcriptional regulation of DEGs when compared to scRNA-seq datasets [82,99,101].
2.4. Metabolomics/lipidomics analysis
Metabolomics and lipidomics were once classified under the same umbrella, but each now occupies an independent domain owing to the large range of studies that each method has extensively characterized [102]. Metabolomics focuses mostly on examining polar metabolites such as sugars, amino acids, organic acids, and nucleotides, which are usually the end products of complex biochemical cascades. On the other hand, lipidomics strives to identify lipid molecular species, which should be analyzed separately from small-molecule metabolites due to their hydrophobic nature [103,104]. Importantly, with the brain ranking second among organs in lipid concentration and diversity, lipid dysregulation has been largely linked to AD, PD, and MS due to the vital tasks of lipids in myelination of neurons and signal transduction via lipid mediators [105].
Metabolomics and lipidomics datasets are typically acquired from similar biological samples using common analysis methods such as mass spectrometry, ion chromatography, liquid chromatography, and nuclear magnetic resonance [103,106,107]. Data collected from metabolomics and proteomics are not mutually exclusive to each other, due to the intertwined relationship they both have to biological processes involved in cellular homeostasis and pathogenesis of neurodegenerative diseases, which will contribute to potential diagnosis and therapeutic targeting of these diseases [108,109]. With the expanding experimental data being collected, there are now many openly accessible metabolomics and lipidomics databases such as Alzheimer's Disease Neuroimaging Initiative (ADNI) database [110], Cerebrospinal Fluid (CSF) Metabolome Database [111], Human Metabolome Database (HMDB) [[112], [113], [114], [115], [116]], Lipid Bank [117], LipidBlast [118], Lipid MAPS [119], MetaboAge Database [120], MetaboLights [121], Metabolomics Workbench [122], MetabolomeXchange [123], and Serum Metabolome Database [124] (Table 1, Metabolomics/Lipidomics database).
In terms of lipidomics, a data mining study utilized datasets obtained from the ADNI database [110], which consisted of 349 serum samples obtained from 806 participants, to investigate lipid metabolism in AD [125]. They found lipid desaturation, elongation, and acyl chain remodeling processes to be disturbed in the blood of AD patients. The study further tested the association between sets of blood lipids and known AD biomarkers and showed that Aβ in the CSF correlates with glucosylceramides, lysophosphatidylcholines, and unsaturated triacylglycerides [125]. On the other hand, there is a scarcity of metabolomics-based data mining studies associated with neurodegenerative diseases. To investigate aging and associated diseases, the MetaboAge database has compiled metabolomics data from dozens of studies reporting statistically significant changes in metabolites associated with ageing in healthy individuals [120]. This database may serve as an informative platform to compare metabolic changes between ageing and the mechanisms of neurodegenerative diseases obtained from other databases to facilitate future data mining studies using metabolomics datasets.
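An association between a lipid measurement and a biomarker such as CSF Aβ can be illustrated with a rank correlation. The sketch below implements Spearman's coefficient as the Pearson correlation of average ranks. The choice of Spearman correlation is an assumption for illustration and is not taken from [125]; the function names and all values are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical paired measurements for four participants.
lipid_level = [1.0, 2.0, 3.0, 4.0]
csf_abeta = [10.0, 20.0, 15.0, 40.0]
rho = spearman(lipid_level, csf_abeta)
```

A rank-based coefficient is a common choice when the relationship is expected to be monotonic but not necessarily linear; testing many lipids against several biomarkers would additionally require multiple-testing correction.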
For example, several data mining studies have utilized metabolomics datasets to understand the changes in metabolites in other neurological diseases such as glaucoma and depression. A meta-analysis study looking at primary open angle glaucoma (POAG) identified aminoacyl-tRNA biosynthesis and arginine metabolism, which play important roles in immune responses, as being dysregulated in patients with POAG compared to controls [126]. Another study examining altered metabolites in depression compiled 5,675 metabolite entries from 464 studies collected from metabolomic databases, including HMDB, MetaboLights [121], Metabolomics Workbench [122], MetabolomeXchange [123], as well as Omics Discovery Index [127], which contains all omics datasets, together with an extensive literature survey [128]. They found that patients with depression had lower levels of brain gamma-aminobutyric acid and glutamate/glutamine, and that tryptophan metabolism-related metabolites such as serotonin, 5-hydroxyindoleacetic acid, quinolinic acid, and tryptophan were most frequently changed after treatment [128].
2.5. Spatial omics analysis
Spatial omics technologies have provided new opportunities to visualize the anatomical localization of biological molecules to enable the investigation of the structural organization of complex tissue as well as visualization of the interactions between cells and their tissue microenvironments [[129], [130], [131], [132]]. While spatial analysis has been increasingly applied to all types of omics studies, most of the current studies focus on spatial transcriptomics and proteomics analyses. A number of technological advances have enabled transcriptomics and proteomics profiling where the transcripts or proteins can be assigned to their specific cell types and cell location [[133], [134], [135], [136]], revealing distinct spatial patterns of cells in tissues that were previously inferred through indirect means [[137], [138], [139]]. Techniques used for spatial transcriptomics analysis include fluorescence in situ hybridization (FISH) [140], seqFISH+ [136], and multiplexed error-robust FISH (MERFISH) [133], while spatial proteomics analysis makes use of techniques such as cytometry by time of flight (CyTOF) [141] and highly multiplexed immunofluorescence imaging approaches [142]. These analytical techniques have been applied to analyze AD [76,[143], [144], [145]], PD [[146], [147], [148]], and MS tissues [[149], [150], [151], [152], [153]].
To facilitate the interpretation of spatial transcriptomics and proteomics datasets, interactive visualization tools are typically used, including SpatialLIBD [154], SpatialExperiment [155], Bento [156], MSnbase [157], pRoloc [157], Squidpy [158], Spatial Multi-Omics (SM-Omics) [159], ATHENA [160], and TRANSPIRE [161,162]. These analysis tools process single-cell transcriptomics and proteomics data and compute spatial statistics of subcellular RNA and protein molecular distributions, compartmental expression, and cell morphology to build multidimensional biological features associated with diseased states as compared to controls [156]. Databases and repositories can further facilitate the comprehensive archiving and exploration of spatial omics datasets to enable data mining or comparisons with other experimental data [163,164]. Commonly used databases include PRIDE [92], ProteomicsDB [93], Dynamic Proteomics [165], Giotto [166], Spatial TranscriptOmics DataBase (STOmicsDB) [167], and SpatialDB [168] (Table 1, Spatial Omics database). For example, SpatialDB contains functions such as the ability to search for relevant publications and tools, public dataset visualization, customized specialized databases, new data archive, and online analysis [168]. It is important to note that while there are several analytical tools and databases available to aid in spatial transcriptomics and proteomics data analysis, not many studies have made use of such tools for data mining.
A prominent example is Giotto, which provides a rich variety of algorithms that enable robust spatial data analysis, and a user-friendly platform for data visualization and exploration, including characterizations of tissue composition, spatial expression patterns, and cellular interactions [166]. Giotto has been shown to be applicable to a wide range of public datasets, including several spatial datasets from neurodevelopment and neurodegeneration studies that illustrated consistent analysis and conclusions [166]. In addition, RNA-seq data can be integrated for spatial cell-type enrichment analysis [166,169,170]. Giotto, for example, has been applied to single-cell spatial transcriptomic MERFISH data collected from the preoptic region of the mouse hypothalamus, where it identified 8 distinct cell clusters and created interactive three-dimensional plots of the dataset [166], with results concordant with the original study from which the data were obtained. Giotto has additionally been used to predict the presence of a given cell type in a spatial location with multiple cell types for datasets with low spatial resolution. The spatial cell type prediction algorithm was tested by altering a seqFISH+ dataset to mimic low spatial resolution. The cell-type enrichment analysis was conducted using scRNA-seq data and derived marker gene lists for somatosensory cortex associated cell types obtained from a previous study [171], and showed high accuracy in predicting the presence of a cell type at each individual spatial location [166].
A data mining study made use of a multi-modal structured embedding (MUSE) approach to analyze five datasets consisting of seqFISH+, STARmap, spatial transcriptomics, Visium, and spatial transcriptomics with fluorescent imaging data [172]. Application of MUSE to these diverse datasets revealed spatial patterning in healthy mouse brain cortex tissue using seqFISH+ data, as well as heterogeneity of amyloid precursor protein processing in AD mouse brain regions [172]. MUSE also successfully clustered STARmap mouse cortex data, and differential expression analysis allowed for the identification of the clusters as astrocytes, hippocampal neurons, oligodendrocytes, or smooth muscle cells. Using transcriptomic and immunofluorescent imaging of AD mouse brain tissue, MUSE was able to identify DEGs in individual clusters [172]. It was found that the known AD-related genes RANBP9 (downregulated in hypothalamus), IGF1 (upregulated in cortex), and SORL1 (upregulated in hypothalamus; downregulated in cortex) were differentially expressed [172]. This approach has revealed regional, temporal, and biological differences reflecting AD progression in a mouse model [172].
Spatial transcriptomics and proteomics have substantially advanced our ability to detect the heterogeneity of RNA and protein expression in tissues, although characterizing whole-transcriptome data of individual cells in space remains a challenge [173]. Integration of spatial transcriptomics with scRNA-seq and snRNA-seq techniques has been on the rise, with the hope of resolving the limitations that spatial transcriptomics currently poses by gaining a deeper understanding of cell-cell communication within healthy and diseased tissues and of the roles certain cell subpopulations have in maintaining homeostasis and in disease pathogenesis [[174], [175], [176], [177]]. With the potential to incorporate other types of spatial omics datasets as they become available to advance towards spatial multi-omics [178], we will be a step closer to elucidating the detailed tissue organization, cell regulation, and cellular communication at an unprecedented scale.
3. Towards integrative multi-omics and systems bioinformatics to reconstruct brain profiles
Neuroscience is being propelled into the big data era with an exponential increase in the amount of information generated, which demands better data organization, improved pipeline frameworks, and rapid turnover for analysis and interpretation to turn this information into valuable biological insights [179]. A systems bioinformatics based data mining approach enables the integration of a spectrum of multi-omics information ranging from epigenetics/epigenomics, transcriptomics, and proteomics to metabolomics and lipidomics by a combination of data-driven bioinformatics (top-down approach) and systems biology (bottom-up approach) [14]. It has also been proposed that establishing multiple networks representing information obtained from each type of omics dataset and integrating them in a layered network that exchanges information within and between layers could enable comprehensive systems bioinformatics analysis (Fig. 1) [14]. One of the main challenges of integrative approaches is the increased dimensionality that arises from the added complexity of the omics datasets in the biological system. Experimentally, several neurodegeneration-focused studies have started to incorporate a multi-omics approach in their analyses [[180], [181], [182], [183], [184]].
In both individual and multi-omics analyses, after data mining of the omics datasets, the data must be transformed and processed through normalization, quality control, and feature selection to extract interpretable information (Fig. 2). Normalization is typically applied to most omics layers to remove bias, large variation, and outlier or incorrect reads to enable better comparison between different datasets or omics layers [185]. Quality control should be applied to all omics layers, for example by quantifying GC content in RNA-seq and removing duplicated and fragmented reads before sequence alignment is performed [186]. Feature selection is commonly conducted to reduce the dimensionality and redundancy of the high-throughput data, and to discriminate desired features contained within the data [187,188]. The goal of the data processing is to reduce dimensionality, bias, and variation of the mined data in order to ensure robustness and efficiency of analysis, especially prior to multi-omics integration. Next, we describe two multi-omics integration approaches in data mining, namely independent biological integration and unsupervised integration, to combine individual layers of omics data for an integrative multi-omics analysis (Fig. 3A).
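The three processing steps described above can be illustrated with a minimal sketch, assuming a small count matrix with samples as rows and features as columns. The function names, the counts-per-million scaling, the total-count threshold, and the variance-based ranking are illustrative assumptions and are not taken from any of the cited pipelines.

```python
import math
import statistics


def filter_low_quality(counts, min_total):
    """Quality control: indices of samples whose total count reaches min_total."""
    return [i for i, sample in enumerate(counts) if sum(sample) >= min_total]


def log_normalize(counts):
    """Normalization: scale each sample to counts per million, then log2(x + 1)."""
    normalized = []
    for sample in counts:
        total = sum(sample)
        normalized.append([math.log2(c / total * 1e6 + 1) for c in sample])
    return normalized


def top_variable_features(matrix, n_features):
    """Feature selection: indices of the n_features columns with highest variance."""
    n_cols = len(matrix[0])
    variances = [statistics.pvariance([row[j] for row in matrix]) for j in range(n_cols)]
    ranked = sorted(range(n_cols), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:n_features])


def preprocess(counts, min_total=1000, n_features=2):
    """Run QC, normalization, and feature selection in sequence (toy order)."""
    kept = filter_low_quality(counts, min_total)
    normalized = log_normalize([counts[i] for i in kept])
    features = top_variable_features(normalized, n_features)
    reduced = [[row[j] for j in features] for row in normalized]
    return kept, features, reduced


# Toy data (invented for illustration): 4 samples x 4 features.
toy_counts = [
    [100, 400, 500, 1000],
    [110, 100, 490, 1300],
    [1, 2, 0, 3],  # very low depth, expected to fail QC
    [90, 800, 510, 600],
]
kept_samples, kept_features, reduced_matrix = preprocess(toy_counts)
```

In practice, the appropriate normalization and thresholds depend on the omics layer, so the same cut-offs should not be assumed to transfer between, for example, RNA-seq counts and proteomics intensities.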
3.1. Independent biological integration
In independent biological integration, the DEGs are typically isolated at the individual omics layers first and then compiled for integrated analysis. Integration is then performed by biological intuition, such as comparing expression levels of genes and proteins to understand the translational regulation [189]. Integration can also be done through computational or web-based tools that integrate expression levels of various types of omics data for biological interpretation, functional annotations, gene set enrichment, and multiscale network-based visualization to understand the interplay of regulation at different levels of gene expression [190,191] (Fig. 3B). Some of the computational tools include Kyoto Encyclopedia of Genes and Genomes (KEGG) [192,193], GO [194,195], DAVID [196], PANTHER [197,198], GSEA [199], and IPA [200] for pathway and gene enrichment analyses. The PPIs in functional modules and their interactions with each other in cellular networks can be determined by STRING [201], Cytoscape [202], NPA [203], and SPIA [204]. It is important to note that there might be variability between these computational tools, and it is advisable to use them in combination to define the most relevant and highly reproducible pathways and networks associated with certain disease states. To ensure robustness and accuracy of results, the parameters used for statistical analysis must be consistent across the individual omics layers.
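As an illustration of the gene set enrichment step, the sketch below implements a generic over-representation test based on the hypergeometric distribution. It is a simplified stand-in and is not claimed to reproduce the specific statistics used by any of the tools listed above; the function names, pathway names, and gene identifiers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_pathway, n_selected, n_overlap):
    """P(X >= n_overlap) when n_selected genes are drawn without replacement
    from n_background genes, of which n_pathway belong to the pathway."""
    upper = min(n_pathway, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def enrich(deg_set, pathways, background):
    """Return {pathway name: p-value} for a set of DEGs against each pathway."""
    degs = set(deg_set) & set(background)
    results = {}
    for name, members in pathways.items():
        members = set(members) & set(background)
        overlap = len(degs & members)
        results[name] = hypergeom_enrichment_p(
            len(background), len(members), len(degs), overlap
        )
    return results


# Toy example (invented identifiers): 20 background genes, two pathways.
background = {f"g{i}" for i in range(20)}
pathways = {
    "pathway_A": {"g0", "g1", "g2", "g3", "g4"},
    "pathway_B": {"g10", "g11", "g12", "g13", "g14"},
}
degs = {"g0", "g1", "g2", "g3"}
p_values = enrich(degs, pathways, background)
```

When the same test is run on DEG lists from several omics layers, applying one background definition and one significance threshold to every layer keeps the results comparable; a multiple-testing correction would also be needed in a real analysis and is omitted here.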
Other web-based tools such as MetExplore [205], 3Omics [206], and PaintOmics [207] have started to lay out the potential of multi-omics analysis and visualization based on the independent biological integration approach [208]. The MetExplore program enables the visualization and interpretation of omics data from multiple molecular layers, taking inputs from different omics data and providing an interpretation of genome-scale metabolic networks and of how various types of omics data modulate metabolic processes [205]. While the 3Omics program builds correlation networks and enables phenotyping based on data from different omics layers [206], the PaintOmics program allows for network visualization and accepts a wide variety of omics data [207]. The PaintOmics program additionally allows for pathway and enrichment analyses based on the KEGG, Reactome, and MapMan databases. Using independent biological integration, omics datasets can be analyzed from multiple samples and do not need to originate from the same experiment. More complex forms of independent biological integration include “horizontal” (with features as anchors), “vertical” (with cellular information as anchors), and “diagonal” (no anchor) integration that are used for data mining of multi-modal single-cell datasets with anchors aligning different types of omics data [209]. Independent biological integration is currently the primary method of omics integration in data mining studies due to its ease of implementation without the need for highly technical computational competencies.
With the emergence of databases making multiple types of omics data publicly accessible, proof-of-concept integration of multi-omics datasets has been illustrated in data mining using the independent biological integration approach. A data mining study utilizing transcriptomic and proteomic datasets identified significant DEGs in the human spinal cord of MS patients and examined the implications of these DEGs for biological processes involved in the disease progression of MS [189]. Specifically, the authors found that HOXA5 was significantly upregulated in MS patients through individual transcriptomic (ARCHS4 database [56]) and proteomic analysis (BioGrid [210]) and that HOXA5 promotes the transforming growth factor (TGF)-beta pathway [189]. A previous study has shown that in the spinal cord of MS patients there are large areas of demyelination characterized by a unique TGFB1 genomic signature [189]. This study proposes that the overexpression of HOXA5 in the spinal cord may promote the progression of TGFB1-mediated gliosis in MS patients [189].
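The layer-by-layer logic of independent biological integration can be sketched as follows: DEGs are called separately in each layer with identical thresholds, and only genes that change in the same direction in both layers are retained. This is a minimal sketch under stated assumptions; the function names, cut-offs, gene labels, and values are hypothetical toy inputs and do not come from the cited study.

```python
def call_degs(log2_fold_changes, adjusted_p_values, lfc_cutoff=1.0, alpha=0.05):
    """Return {gene: 'up' or 'down'} for genes passing both thresholds."""
    calls = {}
    for gene, lfc in log2_fold_changes.items():
        if adjusted_p_values[gene] < alpha and abs(lfc) >= lfc_cutoff:
            calls[gene] = "up" if lfc > 0 else "down"
    return calls


def concordant_degs(layer_a_calls, layer_b_calls):
    """Keep genes called in both layers with the same direction of change."""
    return {
        gene: direction
        for gene, direction in layer_a_calls.items()
        if layer_b_calls.get(gene) == direction
    }


# Toy inputs (invented): log2 fold changes and adjusted p-values per layer.
rna_lfc = {"GENE_A": 2.0, "GENE_B": -1.5, "GENE_C": 1.2, "GENE_D": 0.3}
rna_padj = {"GENE_A": 0.001, "GENE_B": 0.01, "GENE_C": 0.2, "GENE_D": 0.001}
protein_lfc = {"GENE_A": 1.4, "GENE_B": 1.1, "GENE_C": 1.5}
protein_padj = {"GENE_A": 0.02, "GENE_B": 0.03, "GENE_C": 0.01}

# The same cut-offs are applied to both layers to keep the calls comparable.
rna_calls = call_degs(rna_lfc, rna_padj)
protein_calls = call_degs(protein_lfc, protein_padj)
shared = concordant_degs(rna_calls, protein_calls)
```

Genes that are discordant between layers (for example, transcript down but protein up) are dropped by this filter, although such cases may themselves be of interest as candidates for post-transcriptional regulation.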
Multi-omics integration has also been performed with the help of the MetExplore program to process genome-wide association studies (GWAS), transcriptomics, and proteomics datasets obtained from several databases (GWAS catalog [34], GEO [38], and PRIDE [92], respectively) to extract differentially expressed multi-omics elements [190]. This study identified 203 differentially expressed transcripts, 164 differentially expressed proteins, and 58 differentially expressed GWAS-derived mouse orthologs associated with significantly enriched metabolic biological processes [190]. Additionally, lipid metabolic pathways were significantly upregulated across the multi-omics datasets, with microglia and astrocytes expressing significant enrichment in the lipid-predominant AD-metabolic transcriptome [190]. This study brings attention to the significance of dysregulated lipid metabolism in AD, and to the importance and usefulness of multi-omics analysis for better understanding AD pathogenesis from a systems bioinformatics approach, with experimental metabolomics/lipidomics validation in a mouse model and in the blood plasma of AD patients, respectively [190]. Although there are several advantages to using the independent biological integration approach, it is worth noting that different studies that make use of this method may be subject to different biological intuitions in the integration process, leading to non-standardized analysis and less consistent results.
3.2. Unsupervised integration
For unsupervised integration, multi-omics data from the different molecular layers are often concatenated together into a single matrix for analysis via ensemble dimension reduction [211,212]. In other approaches, such as model-ensemble, each omics layer is analyzed independently to obtain the respective matrix, and the matrices from all omics layers are then inputted into the unsupervised algorithm and fused to build an integrated analysis. There are three main categories of unsupervised integration of multi-omics data, namely clustering-based, network-based, and similarity/association-based approaches (Fig. 3C). First, the clustering-based approach is primarily based on statistical calculations, making use of matrix factorization, kernel, and Bayesian analyses. Within matrix factorization analysis, the non-negative matrix factorization (NMF) method is most commonly utilized for high-dimensional datasets; it restricts the entries of the factor matrices to non-negative values, allowing for easier interpretation of results [213]. Extensions of NMF include integrative NMF and joint NMF, which account for the identification of heterogeneity and homogeneity in datasets, respectively, during the integration process [214,215]. Kernel analysis captures the degrees of similarity of the input data, which are contained within the kernel matrix. This analysis is dimension-free and does not depend on the total number of features in the datasets. In the Bayesian analysis or Bayesian consensus clustering, a probability model, such as the Dirichlet process mixture model, is used to model source-specific features as well as an overall clustering accounting for multiple data sources in different omics layers [216]. Clustering-based methods are suitable for identifying disease subtypes and module patterns, as well as isolating subgroups, samples, or features that have similar biological function.
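To make the concatenation and matrix factorization steps concrete, the following is a minimal sketch of NMF applied to two toy omics layers joined into one matrix. It uses the standard multiplicative update rules for the Frobenius-norm objective, written with the standard library only; it is not the integrative or joint NMF variant mentioned above, and the helper names, rank, iteration count, and data are illustrative assumptions.

```python
import random


def concatenate_layers(*layers):
    """Join omics layers column-wise; rows are the same samples in each layer."""
    return [[value for layer in layers for value in layer[i]]
            for i in range(len(layers[0]))]


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0]))) ** 0.5


def nmf(v, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor non-negative V (samples x features) into W (samples x rank)
    and H (rank x features) using multiplicative updates."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(rank)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(rank)]
             for i in range(n)]
    return w, h


# Toy layers (invented): 4 samples, 2 features per layer.
layer_1 = [[5, 4], [4, 5], [0, 1], [1, 0]]
layer_2 = [[6, 0], [5, 1], [0, 5], [1, 6]]
combined = concatenate_layers(layer_1, layer_2)
sample_factors, feature_factors = nmf(combined, rank=2)
```

The rows of the sample factor matrix can then be inspected or clustered to look for subgroups. Because layers measured on different scales can dominate a concatenated matrix, each layer would normally be scaled before joining, which this sketch leaves out.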
Network-based unsupervised integration relies on biological knowledge databases, in addition to statistical analysis and calculation, and is heavily used to identify functional relationships between omics layers [212]. Network propagation analysis tracks the flow of information from each node and amplifies the signals based on the assumption that genes underlying similar phenotypes interact with one another through known information such as the association with a biological process [217]. Individual networks are then fused together into a similarity network using a nonlinear fusion approach which is based on message-passing theory [218]. Similar to the clustering-based method of analysis, network-based methods also utilize matrix factorization statistical approaches. On the other hand, correlation analysis is based on the correlation between a node, such as a gene, and an outcome, with the significance of a node determined by the correlation coefficient or a regression-based significance [219].
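One common formulation of network propagation is the random walk with restart, in which signal from seed nodes diffuses over the network until the scores stabilize. The sketch below is a minimal illustration of that formulation under stated assumptions; it is not claimed to be the specific algorithm of the cited work, and the function name, restart probability, and toy graph are hypothetical.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Propagate signal from seed nodes over an undirected weighted graph.

    adjacency: {node: {neighbor: weight}}, with every neighbor also a key.
    seeds: set of nodes that receive the restart probability mass.
    Returns {node: steady-state score}.
    """
    nodes = sorted(adjacency)
    strength = {u: sum(adjacency[u].values()) for u in nodes}
    if any(s <= 0 for s in strength.values()):
        raise ValueError("every node needs at least one weighted neighbor")
    p0 = {u: (1.0 / len(seeds) if u in seeds else 0.0) for u in nodes}
    scores = dict(p0)
    for _ in range(max_iter):
        # Restart term: a fraction of the signal always returns to the seeds.
        updated = {u: restart * p0[u] for u in nodes}
        # Walk term: each node passes its score to neighbors by edge weight.
        for u in nodes:
            for v, weight in adjacency[u].items():
                updated[v] += (1.0 - restart) * scores[u] * weight / strength[u]
        change = sum(abs(updated[u] - scores[u]) for u in nodes)
        scores = updated
        if change < tol:
            break
    return scores


# Toy interaction network (invented): a chain A - B - C - D with seed A.
toy_network = {
    "A": {"B": 1.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"B": 1.0, "D": 1.0},
    "D": {"C": 1.0},
}
propagated = random_walk_with_restart(toy_network, seeds={"A"})
```

Nodes close to the seed receive higher scores than distant ones, which is how propagation prioritizes genes that interact with known disease-associated genes. A higher restart value keeps the signal more local.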
Finally, the similarity/association-based unsupervised integration approach enables the identification of the marginal associations and correlations between various omics layers. Sequential analysis is an example, where statistical tests and models are applied to narrow down the list of features in one omics layer based on their relationship with features in other omics layers [212]. Multivariate analyses including canonical correlation analysis (CCA) and co-inertia analysis (CIA) are useful methods due to their flexibility in accepting multiple matrices as input data. While CCA can be applied for feature selection and classification in high-dimensional multi-omics datasets, CIA is used to find the low-dimensional components and aims to distinguish sources of variation in multi-omics datasets [220,221]. Similarity/association-based methods can also utilize kernel statistical analysis similar to clustering-based methods. The similarity/association-based unsupervised integration approach enables biomarker prediction, the identification of associations between omics layers (e.g., genotypes based on gene expression), and flexibility in accepted data (e.g., multiple matrices).
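The sequential analysis idea can be illustrated with a minimal sketch in which features from one layer are retained only if they correlate strongly with at least one feature in a second layer measured on the same samples. Pearson correlation and the fixed threshold are assumptions made for illustration, and the feature names and values are hypothetical toy inputs.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for constant inputs."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def sequential_filter(layer_a, layer_b, min_abs_r=0.8):
    """Keep features of layer_a whose strongest |r| with any layer_b feature
    reaches min_abs_r. Returns {layer_a feature: best-matching layer_b feature}."""
    kept = {}
    for name_a, values_a in layer_a.items():
        best_r, best_name = max(
            (abs(pearson(values_a, values_b)), name_b)
            for name_b, values_b in layer_b.items()
        )
        if best_r >= min_abs_r:
            kept[name_a] = best_name
    return kept


# Toy matched samples (invented): methylation-like and expression-like features.
methylation = {"cpg_1": [1, 2, 3, 4, 5], "cpg_2": [2, 1, 2, 1, 2]}
expression = {"gene_x": [10, 8, 6, 4, 2], "gene_y": [1, 3, 2, 5, 4]}
retained = sequential_filter(methylation, expression)
```

A real analysis would replace the fixed threshold with a significance test corrected for the number of feature pairs examined, since the number of comparisons grows with the product of the layer sizes.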
There are also some commonly used programs that can be utilized to facilitate unsupervised integration, such as the iCluster programs [222,223], JIVE [224], CNAmet [225], and PARADIGM [226]. Briefly, the iCluster programs and JIVE are matrix-factorization-based clustering methods used for disease subtyping. The iCluster programs create flexible models based on the associations between omics layers and determine the variance-covariance within omics layers in a single framework, all while simultaneously reducing the dimensionality of each omics layer [222]. The JIVE program is an extension of principal component analysis (PCA) that calculates the amount of joint variation between omics layers, reduces dimensionality, and enables visual exploration of joint and individual structures such as patterns of biological relationships between omics layers [224]. The CNAmet program is a similarity/association-based sequential analysis tool that integrates high-throughput copy number, DNA methylation, and gene expression data and is used for biomarker prediction [225]. The PARADIGM program is a Bayesian-based network integration tool used for disease mechanistic studies and subtyping. It integrates multi-omics data to infer the modulation of genetic pathways based on established knowledge of the given pathways [226].
Unsupervised integration of multi-omics datasets has been mainly used in primary research, including AD, PD and ALS studies [4,12,227,228]. Recently, this approach has been adopted by neurodegenerative disease related data mining studies. For example, one study utilized a combination of proteomics and lipidomics datasets collected from the blood of 586 AD patients and controls from other studies [229,230] for multi-omics analysis using the unsupervised integration approach [231]. Network analysis of the individual omics datasets was conducted using the Weighted Gene Correlation Network Analysis (WGCNA), which can also be applied to proteomics and lipidomics data [232,233]. Data processing, which included normalization, imputation of missing values, and PCA, was performed before creating the weighted co-expression networks via hierarchical clustering and module assignment using a dynamic tree-cutting algorithm. To integrate the protein and lipid modules and analyze the associations between AD-associated modules, Pearson's correlation coefficient was utilized [231]. GO enrichment analysis was then used to analyze the biological processes as well as molecular and cellular functions of the protein modules associated with AD phenotypes. The study identified lipid modules involved in immune response and lipid metabolism as well as protein modules involved in increased cytokine production, humoral immune responses, and neutrophil-mediated immunity, all of which were highly correlated with AD risk loci [231]. This study is a good example of an unsupervised multi-omics integration approach via data mining which exemplifies network-based approaches for isolating protein and lipid modules that are highly associated with established AD risk loci. Another example related to brain disease is a glioblastoma study making use of multi-omics datasets obtained from patients to develop a network-correlation-based method called Lemon-Tree for biomarker discovery [234]. This study demonstrated that the Lemon-Tree algorithm successfully identifies known oncogenes and tumor suppressors as master regulators in the inferred module network, utilizing somatic copy number and expression data. Lemon-Tree also allows other omics features, such as miRNA and DNA methylation, to be added to the model, and supports GO enrichment analysis of the modules.
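The two correlation-based steps of such a workflow can be sketched in simplified form: a soft-thresholded co-expression adjacency within one layer, followed by correlation of module summary profiles across layers. The sketch assumes an unsigned adjacency of the form |r| raised to a power, and it summarizes each module by the mean of its z-scored members, which is a simplification of the module eigengene used in WGCNA. Module membership is supplied by hand rather than derived by hierarchical clustering, and all names and values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for constant inputs."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def soft_threshold_adjacency(features, beta=6):
    """Unsigned weighted adjacency |r|**beta for every pair of features."""
    names = sorted(features)
    return {
        (a, b): abs(pearson(features[a], features[b])) ** beta
        for a in names for b in names if a < b
    }


def zscore(values):
    """Center and scale one feature across samples (population SD)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd if sd > 0 else 0.0 for v in values]


def module_profile(features, members):
    """Per-sample mean of z-scored member features (simplified module summary)."""
    scaled = [zscore(features[name]) for name in members]
    return [sum(column) / len(column) for column in zip(*scaled)]


# Toy matched samples (invented): three proteins and two lipids.
proteins = {"p1": [1, 2, 3, 4], "p2": [2, 4, 6, 8], "p3": [4, 1, 3, 2]}
lipids = {"l1": [4, 3, 2, 1], "l2": [8, 6, 4, 2]}

adjacency = soft_threshold_adjacency(proteins)
protein_module = module_profile(proteins, ["p1", "p2"])  # membership assumed
lipid_module = module_profile(lipids, ["l1", "l2"])      # membership assumed
cross_layer_r = pearson(protein_module, lipid_module)
```

Raising correlations to a power suppresses weak correlations relative to strong ones, so that module detection is driven by the most tightly co-varying features; the exponent of 6 used here is an assumption and would normally be chosen from the data.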
Currently, the largest challenge in achieving interoperability of omics data is the lack of a standardized framework or pipeline to enable various types of omics data to be seamlessly integrated and analyzed [208,212]. Systems bioinformatics extracts disease relevant information from multiple levels of the omics spectrum and integrates them in a layered and interactive network that exchanges systemic knowledge towards developing a comprehensive brain profile [[235], [236], [237]]. Although the process of optimizing the derived results is certainly required to obtain reliable biological information, this approach is theoretically applicable to the whole omics spectrum to work towards achieving systems bioinformatics in translational neuroscience. Our proposed concept focuses on the broad research goal of reconstructing human brain systems with direct healthcare relevance [238], rather than on a detailed in-depth analysis pipeline or algorithm development to analyze specific datasets [[239], [240], [241]]. We note here that highly technical methodologies, mathematical algorithms, and information theory are necessary to understand the omics derived networks.
4. Data mining and experimental validation in translational neuroscience
Data mining approaches have advantages of being high throughput and low cost as compared to traditional low throughput experimental techniques in revealing the underlying pathogenic mechanisms of complex neurodegenerative diseases. However, prediction results arising from data mining remain theoretical and require validation with experimental evidence [242]. Here, we recommend an integrative neuroscience approach that synergizes systems bioinformatics and experimental analysis to yield opportunities for translational applications including biomarker discovery, therapeutic development, and insights into disease mechanisms (Fig. 4).
In the previous section, we have described multi-omics approaches to integrate different individual omics data together for a more holistic interpretation of the results. Besides omics data, high dimensionality analysis should include cell type, spatial location, and time trajectory (Fig. 4A). Knowledge of the biological functions and network systems in the brain, such as the key pathways, genes, and PPIs involved, is important to piece all information together to provide interpretable biological findings (Fig. 4B). In the context of multi-omics and systems bioinformatics analysis, one would expect consistent or correlated alterations in different omics layers, be it localized information or circulating information, to enable the reconstruction of the brain profile and understanding of the brain physiology. Nuances in the analysis might be encountered that require further optimization in data analysis and confirmation through experimental validation. Experimental validation bridges the gap between bioinformatics analysis and translational neuroscience and increases the credibility of results, especially for novel discoveries. For example, it is important to test how the overall biological system of cells and/or mice would react to alterations in certain key proteins achieved by either overexpression/knockout or treatment with protein ligands/modulators (Fig. 4C). It is also important to examine whether these alterations have any adverse consequences, such as toxicity, on the model organisms tested. Generally, a protein plays a key role in disease mechanism if the change in the protein level or function correlates with disease pathogenesis or progression. Therapeutic discovery can be achieved through screening of small molecules or antisense oligonucleotides that can modulate the protein function. A biomarker is typically established if alterations in certain key proteins that are associated with disease pathogenesis can be detected in blood or CSF prior to disease progression (Fig. 4D).
Overall, it is essential for computational scientists, experimental biologists, and neuroscientists to communicate and exchange the intricacies of their individual methods to enable evaluation and validation of results with appropriate interpretation in a balanced manner [243]. Integrative neuroscience, combining both systems bioinformatics and experimental biology analysis, is a multidisciplinary science that can provide a new approach for biomarker discovery, therapeutic development, and elucidation of pathogenic mechanisms of neurodegenerative diseases [239].
5. Summary and future perspectives
Bioinformatics is becoming increasingly essential for the organization and management of data in modern biology and is a comprehensive field that harnesses methods in computer science and experimental biology. Bioinformatics is redefining modern science in ways that were not possible in the past: the ability to combine datasets from multiple experiments and different samples for large-scale analysis has never been as accessible and accurate as it is today, yielding ever-growing applications of bioinformatic analyses [244]. Systems bioinformatics is a rising concept which makes use of network-based computational analysis to increase the precision of mechanistic understanding of disease pathogenesis as well as the development of new diagnostics and therapeutics [14].
This review accentuates the importance of understanding the applications and boundaries of various data mining approaches of omics datasets and computational methods towards multi-omics analysis. The integration of multi-omics analysis, systems bioinformatics, and experimental validation provides insights into disease mechanisms and opens avenues for translational neuroscience applications as well as advancing towards early diagnosis [245], precision phenotyping of diseases [246] and personalized medicine [247]. With sufficient resolution to understand the molecular-level changes of a certain protein in a specific cell type, as well as its detailed location and trajectory in time, it is theoretically possible to target the exact cells with the pathological features and provide treatments, although this will require highly specific targeting strategies to be developed.
A number of limitations remain for systems bioinformatics and experimental analysis in translational neuroscience before they can be fully implemented [248,249]. A major challenge lies in validating the reconstructed molecular networks with real biological observations, as there is a lack of ground truth and of technical capabilities to recapitulate the layered networks in the brain [14]. It has been suggested that an initial step would be to compare reconstructed networks with benchmarking datasets with known biological information, such as existing curated databases and simulated datasets that mimic real data [250]. In addition, similar to the current limitation in data mining where there is no streamlined pipeline for bioinformatics analysis, there is also variability in the network reconstruction approaches, leading to inadequate network selection and inconsistent results between different studies [251]. Hence, there is a need for computational approaches to screen results from different methods of analysis to provide a consensus analysis that will maximize the information content, although this remains to be developed.
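The benchmarking and consensus ideas above can be illustrated with a minimal sketch that scores a reconstructed network against a reference edge list and retains only edges supported by several reconstruction methods. This is an illustration of the general idea and not an established consensus method; the function names, the support threshold, and the toy edge lists are hypothetical.

```python
from collections import Counter


def normalize_edges(edges):
    """Treat edges as undirected: ('A', 'B') and ('B', 'A') are the same edge."""
    return {frozenset(edge) for edge in edges if len(set(edge)) == 2}


def precision_recall(predicted_edges, reference_edges):
    """Precision and recall of predicted edges against a reference network."""
    predicted = normalize_edges(predicted_edges)
    reference = normalize_edges(reference_edges)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


def consensus_edges(edge_lists, min_support=2):
    """Edges reported by at least min_support of the reconstruction methods."""
    counts = Counter(
        edge for edges in edge_lists for edge in normalize_edges(edges)
    )
    return {edge for edge, count in counts.items() if count >= min_support}


# Toy networks (invented): one reference and two reconstruction methods.
reference = [("A", "B"), ("B", "C"), ("C", "D")]
method_1 = [("B", "A"), ("B", "C"), ("A", "D")]
method_2 = [("A", "B"), ("C", "D"), ("A", "D")]

scores_method_1 = precision_recall(method_1, reference)
shared_edges = consensus_edges([method_1, method_2], min_support=2)
```

In this toy case the consensus retains an edge that both methods report but that is absent from the reference, which shows why agreement between methods cannot substitute for comparison with curated or simulated ground truth.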
Heterogeneity of neurodegenerative diseases within individuals is caused by multiple complex factors and is impeding the development of effective treatments, leading to the notion of personalized medicine. Systems bioinformatics methodologies enable the collection of invaluable knowledge gathered from the different aspects of a disease condition and provide revolutionary approaches and tools to clinicians to demystify the complex nature of these diseases. Although the full implementation of the systems bioinformatics approach to reconstruct the human brain profile might seem ambitious at the current stage, its current application might be complementary to the use of existing computational and translational neuroscience methods. The practical integration of systems bioinformatics and experimental analysis in translational neuroscience is likely to have a major impact and lead to significant breakthroughs in the detection and diagnosis of neurological disorders and in neuroscience targeted drug discovery in the future [252], and may be applicable to studying other diseases in general.
CRediT author statement
Lance M. O’Connor: Data curation, Formal analysis, Investigation, Methodology, Visualization, Writing - Original draft preparation; Blake A. O’Connor: Data curation, Writing - Reviewing and Editing; Su Bin Lim: Writing - Reviewing and Editing; Jialiu Zeng: Funding acquisition, Validation, Writing - Reviewing and Editing; Chih Hung Lo: Conceptualization, Funding acquisition, Investigation, Project administration, Supervision, Validation, Visualization, Writing - Original draft preparation.
Declaration of competing interest
The authors declare that there are no conflicts of interest.
Acknowledgments
Chih Hung Lo is supported by a Lee Kong Chian School of Medicine Dean’s Postdoctoral Fellowship (021207-00001) from Nanyang Technological University (NTU) Singapore and a Mistletoe Research Fellowship (022522-00001) from the Momental Foundation USA. Jialiu Zeng is supported by a Presidential Postdoctoral Fellowship (021229-00001) from NTU Singapore and an Open Fund Young Investigator Research Grant (OF-YIRG) (MOH-001147) from the National Medical Research Council (NMRC) Singapore. Su Bin Lim is supported by the National Research Foundation (NRF) of Korea (Grant Nos.: 2020R1A6A1A03043539, 2020M3A9D8037604, and 2022R1C1C1004756) and a grant of the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (Grant No.: HR22C1734). The authors thank Jonathan Indajang from Cornell University for proofreading the manuscript.
Footnotes
Peer review under responsibility of Xi'an Jiaotong University.
References
- 1. Gauthier J., Vincent A.T., Charette S.J., et al. A brief history of bioinformatics. Brief. Bioinform. 2019;20:1981–1996. doi: 10.1093/bib/bby063.
- 2. Svensson V., Vento-Tormo R., Teichmann S.A. Exponential scaling of single-cell RNA-seq in the past decade. Nat. Protoc. 2018;13:599–604. doi: 10.1038/nprot.2017.149.
- 3. Geschwind D.H., Konopka G. Neuroscience in the era of functional genomics and systems biology. Nature. 2009;461:908–915. doi: 10.1038/nature08537.
- 4. Dong X., Liu C., Dozmorov M. Review of multi-omics data resources and integrative analysis for human brain disorders. Brief. Funct. Genomics. 2021;20:223–234. doi: 10.1093/bfgp/elab024.
- 5. Hasin Y., Seldin M., Lusis A. Multi-omics approaches to disease. Genome Biol. 2017;18. doi: 10.1186/s13059-017-1215-1.
- 6. Tasic B. Single cell transcriptomics in neuroscience: Cell classification and beyond. Curr. Opin. Neurobiol. 2018;50:242–249. doi: 10.1016/j.conb.2018.04.021.
- 7. Lein E., Borm L.E., Linnarsson S. The promise of spatial transcriptomics for neuroscience in the era of molecular cell typing. Science. 2017;358:64–69. doi: 10.1126/science.aan6827.
- 8. Wang W.X., Lefebvre J.L. Morphological pseudotime ordering and fate mapping reveal diversification of cerebellar inhibitory interneurons. Nat. Commun. 2022;13. doi: 10.1038/s41467-022-30977-2.
- 9. Wilson S.L., Way G.P., Bittremieux W., et al. Sharing biological data: Why, when, and how. FEBS Lett. 2021;595:847–863. doi: 10.1002/1873-3468.14067.
- 10. Shepherd G.M., Mirsky J.S., Healy M.D., et al. The Human Brain Project: Neuroinformatics tools for integrating, searching and modeling multidisciplinary neuroscience data. Trends Neurosci. 1998;21:460–468. doi: 10.1016/s0166-2236(98)01300-9.
- 11. Villa C., Yoon J.H. Multi-omics for the understanding of brain diseases. Life. 2021;11. doi: 10.3390/life11111202.
- 12. Clark C., Rabl M., Dayon L., et al. The promise of multi-omics approaches to discover biological alterations with clinical relevance in Alzheimer’s disease. Front. Aging Neurosci. 2022;14. doi: 10.3389/fnagi.2022.1065904.
- 13. Greco F.V., Pandi A., Erb T.J., et al. Harnessing the central dogma for stringent multi-level control of gene expression. Nat. Commun. 2021;12. doi: 10.1038/s41467-021-21995-7.
- 14. Oulas A., Minadakis G., Zachariou M., et al. Systems Bioinformatics: Increasing precision of computational diagnostics and therapeutics through network-based approaches. Brief. Bioinform. 2019;20:806–824. doi: 10.1093/bib/bbx151.
- 15. Grillner S., Kozlov A., Kotaleski J.H. Integrative neuroscience: Linking levels of analyses. Curr. Opin. Neurobiol. 2005;15:614–621. doi: 10.1016/j.conb.2005.08.017.
- 16. Schneider-Poetsch T., Yoshida M. Along the central dogma-controlling gene expression with small molecules. Annu. Rev. Biochem. 2018;87:391–420. doi: 10.1146/annurev-biochem-060614-033923.
- 17. Calabrese G., Molzahn C., Mayor T. Protein interaction networks in neurodegenerative diseases: From physiological function to aggregation. J. Biol. Chem. 2022;298. doi: 10.1016/j.jbc.2022.102062.
- 18. Lo C.H., Sachs J.N. The role of wild-type tau in Alzheimer’s disease and related tauopathies. J. Life Sci. (Westlake Village) 2020;2:1–17. doi: 10.36069/jols/20201201.
- 19. Lo C.H. Heterogeneous tau oligomers as molecular targets for Alzheimer’s disease and related tauopathies. Biophysica. 2022;2:440–451.
- 20.Lo C.H. Recent advances in cellular biosensor technology to investigate tau oligomerization. Bioeng. Transl. Med. 2021;6 doi: 10.1002/btm2.10231. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Hou Y., Dan X., Babbar M., et al. Ageing as a risk factor for neurodegenerative disease. Nat. Rev. Neurol. 2019;15:565–581. doi: 10.1038/s41582-019-0244-7. [DOI] [PubMed] [Google Scholar]
- 22.Gan L., Cookson M.R., Petrucelli L., et al. Converging pathways in neurodegeneration, from genetics to mechanisms. Nat. Neurosci. 2018;21:1300–1309. doi: 10.1038/s41593-018-0237-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Ghosh R., Tabrizi S.J. Gene suppression approaches to neurodegeneration. Alzheimer’s Res. Ther. 2017;9 doi: 10.1186/s13195-017-0307-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Qureshi I.A., Mehler M.F. Advances in epigenetics and epigenomics for neurodegenerative diseases. Curr. Neurol. Neurosci. Rep. 2011;11:464–473. doi: 10.1007/s11910-011-0210-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Yu G., Su Q., Chen Y., et al. Epigenetics in neurodegenerative disorders induced by pesticides. Genes Environ. 2021;43 doi: 10.1186/s41021-021-00224-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Ghosh P., Saadat A. Neurodegeneration and epigenetics: A review. Neurologia (Engl Ed) 2023;38:e62–e68. doi: 10.1016/j.nrleng.2023.05.001. [DOI] [PubMed] [Google Scholar]
- 27.Coppede F. Targeting the epigenome to treat neurodegenerative diseases or delay their onset: A perspective. Neural Regen. Res. 2022;17:1745–1747. doi: 10.4103/1673-5374.332145. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Hwang J.Y., Aromolaran K.A., Zukin R.S. The emerging field of epigenetics in neurodegeneration and neuroprotection. Nat. Rev. Neurosci. 2017;18:347–361. doi: 10.1038/nrn.2017.46. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Jowaed A., Schmitt I., Kaut O., et al. Methylation regulates alpha-synuclein expression and is decreased in Parkinson’s disease patients’ brains. J. Neurosci. 2010;30:6355–6359. doi: 10.1523/JNEUROSCI.6119-09.2010. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Matsumoto L., Takuma H., Tamaoka A., et al. CpG demethylation enhances alpha-synuclein expression and affects the pathogenesis of Parkinson’s disease. PLoS One. 2010;5 doi: 10.1371/journal.pone.0015522. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Albrecht F., List M., Bock C., et al. DeepBlue epigenomic data server: Programmatic data retrieval and analysis of epigenome region sets. Nucleic Acids Res. 2016;44:W581–W586. doi: 10.1093/nar/gkw211. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Xiong Z., Yang F., Li M., et al. EWAS Open Platform: Integrated data, knowledge and toolkit for epigenome-wide association study. Nucleic Acids Res. 2022;50:D1004–D1009. doi: 10.1093/nar/gkab972. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Kodama Y., Mashima J., Kosuge T., et al. DDBJ update: The Genomic Expression Archive (GEA) for functional genomics data. Nucleic Acids Res. 2019;47:D69–D73. doi: 10.1093/nar/gky1002. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Sollis E., Mosaku A., Abid A., et al. The NHGRI-EBI GWAS Catalog: Knowledgebase and deposition resource. Nucleic Acids Res. 2023;51:D977–D985. doi: 10.1093/nar/gkac1010. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Bujold D., Grégoire R., Brownlee D., et al. Practical Guide to Life Science Databases. first ed. Springer Nature; Singapore: 2011. IHEC data portal. I. Abugessaisa, T. Kasukawa; pp. 77–94. [Google Scholar]
- 36.National Centralized Repository for Alzheimer’s Disease and Related Dementias (NCRAD) https://ncrad.iu.edu/
- 37.Consortium R.E., Kundaje A., Meuleman W., et al. Integrative analysis of 111 reference human epigenomes. Nature. 2015;518:317–330. doi: 10.1038/nature14248. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38.Barrett T., Wilhite S.E., Ledoux P., et al. NCBI GEO: Archive for functional genomics data sets: Update. Nucleic Acids Res. 2013;41:D991–D995. doi: 10.1093/nar/gks1193. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39.Huttenhower C., Hofmann O. A quick guide to large-scale genomic data mining. PLoS Comput. Biol. 2010;6 doi: 10.1371/journal.pcbi.1000779. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40.Shi F., He Y., Chen Y., et al. Comparative analysis of multiple neurodegenerative diseases based on advanced epigenetic aging brain. Front. Genet. 2021;12 doi: 10.3389/fgene.2021.657636. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 41.Mallik S., Zhao Z. Detecting methylation signatures in neurodegenerative disease by density-based clustering of applications with reducing noise. Sci. Rep. 2020;10 doi: 10.1038/s41598-020-78463-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42.Pellegrini C., Pirazzini C., Sala C., et al. A meta-analysis of brain DNA methylation across sex, age, and Alzheimer’s disease points for accelerated epigenetic aging in neurodegeneration. Front. Aging Neurosci. 2021;13 doi: 10.3389/fnagi.2021.639428. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43.Klein H.U., McCabe C., Gjoneska E., et al. Epigenome-wide study uncovers large-scale changes in histone acetylation driven by tau pathology in aging and Alzheimer’s human brains. Nat. Neurosci. 2019;22:37–46. doi: 10.1038/s41593-018-0291-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.De Jager P.L. Deconstructing the epigenomic architecture of human neurodegeneration. Neurobiol. Dis. 2021;153 doi: 10.1016/j.nbd.2021.105331. [DOI] [PubMed] [Google Scholar]
- 45.MacBean L.F., Smith A.R., Lunnon K. Exploring beyond the DNA sequence: A review of epigenomic studies of DNA and histone modifications in dementia. Curr. Genet. Med. Rep. 2020;8:79–92. [Google Scholar]
- 46.Lu T., Ang C.E., Zhuang X. Spatially resolved epigenomic profiling of single cells in complex tissues. Cell. 2022;185:4448–4464.e17. doi: 10.1016/j.cell.2022.09.035. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47.Deng Y., Bartosovic M., Kukanja P., et al. Spatial-CUT&Tag: Spatially resolved chromatin modification profiling at the cellular level. Science. 2022;375:681–686. doi: 10.1126/science.abg7216. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48.Deng Y., Bartosovic M., Ma S., et al. Spatial profiling of chromatin accessibility in mouse and human tissues. Nature. 2022;609:375–383. doi: 10.1038/s41586-022-05094-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 49.Fan R., Zhang D., Su G. Spatially resolved epigenome-transcriptome co-profiling of mammalian tissues at the cellular level. Res. Sq. 2022:1–26. [Google Scholar]
- 50.Qureshi I.A., Mehler M.F. Understanding neurological disease mechanisms in the era of epigenetics. JAMA Neurol. 2013;70:703–710. doi: 10.1001/jamaneurol.2013.1443. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 51.Rao M.S., Van Vleet T.R., Ciurlionis R., et al. Comparison of RNA-seq and microarray gene expression platforms for the toxicogenomic evaluation of liver from short-term rat toxicity studies. Front. Genet. 2018;9 doi: 10.3389/fgene.2018.00636. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 52.Verheijen J., Sleegers K. Understanding Alzheimer disease at the interface between genetics and transcriptomics. Trends Genet. 2018;34:434–447. doi: 10.1016/j.tig.2018.02.007. [DOI] [PubMed] [Google Scholar]
- 53.Han S., Nho K., Lee Y. Alternative splicing regulation of an Alzheimer’s risk variant in CLU. Int. J. Mol. Sci. 2020;21 doi: 10.3390/ijms21197079. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 54.Wang M., Beckmann N.D., Roussos P., et al. The Mount Sinai cohort of large-scale genomic, transcriptomic and proteomic data in Alzheimer’s disease. Sci. Data. 2018;5 doi: 10.1038/sdata.2018.185. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55.Bennett D.A., Buchman A.S., Boyle P.A., et al. Religious orders study and rush memory and aging project. J. Alzheimers Dis. 2018;64:S161–S189. doi: 10.3233/JAD-179939. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Lachmann A., Torre D., Keenan A.B., et al. Massive mining of publicly available RNA-seq data from human and mouse. Nat. Commun. 2018;9 doi: 10.1038/s41467-018-03751-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57.Nakamura Y., Kodama Y., Saruhashi S., et al. DDBJ sequence read archive/DDBJ omics archive. Nat. Preced. 2010 doi: 10.1038/npre.2010.5085.1. [DOI] [Google Scholar]
- 58.SYNAPSE. https://www.synapse.org/
- 59.Satoh J.I., Yamamoto Y., Asahina N., et al. RNA-Seq data mining: Downregulation of NeuroD6 serves as a possible biomarker for Alzheimer’s disease brains. Dis. Markers. 2014;2014 doi: 10.1155/2014/123165. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 60.Mukherjee S., Heath L., Preuss C., et al. Molecular estimation of neurodegeneration pseudotime in older brains. Nat. Commun. 2020;11 doi: 10.1038/s41467-020-19622-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61.Hooshmand K., Halliday G.M., Pineda S.S., et al. Overlap between central and peripheral transcriptomes in Parkinson’s disease but not Alzheimer’s disease. Int. J. Mol. Sci. 2022;23 doi: 10.3390/ijms23095200. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 62.Hossain M.B., Islam M.K., Adhikary A., et al. Bioinformatics approach to identify significant biomarkers, drug targets shared between Parkinson’s disease and bipolar disorder: A pilot study. Bioinform. Biol. Insights. 2022;16 doi: 10.1177/11779322221079232. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 63.Batchu S. Progressive multiple sclerosis transcriptome deconvolution indicates increased M2 macrophages in inactive lesions. Eur. Neurol. 2020;83:433–435. doi: 10.1159/000510075. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 64.Hwang B., Lee J.H., Bang D. Single-cell RNA sequencing technologies and bioinformatics pipelines. Exp. Mol. Med. 2018;50:1–14. doi: 10.1038/s12276-018-0071-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65.Poirion O.B., Zhu X., Ching T., et al. Single-cell transcriptomics bioinformatics and computational challenges. Front. Genet. 2016;7 doi: 10.3389/fgene.2016.00163. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 66.Slovin S., Carissimo A., Panariello F., et al. Single-cell RNA sequencing analysis: A step-by-step overview. Methods Mol. Biol. 2021;2284:343–365. doi: 10.1007/978-1-0716-1307-8_19. [DOI] [PubMed] [Google Scholar]
- 67.Svensson V., da Veiga Beltrame E., Pachter L. A curated database reveals trends in single-cell transcriptomics. Database. 2020;2020 doi: 10.1093/database/baaa073. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 68.Ma S., Lim S.B. Single-cell RNA sequencing in Parkinson’s disease. Biomedicines. 2021;9 doi: 10.3390/biomedicines9040368. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69.Jiang J., Wang C., Qi R., et al. scREAD: A single-cell RNA-seq database for Alzheimer’s disease. iScience. 2020;23 doi: 10.1016/j.isci.2020.101769. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 70.Wang Q., Ding S., Li Y., et al. The Allen mouse brain common coordinate framework: A 3D reference atlas. Cell. 2020;181:936–953.e20. doi: 10.1016/j.cell.2020.04.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 71.Hu Y., Tattikota S.G., Liu Y., et al. DRscDB: A single-cell RNA-seq resource for data mining and data comparison across species. Comput. Struct. Biotechnol. J. 2021;19:2018–2026. doi: 10.1016/j.csbj.2021.04.021. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 72.Pushparaj P.N., Kalamegam G., Wali Sait K.H., et al. Decoding the role of astrocytes in the entorhinal cortex in Alzheimer’s disease using high-dimensional single-nucleus RNA sequencing data and next-generation knowledge discovery methodologies: Focus on drugs and natural product remedies for dementia. Front. Pharmacol. 2021;12 doi: 10.3389/fphar.2021.720170. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 73.Tsai A.P., Dong C., Lin P.B., et al. PLCG2 is associated with the inflammatory response and is induced by amyloid plaques in Alzheimer’s disease. Genome Med. 2022;14 doi: 10.1186/s13073-022-01022-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 74.Grubman A., Chew G., Ouyang J.F., et al. A single-cell atlas of entorhinal cortex from individuals with Alzheimer’s disease reveals cell-type-specific gene expression regulation. Nat. Neurosci. 2019;22:2087–2097. doi: 10.1038/s41593-019-0539-4. [DOI] [PubMed] [Google Scholar]
- 75.Lau S.F., Cao H., Fu A.K.Y., et al. Single-nucleus transcriptome analysis reveals dysregulation of angiogenic endothelial cells and neuroprotective glia in Alzheimer’s disease. Proc. Natl. Acad. Sci. U. S. A. 2020;117:25800–25809. doi: 10.1073/pnas.2008762117. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 76.Mathys H., Davila-Velderrain J., Peng Z., et al. Single-cell transcriptomic analysis of Alzheimer’s disease. Nature. 2019;570:332–337. doi: 10.1038/s41586-019-1195-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 77.Wang X., Li L. Cell type-specific potential pathogenic genes and functional pathways in Alzheimer’s Disease. BMC Neurol. 2021;21 doi: 10.1186/s12883-021-02407-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 78.Pei G., Fernandes B., Wang Y., et al. A single-cell atlas of the human brain in Alzheimer’s disease and its implications for personalized drug repositioning. bioRxiv. 2022 doi: 10.1101/2022.06.14.496100. [DOI] [Google Scholar]
- 79.Wilhelm M., Schlegl J., Hahne H., et al. Mass-spectrometry-based draft of the human proteome. Nature. 2014;509:582–587. doi: 10.1038/nature13319. [DOI] [PubMed] [Google Scholar]
- 80.Wang X., Liu Q., Zhang B. Leveraging the complementary nature of RNA-Seq and shotgun proteomics data. Proteomics. 2014;14:2676–2687. doi: 10.1002/pmic.201400184. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 81.Maier T., Güell M., Serrano L. Correlation of mRNA and protein in complex biological samples. FEBS Lett. 2009;583:3966–3973. doi: 10.1016/j.febslet.2009.10.036. [DOI] [PubMed] [Google Scholar]
- 82.Schoof E.M., Furtwängler B., Üresin N., et al. Quantitative single-cell proteomics as a tool to characterize cellular hierarchies. Nat. Commun. 2021;12 doi: 10.1038/s41467-021-23667-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 83.Verrou K.M., Tsamardinos I., Papoutsoglou G. Learning pathway dynamics from single-cell proteomic data: A comparative study. Cytometry A. 2020;97:241–252. doi: 10.1002/cyto.a.23976. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 84.Rao V.S., Srinivas K., Sujini G.N., et al. Protein-protein interaction detection: Methods and analysis. Int. J. Proteom. 2014;2014 doi: 10.1155/2014/147648. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 85.Zhang A. Cambridge University Press; 2009. Protein interaction networks: computational analysis; pp. 1–292. [Google Scholar]
- 86.Gonzalez M.W., Kann M.G. Chapter 4: Protein interactions and disease, PLoS Comput. Biol. 2012;8 doi: 10.1371/journal.pcbi.1002819. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 87.Craig R., Cortens J.P., Beavis R.C. Open source system for analyzing, validating, and storing protein identification data. J. Proteome Res. 2004;3:1234–1242. doi: 10.1021/pr049882h. [DOI] [PubMed] [Google Scholar]
- 88.Okuda S., Watanabe Y., Moriya Y., et al. jPOSTrepo: An international standard data repository for proteomes. Nucleic Acids Res. 2017;45:D1107–D1111. doi: 10.1093/nar/gkw1080. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 89.Choi M., Carver J., Chiva C., et al. MassIVE.quant: A community resource of quantitative mass spectrometry-based proteomics datasets. Nat. Meth. 2020;17:981–984. doi: 10.1038/s41592-020-0955-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 90.Proteomic Data Commons. https://proteomic.datacommons.cancer.gov/pdc/
- 91.Vizcaíno J.A., Deutsch E.W., Wang R., et al. ProteomeXchange provides globally coordinated proteomics data submission and dissemination. Nat. Biotechnol. 2014;32:223–226. doi: 10.1038/nbt.2839. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 92.Martens L., Hermjakob H., Jones P., et al. PRIDE: The proteomics identifications database. Proteomics. 2005;5:3537–3545. doi: 10.1002/pmic.200401303. [DOI] [PubMed] [Google Scholar]
- 93.Samaras P., Schmidt T., Frejno M., et al. ProteomicsDB: A multi-omics and multi-organism resource for life science research. Nucleic Acids Res. 2020;48:D1153–D1163. doi: 10.1093/nar/gkz974. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 94.Freitas A., Aroso M., Rocha S., et al. Bioinformatic analysis of the human brain extracellular matrix proteome in neurodegenerative disorders. Eur. J. Neurosci. 2021;53:4016–4033. doi: 10.1111/ejn.15316. [DOI] [PubMed] [Google Scholar]
- 95.Deolankar S.C., Patil A.H., Rex D.A.B., et al. Mapping post-translational modifications in brain regions in Alzheimer’s disease using proteomics data mining. Omics. 2021;25:525–536. doi: 10.1089/omi.2021.0054. [DOI] [PubMed] [Google Scholar]
- 96.Haytural H., Benfeitas R., Schedin-Weiss S., et al. Insights into the changes in the proteome of Alzheimer disease elucidated by a meta-analysis. Sci. Data. 2021;8 doi: 10.1038/s41597-021-01090-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 97.Kuleshov M.V., Jones M.R., Rouillard A.D., et al. Enrichr: A comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 2016;44:W90–W97. doi: 10.1093/nar/gkw377. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 98.Kinoshita Y., Uo T., Jayadev S., et al. Potential applications and limitations of proteomics in the study of neurological disease. Arch. Neurol. 2006;63:1692–1696. doi: 10.1001/archneur.63.12.1692. [DOI] [PubMed] [Google Scholar]
- 99.Brunner A.D., Thielert M., Vasilopoulou C., et al. Ultra-high sensitivity mass spectrometry quantifies single-cell proteome changes upon perturbation. Mol. Syst. Biol. 2022;18 doi: 10.15252/msb.202110798. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 100.Perkel J.M. Single-cell proteomics takes centre stage. Nature. 2021;597:580–582. doi: 10.1038/d41586-021-02530-6. [DOI] [PubMed] [Google Scholar]
- 101.Paul I., White C., Turcinovic I., et al. Imaging the future: The emerging era of single-cell spatial proteomics. FEBS J. 2021;288:6990–7001. doi: 10.1111/febs.15685. [DOI] [PubMed] [Google Scholar]
- 102.Gallart-Ayala H., Teav T., Ivanisevic J. Metabolomics meets lipidomics: Assessing the small molecule component of metabolism. BioEssays. 2020;42 doi: 10.1002/bies.202000052. [DOI] [PubMed] [Google Scholar]
- 103.Wang R., Li B., Lam S.M., et al. Integration of lipidomics and metabolomics for in-depth understanding of cellular mechanism and disease progression. Yi Chuan Xue Bao. 2020;47:69–83. doi: 10.1016/j.jgg.2019.11.009. [DOI] [PubMed] [Google Scholar]
- 104.Alves M.A., Lamichhane S., Dickens A., et al. Systems biology approaches to study lipidomes in health and disease. Biochim. Biophys. Acta Mol. Cell. Biol. Lipids. 2021;1866 doi: 10.1016/j.bbalip.2020.158857. [DOI] [PubMed] [Google Scholar]
- 105.Castellanos D.B., Martín-Jiménez C.A., Rojas-Rodríguez F., et al. Brain lipidomics as a rising field in neurodegenerative contexts: Perspectives with Machine Learning approaches. Front. Neuroendocrinol. 2021;61 doi: 10.1016/j.yfrne.2021.100899. [DOI] [PubMed] [Google Scholar]
- 106.Misra B.B., Langefeld C., Olivier M., et al. Integrated omics: Tools, advances and future approaches. J. Mol. Endocrinol. 2019;62:R21–R45. doi: 10.1530/JME-18-0055. [DOI] [PubMed] [Google Scholar]
- 107.Schumacher-Schuh A., Bieger A., Borelli W.V., et al. Advances in proteomic and metabolomic profiling of neurodegenerative diseases. Front. Neurol. 2021;12 doi: 10.3389/fneur.2021.792227. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 108.Shao Y., Le W. Recent advances and perspectives of metabolomics-based investigations in Parkinson’s disease. Mol. Neurodegener. 2019;14 doi: 10.1186/s13024-018-0304-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 109.Reveglia P., Paolillo C., Ferretti G., et al. Challenges in LC-MS-based metabolomics for Alzheimer’s disease early detection: Targeted approaches versus untargeted approaches. Metabolomics. 2021;17 doi: 10.1007/s11306-021-01828-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 110.Petersen R.C., Aisen P.S., Beckett L.A., et al. Alzheimer’s Disease Neuroimaging Initiative (ADNI): Clinical characterization. Neurology. 2010;74:201–209. doi: 10.1212/WNL.0b013e3181cb3e25. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 111.Wishart D.S., Lewis M.J., Morrissey J.A., et al. The human cerebrospinal fluid metabolome. J. Chromatogr. B Anal. Technol. Biomed. Life Sci. 2008;871:164–173. doi: 10.1016/j.jchromb.2008.05.001. [DOI] [PubMed] [Google Scholar]
- 112.Wishart D.S., Tzur D., Knox C., et al. HMDB: The human metabolome database. Nucleic Acids Res. 2007;35:D521–D526. doi: 10.1093/nar/gkl923. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 113.Wishart D.S., Knox C., Guo A.C., et al. HMDB: A knowledgebase for the human metabolome. Nucleic Acids Res. 2009;37:D603–D610. doi: 10.1093/nar/gkn810. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 114.Wishart D.S., Jewison T., Guo A.C., et al. HMDB 3.0-The human metabolome database in 2013. Nucleic Acids Res. 2013;41:D801–D807. doi: 10.1093/nar/gks1065. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 115.Wishart D.S., Feunang Y.D., Marcu A., et al. HMDB 4.0: The human metabolome database for 2018. Nucleic Acids Res. 2018;46:D608–D617. doi: 10.1093/nar/gkx1089. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 116.Wishart D.S., Guo A.C., Oler E., et al. HMDB 5.0: The human metabolome database for 2022. Nucleic Acids Res. 2022;50:D622–D631. doi: 10.1093/nar/gkab1062. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 117.Watanabe K., Yasugi E., Oshima M. How to search the glycolipid data in “LIPIDBANK for Web” the newly developed lipid database in Japan. Trends Glycosci. Glycotechnol. 2000;12:175–184. [Google Scholar]
- 118.Kind T., Liu K.H., Lee D.Y., et al. LipidBlast in silico tandem mass spectrometry database for lipid identification. Nat. Meth. 2013;10:755–758. doi: 10.1038/nmeth.2551. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 119.Cotter D., Maer A., Guda C., et al. LMPD: LIPID MAPS proteome database. Nucleic Acids Res. 2006;34:D507–D510. doi: 10.1093/nar/gkj122. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 120.Mracica T.B., Anghel A., Ion C.F., et al. MetaboAge DB: A repository of known ageing-related changes in the human metabolome. Biogerontology. 2020;21:763–771. doi: 10.1007/s10522-020-09892-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 121.Haug K., Salek R.M., Conesa P., et al. MetaboLights: An open-access general-purpose repository for metabolomics studies and associated meta-data. Nucleic Acids Res. 2013;41:D781–D786. doi: 10.1093/nar/gks1004. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 122.Sud M., Fahy E., Cotter D., et al. Metabolomics Workbench: An international repository for metabolomics data and metadata, metabolite standards, protocols, tutorials and training, and analysis tools. Nucleic Acids Res. 2016;44:D463–D470. doi: 10.1093/nar/gkv1042. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 123.MetabolomeXchange. http://www.metabolomexchange.org/site/
- 124.Psychogios N., Hau D.D., Peng J., et al. The human serum metabolome. PLoS One. 2011;6 doi: 10.1371/journal.pone.0016957. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 125.Barupal D.K., Baillie R., Fan S., et al. Sets of coregulated serum lipids are associated with Alzheimer’s disease pathophysiology. Alzheimers Dement. (Amst) 2019;11:619–627. doi: 10.1016/j.dadm.2019.07.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 126.Tang Y., Shah S., Cho K.S., et al. Metabolomics in primary open angle glaucoma: A systematic review and meta-analysis. Front. Neurosci. 2022;16 doi: 10.3389/fnins.2022.835736. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 127.Perez-Riverol Y., Bai M., da Veiga Leprevost F., et al. Discovering and linking public omics data sets using the Omics Discovery Index. Nat. Biotechnol. 2017;35:406–409. doi: 10.1038/nbt.3790. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 128.Pu J., Yu Y., Liu Y., et al. MENDA: A comprehensive curated resource of metabolic characterization in depression. Brief Bioinform. 2020;21:1455–1464. doi: 10.1093/bib/bbz055. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 129.Han X., Wang R., Zhou Y., et al. Mapping the mouse cell atlas by microwell-seq. Cell. 2018;172:1091–1107.e17. doi: 10.1016/j.cell.2018.02.001. [DOI] [PubMed] [Google Scholar]
- 130.The Tabula Muris Consortium Overall coordination, Logistical coordination, et al., Single-cell transcriptomics of 20 mouse organs creates a Tabula Muris. Nature. 2018;562:367–372. doi: 10.1038/s41586-018-0590-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 131.Liao J., Lu X., Shao X., et al. Uncovering an organ’s molecular architecture at single-cell resolution by spatially resolved transcriptomics. Trends Biotechnol. 2021;39:43–58. doi: 10.1016/j.tibtech.2020.05.006. [DOI] [PubMed] [Google Scholar]
- 132.Cheng J., Liao J., Shao X., et al. Multiplexing methods for simultaneous large-scale transcriptomic profiling of samples at single-cell resolution. Adv. Sci. (Weinh) 2021;8 doi: 10.1002/advs.202101229. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 133.Chen K.H., Boettiger A.N., Moffitt J.R., et al. RNA imaging. Spatially resolved, highly multiplexed RNA profiling in single cells. Science. 2015;348 doi: 10.1126/science.aaa6090. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 134.Ståhl P.L., Salmén F., Vickovic S., et al. Visualization and analysis of gene expression in tissue sections by spatial transcriptomics. Science. 2016;353:78–82. doi: 10.1126/science.aaf2403. [DOI] [PubMed] [Google Scholar]
- 135.Rodriques S.G., Stickels R.R., Goeva A., et al. Slide-seq: A scalable technology for measuring genome-wide expression at high spatial resolution. Science. 2019;363:1463–1467. doi: 10.1126/science.aaw1219. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 136.Eng C.L., Lawson M., Zhu Q., et al. Transcriptome-scale super-resolved imaging in tissues by RNA seqFISH. Nature. 2019;568:235–239. doi: 10.1038/s41586-019-1049-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 137.Satija R., Farrell J.A., Gennert D., et al. Spatial reconstruction of single-cell gene expression data. Nat. Biotechnol. 2015;33:495–502. doi: 10.1038/nbt.3192. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 138.Achim K., Pettit J.B., Saraiva L.R., et al. High-throughput spatial mapping of single-cell RNA-seq data to tissue of origin. Nat. Biotechnol. 2015;33:503–509. doi: 10.1038/nbt.3209. [DOI] [PubMed] [Google Scholar]
- 139.Franjic D., Skarica M., Ma S., et al. Transcriptomic taxonomy and neurogenic trajectories of adult human, macaque, and pig hippocampal and entorhinal cells. Neuron. 2022;110:452–469.e14. doi: 10.1016/j.neuron.2021.10.036. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 140.Femino A.M., Fay F.S., Fogarty K., et al. Visualization of single RNA transcripts in situ. Science. 1998;280:585–590. doi: 10.1126/science.280.5363.585. [DOI] [PubMed] [Google Scholar]
- 141.Giesen C., Wang H.A.O., Schapiro D., et al. Highly multiplexed imaging of tumor tissues with subcellular resolution by mass cytometry. Nat. Meth. 2014;11:417–422. doi: 10.1038/nmeth.2869. [DOI] [PubMed] [Google Scholar]
- 142.Gut G., Herrmann M.D., Pelkmans L. Multiplexed protein maps link subcellular organization to cellular states. Science. 2018;361 doi: 10.1126/science.aar7042. [DOI] [PubMed] [Google Scholar]