Significance
Early prediction of severity of cognitive deficits in children born from adverse pregnancies is both difficult and urgently needed for timely clinical intervention. Therefore, we need biological signatures that can accurately identify susceptible children early as well as robust tools and methods that can in turn discover such biological signatures. In this study, we used a machine learning model to show that differential splicing of certain key messenger RNAs in blood cells of young mice, born from adverse pregnancies, can accurately predict whether those mice would manifest motor learning disabilities. We propose that machine learning models, such as ours, and differential splicing of peripheral mRNAs, can serve as templates for discovery and therapeutics respectively, potentially across various adverse pregnancy paradigms.
Keywords: RNA splicing, offspring of mother with diabetes, prenatal alcohol exposure, peripheral biomarker, machine learning
Abstract
Severity of neurobehavioral deficits in children born from adverse pregnancies, such as maternal alcohol consumption and diabetes, does not always correlate with the adversity’s duration and intensity. Therefore, biological signatures that accurately predict the severity of neurobehavioral deficits, and robust tools for reliable identification of such biomarkers, are urgently needed in the clinic. Here, we demonstrate that significant changes in the alternative splicing (AS) pattern of offspring lymphocyte RNA can function as accurate peripheral biomarkers for motor learning deficits in mouse models of prenatal alcohol exposure (PAE) and offspring of mother with diabetes (OMD). An aptly trained deep-learning model identified 29 AS events common to PAE and OMD as better predictors of motor learning deficits than AS events specific to PAE or OMD. Shapley-value analysis, a game-theory algorithm, deciphered the trained deep-learning model’s learnt associations between its input, AS events, and output, motor learning performance. Shapley values of the deep-learning model’s input identified the relative contribution of the 29 common AS events to the motor learning deficit. Gene ontology and predictive structure–function analyses, using the AlphaFold2 algorithm, supported existing evidence on the critical roles of these molecules in early brain development and function. The direction of most AS events was opposite in PAE and OMD, potentially from differential expression of RNA binding proteins in PAE and OMD. Altogether, this study posits that AS of lymphocyte RNA is a rich resource, and deep-learning is an effective tool, for discovery of peripheral biomarkers of neurobehavioral deficits in children of diverse adverse pregnancies.
Intellectual and behavioral disabilities in children result from an interplay between mutations and exposure to adverse prenatal environments (1). Although severity and type of disability can correlate with type, timing, duration, and intensity of environmental stressors, developmental trajectories are usually divergent among similarly exposed children with similar genetic make-up (2). Exposed children likely to develop neurobehavioral impairments are not always identified early, when appropriate clinical interventions are more effective (3). Therefore, identification of biomarkers that can accurately predict neurobehavioral impairments in children, regardless of their exposure history, is critical.
Alcohol consumption (4) and diabetes among pregnant women (5) are serious risk factors for the developing fetus. Children with fetal alcohol spectrum disorders (FASD), caused by prenatal alcohol exposure (PAE), exhibit a spectrum of neurobehavioral deficits including impaired motor and intellectual development (6). However, PAE, even at high dosages, does not always result in uniform neurobehavioral deficits (7). Although some fatty acid molecules and maternal microRNAs have been identified as biomarkers for FASD (8–10), biomarkers that can predict the risk of developing neurobehavioral impairments characteristic of FASD have not been identified. Offspring of mother with diabetes (OMD), prenatally exposed to high blood glucose levels, also have an increased risk of neurobehavioral deficits including impaired motor development (11). OMD animal models exhibit similar impairments (12). However, biomarkers associated with neurobehavioral impairments in OMD are more limited than those in FASD.
Aberrant splicing is associated with increased risk for various cancers (13) and neuropsychiatric disorders (14, 15). Alternative splicing (AS) also plays a pivotal role in development of nervous (16) and immune systems (17) and is sensitive to changes in the fetal environment (18, 19). Therefore, we hypothesized that AS events in peripheral blood mononuclear cells (PBMCs) can serve as biomarkers of neurobehavioral impairments in PAE and OMD.
Next-generation RNA sequencing methods have generated vast data about transcriptional changes underlying various disorders (20). The urgent need is for development of tools that can identify pertinent data for treatment (i.e., therapeutics) and prediction of penetrance (i.e., biomarker) of these disorders (21). Deep-learning algorithms are uniquely suited for this task (22) and are increasingly deployed in clinical decision-making (23). However, the criteria for these decisions are not observable without algorithmic interventions (24–26).
In this study, we performed RNA sequencing of B cells, T cells, and monocytes from PAE and OMD, conditions with significant but variable impairment in motor learning. We deployed a deep-learning algorithm, Long Short-Term Memory (LSTM), to identify the subset of AS events most relevant for prediction of motor learning impairment. With these biomarker AS events as input, we trained the deep-learning model to perform at maximum prediction accuracy, without being underfit or overfit. Using Shapley value analysis, a game-theory algorithm for deep-learning model interpretation (24, 27), putative relative contribution of these biomarkers toward accurate model prediction was determined. We then performed gene ontology (GO) analysis to identify the roles of these biomarkers in cognitive development and disorders, and structure–function analysis, via AlphaFold2 algorithm, to characterize the putative effect of these AS events on protein structures. Finally, we investigated how differential RNA binding protein (RBP) expression in PAE and OMD might influence AS patterns of these biological signatures.
Results
Variable but Significant Motor-skill Learning Deficits in Young PAE and OMD Mice.
OMD mice were generated using the High-Dose Streptozotocin (STZ) Induction Protocol, which induces maternal diabetes via STZ-induced pancreatic beta cell death (28) (Fig. 1A). PAE mice that model FASD have been described (29) (Fig. 1B). Dynamics of blood ethanol concentration in alcohol-exposed (AE) and AE-control dams at gestational day 16.5 and 17.5 is presented in SI Appendix, Fig. S1. Fasting (Fig. 1D) and random (Fig. 1E) blood glucose levels of mother with diabetes (MD) and MD-control groups were monitored over two weeks post STZ injection, during which no change was observed in their body weight (Fig. 1C). Higher fasting (Fig. 1G) and random (Fig. 1H) blood glucose levels in MD group continued throughout pregnancy, with no difference in their body weight from MD-control group (Fig. 1F). Both body weight and blood glucose level at 6 h post fasting in male, but not female OMD group, were lower at P29 compared to offspring from the MD-control group (OMD-control) (SI Appendix, Table S1). Such sexual dimorphism is also observed in children from MD (30). STZ administration to dams had no effect on pancreatic insulin-positive cell area and number of islets in 11-wk-old OMD mice (SI Appendix, Fig. S2). Locomotor activity as assessed via open field test was similar between OMD and OMD-control mice, and between PAE and PAE-control mice, irrespective of gender (ref. 29 and SI Appendix, Fig. S3).
We then assessed motor-skill learning ability via accelerated rotarod test (Fig. 1I). PAE and OMD groups were significantly impaired in motor learning, in comparison to PAE-control and OMD-control groups respectively, as assessed via difference in their terminal speed between the first and final (sixth) trials (Fig. 1J). The learning index, average change in terminal speed of each mouse between two consecutive trials, was also significantly lower in PAE and OMD, relative to their respective controls (Fig. 1K). These two motor-skill learning ability parameters did not show any differences by gender in each group (SI Appendix, Fig. S4). The extent of learning deficits among similarly exposed PAE offspring varied widely, as per previous reports (29), and resembled the variability of motor-skill learning deficit among children with FASD (31). Motor-skill learning deficits and their variability among OMD were milder than those in PAE (Fig. 1 J and K).
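The two motor-learning measures defined above can be illustrated with a minimal sketch. It assumes only the definitions given in the text (first-to-final change in terminal speed, and the mean change between consecutive trials); the function names and the example speeds are hypothetical and are not taken from the study.

```python
from statistics import mean

def first_to_final_gain(terminal_speeds):
    """Difference in terminal speed between the final and the first trial."""
    return terminal_speeds[-1] - terminal_speeds[0]

def learning_index(terminal_speeds):
    """Average change in terminal speed between two consecutive trials."""
    if len(terminal_speeds) < 2:
        raise ValueError("at least two trials are required")
    changes = [later - earlier
               for earlier, later in zip(terminal_speeds, terminal_speeds[1:])]
    return mean(changes)

# Hypothetical terminal speeds (rpm) of one mouse over six rotarod trials.
speeds = [10.0, 14.0, 17.0, 21.0, 24.0, 26.0]
gain = first_to_final_gain(speeds)
index = learning_index(speeds)
```

Under these definitions the learning index equals the first-to-final gain divided by the number of trial-to-trial intervals, so the two measures differ only by scaling when every mouse completes the same number of trials.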
To examine any secondary effects due to fostering complications by mothers with adverse pregnancy, we performed cross-fostering of newborn offspring. There was no effect of fostering by alcohol-administered mothers (29) or MD mothers on motor learning of their respective offspring (SI Appendix, Fig. S5). In OMD condition, the metabolic measures were similar between fostering and non-fostering mothers and between their respective offspring (SI Appendix, Fig. S5).
Minimal Overlaps in Differentially Expressed Genes between PAE and OMD.
PBMCs were collected for RNA-sequencing one day after the accelerated rotarod test (Fig. 2A) and sorted via fluorescence-activated cell sorting (FACS) into T cells (CD19−/CD90.2+), B cells (CD19+/CD90.2−), and monocytes (CD11b+/CD19−) (32) (Fig. 2 B–E). The proportion of these cells was not significantly different among PAE, OMD, and their respective controls (SI Appendix, Fig. S6 A, C, and E). The proportion of B cells in OMD-control females showed a significant decrease compared to OMD-control males, but no other groups showed significant differences by gender (SI Appendix, Fig. S6 B, D, and F). RNA sequencing reads were aligned onto mm10 mouse genome by HISAT2 (33) and counted by HTSeq (34). Most reads passed Phred score-based quality filter (SI Appendix, Fig. S7 A and B). Gene density across expression level bins was similar among samples (SI Appendix, Fig. S7C). Similarity among samples was analyzed by multi-dimensional scaling (MDS) analysis where most samples clustered according to cell type (Fig. 2 F and G). Differentially expressed gene (DEG) analysis via EdgeR algorithm in Integrative Differential Expression Analysis for Multiple EXperiments (IDEAMEX) (35) exhibited minimal DEG overlap between PAE and OMD in any cell type (Fig. 2 H–J and SI Appendix, Fig. S7 D–I). Genes with fold change > 2 and false discovery rate (FDR) < 0.05 were considered significantly different. A different combination of read aligning (i.e., Spliced Transcripts Alignment to a Reference, STAR) and counting methods (featureCounts) did not materially alter the number of shared DEGs.
Common and Unique Differential AS Events in PAE and OMD.
Significant differential AS events in PBMCs were identified via rMATS (replicate multivariate analysis of transcript splicing) (36). Fig. 3A depicts five AS event types assessed: skipped exon (SE), mutually exclusive exon (MXE), alternative 3′ (A3SS) and 5′ (A5SS) splice sites, and retained intron (RI). Of note, ~30,000 AS events were detected in each cell type, in PAE and OMD (Fig. 3 B–G and SI Appendix, Fig. S8). AS events with delta percent spliced in (Δpsi) value > 0.05 and FDR < 0.05 were considered significantly different (Fig. 3 and SI Appendix, Fig. S8). Number of SE events was the highest, and RI the lowest, in all cases (Fig. 3 B–G and SI Appendix, Fig. S8).
It was observed that 16, 13, and 1 AS events were common between PAE and OMD in B cells, T cells, and monocytes, respectively (Fig. 3 H–J; arrows, Fig. 3 K–T). In contrast, 320, 253, and 106 AS events were unique to PAE, and 161, 249, and 82 AS events were unique to OMD in B cells, T cells, and monocytes, respectively (Fig. 3 H–J). Minimal to no genes were common to both significant DEGs and significant AS events in any cell type in either PAE or OMD (SI Appendix, Fig. S9). Results were cross-checked with another AS detection algorithm, Leafcutter (37). Due to the limited number of significant AS events and only one AS event being common to both PAE and OMD, combined with the under-representation of monocyte contributions to cognitive development and function in neuro-immune interaction literature compared to B cells and T cells, AS events in monocytes were excluded from further analyses.
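The significance filter and the partition into common and unique AS events described above can be sketched as follows. This is an illustration only: the dictionary keys (`id`, `dpsi`, `fdr`) and the event identifiers are hypothetical and are not rMATS column names, and the use of the absolute value of Δpsi is an assumption made so that events with negative Δpsi can pass the filter.

```python
def significant_events(events, min_abs_dpsi=0.05, max_fdr=0.05):
    """Return the IDs of events passing the delta-psi and FDR thresholds.

    Assumption: the delta-psi threshold is applied to the absolute value.
    """
    return {e["id"] for e in events
            if abs(e["dpsi"]) > min_abs_dpsi and e["fdr"] < max_fdr}

def partition_events(pae_ids, omd_ids):
    """Split significant events into common and condition-specific sets."""
    return {
        "common": pae_ids & omd_ids,
        "pae_only": pae_ids - omd_ids,
        "omd_only": omd_ids - pae_ids,
    }

# Hypothetical events for one cell type (values invented for illustration).
pae_events = [
    {"id": "geneA:SE", "dpsi": -0.12, "fdr": 0.01},
    {"id": "geneB:RI", "dpsi": 0.20, "fdr": 0.03},
    {"id": "geneC:SE", "dpsi": 0.02, "fdr": 0.001},   # fails delta-psi threshold
]
omd_events = [
    {"id": "geneA:SE", "dpsi": 0.09, "fdr": 0.02},
    {"id": "geneD:A3SS", "dpsi": 0.30, "fdr": 0.20},  # fails FDR threshold
    {"id": "geneE:MXE", "dpsi": -0.07, "fdr": 0.04},
]

groups = partition_events(significant_events(pae_events),
                          significant_events(omd_events))
```

In this toy example the shared event has opposite Δpsi signs in the two conditions, which the partition ignores by design; it compares event identity only.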
An SE event in Ets1 was shared between common AS events in B cells and T cells in PAE and OMD. Therefore, there were 29 common AS events in total, of which 28 were unique. In subsequent analyses, this SE event in Ets1 was treated as two independent AS events and was experimentally confirmed (SI Appendix, Fig. S10 B and C), along with another common SE AS event in Tvp23b (SI Appendix, Fig. S10A), via quantitative real-time PCR. To identify any trends between psi values of AS events and motor learning ability in PAE and OMD, we arranged z-score normalized psi values of 29 common AS events from all samples in ascending order of their learning index. Mice were classified as fast or slow learners depending on whether their learning index was above or below the population median of 2.8, respectively. Normalized psi values clustered according to motor learning ability (SI Appendix, Fig. S11), suggesting that psi values of AS events could be useful input in accurate classification of motor-learner type.
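As a minimal sketch of the normalization and labeling steps, the code below z-scores a vector of psi values and labels mice by comparing their learning index with a threshold. It assumes that a higher learning index marks a fast learner, as the definition of the index implies, that the population standard deviation is used, and that ties at the threshold are labeled slow; none of these choices is stated in the text, and all names and values are hypothetical.

```python
from statistics import mean, median, pstdev

def zscores(values):
    """Z-score normalize values (population standard deviation assumed)."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]

def classify_learners(learning_indices, threshold=None):
    """Label each mouse 'fast' (index above threshold) or 'slow' (otherwise).

    If no threshold is given, the median of the supplied indices is used.
    """
    cut = median(learning_indices) if threshold is None else threshold
    return ["fast" if x > cut else "slow" for x in learning_indices]

# Hypothetical psi values of one AS event across three samples.
normalized = zscores([0.2, 0.4, 0.6])

# Hypothetical learning indices, using the reported median of 2.8 as cut-off.
labels = classify_learners([1.5, 2.8, 3.4, 4.0], threshold=2.8)
```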
Common AS Events Are Better than Unique AS Events at Adequately Training a Deep-learning Model to Accurately Predict Motor-learner Type.
To identify significant AS events most predictive of motor-learner type, we deployed a deep-learning model (38), Long Short-Term Memory (LSTM) (39) (Fig. 4A, Methods). Raw psi values of either common (Fig. 3 K–T, annotations) or unique (Fig. 3 H and I) AS events from B cells and T cells were used as input. For output, mice were classified as either fast or slow learners.
Input data were randomly split into a training dataset, containing 80% of data, and a test dataset, containing 20% of data. During training, the LSTM model learns from errors in its prediction, and during testing, it uses what it has learned to predict fast or slow learners for the unseen test dataset. Deep-learning algorithms are therefore optimization algorithms that iteratively seek to minimize errors in their predictions. This error is quantified by the loss function. Binary cross-entropy is the loss function for the LSTM model, and its output, binary cross-entropic loss, is a proxy for the LSTM model’s learnability.
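For illustration, a random 80/20 split and the binary cross-entropy loss can be written in a few lines. This sketch is not the study's training code: the seed, the rounding of the test-set size, and the clipping constant `eps` are assumptions introduced here, and labels are encoded as 1 (fast) and 0 (slow) purely for the example.

```python
import math
import random

def train_test_split(samples, test_fraction=0.2, seed=0):
    """Randomly split samples into (train, test) lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)

# 56 samples, as in the combined PAE and OMD dataset.
train_ids, test_ids = train_test_split(range(56))

# An uninformative model that always predicts 0.5 has a loss of ln(2).
chance_loss = binary_cross_entropy([1, 0], [0.5, 0.5])
```

A loss near ln(2) ≈ 0.69 therefore corresponds to chance-level confidence in a two-class problem, and values approaching zero correspond to confident, correct predictions.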
Nature of distribution of loss function values over epochs indicates overgeneralization (overfitting) or under generalization (underfitting) of mapping model input to output. While overfit models learn useful features and irrelevant noise in input dataset (and overgeneralize), underfit models are incapable of adequately learning useful features in input dataset (and undergeneralize), both suboptimal for model performance and utility. The loss function value should incrementally decrease and approach zero with each successive epoch for optimal underfitting-free learning. Model learning loss on the training dataset (i.e., training loss, black line in Fig. 4 B–J) should be lower than model learning loss on the test dataset (i.e., test loss, red line in Fig. 4 B–J) at each epoch for optimal overfitting-free learning.
We compared LSTM model learnability, as assessed via distribution of loss function values over 1,000 epochs (Fig. 4 B–J), with input datasets containing psi values of AS events either common to PAE and OMD (Fig. 4 B, E, and H) or unique to PAE (Fig. 4 C, F, and G) or OMD (Fig. 4 D, I, and J). The LSTM model learned optimally, with no overfitting or underfitting, with input data from 29 common AS events from all 56 samples in PAE and OMD (Fig. 4B). However, the LSTM model was underfit with input data from 573 unique AS events from 32 samples in PAE (Fig. 4C) and input data from 410 unique AS events from 24 samples in OMD (Fig. 4D). The LSTM model learning with input data from 29 common AS events was also relatively optimal when sample size was randomly reduced to 32 (Fig. 4E) and 24 (Fig. 4H) to facilitate equivalent comparisons of LSTM model learnability with input data from unique AS events of PAE (Fig. 4 F and G) and OMD (Fig. 4 I and J). However, LSTM model learning was suboptimal with input data derived from 29 most significant B cell (underfit, Fig. 4F) and 29 most significant T cell (overfit, Fig. 4G) AS events from 32 samples in PAE (Fig. 4 F and G). Similarly, LSTM model learning was suboptimal when input data was derived from 29 most significant B cell (underfit, Fig. 4I) and 29 most significant T cell (underfit, Fig. 4J) AS events from 24 samples in OMD (Fig. 4 I and J).
To summarize, the LSTM model trained and tested on input data from 29 common AS events was superior in learning, being relatively less overfit and underfit, than the same LSTM model trained and tested on comparable input data from non-common (i.e., unique) AS events from B cells, T cells, or both, from either PAE or OMD.
Accurate Classification of Fast and Slow Learners Via the Deep-learning Model with Common AS Events as Input.
We evaluated the accuracy of the LSTM model in distinguishing fast from slow learners with psi values from 29 common biomarker AS events as input. The LSTM model was trained and tested over 1,000 epochs, and its % prediction accuracy on the test dataset, its performance proxy, was plotted for each epoch (SI Appendix, Fig. S12A) along with corresponding loss values, its learnability proxy (SI Appendix, Fig. S12B). LSTM model performance began with ~40% prediction accuracy, for training and test datasets, and rose to 100%, in training dataset at ~80 epochs, and in test dataset at ~225 epochs (SI Appendix, Fig. S12A). Corresponding loss values for training and test datasets incrementally decreased from 0.8 at epoch 1 to ~0 at epoch 1,000 (SI Appendix, Fig. S12B), suggesting that the LSTM model learned optimally and is not underfit. Percentage prediction accuracy of test dataset was never better than that of training dataset (SI Appendix, Fig. S12A), and loss values of test dataset were higher than those of training dataset (SI Appendix, Fig. S12B), throughout 1,000 epochs, indicating the absence of overfitting.
We then performed fivefold cross validation of the LSTM model to determine whether all input features were equally important for prediction accuracy. Input dataset was randomly split into five equal parts: four of which (80%) were used for training, and the fifth was used for testing model performance (SI Appendix, Fig. S12C) and learnability (SI Appendix, Fig. S12D) over 200 epochs. The process was repeated until each of the five parts was used for testing exactly once. Trajectory of performance and learning within each test varied widely (SI Appendix, Fig. S12C), suggesting differential contribution of AS events toward model performance. Split III performed best, and split V worst, in testing of model prediction accuracy (SI Appendix, Fig. S12C), indicating that their respective training AS events, splits I + II + IV + V, and splits I–IV, contained the best and worst AS events for model training. Importantly, the LSTM model learned optimally, with no overfitting or underfitting, in all five splits of the fivefold cross-validation analysis (SI Appendix, Fig. S12D). However, because assignment of AS events to different splits during cross-validation analysis is random, and due to general opacity of deep-learning models (40), it is not possible to determine exactly which AS events were included in the different splits. Therefore, it was not possible to determine which AS events were relatively more important for model training/performance solely via cross-validation analysis.
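The fold rotation in fivefold cross-validation can be sketched as below. The sketch assumes that the dataset is partitioned by sample index, as in standard k-fold cross-validation, and uses a fixed seed so that fold membership is reproducible; the function name and the seed are hypothetical and do not describe the study's implementation.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Return a list of (train_indices, test_indices) pairs, one per fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for held_out in range(k):
        test = folds[held_out]
        train = [idx for f, fold in enumerate(folds) if f != held_out
                 for idx in fold]
        splits.append((train, test))
    return splits

splits = k_fold_indices(56, k=5)
fold_sizes = [len(test) for _, test in splits]
```

Recording the seed and the resulting fold membership, as done here, is one way to make the content of each split traceable after the analysis.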
Relative Contribution of Common AS Events to LSTM Model Performance in Classifying Learner Types.
Deep-learning models are “black boxes” where learned associations between input and output are not accessible to observation (41). Algorithms have been developed to decipher this opacity of deep-learning models (24, 42, 43). Shapley-value analysis is the most widely used among them (24–26) and is deployed here to determine the relative contribution of 29 AS events to the LSTM model’s performance in accurately classifying motor-learner type.
Derived from cooperative game theory, Shapley-value algorithm assigns payouts to players depending on their contribution to total payout from a game (27). Here, each input feature is a “player” in a game where accuracy of model prediction is the payout. Shapley values then indicate how to fairly distribute this payout among input features. Therefore, input features (i.e., AS events) with higher Shapley values are relatively more important to predictive accuracy of the LSTM model. Using SHAP (SHapley Additive exPlanations) algorithm (44) during training of the LSTM model, we computed a unique Shapley value for each of the 29 common AS events in each of the 56 mouse samples in input. Common AS events were then ranked based on sum of their Shapley values from all input samples (top to bottom, Fig. 5A).
We then tested whether different input features cluster in ways that could suggest hidden relationships (45). We performed Pearson correlation analysis of Shapley values of all 29 common AS events (Fig. 5B). Input features grouped into four distinct clusters: clusters I and III comprised B cell AS events, and clusters II and IV comprised T cell AS events. While different clusters of the same cell type exhibited a negative correlation with each other (clusters: I vs. III, II vs. IV; Fig. 5B), clusters of one cell type showed no or weak correlation with both clusters of other cell types (clusters: I vs. II and IV, II vs. I and III, III vs. II and IV, IV vs. I and III; Fig. 5B).
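The payout-sharing idea can be made concrete with an exact Shapley computation on a toy game. This sketch enumerates every coalition, which is only feasible for a handful of players; it is not the SHAP algorithm, which relies on approximations for models with many inputs. The two "players" and the payout table are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(players, value):
    """Exact Shapley values; `value` maps a frozenset of players to a payout."""
    n = len(players)
    result = {}
    for player in players:
        others = [p for p in players if p != player]
        total = 0.0
        for size in range(n):
            # Weight of coalitions of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                total += weight * (value(s | {player}) - value(s))
        result[player] = total
    return result

# Hypothetical payouts: model accuracy with each subset of two features.
payout = {
    frozenset(): 0.5,
    frozenset({"eventA"}): 0.7,
    frozenset({"eventB"}): 0.6,
    frozenset({"eventA", "eventB"}): 0.9,
}
phi = shapley_values(["eventA", "eventB"], lambda s: payout[s])
```

The values sum to the gain of the full feature set over the empty set (0.9 − 0.5 = 0.4), which is the sense in which the payout is distributed fairly among features.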
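A pairwise Pearson correlation over per-sample Shapley values, the quantity underlying such a clustering, can be sketched as follows. The feature names and Shapley values are invented for illustration, and the clustering step itself is not shown.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Pairwise Pearson correlations for a dict of name -> per-sample values."""
    names = list(columns)
    return {(a, b): pearson(columns[a], columns[b])
            for a in names for b in names}

# Hypothetical Shapley values of three features across four samples.
shap_by_feature = {
    "featureA": [0.10, 0.20, 0.30, 0.40],
    "featureB": [0.05, 0.10, 0.15, 0.20],   # moves with featureA
    "featureC": [0.40, 0.30, 0.20, 0.10],   # moves against featureA
}
corr = correlation_matrix(shap_by_feature)
```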
Relevance of Shapley-value Clusters to Neuronal Function and Childhood Motor Disorders.
We then identified significantly enriched GO terms (P < 0.01) related to biological processes, molecular processes, cellular component, human phenotype, molecular pathways, and rare childhood disorders with cognomotor dysfunction (SI Appendix, Figs. S13 and S14), for clusters I–IV genes.
Cluster I B cell genes (Gnas, Chchd7, Mtpap, Rasa1, Ets1, Stk38, Bin1, and Kdm7a; Fig. 5B) were associated with neuronal GO terms (amyloid precursor protein metabolism, glial differentiation, dopamine receptor signaling, binding to glutamate and adrenergic receptors, and to Tau proteins) (SI Appendix, Fig. S13 A and E), neuronal cellular compartments [axon initial segment and node of Ranvier (SI Appendix, Fig. S13I)] and dysarthria (SI Appendix, Fig. S13J), a motor speech disorder associated with FASD (46).
Cluster II T cell genes (Zfp639, Ms4a6b, Setdb2, Umps, Celf2, Usp15, and Wac; Fig. 5B) were associated with GO terms for enzymatic modification of nucleotides and proteins (catabolism of proteasomal proteins, epigenetic histone modifications, and pyrimidine metabolism) (SI Appendix, Fig. S13B), binding to RNA polymerase II transcription apparatus (SI Appendix, Fig. S13F), and abnormality of human T cell physiology (SI Appendix, Fig. S13J).
Cluster III B cell genes (Tnfaip3, Ttc3, Dapp1, Pld4, Ddx50, Rars2, Wdr43, and Brox; Fig. 5B) were associated with GO terms for peripheral immune cell functions (Toll-like receptor signaling, B cell and lymphocyte activation, B cell homeostasis, and interleukin production) (SI Appendix, Fig. S13C) and motor disabilities in young children, namely impaired head control and suckling (SI Appendix, Fig. S13J) (47).
Cluster IV T cell genes (2310001H17Rik, Mapk9, Ets1, Ptcd3, Tvp23b, and Tcrg-c4; Fig. 5B) were associated with GO terms for vascular development (SI Appendix, Fig. S13D), leukocyte cell adhesion (SI Appendix, Fig. S13D), and microRNA transcription (SI Appendix, Fig. S13D), processes affected in FASD and OMD (48–51).
Human pathway analysis showed associations with inflammation-related pathways including B cell receptor, IL-17, and TNF-alpha signaling (SI Appendix, Fig. S13K). Rare disease ontology showed association of clusters I–IV genes with rare childhood genetic developmental disorders with cognomotor impairment (SI Appendix, Fig. S14). Thus, GO analyses demonstrated the relevance of biomarker AS genes to various cellular and molecular processes crucial for motor development, suggesting that these AS events can also occur simultaneously in neural cells and thereby affect childhood motor development. Another possibility is that these AS events in PBMCs may also directly regulate early motor development (52).
Influence of Key AS Events on the Structure and Function of Spliced Isoforms.
We then determined whether biomarker AS events would impact long and short AS protein isoform structures and therefore their stability and function. Non-coding RNA 2310001H17Rik was excluded, and the amino acid (AA) sequences of long and short AS isoforms of the remaining 27 unique biomarker AS events were derived.
AS events in Ms4a6b, Zfp639, and Mapk9 retain introns and are likely non-coding, and AS of Chchd7 and Gnas occurs in their untranslated regions, suggesting that these AS events have no influence on the translated protein.
Eleven AS events (in Kdm7a, Usp15, Ttc3, Pld4, Rars2, Dapp1, Stk38, Umps, Tnfaip3, Tcrg-c4, and Brox) result in substantial C-terminal truncation (i.e., >30% AA loss) of short isoform. One AS Setdb2 transcript is non-coding. We predicted structures of long and short isoforms resulting from four of these AS events (Kdm7a, Usp15, Dapp1, and Brox) using AlphaFold2 (53, 54). Superimpositions of long and short isoforms using FATCAT (Flexible structure AlignmenT by Chaining Aligned fragment pairs allowing Twists) (55) showed drastic differences in structures of short and long isoforms (SI Appendix, Fig. S15), suggesting that large mRNA truncation should result in either degradation of truncated short transcript, or production of substantially truncated isoform (56). Therefore, these AS events should result in selective loss of short isoform, thereby affecting relative availability of long isoform.
Among remaining AS events (Mtpap, Ets1, Tvp23b, Celf2, Wdr43, Ptcd3, Wac, Ddx50, Bin1, and Rasa1), Mtpap and Ptcd3 AS events result in small C-terminal truncations, whereas the rest result in interior exon skipping. We predicted structures of long and short isoforms from five AS events using AlphaFold2 (53, 54), namely SE AS events in Mtpap, Ets1, Tvp23b, Celf2, and Rasa1 (SI Appendix, Figs. S16 and S17 and Movie S1), to highlight how AS can influence structure, properties, and functions of their spliced isoforms by modifying protein domains directly [i.e., substrate binding site in Celf2 and enzymatic site in Rasa1 (SI Appendix, Fig. S16 G–J and Movie S1)] and indirectly [i.e., protein–protein binding site in Ets1 and membrane localization domain in Tvp23b (SI Appendix, Fig. S16 C–F and Movie S1)] essential for protein function. As expected, predicted structures of long isoforms were globally similar to predicted structures of their respective short isoforms, except around their spliced-in regions, as evident from their superimposition using FATCAT (red, SI Appendix, Fig. S18 and Movie S1). Spliced regions are important for various physiochemical properties, suggesting that AS here should result in viable translation of short and long isoforms with different physiochemical properties.
Potential Involvement of RBPs in Opposing Pattern of AS in PAE and OMD.
A majority of significant AS events, in PAE and OMD, for all cell types, were negative (red dots, Fig. 3 B–D) and positive (black dots, Fig. 3 E–G) Δpsi events, respectively. This opposing directional splicing pattern was preserved in 20 biomarker AS events (Fig. 3 K–T).
RBPs play a combinatorial role in AS, where differential binding of RBPs to transcript influences its AS outcome (57, 58). We hypothesized that opposing directional splicing patterns in PAE and OMD could result from differential effects of alcohol and high glucose on RBP expression. We investigated how any such differential RBP expression between PAE and OMD could interact with differential density of cognate RBP binding sites in biomarkers, to influence their opposing directional splicing.
We determined binding site densities of 71 RBPs along upregulated and downregulated exons of all significant SE AS events (Fig. 3 B–G and SI Appendix, Fig. S8) using RBPMap (59). We then determined average ratio of binding site density along upregulated vs. downregulated exons for each RBP (SI Appendix, Fig. S19A) thereby facilitating direct comparison across RBPs irrespective of frequency of occurrence of individual RBP binding sites along AS events. Next, we determined fold change of RBP expression in each cell type in PAE and OMD (SI Appendix, Fig. S19B). Upon plotting binding site density ratio and fold change for RBPs across all cell types and experimental conditions as scatter plots, we observed that distribution of RBPs changes drastically between PAE and OMD within the same cell type (SI Appendix, Fig. S20 A–F).
We then investigated whether differential RBP expression plays a role in opposing splicing direction in the biomarkers. We observed that binding site density of RBPs, Igf2bp3, Srsf2, and Cpeb2, among the top differentially expressed RBPs between PAE and OMD conditions across all cell types (SI Appendix, Fig. S19B), varies across regions key for splicing in these biomarkers (SI Appendix, Fig. S20H). All three RBPs also exhibited a differential expression pattern between PAE and OMD in B cells and T cells (SI Appendix, Fig. S20G).
These data suggest that interaction of differential expression of individual RBPs in PAE and OMD, with differential occurrence of binding sites for those RBPs, could explain opposing AS direction in PAE and OMD.
Discussion
Deep-learning models have enabled rapid advances in medicine, from early detection of cancers (60) to rapid drug screening and discovery (61). However, despite their prowess, inner workings of deep-learning models are opaque to observation (41, 62). Considering their increased adoption in decision-making in healthcare, transition of deep-learning models from a “black box” to an “interpretable” paradigm is imperative, and thus, several different algorithms have been suggested as solutions (24–26, 43, 63).
Using a deep-learning model, LSTM (39), we analyzed PBMC RNA-sequencing data and discovered 29 AS events in key genes as accurate predictors of motor learning deficit in PAE and OMD. An analogous machine-learning analysis of PBMC RNA-sequencing data identified an increase in CD8+ T-effector memory cells as a peripheral biomarker for Alzheimer’s disease (64). We then used Shapley-value analysis to interpret the aptly trained LSTM model and characterize the relative contribution of 29 biomarkers to prediction of motor learning deficit.
AS is a critical driver of early brain development (16, 17). Besides the possibility that AS events in PBMCs similarly affect the same genes in developing brain, these peripheral AS events could also influence early brain development directly. Neuroimmune interactions play an important role in cognitive function and development (65). T cells in CNS meninges regulate synaptic plasticity and short-term memory via IL-17 production (52), whose signaling pathway was enriched in GO analysis of Cluster III genes (Fig. 5). The AS events identified in this study are therefore likely key contributors to cognitive development and function, disruption of which may underlie intellectual disabilities in children affected by PAE, MD, and other prenatal stressors.
Children exposed to other prenatal stressors, such as maternal inflammation (66), smoking (67), and drug abuse (68), show an increased risk for similar neurobehavioral impairments (69, 70). Environmental stressors activate stress signaling pathways involving molecular chaperones and inflammatory cytokines (71), and experimentally mimicking their fetal expression results in similar neurobehavioral impairments (72, 73). Therefore, common mechanisms are anticipated as etiologies for these diverse pathologies. Here, common AS events, being at least twice as frequent in the total dataset of significant AS events, are more likely than unique AS events to be of key salience for the motor learning deficit. The deep-learning model independently reached the same conclusion.
A combination of differential RBP expression and the occurrence of their binding sites might be involved in the opposing splicing direction of AS events in PAE and OMD (SI Appendix, Fig. S20). However, we do not know how opposing splicing directions can consistently result in motor-skill learning disability. Biomarkers fell into two categories: one in which the longer AS isoform is likely more stable than the shorter isoform, and another in which both isoforms are likely equally stable but have different physicochemical properties. One possible explanation is that AS dysregulation in the first category results in an imbalance in subcomponents of macromolecular complexes and signaling pathways, lowering their functional activity and stability (74). Another possibility entails homodimerization-dependent protein activity, observed in five biomarkers: Mtpap (75), Usp15 (76), Ets1 (77), Umps (78), and Tnfaip3 (79). Dysregulation of AS that produces isoforms unamenable to homodimerization can have a dominant-negative effect on homodimer formation. Yet another possibility is that both isoforms of biomarkers in the second category fill a functional niche in early brain development, and dysregulation of AS in either direction perturbs this balance.
Ideally, our LSTM model would have been tested on a validation set (besides the test set), for optimal assessment of model generalizability on unseen data. We mitigated the absence of the validation set with fivefold cross-validation analysis, which showed that the LSTM model is capable of generalizing well and is robust (SI Appendix, Fig. S12 C and D). Nevertheless, the LSTM model should be further assessed with a validation set, as well as on independent datasets, for optimal evaluation of model generalizability.
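The fivefold cross-validation described above can be illustrated with a minimal, standard-library sketch that partitions sample indices into five folds so that every sample is held out exactly once. The helper name `k_fold_indices`, the seed, and the sample count are hypothetical choices for illustration and are not taken from the study's code.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Hypothetical helper for illustration; not the study's code.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    # Slicing with a stride of k gives folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        # Training set = every fold except the one currently held out.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield sorted(train_idx), sorted(test_idx)


if __name__ == "__main__":
    # 23 is an arbitrary sample count chosen for the example.
    for fold, (train, test) in enumerate(k_fold_indices(23, k=5), start=1):
        print(f"fold {fold}: {len(train)} train, {len(test)} held out")
```

In such a scheme, a model would be retrained from scratch on each training split and scored on the corresponding held-out fold, and the five scores would then be summarized.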
These biomarker AS events are conserved in mice and humans. However, the PBMC types tested here are limited, and there are species-specific differences in PBMC differentiation and function (80). Another potential, albeit unlikely, reservation is that PBMC properties may have been affected by the accelerated rotarod test preceding blood collection. AS biomarkers identified here with the short-read NovaSeq 6000 platform should be further validated using a long-read RNA-seq platform (e.g., HiFi sequencing from Pacific Biosciences) (81). Long-read platforms, by producing longer sequencing reads, can more accurately differentiate between true AS events and pseudogenes. Such validation, besides enhancing the reliability of our findings, may also uncover other novel AS biomarkers. In addition, the implied changes in protein diversity resulting from biomarker AS events should be validated via mass spectrometry in future studies. Ultimately, these biomarkers need to be validated in a clinical setting for their accuracy and reliability as prognosticators of motor and other learning deficits in children prenatally exposed to alcohol, high blood glucose, or other prenatal stressors.
Materials and Methods
More details on the materials and methods used are provided in SI Appendix.
PAE and OMD Mouse Models.
CD-1 mice (Jackson Laboratory) were maintained in a regular 12-h light–dark cycle and constant room temperature (RT) of 22 ± 1 °C. Mice with an overnight vaginal plug were marked as E0.5. For generating PAE mice, pregnant 8-wk-old CD-1 mice were injected intraperitoneally (IP) with 25% ethanol solution (in PBS), at 4 g/kg of body weight, on days 16.5 and 17.5 of gestation. PAE-control mice were injected with PBS (29, 69, 72).
For generating OMD mice, 6-wk-old CD-1 mice were IP injected with STZ solution [in citric buffer (CB)], at 150 mg/kg body weight (https://www.diacomp.org/shared/document.aspx?id=74&docType=Protocol). MD-control mice were injected with CB. Blood glucose levels were periodically monitored, during fasting and at random intervals, with the Accu-CHEK Guide (Roche) glucometer. Female mice that tested for blood glucose levels above 150 mg/dL after 6 h of fasting and above 250 mg/dL during random sampling, in at least two time points within 14 d of STZ injection, were chosen for breeding.
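The selection rule for breeding females can be written as a small check. The sketch below is only one reading of the criteria: it assumes that the fasting and random-sampling conditions must each be met at two or more time points within 14 d of STZ injection. The `Reading` type, the function name, and the example values are hypothetical.

```python
from dataclasses import dataclass

FASTING_THRESHOLD_MG_DL = 150  # after 6 h of fasting (from the text)
RANDOM_THRESHOLD_MG_DL = 250   # random sampling (from the text)
WINDOW_DAYS = 14               # days after STZ injection (from the text)
MIN_TIME_POINTS = 2            # "at least two time points" (from the text)


@dataclass
class Reading:
    day: int        # days since STZ injection
    glucose: float  # blood glucose in mg/dL
    fasting: bool   # True if taken after 6 h of fasting


def eligible_for_breeding(readings):
    """Return True if the readings meet the stated hyperglycemia criteria.

    Assumption: each condition (fasting, random) must independently be met
    at MIN_TIME_POINTS or more time points inside the window.
    """
    in_window = [r for r in readings if 0 <= r.day <= WINDOW_DAYS]
    fasting_hits = sum(
        r.fasting and r.glucose > FASTING_THRESHOLD_MG_DL for r in in_window
    )
    random_hits = sum(
        (not r.fasting) and r.glucose > RANDOM_THRESHOLD_MG_DL for r in in_window
    )
    return fasting_hits >= MIN_TIME_POINTS and random_hits >= MIN_TIME_POINTS


if __name__ == "__main__":
    # Made-up readings for illustration only.
    example = [
        Reading(day=3, glucose=180, fasting=True),
        Reading(day=7, glucose=210, fasting=True),
        Reading(day=5, glucose=300, fasting=False),
        Reading(day=10, glucose=320, fasting=False),
    ]
    print(eligible_for_breeding(example))
```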
For cross-fostering experiments, OMD and OMD-control mice were swapped within 48 h after birth and were reared by CF-MD-control and CF-MD mothers, respectively.
DEG Analysis.
DEG analysis was performed with the edgeR (v3.24.3) package (82) using IDEAMEX (35).
LSTM Deep-learning Model.
The LSTM model was written and executed with Keras (83), an open-source Python neural-network library running on top of TensorFlow (84) (v1.x; Google). Python code (85) was executed in Colaboratory (86) (Google). The NumPy Python library (87) was used to format the raw input data, psi values of AS events, into LSTM model input.
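To illustrate the formatting step, the sketch below reshapes a samples-by-events table of psi values into the three-dimensional array (samples, timesteps, features) that Keras recurrent layers take as input. The layout is an assumption rather than the study's documented choice: each AS event is treated as one timestep with a single feature, and missing psi values are replaced by a fill value. The function name `psi_to_lstm_input` is hypothetical.

```python
import numpy as np


def psi_to_lstm_input(psi, fill_value=0.0):
    """Convert a (samples, events) psi table to (samples, timesteps, features).

    Assumptions (not from the paper): each AS event is one timestep carrying
    a single feature, and missing psi values (NaN) become `fill_value`.
    """
    psi = np.asarray(psi, dtype=np.float32)
    if psi.ndim != 2:
        raise ValueError("expected a 2-D array of shape (samples, events)")
    psi = np.nan_to_num(psi, nan=fill_value)  # replace missing values
    psi = np.clip(psi, 0.0, 1.0)              # psi is a proportion in [0, 1]
    return psi[:, :, np.newaxis]              # add a trailing feature axis


if __name__ == "__main__":
    # Two made-up samples with three AS events each.
    table = [[0.1, 0.8, float("nan")],
             [0.5, 0.5, 1.0]]
    print(psi_to_lstm_input(table).shape)
```

An alternative layout, with all events presented as features of a single timestep, would have shape (samples, 1, events); which layout suits a given model is a design decision.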
SHAP Analysis.
SHAP, a Shapley-value algorithm for interpreting deep-learning model predictions (24–26), was used to assign a Shapley value to each input feature, proportional to its contribution toward accurate prediction of slow and fast learners by the LSTM deep-learning model (44).
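The quantity that SHAP estimates can be shown on a toy example. The sketch below computes exact Shapley values by averaging each feature's marginal contribution over all feature orderings. It is not the SHAP library and not the study's analysis: the payoff function `toy_value` and its weights are hypothetical, and exact enumeration is practical only for a handful of features.

```python
from itertools import permutations
from math import factorial


def exact_shapley(value_fn, n_features):
    """Exact Shapley values via marginal contributions over all orderings.

    value_fn takes a frozenset of feature indices and returns the payoff of
    that coalition. Cost grows as n!, so this is for toy sizes only.
    """
    phi = [0.0] * n_features
    for order in permutations(range(n_features)):
        coalition = frozenset()
        for feature in order:
            with_feature = coalition | {feature}
            # Marginal contribution of `feature` when it joins `coalition`.
            phi[feature] += value_fn(with_feature) - value_fn(coalition)
            coalition = with_feature
    n_orderings = factorial(n_features)
    return [p / n_orderings for p in phi]


# Hypothetical payoff: additive weights plus an interaction of features 0 and 1.
WEIGHTS = [0.5, 0.3, 0.2]


def toy_value(coalition):
    total = sum(WEIGHTS[i] for i in coalition)
    if {0, 1} <= coalition:
        total += 0.4  # interaction bonus, split equally by symmetry
    return total


if __name__ == "__main__":
    print(exact_shapley(toy_value, 3))
```

In this toy payoff, the interaction term is shared equally between the two interacting features, and the Shapley values sum to the payoff of the full coalition.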
Study Approval.
All animal experimental protocols were approved by the Institutional Animal Care and Use Committee of Children’s National Hospital.
Acknowledgments
We thank Claire Charpentier and Vanessa Niba for technical assistance. Part of this work was done in conjunction with the Collaborative Initiative on Fetal Alcohol Spectrum Disorders (CIFASD), which is funded by grants from the National Institute on Alcohol Abuse and Alcoholism (NIAAA). This work was supported by the following grants: NIH, NIAAA Grant R01AA026272 (M.T. and K.H.-T.); CIFASD Grant UH2AA026106 (K.H.-T. and M.T.); National Institute on Drug Abuse Grant DA023999 (P.R.); Japan Society for the Promotion of Science (JSPS) Grants-in-Aid for Scientific Research (KAKENHI) Grant 19K240390 (T.S.); a Scott-Gentle Foundation grant (K.H.-T. and M.T.); and District of Columbia Intellectual and Developmental Disabilities Research Center Award U54HD090257 by NICHD (PI: V. Gallo).
Author contributions
P.R., M.T., and K.H.-T. designed research; D.J.D., J.S., K.S., S.Y., G.L., C.L., L.W., T.S., C.Y., H.C., and Y.I.K. performed research; D.J.D., J.S., A.B., and S.Y. analyzed data; D.J.D., M.T., and K.H.-T. designed and performed machine learning analysis; and D.J.D., J.S., R.S., M.O., P.R., M.T., and K.H.-T. wrote the paper.
Competing interests
The authors declare no competing interest.
Footnotes
Reviewers: D.C., Michigan State University; and F.A.M., State University of New York Upstate Medical University.
Contributor Information
Pasko Rakic, Email: pasko.rakic@yale.edu.
Masaaki Torii, Email: mtorii@childrensnational.org.
Kazue Hashimoto-Torii, Email: khtorii@childrensnational.org.
Data, Materials, and Software Availability
Python code is available upon request. RNA sequencing data are available as GSE202254 (88) at GEO.
References
- 1. Levitt P., Reinoso B., Jones L., The critical impact of early cellular environment on neuronal development. Prev. Med. 27, 180–183 (1998).
- 2. Shonkoff J. P., Boyce W. T., Levitt P., Martinez F. D., McEwen B., Leveraging the biology of adversity and resilience to transform pediatric practice. Pediatrics 147, e20193845 (2021).
- 3. Hadders-Algra M., Early diagnostics and early intervention in neurodevelopmental disorders-age-dependent challenges and opportunities. J. Clin. Med. 10, 861 (2021).
- 4. Popova S., Lange S., Probst C., Gmel G., Rehm J., Estimation of national, regional, and global prevalence of alcohol use during pregnancy and fetal alcohol syndrome: A systematic review and meta-analysis. Lancet Glob. Health 5, e290–e299 (2017).
- 5. International Association of Diabetes and Pregnancy Study Groups Consensus Panel et al., International Association of Diabetes and Pregnancy Study Groups recommendations on the diagnosis and classification of hyperglycemia in pregnancy. Diabetes Care 33, 676–682 (2010).
- 6. Popova S., et al., Comorbidity of fetal alcohol spectrum disorder: A systematic review and meta-analysis. Lancet 387, 978–987 (2016).
- 7. Hoyme H. E., et al., Updated clinical guidelines for diagnosing fetal alcohol spectrum disorders. Pediatrics 138, e20154256 (2016).
- 8. Sowell K. D., et al., Altered maternal plasma fatty acid composition by alcohol consumption and smoking during pregnancy and associations with fetal alcohol spectrum disorders. J. Am. Coll. Nutr. 39, 249–260 (2020).
- 9. Balaraman S., et al., Plasma miRNA profiles in pregnant women predict infant outcomes following prenatal alcohol exposure. PLoS ONE 11, e0165081 (2016).
- 10. Tseng A. M., et al., Maternal circulating miRNAs that predict infant FASD outcomes influence placental maturation. Life Sci. Alliance 2, e201800252 (2019).
- 11. Camprubi Robles M., et al., Maternal diabetes and cognitive performance in the offspring: A systematic review and meta-analysis. PLoS One 10, e0142583 (2015).
- 12. Kinney B. A., Rabe M. B., Jensen R. A., Steger R. W., Maternal hyperglycemia leads to gender-dependent deficits in learning and memory in offspring. Exp. Biol. Med. (Maywood) 228, 152–159 (2003).
- 13. Zhang Y., Qian J., Gu C., Yang Y., Alternative splicing and cancer: A systematic review. Signal Transduct. Target. Ther. 6, 78 (2021).
- 14. Lee E.-G., et al., Redefining transcriptional regulation of the APOE gene and its association with Alzheimer’s disease. PLoS ONE 15, e0227667 (2020).
- 15. Liu H., et al., N-terminal alternative splicing of GluN1 regulates the maturation of excitatory synapses and seizure susceptibility. Proc. Natl. Acad. Sci. U.S.A. 116, 21207–21212 (2019).
- 16. Weyn-Vanhentenryck S. M., et al., Precise temporal regulation of alternative splicing during neural development. Nat. Commun. 9, 2189 (2018).
- 17. Baralle F. E., Giudice J., Alternative splicing as a regulator of development and tissue identity. Nat. Rev. Mol. Cell Biol. 18, 437–451 (2017).
- 18. Kawasawa Y. I., et al., Genome-wide profiling of differentially spliced mRNAs in human fetal cortical tissue exposed to alcohol. Alcohol 62, 1–9 (2017).
- 19. Binder A. M., LaRocca J., Lesseur C., Marsit C. J., Michels K. B., Epigenome-wide and transcriptome-wide analyses reveal gestational diabetes is associated with alterations in the human leukocyte antigen complex. Clin. Epigenetics 7, 79 (2015).
- 20. Kulski J. K., “Next-generation sequencing—An overview of the history, tools, and ‘omic’ applications” in Next Generation Sequencing—Advances, Applications and Challenges, Kulski J. K., Ed. (InTech, 2016), 10.5772/61964.
- 21. Grapov D., Fahrmann J., Wanichthanarak K., Khoomrung S., Rise of deep learning for genomic, proteomic, and metabolomic data integration in precision medicine. OMICS 22, 630–636 (2018).
- 22. Skrede O.-J., et al., Deep learning for prediction of colorectal cancer outcome: A discovery and validation study. Lancet 395, 350–360 (2020).
- 23. Kaji D. A., et al., An attention based deep learning model of clinical events in the intensive care unit. PLoS ONE 14, e0211057 (2019).
- 24. Lundberg S. M., Lee S.-I., A unified approach to interpreting model predictions. arXiv [Preprint] (2017). 10.48550/arXiv.1705.07874 (Accessed 18 April 2021).
- 25. Lundberg S. M., et al., From local explanations to global understanding with explainable AI for trees. Nat. Mach. Intell. 2, 56–67 (2020).
- 26. Lundberg S. M., et al., Explainable machine-learning predictions for the prevention of hypoxaemia during surgery. Nat. Biomed. Eng. 2, 749–760 (2018).
- 27. Shapley L. S., 17. A Value for n-Person Games (Princeton University Press, 2016).
- 28. Furman B. L., Streptozotocin-induced diabetic models in mice and rats. Curr. Protoc. 1, e78 (2021).
- 29. Mohammad S., et al., Kcnn2 blockade reverses learning deficits in a mouse model of fetal alcohol spectrum disorders. Nat. Neurosci. 23, 533–543 (2020).
- 30. Le Moullec N., et al., Sexual dimorphism in the association between gestational diabetes mellitus and overweight in offspring at 5–7 years: The OBEGEST cohort study. PLoS ONE 13, e0195531 (2018).
- 31. Ali S., Kerns K. A., Mulligan B. P., Olson H. C., Astley S. J., An investigation of intra-individual variability in children with fetal alcohol spectrum disorder (FASD). Child Neuropsychol. 24, 617–637 (2018).
- 32. Draxler D. F., Madondo M. T., Hanafi G., Plebanski M., Medcalf R. L., A flowcytometric analysis to efficiently quantify multiple innate immune cells and T cell subsets in human blood. Cytometry A 91, 336–350 (2017).
- 33. Kim D., Paggi J. M., Park C., Bennett C., Salzberg S. L., Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat. Biotechnol. 37, 907–915 (2019).
- 34. Anders S., Pyl P. T., Huber W., HTSeq—A Python framework to work with high-throughput sequencing data. Bioinformatics 31, 166–169 (2015).
- 35. Jiménez-Jacinto V., Sanchez-Flores A., Vega-Alvarado L., Integrative differential expression analysis for multiple experiments (IDEAMEX): A web server tool for integrated RNA-Seq data analysis. Front. Genet. 10, 279 (2019).
- 36. Shen S., et al., rMATS: Robust and flexible detection of differential alternative splicing from replicate RNA-Seq data. Proc. Natl. Acad. Sci. U.S.A. 111, E5593–E5601 (2014).
- 37. Li Y. I., et al., Annotation-free quantification of RNA splicing using LeafCutter. Nat. Genet. 50, 151–158 (2018).
- 38. LeCun Y., Bengio Y., Hinton G., Deep learning. Nature 521, 436–444 (2015).
- 39. Hochreiter S., Schmidhuber J., Long short-term memory. Neural Comput. 9, 1735–1780 (1997).
- 40. Castelvecchi D., Can we open the black box of AI? Nature 538, 20–23 (2016).
- 41. Sheu Y.-H., Illuminating the black box: Interpreting deep neural network models for psychiatric research. Front. Psychiatry 11, 551299 (2020).
- 42. Marcotcr, Lime: Explaining the predictions of any machine learning classifier. GitHub repository. https://github.com/marcotcr/lime. Deposited 19 August 2016.
- 43. Uber, Manifold: A model-agnostic visual debugging tool for machine learning. GitHub repository. https://github.com/uber/manifold. Deposited 8 January 2020.
- 44. Slundberg, SHAP: A game theoretic approach to explain the output of any machine learning model. GitHub repository. https://github.com/slundberg/shap. Deposited 28 June 2018.
- 45. Fuchs M., Paningbatan A. R., Correlation between Shapley values of rooted phylogenetic trees under the beta-splitting model. J. Math. Biol. 80, 627–653 (2020).
- 46. Becker M., Warr-Leeper G. A., Leeper H. A., Fetal alcohol syndrome: A description of oral motor, articulatory, short-term memory, grammatical, and semantic abilities. J. Commun. Disord. 23, 97–124 (1990).
- 47. van den Engel-Hoek L., de Groot I. J. M., de Swart B. J. M., Erasmus C. E., Feeding and swallowing disorders in pediatric neuromuscular diseases: An overview. J. Neuromuscul. Dis. 2, 357–369 (2015).
- 48. Noda K., Nakao S., Ishida S., Ishibashi T., Leukocyte adhesion molecules in diabetic retinopathy. J. Ophthalmol. 2012, 279037 (2012).
- 49. Balaraman S., Tingling J. D., Tsai P.-C., Miranda R. C., Dysregulation of microRNA expression and function contributes to the etiology of fetal alcohol spectrum disorders. Alcohol Res. 35, 18–24 (2013).
- 50. Feng J., Xing W., Xie L., Regulatory roles of microRNAs in diabetes. Int. J. Mol. Sci. 17, 1729 (2016).
- 51. Sallam N. A., Palmgren V. A. C., Singh R. D., John C. M., Thompson J. A., Programming of vascular dysfunction in the intrauterine milieu of diabetic pregnancies. Int. J. Mol. Sci. 19, 3665 (2018).
- 52. Ribeiro M., et al., Meningeal γδ T cell-derived IL-17 controls synaptic plasticity and short-term memory. Sci. Immunol. 4, eaay5199 (2019).
- 53. DeepMind, AlphaFold: Open source code for AlphaFold. GitHub repository. https://github.com/google-deepmind/alphafold. Deposited 16 July 2021.
- 54. Jumper J., et al., Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
- 55. Li Z., Jaroszewski L., Iyer M., Sedova M., Godzik A., FATCAT 2.0: towards a better understanding of the structural diversity of proteins. Nucleic Acids Res. 48, W60–W64 (2020).
- 56. Isken O., Maquat L. E., Quality control of eukaryotic mRNA: Safeguarding cells from abnormal mRNA function. Genes Dev. 21, 1833–1856 (2007).
- 57. Hogan D. J., Riordan D. P., Gerber A. P., Herschlag D., Brown P. O., Diverse RNA-binding proteins interact with functionally related sets of RNAs, suggesting an extensive regulatory system. PLoS Biol. 6, e255 (2008).
- 58. Dassi E., Handshakes and fights: The regulatory interplay of RNA-binding proteins. Front. Mol. Biosci. 4, 67 (2017).
- 59. Paz I., Kosti I., Ares M., Cline M., Mandel-Gutfreund Y., RBPmap: A web server for mapping binding sites of RNA-binding proteins. Nucleic Acids Res. 42, W361–W367 (2014).
- 60. Ardila D., et al., End-to-end lung cancer screening with three-dimensional deep learning on low-dose chest computed tomography. Nat. Med. 25, 954–961 (2019).
- 61. Pham T.-H., Qiu Y., Zeng J., Xie L., Zhang P., A deep learning framework for high-throughput mechanism-driven phenotype compound screening and its application to COVID-19 drug repurposing. Nat. Mach. Intell. 3, 247–257 (2021).
- 62. Hayashi Y., “Black box nature of deep learning for digital pathology: Beyond quantitative to qualitative algorithmic performances, Lecture notes in computer science” in Artificial Intelligence and Machine Learning for Digital Pathology: State-of-the-Art and Future Challenges, Holzinger A., Goebel R., Mengel M., Müller H., Eds. (Springer International Publishing, 2020), pp. 95–101.
- 63. Ribeiro M. T., Singh S., Guestrin C., “‘Why should I trust you?’: Explaining the predictions of any classifier” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (ACM Press, 2016), pp. 1135–1144.
- 64. Gate D., et al., Clonally expanded CD8 T cells patrol the cerebrospinal fluid in Alzheimer’s disease. Nature 577, 399–404 (2020).
- 65. Dantzer R., Neuroimmune interactions: From the brain to the immune system and vice versa. Physiol. Rev. 98, 477–504 (2018).
- 66. Lombardo M. V., et al., Maternal immune activation dysregulation of the fetal brain transcriptome and relevance to the pathophysiology of autism spectrum disorder. Mol. Psychiatry 23, 1001–1013 (2018).
- 67. Smith A. M., Dwoskin L. P., Pauly J. R., Early exposure to nicotine during critical periods of brain development: Mechanisms and consequences. J. Pediatr. Biochem. 1, 125–141 (2010).
- 68. Thompson B. L., Levitt P., Stanwood G. D., Prenatal exposure to drugs: Effects on brain development and implications for policy and education. Nat. Rev. Neurosci. 10, 303–312 (2009).
- 69. Hashimoto-Torii K., et al., Roles of heat shock factor 1 in neuronal response to fetal environmental risks and its relevance to brain disorders. Neuron 82, 560–572 (2014).
- 70. Dutta D. J., Hashimoto-Torii K., Torii M., “Role of heat shock factor 1 in neural development and disorders” in Heat Shock Proteins (Springer, The Netherlands, 2020).
- 71. Boulanger-Bertolus J., Pancaro C., Mashour G. A., Increasing role of maternal immune activation in neurodevelopmental disorders. Front. Behav. Neurosci. 12, 230 (2018).
- 72. Torii M., et al., Detection of vulnerable neurons damaged by environmental insults in utero. Proc. Natl. Acad. Sci. U.S.A. 114, 2367–2372 (2017).
- 73. Ishii S., et al., Variations in brain defects result from cellular mosaicism in the activation of heat shock signalling. Nat. Commun. 8, 15157 (2017).
- 74. Birchler J. A., Veitia R. A., Gene balance hypothesis: Connecting issues of dosage sensitivity across biological disciplines. Proc. Natl. Acad. Sci. U.S.A. 109, 14746–14753 (2012).
- 75. Bai Y., Srivastava S. K., Chang J. H., Manley J. L., Tong L., Structural basis for dimerization and activity of human PAPD1, a noncanonical poly(A) polymerase. Mol. Cell 41, 311–320 (2011).
- 76. Elliott P. R., et al., Structural variability of the ubiquitin specific protease DUSP-UBL double domains. FEBS Lett. 585, 3385–3390 (2011).
- 77. Lamber E. P., et al., Regulation of the transcription factor Ets-1 by DNA-mediated homo-dimerization. EMBO J. 27, 2006–2017 (2008).
- 78. Wittmann J. G., et al., Structures of the human orotidine-5’-monophosphate decarboxylase support a covalent mechanism and provide a framework for drug design. Structure 16, 82–92 (2008).
- 79. Bosanac I., et al., Ubiquitin binding to A20 ZnF4 is required for modulation of NF-κB signaling. Mol. Cell 40, 548–557 (2010).
- 80. Mestas J., Hughes C. C., Of mice and not men: differences between mouse and human immunology. J. Immunol. 172, 2731–2738 (2004).
- 81. Wenger A. M., et al., Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome. Nat. Biotechnol. 37, 1155–1162 (2019).
- 82. Robinson M. D., McCarthy D. J., Smyth G. K., edgeR: A Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).
- 83. Keras-team, Keras: Deep Learning for humans. GitHub repository. https://github.com/Keras-team/keras. Deposited 24 May 2017.
- 84. TensorFlow, TensorFlow. GitHub repository. https://github.com/tensorflow/tensorflow. Deposited 16 February 2016.
- 85. Van Rossum G., Drake F. L. Jr., The python language reference manual (Network Theory Ltd., 2021). ISBN: 978-1-906966-14-0.
- 86. Google Colaboratory, Google Colaboratory. GitHub repository. https://github.com/googlecolab/colabtools. Deposited 10 November 2017.
- 87. NumPy, NumPy: The fundamental package for scientific computing with Python. GitHub repository. https://github.com/numpy/numpy. Deposited 16 November 2016.
- 88. Sasaki J., et al., Peripheral RNA biomarkers for neurocognitive impairment shared by prenatal alcohol exposure (PAE) and offspring from mother with diabetes (OMD) in mice. GEO. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE202254. Deposited 4 May 2022.