Author manuscript; available in PMC: 2017 Sep 6.
Published in final edited form as: J Alzheimers Dis. 2017;55(3):1223–1233. doi: 10.3233/JAD-160835

MicroRNAs in Human Cerebrospinal Fluid as Biomarkers for Alzheimer’s Disease

Theresa A Lusardi a,1, Jay I Phillips b,1, Jack T Wiedrick c,1, Christina A Harrington d, Babett Lind e, Jodi A Lapidus c, Joseph F Quinn e,f, Julie A Saugstad b,*
PMCID: PMC5587208  NIHMSID: NIHMS899492  PMID: 27814298

Abstract

Background

Currently available biomarkers of Alzheimer’s disease (AD) include cerebrospinal fluid (CSF) protein analysis and amyloid PET imaging, each of which has limitations. The discovery of extracellular microRNAs (miRNAs) in CSF raises the possibility that miRNA may serve as novel biomarkers of AD.

Objective

Investigate miRNAs in CSF obtained from living donors as biomarkers for AD.

Methods

We profiled miRNAs in CSF from 50 AD patients and 49 controls using TaqMan® arrays. Replicate studies performed on a subset of 32 of the original CSF samples verified 20 high confidence miRNAs. Stringent data analysis using a four-step statistical selection process including log-rank and receiver operating characteristic (ROC) tests, followed by random forest tests, identified 16 additional miRNAs that discriminate AD from controls. Multimarker modeling evaluated linear combinations of these miRNAs via best-subsets logistic regression, and computed area under the ROC (AUC) curve ascertained classification performance. The influence of ApoE genotype on miRNA biomarker performance was also evaluated.

Results

We discovered 36 miRNAs that discriminate AD from control CSF. Twenty of these were retested in replicate studies, which verified their differential expression between AD and controls. Stringent statistical analysis also identified these 20 miRNAs, along with 16 additional miRNA candidates. Top-performing linear combinations of 3 and 4 miRNAs have AUC of 0.80–0.82. Adding ApoE genotype to the model improved performance: the AUC for 3 miRNAs plus ApoE4 rises to 0.84.

Conclusions

CSF miRNAs can discriminate AD from controls. Combining miRNAs improves sensitivity and specificity of biomarker performance, and adding ApoE genotype improves classification.

Keywords: Alzheimer’s disease, ApoE, biomarker, cerebrospinal fluid, microRNA, PCR

INTRODUCTION

Alzheimer’s disease (AD) is the most common form of dementia and the sixth leading cause of death in the United States, with a cost currently estimated at $100 billion per year. The development of preventive strategies is urgent, but depends on the development of biomarkers to identify “preclinical” cases and to monitor mechanisms of disease.

Cerebrospinal fluid (CSF) serves as an excellent candidate for biomarker studies in neuropathological brain diseases such as AD [1]. The most extensively studied CSF protein biomarkers of AD include Aβ42, tau, and phospho-tau, which have received intense study because they are associated with the “classical” AD pathology of plaques and tangles. While these CSF biomarkers have some diagnostic utility, they have not performed well as outcome measures in clinical trials [2]. Perhaps more importantly, the bias toward plaque and tangle pathology has limited the ability to identify other pathogenic mechanisms, which might be plausible with a more open-ended approach to CSF biomarker development. The existence of extracellular RNAs (exRNAs) in biofluids represents a fertile molecular landscape from which diagnostic and prognostic biomarkers may be accessed, characterized, and exploited. Accordingly, the identification of exRNAs in CSF provides an opportunity to define important biomarkers that characterize and differentiate CNS diseases [3]. MiRNAs are the most well studied exRNA species as they are found in virtually all biofluids including CSF, saliva, plasma, serum, and urine [4]. Thus, their persistence and altered abundance in the extracellular fluid or CSF may play a role in the spreading of the disease throughout the brain, as occurs in AD.

MiRNAs are members of the non-protein-coding family of RNAs that serve as regulators of post-transcriptional gene expression [5]. MiRNAs are small, ~20–24 nucleotide, genomically encoded RNAs that regulate gene expression by base-pairing to sequences in the mRNA [6, 7]. Importantly, miRNAs are stable in circulating fluids, presumably because they are contained within ribonucleoprotein complexes or membrane vesicles that afford them protection against degradation.

We explored the hypothesis that CSF miRNA species distinguish patients with AD from healthy age-matched controls. In contrast to prior reports of miRNAs in AD CSF [8–16], we used 1) CSF from living (rather than postmortem) donors, 2) a large sample size from a well-characterized repository of AD and control patients, and 3) a rigorous analytic approach examining the diagnostic utility of combinations of miRNAs. These proof-of-principle studies demonstrate the internal validity of CSF-derived miRNA for distinguishing between confirmed AD and neurologically normal donors, and identify a subset of miRNA that are most likely indicators of AD. Confirmatory studies of these candidate miRNA biomarkers in an independent population are currently in progress.

MATERIALS AND METHODS

Subjects

The Institutional Review Board of Oregon Health & Science University (OHSU) approved all of the donor procedures (IRB 6845); all subjects provided written informed consent. Participants underwent detailed clinical and laboratory evaluation, including cognitive testing and interview with a collateral historian. The samples were banked at the Oregon Alzheimer’s Disease Center (OADC), the core program of the Layton Aging & Alzheimer’s Disease Center, supported by the National Institute on Aging.

CSF sample collection

The OADC has standardized its CSF collection protocol to correspond to that used in other AD centers engaged in CSF biomarker research [17]. All CSF examinations are done in the morning under fasting conditions, in the lateral decubitus position, with a 24-gauge Sprotte spinal needle that minimizes the discomfort of the procedure and reduces the incidence of lumbar puncture headache. The first 3 to 5 mL of CSF collected is sent to the clinical lab for cell count, and determination of glucose and total protein levels. Next, serial syringes of 5 mL CSF are collected, mixed, and transferred to polypropylene tubes in 0.5 mL aliquots, and the tubes are numbered to account for any gradient effect in subsequent experiments. All CSF tubes have an OADC subject number, but no other identifying information, in order to facilitate later collaborations and sharing of samples. Immediately after aliquots are transferred, the CSF is frozen on dry ice and stored in a −80°C freezer.

Apolipoprotein E genotyping

DNA was isolated from blood and amplified by Touchdown PCR with 250 µM dNTPs, 1 Unit Taq DNA Polymerase, buffer, 1X Q-solution (Qiagen), and 0.5 µM forward and reverse primers for ApoE exon 4 (E4 allele). A product size of 443 nucleotides identified on a 1% agarose 1X TBE gel was excised, cleaned with ExoSAP-IT reagent (Affymetrix), and sequenced on a model 377 automated fluorescence sequencer (Applied Biosystems). Chromatogram traces were examined and nucleotide sequences determined using FinchTV (Geospiza, Inc.).

RNA isolation and amplification

Total RNA was extracted from 0.5 mL of each CSF sample using the mirVana™ PARIS™ RNA and Native Protein Purification Kit (ThermoFisher Scientific), modified to include two aqueous extractions during the organic phase extraction steps in order to maximize RNA recovery [13]. RNA samples were concentrated using the RNA Clean & Concentrator™-5 Kit (Zymo Research) and eluted in 9 µL RNAse/DNase-free water. RNA concentrations were initially measured on a set of test CSF samples using the Quant-iT™ RiboGreen® RNA Assay Kit (ThermoFisher Scientific). The average concentration for the test group was 133 pg/µL with a total RNA recovery of approximately 2 ng/mL CSF. The concentrated RNA samples were converted to cDNA and pre-amplified using a T-100 thermocycler (Bio-Rad, Hercules, CA) with Megaplex™ RT Primers, Human Pool Set v3.0, as per the manufacturer’s protocol (“Megaplex Pools For microRNA Expression Analysis”), following instructions for detection of miRNA with pre-amplification. The pre-amplification products were diluted into a prescribed final volume of 100 µL and stored at −20°C until ready for the final detection PCR reactions. Real-time PCR reactions followed the manufacturer’s protocol, using 18 µL of diluted pre-amplification product.
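As a rough consistency check, the reported concentration and yield can be related by simple arithmetic. The sketch below assumes that the whole 9 µL eluate derives from a single 0.5 mL CSF aliquot and that the 133 pg/µL average applies to that eluate; the function name is hypothetical and is not part of the published protocol.

```python
def rna_yield_ng_per_ml(conc_pg_per_ul, elution_ul, csf_ml):
    """Total RNA recovered per mL of CSF, in ng/mL (illustrative only)."""
    total_pg = conc_pg_per_ul * elution_ul   # pg of RNA in the eluate
    return total_pg / 1000.0 / csf_ml        # pg -> ng, then per mL of CSF

# 133 pg/uL in 9 uL recovered from 0.5 mL CSF (values from the text above)
print(round(rna_yield_ng_per_ml(133, 9, 0.5), 2))
```

Under these assumptions the result is a little over 2 ng/mL, in line with the "approximately 2 ng/mL" stated above.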

MicroRNA qRT-PCR arrays

The expression profile of miRNAs in CSF samples was determined using the TaqMan® Array Human MicroRNA A + B Cards Set v3.0 (ThermoFisher). The set consists of two cards (A and B), each containing 384 TaqMan® MicroRNA Assays, including potential endogenous control RNAs (U6 snRNA, RNU44, RNU48). On these arrays, each RNA probe has a single technical replicate (n = 1). The qRT-PCR amplifications and data acquisition were performed on a QuantStudio™ 12K Flex Real-Time PCR System (ThermoFisher) using automated baseline and threshold values determined by Expression Suite™ software v1.2.2 (ThermoFisher). Amplification data were imported into Expression Suite and threshold cycle (Ct) values were calculated. The expression data were analyzed as separate batches according to the TaqMan® Low Density Array (TLDA) card lot number. The Ct value for each well was reported along with the amplification score (AmpScore) metric. Quality control filtering of the Ct values consisted of the following steps: i) PCR products were considered below the detection threshold and censored if Ct ≥ 36 or if Expression Suite reported the Ct value as “Undetected”; ii) individual assays were excluded if AmpScore < 0.9; iii) all other Ct values were accepted as reported by Expression Suite. RNU44 and RNU48 detection was inconsistent across cards and samples, so these controls were excluded from further analysis. U6 snRNA detection was consistent within cards, across cards, and across samples, making it a strong candidate for quality assessment and normalization strategies (see below).
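The three quality-control rules above can be sketched as a small per-well classifier. This is a minimal illustration, not the Expression Suite logic: the function and constant names are hypothetical, and applying rule (i) before rule (ii) is an assumption, since the text does not state which takes precedence when both apply.

```python
CT_CENSOR_THRESHOLD = 36.0   # Ct at or above this is treated as below detection
AMPSCORE_MIN = 0.9           # assays with a lower AmpScore are excluded

def qc_filter(ct, amp_score):
    """Classify one well as 'censored', 'excluded', or 'accepted'.

    ct is a number or the string "Undetected"; amp_score is a number.
    Returns (status, value); value is None unless the well is accepted.
    Assumption: rule (i) is checked before rule (ii).
    """
    if ct == "Undetected" or float(ct) >= CT_CENSOR_THRESHOLD:
        return ("censored", None)          # rule (i): below detection limit
    if amp_score < AMPSCORE_MIN:
        return ("excluded", None)          # rule (ii): poor amplification
    return ("accepted", float(ct))         # rule (iii): keep Ct as reported

# Hypothetical wells: (Ct, AmpScore)
for well in [(28.4, 1.2), (36.0, 1.1), ("Undetected", 0.0), (31.0, 0.7)]:
    print(well, qc_filter(*well))
```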

All qRT-PCR data and donor-specific metadata have been submitted to the exRNA Atlas [18] repository and will be available following publication of results. This PCR dataset with title “TLDA miRNA Screen for Candidate AD Biomarkers in CSF from Living Donors” and accession number “EXR-JSAUG1UH2001-AN” can be accessed via the Datasets link in the menubar at http://exrnaatlas.org/.

Statistical analysis of miRNA expression in control and AD CSF

Initial analysis of the TaqMan® data was performed using a default statistical analysis and routine procedures for qRT-PCR data processing. To improve identification of robust and reproducible miRNA candidates, we developed a stringent analytical approach which involved passing the complete miRNA data set through multiple quality and performance filters in order to home in on candidate AD miRNAs that: i) were frequently expressed within the dynamic range of qRT-PCR measurements for CSF samples; ii) had Ct values that trend strongly with the AD condition; iii) were able to discriminate AD from control at the patient level. Note that built into this analytic approach were rigorous internal validation procedures designed to more accurately estimate classification performance of the top miRNA candidates prior to measuring them in independent cohorts.

Test (i) was addressed by thorough data filtering using Expression Suite error flags. After exclusion of technical failures, we further excluded miRNAs with low (<~20%) detection rates in samples in order to restrict focus to targets with adequate biomarker potential. Although the amplification experiments for this study were allowed to run to 40 cycles before termination, empirical data-quality metrics indicated that virtually no amplifications occurred at or after 36 cycles, so we censored the raw Ct values at a threshold of 36; thus, censored values may be viewed as miRNA expression below a lower limit of detection. Ct values were then normalized relative to the U6 snRNA reference and transformed onto an expression scale by taking the multiplicative inverse (i.e., 1 divided by the value) and scaling by the maximum value so that higher values indicate relatively greater quantities of miRNA expression; censored values were assigned a value of zero on this expression scale. 151 candidate miRNAs passed the initial quality filters and proceeded on to further analysis.
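The detection-rate filter and the expression-scale transform described above can be sketched as follows. The sketch starts from Ct values that have already been normalized to U6 and are positive, because the text does not specify how the normalization is computed. It also assumes that "the maximum value" means the largest inverse among the values passed in. All function names are hypothetical.

```python
def detection_rate(values):
    """Fraction of samples with a detected (non-censored) value."""
    return sum(v is not None for v in values) / len(values)

def to_expression_scale(normalized):
    """Map normalized Ct values onto a 0..1 expression scale.

    normalized: positive floats, with None marking censored wells.
    Each detected value is inverted (1 / value) and divided by the largest
    inverse, so a lower Ct gives a higher expression; censored wells get 0.
    """
    inverses = [1.0 / v for v in normalized if v is not None]
    if not inverses:
        return [0.0 for _ in normalized]
    top = max(inverses)
    return [0.0 if v is None else (1.0 / v) / top for v in normalized]

# Hypothetical normalized Ct values for one miRNA across five samples
values = [2.0, 4.0, None, 2.5, None]
if detection_rate(values) >= 0.20:        # keep miRNAs detected in >= ~20%
    print(to_expression_scale(values))
```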

To address tests (ii) and (iii), we performed a four-step battery of inferential test procedures. In order to mitigate inferential bias due to testing method, we imposed the rule that a candidate miRNA would not be carried forward unless it had passed at least two of the four steps. The testing was performed using Stata Release 14 (http://www.stata.com/) and R Version 3.2.3 (https://cran.r-project.org/) software.
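The carry-forward rule amounts to counting the steps passed and comparing that count to two. A minimal sketch, with hypothetical step names and pass/fail flags used purely for illustration:

```python
def multitest_score(passed):
    """Number of selection steps (out of four) that a miRNA passed."""
    return sum(1 for ok in passed.values() if ok)

def carried_forward(passed, min_steps=2):
    """Apply the rule: keep a candidate only if it passed >= min_steps."""
    return multitest_score(passed) >= min_steps

# Hypothetical outcome of the four steps for one candidate
candidate = {"log_rank": True, "roc": False, "forest_1": True, "forest_2": False}
print(multitest_score(candidate), carried_forward(candidate))
```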

The first two steps employed log-rank testing of normalized Ct values and predictions from univariate logistic regression on the transformed expression values using models that accounted for censoring. The goal in these phases was to select candidate miRNAs according to their strength of association with AD status, and selection cutoffs were chosen to achieve an average false discovery rate (FDR) of 30% under correction for multiple comparisons. This FDR heuristic was chosen to reflect our anticipation that selected candidates would, on average, be about twice as likely to be associated as not to be associated with AD status.
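The text does not name the multiple-comparison procedure used to reach the 30% FDR. As an illustration only, the sketch below uses the Benjamini-Hochberg step-up procedure, which is a common choice but an assumption here, and also shows the arithmetic behind the "about twice as likely" heuristic: at an FDR of 0.30, the expected odds of a true to a false selection are 0.70 / 0.30, or roughly 2.3 to 1. The p-values are hypothetical.

```python
def benjamini_hochberg(p_values, q=0.30):
    """Indices of hypotheses selected at FDR level q (BH step-up)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= q * rank / m:
            cutoff_rank = rank            # largest rank meeting the BH bound
    return sorted(order[:cutoff_rank])

def odds_true_to_false(fdr):
    """Expected odds that a selected candidate is a true association."""
    return (1.0 - fdr) / fdr

print(benjamini_hochberg([0.01, 0.04, 0.20, 0.90], q=0.30))
print(round(odds_true_to_false(0.30), 2))
```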

The last two steps of testing were designed to find candidate miRNA markers whose expression profiles would provide a reliably clear signal of AD or non-AD status across the observed range of total miRNA expression contexts; the goal in these phases was to gauge the relative predictive importance of each miRNA while accounting for interactions with the other miRNAs. To test the signal strength, we repeatedly sampled random groups of miRNAs and assessed their ability to classify the CSF samples into “AD” and “non-AD” bins. If the classification does worse when a certain miRNA is excluded from the mix, that miRNA is probably contributing some signal, and is viewed as relatively more important than other miRNAs. For the testing we used two different random-forest classifiers [19]—one based on conditional inference trees [20] and the other based on CHAID trees [21]—in order to make the selection process more robust to the choice of underlying classification method; selection in both phases was based on an appropriate method-specific ‘importance score’ that could be calculated for each classifier.
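The idea of scoring a marker by how much classification suffers when it is left out can be sketched in a few lines. To stay short and dependency-free, this illustration swaps in a nearest-centroid rule for the random forests of conditional inference and CHAID trees used in the study, and it scores accuracy on the same data it was fit to. It shows the drop-one logic only and does not reproduce the published importance scores; all names and data are hypothetical.

```python
import random

def centroid_accuracy(rows, labels, markers):
    """Resubstitution accuracy of a nearest-centroid rule on given markers."""
    if not markers:                       # no markers: guess the larger class
        return max(labels.count(0), labels.count(1)) / len(labels)
    centroids = {}
    for cls in (0, 1):
        members = [r for r, y in zip(rows, labels) if y == cls]
        centroids[cls] = {m: sum(r[m] for r in members) / len(members)
                          for m in markers}
    correct = 0
    for r, y in zip(rows, labels):
        dist = {c: sum((r[m] - centroids[c][m]) ** 2 for m in markers)
                for c in (0, 1)}
        guess = 0 if dist[0] <= dist[1] else 1
        correct += (guess == y)
    return correct / len(labels)

def drop_one_importance(rows, labels, markers, group_size=3, rounds=200, seed=0):
    """Mean loss in accuracy when a marker is removed from random groups."""
    rng = random.Random(seed)
    gains = {m: [] for m in markers}
    for _ in range(rounds):
        group = rng.sample(markers, group_size)
        full = centroid_accuracy(rows, labels, group)
        for m in group:
            rest = [g for g in group if g != m]
            gains[m].append(full - centroid_accuracy(rows, labels, rest))
    return {m: (sum(g) / len(g) if g else 0.0) for m, g in gains.items()}

# Toy data: marker "a" separates the classes, the others carry no signal
rows = [{"a": 0.0, "b": 5.0, "c": 5.0, "d": 5.0} for _ in range(4)] + \
       [{"a": 1.0, "b": 5.0, "c": 5.0, "d": 5.0} for _ in range(4)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
print(drop_one_importance(rows, labels, ["a", "b", "c", "d"]))
```

In this toy setting only the informative marker receives a positive score, which is the behavior the importance ranking relies on.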

Upon completion of these tests, we identified 36 candidate miRNAs that passed at least two of the four steps, and this set was carried forward for the final evaluation step: multimarker classification performance, which was assessed by evaluating linear combinations of all possible subsets of the top miRNA biomarker candidates via best-subsets logistic regression, employing multiple imputation of missing values. Estimated inclusion probabilities for each miRNA and computed area under the ROC curves (AUC) [22] fit from each model were calculated and used to rank targets.
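Two pieces of this step lend themselves to a short illustration: the AUC itself, computed here in its rank-based form (the probability that a randomly chosen AD sample scores higher than a randomly chosen control), and the number of candidate subsets a best-subsets search must consider. The logistic regression fit and the multiple imputation are not reproduced; the scores are hypothetical model outputs and the function names are illustrative.

```python
from math import comb

def auc(case_scores, control_scores):
    """Area under the empirical ROC curve (ties count one half)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

def n_subsets(n_markers, size):
    """Number of distinct marker subsets of a given size."""
    return comb(n_markers, size)

# Hypothetical predicted probabilities from one fitted multimarker model
ad_scores = [0.9, 0.7, 0.6, 0.4]
control_scores = [0.5, 0.3, 0.2, 0.1]
print(auc(ad_scores, control_scores))
print(n_subsets(36, 3), n_subsets(36, 4))
```

With 36 candidates there are 7,140 three-marker and 58,905 four-marker subsets, which is why ranking by inclusion probability and AUC is useful for summarizing the search.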

Prospective assessment of candidate miRNAs

To get a more accurate estimate of the prospective performance of the top 36 candidates from our rigorous analytic pipeline, we implemented a repeated leave-one-out (LOO) procedure [23] to internally validate our findings. This is a standard method for projecting what the performance of a classifier would be in a new cohort; it attempts to ascertain how well the 36 candidates would replicate in an independent cohort. We applied this procedure in a repeated manner to the m = 30 multiple-imputation (MI) datasets previously generated for multimarker classification performance evaluation in the final stage of selection testing (see above). Since each MI dataset is a simulation of the total set of miRNA expression values on these 36 candidates that could have been obtained from a cohort drawn from the same population as ours, each can be viewed as a small perturbation of our observed data. On each of these MI datasets we performed a LOO nonparametric nearest-neighbor discriminant analysis (using the three nearest neighbors by Canberra distance on Mahalanobis-normalized expression values [23]), where each subject is classified as AD or non-AD in turn by leaving that subject out of the calculation of the discrimination rule and then using the resulting rule, based on the kept-in subjects only, to classify the left-out subject. By applying the standard combining procedure for MI estimation [24] to the LOO sensitivity and specificity estimates calculated for each MI dataset, we obtain an improved estimate of the projected sensitivity and specificity of our set of 36 miRNA biomarker candidates if applied to a new cohort. This provides an internally validated estimate of the discriminating ability of this set of markers.
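The leave-one-out nearest-neighbor step can be sketched for a single dataset as below. This is a simplified illustration: it assumes the expression values have already been normalized (the Mahalanobis scaling is omitted), it handles one dataset rather than 30 imputed ones, and it uses a plain majority vote among the three nearest neighbors. Function names and toy values are hypothetical.

```python
def canberra(u, v):
    """Canberra distance; terms with a zero denominator contribute 0."""
    total = 0.0
    for a, b in zip(u, v):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total

def loo_knn(rows, labels, k=3):
    """Leave-one-out k-nearest-neighbor predictions by majority vote."""
    preds = []
    for i, row in enumerate(rows):
        others = [(canberra(row, r), labels[j])
                  for j, r in enumerate(rows) if j != i]   # subject i left out
        others.sort(key=lambda t: t[0])
        votes = [lab for _, lab in others[:k]]
        preds.append(1 if sum(votes) * 2 > len(votes) else 0)
    return preds

def sensitivity_specificity(labels, preds):
    """Sensitivity and specificity, with 1 = AD and 0 = non-AD."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    return tp / labels.count(1), tn / labels.count(0)

# Toy data: two markers per subject, four controls (0) and four AD (1)
rows = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.2], [1.0, 1.1],
        [5.0, 6.0], [5.5, 6.2], [6.0, 5.8], [5.2, 6.1]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
print(sensitivity_specificity(labels, loo_knn(rows, labels)))
```

Repeating this on each imputed dataset and combining the per-dataset estimates would then give the pooled sensitivity and specificity described above.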

Replication study of candidate miRNAs

Reproducibility of phenotype associations under similar measurement technology platforms is a necessary feature for any proposed biomarker. In order to assess the reproducibility of our initial findings on retest, we generated custom TaqMan® miRNA assays to examine the expression of candidate AD miRNAs in a subset of 16 AD and 16 control CSF samples that were analyzed in the original array assays. This subset of CSF samples was chosen based on the availability of duplicate sample aliquots from the OADC bank, and the ability to include 32 CSF samples (representing 1/3 of the original CSF samples) for these test/retest reliability studies. The custom arrays included assays for 46 candidate miRNAs (original analysis), 6 miRNAs not changed between AD and control CSF, 4 miRNAs not detected in CSF, and U6 snRNA. All miRNA assays were plated in triplicate to allow for analysis of n = 3 technical replicates/RNA in the replication study. Only 20 of the 36 candidate miRNAs selected by the four-step statistical evaluation were included in this study, for reasons we discuss in more detail below (see Replication studies).

RESULTS

Donor characteristics

All donor characteristics for the 50 AD and 49 control subjects considered in this phase of the biomarker study, including sex, age at CSF collection, Mini-Mental State Examination (MMSE), and ApoE genotype, are shown in Table 1. The 49 control subjects were community volunteers in good health, with MMSE [25] scores between 28 and 30 (29.22 ± 1.28), Clinical Dementia Rating scores of 0, and no evidence or history of cognitive or functional decline [26, 27]. The 50 AD participants were recruited from OHSU Neurology clinics and diagnosed with probable AD according to ADRDA-NINDS criteria [28, 29] at a consensus conference of OADC clinicians, with mean MMSE scores between 18 and 19 (18.28 ± 6.40), and Clinical Dementia Rating scores of 1–2. ApoE genotyping was done for the 90/99 CSF donors who also donated blood samples (44 control and 46 AD). The genotype revealed that of the control group, 54.55% had zero, 38.64% had one, and 6.82% had two ApoE4 alleles. In the AD group 17.39% had zero, 58.70% had one, and 23.91% had two ApoE4 alleles, so that the E4 genotype was over-represented in AD as expected [30].

Table 1.

CSF donor characteristics

                       Control          AD               All
Subjects (n)
  Male                 23               30               53
  Female               26               20               46
  Total                49               50               99
Age at LP              Mean ± SD        Mean ± SD        Mean ± SD
  Male                 69.61 ± 9.82     68.70 ± 7.49     69.09 ± 8.5
  Female               66.15 ± 8.94     70.75 ± 6.94     68.15 ± 8.37
  Total                67.78 ± 9.43     69.52 ± 7.27     68.66 ± 8.41
MMSE at LP             Mean ± SD        Mean ± SD        Difference ± SDboot
  Male                 28.83 ± 1.64     18.23 ± 6.71     10.59 ± 6.78
  Female               29.58 ± 0.70     18.35 ± 6.08     11.23 ± 5.96
  Total                29.22 ± 1.28     18.28 ± 6.40     10.94 ± 6.46
ApoE4 alleles (0–2)    n (%)            n (%)            Ratio AD:Control
  0                    24 (54.55%)      8 (17.39%)       0.33
  1                    17 (38.64%)      27 (58.70%)      1.59
  2                    3 (6.82%)        11 (23.91%)      3.67
  Total*               44 (100%)        46 (100%)

* Not including those subjects where genotyping data were not available (n = 5 Control, n = 4 AD).

SDboot, standard deviation of differences between randomly selected Control and AD individuals based on 100,000 bootstrap samples.
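The ApoE4 rows of Table 1 can be recomputed from the donor counts. The sketch below reproduces the percentages and suggests that the "Ratio AD: Control" column is a ratio of donor counts rather than of percentages, which is an inference from the numbers and not something the table states. The function names are hypothetical.

```python
control = {0: 24, 1: 17, 2: 3}    # ApoE4 allele count -> control donors
ad = {0: 8, 1: 27, 2: 11}         # ApoE4 allele count -> AD donors

def percentages(counts):
    """Share of genotyped donors in each allele-count category, in percent."""
    total = sum(counts.values())
    return {k: round(100.0 * n / total, 2) for k, n in counts.items()}

def count_ratio(ad_counts, control_counts):
    """AD:Control ratio of donor counts per allele-count category."""
    return {k: round(ad_counts[k] / control_counts[k], 2)
            for k in control_counts}

print(percentages(control))
print(percentages(ad))
print(count_ratio(ad, control))
```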

Identification of candidate miRNAs that discriminate AD from control CSF

The TaqMan® miRNA arrays were assessed for uniformity of miRNA expression within the AD group, and within the control group, by correlating the expression levels for each individual CSF sample assay with the average expression for the entire sample group. The miRNA expression from one sample was compared to the miRNA expression for all samples in the same group to show the correlation between one individual assay and the average of the miRNA assay across each group for control and AD (data not shown). These data showed that miRNA expression is consistent within each group (control versus AD) and that regulation of miRNAs is not random in individual samples.

For a miRNA to be included in the statistical analysis, it had to be present in at least 20% of all 99 CSF donor samples, which resulted in 151 miRNA candidates chosen for analysis. The discriminating abilities of the 151 miRNAs that passed our PCR quality control filtering were assessed using four statistical testing steps. The log-rank and ROC AUC tests assess the group differences and discriminating ability of each miRNA individually, while the two random forest tests (CART and CHAID) evaluate the classification performance of each miRNA as a member of a group. The p-values for the log-rank tests, AUC values from the ROC tests, and the relative importance ranks from the CART and CHAID tests are presented in Table 2. We summarized these results in a “Multitest Score” for each miRNA that indicates the number of steps where the miRNA was selected as a strong performer. MiRNAs that performed well in at least two statistical tests (Multitest Score = 2, 3, or 4) were considered top biomarker candidates. We also include for each candidate the fraction of samples in control and AD groups with detectable levels of that miRNA (% Detected), along with the log fold change and 95% confidence interval (CI) of the detected values in AD versus control samples.

Table 2.

A. Quantitative and statistical measures of 20 replicated, high confidence miRNAs

MIMAT      ID (hsa)        Log-Rank  ROC   CART  CHAID  Multitest  % Detected   Log2    95% CI
Accession                  p-value   AUC   Rank  Rank   Score      Cont.  AD    Fold
                                                                                Change
Increased in AD
0000732    miR-378a-3p*    <0.01     0.61  42    41     2          96     100   4.06    0.60, 7.52
0005881    miR-1291        0.02      0.59  6     38     2          14     34    1.24    0.39, 2.10
0003265    miR-597-5p      0.06      0.60  19    105    2          98     100   0.93    −0.10, 1.95
Decreased in AD
0000435    miR-143-3p      <0.01     0.69  7     4      4          59     29    −2.15   −3.67, −0.64
0000434    miR-142-3p      <0.01     0.72  1     54     3          96     80    −2.01   −3.03, −0.99
0000752    miR-328-3p      0.01      0.67  3     25     4          80     62    −1.72   −3.10, −0.34
0004614    miR-193a-5p     <0.01     0.66  10    23     4          56     26    −1.52   −2.75, −0.28
0000088    miR-30a-3p      0.06      0.66  30    16     3          81     56    −1.49   −3.10, 0.12
0000074    miR-19b-3p      0.03      0.62  17    55     3          100    93    −1.34   −2.37, −0.31
0000245    miR-30d-5p      0.04      0.63  22    94     2          77     62    −1.28   −2.34, −0.23
0004692    miR-340-5p      <0.01     0.67  2     2      4          43     9     −1.26   −2.18, −0.35
0000431    miR-140-5p      0.02      0.65  4     14     4          65     43    −1.24   −2.44, −0.03
0000423    miR-125b-5p     0.02      0.65  13    66     3          88     72    −1.17   −2.25, −0.09
0000083    miR-26b-5p      0.06      0.63  34    26     3          63     50    −1.17   −2.11, −0.22
0000069    miR-16-5p       0.12      0.64  14    9      3          96     96    −1.00   −1.96, −0.03
0000449    miR-146a-5p     0.10      0.62  11    57     2          100    94    −0.90   −1.78, −0.02
0000086    miR-29a-3p      0.07      0.63  12    7      4          98     100   −0.77   −1.55, 0.02
0000461    miR-195-5p      0.05      0.61  106   150    2          91     76    −0.75   −1.69, 0.18
0000417    miR-15b-5p      0.05      0.61  15    1      4          40     20    −0.71   −1.92, 0.50
0000280    miR-223-3p      <0.01     0.62  39    5      3          100    100   −0.66   −2.00, 0.69

B. Quantitative and statistical measures of 16 candidate miRNAs not retested

MIMAT      ID (hsa)        Log-Rank  ROC   CART  CHAID  Multitest  % Detected   Log2    95% CI
Accession                  p-value   AUC   Rank  Rank   Score      Cont.  AD    Fold
                                                                                Change
Increased in AD
0002843    miR-520b        0.03      0.71  8     32     4          47     71    1.86    0.51, 3.20
0003271    miR-603         0.06      0.62  31    31     3          36     56    1.22    0.01, 2.42
0002811    miR-202-3p      0.07      0.61  52    67     2          97     100   1.13    0.01, 2.26
0002837    miR-519b-3p     0.09      0.67  51    11     3          100    95    1.09    −0.04, 2.22
0002174    miR-484         0.10      0.58  138   3      2          100    100   0.48    −0.26, 1.22
Decreased in AD
0003249    miR-584-5p      0.03      0.73  67    22     2          91     54    −4.12   −7.42, −0.81
0004601    miR-145-3p      0.03      0.67  9     61     3          55     34    −1.71   −2.95, −0.48
0000080    miR-24-3p       <0.01     0.63  5     78     3          100    89    −1.54   −2.52, −0.56
0002888    miR-532-5p      0.03      0.60  16    48     3          32     14    −1.21   −2.26, −0.17
0004502    miR-28-3p       0.09      0.63  37    28     2          70     52    −0.99   −2.56, 0.59
0002809    miR-146b-5p     0.07      0.59  18    43     2          86     74    −0.99   −2.10, 0.13
0000419    miR-27b-3p      0.09      0.63  21    92     2          48     32    −0.85   −1.92, 0.23
0000760    miR-331-3p      0.08      0.67  144   52     2          57     32    −0.83   −2.17, 0.50
0000437    miR-145-5p      0.06      0.64  20    15     3          63     39    −0.42   −1.81, 0.97
0003258    miR-590-5p      0.06      0.61  36    36     2          47     27    −0.33   −1.20, 0.53
0000710    miR-365a-3p     0.04      0.63  24    51     2          55     32    −0.29   −1.10, 0.53

* Target sequence matches nucleotides 1–21 (of 22 nt) of hsa-miR-378a-3p.

Linear combinations of miRNAs increase sensitivity/specificity

We evaluated linear combinations of subsets of the 36 selected miRNAs via best-subsets logistic regression (employing multiple imputation of values missing due to technical dropout), and computed AUC to assess multimarker classification performance. Top-performing linear combinations of 3 and 4 miRNAs attain AUC of 0.80–0.82 (Fig. 1). Addition of ApoE genotype status to miRNAs further increased sensitivity/specificity as an AD biomarker (Fig. 2). ApoE alone has an AUC of 0.73 in this group of subjects, and 3 miRNAs had an AUC of 0.80. However, 3 miRNAs and ApoE genotype increase the AUC to 0.84. Thus, a combination of new (miRNA) and a reference (ApoE) biomarker increased classification performance.

Fig. 1.


Linear combinations of candidate AD miRNAs increase sensitivity and specificity. 2-marker models show AUC of 0.75 (dashes and dots), while 3-marker models increased AUC to 0.80 (short dashes), and 4-marker models increased AUC to 0.82 (long dashes).

Fig. 2.


Addition of ApoE genotype to miRNAs increases sensitivity/specificity. ApoE alone has an AUC of 0.73 in this group of subjects, and 3 miRNAs had an AUC of 0.80. However, 3 miRNAs and ApoE increase the AUC to 0.84.

Prospective discriminating ability of miRNA candidates

Estimates of classification performance obtained in discovery datasets are often too optimistic about real-world classification performance on new test cases because models trained on a particular set of data sometimes tune themselves too finely to that data, capturing idiosyncratic features that may not be reflective of typical features in the target population. A standard means of improving the accuracy of prospective classification assessment is a leave-one-out (LOO) classification procedure, whereby one data point is left out of the model estimation and the resulting model, based on the rest of the data, is used to guess the class of the left-out point. This procedure is repeated for each data point in turn, and the overall accuracy of the guesses is taken as the estimate of what the discriminating ability would be in a new dataset. Having generated m = 30 multiple-imputation (MI) datasets in a previous analysis step, we were able to repeat the LOO procedure in all 30 MI datasets and average the estimates using the standard combining rules to obtain an estimate that is not only validated internally but also robust to likely patterns of missing values because the variance across the MI datasets simulates the variance caused by having missing values. The MI-averaged estimates of LOO performance are 60% sensitivity at 73% specificity. We plotted this projected point of sensitivity and specificity for AD discrimination by our set of 36 candidates, along with a 95% joint confidence region, in Fig. 3. We also extrapolated a plausible ROC curve passing through that point, assuming a typical monotonic shape for the curve, and calculated the projected AUC (0.72) that would be implied by the curve. From Fig. 3, we can see that not only is our set of candidates projected to discriminate better than blind guessing with good confidence, but also that the performance is at least on par with ApoE, and possibly much better.
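How an AUC can be projected from a single sensitivity/specificity point depends on the curve shape assumed, which the text describes only as "a typical monotonic shape". The sketch below shows two illustrative choices that are assumptions rather than the authors' method: straight segments through the point, and an equal-variance binormal curve through the point. Function names are hypothetical.

```python
from statistics import NormalDist

def auc_piecewise_linear(sens, spec):
    """Area under straight segments (0,0) -> (1 - spec, sens) -> (1,1)."""
    return (sens + spec) / 2.0

def auc_binormal_equal_variance(sens, spec):
    """AUC of an equal-variance binormal ROC curve through one point.

    The separation d' between the two score distributions follows from the
    operating point, and the AUC is Phi(d' / sqrt(2)).
    """
    z = NormalDist()
    d_prime = z.inv_cdf(sens) + z.inv_cdf(spec)
    return z.cdf(d_prime / 2 ** 0.5)

# MI-averaged LOO estimates reported above: 60% sensitivity, 73% specificity
print(round(auc_piecewise_linear(0.60, 0.73), 3))
print(round(auc_binormal_equal_variance(0.60, 0.73), 3))
```

The straight-segment value (0.665) is a lower bound for any concave curve through the point, and the smooth binormal curve gives roughly 0.73, so a projected AUC near 0.72 is plausible under assumptions of this kind.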

Fig. 3.


Internally validated projection of candidate miRNA biomarker performance. The results of applying leave-one-out cross-validation to m = 30 multiple-imputation (MI) datasets simulated from the cohort. We applied nonparametric nearest-neighbor discriminant analysis (using three nearest neighbors by Canberra distance on Mahalanobis-scaled miRNA expression values) to each MI dataset and plotted the resulting sensitivity and specificity along with their associated pointwise confidence. The MI estimates were averaged using the standard combining rules to obtain the best unbiased MI estimate and its corresponding 95% joint confidence region (the shaded rectangle). The projected ROC curve and AUC were obtained by extrapolating a typical monotonic curve shape passing through the best MI estimate and the two endpoints.

Replication studies

Our original (what we call a “default”) analysis of the 754 miRNAs was based on simple statistical criteria such as significance values, and led to a liberal selection of 46 candidates (only partially overlapping with our current set of 36). During this process we recognized the need for more robust procedures, including better data filtering and more stringent selection criteria, but also needed to move forward with verification testing while our more rigorous pipeline was being developed. Taking the 46 identifications as our best candidates thus far, we attempted to reproduce the observed associations with AD using custom TaqMan® arrays to measure differential expression of miRNAs in AD versus control CSF. The original set of 99 CSF samples had been run on TLDAs with n = 1 technical replicate/RNA probe, while the custom TLDAs were designed with n = 3 technical replicates/RNA probe using 32 of the 99 CSF samples (16 AD versus 16 control). These 32 CSF samples were identical sample aliquots from the original 99 CSF samples in the OADC bank, and represent 1/3 of the original CSF samples. We found that 20 of the original 46 candidates held up as expected and 26 did not (Fig. 4). After our more rigorous analytical pipeline, which included improved raw data processing and quality control as well as a stronger battery of statistical selection procedures, was completed, we used it to reanalyze the entire miRNA set (Fig. 4). The new pipeline selected 36 miRNAs that included the same 20 that had replicated and, interestingly, rejected all of the failed 26, mostly on data quality grounds. This demonstrates the need for strong data quality filtering procedures in biomarker selection pipelines. 
Figure 4 shows a flowchart of the original analysis and the stringent statistical reanalysis, together with the overlap between the findings of the two, which strongly supports the inclusion of the top candidates in the validation phase: all 20 of the targets from the original analysis that held up in the verification study would also have been selected for replication by the stringent statistical reanalysis.

Fig. 4.


Flow for statistical analysis of candidate miRNA biomarkers. Of the 754 miRNAs in the discovery arrays, 46 miRNAs were selected as potential AD biomarkers based on the original statistical analysis of array data. 46 candidates were retested in replication studies using custom arrays; 20 miRNAs replicated, while 26 did not. During the replication studies, we developed and used a stringent statistical analysis pipeline of the array data that also identified the same top 20 replicated miRNAs, as well as 16 additional candidate miRNAs. These 36 miRNAs are being evaluated in a new, independent cohort of AD and control CSF donors for validation studies.

DISCUSSION

These studies report the results of miRNA expression profiling performed using TaqMan® Low Density Arrays to examine miRNAs in AD and control CSF obtained from living donors. Once we established the protocol for these CSF studies, we intentionally locked in the parameters for the discovery and replication studies reported herein, as well as for the current validation studies being done on a new and larger cohort of patients. The decision to maintain one technological approach throughout all phases of these studies was based on a publication showing diminished reproducibility of expression results between platforms and/or vendors when using the same starting RNA sample [31]. This finding was also consistent with personal experience in our and others’ laboratories. Thus, to maximize consistency of results within this study, we chose to use the same vendor and miRNA probes throughout the entire study, from discovery through validation.

In this study, we have identified 36 miRNAs that discriminate AD from control CSF, supporting the hypothesis that there are measurable differences in miRNAs in AD patients that can be exploited for use as clinical biomarkers. We also show that linear combinations of these CSF miRNAs result in increased sensitivity and specificity as biomarkers for AD, and that combinations of 2–4 miRNAs can distinguish AD from control with an accuracy of 75–82%. In addition, we examined the effect of adding ApoE4 genotype to the miRNA combinations. The ApoE4 genotype represents a substantial risk factor for the development of AD, and although it is not specific or sensitive enough to serve as a diagnostic test for AD, it does serve as a reference biomarker. When added to the miRNAs identified in this study, ApoE4 genotype increased classification performance: the AUC for 3 miRNAs is 0.80, whereas 3 miRNAs plus ApoE4 increased the AUC to 0.84. These findings indicate that CSF miRNAs are not redundant with ApoE4, but instead offer additional diagnostic information. It will also be important to investigate whether these CSF miRNAs are complementary to, or more informative than, existing CSF protein biomarkers for AD (tau, Aβ). Protein biomarker values are not available for the CSF samples committed to the present study, so future studies in independent samples should include that information.
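To illustrate how the AUC of a linear combination of markers is evaluated, and how adding a binary covariate such as ApoE4 status can change it, here is a minimal sketch. Everything in it is an assumption made for illustration: the six toy subjects, the feature values, and the unit weights are invented, and the weights are not fitted by best-subsets logistic regression as in the study. The AUC is computed as the proportion of AD/control pairs in which the AD subject has the higher score, with ties counted as one half.

```python
def auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def linear_score(features, weights):
    """Linear combination of marker values (the linear predictor of a logistic model)."""
    return sum(w * x for w, x in zip(weights, features))


# Toy data (assumption): each tuple is (marker_a, marker_b, apoe4_carrier).
ad_subjects = [(1.0, 0.5, 1), (0.2, 0.1, 1), (0.8, 0.9, 0)]
control_subjects = [(0.3, 0.2, 0), (0.6, 0.4, 0), (0.1, 0.0, 1)]

# Hypothetical weights; a weight of 0.0 drops ApoE4 from the model.
weights_markers_only = (1.0, 1.0, 0.0)
weights_with_apoe4 = (1.0, 1.0, 1.0)

auc_markers_only = auc(
    [linear_score(s, weights_markers_only) for s in ad_subjects],
    [linear_score(s, weights_markers_only) for s in control_subjects],
)
auc_with_apoe4 = auc(
    [linear_score(s, weights_with_apoe4) for s in ad_subjects],
    [linear_score(s, weights_with_apoe4) for s in control_subjects],
)
print(auc_markers_only, auc_with_apoe4)
```

Because the AUC depends only on the ranking of scores, the linear predictor can be used directly without applying the logistic transform. The toy values say nothing about the magnitude of the effect in real CSF data.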

Our results show consistency with previous studies that have identified differences in miRNA expression between AD and controls. Kiko et al. measured 6 candidate miRNAs in 10 AD and 10 control subjects. These studies revealed a significant decrease in miR-34a and miR-146a in plasma and CSF, a significant decrease in miR-125b in CSF, and a significant increase in miR-29a and miR-29b, in AD versus controls [14]. Similarly, we show decreased expression of miR-146a-5p and miR-125b-5p, and increased expression of miR-29a-3p in CSF, in AD versus control (Table 2). Galimberti et al. showed decreased expression of miR-125b in serum from 22 AD patients, relative to 18 controls, and miR-125b could distinguish AD from control with an accuracy of 82% [32]. Muller et al. also showed significantly decreased expression of miR-146a in CSF of AD patients [33]. Consistent with these studies, our data show decreased expression of miR-125b-5p and miR-146a in AD versus control (Table 2). In addition, Denk et al. examined CSF in 22 AD and 28 controls and found 6 miRNAs considered as reliable and 9 miRNAs considered as informative as biomarkers for AD [34]. The TaqMan® arrays included 10 of these (miR-100, miR-103, miR-146a, miR-219, miR-296, miR-335, miR-375, miR-505#, miR-766). Denk et al. found increased expression of miR-146a in AD; however, we found decreased expression of miR-146a-5p in AD versus control. Our miR-146a change in AD CSF is consistent with other studies [14, 32, 33], despite challenges of reproducing the finding of decreased miR-146a in a multi-center trial [35]. These findings underscore the fact that the reproducibility of miRNA studies remains a challenge. Several factors can influence study outcomes, including the normalization method, which affects the directionality of expression, consistency of naming miRNA species (miR-29a versus miR-29a-3p), insufficiently sized cohorts, and blood contamination in CSF [35].
Other factors include differences in reagents and miRNA probe design between manufacturers, differences in baseline miRNA levels due to age or other secondary factors, or disease progression.

In our study, the replication of 46 candidate miRNAs from our original, default analysis of the TaqMan® array data was only partially successful, yet 20 miRNAs held up under both the original analysis and the stringent statistical analysis. We believe there are three primary reasons for the failure of 26 of the candidates from the original analysis. The first was our liberal inclusion of potential miRNA candidates beyond those with high significance in the original study; thus not all miRNAs included in the replication had a high degree of confidence for discriminating AD from control. The second was differences in the patterns of censoring, which reflect uncertainty about the optimal censoring threshold, a matter that we are addressing in our current studies. The third was sample dropout caused by inconsistencies between data obtained from separate reagent lots; we have addressed this in our current studies by implementing tighter control over reagent lots to minimize batch variation over the course of a relatively large experimental study. Nonetheless, the top 20 miRNAs identified in the original analysis held up as candidate miRNAs for the validation studies under the stringent reanalysis, and the reanalysis added 16 candidate miRNAs that were not detected in the initial analysis. While confirmatory studies in an independent population are necessary, the statistical models used to evaluate the internal validity of these preliminary findings are sufficiently robust to justify reporting at this stage.

In summary, our studies provide evidence that miRNA expression in CSF from living donors can be used to discriminate AD patients from control subjects. We are currently performing external validation studies to evaluate these 36 candidate miRNAs for their ability to distinguish AD from control on a new and larger set of CSF samples from living donors. These samples were obtained from the UCSD Shiley-Marcos Alzheimer’s Disease Research Center for RNA expression studies, and are accompanied by additional clinical information such as Aβ/tau measurements that will be incorporated into the data analysis of miRNAs differentially expressed in AD and control CSF. Successful external validation will allow us to confirm a robust set of markers that can then be explored for functional relationships to AD. These markers will also be evaluated as biomarkers for early detection of AD in an additional independent cohort of CSF samples that includes patients with MCI, and in longitudinal studies of individual patients to determine their efficacy as prognostic indicators of AD.

Acknowledgments

These studies were supported by NIH grants NCATS UH2TR000903 (JAS, JFQ), NIA AG08017 (JFQ), and NCATS UL1TR000128 (Oregon Clinical and Translational Research Institute). We thank Dr. Steven Rekow for assistance in interpreting the ExpressionSuite data, and Dr. Shawn Westaway for the ApoE genotyping data. We also thank the OHSU Gene Profiling Shared Resource for technical advice and access to core instrumentation.

Footnotes

Authors’ disclosures available online (http://j-alz.com/manuscript-disclosures/16-0835).

References

  • 1. Ghidoni R, Benussi L, Paterlini A, Albertini V, Binetti G, Emanuele E. Cerebrospinal fluid biomarkers for Alzheimer’s disease: The present and the future. Neurodegener Dis. 2011;8:413–420. doi: 10.1159/000327756.
  • 2. Quinn JF. Biomarkers for Alzheimer’s disease: Showing the way or leading us astray? J Alzheimers Dis. 2013;33(Suppl 1):S371–S376. doi: 10.3233/JAD-2012-129022.
  • 3. Rao P, Benito E, Fischer A. MicroRNAs as biomarkers for CNS disease. Front Mol Neurosci. 2013;6:39. doi: 10.3389/fnmol.2013.00039.
  • 4. Quinn JF, Patel T, Wong D, Das S, Freedman JE, Laurent LC, Carter BS, Hochberg F, Van Keuren-Jensen K, Huentelman M, Spetzler R, Kalani MY, Arango J, Adelson PD, Weiner HL, Gandhi R, Goilav B, Putterman C, Saugstad JA. Extracellular RNAs: Development as biomarkers of human disease. J Extracell Vesicles. 2015;4:27495. doi: 10.3402/jev.v4.27495.
  • 5. Chekulaeva M, Filipowicz W. Mechanisms of miRNA-mediated post-transcriptional regulation in animal cells. Curr Opin Cell Biol. 2009;21:452–460. doi: 10.1016/j.ceb.2009.04.009.
  • 6. Bartel DP. MicroRNAs: Target recognition and regulatory functions. Cell. 2009;136:215–233. doi: 10.1016/j.cell.2009.01.002.
  • 7. Baek D, Villén J, Shin C, Camargo FD, Gygi SP, Bartel DP. The impact of microRNAs on protein output. Nature. 2008;455:64–71. doi: 10.1038/nature07242.
  • 8. Cogswell JP, Ward J, Taylor IA, Waters M, Shi Y, Cannon B, Kelnar K, Kemppainen J, Brown D, Chen C, Prinjha RK, Richardson JC, Saunders AM, Roses AD, Richards CA. Identification of miRNA changes in Alzheimer’s disease brain and CSF yields putative biomarkers and insights into disease pathways. J Alzheimers Dis. 2008;14:27–41. doi: 10.3233/jad-2008-14103.
  • 9. Hebert SS, Horre K, Nicolai L, Papadopoulou AS, Mandemakers W, Silahtaroglu AN, Kauppinen S, Delacourte A, De Strooper B. Loss of microRNA cluster miR-29a/b-1 in sporadic Alzheimer’s disease correlates with increased BACE1/beta-secretase expression. Proc Natl Acad Sci U S A. 2008;105:6415–6420. doi: 10.1073/pnas.0710263105.
  • 10. Geekiyanage H, Chan C. MicroRNA-137/181c regulates serine palmitoyltransferase and in turn amyloid beta, novel targets in sporadic Alzheimer’s disease. J Neurosci. 2011;31:14820–14830. doi: 10.1523/JNEUROSCI.3883-11.2011.
  • 11. Geekiyanage H, Jicha GA, Nelson PT, Chan C. Blood serum miRNA: Non-invasive biomarkers for Alzheimer’s disease. Exp Neurol. 2012;235:491–496. doi: 10.1016/j.expneurol.2011.11.026.
  • 12. Kumar P, Dezso Z, MacKenzie C, Oestreicher J, Agoulnik S, Byrne M, Bernier F, Yanagimachi M, Aoshima K, Oda Y. Circulating miRNA biomarkers for Alzheimer’s disease. PLoS One. 2013;8:e69807. doi: 10.1371/journal.pone.0069807.
  • 13. Burgos K, Malenica I, Metpally R, Courtright A, Rakela B, Beach T, Shill H, Adler C, Sabbagh M, Villa S, Tembe W, Craig D, Van Keuren-Jensen K. Profiles of extracellular miRNA in cerebrospinal fluid and serum from patients with Alzheimer’s and Parkinson’s diseases correlate with disease status and features of pathology. PLoS One. 2014;9:e94839. doi: 10.1371/journal.pone.0094839.
  • 14. Kiko T, Nakagawa K, Tsuduki T, Furukawa K, Arai H, Miyazawa T. MicroRNAs in plasma and cerebrospinal fluid as potential markers for Alzheimer’s disease. J Alzheimers Dis. 2014;39:253–259. doi: 10.3233/JAD-130932.
  • 15. Dorval V, Nelson PT, Hebert SS. Circulating microRNAs in Alzheimer’s disease: The search for novel biomarkers. Front Mol Neurosci. 2013;6:24. doi: 10.3389/fnmol.2013.00024.
  • 16. Kim DH, Yeo SH, Park JM, Choi JY, Lee TH, Park SY, Ock MS, Eo J, Kim HS, Cha HJ. Genetic markers for diagnosis and pathogenesis of Alzheimer’s disease. Gene. 2014;545:185–193. doi: 10.1016/j.gene.2014.05.031.
  • 17. Shi M, Bradner J, Hancock AM, Chung KA, Quinn JF, Peskind ER, Galasko D, Jankovic J, Zabetian CP, Kim HM, Leverenz JB, Montine TJ, Ginghina C, Kang UJ, Cain KC, Wang Y, Aasly J, Goldstein D, Zhang J. Cerebrospinal fluid biomarkers for Parkinson disease diagnosis and progression. Ann Neurol. 2011;69:570–580. doi: 10.1002/ana.22311.
  • 18. Subramanian SL, Kitchen RR, Alexander R, Carter BS, Cheung KH, Laurent LC, Pico A, Roberts LR, Roth ME, Rozowsky JS, Su AI, Gerstein MB, Milosavljevic A. Integration of extracellular RNA profiling data using metadata, biomedical ontologies and Linked Data technologies. J Extracell Vesicles. 2015;4:27497. doi: 10.3402/jev.v4.27497.
  • 19. Breiman L. Random Forests. Machine Learning. 2001;45:5–32.
  • 20. Hothorn T, Hornik K, Zeileis A. Unbiased recursive partitioning: A conditional inference framework. J Comput Graph Stat. 2006;15:651–674.
  • 21. Kass GV. An exploratory technique for investigating large quantities of categorical data. J R Stat Soc Ser C Appl Stat. 1980;29:119–127.
  • 22. Pepe MS. The Statistical Evaluation of Medical Tests for Classification and Prediction. Oxford University Press; Oxford, New York: 2003. pp. 220–224.
  • 23. Hastie T, Tibshirani R, Friedman J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer; New York: 2009.
  • 24. Little R, Rubin D. Statistical Analysis with Missing Data. Wiley-Interscience; New Jersey: 2002.
  • 25. Folstein MF, Robins LN, Helzer JE. The Mini-Mental State Examination. Arch Gen Psychiatry. 1983;40:812. doi: 10.1001/archpsyc.1983.01790060110016.
  • 26. Peskind ER, Li G, Shofer J, Quinn JF, Kaye JA, Clark CM, Farlow MR, DeCarli C, Raskind MA, Schellenberg GD, Lee VM, Galasko DR. Age and apolipoprotein E*4 allele effects on cerebrospinal fluid beta-amyloid 42 in adults with normal cognition. Arch Neurol. 2006;63:936–939. doi: 10.1001/archneur.63.7.936.
  • 27. Nation DA, Edland SD, Bondi MW, Salmon DP, Delano-Wood L, Peskind ER, Quinn JF, Galasko DR. Pulse pressure is associated with Alzheimer biomarkers in cognitively normal older adults. Neurology. 2013;81:2024–2027. doi: 10.1212/01.wnl.0000436935.47657.78.
  • 28. McKhann G, Drachman D, Folstein M, Katzman R, Price D, Stadlan EM. Clinical diagnosis of Alzheimer’s disease: Report of the NINCDS-ADRDA Work Group under the auspices of Department of Health and Human Services Task Force on Alzheimer’s Disease. Neurology. 1984;34:939–944. doi: 10.1212/wnl.34.7.939.
  • 29. McKhann GM, Knopman DS, Chertkow H, Hyman BT, Jack CR, Jr, Kawas CH, Klunk WE, Koroshetz WJ, Manly JJ, Mayeux R, Mohs RC, Morris JC, Rossor MN, Scheltens P, Carrillo MC, Thies B, Weintraub S, Phelps CH. The diagnosis of dementia due to Alzheimer’s disease: Recommendations from the National Institute on Aging-Alzheimer’s Association workgroups on diagnostic guidelines for Alzheimer’s disease. Alzheimers Dement. 2011;7:263–269. doi: 10.1016/j.jalz.2011.03.005.
  • 30. Seshadri S, Drachman DA, Lippa CF. Apolipoprotein E epsilon 4 allele and the lifetime risk of Alzheimer’s disease. What physicians know, and what they should know. Arch Neurol. 1995;52:1074–1079. doi: 10.1001/archneur.1995.00540350068018.
  • 31. Git A, Dvinge H, Salmon-Divon M, Osborne M, Kutter C, Hadfield J, Bertone P, Caldas C. Systematic comparison of microarray profiling, real-time PCR, and next-generation sequencing technologies for measuring differential microRNA expression. RNA. 2010;16:991–1006. doi: 10.1261/rna.1947110.
  • 32. Galimberti D, Villa C, Fenoglio C, Serpente M, Ghezzi L, Cioffi SM, Arighi A, Fumagalli G, Scarpini E. Circulating miRNAs as potential biomarkers in Alzheimer’s disease. J Alzheimers Dis. 2014;42:1261–1267. doi: 10.3233/JAD-140756.
  • 33. Muller M, Kuiperij HB, Claassen JA, Kusters B, Verbeek MM. MicroRNAs in Alzheimer’s disease: Differential expression in hippocampus and cell-free cerebrospinal fluid. Neurobiol Aging. 2014;35:152–158. doi: 10.1016/j.neurobiolaging.2013.07.005.
  • 34. Denk J, Boelmans K, Siegismund C, Lassner D, Arlt S, Jahn H. Micro RNA profiling of CSF reveals potential biomarkers to detect Alzheimer’s disease. PLoS One. 2015;10:e0126423. doi: 10.1371/journal.pone.0126423.
  • 35. Muller M, Kuiperij HB, Versleijen AA, Chiasserini D, Farotti L, Baschieri F, Parnetti L, Struyfs H, De Roeck N, Luyckx J, Engelborghs S, Claassen JA, Verbeek MM. Validation of microRNAs in cerebrospinal fluid as biomarkers for different forms of dementia in a multicenter study. J Alzheimers Dis. 2016;52:1321–1333. doi: 10.3233/JAD-160038.
