Author manuscript; available in PMC: 2019 Oct 29.
Published in final edited form as: Alzheimers Dement. 2017 Mar 22;13(4):e1–e85. doi: 10.1016/j.jalz.2016.11.007

Table 7.

Approaches for the improvement of genetic studies

| Challenge | Approach and results | Reference |
| --- | --- | --- |
| Avoiding biased inference when using secondary phenotypes | Tested whether a standard analysis of secondary phenotypes encountered problems such as type I errors and reduced power for association testing. Although the analysis was generally valid, the authors recommend caution when analyzing these types of data. | [237] |
| Improving computational efficiency of mass univariate analyses | Presented a functional mixed-effects modeling framework to jointly analyze high-dimensional imaging data with genetic markers and clinical covariates. Tested associations of candidate genes (CR1, CD2AP, PICALM) with MRI brain regions. Detected regional clusters of voxels associated with candidate genes and different patient groups. The method was computationally efficient. | [239] |
| | Presented a fast voxelwise genome-wide association analysis framework able to search for sparse signals while controlling the family-wise error rate. When tested on ADNI data with 708 subjects, 193,275 voxels, and 501,584 SNPs, the total processing time was 203,645 seconds on a single CPU, a substantial improvement over traditional methods. | [240] |
| | Presented modifications to mass univariate analyses that apply dimensionality reduction techniques to both MRI data and genomic data, together with a new multiple testing adjustment method. Experiments suggest the procedure has more power to detect associations. | [241] |
| | Presented a method based on a generalization of partial least squares correspondence analysis that can simultaneously analyze behavioral and genetic data. | [242] |
| Selecting the most informative SNPs or quantitative features | Presented a method that uses tree-guided sparse learning to identify the most informative SNPs and MRI measures and that models the a priori hierarchical grouping structure among SNPs. Experiments suggest the method can identify informative SNPs. | [243] |
| | Adopted a generalized estimating equations methodology to test the association between single SNPs and multiple quantitative traits. Found the method general and flexible when tested on ADNI data using seven MRI-derived multivariate traits. Outperformed principal component analysis or canonical correlation analysis for dimensionality reduction. | [244] |
| | Presented a sparse projection regression modeling framework that incorporates two novel heritability ratios to simultaneously perform dimensionality reduction, response selection, estimation, and testing. | [245] |
| | Evaluated several sparse canonical correlation analysis methods that can reveal complex multi-SNP, multi-quantitative-trait associations. Suggest that the estimation of covariance structure is limited in these methods. | [246] |
| | Tested three Bayesian network supervised learning methods on whole-genome sequencing data to identify causal AD SNPs and gene-SNP interactions. Reported that Markov blanket-based methods outperformed both naïve Bayes and tree-augmented naïve Bayes methods in selecting SNPs strongly associated with AD from top-ranked susceptibility genes. | [247] |
| Developing a summary measure of associations between multiple SNPs and traits of interest | Adapted the Rasch model to compute a multimarker genetic summary score that accounts for statistical issues such as inflated false-positive rates and linkage disequilibrium. The genetic summary score can then be used for association analysis. | [202] |
| | Developed a summary score based on an asymptotically normal and consistent estimate of the parameter vector to be tested and its covariance matrix. The derived score vector extended several score-based tests to mixed-effects models. | [248] |
| Accurately calling multisample variants for whole-genome sequencing | Compared two multisample variant-calling methods for the detection of single-nucleotide variants and short insertions and deletions using whole-genome sequencing data from ADNI subjects. Found that the JOINT method, which first calls variants individually and then genotypes the variant files for all samples, outperformed the second method, REDUCE. | [249] |
| Identifying interaction effects | Combined kernel machine regression and kernel distance covariance to identify genetic markers associated with multidimensional phenotypes. Identified SNPs in FLJ16124 that exhibit pairwise interaction effects correlated with volumetric changes. | [250] |
| | Proposed a general kernel machine-based method to jointly detect genetic and nongenetic variables and their interactions. The framework consists of a genetic kernel to capture epistasis and a nongenetic kernel that can model the joint effects of multiple variables. | [189] |
| | Proposed a new Bayesian generalized low-rank regression model to characterize the association between genetic variants and brain imaging phenotypes while accounting for the impact of other covariates. Tested using the 20 most significant SNPs from ADNI and identified loci associated with brain regions. The method was more computationally efficient and less noisy because it reduced the number of parameters to be sampled and tested. | [251] |
| | Presented a method to consider joint effects of polymorphisms in a biologically defined pathway. Determines SNP-SNP interactions using a quantitative multifactor dimensionality reduction technique, infers functional relationships between selected genes, and uses gene-set enrichment analysis to determine whether the genes and functional network occur more frequently than expected by chance in biological pathways defined by Gene Ontology. | [205] |
| | Proposed a versatile likelihood ratio test to detect mean and variance heterogeneity present in loci due to biological disruption, gene-by-gene or gene-by-environment interactions, or linkage disequilibrium. | [252] |
| Visualizing genetic interaction networks | Used 3D printing to visualize a statistical epistasis network of 34 significant SNPs. Suggest that a 3D physical model may make the data easier to interpret than a digital representation. | [253] |
| Imputing common APOE SNPs missing from genome-wide genotyping arrays | Compared directly genotyped SNPs with SNPs imputed via a reference panel compiled by the 1000 Genomes Project. Reported that the imputation method is highly accurate. | [238] |
| Learning predictive models or progression profiles | Applied three feature selection methods (multiple kernel learning, high-order graph matching-based feature selection, sparse multimodal learning) to classification challenges using multidimensional imaging genomics data and biochemical markers. Found that high-order graph matching-based feature selection gave the best results. | [254] |
| | Used principal component analysis to select the most important SNPs associated with clinical diagnosis and used these data along with hippocampal surface information to predict MCI-to-AD progression. | [255] |

Abbreviations: MRI, magnetic resonance imaging; ADNI, Alzheimer’s Disease Neuroimaging Initiative; SNP, single-nucleotide polymorphism; AD, Alzheimer’s disease; MCI, mild cognitive impairment.
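
To give a sense of scale for the voxelwise genome-wide analysis cited as [240], the short Python sketch below works through the arithmetic implied by the figures quoted in the table (193,275 voxels, 501,584 SNPs, 203,645 seconds on a single CPU). The count of voxel-SNP pairs assumes a naive exhaustive scan of every pair, and the Bonferroni threshold is included only as a familiar illustration of family-wise error control. Neither is claimed to be the procedure used in [240], and all variable names are hypothetical.

```python
# Figures quoted in the table row for reference [240].
n_voxels = 193_275
n_snps = 501_584
total_seconds = 203_645  # reported single-CPU processing time

# Number of voxel-SNP pairs an exhaustive mass univariate scan would cover.
# Assumption: every voxel is tested against every SNP.
n_pairs = n_voxels * n_snps

# Reported runtime converted to more readable units.
hours = total_seconds / 3600
days = hours / 24

# Illustration only: a Bonferroni threshold at alpha = 0.05 over all pairs.
# This is an assumption for scale, not the error-control method of [240].
alpha = 0.05
bonferroni_threshold = alpha / n_pairs

print(f"voxel-SNP pairs: {n_pairs:,}")
print(f"runtime: {hours:.1f} h ({days:.2f} days)")
print(f"Bonferroni threshold: {bonferroni_threshold:.2e}")
```

Under these assumptions an exhaustive scan covers roughly 9.7 × 10^10 voxel-SNP pairs, and the reported runtime corresponds to a little over two days on one CPU.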