Abstract
Abnormal blood lipid levels are influenced by genetic and lifestyle/dietary factors. Although many genetic variants associated with blood lipid traits have been identified in Europeans, similar data in Middle Eastern populations are limited. We performed a genome-wide association study with Arab individuals (discovery cohort: 1,353; replication cohort: 1,176) from Kuwait to identify possible associations of genetic variants with high lipid levels. We used Illumina HumanOmniExpress BeadChip and candidate SNP genotyping in the discovery and replication phases, respectively. For association tests, we used genetic models that were based on additive and recessive modes of inheritance. High triglycerides (TGs) were recessively associated with six risk variants (rs1002487/RPS6KA1, rs11805972/LAD1, rs7761746/OR5V1, rs39745/CTTNBP2-LSM8, rs2934952/PGAP3, and rs9626773/RP11-191L9.4-CERK) at genome-wide significance (P ≤ 6.12E-09), and another six variants (rs10873925/ST6GALNAC5, rs4663379/SPP2-ARL4C, rs10033119/NPY1R, rs17709449/LINC00911-FLRT2, rs11654954/CDK12-NEUROD2, and rs9972882/STARD3) were associated at borderline significance (P ≤ 5.0E-08). High TG was also additively associated with rs11654954. All of the 12 identified markers are novel and are harbored in runs of homozygosity. Literature evidence supports the involvement of these gene loci in lipid-related processes. This study in an Arab population augments international efforts to identify genetic regulation of lipid traits.
Keywords: lipid and lipoprotein metabolism, triglycerides, high density lipoprotein, diabetes, obesity, genomics, genetics
Abnormal blood lipid levels are risk factors for cardiovascular disorders (1–4); moreover, lifestyle complications, such as obesity, diabetes, and metabolic syndrome, are accompanied by variations in blood lipid levels (5–7). Factors that contribute to abnormal blood lipid levels include lipid mobilization and metabolism (8), imbalances in energy homeostasis or gene-diet interactions (9, 10), hyperinsulinemia, and loss of adipocyte-derived hormonal and hypothalamic functions (11, 12). The mechanisms underlying the regulation of lipid traits are poorly understood.
Delineating the genetic basis of blood lipid levels is an important step in efforts to identify targets for new therapies to manage cholesterol (13–15). Genome-wide association studies (GWASs) have proved very useful to uncover the genetic basis of normal variation in blood lipid traits and of extreme lipid phenotypes. Thus far, GWASs have identified 157 loci associated with lipid levels (16). Twin- and family-based studies demonstrated the genetic heritability of lipid traits and reported a high estimate of variance (43–83%) from genetic factors (17–19). The global studies were mostly performed on European populations followed by African-American, East Asian, and South Asian populations (20–23). Populations from the Arabian Peninsula have not been included in these global studies and there is limited genetic association data for blood lipids in the Arab population. This deficit persists in spite of the observation that cardiovascular disorder is a leading cause of death in Kuwait, representing 46% of all deaths (24), and in spite of the rising prevalence of metabolic trait-related disorders (including dyslipidemia at 70.3%, obesity at 48.2%, and diabetes at 23.3%) in Kuwait and the Middle East.
While most of the lipid-associated genetic variants identified in Europeans are consistent and have the same direction of association in other population groups (20, 23, 25), some are heterogeneous, population specific, and not validated in all ethnicities (26, 27). Studies from the Arab region on traits associated with obesity and diabetes (28–32) revealed that not all Euro-centric markers show evidence of association and that new population-specific markers could be identified. The reasons for heterogeneity and lack of generalization include differences in effect sizes, allele frequencies, and linkage disequilibrium (LD) (25); in addition, previous studies conducted in the Arab region were probably underpowered, because of their sample sizes, to detect the established associations from other populations.
The native population of Kuwait is unique in many ways and offers a potential to detect novel genetic associations. The Arabian Peninsula is at the nexus of Africa, Europe, and Asia; and has been presumed to be part of early human migration. The population is well-structured into three groups: the KWP group that is largely of West Asian ancestry with European admixture; the KWS group that is predominantly of city-dwelling Saudi Arabian tribe ancestry; and the KWB group that is comprised of tent-dwelling nomadic Bedouins with a characteristic presence of 17% African ancestry (33). The population of Kuwait, like other states in the Peninsula, has been practicing consanguineous marriages (involving first or second cousins or relatives within the large family or the same tribe); the rate of consanguineous marriages can be as high as 54.3% (34). Consanguinity leads to increased endogamy, homozygosity, and accumulation of deleterious recessive alleles in the gene pool. The tradition of consanguinity has made the population vulnerable to a plague of recessive genetic disorders. Consanguinity and inbreeding can also play an important role in the etiology of complex disorders (with a multifactorial mode of inheritance), such as diabetes mellitus, hypertension, mental disorders, and cancer (35); autosomal recessive alleles may play a larger role in the pathogenesis of complex traits in this population (36). The population history, as delineated above, has the potential to result in some variants becoming more common and population specific; as a result, elucidation of novel associations is realistically possible in this population.
The Middle East exhibits a high prevalence of metabolic syndrome (37), diabetes (38, 39), and familial hypercholesterolemia (40); these disorders, along with the practice of consanguineous marriages, have resulted in a pattern of dyslipidemia [low HDL cholesterol and high triglycerides (TGs)] that is different from many other regions of the world (41). These lifestyle disorders have reached the proportion of public health crisis; this epidemic is primarily a consequence of recent environmental changes that have triggered the effect of preexisting susceptibility genes via gene-environment interactions (42). The post-oil era has seen rapid lifestyle transitions, such as rapid urbanization, increasingly sedentary lifestyles, and Westernized diet practices. Interactions between genetic background and rapid transitions in diet and lifestyle may accelerate the incidence of disorders such as diabetes (43). A number of SNPs and genes have been implicated in gene-environment interactions that mediate the blood lipid phenotypes (44, 45). The most common environmental factors associated with lipid phenotypes are physical activity [with HDL and total cholesterol (TC)], high fat challenge (with TG), and dietary saturated fat (with LDL) (46). Studying the Arab population, which has seen rapid lifestyle transitions, thus has the potential to identify novel preexisting susceptibility gene loci triggered by environmental factors.
As stated above, the population of Kuwait is composed of three distinct genetic groups that fall firmly into declared ancestral/tribal backgrounds; association tests, appropriately controlled for population stratification to avoid false positive findings, can provide opportunities for discovery of novel associations that are not seen in other populations. In the present study, we perform a GWAS on a cohort of native Arab individuals from Kuwait to elucidate novel genetic markers underlying the lipid traits, TG, LDL, HDL, and TC.
MATERIALS AND METHODS
The study was reviewed and approved by the International Scientific Advisory Board and Ethical Review Committee at Dasman Diabetes Institute.
Study participants
A total of 3,145 participants from Kuwait were recruited under protocols in accordance with guidelines laid in place by the institutional Ethical Review Committee. The study cohort included two groups. The first group comprised a random representative sample of adults (age >18 years) of Arab ethnicity from the six governorates of Kuwait. A stratified random sampling technique was used to select participants from the computerized register of the Public Authority of Civil Information, which maintains records of personal information for both Kuwaiti citizens and expatriates from other Arab and non-Arab countries. The second group comprised Arab individuals seeking tertiary medical care for diabetes/prediabetes-related disorders at our clinics, visitors to our nutrition programs and fitness center, visitors to our Open Day Events (that offer various diagnostic services), and visitors to our campaigns at primary health centers and blood banks in each of the six governorates of Kuwait. Such visitors interested to participate in our research programs were invited to the institute at a later date to give samples after fasting overnight. At the time of recruitment, ethnicity was confirmed through detailed questioning on parental lineage up to three generations; data on age, sex, and illness (e.g., diabetes and cardiovascular complications) were recorded. Baseline characteristics and vital signs, such as height, weight, waist circumference (WC), and blood pressure, were recorded. Signed informed consent was obtained from each of the participants. Details on medications taken by the participants for lowering lipid levels, diabetes, and hypertension were collected and used in correction procedures with the association statistics. 
A participant was regarded as affected by type 2 diabetes if the diagnosis was known to the participant (self-declaration) or, in accordance with the ADA guidelines, if fasting serum glucose was ≥7 mmol/l (126 mg/dl) or if glycated hemoglobin was ≥6.5% (48 mmol/mol). When in doubt, the recorded details on anti-diabetes medication were used; and for participants recruited through our clinics or our campaigns (constituting the above-mentioned second group), the clinician’s notes were used.
The discovery cohort was drawn largely, but not exclusively, from the second group and the replication cohort was drawn largely, but not exclusively, from the first group. Recruitment into the two groups was carried out in parallel; genotyping for the discovery phase was also performed in parallel with the recruitment process. As a result, in order to make up the numbers for genotyping batches and runs during the course of the discovery phase, randomly chosen samples from the first group had to be added to the discovery cohort (a random set of 200 samples from the first group were part of the discovery cohort). A total of 1,913 samples were considered for the discovery phase and 1,176 for the replication phase.
Sample collection and processing
Upon confirming that the participants had fasted overnight, signed consent forms and blood samples were collected. The guidelines of the institutional Ethical Review Committee were followed for the collection of blood samples and measurement of vital signs. A Gentra Puregene® kit (Qiagen, Valencia, CA) was used to extract DNA. Quantification of DNA was performed using a Quant-iT™ PicoGreen® dsDNA assay kit (Life Technologies, Grand Island, NY) and an Epoch microplate spectrophotometer; only samples with a ratio in the range of 1.8–2.1 for absorbance at 260 nm to absorbance at 280 nm were used. DNA stocks were then frozen. Prior to genotyping, frozen DNA was diluted to a working concentration of 50 ng/μl, as recommended by Illumina (San Diego, CA).
Power calculation
Two types of power calculations were performed. The first one was to estimate the sample size and its potential to detect variability in quantitative traits with 80% power and a P-value threshold of 5.0E-08; the second one was to determine the number of samples required to achieve 80% power in a two-stage (discovery and replication) design.
For the first calculation, Quanto software (http://biostats.usc.edu/Quanto.html) (47) was used and both additive and recessive models were considered. “Gene only” hypothesis was used. Power for the analysis was set at 80%, and a type 1 error at a P-value of 5.0E-08 was considered significant. The marginal genetic effect estimate (RG2) was set to range from 0.01 to 0.04 in increments of 0.001 to enable the detection of a genetic effect that explained at least 1–4% of trait variance. For each lipid trait, the population (mean ± SD) of the quantitative trait was used. By using Quanto to estimate the power of our study over a range of percent variance of the trait, it was found that the discovery cohort size had 80% power to detect associations with genetic variants (under additive or recessive models) that explained 2.9% variance of the trait. The sample sizes required to detect various RG2 values in the discovery phase were denoted as (RG2; sample size): (0.0100; 3,940), (0.0110; 3,580), (0.0120; 3,280), (0.0130; 3,026), (0.0140; 2,809), (0.0150; 2,620), (0.0160; 2,455), (0.0170; 2,310), (0.0180; 2,180), (0.0190; 2,064), (0.0200; 1,960), (0.0210; 1,866), (0.0220; 1,780), (0.0230; 1,702), (0.0240; 1,630), (0.0250; 1,564), (0.0260; 1,503), (0.0270; 1,447), (0.0280; 1,394), (0.0290; 1,346), (0.0300; 1,300), (0.0310; 1,258), and (0.0320; 1,218). The acceptable effect sizes for associations between the trait of TG and SNP markers in the discovery phase at different allele frequencies are presented in supplemental Table S1.
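The (RG2; sample size) pairs listed above can be sanity-checked with a large-sample approximation. The following Python sketch is not the Quanto implementation; it assumes a 1-df test whose noncentrality parameter is N × RG2/(1 − RG2) and a normal approximation to the test statistic, and the function name `approx_power` is hypothetical.

```python
from statistics import NormalDist


def approx_power(n, r2, alpha=5.0e-8):
    """Approximate power of a 1-df association test for a variant that
    explains a fraction r2 of the trait variance in n samples.

    Assumption (not taken from Quanto): normal approximation with
    noncentrality parameter n * r2 / (1 - r2).
    """
    nd = NormalDist()
    ncp = n * r2 / (1.0 - r2)
    z_crit = -nd.inv_cdf(alpha / 2.0)  # two-sided critical value
    root = ncp ** 0.5
    # Probability of exceeding the critical value in either tail
    return nd.cdf(root - z_crit) + nd.cdf(-root - z_crit)


# Three of the (sample size, RG2) pairs reported in the text
pairs = [(3940, 0.010), (1960, 0.020), (1346, 0.029)]
for n, r2 in pairs:
    print(f"N = {n:>5}  RG2 = {r2:.3f}  approximate power = {approx_power(n, r2):.2f}")
```

Under these assumptions, each listed pair should land close to the 80% power target, which is the consistency check intended here; exact agreement with Quanto is not expected.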
For the second calculation, QPowR software (https://msu.edu/~steibelj/JP_files/QpowR.html) was used with the following parameters: total sample size = 2,532; total heritability = 0.05; samples genotyped in the first or second stage = approximately 50% of 2,532; markers typed in the second stage = typically 0.2% of the markers typed in the first stage; and type I error rate = 5.0E-08.
Discovery phase
Whole-genome genotyping was performed on DNA samples from the discovery cohort of 1,913 participants. Illumina HumanOmniExpress arrays utilizing Infinium® HD Assay Ultra genotyping assay methods were used to genotype. Assays included whole-genome amplification, fragmentation, hybridization, staining, and imaging of the HumanOmniExpress arrays using the Illumina iSCAN system. Genotyping was performed in batches; Illumina HumanOmniExpress-12v1-Multi_H (730,525 markers) was used to genotype 1,097 participants in 20 batches, HumanOmniExpress-12v1-1_B (719,665 markers) to genotype 336 participants in five batches, and HumanOmniExpress-24v1-0_a (716,503 markers) to genotype 480 participants in six batches. The number of markers in these three versions of OmniExpress BeadChips differed from one another by at most ∼10,000; otherwise the markers were common between the three chip versions.
Replication phase
Top markers identified in the discovery phase were selected for replication in a cohort consisting of 1,176 unrelated participants. Candidate SNP genotyping was performed using TaqMan® SNP genotyping assays (Applied Biosystems, Foster City, CA) and ABI 7500 real-time PCR system (Applied Biosystems). Each PCR sample contained 10 ng of genomic DNA, 5× FIREPol® Master Mix (Solis BioDyne, Tartu, Estonia), and 1 μl of 20× TaqMan® SNP genotyping assay (Applied Biosystems). The thermal cycling conditions were 60°C for 1 min, 95°C for 15 min, and then 40 cycles of 95°C for 15 s and 60°C for 1 min. Genotypes ascribed by real-time PCR were validated by direct Sanger sequencing of the PCR products for selected cases of homozygotes and heterozygotes. Sequencing was performed using the BigDye™ Terminator v3.1 cycle sequencing FS ready reaction kit (Applied Biosystems), according to the manufacturer’s recommendations, on an Applied Biosystems 3730xl DNA analyzer (Applied Biosystems).
Meta-analysis with results from discovery and replication phases
Meta-analysis with the results of association tests (namely, allele frequencies, effective sample size, effect size, standard error, and P-values of SNP associations) from the discovery and replication cohorts was performed using the METAL tool (48).
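Because the meta-analysis results are reported as Z-scores, the combination step can be illustrated with a sample-size-weighted scheme. The Python sketch below is a simplified illustration and not the METAL code; it assumes that each study contributes a Z-score recovered from its two-sided P-value and signed by its β, with weight equal to the square root of the sample size. The function names are hypothetical, and the post-QC cohort sizes (1,353 and 1,176) are used as stand-ins for the effective sample sizes.

```python
from math import copysign, erfc, sqrt
from statistics import NormalDist

_ND = NormalDist()


def signed_z(beta, p_value):
    """Convert a two-sided P-value to a Z-score carrying the sign of beta."""
    return copysign(-_ND.inv_cdf(p_value / 2.0), beta)


def sample_size_weighted_meta(studies):
    """Combine studies given as (beta, p_value, n) tuples.

    Returns (Z, two-sided P) with weights w_i = sqrt(n_i), i.e.
    Z = sum(w_i * z_i) / sqrt(sum(w_i ** 2)).
    """
    numerator = sum(sqrt(n) * signed_z(beta, p) for beta, p, n in studies)
    denominator = sqrt(sum(n for _, _, n in studies))
    z = numerator / denominator
    p = erfc(abs(z) / sqrt(2.0))  # erfc stays accurate in the extreme tail
    return z, p


# rs11805972 under the regular correction: discovery and replication rows of Table 2
z, p = sample_size_weighted_meta([
    (2.481, 8.55e-11, 1353),
    (1.893, 4.03e-08, 1176),
])
print(f"Z = {z:.3f}, P = {p:.2e}")
```

For this marker the sketch should give a Z-score close to the tabulated meta-analysis value of 8.485; small differences are expected because the per-marker effective sample sizes are not given in the text.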
Quality control and statistical analysis
Raw intensity data were pooled from all samples from all batches, and genotype calling was performed using the GenCall algorithm provided in the GenomeStudio software.
A series of quality metric thresholds was applied to derive a high-quality set of SNPs and samples.
Samples with call rates >95% were selected. SNPs with call rates of <98%, low intensities (AB R mean ≤0.25), poor cluster separation (Cluster Sep <0.3), heterozygote clusters too close to homozygotes (AB T mean ≤0.2 or ≥0.8), excess heterozygotes (Het Excess ≥0.2), and fewer than expected heterozygotes (Het Excess ≤−0.3) were removed. Sex estimations were performed using GenomeStudio software, and samples with sex mismatches were removed. Duplicated samples were also removed. Strand designations were corrected to the forward strand and REF/ALT designations were corrected using the design files for HumanOmniExpress BeadChip (available from Illumina). In addition, samples and markers that failed the thresholds for missingness per individual (--mind 0.1), minor allele frequency (--maf 0.01), missingness per marker (--geno 0.1), and Hardy-Weinberg equilibrium (HWE; P <1.0E-06) were removed using PLINK (49).
Relatedness among the participants was calculated using the “--genome” feature in PLINK with PI_HAT >0.125 (i.e., third-degree relatives) and one sample per pair of related participants was randomly removed. Our earlier work (33) demonstrated that the Arab population in Kuwait is characterized by an admixture of six ancestral components that arise from the Human Genome Diversity Project populations of Negev Bedouin, Yoruba, Brahui tribe, Druze, Kalash tribe, and French Basque; we estimated the extents of the six ancestry components in the genetic substructure of the Kuwaiti population. Ancestry estimation was performed using ADMIXTURE (50), through which samples with abnormal deviations in the extents of component ancestry elements were identified as samples of ethnicity mismatch, and such samples were removed. These series of quality control (QC) steps reduced the number of markers to 632,375 and the size of the discovery cohort to 1,353 samples.
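The relatedness filter described above (remove one randomly chosen sample from every pair with PI_HAT >0.125) can be sketched in Python as follows. This is a minimal illustration and not the procedure actually scripted for the study; the function name `prune_related`, the sample identifiers, and the PI_HAT values are hypothetical, and the pairwise estimates are assumed to have been read beforehand from the PLINK relatedness output.

```python
import random


def prune_related(samples, pairs, pi_hat_cutoff=0.125, seed=0):
    """Remove one randomly chosen member of each related pair.

    samples: list of sample identifiers
    pairs:   iterable of (sample_a, sample_b, pi_hat) tuples
    """
    rng = random.Random(seed)  # fixed seed keeps the illustration reproducible
    removed = set()
    for a, b, pi_hat in pairs:
        if pi_hat <= pi_hat_cutoff:
            continue  # pair is not considered related
        if a in removed or b in removed:
            continue  # pair already broken by an earlier removal
        removed.add(rng.choice((a, b)))
    return [s for s in samples if s not in removed]


# Hypothetical toy data
samples = ["S1", "S2", "S3", "S4", "S5"]
pairs = [("S1", "S2", 0.50), ("S2", "S3", 0.25), ("S4", "S5", 0.05)]
kept = prune_related(samples, pairs)
print(kept)
```

Skipping pairs that have already lost a member avoids discarding more samples than necessary when one participant is related to several others.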
Principal component analysis (PCA) was performed using EIGENSTRAT (51). The following parameters were set: the number of eigenvectors to output (numoutevec = 10), no outlier removal (numoutlieriter = 0), number of principal components along which to remove outliers during each outlier removal iteration (numoutlierevec = 10), and number of SDs a participant must exceed, along one of the top (numoutlierevec) principal components, in order for that participant to be classified as an outlier (outliersigmathresh = 6.0) and to be removed. All 10 principal components were used as covariates in quantitative trait association tests to correct for population stratification.
LD-pruning was performed using the “--indep-pairwise” option in PLINK to remove markers with an R2 value of >0.5 with any other SNP within a 50-SNP sliding window (advanced in 5-SNP increments). These pruned markers (340,299) were used for measuring relatedness and admixture, to perform PCA, and to calibrate the genome-wide P-value threshold used to identify significant genotype-phenotype associations.
Independent variables among the four lipid traits were identified by way of performing Pearson correlation analysis between the traits followed by matSpD (52) analysis (http://neurogenetics.qimrberghofer.edu.au/matSpD/); the estimated effective number of independent traits was seen as 3.
Quantitative trait association tests in discovery phase
Association tests were performed with all of the 632,375 SNPs (that passed the QC without LD-pruning) against HDL, LDL, TC, and TG using linear regression methods under additive and recessive genetic models. Association tests were adjusted for age, sex, and first 10 principal components with further adjustment toward medication for lowering lipid levels, toward obesity status, and toward diabetes status.
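The difference between the additive and recessive genetic models lies in how the genotype is coded before the regression. The Python sketch below illustrates the two codings on hypothetical data; it fits only a single-predictor least-squares slope and deliberately omits the covariates (age, sex, principal components, medication, obesity, and diabetes status) used in the actual tests. All function names and data values are illustrative assumptions.

```python
def additive(copies):
    """Additive coding: predictor is the number of effect alleles (0, 1, or 2)."""
    return copies


def recessive(copies):
    """Recessive coding: predictor is 1 only for two copies of the effect allele."""
    return 1 if copies == 2 else 0


def ols_slope(x, y):
    """Least-squares slope of y on a single predictor x (no covariates)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical effect-allele counts and TG values (mmol/l)
copies = [0, 1, 2, 2, 0, 1, 1, 2]
tg = [1.2, 1.4, 3.1, 2.9, 1.1, 1.3, 1.5, 3.0]

beta_additive = ols_slope([additive(c) for c in copies], tg)
beta_recessive = ols_slope([recessive(c) for c in copies], tg)
print(f"additive beta = {beta_additive:.3f}, recessive beta = {beta_recessive:.3f}")
```

Under recessive coding with no covariates, the slope equals the difference in mean trait value between homozygotes for the effect allele and all other participants, which is why a recessive model can capture effects that are confined to homozygotes.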
P-value thresholds for associations in discovery phase
The P-value threshold for genome-wide significance was calibrated for the number of LD-pruned markers, effective number of independent variables among the quantitative traits, number of tested genetic models, and number of models used to adjust the association tests; the “stringent” P-value threshold required to keep the type I error rate at 5% was thus calibrated to 6.12E-09 [= 0.05/(340,299 × 3 × 2 × 4)]. We further defined a “borderline” P-value threshold of (>6.12E-09 and ≤5.0E-08) in alignment with the standard P-value threshold for genome-wide significance adopted by the GWAS community.
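The threshold arithmetic can be reproduced directly. The short Python sketch below recomputes the stringent threshold from the four factors named above and classifies a P-value against the stringent and borderline thresholds; the variable names and the `classify` helper are illustrative and not part of the study pipeline.

```python
# Factors stated in the text
n_ld_pruned_markers = 340_299
n_independent_traits = 3
n_genetic_models = 2      # additive and recessive
n_adjustment_models = 4   # regular, + lipid medication, + obesity, + diabetes

n_tests = (n_ld_pruned_markers * n_independent_traits
           * n_genetic_models * n_adjustment_models)
stringent = 0.05 / n_tests   # Bonferroni-style correction at a 5% type I error rate
borderline = 5.0e-8          # conventional genome-wide significance threshold


def classify(p_value):
    """Label a discovery-phase P-value against the two thresholds."""
    if p_value <= stringent:
        return "genome-wide"
    if p_value <= borderline:
        return "borderline"
    return "not significant"


print(f"number of tests = {n_tests:,}; stringent threshold = {stringent:.2e}")
print(classify(7.17e-11), classify(8.38e-09), classify(1.0e-04))
```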
Delineation of runs of homozygosity regions
Identification of runs of homozygosity (ROH) was performed using PLINK 1.9 through two different methods and sets of parameters: the first method comprised pruning the markers for LD (R2 > 0.9) (leading to 568,670 markers) followed by ROH detection using optimal parameters suggested by Howrigan, Simonson, and Keller (53); the second method utilized the unpruned marker set (of 632,375 markers) to detect ROH using parameters as deployed in Christofidou et al. (54). Groups of overlapping ROH segments were discovered; for each such group, the consensus ROH region and mean ± SD (by taking the midpoint of individual ROH falling in the group) were delineated and were used to identify the overlay of SNP with ROH region. Further, the ROH signatures discovered from global populations, as reported in the Pemberton et al. (55) study, were used to classify the delineated ROH segments as “known” or “novel”.
Tests for correlations among the quantitative traits associated with lipids and insulin resistance
Pairwise relationships among the four lipid traits (TG, HDL, LDL, and TC) and the trait of fasting plasma glucose (FPG) linked to insulin resistance were examined by way of performing Pearson correlation analysis. Relationship between two traits was denoted by the product-moment correlation coefficient (r2) and significance of correlation (P-value, the probability of finding the current result if no correlation exists between the two variables, i.e., null hypothesis). A value of 0.05 was considered as the P-value threshold. The analysis was performed independently on the discovery and replication cohorts.
Testing the quantitative lipid traits for contribution to the dichotomous status of being diabetic or nondiabetic
The relationship between a lipid trait and the categorical status of being diabetic or nondiabetic was examined by way of performing logistic regression, a statistical method for analyzing a dataset in which there are one or more independent variables that determine an outcome. The relationship between a trait and the outcome of diabetes status was denoted by the odds ratio (OR) of being diabetic (as calculated using the derived regression coefficient and standard error of coefficient) and significance of the coefficient (P-value). A value of 0.05 was considered as P-value threshold. The analysis was performed independently on the discovery and replication cohorts.
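The conversion from a logistic regression coefficient and its standard error to an OR can be sketched as follows. This Python illustration assumes a Wald-type 95% confidence interval and Wald P-value, which the text does not specify; the function name and the input values are hypothetical and do not correspond to results of this study.

```python
from math import erfc, exp, sqrt


def odds_ratio_summary(beta, se, z_crit=1.96):
    """Return (OR, lower 95% limit, upper 95% limit, two-sided Wald P).

    beta: logistic regression coefficient (log odds per unit of the trait)
    se:   standard error of the coefficient
    """
    odds_ratio = exp(beta)
    lower = exp(beta - z_crit * se)
    upper = exp(beta + z_crit * se)
    p_value = erfc(abs(beta / se) / sqrt(2.0))
    return odds_ratio, lower, upper, p_value


# Hypothetical coefficient and standard error for illustration only
or_, lo_, hi_, p_ = odds_ratio_summary(beta=0.40, se=0.10)
print(f"OR = {or_:.2f} (95% CI {lo_:.2f} to {hi_:.2f}), P = {p_:.1e}")
```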
RESULTS
Marker and sample sets
The QC procedures led to a final marker set of 632,375 SNPs, a discovery cohort of 1,353 samples, and a replication cohort of 1,176 samples.
Characteristics of the study participants
The cohort used for the discovery phase largely comprised participants in middle adulthood (mean age, 46.77 ± 13.8 years), with almost equal proportions of males and females (Table 1). The population was largely categorized as class I obese (mean BMI, 32.42 ± 7.38 kg/m2), with a high mean value for WC (102.21 ± 16.35 cm). Forty-five percent of the participants were diabetic. The mean values for LDL, HDL, TC, and TG (3.07 ± 0.97 mmol/l, 1.13 ± 0.38 mmol/l, 4.92 ± 1.11 mmol/l, and 1.69 ± 1.18 mmol/l, respectively) were normal or near optimal. One hundred and thirty-three of the 1,353 participants that formed the discovery cohort were taking lipid-lowering medications; such medications were of the following types: Zocor (simvastatin), omega-3 fatty acids, Lipitor (atorvastatin), Crestor (rosuvastatin), Mevalotin (pravastatin), Lopid (gemfibrozil), and Lipanthyl.
TABLE 1.
Demographic characteristics of the study participants
Characteristic | Discovery Cohort (mean ± SD) | Replication Cohort (mean ± SD) | P for Differences between Discovery and Replication Cohorts |
Sex (male:female) | 667:686 | 673:503 | 7.96E-05 |
Age (years ± SD) | 46.77 ± 13.79 | 46.79 ± 10.65 | 0.967 |
Weight (kg ± SD) | 88.51 ± 21.12 | 92.38 ± 16.97 | 3.62E-06 |
Height (cm ± SD) | 165.29 ± 9.60 | 166.45 ± 8.91 | 0.0057 |
BMI (kg/m2 ± SD) | 32.42 ± 7.38 | 31.24 ± 5.70 | 6.15E-06 |
WC (cm ± SD) | 102.21 ± 16.35 | 100.53 ± 12.05 | 0.003 |
LDL (mmol/l ± SD) | 3.07 ± 0.97 | 3.40 ± 0.92 | <2.2E-16 |
HDL (mmol/l ± SD) | 1.13 ± 0.38 | 1.13 ± 0.32 | 0.8213 |
TC (mmol/l ± SD) | 4.92 ± 1.11 | 5.22 ± 1.04 | 7.77E-12 |
TG (mmol/l ± SD) | 1.69 ± 1.18 | 1.56 ± 1.00 | 0.0016 |
HbA1c (% ± SD) | 7.10 ± 2.08 | 6.00 ± 1.44 | <2.2E-16 |
FPG (mmol/l ± SD) | 7.31 ± 3.57 | 5.86 ± 2.27 | <2.2E-16 |
SBP (mmHg ± SD) | 127.61 ± 17.51 | 129.07 ± 16.72 | 0.062 |
DBP (mmHg ± SD) | 77.86 ± 10.60 | 78.72 ± 11.10 | 0.035 |
Obesity (BMI ≥30 kg/m2) (yes:no) (a) | 803:550 | 535:641 | 7.43E-05 |
Diabetic (yes:no) | 605:748 | 452:724 | 0.0016 |
Hypertensive (yes:no) | 607:746 | 420:756 | 3.61E-06 |
Lipid lowering medication (yes:no) | 133:1,220 | 4:1,162 | <2.2E-16 |
Anti-diabetes medication (yes:no) | 216:1,137 | 54:1,122 | <2.2E-16 |
Anti-hypertensive medication (yes:no) | 161:1,192 | 85:1,091 | 0.00010 |
HbA1c, glycated hemoglobin; SBP, systolic blood pressure; DBP, diastolic blood pressure.
(a) The distribution of the participants across the categories of normal weight (BMI 20 to <25 kg/m2), overweight (BMI 25 to <30 kg/m2), obese (BMI 30 to <40 kg/m2), and morbidly obese (BMI ≥40 kg/m2) was 222:328:597:206 in the discovery cohort and 93:442:559:82 in the replication cohort.
The characteristics of the cohorts used in the study have been described in our previous reports: a) A subset was used to delineate the three genetic substructures of the Kuwaiti population (33). b) The cohort, when used to delineate genetic risk variants for obesity, led to identification of a TCN2 variant associated with WC under additive mode of inheritance (30). Scatter plots presenting the first three principal components of the final discovery cohort (after all the QC analysis) and representative Human Genome Diversity Project populations are presented in supplemental Fig. S1; the scatter plots depict the three genetic substructures and are in agreement with the published scatter plot in our previous study (33) (the scatter plot from the study is reproduced here in supplemental Fig. S2) for native Kuwaitis of Arab ethnicity confirmed through detailed surname lineage analysis. Scatter plots presenting the first two principal components of all the samples considered for the discovery cohort, prior to QC analysis, and representative 1KGP populations (56) are presented in supplemental Fig. S3; the samples that were removed during the QC procedures are color-coded in the illustration.
Variants passing association tests in both the discovery and replication phases
Upon examining the results of association tests in the discovery phase for acceptable β values and at least nominal P-values of <1.0E-04, we short-listed 32 markers (of which 19 were associated with TG, 8 with HDL, 4 with TC, and 1 with TC and LDL) to carry forward to the replication phase. The SNP quality assessment values for these 32 markers in the replication phase are presented in supplemental Table S2. Two markers (namely rs1209347 and rs4731512) failed the assessment for allele frequency consistency (between the discovery and replication phases) and HWE QC (HWE >1.0E-03). Twelve markers, associated with TG at either the genome-wide significant or borderline P-values (under recessive model), passed the replication P-value threshold of <0.05 (Tables 2, 3); in addition, three markers from cholesteryl ester transfer protein (CETP) (namely rs3764261, rs1800775, and rs1864163), associated with HDL at nominal P-values (under additive model), passed the replication P-value threshold; and none of the other markers associated with LDL or TC passed the replication P-value threshold.
TABLE 2.
Results of the statistical association tests from the discovery, replication, and meta-analysis phases with TG (P approaching genome-wide significance at ≤6.12E-09) when modeled for the recessive mode of inheritance
SNP: Effect Allele | Gene: Functional Consequences | Phase | Effect Size (a) | P (a) | Effect Size (b) | P (b) | Effect Size (c) | P (c) | Effect Size (d) | P (d) |
rs1002487: C | RPS6KA1: intron | Discovery | 3.337 | 7.17E-11 | 3.344 | 6.69E-11 | 3.402 | 1.91E-11 | 3.202 | 3.32E-10 |
 | | Replication | 1.036 | 0.01805 | 1.045 | 0.0165 | 1.062 | 0.01508 | 1.029 | 0.01765 |
 | | Meta | 6.517 | 7.17E-11 | 6.197 | 5.76E-10 | 6.532 | 6.49E-11 | 6.18 | 6.42E-10 |
rs11805972: T | LAD1: P > Q | Discovery | 2.481 | 8.55E-11 | 2.48 | 8.79E-11 | 2.467 | 7.12E-11 | 2.404 | 2.54E-10 |
 | | Replication | 1.893 | 4.03E-08 | 1.878 | 4.50E-08 | 1.861 | 6.26E-08 | 1.835 | 7.98E-08 |
 | | Meta | 8.485 | 2.16E-17 | 8.351 | 6.75E-17 | 8.451 | 2.89E-17 | 8.281 | 1.23E-16 |
rs7761746: A | OR5V1: synonymous | Discovery | 2.447 | 1.89E-09 | 2.443 | 2.08E-09 | 2.37 | 4.31E-09 | 2.49 | 7.36E-10 |
 | | Replication | 0.8277 | 9.97E-03 | 0.8443 | 7.88E-03 | 0.7594 | 1.62E-02 | 0.8576 | 6.44E-03 |
 | | Meta | 6.066 | 1.31E-09 | 6.229 | 4.68E-10 | 5.85 | 4.91E-09 | 6.277 | 3.45E-10 |
rs39745: T | CTTNBP2, LSM8: intergenic | Discovery | 1.485 | 3.63E-09 | 1.486 | 3.56E-09 | 1.427 | 1.09E-08 | 1.473 | 3.76E-09 |
 | | Replication | 0.4809 | 3.69E-02 | 0.4769 | 3.95E-02 | 0.5116 | 0.02603 | 0.5255 | 0.0221 |
 | | Meta | 5.643 | 1.67E-08 | 5.619 | 1.92E-08 | 5.613 | 1.99E-08 | 5.782 | 7.39E-09 |
rs2934952: G | PGAP3: intron | Discovery | 0.773 | 3.17E-09 | 0.7722 | 3.38E-09 | 0.7527 | 5.95E-09 | 0.7617 | 4.21E-09 |
 | | Replication | 0.3243 | 6.87E-03 | 0.3167 | 7.94E-03 | 0.2927 | 1.35E-02 | 0.3152 | 7.66E-03 |
 | | Meta | 6.086 | 1.16E-09 | 6.054 | 1.41E-09 | 5.859 | 4.67E-09 | 6.03 | 1.64E-09 |
rs9626773: A | RP11-191L9.4, CERK: intergenic | Discovery | 2.848 | 1.42E-09 | 2.857 | 1.29E-09 | 2.751 | 3.65E-09 | 2.93 | 3.56E-10 |
 | | Replication | 3.395 | 8.60E-07 | 3.315 | 1.54E-06 | 3.368 | 9.85E-07 | 3.313 | 1.26E-06 |
 | | Meta | 7.776 | 7.47E-15 | 7.855 | 4.00E-15 | 7.647 | 2.06E-14 | 7.882 | 3.21E-15 |
Effect size represents β value for the discovery and replication phases and Z-score for meta-analysis.
(a) Regular correction: corrected for age, sex, and the top 10 principal components that resulted from the PCA of the genotype data.
(b) Corrected for lipid medication in addition to the regular correction.
(c) Corrected for obesity (using BMI as the characterizing trait) in addition to the regular correction.
(d) Corrected for diabetes status in addition to the regular correction.
TABLE 3.
Results of the statistical association tests from the discovery, replication, and meta-analysis phases with TG (P approaching borderline to genome-wide significance at >6.12E-09 and <5.0E-08) when modeled for the recessive and additive modes of inheritance
SNP: Effect Allele | Gene: Functional Consequences | Phase | Effect Size (a) | P (a) | Effect Size (b) | P (b) | Effect Size (c) | P (c) | Effect Size (d) | P (d) |
rs10873925: G#e | ST6GALNAC5: intron | Discovery | 0.633 | 4.11E-08 | 0.6031 | 4.19E-08 | 0.5899 | 6.15E-08 | 0.5669 | 2.32E-07 |
 | | Replication | 0.2035 | 0.0427 | 0.2092 | 0.03631 | 0.2182 | 0.0295 | 0.1874 | 0.059 |
 | | Meta | 5.364 | 8.12E-08 | 5.184 | 2.18E-07 | 5.418 | 6.03E-08 | 5.042 | 4.61E-07 |
rs4663379: G#e | SPP2, ARL4C: intergenic | Discovery | 1.841 | 8.38E-09 | 1.842 | 8.39E-09 | 1.852 | 4.83E-09 | 1.79 | 1.73E-08 |
 | | Replication | 0.5718 | 0.005562 | 0.5861 | 0.0042 | 0.5585 | 0.0066 | 0.5514 | 0.00695 |
 | | Meta | 6.078 | 1.22E-09 | 6.051 | 1.44E-09 | 6.107 | 1.02E-09 | 5.938 | 2.89E-09 |
rs10033119: G#e | NPY1R: 3′-utr | Discovery | 2.698 | 8.79E-09 | 2.695 | 9.24E-09 | 2.734 | 3.88E-09 | 2.745 | 3.69E-09 |
 | | Replication | 1.362 | 0.00064 | 1.336 | 0.00078 | 1.393 | 0.00046 | 1.257 | 0.0015 |
 | | Meta | 6.446 | 1.15E-10 | 6.583 | 4.60E-11 | 6.677 | 2.43E-11 | 6.455 | 1.08E-10 |
rs17709449: T#e | LINC00911, FLRT2: intergenic | Discovery | 1.173 | 5.12E-08 | 1.172 | 5.36E-08 | 1.186 | 2.68E-08 | 1.141 | 9.64E-08 |
 | | Replication | 0.3883 | 0.02078 | 0.3937 | 0.0185 | 0.3797 | 0.02342 | 0.369 | 0.0265 |
 | | Meta | 5.447 | 5.12E-08 | 5.481 | 4.24E-08 | 5.585 | 2.34E-08 | 5.386 | 7.20E-08 |
rs11654954: A#e | CDK12, NEUROD2: intergenic | Discovery | 0.9881 | 2.18E-08 | 0.9915 | 2.00E-08 | 0.9452 | 6.70E-08 | 0.9873 | 1.76E-08 |
 | | Replication | 0.3692 | 1.98E-02 | 0.3544 | 2.59E-02 | 0.3228 | 4.10E-02 | 0.3552 | 2.42E-02 |
 | | Meta | 5.602 | 2.13E-08 | 5.555 | 2.77E-08 | 5.261 | 1.43E-07 | 5.574 | 2.49E-08 |
rs11654954: A@f | CDK12, NEUROD2: intergenic | Discovery | 0.3311 | 3.75E-08 | 0.3312 | 3.74E-08 | 0.3199 | 8.41E-08 | 0.3305 | 3.14E-08 |
 | | Replication | 0.1289 | 1.47E-02 | 0.1149 | 2.89E-02 | 0.1116 | 3.26E-02 | 0.1131 | 2.99E-02 |
 | | Meta | 5.612 | 2.00E-08 | 5.432 | 5.57E-08 | 5.298 | 1.17E-07 | 5.444 | 5.20E-08 |
rs9972882: A#e | STARD3: intron | Discovery | 0.7284 | 1.81E-08 | 0.7287 | 1.80E-08 | 0.7137 | 2.56E-08 | 0.7208 | 2.00E-08 |
 | | Replication | 0.3311 | 4.43E-03 | 0.3168 | 5.73E-03 | 0.3043 | 7.83E-03 | 0.3205 | 5.01E-03 |
 | | Meta | 5.99 | 2.11E-09 | 5.918 | 3.25E-09 | 5.816 | 6.01E-09 | 5.949 | 2.70E-09 |
Effect size represents β value for the discovery and replication phases and Z-score for meta-analysis.
(a) Regular correction: corrected for age, sex, and the top 10 principal components that resulted from the PCA of the genotype data.
(b) Corrected for lipid medication in addition to the regular correction.
(c) Corrected for obesity (using BMI as the characterizing trait) in addition to the regular correction.
(d) Corrected for diabetes status in addition to the regular correction.
(e) Recessive model.
(f) Additive model.
Variants associated with lipid traits either at genome-wide significance or at borderline P-values
Six of the 12 markers associated with TG [namely rs1002487/ribosomal protein S6 kinase A1 (RPS6KA1), rs11805972/ladinin-1 (LAD1), rs7761746/olfactory receptor family 5 subfamily V member 1 (OR5V1), rs39745/cortactin binding protein 2 (CTTNBP2)-LSM8 homolog (LSM8), rs2934952/post-GPI attachment to proteins 3 (PGAP3), and rs9626773/uncharacterized RNA gene (RP11-191L9.4)-ceramide kinase (CERK)] were associated at genome-wide significant P-values (<6.12E-09) (see Table 2), and the remaining six [namely rs10873925/ST6 N-acetylgalactosaminide alpha-2,6-sialyltransferase 5 (ST6GALNAC5), rs4663379/secreted phosphoprotein 2 (SPP2)-ADP ribosylation factor like GTPase 4C (ARL4C), rs10033119/neuropeptide Y receptor Y1 (NPY1R), rs17709449/long intergenic non-protein coding RNA 911 (LINC00911)-fibronectin leucine-rich transmembrane protein 2 (FLRT2), rs9972882/StAR-related lipid transfer domain containing 3 (STARD3), and rs11654954/cyclin-dependent kinase 12 (CDK12)-neuronal differentiation 2 (NEUROD2)] were associated at borderline P-values (>6.12E-09 and <5.0E-08) (see Table 3). The association involving the rs11654954/CDK12-NEUROD2 marker also appeared under the additive model. In the joint analysis, in which we combined the results from the discovery and replication cohorts by meta-analysis, all 12 markers showed significant P-values. The test statistics from the discovery phase for the markers that are in LD with the 12 identified markers for association with TG are listed in supplemental Data Set 1.
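The meta-analysis rows of Tables 2 and 3 report a Z-score together with a P-value. As a minimal sketch, assuming the reported P-values are two-sided tail probabilities of a standard normal distribution (an assumption, since the meta-analysis software is not named in this section), the correspondence can be checked as follows. The function name `two_sided_p` is illustrative only.

```python
import math


def two_sided_p(z):
    """Two-sided P-value for a standard-normal Z-score.

    erfc is used instead of 1 - cdf so that very small tail
    probabilities are not lost to floating-point rounding.
    """
    return math.erfc(abs(z) / math.sqrt(2.0))


# Meta-analysis Z-score listed for rs9626773 under the regular correction (Table 2)
print(f"{two_sided_p(7.776):.2e}")
```

Under this assumption, the Z-score of 7.776 maps to a P-value close to the 7.47E-15 listed in Table 2; small differences are expected because the tabulated Z-scores are rounded.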
Supplemental Fig. S4 presents the intensity maps, which display the quality of the calls for the three genotypes (homozygous for allele A, heterozygous AB, and homozygous for allele B) at these 12 markers. The quantile-quantile (QQ) plots of the expected and observed −log10 (P-values) for the association of the markers with TG are presented in Fig. 1, and those for the other three traits in supplemental Fig. S5. The genomic control inflation factors corresponding to TG were λ = 1.01 (recessive model) and λ = 1.029 (additive model) in tests with regular corrections, λ = 1.002 (recessive model) and λ = 1.028 (additive model) in tests additionally corrected for lipid-lowering medication, λ = 0.994 (recessive model) and λ = 1.029 (additive model) in tests additionally corrected for obesity, and λ = 0.996 (recessive model) and λ = 1.028 (additive model) in tests additionally corrected for diabetes status. Similar values were obtained for the other traits. The values of λ from the recessive and additive models with regular corrections were (HDL, 1.013, 1.025), (TC, 1.05, 1.041), and (LDL, 1.047, 1.037). With regular corrections, the values thus spanned a narrow range of 1.01 to 1.05. Because all λ values were close to 1.0, we did not apply genomic-control correction to the association statistics. The Manhattan plots of the −log10 (P-values) from the genome-wide association analysis for all four traits under both the recessive and additive models are presented in supplemental Fig. S6.
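The genomic control inflation factor can be illustrated with a short sketch. It assumes the common definition of λ as the median of the observed 1-df χ2 association statistics divided by the median of the null χ2 distribution (about 0.4549); this section does not state which software produced the reported values, and the function name `genomic_inflation` is hypothetical.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(p_values):
    """Estimate lambda from two-sided P-values in (0, 1].

    Each P-value is converted back to a 1-df chi-square statistic
    (the square of the corresponding normal quantile).
    """
    nd = NormalDist()
    chi2 = [nd.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


# P-values spread evenly over (0, 1) mimic the null hypothesis
n = 1001
null_like = [(i + 0.5) / n for i in range(n)]
print(round(genomic_inflation(null_like), 3))  # → 1.0
```

A value near 1.0, as in this null-like example, indicates no systematic inflation of the test statistics; values well above 1.0 would suggest population stratification or other confounding.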
Fig. 1.
QQ plot of the expected and observed –log10 (P-value) for markers associated with TG under recessive or additive models corrected for age, sex, and the top 10 principal components that resulted from the PCA of the genotype data. Circles in the plots represent observed P-values obtained in tests of associations. Expected P-values represent the null hypothesis. Expected values from a theoretical χ2-distribution are represented by the light gray lines between the x axis and the y axis. The observed P-values for each SNP are sorted from largest to smallest and plotted against expected values. Thus, the presented QQ plot is a graphical representation of the deviation of the observed P-values from the null hypothesis. If some observed P-values are clearly more significant than expected under the null hypothesis, points move toward the y axis. The λ represents the genomic control inflation factor, which compares observed association statistics against the expected distributions. Plots for the other three traits are presented in supplemental Fig. S5.
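The construction of the QQ plot described in the caption can be sketched as below. This is an illustration only: it assumes expected P-values are taken as (rank − 0.5)/n under a uniform null, which is one common convention and may differ from the plotting tool actually used. The function name `qq_points` is hypothetical.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) pairs of -log10(P), smallest first."""
    n = len(p_values)
    observed = sorted(-math.log10(p) for p in p_values)
    # Under the null, the ranked P-values are expected at (rank - 0.5) / n
    expected = sorted(-math.log10((rank - 0.5) / n) for rank in range(1, n + 1))
    return list(zip(expected, observed))


# P-values that match the null expectation fall on the diagonal
for exp_val, obs_val in qq_points([0.5, 0.1, 0.9, 0.3, 0.7]):
    print(round(exp_val, 3), round(obs_val, 3))
```

Points above the diagonal at the upper end of the plot correspond to observed P-values that are more significant than expected under the null hypothesis.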
The regional association plots covering a region of 500 Kb centered at the SNPs showing association with TG at genome-wide significance are presented in Fig. 2, and those for the markers associated at borderline P-values are presented in supplemental Fig. S7. The regions surrounding the reported risk variants were gene-dense. However, the markers in LD with each reported risk variant were located in the same gene loci; for example: a) the rs10873925 LD markers (rs3113982, r2 = 0.24; rs1252593, r2 = 0.25; rs10782639, r2 = 0.36; and rs199667, r2 = 0.22) were also located in ST6GALNAC5; b) the rs11805972 LD markers (rs2799686, r2 = 0.22 and rs6671391, r2 = 0.35) were also located in LAD1; c) the rs17709449 LD markers (rs11628575, r2 = 0.45; rs7144487, r2 = 0.48; rs12897409, r2 = 0.23; and rs17636692, r2 = 0.23) were also located in LINC00911 and FLRT2; d) the rs9626773 LD marker (rs9627141, r2 = 0.22) was also located in RP11-191L9.4 and CERK; and e) the rs1002487 LD marker (rs6668958, r2 = 0.24) was also located in RPS6KA1.
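For readers unfamiliar with the r2 values quoted for LD markers, the following sketch shows the standard definition from haplotype frequencies. It assumes phased haplotype frequencies are available; in practice r2 is usually estimated from genotype data by dedicated software, which is not shown here. The function name and the input frequencies are hypothetical.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """LD r2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


# Hypothetical frequencies: alleles always co-inherited (complete LD)
print(round(ld_r_squared(0.3, 0.3, 0.3), 6))   # → 1.0
# Hypothetical frequencies: alleles independent (no LD)
print(round(ld_r_squared(0.06, 0.2, 0.3), 6))  # → 0.0
```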
Fig. 2.
Regional association plots showing the six identified risk variants at genome-wide significant P-values [rs1002487 (A), rs11805972 (B), rs7761746 (C), rs39745 (D), rs2934952 (E), and rs9626773 (F)] and the markers in LD (from a 500 Kb genome region centered at the risk variants) with the risk variants in their respective gene regions and their association with TG. The SNPs are color-coded (see the insets) as per the r2 value for the SNP with the identified risk variant. The x axis represents the gene region in physical order; the y axis represents the −log10 (P-value) of the associations with TG for all the SNPs. The dashed horizontal line represents a P-value of 6.12E-09 (similar plots for the markers showing association with TG at borderline P-values are presented in supplemental Fig. S7). In order to generate a regional association plot for a SNP-trait association, all the SNPs (typed and passing the QC analysis) from a region of around 500 Kb centered on the SNP were tested for association with the trait; the resultant statistics and the SNPs were displayed in the regional association plot. The region-plot tool (https://github.com/pgxcentre/region-plot) was used to produce regional plots. A: Regional association plot showing the risk variant, rs1002487 (in red and labeled), and its LD markers (in purple: rs6668958, r2 = 0.29) in their respective gene regions and their association with TG. B: Regional association plot showing the rs11805972 (in red and labeled) and its LD markers (in purple: rs6671391, r2 = 0.22 and rs2799686, r2 = 0.35) in their respective gene regions and their association with TG. 
C: Regional association plot showing the rs7761746 (in red and labeled) and its LD markers (in orange: rs9257792, r2 = 0.68); (in green: rs9501112, r2 = 0.51; rs6901923, r2 = 0.50; rs9257696, r2 = 0.46; rs9257745, r2 = 0.45; and rs9348827, r2 = 0.48); and (in purple: rs7763661, r2 = 0.30; rs11962388, r2 = 0.28; rs11964645, r2 = 0.29; rs9257803, r2 = 0.35; rs11969272, r2 = 0.25; rs9501676, r2 = 0.33; rs9501677, r2 = 0.32; rs9461533, r2 = 0.28; rs10484548, r2= 0.32; rs362541, r2 = 0.32; rs6915177, r2 = 0.28; rs2076486, r2 = 0.28; rs2076484, r2 = 0.28; and rs64036, r2 = 0.32) in their respective gene regions and their association with TG. D: Regional association plot showing the rs39745 (in red and labeled) and its LD markers (in orange: rs13226962, r2 = 0.60 and rs7456706, r2 = 0.64), (in green: rs2706164, r2 = 0.54 and rs10487381, r2 = 0.56), and (in purple: rs2214226, r2 = 0.37 and rs39746; r2 = 0.27) in their respective gene regions and their association with TG. E: Regional association plot showing the rs2934952 (in red and labeled) and its LD markers (in red: rs9972882, r2 = 0.82; rs2941503, r2 = 0.95; rs907087, r2 = 1; rs2941504, r2 = 0.93; rs1565922, r2 = 0.92; and rs2517956, r2 = 0.91), (in orange: rs1877031, r2 = 0.62; rs931992, r2 = 0.63; rs1053651, r2 = 0.68; rs903502, r2 = 0.73; rs12150603, r2 = 0.69; rs11078919, r2 = 0.74; rs9303274, r2 = 0.71; and rs2517957, r2 = 0.69), (in green: rs12950186, r2 = 0.42; rs11655972, r2 = 0.42; rs801427, r2 = 0.42; rs590051, r2 = 0.41; rs597069, r2 = 0.45; rs11654954, r2 = 0.52; rs4795388, r2 = 0.45; rs1874223, r2 = 0.46; rs879606, r2 = 0.50; rs10852934, r2 = 0.50; rs907094, r2 = 0.45; rs3764352, r2 = 0.45; and rs3764351, r2 = 0.55); and (in purple: rs9897185, r2 = 0.36; rs3964723, r2 = 0.36; rs8069451, r2 = 0.36; rs7221875, r2 = 0.24; rs10491128, r2 = 0.37; rs6503513, r2 = 0.32; rs12452509, r2 = 0.36; rs10445306, r2 = 0.37; rs4795369, r2 = 0.37; and rs4794814; r2 = 0.38) in their respective gene regions and their 
association with TG. F: Regional association plot showing the rs9626773 (red and labeled) and its LD markers (in purple: rs9627141, r2 = 0.22) in their respective gene regions and their association with TG.
Examining the NHGRI-EBI GWAS Catalog for published reports on associations between the identified markers and phenotype traits
We examined the NHGRI-EBI GWAS Catalog (https://www.ebi.ac.uk/gwas/) to determine whether the 12 associations (at the stringent or borderline P-values), or at least the gene loci identified in our study, had been reported in previous GWASs on global populations. Of the 12 identified gene loci, only PGAP3 and STARD3 had previously been associated with lipid traits (20, 57).
Examining the transferability of established lipid-associated variants from other populations to the study population
All 12 markers identified so far in the study are novel and emerged under the genetic model based on the recessive mode of inheritance. To evaluate the transferability of lipid markers established in other populations to the study population, we examined associations (under the additive or recessive model) with nominal values of P < 1.0E-04 and a consistent direction of effect. This identified 23 associations involving 22 markers (supplemental Table S3). However, only six of the 22 markers passed the replication phase: rs7156508/solute carrier family 10 member 1 (SLC10A1)-SPARC-related modular calcium binding 1 (SMOC1) (P = 5.49E-08 with TG), rs9972882/STARD3 (P = 4.07E-07 with TG), rs2934952/PGAP3 (P = 8.81E-08 with TG), rs3764261/CETP (P = 1.10E-05 with HDL), rs1864163/CETP (P = 4.64E-06 with HDL), and rs1800775/CETP (P = 4.99E-06 with HDL); all of these except the SLC10A1-SMOC1 marker appeared under the additive model. Of these six markers, the three CETP markers are listed as established markers for lipid traits in the NHGRI-EBI GWAS Catalog: rs3764261/CETP was associated with lipid traits at genome-wide significance in European, Japanese, Han Chinese, African, British ancestry, and North Finnish founder populations (57–61); rs1864163/CETP was similarly associated with lipid traits in East Asian, European, and Sardinian populations (57, 62); and rs1800775/CETP was associated with lipid traits in Europeans and Filipinos (63, 64). Further, rs9972882/STARD3 was in LD (r2 = 0.55) with an established marker, rs1877031/STARD3, associated with lipid traits in East Asians and Europeans (57).
ROH segments overlaying the identified markers associated with lipid traits
ROH analysis indicated that all 12 reported risk variants were harbored in ROH segments (Table 4) identified in our study cohort. Comparison of these 12 ROH segments with ROH data from studies on global populations (55) showed that the ROH segments for 9 of the 12 markers were known in global populations (see Table 4); the remaining markers, which are harbored in novel ROH segments, are rs1002487, rs39745, and rs9626773. ROH analysis of the 22 markers that showed nominal values of P < 1.0E-04 showed that all 22 were proximal to ROH segments in the study population (supplemental Table S4); comparison with data from the Pemberton et al. (55) study revealed that 20 of these ROH segments were known in global populations and 2 were novel.
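Table 4 reports, for each marker, either "Overlapping" or a distance in MB between the SNP and a consensus ROH region. A minimal sketch of that calculation is given below. The SNP position used is a hypothetical placeholder (SNP coordinates are not given in this section), and the function name is illustrative; the two region boundaries are taken from the rs39745 rows of Table 4.

```python
def distance_to_region_mb(snp_pos, region_start, region_end):
    """Return 0.0 if the SNP lies inside the region, else the gap in megabases."""
    if region_start <= snp_pos <= region_end:
        return 0.0
    if snp_pos < region_start:
        gap_bp = region_start - snp_pos
    else:
        gap_bp = snp_pos - region_end
    return gap_bp / 1_000_000


# Hypothetical SNP position on chromosome 7 (placeholder, not from the paper)
snp = 117_705_000
print(distance_to_region_mb(snp, 117_667_791, 117_713_915))            # → 0.0
print(round(distance_to_region_mb(snp, 117_832_457, 117_960_973), 3))  # → 0.127
```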
TABLE 4.
Top 12 markers associated with TG and their proximity to ROH regions identified from methods 1 and 2
SNP | ROH Group | Consensus ROH Region | Distance to SNP from Consensus ROH (in MB) | Number of Individuals in ROH Group | Length of Consensus ROH (in KB) | Number of SNPs in Consensus ROH Region | Mean ± SD of ROH Groups | Distance to SNP from Mean ± SD Window (in MB) | Presence of SNP in ROH Regions from Worldwide Population (from Ref. 55) |
rs1002487 | S1818a | 1:28864435–29062427 | 1.99 | 51 | 197.99 | 11 | 24917436–33009426 | Overlapping | Yes |
S1557b | 1:28056342–28084571 | 1.19 | 44 | 28.23 | 5 | 27723540–28417372 | 0.85 | ||
rs11805972 | S1321a | 1:197325908–197359999 | 3.99 | 59 | 34.09 | 5 | 193932455–200753450 | 0.6 | No |
S2848a | 1:200997930–201060865 | 0.29 | 41 | 62.94 | 29 | 195716208–206342586 | Overlapping | ||
S3058a | 1:201071790–201083343 | 0.27 | 40 | 11.54 | 10 | 195944266–206210865 | Overlapping | ||
S4068a | 1:202132777–202201905 | 0.77 | 36 | 69.13 | 22 | 195822094–208512587 | Overlapping | ||
S846b | 1:19325908–197359999 | 3.52 | 63 | 517.65 | 54 | 196010552–199133473 | 2.22 | ||
S2403b | 1:201257016–201385868 | Overlapping | 37 | 128.85 | 69 | 200231804–202411079 | Overlapping | ||
rs4663379 | S4458a | 2:238462725–238482636 | 3.25 | 35 | 19.91 | 5 | 237266998–239678361 | 2.05 | No |
S4457a | 2:237708511–237742697 | 2.5 | 35 | 34.19 | 11 | 236520083–238931125 | 1.31 | ||
S3446b | 2:237048179–237481969 | 1.83 | 33 | 433.79 | 66 | 235304942–239225206 | 0.095 | ||
rs10033119 | S2907a | 4:160361747–160412034 | 3.83 | 41 | 50.29 | 8 | 156044269–164729510 | Overlapping | No |
S3611a | 4:168039224–168294218 | 3.79 | 38 | 254.99 | 34 | 162229756–174103686 | Overlapping | ||
S6281a | 4:164602209–164766062 | 0.35 | 31 | 163.85 | 26 | 158856625–170511645 | Overlapping | ||
S6282a | 4:164797386–165202724 | 0.55 | 31 | 405.34 | 55 | 158919502–171080608 | Overlapping | ||
S3714b | 4:164479479–164512021 | 0.23 | 32 | 32.54 | 18 | 160288277–168703222 | Overlapping | ||
S3715b | 4:164529666–164655831 | 0.28 | 32 | 126.17 | 39 | 160426360–168759136 | Overlapping | ||
S4050b | 4:163775335–163857878 | 0.38 | 31 | 82.54 | 24 | 159521646–168111566 | Overlapping | ||
rs7761746 | S439a | 6:29308393–29323655 | Overlapping | 89 | 15.26 | 3 | 26954921–31677127 | Overlapping | No |
S1192a | 6:29356734–29408528 | 0.03 | 62 | 51.8 | 19 | 26546122–32219139 | Overlapping | ||
S80b | 6:28988583–29080344 | 0.24 | 207 | 91.76 | 43 | 26992405–31076520 | Overlapping | ||
rs39745 | S244a | 7:118636817–118769366 | 0.931 | 104 | 132.55 | 5 | 115023895–122382286 | Overlapping | Yes |
S264a | 7:118367111–118429175 | 0.661 | 100 | 62.06 | 8 | 114686349–122109936 | Overlapping | ||
S403a | 7:118263380–118274228 | 0.558 | 92 | 10.85 | 3 | 114257190–122280417 | Overlapping | ||
S772a | 7:118119143–118119263 | 0.413 | 74 | 0.121 | 2 | 113660544–122577861 | Overlapping | ||
S902a | 7:117832457–117960973 | 0.127 | 70 | 128.52 | 8 | 113314045–122479384 | Overlapping | ||
S1851a | 7:117667791–117713915 | Overlapping | 51 | 46.12 | 11 | 112620728–122760977 | Overlapping | ||
S461b | 7:117783250–117823289 | 0.078 | 88 | 40.04 | 4 | 115026070–120580467 | Overlapping | ||
S1115b | 7:117497811–117579363 | 0.12 | 53 | 81.55 | 10 | 114497989–120579184 | Overlapping | ||
rs11654954 | S1745a | 17:41353410–41433660 | 3.607 | 53 | 80.25 | 3 | 38421342–44365727 | 0.675 | No, but LD SNPs such as rs4795369 (r2 = 0.56), rs879606 (r2 = 0.49), and rs907094 (r2 = 0.51) are in ROH region |
S2066a | 17:37938047–38066240 | 0.192 | 49 | 128.19 | 8 | 35214542–40789744 | Overlapping | ||
S2067a | 17:38069949–38082831 | 0.323 | 49 | 12.88 | 4 | 35289200–40863579 | Overlapping | ||
S2246a | 17:41557974–41660081 | 3.811 | 47 | 102.11 | 7 | 38448549–44769504 | 0.702 | ||
S698b | 17:40575688–40841397 | 2.829 | 71 | 265.71 | 35 | 40136698–41280385 | 2.39 | ||
S2056b | 17:37697212–38080865 | Overlapping | 40 | 383.65 | 51 | 36839131–38938944 | Overlapping | ||
rs9972882 | S1745a | 17:41353410–41433660 | 3.545 | 53 | 80.25 | 3 | 38421342–44365727 | 0.675 | No, but LD SNPs such as rs879606 (r2 = 0.64), rs907094 (r2 = 0.55), rs1877031 (r2 = 0.73), and rs931992 (r2 = 0.73) are in ROH region |
S2066a | 17:37938047–38066240 | 0.13 | 49 | 128.19 | 8 | 35214542–40789744 | Overlapping | ||
S2067a | 17:38069949–38082831 | 0.262 | 49 | 12.88 | 4 | 35289200–40863579 | Overlapping | ||
S2246a | 17:41557974–41660081 | 3.75 | 47 | 102.11 | 7 | 38448549–44769504 | 0.64 | ||
S698b | 17:40575688–40841397 | 2.767 | 71 | 265.71 | 35 | 40136698–41280385 | 2.32 | ||
S2056b | 17:37697212–38080865 | Overlapping | 40 | 383.65 | 51 | 36839131–38938944 | Overlapping | ||
rs2934952 | S1745a | 17:41353410–41433660 | 3.521 | 53 | 80.25 | 3 | 38421342–44365727 | 0.588 | No, but LD SNPs such as rs2941503 (r2 = 0.94) and rs1565922 (r2 = 1) are in ROH region |
S2066a | 17:37938047–38066240 | 0.105 | 49 | 128.19 | 8 | 35214542–40789744 | Overlapping | ||
S2067a | 17:38069949–38082831 | 0.237 | 49 | 12.88 | 4 | 35289200–40863579 | Overlapping | ||
S2246a | 17:41557974–41660081 | 3.725 | 47 | 102.11 | 7 | 38448549–44769504 | 0.616 | ||
S698b | 17:40575688–40841397 | 2.743 | 71 | 265.71 | 35 | 40136698–41280385 | 2.304 | ||
S2056b | 17:37697212–38080865 | Overlapping | 40 | 383.65 | 51 | 36839131–38938944 | Overlapping | ||
rs9626773 | S1316a | 22:42391017–42416695 | 5.76 | 60 | 25.68 | 5 | 40225588–44582123 | 3.6 | Yes |
S292b | 22:41739370–41769083 | 6.41 | 102 | 29.71 | 5 | 40316307–43192145 | 4.99 | ||
rs10873925 | S530a | 1:73390986–73552927 | 3.91 | 83 | 161.94 | 5 | 72077740–74866172 | 2.59 | Yes |
S1175a | 1:78168758–78392446 | 0.7 | 62 | 223.68 | 14 | 75058792–81502412 | Overlapping | ||
S262b | 1:73231948–73251013 | 4.2 | 106 | 19.07 | 5 | 71552003–74930956 | 2.53 | ||
S1746b | 1:77986516–78607268 | 0.5 | 42 | 620.75 | 76 | 74594745–81999038 | Overlapping | ||
S1936b | 1:77895705–77897188 | 0.4 | 51 | 1.98 | 5 | 74124056–81669336 | Overlapping | ||
rs17709449 | S3466a | 14:83999743–84062199 | 1.88 | 36 | 62.46 | 10 | 78521481–89540460 | Overlapping | Yes |
S4339a | 14:84322906–84348722 | 1.6 | 36 | 25.82 | 6 | 78580609–90091019 | Overlapping | ||
S4340a | 14:84665987–84668987 | 1.2 | 36 | 0.001 | 1 | 78727855–90610119 | Overlapping | ||
S3047b | 14:83245454–83262064 | 2.703 | 35 | 16.611 | 9 | 79247160–87260357 | Overlapping | ||
S3048b | 14:83608855–83870390 | 2.34 | 35 | 261.536 | 38 | 79700134–87779109 | Overlapping | ||
S3049b | 14:84006036–84229254 | 1.943 | 35 | 223.219 | 39 | 80024397–88210892 | Overlapping | ||
S3050b | 14:84603539–84635493 | 1.345 | 35 | 31.955 | 9 | 80515539–88723492 | Overlapping |
Based on the ROH regions reported in (55), SNPs rs1002487, rs39745, rs1565922, rs9626773, rs11654954, and rs9972882 were found in ROH regions known in worldwide populations, whereas SNPs rs11805972, rs4663379, rs10033119, and rs7761746 represent or overlap ROH regions in this population that were not found in (55) and might be unique to this population. MB, megabase; KB, kilobase.
(a) Method 1.
(b) Method 2.
Risk variants and allele frequency differences among different populations
Allele frequencies at the reported 12 risk variants in different populations (including the study cohort) and the three population subgroups of the study cohort are listed in supplemental Table S5. Fisher’s exact tests failed to reveal any statistically significant difference (P-value threshold of 0.004) in allele frequencies at any of the identified 12 risk variants between our study cohort and 1KGP populations.
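The allele-frequency comparison can be sketched as a two-sided Fisher's exact test on a 2×2 table of allele counts. The threshold of 0.004 is consistent with a Bonferroni adjustment of 0.05 across 12 variants (0.05/12 ≈ 0.0042), although this section does not state how it was derived. The implementation below is a plain-Python illustration, and the allele counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / total

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


# Hypothetical counts: risk allele / other allele in two populations
p_value = fisher_exact_two_sided(300, 700, 290, 710)
threshold = 0.05 / 12  # assumed Bonferroni threshold for 12 variants
print(p_value < threshold)
```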
Gene expression regulation by the identified risk variants
Examination of Genotype-Tissue Expression (GTEx) (https://www.gtexportal.org) data to assess the involvement of the identified risk variants in gene expression regulation indicated that 6 of the 12 reported markers regulate their own or one or more of the 12 reported gene loci in blood and other tissues (supplemental Table S6): the LD partner (rs2799686) of the LAD1 marker regulates LAD1; the CTTNBP2-LSM8 marker regulates LSM8; the PGAP3 marker regulates PGAP3 and STARD3; the LD partner (rs12897409) of the LINC00911-FLRT2 marker regulates FLRT2; the CDK12-NEUROD2 marker regulates PGAP3 and STARD3; and the STARD3 marker regulates STARD3, PGAP3, and RP11-690G19.3. Supplemental Data Set 2 is an Excel document that lists GTEx data for all the markers that are in LD with the 12 risk variants identified in this study.
Associations among lipid traits, insulin resistance-linked trait of FPG, and diabetes status
It is well-documented that lipid measures, such as TG and HDL, are associated with insulin resistance and the incidence of type 2 diabetes (65, 66); thus, it is of interest to evaluate the associations among the lipid and insulin resistance traits in our study cohort. Pearson correlation tests between FPG (insulin resistance-linked trait) and the lipid traits showed that FPG correlated directly with TG (discovery cohort: r = 0.28; P-value < 2.2E-16; replication cohort: r = 0.31; P-value < 2.2E-16) and inversely with HDL (discovery cohort: r = −0.11; P-value < 4.9E-05; replication cohort: r = −0.173; P-value < 2.2E-16). Logistic regression models fitted for each of these two traits against the diagnosis of diabetes in the participants showed that high levels of TG and low levels of HDL were risk factors for diabetes; TG was associated with higher odds of being diabetic [discovery cohort: OR = 1.37 (95% CI 1.54, 1.76), P-value = 2.20E-08; replication cohort: OR = 1.64 (95% CI 1.34, 1.65), P-value = 5.05E-10] and HDL was associated with lower odds of being diabetic [discovery cohort: OR = 0.64 (95% CI 0.83, 1.42), P-value = 0.0031; replication cohort: OR = 0.20 (95% CI 0.63, 1.65), P-value = 8.01E-08].
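As background for the odds ratios above, a logistic regression coefficient β with standard error SE is conventionally converted to an odds ratio and a Wald 95% confidence interval as OR = exp(β) and CI = exp(β ± 1.96 × SE). The sketch below illustrates this conversion; the coefficient and standard error are hypothetical values, not estimates from this study, and the function name is illustrative.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.959964):
    """Convert a logistic regression coefficient to (OR, lower, upper).

    z defaults to the normal quantile for a two-sided 95% interval.
    """
    return (
        math.exp(beta),
        math.exp(beta - z * se),
        math.exp(beta + z * se),
    )


# Hypothetical coefficient and standard error for a one-unit increase in a trait
or_value, lower, upper = odds_ratio_with_ci(0.40, 0.07)
print(f"OR = {or_value:.2f} (95% CI {lower:.2f}, {upper:.2f})")
```

By construction, an interval computed this way always contains the point estimate, and an interval that excludes 1.0 corresponds to a Wald test P-value below 0.05.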
DISCUSSION
We identified a set of six markers associated with TG at stringent genome-wide significant P-values of <6.12E-09 in a cohort of subjects of Arab descent from Kuwait; the markers were harbored in the gene loci of RPS6KA1, LAD1, OR5V1, [CTTNBP2, LSM8], PGAP3, and [RP11-191L9.4, CERK]. We additionally identified six borderline associations (with P-values in the range of >6.12E-09 to <5E-08) with TG from the gene loci of ST6GALNAC5, [SPP2, ARL4C], NPY1R, [LINC00911, FLRT2], [CDK12, NEUROD2], and STARD3. The reported risk variants from the gene loci of LAD1, CTTNBP2-LSM8, PGAP3, LINC00911-FLRT2, CDK12-NEUROD2, and STARD3 have the potential to regulate one or more of the identified 12 gene loci (supplemental Table S8).
A literature survey provided suggestive evidence for the involvement of the 12 identified gene loci in metabolic processes (supplemental Table S8). Presented below are the gene loci for which the literature suggested a link to the trait of TG. The RSK1 protein (from RPS6KA1) is an important regulator of insulin signaling and glucose metabolism in the MAPK/ERK pathway (67). This protein can selectively phosphorylate insulin receptor substrate 1 (IRS1) and thereby prevent insulin resistance (68). RSK1-deficient mice remain sensitive to insulin due to the loss of the negative feedback mechanism for insulin resistance (69–71). Thus, RSK1 has the potential to be involved in insulin resistance. CERK converts ceramide to ceramide 1-phosphate, a sphingolipid metabolite. Ceramides play an active role in glucose homeostasis, insulin signaling, and, ultimately, the diabetes phenotype (72–75); ceramides in conjunction with diacylglycerols mediate high TG and insulin resistance (76). Further, CERK deficiency suppresses diet-induced obesity and improves glucose intolerance (77). Thus, CERK has an impact on processes relating to insulin resistance. The SPP2, NPY1R, and FLRT2 proteins also have the potential to be involved in diabetes/obesity (see supplemental Table S8). In our study, the reported variants from RPS6KA1, CERK, SPP2, NPY1R, and FLRT2 are associated with TG. The TG trait is well-known for its association with insulin resistance (65, 66) in many populations, including Japanese (78), Asian Indians (79), Chinese (80), and Hispanics (81). A study of the Arab population of Oman has shown that elevation in serum TG level is associated with increasing levels of glucose in blood (82). The prevalence of high plasma TG levels in Arab individuals from Saudi Arabia with diabetes was significantly higher than in those without diabetes (83). 
Our study also finds that TG levels have positive correlations with the insulin resistance-linked trait of FPG, and HDL levels have inverse correlations with FPG; further, TG is associated with higher odds of diabetes.
Elevated activity of sialyltransferase (from ST6GALNAC5) in blood cells and increased levels of sialic acids are associated with coronary diseases (84); genetic analysis of Iranian subjects demonstrated that mutations in ST6GALNAC5 act as risk factors for coronary artery disease (85). In our study, the variant from ST6GALNAC5 is associated with TG. A microsatellite-based linkage study on Caucasian families with premature coronary artery disease and myocardial infarction showed that genes in the 1p31-32 region (in which the ST6GALNAC5 is located) influence TG level (86). Epidemiological and genetic evidence exists to support the notion that raised TG is an additional cause (87) and an independent risk factor (88–90) for cardiovascular disease and all-cause mortality; this notion also holds in Arab populations: serum concentrations of TG were significantly higher in the CHD+ compared with the CHD− group of Saudi Arabian patients (91).
Though global GWASs have identified many risk variants associated with lipid levels, none of these established markers emerged in our study. Even upon examining the associations with nominal values of P < 1.0E-04, only four established associations [involving the three CETP markers (rs3764261, rs1864163, and rs1800775) and a STARD3 marker (rs9972882), which is in LD with an established marker (rs1877031/STARD3)] could be concluded to be transferable to the study population. The inability to reproduce the established European gene loci could have resulted from several factors, including the study power, the low prevalence of causative European variants in the study population, or gene-environment interactions that masked the effects of the European variants in the study population. This concern may be resolved in our future studies by enlarging the cohort, expanding the genome coverage for genotyping (e.g., by imputation), and performing conditional analysis of regions near the well-established gene loci.
There is a high level of inbreeding in the Arab region due to the practice of consanguineous marriages, often between first cousins. An overwhelming proportion (63%) of the disorders documented in the Catalogue for Transmission Genetics in Arabs (CTGA) (http://cags.org.ae/ctga/) follow a recessive mode of inheritance. A great burden of recessive alleles and homozygosity in the Kuwaiti population has been reported in our publications (33) and in our ongoing studies (S. E. John, D. Antony, M. Eaaswarkhanth, et al.; unpublished observations). Thus, it is not surprising that all 12 reported risk variants appeared when the genetic model based on the recessive mode of inheritance was used and were harbored in ROH segments identified in our study cohort; three of these ROH regions were not observed in global populations of other ancestries. Further, the four established markers (from CETP and STARD3), for which we had an indication of transferability to the study population, were also located in ROH regions; the ROH encompassing the CETP markers was novel. Inbreeding helps with the aggregation of risk alleles in families; three of the reported gene loci in this study were also associated with phenotypes characterizing familial aggregation. PGAP3 is linked to a congenital disorder of glycosylation (Mabry syndrome) (92, 93); SMOC1 is associated with microphthalmia with limb anomalies (94); and RPS6KA1 is associated with Sézary syndrome (95). Allele frequencies at the 12 reported markers did not differ significantly between Arab and global populations; however, it is possible that the impact of these risk alleles on phenotype traits is more pronounced in the Arab population.
In conclusion, by performing a GWAS followed by replication in an independent sample set of Arab subjects from Kuwait, we pinpointed 12 novel risk variants associated (under a genetic model based on recessive mode of inheritance) with plasma TG levels in this population. The study provided an indication of transferability of the established markers from CETP and STARD3 genes to the study population under a genetic model based on the additive mode of inheritance. Studies in culturally and geographically distinct ethnic populations, such as those of the Arabian Peninsula, that have been underrepresented in global genome survey studies will augment international efforts to identify the genetic reasons for variation in lipid traits.
Acknowledgments
The authors are extremely thankful to the Biostatistics and Epidemiology Department for their efforts and excellent work on recruiting participants, conducting interviews, collecting samples, and managing sample information data. The authors further thank Maisa Mahmoud for providing help with recruiting participants and Daisy Thomas for providing help with recruiting participants and collecting phenotype information. The Tissue Bank Core Facility is acknowledged for sample processing and DNA extraction.
Footnotes
Abbreviations:
- ARL4C: ADP ribosylation factor like GTPase 4C
- CDK12: cyclin-dependent kinase 12
- CERK: ceramide kinase
- CETP: cholesteryl ester transfer protein
- CTTNBP2: cortactin binding protein 2
- FLRT2: fibronectin leucine-rich transmembrane protein 2
- FPG: fasting plasma glucose
- GWAS: genome-wide association study
- HWE: Hardy-Weinberg equilibrium
- LAD1: ladinin-1
- LD: linkage disequilibrium
- LINC00911: long intergenic non-protein coding RNA 911
- LSM8: LSM8 homolog (U6 small nuclear RNA associated)
- NEUROD2: neuronal differentiation 2
- NPY1R: neuropeptide Y receptor Y1
- OR: odds ratio
- OR5V1: olfactory receptor family 5 subfamily V member 1
- PCA: principal components analysis
- PGAP3: post-GPI attachment to proteins 3
- QC: quality control
- Q-Q: quantile-quantile
- RG2: marginal genetic effect estimate
- ROH: runs of homozygosity
- RP11-191L9.4: uncharacterized RNA gene
- RPS6KA1: ribosomal protein S6 kinase A1
- SLC10A1: solute carrier family 10 member 1
- SMOC1: SPARC-related modular calcium binding 1
- SPP2: secreted phosphoprotein 2
- STARD3: StAR-related lipid transfer domain containing 3
- ST6GALNAC5: ST6 N-acetylgalactosaminide alpha-2,6-sialyltransferase 5
- TC: total cholesterol
- TG: triglyceride
- WC: waist circumference
This work was supported by the Kuwait Foundation for the Advancement of Sciences (Dasman Diabetes Institute project numbers RA 2016-026 and RA-2010-005).
The online version of this article (available at http://www.jlr.org) contains a supplement.
REFERENCES
- 1. Stamler J., Dyer A. R., Shekelle R. B., Neaton J., and Stamler R. 1993. Relationship of baseline major risk factors to coronary and all-cause mortality, and to longevity: findings from long-term follow-up of Chicago cohorts. Cardiology. 82: 191–222.
- 2. Manolio T. A., Pearson T. A., Wenger N. K., Barrett-Connor E., Payne G. H., and Harlan W. R. 1992. Cholesterol and heart disease in older persons and women. Review of an NHLBI workshop. Ann. Epidemiol. 2: 161–176.
- 3. Roeters van Lennep J. E., Westerveld H. T., Erkelens D. W., and van der Wall E. E. 2002. Risk factors for coronary heart disease: implications of gender. Cardiovasc. Res. 53: 538–549.
- 4. Reiner Ž. 2017. Hypertriglyceridaemia and risk of coronary artery disease. Nat. Rev. Cardiol. 14: 401–411.
- 5. Heitmann B. L. 1992. The variation in blood lipid levels described by various measures of overall and abdominal obesity in Danish men and women aged 35–65 years. Eur. J. Clin. Nutr. 46: 597–605.
- 6. Laakso M., and Barrett-Connor E. 1989. Asymptomatic hyperglycemia is associated with lipid and lipoprotein changes favoring atherosclerosis. Arteriosclerosis. 9: 665–672.
- 7. Kimm H., Lee S. W., Lee H. S., Shim K. W., Cho C. Y., Yun J. E., and Jee S. H. 2010. Associations between lipid measures and metabolic syndrome, insulin resistance and adiponectin. Usefulness of lipid ratios in Korean men and women. Circ. J. 74: 931–937.
- 8. Gepner Y., Shelef I., Schwarzfuchs D., Zelicha H., Tene L., Yaskolka Meir A., Tsaban G., Cohen N., Bril N., Rein M., et al. 2018. Effect of distinct lifestyle interventions on mobilization of fat storage pools: CENTRAL Magnetic Resonance Imaging Randomized Controlled Trial. Circulation. 137: 1143–1157.
- 9. Srivastava R. A., Pinkosky S. L., Filippov S., Hanselman J. C., Cramer C. T., and Newton R. S. 2012. AMP-activated protein kinase: an emerging drug target to regulate imbalances in lipid and carbohydrate metabolism to treat cardio-metabolic diseases. J. Lipid Res. 53: 2490–2514.
- 10. Chilton F. H., Murphy R. C., Wilson B. A., Sergeant S., Ainsworth H., Seeds M. C., and Mathias R. A. 2014. Diet-gene interactions and PUFA metabolism: a potential contributor to health disparities and human diseases. Nutrients. 6: 1993–2022.
- 11. Gil-Campos M., Canete R., and Gil A. 2004. Hormones regulating lipid metabolism and plasma lipids in childhood obesity. Int. J. Obes. Relat. Metab. Disord. 28 (Suppl. 3): S75–S80.
- 12. Williamson D. H., and Lund P. 1994. Cellular mechanisms for the regulation of adipose tissue lipid metabolism in pregnancy and lactation. Adv. Exp. Med. Biol. 352: 45–70.
- 13. Nestruck A. C., Bouthillier D., Sing C. F., and Davignon J. 1987. Apolipoprotein E polymorphism and plasma cholesterol response to probucol. Metabolism. 36: 743–747.
- 14. Hegele R. A. 2009. Plasma lipoproteins: genetic influences and clinical implications. Nat. Rev. Genet. 10: 109–121.
- 15. Cohen J. C. 2013. Emerging LDL therapies: using human genetics to discover new therapeutic targets for plasma lipids. J. Clin. Lipidol. 7: S1–S5.
- 16. Willer C. J., Schmidt E. M., Sengupta S., Peloso G. M., Gustafsson S., Kanoni S., Ganna A., Chen J., Buchkovich M. L., Mora S., et al. 2013. Discovery and refinement of loci associated with lipid levels. Nat. Genet. 45: 1274–1283.
- 17. Heller D. A., de Faire U., Pedersen N. L., Dahlen G., and McClearn G. E. 1993. Genetic and environmental influences on serum lipid levels in twins. N. Engl. J. Med. 328: 1150–1156.
- 18. Namboodiri K. K., Kaplan E. B., Heuch I., Elston R. C., Green P. P., Rao D. C., Laskarzewski P., Glueck C. J., and Rifkind B. M. 1985. The Collaborative Lipid Research Clinics Family Study: biological and cultural determinants of familial resemblance for plasma lipids and lipoproteins. Genet. Epidemiol. 2: 227–254.
- 19. Choquette A. C., Bouchard L., Houde A., Bouchard C., Perusse L., and Vohl M. C. 2007. Associations between USF1 gene variants and cardiovascular risk factors in the Quebec Family Study. Clin. Genet. 71: 245–253.
- 20. Teslovich T. M., Musunuru K., Smith A. V., Edmondson A. C., Stylianou I. M., Koseki M., Pirruccello J. P., Ripatti S., Chasman D. I., Willer C. J., et al. 2010. Biological, clinical and population relevance of 95 loci for blood lipids. Nature. 466: 707–713.
- 21. Aulchenko Y. S., Ripatti S., Lindqvist I., Boomsma D., Heid I. M., Pramstaller P. P., Penninx B. W., Janssens A. C., Wilson J. F., Spector T., et al. 2009. Loci influencing lipid levels and coronary heart disease risk in 16 European population cohorts. Nat. Genet. 41: 47–55.
- 22. Kathiresan S., Melander O., Guiducci C., Surti A., Burtt N. P., Rieder M. J., Cooper G. M., Roos C., Voight B. F., Havulinna A. S., et al. 2008. Six new loci associated with blood low-density lipoprotein cholesterol, high-density lipoprotein cholesterol or triglycerides in humans. Nat. Genet. 40: 189–197.
- 23. Asselbergs F. W., Guo Y., van Iperen E. P., Sivapalaratnam S., Tragante V., Lanktree M. B., Lange L. A., Almoguera B., Appelman Y. E., Barnard J., et al. 2012. Large-scale gene-centric meta-analysis across 32 studies identifies multiple lipid loci. Am. J. Hum. Genet. 91: 823–838.
- 24. Awad A., and Al-Nafisi H. 2014. Public knowledge of cardiovascular disease and its risk factors in Kuwait: a cross-sectional survey. BMC Public Health. 14: 1131.
- 25. Dumitrescu L., Carty C. L., Taylor K., Schumacher F. R., Hindorff L. A., Ambite J. L., Anderson G., Best L. G., Brown-Gentry K., Buzkova P., et al. 2011. Genetic determinants of lipid traits in diverse populations from the population architecture using genomics and epidemiology (PAGE) study. PLoS Genet. 7: e1002138.
- 26. Elbers C. C., Guo Y., Tragante V., van Iperen E. P., Lanktree M. B., Castillo B. A., Chen F., Yanek L. R., Wojczynski M. K., Li Y. R., et al. 2012. Gene-centric meta-analysis of lipid traits in African, East Asian and Hispanic populations. PLoS One. 7: e50198.
- 27. Wu Y., Waite L. L., Jackson A. U., Sheu W. H., Buyske S., Absher D., Arnett D. K., Boerwinkle E., Bonnycastle L. L., Carty C. L., et al. 2013. Trans-ethnic fine-mapping of lipid loci identifies population-specific signals and allelic heterogeneity that increases the trait variance explained. PLoS Genet. 9: e1003379.
- 28. Tomei S., Mamtani R., Al Ali R., Elkum N., Abdulmalik M., Ismail A., Cheema S., Rouh H. A., Aigha I. I., Hani F., et al. 2015. Obesity susceptibility loci in Qataris, a highly consanguineous Arabian population. J. Transl. Med. 13: 119.
- 29. Hebbar P., Elkum N., Alkayal F., John S. E., Thanaraj T. A., and Alsmadi O. 2017. Genetic risk variants for metabolic traits in Arab populations. Sci. Rep. 7: 40988.
- 30. Hebbar P., Alkayal F., Nizam R., Melhem M., Elkum N., John S. E., Abufarha M., Alsmadi O., and Thanaraj T. A. 2017. The TCN2 variant of rs9606756 [Ile23Val] acts as risk loci for obesity-related traits and mediates by interacting with Apo-A1. Obesity (Silver Spring). 25: 1098–1108.
- 31. Ghassibe-Sabbagh M., Haber M., Salloum A. K., Al-Sarraj Y., Akle Y., Hirbli K., Romanos J., Mouzaya F., Gauguier D., Platt D. E., et al. 2014. T2DM GWAS in the Lebanese population confirms the role of TCF7L2 and CDKAL1 in disease susceptibility. Sci. Rep. 4: 7351.
- 32. O’Beirne S. L., Salit J., Rodriguez-Flores J. L., Staudt M. R., Abi Khalil C., Fakhro K. A., Robay A., Ramstetter M. D., Al-Azwani I. K., Malek J. A., et al. 2016. Type 2 diabetes risk allele loci in the Qatari population. PLoS One. 11: e0156834.
- 33. Alsmadi O., Thareja G., Alkayal F., Rajagopalan R., John S. E., Hebbar P., Behbehani K., and Thanaraj T. A. 2013. Genetic substructure of Kuwaiti population reveals migration history. PLoS One. 8: e74913.
- 34. Al-Awadi S. A., Moussa M. A., Naguib K. K., Farag T. I., Teebi A. S., el-Khalifa M., and el-Dossary L. 1985. Consanguinity among the Kuwaiti population. Clin. Genet. 27: 483–486.
- 35. Rudan I., Campbell H., Carothers A. D., Hastie N. D., and Wright A. F. 2006. Contribution of consanguinity to polygenic and multifactorial diseases. Nat. Genet. 38: 1224–1225.
- 36. Bittles A. H., and Black M. L. 2010. Evolution in health and medicine Sackler colloquium: consanguinity, human evolution, and complex diseases. Proc. Natl. Acad. Sci. USA. 107 (Suppl. 1): 1779–1786.
- 37. Ansarimoghaddam A., Adineh H. A., Zareban I., Iranpour S., HosseinZadeh A., and Kh F. 2018. Prevalence of metabolic syndrome in Middle-East countries: meta-analysis of cross-sectional studies. Diabetes Metab. Syndr. 12: 195–201.
- 38. Channanath A. M., Farran B., Behbehani K., and Thanaraj T. A. 2013. State of diabetes, hypertension, and comorbidity in Kuwait: showcasing the trends as seen in native versus expatriate populations. Diabetes Care. 36: e75.
- 39. Klautzer L., Becker J., and Mattke S. 2014. The curse of wealth - Middle Eastern countries need to address the rapidly rising burden of diabetes. Int. J. Health Policy Manag. 2: 109–114.
- 40. Bamimore M. A., Zaid A., Banerjee Y., Al-Sarraf A., Abifadel M., Seidah N. G., Al-Waili K., Al-Rasadi K., and Awan Z. 2015. Familial hypercholesterolemia mutations in the Middle Eastern and North African region: a need for a national registry. J. Clin. Lipidol. 9: 187–194.
- 41. Al Rasadi K., Almahmeed W., AlHabib K. F., Abifadel M., Farhan H. A., AlSifri S., Jambart S., Zubaid M., Awan Z., Al-Waili K., et al. 2016. Dyslipidaemia in the Middle East: current status and a call for action. Atherosclerosis. 252: 182–187.
- 42. Franks P. W., Pearson E., and Florez J. C. 2013. Gene-environment and gene-treatment interactions in type 2 diabetes: progress, pitfalls, and prospects. Diabetes Care. 36: 1413–1421.
- 43. Hu F. B. 2011. Globalization of diabetes: the role of diet, lifestyle, and genes. Diabetes Care. 34: 1249–1257.
- 44. Morabia A., Cayanis E., Costanza M. C., Ross B. M., Flaherty M. S., Alvin G. B., Das K., and Gilliam T. C. 2003. Association of extreme blood lipid profile phenotypic variation with 11 reverse cholesterol transport genes and 10 non-genetic cardiovascular disease risk factors. Hum. Mol. Genet. 12: 2733–2743.
- 45. Parnell L. D., Blokker B. A., Dashti H. S., Nesbeth P. D., Cooper B. E., Ma Y., Lee Y. C., Hou R., Lai C. Q., Richardson K., et al. 2014. CardioGxE, a catalog of gene-environment interactions for cardiometabolic traits. BioData Min. 7: 21.
- 46. Lee Y. C., Lai C. Q., Ordovas J. M., and Parnell L. D. 2011. A database of gene-environment interactions pertaining to blood lipid traits, cardiovascular disease and type 2 diabetes. J. Data Mining Genomics Proteomics. 2: 106.
- 47. Gauderman W. J. 2002. Sample size requirements for association studies of gene-gene interaction. Am. J. Epidemiol. 155: 478–484.
- 48. Willer C. J., Li Y., and Abecasis G. R. 2010. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics. 26: 2190–2191.
- 49. Purcell S., Neale B., Todd-Brown K., Thomas L., Ferreira M. A., Bender D., Maller J., Sklar P., de Bakker P. I., Daly M. J., et al. 2007. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81: 559–575.
- 50. Alexander D. H., Novembre J., and Lange K. 2009. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19: 1655–1664.
- 51. Price A. L., Patterson N. J., Plenge R. M., Weinblatt M. E., Shadick N. A., and Reich D. 2006. Principal components analysis corrects for stratification in genome-wide association studies. Nat. Genet. 38: 904–909.
- 52. Li J., and Ji L. 2005. Adjusting multiple testing in multilocus analyses using the eigenvalues of a correlation matrix. Heredity (Edinb.). 95: 221–227.
- 53. Howrigan D. P., Simonson M. A., and Keller M. C. 2011. Detecting autozygosity through runs of homozygosity: a comparison of three autozygosity detection algorithms. BMC Genomics. 12: 460.
- 54. Christofidou P., Nelson C. P., Nikpay M., Qu L., Li M., Loley C., Debiec R., Braund P. S., Denniff M., Charchar F. J., et al. 2015. Runs of homozygosity: association with coronary artery disease and gene expression in monocytes and macrophages. Am. J. Hum. Genet. 97: 228–237.
- 55. Pemberton T. J., Absher D., Feldman M. W., Myers R. M., Rosenberg N. A., and Li J. Z. 2012. Genomic patterns of homozygosity in worldwide human populations. Am. J. Hum. Genet. 91: 275–292.
- 56. 1000 Genomes Project Consortium, Auton A., Brooks L. D., Durbin R. M., Garrison E. P., Kang H. M., Korbel J. O., Marchini J. L., McCarthy S., McVean G. A., et al. 2015. A global reference for human genetic variation. Nature. 526: 68–74.
- 57. Spracklen C. N., Chen P., Kim Y. J., Wang X., Cai H., Li S., Long J., Wu Y., Wang Y. X., Takeuchi F., et al. 2017. Association analyses of East Asian individuals and trans-ancestry analyses with European individuals reveal new loci associated with cholesterol and triglyceride levels. Hum. Mol. Genet. 26: 1770–1784.
- 58. Kurano M., Tsukamoto K., Kamitsuji S., Kamatani N., Hara M., Ishikawa T., Kim B. J., Moon S., Jin Kim Y., and Teramoto T. 2016. Genome-wide association study of serum lipids confirms previously reported associations as well as new associations of common SNPs within PCSK7 gene with triglyceride. J. Hum. Genet. 61: 427–433.
- 59. Sabatti C., Service S. K., Hartikainen A. L., Pouta A., Ripatti S., Brodsky J., Jones C. G., Zaitlen N. A., Varilo T., Kaakinen M., et al. 2009. Genome-wide association analysis of metabolic traits in a birth cohort from a founder population. Nat. Genet. 41: 35–46.
- 60. Hiura Y., Shen C. S., Kokubo Y., Okamura T., Morisaki T., Tomoike H., Yoshida T., Sakamoto H., Goto Y., Nonogi H., et al. 2009. Identification of genetic markers associated with high-density lipoprotein-cholesterol by genome-wide screening in a Japanese population: the Suita study. Circ. J. 73: 1119–1126.
- 61. Lettre G., Palmer C. D., Young T., Ejebe K. G., Allayee H., Benjamin E. J., Bennett F., Bowden D. W., Chakravarti A., Dreisbach A., et al. 2011. Genome-wide association study of coronary heart disease and its risk factors in 8,090 African Americans: the NHLBI CARe Project. PLoS Genet. 7: e1001300.
- 62. Willer C. J., Sanna S., Jackson A. U., Scuteri A., Bonnycastle L. L., Clarke R., Heath S. C., Timpson N. J., Najjar S. S., Stringham H. M., et al. 2008. Newly identified loci that influence lipid concentrations and risk of coronary artery disease. Nat. Genet. 40: 161–169.
- 63. Chasman D. I., Pare G., Mora S., Hopewell J. C., Peloso G., Clarke R., Cupples L. A., Hamsten A., Kathiresan S., Malarstig A., et al. 2009. Forty-three loci associated with plasma lipoprotein size, concentration, and cholesterol content in genome-wide analysis. PLoS Genet. 5: e1000730.
- 64. Wu Y., Marvelle A. F., Li J., Croteau-Chonka D. C., Feranil A. B., Kuzawa C. W., Li Y., Adair L. S., and Mohlke K. L. 2013. Genetic association with lipids in Filipinos: waist circumference modifies an APOA5 effect on triglyceride levels. J. Lipid Res. 54: 3198–3205.
- 65. Stannard S. R., and Johnson N. A. 2004. Insulin resistance and elevated triglyceride in muscle: more important for survival than “thrifty” genes? J. Physiol. 554: 595–607.
- 66. Ginsberg H. N., Zhang Y. L., and Hernandez-Ono A. 2005. Regulation of plasma triglycerides in insulin resistance and diabetes. Arch. Med. Res. 36: 232–240.
- 67. Gao X., and Patel T. B. 2009. Regulation of protein kinase A activity by p90 ribosomal S6 kinase 1. J. Biol. Chem. 284: 33070–33078.
- 68. Smadja-Lamère N., Shum M., Déléris P., Roux P. P., Abe J., and Marette A. 2013. Insulin activates RSK (p90 ribosomal S6 kinase) to trigger a new negative feedback loop that regulates insulin signaling for glucose metabolism. J. Biol. Chem. 288: 31165–31176.
- 69. Um S. H., Frigerio F., Watanabe M., Picard F., Joaquin M., Sticker M., Fumagalli S., Allegrini P. R., Kozma S. C., Auwerx J., et al. 2004. Absence of S6K1 protects against age- and diet-induced obesity while enhancing insulin sensitivity. Nature. 431: 200–205.
- 70. Um S. H., D’Alessio D., and Thomas G. 2006. Nutrient overload, insulin resistance, and ribosomal protein S6 kinase 1, S6K1. Cell Metab. 3: 393–402.
- 71. Tremblay F., Brule S., Hee Um S., Li Y., Masuda K., Roden M., Sun X. J., Krebs M., Polakiewicz R. D., Thomas G., et al. 2007. Identification of IRS-1 Ser-1101 as a target of S6K1 in nutrient- and obesity-induced insulin resistance. Proc. Natl. Acad. Sci. USA. 104: 14056–14061.
- 72. Lipina C., and Hundal H. S. 2011. Sphingolipids: agents provocateurs in the pathogenesis of insulin resistance. Diabetologia. 54: 1596–1607.
- 73. Summers S. A. 2010. Sphingolipids and insulin resistance: the five Ws. Curr. Opin. Lipidol. 21: 128–135.
- 74. Holland W. L., and Summers S. A. 2008. Sphingolipids, insulin resistance, and metabolic disease: new insights from in vivo manipulation of sphingolipid metabolism. Endocr. Rev. 29: 381–402.
- 75. Straczkowski M., and Kowalska I. 2008. The role of skeletal muscle sphingolipids in the development of insulin resistance. Rev. Diabet. Stud. 5: 13–24.
- 76. Amati F., Dube J. J., Alvarez-Carnero E., Edreira M. M., Chomentowski P., Coen P. M., Switzer G. E., Bickel P. E., Stefanovic-Racic M., Toledo F. G., et al. 2011. Skeletal muscle triglycerides, diacylglycerols, and ceramides in insulin resistance: another paradox in endurance-trained athletes? Diabetes. 60: 2588–2597.
- 77. Mitsutake S., Date T., Yokota H., Sugiura M., Kohama T., and Igarashi Y. 2012. Ceramide kinase deficiency improves diet-induced obesity and insulin resistance. FEBS Lett. 586: 1300–1305.
- 78. Katsuki A., Sumida Y., Urakawa H., Gabazza E. C., Murashima S., Maruyama N., Morioka K., Nakatani K., Yano Y., and Adachi Y. 2003. Increased visceral fat and serum levels of triglyceride are associated with insulin resistance in Japanese metabolically obese, normal weight subjects with normal glucose tolerance. Diabetes Care. 26: 2341–2344.
- 79. Sandeep S., Gokulakrishnan K., Deepa M., and Mohan V. 2011. Insulin resistance is associated with increased cardiovascular risk in Asian Indians with normal glucose tolerance–the Chennai Urban Rural Epidemiology Study (CURES-66). J. Assoc. Physicians India. 59: 480–484.
- 80. Lin K. C., Tsai S. T., Kuo S. C., Tsay S. L., and Chou P. 2007. Interrelationship between insulin resistance and menopause on the metabolic syndrome and its individual component among nondiabetic women in the kinmen study. Am. J. Med. Sci. 333: 208–214.
- 81. Hanley A. J., Williams K., Stern M. P., and Haffner S. M. 2002. Homeostasis model assessment of insulin resistance in relation to the incidence of cardiovascular disease: the San Antonio Heart Study. Diabetes Care. 25: 1177–1184.
- 82. Daboul M. W. 2011. A study measuring the effect of high serum triglyceride and cholesterol on glucose elevation in human serum. Oman Med. J. 26: 109–113.
- 83. Aljabri K. S. J., Bokhari S. A., and Akl A. 2016. The relation between overweight, obesity and plasma lipids in Saudi adults with type 2 diabetes. Journal of Health Specialties. 4: 140–145.
- 84. Gopaul K. P., and Crook M. A. 2006. Sialic acid: a novel marker of cardiovascular disease? Clin. Biochem. 39: 667–681.
- 85. InanlooRahatloo K., Parsa A. F., Huse K., Rasooli P., Davaran S., Platzer M., Kramer M., Fan J. B., Turk C., Amini S., et al. 2014. Mutation in ST6GALNAC5 identified in family with coronary artery disease. Sci. Rep. 4: 3595.
- 86. Seidelmann S. B., Li L., Shen G. Q., Topol E. J., and Wang Q. K. 2008. Identification of a novel locus for triglyceride on chromosome 1p31-32 in families with premature CAD and MI. J. Lipid Res. 49: 1034–1038.
- 87. Nordestgaard B. G., and Varbo A. 2014. Triglycerides and cardiovascular disease. Lancet. 384: 626–635.
- 88. McBride P. 2008. Triglycerides and risk for coronary artery disease. Curr. Atheroscler. Rep. 10: 386–390.
- 89. Miller M. 1998. Is hypertriglyceridaemia an independent risk factor for coronary heart disease? The epidemiological evidence. Eur. Heart J. 19 (Suppl. H): H18–H22.
- 90. Hokanson J. E., and Austin M. A. 1996. Plasma triglyceride level is a risk factor for cardiovascular disease independent of high-density lipoprotein cholesterol level: a meta-analysis of population-based prospective studies. J. Cardiovasc. Risk. 3: 213–219.
- 91. Ashmaig M. E., Ashmeik K., Ahmed A., Sobki S., and Abdulla M. 2011. Association of lipids with coronary heart disease in a Saudi population. J. Vasc. Bras. 10: 131–136.
- 92. Howard M. F., Murakami Y., Pagnamenta A. T., Daumer-Haas C., Fischer B., Hecht J., Keays D. A., Knight S. J., Kolsch U., Kruger U., et al. 2014. Mutations in PGAP3 impair GPI-anchor maturation, causing a subtype of hyperphosphatasia with mental retardation. Am. J. Hum. Genet. 94: 278–287.
- 93. Knaus A., Awaya T., Helbig I., Afawi Z., Pendziwiat M., Abu-Rachma J., Thompson M. D., Cole D. E., Skinner S., Annese F., et al. 2016. Rare noncoding mutations extend the mutational spectrum in the PGAP3 subtype of hyperphosphatasia with mental retardation syndrome. Hum. Mutat. 37: 737–744.
- 94. Okada I., Hamanoue H., Terada K., Tohma T., Megarbane A., Chouery E., Abou-Ghoch J., Jalkh N., Cogulu O., Ozkinay F., et al. 2011. SMOC1 is essential for ocular and limb development in humans and mice. Am. J. Hum. Genet. 88: 30–41.
- 95. Wang L., Ni X., Covington K. R., Yang B. Y., Shiu J., Zhang X., Xi L., Meng Q., Langridge T., Drummond J., et al. 2015. Genomic profiling of Sezary syndrome identifies alterations of key T cell signaling and differentiation genes. Nat. Genet. 47: 1426–1434.