Table 1. Description of the Cohorts Used for the Analyses^a
Variable | QIMR^15,16 (Australia), discovery cohort | UKB^17,18 (United Kingdom), replication cohort | NTR^19 (the Netherlands), replication cohort | QSKIN^20 (Australia [limited to Queensland]), replication cohort |
---|---|---|---|---|
Sample size, No. | 15 544 | 456 426 (345 246 with PRS) | 16 434 (11 212 with PRS) | 15 726 |
Inclusion criteria^b | Adult genotyped participants of QIMR studies | Genotyped participants of the UKB | Adult genotyped participants of the NTR | Unrelated (GRM <0.1) adult genotyped participants of the QSKIN who had not previously participated in QIMR studies |
Demographics | 10 197 (65.6%) Women | 187 469 (54.3%) Women | 6727 (60.0%) Women | 8602 (54.7%) Women |
Age, mean (SD): 54.4 (13.2) | Age, mean (SD): 65.7 (8.0) | Age, mean (SD): 48.6 (17.5) | Age, mean (SD): 57.0 (7.9) | |
From 7015 families | From 4456 families | |||
Sample comprises 1119 complete MZ pairs, 1104 complete DZ pairs, and 1448 singleton twins | Sample comprises 345 258 unrelated individuals (GRM <0.05) | Sample comprises 1740 complete MZ pairs, 1114 complete DZ pairs, and 812 singleton twins | ||
Genetic data | Participants genotyped using commercial arrays | Participants genotyped using 2 closely related arrays (the UK BiLEVE and the UK Biobank Axiom Arrays) | Participants genotyped using commercial arrays | Same as QIMR |
Genotype data cleaned (by batch) for call rate (≥95%) | We used HRC imputed data provided by the biobank | Genotype data cleaned for call rate (≥95%) | ||
MAF (≥1%) | Variants with MAF <0.005%, missingness <0.05, pHWE <10⁻⁶, and MAC >5 excluded | MAF ≥0.5% | ||
Hardy-Weinberg equilibrium (P ≥ 10⁻³), GenCall score (≥0.15 per genotype; mean, ≥0.7), standard Illumina filters | Data were checked for non-European ancestry | Hardy-Weinberg equilibrium (P ≥ 10⁻¹²), allele frequency difference with GONL <0.10 | ||
Data checked for pedigree, sex, and mendelian errors and for non-European ancestry | Data checked for pedigree, sex, and mendelian errors and for non-European ancestry^21 | |||
Imputation to the 1000 Genomes reference panel (phase 3, release 5) performed on the Michigan Imputation Server^22 | ||||
Measures of population density and SES | Population density, remoteness, and SES variables generated from the postcode provided by participants at the time of last contact (1990-2015) | From the Easting and Northing coordinates rounded to the kilometer, we performed reverse geocoding to identify the postcode district in which the participants likely lived | Population density and SES derived from the participants’ most recent postcodes | Population density and SES variables were generated from the postcode provided by the participants at the time of last contact (2010-2012) |
We matched postcodes to the latest census data collected by the Australian Bureau of Statistics (2016 for population density, 2011 for remoteness and SES) | We cross-referenced this information with the population density by postcode district calculated in the 2011 census | Numbers corresponded to the neighborhood data published in 2015-2016 by the Netherlands’ national statistical agency (CBS), which defines a neighborhood as the part of a municipality that is demarcated as homogeneous in either its demographic or its socioeconomic structure | We matched the postcodes as described in the QIMR sample | |
Population density expressed in number of residents per square kilometer | Population density expressed in number of residents per hectare | Population density expressed in number of residents per square kilometer | Population density expressed in number of residents per square kilometer | |
SES based on the IRSAD,^23 which can be used to measure socioeconomic well-being on a continuum, from the most disadvantaged areas (low values) to the most advantaged areas (high values) | We used the Townsend deprivation index^17,18 as a measure of SES | SES measured using the average personal income per person and the average market value of residential properties | SES based on the IRSAD,^23 as described in the QIMR sample | |
Mean (SD) population density was 1169 (1350) people/km² (range, 0.01-5506) | Mean (SD) population density was 24.10 (25.1) inhabitants per hectare (ie, 2410 per km²) (range, 0.1-222.5) | Mean (SD) population density in the total sample was 4669 (3793) people/km² (range, 1-17 797) | Mean (SD) population density was 748 (945) people/km² (range, 0.01-3771) |
Abbreviations: DZ, dizygotic; GONL, Genome of the Netherlands; GRM, genetic relationship matrices; HRC, Haplotype Reference Consortium; IRSAD, Index of Relative Socioeconomic Advantage and Disadvantage; MAC, minor allele count; MAF, minor allele frequency; MZ, monozygotic; NTR, Netherlands Twin Register; pHWE, P value for the Hardy-Weinberg test statistic; PRS, polygenic risk scores; QIMR, QIMR Berghofer Medical Research Institute; QSKIN, QSkin Sun and Health Study; SES, socioeconomic status; UKB, United Kingdom Biobank.
^a The Supplement contains documents on genetic and phenotypic data for all cohorts.
^b European ancestry was an inclusion criterion across all cohorts.
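
Because the UKB population density is reported per hectare whereas the other cohorts report it per square kilometer, comparing cohorts requires a unit conversion (1 km² = 100 hectares). The following is a minimal sketch of that arithmetic using the UKB values from the table; the function and variable names are hypothetical and are not taken from the study's analysis code.

```python
# Illustrative unit conversion only; names are assumptions, not the authors' code.
HECTARES_PER_KM2 = 100  # 1 square kilometer = 100 hectares


def per_hectare_to_per_km2(density_per_ha: float) -> float:
    """Convert residents per hectare to residents per square kilometer."""
    return density_per_ha * HECTARES_PER_KM2


# UKB values as reported in Table 1 (inhabitants per hectare)
ukb_mean_per_ha = 24.10
ukb_range_per_ha = (0.1, 222.5)

ukb_mean_per_km2 = per_hectare_to_per_km2(ukb_mean_per_ha)
ukb_range_per_km2 = tuple(per_hectare_to_per_km2(v) for v in ukb_range_per_ha)

print(f"UKB mean density: {ukb_mean_per_km2:.0f} residents per km2")
print(
    f"UKB range: {ukb_range_per_km2[0]:.0f}-{ukb_range_per_km2[1]:.0f} "
    "residents per km2"
)
```

The converted mean matches the 2410 per km² given in the table. Because the conversion is a simple rescaling, the SD and the range scale by the same factor of 100.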