Author manuscript; available in PMC: 2024 May 7.
Published in final edited form as: Science. 2023 Jul 21;381(6655):eadf8009. doi: 10.1126/science.adf8009

The genetic architecture and evolution of the human skeletal form

Eucharist Kun 1, Emily M Javan 1, Olivia Smith 1, Faris Gulamali 2, Javier de la Fuente 3, Brianna I Flynn 1, Kushal Vajrala 1, Zoe Trutner 4, Prakash Jayakumar 4, Elliot M Tucker-Drob 3, Mashaal Sohail 5, Tarjinder Singh 6,7,8,*, Vagheesh M Narasimhan 1,9,*
PMCID: PMC11075689  NIHMSID: NIHMS1983665  PMID: 37471560

Abstract

INTRODUCTION:

Humans are the only bipedal great apes, owing to our distinctive skeletal form. Morphological changes that contribute to our skeletal form have been studied extensively in paleoanthropology. With the exception of standing height, examining the genetic basis for differential and specific growth of individual bones and their evolution has been challenging because of limited sample sizes.

RATIONALE:

One approach to studying skeletal form is to obtain a map of regions in the genome that affect skeletal development and morphology. Previously, this has been examined mainly through animal models and comparative genomics, but these approaches are largely low throughput. A complementary approach is to examine the genetic basis of variation in skeletal traits in humans. In this work, we applied deep-learning models to 31,221 full-body dual-energy x-ray absorptiometry (DXA) images from the UK Biobank to extract 23 different image-derived phenotypes that include all long-bone lengths and hip and shoulder widths, which we analyzed while controlling for height.

RESULTS:

All skeletal proportions (SPs) are highly heritable (~30 to 50%), and genome-wide association studies of these traits identified 145 independent loci. These loci are enriched in genes that regulate skeletal development as well as those that are associated with rare human skeletal diseases and abnormal mouse skeletal phenotypes. Genetic correlation and genomic structural equation modeling indicated that limb proportions exhibited strong genetic sharing but were genetically independent of width and torso proportions. Phenotypic and polygenic risk score analyses identified specific associations between osteoarthritis of the hip and knee, which are the leading causes of adult disability in the United States, and SPs of the corresponding regions. We also found genomic evidence of evolutionary change in arm-to-leg and hip-width proportions in humans, consistent with notable anatomical changes in these SPs in the hominin fossil record. In contrast to cardiovascular, autoimmune, metabolic, and other categories of traits, loci associated with these SPs are significantly enriched both in human accelerated regions and in regulatory elements of genes that are differentially expressed in humans and the great apes throughout development.

CONCLUSION:

Our work validates the use of deep-learning models on DXA images to identify specific genetic variants that affect the human skeletal form. It also ties a major evolutionary facet of human anatomical change to pathogenesis.


The human skeletal form underlies bipedalism, but the genetic basis of skeletal proportions (SPs) is not well characterized. We applied deep-learning models to 31,221 x-rays from the UK Biobank to extract a comprehensive set of SPs, which were associated with 145 independent loci genome-wide. Structural equation modeling suggested that limb proportions exhibited strong genetic sharing but were independent of width and torso proportions. Polygenic score analysis identified specific associations between osteoarthritis and hip and knee SPs. In contrast to other traits, SP loci were enriched in human accelerated regions and in regulatory elements of genes that are differentially expressed between humans and great apes. Combined, our work identifies specific genetic variants that affect the skeletal form and ties a major evolutionary facet of human anatomical change to pathogenesis.

Graphical Abstract

The genetic basis, evolution, and health consequences of human skeletal traits. (A) Measurement of SPs using a deep learning–based landmark estimation method on full-body DXAs. (B) Location of loci that localize to a single protein-coding gene and are associated with various SPs, colored according to the scheme in (A). (C) Significant phenotypic and genetic associations of various SPs with musculoskeletal disease or joint pain. Number notations in parentheses are the ICD-10 (International Classification of Diseases, Tenth Revision) codes associated with each disease. OA, osteoarthritis; TFA, tibiofemoral angle. (D) SPs with genomic evidence of human-specific evolution. Illustration was created with BioRender.com.


Humans are the only primates who are normally bipedal, owing to our distinctive skeletal form, which stabilizes the upright position. Bipedalism is enabled by specific anatomical properties of the human skeleton, including shorter arms relative to legs, a narrow body and pelvis, and the orientation of the vertebral column (1–3). These broad changes to skeletal proportions (SPs) likely began to occur around the separation of the human and chimpanzee lineages, and as a result, may have facilitated the use of tools and accelerated cognitive development (4, 5). Fossil evidence showing major morphological changes in the length of the limbs, torso, and body width suggests that these changes were gradual, with incremental development over the course of several million years (6, 7). However, despite more than a hundred years of effort in paleoanthropology documenting morphological changes of the skeletal form in human evolution, evidence of genomic change has been elusive.

In developmental biology, the mechanisms and processes underlying animal limb development, morphology, and broad body plan have been studied extensively. Early work using forward genetic screens in Drosophila identified homeobox genes as key regulators of anatomical development in invertebrates (8). Subsequent experiments in vertebrates, including fish, chickens, and mice, identified additional gene families that are crucial in the regulation of skeletal development and form (9, 10). Comparative genomic and evolutionary developmental biology approaches have produced several insights into the genetic basis of skeletal structure, from the underpinnings of convergent limb loss in snakes and limbless lizards (11, 12) to increased limb lengths in jerboas when compared with mice (13). However, these approaches do not provide an unbiased and comprehensive map of the genetic loci that regulate SPs and overall body plan. In addition, many of these approaches largely focus on examining the impact of loss-of-function mutations, which often have widespread effects on the entire skeleton. The subset of genes responsible for differential and specific growth of individual bones remains unknown.

Genome-wide association studies (GWASs) of human skeletal traits are a direct and complementary approach to characterizing the genetic basis of traits. Twin studies suggest that the heritability of SPs ranges between 0.40 and 0.80 (14), similar to the heritability of standing height (15), a skeletal trait that has served as an exemplary quantitative trait in human genetics. Meta-analysis of more than 5 million individuals has identified a saturated map of common genetic variants that are associated with standing height (16). However, height is among the most straightforward and accurate of quantitative traits to measure. Other skeletal elements, such as limb, torso, and shoulder lengths, are not typically or comprehensively measured in large sample sizes (17, 18). As a result, the genetic basis of such proportions and lengths remains understudied. Furthermore, anthropometric traits, like hip and waist circumferences, are measured externally and therefore are intrinsically tied to body-fat percentage and distribution, which fails to isolate genetic effects specific to the skeletal frame (19, 20).

Applying deep-learning methods to non-invasive medical imaging is a powerful way to extract skeletal measures in an accurate and scalable manner. Furthermore, the collection of genetic, phenotypic, and imaging data by national biobanks provides an opportunity to run GWASs for image-derived phenotypes (IDPs) with sufficiently large sample sizes. Several genetic studies have successfully applied computer vision to generate IDPs of the retina, distribution of body fat, heart structure, and liver-fat percentage and have linked significant loci to various disorders (21–24).

In the context of musculoskeletal disease, epidemiological data suggest that disorders such as osteoarthritis (OA), the leading cause of adult disability in the United States (25, 26), are thought to be influenced by a variety of risk factors that range across obesity, mechanical stresses, genetic factors, and even the geometric structure of certain bones (27). Although some small studies have examined the relationship of certain skeletal element lengths such as leg-length discrepancy and OA (28), how the skeletal frame may exacerbate an individual’s development of osteoarthritic disease has not been fully studied (27, 29).

In this study, we applied methods in computer vision to derive comprehensive human skeletal measurements from full-body dual-energy x-ray absorptiometry (DXA) images at biobank scale. We then performed genome-wide scans on 23 generated phenotypes to identify loci associated with variation in the skeletal form. Using summary statistics from these IDPs, we identified biological processes linked with human SPs and studied the phenotypic and genetic correlation between these measures and a range of external phenotypes, with an emphasis on musculoskeletal disorders. Finally, we investigated the impact of natural selection on these traits to understand how skeletal morphology is linked to human evolution and bipedalism.

Results

A deep-learning approach for quality control and quantification of biobank-scale imaging data

To study the genetic basis of human SPs, we jointly analyzed DXA and genetic data from 42,284 individuals in the UK Biobank (UKB). Individuals from this dataset are between 40 and 80 years old and reflect adult skeletal morphology. We report baseline information about our analyzed cohort in (30) and in table S1. We acquired 328,854 DXA scan images across eight imaging modalities comprising full-body transparent images, full-body opaque images, anteroposterior (AP) views of the left and right knees, AP views of the hips, and AP and lateral views of the spine. For quality control (QC), we first developed a deep learning–based multiclass predictor to select full-body transparent images from the pool of eight total imaging modalities. We developed a second deep-learning classifier to remove cropping artifacts. Finally, we excluded images with atypical aspect ratios and padded them to uniform sizes (30) (Fig. 1A). After our QC process, we were left with 39,469 images for analysis.

Fig. 1. Deep learning–based image quantification.

(A) QC process. Deep learning–based classifiers were used to select full-body images from a pool of DXA images of different body parts, as well as to remove images with artifacts, resolution, or cropping issues. Full-body images were then padded to standardize image pixel size before phenotyping (the image presented here shows padding of 5 pixels on each side). (B) Image quantification. Deep learning–based image landmark estimation using the HRNet architecture is shown. During this process, 297 training images annotated with specific landmarks were used to train the model to automatically annotate landmarks on the remaining images in the dataset, from which skeletal lengths and other measurements were calculated. (C) Average HRNet measurement error when compared with human-derived measurements of the tibia across 100 validation images. (D) Correlation of length measurements and height. (E) Correlation between left- and right-side measurements of the femur, humerus, forearm, and tibia. (F) Correlation of lengths measured from the first and second imaging visits for the same individual.

After image QC, we manually labeled 14 landmarks at pixel-level resolution on 297 images for use as training data. These labels were independently validated by an orthopedic team. The 14 landmarks include the major joints—the wrist, elbow, shoulder, hip, knee, and ankle—and the position of each eye. The segments connecting these landmarks reflect natural measurements for long-bone lengths or body-width measures. We assessed the replicability of manual annotation by inserting 20 duplicated images from the 297 training images without the knowledge of the annotator and found that repeat measurements resulted in a difference of less than 2 pixels at any landmark (30) (Fig. 1B).

We adapted and applied a new computer vision architecture, High-Resolution Net (HRNet), for landmark estimation, or the prediction of the location of human joints (31). There are four main reasons why we chose HRNet. First, HRNet maintains high-resolution representations throughout the model (30), and we wanted to use the high-resolution medical images produced by the DXA scanner to obtain precise measurement information of bone lengths. Second, the architecture had already been trained on two large imaging datasets, first on ImageNet (32), a general natural image dataset, and then subsequently on Common Objects in Context (COCO) (33), a dataset of more than 200,000 images of humans in natural settings with joint landmarks classified. These two previous layers of training enabled us to perform transfer learning to fine-tune the architecture on our training data and reduce the total amount of manual annotation to just 297 images. Third, HRNet has among the best performance for a similar task of labeling human joints on two large-scale benchmarking datasets of human subjects (33, 34). Finally, we directly compared the performance of the HRNet architecture with a more traditional architecture on our dataset (ResNet-34) (35) and obtained significantly better results across different training parameter choices (30) (table S2). Upon training, the model achieved greater than 95% average precision on hold-out validation data across all body parts (table S2).

Validation of human skeletal length estimates

After training and validating the deep-learning model on the 297 manually annotated images, we applied this model to predict the 14 landmarks on the rest of the 39,172 full-body DXA images. We then calculated pixel distances between pairs of landmarks that corresponded to seven bone and body-length segments (30) (Fig. 1B and table S3). We also computed an angle measure between the tibia and the femur (tibiofemoral angle, or TFA) (Fig. 1B). To standardize images with different aspect ratios, we rescaled pixels into centimeters for each image resolution by regressing the height in pixels against standing height in centimeters as measured by the UKB assessment (30). We then removed individuals with any skeletal measurements that were more than four standard deviations from the mean.
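The geometric steps described above (landmark-to-landmark distances, the tibiofemoral angle, pixel-to-centimeter rescaling, and outlier filtering) can be illustrated with a minimal sketch. It is not the authors' pipeline: the landmark coordinates, the function names, the angle convention (the angle at the knee between the hip and ankle directions), and the single scale factor are all assumptions made for illustration, whereas the study derived its scale by regression for each image resolution.

```python
import math
from statistics import mean, stdev


def segment_length(a, b):
    """Euclidean distance in pixels between two (x, y) landmarks."""
    return math.dist(a, b)


def angle_at(vertex, p, q):
    """Angle in degrees at `vertex` between the rays vertex->p and vertex->q."""
    v1 = (p[0] - vertex[0], p[1] - vertex[1])
    v2 = (q[0] - vertex[0], q[1] - vertex[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cos_t = dot / (math.hypot(*v1) * math.hypot(*v2))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def within_sd(values, n_sd=4.0):
    """True for each value lying within n_sd sample SDs of the mean."""
    m, s = mean(values), stdev(values)
    return [abs(v - m) <= n_sd * s for v in values]


# Hypothetical landmarks (pixel coordinates) for one leg.
hip, knee, ankle = (100.0, 200.0), (100.0, 350.0), (100.0, 480.0)

femur_px = segment_length(hip, knee)
tibia_px = segment_length(knee, ankle)
tfa_deg = angle_at(knee, hip, ankle)  # a perfectly straight leg gives 180

# Assumed scale: standing height in cm divided by height in pixels.
cm_per_px = 170.0 / 680.0
femur_cm = femur_px * cm_per_px
```

In a full dataset, `within_sd` would be applied to each measurement across individuals, and anyone with a `False` flag on any measurement would be dropped, mirroring the four-standard-deviation rule described above.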

After outlier removal, we validated the accuracy of our measurements on the remaining samples in four ways. First, the error rate for segment length from the model compared with manual annotation was, at maximum, 3 pixels or 0.7 cm, which is similar to the variation from manual annotation of the 20 duplicate images. Reliability (100% minus the variance in measurement divided by the variance of a segment length) was greater than 95% across all length measures (30) (Fig. 1C and tables S4 to S6). Second, the correlation between long-bone lengths and height as measured in the UKB was ~0.88, which falls within the expectation observed in the literature (17) (Fig. 1D). Third, the correlation between left and right limb lengths was greater than 0.99 (Fig. 1E). Fourth, a subset of 667 individuals had undergone repeat imaging an average of 2 years apart, with different image aspect ratios, DXA machines, software models, and technicians carrying out the imaging (Fig. 1F). The correlation in these technical replicates across skeletal elements was also greater than 0.99. Taken together, these results suggest that the IDPs from our deep-learning model are highly accurate and highly replicable.
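One way to write the reliability measure, assuming it is read as the share of segment-length variance not attributable to measurement error (this reading is an assumption; the exact computation is in the supplementary materials), is:

```latex
\text{Reliability} = 100\% \times \left( 1 - \frac{\sigma^{2}_{\text{measurement}}}{\sigma^{2}_{\text{segment}}} \right)
```

As a purely hypothetical illustration, a measurement-error standard deviation of 0.3 cm against a between-individual standard deviation of 2.5 cm would give 1 − 0.09/6.25 = 0.9856, or about 98.6%.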

Characteristics and correlations of human SPs with sex, age, and height

From the seven bone and body-length segments, we examined these IDPs as proportions instead of lengths (that is, we controlled for variation in overall height, which is highly correlated with each of these lengths) by taking simple ratios of each IDP with overall height (30) (Fig. 1B). We also carried out this normalization analysis in alternate ways, including using height as a covariate in association tests as well as regressing each IDP on height and obtaining residuals. All three approaches were highly correlated, and we used the simple approach of taking proportions for most analyses (30). As expected, this greatly reduced the overall correlation of our traits with height (table S7). In addition to obtaining ratios of each segment length with overall height, we also computed ratios of segments with each other and obtained a total of 21 different ratio IDPs along with the angle measure (TFA) (table S3). These ratios are referred to in the text as Segment:Segment (Hip Width:Height, Torso Length:Legs, and so on).
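Two of the height-normalization strategies described above, simple ratios and regression residuals, can be compared in a short sketch. The measurements below are invented for illustration, and the helper names are hypothetical; the study's own analysis is described in its supplementary materials.

```python
from statistics import mean


def ols_residuals(y, x):
    """Residuals of y after simple least-squares regression on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    ma, mb = mean(a), mean(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    var_a = sum((ai - ma) ** 2 for ai in a)
    var_b = sum((bi - mb) ** 2 for bi in b)
    return cov / (var_a * var_b) ** 0.5


# Hypothetical measurements in centimeters (not study data).
height_cm = [160.0, 165.0, 170.0, 175.0, 180.0]
femur_cm = [41.0, 43.5, 44.0, 46.5, 47.0]

# Approach 1: simple proportion of each segment to overall height.
femur_ratio = [f / h for f, h in zip(femur_cm, height_cm)]

# Approach 2: residuals after regressing the segment length on height.
femur_resid = ols_residuals(femur_cm, height_cm)

# Residuals are uncorrelated with height by construction; ratios are
# only approximately so, which is why the two are worth comparing.
r_resid_height = pearson(femur_resid, height_cm)
r_ratio_height = pearson(femur_ratio, height_cm)
```

The third approach, including height as a covariate in the association test, happens inside the association model itself and is not shown here.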

We then examined differences in SPs across sex and age. In line with well-known observations, Hip Width:Height (Student’s t test p < 10⁻¹⁵) and Torso Length:Height (Student’s t test p < 10⁻¹⁵) were significantly larger in women than in men (36), but we also observed that Humerus:Height was significantly larger in women than in men (Student’s t test p = 1.45 × 10⁻⁵) (30) (table S8). In addition, we found that all body proportions vary slightly but significantly as a function of age (30) (table S9). We also examined how body proportions vary as a function of overall height and found that Torso Length:Legs decreases with height [Pearson correlation (r) = −0.21], suggesting that increases in height are driven more by increasing leg length than by torso length (Fig. 2A). Arms:Legs also decreases with height (r = −0.02), meaning that leg length also outpaces arm length as height increases. Within each limb, for both arms and legs, lower to upper limb ratios (Tibia:Femur, Forearm:Humerus) increase with overall limb length. These increases also correspond with correlations with height, with Tibia:Femur increasing when height increases (r = 0.12).

Fig. 2. Genetic architecture.

(A) Correlation of SPs and overall height. Bars show ±2 SE. (B) Genotype (lower-left triangle) and phenotype (upper-right triangle) correlation of SPs. Overall correlation is shown in color, and the p value of the correlation is visualized by size. A Bonferroni-corrected threshold is also shown. (C) Solution for a genomic SEM model for the genetic covariance structure shown in (B) shows one common factor loading for arms, an additional factor for legs, and independent factors for each of the torso-related traits (hip width, shoulder width, and torso length). (D) Sex-specific analysis showing the ratio of the standardized effect size of the polygenic score on each trait (±2 SE) in males to the effect in females in a hold-out dataset.

GWASs of human SPs

We performed GWASs using imputed genotype data in the UKB to identify variants associated with each skeletal measure. We applied standard variant and sample QC and focused our analyses on 31,221 individuals of white British ancestry, as determined by the UKB genetic assessment, and 7.4 million common biallelic single-nucleotide polymorphisms (SNPs) with minor allele frequency >1% (30, 37) (tables S1 and S10). We used BOLT-LMM (38) to regress variants on each skeletal measure using a linear mixed-model association framework. After generating summary statistics for each skeletal measure, we estimated SNP heritability using LD Score regression (LDSC) (39) and GCTA-REML (40). All traits were highly heritable, with SNP heritability between 23 and 53% for LDSC and between 17 and 50% for GCTA-REML (tables S11 and S12). We detected inflation in test statistics in our quantile-quantile (QQ) plots (mean inflation, λ = 1.20); however, minimal deviation of univariate LDSC intercepts from 1.0 suggested that this inflation was consistent with polygenicity rather than confounding (30) (Fig. 3B).
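For readers unfamiliar with the inflation statistic, the sketch below computes the conventional genomic-control λ from a list of association p values: the median observed 1-degree-of-freedom chi-square statistic divided by its expected median under the null. That this matches the exact definition of λ used in the study is an assumption, and the uniform p values are synthetic.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()

# Median of a chi-square distribution with 1 degree of freedom (~0.455),
# obtained as the squared 75th percentile of the standard normal.
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def chi2_from_p(p):
    """1-df chi-square statistic for a two-sided p value with 0 < p <= 1."""
    z = _STD_NORMAL.inv_cdf(p / 2.0)
    return z * z


def genomic_inflation(p_values):
    """Genomic-control lambda: median observed chi-square over null median."""
    return median(chi2_from_p(p) for p in p_values) / CHI2_1DF_MEDIAN


# Synthetic, evenly spaced p values mimic a null GWAS, so lambda is ~1.
null_p = [(i + 0.5) / 1000 for i in range(1000)]
lam_null = genomic_inflation(null_p)
```

A value above 1 indicates that test statistics are larger than expected under the null; as the text notes, the LDSC intercept is what helps distinguish polygenicity from confounding as the source.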

Fig. 3. Genome-wide association results.

(A) Manhattan plot of a GWAS performed across seven SPs and TFA; the lowest p value for any trait at each SNP is annotated. Loci over the genome-wide significance threshold that are close to only a single gene are annotated. (B) Shown are mean values of proportion and angle traits across individuals, the total number of genome-wide significant loci per trait, heritability (GCTA-REML), λ (from LDSC), and associated genes of loci that are specific to each skeletal trait (again annotating only loci that map to a region with a protein-coding gene within 20 kb of each clumped region). Illustration was created with BioRender.com.

In the seven SPs as a ratio of height (Forearm:Height, Humerus:Height, Tibia:Height, Femur:Height, Hip Width:Height, Shoulder Width:Height, Torso Length:Height) and TFA, we identified 223 loci at p < 5 × 10⁻⁸ and 150 loci at p < 6.25 × 10⁻⁹ (Bonferroni correction for eight traits). Of these loci, 145 are independently significant [linkage disequilibrium (r²) < 0.1] across all eight phenotypes (92 after Bonferroni correction for eight traits). Of the 145 independent loci, 37 loci are only significant in SPs after conditioning on all SNPs discovered in a saturated GWAS for height (16, 30) (table S13). As a sensitivity analysis, we also examined the genetic effect of skeletal lengths before and after height adjustment and found that 95% of genome-wide significant loci had the same direction of effect when carrying out GWASs in these alternate ways (30).
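The stricter threshold quoted above follows directly from dividing the conventional genome-wide significance level by the number of traits tested:

```latex
\alpha_{\text{Bonferroni}} = \frac{5 \times 10^{-8}}{8} = 6.25 \times 10^{-9}
```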

Genetic correlations and factor analysis of SPs

We calculated the genetic correlation between each pair of traits to investigate the degree of genetic sharing between each skeletal measure. Estimates from LDSC and GCTA-REML were virtually identical (fig. S10); in this work, we report estimates from GCTA-REML. Limb proportions had positive genetic correlations with each other (rg = 0.34 to 0.55). Upper arms and legs (Humerus:Height–Femur:Height rg = 0.55, p = 1.59 × 10⁻⁶⁶) and lower arms and legs (Forearm:Height–Tibia:Height rg = 0.51, p = 6.01 × 10⁻⁵⁰) were significantly more correlated than upper arms and lower legs (Humerus:Height–Tibia:Height rg = 0.38, p = 5.18 × 10⁻²³) or lower arms and upper legs (Forearm:Height–Femur:Height rg = 0.34, p = 1.49 × 10⁻¹⁸). Body-width proportions, Hip Width:Height and Shoulder Width:Height, were largely uncorrelated with limb-length proportions (30). No correlations involving any pairwise combination of arm and width traits were significant (the minimum p value across all such correlations was ≥0.0022, which was above our Bonferroni threshold). Correlations between leg and width traits were marginally significant in three out of four comparisons, with the maximal correlation (Hip Width:Height–Tibia:Height) being 0.23 (Fig. 2B and table S13). In addition, we also computed phenotypic correlations between our traits, which were highly concordant with genetic correlations (r = 0.98).

We used genomic structural equation modeling (genomic SEM) to produce an empirically derived low-dimensional representation of the genetic covariance structure of the individual SPs (41). We performed exploratory factor analysis to identify the likely number of factors and built confirmatory models using odd-numbered chromosomes for model building and even-numbered chromosomes for validation, which we compared using a range of model fit indices (30). Our preferred model of the genetic covariance structure revealed five main factors that governed SPs. First, we identified a single broad factor (Skeletal factor) that represents dimensions of genetic variation that are statistically pleiotropic; that is, genetic variation represented in each factor contributes to variation in not just one phenotype but to variation in multiple phenotypes (30). All limb traits [both arms (Humerus:Height and Forearm:Height) and legs (Femur:Height and Tibia:Height)] load positively on this general Skeletal factor (on which Torso Length:Height loads negatively), but the arm traits additionally load on a second factor. Torso length and body-width traits (Hip Width:Height and Shoulder Width:Height) only load appreciably on trait-specific factors (Fig. 2C). That torso and body-width proportions do not load appreciably on either the general Skeletal factor or the Arm factor reinforces our observations from the pairwise bivariate genetic correlation analysis in which arm and leg proportions were largely independent of torso and body-width proportions. Moreover, the genomic SEM results produce insights that inspection of the genetic correlation matrix by itself does not (30). We find that genetic sharing between the two components of leg length (femur and tibia) does not represent genetic variation specific to leg growth per se but rather represents a more general dimension of genetic variation shared with the upper limbs (forearm and humerus). By contrast, the upper limbs specifically share genetic variation with one another (as indexed by the Arm factor) above and beyond a more general dimension of skeletal limb proportions (30).

Sex-specific heritabilities and genetic effects of SPs

Anthropometric and skeletal traits, such as hip width, are common examples of sexual dimorphism. We found that for most traits, the genetic correlation of SPs between males and females was not statistically different from a value of one except for TFA (rg = 0.89) (30) (fig. S16). For five out of the seven SPs, both of the sex-specific SNP heritabilities were greater than the heritability estimated jointly with both sexes (fig. S17).

To test for pervasive differences in the magnitude of genetic effects, we performed sex-specific GWASs of all the skeletal traits and evaluated these polygenic scores in both sexes in a hold-out dataset (30). This method has recently been applied to examine sex-specific effects in biobank traits (42). Across all SPs that we tested, polygenic scores had a significantly larger standardized effect size (standardized in males and females separately) in males compared with females (Student’s t test p < 1 × 10⁻³ for all comparisons) (Fig. 2D). These results are in line with previous work suggesting that SPs, like other anthropometric traits, have clear differences in the magnitude of sex-specific effects when compared with other quantitative traits in the UKB (42).

Biological insights from skeletal associations

We performed gene set enrichment analyses in 10,678 gene sets using functional mapping and annotation (FUMA) of GWASs to identify biological processes and pathways enriched in each skeletal trait (30, 43). After false discovery rate (FDR) correction (FDR < 0.05), we found 195 gene sets to be significantly enriched across our seven skeletal traits. Several gene sets related to development were common across most traits, such as skeletal system development, connective tissue development, chondrocyte differentiation, and cartilage development (table S15).

Furthermore, common alleles associated with SPs were significantly enriched in 701 autosomal genes linked to “skeletal growth abnormality” in the Online Mendelian Inheritance in Man (OMIM) (44) database (p < 5.0 × 10⁻²) except genes associated with torso length (p = 0.22) (tables S16 and S17). Combined, these results indicate that common variants associated with SPs pinpointed genes in which rare coding variants contribute to Mendelian musculoskeletal disorders. To determine if loci discovered in our GWASs had been implicated in previous genetic studies, we queried the GWAS catalog (45) for each of the 145 independent SNPs in our study. As expected, the largest overlaps were seen with anthropometric traits (table S18).

Out of the total loci identified across GWASs (table S18), 45 loci overlapped a single protein-coding gene within 20 kb of each clumped region. Notably, of these 45 genes, 32 (or 71%) resulted in abnormal skeletal phenotypes when disrupted in mice using the Human-Mouse Disease Connection database (46). Four of these genes (COL11A1, SOX9, FN1, and ADGRG6) were associated with rare skeletal diseases in humans, as annotated in OMIM (table S20). In some cases, a gene linked with a specific SP in our GWASs resulted in a defect in the same skeletal trait in mouse models. We found that a common variant (rs6546231) near MEIS1, a homeodomain transcription factor, is associated with increased Forearm:Height. Mouse models of MEIS−/− mice are specifically associated with abnormal forelimb development (47). Similarly, a common variant (rs1891308) near ADGRG6, which encodes a G protein–coupled receptor, is associated with increased torso length. Mice with conditional knockouts in ADGRG6 have spine abnormalities that reduce torso length (48). Thus, our GWAS of SPs identifies genes that were previously associated with skeletal developmental biology and Mendelian skeletal phenotypes, demonstrating the potential for future functional and knockout studies.

Next, we conducted a transcriptome-wide association study (TWAS) that linked predicted gene expression in skeletal muscle [based on the Genotype-Tissue Expression project (GTEx v.7) (49)] with our SP GWAS. In total, we identified 30 genes that were significantly associated with any one of our skeletal traits at a Bonferroni-corrected significance threshold across the total number of gene and trait combinations (30) (table S21). Among the strongest TWAS associations were PAX1 (TWAS z-score = 12.6, p = 1.31 × 10⁻³⁶), a transcription factor that is critical in fetal development and is associated with development of the vertebral column, and FGFR3 (TWAS z-score = 6.5, p = 8.52 × 10⁻¹¹), a fibroblast growth factor receptor that plays a role in bone development and maintenance.

Genetic and phenotypic association of skeletal phenotypes with musculoskeletal disease

To investigate the clinical relevance of human SPs, we examined their genetic and phenotypic associations with musculoskeletal disease and with joint and back pain. We used logistic regression to examine phenotypic associations between skeletal morphology and these musculoskeletal disorders (Fig. 4A) while controlling for age, sex, bone-mineral density, body mass index (BMI), and other major risk factors for OA (50). We found that a one-standard-deviation increase in Hip Width:Height was associated with increased odds of hip OA [p = 3.16 × 10⁻⁵, odds ratio (OR) = 1.34]. Similarly, Femur:Height, Tibia:Height, and the TFA, which are all skeletal measures of the knee joint, were associated with increased risk of knee OA (p = 2.24 × 10⁻¹⁵, OR = 1.34; p = 6.09 × 10⁻⁵, OR = 1.16; p = 1.64 × 10⁻³⁵, OR = 1.49). Femur:Height and the TFA were also significantly associated with internal derangement of the knee (p = 4.03 × 10⁻⁶, OR = 1.19; p = 1.43 × 10⁻¹⁷, OR = 1.34). Pain phenotypes for hip and knee joints were also associated with the specific SPs that make up each joint (hip pain with Hip Width:Height: p = 8.53 × 10⁻⁵, OR = 1.12; knee pain with Femur:Height, Tibia:Height, and TFA: p = 8.13 × 10⁻⁶, OR = 1.09; p = 2.89 × 10⁻⁵, OR = 1.09; p = 1.66 × 10⁻⁴⁶, OR = 1.31) (30) (Fig. 4A and table S22).

Fig. 4. Association between skeletal traits and musculoskeletal disease.

(A) Phenotypic associations from logistic regression analysis of musculoskeletal disease traits on skeletal phenotypes. (B) Polygenic risk score associations between musculoskeletal disease traits and skeletal phenotypes. For both (A) and (B), associations that are significant after Bonferroni correction are annotated with an asterisk. ORs for the phenotypic associations and polygenic risk scores are shown in different colors, and the p values are represented by size. The number notations in parentheses are the ICD-10 codes associated with each disease: M54, dorsalgia; M16, coxarthrosis (arthrosis of hip); M17, gonarthrosis (arthrosis of knee); and M23, internal derangement of knee.

Next, we analyzed 361,140 UKB participants who had not undergone DXA imaging and were of white British ancestry for predictive risk based on polygenic scores derived from our GWAS on SPs on the imaged set of individuals (Fig. 4B). We generated polygenic scores with Bayesian regression and continuous shrinkage priors (51) using the significantly associated SNPs and ran a phenome-wide association study of the generated risk scores and traits, adjusting for the first 20 principal components of ancestry and imputed sex (30). Polygenic scores of Hip Width:Height and TFA were associated with an increased incidence of hip and knee OA, respectively (p = 7.92 × 10−5, OR = 1.04; p = 1.73 × 10−4, OR = 1.04), in line with the phenotypic associations. In addition, we also saw significant association between back pain [both recorded on the ICD-10 (International Classification of Diseases, Tenth Revision) code and self-reported] and Torso Length:Height (p = 5.59 × 10−5, OR = 1.05; p = 5.71 × 10−6, OR = 1.02) (table S23). Neither the OA nor the musculoskeletal pain phenotypes that we tested were significantly associated with overall height in this analysis [phenotypic associations: 1.10 × 10−2 < p < 8.51 × 10−1; polygenic risk score associations: 2.17 × 10−3 < p < 3.88 × 10−1] except for polygenic risk scores of height and back pain (p = 5.76 × 10−10) (tables S22 and S23). In genomic SEM analyses, we observed similar patterns of genetic associations with musculoskeletal diseases at the level of general genetic factors (30) (fig. S13 and table S24). Taken together, these analyses suggest that increases in the length of skeletal elements that are associated with the hip, knee, and back as a ratio of overall height are exclusively associated with an increased risk of arthritis and pain phenotypes in those specific areas.
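The scoring step of a polygenic score can be sketched as a weighted sum of allele dosages. The Python example below is a minimal illustration that assumes per-SNP weights are already available (for instance, posterior effect sizes from a shrinkage method); it does not implement the Bayesian shrinkage itself, and the dosages and weights shown are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def polygenic_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of effect-allele dosages (0 to 2) for one individual."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def standardize(scores: Sequence[float]) -> List[float]:
    """Rescale scores to mean 0 and SD 1 so that ORs are per SD of score."""
    mu = mean(scores)
    sd = pstdev(scores)
    if sd == 0:
        raise ValueError("scores have zero variance")
    return [(s - mu) / sd for s in scores]


if __name__ == "__main__":
    # Hypothetical per-SNP weights and dosages for three individuals.
    weights = [0.10, -0.20, 0.05]
    cohort = [[0, 1, 2], [2, 0, 1], [1, 1, 1]]
    raw = [polygenic_score(person, weights) for person in cohort]
    print(standardize(raw))
```

Standardized scores of this kind would then enter a logistic or linear regression together with covariates such as ancestry principal components and sex.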

Evolutionary analysis

As human SPs are an important part of our transformation to bipedalism, we next investigated whether variants associated with SPs have undergone accelerated evolution in humans in two ways. First, following a procedure by Richard et al. (52) and Xu et al. (53), we examined whether genes associated with SPs overlapped human accelerated regions (HARs) more often than expected by chance. HARs are segments of the genome that are conserved throughout vertebrate and great ape evolution but are notably different in humans (54). We generated a null distribution by randomly sampling regions matched for overall gene length (30) (Fig. 5A). For comparison, we also performed the same analysis on summary statistics from the ENIGMA Consortium (55) and several common quantitative and disease traits from the UKB (table S25). Genetic signals from several of the SP traits, in particular arm or leg length, were significantly enriched in HARs (Arms:Legs, Humerus:Height, Arms:Height, Hips:Legs, Tibia:Femur, and Hip Width:Height had FDR-adjusted p < 0.05). We also observed nominal enrichment for traits related to hair pigmentation (FDR-adjusted p = 0.013), which has also changed substantially in humans compared with the great apes, and for schizophrenia (FDR-adjusted p = 1.61 × 10−34). However, no enrichment (FDR-adjusted p > 0.05) was observed for HARs in autoimmune disorders, cardiovascular disease, cancer, and overall height (Fig. 5A).
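The logic of testing HAR overlap against a length-matched null can be sketched as a simple resampling test. The Python example below is a simplified illustration and not the authors' implementation: the matching rule (background genes within 10% of each trait gene's length), the tuple layout, and the toy gene lists are all assumptions made for this sketch.

```python
import random
from typing import List, Sequence, Tuple

Gene = Tuple[int, bool]  # (gene length in bp, overlaps at least one HAR)


def length_matched_null(trait_genes: Sequence[Gene], background: Sequence[Gene],
                        n_draws: int, tolerance: float = 0.1,
                        seed: int = 0) -> List[int]:
    """Null distribution of HAR-overlap counts.

    Each draw swaps every trait gene for a random background gene whose
    length is within `tolerance` (as a fraction) of the trait gene's length.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_draws):
        total = 0
        for length, _ in trait_genes:
            lo, hi = length * (1 - tolerance), length * (1 + tolerance)
            pool = [g for g in background if lo <= g[0] <= hi]
            if not pool:
                raise ValueError(f"no background gene matches length {length}")
            total += int(rng.choice(pool)[1])
        counts.append(total)
    return counts


def empirical_p(observed: int, null_counts: Sequence[int]) -> float:
    """One-sided empirical p value with the standard +1 correction."""
    exceed = sum(1 for c in null_counts if c >= observed)
    return (exceed + 1) / (len(null_counts) + 1)


if __name__ == "__main__":
    # Toy data: two trait genes, a small background set.
    trait = [(1000, True), (5000, True)]
    background = [(950, False), (1050, True), (4800, False), (5200, False)]
    observed = sum(int(g[1]) for g in trait)
    null = length_matched_null(trait, background, n_draws=1000)
    print(empirical_p(observed, null))
```

With the +1 correction, the smallest attainable p value is 1/(n_draws + 1), so the number of draws bounds the resolution of the test.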

Fig. 5. Evolutionary analyses.

(A) Shown are p values of enrichment for overlap of HARs with genes associated with SP, autoimmune, dermatological, neurological, endocrine, gastrointestinal, metabolic, psychiatric, and cancer-related traits compared with randomly sampled genes of comparable length. Traits below the FDR-corrected threshold (0.05) are shown in orange, and nonsignificant traits are shown in blue. (B) Meta-analysis of LDSC heritability enrichment across 21 SP traits for different evolutionary annotations that represent different divergence points in human evolution. Annotations represented in colors refer to fetal human–gained enhancers and promoters (blue), adult human–gained enhancers and promoters (orange), ancient selective sweeps (purple), putatively introgressed variants from Neanderthals (teal), and genomic regions depleted in Neanderthal and Denisovan ancestry (teal). Blue and orange intervals mark epigenetic annotations, whereas the other color intervals mark genetic annotations. Asterisks show significance at FDR < 0.05. A dashed line is drawn at y = 1 (no heritability enrichment). This analysis was jointly performed with all genomic annotations in the baseline LDv2.2 model. KYA, thousand years ago; MYA, million years ago. (C) Heritability enrichment analysis in human-gained enhancers and promoters at 7 pcw for each trait analyzed. Asterisks show significance at FDR < 0.05 across all genomic annotations and traits analyzed in this study. A dashed line is drawn at x = 1 (no heritability enrichment). Error bars show 1 SE around each estimate. (D) Arm:Leg ratio and Hip Width:Height are the only two skeletal traits that show significant enrichment in both types (HARs and heritability across differentially regulated regions at 7 pcw) of evolutionary analysis. Illustration was created with BioRender.com.

Second, we examined heritability enrichment using LDSC on genomic annotations that reflect divergence at different time points in human evolution (Fig. 5B) following an approach outlined in Sohail (56) and Hujoel et al. (57). These annotations include regions that differ in gene regulation between humans and primates through stages of early development (58), regions that differ in expression between adult humans and macaques (59), and regions that are enriched and depleted of ancestry from archaic humans (60, 61). We then computed heritability enrichment, h2(C), which measures the proportion of heritability in an annotation set divided by the proportion of SNPs in the annotation. In our analysis, we also simultaneously incorporated other regulatory elements, measures of selective constraint, and linkage statistics (baseline LDv2.2 with 97 annotations) (57, 62–64) to estimate h2(C) while minimizing bias due to model misspecification (30).
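The definition of heritability enrichment given above reduces to a ratio of two proportions, which the following Python sketch spells out. It covers only this final ratio and assumes the per-annotation heritability has already been estimated (in practice jointly with other annotations by stratified LDSC); the function name and all numbers in the example are hypothetical.

```python
def heritability_enrichment(h2_annotation: float, h2_total: float,
                            snps_annotation: int, snps_total: int) -> float:
    """Enrichment = (share of heritability in C) / (share of SNPs in C).

    A value of 1 means the annotation explains exactly as much heritability
    as expected from its size; values above 1 indicate enrichment and
    values below 1 indicate depletion.
    """
    if h2_total <= 0 or snps_total <= 0 or snps_annotation <= 0:
        raise ValueError("totals and annotation size must be positive")
    prop_h2 = h2_annotation / h2_total
    prop_snps = snps_annotation / snps_total
    return prop_h2 / prop_snps


if __name__ == "__main__":
    # Hypothetical example: an annotation holding 1% of SNPs that explains
    # 10% of the SNP heritability is enriched roughly 10-fold.
    print(heritability_enrichment(0.04, 0.40, 10_000, 1_000_000))
```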

Meta-analyzing across all our SP traits, we found enrichment in fetal human–gained enhancers and promoters at early time points [7, 8.5, and 12 postconception weeks (pcw): h2(C) = 8.08, p = 5.91 × 10−44; h2(C) = 3.60, p = 2.55 × 10−4; h2(C) = 3.65, p = 3.55 × 10−4; table S26] but not in adults, suggesting that genes associated with SPs are differentially expressed in early development between apes and humans. Although we acknowledge that the annotations of differentially regulated elements are from developing brain and not skeletal tissues, fetal human–gained brain regulatory elements and adult human skeletal regulatory elements are correlated at 58% (56, 65). Moreover, we only observed enrichment in developing, but not adult, tissues, suggesting that the enrichment is not driven by confounders of tissue type but by differences in development between the two species. As a second line of analysis, we also examined enrichment of individual traits across the different annotations, correcting for multiple hypothesis testing at FDR < 0.05. Of our 21 SP traits, 9 (Hip Width:Height, Hip Width:Shoulder Width, Arms:Legs, Shoulder Width:Torso Length, Hip Width:Arms, Shoulder Width:Height, Hip Width:Legs, Shoulder Width:Legs, and Shoulder Width:Arms) were significantly enriched at 7 pcw (Fig. 5C and table S27). In addition, we saw depletion in regions of the genome that were depleted for Neanderthal and Denisovan ancestry, particularly for overall leg length [h2(C) = 0.44, p = 5.89 × 10−5] (table S27). These results were consistent with another analysis that showed a depletion of Neanderthal informative markers in contrast with modern human mutations, particularly for anthropometric traits (66), and are suggestive of purifying selection.
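For readers unfamiliar with FDR control across many trait and annotation tests, the following Python sketch shows the Benjamini-Hochberg step-up adjustment. The text does not state which FDR procedure was used, so Benjamini-Hochberg is an assumption here (it is a common choice), and the p values in the example are hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical p values for four trait-annotation tests.
    raw = [0.01, 0.04, 0.03, 0.005]
    adj = benjamini_hochberg(raw)
    significant = [a < 0.05 for a in adj]
    print(adj, significant)
```

A test is then declared significant at FDR < 0.05 when its adjusted p value falls below 0.05.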

The proportion traits that were significantly enriched across both types of evolutionary analysis were associated with Arms:Legs and Hip Width ratios (Fig. 5D). These results suggest that specific SPs, but not overall height or several other quantitative and disease traits examined by us or Sohail (56), underwent human lineage-specific evolution since the separation of humans from the great apes.

Discussion

In this study, we used deep learning to understand the genetic basis of skeletal elements that make up the human skeletal form using DXA imaging data in a large population-based biobank. We carried out genetic correlation and factor analysis to characterize the joint genetic architecture of these skeletal traits. We identified 145 independent genetic loci associated with SPs. We then showed that OA of the hip and knee are associated with specific SPs that comprise each of those joints. Lastly, we performed an analysis to link SPs with regions of the genome that were accelerated in human evolution, as well as regions of the genome that were differentially regulated between great apes and humans.

There have been concerted efforts to use the imaging data available from the UKB, but most of the work has focused on the magnetic resonance imaging (MRI) modality for the brain or heart (23, 67). Our study expands ongoing efforts in the DXA modality (68, 69), which is the key modality for diagnosing musculoskeletal diseases. We also extend image analysis beyond joint-specific DXA images to full-body images, which have not been examined in the context of bone diseases. We demonstrate that deep learning is useful not just in phenotyping individuals but also as a tool for QC at scale, including the capture of heterogeneous types of error modes. Automated QC pipelines have been developed for brain and heart MRIs from the UKB, but fewer efforts have been made with DXA images (70, 71). We show that modification of existing deep-learning architectures enables us to classify DXA images by body part and filter full-body images for quality, and we have made these modified architectures available for use on any DXA dataset. Our work also demonstrates the importance of having an interconnected dataset of imaging data and physical measurements to best leverage biological insights; the scaling and resolution issues presented by the imaging data would have been impossible to correct for without information about individual height in the biobank metadata. Through transfer learning, we also show that deep learning–based landmark estimation can produce accurate and replicable phenotypes for imaging data with limited manual annotation. We present the final DXA trained models, which are fast, flexible architectures that can be deployed rapidly at population scale, enabling their utility for automated phenotyping as imaging data becomes more integrated into large population biobanks.

Beyond methodological improvements for biobank-scale analysis, our results provide new insights into musculoskeletal biology. Despite more than a century of work in genetics investigating the development of limbs and the overall body plan, a comprehensive genetic map of variation that shapes the overall skeletal form has been absent. Specifically, which genes regulate the modular development of the forelimb, hindlimb, and other long bones, and how their expression does so, has not been fully characterized. Additionally, whether natural selection has acted on these genes to alter the development of limb proportions, thus allowing us to walk upright, is still unknown. Our work provides a genotype-to-phenotype map of SPs and lays the foundation for future assays of the genes discovered to understand how they contribute functionally to overall phenotype.

The moderate genetic correlations (a maximum of 0.55) observed between SPs indicate genetic sharing, particularly among limb-length traits, while also highlighting the distinctive biology behind the growth of each element. Our results are in line with artificial selection experiments in mouse lines that show that selection for tibia length increased the trait by more than 15% across 14 generations but did not result in significant change in overall body mass (72), a trait that is highly correlated with body width (rg = 0.25, p = 1 × 10−21) but not limb length (rg = −0.01, p = 0.53) proportions. Thus, our genetic correlation and factor analysis models provide insight into constraints placed on the evolutionary trajectory of the skeletal form both in humans and in vertebrates more broadly.

One important issue that affects the interpretation of our results is the normalization for height for each skeletal length measure that we obtained. We did this to look at our primary outcome of interest: SPs that are independent of height. Several papers have cautioned that the interpretation of association studies performed with adjustment should be carefully considered (73, 74). Although this issue affects virtually every GWAS that uses age as a covariate in the model (where age is a proxy for survivability, a complex trait with a heritable basis), our analysis is most similar to GWASs conducted for BMI, a trait for which body weight is computed as a proportion of height. Our results largely show consistent direction of effect for loci before and after height adjustment (30). This suggests that our GWASs for SPs are largely identifying loci that are directly associated with overall length of particular skeletal elements and is confirmed by low genetic correlation between our proportion phenotypes and height (mean r = 0.19) (table S28). However, a minority of these signals could still arise from pleiotropic increases or decreases in other skeletal elements that affect overall height. Thus, in interpreting our results, it is important to only view each of our phenotypes as proportions of height rather than directly associated with individual skeletal element lengths themselves.

Epidemiological studies indicate that OA of the hip and the knee frequently do not occur together or in combination with OA in other large joints, suggesting that local factors are important in OA pathogenesis (75–80). Specific abnormalities in skeletal morphology are now recognized as major biomechanical risk factors for the development of OA (81–86). The findings presented here of the association between specific SPs, but not overall height, and joint-specific OA highlight the biomechanical role that these proportions play in shaping stresses on the joints themselves and point to specific risk factors of clinical relevance.

Across both types of evolutionary analyses, the most significant SP traits were those associated with the proportions of arms and legs, as well as proportions of hip width. These results are concordant with some of the most notable morphological differences between humans and the great apes, including arm-to-leg ratio as well as pelvic shape, which enabled a transition from knuckle-based walking to bipedalism (Fig. 5D). Numerous studies have proposed a thermoregulatory hypothesis that accompanied the primary biomechanical energy efficiency hypothesis to explain the evolution of these traits in early hominin evolution as well as to explain differences in anatomy between humans and Neanderthals (87, 88). However, only one very small study of 20 individuals has attempted to test these thermoregulatory theories (89). In this work, we conducted a large-sample genetic correlation analysis between SPs and basal metabolic rate as well as whole-body fat-free mass in humans (30). We found that an increased Arms:Legs ratio was associated with lower basal metabolic rate and lower whole-body fat-free mass (p = 9.37 × 10−16; p = 4.05 × 10−16), in line with the theory that these changes in early human evolution would have also increased heat dissipation in early hominins (table S28). Our results provide genomic evidence of selection shaping some of the most fundamental anatomical transitions that have been observed in the fossil record in human evolution, namely changes in the overall skeletal form that confer the distinctive ability of humans to walk upright.

Materials and methods summary

All patient data, including electronic health record data, DXA images, and genotype data, were obtained from the UKB (37). To perform QC and phenotyping on 31,221 full-body DXA images from the UKB, we modified existing deep-learning models (31, 35) used for classification and landmark estimation by adding final additional training layers with limited manual annotation. We used classification models to filter images that were poor in quality or incorrectly cropped, and we used the landmark estimation model to extract 23 different IDPs that include all long-bone lengths as well as hip and shoulder width, which we analyzed while controlling for height.
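To illustrate how landmark coordinates could be turned into a length and then into a proportion of height, consider the following Python sketch. It is a deliberately simplified assumption rather than the authors' pipeline: it treats the pixel distance between a head-top and a foot-bottom landmark as spanning the recorded standing height, and all landmark names and coordinates are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]  # (x, y) landmark position in pixels


def segment_length_px(p1: Point, p2: Point) -> float:
    """Euclidean distance between two landmarks, in pixels."""
    return math.hypot(p2[0] - p1[0], p2[1] - p1[1])


def cm_per_pixel(height_cm: float, head_top: Point, foot_bottom: Point) -> float:
    """Scale factor derived from recorded height and the body span in pixels."""
    span = segment_length_px(head_top, foot_bottom)
    if span == 0:
        raise ValueError("head and foot landmarks coincide")
    return height_cm / span


def proportion_of_height(p1: Point, p2: Point, head_top: Point, foot_bottom: Point) -> float:
    """Segment length divided by body span; independent of image resolution."""
    return segment_length_px(p1, p2) / segment_length_px(head_top, foot_bottom)


if __name__ == "__main__":
    # Hypothetical landmarks on a single full-body image.
    head, foot = (100.0, 20.0), (100.0, 870.0)
    hip, knee = (120.0, 430.0), (122.0, 650.0)
    scale = cm_per_pixel(170.0, head, foot)
    print(segment_length_px(hip, knee) * scale)        # femur length in cm
    print(proportion_of_height(hip, knee, head, foot))  # Femur:Height ratio
```

Because the proportion is a ratio of two pixel distances from the same image, it does not depend on the image's resolution, whereas the absolute length does depend on the height-derived scale factor.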

After filtering UKB participants and genotype data for QC, we ran GWASs using BOLT-LMM (90) for each phenotype and estimated the heritability and genetic correlations of these traits with each other using GCTA (91). To further investigate the joint genetic architecture of skeletal traits, we used genomic SEM to analyze the genetic factor structure of the limb and body measurements independent of height. Moving forward, we focused our remaining analyses on limb and body measurements as ratios of height (30).

We used GCTA-COJO (92) followed by linkage disequilibrium–based SNP pruning in PLINK (93) to find independent loci across our SP phenotypes, which were mapped to genes using positional-based mapping in PLINK. We used MAGMA (94) to run a gene set enrichment analysis on our traits and queried the Human-Mouse Disease Connection (46) database to determine which mouse phenotypes and human diseases were associated with SP loci.

We then examined correlations of SP phenotypes with musculoskeletal disease through phenotypic and polygenic risk score analyses. First, for phenotypic analysis, we regressed the binary outcome of disease or reported pain in the hip, knee, and back against SPs while controlling for clinically relevant covariates that are known to affect OA (95), including age, sex, BMI, and other factors. For polygenic risk score analysis, we generated polygenic risk scores for each SP with Bayesian regression and continuous shrinkage priors (51) using the significantly associated SNPs. We ran a logistic or linear regression of the polygenic risk score on traits across all individuals, adjusting for the first 20 principal components of ancestry and imputed sex.

Evolutionary analyses were carried out on our SPs using two major methods. We used S-LDSC (62) to estimate the heritability enrichment for each SP in genomic annotations marking different evolutionary periods (30). We also scanned for elevated levels of intersections between genes containing genome-wide significant SNPs and HARs (54) through a modified version of the method outlined in Xu et al. (53) and Richard et al. (52). Additional methodological details are available in (30).

ACKNOWLEDGMENTS

This research was conducted using the UKB Resource under application no. 65439. We thank C. Zhu and A. Harpak for insightful discussions and comments on sex-specific analysis. We thank P. Wooley and M. Lee for early implementations of our deep-learning models.

Funding:

V.M.N. was supported by a grant from the Allen Discovery Center program, a Paul G. Allen Frontiers Group advised program of the Paul G. Allen Family Foundation, and a Good Systems for Ethical AI grant from the University of Texas at Austin. O.S. and B.I.F. were supported by NSF Graduate Research Fellowships (DGE 2137420 and DGE 2137420). E.M.J. and B.I.F. were supported by NIH T32 grant 5T32LMO012414. B.I.F. was also supported by a UT Austin Provost’s Graduate Excellence Fellowship. E.M.T.-D. and J.d.l.F. were supported by NIH grants R01MH120219, R01AG054628, and RF1AG073593. Additionally, E.M.T.-D. and J.d.l.F. are members of the University of Texas Center on Aging and Population Sciences and the University of Texas Population Research Center, which are supported by NIH grants P30AG066614 and P2CHD042849, respectively. GPU and compute resources were supported by a Director’s Discretionary Award from the Texas Advanced Computing Cluster.

Footnotes

Competing interests: The authors declare no competing interests.

Data and materials availability:

Code used for performing the deep learning–based key point identification and quality control of the DXA data is available at https://github.com/EucharistKun/Human-Skeletal-Form/ and Zenodo (96). Code for the HAR analysis is available at https://github.com/ossmith/HARE/ and Zenodo (97). Our GWAS summary statistics are available at the GWAS catalog (https://www.ebi.ac.uk/gwas/ under GCP ID GCP000646) as well as at https://utexas.box.com/s/vli4rb4ise7qbdx5gmgpakga5n9ce2lr. Individual-level information of skeletal lengths has been reported back to the UKB and will be available via the Access Management System.

REFERENCES AND NOTES

  • 1. Gruss LT, Schmitt D, The evolution of the human pelvis: Changing adaptations to bipedalism, obstetrics and thermoregulation. Philos. Trans. R. Soc. London Ser. B 370, 20140063 (2015). doi: 10.1098/rstb.2014.0063
  • 2. Aiello L, Dean C, An Introduction to Human Evolutionary Anatomy (Academic Press, 1990).
  • 3. Young NM, Wagner GP, Hallgrímsson B, Development and the evolvability of human limbs. Proc. Natl. Acad. Sci. U.S.A. 107, 3400–3405 (2010). doi: 10.1073/pnas.0911856107
  • 4. Futuyma DJ, Kirkpatrick M, Evolution (Sinauer Associates, ed. 4, 2017).
  • 5. Orban GA, Caruana F, The neural basis of human tool use. Front. Psychol. 5, 310 (2014). doi: 10.3389/fpsyg.2014.00310
  • 6. Harcourt-Smith WEH, Aiello LC, Fossils, feet and the evolution of human bipedal locomotion. J. Anat. 204, 403–416 (2004). doi: 10.1111/j.0021-8782.2004.00296.x
  • 7. Grine FE, Mongle CS, Fleagle JG, Hammond AS, The taxonomic attribution of African hominin postcrania from the Miocene through the Pleistocene: Associations and assumptions. J. Hum. Evol. 173, 103255 (2022). doi: 10.1016/j.jhevol.2022.103255
  • 8. Garcia-Fernàndez J, The genesis and evolution of homeobox gene clusters. Nat. Rev. Genet. 6, 881–892 (2005). doi: 10.1038/nrg1723
  • 9. Johnson RL, Tabin CJ, Molecular models for vertebrate limb development. Cell 90, 979–990 (1997). doi: 10.1016/S0092-8674(00)80364-5
  • 10. Struhl G, A homoeotic mutation transforming leg to antenna in Drosophila. Nature 292, 635–638 (1981). doi: 10.1038/292635a0
  • 11. Roscito JG et al., Convergent and lineage-specific genomic differences in limb regulatory elements in limbless reptile lineages. Cell Rep. 38, 110280 (2022). doi: 10.1016/j.celrep.2021.110280
  • 12. Kvon EZ et al., Progressive loss of function in a limb enhancer during snake evolution. Cell 167, 633–642.e11 (2016). doi: 10.1016/j.cell.2016.09.028
  • 13. Saxena A et al., Interspecies transcriptomics identify genes that underlie disproportionate foot growth in jerboas. Curr. Biol. 32, 289–303.e6 (2022). doi: 10.1016/j.cub.2021.10.063
  • 14. Chatterjee S, Das N, Chatterjee P, The estimation of the heritability of anthropometric measurements. Appl. Human Sci. 18, 1–7 (1999). doi: 10.2114/jpa.18.1
  • 15. Silventoinen K et al., Heritability of adult body height: A comparative study of twin cohorts in eight countries. Twin Res. 6, 399–408 (2003). doi: 10.1375/136905203770326402
  • 16. Yengo L et al., A saturated map of common genetic variants associated with human height. Nature 610, 704–712 (2022). doi: 10.1038/s41586-022-05275-y
  • 17. Fredriks AM et al., Nationwide age references for sitting height, leg length, and sitting height/height ratio, and their diagnostic value for disproportionate growth disorders. Arch. Dis. Child. 90, 807–812 (2005). doi: 10.1136/adc.2004.050799
  • 18. Livshits G, Roset A, Yakovenko K, Trofimov S, Kobyliansky E, Genetics of human body size and shape: Body proportions and indices. Ann. Hum. Biol. 29, 271–289 (2002). doi: 10.1080/03014460110085322
  • 19. Pulit SL et al., Meta-analysis of genome-wide association studies for body fat distribution in 694 649 individuals of European ancestry. Hum. Mol. Genet. 28, 166–174 (2019). doi: 10.1093/hmg/ddy327
  • 20. Chan Y et al., Genome-wide analysis of body proportion classifies height-associated variants by mechanism of action and implicates genes important for skeletal development. Am. J. Hum. Genet. 96, 695–708 (2015). doi: 10.1016/j.ajhg.2015.02.018
  • 21. Currant H et al., Genetic variation affects morphological retinal phenotypes extracted from UK Biobank optical coherence tomography images. PLOS Genet. 17, e1009497 (2021).
  • 22. Bai W et al., A population-based phenome-wide association study of cardiac and aortic structure and function. Nat. Med. 26, 1654–1662 (2020). doi: 10.1038/s41591-020-1009-y
  • 23. Pirruccello JP et al., Deep learning enables genetic analysis of the human thoracic aorta. Nat. Genet. 54, 40–51 (2022). doi: 10.1038/s41588-021-00962-4
  • 24. Agrawal S et al., BMI-adjusted adipose tissue volumes exhibit depot-specific and divergent associations with cardiometabolic diseases. Nat. Commun. 14, 266 (2023). doi: 10.1038/s41467-022-35704-5
  • 25. Centers for Disease Control and Prevention (CDC), Prevalence and most common causes of disability among adults—United States, 2005. MMWR Morb. Mortal. Wkly. Rep. 58, 421–426 (2009).
  • 26. Guccione AA et al., The effects of specific medical conditions on the functional limitations of elders in the Framingham Study. Am. J. Public Health 84, 351–358 (1994). doi: 10.2105/AJPH.84.3.351
  • 27. Chen D et al., Osteoarthritis: Toward a comprehensive understanding of pathological mechanism. Bone Res. 5, 16044 (2017). doi: 10.1038/boneres.2016.44
  • 28. Murray KJ, Azari MF, Leg length discrepancy and osteoarthritis in the knee, hip and lumbar spine. J. Can. Chiropr. Assoc. 59, 226–237 (2015).
  • 29. Baker-LePain JC, Lane NE, Role of bone architecture and anatomy in osteoarthritis. Bone 51, 197–203 (2012). doi: 10.1016/j.bone.2012.01.008
  • 30. See supplementary materials.
  • 31. Sun K, Xiao B, Liu D, Wang J, “Deep high-resolution representation learning for human pose estimation” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (IEEE, 2019), pp. 5686–5696.
  • 32. Deng J et al., “ImageNet: A large-scale hierarchical image database” in 2009 IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 2010), pp. 248–255.
  • 33. Lin T-Y et al., “Microsoft COCO: Common Objects in Context” in Computer Vision – ECCV 2014, Lecture Notes in Computer Science, vol. 8693, Fleet D, Pajdla T, Schiele B, Tuytelaars T, Eds. (Springer, 2014), pp. 740–755.
  • 34. Andriluka M, Pishchulin L, Gehler P, Schiele B, “2D human pose estimation: New benchmark and state of the art analysis” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (IEEE, 2014), pp. 3686–3693.
  • 35. He K, Zhang X, Ren S, Sun J, “Deep residual learning for image recognition” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (IEEE, 2015), pp. 770–778.
  • 36. Robinette K, Churchill T, McConville J; Anthropology Research Project, “A comparison of male and female body sizes and proportions” (Technical Report AMRL-TR-79–69, Aerospace Medical Research Laboratory, 1979).
  • 37. Bycroft C et al., The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018). doi: 10.1038/s41586-018-0579-z
  • 38. Loh PR, Kichaev G, Gazal S, Schoech AP, Price AL, Mixed-model association for biobank-scale datasets. Nat. Genet. 50, 906–908 (2018). doi: 10.1038/s41588-018-0144-6
  • 39. Bulik-Sullivan BK et al., LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291–295 (2015). doi: 10.1038/ng.3211
  • 40. Yang J et al., Common SNPs explain a large proportion of the heritability for human height. Nat. Genet. 42, 565–569 (2010). doi: 10.1038/ng.608
  • 41. Grotzinger AD et al., Genomic structural equation modelling provides insights into the multivariate genetic architecture of complex traits. Nat. Hum. Behav. 3, 513–525 (2019). doi: 10.1038/s41562-019-0566-x
  • 42. Zhu C et al., Amplification is the primary mode of gene-by-sex interaction in complex human traits. Cell Genomics 3, 100297 (2023). doi: 10.1016/j.xgen.2023.100297
  • 43. Watanabe K, Taskesen E, van Bochoven A, Posthuma D, Functional mapping and annotation of genetic associations with FUMA. Nat. Commun. 8, 1826 (2017). doi: 10.1038/s41467-017-01261-5
  • 44.Hamosh A, Scott AF, Amberger JS, Bocchini CA, McKusick VA, Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders. Nucleic Acids Res. 33, D514–D517 (2005). doi: 10.1093/nar/gki033 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Sollis E et al. , The NHGRI-EBI GWAS Catalog: Knowledgebase and deposition resource. Nucleic Acids Res. 51, D977–D985 (2023). doi: 10.1093/nar/gkac1010 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Blake JA et al. , Mouse Genome Database (MGD): Knowledgebase for mouse-human comparative biology. Nucleic Acids Res. 49, D981–D987 (2021). doi: 10.1093/nar/gkaa1083 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Delgado I et al. , Control of mouse limb initiation and anteroposterior patterning by Meis transcription factors. Nat. Commun. 12, 3086 (2021). doi: 10.1038/s41467-021-23373-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48.Liu Z et al. , An adhesion G protein-coupled receptor is required in cartilaginous and dense connective tissues to maintain spine alignment. eLife 10, e67781 (2021). doi: 10.7554/eLife.67781 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.GTEx Consortium, Genetic effects on gene expression across human tissues. Nature 550, 204–213 (2017). doi: 10.1038/nature24277 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Funck-Brentano T, Nethander M, Movérare-Skrtic S, Richette P, Ohlsson C, Causal factors for knee, hip, and hand osteoarthritis: A Mendelian randomization study in the UK Biobank. Arthritis Rheumatol. 71, 1634–1641 (2019). doi: 10.1002/art.40928 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.Ge T, Chen CY, Ni Y, Feng YA, Smoller JW, Polygenic prediction via Bayesian regression and continuous shrinkage priors. Nat. Commun. 10, 1776 (2019). doi: 10.1038/s41467-019-09718-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.Richard D et al. , Evolutionary selection and constraint on human knee chondrocyte regulation impacts osteoarthritis risk. Cell 181, 362–381.e28 (2020). doi: 10.1016/j.cell.2020.02.057 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Xu K, Schadt EE, Pollard KS, Roussos P, Dudley JT, Genomic and network patterns of schizophrenia genetic variation in human evolutionary accelerated regions. Mol. Biol. Evol. 32, 1148–1160 (2015). doi: 10.1093/molbev/msv031 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Pollard KS et al. , Forces shaping the fastest evolving regions in the human genome. PLOS Genet. 2, e168 (2006). doi: 10.1371/journal.pgen.0020168 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55.Thompson PM et al. , ENIGMA and global neuroscience: A decade of large-scale studies of the brain in health and disease across more than 40 countries. Transl. Psychiatry 10, 100 (2020). doi: 10.1038/s41398-020-0705-1 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56.Sohail M, Investigating relative contributions to psychiatric disease architecture from sequence elements originating across multiple evolutionary time-scales. bioRxiv 2022.02.28.482389 [Preprint] (2022); doi: 10.1101/2022.02.28.482389 [DOI] [Google Scholar]
  • 57.Hujoel MLA, Gazal S, Hormozdiari F, van de Geijn B, Price AL, Disease heritability enrichment of regulatory elements is concentrated in elements with ancient sequence age and conserved function across species. Am. J. Hum. Genet. 104, 611–624 (2019). doi: 10.1016/j.ajhg.2019.02.008 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Reilly SK et al. , Evolutionary genomics. Evolutionary changes in promoter and enhancer activity during human corticogenesis. Science 347, 1155–1159 (2015). doi: 10.1126/science.1260943 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59.Vermunt MW et al. , Epigenomic annotation of gene regulatory alterations during evolution of the primate brain. Nat. Neurosci. 19, 494–503 (2016). doi: 10.1038/nn.4229 [DOI] [PubMed] [Google Scholar]
  • 60.Browning SR, Browning BL, Zhou Y, Tucci S, Akey JM, Analysis of human sequence data reveals two pulses of archaic Denisovan admixture. Cell 173, 53–61.e9 (2018). doi: 10.1016/j.cell.2018.02.031 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61.Peyrégne S, Boyle MJ, Dannemann M, Prüfer K, Detecting ancient positive selection in humans using extended lineage sorting. Genome Res. 27, 1563–1572 (2017). doi: 10.1101/gr.219493.116 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62.Finucane HK et al. , Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat. Genet. 47, 1228–1235 (2015). doi: 10.1038/ng.3404 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63.Gazal S et al. , Linkage disequilibrium-dependent architecture of human complex traits shows action of negative selection. Nat. Genet. 49, 1421–1427 (2017). doi: 10.1038/ng.3954 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64.Marnetto D et al. , Evolutionary rewiring of human regulatory networks by waves of genome expansion. Am. J. Hum. Genet. 102, 207–218 (2018). doi: 10.1016/j.ajhg.2017.12.014 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.Kundaje A et al. , Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015). doi: 10.1038/nature14248 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66.Wei X et al. , The lingering effects of Neanderthal introgression on human complex traits. eLife 12, e80757 (2023). doi: 10.7554/eLife.80757 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67.Elliott LT et al. , Genome-wide association studies of brain imaging phenotypes in UK Biobank. Nature 562, 210–216 (2018). doi: 10.1038/s41586-018-0571-7 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 68.Faber BG et al. , A novel semi-automated classifier of hip osteoarthritis on DXA images shows expected relationships with clinical outcomes in UK Biobank. Rheumatology 61, 3586–3595 (2022). doi: 10.1093/rheumatology/keab927 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69.Frysz M et al. , Machine learning-derived acetabular dysplasia and cam morphology are features of severe hip osteoarthritis: Findings from UK Biobank. J. Bone Miner. Res. 37, 1720–1732 (2022). doi: 10.1002/jbmr.4649 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 70.Robinson R et al. , Automated quality control in image segmentation: Application to the UK Biobank cardiovascular magnetic resonance imaging study. J. Cardiovasc. Magn. Reson. 21, 18 (2019). doi: 10.1186/s12968-019-0523-x [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71.Alfaro-Almagro F et al. , Image processing and Quality Control for the first 10,000 brain imaging datasets from UK Biobank. Neuroimage 166, 400–424 (2018). doi: 10.1016/j.neuroimage.2017.10.034 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 72.Marchini M et al. , Impacts of genetic correlation on the independent evolution of body mass and skeletal size in mammals. BMC Evol. Biol. 14, 258 (2014). doi: 10.1186/s12862-014-0258-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73.Aschard H, Vilhjálmsson BJ, Joshi AD, Price AL, Kraft P, Adjusting for heritable covariates can bias effect estimates in genome-wide association studies. Am. J. Hum. Genet. 96, 329–339 (2015). doi: 10.1016/j.ajhg.2014.12.021 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 74.Day FR, Loh PR, Scott RA, Ong KK, Perry JRB, A robust example of collider bias in a genetic association study. Am. J. Hum. Genet. 98, 392–393 (2016). doi: 10.1016/j.ajhg.2015.12.019 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75.Nicholls AS et al. , The association between hip morphology parameters and nineteen-year risk of end-stage osteoarthritis of the hip: A nested case-control study. Arthritis Rheum. 63, 3392–3400 (2011). doi: 10.1002/art.30523 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76.Ecker TM, Tannast M, Puls M, Siebenrock KA, Murphy SB, Pathomorphologic alterations predict presence or absence of hip osteoarthrosis. Clin. Orthop. Relat. Res. 465, 46–52 (2007). doi: 10.1097/BLO.0b013e318159a998 [DOI] [PubMed] [Google Scholar]
  • 77.Cushnaghan J, Dieppe P, Study of 500 patients with limb joint osteoarthritis. I. Analysis by age, sex, and distribution of symptomatic joint sites. Ann. Rheum. Dis. 50, 8–13 (1991). doi: 10.1136/ard.50.1.8 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 78.Ganz R, Leunig M, Leunig-Ganz K, Harris WH, The etiology of osteoarthritis of the hip: An integrated mechanical concept. Clin. Orthop. Relat. Res. 466, 264–272 (2008). doi: 10.1007/s11999-007-0060-z [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 79.Grotle M, Hagen KB, Natvig B, Dahl FA, Kvien TK, Obesity and osteoarthritis in knee, hip and/or hand: An epidemiological study in the general population with 10 years follow-up. BMC Musculoskelet. Disord. 9, 132 (2008). doi: 10.1186/1471-2474-9-132 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 80.Wright AA, Cook C, Abbott JH, Variables associated with the progression of hip osteoarthritis: A systematic review. Arthritis Rheum. 61, 925–936 (2009). doi: 10.1002/art.24641 [DOI] [PubMed] [Google Scholar]
  • 81.Wiberg G, Studies on dysplastic acetabula and congenital subluxation of the hip joint with special reference to the complication of osteo-arthritis. J. Am. Med. Assoc. 115, 81 (1940). doi: 10.1001/jama.1940.02810270083038 [DOI] [Google Scholar]
  • 82.Li PLS, Ganz R, Morphologic features of congenital acetabular dysplasia: One in six is retroverted. Clin. Orthop. Relat. Res. 416, 245–253 (2003). doi: 10.1097/01.blo.0000081934.75404.36 [DOI] [PubMed] [Google Scholar]
  • 83.Gosvig KK, Jacobsen S, Palm H, Sonne-Holm S, Magnusson E, A new radiological index for assessing asphericity of the femoral head in cam impingement. J. Bone Joint Surg. Br. 89-B, 1309–1316 (2007). doi: 10.1302/0301-620X.89B10.19405 [DOI] [PubMed] [Google Scholar]
  • 84.Harris WH, The correlation between minor or unrecognized developmental deformities and the development of osteoarthritis of the hip. Instr. Course Lect. 58, 257–259 (2009) [PubMed] [Google Scholar]
  • 85.Nötzli HP et al. , The contour of the femoral head-neck junction as a predictor for the risk of anterior impingement. J. Bone Joint Surg. Br. 84-B, 556–560 (2002). doi: 10.1302/0301-620X.84B4.0840556 [DOI] [PubMed] [Google Scholar]
  • 86.Pollard TCB et al. , Femoroacetabular impingement and classification of the cam deformity: The reference interval in normal hips. Acta Orthop. 81, 134–141 (2010). doi: 10.3109/17453671003619011 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 87.Ruff CB, Climate and body shape in hominid evolution. J. Hum. Evol. 21, 81–105 (1991). doi: 10.1016/0047-2484(91)90001-C [DOI] [Google Scholar]
  • 88.Steudel-Numbers KL, Tilkens MJ, The effect of lower limb length on the energetic cost of locomotion: Implications for fossil hominins. J. Hum. Evol. 47, 95–109 (2004). doi: 10.1016/j.jhevol.2004.06.002 [DOI] [PubMed] [Google Scholar]
  • 89.Tilkens MJ, Wall-Scheffler C, Weaver TD, Steudel-Numbers K, The effects of body proportions on thermoregulation: An experimental assessment of Allen’s rule. J. Hum. Evol. 53, 286–291 (2007). doi: 10.1016/j.jhevol.2007.04.005 [DOI] [PubMed] [Google Scholar]
  • 90.Loh PR et al. , Efficient Bayesian mixed-model analysis increases association power in large cohorts. Nat. Genet. 47, 284–290 (2015). doi: 10.1038/ng.3190 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 91.Yang J, Lee SH, Goddard ME, Visscher PM, GCTA: A tool for genome-wide complex trait analysis. Am. J. Hum. Genet. 88, 76–82 (2011). doi: 10.1016/j.ajhg.2010.11.011 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 92.Yang J et al. , Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nat. Genet. 44, 369–375 (2012). doi: 10.1038/ng.2213 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 93.Purcell S et al. , PLINK: A tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007). doi: 10.1086/519795 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 94.de Leeuw CA, Mooij JM, Heskes T, Posthuma D, MAGMA: Generalized gene-set analysis of GWAS data. PLOS Comput. Biol. 11, e1004219 (2015). doi: 10.1371/journal.pcbi.1004219 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 95.Palazzo C, Nguyen C, Lefevre-Colau MM, Rannou F, Poiraudeau S, Risk factors and burden of osteoarthritis. Ann. Phys. Rehabil. Med. 59, 134–138 (2016). doi: 10.1016/j.rehab.2016.01.006 [DOI] [PubMed] [Google Scholar]
  • 96.Kun E, Human-skeletal-form. Zenodo (2023). doi: 10.5281/zenodo.7787839 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 97.Smith O, HARE. Zenodo (2023). doi: 10.5281/zenodo.7793834 [DOI] [Google Scholar]

Associated Data

Supplementary Materials

Supplementary Materials
Checklist for Authors
Tables

Data Availability Statement

The code used for deep learning–based key point identification and quality control of the DXA data is available at https://github.com/EucharistKun/Human-Skeletal-Form/ and on Zenodo (96). Code for the human accelerated region (HAR) analysis is available at https://github.com/ossmith/HARE/ and on Zenodo (97). Our GWAS summary statistics are available from the GWAS Catalog (https://www.ebi.ac.uk/gwas/, GCP ID GCP000646) and at https://utexas.box.com/s/vli4rb4ise7qbdx5gmgpakga5n9ce2lr. Individual-level skeletal length measurements have been returned to the UK Biobank (UKB) and will be available through its Access Management System.