Author manuscript; available in PMC: 2012 Apr 1.
Published in final edited form as: Osteoarthritis Cartilage. 2011 Jan 5;19(4):420–429. doi: 10.1016/j.joca.2010.12.011

Canine Hip Dysplasia is Predictable by Genotyping

Gang Guo 1,§, Zhengkui Zhou 2,3,4,§, Yachun Wang 1,§, Keyan Zhao 5, Lan Zhu 6, George Lust 7, Linda Hunter 4, Steven Friedenberg 4, Junya Li 3, Yuan Zhang 1, Stephen Harris 8, Paul Jones 8, Jody Sandler 9, Ursula Krotscheck 4, Rory Todhunter 4, Zhiwu Zhang 10,*
PMCID: PMC3065507  NIHMSID: NIHMS263272  PMID: 21215318

Summary

Objective

To establish a predictive method using whole genome genotyping for early intervention in canine hip dysplasia (CHD) risk management, for the prevention of the progression of secondary osteoarthritis (OA), and for selective breeding.

Design

Two sets of dogs (6 breeds) were genotyped with dense SNPs covering the entire canine genome. The first set contained 359 dogs upon which a predictive formula for genomic breeding value (GBV) was derived by using their estimated breeding value (EBV) of the Norberg angle (a measure of CHD) and their genotypes. To investigate how well the formula would work for an individual dog with genotype only (without using EBV or phenotype), a cross validation was performed by masking the EBV of one dog at a time. The genomic data and the EBV of the remaining dogs were used to predict the GBV for the single dog that was left out. The second set of dogs included 38 new Labrador retriever dogs, which had no pedigree relationship to the dogs in the first set.

Results

The cross validation showed a strong correlation (r>0.7) between the EBV and the GBV. The independent validation showed a strong correlation (r=0.5) between GBV for the Norberg angle and the observed Norberg angle (no EBV was available for the 38 new dogs). Sensitivity, specificity, positive predictive value, and negative predictive value of the genomic data were all above 70%.

Conclusions

Prediction of CHD from genomic data is feasible, and can be applied for risk management of CHD and early selection for genetic improvement to reduce the prevalence of CHD in breeding programs. The prediction can be implemented before maturity, at which age current radiographic screening programs are traditionally applied, and as soon as DNA is available.

Introduction

Hip dysplasia (HD) is a common inherited trait that affects the wellbeing of humans and dogs and imposes a heavy financial and emotional burden [1]. The disease is characterized by hip instability, which leads inexorably to painful, debilitating secondary hip osteoarthritis (OA) [2-4]. Canine hip dysplasia (CHD) is a major veterinary problem, occurring with a frequency of up to 75% in mixed and pure breed dogs among the approximately 70 million dogs in American households [5]. The prevalence in a hospital population is about 20% [5]. Human HD, referred to as developmental dysplasia of the hip (DDH), occurs with a frequency ranging from 5.4% to 12.8%. Hip OA prevalence was 4.4–5.3% for individuals over 60 years old [6, 7]. Developmental dysplasia of the human hip significantly influenced the prevalence of hip OA [7]. Radiographic surveys have found that 20% to 50% of human patients diagnosed with idiopathic hip OA had antecedent DDH [6].

Canine HD and DDH are homologous conditions from a clinical perspective, with identical sequelae due to subluxation, which results in focal overload of the articular surface and hip OA [6-9]. Current treatment options for human and canine HD or OA are limited to symptom management and hip replacement at end-stage degeneration. No data are available for the number of canine hip replacements undertaken each year, but 82% of human hip replacements are due to end-stage OA [10]. The number of human total hip replacements is about a quarter million, and this number is expected to double in the next twenty years [11]. The challenge is to develop predictive tools to identify the risk of CHD, DDH and hip OA at an early age so that more efficient and cost-effective management can be applied.

Selective breeding of dogs has proven to be effective in reducing the prevalence of CHD [12]. In a previous study [13], we showed that the selective breeding program operated by Guiding Eyes for the Blind (in Yorktown Heights, New York, USA) was able to achieve stable genetic improvement in hip morphology. Nationwide, the Orthopedic Foundation for Animals (OFA) has been scoring hip radiographs and releasing some of the records publicly over the last 40 years. In a previous study, we showed that a consistent genetic improvement has accumulated [14]. The genetic improvement was limited by the fact that the selection criteria of the majority of the breeding dogs had low accuracy. Even when an estimated breeding value (EBV) of an individual derived from raw phenotypes of itself and its relatives was available, it only reached reasonable accuracy if it was based on hundreds of progeny that were in a comparable group which also contained progeny from other dogs [14-16]. Producing this large number of progeny takes several years. The number of dogs with such accurate EBVs was limited. Thus, improved methods of identifying dogs susceptible to HD are required to implement earlier preventative methods for allaying secondary hip OA. Because pure breed dogs must have documented pedigrees to be registered as pure by the American Kennel Club (AKC), EBVs can be calculated and these can be correlated with genomic breeding values (GBVs) composed of single nucleotide polymorphisms (SNPs) or sequence variants.

Here we present data for the first time to demonstrate that CHD is predictable from genomic data so that selection decisions can be made for a dog at puppy age. This implies that human HD could also be predicted at an early age, allowing susceptible individuals who may be missed by physical screening and ultrasound to be identified and suitable preventative management to be applied, reducing the prevalence of hip OA by preemptive intervention.

Materials and Methods

Dog samples

Two sets of dogs were genotyped for this study. The first set (359 dogs) was sampled from a pool of dogs with breeding values reported from our previous study [13]. The second set (53 Labrador retrievers) contained 15 dogs that were in the first set for the purpose of data quality control (e.g. genotyping error) and imputation of missing SNPs across genotyping platforms. The rest of the dogs (38) were newly admitted patients to the Cornell University Hospital for Animals. They either had hip pain and lameness or were being radiographed as a screening tool prior to breeding. There was no known pedigree relationship between the 38 new dogs in the second sample and the 359 dogs in the first sample. Cornell Institutional Animal Care and Use Committee Protocol approval numbers are 2005-0151 (DNA Bank) and 2006-0187 (Hip Dysplasia and Osteoarthritis Genetics).

Radiographic methods and estimated breeding value

The four measurements used for hip evaluation were the Norberg angle (NA), Orthopedic Foundation for Animals (OFA) score, the distraction index (DI) and the dorsolateral subluxation score (DLS) [17]. The former two are evaluated from the extended hip projection and are phenotypically and genetically correlated, while the latter two are evaluated on different projections and are phenotypically and genetically correlated [14]. No measure alone completely represents hip morphology. The hips of the Baker Institute dogs were commonly radiographed at 8-12 months of age. Guiding Eyes for the Blind radiographed their dogs’ hips at 14-18 months of age. The age of dogs admitted to the Cornell University Hospital for Animals (CUHA) varied but averaged 2 years. All radiographic measurements except the OFA score reach their maximal accuracy by 8 months of age, when the dogs are skeletally mature. The DI and the DLS score reveal more hip laxity than the NA and the OFA score. The DLS imaging position reveals maximum subluxation, which can be masked by the extended hip imaging position. The OFA score increases in accuracy as a dog ages because the secondary OA progresses and is more evident radiographically [17]. Estimated breeding values (EBV) were derived by using a multiple trait mixed linear model from our previous study [13]. Because NA was correlated with the OFA score and most dogs had NAs measured, NA was chosen for this study.

Single Nucleotide Polymorphism Genotyping

The first set of dogs was genotyped on the Infinium Canine SNP20 BeadChip (Illumina Inc., San Diego CA) with ~22,000 SNPs across the genome (http://www.illumina.com/documents/products/datasheets/datasheet_canine_snp20.pdf). The second sample of dogs was genotyped on an Affymetrix platform (Canine 127K SNP array version 2), of which ~50,000 SNPs were reliable. The majority (92.3%) of the Illumina SNPs had complete calls, and 99.46% of the SNPs had a call rate above 95%. For the Affymetrix SNP array, 71.65% and 43.24% of SNPs had call rates above 90% and 95%, respectively.

SNPs with more than 45% missing calls were removed. We also removed SNPs with minor allele frequency (MAF) below 1% [18]. The final analysis contained 21,455 SNPs for the Illumina array and 48,431 for the Affymetrix array. For the Illumina SNP array, the mean and median MAF were 0.2589 and 0.2399, respectively. For the Affymetrix SNP array, the mean and median MAF were 0.2589 and 0.2641, respectively. There were 13,465 SNPs in common between the two sets of SNPs. The concordance rate for the common SNPs genotyped on the 15 dogs was 99.9%.
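The two filters described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the pipeline used in the study: genotypes are assumed to be coded as allele counts (0, 1 or 2) with `None` for a missing call, and the function name `passes_qc` and its parameters are hypothetical.

```python
from typing import Optional, Sequence


def passes_qc(calls: Sequence[Optional[int]],
              max_missing: float = 0.45,
              min_maf: float = 0.01) -> bool:
    """Return True if one SNP survives the missing-call and MAF filters.

    `calls` holds one genotype per dog: 0, 1 or 2 copies of an allele,
    or None for a missing call (an assumed encoding).
    """
    observed = [g for g in calls if g is not None]
    if not calls or not observed:
        return False
    missing_rate = 1.0 - len(observed) / len(calls)
    if missing_rate > max_missing:
        return False
    # Allele frequency among called genotypes; each dog carries two alleles.
    freq = sum(observed) / (2 * len(observed))
    maf = min(freq, 1.0 - freq)
    return maf >= min_maf
```

Applying `passes_qc` to each SNP column in turn and keeping the SNPs that return `True` would give the retained marker set under these assumptions.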

Principal component analysis (PCA) was performed based on the numeric genotypes that were 0 and 2 for the two homozygotes and 1 for the heterozygotes. Two analyses were performed on different dogs with different SNPs. The first PCA was performed on the 359 dogs in the first sample by using the 21,455 Illumina SNPs. The second PCA was performed on the 359 plus the additional 38 new dogs by using the common 13,465 SNPs. The 359 dogs, including the 15 common dogs, used the Illumina genotypes and the 38 additional dogs used the Affymetrix genotypes.
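As a rough illustration of PCA on 0/1/2 genotype codes, the sketch below computes only the first principal component by power iteration. Power iteration is a stand-in chosen to keep the example dependency-free; the text does not say which software or algorithm was used, and the function name `first_pc_scores` is hypothetical.

```python
import math
from typing import List


def first_pc_scores(genotypes: List[List[int]], n_iter: int = 200) -> List[float]:
    """Score of each dog on the first principal component.

    genotypes[d][s] is the 0/1/2 code of dog d at SNP s.
    """
    n_dogs, n_snps = len(genotypes), len(genotypes[0])
    # Centre each SNP column so the component describes variation around the mean.
    means = [sum(row[s] for row in genotypes) / n_dogs for s in range(n_snps)]
    centred = [[row[s] - means[s] for s in range(n_snps)] for row in genotypes]

    # Power iteration on X^T X converges to the leading SNP loading vector.
    loading = [1.0 / (s + 1) for s in range(n_snps)]  # arbitrary non-zero start
    for _ in range(n_iter):
        scores = [sum(x * w for x, w in zip(row, loading)) for row in centred]
        loading = [sum(centred[d][s] * scores[d] for d in range(n_dogs))
                   for s in range(n_snps)]
        norm = math.sqrt(sum(w * w for w in loading))
        if norm == 0.0:
            break  # no variation to describe
        loading = [w / norm for w in loading]
    return [sum(x * w for x, w in zip(row, loading)) for row in centred]
```

Dogs from the same genetic group should receive scores of the same sign and similar magnitude, which is the clustering behaviour a scatter plot of the leading components is meant to show.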

Genomic prediction model

We used the EBV and Illumina SNPs on 359 dogs to derive the predictive formula. The model to predict the genomic breeding values based on m bi-allelic markers with m=21,455 was

$$y = \mu + \sum_{i=1}^{m} \sum_{j=1}^{2} X_{ij}\beta_{ij} + e \qquad (1)$$

where $y$ is the vector of the dependent variable (EBV), $\mu$ is a general mean, $X_{ij}$ is a design vector for the $j$th allele of marker $i$, $\beta_{ij}$ is the allele substitution effect of the $j$th allele of marker $i$, and $e$ is a residual vector, which by default is $e \sim N(0, I\sigma_e^2)$. In this model, the allele effects are modeled as random effects with $\beta_{ij} \sim N(0, \varphi_i^2)$, where $\varphi_i$ is a scaling factor that models the variance explained at the $i$th marker. The scaling factors can be interpreted as a standard deviation of allele substitution effects. The variance of allele effects is estimated using an informative prior distribution. We chose a common normal prior distribution on the scaling factors, $\varphi_i \sim N(0, \sigma_s^2)$, where $\sigma_s^2$ is the variance of $\varphi_i$. The $\sigma_s^2$ parameter was estimated from the data so that it would properly adjust to the correct level and apply the optimal shrinkage [19]. The $\sigma_s^2$ parameter could roughly be described as the expected average fitted variance per marker. The parameter of the common prior was given a starting value of $\sigma_s^2 = 0.0001$, and then was estimated simultaneously with other unknown parameters.

For all parameters, single chain Gibbs samplers were implemented. A Markov Chain Monte Carlo (MCMC) sampler was used to generate samples from the joint posterior distribution of the model parameters. The MCMC was performed with IBAY [20] for 50,000 cycles. The first 10,000 cycles were used as the burn-in period. One sample was saved for every 5 cycles in the rest of the 40,000 cycles. The averages and variances of unknown parameters from the 8,000 posterior samples were used as the final estimates and their dispersion parameters. The genomic breeding value (GBV) was estimated as follows:
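The sample counts quoted above follow directly from the burn-in and thinning settings. The sketch below reproduces that bookkeeping; it assumes the saved cycle is the last of each block of 5, which the text does not specify, and the helper names are hypothetical (they are not part of IBAY).

```python
from statistics import mean
from typing import Sequence


def kept_cycles(total: int = 50_000, burn_in: int = 10_000, thin: int = 5) -> range:
    """1-based cycle numbers retained after burn-in, one per `thin` cycles."""
    return range(burn_in + thin, total + 1, thin)


def posterior_mean(chain: Sequence[float], burn_in: int = 10_000, thin: int = 5) -> float:
    """Average of the retained draws; chain[c - 1] is the draw at cycle c."""
    return mean(chain[c - 1] for c in kept_cycles(len(chain), burn_in, thin))


print(len(kept_cycles()))  # 8000
```

With 50,000 cycles, a 10,000-cycle burn-in and one draw kept per 5 cycles, 40,000 / 5 = 8,000 posterior samples remain, matching the number used for the final estimates.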

$$\mathrm{GBV} = \sum_{i=1}^{m} \sum_{j=1}^{2} x_{ij}\hat{\beta}_{ij} \qquad (2)$$

where $\hat{\beta}_{ij}$ was the average of the estimates of $\beta_{ij}$ over 8,000 samples. Prediction error variance (PEV) [13] was derived for the GBV of each individual and genomic variance ($\sigma_a^2$) was calculated from GBVs of all individuals.
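Equation (2) is a sum of allele counts weighted by estimated allele effects. The following sketch shows that sum for one dog, assuming that the design value for allele j of marker i is the number of copies of that allele the dog carries; this coding and both function names are illustrative assumptions rather than details given in the text.

```python
from typing import List, Sequence, Tuple


def allele_counts(genotypes: Sequence[int]) -> List[Tuple[int, int]]:
    """Turn 0/1/2 codes into (copies of allele 1, copies of allele 2)."""
    return [(g, 2 - g) for g in genotypes]


def genomic_breeding_value(counts: Sequence[Tuple[int, int]],
                           effects: Sequence[Tuple[float, float]]) -> float:
    """Sum over markers i and alleles j of x_ij * beta_hat_ij."""
    return sum(x1 * b1 + x2 * b2
               for (x1, x2), (b1, b2) in zip(counts, effects))


# Three markers with made-up posterior mean effects for each allele.
effects = [(0.5, -0.5), (1.0, -1.0), (0.25, -0.25)]
print(genomic_breeding_value(allele_counts([0, 1, 2]), effects))  # -0.5
```

Because the effects are fixed once estimated, a new dog needs only its genotypes to receive a GBV.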

Reliability (r), or accuracy, of the GBV of an individual, defined as the correlation between true and predicted values, was calculated from PEV and $\sigma_a^2$ as follows:

$$r = \sqrt{1 - \frac{\mathrm{PEV}}{\sigma_a^2}} \qquad (3)$$
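As a small worked example, the sketch below evaluates the accuracy of a GBV from its PEV, assuming equation (3) has the standard form r = sqrt(1 - PEV / genomic variance) that matches the definition of r as a correlation. The function name is hypothetical, and clipping at zero is an added safeguard for the case where a PEV exceeds the genomic variance.

```python
import math


def gbv_accuracy(pev: float, genomic_variance: float) -> float:
    """Correlation between true and predicted values implied by the PEV."""
    if genomic_variance <= 0.0:
        raise ValueError("genomic variance must be positive")
    # Clip at zero so a PEV larger than the variance yields r = 0, not an error.
    return math.sqrt(max(0.0, 1.0 - pev / genomic_variance))
```

For instance, a PEV equal to 19% of the genomic variance corresponds to an accuracy of about 0.9 under this form.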

We calculated GBV from all the SNPs and from subsets of the k most influential SNPs (k=20, 50, 100, 200, 500, 1000, 5000 and 10000), selected for the largest scaling factors.
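Choosing the most influential SNPs amounts to ranking markers by their scaling factors and keeping the first k. The sketch below ranks by absolute value, which is an assumption made here because the scaling factor is interpreted as a standard deviation; the function name is hypothetical.

```python
from typing import List, Sequence


def top_k_snps(scaling_factors: Sequence[float], k: int) -> List[int]:
    """Indices of the k SNPs with the largest scaling factors, largest first."""
    order = sorted(range(len(scaling_factors)),
                   key=lambda i: abs(scaling_factors[i]),  # magnitude: an assumption
                   reverse=True)
    return order[:k]


print(top_k_snps([0.1, 0.9, 0.3, 0.5], 2))  # [1, 3]
```

The returned indices can then be used to restrict the sum in equation (2) to the selected markers.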

Imputation of missing SNPs

The Illumina SNPs that were not on the Affymetrix array were imputed by using a software tool (MACH) [21, 22].

Validation of predictive formula

We performed two types of validations: cross validation and independent validation. The cross validation was performed by masking the EBV of one dog at a time (jackknife cross validation). The GBV of the masked dog was calculated based only on its genotypes by using the formula derived from the EBVs and genotypes of the remaining 358 dogs. The process was repeated for each of the 359 dogs. We calculated GBV from all the SNPs and from subsets of the most influential SNPs ranked by scaling factor.
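The leave-one-out procedure can be sketched independently of the prediction model. In the example below the Bayesian SNP model is replaced by a deliberately simple stand-in, a least-squares line on a single hypothetical covariate per dog, so that the code stays short; only the masking loop and the final correlation mirror the procedure described above.

```python
import math
from typing import Callable, List, Sequence

Predictor = Callable[[float], float]


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Predictor:
    """Stand-in model (not the Bayesian SNP model): least-squares line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)


def leave_one_out(xs: Sequence[float], ys: Sequence[float],
                  fit: Callable[..., Predictor] = fit_line) -> List[float]:
    """Predict each dog from a model trained on all the other dogs."""
    xs, ys = list(xs), list(ys)
    preds = []
    for i in range(len(xs)):
        model = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])  # dog i is masked
        preds.append(model(xs[i]))
    return preds


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation between two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)
```

Calling `pearson(leave_one_out(xs, ys), ys)` gives the cross-validated correlation between predicted and masked values, which is the quantity reported as r in the validation.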

The independent validation was performed on 38 Labrador retriever dogs that had no pedigree relationship to the 359 dogs. The predictive formula was derived from the 359 dogs. The GBVs of the 38 dogs were calculated by using the formula in two ways. The first way used all the 21,455 Illumina SNPs with the missing SNPs imputed. The second way used the 13,465 common SNPs. The rest of the SNPs were discarded from the predictive formula. The correlations between GBV and EBV/phenotype within breed/cross were used as the criteria of validation.

Sensitivity, Specificity, Positive and Negative Predictive Value

Canine hip dysplasia is a complex disease and the NA measurement is continuous. The NA usually ranges between 70 and 120 degrees, and lower angles indicate more severe HD. No obvious cutoff was defined a priori to distinguish dysplastic from non-dysplastic hips. In this study, the cutoff was determined to maximize the minimum of the four diagnostic statistics: sensitivity, specificity, positive predictive value and negative predictive value [23, 24].
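The max-min rule for choosing cutoffs can be written as a small grid search. The sketch below assumes that a hip is dysplastic when NA falls strictly below its cutoff and is predicted dysplastic when GBV falls strictly below its cutoff; the strict inequalities, the candidate grids and the function names are assumptions, and undefined ratios are scored as 0 so that degenerate cutoffs are never selected.

```python
from typing import Optional, Sequence, Tuple


def diagnostic_stats(na: Sequence[float], gbv: Sequence[float],
                     na_cut: float, gbv_cut: float) -> Tuple[float, float, float, float]:
    """Return (sensitivity, specificity, PPV, NPV) for one pair of cutoffs."""
    tp = fp = fn = tn = 0
    for angle, value in zip(na, gbv):
        truth = angle < na_cut   # dysplastic by phenotype
        call = value < gbv_cut   # dysplastic by genomic prediction
        if truth and call:
            tp += 1
        elif call:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1

    def ratio(num: int, den: int) -> float:
        return num / den if den else 0.0

    return (ratio(tp, tp + fn), ratio(tn, tn + fp),
            ratio(tp, tp + fp), ratio(tn, tn + fn))


def best_cutoffs(na: Sequence[float], gbv: Sequence[float],
                 na_grid: Sequence[float],
                 gbv_grid: Sequence[float]) -> Optional[Tuple[float, float, float]]:
    """Pick the (NA, GBV) cutoffs that maximize the minimum of the four statistics."""
    best = None
    for na_cut in na_grid:
        for gbv_cut in gbv_grid:
            score = min(diagnostic_stats(na, gbv, na_cut, gbv_cut))
            if best is None or score > best[0]:
                best = (score, na_cut, gbv_cut)
    return best
```

Maximizing the minimum keeps any single statistic from being traded away to inflate the others, which matters when no clinical cutoff is fixed in advance.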

Results

Estimated breeding value

Canine hip dysplasia was measured using the Norberg angle. The EBV for each dog was obtained from a multiple trait model in our previous study [13]. Breed was included as a co-factor. The average EBV was restricted to zero for each breed. Table 1 displays the averages and standard deviations of the EBVs of the 359 dogs sampled from various breeds and crosses. The 7 Greyhounds did not show HD and the standard deviations within this breed were small. The standard deviations within breed were about the same among other pure breeds. The variations among the crosses between Greyhounds and Labrador retrievers were between their parental variations (Table 1). The EBV of each dog was accompanied by a reliability score indicating the degree to which the EBV correlated with the true genetic effect, with 1 and 0 as the closest and farthest, respectively (Table 1).

Table 1.

Estimated Breeding Values (EBV) and reliabilities of hip dysplasia for 359 genotyped dogs*

Measure      Breed               N    Average  SD      Min     Max
EBV          LR                  182  0.81     6.1691  −20.42  9.96
EBV          G                   7    −0.65    2.0233  −4.31   2.20
EBV          LR×G                8    −1.15    2.0455  −4.50   1.54
EBV          F1×LR               68   −0.66    4.4654  −14.71  6.40
EBV          F1×G                17   −0.84    2.1344  −5.19   4.06
EBV          (F1×LR) × (F1×LR)   13   −3.78    2.7547  −8.73   0.92
EBV          German Shepherd     17   −1.24    8.0438  −19.70  6.98
EBV          Golden Retriever    15   0.30     6.7509  −15.40  7.42
EBV          Newfoundland        18   0.39     6.9325  −16.41  9.27
EBV          Rottweiler          14   1.22     6.8150  −10.93  8.89
Reliability  LR                  182  0.91     0.0343  0.70    0.97
Reliability  G                   7    0.90     0.0121  0.89    0.92
Reliability  LR×G                8    0.92     0.0169  0.89    0.94
Reliability  F1×LR               68   0.90     0.0064  0.89    0.93
Reliability  F1×G                17   0.88     0.0051  0.88    0.89
Reliability  (F1×LR) × (F1×LR)   13   0.89     0.0065  0.87    0.89
Reliability  German Shepherd     17   0.87     0.0461  0.76    0.94
Reliability  Golden Retriever    15   0.87     0.0113  0.86    0.91
Reliability  Newfoundland        18   0.83     0.0221  0.86    0.91
Reliability  Rottweiler          14   0.86     0.0354  0.76    0.92
* The number of dogs (N), average, standard deviation (SD), and the minimum and maximum of EBV and reliability are given for each pure breed and for crosses between Labrador retriever (LR) and Greyhound (G). The crosses included the first cross between Labrador retriever and Greyhound (F1), backcross to LR (F1×LR), backcross to Greyhound (F1×G), and a third-generation cross: (F1×LR) × (F1×LR).

Genomic prediction

The training data set contained 359 dogs which were genotyped with the Illumina CanineSNP20 BeadChip containing ~22,000 SNPs (Figure 1). The prediction formula was built in a Bayesian framework [16, 19, 25]. When both genomic data and EBV were used to formulate the model for each dog, the correlation coefficient between the EBV and the predicted GBV was almost 1.00, which indicated that the model was over-parameterized. When the SNPs with the least contribution (smallest scaling factors) to the GBV were gradually removed from the formula, the correlation decreased slowly and steadily. Even when 5,000 SNPs remained in the formula, the correlation was still above 0.98. However, the correlation decreased when the total number of SNPs in the GBV model fell below 100-500 (Figure 2). In addition to the agreement between EBVs and GBVs, a moderate correlation (R=0.47) was observed between the reliabilities of EBVs and GBVs. As expected, the more reliable the EBV, the more reliable the corresponding GBV.

Figure 1.


The properties of single nucleotide polymorphisms (SNPs). The SNPs were genotyped with the Illumina array on 359 dogs and the Affymetrix array on 53 dogs. A: Cumulative distribution of minor allele frequencies; B: The density of the SNPs; C: Distribution of heterozygosity; D: Linkage disequilibrium (LD) decay (R2) over physical distance. The LD was calculated with all breeds and with Labrador retrievers (LR) alone, respectively.

Figure 2.


Model fit of genomic breeding value and estimated breeding value. The model fit (R2) is displayed for each breed/cross over different numbers of the most influential SNPs. The crosses included the first cross (F1) between Labrador retriever and Greyhound (G), backcross to LR (F1×LR), backcross to G (F1×G), and a third-generation cross: (F1×LR) × (F1×LR).

Cross validation

To examine how well the genomic prediction would work for an individual dog without an EBV or any phenotype, we removed one dog at a time from the set of 359, used the remaining dogs to build a new prediction formula, and then used that new formula to predict the GBV of the excluded dog. We repeated this process (jackknife cross validation) for each dog until every dog had its own GBV estimated only from its genotype. During the process of calculating GBV, we used the k most influential SNPs (k=5000, 1000, 500, 200, 100, 50 and 20). Interestingly, when each dog's EBV was not used to predict its own GBV in the cross validation, the correlation (r) dropped from the ~0.98 obtained with the top 5,000 SNPs in the model fit to ~0.60 for the lowest number of SNPs, and the over-parameterization seen before was no longer observed (Figure 3). The cross validation showed a strong correlation (r>0.70) between EBV and GBV by using all the SNPs.

Figure 3.


Accuracy of genomic prediction from cross validation. Linear regression lines and R2 are given for each plot of genomic breeding value (GBV) vs. estimated breeding value (EBV). The GBV of a dog was calculated from its genotypes by using the predictive formula derived from the genotypes and EBVs of all the other dogs (jackknife cross validation). The plots were classified by breed/cross and number of the most influential SNPs used to calculate GBV. The crosses included the first cross (F1) between Labrador retriever and Greyhound (G), backcross to LR (F1×LR), backcross to G (F1×G), and a third-generation cross: (F1×LR) × (F1×LR).

More interestingly, the correlation was well maintained even when the prediction formula was based on the 100-500 most influential SNPs. The reduction in correlation when fewer than 100-500 SNPs were used reflected the loss of SNPs in linkage disequilibrium (LD) with the quantitative trait nucleotides (QTNs) underlying the EBVs. In general, reasonable correlation was achieved when the top 100-500 SNPs were included.

Independent validation

Our final goal was to test whether the predictive formula could be applied to a naïve set of dogs outside the 359 dogs from which the original formula was derived, especially dogs that were unrelated to the original set. We genotyped another set of Labrador retriever dogs (53) with the Affymetrix Canine Array. Fifteen of these dogs were part of the 359 and were used for the purpose of data quality verification (e.g. genotyping error) and imputation of missing SNPs. The other 38 dogs were from among those admitted to the Cornell Hospital and had no known pedigree relationship to the dogs in the first set. Each of these dogs had only a single Norberg angle measurement on each hip. The worse hip angle (the minimum of the two hips) was used as the phenotype for each dog. No EBV was available on these dogs due to lack of pedigree information.

The Affymetrix SNP array contained ~50,000 informative SNPs, including the 13,465 from the Illumina array. The genotype calls on the 15 dogs genotyped with both arrays showed a very strong agreement (concordance rate was 99.9%).

We performed PCA by using the common SNPs. The population structure of these dogs was clearly revealed by the first two principal components (PC). All the dogs within a pure breed were clustered together in the scatter plots (Figure 4). All the F1 dogs of Greyhound/Labrador retriever breedings were positioned between their respective parental breeds. The backcross of the F1 to Labrador retriever was closer to Labrador retriever and the backcross of F1 to Greyhound was closer to Greyhound as expected. The substructure within the Labrador retriever breed reflected the multiple sources of Labrador retriever for these studies. The scatter plot of the first two PCs revealed the dispersion of the relationship of the new 38 Labrador retriever dogs with other dogs (Figure 4).

Figure 4.


Genetic relationship among dogs in the reference and independent sample. The genetic relationship was characterized by the first principal component (PC, X axis) and the second principal component (Y axis) of the SNP genotypes. The PCs were derived from the common 13,465 SNPs shared by the Illumina array and the Affymetrix array. The Illumina array was used to genotype the reference population (359 dogs) and the Affymetrix SNP array was used to genotype the independent validation population (38). Fifteen dogs were genotyped on both platforms for the purpose of data quality control and imputation of missing SNPs. Different pure breeds and crosses are displayed separately. The crosses between Labrador retriever (LR) and Greyhound (G) include the first cross (F1), backcross to LR (F1×L), backcross to G (F1×G), and a third-generation cross: (F1×L) × (F1×L). The Labrador retrievers from the reference population and the independent validation population are also displayed separately to show the diversity between the two subgroups. The Labrador retrievers from the reference sample are displayed as LR and the ones from the independent sample are displayed as LR (Independent).

Among the Illumina 21,455 SNPs used to derive the predictive formula, 40% were not on the Affymetrix SNP array, including the most influential SNPs (Figure 5). We imputed the Illumina SNPs missing on the Affymetrix array. We applied the predictive formula to the 38 dogs by using the common SNPs (without imputation) and all of the Illumina array SNPs (with missing SNPs imputed). This independent validation showed strong correlations (r=0.5 with imputation and r=0.45 without imputation) between their known NA phenotype and GBV (Figure 6).

Figure 5.


Missing rate of single nucleotide polymorphisms (SNPs). There were 21,455 SNPs on the Illumina array that were used to derive the predictive formula. About 40% of these SNPs were not present on the Affymetrix array that was used to genotype the dogs for independent validation (including the first and the fourth most influential SNPs on the Illumina array). The cumulative missing rates of SNPs are plotted against their order (descending log scale) based on their scaling factor.

Figure 6.


Accuracy of genomic prediction from independent validation. The validation was performed on 38 Labrador retrievers with the Norberg angle phenotype and SNPs from the Affymetrix array. The accuracy is displayed as the correlation coefficient between the phenotype and genomic breeding value (GBV). GBV was calculated by using the formula derived from 359 dogs genotyped with the Illumina SNP array. As 40% of SNPs on the Illumina array were not on the Affymetrix array, GBV was calculated with the common SNPs shared by the two arrays and all the SNPs on the Illumina array with missing SNPs imputed. The calculation of GBV was performed with a different number of the most influential SNPs.

Clinical diagnosis/prediction

The cutoffs on NA and GBV were set at 105 degrees and -6, respectively, to define and diagnose dysplastic and non-dysplastic hips. These cutoffs maximized the minimum of the four clinical diagnostic statistics (sensitivity, specificity, positive predictive ability and negative predictive ability) in the reference population of 359 dogs (Figure 7). The corresponding sensitivity, specificity, positive predictive ability and negative predictive ability were 72.22%, 75.00%, 72.22% and 75.00%, respectively, among the 38 dogs in the independent validation.

Figure 7.


Precision of genomic prediction. The dichotomous status of hip dysplasia was defined by the cutoff of NA (X axis) and diagnosed by GBV (Y axis) among 359 dogs in the reference population. The color at each combination of the two cutoffs indicates the corresponding values of sensitivity (A), specificity (B), positive predictive value (C) and negative predictive value (D) and the minimum among the four values (E). The optimized cutoffs were 94 for NA and -6 for genomic breeding value indicated by the blue circle. The corresponding sensitivity, specificity, positive predictive ability and negative predictive ability were 98.77%, 75.00%, 96.97% and 88.23%, respectively.

Discussion

This is the first report showing a repeatable prediction of CHD from genomic data. A reliable prediction was achieved with as few as the 100-500 most influential SNPs. This prediction could be used for risk management of CHD or as a better alternative selection criterion than phenotype [26]. The squared correlation between GBV and observed NA ($R^2 = 0.5^2 = 0.25$) in the independent validation population was close to the correlation level between phenotype and true breeding value represented by heritability, reported as 0.24~0.25 [27] and 0.31~0.35 [28].

Furthermore, higher selection response may be achieved using GBV compared to using EBVs [25, 29, 30] suggesting that genomic selection would therefore be the method of choice to improve hip conformation most efficiently. It would have special use in small breeding programs for which breeders do not have deep and extended pedigrees upon which to estimate breeding values because the pool of reference individuals for a breed would house the genetic information needed for any dog of the same breed. In chickens, an almost four-fold increase in the accuracy of prediction of yet-to-be observed phenotypes for food conversion rate in broilers was reported when genomic prediction of phenotype was used compared with pedigree prediction of phenotype [31]. In mice, genomic predictions, including both additive and dominant SNP effects, produced a higher accuracy of phenotype prediction for various traits than using pedigree information alone [32].

In addition to the financial burden of progeny testing, the time delay to phenotyping at maturity means that many dogs are bred or bought by owners at weaning time without knowledge of their genetic potential for good hip conformation. Genomic prediction could be applied at birth, even prior to weaning and purchase of pups by owners. Nevertheless, reliable phenotypes or EBVs are essential prior to developing GBV. Our result showed that there was a significant correlation (P<0.01) between the reliability of EBV and the accuracy of GBV. The correlation between the reliability of EBV and reliability of GBV was 0.47 (R2=0.22). The more reliable the EBV, the more reliable the corresponding GBV. This was consistent with the previous results that GBV was more accurate when it was derived from EBV than when it was derived from the raw phenotype, as the EBV was more reliable than the raw phenotype.

The qualities of genotyped markers in this study were reasonably good with respect to polymorphism and heterozygosity. The minor allele frequency (MAF) of these SNPs followed a uniform distribution after removing the SNPs with MAF<1%. Heterozygosities had a bimodal distribution with one peak toward zero and one toward 0.5. The distribution was similar to both practical [33-35] and theoretical observations [36] under the assumptions of the neutral theory in a random mating population.

The accuracy of GBV could be higher if the genotyped markers had better coverage of LD by inclusion of more densely spaced markers [37-39]. The LD in our study population decayed very rapidly. A useful LD [40] ($r^2 > 0.3$) only occurred at distances shorter than 30 kilobase pairs in Labrador retrievers and 20 kilobase pairs across the 6 breeds and the crosses. The dog genome is similar in size to the genomes of humans and other mammals, containing approximately 2.5 billion DNA base pairs [41]. With one marker per 20 kilobase pairs, at least 125,000 informative SNPs are required to capture the LD intervals among breeds. The average and median marker intervals on the Illumina CanineSNP20 BeadChip were 107 kb and 70 kb, respectively. This implied that we could have missed many QTNs.
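The marker requirement quoted above is a simple division of genome size by the useful LD distance. The short calculation below reproduces it from the figures given in the text; the variable names are illustrative only.

```python
GENOME_BP = 2_500_000_000   # approximate dog genome size given in the text
LD_DISTANCE_BP = 20_000     # useful LD extent across the 6 breeds and crosses
MEAN_INTERVAL_BP = 107_000  # mean marker interval on the Illumina array

# One informative SNP per LD interval across the whole genome.
min_snps = GENOME_BP // LD_DISTANCE_BP
# How many LD intervals fit between two neighbouring markers on average.
interval_ratio = MEAN_INTERVAL_BP / LD_DISTANCE_BP

print(min_snps)        # 125000
print(interval_ratio)  # 5.35
```

An average gap more than five times the useful LD distance is why many QTNs may lie outside the reach of any genotyped marker.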

GBV for hip conformation will become available for most breeds of interest. However, a continued effort at progeny testing to obtain reliable EBV, even as the new technology is applied, is necessary to retrain the predictive formula and to improve the accuracy of genomic prediction. There will always be impetus to expand the reference panel for a breed and across breeds, such as combining the naïve 38 dogs with our original 359 to use as the next reference population. As more dogs and breeds which have undergone genome wide genotyping are added to the reference population, the subset of SNPs used in the prediction set would be recalculated to capture more SNPs in LD with the causal QTN or genes.

The majority of the 359 dogs we used were Labrador retrievers and their crosses with Greyhounds, yet the other minor breeds were well predicted. This indicated that multiple breeds could be integrated together although they were remarkably diversified from a phenotypic and genotypic point of view. We derived PCs from all the SNPs. Similar to previous reports, we were able to separate the breed structure of the dogs in our study based on a plot of the first and second PCs (Figure 4).

For the four traits that collectively define CHD, there are at least 10-20 QTLs [42-44]. In the current study, we identified the 100-500 most influential SNPs, which provided the most information to the GBV, through Bayesian analysis that jointly estimates their contributions. As the number of SNPs in the reference panel dropped below 50, the SNPs failed to bracket some of the QTNs and thus the accuracy of the GBV decreased.

Our previous genome wide association study identified 4 SNPs associated with CHD and 2 SNPs associated with hip OA [45]. These SNPs were identified through individual SNP associations. Further, the GWAS [45] included many more dogs and breeds which were genotyped within 8 previously identified QTL [44]. Genomic prediction could be enhanced by finding the causal genes through genome wide association studies [45]. Positional cloning of candidate genes will provide opportunities to add intragenic informative SNPs in mutated genes to the genomic prediction panel [46, 47]. However, this may take some time as only one gene, fibrillin 2, has been shown to be associated with CHD to date [48].

Our study indicated that genomic prediction could be effective with the 100-500 most influential SNPs chosen from ~22,000 SNPs. If genotyping these SNPs with a customized array, or sequencing the whole genome, becomes cost-effective for public use, genomic prediction can become a vital and integral part of improving canine breeding practices for CHD and a routine part of personalized canine genetic medicine.

Acknowledgement

We thank Liz Corey and Dr. Marta Castelhano for technical assistance. The study was supported by the National Institutes of Health (1R21AR055228-01A1), the National Science Foundation (#0606461), the Chinese National Key Technologies R&D Program (No. 2006BAD04A01, No. 2006BAD01A10, No. 2008BADB2B03, No. 2008AA101010), the National Department Public Benefit Research Foundation of China (No. nyhyzx07-035), the National Natural Science Foundation of China (No. 30871774), the Waltham Center for Pet Nutrition, Cornell Advanced Technology in Biotechnology, and the Collaborative Research Grant Program, Department of Clinical Sciences, and the Baker Institute for Animal Health in the College of Veterinary Medicine, Cornell University. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Conflict of interest

The authors have no financial or personal relationships with other people or organizations that could bias their research.

References

  • 1. Cachon T, Genevois JP, Remy D, Carozzo C, Viguier E, Maitre P, et al. Risk of simultaneous phenotypic expression of hip and elbow dysplasia in dogs: a study of 1,411 radiographic examinations sent for official scoring. Vet Comp Orthop Traumatol. 23:28–30. doi: 10.3415/VCOT-08-11-0116.
  • 2. Clements DN, Carter SD, Innes JF, Ollier WE. Genetic basis of secondary osteoarthritis in dogs with joint dysplasia. Am J Vet Res. 2006;67:909–918. doi: 10.2460/ajvr.67.5.909.
  • 3. Smith GK, Mayhew PD, Kapatkin AS, McKelvie PJ, Shofer FS, Gregor TP. Evaluation of risk factors for degenerative joint disease associated with hip dysplasia in German Shepherd Dogs, Golden Retrievers, Labrador Retrievers, and Rottweilers. J Am Vet Med Assoc. 2001;219:1719–1724. doi: 10.2460/javma.2001.219.1719.
  • 4. Todhunter RJ, Grohn YT, Bliss SP, Wilfand A, Williams AJ, Vernier-Singer M, et al. Evaluation of multiple radiographic predictors of cartilage lesions in the hip joints of eight-month-old dogs. Am J Vet Res. 2003;64:1472–1478. doi: 10.2460/ajvr.2003.64.1472.
  • 5. Breur G, Lust G, Todhunter R. Genetics of hip dysplasia and other orthopedic diseases. In: Ruvinsky A, Sampson J, editors. The genetics of the dog. CAB International; Wallingford, Oxon, UK: 2001. pp. 267–298.
  • 6. Weinstein SL. Natural history of congenital hip dislocation (CDH) and hip dysplasia. Clin Orthop Relat Res. 1987:62–76.
  • 7. Jacobsen S, Sonne-Holm S. Hip dysplasia: a significant risk factor for the development of hip osteoarthritis. A cross-sectional survey. Rheumatology (Oxford) 2005;44:211–218. doi: 10.1093/rheumatology/keh436.
  • 8. Russell ME, Shivanna KH, Grosland NM, Pedersen DR. Cartilage contact pressure elevations in dysplastic hips: a chronic overload model. J Orthop Surg Res. 2006;1:6. doi: 10.1186/1749-799X-1-6.
  • 9. Burton-Wurster N, Farese JP, Todhunter RJ, Lust G. Site-specific variation in femoral head cartilage composition in dogs at high and low risk for development of osteoarthritis: insights into cartilage degeneration. Osteoarthritis Cartilage. 1999;7:486–497. doi: 10.1053/joca.1999.0244.
  • 10. Jacobs J. Rosemont. American Academy of Orthopaedic Surgeons; 2008. The Burden of Musculoskeletal Diseases in the United States; p. 247.
  • 11. Kurtz S, Ong K, Lau E, Mowat F, Halpern M. Projections of primary and revision hip and knee arthroplasty in the United States from 2005 to 2030. J Bone Joint Surg Am. 2007;89:780–785. doi: 10.2106/JBJS.F.00222.
  • 12. Spady TC, Ostrander EA. Canine behavioral genetics: pointing out the phenotypes and herding up the genes. Am J Hum Genet. 2008;82:10–18. doi: 10.1016/j.ajhg.2007.12.001.
  • 13. Zhang Z, Zhu L, Sandler J, Friedenberg SS, Egelhoff J, Williams AJ, et al. Estimation of heritabilities, genetic correlations, and breeding values of four traits that collectively define hip dysplasia in dogs. American Journal of Veterinary Research. 2009;70:483–492. doi: 10.2460/ajvr.70.4.483.
  • 14. Hou Y, Wang Y, Lust G, Zhu L, Zhang Z, Todhunter RJ. Retrospective analysis for genetic improvement of hip joints of cohort labrador retrievers in the United States: 1970-2007. PLoS ONE. 2010;5:e9410. doi: 10.1371/journal.pone.0009410.
  • 15. Schaeffer LR. Strategy for applying genome-wide selection in dairy cattle. J Anim Breed Genet. 2006;123:218–223. doi: 10.1111/j.1439-0388.2006.00595.x.
  • 16. Goddard M. Genomic selection: prediction of accuracy and maximisation of long term response. Genetica. 2009;136:245–257. doi: 10.1007/s10709-008-9308-0.
  • 17. Lust G, Todhunter RJ, Erb HN, Dykes NL, Williams AJ, Burton-Wurster NI, et al. Comparison of three radiographic methods for diagnosis of hip dysplasia in eight-month-old dogs. Journal of the American Veterinary Medical Association. 2001;219:1242–1246. doi: 10.2460/javma.2001.219.1242.
  • 18. Maffia M, Acierno R, Cillo E, Storelli C. Na(+)-D-glucose cotransport by intestinal BBMVs of the Antarctic fish Trematomus bernacchii. Am J Physiol. 1996;271:R1576–1583. doi: 10.1152/ajpregu.1996.271.6.R1576.
  • 19. Meuwissen TH, Hayes BJ, Goddard ME. Prediction of total genetic value using genome-wide dense marker maps. Genetics. 2001;157:1819–1829. doi: 10.1093/genetics/157.4.1819.
  • 20. Heuven H, Janss L. Bayesian multi-QTL mapping for growth curve parameters. BMC Proceedings. 4:S12. doi: 10.1186/1753-6561-4-s1-s12.
  • 21. Li Y, Willer C, Sanna S, Abecasis G. Genotype imputation. Annu Rev Genomics Hum Genet. 2009;10:387–406. doi: 10.1146/annurev.genom.9.081307.164242.
  • 22. Marchini J, Howie B. Genotype imputation for genome-wide association studies. Nat Rev Genet. 2010;11:499–511. doi: 10.1038/nrg2796.
  • 23. Altman DG, Bland JM. Diagnostic tests 2: Predictive values. BMJ. 1994;309:102. doi: 10.1136/bmj.309.6947.102.
  • 24. Taube A. The predictive value of microbiologic diagnostic tests if asymptomatic carriers are present. Statistics in Medicine. 2003;22:1201–1202. doi: 10.1002/sim.1476. Ronny K. Gunnarsson and Jan Lanke, Statistics in Medicine 2002; 21:1773–1785.
  • 25. VanRaden PM, Van Tassell CP, Wiggans GR, Sonstegard TS, Schnabel RD, Taylor JF, et al. Invited review: reliability of genomic predictions for North American Holstein bulls. J Dairy Sci. 2009;92:16–24. doi: 10.3168/jds.2008-1514.
  • 26. Guo G, Lund MS, Zhang Y, Su G. Comparison between genomic predictions using daughter yield deviation and conventional estimated breeding value as response variables. Journal of Animal Breeding and Genetics. 2010;127:423–432. doi: 10.1111/j.1439-0388.2010.00878.x.
  • 27. Hamann H, Kirchhoff T, Distl O. Bayesian analysis of heritability of canine hip dysplasia in German Shepherd Dogs. Journal of Animal Breeding and Genetics. 2003;120:258–268.
  • 28. Leppänen M, Mäki K, Juga J, Saloniemi H. Estimation of heritability for hip dysplasia in German Shepherd Dogs in Finland. Journal of Animal Breeding and Genetics. 2000;117:97–103.
  • 29. Stock KF, Distl O. Simulation study on the effects of excluding offspring information for genetic evaluation versus using genomic markers for selection in dog breeding. J Anim Breed Genet. 2010;127:42–52. doi: 10.1111/j.1439-0388.2009.00809.x.
  • 30. Hayes BJ, Bowman PJ, Chamberlain AJ, Goddard ME. Invited review: Genomic selection in dairy cattle: progress and challenges. J Dairy Sci. 2009;92:433–443. doi: 10.3168/jds.2008-1646.
  • 31. Gonzalez-Recio O, Gianola D, Long N, Weigel KA, Rosa GJ, Avendano S. Nonparametric methods for incorporating genomic information into genetic evaluations: an application to mortality in broilers. Genetics. 2008;178:2305–2313. doi: 10.1534/genetics.107.084293.
  • 32. Legarra A, Robert-Granie C, Manfredi E, Elsen JM. Performance of genomic selection in mice. Genetics. 2008;180:611–618. doi: 10.1534/genetics.108.088575.
  • 33. Makarieva AM. Variance of protein heterozygosity in different species of mammals with respect to the number of loci studied. Heredity. 2001;87:41–51. doi: 10.1046/j.1365-2540.2001.00899.x.
  • 34. Higashino A, Osada N, Suto Y, Hirata M, Kameoka Y, Takahashi I, et al. Development of an integrative database with 499 novel microsatellite markers for Macaca fascicularis. BMC Genet. 2009;10:24. doi: 10.1186/1471-2156-10-24.
  • 35. Rogers J, Garcia R, Shelledy W, Kaplan J, Arya A, Johnson Z, et al. An initial genetic linkage map of the rhesus macaque (Macaca mulatta) genome using human microsatellite loci. Genomics. 2006;87:30–38. doi: 10.1016/j.ygeno.2005.10.004.
  • 36. Fuerst PA, Chakraborty R, Nei M. Statistical studies on protein polymorphism in natural populations. I. Distribution of single locus heterozygosity. Genetics. 1977;86:455–483.
  • 37. Calus MP, Meuwissen TH, de Roos AP, Veerkamp RF. Accuracy of genomic selection using different methods to define haplotypes. Genetics. 2008;178:553–561. doi: 10.1534/genetics.107.080838.
  • 38. Solberg TR, Sonesson AK, Woolliams JA, Meuwissen THE. Genomic selection using different marker types and densities. J Anim Sci. 2008;86:2447–2454. doi: 10.2527/jas.2007-0010.
  • 39. Habier D, Fernando RL, Dekkers JCM. Genomic Selection Using Low-density Marker Panels. Genetics. 2009 doi: 10.1534/genetics.108.100289.
  • 40. Sargolzaei M, Schenkel FS, Jansen GB, Schaeffer LR. Extent of linkage disequilibrium in Holstein cattle in North America. J Dairy Sci. 2008;91:2106–2117. doi: 10.3168/jds.2007-0553.
  • 41. Parker HG, Ostrander EA. Canine genomics and genetics: running with the pack. PLoS Genet. 2005;1:e58. doi: 10.1371/journal.pgen.0010058.
  • 42. Marschall Y, Distl O. Mapping quantitative trait loci for canine hip dysplasia in German Shepherd dogs. Mammalian Genome. 2007;18:861–870. doi: 10.1007/s00335-007-9071-z.
  • 43. Chase K, Lawler DF, Adler FR, Ostrander EA, Lark KG. Bilaterally asymmetric effects of quantitative trait loci (QTLs): QTLs that affect laxity in the right versus left coxofemoral (hip) joints of the dog (Canis familiaris). Am J Med Genet A. 2004;124:239–247. doi: 10.1002/ajmg.a.20363.
  • 44. Todhunter RJ, Mateescu R, Lust G, Burton-Wurster NI, Dykes NL, Bliss SP, et al. Quantitative trait loci for hip dysplasia in a cross-breed canine pedigree. Mammalian Genome. 2005;16:720–730. doi: 10.1007/s00335-005-0004-4.
  • 45. Zhou Z, Sheng X, Zhang Z, Zhao K, Zhu L, Guo G, et al. Differential Genetic Regulation of Canine Hip Dysplasia and Osteoarthritis. PLoS ONE. 2010;5:e13219. doi: 10.1371/journal.pone.0013219.
  • 46. Goddard ME, Hayes BJ. Mapping genes for complex traits in domestic animals and their use in breeding programmes. Nat Rev Genet. 2009;10:381–391. doi: 10.1038/nrg2575.
  • 47. Georges M. Mapping, fine mapping, and molecular dissection of quantitative trait Loci in domestic animals. Annu Rev Genomics Hum Genet. 2007;8:131–162. doi: 10.1146/annurev.genom.8.080706.092408.
  • 48. Friedenberg SG, Zhu L, Zhang Z, Van den Berg Foels W, Schweitzer PA, Todhunter RJ. A fibrillin 2 haplotype associated with canine hip dysplasia and incipient osteoarthritis. Am J Vet Res. 2010 doi: 10.2460/ajvr.72.4.530. accepted.
