Abstract
In the last decade breeders of Romagnola cattle observed an outbreak of a new congenital anomaly. This lethal multi-organ developmental dysplasia is mainly characterized by facial deformities, ascites and hepatic fibrosis. Affected stillborn calves were inbred to a single founder sire suggesting autosomal monogenic recessive inheritance. We localized the causative mutation to a 1.2 Mb interval on BTA 17 by genome-wide association and identical by descent mapping. A solution-based method for targeted DNA capture combined with massively parallel sequencing was used to analyze the entire critical region containing 24 genes. Homozygosity for two non-synonymous coding sequence variants affecting the RNF34 and KDM2B genes was detected by evaluating one affected calf. Here we show that the disease phenotype is associated with a KDM2B missense mutation (c.2503G>A) leading to an amino acid exchange (p.D835N) in an evolutionary strongly conserved domain. In addition, the genetic makeup of three inbred cattle strongly supports the causality of the KDM2B mutation. This report of a naturally-occurring spontaneous mutation of a JmjC domain containing histone demethylase gene provides evidence for their important role in the endo- and mesodermal organ development.
Introduction
Since 2003 breeders of Romagnola cattle in Italy have experienced an increase in the occurrence of a previously unknown complex congenital anomaly, mainly characterized by craniofacial deformities, an enlarged fluid-filled abdomen and hepatic fibrosis [1]. Affected calves are usually stillborn and the disease is known as ‘paunch calf syndrome’ (Figure 1). The emerging paunch calf syndrome disease causes significant economic and animal welfare concerns. The paunch calf syndrome defect is supposed to be inherited as a monogenic autosomal recessive trait [1]. The outbreak of paunch calf syndrome can be attributed to selective breeding practices involving the extensive use of particular highly selected artificial insemination sires leading to an increase in co-ancestry and inbreeding [2]. Parents of affected calves do not present any clinical signs. Therefore, it is crucial that the breeders implement an effective strategy to control this recessive defect [1], [3]. Identification of the causative mutation has an immediate translation into breeding practice, allowing direct DNA testing and effective selection against the defect through avoidance of at-risk matings [2]. After sequencing the cow genome [4], [5] recent examples showed that the availability of genome-wide, high-density SNP arrays, combined with the typical structure of cattle populations, markedly accelerates the positional identification of genes and mutations that cause inherited recessive defects [2], [6], [7].
In humans and other mammalian species a comparable syndrome has also not yet been described. Lacking functional candidate genes for this unique phenotype, the spontaneous cattle paunch calf syndrome mutants provide the potential to gain insights into the pathogenesis of this developmental dysplasia affecting several endo- and mesodermal derived organs.
The aim of our investigation was to identify the mutation associated with paunch calf syndrome in Romagnola cattle using a genome-wide association study mapping approach, followed by targeted re-sequencing of the entire disease locus.
Results
The Paunch Calf Syndrome Mutation Maps to BTA 17
We genotyped 777,962 SNPs in 16 paunch calf syndrome affected calves and 41 controls. After removing non-informative markers 536,171 SNPs were used for the subsequent analyses. A genome-wide association study revealed a single strong signal for paunch calf syndrome on bovine chromosome (BTA) 17 (Figure 2A). The best-associated SNP had a genome-wide corrected p-value of 1.0×10−4 at 56,938,011 bp on BTA 17 (Figure 2B).
The quite uniform phenotype and a high degree of inbreeding suggested the existence of a single recessive founder mutation. Pedigree analysis indicated that all paunch calf syndrome cases with available pedigree records traced back to a once popular Romagnola sire born in 1969, between four to eight generations on both, the paternal and maternal side (Figure S1). Therefore, we applied an identical by descent mapping approach to fine-map the region containing the paunch calf syndrome mutation. Based on the pedigree records we hypothesized that the affected calves most likely were inbred to one single founder animal (Figure S1). Under this scenario the affected individuals were expected to be identical by descent (IBD) for the causative mutation and flanking chromosomal segments. To identify IBD regions between paunch calf syndrome cases we analyzed the cases for extended regions of homozygosity with simultaneous allele sharing. Only the disease-associated genome region on BTA 17 fulfilled our search criteria. All 16 cases were homozygous for the same disease-associated haplotype and shared identical alleles over 341 SNP markers corresponding to a 1.23 Mb interval (55,880,892–57,114,451 bp) on BTA 17 (Figure 2C). This cattle chromosome region contains 24 genes and loci according to the annotation of the corresponding human chromosome (HSA) 5 intervals (Figure 2D).
A Missense Mutation in the KDM2B Gene is the Most Likely Causative Mutation
To identify the mutation associated with paunch calf syndrome, we used a targeted sequence capture approach to re-sequence the critical 1.23 Mb interval in one affected calf. Sequencing coverage was limited to approximately 55% of the target region, as repetitive elements were masked during target enrichment probe design. Alignment of sequence reads showed that we obtained 98.6% of the single-copy sequences of the targeted BTA 17 interval, with 95.1% of the region covered by at least 50 high-quality reads. We mapped 17,986,641 sequences to the target region and a total of 8,030,795 sequence reads (44.6%) were reliably mapped to the target region with a mapping quality >20 resulting in a mean coverage of 406 for the probe design region. In total, 95% (87.4% with mapping quality >20) of the coding bases of the exome defined by UMD3.1 cattle genome annotation [8] were represented by at least 50 reads per base. Within the critical interval, we identified 400 homozygous sequence variants (382 SNPs and 18 indels) in comparison to the reference cattle genome sequence (Table S1). Visual inspection of the read-pairs revealed no indication for structural variants spanning coding regions. To distinguish potentially pathogenic mutations from other variants, we focused only on homozygous variants located in coding and splice site sequences, anticipating that variants in non-coding sequences were less likely to be pathogenic. Applying this filter, we identified two non-synonymous coding variants that were associated with paunch calf syndrome. These were a 12 bp deletion (c.704–716del) in the ring finger protein 34, E3 ubiquitin protein ligase gene (RNF34) leading to a truncated stretch of aspartic acid residues, and a SNP (c.2503G>A) in the lysine (K)-specific demethylase 2B gene (KDM2B) causing an aspartic acid to asparagine substitution (p.D835N) (Figure S2). The c.2503G>A transition is located in KDM2B exon 18 and the encoded residue is located between the CXXC zinc finger and F-box domains of the KDM2B protein in a region of very high sequence conservation (Figure 3).
Both polymorphisms were further investigated by genotyping additional cases, parents and unrelated controls using Sanger sequencing (Figure S1, Figure S2). We found perfect concordance between the presence of the KDM2B c.2503G>A variant and the paunch calf syndrome phenotype (Table 1). Only one reported paunch calf syndrome carrier yielded a discordant genotyping result and was homozygous for the wildtype allele, which is most likely due to a phenotyping error of the offspring (Table 1). The RNF34 deletion was not perfectly associated with paunch calf syndrome was thus excluded (Table 1). A total of 3 out of 255 Romagnola controls were genotyped as homozygous mutant for the RNF34 deletion and should by implication be affected if that were the causative mutation, but they were only heterozygous for the KDM2B SNP. Interestingly, these 3 animals were inbred to the assumed paunch calf syndrome founder (Figure 4). We performed SNP array genotyping of these individuals and detected that they were homozygous for all tested markers in the critical interval on BTA 17. In other words their SNP array genotypes in the critical region were identical to the SNP array genotypes of paunch calf syndrome cases.
Table 1. Genotyping results for RNF34 and KDM2B mutations.
RNF34 genotype (c.704–716del) | ||||
del/del | del/wt | wt/wt | ||
Cases | (n = 65) | 65 | ||
Parentsa | (n = 57) | 56 | 1b | |
Romagnola controls | (n = 466) | 3 | 172 | 291 |
Other breed controls | (n = 288) | 288 | ||
KDM2B genotype (c.2503G>A) | ||||
A/A | AG | GG | ||
Cases | (n = 65) | 65 | ||
Parentsa | (n = 57) | 56 | 1b | |
Romagnola controls | (n = 466) | 150 | 316 | |
Other breed controls | (n = 288) | 288 |
Cattle recorded as parent of affected calves.
For this cow only one single suspicious offspring was recorded by the breeding organization and the diagnosis of this calf had not been confirmed by necropsy. Genotyping of additional flanking markers revealed that this animal did not carry the disease-associated haplotype. Therefore, the reported calf most likely represented a phenocopy and the parent is indeed free of the deleterious mutation.
None of 523 healthy Romagnola cattle had the homozygous KDM2B A/A genotype (Table 1). The KDM2B variant was absent in all tested non-Romagnola cattle. It had an allelic frequency of 16% amongst 466 healthy Romagnola individuals that were not related to cases at the parent level. We carefully checked the available pedigree records of genotyped carriers and in all A/G animals we were able to identify at least one line of relationship to the assumed founder. The two oldest genotyped carriers were a grandson and a great-grandson of this sire, respectively (Figure S1).
Discussion
Using a genome-wide association study approach followed by targeted re-sequencing, we have identified the KDM2B c.2503G>A variant as most likely causative mutation for the lethal multi-organ developmental dysplasia in homozygous mutant animals. We exploited the special structure of cattle families to identify the causative mutation by a purely genetic approach. The rich pedigree records in cattle breeding allowed us to identify the possible founder animal of paunch calf syndrome, a Romagnola bull born in 1969. A few generations later several cattle received two copies of the same chromosome segment from this bull due to inbreeding. We hypothesize that this sire was a germline mosaic as he transmitted copies of the BTA 17 chromosome which acquired the causative mutation, while the other copies stayed in the ancestral state (Figure S1; Figure 4). Three inbred animals, which were IBD for all tested markers across the critical interval, were heterozygous at the KDM2B mutation. The recognition of these animals was a key element in our discovery and illustrates once again the potential of livestock specific population structures for genetic research [6]. In addition, we detected mutation carriers born two or three generation after the assumed founder, which provide additional support for the hypothesis of the origin of the causative mutation. The DNA-based test for the detection of paunch calf syndrome carriers test been already adopted by the Italian Romagnola Breeders’ Association and all the top sires indexed in the annual catalogues are identified as to whether they are carriers of the paunch calf syndrome causative mutation. Since summer 2011, there has been a requirement for young bulls short-listed for entrance into the performance testing program to be tested for the mutation; carriers are excluded from the programme. The 32% carrier frequency within the examined Romagnola population is very high compared to other recessive traits we studied [6+7]. This might explain the recently observed outbreak of the paunch calf syndrome disease due to intensive usage of a limited number of artificial insemination sires and subsequent increase of inbreeding level.
Due to the recessive inheritance and the lethal effect of the mutation we hypothesized that most likely a loss of function mutation affecting the coding sequence of a gene would be responsible for paunch calf syndrome. Therefore, we subsequently concentrated on homozygous variants that were located within the coding sequences or within the splice sites of the annotated genes in the targeted region of the bovine genome. Our genetic data implicates KDM2B as a cause of paunch calf syndrome. Only the identified missense mutation in KDM2B co-segregates with the disease phenotype, and the re-sequencing of the entire paunch calf syndrome locus failed to identify other perfectly associated non-synonymous variants in transcribed sequences. The second non-synonymous variant in RNF34 is most likely functionally neutral as we detected the homozygous mutant RNF34 genotype in three normal control cattle. Unfortunately, there is no experimental crystal structure available to predict the structures of wildtype and mutant KDM2B proteins. The exchange of an acidic residue to an uncharged amino acid with the loss of a negative charge may disrupt the normal structure. Another possible structural effect could be that an aspartic acid residue accepts two hydrogen bonds instead of only one in case of an asparagine. In regard to the close vicinity of the mutant residue to the zinc finger domain of KDM2B the mutation could also affect its role as metal ligand.
The KDM2B gene (also known as JHDM1B and FBXL10) encodes a histone H3 lysine 36 dimethyl (H3K36me2)-specific demethylase [9], [10]. Histone methylation is one important transcription regulatory system that affects mammalian development and cell differentiation [11]. During the embryonic and fetal differentiation, chromatin modifications are important to drive the correct temporal expression of specific genes. Transcription factors act in concert with histone acetylatases and methylases to influence gene transcription by modifying the architecture of chromatin at specific loci [12]. Histone methylation had been thought of as an irreversible epigenetic mark until the first lysine specific histone demethylase LSD1 (also known as KDM1) was discovered [13]. Subsequent to the discovery of LSD1, another larger family of more than 30 histone demethylases sharing a motif designated the Jumonji C (JmjC) domain, structurally different from LSD1 but including KDM2B, was described [10]. This enzyme catalyzes the removal of methyl groups from a specific lysine residue on histone H3 and is capable of suppressing cellular senescence [9]. Recently, it was shown that KDM2B acts in concert with other key reprogramming factors in early gene activation and promotes induced pluripotent stem cell generation from somatic cells [14]. To our knowledge, no human patients with KDM2B mutations have so far been identified. Several other JmjC-domain-containing histone demethylases have been implicated in disease including prostate cancer [15] and X-linked intellectual disability [16+17]. Kdm2b-deficient mice exhibit failure of neural tube closure, resulting in exencephaly and death shortly after birth [18]. These recent findings in mice suggest that the KDM2B gene is involved in neural development. Our report indicates that KDM2B function is also important for the development of endo- and mesodermal derived organs like bone and liver.
In conclusion, our results suggest that the KDM2B c.2503G>A variant represents the causative mutation for paunch calf syndrome in Romagnola cattle. Especially, the genetic makeup of three inbred cattle provides extremely strong support for the causality of this mutation. Selection against this mutation can now be used to eliminate paunch calf syndrome from Romagnola cattle in production systems. Furthermore, we discovered a hint for a possible essential role of this histone demethylase during the development of endo- and mesodermal organs. This study highlights the potential of naturally-occurring mutants in domestic animals to gain further insights into gene functions.
Materials and Methods
Ethics Statement
All animal work has been conducted according to the national and international guidelines for animal welfare. There is no permit number as this study is not based on an invasive animal experiment. All cattle owners agreed that the samples can be used for our study. The data were obtained during diagnostic procedures that would have been carried out anyway. This is a very special situation in veterinary medicine. As the data are from client-owned cattle that underwent veterinary exams, there was no “animal experiment” according to the legal definitions in Italy.
Animals
We collected samples from 65 paunch calf syndrome affected calves (35 male, 30 female) from different farms. For most of the cases detailed pedigree records were available and their relationship is illustrated in Figure S1. The phenotype classification was primarily based on the owners’ reports. Most of the affected animals were seen by a veterinarian and for some cases necropsy results were available. Additionally, we collected 57 samples recorded as parents (14 sires, 43 dams) of affected offspring and 466 healthy Romagnola cattle (352 male, 114 female) resulting in a total of 588 samples from this breed. For the association study we also used 288 normal cattle from 25 genetically diverse Bos taurus breeds.
Genome Wide Association Mapping
Genotyping was performed using the BovineHD BeadChip (Illumina), including 777,962 evenly distributed SNPs on the 29 bovine autosomes and X chromosome (median distance between markers: 2.68 kb), and standard protocols recommended by the manufacturer. A total of 241,791 SNPs were excluded based on genotyping rate (call frequency <80%; 5,949 SNPs) and minor allele frequency (MAF <10%; 237,307 SNPs). For single-marker association studies the genotype frequencies at 536,171 SNPs were compared between cases and controls using a standard allelic association test implemented with PLINK [19]. Genome-wide corrected empirical p-values were determined applying the max(T) permutation procedure implemented in PLINK with 500,000 permutations. To identify extended homozygous regions with allele sharing across all affected animals the options –homozyg-group and –homozyg-match were applied. All given positions correspond to the UMD3.1 cattle genome assembly [8]. The corresponding human chromosome segments (build 37.2) were identified by BLASTN searches of bovine SNP flanking sequences to the human genome sequence.
Sequence Capture Enrichment
DNA fragmentation was conducted using a microTube in a Covaris S2 instrument, resulting in 100 to 150 bp average fragment length at the sample peak, as monitored using an Agilent Bioanalyzer. Ends were repaired using a DNA end polishing enzyme mixture (Agilent). Double stranded custom adaptors containing barcode sequences (sequences available upon request; Agilent) were ligated to the blunt ended genomic DNA fragments using T4 DNA ligase (Agilent). Resulting fragments were purified by elution from E-gel (Invitrogen) to enrich approximately 200 bp length fragments. These fragments were nick translated and amplified by PCR using primers specific for adaptor sequences for 12 cycles using Platinium PCR SuperMix High Fidelity (Invitrogen) standard PCR conditions.
PCR products were then enriched using a custom SureSelect 120 bp probe set in solution (Agilent). Probes were designed using eArray web-based software (Agilent) to cover a 1.74 Mb region from 55,853,187 to 57,599,455 bp which includes the homozygosity region of BTA 17 and three additional segments of the bovine genome with a total size of 2.46 Mb as control. Repetitive elements were excluded from the probe design. A total of 56,923 probes were created for the target regions. For hybridization, we followed the manufacturer’s directions with the following modifications. The library was hybridized at 65°C for 24 h, and then purified using streptavidin M280 beads (Dynal, Invitrogen). Eluted DNA was amplified using adaptor-specific primers for 12 cycles as before hybridization. Enriched products were analyzed using an Agilent Bioanalyzer to determine their size and concentration.
High Throughput Sequencing, Read Mapping and Variant Calling
The paired-end library was sequenced on each end for 50 and 35 cycles, respectively, on one quarter of a slide on a SOLiD4 sequencer (Life Technologies). We collected 383 million reads (191.5 million pairs) and aligned them to the bovine genome (UMD3.1) using BioScope (Life Technologies) with default settings. Alignment data were stored in BAM format and visualized using IGV software [20]. Duplicate pairs were removed, and only reads with reliable alignment quality >20 were used for the variant (SNP and small indel) detection. Sequence variants were detected using BioScope software. High-confidence variants were supported by 8 or more high-quality aligned reads on both strands. Lower confidence variants for low coverage regions were required to be supported by at least 4 reads on each strand.
Sanger Re-sequencing
Two variants were genotyped by re-sequencing of targeted PCR products using Sanger sequencing technology. PCR products were amplified using AmpliTaqGold360Mastermix (Life Technologies). PCR products were directly sequenced on an ABI 3730 capillary sequencer (Life Technologies) after treatment with exonuclease I and shrimp alkaline phosphatase. Sequence data were analyzed with Sequencher 4.9 (GeneCodes).
Supporting Information
Acknowledgments
We would like to thank the Functional Genomics Center Zürich and the Next Generation Sequencing Platform of the University of Bern for their technical support. The authors would also like to thank all Romagnola cattle breeders and the ANABIC for donating samples and sharing pedigree data.
Funding Statement
The authors thank the region of Emilia Romagna (Italy) for financial support. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1. Testoni S, Militerno G, Rossi M, Gentile A (2009) Congenital facial deformities, ascites and hepatic fibrosis in Romagnola calves. Vet Rec 164: 693–694. [DOI] [PubMed] [Google Scholar]
- 2. Charlier C, Coppieters W, Rollin F, Desmecht D, Agerholm JS, et al. (2008) Highly effective SNP-based association mapping and management of recessive defects in livestock. Nat Genet 40: 449–454. [DOI] [PubMed] [Google Scholar]
- 3. Thompson PN, Heesterbeek JA, van Arendonk JA (2006) Changes in disease gene frequency over time with differential genotypic fitness and various control strategies. J Anim Sci 84: 2629–2635. [DOI] [PubMed] [Google Scholar]
- 4. The Bovine Genome Sequencing, Analysis Consortium, Elsik CG, Tellam RL, Worley KC (2009) The genome sequence of taurine cattle: A window to ruminant biology and evolution. Science 324: 522–528. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5. Zimin AV, Delcher AL, Florea L, Kelley DR, Schatz MC, et al. (2009) A whole-genome assembly of the domestic cow, Bos taurus. Genome Biol 10: R42. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6. Drögemüller C, Tetens J, Sigurdsson S, Gentile A, Testoni S, et al. (2010) Identification of the bovine arachnomelia mutation by massively parallel sequencing implicates sulfite oxidase (SUOX) in bone development. PLoS Genet 6: e1001079. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7. Drögemüller C, Reichart U, Seuberlich T, Oevermann A, Baumgartner M, et al. (2011) An unusual splice defect in the bovine mitofusin 2 gene (MFN2) is associated with degenerative axonopathy in Tyrolean Grey cattle. PLoS ONE 6: e18931. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Homepage Bos taurus UMD3.1 assembly (2012). Available: http://www.cbcb.umd.edu/research/bos_taurus_assembly.shtml#12. Accessed 2012 March 22.
- 9. He J, Kallin EM, Tsukada Y, Zhang Y (2008) The H3K36 demethylase Jhdm1b/Kdm2b regulates cell proliferation and senescence through p15(Ink4b). Nat Struct Mol Biol 15: 1169–1175. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10. Tsukada Y, Fang J, Erdjument-Bromage H, Warren ME, Borchers CH, et al. (2006) Histone demethylation by a family of JmjC domain-containing proteins. Nature 439: 811–816. [DOI] [PubMed] [Google Scholar]
- 11. Ruthenburg AJ, Li H, Patel DJ, Allis CD (2007) Multivalent engagement of chromatin modifications by linked binding modules. Nat Rev Mol Cell Biol 8: 983–994. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12. Verrier L, Escaffit F, Chailleux C, Trouche D, Vandromme MA (2011) New isoform of the histone demethylase JMJD2A/KDM4A is required for skeletal muscle differentiation PLoS Genet. 7: e1001390. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13. Shi Y, Lan F, Matson C, Mulligan P, Whetstine JR, et al. (2004) Histone demethylation mediated by the nuclear amine oxidase homolog LSD1. Cell 119: 941–953. [DOI] [PubMed] [Google Scholar]
- 14. Liang G, He J, Zhang Y (2012) Kdm2b promotes induced pluripotent stem cell generation by facilitating gene activation early in reprogramming. Nat Cell Biol 14: 457–466. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15. Cloos PAC, Christensen J, Agger K, Maiolica A, Rappsilber J, et al. (2006) The putative oncogene GASC1 demethylates tri- and dimethylated lysine 9 on histone H3. Nature 442: 307–311. [DOI] [PubMed] [Google Scholar]
- 16. Jensen LR, Amende M, Gurok U, Moser B, Gimmel V, et al. (2005) Mutations in the JARID1C gene, which is involved in transcriptional regulation and chromatin remodeling, cause X-linked mental retardation. Am J Hum Genet 76: 227–236. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17. Tahiliani M, Mei P, Fang R, Leonor T, Rutenberg M, et al. (2007) The histone H3K4 demethylase SMCX links REST target genes to X-linked mental retardation. Nature 447: 601–605. [DOI] [PubMed] [Google Scholar]
- 18. Fukuda T, Tokunaga A, Sakamoto R, Yoshida N (2011) Fbxl10/Kdm2b deficiency accelerates neural progenitor cell death and leads to exencephaly. Mol Cell Neurosci 46: 614–624. [DOI] [PubMed] [Google Scholar]
- 19. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, et al. (2007) PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 81: 559–575. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20. Robinson JT, Thorvaldsdóttir H, Winckler W, Guttman M, Lander ES, et al. (2011) Integrative genomics viewer. Nat Biotechnol 29: 24–26. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.