Genes for common diseases

Mark J Caulfield

doi:10.1046/j.0306-5251.2001.01343.x

editorial

2001 Jan;51(1):1–3. doi: 10.1046/j.0306-5251.2001.01343.x

PMCID: PMC2014423 PMID: 11167659

There is increasing interest in the genetic factors that may influence the development of common diseases such as hypertension and diabetes. In this series, the editors have sought to bring readers of the ‘Green’ journal the state of the art in these common complex disorders.

In terms of human genome research it is an exciting time. A draft sequence of the entire human genome will shortly be available in the public domain. As this sequence is refined, we will get a clearer picture of the number of genes and their locations within the genome. In seeking to detect the genes for common complex diseases, it is important to remember that the effects of these genes may be only modest and may operate in concert with environmental factors. Thus the effect of an individual variation within a gene may be small at a population level. This has implications for the size of study required to have adequate power to detect a true positive effect. For example, in hypertension, 1500 families based on affected sibling pairs would be required to detect five genes each contributing 6% to blood pressure variation. Accordingly, when evaluating published studies on complex traits, one of the key questions in the reader's mind should be whether the study was adequately powered to detect a true effect.

The progress that has been made towards understanding the genetics of these common diseases will be outlined in two articles. The first is by Kevin O'Shaughnessy on the Genetics of Hypertension and the second by Mark McCarthy & Stephan Menzel on Type II diabetes. In both articles the authors describe how the tools of the genomic trade have been applied to understand the genetic basis of common disease.

The tools are the genetic variations, or polymorphisms, that each of us carries about once every 500 bases within our genome. These polymorphisms can be used to identify disease-causing variants and to explain genetic influence on a trait [1].

The most abundant form of polymorphism is the single nucleotide, or base, change in the code. These single nucleotide polymorphisms, or SNPs (pronounced 'snips'), have become a source of great interest, and a consortium between the pharmaceutical industry and private investors has been formed to identify 300 000 new SNPs by May 2001. This US $45 000 000 investment has already identified 41 209 unique SNPs, which can be found on the Cold Spring Harbor website at http://snp.cshl.org/. These SNPs will become increasingly important not only in mapping disease genes but also in predicting drug response, and so are of considerable interest to the pharmaceutical industry.

In addition to these simple polymorphisms, there are more complex, highly polymorphic markers, which have been widely employed to map disease genes in common disorders. These markers are based on repetitive sequences in DNA known as simple sequence repeats, which may consist of di-, tri- or tetranucleotide repetitive segments of DNA. Sets of such markers, spread evenly throughout the genome, make up the linkage marker sets that have been employed to date in genome-wide screening to map regions of interest within which genes for complex disorders may lurk. A second type of highly polymorphic marker is the variable number tandem repeat, which is based on repetitive segments of DNA that involve tens or hundreds of nucleotides. All of these genetic markers constitute the tools of the trade and may be used to identify disease-causing genetic variation or polymorphism.
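To make the idea of a simple sequence repeat concrete, the following is a minimal sketch that scans a DNA string for tandem runs of a di-, tri- or tetranucleotide unit. The function name `find_simple_sequence_repeats`, the `min_copies` threshold and the example sequence are all assumptions introduced for illustration; real marker discovery uses dedicated tools and more careful criteria.

```python
import re

def find_simple_sequence_repeats(sequence, min_copies=4):
    """Return (start, unit, copies) for each tandem run of a 2-4 base unit.

    Hypothetical helper for illustration only; min_copies is an arbitrary threshold.
    """
    sequence = sequence.upper()
    hits = []
    # The lazy group tries the shortest unit first (2, then 3, then 4 bases);
    # the back-reference then requires at least min_copies - 1 further copies.
    pattern = re.compile(r"([ACGT]{2,4}?)\1{%d,}" % (min_copies - 1))
    for match in pattern.finditer(sequence):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # a run of a single base is not a di/tri/tetranucleotide repeat
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits

# Invented sequence: a (CA)6 dinucleotide repeat and a (GATA)4 tetranucleotide repeat.
example = "TTG" + "CA" * 6 + "GTT" + "GATA" * 4 + "CC"
repeats = find_simple_sequence_repeats(example)
```

Individuals differ in the number of copies at such a site, which is what makes these markers highly polymorphic and therefore informative for mapping.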

Study designs for understanding complex traits

There are several different types of study that can be applied to identify a gene for a common disease [2]. The most widely applied have been population-based case-control studies [2]. Here it is essential that the cases and controls are drawn from the same population and that there is no risk of ethnic heterogeneity or substratification, which might create a genetic artefact. Ethnicity is particularly important because the frequencies of genetic polymorphisms differ considerably between ethnic groups, and this may lead to spurious findings in the analysis of common diseases [2]. When we study a genetic variation such as a SNP, or a group of SNPs, in a case-control population association study, we are relying on a principle known as linkage disequilibrium. This means that the disease-causing variation has been inherited on the same piece of DNA as the SNP under study over time, such that the two are physically linked and do not become separated by recombination during meiosis. The degree of linkage disequilibrium between a disease-causing allele and the genetic variation under study is important because it can affect the power of the study. When the genotypic data derived from such a study are analysed, the categorical data, or genotypes, are usually arranged in a contingency table and tested using a chi-squared test with the appropriate degrees of freedom.
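The contingency-table analysis described above can be sketched as follows. This is a minimal illustration using only the Python standard library: the genotype counts are invented, the function names are hypothetical, and the p-value shortcut applies only to the two degrees of freedom that a 2 x 3 table (cases and controls by three genotypes) gives.

```python
import math

def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if genotype were independent of case/control status.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

def p_value_two_df(statistic):
    """Upper-tail probability for a chi-squared variable with exactly 2 degrees of freedom."""
    return math.exp(-statistic / 2.0)

# Invented genotype counts in the order AA, Aa, aa (illustration only).
cases = [60, 100, 40]
controls = [90, 90, 20]
statistic, df = chi_squared([cases, controls])
p_value = p_value_two_df(statistic)
print(f"chi-squared = {statistic:.2f}, df = {df}, p = {p_value:.4f}")
```

A small p-value here would only indicate that genotype frequencies differ between the two groups; as noted above, population substratification can produce such a difference without any causal link to disease.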

The second study design that is widely used is family-based linkage analysis [1]. Unlike single gene disorders, which often present relatively early in life, so-called common complex traits will often present in the middle or later years of life [1]. They can also exhibit variable age of onset [1]. Therefore, within a family it may be very difficult to be certain that an individual who is apparently unaffected now will not be affected by the disease of interest in the future [2]. This has led to two approaches in common diseases, both of which are being applied in hypertension and diabetes research.

The first is to choose affected siblings who are concordant for the trait of interest, for example two diabetic siblings or two hypertensive siblings [1]. In the analysis of such a study, we would genotype a series of markers, usually highly polymorphic markers of the simple sequence repeat type, and analyse these data by asking a simple question: do our hypertensive or diabetic sibling pairs share versions of this genetic marker more often than you would expect in the general population? If so, this may be identifying a region of linkage within the genome. Linkage may still be preserved at distances of up to 50 000 000 bases, which is a considerable length of genome [1].
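One simple way to put the allele-sharing question into numbers is the so-called mean test, sketched below. This is not necessarily the analysis used in the studies discussed here, and it assumes that the number of marker alleles each pair shares identical by descent (0, 1 or 2) can be determined unambiguously. Under no linkage, siblings are expected to share 0, 1 or 2 alleles with probabilities 1/4, 1/2 and 1/4, so the expected proportion shared is 0.5 with variance 0.125 per pair. The function name and the counts are invented for illustration.

```python
import math

def sib_pair_mean_test(n_share0, n_share1, n_share2):
    """Mean proportion of alleles shared by affected sib pairs, and a z-score against 0.5.

    Hypothetical helper: inputs are counts of pairs sharing 0, 1 or 2 alleles
    identical by descent at one marker.
    """
    n_pairs = n_share0 + n_share1 + n_share2
    mean_sharing = (0.5 * n_share1 + 1.0 * n_share2) / n_pairs
    # Null distribution: proportions 0, 0.5, 1 with probabilities 1/4, 1/2, 1/4,
    # giving mean 0.5 and variance 0.375 - 0.25 = 0.125 per pair.
    standard_error = math.sqrt(0.125 / n_pairs)
    z_score = (mean_sharing - 0.5) / standard_error
    return mean_sharing, z_score

# Invented counts for 200 affected sibling pairs (illustration only).
mean_sharing, z_score = sib_pair_mean_test(40, 100, 60)
print(f"mean sharing = {mean_sharing:.2f}, z = {z_score:.2f}")
```

A positive z-score indicates excess sharing at the marker; in a genome-wide screen many markers are tested, so a far more stringent threshold than a conventional single-test cut-off is needed before declaring linkage.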

A second form of family-based analysis is discordant sibling pair analysis [1]. Here a severely affected individual is compared with an unaffected sibling at the opposite end of the distribution of a quantitative trait, such as blood pressure, to see whether there are genetic differences between them. This extreme discordant sibling pair analysis may be quite powerful in a common complex trait [2].

Technological advances

Finally, the technological advances of the last 10 years have made it possible to achieve the high throughput required to screen the number of polymorphisms needed to detect common disease-causing genes. The first of these advances, inspired by Kary Mullis, was the polymerase chain reaction, which essentially photocopies DNA within a target sequence. This has allowed increased throughput for genotyping and also for sequencing to detect genetic variation. Many of the current advances are only possible because of this development.

Semi-automated fluorescence based genotyping

The second development is the emergence of methods for high throughput genotyping. The first of these was semi-automated fluorescence based genotyping. Here, using different coloured fluorescent tags, 20 markers may be combined in a single lane on a gel. Since it is perfectly possible to run 96 lanes on a gel, a considerable volume of information can be derived from a single two hour experiment [3]. The fluorescent tags are detected after excitation by a laser scanning device at the base of the gel as they migrate under electrophoresis. This fluorescence based technology has been further refined and is now available in capillary based systems, which offer higher throughput and remove the risk of spillover between lanes in the gel.
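The throughput implied by these figures can be worked through in a few lines. The 20 markers per lane, 96 lanes per gel and two hours per run come from the text; the size of the screen (400 markers typed in 3000 individuals) is a hypothetical example, and the sketch assumes one individual per lane with no lanes set aside for controls or size standards.

```python
import math

MARKERS_PER_LANE = 20  # from the text
LANES_PER_GEL = 96     # from the text
HOURS_PER_GEL = 2      # from the text

# Genotypes obtained from one gel run.
genotypes_per_gel = MARKERS_PER_LANE * LANES_PER_GEL

def gels_needed(n_markers, n_individuals):
    """Gel runs needed, assuming one individual per lane and 20 markers pooled per lane."""
    lanes_per_individual = math.ceil(n_markers / MARKERS_PER_LANE)
    total_lanes = lanes_per_individual * n_individuals
    return math.ceil(total_lanes / LANES_PER_GEL)

# Hypothetical screen: 400 markers typed in 3000 individuals.
gels = gels_needed(400, 3000)
running_hours = gels * HOURS_PER_GEL
print(genotypes_per_gel, gels, running_hours)
```

Even at 1920 genotypes per run, a screen of this assumed size takes several hundred gel runs, which is why further gains in throughput from capillary systems matter.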

Microarrays and DNA chips

There have also been advances in the methods available to detect single nucleotide polymorphisms [4]. These techniques may be facilitated by developments in microarray technology, in which short lengths of DNA are applied to glass slides and combined with target DNA to genotype the targets of interest [4]. The same process can also be carried out on a DNA chip. Whether both techniques will prove equally valuable remains to be seen. However, there is considerable interest in whether such high throughput SNP screening may become available in the future [4]. Microarray and chip technology is currently quite expensive and therefore not yet widely used in academia [4]. Nevertheless, microarrays have already clearly demonstrated their value in functional genomics, where one might wish to examine the expression profiles of different genes in a tissue sample [4].

A key question at the end of all of this is what makes O'Shaughnessy, McCarthy and Menzel spend large parts of their lives looking for genes with modest effects on our common diseases. Their collective hope, and mine too, is that by finding these genes we will be better able to predict who is at risk of these common diseases and also, perhaps most importantly, to develop new treatments and to target their use in a refined way. Anyone who treats patients with common diseases like high blood pressure will know that different patients may respond differently to drugs. Essentially, management may amount to a fishing trip in which the patient is exposed to several agents that prove ineffective before an effective alternative emerges. In the subsequent articles on hypertension and diabetes the reviewers have sought to give you the state of the art, but they have not shied away from presenting the difficulties.

References

1. Lander ES, Schork NJ. Genetic dissection of complex traits. Science. 1994;265:2037–2048. doi: 10.1126/science.8091226.
2. Risch N, Zhang H. Extreme discordant sib pairs for mapping quantitative trait loci in humans. Science. 1995;268:1584–1589. doi: 10.1126/science.7777857.
3. Reed PW, Davies JL, Copeman JB, et al. Chromosome-specific microsatellite sets for fluorescence-based, semi-automated genome mapping. Nature Genet. 1994;7:390–395. doi: 10.1038/ng0794-390.
4. The Chipping Forecast. Nature Genet. 1999. p. 1.

