Table 1.
Characteristics/challenge | Possible solutions | Potential benefits of the solution in genetic prognostic studies |
Correlation among genetic polymorphisms | (i) Using linkage disequilibrium (LD) information to investigate tagging single nucleotide polymorphisms (tagSNPs) instead of every correlated polymorphism can prevent this issue [23]; (ii) once an association is found with a genetic polymorphism, the surrounding genomic region (usually within the same LD block) may be investigated in detail to identify the nearby ‘true’ prognostic factor that modifies the prognosis in patients | (i) Reduces the redundancy among variables and simplifies the analysis while also reducing genotyping cost and effort [23]; (ii) may identify the prognostic factor biologically linked to variable prognosis in patients |
Genetic polymorphisms as confounders | Some of the genetic polymorphisms confounding the relationship between the prognostic factor and the outcome are likely to be in close vicinity and can be identified by investigating the genomic region in detail | Genetic confounders can be identified |
Hardy-Weinberg equilibrium (HWE) testing in case-only cohorts | Whether HWE testing is appropriate in case-only cohorts remains to be established | |
Estimating the correct genetic model | Visual inspection of Kaplan-Meier curves for the codominant genetic model may reveal the most suitable genetic model for investigating each polymorphism in multivariable models | Provides a logical and comprehensive solution while also reducing the number of tests to be performed |
Minor allele frequency (MAF) of genetic polymorphisms | Excluding rare polymorphisms (for example, MAF <5%) from the analysis is a common practice | Prevents unstable model construction and improves study power by reducing the multiple testing burden and increasing the events/variables ratio |
Population stratification due to variable frequencies of genetic polymorphisms in different ethnicities | Detecting and controlling for the population substructure in the cohort eliminates this problem (for example, outlier samples may be eliminated from the analysis or ethnicity can be used as a covariate in the analysis) | Prevents biased estimations and increases the study power |
Multiple testing issue due to the investigation of large numbers of polymorphisms | Correction for multiple testing using methods such as the Bonferroni correction or false discovery rate (FDR) control [42] | Reduces the false-positive rate (however, ironically, may also increase the false-negative rate) |
Use of genomic material extracted from archived specimens | Use of new technologies with high rates of successful genotyping [48,49] | Reduces bias and increases study power by allowing the construction of models with a higher number of patients |
Use of tumor versus non-tumor DNA in the same study | Using one DNA type (either tumor or non-tumor) throughout the cohort, depending on the objectives of the study, or checking the correlation between genotype data obtained from tumor and non-tumor DNA samples in a set of patients to see whether they are comparable (for example, tumor DNA may not always be a good surrogate for non-tumor DNA) | Prevents bias in study results created by alterations in tumor tissue DNA (that is, different genotypes in tumor DNA compared to non-tumor DNA) |
The main characteristics of genetic polymorphisms that require additional considerations in genetic prognostic research are summarized. Most of the solutions are already applied in susceptibility studies and can be, or have been, extended to prognostic studies.
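Two of the solutions in Table 1, excluding rare polymorphisms by MAF and correcting for multiple testing, can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions rather than a recommended pipeline: genotypes are assumed to be coded as the count (0, 1 or 2) of one designated allele per patient with no missing calls, the 5% threshold follows the example in the table, FDR control is shown with the Benjamini-Hochberg procedure as one possible FDR method, and all function names are hypothetical.

```python
from typing import Dict, List


def minor_allele_frequency(genotypes: List[int]) -> float:
    """Return the MAF for one polymorphism.

    Assumption: each entry is the count (0, 1 or 2) of one designated
    allele in a patient, with no missing genotype calls.
    """
    if not genotypes:
        raise ValueError("at least one genotype is required")
    allele_freq = sum(genotypes) / (2 * len(genotypes))
    # The minor allele is whichever of the two alleles is less frequent.
    return min(allele_freq, 1.0 - allele_freq)


def filter_by_maf(
    genotype_table: Dict[str, List[int]], threshold: float = 0.05
) -> Dict[str, List[int]]:
    """Keep only polymorphisms whose MAF is at or above the threshold."""
    return {
        snp: genotypes
        for snp, genotypes in genotype_table.items()
        if minor_allele_frequency(genotypes) >= threshold
    }


def bonferroni(p_values: List[float]) -> List[float]:
    """Bonferroni-adjusted p-values: each p is multiplied by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values (one way to control the FDR)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

Filtering before testing lowers the number of tests that the correction step has to account for, which is the link between the MAF and multiple testing rows of the table. The Bonferroni adjustment is the more conservative of the two, so it is the more likely to increase the false-negative rate noted in the table.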