Abstract
BACKGROUND
There is wide interest in the use of molecular markers for the early detection of cancer, the prediction of disease outcome, and the selection of patients for chemotherapy. Despite significant and increasing research activity, to the authors’ knowledge only a small number of molecular markers have been successfully integrated into clinical practice. In the current study, the experimental designs and statistical methods used in contemporary molecular marker studies are reviewed, particularly with respect to whether these evaluated a marker’s clinical value.
METHODS
MEDLINE was searched for studies that analyzed an association between a cancer outcome and a marker involving chemical analysis of body fluid or tissue. For each article, data were extracted regarding patients, markers, type of statistical analysis, and principal results.
RESULTS
The 129 articles eligible for analysis included a very large variety of molecular markers; the total number of markers was larger than the number of articles. Only a minority of articles (47 articles; 36%) incorporated multivariate modeling in which the marker was added to standard clinical variables, and only a very small minority had any measure of predictive accuracy (14 articles; 11%). No article used decision analytic methods or experimentally evaluated the clinical value of a marker. Correction for overfit was also rare (3 articles).
CONCLUSIONS
Statistical methods in molecular marker research have not focused on the clinical value of a marker. Attention to sound statistical practice, in particular the use of statistical approaches that provide clinically relevant information, will help maximize the promise of molecular markers for care of the cancer patient.
Keywords: neoplasms, tumor markers, research design, biomedical research
There is wide interest in the use of molecular markers for cancer care. Markers are currently under investigation for a wide variety of clinical roles, including the early detection of cancer,1–3 prognostication,4,5 and predicting response to chemotherapy.6,7 Ludwig and Weinstein estimate that close to 4000 articles on molecular markers are published each year, an approximate 50% increase within the past decade.8
It is not hard to see the attraction of molecular markers. Few cancers can be detected by imaging, and even for established methods such as mammography, imaging has imperfect sensitivity and specificity.9 Similarly, staging systems are clearly inadequate for risk prediction.10 It appears entirely reasonable that molecular analysis of bodily fluids or tumor tissue could provide additional information to help detect cancer or predict its outcome.
Yet despite significant and increasing research activity, to our knowledge only a small number of molecular markers have been successfully integrated into cancer clinical practice.8,11 Several authors have argued that this is a result of poor study design; typical marker studies have involved convenience samples from poorly defined populations, nonstandardized assays, and small numbers of patients subject to missing data.11,12 These observations have prompted the development of guidelines intended to ensure that marker studies conform to some basic standards of design and reporting.11 An issue that has received less attention is the degree to which research on molecular markers has made sufficient use of clinically relevant statistics, such as the assessment of predictive accuracy, decision analysis, or experimental methodology.
Predictive accuracy refers to metrics such as, but not limited to, sensitivity, specificity, area under the curve (AUC), or concordance index. Such measures are critical for understanding the clinical value of a molecular marker; we want to know not only whether there is an association between a marker and an outcome, but how well the marker predicts outcome. Moreover, given that clinicians often have prognostic information, such as stage and tumor grade, readily available, a marker will only be of clinical value if the accuracy of a model including standard clinical variables plus the marker is higher than that of a model including standard clinical variables alone.13
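The comparison described above can be sketched with hypothetical data: compute the AUC (equivalently, the concordance index for a binary outcome) for predicted risks from a clinical-variables-only model and for a model that adds the marker, then compare. All numbers below are illustrative assumptions, not data from any reviewed study.

```python
# Toy illustration (hypothetical data): comparing the discrimination of a
# model built on standard clinical variables alone with one that adds the
# marker. AUC is computed as the concordance between predicted risk and
# outcome: the probability that a random case scores higher than a random control.

def auc(scores, labels):
    """Concordance (AUC): fraction of case/control pairs ranked correctly; ties count 0.5."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    pairs = [(c, k) for c in cases for k in controls]
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0 for c, k in pairs)
    return wins / len(pairs)

# Hypothetical predicted risks from the two models for the same 8 patients.
labels          = [1, 1, 1, 1, 0, 0, 0, 0]    # 1 = cancer outcome occurred
clinical_only   = [0.8, 0.6, 0.4, 0.3, 0.7, 0.5, 0.2, 0.1]
clinical_marker = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

print(auc(clinical_only, labels))    # discrimination of the clinical model
print(auc(clinical_marker, labels))  # discrimination after adding the marker
```

The clinically relevant quantity is the difference between the two AUCs, not the marker-only association with outcome.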
Decision analysis provides a direct estimate of the clinical implications of using a molecular marker.14 This is of value because measures of accuracy are often difficult to interpret clinically. If the sensitivity and specificity of a clinical model (based on stage and grade) are, for example, 80% and 50%, respectively, and increase to 83% and 55%, respectively, when a molecular marker is added, it is unclear whether this is a sufficient improvement to justify the clinical use of the marker. A decision analysis attempts to provide information regarding the clinical value of molecular markers by incorporating data on the consequences of a clinical decision. For example, it is generally considered a greater error to miss a cancer (a false-negative result) than to conduct an unnecessary biopsy (a false-positive result); this would be incorporated into a decision analysis by giving a greater ‘weight’ to a false-negative finding than to a false-positive finding. Whatever the value of decision analysis, the gold standard for the clinical evaluation of a molecular marker is experimental, such as randomizing patients to marker measurement or no marker measurement and measuring long-term outcome.15
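The weighting just described can be made concrete with a minimal sketch. The error rates mirror the 80%/50% versus 83%/55% example above; the prevalence and the 4:1 weight on false negatives versus false positives are illustrative assumptions.

```python
# Toy sketch of how a decision analysis weights errors: a false-negative
# (missed cancer) is penalized more heavily than a false-positive
# (unnecessary biopsy), and the weighted error of two strategies is compared.
# Weights, prevalence, and error rates here are illustrative assumptions.

def weighted_error(sens, spec, prevalence, fn_weight, fp_weight, n=1000):
    """Expected weighted errors per n patients for a test used to select biopsy."""
    cancers = prevalence * n
    healthy = (1 - prevalence) * n
    false_negatives = (1 - sens) * cancers   # missed cancers
    false_positives = (1 - spec) * healthy   # unnecessary biopsies
    return fn_weight * false_negatives + fp_weight * false_positives

# Clinical model alone (sens 80%, spec 50%) vs clinical model plus marker
# (sens 83%, spec 55%), at 20% prevalence, with a missed cancer weighted
# 4x worse than an unnecessary biopsy.
base   = weighted_error(0.80, 0.50, 0.20, fn_weight=4, fp_weight=1)
marker = weighted_error(0.83, 0.55, 0.20, fn_weight=4, fp_weight=1)
print(base, marker)  # the lower weighted error identifies the better strategy
```

Changing the weights (i.e., the assumed consequences of each error) can reverse the conclusion, which is precisely why accuracy gains alone are hard to interpret clinically.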
In this systematic review, we attempted to document the use of these various statistical methods in contemporary molecular marker studies in cancer. Our particular concern was the degree to which statistical analyses would allow conclusions to be drawn regarding the clinical value of a molecular marker.
MATERIALS AND METHODS
Literature Search and Study Eligibility
We searched MEDLINE to May 2006 using the terms (cancer OR tumor OR tumour OR lymphoma* OR leukemia* OR malignan*) AND (predict* OR prognos*), limiting the search to exclude review articles. Eligibility criteria for the review are given in Table 1. Articles were reviewed in reverse chronological order. We prespecified that we would review 125 eligible articles, at least 50 of which focused on breast cancer. Studies not eligible for review were categorized into 1 of the following reasons for exclusion: not a molecular marker, review article, no cancer endpoint, not about prediction, other.
TABLE 1.
Eligibility Criteria for the Systematic Review*
*Articles had to meet all criteria to be included.
Data Extraction
For each article, we extracted data regarding patient selection, molecular markers measured, type of statistical analysis, and principal results using a standard data collection form. Sample size was recorded as the number of patients with evaluable data, rather than the number accrued. If several different analyses were conducted with different numbers of patients evaluable, the analysis including the highest number of patients was recorded. Patient diagnosis was recorded in the following categories: pediatric, breast, gynecologic, colorectal, lung, prostate, other solid, hematologic, or mixed tumor types.
The names of the molecular markers investigated were recorded along with whether they were genomic, nongenomic, or included both types. In order for a marker to be coded as genomic, direct analysis of genetic or transcriptional material (DNA or RNA) had to be conducted using, for example, polymerase chain reaction. Markers were also coded as prespecified (such as a study of prostate-specific antigen [PSA] in prostate cancer) or derived from the data, such as in an analysis of a gene microarray to identify genes associated with outcome.
The core part of data extraction concerned the statistical methods. We first documented whether inferential statistics were used, that is, whether a P value was reported for the association between at least 1 molecular marker and at least 1 cancer endpoint. If so, we recorded whether a significance test was conducted for a marker when entered into a multivariate model that included at least 1 standard clinical variable. Such a test is often described in terms of determining whether a marker is an ‘independent’ prognostic factor. We defined a ‘standard clinical variable’ as any patient or tumor characteristic used to define risk that did not involve chemical analysis of body fluid or tissue. This might include, for example, tumor stage or grade, or patient age. Although molecular markers are commonly used for some cancers, such as PSA for prostate cancer, such markers were not defined as ‘clinical’ for the purposes of this analysis.
Next, we recorded whether there was any metric of predictive accuracy such as, but not limited to, sensitivity, specificity, AUC, or concordance index. We also recorded whether the accuracy of a model including standard clinical variables was compared with the accuracy of a model including both standard variables and the marker.
Multivariate models are prone to what is known as overfit, which is when a model predicts well when applied to the dataset on which it was constructed, but poorly on a new dataset. There are 2 general methods of correcting for overfit. One method is to split the data into a ‘training’ set, on which the model is constructed, and a ‘validation’ set, on which the properties of the model, such as specificity, are calculated. Although a robust and elegant solution to overfitting, splitting the dataset reduces statistical power. A variety of statistical methods, such as cross-validation or bootstrap resampling,14 have been proposed that use the entire dataset for both training and validation, but nonetheless correct for overfit. Correction for overfit is of particular importance for datasets that include a very large number of potential predictors, such as those derived from genomic or proteomic studies. We therefore recorded whether there was correction for overfit and, if so, whether this was conducted using an independent validation dataset or statistical methods such as cross-validation.
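The cross-validation pattern mentioned above can be sketched schematically (not drawn from any reviewed study): each patient is held out for validation exactly once, yet the entire dataset contributes to model building. `fit` and `evaluate` are placeholders standing in for any modeling procedure and accuracy metric.

```python
# Schematic sketch of k-fold cross-validation, one of the statistical
# corrections for overfit: the model is always evaluated on patients it
# never saw during fitting, avoiding the optimism of in-sample accuracy
# without sacrificing the sample-size cost of a single train/validation split.

def k_fold_indices(n, k):
    """Split indices 0..n-1 into k roughly equal, non-overlapping folds."""
    return [list(range(i, n, k)) for i in range(k)]

def cross_validate(data, k, fit, evaluate):
    """Average held-out performance over k train/validation splits."""
    folds = k_fold_indices(len(data), k)
    scores = []
    for fold in folds:
        train = [x for i, x in enumerate(data) if i not in fold]
        valid = [data[i] for i in fold]
        model = fit(train)                 # model never sees its validation fold
        scores.append(evaluate(model, valid))
    return sum(scores) / k

# Trivial demonstration: the 'model' is the training mean, and 'accuracy'
# is the negative mean squared error on the held-out fold.
data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
fit = lambda train: sum(train) / len(train)
evaluate = lambda m, valid: -sum((x - m) ** 2 for x in valid) / len(valid)
print(cross_validate(data, 3, fit, evaluate))
```

The critical discipline, especially for genomic and proteomic data with many candidate predictors, is that every data-driven step (including marker selection) happens inside `fit`, on the training folds only.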
For our final assessment of statistical methods, we documented whether a study incorporated either decision analysis or an experimental design, defined as any study in which the outcome of patients for whom the marker was used to inform clinical practice was compared with that of a group in which the marker was not used.
The result of each study was categorized as positive, negative, or unclear. Studies were classified as positive if there was a statistically significant association between any molecular marker and any cancer outcome or, in the absence of inferential statistics, if the authors made an explicit statement as to the value of the marker. Finally, we recorded whether the authors claimed any clinical consequences of the study (eg, if they recommended that the marker be used in practice to aid treatment decisions).
Review Methods
Eligibility and data extraction methods were formalized in a protocol that was piloted on 10 articles. The results of these articles were not included in the main analysis. Subsequently, all articles were read by 1 researcher (J.K.) and checked by a second (A.V.), with disagreements resolved by consensus. We prespecified that we would use the Fisher exact test to determine whether the likelihood of a positive result, or a claim of clinical consequences, was associated with the use of each statistical method: multivariate modeling, accuracy metric, correction for overfit, decision analysis, or experimental design. Statistical analyses were conducted using Stata 9.2 statistical software (StataCorp, College Station, Tex).
RESULTS
We assessed 762 articles, of which 129 were included and 633 excluded (Fig. 1). The most common reasons for exclusion were that the article did not concern prediction (336 articles; 53%), that no molecular markers were analyzed (208 articles; 33%), or that there were no cancer-related endpoints (69 articles; 11%). All included articles were published in 2005 or 2006: this study therefore provides a ‘snapshot’ of contemporary molecular marker research. We included slightly more articles than intended because some articles excluded by the initial reviewer (J.K.) were found to be eligible on rereview by the second reviewer (A.V.).
FIGURE 1.
Flowchart showing inclusion and exclusion of articles for the review.
The number of patients analyzed ranged from 11 to >6000, with a median of 94 (interquartile range, 50–190). The cancers under study were very mixed; 51 articles concerned breast cancer (which we over-sampled as part of our study design). Of the remaining articles, 12 concerned hematologic malignancies, 7 mixed tumor types, 1 pediatric cancer, 9 gynecologic cancer, 6 colorectal cancer, 8 prostate cancer, 5 lung cancer, and 30 other solid tumors. The majority of articles (120 articles; 93%) prespecified the molecular markers to be analyzed rather than searching across a large number of markers before selecting a subset to analyze. Genomic markers were used exclusively in 26 articles (20%), with 87 articles (67%) examining exclusively nongenomic markers and the remainder (16 articles; 12%) analyzing both marker types.
A very large number of different markers were studied. Five articles examined more than 10 prespecified markers and an additional 9 articles studied markers across the genome or proteome. The remaining 115 articles included a total of 147 different molecular markers, the majority of which were studied in only a single article. The only markers studied in more than 5 different articles were epidermal growth factor receptor (EGFR) (6 articles), estrogen receptor (8 articles), HER-2 (16 articles), Ki-67 (12 articles), and p53 (8 articles). This no doubt reflects our oversampling of breast cancer studies; analyzing separately by breast versus other diagnoses, there were, respectively, 54 markers in 48 articles and 103 markers in 67 articles. Hence, even within a single cancer diagnosis there were more markers than articles.
The statistical methods used in the articles are described in Table 2. Although nearly all articles (97%) reported inferential statistics, the majority did so only univariately: approximately one-third (36%) incorporated multivariate modeling in which the marker was added to standard clinical variables. A very small minority of articles (11%) included a measure of predictive accuracy, and in only a single case was the accuracy of a model including clinical variables plus a marker compared with that of a model including clinical variables alone. No article used decision analytic methods or an experimental approach to determining a marker’s clinical value.
TABLE 2.
Statistical Methods Used in Studies of Molecular Markers in Cancer
Statistical method | No. of articles (n = 129) |
---|---|
Inferential statistics | 125 (97%) |
Inclusion of marker in a multivariate model with standard clinical variables | 47 (36%)* |
Measure of predictive accuracy | 14 (11%)† |
Accuracy compared with standard clinical model | 1 (1%) |
Correction for overfit | |
None | 126 (98%) |
Internal (statistical correction) | 3 (2%) |
External (independent validation set) | 0 (0%) |
Correction for overfit in articles without prespecified markers | |
None | 7 (78%) |
Internal (statistical correction) | 2 (22%) |
Decision analytic methods | 0 (0%) |
Experimental evaluation of clinical value | 0 (0%) |
*Two of these 47 articles also included a measure of predictive accuracy.
†Two of these 14 articles also included a multivariate model.
Correction for overfit was also rare. Even if we restrict the analysis to the 9 proteomic and genomic studies that did not prespecify markers, only 2 articles (22%) used statistical methods to correct for overfit and none used external validation.
The great majority of articles (112 articles; 87%) were categorized as reporting positive results. Deciding whether an article included reference to clinical consequences, however, proved surprisingly difficult. Indeed, this literature was characterized by the liberal use of words such as ‘may,’ the sole purpose of which appears to be to say nothing at all. For example, the statement that a marker “may aid selection of chemotherapy” is true in all but the most exceptional circumstances (eg, a large, negative trial), and is thus uninformative. Ultimately, we categorized 51 articles (40%) as reporting clinical consequences. We found no statistically significant associations between the statistical methods used and the results reported (P >.2 for all analyses). This was contrary to our expectation that articles claiming clinical application would be more likely to have used multivariate modeling (43% in articles claiming clinical application vs 32% in those without clinical recommendation) or accuracy metrics (16% vs 8%).
DISCUSSION
To our knowledge, few molecular markers have been successfully integrated into the clinical care of cancer patients. Although this no doubt results, at least in part, from deficiencies in the design of marker studies, we have a particular interest in statistical methods. We reviewed a ‘snapshot’ sample of 129 studies to determine whether the statistical analyses used would allow conclusions to be drawn regarding the clinical value of the molecular markers studied. We found that the majority of articles regarding molecular markers for cancer focused on testing the null hypothesis of no association between the marker and cancer outcome. Few used statistical methods that attempted to address a marker’s clinical value (eg, by examining whether it improves accuracy compared with predictions made on the basis of routinely collected clinical data).
The current study findings likely underestimate the scale of statistical problems in contemporary molecular marker research. First, we only assessed whether a particular statistical method was applied, not whether it was applied correctly. We did not, for example, evaluate problems such as multiple hypothesis testing,16 data-dependent choice of cutoff points,11 power,12 or missing data.11,12 Furthermore, we assumed that any molecular markers described in the “Methods” section of an article were prespecified, although it is likely that at least some researchers investigated a large number of markers but reported only those that were statistically significant. Moreover, we focused only on statistical issues and did not address whether there were deficiencies in other aspects of study methodology such as patient selection,17 definition of endpoints,18,19 or laboratory methods, including sample collection, workup, storage, and analysis.11,19,20
We acknowledge that, in some cases, it may be perfectly appropriate to make clinical recommendations without a formal assessment of a marker’s accuracy, or its decision analytic properties. For example, 1 of the studies included in the review was a meta-analysis of 3 randomized trials of adjuvant chemotherapy for breast cancer (n = 6644) that reported large differences in risk reduction depending on estrogen receptor status.21 These results speak for themselves, and it is difficult to see how an assessment of predictive accuracy would be clinically informative. Yet this article was highly unusual. The typical study in our review included approximately 100 patients in a nonrandomized study. A demonstration that a marker has some association with a cancer outcome in such a study generally is inadequate evidence with which to support a clinical recommendation, such as that the molecular marker under study should be considered when planning the treatment of a patient with a suspected ovarian malignancy. Such a recommendation must require an explicit evaluation of the biomarker’s incremental benefit compared with established predictors, and should optimally consider the clinical utility of the marker.
It is clear that molecular marker research passes through a series of phases, and that an assessment of clinical value might be inappropriate for early-phase research. As a simple example, a preliminary study might measure levels of a molecular marker in patients with metastatic disease and in healthy controls. This would test whether the marker was worthy of further investigation; clearly, if the marker failed to distinguish advanced disease from no disease it should not be researched further. However, although such an experiment is often justified, a measure of accuracy, such as sensitivity or specificity, may be inappropriate. We decided against making any assessment of whether the inclusion of a measure of accuracy or a decision analysis was appropriate for any particular article, on the grounds that this would be an overly subjective judgment. To our knowledge there currently are no clear, reproducible criteria for categorizing an article as being at a particular phase of marker research. Nonetheless, it is our impression that the majority of reviewed articles included clinical cohorts in typical treatment settings; this no doubt resulted from our search strategy, which only included articles regarding prediction or prognosis, and is evidenced by the large proportion of articles (40%) that made clinical recommendations.
Despite our focus on prediction, molecular markers are not always studied to determine whether they improve clinical decision making. One important alternative rationale is to find therapeutic targets. For example, if the expression of a certain protein is found to be elevated in patients with early-stage disease who subsequently develop metastases, an agent that blocks the production of the protein or targets cells that express the protein might be an effective adjuvant treatment. Studies aiming to identify therapeutic targets need not be analyzed by methods that address the issue of clinical value of a marker. However, it is again our impression that few studies in the current review were explicitly concerned with the identification of therapeutic targets and, again, we cite as evidence the high proportion of studies making clinical recommendations.
Our review reemphasizes the importance of 3 general recommendations. First, the scientific rationale for studying molecular markers often concerns clinical prediction and, as such, studies should include a measure of predictive accuracy. As previously noted, routinely collected clinical information, such as stage and grade, predicts patient outcome. Therefore, marker studies should aim to compare the predictive accuracy of a model including only standard clinical variables with that of a model including standard clinical variables plus the marker.13
Second, although recent years have witnessed the development of novel biologic methods such as gene microarrays and novel analytic methods such as neural networks, certain basic principles of model evaluation remain unchanged. In particular, any nontrivial model is at risk for overfit, and the predictive accuracy of models therefore should be assessed either on independent (‘validation’) datasets, or by incorporating statistical methods, such as cross-validation, that correct for overfit.
Third, a model incorporating a molecular marker might improve predictive accuracy, but have little clinical value. To use a simple example to illustrate this point, take the case of a screening test for cancer that categorizes patients into 1 of 4 risk groups (unlikely, possible, probable, and definite), in which common clinical practice is to advise biopsy for patients defined as being at ‘possible’ or higher risk for cancer. Now imagine that the addition of a molecular marker allowed the majority of cancer cases in the ‘probable’ category to be recategorized as ‘definite.’ The predictive accuracy of the screening test would be higher, but this would make no practical difference to patient care. We have demonstrated this type of finding empirically. In a dataset of men undergoing a biopsy for prostate cancer, the addition of urokinase to a model incorporating only PSA level and age increased the predictive accuracy from an AUC of 0.715 to an AUC of 0.783; however, the incidence of prostate cancer was very high in the cohort and the model made few risk predictions low enough to warrant avoiding biopsy.22 We therefore recommend consideration of decision analytic or experimental techniques. A simple decision analytic method, with particular application to molecular markers, has been proposed by Vickers and Elkin23; Sargent et al.15 proposed several randomized trial designs to evaluate whether the clinical use of molecular markers improves patient outcome.
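The net benefit statistic of Vickers and Elkin23 can be sketched in a few lines: false positives are discounted by the odds of the threshold probability at which a patient would opt for treatment (here, biopsy), and the model's net benefit is compared against simple default strategies such as biopsying everyone. The predicted risks and outcomes below are hypothetical.

```python
# Minimal sketch of decision curve analysis (net benefit), in which false
# positives are weighted by the odds of the threshold probability pt at
# which a patient would accept biopsy. Data below are hypothetical.

def net_benefit(risks, labels, threshold):
    """Net benefit of biopsying patients whose predicted risk >= threshold."""
    n = len(risks)
    tp = sum(1 for r, y in zip(risks, labels) if r >= threshold and y == 1)
    fp = sum(1 for r, y in zip(risks, labels) if r >= threshold and y == 0)
    return tp / n - (fp / n) * (threshold / (1 - threshold))

# Hypothetical predicted risks for 10 patients (label 1 = cancer on biopsy).
risks  = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1,   1,   1,   0,   1,   0,   0,   0,   0,   0]

# Compare strategies at a 20% threshold: use the model, or biopsy everyone.
model_nb      = net_benefit(risks, labels, 0.20)
biopsy_all_nb = net_benefit([1.0] * 10, labels, 0.20)
print(model_nb, biopsy_all_nb)  # the model is useful only if it beats the default
```

Repeating the calculation across a range of plausible thresholds produces the decision curve; a marker-based model has clinical value at a given threshold only if its net benefit exceeds that of both 'biopsy all' and 'biopsy none.'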
In summary, the development and evaluation of molecular markers currently constitutes a huge investment of human and financial resources by the cancer research community. Attention to sound statistical practice, in particular the use of statistical approaches that provide clinically relevant information, will help maximize the promise of molecular markers for care of the cancer patient.
Acknowledgments
Supported in part by a grant from the National Cancer Institute (P50-CA92629).
References
1. Lilja H, Ulmert D, Bjork T, et al. Long-term prediction of prostate cancer in a large, representative Swedish cohort: prostate kallikreins measured at age 44–50 predict prostate cancer up to 25 years before diagnosis. J Clin Oncol. 2007;25:431–436. doi: 10.1200/JCO.2006.06.9351.
2. Wang X, Yu J, Sreekumar A, et al. Autoantibody signatures in prostate cancer. N Engl J Med. 2005;353:1224–1235. doi: 10.1056/NEJMoa051931.
3. Sanchini MA, Gunelli R, Nanni O, et al. Relevance of urine telomerase in the diagnosis of bladder cancer. JAMA. 2005;294:2052–2056. doi: 10.1001/jama.294.16.2052.
4. Cristofanilli M, Budd GT, Ellis MJ, et al. Circulating tumor cells, disease progression, and survival in metastatic breast cancer. N Engl J Med. 2004;351:781–791. doi: 10.1056/NEJMoa040766.
5. Paik S, Shak S, Tang G, et al. A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. N Engl J Med. 2004;351:2817–2826. doi: 10.1056/NEJMoa041588.
6. Watanabe T, Wu TT, Catalano PJ, et al. Molecular predictors of survival after adjuvant chemotherapy for colon cancer. N Engl J Med. 2001;344:1196–1206. doi: 10.1056/NEJM200104193441603.
7. Olaussen KA, Dunant A, Fouret P, et al. DNA repair by ERCC1 in non-small-cell lung cancer and cisplatin-based adjuvant chemotherapy. N Engl J Med. 2006;355:983–991. doi: 10.1056/NEJMoa060570.
8. Ludwig JA, Weinstein JN. Biomarkers in cancer staging, prognosis and treatment selection. Nat Rev Cancer. 2005;5:845–856. doi: 10.1038/nrc1739.
9. Barlow WE, Chi C, Carney PA, et al. Accuracy of screening mammography interpretation by characteristics of radiologists. J Natl Cancer Inst. 2004;96:1840–1850. doi: 10.1093/jnci/djh333.
10. Burke HB. Outcome prediction and the future of the TNM staging system. J Natl Cancer Inst. 2004;96:1408–1409. doi: 10.1093/jnci/djh293.
11. McShane LM, Altman DG, Sauerbrei W, Taube SE, Gion M, Clark GM. Reporting recommendations for tumor marker prognostic studies (REMARK). J Natl Cancer Inst. 2005;97:1180–1184. doi: 10.1093/jnci/dji237.
12. Pajak TF, Clark GM, Sargent DJ, McShane LM, Hammond ME. Statistical issues in tumor marker studies. Arch Pathol Lab Med. 2000;124:1011–1015. doi: 10.5858/2000-124-1011-SIITMS.
13. Kattan MW. Evaluating a new marker’s predictive contribution. Clin Cancer Res. 2004;10:822–824. doi: 10.1158/1078-0432.ccr-03-0061.
14. Harrell FEJ. Regression Modelling Strategies with Applications to Linear Models, Logistic Regression, and Survival Analysis. New York: Springer; 2001.
15. Sargent DJ, Conley BA, Allegra C, Collette L. Clinical trial designs for predictive marker validation in cancer treatment trials. J Clin Oncol. 2005;23:2020–2027. doi: 10.1200/JCO.2005.01.112.
16. Biganzoli E, Boracchi P, Marubini E. Biostatistics and tumor marker studies in breast cancer: design, analysis and interpretation issues. Int J Biol Markers. 2003;18:40–48. doi: 10.1177/172460080301800107.
17. Sargent D, Allegra C. Issues in clinical trial design for tumor marker studies. Semin Oncol. 2002;29:222–230. doi: 10.1053/sonc.2002.32898.
18. Kyzas PA, Loizou KT, Ioannidis JP. Selective reporting biases in cancer prognostic factor studies. J Natl Cancer Inst. 2005;97:1043–1055. doi: 10.1093/jnci/dji184.
19. Kyzas PA, Denaxa-Kyza D, Ioannidis JP. Quality of reporting of cancer prognostic marker studies: association with reported prognostic effect. J Natl Cancer Inst. 2007;99:236–243. doi: 10.1093/jnci/djk032.
20. Conley BA, Taube SE. Prognostic and predictive markers in cancer. Dis Markers. 2004;20:35–43. doi: 10.1155/2004/202031.
21. Berry DA, Cirrincione C, Henderson IC, et al. Estrogen-receptor status and outcomes of modern chemotherapy for patients with node-positive breast cancer. JAMA. 2006;295:1658–1667. doi: 10.1001/jama.295.14.1658.
22. Steyerberg EW, Vickers AJ. Decision curve analysis: a discussion. Med Decis Making. 2008;28:146–149. doi: 10.1177/0272989X07312725.
23. Vickers AJ, Elkin EB. Decision curve analysis: a novel method for evaluating prediction models. Med Decis Making. 2006;26:565–574. doi: 10.1177/0272989X06295361.