The Journal of Manual & Manipulative Therapy
Editorial. 2008;16(2):69–71. doi: 10.1179/106698108790818477

Potential Pitfalls of Clinical Prediction Rules

Chad E Cook
PMCID: PMC2565112  PMID: 19119389

What Are Clinical Prediction Rules?

A clinical prediction rule (CPR) is a combination of clinical findings that have statistically demonstrated meaningful predictability in determining a selected condition or the prognosis of a patient who has been provided with a specific treatment1,2. CPRs are created using multivariate statistical methods, are designed to examine the predictive ability of selected groupings of clinical variables3,4, and are intended to help clinicians make quick decisions that may normally be subject to underlying biases5. The rules are algorithmic in nature and condense information to identify the smallest number of indicators that are statistically diagnostic of the targeted condition6. The number of derived or validated CPRs is increasing6, particularly in rehabilitation medicine, where prescriptive studies have been developed for musculoskeletal interventions for low back pain7,8, cervical pain9,10, and knee dysfunction11,12.
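
Applied at the point of care, most prescriptive CPRs reduce to counting how many of the rule's criteria a patient satisfies and comparing that count to a threshold. The sketch below illustrates this in its simplest form; it is a generic illustration only, and the criterion names and the 3-of-5 threshold are hypothetical rather than drawn from any published rule.

    # Applying a generic "k of n criteria" prediction rule; the criteria and threshold are hypothetical.
    def meets_rule(findings: dict, criteria: list, threshold: int) -> bool:
        positives = sum(1 for criterion in criteria if findings.get(criterion, False))
        return positives >= threshold

    criteria = ["short_symptom_duration", "low_fear_avoidance",
                "segmental_hypomobility", "no_distal_symptoms", "limited_hip_rotation"]
    patient = {"short_symptom_duration": True, "low_fear_avoidance": True,
               "segmental_hypomobility": False, "no_distal_symptoms": True,
               "limited_hip_rotation": True}
    print(meets_rule(patient, criteria, threshold=3))  # True: 4 of 5 hypothetical criteria present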

Clinical prediction rules may best be classified into three distinct groups: 1) diagnostic, 2) prognostic, and 3) prescriptive1,13. Studies that focus on predictive factors related to a specific diagnosis are known as diagnostic CPRs. Clinical prediction rules that are designed to predict an outcome such as success or failure are considered prognostic. Clinical prediction rules designed to target the most effective interventions are identified as prescriptive, and these require prospective, longitudinal, randomized controlled trials that compare outcomes after selected interventions for subjects who meet a similar score on the CPR1.

Clinical prediction rules are generally developed using a 3-step method14. First, CPRs are derived prospectively using multivariate statistical methods to examine the predictive ability of selected groupings of clinical variables3. The second step involves validating the CPR in a randomized controlled trial to reduce the risk that the predictive factors developed during the derivation phase were selected by chance14. The third step involves conducting an impact analysis to determine the extent to which the CPR improves care, reduces costs, and accurately defines the targeted objective14.
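
As a rough sketch of the derivation step only (not the method used in any of the cited studies), the example below fits a multivariate logistic regression to a set of hypothetical candidate findings and retains those that remain statistically associated with treatment success; the simulated data, finding names, and 0.05 retention threshold are all assumptions for illustration.

    # A minimal sketch of CPR derivation, assuming statsmodels is available; all data are simulated.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n = 150  # hypothetical derivation cohort of consecutive patients
    candidates = ["finding_a", "finding_b", "finding_c", "finding_d"]

    X = rng.integers(0, 2, size=(n, len(candidates))).astype(float)  # binary clinical findings
    success = rng.integers(0, 2, size=n)                             # treatment success (placeholder outcome)

    result = sm.Logit(success, sm.add_constant(X)).fit(disp=False)

    # Keep findings whose multivariate association survives an illustrative 0.05 cut-off.
    retained = [name for name, p in zip(candidates, result.pvalues[1:]) if p < 0.05]
    print("Items retained for the candidate rule:", retained)

With real data, univariate screening, checks of predictor independence, and the sample-size considerations discussed below would precede and constrain this step.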

Although there is little debate that carefully constructed CPRs can improve clinical practice, to my knowledge there are no guidelines that specify the methodological requirements a CPR must meet before it is carried into everyday clinical practice environments. Guidelines are created to improve the rigor of study design and reporting. The following editorial outlines potential methodological pitfalls in CPRs that may significantly weaken the transferability of the algorithm. Within the field of rehabilitation, most CPRs have been prescriptive; thus, my comments here reflect prescriptive CPRs.

Methodological Pitfalls

CPRs are designed to specify a homogeneous set of characteristics from a heterogeneous population of prospectively selected consecutive patients5,15. Typically, the resulting applicable population is a small subset of a larger sample and may represent only a small percentage of the clinician's actual daily caseload. The setting and location of the larger sample should be generalizable15,16, and subsequent validity studies require assessment of the CPR in different patient groups, in different environments, and with a typical patient group seen by most clinicians16. Because many CPRs are developed from a very distinct group that may or may not reflect a typical population of patients, the spectrum transportability17 of many current CPR algorithms may be limited.

Clinical prediction rules use outcome measures to determine the effectiveness of the intervention. Outcome measures must have a single operational definition5 and require enough responsiveness to truly capture appropriate change in the condition14; in addition, these measures should have a well-constructed cut-off score16,18 and be collected by a blinded administrator15. The selection of an appropriate anchor score for measurement of actual change is currently debated19,20. Most outcome measures use a patient recall-based questionnaire such as a global rating of change score (GRoC), which is appropriate when used in the short term but suffers from recall bias when used in long-term analyses19,20,21. Other studies may use minimally detectable change scores that were originally validated against the GRoC and thus may also be affected by recall bias and by differences in sample severity or pathology. Lastly, outcome measures whose scores are influenced by administrative factors (discharge date, length of stay, patient charges), socio-demographic factors, or internal behavioral characteristics (changes in fear avoidance or attitude) are not consistent among populations5.
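
For readers less familiar with how a minimally detectable change score is usually built, the sketch below applies the standard formulae (SEM = SD × √(1 − ICC), MDC95 = 1.96 × √2 × SEM); the reliability coefficient and standard deviation shown are placeholders, not values taken from the cited studies.

    # Minimal detectable change at the 95% confidence level; the input values are placeholders.
    import math

    def mdc95(sd_baseline: float, icc: float) -> float:
        sem = sd_baseline * math.sqrt(1.0 - icc)  # standard error of measurement
        return 1.96 * math.sqrt(2.0) * sem        # smallest change exceeding test-retest noise

    print(round(mdc95(sd_baseline=6.0, icc=0.90), 1))  # about 5.3 points on this hypothetical scale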

A potential drawback for CPRs is the failure to maintain the quality of the tests and measures used as predictors in the algorithm. The prospective tests and measures should be independent of one another during modeling16; each should be performed in a meaningful, acceptable manner4; and clinicians or data administrators should be blinded to the patient's outcome measures and condition22. Furthermore, the tests should demonstrate acceptable reliability (≥ 0.60)15 and should be administered within an acceptable timeframe of the outcome measure22; equivocal or indeterminate results should be reported22. Recognizing the baseline likelihood of a true positive finding, in the absence of any other information, helps avoid the representativeness heuristic pitfall, which may compel us to identify a clinical test as positive simply because the result fits the pattern of other findings23. CPRs that use tests and measures with reliability or agreement below 0.60 may yield variable findings depending on the clinician who performs the examination and on the findings of other tests and measures.
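
Because the paragraph above sets a reliability/agreement floor of 0.60, one quick way to check whether a candidate test clears that bar is Cohen's kappa computed from a two-examiner agreement table; the counts below are invented for illustration.

    # Cohen's kappa for agreement between two examiners on one clinical test; counts are hypothetical.
    def cohens_kappa(both_pos: int, only_a_pos: int, only_b_pos: int, both_neg: int) -> float:
        n = both_pos + only_a_pos + only_b_pos + both_neg
        p_observed = (both_pos + both_neg) / n
        p_a = (both_pos + only_a_pos) / n  # examiner A's positive rate
        p_b = (both_pos + only_b_pos) / n  # examiner B's positive rate
        p_expected = p_a * p_b + (1 - p_a) * (1 - p_b)
        return (p_observed - p_expected) / (1 - p_expected)

    kappa = cohens_kappa(both_pos=34, only_a_pos=8, only_b_pos=6, both_neg=52)
    print(f"kappa = {kappa:.2f}")  # 0.71 here; values below 0.60 would fall under the suggested floor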

It is my impression that the most frequent current pitfall of CPRs is the failure to meet statistical assumptions during regression modeling. CPRs are typically underpowered, falling below the suggested requirement of 10 to 15 subjects for each prospective predictor variable24. Validation cohorts require sample sizes of 100 or greater when logistic regression (the standard approach for CPR assessment) is used25. Rarely is the statistical significance of the model reported in the rehabilitation-based CPRs, nor is the R2 or R2-equivalent of the model identified5. An R2 or R2-equivalent outlines the strength of association of the predictor variables (both independently and as a group) in explaining the variance of the outcome measure. Low R2 or R2-equivalents may suggest that other variables would more accurately predict the outcome of the study5 and generally indicate a low effect size for the independent variables identified and retained in the analyses26. Most CPRs do not report confidence intervals, and when they are reported, wide confidence intervals imply poor precision or too small a sample size15.
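
To make two of the checks in this paragraph concrete, the sketch below verifies the subjects-per-predictor rule of thumb and converts a logistic model's log-likelihoods into a Nagelkerke R2-equivalent (one common choice); the cohort size and likelihood values are invented for illustration.

    # Subjects-per-predictor check and a Nagelkerke R2-equivalent; all inputs are invented.
    import math

    def subjects_per_predictor(n_subjects: int, n_predictors: int) -> float:
        return n_subjects / n_predictors  # should reach roughly 10 to 15

    def nagelkerke_r2(ll_null: float, ll_model: float, n: int) -> float:
        cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
        max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
        return cox_snell / max_cox_snell

    print(subjects_per_predictor(n_subjects=120, n_predictors=12))        # 10.0: at the lower bound
    print(round(nagelkerke_r2(ll_null=-83.2, ll_model=-65.0, n=120), 2))  # about 0.35 for these values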

Once a CPR is developed, it is important to recognize the true benefit of the tool. It has been suggested that for true impact on clinical practice, CPRs should provide a positive likelihood ratio (LR+) of 5 or greater27. CPR derivations performed on high-risk groups, where failure to provide the appropriate intervention is highly undesirable, should have sensitivity values that are greater than specificity values28. In other words, the final algorithm favors ensuring that everyone likely to benefit receives the best treatment(s) possible over assuring that only those specific to the problem are treated28.
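
To put the LR+ threshold in concrete terms, the positive likelihood ratio is sensitivity divided by (1 − specificity), so a rule that is 60% sensitive and 90% specific yields an LR+ of 6. The sketch below computes these quantities from a hypothetical 2×2 table.

    # Sensitivity, specificity, and positive likelihood ratio from a 2x2 table; the counts are hypothetical.
    def diagnostic_summary(tp: int, fp: int, fn: int, tn: int) -> dict:
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        lr_positive = sensitivity / (1.0 - specificity)
        return {"sensitivity": sensitivity, "specificity": specificity, "LR+": lr_positive}

    print(diagnostic_summary(tp=30, fp=7, fn=20, tn=63))  # LR+ of about 6.0, clearing the suggested cut-off of 5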

CPRs should have clinical sensibility. Clinical sensibility implies that the tool makes inherent clinical sense, that it's easy to use, that the tests and measures are truly related to the outcome, and that clinician perception does not overly alter the findings of the tool15. Consequently, tests and measures that vary in clinical interpretation (e.g., spring tests of the spine) or that are potentially explained by factors beyond the original scope of the examination (e.g., hip osteoarthritis when addressing hip procedures that affect the knee) may not be as useful as factors that are more explicit during clinical assessment.

Lastly, most rehabilitation-related CPRs are derivation studies, which are the initial steps in the development of clinical decision rules. Derivation studies lack validation and require follow-up studies in diverse centers with different populations of patients and different clinicians. Whether the findings from a derivation study will stand up to the scrutiny of further assessment is unknown15. In essence, adopting a derivation-only CPR runs the risk of improper treatment. Careful attention should be paid before blindly adopting derivation studies or basing treatment pathways on these tools.

Summary

Is this editorial an attack on clinical prediction rules? Actually, it's quite the contrary. Prescriptive CPRs are useful tools for a select and discrete population of patients. As manually oriented clinicians, we have long recognized that some sub-sets of the population benefit from manual therapy more than others. CPRs allow us to isolate a sub-set of desired patient characteristics and to define which techniques are most useful for that population. The current rehabilitation-based CPRs have opened the door for additional research to improve our accuracy as clinicians. Unfortunately, many of the present rehabilitation-based CPRs may have methodological weaknesses that call the utility of the instrument into question. Although there is no such thing as a “perfect” study, better and more rigorous designs should provide additional, profound, and clinically applicable findings. As a clinician and a researcher, I am an advocate of CPRs.

REFERENCES

1. Beattie P, Nelson R. Clinical prediction rules: What are they and what do they tell us? Aust J Physiother. 2006;52:157–163. doi: 10.1016/s0004-9514(06)70024-1.
2. Randolph A, Guyatt H, Calvin JE, Doig G, Richardson WS. Understanding articles describing clinical prediction tools. Crit Care Med. 1998;26:1603–1612. doi: 10.1097/00003246-199809000-00036.
3. Hier DB, Edlestein G. Deriving clinical prediction rules from stroke outcome research. Stroke. 1991;22:1431–1436. doi: 10.1161/01.str.22.11.1431.
4. Kuijpers T, van der Heijden GJMG, Vergouwe Y, et al. Good generalizability of a prediction rule for prediction of persistent shoulder pain in the short term. J Clin Epidemiol. 2007;60:947–953. doi: 10.1016/j.jclinepi.2006.11.015.
5. Wasson JH, Sox HC, Neff RK, Goldman L. Clinical prediction rules: Applications and methodological standards. New Engl J Med. 1985;313:793–799. doi: 10.1056/NEJM198509263131306.
6. Brehaut JC, Stiell IG, Visentin L, Graham ID. Clinical decision rules “in the real world”: How a widely disseminated rule is used in everyday practice. Acad Emerg Med. 2005;12:948–956. doi: 10.1197/j.aem.2005.04.024.
7. Childs JD, Fritz JM, Flynn TW, et al. A clinical prediction rule to identify patients with low back pain most likely to benefit from spinal manipulation: A validation study. Ann Intern Med. 2004;141:920–928. doi: 10.7326/0003-4819-141-12-200412210-00008.
8. Hicks GE, Fritz JM, Delitto A, McGill SM. Preliminary development of a clinical prediction rule for determining which patients with low back pain will respond to a stabilization exercise program. Arch Phys Med Rehabil. 2005;86:1753–1762. doi: 10.1016/j.apmr.2005.03.033.
9. Cleland JA, Childs JD, Fritz JM, Whitman JM, Eberhart SL. Development of a clinical prediction rule for guiding treatment of a subgroup of patients with neck pain: Use of thoracic spine manipulation, exercise, and patient education. Phys Ther. 2007;87:9–23. doi: 10.2522/ptj.20060155.
10. Tseng YL, Wang WT, Chen WY, Hou TJ, Chen TC, Lieu FK. Predictors for the immediate responders to cervical manipulation in patients with neck pain. Man Ther. 2006;11:306–315. doi: 10.1016/j.math.2005.08.009.
11. Lesher JD, Sutlive TG, Miller GA, Chine NJ, Garber MB, Wainner RS. Development of a clinical prediction rule for classifying patients with patellofemoral pain syndrome who respond to patellar taping. J Orthop Sports Phys Ther. 2006;36:854–866. doi: 10.2519/jospt.2006.2208.
12. Currier LL, Froechlich PJ, Carow SD, et al. Development of a clinical prediction rule to identify patients with knee pain and clinical evidence of knee osteoarthritis who demonstrate a favorable short-term response to hip mobilization. Phys Ther. 2007;87:1106–1119. doi: 10.2522/ptj.20060066.
13. Reilly BM, Evans AT. Translating clinical research into clinical practice: Impact of using prediction rules to make decisions. Ann Intern Med. 2006;144:201–209. doi: 10.7326/0003-4819-144-3-200602070-00009.
14. Childs JD, Cleland JA. Development and application of clinical prediction rules to improve decision-making in physical therapist practice. Phys Ther. 2006;86:122–131. doi: 10.1093/ptj/86.1.122.
15. Laupacis A, Sekar M, Stiell IG. Clinical prediction rules: A review and suggested modifications of methodological standards. JAMA. 1997;277:488–494.
16. Knottnerus JA. Diagnostic prediction rules: Principles, requirements, and pitfalls. Prim Care. 1995;22:341–363.
17. Justice AC, Covinsky KE, Berlin JA. Assessing the generalizability of prognostic information. Ann Intern Med. 1999;130:515–524. doi: 10.7326/0003-4819-130-6-199903160-00016.
18. McConnochie KM, Roghmann KJ, Pasternack J. Developing clinical prediction rules and evaluating observational patterns using categorical clinical markers. Med Decis Making. 1993;13:30–42. doi: 10.1177/0272989X9301300105.
19. Norman GR, Stratford P, Regehr G. Methodological problems in the retrospective computation of responsiveness to change: The lesson of Cronbach. J Clin Epidemiol. 1997;50:869–879. doi: 10.1016/s0895-4356(97)00097-8.
20. Schmitt JC, Di Fabio RP. Reliable change and minimum important difference (MID) proportions facilitated group responsiveness comparisons using individual threshold criteria. J Clin Epidemiol. 2004;57:1008–1018. doi: 10.1016/j.jclinepi.2004.02.007.
21. Schmitt JC, Di Fabio RP. The validity of prospective and retrospective global change criterion measures. Arch Phys Med Rehabil. 2005;86:2270–2276. doi: 10.1016/j.apmr.2005.07.290.
22. Whiting P, Rutjes AV, Reitsma JB, Bossuyt PM, Kleijnen J. The development of QUADAS: A tool for the quality assessment of studies of diagnostic accuracy included in systematic reviews. BMC Med Res Methodol. 2003;3:25. doi: 10.1186/1471-2288-3-25.
23. Klein JG. Five pitfalls in decisions about diagnosis and prescribing. BMJ. 2005;330:781–783. doi: 10.1136/bmj.330.7494.781.
24. Concato J, Feinstein AR, Holford TR. The risk of determining risk with multivariate methods. Ann Intern Med. 1993;118:201–210. doi: 10.7326/0003-4819-118-3-199302010-00009.
25. Vergouwe Y, Steyerberg EW, Eijkemans MS, Habbema J. Substantial effective sample sizes were required for external validation studies of predictive logistic regression models. J Clin Epidemiol. 2005;58:475–483. doi: 10.1016/j.jclinepi.2004.06.017.
26. Cohen J. Statistical Power Analysis for the Behavioral Sciences. 2nd ed. Hillsdale, NJ: Erlbaum; 1988.
27. Jaeschke R, Guyatt GH, Sackett DL. Users' guides to the medical literature. III. How to use an article about a diagnostic test. What are the results and will they help me? JAMA. 1994;271:703–707. doi: 10.1001/jama.271.9.703.
28. McGinn TG, Guyatt GH, Wyer PC, et al., for the Evidence-Based Medicine Working Group. Users' guides to the medical literature. XXII. How to use articles about clinical decision rules. JAMA. 2000;284:79–84. doi: 10.1001/jama.284.1.79.
