International Wound Journal. 2014 Sep 16;12(5):590–594. doi: 10.1111/iwj.12376

Inter‐rater reliability of three most commonly used pressure ulcer risk assessment scales in clinical practice

Li‐Hua Wang 1, Hong‐Lin Chen 2, Hong‐Yan Yan 1, Jian‐Hua Gao 1, Fang Wang 1, Yue Ming 1, Li Lu 1, Jing‐Jing Ding 1
PMCID: PMC7950447  PMID: 25224688

ABSTRACT

The objective of this study was to evaluate the inter‐rater reliability of the Braden Scale, Norton Scale and Waterlow Scale for pressure ulcer risk assessment in clinical practice. The design of the study was cross‐sectional. A total of 23 patients at risk of pressure ulcers were included, and six experienced registered nurses each assessed every patient with the three scales in succession. The nurses assessed alone and independently of each other. An intra‐class correlation coefficient (ICC) was used to determine the inter‐rater reliability. For the Braden Scale, the ICC values ranged between 0·603 (95% CI: 0·435–0·770) for the item ‘moisture’ and a maximum of 0·964 (95% CI: 0·936–0·982) for the item ‘activity’; for the Norton Scale, the ICC values ranged between 0·595 (95% CI: 0·426–0·764) for the item ‘physical condition’ and a maximum of 0·975 (95% CI: 0·955–0·988) for the item ‘activity’; and for the Waterlow Scale, the ICC values ranged between 0·592 (95% CI: 0·422–0·762) for the item ‘skin type’ and a maximum of 0·990 (95% CI: 0·982–0·995) for the item ‘age’. The ICC values of the total scores were 0·955 (95% CI: 0·922–0·978), 0·967 (95% CI: 0·943–0·984) and 0·915 (95% CI: 0·855–0·958) for the Braden, Norton and Waterlow Scales, respectively. Although the inter‐rater reliability of the total scores of all three scales was substantial, the reliability of some items was lower. The items ‘moisture’, ‘physical condition’ and ‘skin type’ deserve particular attention. Further studies are needed to identify highly reliable quantitative items that can replace these ambiguous items in newly designed scales.

Keywords: Assessment tools, Braden Scale, Inter‐rater reliability, Norton Scale, Pressure ulcer, Waterlow Scale

Introduction

The prevalence of pressure ulcers in hospital settings is still high. According to international pressure ulcer surveys, overall prevalence rates ranged from 5·6% to 15·5% 1, 2, 3. Pressure ulcers can lead to severe or intolerable pain, are susceptible to infection and are associated with high mortality rates. Some studies estimate that the cost of treating a pressure ulcer in the UK varied from £1,064 to £10,551 in 2004 and from £1,214 to £14,108 in 2011 4, 5. The prevention of pressure ulcers is therefore an important concern.

A risk assessment scale (RAS) for pressure ulcer prevention is a tool that produces a score from a series of parameters considered to be risk factors. Among the most widely used RASs are the Norton Scale, the Braden Scale and the Waterlow Scale. The Norton Scale, developed in 1962 in the UK, was the first pressure ulcer RAS 6. The Braden Scale was developed in 1987 by Barbara Braden and Nancy Bergstrom 7 and is recommended by the American Agency for Health Care Research and Quality for predicting pressure ulcer risk. The Waterlow Scale was developed in 1985 by Judy Waterlow, a clinical nurse teacher 8. Many studies have investigated the validity of these RASs and indicated good predictive value for pressure ulcers 9, 10. However, an ideal RAS should have not only good validity but also good reliability. Inter‐rater reliability, also called inter‐rater agreement or concordance, is the degree of agreement among raters. It quantifies how much homogeneity, or consensus, there is in the ratings given by different raters. It is useful in refining tools because it shows whether a particular scale is appropriate for measuring a particular variable. In our clinical practice, we found that the inter‐rater reliability of these RASs was not always good: the same patient was often given different RAS scores by different nurses. Different risk scores lead to different stratified care strategies, which leaves it unclear which score should guide care.

The purpose of this study was to investigate the inter‐rater reliability of the Norton Scale, the Braden Scale and the Waterlow Scale in pressure ulcer risk assessment, and to identify the items with poor inter‐rater reliability, which deserve more attention.

Methods

The design of the study was cross‐sectional. The study was approved by the medical ethics committee of our hospital. The data were collected from February 2014.

Patients

We recruited patients at risk of pressure ulcers from our hospital for this analysis. Convenience sampling was used. Patients from the departments of neurosurgery, ICU, orthopedics, neurology, respiratory medicine, spine surgery and cardiothoracic surgery were eligible for this study. Patients were excluded if they did not agree to participate in the study, or if they had already developed a pressure ulcer. A total of 23 patients were included in the study. Demographic and baseline characteristics of the patient population are presented in Table 1.

Table 1.

Demographics and baseline characteristics of patients (N = 23)

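| Variable | Value |
| --- | --- |
| Age (years) | 58·7 ± 11·2 |
| Sex (male/female) | 13/10 |
| Ward: Neurosurgery | 4 |
| Ward: ICU | 3 |
| Ward: Orthopedics | 4 |
| Ward: Neurology | 3 |
| Ward: Respiratory medicine | 3 |
| Ward: Spine surgery | 2 |
| Ward: Cardiothoracic surgery | 4 |
| Pressure ulcer risk: No risk (Braden Scale score > 18) | 11 |
| Pressure ulcer risk: Mild risk (Braden Scale score 15–18) | 2 |
| Pressure ulcer risk: Moderate risk (Braden Scale score 13–14) | 3 |
| Pressure ulcer risk: High risk (Braden Scale score 10–12) | 5 |
| Pressure ulcer risk: Severe risk (Braden Scale score ≤ 9) | 2 |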

Raters

We chose chief nurses from the participating departments (neurosurgery, ICU, orthopedics, neurology, respiratory medicine, spine surgery and cardiothoracic surgery) as raters. Six nurses participated in the study. All were experienced in pressure ulcer care and pressure ulcer risk assessment. The mean age of the raters was 43·5 ± 3·6 years, with 22·2 ± 2·4 years of work experience.

Procedure

Before the study, the research team held meetings with the six raters. They were informed about the aim of the study in both oral and written forms. They were asked to assess each participating patient with the three scales in succession. They assessed alone and independently of each other. All scale items were scored on written data collection forms.

Assessment tools

The Braden Scale is composed of six subscales: sensory perception, skin moisture, activity, mobility, nutrition, and friction and shear 7. Five subscales are scored from 1 to 4 and friction and shear from 1 to 3, so the sum score can range from 6 to 23. Scores of 18 or below are identified as at risk, with 13–14 as moderate risk, 10–12 as high risk and ≤9 as very high risk 11.

The Norton Scale comprises five items: physical condition, mental state, activity, mobility and incontinence. Each item is rated from 1 (very bad) to 4 (good), so the maximum score is 20. A score of 16 is taken to indicate risk 6.

The Waterlow Scale consists of seven items: weight for height, skin type, sex and age, malnutrition screening tool, continence, mobility and special risk factors. Potential scores range from 1 to 64. Patients scoring 10–14 are identified as at risk, 15–19 as high risk and 20 or above as very high risk 8.
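The cut‐offs described above can be written as a small lookup. The following Python sketch is illustrative only: the function names are hypothetical, the Braden bands follow the categories used in Table 1, the label for Waterlow scores below 10 is an assumption (the text does not name that band), and a Waterlow score of exactly 20 is placed in the top band, which is also an assumption. The Norton Scale is omitted because only a single threshold is given for it.

```python
def braden_risk(total: int) -> str:
    """Map a Braden sum score (6-23) to a risk band; lower means more risk."""
    if not 6 <= total <= 23:
        raise ValueError("Braden sum scores range from 6 to 23")
    if total <= 9:
        return "very high risk"
    if total <= 12:
        return "high risk"
    if total <= 14:
        return "moderate risk"
    if total <= 18:
        return "mild risk"
    return "no risk"


def waterlow_risk(total: int) -> str:
    """Map a Waterlow sum score to a risk band; higher means more risk."""
    if total >= 20:  # assumption: a score of exactly 20 is very high risk
        return "very high risk"
    if total >= 15:
        return "high risk"
    if total >= 10:
        return "at risk"
    return "below at-risk threshold"  # assumption: band not named in the text


# Show how the same numeric score is read on each scale.
for score in (20, 16, 13, 11, 8):
    print(score, braden_risk(score), "|", waterlow_risk(score))
```

The two scales run in opposite directions: a low Braden score signals high risk, whereas a high Waterlow score does.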

Statistics

An intra‐class correlation coefficient (ICC) was used to determine the inter‐rater reliability. This ICC is based on a two‐way random effects model with rater variance included in the ICC denominator. ICC values of 0·00–0·10 were interpreted as virtually no inter‐rater reliability; values of 0·11–0·40, 0·41–0·60, 0·61–0·80 and 0·81–1·00 were interpreted as slight, fair, moderate and substantial reliability, respectively 12. We also did subgroup analyses by ward and by pressure ulcer risk category. The pressure ulcer risk category is based on the Braden Scale. All analyses were performed using IBM SPSS statistics software (version 20.0; IBM, Armonk, NY).
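For readers who want to see the calculation, a two‐way random effects ICC with rater variance in the denominator can be computed from the ANOVA mean squares of a patients × raters score table. The sketch below is a minimal pure‐Python version of the single‐rater, absolute‐agreement form. Whether single or average measures were reported is not stated in the text, so that choice, the function name `icc_2_1` and the toy scores are all assumptions; the code is not a reproduction of the SPSS procedure.

```python
from statistics import fmean


def icc_2_1(ratings):
    """Single-rater, absolute-agreement ICC (two-way random effects).

    ratings: list of rows, one row per patient and one column per rater.
    """
    n = len(ratings)       # number of patients
    k = len(ratings[0])    # number of raters
    values = [x for row in ratings for x in row]
    grand = fmean(values)
    row_means = [fmean(row) for row in ratings]
    col_means = [fmean([row[j] for row in ratings]) for j in range(k)]

    # Partition the total sum of squares into patients, raters and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for x in values)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-patient mean square
    msc = ss_cols / (k - 1)               # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square

    # Rater variance enters the denominator through the (msc - mse) term.
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical total scores for 4 patients rated by 3 nurses (toy data only).
toy_scores = [
    [18, 17, 18],
    [12, 13, 12],
    [9, 9, 10],
    [21, 20, 21],
]
print(round(icc_2_1(toy_scores), 3))
```

Because the between‐rater mean square appears in the denominator, a systematic difference between raters (one nurse scoring consistently higher than another) lowers this coefficient even when the raters rank patients identically.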

Results

Inter‐rater reliability of three scales

Six experienced registered nurses assessed pressure ulcer risk in 23 patients from seven clinical departments, alone and independently, using the three scales in sequence. The inter‐rater reliabilities of the three scales are shown in Table 2.

Table 2.

Inter‐rater reliability of the three scales and their items for assessment of pressure ulcer risk (6 experienced registered nurses assessed alone and independently for 23 patients from 7 clinical departments)

| Scale | Item | ICC (95% CI) | F/P | Inter‐rater reliability |
| --- | --- | --- | --- | --- |
| Braden | Sensory perception | 0·926 (0·873–0·963) | 76·190/0·000 | Substantial |
| Braden | Moisture | 0·603 (0·435–0·770) | 10·106/0·000 | Moderate |
| Braden | Activity | 0·964 (0·936–0·982) | 159·609/0·000 | Substantial |
| Braden | Mobility | 0·892 (0·819–0·946) | 50·732/0·000 | Substantial |
| Braden | Nutrition | 0·683 (0·529–0·824) | 13·934/0·000 | Moderate |
| Braden | Friction and shear | 0·733 (0·592–0·855) | 17·459/0·000 | Moderate |
| Braden | Total score | 0·955 (0·922–0·978) | 129·803/0·000 | Substantial |
| Norton | Physical condition | 0·595 (0·426–0·764) | 9·825/0·000 | Fair |
| Norton | Mental condition | 0·929 (0·878–0·965) | 79·818/0·000 | Substantial |
| Norton | Activity | 0·975 (0·955–0·988) | 231·309/0·000 | Substantial |
| Norton | Mobility | 0·911 (0·849–0·956) | 62·720/0·000 | Substantial |
| Norton | Incontinence | 0·681 (0·526–0·822) | 13·809/0·000 | Moderate |
| Norton | Total score | 0·967 (0·943–0·984) | 179·093/0·000 | Substantial |
| Waterlow | Weight for height | 0·959 (0·929–0·980) | 142·221/0·000 | Substantial |
| Waterlow | Skin type | 0·592 (0·422–0·762) | 9·700/0·000 | Fair |
| Waterlow | Sex | 0·840 (0·739–0·917) | 32·439/0·000 | Substantial |
| Waterlow | Age | 0·990 (0·982–0·995) | 602·660/0·000 | Substantial |
| Waterlow | Malnutrition screening tool | 0·879 (0·799–0·939) | 44·704/0·000 | Substantial |
| Waterlow | Continence | 0·797 (0·678–0·893) | 24·506/0·000 | Moderate |
| Waterlow | Mobility | 0·836 (0·734–0·915) | 31·657/0·000 | Substantial |
| Waterlow | Special risks–tissue malnutrition | 0·737 (0·597–0·858) | 17·828/0·000 | Moderate |
| Waterlow | Special risks–neurological deficit | 0·830 (0·725–0·912) | 30·304/0·000 | Substantial |
| Waterlow | Special risks–surgery/trauma | 0·600 (0·431–0·768) | 10·000/0·000 | Fair |
| Waterlow | Special risks–medication | 0·890 (0·815–0·944) | 49·392/0·000 | Substantial |
| Waterlow | Total score | 0·915 (0·855–0·958) | 85·596/0·000 | Substantial |

ICC, intra‐class correlation coefficients.

For the Braden Scale, the ICC values ranged between 0·603 (95% CI: 0·435–0·770) for the item ‘moisture’ and a maximum of 0·964 (95% CI: 0·936–0·982) for the item ‘activity’. Of the six items, three had substantial inter‐rater reliability and the other three had moderate reliability. For the Norton Scale, the ICC values ranged between 0·595 (95% CI: 0·426–0·764) for the item ‘physical condition’ and a maximum of 0·975 (95% CI: 0·955–0·988) for the item ‘activity’. Of the five items, three had substantial inter‐rater reliability, one had moderate reliability and one had fair reliability. For the Waterlow Scale, the ICC values ranged between 0·592 (95% CI: 0·422–0·762) for the item ‘skin type’ and a maximum of 0·990 (95% CI: 0·982–0·995) for the item ‘age’. Of the 11 items, 7 had substantial inter‐rater reliability, 2 had moderate reliability and the other 2 had fair reliability.

The inter‐rater reliabilities of total score for three scales were all substantial. The ICC values were 0·955 (95% CI: 0·922–0·978), 0·967 (95% CI: 0·943–0·984), and 0·915 (95% CI: 0·855–0·958) for Braden, Norton and Waterlow Scales, respectively.
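The item counts reported above can be re‐derived by applying the interpretation bands from the Statistics section to the item‐level ICC point estimates in Table 2. The sketch below does this in Python. The published bands leave small gaps (for example between 0·60 and 0·61), so the upper bound of each band is treated here as an inclusive cut‐point; that is an assumption, chosen because it reproduces the labels printed in Table 2. The names `reliability_band`, `TABLE_2_ITEMS` and `band_counts` are hypothetical.

```python
from collections import Counter


def reliability_band(icc: float) -> str:
    """Label an ICC using the bands from the Statistics section.

    Assumption: each band's upper bound is an inclusive cut-point.
    """
    if icc <= 0.10:
        return "virtually none"
    if icc <= 0.40:
        return "slight"
    if icc <= 0.60:
        return "fair"
    if icc <= 0.80:
        return "moderate"
    return "substantial"


# Item-level ICC point estimates copied from Table 2 (total scores excluded).
TABLE_2_ITEMS = {
    "Braden": [0.926, 0.603, 0.964, 0.892, 0.683, 0.733],
    "Norton": [0.595, 0.929, 0.975, 0.911, 0.681],
    "Waterlow": [0.959, 0.592, 0.840, 0.990, 0.879, 0.797,
                 0.836, 0.737, 0.830, 0.600, 0.890],
}

# Count how many items of each scale fall into each band.
band_counts = {
    scale: Counter(reliability_band(value) for value in values)
    for scale, values in TABLE_2_ITEMS.items()
}

for scale, counts in band_counts.items():
    print(scale, dict(counts))
```

Run as written, the tallies agree with the counts given in the text: three substantial and three moderate Braden items; three substantial, one moderate and one fair Norton item; and seven substantial, two moderate and two fair Waterlow items.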

Subgroup analyses

The subgroup analyses for total score of the three scales are shown in Table 3.

Table 3.

Subgroup analyses for total score of the three scales (6 experienced registered nurses assessed alone and independently for 23 patients from 7 clinical departments)

| Scale | Subgroup | Patients (n) | ICC (95% CI) | F/P | Inter‐rater reliability |
| --- | --- | --- | --- | --- | --- |
| Braden | Neurosurgery | 4 | 0·943 (0·796–0·996) | 101·168/0·000 | Substantial |
| Braden | ICU | 3 | 0·964 (0·827–0·999) | 161·596/0·000 | Substantial |
| Braden | Orthopedics | 4 | 0·963 (0·860–0·997) | 157·500/0·000 | Substantial |
| Braden | Neurology | 3 | 0·981 (0·901–0·999) | 303·333/0·000 | Substantial |
| Braden | Respiratory medicine | 3 | 0·993 (0·961–1·000) | 812·500/0·000 | Substantial |
| Braden | Spine surgery | 2 | 0·853 (0·300–1·000) | 35·714/0·002 | Substantial |
| Braden | Cardiothoracic surgery | 4 | 0·972 (0·892–0·998) | 209·496/0·000 | Substantial |
| Norton | Neurosurgery | 4 | 0·974 (0·898–0·998) | 224·178/0·000 | Substantial |
| Norton | ICU | 3 | 0·955 (0·791–0·999) | 129·458/0·000 | Substantial |
| Norton | Orthopedics | 4 | 0·793 (0·443–0·983) | 23·953/0·000 | Moderate |
| Norton | Neurology | 3 | 0·979 (0·894–0·999) | 281·780/0·000 | Substantial |
| Norton | Respiratory medicine | 3 | 0·993 (0·961–1·000) | 815·345/0·000 | Substantial |
| Norton | Spine surgery | 2 | 0·448 (−0·074 to 0·999) | 5·870/0·060 | Fair |
| Norton | Cardiothoracic surgery | 4 | 0·990 (0·958–0·999) | 573·226/0·000 | Substantial |
| Waterlow | Neurosurgery | 4 | 0·867 (0·591–0·990) | 40·097/0·000 | Substantial |
| Waterlow | ICU | 3 | 0·984 (0·761–0·999) | 109·882/0·000 | Substantial |
| Waterlow | Orthopedics | 4 | 0·117 (−0·105 to 0·803) | 1·791/0·192 | Slight |
| Waterlow | Neurology | 3 | 0·906 (0·619–0·997) | 58·546/0·000 | Substantial |
| Waterlow | Respiratory medicine | 3 | 0·985 (0·920–1·000) | 383·204/0·000 | Substantial |
| Waterlow | Spine surgery | 2 | 0·948 (0·628–1·000) | 111·154/0·000 | Substantial |
| Waterlow | Cardiothoracic surgery | 4 | 0·931 (0·757–0·995) | 81·734/0·000 | Substantial |
| Braden | No risk | 11 | 0·678 (0·449–0·877) | 13·641/0·000 | Moderate |
| Braden | Mild risk | 2 | 0·400 (−0·091 to 0·999) | 5·000/0·076 | Slight |
| Braden | Moderate risk | 3 | 0·171 (−0·109 to 0·936) | 2·241/0·157 | Slight |
| Braden | High risk | 5 | 0·352 (0·034–0·855) | 4·262/0·012 | Slight |
| Braden | Severe risk | 2 | 0·356 (−0·105 to 0·998) | 4·310/0·093 | Slight |
| Norton | No risk | 11 | 0·849 (0·700–0·949) | 34·800/0·000 | Substantial |
| Norton | Mild risk | 2 | 0·160 (−0·151 to 0·997) | 2·143/0·203 | Slight |
| Norton | Moderate risk | 3 | 0·182 (−0·105 to 0·938) | 2·333/0·147 | Slight |
| Norton | High risk | 5 | 0·844 (0·586–0·979) | 33·385/0·000 | Substantial |
| Norton | Severe risk | 2 | 0·200 (−0·143 to 0·997) | 2·500/0·175 | Slight |
| Waterlow | No risk | 11 | 0·660 (0·427–0·869) | 12·661/0·000 | Moderate |
| Waterlow | Mild risk | 2 | 0·618 (0·012–0·999) | 10·714/0·022 | Moderate |
| Waterlow | Moderate risk | 3 | 0·172 (−0·109 to 0·936) | 2·249/0·156 | Slight |
| Waterlow | High risk | 5 | 0·537 (0·174–0·918) | 7·966/0·001 | Fair |
| Waterlow | Severe risk | 2 | 0·962 (0·705–1·000) | 153·837/0·000 | Substantial |

ICC, intra‐class correlation coefficients.

In the ward subgroups, the ICC values for the total score of the Braden Scale ranged between 0·853 (95% CI: 0·300–1·000) in the spine surgery ward and a maximum of 0·993 (95% CI: 0·961–1·000) in the respiratory medicine ward. In every ward subgroup, the inter‐rater reliability was substantial. The ICC values for the total score of the Norton Scale ranged between 0·448 (95% CI: −0·074 to 0·999) in the spine surgery ward and a maximum of 0·993 (95% CI: 0·961–1·000) in the respiratory medicine ward. Across the seven ward subgroups, the inter‐rater reliability was substantial in five wards, moderate in one and fair in one. The ICC values for the total score of the Waterlow Scale ranged between 0·117 (95% CI: −0·105 to 0·803) in the orthopedics ward and a maximum of 0·985 (95% CI: 0·920–1·000) in the respiratory medicine ward. Across the seven ward subgroups, the inter‐rater reliability was substantial in six wards and slight in one.

However, in the pressure ulcer risk subgroup, the ICC values for total score of Braden Scale ranged between 0·171 (95% CI: −0·109 to 0·936) in the moderate risk subgroup and a maximum of 0·678 (95% CI: 0·449–0·877) in the no risk subgroup. The inter‐rater reliabilities were slight to moderate. The ICC values for total score of Norton Scale ranged between 0·160 (95% CI: −0·151 to 0·997) in the mild risk subgroup and a maximum of 0·849 (95% CI: 0·700–0·949) in the no‐risk subgroup. The inter‐rater reliabilities were from slight to substantial. The ICC values for total score of Waterlow Scale ranged between 0·172 (95% CI: −0·109 to 0·936) in the moderate risk subgroup and a maximum of 0·962 (95% CI: 0·705–1·000) in the severe risk subgroup. The inter‐rater reliabilities were slight to substantial.

Discussion

This study found that the inter‐rater reliability of the total score was substantial for all three scales. However, the reliabilities of some items were not substantial. In the Braden Scale, the inter‐rater reliability of ‘moisture’, ‘nutrition’, and ‘friction and shear’ was moderate. In the Norton Scale, the inter‐rater reliability of ‘physical condition’ was fair, and that of ‘incontinence’ was moderate. In the Waterlow Scale, the inter‐rater reliabilities of ‘skin type’ and ‘surgery/trauma’ were fair, and those of ‘continence’ and ‘tissue malnutrition’ were moderate. Pancorbo et al. systematically reviewed RASs for pressure ulcer prevention and concluded that the inter‐rater reliability of the Braden, Norton and Waterlow Scales was high 13. However, those inter‐rater reliabilities were based on the total scores of the three scales. The present study found that the inter‐rater reliabilities of some items were lower. Kottner and Dassen investigated the inter‐rater reliability and validity of the Braden and Waterlow scales in two intensive care units, and found an ICC of 0·72 (95% CI 0·52–0·87) for Braden Scale sum scores and an ICC of 0·36 (95% CI 0·09–0·63) for Waterlow Scale sum scores. They did not recommend the use of the Braden and Waterlow scales for measuring pressure ulcer risk of intensive care unit patients because of low inter‐rater reliability 14. We believe that the Braden, Norton and Waterlow scales should be used cautiously in clinical practice. In particular, the items ‘moisture’, ‘nutrition’, ‘friction and shear’, ‘physical condition’ and ‘incontinence’ deserve more attention.

We also did subgroup analyses by ward and by pressure ulcer risk category. For the Braden Scale total score, the inter‐rater reliability in every ward was substantial; for the Norton Scale total score, the reliability was not good in the orthopedics and spine surgery wards; and for the Waterlow Scale total score, the reliability was not good in the orthopedics ward. From our experience, we believe that it is difficult to assess activity in orthopedics and spine surgery wards, to assess continence and nutrition in neurology and neurosurgery wards, and to assess skin type in the cardiothoracic surgery ward (oedema and heart failure). In the pressure ulcer risk subgroups, inter‐rater reliability was poor in most subgroups for all three scales. This is another reason why we believe that the Braden, Norton and Waterlow scales should be used cautiously in clinical practice.

The Norton Scale was designed in 1962 6, and the Braden and Waterlow scales were designed based on the Norton Scale 7, 8. Although these three RASs are widely used, they were all designed in the last century on the basis of experience. In recent years, progress has been made in the methodology of scale design, and evidence‐based methods are now widely used. One important method is to develop a scale from a multiple logistic regression model 15. It appears that new pressure ulcer RASs should be developed by such evidence‐based methods. In addition, before redesigning the RASs, we should identify highly reliable quantitative items to replace the ambiguous items in order to improve inter‐rater reliability.

Conclusion

Although the inter‐rater reliability of the total scores of the Braden Scale, Norton Scale and Waterlow Scale was substantial in every case, the reliabilities of some items were lower. The items ‘moisture’, ‘physical condition’ and ‘skin type’ deserve particular attention. Moreover, further studies are needed to identify highly reliable quantitative items that can replace these ambiguous items in newly designed scales.

Acknowledgements

This work is supported by Nantong Municipal Science and Technology Bureau, grant number BK2013014. The authors had no conflicts of interest to declare in relation to this article.

Author contribution

LHW and HLC were involved in the study design, acquisition of data, and analysis and interpretation of data. HYY, JHG, FW, YM, LL and JJD were also involved in the acquisition of data.

References

1. VanGilder C, Macfarlane GD, Meyer S. Results of nine international pressure ulcer prevalence surveys: 1989 to 2005. Ostomy Wound Manage 2008;54:40–54.
2. VanGilder C, Amlung S, Harrison P, Meyer S. Results of the 2008–2009 International Pressure Ulcer Prevalence Survey and a 3‐year, acute care, unit‐specific analysis. Ostomy Wound Manage 2009;55:39–45.
3. House S, Giles T, Whitcomb J. Benchmarking to the international pressure ulcer prevalence survey. J Wound Ostomy Continence Nurs 2011;38:254–9.
4. Bennett G, Dealey C, Posnett J. The cost of pressure ulcers in the UK. Age Ageing 2004;33:230–5.
5. Dealey C, Posnett J, Walker A. The cost of pressure ulcers in the United Kingdom. J Wound Care 2012;21:261–2, 264, 266.
6. Norton D. Calculating the risk: reflections on the Norton Scale. 1989. Adv Wound Care 1996;9:38–43.
7. Bergstrom N, Braden BJ, Laguzza A, Holman V. The Braden Scale for predicting pressure sore risk. Nurs Res 1987;36:205–10.
8. Waterlow J. Pressure sores: a risk assessment card. Nurs Times 1985;81:49–55.
9. Gould D, Goldstone L, Kelly D, Gammon J. Examining the validity of pressure ulcer risk assessment scales: a replication study. Int J Nurs Stud 2004;41:331–9.
10. Jun Seongsook RN, Jeong Ihnsook RN, Lee Younghee RN. Validity of pressure ulcer risk assessment scales; Cubbin and Jackson, Braden, and Douglas Scale. Int J Nurs Stud 2004;41:199–204.
11. Ayello EA, Braden B. How and why to do pressure ulcer risk assessment. Adv Skin Wound Care 2002;15:125–31; quiz 132–3.
12. Shrout PE. Measurement reliability and agreement in psychiatry. Stat Methods Med Res 1998;7:301–17.
13. Pancorbo‐Hidalgo PL, Garcia‐Fernandez FP, Lopez‐Medina IM, Alvarez‐Nieto C. Risk assessment scales for pressure ulcer prevention: a systematic review. J Adv Nurs 2006;54:94–110.
14. Kottner J, Dassen T. Pressure ulcer risk assessment in critical care: interrater reliability and validity studies of the Braden and Waterlow scales and subjective ratings in two intensive care units. Int J Nurs Stud 2010;47:671–7.
15. Sullivan LM, Massaro JM, D'Agostino RB Sr. Presentation of multivariate data for clinical use: the Framingham Study risk score functions. Stat Med 2004;23:1631–60.

Articles from International Wound Journal are provided here courtesy of Wiley
