Abstract
Context: Data from electrothermometers are used to determine therapeutic modality usage, but the value of experimental results is only as good as the data collected.
Objective: To determine the reliability and validity of 3 electrothermometers from 2 manufacturers.
Design: A 3 × 4 × 17 factorial with repeated measures on 2 factors. Independent variables were trial (1, 2, 3), thermometer (mercury thermometer, Iso-Thermex calibrated from −50°C to 50°C, Iso-Thermex calibrated from −20°C to 80°C, and Datalogger), and time (17).
Setting: Human Performance Research Center.
Intervention(s): Eighteen thermocouples were inserted through the wall of a foamed polystyrene cooler, and 6 were connected to each of the 3 electrothermometers. The cooler was positioned on a stir plate and filled with room-temperature water (18.4°C). A mercury thermometer was immersed into the water bath. Measurements of the water bath were taken every 10 seconds for three 3-minute trials.
Main Outcome Measure(s): Temperature and the absolute temperature difference between each of the 3 electrothermometers and a calibrated mercury thermometer.
Results: The Iso-Thermex electrothermometers did not differ statistically from each other in uncertainty (validity error ± reliability error = 0.06°C ± 0.03°C and 0.03°C ± 0.02°C, P > .05), but both differed from the Datalogger (0.64°C ± 0.20°C, P < .05). The Datalogger temperature was consistently higher than the mercury thermometer temperature.
Conclusions: The Iso-Thermex electrothermometers were more stable than the Datalogger, and values were within the published uncertainty (±0.1°C) when used with PT-6 thermocouples. The Datalogger we used had an uncertainty of measurement greater than that indicated in the user's manual (∼±0.52°C). Uncertainty of ±0.84°C can significantly influence the interpretation of results when intramuscular temperature changes are usually less than 5°C.
Keywords: accuracy, repeatability, reproducibility, measurements
One indicator of the efficacy of a thermal therapeutic modality is its ability to change temperature in specific tissues. Measuring tissue temperature, therefore, is an important element in therapeutic modality research. The value of experimental results is only as good as the data collected. Both the Iso-Thermex (Columbus Instruments, Columbus, OH) and Datalogger (model MMS 3000-T6V4; Commtest Instruments Ltd, Christchurch, New Zealand) electrothermometers have been used for measuring tissue temperature and have helped to advance our understanding of thermal and cryotherapeutic modalities.
Most therapeutic modality investigators using these electrothermometers have either reported the manufacturer's claims of their instrument's accuracy1–4 or failed to report the reliability and validity of their instruments.5–12 We know of no independent assessment of these two electrothermometers. Our research question was, How reliable and valid are the Iso-Thermex and Datalogger?
METHODS
Experimental Design
A 3 × 4 × 17 repeated-measures design guided this study. The independent variables were trial (1, 2, and 3), thermometer (mercury thermometer, Iso-Thermex calibrated from −50°C to 50°C, Iso-Thermex calibrated from −20°C to 80°C, and Datalogger), and time (every 10 seconds for 3 minutes, n = 17). The dependent variables were temperature and absolute temperature differences from a mercury thermometer.
Instruments
We interfaced 18 PT-6 thermocouples (Physitemp Instruments Inc, Clifton, NJ) with 3 electrothermometers (6 thermocouples per machine): a 16-channel Iso-Thermex with a measurable temperature range of −50°C to 50°C (Iso−50:50), a second 16-channel Iso-Thermex with a measurable temperature range of −20°C to 80°C (Iso−20:80), and a 6-channel Datalogger with a measurable temperature range of −250°C to 350°C (Figure 1). A calibrated mercury thermometer (model 15-059-18; Fisher Scientific International Inc, Hampton, NH; National Institute of Standards and Technology13 traceable) graded at 0.1°C was used to monitor water bath temperature. We circulated the water bath with a stirrer (model 103, Corning PC, Corning, NY) and magnetic stir bar.
Procedures
Eighteen thermocouples, 3 rows of 6, were inserted through the wall of a 23- × 15- × 19-cm foamed polystyrene cooler and secured with silicone polymer (Figure 2). The thermocouples extended 10 cm into the cooler. The bottom row was 4 cm from the cooler bottom, and rows were 3 cm apart from each other. Thermocouples within a row were also 3 cm apart from each other. The cooler was filled to within 3 cm of the top with water (18.4°C) that had sat in the room for 24 hours before the study. Water temperature gradients were minimized with a magnetic stir bar spinning in the bottom of the cooler. The mercury thermometer was immersed into the water bath approximately 3 cm from the bottom.
We interfaced 6 thermocouples with each of the 3 electrothermometers, using all 6 channels of the Datalogger and 6 randomly selected channels in each Iso-Thermex unit. Data collection consisted of 3 trials. After the first and second trials, the 3 sets of 6 thermocouples were rotated among electrothermometers. Electrothermometers were started simultaneously, and temperature was recorded every 10 seconds for 3 minutes. The same investigator read the mercury thermometer every 10 seconds throughout each trial.
Statistical Analysis
We computed the mean and standard deviation of the 306 measures (17 samples for each of 18 thermocouples) for each machine. The standard deviation served as our measure of reliability. We then compared the reliability (standard deviation) among electrothermometers using a modified Levene equal-variance test14 and pairwise F tests.14
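The reliability comparison described above can be sketched roughly as follows. This is a minimal illustration, not the analysis code used in the study: the readings are invented, the function names are hypothetical, and only the standard deviation and a pairwise F ratio of variances are shown (the modified Levene test is omitted).

```python
from statistics import mean, stdev

# Hypothetical readings (°C) for illustration only; not the study's data.
readings = {
    "Iso -50:50": [18.33, 18.35, 18.36, 18.32, 18.34, 18.37],
    "Datalogger": [18.80, 19.20, 19.05, 18.95, 19.30, 18.90],
}

def reliability(values):
    """Sample standard deviation, used here as the reliability measure."""
    return stdev(values)

def variance_ratio(a, b):
    """F ratio of two sample variances and its degrees of freedom (n_a - 1, n_b - 1)."""
    return stdev(a) ** 2 / stdev(b) ** 2, (len(a) - 1, len(b) - 1)

sd = {name: reliability(values) for name, values in readings.items()}
f_ratio, dof = variance_ratio(readings["Datalogger"], readings["Iso -50:50"])

for name, value in sd.items():
    print(f"{name}: mean = {mean(readings[name]):.2f} °C, SD = {value:.3f} °C")
print(f"F{dof} = {f_ratio:.2f}")
```

With 306 measures per machine, the degrees of freedom for such a ratio would be 305 and 305, which matches the F statistics reported in the Results.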
Validity was computed as the absolute difference between the individual measurements of the mercury thermometer and each electrothermometer. These individual differences then became the dependent variable in a 3 × 17 repeated-measures analysis of variance,15 with electrothermometer and time as main factors. Thermocouple was used as a repeated measure to ensure that the differences were between machines and not thermocouples. When appropriate, we computed Scheffé multiple comparison tests to determine the location of statistical differences among individual machines. Results were considered statistically significant at an alpha level of 0.05.
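As a rough sketch of the validity measure described above, the absolute difference can be taken reading by reading and then averaged. The paired readings below are invented for illustration, and the function name is hypothetical; the repeated-measures analysis of variance itself is not shown.

```python
from statistics import mean

# Hypothetical paired readings (°C) at the same time points; not the study's data.
mercury = [18.40, 18.40, 18.40, 18.40]
electro = [18.34, 18.35, 18.33, 18.34]

def validity_error(reference, measured):
    """Mean absolute difference between reference and instrument readings."""
    if len(reference) != len(measured):
        raise ValueError("readings must be paired by time point")
    return mean(abs(r - m) for r, m in zip(reference, measured))

error = validity_error(mercury, electro)
print(f"validity error = {error:.2f} °C")
```

Taking the absolute value before averaging keeps overestimates and underestimates from canceling each other.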
RESULTS
Each Iso-Thermex unit was more valid (F2,34 = 151.51, P = .000001) and more reliable (Iso −50:50: F305,305 = 36.35, P = .00001; Iso −20:80: F305,305 = 49.97, P = .00001) than the Datalogger, but they did not differ from each other (Tables 1 and 2, Figure 3). The difference between the Datalogger and the mercury thermometer was 10–20 times more than the difference between the Iso-Thermex units and the mercury thermometer (see Table 1). The Iso-Thermex −20:80 underestimated the mercury thermometer by 0.03°C; the Iso-Thermex −50:50 underestimated the mercury thermometer by 0.06°C; and the Datalogger overestimated the mercury thermometer by 0.64°C (see Table 1, Figure 3). The reliability of the Iso-Thermex units was not different, but both were more reliable than the Datalogger (see Table 2).
Table 1. Means (Validity) and Standard Deviations of Absolute Temperature Differences for 3 Electrothermometers.
Table 2. Means and Standard Deviations (Reliability) of Temperature Measurements for 3 Electrothermometers.
DISCUSSION
Terms used to describe how closely or consistently repeated measurements describe phenomena include accuracy, conformance, reliability, validity, error, repeatability, reproducibility, and uncertainty. Scientists use the terms reliability and validity,16 whereas manufacturers of instruments typically use the term accuracy.17,18 However, the National Institute of Standards and Technology and those who calibrate thermometers to its specifications13,19 use the words accuracy, conformance, error, repeatability, reproducibility, and uncertainty. These terms are sometimes used synonymously and incorrectly, most likely because of a misunderstanding of their meanings. They are defined as follows:
Reliability is the extent to which an experiment, test, or measuring procedure yields the same results on repeated trials.20
Validity is the extent to which a situation as observed reflects the true situation, or the degree to which data or results of a study are correct or true.21
Accuracy is the freedom from mistake or error—correctness.20 This includes freedom from both validity and reliability errors. Ironically, manufacturers use accuracy as a quantitative term despite the fact that the National Institute of Standards and Technology defines accuracy as a qualitative and not a quantitative term.
Conformance and conformity are actions that are in accordance with some specified standard or authority.20 This would indicate validity.
Error is the difference between an observed or calculated value and a true value20—the amount of deviation from a standard or specification,20 composed of random error and systematic error.15 Random error is equivalent to error that would affect reliability, whereas systematic error contributes to validity.
Repeatability is the closeness of agreement between the results of successive measurements carried out under the same conditions of measurement.15
Reproducibility is the closeness of agreement between the results of measurements carried out under changed conditions of measurement.15
Uncertainty is a value, associated with the result of a measurement, that characterizes the dispersion of the values that could reasonably be attributed to the measurement,22 composed of uncertainty from both random and systematic error.15 Again, random error contributes to reliability, whereas systematic error contributes to validity.
Neither Columbus Instruments nor Commtest Instruments reports the validity or reliability of their instruments.18,23 Both report accuracy, which is inappropriate because they are assigning a quantitative value to a qualitative term. What they really mean to report is uncertainty (M. Grisby, unpublished data, 2005). Uncertainty comprises both random error (reliability) and systematic error (validity).15 To calculate uncertainty based on this definition, we added our reliability and validity measurements (Table 3).
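The calculation described above amounts to a simple sum, which can be sketched as follows. The numeric values are those reported in this article for the Datalogger and the Iso-Thermex −50:50; the function name is hypothetical and chosen only for illustration.

```python
def uncertainty(validity_error, reliability_error):
    """Combined uncertainty (°C) as the sum of systematic and random error."""
    return validity_error + reliability_error

# Validity and reliability errors (°C) as reported in this article.
datalogger = uncertainty(0.64, 0.20)
iso_50_50 = uncertainty(0.06, 0.03)

print(f"Datalogger: ±{datalogger:.2f} °C")
print(f"Iso-Thermex -50:50: ±{iso_50_50:.2f} °C")
```

The sums agree with the calculated uncertainties of ±0.84°C and ±0.09°C given in the text.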
Table 3. Calculated Uncertainty and Reported Accuracy of Each Electrothermometer.
Our uncertainty measurements for both Iso-Thermex units were less than their manufacturer's specifications (ie, they had smaller boundaries of error [conceptually similar to confidence intervals]), whereas the Datalogger had greater uncertainty than its manufacturer's claims (see Table 3). Columbus Instruments reports the Iso-Thermex is accurate to within ±0.1°C, which is more uncertain than our calculated uncertainty of ±0.09°C for the Iso-Thermex −50:50 and ±0.06°C for the Iso-Thermex −20:80. Commtest Instruments reports the Datalogger accuracy to be ± (0.5 + [0.001·temperature measurement]),18 which would be ±0.52°C for our study. This is less uncertain than our calculated uncertainty of ±0.84°C.
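The Datalogger comparison above can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: the function and constant names are hypothetical, and the formula is applied only at the positive bath temperature used in this study (how the manufacturer treats negative temperatures is not stated here).

```python
def datalogger_spec(temperature_c):
    """Manufacturer's stated accuracy: ±(0.5 + 0.001 × temperature), in °C."""
    return 0.5 + 0.001 * temperature_c

BATH_TEMPERATURE_C = 18.4       # water bath temperature in this study
MEASURED_UNCERTAINTY_C = 0.84   # calculated Datalogger uncertainty from the text

spec = datalogger_spec(BATH_TEMPERATURE_C)
within_spec = MEASURED_UNCERTAINTY_C <= spec
print(f"spec = ±{spec:.2f} °C, within spec: {within_spec}")
```

At 18.4°C the specification works out to about ±0.52°C, so the measured ±0.84°C falls outside it.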
The systematic error (validity) and random error (reliability) measures almost equally contributed to our calculated uncertainty for the Iso-Thermex units (see Tables 1 and 2). The Datalogger's uncertainty, however, was composed mostly of systematic error (validity).
The uncertainty (combination of reliability and validity) of the Iso-Thermex and Datalogger electrothermometers is vital in interpreting the results of therapeutic modality research conducted with them. Scientists and clinicians should be aware that conclusions based on Datalogger data may be less accurate than those generated by an Iso-Thermex. Not only did the Datalogger fail to meet the manufacturer's claims for accuracy, but the manufacturer's claims were more uncertain than those of the Iso-Thermex (0.52°C versus 0.1°C for the temperature in our study). This may be especially relevant when the reported difference between experimental conditions is within 1°C.
Because we evaluated a limited number of units at a single temperature, our results may not apply to all units of a similar model and brand of electrothermometers, nor to other temperatures. Therefore, the more important conclusion of this study is not the actual reliability and validity values for our specific machines. Our observations demonstrate the need for individual assessment and reporting of reliability and validity for individual machines. Scientists examining tissue temperature should know the manufacturer's claims of uncertainty, test their own equipment, and report the reliability and validity of their instruments, so that others are aware of the uncertainty of the measurements. Knowing the uncertainty of research results helps consumers of the literature make more correct interpretations of those results and may help to explain differences found among studies.
REFERENCES
1. Burr PO, Demchak TJ, Cordova ML, Ingersoll CD, Stone MB. Effects of altering intensity during 1-MHz ultrasound treatment on increasing triceps surae temperature. J Sport Rehabil. 2004;13:275–286.
2. Gallo JA, Draper DO, Brody LT, Fellingham GW. A comparison of human muscle temperature increases during 3-MHz continuous and pulsed ultrasound with equivalent temporal average intensities. J Orthop Sports Phys Ther. 2004;34:395–401. doi: 10.2519/jospt.2004.34.7.395.
3. Merrick MA, Mihalyov MR, Roethemeier JL, Cordova ML, Ingersoll CD. A comparison of intramuscular temperatures during ultrasound treatments with coupling gel or gel pads. J Orthop Sports Phys Ther. 2002;32:216–220. doi: 10.2519/jospt.2002.32.5.216.
4. Merrick MA, Jutte LS, Smith ME. Cold modalities with different thermodynamic properties produce different surface and intramuscular temperatures. J Athl Train. 2003;38:28–33.
5. Palmer JE, Knight KL. Ankle and thigh skin surface temperature changes with repeated ice pack application. J Athl Train. 1996;31:319–323.
6. Myrer JW, Measom GJ, Fellingham GW. Intramuscular temperature rises with topical analgesics used as coupling agents during therapeutic ultrasound. J Athl Train. 2001;36:20–26.
7. Myrer JW, Myrer KA, Measom GJ, Fellingham GW, Evers SL. Muscle temperature is affected by overlying adipose when cryotherapy is administered. J Athl Train. 2001;36:32–36.
8. Hopkins JT, Ingersoll CD, Edwards J, Klootwyk TE. Cryotherapy and transcutaneous electric neuromuscular stimulation decrease arthrogenic muscle inhibition of the vastus medialis after knee joint effusion. J Athl Train. 2002;37:25–31.
9. Otte JW, Merrick MA, Ingersoll CD, Cordova ML. Subcutaneous adipose tissue thickness alters cooling time during cryotherapy. Arch Phys Med Rehabil. 2002;83:1501–1505. doi: 10.1053/apmr.2002.34833.
10. Garrett CL, Draper DO, Knight KL. Heat distribution in the lower leg from pulsed short-wave diathermy and ultrasound treatments. J Athl Train. 2000;35:50–55.
11. Drust B, Atkinson G, Gregson W, French D, Binningsley D. The effects of massage on intra muscular temperature in the vastus lateralis in humans. Int J Sports Med. 2003;24:395–399. doi: 10.1055/s-2003-41182.
12. Merrick MA, Knight KL, Ingersoll CD, Potteiger JA. The effects of ice and compression wraps on intramuscular temperature at various depths. J Athl Train. 1993;23:236–245.
13. National Institute of Standards and Technology. Standards. Available at: http://www.nist.gov/public_affairs/standards.htm. Accessed January 24, 2005.
14. Ramsey F, Schafer D. The Statistical Sleuth: A Course in Methods of Data Analysis. 2nd ed. Pacific Grove, CA: Duxbury Press; 2002.
15. Taylor BN, Kuyatt CE. Guidelines for Evaluating and Expressing the Uncertainty of NIST Measurement Results. 1994 ed. NIST Technical Note 1297. Gaithersburg, MD: National Institute of Standards and Technology; 1994.
16. Thomas JR, Nelson JK. Measuring research variables. In: Research Methods in Physical Activity. 4th ed. Champaign, IL: Human Kinetics; 2001:181–200.
17. Temperature microprobes. Physitemp Precision Temperature Specialists. Clifton, NJ: Physitemp Instruments, Inc; 1999.
18. Appendix specifications. MMS3000-T6V4 Owner's Manual. Christchurch, New Zealand: Commtest Instruments Limited; 2001:75.
19. Ordering, terms, & warranty information. Physitemp Precision Temperature Specialists. Clifton, NJ: Physitemp Instruments, Inc; 1999.
20. Mish FC. The Merriam-Webster's Collegiate Dictionary. 11th ed. Springfield, MA: Merriam-Webster; 2003.
21. Egan EJ. Taber's Cyclopedic Medical Dictionary. 18th ed. Philadelphia, PA: FA Davis; 1997.
22. National Institute of Standards and Technology: physics laboratory. The NIST Reference on Constants, Units, and Uncertainty: Glossary, 10/2000. Available at: http://physics.nist.gov/cuu/Uncertainty/glossary.html. Accessed February 21, 2005.
23. Iso-Thermex Electronic Thermometer Instruction Manual. Columbus, OH: Columbus Instruments International Corp; 1992.