
Keywords: global warming, heat balance, psychrometric limits, repeatability, uncompensable heat stress
Abstract
The PSU HEAT protocol has been used to determine critical environmental limits, i.e., those combinations of ambient temperature and humidity above which heat stress becomes uncompensable and core temperature rises continuously. However, no studies have rigorously investigated the reliability and validity of this experimental protocol. Here, we assessed the 1) between-visit reliability and 2) validity of the paradigm. Twelve subjects (5 M/7 W; 25 ± 4 yr) completed a progressive heat stress protocol during which they walked on a treadmill (2.2 mph, 3% gradient) in a controllable environmental chamber. After an equilibration period, either dry-bulb temperature (Tdb) was increased every 5 min while ambient water vapor pressure (Pa) was held constant (Tcrit experiments) or Pa was increased every 5 min while Tdb was held constant (Pcrit experiments) until an upward inflection in gastrointestinal temperature (Tgi) was observed. For reliability experiments, 11 subjects repeated the same protocol on a different day. For validity experiments, 10 subjects performed a Tcrit experiment at their previously determined Pcrit or vice versa. The between-visit reliability (intraclass correlation coefficient, ICC) for critical environmental limits was 0.98. Similarly, there was excellent agreement between original and validity trials for Tcrit (ICC = 0.95) and Pcrit (ICC = 0.96). Furthermore, the wet-bulb temperature at the Tgi inflection point was not different during reliability (P = 0.78) or validity (P = 0.32) trials compared with original trials. These findings support the reliability and validity of this experimental paradigm for the determination of critical environmental limits for maintenance of human heat balance.
NEW & NOTEWORTHY The PSU HEAT progressive heat stress protocol has been used to identify critical environmental limits for various populations, clothing ensembles, and metabolic intensities. However, no studies have rigorously investigated the reliability and validity of this experimental model. Here, we demonstrate excellent reliability and validity of the PSU HEAT protocol.
INTRODUCTION
Body core temperature (Tc) equilibrates proportionally to metabolic heat production across a wide range of environmental conditions (1, 2). Thermal environments above which heat balance cannot be maintained, resulting in a continuous rise in Tc, are termed critical environmental limits. Lind (1, 2) first defined this critical limit as the “upper limit of the prescriptive zone” using an approach in which subjects completed a series of ∼1 h exposures to a wide range of thermal environments during treadmill walking. Belding and Kamon (3) developed a less time-intensive model to determine critical ambient water vapor pressures for a variety of exercise intensities and air movements at an ambient temperature of 36°C. Subsequently, our laboratory refined the approach by separating experiments in which heat balance is limited by sweat evaporation (critical water vapor pressure, Pcrit) from those in which maximal sweating capacity and skin blood flow limitations are the primary limiting factors (critical dry-bulb temperature, Tcrit) (4–7). The experimental approach has a long history of use in identifying critical environmental limits in a variety of populations, metabolic intensities, and clothing ensembles (4, 8–10). The PSU HEAT (Human Environmental Age Thresholds) project aims to establish critical environments for a variety of vulnerable populations that may be used for evidence-based alert communication, triage for impending heat events, and global climate change initiatives to mitigate the deleterious impact of severe heat events on human health. As such, it is imperative to ensure the experimental method used to determine these thresholds is rigorous, repeatable, and valid.
Our laboratory has previously reported reliability (r = 0.97) of this experimental paradigm; however, only a subgroup of four young, healthy adults was tested, at a metabolic intensity of 30% V̇o2max, an intensity designed to approximate that sustained across an 8-h workday in industrial settings (4). No investigation has yet specifically determined the reliability and validity of critical environmental limits measured at metabolic intensities that approximate daily living or minimal activity. This is a critical component of the PSU HEAT project, given the National Institutes of Health initiative for advancement in scientific research that is repeatable, reproducible, and rigorous (11, 12).
The purpose of the present investigation was to examine the between-visit reliability of the experimental paradigm used to identify those critical environmental conditions in which heat stress becomes uncompensable. In addition, to assess the validity of the findings from these experiments, we conducted a series of trials in which the Tcrit or Pcrit trials were reversed [i.e., progressive heat stress was conducted with dry-bulb temperature (Tdb) or ambient water vapor pressure (Pa) clamped at the previously observed Tcrit or Pcrit value for a given subject] to determine whether critical environmental loci would occur at the same combinations of ambient temperature and humidity.
METHODS
Subjects
All experimental procedures were approved in advance by the Institutional Review Board at the Pennsylvania State University. Oral and written consents were obtained voluntarily from all subjects before participation and in accordance with the guidelines set forth by the Declaration of Helsinki. All testing was conducted in Noll Laboratory at the Pennsylvania State University.
Twelve young healthy adults (5 men/7 women) aged 18–35 yr were recruited for the study. All of the original trials (15 experiments) overlapped with our companion paper (13). Nine subjects returned for both a reliability and validity trial, two subjects returned for a reliability trial only, and one subject returned for a validity trial only. All visits were separated by at least 5 days and were conducted during the same season of the year for each subject. The average duration between visits was 4 wk (range = 1–16 wk) for reliability tests and 3 wk (range = 1–7 wk) for validity tests. All participants were healthy, normotensive, nonsmokers, and were not taking any prescription medications that might affect the physiological variables of interest in this study. No attempt was made to control for menstrual status or contraceptive use (4). Maximal aerobic capacity (V̇o2max) was determined using open-circuit spirometry during a graded exercise test performed on a motor-driven treadmill. During the experiments, participants wore thin, short-sleeved cotton tee-shirts, shorts, socks, and walking/running shoes.
Testing Procedures
Before each experimental session, participants were instructed to abstain from alcoholic beverages and vigorous exercise for 24 h and from caffeine for 12 h. Upon arrival, participants provided a urine sample to ensure euhydration, defined as urine specific gravity ≤ 1.020 (USG; PAL-S, Atago, Bellevue, WA) (14). Subjects walked on a motor-driven treadmill at a speed of 2.2 mph with a grade of 3% until a clear rise in gastrointestinal temperature (Tgi) was observed.
Two protocols were used to determine Pcrit and Tcrit as previously described (6, 15). The Pcrit and Tcrit for the upward inflection of Tc were determined in an environmental chamber at three distinct Tdb conditions of 36°C, 38°C, and 40°C and three Pa conditions of 12, 16, and 20 mmHg, respectively. During Pcrit experiments, Tdb was held constant while, after a 30-min equilibration period, Pa was increased by 1 mmHg every 5 min. Conversely, during Tcrit experiments, Pa was held constant and Tdb was increased by 1°C every 5 min after a 30-min equilibration period. Progressive heat stress continued until a clear and continuous rise in Tgi was observed (Fig. 1).
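The stepwise protocol described above can be illustrated with a short sketch. This is not the chamber control software used in the study; the function name `chamber_setpoints`, the starting value of the ramped variable, and the assumption that the first increment occurs exactly at the end of the 30-min equilibration period are all hypothetical choices made for illustration.

```python
def chamber_setpoints(mode, tdb_start, pa_start, n_steps,
                      equilibration_min=30, step_min=5):
    """Return a list of (minute, Tdb in degC, Pa in mmHg) setpoints.

    mode="Tcrit": Pa is clamped and Tdb rises 1 degC per step.
    mode="Pcrit": Tdb is clamped and Pa rises 1 mmHg per step.
    Assumption: the first step is applied at the end of equilibration.
    """
    if mode not in ("Tcrit", "Pcrit"):
        raise ValueError("mode must be 'Tcrit' or 'Pcrit'")
    # Equilibration period starts at minute 0 with the initial conditions.
    schedule = [(0, tdb_start, pa_start)]
    for k in range(1, n_steps + 1):
        minute = equilibration_min + (k - 1) * step_min
        if mode == "Tcrit":
            schedule.append((minute, tdb_start + k, pa_start))
        else:
            schedule.append((minute, tdb_start, pa_start + k))
    return schedule


# Hypothetical Pcrit trial: Tdb clamped at 36 degC, Pa starting at 12 mmHg.
for minute, tdb, pa in chamber_setpoints("Pcrit", 36.0, 12.0, 3):
    print(f"t = {minute:3d} min  Tdb = {tdb:.1f} degC  Pa = {pa:.1f} mmHg")
```

In practice the ramp continues until the Tgi inflection is observed, so `n_steps` is not known in advance; it is fixed here only to keep the sketch finite.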
Figure 1.

Representative tracings of the time course of gastrointestinal temperature (Tgi), dry-bulb temperature (Tdb), and ambient water vapor pressure (Pa) for an original trial (A) with increasing Tdb and a validity trial (B) with increasing Pa. During the original trial, Pa is held constant at 11.7 mmHg and the Tgi inflection point occurs at Tdb = 43.1°C. During the validity trial, Tdb is held constant at 43.1°C and the Tgi inflection point occurs at Pa = 12.1 mmHg.
To assess the reliability and validity of critical environmental limits identified using this protocol, each subject first completed either a Tcrit or Pcrit trial and then returned for 1) a reliability trial in which subjects repeated the same protocol as their first trial, and/or 2) a validity trial in which the critical Pa or Tdb observed during the first trial was held constant while either Tdb or Pa, respectively, was systematically increased to determine if Tgi inflected at the same combination of ambient temperature and humidity. A representative tracing of the time course of Tgi and the environmental conditions for an original Tcrit and its Pcrit validity trial is presented in Fig. 1.
Measurements
Tgi was measured using telemetry capsules (VitalSense, Philips Respironics, Bend, OR) that were ingested 1–2 h before each experiment, in accordance with previous data demonstrating that ingestion times from 1 to 12 h before use do not influence the precision of Tgi data (16). Tgi data were transmitted continuously to a PowerLab data acquisition system and LabChart signal processing software (AD Instruments, Colorado Springs, CO) using an Equivital wireless physiological monitoring system (Equivital Inc., New York, NY).
Oxygen consumption (V̇o2; L/min) and respiratory exchange ratio (RER; unitless) measurements were collected for 5 min at the 30-min and 60-min time points of each trial. The average V̇o2 and RER of the final 3 min of each collection period were used to calculate metabolic heat production (M; W/m2) (17) as follows:
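$$M = \frac{\dot{V}\mathrm{O}_2 \cdot \left[\left(\frac{\mathrm{RER} - 0.7}{0.3}\right) \cdot 21.13 + \left(\frac{1.0 - \mathrm{RER}}{0.3}\right) \cdot 19.62\right]}{60 \cdot A_D} \cdot 1000 \qquad (1)$$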
where AD is Dubois surface area (m2). External work (W; W/m2) was calculated as follows:
$$W = \frac{m_b \cdot 9.81 \cdot v_w \cdot F_g}{60 \cdot A_D} \qquad (2)$$
where mb is body mass (kg), vw is walking velocity (m/min), and Fg is fractional grade of the treadmill (17). Mnet (W·m−2) was then calculated as M – W.
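As a worked illustration of the calculation of Mnet, the sketch below follows the standard partitional calorimetry relations (17). It assumes caloric equivalents of 21.13 and 19.62 kJ per liter of O2 for carbohydrate and fat, gravitational acceleration of 9.81 m/s2, and the DuBois formula for body surface area; the function names and the example inputs are hypothetical and are not data from the present study.

```python
def dubois_area(mass_kg, height_m):
    """DuBois body surface area (m2) from mass (kg) and height (m)."""
    return 0.202 * mass_kg ** 0.425 * height_m ** 0.725


def metabolic_heat_production(vo2_l_min, rer, area_m2):
    """Metabolic heat production M (W/m2) from VO2 (L/min) and RER."""
    e_carb, e_fat = 21.13, 19.62  # assumed kJ per L O2
    # Weight the two caloric equivalents by RER between 0.7 and 1.0.
    kj_per_l = ((rer - 0.7) / 0.3) * e_carb + ((1.0 - rer) / 0.3) * e_fat
    # kJ/min -> J/s (W), then normalize to surface area.
    return vo2_l_min * kj_per_l * 1000.0 / 60.0 / area_m2


def external_work(mass_kg, velocity_m_min, fractional_grade, area_m2):
    """External work rate W (W/m2) for graded treadmill walking."""
    g = 9.81  # m/s2
    # Dividing by 60 converts m/min to m/s.
    return mass_kg * g * velocity_m_min * fractional_grade / 60.0 / area_m2


# Hypothetical subject: 70 kg, 1.72 m, walking at 2.2 mph on a 3% grade.
area = dubois_area(70.0, 1.72)
speed_m_min = 2.2 * 1609.344 / 60.0  # mph -> m/min
m = metabolic_heat_production(0.75, 0.85, area)
w = external_work(70.0, speed_m_min, 0.03, area)
print(f"AD = {area:.2f} m2, M = {m:.0f} W/m2, W = {w:.1f} W/m2, "
      f"Mnet = {m - w:.0f} W/m2")
```

Because the walking speed is expressed in m/min, the division by 60 in the external work term is needed for the result to be in watts.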
Sweat rate was calculated as the difference between pre- and postexercise nude body mass, measured on a scale accurate to ±20 g. Subjects were not allowed to drink fluids during the experimental trial, and no effort was made to account for respiratory losses because they are negligible in the environments of interest.
Statistical Analysis
No a priori power analysis was conducted; however, a post hoc power analysis using a slope of 0.9 from the relation between original and reliability trials suggested that we were adequately powered (Power = 0.99) to detect significant correlations in the present study. In an effort to ensure the most comprehensive and transparent results, reliability and reproducibility were assessed using multiple indices. For reliability trials, intraclass correlation coefficients (ICCs; two-way mixed effects, single measures, and absolute agreement) were used to compare the original and repeat Tdb or Pa at which each critical point was observed (IBM SPSS Statistics v26, Armonk, NY). For validity trials, because the dependent variable of interest was flipped from the original trial to the validity trial (i.e., if the original trial was to determine the Tcrit at a given Pa, the validity trial was to determine the Pcrit at the critical Tdb from the original trial), ICCs were performed for both Tdb and Pa between original and validity trials. In addition, data were converted to wet-bulb temperature (Twb) values and Bland–Altman plots (along with the calculated upper and lower limits, bias, and correlation coefficients) are presented for reliability and validity trials. One-way ANOVA was used to evaluate potential differences in Mnet between trials with Tukey’s multiple-comparison procedure for post hoc pairwise comparisons of interest. Sweat rates for original and repeat trials were compared using paired-samples t tests. No comparison was made for sweat rate for validity trials because the environmental conditions (and, thus, sweat rates across the entirety of the trial) were not directly comparable.
Group data are reported as means ± standard deviation. Significance was set at P < 0.05.
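For readers who wish to reproduce the agreement statistic outside SPSS, the following is a minimal sketch of a single-measures, absolute-agreement ICC computed from two-way ANOVA mean squares, following the commonly used formulation ICC(A,1) = (MSR − MSE) / [MSR + (k − 1)MSE + (k/n)(MSC − MSE)]. It is assumed, not confirmed, that this corresponds to the SPSS settings described above; the function name `icc_a1` and the example values are hypothetical.

```python
def icc_a1(rows):
    """Single-measures, absolute-agreement ICC.

    rows: one list per subject, each holding k repeated measurements
    (e.g., [original, repeat]).
    """
    n = len(rows)       # subjects
    k = len(rows[0])    # trials per subject
    grand = sum(sum(r) for r in rows) / (n * k)
    row_means = [sum(r) / k for r in rows]
    col_means = [sum(r[j] for r in rows) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for r in rows for v in r)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + (k / n) * (ms_cols - ms_err)
    )


# Hypothetical paired critical values (original, repeat), for illustration only.
pairs = [[26.0, 26.4], [28.1, 27.8], [24.9, 25.3], [27.2, 27.0]]
print(f"ICC(A,1) = {icc_a1(pairs):.2f}")
```

Unlike a Pearson correlation, the absolute-agreement form penalizes a systematic offset between visits, which is why it is the more conservative choice for test-retest designs.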
RESULTS
Subject Characteristics
Subject characteristics are presented in Table 1. All subjects were young, healthy adults with body mass indices in the normal range.
Table 1.
Subject characteristics
| Characteristic | Reliability Trials (n = 11), Means ± SD | Validity Trials (n = 10), Means ± SD |
|---|---|---|
| Age, yr | 25 ± 4 | 25 ± 4 |
| Height, m | 1.72 ± 0.12 | 1.71 ± 0.13 |
| BMI, kg/m2 | 23.5 ± 3.5 | 23.7 ± 3.4 |
| V̇o2max, mL·kg−1·min−1 | 50.1 ± 12.2 | 47.7 ± 9.7 |
BMI, body mass index; V̇o2max, maximal oxygen consumption.
Metabolic Heat Production and Sweat Rate
There were no differences in Mnet between original and reliability trials (133 ± 12 vs. 127 ± 16 W·m−2; P = 0.11), nor between original and validity trials (131 ± 13 vs. 135 ± 14 W·m−2; P = 0.41). In addition, there were no differences in sweat rate between original and reliability trials (219 ± 82 vs. 208 ± 67; P = 0.49). Similarly, there were no differences in percent body mass loss between original and reliability trials (1.06 ± 0.46 vs. 0.98 ± 0.44%; P = 0.43), nor between original and validity trials (0.89 ± 0.31 vs. 1.01 ± 0.43%; P = 0.41).
Reliability
Table 2 shows the combination of Tdb and Pa, as well as the corresponding Twb, at the Tgi inflection point for original and reliability trials. There were no differences in critical Twb between original and reliability trials (P = 0.78). The reliability for the determination of Tcrit and Pcrit was excellent (ICC = 0.98). Similarly, the reliability of the critical Twb between original and reliability trials was excellent (ICC = 0.91; Fig. 2).
Table 2.
Psychrometric loci for ambient dry-bulb temperature (Tdb) and water vapor pressure (Pa), with the corresponding wet-bulb temperature (Twb), at the core temperature inflection point for all subjects during original and reliability trials
| Subject | Trial 1 (original): Tdb, °C; Pa, mmHg | Trial 1 (original): Twb, °C | Trial 2 (reliability): Tdb, °C; Pa, mmHg | Trial 2 (reliability): Twb, °C |
|---|---|---|---|---|
| Subj 1 | (38.1, 25.1) | 28.9 | (37.9, 26.1) | 29.3 |
| Subj 2 | (36.0, 24.2) | 28.0 | (35.8, 23.4) | 27.6 |
| Subj 3 | (37.8, 19.3) | 26.1 | (41.1, 20.0) | 27.3 |
| Subj 4 | (38.1, 24.0) | 28.4 | (38.0, 22.0) | 27.5 |
| Subj 5 | (40.5, 19.2) | 26.7 | (40.0, 22.5) | 28.2 |
| Subj 6 | (42.2, 16.2) | 25.7 | (43.8, 16.3) | 26.1 |
| Subj 7 | (35.8, 24.6) | 28.1 | (36.0, 23.0) | 27.4 |
| Subj 8 | (44.7, 11.3) | 23.8 | (45.5, 12.4) | 24.6 |
| Subj 9 | (43.1, 16.0) | 25.8 | (40.6, 16.0) | 25.2 |
| Subj 10 | (47.8, 12.0) | 25.0 | (47.7, 11.3) | 24.6 |
| Subj 11 | (42.6, 12.3) | 23.7 | (41.2, 12.3) | 23.3 |
| Means ± SD | | 26.4 ± 1.8 | | 26.5 ± 1.8 |
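The Twb values in Table 2 can be approximated from each (Tdb, Pa) pair by solving the psychrometer equation numerically. The sketch below is one common way to do this and is not necessarily the conversion used by the authors: it assumes a Tetens-type saturation vapor pressure formula, a ventilated-psychrometer coefficient of 0.000662 per °C, and sea-level barometric pressure, all of which are assumptions introduced here.

```python
import math

MMHG_PER_KPA = 7.50062


def saturation_vapor_pressure_mmhg(temp_c):
    """Saturation vapor pressure over water (mmHg), Tetens-type formula."""
    kpa = 0.61078 * math.exp(17.27 * temp_c / (temp_c + 237.3))
    return kpa * MMHG_PER_KPA


def wet_bulb_temperature(tdb_c, pa_mmhg, pressure_kpa=101.325,
                         psychrometer_coeff=0.000662):
    """Solve Pa = Psat(Twb) - gamma * (Tdb - Twb) for Twb by bisection."""
    gamma = psychrometer_coeff * pressure_kpa * MMHG_PER_KPA  # mmHg per degC

    def residual(twb):
        return (saturation_vapor_pressure_mmhg(twb)
                - gamma * (tdb_c - twb) - pa_mmhg)

    lo, hi = -20.0, tdb_c
    if residual(hi) < 0:
        raise ValueError("Pa exceeds saturation vapor pressure at Tdb")
    # residual() increases monotonically with Twb, so bisection converges.
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Subj 1, Trial 1 in Table 2: Tdb = 38.1 degC, Pa = 25.1 mmHg.
print(f"Twb = {wet_bulb_temperature(38.1, 25.1):.1f} degC")
```

Under these assumptions the result for this row falls close to the tabulated 28.9°C; small discrepancies are expected if a different vapor pressure formula or barometric pressure was used.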
Figure 2.
Bland–Altman plot of critical wet-bulb temperature (Twb) for original and reliability trials (n = 11; 5 men, 6 women). The solid line at y = 0 represents perfect reliability. The dashed lines represent the 95% confidence interval of the bias. ICC, intraclass correlation coefficient.
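The quantities behind a Bland–Altman plot can be computed as in the minimal sketch below. It uses the conventional definition of the limits of agreement (bias ± 1.96 SD of the paired differences); whether the dashed lines in the figure were computed exactly this way is an assumption, and the function name and example values are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(first, second):
    """Return plot coordinates, bias, and limits of agreement for paired data."""
    diffs = [b - a for a, b in zip(first, second)]       # y-axis values
    means = [(a + b) / 2 for a, b in zip(first, second)]  # x-axis values
    bias = mean(diffs)
    sd = stdev(diffs)  # sample SD of the differences
    return {
        "means": means,
        "diffs": diffs,
        "bias": bias,
        "lower": bias - 1.96 * sd,
        "upper": bias + 1.96 * sd,
    }


# Hypothetical critical Twb values (degC) from two visits, for illustration only.
visit_1 = [26.0, 28.1, 24.9, 27.2]
visit_2 = [26.4, 27.8, 25.3, 27.0]
result = bland_altman(visit_1, visit_2)
print(f"bias = {result['bias']:.2f} degC, "
      f"limits = [{result['lower']:.2f}, {result['upper']:.2f}] degC")
```

A bias near zero with narrow limits indicates that repeat visits neither systematically raise nor lower the measured critical Twb.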
Validity
Table 3 shows the combinations of Tdb and Pa, and the corresponding Twb, at the Tgi inflection point for original and validity trials. There were no differences in critical Twb between original and validity trials (P = 0.32). There was excellent agreement in the Tdb (ICC = 0.95) and Pa (ICC = 0.96) at the Tgi inflection point between original and validity trials. Likewise, there was excellent agreement in the critical Twb between original and validity trials (ICC = 0.91; Fig. 3).
Table 3.
Psychrometric loci for ambient dry-bulb temperature (Tdb) and water vapor pressure (Pa), with the corresponding wet-bulb temperature (Twb), at the core temperature inflection point for all subjects during original and validity trials
| Subject | Trial 1 (original): Tdb, °C; Pa, mmHg | Trial 1 (original): Twb, °C | Trial 2 (validity): Tdb, °C; Pa, mmHg | Trial 2 (validity): Twb, °C |
|---|---|---|---|---|
| Subj 1 | (43.1, 11.7) | 23.5 | (43.1, 13.0) | 24.3 |
| Subj 2 | (38.1, 21.0) | 27.0 | (38.2, 20.3) | 26.7 |
| Subj 3 | (37.8, 19.3) | 26.1 | (38.1, 18.8) | 26.0 |
| Subj 4 | (35.9, 24.3) | 28.0 | (34.3, 24.2) | 27.6 |
| Subj 5 | (39.2, 19.8) | 26.7 | (38.3, 22.4) | 27.9 |
| Subj 6 | (42.2, 16.2) | 25.7 | (42.2, 17.7) | 26.5 |
| Subj 7 | (35.8, 24.6) | 28.1 | (36.8, 25.4) | 28.7 |
| Subj 8 | (40.5, 23.5) | 28.8 | (40.5, 23.0) | 28.5 |
| Subj 9 | (43.1, 16.0) | 25.8 | (42.9, 14.9) | 25.2 |
| Subj 10 | (39.9, 20.7) | 27.3 | (42.0, 21.0) | 28.0 |
| Means ± SD | | 26.7 ± 1.5 | | 26.9 ± 1.5 |
Figure 3.
Bland–Altman plot of critical wet-bulb temperature (Twb) for original and validity trials (n = 10; 4 men, 6 women). The solid line at y = 0 represents perfect reliability. The dashed lines represent the 95% confidence interval of the bias. ICC, intraclass correlation coefficient.
DISCUSSION
PSU HEAT is an ongoing project aimed at determining the critical environmental conditions above which human heat balance cannot be maintained. The objective of the present study was to determine the between-visit reliability and the validity of the experimental paradigm used. Our data demonstrated a high degree of reliability for Mnet and sweat rate, as well as for the Twb, Tdb, and Pa at which the Tgi inflection point occurred. In addition, the protocol yielded critical environmental loci that were not significantly different regardless of whether Tdb or Pa was systematically increased. As such, these findings provide evidence of the validity, rigor, and reproducibility of this experimental approach.
Reliability of this progressive heat stress protocol has been previously reported (r = 0.97), although that study included only a small subgroup of individuals who completed reliability trials (4). Given the importance of robust experimental design that provides reliable data, providing evidence for experimental rigor and reproducibility is critical. Indeed, the National Institutes of Health have placed a strong emphasis on rigor and reproducibility in recent years (18). The reliability data in the current study confirm the previous findings from our laboratory (4) that this experimental paradigm is highly reliable. Similarly, the validity trials provide further evidence supporting the use of this experimental approach to establishing thermal environments that place various populations at risk of heat-related morbidity and mortality.
LIMITATIONS
Only young, healthy subjects were tested in the present study. However, comparisons were conducted within subjects, so it is unlikely that age would affect reliability and validity. In addition, this progressive heat stress protocol has been used to identify critical environmental limits for a range of exercise intensities. The present study only reported reliability and validity for physical activity approximating light activities of daily living. However, our findings were similar to those previously reported for an exercise intensity of 30% V̇o2max (4). Thus, we do not expect that exercise intensity is likely to influence the reliability and validity of this experimental approach.
Conclusions
The present study demonstrates excellent reliability of the progressive heat stress protocol used to identify critical environmental limits for the maintenance of heat balance in humans. In addition, these data suggest that uncompensable heat stress occurs at the same combination of ambient temperature and humidity (i.e., psychrometric loci), regardless of which environmental variable is being experimentally manipulated. Together, these data provide strong evidence that this experimental paradigm is rigorous, reliable, and valid, supporting its use to investigate critical environmental limits to the maintenance of heat balance.
GRANTS
This work was supported by the National Institutes of Health (NIH) Grant R01 AG067471 (to W.L.K.) and National Institute on Aging (NIA) Grant T32 AG049676 (to D.J.V.).
DISCLOSURES
No conflicts of interest, financial or otherwise, are declared by the authors.
AUTHOR CONTRIBUTIONS
R.M.C., S.T.W., and W.L.K. conceived and designed research; R.M.C., S.T.W., and Z.S.L. performed experiments; R.M.C. analyzed data; R.M.C., S.T.W., Z.S.L., and W.L.K. interpreted results of experiments; R.M.C. prepared figures; R.M.C. drafted manuscript; R.M.C., S.T.W., Z.S.L., and W.L.K. edited and revised manuscript; R.M.C., S.T.W., Z.S.L., and W.L.K. approved final version of manuscript.
ACKNOWLEDGMENTS
The authors are thankful for the subjects’ participation and for the assistance of Susan Slimak, RN.
REFERENCES
- 1. Lind AR. Physiological effects of continuous or intermittent work in the heat. J Appl Physiol (1985) 18: 57–60, 1963. doi: 10.1152/jappl.1963.18.1.57.
- 2. Lind AR. Prediction of safe limits for prolonged occupational exposure to heat. Fed Proc 32: 1602–1606, 1973.
- 3. Belding HS, Kamon E. Evaporative coefficients for prediction of safe limits in prolonged exposures to work under hot conditions. Fed Proc 32: 1598–1601, 1973.
- 4. Kenney WL, Zeman MJ. Psychrometric limits and critical evaporative coefficients for unacclimated men and women. J Appl Physiol (1985) 92: 2256–2263, 2002. doi: 10.1152/japplphysiol.01040.2001.
- 5. Kenney WL, Hyde DE, Bernard TE. Physiological evaluation of liquid-barrier, vapor-permeable protective clothing ensembles for work in hot environments. Am Ind Hyg Assoc J 54: 397–402, 1993. doi: 10.1080/15298669391354865.
- 6. Kenney WL, Lewis DA, Armstrong CG, Hyde DE, Dyksterhouse TS, Fowler SR, Williams DA. Psychrometric limits to prolonged work in protective clothing ensembles. Am Ind Hyg Assoc J 49: 390–395, 1988. doi: 10.1080/15298668891379954.
- 7. Kulka TJ, Kenney WL. Heat balance limits in football uniforms: how different uniform ensembles alter the equation. Phys Sportsmed 30: 29–39, 2002. doi: 10.3810/psm.2002.07.377.
- 8. Kamon E, Avellini B. Physiologic limits to work in the heat and evaporative coefficient for women. J Appl Physiol (1985) 41: 71–76, 1976. doi: 10.1152/jappl.1976.41.1.71.
- 9. Kenney WL. Psychrometric limits and critical evaporative coefficients for exercising older women. J Appl Physiol (1985) 129: 263–271, 2020. doi: 10.1152/japplphysiol.00345.2020.
- 10. Kenney WL, Mikita DJ, Havenith G, Puhl SM, Crosby P. Simultaneous derivation of clothing-specific heat exchange coefficients. Med Sci Sports Exerc 25: 283–289, 1993.
- 11. Collins F, Tabak LA. Policy: NIH plans to enhance reproducibility. Nature 505: 612–613, 2014. doi: 10.1038/505612a.
- 12. Hewitt JA, Brown LL, Murphy SJ, Grieder F, Silberberg SD. Accelerating biomedical discoveries through rigor and transparency. ILAR J 58: 115–128, 2017. doi: 10.1093/ilar/ilx011.
- 13. Wolf ST, Cottle RM, Vecellio DJ, Kenney WL. Critical environmental limits for young, healthy adults (PSU HEAT Project). J Appl Physiol (1985). doi: 10.1152/japplphysiol.00737.2021.
- 14. Kenefick RW, Cheuvront SN. Hydration for recreational sport and physical activity. Nutr Rev 70: S137–S142, 2012. doi: 10.1111/j.1753-4887.2012.00523.x.
- 15. Kamon E, Avellini B, Krajewski J. Physiological and biophysical limits to work in the heat for clothed men and women. J Appl Physiol Respir Environ Exerc Physiol 44: 918–925, 1978. doi: 10.1152/jappl.1978.44.6.918.
- 16. D'Souza AW, Notley SR, Meade RD, Rutherford MM, Kenny GP. The influence of ingestion time on the validity of gastrointestinal pill temperature as an index of body core temperature during work in the heat. FASEB J 33: 842.7, 2019. doi: 10.1096/fasebj.2019.33.1_supplement.842.7.
- 17. Cramer MN, Jay O. Partitional calorimetry. J Appl Physiol (1985) 126: 267–277, 2019. doi: 10.1152/japplphysiol.00191.2018.
- 18. DeSoto K. NIH-wide policy doubles down on scientific rigor and reproducibility. https://www.psychologicalscience.org/observer/nih-wide-policy-doubles [2016 Nov 30].


