Abstract
Background
Military personnel and first responders (police and firefighters) often carry large amounts of gear. This increased load can negatively affect posture and lead to back pain. The ability to quantitatively measure muscle thickness under loading would be valuable to clinicians to assess the effectiveness of core stabilization treatment programs and could aid in return to work decisions. Ultrasound imaging (USI) has the potential to provide such a measure, but to be useful it must be reliable.
Purpose
To assess the intrarater and interrater reliability of measurements of transversus abdominis (TrA) and internal oblique (IO) muscle thickness conducted by novice examiners using USI in supine, standing, and with an axial load.
Study Design
Prospective, test‐retest study
Methods
Healthy, active duty military personnel (N=33) were examined by two physical therapy doctoral students (primary and secondary ultrasound technicians) without prior experience in USI. Thickness measurements of the TrA and IO muscles were performed at rest and during a contraction to preferentially activate the TrA in three positions (hook‐lying, standing, and standing with body armor). Percent thickness changes and intraclass correlation coefficients (ICC) were calculated.
Results
Using the mean of three measurements for each of the three positions in resting and contracted muscle states, the intrarater ICC (3,3) values ranged from 0.90 to 0.98. The interrater ICC (2,1) values ranged from 0.39 to 0.79. The ICC values of percent thickness changes were lower than the individual ICC values for all positions and muscle states.
Conclusion
There is excellent intrarater reliability of novice ultrasound technicians measuring abdominal muscle thickness using USI in three positions during the resting and contracted muscle states. However, interrater reliability of two novice technicians was poor to fair, so additional training and experience may be necessary to improve reliability.
Level of Evidence
2b
Keywords: Ultrasonography, Transversus Abdominis, Body Armor
INTRODUCTION
Deployed military personnel often carry more than 100 pounds of gear and equipment while on foot patrols.1 Body armor comprises a significant portion of this load, with the ballistic vest weighing over 20 pounds.2 Body armor is standard military issued equipment that is vital to the safety of military personnel; however, the added load may cause increased pain and disability. Spinal posture changes significantly during weighted conditions and increased forces are imparted on the lumbosacral spine.3,4 Postural adaptations in response to additional loading, such as increased trunk flexion and forward head posture, are contributing factors to back pain.5,6 Konitzer et al. found a positive correlation between increased musculoskeletal pain and wearing body armor for more than four hours each day.6 Additionally, many soldiers attributed their back pain to wearing body armor rather than to specific job‐related tasks or physical training with their units.6
Back pain is a significant concern in the military because it causes attrition from units during deployment or training and increases medical costs. Back pain is notoriously difficult to treat, with poor success rates, and often becomes a chronic condition.7 Generally, treatment interventions for low back pain are varied and often have mixed results. However, there is evidence to suggest that the recurrence of back pain can be reduced with core stabilization exercise programs targeting the multifidi and transversus abdominis (TrA) muscles.8‐12 Improving core strength has been shown to reduce injuries among firefighters by 42% and reduce lost time from injuries by 62%.13 The ability to quantitatively measure muscle thickness under loading would be valuable to clinicians to assess the effectiveness of core stabilization treatment programs and could aid in return to work decisions, as abdominal muscle thickness has been shown to correlate with strength.14 Ultrasound imaging (USI) has the potential to provide such a measure, but to be useful it must be both valid and reliable.
USI is a quick, inexpensive tool that has been shown to be valid for measuring muscle thickness during most isometric, submaximal muscle contractions.15 It has also been established as reliable for measuring the thickness of abdominal muscles in healthy16,17 and unhealthy adults.18‐20 A study by Teyhen et al. of novice US technicians with 20 hours of training demonstrated good (ICC > 0.8) or excellent (ICC > 0.9) interrater and intrarater reliability for imaging and measuring the thickness of the TrA, internal oblique (IO), rectus abdominis, and lumbar multifidus muscles.21 USI has been used to measure the thickness of the abdominal muscles with the subject in multiple positions22 or with the muscles under load.23 However, to the authors' knowledge, no studies have assessed the reliability of USI in individuals standing while wearing body armor or the reliability of technicians with very minimal training (less than five hours). As the vast majority of physical therapists do not learn or routinely perform USI, the training described in this study is relevant to practicing physical therapists. Therefore, the primary purpose of this study was to assess the intrarater and interrater reliability of measurements of TrA and IO muscle thickness conducted by novice examiners using USI in supine, standing, and with an axial load.
METHODS
Participants
Thirty‐six active duty volunteers aged 18 to 40 were enrolled in this study by responding to fliers in the U.S. Army Medical Department (AMEDD) Center and School at Fort Sam Houston, TX. Participants were included if they were asymptomatic, without history of peripheral neuropathy or any condition that affected standing balance. Exclusion criteria included inability to ambulate or use of an assistive device during ambulation and inability to perform core stability exercises. During the initial appointment, participants were screened by a physical therapist using a brief clinical examination to rule out low back pain. The examination included lumbar range of motion and bilateral quadrant test to ensure pain‐free, active motion within a normal physiological range and with provocation (positioning into combination extension and rotation). Participants signed consent forms pre‐approved by the Brooke Army Medical Center Institutional Review Board (protocol C.2011.170).
Examiners
Two physical therapy students attending the U.S. Army‐Baylor University Doctoral Program in Physical Therapy at the AMEDD Center and School were the ultrasound technicians for the reliability study. Neither examiner had previously utilized USI in clinical practice or in research, and both examiners performed approximately 3‐4 hours of hands‐on training with the help of an experienced instructor who was familiar with the specific USI machine and protocol used in this study.
Procedures
A prospective, test‐retest study design was used as part of a larger randomized controlled trial. Testing included USI of abdominal muscle thickness in hook‐lying and standing with and without body armor. Testing of the three positions was standardized in this order of increasing task difficulty to avoid confounding related to fatigue of the abdominal muscles. Images of the TrA and IO were taken with a SonoSite M‐Turbo ultrasound machine (SonoSite, Inc., Bothell, WA) in B‐mode (brightness mode) with a 60 mm 2‐5 MHz curvilinear array. Imaging for each of the three positions (hook‐lying, standing, and standing with body armor) was performed three times by each examiner. This was done in order to average the results of three consecutive trials for each condition which has been shown to optimize intrarater reliability.24 While one examiner positioned the transducer, the other saved the image (Figure 1). To avoid an order effect related to fatigue or learning, the order of the imaging examiner was counterbalanced.
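The benefit of averaging three trials can be illustrated with the Spearman-Brown relation, a standard psychometric formula that is not part of the study's stated analysis. The sketch below is an illustration only: the function name and the single-trial ICC values are hypothetical and are not taken from the study data.

```python
def spearman_brown(single_trial_icc: float, k: int) -> float:
    """Reliability of the mean of k trials, given the single-trial reliability.

    Standard Spearman-Brown relation; assumes the trials are interchangeable.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return k * single_trial_icc / (1 + (k - 1) * single_trial_icc)


# Hypothetical single-trial ICCs, chosen only for illustration.
for icc in (0.60, 0.80):
    print(f"single-trial ICC {icc:.2f} -> mean of 3 trials {spearman_brown(icc, 3):.3f}")
```

Under these assumptions, a single-trial ICC of 0.80 rises to roughly 0.92 when three trials are averaged, with diminishing returns for each additional trial.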
During the imaging, the US technician cued the subject using one of three methods in order to preferentially activate the TrA: the abdominal drawing in maneuver (ADIM), cutting off the flow of urine, or closing the anal sphincter. The method that best activated the TrA in each subject was used at both sessions. Based on the US images, the US technician determined when the participant correctly performed preferential TrA muscle activation using one of the previously described methods. In order to measure the muscles consistently, the anterior and lateral fascia of the TrA was aligned with the edge of the screen for each measurement (Figure 2). The subject's left side was imaged just superior to the iliac crest in order to standardize data collection (Figure 3).
In the hook‐lying position, participants were supine with their knees bent and feet flat to minimize lordosis. In standing, participants lined up the base of each of their fifth metatarsals inside a 30.48 cm tile on the floor. The body armor used was the same model worn by service members currently engaged in combat operations (Point Blank Enterprises, Inc., Pompano Beach, FL). The manufacturer modified the body armor under the direction of the research team in order to allow access to the abdomen for US imaging while maintaining the structural integrity and weight distribution caused by the Ballistic Panels, Small Arms Protective Inserts, and Enhanced Small Arms Protective Inserts. All inserts were training grade but still possessed the same weight and size as the combat‐grade inserts.
The same novice examiners who had conducted the US imaging measured all images independently, on a date separate from the date of image capture. Both examiners were blinded to the subject's group, to each other's measurements, and to their own previous measurements. ImageJ software (V1.38t, National Institutes of Health, Bethesda, MD) was utilized to measure TrA and IO thickness midway between the medial and lateral borders on the screen (Figure 2).
Statistical Methods
Data entry and statistical analyses were performed using IBM SPSS Statistics 20 (Chicago, IL). Data from 33 participants were obtained for three measurement conditions (hook‐lying, standing, standing with body armor) with three trials for each condition in resting and contracted states. The percent change in thickness for each muscle was determined based on the average of three trials for each position according to the equation below and multiplied by 100%.
Preferential Activation Ratio,25 where t is the thickness:
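The equations are not reproduced in this text. As a sketch only, assuming the conventional definitions, percent change is taken relative to resting thickness, and the preferential activation ratio follows the verbal description given here, with total thickness assumed to be the sum of the TrA, IO, and external oblique (EO). The subscripts and the symbol for total thickness are notation introduced for illustration.

```latex
\[
\%\,\text{change} = \frac{t_{\text{contracted}} - t_{\text{rest}}}{t_{\text{rest}}} \times 100\%
\]
\[
\text{Preferential Activation Ratio} =
\frac{t_{\text{TrA, contracted}}}{t_{\text{TrA+IO+EO, contracted}}}
- \frac{t_{\text{TrA, rest}}}{t_{\text{TrA+IO+EO, rest}}}
\]
```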
This equation calculates the change in the proportion of total lateral abdominal muscle thickness accounted for by the TrA, with higher values indicating more change in TrA thickness and lower values indicating more change in IO and external oblique thickness.25 ICCs with 95% confidence intervals (CI) were calculated for intrarater reliability (model 3,3) and interrater reliability (model 2,1). The standard error of measurement (SEM) was calculated using the formula SEM = SD × √(1 − ICC). Minimal detectable change (MDC), the smallest change in thickness that represents a true change at the 95% confidence level, was calculated using the formula MDC = 1.96 × SEM × √2.
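The two ICC models named above can be sketched from a two-way analysis of variance using the standard Shrout and Fleiss formulas. This is not the SPSS procedure used in the study; the function name is hypothetical, and the data are the small six-subject, four-rater textbook example from Shrout and Fleiss, not study measurements.

```python
def icc_two_way(scores):
    """Return (ICC(2,1), ICC(3,k)) for an n-subjects by k-raters table.

    ICC(2,1): two-way random effects, absolute agreement, single measure.
    ICC(3,k): two-way mixed effects, consistency, mean of k measures.
    """
    n = len(scores)
    k = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for subjects (rows), raters (columns), and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    icc_2_1 = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    icc_3_k = (ms_rows - ms_error) / ms_rows
    return icc_2_1, icc_3_k


# Illustrative data only: rows are subjects, columns are raters.
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc_2_1, icc_3_k = icc_two_way(example)
print(round(icc_2_1, 2), round(icc_3_k, 2))  # roughly 0.29 and 0.91
```

The example shows how the same data can yield a high averaged-measure consistency ICC and a low single-measure agreement ICC when raters differ systematically, since ICC(2,1) penalizes between-rater offsets and ICC(3,k) does not.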
RESULTS
A total of 36 military service members were enrolled and 33 (17 men, 16 women; average age 28 ± 4.9 years) completed the study. All participants successfully completed TrA muscle contractions by one of the three cuing methods, and the resting and contracted images were captured and measured.
With respect to intrarater reliability, ICC values for novice ultrasound technician 1 ranged from 0.90 to 0.98 for the resting and contracted measurements of the TrA and IO in hook‐lying, standing, and standing with body armor. Similar values were found for the second novice technician and are therefore not included in the tables. The standard error of measurement ranged from 0.03 to 0.07 cm. The ICC values for percent activation were predictably lower, ranging from 0.59 to 0.83, because this value is computed from two measurements and compounds the error of both. These values are summarized in Table 1. Interrater reliability ICCs for the thickness measurements ranged from 0.39 to 0.79 with SEM values from 0.07 to 0.17 cm; these values are summarized in Table 2.
TABLE 1.
Measurement | ICC3,3 (95% CI) | Mean ± SD (cm*) | SEM (cm*) | MDC (cm*) |
---|---|---|---|---|
TrA in Hook‐lying | ||||
Rest | 0.92 (0.86‐0.96) | 0.28 ± 0.09 | 0.03 | 0.35 |
Contracted | 0.94 (0.90‐0.97) | 0.48 ± 0.14 | 0.03 | 0.57 |
% Activation | 0.83 (0.79‐0.94) | 77.7 ± 46.8 | 19.1 | 130.7 |
TrA in Standing | ||||
Rest | 0.97 (0.95‐0.99) | 0.44 ± 0.19 | 0.03 | 0.53 |
Contracted | 0.97 (0.95‐0.98) | 0.64 ± 0.26 | 0.04 | 0.76 |
% Activation | 0.81 (0.66‐0.90) | 50.2 ± 34.0 | 15.0 | 91.8 |
TrA in Body Armor | ||||
Rest | 0.94 (0.90‐0.97) | 0.40 ± 0.17 | 0.04 | 0.51 |
Contracted | 0.90 (0.82‐0.94) | 0.60 ± 0.22 | 0.07 | 0.79 |
% Activation | 0.76 (0.59‐0.87) | 61.4 ± 54.1 | 26.3 | 134.3 |
IO in Hook‐lying | ||||
Rest | 0.98 (0.96‐0.99) | 0.65 ± 0.20 | 0.03 | 0.73 |
Contracted | 0.98 (0.96‐0.99) | 0.74 ± 0.23 | 0.04 | 0.84 |
% Activation | 0.76 (0.57‐0.87) | 15.8 ± 17.2 | 8.4 | 39.2 |
IO in Standing | ||||
Rest | 0.94 (0.89‐0.97) | 0.73 ± 0.24 | 0.06 | 0.90 |
Contracted | 0.95 (0.91‐0.97) | 0.85 ± 0.31 | 0.07 | 1.04 |
% Activation | 0.82 (0.68‐0.90) | 18.4 ± 25.1 | 10.8 | 48.3 |
IO in Body Armor | ||||
Rest | 0.96 (0.94‐0.98) | 0.77 ± 0.28 | 0.05 | 0.92 |
Contracted | 0.97 (0.94‐0.98) | 0.93 ± 0.35 | 0.06 | 1.10 |
% Activation | 0.59 (0.29‐0.78) | 21.2 ± 20.5 | 13.0 | 57.4 |
*Values in centimeters except % Activation
TrA = Transversus abdominis, IO = Internal oblique
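As a worked check of the SEM and MDC formulas given in Statistical Methods, the sketch below applies them to the Table 1 row for the TrA at rest in hook‐lying (SD 0.09 cm, ICC 0.92). The function names are hypothetical, and the calculation is an illustration, not the authors' SPSS output.

```python
import math


def sem(sd: float, icc: float) -> float:
    """Standard error of measurement: SD x sqrt(1 - ICC)."""
    return sd * math.sqrt(1 - icc)


def mdc95(sem_value: float) -> float:
    """Minimal detectable change at 95% confidence: 1.96 x SEM x sqrt(2)."""
    return 1.96 * sem_value * math.sqrt(2)


# Table 1, TrA in hook-lying at rest: SD = 0.09 cm, ICC = 0.92.
sem_rest = sem(0.09, 0.92)
mdc_rest = mdc95(sem_rest)
print(f"SEM = {sem_rest:.3f} cm, MDC = {mdc_rest:.3f} cm")
```

The SEM of about 0.03 cm agrees with the tabulated value. The computed MDC is about 0.07 cm; the MDC column of Tables 1 and 2 appears to report the mean plus this amount (for this row, 0.28 + 0.07 = 0.35), although that reading is an inference from the numbers rather than something the text states.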
TABLE 2.
Measurement | ICC2,1 (95% CI) | Mean ± SD (cm*) | SEM (cm*) | MDC (cm*) |
---|---|---|---|---|
TrA in Hook‐lying | ||||
Rest | 0.39 (0.08‐0.63) | 0.30 ± 0.09 | 0.07 | 0.49 |
Contracted | 0.62 (0.37‐0.79) | 0.49 ± 0.13 | 0.08 | 0.71 |
% Activation | 0.25 (‐0.07‐0.53) | 70.5 ± 36.2 | 31.3 | 157.4 |
TrA in Standing | ||||
Rest | 0.72 (0.50‐0.85) | 0.43 ± 0.17 | 0.09 | 0.69 |
Contracted | 0.46 (0.16‐0.69) | 0.60 ± 0.21 | 0.16 | 1.04 |
% Activation | 0.45 (0.14‐0.69) | 44.5 ± 34.0 | 25.1 | 114.0 |
TrA in Body Armor | ||||
Rest | 0.69 (0.45‐0.83) | 0.40 ± 0.16 | 0.09 | 0.64 |
Contracted | 0.66 (0.41‐0.81) | 0.61 ± 0.22 | 0.13 | 0.97 |
% Activation | 0.51 (0.21‐0.72) | 57.5 ± 39.3 | 27.6 | 134.1 |
IO in Hook‐lying | ||||
Rest | 0.77 (0.59‐0.88) | 0.66 ± 0.18 | 0.08 | 0.89 |
Contracted | 0.79 (0.61‐0.89) | 0.76 ± 0.21 | 0.10 | 1.03 |
% Activation | 0.52 (0.22‐0.73) | 15.8 ± 13.5 | 9.4 | 41.7 |
IO in Standing | ||||
Rest | 0.78 (0.58‐0.89) | 0.76 ± 0.24 | 0.11 | 1.07 |
Contracted | 0.76 (0.53‐0.88) | 0.90 ± 0.30 | 0.15 | 1.30 |
% Activation | 0.53 (0.24‐0.74) | 18.1 ± 19.7 | 13.5 | 55.5 |
IO in Body Armor | ||||
Rest | 0.64 (0.39‐0.80) | 0.76 ± 0.24 | 0.15 | 1.16 |
Contracted | 0.76 (0.57‐0.87) | 0.95 ± 0.33 | 0.17 | 1.41 |
% Activation | 0.61 (0.25‐0.80) | 26.2 ± 20.8 | 13.0 | 62.2 |
*Values in centimeters except % Activation
TrA = Transversus abdominis, IO = Internal oblique
DISCUSSION
This study assessed the intrarater and interrater reliability of two novice ultrasound technicians' ability to obtain values of muscular thickness for the TrA and IO. ICC values above 0.75 are considered to indicate excellent reliability,26 and all intrarater ICCs for the thickness measurements exceeded this threshold. While ICC values in this study for intrarater reliability are consistent with previous studies using USI of the abdominal muscles, the interrater reliability in this study was generally poor to fair (0.39‐0.79) through all three positions in either resting or contracted states. Failure to establish quantitative guidelines for determining when the participant correctly performed preferential TrA muscle activation may be one reason for the poor to fair interrater reliability. The technicians' bias in selecting one of the three cueing methods could have introduced systematic error into measurement of TrA and IO thickness. A systematic review of the reliability of USI of lumbar trunk and abdominal muscles found excellent (ICC > 0.93) intrarater and interrater reliability for intraimage measurements.16 Interrater reliability for interimage measurements was also excellent (ICC > 0.90).16 Intrarater reliability for interimage measurements, however, was more variable (ICC 0.62‐0.97).16 The authors of the systematic review also noted that greater measurement error was reported in measures obtained by novice examiners.16
To the authors' knowledge, this study was the first using novice raters with limited (less than five hours) ultrasound training and experience. A previous study conducted by Teyhen et al.21 used novice technicians consisting of two physical therapists and four physical therapy doctoral students who were instructed in 20 hours of US training and were evaluated for proficiency prior to data collection. Novice technicians with minimal training may still have good intrarater reliability to capture abdominal muscle measurements using USI with a TrA‐cuing technique but are less likely to be standardized with more experienced technicians or therapists. Based on the results from this study, a standardized method with formal training for abdominal US measurement may be necessary to have interrater reliability values approaching those of experienced technicians.
Few studies have assessed the reliability of percent thickness change from resting to contracted states of the abdominal muscles.20,23,24 Understandably, the intrarater ICC for percent thickness change was considerably lower (0.59‐0.83), because the calculation combines two measurements and compounds the error of each. Consequently, determining a patient's percent thickness change may have less clinical benefit when performed by a novice US technician because of its decreased reliability.
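The compounding of error in percent thickness change can be illustrated with a first-order error propagation. The sketch below assumes that the errors in the resting and contracted measurements are independent, which the study does not state, and uses the intrarater means and SEMs for the TrA in hook‐lying from Table 1. The function name is hypothetical, and this is not the authors' analysis.

```python
import math


def percent_change_sem(rest, contracted, sem_rest, sem_contracted):
    """Approximate SEM of percent change, 100 * (contracted - rest) / rest.

    First-order propagation for a ratio, assuming independent errors:
    the relative errors of the two measurements add in quadrature.
    """
    ratio = contracted / rest
    relative_error = math.sqrt(
        (sem_contracted / contracted) ** 2 + (sem_rest / rest) ** 2
    )
    return 100 * ratio * relative_error


# Table 1, TrA in hook-lying: means 0.28 and 0.48 cm, SEM 0.03 cm for each.
approx_sem = percent_change_sem(0.28, 0.48, 0.03, 0.03)
print(f"approximate SEM of percent change: {approx_sem:.1f} percentage points")
```

Under these assumptions, measurement errors of 0.03 cm translate into an error of about 21 percentage points in percent change, the same order as the 19.1 reported in Table 1. The small resting thickness in the denominator is what amplifies the error.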
Only one previous study assessed the percent thickness change for a subject under a load.23 Reliability under a load may have been affected by the ability to maintain the same contact pressure of the transducer on the abdomen as used in the hook‐lying and standing positions. This contact pressure could have affected how flattened the acquired image of the abdominal muscles was. Our novice examiners found it easier to maintain constant contact pressure in hook‐lying than in standing conditions. Acquiring measurements in standing was also more difficult because the technician had to physically maneuver around the body armor to access the abdomen just above the iliac crest. Although the body armor had been modified to allow access, it was difficult to maintain the transducer in a position perpendicular to the skin because of the limited space afforded between the body armor and the subject's body depending on their shape.
Study Limitations
All study participants were within a healthy body mass index (BMI) range and a relatively narrow age range since all were active duty military. Increased superficial soft tissue (i.e., adipose) would require greater US depth, which reduces the scale at which the muscles are displayed and could decrease accuracy and reliability. This study is also limited since it did not evaluate the reliability of measurements in symptomatic individuals. Therefore, the results cannot be generalized to the injured or diseased population, especially as it relates to measuring the ability to preferentially contract the TrA. Future studies should validate these findings in similar healthy, as well as patient, populations. Lastly, the use of only two examiners is a limitation of the study. Future studies with more examiners are encouraged to validate these findings with other examiners and in other settings.
CONCLUSIONS
Measurement of TrA and IO thickness using USI in asymptomatic individuals, based on the average of three measurements, is highly reliable within a single technician. Reliability between multiple technicians with minimal USI experience is low, so additional training and experience may be necessary to improve interrater reliability.
REFERENCES
- 1. Attwells RL, Birrell SA, Hooper RH, Mansfield NJ. Influence of carrying heavy loads on soldiers' posture, movements and gait. Ergonomics. 2006;49(14):1527‐1537.
- 2. Erwin SI. Army Has Few Options to Lessen Weight of Body Armor‐The Army is considering buying a lighter and comfier vest used by US Special Operations Command. National Defense. 2009(671):36.
- 3. Goh JH, Thambyah A, Bose K. Effects of varying backpack loads on peak forces in the lumbosacral spine during walking. Clin Biomech. 1998;13(1):S26‐S31.
- 4. Vacheron JJ, Poumarat G, Chandezon R, Vanneuville G. Changes of contour of the spine caused by load carrying. Surg Radiol Anat. 1999;21(2):109‐113.
- 5. Berton H. Weight of War: Military struggles to lighten soldiers' load. The Seattle Times 2011.
- 6. Konitzer LN, Fargo MV, Brininger TL, Reed ML. Association between back, neck, and upper extremity musculoskeletal pain and the individual body armor. J Hand Ther. 2008;21(2):143‐149.
- 7. Cohen SP, Gallagher RM, Davis SA, Griffith SR, Carragee EJ. Spine‐area pain in military personnel: a review of epidemiology, etiology, diagnosis, and treatment. The Spine Journal. 2012;12(9):833‐842.
- 8. Vibe Fersum K, O'Sullivan P, Skouen J, Smith A, Kvåle A. Efficacy of classification‐based cognitive functional therapy in patients with non‐specific chronic low back pain: A randomized controlled trial. Eur J Pain. 2013;17(6):916‐928.
- 9. O'Sullivan PB, Phyty GDM, Twomey LT, Allison GT. Evaluation of specific stabilizing exercise in the treatment of chronic low back pain with radiologic diagnosis of spondylolysis or spondylolisthesis. Spine. 1997;22(24):2959‐2967.
- 10. Hides JA, Jull GA, Richardson CA. Long‐term effects of specific stabilizing exercises for first‐episode low back pain. Spine. 2001;26(11):E243‐E248.
- 11. Koumantakis GA, Watson PJ, Oldham JA. Trunk muscle stabilization training plus general exercise versus general exercise only: randomized controlled trial of patients with recurrent low back pain. Phys Ther. 2005;85(3):209‐225.
- 12. Saner J, Kool J, Sieben JM, Luomajoki H, Bastiaenen CH, de Bie RA. A tailored exercise program versus general exercise for a subgroup of patients with low back pain and movement control impairment: A randomised controlled trial with one‐year follow‐up. Man Ther. 2015, in press.
- 13. Peate W, Bates G, Lunda K, Francis S, Bellamy K. Core strength: a new model for injury prediction and prevention. J Occup Med Toxicol. 2007;2(3):1‐9.
- 14. Noguchi T, Demura S. Relationship between Abdominal Strength Measured by a Newly Developed Device and Abdominal Muscle Thickness. Advances in Physical Education. 2014;4(2):70‐76.
- 15. Koppenhaver SL, Hebert JJ, Parent EC, Fritz JM. Rehabilitative ultrasound imaging is a valid measure of trunk muscle size and activation during most isometric sub‐maximal contractions: a systematic review. Aus J Physiother. 2009;55(3):153‐169.
- 16. Hebert JJ, Koppenhaver SL, Parent EC, Fritz JM. A systematic review of the reliability of rehabilitative ultrasound imaging for the quantitative assessment of the abdominal and lumbar trunk muscles. Spine. 2009;34(23):E848‐E856.
- 17. McPherson SL, Watson T. Reproducibility of ultrasound measurement of transversus abdominis during loaded, functional tasks in asymptomatic young adults. PM&R. 2012;4(6):402‐412.
- 18. doo Park S. Reliability of ultrasound imaging of the transversus deep abdominial, internal oblique and external oblique muscles of patients with low back pain performing the drawing‐in maneuver. J Phys Ther Sci. 2013;25(7):845‐847.
- 19. Koppenhaver SL, Hebert JJ, Fritz JM, Parent EC, Teyhen DS, Magel JS. Reliability of rehabilitative ultrasound imaging of the transversus abdominis and lumbar multifidus muscles. Arch Phys Med Rehab. 2009;90(1):87‐94.
- 20. Yang HS, Yoo JW, Lee BA, Choi CK, You JH. Inter‐tester and intra‐tester reliability of ultrasound imaging measurements of abdominal muscles in adolescents with and without idiopathic scoliosis: a case‐controlled study. Bio‐medical Materials and Engineering. 2013;24(1):453‐458.
- 21. Teyhen DS, George SZ, Dugan JL, Williamson J, Neilson BD, Childs JD. Inter‐rater reliability of ultrasound imaging of the trunk musculature among novice raters. J Ultras Med. 2011;30(3):347‐356.
- 22. Larivière C, Gagnon D, De Oliveira E, Henry SM, Mecheri H, Dumas J‐P. Reliability of ultrasound measures of the transversus abdominis: Effect of task and transducer position. PM&R. 2013;5(2):104‐113.
- 23. Watson T, McPherson S, Fleeman S. Ultrasound measurement of transversus abdominis during loaded, functional tasks in asymptomatic individuals: Rater reliability. PM&R. 2011;3(8):697‐705.
- 24. Koppenhaver SL, Parent EC, Teyhen DS, Hebert JJ, Fritz JM. The effect of averaging multiple trials on measurement error during ultrasound imaging of transversus abdominis and lumbar multifidus muscles in individuals with low back pain. J Orthop Sport Phys. 2009;39(8):604‐611.
- 25. Teyhen DS, Miltenberger CE, Deiters HM, et al. The use of ultrasound imaging of the abdominal drawing‐in maneuver in subjects with low back pain. J Orthop Sport Phys. 2005;35(6):346‐355.
- 26. Fleiss J. The Design and Analysis of Clinical Experiments. New York: John Wiley & Sons; 1986.