Application of National Testing Standards to Simulation-Based Assessments of Clinical Palpation Skills

Carla M Pugh

doi:10.7205/MILMED-D-13-00215

Author manuscript; available in PMC: 2014 Aug 4.

Published in final edited form as: Mil Med. 2013 Oct;178(10 0):55–63. doi: 10.7205/MILMED-D-13-00215

PMCID: PMC4120121 NIHMSID: NIHMS604052 PMID: 24084306

Abstract

With the advent of simulation technology, several types of data acquisition methods have been used to capture hands-on clinical performance. Motion sensors, pressure sensors, and tool-tip interaction software are a few of the broad categories of approaches that have been used in simulation-based assessments. The purpose of this article is to present a focused review of 3 sensor-enabled simulations that are currently being used for patient-centered assessments of clinical palpation skills. The first part of this article provides a review of technology components, capabilities, and metrics. The second part provides a detailed discussion regarding validity evidence and implications using the Standards for Educational and Psychological Testing as an organizational and evaluative framework. Special considerations are given to content domain and creation of clinical scenarios from a developer’s perspective. The broader relationship of this work to the science of touch is also considered.

INTRODUCTION

With the advent of simulation technology, several types of data acquisition technologies have been used to capture hands-on performance.^1–5 Motion sensors, pressure sensors, and tool-tip interaction software are a few of the broad categories of technologies that have been used in simulation-based assessments.^6–10 One major distinction between some of the technologies used to capture hands-on performance is the location of the sensors, hence the type of data that is collected. Some of the simulation systems have the sensors on a tool, surgical instrument, or data glove. In this instance, the information captured helps to quantify hand or instrument positions during task execution. For example, when using a data glove, the associated metrics provide quantitative measures of various hand and individual finger positions throughout the task.^11–14 Similarly, instrumented surgical tools allow information capture regarding the position of the surgical instrument during a procedure.^1–3 In both instances, the data captured is clinician centered and focuses on the motor movements and positioning of the clinician.

In this article, we will focus on sensor-enabled training tools in which the sensors are not on the clinician’s hand or instrument but on the patient. In this instance, the information captured using this type of technology helps to quantify human interaction with specific anatomical structures. For example, when using a simulated patient with sensor-enabled organs, the associated metrics provide quantitative measures regarding patient contact including anatomical location and quality of clinical palpation during a physical examination.^15–17 Moreover, using a simulated patient with sensor-enabled organs during a surgical procedure will allow data collection regarding instrument interaction with the patient’s anatomical structures.^18,19 The resulting data in both scenarios are patient centered and provide detailed information regarding direct (hands) or indirect (instruments) patient contact.

As most procedures and physical examinations require hands-on contact with specific anatomical locations, patient-centered devices allow for assessment of a variety of errors including incomplete examinations, missed lesions, or use of excessive force. The purpose of this article is to present a focused review of three sensor-enabled simulations that are currently being developed for patient-centered assessments of clinical palpation skills. The Standards for Educational and Psychological Testing will be used as a framework to structure our review.²⁰

METHODS

Testing Standards

The Standards for Educational and Psychological Testing is a set of testing standards developed jointly by the American Educational Research Association, American Psychological Association, and the National Council on Measurement in Education. The standards address professional and technical issues of test development and use in education, psychology, and employment. The intent is to promote the sound and ethical use of tests and to provide a basis for evaluating the quality of testing practices. While evaluation of the appropriateness of a test or testing application should depend heavily on professional judgment, the standards provide a frame of reference to assure that relevant issues are addressed. The standards apply equally to standardized multiple-choice tests and performance assessments. For the purpose of this article, our sensor-enabled, simulation-based assessments are considered to be a type of performance-based assessment: evaluation of performance during tasks that are valued in their own right.

Sensor-Enabled Simulation Technology

The focus of this article is on performance assessments using sensor-enabled mannequin technologies. Development and implementation of this technology began in 1998.²¹ The initial clinical focus was on three physical examinations: the female pelvic examination, the female breast examination, and the male digital rectal examination. All three systems include a partial mannequin: umbilicus to mid-thigh for the pelvic and digital rectal examinations, and left or right chest wall for the breast models. In addition, the mannequins have interchangeable parts that enable simulation of different clinical presentations, both normal and abnormal. For example, the pelvic examination inserts include normal-anteverted and normal-retroverted uterine positions as well as enlargements of the uterus or ovary.

Paper-thin 2 mm force sensing resistors (sensors) are connected to important anatomical structures on the mannequin and embedded organs.²² Data acquisition systems enable the sensors to be sampled at a specified sampling rate. For these clinical examinations, the sampling rate is set at 30 Hz. Figure 1 shows the pelvic exam simulator including computer interface and mannequin. The computer interface shows that the user is touching the cervical os at six pressure units (1 PU = 0.125 psi). Consequently, the corresponding register bar rises to a level of six, the indicator button in the cartoon diagram turns blue, and a check mark appears in the “Exam Checklist” window. During a simulated pelvic exam, this interface enables students and instructors to see where the examiner is touching and how much pressure is being used. The prostate and breast models have similar computer interfaces.
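The unit conversions implied above can be illustrated with a minimal Python sketch. The two constants come from the text (1 PU = 0.125 psi; 30 Hz sampling). The function names are hypothetical and are not part of the simulator software.

```python
PSI_PER_PU = 0.125      # from the text: 1 PU = 0.125 psi
SAMPLE_RATE_HZ = 30     # from the text: sensors are sampled at 30 Hz


def pu_to_psi(pressure_pu):
    """Convert a reading in pressure units (PU) to psi."""
    return pressure_pu * PSI_PER_PU


def samples_to_seconds(n_samples):
    """Convert a count of consecutive samples to elapsed seconds."""
    return n_samples / SAMPLE_RATE_HZ


print(pu_to_psi(6))            # 0.75 (psi) for the 6 PU reading on the cervical os
print(samples_to_seconds(90))  # 3.0 (seconds)
```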

Figures 2A and 2B show line graph representations of a pelvic examination plotted as pressure over time. Each line represents a different anatomical area. When reviewing performance using the line graphs, the examiner’s touch can be tracked by anatomical location and several palpation characteristics including the level and type of pressure. In Figure 2A, the examining medical student applies several bursts of pressure to the fundus of the uterus. This is represented by a series of narrow spikes (see black arrow) in the 8–10 PU range. Simultaneously, there is constant pressure on the left posterior cervix (L-post) at a level of 6 PU. These data quantify bimanual examination of the uterine fundus: one hand explores the fundus while the other hand applies counterpressure to the cervix to lift the uterus toward the abdominal wall and facilitate palpation. The combination of pressures used (spikes plus constant pressure) provides detail on the examiner’s approach including anatomical locations explored (i.e., sensor locations) and palpation characteristics used (i.e., pressure levels and spikes plus constant pressure). To illustrate the range in performance, consider the medical student in Figure 2B, who applies a combination of pressures to the cervix in the 6–8 PU range. The first spike (~600 time units) represents pressure applied to the cervical os. The second waveform (~750–1166 time units) combines a series of spikes and constant pressure on the left posterior cervix. Starting around 1200 time units, the examination continues with several manipulations of the cervix in the 5–8 PU range and ends with two low-pressure (4 PU) spikes on the uterine sensors. Compared to the student in Figure 2A, the student in Figure 2B spends considerable time applying pressure to several areas on the cervix; takes twice as long to accomplish bimanual examination of the uterine fundus; and uses much lower pressure and a smaller number of peaks when palpating the uterine fundus. While the second student was eventually able to accomplish the goal of bimanual examination, there are noticeable differences in performance when comparing the two students. These graphs show that specific physical examination events can be captured and quantified. Use of these data for performance-related decisions will be discussed using the standards as a guiding framework.

Figure 2. Line graph representation of the sensor-generated performance data collected during a pelvic examination. (A) The arrow points to several high-pressure palpations of the uterus. (B) The arrows point to high-pressure palpation of the right and left cervix.

Metrics

While the line graph representations of the data provide a high-level, qualitative view of performance differences, the data must be converted to measurable variables to quantify individual differences. The most commonly used performance assessment variables extracted from the sensor data include (1) examination time, (2) number of sensors touched, (3) maximum pressure, and (4) frequency.^21–27 The operational definitions of these variables are as follows:

Time Variable

The time variable is equivalent to the length of time necessary for an examiner to perform a complete examination. Mathematically, exam completion time was defined as the time at which the last sensor was touched minus the time at which the first sensor was touched. The exam was considered to have begun when the pressure on any given sensor reached 1 full pressure unit above baseline.^21–27
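The definition above can be sketched in Python as follows. This is an illustration under stated assumptions, not the MATLAB code used in the studies: the readings are assumed to be a dictionary of per-sensor traces in PU sampled at 30 Hz, the baseline is assumed to be 0 PU, and "touched" is assumed to mean at least 1 PU above baseline (the text states this threshold only for the start of the exam). The function and sensor names are hypothetical.

```python
def exam_time_seconds(readings, baseline=0.0, threshold_pu=1.0, sample_rate_hz=30):
    """Time of the last touch minus time of the first touch, in seconds.

    readings: dict mapping a sensor name to a list of PU samples
    (assumed layout; all traces share the same sample clock).
    """
    touched_samples = [
        index
        for trace in readings.values()
        for index, pressure in enumerate(trace)
        if pressure - baseline >= threshold_pu
    ]
    if not touched_samples:
        return 0.0  # no sensor was ever touched
    return (max(touched_samples) - min(touched_samples)) / sample_rate_hz


# Hypothetical two-sensor recording: first touch at sample 2, last at sample 6.
readings = {
    "cervical_os": [0, 0, 2, 6, 6, 0, 0, 0, 0, 0],
    "fundus":      [0, 0, 0, 0, 0, 0, 9, 0.5, 0, 0],
}
print(exam_time_seconds(readings))  # 4 samples at 30 Hz, about 0.133 seconds
```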

Critical Areas Variable

The critical areas variable represents the number of sensors touched during the simulated clinical examinations. For the pelvic exam there were seven sensors: four on the cervix, one on the uterus, and one on each ovary. For the digital rectal exam there were also seven sensors on the prostate: three on the right lateral lobe, three on the left lateral lobe, and one in the median raphe. For the breast exam there were eleven sensors, with approximately two to three sensors in each of the four quadrants of the breast.^21–27
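A minimal Python sketch of this count is shown below. It assumes that a sensor counts as touched once any sample is at least 1 PU above a 0 PU baseline; the text gives that threshold for the start of the exam, and applying it here is our assumption. The sensor names and readings are hypothetical.

```python
def critical_areas_touched(readings, baseline=0.0, threshold_pu=1.0):
    """Count sensors with at least one sample at or above the touch threshold."""
    return sum(
        1
        for trace in readings.values()
        if any(pressure - baseline >= threshold_pu for pressure in trace)
    )


# Hypothetical pelvic recording with the seven sensors described in the text:
# four on the cervix, one on the uterus, and one on each ovary.
readings = {
    "cervix_os":         [0, 6, 6, 0],
    "cervix_left_post":  [0, 0, 5, 5],
    "cervix_right_post": [0, 0.4, 0.2, 0],   # below threshold: not counted
    "cervix_anterior":   [0, 0, 0, 0],
    "uterus":            [0, 9, 0, 8],
    "ovary_left":        [0, 0, 2, 0],
    "ovary_right":       [0, 0, 0, 0.9],     # below threshold: not counted
}
touched = critical_areas_touched(readings)
print(f"{touched}/{len(readings)} areas")  # 4/7 areas
```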

Maximum Pressure Variable

The maximum pressure variable represents the highest pressure reading recorded for a sensor during the simulated examination. For example, if the highest readings recorded were 7 PU for sensor no. 1 and 10 PU for sensor no. 2, the maximum pressure for each sensor would be its own highest reading (7 PU and 10 PU, respectively), whereas the average maximum pressure would be the mean of those two values (8.5 PU).^21–27
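The two-sensor example can be reproduced with a short Python sketch. The data layout (a dictionary of per-sensor PU traces) and the function names are assumptions made for illustration. Averaging over all recorded sensors, rather than only those touched, is also an assumption.

```python
def max_pressures(readings):
    """Highest PU reading recorded for each sensor."""
    return {name: max(trace) for name, trace in readings.items()}


def mean_max_pressure(readings):
    """Mean of the per-sensor maxima (averaged over all sensors given)."""
    peaks = max_pressures(readings)
    return sum(peaks.values()) / len(peaks)


# Hypothetical traces whose peaks match the example: 7 PU and 10 PU.
readings = {
    "sensor_1": [0, 3, 7, 5],
    "sensor_2": [0, 10, 6, 2],
}
print(max_pressures(readings))      # {'sensor_1': 7, 'sensor_2': 10}
print(mean_max_pressure(readings))  # 8.5
```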

Frequency Variable

The frequency variable represents the number of times an individual sensor was touched near the maximum pressure during the examination. The mathematical formulation for creating this variable involved counting the number of times the given sensor was sampled within 0.5 PU of the maximum for that sensor.^21–27
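The counting rule can be sketched in Python as follows. The 0.5 PU window comes from the text. Returning zero for a sensor that never reaches 1 PU is our assumption, added so that an untouched sensor (whose samples all sit at its own "maximum" of zero) does not produce a spurious count. The function name and trace are hypothetical.

```python
def frequency(trace, window_pu=0.5, touch_threshold_pu=1.0):
    """Number of samples within window_pu of the sensor's own maximum."""
    if not trace:
        return 0
    peak = max(trace)
    if peak < touch_threshold_pu:
        return 0  # assumption: an untouched sensor has frequency 0
    return sum(1 for pressure in trace if peak - pressure <= window_pu)


# Hypothetical trace: the peak is 10 PU, so samples at 9.5 PU or above count.
trace = [0, 3, 8, 9.6, 10, 9.5, 4, 9.7, 0]
print(frequency(trace))  # 4 (the samples 9.6, 10, 9.5, and 9.7)
```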

In addition to extraction of these variables using MATLAB code (MathWorks, Natick, MA), specific data mining techniques have been applied to the sensor data to gain an understanding of quality and usefulness, including Markov models^19,23–25 and visual analytics.²⁶

Application of National Testing Standards

Use of the sensor-generated data for performance-related decisions requires a standardized approach to ensure validity, reliability, fairness, and appropriate use.²⁰ The Standards provide a three-part guide for evaluating a wide variety of assessments. The following sections present a detailed review of our past work. The Standards are used as a guide to structure our discussion. All of the studies reviewed in this article were performed after approval by the local Institutional Review Board.

Standards Part I—Test Construction, Evaluation, and Documentation

When designing the simulation-based assessments, test construction required a review of all elements that may be used to gather data for evaluation purposes. The two main elements included (1) the written clinical assessment form and (2) the computer-generated sensor data. Correct answers for the written clinical assessment vary according to the clinical scenario (normal or pathologic variation) represented by the simulation. Correct answers for the computer-generated variables are also closely linked to the clinical presentation. While there is ongoing work to determine the best construction and administration for our simulation-based assessments, for the purpose of this article, each clinical scenario represents a specific content domain, hence a major section within a test. Most administrations using the clinical simulations involve at least two different clinical scenarios. Opportunities for partial credit exist for both the written and the sensor data components. The following sections provide an overview of the various studies we have performed to guide the process of test construction and administration.

Validity, reliability, and measurement error are important to consider during test construction. Validity was initially assessed by evaluating a basic content construct: does this technology capture any performance measures of interest? To answer this question we used a sensor-enabled pelvic examination simulator with second-year medical students (N = 73). In addition to collecting sensor-generated performance data, we assessed diagnostic accuracy rates using participants’ written clinical assessments of two clinically different pelvic models.²⁷ Using a 2-tailed Pearson’s correlation, we found that three of the four sensor variables were significantly associated with participants’ ability to generate an accurate clinical assessment of the simulator after performing an examination. The highest correlation was for the “number of critical areas touched” during an examination (r = 0.311, p = 0.007). The second highest correlation was for the “mean maximum pressure” used during the examination (r = 0.279, p = 0.017). Finally, the last variable with a significant correlation was “mean frequency” used during the examination (r = 0.267, p = 0.022). While the correlation values for these three variables are moderate to low, they indicate that there is a relationship among the simulator variables, accuracy on the written assessments, and the use of direct, hands-on contact during an examination. In essence, these three variables capture aspects of the clinical pelvic examination that are important in achieving an accurate diagnosis. Reliability of the four variables in this setting was as follows: (1) time, r = 0.72; (2) critical areas, r = 0.63; (3) mean maximum pressure, r = 0.77; and (4) mean frequency, r = 0.50. While the time variable did not have a significant correlation with diagnostic accuracy in this setting, it appeared to have moderate to high reliability and we continued to evaluate this variable in other settings (i.e., different participants and clinical scenarios).
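The relationship between the reported correlations and their 2-tailed p values can be checked with the standard t test for a Pearson correlation, t = r·sqrt((n − 2)/(1 − r²)) with n − 2 degrees of freedom. The Python sketch below is only a consistency check, not the analysis used in the study. It assumes that all 73 participants contributed to each correlation and works from the rounded r values, so small discrepancies are expected. The p value is obtained by numerically integrating the Student t density, which avoids any dependency beyond the standard library.

```python
import math


def t_statistic(r, n):
    """t statistic for testing a Pearson correlation r from n paired observations."""
    return r * math.sqrt((n - 2) / (1 - r * r))


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p value from the Student t distribution (Simpson's rule)."""
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )

    def pdf(x):
        return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central_area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1 - 2 * central_area)


N = 73  # assumption: every participant is included in each correlation
for label, r in [
    ("critical areas", 0.311),
    ("mean maximum pressure", 0.279),
    ("mean frequency", 0.267),
]:
    t = t_statistic(r, N)
    print(f"{label}: t = {t:.3f}, p = {two_tailed_p(t, N - 2):.3f}")
```

Under these assumptions the computed p values agree closely with those reported above.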

Validity was also assessed by evaluating the ability to use the simulator data to differentiate between experience levels. When assessing the experience level using the pelvic simulators, we compared medical student (N = 43) performance with that of experienced clinicians (N = 20).²⁸ For the written assessment, mean examination scores showed medical students were less accurate than clinicians (students = 10.18/18, clinicians = 15.60/18, p < 0.001). There were also differences noted in examination techniques. Students were noted to spend more time on the exam (students = 82.01 seconds, clinicians = 31.07 seconds, p < 0.001) and used greater palpation frequencies when examining each area (students = 42.45 Hz, clinicians = 20.30 Hz, p < 0.005).²⁸ When evaluating experience level using the digital rectal examination simulator, we conducted a study involving surgical residents (N = 24) and medical students (N = 30).¹⁶ Participants were grouped according to the number of prior digital rectal examinations performed. Group 1 (N = 27) was the less experienced group, having performed five or fewer previous examinations. Group 2 (N = 27) had performed six or more rectal examinations. Each participant examined two different simulators: Simulator A (easy diagnosis: normal rectum and a firm 2 mm prostate nodule) and Simulator B (difficult diagnosis: enlarged prostate + a subtle 3 cm rectal mass). When comparing technical performance on Simulator A, the more experienced group (G2) was noted to spend more time on the examination (G2 = 12.34 seconds, G1 = 7.22 seconds, p < 0.01). For this simulator (Simulator A, easy diagnosis), there were no significant differences in accuracy. The less experienced group had an accuracy rate of 80% and the more experienced group had an accuracy rate of 85%. In contrast, for the more difficult simulator (Simulator B) there were significant differences in technical performance and accuracy. When comparing performance on Simulator B, the more experienced group (G2) spent more time on the exam (G2 = 17.52 seconds, G1 = 11.94 seconds, p < 0.05) and was more accurate in its assessment and documentation of the prostate findings (G2 = 64% accurate, G1 = 33% accurate, p < 0.05).¹⁶

Validity was also assessed by evaluating the ability to use the simulator data to differentiate between clinical specialties and gender. When assessing clinical specialty using the breast examination simulators, we compared the performance of four clinical groups: (1) surgeons (N = 37), (2) nonsurgical MD’s (N = 36), (3) nurses (N = 12), and (4) medical assistants (N = 15) on three simulators: (1) Simulator A—dense breast, 2 cm hard mass; (2) Simulator B—fatty breast, no masses; and (3) Simulator C—dense breast with right upper quadrant thickening.¹⁵ When assessing overall approach to the breast examination, nurses were noted, on average, to palpate more anatomical areas during the examination (nurses = 9.4/11 areas, surgeons = 7.38/11 areas, nonsurgeons = 6.22/11 areas, and medical assistants = 6.71/11 areas, p < 0.01). In addition, there was a trend for the nurses to spend more time on the breast examination. However, this finding was only significant for one of the three models—Simulator A (nurses = 58.57 seconds, surgeons = 40.94 seconds, nonsurgeons = 32.33 seconds, and medical assistants = 39.65 seconds, p < 0.05). Despite these observed differences in technical approach, and differences in mean accuracy rates for each clinical scenario (A = 87%, B = 76%, and C = 68%), there were no significant differences in accuracy when comparing the four specialties on the three breast models. Although there were differences in hands-on contact and approach among the specialties in this study, there was no difference in accuracy. As such, the three clinical scenarios and simulator variables are not a valid tool for assessing important differences in surgical, nonsurgical, and nursing specialties.

When assessing gender, we compared male (N = 38) and female (N = 57) performance on three breast simulators: (1) Simulator A—dense breast, 2 cm hard mass; (2) Simulator B—fatty breast, no masses; (3) Simulator C—dense breast with right upper quadrant thickening.¹⁵ The results showed that females, on average, spent more time (M = 42.09 seconds, F = 56.66 seconds, p < 0.05), touched more anatomical areas (M = 6.30/11 areas, F = 7.97/11 areas, p < 0.05), and used greater pressures (M = 4.82 mmHg, F = 5.21 mmHg, p < 0.05) when compared to male clinicians. While there was a trend toward females being less accurate (M = 83.7% correct, F = 72.4% correct) the difference was not statistically significant. Although there were differences in hands-on contact and approach, when comparing males and females, there was no difference in accuracy. As such, Simulators A–C and the computer-generated simulator variables are not a valid tool for assessing gender-related accuracy differences. Moreover, it is possible that there are no gender-related differences.

In summary, the simulator variables appear to capture data that correlate with hands-on performance during simulated clinical examinations. In addition, the variables show promising results in discriminating experience levels.^16,28 When evaluating specialty and gender, there appear to be differences in technical approach but not overall accuracy. As such, the variables may not be useful in discriminating between specialties or genders. Moreover, there may not be any important differences between specialties or genders except approach.

Standards Parts II and III

Standards Parts II and III deal with fairness and testing applications. Fairness issues include lack of bias, equitable treatment in the testing process, and equal opportunities to learn. As the simulations are physical models and the sensors quantify hands-on touch, biases in language and linguistics are limited. Test taker rights such as access to test results and rights when reviewing testing irregularities are issues that will need to be addressed before formal use of the simulations for performance-related decisions. Issues relating to testing applications largely address general responsibilities of those who administer, interpret, and use test results. When using previously validated tests in different venues, test users must ensure that there is continued test validity and reliability in the new setting (Standard 11.19). Use of tests for psychological evaluation, educational assessment, employment-related decisions, and program evaluation are other areas that must be considered as part of test applications. These areas will be considered as part of our future work in evaluating the simulation-based assessments.

SPECIAL CONSIDERATIONS

Content Domain and Creation of Clinical Scenarios

There are inherent difficulties in simulating human body parts. These difficulties present several challenges to the use of simulation as an assessment tool. From a manufacturing standpoint, the developer’s goal is to find the right combination of materials and molds that, once fabricated, are the most realistic representation of human tissue possible. In essence, the goal is the best match for context and functionality.^29,30 When building a breast model for example, it may be possible to perfect the mold and achieve an extremely realistic look; the shape, the color, the skin detail may all be perfect. However, after achieving this perfection, the materials may not feel or behave like real breast tissue. Likewise, a breast model may feel realistic to touch but be found deficient in achieving a realistic look.³¹

An additional challenge in using simulation as an assessment tool relates to human perception. For example, when manufacturing a breast model that represents a patient with an obvious breast mass, the expectation would be that most clinicians examining the model would detect the mass on palpation. However, diagnostic perception may be affected by the manufacturer’s materials as well as the clinician’s clinical skills.³² From a validity perspective, it is difficult to determine whether lower than expected accuracy rates represent a fabrication problem or reasonable variations in perception. The problem is further confounded when a developer desires to fabricate a clinical scenario that is less obvious. While the overall goal, from an assessment perspective, is to provide a range of easy and difficult test questions (simulations), lower accuracy rates on the more difficult clinical scenarios must be evaluated from a validity perspective in a similar fashion and with the same rigor as multiple-choice questions.²⁰ Defining this process for simulation is imperative to the success and applicability of simulation-based assessments.

The Science of Touch

In health care, despite the many technological advances, human touch remains important for many diagnostic and therapeutic interventions. Unfortunately, without performance measures, objective and formative feedback to health care trainees and practitioners is nearly impossible. As such, there are no standards and health care professionals continue to graduate and become credentialed to practice medicine without any real measure of their ability to perform hands-on procedures or make sound clinical judgments using palpation. Part of the problem lies in the complexity of touch (palpation). Despite its importance, the human sense of touch is poorly understood and understudied. Touch is extremely difficult to convey verbally and there are no objective means of explaining one’s own experience or perception based on the sense of touch.^32,33 Over 20 years of extensive research on the sense of touch reveals a set of specific hand maneuvers that humans use to detect object characteristics.^32,34,35 The hand maneuvers, called exploratory procedures, are stereotyped movement patterns consisting of certain characteristics, which are largely subconscious and reproducible in a variety of settings. Key findings from this work show (1) human beings are very good at recognizing common objects on the basis of touch alone; (2) object recognition is strongly based on specific object characteristics including texture, hardness, shape, temperature, and weight; (3) during object recognition, specific hand maneuvers are used to detect object characteristics.

Our work using sensor-enabled, patient-centered simulations to assess palpation skills in clinical medicine has shown promising results regarding the relationship between sensor outputs and specific exploratory maneuvers used during palpation. When using force sensing resistors (FSRs) on anatomical models that simulate common medical examinations (breast, pelvic, and digital rectal examinations), we found that specific palpation maneuvers were detectable in our data. Figures 3A–3F show laboratory-generated waveforms for specific palpation maneuvers.³⁶ Laboratory participants were asked to perform the following maneuvers for a minimum of 4 seconds on a sensored plate: (1) balloting (multiple, vertical bursts of firm pressure), (2) circular motion (rubbing counterclockwise), (3) constant pressure, and (4) rubbing (firm back and forth pressure across a vertical line). While the resulting waveforms show similar patterns for rubbing and circular maneuvers, balloting in multiple areas is distinguishable from circular motion in multiple areas and constant pressure. Linking the waveforms to specific exploratory maneuvers will facilitate our understanding of palpation characteristics during physical examination. For example, as shown in Figure 2A, we now understand that the examinee was using the balloting maneuver when examining the fundus. Our future work will continue to explore the sensor data using Klatzky and Lederman’s classification of palpation maneuvers. We believe this may help to generate a better understanding of how the sensor-generated data can be used in performance assessments.

Figure 3. Waveforms extracted from the sensor data during specific palpation maneuvers: (A, B) Balloting; (C, D) Circular pressure movements; (E) Constant pressure; and (F) Rubbing.

SUMMARY

The purpose of this article was to present a focused review of sensor-based assessments of palpation skills using patient-centered simulations. The technology components, capabilities, and metrics were reviewed. In our review of validity, we found that there were significant correlations between the sensor-generated performance variables and accuracy when reporting and documenting clinical findings. While the correlation values were moderate to low (r = 0.267–0.311), they warrant additional investigation of the relationship between hands-on performance and rate of accuracy during a clinical assessment.

In our review of validity evidence, we focused on three constructs: (1) experience level, (2) clinical specialty, and (3) gender. The findings for experience level were promising for discriminating between groups based on technical approach and accuracy.^16,28 The results for specialty and gender showed differences in technical approach but not overall accuracy. As such, the current variables and clinical scenarios are not valid discriminatory variables for these constructs.

The use of palpation remains important for many diagnostic and therapeutic interventions in the health care environment. Development of performance metrics and assessments to ensure minimum performance standards is an important endeavor that should be closely guided by national standards.

Acknowledgments

The following researchers have made significant contributions to this body of work: Jacob Rosen, Lawrence Salud, Jonathan Salud, Alec Peniche, Abby Kaye, and Brandon Andrew. This body of work has been funded by the following foundations and agencies: National Board of Medical Examiners (NBME) Stemmler Fund; Media X Grant, Stanford University; Augusta Webster Educational Innovation Grant, Northwestern University; Eleanor Wood-Prince Grants Initiative, Northwestern Memorial Hospital; National Cancer Institute-Supplement Grant-3U01CA116875-03S1; The Baum Family Fund; National Institutes of Health R01EB011524. The work reported herein was also partially supported by a grant from the Office of Naval Research, Award Number N00014-10-1-0978.

Footnotes

The findings and opinions expressed here do not necessarily reflect the positions or policies of the Office of Naval Research.

References

1. Cano AM, Gayá F, Lamata P, Sánchez-González P, Gomez EJ. Laparoscopic tool tracking method for augmented reality surgical applications. Proc Biomedical Simulation. 2008;5104:191–6.
2. Dosis A, Aggarwal R, Bello F, et al. Synchronized video and motion analysis for the assessment of procedures in the operating theater. Arch Surg. 2005;140:293–9. doi: 10.1001/archsurg.140.3.293.
3. Chmarra MK, Bakker NH, Grimbergen CA, Dankelman J. TrEndo, a device for tracking minimally invasive surgical instruments in training setups. Sens Actuators A Phys. 2006;126:328–34.
4. Rosen J, Brown JD, Barreca M, Chang L, Hannaford B, Sinanan M. The Blue DRAGON: a system for monitoring the kinematics and the dynamics of endoscopic tools in minimally invasive surgery for objective laparoscopic skill assessment. Stud Health Technol Inform. 2002;85:412–8.
5. Pagador JB, Sánchez LF, Sánchez JA, Bustos P, Moreno J, Sánchez-Margallo FM. Augmented reality haptic (ARH): an approach of electromagnetic tracking in minimally invasive surgery. Int J Comput Assist Radiol Surg. 2011;6:257–63. doi: 10.1007/s11548-010-0501-0.
6. Bann SD, Khan MS, Darzi AW. Measurement of surgical dexterity using motion analysis of simple bench tasks. World J Surg. 2003;27:390–4. doi: 10.1007/s00268-002-6769-7.
7. Datta V, Mackay S, Mandalia M, Darzi A. The use of electromagnetic motion tracking analysis to objectively measure open surgical skill in the laboratory-based model. J Am Coll Surg. 2001;193:479–85. doi: 10.1016/s1072-7515(01)01041-9.
8. Murphy TE, Vignes CM, Yuh DD, Okamura AM. Automatic motion recognition and skill evaluation for dynamic tasks. Proc EuroHaptics. 2003:363–73.
9. Oropesa I, Sánchez-González P, Cano AM, Lamata P, Sánchez-Margallo FM, Gómez EJ. Objective evaluation methodology for surgical motor skills assessment. Minim Invasive Ther Allied Technol. 2010;10:55–6.
10. Leong JJH, Nicolaou M, Atallah L, Mylonas GP, Darzi AW, Yang GZ. HMM assessment of quality of movement trajectory in laparoscopic surgery. Comput Aided Surg. 2007;12:335–46. doi: 10.3109/10929080701730979.
11. Dipietro L, Sabatini AM, Dario P. Evaluation of an instrumented glove for hand-movement acquisition. J Rehabil Res Dev. 2003;40(2):179–89.
12. Cook JR, Baker NA, Cham R, Hale E, Redfern MS. Measurements of wrist and finger postures: a comparison of goniometric and motion capture techniques. J Appl Biomech. 2007;23(1):70–8. doi: 10.1123/jab.23.1.70.
13. Gentner R, Classen J. Development and evaluation of a low-cost sensor glove for assessment of human finger movements in neurophysiological settings. J Neurosci Methods. 2009;178(1):138–47. doi: 10.1016/j.jneumeth.2008.11.005.
14. Gülke J, Wachter NJ, Geyer T, Schöll H, Apic G, Mentzel M. Motion coordination patterns during cylinder grip analyzed with a sensor glove. J Hand Surg Am. 2010;35(5):797–806. doi: 10.1016/j.jhsa.2009.12.031.
15. Pugh CM, Domont ZB, Salud LH, Blossfield KM. A simulation-based assessment of clinical breast examination technique: do patient and clinician factors affect clinician approach? Am J Surg. 2008;195(6):874–80. doi: 10.1016/j.amjsurg.2007.10.018.
16. Balkissoon R, Blossfield-Iannitelli K, Salud L, Ford D, Pugh C. Lost in translation: unfolding medical students’ misconceptions of how to perform the clinical digital rectal examination. Am J Surg. 2009;197(4):525–32. doi: 10.1016/j.amjsurg.2008.11.025.
17. Pugh CM, Rosen J. Qualitative and quantitative analysis of pressure sensor data acquired by the E-Pelvis simulator during simulated pelvic examinations. Stud Health Technol Inform. 2002;85:376–9.
18. Rosen J, Hannaford B, Richards CG, Sinanan MN. Markov modeling of minimally invasive surgery based on tool/tissue interaction and force/torque signatures for evaluating surgical skills. IEEE Trans Biomed Eng. 2001;48(5):579–91. doi: 10.1109/10.918597.
19. Rosen J, Solazzo M, Hannaford B, Sinanan M. Objective laparoscopic skills assessments of surgical residents using hidden Markov models based on haptic information and tool/tissue interactions. Stud Health Technol Inform. 2001;81:417–23.
20. American Educational Research Association (AERA), American Psychological Association (APA), and National Council for Measurement in Education (NCME). Standards for Educational and Psychological Testing. Washington, DC: American Educational Research Association; 1999.
21. Pugh CM. Evaluating Simulators for Medical Training: The Case of the Pelvic Exam Model. Ann Arbor, MI: ProQuest/Bell and Howell Dissertations Publishing; 2001. Available at http://disexpress.umi.com/dxweb [accessed September 26, 2013].
22. Medical Examination Teaching System. US Patent Number 6,428,323. 2002. Available at http://www.google.com/patents/US6428323 [accessed May 7, 2013].
23. Mackel T, Rosen J, Pugh C. Application of hidden Markov modeling to objective medical skill evaluation. Stud Health Technol Inform. 2007;125:316–8.
24. Mackel T, Rosen J, Pugh CM. Markov model assessment of subjects’ clinical skill using the E-Pelvis physical simulator. IEEE Trans Biomed Eng. 2007;54(12):2133–41. doi: 10.1109/tbme.2007.908338.
25. Mackel T, Rosen J, Pugh C. Data mining of the E-pelvis simulator database: a quest for a generalized algorithm for objectively assessing medical skill. Stud Health Technol Inform. 2006;119:355–60.
26. Silverstein J, Selkov G, Salud L, Pugh C. Developing performance criteria for the e-Pelvis simulator using visual analysis. Stud Health Technol Inform. 2007;125:436–8.
27. Pugh CM, Youngblood P. Development and validation of assessment measures for a newly developed physical examination simulator. J Am Med Inform Assoc. 2002;9(5):448–60. doi: 10.1197/jamia.M1107.
28. Pugh CM, Heinrichs WL, Dev P, Srivastava SS, Krummel T. Objective assessment of clinical skills with a simulator. JAMA. 2001;286(9):1021–3. doi: 10.1001/jama.286.9.1021-a.
29. Verschuren P, Hartog R. Evaluation in design-oriented research. Qual Quant. 2005;39(6):733–62.
30. Kirschner P, Carr C, van Merriënboer J, Sloep P. How expert designers design. Perform Improv Quart. 2002;15(4):86–104.
31. Salud LH, Ononye CI, Kwan C, Salud JC, Pugh CM. Clinical examination simulation: getting to real. Stud Health Technol Inform. 2012;173:424–9.
32. Klatzky R, Lederman SJ. Tactile object perception and the perceptual stream. In: Albertazzi L, editor. Unfolding Perceptual Continua. Netherlands: John Benjamin Publishing Company; 2002. pp. 147–62.
33. Minogue J, Jones MG. Haptics in education: exploring an untapped sensory modality. Rev Educ Res. 2006;76(3):317–48.
34. Lederman SJ, Klatzky RL. Extracting object properties through haptic exploration. Acta Psychol (Amst). 1993;84(1):29–40. doi: 10.1016/0001-6918(93)90070-8.
35. Lederman SJ, Klatzky RL. Hand movements: a window into haptic object recognition. Cogn Psychol. 1987;19(3):342–68. doi: 10.1016/0010-0285(87)90008-9.
36. Salud LH, Pugh CM. Use of sensor technology to explore the science of touch. Stud Health Technol Inform. 2011;163:542–8.
