Abstract
Speech recognition (SR), available since the 1980s, has only recently become sufficiently reliable for use in a medical environment. This study measured the effect of SR on the radiological dictation process and estimated differences in report turnaround times (RTTs). During the transition from cassette-based reporting to SR, the workflow of 14 radiologists was periodically followed up for 2 years in a university hospital. The sample size was more than 20,000 examinations, and the radiologists were the same throughout the study. An RTT was defined as the time from imaging at the modality to the time when the report was available to the clinician. SR cut RTTs by 81% and their standard deviation by 83%. The proportion of reports available within 1 h increased from 26% to 58%. The proportion of reports created by SR increased during the follow-up period from 0% to 88%. SR decreases turnaround times and may thus speed up the whole patient care process by facilitating online reporting. SR was easily adopted and well accepted by radiologists. Our findings encourage the utilization of SR, which improves productivity and accelerates the workflow with excellent end-user satisfaction.
Key words: Speech recognition, productivity, report workflow, report turnaround time
Background
In a speech recognition (SR) process, dictated speech is converted to a digital signal and then to a sequence of words in written text. SR systems have been available for medicine since the 1980s, but not until the late 1990s did applications prove sufficiently reliable and agile for report dictation1–3. It has been demonstrated that SR systems improve patient care through reduced report turnaround times (RTTs), reduced staffing needs, and efficient completion and distribution of reports4.
Initially, SR systems were used for pan-European languages, such as English, German, or French, but today, SR is applicable to other languages as well. In addition to a general language model, each medical specialty requires its own vocabulary and language model, a so-called context, because a context increases the recognition level significantly5.
The Finnish language is challenging for SR because its vocabulary is exceptionally wide, allowing many different word forms to be derived from one word stem. HUS Helsinki Medical Imaging Center has actively participated in the development of a Finnish SR context for radiology. This context was first piloted in the department of radiology in the Töölö trauma hospital in the spring of 2005, and nowadays, every radiologist has the opportunity to utilize SR. Töölö hospital is the largest trauma hospital in Scandinavia, providing trauma care, orthopedics, plastic surgery, and neurosurgery for 1.5 million people in southern Finland. The hospital has been utilizing a picture archiving and communication system (PACS) since 1997, a nonintegrated radiology information system (RIS) since 1998, and a PACS- and SR-integrated RIS since April 2005. The hospital produces nearly 70,000 studies annually, of which 44,000 are X-rays, 10,500 are computed tomography (CT), and 5,100 are magnetic resonance (MR) studies, in addition to fluoroscopic interventions and sonography.
The dynamics of the radiological dictation workflow and its productivity were examined. The Finnish radiological SR (by Philips SpeechMagic, Philips Speech Recognition Systems GmbH) has been integrated into our RIS. Philips SpeechMagic is one of the most used clinical applications for SR in Europe2,3. The SR, RIS, and PACS use a single sign-on interface. The workflow is driven by the PACS and thus requires no manual typing of accession numbers or names.
In this study, we measured the productivity of radiologists utilizing the SR system. The purpose was also to evaluate the effect of SR on the whole radiological dictation process and to estimate differences in RTT.
Materials and Methods
The SR procedure can be divided into stages: the dictated speech is first digitized and then passed to signal processing, in which acoustic and language modeling are both important parts of statistical SR. SR systems are generally based on the hidden Markov model, a statistical model that produces a sequence of symbols or quantities6. To ensure error-free reports, immediate proofreading of the text by the radiologist is essential. In our system, proofreading is possible at any moment: word by word, after each sentence or paragraph, or after completion of the whole report. In our experience, most radiologists prefer to proofread complete paragraphs or reports once their confidence in the capabilities of SR has grown. Third-party proofreading and editing were not used.
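As a rough illustration of how a hidden Markov model turns a sequence of acoustic observations into the most likely sequence of hidden units, the following minimal Python sketch implements Viterbi decoding. It is not the algorithm of the SR product used in this study; the two states, the three observation labels, and all probabilities are hypothetical toy values chosen only to make the example small.

```python
# Toy Viterbi decoding for a hidden Markov model.
# All states, observation labels, and probabilities are hypothetical.

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (most likely state sequence, joint probability of that path)."""
    # best[t][s]: probability of the best path that ends in state s at step t
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path

    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev = max(states, key=lambda p: best[t - 1][p] * trans_p[p][s])
            best[t][s] = best[t - 1][prev] * trans_p[prev][s] * emit_p[s][observations[t]]
            back[t][s] = prev

    # Trace back from the most probable final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.insert(0, back[t][path[0]])
    return path, best[-1][last]


states = ("vowel", "consonant")
start_p = {"vowel": 0.6, "consonant": 0.4}
trans_p = {
    "vowel": {"vowel": 0.7, "consonant": 0.3},
    "consonant": {"vowel": 0.4, "consonant": 0.6},
}
emit_p = {
    "vowel": {"low": 0.5, "mid": 0.4, "high": 0.1},
    "consonant": {"low": 0.1, "mid": 0.3, "high": 0.6},
}

frames = ["low", "mid", "high"]  # hypothetical quantized acoustic frames
path, prob = viterbi(frames, states, start_p, trans_p, emit_p)
print(path, prob)
```

A production recognizer works with far larger state spaces, continuous acoustic features, and log probabilities, and combines the acoustic model with a language model; the sketch only shows the decoding principle.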
The accuracy of SR dictation is important. Several studies indicate that SR technology is now mature and that its accuracy rate is high4. SR can also be integrated with structured reporting or templates, which further changes the radiological workflow5,7–11. Structured reporting and templates were not yet available in our software and are thus beyond the scope of this study.
In the radiological workflow, the radiologist can dictate the report either onto a minicassette or into the SR system. The process is described in Figure 1. In most hospitals, some studies require urgent reporting, while for other studies, a turnaround time of several days may be sufficient. At our hospital, this is accomplished by using specific worklists to prioritize departments, e.g., a higher priority for the emergency room and intensive care unit (ICU) and a lower priority for plastic surgery. We are currently developing a RIS-based solution for prioritization, but to date, its efficiency has not been tested in a production environment.
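The department-based prioritization described above can be sketched as a single priority queue. This is only an illustration under stated assumptions: the hospital uses separate worklists rather than this data structure, the `Worklist` class and the accession numbers are hypothetical, and the numeric priority levels (including the level given to orthopedics) are invented for the example.

```python
import heapq
from itertools import count

# Hypothetical priority levels: a lower number is reported first.
PRIORITY = {
    "emergency room": 0,
    "ICU": 0,
    "orthopedics": 1,
    "plastic surgery": 2,
}


class Worklist:
    """Priority worklist; studies of equal priority keep their arrival order."""

    def __init__(self):
        self._heap = []
        self._arrival = count()  # tie-breaker within one priority level

    def add(self, accession, department):
        entry = (PRIORITY[department], next(self._arrival), accession)
        heapq.heappush(self._heap, entry)

    def next_study(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


worklist = Worklist()
worklist.add("A1", "plastic surgery")
worklist.add("A2", "ICU")
worklist.add("A3", "orthopedics")
worklist.add("A4", "emergency room")

order = [worklist.next_study() for _ in range(len(worklist))]
print(order)  # ['A2', 'A4', 'A3', 'A1']
```

The arrival counter matters: without it, two studies with the same priority would be ordered by accession string rather than by the time they entered the list.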
Fig 1.
The radiological workflow.
In this study, an RTT was defined as the time from completion of imaging and PACS archiving to the time when the report was stored in the RIS and available to the clinician online. Comparing different RTT studies is difficult because the observed processes are not necessarily equivalent12–14. However, recent studies report a prominent decrease in RTTs3,14,15.
Results
SR was introduced at the HUS Helsinki Medical Imaging Center during the period Q1/2005 through Q2/2007. We followed the proportion of all dictations that were created with SR. Prior to SR, this proportion was 0%, but it increased rapidly. Cassette-based dictation was kept available for those who preferred it. The proportion of reports created by SR as a function of time is presented in Figure 2. The first sample (Q1/2005) was collected before SR use began in April 2005. The second sample was collected in Q1/2006 and the third in Q2/2007. Because our hospital is a university hospital that trains radiologists and collaborates extensively with other clinics, some reports were beyond the capabilities of the RIS-integrated SR owing to technical limitations (integration with other hospitals’ RIS, second readouts for residents not utilizing SR, confidentiality of occupational health care) and were created manually. Thus, 100% SR usage was unattainable. Modality distribution for each radiologist is presented in Figure 3.
Fig 2.
Proportion of reports created by SR.
Fig 3.
Modality distribution for each radiologist, where 1 = X-ray, 2 = angiography, 3 = CT, 4 = US, 5 = MR, and 6 = other.
We also evaluated RTTs based on traditional dictation to the cassette and SR system. The results of the RTTs are summarized in Table 1.
Table 1.
RTTs Using Cassette-based Process and SR
Measure | Cassette-based report | SR 1 | SR 2 |
---|---|---|---|
Sample time | Q1/2005 | Q1/2006 | Q2/2007 |
Mean | 24 h, 46 min | 5 h, 23 min | 4 h, 40 min |
SD | 76 h, 31 min | 27 h, 42 min | 12 h, 43 min |
N | 6,037 | 6,486 | 9,072 |
SR shortened report turnaround markedly, both from Q1/2005 to Q1/2006 and from Q1/2005 to Q2/2007. The difference was statistically significant (p < 0.0001) in both comparisons (unpaired t test). The distribution of RTTs is presented in Figure 4. The figure clearly demonstrates that reports made with SR become available to clinicians more quickly than cassette-based reports.
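The significance of these comparisons can be roughly checked from the summary statistics in Table 1 alone. The sketch below makes two assumptions that the text does not state: it uses the unequal-variance (Welch) form of the unpaired t statistic, and it approximates the p value with the normal distribution, which is reasonable only because each sample contains thousands of reports. The helper names are hypothetical.

```python
import math


def to_minutes(hours, minutes):
    return 60 * hours + minutes


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Unpaired t statistic without assuming equal variances (assumption)."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


def two_sided_p_normal(t):
    """Two-sided p value from the normal approximation (large samples only)."""
    return math.erfc(abs(t) / math.sqrt(2))


# (mean, SD, N) in minutes, taken from Table 1.
cassette_2005 = (to_minutes(24, 46), to_minutes(76, 31), 6037)
sr_2006 = (to_minutes(5, 23), to_minutes(27, 42), 6486)
sr_2007 = (to_minutes(4, 40), to_minutes(12, 43), 9072)

t_2006 = welch_t(*cassette_2005, *sr_2006)
t_2007 = welch_t(*cassette_2005, *sr_2007)

print(round(t_2006, 1), two_sided_p_normal(t_2006) < 0.0001)
print(round(t_2007, 1), two_sided_p_normal(t_2007) < 0.0001)
```

Both statistics come out far above any conventional critical value, which agrees with the reported p < 0.0001. Because RTTs are strongly right-skewed, a rank-based test on the raw data would be a natural complement, but that requires the individual measurements.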
Fig 4.
Distribution of RTTs.
Discussion
We found that the utilization of SR decreases turnaround times by more than 80% and therefore facilitates online reporting of radiological examinations to clinicians immediately after completion of the study. When the cassette-based results (Q1/2005) were compared with the first SR sample (Q1/2006), the mean RTT decreased by 78% (SD by 64%), whereas the decrease from the first SR sample (Q1/2006) to the second (Q2/2007) was 13% (SD by 54%).
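The percentage reductions can be reproduced directly from the means and standard deviations in Table 1. The following short Python sketch does the arithmetic; the helper names `to_minutes` and `reduction_pct` are introduced here for illustration only.

```python
def to_minutes(hours, minutes):
    return 60 * hours + minutes


def reduction_pct(before, after):
    """Relative decrease from before to after, in percent."""
    return 100 * (before - after) / before


# Mean and SD of the RTT in minutes, taken from Table 1.
table1 = {
    "Q1/2005": {"mean": to_minutes(24, 46), "sd": to_minutes(76, 31)},  # cassette
    "Q1/2006": {"mean": to_minutes(5, 23), "sd": to_minutes(27, 42)},   # SR 1
    "Q2/2007": {"mean": to_minutes(4, 40), "sd": to_minutes(12, 43)},   # SR 2
}

comparisons = [("Q1/2005", "Q1/2006"), ("Q1/2006", "Q2/2007"), ("Q1/2005", "Q2/2007")]
results = {}
for first, second in comparisons:
    results[(first, second)] = {
        key: round(reduction_pct(table1[first][key], table1[second][key]))
        for key in ("mean", "sd")
    }
    print(first, "->", second, results[(first, second)])
```

The output gives 78% (SD 64%) for cassette to the first SR sample, 13% (SD 54%) between the two SR samples, and 81% (SD 83%) over the whole follow-up, matching the figures quoted in the abstract and the discussion.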
In the comparison of Q1/2005, Q1/2006, and Q2/2007, the study profile is the same, and the radiologists are the same. The number of radiologists increased in the Q2/2007 sample, and correspondingly, the total number of reports increased prominently. Owing to the increased efficiency, the radiologists also reported studies from the university hospital’s other clinics (Table 2). The average RTT decreased over time. Trumm et al.3 and Rana et al.14 also reported decreased turnaround times, but we additionally found a striking 83% reduction in the standard deviation of RTTs. In practice, this narrowing of the RTT distribution may result not only in consistently faster reporting but also in improved throughput and quality of patient care. In a typical hospital environment, some reports are needed urgently, while some may be sought only several days after the completion of a study. At our hospital, trauma care, the ICU, neurosurgery, and orthopedics frequently require urgent reporting. Differences in mean turnaround times alone do not completely reveal the true potential of SR in this context. During this study, the proportion of reports available within 1 h rose rapidly from 26% (cassette Q1/2005) to 58% (SR Q1/2006). For nonurgent studies, such as most of our MR imaging procedures, the mean RTT remained high (Table 2). In contrast, for typical high-priority worklists requiring online reporting (e.g., ICU or orthopedics), we measured an exceptional 53% reduction in RTTs and an increase from 34% to 65% in first-hour reporting. Thus, for our hospital, the increased number of reports available within 1 h from the completion of a study has proven a great improvement.
Table 2.
Prioritization of Studies and Corresponding Differences in RTTs by Modality (RTTs in h:min:s)
Measure | MR | CT | Ultrasound | X-ray contrast enhancement | Intervention X-ray | Total |
---|---|---|---|---|---|---|
Mean | 13:50:54 | 4:46:45 | 0:23:35 | 4:27:25 | 18:25:36 | 5:57:41 |
SD | 29:50:55 | 16:33:18 | 0:41:16 | 17:07:10 | 35:07:36 | 15:19:10 |
N | 838 | 986 | 78 | 325 | 109 | 13,021 |
This sample included 13,021 studies from Q2/2007 (3 months). A considerable proportion of X-ray and CT studies require immediate reporting, while most MR studies are nonurgent. Most ultrasound examinations, in turn, are typically reported immediately after completion of the study. These figures also include 3,949 studies from other hospitals, which our radiologists were able to report as well.
All new users were trained by a single staff radiologist. Learning to use SR was fairly easy: for each new user, 10 to 15 min of training was sufficient to adopt SR. The training was performed in the production environment, and further advice, when necessary, was therefore readily available from colleagues. Although we did not measure it specifically, we observed that the utilization of SR improved the information value, structure, and clarity of the radiological reports. To achieve a structurally coherent and easy-to-follow report with cassette-based dictation, radiologists have to memorize the entire report and plan its structure and content prior to dictation. In contrast, online editing, which is possible in SR, facilitates focused reports, which we found, subjectively, structurally superior to those created by the cassette-based process. Immediate proofreading was considered both easy and fast, with a negligible addition to the radiologist’s workload. Although this is not easily measurable, the staff radiologists uniformly found SR superior in this context and, since the introduction of SR, have found cassette-based dictation cumbersome.
Conclusion
In conclusion, SR not only shortens RTTs but also speeds up the whole patient care process by significantly facilitating online reporting. SR was easily adopted and well accepted by radiologists. We also found improved quality of the reports, which became better structured and more focused.
Our findings encourage the utilization of SR, which improves productivity and accelerates the workflow with excellent end-user satisfaction.
References
- 1. White KS. Speech recognition implementation in radiology. Pediatr Radiol. 2005;35:841–846. doi: 10.1007/s00247-005-1511-x.
- 2. Vorbeck F, Ba-Ssalamah A, Kettenbach J, Huebsch P. Report generation using speech recognition in radiology. Eur Radiol. 2000;10:1976–1982. doi: 10.1007/s003300000459.
- 3. Trumm C, Francke M, Küttner B, Nissen-Meyer S, Reiser M, Glaser C. Speech recognition: impact on report availability and clinical workflow. Hosp Imaging Radiol Eur. 2006;1:14–16.
- 4. Voll K, Atkins S, Forster B. Improving the utility of speech recognition through error detection. J Digit Imaging. 2008 (in press). doi: 10.1007/s10278-007-9034-7.
- 5. Eng J, Eisner JM. Radiology report entry with automatic phrase completion driven by language modeling. Radiographics. 2004;24:1493–1501. doi: 10.1148/rg.245035197.
- 6. Deng L, Erler K. Structural design of hidden Markov model speech recognizer using multivalued phonetic features: comparison with segmental speech units. J Acoust Soc Am. 1992;92:3058–3067. doi: 10.1121/1.404202.
- 7. Liu D, Zucherman M, Tulloss WB Jr. Six characteristics of effective structured reporting and the inevitable integration with speech recognition. J Digit Imaging. 2006;19:98–104. doi: 10.1007/s10278-005-8734-0.
- 8. Reiner B, Siegel E. Radiology reporting: returning to our image-centric roots. Am J Roentgenol. 2006;187:1151–1155. doi: 10.2214/AJR.05.1954.
- 9. Talton D. Perspectives of speech recognition technology. Radiol Manage. 2005;27(38–40):42–43.
- 10. Sistrom CL. Conceptual approach for the design of radiology reporting interfaces: the talking template. J Digit Imaging. 2005;18:176–187. doi: 10.1007/s10278-005-5167-8.
- 11. Pezzullo JA, Tung GA, Rogg JM, Davis LM, Brody JM, Mayo-Smith WW. Voice recognition dictation: radiologist as transcriptionist. J Digit Imaging. 2008 (in press). doi: 10.1007/s10278-007-9039-2.
- 12. Langer SG. Impact of speech recognition on radiologist productivity. J Digit Imaging. 2002;15:203–209. doi: 10.1007/s10278-002-0014-7.
- 13. Langer SG. Impact of tightly coupled PACS/speech recognition on report turnaround time in the radiology department. J Digit Imaging. 2002;15:234–236. doi: 10.1007/s10278-002-5011-3.
- 14. Rana DS, Hurst G, Shepstone L, Pilling J, Cockburn J, Crawford M. Voice recognition for radiology reporting: is it good enough? Clin Radiol. 2005;60:1205–1212. doi: 10.1016/j.crad.2005.07.002.
- 15. Mehta A, McLoud TC. Voice recognition. J Thorac Imaging. 2003;18:178–182. doi: 10.1097/00005382-200307000-00007.