Abstract
The purpose of this research was to develop queries that quantify the utilization of comparison imaging in free-text radiology reports. The queries searched for common phrases that indicate whether comparison imaging was utilized, not available, or not mentioned. The queries were iteratively refined and tested on random samples of 100 reports with human review as a reference standard until the precision and recall of the queries did not improve significantly between iterations. Then, query accuracy was assessed on a new random sample of 200 reports. Overall accuracy of the queries was 95.6%. The queries were then applied to a database of 1.8 million reports. Comparisons were made to prior images in 38.69% of the reports (693,955/1,793,754), were unavailable in 18.79% (337,028/1,793,754), and were not mentioned in 42.52% (762,771/1,793,754). The results show that queries of text reports can achieve greater than 95% accuracy in determining the utilization of prior images.
Key words: Query, structured query language (SQL), databases
Introduction
Most diagnostic radiologists believe that it is important to have prior images available for side-by-side comparison during the interpretation of radiographs, preferably the actual image rather than the written report.1 Radiologists also make more observations, gain an added sense of confidence in their interpretation, and provide more specific diagnoses when comparison studies are utilized.2 Moreover, in the realm of mammography, numerous studies have shown that the use of comparison imaging increases the diagnostic accuracy and specificity or decreases the false-positive rate.3–5 As a result, the utilization of comparison imaging when available is considered standard practice for many radiologists and is part of the current American College of Radiology guidelines.6,7
However, the data to support this currently accepted practice is still limited, as the impact of priors on the sensitivity of diagnosis is uncertain, and the cost/benefit of obtaining and utilizing comparison films has not been proven.3,8,9 As such, radiologists have little guidance on the effort that should be expended to obtain comparison images when unavailable. This is especially an issue when prior exams are stored at another hospital or medical practice, causing radiologists to expend considerable resources trying to obtain such images.10,11
Thus, there has been a growing interest in implementing technology that provides more reliable access to comparison images. The universal adoption of Picture Archiving and Communication Systems (PACS) provides a means for improved access to digital images, and has been shown to improve productivity and cost-effectiveness, particularly in institutions with multidetector computed tomography (MDCT).12–14 However, to our knowledge, no research has been carried out to assess the impact of PACS on comparison utilization.
Therefore, the goal of our research was to provide a tool that could assess this by developing and validating a computer-based method to accurately quantify the utilization of comparison imaging. We created a series of queries using structured query language (SQL) to probe the text of radiology reports stored within a massive database. The queries were designed to determine if comparison imaging was utilized, unavailable, or not mentioned at all. The SQL queries were also used to estimate the frequency with which comparisons were used overall and per imaging modality, and to establish trends in their utilization from 1997 to 2004.
Materials and Methods
PACS and RIS
Approximately 1.8 million digital diagnostic imaging procedures were performed at the Hospital of the University of Pennsylvania (HUP) and archived on its PACS (GE Centricity 2.0; GE Medical Systems, Waukesha, WI, USA) from 1997 through May of 2004. Corresponding reports were archived on a radiology information system (RIS; IDXrad v9.6; IDX, Burlington, VT, USA), with secondary copies of each report delivered from the RIS to the PACS via a Mitra broker (Agfa, Mortsel, Belgium).
Although IDXrad v9 is the primary archive of reports at HUP, it is the last generation of a series of products based on MUMPS, a hierarchical database management system that is not well suited for the efficient on-line transaction processing to which today's database users have grown accustomed. Modern relational database management systems (RDBMS), such as those offered by Oracle, Sybase, and Microsoft, now form the basis for the newer-generation clinical IT products, and are easily accessible via the standard structured query language (SQL).
Research Database Implementation
For this investigation, we sought to mine the textual contents of millions of radiology reports. Given the difficulty of interfacing with the PACS and RIS and the load that indexing and searching would likely place on these mission-critical operational systems, we chose to establish a stand-alone research RDBMS that would enable SQL-based access to a duplicate corpus of reports and that would serve as a research resource for the department. We opted for Oracle 9i Standard Edition as the database server, installing it on a dual-processor Pentium III Xeon server running RedHat Enterprise Linux 3. In addition, we wished to extend the standard capabilities of Oracle with a more sophisticated information retrieval (IR) engine whose indexing and searching would better enable access to narrative text. As such, we installed Oracle's interMedia product, an off-the-shelf IR engine intended to robustly index large volumes of text and which extends SQL to enable flexible and powerful text searching.
Report Duplication
To load the database, we created a custom Perl script, using the Perl DBI database interface module (http://dbi.perl.org), to connect to the PACS database, establish a mapping between the two database schemata, and perform the duplication of reports. This script was first scheduled to run weekly to gather data accumulated in the PACS over the course of the week. The script was also run retrospectively to gather all reports since the PACS went live in 1997. The complete import of some 1.8 million historical reports was completed in 6 h.
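The duplication step can be pictured as a read, map, and write loop. The following is only a minimal sketch in Python rather than the Perl DBI script described above: it uses in-memory SQLite databases as stand-ins for the PACS and research databases, and the source table and column names (`pacs_reports`, `rpt_id`, `rpt_body`, `exam_modality`) are hypothetical. Only the destination column names are taken from Table 1.

```python
import sqlite3

# Hypothetical mapping from source (PACS) columns to research-schema columns.
COLUMN_MAP = {
    "rpt_id": "GE_REPORT_ID",
    "rpt_body": "REPORT_TEXT",
    "exam_modality": "MODALITY",
}


def duplicate_reports(src, dst):
    """Copy every report row from src to dst, renaming columns via COLUMN_MAP."""
    src_cols = ", ".join(COLUMN_MAP)
    dst_cols = ", ".join(COLUMN_MAP.values())
    placeholders = ", ".join("?" for _ in COLUMN_MAP)
    rows = src.execute(f"SELECT {src_cols} FROM pacs_reports").fetchall()
    # INSERT OR REPLACE keeps a repeated (e.g. weekly) run from duplicating rows.
    dst.executemany(
        f"INSERT OR REPLACE INTO reports ({dst_cols}) VALUES ({placeholders})",
        rows,
    )
    dst.commit()
    return len(rows)


def make_demo_databases():
    """Build two tiny in-memory databases that stand in for PACS and research DB."""
    src = sqlite3.connect(":memory:")
    src.execute(
        "CREATE TABLE pacs_reports (rpt_id INTEGER, rpt_body TEXT, exam_modality TEXT)"
    )
    src.executemany(
        "INSERT INTO pacs_reports VALUES (?, ?, ?)",
        [(1, "No prior studies.", "CT"), (2, "Compared with prior exam.", "CR")],
    )
    dst = sqlite3.connect(":memory:")
    dst.execute(
        "CREATE TABLE reports "
        "(GE_REPORT_ID INTEGER PRIMARY KEY, REPORT_TEXT TEXT, MODALITY TEXT)"
    )
    return src, dst
```

Calling `duplicate_reports(*make_demo_databases())` copies both demo rows. Keying the destination on the report identifier is a design choice of this sketch, not something the source describes.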
Report Indexing
Although Oracle interMedia provides native storage and scalable management for all types of multimedia data, we focused on its text management and information retrieval capabilities. This functionality is dependent on the creation of specialized indices. As with most information retrieval approaches, the indexing of narrative unstructured text is accomplished by breaking passages into phrases or individual words on the basis of common “stop words” (“if,” “and,” “or,” “but,” “the,” etc.). In building our indices, however, we took care to remove “no” and “with” from the list of stop words so as to ensure that these words were themselves indexed and thus searchable.
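To illustrate why the stop list matters, the sketch below tokenizes a sentence with and without “no” and “with” on the stop list. It is a simplified stand-in for the interMedia indexer, not its actual behavior, and the short stop list shown here is an assumption made for the example.

```python
import re

# Illustrative default stop list (assumed); a real stop list is much longer.
DEFAULT_STOP_WORDS = {"if", "and", "or", "but", "the", "no", "with", "a", "is", "for"}

# As in the text, "no" and "with" are removed so that negations stay searchable.
CUSTOM_STOP_WORDS = DEFAULT_STOP_WORDS - {"no", "with"}


def index_terms(text, stop_words):
    """Return the lower-cased tokens that would be indexed for this text."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in stop_words]
```

With the default list, “No comparison is available” is indexed without “no”, so the phrase “no comparison” could not be distinguished from “comparison”. With the custom list, the negation survives indexing.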
Database Characteristics
We chose those attributes (see Table 1) that we deemed essential to the conduct of this and similar investigations, and used them to construct a simple schema for our new research database. The database contained all the original elements of the radiology reports, including patient name, study date, modality, history, medical record number, reading radiologist, and the text of the radiology report itself. In addition, each report in the database was assigned a unique identifier.
Table 1.
Attribute | Definition | Attribute | Definition |
---|---|---|---|
GE_REPORT_ID | NOT NULL NUMBERS | GE_EXAM_UID | VARCHAR(32) |
ACCESSION_NUM | NUMBER(38) | GE_PATIENT_CKEY | NUMBER(38) |
HISTORY | VARCHAR2(500) | GE_DEPARTMENT_CKEY | NUMBER(38) |
COMMENTS | VARCHAR2(500) | GE_PROCEDURE_CKEY | NUMBER(38) |
REPORT_TEXT | CLOB | LTA_STAT | CHAR(1) |
STUDY_DATE | DATE | ORDER_NUM | VARCHAR2(32) |
SCHED_DATE | NOT NULL DATE | GE_REQUESTING_ID | NUMBER(38) |
AQUISITION_DATE | DATE | GE_READING_RADIOLOGIST1_ID | NUMBER(38) |
APPROVAL_DATE | DATE | GE_READING_RADIOLOGIST2_ID | NUMBER(38) |
LAST_REPORT_UPDATE_DATE | DATE | GE_EXAM_STATUS_ID | NUMBER(38) |
PATIENT_NAME | VARCHAR2(100) | REPORT_RESULT_CODE | VARCHAR2(32) |
BIRTH_DATE | DATE | REPORT_SI_CODE | VARCHAR2(32) |
SEX | CHAR(1) | REPORT_ACR_CODE | VARCHAR2(32) |
RIS_PAT_ID | VARCHAR2(50) | REPORT_APP_CODE | VARCHAR2(30) |
STUDY_INSTANCE_UID | VARCHAR2(255) | NUM_IMAGES | NUMBER(38) |
MODALITY | VARCHAR2(10) | NUM_IMAGES_REJECTED | NUMBER(38) |
STUDY_CODE | VARCHAR2(30) | LAST_UPDATE_USER | NOT NULL VARCHAR2(16) |
GE_EXAM_CKEY | NUMBER(38) | LAST_UPDATE_DATE | DATE |
Query Development
The queries were developed using structured query language (SQL), and were designed to search for common phrases in radiology reports that could indicate the utilization of comparison imaging. For example, the queries contained phrases such as “prior study” or “previous film” to select for reports that contained these terms in the main text of the radiology report. Because many different words and phrases were often used to refer to the use of prior images, we needed a query method that could select for many terms at the same time. The implementation of Oracle's interMedia option allowed us to do this with the “CONTAINS” command, which enabled us to search for multiple terms and phrases within the text of the radiology reports. An example of such a query would be: select all of the radiology reports that contain the phrase “prior study” or “prior exam.” A query such as this could be expanded to include or exclude many more terms.
Using a combination of standard and interMedia-based SQL commands, we developed three separate queries. The first query was designed to search for radiology reports indicating the utilization of prior or comparison imaging. This was called the PRIORS query. The second query was designed to search for reports where comparison or prior imaging was unavailable. This was called the NO PRIORS query. The final query was designed to search for reports that had no mention of comparison or prior imaging at all. This was called the NO MENTION query. All three queries were mutually exclusive; that is, each query selected for a unique set of radiology reports with no overlap among the queries.
Query Design
In designing the PRIORS query, we first created a list of terms that could potentially identify or refer to reports that used comparison imaging. We realized that, depending on the context in which the words were used, certain words or phrases could either indicate that comparison imaging was utilized or alternatively that comparison imaging was not available. For example, the word “comparison” could indicate the utilization of prior imaging if used in this context: “A CT scan from July, 1999 was used for comparison.” The word “comparison,” however, could also be used to indicate that prior films were unavailable if used in this context: “no comparison CT examinations were available.” Because the word “comparison” is contained in both of the above phrases, a query containing this word would select both reports. Thus, to distinguish between the two possibilities, we designed our queries to subtract reports that contained negative modifiers of certain words in our queries. Radiology reports that contained words such as “no” or “without” preceding a word or phrase that would normally indicate the use of prior imaging, such as “comparison,” were subtracted from the query. An example of this type of query would be: select all reports that contain “comparison” but not “no comparison,” using the appropriate SQL syntax. This type of query would be able to select reports from the database that indicated the utilization of comparison images, but deselect reports where comparison images were unavailable.
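The subtraction of negated terms can be sketched as follows. This is a Python approximation using regular expressions, not the interMedia SQL syntax used in the study, and the function names are hypothetical.

```python
import re

NEGATORS = ("no", "without")  # negative modifiers named in the text


def mentions_term(text, term):
    """True if `term` occurs as a whole word or phrase (case-insensitive)."""
    return re.search(rf"\b{re.escape(term)}\b", text, re.IGNORECASE) is not None


def mentions_negated_term(text, term):
    """True if `term` is immediately preceded by a negator, e.g. "no comparison"."""
    negators = "|".join(NEGATORS)
    pattern = rf"\b(?:{negators})\s+{re.escape(term)}\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


def selects_as_prior(text, term="comparison"):
    """Select reports that contain `term` but not its negated form."""
    return mentions_term(text, term) and not mentions_negated_term(text, term)
```

Because the negator must sit directly before the term, a phrase such as “without any available films for comparison” is still selected. That is the same kind of miss the Discussion describes for the PRIORS query.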
In developing the PRIORS query, we realized that many different terms could be used to refer to priors. For example, the word “prior” could be juxtaposed with any one of many words to indicate the use of comparison images. Specifically, “prior” could be combined with a word corresponding to an exam, such as “examination” or “radiograph,” to make the phrase “prior examination” or “prior radiograph,” which could be used to refer to comparison utilization. Likewise, “prior” could be combined with words corresponding to a modality, such as “CT” or “ultrasound” to make the phrase “prior CT” or “prior ultrasound,” which could also refer to comparison utilization. In addition, “prior” could be combined with anatomical regions of the body, such as “head” or “chest,” as in “prior head CT” or “prior chest x-ray” to indicate use of comparisons. We also realized that certain terms could indicate use of comparisons without any juxtaposition with words such as “prior.” For example, terms corresponding to time, such as “yesterday” or “day before,” as in “the CHF has improved since yesterday” or “the effusion is worse than the day before” could be used to refer to comparison utilization. Similarly, other terms such as “interval change” or “continued,” as in “there is no interval change in the fracture from previously” or “there is a continued pneumothorax,” could indicate the utilization of comparisons without any juxtaposition with words such as prior. Therefore, these terms were included in the query as well. A full list of terms and how they were constructed within the PRIORS query is listed and explained in Figure 1.
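The combinatorial construction of search phrases described above can be sketched in a few lines. The word lists below contain only the examples quoted in the text (plus “previous,” which appears in an earlier example); they are not the full lists from Figure 1, and the function name is hypothetical.

```python
from itertools import product

MODIFIERS = ["prior", "previous"]
EXAM_WORDS = ["examination", "radiograph"]
MODALITY_WORDS = ["CT", "ultrasound"]
ANATOMY_WORDS = ["head", "chest"]
# Terms that indicate comparison without a modifier such as "prior".
STANDALONE_TERMS = ["yesterday", "day before", "interval change", "continued"]


def expand_phrases():
    """Build candidate search phrases from the word lists above."""
    phrases = []
    # modifier + exam word or modality, e.g. "prior radiograph", "prior CT"
    for modifier, noun in product(MODIFIERS, EXAM_WORDS + MODALITY_WORDS):
        phrases.append(f"{modifier} {noun}")
    # modifier + anatomical region + modality, e.g. "prior head CT"
    for modifier, region, modality in product(MODIFIERS, ANATOMY_WORDS, MODALITY_WORDS):
        phrases.append(f"{modifier} {region} {modality}")
    return phrases + STANDALONE_TERMS
```

Generating phrases from small word lists keeps the query maintainable: adding one modality word adds every combination with each modifier and region.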
The second query that we designed was the NO PRIORS query. The purpose of this query was to select reports where prior or comparison imaging was unavailable. To do this, we compiled a list of various terms that indicated the unavailability of priors, and incorporated them into the query. As described above, this query searched for terms with negative modifiers of words that would normally indicate the use of priors, such as “without comparison” or “no prior.” A list of terms used in the final NO PRIORS query is listed in Figure 2.
The third and final query was the NO MENTION query. The purpose of this query was to select reports that had no mention of prior or comparison imaging at all. The general schema of this query was to search for reports in the database that contained neither the PRIORS query nor the NO PRIORS query (Figure 3). In other words, this query selected for reports that did not contain words listed in either the PRIORS or the NO PRIORS query. An example of such a query would be: select all reports that do not contain “comparison” or “no comparison.”
Because we were interested in determining the overall access to prior images and their respective use at our institution, we wanted to differentiate the use of prior imaging for comparison vs. that of prior reports. To achieve this, we added a group of query terms designed to specifically select radiology reports that indicated the use of comparison imaging and not of prior reports. Put otherwise, the purpose of the query terms was to deselect reports containing phrases such as “comparison was made to the report” or “compared with the written report” from the PRIORS query. Instead, these reports were counted as priors unavailable and incorporated into the NO PRIORS query. The full query language for these terms is listed in Figure 4.
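Taken together, the three queries act as a three-way classifier with mutually exclusive outputs. The sketch below shows one way to express that logic in Python. The term lists are short illustrative subsets assumed for this example (the full lists are in Figures 1, 2, and 4), and the order of the checks is a choice of this sketch rather than a description of the actual SQL.

```python
import re

# Illustrative term lists only (assumed subsets of the real query terms).
PRIOR_TERMS = ["comparison", "compared", "prior study", "prior exam", "interval change"]
NO_PRIOR_TERMS = ["no comparison", "without comparison", "no prior", "no priors"]
REPORT_ONLY_TERMS = [
    "comparison was made to the report",
    "compared with the written report",
]


def _contains_any(text, phrases):
    """True if any phrase occurs in the text as a whole word or phrase."""
    return any(
        re.search(rf"\b{re.escape(p)}\b", text, re.IGNORECASE) for p in phrases
    )


def classify(text):
    """Assign a report to exactly one of the three query categories."""
    # Unavailable priors and report-only comparisons are checked first, so a
    # report can never be counted in more than one category.
    if _contains_any(text, NO_PRIOR_TERMS) or _contains_any(text, REPORT_ONLY_TERMS):
        return "NO PRIORS"
    if _contains_any(text, PRIOR_TERMS):
        return "PRIORS"
    # NO MENTION is the complement of the other two categories.
    return "NO MENTION"
```

Since `classify` returns a single label per report, the category counts always sum to the number of reports, mirroring the requirement that the three queries have no overlap.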
Query Refinement
After the queries were created, they underwent a rigorous iterative refinement on a random sample of reports to verify and improve their accuracy. For this iterative assessment, we tested the queries on a random sample set of reports with human review as the reference standard. To randomize the reports selected from the Oracle database, we constructed a database “view” which, whenever accessed, randomized the row order of the reports table based on the system time. Thus, any selection from the view would yield a pseudorandom set of reports.
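As a rough analogue of the randomizing view, the sketch below draws a pseudorandom sample of report identifiers, seeding the generator from the system clock unless a seed is supplied. It is a Python illustration only; the function name and the optional `seed` parameter are assumptions made for reproducibility in testing, not part of the original method.

```python
import random
import time


def random_sample(report_ids, sample_size, seed=None):
    """Return `sample_size` distinct report IDs in pseudorandom order."""
    # Seed from the system time by default, as the database view did.
    rng = random.Random(time.time_ns() if seed is None else seed)
    return rng.sample(list(report_ids), sample_size)
```

For example, `random_sample(range(1, 1_793_755), 100)` would pick 100 distinct identifiers from a range the size of the full report table.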
For each iteration, we assessed the queries on a new random sample of 100 reports, which provided a confidence interval of about 3%.15 After the random set of 100 reports was obtained for each query, they were subjected to human review, and subsequently scored and tabulated in a Microsoft Excel spreadsheet. The reports were then placed into one of three categories: Comparisons Utilized, Comparisons Unavailable, or No Mention.
Each report was read by a medical student (P.L.) and assigned into one of the above categories based on what was contained within the text of the radiology report. Radiology reports that contained phrases indicating the use of comparisons as in “A previous MRI was used for comparison” were categorized into Comparisons Utilized. Radiology reports with phrases indicating the unavailability of comparisons as in “No priors are available for comparison” were categorized into Comparisons Unavailable. Radiology reports that indicated no mention of comparisons at all were categorized into No Mention. This scoring process was performed for each set of randomly generated reports from each query, and the final results were tabulated on a spreadsheet.
We first tested the PRIORS query, then the NO PRIORS, and finally the NO MENTION query. Based on the results of the testing, the queries were modified to include and/or remove various terms and phrases in an attempt to improve their precision and recall. For example, terms such as “interval change” or “improving” were added to the queries after we noticed that many radiology reports referred to prior studies using those key words. An example of this is “there is no interval change in the x-ray from May 2001” or “the bibasilar atelectasis is improving since two days ago.” Other terms and phrases were added or removed in the same way during the testing process.
After the queries were revised, they were retested on a new set of 100 random reports. These reports were again subjected to human review and scored, and the queries were modified if necessary based on the new results. This process was repeated until the precision and recall of the queries did not improve significantly between iterations.
Query Validation
At the conclusion of the refinement process, the PRIORS, NO PRIORS, and NO MENTION queries were finalized based on the results of the multiple tests. For validation, each finalized query was applied to a new random sample of 200 reports, which provided a confidence interval of 1.5%,15 and scored by the same process outlined above. Results from this validation process were then tabulated as our final data, and subsequently used to calculate the recall and precision of the three queries. The final version of all three queries with complete syntax is shown in Figure 5.
Estimate Frequency of Comparison Utilization
As soon as the final queries were defined and validated, they were applied to the Oracle database to quantify the utilization of comparison imaging in all radiology reports from our hospital over the past 8 years. Of the 1.8 million total reports contained in the database, we determined the absolute number of reports that the PRIORS, NO PRIORS, and NO MENTION queries each selected. The numbers generated from these queries were then used to estimate the frequency with which comparison imaging was utilized, not available, or not mentioned at all at our hospital.
Estimate Frequency of Comparison Utilization per Modality
The final queries were also applied to the database to estimate the relative usage of comparison studies per imaging modality. A list of the major imaging modalities grouped by the database is shown in Table 2. One of the modality groupings, however, was undefined by the database and listed as “other.” This “other” category represented 6.67% (119,626 reports) of the total reports in the database, and was found to consist primarily of mammograms and myocardial perfusion scans. From a random sample of 200 reports taken from this category, it was determined that 72.0% of reports consisted of mammograms and mammograms with breast ultrasounds (144/200 reports), and 25.0% of reports consisted of myocardial perfusion scans (50/200 reports). The remaining 3.0% of the reports consisted of miscellaneous studies, including two thyroid scans, a whole-body positron emission tomography (PET) scan, an abdominal ultrasound, a review of an outside CT/MRI/Bone Scan, and a plain film of the abdomen (6/200 reports).
Table 2.
Modality | Prior | No Prior | No Mention | Total Count |
---|---|---|---|---|
Neuroangiography | 1,198 (10.50%) | 49 (0.43%) | 10,163 (89.07%) | 11,410 (100.00%) |
Computed Radiography | 425,409 (41.90%) | 134,529 (13.25%) | 455,262 (44.85%) | 1,015,200 (100.00%) |
Computed tomography | 105,454 (36.65%) | 93,731 (32.57%) | 88,586 (30.78%) | 287,771 (100.00%) |
Magnetic resonance | 55,931 (28.28%) | 46,652 (23.59%) | 95,218 (48.14%) | 197,801 (100.00%) |
Nuclear medicine | 392 (37.87%) | 246 (23.77%) | 397 (38.36%) | 1,035 (100.00%) |
Mammography | 67,844 (77.42%) | 10,447 (11.92%) | 9,344 (10.66%) | 87,635 (100.00%) |
Myocardial perfusion | 1,073 (4.35%) | 62 (0.25%) | 23,529 (95.40%) | 24,664 (100.00%) |
PET | 477 (54.51%) | 250 (28.57%) | 148 (16.91%) | 875 (100.00%) |
Radiofluoroscopy | 358 (5.50%) | 492 (7.55%) | 5,664 (86.95%) | 6,514 (100.00%) |
Ultrasound | 32,507 (21.95%) | 49,238 (33.24%) | 66,374 (44.81%) | 148,119 (100.00%) |
Angiography | 1,667 (34.09%) | 90 (1.84%) | 3,133 (64.07%) | 4,890 (100.00%) |
Various^a | 1,645 (20.98%) | 1,242 (15.84%) | 4,953 (63.18%) | 7,840 (100.00%) |
Total | 693,955 (38.69%) | 337,028 (18.79%) | 762,771 (42.52%) | 1,793,754 (100.00%) |
^a Consists of breast MR, digital fluoroscopy, unspecified modalities, and miscellaneous reports from the “other” modality.
In this sample set from the “other” category, only 6.0% of the myocardial perfusion scans utilized comparison imaging (3/50 reports). There was no mention of comparisons in the remaining 94.00% of myocardial perfusion scans (47/50 reports). On the other hand, comparisons were utilized in 86.81% of the mammograms (125/144 reports), were unavailable in 7.64% (11/144 reports), and not mentioned in 5.56% (8/144 reports).
Estimation of Mammograms and Myocardial Perfusion Scans
Because mammography and myocardial perfusion scans were not listed as separate modalities in the database, but were part of the “other” category as described above, we created a series of queries that estimated the number of mammograms and myocardial perfusion scans contained within the “other” category. For mammography, the query searched the text of radiology reports for the words “breast” or “breasts,” as these words were very sensitive for detecting mammography reports. Because this query was limited to the “other” category, it was fairly specific as well, as the remaining reports within this category (myocardial perfusion scans and a few miscellaneous reports) never contained the words “breast” or “breasts” based on the sample of 200 reports looked at above. Additionally, this query excluded reports that contained the words “sestamibi” or “ventricular” or “myocardial,” because these words were exclusively used in myocardial perfusion scans. The latter three words, therefore, were also used in a separate query to select for myocardial perfusion scans. The full series of queries used to isolate these two imaging modalities from the “other” category is listed along with a general schematic in Figure 6.
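The keyword rules for splitting the “other” category can be sketched as a small Python function. The keywords are those named in the text; the function names and the “miscellaneous” fallback label are assumptions of this example, and the actual queries (Figure 6) were written in SQL.

```python
import re

MAMMO_WORDS = ("breast", "breasts")
PERFUSION_WORDS = ("sestamibi", "ventricular", "myocardial")


def _has_word(text, words):
    """True if any of the words occurs as a whole word (case-insensitive)."""
    return any(re.search(rf"\b{w}\b", text, re.IGNORECASE) for w in words)


def classify_other(text):
    """Label a report from the "other" modality grouping by keyword."""
    # Perfusion keywords are tested first, which also implements the exclusion
    # of those words from the mammography query.
    if _has_word(text, PERFUSION_WORDS):
        return "myocardial perfusion"
    if _has_word(text, MAMMO_WORDS):
        return "mammography"
    return "miscellaneous"
```

These rules are only meaningful inside the “other” grouping: in the full database, words such as “ventricular” or “breast” would also appear in reports from other modalities.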
We then combined these queries with the PRIORS, NO PRIORS, and NO MENTION queries to estimate the frequency that comparisons were utilized, not available, and not mentioned among mammography and myocardial perfusion scans.
We also created our own category, which we termed various, in which we lumped radiology reports from minor and undefined modalities, breast MRI, as well as the few miscellaneous reports from the “other” modality grouping (Table 2).
Analysis of the NO MENTION Query
For the NO MENTION query, we determined the prevalence of those reports that actually had priors available on IDX records vs. those that did not. To do this, we selected 100 random reports using the NO MENTION query, which contained the report text, patient medical record number, modality, and study date. We then looked at the actual IDX records, which dated back to 1997, to see if there was a relevant prior available for each report selected by the query. Relevant priors were considered to encompass the same or similar region of interest, containing information that was thought to be important or meaningful to the radiologist. For example, a prior head CT was considered to be irrelevant if the current report was of a chest radiograph. Results from this analysis were tabulated onto a spreadsheet and presented using a flow chart diagram (Figure 13).
Results
Utilization of Comparisons
When applied to the database, the PRIORS query selected 693,955 radiology reports (38.69%). The NO PRIORS query selected 337,028 reports (18.79%). The NO MENTION query selected 762,771 reports (42.52%). The total number of reports selected by all three queries combined equaled 1,793,754, or 100% of the reports within the database (Figure 7).
Query Validation
In the final validation of the PRIORS query, comparisons were actually utilized in 94.0% (188/200) of the reports, were unavailable in 1.5% (3/200) of the reports, and not mentioned in 4.5% (9/200) of the reports. In the final validation of the NO PRIORS query, comparisons were utilized in 1.5% (3/200) of the reports, were unavailable in 98.0% (196/200) of the reports, and not mentioned in 0.5% (1/200) of the reports. In the final validation of the NO MENTION query, comparisons were utilized in 4.0% (8/200) of the reports, were unavailable in 0.0% (0/200) of the reports, and not mentioned in 96.0% (192/200) of the reports (Figure 8). Overall accuracy of the queries was 95.6%. The precision of the PRIORS, NO PRIORS, and NO MENTION queries was 94.0%, 98.0%, and 96.0%, respectively. The recall of the three queries was calculated to be 94.9%, 96.9%, and 95.6%, respectively.
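The precision figures follow directly from the validation counts, as the sketch below shows. The text does not state how the overall accuracy was computed; one assumption that reproduces the reported 95.6% is to weight each query's precision by the share of the database it selected (the unweighted figure, 576/600, would be 96.0%). The variable and function names are hypothetical.

```python
CATEGORIES = ("PRIORS", "NO PRIORS", "NO MENTION")

# Rows: query that selected the report.
# Columns: category assigned by human review (200 reports per query).
VALIDATION = {
    "PRIORS":     {"PRIORS": 188, "NO PRIORS": 3,   "NO MENTION": 9},
    "NO PRIORS":  {"PRIORS": 3,   "NO PRIORS": 196, "NO MENTION": 1},
    "NO MENTION": {"PRIORS": 8,   "NO PRIORS": 0,   "NO MENTION": 192},
}

# Number of reports each query selected from the full database.
SELECTED = {"PRIORS": 693_955, "NO PRIORS": 337_028, "NO MENTION": 762_771}


def precision(query):
    """Fraction of a query's validation sample that was correctly selected."""
    row = VALIDATION[query]
    return row[query] / sum(row.values())


def weighted_accuracy():
    """Precision of each query weighted by its share of the database (assumed)."""
    total = sum(SELECTED.values())
    return sum(precision(q) * SELECTED[q] / total for q in CATEGORIES)
```

Weighting by query size reflects the fact that an error rate in the largest query (NO MENTION) affects more reports than the same rate in the smallest (NO PRIORS).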
Utilization of Comparisons per Modality
Of all the imaging modalities, mammography utilized comparisons more than any other modality at 77.42%. This was followed by positron emission tomography (PET) at 54.51%, computed radiography at 41.90%, and nuclear medicine at 37.87%. Results for the remaining modalities are included in a complete list shown in Table 2 and Figure 9.
Priors were unavailable most frequently in ultrasound (33.24%). This was followed by computed tomography (32.57%), PET (28.57%), and nuclear medicine (23.77%). Results for the remaining modalities are included in a complete list shown in Table 2 and Figure 10.
Priors were not mentioned most frequently in myocardial perfusion scans (95.40%), which was followed by neuroangiography (89.07%), radiofluoroscopy (86.95%), and angiography (64.07%). Results for the remaining modalities are included in a complete list shown in Table 2 and Figure 11.
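The per-modality percentages are simple row-wise proportions of the counts in Table 2. A minimal sketch for a few rows is given below; the dictionary holds only a subset of the table, and the function names are hypothetical.

```python
# (prior, no prior, no mention) counts for a few rows of Table 2
TABLE_2 = {
    "Mammography": (67_844, 10_447, 9_344),
    "PET": (477, 250, 148),
    "Computed radiography": (425_409, 134_529, 455_262),
    "Myocardial perfusion": (1_073, 62, 23_529),
}


def percentages(counts):
    """Convert a row of counts to percentages of the row total."""
    total = sum(counts)
    return tuple(round(100 * c / total, 2) for c in counts)


def rank_by(column):
    """Modalities sorted by the percentage in `column` (0, 1, or 2), highest first."""
    return sorted(
        TABLE_2, key=lambda m: percentages(TABLE_2[m])[column], reverse=True
    )
```

Here `rank_by(0)` orders the modalities by how often priors were utilized, and `rank_by(2)` by how often priors were not mentioned.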
Temporal Trends in the Utilization of Comparisons
There was a steady increase in the use of comparison imaging during the period 1997–2004. In 1997, comparisons were utilized 31.03% of the time. For the first five months of 2004 (January–May), comparisons were utilized 42.34% of the time (Figure 12). This represents a 36.45% relative increase in the utilization of comparisons (an absolute increase of 11.31 percentage points) over the past eight years.
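The increase quoted above is relative to the 1997 baseline rather than a difference in percentage points. The arithmetic can be checked with a short sketch (the function name is hypothetical):

```python
def relative_increase(start_pct, end_pct):
    """Percent change of end_pct relative to start_pct."""
    return 100 * (end_pct - start_pct) / start_pct


# Utilization rose from 31.03% (1997) to 42.34% (January-May 2004).
change = relative_increase(31.03, 42.34)
print(f"{change:.2f}% relative increase")
```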
Analysis of the NO MENTION Query
Of the 100 random reports selected by the NO MENTION query, 26% (26/100) had relevant prior(s) on IDX records. Five of the 100 reports (5%) were attributable to query inaccuracy; that is, the report text had actually mentioned or implied the use of priors. In all five of these reports, one or more priors were available on IDX. In 74% (74/100) of the reports selected by the NO MENTION query, no relevant prior was available on IDX. Of these, 43% (32/74) had no prior at all on IDX records, and 57% (42/74) actually had prior(s) available, but they were deemed to be irrelevant (Figure 13).
Discussion
Results indicate that an SQL query of text can achieve high accuracy in estimating the utilization of prior imaging in radiology interpretation. Although all three queries were very precise, the precision of the PRIORS query was slightly lower than the other two. The precision of this query is defined by its ability to select reports that utilize comparison imaging. In the case of this query, 188 of the 200 selected reports actually utilized comparison imaging, yielding a precision of 94.0%.
In this query, comparison studies were unavailable in 3 of 200 (1.5%) reports. These reports were selected because they contained terms used in the query, although priors were actually unavailable. The following is an example of a phrase from such a report: “without any available films for comparison.” In this case, the report was selected because it contained the term “comparison,” which was part of the PRIORS query. Although our final query contained similar terms such as “without comparisons” and “without films” to indicate the unavailability of priors, the specific phrase “without any available films” was not included. Because alterations in sentence structure and a multitude of words and phrases could be used to represent the unavailability of priors, it would be difficult and impractical to include phrases that were rarely encountered during testing. Thus, we chose to include only those terms that were encountered more than once during the testing process. Including more terms might increase the precision of this query slightly, but at the expense of a longer and more complicated query.
Finally, in this query, there was no mention of comparison imaging in 9 of 200 reports (4.5%). These reports were selected because they contained terms outlined in the query, but in a context that did not indicate the utilization of comparison imaging. For example, the following report is taken from a CT scan of the coronary arteries: “Total coronary artery calcification score of 1 which places this patient in the 70th percentile when compared with women of similar age without known coronary artery disease.” In this example, the report was selected because it contained the word “compared,” which was part of the query, but not used in a context to refer to comparison images. Examples such as these slightly reduced the precision of this query.
Query Precision
Of the three, the NO PRIORS query was the most precise, as 196 of 200 reports (98.0%) indicated the unavailability of comparison images. In this query, comparisons were utilized in only 3 of 200 reports (1.5%). In two of the reports, a particular imaging modality or study was unavailable, and therefore comparison was made to another imaging modality or study instead. The following CT scan of the chest is an example: “No prior lung scans are available for comparison, correlation is made to chest x-ray on 2/4/04.” In this case, the report was selected by the query because it contained the term “no prior,” which was part of the query, to indicate that prior CT scans were unavailable for comparison. However, as the following sentence indicated, a comparison was made to a chest x-ray that was available instead. In the other example, prior studies were initially unavailable, but then later retrieved from an outside institution for comparison: “There are no prior studies for comparison.... Outside films dated 19-Jul-2002 from Phoenixville Hospital have been received for comparison.” Finally, there was no mention of comparisons in only one report selected by this query of 200 (0.5%). Therefore, this factor lowered the precision of the query minimally.
The NO MENTION query was also very precise, although slightly lower at 96.0%. In this query, 8 of 200 reports (4.0%) indicated the use of comparisons. These reports contained phrases that indicated the utilization of comparisons, although they contained none of the query terms from the PRIORS or NO PRIORS query. An example of such a phrase is: “patchy infiltrates in the right upper and right lower lobes which were not present on 26 February 2004.” This phrase is a good example of how a reference to prior image utilization could be constructed by using a component of time, as in “There is no significant change from June 1998” or “These areas probably represent metastases and are much more poorly seen on today's film then they were on April 24, 1998.” Although our final query included many terms with reference to time, such as “yesterday” or “previous day” (see Figure 1), the language used to construct such phrases proved to be exceptionally varied, and thus it was difficult to design a query that could be universally inclusive. Another type of example is: “a portable AP film of the chest shows a 20% right pneumothorax which was not present prior to removal of the right chest tube.” This is a good example of a report that indicated the use of comparisons in a subtle way. Although not directly stating the use of priors, it is clear that a comparison was made to a previous film before the “chest tube was removed.” Because of reports such as these, which proved very difficult to select for in the PRIORS query, the precision of the NO MENTION query was slightly reduced. Finally, comparisons were unavailable in zero reports of 200 selected by this query (0.0%).
Query Recall
All three queries showed excellent recall, with rates of approximately 95% or higher. The recall of a query was defined as the number of reports of a given category that it selected, relative to the estimated total number of reports of that category selected by all three queries. For the PRIORS query, for example, recall was the number of selected reports that utilized comparisons divided by the estimated total number of reports utilizing comparisons across all three queries. The PRIORS query had slightly lower recall than the other queries at 94.9%, selecting 656,564 of an estimated 691,949 reports that utilized comparisons. In other words, some reports that utilized comparisons were selected not by the PRIORS query but by the NO PRIORS and NO MENTION queries. As indicated above, 4.0% of the reports in the NO MENTION query (8/200) actually utilized comparisons during validation. Moreover, because the NO MENTION query selected the largest number of reports (762,771), it made the larger contribution to reducing the recall of the PRIORS query. By contrast, because the NO PRIORS query selected the fewest reports (337,028), its slight imprecision in selecting reports that actually used comparisons (1.5%) had less of an effect on the recall of the PRIORS query.
Recall of the NO MENTION query was somewhat higher at 95.6%, selecting 727,930 of an estimated 761,046 reports that had no mention of comparisons. Although 4.5% of the reports in the PRIORS query (9/200) actually had no mention of comparisons, the NO PRIORS query contained only one such report (0.5%). As a result, the overall number of reports with no mention of comparisons that were selected by the other two queries was fairly low, which explains the relatively good recall of this query.
Recall of the NO PRIORS query was highest at 96.9%, selecting 330,282 of an estimated 340,759 reports in which priors were unavailable. This query had an exceptionally high recall rate because comparisons were unavailable in only 1.5% and 0.0% of the reports in the PRIORS and NO MENTION queries, respectively. However, because those queries accounted for a very large proportion of the radiology reports in the database, even a small misclassification rate in them translated into an appreciable number of missed reports, which lowered the recall of the NO PRIORS query.
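The recall figures above can be reproduced from the reported counts with a short sketch. The counts are taken from the text; the variable names are illustrative, and treating overall accuracy as the sum of correctly selected reports divided by the total number of reports is an assumption that appears consistent with the 95.6% accuracy reported in the abstract.

```python
# Counts reported in the text for each query: reports correctly selected
# (numerator) and the estimated total of reports truly in that category
# across all three queries (denominator).
counts = {
    "PRIORS": (656_564, 691_949),
    "NO MENTION": (727_930, 761_046),
    "NO PRIORS": (330_282, 340_759),
}

# Recall = correctly selected reports / estimated true total for the category.
recall = {name: hits / total for name, (hits, total) in counts.items()}

# Assumption: overall accuracy is all correctly selected reports divided by
# all reports in the database.
total_correct = sum(hits for hits, _ in counts.values())
total_reports = sum(total for _, total in counts.values())
overall_accuracy = total_correct / total_reports

for name, value in recall.items():
    print(f"{name}: {value:.1%}")
print(f"Overall accuracy: {overall_accuracy:.1%} of {total_reports:,} reports")
```

The three denominators sum to 1,793,754, the total number of reports in the database, which suggests that the estimated category totals partition the full set of reports.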
Overall Rate of Comparison Usage and Analysis of NO MENTION
Results from the comparisons among imaging modalities indicated that comparisons were utilized in roughly 39% of the reports, were unavailable in 19%, and were not mentioned in 43% (Figure 7). Thus, although comparison images were utilized in a substantial number of reports, they were either unavailable or not mentioned in the majority (61%). In our analysis of the NO MENTION query, we found that most of these reports (74%) had no relevant priors available in IDX records dating back to 1997. This percentage might decrease if records from before 1997 were checked as well. Nevertheless, a considerable percentage (26%) of reports from the NO MENTION query had one or more relevant priors available. Five of the 100 reports reviewed (5%) were attributable to query inaccuracy; that is, these reports actually indicated the utilization of comparisons. After excluding them, the percentage of reports that had relevant priors available but no mention of comparison utilization in the report text was 22% (21/95). Three scenarios could account for this percentage: (1) radiologists are utilizing comparisons but not stating this in the report text; (2) radiologists are not utilizing comparisons although they are available; or (3) radiologists do not realize that relevant comparisons are available, or fail to look for them. It is likely that all of these scenarios contribute. However, it is unclear which is most prevalent, and this may ultimately depend on the practice pattern of the radiologist, the imaging modality, and the reason for the study. It should also be noted that, with the exception of mammography, there is no definite standard on whether and when comparisons should be utilized. Thus, many radiologists may opt not to use priors when they believe doing so is unlikely to alter the interpretation.
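The adjustment from 26% to 22% can be made explicit with a small worked example. The sample size of 100 is inferred from the fractions 5/100 and 21/95 in the text, and the sketch assumes that all five misclassified reports were among the 26 with a relevant prior, since that is the only reading that yields 21/95.

```python
reviewed = 100       # NO MENTION reports checked against IDX records (inferred)
with_prior = 26      # reports with a relevant prior on record (26%)
misclassified = 5    # reports that actually described a comparison

# Assumption: the 5 misclassified reports all belong to the 26 with priors,
# so they are removed from both the numerator and the denominator.
true_no_mention = reviewed - misclassified
unmentioned_with_prior = with_prior - misclassified
rate = unmentioned_with_prior / true_no_mention

print(f"{unmentioned_with_prior}/{true_no_mention} = {rate:.0%}")
```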
Use of Comparisons by Modality
Based on the results per imaging modality, comparisons were utilized in roughly 77% of mammography reports, more frequently than in any other modality. This finding is consistent with the American College of Radiology guidelines, which state that all mammography reports should contain reference to comparison images when available.6 Comparisons were unavailable in about 12% of mammograms. In these reports, patients may have been undergoing their first mammogram, or may have had their prior exams at an outside institution from which the images proved difficult to obtain. The latter situation was frequently seen during validation of the mammography query, as in the following example: “No comparisons were available... The outside films have been requested but have never been obtained.” Finally, there was no mention of comparisons in about 11% of mammograms. This most likely reflected patients receiving mammography for the first time, as 7 of 8 mammograms with no mention of comparisons (87.5%) were found to be baseline exams during query validation.
After mammography, PET ranked second in comparison frequency with 54% of reports containing reference to prior images. A likely explanation for the relatively high rate of comparison utilization in PET scans is that they are typically performed in oncology patients, who often have prior images already available from other modalities suggesting a malignant process.15 Furthermore, because PET offers little structural information, correlation with other imaging modalities is frequently performed to gain anatomical insight.16
After PET, computed radiography ranked third in comparison frequency, with about 42% of reports containing reference to prior films. This result may seem counterintuitive because comparison films are available for the majority of computed radiography studies, as conventional radiographs and chest x-rays are routinely performed in many patients. In fact, conventional radiography represented over half of the radiology reports in the database (1,015,200 reports). Furthermore, priors were unavailable in computed radiography only 13% of the time, a lower rate than in the other major imaging modalities. One possible explanation is that many radiologists may not feel the need to mention or utilize comparison images when reading a normal chest x-ray or other conventional radiograph. This explanation is supported by our data, which show that comparisons were not mentioned for conventional radiographs about 45% of the time, a fairly high rate compared to the other major imaging modalities.
Nuclear medicine scans (other than PET) utilized comparisons next most often at 39%, in the middle of the range across modalities. The use of comparisons in nuclear medicine is likely shaped by two opposing factors. On the one hand, patients who undergo these scans are more likely to have prior images available for comparison, usually because these studies are ordered after more conventional modalities (i.e., plain film, CT, MRI, US) have been performed. On the other hand, because nuclear medicine scans are performed less frequently, most patients are receiving them for the first time, and prior nuclear medicine scans will not be available. The use of comparisons therefore ultimately depends on whether the radiologist is willing to use priors from other modalities or only relevant nuclear medicine scans when available.
After nuclear medicine, comparisons were utilized next most often in CT scans, at about 38% of the time. Comparisons were utilized more often in CT than in MRI, where they were utilized only about 28% of the time. This is likely because CT scans were performed more frequently than MRI (287,771 vs. 197,801 reports), so patients were more likely to have a previous CT available than a previous MRI, which many patients never receive. A similar explanation likely accounts for the low rate of comparison utilization in ultrasound, where priors were utilized only 22% of the time: this modality was performed less frequently (148,119 reports), so many of these patients were likely receiving a baseline exam. This explanation is supported by our data, which show that comparisons were unavailable in about 33% of ultrasounds, the highest rate among all modalities. Moreover, ultrasounds are typically performed on a specific region of interest, so even if a prior ultrasound is available, it may not be of the same region. In contrast, CT scans and MR images typically encompass a larger area, such as the chest, abdomen, or pelvis, or sometimes a combination of these. Relevant comparison images are therefore more likely to be available in the latter modalities.
In addition to ultrasound, comparisons were reported to be unavailable at a fairly high rate in CT scans, nearly 33% of the time. Although patients are more likely to have a prior CT available than a prior MR, as explained above, this suggests that radiologists are more likely to search for priors when interpreting CTs than MRIs. In other words, radiologists reading CTs may be more apt to mention the use of comparisons than those reading MRIs. This is supported by our data, which show that there was no mention of comparisons in 48% of MRIs, higher than in all the other major imaging modalities (CT, computed radiography, ultrasound, and mammography), whereas CT had no mention of comparisons in approximately 31% of reports.
Of all the imaging modalities, myocardial perfusion scans most frequently had no mention of comparison imaging, at approximately 95% of the time. Because the majority of these patients were likely receiving baseline images, the report text almost never mentioned comparison imaging. Another possible contributing factor was the apparently frequent use of macros in these reports, in which the status of comparison availability was rarely mentioned. In addition, reports for neuroangiography, radiofluoroscopy, and angiography had relatively high rates at which comparisons were not mentioned (89%, 87%, and 64% of the time, respectively). Similarly, this is likely because many of these patients were undergoing these studies for the first time, so relevant priors of the same modality may not have existed.
Temporal Trends in Utilization of Priors
Finally, the queries were applied to the database to determine the yearly utilization of comparison imaging from 1997 to 2004. The data showed a significant increase in the utilization of comparisons during this period, as priors were used 36% more frequently in 2004 than in 1997. The increase was most apparent from 1997 to 2001. It can be explained by several factors. First, with the incorporation of PACS at our institution in 1997, it became easier and more feasible to make comparisons to previous exams. Digital images are easier to locate than actual films, which can be time-consuming to find in a storage facility or library. Second, as more electronic images were added to the PACS over time, more prior electronic images became available for comparison. In other words, it would have been more difficult to make electronic comparisons in 1997, when the PACS had just been installed, than in 2001, when more electronic images would have been available. Third, the American College of Radiology began releasing its standards for communication in 1991, with revisions in 1995, 1999, and 2001. In these standards, the ACR strongly advised that comparisons be used whenever available and that the report clearly document whenever comparisons were unavailable.6,7 These statements likely influenced the practice of diagnostic radiologists not only at our institution but also nationwide. Finally, concern about malpractice in radiology increased over that period, as more case reports surfaced in which radiologists lost malpractice suits after failing to adhere to the standards for communication. During this time, some authors in the literature therefore advised radiologists to adhere to the ACR standards for purposes of risk management.17–19
This study has some limitations. Our study may underestimate the actual usage of priors, as there may be cases in which comparisons were utilized but not mentioned in the body of the report. Additionally, validating with a larger sample size would have reduced the variability in determining query precision. In verifying the meaning of the NO MENTION query, it would have been ideal to search radiology records on the RIS prior to 1997, before the implementation of PACS at our institution. Finally, the queries in this study were based on commonly encountered language used by radiologists at our institution, and although we feel the language is fairly universal, some terms or phrases may not be entirely applicable to other centers.
Future Considerations
Our study provides indirect evidence that PACS may improve access to and increase the utilization of priors within the same institution. However, there is still a need to improve access to comparisons among different institutions. A nationwide electronic medical record number, or an electronic system that enables access to images among different institutions, is a possible means of addressing this problem. Improved access to priors could potentially save radiologists the time and cost of obtaining such films, increase diagnostic accuracy, and improve overall patient care.
In regard to these queries, we feel that they can be used by other institutions to determine the utilization of priors on an institution-specific basis, and potentially assess the impact of their PACS on comparison utilization. Moreover, queries such as these can be modified or tailored to search for other entities besides the utilization of priors, and used for retrospective research purposes. It is also possible that they can be applied toward the development of a research PACS—that is, a search engine specifically geared for querying radiology reports—that could provide relatively accurate answers for various research questions.
Conclusion
An SQL query of text can achieve greater than 95% accuracy in determining the utilization of comparison imaging in radiology reports. Of the three queries, the NO PRIORS query was the most precise, followed by the NO MENTION query and then the PRIORS query. When applied to the PACS database from HUP, our query results show that although a significant number of radiology reports utilized comparison imaging, comparison images were unavailable or not mentioned in the majority of reports. When there was no mention of comparison utilization in the report text, relevant priors were unavailable in most cases according to IDX records. Per modality, mammography utilized comparison imaging most frequently, followed distantly by PET and then by computed radiography. Priors were reported to be unavailable most often in ultrasound and then in CT. Priors were not mentioned most frequently in myocardial perfusion scans, neuroangiography, radiofluoroscopy, and angiography. From the implementation of PACS in 1997 to 2004, there was a significant increase in the utilization of comparison imaging, suggesting that PACS improves access to prior images.
References
- 1. White K, Smith WL. The role of previous radiographs and reports in the interpretation of current radiographs. Invest Radiol. 1994;29:263–265. doi: 10.1097/00004424-199403000-00002.
- 2. Aideyan UO, Berbaum K, Smith WL. Influence of prior radiologic information on the interpretation of radiographic examinations. Acad Radiol. 1995;2:205–208. doi: 10.1016/s1076-6332(05)80165-5.
- 3. Sumkin JH, Holbert BL, Herrmann JS, Hakim CA, Ganott MA, Poller WR, Shah R, Hardesty LA, Gur D. Optimal reference mammography: a comparison of mammograms obtained 1 and 2 years before the present examination. Am J Roentgenol. 2003;180:343–346. doi: 10.2214/ajr.180.2.1800343.
- 4. Thurfjell MG, Vitak B, Azavedo E, Svane G, Thurfjell E. Effect of sensitivity and specificity of mammography screening with or without comparison of old mammograms. Acta Radiol. 2000;41:52–56. doi: 10.1080/028418500127345884.
- 5. Christiansen CL, Wang F, Barton MB, Kreuter W, Elmore JG, Gelfand AE, Fletcher SW. Predicting the cumulative risk of false-positive mammograms. J Natl Cancer Inst. 2000;92:1657–1666. doi: 10.1093/jnci/92.20.1657.
- 6. ACR Standard for Communication: Diagnostic Radiology. Reston, VA: American College of Radiology; 2001.
- 7. ACR Standard for the Performance of Pediatric and Adult Chest Radiography. Reston, VA: American College of Radiology; 2001.
- 8. Callaway MP, Boggis CR, Astley SA, Hutt I. The influence of previous films on screening mammographic interpretation and detection of breast carcinoma. Clin Radiol. 1997;52:527–529. doi: 10.1016/s0009-9260(97)80329-7.
- 9. Quekel LG, Goei R, Kessels AG, Engelshoven JM. Detection of lung cancer on the chest radiograph: impact of previous films, clinical information, double reading, and dual reading. J Clin Epidemiol. 2001;54:1146–1150. doi: 10.1016/S0895-4356(01)00382-1.
- 10. Wilson TE, Nijhawan VK, Helvie MA. Normal mammograms and the practice of obtaining previous mammograms: usefulness and costs. Radiology. 1996;198:661–663. doi: 10.1148/radiology.198.3.8628851.
- 11. Bassett LW, Shayestehfar B, Hirbawi I. Obtaining previous mammograms for comparison: usefulness and costs. Am J Roentgenol. 1994;163:1083–1086. doi: 10.2214/ajr.163.5.7976879.
- 12. Channin DS. Is it time for ‘PACSter’? J Digit Imaging. 2001;14:52–53. doi: 10.1007/s10278-001-0002-3.
- 13. Ooijen PM, Bongaerts AH, Witkamp R, Wijker A, Tukker W, Oudkerk M. Multi-detector computed tomography and 3-dimensional imaging in a multi-vendor picture archiving and communications systems (PACS) environment. Acad Radiol. 2004;11:649–660. doi: 10.1016/j.acra.2004.03.001.
- 14. Reiner BI, Siegel EL, Hooper FJ, Pomerantz S, Dahlke A, Rallis D. Radiologists' productivity in the interpretation of CT scans: a comparison of PACS with conventional film. Am J Roentgenol. 2001;176:861–864. doi: 10.2214/ajr.176.4.1760861.
- 15. Berry CC. A tutorial on confidence intervals for proportions in diagnostic radiology. Am J Roentgenol. 1990;154:477–480. doi: 10.2214/ajr.154.3.2106207.
- 16. Hustinx R, Benard F, Alavi A. Whole-body FDG-PET imaging in the management of patients with cancer. Semin Nucl Med. 2002;32:35–46. doi: 10.1053/snuc.2002.29272.
- 17. Cascade PN, Berlin L. Malpractice issues in radiology: American College of Radiology Standard for Communication. Am J Roentgenol. 1999;173:1439–1442. doi: 10.2214/ajr.173.6.10584778.
- 18. Berlin L. Comparing new radiographs with those obtained previously. Am J Roentgenol. 1999;172:3–6. doi: 10.2214/ajr.172.1.9888727.
- 19. Berlin L. Must new radiographs be compared with all previous radiographs, or only with the most recently obtained radiographs? Am J Roentgenol. 2000;174:611–615. doi: 10.2214/ajr.174.3.1740611.