Published in final edited form as: Contemp Clin Trials. 2012 Sep 11;34(1):36–44. doi: 10.1016/j.cct.2012.09.002

Identifying and collecting pertinent medical records for centralized abstraction in a multi-center randomized clinical trial: The model used by the American College of Radiology arm of the National Lung Screening Trial

Ilana F Gareen, JoRean Sicks, Amanda Adams, Denise Moline, Nancy Coffman-Kadish

Abstract

Background

In clinical trials and epidemiologic studies, information on medical care utilization and health outcomes is often obtained from medical records. For multi-center studies, this information may be gathered by personnel at individual sites or by staff at a central coordinating center. We describe a HIPAA-compliant centralized process developed to collect medical record information for a large multi-center cancer screening trial.

Methods

The framework used to select, request, and track medical records incorporated a participant questionnaire with unique identifiers for each medical provider. De-identified information from the questionnaires was sent to the coordinating center indexed by these identifiers. The central coordinating center selected specific medical providers for abstraction and notified sites using these identifiers. The site personnel then linked the identifiers with medical provider information. Staff at the sites collected medical records and provided them for central abstraction.

Results

Medical records were successfully obtained and abstracted to ascertain information on outcomes and health care utilization in a study with over 18,000 study participants. Collection of records required for outcomes related to positive screening examinations and lung cancer diagnosis exceeded 90%. Collection of records for all aims was 87.32%.

Conclusions

We designed a successful centralized medical record abstraction process that may be generalized to other research settings, including observational studies. The coordinating center received no identifying data. The process satisfied requirements imposed by the Health Insurance Portability and Accountability Act and concerns of site institutional review boards with respect to protected health information.

Keywords: medical records, abstracting, lung cancer, screening, clinical trials

Introduction

In most multi-center clinical trials, patient management occurs at a single study site, and information on health outcomes and medical care is collected at that site. In large screening trials, however, the study site often provides only the screening examination. Any associated diagnostic follow-up and treatment are obtained from community physicians, often from more than one provider. Information needed to assess patient outcomes, measure health care utilization, or estimate costs must, therefore, be obtained from self-report, medical records, or administrative data. Although self-report [1–5] and administrative data [5–8] have proven reliable and accurate in certain situations [9,10], medical record review with explicit assignment of ICD and CPT codes by research personnel avoids problems of recall bias and possible upcoding for reimbursement purposes [11].

For medical record review to be complete and provide information to reliably estimate costs, medical records must be obtained from every medical provider seen by each patient. ICD-9 and CPT codes are assigned based on information in the medical records. When planning medical records abstraction for the National Lung Screening Trial (NLST), we reviewed the literature and found few publications with detailed information on the process of identifying and collecting medical records for centralized abstraction in multi-site studies [3,12,13]. The proportion of medical records successfully obtained ranged from 50% to 85% for single-site [2,13] and multi-site [3] studies. Our review of the literature did not locate any reports with detailed information on rates of complete medical records successfully obtained by type of provider or by reason for the medical records request.

We were also unable to locate publications describing approaches for stratified sampling of records, that is, cases in which medical records were requested only for certain indications or medical providers. In the NLST, for the primary aims, medical records specifically related to the intervention and the outcome of interest were needed, whereas for secondary aims, information on all medical care obtained during the screening follow-up period was required. Selecting specific records for abstraction is made more difficult because staff at the coordinating center are generally blinded to participant identifiers under the Health Insurance Portability and Accountability Act (HIPAA) and other privacy regulations [14,15].

In this paper, we describe a framework used to identify and obtain medical records for the American College of Radiology Imaging Network (ACRIN) arm of the National Lung Screening Trial (NLST) [15–17], and we provide information on the proportion of records successfully obtained. This process identifies the providers from whom records should be requested, conveys this information to the remote sites, and facilitates the collection of records from many remote providers.

Methods

The National Lung Screening Trial (NLST), a collaboration between the American College of Radiology Imaging Network (ACRIN) grant, sponsored by the Cancer Imaging Program in the Division of Cancer Treatment and Diagnosis at the National Cancer Institute (NCI), and the Lung Screening Study (LSS) contract, administered by the Division of Cancer Prevention, NCI, has been described in detail elsewhere [16–18]. In brief, it is a multi-institutional trial of 53,452 participants designed to compare the ability of low-dose spiral CT and chest X-ray imaging to reduce lung cancer mortality. ACRIN was responsible for 18,840 of the participants at 23 sites (see Appendix A); the LSS was responsible for the remainder. Participants were men and women with a history of heavy smoking (at least 30 pack-years), aged 55–74 at the time of recruitment, who received three screening examinations: a baseline (T0) screen and two incidence screens (T1 and T2) at one-year intervals. Information was collected from medical records to better understand health outcomes and health care utilization associated with lung cancer screening, and to inform a cost-effectiveness analysis.

Both ACRIN and the LSS abstracted medical records to document the procedures, diagnoses, and complications experienced by participants following a positive screening examination; the diagnosis, treatment, and progression of lung cancers; and the pathological diagnosis of other cancers (to ensure that these were not metastatic lung cancers). For positive screens, lung cancers, and other cancers, ACRIN and the LSS harmonized their medical records collection. In addition to coding elements in a manner consistent with the LSS, however, ACRIN had certified nosologists assign ICD-9 and CPT codes to these records.

ACRIN also collected medical records for any medical care received by three additional subsets of participants to meet additional aims: 1) a 5% random sample of participants who screened positive at T0, T1, or T2, to assess whether higher levels of health care utilization for indications unrelated to the positive screen were associated with screening positive and having contact with the medical system for related follow-up; 2) a 5% random sample of participants with a negative screen at baseline (T0), to provide information on the background level of medical care expected in this cohort; and 3) a convenience sample of participants with a screen result of significant findings not related to lung cancer, to examine diagnoses and health care utilization associated with these findings. The classes of participants for whom ACRIN collected records and the specifics of that record collection are detailed in Table 1.

Table 1.

Indications for medical records collection and the records that are needed for each indication.

| Category | Indication | Aim for Which Records Were Requested | Items Requested | Records Needed for a Specific Time Period? | Time Period of Interest | Records of Interest |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Positive screen (a random sample of 5% of the T0, T1, and T2 positive screens was selected to also have more extensive abstraction; see category 4 below) | Primary aim | The procedures, diagnoses, and complications following a positive screening examination | Yes | Until the next screening examination, definitive diagnosis of lung cancer/no lung cancer, or 12 months from the date of the positive screen, whichever came first | All medical records related to the follow-up of a positive screening examination |
| 2 | Lung cancer | Primary aim | Medical records related to the diagnosis, treatment, and progression of lung cancers | No | -- | Any record related to lung cancer diagnosis, treatment, and progression |
| 3 | Other cancer | Primary aim | Pathological diagnosis of other cancers, with the exception of non-melanoma skin cancers | No | -- | Pathology report if available; otherwise, medical records indicating cancer type |
| 4 | Random sample of 5% of participants with a positive screen at the T0, T1, or T2 screening examination | Cost-effectiveness analysis | All medical records | Yes | 12 months from the date of the index screen | All medical records for the time period of interest |
| 5 | Random sample of 5% of participants with a negative screen at the baseline (T0) screening examination | Cost-effectiveness analysis | All medical records | Yes | 12 months from the date of the index screen | All medical records for the time period of interest |
| 6 | Significant finding not related to lung cancer | Investigation of downstream consequences of screening | All medical records | Yes | 12 months from the date of the index screen | All medical records for the time period of interest |

For the NLST, the LSS built upon the existing Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial infrastructure, in which individual sites performed the chart abstraction with oversight by a coordinating center [19]. ACRIN did not have the extant infrastructure of the LSS and chose to develop a new centralized record selection and abstraction process. Assignment of ICD and CPT codes required expertise, training, and standardization that would have been difficult to coordinate across 23 sites. For these reasons, ACRIN elected to use a centralized abstraction process in which site staff were responsible for obtaining medical records and a central group of coders performed abstraction. To maintain participant confidentiality, identifying information for participants and medical providers was maintained at the sites and was not visible to staff at the coordinating center. Medical records abstractors were able to see identifying information contained in the medical records themselves. Appropriate approvals were obtained from all ACRIN sites and from the Brown University Institutional Review Board. Participant-completed consent forms were constructed to satisfy institutional HIPAA requirements. The consent form used for the study can be found in the protocol appendix [http://www.acrin.org/Portals/0/Protocols/6654/Protocol-ACRIN%206654%20Amendment%2010,%2011.1.04.pdf]. The medical records release form used is available at: http://www.acrin.org/Portals/0/Protocols/6654/forms/6654mrra.pdf.

Overview of medical record abstraction process

ACRIN used a multi-stage process to identify the medical records required for the study and to notify site personnel to obtain those records. First, participants were asked to complete a standardized questionnaire detailing the medical care they obtained during the time period following the screening examination. Second, the central coordinating center used information from that questionnaire and from the screening results to determine whether records were needed from each of the medical care providers listed on the questionnaire. Third, the coordinating center identified for site personnel those records that should be obtained. This process is described in detail below.

Interval Questionnaire

During the time period between the first screening examination and the end of the study (or a participant's censor date), ACRIN participants were asked to complete a questionnaire every six months to collect information on health outcomes and medical care. We refer to this tool as the "interval questionnaire," because it covered a time "interval" of interest, usually six months. The time interval was made clear to participants by placing the "start date" of the interval in a large box on the first page of the questionnaire and asking participants to recall all medical care that occurred between that start date and the date on which they were completing the questionnaire. If site personnel were unable to contact the participant at any six-month anniversary, the time interval covered by the next questionnaire was extended so that medical care received over the entire period, from accrual into the study to the date of questionnaire administration, was covered.
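To make the extension rule concrete, the following minimal sketch (our illustration in Python; the trial's actual data system is not described at this level of detail, and the dates are hypothetical) derives the start date printed on the next questionnaire:

```python
from datetime import date

def next_interval_start(accrual_date: date,
                        completed_interval_ends: list[date]) -> date:
    """Start date printed in the box on page one of the next interval
    questionnaire. If a six-month contact was missed, the most recent
    completed questionnaire simply lies further in the past, so the
    next interval stretches to cover the entire uncontacted period."""
    return max(completed_interval_ends, default=accrual_date)

# A hypothetical participant accrued 2003-01-15 whose 12-month contact
# was missed: the questionnaire administered at 18 months covers a full
# year, from the 6-month contact through the administration date.
start = next_interval_start(date(2003, 1, 15), [date(2003, 7, 15)])
print(start)  # 2003-07-15
```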

To aid participants in recalling their medical encounters, the interval questionnaire specifically asked about outpatient providers, emergency room visits, and hospitalizations. Contact information for each medical care provider was also requested so that medical records could be obtained if needed. Care provided by dentists, eye specialists, podiatrists, chiropractors, acupuncture specialists, or mental health specialists was explicitly excluded, as it was considered irrelevant to the study aims. The initial interval questionnaire, the F1 form [http://www.acrin.org/Portals/0/Protocols/6654/forms/6654f1.pdf], focused on collecting information relating to "lung care," but this proved to be too general a query for this population of heavy smokers. In addition, the format, which asked for recall of specific dates of care, proved difficult for participants. The form was revised, piloted in the NLST population, and replaced by the F2 form [http://www.acrin.org/Portals/0/Protocols/6654/forms/6654%20F2.02212006.pdf]. The F2 form specifically asked about medical care related to the NLST, and its format focused on care received during the time interval covered by the questionnaire rather than asking for specific dates of care. To increase the likelihood that we captured information on NLST-related diagnostic procedures, both the F1 and F2 interval questionnaires queried participants about diagnostic tests related to their lungs, including X-ray, chest CT, chest MR, FDG-PET scan of the body, nuclear medicine scans of the chest or lungs, surgery to the chest or lungs, biopsy of the chest or lung, and bronchoscopy.

Information for each medical provider was entered into a separate section of the interval questionnaire. Each section had a unique identifier, which we refer to as a "provider code" (e.g., A7), that was used to characterize that medical provider. In the study database, information on medical care was indexed by these provider codes; provider names and contact information were not entered. The coordinating center used the provider code to indicate to site staff which records were needed.

Medical Records Request to Sites

The coordinating center used a medical record selection report to request records from the sites. The selection report identified records by ACRIN NLST participant ID number, provider code, and interval questionnaire start and stop dates. Each entry specified the indication (see Table 1) for which records were being requested. Upon receipt of the selection report, the site personnel linked the medical provider codes from the report with those on the original interval questionnaires that were retained at the site to obtain the provider contact information.
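The division of knowledge between the coordinating center and the sites can be sketched as follows (a minimal illustration; the participant ID, provider code, and contact details shown are hypothetical, and the actual report layout is not specified in this paper):

```python
# Coordinating center's selection report: de-identified throughout.
# It knows WHICH provider (by code) but never WHO the provider is.
selection_report = [
    {"participant_id": "ACRIN-123456", "provider_code": "A7",
     "interval_start": "2004-03-01", "interval_stop": "2004-09-01",
     "indication": "positive screen"},
]

# The site's copies of the original interval questionnaires map each
# provider code back to identifying contact information, which never
# leaves the site.
site_provider_contacts = {
    ("ACRIN-123456", "A7"): {"name": "Dr. J. Smith",
                             "fax": "555-0100",
                             "address": "100 Main St., Providence, RI"},
}

for request in selection_report:
    key = (request["participant_id"], request["provider_code"])
    contact = site_provider_contacts[key]  # linkage occurs only at the site
    print(f"Request records ({request['indication']}) for "
          f"{request['interval_start']} to {request['interval_stop']} "
          f"from {contact['name']}, fax {contact['fax']}")
```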

Obtaining requested records

The suggested approach for requesting records was to fax or mail each medical provider a general medical record request letter, along with a copy of the participant's signed HIPAA authorization form, and to follow up with medical providers if records were not received in a timely fashion. However, site personnel were allowed to use their discretion to develop other approaches to obtaining the records. Some found that contacting doctors' offices by telephone was more successful, and many found it useful to search for medical records in their institutional files, especially if the screening centers had converted to electronic medical records.

Once the medical records were obtained from the provider, site personnel reviewed the records for completeness. Site personnel were advised that, for interventional procedures with associated biopsies, they should expect procedure notes, biopsy reports, clinical laboratory results, and pathology reports. For hospitalizations, they should expect the admission history and physical, discharge summary, intensive care unit (ICU) nursing flow sheets (if the participant was admitted to the ICU during the hospitalization), progress notes, imaging reports, the death summary in the event of a death, and the autopsy report if an autopsy was performed. If surgical procedures were performed, we expected operative reports and pathology reports. If a participant reported a diagnosis of a cancer other than lung cancer, site personnel were asked to request the pathology report and a clinical note documenting the site of the biopsy. After the medical records were received, the site personnel were responsible for organizing the records by provider and by date of service.
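This expected-documents review lends itself to a simple checklist. The following sketch (ours, with illustrative category and document names paraphrasing the guidance above) shows how such a site-level completeness check could be encoded:

```python
# Expected documents per encounter type. Items marked conditional
# apply only in the stated circumstances.
EXPECTED_DOCUMENTS = {
    "interventional procedure with biopsy": [
        "procedure note", "biopsy report",
        "clinical laboratory results", "pathology report",
    ],
    "hospitalization": [
        "admission history and physical", "discharge summary",
        "progress notes", "imaging reports",
        "ICU nursing flow sheets",   # conditional: ICU admission
        "death summary",             # conditional: death
        "autopsy report",            # conditional: autopsy performed
    ],
    "surgical procedure": ["operative report", "pathology report"],
    "non-lung cancer diagnosis": [
        "pathology report", "clinical note documenting biopsy site",
    ],
}

def missing_documents(encounter_type: str, received: set[str]) -> list[str]:
    """List expected document types not yet present in the received file."""
    return [doc for doc in EXPECTED_DOCUMENTS[encounter_type]
            if doc not in received]

# e.g., a hospitalization file that still lacks the discharge summary:
print(missing_documents("hospitalization",
                        {"admission history and physical", "progress notes"}))
```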

If site personnel or the medical records abstractor performing the abstraction felt that the medical record information was incomplete, or if the available records made reference to additional medical care, site personnel made efforts to obtain the missing records.

Documentation that records were not available or no follow-up care occurred

If the site staff could not obtain any medical records, they completed a form indicating the reason that records could not be obtained and provided this information to the medical record abstractors for entry into the abstraction database.

Records related to follow-up care were expected for participants with a positive screening examination. If positive-screen participants failed to report any associated medical care on their interval questionnaire, we asked site personnel to confirm that no records were available. This determination was based on a review of the participant's study chart, contact with the participant's primary care physician (identified by the participant at the time of accrual), and/or contact with the participant. If this investigation located medical records, site personnel obtained them for abstraction. If site personnel confirmed that these participants had not received follow-up care, the reason that no care was obtained was indicated on a worksheet that was provided to the abstractors for entry into the abstraction database.

Determination of Medical Record Completeness

Medical records could be complete for one of the indications described in Table 1 (e.g., a positive screen) and incomplete for another (e.g., the 5% sample). In each case for which records were unavailable, the indication-specific reason was collected. For certain indications (positive results, the random 5% samples, and significant findings not related to lung cancer), medical records were needed for explicit periods of time following the screening examinations to permit comparison across groups. These follow-up periods could comprise one or more intervals. Determination of whether records were obtained, and whether those records were complete, was indication specific. We considered interval coverage to be complete if interval follow-up questionnaires and corresponding medical records were available covering at least 95% of the time period of interest, and the records were marked as "complete" by the medical record abstractors. Abstractors considered medical records "complete" if records were provided for each medical provider requested and the records contained no references to additional care for which records were unavailable. For example, a participant might report visiting a primary care physician; when the records for that visit were reviewed, they might contain a notation referring to a biopsy and a CT scan. Medical record abstractors considered the records "all abstracted" only if we were able to obtain and abstract the records related to that biopsy and CT scan.

In cases for which participants did not report receiving medical care, we classified medical record completeness based on the availability of the interval questionnaire. If interval questionnaires were available for the entire interval of interest, and the participant reported receiving no medical care, we considered the interval to be completely resolved.

Records were considered partially available if records were abstracted for the entire period with incomplete information, or completely abstracted for only part of the interval, or a combination of the two. If participants indicated on the interval questionnaire that they had received medical care, but no records could be obtained, we labeled that interval as having records unavailable.
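A worked sketch of the classification rules above (our paraphrase in Python; the function and field names are illustrative). Note how the 95% threshold translates into an allowed gap of about 18 days for a one-year window but only 3 days for a 60-day window:

```python
from datetime import date, timedelta

def covered_days(window_start: date, window_end: date,
                 record_spans: list[tuple[date, date]]) -> int:
    """Count distinct days in the follow-up window covered by at least
    one abstracted record span (spans may overlap)."""
    days = set()
    for span_start, span_end in record_spans:
        d = max(span_start, window_start)
        while d <= min(span_end, window_end):
            days.add(d)
            d += timedelta(days=1)
    return len(days)

def classify(window_start: date, window_end: date,
             record_spans: list[tuple[date, date]],
             care_reported: bool, marked_complete_by_abstractor: bool) -> str:
    """Complete: >=95% of the window covered and the abstractor marked
    the records complete, or no care was reported over a fully
    questionnaired window. Partial: some coverage. Unavailable: care
    was reported but no records were obtained."""
    if not care_reported and not record_spans:
        return "complete"  # questionnaires covered the window; no care to abstract
    window_len = (window_end - window_start).days + 1
    fraction = covered_days(window_start, window_end, record_spans) / window_len
    if fraction >= 0.95 and marked_complete_by_abstractor:
        return "complete"
    return "partial" if fraction > 0 else "unavailable"

# An 18-day gap immediately after a screen on 2004-03-01 fails the 60-day
# window (42/60 = 70% coverage) but passes the 1-year window (347/365 ~ 95.1%).
screen = date(2004, 3, 1)
spans = [(screen + timedelta(days=18), screen + timedelta(days=365))]
print(classify(screen, screen + timedelta(days=59), spans, True, True))   # partial
print(classify(screen, screen + timedelta(days=364), spans, True, True))  # complete
```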

For cataloguing completeness of medical records, records were classified by the study aims (Table 1). For clarity, in tables assessing completeness, the two positive samples (the 5% random sample of all records and the 95% sample of screen and lung cancer related records) were separated.

Results

Data are presented by intervals requested (Table 2) as well as at the participant level (Table 3) because for many indications (e.g., lung cancer diagnosis) a single interval was adequate to resolve the question of interest, whereas for other indications multiple intervals were required. Records may be included in the tables in more than one column because completeness was assessed for each indication. Table 2 lists requests by indication; whether all, some, or none of the records were available; and, if unavailable, the reason. Nearly 88% of all requests were resolved by receiving complete records or by discovering that records were not needed after investigation by site personnel (29.02% of all requests were resolved by site personnel). As noted in Table 2, in 2007 sites were told that they no longer needed to obtain records for significant findings not related to lung cancer. All records received were credited, but outstanding requests are not included in this table.

Table 2.

Records request status by type of request, with detailed reasons for lack of record availability

All cells show N (%). Percentages in the major rows are of the requests in that column; percentages in the indented subcategory rows are of the parent row.

| Records Status | Positive (Not Including 5% Random Sample) | Lung Cancer Diagnosis | Lung Cancer Treatment or Progression | Other Cancer | 5% Random Sample, T0/T1/T2 Positives | 5% Random Sample, T0 Negatives | Significant Findings Not Related to Lung Cancer | Total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Requested | 14090 (100.00) | 965 (100.00) | 1196 (100.00) | 3536 (100.00) | 853 (100.00) | 1144 (100.00) | 2289^c (100.00) | 24073 (100.00) |
| All records available | 7267 (51.58) | 724 (75.03) | 744 (62.21) | 1760 (49.77) | 597 (69.99) | 899 (78.58) | 2043 (89.25) | 14034 (58.30) |
| No records to obtain after investigation by site personnel | 6031 (42.80) | 45 (4.66) | 163 (13.63) | 1259 (35.61) | 127 (14.89) | 20 (1.75) | 133 (5.81) | 6986 (29.02) |
| Number of requests resolved | 13298 (94.38) | 769 (79.69) | 907 (75.84) | 3019 (85.38) | 724 (84.88) | 919 (80.33) | 2176 (95.06) | 21020 (87.32) |
| Some records available | 214 (1.52) | 118^a (12.23) | 134 (11.20) | 322 (9.11) | 69 (8.09) | 57 (4.98) | 100 (4.37) | 1722 (7.15) |
|     Participant withdrew consent | 13 (6.07) | 5 (4.24) | 5 (3.73) | 13 (4.04) | 0 (0.00) | 0 (0.00) | 94 (94.00) | 130 (7.55) |
|     Records request refused by provider/facility | 31 (14.49) | 39 (33.05) | 21 (15.67) | 54 (16.77) | 1 (1.45) | 12 (21.05) | 0 (0.00) | 158 (9.18) |
|     Provider(s) or provider contact information unknown | 7 (3.27) | 2 (1.69) | 1 (0.75) | 10 (3.11) | 1 (1.45) | 6 (10.53) | 0 (0.00) | 27 (1.57) |
|     Other | 163 (76.17) | 72 (61.02) | 107 (79.85) | 245^b (76.09) | 67 (97.10) | 39 (68.42) | 6 (6.00) | 1407 (81.71) |
| No records available | 578 (4.10) | 78 (8.08) | 155 (12.96) | 195 (5.51) | 60 (7.03) | 168 (14.69) | 13 (0.57) | 1331 (5.53) |
|     Participant withdrew consent | 112 (19.38) | 14 (17.95) | 26 (16.77) | 9 (4.62) | 0 (0.00) | 2 (1.19) | 13 (100.00) | 176 (13.22) |
|     Records request refused by provider/facility | 138 (23.88) | 6 (7.69) | 14 (9.03) | 15 (7.69) | 8 (13.33) | 15 (8.93) | 0 (0.00) | 196 (14.73) |
|     Provider(s) or provider contact information unknown | 94 (16.26) | 16 (20.51) | 7 (4.52) | 7 (3.59) | 3 (5.00) | 5 (2.98) | 0 (0.00) | 132 (9.92) |
|     Other | 234 (40.48) | 42 (53.85) | 108 (69.68) | 164 (84.10) | 49 (81.67) | 146 (86.90) | 0 (0.00) | 827 (62.13) |

^a For lung cancer diagnosis, records marked "some records available" were often adequate for ascertaining lung cancer status and stage; thus, 94% of lung cancer diagnosis requests resulted in ascertainment of lung cancer status.

^b Sixty-three of these intervals had no pathology to obtain because the cancer was diagnosed clinically.

^c This number does not include those requests for which sites were told they did not need to request records due to a policy change in 2007.

Table 3.

Proportion of records complete for three follow-up periods, for indications for which records were required for a specific time period of interest.

All cells show N (%).

| Indication | Records Status | T0: 60-Day^c | T0: 6-Month | T0: 1-Year | T1: 60-Day | T1: 6-Month | T1: 1-Year | T2: 60-Day | T2: 6-Month | T2: 1-Year |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Positive (not including 5% random sample) | Complete^b | 2715 (94) | 2701 (94) | 2621 (91) | 2413 (96) | 2384 (94) | 2340 (93) | 1615 (95) | 1603 (94) | 1556 (92) |
| | Partial | 75 (3) | 93 (3) | 185 (6) | 48 (2) | 90 (4) | 155 (6) | 33 (2) | 60 (3) | 126 (7) |
| | Unavailable | 98 (3) | 94 (3) | 82 (3) | 62 (2) | 49 (2) | 28 (1) | 51 (3) | 36 (2) | 17 (1) |
| | Total | 2888 (100) | 2888 (100) | 2888 (100) | 2523 (100) | 2523 (100) | 2523 (100) | 1699 (100) | 1699 (100) | 1699 (100) |
| 5% random sample, T0/T1/T2 positives | Complete | 154 (92) | 153 (92) | 145 (87) | 106 (92) | 104 (90) | 102 (89) | 84 (97) | 83 (95) | 78 (90) |
| | Partial | 10 (6) | 11 (7) | 19 (11) | 3 (3) | 7 (6) | 11 (10) | 1 (1) | 3 (3) | 8 (9) |
| | Unavailable | 3 (2) | 3 (2) | 3 (2) | 6 (5) | 4 (3) | 2 (2) | 2 (2) | 1 (1) | 1 (1) |
| | Total | 167 (100) | 167 (100) | 167 (100) | 115 (100) | 115 (100)^d | 115 (100)^d | 87 (100) | 87 (100)^d | 87 (100) |
| 5% random sample, T0 negatives^a | Complete | 582 (80) | 576 (79) | 546 (75) | -- | -- | -- | -- | -- | -- |
| | Partial | 40 (5) | 58 (8) | 110 (15) | -- | -- | -- | -- | -- | -- |
| | Unavailable | 108 (15) | 96 (13) | 74 (10) | -- | -- | -- | -- | -- | -- |
| | Total | 730 (100) | 730 (100) | 730 (100) | -- | -- | -- | -- | -- | -- |
| Significant finding not related to lung cancer | Complete | 720 (63) | 692 (61) | 600 (53) | 450 (59) | 407 (53) | 342 (45) | 292 (44) | 231 (35) | 131 (20) |
| | Partial | 65 (6) | 116 (10) | 257 (23) | 47 (6) | 102 (13) | 201 (26) | 22 (3) | 105 (16) | 242 (36) |
| | Unavailable | 353 (31) | 330 (29) | 281 (25) | 273 (36) | 261 (34) | 227 (30) | 351 (53) | 329 (50) | 292 (44) |
| | Total | 1138 (100) | 1138 (100) | 1138 (100) | 770 (100) | 770 (100) | 770 (100) | 665 (100) | 665 (100) | 665 (100) |

^a Records for the 5% random sample of negative screens were collected only for the T0 screening examination.

^b These records are considered complete if at least 95% of the time period had complete medical records.

^c The number of days needed for 95% coverage differs for 60 days, 6 months, and 1 year, so a person could be incomplete at 60 days and complete at 1 year.

^d Because of rounding, the percentages do not add to 100%.

Although records were missing or incomplete for 12.68% of requests, for 56.39% of those (7.15% of all requests), some records were available. For certain indications, e.g., lung cancer diagnosis, the available records were sufficient to meet the primary aims of the study; that is, abstractors were able to reliably assign a lung cancer diagnosis using the records. For 43.61% of those (5.53% of all requests), no records could be obtained. For slightly more than 13% of these requests, records were unavailable due to participant withdrawal of consent. Most records, however, could not be obtained because the request was refused or ignored by the medical facility.

For indications requiring specific follow-up time periods, the information available at the participant level to meet the study goals is presented in Table 3. For participants for whom medical records were needed to ascertain medical care and diagnoses subsequent to a positive screen, the proportion of complete records recovered varied with the coverage period (60 days to 1 year post-screen) and ranged from 92% to 96%. For other indications, medical records describing all medical care, not just care sought subsequent to an abnormal screening test, were requested (see Table 1). The proportion of records successfully recovered for these more global requests was somewhat lower. The proportion of records that were complete for the baseline 5% random sample of positive cases ranged from 92% for a 60-day time period to 87% for one year post-screen, whereas for cases who screened negative, it ranged from 80% for a 60-day period to 75% for a 1-year period. The proportion of complete records was higher for the T1 and T2 incidence screens. In several rows of these tables, the proportion with partial coverage increases, rather than decreases, as follow-up extends from 60 days to 1 year; this is a consequence of how records were classified. Because we considered records complete if there was 95% coverage of the follow-up interval, the absolute gap allowed grows with the interval length: for a one-year period, a gap of 18 days was allowed, whereas for a 60-day period, a gap of 3 days was allowed. Thus, if there was a gap of 18 days directly following the screen, the gap would be too large to consider the 60-day period complete, but the 1-year period would be considered complete.

Medical records were complete for fewer participants diagnosed with a significant finding not related to lung cancer. For the baseline (T0) examination, completeness ranged from 63% at 60 days to 53% at 1 year post-screen. Fewer were complete for the two incidence screens (T1 and T2).

Discussion

Few publications have provided detailed information on the medical record selection and acquisition process for participants receiving community care in multi-center studies. We based our medical records selection on participant reports and used a central infrastructure to identify those reports for which records should be obtained. Using this approach, we successfully collected the complete information needed to meet our primary aims for over 90% of participants requiring medical record abstraction, and we collected complete medical records for nearly 88% of the requested time intervals.

Success rates for obtaining complete records in the NLST compare favorably with those from other multi-center studies. Using an approach similar to that used in the NLST, Chen et al. reported that in the Women's Health Initiative, radiologic records were obtained for only 44% to 69% of requests; success differed by the body part imaged [1]. The Cardiovascular Health Study did not provide detailed information on the proportion of medical record requests that were successful, but Ives et al. noted that physician questionnaires to ascertain outpatient care were often not returned (15–18%) [3]. Higher success rates are common for hospital records requests: Bergmann et al. reported that they were able to obtain hospital records for 89% of requested stays in the First National Health and Nutrition Examination Survey [4]. The information provided in these reports, however, was limited. We could not identify any large multi-center studies that reported medical record acquisition success rates for community-delivered outpatient and inpatient care similar to those reported here for the NLST. Although patient self-report provides important information to guide the acquisition of medical records, the specifics of these reports were often in error, reported for the incorrect time interval, or incomplete. These issues were identified by site personnel when they made medical records requests based on patient self-reports. This is reflected in Table 2, row 3, in the large number of requests for which there were no records to obtain after investigation by site personnel (29.02%). Abstraction served to validate self-reported care as well as to detect unreported diagnoses and procedures. Unfortunately, in this trial we did not track the number of additional medical encounters, diagnoses, and procedures that were ascertained through medical record review.

In Table 2, reports of non-melanoma skin cancers that were resolved at the sites are included in the "available" category, because these were not required to be provided to the abstractors. If site personnel were able to unequivocally determine that the cancer was a non-melanoma skin cancer, they completed a form indicating that records abstraction was not needed. If there was any doubt as to the pathology diagnosis, the records were reviewed by the central certified tumor registrars. Reports of other cancers were often made in more than one time interval; these duplicate reports were also resolved by site personnel. Both of these categories are included in the "resolved requests."

To ensure that abstractor time was used most efficiently, we asked site personnel to review the medical records before providing them to abstractors. Site personnel were able to determine whether participants had correctly reported care for the intervals of interest. They were also able to review the records and obtain any needed additional materials prior to providing the file to the abstractors. The efficiency with which site personnel assessed the records varied by site, with some site staff being very efficient and proactive, and others less so. As expected, it was easier to obtain the more limited records for information directly related to the screening examination and lung cancer diagnosis than it was to obtain records relating to all medical care for the cost-effectiveness analysis or the long-term treatment and cancer progression goals. These latter two types of requests generally involved records from more providers and covered a greater span of time than did the more limited records needed for the trial's primary aim.

Completeness was substantially lower among participants diagnosed with a significant finding not related to lung cancer (Table 3). During the planning of the trial, we anticipated collecting all of these records. However, many more participants were diagnosed with these significant abnormalities than anticipated. For budgetary reasons, we ceased collecting these records in 2007. At that time, a number of requests had been made to the sites. Site personnel were told that they were no longer responsible for procuring these records. Because the decision to cease abstracting these records was programmatic, and the decision as to which records were abstracted was made based on availability rather than on factors related to participant health, we feel that these records are missing at random, and that we will be able to use this convenience sample to obtain unbiased estimates of morbidities and procedures associated with this screen result.

Although some providers requested a fee for providing copies of medical records, many reduced or waived fees because the records were being requested for research purposes. Unfortunately, we tracked neither the proportion of providers who requested payment nor the amounts requested.

Cataloguing records by time interval allowed us to explicitly document whether information was available for specific follow-up periods. By choosing to index by the specific dates for which information was available, as opposed to a fixed time period, we were able to indicate explicitly the period of time for which medical records were complete. This will prevent analysts from incorrectly assuming that information for the entire follow-up period is complete. For example, if we had indexed patient follow-up in one-year periods, but we were only able to contact a participant at 6 months post-screen and the participant was lost to follow-up after that time, the data available would pertain only to the first six-month period; analysts, however, might assume that data were complete for the entire one-year period. This approach also facilitated stratification of records by the follow-up period for which records were complete. Thus, records might be complete and usable for 60 days post-screen but not for 6 months post-screen, and for analyses in which a shorter follow-up interval is of interest, we will be able to utilize all available data rather than discarding records because they are incomplete for a longer interval post-screen.

Although a number of the institutions from whom we collected records were using electronic medical records, in most cases, these records were printed, and the printed copies were made available to abstractors. Although electronic medical records hold promise of more efficient data capture, the systems at this time are quite diverse, and, in this study, did not lend themselves to direct abstraction. Geiger et al. [20] reported a similar experience.

Abstraction of information from medical records is needed in many studies to validate participant-reported outcomes. Coordinating the abstraction centrally ensures that the criteria used to select medical records for abstraction and to code medical records are consistent across sites. This can be accomplished even when the central coordinating center does not have access to identifying information.

Acknowledgements

The authors thank the Screening Center investigators and staff of the National Lung Screening Trial (NLST). Most importantly, we acknowledge the study participants, whose contributions made this study possible. We also acknowledge Brian Murphy and Erin Greco for their assistance in generating results cited in this manuscript. Finally, the authors would like to thank the anonymous reviewers for the helpful suggestions that improved this manuscript.

Funding

This study was supported through grants U01-CA-80098 and CA 79778 under a cooperative agreement with the Cancer Imaging Program.

Appendix A

1. Beth Israel Deaconess Medical Center Boston, MA
2. Brigham and Women’s Hospital Boston, MA
3. Brown University, Rhode Island Hospital Providence, RI
4. The Cancer Institute of New Jersey New Brunswick, NJ
5. Dartmouth-Hitchcock Medical Center Lebanon, NH
6. Emory University Atlanta, GA
7. Jewish Hospital Rudd Heart and Lung Center Louisville, KY
8. Johns Hopkins University Baltimore, MD
9. Mayo Clinic, Jacksonville Jacksonville, FL
10. Mayo Clinic, Rochester Rochester, MN
11. Medical University of South Carolina Charleston, SC
12. Moffitt Cancer Center Tampa, FL
13. Northwestern University Chicago, IL
14. Ochsner Medical Center New Orleans, LA
15. St. Elizabeth Health Center Youngstown, OH
16. University of California, Los Angeles Los Angeles, CA
17. University of California, San Diego San Diego, CA
18. University of Iowa Iowa City, IA
19. University of Michigan Medical Center Ann Arbor, MI
20. University of Pennsylvania Philadelphia, PA
21. University of Texas M.D. Anderson Cancer Center Houston, TX
22. Vanderbilt University Nashville, TN
23. Wake Forest University Winston-Salem, NC


Contributor Information

Ilana F. Gareen, Center for Statistical Sciences and the Department of Epidemiology, Brown University School of Medicine, Providence, RI, USA.

JoRean Sicks, Center for Statistical Sciences, Brown University School of Medicine, Providence, RI, USA.

Amanda Adams, Center for Statistical Sciences, Brown University School of Medicine, Providence, RI, USA.

Denise Moline, Care Communications Inc., Chicago, IL, USA.

Nancy Coffman-Kadish, Care Communications Inc., Chicago, IL, USA.

References

1. Chen Z, Kooperberg C, Pettinger MB, et al. Validity of self-report for fractures among a multiethnic cohort of postmenopausal women: results from the Women's Health Initiative observational study and clinical trials. Menopause. 2004;11:264–274. doi: 10.1097/01.gme.0000094210.15096.fd.
2. Lubeck DP, Hubert HB. Self-report was a viable method for obtaining health care utilization data in community-dwelling seniors. J Clin Epidemiol. 2005;58:286–290. doi: 10.1016/j.jclinepi.2004.06.011.
3. Ives DG, Fitzpatrick AL, Bild DE, et al. Surveillance and ascertainment of cardiovascular events. The Cardiovascular Health Study. Ann Epidemiol. 1995;5:278–285. doi: 10.1016/1047-2797(94)00093-9.
4. Bergmann MM, Byers T, Freedman DS, Mokdad A. Validity of self-reported diagnoses leading to hospitalization: a comparison of self-reports with hospital records in a prospective study of American adults. Am J Epidemiol. 1998;147(10):969–977. doi: 10.1093/oxfordjournals.aje.a009387.
5. Heckbert SR, Kooperberg C, Safford MM, et al. Comparison of self-report, hospital discharge codes, and adjudication of cardiovascular events in the Women's Health Initiative. Am J Epidemiol. 2004;160:1152–1158. doi: 10.1093/aje/kwh314.
6. Sarrazin MS, Rosenthal GE. Finding pure and simple truths with administrative data. JAMA. 2012;307(13):1433–1435. doi: 10.1001/jama.2012.404.
7. Pignone M, Scott TL, Schild LA, Lewis C, Vázquez R, Glanz K. Yield of claims data and surveys for determining colon cancer screening among health plan members. Cancer Epidemiol Biomarkers Prev. 2009;18(3):726–731. doi: 10.1158/1055-9965.EPI-08-0751.
8. Schneeweiss S, Avorn J. A review of uses of health care utilization databases for epidemiologic research on therapeutics. J Clin Epidemiol. 2005;58(4):323–337. doi: 10.1016/j.jclinepi.2004.10.012.
9. Katz JN, Chang LC, Sangha O, Fossel AH, Bates DW. Can comorbidity be measured by questionnaire rather than medical record review? Med Care. 1996;34(1):73–84. doi: 10.1097/00005650-199601000-00006.
10. Mukerji SS, Duffy SA, Fowler KE, Khan M, Ronis DL, Terrell JE. Comorbidities in head and neck cancer: agreement between self-report and chart review. Otolaryngol Head Neck Surg. 2007;136(4):536–542. doi: 10.1016/j.otohns.2006.10.041.
11. O'Malley KJ, Cook KF, Price MD, Wildes KR, Hurdle JF, Ashton CM. Measuring diagnoses: ICD code accuracy. Health Serv Res. 2005;40(5 Pt 2):1620–1639. doi: 10.1111/j.1475-6773.2005.00444.x.
12. Curb JD, McTiernan A, Heckbert SR, et al.; WHI Morbidity and Mortality Committee. Outcomes ascertainment and adjudication methods in the Women's Health Initiative. Ann Epidemiol. 2003;13:S122–S128. doi: 10.1016/s1047-2797(03)00048-6.
13. Partin MR, Burgess DJ, Halek K, Grill J, Vernon SW, Fisher DA, Griffin JM, Murdoch M. Randomized trial showed requesting medical records with a survey produced a more representative sample than requesting separately. J Clin Epidemiol. 2008;61(10):1028–1035. doi: 10.1016/j.jclinepi.2007.11.015.
14. Ness RB; Joint Policy Committee, Societies of Epidemiology. Influence of the HIPAA Privacy Rule on health research. JAMA. 2007;298:2164–2170. doi: 10.1001/jama.298.18.2164.
15. McWilliams R, Hoover-Fong J, Hamosh A, et al. Problematic variation in local institutional review of a multicenter genetic epidemiology study. JAMA. 2003;290:360–366. doi: 10.1001/jama.290.3.360.
16. National Lung Screening Trial Research Team. Reduced lung-cancer mortality with low-dose computed tomographic screening. N Engl J Med. 2011;365:395–409. doi: 10.1056/NEJMoa1102873.
17. National Lung Screening Trial Research Team; Aberle DR, Adams AM, Berg CD, et al. Baseline characteristics of participants in the randomized National Lung Screening Trial. J Natl Cancer Inst. 2010;102:1771–1779. doi: 10.1093/jnci/djq434.
18. National Lung Screening Trial Research Team; Aberle DR, Berg CD, Black WC, et al. The National Lung Screening Trial: overview and study design. Radiology. 2011;258:243–253. doi: 10.1148/radiol.10091808.
19. O'Brien B, Nichaman L, Browne JE, et al.; Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial Project Team. Coordination and management of a large multicenter screening trial: the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial. Control Clin Trials. 2000;21:310S–328S. doi: 10.1016/s0197-2456(00)00099-4.
20. Geiger AM, Greene SM, Pardee RE 3rd, Hart G, Herrinton LJ, Macedo AM, Rolnick S, Harris EL, Barton MB, Elmore JG, Fletcher SW. A computerized system to facilitate medical record abstraction in cancer research (United States). Cancer Causes Control. 2003;14(5):469–476. doi: 10.1023/a:1024947903377.
