Extracting Forced Vital Capacity from the Electronic Health Record through Natural Language Processing in Rheumatoid Arthritis-Associated Interstitial Lung Disease

Bryant R England; Punyasha Roul; Yangyuna Yang; Daniel Hershberger; Harlan Sayles; Jorge Rojas; Grant W Cannon; Brian C Sauer; Jeffrey R Curtis; Joshua F Baker; Ted R Mikuls

doi:10.1002/pds.5744

Author manuscript; available in PMC: 2025 Jan 1.

Published in final edited form as: Pharmacoepidemiol Drug Saf. 2023 Dec 19;33(1):e5744. doi: 10.1002/pds.5744

PMCID: PMC10872496 NIHMSID: NIHMS1951104 PMID: 38112272

Abstract

Purpose:

To develop a natural language processing (NLP) tool to extract forced vital capacity (FVC) values from electronic health record (EHR) notes in patients with rheumatoid arthritis-associated interstitial lung disease (RA-ILD).

Methods:

We selected RA-ILD patients (n=7,485) in the Veterans Health Administration (VA) between 2000 and 2020 using validated ICD-9/10 codes. We identified numeric values in proximity to FVC string patterns from clinical notes in the EHR. Subsequently, we performed processing steps to account for variability in note structure, related pulmonary function test (PFT) output, and values copied across notes, then assigned dates from linked administrative procedure records. NLP-derived FVC values were compared to values recorded directly from PFT equipment available on a subset of patients.

Results:

We identified 5,911 FVC values (n=1,844 patients) from PFT equipment and 15,383 values (n=4,982 patients) by NLP. Among 2,610 date-matched FVC values from NLP and PFT equipment, 95.8% of values were within 5% predicted. The mean (SD) difference was 0.09% (5.9), and values strongly correlated (r=0.94, p<0.001), with a precision of 0.87 (95% CI 0.86, 0.88). NLP captured more patients with longitudinal FVC values (n=3,069 vs. n=1,164). Mean (SD) change in FVC %-predicted per year was similar between sources (−1.5 [30.0] NLP vs. −0.9 [16.6] PFT equipment; standardized response mean=0.05 for both).

Conclusions:

NLP of EHR notes increases the capture of accurate, longitudinal FVC values nearly three-fold compared with PFT equipment. Use of this NLP tool can facilitate pharmacoepidemiologic research in RA-ILD and other lung diseases by capturing this critical measure of disease severity.

Keywords: rheumatoid arthritis, interstitial lung disease, forced vital capacity, pulmonary function test, natural language processing, electronic health record

PLAIN LANGUAGE SUMMARY

Forced vital capacity (FVC) is a key physiologic measure of disease severity in rheumatoid arthritis-associated interstitial lung disease (RA-ILD) and other restrictive lung diseases. Most studies of people with RA-ILD using real-world data sources do not include FVC because the data are not readily available. We developed a natural language processing (NLP) program, an artificial intelligence method, to capture FVC values from clinical notes in the electronic health record across the Veterans Health Administration system. FVC values derived from the NLP program had excellent agreement with values obtained directly from pulmonary function test (PFT) equipment. With the use of our NLP program, we were able to obtain nearly three times as many FVC values from the electronic health record for people with RA-ILD. Our NLP program can be used to extract FVC values and improve future comparative effectiveness and outcomes research in RA-ILD and other diseases where FVC is a crucial severity measure.

INTRODUCTION

Rheumatoid arthritis (RA) is a chronic, systemic, autoimmune disease whose primary manifestation is a polyarticular inflammatory arthritis. Between 30% and 40% of people with RA may also have evidence of interstitial lung disease (ILD) on chest computed tomography, with clinically relevant disease occurring in about 10%.^1,2 The prognosis is poor in RA-ILD, with an estimated median survival of 3 to 8 years.^1,3 Pulmonary function tests (PFTs) are regularly obtained during routine care in RA-ILD. Forced vital capacity (FVC), the total amount of air exhaled during spirometry testing, is a key PFT measure because it is an indicator of disease severity that is highly prognostic of survival in RA-ILD and other restrictive lung diseases.^3-5 In addition to its value in clinical care, FVC measurement is the primary outcome measure in clinical trials of progressive fibrotic lung diseases such as RA-ILD and idiopathic pulmonary fibrosis (IPF).^6-8

Despite the importance of monitoring FVC in clinical care and trials, real-world studies of RA-ILD often do not report FVC.^9-12 Frequently, this is because of the reliance on administrative data sources to complete such studies, where PFT results are unavailable. Electronic health records (EHRs) are now widespread and being leveraged for comparative effectiveness and outcomes research.¹³ Even in the EHR, FVC values can be difficult to obtain because they may not be stored as discrete data elements amenable to a simple extraction. Rather, these values may be maintained in separate databases, available only as document images, and/or manually entered into clinical notes as unstructured text.

Natural Language Processing (NLP), an artificial intelligence method, has been used to transform unstructured text from the EHR into structured clinical data.¹⁴ NLP programs have previously been developed to extract forced expiratory volume in 1 second (FEV1) values from PFTs with high accuracy, which can support studies in asthma and/or chronic obstructive pulmonary disease (COPD) but not ILD.¹⁵ However, such programs have not been developed to extract FVC measurements, and the responsiveness for detecting longitudinal changes in PFT values obtained from NLP programs is unknown. The purpose of this study was to develop an NLP program to extract FVC values from EHR notes among Veterans with RA-ILD and validate these values cross-sectionally and longitudinally compared to results directly from PFT equipment. We hypothesized that FVC values could accurately be extracted from clinical notes in the EHR to increase the ability to assess disease severity in RA-ILD cross-sectionally and longitudinally.

METHODS

Study setting and population

We performed this study within the Veterans Health Administration (VA) system, the largest integrated health care system in the United States. National administrative and EHR data stored in the VA Corporate Data Warehouse (CDW) were accessed within the VA Informatics and Computing Infrastructure (VINCI).¹⁶ This study received institutional review board approval from the VA Central IRB (1619487-4) with a waiver of informed consent.

We identified patients in the VA with RA-ILD between January 1, 2000 and December 31, 2020 by International Classification of Diseases, 9th and 10th revision (ICD-9/10) codes for RA and ILD, based on previously validated administrative RA-ILD algorithms.¹⁷ Patients were required to have at least two ICD-9/10 codes each for RA and ILD from inpatient or outpatient encounters that were separated by at least 30 days. Such algorithms have a positive predictive value of approximately 70%.¹⁷ Diagnostic codes used in these algorithms are provided in Supplementary Table 1.

Data sources

Clinical notes from October 1, 1999 to February 28, 2021 were obtained from the Text Integration Utilities (TIU) documents in the VA CDW. These are clinical notes entered into the EHR by providers (e.g., physicians) and staff (e.g., nurses, respiratory technologists). In addition to the content of the clinical notes, the date and location of the encounter for the corresponding clinical notes were collected, the latter using stop codes indicating VA specialty of care. Administrative records of PFT completion within the VA CDW were identified using Current Procedural Terminology (CPT) and ICD-Procedure Coding System (PCS) codes. On a subset of patients, PFT results (including FVC volume) were available as structured data in the VA CDW when there was machine and software compatibility that allowed storage as structured data in the EHR (termed PFT equipment values). Patient demographic information was obtained from VA enrollment records in the VA CDW. Smoking status (current, former, or never) was obtained from tobacco use “health factors,” which are standard data fields completed during routine clinical encounters and stored in the EHR.¹⁸

Development of the NLP program

We developed an NLP program to extract FVC values from TIU documents using MS SQL Server based on prior work to extract FEV1 values.¹⁵ A flowchart depicting the process is provided in Figure 1. We selected string patterns of 45 characters that started with “FVC” and extracted numeric values within these strings. Similar searches for “forced vital capacity” yielded few additional results and were not included in the final program. We then performed data processing steps that included removing numeric values pertaining to dates or times, other PFT components (e.g., predicted normal values, ratios such as FEV1/FVC), and duplicate FVC values. When multiple FVC values were identified, the first non-duplicative value (i.e., value not identified in prior notes) was extracted. Because the date of the clinical note when the FVC value was found was not necessarily the same date that PFTs were completed, we used administrative data (i.e., procedure codes) to determine PFT completion dates. Dates were assigned to extracted FVC values, using the date of the most recently completed PFT (based on CPT or ICD-PCS for PFTs) before or on the clinical note date. If there was no administrative data for PFT completion in the five years prior to the clinical note, then the extracted FVC value was excluded (n=1,173). FVC volumes (L or mL/cc) and FVC percent predicted values were separated based on their non-overlapping numerical value ranges. FVC volumes (both PFT equipment and NLP-derived values) were then converted to percent-predicted values using the National Health and Nutrition Examination Survey (NHANES) calculators.¹⁹ After merging the two NLP sources of FVC percent-predicted values, duplicates and values outside the range observed from PFT equipment (20 to 135%) were excluded (n=24).
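As an illustration of the extraction step, the following is a minimal Python sketch of the window-based approach described above. It is not the authors' MS SQL Server program. The regular expressions, the function names (`classify`, `extract_fvc_candidates`), and the numeric ranges used to separate liters, milliliters, and percent predicted are assumptions made for this example; only the 45-character window and the 20 to 135% range come from the text. Keeping the first plausible number in each window is a simplification of the "first non-duplicative value" rule.

```python
import re

WINDOW = 45  # characters per string pattern, as stated in the methods

# "FVC" not preceded by "/" or an alphanumeric, so "FEV1/FVC" ratios are skipped.
FVC_START = re.compile(r"(?<![/A-Za-z0-9])FVC", re.IGNORECASE)
# Dates such as 03/14/2015 and times such as 10:45 (assumed formats).
DATE_OR_TIME = re.compile(r"\b\d{1,2}[/-]\d{1,2}[/-]\d{2,4}\b|\b\d{1,2}:\d{2}\b")
# A number that is not glued to a letter, digit, or dot (skips the "1" in "FEV1").
NUMBER = re.compile(r"(?<![A-Za-z\d.])\d+(?:\.\d+)?")


def classify(value):
    """Label a number by non-overlapping ranges (illustrative thresholds)."""
    if 0.3 <= value <= 8.0:
        return "volume_L"
    if 20 <= value <= 135:
        return "percent_predicted"
    if 300 <= value <= 8000:
        return "volume_mL"
    return None


def extract_fvc_candidates(note):
    """Return (value, kind) for the first plausible number in each FVC window."""
    results = []
    for match in FVC_START.finditer(note):
        window = note[match.start():match.start() + WINDOW]
        window = DATE_OR_TIME.sub(" ", window)  # drop date/time numerals
        for token in NUMBER.findall(window):
            value = float(token)
            kind = classify(value)
            if kind is not None:
                results.append((value, kind))
                break
    return results


print(extract_fvc_candidates("FVC on 03/14/2015 was 2.85 L"))
```

In practice the remaining steps (removing predicted normal values and values copied across notes) would need note-level context that this sketch does not model.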

Figure 1. — Flowchart provides an overview of the process to extract forced vital capacity values (FVC) from electronic health record notes using natural language processing. FVC concepts and numeric values were extracted. Numeric values pertaining to date/time, other pulmonary function test components (e.g., FEV1/FVC ratio), and duplicate values were excluded. Dates were assigned to extracted values by linkage with administrative records of PFT completion. FVC values were converted to percent predicted before excluding any remaining duplicate or extreme values.

Abbreviations: FVC, forced vital capacity; NLP, natural language processing; PFT, pulmonary function test; TIU, Text Integration Utilities; % pred, percent predicted.
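The date-assignment rule (most recent administratively recorded PFT on or before the note date, with exclusion when none exists in the prior five years) can be sketched as follows. This is an illustrative Python version under stated assumptions, not the study code: the function name `assign_pft_date` is hypothetical, and the five-year lookback is approximated as 5 × 365 days.

```python
from bisect import bisect_right
from datetime import date

MAX_LOOKBACK_DAYS = 5 * 365  # assumption: "five years" approximated in days


def assign_pft_date(note_date, pft_dates):
    """Return the latest PFT date on or before note_date, or None if excluded."""
    ordered = sorted(pft_dates)
    idx = bisect_right(ordered, note_date)  # count of PFT dates <= note_date
    if idx == 0:
        return None  # no PFT recorded on or before the note
    candidate = ordered[idx - 1]
    if (note_date - candidate).days > MAX_LOOKBACK_DAYS:
        return None  # most recent PFT is older than the lookback window
    return candidate


pfts = [date(2010, 3, 1), date(2014, 6, 15)]
print(assign_pft_date(date(2014, 7, 2), pfts))
```

A value found in a note dated before any recorded PFT, or more than five years after the last one, returns `None` and would be excluded, mirroring the exclusion described in the methods.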

Statistical analysis

Patient characteristics and retrieved FVC values were descriptively summarized by source. To compare NLP-derived FVC values to FVC values obtained from PFT equipment (reference standard), we selected FVC values from both sources that occurred within 30 days of each other. Mean differences and Pearson correlations in FVC values between data sources were calculated. Bland-Altman plots were constructed and Pearson correlations of the difference in FVC measurements with the mean of the FVC measurements were calculated to assess agreement and systematic bias. We calculated the precision of the NLP program as the number of accurately matched FVC values out of all matched FVC values,²⁰ using a range of +/− 2% predicted based on the minimum clinically important difference.²¹ We indirectly assessed NLP sensitivity by calculating the number of accurately matched FVC values out of all FVC values from PFT equipment. This was expected to be lower than typical “recall” estimates because medical personnel do not record all FVC values in clinical notes. Finally, we assessed the responsiveness of FVC values from each source by calculating the difference between the first and last FVC values divided by the time interval between measures. Standardized response means (SRM) for changes in FVC over time were calculated by FVC source. Sensitivity analyses included assessing NLP-derived values based on the specialty of the clinical note, by calendar year period (1999-2006, 2007-2013, 2014-2021), and whether the value was recorded as a volume or percent predicted in the clinical note. All analyses were completed in SAS v9.4.
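The agreement measures above can be illustrated with a short Python sketch (the study analyses were run in SAS). The function name `agreement_summary`, the direction of the difference (NLP minus equipment), and the inclusion of 95% limits of agreement (mean difference ± 1.96 SD, the conventional Bland-Altman limits) are assumptions for this example; the ±2% predicted tolerance for precision follows the text.

```python
from statistics import mean, stdev


def agreement_summary(nlp, equipment, tolerance=2.0):
    """Summarize agreement between date-matched FVC % predicted values."""
    if len(nlp) != len(equipment) or len(nlp) < 2:
        raise ValueError("need two equal-length series with at least 2 pairs")
    diffs = [n - e for n, e in zip(nlp, equipment)]  # assumed: NLP minus equipment
    mean_diff = mean(diffs)
    sd_diff = stdev(diffs)
    accurate = sum(abs(d) <= tolerance for d in diffs)
    return {
        "mean_diff": mean_diff,
        "sd_diff": sd_diff,
        "limits_of_agreement": (mean_diff - 1.96 * sd_diff,
                                mean_diff + 1.96 * sd_diff),
        # precision: accurately matched values out of all matched values
        "precision": accurate / len(diffs),
    }


# Toy data, not study data.
summary = agreement_summary([70.0, 65.0, 80.0, 55.0], [70.0, 66.0, 80.0, 60.0])
print(summary["precision"])
```

The indirect sensitivity measure would use the same numerator but divide by the count of all PFT equipment values rather than the matched pairs.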

RESULTS

We studied 7,485 patients with RA-ILD. Most patients were male (93.1%), reported white race (74.2%), and had a smoking history (74.6%) (Supplemental Table 2). The mean (SD) age was 68.0 (10.1) years. Those with an FVC value from PFT equipment more often fulfilled the RA-ILD algorithm in earlier calendar years. We identified 5,911 FVC values for 1,844 unique patients from the PFT equipment data (Table 1). The NLP program substantially increased the yield of FVC values, extracting 15,383 FVC values for 4,982 unique patients. Of NLP-derived values, most were recorded in the clinical notes as FVC volumes (n=12,784). Mean FVC values were similar across sources after converting to percent predicted values (range 70.8 to 73.3%).

Table 1.

Summary of forced vital capacity values by data source

	FVC values	Unique patients	FVC % pred.
PFT equipment	5,911	1,844	72.4 (18.7)
NLP – FVC % pred.	2,599	1,684	70.8 (18.9)
NLP – FVC volume	12,784	4,462	73.3 (18.8)
NLP (all)*	15,383	4,982	72.8 (18.9)
Total – NLP & equipment*	18,524	5,377	73.1 (19.0)

Values mean (SD) unless otherwise noted.

*Duplicate values (% predicted and volume) removed.

Abbreviations: FVC, forced vital capacity; NLP, natural language processing; PFT, pulmonary function test
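The "nearly three-fold" increase in yield can be checked directly from the counts reported in Table 1 and in the longitudinal results. The short Python calculation below uses only figures stated in the text; the helper name `fold_increase` is introduced here for illustration.

```python
def fold_increase(nlp_count, equipment_count):
    """Ratio of NLP-derived to PFT equipment counts."""
    return nlp_count / equipment_count


counts = {
    "FVC values": (15383, 5911),
    "Unique patients": (4982, 1844),
    "Patients with longitudinal values": (3069, 1164),
}

for label, (nlp, equipment) in counts.items():
    print(f"{label}: {fold_increase(nlp, equipment):.2f}-fold")
```

Each ratio falls between about 2.6 and 2.7, consistent with the description of a nearly three-fold gain.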

Cross-sectional comparison of FVC values from NLP and PFT equipment

There were 2,610 matched FVC values from NLP and PFT equipment (Figure 2). Mean (SD) FVC was 70.0% (17.2) among values from PFT equipment and 70.1% (17.5) from NLP. Perfect agreement between sources occurred for 2,141 (82.0%) values. The mean (SD) difference in FVC values between sources was 0.09% predicted (5.9), with no systematic bias over the range of NLP-derived FVC values observed on the Bland-Altman plot (Figure 3A). NLP and PFT equipment FVC values strongly correlated (Figure 4A, r=0.94, p<0.001). The precision of the NLP program was 0.87 (95% confidence interval [CI] 0.86, 0.88). Almost all (95.8%) NLP-derived FVC values were within 5% predicted of values from PFT equipment. The most common reasons for discordance beyond this threshold included FVC values being mapped to the incorrect PFT because procedure codes were not recorded in administrative data and extracting predicted FVC values rather than actual FVC values due to variability in their ordering and/or these concepts being unlabeled.
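As a consistency check on the reported figures, the sketch below recomputes the share of perfectly agreeing values and a 95% confidence interval for the precision. The source does not say which interval method was used; a normal-approximation (Wald) interval is assumed here, and the function name `wald_interval` is illustrative.

```python
from math import sqrt


def wald_interval(proportion, n, z=1.96):
    """Normal-approximation confidence interval for a proportion."""
    half_width = z * sqrt(proportion * (1 - proportion) / n)
    return proportion - half_width, proportion + half_width


perfect_share = 2141 / 2610           # perfectly agreeing matched values
low, high = wald_interval(0.87, 2610)  # reported precision and matched count

print(f"perfect agreement: {100 * perfect_share:.1f}%")
print(f"precision 95% CI: {low:.2f}, {high:.2f}")
```

Under this assumption the interval rounds to 0.86, 0.88, which matches the reported values.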

Figure 3. — Bland-Altman plots comparing differences in forced vital capacity (FVC) values between natural language processing (NLP) and pulmonary function test (PFT) equipment (y-axis) with the mean of NLP and PFT equipment FVC values (x-axis). Panel A contains all NLP-derived FVC values. Panel B contains NLP-derived FVC values that were recorded in the note as percent predicted. Panel C contains NLP-derived FVC values that were recorded in the note as volumes. The equal distribution of differences in NLP- and PFT-derived FVC measures across their observed values and lack of correlation between these differences and mean observed values indicates the absence of bias.

Abbreviations: FVC, forced vital capacity; NLP, natural language processing; PFT, pulmonary function test.

Figure 4. — Pearson correlations of forced vital capacity (FVC) values from natural language processing (NLP) compared to pulmonary function test (PFT) equipment. Panel A contains all NLP-derived FVC values. Panel B contains NLP-derived FVC values that were recorded in the note as percent predicted. Panel C contains NLP-derived FVC values that were recorded in the note as volumes.

Abbreviations: FVC, forced vital capacity; NLP, natural language processing; PFT, pulmonary function test.

While substantially fewer FVC values were recorded as percent predicted within clinical notes (8.5%), the performance was similar to those recorded as volumes. The mean (SD) difference from PFT equipment values was 0.02% predicted (10.5), and there was no systematic bias across NLP values (Figure 3B). For NLP-derived values recorded as volumes, the mean (SD) difference was 0.14% predicted (5.5), with again no systematic bias across values (Figure 3C). NLP and PFT equipment FVC values strongly correlated regardless of whether recorded as percent predicted (Figure 4B, r=0.83, p<0.001) or volumes (Figure 4C, r=0.95, p<0.001).

NLP values most frequently were obtained from pulmonology notes (59% of values), followed by PFT laboratory (30%) and rheumatology (4%). Mean differences and correlations between FVC values from NLP and PFT equipment were similar across notes from the different specialties (Supplemental Table 3; r range 0.76 to 0.99) and across different calendar year periods (Supplemental Table 4; r range 0.93 to 0.98).

Longitudinal comparison of FVC values from NLP and PFT equipment

Multiple FVC values were available for 1,164 patients using PFT equipment compared to 3,069 patients using NLP (Table 2). The mean (SD) change in FVC percent predicted per year was −0.9 (16.6) for PFT equipment values compared to −1.5 (30.0) for NLP-derived values. Among NLP-derived values, change in FVC values over time was similar between values recorded as percent predicted (mean −1.7 [SD 30.0]) and volumes (−1.4 [28.9]). The responsiveness of FVC was poor regardless of FVC source (SRM range 0.05 to 0.06).

Table 2.

Change in forced vital capacity values over time in patients with RA-ILD by data source.

	N	First FVC % pred.	Last FVC % pred.	Years of follow-up	ΔFVC % pred. per year	SRM
PFT equipment	1,164	75.4 (18.3)	70.9 (18.9)	4.8 (4.1)	−0.9 (16.6)	0.05
NLP	3,069	75.7 (18.4)	71.5 (19.4)	5.0 (4.3)	−1.5 (30.0)	0.05
FVC % pred.	551	70.4 (18.4)	68.9 (18.9)	3.7 (3.2)	−1.7 (30.0)	0.06
FVC volumes	2,812	75.9 (18.4)	71.5 (19.4)	5.2 (4.4)	−1.4 (28.9)	0.05

Values mean (SD)

Abbreviations: FVC, forced vital capacity; NLP, natural language processing; PFT, pulmonary function test; SRM, standardized response mean
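The SRM column in Table 2 can be reproduced from the reported mean and SD of change, taking the SRM as the absolute mean change divided by the SD of change (the conventional definition, assumed here because the text does not spell out the formula). The function names below are illustrative, and the per-patient annualized change is shown only to make the definition explicit.

```python
def annualized_change(first_fvc, last_fvc, years):
    """Per-patient change in FVC % predicted per year of follow-up."""
    return (last_fvc - first_fvc) / years


def standardized_response_mean(mean_change, sd_change):
    """Absolute mean change divided by the SD of change."""
    return abs(mean_change) / sd_change


# Mean and SD of change in FVC % predicted per year, from Table 2.
table2 = {
    "PFT equipment": (-0.9, 16.6),
    "NLP": (-1.5, 30.0),
    "NLP, FVC % pred.": (-1.7, 30.0),
    "NLP, FVC volumes": (-1.4, 28.9),
}

for source, (mean_change, sd_change) in table2.items():
    print(source, round(standardized_response_mean(mean_change, sd_change), 2))
```

The rounded results (0.05, 0.05, 0.06, 0.05) agree with the tabulated SRMs. Because the SD of annualized change is large relative to the mean, the SRM is small for every source.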

DISCUSSION

FVC is a key measure of severity in RA-ILD and in other restrictive lung diseases such as IPF. While FVC is a primary outcome in clinical trials,^6,7 observational studies frequently lack FVC values because they are absent from administrative data and stored in an unstructured format in EHRs. In this study, we developed an NLP program to extract FVC values among people with RA-ILD using national VA EHR data to facilitate pharmacoepidemiologic studies. The NLP program increased the yield of capturing FVC values by nearly 3-fold over PFT equipment alone. Compared to the reference standard values obtained directly from PFT equipment, FVC values captured with NLP were highly accurate and without systematic bias. Extraction of FVC values from clinical notes using NLP allowed us to assess longitudinal FVC changes in nearly three times as many people with RA-ILD in this national VA database, and the responsiveness of FVC values to change was similar between PFT equipment and NLP-derived values. Together, these findings indicate that this NLP program can support pharmacoepidemiologic studies in RA-ILD and other types of ILD.

Our study extends prior work by Akgün and colleagues, who developed an NLP program to extract FEV1 values from clinical notes in the VA.¹⁵ We adapted this program for FVC and added several data processing steps to remove related concepts (e.g., FEV1/FVC ratios) and utilize FVC values recorded as either volumes or percent predicted. The yield for capturing PFT components increased more substantially in our study, which may be explained by several study differences. First, Akgün and colleagues studied enrollees in the Veterans Aging Cohort Study,²² in contrast to our inclusion of all patients in the VA fulfilling an algorithm for RA-ILD. Second, Akgün and colleagues extracted FEV1 while we extracted FVC. FEV1 is monitored in obstructive lung diseases (e.g., COPD, asthma), while FVC is monitored in restrictive lung diseases (e.g., ILD). Restrictive lung diseases may be more frequently managed and monitored by specialists due to their rarity and complexity, and these specialists may be more likely to record the PFT values in their notes. Indeed, pulmonology and rheumatology encounters (specialties caring for RA-ILD patients) were two of the three most frequent sources of NLP-derived FVC values. The other frequent source was PFT laboratory notes, which are typically semi-structured clinical notes recorded in the EHR. As in our study, Akgün and colleagues found excellent agreement between PFT values obtained from NLP and PFT equipment, with 95% of FEV1 values agreeing and a Spearman’s correlation of 0.99.

Sauer and colleagues also developed an NLP program to extract PFT data from a smaller set of VA EHR data, focusing on asthma patients treated at seven VA medical centers.²³ While the primary objective of this work was to capture changes between pre- and post-bronchodilator measurements for the assessment of bronchodilator reversibility, they found high accuracy for FVC values obtained by NLP compared to a reference standard of manual chart review.²³ Together, these studies demonstrate that NLP can be used to accurately capture PFT components (FEV1 and FVC) from clinical notes in the EHR.

An advancement over prior studies was our objective to acquire longitudinal PFT values to assess changes during the disease course. This is a highly valuable attribute for pharmacoepidemiologic studies, with FVC values being one of the most commonly used primary outcomes in clinical trials. Collecting longitudinal FVC values required additional processing steps to be integrated into our NLP program. The date of the clinical note when the FVC value was identified may not have been the same date that the PFTs were performed. Therefore, we leveraged complementary administrative data (procedure codes for PFT completion) to identify the date PFTs were performed. As a result, we were able to nearly triple the number of patients with longitudinal FVC values available, and NLP-derived values captured longitudinal changes as well as those from PFT equipment. While SRMs for both sources were poor (range 0.05 to 0.06), this reflects that most RA-ILD has a relatively stable disease course with gradual declines in FVC.^24-26 Even with the use of NLP, longitudinal FVC values were obtained for less than half of RA-ILD patients and approximately 1/3 of RA-ILD patients had no FVC values. Healthcare outside the VA likely contributed to these findings, but these observations also highlight that RA-ILD monitoring could likely be improved in real-world settings. Together, the tool will enable pharmacoepidemiologists to perform comparative studies of therapies in RA-ILD, and other ILD types.

There are limitations to this study. Only a proportion of NLP-derived values could be compared to matched PFT equipment values (reference standard). Because the availability of PFT equipment values was a direct result of compatible PFT equipment, we do not anticipate this introducing bias into the results. We could not assess “recall” without conducting manual EHR note review, so we calculated an indirect measure that likely underestimates NLP sensitivity because the tool only identifies FVC values medical personnel chose to record in EHR notes. We did not develop the NLP program to extract the diffusion capacity for carbon monoxide (DLCO), another component of PFTs that is associated with survival in RA-ILD.³ The additional variability in DLCO values because they can be unadjusted and/or adjusted for hemoglobin or alveolar volume makes this more challenging, and FVC remains the preferred pulmonary physiologic outcome measure in RA-ILD clinical trials.^4,6 Health care occurring outside the VA is anticipated to contribute to missing FVC values, and future work linking to these sources can clarify the impact of dual-care on RA-ILD monitoring. Additionally, we may have excluded NLP-derived values that did not have corresponding administrative PFT completion data because the test was performed outside the VA system. This would not affect the current results since the PFT equipment FVC values would not be available from outside health care data to serve as a reference standard. The NLP tool was tested in a population with RA-ILD, so the generalizability to other populations with restrictive lung diseases is not known. However, we speculate that the NLP tool has generalizability to other types of ILD because most NLP-derived values were obtained from pulmonology and PFT laboratory clinical notes, rather than rheumatology encounters. Finally, misclassification of RA-ILD is possible with the reliance on administrative algorithms.

Strengths of this study include the use of data from the VA, the largest integrated health care system in the U.S. Within the VA health care system, there are many different medical centers, providers, types of PFT equipment, documentation and coding practices, and data storage approaches. These differences introduce variability into the results but also allowed us to develop and refine an NLP tool that could be useful broadly across the VA, and likely in other EHRs. Modifying this tool for implementation at a single center or a select number of centers where there is more homogeneity in these processes could result in even better performance.

In conclusion, we developed and validated an NLP tool to extract FVC values from clinical notes in the EHR. With a substantial increase in yield of capturing longitudinal FVC values and high accuracy, this NLP tool will support high-quality comparative effectiveness and outcomes research in RA-ILD and other ILDs using real-world data.

Supplementary Material

Supinfo: NIHMS1951104-supplement-Supinfo.docx (20.3 KB, docx)

KEY POINTS

Forced vital capacity (FVC) is a crucial measure of disease severity in rheumatoid arthritis-associated interstitial lung disease (RA-ILD) and other restrictive lung diseases but is often not readily available in real-world data.
We developed a natural language processing (NLP) program to extract FVC values from clinical notes in the electronic health record.
Deployment of the NLP program increased the yield of capturing FVC values among people with RA-ILD by nearly 3-fold compared to values recorded from PFT (pulmonary function test) equipment.
FVC values obtained by NLP were highly accurate compared with values from PFT equipment.
The responsiveness to change of FVC values obtained from NLP and directly from PFT equipment was similar.

Funding:

BRE received support from the VA CSR&D (CX002203). JRC is supported by the National Institute of Arthritis and Musculoskeletal and Skin Diseases (P30AR072583). JFB is supported by the VA CSR&D (CX001703) and VA RR&D (RX003644). TRM is supported through grants from the VA BLR&D (BX004660), the U.S. Department of Defense (PR200793), the Rheumatology Research Foundation, and the National Institute of General Medical Sciences (U54 GM115458).

Footnotes

Disclosures: BRE reports consulting for Boehringer-Ingelheim and serving as PI of research funding to the institution from Boehringer-Ingelheim. JFB has received consulting fees from Bristol-Myers Squibb, Pfizer, CorEvitas, and Burns-White, LLC. TRM has consulted with Pfizer, Sanofi, UCB, and Horizon Therapeutics and received research funding from Horizon Therapeutics.

ETHICS STATEMENT

Institutional review board approval from the VA Central IRB (#1619487-4) and a waiver of informed consent were obtained for this study.

REFERENCES

1. Bongartz T, Nannini C, Medina-Velasquez YF, et al. Incidence and mortality of interstitial lung disease in rheumatoid arthritis: a population-based study. Arthritis Rheum. Jun 2010;62(6):1583–91. doi: 10.1002/art.27405
2. Gabbay E, Tarala R, Will R, et al. Interstitial lung disease in recent onset rheumatoid arthritis. Am J Respir Crit Care Med. Aug 1997;156(2 Pt 1):528–35. doi: 10.1164/ajrccm.156.2.9609016
3. Brooks R, Baker JF, Yang Y, et al. The impact of disease severity measures on survival in U.S. veterans with rheumatoid arthritis-associated interstitial lung disease. Rheumatology (Oxford). Nov 28 2022;61(12):4667–4677. doi: 10.1093/rheumatology/keac208
4. Solomon JJ, Chung JH, Cosgrove GP, et al. Predictors of mortality in rheumatoid arthritis-associated interstitial lung disease. Eur Respir J. Feb 2016;47(2):588–96. doi: 10.1183/13993003.00357-2015
5. Ley B, Collard HR, King TE, Jr. Clinical course and prediction of survival in idiopathic pulmonary fibrosis. Am J Respir Crit Care Med. Feb 15 2011;183(4):431–40. doi: 10.1164/rccm.201006-0894CI
6. Flaherty KR, Wells AU, Cottin V, et al. Nintedanib in Progressive Fibrosing Interstitial Lung Diseases. N Engl J Med. Oct 31 2019;381(18):1718–1727. doi: 10.1056/NEJMoa1908681
7. Solomon JJ, Danoff SK, Woodhead FA, et al. Safety, tolerability, and efficacy of pirfenidone in patients with rheumatoid arthritis-associated interstitial lung disease: a randomised, double-blind, placebo-controlled, phase 2 study. Lancet Respir Med. Jan 2023;11(1):87–96. doi: 10.1016/S2213-2600(22)00260-0
8. Nathan SD, Meyer KC. IPF clinical trial design and endpoints. Curr Opin Pulm Med. Sep 2014;20(5):463–71. doi: 10.1097/MCP.0000000000000091
9. Druce KL, Iqbal K, Watson KD, Symmons DPM, Hyrich KL, Kelly C. Mortality in patients with interstitial lung disease treated with rituximab or TNFi as a first biologic. RMD Open. 2017;3(1):e000473. doi: 10.1136/rmdopen-2017-000473
10. Curtis JR, Sarsour K, Napalkov P, Costa LA, Schulman KL. Incidence and complications of interstitial lung disease in users of tocilizumab, rituximab, abatacept and anti-tumor necrosis factor alpha agents, a retrospective cohort study. Arthritis Res Ther. Nov 11 2015;17:319. doi: 10.1186/s13075-015-0835-7
11. Dixon WG, Hyrich KL, Watson KD, et al. Influence of anti-TNF therapy on mortality in patients with rheumatoid arthritis-associated interstitial lung disease: results from the British Society for Rheumatology Biologics Register. Ann Rheum Dis. Jun 2010;69(6):1086–91. doi: 10.1136/ard.2009.120626
12. Kang EH, Jin Y, Desai RJ, Liu J, Sparks JA, Kim SC. Risk of exacerbation of pulmonary comorbidities in patients with rheumatoid arthritis after initiation of abatacept versus TNF inhibitors: A cohort study. Semin Arthritis Rheum. Jun 2020;50(3):401–408. doi: 10.1016/j.semarthrit.2019.11.010
13. Hersh WR, Weiner MG, Embi PJ, et al. Caveats for the use of operational electronic health record data in comparative effectiveness research. Med Care. Aug 2013;51(8 Suppl 3):S30–7. doi: 10.1097/MLR.0b013e31829b1dbd
14. Nadkarni PM, Ohno-Machado L, Chapman WW. Natural language processing: an introduction. J Am Med Inform Assoc. Sep-Oct 2011;18(5):544–51. doi: 10.1136/amiajnl-2011-000464
15. Akgun KM, Sigel K, Cheung KH, et al. Extracting lung function measurements to enhance phenotyping of chronic obstructive pulmonary disease (COPD) in an electronic health record using automated tools. PLoS One. 2020;15(1):e0227730. doi: 10.1371/journal.pone.0227730
16. U.S. Department of Veterans Affairs Corporate Data Warehouse. https://www.hsrd.research.va.gov/for_researchers/vinci/cdw.cfm
17. England BR, Roul P, Mahajan TD, et al. Performance of Administrative Algorithms to Identify Interstitial Lung Disease in Rheumatoid Arthritis. Arthritis Care Res (Hoboken). Oct 2020;72(10):1392–1403. doi: 10.1002/acr.24043
18. Melzer AC, Pinsker EA, Clothier B, et al. Validating the use of veterans affairs tobacco health factors for assessing change in smoking status: accuracy, availability, and approach. BMC Med Res Methodol. May 11 2018;18(1):39. doi: 10.1186/s12874-018-0501-2
19. Hankinson JL, Odencrantz JR, Fedan KB. Spirometric reference values from a sample of the general U.S. population. Am J Respir Crit Care Med. Jan 1999;159(1):179–87. doi: 10.1164/ajrccm.159.1.9712108
20. Hripcsak G, Rothschild AS. Agreement, the f-measure, and reliability in information retrieval. J Am Med Inform Assoc. May-Jun 2005;12(3):296–8. doi: 10.1197/jamia.M1733
21. du Bois RM, Weycker D, Albera C, et al. Forced vital capacity in patients with idiopathic pulmonary fibrosis: test properties and minimal clinically important difference. Am J Respir Crit Care Med. Dec 15 2011;184(12):1382–9. doi: 10.1164/rccm.201105-0840OC
22. Justice AC, Dombrowski E, Conigliaro J, et al. Veterans Aging Cohort Study (VACS): Overview and description. Med Care. Aug 2006;44(8 Suppl 2):S13–24. doi: 10.1097/01.mlr.0000223741.02074.66
23.Sauer BC, Jones BE, Globe G, et al. Performance of a Natural Language Processing (NLP) Tool to Extract Pulmonary Function Test (PFT) Reports from Structured and Semistructured Veteran Affairs (VA) Data. EGEMS (Wash DC). 2016;4(1):1217. doi: 10.13063/2327-9214.1217 [DOI] [PMC free article] [PubMed] [Google Scholar]
24.Juge P-A, Solomon JJ, van Moorsel CHM, et al. MUC5B promoter variant rs35705950 and rheumatoid arthritis associated interstitial lung disease survival and progression. Seminars in Arthritis and Rheumatism. 2021/October/01/ 2021;51(5):996–1004. doi: 10.1016/j.semarthrit.2021.07.002 [DOI] [PubMed] [Google Scholar]
25.Mena-Vázquez N, Rojas-Gimenez M, Romero-Barco CM, et al. Predictors of Progression and Mortality in Patients with Prevalent Rheumatoid Arthritis and Interstitial Lung Disease: A Prospective Cohort Study. Journal of Clinical Medicine. 2021;10(4):874. [DOI] [PMC free article] [PubMed] [Google Scholar]
26.Fu Q, Wang L, Li L, Li Y, Liu R, Zheng Y. Risk factors for progression and prognosis of rheumatoid arthritis–associated interstitial lung disease: single center study with a large sample of Chinese population. Clinical Rheumatology. 2019/April/01 2019;38(4):1109–1116. doi: 10.1007/s10067-018-4382-x [DOI] [PubMed] [Google Scholar]

Associated Data

Supplementary Materials

Supinfo

NIHMS1951104-supplement-Supinfo.docx (20.3KB, docx)
