Author manuscript; available in PMC: 2022 Jun 10.
Published in final edited form as: Stud Health Technol Inform. 2022 Jun 6;290:405–409. doi: 10.3233/SHTI220106

Associations Between Aggregate NLP-extracted Conflicts of Interest and Adverse Events By Drug Product

S Scott Graham, Zoltan P Majdik, Joshua B Barbour, Justin F Rousseau
PMCID: PMC9186043  NIHMSID: NIHMS1740801  PMID: 35673045

Abstract

This study evaluates associations between aggregate conflicts of interest (COI) and drug safety. We used a machine-learning system to extract and classify COI from PubMed-indexed disclosure statements. Individual conflicts were classified as Type 1 (personal fees, travel, board memberships, and non-financial support), Type 2 (grants and research support), or Type 3 (stock ownership and industry employment). COI were aggregated by type and compared with adverse events by product. Type 1 COI are associated with a 1.1-1.8% increase in the number of adverse events, serious events, hospitalizations, and deaths. Type 2 COI are associated with a 1.7-2% decrease in adverse events across severity levels. Type 3 COI are associated with an approximately 1% increase in adverse events, serious events, and hospitalizations, but have no significant association with adverse events resulting in death. The findings suggest that COI policies might be adapted to account for the relative risks of different types of financial relationships.

Keywords: Health Policy, Conflict of Interest, Natural Language Processing

Introduction

Studies of industry funding and conflicts of interest (COI) have found that financial relationships can bias choices in experimental design as well as clinical decision-making during trial execution. The most recent studies and meta-analyses confirm that financial relationships and resulting decisions are associated with a substantial increase in the likelihood that clinical trial results and clinical recommendations will be favorable to industry.[1-6] Industry-funded trials may be up to 5.4 times more likely to return positive results.[5] Trials with COI are up to 8.4 times more likely to return positive results.[3] Furthermore, trials with industry funding or COI may underestimate harms or adverse events.[6] These COI affect patient care through the dissemination of results of trials, clinical practice guideline development, and direct clinical decisions. For example, COI are considered drivers of the opioid crisis, with multiple categories of COI having been linked to potential pro-industry bias in published practice guidelines for prescribing opioids,[7,8] lax United States Food and Drug Administration (FDA) oversight,[9,10] and overly promotional patient and clinician educational materials.[10] Given the widespread recognition of the problems of COI, federal agencies, universities, professional medical associations, and biomedical journals have adopted ethics policies designed to mitigate potentially harmful effects.[11] Despite the consensus that policies are necessary, guidelines are inconsistent. Some organizations ban financial relationships altogether, and others have complex requirements for vetting individual COIs.[12]

Many COI policies invest in classifying the types of COI that are perceived to be associated with risks of bias, but such policies are by and large not well grounded in evidence and may be counterproductive. For example, although there has been significant attention focused on the possible effects industry advertising might have on editorial decision-making, the acceptance of advertising revenue is not associated with other markers of bias such as author COI.[13] Widely cited studies do not distinguish between different COI types (e.g., industry employment vs. grant funding to academic researchers) in their analyses.[3,5,14,15] Furthermore, evidence shows that even small inducements such as ink pens and prescription pads can bias clinical decision-making, and high-dollar-value COI thresholds may be insufficient to mitigate negative effects.[16]

Additionally, relevant policies are primarily grounded in scrutinizing individual COI. Emerging data on research funding and COI indicate that policies that focus on individualized effects may be insufficient to address the risks of bias. That is, the available literature shows that funding-related biases are often the result of aggregate interactions among different funding mechanisms and COI.[6,13] For example, a study of COI in psychiatry found that COI only predicted favorable results when the study sponsor was the source of the disclosed conflicts.[6] The most recent Cochrane Review evaluated 75 studies of industry funding effects and found that the quality of the available evidence ranged from very low to moderate.[3] The overall evidence quality for studies evaluating the relationships between COI and patient harms was rated as very low due to the inconsistency of findings and generally wide confidence intervals.

Given the current state of the research, it is clear that significant efforts are necessary to identify which types of industry funding are most likely to compromise the integrity of biomedical research and associated health outcomes. Specifically, there is a clear need for new research that (1) evaluates the differential effects of different COI types, (2) investigates COI in aggregation, and (3) evaluates associations between COI and drug safety. Recent innovations in machine learning and health policy informatics provide an ideal framework from which to advance research addressing these questions.[17,18] Biomedical research and clinical decision-making are connected by complex and diffuse networks.[19] Tracing complex associations across these decision systems requires integrating systems research and informatics methodologies.[20] This kind of research is essential for identifying appropriate policy solutions especially when policy effects are the result of complex multi-causal pathways, as is the case with COI.[16] This study contributes to these efforts by evaluating aggregate COI rates stratified by type and severity in the biomedical literature and comparing those rates to FDA post-marketing surveillance data on drug safety.

Methods

The primary purpose of this study was to evaluate the relationship between aggregate COI in the biomedical literature and drug safety as measured by the FDA Adverse Events Reporting System (FAERS). To do so, we began by identifying the most commonly prescribed 300 drug products listed on ClinCalc.com. ClinCalc.com aggregates and normalizes the results of the U.S. Department of Health and Human Services’ Agency for Healthcare Research and Quality (AHRQ)’s annual Medical Expenditure Panel Survey (MEPS).[21] Differences in database drug concept ontologies led us to refine the final list to 270 products in order to minimize the chance of false positives in our search strategy. The normalized product terms were used to send queries to PubMed and FAERS. We initially collected the most relevant 10,000 articles for each drug product and cross-referenced the Article Conflict of Interest database, a pre-existing database of AI-parsed COI statements from PubMed-indexed articles published between 2016 and 2018.[22]

COI Identification

Information in the Article Conflict of Interest database is based on automated parsing of 274,246 COI disclosure statements indexed in PubMed.[13,22] The metadata-assisted, machine-learning-enhanced natural language processing (NLP) system uses a combination of custom named-entity recognition (NER) and a COI term dictionary to identify and aggregate individual COI in disclosure statements. We applied the spaCy NLP library NER tools pre-trained on English web text to a small sample of COI disclosure statements (N = 100) to identify authors and pharmaceutical companies. We hand-corrected those statements, yielding a 25% improvement in NER accuracy. A particular challenge was differentiating whether initials represented people or organizations; to address this challenge, we created a library of author permutations from PubMed article metadata to extract from the COI statements. When company-author pairings were identified, the parser checked a relationship type dictionary using regular expressions (regex) to classify the specific COI relationship type (see Figure 1).

Figure 1: Pipeline for Identifying and Aggregating COI Source, Target, and Relationship Types
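The dictionary-lookup step of the pipeline can be illustrated with a minimal sketch. It assumes that author and company names have already been identified (by NER or metadata matching) and that the relationship dictionary is a set of regular expressions keyed by COI type. The patterns, the function names `classify_relationships` and `parse_statement`, and the example statement are hypothetical and are not taken from the published parser.

```python
import re

# Hypothetical relationship-type dictionary. These patterns are illustrative
# stand-ins, not the terms used by the published parser.
COI_PATTERNS = {
    "Type 1": re.compile(r"personal fees|travel|board member|non-financial support", re.I),
    "Type 2": re.compile(r"grants?|research support", re.I),
    "Type 3": re.compile(r"stock|shares|employee|employment", re.I),
}


def classify_relationships(sentence: str) -> list[str]:
    """Return every COI type whose pattern matches the sentence."""
    return [t for t, pattern in COI_PATTERNS.items() if pattern.search(sentence)]


def parse_statement(statement: str, authors: list[str], companies: list[str]):
    """Emit (author, company, COI type) triples, one sentence at a time.

    Naive sentence splitting on ". " is an assumption made for brevity;
    it would mis-split names that contain periods.
    """
    triples = []
    for sentence in re.split(r"(?<=\.)\s+", statement):
        found_authors = [a for a in authors if a in sentence]
        found_companies = [c for c in companies if c in sentence]
        for author in found_authors:
            for company in found_companies:
                for coi_type in classify_relationships(sentence):
                    triples.append((author, company, coi_type))
    return triples


example = (
    "Smith reports personal fees from Acme Pharma. "
    "JD reports grants from Globex during the conduct of the study."
)
triples = parse_statement(example, authors=["Smith", "JD"], companies=["Acme Pharma", "Globex"])
print(triples)
```

Pairing every author with every company found in the same sentence is a deliberate simplification; a production parser would need to resolve which company each relationship term refers to.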

COI classification is based on the International Committee of Medical Journal Editors (ICMJE) standardized COI disclosure form. It organizes COI into a three-part schema based on potential benefit from a product’s success. Specifically, Type 1 COI included personal fees, travel, board memberships, and non-financial support. Type 2 COI included grants and research support. Finally, Type 3 COI included stock ownership and employment in industry. Parser reliability was evaluated on a stratified random sample of 1,000 human-coded disclosure statements. The training set oversampled longer disclosure statements where more COI were likely to be present. The two-way average Intra-Class Correlation Coefficient (ICC) for Type 1 COI was 0.722, with a 95% confidence interval from 0.69 to 0.751 (F[998,903] = 6.27, p < 0.01). The average ICC for Type 2 COI was 0.773, with a 95% confidence interval from 0.747 to 0.797 (F[998,985] = 7.84, p < 0.01). Finally, the average ICC for Type 3 COI was 0.618, with a 95% confidence interval from 0.578 to 0.656 (F[998,923] = 4.28, p < 0.001).
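For readers unfamiliar with the reliability statistic, the sketch below computes one common average-measures ICC from two-way ANOVA mean squares. It assumes the consistency form, (MS_rows - MS_error) / MS_rows, with rows as disclosure statements and columns as coders (for example, human and parser). The text does not state which ICC variant was used, so this choice, the function name, and the toy counts are assumptions.

```python
def average_consistency_icc(ratings: list[list[float]]) -> float:
    """Average-measures consistency ICC from a two-way ANOVA decomposition.

    Rows are rated statements; columns are coders. This is one common
    variant and may differ from the one used in the study.
    """
    n = len(ratings)       # number of rated statements
    k = len(ratings[0])    # number of coders
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / ms_rows


# Toy COI counts per statement: [human coder, parser]. Hypothetical values.
toy_counts = [[1, 2], [2, 1], [3, 3]]
print(average_consistency_icc(toy_counts))
```

Under this form, a constant offset between coders does not lower the coefficient; an absolute-agreement variant would penalize it.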

Adverse Event Reports

The FDA maintains FAERS as part of its postmarketing surveillance infrastructure. Healthcare providers can (and in some cases are required to) provide details on patient side effects, serious illness or injury, and even deaths associated with regulated products. We collected data on the number of adverse events per product, the number of serious adverse events per product, the number of hospitalizations per product, and the number of reported deaths per product. Using the same 270 drug product search terms, we queried FAERS for adverse event reports filed related to the products of interest. Our study focuses on reports filed in 2018, i.e., those that follow the publication period (2016-2017) for collected articles.
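One way such per-product counts could be retrieved is through the public openFDA drug adverse event endpoint, which exposes FAERS data. The sketch below only builds query URLs and makes no network call. The text does not say which interface was used, so the use of openFDA, the mapping of its `serious`, `seriousnesshospitalization`, and `seriousnessdeath` fields to the four measures, and the function name `build_faers_query` are all assumptions.

```python
from urllib.parse import urlencode

OPENFDA_EVENT_URL = "https://api.fda.gov/drug/event.json"

# Extra search clauses per outcome. Field names follow the openFDA drug
# adverse event schema; their mapping to the study's measures is assumed.
OUTCOME_FILTERS = {
    "all": "",
    "serious": " AND serious:1",
    "hospitalization": " AND seriousnesshospitalization:1",
    "death": " AND seriousnessdeath:1",
}


def build_faers_query(product: str, year: int, outcome: str = "all") -> str:
    """Build an openFDA URL whose response metadata carries the match count."""
    search = (
        f'patient.drug.medicinalproduct:"{product}"'
        f" AND receivedate:[{year}0101 TO {year}1231]"
        f"{OUTCOME_FILTERS[outcome]}"
    )
    # limit=1 keeps the payload small; only the total count is of interest.
    return f"{OPENFDA_EVENT_URL}?{urlencode({'search': search, 'limit': 1})}"


for outcome in OUTCOME_FILTERS:
    print(build_faers_query("clindamycin", 2018, outcome))
```

In an openFDA response the number of matching reports appears under `meta.results.total`, so four requests per product would yield the four counts described above.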

Results

Rather than using individual studies or events as units of analysis, all collected data were aggregated by product prior to analysis. So, for example, the collected research on clindamycin had 28 Type 1 COI, 6 Type 2 COI, and 7 Type 3 COI. There were 1,834 adverse event reports for clindamycin, with 952 serious events, 418 hospitalizations, and 46 deaths. For fluoxetine, there were 367 Type 1 COI, 210 Type 2 COI, and 11 Type 3 COI. FAERS reported 4,605 adverse events, 3,360 serious events, 1,293 hospitalizations, and 442 deaths. Overall, products had an average of 39.07 (SD=125.98) Type 1 COI, 25.39 (SD=84.19) Type 2 COI, and 10.01 (SD=20.56) Type 3 COI. The total number of adverse event reports ranged from 2 to 65,591 with an average of 3,878.1 (SD=6,639.79). Complete summary statistics are available in Table 1.

Table 1.

Summary Statistics for COI and Adverse Events (AE) by Drug Product

Variable       Min    Mean (SD)              Max
Type 1 COI     0      39.07 (125.98)         1,761
Type 2 COI     0      25.39 (84.19)          1,195
Type 3 COI     0      10.02 (20.56)          156
AE Reports     2      3,878.1 (6,639.79)     65,591
AE Serious     2      2,465.9 (4,785.34)     51,330
AE Hospital    0      1,061.8 (2,233.25)     21,913
AE Death       0      296 (637.83)           6,835
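The product-level aggregation behind Table 1 can be sketched as a simple group-and-count over individual extracted conflicts. The records and the function name `aggregate_by_product` below are hypothetical and serve only to show the unit of analysis (one row per product, one count per COI type).

```python
from collections import Counter, defaultdict

# Hypothetical conflict-level records: (product, COI type) for each
# individual conflict extracted from a disclosure statement.
extracted_coi = [
    ("clindamycin", "Type 1"),
    ("clindamycin", "Type 1"),
    ("clindamycin", "Type 3"),
    ("fluoxetine", "Type 2"),
    ("fluoxetine", "Type 1"),
]


def aggregate_by_product(records):
    """Count conflicts of each type per product."""
    totals = defaultdict(Counter)
    for product, coi_type in records:
        totals[product][coi_type] += 1
    return totals


totals = aggregate_by_product(extracted_coi)
# One row per product: (product, Type 1, Type 2, Type 3). A Counter
# returns 0 for types that never occurred.
rows = [
    (product, counts["Type 1"], counts["Type 2"], counts["Type 3"])
    for product, counts in sorted(totals.items())
]
print(rows)
```

Each row would then be joined to the product's adverse event counts before modeling.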

Given the over-dispersion in the data, quasi-Poisson regression was used for all analyses. Incidence Rate Ratios (IRR) with 95% confidence intervals (CI) were calculated for each association between COI and adverse events. For the number of adverse events, the model indicated that each additional Type 1 COI is associated with a 1.1% increase in the total number of adverse event reports (IRR: 1.011, 95% CI 1.006 to 1.016, p < 0.001). In contrast, each additional Type 2 COI is associated with a 2% decrease in the total number of adverse event reports (IRR: 0.98, 95% CI 0.978 to 0.992, p < 0.001). Each additional Type 3 COI is associated with a 0.8% increase in the total number of adverse event reports (IRR: 1.008, 95% CI 1.001 to 1.016, p = 0.04).

Subsequent tests for serious reports, hospitalizations, and deaths found similar outcomes. Each additional Type 1 COI is associated with a 1.2% increase in the number of serious adverse events (IRR: 1.012, 95% CI 1.007 to 1.018, p < 0.001). Each additional Type 2 COI is associated with a 1.7% decrease in the number of serious adverse events (IRR: 0.983, 95% CI 0.976 to 0.99, p < 0.001). Each additional Type 3 COI is associated with an approximately 1% increase in the total number of serious adverse events (IRR: 1.01, 95% CI 1.002 to 1.018, p = 0.02).

The model indicates that for each Type 1 COI, we should expect to see a 1.4% increase in hospitalizations resulting from adverse events (IRR: 1.014, 95% CI 1.009 to 1.02, p < 0.001). For each Type 2 COI, we should expect to see a 2% decrease in hospitalization rates (IRR: 0.98, 95% CI 0.973 to 0.987). Type 3 COI are associated with a 1% increase in hospitalizations following adverse events (IRR: 1.01, 95% CI 1.002 to 1.018, p = 0.02).

Finally, the mortality model demonstrates that each Type 1 COI is associated with a 1.8% increase in deaths resulting from adverse events (IRR: 1.018, 95% CI 1.013 to 1.022, p < 0.001). Each Type 2 COI is associated with a 2.4% decrease in deaths (IRR: 0.976, 95% CI 0.969 to 0.983, p < 0.001). The estimate for Type 3 COI was not statistically significant (p = 0.28). See Figure 2 for a visualization of results.

Figure 2. Incidence Rate Ratios for COI Types and Adverse Event Report Types
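Two points in the Results can be made concrete with a short sketch: why the count data look over-dispersed, and how an IRR translates into the percent changes quoted in the text. The variance-to-mean ratio below uses the marginal Table 1 statistics for total reports, which is only a rough indication (model-based dispersion is estimated from residuals). The helper names are hypothetical.

```python
import math

# Mean and SD of total adverse event reports per product, from Table 1.
mean_reports, sd_reports = 3878.1, 6639.79

# A Poisson model assumes variance equals the mean. A ratio far above 1
# signals over-dispersion and motivates the quasi-Poisson variant.
dispersion_ratio = sd_reports ** 2 / mean_reports


def irr_from_coefficient(beta: float) -> float:
    """Log-link count models report coefficients on the log scale."""
    return math.exp(beta)


def percent_change(irr: float) -> float:
    """Percent change in the expected count per one-unit increase in COI."""
    return (irr - 1.0) * 100.0


# IRRs for total adverse event reports, as reported in the Results.
reported_irr = {"Type 1": 1.011, "Type 2": 0.98, "Type 3": 1.008}
changes = {t: round(percent_change(irr), 1) for t, irr in reported_irr.items()}
print(round(dispersion_ratio), changes)
```

An IRR above 1 therefore reads as a percent increase per additional conflict and an IRR below 1 as a percent decrease, which is how the figures in the preceding paragraphs are expressed.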

Discussion

The data presented here suggest two things with respect to COI: (1) the quantity of certain COI types is associated with overall drug safety; and (2) not all COI types necessarily involve the same risks. While a 1-2% change per conflict appears modest on its face, for a typical product, a single new Type 1 or Type 3 COI would be associated with 38 new reports, 24 new serious event reports, and 10 new hospitalizations. If conflicts by type were to increase by a standard deviation (125 for Type 1 or 20 for Type 3), we would expect to see 4,847 more adverse event reports and 74 new deaths for Type 1 conflicts and 775 new reports for Type 3 conflicts. Interestingly, grants and contracts (Type 2 COI) were associated with a reduction in adverse event rates across severity levels. This finding suggests that there may be an important difference between personal and institutional COI. Type 1 and Type 3 COI both involve direct disbursement to authors (speaking fees, travel money, employment, stock options). However, Type 2 COI are typically grants paid to universities and research hospitals, institutions that provide significant internal oversight of research ethics and quality. Additionally, many Type 2 COI come from federal funding agencies or non-profit organizations. Subsequent research should evaluate whether the funding source affects the associations demonstrated in this study.
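The per-product figures in this paragraph appear consistent with applying a rounded 1% change per conflict linearly to the Table 1 means and truncating to whole events. That reading is an inference from the numbers, not something the text states, and the sketch below reproduces only the report, serious-event, and hospitalization figures. The constant `RATE` and the function name are assumptions.

```python
import math

# Table 1 means per product.
MEAN_REPORTS, MEAN_SERIOUS, MEAN_HOSPITAL = 3878.1, 2465.9, 1061.8

# Assumption: a rounded 1% change per additional COI, applied linearly.
RATE = 0.01


def expected_increase(mean_count: float, extra_coi: int = 1) -> int:
    """Additional events implied by `extra_coi` conflicts, truncated to whole events."""
    return math.floor(mean_count * RATE * extra_coi)


per_coi = [expected_increase(m) for m in (MEAN_REPORTS, MEAN_SERIOUS, MEAN_HOSPITAL)]
type1_sd_shift = expected_increase(MEAN_REPORTS, extra_coi=125)  # one SD of Type 1 COI
type3_sd_shift = expected_increase(MEAN_REPORTS, extra_coi=20)   # one SD of Type 3 COI
print(per_coi, type1_sd_shift, type3_sd_shift)
```

Because the fitted model is multiplicative, a strict reading would compound the IRR over many additional conflicts rather than scale it linearly, so the linear figures are best treated as first-order approximations.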

While these findings offer a promising new direction for COI research at scale, additional studies are warranted to support effective and appropriate COI policies. Available data on COI are limited by the lack of uniform reporting standards across journals and incomplete participation in PubMed’s COI report scheme. Confirmatory research in this area should enhance the parsing algorithms identifying different categories of COI and expand data collection beyond PubMed. Future work might also consider defining increasingly granular approaches to categorizing COI. However, if these data are borne out in subsequent research, they would suggest that COI policies should be modified to address COI types of the greatest risk to patients and support those that enhance patient safety.

Conclusions

The challenges presented by COI and current disclosure practices in the biomedical research enterprise suggest a new intellectual framework is required. To that end, this study is grounded in a new model of COI, one that focuses not on individual biases but rather on the aggregation of influence across decision-making systems. Available research indicates that the focus on just the bias of individual researchers and teams overlooks the bias in networks of research, which suggests an important avenue for future inquiry. COI is more complex than previously theorized because COI is relational, and these relationships do not exist in isolation. Biomedical research is team science. Hundreds if not thousands of scientists, clinicians, providers, and technical experts work in the development and testing of any new drug. Subtle biases induced by financial relationships among a small minority of researchers may compromise the entire system, and this networked bias is more difficult to mitigate if policy makers focus only on individuals. COI may be more usefully understood as a systemic problem that requires systemic intervention, one that addresses not only the rare occurrence of individual unethical conduct but also the broader effects of bias in the aggregate. Safeguarding the integrity of biomedical research will require understanding how individual COI aggregate across research infrastructures and clinical contexts. Current misunderstandings of and resistance to COI policies can cause harm to patients, researchers, and public trust in medicine and clinical research.

Empowering researchers, data scientists, and policymakers with evidence-based approaches for the management of research funding and COI is a critically important part of safeguarding the integrity of biomedical research and patient health. The results here add to the growing evidence base indicating that common intuitions about which COI carry which risks of bias may not be accurate. COI policies need to be grounded in stronger evidence about the risks associated with specific types of COI. The results presented here advance science in this direction by demonstrating how the aggregation of COI across research areas is associated with differential drug safety profiles.

In addition to the specific findings for COI risks and related policies, this research also contributes to efforts to integrate systems science and informatics methods in the study of health policy. In recent years, AI and informatics technologies have been leveraged to advance health and medicine in clinical and administrative contexts while also raising concerns about efficacy and fairness. The advances in diagnostic technologies and clinical decision support are especially promising. These same technologies have the potential to productively advance research in health policy and to provide new evidence-based foundations for health policy decision-making. This paper offers one model for research in this area. Future health policy informatics projects might investigate associations between various policy initiatives and patient safety or other outcomes of interest.

Acknowledgements

This study was funded by the National Institute of General Medical Sciences of the National Institutes of Health (R01GM141476).

References

  • [1] Ahn R, Woodbridge A, Abraham A, Saba S, Korenstein D, Madden E, et al., Financial ties of principal investigators and randomized controlled trial outcomes: cross sectional study. BMJ 356 (2017), i6770.
  • [2] Flacco ME, Manzoli L, Boccia S, Capasso L, Aleksovska K, Rosso A, et al., Head-to-head randomized trials are mostly industry sponsored and almost always favor the industry sponsor. J Clin Epidemiol 68 (2015), 811–820.
  • [3] Lundh A, Lexchin J, Mintzes B, Schroll JB, Bero L, Industry sponsorship and research outcome. Cochrane Database Syst Rev 2 (2017), MR000033.
  • [4] Nejstgaard CH, Bero L, Hróbjartsson A, Jørgensen AW, Jørgensen KJ, Le M, and Lundh A, Association between conflicts of interest and favourable recommendations in clinical guidelines, advisory committee reports, opinion pieces, and narrative reviews: systematic review. BMJ 371 (2020).
  • [5] Als-Nielsen B, Chen W, Gluud C, Kjaergard LL, Association of funding and conclusions in randomized drug trials: a reflection of treatment effect or adverse events? JAMA 290 (2003), 921–928.
  • [6] Perlis RH, Perlis CC, Wu Y, Hwang C, Joseph M, Nierenberg AA, Industry sponsorship and financial conflict of interest in the reporting of clinical trials in psychiatry. Am J Psychiatry 162 (2005), 1957–1960.
  • [7] Spithoff S, Leece P, Sullivan F, Persaud N, Belesiotis P, Steiner L, Drivers of the opioid crisis: An appraisal of financial conflicts of interest in clinical practice guideline panels at the peak of opioid prescribing. PLOS One 15 (2020), e0227045.
  • [8] Lin DH, Lucas E, Murimi IB, Kolodny A, Alexander GC, Financial conflicts of interest and the Centers for Disease Control and Prevention’s 2016 guideline for prescribing opioids for chronic pain. JAMA Int Med 177 (2017), 427–428.
  • [9] Kolodny A, How FDA failures contributed to the opioid crisis. AMA J Ethics 22 (2020), 743–750.
  • [10] Nelson LS, Perrone J, Curbing the opioid epidemic in the United States: the risk evaluation and mitigation strategy (REMS). JAMA 308 (2012), 457–458.
  • [11] Leas BF, Umscheid CA, Principles for disclosure of interests and management of conflicts in guidelines: desirable and undesirable action and consequences. Ann Intern Med 165 (2016), 701–702.
  • [12] Qaseem A, Wilt TJ, Disclosure of interests and management of conflicts of interest in clinical guidelines and guidance statements: methods from the Clinical Guidelines Committee of the American College of Physicians. Ann Intern Med 171 (2019), 354–361.
  • [13] Graham SS, Majdik ZP, Clark D, Kessler MM, and Hooker TB, Relationships among commercial practices and author conflicts of interest in biomedical publishing. PLOS One 15 (2020), e0236166.
  • [14] Bero L, Oostvogel F, Bacchetti P, Lee K, Factors associated with findings of published trials of drug-drug comparisons: why some statins appear more efficacious than others. PLOS Med 4 (2007), e184.
  • [15] Peppercorn J, Blood E, Winer E, Partridge A, Association between pharmaceutical involvement and outcomes in breast cancer clinical trials. Cancer 109 (2007), 1239–1246.
  • [16] Field MJ, Lo B, editors, Conflict of Interest in Medical Research, Education, and Practice. National Academies Press, Washington D.C., 2009.
  • [17] Bloomrosen M, and Detmer DE, Informatics, evidence-based care, and research; implications for national policy: a report of an American Medical Informatics Association health policy conference. JAMIA 17 (2010), 115–123.
  • [18] Dwivedi YK, Hughes L, Ismagilova E, Aarts G, Coombs C, Crick T, Duan Y, Dwivedi R, Edwards J, Eirug A, and Galanos V, Artificial Intelligence (AI): multidisciplinary perspectives on emerging challenges, opportunities, and agenda for research, practice and policy. Int J Inf Manag (2019), 101994.
  • [19] Gilson L, Hanson K, Sheikh K, Agyepong IA, Ssengooba F, and Bennett S, Building the field of health policy and systems research: social science matters. PLoS Med 8 (2011), e1001079.
  • [20] Wan TT, A transdisciplinary approach to health policy research and evaluation. Int J Public Policy 10 (2014), 161–177.
  • [21] Kane SP, The top 300 of 2021, ClinCalc DrugStats Database, Version 21.1. ClinCalc: https://clincalc.com/DrugStats/Top300Drugs.aspx. Updated December 1, 2020. Accessed April 14, 2021.
  • [22] Graham SS, Majdik ZP, Clark D, Conflicts of interest: article XML. [Data file]. Texas Data Repository, 2020. DOI: 10.18738/T8/VSWAJY.
