Journal of the American Medical Informatics Association (JAMIA)
2020 Apr 11;27(5):793–797. doi: 10.1093/jamia/ocaa028

Using and improving distributed data networks to generate actionable evidence: the case of real-world outcomes in the Food and Drug Administration’s Sentinel system

Jeffrey S Brown, Judith C Maro, Michael Nguyen, Robert Ball
PMCID: PMC7647264  PMID: 32279080

Abstract

The US Food and Drug Administration (FDA) Sentinel System uses a distributed data network, a common data model, curated real-world data, and distributed analytic tools to generate evidence for FDA decision-making. Sentinel System needs include analytic flexibility, transparency, and reproducibility while protecting patient privacy. Based on over a decade of experience, a critical system limitation is the inability to identify enough medical conditions of interest in observational data to a satisfactory level of accuracy. Improving the system’s ability to use computable phenotypes will require an “all of the above” approach that improves use of electronic health data while incorporating the growing array of complementary electronic health record data sources. FDA recently funded a Sentinel System Innovation Center and a Community Building and Outreach Center that will provide a platform for collaboration across disciplines to promote better use of real-world data for decision-making.

Keywords: health outcomes; drug safety surveillance; distributed networks; computable phenotypes; electronic health data

INTRODUCTION: FDA SENTINEL SYSTEM

The US Food and Drug Administration (FDA) Sentinel Initiative began over a decade ago to create a national system for monitoring the safety and performance of FDA-regulated medical products. The Sentinel Initiative is the FDA’s response to the FDA Amendments Act (FDAAA) of 2007 requirement that the FDA use electronic healthcare data to rapidly and robustly assess the safety of approved medical products. Since 2016, the FDA has used the Sentinel System to meet its FDAAA requirements.1–4

The Sentinel System uses a distributed data network, a common data model (CDM), curated real-world data (RWD), and distributed analytic tools to generate real-world evidence (RWE) for FDA decision-making.5–7 Source data are largely based on health insurance plan claims data that have complete capture of medical encounters and outpatient pharmacy dispensing during health plan enrollment periods—a critical need for medical product safety assessments. Since its launch, the Sentinel System has executed thousands of distributed queries and findings have been used to inform regulatory decisions and FDA Advisory Committee meetings.8,9 The Sentinel System has allowed the FDA to design, conduct, and analyze several postmarketing studies that FDA previously would have required of medical product sponsors as a postmarketing requirement, thereby meeting a critical goal of the system.
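The distributed-query model described above can be sketched in a few lines. This is an illustrative Python sketch, not Sentinel's actual tooling: the record layout, field names, and `run_local_query` helper are all hypothetical. The point it shows is that each data partner runs the query locally and only aggregate counts, never patient-level records, reach the coordinating center.

```python
from dataclasses import dataclass

@dataclass
class SiteResult:
    """Aggregate counts returned by one data partner; no patient-level rows leave the site."""
    site: str
    exposed: int
    events: int

def run_local_query(site, patients, drug, outcome_code):
    """Hypothetical local query run behind each partner's firewall."""
    exposed = [p for p in patients if drug in p["dispensings"]]
    events = sum(1 for p in exposed if outcome_code in p["diagnoses"])
    return SiteResult(site, len(exposed), events)

def pool(results):
    """The coordinating center sees and sums only the aggregates."""
    return sum(r.exposed for r in results), sum(r.events for r in results)

# Toy patient records for two hypothetical partners (illustrative only)
site_a = [{"dispensings": {"warfarin"}, "diagnoses": {"I61.9"}},
          {"dispensings": {"warfarin"}, "diagnoses": set()}]
site_b = [{"dispensings": {"warfarin"}, "diagnoses": {"I61.9"}}]

results = [run_local_query("A", site_a, "warfarin", "I61.9"),
           run_local_query("B", site_b, "warfarin", "I61.9")]
exposed, events = pool(results)  # 3 exposed patients, 2 events across the network
```

In the real system the "query" is validated analytic code distributed to partners, but the privacy property is the same: the pooled analysis operates on site-level aggregates.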

From the start, the Sentinel System was envisioned as part of a national infrastructure to support a learning health system, and several initiatives in the US and abroad are based on the Sentinel common data model, curated data, tools, and networking capabilities.10–16 While the first 10 years’ work focused on meeting the FDAAA’s legislative requirements, the 21st Century Cures Act of 2016 requires the FDA to develop a framework to evaluate the potential use of RWE to support new indications for an already approved drug or to satisfy postapproval study requirements.17 To help the FDA understand how best to meet this new mandate, while continuing to improve the Sentinel System’s ability to evaluate medical product safety, new RWE generation approaches need to be explored.

The Sentinel System experience provides a use case to help identify common goals and shared solutions to support a learning health system able to generate evidence for decision-making. In this article we review the FDA’s experience assessing whether RWD is “good enough” to answer FDA postmarketing safety questions, identify areas of data and common data model insufficiency, and discuss options for improvement, especially within the context of increasing access to electronic health record (EHR) data, the growth of distributed data networks and common data models, and emerging technologies, such as natural language processing and deep learning tools.

WHEN IS REAL-WORLD DATA GOOD ENOUGH?

FDAAA 2007 required the FDA to create a system for Active Risk Identification and Analysis (ARIA) to address safety issues.18 Under FDAAA, the FDA cannot require a medical product manufacturer to conduct a postmarketing study unless the FDA has determined that both the FDA Adverse Event Reporting System and the ARIA system are insufficient to address the safety issue; this is referred to as the “sufficiency” assessment. Within Sentinel, ARIA is defined as the curated data available in the Sentinel common data model and the validated analytic tools able to query those data; analyses that require additional data elements, medical chart review, or extensive de novo analytic programming are, for regulatory purposes, outside of the ARIA framework.3 The ARIA regulatory framework provides a lens for considering when RWD are “good enough” to meet the FDA’s postmarketing safety decision-making needs and, when they are not, for identifying areas of improvement to guide future investments in making RWD more useful for the FDA and others.

The ARIA sufficiency experience

Since the 2016 launch of the Sentinel System, the FDA has found the ARIA system sufficient to answer about half of the safety issues evaluated. A common reason for insufficiency was the inability to identify health outcomes of interest (HOIs) with a computable phenotype whose positive predictive value the FDA review team judged acceptable for regulatory decision-making. Other reasons for insufficiency included the inability to identify the exposure, cohort, or covariates of interest with the data available in the common data model, and analytic tool limitations (eg, the inability to monitor infant outcomes resulting from in utero exposures).

The ARIA experience established that the Sentinel System is well suited to identify clinically unambiguous, acute, and serious outcomes, such as hospitalized stroke, acute myocardial infarction, intracranial hemorrhage, venous thromboembolism, hospitalization with neutropenia, and fractures.19 Conversely, outcomes with complex and/or delayed clinical presentation (eg, neurodevelopmental delays, severe acute liver injury, device implant complications, hepatitis B reactivation, malignancies), outcomes that may not lead to a medical encounter (eg, migraine, diarrhea), outcomes that may not have standardized data elements available within the CDM (eg, opportunistic and serious infections, immune-mediated disorders), and out-of-hospital deaths are all examples of HOIs that have led to, or would likely lead to, an insufficiency finding due to the lack of a valid and robust computable phenotype. Expanding the number of HOIs that can be robustly and efficiently identified within the Sentinel System is a critical need for advancing the FDA’s mission.

Computable phenotypes: What’s the best approach?

Improving the Sentinel System’s ability to apply computable phenotypes quickly and efficiently will require an “all-of-the-above” approach that makes better use of health plan claims data while incorporating other data sources, such as EHRs, public health and medical product and disease registries, patient-reported information, and wearables. Beyond computable phenotypes, system improvement also will require new approaches for extracting and storing data elements from EHRs, new analytic methods, use of machine learning and other novel tools, and development of better technologies to enable privacy-protecting distributed analytics and linkages.20

Sentinel system requirements

The Sentinel System has several requirements that influence how to address system limitations (Table 1). A unique characteristic of the Sentinel System use case is that it requires readily available curated data to enable rapid implementation of highly customized and targeted analyses. This contrasts with systems built for high-throughput analyses (eg, millions of analyses at a time), hypothesis-free “all-by-all” investigations, just-in-time data extraction and curation, or cross-national comparisons, which may have different design and computational needs. These requirements dictate data model and computable phenotype implementation decisions and the associated solutions for HOI identification.

Table 1.

Summary of Sentinel System requirements for ARIA

Secure distributed network: Sentinel must use a distributed network approach that allows partners to maintain physical and operational control of their data and approve all data uses. Queries and results must be securely transferred. All locally installed software must be institutionally approved by each data partner.
Curated and quality-checked data: Data must be formatted to the Sentinel CDM and approved by the Sentinel Operations Center before use in ARIA analyses.21 The Sentinel CDM guiding principles include minimization of data transformations or mapping and a reliance on local expertise to support data curation and analysis. Extraction of project-specific data elements, although possible, is outside the ARIA framework.
Analytic flexibility and standardized tools: ARIA analyses must use approved data in the Sentinel CDM and validated analytic tools in the Sentinel Routine Querying System. Extensive customized distributed programming is not included within ARIA but is possible outside the ARIA context.
Longitudinal capture of medical events: Sentinel data sources must have complete capture of medically attended events during a defined period to enable creation of periods of defined person-time. This is primarily achieved through use of health plan insurance data, which can leverage enrollment windows as periods during which all medically attended care should be captured.
Transparency and reproducibility: ARIA analyses must be posted online with sufficient detail to be transparent and readily reproducible. Use of source data values, the CDM, and validated tools supports this requirement. All computable phenotypes and associated codes must be available.
Access to full-text medical records: All partners must have access to full-text medical records to support potential follow-up investigations of ARIA findings.
Protection of patient privacy and partner proprietary information: ARIA analyses cannot require transfer of protected health information, personally identifiable information, or confidential information. The CDM does not include direct patient identifiers.
Large populations: Sentinel analyses often require very large populations to identify rare outcomes or to study uncommon exposures or cohorts. Therefore, phenotypes should use the largest possible source population, which, for Sentinel, is health plan insurance claims data.
Phenotype requirements: Binary phenotype definitions (yes/no) without an onset date are rarely acceptable as HOIs for the FDA’s most common use case, which requires HOIs to be new events with an exact onset date that can be related to an index date (typically an exposure date); the FDA needs to know the temporal sequence of events, not just whether an event occurred.
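The phenotype requirement in the last row (incident events with an exact onset date, not a yes/no flag) can be illustrated with a minimal sketch. The function, record layout, and the 183-day washout window are hypothetical choices for illustration, not Sentinel's actual algorithm:

```python
from datetime import date, timedelta

def incident_onset(diagnosis_dates, enrollment_start, washout_days=183):
    """Return the onset date of a patient's first diagnosis only if it is
    *incident*: the full washout window before it must fall inside the
    enrollment period (and, since we take the earliest diagnosis, that
    window is necessarily event-free). Returns None otherwise. A bare
    yes/no classification could not support analyses that relate the
    event's timing to an exposure index date."""
    if not diagnosis_dates:
        return None
    onset = min(diagnosis_dates)
    if onset - timedelta(days=washout_days) >= enrollment_start:
        return onset
    return None
```

A diagnosis appearing shortly after enrollment begins yields None here: without observed event-free history, the event could be prevalent rather than new.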

Data model design

The pharmacoepidemiology and informatics communities continue to work on how to best implement common data models in distributed data networks and develop and use computable phenotypes within those networks.22–28 The FDA’s requirements led the Sentinel System to focus on use of the most granular data available (ie, the source data values) to define phenotypes, avoiding embedded terminology mapping (eg, ICD-10-CM to SNOMED) to maximize transparency and analytic flexibility. This “organizing data model” approach makes use of standard terminologies available at all the Sentinel data partners, meaning there is no need to map across terminologies to effectively use the available data.22 A potential disadvantage of this approach is that it requires “on the fly” phenotype computation. As computable phenotypes increasingly use data from EHRs (eg, excerpts of narrative text) and become more complex, this approach will be challenged and demand new solutions that match the expected use with the best available data. Other use cases and common data models (eg, the Observational Health Data Sciences and Informatics program’s use of the Observational Medical Outcome Partnership data model) espouse mapping standard terminologies found in the source data to clinical concepts and using those concepts to define HOIs.25,28,29 This mapping approach—a “pre-configured rules system” data model—has efficiency benefits and can help simplify cross-national analyses but introduces risk of information loss, lack of transparency, mapping ambiguity (eg, when codes map to multiple concepts or to no concepts), misleading granularity (eg, when codes are mapped to more granular terminologies), maintenance costs, and inflexibility.22,25,29,30
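The "organizing data model" approach can be sketched in a few lines: the phenotype is defined directly on source terminology values, so the code list itself is the transparent, reproducible artifact, with no intermediate concept mapping that could lose information or introduce ambiguity. The code list below is illustrative only (3-character ICD-10-CM categories for nontraumatic intracranial hemorrhage), not a validated Sentinel phenotype:

```python
# Hypothetical code list: 3-character ICD-10-CM categories for
# nontraumatic intracranial hemorrhage; NOT a validated phenotype.
ICH_ICD10CM_CATEGORIES = {"I60", "I61", "I62"}

def matches_phenotype(diagnosis):
    """Match directly on the source code value ("on the fly" computation).
    Because no code-to-concept mapping layer sits in between, the rule is
    fully transparent and can be audited against the published code list."""
    return (diagnosis["code_type"] == "ICD-10-CM"
            and diagnosis["code"][:3] in ICH_ICD10CM_CATEGORIES)
```

Under a mapped ("pre-configured rules system") model, the same record would first be translated to a concept vocabulary, and the analyst would query concepts rather than source codes; that is where the mapping ambiguities described above can enter.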

In addition, because computable phenotypes typically require different levels of sensitivity and specificity based on the specific context, it is important to allow phenotype definitions maximum flexibility through use of source data values rather than mapped values. A condition like Type 2 diabetes mellitus used as an exclusion criterion will have a different computable phenotype than the same condition used as an outcome or as a covariate. This means that the same computable phenotype can be sufficient in one context and insufficient in another. So, although some high-throughput analytic use cases require the analytic efficiency enabled by terminology mapping, the uncertainty that mapping introduces creates analytic inflexibility. Gaining analytic efficiency at the expense of analytic flexibility is considered highly undesirable for the uses the FDA makes of the Sentinel System.
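A minimal sketch of this context dependence, using hypothetical (not clinically validated) code sets for Type 2 diabetes mellitus:

```python
# Hypothetical, non-validated code sets; illustrative only.
T2DM_BROAD = {"E11", "E13", "R73.03"}   # sensitive: any suggestive code
T2DM_NARROW = {"E11"}                   # specific: confirmed T2DM category only

def phenotype_codes(role):
    """Pick the code set by analytic role: a sensitive definition for an
    exclusion criterion (miss as few true cases as possible) versus a
    specific definition for an outcome (maximize positive predictive
    value). This per-context tuning relies on having the source values;
    once codes are collapsed into mapped concepts, the granularity needed
    to draw this distinction may be gone."""
    return T2DM_BROAD if role == "exclusion" else T2DM_NARROW
```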

SENTINEL’S APPROACH TO PHENOTYPES

Given the complexity of developing robust phenotypes that are fit for purpose, Sentinel investigators use a multipronged phenotype development approach that includes review of phenotypes used in prior Sentinel analyses, literature review, code searches across multiple terminologies, forward-backward mapping between ICD-9-CM and ICD-10-CM (when validated ICD-9 definitions are available), clinical expertise, local data system knowledge, and data science expertise, all framed within the context of the specific question and intended use. This phenotype assessment approach addresses issues of missing data, misclassification, local data variability, and potential biases that could be introduced in phenotype definitions.29,31,32 For the reasons noted above, defining a phenotype often takes weeks or months. When necessary, Sentinel undertakes validation studies with medical chart review to develop phenotypes; Sentinel is currently developing phenotypes using medical chart review for serious infection, stillbirth, anaphylaxis, acute pancreatitis, and lymphoma.33 This approach, while critical to demonstrating the validity and reproducibility of outcome algorithms, is slow and expensive. To improve this process, Sentinel has launched a project to standardize use of natural language processing and machine learning for complex HOIs using anaphylaxis as a starting point. How best to incorporate the data extracted from such analyses into an “organizing data model” approach is an open challenge.
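Chart-review validation of the kind described above ultimately reduces to estimating the algorithm's positive predictive value from the reviewed sample. A sketch using the standard Wilson score interval (the interval choice is our illustration, not necessarily Sentinel's method, and the numbers are toy values):

```python
import math

def ppv_with_ci(confirmed, reviewed, z=1.96):
    """Positive predictive value of a computable phenotype from a
    chart-review sample: confirmed true cases / algorithm-flagged charts
    reviewed, with a Wilson score interval for the estimate (better
    behaved than the normal approximation at small sample sizes)."""
    p = confirmed / reviewed
    denom = 1 + z ** 2 / reviewed
    center = (p + z ** 2 / (2 * reviewed)) / denom
    half = z * math.sqrt(p * (1 - p) / reviewed + z ** 2 / (4 * reviewed ** 2)) / denom
    return p, center - half, center + half

ppv, lo, hi = ppv_with_ci(confirmed=84, reviewed=100)  # toy numbers, not a Sentinel result
```

The width of this interval is one reason validation studies are slow and expensive: narrowing it requires reviewing more charts.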

Using unmapped source data values meets the FDA’s needs for analytic flexibility and transparency, but system improvement (ie, reducing the number of insufficiency findings) is dependent on access to additional standardized data elements, richer clinical data (eg, laboratory result values, vital signs, cancer stage), linked data, distributed querying technologies, and new data resources from which to develop phenotypes. As a starting point, Sentinel is expanding the CDM to capture the case status for patients included in Sentinel chart validation studies, thereby allowing use of the case status to, for example, develop improved algorithms in the future. The case status data will point to the original source data elements that led to the case status determination. This is an initial attempt to maintain the granularity of the data needed to meet the FDA’s requirements and provide an option for recomputing case status if a better algorithm is developed while improving the efficiency of computation. Other system improvements require advances in natural language processing, data provenance and metadata standards, and statistical methods—especially as they relate to missing data and variation in data availability across sites within a network. The good news is that the next steps for improving RWD to enable robust phenotype implementation are identifiable, but progress will require a clear vision with strong use cases and effective collaboration across stakeholders.

Call for collaboration

Our Sentinel System experience suggests that improving the ability to define computable phenotypes in distributed networks is a critical next step that has multiple immediate downstream consequences that would benefit from a collaborative effort across disciplines. FDA’s use case prioritizes identification of incident outcomes and onset dates rather than general assessments of whether a patient should be considered to have a specific condition or not. To that end, a critical step will be to develop best practices for populating common data models with information from EHRs that allows temporally sequenced analyses or information with specific dates of onset. While focused on HOIs, collaborative efforts (Table 2) will have broad value for other purposes and support the FDA’s general mission to use RWD to inform decision-making.39,40 The FDA recently funded an Innovations Center and a Community Building and Outreach Center that will provide a nexus for collaboration across disciplines and communities with a goal of improving Sentinel System functionality specifically, and real-world data capabilities generally.41

Table 2.

Areas for collaboration

Data infrastructure and analytic tools
Data models: Encourage data model developers to maintain access to source data values to maximize analytic flexibility and support transparency.
Analytic tools: Encourage analytic tool developers to enable use of source data values.
Rapid data linkage: Given the complex technical, operational, and governance barriers to creating all-purpose standing linked datasets, work is needed to extend prior efforts and develop approaches that enable just-in-time linkages between data sources (eg, an enterprise master patient index).34,35
Rapid data quality assessment: Data quality assessment introduces time and effort burdens that are sometimes inconsistent with the intended use case; data do not have to be perfect to be useful. Rapid data quality reviews of only the relevant data elements can facilitate just-in-time data extraction that can support multiple use cases.
Standardized data quality metrics: As use of distributed health data networks grows, there is a need for ways to compare and contrast data sources to assess fitness for use. A set of data model-agnostic data quality metrics would help investigators assess data source fitness for use.
Data extraction
Focus on data value granularity: System improvement requires standardized extraction of the most granular clinical data elements to enable building phenotypes with those data elements. The community should focus on identification of data elements in the context of clinical concepts rather than clinical concepts alone. For example, it is better to extract the ejection fraction and the date of service than to identify a patient with heart failure.
Improved natural language processing (NLP) for unstructured text: Improve the accuracy of NLP functionality to help identify clinical values from clinical notes, focusing on extraction and standardization of atomic data elements rather than computed concepts alone.36
Computable phenotype automation and characterization
Phenotype transparency: Expand upon ongoing efforts (eg, the Phenotype KnowledgeBase [PheKB]) to standardize phenotype definitions with the goal of sharing phenotypes, transparency, and reproducibility.37
Learning labs: Develop a network of learning laboratories that can quickly develop and test new computable phenotypes, including the ability to rapidly extract required information, test phenotypes across data sources, and investigate probabilistic phenotyping.38
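The probabilistic phenotyping mentioned in the last row could, under one common formulation, score each patient with a logistic model rather than a binary rule. The weights and features below are hypothetical placeholders, not a fitted model:

```python
import math

def phenotype_probability(features, weights, intercept):
    """Score a patient's probability of truly having the condition with a
    (hypothetical) logistic model over extracted features; downstream
    analyses can threshold the probability or carry it forward directly
    instead of forcing an early yes/no call."""
    score = intercept + sum(weights.get(k, 0.0) * v for k, v in features.items())
    return 1 / (1 + math.exp(-score))
```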

FUNDING

This work was supported in part by US Food and Drug Administration contract # HHSF223201400030I.

AUTHOR CONTRIBUTIONS

JSB and RB collaborated on the original draft; JCM and MN provided critical comments and revision for important intellectual content. All authors approved the final version and agree to be accountable.

ACKNOWLEDGMENTS

The authors acknowledge the assistance of Adee Kennedy throughout the project.

CONFLICT OF INTEREST STATEMENT

None declared.

REFERENCES

1. Behrman RE, Benner JS, Brown JS, McClellan M, Woodcock J, Platt R. Developing the Sentinel System–a national resource for evidence development. N Engl J Med 2011; 364 (6): 498–9.
2. Platt R, Carnahan RM, Brown JS, et al. The U.S. Food and Drug Administration’s Mini-Sentinel program: status and direction. Pharmacoepidemiol Drug Saf 2012; 21 (Suppl 1): 1–8.
3. Ball R, Robb M, Anderson SA, Dal Pan G. The FDA’s Sentinel Initiative–a comprehensive approach to medical product surveillance. Clin Pharmacol Ther 2016; 99 (3): 265–8.
4. Platt R, Brown JS, Robb M, et al. The FDA Sentinel Initiative: an evolving national resource. N Engl J Med 2018; 379 (22): 2091–3.
5. Brown JS, Holmes JH, Shah K, Hall K, Lazarus R, Platt R. Distributed health data networks: a practical and preferred approach to multi-institutional evaluations of comparative effectiveness, safety, and quality of care. Med Care 2010; 48 (6 Suppl): S45–51.
6. Maro JC, Platt R, Holmes JH, et al. Design of a national distributed health data network. Ann Intern Med 2009; 151 (5): 341–4.
7. Curtis LH, Weiner MG, Boudreau DM, et al. Design considerations, architecture, and use of the Mini-Sentinel distributed data system. Pharmacoepidemiol Drug Saf 2012; 21 (Suppl 1): 23–31.
8. Sentinel Initiative. FDA Safety Communications; 2019. https://www.sentinelinitiative.org/communications/fda-safety-communications. Accessed October 31, 2019.
9. Sentinel Initiative. FDA Advisory Committee Meetings; 2019. https://www.sentinelinitiative.org/communications/fda-advisory-committee-meetings. Accessed October 31, 2019.
10. Institute of Medicine. Digital Infrastructure for the Learning Health System: The Foundation for Continuous Improvement in Health and Health Care: Workshop Series Summary. Grossmann C, Powers B, McGinnis JM, eds. Washington, DC: National Academies Press; 2011.
11. Friedman C, Rubin J, Brown J, et al. Toward a science of learning systems: a research agenda for the high-functioning Learning Health System. J Am Med Inform Assoc 2015; 22 (1): 43–50.
12. Baldziki M, Brown J, Chan H, et al. Utilizing data consortia to monitor safety and effectiveness of biosimilars and their innovator products. J Manag Care Spec Pharm 2015; 21 (1): 23–34.
13. Curtis LH, Brown J, Platt R. Four health data networks illustrate the potential for a shared national multipurpose big-data network. Health Aff (Millwood) 2014; 33 (7): 1178–86.
14. Fleurence RL, Curtis LH, Califf RM, Platt R, Selby JV, Brown JS. Launching PCORnet, a national patient-centered clinical research network. J Am Med Inform Assoc 2014; 21 (4): 578–82.
15. Suissa S, Henry D, Caetano P, et al. CNODES: the Canadian Network for Observational Drug Effect Studies. Open Med 2012; 6 (4): e134–40.
16. Raman SR, Brown JS, Curtis LH, et al. Cancer screening results and follow-up using routinely collected electronic health data: estimates for breast, colon, and cervical cancer screenings. J Gen Intern Med 2019; 34 (3): 341–3.
17. U.S. Food & Drug Administration. Framework for FDA’s Real-World Evidence Program. Silver Spring, MD: U.S. Food & Drug Administration; 2018.
18. Sentinel Initiative. Active Risk Identification and Analysis (ARIA); 2019. https://www.sentinelinitiative.org/active-risk-identification-and-analysis-aria. Accessed October 31, 2019.
19. Lanes S, Brown JS, Haynes K, Pollack MF, Walker AM. Identifying health outcomes in healthcare databases. Pharmacoepidemiol Drug Saf 2015; 24 (10): 1009–16.
20. Li X, Fireman BH, Curtis JR, et al. Validity of privacy-protecting analytical methods that use only aggregate-level information to conduct multivariable-adjusted analysis in distributed data networks. Am J Epidemiol 2019; 188 (4): 709–23.
21. Sentinel Initiative. Data Quality Review and Characterization; 2019. https://www.sentinelinitiative.org/sentinel/data-quality-review-and-characterization. Accessed October 31, 2019.
22. Schneeweiss S, Brown JS, Bate A, Trifiro G, Bartels DB. Choosing among common data models for real-world data analyses fit for making decisions about the effectiveness of medical products. Clin Pharmacol Ther 2020. doi: 10.1002/cpt.1577.
23. Gini R, Schuemie M, Brown J, et al. Data extraction and management in networks of observational health care databases for scientific research: a comparison of EU-ADR, OMOP, Mini-Sentinel and MATRICE strategies. EGEMS (Wash DC) 2016; 4 (1): 1189.
24. Overhage JM, Ryan PB, Reich CG, Hartzema AG, Stang PE. Validation of a common data model for active safety surveillance research. J Am Med Inform Assoc 2012; 19 (1): 54–60.
25. Xu Y, Zhou X, Suehs BT, et al. A comparative assessment of Observational Medical Outcomes Partnership and Mini-Sentinel common data models and analytics: implications for active drug safety surveillance. Drug Saf 2015; 38 (8): 749–65.
26. Zhou X, Murugesan S, Bhullar H, et al. An evaluation of the THIN database in the OMOP Common Data Model for active drug safety surveillance. Drug Saf 2013; 36 (2): 119–34.
27. Trifiro G, Coloma PM, Rijnbeek PR, et al. Combining multiple healthcare databases for postmarketing drug and vaccine safety surveillance: why and how? J Intern Med 2014; 275 (6): 551–61.
28. Avillach P, Coloma PM, Gini R, et al. Harmonization process for the identification of medical events in eight European healthcare databases: the experience from the EU-ADR project. J Am Med Inform Assoc 2013; 20 (1): 184–92.
29. Hripcsak G, Levine ME, Shang N, Ryan PB. Effect of vocabulary mapping for conditions on phenotype cohorts. J Am Med Inform Assoc 2018; 25 (12): 1618–25.
30. Voss EA, Makadia R, Matcho A, et al. Feasibility and utility of applications of the common data model to multiple, disparate observational health databases. J Am Med Inform Assoc 2015; 22 (3): 553–64.
31. He M, Santiago Ortiz AJ, Marshall J, et al. Mapping from the International Classification of Diseases (ICD) 9th to 10th Revision for research in biologics and biosimilars using administrative healthcare data. Pharmacoepidemiol Drug Saf 2019; 1–8. doi: 10.1002/pds.4933.
32. Panozzo CA, Woodworth TS, Welch EC, et al. Early impact of the ICD-10-CM transition on selected health outcomes in 13 electronic health care databases in the United States. Pharmacoepidemiol Drug Saf 2018; 27 (8): 839–47.
33. Sentinel Initiative. Health Outcome of Interest Validations and Literature Reviews; 2019. https://www.sentinelinitiative.org/sentinel/surveillance-tools/validations-lit-review. Accessed October 31, 2019.
34. Sentinel Initiative. FDA-Catalyst Alignment with the CMS Linkage to the PCORI RELIANCE Trial; 2020. https://www.sentinelinitiative.org/content/fda-catalyst-alignment-cms-linkage-pcori-reliance-trial. Accessed October 31, 2019.
35. U.S. Food & Drug Administration. FDA’s Sentinel Initiative; 2019. https://www.fda.gov/safety/fdas-sentinel-initiative. Accessed October 31, 2019.
36. Ball R, Toh S, Nolan J, Haynes K, Forshee R, Botsis T. Evaluating automated approaches to anaphylaxis case classification using unstructured data from the FDA Sentinel System. Pharmacoepidemiol Drug Saf 2018; 27 (10): 1077–84.
37. Newton KM, Peissig PL, Kho AN, et al. Validation of electronic medical record-based phenotyping algorithms: results and lessons learned from the eMERGE network. J Am Med Inform Assoc 2013; 20 (e1): e147–54.
38. Agarwal V, Podchiyska T, Banda JM, et al. Learning statistical models of phenotypes using noisy labeled training data. J Am Med Inform Assoc 2016; 23 (6): 1166–73.
39. Corrigan-Curay J, Sacks L, Woodcock J. Real-world evidence and real-world data for evaluating drug safety and effectiveness. JAMA 2018; 320 (9): 867–8.
40. Sherman RE, Anderson SA, Dal Pan GJ, et al. Real-world evidence–what is it and what can it tell us? N Engl J Med 2016; 375 (23): 2293–7.
41. Sentinel Initiative. Exploration of Potential for Sentinel and PCORnet Data Linkage; 2020. https://www.sentinelinitiative.org/sentinel/data/complementary-data-sources/exploration-potential-sentinel-and-pcornet-data-linkage. Accessed October 31, 2019.

