Abstract
The Cherokee Nation Cancer Registry (CNCR) is the only tribally-operated Surveillance, Epidemiology, and End Results program registry. As registries, including CNCR, lack detailed data characterizing health behavior or comorbidity, we aimed to enrich CNCR by linking it with Cherokee Nation’s electronic medical record (EMR). We describe the process of a tribal-academic partnership and linking records between CNCR and EMR for American Indian people diagnosed with cancer from 2015-2020. Prior to data linkage, our team worked with the Cherokee Nation Governance Board and Institutional Review Board to ensure tribal data sovereignty was maintained. While not all persons in CNCR receive healthcare at Cherokee Nation, 63% linked with an EMR record. We observed differences (p<0.0001) between cancer site, year at diagnosis, age at diagnosis, and gender by EMR linkage status. Once we further validate linkages and assess data completeness, we will evaluate relationships between behavioral risk factors, comorbidities, and cancer outcomes.
Keywords: cancer, Surveillance, Epidemiology, End Results (SEER), data linkage, electronic medical records, tribal population health
Introduction
American Indian (AI) populations continue to experience cancer-related health disparities resulting in inequalities in incidence and mortality rates and cancer screening uptake.1-3 When adjusted for misclassification and age, AI individuals in Oklahoma have higher cancer incidence (33% higher) and mortality (41% higher) rates than Oklahoma non-Hispanic Whites (NHW) in recent years.4, 5
Cancer registries monitor the incidence and survival of cancer in the population but historically have faced challenges in reporting AI-specific cancer rates due to racial misclassification, geographic variation, and the small overall population (0.9% of the US).6-8 For over a decade, data from National Program of Cancer Registries (NPCR) states (including Oklahoma) and the Surveillance, Epidemiology, and End Results (SEER) program have been linked with Indian Health Service (IHS) patient registration data to account for racial misclassification. Using these linked data, health disparities among AI populations are well-established.3, 7 While registries play a critical role in cancer surveillance, they do not collect data on behavior and other risk factors, have incomplete data on comorbidities, and are limited in their ability to evaluate cancer etiology and outcomes.
In 1997, Cherokee Nation (CN) established the only tribally-operated SEER registry in the US, the CN Cancer Registry (CNCR).9 In order to improve data on risk factors and outcomes, CN established the CN Health Analytics Core (CNHAC) to conduct a linkage between the CNCR and CN Health System (CNHS) electronic medical records (EMR) in partnership with the University of Oklahoma Health Sciences Center (OUHSC). CN and OUHSC have partnered for research since 2000 to reduce the burden of cancer and other health conditions. In this paper, we describe the process of collaborating to conduct this data linkage, initial results of the data linkage, and future research plans.
Methods
Population
CN is the largest federally recognized tribe in the US, with an estimated 392,478 enrolled tribal citizens. An estimated 142,604 CN citizens reside within the 14 counties of the CN Reservation, which covers approximately 9,200 square miles of northeastern Oklahoma (Figure S1). Approximately 519,719 people reside on the CN Reservation, of whom 26% self-identified as AI or Alaska Native alone or in combination with one or more other races in 2019.10 CNHS provides health care via a network of nine health centers and a 60-bed inpatient hospital for Cherokee citizens as well as citizens of over 300 federally recognized tribes. CNHS accounts for approximately 8.6% of the national IHS user population and provides care to 37% of eligible active users in the IHS Oklahoma City Area.
Data Sources and Data Request Process
The CNCR maintains cancer data for all AI people within the CN Reservation and works under a contract with the New Mexico Tumor Registry for technical support. Since its formation in 1997 as a SEER registry, the CNCR has exchanged data twice a year with the Oklahoma Central Cancer Registry (OCCR) to obtain data on incident cancer cases among AI people who did not seek care within the CNHS. As part of this project, CN, in partnership with OUHSC, worked with the OCCR and the New Mexico Tumor Registry to adjust the timing of regular data exchanges so that all eligible cancer records were incorporated into both registries. This adjustment accounted for the differing timelines for data submissions to the Centers for Disease Control and Prevention (CDC) (for OCCR) and the National Cancer Institute (for CNCR). The estimated completeness of CNCR data for the last two years is 95%.
In 2015, CNHS transitioned its EMR from the IHS-supported Resource and Patient Management System (RPMS) to a system managed by Cerner. For this linkage, we included EMR data for cancer patients from August 9, 2015 (earliest date available) to June 30, 2020.
Prior to linking the CNCR data with the EMR, project leadership, including those at CN and OUHSC, worked with the CN Governance Board and IRB to ensure the privacy of CNHS patients was protected, the data obtained from the EMR were within the scope of the project, and tribal data sovereignty was maintained. This was the first project of its kind to conduct data linkages between health datasets within CN while allowing non-CN researchers to have data access. The Governance Board determined that OUHSC researchers could access identifiable data to conduct the linkage with data security protections in place. This was an iterative process that began with obtaining IRB approval for the project at both CN and OUHSC, establishing research agreements between the two institutions, and working with the Governance Board (Figure 1). We also hired dedicated staff to manage the project and implement data requests, including EMR data extractions. The project was approved because tribal leadership recognized the project’s potential to improve the health of CN citizens, improve the CNCR, and advance science while respecting tribal sovereignty, in addition to the prior trusted relationship with the research team.
Figure 1.
Data review and approval process at Cherokee Nation for the project (IRB: Institutional Review Board; CN: Cherokee Nation; OUHSC: University of Oklahoma Health Sciences Center).
Throughout the project, CN and OUHSC held bi-weekly conference calls and quarterly in-person meetings in Tahlequah (prior to pandemic-related restrictions beginning March 2020). During these in-person visits, we met with key staff at CN to work on infrastructure development, identification of available computing resources, and research needs of existing staff in CN Public Health.
Data Linkage
We included CNCR records for cancers diagnosed from January 1, 2015 through August 25, 2020 in this analysis (N=2,428). The CNHAC staff at CN extracted EMR data covering August 9, 2015 to June 30, 2020 for linkage with the CNCR. The extracted EMR data included demographics, behavioral and other risk factors (e.g., use of hormone replacement therapy, smoking status), cancer screening (e.g., mammogram), and comorbidities. To conduct the linkage between CNCR and CN EMR, we used Registry Plus Link Plus v.2.0 (CDC). We used year of birth as the blocking variable and social security number, date of birth, first name, and last name as linking variables, with manual review of potential matches. We used a chi-square test to compare covariates among cancer records that did and did not link to an EMR record.
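The blocking and matching steps described above can be illustrated with a minimal Python sketch. It is not the Link Plus algorithm: the agreement weights, the two cutoffs, the string-similarity measure, and the record layout are all hypothetical choices made for illustration, and the records are synthetic.

```python
from collections import defaultdict
from difflib import SequenceMatcher

# Hypothetical agreement weights; Link Plus computes its own probabilistic scores.
WEIGHTS = {"ssn": 5.0, "dob": 3.0, "first": 1.5, "last": 1.5}
MATCH_CUTOFF = 8.0   # assumed: at or above this score, accept the link
REVIEW_CUTOFF = 4.0  # assumed: between the cutoffs, send the pair to manual review


def name_similarity(a, b):
    """Return a 0-1 similarity between two names, ignoring case and edge spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def score_pair(reg, emr):
    """Sum weighted agreement across the four linking variables."""
    score = 0.0
    if reg["ssn"] and reg["ssn"] == emr["ssn"]:
        score += WEIGHTS["ssn"]
    if reg["dob"] == emr["dob"]:
        score += WEIGHTS["dob"]
    score += WEIGHTS["first"] * name_similarity(reg["first"], emr["first"])
    score += WEIGHTS["last"] * name_similarity(reg["last"], emr["last"])
    return score


def link(registry, emr_records):
    """Block on year of birth, then classify each registry record's best candidate."""
    blocks = defaultdict(list)
    for rec in emr_records:
        blocks[rec["dob"][:4]].append(rec)  # dob is formatted 'YYYY-MM-DD'

    results = {}
    for reg in registry:
        candidates = blocks.get(reg["dob"][:4], [])
        if not candidates:
            results[reg["id"]] = ("non-match", None)
            continue
        best = max(candidates, key=lambda emr: score_pair(reg, emr))
        score = score_pair(reg, best)
        if score >= MATCH_CUTOFF:
            results[reg["id"]] = ("match", best["id"])
        elif score >= REVIEW_CUTOFF:
            results[reg["id"]] = ("review", best["id"])
        else:
            results[reg["id"]] = ("non-match", None)
    return results


# Synthetic records for illustration only.
registry = [
    {"id": "R1", "ssn": "000-00-0001", "dob": "1955-03-14", "first": "Mary", "last": "Smith"},
    {"id": "R2", "ssn": "", "dob": "1962-07-01", "first": "John", "last": "Doe"},
    {"id": "R3", "ssn": "000-00-0003", "dob": "1948-11-30", "first": "Ann", "last": "Lee"},
]
emr = [
    {"id": "E1", "ssn": "000-00-0001", "dob": "1955-03-14", "first": "Mary", "last": "Smyth"},
    {"id": "E2", "ssn": "", "dob": "1962-07-01", "first": "Jon", "last": "Doe"},
    {"id": "E3", "ssn": "000-00-0009", "dob": "1971-01-01", "first": "Ann", "last": "Lee"},
]

results = link(registry, emr)
for reg_id, (status, emr_id) in results.items():
    print(reg_id, status, emr_id)
```

In this sketch, blocking restricts comparisons to pairs sharing a birth year, which keeps the number of candidate pairs manageable. The middle "review" band stands in for the manual review of potential matches.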
Results
In our initial data linkage, 63% of CNCR records from 2015-2020 linked with an EMR record (N=1,541 of the 2,428 unique individuals in the CNCR). We observed differences among CNCR patients by EMR linkage status, including younger age at cancer diagnosis among those who linked to an EMR record compared to those who did not link. We also observed a lower percentage of patients with lung cancer, but a higher percentage of breast, uterine, and prostate cancers, among patients with linked records. A higher percentage of CNCR patients who linked to an EMR record were alive as of December 2020 compared to those who did not link (Table 1).
Table 1.
Preliminary linkage results between electronic medical records and the Cherokee Nation Cancer Registry.
| Characteristic | Linked (N=1,541), N (%) | Non-Linked (N=887), N (%) | p-value |
|---|---|---|---|
| Sex | | | <0.0001 |
| Male | 638 (41.4) | 407 (45.9) | |
| Female | 770 (50.0) | 464 (52.3) | |
| Other/Unknown | 133 (8.6) | 16 (1.8) | |
| Year of Birth | | | <0.0001 |
| <1930 | 18 (1.2) | 29 (3.3) | |
| 1930-1939 | 111 (7.2) | 98 (11.0) | |
| 1940-1949 | 353 (22.9) | 241 (27.2) | |
| 1950-1959 | 512 (33.2) | 246 (27.7) | |
| 1960-1969 | 284 (18.4) | 141 (15.9) | |
| 1970-1979 | 150 (9.7) | 71 (8.0) | |
| 1980-1989 | 67 (4.3) | 30 (3.4) | |
| 1990-1999 | 35 (2.3) | 15 (1.7) | |
| ≥2000 | 11 (0.7) | 16 (1.8) | |
| Cancer Type | | | <0.0001 |
| Oral cavity/pharynx | 42 (2.7) | 24 (2.7) | |
| Digestive system | 78 (5.1) | 58 (6.5) | |
| Colorectal | 148 (9.6) | 76 (8.6) | |
| Liver | 39 (2.5) | 21 (2.4) | |
| Respiratory system | 24 (1.6) | 11 (1.2) | |
| Lung/bronchus | 179 (11.6) | 171 (19.3) | |
| Skin excluding basal and squamous cell carcinoma | 44 (2.9) | 29 (3.3) | |
| Breast | 267 (17.3) | 108 (12.2) | |
| Cervical | 13 (0.8) | 13 (1.5) | |
| Uterine | 65 (4.2) | 23 (2.6) | |
| Other female genital system | 35 (2.3) | 27 (3.0) | |
| Prostate | 147 (9.5) | 51 (5.7) | |
| Other male genital system | 12 (0.8) | 10 (1.1) | |
| Urinary bladder | 36 (2.3) | 19 (2.1) | |
| Kidney/renal pelvis | 105 (6.8) | 49 (5.5) | |
| Brain/central nervous system | 44 (2.9) | 35 (3.9) | |
| Endocrine system | 67 (4.3) | 36 (4.1) | |
| Lymphoma | 65 (4.2) | 31 (3.5) | |
| Myeloma | 35 (2.3) | 16 (1.8) | |
| Leukemia | 42 (2.7) | 25 (2.8) | |
| Other/unknown | 54 (3.5) | 54 (6.1) | |
| Age at Diagnosis | | | 0.003 |
| <20 years | 18 (1.2) | 19 (2.1) | |
| 20-29 years | 36 (2.3) | 19 (2.1) | |
| 30-39 years | 82 (5.3) | 38 (4.3) | |
| 40-49 years | 175 (11.4) | 86 (9.7) | |
| 50-59 years | 361 (23.4) | 173 (19.5) | |
| 60-69 years | 487 (31.6) | 278 (31.3) | |
| 70-79 years | 295 (19.1) | 194 (21.9) | |
| 80+ years | 87 (5.6) | 80 (9.0) | |
| Insurance/Payer at Diagnosis | | | <0.0001 |
| Uninsured | 80 (5.2) | 19 (2.1) | |
| Insured, TRICARE, Military, VA | 407 (26.4) | 258 (29.1) | |
| Medicaid | 90 (5.8) | 55 (6.2) | |
| Medicare | 510 (33.1) | 326 (36.8) | |
| Medicare/Medicaid | 24 (1.6) | 55 (6.2) | |
| IHS | 342 (22.2) | 108 (12.2) | |
| Unknown | 88 (5.7) | 66 (7.4) | |
| Reporting Source | | | <0.0001 |
| Hospital inpatient | 669 (43.4) | 723 (81.5) | |
| Radiation Treatment/Medical Oncology Rights | 713 (46.3) | 90 (10.1) | |
| Laboratory only | 152 (9.9) | 62 (7.0) | |
| Other^a | 7 (0.5) | 12 (1.4) | |
| Vital Status | | | <0.0001 |
| Dead | 326 (21.2) | 349 (39.3) | |
| Alive | 1,215 (78.8) | 538 (60.7) | |

^a Other includes physician's office/private medical practitioner, nursing/convalescent home/hospice, autopsy/death certificate, and other hospital outpatient units/surgery centers, combined due to small numbers in each category.
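As a worked illustration of the chi-square comparison named in the Data Linkage section, the sketch below applies a Pearson chi-square test to the vital status counts reported in Table 1. It assumes no continuity correction; the source does not state which software or correction was used, so the statistic is illustrative rather than a reproduction of the reported analysis.

```python
import math

# Vital status counts from Table 1: rows are linked / non-linked,
# columns are dead / alive.
observed = [
    [326, 1215],  # linked (N=1,541)
    [349, 538],   # non-linked (N=887)
]


def pearson_chi_square_2x2(table):
    """Return (statistic, p-value) for a 2x2 table, without continuity correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count under independence of linkage status and vital status.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (obs - expected) ** 2 / expected

    # With 1 degree of freedom, the chi-square upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


chi2, p_value = pearson_chi_square_2x2(observed)
linked_pct = 100 * 1541 / 2428  # share of CNCR individuals with an EMR match

print(f"chi-square = {chi2:.1f}, p = {p_value:.2e}, linked = {linked_pct:.1f}%")
```

Under these assumptions the p-value falls well below 0.0001, in line with the vital status row of Table 1, and the linked share rounds to the 63% reported in the Results.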
Discussion
Sixty-three percent of CNCR records linked with the EMR, which was expected because not all patients in the CNCR seek healthcare at the CNHS, given the availability of private or non-CN cancer care facilities within or near the CN Reservation. Lung cancer has a lower five-year relative survival (22%) than breast (90%), uterine (81%), and prostate (98%) cancers,11 which may result in fewer lung cancer cases linking with a current EMR record.
This was the first project in which an external research team accessed both the CNCR and EMR, which resulted in additional administrative and ethical review to ensure the data were protected. Allowing time for these reviews and the resulting project modifications was critical for a strong collaboration. Holding frequent meetings, with an emphasis on regular in-person meetings at the tribal headquarters, also helped to promote trust and provided an opportunity to clarify misunderstandings. We plan to continue regular meetings, including in-person meetings pending COVID-19 restrictions, to continue building a close relationship, address concerns about the research, and discuss future research questions that are of interest to CN leadership.
Strengths include the presence of a complete cancer registry and EMR database that allows for efficient data linkages. By working with both OCCR and the New Mexico Tumor Registry to promote data exchange with surrounding states and having regular linkages with IHS records, the cancer data are of high quality in line with other SEER cancer surveillance systems. This linkage between the CNCR and EMR will provide a unique opportunity for longitudinal follow-up of CNHS patients both prior to and after a cancer diagnosis to understand etiology and survivorship issues, which will allow for future intervention studies to reduce health disparities.
Limitations include an inability to identify CNCR patients who have not accessed CNHS for medical care prior to 2015 or who only rely on private healthcare providers outside of CNHS. Despite delays due to the COVID-19 pandemic and the need for CNHS to focus on the public health emergency as a healthcare delivery system, our team continued our collaboration to finalize the data linkage as planned.
Future directions include enhanced linkages with external data sources, including existing health information exchanges that work across EMR systems in Oklahoma, and pilot studies to evaluate factors related to cancer outcomes, including completion of recommended treatment and survivorship. As a next step, we are analyzing health outcomes related to breast cancer in CN. We are also working to link the CNCR and EMR data with the CN Diabetes Registry to understand whether having diabetes reduces survival from breast cancer. This linkage will provide a robust data source for future projects related to smoking cessation programs and human papillomavirus vaccination, in addition to future linkages with external data sources (e.g., environmental data), all of which are priorities for CN.
Supplementary Material
Figure S1. Age-adjusted cancer incidence rates for American Indians by county in Oklahoma,12 2013-2017. Data were obtained from OK2Share, Oklahoma State Department of Health’s web-based query system for health data. Cancer incidence data were available at the county level. However, the Cherokee Nation Reservation does not follow county lines.
Implications for Policy and Practice.
This was the first project in which an outside research team accessed detailed medical record data at Cherokee Nation, which resulted in additional administrative and ethical review to ensure the data were protected.
Allowing time for tribal reviews, flexibility in modifying the project, and ongoing communication were critical for a strong collaboration.
The ongoing collaboration between Cherokee Nation and the University of Oklahoma Health Sciences Center will strengthen the capacity of the Tribe to conduct cancer-related research and improve the health of AI populations living on the Cherokee Nation Reservation.
Funding:
Research reported in this publication was supported by the National Institute of General Medical Sciences of the National Institutes of Health under Award Number S06GM123546. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
List of Abbreviations
- AI
American Indian
- NHW
non-Hispanic Whites
- NPCR
National Program of Cancer Registries
- SEER
Surveillance, Epidemiology, and End Results
- IHS
Indian Health Service
- CN
Cherokee Nation
- CNCR
Cherokee Nation Cancer Registry
- CNHS
Cherokee Nation Health System
- EMR
Electronic medical records
- OUHSC
University of Oklahoma Health Sciences Center
- OCCR
Oklahoma Central Cancer Registry
- CDC
Centers for Disease Control and Prevention
- RPMS
Resource and Patient Management System
- IRB
Institutional Review Board
Footnotes
Conflicts of Interest: None to declare
Human Participant Compliance Statement: This study was reviewed and approved by the Cherokee Nation IRB and the University of Oklahoma Health Sciences Center IRB.
Contributor Information
Amanda E. Janitz, Department of Biostatistics and Epidemiology, Hudson College of Public Health, University of Oklahoma Health Sciences Center, Oklahoma City, OK.
Sydney A. Martinez, Department of Biostatistics and Epidemiology, Hudson College of Public Health, University of Oklahoma Health Sciences Center, Oklahoma City, OK.
Janis E. Campbell, Department of Biostatistics and Epidemiology, Hudson College of Public Health, University of Oklahoma Health Sciences Center, Oklahoma City, OK.
Mary L. Williams, Department of Biostatistics and Epidemiology, Hudson College of Public Health, University of Oklahoma Health Sciences Center, Oklahoma City, OK.
Stefanie Buckskin, Cherokee Nation Public Health, Cherokee Nation, Tahlequah, OK.
Christopher Armstrong, Cherokee Nation Public Health, Cherokee Nation, Tahlequah, OK.
Travis Wickliffe, Cherokee Nation Public Health, Cherokee Nation, Tahlequah, OK.
Amber S. Anderson, Department of Biostatistics and Epidemiology, Hudson College of Public Health, University of Oklahoma Health Sciences Center, Oklahoma City, OK.
Mark Doescher, Department of Family Medicine, College of Medicine, University of Oklahoma Health Sciences Center, Oklahoma City, OK.
Sohail Khan, Cherokee Nation Public Health, Cherokee Nation, Tahlequah, OK.
References
- 1. Cobb N, Espey D, King J. Health Behaviors and Risk Factors Among American Indians and Alaska Natives, 2000–2010. Am J Public Health. 2014:e1–e9. doi: 10.2105/AJPH.2014.301879
- 2. Espey DK, Jim MA, Cobb N, et al. Leading Causes of Death and All-Cause Mortality in American Indians and Alaska Natives. Am J Public Health. 2014:e1–e9. doi: 10.2105/AJPH.2013.301798
- 3. White MC, Espey DK, Swan J, Wiggins CL, Eheman C, Kaur JS. Disparities in Cancer Mortality and Incidence Among American Indians and Alaska Natives in the United States. Am J Public Health. 2014:e1–e11. doi: 10.2105/AJPH.2013.301673
- 4. Oklahoma State Department of Health. Vital Statistics 1999 to 2019, on Oklahoma Statistics on Health Available for Everyone (OK2SHARE). Accessed July 20, 2021, http://www.health.ok.gov/ok2share
- 5. United States Cancer Statistics. 2013 - 2017 Incidence, WONDER Online Database. Accessed March 6, 2021, http://wonder.cdc.gov/cancer-v2017.html
- 6. Dougherty TM, Janitz AE, Williams MB, et al. Racial Misclassification in Mortality Records Among American Indians/Alaska Natives in Oklahoma From 1991 to 2015. J Public Health Manag Pract. Sep/Oct 2019;25(Suppl 5):S36–S43. doi: 10.1097/phh.0000000000001019
- 7. Espey DK, Jim MA, Richards TB, Begay C, Haverkamp D, Roberts D. Methods for Improving the Quality and Completeness of Mortality Data for American Indians and Alaska Natives. Am J Public Health. 2014:e1–e9. doi: 10.2105/AJPH.2013.301716
- 8. Espey DK, Wiggins CL, Jim MA, Miller BA, Johnson CJ, Becker TM. Methods for improving cancer surveillance data in American Indian and Alaska Native populations. Cancer. 2008;113(S5):1120–1130.
- 9. National Cancer Institute. Surveillance, Epidemiology, and End Results Program: List of SEER Registries. Accessed February 9, 2017, http://seer.cancer.gov/registries/list.html
- 10. United States Census Bureau. ACS Demographic and Housing Estimates (DP05). Accessed March 16, 2021, https://data.census.gov/cedsci/table?text=dp05&g=2500000US5550&tid=ACSDP1Y2019.DP05&hidePreview=false
- 11. National Cancer Institute. Cancer Stat Facts. Accessed November 5, 2021, https://seer.cancer.gov/statfacts/
- 12. Oklahoma State Department of Health. Oklahoma Central Cancer Registry (OCCR) 2013 to 2017, on Oklahoma Statistics on Health Available for Everyone (OK2SHARE). Accessed April 23, 2021, http://www.health.ok.gov/ok2share