J Diabetes Investig. 2021 Aug 24;13(2):249–255. doi: 10.1111/jdi.13641

Appropriate definition of diabetes using an administrative database: A cross‐sectional cohort validation study

Yuichi Nishioka 1,2, Saki Takeshita 1, Shinichiro Kubo 1, Tomoya Myojin 1, Tatsuya Noda 1, Sadanori Okada 2, Hitoshi Ishii 3, Tomoaki Imamura 1, Yutaka Takahashi 2
PMCID: PMC8847127  PMID: 34327864

Abstract

Aims/Introduction

The purpose of the present study was to quantify errors in the diagnosis of diabetes for use in the national database, using a sufficient population size.

Materials and methods

A claims database constructed by the JMDC (Tokyo, Japan), using standardized disease classifications and anonymous record linkage, was used in this validation study. We included patients with health insurance claims data from April 2005 to March 2019 in the JMDC claims database. We excluded patients without a record of specific health checkups in Japan. Sample size calculation was based on a 5% prevalence of diabetes and 0.4% absolute accuracy (i.e., 1,250,000 individuals), to calculate the sensitivity, specificity, positive predictive value and negative predictive value.

Results

In total, 2,999,152 patients were included in this study, of which 165,515 were classified as having diabetes based on specific health checkups (validation cohort prevalence of 5.5%). The newly devised algorithm had three elements – the diagnosis‐related codes for diabetes without suspected flag, the medication codes for diabetes and then these two codes on the same record – and yielded a sensitivity of 74.6%, positive predictive value of 88.4% and Kappa Index of 0.80 (the highest values).

Conclusions

In future claims database studies, our validated algorithms will be useful as diagnostic criteria for diabetes.

Keywords: Administrative claims data, Diabetes, Validation


Algorithms 9 and 12 had three elements: (i) the diagnosis‐related codes for diabetes without suspected flag; (ii) the medication codes for diabetes; and (iii) then these two codes on the same record. These algorithms yielded a diagnosis that agrees with the results of the specific health checkups based on specificity, positive predictive value and Kappa Index. In future claims database studies, these validated algorithms will be useful as diagnostic criteria for diabetes.


INTRODUCTION

The National Database of Health Insurance Claims and Specific Health Checkups of Japan (NDB) is a comprehensive database of health insurance claims data under Japan's National Health Insurance system, 1 and it is one of the largest administrative databases worldwide 2 , 3 , 4 , 5 , 6 , 7 . The NDB includes information on the administrative data of all insured people (approximately 120 million people) in Japan 1 . The NDB also includes information on specific health checkups of 29 million people 8 .

Over the past decade, real‐world studies, including administrative claims data, have provided evidence in clinical research 9 , 10 . Administrative claims databases, such as the Centers for Medicare and Medicaid Services (CMS) Open Payments Database in the US, Clinical Practice Research Datalink (CPRD) in the UK and NDB Japan, provide information on large samples of patients considered to be representative of the target population; however, their purpose for data collection is administrative rather than for research. Because key clinical variables (e.g., severity), such as medications for which patients pay out‐of‐pocket, patient‐reported outcomes, lifestyle variables and laboratory results, are typically not captured, it is necessary to establish a unique definition of the disease that is different from the clinical definition 11 .

In the claims database, diseases are defined based on combinations of codes, such as diagnosis‐related codes, medicine codes, medical practice codes and specific equipment codes. It is difficult to define diseases based on the codes alone, because a diagnosis‐related code can carry a suspected flag that is entered so that an examination is covered by insurance, and because even confirmed diagnosis‐related codes do not necessarily satisfy clinical diagnostic criteria. It is also difficult to validate the definitions, because the NDB is prohibited from being linked with another database.

Diabetes is often defined using diagnosis‐related codes and diabetic medications 12 . Although a previous study proposed a definition of diabetes, there was no mention of how to handle data, such as the date of diagnosis, the diagnosis‐related codes with the suspected flags, and the diagnosis‐related codes and medication codes on the other records. It is important for researchers to use tested algorithms to help further studies refine and utilize appropriate algorithms 13 . We validated additional algorithms, including the handling of information associated with diagnosis‐related codes and medication codes. The purpose of the present study was to quantify errors in the diagnosis of diabetes for use in the NDB, using a sufficient population size.

MATERIALS AND METHODS

The present validation study was approved by the ethics committee of Nara Medical University (1123‐6, 8 October 2015). The need for informed consent was waived owing to the retrospective nature of the study. All patient data were anonymized before analysis. The principles outlined in the Declaration of Helsinki were followed. A claims database constructed by the JMDC (Tokyo, Japan), using standardized disease classifications and anonymous record linkage 14 , was used in the present validation study. This claims database was constructed with monthly claims from all medical institutions and pharmacies, specific health checkups and registries in Japan submitted from January 2005 to March 2019, and included 7,235,649 insured persons (approximately 5.7% of the Japanese population), mainly company employees and their family members. The JMDC database provided information on the beneficiaries, including encrypted personal identifiers, age, sex, International Classification of Diseases 10th revision procedure and diagnostic codes, as well as the name, dose and duration (days) of the prescribed and/or dispensed drugs. All drugs were coded according to the Anatomical Therapeutic Chemical classification of the European Pharmaceutical Market Research Association. An encrypted personal identifier was used to link claims data from different hospitals, clinics and pharmacies. A deterministic linkage or probabilistic linkage was not carried out.

We included patients with a record of the health insurance claims data from April 2005 to March 2019 in the JMDC claims database. We excluded patients without a record of specific health checkups in Japan. Sample size calculation was based on a 5% prevalence of diabetes and 0.4% absolute accuracy (i.e., 1,250,000 individuals), to calculate the sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV). As aforementioned, the JMDC claims database included more than 1,250,000 individuals required for the sample size calculation.
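The formula behind the 1,250,000 figure is not stated above. One calculation that reproduces it, offered here only as an assumption, is the usual precision‐based sample size for a proportion estimated among cases, n_cases = z² · p(1 − p) / d², taken at the worst case p = 0.5 with z rounded to 2, and then divided by the expected prevalence. The function name and its parameters below are hypothetical.

```python
from fractions import Fraction
import math


def required_total(prevalence, half_width, z=2, p=Fraction(1, 2)):
    """Total cohort size needed so that a proportion estimated among
    cases (e.g. sensitivity) has the requested CI half-width.

    Assumed formula (not stated in the source):
        n_cases = z^2 * p * (1 - p) / half_width^2
        n_total = n_cases / prevalence
    Exact rational arithmetic avoids floating-point rounding surprises.
    """
    prevalence = Fraction(prevalence)
    half_width = Fraction(half_width)
    n_cases = Fraction(z) ** 2 * p * (1 - p) / half_width ** 2
    return math.ceil(n_cases / prevalence)


# 5% prevalence, 0.4% absolute accuracy, z rounded to 2, worst-case p = 0.5
total = required_total("0.05", "0.004")
print(total)  # 1250000
```

Under these assumptions, 62,500 cases are needed, which at a 5% prevalence corresponds to 1,250,000 individuals in total.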

The candidate population included patients with diabetes identified based on algorithms 1–17 (Table 1). Algorithms 1–12 were built from various combinations of five elements: (i) whether they had diagnosis‐related codes for diabetes, which are shown in Table S1; (ii) whether the date of diabetes diagnosis was identified; (iii) whether the diagnosis‐related codes had a suspected flag; (iv) whether they had medication codes for diabetes, as shown in Table S2; and (v) whether the diagnosis‐related codes and medication codes were on the same record (“receipts” are issued monthly for each patient and each medical institution). Algorithms 13, 14 and 15 included measurement of hemoglobin A1c (HbA1c) or glycoalbumin, glucose and urine albumin, respectively. Algorithm 16 included any diagnosis‐related code, medication code or medical action code for diabetes, as presented in Tables S1–S3, with a view to achieving the highest sensitivity. Algorithm 17 combined algorithm 12 with measurements of HbA1c or glycoalbumin and glucose, with a view to achieving the highest specificity. Because of the much lower number of patients with medical action codes, only HbA1c, glycoalbumin and glucose were used in algorithm 17.
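As an illustration of how elements (iii), (iv) and (v) can be combined, the following minimal sketch flags patients who have a diagnosis‐related code without a suspected flag and a medication code on the same monthly record. The `Receipt` class and all of its field names are hypothetical and are not JMDC column names; the actual study was carried out with SQL Server, not with this code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Receipt:
    """One monthly claim record for one patient at one institution.

    All field names are hypothetical illustrations.
    """
    patient_id: str
    diabetes_dx: bool    # element (i): a Table S1 diagnosis code is present
    dx_suspected: bool   # element (iii): that code carries a suspected flag
    diabetes_rx: bool    # element (iv): a Table S2 medication code is present


def confirmed_dx_and_rx_same_record(receipts):
    """Return IDs of patients with a non-suspected diagnosis code and a
    medication code on the same receipt (element (v))."""
    return {
        r.patient_id
        for r in receipts
        if r.diabetes_dx and not r.dx_suspected and r.diabetes_rx
    }


receipts = [
    Receipt("A", diabetes_dx=True, dx_suspected=False, diabetes_rx=True),
    Receipt("B", diabetes_dx=True, dx_suspected=True, diabetes_rx=False),
    # Patient C has both codes, but on different receipts.
    Receipt("C", diabetes_dx=True, dx_suspected=False, diabetes_rx=False),
    Receipt("C", diabetes_dx=False, dx_suspected=False, diabetes_rx=True),
]
flagged = confirmed_dx_and_rx_same_record(receipts)
print(sorted(flagged))  # ['A']
```

Patient C is not flagged because the same‐record condition is evaluated per receipt rather than per patient.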

Table 1.

Algorithms for diagnosis of diabetes

Algorithm | With diagnosis‐related codes of diabetes (Table S1) | With the identified date of diabetes diagnosis (Table S1) | Diagnosis‐related codes without suspected flag (Table S1) | With medication for diabetes (Table S2) | With diagnosis and medication codes on the same record | Measuring hemoglobin A1c or glycoalbumin (Table S3) | Measuring glucose (Table S3) | Measuring urinary albumin (Table S3) | Others
Algorithm 1
Algorithm 2
Algorithm 3
Algorithm 4
Algorithm 5
Algorithm 6
Algorithm 7
Algorithm 8
Algorithm 9
Algorithm 10
Algorithm 11
Algorithm 12
Algorithm 13
Algorithm 14
Algorithm 15
Algorithm 16 Any code
Algorithm 17

Algorithm 16 classifies as having diabetes any patient with a diagnosis‐related code, medication code or medical action code for diabetes.

Patients were classified as having or not having diabetes based on: (i) HbA1c (≥6.5%) and fasting blood glucose (≥126 mg/dL); (ii) HbA1c (≥6.5%) and a random blood glucose level ≥200 mg/dL; (iii) the use of antidiabetes drugs documented in the specific health checkups records; (iv) fasting blood glucose ≥126 mg/dL with diabetic retinopathy; (v) a random blood glucose value ≥200 mg/dL with diabetic retinopathy; or (vi) fasting blood glucose ≥126 mg/dL, a random blood glucose value ≥200 mg/dL or diabetic retinopathy on two occasions (the second occasion indicates the onset of diabetes), according to the Japanese guideline 15 .
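The single‐checkup criteria (i) to (v) can be written as a small decision function. This is only a sketch under stated assumptions: the function name and parameters are hypothetical, missing measurements are represented as `None`, and criterion (vi), which needs results from two occasions, is deliberately left out.

```python
from typing import Optional


def diabetic_at_checkup(hba1c: Optional[float],
                        fasting_glucose: Optional[float],
                        random_glucose: Optional[float],
                        on_antidiabetic_drugs: bool,
                        retinopathy: bool) -> bool:
    """Apply criteria (i)-(v) to one specific health checkup.

    Units: HbA1c in %, glucose in mg/dL. Criterion (vi) is not covered
    because it requires findings on two separate occasions.
    """
    high_hba1c = hba1c is not None and hba1c >= 6.5
    high_fasting = fasting_glucose is not None and fasting_glucose >= 126
    high_random = random_glucose is not None and random_glucose >= 200
    return (
        (high_hba1c and high_fasting)         # (i)
        or (high_hba1c and high_random)       # (ii)
        or on_antidiabetic_drugs              # (iii)
        or (high_fasting and retinopathy)     # (iv)
        or (high_random and retinopathy)      # (v)
    )


# High HbA1c with high fasting glucose satisfies criterion (i).
print(diabetic_at_checkup(6.8, 130, None, False, False))  # True
# High HbA1c alone satisfies none of the single-checkup criteria.
print(diabetic_at_checkup(6.8, 110, None, False, False))  # False
```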

Descriptive statistics were used to characterize the study population. We computed the sensitivity, specificity, PPV, NPV, prevalence, kappa value and Youden Index for each algorithm by sex, age class and Japanese academic years (from April to March) with corresponding 95% confidence intervals. Sensitivity and specificity were the probabilities of each algorithm correctly identifying patients with and without diabetes, respectively. PPV was the proportion of those identified by an algorithm as having diabetes who truly had diabetes. NPV was the proportion of those identified by an algorithm as not having diabetes who truly did not have diabetes. The prevalence estimates were calculated for each algorithm. A kappa statistic was calculated for the agreement between each algorithm and the reference standard, in an attempt to identify the algorithms that maximize kappa 16 . The Youden Index was calculated to weigh sensitivity and specificity equally, as follows: sensitivity + specificity − 1.
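The measures defined above follow directly from the four cells of a 2 × 2 table. The sketch below uses the standard textbook formulas, including Cohen's kappa for two raters, and applies them to the counts reported for algorithm 9 in Table 3. The function name is hypothetical, and confidence intervals are omitted because the interval method is not stated.

```python
def accuracy_metrics(tp, tn, fn, fp):
    """Accuracy measures from a 2x2 table of algorithm vs. reference."""
    n = tp + tn + fn + fp
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    youden = sensitivity + specificity - 1
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "kappa": kappa,
        "youden": youden,
    }


# Counts reported for algorithm 9 in Table 3.
m = accuracy_metrics(tp=123_415, tn=2_817_411, fn=42_100, fp=16_226)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Rounded, these values match the reported row for algorithm 9: sensitivity 74.6%, specificity 99.4%, PPV 88.4%, NPV 98.5%, kappa 0.80 and Youden Index 0.74.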

Following the identification of the optimal algorithm, a manual chart review of false positive and false negative cases was carried out to determine the reasons for misclassification. All analyses were carried out using Microsoft SQL Server 2016 Standard® (Microsoft Corp., Redmond, WA, USA) and IBM SPSS for Windows (version 25.0; IBM Corp., Armonk, NY, USA).

RESULTS

Reference standard

In total, 2,999,152 patients were included in the present study, and 165,515 patients were classified as having diabetes based on specific health checkups (validation cohort prevalence of 5.5%). Table 2 shows the characteristics of the patients in this validation cohort.

Table 2.

Characteristics of the patients in the validation cohort

| Birth year | Females with DM | Females without DM | Males with DM | Males without DM |
| --- | --- | --- | --- | --- |
| 1930–1934 | 1 | 22 | 1 | 10 |
| 1935–1939 | 131 | 1,364 | 257 | 1,216 |
| 1940–1944 | 967 | 8,177 | 2,072 | 9,072 |
| 1945–1949 | 2,628 | 26,418 | 8,084 | 40,500 |
| 1950–1954 | 5,089 | 66,905 | 22,774 | 109,782 |
| 1955–1959 | 6,232 | 103,741 | 28,013 | 151,915 |
| 1960–1964 | 5,516 | 139,813 | 27,359 | 199,288 |
| 1965–1969 | 4,260 | 175,850 | 21,583 | 232,632 |
| 1970–1974 | 3,207 | 206,487 | 14,524 | 260,760 |
| 1975–1979 | 1,445 | 157,285 | 6,648 | 218,725 |
| 1980–1984 | 530 | 93,634 | 2,256 | 154,513 |
| 1985–1989 | 284 | 76,947 | 996 | 141,644 |
| 1990–1994 | 152 | 67,668 | 381 | 116,726 |
| 1995–1999 | 39 | 26,624 | 84 | 44,647 |
| 2000–2004 | 1 | 389 | 1 | 883 |
| Total | 30,482 | 1,151,324 | 135,033 | 1,682,313 |

With/without diabetes mellitus (DM) is classified as having or not having diabetes based on the health checkups.

Administrative data algorithm validation

Table 3 shows the accuracy of the administrative data algorithms in identifying patients with diabetes, using the JMDC claims database. The accuracy assessment of both algorithms 9 and 12 showed a sensitivity of 74.6%, PPV of 88.4% and Kappa Index of 0.80, which were the highest values. The algorithms using only diagnosis‐related codes (algorithms 1, 5, 7 and 10) resulted in sensitivities of 91.7, 91.7, 88.0 and 88.0%, respectively. Additionally, they resulted in PPVs of 17.7, 17.7, 37.9 and 37.9%. The algorithms including medication codes (algorithms 2, 3, 4, 6, 8, 9, 11 and 12) resulted in sensitivities of 74.8, 74.7, 74.6, 74.7, 74.7, 74.6, 74.7 and 74.6%, respectively, and PPVs of 85.3, 86.0, 87.9, 86.0, 87.4, 88.4, 87.4 and 88.4%. In particular, the algorithm using only medication codes (algorithm 2) resulted in a sensitivity of 74.8%, PPV of 85.3% and Kappa Index of 0.79.

Table 3.

Accuracy of administrative data algorithms to identify patients with diabetes

| Algorithm | TP | TN | FN | FP | Sensitivity, % (95% CI) | Specificity, % (95% CI) | PPV, % (95% CI) | NPV, % (95% CI) | Prevalence estimate | Kappa | Youden |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Algorithm 1 | 151,751 | 2,127,217 | 13,764 | 706,420 | 91.7 (91.6–91.8) | 75.1 (75.0–75.1) | 17.7 (17.6–17.8) | 99.4 (99.3–99.4) | 0.29 | 0.22 | 0.67 |
| Algorithm 2 | 123,777 | 2,812,361 | 41,738 | 21,276 | 74.8 (74.6–75.0) | 99.2 (99.2–99.3) | 85.3 (85.2–85.5) | 98.5 (98.5–98.6) | 0.05 | 0.79 | 0.74 |
| Algorithm 3 | 123,695 | 2,813,530 | 41,820 | 20,107 | 74.7 (74.5–74.9) | 99.3 (99.3–99.3) | 86.0 (85.8–86.2) | 98.5 (98.5–98.5) | 0.05 | 0.79 | 0.74 |
| Algorithm 4 | 123,485 | 2,816,705 | 42,030 | 16,932 | 74.6 (74.4–74.8) | 99.4 (99.4–99.4) | 87.9 (87.8–88.1) | 98.5 (98.5–98.5) | 0.05 | 0.80 | 0.74 |
| Algorithm 5 | 151,751 | 2,127,217 | 13,764 | 706,420 | 91.7 (91.6–91.8) | 75.1 (75.0–75.1) | 17.7 (17.6–17.8) | 99.4 (99.3–99.4) | 0.29 | 0.22 | 0.67 |
| Algorithm 6 | 123,695 | 2,813,530 | 41,820 | 20,107 | 74.7 (74.5–74.9) | 99.3 (99.3–99.3) | 86.0 (85.8–86.2) | 98.5 (98.5–98.5) | 0.05 | 0.79 | 0.74 |
| Algorithm 7 | 145,617 | 2,595,431 | 19,898 | 238,206 | 88.0 (87.8–88.1) | 91.6 (91.6–91.6) | 37.9 (37.8–38.1) | 99.2 (99.2–99.2) | 0.13 | 0.49 | 0.80 |
| Algorithm 8 | 123,612 | 2,815,897 | 41,903 | 17,740 | 74.7 (74.5–74.9) | 99.4 (99.4–99.4) | 87.4 (87.3–87.6) | 98.5 (98.5–98.5) | 0.05 | 0.80 | 0.74 |
| Algorithm 9 | 123,415 | 2,817,411 | 42,100 | 16,226 | 74.6 (74.4–74.8) | 99.4 (99.4–99.4) | 88.4 (88.2–88.5) | 98.5 (98.5–98.5) | 0.05 | 0.80 | 0.74 |
| Algorithm 10 | 145,617 | 2,595,431 | 19,898 | 238,206 | 88.0 (87.8–88.1) | 91.6 (91.6–91.6) | 37.9 (37.8–38.1) | 99.2 (99.2–99.2) | 0.13 | 0.49 | 0.80 |
| Algorithm 11 | 123,612 | 2,815,897 | 41,903 | 17,740 | 74.7 (74.5–74.9) | 99.4 (99.4–99.4) | 87.4 (87.3–87.6) | 98.5 (98.5–98.5) | 0.05 | 0.80 | 0.74 |
| Algorithm 12 | 123,415 | 2,817,411 | 42,100 | 16,226 | 74.6 (74.4–74.8) | 99.4 (99.4–99.4) | 88.4 (88.2–88.5) | 98.5 (98.5–98.5) | 0.05 | 0.80 | 0.74 |
| Algorithm 13 | 149,447 | 2,156,696 | 16,068 | 676,941 | 90.3 (90.1–90.4) | 76.1 (76.1–76.2) | 18.1 (18.0–18.2) | 99.3 (99.2–99.3) | 0.28 | 0.23 | 0.66 |
| Algorithm 14 | 150,434 | 1,559,091 | 15,081 | 1,274,546 | 90.9 (90.7–91.0) | 55.0 (55.0–55.1) | 10.6 (10.5–10.6) | 99.0 (99.0–99.1) | 0.48 | 0.10 | 0.46 |
| Algorithm 15 | 49,472 | 2,816,364 | 116,043 | 17,273 | 29.9 (29.7–30.1) | 99.4 (99.4–99.4) | 74.1 (73.8–74.5) | 96.0 (96.0–96.1) | 0.02 | 0.41 | 0.29 |
| Algorithm 16 | 155,540 | 1,513,000 | 9,975 | 1,320,637 | 94.0 (93.9–94.1) | 53.4 (53.3–53.5) | 10.5 (10.5–10.6) | 99.3 (99.3–99.4) | 0.49 | 0.10 | 0.47 |
| Algorithm 17 | 119,807 | 2,818,453 | 45,708 | 15,184 | 72.4 (72.2–72.6) | 99.5 (99.5–99.5) | 88.8 (88.6–88.9) | 98.4 (98.4–98.4) | 0.05 | 0.79 | 0.72 |

Reference standard: the specific health checkups in Japan (n = 165,515); total patients n = 2,999,152.

95% CI, 95% confidence interval; FN, false negative (the number of people for whom reported not having been prescribed diabetic medication, and recorded having been prescribed them); FP, false positive (the number of people for whom reported having been prescribed diabetic medication, and recorded not having been prescribed them); Kappa, Kappa Index; NPV, negative predictive value; PPV, positive predictive value; Prevalence estimate, prevalence of diabetes in the specific health checkups; TN, true negative (the number of people for whom reported not having been prescribed diabetic medication, and recorded not having been prescribed them); TP, true positive (the number of people for whom reported having been prescribed diabetic medication, and recorded having been prescribed them); Youden, Youden Index.

Supplemental raw data for original validated algorithms

Raw data for making original tested algorithms for diabetes in future administrative claims database studies are shown in Tables S4 and S5. Because Table S5 includes the information of specific health checkups, reference standards can be changed if necessary.

DISCUSSION

In the present study, we quantitatively evaluated the sensitivity and specificity of 17 algorithms for the diagnosis of diabetes based on the national claims data, using the JMDC claims database, which provided well‐balanced algorithms for diagnosis (algorithms 9 and 12) based on the Kappa Index. These algorithms were in almost perfect agreement with the diagnosis based on blood test values and diabetic prescriptions. Algorithms 9 and 12 had three elements: the diagnosis‐related codes for diabetes without suspected flag, the medication codes for diabetes and then these two codes on the same record. We also found that using the diabetes diagnosis date had no effect on the diagnosis of diabetes, as in algorithms 9 and 12.

If the aim is to capture as many people with diabetes as possible (generalization), algorithm 16 is useful, with the highest sensitivity of 94.0%. To classify patients by whether they have diabetes as an outcome, algorithm 17 has the highest specificity. To identify cohorts of patients with diabetes, algorithms 9 and 12 have the highest PPV. To rule out diabetes, that is, to minimize the chance that a person classified as not having diabetes actually has it, algorithm 1 has the highest NPV. In this manner, we have provided validated algorithms that match the research settings.

Previously, an algorithm for identifying diabetes in Canada was reported 17 . Although it had a sensitivity of 84.2%, specificity of 99.2%, PPV of 92.5% and Kappa Index of 0.87, it is not applicable to Japan because of differences in medical situations, including insurance systems. In the present study, with the use of only the disease name with or without suspected flag, many false positives were observed in the diagnosis (algorithms 1, 5, 7 and 10), suggesting that when diabetes is defined only by disease‐related codes, false positives are likely to be frequent. A plausible explanation is that the disease codes were entered so that HbA1c testing would be covered by insurance.

In contrast, other algorithms using medication codes have well‐balanced values of sensitivity, specificity, PPV, NPV, Kappa Index and Youden Index. People with a false negative diagnosis were untreated or might not have required drug therapy. In fact, the age‐standardized percentage of treated individuals among those requiring treatment for diabetes was 79.9% (95% confidence interval: 76.7–83.1) during the period 2013–2017. 18 People with a false positive diagnosis might be unaware that they are being prescribed diabetes medication.

The present study had several limitations. First, we used the JMDC claims database and excluded patients without a record of specific health checkups. Thus, there might have been a selection bias. Generalizability is therefore uncertain, and further validation studies are required. Second, our algorithms using medication codes overlooked patients with untreated diabetes or those who did not require drug therapy. The National Health and Nutrition Survey in Japan (2020) reported that 65.7% of those who have been diagnosed with diabetes have been treated. 19 It should be noted that our algorithms with medication codes could only identify patients with diabetes who have been prescribed antidiabetic drugs. Third, we used specific health checkup data as a reference standard. Misclassifications might have occurred in the judgment of the reference standards. In particular, there were patients with diabetes who could not be judged only by specific health checkups, which might have overestimated the false‐positive cases and underestimated the PPVs. However, there is no perfect reference standard, and it is best to use specific health checkups, which include blood test and fundus test results, as a reference standard.

In conclusion, algorithms 9 and 12 yielded a diagnosis that agrees with specific health checkups results according to specificity, PPV and Kappa Index. In future claims database studies, these validated algorithms will be useful as diagnostic criteria for diabetes.

DISCLOSURE

YN received consultant fees from Novo Nordisk. SO received speaker fees from Novo Nordisk, Mitsubishi Tanabe, Sumitomo Dainippon, Arkray, Bayer, Eli Lilly, Boehringer Ingelheim, Ono, AstraZeneca, Sanofi and Takeda, outside of the submitted work. HI received lecture fees and consultant fees from Takeda, Eli Lilly Japan, Sanofi, Merck & Co., Astellas, Mitsubishi Tanabe, Daiichi Sankyo, Ono, AstraZeneca, Taisho Toyama, Shionogi, Kowa, Boehringer Ingelheim, Novo Nordisk, Sumitomo Dainippon and Kyowa Hakko Kirin. YT received consultant fees from Novo Nordisk, Otsuka and Recordati, and speaker fees from Novo Nordisk, Sumitomo Dainippon, Eli Lilly, Ono, Novartis, Nippon Boehringer Ingelheim, AstraZeneca and Kyowa Kirin. The other authors declare no conflict of interest.

Approval of the research protocol: N/A

Informed Consent: N/A

Approval date of Registry and the Registration No. of the study/trial: N/A

Animal Studies: N/A

Supporting information

Table S1 | Diagnosis related‐codes for diabetes.

Table S2 | Medicine codes for diabetes.

Table S3 | Medical action codes related to diabetes.

Table S4 | Raw data for making original algorithms for diabetes in future administrative claims database study.

Table S5 | Raw data including specific health checkups for making original algorithms for diabetes in future administrative claims database studies

ACKNOWLEDGMENTS

We thank Editage (www.editage.com) for English language editing. This study was supported by a grant for young researchers from Research Institute of Research Institute of Healthcare Data Science, Young Researcher Grants (2021) from The Japan Diabetes Society and Japan Society for the Promotion of Science KAKENHI (grant number: JP18K17390, JP18H04126 and JP21K10451).

J Diabetes Investig 2022; 13: 249–255

[Correction added on 1 September 2021, after first online publication: All appearances of the term 'Japan Medical Center' in this article have been amended to 'JMDC' accordingly.]

References

  • 1. Kubo S, Noda T, Myojin T, et al. National Database of Health Insurance Claims and Specific Health Checkups of Japan (NDB): Outline and patient‐matching technique. 2018.
  • 2. Tanaka H, Sugiyama T, Ihana‐Sugiyama N, et al. Changes in the quality of diabetes care in Japan between 2007 and 2015: A repeated cross‐sectional study using claims data. Diabetes Res Clin Pract 2019; 149: 188–199.
  • 3. Nishioka Y, Okada S, Noda T, et al. Absolute risk of acute coronary syndrome after severe hypoglycemia: a population‐based 2‐year cohort study using the National Database in Japan. J Diabetes Investig 2020; 11: 426–434.
  • 4. Nakayama T, Imanaka Y, Okuno Y, et al. Analysis of the evidence‐practice gap to facilitate proper medical care for the elderly: investigation, using databases, of utilization measures for National Database of Health Insurance Claims and Specific Health Checkups of Japan (NDB). Environ Health Prev Med 2017; 22: 1–7.
  • 5. Hayashi S, Noda T, Kubo S, et al. Variation in fracture risk by season and weather: a comprehensive analysis across age and fracture site using a national database of health insurance claims in Japan. Bone 2019; 120: 512–518.
  • 6. Okumura Y, Sugiyama N, Noda T, et al. Psychiatric admissions and length of stay during fiscal years 2014 and 2015 in Japan: a retrospective cohort study using a nationwide claims database. J Epidemiol 2019; 29: 288–294.
  • 7. Nishioka Y, Noda T, Okada S, et al. Incidence and seasonality of type 1 diabetes: a population‐based 3‐year cohort study using the National Database in Japan. BMJ Open Diabetes Res Care 2020; 8: e001262.
  • 8. Ministry of Health, Labour and Welfare. The Data about Implementation status of a Specific Health Checkups (In Japanese). Available from: https://www.mhlw.go.jp/stf/newpage_03092.html Accessed August 7, 2021.
  • 9. Frieden TR. Evidence for health decision making ‐ beyond randomized, controlled trials. N Engl J Med 2017; 377: 465–475.
  • 10. Blonde L, Khunti K, Harris SB, et al. Interpretation and impact of real‐world clinical data for the practicing clinician. Adv Ther 2018; 35: 1763–1774.
  • 11. Gokhale M, Stürmer T, Buse JB. Real‐world evidence: the devil is in the detail. Diabetologia 2020; 63: 1694–1705.
  • 12. Hara K, Tomio J, Svensson T, et al. Association measures of claims‐based algorithms for common chronic conditions were assessed using regularly collected data in Japan. J Clin Epidemiol 2018; 99: 84–95.
  • 13. Chubak J, Pocobelli G, Weiss NS. Tradeoffs between accuracy measures for electronic health care data algorithms. J Clin Epidemiol 2012; 65: 343–349.e2.
  • 14. Kimura S, Sato T, Ikeda S, et al. Development of a database of health insurance claims: standardization of disease classifications and anonymous record linkage. J Epidemiol 2010; 20: 413–419.
  • 15. Araki E, Goto A, Kondo T, et al. Japanese clinical practice guideline for diabetes 2019. J Diabetes Investig 2020; 11: 1020–1076.
  • 16. Kundel HL, Polansky M. Measurement of observer agreement. Radiology 2003; 228: 303–308.
  • 17. Lipscombe LL, Hwee J, Webster L, et al. Identifying diabetes cases from administrative data: a population‐based validation study. BMC Health Serv Res 2018; 18: 316.
  • 18. Ikeda N, Nishi N, Sugiyama T, et al. Effective coverage of medical treatment for hypertension, diabetes and dyslipidaemia in Japan: An analysis of National Health and Nutrition Surveys 2003–2017. J Health Serv Res Policy 2020; 26: 106–114.
  • 19. The National Health and Nutrition Survey. Available from: https://www.mhlw.go.jp/stf/seisakunitsuite/bunya/kenkou_iryou/kenkou/eiyou/r1‐houkoku_00002.html Accessed July 2021 (In Japanese).
