No fewer than two articles in this issue of Deutsches Ärzteblatt (1, 2) are devoted to the quality of routine data. Vogelgesang and Thamm et al. compared patients’ self-reports for 11 diseases with the corresponding diagnoses documented in their statutory health insurance (SHI) data, while Bothe et al. investigated how prevalence estimates depend on the definition of a disease, specifically on operationalizations based on the ICD-10, the fee schedule item (Gebührenordnungsposition, GOP), and the German Uniform Value Scale (operation and procedure codes).
Clear discrepancies
In the first study (1), there were clear discrepancies in nine of 11 diagnoses; broad agreement was observed only for diabetes and hypertension. This result points to the larger topic of diagnostic validity and of how well patients are informed.
On the part of the patient, a lack of knowledge or poor understanding, among other things, may contribute to incomplete data, as may their denial or suppression of diseases, as well as a sense of shame or stigmatization. Methodological aspects also affect the observed frequency in SHI data, and thus agreement with self-reported information, for example:
Taking analysis periods of differing lengths (e.g., 12 or 24 months) into consideration
Different inclusion criteria, such as the presence of a diagnosis in one or two consecutive quarters.
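The effect of these two methodological choices can be illustrated with a minimal sketch. It assumes invented claims records (the person IDs, quarters, and function names are hypothetical and are not taken from either study) and counts how many people qualify as a case when the analysis period and the inclusion criterion are varied.

```python
from typing import Dict, Set, Tuple

Quarter = Tuple[int, int]  # (year, quarter 1..4)


def quarter_index(q: Quarter) -> int:
    """Map a (year, quarter) pair to a running integer so neighbours differ by 1."""
    year, qtr = q
    return year * 4 + (qtr - 1)


def quarters_back(end: Quarter, n_quarters: int) -> Set[Quarter]:
    """Analysis period: the n_quarters quarters ending with (and including) `end`."""
    end_idx = quarter_index(end)
    return {(i // 4, i % 4 + 1) for i in range(end_idx - n_quarters + 1, end_idx + 1)}


def is_case(dx_quarters: Set[Quarter], window: Set[Quarter], min_consecutive: int) -> bool:
    """True if the diagnosis appears in at least `min_consecutive` consecutive quarters of the window."""
    indices = sorted(quarter_index(q) for q in dx_quarters & window)
    run, prev = 0, None
    for i in indices:
        run = run + 1 if prev is not None and i == prev + 1 else 1
        if run >= min_consecutive:
            return True
        prev = i
    return False


def count_cases(claims: Dict[str, Set[Quarter]], window: Set[Quarter], min_consecutive: int) -> int:
    return sum(is_case(q, window, min_consecutive) for q in claims.values())


# Hypothetical data: quarters in which a given diagnosis was documented per person.
claims = {
    "A": {(2023, 3), (2023, 4)},
    "B": {(2023, 2)},
    "C": {(2022, 1), (2022, 2)},
    "D": {(2022, 4), (2023, 1)},
}

for months in (12, 24):
    window = quarters_back((2023, 4), months // 3)
    for min_consecutive in (1, 2):
        n = count_cases(claims, window, min_consecutive)
        print(f"{months} months, >= {min_consecutive} consecutive quarter(s): {n} cases")
```

With these invented records, the same four people yield between one and four cases depending solely on the period and criterion chosen, which is the kind of dependence that has to be reported alongside any frequency estimate.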
When calling for faster and better use of secondary data to answer questions in health care research, it is easy to overlook the purpose for which the data were collected in the first place: Health insurance data are first and foremost reimbursement data. Their purpose is not to provide the most precise and comprehensive documentation of care, but rather to depict currently treatment-relevant disease states, diagnostic measures, and prescriptions from the perspective of the therapists. If a patient has comorbidities that did not play a role in their consultation or did not result in any measures being taken, one does not expect these to be documented. However, this does not mean that the documenting physician is not aware of these concomitant diseases. The fact that physicians wish to protect their patients from health-condition labeling may also play a role.
The authors cite a range of other factors that can contribute to a discrepancy in the frequency of a diagnosis. The results of this article mean that methodological aspects need to be taken into consideration in a disease-specific (!) manner when drawing conclusions from data based on surveys or SHI routine data.
Different definitions
In the second article, Bothe et al. (2) demonstrate that when different definitions of a rare clinical picture are applied, in this case using chronic kidney failure as an example, one finds differing frequencies of the disease and, thus, differing prognoses and treatment costs. Depending on the definition used, prevalence and incidence differed by a factor of two, the proportion of the two sexes relative to affected individuals varied, and survival probabilities were also different.
There were also significant differences in terms of direct health care costs per month. Particularly in the case of rare syndromes and diseases, differences in the definition, which in themselves cause only small deviations, have a significant effect. These differences in disease frequency, prognosis, and costs indicate that the choice of definition results in slightly different questions being answered. Therefore, it is essential to provide a detailed description of the patients (or patient group) for which the results can be generalized. It is the task of every author to provide a rationale for their choice of definition and its effect.
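How strongly a prevalence estimate can depend on the case definition is easy to show with a toy calculation. The following sketch uses an invented population of ten insured persons and invented code assignments; the four definitions are illustrative assumptions only and are not the operationalizations applied by Bothe et al.

```python
# Hypothetical population of 10 insured persons, identified by the IDs 1..10.
population_size = 10

# Hypothetical sets of person IDs with at least one matching code of each type.
with_icd = {1, 2, 3, 4}   # ICD-10 diagnosis code
with_gop = {3, 4, 5}      # fee schedule item (GOP)
with_ops = {4, 6}         # operation and procedure code

# Four illustrative case definitions built from the same underlying records.
definitions = {
    "ICD-10 only": with_icd,
    "GOP only": with_gop,
    "ICD-10 and GOP": with_icd & with_gop,
    "any code type": with_icd | with_gop | with_ops,
}

# Prevalence = number of persons meeting the definition / population size.
prevalence = {name: len(ids) / population_size for name, ids in definitions.items()}

for name, value in prevalence.items():
    print(f"{name}: {value:.0%}")
```

In this invented example the strictest definition identifies 20% of the population and the broadest 60%, and each definition also selects a different group of people, so any downstream estimate of survival or cost would describe a different patient group.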
More and better routine data
The call for more, as well as more comprehensive and better, health care data, which has been heard for many years in the German health care system, has become significantly louder since the SARS-CoV-2 pandemic. In this context, the term generally refers to disease data, that is to say, data on disease frequency and severity, the use of treatments, and their outcomes or complications. The lament lingers on that, during the pandemic, investigators were forced to rely on study results from other countries, particularly from the United Kingdom with its centralized and underfinanced health care system.
The reasons for this lack of high-quality routine data are manifold, with some key points being:
Data protection regulations
A health-care system organized on a federal basis
Sectoral boundaries between outpatient and inpatient care
The sometimes differing standards in medical documentation and the software tools used to this end
And quite simply, a lack of data collection.
Added to this, the willingness of people in Germany to take part in studies involving surveys and examinations (referred to as primary data collection) has been declining for years. While participation was still around 80% in the 1980s (for example, the WHO-MONICA studies [3]), it has since dropped to approximately 20% in population studies (for example, the German National Cohort, NAKO [4]). This decline has also significantly amplified the call for the use of secondary data, generally from SHI or ambulatory health care data. As a result, the German Federal Minister of Health is working on statutory regulations that will improve and facilitate this use. It remains to be seen whether this initiative will be able to satisfy the desire for more reliable data.
Conclusion
The articles by Vogelgesang and Thamm et al. (1) and Bothe et al. (2) are a reminder that one needs to keep an eye on a multitude of methodological aspects when using routine data. This reminder is addressed not only to those who evaluate and draw conclusions from the data but also to all those who read these studies. Although routine data collected for other purposes are better than no data, incorrect conclusions, once drawn, are difficult to dispel. Therefore, one should always keep the limitations of routine data in mind.
Acknowledgments
Translated from the original German by Christine Rye.
Footnotes
Conflict of interest statement
The author declares that no conflict of interest exists.
Editorial to accompany the articles:
“The Agreement Between Diagnoses as Stated by Patients and Those Contained in Routine Health Insurance Data—Results of a Data Linkage Study”
by Felicitas Vogelgesang, Roma Thamm, et al.
and
“The Lack of a Standardized Definition of Chronic Dialysis Treatment in German Statutory Health Insurance Claims Data—Effects on Estimated Incidence and Mortality”
by Tim Bothe et al.
in this issue of Deutsches Ärzteblatt International
References
- 1. Vogelgesang F, Thamm R, Frerk T, Grobe TG, Saam J, Schumacher C, Thom J. The agreement between diagnoses as stated by patients and those contained in routine health insurance data—results of a data linkage study. Dtsch Arztebl Int. 2024;121:141–147. doi: 10.3238/arztebl.m2023.0250.
- 2. Bothe T, Fietz AK, Mielke N, Freitag J, Ebert N, Schäffner E. The lack of a standardized definition of chronic dialysis treatment in German statutory health insurance claims data—effects on estimated incidence and mortality. Dtsch Arztebl Int. 2024;121:148–154. doi: 10.3238/arztebl.m2024.0015.
- 3. Löwel H, Döring A, Schneider A, Heier M, Thorand B, Meisinger C; MONICA/KORA Study Group. Gesundheitswesen. 2005;67(Sonderheft 1):S13–S18. doi: 10.1055/s-2005-858234.
- 4. Peters A; German National Cohort (NAKO) Consortium. Framework and baseline examination of the German National Cohort (NAKO). Eur J Epidemiol. 2022;37:1107–1124. doi: 10.1007/s10654-022-00890-5.