The Agreement Between Diagnoses as Stated by Patients and Those Contained in Routine Health Insurance Data: Results of a Data Linkage Study

Felicitas Vogelgesang; Roma Thamm; Timm Frerk; Thomas G Grobe; Joachim Saam; Catharina Schumacher; Julia Thom

2024 Mar 8;121(5):141–147. doi: 10.3238/arztebl.m2023.0250

PMCID: PMC11539885 PMID: 38169330

Abstract

Background

The frequency of medical diagnoses is a figure of central importance in epidemiology and health services research. Prevalence estimates vary depending on the underlying data. For a better understanding of such discrepancies, we compared patients’ diagnoses as reported by themselves in response to our questioning with their diagnoses as stated in the routine data of their health insurance carrier.

Methods

For 6558 adults insured by BARMER, one of the statutory health insurance carriers in Germany, we compared the diagnoses of various illnesses over a twelve-month period, as reported by the patients themselves in response to our questioning (October to December 2021), with their ICD-10-based diagnosis codes (Q4/2020–Q3/2021). The degree of agreement was assessed with two kappa values, sensitivity, and specificity.

Results

The patients’ stated diagnoses of diabetes and hypertension agreed well or very well with their diagnosis codes, with kappa and PABAK values near 0.8, as well as very high sensitivity and specificity. Moderately good agreement with respect to kappa was seen for the diagnoses of heart failure (0.4), obesity, anxiety disorder, depression, and coronary heart disease (0.5 each). The poorest agreement (kappa ≤ 0.3) was seen for post-traumatic stress disorder, alcohol-related disorder, and mental and somatoform disorder. Agreement was worse with increasing age.

Conclusion

Diagnoses as stated by patients often differ from those found in routine health insurance data. Discrepancies that can be considered negligible were found for only two of the 11 diseases that we studied. Our investigation confirms that these two sources of data yield different estimates of prevalence. Age is a key factor; further reasons for the discrepancies should be investigated, and avoidable causes should be addressed.

The frequency of medical diagnoses (diagnostic prevalence) is a parameter of central importance in epidemiology and health services research. To estimate this frequency in the population, epidemiological survey studies as well as statutory health insurance claims data (SHI routine data) are used. Whereas medical diagnoses in survey studies are reported by the participants themselves, diagnostic prevalence in SHI data is based on diagnoses according to ICD-10 coding, the official classification for coding diagnoses in outpatient and inpatient care in Germany (1). Comparisons of these two data sources indicate differing estimates of nationwide diagnostic prevalence rates, and the differences are particularly pronounced in the case of mental disorders (2–4). The heterogeneity of the evidence makes it difficult to provide a summarizing evaluation and to infer recommendations for policy and practice.

There is a lack of conclusive information for Germany on the extent to which the diagnoses of various diseases differ between the two data sources. Discrepancies here relate to people not reporting the diagnoses that have been documented for them or reporting medical diagnoses that have not been documented. Person-specific linkage of self-reported survey data and routine data enables a valid quantification and investigation of these discrepancies (5–8). By means of this type of linkage, the present article describes the extent to which a person’s self-reporting of a medical diagnosis in a survey corresponds to the health insurance documentation for that person and the extent to which various non-communicable diseases differ in this respect.

Methods

This investigation was based on the study “Optimierte Datenbasis für Public Mental Health: Datenlinkage-Studie zur Aufklärung von Diskrepanzen zwischen Befragungs- und Routinedaten” (Optimized database for public mental health: data linkage study to investigate discrepancies between survey and routine data) (OptDatPMH; funded by the Innovation Committee at the German Federal Joint Committee [Gemeinsamer Bundesausschuss]: 01VSF19015). The Ethics Committee of the Lower Saxony Medical Association, Germany, granted its approval for the study.

Sample

The study population was based on a sample of people aged 18 years or older who were resident in Germany and had been insured by BARMER, one of the SHI carriers in Germany, for at least 12 months. The sample was stratified by 5-year age groups, sex, and German federal state. A total of 26,000 insured persons were randomly drawn from the strata, proportionate to the population distribution. In October 2021, the selected group was asked to complete a health questionnaire and, in a separate response, to consent to the linkage of pseudonymized survey data and BARMER data. Following two reminder letters, 7110 individuals returned their completed questionnaire (response rate: 27.3%); of these, data linkage was additionally possible for 6558 individuals (92.2%). To address possible sample bias due to selective participation, the data were weighted to the German population distribution with respect to sex, age, and region (federal state, East–West, Nielsen regions, district size; as of December 31, 2020) and to education (Microcensus 2018 [9]). The results of a comprehensive analysis of (non-)response will be published separately.

Surveying medical diagnoses in the questionnaire

The survey of medical diagnoses in the area of mental health (depression; anxiety disorder; post-traumatic stress disorder [PTSD]; somatoform disorder; dependence on, or harmful use of, alcohol [alcohol-related disorder]; any mental disorder) was carried out in the same way as the health monitoring conducted by the Robert Koch Institute (RKI) with the following question: “Have you ever been diagnosed with X by a doctor or psychotherapist?” If the answer was “yes,” the following question was asked: “Has X also occurred in the last 12 months?” For medical diagnoses of physical diseases (diabetes; cardiovascular disease or coronary heart disease [CHD]; heart attack; heart failure; hypertension; obesity), participants were asked the following: “Have you ever been diagnosed with X by a doctor?” If respondents answered yes, they were then asked the following question: “Has X also been present in the last 12 months?” In the case of hypertension, respondents were additionally asked whether they were using blood pressure-lowering drugs. The selection of diseases was made according to their public health relevance, on the basis of which they are prioritized for the surveillance of non-communicable diseases by the RKI.

Diagnostic information in routine data

The diagnoses documented in routine data are based on ICD-10 codes. Diagnoses from outpatient care, which make up the largest share, are available only on a quarterly basis, and the survey took place in the fourth quarter of 2021. Therefore, to compare the diagnoses in relation to the occurrence of the diseases or disorders in the preceding 12 months, the quarters 4/2020–3/2021 (n = 6558) were taken into consideration. Sensitivity analyses investigated the extent to which agreement in the diagnostic information changes when other time periods are considered (4/2020–4/2021 and 1/2021–4/2021). On the one hand, this was to investigate whether selecting a different period would yield better agreement for individuals who returned the survey documents later in quarter 4/2021. On the other hand, it tested whether respondents’ recollection of the previous 12 months was imprecise, such that diagnoses from somewhat longer ago were also reported. For the comparison of self-reported medical diagnoses in relation to lifetime (“ever”), the period of the preceding 10 years was used in the routine data (quarters 4/2011–3/2021, n = 5849). A disease was deemed to be documented if it had been coded in at least one quarter as an outpatient diagnosis (M1Q inclusion criterion) or as an inpatient diagnosis with the diagnostic certainty “confirmed” in that period. Sensitivity analyses were performed to investigate the extent to which agreement in the diagnoses changed if documentation of the ICD-10 code in at least two cases of treatment (M2C) or at least two quarters (M2Q) was selected as an inclusion criterion in routine data.
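The three inclusion criteria can be illustrated with a minimal sketch. The record layout (`quarter`, `case_id`, `certainty`) and the function name are hypothetical and not taken from the study. The sketch also assumes that the "confirmed" filter applies to every record and that quarters and treatment cases are counted across all settings; the text above does not specify how inpatient stays are handled under M2C and M2Q.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiagnosisRecord:
    """One coded diagnosis in the routine data (hypothetical layout)."""
    quarter: str    # e.g. "2020Q4"
    case_id: str    # identifier of the treatment case
    certainty: str  # e.g. "confirmed" or "suspected"


def is_documented(records, criterion="M1Q"):
    """Return True if the records satisfy the chosen inclusion criterion."""
    # Assumption: only records coded as "confirmed" count.
    confirmed = [r for r in records if r.certainty == "confirmed"]
    quarters = {r.quarter for r in confirmed}
    cases = {r.case_id for r in confirmed}
    if criterion == "M1Q":   # at least one quarter
        return len(quarters) >= 1
    if criterion == "M2Q":   # at least two different quarters
        return len(quarters) >= 2
    if criterion == "M2C":   # at least two different treatment cases
        return len(cases) >= 2
    raise ValueError(f"unknown criterion: {criterion}")


example = [
    DiagnosisRecord("2020Q4", "case-a", "confirmed"),
    DiagnosisRecord("2020Q4", "case-b", "confirmed"),
    DiagnosisRecord("2021Q1", "case-c", "suspected"),
]
for name in ("M1Q", "M2C", "M2Q"):
    print(name, is_documented(example, name))
```

In this invented example, the person has two confirmed treatment cases within a single quarter, so M1Q and M2C are met but M2Q is not.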

Measures of agreement and statistical analyses

To measure the degree of agreement between patient-reported and SHI routine data, Cohen’s kappa (κ), the prevalence- and bias-adjusted kappa (PABAκ), sensitivity, and specificity were calculated for each disease or disorder overall and separately for women and men and for different age groups (Box). For the question under investigation, sensitivity corresponds to the proportion of people with a documented diagnosis in the routine data who also reported the respective diagnosis in the survey (true positive). Specificity refers to the proportion of people not reporting a diagnosis relative to all people who do not have a diagnosis documented in the routine data (true negative). The term 1 − specificity represents the proportion of people without a documented diagnosis who nevertheless report a diagnosis in the survey (false positive) (10–12). However, it is important to bear in mind that in the present study, sensitivity and specificity are not defined in relation to a reference standard with optimal validity, as is usually the case. Instead, these two parameters relate to the extent to which a diagnosis documented in the BARMER claims data was reflected in the answers given by the insured persons surveyed. All analyses were performed using SAS version 9.4 statistical software. Further information on the methodology of the OptDatPMH study can be found in the eMethods Section.
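As an illustration of these definitions, the following sketch computes sensitivity and specificity from a 2 × 2 table, with the routine data as the reference. It uses the unweighted diabetes counts given in the Box. The function and variable names are illustrative only. The values reported in the tables are weighted and therefore need not match the unweighted results.

```python
def sensitivity_specificity(both, survey_only, routine_only, neither):
    """Sensitivity and specificity of self-report, with routine data as reference."""
    # Sensitivity: share of people with a documented diagnosis who also report it.
    sensitivity = both / (both + routine_only)
    # Specificity: share of people without a documented diagnosis who report none.
    specificity = neither / (neither + survey_only)
    return sensitivity, specificity


# Unweighted diabetes counts from the Box.
sens, spec = sensitivity_specificity(
    both=614, survey_only=19, routine_only=251, neither=5530
)
false_positive_share = 1 - spec

print(f"sensitivity:     {sens:.3f}")
print(f"specificity:     {spec:.3f}")
print(f"1 - specificity: {false_positive_share:.4f}")
```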

Box. Measures of agreement: Cohen’s kappa and PABAκ.

To assess the agreement between two categorical variables, Cohen’s kappa (κ) and the prevalence- and bias-adjusted kappa (PABAκ) are used.

$$\kappa = \frac{p_0 - p_1}{1 - p_1}, \qquad \mathrm{PABA}\kappa = 2p_0 - 1$$

p₀ = observed agreement, p₁ = expected agreement

Sample calculation for diabetes (unweighted):

	Routine data (BARMER)
	Diagnosis	No diagnosis	Marginal frequency
Data from the written survey	n (%)	n (%)	n (%)
Diagnosis	614 (9.6)	19 (0.3)	633 (9.9)
No diagnosis	251 (3.9)	5530 (86.2)	5781 (90.1)
Marginal frequency	865 (13.5)	5549 (86.5)	6414 (100)

The data from the two sources agree in 614 + 5530 = 6144 cases. The observed agreement is p₀= 6144/6414 = 0.958 = 95.8%.

The expected agreement describes the proportion of data in agreement from the two data sources that would be obtained if one were to make a completely random allocation to the “diagnosis” and “no diagnosis” groups for a given marginal distribution (mathematically referred to as independence). Even then, one would expect a certain proportion of cases to agree. For the agreeing judgment “diagnosis” in both data sources, it is, for example: 9.9% × 13.5% = 1.34%. Overall, the expected agreement is calculated as p₁ = 9.9% × 13.5% + 90.1% × 86.5% = 79.3%.

Thus, the observed agreement is approximately 16.5 percentage points higher than would have been expected with random allocation (given the marginal distribution).

For diabetes, this yields: κ = (0.958–0.793)/(1–0.793) = 0.165/0.207 = 0.797.

If the data from the two data sources are in full agreement, the value of Cohen’s κ is 1. A value of 0 means that the observed agreement does not differ from the expected agreement. Negative values indicate that the observed agreement is even lower than would have been expected in the case of random allocation.

To interpret κ, the agreement rating developed by Altman et al. is often used: < 0.20, ‘insufficient’; 0.21–0.40, ‘sufficient’; 0.41–0.60, ‘moderate’; 0.61–0.80, ‘good’; and 0.81–1.0 ‘very good’ (21). Thus, the κ calculated for diabetes (unweighted) is on the border between good and very good.

When using κ values, one must bear in mind that they are related to the prevalence of the entity under consideration (the lower the prevalence, the lower κ tends to be), which is why PABAκ values are recommended as a measure of agreement in the recent scientific literature (10, 22).

With two categories, PABAκ depends only on the observed agreement and, in the example for diabetes, is: 2 × 0.958 − 1 = 0.916. It is often interpreted in the same way as Cohen’s κ. PABAκ values tend to be very high at very low prevalence rates.
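The sample calculation in the Box can be retraced with a short sketch. The function names are illustrative. The handling of the category boundaries in `altman_label` is an assumption, since the rating quoted above leaves the exact cut-off points open.

```python
def agreement_measures(both, survey_only, routine_only, neither):
    """Observed and expected agreement, Cohen's kappa, and PABAK for a 2 x 2 table."""
    n = both + survey_only + routine_only + neither
    p_obs = (both + neither) / n                # observed agreement p0
    survey_yes = (both + survey_only) / n       # marginal share, survey
    routine_yes = (both + routine_only) / n     # marginal share, routine data
    # Expected agreement p1 under independence, given the marginal distribution.
    p_exp = survey_yes * routine_yes + (1 - survey_yes) * (1 - routine_yes)
    kappa = (p_obs - p_exp) / (1 - p_exp)
    pabak = 2 * p_obs - 1
    return p_obs, p_exp, kappa, pabak


def altman_label(kappa):
    """Verbal rating of kappa; boundary handling is an assumption."""
    if kappa <= 0.20:
        return "insufficient"
    if kappa <= 0.40:
        return "sufficient"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "very good"


# Unweighted diabetes counts from the Box.
p_obs, p_exp, kappa, pabak = agreement_measures(614, 19, 251, 5530)
print(f"p0 = {p_obs:.3f}, p1 = {p_exp:.3f}")
print(f"kappa = {kappa:.3f} ({altman_label(kappa)}), PABAK = {pabak:.3f}")
```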

eMethods.

Sampling

The aim of the sampling procedure was to determine a population-based sample of individuals insured by BARMER who were representative for Germany with regard to the characteristics of sex, age, and region of residence. In October 2021, this group of people participated in the postal survey of the OptDatPMH study. Only insured persons aged ≥ 18 years for whom precise information was available regarding birth and sex details, place of residence in Germany, and 92 insured days from October 1 to December 31, 2020 without evidence of subsequent termination of insurance (n = 6,815,683) were taken into consideration in the data available for sampling in 2020. All preselected insured individuals were allocated to one of a total of 512 strata according to sex, 5-year age group (18–19 years, 20–24 years, 25–29 years, etc., up to 90 years and older), and place of residence (differentiated according to 16 federal states). For the letter regarding the main survey, 26,000 insured persons were randomly selected from the 512 strata in such a way that the proportionate allocation of the strata among those contacted corresponded to the distribution of the average population in Germany in 2020 across the corresponding strata according to the Federal Statistical Office (Statistisches Bundesamt). Immediately prior to the letter, BARMER carried out an internal check of the current insurance status of those selected to receive the letter. Insured persons that could no longer be contacted due to having left the health insurance or death, as registered in the meantime, were replaced by other persons from the same stratum if necessary. Thus, in October 2021, precisely 26,000 insured individuals were contacted by letter for the survey. The distribution of structural characteristics of those contacted by letter was representative according to the population data available at the time of the survey. 
For subsequent analyses, weightings and standardizations were then based on population data for 2021, which, however, did not become available until summer 2022.
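The stratification described above can be sketched as follows. The count of 512 strata follows from 2 sexes × 16 age groups × 16 federal states. The allocation rule (largest remainder) and the equal population shares are placeholders chosen for illustration. The source states only that the allocation was proportionate to the population distribution published by the Federal Statistical Office.

```python
from itertools import product

SEXES = ["female", "male"]
# 18-19, then 5-year groups from 20-24 to 85-89, then 90 and older.
AGE_GROUPS = ["18-19"] + [f"{a}-{a + 4}" for a in range(20, 90, 5)] + ["90+"]
STATES = [f"state_{i:02d}" for i in range(1, 17)]  # placeholder labels

strata = list(product(SEXES, AGE_GROUPS, STATES))


def proportional_allocation(shares, total):
    """Allocate `total` units across strata in proportion to `shares`.

    Uses the largest-remainder rule so that the allocation sums to `total`.
    This rule is an assumption, not the study's documented procedure.
    """
    raw = {key: share * total for key, share in shares.items()}
    alloc = {key: int(value) for key, value in raw.items()}
    leftover = total - sum(alloc.values())
    by_remainder = sorted(raw, key=lambda k: raw[k] - alloc[k], reverse=True)
    for key in by_remainder[:leftover]:
        alloc[key] += 1
    return alloc


# Placeholder: equal population shares; real shares would come from official statistics.
equal_shares = {stratum: 1 / len(strata) for stratum in strata}
allocation = proportional_allocation(equal_shares, 26_000)

print(len(strata), "strata;", sum(allocation.values()), "persons allocated")
```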

The survey

Together with the written invitation to take part in the study, prospective participants received study information, a data-linkage consent form, and an 18-page questionnaire. Based on the pre-test, it was assumed that the questions would take 20–30 min to answer. Two envelopes were provided for return postage so that the consent form and the questionnaire documents could be processed separately. Questionnaire documents were read by the aQua Institute; the forms with written consent to data linkage were processed by BARMER’s trust center.

Response

Following a one-page written reminder in calendar week (CW) 43, followed by a second reminder in CW 48, which once again included all questionnaire documents, 7110 completed questionnaires were received by 11 January 2022 (excluding duplicates). As such, a response rate of 27.3% was achieved. In addition, 6581 questionnaires were accompanied by data linkage consent. A comparison of age and sex details in the survey with information in the routine data showed differences for 23 people, which suggested that a different person had completed the survey documents than the person to whom the questionnaire had been addressed. These individuals were not included in the data linkage, meaning that the data linkage was ultimately carried out for 6558 individuals. Linkage and analysis of the data took place exclusively in BARMER’s secure data warehouse.
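The participant flow reported above can be checked with a few lines of arithmetic. The variable names are illustrative.

```python
contacted = 26_000        # insured persons contacted by letter
returned = 7_110          # completed questionnaires (excluding duplicates)
with_consent = 6_581      # questionnaires accompanied by data linkage consent
implausible = 23          # age/sex mismatch between survey and routine data

linked = with_consent - implausible
response_rate = returned / contacted
linkage_share = linked / returned

print(f"linked individuals: {linked}")
print(f"response rate:      {response_rate:.1%}")
print(f"linkage share:      {linkage_share:.1%}")
```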

An analysis of response according to age, sex, and federal state was conducted and will be published separately. Differences between responders and non-responders will be discussed there in more detail. For the present analyses, possible discrepancies due to non-response were adjusted through weighting. A comparison of diagnostic frequency between participants and non-participants was also performed and will be published separately.

Weighting

Why use weighting?

Older rather than younger people tend to take part in surveys. This group not only has more but also different diseases compared to younger people. Moreover, the analyses show that agreement between the two data sources worsens with increasing age. If the different responses in the age groups had not been adjusted by weighting, this would have resulted in a skewed picture in which overall agreement would have been poorer and the proportions of disease higher. Mindful of the fact that a sample of persons insured by a health insurance carrier is not a population-wide, representatively drawn sample (for example, a sample drawn from residents’ registration offices), no nationwide prevalences are reported, but rather, proportions are shown that have been adjusted for selective willingness to participate according to age, sex, region of residence, and education.

Application of weighting

The weighting adjusts to the German population by sex, age group (same grouping as in the sampling procedure), and region. This process is iterative and takes into consideration the following rough allocations, differentiated according to region:

Federal state by age groups
West–East by education
Nielsen areas and district type.
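One common way to implement an iterative adjustment to several marginal distributions is iterative proportional fitting (raking). The sketch below assumes that technique. The source states only that the process is iterative, so this is not necessarily the procedure used in the study. The sample, the two margins, and the target shares are invented for illustration.

```python
from collections import defaultdict


def rake(records, margins, max_iter=100, tol=1e-10):
    """Iterative proportional fitting of weights to target marginal shares.

    records: list of dicts holding one category per variable.
    margins: {variable: {category: target share}}.
    Returns one weight per record; the weights sum to len(records).
    """
    n = len(records)
    weights = [1.0] * n
    for _ in range(max_iter):
        max_change = 0.0
        for var, targets in margins.items():
            # Current weighted total per category of this variable.
            totals = defaultdict(float)
            for rec, w in zip(records, weights):
                totals[rec[var]] += w
            # Rescale so the weighted share of each category hits its target.
            for i, rec in enumerate(records):
                factor = targets[rec[var]] * n / totals[rec[var]]
                weights[i] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        if max_change < tol:
            break
    return weights


def weighted_share(records, weights, var, category):
    """Weighted share of records falling into `category` of `var`."""
    hit = sum(w for rec, w in zip(records, weights) if rec[var] == category)
    return hit / sum(weights)


# Invented sample in which older people are overrepresented.
sample = (
    [{"age": "18-64", "sex": "female"}] * 2
    + [{"age": "18-64", "sex": "male"}] * 1
    + [{"age": "65+", "sex": "female"}] * 4
    + [{"age": "65+", "sex": "male"}] * 3
)
targets = {
    "age": {"18-64": 0.75, "65+": 0.25},
    "sex": {"female": 0.5, "male": 0.5},
}
weights = rake(sample, targets)
print(round(weighted_share(sample, weights, "age", "18-64"), 4))
print(round(weighted_share(sample, weights, "sex", "female"), 4))
```

After raking, the weighted shares of both variables match their targets simultaneously, even though each step adjusts only one margin at a time.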

Assignment of diseases to ICD-10 codes

Assignment was based on the results of previous methodological projects conducted at the Robert Koch Institute (RKI) and on the expertise of the co-authors from the aQua Institute.

Results

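The Figure as well as Tables 1–3 show, in addition to the agreement measures, the proportions of people in each of three groups: people for whom there is a diagnosis in both data sources; people who report a diagnosis that is not documented in the routine data (only in the survey); and people who have a diagnosis according to routine data but who do not report it (only in the routine data).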

Figure. Proportion of medical diagnoses as stated by patients in the survey and/or contained in routine data as well as the agreement in diagnoses between the two data sources using Cohen’s κ, PABAκ, sensitivity, and specificity.

The data relate to the 12 months preceding the time of the survey or to the 4th quarter of 2020–3rd quarter of 2021 (see data table in eTable 1 for illustration).

The proportion of individuals identified in at least one of the two data sources is calculated as the sum of the three proportions listed. Sensitivity corresponds to the percentage of people with a documented diagnosis in the routine data who also state the respective diagnosis in the survey, in relation to all people who have the diagnosis documented in the routine data. Specificity is the percentage of people without a documented diagnosis in the routine data who also do not state a respective diagnosis in the survey, in relation to all people who do not have a corresponding diagnosis documented in the routine data. κ, Cohen’s kappa measure of agreement; PABAκ, prevalence- and bias-adjusted kappa; PTSD, post-traumatic stress disorder

eTable 1. Proportion of medical diagnoses as stated by patients in the survey and/or contained in routine data as well as the agreement in diagnoses between the two data sources using Cohen’s κ, PABAκ, sensitivity, and specificity*.

		Proportion (%, weighted) and number (n) with a diagnosis of the disease/disorder
		In both data sources		Only in the survey		Only in the routine data
Disease/disorder	ICD-10 codes	%	n	%	n	%	n	Sensitivity % [95% CI]	Specificity % [95% CI]	κ [95% CI]	PABAκ [95% CI]	n Total
Hypertension and medication	I10–I15	29.4	2242	3.5	215	6.5	488	82.0 [80.1; 83.9]	94.6 [93.7; 95.4]	0.78 [0.76; 0.80]	0.80 [0.78; 0.82]	6367
Hypertension	I10–I15	26.6	1959	3.4	196	9.0	686	74.7 [72.8; 76.6]	94.8 [94.0; 95.6]	0.72 [0.70; 0.74]	0.75 [0.73; 0.77]	6213
PTSD	F43.1	0.6	34	3.0	175	0.2	17	74.4 [61.1; 87.8]	97.0 [96.5; 97.5]	0.27 [0.18; 0.35]	0.94 [0.92; 0.95]	6105
Diabetes mellitus	E10–E14	8.0	614	0.3	19	3.1	251	72.4 [68.8; 75.9]	99.7 [99.5; 99.8]	0.81 [0.78; 0.84]	0.93 [0.92; 0.94]	6414
Obesity	E66	6.4	406	5.1	291	5.2	341	55.3 [51.2; 59.5]	94.3 [93.4; 95.1]	0.50 [0.46; 0.54]	0.79 [0.78; 0.81]	5965
Heart failure	I50	2.2	174	3.9	299	1.8	139	55.0 [48.8; 61.1]	96.0 [95.4; 96.5]	0.41 [0.36; 0.46]	0.89 [0.87; 0.90]	6114
Anxiety disorder	F40, F41	3.7	194	3.9	235	3.3	226	52.4 [46.5; 58.3]	95.9 [95.2; 96.5]	0.47 [0.41; 0.52]	0.86 [0.84; 0.87]	6135
Depression	F32, F33.0–F33.3, F33.8–F33.9, F34.1 (excluding F30, F31)	7.1	407	3.5	188	8.2	569	46.5 [42.8; 50.2]	95.8 [95.0; 96.6]	0.48 [0.44; 0.52]	0.77 [0.75; 0.79]	6080
Coronary heart disease	I20–I25	2.9	241	1.4	103	4.6	371	38.6 [34.3; 42.9]	98.5 [98.1; 98.9]	0.46 [0.42; 0.51]	0.88 [0.87; 0.89]	6121
Alcohol-related disorder	F10	0.4	24	0.6	38	1.0	67	29.6 [17.2; 42.1]	99.3 [99.1; 99.6]	0.33 [0.20; 0.47]	0.97 [0.96; 0.97]	6432
Mental disorder	F00–F99	8.3	482	1.4	66	28.8	1863	22.3 [20.2; 24.3]	97.8 [97.1; 98.4]	0.24 [0.21; 0.26]	0.40 [0.37; 0.43]	6188
Somatoform disorder	F45	1.4	89	3.1	185	8.0	537	14.4 [11.0; 17.9]	96.7 [96.0; 99.3]	0.14 [0.10; 0.19]	0.78 [0.76; 0.80]	6047

* sorted in order of descending sensitivity; the data relate to the 12 months preceding the time of the survey or to the 4th quarter of 2020–3rd quarter of 2021

The proportion of individuals identified in at least one of the two data sources is calculated as the sum of the three proportions listed.

Sensitivity corresponds to the percentage of people with a documented diagnosis in the routine data who also state the respective diagnosis in the survey, in relation to all people who have their diagnosis documented in the routine data. Specificity is the percentage of people without a documented diagnosis in the routine data who also do not state a respective diagnosis in the survey, in relation to all people who do not have a corresponding diagnosis documented in the routine data.

κ, Cohen’s kappa measure of agreement; CI, confidence interval; PABAκ, prevalence- and bias-adjusted kappa; PTSD, post-traumatic stress disorder

eTable 3. Proportion of medical diagnoses as stated by patients in the survey and/or contained in routine data with regard to lifetime and agreement between the two data sources using Cohen’s κ, PABAκ, sensitivity, and specificity by sex and age group*.

Disease/disorder (ICD-10 codes)	Sex Age group	Proportion (%, weighted) and number (n) with a diagnosis of the disease/disorder						Sensitivity % [95% CI]	Specificity % [95% CI]	κ [95% CI]	PABAκ [95% CI]	n
		In both data sources		Only in the survey		Only in the routine data
		%	n	%	n	%	n
Myocardial infarction (I21, I22)	Total	2.4	165	1.5	103	0.4	30	85.3 [79.7; 90.8]	98.5 [98.2; 98.8]	0.71 [0.66; 0.77]	0.96 [0.95; 0.97]	5710
	Men	3.8	129	2.2	81	0.5	20	88.6 [83.1; 94.2]	97.7 [97.2; 98.3]	0.73 [0.67; 0.78]	0.95 [0.94; 0.96]	2481
	Women	1.2	36	0.8	22	0.4	10	77.2 [62.9; 91.4]	99.2 [98.8; 99.6]	0.67 [0.54; 0.80]	0.98 [0.97; 0.99]	3229
	45–64 Years	2.2	42	0.8	17	0.4	6	84.1 [72.2; 95.9]	99.2 [98.7; 99.6]	0.77 [0.67; 0.87]	0.98 [0.96; 0.99]	2102
	65–79 Years	5.6	85	3.2	48	0.6	13	90.4 [84.9; 95.8]	96.6 [95.4; 97.7]	0.73 [0.65; 0.81]	0.92 [0.90; 0.95]	1633
	≥ 80 Years	5.6	38	5.4	38	1.6	11	77.6 [64.7; 90.5]	94.2 [92.1; 96.2]	0.58 [0.46; 0.69]	0.86 [0.82; 0.90]	705
Hypertension (I10–I15)	Total	37.0	2429	4.3	216	8.9	588	80.6 [79.0; 82.3]	92.0 [90.8; 93.2]	0.73 [0.71; 0.75]	0.74 [0.72; 0.76]	5701
	Men	38.0	1206	5.1	105	8.9	286	81.0 [78.5; 83.4]	90.4 [88.3; 92.5]	0.72 [0.69; 0.75]	0.72 [0.69; 0.75]	2492
	Women	36.0	1223	3.6	111	8.8	302	80.3 [78.1; 82.6]	93.4 [92.0; 94.8]	0.75 [0.72; 0.77]	0.75 [0.72; 0.78]	3209
	18–29 Years	3.7	13	5.0	18	2.4	10	61.2 [37.4; 84.9]	94.7 [92.0; 97.4]	0.46 [0.27; 0.66]	0.85 [0.79; 0.91]	455
	30–44 Years	15.3	104	5.6	46	4.5	36	77.4 [69.4; 85.4]	93.0 [90.8; 95.2]	0.69 [0.61; 0.77]	0.80 [0.75; 0.85]	799
	45–64 Years	36.3	756	5.9	116	8.6	182	80.8 [78.0; 83.7]	89.3 [87.3; 91.4]	0.71 [0.67; 0.74]	0.71 [0.67; 0.75]	2095
	65–79 Years	64.9	1042	1.7	31	14.0	236	82.2 [80.0; 84.4]	92.1 [89.3; 95.0]	0.61 [0.57; 0.66]	0.69 [0.65; 0.72]	1639
	≥ 80 Years	72.8	514	0.5	5	17.4	124	80.7 [77.5; 84.0]	95.2 [90.7; 99.7]	0.43 [0.35; 0.51]	0.64 [0.58; 0.70]	713
Diabetes mellitus (E10–E14)	Total	10.3	688	0.8	52	4.8	341	68.0 [64.8; 71.3]	99.0 [98.7; 99.3]	0.75 [0.73; 0.78]	0.89 [0.87; 0.90]	5769
	Men	13.0	420	0.2	7	5.0	179	72.1 [68.1; 76.2]	99.7 [99.5; 100]	0.80 [0.77; 0.83]	0.90 [0.88; 0.91]	2497
	Women	7.8	268	1.4	45	4.7	162	62.6 [57.0; 68.2]	98.5 [98.0; 98.9]	0.69 [0.64; 0.74]	0.88 [0.86; 0.90]	3256
	30–44 Years	4.8	34	1.7	18	1.6	14	75.2 [62.4; 88.0]	98.2 [97.4; 99.0]	0.73 [0.63; 0.83]	0.93 [0.91; 0.96]	813
	45–64 Years	8.4	165	0.9	24	3.8	72	68.8 [61.8; 75.8]	99.0 [98.6; 99.5]	0.76 [0.70; 0.81]	0.91 [0.89; 0.93]	2121
	65–79 Years	20.1	317	0.4	6	8.9	148	69.4 [64.7; 74.0]	99.4 [98.8; 100]	0.75 [0.71; 0.79]	0.81 [0.78; 0.84]	1642
	≥ 80 Years	21.5	169	0.4	2	13.3	105	61.8 [55.2; 68.4]	99.4 [98.6; 100]	0.67 [0.61; 0.73]	0.73 [0.67; 0.78]	730
PTSD (F43.1)	Total	1.3	63	4.2	256	1.0	51	57.3 [46.9; 67.7]	95.7 [95.1; 96.3]	0.32 [0.25; 0.39]	0.90 [0.88; 0.91]	5479
	Men	0.3	8	3.3	86	0.4	10	44.9 [13.7; 76.1]	96.7 [95.8; 97.5]	0.13 [0.00; 0.26]	0.93 [0.91; 0.94]	2397
	Women	2.3	55	5.0	170	1.6	41	59.3 [48.2; 70.4]	94.8 [93.9; 95.6]	0.38 [0.30; 0.46]	0.87 [0.85; 0.89]	3082
Heart failure (I50)	Total	5.1	356	4.6	303	4.2	272	55.1 [50.5; 59.7]	95.0 [94.3; 95.6]	0.49 [0.45; 0.53]	0.83 [0.81; 0.84]	5513
	Men	6.0	226	4.9	163	4.1	140	59.6 [53.6; 65.6]	94.6 [93.7; 95.5]	0.52 [0.47; 0.58]	0.82 [0.80; 0.84]	2420
	Women	4.3	130	4.3	140	4.2	132	50.1 [42.7; 57.5]	95.3 [94.4; 96.2]	0.45 [0.39; 0.52]	0.83 [0.81; 0.85]	3093
	45–64 Years	3.1	60	3.4	65	2.3	50	57.7 [47.4; 68.0]	96.4 [95.5; 97.3]	0.49 [0.40; 0.59]	0.89 [0.86; 0.91]	2052
	65–79 Years	9.6	153	9.4	145	8.3	123	53.6 [46.9; 60.3]	88.5 [86.6; 90.4]	0.41 [0.35; 0.47]	0.65 [0.60; 0.69]	1553
	≥ 80 Years	21.9	141	9.3	71	16.0	89	57.8 [49.9; 65.7]	85.0 [81.2; 88.8]	0.44 [0.36; 0.53]	0.49 [0.42; 0.57]	652
Coronary heart disease (I20–I25)	Total	7.5	530	2.2	130	6.3	437	54.3 [50.6; 58.1]	97.4 [96.9; 98.0]	0.59 [0.56; 0.63]	0.83 [0.81; 0.85]	5544
	Men	10.6	381	2.0	60	6.5	217	62.1 [57.6; 66.5]	97.7 [96.9; 98.5]	0.67 [0.63; 0.71]	0.83 [0.81; 0.85]	2412
	Women	4.7	149	2.5	70	6.2	220	43.1 [37.0; 49.1]	97.3 [96.5; 98.0]	0.47 [0.41; 0.54]	0.83 [0.81; 0.85]	3132
	45–64 Years	4.8	88	1.5	29	5.1	105	48.4 [40.0; 56.7]	98.3 [97.7; 99.0]	0.56 [0.48; 0.64]	0.87 [0.84; 0.89]	2049
	65–79 Years	17.0	265	3.4	50	12.1	202	58.4 [53.3; 63.5]	95.2 [93.7; 96.7]	0.59 [0.54; 0.64]	0.69 [0.65; 0.73]	1576
	≥ 80 Years	25.0	174	7.3	44	19.7	119	56.0 [49.0; 63.0]	86.7 [82.4; 91.0]	0.44 [0.35; 0.52]	0.46 [0.37; 0.54]	659
Obesity (E66)	Total	10.7	573	2.8	149	12.1	661	46.9 [43.7; 50.1]	96.3 [95.6; 97.0]	0.50 [0.47; 0.54]	0.70 [0.68; 0.73]	5324
	Men	9.4	204	2.7	59	12.4	314	43.2 [37.9; 48.6]	96.5 [95.4; 97.7]	0.47 [0.41; 0.53]	0.70 [0.66; 0.74]	2272
	Women	11.8	369	3.0	90	11.8	347	49.9 [45.6; 54.3]	96.1 [95.2; 97.0]	0.53 [0.49; 0.57]	0.70 [0.67; 0.73]	3052
	18–29 Years	6.6	19	2.6	10	7.7	27	45.8 [29.2; 62.5]	97.0 [95.0; 98.9]	0.50 [0.34; 0.67]	0.79 [0.72; 0.87]	455
	30–44 Years	13.0	98	2.6	19	8.6	62	60.3 [51.5; 69.2]	96.7 [94.9; 98.5]	0.63 [0.55; 0.72]	0.78 [0.72; 0.83]	785
	45–64 Years	11.2	243	3.4	65	11.8	225	48.6 [43.5; 53.8]	95.6 [94.5; 96.8]	0.51 [0.46; 0.56]	0.70 [0.66; 0.73]	2026
	65–79 Years	12.0	170	2.5	39	17.9	259	40.1 [35.0; 45.2]	96.4 [95.1; 97.6]	0.43 [0.37; 0.48]	0.59 [0.55; 0.63]	1468
	≥ 80 Years	6.3	43	2.4	16	16.5	88	27.6 [19.1; 36.1]	96.9 [95.2; 98.5]	0.31 [0.21; 0.42]	0.62 [0.55; 0.70]	590
Alcohol-related disorder (F10)	Total	1.7	86	1.2	64	2.0	94	46.6 [38.3; 55.0]	98.8 [98.4; 99.1]	0.51 [0.43; 0.58]	0.94 [0.93; 0.95]	5763
	Men	2.7	58	1.6	39	2.8	62	48.7 [37.9; 59.5]	98.4 [97.8; 98.9]	0.53 [0.43; 0.63]	0.91 [0.89; 0.93]	2507
	Women	0.9	28	0.9	25	1.2	32	41.6 [27.1; 56.1]	99.1 [98.6; 99.6]	0.44 [0.31; 0.58]	0.96 [0.95; 0.97]	3256
	45–64 Years	2.3	40	1.7	28	2.3	43	50.6 [39.4; 61.7]	98.2 [97.5; 99.0]	0.52 [0.41; 0.62]	0.92 [0.90; 0.94]	2120
	65–79 Years	2.2	34	1.0	19	1.4	23	60.5 [45.1; 76.0]	99.0 [98.5; 99.5]	0.63 [0.50; 0.76]	0.95 [0.93; 0.97]	1645
Depression (F32, F33.0–F33.3, F33.8–F33.9, F34.1; excluding F30, F31)	Total	12.4	799	5.5	314	14.3	991	46.4 [43.6; 49.2]	92.4 [91.4; 93.5]	0.43 [0.40; 0.47]	0.60 [0.58; 0.63]	6243
	Men	9.2	262	5.2	131	9.8	330	48.2 [43.4; 53.1]	93.5 [92.2; 94.9]	0.46 [0.41; 0.51]	0.70 [0.67; 0.73]	2729
	Women	15.5	537	5.8	183	18.6	661	45.4 [42.0; 48.7]	91.2 [89.7; 92.7]	0.40 [0.36; 0.44]	0.51 [0.48; 0.55]	3514
	18–29 Years	8.8	50	5.6	36	4.8	28	64.7 [51.3; 78.0]	93.6 [91.1; 96.1]	0.57 [0.45; 0.69]	0.79 [0.73; 0.85]	630
	30–44 Years	11.2	113	7.3	64	11.1	121	50.4 [43.6; 57.1]	90.6 [87.9; 93.3]	0.44 [0.36; 0.51]	0.63 [0.58; 0.69]	995
	45–64 Years	15.7	379	6.1	137	15.1	342	51.0 [47.2; 54.9]	91.2 [89.6; 92.7]	0.46 [0.42; 0.50]	0.58 [0.54; 0.61]	2299
	65–79 Years	12.9	203	3.4	54	20.1	333	39.1 [34.5; 43.7]	94.9 [93.2; 96.5]	0.39 [0.33; 0.44]	0.53 [0.48; 0.58]	1632
	≥ 80 Years	7.3	54	2.7	23	26.6	167	21.6 [16.1; 27.2]	96.0 [94.3; 97.6]	0.21 [0.14; 0.28]	0.42 [0.34; 0.49]	687
Anxiety disorder (F40, F41)	Total	7.2	364	3.8	222	11.3	651	38.7 [35.0; 42.4]	95.4 [94.6; 96.1]	0.41 [0.37; 0.44]	0.70 [0.68; 0.72]	5522
	Men	5.3	110	3.5	96	7.1	184	42.7 [35.7; 49.8]	96.0 [95.0; 97.0]	0.44 [0.37; 0.51]	0.79 [0.76; 0.82]	2416
	Women	8.8	254	4.0	126	15.2	467	36.8 [32.7; 40.9]	94.7 [93.7; 95.7]	0.37 [0.33; 0.42]	0.62 [0.58; 0.65]	3106
	18–29 Years	10.0	36	2.0	7	9.1	38	52.3 [37.8; 66.8]	97.5 [95.4; 99.6]	0.58 [0.44; 0.72]	0.78 [0.70; 0.85]	443
	30–44 Years	6.9	59	2.4	18	12.5	102	35.4 [27.4; 43.4]	97.0 [95.4; 98.6]	0.40 [0.31; 0.49]	0.70 [0.64; 0.76]	785
	45–64 Years	7.8	159	5.0	99	10.2	216	43.4 [38.2; 48.7]	93.9 [92.6; 95.3]	0.42 [0.37; 0.47]	0.70 [0.66; 0.73]	2026
	65–79 Years	5.6	83	4.4	71	13.4	219	29.3 [23.5; 35.1]	94.5 [93.2; 95.8]	0.29 [0.22; 0.36]	0.64 [0.60; 0.68]	1590
	≥ 80 Years	4.7	27	3.3	27	11.6	76	28.6 [17.9; 39.2]	96.1 [94.4; 97.8]	0.31 [0.19; 0.43]	0.70 [0.64; 0.76]	678
Mental disorder (F00–F99)	Total	14.6	779	0.5	30	56.3	3186	20.6 [19.1; 22.1]	98.2 [97.5; 99.0]	0.12 [0.11; 0.13]	−0.14 [−0.17; −0.11]	5584
	Men	12.0	275	0.7	17	50.2	1262	19.3 [16.9; 21.8]	98.3 [97.3; 99.3]	0.14 [0.12; 0.16]	−0.02 [−0.07; 0.03]	2455
	Women	17.0	504	0.4	13	62.1	1924	21.5 [19.5; 23.5]	98.1 [96.9; 99.3]	0.09 [0.08; 0.11]	−0.25 [−0.29; −0.21]	3129
	18–29 Years	18.0	66	1.0	5	45.0	206	28.6 [21.5; 35.8]	97.3 [94.8; 99.8]	0.21 [0.14; 0.28]	0.08 [−0.03; 0.19]	446
	30–44 Years	14.6	119	0.4	4	55.9	437	20.7 [16.8; 24.7]	98.6 [97.1; 100]	0.13 [0.09; 0.16]	−0.13 [−0.21; −0.05]	783
	45–64 Years	17.2	363	0.7	13	55.9	1150	23.5 [21.3; 25.8]	97.5 [96.2; 98.9]	0.13 [0.11; 0.15]	−0.13 [−0.17; −0.09]	2061
	65–79 Years	11.2	182	0.2	5	60.6	955	15.6 [13.3; 18]	99.2 [98.4; 99.9]	0.09 [0.07; 0.11]	−0.22 [−0.27; −0.16]	1599
	≥ 80 Years	7.2	49	0.2	3	65.6	438	9.9 [6.6; 13.1]	99.3 [98.4; 100]	0.05 [0.03; 0.07]	−0.32 [−0.40; −0.24]	695
Somatoform disorder (F45)	Total	4.6	245	2.7	151	28.2	1605	13.9 [12.1; 15.8]	96.0 [95.3; 96.8]	0.12 [0.10; 0.15]	0.38 [0.35; 0.41]	5436
	Men	2.9	67	2.4	62	19.6	514	13.0 [9.5; 16.5]	96.9 [95.9; 97.8]	0.14 [0.09; 0.18]	0.56 [0.52; 0.60]	2387
	Women	6.1	178	2.9	89	36.2	1091	14.4 [12.2; 16.6]	95.0 [93.7; 96.3]	0.11 [0.08; 0.13]	0.22 [0.18; 0.26]	3049
	18–29 Years	3.4	15	2.4	10	21.5	95	13.7 [6.1; 21.3]	96.7 [94.4; 99.1]	0.14 [0.03; 0.25]	0.52 [0.42; 0.62]	437
	30–44 Years	4.5	39	2.2	18	28.5	243	13.5 [9.0; 18.1]	96.7 [95.0; 98.5]	0.13 [0.07; 0.19]	0.39 [0.31; 0.46]	778
	45–64 Years	6.7	133	3.3	67	27.6	562	19.4 [16.1; 22.7]	95.0 [93.7; 96.3]	0.17 [0.13; 0.21]	0.38 [0.34; 0.42]	1976
	65–79 Years	3.0	48	2.6	42	31.3	494	8.8 [6.0; 11.7]	96.0 [94.6; 97.5]	0.06 [0.02; 0.10]	0.32 [0.27; 0.37]	1565
	≥ 80 Years	2.0	10	1.9	14	32.4	211	5.9 [1.9; 9.8]	97.1 [95.0; 99.3]	0.04 [−0.02; 0.10]	0.31 [0.23; 0.40]	680

* sorted in order of descending sensitivity; the data relate to lifetime or to the 4th quarter of 2011–3rd quarter of 2021

Only the results for age groups with at least 10 cases in the survey and/or in the routine data are shown.

The proportion of individuals identified in at least one of the two data sources is calculated as the sum of the three proportions listed.

κ, Cohen’s kappa measure of agreement; CI, confidence interval; PABAκ, prevalence- and bias-adjusted kappa; PTSD, post-traumatic stress disorder

People for whom there is a diagnosis in both data sources
People who report a diagnosis that is not documented in the routine data (only in the survey)
People who have a diagnosis according to routine data but who do not report this (only in the routine data).

Agreement between diagnoses as stated by patients and those contained in routine data relating to the previous 12 months

For the majority of the 11 diseases or disorders, diagnoses are more frequently documented in routine data than reported by patients in the survey. However, the diagnoses for heart failure and PTSD are more frequently reported in the survey than they are documented (Figure, eTable 1).
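The comparison of frequencies between the two sources can be retraced from the three proportions reported per disease. The following is a minimal sketch, assuming the rounded weighted percentages from the "Total" rows of eTable 2 (with hypertension taken from the row that includes medication); the function and variable names are illustrative and not part of the study. On these rounded figures, anxiety disorder also comes out marginally more frequent in the survey (3.9% versus 3.3% in the discordant cells), alongside PTSD and heart failure.

```python
# Weighted proportions (%) from the "Total" rows of eTable 2:
# (in both data sources, only in the survey, only in the routine data)
TOTALS = {
    "Hypertension + medication": (29.4, 3.5, 6.5),
    "PTSD": (0.6, 3.0, 0.2),
    "Diabetes mellitus": (8.0, 0.3, 3.1),
    "Obesity": (6.4, 5.1, 5.2),
    "Heart failure": (2.2, 3.9, 1.8),
    "Anxiety disorder": (3.7, 3.9, 3.3),
    "Depression": (7.1, 3.5, 8.2),
    "Coronary heart disease": (2.9, 1.4, 4.6),
    "Alcohol-related disorder": (0.4, 0.6, 1.0),
    "Mental disorder": (8.3, 1.4, 28.8),
    "Somatoform disorder": (1.4, 3.1, 8.0),
}


def source_prevalences(both, survey_only, routine_only):
    """Return (survey %, routine data %, either source %) for one disease."""
    survey = both + survey_only
    routine = both + routine_only
    either = both + survey_only + routine_only
    return round(survey, 1), round(routine, 1), round(either, 1)


def more_frequent_in_survey(rows):
    """Names of diseases reported in the survey more often than documented."""
    # The concordant cell cancels out, so only the discordant cells matter.
    return [name for name, (_, s_only, r_only) in rows.items() if s_only > r_only]


for name, cells in TOTALS.items():
    survey, routine, either = source_prevalences(*cells)
    print(f"{name:28s} survey {survey:5.1f}  routine {routine:5.1f}  either {either:5.1f}")
```

Because the inputs are rounded to one decimal place, differences of 0.1 percentage points (as for obesity) should not be over-interpreted.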

The highest sensitivity is seen for the self-reporting of the diagnosis of high blood pressure. Taking the question on current use of antihypertensive drugs into account, 82% of persons with a documented hypertension diagnosis in routine data reported this diagnosis in the survey. With sensitivities of between 74.4% and 38.6%, this was followed by patient-reported diagnoses of PTSD, diabetes, obesity, heart failure, anxiety disorder, depression, and CHD. Sensitivities of under 30% were seen for patients’ stated diagnoses of alcohol-related disorder, any type of mental disorder, and somatoform disorder. The latter was reported in the survey by only 14.4% of those with a documented diagnosis in the routine data. Specificity varied between 99.7% for diabetes and 94.3% for obesity. This means that out of all the people who do not have a documented diagnosis of diabetes, only a very small proportion (0.3%) stated this diagnosis in the survey. For obesity, this percentage of people is 5.7%, closely followed by hypertension (5.4%).
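As an illustration of how sensitivity and specificity follow from the reported proportions, the sketch below treats the routine-data diagnosis as the reference and the survey answer as the test. It assumes the rounded weighted percentages from the "Total" rows of eTable 2; the function name is hypothetical, and the study's own estimates were computed from unrounded weighted data, so small deviations from the published values are expected.

```python
def sensitivity_specificity(both, survey_only, routine_only):
    """Sensitivity and specificity (in %) of the survey answer,
    treating the routine-data diagnosis as the reference.

    Inputs are percentages of the whole sample; the share with no
    diagnosis in either source is the remainder to 100.
    """
    neither = 100.0 - both - survey_only - routine_only
    sensitivity = 100.0 * both / (both + routine_only)
    specificity = 100.0 * neither / (neither + survey_only)
    return sensitivity, specificity


# "Total" rows of eTable 2 (weighted %, rounded to one decimal place)
sens_htn, spec_htn = sensitivity_specificity(29.4, 3.5, 6.5)  # hypertension + medication
sens_dm, spec_dm = sensitivity_specificity(8.0, 0.3, 3.1)     # diabetes mellitus

print(f"Hypertension + medication: {sens_htn:.1f} / {spec_htn:.1f}")
print(f"Diabetes mellitus:         {sens_dm:.1f} / {spec_dm:.1f}")
```

The complement of specificity is the share of people without a documented diagnosis who nevertheless report one, which is how the 0.3% for diabetes in the text is obtained.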

The degrees of agreement between diagnoses in the two data sources also vary according to Cohen’s κ. The highest κ values of around 0.8 were achieved for patient-reporting of diabetes and hypertension. This shows good to very good agreement between the two data sources. Moderate agreement was seen for self-reported obesity, depression, anxiety disorder, CHD, and heart failure. Sufficient agreement was found for patients’ reporting of an alcohol-related disorder, PTSD, or any mental disorder. The lowest κ (0.14) was for somatoform disorder, for which there was insufficient agreement in the stated diagnoses.

Good to very good agreement based on the calculation of PABAκ can be observed for patients’ stated diagnoses for virtually all diseases and disorders considered. Only for the diagnosis of any mental disorder is there moderate agreement with a PABAκ = 0.4 (Figure, eTable 1).
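The two agreement coefficients can be approximated from the same three proportions. The sketch below uses the standard definitions of Cohen’s κ (observed agreement corrected for chance agreement) and of PABAκ (twice the observed agreement minus one); it is an assumption that these match the study’s weighted computation in every detail, and the function name is illustrative. Inputs are the rounded "Total" percentages from eTable 2.

```python
def kappa_and_pabak(both, survey_only, routine_only):
    """Cohen's kappa and PABAK for a 2x2 table given as percentages."""
    a = both / 100.0
    b = survey_only / 100.0
    c = routine_only / 100.0
    d = 1.0 - a - b - c                  # no diagnosis in either source
    p_observed = a + d                   # share of concordant answers
    p_survey, p_routine = a + b, a + c   # marginal "yes" shares per source
    p_expected = p_survey * p_routine + (1 - p_survey) * (1 - p_routine)
    kappa = (p_observed - p_expected) / (1 - p_expected)
    pabak = 2 * p_observed - 1
    return kappa, pabak


kappa_dm, pabak_dm = kappa_and_pabak(8.0, 0.3, 3.1)    # diabetes mellitus
kappa_md, pabak_md = kappa_and_pabak(8.3, 1.4, 28.8)   # any mental disorder

print(f"Diabetes mellitus: kappa {kappa_dm:.2f}, PABAK {pabak_dm:.2f}")
print(f"Mental disorder:   kappa {kappa_md:.2f}, PABAK {pabak_md:.2f}")
```

PABAκ depends only on the share of concordant answers, which is why it stays high whenever most people have no diagnosis in either source.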

Differentiated results for measures of agreement relating to diseases or disorders in the preceding 12 months according to sex and age group are reported in eTable 2. While no relevant gender differences can be seen for sensitivity, specificity, κ, or PABAκ, the degrees of agreement decline with increasing age (eTable 2).

eTable 2. Proportion of medical diagnoses as stated by patients in the survey and/or contained in routine data with regard to the previous 12 months as well as agreement between the two data sources using Cohen’s κ, PABAκ, sensitivity, and specificity by sex and age group*.

Disease/disorder (ICD-10 codes)	Sex / age group	In both data sources (%, weighted)	In both data sources (n)	Only in the survey (%, weighted)	Only in the survey (n)	Only in the routine data (%, weighted)	Only in the routine data (n)	Sensitivity % [95% CI]	Specificity % [95% CI]	κ [95% CI]	PABAκ [95% CI]	n
Hypertension + medication (I10-I15)	Total	29.4	2242	3.5	215	6.5	488	82.0 [80.4; 83.6]	94.6 [93.8; 95.3]	0.78 [0.76; 0.80]	0.80 [0.79; 0.82]	6367
	Men	30.4	1141	3.9	106	6.5	257	82.3 [79.9; 84.7]	93.8 [92.5; 95.2]	0.77 [0.75; 0.80]	0.79 [0.77; 0.82]	2798
	Women	28.4	1101	3.1	109	6.4	231	81.7 [79.5; 83.8]	95.2 [94.3; 96.2]	0.79 [0.76; 0.81]	0.81 [0.79; 0.83]	3583
	30–44 Years	10.4	88	3.3	33	3.3	30	76.0 [67.2; 84.9]	96.2 [94.7; 97.6]	0.72 [0.64; 0.80]	0.87 [0.83; 0.90]	1013
	45–64 Years	30.7	715	4.6	103	5.2	130	85.5 [82.8; 88.1]	92.8 [91.3; 94.3]	0.79 [0.76; 0.81]	0.80 [0.78; 0.83]	2349
	65–79 Years	58.4	959	2.7	48	12.4	201	82.5 [80.3; 84.7]	90.8 [88.1; 93.5]	0.67 [0.63; 0.71]	0.70 [0.66; 0.73]	1668
	≥ 80 Years	66.4	471	2.0	16	17.6	122	79.0 [75.3; 82.8]	87.7 [80.5; 94.9]	0.48 [0.40; 0.56]	0.61 [0.54; 0.68]	717
Hypertension (I10-I15)	Total	26.6	1959	3.4	196	9.0	686	74.7 [72.8; 76.6]	94.8 [94.0; 95.6]	0.72 [0.70; 0.74]	0.75 [0.73; 0.77]	6213
	Men	27.4	988	3.8	96	9.6	374	74.0 [71.2; 76.9]	94.0 [92.6; 95.3]	0.70 [0.67; 0.73]	0.73 [0.70; 0.76]	2720
	Women	25.8	971	3.0	100	8.4	312	75.4 [72.8; 78.0]	95.5 [94.6; 96.4]	0.74 [0.71; 0.76]	0.77 [0.75; 0.80]	3493
	30–44 Years	9.5	83	3.4	33	4.0	31	70.4 [59.6; 81.1]	96.1 [94.6; 97.6]	0.68 [0.59; 0.77]	0.85 [0.81; 0.89]	990
	45–64 Years	28.8	644	4.5	95	7.2	174	80.0 [77.0; 83.0]	93.0 [91.6; 94.5]	0.74 [0.71; 0.78]	0.77 [0.74; 0.80]	2275
	65–79 Years	51.0	822	2.5	42	19.4	308	72.4 [69.5; 75.4]	91.6 [89.0; 94.2]	0.55 [0.51; 0.59]	0.56 [0.52; 0.61]	1629
	≥ 80 Years	60.1	402	1.7	12	23.3	168	72.1 [68.1; 76.1]	89.8 [82.9; 96.7]	0.41 [0.33; 0.48]	0.50 [0.43; 0.57]	692
PTSD (F43.1)	Total	0.6	34	3.0	175	0.2	17	74.4 [62.3; 86.6]	97.0 [96.4; 97.5]	0.27 [0.17; 0.36]	0.94 [0.92; 0.95]	6105
	Men	0.2	4	2.3	58	0.1	2	81.2 [44.1; 100]	97.6 [96.9; 98.4]	0.16 [0.00; 0.33]	0.95 [0.94; 0.97]	2678
	Women	1.0	30	3.7	117	0.4	15	73.0 [59.0; 87.0]	96.3 [95.5; 97.1]	0.31 [0.21; 0.42]	0.92 [0.90; 0.94]	3427
	30–44 Years	1.0	9	2.7	23	0.3	4	77.4 [52.9; 100]	97.3 [96.1; 98.5]	0.39 [0.16; 0.63]	0.94 [0.92; 0.97]	975
	45–64 Years	0.6	15	4.1	91	0.2	7	73.1 [55.4; 90.7]	95.8 [94.8; 96.9]	0.20 [0.10; 0.31]	0.91 [0.89; 0.93]	2210
Diabetes mellitus (E10–E14)	Total	8.0	614	0.3	19	3.1	251	72.4 [68.8; 75.9]	99.7 [99.5; 99.8]	0.81 [0.78; 0.84]	0.93 [0.92; 0.94]	6414
	Men	10.4	391	0.2	7	3.4	141	75.5 [71.2; 79.7]	99.8 [99.6; 100]	0.83 [0.80; 0.86]	0.93 [0.92; 0.94]	2805
	Women	5.7	223	0.4	12	2.8	110	67.5 [61.8; 73.2]	99.6 [99.4; 99.8]	0.77 [0.72; 0.81]	0.94 [0.93; 0.95]	3609
	30–44 Years	3.0	26	0.3	4	0.8	9	79.3 [65.4; 93.2]	99.7 [99.4; 100]	0.85 [0.75; 0.95]	0.98 [0.97; 0.99]	1022
	45–64 Years	7.3	160	0.3	5	2.2	44	76.5 [69.8; 83.1]	99.7 [99.5; 100]	0.84 [0.79; 0.89]	0.95 [0.94; 0.96]	2363
	65–79 Years	18.0	287	0.6	7	6.6	114	73.0 [68.1; 77.9]	99.3 [98.7; 99.9]	0.79 [0.75; 0.83]	0.86 [0.83; 0.88]	1668
	≥ 80 Years	17.7	137	0.4	2	11.1	84	61.6 [54.0; 69.1]	99.5 [98.7; 100]	0.69 [0.62; 0.76]	0.77 [0.72; 0.82]	716
Obesity (E66)	Total	6.4	406	5.1	291	5.2	341	55.3 [51.1; 59.5]	94.3 [93.5; 95.0]	0.50 [0.46; 0.54]	0.79 [0.78; 0.81]	5965
	Men	5.5	144	4.7	110	5.4	174	50.4 [43.5; 57.4]	94.7 [93.6; 95.9]	0.46 [0.40; 0.53]	0.80 [0.77; 0.83]	2563
	Women	7.3	262	5.4	181	5.0	167	59.4 [53.9; 64.9]	93.8 [92.8; 94.8]	0.52 [0.48; 0.57]	0.79 [0.77; 0.82]	3402
	18–29 Years	2.6	13	4.1	16	3.4	16	43.2 [22.0; 64.4]	95.7 [93.3; 98.1]	0.37 [0.17; 0.57]	0.85 [0.79; 0.91]	632
	30–44 Years	7.2	71	5.7	54	2.7	27	72.4 [62.9; 82.0]	93.7 [91.9; 95.4]	0.58 [0.50; 0.67]	0.83 [0.79; 0.87]	987
	45–64 Years	8.1	196	5.3	120	4.9	105	62.4 [56.0; 68.8]	93.9 [92.8; 95.1]	0.56 [0.50; 0.61]	0.80 [0.77; 0.82]	2260
	65–79 Years	7.0	104	5.5	81	10.2	151	40.8 [34.0; 47.6]	93.4 [91.9; 94.9]	0.38 [0.31; 0.45]	0.69 [0.65; 0.73]	1495
	≥ 80 Years	3.1	22	3.4	20	7.2	42	29.9 [17.1; 42.6]	96.2 [94.4; 98.1]	0.31 [0.18; 0.45]	0.79 [0.73; 0.85]	591
Heart failure (I50)	Total	2.2	174	3.9	299	1.8	139	55.0 [48.1; 61.8]	96.0 [95.4; 96.5]	0.41 [0.36; 0.46]	0.89 [0.87; 0.90]	6114
	Men	2.5	108	4.1	166	2.0	81	55.2 [47.0; 63.3]	95.7 [94.9; 96.5]	0.41 [0.35; 0.48]	0.88 [0.86; 0.89]	2686
	Women	2.0	66	3.7	133	1.7	58	54.7 [44.4; 65.1]	96.2 [95.5; 96.9]	0.41 [0.32; 0.49]	0.89 [0.88; 0.91]	3428
	45–64 Years	1.7	34	2.5	58	0.9	23	65.4 [52.1; 78.8]	97.4 [96.7; 98.2]	0.48 [0.36; 0.60]	0.93 [0.92; 0.95]	2292
	65–79 Years	3.7	66	8.3	131	5.2	69	41.7 [32.3; 51.1]	90.9 [89.4; 92.4]	0.28 [0.21; 0.36]	0.73 [0.69; 0.77]	1554
	≥ 80 Years	12.5	72	13.5	92	7.4	43	62.9 [51.5; 74.3]	83.1 [79.3; 87.0]	0.41 [0.31; 0.52]	0.58 [0.51; 0.66]	630
Anxiety disorder (F40, F41)	Total	3.7	194	3.9	235	3.3	226	52.4 [46.9; 57.8]	95.9 [95.2; 96.5]	0.47 [0.42; 0.52]	0.86 [0.84; 0.87]	6135
	Men	2.8	59	3.1	84	2.0	72	58.5 [48.7; 68.2]	96.7 [95.9; 97.6]	0.50 [0.41; 0.59]	0.90 [0.88; 0.92]	2689
	Women	4.5	135	4.6	151	4.6	154	49.2 [42.2; 56.2]	95.0 [94.1; 95.9]	0.44 [0.38; 0.51]	0.82 [0.79; 0.84]	3446
	18–29 Years	4.7	23	3.7	17	2.4	13	65.8 [46.0; 85.6]	96.0 [93.8; 98.2]	0.57 [0.40; 0.75]	0.88 [0.83; 0.93]	614
	30–44 Years	4.1	41	2.3	23	1.8	19	69.2 [55.4; 82.9]	97.5 [96.5; 98.6]	0.64 [0.53; 0.75]	0.92 [0.89; 0.95]	981
	45–64 Years	3.9	85	5.0	112	3.3	72	54.4 [46.2; 62.5]	94.6 [93.5; 95.7]	0.44 [0.37; 0.51]	0.83 [0.81; 0.86]	2257
	65–79 Years	2.5	33	3.6	52	5.4	87	31.5 [23.0; 40.1]	96.1 [95.0; 97.3]	0.31 [0.22; 0.40]	0.82 [0.79; 0.85]	1618
	≥ 80 Years	2.2	12	4.3	31	5.3	35	29.8 [13.4; 46.1]	95.3 [93.4; 97.2]	0.27 [0.10; 0.43]	0.81 [0.76; 0.86]	665
Depression F32, F33.0–F33.3, F33.8–F33.9, F34.1 (excluding F30, F31)	Total	7.1	407	3.5	188	8.2	569	46.5 [42.8; 50.2]	95.8 [95.0; 96.6]	0.48 [0.44; 0.52]	0.77 [0.75; 0.79]	6080
	Men	5.8	137	2.8	69	6.0	209	49.3 [42.7; 56.0]	96.8 [95.9; 97.8]	0.52 [0.46; 0.59]	0.82 [0.80; 0.85]	2677
	Women	8.4	270	4.2	119	10.4	360	44.8 [40.3; 49.3]	94.8 [93.5; 96.1]	0.45 [0.40; 0.50]	0.71 [0.68; 0.74]	3403
	18–29 Years	7.3	41	3.7	21	3.0	18	71.0 [56.9; 85.1]	95.9 [93.8; 98.0]	0.65 [0.52; 0.78]	0.87 [0.82; 0.92]	614
	30–44 Years	7.0	62	4.2	32	4.4	44	61.6 [51.3; 71.8]	95.3 [93.5; 97.1]	0.57 [0.47; 0.67]	0.83 [0.79; 0.87]	967
	45–64 Years	8.8	198	3.8	77	9.5	217	47.9 [42.7; 53.1]	95.4 [94.3; 96.5]	0.49 [0.44; 0.55]	0.73 [0.70; 0.77]	2226
	65–79 Years	5.6	81	2.9	41	11.8	191	32.3 [25.9; 38.7]	96.5 [95.4; 97.7]	0.36 [0.29; 0.43]	0.71 [0.67; 0.75]	1600
	≥ 80 Years	3.6	25	1.9	17	15.9	99	18.4 [11.0; 25.8]	97.6 [96.6; 98.6]	0.22 [0.12; 0.32]	0.64 [0.58; 0.71]	673
Coronary heart disease (I20–I25)	Total	2.9	241	1.4	103	4.6	371	38.6 [33.6; 43.7]	98.5 [98.1; 98.8]	0.46 [0.41; 0.51]	0.88 [0.87; 0.89]	6121
	Men	4.3	183	1.3	48	6.1	250	41.1 [35.2; 47.0]	98.5 [98.0; 99.1]	0.50 [0.44; 0.56]	0.85 [0.83; 0.87]	2659
	Women	1.6	58	1.5	55	3.1	121	33.3 [24.7; 41.9]	98.4 [98.0; 98.9]	0.38 [0.29; 0.47]	0.91 [0.89; 0.92]	3462
	45–64 Years	2.3	48	0.6	15	3.0	67	43.2 [32.8; 53.6]	99.3 [98.9; 99.7]	0.54 [0.43; 0.64]	0.93 [0.91; 0.94]	2290
	65–79 Years	6.7	112	2.9	43	12.6	198	34.8 [28.5; 41.2]	96.4 [95.3; 97.6]	0.39 [0.32; 0.46]	0.69 [0.65; 0.73]	1567
	≥ 80 Years	11.8	80	7.1	41	15.8	99	42.9 [34.0; 51.7]	90.2 [86.9; 93.5]	0.37 [0.27; 0.46]	0.54 [0.47; 0.62]	622
Alcohol-related disorder (F10)	Total	0.4	24	0.6	38	1.0	67	29.6 [17.0; 42.3]	99.3 [99.1; 99.6]	0.33 [0.20; 0.47]	0.97 [0.96; 0.97]	6432
	Men	0.7	15	0.7	21	1.4	47	31.5 [15.7; 47.3]	99.3 [98.9; 99.6]	0.37 [0.20; 0.54]	0.96 [0.95; 0.97]	2807
	Women	0.2	9	0.6	17	0.6	20	25.3 [8.4; 42.1]	99.4 [99.0; 99.8]	0.25 [0.10; 0.41]	0.98 [0.97; 0.99]	3625
	45–64 Years	0.5	13	0.9	17	1.9	38	21.8 [10.3; 33.4]	99.1 [98.6; 99.6]	0.26 [0.13; 0.39]	0.94 [0.93; 0.96]	2364
	65–79 Years	0.4	5	0.6	11	1.7	27	17.4 [2.4; 32.4]	99.4 [99.1; 99.8]	0.23 [0.05; 0.41]	0.96 [0.94; 0.97]	1683
Mental disorder (F00–F99)	Total	8.3	482	1.4	66	28.8	1863	22.3 [20.2; 24.3]	97.8 [97.1; 98.4]	0.24 [0.21; 0.26]	0.40 [0.37; 0.43]	6188
	Men	6.5	164	1.7	34	26.2	781	19.9 [16.7; 23.2]	97.5 [96.6; 98.5]	0.22 [0.18; 0.26]	0.44 [0.40; 0.49]	2722
	Women	9.9	318	1.1	32	31.3	1082	24.1 [21.5; 26.7]	98.1 [97.2; 98.9]	0.25 [0.22; 0.28]	0.35 [0.31; 0.39]	3466
	18–29 Years	9.9	57	2.7	11	18.7	107	34.7 [25.5; 43.9]	96.3 [93.9; 98.7]	0.37 [0.27; 0.48]	0.57 [0.49; 0.65]	616
	30–44 Years	8.0	79	1.6	15	23.7	211	25.2 [19.7; 30.7]	97.6 [96.4; 98.9]	0.28 [0.22; 0.35]	0.49 [0.43; 0.56]	981
	45–64 Years	10.6	246	1.3	24	30.0	678	26.1 [23.1; 29.1]	97.9 [97.0; 98.8]	0.27 [0.24; 0.30]	0.38 [0.33; 0.42]	2275
	65–79 Years	4.7	74	0.7	10	36.5	593	11.5 [8.7; 14.3]	98.8 [98.0; 99.6]	0.12 [0.08; 0.15]	0.26 [0.21; 0.31]	1620
	≥ 80 Years	3.5	26	0.4	6	40.2	274	8.1 [4.6; 11.6]	99.2 [98.5; 99.9]	0.08 [0.04; 0.12]	0.19 [0.11; 0.27]	696
Somatoform disorder (F45)	Total	1.4	89	3.1	185	8.0	537	14.4 [11.0; 17.9]	96.6 [96.1; 97.2]	0.14 [0.10; 0.19]	0.78 [0.76; 0.80]	6047
	Men	1.0	29	2.6	70	6.0	195	14.7 [8.7; 20.6]	97.2 [96.5; 98.0]	0.15 [0.08; 0.23]	0.83 [0.80; 0.85]	2671
	Women	1.7	60	3.6	115	10.0	342	14.3 [10.4; 18.3]	96.0 [95.1; 96.8]	0.14 [0.09; 0.18]	0.73 [0.70; 0.75]	3376
	18–29 Years	0.9	7	1.8	12	6.5	37	12.6 [1.5; 23.7]	98.1 [96.9; 99.3]	0.15 [0.01; 0.29]	0.84 [0.78; 0.89]	611
	30–44 Years	1.3	15	3.1	34	6.4	65	16.5 [8.0; 25.0]	96.6 [95.4; 97.9]	0.16 [0.06; 0.26]	0.81 [0.77; 0.85]	972
	45–64 Years	2.0	49	4.2	86	7.6	178	20.5 [14.3; 26.7]	95.4 [94.3; 96.5]	0.19 [0.12; 0.26]	0.76 [0.74; 0.79]	2192
	65–79 Years	0.9	13	2.6	40	11.3	187	7.6 [3.0; 12.3]	97.1 [96.1; 98.1]	0.07 [0.00; 0.14]	0.72 [0.69; 0.76]	1595
	≥ 80 Years	0.9	5	2.1	13	10.4	70	7.7 [0.5; 14.9]	97.6 [96.0; 99.3]	0.08 [0.03; 0.19]	0.75 [0.69; 0.81]	677

* sorted in order of descending sensitivity; the data relate to the 12 months preceding the survey date or to the 4th quarter of 2020–3rd quarter of 2021

Only the results for age groups with at least 10 cases in the survey and/or in the routine data are shown.

The proportion of individuals identified in at least one of the two data sources is calculated as the sum of the three proportions listed.

κ, Cohen’s kappa measure of agreement; CI, confidence interval; PABAκ, prevalence- and bias-adjusted kappa; PTSD, post-traumatic stress disorder

Agreement between diagnoses as stated by patients and those contained in routine data relating to the previous 10 years

eTable 3 shows the measures of agreement and the frequencies of diagnoses for 12 diseases or disorders (including heart attack), based on patients’ survey responses on diagnoses ever made by a physician and on the documented diagnoses of people who had been continuously insured with BARMER over the preceding 10 years. The measures and frequencies are given for the overall group as well as stratified by gender and age groups. Again, there is good to very good agreement for patient-reported diagnoses of diabetes and hypertension, as well as of heart attack. Patients’ stated diagnoses of CHD, obesity, heart failure, depression, and anxiety disorder continue to show moderate agreement. Better agreement compared to the 12-month reference period was found for diagnoses as stated by patients for alcohol-related disorder. There is poorer agreement for patient reporting of diagnoses of any mental disorder. Agreement for patient-stated diagnoses of PTSD and somatoform disorder remains the lowest (eTable 3).

Sensitivity analyses

Varying the comparison periods in the routine data from Q4/2020–Q3/2021 to Q4/2020–Q4/2021 or to Q1/2021–Q4/2021 did not result in any relevant changes in Cohen’s κ or PABAκ. Sensitivities increased if the inclusion criteria M2C and M2Q were taken into consideration. However, since specificities simultaneously declined, Cohen’s κ and PABAκ changed only marginally (data not shown).

Discussion

To our knowledge, this study is the first in Germany to quantify the agreement between diagnoses as stated by patients and those contained in routine health insurance data for a variety of diagnoses. The results vary between the investigated diseases. For diabetes and hypertension, there is good to very good agreement, while agreement is moderate for obesity, heart failure, anxiety disorder, depression, and CHD. The lowest level of agreement was seen for patients’ stated diagnoses of PTSD, alcohol-related disorder, and any mental or somatoform disorder. Thus, discrepancies are common between diagnoses as stated by the patients themselves and those contained in SHI routine data. Discrepancies that can be considered negligible were found for only two of the 11 diseases studied.

In concordance with our results, studies from Canada (6, 8, 13) and Korea (7) reported good agreement according to Cohen’s κ between the two data sources for patient-stated diagnoses of diabetes (0.7–0.9) and hypertension (0.6–0.8), as well as moderate agreement for patient-reported CHD (0.5) (7, 13) and depression (0.5) (8).

Determining four different parameters (Cohen’s κ, PABAκ, sensitivity, specificity) shows that they should be viewed together when comparing different diseases, since each parameter reflects only specific aspects of the agreement between diagnoses as stated by patients and those contained in routine data and is therefore of limited value on its own. For example, disorders with a very low prevalence, such as PTSD or alcohol-related disorders, showed very low agreement using Cohen’s κ (0.27 and 0.33, respectively), while the calculation using PABAκ resulted in very high values (0.94 and 0.97, respectively). The differences between Cohen’s κ and PABAκ tended to increase with decreasing disease prevalence. This indicates that the informative value of these parameters is limited for rare diseases.
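The divergence between the two coefficients at low prevalence can be reproduced with a small thought experiment. The sketch below holds sensitivity and specificity fixed and varies only the prevalence in the reference source; the values 0.70 and 0.97 are illustrative assumptions, not estimates from this study, and the function name is hypothetical.

```python
def agreement_at_prevalence(prevalence, sensitivity, specificity):
    """Cohen's kappa and PABAK implied by a given reference prevalence,
    sensitivity, and specificity (all as fractions between 0 and 1)."""
    both = prevalence * sensitivity
    survey_only = (1 - prevalence) * (1 - specificity)
    neither = (1 - prevalence) * specificity
    p_observed = both + neither
    p_survey = both + survey_only        # share answering "yes" in the survey
    p_expected = p_survey * prevalence + (1 - p_survey) * (1 - prevalence)
    kappa = (p_observed - p_expected) / (1 - p_expected)
    pabak = 2 * p_observed - 1
    return kappa, pabak


# Same reporting behaviour, increasingly rare disease
results = {
    prevalence: agreement_at_prevalence(prevalence, sensitivity=0.70, specificity=0.97)
    for prevalence in (0.30, 0.10, 0.03, 0.01)
}

for prevalence, (kappa, pabak) in results.items():
    print(f"prevalence {prevalence:4.0%}: kappa {kappa:.2f}, PABAK {pabak:.2f}")
```

With unchanged reporting behaviour, κ falls as the disease becomes rarer while PABAκ rises, because the growing group with no diagnosis in either source dominates the observed agreement.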

Implications of the results relate not only to the possible causes of the discrepancies but also to the validity of the two data sources for epidemiology and health-care research. If one assumes that patients are predominantly aware of their diagnoses and should be able to state these in a survey, avoidable causes of the observed discrepancies should be addressed to the extent possible. These causes include problems or shortcomings in the provision of medical information about a documented diagnosis, in patients’ understanding and recollection of this information, and in their willingness to state the diagnosis in the survey. What also needs to be investigated is the extent to which problems with the validity of documented diagnoses in terms of the actual presence of a disease affect agreement. For example, the coding quality of medical and psychotherapeutic diagnoses is the subject of controversy (14–16). For mental disorders, a comparison of primary care diagnoses with the results of standardized assessments of the same individuals shows both under-reporting and over-reporting of disorders in routine data (17–20). Therefore, it is conceivable that the self-reporting of a diagnosis is also based on respondents’ experience of their disease, which was not always medically diagnosed or documented. An accurate assessment of this type of misclassification could be achieved by a combined survey and examination study in which standardized clinical examination and diagnosis serve as the gold standard and are linked with patient-specific routine data. On the routine data side, a comparison with medical records or even a patient follow-up examination could yield important insights.

The investigated sample of individuals insured by BARMER, one of the largest German SHI carriers, can be seen as a strength of this study. Because consent to personal data linkage was given in over 90% of cases and the weighting was adjusted to the population distribution of adults living in Germany for sex, age, region, and education, it was possible to effectively counteract a possible systematic bias of the results due to selective participation.

A possible limiting factor is that the wording of the question regarding the occurrence of a disease does not allow a precise inference as to whether the disease has also been medically diagnosed in the preceding 12 months. The question asked whether the disease had also been present during that time period. However, one can assume that affected individuals who answered in the affirmative would also answer “yes” to a more precise formulation.

Conclusion

Data linkage enables a valid quantification of differences between diagnoses as stated by patients and those contained in routine health insurance data. When a variety of agreement measures were taken into consideration, frequent and strongly varying discrepancies with no clear pattern became apparent. For example, patients’ stated diagnoses of somatic disorders did not generally agree better than those of mental disorders. While agreement worsened with increasing age, there were no general differences according to sex. Changes to the 12-month reference period and higher requirements in terms of the criteria for inclusion in routine data did not affect the results. Against this background, the discrepancies found here between the data sources should be taken into account in a disease-specific manner when using diagnoses as stated by patients. Only further research can reveal to what extent these discrepancies reflect under- or over-recording of morbidity or disease experience in routine data and whether self-reported medical diagnoses are, as a result, informative even in the absence of agreement.

Acknowledgments

Translated from the original German by Christine Rye.

Footnotes

Funding

The present study was undertaken as part of the project titled “Optimierte Datenbasis für Public Mental Health: Datenlinkage–Studie zur Aufklärung von Diskrepanzen zwischen Befragungs- und Routinedaten” (OptDatPMH), funded by the Innovation Fund of the Joint Federal Committee (Grant No.: 01VSF19015).

Conflict of interest statement

The authors declare that no conflict of interests exists.

References

1. Bundesinstitut für Arzneimittel und Medizinprodukte. Internationale statistische Klassifikation der Krankheiten und verwandter Gesundheitsprobleme, German Modification. www.bfarm.de/DE/Kodiersysteme/Klassifikationen/ICD/ICD-10-GM/_node.html (last accessed on 09 June 2023)
2. Frank J. Comparing nationwide prevalences of hypertension and depression based on claims data and survey data: an example from Germany. Health Policy. 2016;120:1061–1069. doi: 10.1016/j.healthpol.2016.07.008.
3. Grobe TG, Kleine-Budde K, Bramesfeld A, Thom J, Bretschneider J, Hapke U. Prävalenzen von Depressionen bei Erwachsenen—eine vergleichende Analyse bundesweiter Survey- und Routinedaten. Gesundheitswesen. 2019;81:1011–1017. doi: 10.1055/a-0652-5424.
4. Jacobi F, Bretschneider J, Müllender S. Veränderungen und Variationen der Häufigkeit psychischer Störungen in Deutschland—Krankenkassenstatistiken und epidemiologische Befunde. In: Kliner K, Rennert D, Richter M, editors. Gesundheit in Regionen—Blickpunkt Psyche. BKK Gesundheitsatlas 2015. Berlin: Medizinisch wissenschaftliche Verlagsgesellschaft und BKK Dachverband; 2015. pp. 63–71.
5. March S, Andrich S, Drepper J, et al. Gute Praxis Datenlinkage (GPD). Gesundheitswesen. 2019;81:636–650. doi: 10.1055/a-0962-9933.
6. Fortin M, Haggerty J, Sanche S, Almirall J. Self-reported versus health administrative data: implications for assessing chronic illness burden in populations. A cross-sectional study. CMAJ Open. 2017;5:e729–e733. doi: 10.9778/cmajo.20170029.
7. Kim YY, Park JH, Kang HJ, Lee EJ, Ha S, Shin SA. Level of agreement and factors associated with discrepancies between nationwide medical history questionnaires and hospital claims data. J Prev Med Public Health. 2017;50:294–302. doi: 10.3961/jpmph.17.024.
8. Payette Y, Moura CSd, Boileau C, Bernatsky S, Noisel N. Is there an agreement between self-reported medical diagnosis in the CARTaGENE cohort and the Québec administrative health databases? Int J Pop Data Sci. 2020;5(1). doi: 10.23889/ijpds.v5i1.1155.
9. Forschungsdatenzentren der Statistischen Ämter des Bundes und der Länder. Mikrozensus 2018, eigene Berechnungen. www.forschungsdatenzentrum.de/bestand/mikrozensus (last accessed on 09 June 2023)
10. Byrt T, Bishop J, Carlin JB. Bias, prevalence and kappa. J Clin Epidemiol. 1993;46:423–429. doi: 10.1016/0895-4356(93)90018-v.
11. Grouven U, Bender R, Ziegler A, Lange S. Der Kappa-Koeffizient. Dtsch Med Wochenschr. 2007;132:e65–e68. doi: 10.1055/s-2007-959046.
12. Bland JM, Altman DG. Measuring agreement in method comparison studies. Stat Methods Med Res. 1999;8:135–160. doi: 10.1177/096228029900800204.
13. Lix LM, Yogendran MS, Shaw SY, Burchill C, Metge C, Bond R. Population-based data sources for chronic disease surveillance. Chronic Dis Can. 2008;29:31–38.
14. Slagman A, Hoffmann F, Horenkamp-Sonntag D, Swart E, Vogt V, Herrmann WJ. Analyse von Routinedaten in der Gesundheitsforschung: Validität, Generalisierbarkeit und Herausforderungen. Z Allg Med. 2023;99:86–92.
15. Drösler SE, Neukirch B. Evaluation der Kodierqualität von vertragsärztlichen Diagnosen. Gutachten im Auftrag der Kassenärztlichen Bundesvereinigung. www.kbv.de/media/sp/2014-11-18_Gutachten_Kodierqualitaet.pdf (last accessed on 07 November 2023)
16. IGES Institut für Gesundheits- und Sozialforschung GmbH. Bewertung der Kodierqualität von vertragsärztlichen Diagnosen - Eine Studie im Auftrag des GKV-Spitzenverbands in Kooperation mit der BARMER GEK. Berlin; 2012. www.gkv-spitzenverband.de/media/dokumente/krankenversicherung_1/aerztliche_versorgung/verguetung_und_leistungen/klassifikationsverfahren/9_Endbericht_Kodierqualitaet_Hauptstudie_2012_12-19.pdf (last accessed on 07 November 2023)
17. Sielk M, Altiner A, Janssen B, Becker N, Pilars M, Abholz HH. Prävalenz und Diagnostik depressiver Störungen in der Allgemeinarztpraxis. Ein kritischer Vergleich zwischen PHQ-D und hausärztlicher Einschätzung. Psychiatr Prax. 2009;36:169–174. doi: 10.1055/s-0028-1090150.
18. Reitzle L, Köster I, Tuncer O, Schmidt C, Meyer I. Entwicklung und interne Validierung von Falldefinitionen für die Prävalenzschätzung mikrovaskulärer Komplikationen des Diabetes in Routinedaten. Gesundheitswesen. 2023. doi: 10.1055/a-2061-6954 (online ahead of print).
19. Piontek K, Shedden-Mora MC, Gladigau M, Kuby A, Löwe B. Diagnosis of somatoform disorders in primary care: diagnostic agreement, predictors, and comparisons with depression and anxiety. BMC Psychiatry. 2018;18. doi: 10.1186/s12888-018-1940-3.
20. Mitchell AJ, Vaze A, Rao S. Clinical diagnosis of depression in primary care: a meta-analysis. Lancet. 2009;374:609–619. doi: 10.1016/S0140-6736(09)60879-5.
21. Kwiecien R, Kopp-Schneider A, Blettner M. Concordance analysis: part 16 of a series on evaluation of scientific publications. Dtsch Arztebl Int. 2011;108:515–521. doi: 10.3238/arztebl.2011.0515.
22. Feinstein AR, Cicchetti DV. High agreement but low kappa: I. The problems of two paradoxes. J Clin Epidemiol. 1990;43:543–549. doi: 10.1016/0895-4356(90)90158-l.

[R1] 1.Bundesinstitut für Arzneimittel und Medizinprodukte. Internationale statistische Klassifikation der Krankheiten und verwandter Gesundheitsprobleme, German Modification. www.bfarm.de/DE/Kodiersysteme/Klassifikationen/ICD/ICD-10-GM/_node.html (last accessed on 09 June 2023) [Google Scholar]

[R2] 2.Frank J. Comparing nationwide prevalences of hypertension and depression based on claims data and survey data: an example from Germany. Health Policy. 2016;120:1061–1069. doi: 10.1016/j.healthpol.2016.07.008. [DOI] [PubMed] [Google Scholar]

[R3] 3.Grobe TG, Kleine-Budde K, Bramesfeld A, Thom J, Bretschneider J, Hapke U. Prävalenzen von Depressionen bei Erwachsenen—eine vergleichende Analyse bundesweiter Survey- und Routinedaten. Gesundheitswesen. 2019;81:1011–1017. doi: 10.1055/a-0652-5424. [DOI] [PubMed] [Google Scholar]

[R4] 4.Jacobi F, Bretschneider J, Müllender S. Veränderungen und Variationen der Häufigkeit psychischer Störungen in Deutschland—Krankenkassenstatistiken und epidemiologische Befunde. In: Kliner K, Rennert D, Richter M, editors. Gesundheit in Regionen—Blickpunkt Psyche. BKK Gesundheitsatlas 2015. Berlin: Medizinisch wissenschaftliche Verlagsgesellschaft und BKK Dachverband; 2015. pp. 63–71. [Google Scholar]

[R5] 5.March S, Andrich S, Drepper J, et al. Gute Praxis Datenlinkage (GPD) Gesundheitswesen. 2019;81:636–650. doi: 10.1055/a-0962-9933. [DOI] [PubMed] [Google Scholar]

[R6] 6.Fortin M, Haggerty J, Sanche S, Almirall J. Self-reported versus health administrative data: implications for assessing chronic illness burden in populations. A cross-sectional study. CMAJ Open. 2017;5:e729–e733. doi: 10.9778/cmajo.20170029. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R7] 7.Kim YY, Park JH, Kang HJ, Lee EJ, Ha S, Shin SA. Level of agreement and factors associated with discrepancies between nationwide medical history questionnaires and hospital claims data. J Prev Med Public Health. 2017;50:294–302. doi: 10.3961/jpmph.17.024. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R8] 8.Payette Y, Moura CSd, Boileau C, Bernatsky S, Noisel N. Is there an agreement between self-reported medical diagnosis in the CARTaGENE cohort and the Québec administrative health databases? Int J Pop Data Sci. 2020;5(1) doi: 10.23889/ijpds.v5i1.1155. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R9] 9.Forschungsdatenzentren der Statistischen Ämter des Bundes und der Länder. Mikrozensus 2018, eigene Berechnungen. www.forschungsdatenzentrum.de/bestand/mikrozensus (last accessed on 09 June 2023) [Google Scholar]

[R10] 10.Byrt T, Bishop J, Carlin JB. Bias, prevalence and kappa. J Clin Epidemiol. 1993;46:423–429. doi: 10.1016/0895-4356(93)90018-v. [DOI] [PubMed] [Google Scholar]

[R11] 11.Grouven U, Bender R, Ziegler A, Lange S. Der Kappa-Koeffizient. Dtsch Med Wochenschr. 2007;132:e65–e68. doi: 10.1055/s-2007-959046. [DOI] [PubMed] [Google Scholar]

[R12] 12.Bland JM, Altman DG. Measuring agreement in method comparison studies. Stat Methods Med Res. 1999;8:135–160. doi: 10.1177/096228029900800204. [DOI] [PubMed] [Google Scholar]

[R13] 13.Lix LM, Yogendran MS, Shaw SY, Burchill C, Metge C, Bond R. Population-based data sources for chronic disease surveillance. Chronic Dis Can. 2008;29:31–38. [PubMed] [Google Scholar]

[R14] 14.Slagman A, Hoffmann F, Horenkamp-Sonntag D, Swart E, Vogt V, Herrmann WJ. Analyse von Routinedaten in der Gesundheitsforschung: Validität, Generalisierbarkeit und Herausforderungen. Z Allg Med. 2023;99:86–92. [Google Scholar]

[R15] 15.Drösler SE, Neukirch B. Evaluation der Kodierqualität von vertragsärztlichen Diagnosen. Gutachten im Auftrag der Kassenärztlichen Bundesvereinigung. www.kbv.de/media/sp/2014-11-18_Gutachten_Kodierqualitaet.pdf (last accessed on 07 November 2023) [Google Scholar]

[R16] 16.IGES. www.gkv-spitzenverband.de/media/dokumente/krankenversicherung_1/aerztliche_versorgung/verguetung_und_leistungen/klassifikationsverfahren/9_Endbericht_Kodierqualitaet_Hauptstudie_2012_12-19.pdf (last accessed 07 November 2023) IGES Institut für Gesundheits- und Sozialforschung GmbH; 2012. Bewertung der Kodierqualität von vertragsärztlichen Diagnosen - Eine Studie im Auftrag des GKV-Spitzenverbands in Kooperation mit der BARMER GEK Berlin. [Google Scholar]

[R17] 17. Sielk M, Altiner A, Janssen B, Becker N, Pilars M, Abholz HH. Prävalenz und Diagnostik depressiver Störungen in der Allgemeinarztpraxis. Ein kritischer Vergleich zwischen PHQ-D und hausärztlicher Einschätzung. Psychiatr Prax. 2009;36:169–174. doi: 10.1055/s-0028-1090150.

[R18] 18. Reitzle L, Köster I, Tuncer O, Schmidt C, Meyer I. Entwicklung und interne Validierung von Falldefinitionen für die Prävalenzschätzung mikrovaskulärer Komplikationen des Diabetes in Routinedaten. Gesundheitswesen. 2023. doi: 10.1055/a-2061-6954 (online ahead of print).

[R19] 19. Piontek K, Shedden-Mora MC, Gladigau M, Kuby A, Löwe B. Diagnosis of somatoform disorders in primary care: diagnostic agreement, predictors, and comparisons with depression and anxiety. BMC Psychiatry. 2018;18. doi: 10.1186/s12888-018-1940-3.

[R20] 20. Mitchell AJ, Vaze A, Rao S. Clinical diagnosis of depression in primary care: a meta-analysis. Lancet. 2009;374:609–619. doi: 10.1016/S0140-6736(09)60879-5.

[R21] 21. Kwiecien R, Kopp-Schneider A, Blettner M. Concordance analysis: part 16 of a series on evaluation of scientific publications. Dtsch Arztebl Int. 2011;108:515–521. doi: 10.3238/arztebl.2011.0515.

[R22] 22. Feinstein AR, Cicchetti DV. High agreement but low kappa: I. The problems of two paradoxes. J Clin Epidemiol. 1990;43:543–549. doi: 10.1016/0895-4356(90)90158-l.
