Application of measurement error models to correct for systematic differences among readers and vendors in echocardiography measurements: the CARDIA study

Aisha Betoko; Chike Nwabuo; Bharath Ambale Venkatesh; Erin P Ricketts; Sejong Bae; Colin Wu; Samuel S Gidding; Kiang Liu; João A C Lima; Christopher Cox

doi:10.1080/02664763.2019.1686133

2019 Nov 12;47(7):1315–1324. doi: 10.1080/02664763.2019.1686133

PMCID: PMC9042058 PMID: 35707021

ABSTRACT

We illustrate the application of linear measurement error models to calibrate echocardiography measurements acquired 20 years apart in the CARDIA study. Of 4242 echocardiograms acquired at Year-5 (1990–1991), 36% were reread 20 years later. Left ventricular (LV) mass and 8 other measurements were assessed. A machine reproducibility study including 96 additional patients also compared Year-5 and Year-25 equipment. A linear measurement error model was developed to calibrate the original Year-5 measurements, incorporating the additional Year-5 reread and machine reproducibility study data, and adjusting for differences among readers and machines. Median (quartiles) of original Year-5 LV mass was 144.4 (117.6, 174.2) g before and 129.9 (103.8, 158.6) g after calibration. The correlation between original and calibrated LV mass was 0.989 (95% confidence interval: 0.988, 0.990). The original and calibrated measurements had similar distributions. Additional comparisons of original and calibrated data supported the use of the model. We conclude that systematic differences among readers and machines have been accounted for, and that the calibrated Year-5 measurements can be used in future longitudinal comparisons. It is hoped that this paper will encourage the wider application of measurement error models.

KEYWORDS: Bias, calibration, echocardiography, linear measurement error models, systematic differences

1. Introduction

In the Coronary Artery Risk Development in Young Adults (CARDIA) study, there was a 20-year interval between echocardiography examinations in 1990–1991 (Year-5) and 2010–2011 (Year-25), requiring different sonographers, readers, and equipment. As a result, there was concern that the sets of measurements collected at those two time periods would not be comparable, and that longitudinal comparisons would be biased. To address this question, a separate study (the machine reproducibility study) compared the two different machines used for the two measurements, Acuson (Year-5) and Toshiba (Year-25), using a two-period cross-over design in which each subject was scanned by both machines and each scan was read by each of two sonographers (cf. Supplementary data, Section A). In addition, a sample of the original Year-5 images (1516) was reread by one of four members of the CARDIA Echocardiography Reading Center at Johns Hopkins University (reread substudy) using the Year-25 software. Results of the machine reproducibility study indicated that there were systematic differences between the two machines, as well as differences among readers. Therefore, we sought to calibrate the Year-5 measurements to the Year-25 values, allowing for measurement error in the standard (Toshiba) measurement and for differences among readers and machines. Data from the machine substudy were combined with data from the reread substudy, and a linear measurement error model [6] was developed. The goal of this paper is to illustrate the development and application of this model for the calibration of the Year-5 echocardiography measurements, using programs available in standard statistical packages, with the hope of encouraging the wider application of these useful models.

2. Methods

2.1. Study design

CARDIA is an ongoing, multi-center prospective cohort study [5, 7]. The baseline cohort included 5115 African-American and white participants aged 18–30 years recruited from community-based target populations at 4 Field Centers (Birmingham, AL; Oakland, CA; Chicago, IL; and Minneapolis, MN) between 1985 and 1986. All examinations were approved by the institutional review boards at each participating institution, and all participants gave written informed consent.

2.2. Data collection

Of the 4352 participants who attended the CARDIA Year-5 examination between 1990 and 1991, 4242 (97.5%) underwent the echocardiography examination. The median (quartiles) age at Year-5 was 30 (27–33) years, 45% were males, and 48% were African-American. Participants had follow-up examinations at 2, 5, 7, 10, 15, 20, and 25 years after enrollment, with 72% of the surviving cohort attending the Year-25 examination. Additional details on the CARDIA study design are available in Supplementary data, Section A.

The calibration effort concentrated on the nine echo parameters that were measured in all three studies: Left ventricular mass (LVMass), M-mode left atrial internal dimension in systole (LAIDS), M-mode LV internal dimension in systole (LVIDs), M-mode LV internal dimension in diastole (LVIDd), M-mode Inter-ventricular septal thickness in systole (IVSs), M-mode Inter-ventricular septal thickness in diastole (IVSd), M-mode LV posterior wall thickness in systole (LVPWs), M-mode LV posterior wall thickness in diastole (LVPWd) and LV fractional shortening (LVFS).

3. Statistical model

Combining data from the machine reproducibility and reread substudies, we developed a parametric linear measurement error model to correct for systematic differences among readers and machines, as well as changes in software for the Acuson machine between Year-5 and Year-25. The goal was to use the fitted model to calibrate the Year-5 Acuson measurements to the corresponding Toshiba measurement, accounting for measurement error in the observed Toshiba values. We will describe this model by specifying separate but related submodels for the machine reproducibility data and the original/reread data. For each of these, we will use the approach of Carroll et al. [3, Chapter 8], which defines three standard components for a parametric linear measurement error model from which the likelihood can be constructed.

The following discussion closely follows the paradigm of Carroll et al. [3, Chapter 8, Section 8.2] for constructing the likelihood for a general, errors-in-variables regression model. In a simplified notation, let X denote the true, unobserved value (vector) of the independent variable (exposure) in a regression model with observed outcome (vector) Y. We assume that X is measured with error so that instead of the true value, we observe $W = X + U$ (the assumption of additive measurement error is not really needed, but is fairly standard). We want to model the dependence of the outcome on the true, unmeasured value of the exposure, X, rather than the measured value W. We assume the classical case of nondifferential measurement error, which is plausible for these data. By definition then,

$\mathrm{pr}(Y = y \mid W = w, X = x) = \mathrm{pr}(Y = y \mid X = x)$

Therefore, the joint distribution of $(Y, W, X)$ may be written as follows:

$\mathrm{pr}(Y = y, W = w, X = x) = \mathrm{pr}(Y = y \mid X = x)\,\mathrm{pr}(W = w \mid X = x)\,\mathrm{pr}(X = x)$

Thus to model the joint distribution we need (1) an exposure model for the unobserved, true values, $\mathrm{pr}(X = x)$, (2) an error model for the observed values, conditional on the true values, $\mathrm{pr}(W = w \mid X = x)$, and finally (3) the regression model of primary interest, relating the measured outcome to the true rather than the measured values of the independent variable, $\mathrm{pr}(Y = y \mid X = x)$. We assume a parametric model, $\mathrm{pr}(Y = y \mid X = x, β)$, and the goal of the analysis is to estimate the parameters $β$. The problem of course is that X is unobserved, and the joint likelihood depends on the unobserved value x. However, we can compute the marginal likelihood for the observed data:

$\mathrm{pr}(Y = y, W = w) = \int \mathrm{pr}(Y = y \mid X = x, β)\,\mathrm{pr}(W = w \mid X = x)\,\mathrm{pr}(X = x)\,dμ(x)$

In our approach, all three models are fully parametric, and the joint likelihood of $(Y, W, X)$ can be written as the product of the parametric likelihood functions for the three terms, noting that the likelihoods for (2) and (3) are conditional on $X = x$. The marginal likelihood of the observed data $(Y, W)$ is computed by numerical integration with respect to x, and involves the parameters of all three component models, which are estimated by maximum likelihood using standard numerical optimization algorithms.
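The numerical integration step can be illustrated with a toy case. The following is a minimal sketch, not the authors' implementation: it assumes a scalar X with Gaussian exposure, error, and regression models with constant variances, for which the marginal density of $(Y, W)$ is also available in closed form as a bivariate normal. All function names and parameter values are hypothetical, and the trapezoid rule stands in for whatever quadrature a statistical package would use.

```python
import math


def normal_pdf(v, mean, sd):
    """Density of N(mean, sd^2) evaluated at v."""
    z = (v - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def marginal_density_numeric(y, w, b0, b1, mu_x, sd_x, sd_u, sd_e, n_grid=4001):
    """Integrate pr(y|x) * pr(w|x) * pr(x) over x with the trapezoid rule."""
    lo, hi = mu_x - 10.0 * sd_x, mu_x + 10.0 * sd_x
    step = (hi - lo) / (n_grid - 1)
    total = 0.0
    for i in range(n_grid):
        x = lo + i * step
        integrand = (
            normal_pdf(y, b0 + b1 * x, sd_e)  # regression model: Y given X
            * normal_pdf(w, x, sd_u)          # error model: W given X
            * normal_pdf(x, mu_x, sd_x)       # exposure model: X
        )
        weight = 0.5 if i in (0, n_grid - 1) else 1.0
        total += weight * integrand
    return total * step


def marginal_density_exact(y, w, b0, b1, mu_x, sd_x, sd_u, sd_e):
    """Closed-form bivariate normal density of (Y, W) in the all-Gaussian case."""
    var_y = b1 ** 2 * sd_x ** 2 + sd_e ** 2
    var_w = sd_x ** 2 + sd_u ** 2
    cov = b1 * sd_x ** 2
    det = var_y * var_w - cov ** 2
    dy, dw = y - (b0 + b1 * mu_x), w - mu_x
    quad = (var_w * dy * dy - 2.0 * cov * dy * dw + var_y * dw * dw) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


# Hypothetical values, chosen only for illustration.
args = dict(b0=1.0, b1=0.9, mu_x=5.0, sd_x=0.5, sd_u=0.2, sd_e=0.2)
numeric = marginal_density_numeric(5.6, 5.1, **args)
exact = marginal_density_exact(5.6, 5.1, **args)
print(numeric, exact)
```

In this special case the integral is not needed, which makes it a convenient check on the quadrature. With a variance that depends on the true value, as in the models described next, no closed form is available and the integral has to be evaluated numerically.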

We first describe the submodel for the machine reproducibility data. Only three of the four Year-25 readers participated in the machine study, and each subject was examined by two of the three participating readers. Let $y_{ijk}$ denote the measurement of subject i $(1 \leq i \leq 96)$ using machine j $(1 = A, 2 = T)$ by reader k $(1 \leq k \leq 3)$. The exposure model in Equation (1) is the model for the unobserved true value of the Toshiba measurement ($m_{i2k}$), which is free of measurement error but includes adjustment for differences among readers. The error model in Equation (2) is for the observed Toshiba measurement ($y_{i2k}$), given the true, unmeasured value and measured with additive error ($δ_{i2k}$). Finally, the regression model in Equation (3) is for the observed Acuson measurement ($y_{i1k}$), which is again a true value ($m_{i1k}$), depending on the true value of the Toshiba measurement and including adjustment for differences among readers, plus error ($δ_{i1k}$). Let $1_{k}$ denote the indicator variable for reader k $(1 \leq k \leq 3)$. The three components of the model are as follows, with Reader 1 as the reference reader.

Exposure model: $m_{i2k} = β_{20} + β_{22} 1_{2} + β_{23} 1_{3} + u_{i}, \quad u_{i} \sim N(0, σ_{1}^{2})$ (1)
Error model: $y_{i2k} = m_{i2k} + δ_{i2k}, \quad δ_{i2k} \sim N(0, σ_{2}^{2} m_{i2k}^{θ_{1}})$ (2)
Regression model: $y_{i1k} = m_{i1k} + δ_{i1k} = β_{10} + β_{12} 1_{2} + β_{13} 1_{3} + α m_{i2k} + δ_{i1k}, \quad δ_{i1k} \sim N(0, σ_{3}^{2} m_{i1k}^{θ_{1}})$ (3)

The exposure model assumes that the true, unobserved value of the Toshiba measurement for subject i ($m_{i2k}$) is normally distributed with mean $β_{20} + β_{22} 1_{2} + β_{23} 1_{3}$ and variance $σ_{1}^{2}$; the random variable $u_{i}$ is a subject-level variable. The error model assumes that the observed Toshiba measurement ($y_{i2k}$) is equal to the true value ($m_{i2k}$) plus a measurement error ($δ_{i2k}$), which is normally distributed with heterogeneous variance $(σ_{2}^{2} m_{i2k}^{θ_{1}})$ that depends on a power ($θ_{1}$) of the true value. Models with individual-level error variances were previously discussed and illustrated by Dunn [4, Section 4.11], using an approximation to maximum likelihood. Unlike the true value ($m_{i2k}$), a different value of the measurement error ($δ_{i2k}$) is present with repeated observations of the same subject. The power parameter ($θ_{1}$) was expected to be positive, reflecting greater variability with a larger mean. This kind of inhomogeneity of variance is not uncommon with positive measurements [2, Chapters 2, 3; 2, Sections 4.8, 4.10] and was supported by plots of the data. As part of the model fitting process the power parameters were checked, and if not different from zero, or if there were estimation problems (negative rather than positive estimates or large standard errors), the parameter was assumed to be zero, corresponding to a constant measurement error variance. LV mass was the most notable instance of inhomogeneous measurement error. The linear regression model (3) for the measured Acuson value ($y_{i1k}$) includes adjustment for the different readers, and is a standard linear regression model, but with dependence on the true value of the Toshiba measurement ($m_{i2k}$), rather than the observed value, through a regression parameter ($α$). It assumes inhomogeneity of the error ($δ_{i1k}$) variance $(σ_{3}^{2} m_{i1k}^{θ_{1}})$ similar to the error model, with different variance parameters assumed for the error $(σ_{2}^{2})$ and regression $(σ_{3}^{2})$ models, Equations (2) and (3), but the same power parameter ($θ_{1}$).
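To make the structure of Equations (1)–(3) concrete, the following sketch simulates data with the same layout as the machine reproducibility study: 96 subjects, each read by two of three readers on both machines, with a shared subject-level effect and a power-of-the-mean error variance. It is an illustration only. Every parameter value is hypothetical (the fitted intercepts and variances are not reported in the text), and the random choice of reader pair is a simplification of the actual cross-over design.

```python
import random

# Hypothetical parameter values, chosen only to give LV-mass-like numbers.
PARAMS = {
    # Reader-specific intercepts: beta20, beta20 + beta22, beta20 + beta23.
    "toshiba_intercept": {1: 140.0, 2: 143.0, 3: 138.0},
    # Reader-specific intercepts: beta10, beta10 + beta12, beta10 + beta13.
    "acuson_intercept": {1: 5.0, 2: 7.0, 3: 4.0},
    "alpha": 0.95,   # slope of the true Acuson value on the true Toshiba value
    "sigma1": 30.0,  # SD of the subject-level effect u_i
    "sigma2": 0.8,   # scale of the Toshiba measurement error
    "sigma3": 0.9,   # scale of the Acuson measurement error
    "theta1": 1.0,   # power parameter shared by Equations (2) and (3)
}


def error_sd(scale, true_value, theta):
    """SD implied by the variance model scale^2 * true_value^theta."""
    # The floor keeps the power well defined if a simulated true value
    # happens to be non-positive; it is a guard for this sketch only.
    return scale * max(true_value, 1e-8) ** (theta / 2.0)


def simulate_machine_substudy(n_subjects, params, seed=0):
    """Simulate one row per (subject, reader) with both machine readings."""
    rng = random.Random(seed)
    rows = []
    for subject in range(1, n_subjects + 1):
        u = rng.gauss(0.0, params["sigma1"])  # one draw per subject
        for reader in sorted(rng.sample([1, 2, 3], 2)):  # two of three readers
            # Equation (1): true Toshiba value.
            m_tosh = params["toshiba_intercept"][reader] + u
            # Equation (2): observed Toshiba value, fresh error per reading.
            sd_t = error_sd(params["sigma2"], m_tosh, params["theta1"])
            y_tosh = m_tosh + rng.gauss(0.0, sd_t)
            # Equation (3): true and observed Acuson values.
            m_acu = params["acuson_intercept"][reader] + params["alpha"] * m_tosh
            sd_a = error_sd(params["sigma3"], m_acu, params["theta1"])
            y_acu = m_acu + rng.gauss(0.0, sd_a)
            rows.append({"subject": subject, "reader": reader,
                         "y_toshiba": y_tosh, "y_acuson": y_acu})
    return rows


rows = simulate_machine_substudy(96, PARAMS)
print(len(rows), rows[0])
```

Setting `theta1` to zero in this sketch gives a constant error variance, which corresponds to the fallback described above for measurements whose power parameter could not be distinguished from zero.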

We now describe the submodel for the original Year-5 reads and the corresponding rereads for the sample of 1516 CARDIA participants, again using the paradigm of Carroll et al. [3]. The rereads of the original Year-5 Acuson images were performed using the Digisonics analysis software for 2D measures, according to the Year-25 protocol. All four readers (readers 1–4) at JHU participated. The original reads were done by a total of 7 readers (readers 5–11) in the CARDIA study, all of whom were represented in the sample. Our ultimate goal is to calibrate the original Year-5 reads to the unobserved true value of the corresponding Toshiba measurement. To accomplish this we require a regression model linking these two measurements. The necessary linkage is provided by the regression model for the machine study, Equation (3).

Let $z_{ijk}$ denote the measurement on subject i $(1 \leq i \leq 1516)$ from reading j $(1 = O, 2 = R)$ by reader k $(1 \leq k \leq 11)$ in the reread study. For the exposure model of the reread study in Equation (I), the unobserved true value of the reread of the original Acuson image with the Year-25 protocol ($n_{i2k}$) for subject i by reader k $(1 \leq k \leq 4)$ is assumed to be normally distributed with a mean depending on the particular reader and variance $(α^{2} τ_{1}^{2})$. Different reader effects were assumed for the three readers who also participated in the machine study. The regression coefficient $(α)$ is the same as in Equation (3), since we are again modeling the relationship between an Acuson measurement and an unobserved true value from the Year-25 protocol. In other words, the information about the relationship between the unobserved true Acuson reread and the unobserved true Toshiba measurement, which is not available here, is borrowed from the data of the machine study. In the error model in Equation (II), the observed value of the reread ($z_{i2k}$) includes additive measurement error ($ε_{i2k}$). In the regression model for the original reads in Equation (III), the observed value for the original read ($z_{i1k}$) depends on its true value ($n_{i1k}$), which is linked to the true reread value ($n_{i2k}$) through the regression coefficient $α_{1}$, plus measurement error ($ε_{i1k}$). The model includes correction for reader effects. A power-of-the-mean model was assumed for the heterogeneous variance of the measurement errors of both the rereads in Equation (II), with the same variance and power parameters as Equation (2), and the original reads in Equation (III), with different variance $(τ_{3}^{2})$ and power ($θ_{2}$) parameters. Reader 1 is the reference in Equation (I) and Reader 5 is the reference in Equation (III).

Exposure model: $n_{i2k} = γ_{0} + γ_{2} 1_{2} + γ_{3} 1_{3} + γ_{4} 1_{4} + α u_{i}, \quad u_{i} \sim N(0, τ_{1}^{2})$ (I)
Error model: $z_{i2k} = n_{i2k} + ε_{i2k}, \quad ε_{i2k} \sim N(0, σ_{2}^{2} n_{i2k}^{θ_{1}})$ (II)
Regression model: $z_{i1k} = n_{i1k} + ε_{i1k} = γ_{5} + γ_{6} 1_{6} + γ_{7} 1_{7} + γ_{8} 1_{8} + γ_{9} 1_{9} + γ_{10} 1_{10} + γ_{11} 1_{11} + α_{1} n_{i2k} + ε_{i1k}, \quad ε_{i1k} \sim N(0, τ_{3}^{2} n_{i1k}^{θ_{2}})$ (III)

In the classical linear calibration study [1, Section 11.5], measured values are regressed on known standards, and the calibrated measurement is obtained by inverting the regression $(x = (y - a) / b)$. In this case, we wish to calibrate the original reads to obtain the true, unobserved Toshiba value. This requires inverting the regression in Equation (III), but we must also correct for the regression in Equation (3), so that we must divide by the product ($α α_{1}$). The calibrated Year-5 measurement for subject i, $C_{i}$, includes adjustment for differences among the original readers and is estimated using the final regression model (III).

This was possible because the reread original reads included all of the readers of the Year-5 scans. The advantage of this approach is that it does not distinguish between the reread original and the remaining original reads. As a check on the model we also calibrated the rereads, as well as the Acuson measurements from the machine study, using the appropriate model equations.
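The division by the product of the two slopes follows from composing two linear relations. If the expected original read is $z = a_{1} + α_{1} n$ and the true reread value is $n = a + α m$, where m is the true Toshiba value, then $m = (z - a_{1} - α_{1} a) / (α α_{1})$. The sketch below checks this algebra numerically. It is not the paper's calibration formula: the generic intercepts `a` and `a1` are hypothetical stand-ins for the reader-specific intercept terms, whose fitted values are not given in the text. Only the two slopes are taken from Table 2 (LV mass).

```python
def forward(true_toshiba, a, alpha, a1, alpha1):
    """Expected original read implied by a true Toshiba value."""
    true_reread = a + alpha * true_toshiba       # slope alpha, as in Equation (3)
    return a1 + alpha1 * true_reread             # slope alpha1, as in Equation (III)


def calibrate_two_step(z, a, alpha, a1, alpha1):
    """Invert the original-read regression, then the Acuson-on-Toshiba one."""
    true_reread = (z - a1) / alpha1
    return (true_reread - a) / alpha


def calibrate_one_step(z, a, alpha, a1, alpha1):
    """Same inversion written with the product of the slopes."""
    return (z - a1 - alpha1 * a) / (alpha * alpha1)


ALPHA, ALPHA1 = 0.956, 1.065   # slope estimates for LV mass from Table 2
A, A1 = 2.0, 10.0              # hypothetical intercepts, for illustration only

z = forward(130.0, A, ALPHA, A1, ALPHA1)
print(z, calibrate_two_step(z, A, ALPHA, A1, ALPHA1),
      calibrate_one_step(z, A, ALPHA, A1, ALPHA1))
```

Both inversions return the true value that was fed into `forward`, up to floating-point error. When the product of the slopes is close to one, the calibration is mainly a shift determined by the intercept terms.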

The fully parametric Gaussian likelihood for the complete model has a total of 26 parameters (11 for Equations (1)–(3) and 15 for Equations (I)–(III)), which were estimated by the method of maximum likelihood. SAS PROC NLMIXED (Version 9.3, SAS Institute Inc., Cary, NC, USA), a general purpose program for maximum likelihood estimation, was used to fit this complex model. This required coding the model equations for the observed data [(2)–(3) and (II)–(III)], and declaring the variable u in Equations (1) and (I) to be a normally distributed random effect. The program performed very well, presumably because the marginal likelihood, obtained by numerical integration with respect to the unobserved random effects, was reasonably well-behaved, and no problems with convergence were encountered. Normality of the distributions was checked for all the outcomes using plots of residuals. Figures were created using TIBCO Spotfire S+ 8.2 for Windows.

4. Results

Results of the calibration are summarized as comparisons between the original and calibrated data. For this purpose we distinguish five different sets of data, two from the preliminary machine reproducibility study (one for Toshiba and one for Acuson), and three from the Year-5 data (the 1516 rereads of the original images, the corresponding original reads, and the remaining Year-5 data that were not reread and were not used in fitting the model).

Medians, quartiles and the coefficient of variation for the original and calibrated values, together with the correlation between the two are presented in Table 1. For the Toshiba measurements, the calibrated values are those predicted by the model. The distributions of calibrated data for the three sets of original measurements (including the rereads) are similar, as are the two calibrated distributions from the machine study. These results suggest that the primary effect of the calibration was a shift rather than a proportionate change (see Panel B of Figure 1). The high correlations are consistent with the fact that the calibration adjustment was linear in the original measurements. The coefficient of variation is quite similar in the five sets of data for all the original measurements. This is also the case for the calibrated data except for the IVS and LVPW measurements, where there was less variation in the calibrated Year-5 reread data compared to the original Year-5 reread.

Table 1. Descriptive statistics of original and calibrated measurements, by substudy, in CARDIA.

Measurement	Machine/Study	Original: Median (IQR)	Original: CV^b	Calibrated: Median (IQR)	Calibrated: CV	Correlation (95% CI)
LV mass (g)	Acuson	137.4 (120.4, 165.8)^a	29.55	136.0 (117.7, 164.9)	31.13	0.999 (0.999, 0.999)
	Toshiba	137.5 (123.3, 161.1)	29.68	137.2 (120.1, 158.2)	27.30	0.983 (0.977, 0.987)
	Original Y5	144.4 (117.6, 174.2)	28.59	129.9 (103.8, 158.6)	30.67	0.989 (0.988, 0.990)
	Reread Y5	124.8 (102.0, 150.0)	27.90	128.0 (104.3, 152.0)	27.52	0.983 (0.981, 0.984)
	The Rest of Y5	143.2 (116.2, 176.2)	30.61	129.2 (103.5, 160.0)	32.80	0.989 (0.988, 0.990)
LAIDS (cm)	Acuson	3.92 (3.56, 4.46)	14.81	3.96 (3.63, 4.49)	14.27	1.000 (0.999, 1.000)
	Toshiba	4.00 (3.60, 4.51)	13.99	3.94 (3.65, 4.41)	11.80	0.916 (0.888, 0.936)
	Original Y5	3.52 (3.21, 3.82)	13.01	3.39 (3.14, 3.64)	11.13	0.998 (0.997, 0.998)
	Reread Y5	3.49 (3.16, 3.81)	13.81	3.40 (3.09, 3.71)	13.75	1.000 (0.999, 1.000)
	The Rest of Y5	3.50 (3.19, 3.85)	13.44	3.37 (3.12, 3.66)	11.50	0.998 (0.997, 0.998)
LVIDs (cm)	Acuson	3.36 (3.02, 3.65)	14.72	3.46 (3.12, 3.68)	13.27	0.997 (0.996, 0.998)
	Toshiba	3.43 (3.11, 3.71)	13.93	3.41 (3.14, 3.64)	11.40	0.889 (0.854, 0.916)
	Original Y5	3.15 (2.87, 3.44)	13.50	2.92 (2.62, 3.26)	15.94	0.974 (0.971, 0.976)
	Reread Y5	3.20 (2.91, 3.50)	14.05	2.97 (2.72, 3.25)	13.21	0.972 (0.969, 0.975)
	The Rest of Y5	3.18 (2.88, 3.48)	14.40	2.96 (2.64, 3.29)	16.98	0.978 (0.976, 0.979)
LVIDd (cm)	Acuson	5.23 (4.78, 5.50)	10.01	5.30 (4.89, 5.55)	9.42	0.998 (0.998, 0.999)
	Toshiba	5.26 (4.89, 5.54)	9.35	5.27 (4.91, 5.50)	8.58	0.964 (0.952, 0.973)
	Original Y5	4.94 (4.63, 5.26)	9.60	4.78 (4.44, 5.13)	10.56	0.994 (0.994, 0.995)
	Reread Y5	5.02 (4.68, 5.38)	9.95	4.82 (4.51, 5.15)	9.67	0.988 (0.987, 0.989)
	The Rest of Y5	4.93 (4.61, 5.28)	9.97	4.78 (4.44, 5.15)	10.88	0.994 (0.993, 0.994)
IVSs (cm)	Acuson	1.27 (1.09, 1.39)	19.67	1.10 (1.03, 1.15)	9.08	0.993 (0.990, 0.995)
	Toshiba	1.20 (1.07, 1.34)	20.39	1.09 (1.05, 1.14)	7.31	0.829 (0.777, 0.869)
	Original Y5	1.30 (1.16, 1.45)	16.36	1.02 (0.90, 1.14)	16.92	0.964 (0.961, 0.968)
	Reread Y5	1.16 (1.03, 1.30)	17.38	1.00 (0.96, 1.06)	7.65	0.880 (0.868, 0.891)
	The Rest of Y5	1.29 (1.15, 1.45)	17.02	1.02 (0.90, 1.14)	17.59	0.965 (0.963, 0.968)
IVSd (cm)	Acuson	0.78 (0.69, 0.87)	19.63	0.88 (0.82, 0.93)	9.97	0.987 (0.982, 0.990)
	Toshiba	0.76 (0.68, 0.86)	20.26	0.87 (0.83, 0.92)	8.70	0.937 (0.916, 0.952)
	Original Y5	0.86 (0.76, 0.96)	16.99	0.86 (0.74, 0.99)	21.78	0.951 (0.946, 0.956)
	Reread Y5	0.74 (0.65, 0.84)	18.60	0.85 (0.81, 0.90)	8.09	0.869 (0.855, 0.880)
	The Rest of Y5	0.85 (0.75, 0.96)	18.21	0.86 (0.73, 1.00)	23.34	0.951 (0.948, 0.955)
LVPWs (cm)	Acuson	1.25 (1.14, 1.41)	18.29	1.10 (1.04, 1.16)	8.29	0.983 (0.977, 0.987)
	Toshiba	1.20 (1.09, 1.37)	18.54	1.10 (1.06, 1.14)	6.54	0.804 (0.745, 0.849)
	Original Y5	1.45 (1.30, 1.60)	15.09	0.98 (0.85, 1.11)	20.38	0.932 (0.924, 0.938)
	Reread Y5	1.24 (1.09, 1.38)	17.67	0.96 (0.91, 1.00)	7.82	0.883 (0.871, 0.893)
	The Rest of Y5	1.43 (1.27, 1.60)	16.76	0.97 (0.83, 1.12)	22.62	0.941 (0.936, 0.945)
LVPWd (cm)	Acuson	0.79 (0.73, 0.89)	18.52	0.87 (0.83, 0.92)	8.56	0.993 (0.991, 0.995)
	Toshiba	0.76 (0.68, 0.86)	18.60	0.88 (0.83, 0.91)	7.29	0.887 (0.851, 0.914)
	Original Y5	0.84 (0.76, 0.93)	15.75	0.81 (0.68, 0.94)	23.05	0.962 (0.958, 0.966)
	Reread Y5	0.75 (0.65, 0.84)	17.64	0.80 (0.75, 0.84)	7.64	0.858 (0.844, 0.871)
	The Rest of Y5	0.83 (0.75, 0.93)	17.24	0.81 (0.68, 0.94)	24.50	0.959 (0.955, 0.962)
LVFS^c (%)	Acuson	35.32 (31.71, 40.42)	17.89	34.45 (31.96, 38.32)	11.09	0.892 (0.857, 0.918)
	Toshiba	34.69 (31.14, 39.22)	15.56	34.64 (31.86, 38.56)	11.16	0.575 (0.468, 0.664)
	Original Y5	35.91 (32.08, 39.84)	15.85	38.39 (34.30, 42.59)	23.85	0.915 (0.906, 0.923)
	Reread Y5	35.68 (32.12, 39.66)	15.75	37.81 (34.34, 41.24)	15.10	0.972 (0.969, 0.974)
	The Rest of Y5	35.51 (31.63, 39.64)	17.21	38.05 (33.89, 42.45)	26.01	0.924 (0.918, 0.929)

Abbreviations: CV, coefficient of variation; CI, confidence interval; IQR, interquartile range; Y5, Year-5; LV mass, left ventricular mass; LAIDS, left atrial internal dimension in systole; LVIDs, LV internal dimension in systole; LVIDd, LV internal dimension in diastole; IVSs, inter-ventricular septal thickness in systole; IVSd, inter-ventricular septal thickness in diastole; LVPWs, LV posterior wall thickness in systole; LVPWd, LV posterior wall thickness in diastole; LVFS, LV fractional shortening.

^aValues are Median (inter-quartile range).

^bCoefficient of variation in the measurement.

^cValues are estimated from calibrated LVIDd and LVIDs.
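Footnote c says that LVFS was obtained from the calibrated LVIDd and LVIDs rather than from the model. The text does not give the formula, so the sketch below assumes the standard definition of fractional shortening, 100 × (LVIDd − LVIDs) / LVIDd. The function name is hypothetical, and the inputs are the calibrated Original Y5 medians from Table 1, used only as example values.

```python
def fractional_shortening(lvidd_cm, lvids_cm):
    """LV fractional shortening in percent from diastolic and systolic dimensions."""
    if lvidd_cm <= 0:
        raise ValueError("LVIDd must be positive")
    return 100.0 * (lvidd_cm - lvids_cm) / lvidd_cm


# Calibrated Original Y5 medians from Table 1 (LVIDd 4.78 cm, LVIDs 2.92 cm).
fs_from_medians = fractional_shortening(4.78, 2.92)
print(round(fs_from_medians, 2))
```

The value computed from the two medians is close to, but not the same as, the tabulated median calibrated LVFS of 38.39, as expected: the median of a ratio computed per participant is generally not the ratio of the medians.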

Figure 1. Panel A shows a scatterplot of the Acuson (y-axis) and corresponding Toshiba (x-axis) measurements from the machine reproducibility substudy, together with the line of identity. Panel B shows percentile boxplots of the distributions of original and calibrated Year-5 Acuson measurements for the three sets of Year-5 data. Panel C is a scatterplot of the Year-5 original Acuson vs. Year-5 reread Acuson measurements from the reread substudy. Panel D is a scatterplot of the calibrated Year-5 original Acuson vs. calibrated Year-5 reread Acuson measurements from the reread substudy. Panels C and D both include the line of identity. Panel E is a scatterplot of the Year-5 original Acuson vs. Year-5 calibrated original Acuson measurements from the reread substudy. Panel F is a scatterplot of the original Year-5 Acuson vs. Year-5 calibrated original Acuson measurements for the remaining Year-5 reads.

The estimates of the two regression coefficients from the linear measurement error model that were needed for calibration are shown in Table 2. The slope of the regression of Acuson on predicted Toshiba, based on the unobserved true value from the machine reproducibility study $(\hat{α})$ is consistently close to one for all measurements. This is not the case for the regression of the original Year-5 Acuson on the predicted reread Acuson, based on the true value of the Toshiba measurement $({\hat{α}}_{1})$ . The product $(\hat{α} {\hat{α}}_{1})$ , which is the slope of the regression of the original values on the calibrated values, is 1.018 for LV mass and generally close to one for the first four measurements, confirming that the primary effect of the calibration was a shift upward or downward. For the four LVPW and IVS measurements the effect also involved a proportionate increase; additional discussion is included in the Supplement.

Table 2. Estimated regression coefficients from the linear measurement error model for each of the nine echocardiography measurements.

Measurement	$\hat{α}$ (SE)	$\hat{α}_{1}$ (SE)	$\hat{α} \hat{α}_{1}$
LV mass (g)	0.956 (0.042)	1.065 (0.043)	1.018
LAIDS (cm)	1.026 (0.063)	1.182 (0.029)	1.213
LVIDs (cm)	1.088 (0.069)	0.805 (0.031)	0.876
LVIDd (cm)	1.047 (0.044)	0.895 (0.029)	0.937
IVSs (cm)	1.055 (0.086)	0.491 (0.030)	0.518
IVSd (cm)	0.980 (0.059)	0.388 (0.026)	0.380
LVPWs (cm)	1.135 (0.103)	0.371 (0.027)	0.421
LVPWd (cm)	1.054 (0.064)	0.347 (0.027)	0.366
LVFS^a (%)	–	–	–

Abbreviations: LV mass, left ventricular mass; LAIDS, left atrial internal dimension in systole; LVIDs, LV internal dimension in systole; LVIDd, LV internal dimension in diastole; IVSs, inter-ventricular septal thickness in systole; IVSd, inter-ventricular septal thickness in diastole; LVPWs, LV posterior wall thickness in systole; LVPWd, LV posterior wall thickness in diastole; LVFS, LV fractional shortening.

^aLVFS was not calibrated using the model. Calibrated LVIDs and LVIDd were used to compute the calibrated LVFS. The coefficient $\hat{α}$ is the slope of the regression of Acuson on the predicted Toshiba measurement, based on the latent true value, from the machine reproducibility study as well as the rereads of the Year-5 images (Equations (3) and (5), see also (6)). The coefficient ${\hat{α}}_{1}$ is the slope of the regression of the original Year-5 Acuson on the predicted reread Acuson, based on the true value of the Toshiba measurement (Equation (6)). $\hat{α} {\hat{α}}_{1}$ is the slope of the regression of the original values on the calibrated values.
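The product column of Table 2 can be recomputed from the two slope columns as a quick consistency check. The sketch below copies the rounded estimates from the table. Because the published products may have been computed from unrounded estimates, small discrepancies in the third decimal place would not be surprising, and the helper name is hypothetical.

```python
# (alpha_hat, alpha1_hat, reported product) copied from Table 2.
TABLE2 = {
    "LV mass": (0.956, 1.065, 1.018),
    "LAIDS": (1.026, 1.182, 1.213),
    "LVIDs": (1.088, 0.805, 0.876),
    "LVIDd": (1.047, 0.895, 0.937),
    "IVSs": (1.055, 0.491, 0.518),
    "IVSd": (0.980, 0.388, 0.380),
    "LVPWs": (1.135, 0.371, 0.421),
    "LVPWd": (1.054, 0.347, 0.366),
}


def product_discrepancies(table):
    """Return {measurement: recomputed product minus reported product}."""
    return {name: a * a1 - reported for name, (a, a1, reported) in table.items()}


for name, diff in product_discrepancies(TABLE2).items():
    a, a1, reported = TABLE2[name]
    print(f"{name:8s} recomputed={a * a1:.3f} reported={reported:.3f} diff={diff:+.4f}")
```

All eight recomputed products agree with the reported column to within rounding of the inputs.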

Figure 1 displays the performance of the calibration for LV mass. There was a strong linear relationship between the Acuson and Toshiba measurements from the machine reproducibility study, supporting the appropriateness of a linear calibration model at this crucial stage (panel A). The overall distributions of the three sets of original and calibrated Year-5 measurements were comparable (panel B), although they reflect the difference in the medians between the two, and the calibrated distributions were similar to each other. A comparison of the Year-5 calibrated original and reread values to the raw data showed improved agreement, both in the sense of reduced scatter and absence of bias (Panels C and D). Further, panel C shows a characteristic fan shape, illustrating the increasing variability with increasing level that supports the use of the power of the mean model for the error variance. Finally, panels E and F display the adjustment for systematic effects of differences among readers. In this case, the original reads all appear larger than the calibrated values. In both panels, the regression of the original on the calibrated values appears to have a slope of about one.

The remaining figures all exhibit similar features, although sometimes to a different extent (cf. Supplementary data, Figures B.2 to B.9).

5. Discussion and conclusion

Overall, our results demonstrate the utility of a combination of linear measurement error models and carefully designed substudies to calibrate echocardiographic measurements acquired and analyzed 20 years apart, allowing for effective longitudinal comparison of serial repeated echocardiograms. For four of the measurements the effect of the calibration was largely a shift in the calibrated values, while for LVPW and IVS a proportionate increase was also involved. The shift seems reasonable given that the primary sources of bias were differences among readers and machines, which we might expect to be consistent within themselves but not necessarily between different individuals. In addition, the calibrated distributions were similar to each other, confirming that bias resulting from systematic differences among readers and machines has been accounted for.

Linear measurement error models such as this may be considered for general use in a variety of applications. This application illustrates the utility of the paradigm of Carroll et al. [3] for the development of models with measurement error. An important corollary is the ability of parametric models to handle complex designs such as this. Although a small amount of custom programming was required, the model we developed was not difficult to fit using a procedure available in a standard statistical software package. Similar programs are available in other packages such as Stata and R. It is hoped that this example will encourage the wider use of this approach to models with measurement error.

Supplementary Material

SupplementaryData_08.07.2019.pdf

Acknowledgements

The authors would like to thank four reviewers and the Associate Editor for their detailed and constructive comments.

Funding Statement

The Coronary Artery Risk Development in Young Adults Study (CARDIA) is supported by contracts HHSN268201800003I, HHSN268201800004I, HHSN268201800005I, HHSN268201800006I, and HHSN268201800007I from the National Heart, Lung, and Blood Institute (NHLBI).

Disclosure statement

No potential conflict of interest was reported by the authors.

References

1. Brownlee K.A., Statistical Theory and Methodology in Science and Engineering, Wiley, New York, 1965.
2. Carroll R.J., and Ruppert D., Transformation and Weighting in Regression, Chapman and Hall/CRC, London, 1988.
3. Carroll R.J., Ruppert D., Stefanski L.A., and Crainiceanu C.M., Measurement Error in Nonlinear Models: A Modern Perspective, 2nd ed., Chapman and Hall/CRC, London, 2006.
4. Dunn G., Statistical Evaluation of Measurement Errors, 2nd ed., Wiley, New York, 2004.
5. Friedman G.D., Cutter G.R., Donahue R.P., Hughes G.H., Hulley S.B., Jacobs Jr D.R., Liu K., and Savage P.J., CARDIA: study design, recruitment, and some characteristics of the examined subjects. J. Clin. Epidemiol. 41 (1988), pp. 1105–1116. doi: 10.1016/0895-4356(88)90080-7
6. Fuller W.A., Measurement Error Models, Wiley, New York, 1987.
7. Gardin J.M., Wagenknecht L.E., Anton-Culver H., Flack J., Gidding S., Kurosaki T., Wong N.D., and Manolio T.A., Relationship of cardiovascular risk factors to echocardiographic left ventricular mass in healthy young black and white adult men and women. The CARDIA study. Circulation 92 (1995), pp. 380–387. doi: 10.1161/01.CIR.92.3.380
