Author manuscript; available in PMC: 2017 Jul 1.
Published in final edited form as: Addiction. 2017 Apr 25;112(7):1270–1280. doi: 10.1111/add.13797

Adjustment for survey non-representativeness using record-linkage: refined estimates of alcohol consumption by deprivation in Scotland

Emma Gorman 1,2, Alastair H Leyland 1, Gerry McCartney 3, Srinivasa Vittal Katikireddi 1, Lisa Rutherford 4, Lesley Graham 5, Mark Robinson 2, Linsay Gray 1,6,§
PMCID: PMC5467727  EMSID: EMS73021  PMID: 28276110

Abstract

Aims

Analytical approaches to addressing survey non-participation bias typically only use demographic information to improve estimates. We applied a novel methodology which utilises health information from data linkage to adjust for non-representativeness. We illustrate the method by presenting adjusted alcohol consumption estimates for Scotland.

Design

Data on consenting respondents to the Scottish Health Surveys (SHeSs) 1995-2010 were confidentially linked to routinely-collected hospital admission and mortality records. Synthetic observations representing non-respondents were created using general population data. Multiple imputation was performed to compute adjusted alcohol estimates given a range of assumptions about the missing data. Adjusted estimates of mean weekly consumption were additionally calibrated to per-capita alcohol sales data.

Setting

Scotland

Participants

13,936 male and 18,021 female respondents to the SHeSs 1995-2010, aged 20-64 years.

Measurements

Weekly alcohol consumption, non-, binge- and problem-drinking.

Findings

Initial adjustment for non-response resulted in estimates of overall mean weekly consumption that were elevated by up to 17.8% [26.5 units (18.6 - 34.4)] compared to corrections based solely on socio-demographic data [22.5 (17.7 - 27.3)]; other drinking behaviour estimates were little changed. Under more extreme assumptions the overall difference was up to 53% and calibrating to sales estimates resulted in up to 88% difference. Increases were especially pronounced among males in deprived areas.

Conclusions

Use of routinely-collected health data to reduce bias arising from survey non-response resulted in elevated alcohol consumption estimates among working age males in Scotland, with less impact for females. Our new method can be generalised to other surveys and improve estimates of alternative harmful behaviours.

Keywords: Alcohol-related harm, Alcohol consumption, Health surveys, Bias, Non-participation, Epidemiology, Scotland, Record-linkage

Introduction

Accurate data on addictive substance use are necessary to inform policy development, implementation and evaluation (1, 2), and for alcohol research. In many countries, estimates of population consumption of legal addictive substances are derived from taxation or sales data (3). However, data on trends in alcohol consumption by social and demographic groups (such as age, gender and socioeconomic position) and the pattern in which substances are consumed (for example, frequency and amount per occasion) typically require the administration of population-sampled surveys.

In the Scottish context, the Scottish Health Surveys (SHeS) (4) offer detailed exploration of consumption at the individual level. Although providing important additional insights compared to sales data (5), survey-derived estimates face various biases, including those arising from non-participation of invited respondents (unit non-response), social desirability and recall (6), which may threaten internal validity and generalisability to the population. Additionally, sampling frames of many population-sampled health surveys, including the SHeSs, are confined to private residences, excluding institutionalised and transient populations. Furthermore, incidence rates of alcohol-related harm (7, 8), as well as all-cause mortality (9), have been found to be substantially lower among survey respondents compared with the general population, as we identified from record-linkage in SHeS (10). These features contribute to underestimation of population consumption from surveys (11–14), evident as a substantial differential between survey- and sales data-based estimates of mean weekly units consumed (15–17), thus hampering alcohol research.

Understanding the consequences of survey non-response is particularly salient as survey response levels are declining (18, 19). Poor health and risky health behaviours correlate with non-response (20–22), suggesting that the estimation of health behaviour prevalence could be biased. Adjustments for non-response are often confined to a limited set of socio-demographic variables (as is the case in the SHeSs (4)), with the use of survey weighting being the most commonly adopted method. This is intended to align the socio-demographic profile of the survey to that of the target population, but any further differences between respondents and non-respondents within socio-demographic groups are not corrected, so weights are likely to be mis-specified. The addition of health measures may be further informative, although there have been few attempts (12, 13, 23) to incorporate these, not least because the necessary data, comparable across respondents and non-respondents, are not readily available. However, it is possible for equivalent information to be obtained directly via record-linkage for those who have responded and implicitly for those who have not (10, 24), which is the case for SHeS.

Even with a broader range of informative data, approaches to adjust for non-participation necessarily rely on untestable assumptions about the nature of the missing data. Presenting results based on only one set of assumptions may convey an unrealistic level of certainty about the estimates. Sensitivity analyses based on a range of credible assumptions recognise this uncertainty. Information from additional sources about the likely behaviour of non-respondents (14) can be used to establish plausible scenarios, with each forming the basis of inference, offering a range of informative estimates for data users to consider.

The aims of this study were threefold:

  • (i) Exploit linkage of survey records to administrative health data to adjust for health-related non-representativeness in alcohol consumption estimates;

  • (ii) Conduct sensitivity analysis given a range of assumptions about the unobserved data;

  • (iii) Triangulate adjusted survey estimates with alcohol sales data.

In doing so, we quantify the impact of survey non-response on population estimates and socioeconomic inequalities in alcohol consumption, using generalisable methodology.

Methods

Data

Baseline survey and population data

The SHeSs are a series of stratified, cluster-sampled repeated cross-sectional surveys designed to describe the health of the Scottish population living in private households (24). We used the surveys conducted in 1995 (25), 1998 (26), 2003 (27), 2008 (28), 2009 (29) and 2010 (4), henceforth the baseline years; adult response percentages fell from 84% in 1995 to 55% in 2010 (Table 1). Analyses were restricted to data on individuals aged 20 to 64 years, as this age range was available in all survey years and enhanced comparability between the survey sampling frame and population data. Survey weights, which had previously been derived to account for the survey sampling design (incorporating differential selection of addresses/households, calibration to match population estimates for age/sex and health board, and within-household response), were used throughout (30). The alcohol measures of interest were: usual weekly alcohol consumption, derived using the Quantity-Frequency method (31); the prevalence of non-drinkers; binge drinking, defined as consumption in excess of six units for women or eight units for men on the heaviest drinking day in the last seven days, where one alcohol unit is 10 ml or 8 g of pure alcohol (27); and potential problem-drinking, defined as two or more positive answers on the CAGE instrument (32, 33).

Table 1.

Response proportions and consent to linkage in the Scottish Health Surveys, 1995–2010

Survey year | Household response proportion, % | Adult response proportion, % | Proportion consenting to linkage, % | No. of men aged 20–64 years | No. of women aged 20–64 years
1995 | 81 | 84 | 93 | 3,118 | 3,867
1998 | 77 | 76 | 92 | 2,944 | 3,674
2003 | 67 | 60 | 91 | 2,353 | 3,028
2008 | 61 | 54 | 86 | 1,683 | 2,234
2009 | 64 | 56 | 85 | 1,944 | 2,647
2010 | 63 | 55 | 86 | 1,894 | 2,571

General population data comparable with each SHeS survey were constructed using mid-year population estimates for small-area geographical units (i.e. datazones which contain around 350 households and populations of 500-1000 residents (34)) from the National Records of Scotland (NRS) by sex and five-year age-group at each baseline year (35). Datazone-level population count estimates were not available for mid-1995, so mid-1996 estimates were used.

Area-based measures of deprivation were matched to both the survey records and population data. The Carstairs 2001 measure of small-area material disadvantage was used in the 1995 and 1998 baseline years, and the Scottish Index of Multiple Deprivation (SIMD) (36) from 2003 onward (37).

Morbidity and mortality records

The Scottish Morbidity Records (SMR) are hospital episode statistics drawn from routinely-collected NHS records of socio-demographic, episode management and clinical data across Scotland (38) and have been found to be ~90% accurate in recording the correct diagnosis (39) and ~99% complete (40). Inpatient and day cases discharged from general and mental health specialties with an alcohol-related diagnosis in any diagnostic position (41) and mortality data collected by NRS were classified using the International Classification of Diseases (ICD), 9th and 10th Revisions. For consenting SHeS respondents (ranging from 85% in 2009 to 93% in 1995; Table 1), SMR and NRS records were linked to survey records. Morbidity and mortality data were available until the end of 2011, allowing a maximum follow-up period of 16 years from 1995. Two overlapping binary measures (collectively referred to as “harm”) were created: death due to any cause, and any alcohol-related event (hospitalisation, or death with an alcohol-related primary cause (42)). Data associated with baseline years 2008, 2009 and 2010 were pooled to accommodate the smaller sample sizes of the annual format surveys and their shorter follow-up periods (Table 1).

Sales data

The alcohol sales data used were collected by market research specialists Nielsen/CGA. Estimates of annual sales in Scotland, including both on- and off-trade, were available for all the baseline years except 1998, for which estimates were linearly interpolated (43).

Statistical methodology

We developed a new methodology, detailed elsewhere (44), to correct the non-participation aspect of bias arising in survey data. Rather than adjust the weighting, we took an imputation approach. Multiple imputation is a statistical technique for analysing incomplete data sets (45) which has the advantage of being amenable to sensitivity analyses (46). In essence, in the absence of data on individual non-respondents, we compared the composition of survey respondents in terms of age, sex, deprivation and harms with that of the general population to identify the numbers of missing survey respondents within each sociodemographic/harm combination group. We then created “observations” for non-respondents within each group and imputed their unknown alcohol consumption values. We allowed associations between consumption and harm to differ between respondents and non-respondents.
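The source does not spell out how results from the imputed data sets were pooled. As a minimal sketch, assuming the standard combination rules for multiple imputation (Rubin's rules), a pooled mean and its total variance could be computed as follows. The function name `pool_rubin` and the input figures are hypothetical and chosen only for illustration.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m >= 2 per-imputation estimates with Rubin's rules.

    estimates: point estimate from each imputed data set
    variances: squared standard error from each imputed data set
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    pooled = mean(estimates)       # average of the m point estimates
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical mean weekly units from three imputed data sets
pooled_mean, total_var = pool_rubin([22.0, 22.4, 22.8], [1.0, 1.0, 1.0])
```

The between-imputation term is what widens the interval to reflect uncertainty about the imputed values, which is consistent with the wider confidence intervals of the adjusted estimates relative to the respondent-only estimates.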

Our methodology considers the nature of missingness with reference to the classification of missing data mechanisms (45), as follows. Data can be missing at random (MAR) or missing not at random (MNAR) (45). MAR is the case where the probability of missingness is unrelated to the unobserved data after taking account of the observed data. Alternatively, if the missingness depends on unobserved data (even after taking account of all the information in the observed data), the observations are MNAR. Note that data which are MNAR can become MAR if additional variables are observed and used in analysis; this is a feature of our approach.
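The two mechanisms can be written compactly for the unit non-response setting considered here. The notation is ours rather than the source's: R is a response indicator, Y is alcohol consumption (unobserved for non-respondents) and X denotes the observed covariates (age, sex, deprivation and harm).

```latex
% R = 1 if the individual responded, 0 otherwise
% Y = alcohol consumption; X = observed covariates
\[
\text{MAR:}\quad \Pr(R = 1 \mid Y, X) = \Pr(R = 1 \mid X),
\qquad
\text{MNAR:}\quad \Pr(R = 1 \mid Y, X) \neq \Pr(R = 1 \mid X).
\]
```

Adding harm to X enlarges the conditioning set, which is how data that are MNAR given socio-demographic variables alone may become MAR once the linked health records are used.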

Briefly, the approach involves the following steps:

  1) establishing the total number of missing respondents, based on the “effective response level” (the percentage of the sample that both responded to the survey and consented to linkage) and the number of observed respondents;
  2) establishing the respondent composition in terms of age, sex, deprivation quintile and harms during follow-up;
  3) establishing the population composition;
  4) establishing the number of missing respondents within each sociodemographic–harm combination group by comparison of survey and population data (from steps 2 and 3);
  5) creating synthetic observations for the non-respondents;
  6) conducting multiple imputation (47) to provide estimates of alcohol consumption measures in the synthetic non-respondents, given the data on age, sex, deprivation and harms, and based on the assumption that the consumption data are MAR (Appendix S1);
  7) generating non-response-corrected alcohol estimates under the MAR assumption by combining the observed alcohol data on the respondents and the imputed alcohol data on the synthetic non-respondents; and
  8) altering the MAR imputation model by specific estimates for the mean difference in alcohol consumption between respondents and non-respondents in sensitivity analyses, allowing for the possibility that the consumption data could be MNAR (Appendix S2).

The effects of a range of MNAR scenarios were explored, as outlined in the MNAR sensitivity analyses section below and Appendix S2.
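Steps 1 to 5 can be illustrated with a small sketch. It is not the authors' implementation (which is detailed in reference 44): the function names, the rounding, the flooring of negative counts at zero and all the figures are assumptions made here for illustration.

```python
def missing_counts(observed, population_share, effective_response):
    """Estimate the number of missing respondents in each group.

    observed: dict mapping group -> number of linked respondents
    population_share: dict mapping group -> population proportion (sums to 1)
    effective_response: proportion that responded AND consented to linkage
    """
    n_obs = sum(observed.values())
    n_total = n_obs / effective_response  # implied size of the full sample
    missing = {}
    for group, share in population_share.items():
        expected = n_total * share        # count expected if representative
        # Assumption: round to whole people and never return a negative count
        missing[group] = max(0, round(expected - observed.get(group, 0)))
    return missing


def synthetic_rows(missing):
    """Create one placeholder record per missing respondent (consumption unknown)."""
    return [
        {"group": group, "respondent": False, "weekly_units": None}
        for group, count in missing.items()
        for _ in range(count)
    ]


# Hypothetical figures: 100 linked respondents, 50% effective response
observed = {("M", "most deprived", "harm"): 10,
            ("M", "most deprived", "no harm"): 90}
population_share = {("M", "most deprived", "harm"): 0.2,
                    ("M", "most deprived", "no harm"): 0.8}
missing = missing_counts(observed, population_share, 0.5)
rows = synthetic_rows(missing)
```

In this toy example the harm group is under-represented among respondents (10% against 20% in the population), so proportionally more synthetic non-respondents are created for it; their `weekly_units` values are what the imputation in step 6 would fill in.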

MNAR sensitivity analyses

We relaxed the MAR assumption in sensitivity analyses by modifying the imputation model, using a pattern-mixture approach, as detailed in Appendix S2 (48). This involved changing the imputation model to reflect potential differences in the distribution of alcohol consumption between respondents and non-respondents, given the observed data. This required specifying a value for the mean difference in alcohol consumption between respondents and non-respondents, after adjusting for observed covariates (this is zero under MAR). The value of this parameter can be varied to represent different assumptions and the impact on substantive conclusions assessed. Two scenarios were considered.
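In symbols, the sensitivity parameter described above can be sketched as an offset between the conditional means of the two response patterns. The notation is ours, not the source's: R = 1 for respondents and R = 0 for non-respondents, Y is alcohol consumption, X the observed covariates and δ the specified mean difference.

```latex
% Pattern-mixture offset: delta = 0 recovers the MAR imputation model
\[
\mathrm{E}[\,Y \mid X, R = 0\,] = \mathrm{E}[\,Y \mid X, R = 1\,] + \delta,
\qquad \delta = 0 \ \text{under MAR}.
\]
```

The exact form used by the authors is given in their Appendix S2, which is not reproduced here.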

The first MNAR scenario (MNARCR) drew on data on the number of attempts to contact a household for interview and “continuum of resistance” theory: non-respondents may be similar to late-respondents, as late-respondents would have been non-respondents if efforts to contact had ceased earlier (11, 49). The specific scenario considered was that the deviation from MAR could be up to twice the adjusted differences in mean consumption between early- (≤3 attempts to contact) and late-respondents (>3 attempts).
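A minimal sketch of how such a deviation could be derived and applied is given below. It assumes a simple additive shift of MAR-imputed values, with a floor at zero units; the function names, the floor and the example figures are hypothetical, and the covariate adjustment of the early/late difference described in the source is taken as already done.

```python
def cr_delta(mean_early, mean_late, multiplier=2.0):
    """Upper bound on the deviation from MAR under the continuum-of-resistance idea.

    mean_early, mean_late: adjusted mean weekly units among early (<= 3 contact
    attempts) and late (> 3 attempts) respondents.
    """
    return multiplier * (mean_late - mean_early)


def shift_imputations(imputed_mar, delta):
    """Shift MAR-imputed weekly units by delta (assumption: floor at zero)."""
    return [max(0.0, value + delta) for value in imputed_mar]


# Hypothetical adjusted means: late respondents drink 1.5 units more per week
delta = cr_delta(mean_early=20.0, mean_late=21.5)
shifted = shift_imputations([0.0, 10.0, 25.0], delta)
```

Setting `multiplier=0` returns the MAR imputations unchanged, so intermediate values between 0 and 2 trace out the range of deviations the scenario allows.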

The second MNAR scenario aimed to incorporate the possibility of a subgroup of very heavy drinkers (we focussed on those experiencing harm) whose consumption may not look similar to any observed subgroup and who are not “adjusted for” in typical corrections. As the most extreme scenario (MNAR***), the imputation model was altered such that sex-specific mean consumption among non-respondents experiencing harm was six times greater than the observed mean. This was informed by a study of patients with serious alcohol problems, hospitalised or in treatment in two Edinburgh hospitals, which estimated mean weekly consumption in this sample as 197.7 units (50). This resulted in an adjusted mean among drinkers experiencing harm of 197.5 units, compared with a MAR estimate of 48.4 units in this group. More moderate scenarios were considered, where sex-specific mean consumption among those experiencing harm was double (MNAR*) and quadruple (MNAR**) that of their observed counterparts.
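To see how these multipliers propagate to a group mean, the following sketch mixes respondents and non-respondents within a harm group. It is a simplification: the source alters the imputation model rather than plugging in means directly, and the group sizes and observed mean used here are hypothetical.

```python
# Multipliers applied to the observed mean among non-respondents experiencing harm
SCENARIOS = {"MNAR*": 2, "MNAR**": 4, "MNAR***": 6}


def mixture_mean(n_resp, mean_resp, n_nonresp, mean_nonresp):
    """Mean of a group made up of respondents and (synthetic) non-respondents."""
    total_units = n_resp * mean_resp + n_nonresp * mean_nonresp
    return total_units / (n_resp + n_nonresp)


def harm_group_means(observed_mean, n_resp, n_nonresp):
    """Group mean under each scenario, respondents kept at their observed mean."""
    return {
        name: mixture_mean(n_resp, observed_mean, n_nonresp, k * observed_mean)
        for name, k in SCENARIOS.items()
    }


# Hypothetical harm group: 50 respondents averaging 30 units, 50 non-respondents
means = harm_group_means(observed_mean=30.0, n_resp=50, n_nonresp=50)
```

The effect on an overall estimate therefore depends on both the multiplier and the share of non-respondents with harm, which is larger in more deprived quintiles.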

Sales data-based triangulation

We adjusted the survey estimates informed by comparison with per capita estimates (51–53) (Appendix S3). Each overall non-response adjusted estimate of mean weekly consumption was compared with the per capita consumption estimate for Scotland to assess the magnitude of the remaining coverage gap: that which is not explained by participation bias in our data. This proportionate difference was used to shift up each of the estimates of mean consumption in sex- and deprivation quintile-specific subgroups. All analyses were conducted in Stata/SE 13.1 (StataCorp LP, Texas).
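The calibration step can be sketched as a single proportional scaling. The symbols are ours: C is the sales-based per capita consumption, ȳ the overall non-response adjusted survey mean and ȳ_g the adjusted mean in subgroup g. The further adjustment for the 20 to 64 age restriction described in Appendix S3 is not reproduced here.

```latex
% s: scaling factor; \tilde{y}_g: calibrated mean for subgroup g
\[
s = \frac{C}{\bar{y}}, \qquad \tilde{y}_g = s \, \bar{y}_g .
\]
```

Because the same factor s multiplies every subgroup, calibration preserves the relative gradient across deprivation quintiles produced by the non-response adjustment.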

Results

The generation of synthetic non-respondent observations aligned survey and population in terms of sex- and area deprivation quintile-specific percentage breakdowns (Supplementary Table S1). The differential gradient in the probability of prospective alcohol-related harm between the survey and population data was corrected in the adjusted data (Supplementary Table S2).

The MAR adjustment resulted in elevated estimates of weekly consumption among males (Table 2), for whom the magnitude of correction was 1.9% in 1995, 3.8% in 1998, 2.7% in 2003 and 1.6% in 2008/10, compared with little change among females. Taking 2003 as an example, the second-scenario MNAR adjustments among males increased weekly units consumed from the original (R) estimate of 21.8 units to 24.6 units (14% increase) in the weakest scenario (MNAR*), and to 33.3 units in the most extreme (53% increase; MNAR***; Table 2). The first-scenario MNAR-based estimates (MNARCR) were generally close to those of the second-scenario MNAR*. The set of estimates calibrated to sales data ranged from 33.2 to 36.4 units, with the biggest increase of 88% during 2008-10 (Table 2).

Table 2.

Mean weekly alcohol consumption estimates for Scottish Health Surveys 1995, 1998, 2003 and 2008-2010 among individuals aged 20 to 64 years by sex.

Baseline year | 1995 | 1998 | 2003 | 2008-10
Estimate | Mean (95% CI) or Mean / SD | Mean (95% CI) or Mean / SD | Mean (95% CI) or Mean / SD | Mean (95% CI) or Mean / SD
Males
R | 20.8 (19.7 - 22.0) | 20.0 (19.0 - 21.0) | 21.8 (20.5 - 23.1) | 18.8 (17.9 - 19.6)
MAR | 21.2 (20.0 - 22.4) | 20.8 (19.5 - 22.0) | 22.4 (20.3 - 24.4) | 19.1 (18.2 - 20.1)
MNARCR | 22.6 (21.4 - 23.8) | 22.2 (20.9 - 23.6) | 24.9 (22.8 - 27.0) | 20.7 (19.7 - 21.7)
MNAR* | 22.8 (21.5 - 24.1) | 22.3 (20.9 - 23.6) | 24.6 (22.4 - 26.7) | 20.1 (19.1 - 21.1)
MNAR** | 25.9 (24.2 - 27.7) | 25.3 (23.5 - 27.0) | 28.9 (26.4 - 31.5) | 22.1 (20.9 - 23.3)
MNAR*** | 29.1 (26.7 - 31.4) | 28.2 (26.0 - 30.5) | 33.3 (30.1 - 36.5) | 24.1 (22.6 - 25.6)
Calibrated | 33.8 / 39.3 | 34.6 / 40.4 | 33.2 / 41.1 | 33.5 / 39.9
Calibrated* | 34.1 / 40.4 | 34.9 / 41.8 | 34.0 / 42.3 | 33.9 / 41.5
Calibrated** | 34.7 / 43.9 | 35.4 / 46.1 | 35.3 / 47.2 | 34.6 / 45.3
Calibrated*** | 35.1 / 47.0 | 35.8 / 49.9 | 36.4 / 51.5 | 35.3 / 48.8
Females
R | 6.3 (5.8 - 6.7) | 7.0 (6.6 - 7.3) | 10.8 (10.1 - 11.6) | 8.8 (8.5 - 9.1)
MAR | 6.4 (5.9 - 6.9) | 7.0 (6.5 - 7.5) | 10.8 (9.8 - 11.7) | 8.8 (8.5 - 9.1)
MNARCR | 6.7 (6.2 - 7.2) | 7.3 (6.9 - 7.8) | 11.5 (10.5 - 12.4) | 9.4 (9.0 - 9.8)
MNAR* | 6.6 (6.1 - 7.1) | 7.3 (6.8 - 7.7) | 11.0 (10.0 - 12.0) | 8.9 (8.5 - 9.3)
MNAR** | 7.0 (6.4 - 7.6) | 7.8 (7.2 - 8.3) | 11.5 (10.5 - 12.5) | 9.1 (8.7 - 9.5)
MNAR*** | 7.4 (6.8 - 8.0) | 8.3 (7.6 - 8.9) | 11.9 (10.8 - 13.0) | 9.3 (8.9 - 9.7)
Calibrated | 10.2 / 14.3 | 11.7 / 15.6 | 16.0 / 20.4 | 15.5 / 20.7
Calibrated* | 9.9 / 13.8 | 11.4 / 15.2 | 15.2 / 19.4 | 15.0 / 20.2
Calibrated** | 9.4 / 13.3 | 10.9 / 14.9 | 14.0 / 18.0 | 14.3 / 19.3
Calibrated*** | 8.9 / 12.8 | 10.5 / 14.7 | 13.0 / 16.9 | 13.7 / 18.6

95% CI: 95% confidence interval; SD: standard deviation; R: linkage-consenting Scottish Health Survey respondents (survey-weighted); MAR: missing-at-random; MNAR: missing-not-at-random; CR: continuum of resistance-based sensitivity analysis; *: slight sensitivity analysis; **: moderate sensitivity analysis; ***: extreme sensitivity analysis; Calibrated: calibrated to retail data.

The percentage increase in weekly units consumed from the survey-weighted estimate to the MAR adjustment among males was typically the greatest in the most deprived quintile (+4.9% in 1995, +3.6% in 1998, +17.8% in 2003 and +13.6% in 2008-2010 compared with -1.6%, +4.4%, -2.6% and -13.6%, respectively in the least deprived quintile; Table 3; Supplementary Tables S3a-d). Among females, mean consumption was consistently greater in the most deprived quintile both before and after the MAR adjustment and the extent of the adjustment did not follow a pronounced pattern over deprivation (Table 3; Supplementary Tables S3a-d). As the association between harms and consumption is progressively increased through the second MNAR sensitivity analyses (Table 3; Supplementary Tables S3a-d), the gradient over deprivation emerges, in contrast to the survey-weighted and MAR estimates. These altered gradients are reflected in the corresponding sales data-calibrated estimates (Table 4; Supplementary Tables S4a-d).
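The 2003 percentage changes quoted above for males can be checked against the rounded means reported in Table 3. The helper below is a small illustration (the function name is ours); because it works from rounded published means, small discrepancies with figures computed from unrounded data are possible in other years.

```python
def pct_change(before, after):
    """Percentage change from `before` to `after`, rounded to one decimal place."""
    return round(100 * (after - before) / before, 1)


# 2003 males, survey-weighted estimate -> MAR estimate (Table 3)
most_deprived = pct_change(22.5, 26.5)   # most deprived quintile
least_deprived = pct_change(23.1, 22.5)  # least deprived quintile
```

These evaluate to +17.8% and -2.6% respectively, matching the 2003 figures in the text.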

Table 3.

Weekly alcohol consumption estimates in the 2003 Scottish Health Survey respondents aged 20 to 64 years by sex and area deprivation quintile under a range of assumptions about the missing data: socio-demographic based survey weights; MAR; MNAR.

Quintile of deprivation | N | Survey-weighted estimates among respondents, Mean (95% CI) | MAR estimates in adjusted sample, Mean (95% CI) | MNARCR estimates in adjusted sample, Mean (95% CI) | MNAR* estimates in adjusted sample, Mean (95% CI) | MNAR** estimates in adjusted sample, Mean (95% CI) | MNAR*** estimates in adjusted sample, Mean (95% CI)
Males
Least deprived | 484 | 23.1 (20.9 - 25.3) | 22.5 (19.3 - 25.7) | 23.9 (20.7 - 27.1) | 23.2 (20.1 - 26.4) | 24.7 (21.2 - 28.1) | 26.1 (22.0 - 30.2)
2 | 532 | 21.4 (19.2 - 23.6) | 20.0 (16.4 - 23.7) | 21.9 (18.2 - 25.6) | 21.4 (17.6 - 25.1) | 24.1 (19.8 - 28.3) | 26.8 (21.6 - 31.9)
3 | 500 | 21.9 (18.8 - 25.0) | 22.8 (18.8 - 26.9) | 24.9 (20.6 - 29.1) | 24.3 (20.0 - 28.7) | 27.3 (22.1 - 32.5) | 30.3 (23.9 - 36.7)
4 | 457 | 20.0 (17.6 - 22.5) | 20.2 (17.4 - 23.0) | 22.9 (20.0 - 25.9) | 22.8 (19.7 - 25.9) | 28.0 (23.6 - 32.3) | 33.1 (27.1 - 39.2)
Most deprived | 380 | 22.5 (17.7 - 27.3) | 26.5 (18.6 - 34.4) | 31.2 (23.0 - 39.4) | 31.6 (23.3 - 40.0) | 41.9 (32.2 - 51.6) | 52.1 (40.5 - 63.8)
All quintiles | 2,353 | 21.8 (20.5 - 23.1) | 22.4 (20.3 - 24.4) | 24.9 (22.8 - 27.0) | 24.6 (22.4 - 26.7) | 28.9 (26.4 - 31.5) | 33.3 (30.1 - 36.5)
Females
Least deprived | 603 | 12.5 (11.4 - 13.5) | 12.9 (11.2 - 14.5) | 13.5 (11.8 - 15.1) | 13.0 (11.3 - 14.6) | 13.2 (11.5 - 14.8) | 13.4 (11.6 - 15.1)
2 | 666 | 12.7 (10.3 - 15.1) | 12.2 (9.4 - 15.0) | 12.8 (10.0 - 15.6) | 12.3 (9.4 - 15.1) | 12.4 (9.6 - 15.3) | 12.6 (9.7 - 15.5)
3 | 631 | 9.7 (8.6 - 10.8) | 9.6 (8.1 - 11.2) | 10.3 (8.8 - 11.9) | 9.8 (8.2 - 11.4) | 10.1 (8.5 - 11.7) | 10.4 (8.7 - 12.2)
4 | 586 | 9.5 (8.3 - 10.8) | 9.4 (7.7 - 11.1) | 10.1 (8.4 - 11.8) | 9.7 (8.0 - 11.4) | 10.2 (8.4 - 12.1) | 10.8 (8.8 - 12.8)
Most deprived | 542 | 9.4 (7.8 - 11.0) | 9.7 (7.5 - 11.9) | 10.5 (8.3 - 12.7) | 10.2 (8.0 - 12.5) | 11.2 (8.8 - 13.7) | 12.3 (9.6 - 14.9)
All quintiles | 3,028 | 10.8 (10.1 - 11.6) | 10.8 (9.8 - 11.7) | 11.5 (10.5 - 12.4) | 11.0 (10.0 - 12.0) | 11.5 (10.5 - 12.5) | 11.9 (10.8 - 13.0)

Scottish Health Survey respondents that have consented to linkage; 95% CI: 95% confidence interval; SD: standard deviation; MAR: missing-at-random; MNAR: missing-not-at-random; CR: continuum of resistance-based sensitivity analysis; *: slight sensitivity analysis; **: moderate sensitivity analysis; ***: extreme sensitivity analysis; Calibrated: calibrated to retail data.

Table 4.

Mean weekly alcohol consumption estimates and standard deviations in individuals aged 20 to 64 years in 2003 by sex calibrated to per capita totals.

Quintile of deprivation | Calibrated, Mean / SD | CalibratedCR, Mean / SD | Calibrated*, Mean / SD | Calibrated**, Mean / SD | Calibrated***, Mean / SD
Males
Least deprived | 33.4 / 35.6 | 32.4 / 30.7 | 32.1 / 34.3 | 30.1 / 32.6 | 28.5 / 31.4
2 | 29.8 / 32.1 | 29.7 / 29.2 | 29.6 / 33.2 | 29.4 / 36.8 | 29.2 / 40.0
3 | 33.9 / 38.4 | 33.7 / 37.4 | 33.7 / 38.7 | 33.3 / 39.8 | 33.0 / 40.8
4 | 30.0 / 33.4 | 31.1 / 31.7 | 31.6 / 36.6 | 34.1 / 46.0 | 36.1 / 54.2
Most deprived | 39.4 / 69.6 | 42.3 / 50.4 | 43.8 / 72.8 | 51.1 / 88.2 | 56.9 / 101.7
All quintiles | 33.2 / 41.1 | 33.7 / 35.7 | 34.0 / 42.3 | 35.3 / 47.2 | 36.4 / 51.5
Females
Least deprived | 19.1 / 21.9 | 18.3 / 18.5 | 18.0 / 20.4 | 16.1 / 18.4 | 14.6 / 16.8
2 | 18.0 / 20.3 | 17.3 / 27.4 | 17.0 / 19.0 | 15.2 / 17.2 | 13.8 / 15.7
3 | 14.3 / 19.2 | 14.0 / 18.7 | 13.5 / 18.2 | 12.3 / 16.6 | 11.4 / 15.5
4 | 14.0 / 19.3 | 13.7 / 18.2 | 13.4 / 18.6 | 12.5 / 17.7 | 11.8 / 17.0
Most deprived | 14.4 / 19.6 | 14.2 / 17.0 | 14.2 / 19.2 | 13.7 / 18.8 | 13.4 / 18.5
All quintiles | 16.0 / 20.4 | 15.5 / 20.2 | 15.2 / 19.4 | 14.0 / 18.0 | 13.0 / 16.9
Scaling factor | 1.5 | 1.4 | 1.4 | 1.2 | 1.1

SD: standard deviation; CR: continuum of resistance-based sensitivity analysis; *: slight sensitivity analysis; **: moderate sensitivity analysis; ***: extreme sensitivity analysis; Calibrated: calibrated to retail data; rounded to one decimal place.

The unadjusted prevalence of problem-drinking was strongly socially patterned, being highest in the most deprived quintile, for both sexes (Supplementary Table S5). MAR adjustment resulted in a proportionally larger change in the more deprived quintiles (Supplementary Table S5) and produced a marginal increase overall (Table 5). The prevalence of binge drinking exhibited a similar social gradient to that of problem drinking, but with a higher prevalence; MAR correction had little impact (Table 5 and Supplementary Table S6). Non-drinkers were more prevalent in the most deprived areas; there was negligible effect from MAR correction (Supplementary Table S7).

Table 5.

Potential problem-drinker and binge-drinking prevalence estimates in the Scottish Health Survey respondents and in the adjusted sample by survey year, sex and area deprivation quintile

Survey year(s) | Males, survey-weighted, % (CI) | Males, MAR, % (CI) | Females, survey-weighted, % (CI) | Females, MAR, % (CI)
Potential problem drinking prevalence (among current drinkers)
1998 | 12.3 (10.9 - 13.8) | 12.9 (11.5 - 14.3) | 5.0 (4.1 - 5.8) | 5.4 (4.5 - 6.3)
2003 | 12.8 (11.2 - 14.3) | 13.3 (11.8 - 14.7) | 6.7 (5.6 - 7.7) | 6.6 (5.6 - 7.7)
2008/10 | 14.8 (13.6 - 16.0) | 14.7 (13.6 - 15.8) | 9.1 (8.3 - 10.0) | 8.7 (8.0 - 9.4)
Binge drinking prevalence (among those who drank in the last 7 days)
1998 | 39.7 (37.4 - 42.0) | 39.2 (36.6 - 41.7) | 18.8 (18.6 - 19.0) | 18.6 (16.6 - 20.7)
2003 | 34.2 (31.7 - 36.6) | 34.3 (31.5 - 37.0) | 21.1 (19.1 - 23.2) | 20.9 (18.5 - 23.2)
2008/10 | 43.2 (41.3 - 45.0) | 43.3 (41.7 - 44.9) | 33.9 (32.3 - 35.5) | 35.7 (34.1 - 37.2)

Respondents that have consented to linkage; MAR: missing-at-random

Discussion

Adjusting for differential participation resulted in elevated estimates of weekly alcohol consumption. This was particularly pronounced among men living in the most deprived areas, operating chiefly through the elevated levels of alcohol-related harm experienced by non-respondents in this group. Among women the correction did not have a substantial impact on the level or patterning of weekly consumption. Generally, the prevalence of non-drinkers and binge drinking were not materially affected by adjustment. For problem-drinking, there tended to be a proportionally larger change in the more deprived quintiles. Sensitivity analyses yielded a possible higher range of adjusted estimates of weekly consumption and a steeper social gradient.

Previous studies have shown an association between ill-health, including alcohol misuse, and response status (8, 21, 54). However, few studies have used this information to produce adjusted estimates. Recent exceptions considering adjustments to alcohol consumption have assessed the impact of adjustments on overall consumption level (rather than within subgroups) and found small to non-existent effects (12, 13). Adjusted estimates for subpopulations of interest are important for evaluation of the impact of policy on inequalities and of heterogeneous or unintended effects (55, 56), and the present study demonstrates that health-related non-response may have important differential effects. The most similar study used Swedish registry-based data on differences in retrospective alcohol-related hospitalisation between survey respondents and inferred estimates for non-respondents to adjust prevalence estimates of hazardous drinking and abstinence (12). Although those with previous alcohol-related hospitalisations were, on average, 2.4 times more likely to become survey non-respondents, adjusting for these differences had little impact on rates of hazardous alcohol consumption. Potential explanations were, first, that the proportion hospitalised was low (1.7% over a ten-year period) and, second, that the experience of those hospitalised may not generalise to, or provide sufficient information on, the wider population of hazardous drinkers. Our study finds, in general, larger impacts of adjustment, likely due to a longer maximum follow-up period (16 years from 1995) and higher probabilities of harm (maximum of 6.0% over 16 years in the adjusted sample in 1995) compared with Sweden.

Given declining survey response, use of auxiliary variables sourced from routine data to make corrections is likely to be of increasing value, particularly as the availability of linked health data increases in many countries (57).

A number of limitations are of note. First, not all survey participants consent to linkage, which may generate distortions depending on the nature of any differences between non-consenters and non-responders. Second, the comparison data are not free of bias themselves. Although high (96% in 2001), Census enumeration is incomplete, and under-enumeration in the 2001 Census was higher among deprived and transient groups (58). Third, the SHeSs’ sampling frame is confined to individuals living in private residences. This excludes a number of marginal population groups: those at high risk of alcohol-related harm but with low access to alcohol (e.g. those incarcerated or in treatment), those at high risk and with high access to alcohol (e.g. rough sleepers and the armed forces) and those at low risk and with low access to alcohol (e.g. residents of long-term care and nursing homes). Therefore we would expect to see differences in our comparisons even if the SHeSs do accurately represent their target populations. These are likely to be small because, although the excluded groups experience higher rates of harm (59), they are small in size (60, 61). Besides, our correction procedure may go some way to generalising beyond the private-residing sampling frame. Fourth, restricting the analyses to individuals aged 20 to 64 years makes comparison with sales data more challenging, since this age group consumes more per capita than older and younger groups. This was taken into account by increasing the proportional difference in mean consumption, as detailed in Supplementary Appendix S3. Fifth, more information was available to inform corrections in the earlier survey years due to longer follow-up. Therefore adjusted time trends should be interpreted with caution, as the magnitude of the adjustment in any year is a function both of this differential level of information available to inform corrections and of any real differences in the level and impact of non-response over time. There have also been refinements to the way data on alcohol consumption are collected, so pooling and direct comparison between the pre- and post-2003 data are not recommended (27). We are thus unable to determine the extent to which changes in consumption estimates are attributable to the changes in response levels over time. Sixth, the use of information on alcohol-related harms in our methodology was specifically motivated by the objective of refining alcohol consumption estimates, and the extent to which it is informative for other health behaviours is limited. Finally, unlike countries operating national registries, with unique individual identification and comprehensive linkage (62), attributes of individual non-respondents cannot be explicitly identified from our linked data and have to be inferred by reference to general population data. Validation of this approach is a potential future avenue of research.

The results yielded some initially unexpected findings. Firstly, adjustment had little impact on binge drinking estimates. This could be because binge drinking relates to questions concerning drinking in the last seven days, whereas the consumption estimates relate to usual drinking in the previous year and problem drinking relates to ever occurrence. As such, the binge drinking measure itself is likely to be less stable and, on investigation, there was a weaker association between harms and binge drinking than, for example, between harms and problem drinking. Another factor could be a weak association between missingness and binge drinking, but this was untestable with the available data. Secondly, in some cases (e.g. for women), the sales data-calibrated figures decreased within subgroups as the MNAR assumptions became more extreme. This was found to result from the decreasing scaling factors of the calibration process: for sub-groups for which the non-response estimates do not change greatly going from MAR to MNAR***, the decreasing scaling factors “outweigh” the small increases in non-response adjusted estimates.
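The second finding can be reproduced approximately from the published 2003 figures. In the sketch below the scaling factor is back-calculated as the ratio of the calibrated to the non-response adjusted all-quintile mean for males (Tables 2 to 4); treating it as a single common multiplier is our assumption, and because the inputs are rounded published values the results agree with Table 4 only to within rounding.

```python
def scaling_factor(calibrated_overall, adjusted_overall):
    """Back-calculated multiplier taking an adjusted mean to its calibrated value."""
    return calibrated_overall / adjusted_overall


# 2003, males, all quintiles: calibrated mean / non-response adjusted mean
s_mar = scaling_factor(33.2, 22.4)      # MAR scenario, roughly 1.5
s_extreme = scaling_factor(36.4, 33.3)  # MNAR*** scenario, roughly 1.1

# 2003, females, least deprived quintile (Table 3): adjusted means barely move
adjusted_mar, adjusted_extreme = 12.9, 13.4

calibrated_mar = adjusted_mar * s_mar              # close to 19.1 in Table 4
calibrated_extreme = adjusted_extreme * s_extreme  # close to 14.6 in Table 4
```

The adjusted mean for this subgroup rises slightly (12.9 to 13.4 units) while the multiplier falls from about 1.5 to about 1.1, so the calibrated figure drops from about 19 to under 15 units.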

This study exploited record-linked survey data and comparison population data to identify health-related deviations from representativeness among survey respondents aged 20 to 64 years in Scotland. Identified differences were then used to adjust key measures of alcohol consumption in an innovative way. The findings indicated alcohol-related non-representativeness may have little impact on estimates of women’s alcohol consumption, but for men, overall levels and socio-economic gradients in consumption were under-estimated. The study provides a guide to the magnitude of the effect of the universal limitation of non-response in alcohol studies and illustrates new methods for triangulation of alcohol consumption estimates. The methodology has utility for refinement of measures of other health-related behaviours such as smoking with corresponding informative outcomes (smoking-related deaths and hospitalisation for smoking-related causes) wherever survey data have been record linked to relevant data.

Supplementary Material

Supplementary Data

Acknowledgements

The authors wish to thank project steering group members Ian R. White from the MRC Biostatistics Unit, Julie Landsberg from the Scottish Government, Michaela Benzeval from Institute for Social and Economics Research, Jim Sherval from NHS Lothian and Clare Beeston from NHS Health Scotland.

Funding statement

This work was supported by the Medical Research Council Methodology Research Panel under the Population and Patient Data Sharing Initiative for Research into Mental Health (grant number MC_EX_MR/J013498/1). LG, AHL and SVK receive core funding from the Medical Research Council (MC_UU_12017/13 and MC_UU_12017/15) and the Scottish Government Chief Scientist Office (SPHSU13 and SPHSU15). SVK is also funded by an NRS Scottish Senior Clinical Fellowship (SCAF/15/02). GMcC, LGrah and MR were funded by the Scottish Government-funded MESAS (Monitoring and Evaluating Scotland’s Alcohol Strategy) evaluation.

Footnotes

Declaration of interests statement

GMcC, LGrah and MR were members of the Scottish Government-funded MESAS (Monitoring and Evaluating Scotland’s Alcohol Strategy) evaluation. The remaining authors declare that they have no competing interests.

References

  • 1. Babor T. Alcohol: no ordinary commodity: research and public policy. Oxford University Press; 2010.
  • 2. Katikireddi SV, Bond L, Hilton S. Changing Policy Framing as a Deliberate Strategy for Public Health Advocacy: A Qualitative Policy Case Study of Minimum Unit Pricing of Alcohol. Milbank Quarterly. 2014;92:250–283. doi: 10.1111/1468-0009.12057.
  • 3. World Health Organization. International guide for monitoring alcohol consumption and related harm. 2000.
  • 4. Bromley C, Given L. The Scottish Health Survey 2010. Vol. 2. Edinburgh: The Scottish Government Health Directorate; 2011.
  • 5. Robinson M, Thorpe R, Beeston C, McCartney G. A Review of the Validity and Reliability of Alcohol Retail Sales Data for Monitoring Population Levels of Alcohol Consumption: A Scottish Perspective. Alcohol Alcohol. 2012. doi: 10.1093/alcalc/ags098.
  • 6. Stockwell T, Donath S, Cooper-Stanbury M, Chikritzhs T, Catalano P, Mateo C. Under-reporting of alcohol consumption in household surveys: a comparison of quantity–frequency, graduated–frequency and recent recall. Addiction. 2004;99:1024–1033. doi: 10.1111/j.1360-0443.2004.00815.x.
  • 7. Christensen AI, Ekholm O, Gray L, Glümer C, Juel K. What is wrong with non-respondents? Alcohol-, drug- and smoking-related mortality and morbidity in a 12-year follow-up study of respondents and non-respondents in the Danish Health and Morbidity Survey. Addiction. 2015;110:1505–1512. doi: 10.1111/add.12939.
  • 8. Jousilahti P, Salomaa V, Kuulasmaa K, Niemela M, Vartiainen E. Total and cause specific mortality among participants and non-participants of population based health surveys: a comprehensive follow up of 54 372 Finnish men and women. J Epidemiol Community Health. 2005;59:310–315. doi: 10.1136/jech.2004.024349.
  • 9. Tolonen H, Helakorpi S, Talala K, Helasoja V, Martelin T, Prattala R. 25-year trends and socio-demographic differences in response rates: Finnish adult health behaviour survey. Eur J Epidemiol. 2006;21:409–415. doi: 10.1007/s10654-006-9019-8.
  • 10. Gorman E, Leyland AH, McCartney G, White IR, Katikireddi SV, Rutherford L, et al. Assessing the representativeness of population-sampled health surveys through linkage to administrative data on alcohol-related outcomes. American Journal of Epidemiology. 2014;180:941–948. doi: 10.1093/aje/kwu207.
  • 11. Maclennan B, Kypri K, Langley J, Room R. Non-response bias in a community survey of drinking, alcohol-related experiences and public opinion on alcohol policy. Drug Alcohol Depend. 2012;126:189–194. doi: 10.1016/j.drugalcdep.2012.05.014.
  • 12. Ahacic K, Kareholt I, Helgason AR, Allebeck P. Non-response bias and hazardous alcohol use in relation to previous alcohol-related hospitalization: comparing survey responses with population data. Subst Abuse Treat Prev Policy. 2013;8:10. doi: 10.1186/1747-597X-8-10.
  • 13. Dawson DA, Goldstein RB, Pickering RP, Grant BF. Nonresponse bias in survey estimates of alcohol consumption and its association with harm. J Stud Alcohol Drugs. 2014;75:695–703. doi: 10.15288/jsad.2014.75.695.
  • 14. Boniface S, Scholes S, Shelton N, Connor J. Assessment of Non-Response Bias in Estimates of Alcohol Consumption: Applying the Continuum of Resistance Model in a General Population Survey in England. PLOS ONE. 2017;12:e0170892. doi: 10.1371/journal.pone.0170892.
  • 15. Beeston C, Reid G, Robinson M, Craig N, McCartney G, Graham L, et al. Monitoring and Evaluating Scotland’s Alcohol Strategy. Third Annual Report. Edinburgh: NHS Health Scotland; 2013.
  • 16. Catto S. How much are people in Scotland really drinking? A review of data from Scotland’s routine national surveys. Glasgow: Public Health Observatory Division, NHS Health Scotland; 2008.
  • 17. Boniface S, Kneale J, Shelton N. Drinking pattern is more strongly associated with under-reporting of alcohol consumption than socio-demographic factors: evidence from a mixed-methods study. BMC Public Health. 2014;14:1297. doi: 10.1186/1471-2458-14-1297.
  • 18. Aromaa A, Koponen P, Tafforeau J, Vermeire C, The HIS/HES Core Group. Evaluation of Health Interview Surveys and Health Examination Surveys in the European Union. Eur J Public Health. 2003;13:67–72. doi: 10.1093/eurpub/13.suppl_1.67.
  • 19. Brick JM, Williams D. Explaining rising nonresponse rates in cross-sectional surveys. The ANNALS of the American Academy of Political and Social Science. 2013;645:36–59.
  • 20. Goldberg M, Chastang JF, Leclerc A, Zins M, Bonenfant S, Bugel I, et al. Socioeconomic, demographic, occupational, and health factors associated with participation in a long-term epidemiologic survey: a prospective study of the French GAZEL cohort and its target population. Am J Epidemiol. 2001;154:373–384. doi: 10.1093/aje/154.4.373.
  • 21. Zhao J, Stockwell T, MacDonald S. Non-response bias in alcohol and drug population surveys. Drug Alcohol Rev. 2009;28:648–657. doi: 10.1111/j.1465-3362.2009.00077.x.
  • 22. Tolonen H, Laatikainen T, Helakorpi S, Talala K, Martelin T, Prattala R. Marital status, educational level and household income explain part of the excess mortality of survey non-respondents. Eur J Epidemiol. 2010;25:69–76. doi: 10.1007/s10654-009-9389-9.
  • 23. Santin G, Geoffroy B, Benezet L, Delezire P, Chatelot J, Sitta R, et al. In an occupational health surveillance study, auxiliary data from administrative health and occupational databases effectively corrected for nonresponse. J Clin Epidemiol. 2014;67:722–730. doi: 10.1016/j.jclinepi.2013.10.017.
  • 24. Gray L, Batty GD, Craig P, Stewart C, Whyte B, Finlayson A, et al. Cohort profile: the Scottish Health Surveys cohort: linkage of study participants to routinely collected records for mortality, hospital discharge, cancer and offspring birth characteristics in three nationwide studies. Int J Epidemiol. 2010;39:345–350. doi: 10.1093/ije/dyp155.
  • 25. Dong W, Erens B, editors. Scotland's Health: Scottish Health Survey 1995. Vol. 2. Edinburgh: The Stationery Office; 1997.
  • 26. Shaw A, McMunn A, Field J, editors. The Scottish Health Survey 1998. Vol. 2. Edinburgh: The Stationery Office; 2000.
  • 27. Bromley C, Sproston K, Shelton N, editors. The Scottish Health Survey 2003. Vol. 4. Edinburgh: The Stationery Office; 2005.
  • 28. Bromley C, Bradshaw P, Given L. The Scottish Health Survey 2008. Vol. 2. Edinburgh: The Scottish Government Health Directorate; 2009.
  • 29. Bromley C, Given L, Ormston R. The Scottish Health Survey 2009. Vol. 2. Edinburgh: The Scottish Government Health Directorate; 2010.
  • 30. Seaman SR, White IR, Copas AJ, Li L. Combining multiple imputation and inverse-probability weighting. Biometrics. 2012;68:129–137. doi: 10.1111/j.1541-0420.2011.01666.x.
  • 31. Straus R, Bacon SD. Drinking in College. New Haven, Conn: Yale Univ Press; 1953.
  • 32. Mayfield D, McLeod G, Hall P. The CAGE Questionnaire: Validation of a New Alcoholism Screening Instrument. American Journal of Psychiatry. 1974;131:1121–1123. doi: 10.1176/ajp.131.10.1121.
  • 33. Ewing JA. Detecting alcoholism: The cage questionnaire. JAMA. 1984;252:1905–1907. doi: 10.1001/jama.252.14.1905.
  • 34. Flowerdew R, Feng Z, Manley D. Constructing data zones for Scottish Neighbourhood Statistics. Computers, Environment and Urban Systems. 2007;31:76–90.
  • 35. Scottish Government. Scottish Neighbourhood Statistics Data Zones Background Information. Edinburgh: 2004 Feb 18.
  • 36. Scottish Government. Scottish Index of Multiple Deprivation. Edinburgh, Scotland.
  • 37. McLoone P. Carstairs scores for Scottish postcode sectors from the 2001 Census. Glasgow: MRC Social and Public Health Sciences Unit; 2004.
  • 38. Information Services Division NHS National Services Scotland. SMR Data Manual. Edinburgh, Scotland.
  • 39. Harley K, Jones C. Quality of Scottish Morbidity Record (SMR) data. Health Bulletin (Edinb) 1996;54:410–417.
  • 40. Information Services Division NHS National Services Scotland. Hospital Records Data Monitoring: SMR Completeness Tables. Edinburgh, Scotland.
  • 41. Grant I, Springbett A, Graham L. Alcohol attributable mortality and morbidity: alcohol population attributable fractions for Scotland. Edinburgh: Information Services Division National Services Scotland; 2009 Jun.
  • 42. Information Services Division. Alcohol-related Hospital Statistics 2010. Edinburgh: 2010.
  • 43. Thorpe R, Robinson M, McCartney G, Beeston C. Monitoring and Evaluating Scotland’s Alcohol Strategy: A review of the validity and reliability of alcohol retail sales data for the purpose of Monitoring and Evaluating Scotland’s Alcohol Strategy. Edinburgh: NHS Health Scotland; 2012.
  • 44. Gray L, McCartney G, White IR, Katikireddi SV, Rutherford L, Gorman E, et al. Use of record-linkage to handle non-response and improve alcohol consumption estimates in health survey data: a study protocol. BMJ Open. 2013;3. doi: 10.1136/bmjopen-2013-002647.
  • 45. Rubin DB. Multiple Imputation for Nonresponse in Surveys. New York: J Wiley & Sons; 1987.
  • 46. Little RJA. Pattern-Mixture Models for Multivariate Incomplete Data. Journal of the American Statistical Association. 1993;88:125–134.
  • 47. Rubin DB. Inference and Missing Data. Biometrika. 1976;63:581–592.
  • 48. Carpenter J, Kenward M. Multiple imputation and its application. John Wiley & Sons; 2012.
  • 49. Lin I-F, Schaeffer NC. Using survey participants to estimate the impact of nonparticipation. Public Opinion Quarterly. 1995;59:236–258.
  • 50. Black H, Gill J, Chick J. The price of a drink: levels of consumption and price paid per unit of alcohol by Edinburgh's ill drinkers with a comparison to wider alcohol sales in Scotland. Addiction. 2011;106:729–736. doi: 10.1111/j.1360-0443.2010.03225.x.
  • 51. Meier PS, Meng Y, Holmes J, Baumberg B, Purshouse R, Hill-McManus D, et al. Adjusting for unrecorded consumption in survey and per capita sales data: quantification of impact on gender- and age-specific alcohol-attributable fractions for oral and pharyngeal cancers in Great Britain. Alcohol Alcohol. 2013;48:241–249. doi: 10.1093/alcalc/agt001.
  • 52. Rehm J, Kehoe T, Gmel G, Stinson F, Grant B. Statistical modeling of volume of alcohol exposure for epidemiological studies of population health: the US example. Popul Health Metr. 2010;8:3. doi: 10.1186/1478-7954-8-3.
  • 53. Boniface S, Shelton N. How is alcohol consumption affected if we account for under-reporting? A hypothetical scenario. Eur J Public Health. 2013;23:1076–1081. doi: 10.1093/eurpub/ckt016.
  • 54. Knudsen AK, Hotopf M, Skogen JC, Overland S, Mykletun A. The health status of nonparticipants in a population-based health study: the Hordaland Health Study. Am J Epidemiol. 2010;172:1306–1314. doi: 10.1093/aje/kwq257.
  • 55. Katikireddi SV, Bond L, Hilton S. Perspectives on econometric modelling to inform policy: a UK qualitative case study of minimum unit pricing of alcohol. The European Journal of Public Health. 2014;24:490–495. doi: 10.1093/eurpub/ckt206.
  • 56. Meier PS, Purshouse R, Brennan A. Policy options for alcohol price regulation: the importance of modelling population heterogeneity. Addiction. 2010;105:383–393. doi: 10.1111/j.1360-0443.2009.02721.x.
  • 57. Administrative Data Taskforce. The UK Administrative Data Research Network: Improving Access for Research and Policy. Economic and Social Research Council, Medical Research Council and Wellcome Trust; 2012 Dec.
  • 58. Baffour B. Modelling Census Under-Enumeration: A Logistic Regression Perspective. General Register Office for Scotland; 2006.
  • 59. Gilchrist G, Morrison DS. Prevalence of alcohol related brain damage among homeless hostel dwellers in Glasgow. Eur J Public Health. 2005;15:587–588. doi: 10.1093/eurpub/cki036.
  • 60. National Records of Scotland. Scotland's Census Results Online. Edinburgh, Scotland: National Records of Scotland.
  • 61. Makela P, Huhtanen P. The effect of survey sampling frame on coverage: the level of and changes in alcohol-related mortality in Finland as a test case. Addiction. 2010;105:1935–1941. doi: 10.1111/j.1360-0443.2010.03069.x.
  • 62. Mattila VM, Parkkari J, Rimpela A. Adolescent survey non-response and later risk of death. A prospective cohort study of 78,609 persons with 11-year follow-up. BMC Public Health. 2007;7:87. doi: 10.1186/1471-2458-7-87.
