Abstract
Gambling harm is a global public health challenge. Gambling is often recorded in settings using routinely collected data (RCD). Linking of existing RCD affords numerous opportunities for policy-led research on gambling harm and early intervention. To date, no review has examined research describing data linkage of RCD and gambling. Here, we searched for peer-reviewed articles using data linkage methodology with RCD and measures of gambling, gambling harm, or health-related outcomes. After screening 2373 articles, we conducted a narrative synthesis of the 17 included articles. Studies described data from 2,136,966 individuals. Most originated from Nordic countries, adopted a range of study designs, tended to link individual-level data with risk factors for physical and mental health harms, and defined gambling in diverse ways. Study quality was mixed. There exist numerous opportunities for further data linkage studies with RCD to both inform public policy and understand population-wide changes in gambling.
Subject terms: Human behaviour, Psychology, Diseases, Health care, Medical research, Risk factors, Signs and symptoms
Introduction
Routinely collected data (RCD) are data originally gathered from administrative and clinical records, and which may subsequently form the basis of further research1–3. In healthcare and research settings, analysis of RCD may provide increased statistical power due to the large sample sizes often involved, which helps improve the external validity and generalizability of findings. This affords time- and cost-effective opportunities, across a range of populations, for descriptive or analytical epidemiology of health-related problems, identification of potential risk or protective factors, and evaluation of treatment effects beyond what more restrictive sample designs, such as randomized controlled trials, allow4,5. There are, however, inherent limitations with forms of RCD obtained in real-world settings that rely on observational recording and where the resulting data quality may be poor (due to coding errors, such as incorrect recording in notes, or an inaccurate assumption on the ordering of diagnoses). Yet, the increased availability of electronic health records, registries, and clinical databases permits a nuanced, evidence-led understanding of individual and population-wide healthcare journeys and opens exciting vistas for research that warrant further attention6.
Data linkage involves combining datasets of RCD to create a new data source7,8. Datasets that may be linked in studies with RCD include general patient population records, hospitalizations, national social insurance/welfare registries, employment records, prevalence surveys, occupational context, and death registrations. For instance, the secure anonymised information linkage (SAIL) databank holds anonymised data of the whole population of Wales, United Kingdom, from demographic, physical and mental health, mortality, and primary and secondary healthcare databases9. Three broad categories of methods are used to link data sources: deterministic (rule-based), probabilistic (score-based), and machine learning-based approaches7,8,10 (see Fig. 1). Deterministic methods use pre-existing specifically defined rules to classify data sources, such as an individual’s hospital identity number, date of birth, or postcode, while probabilistic methods assign weights to record pairs conditional on a range of identifiers to represent the likelihood that they are drawn from the same individual. It may not be possible to apply deterministic methods if an individual identifier is unavailable. In such instances, probabilistic methods provide a suitable, although less precise and more labor-intensive alternative, particularly for incomplete or error-prone data10. Machine-learning methods may be either supervised or unsupervised (based on training with prior datasets or not) and produce clusters of individuals at a higher degree of accuracy than other methods11. Regardless of method, in recent years, the use of data linkage in healthcare research has continued to increase as the availability and quality of data sources expand, supplemented by wider adoption of reporting guidelines like the reporting of studies conducted using observational routinely collected health data (RECORD) statement1,12.
Fig. 1. Graphical illustration of deterministic and probabilistic data linkage methods.
The former methods use specifically defined rules to classify data sources, such as an individual’s identity number or date of birth, while the latter methods assign probabilistic weights to records conditional on a range of identifiers to represent the likelihood that they are drawn from the same individual.
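To make the distinction concrete, the following minimal Python sketch contrasts the two approaches on toy records. It is illustrative only: the field names (`national_id`, `surname`, `dob`, `postcode`), the m- and u-probabilities, the threshold, and the helper functions are assumptions introduced here rather than details taken from any reviewed study. The probabilistic part follows the general Fellegi-Sunter idea of summing log-likelihood-ratio agreement weights across identifiers.

```python
import math

# Hypothetical toy records; all field names and values are illustrative.
registry_a = [
    {"national_id": "A1", "surname": "berg", "dob": "1980-01-02", "postcode": "0150"},
    {"national_id": None, "surname": "lind", "dob": "1975-06-30", "postcode": "0454"},
]
registry_b = [
    {"national_id": "A1", "surname": "berg", "dob": "1980-01-02", "postcode": "0151"},
    {"national_id": None, "surname": "lind", "dob": "1975-06-30", "postcode": "0454"},
    {"national_id": None, "surname": "dahl", "dob": "1990-11-11", "postcode": "0454"},
]

def deterministic_link(left, right, key="national_id"):
    """Rule-based linkage: pair records whose unique identifier matches exactly."""
    index = {r[key]: j for j, r in enumerate(right) if r[key] is not None}
    return [(i, index[rec[key]]) for i, rec in enumerate(left)
            if rec[key] is not None and rec[key] in index]

# Assumed m-probabilities (agreement given a true match) and
# u-probabilities (agreement given a non-match) for each identifier.
M_U = {"surname": (0.95, 0.01), "dob": (0.98, 0.001), "postcode": (0.90, 0.05)}

def match_weight(rec_a, rec_b):
    """Sum of log2 likelihood ratios across identifiers (Fellegi-Sunter style)."""
    total = 0.0
    for field, (m, u) in M_U.items():
        if rec_a[field] == rec_b[field]:
            total += math.log2(m / u)              # agreement weight
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement weight
    return total

def probabilistic_link(left, right, threshold=10.0):
    """Score every record pair and keep those at or above an assumed threshold."""
    return [(i, j) for i, a in enumerate(left) for j, b in enumerate(right)
            if match_weight(a, b) >= threshold]

deterministic_pairs = deterministic_link(registry_a, registry_b)
probabilistic_pairs = probabilistic_link(registry_a, registry_b)
print(deterministic_pairs, probabilistic_pairs)
```

In this toy example, the exact-identifier rule can only pair records that carry an identifier, whereas the weighted comparison also links the record without one, at the cost of having to choose the probabilities and the threshold.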
Data linkage studies often use long-term data from multiple clinical encounters throughout the life course to provide valuable insight into health conditions and behaviors for research purposes13. Research using linked RCD not only illustrates the pragmatic nature of healthcare provision but enables researchers to efficiently access data recorded in real-time5. RCD linkage also facilitates the investigation of rarer conditions, such as suicidality14 and the use of a control group of people from the same source population who do not have the condition. Linkage methods have informed public health research, particularly regarding addictive behaviors and mental health conditions, such as evaluations of smoking cessation treatment15, opiate substitution treatment16,17, alcohol screening programs18, and the risk factors and pathways leading to suicidal behavior19. Although a growing body of evidence attests to the clinical and research utility of data linkage studies using RCD, there has been relatively minimal attention paid to their use in addiction research, particularly with behavioral addictions like gambling disorder (GD)5,20. A systematic review of this literature as it pertains to gambling harm and GD would therefore be timely.
Harm caused by gambling is recognized globally as a public health issue21,22. Gambling harms form part of the diagnostic criteria for GD within DSM-5 and ICD-11, both of which emphasize a persistent or recurrent pattern of gambling leading to adverse effects on health and wellbeing to individuals or to others, such as families, communities, and wider society. The prevalence of individuals who engage in problematic gambling or “who gamble in a manner that creates multiple problems that disrupt personal, family, financial, and employment circumstances”21 is estimated globally at 1.4% (95% CI: 1.06–1.84). A further 8.7% (95% CI: 6.6–11.3) of adults are estimated to be engaging in ‘any risk’ gambling, which includes individuals “who meet the thresholds for problematic gambling or GD but also includes individuals who, at a minimum, report sometimes or occasionally experiencing at least one behavioral symptom or adverse personal, social, or health-related consequence from gambling”23.
Globally, the greatest risk profiles are evident among those engaged in online gambling21, and problematic patterns of gambling are associated with comorbidities, such as anxiety and depression, and increased risk of suicide21–26. For instance, estimates of suicidal ideation among people accessing gambling treatment range from 22 to 81%26, and 7 to 30% of individuals in clinical populations experiencing gambling harm report previous suicide attempts27. By way of comparison, among the general population, one study conducted in the UK found that up to 5% of people with experience of gambling harm report previous suicide attempts, compared to less than 1% of those without22. Many who die by suicide have had contact with primary and secondary healthcare services in the year before death28–30. Past-year contact with primary care settings can be as high as seven interactions30, while between two and five people are seen in secondary care settings like emergency departments, some as many as three times29,31. Contacts with primary and secondary healthcare form part of RCD and, as a result, there exist opportunities to exploit data linkage to better understand the benefits for research on gambling harm, comorbid conditions, and prevention/early-intervention programs.
There is increasing interest in the use of naturalistic, large datasets in research on gambling harm, such as operator data32–34, banking transaction data35–37, help-line data38, and geospatial data39,40. The development of new technology, such as new forms of online gambling and sports betting, and the widespread use of social media have all introduced novel sources of data for the analysis of gambling behavior35,41–43. This has afforded opportunities to investigate, for example, gambling operators’ use of social media44–46, population-wide trends in online searching for gambling47, natural language processing of online gambling treatment forums48,49, and fusing bank account transaction data via open banking with self-report gambling severity scores to identify risk profiles of those who did and did not experience gambling harm50. Clearly, the analysis of existing large datasets combined with advances in digital and financial technology is an innovative approach for policy-led gambling research and is capable of even wider dissemination with the inclusion of linked RCD.
To our knowledge, no prior work has sought to systematically synthesize the literature on the use of data linkage methods involving RCD in gambling research. Doing so confers considerable promise20 and insights into the status of the evidence base for policy-making research, as well as highlighting research gaps. Here, we sought to undertake the first scoping review of the use of routinely collected linked data in research on gambling harm. Our review included the following research questions:
What is the nature and extent of the literature investigating gambling harm using RCD?
How are datasets linked?
How are gambling harms defined?
What is the quality of the existing evidence?
Results
Study characteristics
The scoping review identified a total of 17 articles that met the inclusion criteria and were included in the final analysis. Figure 2 shows the PRISMA flow diagram and highlights that 14 articles were identified from literature searches and three further articles were included following expert consultation. The characteristics of the included studies are presented in Table 1.
Fig. 2. PRISMA flow chart of literature search and study selection process.
Following identification of 2370 records, studies were screened against the inclusion criteria and resulted in 17 studies included in the present review.
Table 1.
Characteristics of studies included in the scoping review
| First author, year, (country) | Study design | Timeframe | Sample characteristics | Gambling screen | Sample size, age range, mean/median age, gender % | Data sources linked |
|---|---|---|---|---|---|---|
| Aarestad et al.56 (Norway) | Cross sectional | 2008–2018 | GD diagnosis = 5131, no GD diagnosis = 60,640 | ICD-10 F63.0 | N = 65,771, 18–88 years, mean age = 40.9, 18.2% female (n = 936). | Norwegian Patient Registry (NPR) and Social & Welfare Registry (FD-Trygd database) |
| Bhatti et al.65 (Canada) | Cohort | 2007–2014 | Self-identified gambling = 16,002, no GD diagnosis = 14,650 | PGSI ≥ 3 | N = 30,652, 18+ years, 51.2% aged 35–65 (n = 15,872), 55% female (n = 16,897) | Canadian Community Health Survey (CCHS), Ontario Health Insurance Plan (OHIP), and the Canadian Institute for Health Information (CIHI) databases. |
| Binde et al.53 (Sweden) | Cohort | 2015 | Identified gambling = 2112, no GD diagnosis = 5127 | PGSI ≥ 1 | N = 7284, 18–67 years, mean age = 43, 54% female (n = 3933) | Swedish Longitudinal Gambling Study (Swelogs) and Statistics Sweden registry data (SSYK) |
| Fröberg et al.54 (Sweden) | Cohort | 2008–2010 | 2241 participants (3816 person-years)* | PGSI ≥ 3 | N = 2241, 16–24 years, 43% of person-years* female (n = 1642) | Swedish Longitudinal Gambling Study (Swelogs) and Swedish National Agency for Education registry data |
| Girard et al.55 (Norway) | Case control | 2008–2018 | GD diagnosis = 5131, Diagnosis of psychiatric conditions = 30,476, Healthy control group = 30,164 | ICD-10 F63.0 | N = 65,771, 18–88 years, mean age = 40.9, GD group comprised 18.2% women (n = 933). | NPR and the Statistics of Norway (SSB), Division of Welfare Statistics. |
| Karlsson and Håkansson51 (Sweden) | Cohort | 2005–2016 | GD diagnosis = 2099 | ICD-10 F63.0 | N = 2099, 18–83 years, mean age 36.5 years, 23% female (n = 474) | Swedish National Patient Register (SNPR), Swedish Cause of Death Register (CDR). |
| Karlsson et al.52 (Sweden) | Cohort | 2011–2014 | GD diagnosis = 848 | ICD-10 F63.0 | N = 848, 18–84 years, median age 38.23, 19.9% female (n = 169) | Swedish National Patient Register (NPR), Hospital Discharge Register, Swedish National Council for Crime Prevention, Register for SWPs, and Swedish CDR. |
| Kaur et al.60 (Norway) | Case control | 2008–2018 | GD diagnosis = 5131, no GD diagnosis = 22,289 | ICD-10 F63.0 | N = 27,420, mean age 41.2 years, 20.9% female (n = 4648) | NPR, Norwegian Prescription Registry (NorPD) |
| Kristensen et al.25 (Norway) | Cohort | 2008–2021 | GD diagnosis = 6899, no GD diagnosis = 391,897 | ICD-10 F63.0 | N = 398,796, mean age 36.8, 18.1% female (n = 1248) | Norwegian National Patient Registry (NPR), Norwegian CDR. |
| Latvala et al.62 (Finland) | Cross sectional | 2015 | At risk and problem gambling = 228** ;No GD: 603 | PGSI ≥ 1 | N = 831, 18–29 years, mean age 23.3, 48.5% women (n = 403) | Finnish Gambling 2015 Survey, Statistics Finland, Population Information Registry |
| Latvala et al.61 (Finland) | Cross sectional | 2015 | Weekly gambling = 137 | PGSI ≥ 1 | N = 676, 18–29 years, mean age 23.3, 43.4% female (n = 294) | Finnish Gambling 2015 Survey and Statistics Finland register data |
| Latvala et al.63 (Finland) | Cross sectional | 2016–2017 | Problem gambling = 139, at-risk gambling = 626, recreational gambling = 5003**, no gambling = 1310 | PPGM ≥ 1 | N = 7186, 18–75+ years, 33.1% 35–54 years (n = 2075), 52.3% female (n = 3910). | Finnish Gambling 2015 Survey, Statistics Finland social security register |
| Laursen et al.67 (Denmark) | Cohort | 2000–2010 | Problem gambling = 384, non-gamblers (non-problem and never gambled) = 18,241 | Two positive answers on Lie/Bet questionnaire | N = 18,625, aged 20 to 50+ years, 57.7% aged 50+ (n = 10,196), 53.9% women (n = 10,037) | Danish Health and Morbidity Surveys, Danish National Criminal Register |
| Reccord et al.64 (Canada) | Cross sectional | 1997–2016 | History of gambling = 20, no history of gambling = 952 | “History of gambling” | N = 972, aged 10–70+ years, mean age 41–44 years, 18.8% female (n = 223) | Newfoundland and Labrador Center for Health Information (NLCHI) suicide database, Vital Statistics annual mortality dataset and BLCHI client registry. |
| Syvertsen et al.57 (Norway) | Case control | 2008–2018 | GD = 5121, Illness control = 27,826, control = 26,695 | ICD-10 F63.0 | N = 59,642, 18+ years, mean age 31, 18.7% female (n = 11,166) | NPR and Social & Welfare Registry (FD-Trygd database) |
| Syvertsen et al.58 (Norway) | Case control | 2008–2018 | GD = 5131, illness control = 30,476, control = 30,164 | ICD-10 F63.0 | N = 65,771, 18+ years, mean age 41, 18.5% female (n = 12,165) | NPR and Social & Welfare Registry (FD-Trygd database) |
| Vestergaard et al.66 (Denmark) | Case control | 2013–2017 | Diagnosed with GD = 1381, no GD = 1,381,000 | ICD-10 F63.0 | N = 1,382,381, 18+ median age 34, 12.9% female (n = 196,736) | Danish National Patient Registry, Danish Civil Registration System, Danish National Prescription Registry, Statistics Denmark, and Danish Health Service Registry |
Note: GD, gambling disorder; DSM-5, Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition; ICD-10, International Classification of Diseases, 10th revision; PGSI, Problem Gambling Severity Index; PPGM, Problem and Pathological Gambling Measure. *Variables in this study were measured for the duration of time each individual contributed to the study, described as person-years. **At risk (PGSI score 3–7) and problem gambling (PGSI score 8+).
Study design and settings
Of the included studies, eight were based on cohort designs, five used cross-sectional designs, and four employed case-control designs. Most studies were conducted in Sweden (n = 4)51–54, Norway (n = 6)55–60, and Finland (n = 3)61–63, with two studies each from Canada64,65 and Denmark66,67. All were published between 2016 and 2025. Twelve studies focused on adverse outcomes associated with gambling, while five focused on risk factors associated with gambling behaviors.
Sample characteristics
One study included participants aged 16 and above54, while most included those aged 18 and older. In ten studies, participants were mostly male, with the proportion of females ranging between 8.7% and 23%25,51,52,55–58,64,66. The remaining studies had an even distribution of gender53,55,61–63,65,66. Gambling harm was measured based on clinical diagnosis coded using ICD-10 in nine patient registry studies25,51,52,55–58,67 or the Problem Gambling Severity Index (PGSI) in five studies54,61,62,65,68,69. One study utilized self-reported gambling using the Problem and Pathological Gambling Measure (PPGM)63, and one used the Lie/Bet questionnaire66. A further study described an unspecified self-reported history of gambling64. The threshold for inferring harm from gambling using the PGSI varied between studies, with the majority treating a score of one or higher as “at risk and problem gambling”53,61–63, and one study treating a score of three or higher as indicative of ‘problem gambling’54. Additional participant characteristics varied between studies and included employment status, marital status, education level, and ethnicity (Table 1).
Outcomes
A wide range of outcomes was included in this review. Most studies reported morbidity associated with gambling, including psychiatric diagnosis52, alcohol and smoking51,52, road traffic accidents (RTA)65, as well as physical comorbidity, such as chronic pulmonary disease67. Three studies reported on the association with suicide51,59,64. Further, eight studies described socio-cultural consequences of gambling, such as criminal activity66, changes in marital status57, unemployment and income52,55,63, and poor school achievement54,61,62. The remaining studies described risk factors for gambling harm, including ethnicity56, occupation53, unemployment58, and the role of gender62.
Data linkage approaches
Most studies linked two datasets, with a minority linking three or four and one linking five different datasets (Table 2). Deterministic linkage methodology was used in most studies, utilizing national identity numbers to link different records for one individual4,25,51,52,55–57,59,65–67,69,70. One study linked data using a probabilistic method, based on the statistical similarity of data records64. No linkage method was described in five studies54,58,61–63. Four studies linked registry information to national surveys54,65,66,69. The studies based in Finland all linked the Finnish Gambling Survey with Statistics Finland data61–63. Most studies (i.e., ten) linked to registers containing demographic information such as social insurance data53,55,56,61–65,67. Eight studies linked to patient registries25,51,52,55–58,65, four to mortality registries51,52,59,64, and two studies used crime registries52,66.
Table 2.
Summary of data linkage methods
| Study | Number of datasets | Linked to questionnaire/survey? | Linkage Method | Description of linkage methodology |
|---|---|---|---|---|
| Aarestad et al.56 | 2 | No | Deterministic | Data from the two registries were linked using unique 11-digit National identity numbers. |
| Bhatti et al.65 | 3 | Yes (CCHS) | Deterministic | Survey and administrative data were linked deterministically at the individual level using unique encoded identifiers and analysed at the Institute for Clinical Evaluative Sciences (ICES) in Toronto, Ontario. |
| Binde et al.53 | 2 | Yes (Swelogs) | Deterministic | Linked using the Swedish civic registration number, which is based on date of birth and four extra digits. |
| Fröberg et al.54 | 2 | Yes (Swelogs) | Unspecified | Register-based socio-demographic information was linked to the data (phone and postal interview/questionnaire). |
| Girard et al.55 | 2 | No | Deterministic | Linked using participants’ national identity numbers. |
| Karlsson & Håkansson51 | 2 | No | Deterministic | Linked using Swedish personal identification number. |
| Karlsson et al.52 | 4 | No | Deterministic | Linked using personal identification number. |
| Kaur et al.60 | 2 | No | Deterministic | Data from the two registries were linked using unique 11-digit National identity numbers. |
| Kristensen et al.25 | 2 | No | Deterministic | Data from the two registries were linked using unique 11-digit National identity numbers. |
| Latvala et al.62 | 3 | No | Unspecified | Registry data were linked with the Finnish Gambling 2015 data. |
| Latvala et al.61 | 2 | No | Unspecified | NA |
| Latvala et al.63 | 2 | No | Unspecified | The survey data were linked with the social security register data administered by Statistics Finland. |
| Laursen et al.67 | 2 | Yes (Danish Health and Morbidity Surveys) | Deterministic | Linked using personal identification numbers. |
| Reccord et al.64 | 3 | No | Probabilistic | Linkage between suicide data set and annual mortality dataset and client registry not described. Postal code to geographic area linkage is described for census data (demographics). |
| Syvertsen et al.57 | 2 | No | Deterministic | Data from the two registries were linked using unique 11-digit National identity numbers. |
| Syvertsen et al.58 | 2 | No | Unspecified | Data from the two registries were linked using unique 11-digit National identity numbers. |
| Vestergaard et al.66 | 5 | No | Deterministic | Data were linked on an individual level across databases using the unique personal identification number. |
Definitions of gambling harm
A range of methods were used to define gambling harm and related constructs. Eight studies used the ICD-10 code F63.0 for ‘pathological gambling’, six studies used PGSI scores obtained from survey data, and one study each used scores on the Lie/Bet questionnaire and the PPGM.
Quality assessment
The results of the quality assessment are summarized in Fig. 3 and Supplementary Table 3. Of the included studies, four were good quality, ten were medium quality, and three were poor quality. The quality of each linkage study was assessed across four domains: (1) description of the datasets linked in each study, (2) variables included in each study and sources of bias, (3) the linkage process, and (4) ethics approval. All studies fulfilled domain 4 by gaining prior ethics approval. No study achieved good quality in domain 3, owing to insufficient detail on linkage methods, changes to coding systems, and potential sources of bias.
Fig. 3. Graphical illustration of the quality assessment of included studies.
The four assessed domains included description of the datasets linked, variables included in each study and sources of bias, the linkage process, and ethics approval. Note: + (green) denotes good quality, - (yellow) average quality, and x (orange) poor quality.
Narrative details of the included studies
Aarestad et al.56 assessed the relationship between ethnicity and risk of gambling harm using NPR and the Norwegian social insurance database. Gambling harm was defined as a registered diagnosis of GD based on ICD-10. Second-generation individuals from minority ethnic groups, including Asian, African, and North American countries of birth, were at an increased risk of GD compared to the rest of the population of Norway56.
Bhatti et al.65 linked the CCHS, OHIP, and CIHI databases to determine the risk of RTAs among people who gamble. Those at the highest gambling risk (i.e., PGSI scores of three or more) were at increased risk of RTAs compared to those who did not gamble65.
Binde et al.53 assessed gambling risk among different occupational groups using the Swedish Longitudinal Gambling Study (Swelogs 2015) and Statistics Sweden registry data. Gambling risk was defined as a PGSI score of three or more. Males working manual jobs were found to be at increased risk53.
Two studies reported findings related to educational attainment. Fröberg et al.54 linked the Swedish Longitudinal Gambling Study survey (Swelogs) and Swedish National Agency for Education registry data and described an increased risk of gambling associated with poor school achievement among Swedish youth (16–24-year-olds)54. Latvala et al.61 linked results from the Finnish Gambling Survey with Statistics Finland registry data to assess the association between gambling and school attainment among Finnish adults. Gambling risk was defined as a PGSI score of one or more. Those with low grade point average (GPA) attainment scores were more likely to play daily lottery games and use online casinos compared to those with average and high GPA61.
Two studies reported findings related to employment and income. Girard et al.55 used linked registry data in Norway from the NPR and Statistics of Norway (SSB) to assess the relationship between income and gambling harm. Patients diagnosed with GD, as defined by ICD-10, were more likely to have a lower annual income than the general population55. Latvala et al.63 investigated the association between social disadvantage and gambling severity using linked data from the Finnish Gambling Harms Survey and Statistics Finland social security registry. This study employed the PPGM as a gambling harm measure and demonstrated that harm was more common among people who were unemployed or received social security benefits63.
Three studies reported findings related to suicidality and gambling. Karlsson and Håkansson51 demonstrated an association between GD and increased mortality, suicidality, and comorbidity. This study linked the Swedish NPR and the Swedish CDR. Gambling harm was defined as a GD diagnosis as coded by ICD-1051. Reccord et al.64 demonstrated a significant association between completed suicide and gambling history. This study linked the NLCHI suicide database, the Vital Statistics annual mortality dataset, and the BLCHI client registry. Gambling was defined as having “a history of gambling” (p.920)64. Kristensen et al.25 used the NPR and CDR to assess suicide risk associated with GD, as well as 12 other patient groups, compared to the general population59. Suicide was the leading cause of death for people diagnosed with GD.
Kaur et al.60 described the relationship between the use of antidepressant medications and the likelihood of developing GD. The study compared participants diagnosed with GD and age- and gender-matched non-gambling individuals using linked data from the NPR and the NorPD. The odds of being diagnosed with GD were almost three times greater among individuals prescribed antidepressant medication60.
Latvala et al.62 demonstrated that gambling, defined by a PGSI score of one or more, was associated with smoking and risky alcohol use among men and with smoking among women. This study linked results from Finnish Gambling 2015, Statistics Finland, and the Population Information Registry62.
Laursen et al.67 linked Danish Health and Morbidity Surveys with the Danish National Criminal Register. Gambling was defined as two positive answers on the Lie/Bet questionnaire, and the authors found a significant association between problem gambling and increased criminal activity. No increase was detected in economic crime compared to other crimes66.
Syvertsen et al.57 linked the NPR with the Social & Welfare Registry (FD-Trygd) to demonstrate a reduced incidence of marriage and an increased risk of divorce among people with a GD diagnosis, as defined by ICD-1057. A subsequent paper by the same authors used the NPR and the Social and Welfare Registry (FD-Trygd) to assess unemployment as a risk factor for harmful/disordered gambling58.
Some studies reported outcomes related to multiple domains of inquiry, including clinical diagnoses, medication receipt, criminal behavior, and receipt of benefits. Karlsson et al.52 linked the Swedish NPR, Hospital Discharge Register, the Swedish National Council for Crime Prevention, the Register for Social Welfare Payments, and Swedish CDR. Analysis showed that a diagnosis of GD was associated with an increased prevalence of social welfare payments, criminal conviction, and diagnosis of psychiatric conditions, including intentional self-harm disorders52. Vestergaard et al.66 linked five registries: the Danish National Patient Registry, the Danish Civil Registration System, the Danish National Prescription Registry, Statistics Denmark, and the Danish Health Service Registry. The authors demonstrated an increased burden of mental and physical comorbidity among individuals with GD and increased use of prescribed medications and likelihood of criminal sentencing67.
Discussion
The present scoping review identified 17 RCD linkage studies investigating gambling harm and a range of demographic and social-psychiatric factors. Gambling harm tended to be defined through either clinical diagnosis of GD or self-reported problem gambling severity, and a wide range of study designs were adopted. Most studies originated from Nordic countries and the overall quality was mixed. In total, we identified 27 linked datasets from primary and secondary healthcare settings and national/social insurance data, including social welfare information, population-wide prevalence surveys, and national mortality data. Deterministic linkage methods based on national identity numbers were most commonly used to link two datasets, while some studies linked between three and five datasets. Analyzed timeframes ranged between one and 19 years and captured data from a combined population of 2,136,966 individuals across five countries.
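The combined sample size can be reproduced from the per-study N values reported in Table 1. The short Python check below simply sums those values; the dictionary keys are informal labels chosen here for readability. Studies are counted separately, as in the table, so where several studies draw on the same registry the total may include overlapping individuals.

```python
# Sample sizes (N) as reported in Table 1; keys are informal labels.
sample_sizes = {
    "Aarestad et al. (56)": 65_771,
    "Bhatti et al. (65)": 30_652,
    "Binde et al. (53)": 7_284,
    "Froberg et al. (54)": 2_241,
    "Girard et al.": 65_771,
    "Karlsson and Hakansson (51)": 2_099,
    "Karlsson et al. (52)": 848,
    "Kaur et al. (60)": 27_420,
    "Kristensen et al. (25)": 398_796,
    "Latvala et al. (62)": 831,
    "Latvala et al. (61)": 676,
    "Latvala et al. (63)": 7_186,
    "Laursen et al.": 18_625,
    "Reccord et al. (64)": 972,
    "Syvertsen et al. (57)": 59_642,
    "Syvertsen et al. (58)": 65_771,
    "Vestergaard et al.": 1_382_381,
}

# Sum the per-study sample sizes to obtain the combined population.
total = sum(sample_sizes.values())
print(f"{len(sample_sizes)} studies, combined N = {total:,}")
```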
This review demonstrates knowledge gaps in the literature on gambling harm in relation to linked RCD. We found RCD linkage studies with individuals who were predominantly identified as male, ranging in age between 16 and 88 years old. Across the included studies, men were more likely to be diagnosed with GD53,60 and experience adverse outcomes including RTA65 and low educational attainment54, while women were more likely to experience financial instability55 and concomitant psychiatric disorders52. Young people who engaged in gambling were at increased risk of unemployment, financial instability54,55 and early mortality51. Population-wide studies of gambling harm like these are critical for identifying demographic risk factors that may make some individuals more likely to experience harm than others. They also offer prevention and early intervention opportunities in settings routinely recording these demographic factors, such as further education and employment providers, financial advice and debt management services, and mental health screening and assessment agencies. The predictive relationships are evident, and the well-powered analyses on which they are based increase the likelihood of generalization to other jurisdictions.
Of the studies included in this review, only three assessed clinical comorbidities. Two of these studies investigated the association between gambling and suicidality51,59,64, while Vestergaard et al.66 found higher incidence of psychiatric comorbidity among people diagnosed with GD compared to those without a diagnosis67. Karlsson et al.52 demonstrated an increased prevalence of psychiatric conditions, including intentional self-harm disorders, among people diagnosed with GD52. Kristensen et al.25 demonstrated that people with GD had an increased risk of suicide compared to the general population59. The relationship between harms experienced from gambling and psychiatric comorbidities is complex and potentially bidirectional, whereby the onset of specific disorders like depression or anxiety may either precede or follow problematic patterns of gambling71. It remains an important research challenge within the broader field of gambling studies to better elucidate the temporal relationships involved in problematic gambling and comorbidities72, and linked RCD studies may be uniquely placed to aid such investigations of assumed bidirectionality. For instance, longitudinal designs may permit an examination of the onset and time course of gambling harms and comorbid disorders73.
As outlined, data linkage methods are widely used in biomedical and public health research on a host of conditions relevant to gambling, such as suicide. For instance, Karlsson and Håkansson51 linked national registry data in Sweden with hospital admissions, medical appointments, and cause of death data to reveal a 15-fold increased risk of suicide in those with a diagnosis of GD. Of the 1024 patients admitted and the 5236 patient appointments reviewed, it was found that 55% of patients had a primary diagnosis of GD in either a primary or secondary care setting51. Unfortunately, the predictive role of these healthcare contacts was not further explored. No further studies were identified describing patient healthcare journeys, clinical trajectories, or analysis of contacts with healthcare settings. It is known that the link between past-year healthcare contact and suicide is both robust and a valuable means of informing suicide prevention74. Linkage of RCD may therefore provide a valuable data source for analysis of gambling-related suicidality, with large sample sizes, increased statistical power, and greater predictive utility when controlling for under-reporting and comorbidities.
Our review found a range of methods to define disordered or problematic gambling and gambling harm. Eight studies used the ICD-10 code, F63.0, for ‘pathological gambling’ recorded in patient registries and typically had fewer participants defined as “gambling” compared to studies using survey data. Six studies defined gambling severity or harm using PGSI scores obtained from survey data, and one study each used the Lie/Bet questionnaire and the PPGM. The relative ubiquity of the PGSI as a proxy measure of gambling-related harm is hardly surprising; it is seen by the research community as the gold standard measure of gambling severity and associated negative consequences or harm. The PGSI was not, however, intended to be a measure of gambling-related harm75, although seven of its nine items do refer to the consequences (i.e., harms) experienced from gambling76,77. While specific measures of gambling harms have been developed78, none were employed by the data linkage studies reviewed here. As a result, people experiencing lower levels of gambling harm may be under-represented in the included studies. Conversely, although the validity and accuracy of the PGSI are widely accepted79, variable inclusion thresholds observed between studies may limit the reliability or generalizability of results. Four studies defined problem gambling as a PGSI score of one or higher, and two defined this as having a score of three or higher.
This review included three studies describing socio-demographic characteristics associated with gambling, where gambling was defined using PGSI scores. The majority of PGSI questions relate to negative consequences of gambling and, operationally, we used it here to infer gambling harms as defined in the inclusion criteria78. We acknowledge that the use of the PGSI as a proxy for gambling harms is not widely accepted and that alternative scales exist that operationalize the health-harming impacts of gambling more explicitly80,81. While it is beyond the remit of the present scoping review to address the relative merits of the different gambling harm measures available, we support distinguishing between the language used to define gambling harm, hazardous gambling, GD, and problem (or problematic) gambling. The predominance of the PGSI in the studies included in this review demonstrates that much work remains to be done in challenging this orthodoxy. We encourage future data linkage research on gambling not only to adopt a person-centered approach aimed at reducing the stigma surrounding gambling23 but also to consider a range of diagnostic outcome measures that capture the continuum of gambling and related harms.
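To illustrate why variable inclusion thresholds matter, the following minimal sketch applies the two cut-offs reported above (a PGSI score of one or higher, and three or higher) to the same set of scores. The scores are hypothetical and are not drawn from any reviewed study; the function name `classify` is likewise an illustrative assumption.

```python
from typing import List


def classify(scores: List[int], threshold: int) -> List[bool]:
    """Flag each respondent whose PGSI total meets or exceeds the threshold."""
    return [score >= threshold for score in scores]


# Hypothetical PGSI totals (possible range 0-27) for ten survey respondents.
# These values are invented for illustration only.
scores = [0, 0, 1, 1, 2, 3, 4, 8, 0, 2]

flagged_at_1 = sum(classify(scores, threshold=1))  # cut-off of one or higher
flagged_at_3 = sum(classify(scores, threshold=3))  # cut-off of three or higher

print(f"Flagged at PGSI >= 1: {flagged_at_1} of {len(scores)}")
print(f"Flagged at PGSI >= 3: {flagged_at_3} of {len(scores)}")
```

In this invented sample, the lower cut-off flags more than twice as many respondents as the higher one, which is the kind of between-study difference that may limit comparability of results.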
We undertook a quality assessment of the current state of linked RCD research on gambling harm. The included studies were assessed as demonstrating a range of quality, with no studies obtaining maximum scores (Fig. 3). All described the datasets linked, including their purpose and type, as well as the original data collection method. However, few studies described the percentage of the population from which the data were derived or any quality assurance process to ensure high-quality, representative data. Although most studies described a deterministic linkage process using national identification numbers to link data sets, the full details of the linkage process, such as specific changes to coding systems and data quality assessment, were often inadequately described. Indeed, it is noteworthy that one of the included studies identified following expert consultation was previously excluded for not explicitly describing the linkage method. Methods exist to evaluate linkage quality and estimate the likely rates of missed links, false links, and any clustering or errors with relevant variables of interest82. Future data linkage research with RCD should consider describing the outcome of any linkage quality techniques applied to the data and account for potential variability in subsequent analysis. Moreover, since none of the present studies referred to RECORD guidelines1,12 for the reporting of data linkage, we encourage both data linkers and data analysts to consider the wider adoption of reporting standards and practices in their work. Doing so will not only instill confidence in data linkage methods as powerful research tools but should also foster wider dissemination and increased uptake.
Data linkage studies in gambling research should state explicitly that linkage was conducted and how it was done. Indeed, we have highlighted this as a research and knowledge gap and contend that there is an opportunity for the development of reporting guidelines specifically for studies conducted on gambling. We note that all studies received ethical approval and included formal declarations of interest, where relevant. Overall, to promote wider adoption of data linkage methods in gambling research and enhance reliability and generalizability of findings, future studies should describe more fully the linkage methods involved, incorporate machine learning-based analysis of large gambling datasets83,84, justify the assumed representativeness of the population(s) studied, and highlight all data management and quality assurance procedures followed. Our review noted an absence of machine learning methods in the linkage of datasets, which is perhaps unsurprising given concerns about patient privacy, the use of personal identifiers, and how they may be used85. Machine learning does present unique opportunities to reduce the risk of bias in the linkage of data parameters, but it may also lead to a high false positive rate. One solution may be to consider using machine learning as a verifier of linkages obtained via deterministic and probabilistic methods. Further research should evaluate this possibility.
Our findings indicate that the use of conflicts of interest statements, funding declarations, and adoption of open science practices in data linkage RCD gambling research was limited. Nine studies reported conflicts of interest (Table 3), all but one of the 17 studies described a funding source, and none were funded by the gambling industry. The role of industry funding in gambling harms research, and any subsequent impact on public health policy, necessitates, at a minimum, that any conflicts of interest are disclosed and that industry funding is avoided23.
Table 3.
Conflicts of interest and funding source reported in studies
| Study | Conflict of interest reported? | Nature of conflict of interest | Funding source |
|---|---|---|---|
| Aarestad et al.56 | Yes | Author received research funding from Norsk Tipping (a gambling operator owned by the Norwegian government) and GambleAware, a charitable body, which funds its research program based on donations from the gambling industry. The author also undertakes consultancy for various gambling companies in the area of player protection and social responsibility in gambling. | This study was funded by the Research Council of Norway (grant number 273718). Open access funding provided by University of Bergen. |
| Bhatti et al.65 | No | - | This work was supported by Sunnybrook Research Institute intramural funds. Investigators were supported by the Canadian Institutes of Health Research, Canada Research Chair in Medical Decision Sciences (Redelmeier). |
| Binde et al.53 | No | - | Project funded by the Public Health Agency of Sweden, financed by the Swedish Research Council for Health, Working Life and Welfare (Forte). |
| Fröberg et al.54 | Yes | The author has received personal fees from the Swedish organization of online gambling companies (BOS), personal fees from Svenska Spel (Swedish state-owned gambling company), and Play among friends, an NGO-owned Finnish gambling company, outside the submitted work. | The Public Health Agency of Sweden funded the Swedish Longitudinal Gambling Study. |
| Girard et al.55 | Yes | An author has received research funding from Norsk Tipping and GambleAware. The author undertakes consultancy for various gambling companies in player protection and social responsibility in gambling. | The present study was funded by the Research Council of Norway, grant no. 273718. The authors have no other financial relationships relevant to this article to disclose. |
| Karlsson & Håkansson51 | No | An author holds a position as professor at Lund University financed in collaboration between Lund University and the Swedish gambling operator monopoly, Svenska Spel AB. | No financial support was received specifically for this study. |
| Karlsson et al.52 | Yes | An author holds a position as professor at Lund University financed in collaboration between Lund University and the Swedish gambling operator monopoly, Svenska Spel AB. Another author has received a grant from the same gambling operator monopoly, Svenska Spel AB, as part of Svenska Spel AB’s responsibility for gambling research. | This research was funded by the Swedish Southern Health Care Region Research (grant 2020-0424, GD—associations with suicidality, economic vulnerability, and mortality) and from AB Svenska Spel, the Swedish state-owned gambling operator (grant FO 2019-0013 GD—associations with psychosocial problems, suicide, and crime). |
| Kaur et al.60 | Yes | An author has received research funding from Norsk Tipping and GambleAware. The author undertakes consultancy for various gambling companies in player protection and social responsibility in gambling. | Funded by the Research Council of Norway (No. 273718). |
| Kristensen et al.25 | Yes | An author has received research funding from Norsk Tipping and GambleAware. The author undertakes consultancy for various gambling companies in player protection and social responsibility in gambling. | Study was funded by the Norwegian Competence Center for Gambling and Gaming Research and the faculty of Psychology at the University of Bergen. |
| Latvala et al.62 | No | - | The Ministry of Social Affairs and Health, Finland, and the Finnish Foundation for Alcohol Studies funded the study (appropriation under section 52 of the Lotteries Act). |
| Latvala et al.61 | No | - | The Ministry of Social Affairs and Health, Finland, and the Finnish Foundation for Alcohol Studies funded the study (appropriation under section 52 of the Lotteries Act). |
| Latvala et al.63 | No | - | The Gambling Harms survey was funded by the Ministry of Social Affairs and Health, Finland, within the objectives of the 52 Appropriation of the Lotteries Act. |
| Laursen et al.67 | No | - | This work was funded by the Danish Agency for Science, Technology and Innovation. |
| Reccord et al.64 | No | - | This study was funded by a grant from the Newfoundland and Labrador Support Unit for People and Patient-Oriented Research and Trials (NL SUPPORT). |
| Syvertsen et al.57 | Yes | An author has received research funding from Norsk Tipping and GambleAware. The author undertakes consultancy for various gambling companies. | Open access funding provided by University of Bergen. The study was funded by the Research Council of Norway (no. 273718). |
| Syvertsen et al.58 | Yes | An author has received research funding from Norsk Tipping and GambleAware. The author undertakes consultancy for various gambling companies. | The study was funded by the Research Council of Norway (no. 273718). |
| Vestergaard et al.66 | Yes | In 2020–2022, the clinic provided several gambling operators with an expert appraisal in online gambling patterns. The clinic was paid for these advisory services by the standard tariff at Aarhus University Hospital. | Open access funding provided by Aarhus University Hospital. This work was supported by a grant from the Ministry of the Interior and Health of Denmark. |
We failed, however, to identify any open science practices in the included studies, which may reflect the heterogeneity of study designs, the relative novelty of the field of data-linkage gambling research, and potential data-access restrictions. While it was not the intent of the present review to gauge the adoption of open science practices here, clearly, the field of gambling research has much work to do. The implementation of open science methods aims to improve research quality and reduce publication bias86,87 and further linked RCD studies are encouraged to consider adoption of open science practices wherever feasible.
The review did not apply any geographical restrictions with its search criteria and found that the extant data linkage gambling research with RCD was overwhelmingly conducted in Nordic countries88,89. In contrast to the relative paucity of research from other countries, the burgeoning literature using linked RCD from Nordic countries like Sweden, Denmark, Norway, and Finland may reflect differences in clinical coding practices or the impact of different gambling landscapes, such as the public gambling monopolies or private licensing systems operating within these countries and their impact on research. In Europe, only Finland and Norway currently operate fully public monopoly models of gambling, and there is a relative paucity of research evaluating their effectiveness at reducing gambling harm. The available evidence suggests that monopolies may have lower estimated prevalence rates of problematic gambling and overall reduced levels of gambling participation (total consumption) compared to private licensed regimes90. Other countries identified in our review, like Denmark and Canada, operate state-owned or state-controlled monopolistic companies which tend to operate increasingly in a commercial or expansionist manner69,91. It is noteworthy, therefore, and perhaps unsurprising that most of the studies included in this review originated in Nordic countries like Sweden, Norway, and Finland with state-owned operators and access to large datasets of RCD for research and prevention purposes.
Linkage models vary by country and research environment, and this may affect the risk of linkage error, selection bias, and data completeness as well as the ability to conduct linkage studies4,10. Individuals in Nordic countries access tax-funded public health care systems similar to that of the UK, and these systems remain valuable sources of research data, as our findings confirm88. The personal identification numbers used in Nordic countries to access healthcare services enable the deterministic linkage of multiple RCD sources, reducing the risk of misclassification and incomplete linkage92. By contrast, countries such as the UK have seen repeated proposals to link healthcare records from primary and secondary care since 2012, yet progress has been hindered by concerns over data transparency, governance challenges, and public trust6,93,94. This notwithstanding, the reviewed studies primarily include male participants from Nordic countries, which poses limitations in terms of the wider relevance and generalizability of the findings, particularly among more ethnically diverse or economically varied populations. It is of paramount importance that future data linkage studies on gambling ensure as representative a sample as possible and that cross-cultural comparisons are undertaken where data recording systems allow.
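Deterministic linkage on a shared personal identifier, as described above, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure used by any reviewed study: the field name `person_id`, the record contents, and the function `deterministic_link` are all hypothetical, and the sketch assumes each identifier appears at most once in the registry.

```python
from typing import Dict, List, Tuple

Record = Dict[str, object]


def deterministic_link(
    registry: List[Record], survey: List[Record], key: str = "person_id"
) -> Tuple[List[Record], List[Record]]:
    """Link survey records to registry records by exact match on a shared key.

    Returns (linked, unlinked): merged records for exact matches, and the
    survey records for which no registry record was found (missed links).
    """
    # Index the registry by identifier; assumes identifiers are unique there.
    index = {rec[key]: rec for rec in registry}
    linked: List[Record] = []
    unlinked: List[Record] = []
    for rec in survey:
        match = index.get(rec[key])
        if match is None:
            unlinked.append(rec)
        else:
            linked.append({**match, **rec})
    return linked, unlinked


# Hypothetical records, invented for illustration only.
registry = [
    {"person_id": "A1", "diagnosis": "F63.0"},
    {"person_id": "B2", "diagnosis": None},
]
survey = [
    {"person_id": "A1", "pgsi": 9},
    {"person_id": "B2", "pgsi": 0},
    {"person_id": "C3", "pgsi": 4},  # no registry record, so it stays unlinked
]

linked, unlinked = deterministic_link(registry, survey)
link_rate = len(linked) / len(survey)
print(f"Linked {len(linked)} of {len(survey)} survey records ({link_rate:.0%})")
```

Reporting the share of records left unlinked, as the final line does, is one simple way of describing linkage completeness; where no shared identifier exists, probabilistic methods would be needed instead.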
Since the Covid-19 pandemic, access to linked data has improved, notably for health surveillance. However, our findings underscore that the adoption of linked RCD in gambling research remains limited, and further investigation is needed to understand how different linkage methodologies impact data reliability and generalizability93. Selection bias may also influence gambling research outcomes, as individuals with gambling-related harms may not always engage with healthcare services, leading to underrepresentation in linked datasets. Moreover, the lack of systematically collected gambling-related data in UK healthcare settings presents a significant limitation, constraining opportunities to develop robust evidence and targeted interventions91. The lack of evidence from the UK may also imply that researchers there face additional barriers to working with RCD.
Linkage of RCD may provide novel datasets to assess the socio-economic costs of gambling harm and determine intervention cost-effectiveness95,96. However, it is recognized that linked data alone does not establish causation. While some studies included in the review did assess the economic cost of gambling among occupational groups, the impact of gambling on income, and the role of economic hardship as a risk factor for self-harm among people with a diagnosis of GD52,53,55, the social and economic costs were not calculated. That is, the studies reviewed did not link to datasets of aggregated or patient-level costs and healthcare utilization activities52,53,55. Vestergaard et al.66 did, however, assess the health costs of GD and mental and somatic comorbid conditions and found that gambling was associated with an estimated attributable cost of illness and welfare services of €4.0 M and with €17.6 M of indirect attributable costs due to reduced productivity, calculated using the human capital approach67. It is possible to adopt this approach and undertake secondary analysis of costs by using estimates of, for instance, the social costs of certain occupational groups53 or social welfare and criminal justice costs52. Estimating the costs of gambling harm in future linked RCD studies either directly or indirectly via secondary analysis may generate novel insights and policy-led research opportunities involving financial (banking) transactions, affordability, and gambling behaviors95. These financial insights may have valuable implications for policy.
There are policy implications of using data linkage research to improve public health interventions for gambling and comorbid conditions. Linked healthcare record data could, for instance, help develop early intervention programs by identifying individuals at risk of gambling harm. These data could also support monitoring systems that track gambling-related risk factors over time, enabling more proactive harm reduction approaches. Enhanced collaboration between gambling regulators, healthcare providers, and financial institutions could improve intervention effectiveness by combining financial data, self-exclusion registries, and healthcare records. Overall, the burgeoning work on data linkage we identified here, alongside similar developments in data fusion and Big Data analytic techniques involving financial transactions and industry-provided customer datasets, may have enormous potential for policy-led research20,50; we call for wider research consensus on how these innovative data-led synergies may be optimized to tackle gambling harm.
A final research gap concerns the finding that only one study investigated the role of ethnicity and GD, highlighting barriers to accessing care, which may prevent or delay diagnosis56. Future data linkage research with RCD should consider strategies to promote inclusion and diversity in public health research on gambling. For instance, a better understanding is needed of ethnicity and gambling harm among minority communities and individuals from deprived backgrounds97,98. Tracking patient healthcare utilization journeys in large datasets of RCD will help to inform prevention and early intervention opportunities.
The present review has limitations. A risk of bias analysis was not conducted, as this is not standard practice in scoping reviews. Our critical appraisal assessment may, therefore, have been limited by the absence of quality thresholds; we were, however, able to qualitatively appraise included studies to demonstrate methodological robustness. Our unrestricted systematic search was followed by a snowballing method of identifying included studies and may have omitted relevant papers. Similarly, we did not include the gambling gray literature in our search99.
The present scoping review is the first to describe research using linkage of RCD to investigate gambling harm. Most research was conducted in Nordic countries with unique gambling landscapes. Much of the evidence was focused on sociocultural factors, such as financial impacts, crime, and marriage, but few studies included in this review described health consequences associated with gambling. A growing number of data linkage studies examine the relationship between gambling and suicide. Overall, our findings support the need to conduct research using linked RCD to further explore relationships between gambling harm and demographic and mental health factors.
Methods
We conducted a scoping review in accordance with Joanna Briggs Institute guidance99,100 and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) guidelines101. The review protocol was pre-registered on Open Science Framework (DOI: 10.17605/OSF.IO/MEV68) and the completed PRISMA-ScR checklist is included in Supplementary Table 2.
Search strategy
The search strategy was developed by PB and MJ in consultation with an expert-by-experience, KP. Systematic searches with no specified timeframe were conducted of Medline, PubMed, Web of Science, Scopus, Embase, and PsycInfo databases using the search terms, ‘linkage,’ ‘routine data,’ ‘gambling,’ and ‘specific gambling harm’ (see Supplementary Table 1 for the full search terms). Articles identified from the search were uploaded to Covidence for further extraction. The reference lists of included studies were manually checked for additional studies that may fulfill the inclusion criteria.
Inclusion and exclusion criteria
We screened articles using a full list of inclusion and exclusion criteria organized according to PICO (population, intervention/issue, comparison, and outcome) categories presented in Table 4. To be included, articles had to be in English and involve individuals experiencing gambling harm, problematic gambling, or GD, with any intervention, exposure, or comparator, measuring any gambling-related harm outcomes, and across time and/or settings. Articles had to involve data linkage of at least two independent databases. Studies using aggregated data were excluded due to the lack of linkage methodology, despite using registry data73,102.
Table 4.
Inclusion and exclusion criteria
| PICO | Include | Exclude |
|---|---|---|
| Population | Persons at risk of, or experiencing, gambling harms, including problematic gambling and GD. | No exposure to gambling in sample. |
| Intervention/Exposure | Any intervention or exposure. | - |
| Comparators | Any comparators. | - |
| Outcomes | Any gambling-related harms or related health outcomes (e.g., gambling behavior, financial wellbeing, or mental health). | Outcomes unrelated to gambling. |
| Timings | Any timings (e.g., pre- and post-intervention or exposure or cross-sectional). | - |
| Settings | Any clinical or supportive care settings (e.g., primary care or third sector addiction services). | - |
Note: To be included, studies also had to describe a data linkage method with at least two databases of RCD.
Article selection and data extraction process
Titles and abstracts were reviewed independently by PB and MJ. Full text manuscripts of selected citations were then obtained and assessed against eligibility criteria. Disagreements were resolved through discussion, and inter-rater reliability was high (89.8% agreement, kappa p = 0.171). Identified articles were shared and discussed with experts with experience of conducting data linkage research on gambling to ensure the findings were representative and up to date. One further, unpublished study was identified from this process. Data extracted included study design and methodology, data linkage methods, participant demographics, and outcome measures, including number of events and measures of association.
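Percentage agreement and Cohen's kappa answer different questions, and the two can diverge when one screening decision (typically exclusion) dominates. The sketch below computes both for a pair of raters. The screening decisions are hypothetical and are not a reconstruction of the calculation reported above; the function names are illustrative assumptions.

```python
from typing import List, Sequence


def percent_agreement(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Proportion of items on which the two raters made the same decision."""
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohens_kappa(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Cohen's kappa: observed agreement corrected for chance agreement.

    Undefined (division by zero) if chance agreement equals 1, for example
    when both raters give the same single label to every item.
    """
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    labels = set(rater_a) | set(rater_b)
    # Chance agreement: product of each rater's marginal proportion per label.
    expected = sum(
        (list(rater_a).count(label) / n) * (list(rater_b).count(label) / n)
        for label in labels
    )
    return (observed - expected) / (1 - expected)


# Hypothetical screening decisions for 100 abstracts, invented for illustration:
# 88 excluded by both raters, 2 included by both, and 10 disagreements.
rater_a: List[str] = ["exclude"] * 88 + ["include"] * 2 + ["include"] * 5 + ["exclude"] * 5
rater_b: List[str] = ["exclude"] * 88 + ["include"] * 2 + ["exclude"] * 5 + ["include"] * 5

agreement = percent_agreement(rater_a, rater_b)
kappa = cohens_kappa(rater_a, rater_b)
print(f"Agreement: {agreement:.1%}, kappa: {kappa:.3f}")
```

In this invented example, raw agreement is 90% while kappa is much lower, because most of the agreement is on exclusions and would be expected by chance alone. Reporting both statistics, clearly labeled, helps readers interpret screening reliability.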
Quality assessment
Study quality was assessed using accepted guidelines103,104. These guidelines assess four major domains, including 1) details regarding the datasets which were linked, 2) researcher-selected variables and sources of bias, 3) the linkage process, and 4) ethics approval. The first domain assesses each data set included in the linkage study independently, whereas the remaining domains summarize the study. Two authors (M.J., P.B.) assessed each study individually, and disagreement was resolved through discussion. Studies achieved points for each domain and were subsequently classified as ‘good,’ ‘average,’ and ‘poor’ quality (see Fig. 3 for a graphical visualization of the quality assessment findings and Supplementary Table 3 for the ratings of individual studies).
Acknowledgements
We thank members of the GAMLINK Steering and Advisory Group for their input with the present review and J.H.K. for helpful comments on an earlier version of this article. This project was funded by an award from Greo Evidence Insights as part of the Gambling-Related Suicide Research Program made to S.D. and D.L.
Author contributions
Conceptualization: S.D. and D.L. Methodology: P.B., M.J., D.L., and S.D. Validation: P.B., M.J., and D.L. Formal analysis: P.B. and M.J. Investigation: P.B., M.J., D.L., and S.D. Resources: D.L. and S.D. Data curation: P.B. and M.J. Writing and editing: P.B., M.J., K.P., D.L., and S.D. Supervision: D.L. and S.D. Project administration: P.B. Funding acquisition: M.J., D.L., and S.D. All authors read and agreed to the final submitted version of the manuscript.
Data availability
Data supporting this study are openly available from OSF at http://osf.io/mev68/ (DOI: 10.17605/OSF.IO/MEV68).
Code availability
Code sharing is not applicable to this article as no code was generated during the current study.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
The online version contains supplementary material available at 10.1038/s41746-025-01713-z.
References
- 1.Benchimol, E. I. et al. The reporting of studies conducted using observational routinely-collected health data (RECORD) statement. PLoS Med.12, e1001885 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Nicholls, S. G., Langan, S. M. & Benchimol, E. I. Routinely collected data: the importance of high-quality diagnostic coding to research. CMAJ189, E1054–e5 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Hemkens, L. G., Contopoulos-Ioannidis, D. G. & Ioannidis, J. P. A. Routinely collected data and comparative effectiveness evidence: promises and limitations. CMAJ188, E158–e64 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Harron, K. et al. Challenges in administrative data linkage for research. Big Data Soc.4, 2053951717745678 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Kotz, D., O’Donnell, A., McPherson, S. & Thomas, K. H. Using primary care databases for addiction research: an introduction and overview of strengths and weaknesses. Addict. Behav. Rep.15, 100407 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Taylor, J. A. et al. The road to hell is paved with good intentions: the experience of applying for national data for linkage and suggestions for improvement. BMJ Open11, e047575 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Christen, P. Data Matching: Concepts and Techniques for Record Linkage, Entity Resolution, and Duplicate Detection (Springer Nat., 2012).
- 8.Dusetzina, S. B., Tyree, S. & Meyer, A. M. Linking Data for Health Services Research: A Framework and Instructional Guide. (Agency for Healthcare Res. Qual., 2014). [PubMed]
- 9.Ford, D. V. et al. The SAIL databank: building a national architecture for e-health research and evaluation. BMC Health Serv. Res.9, 157 (2009). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Harron, K. Data linkage in medical research. BMJ Med.1, e000087 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Ellum, R. et al. Machine learning for data linkage. Int. J. Popul. Data Sci. 8, (2023).
- 12.Gilbert, R. et al. GUILD: guidance for information about linking data sets. J. Public Health40, 191–198 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Padmanabhan, S. et al. Approach to record linkage of primary care data from clinical practice research datalink to other health-related patient data: overview and implications. Eur. J. Epidemiol.34, 91–99 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Herrett, E. et al. Data resource profile: clinical practice research datalink. Int. J. Epidemiol.44, 827–836 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Thomas, K. H. et al. Smoking cessation treatment and risk of depression, suicide, and self-harm in the clinical practice research datalink: prospective cohort study. BMJ347, f5704 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Cornish, R., Macleod, J., Strang, J., Vickerman, P. & Hickman, M. Risk of death during and after opiate substitution treatment in primary care: prospective observational study in UK General Practice Research Database. BMJ341, c5475 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Hickman, M. et al. The impact of buprenorphine and methadone on mortality: a primary care cohort study in the United Kingdom. Addiction113, 1461–1476 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.O’Donnell, A. et al. Impact of the introduction and withdrawal of financial incentives on the delivery of alcohol screening and brief advice in English primary health care: an interrupted time–series analysis. Addiction115, 49–60 (2020). [DOI] [PubMed] [Google Scholar]
- 19.Clapperton, A., Spittal, M. J., Dwyer, J., Nicholas, A. & Pirkis, J. Suicide within five years of hospital-treated self-harm: a data linkage cohort study. J. Affect. Disord.356, 528–534 (2024). [DOI] [PubMed] [Google Scholar]
- 20.Muggleton, N. Redefining harm: the role of data integration in understanding gambling behaviour. Addiction119, 1164–1165 (2024). [DOI] [PubMed] [Google Scholar]
- 21.Tran, L. T. et al. The prevalence of gambling and problematic gambling: a systematic review and meta-analysis. Lancet Public Health 9, e594–e613 (2024).
- 22.Wardle, H., John, A., Dymond, S. & McManus, S. Problem gambling and suicidality in England: secondary analysis of a representative cross-sectional survey. Public Health 184, 11–16 (2020).
- 23.Wardle, H. et al. The Lancet Public Health Commission on gambling. Lancet Public Health 9, e950–e994 (2024).
- 24.Wardle, H. & Tipping, S. The relationship between problematic gambling severity and engagement with gambling products: longitudinal analysis of the emerging adults gambling survey. Addiction 118, 1127–1139 (2023).
- 25.Kristensen, J. H. et al. Suicidality among individuals with gambling problems: a meta-analytic literature review. Psychol. Bull. 150, 82–106 (2024).
- 26.Ronzitti, S. et al. Current suicidal ideation in treatment-seeking individuals in the United Kingdom with gambling problems. Addict. Behav. 74, 33–40 (2017).
- 27.Marionneau, V. & Nikkinen, J. Gambling-related suicides and suicidality: a systematic review of qualitative evidence. Front. Psychiatry 13, 980303 (2022).
- 28.Roberts, S. E. et al. Suicide following acute admissions for physical illnesses across England and Wales. Psychol. Med. 48, 578–591 (2018).
- 29.Gairin, I., House, A. & Owens, D. Attendance at the accident and emergency department in the year before suicide: retrospective study. Br. J. Psychiatry 183, 28–33 (2003).
- 30.Stene-Larsen, K. & Reneflot, A. Contact with primary and mental health care prior to suicide: a systematic review of the literature from 2000 to 2017. Scand. J. Public Health 47, 9–17 (2019).
- 31.Walby, F. A., Myhre, M. & Kildahl, A. T. Contact with mental health services prior to suicide: a systematic review and meta-analysis. Psychiatr. Serv. 69, 751–759 (2018).
- 32.Auer, M. & Griffiths, M. D. Using artificial intelligence algorithms to predict self-reported problem gambling with account-based player data in an online casino setting. J. Gambl. Stud. 39, 1273–1294 (2023).
- 33.Dinos, S. et al. Patterns of play. https://natcen.ac.uk/publications/patterns-play. Published 2021. Accessed 22.10.2024.
- 34.Whiteford, S., Hoon, A. E., James, R., Tunney, R. & Dymond, S. Quantile regression analysis of in-play betting in a large online gambling dataset. Comput. Hum. Behav. Rep. 6, 100194 (2022).
- 35.Ghaharian, K., Abarbanel, B., Kraus, S. W., Singh, A. & Bernhard, B. Players gonna pay: characterizing gamblers and gambling-related harm with payments transaction data. Comput. Hum. Behav. 143, 107717 (2023).
- 36.Marionneau, V. K., Lahtinen, A. E. & Nikkinen, J. T. Gambling among indebted individuals: an analysis of bank transaction data. Eur. J. Public Health 34, 342–346 (2024).
- 37.Muggleton, N. et al. The association between gambling and financial, social and health outcomes in big financial data. Nat. Hum. Behav. 5, 319–326 (2021).
- 38.Marionneau, V., Kristiansen, S. & Wall, H. Harmful types of gambling: changes and emerging trends in longitudinal helpline data. Eur. J. Public Health 34, 335–341 (2024).
- 39.Rogers, R. D. et al. Gambling as a public health issue in Wales: Public Health Wales. https://research.bangor.ac.uk/portal/files/22557880/Gambling_as_Public_Health_Issue_Wales_Eng2.pdf. Published 2019.
- 40.Saunders, M. et al. Using geospatial mapping to predict and compare gambling harm hotspots in urban, rural and coastal areas of a large county in England. J. Public Health 45, 847–853 (2023).
- 41.Chagas, B. T. & Gomes, J. F. S. Internet gambling: a critical review of behavioural tracking research. J. Gambl. Issues 36, 1–27 (2017).
- 42.Deng, X., Lesch, T. & Clark, L. Applying data science to behavioral analysis of online gambling. Curr. Addict. Rep. 6, 159–164 (2019).
- 43.James, R. J. E. & Bradley, A. The use of social media in research on gambling: a systematic review. Curr. Addict. Rep. 8, 235–245 (2021).
- 44.Rossi, R. & Nairn, A. New developments in gambling marketing: the rise of social media ads and its effect on youth. Curr. Addict. Rep. 9, 385–391 (2022).
- 45.Lindeman, M. et al. Gambling operators’ social media image creation in Finland and Sweden 2017–2020. Nord. Stud. Alcohol Dr. 40, 40–60 (2022).
- 46.Russell, A. M. T. et al. Gambling advertising on Twitter before, during and after the initial Australian COVID-19 lockdown. J. Behav. Addict. 12, 557–570 (2023).
- 47.Houghton, S. et al. Tracking online searches for gambling activities and operators in the United Kingdom during the COVID-19 pandemic: a Google trends™ analysis. J. Behav. Addict. 12, 983–991 (2023).
- 48.Bradley, A. & James, R. J. E. Defining the key issues discussed by problematic gamblers on web-based forums: a data-driven approach. Int. Gambl. Stud. 21, 59–73 (2021).
- 49.van Baal, S. T., Bogdanski, P., Daryanani, A., Walasek, L. & Newall, P. The lived experience of gambling-related harm in natural language. Psychol. Addict. Behav. 10, 1030 (2024).
- 50.Zendle, D. & Newall, P. The relationship between gambling behaviour and gambling-related harm: a data fusion approach using open banking data. Addiction 119, 1826–1835 (2024).
- 51.Karlsson, A. & Håkansson, A. Gambling disorder, increased mortality, suicidality, and associated comorbidity: a longitudinal nationwide register study. J. Behav. Addict. 7, 1091–1099 (2018).
- 52.Karlsson, A., Hedén, O., Hansson, H., Sandgren, J. & Håkansson, A. Psychiatric comorbidity and economic hardship as risk factors for intentional self-harm in gambling disorder-a nationwide register study. Front. Psychiatry 12, 688285 (2021).
- 53.Binde, P. & Romild, U. Risk of problem gambling among occupational groups: a population and registry study. Nordisk Alkohol Nark. 37, 262–278 (2020).
- 54.Fröberg, F., Modin, B., Rosendahl, I. K., Tengström, A. & Hallqvist, J. The association between compulsory school achievement and problem gambling among Swedish young people. J. Adolesc. Health 56, 420–428 (2015).
- 55.Girard, L. C., Leino, T., Griffiths, M. D. & Pallesen, S. Income and gambling disorder: a longitudinal matched case-control study with registry data from Norway. SSM Popul. Health 24, 101504 (2023).
- 56.Aarestad, S. H. et al. Ethnicity as a risk factor for gambling disorder: a large-scale study linking data from the Norwegian patient registry with the Norwegian social insurance database. BMC Psychol. 11, 355 (2023).
- 57.Syvertsen, A. et al. Marital status and gambling disorder: a longitudinal study based on national registry data. BMC Psychiatry 23, 199 (2023).
- 58.Syvertsen, A. et al. Unemployment as a risk factor for gambling disorder: a longitudinal study based on national registry data. J. Behav. Addict. 13, 751–760 (2024).
- 59.Kristensen, J. H. et al. Association between gambling disorder and suicide mortality: a comparative cohort study using Norwegian health registry data. Lancet Region. Health Eur. 48, 101127 (2025).
- 60.Kaur, P. et al. Antidepressant prescription as a risk factor for developing gambling disorder: a longitudinal registry-based study in Norway. J. Behav. Addict. 14, 457–464 (2025).
- 61.Latvala, T., Alho, H., Raisamo, S. & Salonen, A. H. Gambling involvement, type of gambling and grade point average among 18-29-year-old Finnish men and women. Nordisk Alkohol Nark. 36, 190–202 (2019).
- 62.Latvala, T., Castrén, S., Alho, H. & Salonen, A. Compulsory school achievement and gambling among men and women aged 18-29 in Finland. Scand. J. Public Health 46, 505–513 (2018).
- 63.Latvala, T. A., Lintonen, T. P., Browne, M., Rockloff, M. & Salonen, A. H. Social disadvantage and gambling severity: a population-based study with register-linkage. Eur. J. Public Health 31, 1217–1223 (2021).
- 64.Reccord, C. et al. Rural-urban differences in suicide mortality: an observational study in Newfoundland and Labrador, Canada. Can. J. Psychiatry 66, 918–928 (2021).
- 65.Bhatti, J. A., Thiruchelvam, D. & Redelmeier, D. A. Gambling and subsequent road traffic injuries: a longitudinal cohort analysis. J. Addict. Med. 13, 139–146 (2019).
- 66.Laursen, B., Plauborg, R., Ekholm, O., Larsen, C. V. & Juel, K. Problem gambling associated with violent and criminal behaviour: a Danish population-based survey and register study. J. Gambl. Stud. 32, 25–34 (2016).
- 67.Vestergaard, S. V., Ulrichsen, S. P., Dahl, C. M., Marcussen, T. & Christiansen, C. F. Comorbidity, criminality, and costs of patients treated for gambling disorder in Denmark. J. Gambl. Stud. 39, 1765–1780 (2023).
- 68.Public Health England, Office for Health Improvement and Disparities. Gambling-Related Harms: Evidence Review. (Public Health Engl., 2023).
- 69.Binde, P. Gambling in Sweden: the cultural and socio-political context. Addiction 109, 193–198 (2014).
- 70.Doidge, J. C. & Harron, K. Demystifying probabilistic linkage: common myths and misconceptions. Int. J. Popul. Data Sci. 3, 410 (2018).
- 71.Hartmann, M. & Blaszczynski, A. The longitudinal relationships between psychiatric disorders and gambling disorders. Int. J. Ment. Health Addict. 16, 16–44 (2018).
- 72.Giovanni, M. et al. Gambling disorder and suicide: an overview of the associated co-morbidity and clinical characteristics. Int. J. High. Risk Behav. Addict. 6, e30827 (2017).
- 73.Leino, T., Torsheim, T., Griffiths, M. D. & Pallesen, S. The relationship between substance use disorder and gambling disorder: a nationwide longitudinal health registry study. Scand. J. Public Health 51, 28–34 (2023).
- 74.Pirkis, J. et al. Addressing key risk factors for suicide at a societal level. Lancet Public Health 9, e816–e824 (2024).
- 75.Raisamo, S. U., Mäkelä, P., Salonen, A. H. & Lintonen, T. P. The extent and distribution of gambling harm in Finland as assessed by the problem gambling severity index. Eur. J. Public Health 25, 716–722 (2015).
- 76.Young, M. M. et al. Not too much, not too often, and not too many: the results of the first large-scale, international project to develop lower-risk gambling guidelines. Int. J. Ment. Health Addict. 22, 666–684 (2024).
- 77.Moore, E., Pryce, R., Squires, H. & Goyder, E. The association between health-related quality of life and problem gambling severity: a cross-sectional analysis of the health survey for England. BMC Public Health 24, 434 (2024).
- 78.Browne, M., Goodwin, B. C. & Rockloff, M. J. Validation of the short gambling harm screen: a tool for assessment of harms from gambling. J. Gambl. Stud. 34, 499–512 (2018).
- 79.Currie, S. R., Hodgins, D. C. & Casey, D. M. Validity of the problem gambling severity index interpretive categories. J. Gambl. Stud. 29, 311–327 (2013).
- 80.Browne, M. et al. Benchmarking gambling screens to health-state utility: the PGSI and the SGHS estimate similar levels of population gambling-harm. BMC Public Health 22, 839 (2022).
- 81.Gooding, N. B., Williams, R. J. & Volberg, R. A. The problem gambling measure: a revision of the problem & pathological gambling measure to better predict at-risk and chronic gambling. Int. Gambl. Stud. 24, 73–397 (2024).
- 82.Doidge, J., Christian, P. & Harron, K. Quality assessment in data linkage. https://www.gov.uk/government/publications/joined-up-data-in-government-the-future-of-data-linking-methods/quality-assessment-in-data-linkage. Published 2021. Accessed 01.10.2024.
- 83.Hopfgartner, N., Auer, M., Helic, D. & Griffiths, M. D. Using artificial intelligence algorithms to predict self-reported problem gambling among online casino gamblers from different countries using account-based player data. J. Gambl. Stud. 39, 1273–1294 (2024).
- 84.Seo, W., Kim, N., Lee, S.-K. & Park, S.-M. Machine learning-based analysis of adolescent gambling factors. J. Behav. Addict. 9, 734–743 (2020).
- 85.Murdoch, B. Privacy and artificial intelligence: challenges for protecting health information in a new era. BMC Med. Ethics 22, 122 (2021).
- 86.Louderback, E. R. et al. Open science practices in gambling research publications (2016–2019): a scoping review. J. Gambl. Stud. 39, 987–1011 (2023).
- 87.Heirene, R. et al. Preregistration specificity and adherence: a review of preregistered gambling studies and cross-disciplinary comparison. Meta-Psychology, 8, 2909 (2024).
- 88.Laugesen, K. et al. Nordic health registry-based research: a review of health care systems and key registries. Clin. Epidemiol. 13, 533–554 (2021).
- 89.Nikkinen, J. & Marionneau, V. On the efficiency of Nordic state-controlled gambling companies. Nord. Stud. Alcohol Dr. 38, 212–226 (2020).
- 90.Marionneau, V., Egerer, M. & Nikkinen, J. How do state gambling monopolies affect levels of gambling harm? Curr. Addict. Rep. 8, 225–234 (2021).
- 91.Sulkunen, P. et al. Setting limits: gambling, science and public policy-summary of results. Addiction 116, 32–40 (2021).
- 92.Ludvigsson, J. F., Otterblad-Olausson, P., Pettersson, B. U. & Ekbom, A. The Swedish personal identity number: possibilities and pitfalls in healthcare and medical research. Eur. J. Epidemiol. 24, 659–667 (2009).
- 93.Cavallaro, F., Lugg-Widger, F., Cannings-John, R. & Harron, K. Reducing barriers to data access for research in the public interest—lessons from Covid-19. BMJ Opin., 6, 1503 (2020)
- 94.Wood, A. et al. Linked electronic health records for research on a nationwide cohort of more than 54 million people in England: data resource. BMJ 373, n826 (2021).
- 95.Harris, S. et al. Social and economic costs of gambling problems and related harm among UK military veterans. BMJ Mil. Health 169, 413–418 (2023).
- 96.Rossow, I., Kesaite, V., Pallesen, S. & Wardle, H. Concentration of gambling spending by product type: analysis of gambling accounts records in Norway. Addict. Res. Theory, 33, 114–121 (2024).
- 97.Grant, J. E. & Chamberlain, S. R. Gambling disorder in minority ethnic groups. Addict. Behav. 136, 107475 (2023).
- 98.Selin, J., Okkonen, P. & Raisamo, S. Accessibility, neighborhood socioeconomic disadvantage and expenditures on electronic gambling machines: a spatial analysis based on player account data. Int. J. Health Geogr. 23, 19 (2024).
- 99.Baxter, D. G., Nicoll, F. & Akçayir, M. Grey literature is a necessary facet in a critical approach to gambling research. Nat. Res. Council Italy 22, 125–134 (2021).
- 100.Arksey, H. & O’Malley, L. Scoping studies: towards a methodological framework. Int. J. Soc. Res. Methodol. 8, 19–32 (2005).
- 101.Tricco, A. et al. PRISMA extension for scoping reviews: checklist and explanation. Ann. Int. Med. 169, 467–473 (2018).
- 102.Grönroos, T. et al. Somatic and psychiatric comorbidity in people with diagnosed gambling disorder: a Finnish nation-wide register study. Addiction 119, 2015–2022 (2024).
- 103.Bohensky, M. A. et al. Development and validation of reporting guidelines for studies involving data linkage. Aust. N. Z. J. Public Health 35, 486–489 (2011).
- 104.Patel, K. et al. What do register-based studies tell us about migrant mental health? A scoping review. Syst. Rev. 6, 78 (2017).
Data Availability Statement
Data supporting this study are openly available from OSF at http://osf.io/mev68/ (DOI: 10.17605/OSF.IO/MEV68).
Code sharing is not applicable to this article as no code was generated during the current study.



