ABSTRACT
The Integrated Data Infrastructure (IDI) is a collection of deidentified whole population administrative datasets, linked at the individual level, and made available through Stats NZ for ‘public good’ research. This paper reviews longitudinal research that has been undertaken using the IDI, and highlights the strengths, weaknesses, opportunities, and threats of using the IDI for longitudinal research. It is argued that the IDI can be, and has been, used for longitudinal research that would be difficult or impossible to undertake without such a resource, including longitudinal research involving small, sometimes marginalised populations, intergenerational research and quasi-experimental family designs, and research investigating residential variations in the natural environment. However, issues regarding ethical governance need addressing. Researchers wishing to use the IDI should familiarise themselves with its limitations, particularly around what service use data capture and what this represents, what is missed by assessing only deficit-focused data, and the variable quality of the data.
KEYWORDS: Longitudinal, administrative data, Integrated Data Infrastructure, ethics, Māori data sovereignty
Introduction
Aotearoa New Zealand has a long and rich history of longitudinal studies – particularly but not exclusively birth cohort studies – beginning with the Dunedin and Christchurch studies in the 1970s (Silva and Stanton 1996; Fergusson and Horwood 2001), the Auckland Birthweight Collaborative (ABC) in the 1990s (Thompson et al. 2001), and the Pacific Islands Families Study (Paterson et al. 2008), Te Hoe Nuku Roa (Fitzgerald and Durie 2000), the Survey of Families, Income and Employment (SOFIE; Carter et al. 2010), and the Growing Up in New Zealand Study in the 2000s (Morton et al. 2013). Each has involved recruitment of participants, intensive assessment at regular intervals as participants age, and an enormous amount of funding, effort, and specialised staff to undertake the assessments, maintain the cohort, and support the day-to-day running of the study.
Using existing data for longitudinal research circumvents the need for researchers to devote time, effort, and resources to data collection. In particular, data collected by administrative agencies as part of their business operations (for example, running a health system, an education sector, etc.) produce an ongoing record of ‘events’ that can be used for longitudinal analysis.
Internationally, linked administrative data have been made available for research in countries such as Sweden, Denmark, Norway, Canada, Taiwan, and Australia (Hamm et al. 2021; Segal et al. 2021). In New Zealand, the development in 2011 of the Integrated Data Infrastructure (IDI) – a collection of whole population administrative datasets linked at the individual level across sectors and made available in de-identified form for research (Milne et al. 2019; Stats NZ 2021a) – has enabled a vast range of possibilities for administrative longitudinal research in Aotearoa New Zealand.
This paper will review longitudinal research that has been undertaken using the IDI with a focus on select examples showcasing different longitudinal uses of the data – it is not the intention to provide a comprehensive review. In the following sections the IDI and methods for the review will be briefly described, longitudinal research using the IDI will be reviewed, and the strengths, weaknesses, opportunities, and threats of using the IDI for longitudinal research will be discussed.
The Integrated Data Infrastructure (IDI)
The IDI is a collection of New Zealand administrative data sources which have been linked at the individual level (Milne et al. 2019; Stats NZ 2021a) and deidentified (i.e. stripped of identifiers such as name and address, and agency identifiers such as IRD and NHI numbers). The IDI is curated and maintained by Stats NZ. Data in the IDI include administrative data from government agencies (for example, Ministry of Health (MOH), Ministry of Social Development (MSD), Ministry of Education), the New Zealand censuses in 2013 and 2018, as well as several social and socio-economic surveys (for example, general social survey, household economic survey). Data are updated (‘refreshed’) up to four times a year with the addition of more recent data and (periodically) new data tables. An up-to-date list of data sources in the IDI can be found here: https://www.stats.govt.nz/integrated-data/integrated-data-infrastructure/data-in-the-idi/.
Data from different agencies are linked to a central concordance table (the ‘spine’), which includes those born in New Zealand since 1920 (from the Department of Internal Affairs’ birth records), those arriving in New Zealand on a long-term visa since 1997 (from the Ministry of Business, Innovation and Employment’s border movement records), and those working in New Zealand since 1999 (from the Inland Revenue Department’s tax records). The spine forms the base population of the IDI and can be thought of as an ‘ever resident’ New Zealand population (this is currently around 10 million individuals). Linkage is through deterministic and probabilistic methods. Linkage rates vary by agency, though most are >90%, while false positive rates also vary by agency with most ≤1% (Stats NZ 2021b).
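To illustrate the distinction between deterministic and probabilistic linkage, the following is a minimal sketch in Python. It is not the procedure Stats NZ uses: the field names, the agreement probabilities, and the threshold are all hypothetical values chosen for illustration. A pair of records is linked deterministically if both carry the same identifier; otherwise field-by-field agreement is scored with log-likelihood-ratio weights (in the style of Fellegi-Sunter linkage) and compared with a threshold.

```python
import math

# Hypothetical agreement probabilities for each comparison field (assumed values):
# m = P(field agrees | records belong to the same person)
# u = P(field agrees | records belong to different people)
FIELD_PROBS = {
    "surname": (0.95, 0.01),
    "birth_date": (0.98, 0.003),
    "sex": (0.99, 0.5),
}


def deterministic_match(a, b, key="agency_id"):
    """Exact match on a shared identifier, when both records carry one."""
    return a.get(key) is not None and a.get(key) == b.get(key)


def match_weight(a, b):
    """Sum log2 agreement or disagreement weights across the comparison fields."""
    total = 0.0
    for field, (m, u) in FIELD_PROBS.items():
        if a.get(field) == b.get(field):
            total += math.log2(m / u)              # agreement adds evidence
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement subtracts it
    return total


def link(a, b, threshold=8.0):
    """Link on the identifier if possible, else on the probabilistic weight.

    The threshold is a hypothetical cut-off, not a published value.
    """
    if deterministic_match(a, b):
        return True
    return match_weight(a, b) >= threshold


spine = {"agency_id": None, "surname": "smith", "birth_date": "1980-01-01", "sex": "F"}
same = {"agency_id": None, "surname": "smith", "birth_date": "1980-01-01", "sex": "F"}
other = {"agency_id": None, "surname": "jones", "birth_date": "1980-01-01", "sex": "F"}

print(link(spine, same))   # all fields agree, so the weight exceeds the threshold
print(link(spine, other))  # surname disagreement pulls the weight below it
```

Raising the threshold reduces false links at the cost of more missed links, which is the trade-off behind the linkage and false positive rates reported above.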
Access to the IDI for ‘public good’ research is by application to Stats NZ (https://www.stats.govt.nz/integrated-data/access-microdata-in-the-data-lab/). Stats NZ applies the ‘Five Safes’ framework to statistical disclosure control – Safe People, Safe Projects, Safe Settings, Safe Data and Safe Output (Desai et al. 2016) – and researchers wishing to access the IDI are required to work within this framework.
Methods
Studies were considered for inclusion in the review if they used IDI data and focussed on an assessment of individuals over time. A list of IDI papers had been compiled and updated by the author in relation to a previous paper (Milne et al. 2019). This list was augmented by a search of the PubMed and Scopus databases using the term ‘Integrated Data Infrastructure’, conducted in September 2021, which was then restricted to studies focussing on assessments of individuals over time.
Results
The studies reviewed below are organised by the age at which the cohorts investigated were constituted.
Birth cohorts
Berry et al. (2018) investigated long-term health and education outcomes for extremely pre-term babies (as early as 23 weeks gestation). They found that while children born extremely pre-term were more likely to experience health problems and education difficulties, the majority of extremely pre-term children had few developmental concerns, and most took the National Certificate of Educational Achievement (NCEA) Level 1 exam (typically taken in year 11).
Richmond-Rakerd et al. (2021) took a birth cohort approach to define a cohort (all those born in New Zealand from 1928 to 1978), and then followed them as adults and into old age, assessing hospitalisations from 1988 to 2018. They reported that being hospitalised for a mental health condition was associated with double the risk of being hospitalised for a subsequent physical health chronic condition (such as diabetes, cancer, coronary heart disease), and triple the risk for mortality, even after controlling for previous physical health hospitalisations.
Similarly, Paynter et al. (2019) tracked individuals born from 1984 to 1999 and assessed whether they received the meningococcal B vaccination from 2004 to 2008. Those who had were found to have a decreased risk of subsequent hospitalisation for gonorrhea. The vaccine effectiveness against gonorrhea was estimated to be 24%.
Family influences
Two birth cohort investigations using the IDI have made use of the fact that children can be linked to parents and wider families through birth records (Milne et al. 2020). Leong et al. (2020) studied children born in 2008–2011 to assess the influence of antibiotic use in pregnancy and during the first two years of children’s life on obesity assessed at age 4. While dose–response associations were found between antibiotic use in pregnancy and in the child’s first two years of life and later obesity, these associations were no longer apparent when siblings and twins within the same families were compared, suggesting that the apparent associations may have been due to unmeasured familial confounding.
Slykerman et al. (2020), using a cohort of children born between 1996 and 1998, also compared children within families to assess whether caesarean birth has an influence on educational attainment. Both within-family analyses and covariate-controlled analyses across the whole sample found no evidence that caesarean birth impacts later educational attainment as measured by performance in NCEA level 2 (year 12) examinations.
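The logic of a within-family comparison can be illustrated with a small sketch in Python. The data below are invented purely for illustration and are not drawn from the IDI or from either study, and the simple demeaning approach shown is one common way to compare siblings, not necessarily the models these authors fitted. In the toy data the outcome depends only on an unmeasured family-level factor, yet exposure also rises with that factor, so a naive analysis across all children finds an association that disappears once each child is compared with their own siblings.

```python
from collections import defaultdict


def ols_slope(xs, ys):
    """Slope from a simple least-squares regression of ys on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def within_family_slope(rows):
    """Slope after subtracting each family's mean exposure and mean outcome."""
    by_family = defaultdict(list)
    for family, exposure, outcome in rows:
        by_family[family].append((exposure, outcome))
    dx, dy = [], []
    for members in by_family.values():
        mean_x = sum(x for x, _ in members) / len(members)
        mean_y = sum(y for _, y in members) / len(members)
        for x, y in members:
            dx.append(x - mean_x)  # exposure relative to siblings
            dy.append(y - mean_y)  # outcome relative to siblings
    return ols_slope(dx, dy)


# Invented data: (family, exposure, outcome). The outcome is fixed within each
# family (driven by a shared family factor), while exposure differs by sibling.
rows = [
    ("A", 0, 0), ("A", 1, 0),
    ("B", 2, 4), ("B", 3, 4),
    ("C", 4, 8), ("C", 5, 8),
]

naive = ols_slope([x for _, x, _ in rows], [y for _, _, y in rows])
within = within_family_slope(rows)
print(naive, within)  # naive slope is clearly positive; within-family slope is zero
```

Because siblings share the family-level factor, subtracting family means removes it from both exposure and outcome, which is why an association that vanishes within families points to familial confounding.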
Influences of the natural environment
Because the IDI covers the full geography of New Zealand it is possible to link geographical-level data to the IDI to assess the impact of factors that vary geographically on outcomes of interest. Donovan et al. (2018) explored this possibility by assessing the association between exposure to surrounding ‘green’ environments and vegetation diversity on childhood asthma for children born in 1998 and followed up until 2016. Both greenness and vegetation diversity of residential areas measured from birth to age 18 predicted lower prevalence of asthma, which the authors postulated might be due to greater and more diverse microbial exposure. In a subsequent paper, Donovan et al. (2019a) reported that both living in rural areas between ages 2 and 18 and vegetation diversity between ages 2 and 18 independently predicted lower prevalence of an attention-deficit hyperactivity disorder (ADHD) diagnosis, though they caution that the rurality finding may reflect healthcare access issues.
‘Natural’ experiments
The availability in the IDI of longitudinal data covering all of New Zealand allows for tests of ‘natural experiments’, whereby natural variation in exposure to interventions and policies can be used to determine whether these interventions and policies are likely to have a causal impact. Vaithianathan et al. (2016) utilised variation by territorial local authority in phasing in of the ‘Family Start Home Visiting programme’ to test the impact of the programme on child outcomes. They report that the programme resulted in fewer neonatal child deaths for the population as a whole, and also specifically for Māori children.
Child cohorts
Charania et al. (2018, 2020) used the IDI to identify migrant status among children in order to assess immunisation uptake and subsequent morbidity. They found that 46% of foreign-born children had an immunisation recorded in the MOH’s National Immunisation Register, and only 3% of refugee children had received all age-appropriate vaccinations (Charania et al. 2018). In a follow-up paper they report greater hospitalisations for vaccine-preventable diseases for both foreign-born children and refugee children (Charania et al. 2020).
Shackleton et al. (2021) utilised data on children aged 0–17 in the Survey of Families, Income and Employment (SOFIE) in the IDI to assess associations between household-level poverty and hospitalisations across the eight waves of the study. They found weak or absent associations between income poverty and hospitalisation, but stronger predictive relationships between measures of individual and area-level deprivation and hospitalisations.
Schluter et al. (2018, 2020) investigated whether data from the Before School Check (B4SC), a nationwide screen of the health, behaviour, and development of four-year-olds (Ministry of Health—Manatū Hauora 2015), could be used to identify children who might need later literacy support (as indicated by a literacy intervention at age 6). The B4SC was found to have poor discriminatory power to detect a later literacy intervention, both for the population as a whole (c-statistic = 0.624; Schluter et al. 2020), and specifically for children of Pacific ethnicity (c-statistic = 0.592; Schluter et al. 2018).
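For readers unfamiliar with the c-statistic, the following minimal Python sketch shows how it can be computed. The scores and outcomes are invented for illustration and are not B4SC data. The c-statistic is the proportion of all (case, non-case) pairs in which the case received the higher screening score, with ties counted as half.

```python
def c_statistic(scores, outcomes):
    """Proportion of (case, non-case) pairs ranked correctly by the score.

    scores: screening scores, higher meaning greater predicted risk.
    outcomes: 1 if the outcome occurred (e.g. a later intervention), else 0.
    """
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    non_cases = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for case_score in cases:
        for other_score in non_cases:
            if case_score > other_score:
                concordant += 1.0   # pair ranked correctly
            elif case_score == other_score:
                concordant += 0.5   # tie counts as half
    return concordant / (len(cases) * len(non_cases))


# Invented example: two non-cases followed by two cases.
example = c_statistic([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
print(example)
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect discrimination, which is why values close to 0.6 are read as poor discriminatory power.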
Older age cohorts
Dixon (2015) identified 20–59-year-olds in 2008–2009 and used data provided by MOH, Inland Revenue, and MSD to assess how employment and earnings over a four-year follow-up period were impacted by the onset of eight different chronic conditions: stroke, traumatic brain injury, coronary heart disease, diabetes, chronic obstructive pulmonary disease, breast cancer, melanoma, and prostate cancer. Results showed that employment rates and earnings were impacted both one year and four years after disease onset for six of eight conditions (all except melanoma and prostate cancer), with stroke having the largest impact.
Taking a similar approach, Davie and Lilley (2018) compared 45–64-year-old workers who had made an injury claim in 2009 to those who had not. They found that the reduction in income among those injured increased over the three-year follow-up period, to 6.7% in the third year.
Teng et al. (2017) used the IDI to assess health outcomes in those aged 45 and older in relation to exposure to the 2010–2011 earthquakes in Christchurch. Utilising geographic variation in exposure to earthquake damage, they found modest increases in risk for myocardial infarction-related hospital admissions in the year after the earthquakes (risk ratio = 1.25) for those who lived in the most vs the least damaged areas. No increased risk was apparent 2–5 years after the earthquake.
Pierse et al. (2019) worked with a community provider to integrate data on the homeless population aged 15+ years in Hamilton into the IDI and assessed the volume and variety of the services the homeless population accessed in their lifetime. The 390 people in their cohort who were homeless had >200,000 lifetime interactions with services, most commonly health, justice, and welfare services. This level of service use far exceeded that of the general population.
Building on their previous work with children, Donovan et al. (2019b) explored the impact of exposure to a green environment on recovery from knee and hip arthroplasty for adults who underwent surgery in 2006 or 2007. They found that hip (but not knee) surgery patients exposed to greener environments lived longer and took fewer opioids for pain. The authors suggest that exposure to the natural environment may be beneficial for postsurgical recovery.
Discussion
While the studies reviewed highlight some obvious strengths of the IDI for longitudinal research and suggested opportunities for further research, there are also weaknesses and threats to its ongoing use. These are summarised in Table 1.
Table 1. Strengths, weaknesses, opportunities, and threats of using the IDI for longitudinal research.
Strengths | Weaknesses | Opportunities | Threats
Strengths
The IDI contains more or less the whole population, so estimates tend to have high precision, and small and sometimes difficult-to-reach population groups are able to be studied. Life-stages from birth to old age can be followed longitudinally, which greatly increases the range of research questions that can be investigated. Tests for cohort effects are possible (for example, by comparing people of the same age born years apart). The IDI includes data from a wide range of health and social sector agencies, so exposures and outcomes spanning several domains of functioning can be investigated. Some of these data sources (e.g. tax and income data; births, deaths, and marriages data) were not available to researchers before the IDI was established.
Weaknesses
The IDI has the limitations expected of an administrative data resource. Foremost of these is that most data provided by agencies to the IDI capture service use, not experiences had, and these are not the same thing. The IDI does not include data on the health conditions of individuals who do not access services for those conditions. Moreover, only individuals charged or convicted for crimes are included in justice data, and this will be a (biased) subset of individuals committing crimes (JustSpeak 2020). As such, claims about prevalence, trends in prevalence and subgroup differences in prevalence need to be made with caution, and researchers need to be specific about the outcomes for which prevalence is being estimated. For example, an increase in mental health service use (the outcome estimated) does not necessarily reflect an increase in prevalence of mental health problems (a related but different outcome). Comparisons between service use data from the IDI and non-service use data from other sources may help to clarify what prevalence data do and do not mean. For example, Svardal et al. (2022) showed that the ethnic and socio-economic patterning of the prevalence of antidepressant use in pregnancy is completely different from the ethnic and socio-economic patterning of the prevalence of antenatal depression using a diagnostic screen.
Additional limitations include that (i) most IDI data are deficit-focused, as services are often only accessed when an individual is in need; (ii) the denominator for the population at any point in time has to be inferred from the group of people who have used services – this will not be perfect, and it is difficult to estimate prior to 2007 (Zhao et al. 2018; Stats NZ 2021c); (iii) different datasets cover different time-periods (https://www.stats.govt.nz/integrated-data/integrated-data-infrastructure/data-in-the-idi/), which will limit the length and breadth of longitudinal investigations; (iv) data quality and data documentation are variable (Milne et al. 2019; Bycroft et al. 2021); (v) tracking households and families over time is difficult as the records capturing movements in and out of households rely on accurate and timely reporting to social sector agencies; and (vi) some links will be incorrect and others will be missed (Stats NZ 2021b), and methods to account for these false and missed links need to be developed. Note that the final limitation is not shared by some international administrative data resources (particularly those in Sweden, Denmark, and Norway), which are able to use national identifiers to track individuals’ interactions across administrative systems and so do not rely on record linkage (Hamm et al. 2021).
Opportunities
Opportunities include, first, that natural experiments and exposures which vary by geography can be investigated (see studies by Donovan et al. 2018, 2019a, 2019b; Vaithianathan et al. 2016 reviewed above). For example, the longitudinal impacts of COVID-19 lockdowns could be compared across regions that experienced lockdowns of different durations and intensities. Another opportunity involves greater use of quasi-experimental family designs, such as comparisons of differentially exposed siblings (Leong et al. 2020; Slykerman et al. 2020). Wider environmental influences are also starting to be explored (Roy et al. 2022), and could be explored further.
Threats
There are few threats to the use of the IDI that are specific to longitudinal research, but there are a number of threats to use of the IDI generally of which longitudinal researchers need to be aware. Three are highlighted here: (i) maintaining social licence, (ii) under-developed ethical governance, and (iii) inadequate recognition of Māori data sovereignty.
Social licence
Social licence refers to the permission granted by population groups and stakeholders for businesses and governments to act in certain ways (Ballantyne and Stewart 2019). Regarding permission to use administrative data for research, studies have found general acceptance across population groups so long as there is transparency in what the data include and are being used for, and there is trust in the institutions maintaining the data (Kalkman et al. 2022; Paprica et al. 2019). Concerningly, a recent report found ‘mixed’ levels of trust in Stats NZ’s data stewardship (Nielson 2020), and specific knowledge about the IDI is also low (Gulliver et al. 2018). More needs to be done to educate the public about how their data are being used and could be used (e.g. for longitudinal investigations using data from several government agencies) to ensure that there is trust in the IDI and that social licence for the IDI is maintained.
Ethics
Another threat to trust in the IDI is that ethical governance is lacking. There has never been any mandate for non-health IDI projects to undergo ethical review, and the requirement (until mid-2021) for projects involving health data to request an ethical assessment from the Health and Disability Ethics Committee (HDEC) has invariably involved HDEC classifying projects as ‘out of scope’ and not undertaking an ethical assessment. The lack of mandate places the onus on researchers to seek ethical review. However, unequal access to ethical review committees, and unequal knowledge of ethical processes, among researchers using the IDI – who may belong to government agencies and private research companies as well as universities – make for uneven ethical governance of IDI research.
A specific ethical issue relating to longitudinal research using the IDI is that individuals have not given informed consent. Lessof (2009) argues that the important question regarding use of administrative data for longitudinal research without informed consent is ‘whether an individual’s health, interests or confidentiality could be affected negatively’ (p42). In the case of the IDI, individuals’ health is unlikely to be negatively affected, and confidentiality is strongly guarded by the Five Safes framework. The remaining issue – whether individuals’ interests are likely to be negatively affected – could be addressed through independent and project-specific ethical oversight of IDI projects, which again emphasises the importance of developing a strong ethical governance system for the IDI.
Māori data sovereignty
A final potential threat to trust in the IDI is inadequate recognition of Māori data sovereignty – the idea that ‘Māori data should be subject to Māori governance’ (Te Mana Raraunga 2021). There has been an attempt to incorporate Māori data sovereignty principles in the IDI application process through the Ngā Tikanga Paihere framework (Sporle et al. 2020). The framework provides an assessment tool for IDI projects to ensure project investigators (i) have the appropriate expertise, and relationships with communities; (ii) have public confidence and trust to use data; (iii) use good data standards and practices; (iv) have a clear purpose and action in the proposed research; and (v) have balanced the benefits and risks of conducting the research (Sporle et al. 2020). In addition to Ngā Tikanga Paihere, Māori researchers and communities are working to determine the best ways to use the IDI to suit their communities’ needs (Sporle 2018; Greaves and Milne 2020), while Māori are also at the forefront of initiatives to revise key legislation and processes relating to the IDI (Kukutai and Cormack 2019, 2021; Cormack et al. 2020; Moses 2020; Sporle et al. 2020). Internationally, Indigenous groups are determining what safe use of administrative data looks like for their own communities (Walker et al. 2018; Rowe et al. 2021; Walter et al. 2021).
Good international examples exist which integrate indigenous data sovereignty with ethical processes for accessing administrative data. For example, the Manitoba Multigenerational Cohort requires approval from an ethics board, the Health Information Privacy Committee, and non-health data providers if non-health data are requested; and research projects using linked datasets in which indigenous populations are over-represented also require approvals from the First Nations Health and Social Secretariat of Manitoba and the Manitoba Metis Community Research and Ethics Protocol (Hamad et al. 2021). These processes might be adaptable for the IDI within the New Zealand context.
Concluding remarks
The IDI is a valuable resource for longitudinal research. Some of the research highlighted in this review, such as research involving small, sometimes marginalised populations (Berry et al. 2018; Charania et al. 2018, 2020; Pierse et al. 2019), intergenerational research and quasi-experimental family designs (Leong et al. 2020; Slykerman et al. 2020) and research investigating residential variations in the natural environment (Donovan et al. 2018, 2019a, 2019b) would have been difficult or impossible to undertake without a linked administrative data resource such as the IDI. However, issues regarding ethical governance need addressing. Researchers wishing to use the IDI should familiarise themselves with its limitations, particularly around what service use data capture and what this represents, what is missed by assessing only deficit-focused data, and the variable quality of the data.
Acknowledgements
The author would like to acknowledge Andrew Sporle and Lara Greaves for helpful comments on a draft of this manuscript.
Funding Statement
This work was supported by the Ministry of Business, Innovation and Employment (Endeavour Grant, ref 62506 ENDRP; A Better Start National Science Challenge, ref UOAX1511).
Disclosure statement
No potential conflict of interest was reported by the author(s).
References
- Ballantyne A, Stewart C. 2019. Big data and public-private partnerships in healthcare and research: the application of an ethics framework for big data in health and research. Asian Bioethics Review. 11(3):315–326.
- Berry MJ, Foster T, Rowe K, Robertson O, Robson R, Pierse N. 2018. Gestational age, health, and educational outcomes in adolescents. Pediatrics. 142(5):e20181016.
- Bycroft C, Miller S, Gath M, Matheson-Dunning N, Simpson K, Das S. 2021. The quality of administrative data for census variables: strengths, limitations, and opportunities. Wellington: Statistics New Zealand.
- Carter KN, Cronin M, Blakely TB, Hayward M, Richardson K. 2010. Cohort profile: survey of families, income and employment (SoFIE) and health extension (SoFIE-health). International Journal of Epidemiology. 39(3):653–659. doi: 10.1093/ije/dyp215.
- Charania NA, Paynter J, Lee AC, Watson DG, Turner NM. 2018. Exploring immunisation inequities among migrant and refugee children in New Zealand. Human Vaccines & Immunotherapeutics. 14(12):3026–3033. doi: 10.1080/21645515.2018.1496769.
- Charania NA, Paynter J, Lee AC, Watson DG, Turner NM. 2020. Vaccine-preventable disease-associated hospitalisations among migrant and non-migrant children in New Zealand. Journal of Immigrant and Minority Health. 22:223–231. doi: 10.1007/s10903-019-00888-4.
- Cormack D, Kukutai T, Cormack C. 2020. Not one byte more. In: Chen A., editor. Shouting zeros and ones. Digital technology, ethics and policy in New Zealand. Wellington: BWB books; p. 71–83.
- Davie G, Lilley R. 2018. Financial impact of injury in older workers: use of a national retrospective e-cohort to compare income patterns over 3 years in a universal injury compensation scheme. BMJ Open. 2018(8):e018995. doi: 10.1136/bmjopen-2017-018995.
- Desai T, Ritchie F, Welpton R. 2016. Five safes: designing data access for research. Economics Working Paper Series 1601. Bristol: University of the West of England.
- Dixon S. 2015. The employment and income effects of eight chronic and acute health conditions. New Zealand Treasury Working Paper 15/15. Wellington: New Zealand Treasury. https://treasury.govt.nz/sites/default/files/2015-12/twp15-15.pdf.
- Donovan GH, Gatziolis D, Longley I, Douwes J. 2018. Vegetation diversity protects against childhood asthma: results from a large New Zealand birth cohort. Nature Plants. 4:358–364. doi: 10.1038/s41477-018-0151-8.
- Donovan GH, Michael YL, Gatziolis D, ‘t Mannetje A, Douwes J. 2019a. Association between exposure to the natural environment, rurality, and attention-deficit hyperactivity disorder in children in New Zealand: a linkage study. The Lancet Planet Health. 3(5):E226–E234. doi: 10.1016/S2542-5196(19)30070-1.
- Donovan GH, Gatziolis D, Douwes J. 2019b. Relationship between exposure to the natural environment and recovery from hip or knee arthroplasty: a New Zealand retrospective cohort study. BMJ Open. 9:9. doi: 10.1136/bmjopen-2019-029522.
- Fergusson DM, Horwood LJ.. 2001. The Christchurch health and development study: review of findings on child and adolescent mental health. Australian and New Zealand Journal of Psychiatry. 35:287–296. [DOI] [PubMed] [Google Scholar]
- Fitzgerald E, Durie M.. 2000. Assessing and addressing Māori outcomes: preliminary findings from Te Hoe Nuku Roa Māori household research. New Zealand Population Review. 26:115–121. [Google Scholar]
- Greaves LM, Milne BJ.. 2020. The Māori in-between? Identity, health, and social service access needs. HRC-funded research project. https://hrc.govt.nz/resources/research-repository/maori-between-identity-health-and-social-service-access-needs.
- Gulliver P, Jonas M, Fanslow J, McIntosh T, Waayer D.. 2018. Surveys, social licence and the integrated data infrastructure. Aotearoa New Zealand Social Work. 30(3):57–71. [Google Scholar]
- Hamad AF, Walld R, Lix LM, Urquia ML, Roos LL, Wall-Wieler E.. 2021. Data resource profile: the Manitoba multigenerational cohort. International Journal of Epidemiology. dyab195. doi: 10.1093/ije/dyab195. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hamm NC, Hamad AF, Wall-Wieler E, Roos LL, Plana-Ripoll O, Lix LM.. 2021. Multigenerational health research using population-based linked databases. An international review. IJPDS. 6(1):17. [DOI] [PMC free article] [PubMed] [Google Scholar]
- JustSpeak . 2020. A justice system for everyone; [accessed 2021 October 31]. https://www.justspeak.org.nz/ourwork/justspeak-idi-research-a-justice-system-for-everyone.
- Kalkman S, van Delden J, Banerjee A, Tyl B, Mostert M, van Thiel G.. 2022. Patients’ and public views and attitudes towards the sharing of health data for research: a narrative review of the empirical evidence. Journal of Medical Ethics. 48:3–13. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kukutai T, Cormack D.. 2019. Mana motuhake ā-raraunga: datafication and social science research in Aotearoa. Kōtuitui: New Zealand Journal of Social Sciences. 14(2):201–208. doi: 10.1080/1177083X.2019.1648304. [DOI] [Google Scholar]
- Kukutai T, Cormack D.. 2021. Pushing the space: data sovereignty and self-determination in Aotearoa NZ. In: Walter M., Kukutai T., Carroll S. R., Rodriguez-Lonebear D., editors. Indigenous data sovereignty and policy. London: Routledge; p. 21–35. [Google Scholar]
- Leong KSW, McLay J, Derraik JGB, Gibb S, Shackleton N, Taylor RW, Glover M, Audas R, Taylor B, Milne BJ, Cutfield W.. 2020. Associations of prenatal and childhood antibiotics exposure with obesity at age 4 years. JAMA Network Open. 3:1. doi: 10.1001/jamanetworkopen.2019.19681. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lessof C. 2009. Ethical issues in longitudinal surveys. In: Lynn P, editor. Methodology of longitudinal surveys. Chichester: John Wiley & Sons, Ltd; p. 35–54. [Google Scholar]
- Milne BJ, Atkinson J, Blakely T, Day H, Douwes J, Gibb S, Nicolson M, Shackleton N, Sporle A, Teng A.. 2019. Data resource profile: the New Zealand Integrated Data Infrastructure (IDI). International Journal of Epidemiology. 48:3. doi: 10.1093/ije/dyz014. [DOI] [PubMed] [Google Scholar]
- Milne BJ, Li E, Sporle A.. 2020. Intergenerational analyses using the IDI: an update. COMPASS Research Centre. https://www.auckland.ac.nz/content/dam/uoa/auckland/arts/our-research/research-institutes-centres-groups/compass/whole-population-data-analysis/Intergenerational-Links-IDI-Update.pdf.
- Ministry of Health—Manatū Hauora. 2015. B4 school check. https://www.health.govt.nz/your-health/pregnancy-and-kids/services-and-support-you-and-your-child/well-child-tamariki-ora-visits/about-b4-school-check.
- Morton SMB, Atatoa-Carr PE, Grant CC, Robinson EM, Bandara DK, Bird A, Ivory VC, Kingi TKR, Liang R, Marks EJ, et al. 2013. Cohort profile: growing up in New Zealand. International Journal of Epidemiology. 42(1):65–75. doi: 10.1093/ije/dyr206. Epub 2012 Jan 13.
- Moses C. 2020. The integrated data infrastructure. In: Chen A, editor. Shouting zeros and ones. Digital technology, ethics and policy in New Zealand. Wellington: BWB books; p. 84–101.
- Nielson. 2020. Stats NZ’s social licence for data stewardship. Wellington: Statistics New Zealand.
- Paprica PA, de Melo MN, Schull MJ. 2019. Social licence and the general public’s attitudes toward research based on linked administrative health data: a qualitative study. CMAJ Open. 7(1):E40–E46. doi: 10.9778/cmajo.20180099.
- Paterson J, Percival T, Schluter P. 2008. Cohort profile, The Pacific Islands Families (PIF) study. International Journal of Epidemiology. 37:273–279.
- Paynter J, Goodyear-Smith F, Morgan J, Saxton P, Black S, Petousis-Harris H. 2019. Effectiveness of a group B outer membrane vesicle meningococcal vaccine in preventing hospitalization from gonorrhea in New Zealand: a retrospective cohort study. Vaccines. 7(1):5. doi: 10.3390/vaccines7010005.
- Pierse N, Ombler J, White M, Aspinall C, McMinn C, Atatoa-Carr P, Nelson J, Hawkes K, Fraser B, Cook H, Howden-Chapman P. 2019. Service usage by a New Zealand housing first cohort prior to being housed. SSM - Population Health. 8:100432. doi: 10.1016/j.ssmph.2019.100432.
- Richmond-Rakerd LS, D’Souza S, Milne BJ, Caspi A, Moffitt TE. 2021. Longitudinal associations of mental disorders with physical diseases and mortality among 2.3 million New Zealanders. JAMA Network Open. 4(1):e2033448. doi: 10.1001/jamanetworkopen.2020.33448.
- Rowe RK, Carroll SR, Healy C, Rodriguez-Lonebear D, Walker JD. 2021. The SEEDS of Indigenous population health data linkage. IJPDS. 6(1):1417. doi: 10.23889/ijpds.v6i1.1417.
- Roy A, Noy I, Cuffe HE. 2022. Income and extratropical cyclones in New Zealand. Journal of Environmental Management. 311:114852.
- Schluter PJ, Audas R, Kokaua J, McNeill B, Taylor B, Milne BJ, Gillon G. 2020. The efficacy of preschool developmental indicators as a screen for early primary school-based literacy interventions. Child Development. 91(1):e59–e76. doi: 10.1111/cdev.13145.
- Schluter PJ, Kokaua J, Tautolo ES, Richards R, Taleni T, Kim HM, Audas R, McNeill B, Taylor B, Gillon G. 2018. Patterns of early primary school-based literacy interventions among Pacific children from a nationwide health screening programme of 4 year olds. Scientific Reports. 8(1):12368. doi: 10.1038/s41598-018-29939-w.
- Segal L, Armfield JM, Gnanamanickam ES, Preen DB, Brown DS. 2021. Child maltreatment and mortality in young adults. Pediatrics. 147(1):e2020023416.
- Shackleton N, Li E, Gibb S, Kvalsvig A, Baker M, Sporle A, Bentley R, Milne B. 2021. The relationship between income poverty and child hospitalisations in New Zealand: evidence from longitudinal household panel data and census data. PLoS One. 16(1):e0243920. doi: 10.1371/journal.pone.0243920.
- Silva P, Stanton W. 1996. From child to adult: the Dunedin multidisciplinary health and development study. Auckland: Oxford University Press.
- Slykerman RF, Li E, Shackleton N, Milne B. 2020. Birth by caesarean section and educational achievement in adolescents. Australian and New Zealand Journal of Obstetrics and Gynaecology. 61:3. doi: 10.1111/ajo.13276.
- Sporle A. 2018. Te Hao Nui. HRC-funded research project. https://hrc.govt.nz/resources/research-repository/te-hao-nui.
- Sporle A, Hudson M, West K. 2020. Indigenous data and policy in Aotearoa New Zealand. In: Walter M, Kukutai T, Carroll SR, Rodriguez-Lonebear D, editors. Indigenous data sovereignty and policy. London: Routledge; p. 62–80.
- Stats NZ. 2021a. Integrated Data Infrastructure; [accessed 2021 October 29]. https://www.stats.govt.nz/integrated-data/integrated-data-infrastructure/.
- Stats NZ. 2021b. Integrated Data Infrastructure (IDI) refresh: linking report, September 2021 refresh. https://www.stats.govt.nz/integrated-data/integrated-data-infrastructure/data-in-the-idi/.
- Stats NZ. 2021c. Experimental administrative population census; [accessed 2021 October 31]. https://www.stats.govt.nz/experimental/experimental-administrative-population-census.
- Svardal CA, Waldie K, Milne B, Morton SM, D’Souza S. 2022. Prevalence of antidepressant use and unmedicated depression in pregnant New Zealand women. Australian and New Zealand Journal of Psychiatry. 48(5):489–499.
- Te Mana Raraunga. 2021. What is Māori Data Sovereignty? [accessed 2021 October 30]. https://www.temanararaunga.maori.nz/.
- Teng AM, Blakely T, Ivory V, Kingham S, Cameron V. 2017. Living in areas with different levels of earthquake damage and association with risk of cardiovascular disease: a cohort-linkage study. The Lancet Planetary Health. 1(6):E242–E253. doi: 10.1016/S2542-5196(17)30101-8.
- Thompson JM, Clark PM, Robinson E, Becroft DM, Pattison NS, Glavish N, Pryor JE, Wild CJ, Rees K, Mitchell EA. 2001. Risk factors for small-for-gestational-age babies: the Auckland birthweight collaborative study. Journal of Paediatrics and Child Health. 37(4):369–375. doi: 10.1046/j.1440-1754.2001.00684.x.
- Vaithianathan R, Wilson M, Maloney T, Baird S. 2016. Impact of the family start home visiting programme on outcomes for mothers and children: a quasi-experimental study – Ministry of Social Development. Wellington: MSD. https://www.msd.govt.nz/about-msd-and-our-work/publications-resources/evaluation/family-start-outcomes-study/.
- Walker JD, Pyper E, Jones CR, Khan S, Chong N, Legge D, Schull MJ, Henry D. 2018. Unlocking first nations health information through data linkage. IJPDS. 3(1):450. doi: 10.23889/ijpds.v3i1.450.
- Walter M, Lovett R, Maher B, Williamson B, Prehn J, Bodkin-Andrews G, Lee V. 2021. Indigenous data sovereignty in the era of big data and open data. Australian Journal of Social Issues. 56(2):143–156. doi: 10.1002/ajs4.141.
- Zhao J, Gibb S, Jackson R, Mehta S, Exeter DJ. 2018. Constructing whole of population cohorts for health and social research using the New Zealand integrated data infrastructure. Australian and New Zealand Journal of Public Health. 42(4):382–388. doi: 10.1111/1753-6405.12781.