2023 Aug 16;23:186. doi: 10.1186/s12874-023-02000-9

Table 2.

Data extraction form

Questions	Possible categories
Subject area
What is the study’s subject area?	Cardiology, Oncology, Psychiatry, Neurology, etc.
Data type
Were data from EHRs or EMRs used?	Yes or no.
If not, what type of data were used?	Cohort study data, Patient registry data, etc.
Specify the name of the observational database.	Free text.
Data structure
Were structured data used?	Yes or no.
Were unstructured data used?	Yes or no.
If unstructured data were used, were these manually or automatically processed?	Manually or automatically.
Eligibility criteria
What is the target population?	Free text.
Treatments
How many treatments were compared?	Number of treatments.
What treatments were compared?	Free text.
Outcomes
What was (were) the primary outcome(s)?	Free text.
Follow-up
Was the follow-up duration pre-specified?	Yes or no.
Statistical objectives
What is the estimand of interest?	Causal effect of point treatment offer (‘intention-to-treat effect’), causal effect of point treatment receipt (‘per-protocol effect’), causal effect of treatment regimen initiation (‘intention-to-treat effect’) or causal effect of sustained treatment regimen (‘per-protocol effect’).
What was the measurement scale of the outcome(s)?	Continuous, ordinal, binary, time-to-event, other.
Which effect size measure was used to quantify the causal contrast of interest?	Mean difference, odds ratio, hazard ratio, other.
Which statistical method was used for analysing the primary outcome(s)?	Pooled logistic regression, Cox proportional hazards model, etc.
Were sample size or statistical power calculations provided?	Yes or no.
If yes, what was determined?	Power or the effect size.
Treatment assignment procedures
Were treatments administered at one point in time or sustained over time?	Point treatment or treatment regimen.
In either case, were pre-initiation confounders adjusted for?	Yes or no.
If yes, which statistical method was used for this purpose?	Inclusion of covariates in model, stratification, inverse probability of treatment weighting, propensity score methods, parametric g-formula, other, method not specified.
If treatment regimen, are the investigators interested in the effect of initiating a treatment or the effect of sustaining a treatment?	Initiation or sustained treatment.
If interested in the effect of a sustained treatment, did they account for time-varying confounders?	Yes or no.
If yes, which statistical method was used for this purpose?	Inverse probability of treatment weighting, parametric g-formula, other, method not specified.
Other bias handling
Was immortal-time bias addressed?	Yes or no.
If yes, how was immortal-time bias handled?	Avoided at the study design stage or handled using the cloning technique.
Was selection bias due to loss to follow-up addressed explicitly?	Yes or no.
If so, how were missing outcome data handled?	Inverse probability of censoring weighting, multiple imputation, etc.

Abbreviations: EHRs, Electronic Health Records; EMRs, Electronic Medical Records
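
Several questions in the form are conditional on an earlier answer ("If not, ...", "If yes, ..."). As a minimal sketch of how one extracted study could be recorded and checked against those dependencies, the Python fragment below models a small subset of the form. The class name `ExtractionRecord`, its field names, and the `problems` method are hypothetical and are not taken from the article; they only illustrate the branching logic of the table.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ExtractionRecord:
    """One reviewed study; models only a few questions of the form (hypothetical names)."""
    subject_area: str                                  # e.g. "Cardiology"
    ehr_or_emr_used: bool                              # "Were data from EHRs or EMRs used?"
    other_data_type: Optional[str] = None              # asked only if EHR/EMR data were not used
    immortal_time_bias_addressed: bool = False         # "Was immortal-time bias addressed?"
    immortal_time_bias_method: Optional[str] = None    # asked only if the bias was addressed

    def problems(self) -> List[str]:
        """List follow-up questions that should have been answered but were left empty."""
        found = []
        if not self.ehr_or_emr_used and not self.other_data_type:
            found.append("data type not recorded although EHR/EMR data were not used")
        if self.immortal_time_bias_addressed and not self.immortal_time_bias_method:
            found.append("immortal-time bias handling not recorded")
        return found


# Usage: a registry-based study whose bias handling was left blank.
record = ExtractionRecord(
    subject_area="Oncology",
    ehr_or_emr_used=False,
    other_data_type="Patient registry data",
    immortal_time_bias_addressed=True,
)
for issue in record.problems():
    print(issue)
```

The same pattern would extend to the other conditional pairs in the form, such as the method used for time-varying confounders or the handling of missing outcome data.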