Table 4.
Section/topic | Item No | Recommendation | Page No Study (1) [20] | Page No Study (2) [29] |
---|---|---|---|---|
Objectives | ||||
Background/rationale | 1 | Explain the scientific background and rationale for the study being reported in one or two sentences | Page 1, section “Abstract”, paragraph 1, line 1–7 | Page 1, section “Abstract”, paragraph 1, line 1–4 |
Prespecified hypotheses | 2 | State prespecified hypotheses in one or two sentences | Page 2, section “Introduction”, paragraph 3, line 1–2 | N/A |
Study design: data sources selection & variables selection & data integration | ||||
Data source | 3a | Describe the time coverage | FCDS: Page 2, section “Data source and case selection”, paragraph 1, line 2 | FCDS: Page 4, section “Data sources”, paragraph 1, line 11 |
BRFSS: Page 2, section “Data source and case selection”, paragraph 1, line 6 | BRFSS: N/A | |||
2000 U.S. census data: Page 2, section “Data source and case selection”, paragraph 1, line 7 | United States Census Bureau: Page 4, section “Data sources”, paragraph 1, line 23 | |||
ATSDR: N/A | ||||
County Health Ranking & Roadmaps: N/A | ||||
3b | Describe the geographic coverage | FCDS: Page 2, section “Data source and case selection”, paragraph 1, line 4–5 | FCDS: Page 4, section “Data sources”, paragraph 1, line 12–14 | |
BRFSS: N/A | BRFSS: Page 10, section “Result”, paragraph 2, line 7–8 | |||
2000 U.S. census data: N/A | United States Census Bureau: N/A | |||
ATSDR: N/A | ||||
County Health Ranking & Roadmaps: N/A | ||||
3c | Describe the sample size | FCDS: Page 2, section “Data source and case selection”, paragraph 2, line 7 | FCDS: Page 4, section “Data sources”, paragraph 2, line 6–7 | |
BRFSS: N/A | BRFSS: N/A | |||
2000 U.S. census data: N/A | United States Census Bureau: N/A | |||
ATSDR: N/A | ||||
County Health Ranking & Roadmaps: N/A | ||||
3d | Describe the demographic distribution | FCDS: Page 2, Table 1 | N/A | |
BRFSS: N/A | ||||
2000 U.S. census data: N/A | ||||
3e | Describe the cohort criteria | FCDS: Page 2, section “Data source and case selection”, paragraph 2, line 1–5 | FCDS: Page 4, section “Data sources”, paragraph 2, line 1–6 | |
BRFSS: N/A | BRFSS: N/A | |||
2000 U.S. census data: N/A | United States Census Bureau: N/A | |||
ATSDR: N/A | ||||
County Health Ranking & Roadmaps: N/A | ||||
3f | Describe the sources of bias | N/A | N/A | |
3g | Describe the data collection approach | N/A | FCDS: N/A | |
BRFSS: Page 4, section “Data sources”, paragraph 2, line 6–7 | ||||
United States Census Bureau: N/A | ||||
ATSDR: N/A | ||||
County Health Ranking & Roadmaps: N/A | ||||
Dependent variable | 4a | State the variable definition and variable type (e.g., primary outcome variable, secondary outcome variable) | Survival time: Page 2, section “Variable definitions”, line 1–3 | Cancer survival: Page 4, section “Data integration use case: The multi-level integrative data analysis of Cancer survival”, paragraph 1, line 1–2 |
4b | State the data source of dependent variable | Survival time: Page 2, section “Data source and case selection”, paragraph 1, line 2 | Cancer survival: Page 4, section “Data sources”, paragraph 1, line 9–14 | |
4c | State the data type (e.g., numerical, categorical, date-time) of dependent variable | Survival time: Page 2, section “Variable definitions”, paragraph 1, line 1 | Cancer survival: N/A | |
4d | State descriptive statistics (e.g., min, max, median, value range, percentile) of dependent variable | Survival time: Page 4, Table 1 | Cancer survival: N/A | |
4e | State the NIMHD domain and levels of dependent variable | Survival time: Page 2, section “Data source and case selection”, paragraph 1, line 1–2 | Cancer survival: Page 4, section “Data sources”, paragraph 2, line 15 | |
Independent variable | 5a | State the variable definition and variable type (e.g., primary predictor, secondary predictor) | Socioeconomic status: Page 2, section “Variable definitions”, paragraph 3, line 1–2 | Demographic variables: Page 5, Table 1 |
Individual smoking: Page 2, section “Data source and case selection”, paragraph 2, line 1–2 | Smoking status: Page 10, section “The ontology for Cancer research variables (OCRV)”, paragraph 2, line 13–27 | |||
Regional smoking: Page 3, section “Data source and case selection”, paragraph 2, line 4–6 | Marital status: Page 14, section “Type 4: Queries that generate results based on the knowledge encoded in ontology”, paragraph 2, line 7–10 | |||
Insurance payer: Page 5, Table 1 | ||||
Residency: Page 5, Table 1 | ||||
Age at diagnosis: Page 5, Table 1 | ||||
Year of diagnosis: Page 5, Table 1 | ||||
Tumor stage: Page 5, Table 1 | ||||
Tumor type: Page 5, Table 1 | ||||
Treatment procedure: Page 5, Table 1 | ||||
Census Tract SVI: Page 14, section “Type 3: Queries that are used to link a patient to contextual factors through geographic variables”, paragraph 1, line 5–16 | ||||
Census tract high school completion rates: Page 5, Table 1 | ||||
Census tract family poverty rates: Page 5, Table 1 | ||||
Census tract rurality status: Page 4, section “Data integration use case: The multi-level integrative data analysis of Cancer survival”, paragraph 1, line 8–11 | ||||
County adult mental and physical health status: Page 5, Table 1 | ||||
County density of primary care physicians: Page 5, Table 1 | ||||
County smoking rate: Page 10, section “The ontology for Cancer research variables (OCRV)”, paragraph 2 | ||||
County alcohol consumption rate: Page 5, Table 1 | ||||
5b | State the data type (e.g., numerical, categorical) of independent variable | Socioeconomic status: Page 2, section “Variable definitions”, paragraph 3, line 9–10 | Demographic variables: N/A | |
Individual smoking: Page 2, section “Data source and case selection”, paragraph 2, line 2–3 | Smoking status: Page 13, Table 3 | |||
Regional smoking: Page 3, section “Data source and case selection”, paragraph 2, line 4–6 | Marital status: Page 14, section “Type 4: Queries that generate results based on the knowledge encoded in ontology”, paragraph 2, line 7–10 | |||
Insurance payer: N/A | ||||
Residency: N/A | ||||
Age at diagnosis: Page 16, Fig. 6 | ||||
Year of diagnosis: Page 16, Fig. 6 | ||||
Tumor stage: N/A | ||||
Tumor type: Page 4, section “Data sources”, paragraph 2, line 1–6 | ||||
Treatment procedure: Page 5, Table 1 | ||||
Census Tract SVI: Page 14, section “Type 3: Queries that are used to link a patient to contextual factors through geographic variables”, paragraph 1, line 5–16 | ||||
Census tract high school completion rates: N/A | ||||
Census tract family poverty rates: N/A | ||||
Census tract rurality status: N/A | ||||
County adult mental and physical health status: N/A | ||||
County density of primary care physicians: N/A | ||||
County smoking rate: Page 10, section “The ontology for Cancer research variables (OCRV)”, paragraph 2 | ||||
County alcohol consumption rate: N/A | ||||
5c | State the data source of independent variable | Socioeconomic status: Page 2, section “Data source and case selection”, paragraph 1, line 6–7 | Page 5, Table 1 | |
Individual smoking: Page 2, section “Data source and case selection”, paragraph 1, line 1–2 | ||||
Regional smoking: Page 2, section “Data source and case selection”, paragraph 1, line 7–10 | ||||
5d | State descriptive statistics (e.g., min, max, median, value range, percentile) of independent variable | Page 4, Table 1 | N/A | |
5e | State the NIMHD domain and levels of independent variable | Socioeconomic status: Page 2, section “Data source and case selection”, paragraph 1, line 6 | Page 5, Table 1 | |
Individual smoking: Page 2, section “Data source and case selection”, paragraph 2, line 1 | ||||
Regional smoking: Page 3, section “Data source and case selection”, paragraph 2, line 4–6 | ||||
Controlled variable | 6a | State the controlled variable and variable type (e.g., numerical, categorical) of controlled variable | Age of diagnosis: Page 2, section “Variable definitions”, paragraph 1, line 10–13 | N/A |
Anatomic site: Page 2, section “Variable definitions”, paragraph 1, line 2–9 | ||||
Race-ethnicity: Page 4, Table 1 | ||||
Marital status: Page 4, Table 1 | ||||
Insurance: Page 4, Table 1 | ||||
Year of diagnosis: Page 4, Table 1 | ||||
Gender: Page 4, Table 1 | ||||
Stage of diagnosis: Page 4, Table 1 | ||||
Treatment: Page 4, Table 1 | ||||
6b | State the data source of controlled variable | Page 2, section “Data source and case selection”, paragraph 1, line 2ᵃ | N/A | |
6c | State descriptive statistics (e.g., min, max, median, value range, percentile) of controlled variable | Page 2, section “Data source and case selection”, paragraph 1, line 2ᵃ | N/A | |
6d | State the NIMHD domain and levels of controlled variable | Page 2, section “Data source and case selection”, paragraph 1, line 1–5ᵃ | N/A | |
Missing data | 7a | For each data source, describe whether any required or expected variable is not present | N/A | N/A |
7b | For each variable, describe the method used to handle missing data | N/A | N/A | |
7c | For each variable, describe the missing rate | N/A | N/A | |
Data processing | 9a | Data extraction: for each variable, describe how to process the raw data source to extract the variable | N/A | Demographic variables: Page 15, Fig. 5 |
Age at diagnosis: Page 16, Fig. 6 | ||||
Census Tract SVI: Page 16, Fig. 7 | ||||
County smoking rate: Page 17, Fig. 8 | ||||
Marital status: Page 18, Fig. 9 | ||||
9b | Data cleaning: for each variable, describe the method used to detect and correct (or remove) the incorrect records, missing values or outliers | N/A | N/A | |
Integration strategy | 10 | Describe the integration strategy for each variable: 1) integration with variables from the same level, 2) integration with variables from different levels, and 3) creation of additional computed elements | Socioeconomic status: Page 2, section “Variable definitions”, paragraph 3, line 6–7 | Demographic variables: Page 15, Fig. 5 |
Regional smoking: Page 2, section “Variable definitions”, paragraph 2, line 4–5 | Age at diagnosis: Page 16, Fig. 6 | |||
Census Tract SVI: Page 16, Fig. 7 | ||||
County smoking rate: Page 17, Fig. 8 | ||||
Marital status: Page 18, Fig. 9 | ||||
Census tract high school completion rates: Page 15, section “Type 3: Queries that are used to link a patient to contextual factors through geographic variables”, paragraph 1, line 1–3 | ||||
Census tract family poverty rates: Page 15, section “Type 3: Queries that are used to link a patient to contextual factors through geographic variables”, paragraph 1, line 1–3 | ||||
Census tract rurality status: Page 15, section “Type 3: Queries that are used to link a patient to contextual factors through geographic variables”, paragraph 1, line 1–3 | ||||
County adult mental and physical health status: Page 15, section “Type 3: Queries that are used to link a patient to contextual factors through geographic variables”, paragraph 1, line 1–3 | ||||
County density of primary care physicians: Page 15, section “Type 3: Queries that are used to link a patient to contextual factors through geographic variables”, paragraph 1, line 1–3 | ||||
County alcohol consumption rate: Page 15, section “Type 3: Queries that are used to link a patient to contextual factors through geographic variables”, paragraph 1, line 1–3 | ||||
Integration algorithms | 11 | For each variable, describe the algorithm used to integrate it with variables from other data sources | N/A | Demographic variables: Page 15, Fig. 5 |
Age at diagnosis: Page 16, Fig. 6 | ||||
Census Tract SVI: Page 16, Fig. 7 | ||||
County smoking rate: Page 17, Fig. 8 | ||||
Marital status: Page 18, Fig. 9 | ||||
Census tract high school completion rates: Page 15, section “Type 3: Queries that are used to link a patient to contextual factors through geographic variables”, paragraph 1, line 1–3 | ||||
Census tract family poverty rates: Page 15, section “Type 3: Queries that are used to link a patient to contextual factors through geographic variables”, paragraph 1, line 1–3 | ||||
Census tract rurality status: Page 15, section “Type 3: Queries that are used to link a patient to contextual factors through geographic variables”, paragraph 1, line 1–3 | ||||
County adult mental and physical health status: Page 15, section “Type 3: Queries that are used to link a patient to contextual factors through geographic variables”, paragraph 1, line 1–3 | ||||
County density of primary care physicians: Page 15, section “Type 3: Queries that are used to link a patient to contextual factors through geographic variables”, paragraph 1, line 1–3 | ||||
County alcohol consumption rate: Page 15, section “Type 3: Queries that are used to link a patient to contextual factors through geographic variables”, paragraph 1, line 1–3 | ||||
Variable validation | 12 | For each variable, describe the data validation rule for the selected variable. The rule should identify both the variable and the validation algorithms | N/A | Demographic variables: Page 19, section “Data quality and consistency checks of the source data using the ontology” |
Integrated variable | 13 | Describe the variable after integration and basic descriptive statistics (e.g., min, max, median, value range, percentile) | N/A | Page 18, Table 4 |
FCDS Florida Cancer Data System
ATSDR Agency for Toxic Substances and Disease Registry
BRFSS Behavioral Risk Factor Surveillance System
ᵃ If the reported items for all variables or data sources are described in the same place, the page/section/table information can be listed once. For the integration-related items, only variables that have the information are presented (N/A is not shown in the table)