/*
Before running any analyses, set a path directory for a single folder using the
"cd" command. Within this folder, create two new folders:
    1) "raw_data"
    2) "data"
The "raw_data" folder should contain all raw datasets from the SECCYD. Each
section of this do-file lists the raw datasets that are used to create the
variables used in the analyses. The "data" folder will be used to save altered
datasets that contain cleaned variables.

This section of the file cleans the demographic measures. Variables created
here include:

Table 3 Name                        Syntax Name
------------                        -------------
Male                                male
White                               dwhite
Black                               dblack
Hispanic                            dhisp
Other                               dother
Birth Weight (g)                    wtgms
Log of Family Income                logincome
Mother's Age at Birth               momage
Mother's Education                  momed
Mother's PPVT                       ppvt
Site (not listed in Table 3)        site1 - site10

EXTRA Variables
Birthdate (for calculating age at 54 month interview)

Datasets used include: demo0, demo1, demo6, demo15, demo24, demo36, demo54,
fam36, and fam54
*/

use raw_data/demo0.dta, clear

*Merging on other datasets
foreach data in "demo1.dta" "demo6.dta" "demo15.dta" "demo24.dta" "demo36.dta" ///
    "demo54.dta" "fam36.dta" "fam54.dta" {
    merge 1:1 id using raw_data/`data'
    drop _merge
}

****** Gender ***********
rename CSEX_M01 gender
codebook gender
gen male=.
replace male=1 if gender==1
replace male=0 if gender==2
gen female=.
replace female=1 if gender==2
replace female=0 if gender==1
tab1 male
*705 males; 52%

****** Ethnicity ***********
rename CRACEM01 ethnicity
codebook ethnicity
tab MHISPM01
tab ethnicity CHISPM01
*Hispanic was not designated as an ethnic category, so Hispanics are classified
*across categories
gen dother=.
replace dother=1 if (ethnicity==1 | ethnicity==2 | ethnicity==5) & CHISPM01!=1
replace dother=0 if dother!=1 & ethnicity!=.
tab dother CHISPM01
tab dother ethnicity if CHISPM01==0
gen dblack=.
replace dblack=1 if ethnicity==3 & CHISPM01!=1
replace dblack=0 if dblack!=1 & ethnicity!=.
tab dblack CHISPM01
tab dblack ethnicity if CHISPM01==0
gen dwhite=.
replace dwhite=1 if ethnicity==4 & CHISPM01!=1
replace dwhite=0 if dwhite!=1 & ethnicity!=.
tab dwhite CHISPM01
tab dwhite ethnicity if CHISPM01==0
gen dhisp=.
replace dhisp=1 if CHISPM01==1
replace dhisp=0 if CHISPM01!=1 & CHISPM01!=.
tab dhisp
tab1 dother dblack dhisp dwhite
*the 1's add up to 1364 (i.e., no missingness)
*dother - 66; 4.85%
*dblack - 173; 12.68%
*dhisp- 83; 6.09%
*dwhite- 1,042; 76.39%

****** Birth Weight ***********
rename BWTGMM00 wtgms

******Income*****************
foreach var in INCNTM01 INCNTM06 INCNTM15 INCNTM24 INCNTM36 INCNTM54 {
    gen missing`var'=.
    replace missing`var'= 1 if `var' == .
}
misstable sum INCNTM*
tab1 missingINCNTM*
egen sum_missing_income = rowtotal (missingINCNTM01 missingINCNTM06 missingINCNTM15 missingINCNTM24 ///
    missingINCNTM36 missingINCNTM54)
tab sum_missing_income
*restricting the average to people who have at least 2/6 observations*
egen incomeavg= rowmean (INCNTM01 INCNTM06 INCNTM15 INCNTM24 INCNTM36 INCNTM54) if sum_missing_income < 5
sum incomeavg
*M= 3.46; SD= 2.70; N= 1299
gen logincome=log(incomeavg)
sum logincome
*M= .958; SD= .796; N= 1299

******Mother's Education***********
rename MEDUCM01 momed
sum momed
*M= 14.23 SD= 2.51; N= 1363

******Mother's PPVT***********
rename STDSCM36 ppvt
sum ppvt
*M= 99.01; SD= 18.35; N=1167

******Mother's Age***********
rename MAGE_M01 momage
sum momage
*M= 28.12; SD= 5.63; N= 1364

******Site***********
tab site, gen(site)
/*
. tab site, gen(site)

LOCATION OF |
       DATA |
 COLLECTION |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |        150       11.00       11.00
          1 |        132        9.68       20.67
          2 |        133        9.75       30.43
          3 |        140       10.26       40.69
          4 |        123        9.02       49.71
          5 |        136        9.97       59.68
          6 |        136        9.97       69.65
          7 |        139       10.19       79.84
          8 |        144       10.56       90.40
          9 |        131        9.60      100.00
*/

*Saving dataset for first set of variables
keep id male dwhite dblack dhisp dother wtgms logincome momage momed ///
    ppvt site1-site10
save data/seccyd_1.dta, replace

*------------------------------------------------------------------------------*
*------------------------------------------------------------------------------*
*------------------------------------------------------------------------------*

/*
This section of the file cleans the remaining "Child Demographic and Home
Controls" listed in Table 3.

Table 3 Name                        Syntax Name
------------                        -------------
Child's Age at Delay Meas (mos)     agemo
Bracken Standard Score (36 mos)     bracken
Bayley (24 mos)                     bayley
Child Temperament (6 mos)           temperament_6
HOME Score
    Learning Materials              hhlrnm36
    Language Stimulation            hhlanm36
    Physical Environment            hhphym36
    Responsivity                    hhresm36
    Academic Stimulation            hhacam36
    Modeling                        hhmodm36
    Variety                         hhvarm36
    Acceptance                      hhaccm36
    Responsivity- Empirical Scale   hhrsem36

Datasets used include: demo0, cout6, cout24, cout36, cout54, home36
*/

use raw_data/demo0.dta, clear

*Merging on other datasets
foreach data in "cout6" "cout24.dta" "cout36.dta" "cout54.dta" "home36.dta" {
    merge 1:1 id using raw_data/`data'
    drop _merge
}

***** Age at Delay Measure ********
*start with birthdate
rename BRDATM00 birthdate
sum birthdate
*M= 11456.81 (stored as stata numeric daily date) N= 1364
*date of gratification delay measure
sum INTDT55E
*M= 13161.63 (stored as stata numeric daily date) N= 1038
gen agemo= INTDT55E - birthdate
*in days
replace agemo= agemo/30.42
*convert to months, using average number of days per month in a year
sum agemo
*M= 56.05; SD= 1.14; N= 1038

***** Bracken Standard Score ********
rename BKSTDO36 bracken
sum bracken
*M= 9.02; SD= 2.89; N= 1159

***** Bayley *************
rename MDI24O24 bayley
sum bayley
*M= 92.15; SD=14.64; N=1162

**** Child Temperament *********
rename TEMP_M06 temperament_6
sum temperament_6
*M= 3.18; SD= .40 N=1279

***** HOME SCORES ***************
*Making names lowercase
foreach var of varlist HHLRNM36- HHRSEM36 {
    rename `var' `=lower("`var'")'
}
sum hhlrnm36 hhlanm36 hhphym36 hhresm36 hhacam36 hhmodm36 hhvarm36 ///
    hhaccm36 hhrsem36
/*
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
    hhlrnm36 |      1179    7.157761     2.519644         0         11
    hhlanm36 |      1179    6.016964     1.135049         0          7
    hhphym36 |      1179    5.995759     1.282962         0          7
    hhresm36 |      1179    5.605598     1.362113         0          7
    hhacam36 |      1179    3.370653      1.22406         0          5
-------------+--------------------------------------------------------
    hhmodm36 |      1179    3.166243     1.131962         0          5
    hhvarm36 |      1179    6.754877     1.503296         1          9
    hhaccm36 |      1179     3.38592     .9192371         0          4
    hhrsem36 |      1179    5.429177     1.046798         0          6
*/
keep id agemo bracken bayley temperament_6 hhlrnm36 hhlanm36 hhphym36 ///
    hhresm36 hhacam36 hhmodm36 hhvarm36 hhaccm36 hhrsem36
save data/seccyd_2.dta, replace

*------------------------------------------------------------------------------*
*------------------------------------------------------------------------------*
*------------------------------------------------------------------------------*

/*
This section of the file cleans the variables measuring 54-month cognitive
skills and behaviors, including the delay of gratification measure.

Table 2 Name                            Syntax Name
------------                            -------------
Delay of gratification (min. waited)    dog_min
Delay of gratification (categories)
    7 minutes                           d4
    2 to 7 minutes                      d3
    0.333 to 2 minutes                  d2
    < 0.333 minutes                     d1

Table 3 Name                            Syntax Name
------------                            -------------
54 mos. WJ-R scores
    Letter-Word ID                      lwid_ss_54
    Applied Problems                    appld_ss_54
    Picture Vocabulary                  picvo_ss_54
    Memory for Sentences                memse_ss_54
    Incomplete Words                    incom_ss_54
54 mos. Child Behavioral Checklist
    Internalizing                       internalizing_54
    Externalizing                       externalizing_54

Datasets used: cout54
*/

use raw_data/cout54.dta, clear

********* Delay of Gratification **************
sum DOG*
rename DOGPFO54 dog_pass
tab dog_pass
*N= 966, 514 passed (53.21%)
*make sure 1 = pass
assert DOGTWO54== 7 if dog_pass==1
*good, everyone who passed has max wait time: 7 min

*KEY INDEPENDENT VARIABLE (MINUTES WAITED):
rename DOGTWO54 dog_min
sum dog_min
*M= 4.47 SD= 3.01; N= 961

********** Delay Categories ********************
gen d1=.
replace d1= 1 if dog_min<= .333
replace d1= 0 if dog_min > .333 & dog_min!=.
gen d2=.
replace d2= 1 if dog_min<= 2 & dog_min > .333
replace d2= 0 if (dog_min > 2 | dog_min <= .333) & dog_min!=.
gen d3=.
replace d3= 1 if dog_min> 2 & dog_min< 7
replace d3= 0 if (dog_min >= 7 | dog_min <= 2) & dog_min!=.
gen d4=.
replace d4= 1 if dog_min>= 7 & dog_min!=.
replace d4= 0 if dog_min < 7
tab1 d1 d2 d3 d4
/*
All 1's equal 961 (i.e., 961 kids have data across the 4 measures)
d1: 180 coded "1"; 18.73%
d2: 129 coded "1"; 13.42%
d3: 138 coded "1"; 14.36%
d4: 514 coded "1"; 53.49%
*/

********** 54 month WJ-R Scores ***************
*** Letter-Word ID ***
rename WJLWSC54 lwid_ss_54
sum lwid_ss_54
*M=98.93 SD= 13.52; N= 1056

*** Applied Problems ***
rename WJAPSC54 appld_ss_54
sum appld_ss_54
*M= 102.94 SD= 15.63; N=1053

**** Picture Vocabulary ***
rename WJPVSC54 picvo_ss_54
sum picvo_ss_54
*M= 100.24 SD=15.03; N=1060

**** Memory for Sentences ***
rename WJMSSC54 memse_ss_54
sum memse_ss_54
*M= 91.74 SD= 18.49; N=1054

**** Incomplete Words ***
rename WJIWSC54 incom_ss_54
sum incom_ss_54
*M= 96.67 SD= 13.63; N=1050

********** 54 month Child Behavioral Checklist ***************
*** Externalizing ***
rename BEX_TM54 externalizing_54
sum externalizing_54
*M=51.69 SD=9.39; N=1061

*** Internalizing ***
rename BIN_TM54 internalizing_54
sum internalizing_54
*M= 47.29 SD=8.88; N=1061

keep id dog_min d1 d2 d3 d4 lwid_ss_54 appld_ss_54 picvo_ss_54 ///
    memse_ss_54 incom_ss_54 externalizing_54 internalizing_54
save data/seccyd_3.dta, replace

*------------------------------------------------------------------------------*
*------------------------------------------------------------------------------*
*------------------------------------------------------------------------------*

/*
This section of the file cleans the key outcome variables: grade 1 and age-15
achievement and behavior.

Table 2 Name                        Syntax Name
------------                        -------------
Outcome Measures- Grade 1
    Achievement Composite           ach1
    Behavior Composite              beh1
Outcome Measures - Age 15
    Achievement Composite           ach15
    Behavior Composite              beh15

Variables used to make composite (not listed in Table 2):
Letter-Word ID (grade 1)            lwid_ss_1
Applied Problems (grade 1)          appld_ss_1
Passage Comprehension (age 15)      passage_ss_15
Applied Problems (age 15)           appld_ss_15
Internalizing (grade 1)             internalizing_1st
Externalizing (grade 1)             externalizing_1st
Internalizing (age 15)              internalizing_15
Externalizing (age 15)              externalizing_15

*It should be noted that the age 15 measures of passage comprehension, applied
problems, internalizing and externalizing were used as the main dependent
variables in Table S8 (results for disaggregated outcome measures).

Datasets used: coutg1, coutx5
*/

use raw_data/coutg1.dta, clear
merge 1:1 id using raw_data/coutx5.dta
drop _merge

*** Letter-Word ID (grade 1) ***
rename WJLWSC1S lwid_ss_1
sum lwid_ss_1
*M= 111.98; SD=15.79; N= 1025

*** Applied Problems (grade 1) ***
rename WJAPSC1S appld_ss_1
sum appld_ss_1
*M= 110.80; SD=17.14; N= 1025

*** Passage Comprehension (age 15) ***
rename WJPCSCX5 passage_ss_15
sum passage_ss_15
*M= 107.71; SD=15.72; N= 887

*** Applied Problems (age 15) ***
rename WJAPSCX5 appld_ss_15
sum appld_ss_15
*M= 102.92; SD=14.22; N= 887

*** Internalizing (grade 1) ***
rename BIN_TM1S internalizing_1st
sum internalizing_1st
*M= 48.27; SD=8.94; N= 1028

*** Externalizing (grade 1) ***
rename BEX_TM1S externalizing_1st
sum externalizing_1st
*M= 48.64; SD=9.79; N= 1028

*** Internalizing (age 15) ***
rename BIN_TMX5 internalizing_15
sum internalizing_15
*M= 46.64; SD=9.86; N= 973

*** Externalizing (age 15) ***
rename BEX_TMX5 externalizing_15
sum externalizing_15
*M= 45.51; SD=10.46; N= 973

*********** KEY OUTCOME VARIABLES: COMPOSITE SCORES **************
egen ach1 = rowmean(lwid_ss_1 appld_ss_1)
egen ach15 = rowmean(passage_ss_15 appld_ss_15)
egen beh1 = rowmean(internalizing_1st externalizing_1st)
egen beh15 = rowmean(internalizing_15 externalizing_15)
sum ach1 ach15 beh1 beh15
/*
    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
        ach1 |      1,025    111.3854     14.59709        59        152
       ach15 |        892    105.2876      13.7189        46        160
        beh1 |      1,028    48.45331     8.322566        32         77
       beh15 |        973    46.07451      9.10633        32         83
*/
keep id ach1 beh1 ach15 beh15 lwid_ss_1 appld_ss_1 passage_ss_15 appld_ss_15 ///
    internalizing_1st externalizing_1st internalizing_15 externalizing_15
save data/seccyd_4.dta, replace

*------------------------------------------------------------------------------*
*------------------------------------------------------------------------------*
*------------------------------------------------------------------------------*

/*
This section of the file cleans the variables used in the supplemental
analyses. These variables appear primarily in the supplementary information
file.

Variable Name                           Syntax Name
------------                            -------------
Continuous Performance Task
    Sustained Attention                 propcorrect
    Impulsivity                         propincorrect
Self-Control Composite                  selfcontrol54
Supplemental Age 15 Behavioral Measures (Table S5)
    Stoplight- Brake Applications       int_brake
    Stoplight- Brake Time (ms)          int_waittime
    Internalizing (self-report)         internalizing_t
    Externalizing (self-report)         externalizing_t
    Impulse Control                     impulse_ctrl
    Risk Taking                         risk_taking

Variables used for self-control composite:
CBQ- Attentional Focusing (caregiv)     cbqattention_cg
CBQ- Inhibitory Control (caregiv)       cbqinhibitory_cg
CBQ- Attentional Focusing (mother)      cbqattention
CBQ- Inhibitory Control (mother)        cbqinhibitory

Datasets used: cout54, coutx5, cargiv54

Because id does not uniquely identify observations in the cargiv54 data file
(i.e., some children have multiple observations due to having multiple
caregivers), I start with that file and create a unique dataset before merging
on the other files.
*/

use raw_data/cargiv54.dta, clear
keep id ccid CBQAFA54 CBQICA54
egen miss= rowmiss(CBQAFA54 CBQICA54)
tab miss
sort id ccid miss
sum CBQAFA54 CBQICA54
/*
    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
    CBQAFA54 |        788    4.843954     1.012823      1.25          7
    CBQICA54 |        795    5.069679     1.048816       1.7          7
*/

/*For the CBQ items, every child has a response from only ONE caregiver.
I will drop all ID's who are missing on the 2 CBQ items */
drop if miss==2
codebook id
*803 id's; 803 unique values

*** CBQ- Attention- Caregiver ***
rename CBQAFA54 cbqattention_cg
sum cbqattention_cg
*M= 4.84; SD= 1.01; N=788

*** CBQ- Inhibitory Control- Caregiver ***
rename CBQICA54 cbqinhibitory_cg
sum cbqinhibitory_cg
*M= 5.07; SD= 1.05; N=795

drop ccid miss

****MERGING ON OTHER DATASETS ****
foreach data in cout54 coutx5 {
    merge 1:1 id using raw_data/`data'.dta
    drop _merge
}

*** CBQ- Attention- Mother ***
rename CBQAFM54 cbqattention
sum cbqattention
*M=4.71; SD=.85; N=1023

*** CBQ- Inhibitory- Mother ***
rename CBQICM54 cbqinhibitory
sum cbqinhibitory
*M=4.66; SD=.78; N=1061

****** SELF-CONTROL COMPOSITE (taken from Duckworth et al., 2013; p. 848) ******
sum cbqattention cbqinhibitory cbqattention_cg cbqinhibitory_cg
egen selfcontrol54= rowmean(cbqattention cbqinhibitory cbqattention_cg ///
    cbqinhibitory_cg)
sum selfcontrol54
*M= 4.77; SD= .72; N=1083

******* Continuous Performance Task ***************
rename CPPCRC54 propcorrect
sum propcorrect
*M= 0.75; SD= .19; N=1002
rename CPPIRC54 propincorrect
sum propincorrect
*M= 0.08; SD= .12; N=1002

****** Supplemental Age 15 Behavioral Measures ******
*** Stoplight Task- Brake Applications ***
rename NBRKSCX5 int_brake
sum int_brake
*M= 4.95; SD=1.42; N=934

*** Stoplight Task- Wait Time ***
rename ATBYBCX5 int_waittime
sum int_waittime
*M=911.11; SD=349.50; N=923

*** Risk Taking ***
rename ANYR_CX5 risk_taking
sum risk_taking
*M=6.16; SD=5.67; N=954

*** Internalizing (self-report) ***
rename BIN_TCX5 internalizing_t
sum internalizing_t
*M= 47.29; SD=10.17; N=956

*** Externalizing (self-report) ***
rename BEX_TCX5 externalizing_t
sum externalizing_t
*M= 49.31; SD= 9.91; N=956

*** Impulse Control ***
rename MPLSCCX5 impulse_ctrl
sum impulse_ctrl
*M= 3.51; SD= 0.90; N=957

keep id propcorrect propincorrect selfcontrol54 int_brake int_waittime risk_taking ///
    internalizing_t externalizing_t impulse_ctrl cbqattention_cg cbqinhibitory_cg ///
    cbqattention cbqinhibitory
save data/seccyd_5.dta, replace

*------------------------------------------------------------------------------*
*--------------------- MERGING TOGETHER ALL FILES ---------------------*
*------------------------------------------------------------------------------*

use data/seccyd_1.dta, clear
forvalues i = 2/5 {
    merge 1:1 id using data/seccyd_`i'.dta
    drop _merge
}
order id dog_min d1 d2 d3 d4 ach1 beh1 ach15 beh15

*Labeling Variables
label var dog_min "Delay of Gratification"
label var d4 "7 minutes"
label var d3 "2 to 7 minutes"
label var d2 "0.333 to 2 minutes"
label var d1 "< 0.333 minutes"
label var ach1 "Achievement Composite - G1"
label var beh1 "Behavior Composite- G1"
label var ach15 "Achievement Composite- Age 15"
label var beh15 "Behavior Composite- Age 15"
label var male "Male"
label var dwhite "White"
label var dblack "Black"
label var dhisp "Hispanic"
label var dother "Other"
label var agemo "Child's Age at Delay Measure"
label var wtgms "Birth Weight (g)"
label var bracken "Bracken Standard Score"
label var bayley "Bayley"
label var temperament_6 "Child Temperament"
label var logincome "Log of Family Income"
label var momage "Mother's Age at Birth"
label var momed "Mother's Education"
label var ppvt "Mother's PPVT"
label var hhlrnm36 "HOME Learning Materials"
label var hhlanm36 "HOME Language Stimulation"
label var hhphym36 "HOME Physical Environment"
label var hhresm36 "HOME Responsivity"
label var hhacam36 "HOME Academic Stimulation"
label var hhmodm36 "HOME Modeling"
label var hhvarm36 "HOME Variety"
label var hhaccm36 "HOME Acceptance"
label var hhrsem36 "HOME Responsivity- Empirical"
label var lwid_ss_54 "Letter-Word ID 54"
label var appld_ss_54 "Applied Problems 54"
label var picvo_ss_54 "Picture Vocab 54"
label var memse_ss_54 "Memory for Sentences 54"
label var incom_ss_54 "Incomplete Words 54"
label var internalizing_54 "Internalizing 54"
label var externalizing_54 "Externalizing 54"
label var propcorrect "CPT Attention 54"
label var propincorrect "CPT Impulsivity 54"
label var selfcontrol54 "Self-Control Comp. 54"
label var int_brake "Stoplight- Brake App"
label var int_waittime "Stoplight- Brake Time"
label var risk_taking "Risk Taking"
label var internalizing_t "Internalizing (self)"
label var externalizing_t "Externalizing (self)"
label var impulse_ctrl "Impulse Control"
label var cbqattention_cg "CBQ- Attention (caregiv)"
label var cbqinhibitory_cg "CBQ- Inhibitory (caregiv)"
label var cbqattention "CBQ- Attention (mom)"
label var cbqinhibitory "CBQ- Inhibitory (mom)"
label var internalizing_1st "Internalizing (G1)"
label var externalizing_1st "Externalizing (G1)"
label var internalizing_15 "Internalizing (Age 15)"
label var externalizing_15 "Externalizing (Age 15)"
label var lwid_ss_1 "Letter-Word ID (G1)"
label var appld_ss_1 "Applied Problems (G1)"
label var passage_ss_15 "Passage Comprehension (Age 15)"
label var appld_ss_15 "Applied Problems (Age 15)"
label var site1 "Site 1"
label var site2 "Site 2"
label var site3 "Site 3"
label var site4 "Site 4"
label var site5 "Site 5"
label var site6 "Site 6"
label var site7 "Site 7"
label var site8 "Site 8"
label var site9 "Site 9"
label var site10 "Site 10"

************** CHECKING FOR STRANGE MISSING VALUES ***************
misstable sum
*Replacing any alternative missing values to "."
foreach var in momed memse_ss_54 incom_ss_54 appld_ss_1 {
    replace `var' =. if `var' >.
}

************** SAVING FINAL DATASET FOR ANALYSIS ******************
save data/seccyd_marshmallow.dta, replace
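
*------------------------------------------------------------------------------*
*---------------- OPTIONAL CHECKS ON THE FINAL DATASET (SKETCH) ----------------*
*------------------------------------------------------------------------------*
/*
The checks below are an illustrative sketch and are not part of the original
cleaning steps. They assume the final dataset saved above is still in memory
and that the variables were built exactly as in the sections above. A failed
assert stops the do-file, so comment these lines out if a check does not apply
to your copy of the data.
*/

*id should uniquely identify children after the 1:1 merges
isid id

*each child with a delay score should fall into exactly one delay category
assert d1 + d2 + d3 + d4 == 1 if !missing(dog_min)

*each child with all four ethnicity dummies should fall into exactly one group
assert dwhite + dblack + dhisp + dother == 1 if !missing(dwhite, dblack, dhisp, dother)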