Author manuscript; available in PMC: 2021 Jan 1.
Published in final edited form as: Sci Stud Read. 2019 Jul 8;24(1):23–56. doi: 10.1080/10888438.2019.1631827

Table 3.

Descriptions of Twin Projects and Reading, ADHD symptom, and Math Assessments

Description
Twin Projects

Florida Twin Project on Reading (FTP): FTP is a cohort-sequential longitudinal twin study, with data on achievement, behavior, and the home and school environments for 5200 ethnically and racially diverse school-aged twins across Florida¹.
Twins Early Development Study (TEDS): TEDS is a UK-based longitudinal twin study with over 13,000 twin pairs, 90% of whom identify as Caucasian; it includes data on health, behavior, and cognitive abilities².
Colorado Longitudinal Twin Study of Reading Disability (CLTSRD): The CLTSRD is a longitudinal study of a subset of twins and siblings from the Colorado Learning Disabilities Research Center (CLDRC)³. It includes diagnostic, behavioral, cognitive, and achievement data on roughly 2330 predominantly Caucasian (>90%) twins⁴.
Western Reserve Reading and Math Project (WRRMP): The WRRMP is an Ohio-based longitudinal twin project with more than 400 twin family participants⁵. It spans 7 years, with data on reading, math, and other cognitive outcomes for a predominantly Caucasian (>90%) and socioeconomically diverse sample⁶.
International Longitudinal Twin Study (ILTS): The ILTS is a longitudinal twin study that recruited participants in preschool and followed their development through the first three years of formal schooling in the US, Australia, Norway, and Sweden⁷,⁸. It includes data on reading and its related skills, as well as other cognitive and achievement outcomes⁹.
Netherlands Twin Register (NTR): The NTR comprises two groups of twins: young twins (YNTR) and adolescent and young adult twins (ANTR)¹⁰. It includes data on cognitive skills, brain imaging, ADHD, and other outcomes for over 50,000 twins and their families¹⁰.
Australian Twin ADHD Project (ATAP): ATAP is a cohort-sequential longitudinal twin project aimed at understanding the development of ADHD and related factors in twins and their siblings¹¹. The ATAP includes parent- and child-reported data on behavior, development, and education for over 6,000 twin families¹¹.
Environmental Risk Longitudinal Twin Study (E-risk): E-risk is a cohort-sequential longitudinal twin study following an epidemiological sample of 1116 twin families drawn from two consecutive birth cohorts (1994 and 1995) of TEDS¹¹. It includes parent- and teacher-reported data on child health and behavior, cognitive development, family structure, and mental health¹².
Quebec Newborn Twin Study (QNTS): QNTS is a longitudinal follow-up study of twin cohorts born between 1995 and 1998 in Canada¹³. The sample includes 662 twin families with data on individual differences in cognition, behavior, socio-emotional development, school achievement, health outcomes, child environments (i.e., family SES, parent and peer relationships), brain imaging, and nutrition, among other variables¹³.
Australian Twin Registry (ATR): ATR is a large volunteer twin registry established in the 1970s, with more than 30,000 registered twins of all ages¹⁴. The ATR includes basic demographic and family health data on registrants¹⁴.

Reading Assessments

Peabody Individual Achievement Test (PIAT): The PIAT is a US nationally norm-referenced assessment that measures many achievement domains, including reading comprehension and word recognition¹⁵. The reading comprehension subtest comprises 82 items and assesses literal comprehension of sentences using a multiple-choice picture format¹⁶. Test-retest reliability is reported at .64¹⁵, and at .86–.94 for the revised version¹⁶.
Netherlands Pupil Monitoring System (PMS): The PMS is an assessment used by almost 95% of schools in the Netherlands¹⁷. Created by the National Institute for Educational Measurement, it tracks student progress from ages 4 to 12 to determine whether student performance and teacher instruction meet national standards¹⁸. This tracking system uses student portfolios for each primary school subject, including reading comprehension, decoding, and math¹⁸.
United Kingdom National Curriculum reading test (UKNC Reading): The UK National Curriculum uses teacher ratings of students’ English, science, and mathematics skills at 7, 11, and 14 years of age to compare student performance with the nationally expected performance level for their age, as outlined in the UK National Curriculum, on a 4-point Likert scale that ranges from below to above average¹⁹. The decision consistency for each level (i.e., test-retest reliability) is 80–98%²⁰.
Comprehensive Test of Phonological Processing Rapid Automatized Naming (CTOPP RAN): The CTOPP²¹ comprises seven core subtests, which combine to produce three composite scores: the phonological awareness composite score, the phonological memory composite score, and the rapid automatized naming (RAN) composite score. The RAN composite score is calculated from four subtests, which measure the ability to rapidly name digits, letters, colors, and objects, respectively. Average internal reliability and alternate forms reliability exceed .80, with test-retest reliability ranging between .70 and .92²².
Florida Comprehensive Assessment Test (FCAT): The FCAT is an annual standardized achievement test in Florida that measures reading comprehension using multiple-choice questions based on narrative and expository text passages²³. Alpha reliability is reported at .90²³.
Florida Assessment for Instruction in Reading (FAIR): The FAIR is a system of computer-based assessments designed to support reading instruction in Florida²⁴. The Maze task is the reading fluency subtest, which requires students to choose, from a list, which three words are best to fill in blanks within a passage. Alpha reliability is reported to be .77–.90²⁴. The Reading Comprehension assessment requires students to answer questions related to text passages that vary in length and difficulty²³. Alpha reliability is reported to be .88–.92²⁴.
Global Online Assessment for Learning (GOAL): The GOAL Formative Assessment in Literacy Key Stage 3 is a UK-based reading comprehension assessment that captures both literal and inferential comprehension skills in multiple-choice format, using prompts made up of words, sentences, and short paragraphs²⁵. The GOAL has a reported alpha reliability of .91²⁶.
Test d’habilités en lecture (THAL; “Reading skills test”): THAL²⁷ is a computerized French-language standardized reading skills test with subtests for phonetic decoding and reading comprehension. The phonetic decoding subtest comprises 50 items that require a child to indicate whether a phoneme presented in a stimulus word is also found in a comparison word; it shows an internal consistency of .93²⁸. The reading comprehension subtest includes 40 items that require children to choose the best missing words to complete a silently read short text; it has a reported internal consistency of .98²⁸.
Test of Word Reading Efficiency (TOWRE): The TOWRE is a norm-referenced reading assessment comprising two subtests: the Sight Word Efficiency (SWE) subtest and the Phonemic Decoding Efficiency (PDE) subtest²⁹. Both measure how efficiently participants read mono- and multi-syllabic words in a maximum of 3 minutes, with the SWE using a list of 104 real words³¹ and the PDE using a list of 63 nonwords³⁰. The TOWRE is normed for US readers 6–24 years of age and has reported reliability >.94³⁰.
Woodcock Johnson-III Tests of Achievement reading subtest (WJ-Reading): The WJ is a standardized achievement measure comprising 22 separate tests of cognitive performance³¹. Subtests identified in the current study sample included Passage Comprehension, which measures reading comprehension using 43 items by presenting students with a series of short passages and requiring them to fill in missing words within the text³¹, and Letter Word Identification, which requires students to read aloud a list of increasingly difficult words (76 items in total) until they make six consecutive errors³². Split-half reliabilities for these measures range between .88 and .96³³.
Alouette-R: The Alouette-R³⁴ is a standardized French reading fluency measure for ages 6–16 that requires participants to read connected text as quickly and accurately as possible in a maximum of 3 minutes³⁴. It is often used to group children into reading levels and to diagnose dyslexia³⁵.
Wechsler Individual Achievement Test (WIAT): The WIAT is a norm-referenced achievement measure for children 4–19 years old³⁶ that assesses four content areas, including reading and math. Reading subtests include Early Reading Skills (i.e., letter naming and phonological skills), Word Reading, Pseudo-word Decoding, Reading Comprehension, and Oral Reading Fluency³⁷. Subscales can be scored independently or combined to calculate composite reading and math scores. Reported inter-item reliability ranges between .69 and .97³⁷.
National Assessment Program, Literacy and Numeracy (NAPLAN Reading): The NAPLAN is an Australian standardized assessment that tests both literacy and numeracy skills in the third, fifth, seventh, and ninth years of school based on national benchmarks set by the Australian Curriculum and Assessment Authority³⁸. The literacy assessments measure student performance in reading, writing, spelling, grammar, and punctuation in paper-and-pencil format. Alpha reliability for literacy and numeracy subtests ranges between .84 and .93³⁸.
Reading Difficulties Questionnaire (RDQ): The RDQ³⁹ is a 6-item measure that asks parents to report on a 5-point Likert scale, ranging from “never/not at all” to “always/a great deal”, the extent to which their child reads slowly and below expectancy level and requires extra help at school, as well as the degree of difficulty their child has with spelling and sounding out words. The scale shows excellent internal consistency (alpha = .90) and high inter-rater and test-retest reliabilities (.83 and .81, respectively)³⁹.
Orthographic Choice Task (OCT): The OCT uses 25 items to measure students’ spelling ability by requiring them to choose the correctly spelled word from two letter strings that sound alike (e.g., bote vs. boat), one a real word and one a nonsense alternative. It has a split-half reliability of .93⁴⁰.

ADHD Symptom Assessments

Direct Behavior Rating Scale (DBRS): The DBRS measures ADHD based on the 18 ADHD symptoms outlined in the DSM-V, with 9 items corresponding to inattention and 9 corresponding to hyperactivity/impulsivity⁴¹. It can be parent-, teacher-, or practitioner-administered and asks students to rate their experience of ADHD symptoms over the last 6 months on a 4-point Likert scale⁴². The internal reliability for the Inattention and Hyperactivity/Impulsivity subscales is reported to be approximately .90⁴¹.
Diagnostic Interview for Children and Adolescents (DICA): The DICA is an ADHD measure that can be administered as a semi-structured interview or via computer. It includes three versions: one for 6–12 year-olds, one for 13–18 year-olds, and a parent-interview version. Reliability for symptom count on the ADHD scale has been reported at .65 for student respondents and .84 for parent respondents⁴³.
Revised Conner’s Parent Rating Scale (Revised Conner’s): The Revised Conner’s is a North American-normed comprehensive assessment of child behavior⁴⁴. The Revised Conner’s measures 7 separate factors, including a hyperactivity/impulsivity factor. Alpha reliability ranges between .73 and .95⁴⁵.
Strengths and Weaknesses of ADHD-Symptoms and Normal Behavior Scale (SWAN): The SWAN⁴⁶ is a 30-item measure informed by the 18 ADHD symptoms listed in the DSM-V. It is scored on a 7-point Likert scale, with nine of the scale items corresponding to Inattention and nine items corresponding to Hyperactivity/Impulsivity⁴⁷. Higher scores on the SWAN indicate more problems with attention. Cronbach’s alpha reliability is reported at .92 for SWAN-Attention and .94 for SWAN-Hyperactivity⁴⁸.
Disruptive Behavior Disorder scale (DBD): The DBD scale⁴⁹ is a diagnostic checklist for DSM-V symptoms of ADHD and other behavior disorders made up of 42 total items, 18 of which correspond to ADHD symptoms. Parents or teachers rate child behavior on a 4-point Likert scale, with higher scores corresponding to more ADHD symptoms. Alpha reliability for the DBD is reported to be .91–.96⁴⁹.
Child Behavior Check List (CBCL): The CBCL⁵⁰ Attention Problem Scale is a 20-item measure that asks parents to report on a 3-point Likert scale on the amount and quality of the child’s participation in sports, hobbies, games, activities, jobs, chores, and friendships; performance at school; and how well the child plays and works alone and with others⁵⁰.
Social Behavioral Questionnaire (SBQ): The SBQ is a parent-report ADHD instrument used for children and adolescents that rates child ADHD dimensions over the past 6 months on a 3-point Likert scale (very true to not true at all)⁵¹. It includes 3 items for inattention, which have a reported alpha of .84, and 5 items for hyperactivity/impulsivity, which have a reported alpha of .77⁵².

Math Assessments

United Kingdom National Curriculum math test (UKNC Math): The UK National Curriculum uses teacher ratings of students’ English, science, and mathematics skills at 7, 11, and 14 years of age to compare student performance with the nationally expected performance level for their age, as outlined in the UK National Curriculum, on a 4-point Likert scale that ranges from below to above average¹⁹. The decision consistency for each level (i.e., test-retest reliability) is 80–98%²⁰.
Metropolitan Achievement Test (MAT): The MAT is a norm-referenced achievement test with diagnostic math subtests for arithmetic, math concepts and problem solving, and math communication⁵³. Reliability is reported between .87 and .95 for these subtests⁵⁴.
Woodcock Johnson Tests of Achievement-III math subtest (WJ-Math): The WJ is a standardized achievement measure comprising 22 separate tests of cognitive performance³¹. The math subtests in the present sample include Quantitative Concepts, which measures student knowledge of mathematical concepts with orally presented questions on math facts³²; Math Fluency, which assesses children’s ability to correctly answer as many addition, subtraction, and multiplication problems out of 160 as possible within a three-minute limit; Applied Problems, which includes 63 items that measure students’ ability to solve orally presented items on counting objects, probability, and algebra; and Calculation, which includes 45 items on math calculations that range from writing single numbers to performing calculus operations³². Split-half reliabilities for these measures range between .88 and .96³³.
National Foundation for Educational Research 5–14 Mathematics Series (NFER 5–14): The NFER 5–14 Mathematics Series is a UK computer- or paper-administered math assessment that is based on UK curriculum requirements⁵⁵. Subtests include Understanding Numbers, which uses 33 items to assess students’ ability to solve problems requiring numeric and algebraic processes and has an alpha reliability of .90; Numerical Processes, which includes 25 items assessing non-numerical concepts, like rotational symmetry, and has an alpha reliability of .87; and Computation and Knowledge, which includes 37 items on students’ ability to recall math facts and terms and perform simple calculations, with an alpha reliability of .93⁵⁶.
National Assessment Program, Literacy and Numeracy (NAPLAN Math): The NAPLAN is an annual Australian standardized measure of both literacy and numeracy skills³⁸. The Numeracy assessment tests children’s knowledge and application of mathematical concepts, such as algebra, functions, patterns, space, measurement, probability, and data. This assessment is traditionally a paper-and-pencil assessment, given in the third, fifth, seventh, and ninth years of school in Australia, and is based on national benchmarks set by the Australian Curriculum and Assessment Authority⁴¹. Alpha reliability for literacy and numeracy subtests ranges between .84 and .93³⁸.

Note.