Author manuscript; available in PMC: 2012 Jul 1.

Published in final edited form as: J Mem Lang. 2011 Jul;65(1):42–73. doi: 10.1016/j.jml.2011.03.002

Table 3.

Summary of continuous text-level predictors, including labels used in statistical models. Mean values, standard deviations and ranges are reported for raw values of predictors and for values of predictors after transformation.

Predictor	Code	Mean(SD): Original	Range: Original	Range: Transformed
Length of Word N, characters	sWordLength	6(2)	2:13	−1.8:3.0
Frequency of Word N (residualized)	srWordFreq	201,522(472,855)	101:6.4*10⁶	−3.2:2.4
Length of Word N−1, characters	sPrevLength	4(2)	1:11	−1.5:3.3
Frequency of Word N−1 (residualized)	srPrevFreq	9.1*10⁶(1.1*10⁷)	101:2.3*10⁷	−2.7:1.2
Length of Word N+1, characters	sNextLength	5(2)	1:13	−1.5:3.6
Frequency of Word N+1 (residualized)	srNextFreq	4.8*10⁶(7.8*10⁶)	110:2.3*10⁷	−3.5:2.7
Relative word position in sentence	sRelPos	0.51(0.23)	0.11:0.94	−1.7:1.9
Sentence position in experimental list	sTrialNum	73.9(40.3)	1:144	−1.8:1.8
Initial landing position, characters	sFirstFixPos	2.5(1.9)	0:11	−1.3:4.4

Note to Table 3: The baseline set of predictors additionally includes the factor Type with three levels: S (Simple sentence), SE (sentence with a Single Embedded relative clause), and DE (sentence with Doubly Embedded relative clauses). Frequency counts are based on the 320-million-word HAL written corpus of US English. The prefix “s” indicates that the predictor was standardized (the mean was subtracted from the raw value and the difference was divided by the standard deviation), e.g., sWordLength. The prefix “r” indicates that the predictor was residualized prior to standardization for reasons of statistical modeling, e.g., srNextFreq.
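
The two transformations named by the prefixes can be illustrated with a minimal sketch. It makes several assumptions: the table note does not say which variable the frequency predictors were residualized on, so regressing frequency on word length here is a hypothetical choice; the use of the sample standard deviation (n − 1 denominator) is likewise assumed; and the function names `standardize` and `residualize` and the toy data are invented for illustration and are not taken from the study.

```python
from statistics import mean, stdev


def standardize(values):
    """Subtract the mean and divide by the standard deviation (z-score)."""
    m = mean(values)
    sd = stdev(values)  # sample SD; which SD the authors used is an assumption
    return [(v - m) / sd for v in values]


def residualize(y, x):
    """Return the residuals of y from a simple least-squares regression on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Toy values, made up for illustration only.
word_length = [2, 4, 5, 6, 8, 11]
word_freq = [5200, 3100, 900, 1200, 300, 150]

# Prefix "s": standardized predictor.
s_word_length = standardize(word_length)

# Prefix "sr": residualized first (here on word length, a hypothetical
# choice), then standardized.
sr_word_freq = standardize(residualize(word_freq, word_length))
```

After both steps, the residualized predictor is uncorrelated with the variable it was regressed on, and both predictors have mean 0 and standard deviation 1, which is why the transformed ranges in the table are centered near zero.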