Author manuscript; available in PMC: 2012 Jul 1.
Published in final edited form as: J Mem Lang. 2011 Jul;65(1):42–73. doi: 10.1016/j.jml.2011.03.002

Table 3.

Summary of continuous text-level predictors, including labels used in statistical models. Mean values, standard deviations and ranges are reported for raw values of predictors and for values of predictors after transformation.

| Predictor | Code | Mean(SD): Original | Range: Original | Range: Transformed |
|---|---|---|---|---|
| Length of Word N, characters | sWordLength | 6(2) | 2:13 | −1.8:3.0 |
| Frequency of Word N (residualized) | srWordFreq | 201,522(472,855) | 101:6.4*10^6 | −3.2:2.4 |
| Length of Word N−1, characters | sPrevLength | 4(2) | 1:11 | −1.5:3.3 |
| Frequency of Word N−1 (residualized) | srPrevFreq | 9.1*10^6(1.1*10^7) | 101:2.3*10^7 | −2.7:1.2 |
| Length of Word N+1, characters | sNextLength | 5(2) | 1:13 | −1.5:3.6 |
| Frequency of Word N+1 (residualized) | srNextFreq | 4.8*10^6(7.8*10^6) | 110:2.3*10^7 | −3.5:2.7 |
| Relative word position in sentence | sRelPos | 0.51(0.23) | 0.11:0.94 | −1.7:1.9 |
| Sentence position in experimental list | sTrialNum | 73.9(40.3) | 1:144 | −1.8:1.8 |
| Initial landing position, characters | sFirstFixPos | 2.5(1.9) | 0:11 | −1.3:4.4 |

Note to Table 3: the baseline set of predictors additionally includes the factor Type with three levels: S (Simple sentence), SE (sentence with a Single Embedded relative clause), and DE (sentence with Doubly Embedded relative clauses). Frequency counts are based on the 320-million HAL written corpus of US English. Prefix “s” indicates that the predictor was standardized (the mean was subtracted from the raw value and the difference was divided by the standard deviation), e.g., sWordLength. Prefix “r” indicates predictors that were residualized prior to standardization due to considerations of statistical modeling, e.g., srNextFreq.
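The two transformations named in the note can be illustrated with a minimal sketch. It rests on assumptions that the note does not state: the toy word lengths and frequencies are hypothetical, frequency is log-transformed before residualization, word length serves as the regressor, and the sample standard deviation is used. The helper names (`standardize`, `residualize`, `z_from_summary`) are likewise illustrative and may not match the procedure actually applied to the data.

```python
import math
from statistics import mean, stdev


def standardize(values):
    """Subtract the mean and divide by the standard deviation (prefix "s")."""
    m = mean(values)
    sd = stdev(values)  # assumption: sample SD; the note does not say which
    return [(v - m) / sd for v in values]


def residualize(y, x):
    """Return residuals of a simple least-squares regression of y on x (prefix "r")."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def z_from_summary(raw, m, sd):
    """Standardize a single raw value given a reported mean and SD."""
    return (raw - m) / sd


# Hypothetical toy data, not taken from the study.
lengths = [2, 4, 5, 6, 8, 13]
freqs = [6400000, 250000, 90000, 12000, 3000, 101]

# Assumption: frequency is log-transformed and regressed on word length.
log_freqs = [math.log(f) for f in freqs]

s_word_length = standardize(lengths)
sr_word_freq = standardize(residualize(log_freqs, lengths))

# Worked check against the sRelPos row: mean 0.51, SD 0.23, raw range 0.11:0.94.
rel_pos_range = (
    round(z_from_summary(0.11, 0.51, 0.23), 1),
    round(z_from_summary(0.94, 0.51, 0.23), 1),
)
print(rel_pos_range)  # (-1.7, 1.9)
```

For sRelPos, the reported mean and standard deviation reproduce the transformed range given in the table. Rows whose summary statistics are rounded more coarsely, such as sWordLength, need not reproduce their transformed range exactly from the rounded values.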