Table 3.
Summary of continuous text-level predictors, including the labels used in statistical models. Means and standard deviations are reported for the raw values of predictors; ranges are reported both for raw values and for values after transformation.
Predictor | Code | Mean(SD): Original | Range: Original | Range: Transformed |
Length of Word N, characters | sWordLength | 6(2) | 2:13 | −1.8:3.0 |
Frequency of Word N (residualized) | srWordFreq | 201,522(472,855) | 101:6.4×10^6 | −3.2:2.4 |
Length of Word N−1, characters | sPrevLength | 4(2) | 1:11 | −1.5:3.3 |
Frequency of Word N−1 (residualized) | srPrevFreq | 9.1×10^6(1.1×10^7) | 101:2.3×10^7 | −2.7:1.2 |
Length of Word N+1, characters | sNextLength | 5(2) | 1:13 | −1.5:3.6 |
Frequency of Word N+1 (residualized) | srNextFreq | 4.8×10^6(7.8×10^6) | 110:2.3×10^7 | −3.5:2.7 |
Relative word position in sentence | sRelPos | 0.51(0.23) | 0.11:0.94 | −1.7:1.9 |
Sentence position in experimental list | sTrialNum | 73.9(40.3) | 1:144 | −1.8:1.8 |
Initial landing position, characters | sFirstFixPos | 2.5(1.9) | 0:11 | −1.3:4.4 |
Note to Table 3: The baseline set of predictors additionally includes the factor Type with three levels: S (Simple sentence), SE (sentence with a Single Embedded relative clause), and DE (sentence with Doubly Embedded relative clauses). Frequency counts are based on the 320-million HAL written corpus of US English. The prefix “s” indicates that the predictor was standardized (the mean was subtracted from each raw value and the difference was divided by the standard deviation), e.g., sWordLength. The prefix “r” indicates that the predictor was residualized prior to standardization due to considerations of statistical modeling, e.g., srNextFreq.
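The two transformations named in the note can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used for the reported values: the note does not say which variable the frequencies were residualized against, so regressing log frequency on word length is an assumption, as are the log transform, the use of the sample standard deviation (`ddof=1`), the helper names `standardize` and `residualize`, and the toy data.

```python
import numpy as np


def standardize(values, ddof=1):
    """Subtract the mean and divide by the standard deviation.

    ddof=1 (sample SD) is an assumption; the note does not specify it.
    """
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std(ddof=ddof)


def residualize(target, covariate):
    """Return the residuals of an ordinary least-squares fit
    target ~ intercept + covariate."""
    target = np.asarray(target, dtype=float)
    covariate = np.asarray(covariate, dtype=float)
    design = np.column_stack([np.ones_like(covariate), covariate])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ coef


# Hypothetical toy data, not taken from the study.
word_length = np.array([2, 4, 5, 6, 7, 9, 13])
word_freq = np.array([6.4e6, 9.0e5, 3.1e5, 2.0e5, 4.5e4, 8.0e3, 101])

# Assumption: frequency is log-transformed and residualized against length.
log_freq = np.log(word_freq)

sWordLength = standardize(word_length)
srWordFreq = standardize(residualize(log_freq, word_length))
```

Under these assumptions, the standardized predictor has mean 0 and standard deviation 1, and the residualized frequency is uncorrelated with the covariate it was regressed on, which is the usual motivation for residualizing collinear predictors before entering them into a model.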