Military Psychology. 2022 Mar 1;34(5):551–569. doi: 10.1080/08995605.2021.2022067

Correction for range restriction: Lessons from 20 research scenarios

Thomas R Carretta, Malcolm James Ree
PMCID: PMC10069334  PMID: 38536384

ABSTRACT

Data are often available only for recruits, a range-restricted sample. This creates the potential for mistaken inferences and poor decisions. This is because inferences and decisions are about the population, not the sample. Despite these problems, researchers must try to determine statistical values as if the sample were not range-restricted. Although range restriction correction methods have been available for over a century, often they are not applied or are applied incorrectly. Technical psychometric discussions of range restriction have not improved researcher practice. As an alternative, realistic scenarios are presented to illustrate and explain the consequences of (1) failing to correct correlations, (2) using the wrong correction formula, (3) correcting when information about previous selection variables is unavailable, (4) using an inappropriate unrestricted sample, (5) incorrectly computing the confidence interval for corrected correlations, and (6) misinterpreting results. Although there are situations under which correction has little effect, correction still provides better estimates of relations among variables. It also improves theoretical understanding and interpretation of real-world results.

KEYWORDS: Correlation, range restriction, research methods


What is the public significance of this article?—Many social science studies depend on computing a correlation. However, correlation is very sensitive to reduction in variability from prior selection and many correlational results are inaccurate. Methods to correct these flawed correlations are presented in a series of 20 realistic scenarios that can serve as a practical guide.

Frequently, correlational studies are conducted with samples of those already selected for training or job incumbents. Use of selected samples can result in substantial bias in the sign and apparent magnitude of the correlations (Ree et al., 1994; Thorndike, 1949). These biases are due to methods of sample selection. Bias from using selected samples can cause the estimated correlations to be either higher (range enhancement) or lower (range restriction) than if the correlations were computed in an unselected group (Dahlke & Wiernik, 2018; Levin, 1972). Most selection methods reduce variability, so the variance of the selected sample is smaller than the variance of an unselected sample or an applicant sample. In rare circumstances, the selection method increases the variability of the sample compared to the unrestricted sample. This is called range enhancement.

Methods to correct correlations for range restriction have been available for more than a century (Aitken, 1934; Lawley, 1943; Le et al., 2016; Pearson, 1903; Thorndike, 1949). Technical psychometric presentations of range restriction are found in the literature, but many articles require more expertise than the typical researcher or practitioner has. In this paper, practical examples are presented as scenarios of realistic situations faced by military researchers. Some researchers need a handy reference or to refresh their knowledge, and the scenarios serve these purposes. Each scenario is aimed at teaching a lesson about the appropriate use of range-restriction corrections by noting errors, recommending appropriate procedures, and showing how the application of the correction would change the results and interpretations.

Importance of correcting for range restriction or range enhancement

The argument for the use of range-restriction-correction formulas is that they provide a more accurate estimate of correlations (Hunter & Schmidt, 2004; Linn et al., 1981; Ree et al., 1994). Linn et al. (1981) analyzed more than 700 validity studies concluding, “Thus it seems desirable to routinely compute and report corrected correlations along with their uncorrected counterparts. Though still conservative (emphasis added), the corrected values will generally provide a better indication of predictive validity and be less misleading than uncorrected correlations alone” (p. 662). Recognizing and correcting for range restriction or range enhancement is especially important if the goal is estimating the relationships between scores or between constructs. Bryant and Gokhale (1972) observed that “ … to infer beyond the sample a correction for restriction in range is necessary” (p. 305). The same is true of range enhancement (Levin, 1972; Zimmerman & Williams, 2000).

There are practical implications for failing to correct or using inappropriate corrections. For example, failure to correct for range restriction may result in prematurely discarding a selection measure after investing resources. Another practical implication is underestimation of the predictive validity or utility of measures. There is also the concern that lack of cumulative knowledge can lead to confusion about relationships. Meta-analysis is used to address the issue of cumulative knowledge. Correcting for range restriction in meta-analysis, even if the magnitude of the correction is small, can provide a more accurate estimate of the true correlation. The corrected correlations also provide a better estimate of the mean corrected correlation, variances, and credibility intervals.

The consequence of failing to correct for range restriction is biased correlations that will not represent the unrestricted sample. Failing to correct for range restriction can lead to inappropriate decisions such as stopping research on a variable or conclusions such as over or underestimating the relations between scores (Alexander et al., 1986; Hunter et al., 2006; Linn, 1968; Ree et al., 1994; Sackett & Yang, 2000). However, there are situations in which the correction for range restriction results in small changes, and conclusions drawn from the uncorrected correlation will be the same. One example is when nearly all applicants are accepted due to a high selection ratio. Another is when direct selection on one variable has little effect on the standard deviations of one or more variables that underwent indirect selection. Since there was little range restriction on the indirectly selected variables, correction for range restriction can have little impact.
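The point that a high selection ratio produces little range restriction can be illustrated numerically. The following is a minimal sketch, not a procedure from this paper: it assumes a standard-normal predictor and strict top-down selection (an assumption made only for the illustration, since normality is not required by the correction formulas), and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def restricted_sd_ratio(selection_ratio: float) -> float:
    """SD of the selected group divided by the applicant SD.

    Assumes a standard-normal predictor and strict top-down selection
    (illustrative assumptions, not requirements of the corrections).
    """
    if not 0.0 < selection_ratio < 1.0:
        raise ValueError("selection_ratio must be strictly between 0 and 1")
    std_normal = NormalDist()
    cut = std_normal.inv_cdf(1.0 - selection_ratio)  # cut score in z units
    mills = std_normal.pdf(cut) / selection_ratio    # mean of the selected group
    variance = 1.0 + cut * mills - mills ** 2        # truncated-normal variance
    return sqrt(variance)


def attenuated_r(rho: float, sd_ratio: float) -> float:
    """Correlation expected in the selected group when the applicant
    correlation is rho and the predictor SD shrinks by sd_ratio."""
    return rho * sd_ratio / sqrt(1.0 - rho ** 2 + (rho * sd_ratio) ** 2)


for ratio in (0.95, 0.50, 0.10):
    k = restricted_sd_ratio(ratio)
    print(f"selection ratio {ratio:.2f}: SD ratio {k:.2f}, "
          f"expected observed r when rho = .50: {attenuated_r(0.50, k):.2f}")
```

Under these assumptions, the observed correlation stays close to the applicant value when almost everyone is accepted and falls steadily as selection becomes more stringent.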

Importance of correcting for range restriction

Range restriction correction formulas have existed for more than a century (Lawley, 1943; Pearson, 1903; Thorndike, 1949), and their utility has been well documented. The argument for their use is that it is desirable to obtain as unbiased an estimate of the operational validity of a predictor as possible in the population in which it is being used (American Educational Research Association, American Psychological Association & National Council on Measurement in Education, 2014; Society for Industrial and Organizational Psychology, 2018). When validity coefficients are distorted by the effects of range restriction and the information needed to make the correction is available, a suitable adjustment should be made (Sackett & Yang, 2000; Schmidt et al., 2006; Society for Industrial and Organizational Psychology, 2018).

Nonetheless, some question their use when there is not complete truncation (i.e., complete truncation would be having no scores below a specified numeric value). Campbell (1976, p. 218) concluded about corrections for range restriction that “ … the safest recourse is to not use them.” Damos (1996) referred to range restriction as a “red herring” in explanation for low predictive validities observed in commercial and military pilot selection batteries. She argued that commercial air carriers are not going “to administer tests to a completely unrestricted population: some type of selection based on the candidate’s background and experiences will occur before any testing is conducted” (p. 202). Damos concluded that the uncorrected correlations provide the most accurate estimates of the predictive validity of a test in most cases. She acknowledged that some type of selection has occurred, but does not suggest correcting for that selection. Testing a random sample of the population is not advocated. The appropriate unrestricted sample for pilot training is training applicants or in the case of commercial aviation, job applicants. While there are critics of range restriction correction such as Campbell (1976) and Damos (1996), we believe that these criticisms are unfounded. As long as two of the three assumptions to compute correlations (linearity and homoscedasticity) have been met and appropriate data are available to perform the correction, the range restriction corrections can be applied. Correction provides a more accurate estimate of the relations among variables than the uncorrected values.

Hunter and Schmidt (2004) concluded that in educational and employment selection, the predictive validity of cognitive ability has been underestimated considerably due to the failure to correct for range restriction. Held and Foley (1994) provided an empirical example. They calculated the predictive validity of aptitude scores while varying the selection ratio from 1.00 (unrestricted group) to .10 (highly restricted group). The validity (r) of the aptitude scores steadily decreased as selection became more restrictive. In all instances, the corrected validities were closer to the unrestricted validities than were the uncorrected validities.
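The pattern described above can be reproduced in a small simulation. This is a sketch with simulated data, not a reanalysis of Held and Foley (1994): the population correlation of .50, the sample size, the seed, and all function names are assumptions chosen for illustration. The correction applied is the standard Case 2 form for direct selection.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def sample_sd(xs):
    """Sample standard deviation (n - 1 in the denominator)."""
    m = sum(xs) / len(xs)
    return sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))


def case2(r, u):
    """Case 2 correction; u = unrestricted SD / restricted SD of the predictor."""
    return r * u / sqrt(1.0 - r ** 2 + (r * u) ** 2)


rng = random.Random(2022)   # fixed seed so the run is repeatable
RHO, N = 0.50, 20000        # hypothetical applicant correlation and sample size

applicants = []
for _ in range(N):
    x = rng.gauss(0.0, 1.0)
    y = RHO * x + sqrt(1.0 - RHO ** 2) * rng.gauss(0.0, 1.0)
    applicants.append((x, y))
applicants.sort(key=lambda pair: pair[0], reverse=True)  # top-down on x

applicant_sd = sample_sd([x for x, _ in applicants])
results = {}
for ratio in (1.0, 0.5, 0.1):
    selected = applicants[: int(N * ratio)]
    xs = [x for x, _ in selected]
    ys = [y for _, y in selected]
    r_observed = pearson_r(xs, ys)
    r_corrected = case2(r_observed, applicant_sd / sample_sd(xs))
    results[ratio] = (r_observed, r_corrected)
    print(f"selection ratio {ratio:.1f}: observed r = {r_observed:.2f}, "
          f"corrected r = {r_corrected:.2f}")
```

In runs of this kind the observed correlation shrinks as the selection ratio falls, while the corrected value stays much nearer the correlation in the full applicant group.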

Range restriction or range enhancement?

It should be noted that prior selection will not always decrease the magnitude of the correlations. Under unusual circumstances when there is “range enhancement,” meaning increased variance in either the independent or the dependent variable due to selection, increased correlations would be observed (Levin, 1972). Additionally, Zimmerman and Williams (2000) noted that distributions with extreme outliers can cause downwardly or upwardly biased correlations and provided a statistical method to correct correlations in these situations. Some studies compare extreme groups, eliminating the middle of the distribution. This may cause “range enhancement” leading to overestimation of correlations. How well do correction formulas work under this circumstance? Preacher, Rucker, MacCallum, and Nicewander (2005) referencing Pearson (1903) and Wherry (1984) observed that the range restriction correction formulas can be used to correct for overestimation caused by the use of extreme groups. However, Preacher et al. stated that they found no examples of corrections applied to correlational analyses of extreme groups.
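To make the extreme-groups point concrete, the sketch below applies the same Case 2 form used for range restriction, but with a target SD smaller than the sample SD, so the adjustment lowers the correlation instead of raising it. The numbers are hypothetical and the function name is an assumption; this is an illustration of the arithmetic, not an analysis reported by Preacher et al. (2005).

```python
from math import sqrt


def pearson_case2(r_observed: float, sd_target: float, sd_sample: float) -> float:
    """Adjust r for a change in predictor SD (Case 2 form).

    sd_target is the SD in the group of interest, sd_sample the SD in the
    analyzed sample. A ratio below 1 corresponds to range enhancement.
    """
    u = sd_target / sd_sample
    return r_observed * u / sqrt(1.0 - r_observed ** 2 + (r_observed * u) ** 2)


# Hypothetical extreme-groups study: dropping the middle of the distribution
# inflated the predictor SD from 7.0 (full group) to 10.0 (extreme groups).
r_extreme = 0.60
r_adjusted = pearson_case2(r_extreme, sd_target=7.0, sd_sample=10.0)
print(f"extreme-groups r = {r_extreme:.2f}, adjusted r = {r_adjusted:.2f}")
```

Because the SD ratio is below 1, the adjusted value is smaller than the extreme-groups correlation, which is the direction expected under range enhancement.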

Methods of correction

The best-known methods of correcting for range restriction are the univariate Pearson-Thorndike (Pearson, 1903; Thorndike, 1949) Cases 1, 2, and 3. Direct range restriction on a variable occurs when selection decisions for hiring, training, or job assignment are made on that variable. Indirect range restriction occurs when antecedent variables are not known or a variable(s) is administered, but not used for selection, and applicants are directly selected on scores from other measures. Direct selection on one variable may produce indirect selection on other variables. Cases 1 and 2 are appropriate for direct selection, while Case 3 is appropriate for indirect selection. The distinction between Cases 1 and 2 is based on the correspondence between the variable on which selection occurred and the variable on which unrestricted variance is known (Sackett & Yang, 2000, p. 115). Case 3 is appropriate for indirect selection if unrestricted variance is known. When more than one variable is used for selection, the multivariate correction (Aitken, 1934; Lawley, 1943) is appropriate if unrestricted correlations and variances are known on a subset of variables.
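For orientation, the three univariate corrections are commonly written as follows. The notation is an assumption made here and may differ from the Appendix: lowercase r and s denote correlations and standard deviations in the restricted sample, uppercase R and S denote their unrestricted counterparts, x is the variable on which direct selection occurred in Cases 1 and 2, and z is the selection variable in Case 3.

```latex
% Case 1: direct selection on x; unrestricted variance known only for y
R_{xy} = \sqrt{1 - \frac{s_y^{2}}{S_y^{2}}\left(1 - r_{xy}^{2}\right)}

% Case 2: direct selection on x; unrestricted variance known for x
R_{xy} = \frac{r_{xy}\,(S_x / s_x)}
              {\sqrt{1 - r_{xy}^{2} + r_{xy}^{2}\,(S_x / s_x)^{2}}}

% Case 3: direct selection on z produces indirect selection on x and y
R_{xy} = \frac{r_{xy} + r_{xz}\,r_{yz}\left(S_z^{2}/s_z^{2} - 1\right)}
              {\sqrt{\left[1 + r_{xz}^{2}\left(S_z^{2}/s_z^{2} - 1\right)\right]
                     \left[1 + r_{yz}^{2}\left(S_z^{2}/s_z^{2} - 1\right)\right]}}
```

All three follow from the assumptions that the regression is linear and the residual variance is the same in the restricted and unrestricted groups. Case 1 as written returns only the magnitude of the corrected correlation, so the sign must be carried over from the observed value.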

Sackett and Yang (2000) provided a classification system that acknowledges direct and indirect selection but goes further by adding three additional considerations. First, they characterize the importance of identifying on which variables the restriction occurred (x, y, and/or a third variable, z). The second consideration is whether unrestricted variances of the relevant variables are available. The final question is whether a third variable if involved is measured or unmeasured. Sackett and Yang described 11 scenarios based on this characterization and provided potential solutions for several of them. They also noted that some situations have no solutions as yet.

Bryant and Gokhale (1972) provided a method to correct correlations if the selection variables are unknown. In the more recently developed Case IV correction for indirect range restriction, range restriction occurs on true scores rather than on observed scores (Hunter & Schmidt, 2004, Ch. 5; Hunter et al., 2006). In the Case IV correction, the restriction on the true scores (T) underlying the observed score x is estimated from the amount of range restriction on the observed score x. It is assumed that y is fully mediated by the true score (T) of variable x. Little attention has been paid to Case IV due to the potentially unmeetable mediation assumption. Work on Case IV has all but stopped. Building on Bryant and Gokhale (1972), Case V first requires the estimation of true score means and true score correlations. The Case V equations enable correction for indirect range restriction without requiring the full mediation assumption of Case IV.
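The step of estimating true-score restriction from observed-score restriction can be sketched as follows. This assumes classical test theory, in which selection leaves the error variance unchanged, so that only the true-score variance shrinks. The function and argument names are hypothetical, and the form shown is one common way of writing this step, not necessarily the one used in the Appendix.

```python
from math import sqrt


def true_score_sd_ratio(observed_sd_ratio: float, applicant_reliability: float) -> float:
    """Restricted / unrestricted SD ratio for true scores.

    observed_sd_ratio: restricted SD / applicant SD of the observed score x.
    applicant_reliability: reliability of x in the unrestricted (applicant) group.
    Assumes error variance is the same in both groups.
    """
    error_share = 1.0 - applicant_reliability           # error variance / applicant variance
    numerator = observed_sd_ratio ** 2 - error_share     # restricted true-score variance share
    if numerator <= 0.0:
        raise ValueError("inputs imply a non-positive restricted true-score variance")
    return sqrt(numerator / applicant_reliability)


# Hypothetical values: observed SD ratio of .70 and applicant reliability of .80.
u_true = true_score_sd_ratio(0.70, 0.80)
print(f"observed SD ratio = 0.70, true-score SD ratio = {u_true:.2f}")
```

The true-score ratio is smaller than the observed ratio whenever the measure is not perfectly reliable, which means that restriction on true scores is more severe than the observed standard deviations suggest.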

In all the scenarios described in the next section, it is assumed that required variances are available. The assumptions underlying all methods of range restriction correction are linearity and homoscedasticity. Normality is not an assumption. See Table 1 for appropriate and inappropriate use of the correction formulas and which scenarios illustrate each range restriction method. See the Appendix for the formulas and how to use them. The Appendix shows the equations for Cases 1, 2, and 3 that can be computed by hand, programmed in MathCad©, Excel©, or the programming language R (Dahlke & Wiernik, 2018, 2019b). The programming language R also provides programs to compute Cases 1, 2, 3, IV, and V and the multivariate method.

Table 1. Procedures for range restriction correction.

Columns: Procedure (scenarios); References; Intended use (correction for); Appropriate situations; Inappropriate situations.

Case 1 (Scenario: 3)
References: Pearson (1903); Thorndike (1949)
Intended use (correction for): Direct truncation due to selection on one variable (either a or b).
Appropriate situations: Used to correct the correlation between 2 variables (i.e., a and b), when range restriction occurs on one variable, and unrestricted variance is not known but is known for the other variable. In this case, there is direct range restriction on one variable and indirect range restriction on the other variable.
Inappropriate situations: Indirect selection occurred; multivariate selection, non-linear relationships

Case 2 (Scenarios: 4, 8, 9, 17, 19)
References: Pearson (1903); Thorndike (1949)
Intended use (correction for): Direct truncation due to selection on one variable. Case 2 is similar to Case 1, but the unrestricted variance is known on the variable used for selection, but not known for the other variable.
Appropriate situations: Estimates the correlation between 2 variables (i.e., a and b), when the variances for one of the variables are known in both the restricted and unrestricted samples, but the correlation between the variables is available only in the restricted sample that has been directly selected on the variable with known unrestricted variance.
Inappropriate situations: Indirect selection occurred; multivariate selection, non-linear relationships

Case 3 (Scenarios: 16, 18, 20)
References: Bryant and Gokhale (1972); Pearson (1903); Thorndike (1949)
Intended use (correction for): Indirect restriction on variable b is produced by direct selection on the third variable (a)
Appropriate situations: Assumed direct top-down selection has been based entirely on a and has not been affected by any other information and rac, rbc, racy, and Sa are known
Inappropriate situations: Direct selection, violation of assumptions

Case IV (Scenarios: 12, Appendix)
References: Hunter and Schmidt (2004); Le et al. (2016); Schmidt et al. (2006)
Intended use (correction for): Indirect selection
Appropriate situations: Used as an alternative to Case 3 (relaxes conditions (a) and (b) required for Case 3).
Inappropriate situations: Multivariate selection, non-linear relationships

Case V (Scenario: Appendix)
References: Dahlke and Wiernik (2018); Le et al. (2016)
Intended use (correction for): Indirect selection
Appropriate situations: Used for correction for indirect range restriction between variables a and b without knowledge about selection on a third variable, c.
Inappropriate situations: Multivariate selection, non-linear relationships, when correlations are not substantially different from zero. In that case use Case 3

Multivariate (Scenarios: 1, 6, 7, 10, 11, 13, 14)
References: Aitken (1934); Lawley (1943)
Intended use (correction for): Selection on multiple variables
Appropriate situations: Selection on two or more variables; correction for both direct and indirect selection; the inter-correlations of the independent variables are known in both the restricted sample and unrestricted population
Inappropriate situations: Insufficient information available, non-linear relationships

This paper provides a source of information on range restriction that is not overly technical through presentation of realistic scenarios. It can be a teaching tool, a review of range restriction, or a handy reference. It is meant to be an introduction to more statistically sophisticated articles. It is not a substitute for articles such as Sackett and Yang (2000), Bryant and Gokhale (1972), and Olson and Becker (1983). Review of this article will facilitate understanding of those and other articles.

In this paper, a series of realistic scenarios follows with information on which correction should be applied. They illustrate how failure to apply the appropriate correction can affect the interpretation of the relations between variables and lead to incorrect decisions. These scenarios provide examples, explanations for commonly occurring situations, and appropriate advice about correcting for range restriction or range enhancement.

Scenarios

Scenario 1: Why correct for range restriction?

Situation

To determine if five scores from the military enlistment test battery were predictive of final grades of basic recruits, a validity study was completed by a psychologist. The sample consisted of 120 recruits in basic military training. Correlational analyses revealed that two of the five potential predictors were statistically significant and surprisingly one of the non-significant correlations was negative. A colleague suggested correcting the correlations for range restriction. The researcher replied, “Why should I, two of the variables are significant and that’s all I need?” No corrections were applied, and three potential predictors were removed from consideration.

Problem

The sample of basic recruits differs from the military applicants because the recruits have been selected for enlistment. This resulted in range restriction and biased correlations. The psychologist would have erroneous beliefs about the relationships of the variables leading to incorrect actions by dismissing variables that were not statistically significant. The corrected values would be more appropriate and lead to correct decisions about the variables.

R. L. Thorndike (1949) reported a study conducted during World War II in which 1,036 soldiers were allowed to enter Army Air Corps pilot training without regard to scores on aptitude tests. After they completed training or dropped out, correlations were computed on those who would have been selected for pilot training if the aptitude standards in place at the end of the war had been applied. In this subset of individuals, not all the correlations for the tests were statistically significant and one test showed an unexpected negative correlation. A second analysis using all subjects showed all test validities to be significant and the one test that had a negative correlation showed a positive correlation. If the researchers only considered the correlations from the range-restricted sample, they would have been wrong and led others to false conclusions. The appropriate interpretation was that all tests were valid and that the negative correlation was a selection artifact.
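A sign reversal of this kind can be shown with a few lines of arithmetic. The sketch below uses the standard Case 3 form for a single selection variable, which is a simplification of the multivariate situation in this scenario. All values are hypothetical and are not Thorndike's data. Here x is a test, y is the training criterion, and z is the composite on which selection occurred.

```python
from math import sqrt


def case3(r_xy: float, r_xz: float, r_yz: float, u_z: float) -> float:
    """Case 3 correction for indirect selection on x and y through z.

    The r values come from the restricted sample.
    u_z = unrestricted SD / restricted SD of the selection variable z.
    """
    k = u_z ** 2 - 1.0
    numerator = r_xy + r_xz * r_yz * k
    denominator = sqrt((1.0 + r_xz ** 2 * k) * (1.0 + r_yz ** 2 * k))
    return numerator / denominator


# Hypothetical restricted-sample values: a slightly negative validity for the
# test, with both the test and the criterion related to the selection composite.
r_restricted = -0.05
r_corrected = case3(r_xy=r_restricted, r_xz=0.30, r_yz=0.30, u_z=2.0)
print(f"restricted r = {r_restricted:.2f}, corrected r = {r_corrected:.2f}")
```

With these inputs the corrected correlation is positive, illustrating how a small negative correlation in a selected group can be an artifact of selection on a related variable.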

Solution

Because the participants were enlistees, all the needed variances were available. Since the psychologist was examining the predictive validity of multiple (5) variables, the multivariate procedure should be applied to correct the correlations using the applicants as the unrestricted group. Correction will provide a better estimate of the correlations and may reverse the sign of the unexpected negative correlation.

Scenario 2: Range restriction assumptions

Situation

In a research committee meeting, a management consultant suggested to a government analyst that correlations should be corrected for range restriction. The government analyst replied, “I cannot meet the assumptions for correction and corrections over-correct the correlations.” No corrections were made.

Problem

The analyst has false beliefs about the assumptions of the corrections. The assumptions are two of the three assumptions of correlation: linearity and homoscedasticity. Linn et al. (1981) demonstrated that in most situations the corrected correlations are underestimated. Millsap (1989), confirming Linn et al., showed that on average correcting correlations under-corrects. However, at extreme selection ratios, correction is not as accurate as under less extreme conditions. Held and Foley (1994) demonstrated this with test scores from a large sample (n = 147,288) of US Navy enlistees for the univariate Case 2 and Case 3 and the multivariate methods while varying the selection ratio from 0.10 (highly selective) to 1.00 (all applicants). The multivariate correction was generally more accurate than the univariate corrections, even when the directly selected variable was negatively skewed and violations of the assumptions of linearity and homoscedasticity did not offset each other. At extreme selection ratios, Case 2, Case 3, and multivariate method corrections were not as accurate as with less extreme selection ratios. Considering the use of correction for range restriction even with extreme selection ratios, they concluded, “All corrected validities were more accurate than the respective uncorrected validities.” (p. 361). Greater accuracy will be obtained if correlations are corrected for range restriction.

Solution

The analyst should become familiar with the assumptions of linearity and homoscedasticity, which are necessary for both correlation and range restriction correction. Once it has been verified that the required data are available and it has been determined which variables were used for selection, the appropriate range restriction correction formula can be applied.

Scenario 3: Case 1 – direct selection on a variable not the putative selection variable

Situation

An inexperienced military personnel analyst was tasked with finding and validating an experimental test to select traffic management specialists. This organization has used traffic management specialists for 30 years and evaluated job performance with a standardized work-sample test with norms based on Air Force data since the specialty was established. After a job analysis and review of journal articles, he decided to determine whether a test of Attention to Detail was valid for predicting the traffic management specialists’ job performance. The Attention to Detail test was administered and scored for only the 29 job incumbents who were employed for at least 5 years. From the organization’s 30 year archives, job performance ratings were obtained for the 29 incumbents in their first 5 years on the job. The analysis consisted of computing the correlation of the Attention to Detail test and job performance and the mean and standard deviations of both variables. The analyst chose not to correct the correlation for range restriction because he did not have Attention to Detail scores for traffic management specialist applicants. Results showed that the correlation was not significant, and it was believed that the Attention to Detail test lacked predictive validity.

Problem

There are two problems with the conclusions. First, 29 subjects are a small sample making statistical significance less likely to be found due to low statistical power. Second, the analyst said that he did not correct for range restriction because he did not have applicant data on the experimental Attention to Detail test. He did, however, have normative job performance criterion data from the organizational archive.

Solution

While application of the range restriction correction formula will provide a better estimate of the relations between the variables, it will not convert a nonsignificant correlation to a statistically significant correlation. In this scenario, the Attention to Detail test was range restricted, but its unrestricted variance is not known. The unrestricted variance is known, however, for the job performance ratings of incumbents with at least 5 years on the job.

The participants were included because they had criterion data for five years, so range restriction has occurred on the job performance criterion. Because of the effect of range restriction, the utility of the Attention to Detail test was not accurately estimated.

The Case 1 correction formula should have been applied to obtain the corrected validity because of the correspondence between the variable on which the participants had been selected (job incumbents with at least 5 years of criterion data) and the variable for which unrestricted variance is known (job performance). This would lead to a better understanding of the relationship between the Attention to Detail test and job performance.
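A minimal sketch of the Case 1 arithmetic follows, using the standard Pearson-Thorndike form. The scenario reports no numerical values, so the correlation and the two standard deviations below are hypothetical, as are the function and argument names. The variance ratio refers to the variable whose unrestricted variance is known, here the job performance measure.

```python
from math import copysign, sqrt


def case1(r_observed: float, sd_restricted: float, sd_unrestricted: float) -> float:
    """Case 1 correction.

    The SDs belong to the variable whose unrestricted variance is known
    (job performance in this scenario). The formula yields a magnitude,
    so the sign of the observed correlation is carried over.
    """
    variance_ratio = (sd_restricted / sd_unrestricted) ** 2
    magnitude = sqrt(1.0 - variance_ratio * (1.0 - r_observed ** 2))
    return copysign(magnitude, r_observed)


# Hypothetical inputs: observed validity of .25 among the 29 incumbents,
# job performance SD of 8 in the sample and 10 in the normative archive.
r_corrected = case1(r_observed=0.25, sd_restricted=8.0, sd_unrestricted=10.0)
print(f"observed r = 0.25, Case 1 corrected r = {r_corrected:.2f}")
```

The Case 1 form can change the estimate substantially even for a moderate difference in standard deviations, which is one reason the choice of the unrestricted reference group matters.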

Scenario 4: Case 2 – direct selection on the independent variable

Situation

An Air Force psychologist was interested in examining the predictive validity of the Armed Services Vocational Aptitude Battery (ASVAB) General (G) composite score for students in the Flight Attendant and Cyberspace Support courses. Minimum aptitude requirements are G ≥ 45 for the Flight Attendant course and G ≥ 62 for the Cyberspace Support course. Statistical testing of the correlation between the test scores and final school grade in technical training showed a non-significant correlation for the Cyberspace Support students (n = 235, r = .05), but a statistically significant correlation for the Flight Attendant students (n = 197, r = .27). The psychologist noted that the standard deviation (SD) on the general aptitude scores among Cyberspace Support students is only 25% as great as the SD for the Flight Attendants. Despite noting the smaller SD on the G score for Cyberspace Support students compared with Flight Attendant students, the researcher was confident in the results for both groups.

Problem

Differential range restriction is present as suggested by the difference in the SDs of the General aptitude score for the two groups. The psychologist failed to recognize differential range restriction; thus, his confidence is misplaced. Recruits (applicants) are not assigned to jobs randomly; they are assigned based on the G score. The Flight Attendant course receives students with a wide variance of ability, while the Cyberspace Support course receives students with a narrower variance of ability. Correlation is dependent on variability, and the lower variability in the General scores for Cyberspace Support students restricts the magnitude of the correlation. As computed, the two correlations cannot be compared. Differential range restriction for the Cyberspace Support and Flight Attendant students has contributed to the apparent validity difference for the two groups. The belief of the researcher is incorrect.

Archival normative data for Air Force enlistees indicate that the standard deviation on the ASVAB General composite score is 8.5 (compared to the US Military applicant SD of 10.0). This value of 8.5 was used to correct the correlations for the two training specialties using the Case 2 formula along with the SD found in each sample. After Case 2 correction, the correlation was r = .40 for the Flight Attendant students and r = .41 for the Cyberspace Support students. Differential range restriction caused the original correlations to seem different when they were about the same. While the application of the range restriction correction formula will provide a better estimate of the relations between the variables, it will not convert a nonsignificant correlation to a statistically significant correlation. The researcher should conclude that the General composite score has similar correlations for both training courses.

Solution

In this instance, students were selected only by their General scores. Each training course sets unique minimum selection scores. Students who scored below course minimums were not permitted to apply for that course. The Case 2 correction for range restriction for direct selection should be applied.
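The Case 2 computation for the two courses can be sketched as below. The observed correlations (.27 and .05) and the enlistee SD of 8.5 come from the scenario. The two sample SDs are not reported, so the values used here (6.0 and 1.5, chosen only to respect the stated 25% ratio) are hypothetical, and the output is not expected to reproduce the corrected values given in the text.

```python
from math import sqrt


def case2(r_observed: float, sd_unrestricted: float, sd_restricted: float) -> float:
    """Case 2 correction for direct selection on the predictor."""
    u = sd_unrestricted / sd_restricted
    return r_observed * u / sqrt(1.0 - r_observed ** 2 + (r_observed * u) ** 2)


ENLISTEE_SD = 8.5  # archival SD of the General composite (from the scenario)

# Sample SDs are hypothetical; only their 4:1 ratio is taken from the scenario.
courses = {
    "Flight Attendant": {"r": 0.27, "sd": 6.0},
    "Cyberspace Support": {"r": 0.05, "sd": 1.5},
}

corrected = {}
for name, stats in courses.items():
    corrected[name] = case2(stats["r"], ENLISTEE_SD, stats["sd"])
    print(f"{name}: observed r = {stats['r']:.2f}, "
          f"corrected r = {corrected[name]:.2f}")
```

Even with these made-up sample SDs, the gap between the two courses narrows after correction, because the course with the smaller SD receives the larger adjustment.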

Scenario 5: Case 3 – indirect selection

Situation

An industrial/organizational psychologist from the Air Force Research Laboratory was consulted to improve the validity of a selection system for enlisted supervisors. Currently, the Armed Forces Qualification Test (AFQT), a measure of general mental ability, is used. After reading the management selection literature, she identified an experimental test of leadership. Next, she administered the leadership test to all 152 of last year’s enlisted supervisor applicants, 81 of whom were selected to be enlisted supervisors. Squadron supervisors used a structured rating scale to measure the performance of the enlisted supervisors selected last year. These ratings were collected by the psychologist. These 152 applicants had scores on both the AFQT and experimental leadership tests. The AFQT correlated with job performance ratings at r = .29, while the leadership test correlated with the job performance ratings at r = .35. She concluded that the test of leadership was more valid than the AFQT.

Problem

The problem is the failure to recognize a situation of direct selection on the AFQT and indirect selection on the test of leadership. Evaluation of the uncorrected correlations leads to the incorrect conclusion that the leadership test is more valid than the AFQT. The Case 2 correction is appropriate for the AFQT as it was used to select applicants for the enlisted supervisor jobs.

The Case 3 correction is appropriate for the leadership test because while it was not used for selection, the unrestricted variance is known in a sample of applicants. After application of the Case 3 correction for indirect range restriction on the leadership test and Case 2 correction for direct range restriction on the AFQT, the correlations with job performance were .46 for AFQT and .40 for the test of leadership. Her wrong conclusion leads to an erroneous interpretation of the validities.

Solution

The psychologist should have recognized the direct selection on the AFQT and the indirect selection on the leadership test and that an estimate of the unrestricted variance in the applicant sample was available for both the AFQT and leadership tests. Given that the data were sufficient and other conditions were met, Case 2 and Case 3 corrections should have been applied.

There are two equations for Case 3. The first is for when only incumbents, selected on the operational test, take the experimental test, and the other equation is for when applicants take both the operational and experimental test. See the Appendix section on Case 3 for the equations and explanation.
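The combined use of Case 2 and Case 3 in this scenario can be sketched as follows. The observed validities (.29 for the AFQT and .35 for the leadership test) come from the scenario. The AFQT SD ratio and the restricted correlation between the two tests are not reported, so the values below are hypothetical, and the result is not expected to match the corrected values in the text. The Case 3 function is the standard single-selection-variable form; which of the two Appendix equations it corresponds to is not asserted here.

```python
from math import sqrt


def case2(r_zy: float, u_z: float) -> float:
    """Case 2: direct selection on z; u_z = unrestricted SD / restricted SD."""
    return r_zy * u_z / sqrt(1.0 - r_zy ** 2 + (r_zy * u_z) ** 2)


def case3(r_xy: float, r_xz: float, r_yz: float, u_z: float) -> float:
    """Case 3: indirect selection on x and y produced by direct selection on z."""
    k = u_z ** 2 - 1.0
    return (r_xy + r_xz * r_yz * k) / sqrt(
        (1.0 + r_xz ** 2 * k) * (1.0 + r_yz ** 2 * k)
    )


R_AFQT_PERF = 0.29        # from the scenario (restricted sample)
R_LEAD_PERF = 0.35        # from the scenario (restricted sample)
R_LEAD_AFQT = 0.30        # hypothetical restricted correlation between the tests
U_AFQT = 1.5              # hypothetical applicant SD / incumbent SD on the AFQT

afqt_corrected = case2(R_AFQT_PERF, U_AFQT)
leadership_corrected = case3(R_LEAD_PERF, R_LEAD_AFQT, R_AFQT_PERF, U_AFQT)
print(f"AFQT: {R_AFQT_PERF:.2f} -> {afqt_corrected:.2f}")
print(f"Leadership: {R_LEAD_PERF:.2f} -> {leadership_corrected:.2f}")
```

With these inputs the directly selected AFQT gains more from correction than the indirectly selected leadership test, so the apparent advantage of the leadership test in the uncorrected data largely disappears.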

Scenario 6: Multivariate selection

Situation

A researcher was studying the relationship of safe operation of forklifts in a parts depot with experimental tests of working memory and personality and with two operational tests from the Armed Services Vocational Aptitude Battery (ASVAB), Arithmetic Reasoning (AR) and Mechanical Comprehension (MC). Safe operation was measured by safety-related accidents such as hitting people, shelves, doors, walls, and other forklifts. All participants were selected on scores on the ASVAB AR and MC tests. The other two tests, working memory and personality, were experimental and not used for selection. A group of 194 forklift operators participated in the study. To measure the criterion, accelerometers were installed on all forklifts to count the number of times each forklift struck objects. All scores on the four tests were available after study completion, and the correlations were examined. Working memory (r = .44) and personality (r = .56) showed strong correlations with the number of times a forklift struck objects, while the AR (r = .12) and MC (r = .10) tests had only weak relations. The researcher concluded that AR and MC scores were not meaningfully related to the criterion, but working memory and personality were.

Problem

The conclusions of the researcher are wrong because the test scores were subject to different types and amounts of range restriction and were not corrected. The ASVAB AR and MC tests were used for selection, and the other tests were not.

Solution

The sample of interest is last year’s applicants for forklift control training. Given the four independent variables, the multivariate correction for range restriction developed by Lawley (1943) is appropriate. The multivariate correction would include the four independent variables and the criterion. At least one unrestricted correlation such as between AR and MC could be estimated from the archival data. In addition to the correlations of the four independent variables with the criterion, the corrected correlations of the variables with each other are provided after applying the multivariate correction. The resultant matrix of corrected correlations is more accurate than the uncorrected correlations or the individual correlations corrected separately (Held & Foley, 1994).

In this scenario, complete data were available and the variables used for selection were identified. The multivariate correction (Lawley, 1943) should be applied. The multivariate method uses more information than the univariate methods to correct the correlations. It should be used in situations when two or more variables are identified as being used for selection and data are available.
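
The multivariate correction can be sketched as follows. This is a minimal pure-Python illustration of the Pearson-Lawley formulas as they are usually stated, not the authors' implementation: the explicit selection variables are labeled x, the remaining variables y, and the unrestricted covariance of x is taken from applicants. The helper names and all numeric values are hypothetical. A production analysis would normally use a matrix library.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def inverse(a):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(a)
    aug = [[float(v) for v in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def lawley(v_xx, v_xy, v_yy, sigma_xx):
    """Return (sigma_xy, sigma_yy), the estimated unrestricted covariances.

    v_* are covariance blocks from the selected sample; sigma_xx is the
    applicant covariance matrix of the explicit selection variables x.
    """
    w = matmul(inverse(v_xx), v_xy)            # regression weights of y on x
    w_t = transpose(w)
    sigma_xy = matmul(sigma_xx, w)
    restricted_part = matmul(w_t, matmul(v_xx, w))
    unrestricted_part = matmul(w_t, sigma_xy)
    sigma_yy = [[v_yy[i][j] - restricted_part[i][j] + unrestricted_part[i][j]
                 for j in range(len(v_yy))] for i in range(len(v_yy))]
    return sigma_xy, sigma_yy


# Hypothetical values: x = (AR, MC); y = (an experimental test, the criterion).
v_xx = [[0.50, 0.15], [0.15, 0.60]]
v_xy = [[0.20, -0.08], [0.15, -0.07]]
v_yy = [[1.00, -0.40], [-0.40, 1.00]]
sigma_xx = [[1.00, 0.60], [0.60, 1.00]]        # assumed applicant values

sigma_xy, sigma_yy = lawley(v_xx, v_xy, v_yy, sigma_xx)
r_ar_criterion = sigma_xy[0][1] / (sigma_xx[0][0] * sigma_yy[1][1]) ** 0.5
print(round(r_ar_criterion, 3))
```

With a single x and a single y the same function reduces to the Case 2 result, which is a convenient sanity check before applying it to a full matrix.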

Scenario 7: Consequence of failing to correct for range restriction

Situation

A researcher was presented with data on verbal and mathematical aptitude and five domain scores for a Big Five personality measure from the Air Force Officer Qualifying Test (AFOQT). She wanted to determine if these scores were highly correlated or if they were relatively independent. The participants (n = 217) were college graduates who had been accepted for officer commissioning through the Air Force Officer Training School (OTS) program. Minimum qualifying scores for applying to the OTS program are the 15th percentile on the AFOQT Verbal composite and 10th percentile on the AFOQT Quantitative composite. However, because there are typically many high-quality applicants, the scores for those accepted are well above the minimum requirements. She computed descriptive statistics (means and standard deviations [SD]) for the scores, noting that the AFOQT cognitive score means for the OTS trainees were about one SD above the values in a five-year sample of OTS applicants. She also noted that the SDs for the cognitive scores were about 25 to 50% lower than those for the five-year sample of OTS applicants.

Next, she computed a 7 by 7 correlation matrix, evaluated individual correlations, and noted that the correlation between the verbal and quantitative aptitude scores was much lower than that in the five-year OTS applicant sample. Additionally, the correlations between the personality domain score for conscientiousness and the other four domain scores were low. The correlations of the verbal and quantitative scores with the five personality domain scores were also low. Because of these low correlations, she concluded that the cognitive and personality scores for OTS trainees are relatively independent. The needed range restriction correction was not made.

Problem

The researcher has failed to identify range restriction as an influence on the magnitude of the correlations of the scores. As noted in the Scenario description, when the researcher consulted the data from the five-year OTS applicant sample, she noted that the OTS trainees’ cognitive test score means were above the five-year OTS applicant values and the SDs were lower than the five-year OTS applicant values.

Solution

Because multiple variables were involved, cognitive and personality scores, the multivariate correction for range restriction should have been used. One question to be resolved is identification of the appropriate reference group. If she intends to make inferences about the relation of the test scores for OTS applicants, the appropriate reference group would be OTS applicants who met the minimum aptitude requirements for the program (AFOQT Verbal ≥ 15 and Quantitative ≥ 10). After application of the multivariate correction, it would be noticed that the corrected correlation matrix shows moderate to strong correlations contrary to her conclusion.

Scenario 8: Determining the appropriate unrestricted group

Situation

A law school selection board was interested in examining the predictive validity of law school admissions test scores for predicting first-year law school GPA. Data from archival records for the last 5 years of those accepted to the law school were used to calculate the correlation between the law school admissions test scores and first-year law school GPA. Understanding that the variance of the test scores was restricted due to prior selection, the board decided to use the law school admission test national norms as the unrestricted data for range correction to provide a better estimate of the validity. The Case 2 correction for direct selection was used.

Problem

National norms are not the appropriate unrestricted group in this situation. Using national norms would likely overcorrect the correlations. The appropriate unrestricted group is the group about which inferences are to be made.

Solution

Case 2 is appropriate. The selection board is interested in the relation of the law school admissions test scores to first-year GPA at this law school. The appropriate unrestricted group is applicants to this law school.

Scenario 9: Extreme groups and range enhancement

Situation

A graduate student in organizational leadership was interested in the relationship between agreeableness and leader-member exchange (LMX). She hypothesized a positive relationship between supervisor agreeableness and LMX ratings made by their subordinates. As part of the application process for the position of financial manager, a test of the Big Five personality scales was administered. She administered an online survey to collect data from 253 employees of a large financial business who rated their immediate supervisors on leader-member exchange. Based on their agreeableness scores, she then directly selected only the highest and lowest 20% of job incumbents. The resulting correlation between agreeableness and leader-member exchange was r = .53, confirming her hypothesis.

Problem

An examination of the standard deviation for agreeableness indicated a higher value in the extreme-group sample than in the archival sample of financial manager applicants. The use of extreme groups has increased the variability of the scores creating range enhancement (Johnson et al., 2018; Levin, 1972) and an increase in the correlation between agreeableness and leader-member exchange.

Solution

The graduate student should correct the correlation for direct (Case 2) range restriction using the variance of agreeableness from the financial manager applicant sample to provide a more accurate estimate. The corrected correlation between agreeableness and leader-member exchange was r = .27. The use of extreme groups created range enhancement.
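
A small simulation can make the mechanism concrete. The sketch below is illustrative only: it assumes bivariate normal scores with a population correlation of .30 (not the scenario's data), keeps the lowest and highest 20% on the predictor, and then applies the Case 2 formula with a standard deviation ratio below 1. All names and values are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def sd(xs):
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))


def case2(r, sd_reference, sd_sample):
    u = sd_reference / sd_sample   # u < 1 under range enhancement
    return r * u / math.sqrt(1.0 + r ** 2 * (u ** 2 - 1.0))


random.seed(1)
rho, n = 0.30, 20000   # assumed population correlation and sample size
x = [random.gauss(0, 1) for _ in range(n)]
y = [rho * xi + math.sqrt(1 - rho ** 2) * random.gauss(0, 1) for xi in x]

# Keep only the lowest and highest 20% on x (extreme groups).
order = sorted(range(n), key=lambda i: x[i])
cut = n // 5
keep = order[:cut] + order[-cut:]
x_ext, y_ext = [x[i] for i in keep], [y[i] for i in keep]

r_full = pearson(x, y)
r_extreme = pearson(x_ext, y_ext)
r_corrected = case2(r_extreme, sd(x), sd(x_ext))
print(round(r_full, 2), round(r_extreme, 2), round(r_corrected, 2))
```

In this simulated setting the extreme-groups correlation is inflated relative to the full sample, and the Case 2 formula brings it back toward the full-sample value because the reference standard deviation is smaller than the extreme-groups standard deviation.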

Scenario 10: Univariate versus multivariate correction

Situation

A large metropolitan police department used a measure of general mental ability (GMA) to screen job applicants for over a decade. Although the measure of GMA consistently demonstrated validity for supervisor ratings of first-year performance, the Chief of Police is concerned about occurrences of counterproductive work behavior (CWB). In response, the human resources department introduced a test of the three dimensions of the Dark Triad, Machiavellianism, neuroticism, and psychopathy into the selection battery. The Dark Triad tests were administered for 3 years, yielding data for all applicants from which correlations were computed.

Although the Dark Triad tests were administered, their scores were not used in the selection process. After a large sample of new hires (n = 243) completed their first year on the job, the human resources manager calculated the correlations between the measures of GMA, Machiavellianism, neuroticism, psychopathy, and supervisory ratings of CWB. All tests had statistically significant correlations with supervisor ratings of CWB in the expected direction. He was aware of the effects of range restriction and corrected the correlations to estimate their values in the applicant sample. The correlation for the GMA measure was corrected for direct range restriction (Case 2), and the correlations of the measures of the Dark Triad were corrected for indirect range restriction (Case 3). As expected, all the validities increased in magnitude, with a larger increase for the measure of GMA than for the measures of personality.

Problem

The application of the Case 2 and Case 3 correction formulas is commendable as they show recognition of the problem of range restriction. Case 2 and Case 3 corrections provide better estimates than the uncorrected correlations but are less accurate than the multivariate method.

Solution

The multivariate method (Lawley, 1943) should have been applied to produce more accurate results (Held & Foley, 1994). The multivariate correction makes more accurate corrections because it uses more information than either Case 2 or Case 3. Additionally, the multivariate correction provides corrected correlations of the tests and criteria as well as the tests with one another. This facilitates the use of the matrix of corrected correlations in other analyses such as multiple regression. For example, the multivariate correction allows the researcher to examine the incremental validity of the Machiavellianism, neuroticism, and psychopathy measures over the measure of GMA.

Scenario 11: Sign changes when correcting for range restriction

Situation

A psychologist was evaluating the predictive validity of scores from a multiple-aptitude test battery for grades in an advanced electronics training course. The test battery was administered to a large group of applicants. The selection process was competitive, and only the top 25% of applicants were selected based on a composite of the test scores from the battery. The psychologist computed the correlations between the test scores and training grades, noting that they were low with one unexpected negative correlation. He applied the multivariate correction and noted that the magnitudes of all the correlations increased and the test with the negative correlation now displayed a positive correlation. After observing the change in sign, he concluded that something went wrong with the multivariate correction and applied the Case 2 correction instead. The Case 2 corrections led to increased correlations for all scores, including a larger negative value for the negative correlation.

Problem

Under some selection conditions, sign changes can occur (Ree et al., 1994; Thorndike, 1947, 1949). These can be corrected using Case 3 (indirect univariate) or the multivariate method (Lawley, 1943; Ree et al., 1994). Case 1 and Case 2 cannot correct sign changes due to range restriction. However, as noted by Ree et al. (1994), even in extreme conditions where the range-corrected correlation may have the wrong sign, it is a better estimate than the observed correlation. For an illustration of the effect of Case 3 corrections involving potential sign change as a function of the ratio of variances and the correlations, see Ree et al. (1994, p. 299, Table 1).

Solution

The psychologist was correct the first time when he applied the multivariate correction. Signs of correlations can change from negative to positive and positive to negative depending on the nature and magnitude of the range restriction and the correlations among the variables. Sign change is more likely when the correlations among the variables are high and when the ratio of the unrestricted to restricted variance is high, which is usually a consequence of the selection ratio (Ree et al., 1994). The corrected correlations using the multivariate method were more accurate estimates leading to correct decisions.

Scenario 12: Reliability and range restriction

Situation

A researcher at a highly selective university was investigating the theory that student performance on a standardized test of advanced algebra was negatively related to class size as measured by the number of registered students. After a correlation was computed, the researcher first corrected it for unreliability and then corrected for direct range restriction to estimate the true score correlation (rt) of the variables. This was accomplished by dividing the observed correlation (rxy = −.57) by the product of the square roots of the reliabilities (Rxx and Ryy). The researcher assumed that the student count was perfectly reliable and found the reliability of .8 in the manual for the test of advanced algebra. The correction for unreliability is

$$r_t = \frac{r_{xy}}{\sqrt{R_{xx}}\,\sqrt{R_{yy}}},$$

where rt is the true-score correlation, rxy is the observed correlation, and Rxx and Ryy are their reliabilities. The reported result after correction for unreliability and direct range restriction was rt = −.63, a 10% change in magnitude of the correlation.

Problem

The researcher has not reported a correct true-score correlation because she applied the corrections in the wrong order.

Solution

When selection has occurred on an observed score, the correction for range restriction is applied first followed by the correction for unreliability. When selection has occurred on a latent variable such as self-selection, the correction for unreliability should be done before the correction for range restriction. Correcting for range restriction yielded r = −.60, and after correction for unreliability, r = −.75. This is a 32% change beyond the uncorrected correlation of −.57. Stauffer and Mendoza (2001) have shown that the sequence of corrections for observed scores and latent variables applies to Cases 1, 2, 3, and V. Order of corrections with the multivariate procedure has not been studied.
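
The effect of the order of corrections can be seen in a short sketch. The inputs below are hypothetical and were chosen only to show that the two sequences give different answers. For simplicity the sketch reuses one reliability value in both sequences, although reliability itself differs between restricted and unrestricted groups. The function names are illustrative.

```python
import math


def case2(r, u):
    """Direct range restriction; u = unrestricted SD / restricted SD."""
    return r * u / math.sqrt(1.0 + r ** 2 * (u ** 2 - 1.0))


def disattenuate(r, rel_x, rel_y):
    """Correction for unreliability in both variables."""
    return r / math.sqrt(rel_x * rel_y)


# Hypothetical inputs, not the values from the scenario.
r_obs, u = -0.40, 1.5
rel_x, rel_y = 1.0, 0.8   # x treated as perfectly reliable

# Selection on an observed score: range restriction first, then unreliability.
observed_order = disattenuate(case2(r_obs, u), rel_x, rel_y)
# Selection on a latent variable: unreliability first, then range restriction.
latent_order = case2(disattenuate(r_obs, rel_x, rel_y), u)
print(round(observed_order, 3), round(latent_order, 3))
```

Because the Case 2 formula is nonlinear in the correlation, the two sequences do not commute, so the choice of sequence has to follow from how selection occurred.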

Scenario 13: Regression weights and range restriction

Situation

A researcher used multiple regression to predict the number of hours to complete training in computer programming. Applicants were selected based on a test of logical reasoning and a test of non-verbal reasoning. An experimental test of the ability to follow complex procedures was administered at the same time as the other two tests, but not used in selection. For admission to the course, the reasoning tests had required minimum scores of the 50th percentile for logical reasoning and the 45th percentile for non-verbal reasoning. The number of hours to complete training was the criterion. Analyses were conducted both without and with the experimental test. The first multiple regression produced a statistically significant R = .390 with the following unstandardized regression weights: logical reasoning b = .372 and non-verbal reasoning b = .306. A second regression that added the experimental test produced a statistically significant R = .422 with the following unstandardized weights: logical reasoning b = .305, non-verbal reasoning b = .253, and ability to follow complex procedures b = .173. The test of ability to follow complex procedures demonstrated incremental validity (.422 vs .394) and was recommended as an addition to the selection tests. The researcher suggested using the unstandardized regression coefficients as weights for computing scores for use in selecting applicants.

Problem

Direct selection does not change regression weights but indirect selection does. The unstandardized b-weights for variables subject to indirect selection are biased due to range restriction. There was direct selection on logical reasoning and non-verbal reasoning. The experimental test of ability to follow complex procedures was subjected to indirect range restriction. Applicant data were available for the tests of logical reasoning and non-verbal reasoning. Had the correlations been corrected for multivariate range restriction, the results of the first multiple regression would have shown a statistically significant R = .662 with the following unstandardized regression weights: logical reasoning b = .366 and non-verbal reasoning b = .374. When the experimental test was included in the second regression, the R = .692 and the unstandardized weights were b = .266 for logical reasoning, b = .269 for non-verbal reasoning, and b = .269 for the test of ability to follow complex procedures.

Solution

Correct all the correlations with the Lawley (1943) multivariate procedures using applicant scores as the unrestricted sample. After correction, the unstandardized weights were not biased and the test of ability to follow complex procedures showed less incremental validity. The unstandardized regression coefficients could be used for selection.

Mendoza and Mumford (1987) noted that while direct range restriction has no effect on regression slopes, indirect restriction leads to a reduction in the regression slope (see also Hunter et al., 2006). If indirect range restriction has occurred, the unstandardized regression weights in the unrestricted sample and the range-restricted sample would be equal only if the independent variables were perfectly reliable.

Scenario 14: Factor analysis and range restriction

Situation

Senior military leadership was concerned that while new enlistees have adequate verbal, math, and technical knowledge, they cannot solve novel or unique problems. They suggested additional measures be added to the enlistment test battery. A personnel research psychologist familiar with cognitive psychology theory recognized that the distinction being made is between crystallized intelligence (Gc) and fluid intelligence (Gf). He noted that the current enlistment test consists entirely of measures of Gc. As a first step in investigating the utility of fluid intelligence in personnel selection, he administered a battery of Gf tests to new recruits and conducted a confirmatory factor analysis that included both the Gc tests from the extant enlistment battery and the battery of Gf tests. Drawing on the published literature, he tested and confirmed a hierarchical factor structure for each of the test batteries and interpreted the higher-order factors as general crystallized intelligence and general fluid intelligence. He observed a correlation of .705 between the two higher-order factors and concluded that while they are strongly correlated, there is enough unique variance in the measures of crystallized and fluid intelligence that the Gf tests might have substantial incremental validity when used with the current test battery, which measures Gc.

Problem

The confirmatory factor analysis was done with a sample of enlistees who had been directly selected based on their scores on the measures of Gc. The correlations involving the measures of fluid intelligence were affected by indirect restriction due to their relation with the Gc enlistment test. The correlations between the tests for the enlistees are lower than those for military applicants. Thus, the correlations among the factors had been underestimated. The researcher overestimated the uniqueness of the Gf tests and their potential incremental validity.

Solution

Apply the multivariate correction for range restriction and then conduct another confirmatory factor analysis. In the corrected data, the two higher-order factors correlated at r = .911, indicating little unique variance and that the measures of Gf would likely have little or no incremental validity beyond the current test battery.

Given that both confirmatory factor analysis and exploratory factor analysis are based on correlations (or covariances), range restriction will affect the results of both. Correction for range restriction is desirable for both confirmatory and exploratory factor analyses to provide a better estimate of the relations among the factors.

Scenario 15: Correcting when information about previous selection variables is unavailable

Situation

A researcher was testing a hypothesis from a theory of management. The hypothesis states that there is a strong positive relationship between situational judgment and verbal analogies. The 199 employee participants were previously selected for employment, but information on the variables used for selection was lost and unavailable. The researcher administered the Management Appraisal of Situational Judgment and Verbal Analogies Assessment tests. She computed a correlation between these scores (r = .18) and reported a weak relationship.

Problem

Frequently, a problem is the effect of factors or variables not accounted for because they are unknown or unavailable, but are influential in creating range restriction nonetheless (see, Gross, 1990; Gross & McGanney, 1987; Jackson & Ree, 1992; Olson & Becker, 1983).

Solution

The researcher should have used the equation of Bryant and Gokhale (1972), who developed a method of correcting for range restriction when information about previous selection variables is unavailable. The data required by their formula are the restricted correlation and the unrestricted and restricted standard deviations for each variable of interest. The result of their equation is a corrected correlation of the variables. Applying Bryant and Gokhale’s formula yields a corrected correlation of r = .36.

There are times when the correlation of the constructs underlying the observed scores is of interest. When the relations among the constructs are the focus as in meta-analysis, it is necessary to also correct for the reliability of the measures. The Case V correction for range restriction (Le et al., 2016) is an important expansion of the Bryant and Gokhale (1972) formula that also corrects correlations for unreliability. Applying the Case V formula to this scenario, the fully corrected correlation was r = .45.
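
The structure of this correction can be sketched in code. The sketch assumes the form of the bivariate indirect correction as it is commonly restated in later work (Le et al., 2016; Dahlke & Wiernik, 2019a), with positive selection on both variables; it should be checked against the original equations before use. The function name, the standard deviation ratios, and the reliabilities are hypothetical; only r = .18 comes from the scenario.

```python
import math


def case_v(r, u_x, u_y, rel_x=1.0, rel_y=1.0):
    """Bivariate indirect range restriction correction (assumed form).

    u_x, u_y: restricted SD / unrestricted SD for each variable (at most 1).
    rel_x, rel_y: unrestricted reliabilities. Leaving both at 1.0 applies the
    range restriction correction alone, with no correction for unreliability.
    Assumes selection moved both variables in the same direction.
    """
    numerator = r * u_x * u_y + math.sqrt((1 - u_x ** 2) * (1 - u_y ** 2))
    return numerator / math.sqrt(rel_x * rel_y)


r_restricted = 0.18          # from the scenario
u_sjt, u_analogies = 0.88, 0.88   # assumed SD ratios

range_only = case_v(r_restricted, u_sjt, u_analogies)
with_reliability = case_v(r_restricted, u_sjt, u_analogies, rel_x=0.8, rel_y=0.8)
print(round(range_only, 2), round(with_reliability, 2))
```

Note that the inputs are only the restricted correlation and the two standard deviation ratios; no information about the original selection variables is needed, which is what makes this approach usable when selection records are lost.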

Scenario 16: Estimating the utility of a new selection variable

Situation

The human resources manager of a regional airline was considering adding a structured interview to the pilot selection process. Pilots were selected on the scores on a test of aviation-job knowledge. The interview scores were collected but not used in selection. She conducted a study where all applicants during two years completed the interview and received ratings from the board of interviewers. Those hired received quarterly job performance ratings from their daily crewmembers. After a sufficiently large number of pilots had been hired and evaluated, the human resources manager analyzed the data to determine the utility of the structured interview when added to current selection procedures. She calculated the validity of the current procedure and the current procedure plus the interview to estimate the baseline predictive validity and incremental validity. Success on the job was defined as receiving a mean peer rating of 4 or higher on a scale ranging from 1 to 5. The correlation for the current system was r = .30, and when the structured interview was added, r = .36. She then consulted the Taylor-Russell tables (Taylor & Russell, 1939) to estimate utility. The success rate of the current system was 30%. The system with the structured interview added had a success rate of 36%, indicating a 20% increase (6/30 = 20%) over the base rate.

Problem

Taylor and Russell (1939) showed that the proportion of successful employees after selection is a function of the selection ratio, the base rate, and the validity of the proposed selection variable. Base rate is defined as the percentage of employees who would be successful under random selection. The accuracy of the results of utility (proportion successful) analysis (Naylor & Shine, 1965; Schmidt et al., 2016; Taylor & Russell, 1939) depends on the value of the correlation between the selection score and success on the job. The use of uncorrected correlations will frequently lead to underestimates, sometimes severely, of the utility of the predictor. Because the interview scores were not used in the pilot selection process and an estimate of the variance of the rating scores is available from the applicant sample, the Case 3 correction is appropriate.

Solution

Corrected correlations give a more accurate estimate of utility. In this example, the corrected proportion of success was 43% for the current system and the proportion successful with the addition of the structured interview was 48% when the correlations were corrected for range restriction. The 20% improvement over the baseline with uncorrected correlations was reduced to 12% (5/43 = 12%).
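
The dependence of the proportion successful on validity, base rate, and selection ratio can be illustrated numerically. The sketch below is not a reproduction of the Taylor-Russell tables; it assumes a bivariate normal model and integrates it with a simple midpoint rule. The function name, the base rate, the selection ratio, and the two validities are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def proportion_successful(validity, base_rate, selection_ratio, steps=4000):
    """P(success | selected) under a bivariate normal model, |validity| < 1."""
    x_cut = STD_NORMAL.inv_cdf(1.0 - selection_ratio)   # predictor cutoff
    y_cut = STD_NORMAL.inv_cdf(1.0 - base_rate)         # criterion cutoff
    resid_sd = (1.0 - validity ** 2) ** 0.5
    upper = 8.0                                         # effectively infinity
    width = (upper - x_cut) / steps
    total = 0.0
    for i in range(steps):
        x = x_cut + (i + 0.5) * width                   # midpoint rule
        p_success = 1.0 - STD_NORMAL.cdf((y_cut - validity * x) / resid_sd)
        total += STD_NORMAL.pdf(x) * p_success * width
    return total / selection_ratio


# Hypothetical base rate and selection ratio; validities are illustrative.
uncorrected = proportion_successful(0.30, base_rate=0.30, selection_ratio=0.25)
corrected = proportion_successful(0.45, base_rate=0.30, selection_ratio=0.25)
print(round(uncorrected, 2), round(corrected, 2))
```

Holding base rate and selection ratio fixed, a larger validity always yields a larger proportion successful, so a validity that is understated by range restriction understates the estimated utility.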

Scenario 17: More accurate meta-analyses

Situation

A scientist at Drones R Us conducted a literature review regarding the relation of general mental ability (GMA) and effectiveness of human-automation interaction (HAI). Among the studies reviewed, several independent small-scale studies included an assessment of video game experience (VGE). Participants in these studies had been selected based on scores on a GMA test; VGE was assessed as part of a background questionnaire, but not used for selection. The scientist conducted a meta-analysis to estimate the correlations among GMA, HAI, and VGE. Incremental validity of VGE for predicting HAI was based on the meta-analytic results. He corrected the validities involving GMA for direct range restriction (Case 2) and those involving VGE for indirect range restriction (Case 3).

Problem

The application of the Case 2 correction for GMA and the Case 3 correction for VGE is appropriate, but could be improved by also correcting for unreliability of the measures.

Solution

Case 2 is appropriate in the scenario for correcting the validity of GMA due to direct selection; it could be improved to include correction for measurement error by use of the correction for unreliability. Both variables used in Case 2 would be corrected for unreliability using the formula shown in Scenario 12. The order of correction for range restriction and unreliability (Stauffer & Mendoza, 2001) is dependent on how the range restriction occurred (observed or latent variables) and is discussed in Scenario 12.

The accuracy of the Case 3, univariate indirect range restriction (UVIRR) correction for VGE could be improved by using Case V when appropriate data are available. The use of the Case V procedure (Le et al., 2016) provides a better estimate because it corrects the correlations for measurement error and indirect range restriction and, in meta-analyses, facilitates estimation of sampling error across studies. Dahlke and Wiernik (2019a), using the term bivariate indirect range restriction (BVIRR) to describe Case V, noted that the method in the study by Le et al. (2016) did not take into account important impacts of the BVIRR correction on the sampling error of corrected correlations and noted the implications of applying the BVIRR correction in primary research and meta-analyses. Dahlke and Wiernik provided a generalized Case V (BVIRR) formula that can correct for either range restriction or range enhancement and that substantially improved parameter and correlation estimates by reducing bias. They also described new methods to adjust for its impact on the sampling variance of correlations. Dahlke and Wiernik (2019b) noted that the Case V (BVIRR) correction functions very differently than univariate direct range restriction (UVDRR or Case 2) or univariate indirect range restriction (UVIRR, Case 3 or Case IV) corrections or correction for measurement error alone when used in psychometric meta-analyses.

Scenario 18: Meta-analysis and indirect selection

Situation

A meta-analysis was conducted to estimate the validity of two variables for a criterion of a job knowledge test. The first variable was a psychomotor test that was used in each study for direct selection of employees, and the second variable was a self-report conscientiousness score for only the selected employees. Correlations were computed between psychomotor test scores, conscientiousness ratings, and job knowledge. There were 112 studies, each with the correlations of interest. Estimates of the unrestricted standard deviations of the psychomotor test and conscientiousness scores were available from test manuals. Using off-the-shelf software, the researcher corrected the correlations for range restriction using Case 2 as implemented by the software. The meta-analytic results showed an average correlation corrected for range restriction of .53 for the psychomotor test and .33 for conscientiousness ratings with job knowledge scores.

Problem

The corrected correlation for conscientiousness is wrong. The Case 2 correction is appropriate for the psychomotor test scores, which were used to screen applicants for employment. Case 2 correction is not appropriate for the conscientiousness scores, which were collected only for those selected for employment. Case 3 is appropriate for conscientiousness.

Solution

Given the situation, the Case 3 equation should have been used to correct the correlation between the conscientiousness ratings and job knowledge scores, not the Case 2 equation. The Case 2 correction does not correct for indirect restriction. In this situation, with the use of the wrong equation, the meta-analytic average corrected correlation for conscientiousness was incorrect (Hunter & Schmidt, 2004), causing the meta-analyst to report incorrect underestimated results. When the correction for indirect restriction was used in the meta-analysis, the fully corrected correlation between conscientiousness and job knowledge was r = .44.

Scenario 19: Confidence intervals for corrected correlations

Situation

A study was conducted to find the correlation between scores on a standardized test of physical fitness and the number of injuries for firefighters. Job incumbents were selected based on achieving a score at or above the 50th percentile on the physical fitness test. The 36 participants in this study were from a local fire department. A correlation of r = −.33 was found. Normative data for the physical fitness test were available for firefighter applicants for the last 10 years as a reference. The principal investigator corrected the correlation for direct selection (Case 2), giving a corrected correlation of r = −.50. He also computed the confidence interval for the correlation both before and after range-restriction correction using the equation for the standard error of a correlation. He reported both uncorrected and corrected correlations and their 95% confidence intervals.

Problem

There are two mistakes with the statistics reported. First, Millsap (1989) demonstrated that the standard error of a range-restricted correlation is underestimated using the usual standard error equation. This makes confidence intervals too small. Because no accepted adjustment exists, reports should include a caveat. The second mistake is using the standard error equation for the corrected correlation. Because the correlation has been corrected, the standard error has increased and the estimated confidence interval is biased. Confidence intervals for correlations corrected for range restriction are used routinely in psychometric meta-analyses (see, Hunter & Schmidt, 2004, p. 108).

Solution

The appropriate method is to compute the two endpoints of the confidence interval for the uncorrected correlation using the usual standard error formula and then apply the correction used for the correlation to each of the endpoints. For example, if a Case 2 correction was used, correct the lower and upper endpoints of the interval using the Case 2 formula.
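The endpoint approach can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the interval for the uncorrected correlation is built with the Fisher z transformation (the text does not say which standard error formula was used), and the ratio U = 1.65 of unrestricted to restricted standard deviations is a hypothetical value chosen because it takes r = −.33 to roughly −.50, as in Scenario 19. The function names are illustrative.

```python
import math

def case2_correct(r, U):
    """Case 2 (direct restriction) correction; U = SD_unrestricted / sd_restricted."""
    return r * U / math.sqrt(1.0 - r * r + r * r * U * U)

def corrected_ci(r, n, U, z_crit=1.96):
    """Interval for the restricted r (Fisher z), then each endpoint corrected."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)          # standard error on the Fisher z scale
    lo = math.tanh(z - z_crit * se)
    hi = math.tanh(z + z_crit * se)
    return (lo, hi), (case2_correct(lo, U), case2_correct(hi, U))

# Scenario 19 values; U = 1.65 is an assumed ratio, not reported in the text.
raw_ci, adj_ci = corrected_ci(r=-0.33, n=36, U=1.65)
point = case2_correct(-0.33, 1.65)
print(round(point, 2), [round(v, 2) for v in raw_ci], [round(v, 2) for v in adj_ci])
```

Because the Case 2 correction is monotonic in r, correcting the endpoints preserves their order, and the corrected interval is wider than the uncorrected one.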

Scenario 20: Making point value predictions

Situation

To study the relationship of agreeableness with job satisfaction, a measure of each was administered to a large group of incumbent financial advisors. One goal of this study was to determine whether there was a statistically significant relationship between agreeableness and job satisfaction and to make an inference about future financial advisor job applicants. After scoring the measures, a research psychologist computed means, standard deviations, and the correlation (r = .17) between agreeableness and job satisfaction. Using the formula for the standard error of a correlation, the psychologist tested the correlation of .17 and found it non-significant. A manual for the agreeableness measure gave a standard deviation for a nationally representative sample of adults that was greater than the value computed in the study sample. Applying the Case 2 correction for range restriction yielded a corrected correlation with agreeableness of r = .44. The researcher reported that the corrected correlation in a future group of applicants for the job of financial advisor will be r = .44.

Problem

There are three problems. First, the measures of agreeableness and job satisfaction were collected from job incumbents and were not used to select the sample. Thus, the Case 2 procedure is inappropriate; Case 3 should have been used. Second, research has shown that the standard errors of range-restricted correlations are greater than the result of the usual formula (Millsap, 1989). The probability that the corrected-for-range-restriction correlation in a new group of applicants will be exactly .44 is extremely small. Third, while applying correction for range restriction is admirable, using a nationally representative sample to provide the unrestricted standard deviation is inappropriate in this situation, as the stated goal was to make inferences about future financial advisor job applicants.

Solution

Case 3 is the appropriate correction. Rather than only reporting a point-value estimate, the research psychologist should have reported the point estimate and 95% confidence interval. The appropriate unrestricted sample is a representative sample of financial advisor applicants. As discussed in Scenario 8, the unrestricted group should always represent the group about which inferences are to be made, in this circumstance, the financial advisor job applicants.

Conclusions and recommendations

Identification of the appropriate range restriction correction method requires the answer to three critical questions. Was the selection direct or indirect or both? Was more than one variable used for selection? Are data available for selection variables both administered and not administered by the researcher? After these questions are answered, proper correction procedures can be identified if an appropriate correction exists (Sackett & Yang, 2000). In situations where answers to the three critical questions are not available, appropriate methods may not exist.

Considering the scenarios, the question arises whether corrections should be applied universally when studying relationships between psychological variables in restricted samples such as employees or students. In general, the answer is yes for two reasons. First, more accurate estimates will result, which may lead to operationalizing variables that otherwise would be abandoned. Conversely, correction may show that a variable should be dropped from consideration to conserve resources. Second, applying range restriction correction to almost any correlational study can clarify understanding of the relations among constructs or among variables, informing researchers and practitioners. In keeping with Linn et al. (1981) and the professional standards for correlations (American Educational Research Association, American Psychological Association & National Council on Measurement in Education, 2014; Society for Industrial and Organizational Psychology, 2018), we report both the uncorrected and corrected correlations.

Limitations

A limitation of this paper is that the scenarios only depict situations for which the three critical questions can be answered. The scenarios do not provide suggested remedies when answers are not available. Sackett and Yang (2000) have acknowledged unresolved issues in their taxonomy such as not having all estimated variance for variables for several of their taxonomic models. Possible solutions might be found in other disciplines. Treating range restriction as a missing data problem might provide estimated variances. More widespread study and adoption of econometric methods such as Tobit analysis (Tobin, 1958) or censored regression analysis might be useful as they were explicitly developed for censored samples. Another limitation is the failure of the published research on range restriction corrections to examine the effectiveness of correction procedures when the underlying assumptions of correlation and range correction (linearity and homoscedasticity) have not been met.

Appendix.

Range Restriction Equations

Please note that all correction formulas assume linearity and homoscedasticity. The equations have been presented in consistent notation of a, b, z, and y, where a, b, and z are independent variables and y is the dependent variable.

Case 1 Correction (direct range restriction)

The Case 1 correction applies when the correlation to be corrected is between two variables, a and b, range restriction occurs directly on one variable whose unrestricted variance is not known, and the unrestricted variance is known for the other variable. In this case, there is direct range restriction on one variable and indirect range restriction on the other variable. The Case 1 correction formula is

$$R_{ab} = \sqrt{1 - \frac{sd_b^2}{SD_b^2}\left(1 - r_{ab}^2\right)}$$

where Rab is the corrected correlation between variable a and variable b in the unrestricted population, rab is the correlation between a and b in the restricted sample, sdb is the standard deviation (SD) of b in the restricted sample, and SDb is the SD of b in the unrestricted population.
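As a hedged illustration, the Case 1 formula can be written as a small Python function. The function name and the example values are hypothetical. Because the formula returns a square root, this sketch carries the sign of the restricted correlation over to the result, which is an assumption of the sketch rather than something stated in the text.

```python
import math

def case1_correct(r_ab, sd_b, SD_b):
    """Case 1 correction; sd_b and SD_b are the restricted and unrestricted SDs of b."""
    variance_ratio = (sd_b / SD_b) ** 2
    magnitude = math.sqrt(1.0 - variance_ratio * (1.0 - r_ab ** 2))
    # Assumption: keep the sign of the restricted correlation.
    return math.copysign(magnitude, r_ab)

# Hypothetical values: restricted r = .30, sd_b = 8, SD_b = 10.
print(round(case1_correct(0.30, 8.0, 10.0), 3))
```

When the restricted and unrestricted standard deviations are equal, the function returns the restricted correlation unchanged.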

Case 2 Correction (direct range restriction)

The Case 2 correction (direct selection) is well-known and frequently used. In Case 2, the variances for a are known in both the restricted sample and unrestricted population. However, the correlation between a and y is known only for the restricted sample that has been directly selected on a. The Case 2 (direct range restriction) correction equation is

$$R_{ay} = \frac{r_{ay}\,\dfrac{SD_a}{sd_a}}{\sqrt{\left(1 - r_{ay}^2\right) + r_{ay}^2\,\dfrac{SD_a^2}{sd_a^2}}}$$

where Ray is the corrected correlation between variable a and variable y in the unrestricted population, ray is the observed correlation between a and y in the restricted sample, sda is the standard deviation (SD) of a in the restricted sample, and SDa is the SD of a in the unrestricted population.

Case 3 Correction (indirect range restriction)

Case 3 is sometimes called correction for incidental selection. It takes two forms for two differing situations. If only job incumbents are administered the operational and experimental tests, equation 1 is appropriate. If applicants are administered both the operational and experimental tests, then equation 2 is appropriate. Suppose a test, a, were given to a group of applicants, some of whom were selected based on their test (a) scores, and the variance for test a is known for both the restricted sample and unrestricted population. The correlations involving an experimental test, b, and the job performance criterion, y, are available only for the selected (restricted, i.e., on the job) sample. When all three correlations can be calculated, the Case 3 (indirect range restriction) correction equation is

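$$R_{by} = \frac{r_{by} + r_{ab}\,r_{ay}\left(\dfrac{S_a^2}{s_a^2} - 1\right)}{\sqrt{\left[1 + r_{ab}^2\left(\dfrac{S_a^2}{s_a^2} - 1\right)\right]\left[1 + r_{ay}^2\left(\dfrac{S_a^2}{s_a^2} - 1\right)\right]}} \quad (1)$$

where Rby is the corrected correlation between the experimental test b and the criterion y, rby, rab, and ray are the correlations in the restricted sample, and Sa/sa is the ratio of the unrestricted to the restricted standard deviation of test a. The correlation between the predictors is rab.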
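A minimal Python sketch of the standard Thorndike Case 3 correction follows, using the setup described above: selection occurred on test a, b is the experimental test, and y is the criterion. The function and argument names are illustrative and the numeric values are hypothetical.

```python
import math

def case3_correct(r_by, r_ab, r_ay, S_a, s_a):
    """Case 3 correction of r_by for selection on a third variable a.

    r_by, r_ab, r_ay: correlations in the restricted sample.
    S_a, s_a: unrestricted and restricted SDs of the selection test a.
    """
    k = (S_a / s_a) ** 2 - 1.0
    numerator = r_by + r_ab * r_ay * k
    denominator = math.sqrt((1.0 + r_ab ** 2 * k) * (1.0 + r_ay ** 2 * k))
    return numerator / denominator

# Hypothetical restricted-sample values.
print(round(case3_correct(r_by=0.20, r_ab=0.40, r_ay=0.25, S_a=10.0, s_a=6.0), 3))
```

As a check on the logic, when b is the selection test itself (r_ab = 1 and r_by = r_ay), the expression reduces to the Case 2 correction.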

Thorndike also presented an equation to be used if applicants took both the operational test and the experimental test and therefore the unrestricted predictor intercorrelation (i.e., the correlation between a and z) is available. This second equation is listed below:

$$R_{ay} = \frac{r_{ay}\sqrt{1 + R_{az}^2\left(\dfrac{s_z^2}{S_z^2} - 1\right)} + R_{az}\,r_{yz}\left(\dfrac{S_z}{s_z} - \dfrac{s_z}{S_z}\right)}{\sqrt{1 + r_{yz}^2\left(\dfrac{S_z^2}{s_z^2} - 1\right)}} \quad (2)$$

where Ray is the corrected correlation, Raz is the unrestricted correlation between predictors a and z, ray is the restricted correlation between a and the criterion y, and ryz is the restricted correlation between y, the criterion, and the operational predictor z. Sz is the unrestricted standard deviation of z and sz is the restricted standard deviation of z. Sa and sa are the unrestricted and restricted standard deviations of the experimental test a; under the linearity and homoscedasticity assumptions, the square-root term that multiplies ray equals sa/Sa. Subjects were selected on z, which correlates with a.
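The sketch below implements a form of equation 2 and checks it against a constructed population. It assumes the second numerator term is Raz ryz (Sz/sz − sz/Sz), which is the form that follows from the same linearity and homoscedasticity assumptions as equation 1. The function name, the population model, and all numeric values are hypothetical.

```python
import math

def case3_eq2(r_ay, r_yz, R_az, S_z, s_z):
    """Equation 2 sketch: selection on z; a is the experimental test, y the criterion."""
    u = s_z / S_z                                    # restricted / unrestricted SD of z
    shrink = math.sqrt(1.0 + R_az ** 2 * (u ** 2 - 1.0))
    numerator = r_ay * shrink + R_az * r_yz * (1.0 / u - u)
    denominator = math.sqrt(1.0 + r_yz ** 2 * (1.0 / u ** 2 - 1.0))
    return numerator / denominator

# Hypothetical population: z, e, f independent with unit variance,
# a = rho*z + sqrt(1 - rho^2)*e and y = b*z + c*e + d*f, with d chosen so var(y) = 1.
rho, b, c = 0.5, 0.3, 0.4
expected = rho * b + math.sqrt(1.0 - rho ** 2) * c   # unrestricted correlation of a and y

# Restricted-sample values implied when selection shrinks the SD of z to 0.6.
u = 0.6
s_a = math.sqrt(1.0 + rho ** 2 * (u ** 2 - 1.0))
s_y = math.sqrt(1.0 + b ** 2 * (u ** 2 - 1.0))
r_yz = b * u / s_y
r_ay = (rho * b * u ** 2 + math.sqrt(1.0 - rho ** 2) * c) / (s_a * s_y)

recovered = case3_eq2(r_ay, r_yz, R_az=rho, S_z=1.0, s_z=u)
print(round(r_ay, 3), round(recovered, 3), round(expected, 3))
```

In this constructed case the corrected value matches the unrestricted correlation, and setting Raz = 1 (a and z identical) reduces the expression to the Case 2 correction.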

Case IV Correction (indirect range restriction)

The Case IV correction equation is described below. As noted by Schmidt et al. (2006), an important difference between direct and indirect range restriction is that in the direct case, because observed scores are used to select, restriction occurs on the observed scores. Observed scores are not used in selection in the indirect case. In the indirect case, range restriction occurs on true scores rather than on observed scores (Hunter & Schmidt, 2004, Ch. 5; Hunter et al., 2006). In the Case IV correction for indirect range restriction, the restriction on the true scores tx for the observed score x is estimated from the amount of range restriction on the observed score x. It is assumed that y is fully mediated by the true score (T) of variable X. As in Case 2, the ratio of the restricted to the unrestricted observed SDs of x is defined as sx/Sx. The range restriction on the true scores t is estimated from observed scores x and is defined as st/St. The equation (see Note 1) that gives st/St (Hunter & Schmidt, 2004, Equations 3.16 and 5.31; Hunter et al., 2006, Equation 22) is:

$$\frac{s_t}{S_t} = \sqrt{\frac{\dfrac{s_X^2}{S_X^2} - \left(1 - r_{XX_a}\right)}{r_{XX_a}}}$$

where rXXa is the reliability of the predictor in the unrestricted group (i.e., applicant group). The Case IV range correction equation is

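$$R_{X_tY_t} = \frac{\dfrac{1}{s_t/S_t}\,r_{X_tY_t}}{\sqrt{\left(\dfrac{1}{s_t^2/S_t^2} - 1\right) r_{X_tY_t}^2 + 1}}$$

where rXtYt is the true score correlation in the restricted sample and RXtYt is the corrected true score correlation in the unrestricted population.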
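The two Case IV equations can be sketched together in Python. The function names and example values are hypothetical, and the sketch assumes the restricted true score correlation has already been obtained.

```python
import math

def true_score_u(u_x, rxx_a):
    """Range restriction ratio on true scores (s_t/S_t).

    u_x: s_X/S_X, the restricted-to-unrestricted ratio of observed SDs.
    rxx_a: reliability of the predictor in the unrestricted (applicant) group.
    """
    return math.sqrt((u_x ** 2 - (1.0 - rxx_a)) / rxx_a)

def case4_correct(r_true_restricted, u_t):
    """Apply the Case IV correction to a restricted true score correlation."""
    U_t = 1.0 / u_t
    return U_t * r_true_restricted / math.sqrt((U_t ** 2 - 1.0) * r_true_restricted ** 2 + 1.0)

# Hypothetical values: observed u_x = .70, applicant reliability = .80.
u_t = true_score_u(0.70, 0.80)
print(round(u_t, 3), round(case4_correct(0.25, u_t), 3))
```

The ratio on true scores is smaller than the ratio on observed scores whenever reliability is below 1, so the correction is larger than one based on the observed ratio. If u_x squared is smaller than 1 − rxx_a, the square root is undefined and `math.sqrt` raises a `ValueError`, which signals inconsistent inputs.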

In their discussion of various scenarios that can produce range restriction, Sackett and Yang (2000) introduced the little-known Case V correction for range restriction (Bryant & Gokhale, 1972), which was derived from the Case 3 equation. Le et al. (2016) conducted studies to improve the accuracy and reliability of the Case V formula for use in meta-analyses.

Case V Correction (indirect range restriction)

The Case V equation enables correction for indirect range restriction without requiring the full mediation assumption of Case IV.

Bryant and Gokhale’s (1972) correction equation is:

$$R_{aY} = r_{aY}\,\frac{sd_a}{SD_a}\,\frac{sd_Y}{SD_Y} + \sqrt{\left(1 - \frac{sd_a^2}{SD_a^2}\right)\left(1 - \frac{sd_Y^2}{SD_Y^2}\right)}$$

where RaY is the corrected correlation between variables a and Y in the unrestricted population, raY is the observed correlation between a and Y in the restricted sample, the range restriction ratio of a is sda/SDa, where sda is the standard deviation (SD) of a in the restricted sample and SDa is the SD of a in the unrestricted population, and the range restriction ratio of Y is sdY/SDY, where sdY is the SD of Y in the restricted sample and SDY is the SD of Y in the unrestricted population. The Bryant and Gokhale (1972) correction equation allows for correction of indirect range restriction on a and Y without knowledge about the third variable on which selection occurred. The Bryant and Gokhale equation can be used to correct individual study correlations. Case V is suggested for meta-analysis.

The Case V method, building on Bryant and Gokhale (1972), first requires the estimation of true score means and true correlations. Le et al. (2016) modified the Bryant and Gokhale formula to also correct for measurement error and described a revised meta-analytic approach that incorporates the Case V correction. Le et al. provided the equations for the multi-step procedure for estimating true score elements and should be consulted for descriptions of the detailed steps.

Even though Case V has been applied in meta-analysis, it is also appropriate for correction of a single correlation to estimate the true relationship of a and Y. As noted by Le et al., the Case V method does not require any further assumptions beyond those of linearity and homoscedasticity, which underlie all existing range correction formulas. The steps below describe the process to perform correction meta-analysis based on Case V.

  1. Correct for measurement error in measure Y (see Note 2). For each study, the observed correlation, rXYi, provides the best estimate for the restricted correlation, ρXYi.
    $$\rho_{XY_i} = r_{XY_i}$$
    Then, using the familiar equation, we can correct for measurement error in Y,
    $$\rho_{XP_i} = \frac{\rho_{XY_i}}{\sqrt{\rho_{YY_i}}}$$

    where ρXPi is the correlation between variable X and P, the construct (true score) underlying Y, ρXYi is the observed correlation between X and Y, and ρYYi is the reliability of Y.

  2. Correct for measurement error in measure X. Next, correct for measurement error in X using the reliability of measure X on the underlying construct T, in the restricted population, ρXXi.
    $$\rho_{TP_i} = \frac{\rho_{XP_i}}{\sqrt{\rho_{XX_i}}}$$

    where ρTPi is the correlation between construct T (the reliability corrected construct underlying variable X) and construct P underlying variable Y in the restricted population, ρXPi is the correlation between X and P from step 1, and ρXXi is the reliability of X. Next, uT and uP are estimated.

  3. Correct for indirect range restriction. Le et al. (2016) adapted the Bryant and Gokhale (1972) equation shown above, correcting the variables for measurement error. Le et al. replaced the range restriction ratio on X with the range restriction ratio on T and replaced the range restriction ratio on Y with the range restriction ratio on P.

    Estimate uT from uX = sdX/SDX and ρXXi:
    $$u_T = \sqrt{\frac{\rho_{XX_i}\,u_X^2}{1 + \left(\rho_{XX_i}\,u_X^2 - u_X^2\right)}}$$
    Likewise, estimate uP from uY = sdY/SDY and ρYYi:
    $$u_P = \sqrt{\frac{\rho_{YY_i}\,u_Y^2}{1 + \left(\rho_{YY_i}\,u_Y^2 - u_Y^2\right)}}$$
    Finally, estimate the correlation between T and P in the unrestricted population using the following formula:
    $$\rho_{TP_a} = \rho_{TP_i}\,u_T\,u_P + \sqrt{\left(1 - u_T^2\right)\left(1 - u_P^2\right)}$$
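The three steps above can be sketched for a single correlation as follows. This is an illustrative Python sketch, not the psychmeta implementation; function and argument names are hypothetical, and the reliabilities are those of the restricted sample as in the steps.

```python
import math

def u_true(u_obs, rel_restricted):
    """Range restriction ratio on the construct, from the observed ratio and restricted reliability."""
    return math.sqrt(rel_restricted * u_obs ** 2
                     / (1.0 + rel_restricted * u_obs ** 2 - u_obs ** 2))

def case5_correct(r_xy, rxx_i, ryy_i, u_x, u_y):
    """Case V correction of one observed correlation.

    r_xy: observed correlation in the restricted sample.
    rxx_i, ryy_i: restricted-sample reliabilities of X and Y.
    u_x, u_y: restricted-to-unrestricted SD ratios of X and Y.
    """
    rho_xp = r_xy / math.sqrt(ryy_i)        # step 1: measurement error in Y
    rho_tp = rho_xp / math.sqrt(rxx_i)      # step 2: measurement error in X
    u_t = u_true(u_x, rxx_i)                # step 3: ratios on the constructs
    u_p = u_true(u_y, ryy_i)
    return rho_tp * u_t * u_p + math.sqrt((1.0 - u_t ** 2) * (1.0 - u_p ** 2))

# Hypothetical values for a single study.
print(round(case5_correct(r_xy=0.20, rxx_i=0.80, ryy_i=0.70, u_x=0.80, u_y=0.90), 3))
```

With both reliabilities set to 1.0, the function reduces to the Bryant and Gokhale correction of the observed correlation, and with no range restriction (both ratios equal to 1) it returns the correlation corrected for measurement error only.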

Multivariate Correction (see Note 3)

Lawley (1943) developed the multivariate or general case that allows for correction for range restriction resulting from selection on several variables and provided a theorem. The theorem is explained in terms of variance-covariance matrices that can be converted to correlation matrices. The discussion that follows uses the notation of Birnbaum, Paulson, and Andrews (1950).

Suppose one has a current test battery of p variables for which population information is available. A sample is selected based on test scores on the current battery and suppose that n-p additional variables (perhaps a combination of new tests and criteria variables) are collected on the selected (or restricted) sample.

The variance-covariance matrix from the restricted sample is

$$v = \begin{bmatrix} v_{p,p} & v_{p,n-p} \\ v_{n-p,p} & v_{n-p,n-p} \end{bmatrix}$$

All of v is known. The comparable variance-covariance matrix from an unrestricted (mostly unknown values) population is

$$V = \begin{bmatrix} V_{p,p} & V_{p,n-p} \\ V_{n-p,p} & V_{n-p,n-p} \end{bmatrix}$$

where there is only knowledge of Vp,p. We want to estimate the other three parameters of V. Vn−p,n−p gives estimates of unrestricted variances and covariances for variables of the new tests and criteria. Vp,n−p (or its transpose Vn−p,p) gives estimates of unrestricted covariances for variables of the current test battery with the new tests and criteria.

Lawley’s (1943) theorem follows, where xi and xj are any pair of variables.

The multivariate correction is expressed in matrix algebra notation.

Assumption 1 – Linearity

Assumption 2 – Equality of Error Variance (Homoscedasticity)

Then the following equations use the known variances and covariances plus the variances and covariances from the restricted sample to provide corrected variances and covariances for all variables. These are then converted to correlations. The matrix equations follow:

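$$V_{p,n-p} = V_{p,p}\,v_{p,p}^{-1}\,v_{p,n-p} \quad \text{and} \quad V_{n-p,n-p} = v_{n-p,n-p} + v_{n-p,p}\,v_{p,p}^{-1}\left(V_{p,n-p} - v_{p,n-p}\right)$$

Here lowercase v denotes restricted-sample values and uppercase V denotes unrestricted population values.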
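A minimal NumPy sketch of these matrix equations is given below. It is an illustration, not the iopsych implementation: the function names, the variable ordering (selection variables first), and the example matrix are assumptions. The example builds a hypothetical population matrix, derives the restricted matrix it implies, and then checks that the correction recovers the population values.

```python
import numpy as np

def lawley_correct(v, V_pp):
    """Multivariate correction of a restricted covariance matrix.

    v: (n x n) restricted covariance matrix; the first p rows/columns
       are the selection variables.
    V_pp: (p x p) unrestricted covariance matrix of the selection variables.
    """
    v = np.asarray(v, dtype=float)
    V_pp = np.asarray(V_pp, dtype=float)
    p = V_pp.shape[0]
    v_pp, v_pn = v[:p, :p], v[:p, p:]
    v_np, v_nn = v[p:, :p], v[p:, p:]
    weights = np.linalg.solve(v_pp, v_pn)            # v_pp^{-1} v_pn
    V_pn = V_pp @ weights
    V_nn = v_nn + v_np @ np.linalg.solve(v_pp, V_pn - v_pn)
    return np.block([[V_pp, V_pn], [V_pn.T, V_nn]])

def cov_to_corr(V):
    """Convert a covariance matrix to a correlation matrix."""
    d = np.sqrt(np.diag(V))
    return V / np.outer(d, d)

# Hypothetical population (z = selection test, then a new test and a criterion).
V = np.array([[1.0, 0.5, 0.3],
              [0.5, 1.0, 0.4],
              [0.3, 0.4, 1.0]])
# The equations are symmetric, so the same function yields the restricted
# matrix implied when selection reduces the variance of z to 0.36.
v = lawley_correct(V, [[0.36]])
recovered = lawley_correct(v, V[:1, :1])
print(np.round(cov_to_corr(v), 3))
print(np.round(cov_to_corr(recovered), 3))
```

Using `np.linalg.solve` rather than forming the inverse explicitly is a numerical choice of the sketch; the algebra is the same as in the equations above.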

Software

correct_r is found in the package “psychmeta.” correct_r is very flexible and can compute Pearson Cases 1, 2, and 3. It can also compute Case V. In Case V if the reliabilities are set to 1.0 it computes the Bryant and Gokhale correction.

https://CRAN.R-project.org/package=psychmeta

lMvrrc in the package “Iopsych” computes the Lawley multivariate correction for range restriction. The input requires a matrix of the unrestricted correlations and a matrix of the restricted correlations. lMvrrc and Iopsych are copyrighted by Allen Goebl, Jeff Jones, and Adam Beatty, 2016.

https://CRAN.R-project.org/package=iopsych

Correction Statement

This article has been corrected with minor changes. These changes do not impact the academic content of the article.

Notes

1. Presented in the terminology in Hunter and Schmidt (2004).

2. Presented in the source notation.

3. Presented in notation from Lawley (1943).

Data Availability Statement

No data were collected or generated for this article.

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

  1. Aitken, A. C. (1934). Note on selection from a multivariate normal population. Proceedings of the Edinburgh Mathematical Society (Series 2), 4(2), 106–110. https://doi.org/10.1017/S0013091500008063
  2. Alexander, R. A., Bennett, G. V., Alliger, G. M., & Carson, K. P. (1986). Toward a general model of non-random sampling and the impact of population correlations: Generalizations of Berkson’s Fallacy and restriction of range. British Journal of Mathematical and Statistical Psychology, 39(1), 90–105. https://doi.org/10.1111/j.2044-8317.1986.tb00849.x
  3. American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (2014). Standards for educational and psychological testing.
  4. Birnbaum, Z. W., Paulson, E., & Andrews, F. C. (1950). On the effect of selection performed on some coordinates of a multi-dimensional population. Psychometrika, 15(2), 191–204. https://doi.org/10.1007/BF02289200
  5. Bryant, N. D., & Gokhale, S. (1972). Correcting correlations for restrictions in range due to selection on an unmeasured variable. Educational and Psychological Measurement, 32(2), 305–310. https://doi.org/10.1177/001316447203200207
  6. Campbell, J. P. (1976). Psychometric theory. In M. Dunnett (Ed.), Handbook of industrial and organizational psychology (pp. 185–222). Chicago: Rand McNally.
  7. Dahlke, J. A., & Wiernik, B. M. (2018). psychmeta: An R package for psychometric meta-analysis. Applied Psychological Measurement, 43(5), 415–416. https://doi.org/10.1177/0146621618795933
  8. Dahlke, J. A., & Wiernik, B. M. (2019a). Not restricted to selection research: Accounting for indirect range restriction in organizational research. Organizational Research Methods, 22(1), 95–115. Published online July 24, 2019. https://doi.org/10.1177/1094428119859398
  9. Dahlke, J. A., & Wiernik, B. M. (2019b). psychmeta: Psychometric meta-analysis toolkit (Version 2.3.2) [R Package]. Retrieved from psychmeta.com https://CRAN.R-project.org/package=psychmeta (Original work published 2017)
  10. Damos, D. (1996). Pilot selection batteries: Shortcomings and perspectives. International Journal of Aviation Psychology, 6(2), 199–209. https://doi.org/10.1207/s15327108ijap0602_6
  11. Gross, A. L. (1990). A maximum likelihood approach to test validation with missing and censored dependent variables. Psychometrika, 55(3), 533–549. https://doi.org/10.1007/BF02294766
  12. Gross, A. L., & McGanney, M. L. (1987). The restriction of range problem and nonignorable selection processes. Journal of Applied Psychology, 72(4), 604–610. https://doi.org/10.1037/0021-9010.72.4.604
  13. Held, J. D., & Foley, P. P. (1994). Explanations for accuracy of the general multivariate formulas in correcting for range restriction. Applied Psychological Measurement, 18(4), 355–367. https://doi.org/10.1177/014662169401800406
  14. Hunter, J. E., & Schmidt, F. L. (2004). Methods of meta-analysis. Sage.
  15. Hunter, J. E., Schmidt, F. L., & Le, H. (2006). Implications of direct and indirect range restriction for meta-analysis methods and findings. Journal of Applied Psychology, 91(3), 594–612. https://doi.org/10.1037/0021-9010.91.3.594
  16. Jackson, D. E., & Ree, M. J. (1992). On the effect of range restriction on correlation coefficient estimation AL-CR-1992–0001. Armstrong Laboratory, Human.
  17. Johnson, W., Deary, I. J., & Bouchard, T. J., Jr. (2018). Have standard formulas correcting correlations for range restriction been adequately tested? Minor sampling distribution quirks distort them. Educational and Psychological Measurement, 78(6), 1021–1055. https://doi.org/10.1177/0013164417736092
  18. Lawley, D. N. (1943). A note on Karl Pearson’s selection formulae. Proceedings of the Royal Society of Edinburgh, Section A, 62(1), 28–30. https://doi.org/10.1017/S0080454100006385
  19. Le, H., Oh, I., Schmidt, F. L., & Wooldridge, C. G. (2016). Correction for range restriction in meta-analysis revisited: Improvements and implications for organizational research. Personnel Psychology, 69(4), 975–1008. https://doi.org/10.1111/peps.12122
  20. Levin, J. (1972). The occurrence of an increase in correlation by restriction of range. Psychometrika, 37(1), 93–97. https://doi.org/10.1007/BF02291414
  21. Linn, R. T. (1968). Range restriction problems in the use of self-selected groups for test validation. Psychological Bulletin, 69(1), 69–73. https://doi.org/10.1037/h0025263
  22. Linn, R. L., Harnisch, D. L., & Dunbar, S. B. (1981). Corrections for range restriction: An empirical investigation of conditions resulting in conservative corrections. Journal of Applied Psychology, 66(6), 655–663. https://doi.org/10.1037/0021-9010.66.6.655
  23. Mendoza, J. L., & Mumford, M. (1987). Correction for attenuation and range restriction on the predictor. Journal of Educational Statistics, 12(3), 282–293. https://doi.org/10.3102/10769986012003282
  24. Millsap, R. E. (1989). Sampling variance in the correlation coefficient under range restriction: A Monte Carlo study. Journal of Applied Psychology, 74(3), 456–461. https://doi.org/10.1037/0021-9010.74.3.456
  25. Naylor, J. C., & Shine, L. C. (1965). A table for determining the increase in mean criterion score obtained by using a selection device. Journal of Industrial Psychology, 3(2), 33–42. https://psycnet.apa.org/record/1968-06201-001
  26. Olson, C. A., & Becker, B. E. (1983). A proposed technique for the treatment of restriction of range of selected validation. Psychological Bulletin, 93(1), 137–148. https://doi.org/10.1037/0033-2909.93.1.137
  27. Pearson, K. (1903). Mathematical contributions to the theory of evolution: XI. On the influence of natural selection on the variability and correlation of organs. Royal Society Philosophical Transactions, 200 (Series A), 1–66. https://www.jstor.org/stable/90869
  28. Preacher, K. J., Rucker, D. D., MacCallum, R. C., & Nicewander, W. A. (2005). Use of the extreme groups approach: A critical reexamination and new recommendations. Psychological Methods, 10(2), 178–192. https://doi.org/10.1037/1082-989X.10.2.178
  29. Ree, M. J., Carretta, T. R., Earles, J. A., & Albert, W. (1994). Sign changes when correcting for range restriction: A note on Pearson’s and Lawley’s selection formulas. Journal of Applied Psychology, 79(2), 298–301. https://doi.org/10.1037/0021-9010.79.2.298
  30. Sackett, P. R., & Yang, H. (2000). Correction for range restriction: An expanded typology. Journal of Applied Psychology, 85(1), 112–118. https://doi.org/10.1037/0021-9010.85.1.112
  31. Schmidt, F. L., Oh, I. S., & Le, H. (2006). Increasing the accuracy of corrections for range restriction: Implications for selection procedure validities and other research results. Personnel Psychology, 59(2), 281–305. https://doi.org/10.1111/j.1744-6570.2006.00065.x
  32. Schmidt, F. L., Oh, I. S., & Shaffer, J. A. (2016). The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 100 years of research findings. Psychological Bulletin, 124(2), 262–274. https://doi.org/10.1037/0033-2909.124.2.262
  33. Society for Industrial and Organizational Psychology. (2018). Principles for the validation and use of personnel selection procedures, 5th edition. Society for Industrial and Organizational Psychology.
  34. Stauffer, J. M., & Mendoza, J. L. (2001). The proper sequence for correcting correlation coefficients for range restriction and unreliability. Psychometrika, 66(1), 63–68. https://doi.org/10.1007/BF02295732
  35. Taylor, H. C., & Russell, J. T. (1939). The relationship of validity coefficients to the practical effectiveness of tests in selection: Discussion and tables. Journal of Applied Psychology, 23(5), 565–578. https://doi.org/10.1037/h0057079
  36. Thorndike, R. L. (1947). Research problems and techniques. Report no. 3.
  37. Thorndike, R. L. (1949). Personnel selection: Test and measurement techniques. Wiley. https://psycnet.apa.org/record/1949-05074-000
  38. Tobin, J. (1958). Estimation of relationships for limited dependent variables. Econometrica, 26(1), 24–36. https://doi.org/10.2307/1907382
  39. Wherry, R. J. (1984). Contributions to correlational analysis. Academic Press.
  40. Zimmerman, D. W., & Williams, R. H. (2000). Restriction of range and correlation in outlier-prone distributions. Applied Psychological Measurement, 24(3), 267–280. https://doi.org/10.1177/01466210022031741


Articles from Military Psychology are provided here courtesy of Division of Military Psychology of the American Psychological Association and Taylor & Francis
