Author manuscript; available in PMC: 2009 Aug 1.
Published in final edited form as: Am J Public Health. 2008 Jun 12;98(8):1354–1359. doi: 10.2105/AJPH.2007.127563

Screening Experiments and Fractional Factorial Designs in Behavioral Intervention Research

Vijay Nair 1,3, Victor Strecher 2,3, Angela Fagerlin 3,4,5,6, Peter Ubel 3,4,5,6, Kenneth Resnicow 2,3, Susan Murphy 1,3,7, Roderick Little 3,8, Bibhas Chakraborty 1, Aijun Zhang 1
PMCID: PMC2446451  NIHMSID: NIHMS99652  PMID: 18556602

Abstract

Health-behavior intervention studies have focused primarily on comparing a new program with a control using randomized controlled trials. However, we are seeing a dramatic increase in the number of possible components (factors) due to developments in science and technology (internet, web-based surveys, and so on). These changes dictate the need for alternative methods that can screen a large set of potentially important components in order to identify the important ones quickly and economically. We have developed and implemented a multiphase experimentation strategy for accomplishing this goal. This article describes the screening phase of this strategy and the use of fractional factorial designs for studying several factors (also called components) economically, and uses two ongoing projects on behavioral intervention to illustrate their usefulness. The fractional factorial designs are supplemented with follow-up experiments in the refining phase, so any critical assumptions about interactions can be verified and the dosage levels can be optimized.

Keywords: Cancer prevention, Chronic Disease, Decision aid, Multiphase experimentation, Smoking cessation, Surveys, Web-based intervention

1. Introduction

The landscape in health-behavior intervention studies is changing rapidly. Recent developments in science and technology have resulted in a dramatic increase in the types and formulation of feasible interventions, in the ways in which interventions are delivered, messages are presented, data are collected, and so on. These, in turn, are leading to an explosion in the number of possible components (factors) that can be studied.

Traditional behavior intervention studies are typically large-scale randomized controlled trials (RCTs) where the goal is to confirm the superiority of a new program over the control. For example, a trial might test whether prostate cancer patients who receive a decision aid are better informed about their treatment options and are more involved in their health care decisions. Often in such trials, the new program consists of a combination of many interventions. The decision aid, for instance, contains many different components, each of which might influence the main outcome variables. These confirmation trials do not provide direct information on which components are active and whether they have been set at optimal levels. Such information is usually teased out through post-hoc analyses based on non-randomized data. Where randomized trials are used to obtain the information, they usually involve adding or subtracting components one at a time or at most in small groups, e.g., 2 × 2 factorial designs. These studies can assess the impact of only a limited number of treatment components. By the time these findings are disseminated, the population of interest may have changed, the technology may be different, or both (for example, there may be new communications media, or the population of interest may have become more sophisticated), and the conclusions may no longer be valid. All of these suggest the need for alternative methodologies in health-behavior research.

Over the past five years, our NCI-funded Center for Health Communications Research has developed and implemented a multiphase experimentation strategy for systematically studying new interventions and confirming their superiority over controls. We have called this a multiphase optimization strategy or MOST 1. It is adapted from a similar framework that has been successfully used in engineering applications for many years 2. It consists of three phases involving separate randomized trials:

  I. Screening: The goal in this phase is to “screen” a larger set of potentially important treatment components quickly and efficiently and identify the important ones. This is done through a screening experiment which studies the effects of all the components simultaneously. Two-level fractional factorial designs are very useful in accomplishing this goal economically. The Pareto principle underlies the screening phase – only a small subset of the components and their interactions will be important. Thus, many interactions can a priori be excluded, increasing the efficiency of the design.

  II. Refining: This phase is aimed at refining our understanding of the effects of the important components that are identified in Phase I. The use of screening experiments involves prior knowledge or working assumptions that need to be further examined and verified in this phase. This is done using follow up experiments to untangle important effects that may be “aliased”, determine optimal “dosage” levels of factors through “response surface” experiments, and so on. The information from this phase will lead to the formulation of an optimal treatment program.

  III. Confirming: The final phase is a confirmation trial to compare the new program to the gold standard and assess its advantage. While this phase is similar to the current RCTs with two or more arms, the multiphase approach leads to the inclusion of only important components at their optimized levels.

The focus of this paper is on screening experiments and the use of fractional factorial designs (FFDs) in public health intervention research. Our goal is to discuss the role of screening experiments in this context and illustrate the usefulness of FFDs. Factorial and fractional factorial designs have a long history 3,4,5,6. They were originally developed in the context of agricultural applications and have found widespread use in engineering. This article provides an overview of FFDs and uses two projects from our center to demonstrate their usefulness. More information about FFDs is available from standard textbooks 2,7,8.

The successful use of FFDs relies on the principle of effect sparsity. There are two types of sparsity: few of the factors will be active and higher-order interactions will be negligible. We can use prior knowledge (theory, experience or empirical evidence) as working assumptions about interactions. The results from the screening experiments will suggest which of these assumptions are critical, and they can be resolved in the refining phase with suitable follow-up experiments.

2. Guide to Decide: Background

We provide an introduction to this study here so that we can make the discussion of FFDs in Section 3 concrete. The Guide to Decide project deals with the effectiveness of decision aids (DAs) for women who are at high risk of breast cancer. Tamoxifen reduces the risk of a primary diagnosis of breast cancer by 50%, but it has significant side effects 9. A woman’s decision to take tamoxifen requires understanding the benefits (reducing the risk of developing breast cancer) vs the risks (side effects). Women must also know their baseline risk of breast cancer. Our goal in this study was to determine how DAs influence women’s knowledge of complex statistical information, their risk perceptions, and their health behaviors. The benefits of DAs are well established 10,11. However, there has been limited research on understanding why a DA is effective and which of the different components (factors) contribute to better decision making.

The screening phase of the study examined the effectiveness of five 2-level communication factors within a web-based decision aid: A -- statistics presented in prose only or prose + pictograph; B -- risks presented in denominator of 100 or 1000; C -- risks presented using incremental risk language (incremental risk of tamoxifen side-effect) or total risk language; D -- order of presentation of risks and benefits; and E -- provide statistics of competing risks or not. We will return to this example in Section 4.

3. Full Factorial Designs

For simplicity, we restrict attention to the first four factors A, B, C, and D of the Guide to Decide project. A full factorial design corresponding to the four factors is given in Table 1 together with all the interactions. Since there are 2^4 = 16 possible combinations of the four factors each at two levels, there are 16 groups (rows). The – and + signs under the columns A–D indicate the two settings of the four factors. For example, all the subjects assigned to group (row) 1 will get the treatment combination with all four factors A–D at their – level.

Table 1.

A full factorial design for four components (factors) each at two levels (four columns A–D) and their interactions

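Group  A  B  C  D  AB  AC  AD  BC  BD  CD  ABC  ABD  ACD  BCD  ABCD
1      −  −  −  −  +   +   +   +   +   +   −    −    −    −    +
2      −  −  −  +  +   +   −   +   −   −   −    +    +    +    −
3      −  −  +  −  +   −   +   −   +   −   +    −    +    +    −
4      −  −  +  +  +   −   −   −   −   +   +    +    −    −    +
5      −  +  −  −  −   +   +   −   −   +   +    +    −    +    −
6      −  +  −  +  −   +   −   −   +   −   +    −    +    −    +
7      −  +  +  −  −   −   +   +   −   −   −    +    +    −    +
8      −  +  +  +  −   −   −   +   +   +   −    −    −    +    −
9      +  −  −  −  −   −   −   +   +   +   +    +    +    −    −
10     +  −  −  +  −   −   +   +   −   −   +    −    −    +    +
11     +  −  +  −  −   +   −   −   +   −   −    +    −    +    +
12     +  −  +  +  −   +   +   −   −   +   −    −    +    −    −
13     +  +  −  −  +   −   −   −   −   +   −    −    +    +    +
14     +  +  −  +  +   −   +   −   +   −   −    +    −    −    −
15     +  +  +  −  +   +   −   +   −   −   +    −    −    −    −
16     +  +  +  +  +   +   +   +   +   +   +    +    +    +    +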

The subjects will be assigned to the 16 groups (treatment combinations) as follows. Let N be the total number of subjects in the study. It is most efficient, in a statistical sense, to assign an equal number of subjects to each group. Let K = N/16 be the number of subjects per group. The N subjects will then be randomly assigned to the 16 groups with K subjects in each group.
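The assignment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `assign_to_groups`, the fixed seed, and the sample size N = 160 are hypothetical and do not come from the study.

```python
import random

def assign_to_groups(subject_ids, n_groups=16, seed=0):
    """Randomly assign subjects to n_groups groups of equal size K = N / n_groups."""
    ids = list(subject_ids)
    if len(ids) % n_groups != 0:
        raise ValueError("N must be a multiple of the number of groups")
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    rng.shuffle(ids)
    k = len(ids) // n_groups
    # The first K shuffled subjects go to group 1, the next K to group 2, and so on.
    return {g + 1: ids[g * k:(g + 1) * k] for g in range(n_groups)}

# Hypothetical study with N = 160 subjects, so K = 10 per group.
groups = assign_to_groups(range(1, 161))
```

Shuffling once and slicing gives every subject the same chance of landing in any group while keeping the group sizes exactly equal.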

Note that this design leads to a single randomized trial rather than 16 different trials corresponding to the 16 groups. In particular, the main effect of a factor is obtained by combining the data from all 16 groups. To see this, let Y1, Y2, …, Y16 be the average response in each group (row), i.e., Y1 is the average of the responses from the K subjects in group 1 and so on. Then, the main effect of A is given by

[−(Y1+Y2+Y3+Y4+Y5+Y6+Y7+Y8) + (Y9+Y10+Y11+Y12+Y13+Y14+Y15+Y16)]/16,

i.e., multiplying the Y’s by the – and + signs in the A column, adding them and dividing by 16. Note that the main effect estimate is based on the data in all the groups, so the factorial design combines information across all 16 groups (rows).

The columns AB, AC, etc. in Table 1 correspond to two, three, and four-way interaction effects. In two-level designs, these interaction columns can be obtained by simply multiplying the corresponding main effect columns. For example, AB is obtained by multiplying columns A and B and treating – and + as –1 and +1 respectively. The interaction effects are estimated in a manner similar to the main effects. For example, the interaction effect AB is given by

[(Y1+Y2+Y3+Y4+Y13+Y14+Y15+Y16) − (Y5+Y6+Y7+Y8+Y9+Y10+Y11+Y12)]/16,

i.e., multiplying the Y’s by the – and + signs in the AB interaction column and adding them and dividing by 16.
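The sign-multiplication rule for main effects and interactions can be illustrated with a short Python sketch. The function names and the group means below are hypothetical; the means are invented so that the true coefficients are known in advance, and the divisor 16 follows the convention used in the text.

```python
from itertools import product

FACTORS = "ABCD"

def full_factorial(factors=FACTORS):
    """Rows of -1/+1 levels in the row order of Table 1 (A changes slowest)."""
    return [dict(zip(factors, levels))
            for levels in product((-1, 1), repeat=len(factors))]

def contrast_column(design, effect):
    """Sign column for an effect such as 'A' or 'AB': product of its factor columns."""
    column = []
    for row in design:
        sign = 1
        for factor in effect:
            sign *= row[factor]
        column.append(sign)
    return column

def effect_estimate(design, group_means, effect):
    """Multiply the group means by the signs, add them, and divide by the number of groups."""
    signs = contrast_column(design, effect)
    return sum(s * y for s, y in zip(signs, group_means)) / len(design)

design = full_factorial()
# Hypothetical group means, invented for illustration only.
group_means = [10 + 2 * row["A"] + 0.5 * row["A"] * row["B"] for row in design]
effect_a = effect_estimate(design, group_means, "A")
effect_ab = effect_estimate(design, group_means, "AB")
effect_b = effect_estimate(design, group_means, "B")
```

Because the made-up means were built from a coefficient of 2 on A and 0.5 on AB, the estimates recover those values, and the estimate for B is zero.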

The design in Table 1 is balanced in a number of different ways: each factor occurs at the low and high levels an equal number of times; and each combination of factors occurs an equal number of times (for example, the four different combinations (−, −), (−,+), (+,−), (+, +) of the pair (A, B) all occur four times). This balance leads to good statistical efficiency in estimating the main effects and interactions. Further, the columns in the design matrix (see Table 1) are orthogonal to each other, resulting in uncorrelated estimates.
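The balance and orthogonality properties can be checked numerically. The following Python sketch rebuilds the 15 sign columns of the two-level, four-factor design and tests each property; the variable names are illustrative choices, not part of the study.

```python
from itertools import combinations, product

factors = "ABCD"
runs = list(product((-1, 1), repeat=4))  # 16 rows, A changes slowest
effects = ["".join(c) for r in range(1, 5) for c in combinations(factors, r)]

def column(effect):
    """Sign column of an effect: product of the columns of its factors."""
    col = []
    for run in runs:
        sign = 1
        for f in effect:
            sign *= run[factors.index(f)]
        col.append(sign)
    return col

columns = {e: column(e) for e in effects}

# Balance: every column has as many + signs as - signs.
balanced = all(sum(col) == 0 for col in columns.values())

# Orthogonality: the inner product of any two distinct columns is zero.
orthogonal = all(
    sum(u * v for u, v in zip(columns[e1], columns[e2])) == 0
    for e1, e2 in combinations(effects, 2)
)

# Each (A, B) level combination appears the same number of times.
pair_counts = {}
for run in runs:
    key = (run[0], run[1])
    pair_counts[key] = pair_counts.get(key, 0) + 1
```

All three checks succeed for this design: `balanced` and `orthogonal` are both true, and each of the four (A, B) combinations occurs four times.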

The problem with full factorial designs is that the number of groups increases rapidly with the number of factors and their levels. Table 2 shows the situation for two-level factors. The problem is worse for factors with more levels; even for 3 factors at 5 levels, there are 5 x 5 x 5 = 125 groups. Full factorial designs are geared towards estimating the main effects and all higher order interactions. However, in many experiments, only a small proportion of the factors are likely to be active. Also, most of the higher-order interactions will be negligible and are not of primary interest in the screening stage. As noted in the literature 2, “… there tends to be a redundancy in [full factorial designs] – redundancy in terms of an excess number of interactions that can be estimated and sometimes in an excess number of [components] that are studied.” Fractional factorial designs exploit this redundancy to study the effects of several factors economically.

Table 2.

Number of groups in a two-level full factorial design as the number of factors increases

Number of factors Number of Groups
2 4
3 8
4 16
5 32
6 64
8 256
10 1,024
15 32,768
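The growth in the number of groups is simply the product of the number of levels of each factor. A minimal Python sketch, with a hypothetical helper name `n_groups`, reproduces the counts quoted above:

```python
from math import prod

def n_groups(levels_per_factor):
    """Number of groups in a full factorial: product of the levels of each factor."""
    return prod(levels_per_factor)

two_level_counts = {k: n_groups([2] * k) for k in (2, 3, 4, 5, 6, 8, 10, 15)}
three_factors_five_levels = n_groups([5, 5, 5])
```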

Fractional experiments have been used in the behavior intervention literature 12, 13. Some of them are fractional designs only in the loose sense of the word: any subset of a full factorial design. They do not have the elegant structure and statistical properties associated with the fractional factorial designs discussed in this article.

4. Fractional Factorial Designs: Half-fraction

Suppose we want to study all 5 factors A, B, C, D, and E in the Guide to Decide project using an FFD with 16 groups. If the fourth-order interaction ABCD is negligible, we can vary the fifth factor E according to the ABCD column in Table 1. This results in aliasing the two effects, i.e., the effect of E cannot be separated from that of ABCD, or E = ABCD. (The notation U = V means that two effects U and V are aliased.) If our assumption about the ABCD interaction is valid, then any significant effect associated with this column should be attributed to the main effect of factor E.

There are additional consequences from the aliasing. The relationship E = ABCD implies that

A = BCDE, B = ACDE, C = ABDE, and D = ABCE,

or each main effect is aliased with a fourth-order interaction. In addition,

AB = CDE,
AC = BDE,
AD = BCE,
AE = BCD,
BC = ADE,
BD = ACE,
BE = ACD,
CD = ABE,
CE = ABD,
and DE = ABC.

That is, two-factor interactions are all aliased with three-factor interactions, and we can estimate the two-factor interactions only if we know that the three-factor interactions are negligible. This is reasonable in many situations. If so, we can use the 16-group design to study 5 factors simultaneously. This FFD is a half-fraction of a 2^5 full factorial. It has the attractive property that all the main effects are aliased with four or higher-order interactions and all the two-way interactions are aliased with three or higher-order interactions. So we can estimate all the main effects and second-order interactions provided all third and higher-order interactions are negligible.
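The alias relationships listed above follow mechanically from E = ABCD. Multiplying both sides by E gives the defining word ABCDE, and the alias of any effect is its product with that word, with repeated letters cancelling. A minimal Python sketch of this rule (the helper name `multiply` is an illustrative choice):

```python
from itertools import combinations

def multiply(effect1, effect2):
    """Product of two effects; a letter appearing twice cancels (A * A = I)."""
    letters = set(effect1) ^ set(effect2)
    return "".join(sorted(letters)) or "I"

DEFINING_WORD = "ABCDE"  # from E = ABCD, i.e. I = ABCDE

aliases = {}
for size in (1, 2):
    for combo in combinations("ABCDE", size):
        effect = "".join(combo)
        aliases[effect] = multiply(effect, DEFINING_WORD)
```

Every main effect maps to a four-factor interaction and every two-factor interaction maps to a three-factor interaction, matching the lists in the text.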

We will illustrate the usefulness of a half-fraction for studying 5 factors in 16 groups in a real application in the next section. A 16-group design can also be obtained as a one-quarter fraction of a 2^6 full factorial design. We will use this to study 6 factors in 16 groups in Section 6.

5. Guide to Decide Revisited

Table 3 shows the 16-group fractional factorial design used in the screening phase of this study. We obtained it by setting E = ABCD. It reduced the number of groups by half, yet allowed us to estimate all the main effects and two-factor interactions assuming that three and higher-order interactions were absent.

Table 3.

FFD for the Guide to Decide Project

Group  A = Pictograph  B = Risk Context  C = Statistics  D = Order       E = ABCD = Risk Presentation
1      Pictograph      Absent            100             Benefits first  Incremental
2      Pictograph      Absent            100             Risks first     Total
3      Pictograph      Absent            1000            Benefits first  Total
4      Pictograph      Absent            1000            Risks first     Incremental
5      Pictograph      Present           100             Benefits first  Total
6      Pictograph      Present           100             Risks first     Incremental
7      Pictograph      Present           1000            Benefits first  Incremental
8      Pictograph      Present           1000            Risks first     Total
9      Prose only      Absent            100             Benefits first  Total
10     Prose only      Absent            100             Risks first     Incremental
11     Prose only      Absent            1000            Benefits first  Incremental
12     Prose only      Absent            1000            Risks first     Total
13     Prose only      Present           100             Benefits first  Incremental
14     Prose only      Present           100             Risks first     Total
15     Prose only      Present           1000            Benefits first  Total
16     Prose only      Present           1000            Risks first     Incremental
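The structure of Table 3 can be reproduced by writing down the 16 combinations of A–D and setting the level of E to the product of the other four signs. The Python sketch below does this and also checks that the 5 main-effect columns and 10 two-factor interaction columns are mutually orthogonal, which is what allows them to be estimated separately when higher-order interactions are negligible. Coding the first level listed in each column of Table 3 as −1 is an assumption made for the sketch.

```python
from itertools import combinations, product

# 16 runs of A-D in the row order of Table 3; E is set equal to the ABCD product.
base = list(product((-1, 1), repeat=4))
design = [(a, b, c, d, a * b * c * d) for a, b, c, d in base]

# Index tuples for the 5 main effects and the 10 two-factor interactions.
effects = [c for r in (1, 2) for c in combinations(range(5), r)]

def column(effect):
    """Sign column of an effect: product of the design columns it involves."""
    col = []
    for run in design:
        sign = 1
        for i in effect:
            sign *= run[i]
        col.append(sign)
    return col

cols = [column(e) for e in effects]
mutually_orthogonal = all(
    sum(u * v for u, v in zip(c1, c2)) == 0 for c1, c2 in combinations(cols, 2)
)
e_levels = [run[4] for run in design]
```

With +1 read as "Incremental" and −1 as "Total", `e_levels` follows the same pattern as the last column of Table 3.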

The screening phase of the study involved 632 women who were at high risk of developing a first breast cancer in the next five years. There were three primary outcome measures: 1) knowledge of risks and benefits of tamoxifen, 2) participants’ perceptions of the risks and benefits, and 3) participants’ intention to take additional action or seek more information.

Table 4 shows the results of our analysis for one outcome measure: knowledge of risks and benefits. Only the significant main effects and interactions are shown (except for pictograph, which has a small main effect but significant interactions). The main effect of incremental risk format is important with a negative coefficient, but it has a significant positive interaction with pictograph. So incremental risk formats result in lower knowledge scores among women who received their risk information in prose only format, but not among women receiving risk information in a pictograph format. Risk denominator is also significant, with use of a 1000 person denominator increasing knowledge compared to 100. The pictograph x denominator interaction is marginal and trends in the same direction as above, suggesting that pictographs can at least partially mediate the knowledge deficits resulting from the use of 100 person denominators.

Table 4.

Analysis of participants’ knowledge scores*

Factors Coefficient p-value
Pictograph (vs. Numeric Text) 0.001 0.996
Incremental Risk (vs. Total) −0.674 <0.001
Pictograph x Incremental Risk 0.791 <0.001
1000 Risk Denominator (vs. 100) 0.493 0.003
Pictograph x Denominator −0.364 0.114
* Analysis controls for participants’ numeracy score. The main effect “Pictograph” is included even though it is not important because it has significant interactions with other factors.
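To see how the main effect and the interaction in Table 4 combine, the following Python sketch computes the change in knowledge score associated with the incremental risk format separately for the prose-only and pictograph conditions. It assumes that the factors were coded as 0/1 indicators; the coding is not stated in the text, so this is an assumption, and the numbers would differ under −1/+1 coding.

```python
# Coefficients reported in Table 4.
COEF = {
    "incremental": -0.674,
    "pictograph_x_incremental": 0.791,
}

def incremental_risk_effect(pictograph):
    """Change in knowledge score from incremental (vs. total) risk format.

    `pictograph` is 1 for the pictograph condition and 0 for prose only.
    Assumes 0/1 indicator coding, which is an assumption of this sketch.
    """
    return COEF["incremental"] + COEF["pictograph_x_incremental"] * pictograph

effect_prose_only = incremental_risk_effect(0)
effect_pictograph = incremental_risk_effect(1)
```

Under this coding the incremental format lowers the score by about 0.67 in the prose-only condition and changes it by about +0.12 in the pictograph condition, in line with the interpretation given in the text.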

The results from the screening phase provided clear guidelines for the refining experiment in Phase II. All numerical information will be presented in pictographs with the incremental risk format (due to strong main effect of incremental risk and its interaction with pictograph). Contextual information will not be included in Phase II. The refining phase will examine four new components (factors) of the DA for describing risks and benefits. In addition, we will study both tamoxifen and raloxifene (a recently approved drug).

6. Project Quit

This project deals with smoking cessation. We use it here to demonstrate a ¼-fraction of a 64-group design. (For illustrative purposes, we present a slightly modified version of the actual experiment.) There have been several smoking-cessation studies using computer-tailored programs 14,15,16. We have been using the multiphase framework to identify the active components of a web-based smoking prevention study. Phase I was a six-month study involving 1848 subjects. The primary outcome measure was abstinence over a seven-day period (“Did you smoke a tobacco cigarette in the past 7 days?”). There were six two-level factors:

  1. A: Exposure – a single, large set of materials or multiple correspondences over several weeks.

  2. B: Outcome expectation depth – High-depth group received individualized feedback and advice for quitting. Low-depth group received general feedback related to motives for quitting.

  3. C: Success story depth – High-depth group received a story tailored to their specific socio-demographic background. Story for low-depth group was tailored only to their gender.

  4. D: Efficacy Expectation – High-depth group received feedback and advice on the highest barriers to quitting. Low-depth group received advice on a broader range of barriers cited by the smoker.

  5. E: Source – High-depth personalized source condition including a photograph of, and supportive text from, the individual smoking cessation team of the HMO. Low-depth version included only a photograph.

  6. F: Framing – Gain framing (positive aspects of quitting) or loss framing (negative effects of continued smoking).

A full factorial experiment with six factors requires 64 groups which is not practical. In this application, prior experience suggested that factor B (outcome expectation depth) could have interactions with C (success story depth), D (efficacy expectation) and F (framing). In addition, the DF interaction (efficacy expectation x framing) may be active, and all other interactions are small. So we want to be able to estimate the BC, BD, BF and DF interactions assuming all others are small. We can then use a 16-group design with the aliasing relationships

E = ABC and F = ACD.

(Taking products, this implies the additional aliasing EF = BD.) This is a ¼-fraction of a 64-group full factorial design. Unlike the Decision Aid example, some of the two-factor interactions are now aliased with other two-factor interactions. Let us focus on the two-factor interactions of interest to us listed on the left-hand side of the equations below:

BC = AE
BD = EF
BF = DE
DF = BE = AC.

Fortunately, these are aliased with other two-factor interactions that were considered negligible. This was not by accident. The aliasing structure E = ABC and F = ACD was chosen judiciously to accomplish this goal.
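The aliasing among two-factor interactions can be derived from the two generators. E = ABC and F = ACD give the defining words ABCE and ACDF, and their product BDEF is a third defining word. Multiplying an interaction by each defining word gives its aliases. The Python sketch below, with hypothetical helper names, lists every two-factor interaction aliased with each of the four interactions of interest.

```python
def multiply(*effects):
    """Product of effects; a letter appearing an even number of times cancels."""
    letters = set()
    for effect in effects:
        letters ^= set(effect)
    return "".join(sorted(letters)) or "I"

# Defining words implied by the generators E = ABC and F = ACD.
w1 = multiply("E", "ABC")   # ABCE
w2 = multiply("F", "ACD")   # ACDF
defining_words = [w1, w2, multiply(w1, w2)]  # the product is BDEF

def two_factor_aliases(effect):
    """Two-factor interactions aliased with `effect` in this quarter fraction."""
    products = [multiply(effect, word) for word in defining_words]
    return sorted(p for p in products if len(p) == 2)

alias_table = {e: two_factor_aliases(e) for e in ("BC", "BD", "BF", "DF")}
for effect, partners in alias_table.items():
    print(effect, "=", ", ".join(partners))
```

Each interaction of interest is also aliased with four-factor or higher-order terms, which the sketch filters out because they are assumed negligible.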

Of course, our assumption about the interactions can be wrong. Many intervention researchers view this as a severe limitation of FFDs 12, 13. However, in MOST, the FFDs are embedded within a multiphase strategy, so we can verify any critical working assumptions using follow-up experiments in the refining phase. (This concept is well known within the design literature.) See references 1 and 8 for examples of follow-up experiments to resolve ambiguities in aliased interaction effects.

Turning to the analysis, we regressed the binary outcome variable (abstinence or not) on the intervention components and several baseline socio-demographic characteristics. Motivation was the only significant socio-demographic variable. Of the components, source depth (p-value 0.027) and success story depth (p-value 0.046) had significant effects, with high depth increasing quit rates. Outcome expectations and efficacy expectations were marginally important (p-values of 0.195 for both). The other main effects were not significant. In addition, none of the two-factor interactions was significant, including BC, BD, BF, and DF, which we had a priori felt might be important.

This study demonstrates the effect-sparsity principle: only 2 of the 6 components were important and there were no significant interactions. We are able to examine several components economically using only a 16-arm trial. The refinement phase of Project Quit is underway with another randomized trial that examines the important factors in more detail: understanding the impact of added source personalization and variations in story design on smoking cessation. Results of this study should be available by the Fall of 2008.

7. Summary

Advances in technology (internet, web-based surveys, and so on) are leading to an explosion in the number of possible components in behavior intervention research. There is a need for alternative methods that can screen a large set of potential components and identify the important ones that can be subsequently used in an optimal intervention program. We also need to understand which components are important, what their optimal levels are, and so on. However, full factorial experiments lead to a large number of groups and are not practical. For example, in our projects, this would have required the design and implementation of 32 or 64 combinations of tailoring programs. FFDs are a promising alternative, especially when used within a multiphase framework that allows us to verify critical assumptions made in the screening phase. FFDs are used extensively in engineering, and they can also be useful in health-behavior studies.

The class of FFDs discussed here is sometimes referred to as regular fractional factorial designs 8. They have a very simple aliasing structure where two effects are completely aliased with each other. There are other types of FFDs, such as Plackett-Burman designs, where the aliasing structure is more complex 2, 7, 8. Their interaction effects are not as easy to untangle in the refining phase of the multiphase experimentation strategy, so we do not recommend them in the screening phase.

References

  • 1. Collins LM, Murphy SA, Nair V, Strecher V. A Strategy for Optimizing and Evaluating Behavioral Interventions. Annals of Behavioral Medicine. 2005;30:65–73. doi: 10.1207/s15324796abm3001_8.
  • 2. Box GEP, Hunter WG, Hunter JS. Statistics for Experimenters. New York: Wiley; 1978.
  • 3. Yates F. The design and analysis of factorial experiments. Harpenden: Imperial Bureau of Soil Sciences; 1937. Technical communication No. 35.
  • 4. Fisher RA. The Design of Experiments. 3rd ed. Edinburgh: Oliver & Boyd; 1942.
  • 5. Finney DJ. The fractional replication of factorial arrangements. Annals of Eugenics. 1945;12:291–301.
  • 6. Box GEP, Hunter JS. The 2^(k−p) fractional factorial designs: parts I and II. Technometrics. 1961;3:311–351, 449–458.
  • 7. Montgomery DC. Design and Analysis of Experiments. 6th ed. Wiley; 2005.
  • 8. Wu CFJ, Hamada M. Experiments: Planning, Analysis, and Parameter Design Optimization. New York: Wiley; 2000.
  • 9. Fisher B, Costantino JP, Wickerham DL, Redmond CK, et al. Tamoxifen for prevention of breast cancer: Report of the National Surgical Adjuvant Breast and Bowel Project P-1 Study. Journal of the National Cancer Institute. 1998;90(18):1371–1388. doi: 10.1093/jnci/90.18.1371.
  • 10. O’Connor AM, Fiset V, Rostom A, et al. The Cochrane Library. Vol. 4. Chichester, UK: John Wiley & Sons, Ltd; 2004. Decision aids for people facing health treatment or screening decisions (Cochrane Review).
  • 11. O’Connor AM, Rostom A, Fiset V, et al. Decision aids for patients facing health treatment or screening decisions: Systematic review. British Medical Journal. 1999;319(7212):731–734. doi: 10.1136/bmj.319.7212.731.
  • 12. West SG, Aiken LS, Todd M. Probing the effects of individual components in multiple component prevention programs. American Journal of Community Psychology. 1993;21(5):571–605. doi: 10.1007/BF00942173.
  • 13. West SG, Aiken LS. Toward understanding individual effects in multicomponent prevention programs: Design and analysis strategies. In: Bryant KJ, Windle M, West SG, editors. The science of prevention: Methodological advances from alcohol and substance use research. Washington, DC: American Psychological Association; 1997.
  • 14. Shiffman S, Gitchell J. World’s best practice in tobacco control: Increasing quitting by increasing access to treatment medications: USA. Tobacco Control. 2000;9:228–236. doi: 10.1136/tc.9.2.228.
  • 15. Shiffman S, Paty JA, Rohay JM, DiMarino ME, Gitchell JG. The efficacy of computer tailored smoking cessation material as a supplement to nicotine patch therapy. Drug and Alcohol Dependence. 2001;64(1):35–46. doi: 10.1016/s0376-8716(00)00237-4.
  • 16. Etter JF. Using new information technology to treat tobacco dependence. Respiration. 2002;69(2):111–114. doi: 10.1159/000056311.
  • 17. Strecher VJ. Computer-tailored smoking cessation materials: A review and discussion. Pat Educ Couns. 1999;36:107–117. doi: 10.1016/s0738-3991(98)00128-1.
