
What Is a Factorial Design? Definition and Examples


A factorial design is a type of experiment that involves manipulating two or more variables. While simple psychology experiments look at how one independent variable affects one dependent variable, researchers often want to know more about the effects of multiple independent variables.


How a Factorial Design Works

Let’s take a closer look at how a factorial design might work in a psychology experiment:

  • The independent variable is the variable of interest that the experimenter will manipulate.
  • The dependent variable is the variable that the researcher then measures.

By doing this, psychologists can see if changing the independent variable results in some type of change in the dependent variable.

For example, imagine that a researcher wants to do an experiment looking at whether sleep deprivation hurts reaction times during a driving test. If she were only to perform the experiment using these variables (sleep deprivation as the independent variable and performance on the driving test as the dependent variable), it would be an example of a simple experiment.

However, let’s imagine that she is also interested in learning if sleep deprivation impacts the driving abilities of men and women differently. She has just added a second independent variable of interest (sex of the driver) into her study, which now makes it a factorial design.

Types of Factorial Designs

One common type of experiment is known as a 2×2 factorial design. In this type of study, there are two factors (or independent variables), each with two levels.

The number of numbers tells you how many independent variables (IVs) there are in an experiment, while the value of each number tells you how many levels there are for each independent variable.

So, for example, a 4×3 factorial design would involve two independent variables with four levels for one IV and three levels for the other IV.

Advantages of a Factorial Design

One of the big advantages of factorial designs is that they allow researchers to look for interactions between independent variables.

An interaction is a result in which the effect of one experimental manipulation depends upon the experimental manipulation of another independent variable.

Example of a Factorial Design

For example, imagine that researchers want to test the effects of a memory-enhancing drug. Participants are given one of three different drug doses, and then asked to complete either a simple or a complex memory task.

The researchers note that the effects of the memory drug are more pronounced with the simple memory tasks, but not as apparent when it comes to the complex tasks. In this 3×2 factorial design, there is an interaction effect between the drug dosage and the complexity of the memory task.

Understanding Variable Effects in Factorial Designs

So if researchers are manipulating two or more independent variables, how exactly do they know which effects are linked to which variables?

“It is true that when two manipulations are operating simultaneously, it is impossible to disentangle their effects completely,” explain authors Breckler, Olson, and Wiggins in their book Social Psychology Alive.

“Nevertheless, the researchers can explore the effects of each independent variable separately by averaging across all levels of the other independent variable. This procedure is called looking at the main effect.”

Examples of Factorial Designs

A university wants to assess the starting salaries of its MBA graduates. The study looks at graduates working in four different employment areas: accounting, management, finance, and marketing.

In addition to looking at the employment sector, the researchers also look at gender. In this example, the employment sector and gender of the graduates are the independent variables, and starting salary is the dependent variable. This would be considered a 4×2 factorial design.

Researchers want to determine how the amount of sleep a person gets the night before an exam impacts performance on a math test the next day. But the experimenters also know that many people like to have a cup of coffee (or two) in the morning to help them get going.

So, the researchers decide to look at how the amount of sleep and caffeine consumption influence test performance.

The researchers then decide to look at three levels of sleep (4 hours, 6 hours, and 8 hours) and only two levels of caffeine consumption (2 cups versus no coffee). In this case, the study is a 3×2 factorial design.



Research Methods in Psychology

5. Factorial Designs ¶

We have usually no knowledge that any one factor will exert its effects independently of all others that can be varied, or that its effects are particularly simply related to variations in these other factors. —Ronald Fisher

In Chapter 1 we briefly described a study conducted by Simone Schnall and her colleagues, in which they found that washing one’s hands leads people to view moral transgressions as less wrong [SBH08]. In a different but related study, Schnall and her colleagues investigated whether feeling physically disgusted causes people to make harsher moral judgments [SHCJ08]. In this experiment, they manipulated participants’ feelings of disgust by testing them in either a clean room or a messy room that contained dirty dishes, an overflowing wastebasket, and a chewed-up pen. They also used a self-report questionnaire to measure the amount of attention that people pay to their own bodily sensations. They called this “private body consciousness”. They measured their primary dependent variable, the harshness of people’s moral judgments, by describing different behaviors (e.g., eating one’s dead dog, failing to return a found wallet) and having participants rate the moral acceptability of each one on a scale of 1 to 7. They also measured some other dependent variables, including participants’ willingness to eat at a new restaurant. Finally, the researchers asked participants to rate their current level of disgust and other emotions. The primary results of this study were that participants in the messy room were in fact more disgusted and made harsher moral judgments than participants in the clean room, but only if they scored relatively high in private body consciousness.

The research designs we have considered so far have been simple—focusing on a question about one variable or about a statistical relationship between two variables. But in many ways, the complex design of this experiment undertaken by Schnall and her colleagues is more typical of research in psychology. Fortunately, we have already covered the basic elements of such designs in previous chapters. In this chapter, we look closely at how and why researchers combine these basic elements into more complex designs. We start with complex experiments—considering first the inclusion of multiple dependent variables and then the inclusion of multiple independent variables. Finally, we look at complex correlational designs.

5.1. Multiple Dependent Variables ¶

5.1.1. Learning Objectives ¶

Explain why researchers often include multiple dependent variables in their studies.

Explain what a manipulation check is and when it would be included in an experiment.

Imagine that you have made the effort to find a research topic, review the research literature, formulate a question, design an experiment, obtain approval from the relevant institutional review board (IRB), recruit research participants, and manipulate an independent variable. It would seem almost wasteful to measure a single dependent variable. Even if you are primarily interested in the relationship between an independent variable and one primary dependent variable, there are usually several more questions that you can answer easily by including multiple dependent variables.

5.1.2. Measures of Different Constructs ¶

Often a researcher wants to know how an independent variable affects several distinct dependent variables. For example, Schnall and her colleagues were interested in how feeling disgusted affects the harshness of people’s moral judgments, but they were also curious about how disgust affects other variables, such as people’s willingness to eat in a restaurant. As another example, researcher Susan Knasko was interested in how different odors affect people’s behavior [Kna92]. She conducted an experiment in which the independent variable was whether participants were tested in a room with no odor or in one scented with lemon, lavender, or dimethyl sulfide (which has a cabbage-like smell). Although she was primarily interested in how the odors affected people’s creativity, she was also curious about how they affected people’s moods and perceived health, and it was a simple enough matter to measure these dependent variables too. Although she found that creativity was unaffected by the ambient odor, she found that people’s moods were lower in the dimethyl sulfide condition, and that their perceived health was greater in the lemon condition.

When an experiment includes multiple dependent variables, there is again a possibility of carryover effects. For example, it is possible that measuring participants’ moods before measuring their perceived health could affect their perceived health or that measuring their perceived health before their moods could affect their moods. So the order in which multiple dependent variables are measured becomes an issue. One approach is to measure them in the same order for all participants—usually with the most important one first so that it cannot be affected by measuring the others. Another approach is to counterbalance, or systematically vary, the order in which the dependent variables are measured.
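As a rough sketch of what full counterbalancing can look like in practice, the snippet below rotates participants through every possible order of three measures (the measure names and the simple rotation scheme are hypothetical):

```python
from itertools import permutations

# All 6 possible orders in which three dependent measures could be administered.
measures = ["mood", "perceived_health", "creativity"]
orders = list(permutations(measures))

# Rotate through the orders so that each order is used equally often.
for participant in range(12):
    print(participant, orders[participant % len(orders)])
```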

5.1.3. Manipulation Checks ¶

When the independent variable is a construct that can only be manipulated indirectly—such as emotions and other internal states—an additional measure of that independent variable is often included as a manipulation check. This is done to confirm that the independent variable was, in fact, successfully manipulated. For example, Schnall and her colleagues had their participants rate their level of disgust to be sure that those in the messy room actually felt more disgusted than those in the clean room.

Manipulation checks are usually done at the end of the procedure to be sure that the effect of the manipulation lasted throughout the entire procedure and to avoid calling unnecessary attention to the manipulation. Manipulation checks become especially important when the manipulation of the independent variable turns out to have no effect on the dependent variable. Imagine, for example, that you exposed participants to happy or sad movie music—intending to put them in happy or sad moods—but you found that this had no effect on the number of happy or sad childhood events they recalled. This could be because being in a happy or sad mood has no effect on memories for childhood events. But it could also be that the music was ineffective at putting participants in happy or sad moods. A manipulation check, in this case, a measure of participants’ moods, would help resolve this uncertainty. If it showed that you had successfully manipulated participants’ moods, then it would appear that there is indeed no effect of mood on memory for childhood events. But if it showed that you did not successfully manipulate participants’ moods, then it would appear that you need a more effective manipulation to answer your research question.

5.1.4. Measures of the Same Construct ¶

Another common approach to including multiple dependent variables is to operationalize and measure the same construct, or closely related ones, in different ways. Imagine, for example, that a researcher conducts an experiment on the effect of daily exercise on stress. The dependent variable, stress, is a construct that can be operationalized in different ways. For this reason, the researcher might have participants complete the paper-and-pencil Perceived Stress Scale and also measure their levels of the stress hormone cortisol. This is an example of the use of converging operations. If the researcher finds that the different measures are affected by exercise in the same way, then he or she can be confident in the conclusion that exercise affects the more general construct of stress.

When multiple dependent variables are different measures of the same construct (especially if they are measured on the same scale), researchers have the option of combining them into a single measure of that construct. Recall that Schnall and her colleagues were interested in the harshness of people’s moral judgments. To measure this construct, they presented their participants with seven different scenarios describing morally questionable behaviors and asked them to rate the moral acceptability of each one. Although the researchers could have treated each of the seven ratings as a separate dependent variable, they combined them into a single dependent variable by computing their mean.

When researchers combine dependent variables in this way, they are treating them collectively as a multiple-response measure of a single construct. The advantage of this is that multiple-response measures are generally more reliable than single-response measures. However, it is important to make sure the individual dependent variables are correlated with each other by computing an internal consistency measure such as Cronbach’s α. If they are not correlated with each other, then it does not make sense to combine them into a measure of a single construct. If they have poor internal consistency, then they should be treated as separate dependent variables.
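As a minimal sketch of how this check might be done, the function below computes Cronbach’s α from a participants-by-items matrix using the standard formula α = (k / (k − 1)) × (1 − Σ item variances / variance of total scores); the ratings are invented for illustration:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_participants, n_items) array of ratings."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)      # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)  # variance of participants' total scores
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical ratings: 5 participants x 3 moral-judgment scenarios (1-7 scale).
ratings = np.array([
    [6, 5, 6],
    [2, 3, 2],
    [7, 6, 7],
    [4, 4, 5],
    [3, 2, 3],
])
print(round(cronbach_alpha(ratings), 2))  # ~0.97: combining the items into a mean is reasonable
```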

5.1.5. Key Takeaways ¶

Researchers in psychology often include multiple dependent variables in their studies. The primary reason is that this easily allows them to answer more research questions with minimal additional effort.

When an independent variable is a construct that is manipulated indirectly, it is a good idea to include a manipulation check. This is a measure of the independent variable typically given at the end of the procedure to confirm that it was successfully manipulated.

Multiple measures of the same construct can be analyzed separately or combined to produce a single multiple-item measure of that construct. The latter approach requires that the measures taken together have good internal consistency.

5.1.6. Exercises ¶

Practice: List three independent variables for which it would be good to include a manipulation check. List three others for which a manipulation check would be unnecessary. Hint: Consider whether there is any ambiguity concerning whether the manipulation will have its intended effect.

Practice: Imagine a study in which the independent variable is whether the room where participants are tested is warm (30°C) or cool (12°C). List three dependent variables that you might treat as measures of separate variables. List three more that you might combine and treat as measures of the same underlying construct.

5.2. Multiple Independent Variables ¶

5.2.1. Learning Objectives ¶

Explain why researchers often include multiple independent variables in their studies.

Define factorial design, and use a factorial design table to represent and interpret simple factorial designs.

Distinguish between main effects and interactions, and recognize and give examples of each.

Sketch and interpret bar graphs and line graphs showing the results of studies with simple factorial designs.

Just as it is common for studies in psychology to include multiple dependent variables, it is also common for them to include multiple independent variables. Schnall and her colleagues studied the effect of both disgust and private body consciousness in the same study. The tendency to include multiple independent variables in one experiment is further illustrated by the following titles of actual research articles published in professional journals:

The Effects of Temporal Delay and Orientation on Haptic Object Recognition

Opening Closed Minds: The Combined Effects of Intergroup Contact and Need for Closure on Prejudice

Effects of Expectancies and Coping on Pain-Induced Intentions to Smoke

The Effect of Age and Divided Attention on Spontaneous Recognition

The Effects of Reduced Food Size and Package Size on the Consumption Behavior of Restrained and Unrestrained Eaters

Just as including multiple dependent variables in the same experiment allows one to answer more research questions, so too does including multiple independent variables in the same experiment. For example, instead of conducting one study on the effect of disgust on moral judgment and another on the effect of private body consciousness on moral judgment, Schnall and colleagues were able to conduct one study that addressed both variables. But including multiple independent variables also allows the researcher to answer questions about whether the effect of one independent variable depends on the level of another. This is referred to as an interaction between the independent variables. Schnall and her colleagues, for example, observed an interaction between disgust and private body consciousness because the effect of disgust depended on whether participants were high or low in private body consciousness. As we will see, interactions are often among the most interesting results in psychological research.

5.2.2. Factorial Designs ¶

By far the most common approach to including multiple independent variables in an experiment is the factorial design. In a factorial design, each level of one independent variable (which can also be called a factor) is combined with each level of the others to produce all possible combinations. Each combination, then, becomes a condition in the experiment. Imagine, for example, an experiment on the effect of cell phone use (yes vs. no) and time of day (day vs. night) on driving ability. This is shown in the factorial design table in Figure 5.1. The columns of the table represent cell phone use, and the rows represent time of day. The four cells of the table represent the four possible combinations or conditions: using a cell phone during the day, not using a cell phone during the day, using a cell phone at night, and not using a cell phone at night. This particular design is referred to as a 2 x 2 (read “two-by-two”) factorial design because it combines two variables, each of which has two levels. If one of the independent variables had a third level (e.g., using a hand-held cell phone, using a hands-free cell phone, and not using a cell phone), then it would be a 3 x 2 factorial design, and there would be six distinct conditions. Notice that the number of possible conditions is the product of the numbers of levels. A 2 x 2 factorial design has four conditions, a 3 x 2 factorial design has six conditions, a 4 x 5 factorial design would have 20 conditions, and so on.


Fig. 5.1 Factorial Design Table Representing a 2 x 2 Factorial Design ¶
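One way to see why the number of conditions is the product of the numbers of levels is to generate the combinations directly. Here is a minimal Python sketch using the cell phone example (the level names are ours, for illustration):

```python
from itertools import product

cell_phone = ["no phone", "phone"]
time_of_day = ["day", "night"]

# Each condition in a factorial design is one combination of levels.
conditions = list(product(cell_phone, time_of_day))
for condition in conditions:
    print(condition)
print(len(conditions))  # 4 conditions in a 2 x 2 design
```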

In principle, factorial designs can include any number of independent variables with any number of levels. For example, an experiment could include the type of psychotherapy (cognitive vs. behavioral), the length of the psychotherapy (2 weeks vs. 2 months), and the sex of the psychotherapist (female vs. male). This would be a 2 x 2 x 2 factorial design and would have eight conditions. Figure 5.2 shows one way to represent this design. In practice, it is unusual for there to be more than three independent variables with more than two or three levels each.

This is for at least two reasons: For one, the number of conditions can quickly become unmanageable. For example, adding a fourth independent variable with three levels (e.g., therapist experience: low vs. medium vs. high) to the current example would make it a 2 x 2 x 2 x 3 factorial design with 24 distinct conditions. Second, the number of participants required to populate all of these conditions (while maintaining a reasonable ability to detect a real underlying effect) can render the design unfeasible (for more information, see the discussion about the importance of adequate statistical power in Chapter 13). As a result, in the remainder of this section we will focus on designs with two independent variables. The general principles discussed here extend in a straightforward way to more complex factorial designs.


Fig. 5.2 Factorial Design Table Representing a 2 x 2 x 2 Factorial Design ¶

5.2.3. Assigning Participants to Conditions ¶

Recall that in a simple between-subjects design, each participant is tested in only one condition. In a simple within-subjects design, each participant is tested in all conditions. In a factorial experiment, the decision to take the between-subjects or within-subjects approach must be made separately for each independent variable. In a between-subjects factorial design, all of the independent variables are manipulated between subjects. For example, all participants could be tested either while using a cell phone or while not using a cell phone and either during the day or during the night. This would mean that each participant was tested in one and only one condition. In a within-subjects factorial design, all of the independent variables are manipulated within subjects. All participants could be tested both while using a cell phone and while not using a cell phone and both during the day and during the night. This would mean that each participant was tested in all conditions. The advantages and disadvantages of these two approaches are the same as those discussed in Chapter 4. The between-subjects design is conceptually simpler, avoids carryover effects, and minimizes the time and effort of each participant. The within-subjects design is more efficient for the researcher and helps to control extraneous variables.

It is also possible to manipulate one independent variable between subjects and another within subjects. This is called a mixed factorial design. For example, a researcher might choose to treat cell phone use as a within-subjects factor by testing the same participants both while using a cell phone and while not using a cell phone (while counterbalancing the order of these two conditions). But he or she might choose to treat time of day as a between-subjects factor by testing each participant either during the day or during the night (perhaps because this only requires them to come in for testing once). Thus each participant in this mixed design would be tested in two of the four conditions.

Regardless of whether the design is between subjects, within subjects, or mixed, the actual assignment of participants to conditions or orders of conditions is typically done randomly.
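A minimal sketch of how such random assignment might be implemented for the between-subjects version of the cell phone experiment (the participant IDs and the block-randomization scheme are hypothetical):

```python
import random

conditions = [("no phone", "day"), ("no phone", "night"),
              ("phone", "day"), ("phone", "night")]

participants = [f"P{i:02d}" for i in range(1, 13)]
random.shuffle(participants)  # randomize the order of participants

# Cycle through the conditions so that each condition receives 3 of the 12 participants.
for i, participant in enumerate(participants):
    print(participant, conditions[i % len(conditions)])
```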

5.2.4. Non-manipulated Independent Variables ¶

In many factorial designs, one of the independent variables is a non-manipulated independent variable. The researcher measures it but does not manipulate it. The study by Schnall and colleagues is a good example. One independent variable was disgust, which the researchers manipulated by testing participants in a clean room or a messy room. The other was private body consciousness, a variable which the researchers simply measured. Another example is a study by Halle Brown and colleagues in which participants were exposed to several words that they were later asked to recall [BKD+99]. The manipulated independent variable was the type of word. Some were negative, health-related words (e.g., tumor, coronary), and others were not health related (e.g., election, geometry). The non-manipulated independent variable was whether participants were high or low in hypochondriasis (excessive concern with ordinary bodily symptoms). Results from this study suggested that participants high in hypochondriasis were better than those low in hypochondriasis at recalling the health-related words, but that they were no better at recalling the non-health-related words.

Such studies are extremely common, and there are several points worth making about them. First, non-manipulated independent variables are usually participant characteristics (private body consciousness, hypochondriasis, self-esteem, and so on), and as such they are, by definition, between-subject factors. For example, people are either low in hypochondriasis or high in hypochondriasis; they cannot be in both of these conditions. Second, such studies are generally considered to be experiments as long as at least one independent variable is manipulated, regardless of how many non-manipulated independent variables are included. Third, it is important to remember that causal conclusions can only be drawn about the manipulated independent variable. For example, Schnall and her colleagues were justified in concluding that disgust affected the harshness of their participants’ moral judgments because they manipulated that variable and randomly assigned participants to the clean or messy room. But they would not have been justified in concluding that participants’ private body consciousness affected the harshness of their participants’ moral judgments because they did not manipulate that variable. It could be, for example, that having a strict moral code and a heightened awareness of one’s body are both caused by some third variable (e.g., neuroticism). Thus it is important to be aware of which variables in a study are manipulated and which are not.

5.2.5. Graphing the Results of Factorial Experiments ¶

The results of factorial experiments with two independent variables can be graphed by representing one independent variable on the x-axis and representing the other by using different kinds of bars or lines. (The y-axis is always reserved for the dependent variable.)


Fig. 5.3 Two ways to plot the results of a factorial experiment with two independent variables ¶

Figure 5.3 shows results for two hypothetical factorial experiments. The top panel shows the results of a 2 x 2 design. Time of day (day vs. night) is represented by different locations on the x-axis, and cell phone use (no vs. yes) is represented by different-colored bars. It would also be possible to represent cell phone use on the x-axis and time of day as different-colored bars. The choice comes down to which way seems to communicate the results most clearly. The bottom panel of Figure 5.3 shows the results of a 4 x 2 design in which one of the variables is quantitative. This variable, psychotherapy length, is represented along the x-axis, and the other variable (psychotherapy type) is represented by differently formatted lines. This is a line graph rather than a bar graph because the variable on the x-axis is quantitative with a small number of distinct levels. Line graphs are also appropriate when representing measurements made over a time interval (also referred to as time series information) on the x-axis.
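To make the bar-graph version concrete, here is a minimal matplotlib sketch of a 2 x 2 result (the performance values are invented for illustration):

```python
import numpy as np
import matplotlib.pyplot as plt

no_phone = [4.0, 3.0]  # hypothetical driving performance: [day, night]
phone = [2.5, 1.0]     # hypothetical driving performance: [day, night]

x = np.arange(2)  # x-axis positions for the time-of-day levels
width = 0.35
plt.bar(x - width / 2, no_phone, width, label="No cell phone")
plt.bar(x + width / 2, phone, width, label="Cell phone")
plt.xticks(x, ["Day", "Night"])
plt.ylabel("Driving performance")
plt.legend()
plt.show()
```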

5.2.6. Main Effects and Interactions ¶

In factorial designs, there are two kinds of results that are of interest: main effects and interactions. A main effect is the statistical relationship between one independent variable and a dependent variable, averaging across the levels of the other independent variable(s). Thus there is one main effect to consider for each independent variable in the study. The top panel of Figure 5.4 shows a main effect of cell phone use because driving performance was better, on average, when participants were not using cell phones than when they were. The blue bars are, on average, higher than the red bars. It also shows a main effect of time of day because driving performance was better during the day than during the night, both when participants were using cell phones and when they were not. Main effects are independent of each other in the sense that whether or not there is a main effect of one independent variable says nothing about whether or not there is a main effect of the other. The bottom panel of Figure 5.4, for example, shows a clear main effect of psychotherapy length. The longer the psychotherapy, the better it worked.


Fig. 5.4 Bar graphs showing three types of interactions. In the top panel, one independent variable has an effect at one level of the second independent variable but not at the other. In the middle panel, one independent variable has a stronger effect at one level of the second independent variable than at the other. In the bottom panel, one independent variable has the opposite effect at one level of the second independent variable than at the other. ¶

There is an interaction effect (or just “interaction”) when the effect of one independent variable depends on the level of another. Although this might seem complicated, you already have an intuitive understanding of interactions. It probably would not surprise you, for example, to hear that the effect of receiving psychotherapy is stronger among people who are highly motivated to change than among people who are not motivated to change. This is an interaction because the effect of one independent variable (whether or not one receives psychotherapy) depends on the level of another (motivation to change). Schnall and her colleagues also demonstrated an interaction because the effect of whether the room was clean or messy on participants’ moral judgments depended on whether the participants were low or high in private body consciousness. If they were high in private body consciousness, then those in the messy room made harsher judgments. If they were low in private body consciousness, then whether the room was clean or messy did not matter.

The effect of one independent variable can depend on the level of the other in several different ways. This is shown in Figure 5.5 .


Fig. 5.5 Line Graphs Showing Three Types of Interactions. In the top panel, one independent variable has an effect at one level of the second independent variable but not at the other. In the middle panel, one independent variable has a stronger effect at one level of the second independent variable than at the other. In the bottom panel, one independent variable has the opposite effect at one level of the second independent variable than at the other. ¶

In the top panel, independent variable “B” has an effect at level 1 of independent variable “A” but no effect at level 2 of independent variable “A” (much like the study by Schnall, in which there was an effect of disgust for those high in private body consciousness but not for those low in private body consciousness). In the middle panel, independent variable “B” has a stronger effect at level 1 of independent variable “A” than at level 2. This is like the hypothetical driving example where there was a stronger effect of using a cell phone at night than during the day. In the bottom panel, independent variable “B” again has an effect at both levels of independent variable “A”, but the effects are in opposite directions. This is what is called a crossover interaction. One example of a crossover interaction comes from a study by Kathy Gilliland on the effect of caffeine on the verbal test scores of introverts and extraverts [Gil80]. Introverts perform better than extraverts when they have not ingested any caffeine. But extraverts perform better than introverts when they have ingested 4 mg of caffeine per kilogram of body weight.

In many studies, the primary research question is about an interaction. The study by Brown and her colleagues was inspired by the idea that people with hypochondriasis are especially attentive to any negative health-related information. This led to the hypothesis that people high in hypochondriasis would recall negative health-related words more accurately than people low in hypochondriasis but recall non-health-related words about the same as people low in hypochondriasis. And this is exactly what happened in this study.

5.2.7. Key Takeaways ¶

Researchers often include multiple independent variables in their experiments. The most common approach is the factorial design, in which each level of one independent variable is combined with each level of the others to create all possible conditions.

In a factorial design, the main effect of an independent variable is its overall effect averaged across all other independent variables. There is one main effect for each independent variable.

There is an interaction between two independent variables when the effect of one depends on the level of the other. Some of the most interesting research questions and results in psychology are specifically about interactions.

5.2.8. Exercises ¶

Practice: Return to the five article titles presented at the beginning of this section. For each one, identify the independent variables and the dependent variable.

Practice: Create a factorial design table for an experiment on the effects of room temperature and noise level on performance on the MCAT. Be sure to indicate whether each independent variable will be manipulated between-subjects or within-subjects and explain why.

Practice: Sketch 8 different bar graphs to depict each of the following possible results in a 2 x 2 factorial experiment:

No main effect of A; no main effect of B; no interaction

Main effect of A; no main effect of B; no interaction

No main effect of A; main effect of B; no interaction

Main effect of A; main effect of B; no interaction

Main effect of A; main effect of B; interaction

Main effect of A; no main effect of B; interaction

No main effect of A; main effect of B; interaction

No main effect of A; no main effect of B; interaction

5.3. Factorial designs: Round 2 ¶

Factorial designs require the experimenter to manipulate at least two independent variables. Consider the light-switch example from earlier. Imagine you are trying to figure out which of two light switches turns on a light. The dependent variable is the light (we measure whether it is on or off). The first independent variable is light switch #1, and it has two levels, up or down. The second independent variable is light switch #2, and it also has two levels, up or down. When there are two independent variables, each with two levels, there are four total conditions that can be tested. We can describe these four conditions in a 2x2 table.

                Switch 1 Up   Switch 1 Down
Switch 2 Up     Light ?       Light ?
Switch 2 Down   Light ?       Light ?

This kind of design has a special property that makes it a factorial design. That is, the levels of each independent variable are each manipulated across the levels of the other independent variable. In other words, we manipulate whether switch #1 is up or down both when switch #2 is up and when switch #2 is down. Another term for this property of factorial designs is “fully-crossed”.

It is possible to conduct experiments with more than one independent variable that are not fully-crossed, or factorial, designs. This would mean that the levels of one independent variable are not necessarily manipulated for each of the levels of the other independent variables. These kinds of designs are sometimes called unbalanced designs, and they are not as common as fully-factorial designs. An example of an unbalanced design would be the following design with only 3 conditions:

                Switch 1 Up   Switch 1 Down
Switch 2 Up     Light ?       Light ?
Switch 2 Down   Light ?       NOT MEASURED

Factorial designs are often described using notation such as AxB, where A indicates the number of levels for the first independent variable, and B indicates the number of levels for the second independent variable. The fully-crossed version of the two-light-switch experiment would be called a 2x2 factorial design. This notation is convenient because multiplying the numbers in the notation gives the number of conditions in the design. For example, 2x2 = 4 conditions.

More complicated factorial designs have more independent variables and more levels. We use the same notation to describe these designs: each number represents the number of levels for one of the independent variables, and the number of numbers represents the number of variables. So, a 2x2x2 design has three independent variables, and each one has 2 levels, for a total of 2x2x2 = 8 conditions. A 3x3 design has two independent variables, each with three levels, for a total of 9 conditions. Designs can get very complicated, such as a 5x3x6x2x7 experiment, with five independent variables, each with differing numbers of levels, for a total of 1260 conditions. If you are considering a complicated design like that one, you might want to consider how to simplify it.
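The condition counts implied by this notation are easy to check directly:

```python
from math import prod

print(prod([2, 2]))           # 2x2 design       -> 4 conditions
print(prod([2, 2, 2]))        # 2x2x2 design     -> 8 conditions
print(prod([3, 3]))           # 3x3 design       -> 9 conditions
print(prod([5, 3, 6, 2, 7]))  # 5x3x6x2x7 design -> 1260 conditions
```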

5.3.1. 2x2 Factorial designs ¶

For simplicity, we will focus mainly on 2x2 factorial designs. As with simple designs with only one independent variable, factorial designs have the same basic empirical question: did manipulation of the independent variables cause changes in the dependent variables? However, 2x2 designs have more than one manipulation, so there is more than one way that the dependent variable can change. So, we end up asking the basic empirical question more than once.

More specifically, the analysis of factorial designs is split into two parts: main effects and interactions. Main effects occur when the manipulation of one independent variable causes a change in the dependent variable. In a 2x2 design, there are two independent variables, so there are two possible main effects: the main effect of independent variable 1, and the main effect of independent variable 2. An interaction occurs when the effect of one independent variable depends on the levels of the other independent variable. My experience in teaching the concepts of main effects and interactions is that they are confusing. So, I expect that these definitions will not be very helpful yet; although they are clear and precise, they only become helpful as definitions after you understand the concepts. To explain the concepts we will go through several different kinds of examples.

To briefly add to the confusion, or perhaps to illustrate why these two concepts can be confusing, we will look at the eight possible outcomes that could occur in a 2x2 factorial experiment.

Possible outcome   IV1 main effect   IV2 main effect   Interaction
1                  yes               yes               yes
2                  yes               no                yes
3                  no                yes               yes
4                  no                no                yes
5                  yes               yes               no
6                  yes               no                no
7                  no                yes               no
8                  no                no                no

In the table, a yes means that there was a statistically significant difference for one of the main effects or the interaction, and a no means that there was not a statistically significant difference. As you can see, just by adding one more independent variable, the number of possible outcomes quickly becomes more complicated. When you conduct a 2x2 design, the task for analysis is to determine which of the 8 possibilities occurred, and then explain the patterns for each of the effects that occurred. That’s a lot of explaining to do.

5.3.2. Main effects ¶

Main effects occur when the levels of an independent variable cause a change in the measurement or dependent variable. There is one possible main effect for each independent variable in the design. When we find that an independent variable did influence the dependent variable, we say there was a main effect. When we find that it did not, we say there was no main effect.

The simplest way to understand a main effect is to pretend that the other independent variables do not exist. If you do this, then you simply have a single-factor design, and you are asking whether that single factor caused change in the measurement. For a 2x2 experiment, you do this twice, once for each independent variable.

Let’s consider a silly example to illustrate an important property of main effects. In this experiment the dependent variable will be height in inches. The independent variables will be shoes and hats. The shoes independent variable will have two levels: wearing shoes vs. no shoes. The hats independent variable will have two levels: wearing a hat vs. not wearing a hat. The experimenter will provide the shoes and hats. The shoes add 1 inch to a person’s height, and the hats add 6 inches to a person’s height. Further imagine that we conduct a within-subjects design, so we measure each person’s height in each of the four conditions. Before we look at some example data, the findings from this experiment should be pretty obvious: people will be 1 inch taller when they wear shoes, and 6 inches taller when they wear a hat. We see this in the example data from 10 subjects presented below:

NoShoes-NoHat   Shoes-NoHat   NoShoes-Hat   Shoes-Hat
57              58            63            64
58              59            64            65
58              59            64            65
58              59            64            65
59              60            65            66
58              59            64            65
57              58            63            64
59              60            65            66
57              58            63            64
58              59            64            65

The mean heights in each condition are:

Condition       Mean
NoShoes-NoHat   57.9
Shoes-NoHat     58.9
NoShoes-Hat     63.9
Shoes-Hat       64.9

To find the main effect of the shoes manipulation we want to find the mean height in the no shoes condition, and compare it to the mean height of the shoes condition. To do this, we collapse, or average over, the observations in the hat conditions. For example, looking only at the no shoes vs. shoes conditions, we see the following averages for each subject.

NoShoes   Shoes
60        61
61        62
61        62
61        62
62        63
61        62
60        61
62        63
60        61
61        62

The group means are:

Shoes   Mean
No      60.9
Yes     61.9

As expected, we see that the average height is 1 inch taller when subjects wear shoes vs. do not wear shoes. So, the main effect of wearing shoes is to add 1 inch to a person’s height.

We can do the very same thing to find the main effect of hats. Except in this case, we find the average heights in the no hat vs. hat conditions by averaging over the shoe variable.

NoHat   Hat
57.5    63.5
58.5    64.5
58.5    64.5
58.5    64.5
59.5    65.5
58.5    64.5
57.5    63.5
59.5    65.5
57.5    63.5
58.5    64.5

Hat   Mean
No    58.4
Yes   64.4

As expected, we see that the average height is 6 inches taller when the subjects wear a hat vs. do not wear a hat. So, the main effect of wearing hats is to add 6 inches to a person’s height.
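The collapsing procedure is easy to express in code. Here is a minimal numpy sketch using the example data from the tables above:

```python
import numpy as np

# Heights for 10 subjects; columns are the four conditions in the order:
# NoShoes-NoHat, Shoes-NoHat, NoShoes-Hat, Shoes-Hat.
heights = np.array([
    [57, 58, 63, 64],
    [58, 59, 64, 65],
    [58, 59, 64, 65],
    [58, 59, 64, 65],
    [59, 60, 65, 66],
    [58, 59, 64, 65],
    [57, 58, 63, 64],
    [59, 60, 65, 66],
    [57, 58, 63, 64],
    [58, 59, 64, 65],
])

print(heights.mean(axis=0))  # condition means: [57.9, 58.9, 63.9, 64.9]

# Main effect of shoes: collapse (average) over the hat factor.
no_shoes = heights[:, [0, 2]].mean()  # 60.9
shoes = heights[:, [1, 3]].mean()     # 61.9
print(shoes - no_shoes)               # 1.0 inch

# Main effect of hats: collapse (average) over the shoe factor.
no_hat = heights[:, [0, 1]].mean()    # 58.4
hat = heights[:, [2, 3]].mean()       # 64.4
print(hat - no_hat)                   # 6.0 inches
```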

Instead of using tables to show the data, let’s use some bar graphs. First, we will plot the average heights in all four conditions.


Fig. 5.6 Means from our experiment involving hats and shoes. ¶

Some questions to ask yourself are (1) can you identify the main effect of wearing shoes in the figure, and (2) can you identify the main effect of wearing hats in the figure? Both of these main effects can be seen in the figure, but they aren’t fully clear. You have to do some visual averaging.

Perhaps the clearest is the main effect of wearing a hat. The red bars show the conditions where people wear hats, and the green bars show the conditions where people do not wear hats. For both levels of the wearing shoes variable, the red bars are higher than the green bars. That is easy enough to see. More specifically, in both cases, wearing a hat adds exactly 6 inches to the height, no more, no less.

Less clear is the main effect of wearing shoes. This is less clear because the effect is smaller so it is harder to see. How to find it? You can look at the red bars first and see that the red bar for no-shoes is slightly smaller than the red bar for shoes. The same is true for the green bars. The green bar for no-shoes is slightly smaller than the green bar for shoes.


Fig. 5.7 Means of our Hat and No-Hat conditions (averaging over the shoe condition). ¶


Fig. 5.8 Means of our Shoe and No-Shoe conditions (averaging over the hat condition). ¶

Data from 2x2 designs are often presented in graphs like the one above. An advantage of these graphs is that they display the means in all four conditions of the design. However, they do not clearly show the two main effects. Someone looking at this graph alone would have to guesstimate the main effects. Or, in addition to the graph of all four conditions, a researcher could present two more graphs, one for each main effect (in practice this is not commonly done, because it takes up space in a journal article, and with practice it becomes second nature to “see” the presence or absence of main effects in graphs showing all of the conditions). If we made a separate graph for the main effect of shoes we should see a difference of 1 inch between conditions. Similarly, if we made a separate graph for the main effect of hats we should see a difference of 6 inches between conditions. Examples of both of those graphs appear above in Figures 5.7 and 5.8.

Why have we been talking about shoes and hats? These independent variables are good examples of variables that are truly independent from one another. Neither one influences the other. For example, shoes with a 1 inch sole will always add 1 inch to a person’s height. This will be true no matter whether they wear a hat or not, and no matter how tall the hat is. In other words, the effect of wearing a shoe does not depend on wearing a hat. More formally, this means that the shoe and hat independent variables do not interact. It would be very strange if they did interact. It would mean that the effect of wearing a shoe on height would depend on wearing a hat. This does not happen in our universe. But in some other imaginary universe, it could mean, for example, that wearing a shoe adds 1 inch to your height when you do not wear a hat, but adds more than 1 inch (or less than 1 inch) when you do wear a hat. This thought experiment will be our entry point into discussing interactions. A take-home message before we begin is that some independent variables (like shoes and hats) do not interact; however, there are many other independent variables that do.

5.3.3. Interactions ¶

Interactions occur when the effect of an independent variable depends on the levels of the other independent variable. As we discussed above, some independent variables are independent from one another and will not produce interactions. However, other combinations of independent variables are not independent from one another, and they produce interactions. Remember, independent variables are always manipulated independently from the measured variable (see the note on independence below), but they are not necessarily independent from each other.

Independence

These ideas can be confusing if you think that the word “independent” refers to the relationship between independent variables. However, the term “independent variable” refers to the relationship between the manipulated variable and the measured variable. Remember, “independent variables” are manipulated independently from the measured variable. Specifically, the levels of any independent variable do not change because we take measurements. Instead, the experimenter changes the levels of the independent variable and then observes possible changes in the measures.

There are many simple examples of two independent variables being dependent on one another to produce an outcome. Consider driving a car. The dependent variable (outcome that is measured) could be how far the car can drive in 1 minute. Independent variable 1 could be gas (has gas vs. no gas). Independent variable 2 could be keys (has keys vs. no keys). This is a 2x2 design, with four conditions.

          Gas         No Gas
Keys      can drive   x
No Keys   x           x

Importantly, the effect of the gas variable on driving depends on the levels of the key variable. Or, to state it in reverse, the effect of the key variable on driving depends on the levels of the gas variable. Finally, in plain English: you need both the keys and the gas to drive. Otherwise, there is no driving.

5.3.4. What makes people hangry? ¶

To continue with more examples, let’s consider an imaginary experiment examining what makes people hangry. You may have been hangry before. It’s when you become highly irritated and angry because you are very hungry… hangry. I will propose an experiment to measure conditions that are required to produce hangriness. The pretend experiment will measure hangriness (we ask people how hangry they are on a scale from 0 to 10, with 10 being most hangry and 0 being not hangry at all). The first independent variable will be time since last meal (1 hour vs. 5 hours), and the second independent variable will be how tired someone is (not tired vs. very tired). I imagine the data could look something like the following bar graph.


Fig. 5.9 Means from our study of hangriness. ¶

The graph shows clear evidence of two main effects, and an interaction. There is a main effect of time since last meal: both of the bars in the 1 hour conditions have smaller hangriness ratings than both of the bars in the 5 hour conditions. There is a main effect of being tired: both of the bars in the “not tired” conditions are smaller than both of the bars in the “tired” conditions. What about the interaction?

Remember, an interaction occurs when the effect of one independent variable depends on the level of the other independent variable. We can look at this two ways, and either way shows the presence of the very same interaction. First, does the effect of being tired depend on the levels of time since last meal? Yes. Look first at the effect of being tired only in the 1 hour conditions: the tired bar is 1 unit higher than the not-tired bar, so there is an effect of 1 unit of being tired in the 1 hour condition. Next, look at the effect of being tired only in the 5 hour conditions: the tired bar is 3 units higher than the not-tired bar, so there is an effect of 3 units of being tired in the 5 hour condition. Clearly, the size of the effect of being tired depends on the level of the time since last meal variable. We call this an interaction.

The second way of looking at the interaction is to start with the other variable. Does the effect of time since last meal depend on the levels of the tired variable? Again, yes. Look first at the effect of time since last meal only in the not-tired conditions: the bar in the 1 hour condition is 1 unit smaller than the bar in the 5 hour condition. Next, look at the effect of time since last meal only in the tired conditions: the bar in the 1 hour condition is 3 units smaller than the bar in the 5 hour condition. Again, the size of the effect of time since last meal depends on the level of the tired variable. No matter which way you look at the interaction, we get the same number for the size of the interaction effect, which is 2 units (i.e., the difference between 3 and 1). The interaction suggests that something special happens when people are tired and haven’t eaten in 5 hours: they can become very hangry. In the other conditions, there are only small increases in being hangry.
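To see that both ways of computing the interaction give the same answer, here is a small sketch with hypothetical means chosen to be consistent with the effects described above (the exact values are invented):

```python
# Hypothetical hangriness means (0-10 scale), consistent with the description of Figure 5.9.
means = {
    ("1 hour", "not tired"): 2,
    ("1 hour", "tired"): 3,
    ("5 hours", "not tired"): 3,
    ("5 hours", "tired"): 6,
}

# Way 1: the effect of being tired at each level of time since last meal.
tired_at_1h = means[("1 hour", "tired")] - means[("1 hour", "not tired")]    # 1
tired_at_5h = means[("5 hours", "tired")] - means[("5 hours", "not tired")]  # 3

# Way 2: the effect of time since last meal at each level of tiredness.
time_not_tired = means[("5 hours", "not tired")] - means[("1 hour", "not tired")]  # 1
time_tired = means[("5 hours", "tired")] - means[("1 hour", "tired")]              # 3

print(tired_at_5h - tired_at_1h)    # 2: the interaction effect
print(time_tired - time_not_tired)  # 2: the same size, either way you slice it
```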

5.3.5. Identifying main effects and interactions ¶

Research findings are often presented to readers using graphs or tables. For example, the very same pattern of data can be displayed in a bar graph, line graph, or table of means. These different formats can make the data look different, even though the pattern in the data is the same. An important skill to develop is the ability to identify the patterns in the data, regardless of the format they are presented in. Some examples of bar and line graphs are presented below, along with an example table of means. Each format displays the same pattern of data.


Fig. 5.10 Data from a 2x2 factorial design summarized in a bar plot. ¶


Fig. 5.11 The same data from above, but instead summarized in a line plot. ¶

After you become comfortable with interpreting data in these different formats, you should be able to quickly identify the pattern of main effects and interactions. For example, you would be able to notice that all of these graphs and tables show evidence for two main effects and one interaction.

As an exercise toward this goal, we will first take a closer look at extracting main effects and interactions from tables. This exercise will show how the condition means are used to calculate the main effects and interactions. Consider the table of condition means below.

         IV1 A   IV1 B
IV2 1    4       5
IV2 2    3       8

5.3.6. Main effects ¶

Main effects are the differences between the means for the levels of a single independent variable. Notice that this table only shows the condition means, so the means for each IV must be calculated. The main effect for IV1 is the comparison between level A and level B, which involves calculating the two column means. The mean for IV1 level A is (4+3)/2 = 3.5. The mean for IV1 level B is (5+8)/2 = 6.5. So the main effect is 3 (6.5 - 3.5). The main effect for IV2 is the comparison between level 1 and level 2, which involves calculating the two row means. The mean for IV2 level 1 is (4+5)/2 = 4.5. The mean for IV2 level 2 is (3+8)/2 = 5.5. So the main effect is 1 (5.5 - 4.5). Computing the average for each level of one independent variable always involves collapsing, or averaging over, all of the conditions from the other variables.

5.3.7. Interactions ¶

Interactions ask whether the effect of one independent variable depends on the levels of the other independent variables. This question is answered by computing difference scores between the condition means. For example, we look at the effect of IV1 (A vs. B) for both levels of IV2. Focus first on the condition means in the first row, for IV2 level 1. We see that A=4 and B=5, so the effect of IV1 here was 5-4 = 1. Next, look at the condition means in the second row, for IV2 level 2. We see that A=3 and B=8, so the effect of IV1 here was 8-3 = 5. We have just calculated two differences (5-4=1 and 8-3=5). These difference scores show that the size of the IV1 effect was different across the levels of IV2. To calculate the interaction effect we simply find the difference between the difference scores, 5-1=4. In general, if the difference between the difference scores is not zero, then there is an interaction effect.
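The same arithmetic can be checked in a few lines of code (a minimal numpy sketch of the table above):

```python
import numpy as np

# Condition means from the table above: rows are IV2 levels (1, 2),
# columns are IV1 levels (A, B).
means = np.array([[4.0, 5.0],
                  [3.0, 8.0]])

main_effect_iv1 = means[:, 1].mean() - means[:, 0].mean()  # 6.5 - 3.5 = 3.0
main_effect_iv2 = means[1, :].mean() - means[0, :].mean()  # 5.5 - 4.5 = 1.0

# Interaction: the difference between the two difference scores.
effect_iv1_at_level1 = means[0, 1] - means[0, 0]  # 5 - 4 = 1
effect_iv1_at_level2 = means[1, 1] - means[1, 0]  # 8 - 3 = 5
interaction = effect_iv1_at_level2 - effect_iv1_at_level1  # 4

print(main_effect_iv1, main_effect_iv2, interaction)  # 3.0 1.0 4.0
```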

5.3.8. Example bar graphs ¶


Fig. 5.12 Four patterns that could be observed in a 2x2 factorial design. ¶

The IV1 graph shows a main effect only for IV1 (both red and green bars are lower for level 1 than level 2). The IV1&IV2 graph shows main effects for both variables: the two bars on the left are both lower than the two on the right, and the red bars are both lower than the green bars. The IV1xIV2 graph shows an example of a classic crossover interaction. Here, there are no main effects, just an interaction: there is a difference of 2 between the green and red bars for level 1 of IV1, and a difference of -2 for level 2 of IV1, which makes the difference between the differences 4. Why are there no main effects? The average of the red bars equals the average of the green bars, so there is no main effect for IV2. And the average of the red and green bars for level 1 of IV1 equals the average of the red and green bars for level 2 of IV1, so there is no main effect for IV1. The bar graph for IV2 shows only a main effect for IV2, as the red bars are both lower than the green bars.

5.3.9. Example line graphs ¶

You may find that the patterns of main effects and interactions look different depending on the visual format of the graph. The exact same patterns of data plotted above in bar graph format are plotted below as line graphs for your viewing pleasure. Note that for the IV1 graph, the red line does not appear because it is hidden behind the green line (the points for both are identical).


Fig. 5.13 Four patterns that could be observed in a 2x2 factorial design, now depicted using line plots. ¶

5.3.10. Interpreting main effects and interactions

The presence of an interaction, particularly a strong interaction, can sometimes make it challenging to interpret main effects. For example, take a look at Figure 5.14, which depicts a very strong interaction.

Fig. 5.14 A clear interaction effect. But what about the main effects?

In Figure 5.14, IV2 has no effect under level 1 of IV1 (the red and green bars are the same). IV2 has a large effect under level 2 of IV1 (the red bar is 2 and the green bar is 9). So, the interaction effect is a total of 7. Are there any main effects? Yes, there are. Consider the main effect for IV1. The mean for level 1 is (2+2)/2 = 2, and the mean for level 2 is (2+9)/2 = 5.5. There is a difference between the means of 3.5, which is consistent with a main effect. Consider the main effect for IV2. The mean for level 1 is again (2+2)/2 = 2, and the mean for level 2 is again (2+9)/2 = 5.5. Again, there is a difference between the means of 3.5, which is consistent with a main effect. However, it may seem somewhat misleading to say that our manipulation of IV1 influenced the DV. Why? Well, it only seemed to have this influence half the time. The same is true for our manipulation of IV2. For this reason, we often say that the presence of interactions qualifies our main effects. In other words, there are two main effects here, but they must be interpreted in light of the fact that we also have an interaction.

The example in Figure 5.15 shows a case in which it is probably a bit more straightforward to interpret both the main effects and the interaction.

Fig. 5.15 Perhaps the main effects are more straightforward to interpret in this example.

Can you spot the interaction right away? The difference between red and green bars is small for level 1 of IV1, but large for level 2. The differences between the differences are different, so there is an interaction. But, we also see clear evidence of two main effects. For example, both the red and green bars for IV1 level 1 are higher than IV1 Level 2. And, both of the red bars (IV2 level 1) are higher than the green bars (IV2 level 2).

5.4. Complex Correlational Designs

5.5. Learning objectives

Explain why researchers use complex correlational designs.

Create and interpret a correlation matrix.

Describe how researchers can use correlational research to explore causal relationships among variables—including the limits of this approach.

As we have already seen, researchers conduct correlational studies rather than experiments when they are interested in noncausal relationships or when they are interested in variables that cannot be manipulated for practical or ethical reasons. In this section, we look at some approaches to complex correlational research that involve measuring several variables and assessing the relationships among them.

5.5.1. Correlational Studies With Factorial Designs

We have already seen that factorial experiments can include manipulated independent variables or a combination of manipulated and non-manipulated independent variables. But factorial designs can also consist exclusively of non-manipulated independent variables, in which case they are no longer experiments but correlational studies. Consider a hypothetical study in which a researcher measures participants' mood and self-esteem, and then also measures their willingness to have unprotected sexual intercourse. This study can be conceptualized as a 2 x 2 factorial design with mood (positive vs. negative) and self-esteem (high vs. low) as between-subjects factors. Willingness to have unprotected sex is the dependent variable. This design can be represented in a factorial design table and the results in a bar graph of the sort we have already seen. The researcher would consider the main effect of mood, the main effect of self-esteem, and the interaction between these two independent variables.

Again, because neither independent variable in this example was manipulated, it is a correlational study rather than an experiment (the study by MacDonald and Martineau [MM02] was similar, but was an experiment because they manipulated their participants’ moods). This is important because, as always, one must be cautious about inferring causality from correlational studies because of the directionality and third-variable problems. For example, a main effect of participants’ moods on their willingness to have unprotected sex might be caused by any other variable that happens to be correlated with their moods.

5.5.2. Assessing Relationships Among Multiple Variables

Most complex correlational research, however, does not fit neatly into a factorial design. Instead, it involves measuring several variables, often both categorical and quantitative, and then assessing the statistical relationships among them. For example, researchers Nathan Radcliffe and William Klein studied a sample of middle-aged adults to see how their level of optimism (measured by using a short questionnaire called the Life Orientation Test) was related to several other heart-health-related variables [RK02]. These included health, knowledge of heart attack risk factors, and beliefs about their own risk of having a heart attack. They found that more optimistic participants were healthier (e.g., they exercised more and had lower blood pressure), knew more about heart attack risk factors, and correctly believed their own risk to be lower than that of their peers.

This approach is often used to assess the validity of new psychological measures. For example, when John Cacioppo and Richard Petty created their Need for Cognition Scale, a measure of the extent to which people like to think and value thinking, they used it to measure the need for cognition for a large sample of college students along with three other variables: intelligence, socially desirable responding (the tendency to give what one thinks is the “appropriate” response), and dogmatism [CP82]. The results of this study are summarized in Figure 5.16, which is a correlation matrix showing the correlation (Pearson’s \(r\)) between every possible pair of variables in the study.

Fig. 5.16 Correlation matrix showing correlations among need for cognition and three other variables based on research by Cacioppo and Petty (1982). Only half the matrix is filled in because the other half would contain exactly the same information. Also, because the correlation between a variable and itself is always \(r=1.0\), these values are replaced with dashes throughout the matrix.

For example, the correlation between the need for cognition and intelligence was \(r=.39\) , the correlation between intelligence and socially desirable responding was \(r=.02\) , and so on. In this case, the overall pattern of correlations was consistent with the researchers’ ideas about how scores on the need for cognition should be related to these other constructs.
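To show how such a matrix is computed in practice, here is a short Python sketch using pandas. The data are simulated stand-ins (not Cacioppo and Petty's actual scores), and the correlation strengths are only loosely patterned after the published values:

```python
# Simulated stand-in data; DataFrame.corr() computes Pearson's r for every pair.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 200
intelligence = rng.normal(100, 15, n)
need_for_cognition = 0.4 * (intelligence - 100) / 15 + rng.normal(0, 1, n)
socially_desirable = rng.normal(0, 1, n)   # roughly uncorrelated with the rest
dogmatism = -0.3 * need_for_cognition + rng.normal(0, 1, n)

df = pd.DataFrame({
    "need_for_cognition": need_for_cognition,
    "intelligence": intelligence,
    "socially_desirable": socially_desirable,
    "dogmatism": dogmatism,
})
print(df.corr().round(2))  # the full (symmetric) correlation matrix
```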

When researchers study relationships among a large number of conceptually similar variables, they often use a complex statistical technique called factor analysis. In essence, factor analysis organizes the variables into a smaller number of clusters, such that they are strongly correlated within each cluster but weakly correlated between clusters. Each cluster is then interpreted as multiple measures of the same underlying construct. These underlying constructs are also called “factors.” For example, when people perform a wide variety of mental tasks, factor analysis typically organizes them into two main factors—one that researchers interpret as mathematical intelligence (arithmetic, quantitative estimation, spatial reasoning, and so on) and another that they interpret as verbal intelligence (grammar, reading comprehension, vocabulary, and so on). The Big Five personality factors have been identified through factor analyses of people’s scores on a large number of more specific traits. For example, measures of warmth, gregariousness, activity level, and positive emotions tend to be highly correlated with each other and are interpreted as representing the construct of extraversion. As a final example, researchers Peter Rentfrow and Samuel Gosling asked more than 1,700 university students to rate how much they liked 14 different popular genres of music [RG03] . They then submitted these 14 variables to a factor analysis, which identified four distinct factors. The researchers called them Reflective and Complex (blues, jazz, classical, and folk), Intense and Rebellious (rock, alternative, and heavy metal), Upbeat and Conventional (country, soundtrack, religious, pop), and Energetic and Rhythmic (rap/hip-hop, soul/funk, and electronica).
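For readers curious about the computational side, below is a hedged sketch using scikit-learn's FactorAnalysis. The ratings matrix is a random placeholder with the same shape as the Rentfrow and Gosling data (participants by 14 genres); real ratings would be needed to recover their four factors:

```python
# Exploratory factor analysis on a placeholder ratings matrix.
import numpy as np
from sklearn.decomposition import FactorAnalysis

X = np.random.default_rng(1).normal(size=(1700, 14))  # stand-in for genre ratings

fa = FactorAnalysis(n_components=4, random_state=0).fit(X)
loadings = fa.components_  # shape (4 factors, 14 genres)
# Researchers inspect which variables load strongly on each factor and then
# label the factors (e.g., "Reflective and Complex").
print(loadings.shape)
```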

Two additional points about factor analysis are worth making here. One is that factors are not categories. Factor analysis does not tell us that people are either extraverted or conscientious or that they like either “reflective and complex” music or “intense and rebellious” music. Instead, factors are constructs that operate independently of each other. So people who are high in extraversion might be high or low in conscientiousness, and people who like reflective and complex music might or might not also like intense and rebellious music. The second point is that factor analysis reveals only the underlying structure of the variables. It is up to researchers to interpret and label the factors and to explain the origin of that particular factor structure. For example, one reason that extraversion and the other Big Five operate as separate factors is that they appear to be controlled by different genes [PDMM08] .

5.5.3. Exploring Causal Relationships


Another important use of complex correlational research is to explore possible causal relationships among variables. This might seem surprising given that “correlation does not imply causation”. It is true that correlational research cannot unambiguously establish that one variable causes another. Complex correlational research, however, can often be used to rule out other plausible interpretations.

The primary way of doing this is through the statistical control of potential third variables. Instead of controlling these variables by random assignment or by holding them constant as in an experiment, the researcher measures them and includes them in the statistical analysis. Consider some research by Paul Piff and his colleagues, who hypothesized that being lower in socioeconomic status (SES) causes people to be more generous [PKCote+10] . They measured their participants’ SES and had them play the “dictator game.” They told participants that each would be paired with another participant in a different room. (In reality, there was no other participant.) Then they gave each participant 10 points (which could later be converted to money) to split with the “partner” in whatever way he or she decided. Because the participants were the “dictators,” they could even keep all 10 points for themselves if they wanted to.

As these researchers expected, participants who were lower in SES tended to give away more of their points than participants who were higher in SES. This is consistent with the idea that being lower in SES causes people to be more generous. But there are also plausible third variables that could explain this relationship. It could be, for example, that people who are lower in SES tend to be more religious and that it is their greater religiosity that causes them to be more generous. Or it could be that people who are lower in SES tend to come from certain ethnic groups that emphasize generosity more than other ethnic groups. The researchers dealt with these potential third variables, however, by measuring them and including them in their statistical analyses. They found that neither religiosity nor ethnicity was correlated with generosity and were therefore able to rule them out as third variables. This does not prove that SES causes greater generosity because there could still be other third variables that the researchers did not measure. But by ruling out some of the most plausible third variables, the researchers made a stronger case for SES as the cause of the greater generosity.

Many studies of this type use a statistical technique called multiple regression. This involves measuring several independent variables (\(X_1, X_2, X_3, \dots, X_i\)), all of which are possible causes of a single dependent variable (\(Y\)). The result of a multiple regression analysis is an equation that expresses the dependent variable as an additive combination of the independent variables. This regression equation has the following general form:

\(Y = b_1X_1 + b_2X_2 + b_3X_3 + \dots + b_iX_i\)

The quantities \(b_1\), \(b_2\), and so on are regression weights that indicate how large a contribution an independent variable makes, on average, to the dependent variable. Specifically, they indicate how much the dependent variable changes for each one-unit change in the independent variable, with the other independent variables held constant.

The advantage of multiple regression is that it can show whether an independent variable makes a contribution to a dependent variable over and above the contributions made by other independent variables. As a hypothetical example, imagine that a researcher wants to know how the independent variables of income and health relate to the dependent variable of happiness. This is tricky because income and health are themselves related to each other. Thus if people with greater incomes tend to be happier, then perhaps this is only because they tend to be healthier. Likewise, if people who are healthier tend to be happier, perhaps this is only because they tend to make more money. But a multiple regression analysis including both income and health as independent variables would show whether each one makes a contribution to happiness when the other is taken into account. Research like this, by the way, has shown that both income and health make extremely small contributions to happiness except in the case of severe poverty or illness [Die00].
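Here is a brief Python sketch of that idea using statsmodels, with simulated income, health, and happiness scores (all variable names and numbers are illustrative, not the published findings):

```python
# Multiple regression with two correlated predictors.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 500
income = rng.normal(50, 10, n)                # hypothetical income scores
health = 0.5 * income + rng.normal(0, 8, n)   # correlated with income by construction
happiness = 0.02 * income + 0.03 * health + rng.normal(0, 1, n)

X = sm.add_constant(np.column_stack([income, health]))  # intercept, X1, X2
fit = sm.OLS(happiness, X).fit()
print(fit.params)  # b0 (intercept), b1 (income), b2 (health)
# Each slope estimates the contribution of one predictor with the other
# predictor statistically taken into account.
```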

The examples discussed in this section only scratch the surface of how researchers use complex correlational research to explore possible causal relationships among variables. It is important to keep in mind, however, that purely correlational approaches cannot unambiguously establish that one variable causes another. The best they can do is show patterns of relationships that are consistent with some causal interpretations and inconsistent with others.

5.5.4. Key Takeaways

Researchers often use complex correlational research to explore relationships among several variables in the same study.

Complex correlational research can be used to explore possible causal relationships among variables using techniques such as multiple regression. Such designs can show patterns of relationships that are consistent with some causal interpretations and inconsistent with others, but they cannot unambiguously establish that one variable causes another.

5.5.5. Exercises

Practice: Construct a correlation matrix for a hypothetical study including the variables of depression, anxiety, self-esteem, and happiness. Include the Pearson’s r values that you would expect.

Discussion: Imagine a correlational study that looks at intelligence, the need for cognition, and high school students’ performance in a critical-thinking course. A multiple regression analysis shows that intelligence is not related to performance in the class but that the need for cognition is. Explain what this study has shown in terms of what causes good performance in the critical-thinking course.


Factorial Designs

Setting Up a Factorial Experiment

Learning Objectives

  • Explain why researchers often include multiple independent variables in their studies.
  • Define factorial design, and use a factorial design table to represent and interpret simple factorial designs.

Just as it is common for studies in psychology to include multiple levels of a single independent variable (placebo, new drug, old drug), it is also common for them to include multiple independent variables. Schnall and her colleagues studied the effect of both disgust and private body consciousness in the same study. Researchers’ inclusion of multiple independent variables in one experiment is further illustrated by the following actual titles from various professional journals:

  • The Effects of Temporal Delay and Orientation on Haptic Object Recognition
  • Opening Closed Minds: The Combined Effects of Intergroup Contact and Need for Closure on Prejudice
  • Effects of Expectancies and Coping on Pain-Induced Intentions to Smoke
  • The Effect of Age and Divided Attention on Spontaneous Recognition
  • The Effects of Reduced Food Size and Package Size on the Consumption Behavior of Restrained and Unrestrained Eaters

Just as including multiple levels of a single independent variable allows one to answer more sophisticated research questions, so too does including multiple independent variables in the same experiment. For example, instead of conducting one study on the effect of disgust on moral judgment and another on the effect of private body consciousness on moral judgment, Schnall and colleagues were able to conduct one study that addressed both questions. But including multiple independent variables also allows the researcher to answer questions about whether the effect of one independent variable depends on the level of another. This is referred to as an interaction between the independent variables. Schnall and her colleagues, for example, observed an interaction between disgust and private body consciousness because the effect of disgust depended on whether participants were high or low in private body consciousness. As we will see, interactions are often among the most interesting results in psychological research.

By far the most common approach to including multiple independent variables (which are often called factors) in an experiment is the factorial design. In a  factorial design , each level of one independent variable is combined with each level of the others to produce all possible combinations. Each combination, then, becomes a condition in the experiment. Imagine, for example, an experiment on the effect of cell phone use (yes vs. no) and time of day (day vs. night) on driving ability. This is shown in the  factorial design table  in Figure 9.1. The columns of the table represent cell phone use, and the rows represent time of day. The four cells of the table represent the four possible combinations or conditions: using a cell phone during the day, not using a cell phone during the day, using a cell phone at night, and not using a cell phone at night. This particular design is referred to as a 2 × 2 (read “two-by-two”) factorial design because it combines two variables, each of which has two levels.

If one of the independent variables had a third level (e.g., using a handheld cell phone, using a hands-free cell phone, and not using a cell phone), then it would be a 3 × 2 factorial design, and there would be six distinct conditions. Notice that the number of possible conditions is the product of the numbers of levels. A 2 × 2 factorial design has four conditions, a 3 × 2 factorial design has six conditions, a 4 × 5 factorial design would have 20 conditions, and so on. Also notice that each number in the notation represents one factor, one independent variable. So by looking at how many numbers are in the notation, you can determine how many independent variables there are in the experiment. 2 x 2, 3 x 3, and 2 x 3 designs all have two numbers in the notation and therefore all have two independent variables. The numerical value of each of the numbers represents the number of levels of each independent variable. A 2 means that the independent variable has two levels, a 3 means that the independent variable has three levels, a 4 means it has four levels, etc. To illustrate, a 3 x 3 design has two independent variables, each with three levels, while a 2 x 2 x 2 design has three independent variables, each with two levels.
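Because the number of conditions is simply the product of the numbers in the notation, the count can be checked in one line, as in this small Python sketch:

```python
# Conditions in a factorial design = product of the numbers of levels.
from math import prod

for design in [(2, 2), (3, 2), (4, 5), (2, 2, 2, 3)]:
    print(design, "->", prod(design), "conditions")
# (2, 2) -> 4, (3, 2) -> 6, (4, 5) -> 20, (2, 2, 2, 3) -> 24
```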


In principle, factorial designs can include any number of independent variables with any number of levels. For example, an experiment could include the type of psychotherapy (cognitive vs. behavioral), the length of the psychotherapy (2 weeks vs. 2 months), and the sex of the psychotherapist (female vs. male). This would be a 2 × 2 × 2 factorial design and would have eight conditions. Figure 9.2 shows one way to represent this design. In practice, it is unusual for there to be more than three independent variables with more than two or three levels each. This is for at least two reasons: For one, the number of conditions can quickly become unmanageable. For example, adding a fourth independent variable with three levels (e.g., therapist experience: low vs. medium vs. high) to the current example would make it a 2 × 2 × 2 × 3 factorial design with 24 distinct conditions. Second, the number of participants required to populate all of these conditions (while maintaining a reasonable ability to detect a real underlying effect) can render the design unfeasible (for more information, see the discussion about the importance of adequate statistical power in Chapter 13). As a result, in the remainder of this section, we will focus on designs with two independent variables. The general principles discussed here extend in a straightforward way to more complex factorial designs.


Assigning Participants to Conditions

Recall that in a simple between-subjects design, each participant is tested in only one condition. In a simple within-subjects design, each participant is tested in all conditions. In a factorial experiment, the decision to take the between-subjects or within-subjects approach must be made separately for each independent variable. In a  between-subjects factorial design , all of the independent variables are manipulated between subjects. For example, all participants could be tested either while using a cell phone  or  while not using a cell phone and either during the day  or  during the night. This would mean that each participant would be tested in one and only one condition. In a within-subjects factorial design, all of the independent variables are manipulated within subjects. All participants could be tested both while using a cell phone and  while not using a cell phone and both during the day  and  during the night. This would mean that each participant would need to be tested in all four conditions. The advantages and disadvantages of these two approaches are the same as those discussed in Chapter 5. The between-subjects design is conceptually simpler, avoids order/carryover effects, and minimizes the time and effort of each participant. The within-subjects design is more efficient for the researcher and controls extraneous participant variables.

Since factorial designs have more than one independent variable, it is also possible to manipulate one independent variable between subjects and another within subjects. This is called a  mixed factorial design . For example, a researcher might choose to treat cell phone use as a within-subjects factor by testing the same participants both while using a cell phone and while not using a cell phone (while counterbalancing the order of these two conditions). But they might choose to treat time of day as a between-subjects factor by testing each participant either during the day or during the night (perhaps because this only requires them to come in for testing once). Thus each participant in this mixed design would be tested in two of the four conditions.

Regardless of whether the design is between subjects, within subjects, or mixed, the actual assignment of participants to conditions or orders of conditions is typically done randomly.
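As an illustration, here is a minimal Python sketch of random assignment for the 2 × 2 cell phone by time-of-day example (the participant IDs and sample size are hypothetical):

```python
# Random assignment to the four conditions of a 2x2 between-subjects design.
import itertools
import random

random.seed(42)
conditions = list(itertools.product(["cell phone", "no cell phone"],
                                    ["day", "night"]))  # 4 combinations

participants = [f"P{i:02d}" for i in range(1, 21)]  # 20 hypothetical participants
random.shuffle(participants)

# Deal the shuffled participants evenly across the four conditions.
assignment = {p: conditions[i % len(conditions)] for i, p in enumerate(participants)}
for participant, condition in sorted(assignment.items()):
    print(participant, "->", condition)
```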

Non-Manipulated Independent Variables

In many factorial designs, one of the independent variables is a non-manipulated independent variable . The researcher measures it but does not manipulate it. The study by Schnall and colleagues is a good example. One independent variable was disgust, which the researchers manipulated by testing participants in a clean room or a messy room. The other was private body consciousness, a participant variable which the researchers simply measured. Another example is a study by Halle Brown and colleagues in which participants were exposed to several words that they were later asked to recall (Brown, Kosslyn, Delamater, Fama, & Barsky, 1999) [1] . The manipulated independent variable was the type of word. Some were negative health-related words (e.g.,  tumor, coronary ), and others were not health related (e.g.,  election, geometry ). The non-manipulated independent variable was whether participants were high or low in hypochondriasis (excessive concern with ordinary bodily symptoms). The result of this study was that the participants high in hypochondriasis were better than those low in hypochondriasis at recalling the health-related words, but they were no better at recalling the non-health-related words.

Such studies are extremely common, and there are several points worth making about them. First, non-manipulated independent variables are usually participant variables (private body consciousness, hypochondriasis, self-esteem, gender, and so on), and as such, they are by definition between-subjects factors. For example, people are either low in hypochondriasis or high in hypochondriasis; they cannot be tested in both of these conditions. Second, such studies are generally considered to be experiments as long as at least one independent variable is manipulated, regardless of how many non-manipulated independent variables are included. Third, it is important to remember that causal conclusions can only be drawn about the manipulated independent variable. For example, Schnall and her colleagues were justified in concluding that disgust affected the harshness of their participants’ moral judgments because they manipulated that variable and randomly assigned participants to the clean or messy room. But they would not have been justified in concluding that participants’ private body consciousness affected the harshness of their participants’ moral judgments because they did not manipulate that variable. It could be, for example, that having a strict moral code and a heightened awareness of one’s body are both caused by some third variable (e.g., neuroticism). Thus it is important to be aware of which variables in a study are manipulated and which are not.

Non-Experimental Studies With Factorial Designs

Thus far we have seen that factorial experiments can include manipulated independent variables or a combination of manipulated and non-manipulated independent variables. But factorial designs can also include  only non-manipulated independent variables, in which case they are no longer experiments but are instead non-experimental in nature. Consider a hypothetical study in which a researcher simply measures both the moods and the self-esteem of several participants—categorizing them as having either a positive or negative mood and as being either high or low in self-esteem—along with their willingness to have unprotected sexual intercourse. This can be conceptualized as a 2 × 2 factorial design with mood (positive vs. negative) and self-esteem (high vs. low) as non-manipulated between-subjects factors. Willingness to have unprotected sex is the dependent variable.

Again, because neither independent variable in this example was manipulated, it is a non-experimental study rather than an experiment. (The similar study by MacDonald and Martineau [2002] [2]  was an experiment because they manipulated their participants’ moods.) This is important because, as always, one must be cautious about inferring causality from non-experimental studies because of the directionality and third-variable problems. For example, an effect of participants’ moods on their willingness to have unprotected sex might be caused by any other variable that happens to be correlated with their moods.

  • Brown, H. D., Kosslyn, S. M., Delamater, B., Fama, A., & Barsky, A. J. (1999). Perceptual and memory biases for health-related information in hypochondriacal individuals. Journal of Psychosomatic Research, 47, 67–78.
  • MacDonald, T. K., & Martineau, A. M. (2002). Self-esteem, mood, and intentions to use condoms: When does low self-esteem lead to risky health behaviors? Journal of Experimental Social Psychology, 38, 299–306.

Factorial design: Experiments that include more than one independent variable in which each level of one independent variable is combined with each level of the others to produce all possible combinations.

Factorial design table: Shows how each level of one independent variable is combined with each level of the others to produce all possible combinations in a factorial design.

Between-subjects factorial design: A design in which all of the independent variables are manipulated between subjects.

Mixed factorial design: A design which manipulates one independent variable between subjects and another within subjects.

Non-manipulated independent variable: An independent variable that is measured but is not manipulated.

Research Methods in Psychology Copyright © 2019 by Rajiv S. Jhangiani, I-Chant A. Chiang, Carrie Cuttler, & Dana C. Leighton is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.


What is a Factorial Design of an Experiment?

The factorial design of experiment is described with examples in Video 1.

Video 1. Introduction to Factorial Design of Experiment (DOE) and the Main Effect Calculation Explained Example.

In a Factorial Design of Experiment, all possible combinations of the levels of a factor can be studied against all possible levels of other factors. Therefore, the factorial design of experiments is also called the crossed factor design of experiments. Due to the crossed nature of the levels, the factorial design of experiments can also be called the completely randomized design (CRD) of experiments. Therefore, the proper name for the factorial design of experiments would be completely randomized factorial design of experiments.

In an easy-to-understand study of human comfort, two levels of the temperature factor (or independent variable), 0 °F and 75 °F, and two levels of the humidity factor, 0% and 35%, were studied with all possible combinations (Figure 1). Therefore, the four (2×2) possible treatment combinations and their associated responses from human subjects (experimental units) are provided in Table 1.

Table 1. Data Structure/Layout of a Factorial Design of Experiment


Coding Systems for the Factor Levels in the Factorial Design of Experiment

As the factorial design is primarily used for screening variables, only two levels are enough. Often, coding the levels as (1) low/high, (2) -/+, (3) -1/+1, or (4) 0/1 is more convenient and meaningful than the actual levels of the factors, especially for the design and analysis of factorial experiments. These coding systems are particularly useful in developing the methods in factorial and fractional factorial designs of experiments. Moreover, general formulas and methods can only be developed by utilizing a coding system. Coding systems are also useful in response surface methodology. Often, coded levels produce smooth, meaningful, and easy-to-understand contour plots and response surfaces. Moreover, especially in complex designs, coded levels such as the low and high levels of a factor are easier to understand.

How to graphically represent the design?

An example graphical representation of a factorial design of experiment is provided in Figure 1.


Figure 1. Factorial Design of Experiments with two levels for each factor (independent variable, x). The response (dependent variable, y) is shown using the solid black circle with the associated response values.



Factorial Designs

A Simple Example

Probably the easiest way to begin understanding factorial designs is by looking at an example. Let’s imagine a design where we have an educational program where we would like to look at a variety of program variations to see which works best. For instance, we would like to vary the amount of time the children receive instruction with one group getting 1 hour of instruction per week and another getting 4 hours per week. And, we’d like to vary the setting with one group getting the instruction in-class (probably pulled off into a corner of the classroom) and the other group being pulled-out of the classroom for instruction in another room. We could think about having four separate groups to do this, but when we are varying the amount of time in instruction, what setting would we use: in-class or pull-out? And, when we were studying setting, what amount of instruction time would we use: 1 hour, 4 hours, or something else?

With factorial designs, we don’t have to compromise when answering these questions. We can have it both ways if we cross each of our two time in instruction conditions with each of our two settings. Let’s begin by doing some defining of terms. In factorial designs, a factor is a major independent variable. In this example we have two factors: time in instruction and setting. A level is a subdivision of a factor. In this example, time in instruction has two levels and setting has two levels. Sometimes we depict a factorial design with a numbering notation. In this example, we can say that we have a 2 x 2 (spoken “two-by-two”) factorial design. In this notation, the number of numbers tells you how many factors there are and the number values tell you how many levels. If I said I had a 3 x 4 factorial design, you would know that I had 2 factors and that one factor had 3 levels while the other had 4. Order of the numbers makes no difference and we could just as easily term this a 4 x 3 factorial design. The number of different treatment groups that we have in any factorial design can easily be determined by multiplying through the number notation. For instance, in our example we have 2 x 2 = 4 groups. In our notational example, we would need 3 x 4 = 12 groups.

We can also depict a factorial design in design notation. Because of the treatment level combinations, it is useful to use subscripts on the treatment (X) symbol. We can see in the figure that there are four groups, one for each combination of levels of factors. It is also immediately apparent that the groups were randomly assigned and that this is a posttest-only design.

Now, let’s look at a variety of different results we might get from this simple 2 x 2 factorial design. Each of the following figures describes a different possible outcome. And each outcome is shown in table form (the 2 x 2 table with the row and column averages) and in graphic form (with each factor taking a turn on the horizontal axis). You should convince yourself that the information in the tables agrees with the information in both of the graphs. You should also convince yourself that the pair of graphs in each figure show the exact same information graphed in two different ways. The lines that are shown in the graphs are technically not necessary – they are used as a visual aid to enable you to easily track where the averages for a single level go across levels of another factor. Keep in mind that the values shown in the tables and graphs are group averages on the outcome variable of interest. In this example, the outcome might be a test of achievement in the subject being taught. We will assume that scores on this test range from 1 to 10 with higher values indicating greater achievement. You should study carefully the outcomes in each figure in order to understand the differences between these cases.

The Null Outcome

Let’s begin by looking at the “null” case. The null case is a situation where the treatments have no effect. This figure assumes that even if we didn’t give the training we could expect that students would score a 5 on average on the outcome test. You can see in this hypothetical case that all four groups score an average of 5 and therefore the row and column averages must be 5. You can’t see the lines for both levels in the graphs because one line falls right on top of the other.

The Main Effects

A main effect is an outcome that is a consistent difference between levels of a factor. For instance, we would say there’s a main effect for setting if we find a statistical difference between the averages for the in-class and pull-out groups, at all levels of time in instruction. The first figure depicts a main effect of time. For all settings, the 4 hour/week condition worked better than the 1 hour/week one. It is also possible to have a main effect for setting (and none for time).

In the second main effect graph we see that in-class training was better than pull-out training for all amounts of time.

Finally, it is possible to have a main effect on both variables simultaneously as depicted in the third main effect figure. In this instance 4 hours/week always works better than 1 hour/week and in-class setting always works better than pull-out.

Interaction Effects

If we could only look at main effects, factorial designs would be useful. But, because of the way we combine levels in factorial designs, they also enable us to examine the interaction effects that exist between factors. An interaction effect exists when differences on one factor depend on the level you are on another factor. It’s important to recognize that an interaction is between factors, not levels. We wouldn’t say there’s an interaction between 4 hours/week and in-class treatment. Instead, we would say that there’s an interaction between time and setting, and then we would go on to describe the specific levels involved.

How do you know if there is an interaction in a factorial design? There are three ways you can determine there’s an interaction. First, when you run the statistical analysis, the statistical table will report on all main effects and interactions. Second, you know there’s an interaction when you can’t talk about the effect of one factor without mentioning the other factor. If you can say at the end of your study that time in instruction makes a difference, then you know that you have a main effect and not an interaction (because you did not have to mention the setting factor when describing the results for time). On the other hand, when you have an interaction it is impossible to describe your results accurately without mentioning both factors. Finally, you can always spot an interaction in the graphs of group means – whenever there are lines that are not parallel there is an interaction present! If you check out the main effect graphs above, you will notice that all of the lines within a graph are parallel. In contrast, for all of the interaction graphs, you will see that the lines are not parallel.
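To see the graphical check in code, here is a small matplotlib sketch with made-up group means for the time-by-setting example; because the two lines are parallel, it depicts a main effect with no interaction:

```python
# Plot hypothetical group means; parallel lines indicate no interaction.
import matplotlib.pyplot as plt

hours = ["1 hr/week", "4 hrs/week"]
in_class = [5, 7]   # made-up means: a main effect of time only
pull_out = [4, 6]

plt.plot(hours, in_class, marker="o", label="in-class")
plt.plot(hours, pull_out, marker="o", label="pull-out")
plt.ylabel("Mean achievement (1-10)")
plt.legend()
plt.title("Parallel lines: no interaction")
plt.show()
```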

In the first interaction effect graph, we see that one combination of levels – 4 hours/week and in-class setting – does better than the other three. In the second interaction effect graph we have a more complex “cross-over” interaction. Here, at 1 hour/week the pull-out group does better than the in-class group, while at 4 hours/week the reverse is true. Furthermore, both of these combinations of levels do equally well.

Factorial design has several important features. First, it has great flexibility for exploring or enhancing the “signal” (treatment) in our studies. Whenever we are interested in examining treatment variations, factorial designs should be strong candidates as the designs of choice. Second, factorial designs are efficient. Instead of conducting a series of independent studies we are effectively able to combine these studies into one. Finally, factorial designs are the only effective way to examine interaction effects.

So far, we have only looked at a very simple 2 x 2 factorial design structure. You may want to look at some factorial design variations to get a deeper understanding of how they work. You may also want to examine how we approach the statistical analysis of factorial experimental designs .


5.1 - Factorial Designs with Two Treatment Factors

For now we will just consider two treatment factors of interest. It looks almost the same as the randomized block design model only now we are including an interaction term:

\(Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + e_{ijk}\)

where \(i = 1, \dots, a, j = 1, \dots, b, \text{ and } k = 1, \dots, n\). Thus we have two factors in a factorial structure with n observations per cell. As usual, we assume the \(e_{ijk} \sim N(0, \sigma^2)\), i.e., independently and identically distributed with the normal distribution. Although it looks like a multiplication, the interaction term need not imply multiplicative interaction.

The Effects Model vs. the Means Model

The cell means model is written:

\(Y_{ijk}=\mu_{ij} + e_{ijk}\)

Here the cell means are: \(\mu_{11}, \dots , \mu_{1b}, \dots , \mu_{a1} \dots \mu_{ab}\). Therefore we have \(a \times b\) cell means, \(\mu_{ij}\). We will define our marginal means as the simple average over our cell means as shown below:

\(\bar{\mu}_{i.}=\frac{1}{b} \sum\limits_j \mu_{ij}\), \(\bar{\mu}_{.j}=\frac{1}{a} \sum\limits_i \mu_{ij}\)

From the cell means structure we can talk about marginal means and row and column means. But first we want to look at the effects model and define more carefully what the interactions are. We can write the cell means in terms of the full effects model:

\(\mu_{ij} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij}\)

It follows that the interaction terms \((\alpha \beta)_{ij}\) are defined as the difference between our cell means and the additive portion of the model:

\((\alpha\beta)_{ij} = \mu_{ij} - (\mu + \alpha_i + \beta_j) \)

If the true model structure is additive, then the interaction terms \((\alpha \beta)_{ij}\) are equal to zero. Then we can say that the true cell means, \(\mu_{ij} = (\mu + \alpha_i + \beta_j)\), have additive structure.

Example 1

Let's illustrate this by considering the true means \(\mu_{ij} \colon\)

\(\mu_{ij}\)              B = 1    B = 2    \(\bar{\mu}_{i.}\)    \(\alpha_i\)
A = 1                     5        11       8                    -2
A = 2                     9        15       12                    2
\(\bar{\mu}_{.j}\)        7        13       10
\(\beta_j\)              -3         3

Note that both a and b are 2, thus our marginal row means are 8 and 12, and our marginal column means are 7 and 13. Next, let's calculate the \(\alpha\) and the \(\beta\) effects; since the overall mean is 10, our \(\alpha\) effects are -2 and 2 (which sum to 0), and our \(\beta\) effects are -3 and 3 (which also sum to 0). If you plot the cell means you get two lines that are parallel.

The difference between the two means at the first \(\beta\) factor level is 9 - 5 = 4. The difference between the means for the second \(\beta\) factor level is 15 - 11 = 4. We can say that the effect of \(\alpha\) at the first level of \(\beta\) is the same as the effect of \(\alpha\) at the second level of \(\beta\). Therefore we say that there is no interaction and as we will see the interaction terms are equal to 0.

This example simply illustrates that the cell means, in this case, have additive structure. A problem with data that we actually look at is that you do not know in advance whether the effects are additive or not. Because of random error, the interaction terms are seldom exactly zero. You may be involved in a situation that is either additive or non-additive, and the first task is to decide between them.

Now consider the non-additive case. We illustrate this with Example 2 which follows.

Example 2

This example was constructed so that the marginal means and the overall means are the same as in Example 1. However, it does not have additive structure.

Using the definition of interaction:

\((\alpha \beta)_{ij} = \mu_{ij} - (\mu + \alpha_i + \beta_j)\)

which gives us \((\alpha \beta)_{ij}\) interaction terms that are -2, 2, 2, -2. Again, by the definition of our interaction effects, these \((\alpha \beta)_{ij}\) terms should sum to zero in both directions.

We generally call the \(\alpha_i\) terms the treatment effects for treatment factor A, the \(\beta_j\) terms the treatment effects for treatment factor B, and the \((\alpha \beta)_{ij}\) terms the interaction effects.
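The decomposition can be verified numerically. The sketch below uses a set of hypothetical cell means chosen to satisfy the stated constraints of Example 2 (the same marginal means as Example 1, with interaction terms -2, 2, 2, -2); the original table for Example 2 is not reproduced here:

```python
# Decompose cell means into grand mean, row effects, column effects,
# and interaction terms.
import numpy as np

mu = np.array([[3.0, 13.0],
               [11.0, 13.0]])        # hypothetical cell means mu_ij

grand = mu.mean()                    # 10.0
alpha = mu.mean(axis=1) - grand      # row (A) effects: [-2, 2]
beta = mu.mean(axis=0) - grand       # column (B) effects: [-3, 3]
additive = grand + alpha[:, None] + beta[None, :]
interaction = mu - additive          # (alpha beta)_ij
print(interaction)                   # [[-2.  2.] [ 2. -2.]]
print(interaction.sum(axis=0), interaction.sum(axis=1))  # zero both ways
```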

The model we have written gives us a way to represent in a mathematical form a two-factor design, whether we use the means model or the effects model, i.e.,

\(Y_{ijk} = \mu_{ij} + e_{ijk}\)

There is really no benefit to the effects model when there is interaction, except that it gives us a mechanism for partitioning the variation due to the two treatments and their interactions. Both models have the same number of distinct parameters. However, when there is no interaction then we can remove the interaction terms from the model and use the reduced additive model.

Now, we'll take a look at the strategy for deciding whether our model fits, whether the assumptions are satisfied and then decide whether we can go forward with an interaction model or an additive model. This is the first decision. When you can eliminate the interactions because they are not significantly different from zero, then you can use the simpler additive model. This should be the goal whenever possible because then you have fewer parameters to estimate, and a simpler structure to represent the underlying scientific process.

Before we get to the analysis, however, we want to introduce another definition of effects - rather than defining the \(\alpha_i\) effects as deviation from the mean, we can look at the difference between the high and the low levels of factor A . These are two different definitions of effects that will be introduced and discussed in this chapter and the next, the \(\alpha_i\) effects and the difference between the high and low levels, which we will generally denote as the A effect.

Factorial Designs with 2 Treatment Factors, cont'd

For a completely randomized design, which is what we discussed for the one-way ANOVA, we need to have n × a × b = N total experimental units available. We randomly assign n of those experimental units to each of the a × b treatment combinations. For the moment we will only consider the model with fixed effects and constant experimental random error.

The model is the same two-factor effects model as above:

\(Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + e_{ijk}\)

where \(i = 1, \dots , a\), \(j = 1, \dots , b\), and \(k = 1, \dots , n\).

Read the text section 5.3.2 for the definitions of the means and the sum of squares.

Testing Hypotheses

We can test the hypotheses that the marginal means are all equal, or in terms of the definition of our effects that the \(\alpha_i\)'s are all equal to zero, and the hypothesis that the \(\beta_j\)'s are all equal to zero. And, we can test the hypothesis that the interaction effects are all equal to zero. The alternative hypotheses are that at least one of those effects is not equal to zero.

How do we do this, in what order, and how do we interpret these tests?

One of the purposes of a factorial design is to be efficient about estimating and testing factors A and B in a single experiment. Often we are primarily interested in the main effects. Sometimes, we are also interested in knowing whether the factors interact. In either case, the first test we should do is the test on the interaction effects.

The Test of \(H_0\): \((\alpha\beta)_{ij}=0\)

If there is interaction and it is significant, i.e., the p-value is less than your chosen cutoff, then what do we do? If the interaction term is significant, that tells us that the effect of A is different at each level of B. Or you can say it the other way: the effect of B differs at each level of A. Therefore, when we have significant interaction, it is not very sensible to even be talking about the main effects of A and B, because these change depending on the level of the other factor. If the interaction is significant, then we want to estimate and focus our attention on the cell means. If the interaction is not significant, then we can test the main effects and focus on the main effect means.
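In practice this test sequence is routine in statistical software. Here is a hedged Python sketch with simulated data, using statsmodels: fit the full two-factor model, then inspect the interaction row of the ANOVA table before the main effects:

```python
# Two-way ANOVA on simulated, balanced 2x2 data (main effect of A only).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(7)
df = pd.DataFrame({
    "A": np.repeat(["a1", "a2"], 20),
    "B": np.tile(np.repeat(["b1", "b2"], 10), 2),
})
df["y"] = rng.normal(0, 1, 40) + np.where(df["A"] == "a2", 1.0, 0.0)

fit = smf.ols("y ~ C(A) * C(B)", data=df).fit()
print(anova_lm(fit))  # check the C(A):C(B) row first, then the main effects
```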

The estimates of the interaction and main effects are given in the text in section 5.3.4.

Note that the estimates of the marginal means for factor A are simply the observed marginal means:

\(\bar{y}_{i..}=\dfrac{1}{bn} \sum\limits_j \sum\limits_k y_{ijk}\), with \(var(\bar{y}_{i..})=\dfrac{\sigma^2}{bn}\)

A similar formula holds for factor B , with

\(var(\bar{y}_{.j.})=\dfrac{\sigma^2}{an}\)

Just the form of these variances tells us something about the efficiency of the two-factor design. A benefit of a two-factor design is that the marginal means have \(n \times b\) replicates for factor A and \(n \times a\) for factor B. The factorial structure, when you do not have interactions, gives us the efficiency benefit of additional replication: the number of observations per cell times the number of levels of the other factor. This benefit arises from factorial experiments rather than from single-factor experiments with n observations per cell. An alternative design choice could have been to do two one-way experiments, one with a treatments and the other with b treatments, each with n observations per cell. However, these two experiments would not have provided the same level of precision, nor the ability to test for interactions.

Another practical question: If the interaction test is not significant what should we do?

Do we remove the interaction term from the model? You might consider dropping that term. If n is very small and your df for error are small, then this may be a critical issue. There is a 'rule of thumb' that I sometimes use in these cases: if the p-value for the interaction test is greater than 0.25, then you can drop the interaction term. This is not an exact cutoff but a general rule. Remember, if you drop the interaction term, then the variation accounted for by SSab becomes part of the error, increasing the SSE; however, your error df also become larger, in some cases enough to increase the power of the tests for the main effects. Statistical theory shows that in general dropping the interaction term increases your false rejection rate for subsequent tests. Hence we usually do not drop nonsignificant terms when there are adequate sample sizes. However, if we are doing an independent experiment with the same factors we might not include interaction in the model for that experiment.

What if n = 1, and we have only 1 observation per cell? If n = 1 then we have 0 df for SSerror and we cannot estimate the error variance with MSE. What should we do in order to test our hypothesis? We obviously cannot perform the test for interaction because we have no error term.

If you are willing to assume, and if it is true, that there is no interaction, then you can use the interaction mean square as your F-test denominator for testing the main effects. This is a fairly safe and conservative thing to do. If the assumption is not true, then the MSab will tend to be larger than it should be, so the F-test is conservative. You are not likely to reject a main effect that is not real; that is, you are unlikely to make a Type I error, but you are more likely to make a Type II error.

Extension to a 3 Factor Model

The factorial model with three factors can be written as:

\(Y_{ijkl} = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha \beta)_{ij} + (\alpha \gamma)_{ik} + (\beta \gamma)_{jk} + (\alpha \beta \gamma)_{ijk} + e_{ijkl}\)

where \(i = 1, \dots , a\), \(j = 1 , \dots , b\), \(k = 1 , \dots , c\), and \(l = 1 , \dots , n\).

We extend the model in the same way. Our analysis of variance has three main effects, three two-way interactions, a three-way interaction and error. If this were conducted as a Completely Randomized Design experiment, each of the a × b × c treatment combinations would be randomly assigned to n of the experimental units.

Sample Size Determination [Section 5.3.5]

We first consider the two-factor case where \(N = a \times b \times n\) (\(n\) = the number of replicates per cell). The non-centrality parameter for calculating sample size for the A factor is:

\(\phi^2 = \dfrac{nb\,D^2}{2a\,\sigma^2}\)

where \(D\) is the difference between the maximum of \(\bar{\mu}_{i.}\) and the minimum of \(\bar{\mu}_{i.}\), and where \(nb\) is the number of observations at each level of factor A.

Actually, at the beginning of our design process, we should decide how many observations we should take if we want to find a difference of D between the maximum and the minimum of the true means for factor A. There is a similar equation for factor B:

\(\phi^{2} = \dfrac{na\,D^2}{2b\,\sigma^2}\)

where na is the number of observations in each level of factor B .

In the two factor case, this is just an extension of what we did in the one-factor case. But now we have the marginal means benefiting from a number of observations per cell and the number of levels of the other factor. In this case, we have n observations per cell, and we have b cells. So, we have nb observations.
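The power calculation implied by these formulas can be carried out with scipy's noncentral F distribution. In the sketch below we take the noncentrality parameter of the F test for factor A to be \(\lambda = a\phi^2\) (a common convention, stated here as an assumption), and all of the design numbers are illustrative:

```python
# Power of the F test for factor A, with phi^2 = n*b*D^2 / (2*a*sigma^2).
from scipy.stats import f, ncf

a, b, n = 3, 2, 4       # levels of A, levels of B, replicates per cell (illustrative)
D, sigma = 2.0, 1.0     # assumed max-min difference of the A means, and error SD

phi2 = (n * b * D**2) / (2 * a * sigma**2)
lam = a * phi2          # assumed noncentrality parameter for the test on A
df1, df2 = a - 1, a * b * (n - 1)

f_crit = f.ppf(0.95, df1, df2)              # critical value at alpha = .05
power = 1 - ncf.cdf(f_crit, df1, df2, lam)
print(f"phi^2 = {phi2:.2f}, power = {power:.2f}")
```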

Logo for Portland State University Pressbooks

Want to create or adapt books like this? Learn more about how Pressbooks supports open publishing practices.

Setting Up a Factorial Experiment

Rajiv S. Jhangiani; I-Chant A. Chiang; Carrie Cuttler; and Dana C. Leighton

Learning Objectives

  • Explain why researchers often include multiple independent variables in their studies.
  • Define factorial design, and use a factorial design table to represent and interpret simple factorial designs.

Just as it is common for studies in psychology to include multiple levels of a single independent variable (placebo, new drug, old drug), it is also common for them to include multiple independent variables. Schnall and her colleagues studied the effect of both disgust and private body consciousness in the same study. Researchers’ inclusion of multiple independent variables in one experiment is further illustrated by the following actual titles from various professional journals:

  • The Effects of Temporal Delay and Orientation on Haptic Object Recognition
  • Opening Closed Minds: The Combined Effects of Intergroup Contact and Need for Closure on Prejudice
  • Effects of Expectancies and Coping on Pain-Induced Intentions to Smoke
  • The Effect of Age and Divided Attention on Spontaneous Recognition
  • The Effects of Reduced Food Size and Package Size on the Consumption Behavior of Restrained and Unrestrained Eaters

Just as including multiple levels of a single independent variable allows one to answer more sophisticated research questions, so too does including multiple independent variables in the same experiment. For example, instead of conducting one study on the effect of disgust on moral judgment and another on the effect of private body consciousness on moral judgment, Schnall and colleagues were able to conduct one study that addressed both questions. But including multiple independent variables also allows the researcher to answer questions about whether the effect of one independent variable depends on the level of another. This is referred to as an interaction between the independent variables. Schnall and her colleagues, for example, observed an interaction between disgust and private body consciousness because the effect of disgust depended on whether participants were high or low in private body consciousness. As we will see, interactions are often among the most interesting results in psychological research.

Factorial Designs

By far the most common approach to including multiple independent variables (which are often called factors) in an experiment is the factorial design. In a  factorial design , each level of one independent variable is combined with each level of the others to produce all possible combinations. Each combination, then, becomes a condition in the experiment. Imagine, for example, an experiment on the effect of cell phone use (yes vs. no) and time of day (day vs. night) on driving ability. This is shown in the  factorial design table  in Figure 9.1. The columns of the table represent cell phone use, and the rows represent time of day. The four cells of the table represent the four possible combinations or conditions: using a cell phone during the day, not using a cell phone during the day, using a cell phone at night, and not using a cell phone at night. This particular design is referred to as a 2 × 2 (read “two-by-two”) factorial design because it combines two variables, each of which has two levels.

If one of the independent variables had a third level (e.g., using a handheld cell phone, using a hands-free cell phone, and not using a cell phone), then it would be a 3 × 2 factorial design, and there would be six distinct conditions. Notice that the number of possible conditions is the product of the numbers of levels. A 2 × 2 factorial design has four conditions, a 3 × 2 factorial design has six conditions, a 4 × 5 factorial design would have 20 conditions, and so on. Also notice that each number in the notation represents one factor, one independent variable. So by looking at how many numbers are in the notation, you can determine how many independent variables there are in the experiment. 2 × 2, 3 × 3, and 2 × 3 designs all have two numbers in the notation and therefore all have two independent variables. The numerical value of each of the numbers represents the number of levels of each independent variable. A 2 means that the independent variable has two levels, a 3 means that it has three levels, a 4 means it has four levels, and so on. To illustrate, a 3 × 3 design has two independent variables, each with three levels, while a 2 × 2 × 2 design has three independent variables, each with two levels.
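Since each condition is just one combination of factor levels, the "product of the numbers of levels" rule can be made concrete in a few lines of Python; the level labels below simply mirror the 3 × 2 cell phone example:

```python
# Sketch of the "conditions = product of levels" rule for a 3 x 2 design.
from itertools import product

cell_phone = ["handheld", "hands-free", "none"]   # 3 levels
time_of_day = ["day", "night"]                    # 2 levels

conditions = list(product(cell_phone, time_of_day))
print(len(conditions))   # 3 x 2 = 6 conditions
for cond in conditions:
    print(cond)
```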

[Figure 9.1: A factorial design table representing the 2 × 2 factorial design.]

In principle, factorial designs can include any number of independent variables with any number of levels. For example, an experiment could include the type of psychotherapy (cognitive vs. behavioral), the length of the psychotherapy (2 weeks vs. 2 months), and the sex of the psychotherapist (female vs. male). This would be a 2 × 2 × 2 factorial design and would have eight conditions. Figure 9.2 shows one way to represent this design. In practice, it is unusual for there to be more than three independent variables with more than two or three levels each. This is for at least two reasons: For one, the number of conditions can quickly become unmanageable. For example, adding a fourth independent variable with three levels (e.g., therapist experience: low vs. medium vs. high) to the current example would make it a 2 × 2 × 2 × 3 factorial design with 24 distinct conditions. Second, the number of participants required to populate all of these conditions (while maintaining a reasonable ability to detect a real underlying effect) can render the design unfeasible (for more information, see the discussion about the importance of adequate statistical power in Chapter 13). As a result, in the remainder of this section, we will focus on designs with two independent variables. The general principles discussed here extend in a straightforward way to more complex factorial designs.

[Figure 9.2: One way to represent the 2 × 2 × 2 factorial design described in the text.]

Assigning Participants to Conditions

Recall that in a simple between-subjects design, each participant is tested in only one condition. In a simple within-subjects design, each participant is tested in all conditions. In a factorial experiment, the decision to take the between-subjects or within-subjects approach must be made separately for each independent variable. In a  between-subjects factorial design , all of the independent variables are manipulated between subjects. For example, all participants could be tested either while using a cell phone  or  while not using a cell phone and either during the day  or  during the night. This would mean that each participant would be tested in one and only one condition. In a within-subjects factorial design, all of the independent variables are manipulated within subjects. All participants could be tested both while using a cell phone and  while not using a cell phone and both during the day  and  during the night. This would mean that each participant would need to be tested in all four conditions. The advantages and disadvantages of these two approaches are the same as those discussed in Chapter 5. The between-subjects design is conceptually simpler, avoids order/carryover effects, and minimizes the time and effort of each participant. The within-subjects design is more efficient for the researcher and controls extraneous participant variables.

Since factorial designs have more than one independent variable, it is also possible to manipulate one independent variable between subjects and another within subjects. This is called a  mixed factorial design . For example, a researcher might choose to treat cell phone use as a within-subjects factor by testing the same participants both while using a cell phone and while not using a cell phone (while counterbalancing the order of these two conditions). But they might choose to treat time of day as a between-subjects factor by testing each participant either during the day or during the night (perhaps because this only requires them to come in for testing once). Thus each participant in this mixed design would be tested in two of the four conditions.

Regardless of whether the design is between subjects, within subjects, or mixed, the actual assignment of participants to conditions or orders of conditions is typically done randomly.
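As an illustration of that random assignment step for a fully between-subjects 2 × 2 design, here is a minimal sketch; the participant count and condition labels are hypothetical:

```python
# Sketch of balanced random assignment for a between-subjects 2 x 2 design;
# participant IDs and counts are hypothetical.
import random

conditions = [("cell phone", "day"), ("no cell phone", "day"),
              ("cell phone", "night"), ("no cell phone", "night")]
participants = [f"P{i:02d}" for i in range(1, 21)]   # 20 participants

slots = conditions * (len(participants) // len(conditions))  # 5 per condition
random.seed(42)            # fixed seed so the assignment is reproducible
random.shuffle(slots)
for person, cond in zip(participants, slots):
    print(person, cond)
```

Repeating conditions before shuffling keeps the cell sizes equal, which is usually desirable for the reasons about statistical power mentioned above.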

Non-Manipulated Independent Variables

In many factorial designs, one of the independent variables is a non-manipulated independent variable . The researcher measures it but does not manipulate it. The study by Schnall and colleagues is a good example. One independent variable was disgust, which the researchers manipulated by testing participants in a clean room or a messy room. The other was private body consciousness, a participant variable which the researchers simply measured. Another example is a study by Halle Brown and colleagues in which participants were exposed to several words that they were later asked to recall (Brown, Kosslyn, Delamater, Fama, & Barsky, 1999) [1] . The manipulated independent variable was the type of word. Some were negative health-related words (e.g.,  tumor, coronary ), and others were not health related (e.g.,  election, geometry ). The non-manipulated independent variable was whether participants were high or low in hypochondriasis (excessive concern with ordinary bodily symptoms). The result of this study was that the participants high in hypochondriasis were better than those low in hypochondriasis at recalling the health-related words, but they were no better at recalling the non-health-related words.

Such studies are extremely common, and there are several points worth making about them. First, non-manipulated independent variables are usually participant variables (private body consciousness, hypochondriasis, self-esteem, gender, and so on), and as such, they are by definition between-subjects factors. For example, people are either low in hypochondriasis or high in hypochondriasis; they cannot be tested in both of these conditions. Second, such studies are generally considered to be experiments as long as at least one independent variable is manipulated, regardless of how many non-manipulated independent variables are included. Third, it is important to remember that causal conclusions can only be drawn about the manipulated independent variable. For example, Schnall and her colleagues were justified in concluding that disgust affected the harshness of their participants’ moral judgments because they manipulated that variable and randomly assigned participants to the clean or messy room. But they would not have been justified in concluding that participants’ private body consciousness affected the harshness of their participants’ moral judgments because they did not manipulate that variable. It could be, for example, that having a strict moral code and a heightened awareness of one’s body are both caused by some third variable (e.g., neuroticism). Thus it is important to be aware of which variables in a study are manipulated and which are not.

Non-Experimental Studies With Factorial Designs

Thus far we have seen that factorial experiments can include manipulated independent variables or a combination of manipulated and non-manipulated independent variables. But factorial designs can also include  only non-manipulated independent variables, in which case they are no longer experiments but are instead non-experimental in nature. Consider a hypothetical study in which a researcher simply measures both the moods and the self-esteem of several participants—categorizing them as having either a positive or negative mood and as being either high or low in self-esteem—along with their willingness to have unprotected sexual intercourse. This can be conceptualized as a 2 × 2 factorial design with mood (positive vs. negative) and self-esteem (high vs. low) as non-manipulated between-subjects factors. Willingness to have unprotected sex is the dependent variable.

Again, because neither independent variable in this example was manipulated, it is a non-experimental study rather than an experiment. (The similar study by MacDonald and Martineau [2002] [2]  was an experiment because they manipulated their participants’ moods.) This is important because, as always, one must be cautious about inferring causality from non-experimental studies because of the directionality and third-variable problems. For example, an effect of participants’ moods on their willingness to have unprotected sex might be caused by any other variable that happens to be correlated with their moods.

References

  1. Brown, H. D., Kosslyn, S. M., Delamater, B., Fama, A., & Barsky, A. J. (1999). Perceptual and memory biases for health-related information in hypochondriacal individuals. Journal of Psychosomatic Research, 47, 67–78.
  2. MacDonald, T. K., & Martineau, A. M. (2002). Self-esteem, mood, and intentions to use condoms: When does low self-esteem lead to risky health behaviors? Journal of Experimental Social Psychology, 38, 299–306.

Glossary

  • Factorial design: An experiment that includes more than one independent variable, in which each level of one independent variable is combined with each level of the others to produce all possible combinations.
  • Factorial design table: A table showing how each level of one independent variable is combined with each level of the others to produce all possible combinations in a factorial design.
  • Between-subjects factorial design: A factorial design in which all of the independent variables are manipulated between subjects.
  • Mixed factorial design: A factorial design that manipulates one independent variable between subjects and another within subjects.
  • Non-manipulated independent variable: An independent variable that is measured but not manipulated.

Setting Up a Factorial Experiment Copyright © by Rajiv S. Jhangiani; I-Chant A. Chiang; Carrie Cuttler; and Dana C. Leighton is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.



Points of Significance

Two-factor designs

Martin Krzywinski & Naomi Altman

Nature Methods 11, 1187–1188 (2014). Published 25 November 2014. https://doi.org/10.1038/nmeth.3180


When multiple factors can affect a system, allowing for interaction can increase sensitivity.


When probing complex biological systems, multiple experimental factors may interact in producing effects on the response. For example, in studying the effects of two drugs that can be administered simultaneously, observing all the pairwise level combinations in a single experiment is more revealing than varying the levels of one drug at a fixed level of the other. If we study the drugs independently we may miss biologically relevant insight about synergies or antisynergies and sacrifice sensitivity in detecting the drugs' effects.

The simplest design that can illustrate these concepts is the 2 × 2 design, which has two factors (A and B), each with two levels ( a/A and b/B ). Specific combinations of factors ( a/b , A/b , a/B , A/B ) are called treatments. When every combination of levels is observed, the design is said to be a complete factorial or completely crossed design. So this is a complete 2 × 2 factorial design with four treatments.

Our previous discussion about experimental designs was limited to the study of a single factor for which the treatments are the factor levels. We used ANOVA 1 to determine whether a factor had an effect on the observed variable and followed up with pairwise t -tests 2 to isolate the significant effects of individual levels. We now extend the ANOVA idea to factorial designs. Following the ANOVA analysis, pairwise t -tests can still be done, but often analysis focuses on a different set of comparisons: main effects and interactions.

Figure 1 illustrates some possible outcomes in a 2 × 2 factorial experiment (values in Table 1 ). Suppose that both factors correspond to drugs and the observed variable is liver glucose level. In Figure 1a , drugs A and B increase glucose levels by 1 unit. Because neither drug influences the effect of the other we say there is no interaction and that the effects are additive. In Figure 1b , the effect of A in the presence of B is larger than the sum of their effects when they are administered separately (3 vs. 0.5 + 1). When the effect of the levels of a factor depends on the levels of other factors, we say that there is an interaction between the factors. In this case, we need to be careful about defining the effects of each factor.

Figure 1: (a) The main effect is the difference between τ values (light gray), which is the response for a given level of a factor averaged over the levels of other factors. (b) The interaction effect is the difference between the effects of A at the different levels of B, or vice versa (dark gray, Δ). (c) Interaction effects may mask main effects.

The main effect of factor A is defined as the difference in the means of the two levels of A averaged over all the levels of B. For Figure 1b, the average for level a is τ = (0 + 1)/2 = 0.5 and for level A is τ = (0.5 + 3)/2 = 1.75, giving a main effect of 1.75 − 0.5 = 1.25 (Table 1). Similarly, the main effect of B is 2 − 0.25 = 1.75. The interaction compares the differences in the mean of A at the two levels of B (2 − 0.5 = 1.5; in the Δ row) or, equivalently, the differences in the mean of B at the two levels of A (2.5 − 1 = 1.5). Interaction plots are useful for evaluating effects when the number of factors is small (line plots, Fig. 1b). The x axis represents levels of one factor and lines correspond to levels of other factors. Parallel lines indicate no interaction. The more the lines diverge, or cross, the greater the interaction.
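These calculations are easy to verify directly. The sketch below recomputes the main effects and the interaction from the four cell means quoted above (0, 0.5, 1 and 3), with rows indexing the levels of A and columns the levels of B:

```python
# Worked check of the Figure 1b numbers: cell means for a 2 x 2 design,
# rows = levels of A (a, A), columns = levels of B (b, B).
import numpy as np

means = np.array([[0.0, 1.0],    # a/b, a/B
                  [0.5, 3.0]])   # A/b, A/B

main_A = means[1].mean() - means[0].mean()         # 1.75 - 0.5  = 1.25
main_B = means[:, 1].mean() - means[:, 0].mean()   # 2.0  - 0.25 = 1.75
interaction = (means[1, 1] - means[0, 1]) - (means[1, 0] - means[0, 0])  # 2 - 0.5 = 1.5
print(main_A, main_B, interaction)
```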

Figure 1c shows an interaction effect with no main effect. This can happen if one factor increases the response at one level of the other factor but decreases it at the other. Both factors have the same average value for each of their levels, τ = 0.5. However, the two factors do interact because the effect of one drug is different depending on the presence of the other.

factorial experimental design example

When there are more factors or more levels, the main effects and interactions are summarized over many comparisons as sums of squares (SS) and usually only the test statistic ( F -test), its d.f. and the P value are reported. If there are statistically significant interactions, pairwise comparisons of different levels of one factor for fixed levels of the other factors (sometimes called simple main effects) are often computed in the manner described above. If the interactions are not significant, we typically compute differences between levels of one factor averaged over the levels of the other factor. Again, these are pairwise comparisons between means that are handled as just described, except that the sample sizes are also summed over the levels.

To illustrate the two-factor design analysis, we'll use a simulated data set in which the effects of levels of the drug and diet were tested in two different designs, each with 8 mice and 8 observations (Fig. 2a). We'll assume an experimental protocol in which glucose levels are measured from a mouse liver tissue sample and the results analyzed by two-way ANOVA. Our simulated simple effects are shown in Figure 1b: the increase in the response variable is 0.5 (A/b), 1 (a/B) and 3 (A/B). The two drugs are synergistic: A is 4× as potent in the presence of B, as can be seen from \((\mu_{AB} - \mu_{aB})/(\mu_{Ab} - \mu_{ab}) = \Delta_B/\Delta_b = 2/0.5 = 4\) (Table 1). We'll assume the same variation due to mice and measurement error, \(\sigma^2 = 0.25\).

Figure 2: (a) Two common two-factor designs with 8 measurements each. In the completely randomized (CR) scenario, each mouse is randomly assigned a single treatment. Variability among mice can be mitigated by grouping mice by similar characteristics (e.g., litter or weight); the group becomes a block, and each block is subject to all treatments. (b) Partitioning of the total sum of squares (SS T ; CR, 16.9; RCB, 26.4) and P values for the CR and RCB designs in (a). M represents the blocking factor. The vertical axis is relative to SS T . The total d.f. in both cases = 7; all other d.f. = 1.

We'll use a completely randomized design with each of the 8 mice randomly assigned to one of the four treatments in a balanced fashion, each providing a single liver sample (Fig. 2a). First, let's test the effect of the two factors separately using one-way ANOVA, averaging over the values of the other factor. If we consider only A, the effects of B are considered part of the residual error and we do not detect any effect (P = 0.48, Fig. 2b). If we consider only B, we can detect an effect (P = 0.04) because B has a larger main effect (2.0 − 0.25 = 1.75) than A (1.75 − 0.5 = 1.25).

When we test for multiple factors, the ANOVA calculation partitions the total sum of squares, SS T , into components that correspond to A (SS A ), B (SS B ) and the residual (SS E ) ( Fig. 2b ). The additive two-factor model assumes that there is no interaction between A and B—the effect of a given level of A does not depend on a level of B. In this case, the interaction component is assumed to be part of the error. If this assumption is relaxed, we can partition the total variance into four components, now accounting for how the response of A varies with B. In our example, the SS A and SS B terms remain the same, but SS E is reduced by the amount of SS AB (4.6), to 2.0 from 6.6. The resulting reduction in MS E (0.5 vs. 1.3) corresponds to the variance explained by the interaction between the two factors. When interaction is accounted for, the sensitivity of detecting an effect of A and B is increased because the F -ratio, which is inversely proportional to MS E , is larger.
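A sketch of this comparison in Python: we simulate data with the Figure 1b cell means and \(\sigma^2 = 0.25\), then fit the additive model and the model with interaction. The use of statsmodels here, and the particular random draws, are assumptions for illustration rather than the authors' own computation:

```python
# Sketch contrasting the additive and interaction two-way models on
# simulated data mimicking the synergy in Figure 1b.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
cells = {("a", "b"): 0.0, ("A", "b"): 0.5, ("a", "B"): 1.0, ("A", "B"): 3.0}
rows = [(A, B, mu + rng.normal(scale=0.5))            # sigma^2 = 0.25
        for (A, B), mu in cells.items() for _ in range(2)]  # n = 2 per cell
df = pd.DataFrame(rows, columns=["drugA", "drugB", "y"])

additive = smf.ols("y ~ C(drugA) + C(drugB)", data=df).fit()
full = smf.ols("y ~ C(drugA) * C(drugB)", data=df).fit()
print(sm.stats.anova_lm(additive, typ=2))  # interaction folded into the error
print(sm.stats.anova_lm(full, typ=2))      # smaller MSE, more sensitive tests
```

The SS for A and B are identical in the two tables; only the error term shrinks when the interaction is modeled, which is what drives the larger F-ratios.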

To improve the sensitivity of detecting an effect of A, we can mitigate biological variability in mice by using a randomized complete block approach 1 (Fig. 2a). If the mice share some characteristic, such as litter or weight, that contributes to response variability, we can control for some of the variation by assigning one complete replicate to each batch of similar mice. The total number of observations will still be 8, and we will track the mouse batch across measurements and use the batch as a random blocking factor 2 .

The sum-of-squares partitioning and P values for the blocking scenario are shown in Figure 2b . In each case, the SS E value is proportionately lower than in the completely randomized design, which makes the tests more sensitive. Once we incorporate blocking and interaction, we are able to detect both main and interaction effects and account for nearly all of the variance due to sources other than measurement error (SS E = 0.8, MS E = 0.25). The interpretation of P = 0.01 for the blocking factor M is that the biological variation due to the blocking factor has a nonzero variance. Effects and CIs are calculated just as for the completely randomized design—although the means have two sources of variance (block effect and MS E ), their difference has only one (MS E ) because the block effect cancels.
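To sketch the blocking analysis, we can extend the same simulation with a shared per-block shift and include the block as a term in the model. Treating the block as a fixed effect in OLS is a common simplification of the random blocking factor described above; in a balanced design it yields the same F-tests for the treatment effects. The block effect sizes below are hypothetical:

```python
# Sketch of the randomized complete block analysis: one complete replicate
# of all four treatments per block of similar mice; block effects hypothetical.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
cells = {("a", "b"): 0.0, ("A", "b"): 0.5, ("a", "B"): 1.0, ("A", "B"): 3.0}
rows = []
for block in ["m1", "m2"]:                  # two blocks of similar mice
    block_effect = rng.normal(scale=1.0)    # shared shift within a block
    for (A, B), mu in cells.items():
        rows.append((block, A, B, mu + block_effect + rng.normal(scale=0.5)))
df = pd.DataFrame(rows, columns=["block", "drugA", "drugB", "y"])

model = smf.ols("y ~ C(block) + C(drugA) * C(drugB)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))  # block SS removed from the error term
```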

With two factors, more complicated designs are also possible. For example, we might expose the whole mouse to a drug (factor A) in vivo and then expose two liver samples to different in vitro treatments (factor B). In this case, the two liver samples from the same mouse form a block that is nested in mouse.

We might also consider factorial designs with more levels per factor or more factors. If the response to our two drugs depends on genotype, we might consider using three genotypes in a 2 × 2 × 3 factorial design with 12 treatments. This design allows for the possibility of interactions among pairs of factors and also among all three factors. The smallest factorial design with k factors has two levels for each factor, leading to 2^k treatments. Another set of designs, called fractional factorial designs, used frequently in manufacturing, allows for a large number of factors with a smaller number of samples by using a carefully selected subset of treatments.

Complete factorial designs are the simplest designs that allow us to determine synergies among factors. The added complexity in visualization, summary and analysis is rewarded by an enhanced ability to understand the effects of multiple factors acting in unison.

References

1. Krzywinski, M. & Altman, N. Nat. Methods 11, 699–700 (2014).
2. Krzywinski, M. & Altman, N. Nat. Methods 11, 215–216 (2014).
3. Montgomery, D.C. Design and Analysis of Experiments, 8th edn (Wiley, 2012).

