David Reimer and John Money Gender Reassignment Controversy: The John/Joan Case

In the mid-1960s, psychologist John Money encouraged the gender reassignment of David Reimer, who was born a biological male but suffered irreparable damage to his penis as an infant. Reimer was born in 1965 as Bruce Reimer, and the damage resulted from a failed circumcision during infancy. After encouragement from Money, Reimer’s parents decided to raise Reimer as a girl. Reimer underwent surgery as an infant to construct rudimentary female genitals, and was given female hormones during puberty. During childhood, Reimer was never told he was biologically male and regularly visited Money, who tracked the progress of his gender reassignment. Reimer unknowingly acted as an experimental subject in Money’s controversial investigation, which Money called the John/Joan case. The case provided results that were used to justify thousands of sex reassignment surgeries for children with reproductive abnormalities. Despite his upbringing, Reimer rejected the female identity as a young teenager and began living as a male. He suffered severe depression throughout his life, which culminated in his suicide at thirty-eight years old. Reimer, and his public statements about the trauma of his transition, brought attention to gender identity and called into question the sex reassignment of infants and children.

Bruce Peter Reimer was born on 22 August 1965 in Winnipeg, Manitoba, to Janet and Ron Reimer. At six months of age, both Reimer and his identical twin, Brian, were diagnosed with phimosis, a condition in which the foreskin of the penis cannot retract, inhibiting regular urination. On 27 April 1966, Reimer underwent circumcision, a common procedure in which a physician surgically removes the foreskin of the penis. Usually, physicians performing circumcisions use a scalpel or other sharp instrument to remove foreskin. However, Reimer’s physician used the unconventional technique of cauterization, or burning to cause tissue death. Reimer’s circumcision failed. Reimer’s brother did not undergo circumcision and his phimosis healed naturally. While the true extent of Reimer’s penile damage was unclear, the overwhelming majority of biographers and journalists maintained that his penis was either totally severed or otherwise damaged beyond the possibility of function.

In 1967, Reimer’s parents sought the help of John Money, a psychologist and sexologist who worked at the Johns Hopkins Hospital in Baltimore, Maryland. In the mid-twentieth century, Money helped establish views on the psychology of gender identities and roles. In his academic work, Money argued in favor of the increasingly mainstream idea that gender was a societal construct, malleable from an early age. He stated that being raised as a female was in Reimer’s interest, and recommended sexual reassignment surgery. At the time, infants born with abnormal or intersex genitalia commonly received such interventions.

Following their consultation with Money, Reimer’s parents decided to raise Reimer as a girl. Physicians at the Johns Hopkins Hospital removed Reimer’s testes and damaged penis, and constructed vestigial vulvae and a vaginal canal in their place. The physicians also opened a small hole in Reimer’s lower abdomen for urination. Following his gender reassignment surgery, Reimer was given the first name Brenda, and his parents raised him as a girl. He received estrogen during adolescence to promote the development of breasts. Throughout his childhood, Reimer was not informed about his male biology.

Throughout his childhood, Reimer received annual checkups from Money. His twin brother was also part of Money’s research on sexual development and gender in children. As identical twins growing up in the same family, the Reimer brothers were what Money considered ideal case subjects for a psychology study on gender. Reimer was the first documented case of sex reassignment of a child born developmentally normal, while Reimer’s brother was a control subject who shared Reimer’s genetic makeup, intrauterine space, and household.

During the twins’ psychiatric visits with Money, and as part of his research, Reimer and his twin brother were directed to inspect one another’s genitals and engage in behavior resembling sexual intercourse. Reimer claimed that much of Money’s treatment involved the forced reenactment of sexual positions and motions with his brother. In some exercises, the brothers rehearsed missionary positions with thrusting motions, which Money justified as the rehearsal of healthy childhood sexual exploration. In a Rolling Stone interview, Reimer recalled that at least once, Money photographed those exercises. Money also made the brothers inspect one another’s pubic areas. Reimer stated that Money observed those exercises both alone and with as many as six colleagues. Reimer recounted anger and verbal abuse from Money if he or his brother resisted orders, in contrast to the calm and scientific demeanor Money presented to their parents. Reimer and his brother underwent Money’s treatments at preschool and grade school age. Money described Reimer’s transition as successful, and claimed that Reimer’s girlish behavior stood in stark contrast to his brother’s boyishness. Money reported on Reimer’s case as the John/Joan case, leaving out Reimer’s real name. For over a decade, Reimer and his brother unknowingly provided data that, according to biographers and the Intersex Society of North America, were used to reinforce Money’s theories on gender fluidity and provided justification for thousands of sex reassignment surgeries for children with abnormal genitals.

Contrary to Money’s notes, Reimer reported that as a child he experienced severe gender dysphoria, a condition in which someone experiences distress as a result of their assigned gender. Reimer reported that he did not identify as a girl and resented Money’s visits for treatment. At the age of thirteen, Reimer threatened to commit suicide if his parents took him to Money on the next annual visit. Bullied by peers in school for his masculine traits, Reimer claimed that despite receiving female hormones, wearing dresses, and having his interests directed toward typically female norms, he always felt that he was a boy. In 1980, when Reimer was fifteen, his father told him the truth about his birth and the subsequent procedures. Following that revelation, Reimer assumed a male identity, taking the first name David. By age twenty-one, Reimer had received testosterone therapy and surgeries to remove his breasts and reconstruct a penis. He married Jane Fontaine, a single mother of three, on 22 September 1990.

In adulthood, Reimer reported that he suffered psychological trauma due to Money’s experiments, which Money had used to justify sexual reassignment surgery for children with intersex or damaged genitals since the 1970s. In the mid-1990s, Reimer met Milton Diamond, a psychologist at the University of Hawaii, in Honolulu, Hawaii, and an academic rival of Money. Reimer participated in a follow-up study conducted by Diamond, in which Diamond cataloged the failures of Reimer’s transition.

In 1997, Reimer began speaking publicly about his experiences, beginning with his participation in Diamond’s study. Reimer’s first interview appeared in the December 1997 issue of Rolling Stone magazine. In interviews, and a later book about his experience, Reimer described his interactions with Money as torturous and abusive. Accordingly, Reimer claimed he developed a lifelong distrust of hospitals and medical professionals.

With those reports, Reimer caused a multifaceted controversy over Money’s methods, honesty in data reporting, and the general ethics of sex reassignment surgeries on infants and children. Reimer’s description of his childhood conflicted with the scientific consensus about sex reassignment at the time. According to NOVA, Money led scientists to believe that the John/Joan case demonstrated an unreservedly successful sex transition. Reimer’s parents later blamed Money’s methods and alleged surreptitiousness for the psychological illnesses of their sons, although the notes of a former graduate student in Money’s lab indicated that Reimer’s parents dishonestly represented the transition’s success to Money and his coworkers. Reimer was further alleged by supporters of Money to have incorrectly recalled the details of his treatment. On Reimer’s case, Money publicly dismissed his criticism as anti-feminist and anti-trans bias, but, according to his colleagues, was personally ashamed of the failure.

In his early twenties, Reimer attempted to commit suicide twice. According to Reimer, his adult family life was strained by marital problems and employment difficulty. Reimer’s brother, who suffered from depression and schizophrenia, died from an antidepressant drug overdose in July of 2002. On 2 May 2004, Reimer’s wife told him that she wanted a divorce. Two days later, at the age of thirty-eight, Reimer committed suicide by firearm.

Reimer, Money, and the case became subjects of numerous books and documentaries following the exposé. Reimer also became somewhat iconic in popular culture, being directly referenced or alluded to in the television shows Chicago Hope, Law & Order, and Mental. The BBC series Horizon covered his story in two episodes, “The Boy Who Was Turned into a Girl” (2000) and “Dr. Money and the Boy with No Penis” (2004). Canadian rock group The Weakerthans wrote “Hymn of the Medical Oddity” about Reimer, and the New York-based Ensemble Studio Theatre production Boy was based on Reimer’s life.

  • Carey, Benedict. “John William Money, 84, Sexual Identity Researcher, Dies.” New York Times, 11 July 2006.
  • Colapinto, John. “The True Story of John/Joan.” Rolling Stone 11 (1997): 54–73.
  • Colapinto, John. As Nature Made Him: The Boy who was Raised as a Girl. New York: HarperCollins Publishers, 2000.
  • Colapinto, John. “Gender Gap—What were the real reasons behind David Reimer’s suicide?” Slate (2004).
  • Dr. Money and the Boy with No Penis, documentary, written by Sanjida O’Connell (BBC, 2004), Film.
  • The Boy Who Was Turned Into a Girl, documentary, directed by Andrew Cohen (BBC, 2000), Film.
  • “Who was David Reimer (also, sadly, known as John/Joan)?” Intersex Society of North America. http://www.isna.org/faq/reimer (Accessed October 31, 2017).

Copyright Arizona Board of Regents Licensed as Creative Commons Attribution-NonCommercial-Share Alike 3.0 Unported (CC BY-NC-SA 3.0)  

Welldoing.org

The Psychology of Gender: What are the Different Perspectives?

Debates around gender invariably centre on the assumed ‘naturalness’ of gender roles, that they are ‘hardwired’ and an inevitable result of our biology – that is, penises lead to masculinity and vaginas lead to femininity. From this perspective, the road is well travelled, and the route is predestined. In contrast, Judith Butler in Gender Trouble describes gender as a practice, as something we do. It is ‘the repeated stylisation of the body, a set of repeated acts within a highly rigid regulatory frame that congeal over time to produce the appearance of substance, of a natural sort of being’. On this view, it is about performance rather than essence, with ‘the body as a kind of canvas on which culture paints images of gender’. It ‘boils down’ to the age-old nature versus nurture debate with which every psychology student must wrangle.

How we acquire gender identity

Traditionally, there are three main psychological explanations of how we navigate the path to gender identity. These are psychodynamic theory, social learning theory, and cognitive-developmental theory. All focus on early childhood, that is, up until about seven years of age.

Psychodynamic theory

Psychodynamic theories, following on from Sigmund Freud’s psychoanalytic theory, focus on unconscious drives, the relationship of the child and early experiences with the parents (or primary caregivers). Gender is a core part of personality that rests on the child’s awareness of its anatomy and its identification with the same-sex parent. The key point in its development is the resolution of the Oedipus complex for boys and the Electra complex for girls. Both involve resolving an incestuous desire for the opposite-sex parent and competition with the same-sex parent. Girls view the same-sex parent as responsible for their loss of a penis. Boys fear that their penis will be taken away by the same-sex parent. This antagonism is somehow resolved, and the child aligns with the same-sex parent. For males, fear of the loss of the penis is a more abstract concept, meaning males must work harder to deal with uncertainty. For females, the loss is already apparent. On this view, the male role is stronger than is the female. It is not difficult to see the three gender lenses at work here.

Social learning theory 

Instead of an innate, unconscious and biological basis of gender identity, social learning theory emphasises the child’s environment and learning experiences. According to this view, gender roles are learned through a mixture of observing the behaviour of others and modelling (imitation of same-sex caregivers). Children recognise the differential behaviours of boys and girls, generally, and the treatment by others in the form of rewards or punishments for appropriate/inappropriate actions. Children also experience individual differences in treatment, which starts at birth with physical handling, clothes and toy choices and patterns of speech. Gender-linked behaviours are observable by age one. Through conditioning, behaviours regularly and consistently rewarded are most likely to persist, whereas those behaviours that are punished are more apt to cease. 

Although social learning theory offers some explanation of how modelling and reinforcement interact, it tends to underplay individual differences in development and reactions from others such as inconsistencies in behavioural reinforcement. While it considers cognitive factors, it also underplays the agency of children and how they actively make sense of the world. It is also not clear how children cope with conflicting messages regarding gender.

Cognitive-developmental theory

According to the cognitive-developmental theory, as children we mature and experience the world, reorganising mental processes as we progress through a series of stages of development. Children’s development hits various milestones moving from the simple to the complex and from the concrete to the abstract, including language development. Children are active agents in acquiring gender roles within development stages that allow for an increasingly sophisticated grasp of concepts and language. As children mature, discrepancies between their knowledge and their experiences of the environment cause their ideas to shift accordingly. The acquisition of gender constancy, stability and consistency can only happen when a child has reached a certain level of cognitive maturity.

According to this view, gender identity exists at several levels, possibly developing in line with language. A strong theme that emerges from the literature is that boys, more so than girls, value their own gender more highly. This offers some support for the psychodynamic view that boys must try harder. 

Overall, the psychology of gender is revealed in the grey areas, that is, the relationship between identity and expression, and how we make sense of the gaps between (biological) sex, self and the social. For many the mismatch gaps might be narrow or even imperceptible, others might find ways of behaving and thinking to bridge the divide, and yet for others, the divide can seem insurmountable.

This is an extract from The Psychology of Gender, published by Routledge

Module 2: Studying Gender Using the Scientific Method

3rd edition as of August 2023

Module Overview

In Module 2, we will address the fact that psychology is the scientific study of behavior and mental processes. We will do this by examining the steps of the scientific method and by describing the five major designs used in psychological research. We will also differentiate between reliability and validity and their importance for measurement. Psychology has very clear ethical standards and procedures for scientific research. We will discuss these and why they are needed. The content of this module relates to all areas of psychology, but we will also point out some methods used in the study of gender that may not be used in other subfields as frequently or at all.

Module Outline

2.1. The Scientific Method

2.2. Research Designs Used in the Study of Gender Issues

2.3. Reliability and Validity

2.4. Research Ethics

Module Learning Outcomes

  • Clarify what it means for psychology to be scientific by examining the steps of the scientific method and the three cardinal features of science.
  • Outline the five main research methods used in psychology and clarify how they are utilized in social psychology.
  • Differentiate and explain the concepts of reliability and validity.
  • Describe key features of research ethics.

Section Learning Objectives

  • Define scientific method.
  • Outline and describe the steps of the scientific method, defining all key terms.
  • Identify and clarify the importance of the three cardinal features of science.

In Module 1, psychology was defined as the scientific study of behavior and mental processes. More about behavior and mental processes will be explained, but before proceeding, it will be useful to elaborate on what makes psychology scientific. In fact, it is safe to say that most outside the field of psychology, or a sister science, might be surprised to learn that psychology utilizes the scientific method.

The scientific method is a systematic method for gathering knowledge about the world around us. Systematic means that there is a set way to use it. There is some variety in the number of steps used in the scientific method, depending on the source, but for the purposes of this book, the following breakdown will be used:

Table 2.1: The Steps of the Scientific Method

0 Ask questions and be willing to wonder. To study the world around us, you have to wonder about it. This inquisitive nature is the hallmark of critical thinking, or our ability to assess claims made by others and make objective judgments that are independent of emotion and anecdote and based on hard evidence. Critical thinking is required to be a scientist. For instance, one might wonder if people are more likely to stumble over words while being interviewed for a new job.
1 Generate a research question or identify a problem to investigate. Through our wonderment about the world around us and why events occur as they do, we begin to ask questions that require further investigation to arrive at an answer. This investigation usually starts with a literature review. This is when a search of the literature is conducted through a university library or search engine, such as Google Scholar, to see what questions have been investigated and what answers have been found. This helps us identify gaps, or missing information, in the collective scientific knowledge. For instance, in relation to word fluency and job interviews, we would execute a search using words relevant to our questions as our parameters. Google Scholar and similar search engines would find those words among the key words authors list in the abstract of their research. The abstract is a short description of what the article is about, similar to the summary of a novel on the back cover. These descriptions are useful for choosing which, of sometimes many, articles to read. As you read articles, you can learn which questions have yet to be asked and answered to give your future research project specificity and direction.
2 Form a prediction. The coherent interpretation of a phenomenon is a theory. A hypothesis is a specific, testable prediction about that phenomenon which will occur if the theory is correct. Zajonc’s drive theory states that performing a task while being watched creates a state of physiological arousal, increasing the likeliest, or most dominant, response. According to this theory, being watched increases correct responses on well-practiced tasks and incorrect responses on unpracticed tasks. We could then hypothesize, or predict, that people who did not practice for their job interview will stumble over their words during the interview more than they normally do. In this way, theories and hypotheses have if-then relationships.
3 Test the hypothesis. If the hypothesis is not testable, then we cannot show whether or not our prediction is accurate. Our plan of action for testing the hypothesis is called the research design. In the planning stage, we will select the appropriate research method to test our hypothesis and answer our question. We might choose to use the method of observation to record speech patterns during job interviews. Alternatively, we might use a survey method where participants report on their job interview experiences. We could also design an experiment to test the effects of practice on job interviews.
4 Interpret the results. With our research study done, we now examine the data to see whether or not it supports our hypothesis. Descriptive statistics provide a means of summarizing or describing data and presenting the data in a usable form, using mean or average, median, and mode, as well as standard deviation and variance. Inferential statistics allow us to make inferences about populations from our sample data by determining the statistical significance of the results. Significance is an indication of how confident we are that our results are not simply due to chance. Typically, psychologists prefer that there is no greater than a 5% probability that results are due to chance.
5 Draw conclusions carefully. We need to accurately interpret our results and not overstate our findings. To do this, we need to be aware of our biases and avoid emotional reasoning. In our effort to stop a child from engaging in self-injurious behavior that could cause substantial harm or even death, it could be tempting to overstate the success of our treatment method. In the case of our job interview and speech fluency study, our descriptive statistics might have revealed that people in their 20’s stumbled more over words than people in their 30’s during their interviews. Even though the results of our sample might be statistically significant, they might not be reflective of the overall population. Additionally, it is important not to imply causation when only a correlation has been demonstrated.
6 Communicate our findings to the larger scientific community. Once we have decided whether our hypothesis is supported or not, we need to share this information with others so that they might comment critically on our methodology, statistical analyses, and conclusions. Sharing also allows for replication, or repeating the study to confirm or produce different results. The dissemination of scientific research is accomplished through scientific journals, conferences, or newsletters released by many of the organizations mentioned in Section 1.3.
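
To make Step 4 concrete, here is a minimal Python sketch of descriptive and inferential statistics for the job-interview example. The stumble counts are invented for illustration, and the function names are hypothetical. The module does not name a specific significance test; a permutation test is used here only because it needs nothing beyond the standard library.

```python
import random
import statistics

# Hypothetical data: number of verbal stumbles per interview (invented for illustration).
practiced = [2, 3, 3, 4, 2, 5, 3, 4]
unpracticed = [5, 6, 4, 7, 6, 5, 8, 6]


def describe(scores):
    """Descriptive statistics: summarize one group of scores."""
    return {
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        "mode": statistics.mode(scores),
        "variance": statistics.variance(scores),  # sample variance
        "sd": statistics.stdev(scores),           # sample standard deviation
    }


def permutation_p_value(group_a, group_b, n_shuffles=10_000, seed=0):
    """Inferential statistics: estimate how often a mean difference at least
    as large as the observed one would arise if group labels were arbitrary."""
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        a, b = pooled[:n_a], pooled[n_a:]
        if abs(sum(a) / len(a) - sum(b) / len(b)) >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_shuffles


summary_practiced = describe(practiced)
summary_unpracticed = describe(unpracticed)
p_value = permutation_p_value(practiced, unpracticed)

print(summary_practiced)
print(summary_unpracticed)
print("significant at the 5% level:", p_value < 0.05)
```

Even when the p-value falls below the 5% threshold, Step 5 still applies: a significant difference in one small sample does not by itself show that the result generalizes or that practice caused the difference.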

Science has at its root three cardinal features that we will encounter throughout this book. They are:

  • Observation – Observational research is a type of non-experimental research method in which the goal is to describe the variables. In naturalistic observation, participants are observed in a natural setting. In structured observation, participants are observed in a more structured environment, such as a lab.
  • Experimentation – To determine whether there is a causal, or cause-and-effect, relationship between two variables, we must be able to isolate variables. In a true experiment, the independent variable is systematically manipulated, and extraneous variables are controlled, or decreased in variability, as much as possible.
  • Measurement – Whether researchers are using a non-experimental, observational design, or an experimental design, it is important for researchers to ensure the scales that are used are valid and reliable. Reliability refers to consistency, in which the same results are achieved at different times and between different researchers. Validity refers to whether or not the study measured the variable it was intended to measure. Validity and reliability will be further discussed in Section 2.3. These concepts help us to know that the conclusions we infer from our data are drawn from trustworthy sources and techniques.

2.2. Research Designs Used in the Study of Gender Issues

Section Learning Objectives

  • List the five main research methods used in psychology.
  • Describe observational research, listing its advantages and disadvantages.
  • Describe case study research, listing its advantages and disadvantages.
  • Describe survey research, listing its advantages and disadvantages.
  • Describe correlational research, listing its advantages and disadvantages.
  • Describe experimental research, listing its advantages and disadvantages.
  • State the utility and need for multimethod research.

Step 3 of the scientific method involves the scientist testing their hypothesis. Psychology as a discipline uses five main research designs. These include observational research, case studies, surveys, correlational designs, and experiments. Note that research can take two forms: quantitative, which is focused on numbers, and qualitative, which is focused on words. Psychology primarily focuses on quantitative research, though qualitative research is just as useful in different ways. Qualitative and quantitative research are complementary approaches, and often fill in important gaps for one another.

2.2.1. Observational Research

In naturalistic observation, the scientist studies human or animal behavior in its natural environment, which could include the home, school, or a forest. The researcher counts, measures, and rates behavior in a systematic way and at times uses multiple judges to ensure accuracy in how the behavior is being measured. This is called inter-rater reliability, as you will see in Section 2.3. The advantage of this method is that you witness behavior as it occurs, and it is not tainted by the experimenter. The disadvantage is that it could take a long time for the behavior to occur and if the researcher is detected, then the behavior of those being observed may be influenced. In that case, the behavior of the observed could become artificial.
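
As an illustration of how agreement between two judges might be quantified, the following Python sketch computes simple percent agreement and Cohen's kappa, which corrects agreement for what would be expected by chance. The ratings are invented for illustration, and the module itself does not prescribe a particular index of inter-rater reliability; kappa is one common choice.

```python
from collections import Counter


def percent_agreement(rater_1, rater_2):
    """Proportion of observations on which the two raters gave the same code."""
    matches = sum(a == b for a, b in zip(rater_1, rater_2))
    return matches / len(rater_1)


def cohens_kappa(rater_1, rater_2):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_1)
    p_observed = percent_agreement(rater_1, rater_2)
    counts_1, counts_2 = Counter(rater_1), Counter(rater_2)
    categories = set(rater_1) | set(rater_2)
    # Chance agreement: probability both raters pick the same category independently.
    p_chance = sum(counts_1[c] * counts_2[c] for c in categories) / (n * n)
    if p_chance == 1:
        return 1.0  # both raters used a single identical category throughout
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical codes from two observers rating the same ten behaviors
# (1 = low, 2 = moderate, 3 = high intensity).
rater_a = [1, 2, 3, 2, 1, 3, 2, 2, 1, 3]
rater_b = [1, 2, 3, 2, 2, 3, 2, 1, 1, 3]

agreement = percent_agreement(rater_a, rater_b)
kappa = cohens_kappa(rater_a, rater_b)
print(f"percent agreement = {agreement:.2f}, kappa = {kappa:.3f}")
```

Kappa is lower than raw agreement here because some matches would occur even if the two observers assigned codes independently.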

Laboratory observation is a type of structured observation which involves observing people or animals in a laboratory setting. A researcher who wants to know more about parent-child interactions might bring a parent and child into the lab to engage in preplanned tasks, such as playing with toys, eating a meal, or the parent leaving the room for a short period of time. The advantage of this method over the naturalistic method is that the experimenter can control for more extraneous variables and save time. The cost of using a laboratory observation method is that since the subjects know the experimenter is watching them, their behavior may become artificial. Behavior can also be artificial due to the structured lab being too unlike the natural environment.

      2.2.1.1. Example of a psychology of gender study utilizing observation. Olino et al. (2012) indicate that a growing body of literature points to gender differences in child temperament and adult personality traits throughout life, but that many of these studies rely solely on parent-report measures. Their investigation used paternal-report, maternal-report, and laboratory observation. The laboratory batteries took approximately two hours, and children were exposed to standardized laboratory episodes with a female experimenter. These episodes were intended to elicit individual differences in temperament traits as they relate to behavioral engagement, social behavior, and emotionality. They included Risk Room, where children explore a set of novel and ambiguous stimuli (such as a black box); Stranger Approach, in which the child is left alone in the room briefly and a male research accomplice enters the room and speaks to the child; Pop-up Snakes, in which the child and experimenter surprise the child’s mother with a can of potato chips that contains coiled snakes; and Painting a Picture, which allows the child to play with watercolor pencils and crayons. Observers assigned a 1 for low intensity, 2 for moderate intensity, and 3 for high intensity in relation to facial, bodily, and vocal positive affect, fear, sadness, and anger displays. Outside of these affective codes, observers also used behavioral codes on a similar three-point scale to assess engagement, sociability, activity, and impulsivity. The sample included 463 boys and 402 girls.

Across the three different measures, girls showed higher positive affect and fear and lower activity level compared to boys. When observed in the laboratory, girls showed higher levels of sociability but lower levels of negative emotionality, anger, sadness, and impulsive behavior. Maternal reports showed higher levels of overall negative emotionality and sadness for girls while paternal reports showed higher levels of sociability for boys.

Read the study for yourself: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3532859/

2.2.2. Case Studies

Psychology also utilizes a detailed description of one person, or a small group, based on careful observation. This was the approach the founder of psychoanalysis, Sigmund Freud, took to develop his theories. The advantage of this method is that you arrive at a rich description of the behavior being investigated in one or two individuals, but the disadvantage is that what you are learning may be unrepresentative of the larger population and, therefore, lacks generalizability. Case studies are also subject to the interpretation and bias of the researcher in that they decide what is important to include and not include in the final report. Despite these limitations, case studies can lead us to novel ideas about the cause of behavior and help us to study unusual conditions that occur too infrequently to study with large sample sizes in a systematic way.

      2.2.2.1. Example of a psychology of gender study utilizing a case study. Mukaddes (2002) studied cross-gender behavior in children with high functioning autism. Specifically, two boys who showed persistent gender identity problems were followed over a period of about four years. Case 2, called A.A., was a 7-year-old boy referred to a child psychiatry department in Turkey due to language delay and issues with social interaction. The author goes on to describe in detail the family history and how the child showed a “persistent attachment to his mother’s and some significant female relative’s clothes and especially liked to make skirts out of their scarves. After age 5 years, he started to ‘play house’ and ‘play mother roles’… His parents have tried to establish good bonding between him with his father as an identification object. Despite this, his cross-gender behaviors are persistent (pg. 531).” In the discussion of both cases, the author notes that the report of cross-gender behavior in autistic cases is rare, and that the case study attempts to “…underline that (1) diagnosis of GID in autistic individuals with a long follow-up seems possible; and (2) high functioning verbally-able autistic individuals can express their gender preferences as well as other personal preferences” (pg. 532).

To learn more about observational and case study designs, please take a look at the Research Methods in Psychology textbook by visiting:

https://kpu.pressbooks.pub/psychmethods4e/chapter/observational-research/

2.2.3. Surveys/Self-Report Data

A survey is a questionnaire consisting of at least one scale with some number of questions which assess a psychological construct of interest, such as parenting style, depression, locus of control, communication, attitudes, or sensation-seeking behavior. It may be administered by paper and pencil or computer. Surveys allow for the collection of large amounts of data quickly. However, the actual survey could be tedious for the participant, and social desirability, when a participant answers questions dishonestly so that they are seen in a more favorable light, could be an issue. For instance, if you are asking high school students about their sexual activity, they may not give genuine answers for fear that their parents will find out. If you wanted to know about prejudicial attitudes of a group of people, it could be useful to choose the survey method. You could alternatively gather this information through an interview in a structured or unstructured fashion. Random sampling, in which everyone in the population has an equal chance of being included in the sample, is an important component of survey research. This helps the survey to be representative of the population in terms of demographic variables such as gender, age, ethnicity, race, sexual orientation, education level, and religious orientation.
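The idea of random sampling described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the roster of 500 student IDs, the sample size of 50, and the function name `simple_random_sample` are all hypothetical and do not come from any study cited here.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n members without replacement; every member is equally likely to be chosen."""
    if n > len(population):
        raise ValueError("sample size cannot exceed population size")
    rng = random.Random(seed)  # a seed makes the draw repeatable for demonstration
    return rng.sample(population, n)


# Hypothetical sampling frame: 500 made-up student IDs.
roster = [f"student_{i:03d}" for i in range(1, 501)]
sample = simple_random_sample(roster, 50, seed=7)
print(len(sample), "students selected")
```

In practice the hard part is usually building a complete list of the population to draw from, not the draw itself.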

      2.2.3.1. Example of a psychology of gender study utilizing a survey. Weiser (2004) wanted to see to what extent a gender gap existed in internet use. Utilizing a 19-item survey given to introductory psychology students, he found that males used the internet for entertainment and leisure activities while females used it for interpersonal communication and educational activities. Interestingly, he found that age and internet experience mediated the gender differences.

To learn more about the survey research design, please take a look at our Research Methods in Psychology textbook by visiting:

https://kpu.pressbooks.pub/psychmethods4e/chapter/overview-of-survey-research/

2.2.4. Correlational Research

This research method examines the relationship between two variables or two groups of variables. A numerical measure of the strength of this relationship is derived, called the correlation coefficient, and can range from -1.00 (a perfect inverse relationship, meaning that as one variable goes up the other goes down), to 0 (no relationship at all), to +1.00 (a perfect positive relationship in which as one variable goes up or down so does the other). The advantage of correlational research is that it allows us to observe statistical relationships between variables. Additionally, correlational research can be used when a researcher is not able to manipulate a variable, as in an experiment. An example of a negative correlation is that as a parent becomes more rigid, the attachment of the child to the parent goes down. In contrast, an example of a positive correlation is that as a parent becomes warmer toward the child, the child becomes more attached. However, one must take care not to conflate correlation with causation. Just because there is a statistical relationship between variables does not mean that one caused the other. A spurious correlation is one where there is a statistical relationship between variables, but no causation between them.

For a list of examples of spurious correlations visit: https://www.tylervigen.com/spurious-correlations
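To make the correlation coefficient concrete, here is a minimal sketch that computes Pearson's r, one common correlation coefficient, by hand. The parental warmth, rigidity, and attachment scores are invented for illustration and are deliberately perfectly linear, so they land at the two ends of the range described above; real data would fall somewhere in between.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores for five families (made up for illustration).
warmth = [1, 2, 3, 4, 5]
rigidity = [5, 4, 3, 2, 1]
attachment = [2, 4, 6, 8, 10]

r_pos = pearson_r(warmth, attachment)    # attachment rises with warmth
r_neg = pearson_r(rigidity, attachment)  # attachment falls as rigidity rises
print(r_pos, r_neg)
```

Note that neither value says anything about which variable, if either, causes the other.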

     2.2.4.1. Example of a psychology of gender study utilizing a correlational method. In a study investigating the relationship of gender role identity, support for feminism, and willingness to consider oneself a feminist, Toller, Suter, and Trautman (2004) found that when men scored high on the Sexual Identity Scale (which indicates high levels of femininity), they were supportive of the women’s movement and were more willing to consider themselves a feminist (positive correlations). In contrast, high scores on the Personal Attributes Questionnaire (PAQ) masculinity index resulted in reports of being less likely to consider themselves a feminist (a negative correlation). In terms of female participants, a positive correlation was found between highly masculine women and positive attitudes toward nontraditional gender roles. The authors note, “Possible explanations for these findings may be that women often describe feminists with masculine traits, such as “dominating” and “aggressive.” Thus, the more feminine women in our study may have viewed feminism and nontraditional gender roles as masculine.”

To learn more about the correlational research design, please take a look at the Research Methods in Psychology textbook by visiting:

https://kpu.pressbooks.pub/psychmethods4e/chapter/correlational-research/

2.2.5. Experiments

An experiment is a controlled test of a hypothesis in which a researcher manipulates one variable and measures its effect on another variable. The variable that is manipulated is called the independent variable (IV) and the one that is measured is called the dependent variable (DV). A common feature of experiments is to have a control group that does not receive the treatment or is not manipulated and an experimental group that does receive the treatment or manipulation. If the experiment includes random assignment, participants have an equal chance of being placed in the control or experimental group. The control group allows the researcher to make a comparison to the experimental group, making a causal statement possible.
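Random assignment can be illustrated with a short Python sketch. It assumes a simple two-group design with groups of equal size; the participant IDs and the function name `randomly_assign` are hypothetical and chosen only for this example.

```python
import random


def randomly_assign(participants, seed=None):
    """Shuffle participants, then split them into control and experimental groups."""
    rng = random.Random(seed)  # a seed makes the assignment repeatable for demonstration
    shuffled = list(participants)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"control": shuffled[:half], "experimental": shuffled[half:]}


# Hypothetical participant IDs P01 through P20.
participant_ids = [f"P{i:02d}" for i in range(1, 21)]
groups = randomly_assign(participant_ids, seed=42)
print(len(groups["control"]), len(groups["experimental"]))
```

Because chance alone decides who lands in which group, pre-existing differences between participants should be spread roughly evenly across the two groups, which is what supports a causal comparison.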

     2.2.5.1. Example of an experimental psychology of gender study. Wirth and Bodenhausen (2009) investigated whether gender played a moderating role in the stigma of mental illness in a web-based survey experiment. They asked participants to read a case summary in which the patient’s gender was manipulated along with the type of disorder. These cases were either of male-typical or female-typical disorders. Their results showed that when the cases were gender typical, participants were less sympathetic, showed more negative affect, and were less likely to help than if the cases were gender atypical. The authors proposed that the gender-typical cases were much less likely to be seen as genuine mental disturbances by the participants.

To learn more about the experimental research design, please take a look at the Research Methods in Psychology textbook by visiting:

https://kpu.pressbooks.pub/psychmethods4e/part/experimental-research/

2.2.6. Multi-Method Research

As you have seen above, no single method alone is perfect. Each has strengths and limitations. As such, for psychologists to provide the clearest picture of what is affecting behavior or mental processes, several of these approaches are typically employed at different stages of the research process. This is called multi-method research.

2.2.7. Archival Research

Another technique used by psychologists is called archival research, or when the researcher analyzes data that has already been collected for another purpose. For instance, a researcher may request data from high schools about students’ GPA and SAT/ACT score(s) and then obtain their four-year GPA from the university they attended. This can be used to make a prediction about success in college and which measure – GPA or standardized test score – is the better predictor.

2.2.8. Meta-Analysis

Meta-analysis is a statistical procedure that allows a researcher to combine data from more than one study. For instance, Marx and Kettrey (2016) evaluated the association between the presence of gay-straight alliances (GSAs) for LGBTQ+ youth and their allies and the youth’s self-reported victimization. In all, the results of 15 studies spanning 2001 to 2014 were combined for a final sample of 62,923 participants and indicated that when a GSA is present, homophobic victimization, fear for safety, and hearing homophobic remarks are significantly lower. The authors state, “The findings of this meta-analysis should therefore be of value to advocates, educators, and policymakers who are interested in alleviating school-based victimization of youth, as those adolescents who are perceived to be LGBTQ+ are at a marked risk for such victimization.”
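One simple way to combine studies is fixed-effect inverse-variance pooling, in which each study's effect estimate is weighted by how precisely it was measured. The sketch below shows that general technique only. It is not a reconstruction of the analysis in Marx and Kettrey (2016), and the three effect sizes and standard errors are invented for illustration.

```python
import math


def pool_fixed_effect(effects, standard_errors):
    """Fixed-effect pooled estimate: weight each study by 1 / (standard error squared)."""
    if len(effects) != len(standard_errors) or not effects:
        raise ValueError("need one standard error per effect size")
    weights = [1 / se ** 2 for se in standard_errors]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return pooled, pooled_se


# Hypothetical effect sizes and standard errors from three made-up studies.
study_effects = [-0.30, -0.10, -0.20]
study_ses = [0.10, 0.20, 0.10]

pooled_effect, pooled_se = pool_fixed_effect(study_effects, study_ses)
print(round(pooled_effect, 3), round(pooled_se, 3))
```

The less precise second study pulls the pooled estimate less than the other two, and the pooled standard error is smaller than that of any single study, which is the main statistical payoff of combining studies.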

2.2.9. Communicating Results

In scientific research, it is common practice to communicate the findings of our investigation. By reporting what we found in our study, other researchers can critique our methodology and address our limitations. Publishing allows psychology to grow its collective knowledge about human behavior based on converging evidence from different kinds of studies. We can also see where gaps still exist. Research is moved to the public domain so others can read and comment on it. Scientists can also replicate what we did and possibly extend our work if it is published.

Communication of results can be through conferences in the form of posters or oral presentations, newsletters from APA or one of its many divisions or other organizations, or through scientific research journals. Published journal articles represent a form of communication between scientists, and in these articles, the researchers describe how their work relates to previous research, how it replicates or extends this work, what their work might mean theoretically, and what it implies for future research.

Research articles begin with an abstract, which is a 150-250-word summary of the article. The purpose is to describe the experiment and allow the reader to make a decision about whether they want to read it further. The abstract provides a statement of purpose, an overview of the methods, the main results, and a brief statement of the conclusion. Key words are also given that allow for students and other researchers to find the article when conducting a search.

The abstract is followed by four major sections. The first is the introduction, designed to provide a summary of the current literature as it relates to your topic. It helps the reader see how you arrived at your hypothesis, as well as the purpose of your study. Essentially, it gives the logic behind the decisions you made. Also stated in the introduction is the hypothesis. Second is the Method section. Since replication is a required element of science, we must have a way to share information on our design and sample with readers. This is the essence of the method section and covers three major aspects of your study – the participants, materials or apparatus, and procedure. The reader needs to know who was in your study so that limitations related to generalizability of your findings can be identified and investigated in the future. Operational definitions are also stated, along with a description of any groups included, identification of random sampling or assignment procedures, and information about how a scale was scored. The method section can be loosely thought of as a cookbook. The participants are your ingredients, the materials or apparatus are whatever tools you will need, and the procedure is the instructions for how to bake the cake.

Next is the Results section. In this section you state the outcomes of your experiment and whether they were statistically significant or not. In this section, you can also present tables and figures. The final section is the Discussion. In this section, the main findings and hypothesis of the study are restated and an interpretation of the findings is offered. Finally, strengths and limitations of the study are stated, which will allow you to propose future directions.

Whether you are writing a research paper for a class, preparing an article for publication, or reading a research article, the structure and function of a research article is the same. Understanding this will help you when reading psychology of gender research articles.

  • Clarify why reliability and validity are important.
  • Define reliability and list and describe forms it takes.
  • Define validity and list and describe forms it takes.

Recall that measurement involves the assignment of scores to an individual which are used to represent aspects of the individual, such as how conscientious they are or their level of depression. Whether or not the scores actually represent the individual is what is in question. Cuttler (2019) says in her book Research Methods in Psychology, “Psychologists do not simply assume that their measures work. Instead, they collect data to demonstrate that they work. If their research does not demonstrate that a measure works, they stop using it.” So how do they demonstrate that a measure works? This is where reliability and validity come in.

2.3.1. Reliability

First, reliability describes how consistent a measure is. It can be measured in terms of test-retest reliability, or how reliable the measure is across time, and internal consistency, or the “consistency of people’s responses across the items on a multiple-item measure” (Cuttler, 2019). Finally, inter-rater reliability describes the consistency of results between different observers. In terms of inter-rater reliability, Cuttler (2019) writes, “If you were interested in measuring university students’ social skills, you could make video recordings of them as they interacted with another student whom they are meeting for the first time. Then you could have two or more observers watch the videos and rate each student’s level of social skills. To the extent that each participant does, in fact, have some level of social skills that can be detected by an attentive observer, different observers’ ratings should be highly correlated with each other.”
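Internal consistency is often summarized with Cronbach's alpha, a statistic the text above does not name but which is widely used for this purpose. The sketch below is a minimal illustration under stated assumptions: the five participants and their answers to a three-item scale are invented, and the function name `cronbach_alpha` is chosen only for this example.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for rows of item scores (one row per participant)."""
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least 2 items and 2 participants")
    # Sample variance of each item across participants.
    item_variances = [variance([row[i] for row in responses]) for i in range(n_items)]
    # Sample variance of each participant's total score.
    total_variance = variance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers from five participants on a three-item scale (1 to 5).
scale_responses = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 1],
    [3, 3, 3],
]

alpha = cronbach_alpha(scale_responses)
print(round(alpha, 2))
```

Values closer to 1 indicate that people who score high on one item tend to score high on the others. Test-retest and inter-rater reliability, by contrast, are usually assessed by correlating two sets of scores.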

2.3.2. Validity

A measure is considered to be valid if its scores represent the variable it is said to measure. For instance, if a scale says it measures depression, and it does, then we can say it is valid. Validity can take many forms. First, face validity is “the extent to which a measurement method appears “on its face” to measure the construct of interest” (Cuttler, 2019). A scale purported to measure values should have questions about values such as benevolence, conformity, and self-direction, and not questions about depression or attitudes toward toilet paper.

Content validity is to what degree a measure covers the construct of interest. Cuttler (2019) says, “… consider that attitudes are usually defined as involving thoughts, feelings, and actions toward something. By this conceptual definition, a person has a positive attitude toward exercise to the extent that he or she thinks positive thoughts about exercising, feels good about exercising, and actually exercises.”

Oftentimes, we expect a person’s scores on one measure to be correlated with scores on another measure to which we expect it to be related; this is called criterion validity. For instance, consider parenting style and attachment. We would expect that if a person indicates on one scale that their father was authoritarian (or dictatorial) then attachment would be low or insecure. In contrast, if the mother was authoritative (or democratic) we would expect the child to show a secure attachment style.

As researchers, we strive for results that will generalize from our sample to the larger population. In the example of case studies, the sample is too small to make conclusions about everyone. If our results do generalize from the circumstances under which our study was conducted to similar situations, then we can say our study has external validity. External validity is also affected by how real, or natural, the research is. Two types of realism are possible. First, mundane realism occurs when the research setting closely resembles the real-world setting. Second, experimental realism is the degree to which the experimental procedures that are used feel real to the participant. It does not matter whether they truly mirror real life, only that they appear real to the participant. If so, his or her behavior will be more natural and less artificial.

In contrast, a study is said to have good internal validity when we can confidently say that the effect on the dependent variable (the one that is measured) was due solely to our manipulation of the independent variable. A confound occurs when a factor other than the independent variable leads to changes in the dependent variable.

To learn more about reliability and validity, please visit:

https://kpu.pressbooks.pub/psychmethods4e/chapter/reliability-and-validity-of-measurement/

  • Exemplify instances of ethical misconduct in research.
  • List and describe principles of research ethics.

Throughout this module so far, we have seen that it is important for researchers to understand the methods they are using. Equally important, they must understand and appreciate ethical standards in research. The American Psychological Association identifies high standards of ethics and conduct as one of its four main guiding principles or missions. To read about the other three, please visit https://www.apa.org/about/index.aspx . So why are ethical standards needed and what do they look like?

2.4.1. Milgram’s Study on Learning…or Not

The one psychologist most students know about is Stanley Milgram, if not by name then by his study on obedience using shock (Milgram, 1974). Essentially, two individuals came to each experimental session but only one of these two individuals was a participant. The other was what is called a confederate and part of the study without the participant knowing. The confederate was asked to pick heads or tails and then a coin was flipped. As you might expect, the confederate always won and chose to be the learner. The “experimenter,” who was also a confederate, took him into one room where he was hooked up to wires and electrodes. This was done while the “teacher,” the actual participant, watched and added to the realism of what was being done. The teacher was then taken into an adjacent room where he was seated in front of a shock generator. The teacher was told it was his task to read a series of word pairs to the learner. Upon completion of reading the list, he would ask the learner one of the two words and it was the learner’s task to state what the other word in the pair was. If the learner incorrectly paired any of the words, he would be shocked. The shock generator started at 15 volts and increased in 15-volt increments up to 450 volts. The switches were labeled with terms such as “Slight shock,” “Moderate shock,” “Danger: Severe Shock,” and the final two switches were ominously labeled “XXX.”

As the experiment progressed, the teacher would hear the learner scream, holler, plead to be released, complain about a heart condition, or say nothing at all. When the learner stopped replying, the teacher would turn to the experimenter and ask what to do, to which the experimenter indicated for him to treat nonresponses as incorrect and shock the learner. Most participants asked the experimenter whether they should continue at various points in the experiment. The experimenter issued a series of commands, including “Please continue,” “It is absolutely essential that you continue,” and “You have no other choice, you must go on.” Surprisingly, Milgram found that 65% of participants/teachers shocked the learner up to the XXX switches, which would have killed him, because they were ordered to do so.

Source: Milgram, S. (1974). Obedience to authority. New York, NY: Harper Perennial.

If you would like to learn more about the moral foundations of ethical research, please visit:

https://kpu.pressbooks.pub/psychmethods4e/chapter/moral-foundations-of-ethical-research/

2.4.2. Ethical Guidelines

Due to these studies, and others, the American Psychological Association (APA) established guiding principles for conducting psychological research. The principles can be broken down in terms of when they should occur during the process of a person participating in the study.

2.4.2.1. Before participating. First, researchers must obtain informed consent, which is when the person agrees to participate because they are told what will happen to them. They are given information about any risks they face, or potential harm that could come to them, whether physical or psychological. They are also told about confidentiality, or the person’s right not to be identified. Since most research is conducted with students taking introductory psychology courses, they have to be given the right to do something other than a research study to earn any required credits for the class. This is called an alternative activity and could take the form of reading and summarizing a research article. The amount of time taken to do this should not exceed the amount of time the student would be expected to participate in a study.

     2.4.2.2. While participating. Participants are afforded the ability to withdraw, or the person’s right to exit the study if any discomfort is experienced.

     2.4.2.3. After participating. Once their participation is over, participants should be debriefed, which is when the true purpose of the study is revealed, they are told where to go if they need assistance, and how to reach the researcher if they have questions. Researchers are even permitted to deceive participants, or intentionally withhold the true purpose of the study from them. According to the APA, a minimal amount of deception is allowed.

Human research must be approved by an Institutional Review Board or IRB. It is the IRB that will determine whether the researcher is providing enough information for the participant to give consent that is truly informed, if debriefing is adequate, and if any deception is allowed.

If you would like to learn more about how to use ethics in your research, please read:

https://kpu.pressbooks.pub/psychmethods4e/chapter/putting-ethics-into-practice/

Module Recap

In Module 1, we stated that psychology is the study of behavior and mental processes using strict standards of science. In Module 2, we outlined how this is achieved through the use of the scientific method and use of the research designs of observation, case study, surveys, correlation, and experiments. The importance of valid and reliable measures was described. To give our research legitimacy, we must use clear ethical standards for research which include gaining informed consent from participants, telling them of the risks, giving them the right to withdraw, debriefing them, and using only minimal deception.

3rd edition

Scientists’ Gender May Influence the Results of Experiments

A review of past research has found that subjects respond differently to male and female testers

Brigit Katz

Correspondent

In 2015, a study published in the journal Science shook up the scientific community. Researchers tried to reproduce the results of 100 published psychological studies, but were unable to do so two-thirds of the time. Known as the “replication crisis,” this phenomenon has been observed in other scientific fields. But the reasons behind the issue are challenging to tease out. As Richard Harris reports for NPR, one often-ignored influencing factor might surprise you: the gender of the scientists involved.

As part of a review published in Science Advances, a team of three researchers from Uppsala University, in Sweden, looked through a number of past studies and found examples of experiments that were impacted by whether the testers were male or female—“many, many” examples, Harris writes. For instance, children tend to perform better on IQ studies if the tester is a woman. But when it comes to problem-solving tasks, male testers elicit better results among subjects of both sexes. Male college students have been found to inflate their number of sexual partners when being surveyed by a woman. And in studies measuring pain sensitivity, men have been found to report significantly higher pain thresholds when they are interacting with a female tester.

"If you're testing out a new drug for pain, and you're getting these kinds of great results, you might want to look at [the genders of] who's running the experiment and who's participating in the experiment, because that could explain it more than the drug itself," Colin Chapman, one of the authors of the new study, tells Harris.

The paper puts forth a number of hypotheses that might explain why gender influences experimental findings—particularly when it comes to heterosexual subjects. It is possible that subjects’ responses are shaped by their desire to appear more likable or attractive to someone of the opposite sex. This “psychosocial stress,” as the researchers put it, can be linked to a biological response. One study has shown, for instance, that men tested by female experimenters display higher systolic blood pressure, and vice versa.

“Experimenter gender should have the greatest impact in areas of study where participants are in frequent and close contact with experimenters,” the authors of the paper write. “In addition, experiments implicating characteristics important for mate selection—such as mental acuity, physical prowess, or morality—may be more influenced.”

Gender is likely not the only factor that can sway the results of an experiment. “I imagine race, ethnicity, age, that all of those things could have important effects on how research participants perform in a research study,” Kristina Gupta, assistant professor in women’s, gender, and sexuality studies at Wake Forest, tells Ryan F. Mandelbaum of Gizmodo. But the new study maintains that accounting for the influence of gender—by making it standard practice to report the gender of experimenters in scientific studies—could help scientists’ ability to replicate significant experiments.

Brigit Katz is a freelance writer based in Toronto. Her work has appeared in a number of publications, including NYmag.com, Flavorwire and Tina Brown Media's Women in the World.

Gender in a social psychology context.

  • Thekla Morgenroth, Department of Psychology, University of Exeter
  • Michelle K. Ryan, Dean of Postgraduate Research and Director of the Doctoral College, University of Exeter
  • https://doi.org/10.1093/acrefore/9780190236557.013.309
  • Published online: 28 March 2018

Understanding gender and gender differences is a prevalent aim in many psychological subdisciplines. Social psychology has tended to employ a binary understanding of gender and has focused on understanding key gender stereotypes and their impact. While women are seen as warm and communal, men are seen as agentic and competent. These stereotypes are shaped by, and respond to, social contexts, and are both descriptive and prescriptive in nature. The most influential theories argue that these stereotypes develop in response to societal structures, including the roles women and men occupy in society, and status differences between the sexes. Importantly, research clearly demonstrates that these stereotypes have a myriad of effects on individuals’ cognitions, attitudes, and behaviors and contribute to sexism and gender inequality in a range of domains, from the workplace to romantic relationships.

  • gender stereotypes
  • gender norms
  • social psychology
  • social role theory
  • stereotype content model
  • ambivalent sexism
  • stereotype threat
  • Social Psychology

Introduction

Gender is omnipresent—it is one of the first categories children learn, and the categorization of people into men and women 1 affects almost every aspect of our lives. Gender is a key determinant of our self-concept and our perceptions of others. It shapes our mental health, our career paths, and our most intimate relationships. It is therefore unsurprising that psychologists invest a great deal of time in understanding gender as a concept, with social psychologists being no exception. However, this has not always been the case. This article begins with “A Brief History of Gender in Psychology,” which gives an overview about gender within psychology more broadly. The remaining sections discuss how gender is examined within social psychology more specifically, with particular attention to how gender stereotypes form and how they affect our sense of self and our evaluations of others.

A Brief History of Gender in Psychology

During the early years of psychology in general, and social psychology in particular, the topic of gender was largely absent from psychology, as indeed were women. Male researchers made claims about human nature based on findings that were restricted to a small portion of the population, namely, white, young, able-bodied, middle-class, heterosexual men [see Etaugh, 2016; a phenomenon that has been termed androcentrism (Hegarty & Buechel, 2006)]. If women and girls were mentioned at all, they were usually seen as inferior to men and boys (e.g., Hall, 1904).

This invisibility of women within psychology changed with the rise of the second wave of feminism in the 1960s. Here, more women entered psychology, demanded to be seen, and pushed back against the narrative of women as inferior. They argued that psychology’s androcentrism, and the sexist views of psychologists, had not only biased psychological theory and research, but also contributed to and reinforced gender inequality in society. For example, Weisstein (1968) argued that most claims about women made by prominent psychologists, such as Freud and Erikson, lacked an evidential grounding and were instead based on these men’s fantasies of what women were like rather than empirical data. A few years later, Maccoby and Jacklin (1974) published their seminal work, The Psychology of Sex Differences, which synthesized the literature on sex differences and concluded that there were few (but some) sex differences. This led to a growth of interest in the social origins of sex differences, with a shift away from a psychology of sex (i.e., biologically determined male vs. female) and toward a psychology of gender (i.e., socially constructed masculine vs. feminine).

Since then, the psychology of gender has become a respected and widely represented subdiscipline within psychology. In a fascinating analysis of the history of feminism and psychology, Eagly, Eaton, Rose, Riger, and McHugh (2012) examined publications on sex differences, gender, and women from 1960 to 2009. In those 50 years, the number of annual publications rose from close to zero to over 6,500. As a proportion of all psychology articles, one can also see a marked rise in popularity in gender articles from 1960 to 2009, with peak years of interest in the late 1970s and 1990s. In line with the aforementioned shift from sex differences to gender differences, the largest proportion of these articles fall into the topic of “social processes and social issues,” which includes research on gender roles, masculinity, and femininity.

However, as interest in the area has grown, the ways in which gender is studied, and the political views of those studying it, have become more diverse. Eagly and colleagues note:

we believe that this research gained from feminist ideology but has escaped its boundaries. In this garden, many flowers have bloomed, including some flowers not widely admired by some feminist psychologists. (p. 225)

Here, they allude to the fact that some research has shifted away from societal explanations, which feminist psychologists have generally favored, to more complex views of gender difference. Some of these acknowledge the fact that nature and nurture are deeply intertwined, with both biological and social variables being used to understand gender and gender differences (e.g., Wood & Eagly, 2002). Others, such as evolutionary approaches (e.g., Baumeister, 2013; Buss, 2016) and neuroscientific approaches (see Fine, 2010), focus more heavily on the biological bases of gender differences, often causing chagrin among feminists. Nevertheless, much of the research in social psychology has, unsurprisingly, focused on social factors and, in particular, on gender stereotypes. Where do they come from and what are their effects?

Origins and Effects of Gender Stereotypes

A stereotype can be defined as a “widely shared and simplified evaluative image of a social group and its members” (Vaughan & Hogg, 2011, p. 51) and has both descriptive and prescriptive aspects. In other words, gender stereotypes tell us what women and men are like, but also what they should be like (Heilman, 2001). Gender stereotypes are not only widely shared, but they are also stubbornly resistant to change (Haines, Deaux, & Lofaro, 2016). Both the origin and the consequences of these stereotypes have received much attention in social psychology. So how do stereotypes form? The most widely cited theories on stereotype formation, social role theory (SRT; Eagly, 1987; Eagly, Wood, & Diekman, 2000) and the stereotype content model (SCM; Fiske, Cuddy, Glick, & Xu, 2002), answer this question. Both of these models focus on gender as a binary concept (i.e., men and women), as does most psychological research on gender, although they could potentially also be applied to other gender groups. Both theories are considered in turn.

Social Role Theory: Gender Stereotypes Are Determined by Roles

SRT argues that gender stereotypes stem from the distribution of men and women into distinct roles within a given society (Eagly, 1987; Eagly et al., 2000). The authors note the stability of gender stereotypes across cultures and describe two core dimensions: agency, including traits such as independence, aggression, and assertiveness, and communion, including traits such as caring, altruism, and politeness. While men are generally seen to be high in agency and low in communion, women are generally perceived to be high in communion but low in agency.

According to SRT, these gender stereotypes stem from the fact that women and men are over- and underrepresented in different roles in society. In most societies, even those with higher levels of gender equality, men perform less domestic work than women, including childcare, and spend more time in paid employment. Additionally, men disproportionately occupy leadership roles in the workforce (e.g., in politics and management) and are underrepresented in caretaking roles within the workforce (e.g., in elementary education and nursing; see Eagly et al., 2000). Eagly and colleagues argue that this gendered division of labor leads to the formation of gender roles and associated stereotypes. More specifically, they propose that different behaviors are seen as necessary to fulfil these social roles, and different skills, abilities, and traits are seen as necessary to execute these behaviors. For example, elementary school teachers are expected to care for and interact with children, which is seen to require social skills, empathy, and a caring nature. In contrast, such communal attributes might be seen as less important, or even detrimental, for a military leader.

To the extent that women and men are differentially represented and visible in certain roles—such as elementary school teachers or military leaders—the behaviors and traits necessary for these roles become part of each respective gender role. In other words, the behaviors and attributes associated with people in caretaking roles, communion, become part of the female gender role, while the behaviors and attributes associated with people in leadership roles, agency, become part of the male gender role.

Building on SRT, Wood and Eagly (2002) developed a biosocial model of the origins of sex differences which explains the stability of gendered social roles across cultures. The authors argue that, in the past, physical differences between men and women meant that they were better able to perform certain tasks, contributing to the formation of gender roles. More specifically, women had to bear children and nurse them, while men were generally taller and had more upper body strength. In turn, tasks that required upper body strength and long stretches of uninterrupted time (e.g., hunting) were more often carried out by men, while tasks that could be interrupted more easily and be carried out while pregnant or looking after children (e.g., foraging) were more often carried out by women.

Eagly and colleagues further propose that the exact tasks more easily carried out by each sex depended on social and ecological conditions as well as technological and cultural advances. For example, it was only in more advanced, complex societies that the greater size and strength of men led to a division of labor in which men were preferred for activities such as warfare, which also came with higher status and access to resources. Similarly, the development of plough technology led to shifts from hunter–gatherer societies to agricultural societies. This change was often accompanied by a new division of labor in which men owned, farmed, and inherited land while women carried out more domestic tasks. The social structures that arose from these processes in specific contexts in turn affected more proximal causes of gender differences, including gender stereotypes.

It is important to note that this theory focuses on physical differences between the genders, not psychological ones. In other words, the authors do not argue that women and men are inherently different when it comes to their minds, nor that men evolved to be more agentic while women evolved to be more communal.

Stereotype Content Model: Gender Stereotypes Are Determined by Group Relations

The SCM, formulated by Fiske and colleagues (2002), was not developed specifically for gender, but as an explanation of how stereotypes form more generally. Similar to SRT, the SCM argues that gender stereotypes arise from societal structures. More specifically, the authors suggest that status differences and cooperation versus competition determine group stereotypes, among them gender stereotypes. This model also suggests two main dimensions to stereotypes, namely, warmth and competence. The concept of warmth is similar to that of communion, previously described, in that it refers to being kind, nice, and caring. Competence refers to attributes such as being intelligent, efficient, and skillful and is thus different from the agency dimension of SRT.

The SCM argues that the dimensions of warmth and competence originate from two fundamental dimensions, status and competition, which characterize the relationships between groups in every culture and society. The degree to which another group is perceived to be warm is determined by whether the group is in cooperation or in competition with one’s own group, which is in turn associated with perceived intentions to help or to harm one’s own group, respectively. While members of cooperating groups are stereotyped as warm, members of competing groups are stereotyped as cold. Evidence suggests that these two dimensions are indeed universal and can be found in many cultures, including collectivist cultures (Cuddy et al., 2009). Perceptions of competence, however, are affected by the status and power of the group, which go hand-in-hand with the group’s ability to harm one’s own group. Those groups with high status and power are stereotyped as competent, while those that lack status and power are stereotyped as incompetent.

Groups can thus fall into one of four quadrants of this model. Members of high status groups who cooperate with one’s own group are seen as unequivocally positive—as warm and competent—while those of low status who compete with one’s own group are seen as unequivocally negative—cold and incompetent. More interesting are the two groups that fall into the more ambivalent quadrants—those who are perceived as either warm but incompetent or competent but cold. Applied to gender, this model suggests—and research shows—that typical men are stereotyped as competent but cold, the envious stereotype, while typical women are stereotyped as warm but incompetent, the paternalistic stereotype.

However, these stereotypes do not apply equally to all women and men. Rather, subgroups of men and women come with their own stereotypes. Research demonstrates, for example, that the paternalistic stereotype most strongly applies to traditional women such as housewives, while less traditional women such as feminists and career women are stereotyped as high in competence and low in warmth. For men, there are similar levels of variation: the envious stereotype applies most strongly to men in traditional roles such as managers and career men, while other men are perceived as warm but incompetent (e.g., senior citizens), as cold and incompetent (e.g., punks), or as warm and competent (e.g., professors; Eckes, 2002). The section “Gender Stereotypes Affect Emotions, Behavior, and Sexism” discusses the consequences of these stereotypes in more detail.

The Effects of Gender Stereotypes

SRT and the SCM explain how gender stereotypes form. A large body of work in social psychology has focused on the consequences of these stereotypes. These include effects on the gendered perceptions and evaluations of others, as well as effects on the self and one’s own self-image, behavior, and goals.

Gendered Perceptions and Evaluations of Others

Our group-based stereotypes affect how we see members of these groups and how we judge those who do or do not conform to these stereotypes. Gender differs from many other group memberships in several ways (see Fiske & Stevens, 1993), which in turn affects the consequences of these stereotypes. First, Fiske and Stevens argue, gender stereotypes tend to be more prescriptive than other stereotypes. For example, men may often be told to “man up,” to be tough and dominant, while women may be told to smile, to be nice, and to be sexy (but not too sexy). While stereotypes of other groups also have prescriptive elements, it is probably less common to hear Asians be told to be better at math or African Americans be told to be more musical. The consequences of these gendered prescriptions are discussed in the section “Gender Stereotypes Affect the Evaluation of Women and Men.” Second, relationships between women and men are characterized by an unusual combination of power differences and close and frequent contact as well as mutual dependence for reproduction and close relationships. The section “Gender Stereotypes Affect Emotions, Behavior, and Sexism” discusses the effects of these factors.

Gender Stereotypes Affect the Evaluation of Women and Men

The evaluation of women and men is affected by both descriptive and prescriptive gender stereotypes. Research on these effects has predominantly focused on those who occupy counterstereotypical roles such as women in leadership or stay-at-home fathers.

Descriptive stereotypes affect the perception and evaluation of women and men in several ways. First, descriptive stereotypes create biased perceptions through expectancy-confirming processes (see Fiske, 2000) such that individuals, particularly those holding strong stereotypes, seek out information that confirms their stereotypes. This is evident in their tendency to neglect or dismiss ambiguous information and to ask stereotype-confirming questions (Leyens, Yzerbyt, & Schadron, 1994; Macrae, Milne, & Bodenhausen, 1994). Moreover, people are more likely to recall stereotypical information than counterstereotypical information (Rojahn & Pettigrew, 1992). Second, descriptive gender stereotypes also bias the extent to which men and women are seen as suitable for different roles, as described in Heilman’s lack of fit model (1983, 1995) and Eagly and Karau’s role congruity theory (2002). These approaches both suggest that the degree of fit between a person’s attributes and the attributes associated with a specific role is positively related to expectations about how successful a person will be in said role. For example, the traits associated with successful managers are generally more similar to those associated with men than those associated with women (Schein, 1973; see also Ryan, Haslam, Hersby, & Bongiorno, 2011). Thus, all else being equal, a man will be seen as a better fit for a managerial position and in turn as more likely to be a successful manager. These biased evaluations in turn lead to biased decisions, such as in hiring and promotion (see Heilman, 2001).

Prescriptive gender stereotypes also affect evaluations, albeit in different ways. They prescribe how women and men should behave, and also how they should not behave. The “shoulds” generally mirror descriptive stereotypes, while the “should nots” often include behaviors associated with the opposite gender. Thus, what is seen as positive and desirable for one gender is often seen as undesirable for the other and can lead to backlash in the form of social and economic penalties (Rudman, 1998). For example, women who are seen as agentic are punished with social sanctions because they violate the prescriptive stereotype that women should be nice, even in the absence of information indicating that they are not nice (Rudman & Glick, 2001). These processes are particularly problematic in combination with the effects of descriptive stereotypes, as individuals may face a double bind: if women behave in line with gender stereotypes, they lack fit with leadership positions that require agency, but if they behave agentically, they violate gender norms and face backlash in the form of dislike and discrimination (Rudman & Glick, 2001). Similar effects have been found for men who violate prescriptive masculine stereotypes, for example, by being modest (Moss-Racusin, Phelan, & Rudman, 2010) or by requesting family leave (Rudman & Mescher, 2013). Interestingly, however, being communal by itself does not lead to backlash for men (Moss-Racusin et al., 2010). In other words, while men can be perceived as highly agentic and highly communal, this is not true for women, who are perceived as lacking communion when being perceived as agentic and as lacking agency when being perceived as communal.

Gender Stereotypes Affect Emotions, Behavior, and Sexism

Stereotypes not only affect how individuals evaluate others, but also their feelings and behaviors toward them. The Behavior from Intergroup Affect and Stereotypes (BIAS) map (Cuddy, Fiske, & Glick, 2007), which extends the SCM, describes the relationship between perceptions of warmth and competence of certain groups, emotions directed toward these groups, and behaviors toward them. Cuddy and colleagues argue that bias comprises three closely linked elements: cognitions (i.e., stereotypes), affect (i.e., emotional prejudice), and behavior (i.e., discrimination). Groups perceived as warm and competent elicit admiration, while groups perceived as cold and incompetent elicit contempt. Of particular interest to understanding gender are the two ambivalent combinations of warmth and competence: Those perceived as warm but incompetent, such as typical women, elicit pity, while those perceived as competent but cold, such as typical men, elicit envy.

Similarly, perceptions of warmth and competence are associated with behavior. Cuddy and colleagues (2007) argue that the warmth dimension affects behavioral reactions more strongly than competence because it stems from perceptions that a group will help or harm the ingroup. This leads to active facilitation (e.g., helping) when a group is perceived as warm, or active harm (e.g., harassing) when a group is perceived as cold. Competence, however, leads to passive facilitation (e.g., cooperation when it benefits oneself or one’s own group) when the group is perceived as competent, and passive harm (e.g., neglecting to help) when the group is perceived as incompetent.

How these emotional and behavioral reactions affect women and men has received much attention in the literature on ambivalent sexism and ambivalent attitudes toward men (Glick & Fiske, 1996, 1999, 2001). According to ambivalent sexism theory (AST), sexism is not a uniform, negative attitude toward women or men. Rather, it comprises hostile and benevolent elements, which arise from status differences between, and intimate interdependence of, the two genders. While men possess more economic, political, and social power, they depend on women as their mothers and (for heterosexual men) as romantic partners. Thus, while they are likely to be motivated to keep their power, they also need to find ways to foster positive relations with women.

Hostile sexism combines the beliefs that (a) women are inferior to men, (b) men should have more power in society, and (c) women’s sexuality poses a threat to men’s status and power. This form of sexism is mostly directed toward nontraditional women who directly threaten men’s status (e.g., feminists or career women), and women who threaten the heterosexual interdependence of men and women (e.g., lesbians)—in other words, toward women perceived to be competent but cold.

Benevolent sexism is a subtler form of sexism and refers to (a) complementary gender differentiation, the belief that (traditional) women are ultimately the better gender, (b) protective paternalism, where men need to cherish, protect, and provide for women, and (c) heterosexual intimacy, the belief that men and women complement each other such that no man is truly complete without a woman. This form of sexism is directed mainly toward traditional women.

While benevolent sexism may seem less harmful than its hostile counterpart, it ultimately provides an alternative mechanism for the persistence of gender inequality by “keeping women in their place” and discouraging them from seeking out nontraditional roles (see Glick & Fiske, 2001). Exposure to benevolent sexism is associated with women’s increased self-stereotyping (Barreto, Ellemers, Piebinga, & Moya, 2010), decreased cognitive performance (Dardenne, Dumont, & Bollier, 2007), and reduced willingness to take collective action (Becker & Wright, 2011), thus reinforcing the status quo.

Turning to perceptions of men, Glick and Fiske (1999) argue that attitudes are equally ambivalent. Hostile attitudes toward men include (a) resentment of paternalism, stemming from perceptions of unfairness of the disproportionate amounts of power men hold, (b) compensatory gender differentiation, which refers to the application of negative stereotypes to men (e.g., arrogant, unrefined) so that women can positively distinguish themselves from them, and (c) heterosexual hostility, stemming from male sexual aggressiveness and interpersonal dominance. Benevolent attitudes toward men include maternalism, that is, the belief that men are helpless and need to be taken care of at home. Interestingly, while such attitudes portray women as competent in some ways, they still reinforce gender inequality by legitimizing women’s disproportionate amount of domestic work. Benevolent attitudes toward men also include complementary gender differentiation, the belief that men are indeed more competent, and heterosexual attraction, the belief that a woman can only be truly happy when in a romantic relationship with a man.

Cross-cultural research (Glick et al., 2000, 2004) suggests that ambivalent sexism and ambivalent attitudes toward men are similar in many ways and can be found in most cultures. For both constructs, the benevolent and hostile aspects are distinct but positively related, illustrating that attitudes toward women and men are indeed ambivalent, as the mixed nature of stereotypes would suggest. Moreover, ambivalence toward women and ambivalence toward men are correlated, and national averages of both aspects of sexism and of ambivalence toward men are associated with lower gender equality across nations, lending support to the idea that they reinforce the status quo.

Gender Stereotypes Affect the Self

Gender stereotypes not only affect individuals’ reactions toward others, they also play an important part in self-construal, motivation, achievement, and behavior, often without explicit endorsement of the stereotype. This section discusses how gender stereotypes affect observable gender differences and then describes the subtle and insidious effects gender stereotypes can have on performance and achievement through the inducement of stereotype threat (Steele & Aronson, 1995).

Gender Stereotypes Affect Gender Differences

Gender stereotypes are a powerful influence on the self-concept, goals, and behaviors. Eagly and colleagues (2000) argue that girls and boys observe the roles that women and men occupy in society and accommodate accordingly, seeking out different activities and acquiring different skills. They propose two main mechanisms by which gender differences form. First, women and men adjust their behavior to confirm others’ gender-stereotypical expectations. Others communicate their gendered expectations in many, often nonverbal and subtle ways and react positively when expectations are confirmed and negatively when they are not. This subtle communication of expectations reinforces gender-stereotypical behavior as people generally try to elicit positive, and avoid negative, reactions from others. Importantly, the interacting partners need not be aware of these expectations for them to take effect.

The second process by which gender stereotypes translate into gender differences is the self-regulation of behavior based on identity processes and the internalization of stereotypes (e.g., Bem, 1981; Markus, 1977). Most people form their gender identity based on self-categorization as male or female and subsequently incorporate attributes associated with the respective category into their self-concept (Guimond, Chatard, Martinot, Crisp, & Redersdorff, 2006). These gendered differences in the self-concepts of women and men then translate into gender-stereotypical behaviors. The extent to which the self-concept is affected by gender stereotypes, and in turn the extent to which gendered patterns of behavior are displayed, depends on the strength and the salience of this social identity (Hogg & Turner, 1987; Onorato & Turner, 2004). For example, individuals may be more likely to display gender-stereotypical behavior when they identify more strongly with their gender (e.g., Lorenzi-Cioldi, 1991) or when their gender is salient, which is more often the case for women (Cadinu & Galdi, 2012).

However, many different subcategories of women exist (e.g., housewives, feminists, lesbians), and thus what it means to identify as a woman, and behave like a woman, is likely to be complex and multifaceted (e.g., Fiske et al., 2002; van Breen, Spears, Kuppens, & de Lemus, 2017). Moreover, research demonstrates that the salience of gender in any given context also determines the degree to which an individual displays gender-stereotypical behavior (e.g., Ryan & David, 2003; Ryan, David, & Reynolds, 2004). For example, Ryan and colleagues demonstrate that while women and men act in line with gender stereotypes when gender and gender difference are salient, these differences in attitudes and behavior disappear when alternative identities, such as those based on being a student or being an individual, are made salient.

Gender Stereotypes Affect Performance and Achievement

The consequences of stereotypes go beyond the self-concept and behavior. Research on stereotype threat describes the detrimental effects that negative stereotypes can have on performance and achievement. Stereotype threat refers to the phenomenon whereby the awareness of the negative stereotyping of one’s group in a certain domain, and the fear of confirming such stereotypes, can have negative effects on performance, even when the stereotype is not endorsed. The phenomenon was first described by Steele and Aronson (1995) in the context of African Americans’ intellectual test performance, but has since been found to affect women’s performance and motivation in counterstereotypical domains such as math (Nguyen & Ryan, 2008) and leadership (Davies, Spencer, & Steele, 2005). This effect holds true even when minority group members’ prior performance and interest in the domain are the same as those of majority group members (Spencer, Steele, & Quinn, 1999). Moreover, the effect is particularly pronounced when the minority member’s desire to belong is strong and identity-based devaluation is likely (Steele, Spencer, & Aronson, 2002).

Different mechanisms for the effect of stereotype threat have been proposed. Schmader, Johns, and Forbes (2008) suggest that the inconsistency between one’s self-image as competent and the cultural stereotype about one’s group’s lack of competence leads to a physiological stress response that directly impairs working memory. For example, when made aware of the widely held stereotype that women are bad at math, a female math student is likely to experience an inconsistency. This inconsistency, the authors argue, is not only distressing in itself, but induces uncertainty: Am I actually good at math or am I bad at math as the stereotype would lead me to believe? In an effort to resolve this uncertainty, she is likely to monitor her performance more than others do, and more than in a situation in which stereotype threat is absent. This monitoring leads to more conscious, less efficient processing of information (for example, when performing calculations that she would otherwise do more or less automatically) and a stronger focus on detecting potential failure, taking cognitive resources away from the actual task. Moreover, individuals under stereotype threat are more likely to experience negative thoughts and emotions such as fear of failure. In order to avoid the interference of these thoughts, they actively try to suppress them. This suppression, however, takes effort. All of these mechanisms, the authors argue, take working memory space away from the task in question, thereby impairing performance.

The aim of this article has been to give an overview of gender research in social psychology, which has focused predominantly on gender stereotypes, their origins, and their consequences, all of which are connected and reinforce each other. Social psychology has produced many fascinating findings regarding gender, and this article has only just touched on these findings. While research into gender has seen great growth in the past 50 years and has provided us with an unprecedented understanding of women and men and the differences (and similarities) between them, there is still much work to be done.

There are a number of issues that remain largely absent from mainstream social psychological research on gender. First, an interest in, and acknowledgment of, intersectional identities has emerged, such as how gender intersects with race or sexuality. It is thus important to note that many of the theories discussed in this article cannot necessarily be applied directly across intersecting identities (e.g., to women of color or to lesbian women), and indeed the attitudes and behaviors of such women continue to be largely ignored within the field.

Second, almost all social psychological research into gender is conducted using an overly simplistic binary definition of gender in terms of women and men. Social psychological theories and explanations are, for the most part, not taking more complex or more fluid definitions of gender into account and thus are unable to explain gendered attitudes and behavior outside of the gender binary.

Finally, individual perceptions and cognitions are influenced by gendered stereotypes and expectations, and social psychologists are not immune to this influence. How we, as psychologists, ask research questions and how we interpret empirical findings are influenced by gender stereotypes (e.g., Hegarty & Buechel, 2006), and we must remain vigilant that we do not inadvertently seek to reinforce our own gendered expectations and reify the gender status quo.

  • Barreto, M. , Ellemers, N. , Piebinga, L. , & Moya, M. (2010). How nice of us and how dumb of me: The effect of exposure to benevolent sexism on women’s task and relational self-descriptions. Sex Roles , 62 , 532–544.
  • Baumeister, R. F. (2013) Gender differences in motivation shape social interaction patterns, sexual relationships, social inequality, and cultural history. In M. K. Ryan & N. R. Branscombe (Eds.), The SAGE handbook of gender and psychology . Los Angeles, CA: SAGE
  • Becker, J. C. , & Wright, S. C. (2011). Yet another dark side of chivalry: Benevolent sexism undermines and hostile sexism motivates collective action for social change. Journal of Personality and Social Psychology , 101 , 62–77.
  • Bem, S. L. (1981). Gender schema theory: A cognitive account of sex typing. Psychological Review , 88 , 354–364
  • Buss, D. M. (2016). The evolution of desire . In T. K. Shackelford & V. A. Weekes-Shackelford (Eds.), Encyclopedia of evolutionary psychological science . Cham, SZ: Springer.
  • Cadinu, M. , & Galdi, S. (2012). Gender differences in implicit gender self‐categorization lead to stronger gender self‐stereotyping by women than by men. European Journal of Social Psychology , 42 , 546–551.
  • Cuddy, A. J. C. , Fiske, S. T. , & Glick, P. (2007). The BIAS map: Behaviors from intergroup affect and stereotypes. Journal of Personality and Social Psychology , 92 , 631–648.
  • Cuddy, A. J. C. , Fiske, S. T. , Kwan, V. S. Y. , Glick, P. , Demoulin, S. , Leyens, J.-P. , . . . & Ziegler, R. (2009). Stereotype content model across cultures: Towards universal similarities and some differences. British Journal of Social Psychology , 48 , 1–33.
  • Dardenne, B. , Dumont, M. , & Bollier, T. (2007). Insidious dangers of benevolent sexism: Consequences for women’s performance. Journal of Personality and Social Psychology , 93 , 764–779.
  • Davies, P. G. , Spencer, S. J. , & Steele, C. M. (2005). Clearing the air: Identity safety moderates the effects of stereotype threat on women’s leadership aspirations. Journal of Personality and Social Psychology , 88 , 276–287.
  • Deaux, K., & Major, B. (1987). Putting gender into context: An interactive model of gender-related behavior. Psychological Review, 94, 369–389.
  • Eagly, A. H. (1987). Sex differences in social behavior: A social role interpretation. Hillsdale, NJ: Erlbaum.
  • Eagly, A. H., Eaton, A., Rose, S., Riger, S., & McHugh, M. (2012). Feminism and psychology: Analysis of a half-century of research on women and gender. American Psychologist, 67, 211–230.
  • Eagly, A. H., & Karau, S. J. (2002). Role congruity theory of prejudice towards female leaders. Psychological Review, 109, 573–598.
  • Eagly, A. H., Wood, W., & Diekman, A. B. (2000). Social role theory of sex differences and similarities: A current appraisal. In T. Eckes & H. M. Trautner (Eds.), The developmental social psychology of gender (pp. 123–174). Mahwah, NJ: Erlbaum.
  • Eckes, T. (2002). Paternalistic and envious gender stereotypes: Testing predictions from the stereotype content model. Sex Roles, 47, 99–114.
  • Etaugh, C. (2016). Psychology of gender: History and development of the field. In N. Naples, R. C. Hoogland, M. Wickramasinghe, & W. C. A. Wong (Eds.), The Wiley Blackwell encyclopedia of gender and sexuality studies (pp. 1–12). Hoboken, NJ: John Wiley & Sons.
  • Fine, C. (2010). Delusions of gender: How our minds, society, and neurosexism create difference. New York: Norton.
  • Fiske, S. T. (2000). Stereotyping, prejudice, and discrimination at the seam between the centuries: Evolution, culture, mind, and brain. European Journal of Social Psychology, 30, 299–322.
  • Fiske, S. T., Cuddy, A. J. C., Glick, P., & Xu, J. (2002). A model of (often mixed) stereotype content: Competence and warmth respectively follow from perceived status and competition. Journal of Personality and Social Psychology, 82, 878–902.
  • Fiske, S. T., & Stevens, L. E. (1993). What’s so special about sex? Gender stereotyping and discrimination. In S. Oskamp & M. Costanzo (Eds.), Gender issues in contemporary society (pp. 173–196). Newbury Park, CA: SAGE.
  • Glick, P., & Fiske, S. T. (1996). The Ambivalent Sexism Inventory: Differentiating hostile and benevolent sexism. Journal of Personality and Social Psychology, 70, 491–512.
  • Glick, P., & Fiske, S. T. (1999). The Ambivalence toward Men Inventory: Differentiating hostile and benevolent beliefs about men. Psychology of Women Quarterly, 23, 519–536.
  • Glick, P., & Fiske, S. T. (2001). An ambivalent alliance: Hostile and benevolent sexism as complementary justifications of gender inequality. American Psychologist, 56, 109–118.
  • Glick, P., Fiske, S. T., Mladinic, A., Saiz, J. L., Abrams, D., Masser, B., . . . & López, W. L. (2000). Beyond prejudice as simple antipathy: Hostile and benevolent sexism across cultures. Journal of Personality and Social Psychology, 79(5), 763–775.
  • Glick, P., Lameiras, M., Fiske, S. T., Eckes, T., Masser, B., Volpato, C., . . . & Wells, R. (2004). Bad but bold: Ambivalent attitudes toward men predict gender inequality in 16 nations. Journal of Personality and Social Psychology, 86(5), 713–728.
  • Guimond, S., Chatard, A., Martinot, D., Crisp, R. J., & Redersdorff, S. (2006). Social comparison, self-stereotyping, and gender differences in self-construals. Journal of Personality and Social Psychology, 90, 221.
  • Haines, E. L., Deaux, K., & Lofaro, N. (2016). The times they are a-changing . . . or are they not? A comparison of gender stereotypes, 1983–2014. Psychology of Women Quarterly, 40(3), 1–11.
  • Hall, G. S. (1904). Adolescence: Its psychology and its relations to physiology, anthropology, sociology, sex, crime, religion, and education. New York: D. Appleton.
  • Hegarty, P., & Buechel, C. (2006). Androcentric reporting of gender differences in APA journals: 1965–2004. Review of General Psychology, 10(4), 377.
  • Heilman, M. E. (1983). Sex bias in work settings: The lack of fit model. In B. Staw & L. Cummings (Eds.), Research in organizational behavior (Vol. 5). Greenwich, CT: JAI.
  • Heilman, M. E. (1995). Sex stereotypes and their effects in the workplace: What we know and what we don’t know. Journal of Social Behavior and Personality, 10, 3–26.
  • Heilman, M. E. (2001). Description and prescription: How gender stereotypes prevent women’s ascent up the organizational ladder. Journal of Social Issues, 57, 657–674.
  • Hogg, M., & Turner, J. C. (1987). Intergroup behaviour, self-stereotyping and the salience of social categories. British Journal of Social Psychology, 26, 325–340.
  • Leyens, J-Ph., Yzerbyt, V., & Schadron, G. (1994). Stereotypes, social cognition, and social explanation. London: SAGE.
  • Lorenzi-Cioldi, F. (1991). Self‐stereotyping and self‐enhancement in gender groups. European Journal of Social Psychology, 21(5), 403–417.
  • Maccoby, E. E., & Jacklin, C. N. (1974). The psychology of sex differences. Palo Alto, CA: Stanford University Press.
  • Macrae, C. N., Milne, A. B., & Bodenhausen, G. V. (1994). Stereotypes as energy-saving devices: A peek inside the cognitive toolbox. Journal of Personality and Social Psychology, 66, 37–47.
  • Markus, H. R. (1977). Self-schemata and processing information about the self. Journal of Personality and Social Psychology, 35, 63–78.
  • Moss-Racusin, C. A., Phelan, J. E., & Rudman, L. A. (2010). When men break the gender rules: Status incongruity and backlash toward modest men. Psychology of Men and Masculinity, 11, 140–151.
  • Nguyen, H. D., & Ryan, A. M. (2008). Does stereotype threat affect test performance of minorities and women? A meta-analysis of experimental evidence. Journal of Applied Psychology, 93, 1314–1334.
  • Onorato, R. S., & Turner, J. C. (2004). Fluidity in the self‐concept: the shift from personal to social identity. European Journal of Social Psychology, 34, 257–278.
  • Rojahn, K., & Pettigrew, T. F. (1992). Memory for schema-relevant information: A meta-analytic resolution. British Journal of Social Psychology, 31, 81–109.
  • Rudman, L. A. (1998). Self-promotion as a risk factor for women: The costs and benefits of counterstereotypical impression management. Journal of Personality and Social Psychology, 74, 629–645.
  • Rudman, L. A., & Glick, P. (2001). Prescriptive gender stereotypes and backlash toward agentic women. Journal of Social Issues, 57, 743–762.
  • Rudman, L. A., & Mescher, K. (2013). Penalizing men who request a family leave: Is flexibility stigma a femininity stigma? Journal of Social Issues, 69, 322–340.
  • Ryan, M. K., & David, B. (2003). Gender differences in ways of knowing: The context dependence of The Attitudes Toward Thinking and Learning Survey. Sex Roles, 49, 693–699.
  • Ryan, M. K., David, B., & Reynolds, K. J. (2004). Who cares?: The effect of context on self-concept and moral reasoning. Psychology of Women Quarterly, 28, 246–255.
  • Ryan, M. K., Haslam, S. A., Hersby, M. D., & Bongiorno, R. (2011). Think crisis—think female: The glass cliff and contextual variation in the think manager—think male stereotype. Journal of Applied Psychology, 96, 470–484.
  • Schein, V. E. (1973). The relationship between sex-role stereotypes and requisite management characteristics. Journal of Applied Psychology, 57, 95–100.
  • Schmader, T., Johns, M., & Forbes, C. (2008). An integrated process model of stereotype threat effects on performance. Psychological Review, 115, 336–356.
  • Spencer, S. J., Steele, C. M., & Quinn, D. M. (1999). Stereotype threat and women’s math performance. Journal of Experimental Social Psychology, 35, 4–28.
  • Steele, C. M., & Aronson, J. (1995). Stereotype threat and the intellectual test performance of African Americans. Journal of Personality and Social Psychology, 69, 797–811.
  • Steele, C. M., Spencer, S. J., & Aronson, J. (2002). Contending with group image: The psychology of stereotype and social identity threat. In M. P. Zanna (Ed.), Advances in experimental social psychology (Vol. 34, pp. 379–440). San Diego, CA: Academic Press.
  • Van Breen, J. A., Spears, R., Kuppens, T., & de Lemus, S. (2017). A multiple identity approach to gender: Identification with women, identification with feminists, and their interaction. Frontiers in Psychology, 8, 1019.
  • Vaughan, G. M., & Hogg, M. A. (2011). Introduction to Social Psychology (6th ed.). Sydney: Pearson Australia.
  • Weisstein, N. (1968). “Kinder, kuche, kirche” as scientific law: Psychology constructs the female. Boston, MA: New England Free Press.
  • Wood, W., & Eagly, A. H. (2002). A cross-cultural analysis of the behavior of women and men: Implications for the origins of sex differences. Psychological Bulletin, 128, 699–727.

1. Psychology largely conceptualizes gender as binary. While this is problematic in a number of ways, which we touch upon in the Conclusion section, we largely follow these binary conventions throughout this article, as it is representative of the social psychological literature as a whole.

Printed from Oxford Research Encyclopedias, Psychology. Under the terms of the licence agreement, an individual user may print out a single article for personal use (for details see Privacy Policy and Legal Notice).

date: 25 September 2024

REVIEW article

Gender trouble in social psychology: How can Butler’s work inform experimental social psychologists’ conceptualization of gender?

Thekla Morgenroth*

  • 1 Department of Psychology, University of Exeter, Exeter, United Kingdom
  • 2 Faculty of Economics and Business, University of Groningen, Groningen, Netherlands

A quarter of a century ago, philosopher Judith Butler (1990) called upon society to create “gender trouble” by disrupting the binary view of sex, gender, and sexuality. She argued that gender, rather than being an essential quality following from biological sex, or an inherent identity, is an act which grows out of, reinforces, and is reinforced by, societal norms and creates the illusion of binary sex. Despite the fact that Butler’s philosophical approach to understanding gender has many resonances with a large body of gender research being conducted by social psychologists, little theorizing and research within experimental social psychology has drawn directly on Butler’s ideas. In this paper, we will discuss how Butler’s ideas can add to experimental social psychologists’ understanding of gender. We describe Butler’s ideas from Gender Trouble and discuss the ways in which they fit with current conceptualizations of gender in experimental social psychology. We then propose a series of new research questions that arise from this integration of Butler’s work and the social psychological literature. Finally, we suggest a number of concrete ways in which experimental social psychologists can incorporate notions of gender performativity and gender trouble into the ways in which they research gender.

“We’re born naked, and the rest is drag.”

(RuPaul, 1996)

Introduction

A quarter of a century ago, philosopher Judith Butler (1990) called upon society to create “gender trouble” by disrupting the binary view of sex, gender, and sexuality. Key to her argument is that gender is not an essential, biologically determined quality or an inherent identity, but is repeatedly performed, based on, and reinforced by, societal norms. This repeated performance of gender is also performative, that is, it creates the idea of gender itself, as well as the illusion of two natural, essential sexes. In other words, rather than being women or men, individuals act as women and men, thereby creating the categories of women and men. Moreover, they face clear negative consequences if they fail to do their gender right.

We argue that Butler’s philosophical approach to understanding gender has many resonances with, and implications for, a large body of gender research being conducted by social psychologists. Indeed, Butler’s notion of performativity echoes a range of social psychological approaches to gender and gender difference. What we social psychologists might call gender norms and stereotypes (e.g., Eagly, 1987; Fiske and Stevens, 1993), or gender schemas (Bem, 1981), provide the “scripts” for what Butler describes as the performance of gender.

We are not the first to point out the relevance of Butler’s work to social psychology. Bem (1995), drawing on Butler’s work, argued that as gender researchers we should create gender trouble by making genders that fall outside of the binary visible, in order to disrupt binary, heteronormative views of gender within and outside of psychology. Minton (1997) argued that queer theory more broadly, which challenges the binary, heteronormative system of sex and gender, should inform psychological theory and practice. Similarly, Hegarty (1997) used Butler’s arguments regarding performativity to criticize neuropsychological research that essentializes sexual orientation, pointing out the ways in which it ignores historical and cultural variation in sexuality and excludes women and other minorities. However, despite these calls for gender trouble over 20 years ago, we believe that social psychology, and experimental social psychology in particular, has yet to truly step up and answer the call.

Despite past acknowledgments of the importance of Butler’s work by social psychologists, in particular by qualitative psychologists, to our knowledge, little theorizing and research within experimental (and quantitative) social psychology has directly drawn on Butler’s ideas. This is despite the fact that there are identifiable similarities between Butler’s ideas and the broad theoretical ideas espoused by many social psychologists with an interest in gender. Thus, we argue that there is great value in (again) promoting the ideas Butler puts forward in Gender Trouble to social psychologists. While experimental social psychological perspectives on gender have been concerned primarily with the origin and perpetuation of gender stereotypes, Butler’s work is more political in her explicit call to create gender trouble. The political nature of the work is perhaps one reason why experimental social psychologists have been reluctant to build on and integrate Butler’s ideas in their work – but, we would argue, it is indeed one of the reasons they should. Combining these two perspectives seems potentially fruitful, bringing together Butler’s theorizing and her call for social and political change with established experimental social psychological theory and empirically testable hypotheses.

In this paper we will first describe Butler’s work in more detail. We will then discuss the extent to which her work fits with different conceptualizations of gender in the social psychological literature, with a focus on experimental social psychology. We will then propose new avenues of research that could potentially grow out of an integration of Butler’s work into social psychology. Finally, we will discuss the different ways in which Butler’s work can inform and challenge the ways in which we, as experimental social psychologists, study and operationalize gender.

Butler’s View on Gender

In her book Gender Trouble Butler (1990) argues that within Western culture, sex, gender, and sexual orientation are viewed as closely linked, essential qualities. The prevalent view is that biological sex is binary (male vs. female), essential, and natural, and that it forms the basis for binary gender, which is viewed as the cultural interpretation of sex, and sexual desire. In other words, there is a belief that a baby born with a penis will grow up to identify and act as a man – whatever that means in a specific culture – and, as part of this gender role, be sexually attracted to women. Similarly, there is a belief that a baby born with a vagina will grow up to identify and act as a woman and, as part of this gender role, be sexually attracted to men. Butler argues that these configurations of sex, gender, and sexual desire are the only “intelligible” genders in our culture.

This societal view of gender is also reflected in the works of many feminist writers, who define sex as biological and gender as cultural (see Gould, 1977, for a review and critical discussion). Butler criticizes this distinction between sex – as natural, essential, and pre-discursive (i.e., existing before culture and before interpretation) – and gender as its cultural interpretation. She argues that it is not just gender that is culturally constructed and has prescriptive and proscriptive qualities, but that this also applies to sex as a binary category. Through this, Butler (1990) argues that the distinction between sex and gender is meaningless, noting that “perhaps this construct called ‘sex’ is as culturally constructed as gender; indeed, perhaps it was always already gender with the consequence that the distinction between sex and gender turns out to be no distinction at all” (p. 9).

Butler cites evidence for the considerable variability in chromosomes, genitalia, and hormones, which do not always align in the expected, binary manner. Indeed, even biologists, who traditionally view the body as natural and pre-discursive, increasingly argue that a binary view of human sex is overly simplistic and that sex should be viewed as a spectrum rather than a dichotomy, in terms of anatomical, hormonal, and even cellular sex (see Fausto-Sterling, 2000; Ainsworth, 2015; see also Fausto-Sterling, 1993). This variability can include ambiguous genitalia, a “mismatch” between chromosomes and genitalia, or a body that is composed of a mix of “male” (XY) and “female” (XX) cells¹. Some research suggests that up to 10% of children are born with sex characteristics that do not clearly fall into the category of female or male (e.g., Arboleda et al., 2014), although these numbers are debated and some argue the number is much lower. For example, Sax (2002) argues that only very specific “conditions” should qualify as intersex and that only about 0.018% of people should be considered intersex. We would argue, however, that exact numbers or specific definitions of what constitutes “intersex” are irrelevant here and that debates about exact numbers are indeed illustrative of the very process Butler discusses – that there is no “objective” or natural sex, but that it is performatively constructed.

Regardless of exact numbers, Butler argues that any individual who does not fall clearly into one of the two sex categories is labeled as abnormal and pathological (see Sax’s usage of the term “condition”), and steps are taken to “rectify” this abnormality. For example, the majority of babies born with intersex characteristics undergo surgery and are raised as either male or female (Human Rights Watch, 2017), protecting and maintaining the binary construction of sex.

To be clear, Butler does not argue that biological processes do not exist or do not affect differences in hormones or anatomy. Rather, she argues that bodies do not exist outside of cultural interpretation and that this interpretation results in over-simplified, binary views of sex. In other words, biological processes do not themselves result in two “natural,” distinct, and meaningful, categories of people. The two sexes only appear natural, obvious, and important to us because of the gendered world in which we live. More specifically, the repeated performance of two polar, opposite genders makes the existence of two natural, inherent, pre-discursive sexes seem plausible. In other words, Butler views gender as a performance in which we repeatedly engage and which creates the illusion of binary sex. She argues:

“Because there is neither an ‘essence’ that gender expresses or externalizes nor an objective ideal to which gender aspires; because gender is not a fact, the various acts of gender create the idea of gender, and without those acts, there would be no gender at all. Gender is, thus, a construction that regularly conceals its genesis. The tacit collective agreement to perform, produce, and sustain discrete and polar genders as cultural fictions is obscured by the credibility of its own production. The authors of gender become entranced by their own fictions whereby the construction compels one’s belief in its necessity and naturalness.” (p. 522)

Thus, for Butler, gender is neither essential nor biologically determined, but rather it is created by its own performance and hence it is performative. The term performativity, originating in Austin’s (1962) work on performative utterances, refers to speech acts or behaviors which create the very thing they describe. For example, the sentence “I now pronounce you man and wife” not only describes what the person is doing (i.e., pronouncing something) but also creates the marriage (i.e., the thing it is pronouncing) through the pronouncement. Butler builds on this work by exploring how gender works in a similar way – gender is created by its own performance.

However, as this binary performance of gender is almost ubiquitous, its performative nature is concealed. The binary performance of gender is further reinforced by the reactions of others to those who fail to adhere to gender norms. Butler argues that “Discrete genders are part of what ‘humanizes’ individuals within contemporary culture; indeed, those who fail to do their gender right are regularly punished” (p. 522). This punishment includes the oppression of women and the stigmatization and marginalization of those who violate the gender binary, either by disrupting the presumed link between sex and gender (e.g., transgender individuals) or between sex and sexuality (e.g., lesbian and gay individuals) or by challenging the binary system in itself (e.g., intersex, bisexual, or genderqueer individuals). This stigma is clearly evidenced by the high rate of violence against transgender women, particularly those of color (Adams, 2017); surgeries performed on intersex babies to achieve “normal” sex characteristics (Human Rights Watch, 2017); and the stigmatization of sexual minorities (Lick et al., 2013).

These negative reactions and the binary performance of gender, Butler argues, do not exist by chance. Instead, they serve as tools of a system of power structures which is trying to reproduce and sustain itself – namely a patriarchal system of compulsory heterosexuality in which women serve as a means of reproduction to men, as their mothers and wives. These power structures are both prohibitive (i.e., proscriptive), repressing deviating gender performance, as well as generative (i.e., prescriptive), creating binary, heteronormative gender performance.

Butler’s work is a call to action to overthrow these structures and end the problematic practices that they engender. However, she criticizes feminist voices who emphasize a shared identity (“women”) to motivate collective action on behalf of the group in order to achieve societal changes. By arguing that gender is not something one is, but rather something one does or performs, Butler argues that gender identity is not based on some inner truth, but is instead a by-product of repeated gender performance. Framing gender identity as an inherent part of the self, as many feminist writers did at the time (and indeed still do), she argues, reinforces the gender binary and in turn plays into the hands of the patriarchy and compulsory heterosexuality. Feminists should instead seek to understand how the category of “women” is produced and restrained by the means through which social change is sought (such as language or the political system).

This argument has particular relevance to the notion of gender identity. As such, it has been criticized as invalidating transgender individuals, whose experience of a true inner gender identity that is not in line with the sex they were assigned at birth is often questioned. This is despite the fact that from a young age transgender individuals view themselves in terms of their expressed gender, both explicitly and implicitly, mirroring self-views of cis-gender² children (Olson et al., 2015). Butler has responded to these criticisms repeatedly. For example, answering a question about what is most often misunderstood about her theory in an interview in 2015, she replied:

“I do know that some people believe that I see gender as a “choice” rather than as an essential and firmly fixed sense of self. My view is actually not that. No matter whether one feels one’s gendered and sexed reality to be firmly fixed or less so, every person should have the right to determine the legal and linguistic terms of their embodied lives. So whether one wants to be free to live out a “hard-wired” sense of sex or a more fluid sense of gender, is less important than the right to be free to live it out, without discrimination, harassment, injury, pathologization or criminalization – and with full institutional and community support.” (The Conversation Project, 2015)

Thus, Butler does not question people’s sense of self, but instead criticizes a shared gender identity as the necessary basis for political action. She points out that abandoning the idea of gender as an identity does not take away the potential of agency on behalf of women. Instead, it opens up the possibility of agency, which other approaches that view identity as fixed and stable do not enable. The fact that identity is constructed means that it is neither completely arbitrary and free, nor completely determined, leaving room for re-structuring, subversion, and for disrupting the status quo. Thus, the common identity “we, women” is not necessary for collective action on behalf of the feminist movement, as anyone can engage in subversion and the disruption of the gender binary. Indeed, we would argue that feminism becomes more powerful as an inclusive movement for gender equality more broadly defined, not just equality between women and men.

In conclusion, Butler argues that we, as a society, need to create gender trouble by disrupting the gender binary to dismantle the oppressive system of patriarchy and compulsory heterosexuality. While some of Butler’s ideas seem very different from how gender is generally viewed in the experimental social psychological literature, others resonate well with social psychological theorizing and empirical research. In the next section, we will discuss ways in which Butler’s view is compatible – and incompatible – with some of the most prominent conceptualizations of gender in experimental social psychology.

Is Butler’s View Compatible With Conceptualizations of Gender in Social Psychology?

Gender has been an increasingly important focus within psychology more generally, and in social psychology in particular (e.g., Eagly et al., 2012). While there is considerable variation in how psychologists view and treat gender, we argue that many approaches fall into one of three traditions: (1) evolutionary approaches, which view binary, biological sex as the determinant of gender and gender differences; (2) social structural approaches, which view societal forces such as status and social roles as the determinant of gender stereotypes and, in turn, gender differences; and (3) social identity approaches, which are not mutually exclusive with social structural approaches and which view gender as one out of many social categories with which individuals identify to varying degrees. In addition, integrative approaches draw on more than one of these traditions, as well as developmental, social cognitive, and sociological models of gender, and integrate them to explain gendered behavior. While none of these approaches is entirely compatible with the argument that binary sex is constructed through the repeated binary performance of gender with gender identity as a by-product of this performance, there are great differences in the extent to which they are in line with, and can speak to, Butler’s ideas.

Evolutionary psychology is, we would argue, the least compatible with Butler’s view on sex and gender. Evolutionary approaches to the psychology of gender maintain that gender differences are, for the most part, genetic – resulting from the different adaptive problems faced by women and men in their evolutionary past (see Byrd-Craven and Geary, 2013), particularly due to reproductive differences such as paternal uncertainty for men and higher parental investment for women. These differences, it is argued, then shaped our genes – and gender differences – through sexual selection (i.e., gender differences in the factors predicting successful reproduction; Darwin, 1871). These approaches can be described as essentializing gender, that is, promoting the belief that men and women share an important but unobservable “essence.” Essentialism includes a range of factors such as the degree to which individuals perceive social categories to be fixed and natural (Roberts et al., 2017) and has been shown to be associated with greater levels of stereotyping and prejudice (Brescoll and LaFrance, 2004; Bastian and Haslam, 2006). Evidence further suggests that people who hold highly essentialist beliefs about gender are more supportive of what the authors call “boundary-enhancing initiatives” such as gender-segregated classrooms and legislation forcing transgender individuals to use the bathroom associated with the sex they were assigned at birth (Roberts et al., 2017). Thereby, essentialism, and the resultant stereotypes and prejudice, contribute to the reinforcement of the status quo.

Evolutionary psychology’s approach to gender exemplifies many points Butler (1990) criticizes in Gender Trouble. First, it treats sex as a pre-discursive binary fact rather than a cultural construct. In other words, it ignores variability in chromosomes, genitals, and hormones (Fausto-Sterling, 1993; Ainsworth, 2015) and views binary sex – and gender – as an inherent, essential quality. Moreover, evolutionary approaches argue that gender follows from sex and thus portray binary sex as an explanation for, rather than a result of, gender differences (i.e., gender performance). In addition to ignoring the existence of intersex individuals, these approaches also often ignore homosexuality, focusing exclusively on heterosexual desires and reproduction. Thus, we would argue, such evolutionary approaches play into the patriarchal system of compulsory heterosexuality in which women function primarily as mothers and wives.

Social structural approaches to gender such as early conceptions of social role theory (Eagly, 1987) and the stereotype content model (Fiske and Stevens, 1993) are more compatible with Butler’s views. Such approaches argue that societal structures such as social roles and differences in power and status determine gender stereotypes, which affect both gendered behavior and reactions to those who deviate from gender stereotypes. In other words, gender stereotypes provide the “script” for the performance of gender with negative consequences for those who fail to “learn their lines” or “stick to the script”.

The social psychological literature provides many empirical examples of these negative consequences. For example, Rudman and colleagues describe how those who deviate from their scripts often encounter backlash in the form of economic and social penalties (for a review see Rudman et al., 2012). This backlash discourages individuals from engaging in stereotype-incongruent behavior as they seek to avoid negative consequences in the future, reducing their potential to act as deviating role models for others. Moreover, witnessing the backlash gender troublemakers encounter may also vicariously discourage others from breaking gender stereotypes to avoid negative consequences for themselves. The literature on precarious manhood further suggests that these issues might be particularly pronounced for men (Bosson et al., 2013). Research demonstrates that men must continuously prove their masculinity by avoiding anything deemed feminine to avoid negative consequences such as loss of status. Each of these lines of research is very much in line with Butler’s arguments, both with the idea that those who “fail to do their gender right” are punished and with the idea that the gender binary is a tool to uphold the patriarchy.

However, in other respects, social structural approaches are less compatible with Butler’s arguments. First, they tend not to take non-binary gender into account, and the empirical research tends to operationalize men and women as disjunct categories. Although research focusing on how intra-gender variability is often much larger than between-gender variability (e.g., Hyde, 2005) is a good first step, it still ultimately relies on dividing people into the binary categories of female and male. Moreover, these approaches also rarely take issues of intersectionality into account (see Shields, 2008) and focus on stereotypes of white, heterosexual, middle-class, cis women and men, although there are some notable exceptions (e.g., Fingerhut and Peplau, 2006; Brambilla et al., 2011).

Approaches from the social identity and self-categorization tradition ( Tajfel and Turner, 1979 ; Turner et al., 1987 ) view gender as a social identity (e.g., Skevington and Baker, 1989 ). This tradition argues that in addition to one’s personal identity, different social groups are integrated into the self-concept, forming social identities. These social identities can be based on meaningful social categories such as gender or occupation, but also in response to random allocation to seemingly meaningless groups. The strength of the identification with one’s gender as well the salience of this identity in any given context determine the extent to which the self-concept is affected by gender stereotypes – and in turn the extent to which gendered patterns of behavior are displayed (e.g., Lorenzi-Cioldi, 1991 ; Ryan and David, 2003 ; Ryan et al., 2004 ; Cadinu and Galdi, 2012 ).

While the idea of gender as an identity – rather than a result of gendered behavior – may be seen as being inconsistent with Butler’s argument, results from minimal group studies (e.g., Tajfel et al., 1971) are very much in line with her reasoning. These studies demonstrate that identities can form on the basis of completely irrelevant, artificial categories and are thus neither inherent nor inevitable. Thus, while in our given society these identities are considered to be largely binary, this is not inevitable and is likely the result of social forces. Moreover, the evidence from a social identity perspective that changes in context can affect gender salience, levels of identification, and thus the extent of gendered behaviors is also very much in line with Butler’s arguments.

Lastly, integrative approaches draw on more than one of these traditions as well as developmental, social cognitive, and sociological models of gender. For example, social role theory has developed over time, integrating biological as well as social identity aspects into its framework, resulting in a biosocial approach ( Eagly and Wood, 2012 ). More specifically, more recent versions of the theory argue that the division of labor leads to gendered behavior via three different mechanisms: (1) social regulation (as described above), (2) identity-based regulation, similar to the processes outlined by social identity theory, and (3) biological regulation through hormonal processes such as changes in testosterone and oxytocin. Importantly, these processes interact with one another, that is, hormonal responses are dependent on expectations from others and gender identity. While the social regulation of gender is very much in line with Butler’s arguments, the integration of biological – and particularly evolutionary – perspectives fits less with her idea that gender performance is what creates gender.

Another influential integrative approach is the interactive model of gender-related behavior ( Deaux and Major, 1987 ). Rather than focusing on distal factors which affect gender stereotypes, this model focuses on the situational and contextual factors which result in gendered behavior. The model assumes that the performance of gender primarily takes place in social interactions and serves specific social purposes. Gendered behavior thus emerges based on the expectations held by the perceiver, such as stereotypes, schemata, and knowledge about the specific target; the target themselves (e.g., their self-schema, their desire to confirm or disprove the perceiver’s expectations), and the situation. For example, large gender differences in behavior are likely to emerge when the perceiver believes men and women are very different and thus expects stereotypical behavior, changing the way they treat and communicate with male and female targets; when male and female targets hold very gendered self-schemata and are motivated to confirm the perceiver’s expectations; and when the situation makes stereotypes salient and allows for different behaviors to emerge.

This model is perhaps the most in line with Butler’s perspectives on gender. Similar to Butler, it focuses on the doing of gender, that is, on gendered behavior and its emergence in social interactions. Moreover, the model takes a more social cognitive approach, referring to gendered self-schemata rather than gender identities . Thus, while retaining the context dependence of gendered behavior inherent in social identity approaches, this model does not necessarily presume gender as a social identity in terms of men and women. In contrast to all other models discussed above, this model allows for a less binary, more fluid understanding of gender.

While these approaches thus vary considerably in how compatible they are with Butler’s argument, all of them treat gender as a given, pre-existing fact, which is in stark contrast to Butler’s core argument of gender being a performative act, coming into existence only through its own performance. The work of social psychologists operating outside of the experimental framework is more compatible in this regard. More specifically, discourse analysts argue that the self, including the gendered self, is created through language (e.g., Kurz and Donaghue, 2013 ) and focus on the production of gender in interactions rather than on gender as a predictor of behavior. For example, researchers conducting feminist conversation analysis have examined how patterns in the delivery of naturally occurring speech reproduce heteronormative gender (e.g., Kitzinger, 2005 ) and research from the ethnomethodology-discursive tradition examines how people acquire a gendered character through speech (e.g., Wetherell and Edley, 1999 ).

Future Research Directions

In the previous section, we have outlined how some of the issues raised by Butler, such as the negative reactions to those who fail to do their gender right, have already received considerable attention in the social psychological literature. Other aspects of her argument, however, have received very little attention and hold the potential for interesting future research. We identify two broad ways in which Butler’s work can inform and shape future social psychological research: (a) engendering new research questions which have not yet been investigated empirically, and (b) challenging our way of studying gender itself.

New Research Questions

Butler’s work is purely theoretical and thus many of her ideas have not been tested empirically, particularly using an experimental approach. Perhaps the most central question that can be examined by social psychologists is whether creating “gender trouble” by subverting ideas about sex, gender, and sexual desire, can indeed lead to changes in binary views of sex and gender and the proscriptive and prescriptive stereotypes that come with these views. Based on predictions derived from social role theory ( Eagly, 1987 ), we would indeed expect that a decrease in the performance of gender as binary (i.e., less gendered social roles) would lead to decreases in gender stereotyping and the reliance on gender as a social category. In other words, if genders are not tied to specific social roles (or vice versa), they lose their ability to be informative, both in terms of self-relevant information (“what should I be like?”) and in terms of expectations of others (“what is this person like?”).

On the other hand, as gender identity is very central to the self-image of many people ( Ryan and David, 2003 ), challenging ideas about gender may be perceived as threatening. Social identity theory and self-categorization theory ( Tajfel and Turner, 1979 ; Turner et al., 1987 ) argue that members of groups – including men and women – have a need to see their own group as distinct from the outgroup. If this distinctiveness is threatened, highly identified men and women are likely to enhance the contrast between their ingroup and the outgroup, for example by presenting themselves in a more gender stereotypical way and applying stereotypes to the other group ( Branscombe et al., 1999 ) or by constructing gender differences as essential and biological ( Falomir-Pichastor and Hegarty, 2014 ). These identity processes may thus reinforce a system of two distinct genders with opposing traits, and further punish and alienate those who fail to conform to gender norms and stereotypes. Future research needs to investigate the circumstances under which gender trouble can indeed lead to less binary views of gender, and the circumstances under which it does not. This needs to include identifying the psychological mechanisms and barriers involved in such change.

Importantly, this investigation should go beyond examining reactions to women and men who behave in counter-stereotypical ways, such as women in leadership positions or stay-at-home fathers, and include a focus on more radical challenges to the gender binary such as non-binary and trans individuals or drag performers. Butler discusses drag as an example of gender trouble in detail, quoting the anthropologist Newton (1968) in her observations of how drag subverts notions of gender. Discussing “layers” of appearance, Newton remarks that on the one hand, the outside appearance of drag queens is feminine, but the inside (i.e., the body) is male. At the same time, however, it appears that the outside appearance (i.e., body) is male, but the inside (the “essence”) is feminine, making it hard to uphold consistent, essentialist ideas about sex and gender. Butler further argues that the exaggeration of femininity (in the case of drag queens) and masculinity (in the case of drag kings) in drag performances highlights the performative nature of gendered behaviors, that is, how gender is created through gendered performance. On the other hand, we would argue that because drag performances often draw heavily on gender stereotypes, they may also reinforce the idea of what it means to be a man or a woman. To our knowledge, there is no psychological research on how drag affects perceptions of gender, but as drag becomes more and more accessible to a wider, and more mainstream, audience (e.g., due to popular TV shows such as RuPaul’s Drag Race) it might be an enlightening line of research to pursue. Does drag indeed highlight the performative nature of gender or does it simply reinforce stereotypes? Are reactions to appearance-based disruptions of the gender binary different to behavior-based ones such as reactions to assertive women or submissive men?

Another potential line of research to pursue would be to build on the discursive literature by examining the performative nature of gender from an experimental social psychological perspective, testing how gender is created through speech and behavior. Drawing on some of the findings from qualitative psychological research discussed in the previous section might be helpful in developing predictions and quantitatively testable hypotheses.

Finally, if gender trouble is indeed effective in challenging binary, essentialist views of sex and gender, it is worth investigating how disruptive gender performance can be encouraged and used as a means of collective action. The literature on collective action to achieve gender equality has often drawn on (gender) identity-based ideas of mobilization (e.g., Kelly and Breinlinger, 1995 ; Burn et al., 2000 ). As outlined above, Butler criticizes these approaches and argues that group-based identities (“we, women”) are not necessary to achieve change. How then can we inclusively mobilize others to engage in collective action without drawing on gender identities and inadvertently reinforcing the gender binary – and with it the patriarchal system of compulsory heterosexuality it supports?

More recently, psychologists have argued that it might be more effective to focus on “feminist” (rather than gender) ideologies which acknowledge, rather than ignore, issues of intersectionality (see Radke et al., 2016 ), and to encourage men to engage in collective action to achieve gender equality (e.g., Subašić et al., 2018 ). We agree with these arguments but further suggest that collective action research should examine how individuals of any gender can (a) be motivated to engage in collective action to achieve gender equality generally, and (b) be motivated to engage in gender trouble and disrupt binary notions of gender as a form of collective action.

Studying Gender From a Performative Perspective

In addition to new research questions, Butler’s work also highlights the need for different methodological approaches to gender in experimental social psychology, and indeed there is much that could be learnt from those who work in the discursive tradition. There is also the potential for gender researchers to engage in gender trouble themselves by changing the way in which they treat gender.

For the most part, experimental psychologists have tended to examine gender as a predictor or independent variable – examining gender differences in all manner of social, cognitive, and clinical measures (e.g., Maccoby and Jacklin, 1974; Hyde, 2005). Indeed, as researchers, we (the authors) are guilty of publishing many papers using this methodology (e.g., Haslam and Ryan, 2008; Morgenroth et al., 2017). Similar to performative speech acts, we would argue that this can be seen as a performative research practice. The way in which we conduct our research and the choices we make in relation to gender create the very construct that is studied, namely gender and gender differences. Our assumptions of gender as binary, pre-discursive, and natural produce research that focuses on binary, categorical gender as a predictor of gendered attitudes and behavior.

However, to our knowledge, there is very little quantitative or experimental research that looks at the psychological processes implicated in the performance of gender, that is, research that treats gender as an outcome or dependent variable. If experimental social psychologists are to contribute to gender trouble, we should shift our views away from sex and gender as causes of behavior and psychological outcomes (i.e., as independent or predictor variables). Instead, we should treat gender – whether measured as an identity, in terms of self-stereotyping, or as simple self-categorization – as a result of societal and psychological forces. Rather than asking what sex and gender can explain, we need to look at what explains sex and gender.

Moreover, while the literature acknowledges that gender salience and gender self-stereotyping vary depending on context (e.g., Lorenzi-Cioldi, 1991; Ryan and David, 2003), gender itself, however it is measured, is treated as a stable and discrete construct. One is a man or a woman and remains so over the course of one’s life. If, however, we view gender as a performance, then we must also view gender as an act, a behavior, which changes depending on context and audience. Asking participants to tick a box to indicate their gender – as many of us often do in our research practices – is an overly simplistic measure and cannot capture the nuances of doing gender. It is neither informative nor, we would argue, terribly interesting. Instead, one could measure gender identity salience and importance or gender performance – for example by measuring gender stereotypical behavior or other types of gendered self-stereotyping (e.g., using measures similar to the Bem Sex-Role Inventory; Bem, 1974).

Similarly, we, as researchers, need to stop treating gender as a binary variable. This includes our research practices as well as our theory development and research communications. For example, the demographic sections of most questionnaires should not restrict gender to two options. Instead, they should either provide a range of different options (e.g., non-binary, genderqueer, genderfluid, and agender) or allow open responses. We would also suggest not using the option “other” in addition to “male” and “female” as it can be perceived as stigmatizing. Similarly, if asking about sex rather than gender, at least a third option (i.e., intersex) should be provided (see Fonesca, 2017 , for examples).
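As a rough illustration of the questionnaire recommendation above, the following Python sketch shows one way a gender item could offer several listed options and also accept a free-text self-description, without a residual “other” category. The class name, function name, option list, and the “self-described” label are all hypothetical choices made for this example; they are not taken from any existing survey tool or from the sources cited here.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class GenderQuestion:
    """Hypothetical demographic item with listed options plus open response."""
    prompt: str = "What is your gender?"
    # Illustrative option list only; researchers would adapt it to their sample.
    options: tuple[str, ...] = (
        "woman", "man", "non-binary", "genderqueer", "genderfluid", "agender",
    )
    allow_self_description: bool = True


def normalize_response(question: GenderQuestion, answer: str) -> dict:
    """Map a raw answer to a listed option or keep it as a self-description."""
    cleaned = answer.strip()
    if not cleaned:
        raise ValueError("empty response")
    if cleaned.lower() in question.options:
        # The answer matches one of the listed options.
        return {"category": cleaned.lower(), "self_description": None}
    if question.allow_self_description:
        # Keep the participant's own wording rather than collapsing it to "other".
        return {"category": "self-described", "self_description": cleaned}
    raise ValueError(f"response not among listed options: {cleaned!r}")


question = GenderQuestion()
listed = normalize_response(question, " Non-Binary ")
open_text = normalize_response(question, "bigender")
```

Keeping the self-description verbatim means that these responses remain available for analysis rather than being dropped or merged into a single residual group.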

However, we need to go beyond that. At the moment, even when gender is measured in a non-binary way, those who fall outside of the gender binary are usually excluded from analysis. This is equally true for sexual minorities. Unless sexual orientation is central to the research question, those who don’t identify as heterosexual are often excluded by gender researchers as stereotypes and norms of gay, lesbian, bisexual, or asexual individuals often differ from general gender stereotypes. While these decisions often make sense for each individual case (and we, the authors, have in fact engaged in them as well), this overall produces a picture that erases variation and reinforces the idea that there are two opposing genders with clear boundaries. As experimental social psychologists with an interest in gender, we need to do better. Similarly, our theories themselves should allow for a fluid understanding of gender which also takes issues of intersectionality – with sexual orientation, but also with race, class, and other social categories – into account.

Finally, when we talk about gender, we should do so in a way that makes gender diversity visible rather than in a way that marginalizes non-binary gender further. For example, we can replace binary phrases such as “he or she” with gender-neutral ones such as “they” or ones that highlight non-binary gender such as “he, she, or they” or “he, she, or ze” 3 . While the use of the gender-neutral singular “they” is often frowned upon and deemed grammatically incorrect (American Psychological Association, 2010; University of Chicago, 2010), it has in fact been part of the English language for centuries and was widespread before being proscribed by grammarians advocating for the use of the generic masculine in the 19th century (Bodine, 1975). Despite these efforts, the singular “they” has remained part of spoken language, where it is used to refer to individuals whose sex is unknown or unspecified (e.g., “Somebody left their unicorn in my stable”) and to members of mixed-gender groups (e.g., “Anybody would feed their unicorn glitter if they could”).

The use of new pronouns such as “ze,” specifically developed to refer to people outside of the binary, might be more effortful and equally controversial. However, evidence from Sweden, where the gender-neutral pronoun “hen” has become more widely used since the publication of a children’s book using only “hen” instead of “han” (he) and “hon” (she) in 2012, indicates that attitudes toward its use have shifted dramatically from predominantly negative to predominantly positive in a very short amount of time (Gustafsson Sendén et al., 2015). As gender researchers, we should be at the forefront of such issues and promote and advance gender equality – and gender diversity – not only through our research but also by communicating our research in a gender-inclusive way, especially in light of Butler’s (and others’) arguments that language is a crucial mechanism in creating gender and reinforcing the gender binary.

In this paper we put forward suggestions for ways in which Judith Butler’s (1990) notions of gender trouble could be integrated into experimental social psychology’s understanding of gender, gender difference, and gender inequality. We have outlined her work and discussed the extent to which prominent views of gender within psychology are compatible with this work. Moreover, we suggested potential avenues of future research and changes in the way that we, as researchers, treat gender.

We believe that, as experimental social psychologists, we should be aware that we may inadvertently and performatively reinforce the gender binary in the way in which we do research – in the theories we develop, in the measures that we use, and in the research practices we undertake. By taking Butler’s ideas on board in social psychology, we can broaden our research agenda – raising and answering questions of how social change can be achieved. We can provide a greater understanding of the psychological processes involved in creating gender trouble, and in resisting gender trouble – but above all, we are in a position to create our own gender trouble.

The first author of this paper uses they/them/their pronouns; the second author uses she/her/hers pronouns.

Author Contributions

TM and MR jointly developed the ideas in the paper. TM wrote the paper. MR read the paper and provided feedback on several drafts of the paper.

This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (Grant Agreement No. 725128). This article reflects only the authors’ views. The European Research Council and the Commission are not responsible for any use that may be made of the information it contains.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

The authors would like to thank Thomas Morton, Teri Kirby, Christopher Begeny, and Renata Bongiorno for their helpful comments on a previous version of the manuscript and Peter Hegarty for his contribution as an engaged reviewer.

  • ^ Please note that these terms are based on the common view of naturally binary sex under which most researchers operate. We do not mean to imply that Butler herself would use these terms or, indeed, would be convinced by the idea that these bodies – or any bodies – exist “naturally” prior to interpretation.
  • ^ “ Cis ” refers to individuals for whom the sex they are assigned at birth and their gender identity align.
  • ^ The exact origins of the non-binary pronouns ze/hir or ze/zir are unknown, but ze/hir is often credited to Bornstein (1996) . There are no clear conventions around non-binary pronoun use and many different alternatives have been proposed.

Adams, N. (2017). GLAAD Calls for Increased and Accurate Media Coverage of Transgender Murders. Available at: https://www.glaad.org/blog/glaad-calls-increased-and-accurate-media-coverage-transgender-murders

Ainsworth, C. (2015). Sex redefined. Nature 518, 288–291. doi: 10.1038/518288a

American Psychological Association (2010). Publication Manual of the American Psychological Association , 6th Edn, Washington, DC: American Psychological Association.

Arboleda, V. A., Sandberg, D. E., and Vilain, E. (2014). DSDs: genetics, underlying pathologies and psychosexual differentiation. Nat. Rev. Endocrinol. 10, 603–615. doi: 10.1038/nrendo.2014.130

Austin, J. L. (1962). How to Do Things with Words. Oxford: Clarendon Press.

Bastian, B., and Haslam, N. (2006). Psychological essentialism and stereotype endorsement. J. Exp. Soc. Psychol. 42, 228–235. doi: 10.1016/j.jesp.2005.03.003

Bem, S. L. (1974). The measurement of psychological androgyny. J. Consult. Clin. Psychol. 42, 155–162. doi: 10.1037/h0036215

Bem, S. L. (1981). Gender schema theory: a cognitive account of sex typing. Psychol. Rev. 88, 354–364. doi: 10.1037/0033-295X.88.4.354

Bem, S. L. (1995). Dismantling gender polarization and compulsory heterosexuality: should we turn the volume down or up? J. Sex Res. 32, 329–334. doi: 10.1080/00224499509551806

Bodine, A. (1975). Androcentrism in prescriptive grammar: singular ‘they’, sex-indefinite ‘he’, and ‘he or she’. Lang. Soc. 4, 129–146. doi: 10.1017/S0047404500004607

Bornstein, K. (1996). Nearly Roadkill: An Infobahn Gender Adventure. London: Serpent’s Tail.

Bosson, J. K., Vandello, J. A., and Caswell, T. A. (2013). “Precarious manhood,” in The SAGE Handbook of Gender and Psychology , eds M. K. Ryan and N. R. Branscombe (London: SAGE Publications), 15–130.

Brambilla, M., Carnaghi, A., and Ravenna, M. (2011). Status and cooperation shape lesbian stereotypes: testing predictions from the stereotype content model. Soc. Psychol. 42, 101–110. doi: 10.1027/1864-9335/a000054

Branscombe, N. R., Ellemers, N., Spears, R., and Doosje, B. (1999). “The context and content of social identity threat,” in Social Identity: Context, Commitment, Content , eds N. Ellemers, R. Spears and B. Doosje (Oxford: Blackwell), 35–59.

Brescoll, V., and LaFrance, M. (2004). The correlates and consequences of newspaper reports of research on sex difference. Psychol. Sci. 15, 515–520. doi: 10.1111/j.0956-7976.2004.00712.x

Burn, S. M., Aboud, R., and Moyles, C. (2000). The relationship between gender social identity and support for feminism. Sex Roles 42, 1081–1089. doi: 10.1023/A:1007044802798

Butler, J. (1990). Gender Trouble: Feminism and the Subversion of Identity. Abingdon: Routledge.

Byrd-Craven, J., and Geary, D. C. (2013). “An evolutionary understanding of sex differences,” in The SAGE Handbook of Gender and Psychology , eds M. K. Ryan and N. R. Branscombe (New York, NY: SAGE Publications), 100–114. doi: 10.4135/9781446269930.n7

Cadinu, M., and Galdi, S. (2012). Gender differences in implicit gender self-categorization lead to stronger gender self-stereotyping by women than by men. Eur. J. Soc. Psychol. 42, 546–551. doi: 10.1037/pspp0000124

Darwin, C. (1871). The Descent of Man, and Selection in Relation to Sex. London: Murray.

Deaux, K., and Major, B. (1987). Putting gender into context: an interactive model of gender-related behavior. Psychol. Rev. 94, 369–389. doi: 10.1037/0033-295X.94.3.369

Eagly, A. H. (1987). Sex Differences in Social Behaviour: A Social-Role Interpretation. Hillsdale, NJ: Erlbaum.

Eagly, A. H., Eaton, A., Rose, S., Riger, S., and McHugh, M. (2012). Feminism and psychology: analysis of a half-century of research on women and gender. Am. Psychol. 67, 211–230. doi: 10.1037/a0027260

Eagly, A. H., and Wood, W. (2012). “Social role theory,” in Handbook of Theories of Social Psychology , eds P. A. M. Van Lange, A. W. Kruglanski and E. T. Higgins (Thousand Oaks, CA: Sage Publications Ltd.), 458–476. doi: 10.4135/9781446249222.n49

Falomir-Pichastor, J. M., and Hegarty, P. (2014). Maintaining distinctions under threat: heterosexual men endorse the biological theory of sexuality when equality is the norm. Br. J. Soc. Psychol. 53, 731–751. doi: 10.1111/bjso.12051

Fausto-Sterling, A. (1993). The five sexes: why male and female are not enough. Sciences 33, 19–24. doi: 10.1002/j.2326-1951.1993.tb03081.x

Fausto-Sterling, A. (2000). Sexing the Body: Gender Politics and the Construction of Sexuality. New York, NY: Basic Books.

Fingerhut, A. W., and Peplau, L. A. (2006). The impact of social roles on stereotypes of gay men. Sex Roles 55, 273–278. doi: 10.1007/s11199-006-9080-5

Fiske, S. T., and Stevens, L. E. (1993). “What’s so special about sex? Gender stereotyping and discrimination,” in Gender Issues in Contemporary Society , eds S. Oskamp and M. Costanzo (Thousand Oaks, CA: Sage Publications).

Fonesca, S. (2017). Designing Forms for Gender Diversity and Inclusion. Available at: https://uxdesign.cc/designing-forms-for-gender-diversity-and-inclusion-d8194cf1f51

Gould, M. (1977). Toward a sociological theory of sex and gender. Am. Soc. 12, 182–289. doi: 10.1186/s40064-015-0933-7

Gustafsson Sendén, M., Bäck, E. A., and Lindqvist, A. (2015). Introducing a gender-neutral pronoun in a natural gender language: the influence of time on attitudes and behavior. Front. Psychol. 6:893. doi: 10.3389/fpsyg.2015.00893

Haslam, S. A., and Ryan, M. K. (2008). The road to the glass cliff: differences in the perceived suitability of men and women for leadership positions in succeeding and failing organizations. Leadersh. Q. 19, 530–546. doi: 10.1016/j.leaqua.2008.07.011

Hegarty, P. (1997). Materializing the hypothalamus: a performative account of the ‘gay brain’. Fem. Psychol. 7, 355–372. doi: 10.1177/0959353597073009

Human Rights Watch (2017). “I Want to be Like Nature Made Me”: Medically Unnecessary Surgeries on Intersex Children in the US. Available at: https://www.hrw.org/report/2017/07/25/i-want-be-nature-made-me/medically-unnecessary-surgeries-intersex-children-us

Hyde, J. S. (2005). The gender similarities hypothesis. Am. Psychol. 60, 581–592. doi: 10.1037/0003-066X.60.6.581

Kelly, C., and Breinlinger, S. (1995). Identity and injustice: exploring women’s participation in collective action. J. Community Appl. Soc. Psychol. 5, 41–57. doi: 10.1002/casp.2450050104

Kitzinger, C. (2005). Heteronormativity in action: reproducing the heterosexual nuclear family in after-hours medical calls. Soc. Probl. 52, 477–498. doi: 10.1525/sp.2005.52.4.477

Kurz, T., and Donaghue, N. (2013). “Gender and discourse,” in The SAGE Handbook of Gender and Psychology , eds M. K. Ryan and N. R. Branscombe (London: SAGE Publications), 61–77. doi: 10.4135/9781446269930.n5

Lick, D. J., Durso, L. E., and Johnson, K. L. (2013). Minority stress and physical health among sexual minorities. Perspect. Psychol. Sci. 8, 521–548. doi: 10.1177/1745691613497965

Lorenzi-Cioldi, F. (1991). Self-stereotyping and self-enhancement in gender groups. Eur. J. Soc. Psychol. 21, 403–417. doi: 10.1002/ejsp.2420210504

Maccoby, E. E., and Jacklin, C. N. (1974). Myth, reality and shades of gray: what we know and don’t know about sex differences. Psychol. Today 8, 109–112.

Minton, H. L. (1997). Queer theory: historical roots and implications for psychology. Theory Psychol. 7, 337–353. doi: 10.1177/0959354397073003

Morgenroth, T., Fine, C., Ryan, M. K., and Genat, A. E. (2017). Sex, drugs, and reckless driving: are measures biased toward identifying risk-taking in men? Soc. Psychol. Pers. Sci . doi: 10.1177/1948550617722833

Newton, E. (1968). The Drag Queens: A Study in Urban Anthropology. Ph.D. dissertation, University of Chicago, Chicago, IL.

Olson, K. R., Key, A. C., and Eaton, N. R. (2015). Gender cognition in transgender children. Psychol. Sci. 26, 467–474. doi: 10.1177/0956797614568156

Radke, H. R., Hornsey, M. J., and Barlow, F. K. (2016). Barriers to women engaging in collective action to overcome sexism. Am. Psychol. 71, 863–874. doi: 10.1037/a0040345

Roberts, S. O., Ho, A. K., Rhodes, M., and Gelman, S. A. (2017). Making boundaries great again: essentialism and support for boundary-enhancing initiatives. Pers. Soc. Psychol. Bull. 43, 1643–1658. doi: 10.1177/0146167217724801

Rudman, L. A., Moss-Racusin, C. A., Glick, P., and Phelan, J. E. (2012). “Reactions to vanguards: advances in backlash theory,” in Advances in Experimental Social Psychology , Vol. 45, eds, P. G. Devine and E. A. Plant (San Diego, CA: Academic Press), 167–227. doi: 10.1016/B978-0-12-394286-9.00004-4

RuPaul, Y. (1996). Lettin it All Hang Out: An Autobiography. New York, NY: Hyperion.

Ryan, M. K., and David, B. (2003). Gender differences in ways of knowing: the context dependence of the attitudes toward thinking and learning survey. Sex Roles 49, 693–699. doi: 10.1023/B:SERS.0000003342.16137.32

Ryan, M. K., David, B., and Reynolds, K. J. (2004). Who cares? The effect of gender and context on the self and moral reasoning. Psychol. Women Q. 28, 246–255. doi: 10.1111/j.1471-6402.2004.00142.x

Sax, L. (2002). How common is intersex? A response to Anne Fausto-Sterling. J. Sex Res. 39, 174–178. doi: 10.1080/00224490209552139

Shields, S. A. (2008). Gender: an intersectionality perspective. Sex Roles 59, 301–311. doi: 10.1007/s11199-008-9501-8

Skevington, S. M., and Baker, D. (1989). The Social Identity of Women. London: SAGE Publications.

Subašić, E., Hardacre, S. L., Elton, B., Branscombe, N. R., Ryan, M. K., and Reynolds, K. J. (2018). “‘We for she’: mobilising men and women to act in solidarity for gender equality,” in Group Processes and Intergroup Relations , eds D. Abrams and M. A. Hogg (Thousand Oaks, CA: SAGE Publications).

Tajfel, H., Billig, M. G., Bundy, R. P., and Flament, C. (1971). Social categorization and intergroup behaviour. Eur. J. Soc. Psychol. 1, 149–178. doi: 10.1002/ejsp.2420010202

Tajfel, H., and Turner, J. C. (1979). “An integrative theory of intergroup conflict,” in The Social Psychology of Intergroup Relations , eds W. G. Austin and S. Worchel (Monterey, CA: Brooks), 33–37.

The Conversation Project (2015). Gender Performance: An Interview with Judith Butler. Available at: http://radfem.transadvocate.com/gender-performance-an-interview-with-judith-butler/

Turner, J. C., Hogg, M. A., Oakes, P. J., Reicher, S. D., and Wetherell, M. S. (1987). Rediscovering the Social Group: A Self-Categorization Theory. Oxford: Blackwell.

University of Chicago (2010). The Chicago Manual of Style , 16th Edn, Chicago, IL: University of Chicago Press.

Wetherell, M., and Edley, N. (1999). Negotiating hegemonic masculinity: imaginary positions and psycho-discursive practices. Fem. Psychol. 9, 335–356. doi: 10.1177/0959353599009003012

Keywords: gender trouble, gender, gender performativity, social psychology, non-binary gender, genderqueer, Judith Butler

Citation: Morgenroth T and Ryan MK (2018) Gender Trouble in Social Psychology: How Can Butler’s Work Inform Experimental Social Psychologists’ Conceptualization of Gender? Front. Psychol. 9:1320. doi: 10.3389/fpsyg.2018.01320

Received: 28 March 2018; Accepted: 09 July 2018; Published: 27 July 2018.

Copyright © 2018 Morgenroth and Ryan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Thekla Morgenroth, [email protected]


Experimenter gender and replicability in science

Experimenter gender influences results and may degrade replicability in many fields of scientific inquiry.

There is a replication crisis spreading through the annals of scientific inquiry. Although some work has been carried out to uncover the roots of this issue, much remains unanswered. With this in mind, this paper investigates how the gender of the experimenter may affect experimental findings. Clinical trials are regularly carried out without any report of the experimenter’s gender and with uncertain knowledge of its influence. Consequently, significant biases caused by the experimenter’s gender may lead researchers to conclude that therapeutics or other interventions are either overtreating or undertreating a variety of conditions. This policy paper therefore emphasizes the importance of reporting and controlling for experimenter gender in future research. As backdrop, it explores what we know about the role of experimenter gender in influencing laboratory results, proposes possible mechanisms, and suggests future areas of inquiry.

INTRODUCTION

Failure to replicate significant findings has become a recent concern across several disciplines of scientific inquiry. Some research groups report that attempts to replicate published data in biomedical science fail more often than they succeed, and a recent paper revealed that of 100 articles published in high-ranking psychology journals in 2008, only one-third to one-half of original findings were successfully replicated ( 1 , 2 ). Here, we point to one important and overlooked factor likely perpetuating this ubiquitous problem: the role of experimenter gender. Experiments in humans are regularly carried out without any report of the experimenter’s gender; however, there is a range of evidence supporting the influence of experimenter gender on a variety of psychological and physiological variables ( 3 , 4 ).

Pioneering work into experimenter effects demonstrated that several aspects of the experimenter can have significant influence. Scientists such as Robert Rosenthal laid the groundwork for this understanding, revealing the importance of experimenter expectations in relation to participant performance and, among other things, the importance of experimenter gender ( 5 ). Since these initial investigations, the field has grown: From intelligence testing to pain sensitivity, participants demonstrate robust responses to manipulation of experimenter gender ( 6 , 7 ). The range of effects is troubling because it is broad enough to influence many fields of scientific inquiry that are not accustomed to controlling for experimenter effects.

Variance in such prominent mental and physical variables could encourage reporting of illusory effects in clinical biomedical trials, with potentially serious consequences for patient treatment. For instance, males report less pain in response to nociceptive stimulation when supervised by a female experimenter, as demonstrated by Alabas et al. ( 8 ). If, when testing an antinociceptive drug, a disproportionate number of treatment trials with male participants are supervised by female experimenters, then this could result in overestimations of drug efficacy. Putting aside the possibility of false positives, false negatives could be holding back progress. If scientists have difficulty replicating findings because of excessive null results, then the resulting noise makes any broader analysis less conclusive and more likely to prompt further inquiry and delays. For instance, the Open Science Collaboration unsuccessfully attempted to replicate the findings of Epley et al. ( 9 ). The original study had shown that lonely participants were more likely to restore their sense of belonging through increased belief in supernatural agents and events ( 9 ). The replication, however, failed to find a significant effect. Surprisingly, neither the original nor the replication study reported experimenter gender.
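The confounding scenario described above can be made concrete with a small arithmetic sketch. It assumes a simple additive model that is not taken from the cited studies: reported pain equals a baseline, minus a drug effect for treated participants, minus a fixed reduction whenever a female experimenter supervises the session. The function name and all numbers are hypothetical and chosen only for illustration.

```python
def naive_drug_effect(true_drug_effect, experimenter_effect,
                      share_female_treatment, share_female_placebo,
                      baseline_pain=6.0):
    """Expected placebo-minus-treatment difference in mean reported pain.

    Assumed additive model (illustrative only): male participants report
    `experimenter_effect` less pain when a female experimenter supervises.
    The `share_*` arguments are the fractions of sessions in each arm that
    are supervised by a female experimenter.
    """
    mean_treatment = (baseline_pain - true_drug_effect
                      - experimenter_effect * share_female_treatment)
    mean_placebo = baseline_pain - experimenter_effect * share_female_placebo
    return mean_placebo - mean_treatment


# Balanced supervision: the naive estimate equals the true drug effect.
balanced = naive_drug_effect(1.0, 0.5, 0.5, 0.5)

# Female experimenters run 90% of treatment sessions but only 10% of
# placebo sessions, so part of their effect is attributed to the drug.
confounded = naive_drug_effect(1.0, 0.5, 0.9, 0.1)

print(round(balanced, 2), round(confounded, 2))
```

Under this assumed model the naive estimate is inflated by the experimenter effect multiplied by the difference in female-supervised shares between arms (here 0.5 × 0.8 = 0.4 points). Reversing the imbalance would deflate the estimate by the same amount.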

Thus, this review aims to summarize a sampling of studies demonstrating the influence of experimenter gender in a plethora of contexts, to speculate about mechanisms, and to propose policy recommendations for improving experimenter gender reporting. To this aim, the paper examines—in successive sections—a sampling of the experimenter gender’s established impacts on elements of mind, body, and behavior. Following this, the paper covers possible mechanisms and policy recommendations. Finally, the paper concludes by suggesting future areas of research to further reveal the extent of the biasing effects of experimenter’s gender.

IMPACT ON THE MIND

When an experimenter and participant interact, their genders influence a range of psychological and physical variables, in much the same way as when two friends or colleagues interact. Bearing this in mind, this paper highlights examples of experimenter gender bias within three broad categories of human research: mind, body, and behavior. These sections are further bracketed by areas of study that emphasize the range of experimenter gender effects.

Intelligence

Before any interest was piqued in experimenter gender’s role in biasing other measures, there was a wave of interest in its impact on higher-level cognitive functioning. In particular, scientists were curious about how an experimenter’s gender could influence performance on intelligence testing. Early results suggested a variety of interactions. Studies in children revealed a significant effect of experimenter gender on performance. Namely, female examiners appear to elicit higher full-scale intelligence quotient (IQ), verbal IQ, comprehension, similarities, and vocabulary scores on the Wechsler Intelligence Scale for Children for both boys and girls ( 10 ). These studies raise obvious concerns about the replicability of intelligence testing, but perhaps more alarming is the impact on the development of therapeutics to treat learning disabilities in children. Newer medications for attention deficit hyperactivity disorder may show results that are too favorable, or not significant enough, as a result of experimenter gender influence. Again, this could be holding back or delaying the development of newer, safer therapeutics for use in treating these conditions because more and more studies are run to determine whether a particular compound’s effects are consistent. Worse still, it could halt investigations altogether if early results are unfavorable.

Additional studies have investigated the impact of experimenter gender on creative problem solving. In general, male experimenters have been shown to elicit more solutions in a creative problem solving task (Remote Associates Test) for both genders of participants ( 11 ). However, female participants were significantly more affected by the gender of the experimenter, whereas men were only marginally affected. In other words, male experimenters improved results for both genders but much more so for females. In addition, female experimenters reduced results but also much more so for females. The researchers concluded that females are generally more sensitive to and responsive to other people than males. However, this conclusion should be tempered by the cultural context and timing of the research.

Learning and memory

One of the first studies looking at experimenter gender demonstrated that verbal learning was influenced, such that female participants learned significantly faster in a serial trigram task with a male experimenter as opposed to a female experimenter ( 5 ). Other studies have taken these findings further. Experiments using simple sorting tasks reveal that participants performed significantly better, regardless of gender, when tested by an opposite-gender experimenter ( 12 ). It was speculated that this could follow from opposite-gender dynamics increasing competitiveness, anxiety, or the desire to please. Complicating the picture, however, another study found that on a complex verbal conditioning task, low-anxiety men performed significantly better when tested by a female experimenter, as expected, whereas highly anxious men actually performed worse ( 13 ). The authors theorized that this may have been due to an overload of stress for the high-anxiety men. Thus, although results generally support the conclusion that opposite-gender experimenters improve performance on learning and intelligence-related tests, this conclusion must be tempered because qualities specific to the participant also appear to modulate this effect. Finally, some research has revealed that even fundamental memory processes are sensitive to experimenter gender. Men paired with a female experimenter tend to provide more elaborate verbal autobiographical memories, and women with a male experimenter report fewer “internal states” such as emotional or cognitive states ( 14 ).

Again, these studies are significant in light of the recent surge in development of therapeutics designed to treat conditions such as Alzheimer’s disease and other forms of cognitive impairment associated with aging. In some cases the same cognitive tests that demonstrated experimenter gender biases are used to determine whether these cognitive-enhancing therapeutics are efficacious. Imagine an experiment being run without the gender of the experimenter being stringently controlled, where a female directs the treatment participants and a male directs the placebo participants. Imagine further that the participants themselves are male. This design could easily lead to an exaggerated treatment effect.
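One way to avoid the confounded design imagined above is to cross experimenter gender with treatment arm when sessions are scheduled. The following is a minimal sketch of such a counterbalanced assignment, not a procedure taken from the paper. The arm labels, the two-category gender tuple, and the function name are assumptions made for illustration.

```python
import random
from collections import Counter
from itertools import product

# Hypothetical labels; a real study would substitute its own.
ARMS = ("treatment", "placebo")
EXPERIMENTER_GENDERS = ("female", "male")


def counterbalanced_assignment(participant_ids, seed=0):
    """Assign each participant to an (arm, experimenter gender) cell.

    Participants are shuffled and then dealt in turn to the four cells,
    so cell sizes differ by at most one and experimenter gender is not
    confounded with treatment arm.
    """
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    cells = list(product(ARMS, EXPERIMENTER_GENDERS))
    return {pid: cells[i % len(cells)] for i, pid in enumerate(shuffled)}


assignment = counterbalanced_assignment(range(40))
cell_sizes = Counter(assignment.values())
print(cell_sizes)
```

With 40 participants, each of the four cells receives 10. This sketch balances only experimenter gender and arm. A study that also stratifies by participant gender would need to run the same procedure within each stratum.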

Neurological factors

More recently, some experimenters have ventured into the territory of neurobiology, looking for the neural correlates that one might expect to accompany the behavioral differences that experimenter gender elicits. Evidence indicates that defensiveness is related to relative left frontal activation (LFA) in women and right frontal activation (RFA) in men, as measured by electroencephalogram (EEG). LFA has been associated with “behavioral approach,” whereas RFA has been associated with “behavioral withdrawal.” Researchers have found that when an opposite-gender experimenter is in the room, participants who are highly defensive show greater LFA, and participants who are not defensive show greater RFA ( 15 ). This suggests that when self-presentation is primed via the presence of the opposite gender, different parts of the brain are stimulated depending on the personality of the participant. Presumably, more defensive individuals show greater LFA in the presence of an opposite-gender experimenter because they use more approach-related strategies to cope with their defensive dispositions, whereas less defensive individuals gravitate toward avoidance strategies. Most significantly, this study points to neurological differences in the reaction of participants to experimenter gender, which seem most pronounced in an opposite-gender context, demonstrating the possibility of bias in other neurobiological studies that fail to account for such effects.

IMPACT ON THE BODY

Mental differences in response to interaction with different genders are natural to assume because many people experience these personally. However, less intuitive are the possible effects of experimenter gender on bodily functioning. To date, more research has been done on psychological or mental traits; however, there appear to also be several physical effects, partly mediated by central mechanisms. In addition, not only physical performance but also underlying biomarkers and physiological systems appear to be influenced, again underlining the significance of this bias for clinical therapeutic trials.

Physical performance

A small series of studies has investigated the impact of experimenter gender on physical performance, and, again, significant results were observed. In one study, the effect of experimenter gender was investigated for participants performing a 50-yard dash, a shuttle run, and sit-ups. The study demonstrated that, for sit-ups, male experimenters elicited better scores for both genders of participants ( 16 ). In both the 50-yard dash and the shuttle run, on the other hand, participants performed significantly better when paired with an opposite-gender experimenter, regardless of their own gender. However, other studies have demonstrated a lack of effect with regard to physical performance. One study, for example, investigating the impact of experimenter gender on grip strength and hand steadiness tests found no interaction for either task ( 17 ). Thus, much like intelligence and learning, physical performance appears to generally be enhanced by opposite-gender experimenters, although there are some inconsistencies and null results.

Testosterone

Where measurable physical performance is altered, one should of course expect biological systems underlying this to be modified as well. In particular, experiments reveal that—perhaps unsurprisingly—sex steroids such as testosterone are affected by experimenter gender, which, in turn, causes differences in physical performance. For instance, one study revealed that young male skateboarders take increased physical risks in the presence of an attractive female ( 18 ). This increased risk taking leads to not only more successes but also more crash landings in front of a female observer. Mediational analyses reveal that this effect is influenced in part by elevated testosterone levels in men who performed in front of the attractive female. In addition, performance on a reversal-learning task predicted physical risk taking, and reversal-learning performance was also disrupted by the presence of the attractive female, and the female’s presence moderated the observed relationship between risk taking and reversal learning. These data of course fit closely with earlier data suggesting an impact of experimenter gender on learning. Combined, these results suggest that men use physical risk taking as a sexual display strategy and that this may be moderated by elevated testosterone levels in the presence of a woman (be she an experimenter or otherwise).

Further evidence reveals not only that testosterone is selectively elevated in the presence of a female experimenter but also that this elevation appears to be quantifiable in perspiration. More specifically, men excrete higher levels of the sex steroids 17β-estradiol and testosterone when performing rigorous exercise in the presence of a female experimenter ( 19 ). In turn, these hormones are absorbed by the experimenter, plausibly having additional effects on the experimenter and his or her instructions and behavior. Combined, these papers reveal a critically important link: Experimenter gender affects hormonal substrates. The question of how far-reaching this is remains unanswered, but sex steroids could represent the tip of the iceberg. The implications for clinical therapeutics should be clear: For example, estimates of the efficacy of testosterone-boosting medications could be substantially biased if the tests are administered by females.

Pain sensitivity

Starting in the 1990s, a growing body of literature on pain sensitivity revealed that experimenter gender was biasing results. Initial findings suggested that male participants demonstrate a significantly higher pain threshold (reporting significantly less pain) when tested by female experimenters ( 20 ). The same study found a trend toward women actually reporting higher pain when tested by a male experimenter, but this did not reach significance. Several years later, studies investigated the phenomenon of male participants demonstrating lower pain sensitivity when tested by females, and the early result has generally been supported ( 7 , 21 ). A recent meta-analysis helps make sense of these findings. Alabas et al. analyzed 13 studies that looked at gender role and pain thresholds. The consensus finding was that participants who viewed themselves as more masculine and less sensitive to pain demonstrated higher pain thresholds and tolerance ( 8 ). Another study investigated whether these findings of reduced pain sensitivity for men with female experimenters were mirrored by alterations in autonomic pain response (as measured by heart rate variability and skin conductance levels). The study found that lower pain reports in male participants with female experimenters were not mediated by changes in autonomic parameters and the effect was thus likely more the result of psychosocial factors ( 22 ). For example, it could be that men in general tolerate higher levels of pain with a female experimenter as a function of their attempt to display higher degrees of masculinity.

IMPACT ON BEHAVIOR

With the preceding sections, the cascade of mental and physical reactions to experimenter gender should reveal a system-wide effect on general functioning. That said, it should be unsurprising that behavior is also affected. Again, the extent of the effect is still understood only for a few dimensions of interpersonal interaction, but the results thus far provide fertile ground for future hypothesis testing. They also, unfortunately, create the same pervasive concern regarding study replicability for behavior-based research and interventions.

Communication

A study investigating gender differences in the way marital couples interact with each other found a variety of somewhat predictable differences in nonverbal communication between men and women (such as the amount of smiling, laughing, and the average length of gazing at their spouse) ( 23 ). In addition, however, the authors found that some variables in both husbands and wives were dependent on the gender of the administering experimenter. In particular, husbands were more likely to speak first with a male experimenter, and discussions in general went on longer with a female experimenter present. The neurological evidence of differences between the brains of men and women in targets such as Broca’s area (known for its critical role in communicative behavior) suggests that there may be a plethora of other biasing effects of experimenter gender on variables that relate to communication; however, this remains largely uncharted territory. These data also relate back to memory performance, where, again, an effect of experimenter gender on verbal elaboration was discovered, which can be concerning in the context of Alzheimer’s treatment research, for instance.

Aggression

Several meta-analyses have revealed that males tend more toward physical aggression ( 24 , 25 ). Conversely, females favor verbal or “relational” aggression ( 24 ). However, the gender of the experimenter appears to modulate these general trends. For instance, an early study revealed that, in male college-age participants, female experimenters inhibited physical aggression in both genders of participants, whereas male experimenters potentiated it ( 25 ). However, another study demonstrated that the interaction is possibly more complex. Males in the presence of a male experimenter inhibited retaliatory aggression against a female “participant” (a study confederate) who had only mildly disagreed with them, but when the female confederate “participant” strongly disagreed, men tended toward more severe retaliatory insults (verbal aggression) and higher-intensity shocks (again, specifically in the presence of a male experimenter) ( 26 ). Similarly, men in the presence of a female experimenter showed higher levels of physical aggression against a male provocateur (also a confederate). The commonality appears to be that men will show more aggression when they are insulted or aggressed upon in the presence of both genders simultaneously, be they other participants, confederates, or experimenters. This suggests that experimenter gender–based effects depend on social context as well.

Prosociality

Trust and reciprocity research has gained a lot of traction recently, and a wave of increased interest has spurred fresh studies of human morality. In these studies, manipulating experimenter gender again revealed a robust impact on behavior, such that in the presence of a female experimenter, participants playing a trust game showed more trust and reciprocity ( 27 ). This is of particular interest in the light of recent issues replicating the links between oxytocin and trust. In a seminal study, Kosfeld et al. ( 28 ) seemingly revealed that intranasal oxytocin potently modulates trust behavior in the trust game. However, a host of newer research has shown profound difficulty in replicating these findings, using very similar methodology ( 29 ). One might wonder what characteristics the experimenters administering the task had in Kosfeld et al.: was a woman administering the treatment condition?

Sexual behavior

Perhaps the most obvious domain for a biasing effect of experimenter gender is in the study of sex itself. This effect has been found, for example, in questionnaires relating to sexual experience. In one study, male college students—who were primed with information about how women were becoming more sexually permissive—reported inflated numbers of sexual partners as compared to when they received no priming, but only when the questionnaire was administered by a female ( 30 ). The experimenters hypothesized that this was due to either a defensive reaction or a desire to perpetuate hegemonic masculinity. They supported this theory with the evidence that the significant results appeared to stem from the study participants who scored high on tests of hypermasculinity and ambivalent sexism.

Beyond questionnaires, experimenter gender can affect a participant’s response to a variety of situations that implicate sexuality or sexual behavior. Early research into the impact of experimenter gender on sexual behavior found that both the gender and attractiveness of the experimenter could significantly influence experimentally induced sexual fantasies ( 31 ). In detail, an attractive female experimenter was shown, unsurprisingly, to promote sexual fantasies in heterosexual male participants in much the same way as other conditions that used different, more explicit stimuli. A later study revealed that experimenter gender could affect a participant’s response to sexually explicit material. In detail, the study found that females who had an “informal” male experimenter felt more anxious after viewing sexually explicit material, whereas males who had an “informal” female experimenter rated the attractiveness of the sexually explicit material significantly higher. Thus, the study argues that experimenter gender may produce either a restraining or a permissive context, which, in turn, can account for a significant portion of the variance of a participant’s response to sexual material ( 32 ). Consider, in this context, medications that can produce sexual dysfunction as a side effect, as many antidepressants do. It should be clear from the research pattern that if these medications are studied using female experimenters and male participants, sexual dysfunction may be significantly underreported.

WHY THE DIFFERENCES?

Opposite-gender dynamics

There are a variety of possible reasons why men and women respond differently to experimenters of the same or opposite gender. One hypothesis focuses on the role of psychosocial stress in intergender scenarios. For heterosexuals, opposite-gender encounters can mediate social rewards that same-gender encounters cannot ( 33 ). The theory is that favorable perception by the opposite gender can result in romantic, sexual, or marital relationships, all of which have the potential to confer reward ( 33 ). In addition, when a person makes a favorable impression on another, it can result in self-affirming feedback that they are socially and sexually attractive. Although unquestionably valuable, this feedback generally cannot be obtained from same-gender interactions (again, for the sake of simplicity, we refer here only to heterosexuals). Supporting this line of reasoning, a study using daily interaction records from college students demonstrated that they tended to be more concerned with conveying an impression of being likeable, competent, ethical, and attractive when interacting with those of the opposite sex ( 33 ). Further studies on the interaction of a perceiver and a target individual have revealed that the more socially desirable rewards a perceiver controls, the more likely target individuals will attempt to create a favorable impression. Furthermore, the apparent value structure of a perceiver can influence a target’s aggression, reward allocation, and helping behavior.

Thus, opposite-gender experimenters might, in general (again, principally in the case of heterosexuals—the effect should be the reverse for homosexual participants), elicit improved responses on a variety of measures related to general mating “fitness,” including the observed improvements in physical fitness, learning and intellectual abilities, and further alterations in beliefs and social behavior relating to aggression and altruism. Even alterations in pain sensitivity observed in male participants with a female observer could be explained by this phenomenon because experienced pain may not in fact differ, with male participants instead simply reporting less pain to produce a positive impression.

In this line of thought, it is important to recognize that it is not “opposite gender” that is significant per se but likely the psychosocial stress that often results from this scenario and the heightened reward potential, which, in aggregate, creates this trend. Theoretically, this could be manipulated by other circumstances, such as increasing the number of experimenters, manipulating their age and their professional status, and so on. In addition, this interpretation of results suggests that certain research areas will prove more vulnerable. Experimenter gender should have the greatest impact in areas of study where participants are in frequent and close contact with experimenters. In addition, experiments implicating characteristics important for mate selection—such as mental acuity, physical prowess, or morality—may be more influenced.

Psychosocial stress

Further evidence from studies of stress supports this general conceptualization of the experimenter gender effect and adds an additional layer. Stress is regulated in the body through two primary pathways: the hypothalamic-pituitary-adrenal (HPA) axis and the sympathetic-adrenal-medullary axis. These systems work to increase the body’s vigilance in response to a stressor by increasing circulating levels of stress-regulating hormones such as glucocorticoids, epinephrine, and norepinephrine. The HPA axis in particular is especially sensitive to nonphysical stressors involving a social context, and its activation is therefore considered a strong indicator of exposure to psychosocial stress ( 34 ). There are a variety of paradigms commonly used in experimental settings to induce a stress response in participants. One of the most popular is the Trier Social Stress Test (TSST), which requires participants to present a free speech in front of a panel of “experts” (experimenters in laboratory coats) and afterward to perform a mental arithmetic challenge ( 35 ). Another, the Maastricht Acute Stress Test (MAST), also involves social evaluation but is less time- and resource-intensive than the TSST. Recent evidence indicates that the experimenter’s gender can influence the results of such tests. For example, males tested by a female experimenter in the MAST demonstrated higher systolic blood pressure, whereas females tested by a male experimenter in the TSST showed higher subjective stress ratings ( 35 ). Stress can improve or degrade both physical and intellectual performance, depending on the degree. Thus, opposite-gender dynamics may generate performance-enhancing effects through moderate increases in stress, whereas individuals who have high basal anxiety levels may actually perform worse under such circumstances, as discussed previously ( 12 ). Similarly, as discussed previously, these effects should be most pronounced in heterosexuals; other sexual orientations likely produce different effect patterns.

POLICY RECOMMENDATIONS

To improve the prevalence of experimenter gender reporting, first and foremost, individual scientists must take upon themselves the task of tracking and reporting their experimenter and/or research assistant genders going forward. Furthermore, where appropriate, statistical analysis should test for experimenter gender effects. Research group leaders have the strongest influence in this sense; however, it is ultimately each scientist’s personal obligation to maintain reporting standards.
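As one illustration of what testing for experimenter gender effects could look like, the sketch below runs a two-sided permutation test on the difference in mean outcome between sessions run by female and by male experimenters. This is a generic technique chosen for its simplicity, not a method prescribed by the paper. The function name and the pain ratings are invented for the example.

```python
import random
from statistics import mean


def permutation_p_value(scores_a, scores_b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Group labels are repeatedly shuffled; the p-value is the share of
    shuffles whose absolute mean difference is at least as large as the
    observed one (with the usual +1 correction).
    """
    rng = random.Random(seed)
    observed = abs(mean(scores_a) - mean(scores_b))
    pooled = list(scores_a) + list(scores_b)
    n_a = len(scores_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical pain ratings, grouped by the gender of the experimenter
# who ran each session.
female_run = [3.1, 2.8, 3.0, 2.9, 3.2, 2.7, 3.0, 2.9]
male_run = [4.9, 5.1, 5.0, 5.2, 4.8, 5.3, 5.0, 5.1]

p_value = permutation_p_value(female_run, male_run)
print(f"p = {p_value:.4f}")
```

A small p-value here would indicate that outcomes differ by experimenter gender more than label shuffling alone would produce. In a real analysis, experimenter gender would more likely enter as a factor alongside treatment and participant gender in the study's main statistical model, and experimenter identity would need to be considered as well, since a few individuals may account for all sessions of one gender.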

Looking speculatively toward the future of policy changes intended to improve replicability, there are several key players involved in the process that could promote change. For instance, universities or research institutes could take a top-down approach to the issue: It is not uncommon for universities to disseminate policy changes directly to laboratories under their umbrella. Because of the weight that universities have in setting the trajectory of individual scientists and ethical scientific standards, any guidance from them to report experimenter gender could be impactful.

Similarly, funding institutions could play a role. Every researcher is dependent on grants for survival, which gives grant issuers and private industry (such as the pharmaceutical industry) immense influence over research policy. Funding sources such as these could hypothetically augment their policies with a requirement for reporting experimenter gender. Stepping further back in the chain of influencers, governmental authorities could be the most significant potential influencer. Departments of higher education around the world are responsible for significant amounts of funding, both directly to universities and research institutes and indirectly through third-party organizations. Similarly, other divisions of government have large research budgets. In the United States, for instance, the Department of Agriculture alone budgets approximately $1.8 billion to nutritional research ( 36 ). Finally, governmental regulators can influence policy with regard to private industry funding sources. Thus, with the significant amount of funding and influence that governments project toward the sciences, they are well positioned to assist in improving scientific standards.

Finally, journals, while relatively independent of the other key figures in this system, also have a powerful voice. If grants are necessary for survival, then so too are journal publications, which grant issuers evaluate critically in determining how to appropriate funds. Researchers are thus obliged to comply with any policy a journal sets out. The interplay between these various institutions and a roadmap for potential policy change is shown in Fig. 1.

As shown, the initiation of a crisis can induce change through several mechanisms. Prominent among these are changes in policy recommendations from government funding sources, in addition to policy changes at journals, universities, and independent funding agencies.

Finally, it is worth addressing why reporting experimenter gender is an excellent jumping-off point in improving replicability. There are many other characteristics of experimenters that have demonstrable impact on participant performance, including age, height, and personality. However, gender has the unique qualities of being both (i) easy to record and report and (ii) categorical. Consider age for instance. Although it is similarly easy to record and report, it is much more difficult to interpret because it is not categorical. In other words, the extent of age differences (on a case-by-case basis) could create subtle differences that are not well understood. Gender, on the other hand, is both easy to record and report and relatively straightforward to demarcate. Thus, although controlling for more variables related to both the gender and the laboratory environment more broadly would be valuable to improving replicability, these changes would require significantly more time, attention, thought, and resources to initiate. Finally, although this paper does argue that experimenter gender should be controlled and reported, this does not imply that every study should use an equal balance of male and female experimenters because this is similarly resource-intensive. Laboratories simply have a duty to report what gender their experimenters are, not to alter their staffing.

THE FUTURE OF THE EXPERIMENTER GENDER EFFECT

Studies investigating psychometric variables and the newer research looking at differences in pain sensitivity have been instructive; however, a wide range of variables remain unexplored. To date, there is limited information on how biological and neurological measures are affected, such as genes, circulating hormones, neuropeptides, or brain activity as captured by functional magnetic resonance imaging (fMRI) or EEG. Where there are differences in psychological responses, there should be corresponding differences in neurobiology. For example, if participants who work with an opposite-gender experimenter are more likely to seek social reward through conveying a positive impression, then there are likely changes in their neurological responses. Studies have revealed that not only the acquisition of social reward but also the mere anticipation of it increases activity in mesolimbic brain structures ( 37 , 38 ). Exposure to social reward also recruits a cohort of neuropeptides—for instance, in mice, the rewarding properties of social interaction have been shown to require the coordinated activity of oxytocin and serotonin (5-HT) in the nucleus accumbens. That said, opposite-gender experimenters are likely causing differential effects—through their impact on social reward processing—that could lead to significant differences in the results of fMRI responses and neuropeptide levels. There is a need to investigate differences that might appear in paradigms using EEG or fMRI or that look at circulating neuropeptide levels to determine where else there is systematic bias occurring.

Furthermore, there is good reason to believe that peripheral biological systems should be affected by changes in the central nervous system (CNS). A recent study in rats demonstrated that the animals’ stress response was heightened in the presence of male experimenters ( 39 ). This stress response involves initial activation in the CNS, but via the HPA axis, activity proliferates to the periphery, and this pattern of effects is mirrored in humans. Thus, there is strong reason to believe that experimenter gender could be influencing a plethora of peripheral biological responses as well.

Some have recently suggested the concept of a “virtual experimenter.” The idea is to create a computer program that delivers treatment and instructions, which should theoretically increase standardization and reduce biasing effects and noise, such as those that come from experimenter gender ( 40 ). This standardized avatar would likely produce several advantages—in addition to controlling gender, other biasing influences such as personality, behavior, physical size, and, in general, human errors would be eliminated as confounders. However, the technology to support this becoming a ubiquitous and fail-safe tool could take some time to develop. Meanwhile, scientists can improve their own standards and practices to combat the issue.

As this paper suggests, there is ample evidence, accumulated over decades of exploration, demonstrating that the gender of an experimenter has significant effects on a range of variables. It is also clear that the variables thus far investigated have been largely behavioral or psychological in nature, whereas biological and neurological responses remain largely unexplored. Given the strong connection between psychological and behavioral responses on the one hand and biological and neurological responses on the other, it stands to reason that this biasing effect should be similarly prevalent in these realms of study. It is common practice for studies in the fields of biology and neuroscience to not report experimenter gender, and yet, there is reason to believe that it could be significantly affecting results, including those of clinical trials. Note that research assistant positions are increasingly held by women, which could also potentially contribute to these replication issues. Combating the issue will be most effective if the major institutions of science—journals, funding sources, government, and universities—work in concert with individual scientists to encourage improved reporting standards. If these efforts are successful, then it could help clarify conflicting results in many subdisciplines and make sense of otherwise unusual data sets. It could pave the way for science to be more empirical, reduce noise in findings, increase the power of study designs, and generally improve the quality of scientific inquiry in these areas. With any luck, it will also aid in rebuilding the credibility of science by improving replicability.

Acknowledgments

Funding: This work was supported by the Swedish Research Council. The funding sources had no input in the design and conduct of this study; in the collection, analysis, and interpretation of the data; or in the preparation, review, or approval of the manuscript. Author contributions: C.D.C., C.B., and H.B.S. contributed to the conceptualizations. C.D.C. drafted the manuscript. C.D.C. and C.B. created the figure. C.D.C., C.B., and H.B.S. edited the manuscript. Competing interests: The authors declare that they have no competing interests. Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the articles cited herein.

Think again: Men and women share cognitive skills

Research debunks myths about cognitive difference.

Are boys better at math? Are girls better at language? Is aptitude or culture the reason that fewer women than men work as scientists and engineers? Psychologists have gathered solid evidence that when it comes to how—and how well—we think, males and females differ in very few but significant ways.

The evidence has piled up for years. In 1990, Janet Shibley Hyde, PhD, a psychologist at the University of Wisconsin, and colleagues published a groundbreaking meta-analysis that compiled data from 100 different studies of math performance. Synthesizing data collected on more than 3 million participants between 1967 and 1987, the researchers found no large overall differences between boys and girls in math performance. Girls were slightly better at computation in elementary and middle school. In high school, boys showed a slight edge in problem solving, possibly because they took more science classes that emphasized those skills. But boys and girls understood math concepts equally well, and any gender differences actually narrowed over the years, belying the notion of a fixed or biological differentiating factor.

As for verbal ability, in 1988, Hyde and colleagues reported that data from 165 studies revealed a female advantage so slight as to be meaningless, despite previous assertions that girls are more verbally adept. What’s more, the authors found no evidence of substantial gender differences in any component of verbal processing.

In a 2005 report, Hyde reviewed 46 different meta-analyses on sex differences, not only in cognition but also communication style, social and personality variables, motor behaviors, and moral reasoning. In half the studies, sex differences were small; in another third they were virtually nonexistent.

Also in 2005, Elizabeth Spelke, PhD, a psychologist at Harvard University, and colleagues reviewed 111 studies and concluded that abilities for math and science have a genetic basis in cognitive systems that emerge in early childhood, and that these systems give men and women on the whole an equal aptitude for math and science. In fact, boy and girl infants were found to perform equally well as young as 6 months on tasks that underlie mathematics abilities.

Despite such evidence, questions of gender differences have persisted, in part because men still outnumber women in science and math careers. In 2007, Diane Halpern, PhD, and colleagues including Hyde published a consensus statement regarding that disparity. Indeed, studies suggest that women tend to score slightly higher than men on verbal abilities, while men tend to have a slight edge when it comes to visuospatial skills, the researchers report. However, biology is only a small part of the explanation. The researchers conclude that early experience, educational policies and culture also strongly affect success in math and science.

Other studies suggest that when it comes to math, girls and boys are similarly capable. A 2008 analysis by Hyde and colleagues reported that in children from grades two to 11, there was no gender difference for math skills. And in 2009, Hyde and Janet Mertz, PhD, reported that while more boys than girls score at the highest levels in mathematics, that gender gap has been closing over time. In fact, they reported that the gap is smaller in countries with greater gender equality, suggesting that gender differences in math achievement are largely due to cultural and environmental factors.

Significance

The research suggests that perceived or actual differences in cognitive performance between males and females are most likely the result of social and cultural factors. For example, where girls and boys have differed on tests, researchers believe social context plays a significant role. Spelke believes that differences in career choices are due not to differing abilities but to cultural factors, such as subtle but pervasive gender expectations that kick in during high school and college.

In a 1999 study, Steven Spencer and colleagues explored gender differences among men and women who had a strong math background. They found that merely telling women that a math test had previously shown gender differences hurt their performance. The researchers gave a math test to men and women after telling half the women that the test had shown gender differences, and telling the rest that it found none. Women who expected gender differences did significantly worse than men. Those who were told there was no gender disparity performed as well as men.

Anxiety may be another mechanism explaining gender differences in math performance. A 2014 study by researchers at Boston College found that women had greater anxiety during a math test, which taxed their working memory and led them to underperform on the test. Teaching girls strategies to manage that anxiety could be one useful means to help to close the gender gap in math achievement, the researchers suggest.

Practical application

If males and females were truly understood to be intellectual equals, things might change in schools, colleges and universities, industry, and the workplace in general. As Hyde and her colleagues noted in 1990, “Where gender differences do exist, they are in critical areas. Problem solving is critical for success in many mathematics-related fields, such as engineering and physics.” They believe that well before high school, children should be taught essential problem solving skills in conjunction with computation. The researchers also point to the quantitative portion of the Scholastic Aptitude Test, which may tap problem solving skills that favor boys. The resulting scores are used in college admissions and scholarship decisions. Scientifically unsound gender stereotyping not only costs individuals, but society as a whole.

Cited research and further reading

Ganley, C. M., & Vasilyeva, M. (2014). The role of anxiety and working memory in gender differences in mathematics. Journal of Educational Psychology, 106(1), 105–120.

Halpern, D. F., Benbow, C. P., Geary, D. C., Gur, R. C., Hyde, J. S., & Gernsbacher, M. A. (2007). The science of sex differences in science and mathematics. Psychological Science in the Public Interest, 8(1), 1–51.

Hyde, J. S., & Linn, M. C. (1988). Gender differences in verbal ability: A meta-analysis. Psychological Bulletin, 104, 53–69.

Hyde, J. S., Fennema, E., & Lamon, S. (1990). Gender differences in mathematics performance: A meta-analysis. Psychological Bulletin, 107, 139–155.

Hyde, J. S. (2005). The gender similarities hypothesis. American Psychologist, 60(6), 581–592.

Hyde, J. S., Lindberg, S. M., Linn, M. C., Ellis, A. B., & Williams, C. C. (2008). Gender similarities characterize math performance. Science, 321, 494–495.

Hyde, J. S., & Mertz, J. E. (2009). Gender, culture and mathematics performance. Proceedings of the National Academy of Sciences, 106(22), 8801–8807.

Spelke, E. S. (2005). Sex differences in intrinsic aptitude for mathematics and science? A critical review. American Psychologist, 60(9), 950–958.

Spencer, S. J., Steele, C. M., & Quinn, D. M. (1999). Stereotype threat and women’s math performance. Journal of Experimental Social Psychology, 35, 4–28.

Sex and Gender Differences Research Design for Basic, Clinical, and Population Studies: Essentials for Investigators

Janet W Rich-Edwards, Ursula B Kaiser, Grace L Chen, JoAnn E Manson, Jill M Goldstein, Sex and Gender Differences Research Design for Basic, Clinical, and Population Studies: Essentials for Investigators, Endocrine Reviews , Volume 39, Issue 4, August 2018, Pages 424–439, https://doi.org/10.1210/er.2017-00246

A sex- and gender-informed perspective increases rigor, promotes discovery, and expands the relevance of biomedical research. In the current era of accountability to present data for males and females, thoughtful and deliberate methodology can improve study design and inference in sex and gender differences research. We address issues of motivation, subject selection, sample size, data collection, analysis, and interpretation, considering implications for basic, clinical, and population research. In particular, we focus on methods to test sex/gender differences as effect modification or interaction, and discuss why some inferences from sex-stratified data should be viewed with caution. Without careful methodology, the pursuit of sex difference research, despite a mandate from funding agencies, will result in a literature of contradiction. However, given the historic lack of attention to sex differences, the absence of evidence for sex differences is not necessarily evidence of the absence of sex differences. Thoughtfully conceived and conducted sex and gender differences research is needed to drive scientific and therapeutic discovery for all sexes and genders.

A sex- and gender-informed perspective increases rigor, promotes discovery, and expands the relevance of biomedical research

Methods exist to test sex and gender differences as interactions; inference from sex- and gender-stratified data should be viewed with caution

Without careful methodology, the pursuit of sex and gender difference research as a poorly considered mandate will result in a literature of contradiction

However, given the paucity of sex and gender differences research, the absence of evidence for differences is not necessarily evidence of the absence of differences

Many compelling publications have argued why sex and gender should be considered in preclinical, clinical, and population research ( 1–4 ). Both sex (the biological attributes of females and males) and gender (socially constructed roles, behaviors, and identities in a spectrum, including femininity and masculinity) affect molecular and cellular processes, clinical traits, response to treatments, health, and disease ( 1 ). Since 2010, the Canadian Institutes of Health Research has mandated that all grant applicants address whether they had considered sex and/or gender in their applications ( 5 ). In 2014, the European Commission issued the Horizon 2020 guideline, which makes explicit the rules for sex and gender inclusion as elements of European Union grant evaluation and monitoring ( 6 , 7 ). Although the 1993 National Institutes of Health (NIH) Revitalization Act required the inclusion of women in NIH-funded clinical research, it was not until 2015 that the NIH announced policies requiring the consideration of sex as a biological variable in study design, analysis, and reporting ( 1 , 8–10 ). Such mandates to include females are not mere political correctness ( 11 ). A sex-informed and gender-informed perspective is essential to increase rigor, promote discovery, expand the relevance of research, and improve patient care. At the very least, it will allow readers of the scientific literature to critically assess the validity of what they read.

Investigators who wish to—or now find themselves required to—include both sexes in their studies are faced with a number of methodological questions, including issues of motivation, subject selection, sample size, data collection, analysis, and interpretation. We provide an overview of these issues in this review as they pertain to basic, clinical, and population research ( Table 1 ). This review builds on earlier discussions of sex differences research methodology ( 11–18 ) in several ways: we consider gender as well as sex differences; we examine the entire research process, from motivation to analysis and presentation; and we discuss nuances of statistical design and interpretation, particularly how to plan robust tests of sex or gender interactions that can help minimize statistical artifacts. Rather than assume ubiquitous sex and gender differences in biology, health, and disease, we propose methods and interpretation that will increase the likelihood of detecting true differences where they exist.

Table 1. Methodological Considerations in Investigations of Sex and Gender Differences

Research Step: Best Practices

Motivation
  • Consider known sex differences in disease incidence, prevalence, and survival.
  • Review existing literature on sex and gender differences, alert to the fact that many hypotheses have not been well tested. Read carefully to consider likelihood of false-positives (especially in context of multiple testing) and false-negatives (especially where statistical power is low).
  • Apply a life course perspective to consider the timing of exposures that might interact with sex and gender in specific developmental windows.

Subject selection
  • Consider sex-specific age incidence of disease to maximize statistical power.
  • Consider reproductive stages and cycles, particularly where they may modify the impact of the main exposure being investigated.
  • Consider the impact of gendered social environment for the distribution of factors that may interact with the main exposure.
  • For basic and preclinical studies, review options for classical gonadectomy, knockouts, or four-core genotype experiments.
  • Consider whether sex of cell lines is known, relevant, and generalizable.

Randomization (if applicable)
  • In smaller studies, stratified randomization by sex or gender will ensure balance, even if different numbers of males and females are included.

Sample size
  • True tests of sex differences need to be large enough to test interaction between sex and the main exposure or treatment; such tests typically require several times the sample size to be adequately powered, compared with studies of main effects.
  • Studies too small to detect interaction can still report the main effects of the exposure or treatment by sex; however, they cannot claim to have tested a sex difference. Be alert to the risk of false-negatives in underpowered sex strata.
  • Studies too small to detect even the main effects of sex can provide sex-specific data to generate hypotheses or contribute to meta-analyses of sex differences.
  • “Big data” studies, where the variable of sex is often available, need to be conducted thoughtfully to avoid contributing false-positives to the sex difference literature.

Data collection
  • Consider sex and gender differences in disease presentation.
  • Consider whether exposures mean the same thing in both sexes and genders.
  • Be aware of sex and gender differences in pharmacokinetics and pharmacodynamics; the same dose may have different impact in males and females or may vary by body size.
  • Collect data on exogenous hormones: contraceptives, menopausal hormone therapy, testosterone, and other steroid use.
  • Consider recording data on reproductive cycle (follicular/luteal), and stage (prepuberty, puberty, pregnancy, lactation, premenopause and postmenopause).
  • Collect data on influential covariates that may vary by sex and gender in the study population.

Analysis, reporting, and interpretation
  • Prespecify tests of sex differences to reduce type I error.
  • Account for confounding by factors associated with sex and gender.
  • Investigate intermediate “pathway” variables to understand apparent sex differences.
  • Admit when sex differences were tested as exploratory analyses.
  • Make opportunities to replicate sex difference findings.
  • Interpret apparent sex and gender differences in the light of biological plausibility and social context.
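The sample-size guidance in Table 1 (that interaction tests need several times the sample of a main-effect study) can be made concrete with a back-of-envelope calculation. The sketch below is a simplification under stated assumptions: a balanced two-by-two design (sex by treatment) with the same number of subjects in each cell, a continuous outcome with a common standard deviation, and an interaction as large as the main effect. The function names and the numbers are hypothetical and chosen only for illustration.

```python
import math


def se_main_effect(sigma, n_per_cell):
    """SE of the treatment effect pooled over both sexes.

    Each treatment arm contains 2 * n_per_cell subjects.
    """
    return sigma * math.sqrt(1 / (2 * n_per_cell) + 1 / (2 * n_per_cell))


def se_interaction(sigma, n_per_cell):
    """SE of the difference between female and male treatment effects."""
    # Each sex-specific effect is a difference of two cell means,
    # so its variance is 2 * sigma**2 / n_per_cell.
    # The interaction subtracts two such effects: 4 * sigma**2 / n_per_cell.
    return sigma * math.sqrt(4 / n_per_cell)


sigma, n = 1.0, 50  # assumed outcome SD and subjects per cell
ratio = se_interaction(sigma, n) / se_main_effect(sigma, n)
sample_size_multiplier = ratio ** 2
print(f"SE ratio: {ratio:.1f}, sample size multiplier: {sample_size_multiplier:.1f}")
```

Under these assumptions the interaction estimate has twice the standard error of the main effect, so matching its precision takes roughly four times the sample. If the interaction is smaller than the main effect, as is often the case, the multiplier grows further.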

There is ample evidence of sex differences—at the level of the cell, organism, and population—to motivate sex differences research. Sex chromosomes encode sexual differentiation through three mechanisms: (1) presence of Y genes; (2) increased dose of X genes in XX vs XY cells; and (3) X chromosome inactivation and imprinting ( 12 ). These primary chromosomal differences lead to sexual differentiation and the somatic and gonadal expressions of sex ( 19 ). The resulting “sexome” produces differences in all organ systems and across the lifespan, influencing how our bodies interact with the environment to determine health ( 20 ). The sex-informed framework considers sex differences in anatomy and physiology, understood within a lifespan perspective of sensitive periods of fetal and childhood development, differential pace and timing of puberty, reproductive events, and senescence. This is critical given that timing is everything when it comes to identifying developmental sex effects ( 14 , 21 , 22 ). Furthermore, sex differences in treatment abound: pharmacokinetics and pharmacodynamics of medications often vary by sex, as may effects of other treatment modalities ( 23 ).

Gender, too, is a determinant of health, influencing the physical and social environments to which individuals are exposed, their access to resources that affect health, their agency to seek health care and receive treatment, and the equitability of research that drives medical discovery ( 14 , 17 , 24–26 ).

This sex- and gender-informed perspective is necessitated by widespread differences in disease incidence, prevalence, and survival that have been reviewed elsewhere ( 2 , 26–28 ). There are sex and gender differences in symptoms and clinical presentations of illness, reliability of diagnostic tests, and response to treatment. There is “sex bias” down to the level of epigenetic marking and gene expression ( 28 ). In short, the rigor of research depends on researchers’ understanding of the ways in which sex and gender influence the biologic systems they study.

Investigators seeking to construct a sex- and gender-informed framework for their research may be disappointed by a lack of systematic evidence regarding sex and gender differences in the literature. There is a particular dearth of true gender-difference studies; in fact, literature searches on “gender differences” largely turn up studies on sex differences that have used the term “gender” to refer to biologic sex. The historic neglect of women in clinical studies and the sex of animals and cells in basic research should be kept in mind when gathering evidence of sex and gender differences. Although data to interrogate sex differences may exist in some studies, they have yet to be examined. In other cases, sex-informed questions have yet to be posed. Furthermore, as argued below, the proliferation of ill-considered and often unplanned sex difference inquiries leads to a literature of contradictions. Thus, the absence of evidence for sex differences is not necessarily evidence of the absence of sex differences.

Overarching study design

In most cases, the choice of overarching study design, whether experimental or observational, is little affected by considerations of sex and gender. Exceptions to this are experiments precluded by ethical considerations, such as inclusion of pregnant women for trials of potentially teratogenic drugs. However, nearly every other feature of study design necessitates a sex-informed perspective, including subject selection, randomization, sample size, and data collection.

Subject selection

Inclusion of both sexes is more nuanced than deciding that the sample should be equally divided by sex. Sex-specific age incidence of disease, reproductive stage, reproductive cycle, and environment need to be considered to optimize the validity, generalizability, and efficiency of a study sample. More often than not, investigators have to compromise between competing goals of validity (by narrowing subject selection to increase the likelihood that findings are true for a specific population) and generalizability (by widening subject selection to make broad inference at the risk of overgeneralizing across true differences between groups). There are also compromises between large scientific goals and restricted available funds. Such trade-offs are best made as choices informed by already known sex and gender differences. The most efficient subject selection will pick the minimum number of each sex or gender necessary to make valid inferences about sex and gender differences; a 50/50 split between males and females may not be the most efficient, as discussed below.

Sex-specific incidence of disease

Sex differences in incidence and age-incidence trajectories are important considerations in subject selection. For example, at ages 55 to 64 men have more than double the rate of coronary heart disease (CHD) of women. By ages 85 to 94, male CHD rates are only 10% higher than those of females ( 20 ). Thus, an investigator wishing to enroll a cohort of 50-year-olds to study CHD incidence will need to enroll two to three times as many women as men to ensure equivalent statistical power, or consider selecting older women. For example, the Vitamin D and Omega-3 Trial study of dietary supplements to reduce heart disease, stroke, and cancer includes women aged 55 or older and men aged 50 or older to account for the later onset of disease in women ( 29 ). Sex differences in disease incidence exist in animals as well. For example, in the nonobese diabetic mouse, diabetes is more prevalent in females, so that more male mice must be included to yield the same number of affected animals of each sex ( 30 ).
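
The enrollment arithmetic in this paragraph can be sketched in a few lines. This is a minimal illustration only: it assumes a fixed follow-up period, constant incidence, and no censoring, and the rates shown are hypothetical placeholders rather than figures from the cited studies. The function name `enrollment_for_equal_events` is ours.

```python
import math

def enrollment_for_equal_events(target_events, rate_per_1000_py, years):
    """Subjects needed per sex to expect `target_events` cases.

    `rate_per_1000_py` maps each sex to an annual incidence per 1,000
    person-years. Crude approximation: ignores censoring and competing risks.
    """
    needed = {}
    for sex, rate in rate_per_1000_py.items():
        # expected cases per subject = rate * years / 1000
        needed[sex] = math.ceil(target_events * 1000 / (rate * years))
    return needed

# Hypothetical rates: men have 2.5 times the CHD incidence of women.
rates = {"men": 10, "women": 4}
plan = enrollment_for_equal_events(target_events=100, rate_per_1000_py=rates, years=10)
ratio_women_to_men = plan["women"] / plan["men"]
```

Under these assumed rates, the plan calls for 2.5 times as many women as men to yield the same expected number of cases, which is the kind of imbalance the text describes.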

Sex-specific differences in aging

Females outlive males in most vertebrate species ( 31 ). In mammals, the heterogametic (XY) sex may have a shorter lifespan because of the unguarded expression of harmful recessive alleles on the single X sex chromosome. Similarly, the homogametic (XX) sex may be protected by the stochastic X-inactivation that makes females mosaics; although female neonates are a 1:1 mosaic of maternal and paternal allele expression, over time that ratio becomes skewed to favor the cellular population whose active X presumably confers a survival advantage ( 32–35 ).

Sex differences in the rate of aging and the incidence of disease onset are reflected at the cellular level. For example, there are sex differences in the length of telomeres, the noncoding DNA sequences that cap and protect chromosomes and whose length is correlated with longevity. Although similar at birth, male telomeres shorten faster during the lifespan than do female telomeres ( 35 ). This difference could be the result of sex or gender; most likely, it is a combination of biological sex differences and gendered experiences (such as smoking) ( 36 ). Similarly, although stem cell populations decline with aging, this loss is earlier and more rapid in male than in female mice ( 37 ). Methylation patterns also differ between the sexes, likely influencing gene expression over the life course ( 38 , 39 ). As research further clarifies sex-specific or sex-dependent mechanisms of senescence, investigators may want to consider sex differences in the cellular age and methylation patterns of their subjects, be they cells, animals, or people.

Reproductive stages and cycles

All animals, regardless of sex or species, go through a process of reproductive maturation whose timing, duration, and outcome are subject to physical and social cues from the environment. In mammals, puberty involves sex-specific, but variable, changes in central neural systems, gonadal steroid production, and the emergence of secondary sexual characteristics, including behaviors. When a study investigates adolescence or young adulthood, accounting for sex differences in the pace and timing of puberty will be critical for identifying sex effects ( 14 ).

Mature mammals of both sexes have variations in gonadal steroid levels that may affect subject selection. In males, testosterone levels have circadian and perhaps seasonal variations and vary with age, physical activity, and energy homeostasis ( 40 , 41 ). Reproductive age females have menstrual or estrous cycles. On top of natural variability, women may use hormonal contraceptives or menopausal hormone therapy; many men use exogenous androgens and anabolic steroids. These factors are important in subject selection if an investigator wants to understand how the exposure–outcome associations under study are impacted by sex hormones. Researchers may decide to include a representative range of reproductive phases or cycles. For example, cyclical patterns of DNA synthesis and rates of cell division and death would not have been discovered if females in different cycle phases had not been studied ( 42 , 43 ). The knowledge that natural killer cell activity peaks during the luteal phase came from studies of cycling women ( 44 ). Understanding of the roles of neurokinin B and kisspeptin in reproduction has been facilitated by studying male and female animals at varying reproductive stages, with and without gonadectomy ( 45 ).

Sex differences in physiology and behavior have been observed even in the prepubertal and peripubertal periods, before the pubertal activation of the hypothalamic–pituitary–gonadal axis and production of gonadal sex steroid hormones. These prepubertal sex differences have been largely attributed to the effects of prenatal and perinatal activity of the hypothalamic–pituitary–gonadal axis and resultant sex steroid hormone production and actions. Among the best described effects are the so-called activational and organizational effects of gonadal hormones on brain development ( 46 ). The first robust sex difference described in the mammalian brain was the sexually dimorphic nucleus of the preoptic area ( 47 ). More recently, a sexually dimorphic population of kisspeptin neurons was identified in the anteroventral periventricular nucleus, where it is present in higher numbers in prepubertal females than in males; the preovulatory luteinizing hormone surge, which occurs in adult females but not males, is attributed to this population ( 48 ).

Thus, to the extent that hormone levels affect study outcomes, researchers may need to examine subjects who are premenopausal or postmenopausal, in the follicular or luteal phase, and with or without hysterectomy or gonadectomy. Including or excluding participants using hormonal therapies, such as contraceptives and female and male hormone replacement or suppression therapies, is another potentially important design choice. To fully capture between-sex variability, it may be useful to compare men to two or more groups of women. For example, a study of brain activity in the stress response circuitry found few differences between healthy men and women in the early follicular phase, but striking differences between men and the same women at midcycle ( 49 ). Furthermore, sex differences in brain activity in memory circuitry were statistically significant in premenopausal and perimenopausal women, but attenuated in postmenopausal women compared with men ( 50 ). To capture within-sex variability, studies can compare the same females at different cyclical stages, perhaps in crossover fashion.

Note that the effects of sex steroid hormones extend beyond estradiol and testosterone. There are multiple types of estrogens produced by the ovaries and other tissues, as well as multiple androgens beyond testosterone. Progesterone levels also need to be considered. Furthermore, there is target tissue specificity in the actions of estrogens, which can be attributed to tissue-specific expression patterns of estrogen receptors (ERs), including ERα, ERβ, and estrogen membrane receptors such as membrane ERα and the G protein–coupled receptor GPER1/GPR30 ( 51 , 52 ).

In addition to the multiple ERs, tissue-specific responses to estrogens can occur through the presence of modulating proteins such as coactivators and corepressors, among others. Varying tissue-specific responses are exemplified by the action of synthetic agonists and antagonists such as the selective ER modulators, including tamoxifen, raloxifene, and toremifene. These selective ER modulators are competitive inhibitors of estrogen binding to ERs, with mixed agonist and antagonist activity, depending on the target tissue ( 53 ). For example, tamoxifen is used in the prevention and treatment of breast cancer as an ER antagonist, but it has ER agonist activity in some other tissues such as bone and endometrium. Progesterone also acts through multiple receptors, which are generated as splice variants from a single gene ( 54 ). The actions of testosterone, through the androgen receptor, are modulated at the local tissue level through local activity of the enzyme 5α-reductase, which catalyzes the formation of the more potent androgen receptor agonist, dihydrotestosterone ( 55 ).

Gender and subject selection

Many determinants of disease, both physical and social, are differentially distributed by gender. Some of these factors may confound experiments if not carefully accounted for in study design and analysis. For example, in many societies, women are more likely to have vitamin D deficiency ( 56 ), affecting multiple tissues and systems, and men are more likely to smoke cigarettes and drink alcohol. Men and women are exposed differentially to types of violence and trauma ( 57–61 ). Such stressors may affect gonadal steroid secretion in a sex- and hormone-dependent fashion ( 12 ). In the case of powerful covariates strongly associated with gender or sex, investigators may want to select participants to ensure these covariates are balanced in male and female samples.

Special considerations regarding subject selection for basic studies

Historical reliance on male animal models ( e.g. , mice, rats) has resulted in incomplete data to guide human subject research in both men and women. Basic studies can complement clinical studies by investigating mechanisms of sex-dependent or sex-specific processes in greater depth by manipulating genotypic and phenotypic sex experimentally ( 12 ). Beyond simply studying both male and female animals as they age naturally, studies can include classic gonadectomy with or without hormone replacement: prenatally and perinatally to address developmental effects; in juvenile animals to study postnatal developmental and differentiation effects; in adults to assess the effects of sex steroid hormones at the time of testing; and in aging animals to study effects of sex steroids in models of aging. Several new genetic and epigenetic animal models have increasing translational validity to represent human ovarian failure and menopause ( 62 ). Some alternative models of menopause or ovarian failure include Foxl2-deficient mice with accelerated rates of decline in ovarian reserve ( 63 ).

Another frequent approach is to study “knockout” mice (or other species) that lack a specific sex steroid receptor. ERα knockout mice have shown that the absence of ERα promotes adiposity in male and female animals and, in turn, the progression of breast cancer in females ( 64 ). Animals with “conditional knockout” or “conditional knockin” of a specific sex steroid receptor can be used to target specific tissues or life stages.

Additionally, targeted mutagenesis can be used to address the role of specific domains or specific functions of a sex steroid receptor. For example, although ERα has traditionally been thought of as a nuclear, ligand-dependent transcription factor acting through estrogen response elements in gene promoters, the molecular mechanisms of action are more complex. Estradiol actions can be mediated by other “nonclassical” ERα pathways: (1) ligand-independent ERα signaling, in which gene activation alters phosphorylation of ERs via second-messenger pathways that affect intracellular kinase and phosphatase activity; (2) rapid, nongenomic effects through a membrane-associated ER; and (3) genomic, estrogen response element–independent signaling, in which ERα regulates genes via protein–protein interaction with other transcription factors, including c-Fos/c-Jun B (AP-1), Sp1, and nuclear factor κB ( 65 ). For example, as noted above, estradiol is critical to the regulation of energy balance and body weight. In experiments with female mice, ERα-null mutants become obese, with decreased energy expenditure and locomotion, increased adiposity, hyperleptinemia, and altered glucose homeostasis, characteristics similar to the propensity of postmenopausal women to develop obesity and type 2 diabetes. Interestingly, in knockin mice expressing a mutant ERα that can signal only through a nonclassical pathway (i.e., without direct estrogen response element binding), metabolic parameters, including energy expenditure, were restored to normal or near-normal values. These findings indicate that nonclassical ERα signaling mediates major effects of estradiol on energy balance, raising the possibility that selective ERα modulators may be developed to reduce the risks of obesity and metabolic disturbances in postmenopausal women ( 66 ).

Sex of cells

Although it is facile to insist that basic researchers use and report on both XX and XY cells in their experiments, this is not always possible ( 11 ). In fact, cell lines are a poor model with which to study sex differences, even when the sex of the lineage is specified. By definition, immortalized cell lines, chosen for their peculiarities and derived from a single organism, may be inherently impossible to cull or create from a second organism of any sex. Even where it is possible to create cell lines from a male and female similar enough to interrogate a particular question, inferences about sex differences cannot reliably be made. As with a clinical study with n = 2, a comparison of a male and a female cell line, because each is derived from a single individual, cannot distinguish sex differences from other genetic, epigenetic, or environmental characteristics of the founding individuals from which they were derived. Cell lines may have sex-dependent features other than the sex chromosome complement, including differences in hormone production or hormone responses related to variation in steroidogenic enzyme expression or expression of sex steroid hormone receptors. There may also be differences in expression of other genes related to imprinting or epigenetic differences. Moreover, each cell line is clonal in origin and has unique characteristics based on the experimental conditions in which it was derived and propagated—even two cell lines derived from the same organism may have different characteristics.

It is more reasonable to request that investigators specify the sex of a cell line used in a study ( i.e. , derived from a male vs female, or XX vs XY in sex chromosome complement), as the sex of many cell lines has been established ( 70 ). However, even this is not always possible, as cell lines can lose their sex chromosome complement over time ( 11 ). Although primary cultures can isolate cells directly from the body, permitting the creation of a small population of male or female cells, the procedure may be technically difficult and time-consuming, and the cells may be short-lived, limited in number, difficult to manipulate, and can change their characteristics over time in culture. Furthermore, Miller et al. ( 14 ) caution that the hormonal environment of cultured cells, including some media, can affect experimental outcomes. Finally, comparisons of isolated male and female cells oversimplify the question of sex, let alone gender, because such cells are removed from the complex interactions with other cells, hormones, neurotransmitters, nutrients, pathogens, and environmental exposures, which themselves vary in living organisms by sex and gender ( 11 ). In such cases, the absence of evidence for sex differences in vitro may well be absence of any evidence at all, a straw man (or woman) of an experiment purportedly about sex.

Randomization by sex and/or gender

Experimentalists, particularly those conducting studies with <100 subjects, in which chance imbalances are more likely, may wish to randomize the sexes separately to ensure similar distributions of treated and untreated males and females. In preclinical experiments, this is known as a factorial design ( 15 ). Such stratified randomization retains the advantages of standard random allocation, effectively creating a mini-trial within each sex stratum ( 71 ). Stratified randomization can accommodate a study plan with unequal numbers of male and female subjects, which is especially helpful when men and women join a study at different rates or in different time periods. Stratified randomization can also be used to balance follicular vs luteal phase participants, or any other marker of sex or gender.
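
Randomizing the sexes separately can be sketched as follows. This is one possible implementation, not a prescribed method: it assumes two arms and uses permuted blocks within each sex stratum, a common way to keep arms balanced as subjects enroll, which the text itself does not specify. The function name, block size, and subject identifiers are hypothetical.

```python
import random

def stratified_block_randomize(subjects, block_size=4, seed=0):
    """Assign 'treated' or 'control' within each sex stratum.

    `subjects` is a list of (subject_id, sex) tuples in order of enrollment.
    Each stratum draws from its own shuffled block, so arms stay balanced
    within sex even when the sexes enroll at different rates.
    """
    rng = random.Random(seed)
    pending = {}      # stratum -> assignments left in the current block
    allocation = {}
    for subject_id, sex in subjects:
        if not pending.get(sex):  # no block yet, or block used up
            block = ["treated", "control"] * (block_size // 2)
            rng.shuffle(block)
            pending[sex] = block
        allocation[subject_id] = pending[sex].pop()
    return allocation

# Hypothetical enrollment with unequal numbers of females and males.
subjects = [(f"F{i}", "female") for i in range(8)] + [(f"M{i}", "male") for i in range(4)]
allocation = stratified_block_randomize(subjects)
treated_females = sum(1 for s, arm in allocation.items() if s.startswith("F") and arm == "treated")
treated_males = sum(1 for s, arm in allocation.items() if s.startswith("M") and arm == "treated")
```

With complete blocks, exactly half of each stratum is treated, regardless of how many subjects of each sex are enrolled.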

Sample size considerations for studies including males and females

Most studies are planned from the outset with a sample size just large enough to afford 80% statistical power to detect the main effect of the primary exposure. Unless preplanned, most studies are underpowered to examine associations separately for males and females. This is particularly true of secondary data analyses of studies never designed to examine subgroup differences. This lack of statistical power to detect sex and gender differences can lead to the premature conclusion that such differences do not exist; in fact, most studies are simply too small to fairly test all but the most pronounced sex and gender differences. In the current era of accountability to analyze and present sex-stratified data, it is worth considering ideal practice and reality with respect to power and sample sizes to detect sex differences. Although most researchers will find that limited samples and funds constrain their ability to investigate sex and gender differences, we will also address the special case of “big data,” where problems may ensue from an abundance of statistical power to detect trivial differences, rather than too little power to detect meaningful differences.

Effect modification and interaction by sex

Epidemiologists and clinical researchers are familiar with the concepts of effect modification and interaction, although the terminology may differ between disciplines. However, basic investigators, whose aim is usually to limit all variation other than the exposure under examination, may be less familiar with these issues. “Effect modification” refers to the ability of a third variable (here, sex) to modify or interact with the “main effect” of the exposure (say, treatment) on outcome (usually, disease). For example, the association of diabetes with cardiovascular disease (CVD) is stronger for women than men ( 72 , 73 ); it is said that sex “interacts” with diabetes to cause CVD or that sex “modifies” the diabetes–CVD association.

Although stratifying data by sex to examine the exposure–disease association separately for males and females allows the investigator to eyeball effect modification by sex, such estimation gives no indication of the extent to which any observed sex differences are due to chance. To gauge this likelihood, many researchers test the statistical significance of sex differences by incorporating into their statistical models a (usually multiplicative) “interaction” term that represents the intersection of exposure and sex. For example, if the main effect of treatment is represented as a binary variable (0 if untreated; 1 if treated) and the main effect of sex as a binary variable (0 if male; 1 if female), then an interaction term (treatment × sex) which equals 1 only for treated females will, when modeled with the main effects of treatment and sex, capture the additional increment or decrement in the risk of the outcome that is attributable to both treatment and female sex, that is, the sex difference in the association between treatment and disease. By convention, P values <0.05 for such interaction terms are indicative of a statistically significant sex difference, one that is unlikely due to chance alone. Such tests of effect modification or interaction by sex can (and should) be as easily incorporated into basic research as in clinical and population research. The difficulty is having the statistical power to do so.
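
The logic of testing for interaction, rather than eyeballing stratified results, can be illustrated with summary statistics. The sketch below is a simplified stand-in for the regression approach described above: it assumes a continuous outcome, a balanced design, a known common standard deviation, and a normal approximation. In a saturated linear model, the treatment × sex coefficient equals this same difference between sex-specific effects. All numbers and function names are hypothetical.

```python
import math

def two_sided_p(z):
    """Two-sided P value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

def stratum_effect(mean_treated, mean_control, sd, n_per_arm):
    """Treatment effect within one sex: difference in means and its SE."""
    estimate = mean_treated - mean_control
    se = sd * math.sqrt(2 / n_per_arm)
    return estimate, se

def sex_interaction(female, male):
    """Difference between the sex-specific effects, its SE, and P value."""
    (est_f, se_f), (est_m, se_m) = female, male
    diff = est_f - est_m
    se = math.sqrt(se_f ** 2 + se_m ** 2)
    return diff, se, two_sided_p(diff / se)

# Hypothetical summary data for each sex stratum.
female = stratum_effect(mean_treated=15.0, mean_control=10.0, sd=10.0, n_per_arm=50)
male = stratum_effect(mean_treated=13.0, mean_control=10.0, sd=10.0, n_per_arm=50)

p_female = two_sided_p(female[0] / female[1])
p_male = two_sided_p(male[0] / male[1])
diff, se_diff, p_interaction = sex_interaction(female, male)
```

With these made-up numbers, the treatment effect is nominally significant in females and not in males, yet the interaction P value is far above 0.05. A contrast in stratum-specific significance is therefore not, by itself, evidence of a sex difference.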

Ideal: statistical power to detect interaction by sex

The sample size required to detect statistically significant sex differences (interactions by sex) is considerably larger than that required to detect the main effects of treatment or sex alone. Statistical power to detect a sex difference depends on the prevalence of the exposure, outcome, and sex, as well as the strength of the associations between them. Software is freely available to calculate sample sizes to detect interactions ( 74 , 75 ). However, the rule of thumb is that detecting an interaction takes roughly four times the sample size needed to detect a main effect of similar magnitude (i.e., treatment or sex alone) ( 76 ). Investigators need to take into account differential disease rate by sex and the expected magnitude of the main effect in each sex; statistical power to detect either main effects or a sex interaction may not be optimized by recruiting half women and half men. In planning, investigators may have to make “best guesses” at the magnitude of expected sex differences, based on the literature and biologic understanding. As with any power calculation, it is best to input a range of likely main effects and interactions to evaluate the impact of sample size on the ability to detect a sex interaction.
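
The fourfold rule of thumb follows from a simple variance argument, sketched below. The sketch assumes a continuous outcome with a common standard deviation, equal allocation to treatment arms and to sexes, a normal approximation, and an interaction of the same magnitude as the main effect. It is not a substitute for the dedicated software mentioned above, and the function names are ours.

```python
from statistics import NormalDist

def _z_total(alpha, power):
    """Sum of the critical value and the power quantile (normal approximation)."""
    nd = NormalDist()
    return nd.inv_cdf(1 - alpha / 2) + nd.inv_cdf(power)

def total_n_main_effect(delta, sd, alpha=0.05, power=0.80):
    """Total N (two equal arms) to detect a difference in means of `delta`.

    Var(difference between two arms of N/2 each) = 4 * sd^2 / N.
    """
    return 4 * (_z_total(alpha, power) * sd / delta) ** 2

def total_n_interaction(delta, sd, alpha=0.05, power=0.80):
    """Total N (four equal treatment-by-sex cells) to detect a
    difference between sex-specific effects of size `delta`.

    Var(difference of differences over four cells of N/4 each) = 16 * sd^2 / N.
    """
    return 16 * (_z_total(alpha, power) * sd / delta) ** 2

# Hypothetical effect of half a standard deviation.
n_main = total_n_main_effect(delta=0.5, sd=1.0)
n_interaction = total_n_interaction(delta=0.5, sd=1.0)
ratio = n_interaction / n_main
```

Under these assumptions the ratio is exactly 4. If the expected interaction is smaller than the main effect, as is often the case, the required sample grows further.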

A study that is large enough to detect a sex interaction, if one exists, represents the “ideal” in sex difference studies. Few studies are planned with the power to detect statistically significant sex differences. Many studies that have attempted to test interactions by sex have been woefully underpowered to do so. Unfortunately, researchers easily forget that an interaction P value >0.05 often says as much about the design and size of the study as it does about the presence or absence of a sex difference.

Next best: statistical power to detect main effects within sex strata

Even where a study is too small to test for sex interaction effects, it may still have enough statistical power to examine the main effects of exposure within sex strata. This is simple sex stratification to examine exposure–disease associations for each sex. (Does diabetes predict CVD among males? Does diabetes predict CVD among females?) A study may find a statistically significant beneficial impact of treatment on disease among males and fail to find a significant effect of treatment among females (or, in extreme cases, find statistically significant benefits or harms that vary by sex). However, if the study lacks power to test an interaction by sex, investigators cannot claim that they have detected a difference between males and females that meets conventional standards for ruling out chance. As discussed, the detection of a statistically significant interaction by sex is a high bar. However, apparent contrasts in sex-stratified data—differential main effects of treatment by sex—can suggest the presence of sex differences. At the least, they provide a rationale for larger studies powered to detect sex interactions, or incentivize data collection across studies for meta-analyses of interactions by sex ( 15 ).

To plan a study with adequate statistical power to detect main effects by sex is straightforward: simply calculate sample sizes needed to detect main effects in men and women separately (and add them up), taking into account sex differences in rates of disease, expected size of impact of exposure, and, for observational studies, expected prevalence of exposure.
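
For a binary outcome, the "calculate separately and add them up" approach might look like the sketch below. It uses the standard unpooled normal-approximation formula for comparing two proportions and assumes equal allocation to arms. The baseline risks and the 25% relative risk reduction are hypothetical, as is the function name.

```python
import math
from statistics import NormalDist

def n_per_arm(p_control, p_treated, alpha=0.05, power=0.80):
    """Subjects per arm to detect p_control vs p_treated within one sex."""
    nd = NormalDist()
    z = nd.inv_cdf(1 - alpha / 2) + nd.inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return math.ceil(z ** 2 * variance / (p_control - p_treated) ** 2)

# Hypothetical: same relative risk reduction in both sexes,
# but women have half the baseline disease risk of men.
men_per_arm = n_per_arm(p_control=0.20, p_treated=0.15)
women_per_arm = n_per_arm(p_control=0.10, p_treated=0.075)
total_n = 2 * (men_per_arm + women_per_arm)
```

Because the assumed baseline risk is lower in women, more women than men are needed to detect the same relative effect, so the resulting design is not a 50/50 split.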

Many studies analyze their data by sex as an afterthought. Such subgroup analyses of main effects stratified by sex are often underpowered, which heightens the risk of type II error, or false-negative results. This is true even when the original analysis, in which all subjects are analyzed together, regardless of sex, reports a statistically significant association of exposure with disease. For example, in a study in which the exposure–disease association approaches statistical significance (say, a 2 standard error difference in outcome between study arms), splitting subjects into two groups of similar size will yield a one in three chance that the association will be sizeable and statistically significant ( P < 0.05) in one group and inconsequential in the other (less than a standard error difference) ( 77 ).

Better than nothing: representation of sex

Studies underpowered to detect even sex-stratified main effects can still make available data and/or analyses stratified by sex, particularly in supplemental material, without making inferences regarding sex differences per se. Such data may serve as preliminary analyses for future studies adequately powered to detect sex differences and may be used in meta-analyses.

Special considerations for “big data”

We have entered an era in which enormous datasets are increasingly available. Many of them include the variable “sex.” Two cautions are important to emphasize. First, such datasets, while deep in sample size, are often narrow in breadth, lacking the variables (discussed below) helpful to contextualize and understand sex differences. Second, the temptation in very large datasets to stratify by any variable is strong, as it is easy to detect statistically significant interactions, including sex differences, of clinically trivial and meaningless magnitude ( 78 ). Sound motivation to test sex differences, discussed earlier, is essential. So is conservative interpretation of statistically significant findings. To whom much data are given, much common sense is demanded: extra caution needs to be exercised in interpreting studies with enormous statistical power to detect minute differences between subgroups.
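
The second caution can be made concrete with a short calculation. The sketch below assumes a two-group comparison of means with equal group sizes and a normal approximation; the standardized difference of 0.01 standard deviations is a hypothetical stand-in for a clinically trivial sex difference.

```python
import math

def p_value_for_difference(std_diff, n_per_group):
    """Two-sided P value for a difference in means expressed in SD units."""
    z = std_diff / math.sqrt(2 / n_per_group)
    return math.erfc(abs(z) / math.sqrt(2))

trivial_difference = 0.01  # hypothetical: one hundredth of a standard deviation
p_by_sample_size = {
    n: p_value_for_difference(trivial_difference, n)
    for n in (1_000, 100_000, 10_000_000)
}
```

The same trivial difference is nowhere near significance with 1,000 subjects per group and overwhelmingly "significant" with 10 million, which is why the magnitude of a difference, and not only its P value, needs to be reported and interpreted.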

Truly sex-informed research is more than just stratifying by sex or gender. Researchers should collect the data to characterize the ways in which exposures, diseases, and contributing environmental factors vary by sex and gender.

Defining and measuring sex and gender

Although it can be difficult to determine the sex of subjects in some species, for the most part, the sex of humans and nonhuman subjects in biomedical research is known. Categories of sex include males, females, intersexual individuals born with male and female characteristics, and people who undergo interventions to reassign their sex ( 25 ). In some instances, syndromes resulting from atypical sexual development can complicate categorization of sex ( 79 ).

Defining gender in human studies is both difficult and controversial. Indeed, some have argued that sex and gender are “irreducibly entangled,” and that even the most seemingly straightforward presentation of sex as a biological variable in human studies is inevitably a mix of sex and gender ( 24 , 25 , 80 ). Sociologists Westbrook and Saperstein ( 81 ), observing the tendency of large surveys to conflate sex and gender, call the state of measurement a “conceptual muddle” that is fraught with essentialist treatment of sex and gender as synonymous, obvious, easily determined by others, and unchanging over the life course.

The very concept of gender is subtle, complex, and shifting. It has been suggested that gender comprises at least three distinct, but interrelated components, the “three dimensions of gender” ( 82 ). These include: (1) our physical bodies, how we experience them, and how others interact with them; (2) our gender identity, our internal sense of ourselves as female, male, a blend of both, or neither; and (3) our gender expression, how we present our gender and how society interacts with the gender we present. Note that these dimensions are independent of sexual orientation. We are likely to see new measures of gender emerge; however, at present, there are few studies that have attempted to relate nuanced dimensions of gender to health and disease ( 81 , 83 ).

In the meantime, some researchers have ventured measures of gender that are intended to be distinct from sex ( 84–86 ). For example, several gendered factors correlated with poor health among women have been proposed as proxies for gender influences on health, including income, education, labor force participation, single-headed household, unpaid child and elder care, unpaid housework, political participation, and access to education or health ( 86 ). Particularly problematic has been the identification of proxies for male gender that might influence men’s health. The prevalence of gun ownership, for example, has been proposed ( 84 ). Such measures of gender are often measures of gender inequality. Many times they are based on national or state-level statistics, rather than more granular individual or household data ( 86 ).

Pelletier et al. ( 85 ) have proposed a method to measure individual-level gender as “psychosocial sex,” in contrast to “biological sex.” They argue that, as gender roles and attitudes—components that might comprise a gender index—depend on culture, age, and era, no single gender scoring system is broadly applicable. Rather, a method for defining gender within a study population is a better approach to measure gender. Drawing from extensive questionnaires completed by their study participants, the researchers identified a set of seven variables (including income, hours doing housework, and scores on a sex role inventory survey) that resulted in a continuous gender score ranging from masculine to feminine characteristics. Independent of sex, a high gender score (more feminine characteristics) was associated with increased risk of diabetes, hypertension, and depression and anxiety symptoms ( 85 ). In fact, once gender was accounted for, sex no longer predicted these health outcomes. Although the study was not large enough to exclude a modest interaction between sex and gender ( i.e. , did the gender score predict outcomes more among males or among females?), the authors observed that the higher femininity score appeared to predict outcomes for men as well as for women (Louise Pilote, personal communication). This study was possible only because of the extensive collection of economic and psychosocial covariates related to gender. To the extent possible, studies should be designed to collect data on gender. However, lack of data with which to construct a comprehensive gender measure does not absolve investigators of considering gender in their interpretation of data regarding sex differences.

Is gender relevant to animal studies? If it is hard to measure gender in human beings, it would seem entirely alien to do so in other species. However, a few investigators have attempted to design exposures that mimic human gendered experiences. For example, Shors et al. ( 87 ) developed an animal model (sexual conspecific aggressive response, or SCAR) to examine the effects of sexual aggression on the brain and learned behaviors. Pubescent female rodents were paired with sexually experienced adult males. The females released high levels of adrenal stress hormones, and their ability to learn, including learning maternal caring behavior, was suppressed. The authors suggested that such experiments are aimed at understanding how sexual trauma impacts mechanisms that shape the female brain. Although other interpretations of that animal model are possible, studies in humans have reported that women with a history of childhood sexual trauma, a highly prevalent exposure, show irregularities in cortical and subcortical tissue and long-term alterations of the hypothalamic–pituitary–adrenal axis, compared with women without childhood sexual trauma ( 88 , 89 ). Sexual assault affects people of all sexes and genders, but considerably more often girls and women, and therefore constitutes a gendered exposure ( 58–61 ). National surveys show that physical child abuse is also common, often more so for boys than girls ( 59 , 60 ). Other violent exposures, such as combat casualties and war-time trauma, also have gendered distributions and implications ( 90 , 91 ).

Measuring sex and gender differences in disease presentation

It is essential to capture outcomes in sufficient detail to detect sex and gender differences in disease presentation. The classic example in clinical research is CHD, one of the leading causes of morbidity and mortality for men and women in the United States. Traditionally, myocardial infarction was characterized as the result of obstruction of the large coronary arteries. However, up to one third of women with a myocardial infarction and two thirds of women with chest pain had unobstructed arteries upon angiography ( 92 , 93 ). It is now recognized that myocardial ischemia may result from disease of the coronary microvessels. The Women’s Ischemia Syndrome Evaluation study reported that roughly half of the women with angina and ischemia without coronary artery obstruction evidenced microvascular dysfunction ( 94 ). Instead of the classic chest-crushing sensation of coronary artery obstruction, women with microvessel disease may present with shortness of breath and fatigue, nonspecific symptoms easy to misdiagnose and often dismissed. Thus, an investigator studying CHD in both sexes needs to consider the symptoms and diagnostic tests that will capture the presentation of disease in women and men ( 95 ).

Another example of differential disease presentation by sex is the tendency for prolactinomas to be detected as microadenomas among women, but macroadenomas among men. This results, at least in part, from the earlier detection among women, in whom small elevations in prolactin can cause infertility, menstrual disturbances, and/or galactorrhea. In contrast, in men, prolactinomas may progress to macroadenomas before they become symptomatic with headaches, double vision, or vision loss resulting from the mass of the tumor pressing on neurologic structures in the brain. Although this difference between the sexes is largely attributed to differences in diagnostic timing, the possibility that prolactinomas are more aggressive in men has not been entirely excluded ( 96 , 97 ). In animal models, sex differences in the expression and activity of pituitary transforming growth factor β1 may contribute to sex differences in prolactinoma incidence ( 98 ).

Sex and gender differences in drug exposures and metabolism

The use of exogenous hormones, such as oral contraceptives, menopausal hormone therapy, testosterone, and anabolic steroids, is particularly important to document. Taken systemically, by mouth, injection, or patch, such drugs affect reproductive and nonreproductive systems throughout the body, and they could be important to investigations in which the exposure–disease relationship could be affected by sex hormones. In the United States, sales of testosterone, available as oral medicine, gel, patch, or injection, grew 77% from 2010 to 2013, with 2.3 million prescriptions filled ( 103 ). It is estimated that 2.9 to 4 million Americans, largely men, have used anabolic steroids in their lifetime ( 104 ).

In addition to their direct impact on brain, bone, muscle, metabolism, immune, cardiovascular, and reproductive function ( 105–107 ), reproductive steroids often interact with other drugs. For example, among patients with growth hormone deficiency, women taking oral estrogen require twice as much growth hormone as men or women not taking oral estrogen to achieve the same levels of insulin-like growth factor 1 ( 108 ); current guidelines for treatment of adult growth hormone deficiency now recommend the consideration of estrogen status in dosing ( 109 ). Basic scientists may find it illuminating to vary the levels of exogenous hormone exposures in their experiments to mimic widespread human exposures.

In addition to intentional exogenous hormone exposure, there is an increasing body of literature suggesting that exposure to endocrine disrupting chemicals in the environment may affect human health. For example, phthalates are a nearly ubiquitous class of chemicals used in the manufacturing of household products, including food packaging and personal care products such as cosmetics and nail polish. Exposure to phthalates may depend on occupation and use of personal care products; higher urine concentrations of phthalate metabolites have been reported among women compared with men ( 110 , 111 ). Phthalate exposures have been associated with changes in insulin receptor and glucose oxidation in the Chang liver cell line (unspecified sex) ( 112 ), with signs of diabetes and endocrine disruption in female rats ( 113 ), and with insulin resistance and diabetes in men and women ( 111 , 114 , 115 ).

The failure to consider exogenous hormone use, endogenous hormones, and/or markers of hormonal status (such as menopause) may contribute to the lack of reproducibility in many studies. For example, investigators found that the serum concentrations of 68% of 171 serum biomarkers associated with chronic disease were affected by sex, oral contraceptive use, menstrual phase, or menopausal status ( 116 ). They estimated up to 40% false discoveries in biomarkers when sex was ignored and up to 41% false discoveries when oral contraceptive use was ignored. Heeding this caution, investigators may want to collect data on menstrual or estrus phase, menopausal status, use of exogenous hormones, and/or levels of circulating hormones ( 14 ).

Measuring reproductive cycle and phase

There are nuances to asking women to report their menstrual cycle and menopausal status. For example, as the duration of the luteal phase varies less than that of the follicular phase, menstrual cycle timing is best recorded retrospectively from the first day of the next menstrual period ( 117 ). As menstrual cycles may be suppressed or dictated by hormonal contraceptives (including hormonal intrauterine devices) or breastfeeding, it is useful to record these variables when assessing menstrual timing. Menopause may occur naturally or may result from hysterectomy, oophorectomy, or chemotherapy, and it cannot be determined until a year after the last menstrual period. Measurement of menstrual cycle phase and menopausal transition is covered elsewhere ( 117–119 ). Archived biospecimens should include information about such variables (e.g., time of blood draw, day of menstrual cycle).
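The backward-counting approach described above can be sketched in a few lines of Python. This is a minimal illustration only: the function names, the example dates, and the 14-day cutoff used to label a blood draw as luteal are assumptions made for the example, not values taken from the cited references.

```python
from datetime import date

# Hypothetical cutoff: number of days before the next menses that this
# sketch treats as luteal phase. An illustrative assumption, not a standard.
ASSUMED_LUTEAL_DAYS = 14

def days_before_next_menses(draw_date: date, next_onset: date) -> int:
    """Count backward from the first day of the next menstrual period."""
    if draw_date >= next_onset:
        raise ValueError("draw_date must precede the next menstrual onset")
    return (next_onset - draw_date).days

def assumed_phase(draw_date: date, next_onset: date) -> str:
    """Label a draw 'luteal' or 'follicular' using the assumed cutoff."""
    days = days_before_next_menses(draw_date, next_onset)
    return "luteal" if days <= ASSUMED_LUTEAL_DAYS else "follicular"

# Made-up dates for a single archived specimen.
draw = date(2018, 3, 10)
next_period = date(2018, 3, 17)
print(days_before_next_menses(draw, next_period), assumed_phase(draw, next_period))
```

In practice such a record would sit alongside the other variables named above, such as hormonal contraceptive use and breastfeeding, because a backward count is uninformative when cycles are suppressed.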

Influential covariates that vary by sex and gender

Some factors that vary by sex or gender can influence the exposure–disease association under study, either as confounders (easily mistaken for sex differences) or effect modifiers (covariates that interact with sex to change outcome). Some of these sex-dependent covariates are obvious, such as parity. Other aspects of reproductive history may affect nonreproductive systems under study ( 120 ). For example, history of the hypertensive pregnancy disorder preeclampsia predicts twofold higher risk of CVD in women affected by the disorder ( 121 ). A woman’s history of preeclampsia might modify the impact of an antihypertensive drug. Exposures to exogenous endocrine drugs, such as those administered in the course of assisted reproductive technologies such as in vitro fertilization, might affect systems under study.

The degree to which other exposures, such as cigarette smoking, alcohol use, physical activity, socioeconomic position, caretaking responsibilities, and medication use, vary by sex and gender will depend on the population under study. For example, in some, but not all, cultures and climates, circulating 25-hydroxyvitamin D concentrations may differ considerably between men and women as a result of gender differences in factors such as clothing, time spent outdoors, and supplement use ( 122–124 ); depending on the country, these differences may be equalized by dietary vitamin D intake, particularly of fortified foods. Additionally, as shown in a study in the Netherlands, lower 25-hydroxyvitamin D levels among women may be explained by their higher adiposity levels, a difference that could be attributed either to sex (a biological difference) or gender (as many social determinants of adiposity are gendered) ( 125 ).

Particularly important to consider are sex or gender differences in the distribution of comorbidities that might influence an exposure–disease association. For example, compared with diabetic women, diabetic men have lower prevalence of depression and anxiety, gendered psychosocial factors that impede self-care activity and treatment success ( 126 ). Thus, it would be wise for studies examining sex differences in CVD to account for major depression history, especially when depression is associated with the main exposure under consideration ( 21 ).

The problem of stratifying everything by sex

Subgroup analyses from studies thoughtfully designed to query sex differences, particularly once replicated, can provide sound evidence of benefit or protection from harm for women and men. Alternatively, post hoc sex difference analyses, devoid of theoretical basis and sound construction, may create more noise than light. A recent analysis of sex differences presented in Cochrane reviews of clinical trials suggested that few met the stringent criteria of documenting statistically significant interactions by sex ( 127 ); this criticism of sex differences research is cautionary. Whether the absence of sex differences reflects fact, indiscriminate testing, lack of sample size, or the decades it takes for sex differences observed in basic or population research to reach clinical testing remains to be seen ( 128 ).

The risk of a blanket mandate to require all studies to stratify all results by sex is that the literature-wide type I error, that is, the risk of detecting false sex differences, will skyrocket. Furthermore, as we increase the size and statistical power of our studies to detect true sex interactions (minimizing type II error), we court the risk of finding sex differences where none exist (type I error). As mentioned earlier, type I error is a particular hazard of atheoretical “big data” analysis.
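A short calculation shows how quickly false sex differences can accumulate. The sketch below assumes that every test is run at the same significance level, that no true sex difference exists, and that the tests are independent. These are simplifying assumptions for illustration; real analyses are usually correlated, and the function names are hypothetical.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """P(at least one false positive) across n independent null tests."""
    return 1.0 - (1.0 - alpha) ** n_tests

def expected_false_positives(alpha: float, n_tests: int) -> float:
    """Expected count of false positives when every null is true."""
    return alpha * n_tests

# Chance of at least one spurious "sex difference" as tests accumulate.
for k in (1, 10, 100):
    fwer = familywise_error_rate(0.05, k)
    print(k, round(fwer, 3), expected_false_positives(0.05, k))
```

Under these assumptions, ten unplanned sex-stratified tests at the 0.05 level already carry roughly a 40% chance of at least one false positive.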

If we pursue sex difference analysis as a poorly considered mandate, a literature of contradiction will follow. The field of sex differences research risks discredit from unthinking and profligate enthusiasm. How, then, can we encourage sex differences research that is thoughtful, conservative, and consistent over time?

Prespecified hypothesis tests

In any subgroup analysis, including sex and gender, tests of interaction should be limited and prespecified in statistical analytic plans ( 129 ). Although this does not guarantee that tests of hypotheses will be well constructed, it does help to protect against post hoc fishing and data-derived hypothesis testing (themselves self-fulfilling prophecies). “Surprise” subgroup findings should be presented as such, and interpreted with caution—the basis for further study, not for instant translation to clinic or policy. There seems almost a reflexive tendency of researchers to view male and female as the fundamental dichotomy of the biologic world ( 24 ). We need to approach the question of sex differences with curiosity and skepticism, rather than unquestioning assumption.

In creating an a priori hypothesis, it is best practice to prespecify the expected direction and magnitude of the sex difference. Should there be subgroups within sex, such as nulliparous vs parous, or premenopausal vs postmenopausal? Careful a priori hypothesizing is important for observational studies, experiments, and trials, and it serves to maintain scientific transparency. In large studies, some statistically significant sex differences may arise by chance. Thus, prespecifying the form of the expected interaction helps guard against indiscriminate post hoc scrambling.

Accounting for confounding

Variables on the pathway between sex or gender and outcome

Some variables may be intermediates between sex or gender and the outcomes under study, and their treatment in analysis deserves special consideration. For example, sex and gender are two of many factors that determine body size and composition. Not only are body size and adiposity determined by sex steroids, sex steroid receptors, adipokines, and other differences between males and females ( 134 ), but gender differences in physical activity also affect body size and composition ( 135 ). [Interestingly, there are also sex differences in the voluntary physical activity of rodents: females exercise more than males ( 135 ).] Whether to control for body size in studies of sex and gender effects is a nuanced decision.

For example, in a study of differences between men and women in the impact of Ambien on impaired driving, should the investigator adjust for body size in evaluating sex or gender differences? On the one hand, body size is (at least partially) a product of sex and gender, and adjusting for it might obscure the most important pathway (body size) through which sex and gender impact the metabolism of the drug. On the other hand, adjustment for body size might reveal other mechanisms (some of which are discussed in “Sex and gender differences in drug exposures and metabolism” above) through which sex or gender affects drug clearance. Statistical methods can help segregate or “decompose” the impact of such intermediate variables, also known as mediators ( 136 , 137 ).
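One simple way to see what decomposition means is the difference method for linear models: the total effect of sex on an outcome, minus the effect that remains after adjusting for the mediator, gives the portion that runs through the mediator. The sketch below is not the method of the cited references. It uses made-up, noise-free data and assumes linear effects, no sex-by-mediator interaction, and no unmeasured confounding. All variable and function names are hypothetical.

```python
from statistics import mean

def slope(x, y):
    """Ordinary least squares slope of y on a single predictor x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def two_predictor_slopes(x1, x2, y):
    """OLS slopes of y on x1 and x2 (with intercept) via normal equations."""
    m1, m2, my = mean(x1), mean(x2), mean(y)
    c1 = [a - m1 for a in x1]
    c2 = [a - m2 for a in x2]
    cy = [a - my for a in y]
    s11 = sum(a * a for a in c1)
    s22 = sum(a * a for a in c2)
    s12 = sum(a * b for a, b in zip(c1, c2))
    s1y = sum(a * b for a, b in zip(c1, cy))
    s2y = sum(a * b for a, b in zip(c2, cy))
    det = s11 * s22 - s12 ** 2
    return (s1y * s22 - s2y * s12) / det, (s2y * s11 - s1y * s12) / det

# Synthetic data: sex coded 0/1; body mass (kg) differs by sex on average.
sex = [0, 0, 0, 1, 1, 1]
body_mass = [60, 70, 80, 70, 80, 90]
# Hypothetical outcome built from a direct sex effect plus a body-mass effect.
outcome = [2.0 * s + 0.1 * m for s, m in zip(sex, body_mass)]

total_effect = slope(sex, outcome)                # unadjusted sex difference
direct_effect, mass_effect = two_predictor_slopes(sex, body_mass, outcome)
indirect_effect = total_effect - direct_effect    # portion via body mass

print(total_effect, direct_effect, indirect_effect)
```

In this toy example, adjusting for body mass removes about one third of the sex difference. Whether the adjusted or the unadjusted estimate is the quantity of interest depends on the scientific question, as discussed above.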

Replication

The critical importance of replication has been addressed by others ( 78 ). This is particularly true in a scientific climate that encourages, or even mandates, subgroup analysis.

Interpretation

As with any study finding, apparent sex differences (or lack thereof) need to be interpreted with caution ( 138 ). The magnitude and direction of apparent sex effects need to be placed in the context of prior knowledge. Biological mechanisms, hopefully outlined a priori, need to be discussed. The adequacy of the study to rule out bias, confounding, and chance needs to be frankly addressed. Even statistically significant sex differences may be due to chance or bias, instead of true heterogeneity of exposure–disease associations or of treatment effects ( 129 ).

This is especially true when interpreting main effects stratified by sex, where a study lacks statistical power to test interaction by sex. In this case, the play of chance is often overlooked and findings are overinterpreted. As an example, Assmann et al. ( 139 ) cite a subgroup analysis of a trial that followed myocardial infarction survivors ( 140 ). The investigators, laudably, considered the differential impact of treatment on the mortality of men and women, including both sexes and prespecifying stratification by sex in the analyses. However, they did not plan to test an interaction by sex. Their intervention had no overall impact on mortality. There was no association of treatment with cardiac mortality among men ( P = 0.94). However, in women, the authors observed what they interpreted as a “possible harmful impact of the intervention” on women’s cardiac mortality ( P = 0.06). Later, Assmann et al. used their data to calculate a proper test of interaction by sex, which revealed no statistically significant evidence of an interaction between treatment and sex ( P for interaction = 0.21), indicating that chance could well explain the seeming sex effect. Thus, despite the disparate associations and P values in the two sex strata, the study was simply too small to test whether the impact of the treatment on cardiac mortality truly differed by sex.
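The gap between stratum-specific P values and a test of interaction can be illustrated with a simple z-test comparing two independent estimates. The numbers below are invented for illustration and are not the trial's data. The sketch assumes approximately normal estimates (for example, log hazard ratios) with known standard errors, and all function names are hypothetical.

```python
import math

def two_sided_p(z: float) -> float:
    """Two-sided P value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def stratum_p(estimate: float, se: float) -> float:
    """P value for the treatment effect within one sex stratum."""
    return two_sided_p(estimate / se)

def interaction_test(est_a: float, se_a: float, est_b: float, se_b: float):
    """z-test for the difference between two independent stratum estimates."""
    diff = est_a - est_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = diff / se_diff
    return z, two_sided_p(z)

# Invented log hazard ratios and standard errors for each stratum.
women_est, women_se = 0.40, 0.21
men_est, men_se = 0.01, 0.15

p_women = stratum_p(women_est, women_se)
p_men = stratum_p(men_est, men_se)
z_int, p_int = interaction_test(women_est, women_se, men_est, men_se)

print(round(p_women, 3), round(p_men, 3), round(p_int, 3))
```

Because the standard error of a difference combines the uncertainty of both strata, the interaction test is less sensitive than either stratum-specific test. A near-significant result in one sex and a null result in the other can therefore coexist with weak evidence that the effects differ.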

Most importantly, surprise subgroup findings need to be acknowledged as such. Sex differences that were discovered as the result of post hoc poking around in the data need to be treated with caution until they are replicated as pre hoc tests in other studies. In this event, supplemental tables stratified by sex, data repositories, and meta-analyses may extend the impact of any single study. In other words, investigators can make available sex-stratified data to spur the generation of new hypotheses, without presenting sex-stratified analyses that overreach the intent and design of their original study.

New governmental mandates mean that researchers will be collecting and analyzing data by sex, but the onus is on investigators to address this adequately and at all levels of basic, clinical, and population research. If we fail, the “noise” created by multiple testing across all our datasets may drown out the signal of true sex differences. Furthermore, in human studies it is important to investigate the impact of both sex and gender to illuminate fundamental, modifiable causes of disease and avoid a reflexive attribution of seeming sex differences solely to biology. If we address these design and analytic issues skillfully, then we have the chance for new insights for men and women that will be critical for the next generation of scientific and therapeutic discoveries in this age of precision medicine.

Abbreviations

CHD, coronary heart disease

CVD, cardiovascular disease

ER, estrogen receptor

NIH, National Institutes of Health

Disclosure Summary: The authors have nothing to disclose.

National Institutes of Health. Consideration of sex as a biological variable in NIH-funded research. Available at: grants.nih.gov/grants/guide/notice-files/NOT-OD-15-102.html . Accessed 17 April 2017.

Legato MJ , Johnson PA , Manson JE . Consideration of sex differences in medicine to improve health care and patient outcomes . JAMA . 2016 ; 316 ( 18 ): 1865 – 1866 .

Goldstein JM , Holsen L , Handa R , Tobet S . Fetal hormonal programming of sex differences in depression: linking women’s mental health with sex differences in the brain across the lifespan . Front Neurosci . 2014 ; 8 : 247 .

Institute of Medicine . Exploring the Biological Contributions to Human Health: Does Sex Matter? Washington, DC : National Academies Press ; 2001 .

Johnson J , Sharman Z , Vissandjée B , Stewart DE . Does a change in health research funding policy related to the integration of sex and gender have an impact ? PLoS One . 2014 ; 9 ( 6 ): e99900 .

European Commission Directorate-General for Research and Innovation. H2020 programme: guidance on gender equality in Horizon 2020. Available at: eige.europa.eu/sites/default/files/h2020-hi-guide-gender_en.pdf . Accessed 5 April 2018.

Rabesandratana T. Adding sex-and-gender dimensions to your research. Available at: www.sciencemag.org/careers/2014/03/adding-sex-and-gender-dimensions-your-research . Accessed 12 February 2018.

National Institutes of Health. Enhancing reproducibility through rigor and transparency. Available at: grants.nih.gov/grants/guide/notice-files/NOT-OD-15-103.html . Accessed 17 April 2017.

National Institutes of Health/Agency for Healthcare Research and Quality. Implementing rigor and transparency in NIH & AHRQ research grant applications. Available at: grants.nih.gov/grants/guide/notice-files/NOT-OD-16-011.html . Accessed 17 April 2017.

National Institutes of Health/Agency for Healthcare Research and Quality. Implementing rigor and transparency in NIH & AHRQ career development award applications. Available at: grants.nih.gov/grants/guide/notice-files/NOT-OD-16-012.html . Accessed 17 April 2017.

Ritz SA , Antle DM , Côté J , Deroy K , Fraleigh N , Messing K , Parent L , St-Pierre J , Vaillancourt C , Mergler D . First steps for integrating sex and gender considerations into basic experimental biomedical research . FASEB J . 2014 ; 28 ( 1 ): 4 – 13 .

Becker JB , Arnold AP , Berkley KJ , Blaustein JD , Eckel LA , Hampson E , Herman JP , Marts S , Sadee W , Steiner M , Taylor J , Young E . Strategies and methods for research on sex differences in brain and behavior . Endocrinology . 2005 ; 146 ( 4 ): 1650 – 1673 .

Ouyang P , Wenger NK , Taylor D , Rich-Edwards JW , Steiner M , Shaw LJ , Berga SL , Miller VM , Merz NB . Strategies and methods to study female-specific cardiovascular health and disease: a guide for clinical scientists . Biol Sex Differ . 2016 ; 7 ( 1 ): 19 .

Miller VM , Kaplan JR , Schork NJ , Ouyang P , Berga SL , Wenger NK , Shaw LJ , Webb RC , Mallampalli M , Steiner M , Taylor DA , Merz CN , Reckelhoff JF . Strategies and methods to study sex differences in cardiovascular structure and function: a guide for basic scientists . Biol Sex Differ . 2011 ; 2 ( 1 ): 14 .

Miller LR , Marks C , Becker JB , Hurn PD , Chen WJ , Woodruff T , McCarthy MM , Sohrabji F , Schiebinger L , Wetherington CL , Makris S , Arnold AP , Einstein G , Miller VM , Sandberg K , Maier S , Cornelison TL , Clayton JA . Considering sex as a biological variable in preclinical research . FASEB J . 2017 ; 31 ( 1 ): 29 – 34 .

Cornelison TL , Clayton JA . Considering sex as a biological variable in biomedical research . Gender and the Genome. 2017 ; 1 ( 2 ): 89 – 93 .

Nieuwenhoven L , Klinge I . Scientific excellence in applying sex- and gender-sensitive methods in biomedical and health research . J Womens Health (Larchmt) . 2010 ; 19 ( 2 ): 313 – 321 .

Legato MJ . Principles of Gender-Specific Medicine: Gender in the Genomic Era . 3rd ed. London, UK : Academic Press ; 2017 .

Arnold AP , Lusis AJ . Understanding the sexome: measuring and reporting sex differences in gene systems . Endocrinology . 2012 ; 153 ( 6 ): 2551 – 2555 .

Mosca L , Barrett-Connor E , Wenger NK . Sex/gender differences in cardiovascular disease prevention: what a difference a decade makes . Circulation . 2011 ; 124 ( 19 ): 2145 – 2154 .

Tobet SA , Handa RJ , Goldstein JM . Sex-dependent pathophysiology as predictors of comorbidity of major depressive disorder and cardiovascular disease . Pflugers Arch . 2013 ; 465 ( 5 ): 585 – 594 .

Anastario M , Salafia CM , Fitzmaurice G , Goldstein JM . Impact of fetal versus perinatal hypoxia on sex differences in childhood outcomes: developmental timing matters . Soc Psychiatry Psychiatr Epidemiol . 2012 ; 47 ( 3 ): 455 – 464 .

Whitley H , Lindsey W . Sex-based differences in drug activity . Am Fam Physician . 2009 ; 80 ( 11 ): 1254 – 1258 .

Springer KW , Mager Stellman J , Jordan-Young RM . Beyond a catalogue of differences: a theoretical frame and good practice guidelines for researching sex/gender in human health . Soc Sci Med . 2012 ; 74 ( 11 ): 1817 – 1824 .

Krieger N . Genders, sexes, and health: what are the connections—and why does it matter ? Int J Epidemiol . 2003 ; 32 ( 4 ): 652 – 657 .

Sen G , Ostlin P . Gender inequity in health: why it exists and how we can change it . Glob Public Health . 2008 ; 3 ( Suppl 1 ): 1 – 12 .

Garcia M , Mulvagh SL , Merz CN , Buring JE , Manson JE . Cardiovascular disease in women: clinical perspectives . Circ Res . 2016 ; 118 ( 8 ): 1273 – 1293 .

Jazin E , Cahill L . Sex differences in molecular neuroscience: from fruit flies to humans . Nat Rev Neurosci . 2010 ; 11 ( 1 ): 9 – 17 .

Manson JE , Bassuk SS , Lee IM , Cook NR , Albert MA , Gordon D , Zaharris E , Macfadyen JG , Danielson E , Lin J , Zhang SM , Buring JE . The VITamin D and OmegA-3 TriaL (VITAL): rationale and design of a large randomized controlled trial of vitamin D and marine omega-3 fatty acid supplements for the primary prevention of cancer and cardiovascular disease . Contemp Clin Trials . 2012 ; 33 ( 1 ): 159 – 171 .

King AJ . The use of animal models in diabetes research . Br J Pharmacol . 2012 ; 166 ( 3 ): 877 – 894 .

Clutton-Brock TH , Isvaran K . Sex differences in ageing in natural populations of vertebrates . Proc Biol Sci . 2007 ; 274 ( 1629 ): 3097 – 3104 .

Lyon MF . Gene action in the X-chromosome of the mouse ( Mus musculus L.) . Nature . 1961 ; 190 ( 4773 ): 372 – 373 .

Busque L , Mio R , Mattioli J , Brais E , Blais N , Lalonde Y , Maragh M , Gilliland DG . Nonrandom X-inactivation patterns in normal females: lyonization ratios vary with age . Blood . 1996 ; 88 ( 1 ): 59 – 65 .

Abkowitz JL , Taboada M , Shelton GH , Catlin SN , Guttorp P , Kiklevich JV . An X chromosome gene regulates hematopoietic stem cell kinetics . Proc Natl Acad Sci USA . 1998 ; 95 ( 7 ): 3862 – 3866 .

Barrett EL , Richardson DS . Sex differences in telomeres and lifespan . Aging Cell . 2011 ; 10 ( 6 ): 913 – 921 .

Astuti Y , Wardhana A , Watkins J , Wulaningsih W ; PILAR Research Network . Cigarette smoking and telomere length: a systematic review of 84 studies and meta-analysis . Environ Res . 2017 ; 158 : 480 – 489 .

Resende MM , Taylor DA . Building solutions for cardiovascular disease in women . Tex Heart Inst J . 2013 ; 40 ( 3 ): 285 – 287 .

El-Maarri O , Becker T , Junen J , Manzoor SS , Diaz-Lacava A , Schwaab R , Wienker T , Oldenburg J . Gender specific differences in levels of DNA methylation at selected loci from human total blood: a tendency toward higher methylation levels in males . Hum Genet . 2007 ; 122 ( 5 ): 505 – 514 .

Zhu ZZ , Hou L , Bollati V , Tarantini L , Marinelli B , Cantone L , Yang AS , Vokonas P , Lissowska J , Fustinoni S , Pesatori AC , Bonzini M , Apostoli P , Costa G , Bertazzi PA , Chow WH , Schwartz J , Baccarelli A . Predictors of global methylation levels in blood DNA of healthy subjects: a combined analysis . Int J Epidemiol . 2012 ; 41 ( 1 ): 126 – 139 .

Smith RP , Coward RM , Kovac JR , Lipshultz LI . The evidence for seasonal variations of testosterone in men . Maturitas . 2013 ; 74 ( 3 ): 208 – 212 .

Winston AP , Wijeratne S . Hypogonadism, hypoleptinaemia and osteoporosis in males with eating disorders . Clin Endocrinol (Oxf) . 2009 ; 71 ( 6 ): 897 – 898 .

Masters JR , Drife JO , Scarisbrick JJ . Cyclic variation of DNA synthesis in human breast epithelium . J Natl Cancer Inst . 1977 ; 58 ( 5 ): 1263 – 1265 .

Anderson TJ . Mitotic activity in the breast . J Obstet Gynaecol . 1984 ; 4 ( Suppl 2 ): S114 – S118 .

Sulke AN , Jones DB , Wood PJ . Variation in natural killer activity in peripheral blood during the menstrual cycle . Br Med J (Clin Res Ed) . 1985 ; 290 ( 6472 ): 884 – 886 .

Ruiz-Pino F , Navarro VM , Bentsen AH , Garcia-Galiano D , Sanchez-Garrido MA , Ciofi P , Steiner RA , Mikkelsen JD , Pinilla L , Tena-Sempere M . Neurokinin B and the control of the gonadotropic axis in the rat: developmental changes, sexual dimorphism, and regulation by gonadal steroids . Endocrinology . 2012 ; 153 ( 10 ): 4818 – 4829 .

de Vries GJ , Södersten P . Sex differences in the brain: the relation between structure and function . Horm Behav . 2009 ; 55 ( 5 ): 589 – 596 .

McCarthy MM , Pickett LA , VanRyzin JW , Kight KE . Surprising origins of sex differences in the brain . Horm Behav . 2015 ; 76 : 3 – 10 .

Semaan SJ , Tolson KP , Kauffman AS . The development of kisspeptin circuits in the mammalian brain. In: Kauffman AS , Smith JT , eds. Kisspeptin Signaling in Reproductive Biology . New York, NY : Springer ; 2013 : 221 – 252 .

Holsen LM , Lancaster K , Klibanski A , Whitfield-Gabrieli S , Cherkerzian S , Buka S , Goldstein JM . HPA-axis hormone modulation of stress response circuitry activity in women with remitted major depression . Neuroscience . 2013 ; 250 : 733 – 742 .

Jacobs EG , Weiss BK , Makris N , Whitfield-Gabrieli S , Buka SL , Klibanski A , Goldstein JM . Impact of sex and menopausal status on episodic memory circuitry in early midlife . J Neurosci . 2016 ; 36 ( 39 ): 10163 – 10173 .

Heldring N , Pike A , Andersson S , Matthews J , Cheng G , Hartman J , Tujague M , Ström A , Treuter E , Warner M , Gustafsson JA . Estrogen receptors: how do they signal and what are their targets . Physiol Rev . 2007 ; 87 ( 3 ): 905 – 931 .

Barton M , Filardo EJ , Lolait SJ , Thomas P , Maggiolini M , Prossnitz ER . Twenty years of the G protein-coupled estrogen receptor GPER: historical and personal perspectives . J Steroid Biochem Mol Biol . 2018 ; 176 : 4 – 15 .

Cosman F , Lindsay R . Selective estrogen receptor modulators: clinical spectrum . Endocr Rev . 1999 ; 20 ( 3 ): 418 – 434 .

Jacobsen BM , Horwitz KB . Progesterone receptors, their isoforms and progesterone regulated transcription . Mol Cell Endocrinol . 2012 ; 357 ( 1-2 ): 18 – 29 .

Marks LS . 5α-Reductase: history and clinical importance . Rev Urol . 2004 ; 6 ( Suppl 9 ): S11 – S21 .

Looker AC , Johnson CL , Lacher DA , Pfeiffer CM , Schleicher RL , Sempos CT . Vitamin D status: United States, 2001–2006 . NCHS Data Brief . 2011 ; ( 59 ): 1 – 8 .

Tolin DF , Foa EB . Sex differences in trauma and posttraumatic stress disorder: a quantitative review of 25 years of research . Psychol Bull . 2006 ; 132 ( 6 ): 959 – 992 .

Breiding MJ , Smith SG , Basile KC , Walters ML , Chen J , Merrick MT . Prevalence and characteristics of sexual violence, stalking, and intimate partner violence victimization—national intimate partner and sexual violence survey, United States, 2011 . MMWR Surveill Summ . 2014 ; 63 ( 8 ): 1 – 18 .

Black MC , Basile KC , Breiding MJ , Smith SG, Walters ML, Merrick MT, Chen J, Stevens MR. The National Intimate Partner and Sexual Violence Survey (NISVS): 2010 Summary Report . Atlanta, GA : National Center for Injury Prevention and Control, Centers for Disease Control and Prevention ; 2011 .

Afifi TO , MacMillan HL , Boyle M , Taillieu T , Cheung K , Sareen J . Child abuse and mental disorders in Canada . CMAJ. 2014 ; 186 ( 9 ): E324 – E332 .

World Health Organization . Global and regional estimates of violence against women: prevalence and health effects of intimate partner violence and non-partner sexual violence . Geneva, Switzerland : World Health Organization ; 2013 .

Diaz Brinton R . Minireview: translational animal models of human menopause: challenges and emerging opportunities . Endocrinology . 2012 ; 153 ( 8 ): 3571 – 3578 .

Uhlenhaut NH , Treier M . Foxl2 function in ovarian development . Mol Genet Metab . 2006 ; 88 ( 3 ): 225 – 234 .

Drew BG , Hamidi H , Zhou Z , Villanueva CJ , Krum SA , Calkin AC , Parks BW , Ribas V , Kalajian NY , Phun J , Daraei P , Christofk HR , Hewitt SC , Korach KS , Tontonoz P , Lusis AJ , Slamon DJ , Hurvitz SA , Hevener AL . Estrogen receptor (ER)α-regulated lipocalin 2 expression in adipose tissue links obesity with breast cancer progression . J Biol Chem . 2015 ; 290 ( 9 ): 5566 – 5581 .

McDevitt MA , Glidewell-Kenney C , Jimenez MA , Ahearn PC , Weiss J , Jameson JL , Levine JE . New insights into the classical and non-classical actions of estrogen: evidence from estrogen receptor knock-out and knock-in mice . Mol Cell Endocrinol . 2008 ; 290 ( 1–2 ): 24 – 30 .

Park CJ , Zhao Z , Glidewell-Kenney C , Lazic M , Chambon P , Krust A , Weiss J , Clegg DJ , Dunaif A , Jameson JL , Levine JE . Genetic rescue of nonclassical ERα signaling normalizes energy balance in obese Erα-null mutant mice . J Clin Invest . 2011 ; 121 ( 2 ): 604 – 612 .

Itoh Y , Mackie R , Kampf K , Domadia S , Brown JD , O’Neill R , Arnold AP . Four core genotypes mouse model: localization of the Sry transgene and bioassay for testicular hormone levels . BMC Res Notes . 2015 ; 8 ( 1 ): 69 .

Chen X , McClusky R , Chen J , Beaven SW , Tontonoz P , Arnold AP , Reue K . The number of X chromosomes causes sex differences in adiposity in mice . PLoS Genet . 2012 ; 8 ( 5 ): e1002709 .

Li J , Chen X , McClusky R , Ruiz-Sundstrom M , Itoh Y , Umar S , Arnold AP , Eghbali M . The number of X chromosomes influences protection from cardiac ischaemia/reperfusion injury in mice: one X is better than two . Cardiovasc Res . 2014 ; 102 ( 3 ): 375 – 384 .

Shah K , McCormack CE , Bradbury NA . Do you know the sex of your cells ? Am J Physiol Cell Physiol . 2014 ; 306 ( 1 ): C3 – C18 .

Kernan WN , Viscoli CM , Makuch RW , Brass LM , Horwitz RI . Stratified randomization for clinical trials . J Clin Epidemiol . 1999 ; 52 ( 1 ): 19 – 26 .

Peters SA , Huxley RR , Woodward M . Diabetes as risk factor for incident coronary heart disease in women compared with men: a systematic review and meta-analysis of 64 cohorts including 858,507 individuals and 28,203 coronary events . Diabetologia . 2014 ; 57 ( 8 ): 1542 – 1551 .

Huxley RR , Peters SA , Mishra GD , Woodward M . Risk of all-cause mortality and vascular events in women versus men with type 1 diabetes: a systematic review and meta-analysis . Lancet Diabetes Endocrinol . 2015 ; 3 ( 3 ): 198 – 206 .

Garcia-Closas M, Lubin JH. POWER V3.0 software. Available at: dceg.cancer.gov/tools/design/power . Accessed 5 April 2018.

VanderWeele T. Tools and tutorials. Available at: www.hsph.harvard.edu/tyler-vanderweele/tools-and-tutorials/ . Accessed 5 April 2018.

Brookes ST , Whitely E , Egger M , Smith GD , Mulheran PA , Peters TJ . Subgroup analyses in randomized trials: risks of subgroup-specific analyses; power and sample size for the interaction test . J Clin Epidemiol . 2004 ; 57 ( 3 ): 229 – 236 .

Peto R . Statistical aspects of cancer trials. In: Halnan KE , ed. Treatment of Cancer . London, UK : Chapman and Hall ; 1982 : 867 – 871 .

Ioannidis JP . Why most published research findings are false . PLoS Med . 2005 ; 2 ( 8 ): e124 .

MacLaughlin DT , Donahoe PK . Sex determination and differentiation . N Engl J Med . 2004 ; 350 ( 4 ): 367 – 378 .

Hankivsky O , Doyal L , Einstein G , Kelly U , Shim J , Weber L , Repta A . The odd couple: using biomedical and intersectional approaches to address health inequities . Global Health Action . 2017 ; 10 ( Suppl 2 ): 1326686 .

Westbrook L , Saperstein A . New categories are not enough: rethinking the measurement of sex and gender in social surveys . Gend Soc . 2015 ; 29 ( 4 ): 534 – 560 .

Gender Spectrum. Understanding gender. Available at: www.genderspectrum.org/quick-links/understanding-gender/ . Accessed 4 April 2018.

Gender Identity in U.S. Surveillance (GenIUSS) Group. Best Practices for Asking Questions to Identify Transgender and Other Gender Minority Respondents on Population-Based Surveys. Herman JL, ed. Los Angeles, CA: The Williams Institute, 2014 .

Phillips SP . Defining and measuring gender: a social determinant of health whose time has come . Int J Equity Health . 2005 ; 4 ( 1 ): 11 .

Pelletier R , Ditto B , Pilote L . A composite measure of gender and its association with risk factors in patients with premature acute coronary syndrome . Psychosom Med . 2015 ; 77 ( 5 ): 517 – 526 .

Tamambang L , Auger N , Lo E , Raynault M-F . Measurement of gender inequality in neighbourhoods of Québec, Canada . Int J Equity Health . 2011 ; 10 ( 1 ): 52 .

Shors TJ, Tobón K, DiFeo G, Durham DM, Chang HYM. Sexual conspecific aggressive response (SCAR): a model of sexual trauma that disrupts maternal learning and plasticity in the female brain. Sci Rep. 2016;6(1):18960.

Blanco L , Nydegger LA , Camarillo G , Trinidad DR , Schramm E , Ames SL . Neurological changes in brain structure and functions among individuals with a history of childhood sexual abuse: a review . Neurosci Biobehav Rev . 2015 ; 57 : 63 – 69 .

Heim C , Nemeroff CB . The role of childhood trauma in the neurobiology of mood and anxiety disorders: preclinical and clinical studies . Biol Psychiatry . 2001 ; 49 ( 12 ): 1023 – 1039 .

Tepe V , Yarnell A , Nindl BC , Van Arsdale S , Deuster PA . Women in combat: summary of findings and a way ahead . Mil Med . 2016 ; 181 ( Suppl 1 ): 109 – 118 .

Cross JD, Johnson AE, Wenke JC, Bosse MJ, Ficke JR. Mortality in female war veterans of Operations Enduring Freedom and Iraqi Freedom. Clin Orthop Relat Res. 2011;469(7):1956–1961.

Chokshi NP , Iqbal SN , Berger RL , Hochman JS , Feit F , Slater JN , Pena-Sing I , Yatskar L , Keller NM , Babaev A , Attubato MJ , Reynolds HR . Sex and race are associated with the absence of epicardial coronary artery obstructive disease at angiography in patients with acute coronary syndromes . Clin Cardiol . 2010 ; 33 ( 8 ): 495 – 501 .

Sharaf BL , Pepine CJ , Kerensky RA , Reis SE , Reichek N , Rogers WJ , Sopko G , Kelsey SF , Holubkov R , Olson M , Miele NJ , Williams DO , Merz CN ; WISE Study Group. Detailed angiographic analysis of women with suspected ischemic chest pain (pilot phase data from the NHLBI-sponsored Women’s Ischemia Syndrome Evaluation [WISE] Study Angiographic Core Laboratory) . Am J Cardiol . 2001 ; 87 ( 8 ): 937 – 941 .

Reis SE , Holubkov R , Conrad Smith AJ , Kelsey SF , Sharaf BL , Reichek N , Rogers WJ , Merz CN , Sopko G , Pepine CJ ; WISE Investigators . Coronary microvascular dysfunction is highly prevalent in women with chest pain in the absence of coronary artery disease: results from the NHLBI WISE study . Am Heart J . 2001 ; 141 ( 5 ): 735 – 741 .

Sanghavi M , Gulati M . Sex differences in the pathophysiology, treatment, and outcomes in IHD . Curr Atheroscler Rep . 2015 ; 17 ( 6 ): 34 .

Ciccarelli A , Daly AF , Beckers A . The epidemiology of prolactinomas . Pituitary . 2005 ; 8 ( 1 ): 3 – 6 .

Klibanski A . Prolactinomas . N Engl J Med . 2010 ; 362 ( 13 ): 1219 – 1226 .

Recouvreux MV , Faraoni EY , Camilletti MA , Ratner L , Abeledo-Machado A , Rulli SB , Díaz-Torga G . Sex differences in the pituitary TGFβ1 system: the role of TGFβ1 in prolactinoma development . Front Neuroendocrinol . 2017 ; S0091-3022(17)30063-8 .

Greenblatt DJ , Harmatz JS , Singh NN , Steinberg F , Roth T , Moline ML , Harris SC , Kapil RP . Gender differences in pharmacokinetics and pharmacodynamics of zolpidem following sublingual administration . J Clin Pharmacol . 2014 ; 54 ( 3 ): 282 – 290 .

Cubała WJ , Wiglusz M , Burkiewicz A , Gałuszko-Wegielnik M . Zolpidem pharmacokinetics and pharmacodynamics in metabolic interactions involving CYP3A: sex as a differentiating factor . Eur J Clin Pharmacol . 2010 ; 66 ( 9 ): 955 .

Olubodun JO , Ochs HR , von Moltke LL , Roubenoff R , Hesse LM , Harmatz JS , Shader RI , Greenblatt DJ . Pharmacokinetic properties of zolpidem in elderly and young adults: possible modulation by testosterone in men . Br J Clin Pharmacol . 2003 ; 56 ( 3 ): 297 – 304 .

U.S. Food and Drug Administration. Questions and answers: risk of next-morning impairment after use of insomnia drugs; FDA requires lower recommended doses for certain drugs containing zolpidem (Ambien, Ambien CR, Edluar, and Zolpimist). Available at: www.fda.gov/Drugs/DrugSafety/ucm334041.htm#q6 . Accessed 17 April 2017.

U.S. Food and Drug Administration. FDA drug safety communication. FDA cautions about using testosterone products for low testosterone due to aging; requires labeling change to inform of possible increased risk of heart attack and stroke with use. Available at: www.fda.gov/Drugs/DrugSafety/ucm436259.htm . Accessed 17 April 2017.

Pope HG Jr , Kanayama G , Athey A , Ryan E , Hudson JI , Baggish A . The lifetime prevalence of anabolic-androgenic steroid use and dependence in Americans: current best estimates . Am J Addict . 2014 ; 23 ( 4 ): 371 – 377 .

Fish EN . The X-files in immunity: sex-based differences predispose immune responses . Nat Rev Immunol . 2008 ; 8 ( 9 ): 737 – 744 .

Power ML , Schulkin J . Sex differences in fat storage, fat metabolism, and the health risks from obesity: possible evolutionary origins . Br J Nutr . 2008 ; 99 ( 5 ): 931 – 940 .

Mendelsohn ME , Karas RH . Molecular and cellular basis of cardiovascular gender differences . Science . 2005 ; 308 ( 5728 ): 1583 – 1587 .

Cook DM , Ludlam WH , Cook MB . Route of estrogen administration helps to determine growth hormone (GH) replacement dose in GH-deficient adults . J Clin Endocrinol Metab . 1999 ; 84 ( 11 ): 3956 – 3960 .

Molitch ME , Clemmons DR , Malozowski S , Merriam GR , Vance ML ; Endocrine Society . Evaluation and treatment of adult growth hormone deficiency: an Endocrine Society clinical practice guideline . J Clin Endocrinol Metab . 2011 ; 96 ( 6 ): 1587 – 1609 .

Silva MJ , Barr DB , Reidy JA , Malek NA , Hodge CC , Caudill SP , Brock JW , Needham LL , Calafat AM . Urinary levels of seven phthalate metabolites in the U.S. population from the National Health and Nutrition Examination Survey (NHANES) 1999–2000 . Environ Health Perspect . 2004 ; 112 ( 3 ): 331 – 338 .

Huang T , Saxena AR , Isganaitis E , James-Todd T . Gender and racial/ethnic differences in the associations of urinary phthalate metabolites with markers of diabetes risk: National Health and Nutrition Examination Survey 2001–2008 . Environ Health . 2014 ; 13 ( 1 ): 6 .

Rengarajan S , Parthasarathy C , Anitha M , Balasubramanian K . Diethylhexyl phthalate impairs insulin binding and glucose oxidation in Chang liver cells . Toxicol In Vitro . 2007 ; 21 ( 1 ): 99 – 102 .

Gayathri NS , Dhanya CR , Indu AR , Kurup PA . Changes in some hormones by low doses of di (2-ethyl hexyl) phthalate (DEHP), a commonly used plasticizer in PVC blood storage bags & medical tubing . Indian J Med Res . 2004 ; 119 ( 4 ): 139 – 144 .

Stahlhut RW , van Wijngaarden E , Dye TD , Cook S , Swan SH . Concentrations of urinary phthalate metabolites are associated with increased waist circumference and insulin resistance in adult U.S. males . Environ Health Perspect . 2007 ; 115 ( 6 ): 876 – 882 .

James-Todd T , Stahlhut R , Meeker JD , Powell SG , Hauser R , Huang T , Rich-Edwards J . Urinary phthalate metabolite concentrations and diabetes among women in the National Health and Nutrition Examination Survey (NHANES) 2001–2008 . Environ Health Perspect . 2012 ; 120 ( 9 ): 1307 – 1313 .

Ramsey JM , Cooper JD , Penninx BW , Bahn S . Variation in serum biomarkers with sex and female hormonal status: implications for clinical tests . Sci Rep . 2016 ; 6 ( 1 ): 26947 .

Gangestad SW, Haselton MG, Welling LLM, et al. How valid are assessments of conception probability in ovulatory cycle research? Evaluations, recommendations, and theoretical implications. Evol Hum Behav. 2016;37(2):85–96.

Harlow SD , Gass M , Hall JE , Lobo R , Maki P , Rebar RW , Sherman S , Sluss PM , de Villiers TJ ; STRAW + 10 Collaborative Group . Executive summary of the Stages of Reproductive Aging Workshop + 10: addressing the unfinished agenda of staging reproductive aging . J Clin Endocrinol Metab . 2012 ; 97 ( 4 ): 1159 – 1168 .

Shifren JL , Schiff I . The aging ovary . J Womens Health Gend Based Med . 2000 ; 9 ( Suppl 1 ): S3 – S7 .

Rich-Edwards JW, Fraser A, Lawlor DA, Catov JM. Pregnancy characteristics and women’s future cardiovascular health: an underused opportunity to improve women’s health? Epidemiol Rev. 2014;36(1):57–70.

Bellamy L , Casas J-P , Hingorani AD , Williams DJ . Pre-eclampsia and risk of cardiovascular disease and cancer in later life: systematic review and meta-analysis . BMJ . 2007 ; 335 ( 7627 ): 974 .

Holvik K , Meyer HE , Haug E , Brunvand L . Prevalence and predictors of vitamin D deficiency in five immigrant groups living in Oslo, Norway: the Oslo Immigrant Health Study . Eur J Clin Nutr . 2005 ; 59 ( 1 ): 57 – 63 .

Calvo MS , Whiting SJ , Barton CN . Vitamin D intake: a global perspective of current status . J Nutr . 2005 ; 135 ( 2 ): 310 – 316 .

Hilger J , Friedel A , Herr R , Rausch T , Roos F , Wahl DA , Pierroz DD , Weber P , Hoffmann K . A systematic review of vitamin D status in populations worldwide . Br J Nutr . 2014 ; 111 ( 1 ): 23 – 45 .

van Dam RM , Snijder MB , Dekker JM , Stehouwer CD , Bouter LM , Heine RJ , Lips P . Potentially modifiable determinants of vitamin D status in an older population in the Netherlands: the Hoorn Study . Am J Clin Nutr . 2007 ; 85 ( 3 ): 755 – 761 .

Kautzky-Willer A , Harreiter J . Sex and gender differences in therapy of type 2 diabetes . Diabetes Res Clin Pract . 2017 ; 131 : 230 – 241 .

Wallach JD , Sullivan PG , Trepanowski JF , Steyerberg EW , Ioannidis JP . Sex based subgroup differences in randomized controlled trials: empirical evidence from Cochrane meta-analyses . BMJ . 2016 ; 355 : i5826 .

Miller VM , Tannenbaum C , Regitz-Zagrosek V . Sex based subgroup differences in randomized controlled trials: empirical evidence from Cochrane meta-analyses; response to authors comment . BMJ . 2016 ; 355 : i5826 .

Alosh M, Fritsch K, Huque M, et al. Statistical considerations on subgroup analysis in clinical trials. Stat Biopharm Res. 2015;7(4):286–303.

Rothman KJ, Greenland S, Lash TL. Modern Epidemiology. 3rd ed. Philadelphia, PA: Lippincott Williams & Wilkins; 2008.

Austin PC . An introduction to propensity score methods for reducing the effects of confounding in observational studies . Multivariate Behav Res . 2011 ; 46 ( 3 ): 399 – 424 .

Fewell Z , Davey Smith G , Sterne JA . The impact of residual and unmeasured confounding in epidemiologic studies: a simulation study . Am J Epidemiol . 2007 ; 166 ( 6 ): 646 – 655 .

Ding P , VanderWeele TJ . Sensitivity analysis without assumptions . Epidemiology . 2016 ; 27 ( 3 ): 368 – 377 .

Shi H , Seeley RJ , Clegg DJ . Sexual differences in the control of energy homeostasis . Front Neuroendocrinol . 2009 ; 30 ( 3 ): 396 – 404 .

Rosenfeld CS . Sex-dependent differences in voluntary physical activity . J Neurosci Res . 2017 ; 95 ( 1-2 ): 279 – 290 .

Valeri L , Vanderweele TJ . Mediation analysis allowing for exposure–mediator interactions and causal interpretation: theoretical assumptions and implementation with SAS and SPSS macros . Psychol Methods . 2013 ; 18 ( 2 ): 137 – 150 .

Richiardi L , Bellocco R , Zugna D . Mediation analysis in epidemiology: methods, interpretation and bias . Int J Epidemiol . 2013 ; 42 ( 5 ): 1511 – 1519 .

Buyse ME. Analysis of clinical trial outcomes: some comments on subgroup analyses. Control Clin Trials. 1989;10(4 Suppl):187S–194S.

Assmann SF , Pocock SJ , Enos LE , Kasten LE . Subgroup analysis and other (mis)uses of baseline data in clinical trials . Lancet . 2000 ; 355 ( 9209 ): 1064 – 1069 .

Frasure-Smith N , Lespérance F , Prince RH , Verrier P , Garber RA , Juneau M , Wolfson C , Bourassa MG . Randomised trial of home-based psychosocial nursing intervention for patients recovering from myocardial infarction . Lancet . 1997 ; 350 ( 9076 ): 473 – 479 .

  • Online ISSN 1945-7189
  • Print ISSN 0163-769X
  • Copyright © 2024 Endocrine Society
  • Perspective
  • Published: 24 September 2018

Making gender diversity work for scientific discovery and innovation

  • Mathias Wullum Nielsen 1,
  • Carter Walter Bloch (ORCID: orcid.org/0000-0003-4718-003X) 1 &
  • Londa Schiebinger 2

Nature Human Behaviour, volume 2, pages 726–734 (2018)

  • Decision making
  • Interdisciplinary studies

Gender diversity has the potential to drive scientific discovery and innovation. Here, we distinguish three approaches to gender diversity: diversity in research teams, diversity in research methods and diversity in research questions. While gender diversity is commonly understood to refer only to the gender composition of research teams, fully realizing the potential of diversity for science and innovation also requires attention to the methods employed and questions raised in scientific knowledge-making. We provide a framework for understanding the best ways to support the three approaches to gender diversity across four interdependent domains — from research teams to the broader disciplines in which they are embedded to research organizations and ultimately to the different societies that shape them through specific gender norms and policies. Our analysis demonstrates that realizing the benefits of diversity for science requires careful management of these four interdependent domains.

Communication from the Commission to the European Parliament, the Council and the European Economic and Social Committee and the Committee of the Regions: A Reinforced European Research Area Partnership for Excellence and Growth (European Commission, 2012).

Statement of Principles and Actions Promoting the Equality and Status of Women in Research (Global Research Council, 2016); https://www.globalresearchcouncil.org/fileadmin//documents/GRC_Publications/Statement_of_Principles_and_Actions_Promoting_the_Equality_and_Status_of_Women_in_Research.pdf

Huyer, S. in UNESCO Science Report: Towards 2030 (ed. Schneegans, S.) 84–103 (UNESCO Publishing, Paris, 2015); http://unesdoc.unesco.org/images/0023/002354/235406e.pdf

Maes, K., Gvozdanovic, J., Buitendijk, S., Hallberg, I. R. & Mantilleri, B. Women, Research and Universities: Excellence Without Gender Bias (League of European Research Universities, Leuven, 2012).

Diversity in science. The Royal Society https://royalsociety.org/topics-policy/diversity-in-science/topic/ (2017).

Valantine, H. A. & Collins, F. S. National Institutes of Health addresses the science of diversity. Proc. Natl Acad. Sci. USA 112 , 12240–12242 (2015).

Hong, L. & Page, S. E. Groups of diverse problem solvers can outperform groups of high-ability problem solvers. Proc. Natl Acad. Sci. USA 101 , 16385–16389 (2004).

Page, S. E. The Difference: How the Power of Diversity Creates Better Groups, Firms, Schools, and Societies (Princeton Univ. Press, Princeton, 2008).

Phillips, K. W. How diversity works. Sci. Am. 311 , 42–47 (2014).

Nishii, L. H. The benefits of climate for inclusion for gender-diverse groups. Acad. Manage. J. 56 , 1754–1774 (2013).

Bear, J. B. & Woolley, A. W. The role of gender in team collaboration and performance. Interdiscip. Sci. Rev. 36 , 146–153 (2011).

Joshi, A., Liao, H. & Roh, H. Bridging domains in workplace demography research: a review and reconceptualization. J. Manage. 37 , 521–552 (2011).

Van Dijk, H., van Engen, M. L. & van Knippenberg, D. Defying conventional wisdom: a meta-analytical examination of the differences between demographic and job related diversity relationships with performance. Organ. Behav. Hum. Decis. Process 119 , 38–53 (2012).

Díaz-García, C., González-Moreno, A. & Jose Sáez-Martínez, F. Gender diversity within R&D teams: its impact on radicalness of innovation. Innovation 15 , 149–160 (2013).

Faems, D. & Subramanian, A. M. R&D manpower and technological performance: the impact of demographic and task-related diversity. Res. Policy 42 , 1624–1633 (2013).

Fernández, J. The impact of gender diversity in foreign subsidiaries’ innovation outputs. Int. J. Gend Entrep. 7 , 148–167 (2015).

Østergaard, C. R., Timmermans, B. & Kristinsson, K. Does a different view create something new? The effect of employee diversity on innovation. Res. Policy 40 , 500–509 (2011).

Sastre, J. F. The impact of R&D teams’ gender diversity on innovation outputs. Int. J. Entrep. Small Bus. 24 , 142–162 (2015).

Turner, L. Gender diversity and innovative performance. Int. J. Innov. Sustain. Dev. 4 , 123–134 (2009).

Campbell, L. G., Mehtani, S., Dozier, M. E. & Rinehart, J. Gender-heterogeneous working groups produce higher quality science. PLoS ONE 8 , e79147 (2013).

Joshi, A. By whom and when is women’s expertise recognized? The interactive effects of gender and education in science and engineering teams. Adm. Sci. Q. 25 , 202–239 (2014).

Lungeanu, A. & Contractor, N. S. The effects of diversity and network ties on innovations: the emergence of a new scientific field. Am. Behav. Sci. 59 , 548–564 (2015).

Saá-Pérez, D., Díaz-Díaz, N. L., Aguiar-Díaz, I. & Ballesteros-Rodríguez, J. L. How diversity contributes to academic research teams performance. R&D Manage. 47 , 165–179 (2015).

Stvilia, B. et al. Composition of scientific teams and publication productivity at a national science lab. J. Assoc. Inf. Sci. Technol. 62 , 270–283 (2011).

For a Better Integration of the Gender Dimension in Horizon 2020 Work Programme 2016–2017 (European Commission, 2015); http://ec.europa.eu/transparency/regexpert/index.cfm?do=groupDetail.groupDetailDoc&id=18892&no=1

Buitendijk, S. & Maes, K. Gendered Research and Innovation: Integrating Sex and Gender Analysis into the Research Process (League of European Research Universities, Leuven, 2015).

Schiebinger, L. & Klinge, I. Gendered Innovations: How Gender Analysis Contributes to Research (Publications Office of the European Union, Luxembourg, 2013).

Sánchez de Madariaga, I., de Gregorio Hurtado, S. (eds). Advancing Gender in Research, Innovation and Sustainable Development (Fundación General de la Universidad Politécnica de Madrid, Madrid, 2016).

Bührer, S. & Schraudner, M. Gender-Aspekte in der Forschung: Wie können Gender-Aspekte in Forschungsvorhaben erkannt und bewertet werden? (Fraunhofer IRB Verlag, Stuttgart, 2006).

Adler, R. A. Osteoporosis in men: a review. Bone Res. 2 , 14001 (2014).

Schiebinger, L. et al. Sex and gender analysis policies of major granting agencies. Gendered Innovations http://genderedinnovations.stanford.edu/sex-and-gender-analysis-policies-major-granting-agencies.html (2018).

Johnson, J., Sharman, Z., Vissandjee, B. & Stewart, D. E. Does a change in health research funding policy related to the integration of sex and gender have an impact? PLoS ONE 9 , e99900 (2014).

De Cheveigné, S. & Knoll, B. Interim Evaluation: Gender Equality as a Crosscutting Issue in Horizon 2020 (Publications Office of the European Union, Luxembourg, 2017).

European Commission She Figures 2015 (Publications Office of the European Union, Luxembourg, 2016).

US General Accounting Office. Drug Safety: Most Drugs Withdrawn in Recent Years had Greater Health Risks for Women (Government Publishing Office, Washington DC, 2001).

Roth, J. et al. Economic return from the women’s health initiative estrogen plus progestin clinical trial: a modeling study. Ann. Intern. Med. 160 , 594–602 (2014).

Ovseiko, P. V. et al. A global call for action to include gender in research impact assessment. Health Res. Policy Syst. 14 , 50 (2016).

Nielsen, M. W. et al. Opinion: gender diversity leads to better science. Proc. Natl Acad. Sci. USA 114 , 1740–1742 (2017).

Dolado, J. J., Felgueroso, F. & Almunia, M. Are men and women-economists evenly distributed across research fields? Some new empirical evidence. SERIEs 3 , 367–393 (2012).

Light, R. in Networks, Work, and Inequality (ed. Mcdonald, S.) 239–268 (Research in the Sociology of Work Vol. 24, Emerald Group Publishing, Bingley, 2013).

Mapping Gender in the German Research Area (Elsevier, 2015); https://www.elsevier.com/__data/assets/pdf_file/0004/126715/ELS_Germany_Gender_Research-SinglePages.pdf

Maliniak, D., Powers, R. & Walter, B. F. The gender citation gap in international relations. Int. Organ. 67 , 889–922 (2013).

West, J. D., Jacquet, J., King, M. M., Correll, S. J. & Bergstrom, C. T. The role of gender in scholarly authorship. PLoS ONE 8 , e66212 (2013).

Nonnemaker, L. Women physicians in academic medicine — new insights from cohort studies. N. Engl. J. Med. 342 , 399–405 (2000).

Rosser, S. V. An overview of women’s health in the US since the mid-1960s. Hist. Technol. 18 , 355–369 (2002).

Schiebinger, L. Has Feminism Changed Science? (Harvard Univ. Press, Cambridge, 1999).

Fedigan, L. M. Primate Paradigms: Sex Roles and Social Bonds (Univ. Chicago Press, Chicago, 1992).

Schiebinger, L. (ed.) Gendered Innovations in Science and Engineering (Stanford Univ. Press, Stanford, 2008).

Nielsen, M. W., Andersen, J. P., Schiebinger, L. & Schneider, J. W. One and a half million medical papers reveal a link between author gender and attention to gender and sex analysis. Nat. Hum. Behav. 1 , 791–796 (2017).

Herring, C. Does diversity pay? Race, gender, and the business case for diversity. Am. Sociol. Rev. 74 , 208–224 (2009).

Mannix, E. & Neale, M. A. What differences make a difference? The promise and reality of diverse teams in organizations. Psychol. Sci. Public Interest 6 , 31–55 (2005).

Shore, L. M. et al. Diversity in organizations: Where are we now and where are we going? Hum. Resour. Manage. Rev. 19 , 117–133 (2009).

Williams, K. Y. & O’Reilly, C. A. in Research in Organizational Behavior (eds Staw, B. M. & Cummings, L. L.) 77–140 (JAI Press, Greenwich, 1998).

Homan, A. C., Van Knippenberg, D., Van Kleef, G. A. & De Dreu, C. K. Bridging faultlines by valuing diversity: diversity beliefs, information elaboration, and performance in diverse work groups. J. Appl. Psychol. 92 , 1189–1199 (2007).

Lauring, J. & Villesèche, F. The performance of gender diverse teams: What is the relation between diversity attitudes and degree of diversity? Eur. Manage. Rev. https://doi.org/10.1111/emre.12164 (2017).

Van Knippenberg, D., Haslam, S. A. & Platow, M. J. Unity through diversity: value-in-diversity beliefs, work group diversity, and group identification. Group Dyn. 11 , 207–222 (2007).

Hobman, E. V., Bordia, P. & Gallois, C. Perceived dissimilarity and work group involvement: the moderating effects of group openness to diversity. Group Organ. Manage. 29 , 560–587 (2004).

Joshi, A. & Knight, A. P. Who defers to whom and why? Dual pathways linking demographic differences and dyadic deference to team effectiveness. Acad. Manag. J. 58 , 59–84 (2015).

Hobman, E. V. & Bordia, P. The role of team identification in the dissimilarity-conflict relationship. Group Process. Intergroup Relat. 9 , 483–507 (2006).

Mohammed, S. & Angell, L. C. Surface- and deep-level diversity in workgroups: examining the moderating effects of team orientation and team process on relationship conflict. J. Organ. Behav. 25 , 1015–1039 (2004).

Homan, A. C. et al. Facing differences with an open mind: openness to experience, salience of intragroup differences, and performance of diverse work groups. Acad. Manage. J. 51 , 1204–1222 (2008).

Roos, P. A. & Reskin, B. F. Occupational desegregation in the 1970s: Integration and economic equity? Sociol. Perspect. 35 , 69–91 (1992).

Lautenberger, D. M., Dandar, V. M. & Raezer, C. L. The State of Women in Academic Medicine: The Pipeline and Pathways to Leadership 2013–2014 (Association of American Medical Colleges, Washington DC, 2014).

Becher, T. The significance of disciplinary differences. Stud. Higher Educ. 19 , 151–161 (1994).

Zou, J. & Schiebinger, L. AI can be sexist and racist — it’s time to make it fair. Nature 559 , 324–326 (2018).

Felt, U. & Stöckelová, T. in Knowing and Living in Academic Research (ed. Felt, U.) 41–124 (Institute of Sociology of the Academy of Sciences of the Czech Republic, Prague, 2009).

Lamont, M. How Professors Think (Harvard Univ. Press, Cambridge, 2009).

Whitley, R. The Intellectual and Social Organization of the Sciences (Oxford Univ. Press, Oxford, 2000).

Stewart, A. J., Malley, J. E. & LaVaque-Manty, D. Transforming Science and Engineering: Advancing Academic Women (Univ. Michigan Press, Ann Arbor, 2007).

Courses. Course 1: the science of sex and gender in human health. National Institutes of Health https://sexandgendercourse.od.nih.gov/content/courses (2018).

Sex and Gender in Biomedical Research (Canadian Institutes of Health, 2018); http://www.cihr-irsc-igh-isfh.ca/

Schiebinger, L. et al. Sex and gender analysis policies of peer-reviewed journals. Gendered Innovations https://genderedinnovations.stanford.edu/sex-and-gender-analysis-policies-peer-reviewed-journals.html (2018).

Author instructions: manuscript preparation Circulation Research https://www.ahajournals.org/res/manuscript-preparation (2018).

Miller, V. M. In pursuit of scientific excellence: sex matters. Am. J. Physiol. Cell Physiol. 302 , C1269–C1270 (2012).

Schiebinger, L., Leopold, S. S. & Miller, V. M. Editorial policies for sex and gender analysis. Lancet 388 , 2841–2842 (2016).

Recommendations for the Conduct, Reporting, Editing, and Publication of Scholarly Work in Medical Journals (International Committee of Medical Journal Editors, 2017); http://www.icmje.org/recommendations/archives/

Ludwig, S. et al. A successful strategy to integrate sex and gender medicine into a newly developed medical curriculum. J. Womens Health 24 , 996–1005 (2015).

Brooks, C., Fenton, E. M. & Walker, T. J. Gender and the evaluation of research. Res. Policy 43 , 990–1001 (2014).

Hunter, L. A. & Leahey, E. Parenting and research productivity: new evidence and methods. Soc. Stud. Sci. 40 , 433–451 (2010).

Nielsen, M. W. Gender consequences of a national performance-based funding model: new pieces in an old puzzle. Stud. Higher Educ. 42 , 1033–1055 (2017).

Schneid, M., Isidor, R., Li, C. & Kabst, R. The influence of cultural context on the relationship between gender diversity and team performance: a meta-analysis. Int. J. Hum. Resour. Manage. 26 , 733–756 (2015).

European Commission Seventh FP7 Monitoring Report 2013 (Publications Office of the European Union, 2015).

Commission Staff Working Document: Horizon 2020 Annual Monitoring Report 2015 (European Commission, 2016).

Fact Sheet: Gender Equality in Horizon 2020 (European Commission, 2013); https://ec.europa.eu/programmes/horizon2020/sites/horizon2020/files/FactSheet_Gender_2.pdf

Topics with a gender dimension. European Commission http://ec.europa.eu/research/participants/portal/desktop/en/opportunities/h2020/ftags/gender.html#c,topics=flags/s/Gender/1/1&+callDeadline/desc (2015).

Clayton, J. & Collins, F. NIH to balance sex in cell and animal studies. Nature 509 , 282–283 (2014).

Aksnes, D. et al. Centres of Excellence in the Nordic Countries (NIFU, Oslo, 2012).

Sandström, U., Wold, A., Jordansson, B., Ohlsson, B. & Smedberg, Å. Hans Excellens: om miljardsatsningarna på starka forskningsmiljöer (Delegationen för Jämställdhet i Högskolan, Stockholm, 2010).

Acknowledgements

We thank E. Steiner, Co-Director, Spatial History Project, Center for Spatial and Textual Analysis, Stanford University, for executing our graphics.

Author information

Authors and affiliations.

Danish Centre for Studies in Research and Research Policy, Department of Political Science, Aarhus University, Aarhus, Denmark

Mathias Wullum Nielsen & Carter Walter Bloch

History of Science, Stanford University, Stanford, CA, USA

Londa Schiebinger

Contributions

M.W.N., L.S. and C.W.B. conceptualized and wrote the paper. M.W.N. and L.S. carried out literature searches, and M.W.N. and C.W.B. prepared tables. L.S. and M.W.N. conceptualized Figs. 1 and 2.

Corresponding author

Correspondence to Mathias Wullum Nielsen.

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information: Supplementary Methods; Supplementary Tables 1–4

About this article

Cite this article.

Nielsen, M.W., Bloch, C.W. & Schiebinger, L. Making gender diversity work for scientific discovery and innovation. Nat Hum Behav 2, 726–734 (2018). https://doi.org/10.1038/s41562-018-0433-1

Received: 09 February 2018

Accepted: 18 July 2018

Published: 24 September 2018

Issue Date: October 2018

DOI: https://doi.org/10.1038/s41562-018-0433-1

Sex and attitudinal gender: A critical review and decomposition principle

Published: 23 September 2024

Leire Gartzia (ORCID: orcid.org/0000-0001-8837-3888)

Research on sex and gender effects often treats the two constructs interchangeably. This paper reviews the distinct conceptualization and usage of these constructs, with a focus on the multiple psychological components of gender. Extending social role theory's concern with how gender is internalized into individual psychology, it provides an organizing framework that points to attitudinal gender as an overarching psychological construct comprising gender affect, gender cognition and gender behaviour. A review of how sex and gender are used in individual differences research finds limited conceptual distinction and inconsistent operationalizations, with a shift, particularly in the social sciences, from sex to wide-ranging “gender differences” conceptualizations. We call for greater distinction between biological sex and attitudinal gender components in research reporting and design. A decomposition principle and a research diagram are provided to help implement theoretical questions and empirical tests more accurately. Researchers, reviewers and practitioners can follow these propositions to advance the analysis and assessment of sex and gender effects.

Ajzen, I. (2012). The theory of planned behavior. In  Handbook of theories of social psychology: Volume   1  (pp. 438–459). https://doi.org/10.4135/9781446249215.n22

American Psychological Association. (2023, January, 23). APA Style, Gender . https://apastyle.apa.org/style-grammar-guidelines/bias-free-language/gender

Anselmi, D. L., & Law, A. L. (1998). Questions of gender: Perspectives and paradoxes . McGraw-Hill.

Astle, S., Rivas-Koehl, D., Rivas-Koehl, M. et al. (2024). Timing the talks: Exploring children’s ages at parent-child conversations about gender, sexual orientation, and various sexual behaviors.  Sexuality Research and Social Policy , 21 , 738–758. https://doi.org/10.1007/s13178-023-00894-0

Bagger, J., Li, A., & Gutek, B. A. (2008). How much do you value your family and does it matter? The joint effects of family identity salience, family-interference- with-work, and gender. Human Relations, 61 (2), 187–211. https://doi.org/10.1177/0018726707087784

Article   Google Scholar  

Banchefsky, S., Westfall, J., Park, B., & Judd, C. M. (2016). But you don’t look like a scientist!: Women scientists with feminine appearance are deemed less likely to be scientists. Sex Roles , 75 (3), 95–109. https://doi.org/10.1007/s11199-016-0586-1

 Barbir, L. A., Vandevender, A. W., & Cohn, T. J. (2017). Friendship, attitudes, and behavioral intentions of cisgender heterosexuals toward transgender individuals. Journal of Gay & Lesbian Mental Health , 21 (2), 154–170. https://doi.org/10.1080/19359705.2016.1273157

Barron, L. A. (2003). Ask and you shall receive? Gender differences in negotiators ’ beliefs about requests for a higher salary. Human Relations, 56 (200306), 635–662.

Beauvoir, S. D. (1949). The second sex, woman as other . Libraries Gallimard.

Becker, J. B., McClellan, M. L., & Reed, B. G. (2017). Sex differences, gender and addiction. Journal of Neuroscience Research , 95 (1–2), 136–147. https://doi.org/10.1002/jnr.23963

Article   PubMed   PubMed Central   Google Scholar  

Beemyn, B. (2003). Serving the needs of transgender college students. Journal of Gay & Lesbian Issues in Education , 1 (1), 33–50. https://doi.org/10.1300/J367v01n01_03

Beere, C. A., King, D. W., Beere, D. B., & King, L. A. (1984). The sex-role egalitarianism scale: A measure of attitudes toward equality between the sexes. Sex Roles , 10 , 563–576. https://doi.org/10.1007/BF00287265

Bem, S. L. (1974). The measurement of psychological androgyny. Journal of Consulting and Clinical Psychology , 42 (2), 155. https://doi.org/10.1037/h0036215

Berzonsky, M. D. (2011). A social-cognitive perspective on identity construction. In S. J. Schwartz, K. Luyckx, & V. L. Vignoles (Eds.), Handbook of identity theory and research (pp. 55–76). Springer Science + Business Media. https://doi.org/10.1007/978-1-4419-7988-9_3

Breckler, S. J. (1984). Empirical validation of affect, behavior, and cognition as distinct components of attitude. Journal of Personality and Social Psychology , 47 (6), 1191. https://doi.org/10.1037/0022-3514.47.6.1191ç

Brewis, J., Hampton, M. P., & Linstead, S. (1997). Unpacking Priscilla: Subjectivity and identity in the organization of gendered appearance. Human Relations , 50 (10), 1275–1304. https://doi.org/10.1023/A:1016982423169

Bolzendahl, C. I., & Myers, D. J. (2004). Feminist attitudes and support for gender equality: Opinion change in women and men, 1974–1998. Social Forces , 83 (2), 759–789. https://doi.org/10.1353/sof.2005.0005

Bornstein, R. F. (1994). Construct validity of the interpersonal dependency inventory: 1977-1992. Journal of Personality Disorders , 8 (1), 64–76. https://doi.org/10.1521/pedi.1994.8.1.64

Bussey, K., & Bandura, A. (1999). Social cognitive theory of gender development and differentiation. Psychological Review , 106 (4), 676.

Butler, J. (1993).  Bodies that matter: On the discursive limits of “Sex” . Routledge.

Butler, J. (2001). El género en disputa: el feminismo y la subversión de la identidad. Género y Sociedad, 5 , 193. [1] h.

Google Scholar  

Campbell, D. W., & Eaton, W. O. (1999). Sex differences in the activity level of infants. Infant and Child Development, 8 (1), 1–17. https://doi.org/10.1002/(SICI)1522-7219(199903)8:1%3c1::AID-ICD186%3e3.0.CO;2-O

Carroll, L., Gilroy, P. J., & Ryan, J. (2002). Counseling transgendered, transsexual, and gender‐variant clients. Journal of Counseling & Development , 80 (2), 131–139. https://doi.org/10.1002/j.1556-6678.2002.tb00175.x

Cejka, M. A., & Eagly, A. H. (1999). Gender-stereotypic images of occupations correspond to the sex segregation of employment. Personality and Social Psychology Bulletin , 25 (4), 413–423. https://doi.org/10.1177/0146167299025004

Christopher, A. N., & Mull, M. S. (2006). Conservative ideology and ambivalent sexism. Psychology of Women Quarterly , 30 (2), 223–230. https://doi.org/10.1111/j.1471-6402.2006.00284.x

Cooper, A. J., Gupta, S. R., Moustafa, A. F., & Chao, A. M. (2021). Sex/gender differences in obesity prevalence, comorbidities, and treatment. Current Obesity Reports , 1–9. https://doi.org/10.1007/s13679-021-00453-x

Davis, S. N., & Greenstein, T. N. (2009). Gender ideology: Components, predictors, and consequences. Annual review of Sociology , 35 (1), 87–105. https://doi.org/10.1146/annurev-soc-070308-115920

Deaux, K. (1985). Sex and gender. Annual Review of Psychology , 36 , 49–81. https://doi.org/10.1146/annurev.ps.36.020185.000405

Deaux, K., & Lewis, L. L. (1984). Structure of gender stereotypes: Interrelationships among components and gender label. Journal of personality and Social Psychology , 46 (5), 991. https://doi.org/10.1037/0022-3514.46.5.991

DeCasien, A. R., Guma, E., Liu, S., & Raznahan, A. (2022). Sex differences in the human brain: A roadmap for more careful analysis and interpretation of a biological reality. Biology of Sex Differences , 13 (1), 1–21. https://doi.org/10.1186/s13293-022-00448-w

Drummond, S., Driscoll, M. P. O., Brough, P., Kalliath, T., Siu, O., Timms, C., & Lo, D. (2017). The relationship of social support with well-being outcomes via work– family conflict: Moderating effects of gender, dependants and nationality . Human Relations . https://doi.org/10.1177/0018726716662696

Eagly, A. H. (1987). Sex differences in social behavior: A social-role interpretation . Lawrence Erlbaum Associates, Inc.

Eagly, A. H., & Chaiken, S. S. (1993). The nature of attitudes. In A. H. Eagly, & S. S. Chaiken (Eds.). The psychology of attitudes (pp. 1–21).

Eagly, A. H., & Revelle, W. (2022). Understanding the magnitude of psychological differences between women and men requires seeing the forest and the Trees. Perspectives on Psychological Science , 17 (5), 1339–1358. https://doi.org/10.1177/17456916211046006

Eagly, A. H., & Wood, W. (2012). Social role theory. In Van P. A. M. Lange, A. W. Kruglanski, & E. T. Higgins (Eds.), Handbook of theories of social psychology (pp. 458–476). Sage Publications Ltd. https://doi.org/10.4135/9781446249222.n49

Chapter   Google Scholar  

Eagly, A. H., Eaton, A., Rose, S. M., Riger, S., & McHugh, M. C. (2012). Feminism and psychology; analysis of a half-century of research on women and gender. American Psychologist , 67 (3), 211–230. https://doi.org/10.1037/a0027260

Article   PubMed   Google Scholar  

Eagly, A. H., Gartzia, L., & Carli, L. L. (2014). Female advantage: Revisited. In S. Kumra, R. Simpson, & R. J. Burke (Eds.), The Oxford handbook of gender in organizations (pp. 153–174). Oxford University Press. 

Eckes, T., & Trautner, H. M. (2000). The developmental social psychology of gender . Lawrence Erlbaum Associates Publishers.

Ellemers, N. (2018). Gender stereotypes. Annual Review of Psychology , 69 (1), 275–298. https://doi.org/10.1146/annurev-psych-122216-011719

Ember, C. R., & Ember, M. (2002). Father absence and male agresion: A re-examination of the comparative evidence. Ethos , 29 (3), 296–314.

Fausto-Sterling, A. (2012). Sex/gender: Biology in a social world . Routledge.

Flores, A. R., Herman, J., Gates, G. J., & Brown, T. N. (2016). How many adults identify as transgender in the United States ? (vol. 13). Los Angeles, CA: Williams Institute.

Flotskaya, N., Bulanova, S., Ponomareva, M., Flotskiy, N., & Konopleva, T. (2018). Gender identity development among teenagers living in the subarctic region of Russia. Behavioral Sciences , 8 (10). https://doi.org/10.3390/bs8100090

Gartzia, L. (2022). Self and other reported workplace traits: A communal gap of men across occupations. Journal of Applied Social Psychology, 52, 568–587. https://doi.org/10.1111/jasp.12848

Gartzia, L., & Baniandrés, J. (2019). How feminine is the female advantage? Incremental validity of gender traits over leader sex on employees’ responses. Journal of Business Research , 99 . https://doi.org/10.1016/j.jbusres.2018.12.062

Gartzia, L., & Fetterolf, J. C. (2016). What division of labor do university students expect in their future lives? Divergences and communalities of female and male students. Sex Roles, 74 (3–4). https://doi.org/10.1007/s11199-015-0532-7

Gartzia, L., & Lopez-Zafra, E. (2014). Gender research in spanish psychology: An overview for international readers. Sex Roles , 70 (11). https://doi.org/10.1007/s11199-014-0380-x

Gartzia, L., & Lopez-Zafra, E. (2016). Gender research in Spanish psychology, part II: Progress and complexities in the European context. Sex Roles, 74 (3–4). https://doi.org/10.1007/s11199-015-0567-9

Gater, R., Tansella, M., Korten, A., Tiemens, B. G., Mavreas, V. G., & Olatawura, M. O. (1998). Sex differences in the prevalence and detection of depressive and anxiety disorders in general health care settings. Archives of General Psychiatry , 55 (5), 405–413. https://doi.org/10.1001/archpsyc.55.5.405

Geddes, P. & Thompson, J. (1890). The evolution of sex . NY: Scriber & Welford.

Geist, C., Reynolds, M. M., & Gaytán, M. S. (2017). Unfinished business: Disentangling sex, gender, and sexuality in sociological research on gender stratification. Sociology Compass , 11 (4), e12470. https://doi.org/10.1111/soc4.12470

Glick, P., & Fiske, S. T. (1996). The ambivalent sexism inventory: Differentiating hostile and benevolent sexism. Journal of Personality and Social Psychology , 70 (3), 491–512. https://doi.org/10.1037/0022-3514.70.3.491

Glick, P., & Fiske, S. T. (2001). An ambivalent alliance: Hostile and benevolent sexism as complementary justifications for gender inequality. American Psychologist , 56 (2), 109. https://doi.org/10.1037/0003-066X.56.2.109

Global Gender Gap Report. (2022, July). World Economic Forum . https://www.weforum.org/publications/global-gender-gap-report-2022/

Gold, D., & Berger, C. (1978). Problem-solving performance of young boys and girls as a function of task appropriateness and sex identity. Sex Roles , 4 , 183–193. https://doi.org/10.1007/BF00287499

Goldner, V. (2011). Trans: Gender in free fall. Psychoanalytic Dialogues , 21 (2), 159–171. https://doi.org/10.1080/10481885.2011.562836

Golombok, S., & Rust, J. (1993). The pre-school activities inventory: A standardized assessment of gender role in children. Psychological Assessment , 5 (2), 131. https://doi.org/10.1037/1040-3590.5.2.131

Haines, E. L., Deaux, K., & Lofaro, N. (2016). The times they are a-changing… or are they not? A comparison of gender stereotypes, 1983–2014. Psychology of Women Quarterly , 40 (3), 353–363. https://doi.org/10.1177/0361684316634081

Hampshire, N., Mayer, J. D., & Caruso, D. R. (2000). Emotional intelligence meets traditional standards for an intelligence. Intelligence, 27 (4), 267–298.

Haslanger, S. (2000). Feminism in metaphysics: Negotiating the natural. The Cambridge Companion to Feminism in Philosophy , 107–126.

Howard, J. A. (2000). Social psychology of identities. Annual Review of Sociology , 26 (1), 367–393.

Hull, L., Mandy, W., & Petrides, K. V. (2017). Behavioural and cognitive sex/gender differences in autism spectrum condition and typically developing males and females. Autism , 21 (6), 706–727. https://doi.org/10.1177/1362361316669087

Hyde, J. S. (2014). Gender similarities and differences. Annual Review of Psychology , 65 (1), 373–398. https://doi.org/10.1146/annurev-psych-010213-115057

Hyde, J. S. (2018). Gender similarities. In C. B. Travis, J. W. White, A. Rutherford, W. S. Williams, S. L. Cook, & K. F. Wyche (Eds.), APA handbook of the psychology of women: History, theory, and battlegrounds (pp. 129–143). American Psychological Association. https://doi.org/10.1037/0000059-007

Hyde, J. S., Bigler, R. S., Joel, D., Tate, C. C., & van Anders, S. M. (2019). The future of sex and gender in psychology: Five challenges to the gender binary. American Psychologist , 74 (2), 171–193. https://doi.org/10.1037/amp0000307

Jain, A. (2014). Gender role attitudes and marital satisfaction among asian indian couples living in the U.S. an exploratory study (Order No. 3667344). Available from ProQuest One Academic. (1643246456). Retrieved from https://www.proquest.com/dissertations-theses/gender-role-attitudes-marital-satisfaction-among/docview/1643246456/se-2

Joel, D., Tarrasch, R., Berman, Z., Mukamel, M., & Ziv, E. (2014). Queering gender: Studying gender identity in ‘normative’individuals. Psychology & Sexuality , 5 (4), 291–321. https://doi.org/10.1080/19419899.2013.830640

Kaiser, F. G., & Wilson, M. (2019). The Campbell paradigm as a behavior-predictive reinterpretation of the classical tripartite model of attitudes. European Psychologist . https://doi.org/10.1027/1016-9040/a000364

Kauma, H., Savolainen, M. J., Heikkilä, R., Rantala, A. O., Lilja, M., Reunanen, A., & Kesäniemi, Y. A. (1996). Sex difference in the regulation of plasma high density lipoprotein cholesterol by genetic and environmental factors. Human Genetics , 97 (2), 156–162. https://doi.org/10.1007/BF02265258

Kenagy, G. P., Hsieh, C. M., & Kennedy, G. (2005). The risk less known: Female-to-male transgender persons’ vulnerability to HIV infection. AIDS Care - Psychological and Socio-Medical Aspects of AIDS/HIV , 17 (2), 195–207. https://doi.org/10.1080/19540120512331325680

Kessler, R. C., McGonagle, K. A., Nelson, C. B., Hughes, M., Swartz, M., & Blazer, D. G. (1994). Sex and depression in the national comorbidity survey. II: Cohort effects. Journal of Affective Disorders , 30 (1), 15–26. https://doi.org/10.1016/0165-0327(94)90147-3

Kozee, H. B., Tylka, T. L., & Bauerband, L. A. (2012). Measuring transgender individuals’ comfort with gender identity and appearance: Development and validation of the transgender congruence scale. Psychology of Women Quarterly , 36 (2), 179–196. https://doi.org/10.1177/0361684312442161

Lai, M. C., & Szatmari, P. (2020). Sex and gender impacts on the behavioural presentation and recognition of autism. Current Opinion in Psychiatry , 33 (2), 117–123. https://doi.org/10.1097/YCO.0000000000000575

Layton, L. (1998). Gender studies. The Psychoanalytic Quarterly , 67 (2), 340–349.

Lindqvist, A., Sendén, M. G., & Renström, E. A. (2021). What is gender, anyway: A review of the options for operationalising gender. Psychology and Sexuality , 12 (4), 332–344. https://doi.org/10.1080/19419899.2020.1729844

Lips, H. M. (2020). Sex and gender: An introduction . Waveland Press.

Madson, L. (2000). Inferences regarding the personality traits and sexual orientation of physically androgynous people. Psychology of Women Quarterly , 24 (2), 148–160. https://doi.org/10.1111/j.1471-6402.2000.tb00196.x

Magliozzi, D., Saperstein, A., & Westbrook, L. (2016). Scaling up: Representing gender diversity in survey research. Socius , 2 , 2378023116664352. https://doi.org/10.1177/2378023116664352

McCall, L. (2005). The complexity of intersectionality. Signs: Journal of Women in Culture and Society , 30 (3), 1771–1800. https://doi.org/10.1086/426800

McHugh, M. C., & Frieze, I. H. (1997). The measurement of gender‐role attitudes: A review and commentary. Psychology of Women Quarterly , 21 (1), 1–16. https://doi.org/10.1111/j.1471-6402.1997.tb00097.x

Mikkola, M. (2022, Jan 18). Feminist perspectives on sex and gender . https://seop.illc.uva.nl/entries/feminism-gender/

Montañés, P., Megías, J. L., De Lemus, S., & Moya, M. (2015). Influence of early romantic relationships on adolescents’ sexism. International Journal of Social Psychology , 30 (2), 219–240. https://doi.org/10.1080/21711976.2015.1016756

Morgenroth, T., Sendén, M. G., Lindqvist, A., Renström, E. A., Ryan, M. K., & Morton, T. A. (2021). Defending the sex/gender binary: The role of gender identification and need for closure. Social Psychological and Personality Science , 12 (5), 731–740. https://doi.org/10.1177/1948550620937188

Newfield, E., Hart, S., Dibble, S., & Kohler, L. (2006). Female-to-male transgender quality of life. Quality of life Research , 15 , 1447–1457. https://doi.org/10.1007/s11136-006-0002-3

OECD. (2023). Joining forces for gender equality . OECD Publishing, Paris. https://doi.org/10.1787/67d48024-en  

Orlofsky, J. L. (1981). Relationship between sex role attitudes and personality traits and the sex role behavior scale-1: A new measure of masculine and feminine role behaviors and interests. Journal of Personality and Social Psychology , 40 (5), 927. https://doi.org/10.1037/0022-3514.40.5.927

Ostrom, T. M. (1969). The relationship between the affective, behavioral, and cognitive components of attitude. Journal of Experimental Social Psychology , 5 (1), 12–30. https://doi.org/10.1016/0022-1031(69)90003-1

Oyserman, D., Elmore, K., & Smith, G. (2011). Self, self-concept, and identity. In M. R. Leary & J. P. Tangney (Eds.), Handbook of self and identity (pp. 69–104). New York: Guilford Press.

Parrott, A. C. (1991). Performance tests in human psychopharmacology: II. Content validity, criterion validity, and face validity. Human Psychopharmacology: Clinical and Experimental , 6 (2), 91–98. https://doi.org/10.1002/hup.470060203

Raine, A., Yang, Y., Narr, K. L., & Toga, A. W. (2011). Sex differences in orbitofrontal gray as a partial explanation for sex differences in antisocial personality. Molecular Psychiatry , 16 (2), 227–236. https://doi.org/10.1038/mp.2009.136

Rocha-Sánchez, T. E., & Díaz-Loving, R. (2011). Desarrollo de una escala para la evaluación multifactorial de la identidad de género en población mexicana. Revista de psicología social , 26 (2), 191–206. https://doi.org/10.1174/021347411795448965

Rogers, L. J. (1999). Factors associated with exploration in marmosets: age, gender and hand preference. International Journal of Comparative Psychology , 12 (2). https://doi.org/10.46867/C4759T

Rosenberg, M. J., Hovland, C. I., McGuire, W. J., Abelson, R. P., & Brehm, J. W. (1960). Attitude organization and change: An analysis of consistency among attitude components . (Yales studies in attitude and communication.). Yale Univer. Press. 

Rubin, G. (1975). The traffic in women: Notes on the ‘Political Economy’ of Sex . Toward an Anthropology of Woman/Monthly Review.

Russell, B. L., & Trigg, K. Y. (2004). Tolerance of sexual harassment: An examination of gender differences, ambivalent sexism, social dominance, and gender roles. Sex Roles , 50 , 565–573. https://doi.org/10.1023/B:SERS.0000023075.32252.fd

Sanchis-Segura, C., Aguirre, N., Cruz-Gómez, Á. J., Félix, S., & Forn, C. (2022). Beyond “sex prediction”: Estimating and interpreting multivariate sex differences and similarities in the brain. Neuroimage, 257 , 1–29. https://doi.org/10.1016/j.neuroimage.2022.119343

Schmader, T., & Block, K.  (2015). Engendering identity: Toward a clearer conceptualization of gender as a social identity. Sex Roles , 73 , 474–480. https://doi.org/10.1007/s11199-015-0536-3

Schudson, Z. C., Beischel, W. J., & van Anders, S. M. (2019). Individual variation in gender/sex category definitions. Psychology of Sexual Orientation and Gender Diversity , 6 (4), 448. https://psycnet.apa.org/doi/10.1037/sgd0000346

Six, B., & Eckes, T. (1991). A closer look at the complex structure of gender stereotypes. Sex Roles , 24 (1–2), 57–71. https://doi.org/10.1007/BF00288703

Spence, J. T., & Helmreich, R. (1972). The attitudes toward women scale: An objective instrument to measure attitudes toward the rights and roles of women in contemporary society. Catalog of Selected Documents in Psychology , 2 (66).

Spence, J. T., & Hahn, E. D. (1997). The attitudes toward women scale and attitude change in college students. Psychology of Women Quarterly , 21 , 17–34. https://doi.org/10.1111/j.1471-6402.1997.tb00098.x

Spence, J. T., & Buckner, C. E. (2000). Instrumental and expressive traits, trait stereotypes, and sexist attitudes: What do they signify? Psychology of Women Quarterly , 24 (1), 44–62. https://doi.org/10.1111/j.1471-6402.2000.tb01021.x

Spence, J. T., Helmreich, R., & Stapp, J. (1975). Ratings of self and peers on sex role attributes and their relation to self-esteem and conceptions of masculinity and femininity. Journal of Personality And Social Psychology , 32 (1), 29. https://doi.org/10.1037/h0076857

Stoljar, N. (1995). Essence, identity, and the concept of woman. Philosophical Topics , 23 (2), 261–293.

Stoller, R. J. (1968). The sense of femaleness. The Psychoanalytic Quarterly , 37 (1), 42–55. https://doi.org/10.1080/21674086.1968.11926450

Tajfel, H. (1981). Human groups and social categories: Studies in social psychology . Cambridge: Cambridge University Press.

Twenge, J. M. (1997). Changes in masculine and feminine traits over time: A meta-analysis. Sex Roles , 36 (5-6), 305–325. https://doi.org/10.1007/BF02766650

Twenge, J. M. (2001). Changes in women’s assertiveness in response to status and roles: A cross-temporal meta-analysis, 1931–1993. Journal of Personality and Social Psychology , 81 (1), 133–145. https://doi.org/10.1037/0022-3514.81.1.133

UK Office for Statistics. (2021). UK Government. https://www.gov.uk/government/organisations/office-for-national-statistics

Unesco Institute for Statistics. (2022). Unesco. https://sdg4-data.uis.unesco.org/

van Anders, S. M. (2015). Beyond sexual orientation: Integrating gender/sex and diverse sexualities via sexual configurations theory. Archives of Sexual Behavior , 44 , 1177–1213. https://doi.org/10.1007/s10508-015-0490-8

Vocks, S., Stahn, C., Loenser, K., & Legenbauer, T. (2009). Eating and body image disturbances in male-to-female and female-to-male transsexuals. Archives of Sexual Behavior , 38 , 364–377. https://doi.org/10.1007/s10508-008-9424-z

Whitely, S. E. (1983). Construct validity: Construct representation versus nomothetic span. Psychological Bulletin , 93 (1), 179–197. https://doi.org/10.1037/0033-2909.93.1.179

Wieland, A., Durach, C. F., Kembro, J., & Treiblmaier, H. (2017). Statistical and judgmental criteria for scale purification. Supply Chain Management , 22 (4), 321–328. https://doi.org/10.1108/SCM-07-2016-0230

Williams, C. M., Peyre, H., Toro, R., & Ramus, F. (2021). Neuroanatomical norms in the UK Biobank: The impact of allometric scaling, sex, and age. Human Brain Mapping , 42 (14), 4623–4642. https://doi.org/10.1002/hbm.25572

Wood, W., & Eagly, A. H. (2002). A cross-cultural analysis of the behavior of women and men: implications for the origins of sex differences. Psychological Bulletin , 128 (5), 699. https://doi.org/10.1037/0033-2909.128.5.699

Wood, W., & Eagly, A. H. (2009). Gender identity. Handbook of Individual Differences in Social Behavior , 109–125.

Wood, W., & Eagly, A. H. (2012). Biosocial construction of sex differences and similarities in behavior. In Advances in experimental social psychology (Vol. 46). https://doi.org/10.1016/B978-0-12-394281-4.00002-7

Wood, W., & Eagly, A. H. (2015). Two traditions of research on gender identity. Sex Roles , 73 (11–12), 461–473. https://doi.org/10.1007/s11199-015-0480-2

Woodhill, B., Samuels, C., & Jamieson, G. (2024). The origins of gender, not sex: Evolution and the reproductive axis . https://doi.org/10.33774/coe-2020-k7gt1-v10

Download references

Acknowledgements

This work was inspired by collaboration with Prof. Alice H. Eagly; the author is very thankful for her very valuable contributions and feedback during the design and writing process. I also thank Anne Laure Humbert and María J. Pando for their enthusiastic appreciation of a previous version of this paper, as well as Christopher Begeny, Michelle K. Ryan and Teresa Sasiain for interesting discussions.

Author information

Authors and affiliations.

Department of Management, Deusto Business School, University of Deusto, Bilbao, Spain

Leire Gartzia

Corresponding author

Correspondence to Leire Gartzia.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Gartzia, L. Sex and attitudinal gender: A critical review and decomposition principle. Curr Psychol (2024). https://doi.org/10.1007/s12144-024-06410-w

Accepted: 10 July 2024

Published: 23 September 2024

DOI: https://doi.org/10.1007/s12144-024-06410-w

  • Individual differences

Experimental Design: Types, Examples & Methods

Saul McLeod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul McLeod, PhD, is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.

Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.

Experimental design refers to how participants are allocated to different groups in an experiment. Types of design include repeated measures, independent groups, and matched pairs designs.

Probably the most common way to design an experiment in psychology is to divide the participants into two groups, the experimental group and the control group, and then introduce a change to the experimental group but not to the control group.

The researcher must decide how to allocate the sample to the different experimental groups. For example, if there are 10 participants, will all 10 take part in both conditions (i.e., repeated measures), or will the participants be split in half and take part in only one condition each?

Three types of experimental designs are commonly used:

1. Independent Measures

Independent measures design, also known as between-groups, is an experimental design where different participants are used in each condition of the independent variable.  This means that each condition of the experiment includes a different group of participants.

This should be done by random allocation, ensuring that each participant has an equal chance of being assigned to each group.

Independent measures involve using two separate groups of participants, one in each condition. For example:

  • Con : More people are needed than with the repeated measures design (i.e., more time-consuming).
  • Pro : Avoids order effects (such as practice or fatigue) as people participate in one condition only.  If a person is involved in several conditions, they may become bored, tired, and fed up by the time they come to the second condition or become wise to the requirements of the experiment!
  • Con : Differences between participants in the groups may affect results, for example, variations in age, gender, or social background.  These differences are known as participant variables (i.e., a type of extraneous variable).
  • Control : After the participants have been recruited, they should be randomly assigned to their groups. This should ensure the groups are similar, on average (reducing participant variables).
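The random assignment described in the Control point can be sketched in a few lines of Python. This is only an illustration, not a prescribed procedure: the function name `randomly_allocate`, the participant labels, and the seed are all hypothetical.

```python
import random


def randomly_allocate(participants, conditions=("experimental", "control"), seed=None):
    """Shuffle the participants, then deal them out to the conditions in turn."""
    rng = random.Random(seed)  # a seed is used here only to make the sketch repeatable
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, person in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(person)
    return groups


# Ten hypothetical participants, split at random into two groups of five.
participants = [f"P{n}" for n in range(1, 11)]
for condition, members in randomly_allocate(participants, seed=42).items():
    print(condition, members)
```

Shuffling first and dealing afterwards keeps the group sizes equal while still giving every participant the same chance of ending up in either group.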

2. Repeated Measures Design

Repeated Measures design is an experimental design where the same participants participate in each independent variable condition.  This means that each condition of the experiment includes the same group of participants.

Repeated Measures design is also known as within-groups or within-subjects design.

  • Pro : As the same participants are used in each condition, participant variables (i.e., individual differences) are reduced.
  • Con : There may be order effects. Order effects refer to the order of the conditions affecting the participants’ behavior.  Performance in the second condition may be better because the participants know what to do (i.e., practice effect).  Or their performance might be worse in the second condition because they are tired (i.e., fatigue effect). This limitation can be controlled using counterbalancing.
  • Pro : Fewer people are needed as they participate in all conditions (i.e., saves time).
  • Control : To combat order effects, the researcher counterbalances the order of the conditions, alternating the order in which participants perform the different conditions of the experiment.

Counterbalancing

Suppose we used a repeated measures design in which all of the participants first learned words in “loud noise” and then learned them in “no noise.”

We expect the participants to learn better in “no noise” because of order effects, such as practice. However, a researcher can control for order effects using counterbalancing.

The sample would be split into two groups, and the two conditions (here, “loud noise” and “no noise”) labeled A and B.  For example, group 1 does ‘A’ then ‘B,’ and group 2 does ‘B’ then ‘A.’ This is to balance out order effects.

Although order effects occur for each participant, they balance each other out in the results because they occur equally in both groups.
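A small worked example may make the balancing concrete. The sketch below assumes condition A is “loud noise” and B is “no noise”, and it uses made-up numbers: a hypothetical true score for each condition plus a fixed practice bonus for whichever condition a participant does second. The function names are illustrative only.

```python
import random


def counterbalance(participants, seed=None):
    """Randomly split participants into two order groups: A-then-B and B-then-A."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {("A", "B"): shuffled[:half], ("B", "A"): shuffled[half:]}


def mean_scores(orders, base, practice_bonus):
    """Mean score per condition when the condition done second gets a fixed bonus."""
    totals = {"A": 0.0, "B": 0.0}
    count = 0
    for (first, second), members in orders.items():
        for _ in members:
            totals[first] += base[first]
            totals[second] += base[second] + practice_bonus
            count += 1
    return {condition: total / count for condition, total in totals.items()}


participants = [f"P{n}" for n in range(1, 11)]
base = {"A": 10.0, "B": 12.0}  # hypothetical true scores (assumed, not from any study)
bonus = 3.0                    # hypothetical practice effect on the second condition

# Everyone does A first: the practice bonus lands entirely on B.
print(mean_scores({("A", "B"): participants}, base, bonus))

# Counterbalanced: half the sample carries the bonus into A, half into B.
print(mean_scores(counterbalance(participants, seed=0), base, bonus))
```

With these numbers, the uncounterbalanced design shows a 5-point advantage for B, whereas the counterbalanced design recovers the assumed 2-point difference, because the bonus is added equally to both condition means.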

3. Matched Pairs Design

A matched pairs design is an experimental design where pairs of participants are matched in terms of key variables, such as age or socioeconomic status. One member of each pair is then placed into the experimental group and the other member into the control group.

One member of each matched pair must be randomly assigned to the experimental group and the other to the control group.

  • Con : If one participant drops out, you lose two participants’ data.
  • Pro : Reduces participant variables because the researcher has tried to pair up the participants so that each condition has people with similar abilities and characteristics.
  • Con : Very time-consuming trying to find closely matched pairs.
  • Pro : It avoids order effects, so counterbalancing is not necessary.
  • Con : Impossible to match people exactly unless they are identical twins!
  • Control : Members of each pair should be randomly assigned to conditions. However, this does not solve all these problems.
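One simple way to carry out the matching and random assignment is sketched below. It assumes a single matching variable (a hypothetical pre-test score) and pairs participants who are adjacent when ranked on it. Real studies often match on several variables at once, which this sketch does not attempt. The name `matched_pairs` and the scores are illustrative.

```python
import random


def matched_pairs(scores, seed=None):
    """Pair participants with adjacent scores, then randomly split each pair."""
    rng = random.Random(seed)
    ranked = sorted(scores, key=scores.get)  # participant IDs, lowest score first
    experimental, control = [], []
    # Walk the ranked list two at a time; an unpaired participant at the end is left out.
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)  # random assignment within the pair
        experimental.append(pair[0])
        control.append(pair[1])
    return experimental, control


# Hypothetical pre-test scores for six participants.
scores = {"P1": 31, "P2": 12, "P3": 29, "P4": 14, "P5": 22, "P6": 20}
experimental, control = matched_pairs(scores, seed=7)
for e, c in zip(experimental, control):
    print(f"{e} ({scores[e]}) -> experimental | {c} ({scores[c]}) -> control")
```

Ranking first and pairing neighbours keeps the members of each pair as similar as the sample allows, and the shuffle inside each pair supplies the random assignment that the Control point calls for.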

Experimental design refers to how participants are allocated to an experiment’s different conditions (or IV levels). There are three types:

1. Independent measures / between-groups : Different participants are used in each condition of the independent variable.

2. Repeated measures / within-groups : The same participants take part in each condition of the independent variable.

3. Matched pairs : Each condition uses different participants, but they are matched in terms of important characteristics, e.g., gender, age, intelligence, etc.

Learning Check

Read about each of the experiments below. For each experiment, identify (1) which experimental design was used; and (2) why the researcher might have used that design.

1 . To compare the effectiveness of two different types of therapy for depression, depressed patients were assigned to receive either cognitive therapy or behavior therapy for a 12-week period.

The researchers attempted to ensure that the patients in the two groups had similar severity of depressed symptoms by administering a standardized test of depression to each participant, then pairing them according to the severity of their symptoms.

2 . To assess the difference in reading comprehension between 7 and 9-year-olds, a researcher recruited each group from a local primary school. They were given the same passage of text to read and then asked a series of questions to assess their understanding.

3 . To assess the effectiveness of two different ways of teaching reading, a group of 5-year-olds was recruited from a primary school. Their level of reading ability was assessed, and then they were taught using scheme one for 20 weeks.

At the end of this period, their reading was reassessed, and a reading improvement score was calculated. They were then taught using scheme two for a further 20 weeks, and another reading improvement score for this period was calculated. The reading improvement scores for each child were then compared.

4 . To assess the effect of organization on recall, a researcher randomly assigned student volunteers to two conditions.

Condition one attempted to recall a list of words that were organized into meaningful categories; condition two attempted to recall the same words, randomly grouped on the page.

Experiment Terminology

Ecological validity

The degree to which an investigation represents real-life experiences.

Experimenter effects

These are the ways that the experimenter can accidentally influence the participant through their appearance or behavior.

Demand characteristics

The clues in an experiment that lead the participants to think they know what the researcher is looking for (e.g., the experimenter’s body language).

Independent variable (IV)

The variable the experimenter manipulates (i.e., changes), which is assumed to have a direct effect on the dependent variable.

Dependent variable (DV)

The variable the experimenter measures. This is the outcome (i.e., the result) of a study.

Extraneous variables (EV)

All variables which are not independent variables but could affect the results (DV) of the experiment. Extraneous variables should be controlled where possible.

Confounding variables

Variable(s) that have affected the results (DV), apart from the IV. A confounding variable could be an extraneous variable that has not been controlled.

Random Allocation

Randomly allocating participants to independent variable conditions means that all participants should have an equal chance of taking part in each condition.

The principle of random allocation is to avoid bias in how the experiment is carried out and limit the effects of participant variables.

Order effects

Changes in participants’ performance due to their repeating the same or similar test more than once. Examples of order effects include:

(i) practice effect: an improvement in performance on a task due to repetition, for example, because of familiarity with the task;

(ii) fatigue effect: a decrease in performance of a task due to repetition, for example, because of boredom or tiredness.


Gender-based pairings influence cooperative expectations and behaviours

Affiliations

  • 1 OpenSystems Research Group, Departament de Física de la Matèria Condensada, Universitat de Barcelona, Barcelona, 08028, Spain.
  • 2 Universitat de Barcelona Institute of Complex Systems UBICS, Barcelona, 08028, Spain.
  • 3 OpenSystems Research Group, Departament de Física de la Matèria Condensada, Universitat de Barcelona, Barcelona, 08028, Spain.
  • 4 Universitat de Barcelona Institute of Complex Systems UBICS, Barcelona, 08028, Spain.
  • PMID: 31974477
  • PMCID: PMC6978365
  • DOI: 10.1038/s41598-020-57749-6

The study explores the expectations and cooperative behaviours of men and women in a lab-in-the-field experiment by means of citizen science practices in the public space. It specifically examines the influence of gender-based pairings on the decisions to cooperate or defect in a framed and discrete Prisoner's Dilemma game after visual contact. Overall, we found that when gender is considered behavioural differences emerge in expectations of cooperation, cooperative behaviours, and their decision time depending on whom the partner is. Men pairs are the ones with the lowest expectations and cooperation rates. After visual contact women infer men's behaviour with the highest accuracy. Also, women take significantly more time to defect than to cooperate, compared to men. Finally, when the interacting partners have the opposite gender they expect significantly more cooperation and they achieve the best collective outcome. Together, the findings suggest that non verbal signals may influence men and women differently, offering novel interpretations to the context-dependence of gender differences in social decision tasks.

Conflict of interest statement

The authors declare no competing interests.

Cooperation and expected cooperation by gender. The heatmap shows the ratio of cooperation…

Behavioural domains by gender pairing. Cooperation refers to the ratio of cooperative decisions;…

Boxplot of response time by gender. The boxplots represent the log-transformed response time…

COMMENTS

  1. John Money Gender Experiment: Reimer Twins

    The John Money Experiment involved David Reimer, a twin boy raised as a girl following a botched circumcision. Money asserted gender was primarily learned, not innate. However, David struggled with his female identity and transitioned back to male in adolescence. The case challenged Money's theory, highlighting the influence of biological sex on gender identity.

  2. David Reimer and John Money Gender Reassignment Controversy: The John

    In the mid-1960s, psychologist John Money encouraged the gender reassignment of David Reimer, who was born a biological male but suffered irreparable damage to his penis as an infant. Born in 1965 as Bruce Reimer, his penis was irreparably damaged during infancy due to a failed circumcision. After encouragement from Money, Reimer's parents decided to raise Reimer as a girl.

  3. The Psychology of Gender: What are the Different Perspectives?

    How we acquire gender identity. Traditionally, there are three main psychological explanations of how we navigate the path to gender identity. These are psychodynamic theory, social learning theory, and cognitive-developmental theory. All focus on early childhood, that is, up until about seven years of age. Psychodynamic theory.

  4. Module 2: Studying Gender Using the Scientific Method

    2.2.5.1. Example of an experimental psychology of gender study. Wirth and Bodenhausen (2009) investigated whether gender played a moderating role in the stigma of mental illness in a web-based survey experiment. They asked participants to read a case summary in which the patient's gender was manipulated along with the type of disorder.

  5. The gender biases that shape our brains

    As neuroscientist and author Gina Rippon of Aston University explains, the fact that we live in a gendered world itself creates a gendered brain. It creates a culture of boys who feel conditioned ...

  6. Scientists' Gender May Influence the Results of Experiments

    Gender is likely not the only factor that can sway the results of an experiment. "I imagine race, ethnicity, age, that all of those things could have important effects on how research ...

  7. Gender Differences in Personality across the Ten Aspects of the Big

    Gender differences in personality traits are often characterized in terms of which gender has higher scores on that trait, on average. ... based on rational review of psychological constructs (e.g., Costa and McCrae, 1992) or by systematic sampling from the space defined by pairs of Big Five factors (e.g., Soto and John, 2009). In the present ...

  8. Fluidity of gender identity induced by illusory body-sex change

    Experiment II tested whether the perceived sex of one's own body also modulates implicit associations between oneself and gender categories. This experiment had the same two-by-two factorial ...

  9. Sex and gender analysis improves science and engineering

    Analysing experimental results by sex and/or gender is critical for improving accuracy and avoiding misinterpretation of data (Fig. 1). The common practice of pooling the response of females and ...

  10. Gender in a Social Psychology Context

    Summary. Understanding gender and gender differences is a prevalent aim in many psychological subdisciplines. Social psychology has tended to employ a binary understanding of gender and has focused on understanding key gender stereotypes and their impact. While women are seen as warm and communal, men are seen as agentic and competent.

  11. Gender-based pairings influence cooperative expectations and ...

    Abstract. The study explores the expectations and cooperative behaviours of men and women in a lab-in-the-field experiment by means of citizen science practices in the public space. It ...

  12. Gender Trouble in Social Psychology: How Can Butler's Work Inform

    1 Department of Psychology, University of Exeter, Exeter, United Kingdom; 2 Faculty of Economics and Business, University of Groningen, Groningen, Netherlands; A quarter of a century ago, philosopher Judith Butler (1990) called upon society to create "gender trouble" by disrupting the binary view of sex, gender, and sexuality. She argued that gender, rather than being an essential quality ...

  13. What is gender, anyway: a review of the options for operationalising gender

    What is gender? A deconstruction of the concept. Defining gender is both highly important and complex. Hegarty (2001) suggests that the quantitative researcher should address this definition from a performative perspective to de-construct the gender concept. In this way, gender is a non-essential category which instead is repeatedly performed based on societal norms (Morgenroth & Ryan ...

  14. Psychology of Gender

    Middle School - Grades 7-9. P =Project E =Experiment. Learn about the human behavior called delay of gratification, and to determine how delayed gratification depends on gender and attention. [E] Investigate differences in the way boys and girls express emotion, including facial, verbal and physiological differences.

  15. PDF Investigating Gender-based Responses in Social Psychology Experiments

    The theoretical foundation of Mind Genomics has been established in a number of publications (Moskowitz & Gofman, 2007; Gofman et al., 2010; Moskowitz et al., 2012a; Porretta et al., 2019) to name a few. Mind Genomics is used in many research fields to analyze complex

  16. Gender bias in research: how does it affect evidence based medicine?

    Another facet of gender bias in research is in the lack of incorporation of gender data into evidence-based medicine. For example, despite well recognized gender differences in coronary heart disease management in UK critical care units, 13 the UK NHS guidelines for management are not gender specific. 14 If research lacks or excludes female ...

  17. Experimenter gender and replicability in science

    Here, we point to one important and overlooked factor likely perpetuating this ubiquitous problem: the role of experimenter gender. Experiments in humans are regularly carried out without any report of the experimenter's gender; however, there is a range of evidence supporting the influence of experimenter gender on a variety of psychological ...

  18. Think again: Men and women share cognitive skills

    Psychological Science in the Public Interest, 8(1), 1-51. Hyde, J. S., & Linn, M. C. (1988). Gender differences in verbal ability: A meta-analysis. ...

  19. Sex and Gender Differences Research Design for Basic, Clinical, and

    Many compelling publications have argued why sex and gender should be considered in preclinical, clinical, and population research. Both sex (the biological attributes of females and males) and gender (socially constructed roles, behaviors, and identities in a spectrum, including femininity and masculinity) affect molecular and cellular processes, clinical traits, response to treatments ...

  20. Making gender diversity work for scientific discovery and innovation

    Gender diversity has the potential to drive scientific discovery and innovation. ... most literature on the importance of management is based on social psychological experiments and field studies ...

  21. Sex and attitudinal gender: A critical review and decomposition

    Research is concerned about sex and gender effects, often addressed interchangeably. This paper reviews the distinct conceptualization and usage of these constructs with a focus on the multiple psychological components of gender. Extending social role theory concerns about how gender internalizes into individual psychology, an organizing framework is provided pointing to attitudinal gender as ...

  22. Experimental Design: Types, Examples & Methods

    Three types of experimental designs are commonly used: 1. Independent Measures. Independent measures design, also known as between-groups, is an experimental design where different participants are used in each condition of the independent variable. This means that each condition of the experiment includes a different group of participants.

  23. Gender-based pairings influence cooperative expectations and ...

    The study explores the expectations and cooperative behaviours of men and women in a lab-in-the-field experiment by means of citizen science practices in the public space. It specifically examines the influence of gender-based pairings on the decisions to cooperate or defect in a framed and discrete Prisoner's Dilemma game after visual contact.