B.F. Skinner: The Man Who Taught Pigeons to Play Ping-Pong and Rats to Pull Levers

One of behavioral psychology’s most famous scientists was also one of the quirkiest

Marina Koren

Psychologist B.F. Skinner taught these pigeons to play ping-pong in 1950.

B.F. Skinner, a leading 20th-century psychologist who hypothesized that behavior was caused only by external factors, not by thoughts or emotions, was a controversial figure in a field that tends to attract controversial figures. In a realm of science that has given us Sigmund Freud, Carl Jung and Jean Piaget, Skinner stands out by sheer quirkiness. After all, he is the scientist who trained rats to pull levers and push buttons and taught pigeons to read and play ping-pong.

Besides Freud, Skinner is arguably the most famous psychologist of the 20th century. Today, his work is basic study in introductory psychology classes across the country. But what drives a man to teach his children’s cats to play piano and instruct his beagle on how to play hide and seek? Last year, Norwegian researchers dove into his past to figure it out. The team combed through biographies, archival material and interviews with those who knew him, then tested Skinner on a common personality scale.

They found Skinner, who would be 109 years old today, was highly conscientious, extroverted and somewhat neurotic—a trait shared by as many as 45 percent of leading scientists. The analysis revealed him to be a tireless worker, one who introduced a new approach to behavioral science by building on the theories of Ivan Pavlov and John Watson.

Skinner wasn’t interested in understanding the human mind and its mental processes—his field of study, known as behaviorism, was primarily concerned with observable actions and how they arose from environmental factors. He believed that our actions are shaped by our experience of reward and punishment, an approach that he called operant conditioning. The term “operant” refers to an animal or person “operating” on their environment to effect change while learning a new behavior.

B.F. Skinner at the Harvard psychology department, circa 1950

Operant conditioning breaks down a task into increments. If you want to teach a pigeon to turn in a circle to the left, you give it a reward for any small movement it makes in that direction. Soon, the pigeon catches onto this and makes larger movements to the left, which garner more rewards, until the bird completes the full circle. Skinner believed that this type of learning even relates to language and the way we learn to speak. Children are rewarded, through their parents’ verbal encouragement and affection, for making a sound that resembles a certain word until they can actually say that word.

Skinner’s approach introduced a new term into the literature: reinforcement. Behavior that is reinforced, like a mother excitedly drawing out the sounds of “mama” as a baby coos, tends to be repeated, and behavior that’s not reinforced tends to weaken and die out. “Positive” refers to the practice of encouraging a behavior by adding to it, such as rewarding a dog with a treat, and “negative” refers to encouraging a behavior by taking something away. For example, when a driver absentmindedly continues to sit in front of a green light, the driver waiting behind them honks the horn; the first driver is reinforced for moving when the honking stops. The phenomenon of reinforcement extends beyond babies and pigeons: we’re rewarded for going to work each day with a paycheck every two weeks, and likely wouldn’t step inside the office if the paychecks stopped coming.

Today, the spotlight has shifted from such behavior analysis to cognitive theories, but some of Skinner’s contributions continue to hold water, from teaching dogs to roll over to convincing kids to clean their rooms. Here are a few:

1. The Skinner box. To show how reinforcement works in a controlled environment, Skinner placed a hungry rat into a box that contained a lever. As the rat scurried around inside the box, it would accidentally press the lever, causing a food pellet to drop into the box. After several such runs, the rat quickly learned that upon entering the box, running straight toward the lever and pressing down meant receiving a tasty snack. The rat learned how to use a lever to its benefit in an unpleasant situation too: in another box that administered small electric shocks, pressing the lever caused the unpleasant zapping to stop.

2. Project Pigeon. During World War II, the military invested in Skinner’s project to train pigeons to guide missiles through the skies. The psychologist used a device that emitted a clicking noise to train pigeons to peck at a small, moving point underneath a glass screen. Skinner posited that the birds, situated in front of a screen inside of a missile, would see enemy torpedoes as specks on the glass, and rapidly begin pecking at them. Their movements would then be used to steer the missile toward the enemy: pecks at the center of the screen would direct the rocket to fly straight, while off-center pecks would cause it to tilt and change course. Skinner managed to teach one bird to peck at a spot more than 10,000 times in 45 minutes, but the prospect of pigeon-guided missiles, along with adequate funding, eventually lost luster.

3. The Air-Crib. Skinner tried to mechanize childcare through the use of this “baby box,” which maintained the temperature of a child’s environment. Humorously known as an “heir conditioner,” the crib was completely humidity- and temperature-controlled, a feature Skinner believed would keep his second daughter from getting cold at night and crying. A fan pushed air from the outside through a linen-like surface, adjusting the temperature throughout the night. The air-crib failed commercially, and although his daughter only slept inside it at night, many of Skinner’s critics believed it was a cruel and experimental way to raise a child.

4. The teaching box. Skinner believed using his teaching machine to break down material bit by bit, offering rewards along the way for correct responses, could serve almost like a private tutor for students. Material was presented in sequence, and the machine provided hints and suggestions until students verbally explained a response to a problem (Skinner didn’t believe in multiple choice answers). The device wouldn’t allow students to move on in a lesson until they understood the material, and when students got any part of it right, the machine would spit out positive feedback until they reached the solution. The teaching box didn’t stick in a school setting, but many computer-based self-instruction programs today use the same idea.

5. The Verbal Summator. An auditory version of the Rorschach inkblot test, this tool allowed participants to project subconscious thoughts through sound. Skinner quickly abandoned this endeavor as personality assessment didn’t interest him, but the technology spawned several other types of auditory perception tests.

Marina Koren is a staff writer at The Atlantic. Previously, she was a digital intern for Smithsonian.com.

The Behavioral Psychology Theory That Explains Learned Behavior

Aka the Skinner box

Kendra Cherry, MS, is a psychosocial rehabilitation specialist, psychology educator, and author of the "Everything Psychology Book."


A Skinner box is an enclosed apparatus that contains a bar or key that an animal subject can manipulate in order to obtain reinforcement. Developed by B. F. Skinner and also known as an operant conditioning chamber, the box also has a device that records each response the animal provides, as well as the schedule of reinforcement the animal was assigned. Common animal subjects include rats and pigeons.

Skinner was inspired to create his operant conditioning chamber as an extension of the puzzle boxes that Edward Thorndike famously used in his research on the law of effect. Skinner himself did not refer to this device as a Skinner box, instead preferring the term "lever box."

How a Skinner Box Works

The design of a Skinner box can vary depending upon the type of animal and the experimental variables. It must include at least one lever, bar, or key that the animal can manipulate.

When the lever is pressed, food, water, or some other type of reinforcement might be dispensed. Other stimuli can also be presented, including lights, sounds, and images. In some instances, the floor of the chamber may be electrified.

The Skinner box is usually enclosed, to keep the animal from experiencing other stimuli. Using the device, researchers can carefully study behavior in a very controlled environment. For example, researchers could use the Skinner box to determine which schedule of reinforcement led to the highest rate of response in the study subjects.

Today, psychology students may use a virtual version of a Skinner box to conduct experiments and learn about operant conditioning.
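A virtual Skinner box of this kind can be approximated in a few lines of code. The sketch below is purely illustrative and is not any particular software package: the class name, method names, and the continuous-reinforcement rule are all invented for the example.

```python
class VirtualSkinnerBox:
    """Toy model of an operant conditioning chamber: it counts lever presses
    and dispenses a 'pellet' whenever the reinforcement rule says to."""

    def __init__(self, reinforcement_rule):
        # reinforcement_rule: a function taking the running press count
        # and returning True when a pellet should be dispensed.
        self.reinforcement_rule = reinforcement_rule
        self.press_count = 0
        self.pellets_dispensed = 0

    def press_lever(self):
        """Record one response; deliver reinforcement if the rule allows it."""
        self.press_count += 1
        reinforced = self.reinforcement_rule(self.press_count)
        if reinforced:
            self.pellets_dispensed += 1
        return reinforced


# Continuous reinforcement: every press produces a pellet.
box = VirtualSkinnerBox(lambda presses: True)
for _ in range(10):
    box.press_lever()
print(box.press_count, box.pellets_dispensed)  # -> 10 10
```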

The Skinner Box in Research

Imagine that a researcher wants to determine which schedule of reinforcement will lead to the highest response rates. Pigeons are placed in chambers where they receive a food pellet for pecking at a response key. Some pigeons receive a pellet for every response (continuous reinforcement).

Partial Reinforcement Schedules

Other pigeons obtain a pellet only after a certain amount of time or number of responses have occurred (partial reinforcement). There are several types of partial reinforcement schedules.

  • Fixed-ratio schedule: Pigeons receive a pellet after they peck at the key a certain number of times; for example, they would receive a pellet after every five pecks.
  • Variable-ratio schedule: Subjects receive reinforcement after a random number of responses.
  • Fixed-interval schedule: Subjects are given a pellet after a designated period of time has elapsed; for example, every 10 minutes.
  • Variable-interval schedule: Subjects receive a pellet at random intervals of time.

Once the data has been obtained from the trials in the Skinner boxes, researchers can then look at the rate of responding. This will tell them which schedules led to the highest and most consistent level of responses.
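As a rough sketch of how such a comparison might be set up, the four schedules can be written as small decision rules and run against a simulated subject. This is an illustrative toy, not a real experimental protocol: the subject responds with a fixed probability, the ratios and step counts (5 responses, 10 time steps) are arbitrary choices, and the only thing compared is how often each rule pays out.

```python
import random

def simulate(schedule, steps=1000, response_prob=0.5):
    """Run one subject for `steps` time steps and count responses and rewards.
    `schedule(responses_since_reward, steps_since_reward)` returns True when
    the current response should be reinforced."""
    responses = rewards = 0
    since_responses = since_steps = 0
    for _ in range(steps):
        since_steps += 1
        if random.random() < response_prob:        # the subject responds
            responses += 1
            since_responses += 1
            if schedule(since_responses, since_steps):
                rewards += 1
                since_responses = since_steps = 0  # reset after reinforcement
    return responses, rewards

schedules = {
    "fixed ratio (every 5th response)":   lambda r, t: r >= 5,
    "variable ratio (about 1 in 5)":      lambda r, t: random.random() < 1 / 5,
    "fixed interval (10 steps)":          lambda r, t: t >= 10,
    "variable interval (about 10 steps)": lambda r, t: random.random() < 1 / 10,
}

for name, rule in schedules.items():
    print(name, simulate(rule))
```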

Skinner Box Myths

The Skinner box should not be confused with one of Skinner's other inventions, the baby tender (also known as the air crib). At his wife's request, Skinner created a heated crib with a plexiglass window that was designed to be safer than other cribs available at that time. Public confusion over the crib's purpose led to it being mistaken for an experimental device, which in turn led some to believe that Skinner's crib was actually a variation of the Skinner box.

At one point, a rumor spread that Skinner had used the crib in experiments with his daughter, leading to her eventual suicide. The Skinner box and the baby tender crib were two different things entirely, and Skinner did not conduct experiments on his daughter or with the crib. Nor did his daughter take her own life.  

A Word From Verywell

The Skinner box is an important tool for studying learned behavior. It has contributed a great deal to our understanding of the effects of reinforcement and punishment.



Skinner Box: What Is an Operant Conditioning Chamber?

Charlotte Nickerson

Research Assistant at Harvard University

Undergraduate at Harvard University

Charlotte Nickerson is a student at Harvard University obsessed with the intersection of mental health, productivity, and design.


Saul Mcleod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul Mcleod, PhD., is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.

Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.


The Skinner box is a chamber that isolates the subject from the external environment and has a behavior indicator such as a lever or a button.

When the animal pushes the button or lever, the box is able to deliver a positive reinforcement of the behavior (such as food) or a punishment (such as noise), or a token conditioner (such as a light) that is correlated with either the positive reinforcement or punishment.

  • The Skinner box, otherwise known as an operant conditioning chamber, is a laboratory apparatus used to study animal behavior within a compressed time frame.
  • Underlying the development of the Skinner box was the concept of operant conditioning, a type of learning that occurs as a consequence of a behavior.
  • The Skinner Box has been wrongly confused with the Skinner air crib, with detrimental public consequences for Skinner.
  • Commentators have drawn parallels between the Skinner box and modern advertising and game design, citing their addictive qualities and systematized rewards.

How Does It Work?

The Skinner Box is a chamber, often small, that is used to conduct operant conditioning research with animals. Within this chamber, there is usually a lever or key that an individual animal can operate to obtain a food or water source within the chamber as a reinforcer.

The chamber is connected to electronic equipment that records the animal’s lever pressing or key pecking, allowing for the precise quantification of behavior.

Diagram: the layout of a Skinner box (operant conditioning chamber) used in rat experiments.

Before the works of Skinner, the namesake of the Skinner box, instrumental learning was typically studied using a maze or puzzle box.

Learning in these settings is well-suited to examining discrete trials or episodes of behavior instead of a continuous stream of behavior.

The Skinner box, meanwhile, was designed as an experimental environment better suited to examine the more natural flow of behavior in animals.

The design of the Skinner Box varies heavily depending on the type of animal enclosed within it and experimental variables.

Nonetheless, it includes, at minimum, at least one lever, bar, or key that an animal can manipulate. Besides the reinforcer and tracker, a Skinner box can include other variables, such as lights, sounds, or images. In some cases, the floor of the chamber may even be electrified (Boulay, 2019).

The design of the Skinner box is intended to keep an animal from experiencing other stimuli, allowing researchers to carefully study behavior in a very controlled environment.

This allows researchers to, for example, determine which schedule of reinforcement — or relation of rewards and punishment to the reinforcer — leads to the highest rate of response in the animal being studied (Boulay, 2019).

The Reinforcer

The reinforcer is the part of the Skinner box that delivers reinforcement for an action. For instance, a lever may dispense a pellet of food when pressed a certain number of times; in this setup, the lever serves as the reinforcer (Boulay, 2019).

The Tracker/Quantifier

The tracker, meanwhile, provides quantitative data regarding the reinforcer. For example, the tracker may count the number of times that a lever is pressed or the number of electric shocks or pellets dispensed (Boulay, 2019).

Partial Reinforcement Schedules

Partial reinforcement occurs when reinforcement is only given under particular circumstances. For example, a pellet or shock may only be dispensed after a pigeon has pressed a lever a certain number of times.

There are several types of partial reinforcement schedules (Boulay, 2019):

  • Fixed-ratio schedules, where an animal receives a pellet after pushing the trigger a certain number of times.
  • Variable-ratio schedules, where animals receive reinforcement after a random number of responses.
  • Fixed-interval schedules, where animals are given a pellet after a designated period of time has elapsed, such as every 5 minutes.
  • Variable-interval schedules, where animals receive a reinforcer at random.

Once data has been obtained from the Skinner box, researchers can look at the rate of response depending on the schedule.

The Skinner Box in Research

Modified versions of the operant conditioning chamber, or Skinner box, are still widely used in research settings today.

Skinner developed his theory of operant conditioning by identifying four types of outcome: reinforcement and punishment, each of which can be positive or negative.

To test the effect of these outcomes, he constructed a device called the “Skinner Box,” a cage in which a rat could be placed, with a small lever (which the rat would be trained to press), a chute that would release pellets of food, and a floor which could be electrified.

For example, a hungry rat was placed in a cage. Every time it activated the lever, a food pellet fell into the food dispenser (positive reinforcement). The rat quickly learned to go straight to the lever after a few times of being put in the box.

This suggests that positive reinforcement increases the likelihood of the behavior being repeated.

In another experiment, a rat was placed in a cage in which it was subjected to an uncomfortable electrical current (see diagram above).

As it moved around the cage, the rat hit the lever, which immediately switched off the electrical current (negative reinforcement). The rat quickly learned to go straight to the lever after a few times of being put in the box.

This suggests that negative reinforcement increases the likelihood of the behavior being repeated.

The device allowed Skinner to deliver each of his four potential outcomes, which are:

  • Positive Reinforcement: a direct reward for performing a certain behavior. For instance, the rat could be rewarded with a pellet of food for pushing the lever.
  • Positive Punishment: a direct negative outcome following a particular behavior. Once the rat had been taught to press the lever, for instance, Skinner trained it to cease this behavior by electrifying the floor each time the lever was pressed.
  • Negative Reinforcement: the removal of an unpleasant situation when a particular behavior is performed (thus producing a sense of relief). For instance, a mild electric current was passed through the floor of the cage and was removed when the desired behavior was performed.
  • Negative Punishment: involves taking away a reward or removing a pleasant situation. In the Skinner box, for instance, the rat could be trained to stop pressing the lever by releasing food pellets at regular intervals and then withholding them when the lever was pressed.

Commercial Applications

The application of operant and classical conditioning and the corresponding idea of the Skinner Box in commercial settings is widespread, particularly with regard to advertising and video games.

Advertisers use a number of techniques based on operant conditioning to influence consumer behavior, such as the variable-ratio reinforcement schedule (the so-called “slot machine effect”), which encourages viewers to keep watching a particular channel in the hope of seeing a desirable outcome (e.g., winning a prize) (Vu, 2017).

Similarly, video game designers often employ Skinnerian principles in order to keep players engaged in gameplay.

For instance, many games make use of variable-ratio schedules of reinforcement, whereby players are given rewards (e.g., points, new levels) after an unpredictable number of actions.

This encourages players to keep playing in the hope of receiving a reward. In addition, many games make use of Skinner’s principle of shaping, whereby players are gradually given more difficult tasks as they master the easy ones. This encourages players to persevere in the face of frustration in order to see results.

There are a number of potential problems with using operant conditioning principles in commercial settings.

First, advertisers and video game designers may inadvertently create addictive behaviors in consumers.

Second, operant conditioning is a relatively short-term phenomenon; that is, it only affects behavior while reinforcement is being given.

Once reinforcement is removed (e.g., the TV channel is changed, the game is turned off), the desired behavior is likely to disappear as well.

As such, operant conditioning techniques may backfire, leading to addiction without driving the game-playing experiences developers hoped for (Vu, 2017).

Skinner Box Myths

In 1945, B. F. Skinner invented the air crib, a metal crib with walls and a ceiling made of removable safety glass.

The front pane of the crib was also made of safety glass, and the entire structure was meant to sit on legs so that it could be moved around easily.

The air crib was designed to create a climate-controlled, healthier environment for infants. The air crib was not commercially successful, but it did receive some attention from the media.

In particular, Time magazine ran a story about the air crib in 1947, which described it as a “baby tender” that would “give infant care a new scientific basis” (Joyce & Faye, 2010).

The confused publicity around Skinner’s air crib, however, resulted in the perpetuation of the myth that Skinner’s air crib was a Skinner Box and that the infants placed in the crib were being conditioned.

In reality, the air crib was nothing more than a simple bassinet with some features that were meant to make it easier for parents to care for their infants.

There is no evidence that Skinner ever used the air crib to condition children, and in fact, he later said that it was never his intention to do so.

One famous myth surrounding the air crib was that Skinner’s daughter, Deborah Skinner, was raised in a Skinner Box.

According to this rumor, Deborah Skinner had become mentally ill, sued her father, and committed suicide as a result of her experience. These rumors persisted until she publicly denied the stories in 2004 (Joyce & Faye, 2010).

Effectiveness

One of the most common criticisms of the Skinner box is that it does not require animals to understand their actions.

Because behaviorism does not require that an animal understand its actions, this theory can be somewhat misleading about the degree to which an animal actually understands what it is doing (Boulay, 2019).

Another criticism of the Skinner box is that it can be quite stressful for the animals involved. The design of the Skinner box is intended to keep an animal from experiencing other stimuli, which can lead to stress and anxiety.

Finally, some critics argue that the data obtained from Skinner boxes may not be generalizable to real-world situations.

Because the environment in a Skinner box is so controlled, it may not accurately reflect how an animal would behave in an environment outside the lab.

There are very few learning environments in the real world that replicate a perfect operant conditioning environment, with a single action or sequence of actions leading to a stimulus (Boulay, 2019).

Bandura, A. (1977). Social learning theory. Englewood Cliffs, NJ: Prentice Hall.

Dezfouli, A., & Balleine, B. W. (2012). Habits, action sequences and reinforcement learning. European Journal of Neuroscience, 35 (7), 1036-1051.

Du Boulay, B. (2019). Escape from the Skinner Box: The case for contemporary intelligent learning environments. British Journal of Educational Technology, 50 (6), 2902-2919.

Chen, C., Zhang, K. Z., Gong, X., & Lee, M. (2019). Dual mechanisms of reinforcement reward and habit in driving smartphone addiction: the role of smartphone features. Internet Research.

Dad, H., Ali, R., Janjua, M. Z. Q., Shahzad, S., & Khan, M. S. (2010). Comparison of the frequency and effectiveness of positive and negative reinforcement practices in schools. Contemporary Issues in Education Research, 3 (1), 127-136.

Diedrich, J. L. (2010). Motivating students using positive reinforcement (Doctoral dissertation).

Dozier, C. L., Foley, E. A., Goddard, K. S., & Jess, R. L. (2019). Reinforcement. The Encyclopedia of Child and Adolescent Development, 1-10.

Ferster, C. B., & Skinner, B. F. (1957). Schedules of reinforcement. New York: Appleton-Century-Crofts.

Gunter, P. L., & Coutinho, M. J. (1997). Negative reinforcement in classrooms: What we’re beginning to learn. Teacher Education and Special Education, 20 (3), 249-264.

Joyce, N., & Faye, C. (2010). Skinner Air Crib. APS Observer, 23 (7).

Kamery, R. H. (2004, July). Motivation techniques for positive reinforcement: A review. In Allied Academies International Conference. Academy of Legal, Ethical and Regulatory Issues. Proceedings (Vol. 8, No. 2, p. 91). Jordan Whitney Enterprises, Inc.

Kohler, W. (1924). The mentality of apes. London: Routledge & Kegan Paul.

Staddon, J. E., & Niv, Y. (2008). Operant conditioning. Scholarpedia, 3 (9), 2318.

Skinner, B. F. (1938). The behavior of organisms: An experimental analysis. New York: Appleton-Century.

Skinner, B. F. (1948). "Superstition" in the pigeon. Journal of Experimental Psychology, 38, 168-172.

Skinner, B. F. (1951). How to teach animals. Freeman.

Skinner, B. F. (1953). Science and human behavior. New York: Macmillan.

Skinner, B. F. (1963). Operant behavior. American Psychologist, 18(8), 503.

Smith, S., Ferguson, C. J., & Beaver, K. M. (2018). Learning to blast a way into crime, or just good clean fun? Examining aggressive play with toy weapons and its relation with crime. Criminal behaviour and mental health, 28 (4), 313-323.

Staddon, J. E., & Cerutti, D. T. (2003). Operant conditioning. Annual Review of Psychology, 54 (1), 115-144.

Thorndike, E. L. (1898). Animal intelligence: An experimental study of the associative processes in animals. Psychological Monographs: General and Applied, 2(4), i-109.

Vu, D. (2017). An Analysis of Operant Conditioning and its Relationship with Video Game Addiction.

Watson, J. B. (1913). Psychology as the behaviorist views it. Psychological Review, 20, 158–177.

Operant Conditioning (Examples + Research)

If you're on this page, you're probably researching B.F. Skinner and his work on operant conditioning! You might be surprised to see how much conditioning you go through each day! We are conditioned to behave in certain ways every day. Our brains naturally gravitate toward the things that bring us pleasure and back away from things that bring us pain. When we connect our behaviors to pleasure and pain, we become conditioned. 

When people are subjected to reinforcements (pleasure) and punishments (pain), they undergo operant conditioning. This article will describe operant conditioning, how it works, and how different schedules of reinforcement can increase the rate at which subjects perform a certain behavior.  

What is Operant Conditioning?

Operant conditioning is a system of learning that happens by changing external variables called 'punishments' and 'rewards.' Through time and repetition, learning happens when an association is created between a certain behavior and the consequence of that behavior (good or bad).

You might also hear this concept called “instrumental conditioning” or “Skinnerian conditioning.” This second term comes from B.F. Skinner, the behaviorist who discovered operant conditioning through his work with pigeons.

He created what is now known as the “Skinner box,” a device that contained a lever, disc, or other mechanism. Something would occur when the levers were pulled or the discs were pressed. Food would appear, lights would flash, the floor would become electric, etc.

Skinner placed pigeons inside these boxes and recorded their responses, observing whether the birds became conditioned by the consequences that followed a certain task.

Based on how the pigeons’ behavior changed in response to the consequences of their actions, Skinner developed the idea of operant conditioning.

How Does Operant Conditioning Work? 

We can unearth the definition of operant conditioning by breaking it down. Skinner defined an operant as any "active behavior that operates upon the environment to generate consequences."  You get a big hug whenever you tell your mother she looks pretty. That compliment is an operant.

In operant conditioning, you can change two variables to achieve two goals. 

The variables you can change are adding a stimulus or removing a stimulus. 

The goals you can achieve are increasing a behavior or decreasing a behavior. 

Depending on what goal you're trying to achieve and how you manipulate the variable, there are four methods of operant conditioning:

  • Positive Reinforcement
  • Negative Reinforcement
  • Positive Punishment
  • Negative Punishment

                     Increase Behavior           Decrease Behavior
Add Stimulus         Positive Reinforcement      Positive Punishment
Remove Stimulus      Negative Reinforcement      Negative Punishment

Remembering operant conditioning types can be difficult, but here's a simple cheat sheet to help you. 

Reinforcement is increasing a behavior.

Punishment is decreasing a behavior.

The positive prefix means you're adding a stimulus. 

The negative prefix means you're removing the stimulus. 
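The cheat sheet boils down to a two-by-two grid, which can be expressed in a few lines of code. The function name and its string arguments below are made up for this example; they simply combine the two choices described above.

```python
def classify(stimulus_change, behavior_goal):
    """Name the operant conditioning quadrant from the two choices.
    stimulus_change: "add" or "remove"; behavior_goal: "increase" or "decrease"."""
    prefix = "positive" if stimulus_change == "add" else "negative"
    kind = "reinforcement" if behavior_goal == "increase" else "punishment"
    return f"{prefix} {kind}"

print(classify("add", "increase"))     # positive reinforcement (dessert after chores)
print(classify("remove", "increase"))  # negative reinforcement (a night off from chores)
print(classify("add", "decrease"))     # positive punishment (the duct-tape rip)
print(classify("remove", "decrease"))  # negative punishment (toys taken away)
```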

Reinforcement

Positive reinforcement sounds redundant - isn’t all reinforcement positive? In psychology, “positive” doesn’t exactly mean what you think it means. The term “positive reinforcement” simply refers to the idea that you have added a stimulus to increase a behavior. Dessert after finishing your chores is positive reinforcement.

Negative reinforcement is the removal of a stimulus to reinforce a behavior. It’s not always a negative experience. Removing debt from your account is considered negative reinforcement. A night without chores is also a negative reinforcement. 

Under the umbrella of negative reinforcement are escape and active avoidance. These types of negative reinforcement condition your behavior through the threat or existence of “bad” stimuli. 

Escape Learning

Escape learning is a crucial adaptive mechanism that enables a subject to minimize or prevent exposure to aversive stimuli. By understanding the dynamics of escape learning, we can gain insights into how organisms, including humans, respond to threatening or harmful situations. In Martin Seligman's experiments with dogs, the principle illustrated how the dogs learned to change their behavior to escape a negative stimulus. This form of learning highlights the ways in which adverse conditions can motivate behaviors that alleviate discomfort or pain.

Active Avoidance Learning

Active avoidance learning is not just a theoretical concept; it has real-world applications in understanding our daily behaviors and decision-making processes. By recognizing the patterns in which we actively avoid negative stimuli, therapists and educators can design interventions to help individuals address anxieties or phobias. For instance, we actively prevent discomfort by putting on a coat to avoid the cold. Recognizing these patterns provides a foundational understanding of how humans often make proactive choices based on past experiences to avoid potential future discomforts. This proactive behavior adjustment plays a significant role in shaping our daily decisions and habits.

Escape and active avoidance learning are integral to understanding human behavior. They offer insights into how we navigate our environment, respond to threats, and proactively shape our actions to avoid potential negative outcomes.

Punishment

In operant conditioning, punishment is described as changing a stimulus to decrease the likelihood of a behavior. Like reinforcement, there are two types of punishment: positive and negative.

Positive punishment is not a positive experience - it discourages the subject from repeating their behaviors by adding stimulus.

In The Big Bang Theory, Sheldon and the gang try to devise a plan to avoid getting off-topic. They decide to introduce a positive punishment to discourage that behavior.

The characters decide to put pieces of duct tape on their arms. When one of them gets off-topic, another person in the group would rip the duct tape off that person’s arm as a form of operant conditioning. Adding that painful feeling makes their scheme a form of positive punishment. 

Negative punishment takes something away from the subject to help discourage behavior. If your parents ever took away your access to video games or toys because you were behaving badly, they were using negative punishment to discourage you from bad behavior.  

Measuring Response and Extinction Rates 

Getting spanked for bad behavior once will not stop you from trying to get away with bad behavior. Feeling cold outside and warmer once you put on a coat will not teach you to put on a coat every time you go outside. 

Researchers use two measurements to determine the effectiveness of different operant conditioning schedules: response rate and extinction rate. 

The Response Rate is how often the subject performs the behavior to receive the reinforcement. 

The Extinction Rate is quite different. If the subject doesn’t trust that they will get reinforcement for their behavior, or does not make the connection between the behavior and the consequence, they are likely to quit performing the behavior. The extinction rate measures how quickly that behavior dies out once reinforcements are no longer given.
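To make the two measurements concrete, here is a small, entirely hypothetical example: a made-up minute-by-minute log of a subject's behavior, from which a response rate and a crude extinction measure are computed. The log contents and the 20-minute cutoff are invented for illustration.

```python
# Hypothetical log: one entry per minute of observation. Reinforcement is
# available for the first 20 minutes, then withheld; the subject keeps
# responding for a while before the behavior fades.
log = [{"minute": m,
        "responded": m < 20 or (m % 3 == 0 and m < 32),
        "reinforced_phase": m < 20}
       for m in range(40)]

# Response rate: responses per minute while reinforcement was available.
active = [e for e in log if e["reinforced_phase"]]
rate = sum(e["responded"] for e in active) / len(active)

# Extinction: minutes the behavior persisted after reinforcement stopped.
after = [e for e in log if not e["reinforced_phase"] and e["responded"]]
persisted = max(e["minute"] for e in after) - 20 + 1 if after else 0

print(f"response rate: {rate:.2f} responses per minute")
print(f"behavior persisted for {persisted} minutes after reinforcement stopped")
```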

Schedules of Reinforcement

How fast does operant conditioning happen? Can you manipulate response and extinction rates? The answer varies based on when and why you receive your reinforcement. 

Skinner understood this. Throughout his research, he observed that the timing and frequency of reinforcement or punishment greatly impacted how quickly the subject learned to perform or refrain from a behavior. These factors also make an impact on response rate. 

The different times and frequencies in which reinforcement is delivered can be identified by one of many schedules of reinforcement. Let’s look at those different schedules and how effective they are. 

Continuous reinforcement

If you think about the simplest form of operant conditioning, you are probably thinking of continuous reinforcement. When the subject performs a behavior, they earn a reinforcement. This occurs every single time.

While the response rate is fairly high initially, extinction occurs when continuous reinforcement stops. If you earn dessert every time you clean your room, you will clean your room when you want dessert. But if you clean your room and don’t earn dessert one day, you will lose trust in the reinforcement, and the behavior will likely stop. 

The next four reinforcement schedules are called partial reinforcement. Reinforcements are not delivered every single time a behavior is performed. Instead, reinforcements are distributed based on the number of behaviors performed or the amount of time that has passed.

Fixed ratio reinforcement

“Ratio” refers to the number of responses. “Fixed” refers to a consistent amount. Put them together, and you get a schedule of reinforcement with a consistent number of responses. Rewards programs often use fixed ratio reinforcement schedules to encourage customers to return. For every ten smoothies, you get one free.

Every time you spend $100, you get $20 off on your next purchase. The free smoothie and reduced purchases are both reinforcements distributed after a consistent amount of behaviors. It could take a subject two years or two weeks to reach that tenth smoothie - either way, the reinforcement is distributed after that tenth purchase. 

The rate of response becomes more rapid as subjects endure fixed ratio reinforcement. Think about people in sales who work on commission. They know they will get a $1,000 paycheck for every five items they sell - you can bet that they are pushing hard to sell those five items and earn that reinforcement faster. 

Fixed interval reinforcement

Whereas “ratio” refers to the number of responses, “interval” refers to the timing of the response. Subjects receive reinforcement after a certain amount of time has passed. You experience fixed interval reinforcement when you receive a paycheck on the 15th and 30th of every month, regardless of how often you perform a behavior.

The response rate is typically slower in situations with fixed interval reinforcement. Subjects know they will receive a reward no matter how often they perform the behavior. People in jobs with steady and consistent paychecks are often less likely to push hard and sell more products because they know they will get the same paycheck no matter how many items they sell. Other factors, like bonuses or verbal reprimands, may impact their motivation, but those extra factors don’t exist in pure fixed interval reinforcement.

Variable ratio reinforcement

When discussing reinforcement schedules, “variable” refers to something that varies after a reinforcement is given. 

Let’s go back to the example of the rewards card. On a variable ratio reinforcement schedule, the subject would receive their first free smoothie after buying ten smoothies. Once they get that first free smoothie, they only have to buy seven for another free smoothie. After that reinforcement is distributed, the subject has to buy 15 smoothies to get a free smoothie. The ratio of reinforcement is variable.

This type of schedule isn’t always used because it can be confusing - in many cases, the subject does not know how many smoothies they must purchase before getting their free one.

However, response rates are high for this type of schedule. The reinforcement is dependent on the subject’s behavior. They know they are one step closer to their reward by performing one more behavior. If they don’t get the reinforcement, they can perform one more behavior and again become one step closer to getting the reinforcement. 

Think of slot machines. You never know how often you must pull the lever before winning the jackpot. But you know you are one step closer to winning with every pull. At some point, if you just keep pulling, you will win the jackpot and receive a big reinforcement.

Variable interval reinforcement

The final reinforcement schedule identified by Skinner was that of variable interval reinforcement. By now, you can probably guess what this means. Variable interval reinforcement occurs when reinforcements are distributed after a certain amount of time has passed, but this amount varies after each reinforcement is distributed. 

In this example, let’s say you work at a retail store. At any given time, secret shoppers enter the store. If you behave correctly and sell the right items to the secret shopper, the higher-ups give you a bonus. 

This could happen anytime as long as you are performing the behavior. This schedule keeps people on their toes, encouraging a high response rate and low extinction rate. 

FAQs About Operant Conditioning 

Is Operant Conditioning Trial and Error?

Not exactly, although trial and error helped psychologists recognize operant conditioning. Through trial and error, it was discovered that reinforcements and rewards helped behaviors stick. These reinforcements (praise, treats, etc.) are the key to behaviors being performed and even repeated.

Is Operant Conditioning Behaviorism?

Behaviorism is an approach to psychology; think of operant conditioning as a theory under the umbrella of behaviorism. B.F. Skinner is considered one of the most important Behaviorists in the history of psychology. For decades, theories like operant conditioning and classical conditioning have helped shape how people approach behavior. 

Differences Between Operant Conditioning vs. Classical Conditioning

Classical conditioning ties existing behaviors (like salivating) to stimuli (like a bell). “Classical Connects.” Operant conditioning trains an animal or human to perform or refrain from certain behaviors. You don’t train a dog to salivate, but you can train a dog to sit by giving him treats when he sits.

Operant Conditioning vs. Instrumental Conditioning

Operant conditioning and instrumental conditioning refer to the same process. You are more likely to hear the term "operant conditioning" in psychology and "instrumental conditioning" in economics! However, they differ from another type of conditioning: classical conditioning.

​Can Operant Conditioning Be Used in the Classroom? 

Yes! Intentionally rewarding students for their behavior is a form of operant conditioning. If students receive praise every time they get an A, they are likelier to strive for an A on their tests and quizzes.

Everyday Examples of Operant Conditioning 

You can probably think of ways you have used operant conditioning on yourself, your child, or your pets! Reddit users see operant conditioning in video games and pet training, ...

Post from iurichibaBR in r/FFBE (Final Fantasy Brave Exvius)

When you think about FFBE, what's the first thing that comes to mind? Most of you would probably answer  CRYSTALS, PULLING, RAINBOWS, EVE! That's a clear example of Operant Conditioning. You wanna play the game every day and get that daily summons because you know you may get something awesome! And that's also why the Rainbow rates are low -- if you won them too frequently, it would lose its effect.

A cute example of operant conditioning from Narwahl_in_spaze in r/ABA (Applied Behavior Analysis)

Post from barbiegoingbad in r/Diabla

Now that Mary knows the basketball player is in the game for fame, she uses this to her advantage. Every time he does something desirable, she uses this as a reinforcement for him to continue and upgrade this behavior . After their first date went well, they went to an event together. She knows he wants adulation and to feel important, so she puts the spotlight on him and makes him look good in front of others whenever he goes out of his way to provide for her. This subconsciously makes him feel good, so he continues to provide her with what she wants and needs (in her case, gifts, money, and affection.)

Using Operant Conditioning On Yourself

We are used to operant conditioning forms set up by the natural world or authority figures. But you can also use operant conditioning on yourself or with an accountability buddy.

Here’s how you can do it yourself. You set up a fixed ratio reinforcement schedule: for every 10 note cards you write or memorize, you give yourself an hour of video games. You can set up a fixed interval reinforcement schedule: after every week of finals, you take a vacation. 

Accountabilibuddies are best for setting up variable ratio and variable interval reinforcement schedules. That way, you don’t know when the reinforcement is coming. Tell your buddy to give you your video game controller back after a random amount of note cards that you write. Or, ask them to walk into your room at random intervals. If you’re studying, they hand you a beer. If you’re not, no reinforcement.


6.3 Operant Conditioning

Learning Objectives

By the end of this section, you will be able to:

  • Define operant conditioning
  • Explain the difference between reinforcement and punishment
  • Distinguish between reinforcement schedules

The previous section of this chapter focused on the type of associative learning known as classical conditioning. Remember that in classical conditioning, something in the environment triggers a reflex automatically, and researchers train the organism to react to a different stimulus. Now we turn to the second type of associative learning, operant conditioning. In operant conditioning, organisms learn to associate a behavior and its consequence (Table 6.1). A pleasant consequence makes that behavior more likely to be repeated in the future. For example, Spirit, a dolphin at the National Aquarium in Baltimore, does a flip in the air when her trainer blows a whistle. The consequence is that she gets a fish.

Table 6.1 Classical and Operant Conditioning Compared

Conditioning approach
  Classical conditioning: An unconditioned stimulus (such as food) is paired with a neutral stimulus (such as a bell). The neutral stimulus eventually becomes the conditioned stimulus, which brings about the conditioned response (salivation).
  Operant conditioning: The target behavior is followed by reinforcement or punishment to either strengthen or weaken it, so that the learner is more likely to exhibit the desired behavior in the future.

Stimulus timing
  Classical conditioning: The stimulus occurs immediately before the response.
  Operant conditioning: The stimulus (either reinforcement or punishment) occurs soon after the response.

Psychologist B. F. Skinner saw that classical conditioning is limited to existing behaviors that are reflexively elicited, and it doesn’t account for new behaviors such as riding a bike. He proposed a theory about how such behaviors come about. Skinner believed that behavior is motivated by the consequences we receive for the behavior: the reinforcements and punishments. His idea that learning is the result of consequences is based on the law of effect, which was first proposed by psychologist Edward Thorndike. According to the law of effect, behaviors that are followed by consequences that are satisfying to the organism are more likely to be repeated, and behaviors that are followed by unpleasant consequences are less likely to be repeated (Thorndike, 1911). Essentially, if an organism does something that brings about a desired result, the organism is more likely to do it again. If an organism does something that does not bring about a desired result, the organism is less likely to do it again. An example of the law of effect is in employment. One of the reasons (and often the main reason) we show up for work is because we get paid to do so. If we stop getting paid, we will likely stop showing up—even if we love our job.

Working with Thorndike’s law of effect as his foundation, Skinner began conducting scientific experiments on animals (mainly rats and pigeons) to determine how organisms learn through operant conditioning (Skinner, 1938). He placed these animals inside an operant conditioning chamber, which has come to be known as a “Skinner box” (Figure 6.10). A Skinner box contains a lever (for rats) or disk (for pigeons) that the animal can press or peck for a food reward via the dispenser. Speakers and lights can be associated with certain behaviors. A recorder counts the number of responses made by the animal.

Link to Learning

Watch this brief video to see Skinner's interview and a demonstration of operant conditioning of pigeons to learn more.

In discussing operant conditioning, we use several everyday words—positive, negative, reinforcement, and punishment—in a specialized manner. In operant conditioning, positive and negative do not mean good and bad. Instead, positive means you are adding something, and negative means you are taking something away. Reinforcement means you are increasing a behavior, and punishment means you are decreasing a behavior. Reinforcement can be positive or negative, and punishment can also be positive or negative. All reinforcers (positive or negative) increase the likelihood of a behavioral response. All punishers (positive or negative) decrease the likelihood of a behavioral response. Now let’s combine these four terms: positive reinforcement, negative reinforcement, positive punishment, and negative punishment (Table 6.2).

Table 6.2 Positive and Negative Reinforcement and Punishment

Reinforcement
  Positive: Something is added to increase the likelihood of a behavior.
  Negative: Something is removed to increase the likelihood of a behavior.

Punishment
  Positive: Something is added to decrease the likelihood of a behavior.
  Negative: Something is removed to decrease the likelihood of a behavior.

Reinforcement

The most effective way to teach a person or animal a new behavior is with positive reinforcement. In positive reinforcement, a desirable stimulus is added to increase a behavior.

For example, you tell your five-year-old son, Jerome, that if he cleans his room, he will get a toy. Jerome quickly cleans his room because he wants a new art set. Let’s pause for a moment. Some people might say, “Why should I reward my child for doing what is expected?” But in fact we are constantly and consistently rewarded in our lives. Our paychecks are rewards, as are high grades and acceptance into our preferred school. Being praised for doing a good job and for passing a driver’s test is also a reward. Positive reinforcement as a learning tool is extremely effective. It has been found that one of the most effective ways to increase achievement in school districts with below-average reading scores was to pay the children to read. Specifically, second-grade students in Dallas were paid $2 each time they read a book and passed a short quiz about the book. The result was a significant increase in reading comprehension (Fryer, 2010). What do you think about this program? If Skinner were alive today, he would probably think this was a great idea. He was a strong proponent of using operant conditioning principles to influence students’ behavior at school. In fact, in addition to the Skinner box, he also invented what he called a teaching machine that was designed to reward small steps in learning (Skinner, 1961)—an early forerunner of computer-assisted learning. His teaching machine tested students’ knowledge as they worked through various school subjects. If students answered questions correctly, they received immediate positive reinforcement and could continue; if they answered incorrectly, they did not receive any reinforcement. The idea was that students would spend additional time studying the material to increase their chance of being reinforced the next time (Skinner, 1961).

In negative reinforcement, an undesirable stimulus is removed to increase a behavior. For example, car manufacturers use the principles of negative reinforcement in their seatbelt systems, which go “beep, beep, beep” until you fasten your seatbelt. The annoying sound stops when you exhibit the desired behavior, increasing the likelihood that you will buckle up in the future. Negative reinforcement is also used frequently in horse training. Riders apply pressure—by pulling the reins or squeezing their legs—and then remove the pressure when the horse performs the desired behavior, such as turning or speeding up. The pressure is the negative stimulus that the horse wants to remove.

Many people confuse negative reinforcement with punishment in operant conditioning, but they are two very different mechanisms. Remember that reinforcement, even when it is negative, always increases a behavior. In contrast, punishment always decreases a behavior. In positive punishment, you add an undesirable stimulus to decrease a behavior. An example of positive punishment is scolding a student to get the student to stop texting in class. In this case, a stimulus (the reprimand) is added in order to decrease the behavior (texting in class). In negative punishment, you remove a pleasant stimulus to decrease behavior. For example, when a child misbehaves, a parent can take away a favorite toy. In this case, a stimulus (the toy) is removed in order to decrease the behavior.

Punishment, especially when it is immediate, is one way to decrease undesirable behavior. For example, imagine your five-year-old son, Brandon, runs out into the street to chase a ball. You have Brandon write 100 times “I will not run into the street" (positive punishment). Chances are he won’t repeat this behavior. While strategies like this are common today, in the past children were often subject to physical punishment, such as spanking. It’s important to be aware of some of the drawbacks in using physical punishment on children. First, punishment may teach fear. Brandon may become fearful of the street, but he also may become fearful of the person who delivered the punishment—you, his parent. Similarly, children who are punished by teachers may come to fear the teacher and try to avoid school (Gershoff et al., 2010). Consequently, most schools in the United States have banned corporal punishment. Second, punishment may cause children to become more aggressive and prone to antisocial behavior and delinquency (Gershoff, 2002). They see their parents resort to spanking when they become angry and frustrated, so, in turn, they may act out this same behavior when they become angry and frustrated. For example, if you spank your child when you are angry with them for their misbehavior, they might start hitting their friends when they won’t share their toys.

While positive punishment can be effective in some cases, Skinner suggested that the use of punishment should be weighed against the possible negative effects. Today’s psychologists and parenting experts favor reinforcement over punishment—they recommend that you catch your child doing something good and reward them for it.

In his operant conditioning experiments, Skinner often used an approach called shaping. Instead of rewarding only the target behavior, in shaping, we reward successive approximations of a target behavior. Why is shaping needed? Remember that in order for reinforcement to work, the organism must first display the behavior. Shaping is needed because it is extremely unlikely that an organism will display anything but the simplest of behaviors spontaneously. In shaping, behaviors are broken down into many small, achievable steps. The specific steps used in the process are the following:

  • Reinforce any response that resembles the desired behavior.
  • Then reinforce the response that more closely resembles the desired behavior. You will no longer reinforce the previously reinforced response.
  • Next, begin to reinforce the response that even more closely resembles the desired behavior.
  • Continue to reinforce closer and closer approximations of the desired behavior.
  • Finally, only reinforce the desired behavior.

Shaping is often used in teaching a complex behavior or chain of behaviors. Skinner used shaping to teach pigeons not only such relatively simple behaviors as pecking a disk in a Skinner box, but also many unusual and entertaining behaviors, such as turning in circles, walking in figure eights, and even playing ping pong; the technique is commonly used by animal trainers today. An important part of shaping is stimulus discrimination. Recall Pavlov’s dogs—he trained them to respond to the tone of a bell, and not to similar tones or sounds. This discrimination is also important in operant conditioning and in shaping behavior.

Watch this brief video of Skinner's pigeons playing ping pong to learn more.

It’s easy to see how shaping is effective in teaching behaviors to animals, but how does shaping work with humans? Let’s consider parents whose goal is to have their child learn to clean his room. They use shaping to help him master steps toward the goal. Instead of performing the entire task, they set up these steps and reinforce each step. First, he cleans up one toy. Second, he cleans up five toys. Third, he chooses whether to pick up ten toys or put his books and clothes away. Fourth, he cleans up everything except two toys. Finally, he cleans his entire room.
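
The same step-by-step logic can be written out as a short simulation. The sketch below is a toy model, not Skinner's actual procedure: an imaginary learner's responses vary randomly around a "skill" level, each reinforced response nudges that level up, and the criterion for earning reinforcement is tightened only after the current approximation is reliably met. All of the numbers are invented for illustration.

    import random

    def emit(skill):
        """Hypothetical learner: each response varies randomly around the current skill level."""
        return random.gauss(skill, 10)

    def shape(target=80):
        """Reinforce successive approximations until the criterion reaches the target behavior."""
        skill, criterion, trials = 10.0, 15.0, 0
        while criterion <= target:
            reinforced = 0
            while reinforced < 10:              # wait for reliable success at this step
                trials += 1
                if emit(skill) >= criterion:    # reinforce responses meeting the current criterion
                    skill += 1.0                # reinforcement strengthens the response
                    reinforced += 1
            print(f"criterion {criterion:3.0f} met after {trials} total trials")
            criterion += 5.0                    # tighten the criterion: the next approximation

    random.seed(1)
    shape()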

Primary and Secondary Reinforcers

Rewards such as stickers, praise, money, toys, and more can be used to reinforce learning. Let’s go back to Skinner’s rats again. How did the rats learn to press the lever in the Skinner box? They were rewarded with food each time they pressed the lever. For animals, food would be an obvious reinforcer.

What would be a good reinforcer for humans? For your child cleaning the room, it was the promise of a toy. How about Sydney, the soccer player? If you gave Sydney a piece of candy every time Sydney scored a goal, you would be using a primary reinforcer . Primary reinforcers are reinforcers that have innate reinforcing qualities. These kinds of reinforcers are not learned. Water, food, sleep, shelter, sex, and touch, among others, are primary reinforcers. Pleasure is also a primary reinforcer. Organisms do not lose their drive for these things. For most people, jumping in a cool lake on a very hot day would be reinforcing and the cool lake would be innately reinforcing—the water would cool the person off (a physical need), as well as provide pleasure.

A secondary reinforcer has no inherent value and only has reinforcing qualities when linked with a primary reinforcer. Praise, linked to affection, is one example of a secondary reinforcer, as when you called out “Great shot!” every time Sydney made a goal. Another example, money, is only worth something when you can use it to buy other things—either things that satisfy basic needs (food, water, shelter—all primary reinforcers) or other secondary reinforcers. If you were on a remote island in the middle of the Pacific Ocean and you had stacks of money, the money would not be useful if you could not spend it. What about the stickers on the behavior chart? They also are secondary reinforcers.

Sometimes, instead of stickers on a sticker chart, a token is used. Tokens, which are also secondary reinforcers, can then be traded in for rewards and prizes. Entire behavior management systems, known as token economies, are built around the use of these kinds of token reinforcers. Token economies have been found to be very effective at modifying behavior in a variety of settings such as schools, prisons, and mental hospitals. For example, a study by Adibsereshki and Abkenar (2014) found that use of a token economy increased appropriate social behaviors and reduced inappropriate behaviors in a group of eighth-grade students. Similar studies show demonstrable gains in behavior and academic achievement for groups ranging from first grade to high school, and representing a wide array of abilities and disabilities. For example, during studies involving younger students, when children in the study exhibited appropriate behavior (not hitting or pinching), they received a “quiet hands” token. When they hit or pinched, they lost a token. The children could then exchange specified amounts of tokens for minutes of playtime.
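
The bookkeeping behind a token economy like the "quiet hands" program is simple enough to sketch in code. Everything specific here (the class name and the exchange rate of two minutes of playtime per token) is invented for illustration and is not taken from the cited studies.

    class TokenEconomy:
        """Toy model of a classroom token economy: tokens are secondary reinforcers."""

        def __init__(self, minutes_per_token=2):
            self.tokens = 0
            self.minutes_per_token = minutes_per_token  # hypothetical exchange rate

        def appropriate_behavior(self):
            self.tokens += 1                        # e.g., "quiet hands": a token is earned

        def inappropriate_behavior(self):
            self.tokens = max(0, self.tokens - 1)   # hitting or pinching: a token is lost

        def cash_out(self):
            minutes = self.tokens * self.minutes_per_token
            self.tokens = 0
            return minutes                          # tokens traded for the backup reinforcer (playtime)

    child = TokenEconomy()
    for event in ["quiet", "quiet", "hit", "quiet", "quiet"]:
        if event == "quiet":
            child.appropriate_behavior()
        else:
            child.inappropriate_behavior()
    print(f"Earned {child.cash_out()} minutes of playtime")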

Everyday Connection

Behavior modification in children.

Parents and teachers often use behavior modification to change a child’s behavior. Behavior modification uses the principles of operant conditioning to accomplish behavior change so that undesirable behaviors are switched for more socially acceptable ones. Some teachers and parents create a sticker chart, in which several behaviors are listed ( Figure 6.11 ). Sticker charts are a form of token economies, as described in the text. Each time children perform the behavior, they get a sticker, and after a certain number of stickers, they get a prize, or reinforcer. The goal is to increase acceptable behaviors and decrease misbehavior. Remember, it is best to reinforce desired behaviors, rather than to use punishment. In the classroom, the teacher can reinforce a wide range of behaviors, from students raising their hands, to walking quietly in the hall, to turning in their homework. At home, parents might create a behavior chart that rewards children for things such as putting away toys, brushing their teeth, and helping with dinner. In order for behavior modification to be effective, the reinforcement needs to be connected with the behavior; the reinforcement must matter to the child and be done consistently.

Time-out is another popular technique used in behavior modification with children. It operates on the principle of negative punishment. When a child demonstrates an undesirable behavior, they are removed from the desirable activity at hand ( Figure 6.12 ). For example, say that Sophia and her brother Mario are playing with building blocks. Sophia throws some blocks at her brother, so you give her a warning that she will go to time-out if she does it again. A few minutes later, she throws more blocks at Mario. You remove Sophia from the room for a few minutes. When she comes back, she doesn’t throw blocks.

There are several important points that you should know if you plan to implement time-out as a behavior modification technique. First, make sure the child is being removed from a desirable activity and placed in a less desirable location. If the activity is something undesirable for the child, this technique will backfire because it is more enjoyable for the child to be removed from the activity. Second, the length of the time-out is important. The general rule of thumb is one minute for each year of the child’s age. Sophia is five; therefore, she sits in a time-out for five minutes. Setting a timer helps children know how long they have to sit in time-out. Finally, as a caregiver, keep several guidelines in mind over the course of a time-out: remain calm when directing your child to time-out; ignore your child during time-out (because caregiver attention may reinforce misbehavior); and give the child a hug or a kind word when time-out is over.
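
The one-minute-per-year rule of thumb is simple arithmetic; the tiny helper below only illustrates the guideline described in the text and is not a clinical recommendation.

    def timeout_minutes(age_years):
        """Rule of thumb from the text: roughly one minute of time-out per year of age."""
        return max(1, int(age_years))

    print(timeout_minutes(5))  # Sophia is five, so she sits in time-out for five minutes -> 5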

Reinforcement Schedules

Remember, the best way to teach a person or animal a behavior is to use positive reinforcement. For example, Skinner used positive reinforcement to teach rats to press a lever in a Skinner box. At first, the rat might randomly hit the lever while exploring the box, and out would come a pellet of food. After eating the pellet, what do you think the hungry rat did next? It hit the lever again, and received another pellet of food. Each time the rat hit the lever, a pellet of food came out. When an organism receives a reinforcer each time it displays a behavior, it is called continuous reinforcement . This reinforcement schedule is the quickest way to teach someone a behavior, and it is especially effective in training a new behavior. Let’s look back at the dog that was learning to sit earlier in the chapter. Now, each time he sits, you give him a treat. Timing is important here: you will be most successful if you present the reinforcer immediately after he sits, so that he can make an association between the target behavior (sitting) and the consequence (getting a treat).

Watch this video clip of veterinarian Dr. Sophia Yin shaping a dog's behavior using the steps outlined above to learn more.

Once a behavior is trained, researchers and trainers often turn to another type of reinforcement schedule—partial reinforcement. In partial reinforcement , also referred to as intermittent reinforcement, the person or animal does not get reinforced every time they perform the desired behavior. There are several different types of partial reinforcement schedules ( Table 6.3 ). These schedules are described as either fixed or variable, and as either interval or ratio. Fixed refers to the number of responses between reinforcements, or the amount of time between reinforcements, which is set and unchanging. Variable refers to the number of responses or amount of time between reinforcements, which varies or changes. Interval means the schedule is based on the time between reinforcements, and ratio means the schedule is based on the number of responses between reinforcements.

Table 6.3. Reinforcement Schedules
  • Fixed interval: reinforcement is delivered at predictable time intervals (e.g., after 5, 10, 15, and 20 minutes). Typical result: moderate response rate with significant pauses after reinforcement. Example: a hospital patient using patient-controlled, doctor-timed pain relief.
  • Variable interval: reinforcement is delivered at unpredictable time intervals (e.g., after 5, 7, 10, and 20 minutes). Typical result: moderate yet steady response rate. Example: checking social media.
  • Fixed ratio: reinforcement is delivered after a predictable number of responses (e.g., after 2, 4, 6, and 8 responses). Typical result: high response rate with pauses after reinforcement. Example: piecework, such as a factory worker paid for every x number of items manufactured.
  • Variable ratio: reinforcement is delivered after an unpredictable number of responses (e.g., after 1, 4, 5, and 9 responses). Typical result: high and steady response rate. Example: gambling.
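
Each schedule is just a different rule for answering the question "does this particular response earn a reinforcer?" The simulation below is illustrative only; the class names, the steady response rate, and the specific numbers (every 5th response, 60-second intervals) are assumptions made up for the sketch, not values from the literature.

    import random

    class FixedRatio:
        """Reinforce every n-th response (e.g., a commission per fixed number of sales)."""
        def __init__(self, n):
            self.n, self.count = n, 0
        def respond(self, t):
            self.count += 1
            if self.count == self.n:
                self.count = 0
                return True
            return False

    class VariableRatio:
        """Reinforce after an unpredictable number of responses, averaging n (like a slot machine)."""
        def __init__(self, n):
            self.n = n
        def respond(self, t):
            return random.random() < 1.0 / self.n

    class FixedInterval:
        """Reinforce the first response once a fixed time has elapsed (like hourly pain relief)."""
        def __init__(self, period):
            self.period, self.next_t = period, period
        def respond(self, t):
            if t >= self.next_t:
                self.next_t = t + self.period
                return True
            return False

    class VariableInterval:
        """Reinforce the first response after an unpredictable time, averaging `period` seconds."""
        def __init__(self, period):
            self.period = period
            self.next_t = random.expovariate(1 / period)
        def respond(self, t):
            if t >= self.next_t:
                self.next_t = t + random.expovariate(1 / self.period)
                return True
            return False

    random.seed(0)
    schedules = {"fixed ratio": FixedRatio(5), "variable ratio": VariableRatio(5),
                 "fixed interval": FixedInterval(60), "variable interval": VariableInterval(60)}
    for name, schedule in schedules.items():
        rewards = sum(schedule.respond(t) for t in range(0, 600, 5))  # one response every 5 seconds
        print(f"{name:>17}: {rewards} reinforcers earned in 10 minutes of responding")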

Now let’s combine these four terms. A fixed interval reinforcement schedule is when behavior is rewarded after a set amount of time. For example, June undergoes major surgery in a hospital. During recovery, they are expected to experience pain and will require prescription medications for pain relief. June is given an IV drip with a patient-controlled painkiller. Their doctor sets a limit: one dose per hour. June pushes a button when pain becomes difficult, and they receive a dose of medication. Since the reward (pain relief) only occurs on a fixed interval, there is no point in exhibiting the behavior when it will not be rewarded.

With a variable interval reinforcement schedule , the person or animal gets the reinforcement based on varying amounts of time, which are unpredictable. Say that Manuel is the manager at a fast-food restaurant. Every once in a while someone from the quality control division comes to Manuel’s restaurant. If the restaurant is clean and the service is fast, everyone on that shift earns a $20 bonus. Manuel never knows when the quality control person will show up, so he always tries to keep the restaurant clean and ensures that his employees provide prompt and courteous service. His productivity regarding prompt service and keeping a clean restaurant are steady because he wants his crew to earn the bonus.

With a fixed ratio reinforcement schedule, there are a set number of responses that must occur before the behavior is rewarded. Carla sells glasses at an eyeglass store, and she earns a commission every time she sells a pair of glasses. She always tries to sell people more pairs of glasses, including prescription sunglasses or a backup pair, so she can increase her commission. She does not care whether the person really needs the prescription sunglasses; Carla just wants her bonus. The quality of what Carla sells does not matter because her commission is not based on quality; it’s only based on the number of pairs sold. This distinction in the quality of performance can help determine which reinforcement method is most appropriate for a particular situation. Fixed ratios are better suited to optimize the quantity of output, whereas a fixed interval, in which the reward is not quantity based, can lead to a higher quality of output.

In a variable ratio reinforcement schedule , the number of responses needed for a reward varies. This is the most powerful partial reinforcement schedule. An example of the variable ratio reinforcement schedule is gambling. Imagine that Sarah—generally a smart, thrifty woman—visits Las Vegas for the first time. She is not a gambler, but out of curiosity she puts a quarter into the slot machine, and then another, and another. Nothing happens. Two dollars in quarters later, her curiosity is fading, and she is just about to quit. But then, the machine lights up, bells go off, and Sarah gets 50 quarters back. That’s more like it! Sarah gets back to inserting quarters with renewed interest, and a few minutes later she has used up all her gains and is $10 in the hole. Now might be a sensible time to quit. And yet, she keeps putting money into the slot machine because she never knows when the next reinforcement is coming. She keeps thinking that with the next quarter she could win $50, or $100, or even more. Because the reinforcement schedule in most types of gambling has a variable ratio schedule, people keep trying and hoping that the next time they will win big. This is one of the reasons that gambling is so addictive—and so resistant to extinction.
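
Sarah's session is easy to reproduce with a short simulation. The payout probability and size below are invented (real machines differ), but the pattern they produce, long dry stretches punctuated by an occasional win and a steady loss on average, is exactly what makes a variable-ratio payoff so hard to walk away from.

    import random

    def slot_session(pulls=200, cost=0.25, win_chance=1 / 80, payout=50 * 0.25):
        """Toy slot machine: every pull costs a quarter; roughly 1 pull in 80 pays out 50 quarters."""
        balance = 0.0
        for pull in range(1, pulls + 1):
            balance -= cost
            if random.random() < win_chance:           # unpredictable: a variable-ratio payoff
                balance += payout
                print(f"pull {pull:3d}: WIN!  running balance ${balance:+.2f}")
        expected_loss = cost - win_chance * payout      # the house edge, per pull
        print(f"after {pulls} pulls: balance ${balance:+.2f} (expected loss per pull ${expected_loss:.3f})")

    random.seed(4)
    slot_session()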

In operant conditioning, extinction of a reinforced behavior occurs at some point after reinforcement stops, and the speed at which this happens depends on the reinforcement schedule. In a variable ratio schedule, the point of extinction comes very slowly, as described above. But in the other reinforcement schedules, extinction may come quickly. For example, if June presses the button for the pain relief medication before the allotted time the doctor has approved, no medication is administered. They are on a fixed interval reinforcement schedule (dosed hourly), so extinction occurs quickly when reinforcement doesn’t come at the expected time. Among the reinforcement schedules, variable ratio is the most productive and the most resistant to extinction. Fixed interval is the least productive and the easiest to extinguish ( Figure 6.13 ).

Connect the Concepts

Gambling and the brain.

Skinner (1953) stated, “If the gambling establishment cannot persuade a patron to turn over money with no return, it may achieve the same effect by returning part of the patron's money on a variable-ratio schedule” (p. 397).

Skinner uses gambling as an example of the power of the variable-ratio reinforcement schedule for maintaining behavior even during long periods without any reinforcement. In fact, Skinner was so confident in his knowledge of gambling addiction that he even claimed he could turn a pigeon into a pathological gambler (“Skinner’s Utopia,” 1971). It is indeed true that variable-ratio schedules keep behavior quite persistent—just imagine the frequency of a child’s tantrums if a parent gives in even once to the behavior. The occasional reward makes it almost impossible to stop the behavior.

Recent research in rats has failed to support Skinner’s idea that training on variable-ratio schedules alone causes pathological gambling (Laskowski et al., 2019). However, other research suggests that gambling does seem to work on the brain in the same way as most addictive drugs, and so there may be some combination of brain chemistry and reinforcement schedule that could lead to problem gambling ( Figure 6.14 ). Specifically, modern research shows the connection between gambling and the activation of the reward centers of the brain that use the neurotransmitter (brain chemical) dopamine (Murch & Clark, 2016). Interestingly, gamblers don’t even have to win to experience the “rush” of dopamine in the brain. “Near misses,” or almost winning but not actually winning, also have been shown to increase activity in the ventral striatum and other brain reward centers that use dopamine (Chase & Clark, 2010). These brain effects are almost identical to those produced by addictive drugs like cocaine and heroin (Murch & Clark, 2016). Based on the neuroscientific evidence showing these similarities, the DSM-5 now considers gambling an addiction, while earlier versions of the DSM classified gambling as an impulse control disorder.

In addition to dopamine, gambling also appears to involve other neurotransmitters, including norepinephrine and serotonin (Potenza, 2013). Norepinephrine is secreted when a person feels stress, arousal, or thrill. It may be that pathological gamblers use gambling to increase their levels of this neurotransmitter. Deficiencies in serotonin might also contribute to compulsive behavior, including a gambling addiction (Potenza, 2013).

It may be that pathological gamblers’ brains are different than those of other people, and perhaps this difference may somehow have led to their gambling addiction, as these studies seem to suggest. However, it is very difficult to ascertain the cause because it is impossible to conduct a true experiment (it would be unethical to try to turn randomly assigned participants into problem gamblers). Therefore, it may be that causation actually moves in the opposite direction—perhaps the act of gambling somehow changes neurotransmitter levels in some gamblers’ brains. It also is possible that some overlooked factor, or confounding variable, played a role in both the gambling addiction and the differences in brain chemistry.

Cognition and Latent Learning

Strict behaviorists like Watson and Skinner focused exclusively on studying behavior rather than cognition (such as thoughts and expectations). In fact, Skinner was such a staunch believer that cognition didn't matter that his ideas were considered radical behaviorism . Skinner considered the mind a "black box"—something completely unknowable—and, therefore, something not to be studied. However, another behaviorist, Edward C. Tolman, had a different opinion. Tolman’s experiments with rats demonstrated that organisms can learn even if they do not receive immediate reinforcement (Tolman & Honzik, 1930; Tolman, Ritchie, & Kalish, 1946). This finding was in conflict with the prevailing idea at the time that reinforcement must be immediate in order for learning to occur, thus suggesting a cognitive aspect to learning.

In the experiments, Tolman placed hungry rats in a maze with no reward for finding their way through it. He also studied a comparison group that was rewarded with food at the end of the maze. As the unreinforced rats explored the maze, they developed a cognitive map : a mental picture of the layout of the maze ( Figure 6.15 ). After 10 sessions in the maze without reinforcement, food was placed in a goal box at the end of the maze. As soon as the rats became aware of the food, they were able to find their way through the maze quickly, just as quickly as the comparison group, which had been rewarded with food all along. This is known as latent learning : learning that occurs but is not observable in behavior until there is a reason to demonstrate it.
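
A toy model helps make the idea of a cognitive map concrete: during unreinforced wandering the rat stores which junctions connect to which, and the moment food appears, that stored map is enough to plan the short route without any new trial-and-error. The maze layout and names below are invented purely for illustration, not taken from Tolman's apparatus.

    from collections import deque

    # A made-up maze: junctions and the corridors that connect them (an undirected graph).
    maze = {
        "start": ["A", "B"],
        "A": ["start", "C"],
        "B": ["start", "D"],
        "C": ["A", "goal_box"],
        "D": ["B"],
        "goal_box": ["C"],
    }

    # Latent learning phase: wandering with no reward still builds a cognitive map.
    # (Here we simply assume the rat has explored everything and copy the layout.)
    cognitive_map = {junction: list(neighbors) for junction, neighbors in maze.items()}

    def shortest_route(graph, start, goal):
        """Breadth-first search over the stored map; no further exploration is needed."""
        frontier, seen = deque([[start]]), {start}
        while frontier:
            path = frontier.popleft()
            if path[-1] == goal:
                return path
            for nxt in graph[path[-1]]:
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(path + [nxt])
        return None

    # Once food appears in the goal box, the stored map immediately yields the route.
    print(" -> ".join(shortest_route(cognitive_map, "start", "goal_box")))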

Latent learning also occurs in humans. Children may learn by watching the actions of their parents but only demonstrate it at a later date, when the learned material is needed. For example, suppose that Ravi’s dad drives him to school every day. In this way, Ravi learns the route from his house to his school, but he’s never driven there himself, so he has not had a chance to demonstrate that he’s learned the way. One morning Ravi’s dad has to leave early for a meeting, so he can’t drive Ravi to school. Instead, Ravi follows the same route on his bike that his dad would have taken in the car. This demonstrates latent learning. Ravi had learned the route to school, but had no need to demonstrate this knowledge earlier.

This Place Is Like a Maze

Have you ever gotten lost in a building and couldn’t find your way back out? While that can be frustrating, you’re not alone. At one time or another we’ve all gotten lost in places like a museum, hospital, or university library. Whenever we go someplace new, we build a mental representation—or cognitive map—of the location, as Tolman’s rats built a cognitive map of their maze. However, some buildings are confusing because they include many areas that look alike or have short lines of sight. Because of this, it’s often difficult to predict what’s around a corner or decide whether to turn left or right to get out of a building. Psychologist Laura Carlson (2010) suggests that what we place in our cognitive map can impact our success in navigating through the environment. She suggests that paying attention to specific features upon entering a building, such as a picture on the wall, a fountain, a statue, or an escalator, adds information to our cognitive map that can be used later to help find our way out of the building.

Watch this video about Carlson's studies on cognitive maps and navigation in buildings to learn more.

Want to cite, share, or modify this book? This book uses the Creative Commons Attribution License and you must attribute OpenStax.

Access for free at https://openstax.org/books/psychology-2e/pages/1-introduction
  • Authors: Rose M. Spielman, William J. Jenkins, Marilyn D. Lovett
  • Publisher/website: OpenStax
  • Book title: Psychology 2e
  • Publication date: Apr 22, 2020
  • Location: Houston, Texas
  • Book URL: https://openstax.org/books/psychology-2e/pages/1-introduction
  • Section URL: https://openstax.org/books/psychology-2e/pages/6-3-operant-conditioning

© Jan 6, 2024 OpenStax. Textbook content produced by OpenStax is licensed under a Creative Commons Attribution License . The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.

Operant Conditioning

OpenStaxCollege

Learning Objectives

By the end of this section, you will be able to:

  • Define operant conditioning
  • Explain the difference between reinforcement and punishment
  • Distinguish between reinforcement schedules

The previous section of this chapter focused on the type of associative learning known as classical conditioning. Remember that in classical conditioning, something in the environment triggers a reflex automatically, and researchers train the organism to react to a different stimulus. Now we turn to the second type of associative learning, operant conditioning. In operant conditioning, organisms learn to associate a behavior and its consequence (see the comparison below). A pleasant consequence makes that behavior more likely to be repeated in the future. For example, Spirit, a dolphin at the National Aquarium in Baltimore, does a flip in the air when her trainer blows a whistle. The consequence is that she gets a fish.

Classical and Operant Conditioning Compared
  • Conditioning approach. Classical conditioning: an unconditioned stimulus (such as food) is paired with a neutral stimulus (such as a bell); the neutral stimulus eventually becomes the conditioned stimulus, which brings about the conditioned response (salivation). Operant conditioning: the target behavior is followed by reinforcement or punishment to either strengthen or weaken it, so that the learner is more likely to exhibit the desired behavior in the future.
  • Stimulus timing. Classical conditioning: the stimulus occurs immediately before the response. Operant conditioning: the stimulus (either reinforcement or punishment) occurs soon after the response.

Psychologist B. F. Skinner saw that classical conditioning is limited to existing behaviors that are reflexively elicited, and it doesn’t account for new behaviors such as riding a bike. He proposed a theory about how such behaviors come about. Skinner believed that behavior is motivated by the consequences we receive for the behavior: the reinforcements and punishments. His idea that learning is the result of consequences is based on the law of effect, which was first proposed by psychologist Edward Thorndike . According to the law of effect , behaviors that are followed by consequences that are satisfying to the organism are more likely to be repeated, and behaviors that are followed by unpleasant consequences are less likely to be repeated (Thorndike, 1911). Essentially, if an organism does something that brings about a desired result, the organism is more likely to do it again. If an organism does something that does not bring about a desired result, the organism is less likely to do it again. An example of the law of effect is in employment. One of the reasons (and often the main reason) we show up for work is because we get paid to do so. If we stop getting paid, we will likely stop showing up—even if we love our job.

Working with Thorndike’s law of effect as his foundation, Skinner began conducting scientific experiments on animals (mainly rats and pigeons) to determine how organisms learn through operant conditioning (Skinner, 1938). He placed these animals inside an operant conditioning chamber, which has come to be known as a “Skinner box.” A Skinner box contains a lever (for rats) or disk (for pigeons) that the animal can press or peck for a food reward via the dispenser. Speakers and lights can be associated with certain behaviors. A recorder counts the number of responses made by the animal.

Figure: a photograph of B. F. Skinner, and an illustration of a rat in a Skinner box, a chamber with a speaker, lights, a lever, and a food dispenser.

Watch this brief video clip to learn more about operant conditioning: Skinner is interviewed, and operant conditioning of pigeons is demonstrated.

In discussing operant conditioning, we use several everyday words—positive, negative, reinforcement, and punishment—in a specialized manner. In operant conditioning, positive and negative do not mean good and bad. Instead, positive means you are adding something, and negative means you are taking something away. Reinforcement means you are increasing a behavior, and punishment means you are decreasing a behavior. Reinforcement can be positive or negative, and punishment can also be positive or negative. All reinforcers (positive or negative) increase the likelihood of a behavioral response. All punishers (positive or negative) decrease the likelihood of a behavioral response. Now let’s combine these four terms: positive reinforcement, negative reinforcement, positive punishment, and negative punishment (summarized below).

Positive and Negative Reinforcement and Punishment
  • Positive reinforcement: something is added to increase the likelihood of a behavior.
  • Positive punishment: something is added to decrease the likelihood of a behavior.
  • Negative reinforcement: something is removed to increase the likelihood of a behavior.
  • Negative punishment: something is removed to decrease the likelihood of a behavior.

REINFORCEMENT

The most effective way to teach a person or animal a new behavior is with positive reinforcement. In positive reinforcement , a desirable stimulus is added to increase a behavior.

For example, you tell your five-year-old son, Jerome, that if he cleans his room, he will get a toy. Jerome quickly cleans his room because he wants a new art set. Let’s pause for a moment. Some people might say, “Why should I reward my child for doing what is expected?” But in fact we are constantly and consistently rewarded in our lives. Our paychecks are rewards, as are high grades and acceptance into our preferred school. Being praised for doing a good job and for passing a driver’s test is also a reward. Positive reinforcement as a learning tool is extremely effective. It has been found that one of the most effective ways to increase achievement in school districts with below-average reading scores was to pay the children to read. Specifically, second-grade students in Dallas were paid $2 each time they read a book and passed a short quiz about the book. The result was a significant increase in reading comprehension (Fryer, 2010). What do you think about this program? If Skinner were alive today, he would probably think this was a great idea. He was a strong proponent of using operant conditioning principles to influence students’ behavior at school. In fact, in addition to the Skinner box, he also invented what he called a teaching machine that was designed to reward small steps in learning (Skinner, 1961)—an early forerunner of computer-assisted learning. His teaching machine tested students’ knowledge as they worked through various school subjects. If students answered questions correctly, they received immediate positive reinforcement and could continue; if they answered incorrectly, they did not receive any reinforcement. The idea was that students would spend additional time studying the material to increase their chance of being reinforced the next time (Skinner, 1961).

Operant conditioning is based on the work of B. F. Skinner. Operant conditioning is a form of learning in which the motivation for a behavior happens after the behavior is demonstrated. An animal or a human receives a consequence after performing a specific behavior. The consequence is either a reinforcer or a punisher. All reinforcement (positive or negative) increases the likelihood of a behavioral response. All punishment (positive or negative) decreases the likelihood of a behavioral response. Several types of reinforcement schedules are used to reward behavior, depending on either a set or variable period of time or a set or variable number of responses.

Review Questions

________ is when you take away a pleasant stimulus to stop a behavior.

  • positive reinforcement
  • negative reinforcement
  • positive punishment
  • negative punishment

Which of the following is not an example of a primary reinforcer?

Rewarding successive approximations toward a target behavior is ________.

Slot machines reward gamblers with money according to which reinforcement schedule?

  • fixed ratio
  • variable ratio
  • fixed interval
  • variable interval

Critical Thinking Questions

What is a Skinner box and what is its purpose?

A Skinner box is an operant conditioning chamber used to train animals such as rats and pigeons to perform certain behaviors, like pressing a lever. When the animals perform the desired behavior, they receive a reward: food or water.

What is the difference between negative reinforcement and punishment?

In negative reinforcement you are taking away an undesirable stimulus in order to increase the frequency of a certain behavior (e.g., buckling your seat belt stops the annoying beeping sound in your car and increases the likelihood that you will wear your seatbelt). Punishment is designed to reduce a behavior (e.g., you scold your child for running into the street in order to decrease the unsafe behavior).

What is shaping and how would you use shaping to teach a dog to roll over?

Shaping is an operant conditioning method in which you reward closer and closer approximations of the desired behavior. If you want to teach your dog to roll over, you might reward him first when he sits, then when he lies down, and then when he lies down and rolls onto his back. Finally, you would reward him only when he completes the entire sequence: lying down, rolling onto his back, and then continuing to roll over to his other side.

Personal Application Questions

Explain the difference between negative reinforcement and punishment, and provide several examples of each based on your own experiences.

Think of a behavior that you have that you would like to change. How could you use behavior modification, specifically positive reinforcement, to change your behavior? What is your positive reinforcer?

Operant Conditioning Copyright © 2014 by OpenStaxCollege is licensed under a Creative Commons Attribution 4.0 International License , except where otherwise noted.

CogniFit Blog: Brain Health News


Brain Training, Mental Health, and Wellness


Operant Conditioning – 4 Interesting Experiments by B.F. Skinner


Operant conditioning might sound like something out of a dystopian novel. But it’s not. It’s a very real thing that was forged by a brilliant, yet quirky, psychologist. Today, we will take a quick look at his work as well as a few odd experiments that went with it…

There are few names in psychology more well-known than B. F. Skinner. First-year psychology students scribble endless lecture notes on him. Doctoral candidates cite his work in their dissertations as they test whether a rat’s behavior can be used to predict behavior in humans.

Skinner is one of the most well-known psychologists of our time, famous for his experiments on operant conditioning. But how did he become such a central figure in those Intro to Psych courses? And how did he develop the theories and methodologies cited by those sleep-deprived Ph.D. students?

THE FATHER OF OPERANT CONDITIONING

Skinner spent his life studying the way we behave and act. But, more importantly, how this behavior can be modified.

He viewed Ivan Pavlov’s classical model of behavioral conditioning as “too simplistic a solution” to fully explain the complexities of human (and animal) behavior and learning. It was because of this that Skinner started looking for a better way to explain why we do things.

His early work was based on Edward Thorndike’s 1898 Law of Effect. Skinner went on to expand on the idea that most of our behavior is directly related to the consequences of said behavior. His expanded model of behavioral learning would be called operant conditioning. It centered around two things…

  • The concepts of behaviors – the actions an organism or test subject exhibits
  • The operants – the environmental response/consequences directly following the behavior

But it’s important to note that the term “consequences” can be misleading: the response doesn’t need to be causally produced by the behavior; it only needs to follow it. Skinner broke these responses down into three types.

1. REINFORCERS – These give the organism a desirable stimulus and serve to increase the frequency of the behavior.

2. PUNISHERS – These are environmental responses that present an undesirable stimulus and serve to reduce the frequency of the behavior.

3. NEUTRAL OPERANTS – As the name suggests, these present stimuli that neither increase nor decrease the tested behavior.

Throughout his long and storied career, Skinner performed a number of strange experiments trying to test the limits of how punishment and reinforcement affect behavior.

4 INTERESTING OPERANT EXPERIMENTS

Though Skinner was a professional through and through, he was also quite a quirky person. And, his unique ways of thinking are very clear in the strange and interesting experiments he performed while researching the properties of operant conditioning.

Experiment #1: The Operant Conditioning Chamber

The Operant Conditioning Chamber, better known as the Skinner Box , is a device that B.F. Skinner used in many of his experiments. At its most basic, the Skinner Box is a chamber where a test subject, such as a rat or a pigeon, must ‘learn’ the desired behavior through trial and error.

B.F. Skinner used this device for several different experiments. One such experiment involves placing a hungry rat into a chamber with a lever and a slot where food is dispensed when the lever is pressed. Another variation involves placing a rat into an enclosure that is wired with a slight electric current on the floor. When the current is turned on, the rat must turn a wheel in order to turn off the current.  

Though this is the most basic experiment in operant conditioning research, there is an infinite number of variations that can be created based on this simple idea.

Experiment #2: A Pigeon That Can Read

Building on the basic ideas from his work with the Operant Conditioning Chamber, B. F. Skinner eventually began designing more and more complex experiments.

One of these experiments involved teaching a pigeon to read words presented to it in order to receive food. Skinner began by teaching the pigeon a simple task, namely, pecking a colored disk, in order to receive a reward. He then began adding additional environmental cues (in this case, they were words), which were paired with a specific behavior that was required in order to receive the reward.

Through this evolving process, Skinner was able to teach the pigeon to ‘read’ and respond to several unique commands.

Though the pigeon can’t actually read English, the fact that Skinner was able to teach a bird multiple behaviors, each one linked to a specific stimulus, by using operant conditioning shows us that this form of behavioral learning can be a powerful tool for teaching both animals and humans complex behaviors based on environmental cues.

Experiment #3: Pigeon Ping-Pong

But Skinner wasn’t only concerned with teaching pigeons how to read. It seems he also made sure they had time to play games as well. In one of his more whimsical experiments , B. F. Skinner taught a pair of common pigeons how to play a simplified version of table tennis.

The pigeons in this experiment were placed on either side of a box and were taught to peck the ball to the other bird’s side. If a pigeon was able to peck the ball across the table and past their opponent, they were rewarded with a small amount of food. This reward served to reinforce the behavior of pecking the ball past their opponent.

Though this may seem like a silly task to teach a bird, the ping-pong experiment shows that operant conditioning can be used not only for a specific, robot-like action but also to teach dynamic, goal-based behaviors.

Experiment #4: Pigeon-Guided Missiles

Thought pigeons playing ping-pong was as strange as things could get? Skinner pushed the envelope even further with his work on pigeon-guided missiles.

While this may sound like the crazy experiment of a deluded mad scientist, B. F. Skinner did actually work on training pigeons to control the flight paths of missiles for the U.S. military during the Second World War.

Skinner began by training the pigeons to peck at shapes on a screen. Once the pigeons reliably tracked these shapes, Skinner was able to use sensors to track whether the pigeon’s beak was in the center of the screen, to one side or the other, or towards the top or bottom of the screen. Based on the relative location of the pigeon’s beak, the tracking system could direct the missile towards the target location.
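The guidance idea described above can be caricatured in a few lines of code: the beak’s offset from the center of the screen becomes a proportional steering correction. This is only a toy sketch of the principle, not Skinner’s actual electromechanical system; the function name, screen dimensions, and gain value are invented for the illustration.

```python
# Toy illustration (hypothetical): the pecked position relative to screen center
# is turned into proportional left/right and up/down steering corrections.

def steering_correction(beak_x, beak_y, screen_w=100, screen_h=100, gain=0.5):
    """Map a peck at (beak_x, beak_y) to corrections for the missile's tail fins."""
    dx = beak_x - screen_w / 2   # how far the target has drifted horizontally
    dy = beak_y - screen_h / 2   # how far the target has drifted vertically
    return gain * dx, gain * dy  # proportional corrections to re-center the target

# A peck at (70, 40) on a 100x100 screen means the target is off-center,
# so non-zero corrections are fed back to the guidance system.
print(steering_correction(70, 40))   # -> (10.0, -5.0)
```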

Though the system was never used in the field due in part to advances in other scientific areas, it highlights the unique applications that can be created using operant training for animal behaviors.

THE CONTINUED IMPACT OF OPERANT CONDITIONING

B. F. Skinner is one of the most recognizable names in modern psychology, and with good reason. Though many of his experiments seem outlandish, the science behind them continues to impact us in ways we rarely think about.

The most prominent example is in the way we train animals for tasks such as search and rescue, companion services for the blind and disabled, and even how we train our furry friends at home—but the benefits of his research go far beyond teaching Fido how to roll over.

Operant conditioning research has found its way into the way schools motivate and discipline students, how prisons rehabilitate inmates, and even in how governments handle geopolitical relationships .



Department of Psychology


B. F. Skinner

B.F. Skinner (Image Source: Wikimedia Commons)

“To say that a reinforcement is contingent upon a response may mean nothing more than that it follows the response. It may follow because of some mechanical connection or because of the mediation of another organism; but conditioning takes place presumably because of the temporal relation only, expressed in terms of the order and proximity of response and reinforcement. Whenever we present a state of affairs which is known to be reinforcing at a given drive, we must suppose that conditioning takes place, even though we have paid no attention to the behavior of the organism in making the presentation.”

– B.F. Skinner, “‘Superstition’ in the Pigeon” (p. 168)

In the 20th century, many of the images that came to mind when thinking about experimental psychology were tied to the work of Burrhus Frederic Skinner. The stereotype of a bespectacled experimenter in a white lab coat, engaged in shaping behavior through the operant conditioning of lab rats or pigeons in contraptions known as Skinner boxes, comes directly from Skinner’s immeasurably influential research.

Although he originally intended to make a career as a writer, Skinner received his Ph.D. in psychology from Harvard in 1931, and stayed on as a researcher until 1936, when he departed to take academic posts at the University of Minnesota and Indiana University.  He returned to Harvard in 1948 as a professor, and was the Edgar Pierce Professor of Psychology from 1958 until he retired in 1974. 

Skinner was influenced by John B. Watson’s philosophy of psychology called behaviorism, which rejected not just the introspective method and the elaborate psychoanalytic theories of Freud and Jung, but any psychological explanation based on mental states or internal representations such as beliefs, desires, memories, and plans. The very idea of “mind” was dismissed as a pre-scientific superstition, not amenable to empirical investigation. Skinner argued that the goal of a science of psychology was to predict and control an organism’s behavior from its current stimulus situation and its history of reinforcement. In a utopian novel called Walden Two and a 1971 bestseller called Beyond Freedom and Dignity, he argued that human behavior was always controlled by its environment. According to Skinner, the future of humanity depended on abandoning the concepts of individual freedom and dignity and engineering the human environment so that behavior was controlled systematically and to desirable ends rather than haphazardly.

 In the laboratory, Skinner refined the concept of operant conditioning and the Law of Effect. Among his contributions were a systematic exploration of intermittent schedules of reinforcement, the shaping of novel behavior through successive approximations, the chaining of complex behavioral sequences via secondary (learned) reinforcers, and “superstitious” (accidentally reinforced) behavior.

Skinner was also an inveterate inventor. Among his gadgets were the “Skinner box” for shaping and counting lever-pressing in rats and key-pecking in pigeons; the cumulative recorder, a mechanism for recording rates of behavior as a pen tracing; a World-War II-era missile guidance system (never deployed) in which a trained pigeon in the missile’s transparent nose cone continually pecked at the target; and “teaching machines” for “programmed learning,” in which students were presented a sentence at a time and then filled in the blank in a similar sentence, shown in a small window. He achieved notoriety for a mid-1950s Life magazine article showcasing his “air crib,” a temperature-controlled glass box in which his infant daughter would play. This led to the urban legend, occasionally heard to this day, that Skinner “experimented on his daughter” or “raised her in a box” and that she grew up embittered and maladjusted, all of which are false.

B.F. Skinner was ranked by the American Psychological Association as the 20th century’s most eminent psychologist.

B. F. Skinner. (1998).  Public Broadcasting Service.  Retrieved December 12, 2007, from:  http://www.pbs.org/wgbh/aso/databank/entries/bhskin.html

Eminent psychologists of the 20th century.  (July/August, 2002). Monitor on Psychology, 33(7), p.29.

Skinner, B. F. (1948). ‘Superstition’ in the pigeon. Journal of Experimental Psychology, 38, 168-172.

Skinner, B. F. (1959) Cumulative record. New York: Appleton Century Crofts.

Bjork, D. W. (1991). Burrhus Frederick Skinner: The contingencies of a life.  In: Kimble, G. A. & Wertheimer, M. [Eds.]  Portraits of Pioneers in Psychology. 


56 Operant Conditioning

Learning Objectives

By the end of this section, you will be able to:

  • Define operant conditioning
  • Explain the difference between reinforcement and punishment
  • Distinguish between reinforcement schedules
  • Define insight and latent learning

The previous section of this chapter focused on the type of associative learning known as classical conditioning. Remember that in classical conditioning, something in the environment triggers a reflex automatically, and researchers train the organism to react to a different stimulus. Now we turn to the second type of associative learning,  operant conditioning . In operant conditioning, organisms learn to associate a behaviour and its consequence ( Table L.1 ). A pleasant consequence makes that behaviour more likely to be repeated in the future. For example, Spirit, a dolphin at the National Aquarium in Baltimore, does a flip in the air when Spirit’s trainer blows a whistle. The consequence is that Spirit gets a fish.

Table L.1 Classical and Operant Conditioning Compared

  • Conditioning approach. Classical conditioning: an unconditioned stimulus (such as food) is paired with a neutral stimulus (such as a bell). The neutral stimulus eventually becomes the conditioned stimulus, which brings about the conditioned response (salivation). Operant conditioning: the target behaviour is followed by reinforcement or punishment to either strengthen or weaken it, so that the learner is more likely to exhibit the desired behaviour in the future.
  • Stimulus timing. Classical conditioning: the stimulus occurs immediately before the response. Operant conditioning: the stimulus (either reinforcement or punishment) occurs soon after the response.

Psychologist B. F.  Skinner  saw that classical conditioning is limited to existing behaviours that are reflexively elicited, and it doesn’t account for new behaviours such as riding a bike. He proposed a theory about how such behaviours come about. Skinner believed that behaviour is motivated by the consequences we receive for the behaviour: the reinforcements and punishments. His idea that learning is the result of consequences is based on the law of effect, which was first proposed by psychologist Edward  Thorndike . According to the  law of effect , behaviours that are followed by consequences that are satisfying to the organism are more likely to be repeated, and behaviours that are followed by unpleasant consequences are less likely to be repeated (Thorndike, 1911). Essentially, if an organism does something that brings about a desired result, the organism is more likely to do it again. If an organism does something that does not bring about a desired result, the organism is less likely to do it again. An example of the law of effect is in employment. One of the reasons (and often the main reason) we show up for work is because we get paid to do so. If we stop getting paid, we will likely stop showing up—even if we love our job.

Working with Thorndike’s law of effect as his foundation, Skinner began conducting scientific experiments on animals (mainly rats and pigeons) to determine how organisms learn through operant conditioning (Skinner, 1938). He placed these animals inside an operant conditioning chamber, which has come to be known as a “Skinner box” ( Figure L.10 ). A Skinner box contains a lever (for rats) or disk (for pigeons) that the animal can press or peck for a food reward via the dispenser. Speakers and lights can be associated with certain behaviours. A recorder counts the number of responses made by the animal.

A photograph shows B.F. Skinner. An illustration shows a rat in a Skinner box: a chamber with a speaker, lights, a lever, and a food dispenser.


In discussing operant conditioning, we use several everyday words—positive, negative, reinforcement, and punishment—in a specialized manner. In operant conditioning, positive and negative do not mean good and bad. Instead,  positive  means you are adding something, and  negative  means you are taking something away.  Reinforcement  means you are increasing a behaviour, and  punishment  means you are decreasing a behaviour. Reinforcement can be positive or negative, and punishment can also be positive or negative. All reinforcers (positive or negative)  increase  the likelihood of a behavioural response. All punishers (positive or negative)  decrease  the likelihood of a behavioural response. Now let’s combine these four terms: positive reinforcement, negative reinforcement, positive punishment, and negative punishment ( Table L.2 ).

Table L.2 Positive and Negative Reinforcement and Punishment

  • Positive reinforcement: something is added to increase the likelihood of a behaviour.
  • Negative reinforcement: something is removed to increase the likelihood of a behaviour.
  • Positive punishment: something is added to decrease the likelihood of a behaviour.
  • Negative punishment: something is removed to decrease the likelihood of a behaviour.
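One way to keep these four combinations straight is to treat them as a simple two-way lookup: what happens to the stimulus (something is added or removed) and what happens to the behaviour (it increases or decreases). The snippet below is only a mnemonic sketch, not anything from the text; the dictionary name and keys are invented for the example.

```python
# Mnemonic sketch (hypothetical): the four operant-conditioning terms as a lookup
# keyed on what happens to the stimulus and what happens to the behaviour.

OPERANT_TERMS = {
    ("added",   "increase"): "positive reinforcement",
    ("removed", "increase"): "negative reinforcement",
    ("added",   "decrease"): "positive punishment",
    ("removed", "decrease"): "negative punishment",
}

# A seatbelt alarm that stops beeping once you buckle up: something is removed
# and buckling becomes more likely.
print(OPERANT_TERMS[("removed", "increase")])   # negative reinforcement

# Taking away a favourite toy after misbehaviour: something is removed and the
# misbehaviour becomes less likely.
print(OPERANT_TERMS[("removed", "decrease")])   # negative punishment
```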

The most effective way to teach a person or animal a new behaviour is with positive reinforcement. In  positive reinforcement , a desirable stimulus is added to increase a behaviour.

For example, you tell your five-year-old kid, Karson, that if they clean their room, they will get a toy. Karson quickly cleans their room because they want a new art set. Let’s pause for a moment. Some people might say, “Why should I reward my child for doing what is expected?” But in fact we are constantly and consistently rewarded in our lives. Our paycheques are rewards, as are high grades and acceptance into our preferred school. Being praised for doing a good job and for passing a driver’s test is also a reward. Positive reinforcement as a learning tool is extremely effective. It has been found that one of the most effective ways to increase achievement in school districts with below-average reading scores was to pay the children to read. Specifically, second-grade students in Dallas were paid $2 each time they read a book and passed a short quiz about the book. The result was a significant increase in reading comprehension (Fryer, 2010). What do you think about this program? If Skinner were alive today, he would probably think this was a great idea. He was a strong proponent of using operant conditioning principles to influence students’ behaviour at school. In fact, in addition to the Skinner box, he also invented what he called a teaching machine that was designed to reward small steps in learning (Skinner, 1961)—an early forerunner of computer-assisted learning. His teaching machine tested students’ knowledge as they worked through various school subjects. If students answered questions correctly, they received immediate positive reinforcement and could continue; if they answered incorrectly, they did not receive any reinforcement. The idea was that students would spend additional time studying the material to increase their chance of being reinforced the next time (Skinner, 1961).

In  negative reinforcement , an undesirable stimulus is removed to increase a behaviour. For example, car manufacturers use the principles of negative reinforcement in their seatbelt systems, which go “beep, beep, beep” until you fasten your seatbelt. The annoying sound stops when you exhibit the desired behaviour, increasing the likelihood that you will buckle up in the future. Negative reinforcement is also used frequently in horse training. Riders apply pressure—by pulling the reins or squeezing their legs—and then remove the pressure when the horse performs the desired behaviour, such as turning or speeding up. The pressure is the negative stimulus that the horse wants to remove.

Many people confuse negative reinforcement with punishment in operant conditioning, but they are two very different mechanisms. Remember that reinforcement, even when it is negative, always increases a behaviour. In contrast,  punishment   always decreases a behaviour. In  positive punishment , you add an undesirable stimulus to decrease a behaviour. An example of positive punishment is scolding a student to get the student to stop texting in class. In this case, a stimulus (the reprimand) is added in order to decrease the behaviour (texting in class). In  negative punishment , you remove a pleasant stimulus to decrease behaviour. For example, when a child misbehaves, a parent can take away a favourite toy. In this case, a stimulus (the toy) is removed in order to decrease the behaviour.

Punishment, especially when it is immediate, is one way to decrease undesirable behaviour. For example, imagine your four-year-old, Sasha, hit another kid. You have Sasha write 100 times “I will not hit other children” (positive punishment). Chances are Sasha won’t repeat this behaviour. While strategies like this are common today, in the past children were often subject to physical punishment, such as spanking. It’s important to be aware of some of the drawbacks in using physical punishment on children. First, punishment may teach fear. Sasha may become fearful of the situation in which the punishment occurred, but also may become fearful of the person who delivered the punishment—you, the parent. Similarly, children who are punished by teachers may come to fear the teacher and try to avoid school (Gershoff et al., 2010). Consequently, most schools in the United States have banned corporal punishment. Second, punishment may cause children to become more aggressive and prone to antisocial behaviour and delinquency (Gershoff, 2002). They see their parents resort to spanking when they become angry and frustrated, so, in turn, they may act out this same behaviour when they become angry and frustrated. For example, because you spank Sasha when you are angry with them for misbehaving, Sasha might start hitting their friends when they won’t share their toys.

While positive punishment can be effective in some cases, Skinner suggested that the use of punishment should be weighed against the possible negative effects. Today’s psychologists and parenting experts favour reinforcement over punishment—they recommend that you catch your child doing something good and reward them for it.

In his operant conditioning experiments, Skinner often used an approach called shaping. Instead of rewarding only the target behaviour, in   shaping , we reward successive approximations of a target behaviour. Why is shaping needed? Remember that in order for reinforcement to work, the organism must first display the behaviour. Shaping is needed because it is extremely unlikely that an organism will display anything but the simplest of behaviours spontaneously. In shaping, behaviours are broken down into many small, achievable steps. The specific steps used in the process are the following:

  • Reinforce any response that resembles the desired behaviour.
  • Then reinforce the response that more closely resembles the desired behaviour. You will no longer reinforce the previously reinforced response.
  • Next, begin to reinforce the response that even more closely resembles the desired behaviour.
  • Continue to reinforce closer and closer approximations of the desired behaviour.
  • Finally, only reinforce the desired behaviour.
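The logic of these steps can also be sketched as a small simulation. The example below is illustrative only and not taken from the chapter: it assumes the behaviour can be summarized as a single number between 0 and 1 (with 1.0 standing for the full target behaviour, such as a complete roll-over), and it raises the reinforcement criterion in fixed increments; the function name and parameter values are invented for the example.

```python
import random

# Hypothetical sketch of shaping by successive approximations.
# "Behaviour" is a number in [0, 1]; 1.0 stands for the full target behaviour.

def shape(step=0.1, trials_per_step=30, seed=1):
    random.seed(seed)
    skill = 0.0        # the learner's current tendency (0 = only the simplest response)
    criterion = step   # the first approximation that earns a reinforcer
    while criterion <= 1.0 + 1e-9:
        for _ in range(trials_per_step):
            # The learner emits a behaviour near its current tendency, with noise.
            behaviour = max(0.0, min(1.0, random.gauss(skill, 0.15)))
            if behaviour >= criterion:
                # Reinforce it: the learner drifts toward the rewarded behaviour.
                skill += 0.5 * (behaviour - skill)
            # Responses below the criterion are no longer reinforced.
        print(f"criterion {criterion:.1f} -> typical behaviour {skill:.2f}")
        criterion += step   # from now on, only closer approximations are rewarded
    return skill

if __name__ == "__main__":
    shape()
```

Run as written, the printed value climbs as the criterion is raised, mirroring the dog that is first rewarded for lying down and eventually rewarded only for the complete roll.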

Shaping is often used in teaching a complex behaviour or chain of behaviours. Skinner used shaping to teach pigeons not only such relatively simple behaviours as pecking a disk in a Skinner box, but also many unusual and entertaining behaviours, such as turning in circles, walking in figure eights, and even playing ping pong; the technique is commonly used by animal trainers today. An important part of shaping is stimulus discrimination. Recall Pavlov’s dogs—he trained them to respond to the tone of a bell, and not to similar tones or sounds. This discrimination is also important in operant conditioning and in shaping behaviour.

It’s easy to see how shaping is effective in teaching behaviours to animals, but how does shaping work with humans? Let’s consider a parent whose goal is to have their child learn to clean their room. The parent uses shaping to help the child master the steps toward that goal. Instead of expecting the child to perform the entire task at once, the parent sets up a series of steps and reinforces each one. First, the child cleans up one toy. Second, the child cleans up five toys. Third, the child chooses whether to pick up ten toys or put their books and clothes away. Fourth, the child cleans up everything except two toys. Finally, the child cleans their entire room.

Primary and Secondary Reinforcers

Rewards such as stickers, praise, money, toys, and more can be used to reinforce learning. Let’s go back to Skinner’s rats again. How did the rats learn to press the lever in the Skinner box? They were rewarded with food each time they pressed the lever. For animals, food would be an obvious reinforcer.

What would be a good reinforcer for humans? For your child Karson, it was the promise of a toy when they cleaned their room. How about Sydney, the soccer player? If you gave Sydney a piece of candy every time Sydney scored a goal, you would be using a  primary reinforcer . Primary reinforcers are reinforcers that have innate reinforcing qualities. These kinds of reinforcers are not learned. Water, food, sleep, shelter, sex, and touch, among others, are primary reinforcers. Pleasure is also a primary reinforcer. Organisms do not lose their drive for these things. For most people, jumping in a cool lake on a very hot day would be reinforcing and the cool lake would be innately reinforcing—the water would cool the person off (a physical need), as well as provide pleasure.

A  secondary reinforcer  has no inherent value and only has reinforcing qualities when linked with a primary reinforcer. Praise, linked to affection, is one example of a secondary reinforcer, as when you called out “Great shot!” every time Sydney made a goal. Another example, money, is only worth something when you can use it to buy other things—either things that satisfy basic needs (food, water, shelter—all primary reinforcers) or other secondary reinforcers. If you were on a remote island in the middle of the Pacific Ocean and you had stacks of money, the money would not be useful if you could not spend it. What about the stickers on the behaviour chart? They also are secondary reinforcers.

Sometimes, instead of stickers on a sticker chart, a token is used. Tokens, which are also secondary reinforcers, can then be traded in for rewards and prizes. Entire behaviour management systems, known as token economies, are built around the use of these kinds of token reinforcers. Token economies have been found to be very effective at modifying behaviour in a variety of settings such as schools, prisons, and mental hospitals. For example, a study by Cangi and Daly (2013) found that use of a token economy increased appropriate social behaviours and reduced inappropriate behaviours in a group of autistic school children. Autistic children tend to exhibit disruptive behaviours such as pinching and hitting. When the children in the study exhibited appropriate behaviour (not hitting or pinching), they received a “quiet hands” token. When they hit or pinched, they lost a token. The children could then exchange specified amounts of tokens for minutes of playtime.

EVERYDAY CONNECTION

Behaviour Modification in Children

Parents and teachers often use behaviour modification to change a child’s behaviour. Behaviour modification uses the principles of operant conditioning to accomplish behaviour change so that undesirable behaviours are switched for more socially acceptable ones. Some teachers and parents create a sticker chart, in which several behaviours are listed ( Figure L.11 ). Sticker charts are a form of token economies, as described in the text. Each time children perform the behaviour, they get a sticker, and after a certain number of stickers, they get a prize, or reinforcer. The goal is to increase acceptable behaviours and decrease misbehaviour. Remember, it is best to reinforce desired behaviours, rather than to use punishment. In the classroom, the teacher can reinforce a wide range of behaviours, from students raising their hands, to walking quietly in the hall, to turning in their homework. At home, parents might create a behaviour chart that rewards children for things such as putting away toys, brushing their teeth, and helping with dinner. In order for behaviour modification to be effective, the reinforcement needs to be connected with the behaviour; the reinforcement must matter to the child and be done consistently.

A photograph shows a child placing stickers on a chart hanging on the wall.

Time-out is another popular technique used in behaviour modification with children. It operates on the principle of negative punishment. When a child demonstrates an undesirable behaviour, she is removed from the desirable activity at hand ( Figure L.12 ). For example, say that Paton and their sibling Bennet are playing with building blocks. Paton throws some blocks at Bennet, so you give Paton a warning that they will go to time-out if they do it again. A few minutes later, Paton throws more blocks at Bennet. You remove Paton from the room for a few minutes. When Paton comes back, they don’t throw blocks.

There are several important points that you should know if you plan to implement time-out as a behaviour modification technique. First, make sure the child is being removed from a desirable activity and placed in a less desirable location. If the activity is something undesirable for the child, this technique will backfire because it is more enjoyable for the child to be removed from the activity. Second, the length of the time-out is important. The general rule of thumb is one minute for each year of the child’s age. A five-year-old, for example, sits in time-out for five minutes. Setting a timer helps children know how long they have to sit in time-out. Finally, as a caregiver, keep several guidelines in mind over the course of a time-out: remain calm when directing your child to time-out; ignore your child during time-out (because caregiver attention may reinforce misbehaviour); and give the child a hug or a kind word when time-out is over.

Photograph A shows several children climbing on playground equipment. Photograph B shows a child sitting alone on a bench.

Reinforcement Schedules

Remember, the best way to teach a person or animal a behaviour is to use positive reinforcement. For example, Skinner used positive reinforcement to teach rats to press a lever in a Skinner box. At first, the rat might randomly hit the lever while exploring the box, and out would come a pellet of food. After eating the pellet, what do you think the hungry rat did next? It hit the lever again, and received another pellet of food. Each time the rat hit the lever, a pellet of food came out. When an organism receives a reinforcer each time it displays a behaviour, it is called  continuous reinforcement . This reinforcement schedule is the quickest way to teach someone a behaviour, and it is especially effective in training a new behaviour. Let’s look back at the dog that was learning to sit earlier in the chapter. Now, each time the dog sits, you give the dog a treat. Timing is important here: you will be most successful if you present the reinforcer immediately after the dog sits, so that the dog can make an association between the target behaviour (sitting) and the consequence (getting a treat).

Once a behaviour is trained, researchers and trainers often turn to another type of reinforcement schedule— partial reinforcement. In  partial reinforcement , also referred to as intermittent reinforcement, the person or animal does not get reinforced every time they perform the desired behaviour. There are several different types of partial reinforcement schedules ( Table L.3 ). These schedules are described as either fixed or variable, and as either interval or ratio.  Fixed  refers to the number of responses between reinforcements, or the amount of time between reinforcements, which is set and unchanging.  Variable  refers to the number of responses or amount of time between reinforcements, which varies or changes.  Interval  means the schedule is based on the time between reinforcements, and  ratio  means the schedule is based on the number of responses between reinforcements.

Table L.3 Reinforcement Schedules

  • Fixed interval: reinforcement is delivered at predictable time intervals (e.g., after 5, 10, 15, and 20 minutes). Result: moderate response rate with significant pauses after reinforcement. Example: a hospital patient using patient-controlled, doctor-timed pain relief.
  • Variable interval: reinforcement is delivered at unpredictable time intervals (e.g., after 5, 7, 10, and 20 minutes). Result: moderate yet steady response rate. Example: checking Facebook.
  • Fixed ratio: reinforcement is delivered after a predictable number of responses (e.g., after 2, 4, 6, and 8 responses). Result: high response rate with pauses after reinforcement. Example: piecework, such as a factory worker getting paid for every x number of items manufactured.
  • Variable ratio: reinforcement is delivered after an unpredictable number of responses (e.g., after 1, 4, 5, and 9 responses). Result: high and steady response rate. Example: gambling.

Now let’s combine these four terms. A  fixed interval reinforcement schedule  is when behaviour is rewarded after a set amount of time. For example, June undergoes major surgery in a hospital. During recovery, June is expected to experience pain and will require prescription medications for pain relief. June is given an IV drip with a patient-controlled painkiller. June’s doctor sets a limit: one dose per hour. June pushes a button when pain becomes difficult, and they receive a dose of medication. Since the reward (pain relief) only occurs on a fixed interval, there is no point in exhibiting the behaviour when it will not be rewarded.

With a variable interval reinforcement schedule, the person or animal gets the reinforcement based on varying amounts of time, which are unpredictable. Say that Tate is the manager at a fast-food restaurant. Every once in a while someone from the quality control division comes to Tate’s restaurant. If the restaurant is clean and the service is fast, everyone on that shift earns a $20 bonus. Tate never knows when the quality control person will show up, so they always try to keep the restaurant clean and ensure that their employees provide prompt and courteous service. Tate’s productivity regarding prompt service and keeping a clean restaurant is steady because Tate wants their crew to earn the bonus.

With a fixed ratio reinforcement schedule, there are a set number of responses that must occur before the behaviour is rewarded. Reed sells glasses at an eyeglass store, and earns a commission every time they sell a pair of glasses. Reed always tries to sell people more pairs of glasses, including prescription sunglasses or a backup pair, so they can increase their commission. Reed does not care if the person really needs the prescription sunglasses; they just want the bonus. The quality of what Reed sells does not matter because Reed’s commission is not based on quality; it’s only based on the number of pairs sold. This distinction in the quality of performance can help determine which reinforcement method is most appropriate for a particular situation. Fixed ratios are better suited to optimize the quantity of output, whereas a fixed interval, in which the reward is not quantity based, can lead to a higher quality of output.

In a variable ratio reinforcement schedule, the number of responses needed for a reward varies. This is the most powerful partial reinforcement schedule. An example of the variable ratio reinforcement schedule is gambling. Imagine that Quinn—generally a smart, thrifty person—visits Las Vegas for the first time. Quinn is not a gambler, but out of curiosity they put a quarter into the slot machine, and then another, and another. Nothing happens. Two dollars in quarters later, Quinn’s curiosity is fading, and they are just about to quit. But then, the machine lights up, bells go off, and Quinn gets 50 quarters back. That’s more like it! Quinn gets back to inserting quarters with renewed interest, and a few minutes later they have used up all the gains and are $10 in the hole. Now might be a sensible time to quit. And yet, Quinn keeps putting money into the slot machine because they never know when the next reinforcement is coming. Quinn keeps thinking that with the next quarter they could win $50, or $100, or even more. Because the reinforcement schedule in most types of gambling has a variable ratio schedule, people keep trying and hoping that the next time they will win big. This is one of the reasons that gambling is so addictive—and so resistant to extinction.

In operant conditioning, extinction of a reinforced behaviour occurs at some point after reinforcement stops, and the speed at which this happens depends on the reinforcement schedule. In a variable ratio schedule, the point of extinction comes very slowly, as described above. But in the other reinforcement schedules, extinction may come quickly. For example, if June presses the button for the pain relief medication before the allotted time their doctor has approved, no medication is administered. June is on a fixed interval reinforcement schedule (dosed hourly), so extinction occurs quickly when reinforcement doesn’t come at the expected time. Among the reinforcement schedules, variable ratio is the most productive and the most resistant to extinction. Fixed interval is the least productive and the easiest to extinguish ( Figure L.13 ).
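The four schedules differ only in the rule that decides when a response earns a reinforcer, and those rules can be written out directly. The sketch below is purely illustrative and not taken from the chapter: the class names, the particular parameters (a fixed ratio of 5, a variable ratio averaging 5 responses, a fixed interval of 10 seconds, and a variable interval averaging 10 seconds), and the assumption of one response per second are all invented for the example.

```python
import random

# Hypothetical sketch: the four partial-reinforcement schedules expressed as
# rules that decide whether a given response earns a reinforcer.

class FixedRatio:
    def __init__(self, n):                  # reinforce every n-th response
        self.n, self.count = n, 0
    def respond(self, t):
        self.count += 1
        if self.count == self.n:
            self.count = 0
            return True
        return False

class VariableRatio:
    def __init__(self, mean_n):             # reinforce after a varying number of responses
        self.mean_n = mean_n
        self.needed, self.count = random.randint(1, 2 * mean_n - 1), 0
    def respond(self, t):
        self.count += 1
        if self.count >= self.needed:
            self.count = 0
            self.needed = random.randint(1, 2 * self.mean_n - 1)
            return True
        return False

class FixedInterval:
    def __init__(self, seconds):            # first response after a set time is reinforced
        self.seconds, self.available_at = seconds, seconds
    def respond(self, t):
        if t >= self.available_at:
            self.available_at = t + self.seconds
            return True
        return False

class VariableInterval:
    def __init__(self, mean_seconds):       # first response after an unpredictable time
        self.mean = mean_seconds
        self.available_at = random.uniform(0, 2 * mean_seconds)
    def respond(self, t):
        if t >= self.available_at:
            self.available_at = t + random.uniform(0, 2 * self.mean)
            return True
        return False

if __name__ == "__main__":
    random.seed(0)
    schedules = {"FR-5": FixedRatio(5), "VR-5": VariableRatio(5),
                 "FI-10s": FixedInterval(10), "VI-10s": VariableInterval(10)}
    # One response per second for 60 seconds; count reinforcers under each rule.
    for name, schedule in schedules.items():
        rewards = sum(schedule.respond(t) for t in range(60))
        print(f"{name}: {rewards} reinforcers for 60 responses over 60 seconds")
```

Counting reinforcers this way makes the fixed/variable and interval/ratio distinctions concrete; it does not, by itself, reproduce the different response rates shown in Figure L.13, which are empirical findings about how organisms behave under each rule.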

A graph has an x-axis labeled “Time” and a y-axis labeled “Cumulative number of responses.” Two lines labeled “Variable Ratio” and “Fixed Ratio” have similar, steep slopes. The variable ratio line remains straight and is marked in random points where reinforcement occurs. The fixed ratio line has consistently spaced marks indicating where reinforcement has occurred, but after each reinforcement, there is a small drop in the line before it resumes its overall slope. Two lines labeled “Variable Interval” and “Fixed Interval” have similar slopes at roughly a 45-degree angle. The variable interval line remains straight and is marked in random points where reinforcement occurs. The fixed interval line has consistently spaced marks indicating where reinforcement has occurred, but after each reinforcement, there is a drop in the line.

CONNECT THE CONCEPTS

Skinner (1953) stated, “If the gambling establishment cannot persuade a patron to turn over money with no return, it may achieve the same effect by returning part of the patron’s money on a variable-ratio schedule” (p. 397).

Skinner uses gambling as an example of the power of the variable-ratio reinforcement schedule for maintaining behaviour even during long periods without any reinforcement. In fact, Skinner was so confident in his knowledge of gambling addiction that he even claimed he could turn a pigeon into a pathological gambler (“Skinner’s Utopia,” 1971). It is indeed true that variable-ratio schedules keep behaviour quite persistent—just imagine the frequency of a child’s tantrums if a parent gives in even once to the behaviour. The occasional reward makes it almost impossible to stop the behaviour.

Recent research in rats has failed to support Skinner’s idea that training on variable-ratio schedules alone causes pathological gambling (Laskowski et al., 2019). However, other research suggests that gambling does seem to work on the brain in the same way as most addictive drugs, and so there may be some combination of brain chemistry and reinforcement schedule that could lead to problem gambling ( Figure L.14 ). Specifically, modern research shows the connection between gambling and the activation of the reward centres of the brain that use the neurotransmitter (brain chemical) dopamine (Murch & Clark, 2016). Interestingly, gamblers don’t even have to win to experience the “rush” of dopamine in the brain. “Near misses,” or almost winning but not actually winning, also have been shown to increase activity in the ventral striatum and other brain reward centres that use dopamine (Chase & Clark, 2010). These brain effects are almost identical to those produced by addictive drugs like cocaine and heroin (Murch & Clark, 2016). Based on the neuroscientific evidence showing these similarities, the DSM-5 now considers gambling an addiction, while earlier versions of the DSM classified gambling as an impulse control disorder.

A photograph shows four digital gaming machines.

In addition to dopamine, gambling also appears to involve other neurotransmitters, including norepinephrine and serotonin (Potenza, 2013). Norepinephrine is secreted when a person feels stress, arousal, or thrill. It may be that pathological gamblers use gambling to increase their levels of this neurotransmitter. Deficiencies in serotonin might also contribute to compulsive behaviour, including a gambling addiction (Potenza, 2013).

It may be that pathological gamblers’ brains are different than those of other people, and perhaps this difference may somehow have led to their gambling addiction, as these studies seem to suggest. However, it is very difficult to ascertain the cause because it is impossible to conduct a true experiment (it would be unethical to try to turn randomly assigned participants into problem gamblers). Therefore, it may be that causation actually moves in the opposite direction—perhaps the act of gambling somehow changes neurotransmitter levels in some gamblers’ brains. It also is possible that some overlooked factor, or confounding variable, played a role in both the gambling addiction and the differences in brain chemistry.


Other Types of Learning

Strict behaviourists like Watson and Skinner focused exclusively on studying behaviour rather than cognition (such as thoughts and expectations). In fact, Skinner was such a staunch believer that cognition didn’t matter that his ideas were considered  radical behaviorism . Skinner considered the mind a “black box”—something completely unknowable—and, therefore, something not to be studied. However, another behaviourist, Edward C. Tolman, had a different opinion. Tolman’s experiments with rats demonstrated that organisms can learn even if they do not receive immediate reinforcement (Tolman & Honzik, 1930; Tolman, Ritchie, & Kalish, 1946). This finding was in conflict with the prevailing idea at the time that reinforcement must be immediate in order for learning to occur, thus suggesting a cognitive aspect to learning.

Edward Tolman studied the behaviour of three groups of rats that were learning to navigate through mazes (Tolman & Honzik, 1930). The first group always received a reward of food at the end of the maze, but the second group never received any reward. The third group did not receive any reward for the first 10 days and then began receiving rewards on the 11th day of the experimental period. As you might expect when considering the principles of conditioning, the rats in the first group quickly learned to negotiate the maze, while the rats of the second group seemed to wander aimlessly through it. The rats in the third group, however, although they wandered aimlessly for the first 10 days, quickly learned to navigate to the end of the maze as soon as they received food on day 11. By the next day, the rats in the third group had caught up in their learning to the rats that had been rewarded from the beginning. Tolman argued that this was because as the unreinforced rats explored the maze, they developed a  cognitive map : a mental picture of the layout of the maze ( Figure L.15 ). As soon as the rats became aware of the food (beginning on the 11th day), they were able to find their way through the maze quickly, just as quickly as the comparison group, which had been rewarded with food all along. This is known as  latent learning : learning that occurs but is not observable in behaviour until there is a reason to demonstrate it.

An illustration shows three rats in a maze, with a starting point and food at the end.

Latent learning also occurs in humans. Children may learn by watching the actions of their parents but only demonstrate it at a later date, when the learned material is needed. For example, suppose that Zan’s parent drives Zan to school every day. In this way, Zan learns the route from their house to their school, but Zan’s never driven there themselves, so they have not had a chance to demonstrate that they’ve learned the way. One morning Zan’s parent has to leave early for a meeting, so they can’t drive Zan to school. Instead, Zan follows the same route on their bike that Zan’s parent would have taken in the car. This demonstrates latent learning. Zan had learned the route to school, but had no need to demonstrate this knowledge earlier.

Introduction to Psychology & Neuroscience Copyright © 2020 by Edited by Leanne Stevens is licensed under a Creative Commons Attribution 4.0 International License , except where otherwise noted.


Pigeons, Operant Conditioning, and Social Control

Audrey Watters

This is the transcript of the talk I gave at the Tech4Good event I'm at this weekend in Albuquerque, New Mexico. The complete slide deck is here .


I want to talk a little bit about a problem I see – or rather, a problem I see in the “solutions” that some scientists and technologists and engineers seem to gravitate towards. So I want to talk to you about pigeons, operant conditioning, and social control, which I recognize is a bit of a strange and academic title. I toyed with some other titles as well.


I spent last week at the Harvard University archives, going through the papers of Professor B. F. Skinner, arguably one of the most important psychologists of the twentieth century. (The other, of course, being Sigmund Freud.)


I don’t know how familiar this group is with Skinner – he’s certainly a name that those working in educational psychology have heard of. I’d make a joke here about software engineers having no background in the humanities or social sciences but I hear Mark Zuckerberg was actually a psych major at Harvard. (So that’s the joke.)

I actually want to make the case this morning that Skinner’s work – behavioral psychology in particular – has had profound influence on the development of computer science, particularly when it comes to the ways in which “programming” has become a kind of social engineering. I’m not sure this lineage is always explicitly considered – like I said, there’s that limited background in or appreciation for history thing your field seems to have got going on.

B. F. Skinner was a behaviorist. Indeed, almost all the American psychologists in the early twentieth century were. Unlike Freud, who was concerned with the subconscious mind, behaviorists like Skinner were interested in – well, as the name suggests – behaviors. Observable behaviors. Behaviors that could be conditioned or controlled.


Skinner’s early work was with animals. As a graduate student at Harvard, he devised the operant conditioning chamber – better known as the Skinner box – that was used to study animal behavior. The chamber provided some sort of response mechanism that the animal would be trained to use, typically by rewarding the animal with food.


During World War II, Skinner worked on a program called Project Pigeon – also known as Project Orcon, short for Organic Control – an experimental project to create pigeon-guided missiles.

The pigeons were trained by Skinner to peck at a target, and they were rewarded with food when they completed the task correctly. Skinner designed a missile that carried pigeons which could see the target through the windows. The pigeons would peck at the target; the pecking in turn would control the missile’s tail fins, keeping it on course, via a metal conductor connected to the birds’ beaks, transmitting the force of the pecking to the missile’s guidance system. The pigeons’ accuracy, according to Skinner’s preliminary tests: nearly perfect.

As part of their training, Skinner also tested the tenacity of the pigeons – testing their psychological fitness, if you will, for battle. He fired a pistol next to their heads to see if loud noise would disrupt their pecking. He put the pigeons in a pressure chamber, setting the altitude at 10,000 feet. The pigeons were whirled around in a centrifuge meant to simulate massive G forces; they were exposed to bright flashes meant to simulate shell bursts. The pigeons kept pecking. They had been trained, conditioned to do so.

The military canceled and revived Project Pigeon a couple of times, but Skinner’s ideas were never used in combat. “Our problem,” Skinner admitted, “was no one would take us seriously.” And by 1953, the military had devised an electronic system for missile guidance, so animal-guided systems were no longer necessary (if they ever were).

This research was all classified, and when the American public were introduced to Skinner’s well-trained pigeons in the 1950s, there was no reference to their proposed war-time duties. Rather, the media talked about his pigeons that could play ping-pong and piano.

Admittedly, part of my interest in Skinner’s papers at Harvard involved finding more about his research on pigeons. I use the pigeons as a visual metaphor throughout my work. And I could talk to you for an hour, easily, about the birds – indeed, I have given a keynote like that before. But I’m writing a book on the history of education technology, and B. F. Skinner is probably the name best known with “teaching machines” – that is, programmed instruction (pre-computer).

Skinner’s work on educational technology – on teaching and learning with machines – is connected directly, explicitly to his work with animals. Hence my usage of the pigeon imagery. Skinner believed that there was not enough (if any) of the right kind of behavior modification undertaken in schools. He pointed out that students are punished when they do something wrong – that’s the behavioral reinforcement that they receive: aversion. But students are rarely rewarded when they do something right. And again, this isn’t simply about “classroom behavior” – the kind of thing you get a grade for “good citizenship” on (not talking in class or cutting in the lunch line). Learning, to Skinner, was a behavior – and a behavior that needed what he called “contingencies of reinforcement.” These should be positive. They should minimize the chances of doing something wrong – getting the wrong answer, for example. (That’s why Skinner didn’t like multiple choice tests.) The reinforcement should be immediate.


Skinner designed a teaching machine that he said would do all these things – allow the student to move at her own pace through the material. The student would know instantaneously if she had the answer right. (The reward was getting to move on to the next exciting question or concept.) And you can hear all this echoed in today’s education technology designers and developers and school reformers – from Sal Khan and Khan Academy to US Secretary of Education Betsy DeVos. It’s called “personalized learning.” But it’s essentially pigeon training with a snazzier interface.

“Once we have arranged the particular type of consequence called a reinforcement,” Skinner wrote in 1954 in “The Science of Learning and the Art of Teaching,” “our techniques permit us to shape the behavior of an organism almost at will. It has become a routine exercise to demonstrate this in classes in elementary psychology by conditioning such an organism as a pigeon.”

“…Such an organism as a pigeon.” We often speak of “lab rats” as shorthand for the animals used in scientific experiments. We use the phrase too to describe people who work in labs, who are completely absorbed in performing their tasks again and again and again. In education and in education technology, students are also the subjects of experimentation and conditioning. In Skinner’s framework, they are not “lab rats”; they are pigeons. As he wrote,

…Comparable results have been obtained with pigeons, rats, dogs, monkeys, human children… and psychotic subjects. In spite of great phylogenetic differences, all these organisms show amazingly similar properties of the learning process. It should be emphasized that this has been achieved by analyzing the effects of reinforcement and by designing techniques that manipulate reinforcement with considerable precision. Only in this way can the behavior of the individual be brought under such precise control.

If we do not bring students’ behavior under control, Skinner cautioned, we will find ourselves “losing our pigeon.” The animal will be beyond our control.

Like I said, I’m writing a book. So I can talk at great length about Skinner and teaching machines. But I want folks to consider how behaviorism hasn’t just found its way into education reform or education technology. Indeed, Skinner and many others envisioned that application of operant conditioning outside of the laboratory, outside of the classroom – the usage (past and present) of behavior modification for social engineering is at the heart of a lot of “fixes” that people think they’re doing “for the sake of the children,” or “for the good of the country,” or “to make the world a better place.”


Among the discoveries I made – new to me, not new to the world, to be clear: in the mid–1960s, B. F. Skinner was contacted by the Joseph P. Kennedy Jr. Foundation, a non-profit that funded various institutions and research projects that dealt with mental disabilities. Eunice Kennedy Shriver was apparently interested in his work on operant behavior and child-rearing, and her husband Sargent Shriver, who’d been appointed by President Johnson to head the newly formed Office of Economic Opportunity, was also keen to find ways to use operant conditioning as part of the War on Poverty.

There was a meeting. Skinner filed a report. But as he wrote in his autobiography, nothing came of it. “A year later,” he added, “one of Shriver’s aides came to see me about motivating the peasants in Venezuela.”

Motivating pigeons or poor people or peasants (or motivating peasants and poor people as pigeons) – it’s all offered, quite earnestly no doubt – as the ways in which science and scientific management will make the world better.

But if nothing else, the application of behavior modification to poverty implies that this is a psychological problem and not a structural one. Focus on the individual and their “mindset” – to use the language that education technology and educational psychology folks invoke these days – not on the larger, societal problems.

I recognize, of course, that you can say “it’s for their own good” – but it involves a great deal of hubris (and often historical and cultural ignorance, quite frankly) to assume that you know what “their own good” actually entails.

operant conditioning experiment on pigeon

You’ll sometimes hear that B. F. Skinner’s theories are no longer in fashion – the behaviorist elements of psychology have given way to the cognitive turn. And with or without developments in cognitive and neuroscience, Skinner’s star had certainly lost some of its luster towards the end of his career, particularly, as many like to tell the story, after Noam Chomsky penned a brutal review of his book Beyond Freedom and Dignity in the December 1971 issue of The New York Review of Books. In the book, Skinner argues that our ideas of freedom and free will and human dignity stand in the way of a behavioral science that can better organize and optimize society.

“Skinner’s science of human behavior, being quite vacuous, is as congenial to the libertarian as to the fascist,” writes Chomsky, adding that “there is nothing in Skinner’s approach that is incompatible with a police state in which rigid laws are enforced by people who are themselves subject to them and the threat of dire punishment hangs over all.”

Skinner argues in Beyond Freedom and Dignity that the goal of behavioral technologies should be to “design a world in which behavior likely to be punished seldom or never occurs” – a world of “automatic goodness.” We should not be concerned with freedom, Skinner argues – that’s simply mysticism. We should pursue “effectiveness of techniques of control” which will “make the world safer.” Or make the world totalitarian, as Chomsky points out.

operant conditioning experiment on pigeon

Building behavioral technologies is, of course, what many computer scientists now do (perhaps what some of you do – cough, FitBit) – most, I’d say, firmly believing that they’re also building a world of “automatic goodness.” “Persuasive technologies,” as Stanford professor B. J. Fogg calls them. And in true Silicon Valley fashion, Fogg erases the long history of behavioral psychology in doing so: “the earliest signs of persuasive technology appeared in the 1970s and 1980s when a few computing systems were designed to promote health and increase workplace productivity,” he writes in his textbook. His students at his Behavioral Design Lab at Stanford have included Mike Krieger, the co-founder of Instagram, and Tristan Harris, a former Googler, founder of the Center for Humane Technology, and the best-known figure in what I call the “tech regrets industry” – he’s into “ethical” persuasive technologies now, you see.

Behavior modification. Behavioral conditioning. Behavioral design. Gamification. Operant conditioning. All practices and products and machines that are perhaps so ubiquitous in technology that we don’t see them – we just feel the hook and the urge for the features that reward us for behaving like those Project Pigeon birds pecking away at their target – not really aware of why there’s a war or what’s at stake or that we’re going to suffer and die if this missile runs its course. But nobody asked the pigeons. And even with the best of intentions for pigeons – promising pigeons an end to poverty and illiteracy – nobody asked the pigeons. Folks just assumed that because the smart men at Harvard (or Stanford or Silicon Valley or the US government) were on it, it was surely the right “fix.”

Published 15 Jun 2018

Hack Education

The history of the future of education technology.


The Other Shoe: An Early Operant Conditioning Chamber for Pigeons

Takayuki Sakagami

Department of Psychology, Keio University, 2-15-45 Minato-ku, Tokyo, 108-8345 Japan

Kennon A. Lattal

Department of Psychology, West Virginia University, Morgantown, WV 26506-6040 USA

We describe an early operant conditioning chamber fabricated by Harvard University instrument maker Ralph Gerbrands and shipped to Japan in 1952 in response to a request made to Professor B. F. Skinner by Japanese psychologists. It is a rare example, perhaps the earliest still physically existing, of such a chamber for use with pigeons. Although the overall structure and many of the components are similar to contemporary pigeon chambers, several differences are noted and contrasted with evolutionary changes in this most important laboratory tool in the experimental analysis of behavior. The chamber also is testimony to the early internationalization of behavior analysis.

In 1952, B. F. Skinner arranged with Ralph Gerbrands a shipment of operant conditioning apparatus to two universities in Japan: Tokyo University and Keio University (Skinner, 1983, p. 38). The two names in Japan that have been most prominently mentioned in connection with the Keio shipment are M. Yokoyama and Takashi Ogawa, who were respectively professor and associate professor of psychology at Keio University at that time. Yokoyama traveled to the USA in 1950, supported by the Government Aid and Relief Fund in Occupied Areas (GARIOA) Program. During that trip he met, among others, E. G. Boring, who was in the Psychology Department at Harvard, and reportedly was impressed by American social psychology and animal psychology. He may have been introduced to operant conditioning apparatus during that visit, but this has not been corroborated. Asano and Lattal (2012) noted that both Skinner and Yokoyama thereafter attended an international congress in Stockholm (Skinner, 1951) and speculated that that encounter may have been part of the impetus for the apparatus arriving in Japan as well. Professor Masao Tachibana of the University of Tokyo did confirm that a rat chamber delivered to Tokyo University was purchased in April 1952 for 259,000 Japanese yen (about 717 US dollars in 1952, or roughly 2,290 US dollars in 2016). He also indicated that the chamber was discarded in 2008 (Tachibana, personal communication, March 27, 2015), following the fate of much old research apparatus. Asano and Lattal (2012) described the cumulative recorder that was shipped to Keio University. Here the other apparatus shoe to that discovery is dropped, with a description of the operant chamber for pigeons that accompanied the cumulative recorder.

Early physical examples of operant conditioning chambers for either rats or pigeons are rare. One pre-1950s example of a rat chamber of the type described by Heron and Skinner ( 1939 ) is held in the Department of Psychology at the University of Minnesota. Although Skinner started conducting experiments with pigeons at the University of Minnesota in the days of what has come to be known as Project Pelican (Skinner, 1960 ), other than the Project Pelican apparatus (which is a part of the Smithsonian Museum’s permanent collection) no examples of pre-1950s operant chambers for pigeons have been forthcoming, at the University of Minnesota or anywhere else. The early 1950s operant chamber for pigeons that was shipped to Japan as noted above was recently rediscovered in one of the operant behavior laboratories at Keio University in Tokyo. In the first part of this paper and in the Appendix we describe, in some detail for the historical record, this seminal apparatus for the experimental analysis of behavior. In the second part, we review some early uses of the chamber in Japan, putting it into the historical context of what would become behavior analysis in that country. In the third and final part, we discuss the chamber in the context of the broader history of operant chamber technology. We focus specifically on the differences between this chamber and ones that followed, and some of the implications of these differences for conducting an experimental analysis of behavior.

The Chamber

Figure 1 shows the exterior of the chamber. The shell is a J. C. Higgins™ ice chest. “J. C. Higgins” was a signature brand of sporting and outdoor equipment sold by the Sears and Roebuck Company of Chicago, IL between 1908 and the early 1960s. Then, as now, such ice chests were popular with picnickers, campers, and other outdoor enthusiasts, as well as with operant conditioners. Indeed, there are operant laboratories around the world that continue to use similar ice chests for chambers. This one is in excellent condition, although, as Fig. 1 shows, some of the brand-name decal has worn off. Its remaining part, however, still is readable. A label attached to the top of the chamber (Fig. 2) at some point in its past reads as follows: “Skinner box: It was sent from [donated by] Harvard University. Handle with care.” The chamber, when closed, is sealed except for a ventilation fan.

Figure 1. Exterior view of the operant conditioning chamber for pigeons. The cable resting atop the chamber makes the connection between the chamber work panel and the equipment used to program the contingencies to which the pigeon is exposed.

Figure 2. Back side of the chamber showing the ventilation fan and the label indicating the chamber came from Harvard University (see text).

The interior of the chamber is shown in Fig. 3. The chamber is divided by a 3-mm thick aluminum panel into a work area, where the pigeon is placed, and a service area, where the response key, discriminative stimulus lights, reinforcement dispenser, and plugs to connect the chamber to the control equipment are located. Figure 4 is a pigeon’s-eye-level view of the face of the work panel. Its most salient features are a single response key located behind the circular opening near the top center of the chamber and a food magazine (also called a food hopper, feeder, or grain dispenser in laboratory jargon) located behind the square opening below the response key. The small dark circles are screw heads, and the dark bar across the middle of the panel is a brace.

Figure 3. Top view of chamber with the lid open. The left photograph shows the electrical components in the service area that translate the programs into specific experimental operations (lower portion of the photograph) and the wooden floor in the work area (top portion). For orientation purposes, the arrow marks the top of the food storage bin. The right photograph shows the work area, where the pigeon is stationed during an experimental session, in the lower part of the photograph (the wooden floor was removed from the work area in this photograph). The rubber insulation around the top is marked by two arrows, which are joined together on the foam piece atop the work panel.

Figure 4. Front view of the work panel. The dark line across the middle of the photograph is a metal bar. The dark material at the top of the work panel is a foam rubber cushion that seals the front side of the chamber for the control equipment behind the work panel face. It is held in place on either side by two pieces of twine. The arrow denotes the electric plug described in the text.

Figure 5 shows two views of the service area. The two most unique and technologically interesting components in this chamber are the response key and the food magazine. The location of the response key is described in the Fig. 5 caption and the key is shown in more detail in Fig. 6. The center portion of the key, made of white plastic, is suspended such that when it is pecked through the circle on the work-area side (see Fig. 4), it closes a small switch, circumscribed by the rectangle in the right photograph of Fig. 6. Details of the key’s operation are provided in the Appendix, along with further description of the stimulus lights.

Figure 5. The left photograph shows the rear of the work panel, with its electrical and mechanical components. On the left of the work panel is the (black) connection box, where the wires come in through the 12-prong male Jones plug (arrow) and then are connected to the other electrical components. To its right is the food magazine. Above the food magazine the two lights (housed in black plastic cylinders) used to transilluminate the key are visible, and in front of the lights, mounted on the chamber wall and appearing as a partial rectangle, is the response key. The photograph on the right shows a side view of the electrical components. The metal bars set at an angle are the support for the work panel. Behind the metal bar, the connection box is seen open, revealing two electromechanical relays and the electrical connections to other locations on the work panel. The male Jones plug protrudes from the lower part of the rear side of the connection box.

Figure 6. The left photograph, showing a portion of the rear of the work panel, was taken from slightly above the plane of and to the left of the response key (the black and white partial rectangle is the response key). The two key-light sockets can be seen in the center foreground of the photograph (shorter arrows). The longer arrow points to an opaque shield that diffuses the light from the key lights. The right photograph shows the switch for the key, circumscribed by a rectangle, which closes to record a response when the key is operated from the other side of the work panel. The arrow points to the electrical contacts that close to define the response when the key is operated.

The food magazine is shown in the left photograph of Fig.  7 . The magazine consists of a frame that supports a vertically mounted storage bin (hopper) emptying into a horizontally mounted tray. At one end of the tray is an opening in its top side and at the other end is a lead counterweight. Reinforcement consists of raising the tray to a small aperture located behind and just below the square opening on the work-area side of the work panel (see Fig.  4 ). This is accomplished by operating an electric motor that turns the cogwheel (medium-length arrows in the left and lower right photographs in Fig.  7 ). The tray lifts when the raised portion of the cogwheel pushes on a lever attached to the tray (longer arrow in the left and lower right photographs in Fig.  7 ) and lowers when the indented portion of the wheel releases the lever.

Figure 7. The left photograph shows the food magazine with its unique cogwheel (longer arrow) and lever (mid-length arrow) arrangement for raising the food tray to the level of the aperture, to which the pigeon has access. The key lights are marked in this photograph by the shorter arrows. The lower right photograph shows a closer view of the food tray raising mechanism. The lever is marked by the longer arrow and the cogwheel by the shorter one. The upper right photograph shows a slightly later Gerbrands design for a pigeon food magazine. Rather than using a cogwheel arrangement, the food magazine in the right figure uses a solenoid (longer arrow) attached to the food tray (mid-length arrow) by a spring (shorter arrow) to raise the tray to the aperture.

Early History of the Chamber's Use

Professor (Emeritus) Toshiro Yoshida unpacked the parcel containing both the operant conditioning chamber and the cumulative recorder (Asano & Lattal, 2012) when it arrived at Keio from Harvard via sea mail. He told the authors that there were no detailed explanations or instruction manuals accompanying the apparatus, making it difficult to understand the operation of both pieces of equipment. He recalled that the chamber was used first by Sukeo Sugimoto (at that time, a doctoral student in psychology who was supervised by Associate Professor Ogawa), but Sugimoto did not publish his experiments using it. Both Professor Yoshida and retired Professor Satoko Ohinata (personal communication, 12 May 2015), who authored the experiment described below (and who also was supervised by Ogawa), noted that the box was difficult to use for flexible experimental conditions because the space within the box was so limited. They both also noted that the control equipment (probably electromechanical relays, timers, and counters) for displaying discriminative stimuli and operating the food magazine was so modest that the apparatus could only be used for relatively basic contingencies like simple discrimination, reinforcement schedules, or extinction.

The first published experiment using the chamber was conducted by Professor Ohinata (1955). The English version of the abstract of the paper reads in part:

The present study on the instrumental conditioning of color discrimination by pigeons was undertaken to determine whether the learning was based on absolute or on relative discrimination. It was assumed that if the learning was based upon relative discrimination, the luminance relation of the stimuli would be transferred regardless of their wavelength and, on the other hand, if it was based upon absolute discrimination, pigeons would respond to wavelengths without regards to luminance relations.

The paper otherwise is written in Japanese, with only a few words written using the Latin alphabet; however, the following description of the apparatus is included: “装置: Harvard大学製鳩用Skinner-Box.色光刺戟呈示のための附属装置は特に慶応義塾大学心理学研究室に於いて設備された。” (p. 313), that is, roughly, “Apparatus: a Skinner box for pigeons made at Harvard University. The attachment for presenting colored-light stimuli was constructed specially in the Keio University psychology laboratory.” The words with Latin letters describe the present chamber and its origin. Professor Ohinata (personal communication, 12 May 2015) reported that the chamber continued to be used for a number of years for various undergraduate, graduate, and faculty research projects at Keio. One of these, on deprivation level and discrimination performance, was conducted and reported by Professor Masaya Sato (1963), the first president of the Association for Behavior Analysis International from outside the United States.

The concerns noted above of Professors Ogawa and Ohinata with the limitations of the chamber probably relate to Ogawa’s dedication to comparative psychology. His interests in discrimination and perception were in that context. Thus, the early research involving the chamber was not focused on “operant conditioning” in the sense characterized by the work of Ferster and Skinner (1957), but rather on experiments and problems related to comparative psychology. The chamber had evolved in the USA to meet the emphasis and special needs of operant conditioning, which was concerned largely with the basic contingencies mentioned above (e.g., Ferster, 1953; Ferster & Skinner, 1957). It is interesting to consider the possibility that this chamber, along with the cumulative recorder and rat chamber mentioned in the introduction, may have created an environment in which operant conditioning could be shaped in Japan. Given the kinds of research for which the chambers and recorder were developed, it seems feasible that, with this apparatus available, research related to problems more typical of the experimental analysis of behavior might have developed through successive approximations as subsequent Japanese psychologists with different research interests came into contact with the apparatus. We know that the cumulative recorder Skinner shipped to Japan became the model for a Japanese-manufactured version of the cumulative recorder (Asano & Lattal, 2008). We do not know whether or to what degree the chambers of this shipment became models for construction of other chambers in Japan, but it is not hard to imagine that they did. As time passed there was increasing contact between Japanese and American psychologists of many theoretical orientations, but Skinner’s shipment of apparatus can be considered among the factors leading to the development of Japanese behavior analysis.

Implications for the History of Operant Chamber Technology

The most striking thing about the chamber is its “modernity,” given that it is more than 64 years old. With the exception of the response key and food magazine, described above, this chamber could as readily be used in any behavior analysis laboratory today as one built in the past year. Indeed, its functions are identical to those of its contemporary counterparts. This could be, and probably will be, taken by some as evidence that the research methods of the experimental analysis of behavior are too entrenched, stuck in the halcyon days of “operant conditioning” with only a dim future ahead. An alternative perspective, which we prefer, is that Skinner developed a powerful tool when he invented the operant conditioning chamber. Its utility persists, and we have only begun to exploit its potential to enhance our understanding of behavior.

The overarching function of an operant chamber, then and now, is to provide a more or less constant, distraction-free environment in which the interactions between organism and environment can be studied. Such isolation requires that the chamber be ventilated to maintain a constant, comfortable temperature for the animal (see Ferster, 1953). This was accomplished by the ventilation fan described above; however, the location of the ventilation fan resulted in it pulling air across the pigeon in the work area and then the service area before exhausting through the fan. This design exposed the mechanical and electrical control and recording devices in the service area to more pigeon dust (the lubricant generated by the pigeon’s feathers) than would have occurred had the ventilation circulation been reversed. This problem was not always recognized even by later commercial manufacturers, who often placed the ventilation fan as it is in this chamber. A quick perusal of the pigeon chambers in two of the operant laboratories at West Virginia University uncovered four different commercial models of pigeon chambers, three manufactured in the 1960s and 1970s and one after 2010. Tellingly, all were vented as this chamber is. By contrast, all of the home-made chambers were vented such that the exhaust was in the work area rather than the service area.

Except for the fan, the chamber is completely isolated from the external environment when it is closed. One consequence of this is that there is no way, short of leaving the lid open, to observe the animal in the chamber. The early rat chamber housed at the University of Minnesota’s Department of Psychology is similarly completely isolated, because it too lacks any means by which the behaving animal can be seen when the chamber is closed. This is so even though Skinner developed many demonstrations in which the animal was placed in an open environment for all to see (one of these environments is shown in a popular photograph of him; see Skinner, 1979, photographic display between pages 184 and 185 that is labeled “Demonstrating operant conditioning of a pigeon, Indiana, 1948”). The balance between the risk of disturbing the animal while working and the need to see what the animal actually is doing was later resolved in the construction of pigeon and rat chambers by including a means for observing the animal when the experiment was in progress. In earlier days it might have been a glass- or plastic-covered aperture with a curtain over it that could be lifted to allow the experimenter to watch the pigeon. Or it could have been a peep hole of the sort found in entry doors of homes and apartments. Today it often is a miniature camera mounted in an unobtrusive spot inside the chamber. The absence of a means of seeing the subject directly in these early chambers made it impossible to observe behavior other than the recorded operant. The absence of an observation port early in the history of operant conditioning may have contributed to an unfortunate behavioral precedent for some experimenters and laboratories of not only ignoring, but perhaps even dismissing, observational data as too subjective and of limited value. Although there have been many demonstrations to the contrary (e.g., Laties, Weiss, Clark, & Reynolds, 1965; Staddon & Simmelhag, 1971), precedents sometimes are hard to undo. Perhaps our science would be further along had some of the time spent mesmerized by cumulative records been spent looking through observation ports and peep holes to see what was going on that was not always reflected in those cumulative records.

Cumulative records were created by routing the electrical impulses generated when a response key was activated to a cumulative recorder (Lattal, 2004). To do this, a switch closure was required. That switch was the response key. Close inspection of the actual switch circumscribed by the rectangle in the right photograph of Fig. 6 reveals it to be a normally open switch (arrow). This means that operation of the key created an electrical pulse that in turn could be translated to a standard duration (usually 50 ms) and then counted with an electromechanical counting device and/or routed to the cumulative recorder to “step” the response pen one unit with each switch operation (Lattal, 2004). The electrical response pulse also operated programming devices that delivered the reinforcer. It takes longer to close a switch than it does to open one. Skipping the electronic details, suffice it to say that normally open circuits soon were discovered to simply not be fast enough to accurately capture the pecks operating the key switch. So, at some point normally open response keys like the one on the present chamber were replaced by response keys that operated when a circuit was broken rather than “made.” This change greatly improved the capture of responses by the electromechanical circuitry, making the obtained data more closely reflect the actual key pecking of the pigeon (though still not always with 100 % accuracy). Contemporary use of touch-screens to record pecking responses of pigeons, when they work properly (a caveat for any item of equipment, of course), allows recording of the location of the response relative to the target area. So-called “off key” pecks occurring when a switch-type response key is used are lost without special techniques to capture them (e.g., Dunham, Mariner, & Adams, 1969). Modern response key technology also can allow for the possibility of capturing variations in response force (ordinary response keys that are switches require a minimum force and cannot differentiate between force requirements above or below that limit). The upshot of this is that the key in this chamber is truly a dinosaur, one of an earlier era that has been extinct for a very long time.
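For readers who think in code rather than relays, the following minimal Python sketch (a hypothetical illustration added here, not apparatus described in the article) mimics the recording logic just outlined: each detected key operation steps a cumulative response count by one unit, and reinforcer deliveries are logged so they could be drawn as pips on a cumulative record.

    from dataclasses import dataclass, field
    from typing import List, Tuple

    @dataclass
    class CumulativeRecord:
        """Toy software analogue of a cumulative recorder: each detected key
        operation steps the response count one unit; reinforcer deliveries are
        marked so they could be drawn as pips on the record."""
        pulse_ms: int = 50  # nominal 50-ms response pulse, as in the text (not otherwise used here)
        points: List[Tuple[float, int]] = field(default_factory=list)  # (time in s, cumulative responses)
        reinforcers: List[float] = field(default_factory=list)         # times of reinforcer delivery
        count: int = 0

        def key_operated(self, t: float) -> None:
            """One switch closure -> one response pulse -> one step of the 'pen'."""
            self.count += 1
            self.points.append((t, self.count))

        def reinforcer_delivered(self, t: float) -> None:
            self.reinforcers.append(t)

    # Example: nine pecks, with grain delivered after every third peck (a fixed-ratio 3 schedule).
    record = CumulativeRecord()
    for i, t in enumerate([0.4, 1.1, 1.5, 2.8, 3.0, 3.3, 5.2, 5.9, 6.4], start=1):
        record.key_operated(t)
        if i % 3 == 0:
            record.reinforcer_delivered(t)

    print(record.points)       # [(0.4, 1), (1.1, 2), ..., (6.4, 9)]
    print(record.reinforcers)  # [1.5, 3.3, 6.4]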

The other unique feature of this chamber noted above is the food-magazine operation system. By the time Ferster (1953) published his description of the methods of operant conditioning, food magazines of the sort on this box were, as far as we can tell, extinct. A typical food magazine of the mid-1950s is shown in Ferster and Skinner’s (1957) Fig. 2 and in the previously described upper right photograph of Fig. 7. The skeleton of the latter and the one on the work panel of this chamber are almost identical. Both consist of a food storage bin that releases grain into a food tray. The food tray is connected to a device to raise the food tray to an aperture at the base of a chute through which the pigeon could stick its beak and obtain a few bits of food. The present food magazine accomplished the raising of the food tray by the cogwheel mechanism as described above. The one shown in Ferster and Skinner’s Fig. 2 and in the upper right photograph of Fig. 7 above accomplishes the raising by activating a solenoid attached to the food tray by a spring. This activation pulls the food tray up into position such that access to the food through the aperture is possible.

We can only speculate as to the reasons for the demise of the cogwheel mechanism. One possibility is that it was too large. The motor that operates the cam is bulky and covers most of the area above the food tray, but it does not seem to obstruct anything. A second possibility is that the cogwheels did not operate reliably. There is no evidence one way or the other on this. The cogwheel on the present food magazine appears to be quite sturdy and, when operated manually, raised and lowered the food magazine with precision. A third possibility is that the motor operation was not sufficiently loud to result in reliable eating. That is, the raising of the food magazine did not function as a conditioned reinforcer. Iversen (personal communication, 2013) found with contemporary “silent operation” pellet dispensers for use with rats that the rat often left the food pellets in the tray after they were delivered. Only when he added a sound that occurred simultaneously with the dispenser operation did the rats rapidly approach the food cup and consume the pellet. A silent feeder could be an even greater problem with pigeons, because, unlike a pellet that stays in the food cup after it is delivered, grain is available only so long as the food tray is raised. The effect of a silent motor, however, would be compensated for by the fact that there is a light above the food aperture that presumably operated when the magazine motor was operated. A fourth possibility is that changing the duration of the reinforcement cycle would have required changing the cogwheel, because the magazine is raised so long as the upper portion of the cogwheel is in contact with the lever. Thus, reinforcement duration is fixed by the length of the upper portion of the cogwheel, such that changing reinforcement magnitude would require replacing the cogwheel with one configured another way. A final possibility is that solenoids were cheaper than the cogwheels, which, as noted, required an electric motor to operate. The solenoid, however, required an independent timing device external to the chamber to hold current on the solenoid throughout the reinforcement cycle. Thus, whatever monetary savings there was in using the solenoid may have been offset by the need for an external timer. It is difficult to assess after the fact which, if any, of these factors contributed to the switch to solenoids. Whatever the reason, the solenoid has been an enduring feature of pigeon food magazines from their first use in the 1950s to today.

Solenoids in early chambers typically were operated by a 110-v AC current. Indeed, most of the early electromechanical programming equipment operated off of this high voltage (Catania, 2002 ; Dinsmoor, 1990 ). These solenoids created no problems with most electromechanical circuitry of the era of this chamber, but when, beginning in the early 1960s, transistorized circuitry began replacing or complementing electromechanical equipment in operant conditioning laboratories, problems arose because of the electrical interference created by the operating and deactivating of these high-voltage solenoids. This interference caused transistors to operate at unscheduled times, thereby disrupting programming and recording equipment. These relatively high-voltage solenoids eventually were replaced by ones that were operated by a 28-v DC current, which generally did not disrupt sensitive equipment. These remain the standard today.

We also should comment on the lights for transilluminating the response keys, which thus served as discriminative stimuli. We could not discern whether the lights were operated by a 110-v alternating or 28-v direct current. Chambers of the era of Ferster and Skinner (1957) commonly were equipped with low-wattage 110 v AC Christmas tree lights as the discriminative stimuli. These lights were used because 28 v DC lights used to transilluminate response keys would flicker due to voltage fluctuations when the key was pecked and recorded, or even when reinforcers were “set up” by the electromechanical equipment used to control the experiment. The result could be a reliable visual cue as to the availability of a reinforcer, with the effect of undermining the experiment. Low-voltage (28 v DC) bulbs came into use only when it was feasible to operate them from a second power supply that operated independently of the one controlling the relay programming apparatus. Enclosed “pilot lights” located directly behind the key—typically much closer than the distance between the lights in this chamber and the back of the response key—offered some protection of the lights from the fine covering of pigeon dust described above. The most contemporary device for presenting discriminative stimuli is a computer screen, which offers the investigator almost unlimited control over the type and location of these stimuli.

Over time, this pigeon chamber found its way to the back of a shelf in an operant laboratory at Keio University, where it lay fallow until it recently was retrieved by the first author after an inquiry by the second. It is a truly rare item and as such is an important part of the collective heritage of our discipline. Beyond its obvious significance in the history of the experimental analysis of behavior, the chamber also is testimony to the strong international connections between behavior analysts, exemplified by the one between the early Japanese behavior analysts and Skinner that helped bring mid-twentieth century cutting-edge behavioral research apparatus to Japan.

Details of the chamber omitted from the general description in the section above labeled “ The Chamber ” appear in this Appendix.

The Chamber Shell

The exterior dimensions of the aluminum chamber are 56 cm long by 41 cm high by 33 cm wide. There are latches at either end (part of the latch on the work-area end is missing), centered on the short sides of the chest. Except for the attachment of a ventilation fan and a single aperture to accommodate the electrical cable, the chest otherwise looks like any other of this product line. The connecting cable that protrudes from the work panel through the rear wall appears to be original. It is 180 cm long, excluding the male Jones plugs (12 prong) connected to either end. The connector at the end distal to the chamber was attached either directly to the programming apparatus or to a female connector, which in turn attached to another cable that connected to the programming apparatus that controlled the contingencies to which the pigeon in the chamber was to be exposed. The cable covering is of a heavy fabric, rather than the later plastic cable coatings/coverings. The ventilation fan (shown in Fig. 2 on the rear long side of the chamber) is powered from a plug that connects through the ice chest wall to another plug located on the back side of the work panel. The fan housing is 11.5 cm long by 8 cm high, and protrudes 9.5 cm from the outer wall. It is powered by a 110 v AC motor manufactured by Fasco Industries of Rochester, NY (model number 507451N; the last letter is slightly marred, so it could be another letter). The opening for the fan on the inside of the chamber is 15.5 cm from the top and 9.5 cm from the rear wall of the service area. The hinges for the chamber lid are located on either end of the lid, as can be seen in the right photograph of Fig. 3. Attached around the inside perimeter of the lip of the ice chest is a rubber gasket (indicated by two arrows in the right photograph of Fig. 3), which has come unattached in several places but is not deteriorated.

The inside of the chamber is 50.5 cm long by 28 cm wide by 33 cm high. It is divided by an aluminum panel (hereafter, the work panel) into a work area (where the pigeon is placed), shown at the top of the left photograph and at the bottom of the right photograph of the chamber in Fig.  3 , and a service area, shown at the bottom of the left photograph and the top of the right photograph in this figure. The work area measures 32.5 cm long by 28 cm wide by 33 cm high and the service area 18 cm long by 28 cm wide by 33 cm high. The opening for the ventilation fan is on the right wall of the service area (when viewing the rear of the work panel from the service area), as shown in the left photograph of Fig.  3 . The floor of the work area is covered by a piece of wood, raising the work area by 3.8 cm, but it could not be determined whether this was part of the original design or was added later. We speculate that it may have been added in Japan because Japanese pigeons may not have been as tall as the ones used in the USA. If so, this may have made it more difficult for the pigeons to reach the response key.

The Work Panel

The work panel, shown in Figs. 4 and 5, is 27.2 cm wide by 32 cm high. Figure 4 shows a piece of black foam rubber (somewhat deteriorated, and difficult to determine whether original or added later) across the top such that it covers the small space that otherwise would exist at the top of the work panel between the work and service areas of the chamber (thus accounting for the difference in the chamber height and the work panel height). The foam is seen most clearly in the left photograph of Fig. 3, where the twine holding it onto the work panel also can be seen. The response key is located behind a 7.1 cm diameter opening, the center of which is about 26.8 cm from the chamber floor (23 cm from the wooden platform floor), on the midline (13.6 cm from the left wall) of the work panel, shown in Fig. 4. Below it is the food magazine aperture through which grain can be accessed. This aperture is 5.2 cm high by 5.6 cm wide, with its center also on the midline of the panel (13.6 cm from the left wall) and about 9.7 cm from the chamber floor (5.9 cm from the floor of the wooden platform).

There is no means of providing general illumination through devices built into the panel; however, there is a small two-prong electric plug in the top right corner of the work panel (Fig.  4 , arrow), with an unconnected wire attached to it. Inside the chamber in the work area are two candelabra type 100 v lamp holders (one of the holders contained a 110 v bulb) placed unattached on the floor directly below the loose wires. These can be seen in Fig.  3 in the upper right corner of the work area shown in the right photograph. The wires connected to the two candelabra bases appear to be old, but it cannot be determined whether they and the bases were original or not. The insulation on the wires is not plastic; rather, they are of the same fabric material as the afore-described cable wire, seemingly revealing something of its age. Whether these constituted a houselight for general illumination of the work area is not known.

Figure 5 shows rear (left photograph) and side (right photograph) views of the control side of the work panel. The single sheet of aluminum that comprises the panel is bent at a 90° angle such that its base covers the floor of the control area. The panel is braced by (now) rusty iron bars set at an angle and attached to the side lip of the work panel and its base, apparently to prevent the panel from coming out of position in the chamber as the pigeon pecks the key (there are no grooves for holding the work panel or other means of stabilizing it in the chamber). A similar, but horizontal, iron bar braces the panel from the work-area side, as noted in the “The Chamber” section above. The wiring and solder connections on the control panel appear to be original, although it is difficult to determine whether some of the connections have been re-soldered. Many of the individual wires leading to various components, however, are bundled with a wire binder holding them together and they appear to be unmodified over the years of the chamber’s residence at Keio. Located on the work panel are a connection box to which the cable connects, a response key, two stimulus lights used to transilluminate the response key, and a device for delivering mixed grain through the aforementioned aperture on the work-area side of the work panel.

A metal connection box is located in the lower left corner of the control side of the metal panel (viewing from the control area; see the left and right photographs of Fig.  5 ). The connector cord (connecting the box to the programming and recording equipment) is plugged into the box through a male Jones plug visible at the lower rear of the box. The box contains two 2-pole double-throw relays, function unknown. These relays sometimes were used to channel power to lights or food magazines. Some of the wires from the male 12-prong Jones plug connector go through these relays, but other wires go directly from the Jones plug connector to the various components.

Response Key

The response key, shown in Fig.  6 , is composed of a 6.5 cm square piece of thin black plastic on which is mounted a piece of white opaque plastic (6.4 cm high by 4 cm wide). The unit is located behind the circular opening in the work panel. The white plastic piece is unhinged and can move off its fixed location in four directions. There appears to be a small spring attached to the bracket at the top of the key assembly that holds the white, moveable portion of the key in place and ensures the return of the key to its neutral position at the end of each peck. The face of the key (the pecking surface) is recessed about 3 mm from the face of the work panel. The force requirement of the key does not appear to be adjustable. The key operation is described in the “ The Chamber ” section.

Stimulus Display

As noted above, the two stimulus lamps are shown in the left photographs of both Figs. 6 and 7 as the two black cylinders (marked by the short arrows, above the food magazine in Fig. 7). The one on the right (from the rear of the work panel) is placed above the plane passing through the center of the key aperture and the one on the left is placed below this plane. Their location is precise and they do not appear to have been added later, suggesting that this arrangement was part of the original chamber design, although some of the wires connected to the lights may have been cut and re-soldered. It was difficult to determine by visual inspection whether these jewel lamps (pilot lamps) were 24 v DC or 110 v AC. Twenty cm in front of the lamps is a piece of frosted glass (Fig. 6, left photograph, longer arrow), perhaps used to diffuse the light from the key lights evenly across the response key. The frosted glass is 65 mm behind the response key.

Food Magazine

The food magazine is shown in the left photograph of Fig.  7 . It is located behind the square aperture on the work-area side of the work panel. The cogwheel and lever are shown in the lower right photograph of Fig.  7 . The details of its operation, and a comparison of it with the later Gerbrands model shown in the upper right photograph of Fig.  7 are described in the “ The Chamber ” section above. The cogwheel is rotated by a 110-v AC synchronous motor, which is not readily visible because of the cogwheel. We could not determine whether the motor operates from a single pulse and continues to operate through a reinforcement cycle or whether continuous application of current to the motor is required to ensure raising and lowering of the food tray. There is a light above the feeder aperture that presumably illuminates with the operation of the magazine.

Author’s note

The second author’s participation in this project resulted from his receipt of a Global Professorship from Keio University.

Contributor Information

Takayuki Sakagami, Email: [email protected] .

Kennon A. Lattal, Email: ude.uvw@lattalk .

  • Asano, T., & Lattal, K. A. (2008). Historical note: Cumulative recorders manufactured in Japan. Journal of the Experimental Analysis of Behavior, 90, 125–129. doi:10.1901/jeab.2008.90-125
  • Asano, T., & Lattal, K. A. (2012). A missing link in the history of the cumulative recorder. Journal of the Experimental Analysis of Behavior, 98, 227–241. doi:10.1901/jeab.2012.98-227
  • Catania, A. C. (2002). The watershed years of 1958–1962 in the Harvard Pigeon lab. Journal of the Experimental Analysis of Behavior, 77, 327–345. doi:10.1901/jeab.2002.77-327
  • Dinsmoor, J. A. (1990). Academic roots: Columbia University 1943–1951. Journal of the Experimental Analysis of Behavior, 54, 129–149. doi:10.1901/jeab.1990.54-129
  • Dunham, P. J., Mariner, A., & Adams, H. (1969). Enhancement of off-key pecking by on-key punishment. Journal of the Experimental Analysis of Behavior, 12, 789–797. doi:10.1901/jeab.1969.12-789
  • Ferster, C. B. (1953). The use of the free operant in the analysis of behavior. Psychological Bulletin, 50, 263–274. doi:10.1037/h0055514
  • Ferster, C. B., & Skinner, B. F. (1957). Schedules of reinforcement. New York: Appleton-Century-Crofts.
  • Heron, W. T., & Skinner, B. F. (1939). An apparatus for the study of animal behavior. Psychological Record, 3, 166–176.
  • Laties, V. G., Weiss, B., Clark, R. L., & Reynolds, M. D. (1965). Overt “mediating” behavior during temporally spaced responding. Journal of the Experimental Analysis of Behavior, 8, 107–116. doi:10.1901/jeab.1965.8-107
  • Lattal, K. A. (2004). Steps and pips in the history of the cumulative recorder. Journal of the Experimental Analysis of Behavior, 82, 329–355.
  • Ohinata, S. (1955). The relative efficiency of luminance and wave-length during the discrimination learning in the pigeon. Shinrigaku Kenkyu (The Japanese Journal of Psychology), 26, 311–319. doi:10.4992/jjpsy.26.311
  • Sato, M. (1963). Appetite and operant behavior in the pigeon—Preliminary report. The Annual of Animal Psychology, 13, 95–100.
  • Skinner, B. F. (1951). The experimental analysis of behavior. Plenary address at the 13th International Congress of Psychology, Stockholm.
  • Skinner, B. F. (1960). Pigeons in a pelican. American Psychologist, 15, 28–37. doi:10.1037/h0045345
  • Skinner, B. F. (1979). The shaping of a behaviorist. New York: Knopf.
  • Skinner, B. F. (1983). A matter of consequences. New York: Knopf.
  • Staddon, J. E. R., & Simmelhag, V. L. (1971). The “superstition” experiment: A reexamination of its implications for the principles of adaptive behavior. Psychological Review, 78, 3–43. doi:10.1037/h0030305

PSYCH101: Introduction to Psychology

Principles of Operant Conditioning

Read this text, which discusses the definition of operant conditioning, describes the difference between reinforcement and punishment, and introduces reinforcement schedules. Make sure you can respond to these questions. What is a Skinner box, and what is its purpose? What is the difference between negative reinforcement and punishment? What is shaping, and how would you use shaping to teach a dog to roll over?

The previous section of this chapter focused on the type of associative learning known as classical conditioning. Remember that in classical conditioning, something in the environment triggers a reflex automatically, and researchers train the organism to react to a different stimulus. Now we turn to the second type of associative learning, operant conditioning . In operant conditioning, organisms learn to associate a behavior and its consequence (Table 6.1). A pleasant consequence makes that behavior more likely to be repeated in the future. For example, Spirit, a dolphin at the National Aquarium in Baltimore, does a flip in the air when her trainer blows a whistle. The consequence is that she gets a fish.

Table 6.1. Classical and Operant Conditioning Compared

Conditioning approach
  • Classical conditioning: An unconditioned stimulus (such as food) is paired with a neutral stimulus (such as a bell). The neutral stimulus eventually becomes the conditioned stimulus, which brings about the conditioned response (salivation).
  • Operant conditioning: The target behavior is followed by reinforcement or punishment to either strengthen or weaken it, so that the learner is more likely to exhibit the desired behavior in the future.

Stimulus timing
  • Classical conditioning: The stimulus occurs immediately before the response.
  • Operant conditioning: The stimulus (either reinforcement or punishment) occurs soon after the response.

Psychologist B. F. Skinner saw that classical conditioning is limited to existing behaviors that are reflexively elicited, and it doesn't account for new behaviors such as riding a bike. He proposed a theory about how such behaviors come about. Skinner believed that behavior is motivated by the consequences we receive for the behavior: the reinforcements and punishments. His idea that learning is the result of consequences is based on the law of effect, which was first proposed by psychologist Edward Thorndike.

According to the law of effect , behaviors that are followed by consequences that are satisfying to the organism are more likely to be repeated, and behaviors that are followed by unpleasant consequences are less likely to be repeated. Essentially, if an organism does something that brings about a desired result, the organism is more likely to do it again. If an organism does something that does not bring about a desired result, the organism is less likely to do it again. An example of the law of effect is in employment. One of the reasons (and often the main reason) we show up for work is because we get paid to do so. If we stop getting paid, we will likely stop showing up - even if we love our job.

Working with Thorndike's law of effect as his foundation, Skinner began conducting scientific experiments on animals (mainly rats and pigeons) to determine how organisms learn through operant conditioning. He placed these animals inside an operant conditioning chamber, which has come to be known as a "Skinner box" (Figure 6.10). A Skinner box contains a lever (for rats) or disk (for pigeons) that the animal can press or peck for a food reward via the dispenser. Speakers and lights can be associated with certain behaviors. A recorder counts the number of responses made by the animal.

Figure 6.10 (a) B. F. Skinner developed operant conditioning for systematic study of how behaviors are strengthened or weakened according to their consequences. (b) In a Skinner box, a rat presses a lever in an operant conditioning chamber to receive a food reward.

In discussing operant conditioning, we use several everyday words - positive, negative, reinforcement, and punishment - in a specialized manner. In operant conditioning, positive and negative do not mean good and bad. Instead, positive means you are adding something, and negative means you are taking something away. Reinforcement means you are increasing a behavior, and punishment means you are decreasing a behavior. Reinforcement can be positive or negative, and punishment can also be positive or negative. All reinforcers (positive or negative) increase the likelihood of a behavioral response. All punishers (positive or negative) decrease the likelihood of a behavioral response. Now let's combine these four terms: positive reinforcement, negative reinforcement, positive punishment, and negative punishment (Table 6.2).

Table 6.2. Positive and Negative Reinforcement and Punishment

  • Positive reinforcement: something is added to increase the likelihood of a behavior.
  • Positive punishment: something is added to decrease the likelihood of a behavior.
  • Negative reinforcement: something is removed to increase the likelihood of a behavior.
  • Negative punishment: something is removed to decrease the likelihood of a behavior.
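The 2 x 2 terminology above comes down to two questions: is a stimulus added or removed, and does the behavior become more or less likely? The following minimal, hypothetical Python helper (an illustration added here, not part of the original text) simply encodes that mapping.

    def operant_term(stimulus: str, behavior_change: str) -> str:
        """Name the operant-conditioning term for a consequence.

        stimulus: "added" (positive) or "removed" (negative)
        behavior_change: "increases" (reinforcement) or "decreases" (punishment)
        """
        sign = {"added": "positive", "removed": "negative"}[stimulus]
        effect = {"increases": "reinforcement", "decreases": "punishment"}[behavior_change]
        return f"{sign} {effect}"

    print(operant_term("added", "increases"))    # positive reinforcement (e.g., a treat for a trick)
    print(operant_term("removed", "increases"))  # negative reinforcement (e.g., the seatbelt beeping stops)
    print(operant_term("added", "decreases"))    # positive punishment
    print(operant_term("removed", "decreases"))  # negative punishment (e.g., losing playtime)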

Reinforcement

The most effective way to teach a person or animal a new behavior is with positive reinforcement. In positive reinforcement , a desirable stimulus is added to increase a behavior. For example, you tell your five-year-old son, Jerome, that if he cleans his room, he will get a toy. Jerome quickly cleans his room because he wants a new art set. Let's pause for a moment. Some people might say, "Why should I reward my child for doing what is expected?" But in fact we are constantly and consistently rewarded in our lives. Our paychecks are rewards, as are high grades and acceptance into our preferred school.

Being praised for doing a good job and for passing a driver's test is also a reward. Positive reinforcement as a learning tool is extremely effective. It has been found that one of the most effective ways to increase achievement in school districts with below-average reading scores was to pay the children to read. Specifically, second-grade students in Dallas were paid $2 each time they read a book and passed a short quiz about the book. The result was a significant increase in reading comprehension. What do you think about this program? If Skinner were alive today, he would probably think this was a great idea. He was a strong proponent of using operant conditioning principles to influence students' behavior at school. In fact, in addition to the Skinner box, he also invented what he called a teaching machine that was designed to reward small steps in learning - an early forerunner of computer-assisted learning. His teaching machine tested students' knowledge as they worked through various school subjects. If students answered questions correctly, they received immediate positive reinforcement and could continue; if they answered incorrectly, they did not receive any reinforcement. The idea was that students would spend additional time studying the material to increase their chance of being reinforced the next time.
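The contingency built into the teaching machine - a correct answer is reinforced immediately and the student moves on, while an incorrect answer earns nothing and the item is tried again - can be sketched in a few lines of Python. The question bank and wording below are invented for the example; this is an illustration of the logic, not a reconstruction of the actual device.

    # A toy, teaching-machine-style drill: the learner advances only after a
    # correct answer (immediate reinforcement); an incorrect answer earns no
    # reinforcement and the same item is presented again.
    QUESTIONS = [
        ("2 + 2 = ?", "4"),
        ("Does reinforcement make a behavior more or less likely?", "more"),
    ]

    def run_drill(answer_fn):
        """answer_fn(prompt) -> str, so the loop can be driven by input() or by a script."""
        for prompt, correct in QUESTIONS:
            while True:
                if answer_fn(prompt).strip().lower() == correct:
                    print("Correct! Moving on.")   # immediate positive reinforcement
                    break
                print("Not yet - try again.")      # no reinforcement; the item repeats

    # Scripted run (replace the lambda with `input` for an interactive session):
    scripted = iter(["5", "4", "more"])
    run_drill(lambda prompt: (print(prompt), next(scripted))[1])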

In negative reinforcement , an undesirable stimulus is removed to increase a behavior. For example, car manufacturers use the principles of negative reinforcement in their seatbelt systems, which go "beep, beep, beep" until you fasten your seatbelt. The annoying sound stops when you exhibit the desired behavior, increasing the likelihood that you will buckle up in the future. Negative reinforcement is also used frequently in horse training. Riders apply pressure - by pulling the reins or squeezing their legs - and then remove the pressure when the horse performs the desired behavior, such as turning or speeding up. The pressure is the negative stimulus that the horse wants to remove.

While positive punishment can be effective in some cases, Skinner suggested that the use of punishment should be weighed against the possible negative effects. Today's psychologists and parenting experts favor reinforcement over punishment - they recommend that you catch your child doing something good and reward them for it.

In his experiments, Skinner often relied on an approach called shaping: rather than rewarding only the final target behavior, successive approximations of that behavior are reinforced, step by step:

  • Reinforce any response that resembles the desired behavior.
  • Then reinforce the response that more closely resembles the desired behavior. You will no longer reinforce the previously reinforced response.
  • Next, begin to reinforce the response that even more closely resembles the desired behavior.
  • Continue to reinforce closer and closer approximations of the desired behavior.
  • Finally, only reinforce the desired behavior.

It's easy to see how shaping is effective in teaching behaviors to animals, but how does shaping work with humans? Let's consider parents whose goal is to have their child learn to clean his room. They use shaping to help him master steps toward the goal. Instead of performing the entire task, they set up these steps and reinforce each step. First, he cleans up one toy. Second, he cleans up five toys. Third, he chooses whether to pick up ten toys or put his books and clothes away. Fourth, he cleans up everything except two toys. Finally, he cleans his entire room.
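The stepwise logic above also lends itself to a small simulation. The following hypothetical Python sketch (an illustration added here, with made-up numbers, not a published procedure) reinforces successive approximations: a simulated learner's responses vary around its current skill, reinforced responses strengthen that skill, and the criterion for reinforcement is raised only after the learner meets the current criterion reliably.

    import random

    random.seed(1)  # fixed seed so the toy example is repeatable

    def emit_behavior(skill: float) -> float:
        """A simulated response whose quality varies around the learner's current skill."""
        return max(0.0, min(1.0, random.gauss(skill, 0.1)))

    def shape(target: float = 0.9, criterion: float = 0.2, step: float = 0.1) -> int:
        """Reinforce successive approximations, raising the criterion after three
        consecutive reinforced responses. Returns the number of trials used."""
        skill, streak, trials = 0.1, 0, 0
        while criterion <= target:
            trials += 1
            if emit_behavior(skill) >= criterion:   # close enough to the current approximation
                skill = min(1.0, skill + 0.05)      # reinforced behavior strengthens
                streak += 1
                if streak == 3:                     # reliable at this level -> raise the bar
                    criterion += step
                    streak = 0
            else:
                streak = 0                          # unreinforced responses do not advance the criterion
        return trials

    print("Trials needed to reach the target behavior:", shape())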

Primary and Secondary Reinforcers

Reinforcers fall into two broad classes: primary reinforcers, such as food, water, and sleep, have innate reinforcing qualities, while secondary reinforcers, such as praise, money, or stickers on a sticker chart, acquire their value only through their link to a primary reinforcer. Sometimes, instead of stickers on a sticker chart, a token is used. Tokens, which are also secondary reinforcers, can then be traded in for rewards and prizes. Entire behavior management systems, known as token economies, are built around the use of these kinds of token reinforcers. Token economies have been found to be very effective at modifying behavior in a variety of settings such as schools, prisons, and mental hospitals.

For example, a study by Adibsereshki and Abkenar (2014) found that use of a token economy increased appropriate social behaviors and reduced inappropriate behaviors in a group of eighth-grade students. Similar studies show demonstrable gains in behavior and academic achievement for groups ranging from first grade through high school, representing a wide array of abilities and disabilities. For example, in studies involving younger students, children who exhibited appropriate behavior (not hitting or pinching) received a "quiet hands" token. When they hit or pinched, they lost a token. The children could then exchange specified amounts of tokens for minutes of playtime.
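A token economy like the "quiet hands" program can be sketched as a simple ledger. The class below is purely illustrative: the behavior labels mirror the study described above, but the exchange rate of two minutes of playtime per token is an assumption, not a figure from the research.

```python
# Hypothetical sketch of a token economy: appropriate behavior earns tokens,
# inappropriate behavior costs tokens, and accumulated tokens are exchanged
# for a backup reinforcer (minutes of playtime). Exchange rate is illustrative.

class TokenEconomy:
    def __init__(self, minutes_per_token: int = 2):
        self.tokens = 0
        self.minutes_per_token = minutes_per_token

    def record(self, behavior: str) -> None:
        if behavior == "quiet hands":                 # appropriate behavior -> earn a token
            self.tokens += 1
        elif behavior in ("hitting", "pinching"):     # inappropriate behavior -> lose a token
            self.tokens = max(0, self.tokens - 1)

    def exchange(self) -> int:
        """Trade all tokens for playtime minutes (the terminal reinforcer)."""
        minutes = self.tokens * self.minutes_per_token
        self.tokens = 0
        return minutes

if __name__ == "__main__":
    economy = TokenEconomy()
    for observed in ["quiet hands", "quiet hands", "hitting", "quiet hands"]:
        economy.record(observed)
    print("playtime earned:", economy.exchange(), "minutes")  # -> 4 minutes
```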

Everyday Connection: Behavior Modification in Children

Parents and teachers often use behavior modification to change a child's behavior. Behavior modification uses the principles of operant conditioning to accomplish behavior change so that undesirable behaviors are switched for more socially acceptable ones. Some teachers and parents create a sticker chart, in which several behaviors are listed (Figure 6.11). Sticker charts are a form of token economies, as described in the text. Each time children perform the behavior, they get a sticker, and after a certain number of stickers, they get a prize, or reinforcer. The goal is to increase acceptable behaviors and decrease misbehavior.

Remember, it is best to reinforce desired behaviors, rather than to use punishment. In the classroom, the teacher can reinforce a wide range of behaviors, from students raising their hands, to walking quietly in the hall, to turning in their homework. At home, parents might create a behavior chart that rewards children for things such as putting away toys, brushing their teeth, and helping with dinner. In order for behavior modification to be effective, the reinforcement needs to be connected with the behavior; the reinforcement must matter to the child and be done consistently.

A photograph shows a child placing stickers on a chart hanging on the wall.

Figure 6.11 Sticker charts are a form of positive reinforcement and a tool for behavior modification. Once this child earns a certain number of stickers for demonstrating a desired behavior, she will be rewarded with a trip to the ice cream parlor.

Time-out is another popular technique used in behavior modification with children. It operates on the principle of negative punishment. When a child demonstrates an undesirable behavior, they are removed from the desirable activity at hand (Figure 6.12). For example, say that Sophia and her brother Mario are playing with building blocks. Sophia throws some blocks at her brother, so you give her a warning that she will go to time-out if she does it again. A few minutes later, she throws more blocks at Mario. You remove Sophia from the room for a few minutes. When she comes back, she doesn't throw blocks.

There are several important points that you should know if you plan to implement time-out as a behavior modification technique. First, make sure the child is being removed from a desirable activity and placed in a less desirable location. If the activity is something undesirable for the child, this technique will backfire because it is more enjoyable for the child to be removed from the activity. Second, the length of the time-out is important.

The general rule of thumb is one minute for each year of the child's age. Sophia is five; therefore, she sits in a time-out for five minutes. Setting a timer helps children know how long they have to sit in time-out. Finally, as a caregiver, keep several guidelines in mind over the course of a time-out: remain calm when directing your child to time-out; ignore your child during time-out (because caregiver attention may reinforce misbehavior); and give the child a hug or a kind word when time-out is over.

Photograph A shows several children climbing on playground equipment. Photograph B shows a child sitting alone on a bench.

Figure 6.12 Time-out is a popular form of negative punishment used by caregivers. When a child misbehaves, they are removed from a desirable activity in an effort to decrease the unwanted behavior. For example, (a) a child might be playing on the playground with friends and push another child; (b) the child who misbehaved would then be removed from the activity for a short period of time.

Reinforcement Schedules

Remember, the best way to teach a person or animal a behavior is to use positive reinforcement. For example, Skinner used positive reinforcement to teach rats to press a lever in a Skinner box. At first, the rat might randomly hit the lever while exploring the box, and out would come a pellet of food. After eating the pellet, what do you think the hungry rat did next? It hit the lever again, and received another pellet of food. Each time the rat hit the lever, a pellet of food came out. When an organism receives a reinforcer each time it displays a behavior, it is called continuous reinforcement .

This reinforcement schedule is the quickest way to teach someone a behavior, and it is especially effective in training a new behavior. Let's look back at the dog that was learning to sit earlier in the chapter. Now, each time he sits, you give him a treat. Timing is important here: you will be most successful if you present the reinforcer immediately after he sits, so that he can make an association between the target behavior (sitting) and the consequence (getting a treat).

Once a behavior is trained, researchers and trainers often turn to another type of reinforcement schedule - partial reinforcement. In partial reinforcement , also referred to as intermittent reinforcement, the person or animal does not get reinforced every time they perform the desired behavior. There are several different types of partial reinforcement schedules (Table 6.3). These schedules are described as either fixed or variable, and as either interval or ratio. Fixed refers to the number of responses between reinforcements, or the amount of time between reinforcements, which is set and unchanging. Variable refers to the number of responses or amount of time between reinforcements, which varies or changes. Interval means the schedule is based on the time between reinforcements, and ratio means the schedule is based on the number of responses between reinforcements.
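One way to see how the fixed/variable and interval/ratio distinctions combine is a small sketch like the one below. The class, its mean-requirement parameter, and the uniform draw used for the variable schedules are illustrative assumptions, not a standard implementation from the text.

```python
import random

class PartialSchedule:
    """Minimal sketch of the four partial reinforcement schedules.

    Ratio schedules count responses since the last reinforcement; interval
    schedules count elapsed time units. Fixed schedules use a constant
    requirement; variable schedules redraw the requirement at random
    (here, uniformly around the mean) after each reinforcement.
    """

    def __init__(self, kind: str, mean_requirement: float):
        assert kind in ("fixed ratio", "variable ratio",
                        "fixed interval", "variable interval")
        self.kind = kind
        self.mean = mean_requirement
        self.progress = 0.0              # responses made or time elapsed so far
        self.requirement = self._draw()

    def _draw(self) -> float:
        if self.kind.startswith("fixed"):
            return self.mean
        return random.uniform(1, 2 * self.mean - 1)   # unpredictable requirement

    def step(self, responses: int = 0, elapsed: float = 0.0) -> bool:
        """Advance the schedule and report whether reinforcement is delivered."""
        self.progress += responses if self.kind.endswith("ratio") else elapsed
        if self.progress >= self.requirement:
            self.progress = 0.0
            self.requirement = self._draw()
            return True
        return False

if __name__ == "__main__":
    fr5 = PartialSchedule("fixed ratio", 5)
    print([fr5.step(responses=1) for _ in range(10)])
    # reinforcement arrives on every 5th response: [F, F, F, F, T, F, F, F, F, T]
```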

Table 6.3 Reinforcement Schedules
Reinforcement Schedule | Description | Result | Example
Fixed interval | Reinforcement is delivered at predictable time intervals (e.g., after 5, 10, 15, and 20 minutes). | Moderate response rate with significant pauses after reinforcement | Hospital patient uses patient-controlled, doctor-timed pain relief
Variable interval | Reinforcement is delivered at unpredictable time intervals (e.g., after 5, 7, 10, and 20 minutes). | Moderate yet steady response rate | Checking social media
Fixed ratio | Reinforcement is delivered after a predictable number of responses (e.g., after 2, 4, 6, and 8 responses). | High response rate with pauses after reinforcement | Piecework—factory worker getting paid for every x number of items manufactured
Variable ratio | Reinforcement is delivered after an unpredictable number of responses (e.g., after 1, 4, 5, and 9 responses). | High and steady response rate | Gambling

A graph has an x-axis labeled "Time" and a y-axis labeled "Cumulative number of responses." Lines for the variable ratio and fixed ratio schedules rise steeply, while lines for the variable interval and fixed interval schedules rise more gradually; the fixed schedules show pauses after each reinforcement.

Figure 6.13 The four reinforcement schedules yield different response patterns. The variable ratio schedule is unpredictable and yields high and steady response rates, with little if any pause after reinforcement (e.g., gambler). A fixed ratio schedule is predictable and produces a high response rate, with a short pause after reinforcement (e.g., eyeglass saleswoman). The variable interval schedule is unpredictable and produces a moderate, steady response rate (e.g., restaurant manager). The fixed interval schedule yields a scallop-shaped response pattern, reflecting a significant pause after reinforcement (e.g., surgery patient).

Connect the Concepts: Gambling and the Brain

Skinner (1953) stated, "If the gambling establishment cannot persuade a patron to turn over money with no return, it may achieve the same effect by returning part of the patron's money on a variable-ratio schedule."

Skinner uses gambling as an example of the power of the variable-ratio reinforcement schedule for maintaining behavior even during long periods without any reinforcement. In fact, Skinner was so confident in his knowledge of gambling addiction that he even claimed he could turn a pigeon into a pathological gambler. It is indeed true that variable-ratio schedules keep behavior quite persistent - just imagine the frequency of a child's tantrums if a parent gives in even once to the behavior. The occasional reward makes it almost impossible to stop the behavior.

Recent research in rats has failed to support Skinner's idea that training on variable-ratio schedules alone causes pathological gambling. However, other research suggests that gambling does seem to work on the brain in the same way as most addictive drugs, and so there may be some combination of brain chemistry and reinforcement schedule that could lead to problem gambling (Figure 6.14). Specifically, modern research shows the connection between gambling and the activation of the reward centers of the brain that use the neurotransmitter (brain chemical) dopamine. Interestingly, gamblers don't even have to win to experience the "rush" of dopamine in the brain. 

"Near misses," or almost winning but not actually winning, also have been shown to increase activity in the ventral striatum and other brain reward centers that use dopamine. These brain effects are almost identical to those produced by addictive drugs like cocaine and heroin. Based on the neuroscientific evidence showing these similarities, the DSM-5 now considers gambling an addiction, while earlier versions of the DSM classified gambling as an impulse control disorder.

A photograph shows four digital gaming machines.

Figure 6.14 Some research suggests that pathological gamblers use gambling to compensate for abnormally low levels of the hormone norepinephrine, which is associated with stress and is secreted in moments of arousal and thrill.

In addition to dopamine, gambling also appears to involve other neurotransmitters, including norepinephrine and serotonin. Norepinephrine is secreted when a person feels stress, arousal, or thrill. It may be that pathological gamblers use gambling to increase their levels of this neurotransmitter. Deficiencies in serotonin might also contribute to compulsive behavior, including a gambling addiction.

It may be that pathological gamblers' brains are different than those of other people, and perhaps this difference may somehow have led to their gambling addiction, as these studies seem to suggest.

However, it is very difficult to ascertain the cause because it is impossible to conduct a true experiment (it would be unethical to try to turn randomly assigned participants into problem gamblers). Therefore, it may be that causation actually moves in the opposite direction - perhaps the act of gambling somehow changes neurotransmitter levels in some gamblers' brains. It also is possible that some overlooked factor, or confounding variable, played a role in both the gambling addiction and the differences in brain chemistry.

Cognition and Latent Learning

An illustration shows three rats in a maze, with a starting point and food at the end.

Figure 6.15 Psychologist Edward Tolman found that rats use cognitive maps to navigate through a maze. Have you ever worked your way through various levels on a video game? You learned when to turn left or right, move up or down. In that case you were relying on a cognitive map, just like the rats in a maze.

In Tolman's experiments, rats that explored a maze without any reward still formed a cognitive map of it, and they demonstrated that learning as soon as food was introduced - an effect called latent learning: learning that occurs but is not observable in behavior until there is a reason to demonstrate it. Latent learning also occurs in humans. Children may learn by watching the actions of their parents but only demonstrate it at a later date, when the learned material is needed. For example, suppose that Ravi's dad drives him to school every day. In this way, Ravi learns the route from his house to his school, but he's never driven there himself, so he has not had a chance to demonstrate that he's learned the way. One morning Ravi's dad has to leave early for a meeting, so he can't drive Ravi to school. Instead, Ravi follows the same route on his bike that his dad would have taken in the car. This demonstrates latent learning. Ravi had learned the route to school, but had no need to demonstrate this knowledge earlier.

Everyday Connection: This Place Is Like a Maze

Have you ever gotten lost in a building and couldn't find your way back out? While that can be frustrating, you're not alone. At one time or another we've all gotten lost in places like a museum, hospital, or university library. Whenever we go someplace new, we build a mental representation - or cognitive map - of the location, as Tolman's rats built a cognitive map of their maze. However, some buildings are confusing because they include many areas that look alike or have short lines of sight.

Because of this, it's often difficult to predict what's around a corner or decide whether to turn left or right to get out of a building. Psychologist Laura Carlson (2010) suggests that what we place in our cognitive map can impact our success in navigating through the environment. She suggests that paying attention to specific features upon entering a building, such as a picture on the wall, a fountain, a statue, or an escalator, adds information to our cognitive map that can be used later to help find our way out of the building.
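A cognitive map can be pictured as a small graph of rooms and doorways, with Carlson's landmarks attached to the junctions where they were noticed. The building layout, room names, and landmarks below are invented purely for illustration.

```python
# Toy cognitive map of a building: rooms are nodes, doorways are edges, and
# landmarks are attached to the rooms where they were noticed. A simple
# breadth-first search finds the way back to the exit and lists the landmarks
# passed along the route. Layout and names are illustrative assumptions.

from collections import deque

DOORWAYS = {
    "entrance": ["lobby"],
    "lobby": ["entrance", "fountain hall", "stairwell"],
    "fountain hall": ["lobby", "gallery"],
    "stairwell": ["lobby", "reading room"],
    "gallery": ["fountain hall"],
    "reading room": ["stairwell"],
}
LANDMARKS = {"fountain hall": "fountain", "gallery": "large mural"}

def route_to_exit(start: str, exit_room: str = "entrance"):
    """Search the cognitive map and report the landmarks passed on the way out."""
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == exit_room:
            return path, [LANDMARKS[r] for r in path if r in LANDMARKS]
        for nxt in DOORWAYS[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None, []

if __name__ == "__main__":
    path, landmarks = route_to_exit("gallery")
    print(" -> ".join(path))                 # gallery -> fountain hall -> lobby -> entrance
    print("landmarks on the way out:", landmarks)
```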


Chapter 8: Learning

Operant Conditioning

Learning Objectives

By the end of this section, you will be able to:

  • Define operant conditioning
  • Explain the difference between reinforcement and punishment
  • Distinguish between reinforcement schedules

The previous section of this chapter focused on the type of associative learning known as classical conditioning. Remember that in classical conditioning, something in the environment triggers a reflex automatically, and researchers train the organism to react to a different stimulus. Now we turn to the second type of associative learning, operant conditioning. In operant conditioning, organisms learn to associate a behavior and its consequence (see the comparison table below). A pleasant consequence makes that behavior more likely to be repeated in the future. For example, Spirit, a dolphin at the National Aquarium in Baltimore, does a flip in the air when her trainer blows a whistle. The consequence is that she gets a fish.

Classical and Operant Conditioning Compared
 | Classical Conditioning | Operant Conditioning
Conditioning approach | An unconditioned stimulus (such as food) is paired with a neutral stimulus (such as a bell). The neutral stimulus eventually becomes the conditioned stimulus, which brings about the conditioned response (salivation). | The target behavior is followed by reinforcement or punishment to either strengthen or weaken it, so that the learner is more likely to exhibit the desired behavior in the future.
Stimulus timing | The stimulus occurs immediately before the response. | The stimulus (either reinforcement or punishment) occurs soon after the response.

Psychologist B. F. Skinner saw that classical conditioning is limited to existing behaviors that are reflexively elicited, and it doesn’t account for new behaviors such as riding a bike. He proposed a theory about how such behaviors come about. Skinner believed that behavior is motivated by the consequences we receive for the behavior: the reinforcements and punishments. His idea that learning is the result of consequences is based on the law of effect, which was first proposed by psychologist Edward Thorndike . According to the law of effect , behaviors that are followed by consequences that are satisfying to the organism are more likely to be repeated, and behaviors that are followed by unpleasant consequences are less likely to be repeated (Thorndike, 1911). Essentially, if an organism does something that brings about a desired result, the organism is more likely to do it again. If an organism does something that does not bring about a desired result, the organism is less likely to do it again. An example of the law of effect is in employment. One of the reasons (and often the main reason) we show up for work is because we get paid to do so. If we stop getting paid, we will likely stop showing up—even if we love our job.
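The law of effect can be read as a simple update rule: satisfying consequences nudge the probability of repeating a behavior upward, unpleasant ones nudge it downward. The sketch below is a toy illustration with an arbitrary step size, not a model proposed by Thorndike or Skinner.

```python
# Toy sketch of the law of effect: the probability of repeating a behavior
# rises after a satisfying consequence and falls after an unpleasant one.
# The step size and starting probability are arbitrary illustration values.

def update_probability(p: float, satisfying: bool, step: float = 0.1) -> float:
    """Nudge the probability of repeating a behavior toward 1 or toward 0."""
    target = 1.0 if satisfying else 0.0
    return p + step * (target - p)

if __name__ == "__main__":
    p = 0.5                                            # initially indifferent
    for outcome in [True, True, True, False, True]:    # mostly satisfying outcomes
        p = update_probability(p, satisfying=outcome)
        print(round(p, 3))                             # probability drifts upward
```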

Working with Thorndike’s law of effect as his foundation, Skinner began conducting scientific experiments on animals (mainly rats and pigeons) to determine how organisms learn through operant conditioning (Skinner, 1938). He placed these animals inside an operant conditioning chamber, which has come to be known as a “Skinner box” (see the figure below). A Skinner box contains a lever (for rats) or disk (for pigeons) that the animal can press or peck for a food reward via the dispenser. Speakers and lights can be associated with certain behaviors. A recorder counts the number of responses made by the animal.
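A Skinner box session under continuous reinforcement can be sketched as a short simulation. Every value below (the chance of a spontaneous lever press, how much each pellet strengthens the response) is a made-up illustration, not data from Skinner's experiments.

```python
# Hypothetical simulation of a Skinner box session: a hungry rat presses the
# lever at random while exploring; every press dispenses a pellet (continuous
# reinforcement), each pellet strengthens the pressing response, and a
# cumulative recorder counts responses over time. All numbers are illustrative.

import random

def run_session(minutes: int = 30, press_probability: float = 0.2, seed: int = 0):
    random.seed(seed)
    presses, pellets, record = 0, 0, []
    for _ in range(minutes):
        if random.random() < press_probability:   # rat happens to press the lever
            presses += 1
            pellets += 1                           # dispenser delivers a food pellet
            press_probability = min(0.9, press_probability + 0.05)  # response strengthened
        record.append(presses)                     # cumulative response record
    return presses, pellets, record

if __name__ == "__main__":
    presses, pellets, record = run_session()
    print(f"{presses} lever presses, {pellets} pellets dispensed")
    print("cumulative record:", record)
```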

A photograph shows B.F. Skinner. An illustration shows a rat in a Skinner box: a chamber with a speaker, lights, a lever, and a food dispenser.

(a) B. F. Skinner developed operant conditioning for systematic study of how behaviors are strengthened or weakened according to their consequences. (b) In a Skinner box, a rat presses a lever in an operant conditioning chamber to receive a food reward. (credit a: modification of work by “Silly rabbit”/Wikimedia Commons)

Link to Learning

Watch this brief video clip to learn more about operant conditioning: Skinner is interviewed, and operant conditioning of pigeons is demonstrated.

In discussing operant conditioning, we use several everyday words—positive, negative, reinforcement, and punishment—in a specialized manner. In operant conditioning, positive and negative do not mean good and bad. Instead, positive means you are adding something, and negative means you are taking something away. Reinforcement means you are increasing a behavior, and punishment means you are decreasing a behavior. Reinforcement can be positive or negative, and punishment can also be positive or negative. All reinforcers (positive or negative) increase the likelihood of a behavioral response. All punishers (positive or negative) decrease the likelihood of a behavioral response. Now let’s combine these four terms: positive reinforcement, negative reinforcement, positive punishment, and negative punishment (see the table below).

Positive and Negative Reinforcement and Punishment
 | Reinforcement | Punishment
Positive | Something is added to increase the likelihood of a behavior. | Something is added to decrease the likelihood of a behavior.
Negative | Something is removed to increase the likelihood of a behavior. | Something is removed to decrease the likelihood of a behavior.
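The four combinations in the table reduce to two yes/no questions, which the small sketch below encodes; the function name and the parenthetical examples are illustrative, not terminology from the text.

```python
# Small sketch of the 2x2 terminology: whether a stimulus is added or removed,
# and whether the goal is to increase or decrease the behavior, determines
# which operant-conditioning term applies.

def classify(stimulus_added: bool, behavior_increases: bool) -> str:
    prefix = "positive" if stimulus_added else "negative"
    kind = "reinforcement" if behavior_increases else "punishment"
    return f"{prefix} {kind}"

if __name__ == "__main__":
    print(classify(True, True))    # positive reinforcement (e.g., a treat for sitting)
    print(classify(False, True))   # negative reinforcement (e.g., the seatbelt chime stops)
    print(classify(True, False))   # positive punishment (e.g., a scolding)
    print(classify(False, False))  # negative punishment (e.g., taking away a toy)
```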

REINFORCEMENT

The most effective way to teach a person or animal a new behavior is with positive reinforcement. In positive reinforcement , a desirable stimulus is added to increase a behavior.

For example, you tell your five-year-old son, Jerome, that if he cleans his room, he will get a toy. Jerome quickly cleans his room because he wants a new art set. Let’s pause for a moment. Some people might say, “Why should I reward my child for doing what is expected?” But in fact we are constantly and consistently rewarded in our lives. Our paychecks are rewards, as are high grades and acceptance into our preferred school. Being praised for doing a good job and for passing a driver’s test is also a reward. Positive reinforcement as a learning tool is extremely effective. It has been found that one of the most effective ways to increase achievement in school districts with below-average reading scores was to pay the children to read. Specifically, second-grade students in Dallas were paid $2 each time they read a book and passed a short quiz about the book. The result was a significant increase in reading comprehension (Fryer, 2010). What do you think about this program? If Skinner were alive today, he would probably think this was a great idea. He was a strong proponent of using operant conditioning principles to influence students’ behavior at school. In fact, in addition to the Skinner box, he also invented what he called a teaching machine that was designed to reward small steps in learning (Skinner, 1961)—an early forerunner of computer-assisted learning. His teaching machine tested students’ knowledge as they worked through various school subjects. If students answered questions correctly, they received immediate positive reinforcement and could continue; if they answered incorrectly, they did not receive any reinforcement. The idea was that students would spend additional time studying the material to increase their chance of being reinforced the next time (Skinner, 1961).

In negative reinforcement , an undesirable stimulus is removed to increase a behavior. For example, car manufacturers use the principles of negative reinforcement in their seatbelt systems, which go “beep, beep, beep” until you fasten your seatbelt. The annoying sound stops when you exhibit the desired behavior, increasing the likelihood that you will buckle up in the future. Negative reinforcement is also used frequently in horse training. Riders apply pressure—by pulling the reins or squeezing their legs—and then remove the pressure when the horse performs the desired behavior, such as turning or speeding up. The pressure is the negative stimulus that the horse wants to remove.

Many people confuse negative reinforcement with punishment in operant conditioning, but they are two very different mechanisms. Remember that reinforcement, even when it is negative, always increases a behavior. In contrast, punishment always decreases a behavior. In positive punishment, you add an undesirable stimulus to decrease a behavior. An example of positive punishment is scolding a student to get the student to stop texting in class. In this case, a stimulus (the reprimand) is added in order to decrease the behavior (texting in class). In negative punishment, you remove a pleasant stimulus to decrease a behavior. For example, a parent might take away a child’s favorite toy after the child misbehaves; removing the desirable toy decreases the misbehavior.

Punishment, especially when it is immediate, is one way to decrease undesirable behavior. For example, imagine your four-year-old son, Brandon, runs into the busy street to get his ball. You give him a time-out (negative punishment) and tell him never to go into the street again. Chances are he won’t repeat this behavior. While strategies like time-outs are common today, in the past children were often subject to physical punishment, such as spanking. It’s important to be aware of some of the drawbacks in using physical punishment on children. First, punishment may teach fear. Brandon may become fearful of the street, but he also may become fearful of the person who delivered the punishment—you, his parent. Similarly, children who are punished by teachers may come to fear the teacher and try to avoid school (Gershoff et al., 2010). Consequently, most schools in the United States have banned corporal punishment. Second, punishment may cause children to become more aggressive and prone to antisocial behavior and delinquency (Gershoff, 2002). They see their parents resort to spanking when they become angry and frustrated, so, in turn, they may act out this same behavior when they become angry and frustrated. For example, because you spank Brandon when you are angry with him for his misbehavior, he might start hitting his friends when they won’t share their toys.

While positive punishment can be effective in some cases, Skinner suggested that the use of punishment should be weighed against the possible negative effects. Today’s psychologists and parenting experts favor reinforcement over punishment—they recommend that you catch your child doing something good and reward her for it.

In his operant conditioning experiments, Skinner often used an approach called shaping. Instead of rewarding only the target behavior, in shaping, we reward successive approximations of a target behavior. Why is shaping needed? Remember that in order for reinforcement to work, the organism must first display the behavior. Shaping is needed because it is extremely unlikely that an organism will display anything but the simplest of behaviors spontaneously. In shaping, behaviors are broken down into many small, achievable steps. The specific steps used in the process are the following:

  • Reinforce any response that resembles the desired behavior.
  • Then reinforce the response that more closely resembles the desired behavior. You will no longer reinforce the previously reinforced response.
  • Next, begin to reinforce the response that even more closely resembles the desired behavior.
  • Continue to reinforce closer and closer approximations of the desired behavior.
  • Finally, only reinforce the desired behavior.

Shaping is often used in teaching a complex behavior or chain of behaviors. Skinner used shaping to teach pigeons not only such relatively simple behaviors as pecking a disk in a Skinner box, but also many unusual and entertaining behaviors, such as turning in circles, walking in figure eights, and even playing ping pong; the technique is commonly used by animal trainers today. An important part of shaping is stimulus discrimination. Recall Pavlov’s dogs—he trained them to respond to the tone of a bell, and not to similar tones or sounds. This discrimination is also important in operant conditioning and in shaping behavior.

Here is a brief video of Skinner’s pigeons playing ping pong.

It’s easy to see how shaping is effective in teaching behaviors to animals, but how does shaping work with humans? Let’s consider parents whose goal is to have their child learn to clean his room. They use shaping to help him master steps toward the goal. Instead of performing the entire task, they set up these steps and reinforce each step. First, he cleans up one toy. Second, he cleans up five toys. Third, he chooses whether to pick up ten toys or put his books and clothes away. Fourth, he cleans up everything except two toys. Finally, he cleans his entire room.

PRIMARY AND SECONDARY REINFORCERS

Rewards such as stickers, praise, money, toys, and more can be used to reinforce learning. Let’s go back to Skinner’s rats again. How did the rats learn to press the lever in the Skinner box? They were rewarded with food each time they pressed the lever. For animals, food would be an obvious reinforcer.

What would be a good reinforcer for humans? For your son Jerome, it was the promise of a toy if he cleaned his room. How about Joaquin, the soccer player? If you gave Joaquin a piece of candy every time he made a goal, you would be using a primary reinforcer . Primary reinforcers are reinforcers that have innate reinforcing qualities. These kinds of reinforcers are not learned. Water, food, sleep, shelter, sex, and touch, among others, are primary reinforcers. Pleasure is also a primary reinforcer. Organisms do not lose their drive for these things. For most people, jumping in a cool lake on a very hot day would be reinforcing and the cool lake would be innately reinforcing—the water would cool the person off (a physical need), as well as provide pleasure.

A secondary reinforcer has no inherent value and only has reinforcing qualities when linked with a primary reinforcer. Praise, linked to affection, is one example of a secondary reinforcer, as when you called out “Great shot!” every time Joaquin made a goal. Another example, money, is only worth something when you can use it to buy other things—either things that satisfy basic needs (food, water, shelter—all primary reinforcers) or other secondary reinforcers. If you were on a remote island in the middle of the Pacific Ocean and you had stacks of money, the money would not be useful if you could not spend it. What about the stickers on the behavior chart? They also are secondary reinforcers.

Sometimes, instead of stickers on a sticker chart, a token is used. Tokens, which are also secondary reinforcers, can then be traded in for rewards and prizes. Entire behavior management systems, known as token economies, are built around the use of these kinds of token reinforcers. Token economies have been found to be very effective at modifying behavior in a variety of settings such as schools, prisons, and mental hospitals. For example, a study by Cangi and Daly (2013) found that use of a token economy increased appropriate social behaviors and reduced inappropriate behaviors in a group of autistic school children. Autistic children tend to exhibit disruptive behaviors such as pinching and hitting. When the children in the study exhibited appropriate behavior (not hitting or pinching), they received a “quiet hands” token. When they hit or pinched, they lost a token. The children could then exchange specified amounts of tokens for minutes of playtime.

Parents and teachers often use behavior modification to change a child’s behavior. Behavior modification uses the principles of operant conditioning to accomplish behavior change so that undesirable behaviors are switched for more socially acceptable ones. Some teachers and parents create a sticker chart, in which several behaviors are listed (see the figure below). Sticker charts are a form of token economies, as described in the text. Each time children perform the behavior, they get a sticker, and after a certain number of stickers, they get a prize, or reinforcer. The goal is to increase acceptable behaviors and decrease misbehavior. Remember, it is best to reinforce desired behaviors, rather than to use punishment. In the classroom, the teacher can reinforce a wide range of behaviors, from students raising their hands, to walking quietly in the hall, to turning in their homework. At home, parents might create a behavior chart that rewards children for things such as putting away toys, brushing their teeth, and helping with dinner. In order for behavior modification to be effective, the reinforcement needs to be connected with the behavior; the reinforcement must matter to the child and be done consistently.

A photograph shows a child placing stickers on a chart hanging on the wall.

Sticker charts are a form of positive reinforcement and a tool for behavior modification. Once this little girl earns a certain number of stickers for demonstrating a desired behavior, she will be rewarded with a trip to the ice cream parlor. (credit: Abigail Batchelder)

Time-out is another popular technique used in behavior modification with children. It operates on the principle of negative punishment. When a child demonstrates an undesirable behavior, she is removed from the desirable activity at hand (see the figure below). For example, say that Sophia and her brother Mario are playing with building blocks. Sophia throws some blocks at her brother, so you give her a warning that she will go to time-out if she does it again. A few minutes later, she throws more blocks at Mario. You remove Sophia from the room for a few minutes. When she comes back, she doesn’t throw blocks.

There are several important points that you should know if you plan to implement time-out as a behavior modification technique. First, make sure the child is being removed from a desirable activity and placed in a less desirable location. If the activity is something undesirable for the child, this technique will backfire because it is more enjoyable for the child to be removed from the activity. Second, the length of the time-out is important. The general rule of thumb is one minute for each year of the child’s age. Sophia is five; therefore, she sits in a time-out for five minutes. Setting a timer helps children know how long they have to sit in time-out. Finally, as a caregiver, keep several guidelines in mind over the course of a time-out: remain calm when directing your child to time-out; ignore your child during time-out (because caregiver attention may reinforce misbehavior); and give the child a hug or a kind word when time-out is over.

Photograph A shows several children climbing on playground equipment. Photograph B shows a child sitting alone at a table looking at the playground.

Time-out is a popular form of negative punishment used by caregivers. When a child misbehaves, he or she is removed from a desirable activity in an effort to decrease the unwanted behavior. For example, (a) a child might be playing on the playground with friends and push another child; (b) the child who misbehaved would then be removed from the activity for a short period of time. (credit a: modification of work by Simone Ramella; credit b: modification of work by “JefferyTurner”/Flickr)

REINFORCEMENT SCHEDULES

Remember, the best way to teach a person or animal a behavior is to use positive reinforcement. For example, Skinner used positive reinforcement to teach rats to press a lever in a Skinner box. At first, the rat might randomly hit the lever while exploring the box, and out would come a pellet of food. After eating the pellet, what do you think the hungry rat did next? It hit the lever again, and received another pellet of food. Each time the rat hit the lever, a pellet of food came out. When an organism receives a reinforcer each time it displays a behavior, it is called continuous reinforcement . This reinforcement schedule is the quickest way to teach someone a behavior, and it is especially effective in training a new behavior. Let’s look back at the dog that was learning to sit earlier in the chapter. Now, each time he sits, you give him a treat. Timing is important here: you will be most successful if you present the reinforcer immediately after he sits, so that he can make an association between the target behavior (sitting) and the consequence (getting a treat).

Watch this video clip where veterinarian Dr. Sophia Yin shapes a dog’s behavior using the steps outlined above.

Once a behavior is trained, researchers and trainers often turn to another type of reinforcement schedule—partial reinforcement. In partial reinforcement, also referred to as intermittent reinforcement, the person or animal does not get reinforced every time they perform the desired behavior. There are several different types of partial reinforcement schedules (see the table below). These schedules are described as either fixed or variable, and as either interval or ratio. Fixed refers to the number of responses between reinforcements, or the amount of time between reinforcements, which is set and unchanging. Variable refers to the number of responses or amount of time between reinforcements, which varies or changes. Interval means the schedule is based on the time between reinforcements, and ratio means the schedule is based on the number of responses between reinforcements.

Reinforcement Schedules
Reinforcement Schedule | Description | Result | Example
Fixed interval | Reinforcement is delivered at predictable time intervals (e.g., after 5, 10, 15, and 20 minutes). | Moderate response rate with significant pauses after reinforcement | Hospital patient uses patient-controlled, doctor-timed pain relief
Variable interval | Reinforcement is delivered at unpredictable time intervals (e.g., after 5, 7, 10, and 20 minutes). | Moderate yet steady response rate | Checking Facebook
Fixed ratio | Reinforcement is delivered after a predictable number of responses (e.g., after 2, 4, 6, and 8 responses). | High response rate with pauses after reinforcement | Piecework—factory worker getting paid for every x number of items manufactured
Variable ratio | Reinforcement is delivered after an unpredictable number of responses (e.g., after 1, 4, 5, and 9 responses). | High and steady response rate | Gambling

Now let’s combine these four terms. A fixed interval reinforcement schedule is when behavior is rewarded after a set amount of time. For example, June undergoes major surgery in a hospital. During recovery, she is expected to experience pain and will require prescription medications for pain relief. June is given an IV drip with a patient-controlled painkiller. Her doctor sets a limit: one dose per hour. June pushes a button when pain becomes difficult, and she receives a dose of medication. Since the reward (pain relief) only occurs on a fixed interval, there is no point in exhibiting the behavior when it will not be rewarded.

With a variable interval reinforcement schedule , the person or animal gets the reinforcement based on varying amounts of time, which are unpredictable. Say that Manuel is the manager at a fast-food restaurant. Every once in a while someone from the quality control division comes to Manuel’s restaurant. If the restaurant is clean and the service is fast, everyone on that shift earns a $20 bonus. Manuel never knows when the quality control person will show up, so he always tries to keep the restaurant clean and ensures that his employees provide prompt and courteous service. His productivity regarding prompt service and keeping a clean restaurant are steady because he wants his crew to earn the bonus.

With a fixed ratio reinforcement schedule, there are a set number of responses that must occur before the behavior is rewarded. Carla sells glasses at an eyeglass store, and she earns a commission every time she sells a pair of glasses. She always tries to sell people more pairs of glasses, including prescription sunglasses or a backup pair, so she can increase her commission. She does not care whether the person really needs the prescription sunglasses; Carla just wants her commission. The quality of what Carla sells does not matter because her commission is not based on quality; it’s only based on the number of pairs sold. This distinction in the quality of performance can help determine which reinforcement method is most appropriate for a particular situation. Fixed ratios are better suited to optimize the quantity of output, whereas a fixed interval, in which the reward is not quantity based, can lead to a higher quality of output.

In a variable ratio reinforcement schedule , the number of responses needed for a reward varies. This is the most powerful partial reinforcement schedule. An example of the variable ratio reinforcement schedule is gambling. Imagine that Sarah—generally a smart, thrifty woman—visits Las Vegas for the first time. She is not a gambler, but out of curiosity she puts a quarter into the slot machine, and then another, and another. Nothing happens. Two dollars in quarters later, her curiosity is fading, and she is just about to quit. But then, the machine lights up, bells go off, and Sarah gets 50 quarters back. That’s more like it! Sarah gets back to inserting quarters with renewed interest, and a few minutes later she has used up all her gains and is $10 in the hole. Now might be a sensible time to quit. And yet, she keeps putting money into the slot machine because she never knows when the next reinforcement is coming. She keeps thinking that with the next quarter she could win $50, or $100, or even more. Because the reinforcement schedule in most types of gambling has a variable ratio schedule, people keep trying and hoping that the next time they will win big. This is one of the reasons that gambling is so addictive—and so resistant to extinction.

In operant conditioning, extinction of a reinforced behavior occurs at some point after reinforcement stops, and the speed at which this happens depends on the reinforcement schedule. In a variable ratio schedule, the point of extinction comes very slowly, as described above. But in the other reinforcement schedules, extinction may come quickly. For example, if June presses the button for the pain relief medication before the allotted time her doctor has approved, no medication is administered. She is on a fixed interval reinforcement schedule (dosed hourly), so extinction occurs quickly when reinforcement doesn’t come at the expected time. Among the reinforcement schedules, variable ratio is the most productive and the most resistant to extinction. Fixed interval is the least productive and the easiest to extinguish (see the figure below).
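The difference in extinction resistance can be illustrated with a toy simulation: once reinforcement stops entirely, a learner trained on a variable ratio schedule keeps responding far longer than one trained on continuous reinforcement. The decay parameters below are invented solely to make that qualitative pattern visible; they are not empirical values.

```python
# Illustrative sketch of extinction resistance after reinforcement stops.
# Variable ratio training is modeled as slower erosion of the urge to respond,
# because unreinforced responses are nothing unusual under that schedule.

import random

def responses_until_quit(trained_on_variable_ratio: bool, seed: int = 1) -> int:
    random.seed(seed)
    # How much each unreinforced response erodes the urge to respond (assumed values).
    erosion = 0.02 if trained_on_variable_ratio else 0.15
    urge, count = 1.0, 0
    while urge > 0.05:           # learner quits once the urge falls below threshold
        count += 1               # responds, but no reinforcement ever arrives
        urge -= erosion * random.uniform(0.5, 1.5)
    return count

if __name__ == "__main__":
    print("continuous training:", responses_until_quit(False), "responses before quitting")
    print("variable ratio     :", responses_until_quit(True), "responses before quitting")
```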

A graph has an x-axis labeled “Time” and a y-axis labeled “Cumulative number of responses.” Two lines labeled “Variable Ratio” and “Fixed Ratio” have similar, steep slopes. The variable ratio line remains straight and is marked in random points where reinforcement occurs. The fixed ratio line has consistently spaced marks indicating where reinforcement has occurred, but after each reinforcement, there is a small drop in the line before it resumes its overall slope. Two lines labeled “Variable Interval” and “Fixed Interval” have similar slopes at roughly a 45-degree angle. The variable interval line remains straight and is marked in random points where reinforcement occurs. The fixed interval line has consistently spaced marks indicating where reinforcement has occurred, but after each reinforcement, there is a drop in the line.

The four reinforcement schedules yield different response patterns. The variable ratio schedule is unpredictable and yields high and steady response rates, with little if any pause after reinforcement (e.g., gambler). A fixed ratio schedule is predictable and produces a high response rate, with a short pause after reinforcement (e.g., eyeglass saleswoman). The variable interval schedule is unpredictable and produces a moderate, steady response rate (e.g., restaurant manager). The fixed interval schedule yields a scallop-shaped response pattern, reflecting a significant pause after reinforcement (e.g., surgery patient).

Connect the Concepts: Gambling and the Brain

Skinner (1953) stated, “If the gambling establishment cannot persuade a patron to turn over money with no return, it may achieve the same effect by returning part of the patron’s money on a variable-ratio schedule” (p. 397).

Skinner uses gambling as an example of the power and effectiveness of conditioning behavior based on a variable ratio reinforcement schedule. In fact, Skinner was so confident in his knowledge of gambling addiction that he even claimed he could turn a pigeon into a pathological gambler (“Skinner’s Utopia,” 1971). Beyond the power of variable ratio reinforcement, gambling seems to work on the brain in the same way as some addictive drugs. The Illinois Institute for Addiction Recovery (n.d.) reports evidence suggesting that pathological gambling is an addiction similar to a chemical addiction (see the figure below). Specifically, gambling may activate the reward centers of the brain, much like cocaine does. Research has shown that some pathological gamblers have lower levels of the neurotransmitter (brain chemical) known as norepinephrine than do normal gamblers (Roy et al., 1988). According to a study conducted by Alec Roy and colleagues, norepinephrine is secreted when a person feels stress, arousal, or thrill; pathological gamblers use gambling to increase their levels of this neurotransmitter. Another researcher, neuroscientist Hans Breiter, has done extensive research on gambling and its effects on the brain. Breiter (as cited in Franzen, 2001) reports that “Monetary reward in a gambling-like experiment produces brain activation very similar to that observed in a cocaine addict receiving an infusion of cocaine” (para. 1). Deficiencies in serotonin (another neurotransmitter) might also contribute to compulsive behavior, including a gambling addiction.

It may be that pathological gamblers’ brains are different than those of other people, and perhaps this difference may somehow have led to their gambling addiction, as these studies seem to suggest. However, it is very difficult to ascertain the cause because it is impossible to conduct a true experiment (it would be unethical to try to turn randomly assigned participants into problem gamblers). Therefore, it may be that causation actually moves in the opposite direction—perhaps the act of gambling somehow changes neurotransmitter levels in some gamblers’ brains. It also is possible that some overlooked factor, or confounding variable, played a role in both the gambling addiction and the differences in brain chemistry.

A photograph shows four digital gaming machines.

Some research suggests that pathological gamblers use gambling to compensate for abnormally low levels of the hormone norepinephrine, which is associated with stress and is secreted in moments of arousal and thrill. (credit: Ted Murphy)

COGNITION AND LATENT LEARNING

Although strict behaviorists such as Skinner and Watson refused to believe that cognition (such as thoughts and expectations) plays a role in learning, another behaviorist, Edward C. Tolman , had a different opinion. Tolman’s experiments with rats demonstrated that organisms can learn even if they do not receive immediate reinforcement (Tolman & Honzik, 1930; Tolman, Ritchie, & Kalish, 1946). This finding was in conflict with the prevailing idea at the time that reinforcement must be immediate in order for learning to occur, thus suggesting a cognitive aspect to learning.

In the experiments, Tolman placed hungry rats in a maze with no reward for finding their way through it. He also studied a comparison group that was rewarded with food at the end of the maze. As the unreinforced rats explored the maze, they developed a cognitive map: a mental picture of the layout of the maze (see the figure below). After 10 sessions in the maze without reinforcement, food was placed in a goal box at the end of the maze. As soon as the rats became aware of the food, they were able to find their way through the maze quickly, just as quickly as the comparison group, which had been rewarded with food all along. This is known as latent learning: learning that occurs but is not observable in behavior until there is a reason to demonstrate it.
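Tolman's result can be caricatured in a few lines: error rates fall with exposure to the maze whether or not reward is present, but the improvement only becomes visible once reward gives the rat a reason to show it. The error counts and day numbers below are illustrative, not Tolman's data.

```python
# Rough sketch of latent learning: unreinforced rats quietly build a cognitive
# map of the maze, so when food appears on day 11 their error rate drops at once
# to the level of rats that had been rewarded all along. Numbers are illustrative.

def errors_on_day(day: int, rewarded_from_day: int) -> int:
    """Errors fall with exposure, but the improvement shows only once reward exists."""
    exposure = day                                         # days spent in the maze so far
    rewarded_practice = max(0, day - rewarded_from_day + 1)
    if rewarded_practice == 0:
        return 10                                          # no visible improvement yet
    # Improvement reflects *all* prior exposure, not just the rewarded trials.
    return max(1, 10 - exposure)

if __name__ == "__main__":
    always_rewarded = [errors_on_day(d, rewarded_from_day=1) for d in range(1, 15)]
    rewarded_late = [errors_on_day(d, rewarded_from_day=11) for d in range(1, 15)]
    print("rewarded all along  :", always_rewarded)   # gradual decline in errors
    print("rewarded from day 11:", rewarded_late)     # flat, then an immediate drop
```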

An illustration shows three rats in a maze, with a starting point and food at the end.

Psychologist Edward Tolman found that rats use cognitive maps to navigate through a maze. Have you ever worked your way through various levels on a video game? You learned when to turn left or right, move up or down. In that case you were relying on a cognitive map, just like the rats in a maze. (credit: modification of work by “FutUndBeidl”/Flickr)

Latent learning also occurs in humans. Children may learn by watching the actions of their parents but only demonstrate it at a later date, when the learned material is needed. For example, suppose that Ravi’s dad drives him to school every day. In this way, Ravi learns the route from his house to his school, but he’s never driven there himself, so he has not had a chance to demonstrate that he’s learned the way. One morning Ravi’s dad has to leave early for a meeting, so he can’t drive Ravi to school. Instead, Ravi follows the same route on his bike that his dad would have taken in the car. This demonstrates latent learning. Ravi had learned the route to school, but had no need to demonstrate this knowledge earlier.

Everyday Connection: This Place Is Like a Maze

Have you ever gotten lost in a building and couldn’t find your way back out? While that can be frustrating, you’re not alone. At one time or another we’ve all gotten lost in places like a museum, hospital, or university library. Whenever we go someplace new, we build a mental representation—or cognitive map—of the location, as Tolman’s rats built a cognitive map of their maze. However, some buildings are confusing because they include many areas that look alike or have short lines of sight. Because of this, it’s often difficult to predict what’s around a corner or decide whether to turn left or right to get out of a building. Psychologist Laura Carlson (2010) suggests that what we place in our cognitive map can impact our success in navigating through the environment. She suggests that paying attention to specific features upon entering a building, such as a picture on the wall, a fountain, a statue, or an escalator, adds information to our cognitive map that can be used later to help find our way out of the building.

Watch this video to learn more about Carlson’s studies on cognitive maps and navigation in buildings.

Operant conditioning is based on the work of B. F. Skinner. Operant conditioning is a form of learning in which the motivation for a behavior happens after the behavior is demonstrated. An animal or a human receives a consequence after performing a specific behavior. The consequence is either a reinforcer or a punisher. All reinforcement (positive or negative) increases the likelihood of a behavioral response. All punishment (positive or negative) decreases the likelihood of a behavioral response. Several types of reinforcement schedules are used to reward behavior, depending on either a set or variable period of time or a set or variable number of responses.

Self Check Questions

Critical thinking questions.

1. What is a Skinner box and what is its purpose?

2. What is the difference between negative reinforcement and punishment?

3. What is shaping and how would you use shaping to teach a dog to roll over?

4. Explain the difference between negative reinforcement and punishment, and provide several examples of each based on your own experiences.

5. Think of a behavior that you have that you would like to change. How could you use behavior modification, specifically positive reinforcement, to change your behavior? What is your positive reinforcer?

1. A Skinner box is an operant conditioning chamber used to train animals such as rats and pigeons to perform certain behaviors, like pressing a lever. When the animals perform the desired behavior, they receive a reward: food or water.

2. In negative reinforcement you are taking away an undesirable stimulus in order to increase the frequency of a certain behavior (e.g., buckling your seat belt stops the annoying beeping sound in your car and increases the likelihood that you will wear your seatbelt). Punishment is designed to reduce a behavior (e.g., you scold your child for running into the street in order to decrease the unsafe behavior.)

3. Shaping is an operant conditioning method in which you reward closer and closer approximations of the desired behavior. If you want to teach your dog to roll over, you might reward him first when he sits, then when he lies down, and then when he lies down and rolls onto his back. Finally, you would reward him only when he completes the entire sequence: lying down, rolling onto his back, and then continuing to roll over to his other side.

  • Psychology. Authored by: OpenStax College. Located at: http://cnx.org/contents/[email protected]:1/Psychology. License: CC BY: Attribution. License Terms: Download for free at http://cnx.org/content/col11629/latest/.
