Encyclopedia Britannica

  • History & Society
  • Science & Tech
  • Biographies
  • Animals & Nature
  • Geography & Travel
  • Arts & Culture
  • Games & Quizzes
  • On This Day
  • One Good Fact
  • New Articles
  • Lifestyles & Social Issues
  • Philosophy & Religion
  • Politics, Law & Government
  • World History
  • Health & Medicine
  • Browse Biographies
  • Birds, Reptiles & Other Vertebrates
  • Bugs, Mollusks & Other Invertebrates
  • Environment
  • Fossils & Geologic Time
  • Entertainment & Pop Culture
  • Sports & Recreation
  • Visual Arts
  • Demystified
  • Image Galleries
  • Infographics
  • Top Questions
  • Britannica Kids
  • Saving Earth
  • Space Next 50
  • Student Center

flow chart of scientific method

scientific method

Our editors will review what you’ve submitted and determine whether to revise the article.

  • University of Nevada, Reno - College of Agriculture, Biotechnology and Natural Resources Extension - The Scientific Method
  • World History Encyclopedia - Scientific Method
  • LiveScience - What Is Science?
  • Verywell Mind - Scientific Method Steps in Psychology Research
  • WebMD - What is the Scientific Method?
  • Chemistry LibreTexts - The Scientific Method
  • National Center for Biotechnology Information - PubMed Central - Redefining the scientific method: as the use of sophisticated scientific methods that extend our mind
  • Khan Academy - The scientific method
  • Simply Psychology - What are the steps in the Scientific Method?
  • Stanford Encyclopedia of Philosophy - Scientific Method

flow chart of scientific method

scientific method , mathematical and experimental technique employed in the sciences . More specifically, it is the technique used in the construction and testing of a scientific hypothesis .

The process of observing, asking questions, and seeking answers through tests and experiments is not unique to any one field of science. In fact, the scientific method is applied broadly in science, across many different fields. Many empirical sciences, especially the social sciences , use mathematical tools borrowed from probability theory and statistics , together with outgrowths of these, such as decision theory , game theory , utility theory, and operations research . Philosophers of science have addressed general methodological problems, such as the nature of scientific explanation and the justification of induction .

science experimental methods

The scientific method is critical to the development of scientific theories , which explain empirical (experiential) laws in a scientifically rational manner. In a typical application of the scientific method, a researcher develops a hypothesis , tests it through various means, and then modifies the hypothesis on the basis of the outcome of the tests and experiments. The modified hypothesis is then retested, further modified, and tested again, until it becomes consistent with observed phenomena and testing outcomes. In this way, hypotheses serve as tools by which scientists gather data. From that data and the many different scientific investigations undertaken to explore hypotheses, scientists are able to develop broad general explanations, or scientific theories.

See also Mill’s methods ; hypothetico-deductive method .

Experimentation in Scientific Research: Variables and controls in practice

by Anthony Carpi, Ph.D., Anne E. Egger, Ph.D.

Listen to this reading

Did you know that experimental design was developed more than a thousand years ago by a Middle Eastern scientist who studied light? All of us use a form of experimental research in our day to day lives when we try to find the spot with the best cell phone reception, try out new cooking recipes, and more. Scientific experiments are built on similar principles.

Experimentation is a research method in which one or more variables are consciously manipulated and the outcome or effect of that manipulation on other variables is observed.

Experimental designs often make use of controls that provide a measure of variability within a system and a check for sources of error.

Experimental methods are commonly applied to determine causal relationships or to quantify the magnitude of response of a variable.

Anyone who has used a cellular phone knows that certain situations require a bit of research: If you suddenly find yourself in an area with poor phone reception, you might move a bit to the left or right, walk a few steps forward or back, or even hold the phone over your head to get a better signal. While the actions of a cell phone user might seem obvious, the person seeking cell phone reception is actually performing a scientific experiment: consciously manipulating one component (the location of the cell phone) and observing the effect of that action on another component (the phone's reception). Scientific experiments are obviously a bit more complicated, and generally involve more rigorous use of controls , but they draw on the same type of reasoning that we use in many everyday situations. In fact, the earliest documented scientific experiments were devised to answer a very common everyday question: how vision works.

  • A brief history of experimental methods

Figure 1: Alhazen (965-ca.1039) as pictured on an Iraqi 10,000-dinar note

Figure 1: Alhazen (965-ca.1039) as pictured on an Iraqi 10,000-dinar note

One of the first ideas regarding how human vision works came from the Greek philosopher Empedocles around 450 BCE . Empedocles reasoned that the Greek goddess Aphrodite had lit a fire in the human eye, and vision was possible because light rays from this fire emanated from the eye, illuminating objects around us. While a number of people challenged this proposal, the idea that light radiated from the human eye proved surprisingly persistent until around 1,000 CE , when a Middle Eastern scientist advanced our knowledge of the nature of light and, in so doing, developed a new and more rigorous approach to scientific research . Abū 'Alī al-Hasan ibn al-Hasan ibn al-Haytham, also known as Alhazen , was born in 965 CE in the Arabian city of Basra in what is present-day Iraq. He began his scientific studies in physics, mathematics, and other sciences after reading the works of several Greek philosophers. One of Alhazen's most significant contributions was a seven-volume work on optics titled Kitab al-Manazir (later translated to Latin as Opticae Thesaurus Alhazeni – Alhazen's Book of Optics ). Beyond the contributions this book made to the field of optics, it was a remarkable work in that it based conclusions on experimental evidence rather than abstract reasoning – the first major publication to do so. Alhazen's contributions have proved so significant that his likeness was immortalized on the 2003 10,000-dinar note issued by Iraq (Figure 1).

Alhazen invested significant time studying light , color, shadows, rainbows, and other optical phenomena. Among this work was a study in which he stood in a darkened room with a small hole in one wall. Outside of the room, he hung two lanterns at different heights. Alhazen observed that the light from each lantern illuminated a different spot in the room, and each lighted spot formed a direct line with the hole and one of the lanterns outside the room. He also found that covering a lantern caused the spot it illuminated to darken, and exposing the lantern caused the spot to reappear. Thus, Alhazen provided some of the first experimental evidence that light does not emanate from the human eye but rather is emitted by certain objects (like lanterns) and travels from these objects in straight lines. Alhazen's experiment may seem simplistic today, but his methodology was groundbreaking: He developed a hypothesis based on observations of physical relationships (that light comes from objects), and then designed an experiment to test that hypothesis. Despite the simplicity of the method , Alhazen's experiment was a critical step in refuting the long-standing theory that light emanated from the human eye, and it was a major event in the development of modern scientific research methodology.

Comprehension Checkpoint

  • Experimentation as a scientific research method

Experimentation is one scientific research method , perhaps the most recognizable, in a spectrum of methods that also includes description, comparison, and modeling (see our Description , Comparison , and Modeling modules). While all of these methods share in common a scientific approach, experimentation is unique in that it involves the conscious manipulation of certain aspects of a real system and the observation of the effects of that manipulation. You could solve a cell phone reception problem by walking around a neighborhood until you see a cell phone tower, observing other cell phone users to see where those people who get the best reception are standing, or looking on the web for a map of cell phone signal coverage. All of these methods could also provide answers, but by moving around and testing reception yourself, you are experimenting.

  • Variables: Independent and dependent

In the experimental method , a condition or a parameter , generally referred to as a variable , is consciously manipulated (often referred to as a treatment) and the outcome or effect of that manipulation is observed on other variables. Variables are given different names depending on whether they are the ones manipulated or the ones observed:

  • Independent variable refers to a condition within an experiment that is manipulated by the scientist.
  • Dependent variable refers to an event or outcome of an experiment that might be affected by the manipulation of the independent variable .

Scientific experimentation helps to determine the nature of the relationship between independent and dependent variables . While it is often difficult, or sometimes impossible, to manipulate a single variable in an experiment , scientists often work to minimize the number of variables being manipulated. For example, as we move from one location to another to get better cell reception, we likely change the orientation of our body, perhaps from south-facing to east-facing, or we hold the cell phone at a different angle. Which variable affected reception: location, orientation, or angle of the phone? It is critical that scientists understand which aspects of their experiment they are manipulating so that they can accurately determine the impacts of that manipulation . In order to constrain the possible outcomes of an experimental procedure, most scientific experiments use a system of controls .

  • Controls: Negative, positive, and placebos

In a controlled study, a scientist essentially runs two (or more) parallel and simultaneous experiments: a treatment group, in which the effect of an experimental manipulation is observed on a dependent variable , and a control group, which uses all of the same conditions as the first with the exception of the actual treatment. Controls can fall into one of two groups: negative controls and positive controls .

In a negative control , the control group is exposed to all of the experimental conditions except for the actual treatment . The need to match all experimental conditions exactly is so great that, for example, in a trial for a new drug, the negative control group will be given a pill or liquid that looks exactly like the drug, except that it will not contain the drug itself, a control often referred to as a placebo . Negative controls allow scientists to measure the natural variability of the dependent variable(s), provide a means of measuring error in the experiment , and also provide a baseline to measure against the experimental treatment.

Some experimental designs also make use of positive controls . A positive control is run as a parallel experiment and generally involves the use of an alternative treatment that the researcher knows will have an effect on the dependent variable . For example, when testing the effectiveness of a new drug for pain relief, a scientist might administer treatment placebo to one group of patients as a negative control , and a known treatment like aspirin to a separate group of individuals as a positive control since the pain-relieving aspects of aspirin are well documented. In both cases, the controls allow scientists to quantify background variability and reject alternative hypotheses that might otherwise explain the effect of the treatment on the dependent variable .

  • Experimentation in practice: The case of Louis Pasteur

Well-controlled experiments generally provide strong evidence of causality, demonstrating whether the manipulation of one variable causes a response in another variable. For example, as early as the 6th century BCE , Anaximander , a Greek philosopher, speculated that life could be formed from a mixture of sea water, mud, and sunlight. The idea probably stemmed from the observation of worms, mosquitoes, and other insects "magically" appearing in mudflats and other shallow areas. While the suggestion was challenged on a number of occasions, the idea that living microorganisms could be spontaneously generated from air persisted until the middle of the 18 th century.

In the 1750s, John Needham, a Scottish clergyman and naturalist, claimed to have proved that spontaneous generation does occur when he showed that microorganisms flourished in certain foods such as soup broth, even after they had been briefly boiled and covered. Several years later, the Italian abbot and biologist Lazzaro Spallanzani , boiled soup broth for over an hour and then placed bowls of this soup in different conditions, sealing some and leaving others exposed to air. Spallanzani found that microorganisms grew in the soup exposed to air but were absent from the sealed soup. He therefore challenged Needham's conclusions and hypothesized that microorganisms suspended in air settled onto the exposed soup but not the sealed soup, and rejected the idea of spontaneous generation .

Needham countered, arguing that the growth of bacteria in the soup was not due to microbes settling onto the soup from the air, but rather because spontaneous generation required contact with an intangible "life force" in the air itself. He proposed that Spallanzani's extensive boiling destroyed the "life force" present in the soup, preventing spontaneous generation in the sealed bowls but allowing air to replenish the life force in the open bowls. For several decades, scientists continued to debate the spontaneous generation theory of life, with support for the theory coming from several notable scientists including Félix Pouchet and Henry Bastion. Pouchet, Director of the Rouen Museum of Natural History in France, and Bastion, a well-known British bacteriologist, argued that living organisms could spontaneously arise from chemical processes such as fermentation and putrefaction. The debate became so heated that in 1860, the French Academy of Sciences established the Alhumbert prize of 2,500 francs to the first person who could conclusively resolve the conflict. In 1864, Louis Pasteur achieved that result with a series of well-controlled experiments and in doing so claimed the Alhumbert prize.

Pasteur prepared for his experiments by studying the work of others that came before him. In fact, in April 1861 Pasteur wrote to Pouchet to obtain a research description that Pouchet had published. In this letter, Pasteur writes:

Paris, April 3, 1861 Dear Colleague, The difference of our opinions on the famous question of spontaneous generation does not prevent me from esteeming highly your labor and praiseworthy efforts... The sincerity of these sentiments...permits me to have recourse to your obligingness in full confidence. I read with great care everything that you write on the subject that occupies both of us. Now, I cannot obtain a brochure that I understand you have just published.... I would be happy to have a copy of it because I am at present editing the totality of my observations, where naturally I criticize your assertions. L. Pasteur (Porter, 1961)

Pasteur received the brochure from Pouchet several days later and went on to conduct his own experiments . In these, he repeated Spallanzani's method of boiling soup broth, but he divided the broth into portions and exposed these portions to different controlled conditions. Some broth was placed in flasks that had straight necks that were open to the air, some broth was placed in sealed flasks that were not open to the air, and some broth was placed into a specially designed set of swan-necked flasks, in which the broth would be open to the air but the air would have to travel a curved path before reaching the broth, thus preventing anything that might be present in the air from simply settling onto the soup (Figure 2). Pasteur then observed the response of the dependent variable (the growth of microorganisms) in response to the independent variable (the design of the flask). Pasteur's experiments contained both positive controls (samples in the straight-necked flasks that he knew would become contaminated with microorganisms) and negative controls (samples in the sealed flasks that he knew would remain sterile). If spontaneous generation did indeed occur upon exposure to air, Pasteur hypothesized, microorganisms would be found in both the swan-neck flasks and the straight-necked flasks, but not in the sealed flasks. Instead, Pasteur found that microorganisms appeared in the straight-necked flasks, but not in the sealed flasks or the swan-necked flasks.

Figure 2: Pasteur's drawings of the flasks he used (Pasteur, 1861). Fig. 25 D, C, and B (top) show various sealed flasks (negative controls); Fig. 26 (bottom right) illustrates a straight-necked flask directly open to the atmosphere (positive control); and Fig. 25 A (bottom left) illustrates the specially designed swan-necked flask (treatment group).

Figure 2: Pasteur's drawings of the flasks he used (Pasteur, 1861). Fig. 25 D, C, and B (top) show various sealed flasks (negative controls); Fig. 26 (bottom right) illustrates a straight-necked flask directly open to the atmosphere (positive control); and Fig. 25 A (bottom left) illustrates the specially designed swan-necked flask (treatment group).

By using controls and replicating his experiment (he used more than one of each type of flask), Pasteur was able to answer many of the questions that still surrounded the issue of spontaneous generation. Pasteur said of his experimental design, "I affirm with the most perfect sincerity that I have never had a single experiment, arranged as I have just explained, which gave me a doubtful result" (Porter, 1961). Pasteur's work helped refute the theory of spontaneous generation – his experiments showed that air alone was not the cause of bacterial growth in the flask, and his research supported the hypothesis that live microorganisms suspended in air could settle onto the broth in open-necked flasks via gravity .

  • Experimentation across disciplines

Experiments are used across all scientific disciplines to investigate a multitude of questions. In some cases, scientific experiments are used for exploratory purposes in which the scientist does not know what the dependent variable is. In this type of experiment, the scientist will manipulate an independent variable and observe what the effect of the manipulation is in order to identify a dependent variable (or variables). Exploratory experiments are sometimes used in nutritional biology when scientists probe the function and purpose of dietary nutrients . In one approach, a scientist will expose one group of animals to a normal diet, and a second group to a similar diet except that it is lacking a specific vitamin or nutrient. The researcher will then observe the two groups to see what specific physiological changes or medical problems arise in the group lacking the nutrient being studied.

Scientific experiments are also commonly used to quantify the magnitude of a relationship between two or more variables . For example, in the fields of pharmacology and toxicology, scientific experiments are used to determine the dose-response relationship of a new drug or chemical. In these approaches, researchers perform a series of experiments in which a population of organisms , such as laboratory mice, is separated into groups and each group is exposed to a different amount of the drug or chemical of interest. The analysis of the data that result from these experiments (see our Data Analysis and Interpretation module) involves comparing the degree of the organism's response to the dose of the substance administered.

In this context, experiments can provide additional evidence to complement other research methods . For example, in the 1950s a great debate ensued over whether or not the chemicals in cigarette smoke cause cancer. Several researchers had conducted comparative studies (see our Comparison in Scientific Research module) that indicated that patients who smoked had a higher probability of developing lung cancer when compared to nonsmokers. Comparative studies differ slightly from experimental methods in that you do not consciously manipulate a variable ; rather you observe differences between two or more groups depending on whether or not they fall into a treatment or control group. Cigarette companies and lobbyists criticized these studies, suggesting that the relationship between smoking and lung cancer was coincidental. Several researchers noted the need for a clear dose-response study; however, the difficulties in getting cigarette smoke into the lungs of laboratory animals prevented this research. In the mid-1950s, Ernest Wynder and colleagues had an ingenious idea: They condensed the chemicals from cigarette smoke into a liquid and applied this in various doses to the skin of groups of mice. The researchers published data from a dose-response experiment of the effect of tobacco smoke condensate on mice (Wynder et al., 1957).

As seen in Figure 3, the researchers found a positive relationship between the amount of condensate applied to the skin of mice and the number of cancers that developed. The graph shows the results of a study in which different groups of mice were exposed to increasing amounts of cigarette tar. The black dots indicate the percentage of each sample group of mice that developed cancer for a given amount cigarette smoke "condensate" applied to their skin. The vertical lines are error bars, showing the amount of uncertainty . The graph shows generally increasing cancer rates with greater exposure. This study was one of the first pieces of experimental evidence in the cigarette smoking debate , and it helped strengthen the case for cigarette smoke as the causative agent in lung cancer in smokers.

Figure 3: Percentage of mice with cancer versus the amount cigarette smoke

Figure 3: Percentage of mice with cancer versus the amount cigarette smoke "condensate" applied to their skin (source: Wynder et al., 1957).

Sometimes experimental approaches and other research methods are not clearly distinct, or scientists may even use multiple research approaches in combination. For example, at 1:52 a.m. EDT on July 4, 2005, scientists with the National Aeronautics and Space Administration (NASA) conducted a study in which a 370 kg spacecraft named Deep Impact was purposely slammed into passing comet Tempel 1. A nearby spacecraft observed the impact and radioed data back to Earth. The research was partially descriptive in that it documented the chemical composition of the comet, but it was also partly experimental in that the effect of slamming the Deep Impact probe into the comet on the volatilization of previously undetected compounds , such as water, was assessed (A'Hearn et al., 2005). It is particularly common that experimentation and description overlap: Another example is Jane Goodall 's research on the behavior of chimpanzees, which can be read in our Description in Scientific Research module.

  • Limitations of experimental methods

science experimental methods

Figure 4: An image of comet Tempel 1 67 seconds after collision with the Deep Impact impactor. Image credit: NASA/JPL-Caltech/UMD http://deepimpact.umd.edu/gallery/HRI_937_1.html

While scientific experiments provide invaluable data regarding causal relationships, they do have limitations. One criticism of experiments is that they do not necessarily represent real-world situations. In order to clearly identify the relationship between an independent variable and a dependent variable , experiments are designed so that many other contributing variables are fixed or eliminated. For example, in an experiment designed to quantify the effect of vitamin A dose on the metabolism of beta-carotene in humans, Shawna Lemke and colleagues had to precisely control the diet of their human volunteers (Lemke, Dueker et al. 2003). They asked their participants to limit their intake of foods rich in vitamin A and further asked that they maintain a precise log of all foods eaten for 1 week prior to their study. At the time of their study, they controlled their participants' diet by feeding them all the same meals, described in the methods section of their research article in this way:

Meals were controlled for time and content on the dose administration day. Lunch was served at 5.5 h postdosing and consisted of a frozen dinner (Enchiladas, Amy's Kitchen, Petaluma, CA), a blueberry bagel with jelly, 1 apple and 1 banana, and a large chocolate chunk cookie (Pepperidge Farm). Dinner was served 10.5 h post dose and consisted of a frozen dinner (Chinese Stir Fry, Amy's Kitchen) plus the bagel and fruit taken for lunch.

While this is an important aspect of making an experiment manageable and informative, it is often not representative of the real world, in which many variables may change at once, including the foods you eat. Still, experimental research is an excellent way of determining relationships between variables that can be later validated in real world settings through descriptive or comparative studies.

Design is critical to the success or failure of an experiment . Slight variations in the experimental set-up could strongly affect the outcome being measured. For example, during the 1950s, a number of experiments were conducted to evaluate the toxicity in mammals of the metal molybdenum, using rats as experimental subjects . Unexpectedly, these experiments seemed to indicate that the type of cage the rats were housed in affected the toxicity of molybdenum. In response, G. Brinkman and Russell Miller set up an experiment to investigate this observation (Brinkman & Miller, 1961). Brinkman and Miller fed two groups of rats a normal diet that was supplemented with 200 parts per million (ppm) of molybdenum. One group of rats was housed in galvanized steel (steel coated with zinc to reduce corrosion) cages and the second group was housed in stainless steel cages. Rats housed in the galvanized steel cages suffered more from molybdenum toxicity than the other group: They had higher concentrations of molybdenum in their livers and lower blood hemoglobin levels. It was then shown that when the rats chewed on their cages, those housed in the galvanized metal cages absorbed zinc plated onto the metal bars, and zinc is now known to affect the toxicity of molybdenum. In order to control for zinc exposure, then, stainless steel cages needed to be used for all rats.

Scientists also have an obligation to adhere to ethical limits in designing and conducting experiments . During World War II, doctors working in Nazi Germany conducted many heinous experiments using human subjects . Among them was an experiment meant to identify effective treatments for hypothermia in humans, in which concentration camp prisoners were forced to sit in ice water or left naked outdoors in freezing temperatures and then re-warmed by various means. Many of the exposed victims froze to death or suffered permanent injuries. As a result of the Nazi experiments and other unethical research , strict scientific ethical standards have been adopted by the United States and other governments, and by the scientific community at large. Among other things, ethical standards (see our Scientific Ethics module) require that the benefits of research outweigh the risks to human subjects, and those who participate do so voluntarily and only after they have been made fully aware of all the risks posed by the research. These guidelines have far-reaching effects: While the clearest indication of causation in the cigarette smoke and lung cancer debate would have been to design an experiment in which one group of people was asked to take up smoking and another group was asked to refrain from smoking, it would be highly unethical for a scientist to purposefully expose a group of healthy people to a suspected cancer causing agent. As an alternative, comparative studies (see our Comparison in Scientific Research module) were initiated in humans, and experimental studies focused on animal subjects. The combination of these and other studies provided even stronger evidence of the link between smoking and lung cancer than either one method alone would have.

  • Experimentation in modern practice

Like all scientific research , the results of experiments are shared with the scientific community, are built upon, and inspire additional experiments and research. For example, once Alhazen established that light given off by objects enters the human eye, the natural question that was asked was "What is the nature of light that enters the human eye?" Two common theories about the nature of light were debated for many years. Sir Isaac Newton was among the principal proponents of a theory suggesting that light was made of small particles . The English naturalist Robert Hooke (who held the interesting title of Curator of Experiments at the Royal Society of London) supported a different theory stating that light was a type of wave, like sound waves . In 1801, Thomas Young conducted a now classic scientific experiment that helped resolve this controversy . Young, like Alhazen, worked in a darkened room and allowed light to enter only through a small hole in a window shade (Figure 5). Young refocused the beam of light with mirrors and split the beam with a paper-thin card. The split light beams were then projected onto a screen, and formed an alternating light and dark banding pattern – that was a sign that light was indeed a wave (see our Light I: Particle or Wave? module).

Figure 5: Young's split-light beam experiment helped clarify the wave nature of light.

Figure 5: Young's split-light beam experiment helped clarify the wave nature of light.

Approximately 100 years later, in 1905, new experiments led Albert Einstein to conclude that light exhibits properties of both waves and particles . Einstein's dual wave-particle theory is now generally accepted by scientists.

Experiments continue to help refine our understanding of light even today. In addition to his wave-particle theory , Einstein also proposed that the speed of light was unchanging and absolute. Yet in 1998 a group of scientists led by Lene Hau showed that light could be slowed from its normal speed of 3 x 10 8 meters per second to a mere 17 meters per second with a special experimental apparatus (Hau et al., 1999). The series of experiments that began with Alhazen 's work 1000 years ago has led to a progressively deeper understanding of the nature of light. Although the tools with which scientists conduct experiments may have become more complex, the principles behind controlled experiments are remarkably similar to those used by Pasteur and Alhazen hundreds of years ago.

Table of Contents

Activate glossary term highlighting to easily identify key terms within the module. Once highlighted, you can click on these terms to view their definitions.

Activate NGSS annotations to easily identify NGSS standards within the module. Once highlighted, you can click on them to view these standards.

Science and the scientific method: Definitions and examples

Here's a look at the foundation of doing science — the scientific method.

Kids follow the scientific method to carry out an experiment.

The scientific method

Hypothesis, theory and law, a brief history of science, additional resources, bibliography.

Science is a systematic and logical approach to discovering how things in the universe work. It is also the body of knowledge accumulated through the discoveries about all the things in the universe. 

The word "science" is derived from the Latin word "scientia," which means knowledge based on demonstrable and reproducible data, according to the Merriam-Webster dictionary . True to this definition, science aims for measurable results through testing and analysis, a process known as the scientific method. Science is based on fact, not opinion or preferences. The process of science is designed to challenge ideas through research. One important aspect of the scientific process is that it focuses only on the natural world, according to the University of California, Berkeley . Anything that is considered supernatural, or beyond physical reality, does not fit into the definition of science.

When conducting research, scientists use the scientific method to collect measurable, empirical evidence in an experiment related to a hypothesis (often in the form of an if/then statement) that is designed to support or contradict a scientific theory .

"As a field biologist, my favorite part of the scientific method is being in the field collecting the data," Jaime Tanner, a professor of biology at Marlboro College, told Live Science. "But what really makes that fun is knowing that you are trying to answer an interesting question. So the first step in identifying questions and generating possible answers (hypotheses) is also very important and is a creative process. Then once you collect the data you analyze it to see if your hypothesis is supported or not."

Here's an illustration showing the steps in the scientific method.

The steps of the scientific method go something like this, according to Highline College :

  • Make an observation or observations.
  • Form a hypothesis — a tentative description of what's been observed, and make predictions based on that hypothesis.
  • Test the hypothesis and predictions in an experiment that can be reproduced.
  • Analyze the data and draw conclusions; accept or reject the hypothesis or modify the hypothesis if necessary.
  • Reproduce the experiment until there are no discrepancies between observations and theory. "Replication of methods and results is my favorite step in the scientific method," Moshe Pritsker, a former post-doctoral researcher at Harvard Medical School and CEO of JoVE, told Live Science. "The reproducibility of published experiments is the foundation of science. No reproducibility — no science."

Some key underpinnings to the scientific method:

  • The hypothesis must be testable and falsifiable, according to North Carolina State University . Falsifiable means that there must be a possible negative answer to the hypothesis.
  • Research must involve deductive reasoning and inductive reasoning . Deductive reasoning is the process of using true premises to reach a logical true conclusion while inductive reasoning uses observations to infer an explanation for those observations.
  • An experiment should include a dependent variable (which does not change) and an independent variable (which does change), according to the University of California, Santa Barbara .
  • An experiment should include an experimental group and a control group. The control group is what the experimental group is compared against, according to Britannica .

The process of generating and testing a hypothesis forms the backbone of the scientific method. When an idea has been confirmed over many experiments, it can be called a scientific theory. While a theory provides an explanation for a phenomenon, a scientific law provides a description of a phenomenon, according to The University of Waikato . One example would be the law of conservation of energy, which is the first law of thermodynamics that says that energy can neither be created nor destroyed. 

A law describes an observed phenomenon, but it doesn't explain why the phenomenon exists or what causes it. "In science, laws are a starting place," said Peter Coppinger, an associate professor of biology and biomedical engineering at the Rose-Hulman Institute of Technology. "From there, scientists can then ask the questions, 'Why and how?'"

Laws are generally considered to be without exception, though some laws have been modified over time after further testing found discrepancies. For instance, Newton's laws of motion describe everything we've observed in the macroscopic world, but they break down at the subatomic level.

This does not mean theories are not meaningful. For a hypothesis to become a theory, scientists must conduct rigorous testing, typically across multiple disciplines by separate groups of scientists. Saying something is "just a theory" confuses the scientific definition of "theory" with the layperson's definition. To most people a theory is a hunch. In science, a theory is the framework for observations and facts, Tanner told Live Science.

This Copernican heliocentric solar system, from 1708, shows the orbit of the moon around the Earth, and the orbits of the Earth and planets round the sun, including Jupiter and its moons, all surrounded by the 12 signs of the zodiac.

The earliest evidence of science can be found as far back as records exist. Early tablets contain numerals and information about the solar system , which were derived by using careful observation, prediction and testing of those predictions. Science became decidedly more "scientific" over time, however.

1200s: Robert Grosseteste developed the framework for the proper methods of modern scientific experimentation, according to the Stanford Encyclopedia of Philosophy. His works included the principle that an inquiry must be based on measurable evidence that is confirmed through testing.

1400s: Leonardo da Vinci began his notebooks in pursuit of evidence that the human body is microcosmic. The artist, scientist and mathematician also gathered information about optics and hydrodynamics.

1500s: Nicolaus Copernicus advanced the understanding of the solar system with his discovery of heliocentrism. This is a model in which Earth and the other planets revolve around the sun, which is the center of the solar system.

1600s: Johannes Kepler built upon those observations with his laws of planetary motion. Galileo Galilei improved on a new invention, the telescope, and used it to study the sun and planets. The 1600s also saw advancements in the study of physics as Isaac Newton developed his laws of motion.

1700s: Benjamin Franklin discovered that lightning is electrical. He also contributed to the study of oceanography and meteorology. The understanding of chemistry also evolved during this century as Antoine Lavoisier, dubbed the father of modern chemistry , developed the law of conservation of mass.

1800s: Milestones included Alessandro Volta's discoveries regarding electrochemical series, which led to the invention of the battery. John Dalton also introduced atomic theory, which stated that all matter is composed of atoms that combine to form molecules. The basis of modern study of genetics advanced as Gregor Mendel unveiled his laws of inheritance. Later in the century, Wilhelm Conrad Röntgen discovered X-rays , while George Ohm's law provided the basis for understanding how to harness electrical charges.

1900s: The discoveries of Albert Einstein , who is best known for his theory of relativity, dominated the beginning of the 20th century. Einstein's theory of relativity is actually two separate theories. His special theory of relativity, which he outlined in a 1905 paper, " The Electrodynamics of Moving Bodies ," concluded that time must change according to the speed of a moving object relative to the frame of reference of an observer. His second theory of general relativity, which he published as " The Foundation of the General Theory of Relativity ," advanced the idea that matter causes space to curve.

In 1952, Jonas Salk developed the polio vaccine , which reduced the incidence of polio in the United States by nearly 90%, according to Britannica . The following year, James D. Watson and Francis Crick discovered the structure of DNA , which is a double helix formed by base pairs attached to a sugar-phosphate backbone, according to the National Human Genome Research Institute .

2000s: The 21st century saw the first draft of the human genome completed, leading to a greater understanding of DNA. This advanced the study of genetics, its role in human biology and its use as a predictor of diseases and other disorders, according to the National Human Genome Research Institute .

  • This video from City University of New York delves into the basics of what defines science.
  • Learn about what makes science science in this book excerpt from Washington State University .
  • This resource from the University of Michigan — Flint explains how to design your own scientific study.

Merriam-Webster Dictionary, Scientia. 2022. https://www.merriam-webster.com/dictionary/scientia

University of California, Berkeley, "Understanding Science: An Overview." 2022. ​​ https://undsci.berkeley.edu/article/0_0_0/intro_01  

Highline College, "Scientific method." July 12, 2015. https://people.highline.edu/iglozman/classes/astronotes/scimeth.htm  

North Carolina State University, "Science Scripts." https://projects.ncsu.edu/project/bio183de/Black/science/science_scripts.html  

University of California, Santa Barbara. "What is an Independent variable?" October 31,2017. http://scienceline.ucsb.edu/getkey.php?key=6045  

Encyclopedia Britannica, "Control group." May 14, 2020. https://www.britannica.com/science/control-group  

The University of Waikato, "Scientific Hypothesis, Theories and Laws." https://sci.waikato.ac.nz/evolution/Theories.shtml  

Stanford Encyclopedia of Philosophy, Robert Grosseteste. May 3, 2019. https://plato.stanford.edu/entries/grosseteste/  

Encyclopedia Britannica, "Jonas Salk." October 21, 2021. https://www.britannica.com/ biography /Jonas-Salk

National Human Genome Research Institute, "​Phosphate Backbone." https://www.genome.gov/genetics-glossary/Phosphate-Backbone  

National Human Genome Research Institute, "What is the Human Genome Project?" https://www.genome.gov/human-genome-project/What  

‌ Live Science contributor Ashley Hamer updated this article on Jan. 16, 2022.

Sign up for the Live Science daily newsletter now

Get the world’s most fascinating discoveries delivered straight to your inbox.

Ocean Photographer of the Year 2024: See stunning photos of hungry whale, surfing seagull, freaky fish babies, land-loving eel and adorable toxic octopus

Earth from space: Ghostly figure emerges in Greenland ice after underground lake collapses

Earth once wore a Saturn-like ring, study of ancient craters suggests

Most Popular

  • 2 Watch mesmerizing video of weird waves that 'shape life itself' inside a fly embryo
  • 3 A passing star may have kicked the solar system's weirdest moons into place
  • 4 'Their capacity to emulate human language and thought is immensely powerful': Far from ending the world, AI systems might actually save it
  • 5 The bubbling surface of a distant star was captured on video for the 1st time ever

science experimental methods

Have a language expert improve your writing

Run a free plagiarism check in 10 minutes, generate accurate citations for free.

  • Knowledge Base

Methodology

  • Guide to Experimental Design | Overview, Steps, & Examples

Guide to Experimental Design | Overview, 5 steps & Examples

Published on December 3, 2019 by Rebecca Bevans . Revised on June 21, 2023.

Experiments are used to study causal relationships . You manipulate one or more independent variables and measure their effect on one or more dependent variables.

Experimental design create a set of procedures to systematically test a hypothesis . A good experimental design requires a strong understanding of the system you are studying.

There are five key steps in designing an experiment:

  • Consider your variables and how they are related
  • Write a specific, testable hypothesis
  • Design experimental treatments to manipulate your independent variable
  • Assign subjects to groups, either between-subjects or within-subjects
  • Plan how you will measure your dependent variable

For valid conclusions, you also need to select a representative sample and control any  extraneous variables that might influence your results. If random assignment of participants to control and treatment groups is impossible, unethical, or highly difficult, consider an observational study instead. This minimizes several types of research bias, particularly sampling bias , survivorship bias , and attrition bias as time passes.

Table of contents

Step 1: define your variables, step 2: write your hypothesis, step 3: design your experimental treatments, step 4: assign your subjects to treatment groups, step 5: measure your dependent variable, other interesting articles, frequently asked questions about experiments.

You should begin with a specific research question . We will work with two research question examples, one from health sciences and one from ecology:

To translate your research question into an experimental hypothesis, you need to define the main variables and make predictions about how they are related.

Start by simply listing the independent and dependent variables .

Research question Independent variable Dependent variable
Phone use and sleep Minutes of phone use before sleep Hours of sleep per night
Temperature and soil respiration Air temperature just above the soil surface CO2 respired from soil

Then you need to think about possible extraneous and confounding variables and consider how you might control  them in your experiment.

Extraneous variable How to control
Phone use and sleep in sleep patterns among individuals. measure the average difference between sleep with phone use and sleep without phone use rather than the average amount of sleep per treatment group.
Temperature and soil respiration also affects respiration, and moisture can decrease with increasing temperature. monitor soil moisture and add water to make sure that soil moisture is consistent across all treatment plots.

Finally, you can put these variables together into a diagram. Use arrows to show the possible relationships between variables and include signs to show the expected direction of the relationships.

Diagram of the relationship between variables in a sleep experiment

Here we predict that increasing temperature will increase soil respiration and decrease soil moisture, while decreasing soil moisture will lead to decreased soil respiration.

Here's why students love Scribbr's proofreading services

Discover proofreading & editing

Now that you have a strong conceptual understanding of the system you are studying, you should be able to write a specific, testable hypothesis that addresses your research question.

Null hypothesis (H ) Alternate hypothesis (H )
Phone use and sleep Phone use before sleep does not correlate with the amount of sleep a person gets. Increasing phone use before sleep leads to a decrease in sleep.
Temperature and soil respiration Air temperature does not correlate with soil respiration. Increased air temperature leads to increased soil respiration.

The next steps will describe how to design a controlled experiment . In a controlled experiment, you must be able to:

  • Systematically and precisely manipulate the independent variable(s).
  • Precisely measure the dependent variable(s).
  • Control any potential confounding variables.

If your study system doesn’t match these criteria, there are other types of research you can use to answer your research question.

How you manipulate the independent variable can affect the experiment’s external validity – that is, the extent to which the results can be generalized and applied to the broader world.

First, you may need to decide how widely to vary your independent variable.

  • just slightly above the natural range for your study region.
  • over a wider range of temperatures to mimic future warming.
  • over an extreme range that is beyond any possible natural variation.

Second, you may need to choose how finely to vary your independent variable. Sometimes this choice is made for you by your experimental system, but often you will need to decide, and this will affect how much you can infer from your results.

  • a categorical variable : either as binary (yes/no) or as levels of a factor (no phone use, low phone use, high phone use).
  • a continuous variable (minutes of phone use measured every night).

How you apply your experimental treatments to your test subjects is crucial for obtaining valid and reliable results.

First, you need to consider the study size : how many individuals will be included in the experiment? In general, the more subjects you include, the greater your experiment’s statistical power , which determines how much confidence you can have in your results.

Then you need to randomly assign your subjects to treatment groups . Each group receives a different level of the treatment (e.g. no phone use, low phone use, high phone use).

You should also include a control group , which receives no treatment. The control group tells us what would have happened to your test subjects without any experimental intervention.

When assigning your subjects to groups, there are two main choices you need to make:

  • A completely randomized design vs a randomized block design .
  • A between-subjects design vs a within-subjects design .

Randomization

An experiment can be completely randomized or randomized within blocks (aka strata):

  • In a completely randomized design , every subject is assigned to a treatment group at random.
  • In a randomized block design (aka stratified random design), subjects are first grouped according to a characteristic they share, and then randomly assigned to treatments within those groups.
Completely randomized design Randomized block design
Phone use and sleep Subjects are all randomly assigned a level of phone use using a random number generator. Subjects are first grouped by age, and then phone use treatments are randomly assigned within these groups.
Temperature and soil respiration Warming treatments are assigned to soil plots at random by using a number generator to generate map coordinates within the study area. Soils are first grouped by average rainfall, and then treatment plots are randomly assigned within these groups.

Sometimes randomization isn’t practical or ethical , so researchers create partially-random or even non-random designs. An experimental design where treatments aren’t randomly assigned is called a quasi-experimental design .

Between-subjects vs. within-subjects

In a between-subjects design (also known as an independent measures design or classic ANOVA design), individuals receive only one of the possible levels of an experimental treatment.

In medical or social research, you might also use matched pairs within your between-subjects design to make sure that each treatment group contains the same variety of test subjects in the same proportions.

In a within-subjects design (also known as a repeated measures design), every individual receives each of the experimental treatments consecutively, and their responses to each treatment are measured.

Within-subjects or repeated measures can also refer to an experimental design where an effect emerges over time, and individual responses are measured over time in order to measure this effect as it emerges.

Counterbalancing (randomizing or reversing the order of treatments among subjects) is often used in within-subjects designs to ensure that the order of treatment application doesn’t influence the results of the experiment.

Between-subjects (independent measures) design Within-subjects (repeated measures) design
Phone use and sleep Subjects are randomly assigned a level of phone use (none, low, or high) and follow that level of phone use throughout the experiment. Subjects are assigned consecutively to zero, low, and high levels of phone use throughout the experiment, and the order in which they follow these treatments is randomized.
Temperature and soil respiration Warming treatments are assigned to soil plots at random and the soils are kept at this temperature throughout the experiment. Every plot receives each warming treatment (1, 3, 5, 8, and 10C above ambient temperatures) consecutively over the course of the experiment, and the order in which they receive these treatments is randomized.

Prevent plagiarism. Run a free check.

Finally, you need to decide how you’ll collect data on your dependent variable outcomes. You should aim for reliable and valid measurements that minimize research bias or error.

Some variables, like temperature, can be objectively measured with scientific instruments. Others may need to be operationalized to turn them into measurable observations.

  • Ask participants to record what time they go to sleep and get up each day.
  • Ask participants to wear a sleep tracker.

How precisely you measure your dependent variable also affects the kinds of statistical analysis you can use on your data.

Experiments are always context-dependent, and a good experimental design will take into account all of the unique considerations of your study system to produce information that is both valid and relevant to your research question.

If you want to know more about statistics , methodology , or research bias , make sure to check out some of our other articles with explanations and examples.

  • Student’s  t -distribution
  • Normal distribution
  • Null and Alternative Hypotheses
  • Chi square tests
  • Confidence interval
  • Cluster sampling
  • Stratified sampling
  • Data cleansing
  • Reproducibility vs Replicability
  • Peer review
  • Likert scale

Research bias

  • Implicit bias
  • Framing effect
  • Cognitive bias
  • Placebo effect
  • Hawthorne effect
  • Hindsight bias
  • Affect heuristic

Experimental design means planning a set of procedures to investigate a relationship between variables . To design a controlled experiment, you need:

  • A testable hypothesis
  • At least one independent variable that can be precisely manipulated
  • At least one dependent variable that can be precisely measured

When designing the experiment, you decide:

  • How you will manipulate the variable(s)
  • How you will control for any potential confounding variables
  • How many subjects or samples will be included in the study
  • How subjects will be assigned to treatment levels

Experimental design is essential to the internal and external validity of your experiment.

The key difference between observational studies and experimental designs is that a well-done observational study does not influence the responses of participants, while experiments do have some sort of treatment condition applied to at least some participants by random assignment .

A confounding variable , also called a confounder or confounding factor, is a third variable in a study examining a potential cause-and-effect relationship.

A confounding variable is related to both the supposed cause and the supposed effect of the study. It can be difficult to separate the true effect of the independent variable from the effect of the confounding variable.

In your research design , it’s important to identify potential confounding variables and plan how you will reduce their impact.

In a between-subjects design , every participant experiences only one condition, and researchers assess group differences between participants in various conditions.

In a within-subjects design , each participant experiences all conditions, and researchers test the same participants repeatedly for differences between conditions.

The word “between” means that you’re comparing different conditions between groups, while the word “within” means you’re comparing different conditions within the same group.

An experimental group, also known as a treatment group, receives the treatment whose effect researchers wish to study, whereas a control group does not. They should be identical in all other ways.

Cite this Scribbr article

If you want to cite this source, you can copy and paste the citation or click the “Cite this Scribbr article” button to automatically add the citation to our free Citation Generator.

Bevans, R. (2023, June 21). Guide to Experimental Design | Overview, 5 steps & Examples. Scribbr. Retrieved September 18, 2024, from https://www.scribbr.com/methodology/experimental-design/

Is this article helpful?

Rebecca Bevans

Rebecca Bevans

Other students also liked, random assignment in experiments | introduction & examples, quasi-experimental design | definition, types & examples, how to write a lab report, "i thought ai proofreading was useless but..".

I've been using Scribbr for years now and I know it's a service that won't disappoint. It does a good job spotting mistakes”

science experimental methods

Scientific Method

Mark Cartwright

The Scientific Method was first used during the Scientific Revolution (1500-1700). The method combined theoretical knowledge such as mathematics with practical experimentation using scientific instruments, results analysis and comparisons, and finally peer reviews, all to better determine how the world around us works. In this way, hypotheses were rigorously tested, and laws could be formulated which explained observable phenomena. The goal of this scientific method was to not only increase human knowledge but to do so in a way that practically benefitted everyone and improved the human condition.

A New Approach: Bacon 's Vision

Francis Bacon (1561-1626) was an English philosopher, statesman, and author. He is considered one of the founders of modern scientific research and scientific method, even as "the father of modern science " because he proposed a new combined method of empirical (observable) experimentation and shared data collection so that humanity might finally discover all of nature's secrets and improve itself. Bacon championed the need for systematic and detailed empirical study, as this was the only way to increase humanity's understanding and, for him, more importantly, gain control of nature. This approach sounds quite obvious today, but at the time, the highly theoretical approach of the Greek philosopher Aristotle (l. 384-322 BCE) still dominated thought. Verbal arguments had become more important than what could actually be seen in the world. Further, natural philosophers had become preoccupied with why things happen instead of first ascertaining what was happening in nature.

Bacon rejected the current backward-looking approach to knowledge, that is, the seemingly never-ending attempt to prove the ancients right. Instead, new thinkers and experimenters, said Bacon, should act like the new navigators who had pushed beyond the limits of the known world. Christopher Columbus (1451-1506) had shown there was land across the Atlantic Ocean. Vasco da Gama (c. 1469-1524) had explored the globe in the other direction. Scientists, as we would call them today, had to be similarly bold. Old knowledge had to be rigorously tested to see that it was worth keeping. New knowledge had to be acquired by thoroughly testing nature without preconceived ideas. Reason had to be applied to data collected from experiments, and the same data had to be openly shared with other thinkers so that it could be tested again, comparing it to what others had discovered. Finally, this knowledge must then be used to improve the human condition; otherwise, it was no use pursuing it in the first place. This was Bacon's vision. What he proposed did indeed come about but with three notable factors added to the scientific method. These were mathematics, hypotheses, and technology.

The Importance of Experiments & Instruments

Experiments had always been carried out by thinkers, from ancient figures like Archimedes (l. 287-212 BCE) to the alchemists of the Middle Ages, but their experiments were usually haphazard, and very often thinkers were trying to prove a preconceived idea. In the Scientific Revolution, experimentation became a more systematic and multi-layered activity involving many different people. This more rigorous approach to gathering observable data was also a reaction against traditional activities and methods such as magic, astrology, and alchemy , all ancient and secret worlds of knowledge-gathering that now came under attack.

The Alchemists by Pietro Longhi

At the outset of the Scientific Revolution, experiments were any sort of activity carried out to see what would happen, a sort of anything-goes approach to satisfying scientific curiosity. It is important to note, though, that the modern meaning of scientific experiment is rather different, summarised here by W. E. Burns: "the creation of an artificial situation designed to study scientific principles held to apply in all situations" (95). It is fair to say, though, that the modern approach to experimentation, with its highly specialised focus where only one specific hypothesis is being tested, would not have become possible without the pioneering experimenters of the Scientific Revolution.

The first well-documented practical experiment of our period was made by William Gilbert using magnets; he published his findings in 1600 in On the Magnet . The work was pioneering because "Central to Gilbert's enterprise was the claim that you could reproduce his experiments and confirm his results: his book was, in effect, a collection of experimental recipes" (Wootton, 331).

There remained sceptics of experimentation, those who stressed that the senses could be misled when the reason of the mind could not be. One such doubter was René Descartes (1596-1650), but if anything, he and other natural philosophers who questioned the value of the work of the practical experimenters were responsible for creating a lasting new division between philosophy and what we would today call science. The term "science" was still not widely used in the 17th century, instead, many experimenters referred to themselves as practitioners of "experimental philosophy". The first use in English of the term "experimental method" was in 1675.

The first truly international effort in coordinated experiments involved the development of the barometer. This process began with the efforts of the Italian Evangelista Torricelli (1608-1647) in 1643. Torricelli discovered that mercury could be raised within a glass tube when one end of that tube was placed in a container of mercury. The air pressure on the mercury in the container pushed the mercury in the tube up around 30 inches (76 cm) higher than the level in the container. In 1648, Blaise Pascal (1623-1662) and his brother-in- law Florin Périer conducted experiments using similar apparatus, but this time tested under different atmospheric pressures by setting up the devices at a variety of altitudes on the side of a mountain. The scientists noted that the level of the mercury in the glass tube fell the higher up the mountain readings were taken.

Torricelli's Barometer

The Anglo-Irish chemist Robert Boyle (1627-1691) named the new instrument a barometer and conclusively demonstrated the effect of air pressure by using a barometer inside an air pump where a vacuum was established. Boyle formulated a principle which became known as 'Boyle's Law'. This law states that the pressure exerted by a certain quantity of air varies inversely in proportion to its volume (provided temperatures are constant). The story of the development of the barometer became typical throughout the Scientific Revolution: natural phenomena were observed, instruments were invented to measure and understand these observable facts, scientists collaborated (sometimes even competed), and so they extended the work of each other until, finally, a universal law could be devised which explained what was being seen. This law could then be used as a predictive device in future experiments.

Experiments like Robert Boyle's air pump demonstrations and Isaac Newton 's use of a prism to demonstrate white light is made up of different coloured light continued the trend of experimentation to prove, test, and adjust theories. Further, these endeavours highlight the importance of scientific instruments in the new method of inquiry. The scientific method was employed to invent useful and accurate instruments, which were, in turn, used in further experiments. The invention of the telescope (c. 1608), microscope (c. 1610), barometer (1643), thermometer (c. 1650), pendulum clock (1657), air pump (1659), and balance spring watch (1675) all allowed fine measurements to be made which previously had been impossible. New instruments meant that a whole new range of experiments could be carried out. Whole new specialisations of study became possible, such as meteorology, microscopic anatomy, embryology, and optics.

The scientific method came to involve the following key components:

  • conducting practical experiments
  • conducting experiments without prejudice of what they should prove
  • using deductive reasoning (creating a generalisation from specific examples) to form a hypothesis (untested theory), which is then tested by an experiment, after which the hypothesis might be accepted, altered, or rejected based on empirical (observable) evidence
  • conducting multiple experiments and doing so in different places and by different people to confirm the reliability of the results
  • an open and critical review of the results of an experiment by peers
  • the formulation of universal laws (inductive reasoning or logic) using, for example, mathematics
  • a desire to gain practical benefits from scientific experiments and a belief in the idea of scientific progress

(Note: the above criteria are expressed in modern linguistic terms, not necessarily those terms 17th-century scientists would have used since the revolution in science also caused a revolution in the language to describe it).

Newton's Prism

Scientific Institutions

The scientific method really took hold when it became institutionalised, that is, when it was endorsed and employed by official institutions like the learned societies where thinkers tested their theories in the real world and worked collaboratively. The first such society was the Academia del Cimento in Florence, founded in 1657. Others soon followed, notably the Royal Academy of Sciences in Paris in 1667. Four years earlier, London had gained its own academy with the foundation of the Royal Society . The founding fellows of this society credited Bacon with the idea, and they were keen to follow his principles of scientific method and his emphasis on sharing and communicating scientific data and results. The Berlin Academy was founded in 1700 and the St. Petersburg Academy in 1724. These academies and societies became the focal points of an international network of scientists who corresponded, read each other's works, and even visited each other as the new scientific method took hold.

Official bodies were able to fund expensive experiments and assemble or commission new equipment. They showed these experiments to the public, a practice that illustrates that what was new here was not the act of discovery but the creation of a culture of discovery. Scientists went much further than a real-time audience and ensured their results were printed for a far wider (and more critical) readership in journals and books. Here, in print, the experiments were described in great detail, and the results were presented for all to see. In this way, scientists were able to create "virtual witnesses" to their experiments. Now, anyone who cared to be could become a participant in the development of knowledge acquired through science.

Subscribe to topic Related Content Books Cite This Work License

Bibliography

  • Burns, William E. The Scientific Revolution in Global Perspective. Oxford University Press, 2015.
  • Burns, William E. The Scientific Revolution. ABC-CLIO, 2001.
  • Bynum, William F. & Browne, Janet & Porter, Roy. Dictionary of the History of Science . Princeton University Press, 1982.
  • Henry, John. The Scientific Revolution and the Origins of Modern Science . Red Globe Press, 2008.
  • Jardine, Lisa. Ingenious Pursuits. Nan A. Talese, 1999.
  • Moran, Bruce T. Distilling Knowledge. Harvard University Press, 2006.
  • Wootton, David. The Invention of Science. Harper, 2015.

About the Author

Mark Cartwright

Translations

We want people all over the world to learn about history. Help us and translate this definition into another language!

Questions & Answers

What are the different steps of the scientific method, what was the scientific method in the scientific revolution, related content.

Scientific Revolution

Scientific Revolution

Science

Ancient Greek Science

Roman Science

Roman Science

The Scientific Revolution

The Scientific Revolution

Mesopotamian Science and Technology

Mesopotamian Science and Technology

Free for the world, supported by you.

World History Encyclopedia is a non-profit organization. For only $5 per month you can become a member and support our mission to engage people with cultural heritage and to improve history education worldwide.

Recommended Books

Cite This Work

Cartwright, M. (2023, November 07). Scientific Method . World History Encyclopedia . Retrieved from https://www.worldhistory.org/Scientific_Method/

Chicago Style

Cartwright, Mark. " Scientific Method ." World History Encyclopedia . Last modified November 07, 2023. https://www.worldhistory.org/Scientific_Method/.

Cartwright, Mark. " Scientific Method ." World History Encyclopedia . World History Encyclopedia, 07 Nov 2023. Web. 17 Sep 2024.

License & Copyright

Submitted by Mark Cartwright , published on 07 November 2023. The copyright holder has published this content under the following license: Creative Commons Attribution-NonCommercial-ShareAlike . This license lets others remix, tweak, and build upon this content non-commercially, as long as they credit the author and license their new creations under the identical terms. When republishing on the web a hyperlink back to the original content source URL must be included. Please note that content linked from this page may have different licensing terms.

Scientific Method Example

Illustration by J.R. Bee. ThoughtCo. 

  • Cell Biology
  • Weather & Climate
  • B.A., Biology, Emory University
  • A.S., Nursing, Chattahoochee Technical College

The scientific method is a series of steps that scientific investigators follow to answer specific questions about the natural world. Scientists use the scientific method to make observations, formulate hypotheses , and conduct scientific experiments .

A scientific inquiry starts with an observation. Then, the formulation of a question about what has been observed follows. Next, the scientist will proceed through the remaining steps of the scientific method to end at a conclusion.

The six steps of the scientific method are as follows:

Observation

The first step of the scientific method involves making an observation about something that interests you. Taking an interest in your scientific discovery is important, for example, if you are doing a science project , because you will want to work on something that holds your attention. Your observation can be of anything from plant movement to animal behavior, as long as it is something you want to know more about.​ This step is when you will come up with an idea if you are working on a science project.

Once you have made your observation, you must formulate a question about what you observed. Your question should summarize what it is you are trying to discover or accomplish in your experiment. When stating your question, be as specific as possible.​ For example, if you are doing a project on plants , you may want to know how plants interact with microbes. Your question could be: Do plant spices inhibit bacterial growth ?

The hypothesis is a key component of the scientific process. A hypothesis is an idea that is suggested as an explanation for a natural event, a particular experience, or a specific condition that can be tested through definable experimentation. It states the purpose of your experiment, the variables used, and the predicted outcome of your experiment. It is important to note that a hypothesis must be testable. That means that you should be able to test your hypothesis through experimentation .​ Your hypothesis must either be supported or falsified by your experiment. An example of a good hypothesis is: If there is a relation between listening to music and heart rate, then listening to music will cause a person's resting heart rate to either increase or decrease.

Once you have developed a hypothesis, you must design and conduct an experiment that will test it. You should develop a procedure that states clearly how you plan to conduct your experiment. It is important you include and identify a controlled variable or dependent variable in your procedure. Controls allow us to test a single variable in an experiment because they are unchanged. We can then make observations and comparisons between our controls and our independent variables (things that change in the experiment) to develop an accurate conclusion.​

The results are where you report what happened in the experiment. That includes detailing all observations and data made during your experiment. Most people find it easier to visualize the data by charting or graphing the information.​

Developing a conclusion is the final step of the scientific method. This is where you analyze the results from the experiment and reach a determination about the hypothesis. Did the experiment support or reject your hypothesis? If your hypothesis was supported, great. If not, repeat the experiment or think of ways to improve your procedure.

  • Glycolysis Steps
  • Biology Prefixes and Suffixes: chrom- or chromo-
  • Biology Prefixes and Suffixes: proto-
  • 6 Things You Should Know About Biological Evolution
  • Biology Prefixes and Suffixes: Aer- or Aero-
  • Taxonomy and Organism Classification
  • Homeostasis
  • Biology Prefixes and Suffixes: diplo-
  • The Biology Suffix -lysis
  • Biology Prefixes and Suffixes Index
  • Biology Prefixes and Suffixes: tel- or telo-
  • Parasitism: Definition and Examples
  • Biology Prefixes and Suffixes: Erythr- or Erythro-
  • Biology Prefixes and Suffixes: ana-
  • Biology Prefixes and Suffixes: phago- or phag-
  • Biology Prefixes and Suffixes: -phyll or -phyl

Back Home

  • Science Notes Posts
  • Contact Science Notes
  • Todd Helmenstine Biography
  • Anne Helmenstine Biography
  • Free Printable Periodic Tables (PDF and PNG)
  • Periodic Table Wallpapers
  • Interactive Periodic Table
  • Periodic Table Posters
  • Science Experiments for Kids
  • How to Grow Crystals
  • Chemistry Projects
  • Fire and Flames Projects
  • Holiday Science
  • Chemistry Problems With Answers
  • Physics Problems
  • Unit Conversion Example Problems
  • Chemistry Worksheets
  • Biology Worksheets
  • Periodic Table Worksheets
  • Physical Science Worksheets
  • Science Lab Worksheets
  • My Amazon Books

Experiment Definition in Science – What Is a Science Experiment?

Experiment Definition in Science

In science, an experiment is simply a test of a hypothesis in the scientific method . It is a controlled examination of cause and effect. Here is a look at what a science experiment is (and is not), the key factors in an experiment, examples, and types of experiments.

Experiment Definition in Science

By definition, an experiment is a procedure that tests a hypothesis. A hypothesis, in turn, is a prediction of cause and effect or the predicted outcome of changing one factor of a situation. Both the hypothesis and experiment are components of the scientific method. The steps of the scientific method are:

  • Make observations.
  • Ask a question or identify a problem.
  • State a hypothesis.
  • Perform an experiment that tests the hypothesis.
  • Based on the results of the experiment, either accept or reject the hypothesis.
  • Draw conclusions and report the outcome of the experiment.

Key Parts of an Experiment

The two key parts of an experiment are the independent and dependent variables. The independent variable is the one factor that you control or change in an experiment. The dependent variable is the factor that you measure that responds to the independent variable. An experiment often includes other types of variables , but at its heart, it’s all about the relationship between the independent and dependent variable.

Examples of Experiments

Fertilizer and plant size.

For example, you think a certain fertilizer helps plants grow better. You’ve watched your plants grow and they seem to do better when they have the fertilizer compared to when they don’t. But, observations are only the beginning of science. So, you state a hypothesis: Adding fertilizer increases plant size. Note, you could have stated the hypothesis in different ways. Maybe you think the fertilizer increases plant mass or fruit production, for example. However you state the hypothesis, it includes both the independent and dependent variables. In this case, the independent variable is the presence or absence of fertilizer. The dependent variable is the response to the independent variable, which is the size of the plants.

Now that you have a hypothesis, the next step is designing an experiment that tests it. Experimental design is very important because the way you conduct an experiment influences its outcome. For example, if you use too small of an amount of fertilizer you may see no effect from the treatment. Or, if you dump an entire container of fertilizer on a plant you could kill it! So, recording the steps of the experiment help you judge the outcome of the experiment and aid others who come after you and examine your work. Other factors that might influence your results might include the species of plant and duration of the treatment. Record any conditions that might affect the outcome. Ideally, you want the only difference between your two groups of plants to be whether or not they receive fertilizer. Then, measure the height of the plants and see if there is a difference between the two groups.

Salt and Cookies

You don’t need a lab for an experiment. For example, consider a baking experiment. Let’s say you like the flavor of salt in your cookies, but you’re pretty sure the batch you made using extra salt fell a bit flat. If you double the amount of salt in a recipe, will it affect their size? Here, the independent variable is the amount of salt in the recipe and the dependent variable is cookie size.

Test this hypothesis with an experiment. Bake cookies using the normal recipe (your control group ) and bake some using twice the salt (the experimental group). Make sure it’s the exact same recipe. Bake the cookies at the same temperature and for the same time. Only change the amount of salt in the recipe. Then measure the height or diameter of the cookies and decide whether to accept or reject the hypothesis.

Examples of Things That Are Not Experiments

Based on the examples of experiments, you should see what is not an experiment:

  • Making observations does not constitute an experiment. Initial observations often lead to an experiment, but are not a substitute for one.
  • Making a model is not an experiment.
  • Neither is making a poster.
  • Just trying something to see what happens is not an experiment. You need a hypothesis or prediction about the outcome.
  • Changing a lot of things at once isn’t an experiment. You only have one independent and one dependent variable. However, in an experiment, you might suspect the independent variable has an effect on a separate. So, you design a new experiment to test this.

Types of Experiments

There are three main types of experiments: controlled experiments, natural experiments, and field experiments,

  • Controlled experiment : A controlled experiment compares two groups of samples that differ only in independent variable. For example, a drug trial compares the effect of a group taking a placebo (control group) against those getting the drug (the treatment group). Experiments in a lab or home generally are controlled experiments
  • Natural experiment : Another name for a natural experiment is a quasi-experiment. In this type of experiment, the researcher does not directly control the independent variable, plus there may be other variables at play. Here, the goal is establishing a correlation between the independent and dependent variable. For example, in the formation of new elements a scientist hypothesizes that a certain collision between particles creates a new atom. But, other outcomes may be possible. Or, perhaps only decay products are observed that indicate the element, and not the new atom itself. Many fields of science rely on natural experiments, since controlled experiments aren’t always possible.
  • Field experiment : While a controlled experiments takes place in a lab or other controlled setting, a field experiment occurs in a natural setting. Some phenomena cannot be readily studied in a lab or else the setting exerts an influence that affects the results. So, a field experiment may have higher validity. However, since the setting is not controlled, it is also subject to external factors and potential contamination. For example, if you study whether a certain plumage color affects bird mate selection, a field experiment in a natural environment eliminates the stressors of an artificial environment. Yet, other factors that could be controlled in a lab may influence results. For example, nutrition and health are controlled in a lab, but not in the field.
  • Bailey, R.A. (2008). Design of Comparative Experiments . Cambridge: Cambridge University Press. ISBN 9780521683579.
  • di Francia, G. Toraldo (1981). The Investigation of the Physical World . Cambridge University Press. ISBN 0-521-29925-X.
  • Hinkelmann, Klaus; Kempthorne, Oscar (2008). Design and Analysis of Experiments. Volume I: Introduction to Experimental Design (2nd ed.). Wiley. ISBN 978-0-471-72756-9.
  • Holland, Paul W. (December 1986). “Statistics and Causal Inference”.  Journal of the American Statistical Association . 81 (396): 945–960. doi: 10.2307/2289064
  • Stohr-Hunt, Patricia (1996). “An Analysis of Frequency of Hands-on Experience and Science Achievement”. Journal of Research in Science Teaching . 33 (1): 101–109. doi: 10.1002/(SICI)1098-2736(199601)33:1<101::AID-TEA6>3.0.CO;2-Z

Related Posts

SEP home page

  • Table of Contents
  • Random Entry
  • Chronological
  • Editorial Information
  • About the SEP
  • Editorial Board
  • How to Cite the SEP
  • Special Characters
  • Advanced Tools
  • Support the SEP
  • PDFs for SEP Friends
  • Make a Donation
  • SEPIA for Libraries
  • Entry Contents

Bibliography

Academic tools.

  • Friends PDF Preview
  • Author and Citation Info
  • Back to Top

Scientific Method

Science is an enormously successful human enterprise. The study of scientific method is the attempt to discern the activities by which that success is achieved. Among the activities often identified as characteristic of science are systematic observation and experimentation, inductive and deductive reasoning, and the formation and testing of hypotheses and theories. How these are carried out in detail can vary greatly, but characteristics like these have been looked to as a way of demarcating scientific activity from non-science, where only enterprises which employ some canonical form of scientific method or methods should be considered science (see also the entry on science and pseudo-science ). Others have questioned whether there is anything like a fixed toolkit of methods which is common across science and only science. Some reject privileging one view of method as part of rejecting broader views about the nature of science, such as naturalism (Dupré 2004); some reject any restriction in principle (pluralism).

Scientific method should be distinguished from the aims and products of science, such as knowledge, predictions, or control. Methods are the means by which those goals are achieved. Scientific method should also be distinguished from meta-methodology, which includes the values and justifications behind a particular characterization of scientific method (i.e., a methodology) — values such as objectivity, reproducibility, simplicity, or past successes. Methodological rules are proposed to govern method and it is a meta-methodological question whether methods obeying those rules satisfy given values. Finally, method is distinct, to some degree, from the detailed and contextual practices through which methods are implemented. The latter might range over: specific laboratory techniques; mathematical formalisms or other specialized languages used in descriptions and reasoning; technological or other material means; ways of communicating and sharing results, whether with other scientists or with the public at large; or the conventions, habits, enforced customs, and institutional controls over how and what science is carried out.

While it is important to recognize these distinctions, their boundaries are fuzzy. Hence, accounts of method cannot be entirely divorced from their methodological and meta-methodological motivations or justifications, Moreover, each aspect plays a crucial role in identifying methods. Disputes about method have therefore played out at the detail, rule, and meta-rule levels. Changes in beliefs about the certainty or fallibility of scientific knowledge, for instance (which is a meta-methodological consideration of what we can hope for methods to deliver), have meant different emphases on deductive and inductive reasoning, or on the relative importance attached to reasoning over observation (i.e., differences over particular methods.) Beliefs about the role of science in society will affect the place one gives to values in scientific method.

The issue which has shaped debates over scientific method the most in the last half century is the question of how pluralist do we need to be about method? Unificationists continue to hold out for one method essential to science; nihilism is a form of radical pluralism, which considers the effectiveness of any methodological prescription to be so context sensitive as to render it not explanatory on its own. Some middle degree of pluralism regarding the methods embodied in scientific practice seems appropriate. But the details of scientific practice vary with time and place, from institution to institution, across scientists and their subjects of investigation. How significant are the variations for understanding science and its success? How much can method be abstracted from practice? This entry describes some of the attempts to characterize scientific method or methods, as well as arguments for a more context-sensitive approach to methods embedded in actual scientific practices.

1. Overview and organizing themes

2. historical review: aristotle to mill, 3.1 logical constructionism and operationalism, 3.2. h-d as a logic of confirmation, 3.3. popper and falsificationism, 3.4 meta-methodology and the end of method, 4. statistical methods for hypothesis testing, 5.1 creative and exploratory practices.

  • 5.2 Computer methods and the ‘new ways’ of doing science

6.1 “The scientific method” in science education and as seen by scientists

6.2 privileged methods and ‘gold standards’, 6.3 scientific method in the court room, 6.4 deviating practices, 7. conclusion, other internet resources, related entries.

This entry could have been given the title Scientific Methods and gone on to fill volumes, or it could have been extremely short, consisting of a brief summary rejection of the idea that there is any such thing as a unique Scientific Method at all. Both unhappy prospects are due to the fact that scientific activity varies so much across disciplines, times, places, and scientists that any account which manages to unify it all will either consist of overwhelming descriptive detail, or trivial generalizations.

The choice of scope for the present entry is more optimistic, taking a cue from the recent movement in philosophy of science toward a greater attention to practice: to what scientists actually do. This “turn to practice” can be seen as the latest form of studies of methods in science, insofar as it represents an attempt at understanding scientific activity, but through accounts that are neither meant to be universal and unified, nor singular and narrowly descriptive. To some extent, different scientists at different times and places can be said to be using the same method even though, in practice, the details are different.

Whether the context in which methods are carried out is relevant, or to what extent, will depend largely on what one takes the aims of science to be and what one’s own aims are. For most of the history of scientific methodology the assumption has been that the most important output of science is knowledge and so the aim of methodology should be to discover those methods by which scientific knowledge is generated.

Science was seen to embody the most successful form of reasoning (but which form?) to the most certain knowledge claims (but how certain?) on the basis of systematically collected evidence (but what counts as evidence, and should the evidence of the senses take precedence, or rational insight?) Section 2 surveys some of the history, pointing to two major themes. One theme is seeking the right balance between observation and reasoning (and the attendant forms of reasoning which employ them); the other is how certain scientific knowledge is or can be.

Section 3 turns to 20 th century debates on scientific method. In the second half of the 20 th century the epistemic privilege of science faced several challenges and many philosophers of science abandoned the reconstruction of the logic of scientific method. Views changed significantly regarding which functions of science ought to be captured and why. For some, the success of science was better identified with social or cultural features. Historical and sociological turns in the philosophy of science were made, with a demand that greater attention be paid to the non-epistemic aspects of science, such as sociological, institutional, material, and political factors. Even outside of those movements there was an increased specialization in the philosophy of science, with more and more focus on specific fields within science. The combined upshot was very few philosophers arguing any longer for a grand unified methodology of science. Sections 3 and 4 surveys the main positions on scientific method in 20 th century philosophy of science, focusing on where they differ in their preference for confirmation or falsification or for waiving the idea of a special scientific method altogether.

In recent decades, attention has primarily been paid to scientific activities traditionally falling under the rubric of method, such as experimental design and general laboratory practice, the use of statistics, the construction and use of models and diagrams, interdisciplinary collaboration, and science communication. Sections 4–6 attempt to construct a map of the current domains of the study of methods in science.

As these sections illustrate, the question of method is still central to the discourse about science. Scientific method remains a topic for education, for science policy, and for scientists. It arises in the public domain where the demarcation or status of science is at issue. Some philosophers have recently returned, therefore, to the question of what it is that makes science a unique cultural product. This entry will close with some of these recent attempts at discerning and encapsulating the activities by which scientific knowledge is achieved.

Attempting a history of scientific method compounds the vast scope of the topic. This section briefly surveys the background to modern methodological debates. What can be called the classical view goes back to antiquity, and represents a point of departure for later divergences. [ 1 ]

We begin with a point made by Laudan (1968) in his historical survey of scientific method:

Perhaps the most serious inhibition to the emergence of the history of theories of scientific method as a respectable area of study has been the tendency to conflate it with the general history of epistemology, thereby assuming that the narrative categories and classificatory pigeon-holes applied to the latter are also basic to the former. (1968: 5)

To see knowledge about the natural world as falling under knowledge more generally is an understandable conflation. Histories of theories of method would naturally employ the same narrative categories and classificatory pigeon holes. An important theme of the history of epistemology, for example, is the unification of knowledge, a theme reflected in the question of the unification of method in science. Those who have identified differences in kinds of knowledge have often likewise identified different methods for achieving that kind of knowledge (see the entry on the unity of science ).

Different views on what is known, how it is known, and what can be known are connected. Plato distinguished the realms of things into the visible and the intelligible ( The Republic , 510a, in Cooper 1997). Only the latter, the Forms, could be objects of knowledge. The intelligible truths could be known with the certainty of geometry and deductive reasoning. What could be observed of the material world, however, was by definition imperfect and deceptive, not ideal. The Platonic way of knowledge therefore emphasized reasoning as a method, downplaying the importance of observation. Aristotle disagreed, locating the Forms in the natural world as the fundamental principles to be discovered through the inquiry into nature ( Metaphysics Z , in Barnes 1984).

Aristotle is recognized as giving the earliest systematic treatise on the nature of scientific inquiry in the western tradition, one which embraced observation and reasoning about the natural world. In the Prior and Posterior Analytics , Aristotle reflects first on the aims and then the methods of inquiry into nature. A number of features can be found which are still considered by most to be essential to science. For Aristotle, empiricism, careful observation (but passive observation, not controlled experiment), is the starting point. The aim is not merely recording of facts, though. For Aristotle, science ( epistêmê ) is a body of properly arranged knowledge or learning—the empirical facts, but also their ordering and display are of crucial importance. The aims of discovery, ordering, and display of facts partly determine the methods required of successful scientific inquiry. Also determinant is the nature of the knowledge being sought, and the explanatory causes proper to that kind of knowledge (see the discussion of the four causes in the entry on Aristotle on causality ).

In addition to careful observation, then, scientific method requires a logic as a system of reasoning for properly arranging, but also inferring beyond, what is known by observation. Methods of reasoning may include induction, prediction, or analogy, among others. Aristotle’s system (along with his catalogue of fallacious reasoning) was collected under the title the Organon . This title would be echoed in later works on scientific reasoning, such as Novum Organon by Francis Bacon, and Novum Organon Restorum by William Whewell (see below). In Aristotle’s Organon reasoning is divided primarily into two forms, a rough division which persists into modern times. The division, known most commonly today as deductive versus inductive method, appears in other eras and methodologies as analysis/​synthesis, non-ampliative/​ampliative, or even confirmation/​verification. The basic idea is there are two “directions” to proceed in our methods of inquiry: one away from what is observed, to the more fundamental, general, and encompassing principles; the other, from the fundamental and general to instances or implications of principles.

The basic aim and method of inquiry identified here can be seen as a theme running throughout the next two millennia of reflection on the correct way to seek after knowledge: carefully observe nature and then seek rules or principles which explain or predict its operation. The Aristotelian corpus provided the framework for a commentary tradition on scientific method independent of science itself (cosmos versus physics.) During the medieval period, figures such as Albertus Magnus (1206–1280), Thomas Aquinas (1225–1274), Robert Grosseteste (1175–1253), Roger Bacon (1214/1220–1292), William of Ockham (1287–1347), Andreas Vesalius (1514–1546), Giacomo Zabarella (1533–1589) all worked to clarify the kind of knowledge obtainable by observation and induction, the source of justification of induction, and best rules for its application. [ 2 ] Many of their contributions we now think of as essential to science (see also Laudan 1968). As Aristotle and Plato had employed a framework of reasoning either “to the forms” or “away from the forms”, medieval thinkers employed directions away from the phenomena or back to the phenomena. In analysis, a phenomena was examined to discover its basic explanatory principles; in synthesis, explanations of a phenomena were constructed from first principles.

During the Scientific Revolution these various strands of argument, experiment, and reason were forged into a dominant epistemic authority. The 16 th –18 th centuries were a period of not only dramatic advance in knowledge about the operation of the natural world—advances in mechanical, medical, biological, political, economic explanations—but also of self-awareness of the revolutionary changes taking place, and intense reflection on the source and legitimation of the method by which the advances were made. The struggle to establish the new authority included methodological moves. The Book of Nature, according to the metaphor of Galileo Galilei (1564–1642) or Francis Bacon (1561–1626), was written in the language of mathematics, of geometry and number. This motivated an emphasis on mathematical description and mechanical explanation as important aspects of scientific method. Through figures such as Henry More and Ralph Cudworth, a neo-Platonic emphasis on the importance of metaphysical reflection on nature behind appearances, particularly regarding the spiritual as a complement to the purely mechanical, remained an important methodological thread of the Scientific Revolution (see the entries on Cambridge platonists ; Boyle ; Henry More ; Galileo ).

In Novum Organum (1620), Bacon was critical of the Aristotelian method for leaping from particulars to universals too quickly. The syllogistic form of reasoning readily mixed those two types of propositions. Bacon aimed at the invention of new arts, principles, and directions. His method would be grounded in methodical collection of observations, coupled with correction of our senses (and particularly, directions for the avoidance of the Idols, as he called them, kinds of systematic errors to which naïve observers are prone.) The community of scientists could then climb, by a careful, gradual and unbroken ascent, to reliable general claims.

Bacon’s method has been criticized as impractical and too inflexible for the practicing scientist. Whewell would later criticize Bacon in his System of Logic for paying too little attention to the practices of scientists. It is hard to find convincing examples of Bacon’s method being put in to practice in the history of science, but there are a few who have been held up as real examples of 16 th century scientific, inductive method, even if not in the rigid Baconian mold: figures such as Robert Boyle (1627–1691) and William Harvey (1578–1657) (see the entry on Bacon ).

It is to Isaac Newton (1642–1727), however, that historians of science and methodologists have paid greatest attention. Given the enormous success of his Principia Mathematica and Opticks , this is understandable. The study of Newton’s method has had two main thrusts: the implicit method of the experiments and reasoning presented in the Opticks, and the explicit methodological rules given as the Rules for Philosophising (the Regulae) in Book III of the Principia . [ 3 ] Newton’s law of gravitation, the linchpin of his new cosmology, broke with explanatory conventions of natural philosophy, first for apparently proposing action at a distance, but more generally for not providing “true”, physical causes. The argument for his System of the World ( Principia , Book III) was based on phenomena, not reasoned first principles. This was viewed (mainly on the continent) as insufficient for proper natural philosophy. The Regulae counter this objection, re-defining the aims of natural philosophy by re-defining the method natural philosophers should follow. (See the entry on Newton’s philosophy .)

To his list of methodological prescriptions should be added Newton’s famous phrase “ hypotheses non fingo ” (commonly translated as “I frame no hypotheses”.) The scientist was not to invent systems but infer explanations from observations, as Bacon had advocated. This would come to be known as inductivism. In the century after Newton, significant clarifications of the Newtonian method were made. Colin Maclaurin (1698–1746), for instance, reconstructed the essential structure of the method as having complementary analysis and synthesis phases, one proceeding away from the phenomena in generalization, the other from the general propositions to derive explanations of new phenomena. Denis Diderot (1713–1784) and editors of the Encyclopédie did much to consolidate and popularize Newtonianism, as did Francesco Algarotti (1721–1764). The emphasis was often the same, as much on the character of the scientist as on their process, a character which is still commonly assumed. The scientist is humble in the face of nature, not beholden to dogma, obeys only his eyes, and follows the truth wherever it leads. It was certainly Voltaire (1694–1778) and du Chatelet (1706–1749) who were most influential in propagating the latter vision of the scientist and their craft, with Newton as hero. Scientific method became a revolutionary force of the Enlightenment. (See also the entries on Newton , Leibniz , Descartes , Boyle , Hume , enlightenment , as well as Shank 2008 for a historical overview.)

Not all 18 th century reflections on scientific method were so celebratory. Famous also are George Berkeley’s (1685–1753) attack on the mathematics of the new science, as well as the over-emphasis of Newtonians on observation; and David Hume’s (1711–1776) undermining of the warrant offered for scientific claims by inductive justification (see the entries on: George Berkeley ; David Hume ; Hume’s Newtonianism and Anti-Newtonianism ). Hume’s problem of induction motivated Immanuel Kant (1724–1804) to seek new foundations for empirical method, though as an epistemic reconstruction, not as any set of practical guidelines for scientists. Both Hume and Kant influenced the methodological reflections of the next century, such as the debate between Mill and Whewell over the certainty of inductive inferences in science.

The debate between John Stuart Mill (1806–1873) and William Whewell (1794–1866) has become the canonical methodological debate of the 19 th century. Although often characterized as a debate between inductivism and hypothetico-deductivism, the role of the two methods on each side is actually more complex. On the hypothetico-deductive account, scientists work to come up with hypotheses from which true observational consequences can be deduced—hence, hypothetico-deductive. Because Whewell emphasizes both hypotheses and deduction in his account of method, he can be seen as a convenient foil to the inductivism of Mill. However, equally if not more important to Whewell’s portrayal of scientific method is what he calls the “fundamental antithesis”. Knowledge is a product of the objective (what we see in the world around us) and subjective (the contributions of our mind to how we perceive and understand what we experience, which he called the Fundamental Ideas). Both elements are essential according to Whewell, and he was therefore critical of Kant for too much focus on the subjective, and John Locke (1632–1704) and Mill for too much focus on the senses. Whewell’s fundamental ideas can be discipline relative. An idea can be fundamental even if it is necessary for knowledge only within a given scientific discipline (e.g., chemical affinity for chemistry). This distinguishes fundamental ideas from the forms and categories of intuition of Kant. (See the entry on Whewell .)

Clarifying fundamental ideas would therefore be an essential part of scientific method and scientific progress. Whewell called this process “Discoverer’s Induction”. It was induction, following Bacon or Newton, but Whewell sought to revive Bacon’s account by emphasising the role of ideas in the clear and careful formulation of inductive hypotheses. Whewell’s induction is not merely the collecting of objective facts. The subjective plays a role through what Whewell calls the Colligation of Facts, a creative act of the scientist, the invention of a theory. A theory is then confirmed by testing, where more facts are brought under the theory, called the Consilience of Inductions. Whewell felt that this was the method by which the true laws of nature could be discovered: clarification of fundamental concepts, clever invention of explanations, and careful testing. Mill, in his critique of Whewell, and others who have cast Whewell as a fore-runner of the hypothetico-deductivist view, seem to have under-estimated the importance of this discovery phase in Whewell’s understanding of method (Snyder 1997a,b, 1999). Down-playing the discovery phase would come to characterize methodology of the early 20 th century (see section 3 ).

Mill, in his System of Logic , put forward a narrower view of induction as the essence of scientific method. For Mill, induction is the search first for regularities among events. Among those regularities, some will continue to hold for further observations, eventually gaining the status of laws. One can also look for regularities among the laws discovered in a domain, i.e., for a law of laws. Which “law law” will hold is time and discipline dependent and open to revision. One example is the Law of Universal Causation, and Mill put forward specific methods for identifying causes—now commonly known as Mill’s methods. These five methods look for circumstances which are common among the phenomena of interest, those which are absent when the phenomena are, or those for which both vary together. Mill’s methods are still seen as capturing basic intuitions about experimental methods for finding the relevant explanatory factors ( System of Logic (1843), see Mill entry). The methods advocated by Whewell and Mill, in the end, look similar. Both involve inductive generalization to covering laws. They differ dramatically, however, with respect to the necessity of the knowledge arrived at; that is, at the meta-methodological level (see the entries on Whewell and Mill entries).

3. Logic of method and critical responses

The quantum and relativistic revolutions in physics in the early 20 th century had a profound effect on methodology. Conceptual foundations of both theories were taken to show the defeasibility of even the most seemingly secure intuitions about space, time and bodies. Certainty of knowledge about the natural world was therefore recognized as unattainable. Instead a renewed empiricism was sought which rendered science fallible but still rationally justifiable.

Analyses of the reasoning of scientists emerged, according to which the aspects of scientific method which were of primary importance were the means of testing and confirming of theories. A distinction in methodology was made between the contexts of discovery and justification. The distinction could be used as a wedge between the particularities of where and how theories or hypotheses are arrived at, on the one hand, and the underlying reasoning scientists use (whether or not they are aware of it) when assessing theories and judging their adequacy on the basis of the available evidence. By and large, for most of the 20 th century, philosophy of science focused on the second context, although philosophers differed on whether to focus on confirmation or refutation as well as on the many details of how confirmation or refutation could or could not be brought about. By the mid-20 th century these attempts at defining the method of justification and the context distinction itself came under pressure. During the same period, philosophy of science developed rapidly, and from section 4 this entry will therefore shift from a primarily historical treatment of the scientific method towards a primarily thematic one.

Advances in logic and probability held out promise of the possibility of elaborate reconstructions of scientific theories and empirical method, the best example being Rudolf Carnap’s The Logical Structure of the World (1928). Carnap attempted to show that a scientific theory could be reconstructed as a formal axiomatic system—that is, a logic. That system could refer to the world because some of its basic sentences could be interpreted as observations or operations which one could perform to test them. The rest of the theoretical system, including sentences using theoretical or unobservable terms (like electron or force) would then either be meaningful because they could be reduced to observations, or they had purely logical meanings (called analytic, like mathematical identities). This has been referred to as the verifiability criterion of meaning. According to the criterion, any statement not either analytic or verifiable was strictly meaningless. Although the view was endorsed by Carnap in 1928, he would later come to see it as too restrictive (Carnap 1956). Another familiar version of this idea is operationalism of Percy William Bridgman. In The Logic of Modern Physics (1927) Bridgman asserted that every physical concept could be defined in terms of the operations one would perform to verify the application of that concept. Making good on the operationalisation of a concept even as simple as length, however, can easily become enormously complex (for measuring very small lengths, for instance) or impractical (measuring large distances like light years.)

Carl Hempel’s (1950, 1951) criticisms of the verifiability criterion of meaning had enormous influence. He pointed out that universal generalizations, such as most scientific laws, were not strictly meaningful on the criterion. Verifiability and operationalism both seemed too restrictive to capture standard scientific aims and practice. The tenuous connection between these reconstructions and actual scientific practice was criticized in another way. In both approaches, scientific methods are instead recast in methodological roles. Measurements, for example, were looked to as ways of giving meanings to terms. The aim of the philosopher of science was not to understand the methods per se , but to use them to reconstruct theories, their meanings, and their relation to the world. When scientists perform these operations, however, they will not report that they are doing them to give meaning to terms in a formal axiomatic system. This disconnect between methodology and the details of actual scientific practice would seem to violate the empiricism the Logical Positivists and Bridgman were committed to. The view that methodology should correspond to practice (to some extent) has been called historicism, or intuitionism. We turn to these criticisms and responses in section 3.4 . [ 4 ]

Positivism also had to contend with the recognition that a purely inductivist approach, along the lines of Bacon-Newton-Mill, was untenable. There was no pure observation, for starters. All observation was theory laden. Theory is required to make any observation, therefore not all theory can be derived from observation alone. (See the entry on theory and observation in science .) Even granting an observational basis, Hume had already pointed out that one could not deductively justify inductive conclusions without begging the question by presuming the success of the inductive method. Likewise, positivist attempts at analyzing how a generalization can be confirmed by observations of its instances were subject to a number of criticisms. Goodman (1965) and Hempel (1965) both point to paradoxes inherent in standard accounts of confirmation. Recent attempts at explaining how observations can serve to confirm a scientific theory are discussed in section 4 below.

The standard starting point for a non-inductive analysis of the logic of confirmation is known as the Hypothetico-Deductive (H-D) method. In its simplest form, a sentence of a theory which expresses some hypothesis is confirmed by its true consequences. As noted in section 2 , this method had been advanced by Whewell in the 19 th century, as well as Nicod (1924) and others in the 20 th century. Often, Hempel’s (1966) description of the H-D method, illustrated by the case of Semmelweiss’ inferential procedures in establishing the cause of childbed fever, has been presented as a key account of H-D as well as a foil for criticism of the H-D account of confirmation (see, for example, Lipton’s (2004) discussion of inference to the best explanation; also the entry on confirmation ). Hempel described Semmelsweiss’ procedure as examining various hypotheses explaining the cause of childbed fever. Some hypotheses conflicted with observable facts and could be rejected as false immediately. Others needed to be tested experimentally by deducing which observable events should follow if the hypothesis were true (what Hempel called the test implications of the hypothesis), then conducting an experiment and observing whether or not the test implications occurred. If the experiment showed the test implication to be false, the hypothesis could be rejected. If the experiment showed the test implications to be true, however, this did not prove the hypothesis true. The confirmation of a test implication does not verify a hypothesis, though Hempel did allow that “it provides at least some support, some corroboration or confirmation for it” (Hempel 1966: 8). The degree of this support then depends on the quantity, variety and precision of the supporting evidence.

Another approach that took off from the difficulties with inductive inference was Karl Popper’s critical rationalism or falsificationism (Popper 1959, 1963). Falsification is deductive and similar to H-D in that it involves scientists deducing observational consequences from the hypothesis under test. For Popper, however, the important point was not the degree of confirmation that successful prediction offered to a hypothesis. The crucial thing was the logical asymmetry between confirmation, based on inductive inference, and falsification, which can be based on a deductive inference. (This simple opposition was later questioned, by Lakatos, among others. See the entry on historicist theories of scientific rationality. )

Popper stressed that, regardless of the amount of confirming evidence, we can never be certain that a hypothesis is true without committing the fallacy of affirming the consequent. Instead, Popper introduced the notion of corroboration as a measure for how well a theory or hypothesis has survived previous testing—but without implying that this is also a measure for the probability that it is true.

Popper was also motivated by his doubts about the scientific status of theories like the Marxist theory of history or psycho-analysis, and so wanted to demarcate between science and pseudo-science. Popper saw this as an importantly different distinction than demarcating science from metaphysics. The latter demarcation was the primary concern of many logical empiricists. Popper used the idea of falsification to draw a line instead between pseudo and proper science. Science was science because its method involved subjecting theories to rigorous tests which offered a high probability of failing and thus refuting the theory.

A commitment to the risk of failure was important. Avoiding falsification could be done all too easily. If a consequence of a theory is inconsistent with observations, an exception can be added by introducing auxiliary hypotheses designed explicitly to save the theory, so-called ad hoc modifications. This Popper saw done in pseudo-science where ad hoc theories appeared capable of explaining anything in their field of application. In contrast, science is risky. If observations showed the predictions from a theory to be wrong, the theory would be refuted. Hence, scientific hypotheses must be falsifiable. Not only must there exist some possible observation statement which could falsify the hypothesis or theory, were it observed, (Popper called these the hypothesis’ potential falsifiers) it is crucial to the Popperian scientific method that such falsifications be sincerely attempted on a regular basis.

The more potential falsifiers of a hypothesis, the more falsifiable it would be, and the more the hypothesis claimed. Conversely, hypotheses without falsifiers claimed very little or nothing at all. Originally, Popper thought that this meant the introduction of ad hoc hypotheses only to save a theory should not be countenanced as good scientific method. These would undermine the falsifiabililty of a theory. However, Popper later came to recognize that the introduction of modifications (immunizations, he called them) was often an important part of scientific development. Responding to surprising or apparently falsifying observations often generated important new scientific insights. Popper’s own example was the observed motion of Uranus which originally did not agree with Newtonian predictions. The ad hoc hypothesis of an outer planet explained the disagreement and led to further falsifiable predictions. Popper sought to reconcile the view by blurring the distinction between falsifiable and not falsifiable, and speaking instead of degrees of testability (Popper 1985: 41f.).

From the 1960s on, sustained meta-methodological criticism emerged that drove philosophical focus away from scientific method. A brief look at those criticisms follows, with recommendations for further reading at the end of the entry.

Thomas Kuhn’s The Structure of Scientific Revolutions (1962) begins with a well-known shot across the bow for philosophers of science:

History, if viewed as a repository for more than anecdote or chronology, could produce a decisive transformation in the image of science by which we are now possessed. (1962: 1)

The image Kuhn thought needed transforming was the a-historical, rational reconstruction sought by many of the Logical Positivists, though Carnap and other positivists were actually quite sympathetic to Kuhn’s views. (See the entry on the Vienna Circle .) Kuhn shares with other of his contemporaries, such as Feyerabend and Lakatos, a commitment to a more empirical approach to philosophy of science. Namely, the history of science provides important data, and necessary checks, for philosophy of science, including any theory of scientific method.

The history of science reveals, according to Kuhn, that scientific development occurs in alternating phases. During normal science, the members of the scientific community adhere to the paradigm in place. Their commitment to the paradigm means a commitment to the puzzles to be solved and the acceptable ways of solving them. Confidence in the paradigm remains so long as steady progress is made in solving the shared puzzles. Method in this normal phase operates within a disciplinary matrix (Kuhn’s later concept of a paradigm) which includes standards for problem solving, and defines the range of problems to which the method should be applied. An important part of a disciplinary matrix is the set of values which provide the norms and aims for scientific method. The main values that Kuhn identifies are prediction, problem solving, simplicity, consistency, and plausibility.

An important by-product of normal science is the accumulation of puzzles which cannot be solved with resources of the current paradigm. Once accumulation of these anomalies has reached some critical mass, it can trigger a communal shift to a new paradigm and a new phase of normal science. Importantly, the values that provide the norms and aims for scientific method may have transformed in the meantime. Method may therefore be relative to discipline, time or place

Feyerabend also identified the aims of science as progress, but argued that any methodological prescription would only stifle that progress (Feyerabend 1988). His arguments are grounded in re-examining accepted “myths” about the history of science. Heroes of science, like Galileo, are shown to be just as reliant on rhetoric and persuasion as they are on reason and demonstration. Others, like Aristotle, are shown to be far more reasonable and far-reaching in their outlooks then they are given credit for. As a consequence, the only rule that could provide what he took to be sufficient freedom was the vacuous “anything goes”. More generally, even the methodological restriction that science is the best way to pursue knowledge, and to increase knowledge, is too restrictive. Feyerabend suggested instead that science might, in fact, be a threat to a free society, because it and its myth had become so dominant (Feyerabend 1978).

An even more fundamental kind of criticism was offered by several sociologists of science from the 1970s onwards who rejected the methodology of providing philosophical accounts for the rational development of science and sociological accounts of the irrational mistakes. Instead, they adhered to a symmetry thesis on which any causal explanation of how scientific knowledge is established needs to be symmetrical in explaining truth and falsity, rationality and irrationality, success and mistakes, by the same causal factors (see, e.g., Barnes and Bloor 1982, Bloor 1991). Movements in the Sociology of Science, like the Strong Programme, or in the social dimensions and causes of knowledge more generally led to extended and close examination of detailed case studies in contemporary science and its history. (See the entries on the social dimensions of scientific knowledge and social epistemology .) Well-known examinations by Latour and Woolgar (1979/1986), Knorr-Cetina (1981), Pickering (1984), Shapin and Schaffer (1985) seem to bear out that it was social ideologies (on a macro-scale) or individual interactions and circumstances (on a micro-scale) which were the primary causal factors in determining which beliefs gained the status of scientific knowledge. As they saw it therefore, explanatory appeals to scientific method were not empirically grounded.

A late, and largely unexpected, criticism of scientific method came from within science itself. Beginning in the early 2000s, a number of scientists attempting to replicate the results of published experiments could not do so. There may be close conceptual connection between reproducibility and method. For example, if reproducibility means that the same scientific methods ought to produce the same result, and all scientific results ought to be reproducible, then whatever it takes to reproduce a scientific result ought to be called scientific method. Space limits us to the observation that, insofar as reproducibility is a desired outcome of proper scientific method, it is not strictly a part of scientific method. (See the entry on reproducibility of scientific results .)

By the close of the 20 th century the search for the scientific method was flagging. Nola and Sankey (2000b) could introduce their volume on method by remarking that “For some, the whole idea of a theory of scientific method is yester-year’s debate …”.

Despite the many difficulties that philosophers encountered in trying to providing a clear methodology of conformation (or refutation), still important progress has been made on understanding how observation can provide evidence for a given theory. Work in statistics has been crucial for understanding how theories can be tested empirically, and in recent decades a huge literature has developed that attempts to recast confirmation in Bayesian terms. Here these developments can be covered only briefly, and we refer to the entry on confirmation for further details and references.

Statistics has come to play an increasingly important role in the methodology of the experimental sciences from the 19 th century onwards. At that time, statistics and probability theory took on a methodological role as an analysis of inductive inference, and attempts to ground the rationality of induction in the axioms of probability theory have continued throughout the 20 th century and in to the present. Developments in the theory of statistics itself, meanwhile, have had a direct and immense influence on the experimental method, including methods for measuring the uncertainty of observations such as the Method of Least Squares developed by Legendre and Gauss in the early 19 th century, criteria for the rejection of outliers proposed by Peirce by the mid-19 th century, and the significance tests developed by Gosset (a.k.a. “Student”), Fisher, Neyman & Pearson and others in the 1920s and 1930s (see, e.g., Swijtink 1987 for a brief historical overview; and also the entry on C.S. Peirce ).

These developments within statistics then in turn led to a reflective discussion among both statisticians and philosophers of science on how to perceive the process of hypothesis testing: whether it was a rigorous statistical inference that could provide a numerical expression of the degree of confidence in the tested hypothesis, or if it should be seen as a decision between different courses of actions that also involved a value component. This led to a major controversy among Fisher on the one side and Neyman and Pearson on the other (see especially Fisher 1955, Neyman 1956 and Pearson 1955, and for analyses of the controversy, e.g., Howie 2002, Marks 2000, Lenhard 2006). On Fisher’s view, hypothesis testing was a methodology for when to accept or reject a statistical hypothesis, namely that a hypothesis should be rejected by evidence if this evidence would be unlikely relative to other possible outcomes, given the hypothesis were true. In contrast, on Neyman and Pearson’s view, the consequence of error also had to play a role when deciding between hypotheses. Introducing the distinction between the error of rejecting a true hypothesis (type I error) and accepting a false hypothesis (type II error), they argued that it depends on the consequences of the error to decide whether it is more important to avoid rejecting a true hypothesis or accepting a false one. Hence, Fisher aimed for a theory of inductive inference that enabled a numerical expression of confidence in a hypothesis. To him, the important point was the search for truth, not utility. In contrast, the Neyman-Pearson approach provided a strategy of inductive behaviour for deciding between different courses of action. Here, the important point was not whether a hypothesis was true, but whether one should act as if it was.

Similar discussions are found in the philosophical literature. On the one side, Churchman (1948) and Rudner (1953) argued that because scientific hypotheses can never be completely verified, a complete analysis of the methods of scientific inference includes ethical judgments in which the scientists must decide whether the evidence is sufficiently strong or that the probability is sufficiently high to warrant the acceptance of the hypothesis, which again will depend on the importance of making a mistake in accepting or rejecting the hypothesis. Others, such as Jeffrey (1956) and Levi (1960) disagreed and instead defended a value-neutral view of science on which scientists should bracket their attitudes, preferences, temperament, and values when assessing the correctness of their inferences. For more details on this value-free ideal in the philosophy of science and its historical development, see Douglas (2009) and Howard (2003). For a broad set of case studies examining the role of values in science, see e.g. Elliott & Richards 2017.

In recent decades, philosophical discussions of the evaluation of probabilistic hypotheses by statistical inference have largely focused on Bayesianism that understands probability as a measure of a person’s degree of belief in an event, given the available information, and frequentism that instead understands probability as a long-run frequency of a repeatable event. Hence, for Bayesians probabilities refer to a state of knowledge, whereas for frequentists probabilities refer to frequencies of events (see, e.g., Sober 2008, chapter 1 for a detailed introduction to Bayesianism and frequentism as well as to likelihoodism). Bayesianism aims at providing a quantifiable, algorithmic representation of belief revision, where belief revision is a function of prior beliefs (i.e., background knowledge) and incoming evidence. Bayesianism employs a rule based on Bayes’ theorem, a theorem of the probability calculus which relates conditional probabilities. The probability that a particular hypothesis is true is interpreted as a degree of belief, or credence, of the scientist. There will also be a probability and a degree of belief that a hypothesis will be true conditional on a piece of evidence (an observation, say) being true. Bayesianism proscribes that it is rational for the scientist to update their belief in the hypothesis to that conditional probability should it turn out that the evidence is, in fact, observed (see, e.g., Sprenger & Hartmann 2019 for a comprehensive treatment of Bayesian philosophy of science). Originating in the work of Neyman and Person, frequentism aims at providing the tools for reducing long-run error rates, such as the error-statistical approach developed by Mayo (1996) that focuses on how experimenters can avoid both type I and type II errors by building up a repertoire of procedures that detect errors if and only if they are present. Both Bayesianism and frequentism have developed over time, they are interpreted in different ways by its various proponents, and their relations to previous criticism to attempts at defining scientific method are seen differently by proponents and critics. The literature, surveys, reviews and criticism in this area are vast and the reader is referred to the entries on Bayesian epistemology and confirmation .

5. Method in Practice

Attention to scientific practice, as we have seen, is not itself new. However, the turn to practice in the philosophy of science of late can be seen as a correction to the pessimism with respect to method in philosophy of science in later parts of the 20 th century, and as an attempted reconciliation between sociological and rationalist explanations of scientific knowledge. Much of this work sees method as detailed and context specific problem-solving procedures, and methodological analyses to be at the same time descriptive, critical and advisory (see Nickles 1987 for an exposition of this view). The following section contains a survey of some of the practice focuses. In this section we turn fully to topics rather than chronology.

A problem with the distinction between the contexts of discovery and justification that figured so prominently in philosophy of science in the first half of the 20 th century (see section 2 ) is that no such distinction can be clearly seen in scientific activity (see Arabatzis 2006). Thus, in recent decades, it has been recognized that study of conceptual innovation and change should not be confined to psychology and sociology of science, but are also important aspects of scientific practice which philosophy of science should address (see also the entry on scientific discovery ). Looking for the practices that drive conceptual innovation has led philosophers to examine both the reasoning practices of scientists and the wide realm of experimental practices that are not directed narrowly at testing hypotheses, that is, exploratory experimentation.

Examining the reasoning practices of historical and contemporary scientists, Nersessian (2008) has argued that new scientific concepts are constructed as solutions to specific problems by systematic reasoning, and that of analogy, visual representation and thought-experimentation are among the important reasoning practices employed. These ubiquitous forms of reasoning are reliable—but also fallible—methods of conceptual development and change. On her account, model-based reasoning consists of cycles of construction, simulation, evaluation and adaption of models that serve as interim interpretations of the target problem to be solved. Often, this process will lead to modifications or extensions, and a new cycle of simulation and evaluation. However, Nersessian also emphasizes that

creative model-based reasoning cannot be applied as a simple recipe, is not always productive of solutions, and even its most exemplary usages can lead to incorrect solutions. (Nersessian 2008: 11)

Thus, while on the one hand she agrees with many previous philosophers that there is no logic of discovery, discoveries can derive from reasoned processes, such that a large and integral part of scientific practice is

the creation of concepts through which to comprehend, structure, and communicate about physical phenomena …. (Nersessian 1987: 11)

Similarly, work on heuristics for discovery and theory construction by scholars such as Darden (1991) and Bechtel & Richardson (1993) present science as problem solving and investigate scientific problem solving as a special case of problem-solving in general. Drawing largely on cases from the biological sciences, much of their focus has been on reasoning strategies for the generation, evaluation, and revision of mechanistic explanations of complex systems.

Addressing another aspect of the context distinction, namely the traditional view that the primary role of experiments is to test theoretical hypotheses according to the H-D model, other philosophers of science have argued for additional roles that experiments can play. The notion of exploratory experimentation was introduced to describe experiments driven by the desire to obtain empirical regularities and to develop concepts and classifications in which these regularities can be described (Steinle 1997, 2002; Burian 1997; Waters 2007)). However the difference between theory driven experimentation and exploratory experimentation should not be seen as a sharp distinction. Theory driven experiments are not always directed at testing hypothesis, but may also be directed at various kinds of fact-gathering, such as determining numerical parameters. Vice versa , exploratory experiments are usually informed by theory in various ways and are therefore not theory-free. Instead, in exploratory experiments phenomena are investigated without first limiting the possible outcomes of the experiment on the basis of extant theory about the phenomena.

The development of high throughput instrumentation in molecular biology and neighbouring fields has given rise to a special type of exploratory experimentation that collects and analyses very large amounts of data, and these new ‘omics’ disciplines are often said to represent a break with the ideal of hypothesis-driven science (Burian 2007; Elliott 2007; Waters 2007; O’Malley 2007) and instead described as data-driven research (Leonelli 2012; Strasser 2012) or as a special kind of “convenience experimentation” in which many experiments are done simply because they are extraordinarily convenient to perform (Krohs 2012).

5.2 Computer methods and ‘new ways’ of doing science

The field of omics just described is possible because of the ability of computers to process, in a reasonable amount of time, the huge quantities of data required. Computers allow for more elaborate experimentation (higher speed, better filtering, more variables, sophisticated coordination and control), but also, through modelling and simulations, might constitute a form of experimentation themselves. Here, too, we can pose a version of the general question of method versus practice: does the practice of using computers fundamentally change scientific method, or merely provide a more efficient means of implementing standard methods?

Because computers can be used to automate measurements, quantifications, calculations, and statistical analyses where, for practical reasons, these operations cannot be otherwise carried out, many of the steps involved in reaching a conclusion on the basis of an experiment are now made inside a “black box”, without the direct involvement or awareness of a human. This has epistemological implications, regarding what we can know, and how we can know it. To have confidence in the results, computer methods are therefore subjected to tests of verification and validation.

The distinction between verification and validation is easiest to characterize in the case of computer simulations. In a typical computer simulation scenario computers are used to numerically integrate differential equations for which no analytic solution is available. The equations are part of the model the scientist uses to represent a phenomenon or system under investigation. Verifying a computer simulation means checking that the equations of the model are being correctly approximated. Validating a simulation means checking that the equations of the model are adequate for the inferences one wants to make on the basis of that model.

A number of issues related to computer simulations have been raised. The identification of validity and verification as the testing methods has been criticized. Oreskes et al. (1994) raise concerns that “validiation”, because it suggests deductive inference, might lead to over-confidence in the results of simulations. The distinction itself is probably too clean, since actual practice in the testing of simulations mixes and moves back and forth between the two (Weissart 1997; Parker 2008a; Winsberg 2010). Computer simulations do seem to have a non-inductive character, given that the principles by which they operate are built in by the programmers, and any results of the simulation follow from those in-built principles in such a way that those results could, in principle, be deduced from the program code and its inputs. The status of simulations as experiments has therefore been examined (Kaufmann and Smarr 1993; Humphreys 1995; Hughes 1999; Norton and Suppe 2001). This literature considers the epistemology of these experiments: what we can learn by simulation, and also the kinds of justifications which can be given in applying that knowledge to the “real” world. (Mayo 1996; Parker 2008b). As pointed out, part of the advantage of computer simulation derives from the fact that huge numbers of calculations can be carried out without requiring direct observation by the experimenter/​simulator. At the same time, many of these calculations are approximations to the calculations which would be performed first-hand in an ideal situation. Both factors introduce uncertainties into the inferences drawn from what is observed in the simulation.

For many of the reasons described above, computer simulations do not seem to belong clearly to either the experimental or theoretical domain. Rather, they seem to crucially involve aspects of both. This has led some authors, such as Fox Keller (2003: 200) to argue that we ought to consider computer simulation a “qualitatively different way of doing science”. The literature in general tends to follow Kaufmann and Smarr (1993) in referring to computer simulation as a “third way” for scientific methodology (theoretical reasoning and experimental practice are the first two ways.). It should also be noted that the debates around these issues have tended to focus on the form of computer simulation typical in the physical sciences, where models are based on dynamical equations. Other forms of simulation might not have the same problems, or have problems of their own (see the entry on computer simulations in science ).

In recent years, the rapid development of machine learning techniques has prompted some scholars to suggest that the scientific method has become “obsolete” (Anderson 2008, Carrol and Goodstein 2009). This has resulted in an intense debate on the relative merit of data-driven and hypothesis-driven research (for samples, see e.g. Mazzocchi 2015 or Succi and Coveney 2018). For a detailed treatment of this topic, we refer to the entry scientific research and big data .

6. Discourse on scientific method

Despite philosophical disagreements, the idea of the scientific method still figures prominently in contemporary discourse on many different topics, both within science and in society at large. Often, reference to scientific method is used in ways that convey either the legend of a single, universal method characteristic of all science, or grants to a particular method or set of methods privilege as a special ‘gold standard’, often with reference to particular philosophers to vindicate the claims. Discourse on scientific method also typically arises when there is a need to distinguish between science and other activities, or for justifying the special status conveyed to science. In these areas, the philosophical attempts at identifying a set of methods characteristic for scientific endeavors are closely related to the philosophy of science’s classical problem of demarcation (see the entry on science and pseudo-science ) and to the philosophical analysis of the social dimension of scientific knowledge and the role of science in democratic society.

One of the settings in which the legend of a single, universal scientific method has been particularly strong is science education (see, e.g., Bauer 1992; McComas 1996; Wivagg & Allchin 2002). [ 5 ] Often, ‘the scientific method’ is presented in textbooks and educational web pages as a fixed four or five step procedure starting from observations and description of a phenomenon and progressing over formulation of a hypothesis which explains the phenomenon, designing and conducting experiments to test the hypothesis, analyzing the results, and ending with drawing a conclusion. Such references to a universal scientific method can be found in educational material at all levels of science education (Blachowicz 2009), and numerous studies have shown that the idea of a general and universal scientific method often form part of both students’ and teachers’ conception of science (see, e.g., Aikenhead 1987; Osborne et al. 2003). In response, it has been argued that science education need to focus more on teaching about the nature of science, although views have differed on whether this is best done through student-led investigations, contemporary cases, or historical cases (Allchin, Andersen & Nielsen 2014)

Although occasionally phrased with reference to the H-D method, important historical roots of the legend in science education of a single, universal scientific method are the American philosopher and psychologist Dewey’s account of inquiry in How We Think (1910) and the British mathematician Karl Pearson’s account of science in Grammar of Science (1892). On Dewey’s account, inquiry is divided into the five steps of

(i) a felt difficulty, (ii) its location and definition, (iii) suggestion of a possible solution, (iv) development by reasoning of the bearing of the suggestions, (v) further observation and experiment leading to its acceptance or rejection. (Dewey 1910: 72)

Similarly, on Pearson’s account, scientific investigations start with measurement of data and observation of their correction and sequence from which scientific laws can be discovered with the aid of creative imagination. These laws have to be subject to criticism, and their final acceptance will have equal validity for “all normally constituted minds”. Both Dewey’s and Pearson’s accounts should be seen as generalized abstractions of inquiry and not restricted to the realm of science—although both Dewey and Pearson referred to their respective accounts as ‘the scientific method’.

Occasionally, scientists make sweeping statements about a simple and distinct scientific method, as exemplified by Feynman’s simplified version of a conjectures and refutations method presented, for example, in the last of his 1964 Cornell Messenger lectures. [ 6 ] However, just as often scientists have come to the same conclusion as recent philosophy of science that there is not any unique, easily described scientific method. For example, the physicist and Nobel Laureate Weinberg described in the paper “The Methods of Science … And Those By Which We Live” (1995) how

The fact that the standards of scientific success shift with time does not only make the philosophy of science difficult; it also raises problems for the public understanding of science. We do not have a fixed scientific method to rally around and defend. (1995: 8)

Interview studies with scientists on their conception of method shows that scientists often find it hard to figure out whether available evidence confirms their hypothesis, and that there are no direct translations between general ideas about method and specific strategies to guide how research is conducted (Schickore & Hangel 2019, Hangel & Schickore 2017)

Reference to the scientific method has also often been used to argue for the scientific nature or special status of a particular activity. Philosophical positions that argue for a simple and unique scientific method as a criterion of demarcation, such as Popperian falsification, have often attracted practitioners who felt that they had a need to defend their domain of practice. For example, references to conjectures and refutation as the scientific method are abundant in much of the literature on complementary and alternative medicine (CAM)—alongside the competing position that CAM, as an alternative to conventional biomedicine, needs to develop its own methodology different from that of science.

Also within mainstream science, reference to the scientific method is used in arguments regarding the internal hierarchy of disciplines and domains. A frequently seen argument is that research based on the H-D method is superior to research based on induction from observations because in deductive inferences the conclusion follows necessarily from the premises. (See, e.g., Parascandola 1998 for an analysis of how this argument has been made to downgrade epidemiology compared to the laboratory sciences.) Similarly, based on an examination of the practices of major funding institutions such as the National Institutes of Health (NIH), the National Science Foundation (NSF) and the Biomedical Sciences Research Practices (BBSRC) in the UK, O’Malley et al. (2009) have argued that funding agencies seem to have a tendency to adhere to the view that the primary activity of science is to test hypotheses, while descriptive and exploratory research is seen as merely preparatory activities that are valuable only insofar as they fuel hypothesis-driven research.

In some areas of science, scholarly publications are structured in a way that may convey the impression of a neat and linear process of inquiry from stating a question, devising the methods by which to answer it, collecting the data, to drawing a conclusion from the analysis of data. For example, the codified format of publications in most biomedical journals known as the IMRAD format (Introduction, Method, Results, Analysis, Discussion) is explicitly described by the journal editors as “not an arbitrary publication format but rather a direct reflection of the process of scientific discovery” (see the so-called “Vancouver Recommendations”, ICMJE 2013: 11). However, scientific publications do not in general reflect the process by which the reported scientific results were produced. For example, under the provocative title “Is the scientific paper a fraud?”, Medawar argued that scientific papers generally misrepresent how the results have been produced (Medawar 1963/1996). Similar views have been advanced by philosophers, historians and sociologists of science (Gilbert 1976; Holmes 1987; Knorr-Cetina 1981; Schickore 2008; Suppe 1998) who have argued that scientists’ experimental practices are messy and often do not follow any recognizable pattern. Publications of research results, they argue, are retrospective reconstructions of these activities that often do not preserve the temporal order or the logic of these activities, but are instead often constructed in order to screen off potential criticism (see Schickore 2008 for a review of this work).

Philosophical positions on the scientific method have also made it into the court room, especially in the US where judges have drawn on philosophy of science in deciding when to confer special status to scientific expert testimony. A key case is Daubert vs Merrell Dow Pharmaceuticals (92–102, 509 U.S. 579, 1993). In this case, the Supreme Court argued in its 1993 ruling that trial judges must ensure that expert testimony is reliable, and that in doing this the court must look at the expert’s methodology to determine whether the proffered evidence is actually scientific knowledge. Further, referring to works of Popper and Hempel the court stated that

ordinarily, a key question to be answered in determining whether a theory or technique is scientific knowledge … is whether it can be (and has been) tested. (Justice Blackmun, Daubert v. Merrell Dow Pharmaceuticals; see Other Internet Resources for a link to the opinion)

But as argued by Haack (2005a,b, 2010) and by Foster & Hubner (1999), by equating the question of whether a piece of testimony is reliable with the question whether it is scientific as indicated by a special methodology, the court was producing an inconsistent mixture of Popper’s and Hempel’s philosophies, and this has later led to considerable confusion in subsequent case rulings that drew on the Daubert case (see Haack 2010 for a detailed exposition).

The difficulties around identifying the methods of science are also reflected in the difficulties of identifying scientific misconduct in the form of improper application of the method or methods of science. One of the first and most influential attempts at defining misconduct in science was the US definition from 1989 that defined misconduct as

fabrication, falsification, plagiarism, or other practices that seriously deviate from those that are commonly accepted within the scientific community . (Code of Federal Regulations, part 50, subpart A., August 8, 1989, italics added)

However, the “other practices that seriously deviate” clause was heavily criticized because it could be used to suppress creative or novel science. For example, the National Academy of Science stated in their report Responsible Science (1992) that it

wishes to discourage the possibility that a misconduct complaint could be lodged against scientists based solely on their use of novel or unorthodox research methods. (NAS: 27)

This clause was therefore later removed from the definition. For an entry into the key philosophical literature on conduct in science, see Shamoo & Resnick (2009).

The question of the source of the success of science has been at the core of philosophy since the beginning of modern science. If viewed as a matter of epistemology more generally, scientific method is a part of the entire history of philosophy. Over that time, science and whatever methods its practitioners may employ have changed dramatically. Today, many philosophers have taken up the banners of pluralism or of practice to focus on what are, in effect, fine-grained and contextually limited examinations of scientific method. Others hope to shift perspectives in order to provide a renewed general account of what characterizes the activity we call science.

One such perspective has been offered recently by Hoyningen-Huene (2008, 2013), who argues from the history of philosophy of science that after three lengthy phases of characterizing science by its method, we are now in a phase where the belief in the existence of a positive scientific method has eroded and what has been left to characterize science is only its fallibility. First was a phase from Plato and Aristotle up until the 17 th century where the specificity of scientific knowledge was seen in its absolute certainty established by proof from evident axioms; next was a phase up to the mid-19 th century in which the means to establish the certainty of scientific knowledge had been generalized to include inductive procedures as well. In the third phase, which lasted until the last decades of the 20 th century, it was recognized that empirical knowledge was fallible, but it was still granted a special status due to its distinctive mode of production. But now in the fourth phase, according to Hoyningen-Huene, historical and philosophical studies have shown how “scientific methods with the characteristics as posited in the second and third phase do not exist” (2008: 168) and there is no longer any consensus among philosophers and historians of science about the nature of science. For Hoyningen-Huene, this is too negative a stance, and he therefore urges the question about the nature of science anew. His own answer to this question is that “scientific knowledge differs from other kinds of knowledge, especially everyday knowledge, primarily by being more systematic” (Hoyningen-Huene 2013: 14). Systematicity can have several different dimensions: among them are more systematic descriptions, explanations, predictions, defense of knowledge claims, epistemic connectedness, ideal of completeness, knowledge generation, representation of knowledge and critical discourse. Hence, what characterizes science is the greater care in excluding possible alternative explanations, the more detailed elaboration with respect to data on which predictions are based, the greater care in detecting and eliminating sources of error, the more articulate connections to other pieces of knowledge, etc. On this position, what characterizes science is not that the methods employed are unique to science, but that the methods are more carefully employed.

Another, similar approach has been offered by Haack (2003). She sets off, similar to Hoyningen-Huene, from a dissatisfaction with the recent clash between what she calls Old Deferentialism and New Cynicism. The Old Deferentialist position is that science progressed inductively by accumulating true theories confirmed by empirical evidence or deductively by testing conjectures against basic statements; while the New Cynics position is that science has no epistemic authority and no uniquely rational method and is merely just politics. Haack insists that contrary to the views of the New Cynics, there are objective epistemic standards, and there is something epistemologically special about science, even though the Old Deferentialists pictured this in a wrong way. Instead, she offers a new Critical Commonsensist account on which standards of good, strong, supportive evidence and well-conducted, honest, thorough and imaginative inquiry are not exclusive to the sciences, but the standards by which we judge all inquirers. In this sense, science does not differ in kind from other kinds of inquiry, but it may differ in the degree to which it requires broad and detailed background knowledge and a familiarity with a technical vocabulary that only specialists may possess.

  • Aikenhead, G.S., 1987, “High-school graduates’ beliefs about science-technology-society. III. Characteristics and limitations of scientific knowledge”, Science Education , 71(4): 459–487.
  • Allchin, D., H.M. Andersen and K. Nielsen, 2014, “Complementary Approaches to Teaching Nature of Science: Integrating Student Inquiry, Historical Cases, and Contemporary Cases in Classroom Practice”, Science Education , 98: 461–486.
  • Anderson, C., 2008, “The end of theory: The data deluge makes the scientific method obsolete”, Wired magazine , 16(7): 16–07
  • Arabatzis, T., 2006, “On the inextricability of the context of discovery and the context of justification”, in Revisiting Discovery and Justification , J. Schickore and F. Steinle (eds.), Dordrecht: Springer, pp. 215–230.
  • Barnes, J. (ed.), 1984, The Complete Works of Aristotle, Vols I and II , Princeton: Princeton University Press.
  • Barnes, B. and D. Bloor, 1982, “Relativism, Rationalism, and the Sociology of Knowledge”, in Rationality and Relativism , M. Hollis and S. Lukes (eds.), Cambridge: MIT Press, pp. 1–20.
  • Bauer, H.H., 1992, Scientific Literacy and the Myth of the Scientific Method , Urbana: University of Illinois Press.
  • Bechtel, W. and R.C. Richardson, 1993, Discovering complexity , Princeton, NJ: Princeton University Press.
  • Berkeley, G., 1734, The Analyst in De Motu and The Analyst: A Modern Edition with Introductions and Commentary , D. Jesseph (trans. and ed.), Dordrecht: Kluwer Academic Publishers, 1992.
  • Blachowicz, J., 2009, “How science textbooks treat scientific method: A philosopher’s perspective”, The British Journal for the Philosophy of Science , 60(2): 303–344.
  • Bloor, D., 1991, Knowledge and Social Imagery , Chicago: University of Chicago Press, 2 nd edition.
  • Boyle, R., 1682, New experiments physico-mechanical, touching the air , Printed by Miles Flesher for Richard Davis, bookseller in Oxford.
  • Bridgman, P.W., 1927, The Logic of Modern Physics , New York: Macmillan.
  • –––, 1956, “The Methodological Character of Theoretical Concepts”, in The Foundations of Science and the Concepts of Science and Psychology , Herbert Feigl and Michael Scriven (eds.), Minnesota: University of Minneapolis Press, pp. 38–76.
  • Burian, R., 1997, “Exploratory Experimentation and the Role of Histochemical Techniques in the Work of Jean Brachet, 1938–1952”, History and Philosophy of the Life Sciences , 19(1): 27–45.
  • –––, 2007, “On microRNA and the need for exploratory experimentation in post-genomic molecular biology”, History and Philosophy of the Life Sciences , 29(3): 285–311.
  • Carnap, R., 1928, Der logische Aufbau der Welt , Berlin: Bernary, transl. by R.A. George, The Logical Structure of the World , Berkeley: University of California Press, 1967.
  • –––, 1956, “The methodological character of theoretical concepts”, Minnesota studies in the philosophy of science , 1: 38–76.
  • Carrol, S., and D. Goodstein, 2009, “Defining the scientific method”, Nature Methods , 6: 237.
  • Churchman, C.W., 1948, “Science, Pragmatics, Induction”, Philosophy of Science , 15(3): 249–268.
  • Cooper, J. (ed.), 1997, Plato: Complete Works , Indianapolis: Hackett.
  • Darden, L., 1991, Theory Change in Science: Strategies from Mendelian Genetics , Oxford: Oxford University Press
  • Dewey, J., 1910, How we think , New York: Dover Publications (reprinted 1997).
  • Douglas, H., 2009, Science, Policy, and the Value-Free Ideal , Pittsburgh: University of Pittsburgh Press.
  • Dupré, J., 2004, “Miracle of Monism ”, in Naturalism in Question , Mario De Caro and David Macarthur (eds.), Cambridge, MA: Harvard University Press, pp. 36–58.
  • Elliott, K.C., 2007, “Varieties of exploratory experimentation in nanotoxicology”, History and Philosophy of the Life Sciences , 29(3): 311–334.
  • Elliott, K. C., and T. Richards (eds.), 2017, Exploring inductive risk: Case studies of values in science , Oxford: Oxford University Press.
  • Falcon, Andrea, 2005, Aristotle and the science of nature: Unity without uniformity , Cambridge: Cambridge University Press.
  • Feyerabend, P., 1978, Science in a Free Society , London: New Left Books
  • –––, 1988, Against Method , London: Verso, 2 nd edition.
  • Fisher, R.A., 1955, “Statistical Methods and Scientific Induction”, Journal of The Royal Statistical Society. Series B (Methodological) , 17(1): 69–78.
  • Foster, K. and P.W. Huber, 1999, Judging Science. Scientific Knowledge and the Federal Courts , Cambridge: MIT Press.
  • Fox Keller, E., 2003, “Models, Simulation, and ‘computer experiments’”, in The Philosophy of Scientific Experimentation , H. Radder (ed.), Pittsburgh: Pittsburgh University Press, 198–215.
  • Gilbert, G., 1976, “The transformation of research findings into scientific knowledge”, Social Studies of Science , 6: 281–306.
  • Gimbel, S., 2011, Exploring the Scientific Method , Chicago: University of Chicago Press.
  • Goodman, N., 1965, Fact , Fiction, and Forecast , Indianapolis: Bobbs-Merrill.
  • Haack, S., 1995, “Science is neither sacred nor a confidence trick”, Foundations of Science , 1(3): 323–335.
  • –––, 2003, Defending science—within reason , Amherst: Prometheus.
  • –––, 2005a, “Disentangling Daubert: an epistemological study in theory and practice”, Journal of Philosophy, Science and Law , 5, Haack 2005a available online . doi:10.5840/jpsl2005513
  • –––, 2005b, “Trial and error: The Supreme Court’s philosophy of science”, American Journal of Public Health , 95: S66-S73.
  • –––, 2010, “Federal Philosophy of Science: A Deconstruction-and a Reconstruction”, NYUJL & Liberty , 5: 394.
  • Hangel, N. and J. Schickore, 2017, “Scientists’ conceptions of good research practice”, Perspectives on Science , 25(6): 766–791
  • Harper, W.L., 2011, Isaac Newton’s Scientific Method: Turning Data into Evidence about Gravity and Cosmology , Oxford: Oxford University Press.
  • Hempel, C., 1950, “Problems and Changes in the Empiricist Criterion of Meaning”, Revue Internationale de Philosophie , 41(11): 41–63.
  • –––, 1951, “The Concept of Cognitive Significance: A Reconsideration”, Proceedings of the American Academy of Arts and Sciences , 80(1): 61–77.
  • –––, 1965, Aspects of scientific explanation and other essays in the philosophy of science , New York–London: Free Press.
  • –––, 1966, Philosophy of Natural Science , Englewood Cliffs: Prentice-Hall.
  • Holmes, F.L., 1987, “Scientific writing and scientific discovery”, Isis , 78(2): 220–235.
  • Howard, D., 2003, “Two left turns make a right: On the curious political career of North American philosophy of science at midcentury”, in Logical Empiricism in North America , G.L. Hardcastle & A.W. Richardson (eds.), Minneapolis: University of Minnesota Press, pp. 25–93.
  • Hoyningen-Huene, P., 2008, “Systematicity: The nature of science”, Philosophia , 36(2): 167–180.
  • –––, 2013, Systematicity. The Nature of Science , Oxford: Oxford University Press.
  • Howie, D., 2002, Interpreting probability: Controversies and developments in the early twentieth century , Cambridge: Cambridge University Press.
  • Hughes, R., 1999, “The Ising Model, Computer Simulation, and Universal Physics”, in Models as Mediators , M. Morgan and M. Morrison (eds.), Cambridge: Cambridge University Press, pp. 97–145
  • Hume, D., 1739, A Treatise of Human Nature , D. Fate Norton and M.J. Norton (eds.), Oxford: Oxford University Press, 2000.
  • Humphreys, P., 1995, “Computational science and scientific method”, Minds and Machines , 5(1): 499–512.
  • ICMJE, 2013, “Recommendations for the Conduct, Reporting, Editing, and Publication of Scholarly Work in Medical Journals”, International Committee of Medical Journal Editors, available online , accessed August 13 2014
  • Jeffrey, R.C., 1956, “Valuation and Acceptance of Scientific Hypotheses”, Philosophy of Science , 23(3): 237–246.
  • Kaufmann, W.J., and L.L. Smarr, 1993, Supercomputing and the Transformation of Science , New York: Scientific American Library.
  • Knorr-Cetina, K., 1981, The Manufacture of Knowledge , Oxford: Pergamon Press.
  • Krohs, U., 2012, “Convenience experimentation”, Studies in History and Philosophy of Biological and BiomedicalSciences , 43: 52–57.
  • Kuhn, T.S., 1962, The Structure of Scientific Revolutions , Chicago: University of Chicago Press
  • Latour, B. and S. Woolgar, 1986, Laboratory Life: The Construction of Scientific Facts , Princeton: Princeton University Press, 2 nd edition.
  • Laudan, L., 1968, “Theories of scientific method from Plato to Mach”, History of Science , 7(1): 1–63.
  • Lenhard, J., 2006, “Models and statistical inference: The controversy between Fisher and Neyman-Pearson”, The British Journal for the Philosophy of Science , 57(1): 69–91.
  • Leonelli, S., 2012, “Making Sense of Data-Driven Research in the Biological and the Biomedical Sciences”, Studies in the History and Philosophy of the Biological and Biomedical Sciences , 43(1): 1–3.
  • Levi, I., 1960, “Must the scientist make value judgments?”, Philosophy of Science , 57(11): 345–357
  • Lindley, D., 1991, Theory Change in Science: Strategies from Mendelian Genetics , Oxford: Oxford University Press.
  • Lipton, P., 2004, Inference to the Best Explanation , London: Routledge, 2 nd edition.
  • Marks, H.M., 2000, The progress of experiment: science and therapeutic reform in the United States, 1900–1990 , Cambridge: Cambridge University Press.
  • Mazzochi, F., 2015, “Could Big Data be the end of theory in science?”, EMBO reports , 16: 1250–1255.
  • Mayo, D.G., 1996, Error and the Growth of Experimental Knowledge , Chicago: University of Chicago Press.
  • McComas, W.F., 1996, “Ten myths of science: Reexamining what we think we know about the nature of science”, School Science and Mathematics , 96(1): 10–16.
  • Medawar, P.B., 1963/1996, “Is the scientific paper a fraud”, in The Strange Case of the Spotted Mouse and Other Classic Essays on Science , Oxford: Oxford University Press, 33–39.
  • Mill, J.S., 1963, Collected Works of John Stuart Mill , J. M. Robson (ed.), Toronto: University of Toronto Press
  • NAS, 1992, Responsible Science: Ensuring the integrity of the research process , Washington DC: National Academy Press.
  • Nersessian, N.J., 1987, “A cognitive-historical approach to meaning in scientific theories”, in The process of science , N. Nersessian (ed.), Berlin: Springer, pp. 161–177.
  • –––, 2008, Creating Scientific Concepts , Cambridge: MIT Press.
  • Newton, I., 1726, Philosophiae naturalis Principia Mathematica (3 rd edition), in The Principia: Mathematical Principles of Natural Philosophy: A New Translation , I.B. Cohen and A. Whitman (trans.), Berkeley: University of California Press, 1999.
  • –––, 1704, Opticks or A Treatise of the Reflections, Refractions, Inflections & Colors of Light , New York: Dover Publications, 1952.
  • Neyman, J., 1956, “Note on an Article by Sir Ronald Fisher”, Journal of the Royal Statistical Society. Series B (Methodological) , 18: 288–294.
  • Nickles, T., 1987, “Methodology, heuristics, and rationality”, in Rational changes in science: Essays on Scientific Reasoning , J.C. Pitt (ed.), Berlin: Springer, pp. 103–132.
  • Nicod, J., 1924, Le problème logique de l’induction , Paris: Alcan. (Engl. transl. “The Logical Problem of Induction”, in Foundations of Geometry and Induction , London: Routledge, 2000.)
  • Nola, R. and H. Sankey, 2000a, “A selective survey of theories of scientific method”, in Nola and Sankey 2000b: 1–65.
  • –––, 2000b, After Popper, Kuhn and Feyerabend. Recent Issues in Theories of Scientific Method , London: Springer.
  • –––, 2007, Theories of Scientific Method , Stocksfield: Acumen.
  • Norton, S., and F. Suppe, 2001, “Why atmospheric modeling is good science”, in Changing the Atmosphere: Expert Knowledge and Environmental Governance , C. Miller and P. Edwards (eds.), Cambridge, MA: MIT Press, 88–133.
  • O’Malley, M., 2007, “Exploratory experimentation and scientific practice: Metagenomics and the proteorhodopsin case”, History and Philosophy of the Life Sciences , 29(3): 337–360.
  • O’Malley, M., C. Haufe, K. Elliot, and R. Burian, 2009, “Philosophies of Funding”, Cell , 138: 611–615.
  • Oreskes, N., K. Shrader-Frechette, and K. Belitz, 1994, “Verification, Validation and Confirmation of Numerical Models in the Earth Sciences”, Science , 263(5147): 641–646.
  • Osborne, J., S. Simon, and S. Collins, 2003, “Attitudes towards science: a review of the literature and its implications”, International Journal of Science Education , 25(9): 1049–1079.
  • Parascandola, M., 1998, “Epidemiology—2 nd -Rate Science”, Public Health Reports , 113(4): 312–320.
  • Parker, W., 2008a, “Franklin, Holmes and the Epistemology of Computer Simulation”, International Studies in the Philosophy of Science , 22(2): 165–83.
  • –––, 2008b, “Computer Simulation through an Error-Statistical Lens”, Synthese , 163(3): 371–84.
  • Pearson, K. 1892, The Grammar of Science , London: J.M. Dents and Sons, 1951
  • Pearson, E.S., 1955, “Statistical Concepts in Their Relation to Reality”, Journal of the Royal Statistical Society , B, 17: 204–207.
  • Pickering, A., 1984, Constructing Quarks: A Sociological History of Particle Physics , Edinburgh: Edinburgh University Press.
  • Popper, K.R., 1959, The Logic of Scientific Discovery , London: Routledge, 2002
  • –––, 1963, Conjectures and Refutations , London: Routledge, 2002.
  • –––, 1985, Unended Quest: An Intellectual Autobiography , La Salle: Open Court Publishing Co..
  • Rudner, R., 1953, “The Scientist Qua Scientist Making Value Judgments”, Philosophy of Science , 20(1): 1–6.
  • Rudolph, J.L., 2005, “Epistemology for the masses: The origin of ‘The Scientific Method’ in American Schools”, History of Education Quarterly , 45(3): 341–376
  • Schickore, J., 2008, “Doing science, writing science”, Philosophy of Science , 75: 323–343.
  • Schickore, J. and N. Hangel, 2019, “‘It might be this, it should be that…’ uncertainty and doubt in day-to-day science practice”, European Journal for Philosophy of Science , 9(2): 31. doi:10.1007/s13194-019-0253-9
  • Shamoo, A.E. and D.B. Resnik, 2009, Responsible Conduct of Research , Oxford: Oxford University Press.
  • Shank, J.B., 2008, The Newton Wars and the Beginning of the French Enlightenment , Chicago: The University of Chicago Press.
  • Shapin, S. and S. Schaffer, 1985, Leviathan and the air-pump , Princeton: Princeton University Press.
  • Smith, G.E., 2002, “The Methodology of the Principia”, in The Cambridge Companion to Newton , I.B. Cohen and G.E. Smith (eds.), Cambridge: Cambridge University Press, 138–173.
  • Snyder, L.J., 1997a, “Discoverers’ Induction”, Philosophy of Science , 64: 580–604.
  • –––, 1997b, “The Mill-Whewell Debate: Much Ado About Induction”, Perspectives on Science , 5: 159–198.
  • –––, 1999, “Renovating the Novum Organum: Bacon, Whewell and Induction”, Studies in History and Philosophy of Science , 30: 531–557.
  • Sober, E., 2008, Evidence and Evolution. The logic behind the science , Cambridge: Cambridge University Press
  • Sprenger, J. and S. Hartmann, 2019, Bayesian philosophy of science , Oxford: Oxford University Press.
  • Steinle, F., 1997, “Entering New Fields: Exploratory Uses of Experimentation”, Philosophy of Science (Proceedings), 64: S65–S74.
  • –––, 2002, “Experiments in History and Philosophy of Science”, Perspectives on Science , 10(4): 408–432.
  • Strasser, B.J., 2012, “Data-driven sciences: From wonder cabinets to electronic databases”, Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences , 43(1): 85–87.
  • Succi, S. and P.V. Coveney, 2018, “Big data: the end of the scientific method?”, Philosophical Transactions of the Royal Society A , 377: 20180145. doi:10.1098/rsta.2018.0145
  • Suppe, F., 1998, “The Structure of a Scientific Paper”, Philosophy of Science , 65(3): 381–405.
  • Swijtink, Z.G., 1987, “The objectification of observation: Measurement and statistical methods in the nineteenth century”, in The probabilistic revolution. Ideas in History, Vol. 1 , L. Kruger (ed.), Cambridge MA: MIT Press, pp. 261–285.
  • Waters, C.K., 2007, “The nature and context of exploratory experimentation: An introduction to three case studies of exploratory research”, History and Philosophy of the Life Sciences , 29(3): 275–284.
  • Weinberg, S., 1995, “The methods of science… and those by which we live”, Academic Questions , 8(2): 7–13.
  • Weissert, T., 1997, The Genesis of Simulation in Dynamics: Pursuing the Fermi-Pasta-Ulam Problem , New York: Springer Verlag.
  • William H., 1628, Exercitatio Anatomica de Motu Cordis et Sanguinis in Animalibus , in On the Motion of the Heart and Blood in Animals , R. Willis (trans.), Buffalo: Prometheus Books, 1993.
  • Winsberg, E., 2010, Science in the Age of Computer Simulation , Chicago: University of Chicago Press.
  • Wivagg, D. & D. Allchin, 2002, “The Dogma of the Scientific Method”, The American Biology Teacher , 64(9): 645–646
How to cite this entry . Preview the PDF version of this entry at the Friends of the SEP Society . Look up topics and thinkers related to this entry at the Internet Philosophy Ontology Project (InPhO). Enhanced bibliography for this entry at PhilPapers , with links to its database.
  • Blackmun opinion , in Daubert v. Merrell Dow Pharmaceuticals (92–102), 509 U.S. 579 (1993).
  • Scientific Method at philpapers. Darrell Rowbottom (ed.).
  • Recent Articles | Scientific Method | The Scientist Magazine

al-Kindi | Albert the Great [= Albertus magnus] | Aquinas, Thomas | Arabic and Islamic Philosophy, disciplines in: natural philosophy and natural science | Arabic and Islamic Philosophy, historical and methodological topics in: Greek sources | Arabic and Islamic Philosophy, historical and methodological topics in: influence of Arabic and Islamic Philosophy on the Latin West | Aristotle | Bacon, Francis | Bacon, Roger | Berkeley, George | biology: experiment in | Boyle, Robert | Cambridge Platonists | confirmation | Descartes, René | Enlightenment | epistemology | epistemology: Bayesian | epistemology: social | Feyerabend, Paul | Galileo Galilei | Grosseteste, Robert | Hempel, Carl | Hume, David | Hume, David: Newtonianism and Anti-Newtonianism | induction: problem of | Kant, Immanuel | Kuhn, Thomas | Leibniz, Gottfried Wilhelm | Locke, John | Mill, John Stuart | More, Henry | Neurath, Otto | Newton, Isaac | Newton, Isaac: philosophy | Ockham [Occam], William | operationalism | Peirce, Charles Sanders | Plato | Popper, Karl | rationality: historicist theories of | Reichenbach, Hans | reproducibility, scientific | Schlick, Moritz | science: and pseudo-science | science: theory and observation in | science: unity of | scientific discovery | scientific knowledge: social dimensions of | simulations in science | skepticism: medieval | space and time: absolute and relational space and motion, post-Newtonian theories | Vienna Circle | Whewell, William | Zabarella, Giacomo

Copyright © 2021 by Brian Hepburn < brian . hepburn @ wichita . edu > Hanne Andersen < hanne . andersen @ ind . ku . dk >

  • Accessibility

10 Experimental research

Experimental research—often considered to be the ‘gold standard’ in research designs—is one of the most rigorous of all research designs. In this design, one or more independent variables are manipulated by the researcher (as treatments), subjects are randomly assigned to different treatment levels (random assignment), and the results of the treatments on outcomes (dependent variables) are observed. The unique strength of experimental research is its internal validity (causality) due to its ability to link cause and effect through treatment manipulation, while controlling for the spurious effect of extraneous variable.

Experimental research is best suited for explanatory research—rather than for descriptive or exploratory research—where the goal of the study is to examine cause-effect relationships. It also works well for research that involves a relatively limited and well-defined set of independent variables that can either be manipulated or controlled. Experimental research can be conducted in laboratory or field settings. Laboratory experiments , conducted in laboratory (artificial) settings, tend to be high in internal validity, but this comes at the cost of low external validity (generalisability), because the artificial (laboratory) setting in which the study is conducted may not reflect the real world. Field experiments are conducted in field settings such as in a real organisation, and are high in both internal and external validity. But such experiments are relatively rare, because of the difficulties associated with manipulating treatments and controlling for extraneous effects in a field setting.

Experimental research can be grouped into two broad categories: true experimental designs and quasi-experimental designs. Both designs require treatment manipulation, but while true experiments also require random assignment, quasi-experiments do not. Sometimes, we also refer to non-experimental research, which is not really a research design, but an all-inclusive term that includes all types of research that do not employ treatment manipulation or random assignment, such as survey research, observational research, and correlational studies.

Basic concepts

Treatment and control groups. In experimental research, some subjects are administered one or more experimental stimulus called a treatment (the treatment group ) while other subjects are not given such a stimulus (the control group ). The treatment may be considered successful if subjects in the treatment group rate more favourably on outcome variables than control group subjects. Multiple levels of experimental stimulus may be administered, in which case, there may be more than one treatment group. For example, in order to test the effects of a new drug intended to treat a certain medical condition like dementia, if a sample of dementia patients is randomly divided into three groups, with the first group receiving a high dosage of the drug, the second group receiving a low dosage, and the third group receiving a placebo such as a sugar pill (control group), then the first two groups are experimental groups and the third group is a control group. After administering the drug for a period of time, if the condition of the experimental group subjects improved significantly more than the control group subjects, we can say that the drug is effective. We can also compare the conditions of the high and low dosage experimental groups to determine if the high dose is more effective than the low dose.

Treatment manipulation. Treatments are the unique feature of experimental research that sets this design apart from all other research methods. Treatment manipulation helps control for the ‘cause’ in cause-effect relationships. Naturally, the validity of experimental research depends on how well the treatment was manipulated. Treatment manipulation must be checked using pretests and pilot tests prior to the experimental study. Any measurements conducted before the treatment is administered are called pretest measures , while those conducted after the treatment are posttest measures .

Random selection and assignment. Random selection is the process of randomly drawing a sample from a population or a sampling frame. This approach is typically employed in survey research, and ensures that each unit in the population has a positive chance of being selected into the sample. Random assignment, however, is a process of randomly assigning subjects to experimental or control groups. This is a standard practice in true experimental research to ensure that treatment groups are similar (equivalent) to each other and to the control group prior to treatment administration. Random selection is related to sampling, and is therefore more closely related to the external validity (generalisability) of findings. However, random assignment is related to design, and is therefore most related to internal validity. It is possible to have both random selection and random assignment in well-designed experimental research, but quasi-experimental research involves neither random selection nor random assignment.

Threats to internal validity. Although experimental designs are considered more rigorous than other research methods in terms of the internal validity of their inferences (by virtue of their ability to control causes through treatment manipulation), they are not immune to internal validity threats. Some of these threats to internal validity are described below, within the context of a study of the impact of a special remedial math tutoring program for improving the math abilities of high school students.

History threat is the possibility that the observed effects (dependent variables) are caused by extraneous or historical events rather than by the experimental treatment. For instance, students’ post-remedial math score improvement may have been caused by their preparation for a math exam at their school, rather than the remedial math program.

Maturation threat refers to the possibility that observed effects are caused by natural maturation of subjects (e.g., a general improvement in their intellectual ability to understand complex concepts) rather than the experimental treatment.

Testing threat is a threat in pre-post designs where subjects’ posttest responses are conditioned by their pretest responses. For instance, if students remember their answers from the pretest evaluation, they may tend to repeat them in the posttest exam.

Not conducting a pretest can help avoid this threat.

Instrumentation threat , which also occurs in pre-post designs, refers to the possibility that the difference between pretest and posttest scores is not due to the remedial math program, but due to changes in the administered test, such as the posttest having a higher or lower degree of difficulty than the pretest.

Mortality threat refers to the possibility that subjects may be dropping out of the study at differential rates between the treatment and control groups due to a systematic reason, such that the dropouts were mostly students who scored low on the pretest. If the low-performing students drop out, the results of the posttest will be artificially inflated by the preponderance of high-performing students.

Regression threat —also called a regression to the mean—refers to the statistical tendency of a group’s overall performance to regress toward the mean during a posttest rather than in the anticipated direction. For instance, if subjects scored high on a pretest, they will have a tendency to score lower on the posttest (closer to the mean) because their high scores (away from the mean) during the pretest were possibly a statistical aberration. This problem tends to be more prevalent in non-random samples and when the two measures are imperfectly correlated.

Two-group experimental designs

R

Pretest-posttest control group design . In this design, subjects are randomly assigned to treatment and control groups, subjected to an initial (pretest) measurement of the dependent variables of interest, the treatment group is administered a treatment (representing the independent variable of interest), and the dependent variables measured again (posttest). The notation of this design is shown in Figure 10.1.

Pretest-posttest control group design

Statistical analysis of this design involves a simple analysis of variance (ANOVA) between the treatment and control groups. The pretest-posttest design handles several threats to internal validity, such as maturation, testing, and regression, since these threats can be expected to influence both treatment and control groups in a similar (random) manner. The selection threat is controlled via random assignment. However, additional threats to internal validity may exist. For instance, mortality can be a problem if there are differential dropout rates between the two groups, and the pretest measurement may bias the posttest measurement—especially if the pretest introduces unusual topics or content.

Posttest -only control group design . This design is a simpler version of the pretest-posttest design where pretest measurements are omitted. The design notation is shown in Figure 10.2.

Posttest-only control group design

The treatment effect is measured simply as the difference in the posttest scores between the two groups:

\[E = (O_{1} - O_{2})\,.\]

The appropriate statistical analysis of this design is also a two-group analysis of variance (ANOVA). The simplicity of this design makes it more attractive than the pretest-posttest design in terms of internal validity. This design controls for maturation, testing, regression, selection, and pretest-posttest interaction, though the mortality threat may continue to exist.

C

Because the pretest measure is not a measurement of the dependent variable, but rather a covariate, the treatment effect is measured as the difference in the posttest scores between the treatment and control groups as:

Due to the presence of covariates, the right statistical analysis of this design is a two-group analysis of covariance (ANCOVA). This design has all the advantages of posttest-only design, but with internal validity due to the controlling of covariates. Covariance designs can also be extended to pretest-posttest control group design.

Factorial designs

Two-group designs are inadequate if your research requires manipulation of two or more independent variables (treatments). In such cases, you would need four or higher-group designs. Such designs, quite popular in experimental research, are commonly called factorial designs. Each independent variable in this design is called a factor , and each subdivision of a factor is called a level . Factorial designs enable the researcher to examine not only the individual effect of each treatment on the dependent variables (called main effects), but also their joint effect (called interaction effects).

2 \times 2

In a factorial design, a main effect is said to exist if the dependent variable shows a significant difference between multiple levels of one factor, at all levels of other factors. No change in the dependent variable across factor levels is the null case (baseline), from which main effects are evaluated. In the above example, you may see a main effect of instructional type, instructional time, or both on learning outcomes. An interaction effect exists when the effect of differences in one factor depends upon the level of a second factor. In our example, if the effect of instructional type on learning outcomes is greater for three hours/week of instructional time than for one and a half hours/week, then we can say that there is an interaction effect between instructional type and instructional time on learning outcomes. Note that the presence of interaction effects dominate and make main effects irrelevant, and it is not meaningful to interpret main effects if interaction effects are significant.

Hybrid experimental designs

Hybrid designs are those that are formed by combining features of more established designs. Three such hybrid designs are randomised bocks design, Solomon four-group design, and switched replications design.

Randomised block design. This is a variation of the posttest-only or pretest-posttest control group design where the subject population can be grouped into relatively homogeneous subgroups (called blocks ) within which the experiment is replicated. For instance, if you want to replicate the same posttest-only design among university students and full-time working professionals (two homogeneous blocks), subjects in both blocks are randomly split between the treatment group (receiving the same treatment) and the control group (see Figure 10.5). The purpose of this design is to reduce the ‘noise’ or variance in data that may be attributable to differences between the blocks so that the actual effect of interest can be detected more accurately.

Randomised blocks design

Solomon four-group design . In this design, the sample is divided into two treatment groups and two control groups. One treatment group and one control group receive the pretest, and the other two groups do not. This design represents a combination of posttest-only and pretest-posttest control group design, and is intended to test for the potential biasing effect of pretest measurement on posttest measures that tends to occur in pretest-posttest designs, but not in posttest-only designs. The design notation is shown in Figure 10.6.

Solomon four-group design

Switched replication design . This is a two-group design implemented in two phases with three waves of measurement. The treatment group in the first phase serves as the control group in the second phase, and the control group in the first phase becomes the treatment group in the second phase, as illustrated in Figure 10.7. In other words, the original design is repeated or replicated temporally with treatment/control roles switched between the two groups. By the end of the study, all participants will have received the treatment either during the first or the second phase. This design is most feasible in organisational contexts where organisational programs (e.g., employee training) are implemented in a phased manner or are repeated at regular intervals.

Switched replication design

Quasi-experimental designs

Quasi-experimental designs are almost identical to true experimental designs, but lacking one key ingredient: random assignment. For instance, one entire class section or one organisation is used as the treatment group, while another section of the same class or a different organisation in the same industry is used as the control group. This lack of random assignment potentially results in groups that are non-equivalent, such as one group possessing greater mastery of certain content than the other group, say by virtue of having a better teacher in a previous semester, which introduces the possibility of selection bias . Quasi-experimental designs are therefore inferior to true experimental designs in interval validity due to the presence of a variety of selection related threats such as selection-maturation threat (the treatment and control groups maturing at different rates), selection-history threat (the treatment and control groups being differentially impacted by extraneous or historical events), selection-regression threat (the treatment and control groups regressing toward the mean between pretest and posttest at different rates), selection-instrumentation threat (the treatment and control groups responding differently to the measurement), selection-testing (the treatment and control groups responding differently to the pretest), and selection-mortality (the treatment and control groups demonstrating differential dropout rates). Given these selection threats, it is generally preferable to avoid quasi-experimental designs to the greatest extent possible.

N

In addition, there are quite a few unique non-equivalent designs without corresponding true experimental design cousins. Some of the more useful of these designs are discussed next.

Regression discontinuity (RD) design . This is a non-equivalent pretest-posttest design where subjects are assigned to the treatment or control group based on a cut-off score on a preprogram measure. For instance, patients who are severely ill may be assigned to a treatment group to test the efficacy of a new drug or treatment protocol and those who are mildly ill are assigned to the control group. In another example, students who are lagging behind on standardised test scores may be selected for a remedial curriculum program intended to improve their performance, while those who score high on such tests are not selected from the remedial program.

RD design

Because of the use of a cut-off score, it is possible that the observed results may be a function of the cut-off score rather than the treatment, which introduces a new threat to internal validity. However, using the cut-off score also ensures that limited or costly resources are distributed to people who need them the most, rather than randomly across a population, while simultaneously allowing a quasi-experimental treatment. The control group scores in the RD design do not serve as a benchmark for comparing treatment group scores, given the systematic non-equivalence between the two groups. Rather, if there is no discontinuity between pretest and posttest scores in the control group, but such a discontinuity persists in the treatment group, then this discontinuity is viewed as evidence of the treatment effect.

Proxy pretest design . This design, shown in Figure 10.11, looks very similar to the standard NEGD (pretest-posttest) design, with one critical difference: the pretest score is collected after the treatment is administered. A typical application of this design is when a researcher is brought in to test the efficacy of a program (e.g., an educational program) after the program has already started and pretest data is not available. Under such circumstances, the best option for the researcher is often to use a different prerecorded measure, such as students’ grade point average before the start of the program, as a proxy for pretest data. A variation of the proxy pretest design is to use subjects’ posttest recollection of pretest data, which may be subject to recall bias, but nevertheless may provide a measure of perceived gain or change in the dependent variable.

Proxy pretest design

Separate pretest-posttest samples design . This design is useful if it is not possible to collect pretest and posttest data from the same subjects for some reason. As shown in Figure 10.12, there are four groups in this design, but two groups come from a single non-equivalent group, while the other two groups come from a different non-equivalent group. For instance, say you want to test customer satisfaction with a new online service that is implemented in one city but not in another. In this case, customers in the first city serve as the treatment group and those in the second city constitute the control group. If it is not possible to obtain pretest and posttest measures from the same customers, you can measure customer satisfaction at one point in time, implement the new service program, and measure customer satisfaction (with a different set of customers) after the program is implemented. Customer satisfaction is also measured in the control group at the same times as in the treatment group, but without the new program implementation. The design is not particularly strong, because you cannot examine the changes in any specific customer’s satisfaction score before and after the implementation, but you can only examine average customer satisfaction scores. Despite the lower internal validity, this design may still be a useful way of collecting quasi-experimental data when pretest and posttest data is not available from the same subjects.

Separate pretest-posttest samples design

An interesting variation of the NEDV design is a pattern-matching NEDV design , which employs multiple outcome variables and a theory that explains how much each variable will be affected by the treatment. The researcher can then examine if the theoretical prediction is matched in actual observations. This pattern-matching technique—based on the degree of correspondence between theoretical and observed patterns—is a powerful way of alleviating internal validity concerns in the original NEDV design.

NEDV design

Perils of experimental research

Experimental research is one of the most difficult of research designs, and should not be taken lightly. This type of research is often best with a multitude of methodological problems. First, though experimental research requires theories for framing hypotheses for testing, much of current experimental research is atheoretical. Without theories, the hypotheses being tested tend to be ad hoc, possibly illogical, and meaningless. Second, many of the measurement instruments used in experimental research are not tested for reliability and validity, and are incomparable across studies. Consequently, results generated using such instruments are also incomparable. Third, often experimental research uses inappropriate research designs, such as irrelevant dependent variables, no interaction effects, no experimental controls, and non-equivalent stimulus across treatment groups. Findings from such studies tend to lack internal validity and are highly suspect. Fourth, the treatments (tasks) used in experimental research may be diverse, incomparable, and inconsistent across studies, and sometimes inappropriate for the subject population. For instance, undergraduate student subjects are often asked to pretend that they are marketing managers and asked to perform a complex budget allocation task in which they have no experience or expertise. The use of such inappropriate tasks, introduces new threats to internal validity (i.e., subject’s performance may be an artefact of the content or difficulty of the task setting), generates findings that are non-interpretable and meaningless, and makes integration of findings across studies impossible.

The design of proper experimental treatments is a very important task in experimental design, because the treatment is the raison d’etre of the experimental method, and must never be rushed or neglected. To design an adequate and appropriate task, researchers should use prevalidated tasks if available, conduct treatment manipulation checks to check for the adequacy of such tasks (by debriefing subjects after performing the assigned task), conduct pilot tests (repeatedly, if necessary), and if in doubt, use tasks that are simple and familiar for the respondent sample rather than tasks that are complex or unfamiliar.

In summary, this chapter introduced key concepts in the experimental design research method and introduced a variety of true experimental and quasi-experimental designs. Although these designs vary widely in internal validity, designs with less internal validity should not be overlooked and may sometimes be useful under specific circumstances and empirical contingencies.

Social Science Research: Principles, Methods and Practices (Revised edition) Copyright © 2019 by Anol Bhattacherjee is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.

Share This Book

Thank you for visiting nature.com. You are using a browser version with limited support for CSS. To obtain the best experience, we recommend you use a more up to date browser (or turn off compatibility mode in Internet Explorer). In the meantime, to ensure continued support, we are displaying the site without styles and JavaScript.

  • View all journals
  • Explore content
  • About the journal
  • Publish with us
  • Sign up for alerts
  • Review Article
  • Published: 01 June 2023

Data, measurement and empirical methods in the science of science

  • Lu Liu 1 , 2 , 3 , 4 ,
  • Benjamin F. Jones   ORCID: orcid.org/0000-0001-9697-9388 1 , 2 , 3 , 5 , 6 ,
  • Brian Uzzi   ORCID: orcid.org/0000-0001-6855-2854 1 , 2 , 3 &
  • Dashun Wang   ORCID: orcid.org/0000-0002-7054-2206 1 , 2 , 3 , 7  

Nature Human Behaviour volume  7 ,  pages 1046–1058 ( 2023 ) Cite this article

20k Accesses

19 Citations

117 Altmetric

Metrics details

  • Scientific community

The advent of large-scale datasets that trace the workings of science has encouraged researchers from many different disciplinary backgrounds to turn scientific methods into science itself, cultivating a rapidly expanding ‘science of science’. This Review considers this growing, multidisciplinary literature through the lens of data, measurement and empirical methods. We discuss the purposes, strengths and limitations of major empirical approaches, seeking to increase understanding of the field’s diverse methodologies and expand researchers’ toolkits. Overall, new empirical developments provide enormous capacity to test traditional beliefs and conceptual frameworks about science, discover factors associated with scientific productivity, predict scientific outcomes and design policies that facilitate scientific progress.

Similar content being viewed by others

science experimental methods

SciSciNet: A large-scale open data lake for the science of science research

science experimental methods

A dataset for measuring the impact of research data and their curation

science experimental methods

Envisioning a “science diplomacy 2.0”: on data, global challenges, and multi-layered networks

Scientific advances are a key input to rising standards of living, health and the capacity of society to confront grand challenges, from climate change to the COVID-19 pandemic 1 , 2 , 3 . A deeper understanding of how science works and where innovation occurs can help us to more effectively design science policy and science institutions, better inform scientists’ own research choices, and create and capture enormous value for science and humanity. Building on these key premises, recent years have witnessed substantial development in the ‘science of science’ 4 , 5 , 6 , 7 , 8 , 9 , which uses large-scale datasets and diverse computational toolkits to unearth fundamental patterns behind scientific production and use.

The idea of turning scientific methods into science itself is long-standing. Since the mid-20th century, researchers from different disciplines have asked central questions about the nature of scientific progress and the practice, organization and impact of scientific research. Building on these rich historical roots, the field of the science of science draws upon many disciplines, ranging from information science to the social, physical and biological sciences to computer science, engineering and design. The science of science closely relates to several strands and communities of research, including metascience, scientometrics, the economics of science, research on research, science and technology studies, the sociology of science, metaknowledge and quantitative science studies 5 . There are noticeable differences between some of these communities, mostly around their historical origins and the initial disciplinary composition of researchers forming these communities. For example, metascience has its origins in the clinical sciences and psychology, and focuses on rigour, transparency, reproducibility and other open science-related practices and topics. The scientometrics community, born in library and information sciences, places a particular emphasis on developing robust and responsible measures and indicators for science. Science and technology studies engage the history of science and technology, the philosophy of science, and the interplay between science, technology and society. The science of science, which has its origins in physics, computer science and sociology, takes a data-driven approach and emphasizes questions on how science works. Each of these communities has made fundamental contributions to understanding science. While they differ in their origins, these differences pale in comparison to the overarching, common interest in understanding the practice of science and its societal impact.

Three major developments have encouraged rapid advances in the science of science. The first is in data 9 : modern databases include millions of research articles, grant proposals, patents and more. This windfall of data traces scientific activity in remarkable detail and at scale. The second development is in measurement: scholars have used data to develop many new measures of scientific activities and examine theories that have long been viewed as important but difficult to quantify. The third development is in empirical methods: thanks to parallel advances in data science, network science, artificial intelligence and econometrics, researchers can study relationships, make predictions and assess science policy in powerful new ways. Together, new data, measurements and methods have revealed fundamental new insights about the inner workings of science and scientific progress itself.

With multiple approaches, however, comes a key challenge. As researchers adhere to norms respected within their disciplines, their methods vary, with results often published in venues with non-overlapping readership, fragmenting research along disciplinary boundaries. This fragmentation challenges researchers’ ability to appreciate and understand the value of work outside of their own discipline, much less to build directly on it for further investigations.

Recognizing these challenges and the rapidly developing nature of the field, this paper reviews the empirical approaches that are prevalent in this literature. We aim to provide readers with an up-to-date understanding of the available datasets, measurement constructs and empirical methodologies, as well as the value and limitations of each. Owing to space constraints, this Review does not cover the full technical details of each method, referring readers to related guides to learn more. Instead, we will emphasize why a researcher might favour one method over another, depending on the research question.

Beyond a positive understanding of science, a key goal of the science of science is to inform science policy. While this Review mainly focuses on empirical approaches, with its core audience being researchers in the field, the studies reviewed are also germane to key policy questions. For example, what is the appropriate scale of scientific investment, in what directions and through what institutions 10 , 11 ? Are public investments in science aligned with public interests 12 ? What conditions produce novel or high-impact science 13 , 14 , 15 , 16 , 17 , 18 , 19 , 20 ? How do the reward systems of science influence the rate and direction of progress 13 , 21 , 22 , 23 , 24 , and what governs scientific reproducibility 25 , 26 , 27 ? How do contributions evolve over a scientific career 28 , 29 , 30 , 31 , 32 , and how may diversity among scientists advance scientific progress 33 , 34 , 35 , among other questions relevant to science policy 36 , 37 .

Overall, this review aims to facilitate entry to science of science research, expand researcher toolkits and illustrate how diverse research approaches contribute to our collective understanding of science. Section 2 reviews datasets and data linkages. Section 3 reviews major measurement constructs in the science of science. Section 4 considers a range of empirical methods, focusing on one study to illustrate each method and briefly summarizing related examples and applications. Section 5 concludes with an outlook for the science of science.

Historically, data on scientific activities were difficult to collect and were available in limited quantities. Gathering data could involve manually tallying statistics from publications 38 , 39 , interviewing scientists 16 , 40 , or assembling historical anecdotes and biographies 13 , 41 . Analyses were typically limited to a specific domain or group of scientists. Today, massive datasets on scientific production and use are at researchers’ fingertips 42 , 43 , 44 . Armed with big data and advanced algorithms, researchers can now probe questions previously not amenable to quantification and with enormous increases in scope and scale, as detailed below.

Publication datasets cover papers from nearly all scientific disciplines, enabling analyses of both general and domain-specific patterns. Commonly used datasets include the Web of Science (WoS), PubMed, CrossRef, ORCID, OpenCitations, Dimensions and OpenAlex. Datasets incorporating papers’ text (CORE) 45 , 46 , 47 , data entities (DataCite) 48 , 49 and peer review reports (Publons) 33 , 50 , 51 have also become available. These datasets further enable novel measurement, for example, representations of a paper’s content 52 , 53 , novelty 15 , 54 and interdisciplinarity 55 .

Notably, databases today capture more diverse aspects of science beyond publications, offering a richer and more encompassing view of research contexts and of researchers themselves (Fig. 1 ). For example, some datasets trace research funding to the specific publications these investments support 56 , 57 , allowing high-scale studies of the impact of funding on productivity and the return on public investment. Datasets incorporating job placements 58 , 59 , curriculum vitae 21 , 59 and scientific prizes 23 offer rich quantitative evidence on the social structure of science. Combining publication profiles with mentorship genealogies 60 , 61 , dissertations 34 and course syllabi 62 , 63 provides insights on mentoring and cultivating talent.

figure 1

This figure presents commonly used data types in science of science research, information contained in each data type and examples of data sources. Datasets in the science of science research have not only grown in scale but have also expanded beyond publications to integrate upstream funding investments and downstream applications that extend beyond science itself.

Finally, today’s scope of data extends beyond science to broader aspects of society. Altmetrics 64 captures news media and social media mentions of scientific articles. Other databases incorporate marketplace uses of science, including through patents 10 , pharmaceutical clinical trials and drug approvals 65 , 66 . Policy documents 67 , 68 help us to understand the role of science in the halls of government 69 and policy making 12 , 68 .

While datasets of the modern scientific enterprise have grown exponentially, they are not without limitations. As is often the case for data-driven research, drawing conclusions from specific data sources requires scrutiny and care. Datasets are typically based on published work, which may favour easy-to-publish topics over important ones (the streetlight effect) 70 , 71 . The publication of negative results is also rare (the file drawer problem) 72 , 73 . Meanwhile, English language publications account for over 90% of articles in major data sources, with limited coverage of non-English journals 74 . Publication datasets may also reflect biases in data collection across research institutions or demographic groups. Despite the open science movement, many datasets require paid subscriptions, which can create inequality in data access. Creating more open datasets for the science of science, such as OpenAlex, may not only improve the robustness and replicability of empirical claims but also increase entry to the field.

As today’s datasets become larger in scale and continue to integrate new dimensions, they offer opportunities to unveil the inner workings and external impacts of science in new ways. They can enable researchers to reach beyond previous limitations while conducting original studies of new and long-standing questions about the sciences.

Measurement

Here we discuss prominent measurement approaches in the science of science, including their purposes and limitations.

Modern publication databases typically include data on which articles and authors cite other papers and scientists. These citation linkages have been used to engage core conceptual ideas in scientific research. Here we consider two common measures based on citation information: citation counts and knowledge flows.

First, citation counts are commonly used indicators of impact. The term ‘indicator’ implies that it only approximates the concept of interest. A citation count is defined as how many times a document is cited by subsequent documents and can proxy for the importance of research papers 75 , 76 as well as patented inventions 77 , 78 , 79 . Rather than treating each citation equally, measures may further weight the importance of each citation, for example by using the citation network structure to produce centrality 80 , PageRank 81 , 82 or Eigenfactor indicators 83 , 84 .

Citation-based indicators have also faced criticism 84 , 85 . Citation indicators necessarily oversimplify the construct of impact, often ignoring heterogeneity in the meaning and use of a particular reference, the variations in citation practices across fields and institutional contexts, and the potential for reputation and power structures in science to influence citation behaviour 86 , 87 . Researchers have started to understand more nuanced citation behaviours ranging from negative citations 86 to citation context 47 , 88 , 89 . Understanding what a citation actually measures matters in interpreting and applying many research findings in the science of science. Evaluations relying on citation-based indicators rather than expert judgements raise questions regarding misuse 90 , 91 , 92 . Given the importance of developing indicators that can reliably quantify and evaluate science, the scientometrics community has been working to provide guidance for responsible citation practices and assessment 85 .

Second, scientists use citations to trace knowledge flows. Each citation in a paper is a link to specific previous work from which we can proxy how new discoveries draw upon existing ideas 76 , 93 and how knowledge flows between fields of science 94 , 95 , research institutions 96 , regions and nations 97 , 98 , 99 , and individuals 81 . Combinations of citation linkages can also approximate novelty 15 , disruptiveness 17 , 100 and interdisciplinarity 55 , 95 , 101 , 102 . A rapidly expanding body of work further examines citations to scientific articles from other domains (for example, patents, clinical drug trials and policy documents) to understand the applied value of science 10 , 12 , 65 , 66 , 103 , 104 , 105 .

Individuals

Analysing individual careers allows researchers to answer questions such as: How do we quantify individual scientific productivity? What is a typical career lifecycle? How are resources and credits allocated across individuals and careers? A scholar’s career can be examined through the papers they publish 30 , 31 , 106 , 107 , 108 , with attention to career progression and mobility, publication counts and citation impact, as well as grant funding 24 , 109 , 110 and prizes 111 , 112 , 113 ,

Studies of individual impact focus on output, typically approximated by the number of papers a researcher publishes and citation indicators. A popular measure for individual impact is the h -index 114 , which takes both volume and per-paper impact into consideration. Specifically, a scientist is assigned the largest value h such that they have h papers that were each cited at least h times. Later studies build on the idea of the h -index and propose variants to address limitations 115 , these variants ranging from emphasizing highly cited papers in a career 116 , to field differences 117 and normalizations 118 , to the relative contribution of an individual in collaborative works 119 .

To study dynamics in output over the lifecycle, individuals can be studied according to age, career age or the sequence of publications. A long-standing literature has investigated the relationship between age and the likelihood of outstanding achievement 28 , 106 , 111 , 120 , 121 . Recent studies further decouple the relationship between age, publication volume and per-paper citation, and measure the likelihood of producing highly cited papers in the sequence of works one produces 30 , 31 .

As simple as it sounds, representing careers using publication records is difficult. Collecting the full publication list of a researcher is the foundation to study individuals yet remains a key challenge, requiring name disambiguation techniques to match specific works to specific researchers. Although algorithms are increasingly capable at identifying millions of career profiles 122 , they vary in accuracy and robustness. ORCID can help to alleviate the problem by offering researchers the opportunity to create, maintain and update individual profiles themselves, and it goes beyond publications to collect broader outputs and activities 123 . A second challenge is survivorship bias. Empirical studies tend to focus on careers that are long enough to afford statistical analyses, which limits the applicability of the findings to scientific careers as a whole. A third challenge is the breadth of scientists’ activities, where focusing on publications ignores other important contributions such as mentorship and teaching, service (for example, refereeing papers, reviewing grant proposals and editing journals) or leadership within their organizations. Although researchers have begun exploring these dimensions by linking individual publication profiles with genealogical databases 61 , 124 , dissertations 34 , grants 109 , curriculum vitae 21 and acknowledgements 125 , scientific careers beyond publication records remain under-studied 126 , 127 . Lastly, citation-based indicators only serve as an approximation of individual performance with similar limitations as discussed above. The scientific community has called for more appropriate practices 85 , 128 , ranging from incorporating expert assessment of research contributions to broadening the measures of impact beyond publications.

Over many decades, science has exhibited a substantial and steady shift away from solo authorship towards coauthorship, especially among highly cited works 18 , 129 , 130 . In light of this shift, a research field, the science of team science 131 , 132 , has emerged to study the mechanisms that facilitate or hinder the effectiveness of teams. Team size can be proxied by the number of coauthors on a paper, which has been shown to predict distinctive types of advance: whereas larger teams tend to develop ideas, smaller teams tend to disrupt current ways of thinking 17 . Team characteristics can be inferred from coauthors’ backgrounds 133 , 134 , 135 , allowing quantification of a team’s diversity in terms of field, age, gender or ethnicity. Collaboration networks based on coauthorship 130 , 136 , 137 , 138 , 139 offer nuanced network-based indicators to understand individual and institutional collaborations.

However, there are limitations to using coauthorship alone to study teams 132 . First, coauthorship can obscure individual roles 140 , 141 , 142 , which has prompted institutional responses to help to allocate credit, including authorship order and individual contribution statements 56 , 143 . Second, coauthorship does not reflect the complex dynamics and interactions between team members that are often instrumental for team success 53 , 144 . Third, collaborative contributions can extend beyond coauthorship in publications to include members of a research laboratory 145 or co-principal investigators (co-PIs) on a grant 146 . Initiatives such as CRediT may help to address some of these issues by recording detailed roles for each contributor 147 .

Institutions

Research institutions, such as departments, universities, national laboratories and firms, encompass wider groups of researchers and their corresponding outputs. Institutional membership can be inferred from affiliations listed on publications or patents 148 , 149 , and the output of an institution can be aggregated over all its affiliated researchers 150 . Institutional research information systems (CRIS) contain more comprehensive research outputs and activities from employees.

Some research questions consider the institution as a whole, investigating the returns to research and development investment 104 , inequality of resource allocation 22 and the flow of scientists 21 , 148 , 149 . Other questions focus on institutional structures as sources of research productivity by looking into the role of peer effects 125 , 151 , 152 , 153 , how institutional policies impact research outcomes 154 , 155 and whether interdisciplinary efforts foster innovation 55 . Institution-oriented measurement faces similar limitations as with analyses of individuals and teams, including name disambiguation for a given institution and the limited capacity of formal publication records to characterize the full range of relevant institutional outcomes. It is also unclear how to allocate credit among multiple institutions associated with a paper. Moreover, relevant institutional employees extend beyond publishing researchers: interns, technicians and administrators all contribute to research endeavours 130 .

In sum, measurements allow researchers to quantify scientific production and use across numerous dimensions, but they also raise questions of construct validity: Does the proposed metric really reflect what we want to measure? Testing the construct’s validity is important, as is understanding a construct’s limits. Where possible, using alternative measurement approaches, or qualitative methods such as interviews and surveys, can improve measurement accuracy and the robustness of findings.

Empirical methods

In this section, we review two broad categories of empirical approaches (Table 1 ), each with distinctive goals: (1) to discover, estimate and predict empirical regularities; and (2) to identify causal mechanisms. For each method, we give a concrete example to help to explain how the method works, summarize related work for interested readers, and discuss contributions and limitations.

Descriptive and predictive approaches

Empirical regularities and generalizable facts.

The discovery of empirical regularities in science has had a key role in driving conceptual developments and the directions of future research. By observing empirical patterns at scale, researchers unveil central facts that shape science and present core features that theories of scientific progress and practice must explain. For example, consider citation distributions. de Solla Price first proposed that citation distributions are fat-tailed 39 , indicating that a few papers have extremely high citations while most papers have relatively few or even no citations at all. de Solla Price proposed that citation distribution was a power law, while researchers have since refined this view to show that the distribution appears log-normal, a nearly universal regularity across time and fields 156 , 157 . The fat-tailed nature of citation distributions and its universality across the sciences has in turn sparked substantial theoretical work that seeks to explain this key empirical regularity 20 , 156 , 158 , 159 .

Empirical regularities are often surprising and can contest previous beliefs of how science works. For example, it has been shown that the age distribution of great achievements peaks in middle age across a wide range of fields 107 , 121 , 160 , rejecting the common belief that young scientists typically drive breakthroughs in science. A closer look at the individual careers also indicates that productivity patterns vary widely across individuals 29 . Further, a scholar’s highest-impact papers come at a remarkably constant rate across the sequence of their work 30 , 31 .

The discovery of empirical regularities has had important roles in shaping beliefs about the nature of science 10 , 45 , 161 , 162 , sources of breakthrough ideas 15 , 163 , 164 , 165 , scientific careers 21 , 29 , 126 , 127 , the network structure of ideas and scientists 23 , 98 , 136 , 137 , 138 , 139 , 166 , gender inequality 57 , 108 , 126 , 135 , 143 , 167 , 168 , and many other areas of interest to scientists and science institutions 22 , 47 , 86 , 97 , 102 , 105 , 134 , 169 , 170 , 171 . At the same time, care must be taken to ensure that findings are not merely artefacts due to data selection or inherent bias. To differentiate meaningful patterns from spurious ones, it is important to stress test the findings through different selection criteria or across non-overlapping data sources.

Regression analysis

When investigating correlations among variables, a classic method is regression, which estimates how one set of variables explains variation in an outcome of interest. Regression can be used to test explicit hypotheses or predict outcomes. For example, researchers have investigated whether a paper’s novelty predicts its citation impact 172 . Adding additional control variables to the regression, one can further examine the robustness of the focal relationship.

Although regression analysis is useful for hypothesis testing, it bears substantial limitations. If the question one wishes to ask concerns a ‘causal’ rather than a correlational relationship, regression is poorly suited to the task as it is impossible to control for all the confounding factors. Failing to account for such ‘omitted variables’ can bias the regression coefficient estimates and lead to spurious interpretations. Further, regression models often have low goodness of fit (small R 2 ), indicating that the variables considered explain little of the outcome variation. As regressions typically focus on a specific relationship in simple functional forms, regressions tend to emphasize interpretability rather than overall predictability. The advent of predictive approaches powered by large-scale datasets and novel computational techniques offers new opportunities for modelling complex relationships with stronger predictive power.

Mechanistic models

Mechanistic modelling is an important approach to explaining empirical regularities, drawing from methods primarily used in physics. Such models predict macro-level regularities of a system by modelling micro-level interactions among basic elements with interpretable and modifiable formulars. While theoretical by nature, mechanistic models in the science of science are often empirically grounded, and this approach has developed together with the advent of large-scale, high-resolution data.

Simplicity is the core value of a mechanistic model. Consider for example, why citations follow a fat-tailed distribution. de Solla Price modelled the citing behaviour as a cumulative advantage process on a growing citation network 159 and found that if the probability a paper is cited grows linearly with its existing citations, the resulting distribution would follow a power law, broadly aligned with empirical observations. The model is intentionally simplified, ignoring myriad factors. Yet the simple cumulative advantage process is by itself sufficient in explaining a power law distribution of citations. In this way, mechanistic models can help to reveal key mechanisms that can explain observed patterns.

Moreover, mechanistic models can be refined as empirical evidence evolves. For example, later investigations showed that citation distributions are better characterized as log-normal 156 , 173 , prompting researchers to introduce a fitness parameter to encapsulate the inherent differences in papers’ ability to attract citations 174 , 175 . Further, older papers are less likely to be cited than expected 176 , 177 , 178 , motivating more recent models 20 to introduce an additional aging effect 179 . By combining the cumulative advantage, fitness and aging effects, one can already achieve substantial predictive power not just for the overall properties of the system but also the citation dynamics of individual papers 20 .

In addition to citations, mechanistic models have been developed to understand the formation of collaborations 136 , 180 , 181 , 182 , 183 , knowledge discovery and diffusion 184 , 185 , topic selection 186 , 187 , career dynamics 30 , 31 , 188 , 189 , the growth of scientific fields 190 and the dynamics of failure in science and other domains 178 .

At the same time, some observers have argued that mechanistic models are too simplistic to capture the essence of complex real-world problems 191 . While it has been a cornerstone for the natural sciences, representing social phenomena in a limited set of mathematical equations may miss complexities and heterogeneities that make social phenomena interesting in the first place. Such concerns are not unique to the science of science, as they represent a broader theme in computational social sciences 192 , 193 , ranging from social networks 194 , 195 to human mobility 196 , 197 to epidemics 198 , 199 . Other observers have questioned the practical utility of mechanistic models and whether they can be used to guide decisions and devise actionable policies. Nevertheless, despite these limitations, several complex phenomena in the science of science are well captured by simple mechanistic models, showing a high degree of regularity beneath complex interacting systems and providing powerful insights about the nature of science. Mixing such modelling with other methods could be particularly fruitful in future investigations.

Machine learning

The science of science seeks in part to forecast promising directions for scientific research 7 , 44 . In recent years, machine learning methods have substantially advanced predictive capabilities 200 , 201 and are playing increasingly important parts in the science of science. In contrast to the previous methods, machine learning does not emphasize hypotheses or theories. Rather, it leverages complex relationships in data and optimizes goodness of fit to make predictions and categorizations.

Traditional machine learning models include supervised, semi-supervised and unsupervised learning. The model choice depends on data availability and the research question, ranging from supervised models for citation prediction 202 , 203 to unsupervised models for community detection 204 . Take for example mappings of scientific knowledge 94 , 205 , 206 . The unsupervised method applies network clustering algorithms to map the structures of science. Related visualization tools make sense of clusters from the underlying network, allowing observers to see the organization, interactions and evolution of scientific knowledge. More recently, supervised learning, and deep neural networks in particular, have witnessed especially rapid developments 207 . Neural networks can generate high-dimensional representations of unstructured data such as images and texts, which encode complex properties difficult for human experts to perceive.

Take text analysis as an example. A recent study 52 utilizes 3.3 million paper abstracts in materials science to predict the thermoelectric properties of materials. The intuition is that the words currently used to describe a material may predict its hitherto undiscovered properties (Fig. 2 ). Compared with a random material, the materials predicted by the model are eight times more likely to be reported as thermoelectric in the next 5 years, suggesting that machine learning has the potential to substantially speed up knowledge discovery, especially as data continue to grow in scale and scope. Indeed, predicting the direction of new discoveries represents one of the most promising avenues for machine learning models, with neural networks being applied widely to biology 208 , physics 209 , 210 , mathematics 211 , chemistry 212 , medicine 213 and clinical applications 214 . Neural networks also offer a quantitative framework to probe the characteristics of creative products ranging from scientific papers 53 , journals 215 , organizations 148 , to paintings and movies 32 . Neural networks can also help to predict the reproducibility of papers from a variety of disciplines at scale 53 , 216 .

figure 2

This figure illustrates the word2vec skip-gram methods 52 , where the goal is to predict useful properties of materials using previous scientific literature. a , The architecture and training process of the word2vec skip-gram model, where the 3-layer, fully connected neural network learns the 200-dimensional representation (hidden layer) from the sparse vector for each word and its context in the literature (input layer). b , The top two principal components of the word embedding. Materials with similar features are close in the 2D space, allowing prediction of a material’s properties. Different targeted words are shown in different colours. Reproduced with permission from ref. 52 , Springer Nature Ltd.

While machine learning can offer high predictive accuracy, successful applications to the science of science face challenges, particularly regarding interpretability. Researchers may value transparent and interpretable findings for how a given feature influences an outcome, rather than a black-box model. The lack of interpretability also raises concerns about bias and fairness. In predicting reproducible patterns from data, machine learning models inevitably include and reproduce biases embedded in these data, often in non-transparent ways. The fairness of machine learning 217 is heavily debated in applications ranging from the criminal justice system to hiring processes. Effective and responsible use of machine learning in the science of science therefore requires thoughtful partnership between humans and machines 53 to build a reliable system accessible to scrutiny and modification.

Causal approaches

The preceding methods can reveal core facts about the workings of science and develop predictive capacity. Yet, they fail to capture causal relationships, which are particularly useful in assessing policy interventions. For example, how can we test whether a science policy boosts or hinders the performance of individuals, teams or institutions? The overarching idea of causal approaches is to construct some counterfactual world where two groups are identical to each other except that one group experiences a treatment that the other group does not.

Towards causation

Before engaging in causal approaches, it is useful to first consider the interpretative challenges of observational data. As observational data emerge from mechanisms that are not fully known or measured, an observed correlation may be driven by underlying forces that were not accounted for in the analysis. This challenge makes causal inference fundamentally difficult in observational data. An awareness of this issue is the first step in confronting it. It further motivates intermediate empirical approaches, including the use of matching strategies and fixed effects, that can help to confront (although not fully eliminate) the inference challenge. We first consider these approaches before turning to more fully causal methods.

Matching. Matching utilizes rich information to construct a control group that is similar to the treatment group on as many observable characteristics as possible before the treatment group is exposed to the treatment. Inferences can then be made by comparing the treatment and the matched control groups. Exact matching applies to categorical values, such as country, gender, discipline or affiliation 35 , 218 . Coarsened exact matching considers percentile bins of continuous variables and matches observations in the same bin 133 . Propensity score matching estimates the probability of receiving the ‘treatment’ on the basis of the controlled variables and uses the estimates to match treatment and control groups, which reduces the matching task from comparing the values of multiple covariates to comparing a single value 24 , 219 . Dynamic matching is useful for longitudinally matching variables that change over time 220 , 221 .

Fixed effects. Fixed effects are a powerful and now standard tool in controlling for confounders. A key requirement for using fixed effects is that there are multiple observations on the same subject or entity (person, field, institution and so on) 222 , 223 , 224 . The fixed effect works as a dummy variable that accounts for the role of any fixed characteristic of that entity. Consider the finding where gender-diverse teams produce higher-impact papers than same-gender teams do 225 . A confounder may be that individuals who tend to write high-impact papers may also be more likely to work in gender-diverse teams. By including individual fixed effects, one accounts for any fixed characteristics of individuals (such as IQ, cultural background or previous education) that might drive the relationship of interest.

In sum, matching and fixed effects methods reduce potential sources of bias in interpreting relationships between variables. Yet, confounders may persist in these studies. For instance, fixed effects do not control for unobserved factors that change with time within the given entity (for example, access to funding or new skills). Identifying casual effects convincingly will then typically require distinct research methods that we turn to next.

Quasi-experiments

Researchers in economics and other fields have developed a range of quasi-experimental methods to construct treatment and control groups. The key idea here is exploiting randomness from external events that differentially expose subjects to a particular treatment. Here we review three quasi-experimental methods: difference-in-differences, instrumental variables and regression discontinuity (Fig. 3 ).

figure 3

a – c , This figure presents illustrations of ( a ) differences-in-differences, ( b ) instrumental variables and ( c ) regression discontinuity methods. The solid line in b represents causal links and the dashed line represents the relationships that are not allowed, if the IV method is to produce causal inference.

Difference-in-differences. Difference-in-difference regression (DiD) investigates the effect of an unexpected event, comparing the affected group (the treated group) with an unaffected group (the control group). The control group is intended to provide the counterfactual path—what would have happened were it not for the unexpected event. Ideally, the treated and control groups are on virtually identical paths before the treatment event, but DiD can also work if the groups are on parallel paths (Fig. 3a ). For example, one study 226 examines how the premature death of superstar scientists affects the productivity of their previous collaborators. The control group are collaborators of superstars who did not die in the time frame. The two groups do not show significant differences in publications before a death event, yet upon the death of a star scientist, the treated collaborators on average experience a 5–8% decline in their quality-adjusted publication rates compared with the control group. DiD has wide applicability in the science of science, having been used to analyse the causal effects of grant design 24 , access costs to previous research 155 , 227 , university technology transfer policies 154 , intellectual property 228 , citation practices 229 , evolution of fields 221 and the impacts of paper retractions 230 , 231 , 232 . The DiD literature has grown especially rapidly in the field of economics, with substantial recent refinements 233 , 234 .

Instrumental variables. Another quasi-experimental approach utilizes ‘instrumental variables’ (IV). The goal is to determine the causal influence of some feature X on some outcome Y by using a third, instrumental variable. This instrumental variable is a quasi-random event that induces variation in X and, except for its impact through X , has no other effect on the outcome Y (Fig. 3b ). For example, consider a study of astronomy that seeks to understand how telescope time affects career advancement 235 . Here, one cannot simply look at the correlation between telescope time and career outcomes because many confounds (such as talent or grit) may influence both telescope time and career opportunities. Now consider the weather as an instrumental variable. Cloudy weather will, at random, reduce an astronomer’s observational time. Yet, the weather on particular nights is unlikely to correlate with a scientist’s innate qualities. The weather can then provide an instrumental variable to reveal a causal relationship between telescope time and career outcomes. Instrumental variables have been used to study local peer effects in research 151 , the impact of gender composition in scientific committees 236 , patents on future innovation 237 and taxes on inventor mobility 238 .

Regression discontinuity. In regression discontinuity, policies with an arbitrary threshold for receiving some benefit can be used to construct treatment and control groups (Fig. 3c ). Take the funding paylines for grant proposals as an example. Proposals with scores increasingly close to the payline are increasingly similar in their both observable and unobservable characteristics, yet only those projects with scores above the payline receive the funding. For example, a study 110 examines the effect of winning an early-career grant on the probability of winning a later, mid-career grant. The probability has a discontinuous jump across the initial grant’s payline, providing the treatment and control groups needed to estimate the causal effect of receiving a grant. This example utilizes the ‘sharp’ regression discontinuity that assumes treatment status to be fully determined by the cut-off. If we assume treatment status is only partly determined by the cut-off, we can use ‘fuzzy’ regression discontinuity designs. Here the probability of receiving a grant is used to estimate the future outcome 11 , 110 , 239 , 240 , 241 .

Although quasi-experiments are powerful tools, they face their own limitations. First, these approaches identify causal effects within a specific context and often engage small numbers of observations. How representative the samples are for broader populations or contexts is typically left as an open question. Second, the validity of the causal design is typically not ironclad. Researchers usually conduct different robustness checks to verify whether observable confounders have significant differences between the treated and control groups, before treatment. However, unobservable features may still differ between treatment and control groups. The quality of instrumental variables and the specific claim that they have no effect on the outcome except through the variable of interest, is also difficult to assess. Ultimately, researchers must rely partly on judgement to tell whether appropriate conditions are met for causal inference.

This section emphasized popular econometric approaches to causal inference. Other empirical approaches, such as graphical causal modelling 242 , 243 , also represent an important stream of work on assessing causal relationships. Such approaches usually represent causation as a directed acyclic graph, with nodes as variables and arrows between them as suspected causal relationships. In the science of science, the directed acyclic graph approach has been applied to quantify the causal effect of journal impact factor 244 and gender or racial bias 245 on citations. Graphical causal modelling has also triggered discussions on strengths and weaknesses compared to the econometrics methods 246 , 247 .

Experiments

In contrast to quasi-experimental approaches, laboratory and field experiments conduct direct randomization in assigning treatment and control groups. These methods engage explicitly in the data generation process, manipulating interventions to observe counterfactuals. These experiments are crafted to study mechanisms of specific interest and, by designing the experiment and formally randomizing, can produce especially rigorous causal inference.

Laboratory experiments. Laboratory experiments build counterfactual worlds in well-controlled laboratory environments. Researchers randomly assign participants to the treatment or control group and then manipulate the laboratory conditions to observe different outcomes in the two groups. For example, consider laboratory experiments on team performance and gender composition 144 , 248 . The researchers randomly assign participants into groups to perform tasks such as solving puzzles or brainstorming. Teams with a higher proportion of women are found to perform better on average, offering evidence that gender diversity is causally linked to team performance. Laboratory experiments can allow researchers to test forces that are otherwise hard to observe, such as how competition influences creativity 249 . Laboratory experiments have also been used to evaluate how journal impact factors shape scientists’ perceptions of rewards 250 and gender bias in hiring 251 .

Laboratory experiments allow for precise control of settings and procedures to isolate causal effects of interest. However, participants may behave differently in synthetic environments than in real-world settings, raising questions about the generalizability and replicability of the results 252 , 253 , 254 . To assess causal effects in real-world settings, researcher use randomized controlled trials.

Randomized controlled trials. A randomized controlled trial (RCT), or field experiment, is a staple for causal inference across a wide range of disciplines. RCTs randomly assign participants into the treatment and control conditions 255 and can be used not only to assess mechanisms but also to test real-world interventions such as policy change. The science of science has witnessed growing use of RCTs. For instance, a field experiment 146 investigated whether lower search costs for collaborators increased collaboration in grant applications. The authors randomly allocated principal investigators to face-to-face sessions in a medical school, and then measured participants’ chance of writing a grant proposal together. RCTs have also offered rich causal insights on peer review 256 , 257 , 258 , 259 , 260 and gender bias in science 261 , 262 , 263 .

While powerful, RCTs are difficult to conduct in the science of science, mainly for two reasons. The first concerns potential risks in a policy intervention. For instance, while randomizing funding across individuals could generate crucial causal insights for funders, it may also inadvertently harm participants’ careers 264 . Second, key questions in the science of science often require a long-time horizon to trace outcomes, which makes RCTs costly. It also raises the difficulty of replicating findings. A relative advantage of the quasi-experimental methods discussed earlier is that one can identify causal effects over potentially long periods of time in the historical record. On the other hand, quasi-experiments must be found as opposed to designed, and they often are not available for many questions of interest. While the best approaches are context dependent, a growing community of researchers is building platforms to facilitate RCTs for the science of science, aiming to lower their costs and increase their scale. Performing RCTs in partnership with science institutions can also contribute to timely, policy-relevant research that may substantially improve science decision-making and investments.

Research in the science of science has been empowered by the growth of high-scale data, new measurement approaches and an expanding range of empirical methods. These tools provide enormous capacity to test conceptual frameworks about science, discover factors impacting scientific productivity, predict key scientific outcomes and design policies that better facilitate future scientific progress. A careful appreciation of empirical techniques can help researchers to choose effective tools for questions of interest and propel the field. A better and broader understanding of these methodologies may also build bridges across diverse research communities, facilitating communication and collaboration, and better leveraging the value of diverse perspectives. The science of science is about turning scientific methods on the nature of science itself. The fruits of this work, with time, can guide researchers and research institutions to greater progress in discovery and understanding across the landscape of scientific inquiry.

Bush, V . S cience–the Endless Frontier: A Report to the President on a Program for Postwar Scientific Research (National Science Foundation, 1990).

Mokyr, J. The Gifts of Athena (Princeton Univ. Press, 2011).

Jones, B. F. in Rebuilding the Post-Pandemic Economy (eds Kearney, M. S. & Ganz, A.) 272–310 (Aspen Institute Press, 2021).

Wang, D. & Barabási, A.-L. The Science of Science (Cambridge Univ. Press, 2021).

Fortunato, S. et al. Science of science. Science 359 , eaao0185 (2018).

Article   PubMed   PubMed Central   Google Scholar  

Azoulay, P. et al. Toward a more scientific science. Science 361 , 1194–1197 (2018).

Article   PubMed   Google Scholar  

Clauset, A., Larremore, D. B. & Sinatra, R. Data-driven predictions in the science of science. Science 355 , 477–480 (2017).

Article   CAS   PubMed   Google Scholar  

Zeng, A. et al. The science of science: from the perspective of complex systems. Phys. Rep. 714 , 1–73 (2017).

Article   Google Scholar  

Lin, Z., Yin. Y., Liu, L. & Wang, D. SciSciNet: a large-scale open data lake for the science of science research. Sci. Data, https://doi.org/10.1038/s41597-023-02198-9 (2023).

Ahmadpoor, M. & Jones, B. F. The dual frontier: patented inventions and prior scientific advance. Science 357 , 583–587 (2017).

Azoulay, P., Graff Zivin, J. S., Li, D. & Sampat, B. N. Public R&D investments and private-sector patenting: evidence from NIH funding rules. Rev. Econ. Stud. 86 , 117–152 (2019).

Yin, Y., Dong, Y., Wang, K., Wang, D. & Jones, B. F. Public use and public funding of science. Nat. Hum. Behav. 6 , 1344–1350 (2022).

Merton, R. K. The Sociology of Science: Theoretical and Empirical Investigations (Univ. Chicago Press, 1973).

Kuhn, T. The Structure of Scientific Revolutions (Princeton Univ. Press, 2021).

Uzzi, B., Mukherjee, S., Stringer, M. & Jones, B. Atypical combinations and scientific impact. Science 342 , 468–472 (2013).

Zuckerman, H. Scientific Elite: Nobel Laureates in the United States (Transaction Publishers, 1977).

Wu, L., Wang, D. & Evans, J. A. Large teams develop and small teams disrupt science and technology. Nature 566 , 378–382 (2019).

Wuchty, S., Jones, B. F. & Uzzi, B. The increasing dominance of teams in production of knowledge. Science 316 , 1036–1039 (2007).

Foster, J. G., Rzhetsky, A. & Evans, J. A. Tradition and innovation in scientists’ research strategies. Am. Sociol. Rev. 80 , 875–908 (2015).

Wang, D., Song, C. & Barabási, A.-L. Quantifying long-term scientific impact. Science 342 , 127–132 (2013).

Clauset, A., Arbesman, S. & Larremore, D. B. Systematic inequality and hierarchy in faculty hiring networks. Sci. Adv. 1 , e1400005 (2015).

Ma, A., Mondragón, R. J. & Latora, V. Anatomy of funded research in science. Proc. Natl Acad. Sci. USA 112 , 14760–14765 (2015).

Article   CAS   PubMed   PubMed Central   Google Scholar  

Ma, Y. & Uzzi, B. Scientific prize network predicts who pushes the boundaries of science. Proc. Natl Acad. Sci. USA 115 , 12608–12615 (2018).

Azoulay, P., Graff Zivin, J. S. & Manso, G. Incentives and creativity: evidence from the academic life sciences. RAND J. Econ. 42 , 527–554 (2011).

Schor, S. & Karten, I. Statistical evaluation of medical journal manuscripts. JAMA 195 , 1123–1128 (1966).

Platt, J. R. Strong inference: certain systematic methods of scientific thinking may produce much more rapid progress than others. Science 146 , 347–353 (1964).

Ioannidis, J. P. Why most published research findings are false. PLoS Med. 2 , e124 (2005).

Simonton, D. K. Career landmarks in science: individual differences and interdisciplinary contrasts. Dev. Psychol. 27 , 119 (1991).

Way, S. F., Morgan, A. C., Clauset, A. & Larremore, D. B. The misleading narrative of the canonical faculty productivity trajectory. Proc. Natl Acad. Sci. USA 114 , E9216–E9223 (2017).

Sinatra, R., Wang, D., Deville, P., Song, C. & Barabási, A.-L. Quantifying the evolution of individual scientific impact. Science 354 , aaf5239 (2016).

Liu, L. et al. Hot streaks in artistic, cultural, and scientific careers. Nature 559 , 396–399 (2018).

Liu, L., Dehmamy, N., Chown, J., Giles, C. L. & Wang, D. Understanding the onset of hot streaks across artistic, cultural, and scientific careers. Nat. Commun. 12 , 5392 (2021).

Squazzoni, F. et al. Peer review and gender bias: a study on 145 scholarly journals. Sci. Adv. 7 , eabd0299 (2021).

Hofstra, B. et al. The diversity–innovation paradox in science. Proc. Natl Acad. Sci. USA 117 , 9284–9291 (2020).

Huang, J., Gates, A. J., Sinatra, R. & Barabási, A.-L. Historical comparison of gender inequality in scientific careers across countries and disciplines. Proc. Natl Acad. Sci. USA 117 , 4609–4616 (2020).

Gläser, J. & Laudel, G. Governing science: how science policy shapes research content. Eur. J. Sociol. 57 , 117–168 (2016).

Stephan, P. E. How Economics Shapes Science (Harvard Univ. Press, 2012).

Garfield, E. & Sher, I. H. New factors in the evaluation of scientific literature through citation indexing. Am. Doc. 14 , 195–201 (1963).

Article   CAS   Google Scholar  

de Solla Price, D. J. Networks of scientific papers. Science 149 , 510–515 (1965).

Etzkowitz, H., Kemelgor, C. & Uzzi, B. Athena Unbound: The Advancement of Women in Science and Technology (Cambridge Univ. Press, 2000).

Simonton, D. K. Scientific Genius: A Psychology of Science (Cambridge Univ. Press, 1988).

Khabsa, M. & Giles, C. L. The number of scholarly documents on the public web. PLoS ONE 9 , e93949 (2014).

Xia, F., Wang, W., Bekele, T. M. & Liu, H. Big scholarly data: a survey. IEEE Trans. Big Data 3 , 18–35 (2017).

Evans, J. A. & Foster, J. G. Metaknowledge. Science 331 , 721–725 (2011).

Milojević, S. Quantifying the cognitive extent of science. J. Informetr. 9 , 962–973 (2015).

Rzhetsky, A., Foster, J. G., Foster, I. T. & Evans, J. A. Choosing experiments to accelerate collective discovery. Proc. Natl Acad. Sci. USA 112 , 14569–14574 (2015).

Poncela-Casasnovas, J., Gerlach, M., Aguirre, N. & Amaral, L. A. Large-scale analysis of micro-level citation patterns reveals nuanced selection criteria. Nat. Hum. Behav. 3 , 568–575 (2019).

Hardwicke, T. E. et al. Data availability, reusability, and analytic reproducibility: evaluating the impact of a mandatory open data policy at the journal Cognition. R. Soc. Open Sci. 5 , 180448 (2018).

Nagaraj, A., Shears, E. & de Vaan, M. Improving data access democratizes and diversifies science. Proc. Natl Acad. Sci. USA 117 , 23490–23498 (2020).

Bravo, G., Grimaldo, F., López-Iñesta, E., Mehmani, B. & Squazzoni, F. The effect of publishing peer review reports on referee behavior in five scholarly journals. Nat. Commun. 10 , 322 (2019).

Tran, D. et al. An open review of open review: a critical analysis of the machine learning conference review process. Preprint at https://doi.org/10.48550/arXiv.2010.05137 (2020).

Tshitoyan, V. et al. Unsupervised word embeddings capture latent knowledge from materials science literature. Nature 571 , 95–98 (2019).

Yang, Y., Wu, Y. & Uzzi, B. Estimating the deep replicability of scientific findings using human and artificial intelligence. Proc. Natl Acad. Sci. USA 117 , 10762–10768 (2020).

Mukherjee, S., Uzzi, B., Jones, B. & Stringer, M. A new method for identifying recombinations of existing knowledge associated with high‐impact innovation. J. Prod. Innov. Manage. 33 , 224–236 (2016).

Leahey, E., Beckman, C. M. & Stanko, T. L. Prominent but less productive: the impact of interdisciplinarity on scientists’ research. Adm. Sci. Q. 62 , 105–139 (2017).

Sauermann, H. & Haeussler, C. Authorship and contribution disclosures. Sci. Adv. 3 , e1700404 (2017).

Oliveira, D. F. M., Ma, Y., Woodruff, T. K. & Uzzi, B. Comparison of National Institutes of Health grant amounts to first-time male and female principal investigators. JAMA 321 , 898–900 (2019).

Yang, Y., Chawla, N. V. & Uzzi, B. A network’s gender composition and communication pattern predict women’s leadership success. Proc. Natl Acad. Sci. USA 116 , 2033–2038 (2019).

Way, S. F., Larremore, D. B. & Clauset, A. Gender, productivity, and prestige in computer science faculty hiring networks. In Proc. 25th International Conference on World Wide Web 1169–1179. (ACM 2016)

Malmgren, R. D., Ottino, J. M. & Amaral, L. A. N. The role of mentorship in protege performance. Nature 465 , 622–626 (2010).

Ma, Y., Mukherjee, S. & Uzzi, B. Mentorship and protégé success in STEM fields. Proc. Natl Acad. Sci. USA 117 , 14077–14083 (2020).

Börner, K. et al. Skill discrepancies between research, education, and jobs reveal the critical need to supply soft skills for the data economy. Proc. Natl Acad. Sci. USA 115 , 12630–12637 (2018).

Biasi, B. & Ma, S. The Education-Innovation Gap (National Bureau of Economic Research Working papers, 2020).

Bornmann, L. Do altmetrics point to the broader impact of research? An overview of benefits and disadvantages of altmetrics. J. Informetr. 8 , 895–903 (2014).

Cleary, E. G., Beierlein, J. M., Khanuja, N. S., McNamee, L. M. & Ledley, F. D. Contribution of NIH funding to new drug approvals 2010–2016. Proc. Natl Acad. Sci. USA 115 , 2329–2334 (2018).

Spector, J. M., Harrison, R. S. & Fishman, M. C. Fundamental science behind today’s important medicines. Sci. Transl. Med. 10 , eaaq1787 (2018).

Haunschild, R. & Bornmann, L. How many scientific papers are mentioned in policy-related documents? An empirical investigation using Web of Science and Altmetric data. Scientometrics 110 , 1209–1216 (2017).

Yin, Y., Gao, J., Jones, B. F. & Wang, D. Coevolution of policy and science during the pandemic. Science 371 , 128–130 (2021).

Sugimoto, C. R., Work, S., Larivière, V. & Haustein, S. Scholarly use of social media and altmetrics: a review of the literature. J. Assoc. Inf. Sci. Technol. 68 , 2037–2062 (2017).

Dunham, I. Human genes: time to follow the roads less traveled? PLoS Biol. 16 , e3000034 (2018).

Kustatscher, G. et al. Understudied proteins: opportunities and challenges for functional proteomics. Nat. Methods 19 , 774–779 (2022).

Rosenthal, R. The file drawer problem and tolerance for null results. Psychol. Bull. 86 , 638 (1979).

Franco, A., Malhotra, N. & Simonovits, G. Publication bias in the social sciences: unlocking the file drawer. Science 345 , 1502–1505 (2014).

Vera-Baceta, M.-A., Thelwall, M. & Kousha, K. Web of Science and Scopus language coverage. Scientometrics 121 , 1803–1813 (2019).

Waltman, L. A review of the literature on citation impact indicators. J. Informetr. 10 , 365–391 (2016).

Garfield, E. & Merton, R. K. Citation Indexing: Its Theory and Application in Science, Technology, and Humanities (Wiley, 1979).

Kelly, B., Papanikolaou, D., Seru, A. & Taddy, M. Measuring Technological Innovation Over the Long Run Report No. 0898-2937 (National Bureau of Economic Research, 2018).

Kogan, L., Papanikolaou, D., Seru, A. & Stoffman, N. Technological innovation, resource allocation, and growth. Q. J. Econ. 132 , 665–712 (2017).

Hall, B. H., Jaffe, A. & Trajtenberg, M. Market value and patent citations. RAND J. Econ. 36 , 16–38 (2005).

Google Scholar  

Yan, E. & Ding, Y. Applying centrality measures to impact analysis: a coauthorship network analysis. J. Am. Soc. Inf. Sci. Technol. 60 , 2107–2118 (2009).

Radicchi, F., Fortunato, S., Markines, B. & Vespignani, A. Diffusion of scientific credits and the ranking of scientists. Phys. Rev. E 80 , 056103 (2009).

Bollen, J., Rodriquez, M. A. & Van de Sompel, H. Journal status. Scientometrics 69 , 669–687 (2006).

Bergstrom, C. T., West, J. D. & Wiseman, M. A. The eigenfactor™ metrics. J. Neurosci. 28 , 11433–11434 (2008).

Cronin, B. & Sugimoto, C. R. Beyond Bibliometrics: Harnessing Multidimensional Indicators of Scholarly Impact (MIT Press, 2014).

Hicks, D., Wouters, P., Waltman, L., De Rijcke, S. & Rafols, I. Bibliometrics: the Leiden Manifesto for research metrics. Nature 520 , 429–431 (2015).

Catalini, C., Lacetera, N. & Oettl, A. The incidence and role of negative citations in science. Proc. Natl Acad. Sci. USA 112 , 13823–13826 (2015).

Alcacer, J. & Gittelman, M. Patent citations as a measure of knowledge flows: the influence of examiner citations. Rev. Econ. Stat. 88 , 774–779 (2006).

Ding, Y. et al. Content‐based citation analysis: the next generation of citation analysis. J. Assoc. Inf. Sci. Technol. 65 , 1820–1833 (2014).

Teufel, S., Siddharthan, A. & Tidhar, D. Automatic classification of citation function. In Proc. 2006 Conference on Empirical Methods in Natural Language Processing, 103–110 (Association for Computational Linguistics 2006)

Seeber, M., Cattaneo, M., Meoli, M. & Malighetti, P. Self-citations as strategic response to the use of metrics for career decisions. Res. Policy 48 , 478–491 (2019).

Pendlebury, D. A. The use and misuse of journal metrics and other citation indicators. Arch. Immunol. Ther. Exp. 57 , 1–11 (2009).

Biagioli, M. Watch out for cheats in citation game. Nature 535 , 201 (2016).

Jo, W. S., Liu, L. & Wang, D. See further upon the giants: quantifying intellectual lineage in science. Quant. Sci. Stud. 3 , 319–330 (2022).

Boyack, K. W., Klavans, R. & Börner, K. Mapping the backbone of science. Scientometrics 64 , 351–374 (2005).

Gates, A. J., Ke, Q., Varol, O. & Barabási, A.-L. Nature’s reach: narrow work has broad impact. Nature 575 , 32–34 (2019).

Börner, K., Penumarthy, S., Meiss, M. & Ke, W. Mapping the diffusion of scholarly knowledge among major US research institutions. Scientometrics 68 , 415–426 (2006).

King, D. A. The scientific impact of nations. Nature 430 , 311–316 (2004).

Pan, R. K., Kaski, K. & Fortunato, S. World citation and collaboration networks: uncovering the role of geography in science. Sci. Rep. 2 , 902 (2012).

Jaffe, A. B., Trajtenberg, M. & Henderson, R. Geographic localization of knowledge spillovers as evidenced by patent citations. Q. J. Econ. 108 , 577–598 (1993).

Funk, R. J. & Owen-Smith, J. A dynamic network measure of technological change. Manage. Sci. 63 , 791–817 (2017).

Yegros-Yegros, A., Rafols, I. & D’este, P. Does interdisciplinary research lead to higher citation impact? The different effect of proximal and distal interdisciplinarity. PLoS ONE 10 , e0135095 (2015).

Larivière, V., Haustein, S. & Börner, K. Long-distance interdisciplinarity leads to higher scientific impact. PLoS ONE 10 , e0122565 (2015).

Fleming, L., Greene, H., Li, G., Marx, M. & Yao, D. Government-funded research increasingly fuels innovation. Science 364 , 1139–1141 (2019).

Bowen, A. & Casadevall, A. Increasing disparities between resource inputs and outcomes, as measured by certain health deliverables, in biomedical research. Proc. Natl Acad. Sci. USA 112 , 11335–11340 (2015).

Li, D., Azoulay, P. & Sampat, B. N. The applied value of public investments in biomedical research. Science 356 , 78–81 (2017).

Lehman, H. C. Age and Achievement (Princeton Univ. Press, 2017).

Simonton, D. K. Creative productivity: a predictive and explanatory model of career trajectories and landmarks. Psychol. Rev. 104 , 66 (1997).

Duch, J. et al. The possible role of resource requirements and academic career-choice risk on gender differences in publication rate and impact. PLoS ONE 7 , e51332 (2012).

Wang, Y., Jones, B. F. & Wang, D. Early-career setback and future career impact. Nat. Commun. 10 , 4331 (2019).

Bol, T., de Vaan, M. & van de Rijt, A. The Matthew effect in science funding. Proc. Natl Acad. Sci. USA 115 , 4887–4890 (2018).

Jones, B. F. Age and great invention. Rev. Econ. Stat. 92 , 1–14 (2010).

Newman, M. Networks (Oxford Univ. Press, 2018).

Mazloumian, A., Eom, Y.-H., Helbing, D., Lozano, S. & Fortunato, S. How citation boosts promote scientific paradigm shifts and nobel prizes. PLoS ONE 6 , e18975 (2011).

Hirsch, J. E. An index to quantify an individual’s scientific research output. Proc. Natl Acad. Sci. USA 102 , 16569–16572 (2005).

Alonso, S., Cabrerizo, F. J., Herrera-Viedma, E. & Herrera, F. h-index: a review focused in its variants, computation and standardization for different scientific fields. J. Informetr. 3 , 273–289 (2009).

Egghe, L. An improvement of the h-index: the g-index. ISSI Newsl. 2 , 8–9 (2006).

Kaur, J., Radicchi, F. & Menczer, F. Universality of scholarly impact metrics. J. Informetr. 7 , 924–932 (2013).

Majeti, D. et al. Scholar plot: design and evaluation of an information interface for faculty research performance. Front. Res. Metr. Anal. 4 , 6 (2020).

Sidiropoulos, A., Katsaros, D. & Manolopoulos, Y. Generalized Hirsch h-index for disclosing latent facts in citation networks. Scientometrics 72 , 253–280 (2007).

Jones, B. F. & Weinberg, B. A. Age dynamics in scientific creativity. Proc. Natl Acad. Sci. USA 108 , 18910–18914 (2011).

Dennis, W. Age and productivity among scientists. Science 123 , 724–725 (1956).

Sanyal, D. K., Bhowmick, P. K. & Das, P. P. A review of author name disambiguation techniques for the PubMed bibliographic database. J. Inf. Sci. 47 , 227–254 (2021).

Haak, L. L., Fenner, M., Paglione, L., Pentz, E. & Ratner, H. ORCID: a system to uniquely identify researchers. Learn. Publ. 25 , 259–264 (2012).

Malmgren, R. D., Ottino, J. M. & Amaral, L. A. N. The role of mentorship in protégé performance. Nature 465 , 662–667 (2010).

Oettl, A. Reconceptualizing stars: scientist helpfulness and peer performance. Manage. Sci. 58 , 1122–1140 (2012).

Morgan, A. C. et al. The unequal impact of parenthood in academia. Sci. Adv. 7 , eabd1996 (2021).

Morgan, A. C. et al. Socioeconomic roots of academic faculty. Nat. Hum. Behav. 6 , 1625–1633 (2022).

San Francisco Declaration on Research Assessment (DORA) (American Society for Cell Biology, 2012).

Falk‐Krzesinski, H. J. et al. Advancing the science of team science. Clin. Transl. Sci. 3 , 263–266 (2010).

Cooke, N. J. et al. Enhancing the Effectiveness of Team Science (National Academies Press, 2015).

Börner, K. et al. A multi-level systems perspective for the science of team science. Sci. Transl. Med. 2 , 49cm24 (2010).

Leahey, E. From sole investigator to team scientist: trends in the practice and study of research collaboration. Annu. Rev. Sociol. 42 , 81–100 (2016).

AlShebli, B. K., Rahwan, T. & Woon, W. L. The preeminence of ethnic diversity in scientific collaboration. Nat. Commun. 9 , 5163 (2018).

Hsiehchen, D., Espinoza, M. & Hsieh, A. Multinational teams and diseconomies of scale in collaborative research. Sci. Adv. 1 , e1500211 (2015).

Koning, R., Samila, S. & Ferguson, J.-P. Who do we invent for? Patents by women focus more on women’s health, but few women get to invent. Science 372 , 1345–1348 (2021).

Barabâsi, A.-L. et al. Evolution of the social network of scientific collaborations. Physica A 311 , 590–614 (2002).

Newman, M. E. Scientific collaboration networks. I. Network construction and fundamental results. Phys. Rev. E 64 , 016131 (2001).

Newman, M. E. Scientific collaboration networks. II. Shortest paths, weighted networks, and centrality. Phys. Rev. E 64 , 016132 (2001).

Palla, G., Barabási, A.-L. & Vicsek, T. Quantifying social group evolution. Nature 446 , 664–667 (2007).

Ross, M. B. et al. Women are credited less in science than men. Nature 608 , 135–145 (2022).

Shen, H.-W. & Barabási, A.-L. Collective credit allocation in science. Proc. Natl Acad. Sci. USA 111 , 12325–12330 (2014).

Merton, R. K. Matthew effect in science. Science 159 , 56–63 (1968).

Ni, C., Smith, E., Yuan, H., Larivière, V. & Sugimoto, C. R. The gendered nature of authorship. Sci. Adv. 7 , eabe4639 (2021).

Woolley, A. W., Chabris, C. F., Pentland, A., Hashmi, N. & Malone, T. W. Evidence for a collective intelligence factor in the performance of human groups. Science 330 , 686–688 (2010).

Feldon, D. F. et al. Postdocs’ lab engagement predicts trajectories of PhD students’ skill development. Proc. Natl Acad. Sci. USA 116 , 20910–20916 (2019).

Boudreau, K. J. et al. A field experiment on search costs and the formation of scientific collaborations. Rev. Econ. Stat. 99 , 565–576 (2017).

Holcombe, A. O. Contributorship, not authorship: use CRediT to indicate who did what. Publications 7 , 48 (2019).

Murray, D. et al. Unsupervised embedding of trajectories captures the latent structure of mobility. Preprint at https://doi.org/10.48550/arXiv.2012.02785 (2020).

Deville, P. et al. Career on the move: geography, stratification, and scientific impact. Sci. Rep. 4 , 4770 (2014).

Edmunds, L. D. et al. Why do women choose or reject careers in academic medicine? A narrative review of empirical evidence. Lancet 388 , 2948–2958 (2016).

Waldinger, F. Peer effects in science: evidence from the dismissal of scientists in Nazi Germany. Rev. Econ. Stud. 79 , 838–861 (2012).

Agrawal, A., McHale, J. & Oettl, A. How stars matter: recruiting and peer effects in evolutionary biology. Res. Policy 46 , 853–867 (2017).

Fiore, S. M. Interdisciplinarity as teamwork: how the science of teams can inform team science. Small Group Res. 39 , 251–277 (2008).

Hvide, H. K. & Jones, B. F. University innovation and the professor’s privilege. Am. Econ. Rev. 108 , 1860–1898 (2018).

Murray, F., Aghion, P., Dewatripont, M., Kolev, J. & Stern, S. Of mice and academics: examining the effect of openness on innovation. Am. Econ. J. Econ. Policy 8 , 212–252 (2016).

Radicchi, F., Fortunato, S. & Castellano, C. Universality of citation distributions: toward an objective measure of scientific impact. Proc. Natl Acad. Sci. USA 105 , 17268–17272 (2008).

Waltman, L., van Eck, N. J. & van Raan, A. F. Universality of citation distributions revisited. J. Am. Soc. Inf. Sci. Technol. 63 , 72–77 (2012).

Barabási, A.-L. & Albert, R. Emergence of scaling in random networks. Science 286 , 509–512 (1999).

de Solla Price, D. A general theory of bibliometric and other cumulative advantage processes. J. Am. Soc. Inf. Sci. 27 , 292–306 (1976).

Cole, S. Age and scientific performance. Am. J. Sociol. 84 , 958–977 (1979).

Ke, Q., Ferrara, E., Radicchi, F. & Flammini, A. Defining and identifying sleeping beauties in science. Proc. Natl Acad. Sci. USA 112 , 7426–7431 (2015).

Bornmann, L., de Moya Anegón, F. & Leydesdorff, L. Do scientific advancements lean on the shoulders of giants? A bibliometric investigation of the Ortega hypothesis. PLoS ONE 5 , e13327 (2010).

Mukherjee, S., Romero, D. M., Jones, B. & Uzzi, B. The nearly universal link between the age of past knowledge and tomorrow’s breakthroughs in science and technology: the hotspot. Sci. Adv. 3 , e1601315 (2017).

Packalen, M. & Bhattacharya, J. NIH funding and the pursuit of edge science. Proc. Natl Acad. Sci. USA 117 , 12011–12016 (2020).

Zeng, A., Fan, Y., Di, Z., Wang, Y. & Havlin, S. Fresh teams are associated with original and multidisciplinary research. Nat. Hum. Behav. 5 , 1314–1322 (2021).

Newman, M. E. The structure of scientific collaboration networks. Proc. Natl Acad. Sci. USA 98 , 404–409 (2001).

Larivière, V., Ni, C., Gingras, Y., Cronin, B. & Sugimoto, C. R. Bibliometrics: global gender disparities in science. Nature 504 , 211–213 (2013).

West, J. D., Jacquet, J., King, M. M., Correll, S. J. & Bergstrom, C. T. The role of gender in scholarly authorship. PLoS ONE 8 , e66212 (2013).

Gao, J., Yin, Y., Myers, K. R., Lakhani, K. R. & Wang, D. Potentially long-lasting effects of the pandemic on scientists. Nat. Commun. 12 , 6188 (2021).

Jones, B. F., Wuchty, S. & Uzzi, B. Multi-university research teams: shifting impact, geography, and stratification in science. Science 322 , 1259–1262 (2008).

Chu, J. S. & Evans, J. A. Slowed canonical progress in large fields of science. Proc. Natl Acad. Sci. USA 118 , e2021636118 (2021).

Wang, J., Veugelers, R. & Stephan, P. Bias against novelty in science: a cautionary tale for users of bibliometric indicators. Res. Policy 46 , 1416–1436 (2017).

Stringer, M. J., Sales-Pardo, M. & Amaral, L. A. Statistical validation of a global model for the distribution of the ultimate number of citations accrued by papers published in a scientific journal. J. Assoc. Inf. Sci. Technol. 61 , 1377–1385 (2010).

Bianconi, G. & Barabási, A.-L. Bose-Einstein condensation in complex networks. Phys. Rev. Lett. 86 , 5632 (2001).

Bianconi, G. & Barabási, A.-L. Competition and multiscaling in evolving networks. Europhys. Lett. 54 , 436 (2001).

Yin, Y. & Wang, D. The time dimension of science: connecting the past to the future. J. Informetr. 11 , 608–621 (2017).

Pan, R. K., Petersen, A. M., Pammolli, F. & Fortunato, S. The memory of science: Inflation, myopia, and the knowledge network. J. Informetr. 12 , 656–678 (2018).

Yin, Y., Wang, Y., Evans, J. A. & Wang, D. Quantifying the dynamics of failure across science, startups and security. Nature 575 , 190–194 (2019).

Candia, C. & Uzzi, B. Quantifying the selective forgetting and integration of ideas in science and technology. Am. Psychol. 76 , 1067 (2021).

Milojević, S. Principles of scientific research team formation and evolution. Proc. Natl Acad. Sci. USA 111 , 3984–3989 (2014).

Guimera, R., Uzzi, B., Spiro, J. & Amaral, L. A. N. Team assembly mechanisms determine collaboration network structure and team performance. Science 308 , 697–702 (2005).

Newman, M. E. Coauthorship networks and patterns of scientific collaboration. Proc. Natl Acad. Sci. USA 101 , 5200–5205 (2004).

Newman, M. E. Clustering and preferential attachment in growing networks. Phys. Rev. E 64 , 025102 (2001).

Iacopini, I., Milojević, S. & Latora, V. Network dynamics of innovation processes. Phys. Rev. Lett. 120 , 048301 (2018).

Kuhn, T., Perc, M. & Helbing, D. Inheritance patterns in citation networks reveal scientific memes. Phys. Rev. 4 , 041036 (2014).

Jia, T., Wang, D. & Szymanski, B. K. Quantifying patterns of research-interest evolution. Nat. Hum. Behav. 1 , 0078 (2017).

Zeng, A. et al. Increasing trend of scientists to switch between topics. Nat. Commun. https://doi.org/10.1038/s41467-019-11401-8 (2019).

Siudem, G., Żogała-Siudem, B., Cena, A. & Gagolewski, M. Three dimensions of scientific impact. Proc. Natl Acad. Sci. USA 117 , 13896–13900 (2020).

Petersen, A. M. et al. Reputation and impact in academic careers. Proc. Natl Acad. Sci. USA 111 , 15316–15321 (2014).

Jin, C., Song, C., Bjelland, J., Canright, G. & Wang, D. Emergence of scaling in complex substitutive systems. Nat. Hum. Behav. 3 , 837–846 (2019).

Hofman, J. M. et al. Integrating explanation and prediction in computational social science. Nature 595 , 181–188 (2021).

Lazer, D. et al. Computational social science. Science 323 , 721–723 (2009).

Lazer, D. M. et al. Computational social science: obstacles and opportunities. Science 369 , 1060–1062 (2020).

Albert, R. & Barabási, A.-L. Statistical mechanics of complex networks. Rev. Mod. Phys. 74 , 47 (2002).

Newman, M. E. The structure and function of complex networks. SIAM Rev. 45 , 167–256 (2003).

Song, C., Qu, Z., Blumm, N. & Barabási, A.-L. Limits of predictability in human mobility. Science 327 , 1018–1021 (2010).

Alessandretti, L., Aslak, U. & Lehmann, S. The scales of human mobility. Nature 587 , 402–407 (2020).

Pastor-Satorras, R. & Vespignani, A. Epidemic spreading in scale-free networks. Phys. Rev. Lett. 86 , 3200 (2001).

Pastor-Satorras, R., Castellano, C., Van Mieghem, P. & Vespignani, A. Epidemic processes in complex networks. Rev. Mod. Phys. 87 , 925 (2015).

Goodfellow, I., Bengio, Y. & Courville, A. Deep Learning (MIT Press, 2016).

Bishop, C. M. Pattern Recognition and Machine Learning (Springer, 2006).

Dong, Y., Johnson, R. A. & Chawla, N. V. Will this paper increase your h-index? Scientific impact prediction. In Proc. 8th ACM International Conference on Web Search and Data Mining, 149–158 (ACM 2015)

Xiao, S. et al. On modeling and predicting individual paper citation count over time. In IJCAI, 2676–2682 (IJCAI, 2016)

Fortunato, S. Community detection in graphs. Phys. Rep. 486 , 75–174 (2010).

Chen, C. Science mapping: a systematic review of the literature. J. Data Inf. Sci. 2 , 1–40 (2017).

CAS   Google Scholar  

Van Eck, N. J. & Waltman, L. Citation-based clustering of publications using CitNetExplorer and VOSviewer. Scientometrics 111 , 1053–1070 (2017).

LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521 , 436–444 (2015).

Senior, A. W. et al. Improved protein structure prediction using potentials from deep learning. Nature 577 , 706–710 (2020).

Krenn, M. & Zeilinger, A. Predicting research trends with semantic and neural networks with an application in quantum physics. Proc. Natl Acad. Sci. USA 117 , 1910–1916 (2020).

Iten, R., Metger, T., Wilming, H., Del Rio, L. & Renner, R. Discovering physical concepts with neural networks. Phys. Rev. Lett. 124 , 010508 (2020).

Guimerà, R. et al. A Bayesian machine scientist to aid in the solution of challenging scientific problems. Sci. Adv. 6 , eaav6971 (2020).

Segler, M. H., Preuss, M. & Waller, M. P. Planning chemical syntheses with deep neural networks and symbolic AI. Nature 555 , 604–610 (2018).

Ryu, J. Y., Kim, H. U. & Lee, S. Y. Deep learning improves prediction of drug–drug and drug–food interactions. Proc. Natl Acad. Sci. USA 115 , E4304–E4311 (2018).

Kermany, D. S. et al. Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell 172 , 1122–1131.e9 (2018).

Peng, H., Ke, Q., Budak, C., Romero, D. M. & Ahn, Y.-Y. Neural embeddings of scholarly periodicals reveal complex disciplinary organizations. Sci. Adv. 7 , eabb9004 (2021).

Youyou, W., Yang, Y. & Uzzi, B. A discipline-wide investigation of the replicability of psychology papers over the past two decades. Proc. Natl Acad. Sci. USA 120 , e2208863120 (2023).

Mehrabi, N., Morstatter, F., Saxena, N., Lerman, K. & Galstyan, A. A survey on bias and fairness in machine learning. ACM Computing Surveys (CSUR) 54 , 1–35 (2021).

Way, S. F., Morgan, A. C., Larremore, D. B. & Clauset, A. Productivity, prominence, and the effects of academic environment. Proc. Natl Acad. Sci. USA 116 , 10729–10733 (2019).

Li, W., Aste, T., Caccioli, F. & Livan, G. Early coauthorship with top scientists predicts success in academic careers. Nat. Commun. 10 , 5170 (2019).

Hendry, D. F., Pagan, A. R. & Sargan, J. D. Dynamic specification. Handb. Econ. 2 , 1023–1100 (1984).

Jin, C., Ma, Y. & Uzzi, B. Scientific prizes and the extraordinary growth of scientific topics. Nat. Commun. 12 , 5619 (2021).

Azoulay, P., Ganguli, I. & Zivin, J. G. The mobility of elite life scientists: professional and personal determinants. Res. Policy 46 , 573–590 (2017).

Slavova, K., Fosfuri, A. & De Castro, J. O. Learning by hiring: the effects of scientists’ inbound mobility on research performance in academia. Organ. Sci. 27 , 72–89 (2016).

Sarsons, H. Recognition for group work: gender differences in academia. Am. Econ. Rev. 107 , 141–145 (2017).

Campbell, L. G., Mehtani, S., Dozier, M. E. & Rinehart, J. Gender-heterogeneous working groups produce higher quality science. PLoS ONE 8 , e79147 (2013).

Azoulay, P., Graff Zivin, J. S. & Wang, J. Superstar extinction. Q. J. Econ. 125 , 549–589 (2010).

Furman, J. L. & Stern, S. Climbing atop the shoulders of giants: the impact of institutions on cumulative research. Am. Econ. Rev. 101 , 1933–1963 (2011).

Williams, H. L. Intellectual property rights and innovation: evidence from the human genome. J. Polit. Econ. 121 , 1–27 (2013).

Rubin, A. & Rubin, E. Systematic Bias in the Progress of Research. J. Polit. Econ. 129 , 2666–2719 (2021).

Lu, S. F., Jin, G. Z., Uzzi, B. & Jones, B. The retraction penalty: evidence from the Web of Science. Sci. Rep. 3 , 3146 (2013).

Jin, G. Z., Jones, B., Lu, S. F. & Uzzi, B. The reverse Matthew effect: consequences of retraction in scientific teams. Rev. Econ. Stat. 101 , 492–506 (2019).

Azoulay, P., Bonatti, A. & Krieger, J. L. The career effects of scandal: evidence from scientific retractions. Res. Policy 46 , 1552–1569 (2017).

Goodman-Bacon, A. Difference-in-differences with variation in treatment timing. J. Econ. 225 , 254–277 (2021).

Callaway, B. & Sant’Anna, P. H. Difference-in-differences with multiple time periods. J. Econ. 225 , 200–230 (2021).

Hill, R. Searching for Superstars: Research Risk and Talent Discovery in Astronomy Working Paper (Massachusetts Institute of Technology, 2019).

Bagues, M., Sylos-Labini, M. & Zinovyeva, N. Does the gender composition of scientific committees matter? Am. Econ. Rev. 107 , 1207–1238 (2017).

Sampat, B. & Williams, H. L. How do patents affect follow-on innovation? Evidence from the human genome. Am. Econ. Rev. 109 , 203–236 (2019).

Moretti, E. & Wilson, D. J. The effect of state taxes on the geographical location of top earners: evidence from star scientists. Am. Econ. Rev. 107 , 1858–1903 (2017).

Jacob, B. A. & Lefgren, L. The impact of research grant funding on scientific productivity. J. Public Econ. 95 , 1168–1177 (2011).

Li, D. Expertise versus bias in evaluation: evidence from the NIH. Am. Econ. J. Appl. Econ. 9 , 60–92 (2017).

Pearl, J. Causal diagrams for empirical research. Biometrika 82 , 669–688 (1995).

Pearl, J. & Mackenzie, D. The Book of Why: The New Science of Cause and Effect (Basic Books, 2018).

Traag, V. A. Inferring the causal effect of journals on citations. Quant. Sci. Stud. 2 , 496–504 (2021).

Traag, V. & Waltman, L. Causal foundations of bias, disparity and fairness. Preprint at https://doi.org/10.48550/arXiv.2207.13665 (2022).

Imbens, G. W. Potential outcome and directed acyclic graph approaches to causality: relevance for empirical practice in economics. J. Econ. Lit. 58 , 1129–1179 (2020).

Heckman, J. J. & Pinto, R. Causality and Econometrics (National Bureau of Economic Research, 2022).

Aggarwal, I., Woolley, A. W., Chabris, C. F. & Malone, T. W. The impact of cognitive style diversity on implicit learning in teams. Front. Psychol. 10 , 112 (2019).

Balietti, S., Goldstone, R. L. & Helbing, D. Peer review and competition in the Art Exhibition Game. Proc. Natl Acad. Sci. USA 113 , 8414–8419 (2016).

Paulus, F. M., Rademacher, L., Schäfer, T. A. J., Müller-Pinzler, L. & Krach, S. Journal impact factor shapes scientists’ reward signal in the prospect of publication. PLoS ONE 10 , e0142537 (2015).

Williams, W. M. & Ceci, S. J. National hiring experiments reveal 2:1 faculty preference for women on STEM tenure track. Proc. Natl Acad. Sci. USA 112 , 5360–5365 (2015).

Collaboration, O. S. Estimating the reproducibility of psychological science. Science 349 , aac4716 (2015).

Camerer, C. F. et al. Evaluating replicability of laboratory experiments in economics. Science 351 , 1433–1436 (2016).

Camerer, C. F. et al. Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015. Nat. Hum. Behav. 2 , 637–644 (2018).

Duflo, E. & Banerjee, A. Handbook of Field Experiments (Elsevier, 2017).

Tomkins, A., Zhang, M. & Heavlin, W. D. Reviewer bias in single versus double-blind peer review. Proc. Natl Acad. Sci. USA 114 , 12708–12713 (2017).

Blank, R. M. The effects of double-blind versus single-blind reviewing: experimental evidence from the American Economic Review. Am. Econ. Rev. 81 , 1041–1067 (1991).

Boudreau, K. J., Guinan, E. C., Lakhani, K. R. & Riedl, C. Looking across and looking beyond the knowledge frontier: intellectual distance, novelty, and resource allocation in science. Manage. Sci. 62 , 2765–2783 (2016).

Lane, J. et al. When Do Experts Listen to Other Experts? The Role of Negative Information in Expert Evaluations for Novel Projects Working Paper #21-007 (Harvard Business School, 2020).

Teplitskiy, M. et al. Do Experts Listen to Other Experts? Field Experimental Evidence from Scientific Peer Review (Harvard Business School, 2019).

Moss-Racusin, C. A., Dovidio, J. F., Brescoll, V. L., Graham, M. J. & Handelsman, J. Science faculty’s subtle gender biases favor male students. Proc. Natl Acad. Sci. USA 109 , 16474–16479 (2012).

Forscher, P. S., Cox, W. T., Brauer, M. & Devine, P. G. Little race or gender bias in an experiment of initial review of NIH R01 grant proposals. Nat. Hum. Behav. 3 , 257–264 (2019).

Dennehy, T. C. & Dasgupta, N. Female peer mentors early in college increase women’s positive academic experiences and retention in engineering. Proc. Natl Acad. Sci. USA 114 , 5964–5969 (2017).

Azoulay, P. Turn the scientific method on ourselves. Nature 484 , 31–32 (2012).

Download references

Acknowledgements

The authors thank all members of the Center for Science of Science and Innovation (CSSI) for invaluable comments. This work was supported by the Air Force Office of Scientific Research under award number FA9550-19-1-0354, National Science Foundation grant SBE 1829344, and the Alfred P. Sloan Foundation G-2019-12485.

Author information

Authors and affiliations.

Center for Science of Science and Innovation, Northwestern University, Evanston, IL, USA

Lu Liu, Benjamin F. Jones, Brian Uzzi & Dashun Wang

Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL, USA

Kellogg School of Management, Northwestern University, Evanston, IL, USA

College of Information Sciences and Technology, Pennsylvania State University, University Park, PA, USA

National Bureau of Economic Research, Cambridge, MA, USA

Benjamin F. Jones

Brookings Institution, Washington, DC, USA

McCormick School of Engineering, Northwestern University, Evanston, IL, USA

  • Dashun Wang

You can also search for this author in PubMed   Google Scholar

Corresponding author

Correspondence to Dashun Wang .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Peer review

Peer review information.

Nature Human Behaviour thanks Ludo Waltman, Erin Leahey and Sarah Bratt for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article.

Liu, L., Jones, B.F., Uzzi, B. et al. Data, measurement and empirical methods in the science of science. Nat Hum Behav 7 , 1046–1058 (2023). https://doi.org/10.1038/s41562-023-01562-4

Download citation

Received : 30 June 2022

Accepted : 17 February 2023

Published : 01 June 2023

Issue Date : July 2023

DOI : https://doi.org/10.1038/s41562-023-01562-4

Share this article

Anyone you share the following link with will be able to read this content:

Sorry, a shareable link is not currently available for this article.

Provided by the Springer Nature SharedIt content-sharing initiative

This article is cited by

Publication, funding, and experimental data in support of human reference atlas construction and usage.

  • Yongxin Kong
  • Katy Börner

Scientific Data (2024)

Quantifying attrition in science: a cohort-based, longitudinal study of scientists in 38 OECD countries

  • Marek Kwiek
  • Lukasz Szymula

Higher Education (2024)

Rescaling the disruption index reveals the universality of disruption distributions in science

  • Alex J. Yang
  • Hongcun Gong
  • Sanhong Deng

Scientometrics (2024)

Signs of Dysconscious Racism and Xenophobiaism in Knowledge Production and the Formation of Academic Researchers: A National Study

  • Dina Zoe Belluigi

Journal of Academic Ethics (2024)

Scientific Data (2023)

Quick links

  • Explore articles by subject
  • Guide to authors
  • Editorial policies

Sign up for the Nature Briefing newsletter — what matters in science, free to your inbox daily.

science experimental methods

Experimentology

An Open Science Approach to Experimental Psychology Methods

September 13, 2024

As scientists and practitioners, we often want to create generalizable, causal theories of human behavior. As it turns out, experiments—in which we use random assignment to measure a causal effect—are an unreasonably effective tool to help with this task. But how should we go about doing good experiments?

This book provides an introduction to the workflow of the experimental researcher working in psychology or the behavioral sciences more broadly. The organization of the book is sequential, from the planning stages of the research process through design, data gathering, analysis, and reporting. We introduce these concepts via narrative examples from a range of sub-disciplines, including cognitive, developmental, and social psychology. Throughout, we also illustrate the pitfalls that led to the “replication crisis” in psychology.

To help researchers avoid these pitfalls, we advocate for an open-science based approach in which transparency is integral to the entire experimental workflow. We provide readers with guidance for preregistration, project management, data sharing, and reproducible report writing.

The story of this book

Experimental Methods (Psych 251) is the foundational course for incoming graduate students in the Stanford psychology department. The course goal is to orient students to the nuts and bolts of doing behavioral experiments, including how to plan and design a solid experiment and how to avoid common pitfalls regarding design, measurement, and sampling.

Almost all student coursework both before and in graduate school deals with the content of their research, including theories and results in their areas of focus. In contrast, our course is sometimes the only one that deals with the process of research, from big questions about why we do experiments and what it means to make a causal inference, all the way to the tiny details of project organization, like what to name your directories and how to make sure you don’t lose your data in a computer crash.

This observation leads to our book’s title. “Experimentology” is the set of practices, findings, and approaches that enable the construction of robust, precise, and generalizable experiments.

The centerpiece of the Experimental Methods course is a replication project, reflecting a teaching model first described in Frank and Saxe (2012) 1 and later expanded on in Hawkins, Smith et al. (2018). 2 Each student chooses a published experiment in the literature and collects new data on a pre-registered version of the same experimental paradigm, comparing their result to the original publication. Over the course of the quarter, we walk through how to set up a replication experiment, how to pre-register confirmatory analyses, and how to write a reproducible report on the findings. The project teaches concepts like reliability and validity, which allow students to analyze choices that the original experimenters made—often choices that could have been made differently in hindsight!

1  Frank, Michael C, and Rebecca Saxe. 2012. “Teaching Replication.” Perspectives on Psychological Science 7: 595–99.

2  Hawkins, Robert D, Eric N Smith, Carolyn Au, Juan Miguel Arias, Rhia Catapano, Eric Hermann, Martin Keil, et al. 2018. “Improving the Replicability of Psychological Science Through Pedagogy.” Advances in Methods and Practices in Psychological Science 1 (1): 7–18.

At the end of the course, we reap the harvest of these projects. The project presentations are a wonderful demonstration of both how much the students can accomplish in a quarter and also how tricky it can be to reproduce (redo calculations in the original data) and replicate (recover similar results in new data) the published literature. Often our replication success rate for the course hovers just above 50%, an outcome that can be disturbing or distressing for students who assume that the published literature reports the absolute truth.

This book is an attempt to distill some of the lessons of the course (and students’ course projects) into a textbook. We’ll tell the story of the major shifts in psychology that have come about in the last ten years, including both the “replication crisis” and the positive methodological reforms that have resulted from it. Using this story as motivation, we will highlight the importance of transparency during all aspects of the experimental process from planning to dissemination of materials, data, and code.

What this book is and isn’t about

This book is about psychology experiments. These will be typically be short studies conducted online or in a single visit to a lab, often—though certainly not always—with a convenience sample. When we say “experiments” here we mean randomized experiments where some aspect of the participants’ experience is manipulated by the experimenter and then some outcome variable is measured . 3

3  We use bold to indicate the introduction of new technical terms.

The central thesis of the book is that:

Experiments are intended to make maximally unbiased, generalizable, and precise estimates of specific causal effects.

We’ll explore the implications of this thesis for a host of topics, including causal inference, experimental design, measurement, sampling, preregistration, data analysis, and many others.

Because our focus is on experiments, we won’t be talking much about observational designs, survey methods, or qualitative research; these are important tools and appropriate for a whole host of questions, but they aren’t our focus here. We also won’t go into depth about the many fascinating methodological and statistical issues brought up by single-participant case studies, longitudinal research, field studies, or other methodological variants. Many of the concerns we raise are still important for these types of studies, but some of our advice won’t transfer to these less common designs.

Even for students who are working on non-experimental research, we expect that a substantial part of the book content will still be useful, including chapters on replication ( 3  Replication ), ethics ( 4  Ethics ), statistics (chapters 5  Estimation , 6  Inference , 7  Models ), sampling ( 10  Sampling ), project management ( 13  Project management ), and reporting (chapters 14  Writing , 15  Visualization , 16  Meta-analysis ).

In our writing, we presuppose that readers have some background in psychology, at least at an introductory level. In addition, although we introduce a number of statistical topics, readers might find these sections more accessible with an undergraduate statistics course under their belt. Finally, our examples are written in the R statistical programming language, and for chapters on statistics and visualization especially (chapters 5  Estimation , 6  Inference , 7  Models , 15  Visualization , 16  Meta-analysis ), some familiarity with R will be helpful for understanding the code. None of these prerequisites are necessary to read the book, but we offer them so that readers can calibrate their expectations.

How to use this book

The book is organized into five main parts, mirroring the timeline of an experiment: 1) Foundations, 2) Statistics, 3) Planning, 4) Execution, and 5) Reporting. We hope that this organization makes it well-suited for teaching or for use as a reference book. 4

4  If you are an instructor who is planning to adopt the book for a course, you might be interested in our resources for instructors, including sample course schedules, in appendix A — Instructor’s guide .

The book is designed for a course for graduate students or advanced undergraduates, but the material is also suitable for self-study by anyone interested in experimental methods, whether in academic psychology or any other context—in our out of academia—in which behavioral experimentation is relevant. We also hope that some readers will come to particular chapters of the book because of an interest in specific topics like measurement ( 8  Measurement ) or sampling ( 10  Sampling ) and will be able to use those chapters as standalone references. And finally, for those interested in the “replication crisis” and subsequent reforms, chapters 3  Replication , 11  Preregistration , and 13  Project management will be especially interesting.

Ultimately, we want to give you what you need to plan and execute your own study! Instead of enumerating different approaches, we try to provide a single coherent – and often quite opinionated – perspective, using marginal notes and references to give pointers to more advanced materials or alternative approaches. Throughout, we offer:

  • Case studies that illustrate the central concepts of a chapter,
  • Accident reports describing examples where poor research practices led to issues in the literature, and
  • Depth boxes providing simulations, linkages to advanced techniques, or more nuanced discussion.

While case studies are often integral to the chapters, the other boxes can typically be skipped without issue.

We highlight four major cross-cutting themes for the book: transparency , measurement precision , bias reduction , and generalizability . 5

5  Themes are noted using small caps .

  • transparency : For experiments to be reproducible, other researchers need to be able to determine exactly what you did. Thus, every stage of the research process should be guided by a primary concern for transparency. For example, preregistration creates transparency into the researcher’s evolving expectations and thought processes; releasing open materials and analysis scripts creates transparency into the details of the procedure.
  • measurement precision : We want researchers to start planning an experiment by thinking “what causal effect do I want to measure” and to make planning, sampling, design, and analytic choices that maximize the precision of this measurement. A downstream consequence of this mindset is that we move away from a focus on dichotomized inferences about statistical significance and towards analytic and meta-analytic models that focus on continuous effect sizes and confidence intervals.
  • bias reduction : While precision refers to random error in a measurement, measurements also have systematic sources of error that bias them away from the true quantity. In our samples, analyses, experimental designs, and in the literature, we need to think carefully about sources of bias in the quantity being estimated.
  • generalizability : Complex behaviors are rarely universal across all settings and populations, and any given experiment can only hope to cover a small slice of the possible conditions where a behavior of interest takes place. Psychologists must therefore consider the generalizability of their findings at every stage of the process, from stimulus selection and sampling procedures, to analytic methods and reporting.

Throughout the book, we will return to these four themes again and again as we discuss how the decisions made by the experimenter at every stage of design, data gathering, and analysis bear on the inferences that can be made about the results. The introduction of each chapter highlights connections to specific themes.

The software toolkit for this book

We introduce and advocate for an approach to reproducible study planning, analysis, and writing. This approach depends on an ecosystem of open-source software tools, which we introduce in the book’s appendices. 6

6  These appendices are available online at https://experimentology.io but not in the print version of the book, since their content is best viewed in the web format.

  • The R statistical programming language and the RStudio integrated development environment,
  • Version control using git and GitHub , allowing collaboration on text documents like code, prose, and data, storing and integrating contributions over time ( appendix B — Git and GitHub ),
  • The RMarkdown and Quarto tools for creating reproducible reports that can be rendered to a variety of formats ( appendix C — R Markdown and Quarto ),
  • The tidyverse family of R packages, which extend the basic functionality of R with simple tools for data wrangling, analysis, and visualization ( appendix D — Tidyverse ), and
  • The ggplot2 plotting package, which makes it easy to create flexible data visualizations for both confirmatory and exploratory data analyses ( appendix E — ggplot ).

Where appropriate, we provide code boxes that show the specific R code used to create our examples.

Thanks for joining us for Experimentology! Whether you are casually browsing, doing readings for a course, or using the book as a reference in your own experimental work, we hope you find it useful. Throughout, we have tried to practice what we preach in terms of reproducibility, and so the full source code for the book is available at https://github.com/langcog/experimentology . We encourage you to browse, comment, and log issues or suggestions. 7

7  The best way to give us specific feedback is to create an issue on our github page at https://github.com/langcog/experimentology/issues .

Citing this book

Acknowledgments.

Thanks first and foremost to the many generations of students and TAs in Stanford Psych 251, who have collectively influenced the content of this book through their questions and interests.

Thanks to the staff at the MIT Press, especially Philip Laughlin and Amy Brand, for embracing a vision of a completely open web textbook that is also reviewed and published through a traditional press. We appreciate your support and flexibility.

We adapt the Contributor Roles (CRediT) Taxonomy 8 to describe our contributions to this manuscript, and we recommend that you do so in your work as well.

8  Learn more at https://credit.niso.org/ .

  • Primary writer: Michael C. Frank
  • Editor: Tom E. Hardwicke
  • Co-writer: Nicholas Coles
  • Editor: Nicholas Coles, Tom E. Hardwicke
  • Editor: Maya B. Mathur, Tom E. Hardwicke, Nicholas Coles
  • Primary writer: Rondeline Williams
  • Co-writer: Michael C. Frank
  • Editor: Tom E. Hardwicke, Julie Cachia
  • Co-writer: Maya B. Mathur, Nicholas Coles, Michael C. Frank
  • Editor: Julie Cachia, Tom E. Hardwicke
  • Co-writer: Maya B. Mathur
  • Co-writer: Maya B. Mathur, Michael C. Frank
  • Editor: Tom E. Hardwicke, Robert D. Hawkins
  • Editor: Robert D. Hawkins, Tom E. Hardwicke, Rondeline Williams
  • Editor: Julie Cachia, Tom E. Hardwicke, Maya B. Mathur
  • Primary writer: Tom E. Hardwicke
  • Editor: Michael C. Frank
  • Co-writer: Rondeline Williams, Michael C. Frank
  • Primary writer: Robert D. Hawkins
  • Editor: Michael C. Frank, Tom E. Hardwicke, Mika Braginsky
  • Co-writer: Nicholas Coles, Maya B. Mathur
  • Editor: Michael C. Frank, Tom E. Hardwicke
  • Primary writer: Julie Cachia
  • Editor: Julie Cachia
  • Editor: Julie Cachia, Mika Braginsky
  • Developer: Mika Braginsky, Natalie Braginsky
  • Privacy Policy

Research Method

Home » Experimental Design – Types, Methods, Guide

Experimental Design – Types, Methods, Guide

Table of Contents

Experimental Research Design

Experimental Design

Experimental design is a process of planning and conducting scientific experiments to investigate a hypothesis or research question. It involves carefully designing an experiment that can test the hypothesis, and controlling for other variables that may influence the results.

Experimental design typically includes identifying the variables that will be manipulated or measured, defining the sample or population to be studied, selecting an appropriate method of sampling, choosing a method for data collection and analysis, and determining the appropriate statistical tests to use.

Types of Experimental Design

Here are the different types of experimental design:

Completely Randomized Design

In this design, participants are randomly assigned to one of two or more groups, and each group is exposed to a different treatment or condition.

Randomized Block Design

This design involves dividing participants into blocks based on a specific characteristic, such as age or gender, and then randomly assigning participants within each block to one of two or more treatment groups.

Factorial Design

In a factorial design, participants are randomly assigned to one of several groups, each of which receives a different combination of two or more independent variables.

Repeated Measures Design

In this design, each participant is exposed to all of the different treatments or conditions, either in a random order or in a predetermined order.

Crossover Design

This design involves randomly assigning participants to one of two or more treatment groups, with each group receiving one treatment during the first phase of the study and then switching to a different treatment during the second phase.

Split-plot Design

In this design, the researcher manipulates one or more variables at different levels and uses a randomized block design to control for other variables.

Nested Design

This design involves grouping participants within larger units, such as schools or households, and then randomly assigning these units to different treatment groups.

Laboratory Experiment

Laboratory experiments are conducted under controlled conditions, which allows for greater precision and accuracy. However, because laboratory conditions are not always representative of real-world conditions, the results of these experiments may not be generalizable to the population at large.

Field Experiment

Field experiments are conducted in naturalistic settings and allow for more realistic observations. However, because field experiments are not as controlled as laboratory experiments, they may be subject to more sources of error.

Experimental Design Methods

Experimental design methods refer to the techniques and procedures used to design and conduct experiments in scientific research. Here are some common experimental design methods:

Randomization

This involves randomly assigning participants to different groups or treatments to ensure that any observed differences between groups are due to the treatment and not to other factors.

Control Group

The use of a control group is an important experimental design method that involves having a group of participants that do not receive the treatment or intervention being studied. The control group is used as a baseline to compare the effects of the treatment group.

Blinding involves keeping participants, researchers, or both unaware of which treatment group participants are in, in order to reduce the risk of bias in the results.

Counterbalancing

This involves systematically varying the order in which participants receive treatments or interventions in order to control for order effects.

Replication

Replication involves conducting the same experiment with different samples or under different conditions to increase the reliability and validity of the results.

This experimental design method involves manipulating multiple independent variables simultaneously to investigate their combined effects on the dependent variable.

This involves dividing participants into subgroups or blocks based on specific characteristics, such as age or gender, in order to reduce the risk of confounding variables.

Data Collection Method

Experimental design data collection methods are techniques and procedures used to collect data in experimental research. Here are some common experimental design data collection methods:

Direct Observation

This method involves observing and recording the behavior or phenomenon of interest in real time. It may involve the use of structured or unstructured observation, and may be conducted in a laboratory or naturalistic setting.

Self-report Measures

Self-report measures involve asking participants to report their thoughts, feelings, or behaviors using questionnaires, surveys, or interviews. These measures may be administered in person or online.

Behavioral Measures

Behavioral measures involve measuring participants’ behavior directly, such as through reaction time tasks or performance tests. These measures may be administered using specialized equipment or software.

Physiological Measures

Physiological measures involve measuring participants’ physiological responses, such as heart rate, blood pressure, or brain activity, using specialized equipment. These measures may be invasive or non-invasive, and may be administered in a laboratory or clinical setting.

Archival Data

Archival data involves using existing records or data, such as medical records, administrative records, or historical documents, as a source of information. These data may be collected from public or private sources.

Computerized Measures

Computerized measures involve using software or computer programs to collect data on participants’ behavior or responses. These measures may include reaction time tasks, cognitive tests, or other types of computer-based assessments.

Video Recording

Video recording involves recording participants’ behavior or interactions using cameras or other recording equipment. This method can be used to capture detailed information about participants’ behavior or to analyze social interactions.

Data Analysis Method

Experimental design data analysis methods refer to the statistical techniques and procedures used to analyze data collected in experimental research. Here are some common experimental design data analysis methods:

Descriptive Statistics

Descriptive statistics are used to summarize and describe the data collected in the study. This includes measures such as mean, median, mode, range, and standard deviation.

Inferential Statistics

Inferential statistics are used to make inferences or generalizations about a larger population based on the data collected in the study. This includes hypothesis testing and estimation.

Analysis of Variance (ANOVA)

ANOVA is a statistical technique used to compare means across two or more groups in order to determine whether there are significant differences between the groups. There are several types of ANOVA, including one-way ANOVA, two-way ANOVA, and repeated measures ANOVA.

Regression Analysis

Regression analysis is used to model the relationship between two or more variables in order to determine the strength and direction of the relationship. There are several types of regression analysis, including linear regression, logistic regression, and multiple regression.

Factor Analysis

Factor analysis is used to identify underlying factors or dimensions in a set of variables. This can be used to reduce the complexity of the data and identify patterns in the data.

Structural Equation Modeling (SEM)

SEM is a statistical technique used to model complex relationships between variables. It can be used to test complex theories and models of causality.

Cluster Analysis

Cluster analysis is used to group similar cases or observations together based on similarities or differences in their characteristics.

Time Series Analysis

Time series analysis is used to analyze data collected over time in order to identify trends, patterns, or changes in the data.

Multilevel Modeling

Multilevel modeling is used to analyze data that is nested within multiple levels, such as students nested within schools or employees nested within companies.

Applications of Experimental Design 

Experimental design is a versatile research methodology that can be applied in many fields. Here are some applications of experimental design:

  • Medical Research: Experimental design is commonly used to test new treatments or medications for various medical conditions. This includes clinical trials to evaluate the safety and effectiveness of new drugs or medical devices.
  • Agriculture : Experimental design is used to test new crop varieties, fertilizers, and other agricultural practices. This includes randomized field trials to evaluate the effects of different treatments on crop yield, quality, and pest resistance.
  • Environmental science: Experimental design is used to study the effects of environmental factors, such as pollution or climate change, on ecosystems and wildlife. This includes controlled experiments to study the effects of pollutants on plant growth or animal behavior.
  • Psychology : Experimental design is used to study human behavior and cognitive processes. This includes experiments to test the effects of different interventions, such as therapy or medication, on mental health outcomes.
  • Engineering : Experimental design is used to test new materials, designs, and manufacturing processes in engineering applications. This includes laboratory experiments to test the strength and durability of new materials, or field experiments to test the performance of new technologies.
  • Education : Experimental design is used to evaluate the effectiveness of teaching methods, educational interventions, and programs. This includes randomized controlled trials to compare different teaching methods or evaluate the impact of educational programs on student outcomes.
  • Marketing : Experimental design is used to test the effectiveness of marketing campaigns, pricing strategies, and product designs. This includes experiments to test the impact of different marketing messages or pricing schemes on consumer behavior.

Examples of Experimental Design 

Here are some examples of experimental design in different fields:

  • Example in Medical research : A study that investigates the effectiveness of a new drug treatment for a particular condition. Patients are randomly assigned to either a treatment group or a control group, with the treatment group receiving the new drug and the control group receiving a placebo. The outcomes, such as improvement in symptoms or side effects, are measured and compared between the two groups.
  • Example in Education research: A study that examines the impact of a new teaching method on student learning outcomes. Students are randomly assigned to either a group that receives the new teaching method or a group that receives the traditional teaching method. Student achievement is measured before and after the intervention, and the results are compared between the two groups.
  • Example in Environmental science: A study that tests the effectiveness of a new method for reducing pollution in a river. Two sections of the river are selected, with one section treated with the new method and the other section left untreated. The water quality is measured before and after the intervention, and the results are compared between the two sections.
  • Example in Marketing research: A study that investigates the impact of a new advertising campaign on consumer behavior. Participants are randomly assigned to either a group that is exposed to the new campaign or a group that is not. Their behavior, such as purchasing or product awareness, is measured and compared between the two groups.
  • Example in Social psychology: A study that examines the effect of a new social intervention on reducing prejudice towards a marginalized group. Participants are randomly assigned to either a group that receives the intervention or a control group that does not. Their attitudes and behavior towards the marginalized group are measured before and after the intervention, and the results are compared between the two groups.

When to use Experimental Research Design 

Experimental research design should be used when a researcher wants to establish a cause-and-effect relationship between variables. It is particularly useful when studying the impact of an intervention or treatment on a particular outcome.

Here are some situations where experimental research design may be appropriate:

  • When studying the effects of a new drug or medical treatment: Experimental research design is commonly used in medical research to test the effectiveness and safety of new drugs or medical treatments. By randomly assigning patients to treatment and control groups, researchers can determine whether the treatment is effective in improving health outcomes.
  • When evaluating the effectiveness of an educational intervention: An experimental research design can be used to evaluate the impact of a new teaching method or educational program on student learning outcomes. By randomly assigning students to treatment and control groups, researchers can determine whether the intervention is effective in improving academic performance.
  • When testing the effectiveness of a marketing campaign: An experimental research design can be used to test the effectiveness of different marketing messages or strategies. By randomly assigning participants to treatment and control groups, researchers can determine whether the marketing campaign is effective in changing consumer behavior.
  • When studying the effects of an environmental intervention: Experimental research design can be used to study the impact of environmental interventions, such as pollution reduction programs or conservation efforts. By randomly assigning locations or areas to treatment and control groups, researchers can determine whether the intervention is effective in improving environmental outcomes.
  • When testing the effects of a new technology: An experimental research design can be used to test the effectiveness and safety of new technologies or engineering designs. By randomly assigning participants or locations to treatment and control groups, researchers can determine whether the new technology is effective in achieving its intended purpose.

How to Conduct Experimental Research

Here are the steps to conduct Experimental Research:

  • Identify a Research Question : Start by identifying a research question that you want to answer through the experiment. The question should be clear, specific, and testable.
  • Develop a Hypothesis: Based on your research question, develop a hypothesis that predicts the relationship between the independent and dependent variables. The hypothesis should be clear and testable.
  • Design the Experiment : Determine the type of experimental design you will use, such as a between-subjects design or a within-subjects design. Also, decide on the experimental conditions, such as the number of independent variables, the levels of the independent variable, and the dependent variable to be measured.
  • Select Participants: Select the participants who will take part in the experiment. They should be representative of the population you are interested in studying.
  • Randomly Assign Participants to Groups: If you are using a between-subjects design, randomly assign participants to groups to control for individual differences.
  • Conduct the Experiment : Conduct the experiment by manipulating the independent variable(s) and measuring the dependent variable(s) across the different conditions.
  • Analyze the Data: Analyze the data using appropriate statistical methods to determine if there is a significant effect of the independent variable(s) on the dependent variable(s).
  • Draw Conclusions: Based on the data analysis, draw conclusions about the relationship between the independent and dependent variables. If the results support the hypothesis, then it is accepted. If the results do not support the hypothesis, then it is rejected.
  • Communicate the Results: Finally, communicate the results of the experiment through a research report or presentation. Include the purpose of the study, the methods used, the results obtained, and the conclusions drawn.

Purpose of Experimental Design 

The purpose of experimental design is to control and manipulate one or more independent variables to determine their effect on a dependent variable. Experimental design allows researchers to systematically investigate causal relationships between variables, and to establish cause-and-effect relationships between the independent and dependent variables. Through experimental design, researchers can test hypotheses and make inferences about the population from which the sample was drawn.

Experimental design provides a structured approach to designing and conducting experiments, ensuring that the results are reliable and valid. By carefully controlling for extraneous variables that may affect the outcome of the study, experimental design allows researchers to isolate the effect of the independent variable(s) on the dependent variable(s), and to minimize the influence of other factors that may confound the results.

Experimental design also allows researchers to generalize their findings to the larger population from which the sample was drawn. By randomly selecting participants and using statistical techniques to analyze the data, researchers can make inferences about the larger population with a high degree of confidence.

Overall, the purpose of experimental design is to provide a rigorous, systematic, and scientific method for testing hypotheses and establishing cause-and-effect relationships between variables. Experimental design is a powerful tool for advancing scientific knowledge and informing evidence-based practice in various fields, including psychology, biology, medicine, engineering, and social sciences.

Advantages of Experimental Design 

Experimental design offers several advantages in research. Here are some of the main advantages:

  • Control over extraneous variables: Experimental design allows researchers to control for extraneous variables that may affect the outcome of the study. By manipulating the independent variable and holding all other variables constant, researchers can isolate the effect of the independent variable on the dependent variable.
  • Establishing causality: Experimental design allows researchers to establish causality by manipulating the independent variable and observing its effect on the dependent variable. This allows researchers to determine whether changes in the independent variable cause changes in the dependent variable.
  • Replication : Experimental design allows researchers to replicate their experiments to ensure that the findings are consistent and reliable. Replication is important for establishing the validity and generalizability of the findings.
  • Random assignment: Experimental design often involves randomly assigning participants to conditions. This helps to ensure that individual differences between participants are evenly distributed across conditions, which increases the internal validity of the study.
  • Precision : Experimental design allows researchers to measure variables with precision, which can increase the accuracy and reliability of the data.
  • Generalizability : If the study is well-designed, experimental design can increase the generalizability of the findings. By controlling for extraneous variables and using random assignment, researchers can increase the likelihood that the findings will apply to other populations and contexts.

Limitations of Experimental Design

Experimental design has some limitations that researchers should be aware of. Here are some of the main limitations:

  • Artificiality : Experimental design often involves creating artificial situations that may not reflect real-world situations. This can limit the external validity of the findings, or the extent to which the findings can be generalized to real-world settings.
  • Ethical concerns: Some experimental designs may raise ethical concerns, particularly if they involve manipulating variables that could cause harm to participants or if they involve deception.
  • Participant bias : Participants in experimental studies may modify their behavior in response to the experiment, which can lead to participant bias.
  • Limited generalizability: The conditions of the experiment may not reflect the complexities of real-world situations. As a result, the findings may not be applicable to all populations and contexts.
  • Cost and time : Experimental design can be expensive and time-consuming, particularly if the experiment requires specialized equipment or if the sample size is large.
  • Researcher bias : Researchers may unintentionally bias the results of the experiment if they have expectations or preferences for certain outcomes.
  • Lack of feasibility : Experimental design may not be feasible in some cases, particularly if the research question involves variables that cannot be manipulated or controlled.

About the author

' src=

Muhammad Hassan

Researcher, Academic Writer, Web developer

You may also like

Applied Research

Applied Research – Types, Methods and Examples

Exploratory Research

Exploratory Research – Types, Methods and...

Observational Research

Observational Research – Methods and Guide

Descriptive Research Design

Descriptive Research Design – Types, Methods and...

Quantitative Research

Quantitative Research – Methods, Types and...

Ethnographic Research

Ethnographic Research -Types, Methods and Guide

  • Bipolar Disorder
  • Therapy Center
  • When To See a Therapist
  • Types of Therapy
  • Best Online Therapy
  • Best Couples Therapy
  • Managing Stress
  • Sleep and Dreaming
  • Understanding Emotions
  • Self-Improvement
  • Healthy Relationships
  • Student Resources
  • Personality Types
  • Sweepstakes
  • Guided Meditations
  • Verywell Mind Insights
  • 2024 Verywell Mind 25
  • Mental Health in the Classroom
  • Editorial Process
  • Meet Our Review Board
  • Crisis Support

How the Experimental Method Works in Psychology

sturti/Getty Images

The Experimental Process

Types of experiments, potential pitfalls of the experimental method.

The experimental method is a type of research procedure that involves manipulating variables to determine if there is a cause-and-effect relationship. The results obtained through the experimental method are useful but do not prove with 100% certainty that a singular cause always creates a specific effect. Instead, they show the probability that a cause will or will not lead to a particular effect.

At a Glance

While there are many different research techniques available, the experimental method allows researchers to look at cause-and-effect relationships. Using the experimental method, researchers randomly assign participants to a control or experimental group and manipulate levels of an independent variable. If changes in the independent variable lead to changes in the dependent variable, it indicates there is likely a causal relationship between them.

What Is the Experimental Method in Psychology?

The experimental method involves manipulating one variable to determine if this causes changes in another variable. This method relies on controlled research methods and random assignment of study subjects to test a hypothesis.

For example, researchers may want to learn how different visual patterns may impact our perception. Or they might wonder whether certain actions can improve memory . Experiments are conducted on many behavioral topics, including:

The scientific method forms the basis of the experimental method. This is a process used to determine the relationship between two variables—in this case, to explain human behavior .

Positivism is also important in the experimental method. It refers to factual knowledge that is obtained through observation, which is considered to be trustworthy.

When using the experimental method, researchers first identify and define key variables. Then they formulate a hypothesis, manipulate the variables, and collect data on the results. Unrelated or irrelevant variables are carefully controlled to minimize the potential impact on the experiment outcome.

History of the Experimental Method

The idea of using experiments to better understand human psychology began toward the end of the nineteenth century. Wilhelm Wundt established the first formal laboratory in 1879.

Wundt is often called the father of experimental psychology. He believed that experiments could help explain how psychology works, and used this approach to study consciousness .

Wundt coined the term "physiological psychology." This is a hybrid of physiology and psychology, or how the body affects the brain.

Other early contributors to the development and evolution of experimental psychology as we know it today include:

  • Gustav Fechner (1801-1887), who helped develop procedures for measuring sensations according to the size of the stimulus
  • Hermann von Helmholtz (1821-1894), who analyzed philosophical assumptions through research in an attempt to arrive at scientific conclusions
  • Franz Brentano (1838-1917), who called for a combination of first-person and third-person research methods when studying psychology
  • Georg Elias Müller (1850-1934), who performed an early experiment on attitude which involved the sensory discrimination of weights and revealed how anticipation can affect this discrimination

Key Terms to Know

To understand how the experimental method works, it is important to know some key terms.

Dependent Variable

The dependent variable is the effect that the experimenter is measuring. If a researcher was investigating how sleep influences test scores, for example, the test scores would be the dependent variable.

Independent Variable

The independent variable is the variable that the experimenter manipulates. In the previous example, the amount of sleep an individual gets would be the independent variable.

A hypothesis is a tentative statement or a guess about the possible relationship between two or more variables. In looking at how sleep influences test scores, the researcher might hypothesize that people who get more sleep will perform better on a math test the following day. The purpose of the experiment, then, is to either support or reject this hypothesis.

Operational definitions are necessary when performing an experiment. When we say that something is an independent or dependent variable, we must have a very clear and specific definition of the meaning and scope of that variable.

Extraneous Variables

Extraneous variables are other variables that may also affect the outcome of an experiment. Types of extraneous variables include participant variables, situational variables, demand characteristics, and experimenter effects. In some cases, researchers can take steps to control for extraneous variables.

Demand Characteristics

Demand characteristics are subtle hints that indicate what an experimenter is hoping to find in a psychology experiment. This can sometimes cause participants to alter their behavior, which can affect the results of the experiment.

Intervening Variables

Intervening variables are factors that can affect the relationship between two other variables. 

Confounding Variables

Confounding variables are variables that can affect the dependent variable, but that experimenters cannot control for. Confounding variables can make it difficult to determine if the effect was due to changes in the independent variable or if the confounding variable may have played a role.

Psychologists, like other scientists, use the scientific method when conducting an experiment. The scientific method is a set of procedures and principles that guide how scientists develop research questions, collect data, and come to conclusions.

The five basic steps of the experimental process are:

  • Identifying a problem to study
  • Devising the research protocol
  • Conducting the experiment
  • Analyzing the data collected
  • Sharing the findings (usually in writing or via presentation)

Most psychology students are expected to use the experimental method at some point in their academic careers. Learning how to conduct an experiment is important to understanding how psychologists prove and disprove theories in this field.

There are a few different types of experiments that researchers might use when studying psychology. Each has pros and cons depending on the participants being studied, the hypothesis, and the resources available to conduct the research.

Lab Experiments

Lab experiments are common in psychology because they allow experimenters more control over the variables. These experiments can also be easier for other researchers to replicate. The drawback of this research type is that what takes place in a lab is not always what takes place in the real world.

Field Experiments

Sometimes researchers opt to conduct their experiments in the field. For example, a social psychologist interested in researching prosocial behavior might have a person pretend to faint and observe how long it takes onlookers to respond.

This type of experiment can be a great way to see behavioral responses in realistic settings. But it is more difficult for researchers to control the many variables existing in these settings that could potentially influence the experiment's results.

Quasi-Experiments

While lab experiments are known as true experiments, researchers can also utilize a quasi-experiment. Quasi-experiments are often referred to as natural experiments because the researchers do not have true control over the independent variable.

A researcher looking at personality differences and birth order, for example, is not able to manipulate the independent variable in the situation (personality traits). Participants also cannot be randomly assigned because they naturally fall into pre-existing groups based on their birth order.

So why would a researcher use a quasi-experiment? This is a good choice in situations where scientists are interested in studying phenomena in natural, real-world settings. It's also beneficial if there are limits on research funds or time.

Field experiments can be either quasi-experiments or true experiments.

Examples of the Experimental Method in Use

The experimental method can provide insight into human thoughts and behaviors, Researchers use experiments to study many aspects of psychology.

A 2019 study investigated whether splitting attention between electronic devices and classroom lectures had an effect on college students' learning abilities. It found that dividing attention between these two mediums did not affect lecture comprehension. However, it did impact long-term retention of the lecture information, which affected students' exam performance.

An experiment used participants' eye movements and electroencephalogram (EEG) data to better understand cognitive processing differences between experts and novices. It found that experts had higher power in their theta brain waves than novices, suggesting that they also had a higher cognitive load.

A study looked at whether chatting online with a computer via a chatbot changed the positive effects of emotional disclosure often received when talking with an actual human. It found that the effects were the same in both cases.

One experimental study evaluated whether exercise timing impacts information recall. It found that engaging in exercise prior to performing a memory task helped improve participants' short-term memory abilities.

Sometimes researchers use the experimental method to get a bigger-picture view of psychological behaviors and impacts. For example, one 2018 study examined several lab experiments to learn more about the impact of various environmental factors on building occupant perceptions.

A 2020 study set out to determine the role that sensation-seeking plays in political violence. This research found that sensation-seeking individuals have a higher propensity for engaging in political violence. It also found that providing access to a more peaceful, yet still exciting political group helps reduce this effect.

While the experimental method can be a valuable tool for learning more about psychology and its impacts, it also comes with a few pitfalls.

Experiments may produce artificial results, which are difficult to apply to real-world situations. Similarly, researcher bias can impact the data collected. Results may not be able to be reproduced, meaning the results have low reliability .

Since humans are unpredictable and their behavior can be subjective, it can be hard to measure responses in an experiment. In addition, political pressure may alter the results. The subjects may not be a good representation of the population, or groups used may not be comparable.

And finally, since researchers are human too, results may be degraded due to human error.

What This Means For You

Every psychological research method has its pros and cons. The experimental method can help establish cause and effect, and it's also beneficial when research funds are limited or time is of the essence.

At the same time, it's essential to be aware of this method's pitfalls, such as how biases can affect the results or the potential for low reliability. Keeping these in mind can help you review and assess research studies more accurately, giving you a better idea of whether the results can be trusted or have limitations.

Colorado State University. Experimental and quasi-experimental research .

American Psychological Association. Experimental psychology studies human and animals .

Mayrhofer R, Kuhbandner C, Lindner C. The practice of experimental psychology: An inevitably postmodern endeavor . Front Psychol . 2021;11:612805. doi:10.3389/fpsyg.2020.612805

Mandler G. A History of Modern Experimental Psychology .

Stanford University. Wilhelm Maximilian Wundt . Stanford Encyclopedia of Philosophy.

Britannica. Gustav Fechner .

Britannica. Hermann von Helmholtz .

Meyer A, Hackert B, Weger U. Franz Brentano and the beginning of experimental psychology: implications for the study of psychological phenomena today . Psychol Res . 2018;82:245-254. doi:10.1007/s00426-016-0825-7

Britannica. Georg Elias Müller .

McCambridge J, de Bruin M, Witton J.  The effects of demand characteristics on research participant behaviours in non-laboratory settings: A systematic review .  PLoS ONE . 2012;7(6):e39116. doi:10.1371/journal.pone.0039116

Laboratory experiments . In: The Sage Encyclopedia of Communication Research Methods. Allen M, ed. SAGE Publications, Inc. doi:10.4135/9781483381411.n287

Schweizer M, Braun B, Milstone A. Research methods in healthcare epidemiology and antimicrobial stewardship — quasi-experimental designs . Infect Control Hosp Epidemiol . 2016;37(10):1135-1140. doi:10.1017/ice.2016.117

Glass A, Kang M. Dividing attention in the classroom reduces exam performance . Educ Psychol . 2019;39(3):395-408. doi:10.1080/01443410.2018.1489046

Keskin M, Ooms K, Dogru AO, De Maeyer P. Exploring the cognitive load of expert and novice map users using EEG and eye tracking . ISPRS Int J Geo-Inf . 2020;9(7):429. doi:10.3390.ijgi9070429

Ho A, Hancock J, Miner A. Psychological, relational, and emotional effects of self-disclosure after conversations with a chatbot . J Commun . 2018;68(4):712-733. doi:10.1093/joc/jqy026

Haynes IV J, Frith E, Sng E, Loprinzi P. Experimental effects of acute exercise on episodic memory function: Considerations for the timing of exercise . Psychol Rep . 2018;122(5):1744-1754. doi:10.1177/0033294118786688

Torresin S, Pernigotto G, Cappelletti F, Gasparella A. Combined effects of environmental factors on human perception and objective performance: A review of experimental laboratory works . Indoor Air . 2018;28(4):525-538. doi:10.1111/ina.12457

Schumpe BM, Belanger JJ, Moyano M, Nisa CF. The role of sensation seeking in political violence: An extension of the significance quest theory . J Personal Social Psychol . 2020;118(4):743-761. doi:10.1037/pspp0000223

By Kendra Cherry, MSEd Kendra Cherry, MS, is a psychosocial rehabilitation specialist, psychology educator, and author of the "Everything Psychology Book."

  • Grades 6-12
  • School Leaders

Get 50% off your first box of Home Chef! 🥙

72 Easy Science Experiments Using Materials You Already Have On Hand

Because science doesn’t have to be complicated.

Easy science experiments including a "naked" egg and "leakproof" bag

If there is one thing that is guaranteed to get your students excited, it’s a good science experiment! While some experiments require expensive lab equipment or dangerous chemicals, there are plenty of cool projects you can do with regular household items. We’ve rounded up a big collection of easy science experiments that anybody can try, and kids are going to love them!

Easy Chemistry Science Experiments

Easy physics science experiments, easy biology and environmental science experiments, easy engineering experiments and stem challenges.

Skittles form a circle around a plate. The colors are bleeding toward the center of the plate. (easy science experiments)

1. Taste the Rainbow

Teach your students about diffusion while creating a beautiful and tasty rainbow! Tip: Have extra Skittles on hand so your class can eat a few!

Learn more: Skittles Diffusion

Colorful rock candy on wooden sticks

2. Crystallize sweet treats

Crystal science experiments teach kids about supersaturated solutions. This one is easy to do at home, and the results are absolutely delicious!

Learn more: Candy Crystals

3. Make a volcano erupt

This classic experiment demonstrates a chemical reaction between baking soda (sodium bicarbonate) and vinegar (acetic acid), which produces carbon dioxide gas, water, and sodium acetate.

Learn more: Best Volcano Experiments

4. Make elephant toothpaste

This fun project uses yeast and a hydrogen peroxide solution to create overflowing “elephant toothpaste.” Tip: Add an extra fun layer by having kids create toothpaste wrappers for plastic bottles.

Girl making an enormous bubble with string and wire

5. Blow the biggest bubbles you can

Add a few simple ingredients to dish soap solution to create the largest bubbles you’ve ever seen! Kids learn about surface tension as they engineer these bubble-blowing wands.

Learn more: Giant Soap Bubbles

Plastic bag full of water with pencils stuck through it

6. Demonstrate the “magic” leakproof bag

All you need is a zip-top plastic bag, sharp pencils, and water to blow your kids’ minds. Once they’re suitably impressed, teach them how the “trick” works by explaining the chemistry of polymers.

Learn more: Leakproof Bag

Several apple slices are shown on a clear plate. There are cards that label what they have been immersed in (including salt water, sugar water, etc.) (easy science experiments)

7. Use apple slices to learn about oxidation

Have students make predictions about what will happen to apple slices when immersed in different liquids, then put those predictions to the test. Have them record their observations.

Learn more: Apple Oxidation

8. Float a marker man

Their eyes will pop out of their heads when you “levitate” a stick figure right off the table! This experiment works due to the insolubility of dry-erase marker ink in water, combined with the lighter density of the ink.

Learn more: Floating Marker Man

Mason jars stacked with their mouths together, with one color of water on the bottom and another color on top

9. Discover density with hot and cold water

There are a lot of easy science experiments you can do with density. This one is extremely simple, involving only hot and cold water and food coloring, but the visuals make it appealing and fun.

Learn more: Layered Water

Clear cylinder layered with various liquids in different colors

10. Layer more liquids

This density demo is a little more complicated, but the effects are spectacular. Slowly layer liquids like honey, dish soap, water, and rubbing alcohol in a glass. Kids will be amazed when the liquids float one on top of the other like magic (except it is really science).

Learn more: Layered Liquids

Giant carbon snake growing out of a tin pan full of sand

11. Grow a carbon sugar snake

Easy science experiments can still have impressive results! This eye-popping chemical reaction demonstration only requires simple supplies like sugar, baking soda, and sand.

Learn more: Carbon Sugar Snake

12. Mix up some slime

Tell kids you’re going to make slime at home, and watch their eyes light up! There are a variety of ways to make slime, so try a few different recipes to find the one you like best.

Two children are shown (without faces) bouncing balls on a white table

13. Make homemade bouncy balls

These homemade bouncy balls are easy to make since all you need is glue, food coloring, borax powder, cornstarch, and warm water. You’ll want to store them inside a container like a plastic egg because they will flatten out over time.

Learn more: Make Your Own Bouncy Balls

Pink sidewalk chalk stick sitting on a paper towel

14. Create eggshell chalk

Eggshells contain calcium, the same material that makes chalk. Grind them up and mix them with flour, water, and food coloring to make your very own sidewalk chalk.

Learn more: Eggshell Chalk

Science student holding a raw egg without a shell

15. Make naked eggs

This is so cool! Use vinegar to dissolve the calcium carbonate in an eggshell to discover the membrane underneath that holds the egg together. Then, use the “naked” egg for another easy science experiment that demonstrates osmosis .

Learn more: Naked Egg Experiment

16. Turn milk into plastic

This sounds a lot more complicated than it is, but don’t be afraid to give it a try. Use simple kitchen supplies to create plastic polymers from plain old milk. Sculpt them into cool shapes when you’re done!

Student using a series of test tubes filled with pink liquid

17. Test pH using cabbage

Teach kids about acids and bases without needing pH test strips! Simply boil some red cabbage and use the resulting water to test various substances—acids turn red and bases turn green.

Learn more: Cabbage pH

Pennies in small cups of liquid labeled coca cola, vinegar + salt, apple juice, water, catsup, and vinegar. Text reads Cleaning Coins Science Experiment. Step by step procedure and explanation.

18. Clean some old coins

Use common household items to make old oxidized coins clean and shiny again in this simple chemistry experiment. Ask kids to predict (hypothesize) which will work best, then expand the learning by doing some research to explain the results.

Learn more: Cleaning Coins

Glass bottle with bowl holding three eggs, small glass with matches sitting on a box of matches, and a yellow plastic straw, against a blue background

19. Pull an egg into a bottle

This classic easy science experiment never fails to delight. Use the power of air pressure to suck a hard-boiled egg into a jar, no hands required.

Learn more: Egg in a Bottle

20. Blow up a balloon (without blowing)

Chances are good you probably did easy science experiments like this when you were in school. The baking soda and vinegar balloon experiment demonstrates the reactions between acids and bases when you fill a bottle with vinegar and a balloon with baking soda.

21 Assemble a DIY lava lamp

This 1970s trend is back—as an easy science experiment! This activity combines acid-base reactions with density for a totally groovy result.

Four colored cups containing different liquids, with an egg in each

22. Explore how sugary drinks affect teeth

The calcium content of eggshells makes them a great stand-in for teeth. Use eggs to explore how soda and juice can stain teeth and wear down the enamel. Expand your learning by trying different toothpaste-and-toothbrush combinations to see how effective they are.

Learn more: Sugar and Teeth Experiment

23. Mummify a hot dog

If your kids are fascinated by the Egyptians, they’ll love learning to mummify a hot dog! No need for canopic jars , just grab some baking soda and get started.

24. Extinguish flames with carbon dioxide

This is a fiery twist on acid-base experiments. Light a candle and talk about what fire needs in order to survive. Then, create an acid-base reaction and “pour” the carbon dioxide to extinguish the flame. The CO2 gas acts like a liquid, suffocating the fire.

I Love You written in lemon juice on a piece of white paper, with lemon half and cotton swabs

25. Send secret messages with invisible ink

Turn your kids into secret agents! Write messages with a paintbrush dipped in lemon juice, then hold the paper over a heat source and watch the invisible become visible as oxidation goes to work.

Learn more: Invisible Ink

26. Create dancing popcorn

This is a fun version of the classic baking soda and vinegar experiment, perfect for the younger crowd. The bubbly mixture causes popcorn to dance around in the water.

Students looking surprised as foamy liquid shoots up out of diet soda bottles

27. Shoot a soda geyser sky-high

You’ve always wondered if this really works, so it’s time to find out for yourself! Kids will marvel at the chemical reaction that sends diet soda shooting high in the air when Mentos are added.

Learn more: Soda Explosion

Empty tea bags burning into ashes

28. Send a teabag flying

Hot air rises, and this experiment can prove it! You’ll want to supervise kids with fire, of course. For more safety, try this one outside.

Learn more: Flying Tea Bags

Magic Milk Experiment How to Plus Free Worksheet

29. Create magic milk

This fun and easy science experiment demonstrates principles related to surface tension, molecular interactions, and fluid dynamics.

Learn more: Magic Milk Experiment

Two side-by-side shots of an upside-down glass over a candle in a bowl of water, with water pulled up into the glass in the second picture

30. Watch the water rise

Learn about Charles’s Law with this simple experiment. As the candle burns, using up oxygen and heating the air in the glass, the water rises as if by magic.

Learn more: Rising Water

Glasses filled with colored water, with paper towels running from one to the next

31. Learn about capillary action

Kids will be amazed as they watch the colored water move from glass to glass, and you’ll love the easy and inexpensive setup. Gather some water, paper towels, and food coloring to teach the scientific magic of capillary action.

Learn more: Capillary Action

A pink balloon has a face drawn on it. It is hovering over a plate with salt and pepper on it

32. Give a balloon a beard

Equally educational and fun, this experiment will teach kids about static electricity using everyday materials. Kids will undoubtedly get a kick out of creating beards on their balloon person!

Learn more: Static Electricity

DIY compass made from a needle floating in water

33. Find your way with a DIY compass

Here’s an old classic that never fails to impress. Magnetize a needle, float it on the water’s surface, and it will always point north.

Learn more: DIY Compass

34. Crush a can using air pressure

Sure, it’s easy to crush a soda can with your bare hands, but what if you could do it without touching it at all? That’s the power of air pressure!

A large piece of cardboard has a white circle in the center with a pencil standing upright in the middle of the circle. Rocks are on all four corners holding it down.

35. Tell time using the sun

While people use clocks or even phones to tell time today, there was a time when a sundial was the best means to do that. Kids will certainly get a kick out of creating their own sundials using everyday materials like cardboard and pencils.

Learn more: Make Your Own Sundial

36. Launch a balloon rocket

Grab balloons, string, straws, and tape, and launch rockets to learn about the laws of motion.

Steel wool sitting in an aluminum tray. The steel wool appears to be on fire.

37. Make sparks with steel wool

All you need is steel wool and a 9-volt battery to perform this science demo that’s bound to make their eyes light up! Kids learn about chain reactions, chemical changes, and more.

Learn more: Steel Wool Electricity

38. Levitate a Ping-Pong ball

Kids will get a kick out of this experiment, which is really all about Bernoulli’s principle. You only need plastic bottles, bendy straws, and Ping-Pong balls to make the science magic happen.

Colored water in a vortex in a plastic bottle

39. Whip up a tornado in a bottle

There are plenty of versions of this classic experiment out there, but we love this one because it sparkles! Kids learn about a vortex and what it takes to create one.

Learn more: Tornado in a Bottle

Homemade barometer using a tin can, rubber band, and ruler

40. Monitor air pressure with a DIY barometer

This simple but effective DIY science project teaches kids about air pressure and meteorology. They’ll have fun tracking and predicting the weather with their very own barometer.

Learn more: DIY Barometer

A child holds up a pice of ice to their eye as if it is a magnifying glass. (easy science experiments)

41. Peer through an ice magnifying glass

Students will certainly get a thrill out of seeing how an everyday object like a piece of ice can be used as a magnifying glass. Be sure to use purified or distilled water since tap water will have impurities in it that will cause distortion.

Learn more: Ice Magnifying Glass

Piece of twine stuck to an ice cube

42. String up some sticky ice

Can you lift an ice cube using just a piece of string? This quick experiment teaches you how. Use a little salt to melt the ice and then refreeze the ice with the string attached.

Learn more: Sticky Ice

Drawing of a hand with the thumb up and a glass of water

43. “Flip” a drawing with water

Light refraction causes some really cool effects, and there are multiple easy science experiments you can do with it. This one uses refraction to “flip” a drawing; you can also try the famous “disappearing penny” trick .

Learn more: Light Refraction With Water

44. Color some flowers

We love how simple this project is to re-create since all you’ll need are some white carnations, food coloring, glasses, and water. The end result is just so beautiful!

Square dish filled with water and glitter, showing how a drop of dish soap repels the glitter

45. Use glitter to fight germs

Everyone knows that glitter is just like germs—it gets everywhere and is so hard to get rid of! Use that to your advantage and show kids how soap fights glitter and germs.

Learn more: Glitter Germs

Plastic bag with clouds and sun drawn on it, with a small amount of blue liquid at the bottom

46. Re-create the water cycle in a bag

You can do so many easy science experiments with a simple zip-top bag. Fill one partway with water and set it on a sunny windowsill to see how the water evaporates up and eventually “rains” down.

Learn more: Water Cycle

Plastic zipper bag tied around leaves on a tree

47. Learn about plant transpiration

Your backyard is a terrific place for easy science experiments. Grab a plastic bag and rubber band to learn how plants get rid of excess water they don’t need, a process known as transpiration.

Learn more: Plant Transpiration

Students sit around a table that has a tin pan filled with blue liquid wiht a feather floating in it (easy science experiments)

48. Clean up an oil spill

Before conducting this experiment, teach your students about engineers who solve environmental problems like oil spills. Then, have your students use provided materials to clean the oil spill from their oceans.

Learn more: Oil Spill

Sixth grade student holding model lungs and diaphragm made from a plastic bottle, duct tape, and balloons

49. Construct a pair of model lungs

Kids get a better understanding of the respiratory system when they build model lungs using a plastic water bottle and some balloons. You can modify the experiment to demonstrate the effects of smoking too.

Learn more: Model Lungs

Child pouring vinegar over a large rock in a bowl

50. Experiment with limestone rocks

Kids  love to collect rocks, and there are plenty of easy science experiments you can do with them. In this one, pour vinegar over a rock to see if it bubbles. If it does, you’ve found limestone!

Learn more: Limestone Experiments

Plastic bottle converted to a homemade rain gauge

51. Turn a bottle into a rain gauge

All you need is a plastic bottle, a ruler, and a permanent marker to make your own rain gauge. Monitor your measurements and see how they stack up against meteorology reports in your area.

Learn more: DIY Rain Gauge

Pile of different colored towels pushed together to create folds like mountains

52. Build up towel mountains

This clever demonstration helps kids understand how some landforms are created. Use layers of towels to represent rock layers and boxes for continents. Then pu-u-u-sh and see what happens!

Learn more: Towel Mountains

Layers of differently colored playdough with straw holes punched throughout all the layers

53. Take a play dough core sample

Learn about the layers of the earth by building them out of Play-Doh, then take a core sample with a straw. ( Love Play-Doh? Get more learning ideas here. )

Learn more: Play Dough Core Sampling

Science student poking holes in the bottom of a paper cup in the shape of a constellation

54. Project the stars on your ceiling

Use the video lesson in the link below to learn why stars are only visible at night. Then create a DIY star projector to explore the concept hands-on.

Learn more: DIY Star Projector

Glass jar of water with shaving cream floating on top, with blue food coloring dripping through, next to a can of shaving cream

55. Make it rain

Use shaving cream and food coloring to simulate clouds and rain. This is an easy science experiment little ones will beg to do over and over.

Learn more: Shaving Cream Rain

56. Blow up your fingerprint

This is such a cool (and easy!) way to look at fingerprint patterns. Inflate a balloon a bit, use some ink to put a fingerprint on it, then blow it up big to see your fingerprint in detail.

Edible DNA model made with Twizzlers, gumdrops, and toothpicks

57. Snack on a DNA model

Twizzlers, gumdrops, and a few toothpicks are all you need to make this super-fun (and yummy!) DNA model.

Learn more: Edible DNA Model

58. Dissect a flower

Take a nature walk and find a flower or two. Then bring them home and take them apart to discover all the different parts of flowers.

DIY smartphone amplifier made from paper cups

59. Craft smartphone speakers

No Bluetooth speaker? No problem! Put together your own from paper cups and toilet paper tubes.

Learn more: Smartphone Speakers

Car made from cardboard with bottlecap wheels and powered by a blue balloon

60. Race a balloon-powered car

Kids will be amazed when they learn they can put together this awesome racer using cardboard and bottle-cap wheels. The balloon-powered “engine” is so much fun too.

Learn more: Balloon-Powered Car

Miniature Ferris Wheel built out of colorful wood craft sticks

61. Build a Ferris wheel

You’ve probably ridden on a Ferris wheel, but can you build one? Stock up on wood craft sticks and find out! Play around with different designs to see which one works best.

Learn more: Craft Stick Ferris Wheel

62. Design a phone stand

There are lots of ways to craft a DIY phone stand, which makes this a perfect creative-thinking STEM challenge.

63. Conduct an egg drop

Put all their engineering skills to the test with an egg drop! Challenge kids to build a container from stuff they find around the house that will protect an egg from a long fall (this is especially fun to do from upper-story windows).

Learn more: Egg Drop Challenge Ideas

Student building a roller coaster of drinking straws for a ping pong ball (Fourth Grade Science)

64. Engineer a drinking-straw roller coaster

STEM challenges are always a hit with kids. We love this one, which only requires basic supplies like drinking straws.

Learn more: Straw Roller Coaster

Outside Science Solar Oven Desert Chica

65. Build a solar oven

Explore the power of the sun when you build your own solar ovens and use them to cook some yummy treats. This experiment takes a little more time and effort, but the results are always impressive. The link below has complete instructions.

Learn more: Solar Oven

Mini Da Vinci bridge made of pencils and rubber bands

66. Build a Da Vinci bridge

There are plenty of bridge-building experiments out there, but this one is unique. It’s inspired by Leonardo da Vinci’s 500-year-old self-supporting wooden bridge. Learn how to build it at the link, and expand your learning by exploring more about Da Vinci himself.

Learn more: Da Vinci Bridge

67. Step through an index card

This is one easy science experiment that never fails to astonish. With carefully placed scissor cuts on an index card, you can make a loop large enough to fit a (small) human body through! Kids will be wowed as they learn about surface area.

Student standing on top of a structure built from cardboard sheets and paper cups

68. Stand on a pile of paper cups

Combine physics and engineering and challenge kids to create a paper cup structure that can support their weight. This is a cool project for aspiring architects.

Learn more: Paper Cup Stack

Child standing on a stepladder dropping a toy attached to a paper parachute

69. Test out parachutes

Gather a variety of materials (try tissues, handkerchiefs, plastic bags, etc.) and see which ones make the best parachutes. You can also find out how they’re affected by windy days or find out which ones work in the rain.

Learn more: Parachute Drop

Students balancing a textbook on top of a pyramid of rolled up newspaper

70. Recycle newspapers into an engineering challenge

It’s amazing how a stack of newspapers can spark such creative engineering. Challenge kids to build a tower, support a book, or even build a chair using only newspaper and tape!

Learn more: Newspaper STEM Challenge

Plastic cup with rubber bands stretched across the opening

71. Use rubber bands to sound out acoustics

Explore the ways that sound waves are affected by what’s around them using a simple rubber band “guitar.” (Kids absolutely love playing with these!)

Learn more: Rubber Band Guitar

Science student pouring water over a cupcake wrapper propped on wood craft sticks

72. Assemble a better umbrella

Challenge students to engineer the best possible umbrella from various household supplies. Encourage them to plan, draw blueprints, and test their creations using the scientific method.

Learn more: Umbrella STEM Challenge

Plus, sign up for our newsletters to get all the latest learning ideas straight to your inbox.

Science doesn't have to be complicated! Try these easy science experiments using items you already have around the house or classroom.

You Might Also Like

Magic Milk Experiment How to Plus Free Worksheet

Magic Milk Experiment: How-To Plus Free Worksheet

This classic experiment teaches kids about basic chemistry and physics. Continue Reading

Copyright © 2024. All rights reserved. 5335 Gate Parkway, Jacksonville, FL 32256

  • Search Menu
  • Sign in through your institution
  • Advance articles
  • Darwin Reviews
  • Special Issues
  • Expert View
  • Flowering Newsletter Reviews
  • Technical Innovations
  • Editor's Choice
  • Virtual Issues
  • Community Resources
  • Reasons to submit
  • Author Guidelines
  • Peer Reviewers
  • Submission Site
  • Open Access
  • About Journal of Experimental Botany
  • About the Society for Experimental Biology
  • Editorial Board
  • Advertising and Corporate Services
  • Journals Career Network
  • Permissions
  • Self-Archiving Policy
  • Dispatch Dates
  • Journal metrics
  • Journals on Oxford Academic
  • Books on Oxford Academic

Issue Cover

Article Contents

Decoding of life: genome analyses and functional characterization, kiss and ride: molecular interactions, femto is no limit: the content of plant tissues, seeing is believing: microscopy as an indispensable technique for studying plants, rewrite ‘whatever’: gene editing, conclusions, acknowledgements, conflict of interest, methods in plant science.

  • Article contents
  • Figures & tables
  • Supplementary Data

Martin Janda, Methods in plant science, Journal of Experimental Botany , Volume 75, Issue 17, 11 September 2024, Pages 5163–5168, https://doi.org/10.1093/jxb/erae328

  • Permissions Icon Permissions

The development of new techniques and technical advances in existing methods are the driving force in any area of research, including experimental botany. Improving our knowledge about the mechanisms of how the ‘world works’ is a necessary first step in creating a new technology. On the other hand, very often, it needs an advance in methods in order to make a new scientific discovery. Research in plant biology is no exception to this, and it often paves the way for the development of new methods or approaches that can have applications across wider subject areas.

The idea to put together this special issue came via the support of the Journal of Experimental Botany for the conference ‘Methods in Plant Sciences 2023’, held in Srní, Czech Republic. The meeting brought together more than 200 plant biologists and focused exclusively on techniques applicable to plant sciences. Several participants of the conference accepted the offer to write reviews about methodology approaches within their fields of expertise, and they represent the core of this issue, whilst other contributors have provided original reseach papers and also additional reviews. The result is that the papers cover a wide range of methods and approaches, from molecular dynamics through -omics and microscopy to monitoring plant–herbivore interactions. The articles can broadly be divided into two categories, namely texts focusing on a single methodological approach, its potential applications, and associated pitfalls, and those focusing on complex biological phenomena that require multiple methodological approaches in order to be properly characterized and understood.

In early studies, the variety of available methods was limited, and often only one technique was used. Thus, the findings were burdened by assumptions and potential shortcomings in the method. For example, in 1671, Grew and Malpigni used a recently developed microscope to describe pollen grains ( Manten, 1969 ): nowadays, basic research in experimental plant biology is not satisfied just with mere description but is focused on gaining an in-depth understanding of the underlying mechanism. In such studies, using one approach is insufficient, and scientists need to integrate multiple available methods—and sometimes improving them—to push our knowledge further. I am convinced that the papers in this special issue provide insightful information about recent advances in a wide spectrum of plant methods and will be useful for a broad audience.

Knowledge of hereditary information is one of the fundamental building blocks in understanding how organisms work, and plant science has benefited greatly in recent years from advances in sequencing technologies. The first plant genome to be fully sequenced was published in 2000, the same year as the first human genome, and it was that of the flowering plant Arabidopsis thaliana ( The Arabidopsis Genome Initiative, 2000 ). The Arabidopsis genome project took approximately 10 years from its beginnings and cost around 100 million USD; nowadays, it costs under 1000 USD to obtain a high-quality Arabidopsis genome, and the results are available within a week ( Li and Harkess, 2018 ). Thus, it is not surprising that more than a thousand Arabidopsis ecotypes have been sequenced ( 1001 Genomes Consortium, 2016 ), and projects aiming at sequencing tens of thousands of plants have become realistic ( https://db.cngb.org/10kp/ ).

Among other uses, such large datasets form the basis of so-called genome-wide association studies (GWAS). GWAS are capable of providing information on the functional involvement of genes within the studied biological process ( Nordborg and Weigel, 2008 ). To perform GWAS effectively, in addition to the availability of a high-throughput screening method, such as seedling growth analysis, it is also necessary to use the best possible algorithm to search for mutations within the collection. In this issue, John et al . (2024) present technical improvements regarding the approach to analysing sequence data using a permutational GWAS method.

Hereditary information is not just stored in the plant cell nucleus, with DNA also being found in mitochondria and plastids. The development of techniques known collectively as next-generation sequencing has in particular made it possible to analyse DNA sequences from these organelles. In their review, Štorchová and Krüger (2024) focus on the analysis of mitochondrial genomes in plants, describing the bottlenecks that stand in the way of obtaining hereditary information from them while highlighting that different next-generation sequencing techniques do not all provide the same results, and that using a combination of them appears to be the most efficient way to obtain high-quality and reliable mitochondrial DNA sequences. Hereditary information and its expression in mitochondria is also the subject of a review in this issue by Kwasniak-Owczarek and Janska (2024) . The authors focus on the translation of genes encoded in semi-autonomous organelles, pointing out that translation is a fundamental regulatory element in the expression of these genes. They also describe techniques that have not yet been applied to plants.

The development of the polymerase chain reaction (PCR) method, for which Kary Mullis was awarded the Nobel Prize in 1993, enabled gene transcription analysis and was a milestone for studying the involvement influence of individual genes on events in living organisms. Normally, samples prepared for transcriptomic analysis consist of the genetic information of many plant cells and cell types. For example, researchers collect RNA from the whole or at least part of the leaf. However, in recent years, attention has been turned to being able to analyse transcription in single cells ( Tang et al. , 2009 ). This technological advance has led to the expansion of analyses at the single-cell level beyond transcriptomics, and techniques are now also being developed for other ‘-omics’ ( Bennett et al ., 2023 ). In this special issue, Tenorio Berrío and Dubois (2024) discuss the potential of single-cell transcriptomics in the context of plant stress responses. Data from single-cell studies, which can discern spatial patterns of cell responses within whole organs, are likely to provide an explanation for the events where homogenizing of the ‘whole leaf’ is too low-resolution, such as transcriptomic patterns within the leaf as a response to the pathogens.

Genome analysis does not end with the knowledge of the nucleotide sequence. The mixture of DNA and proteins that form the chromosomes is folded into a higher structure—chromatin—and its 3D structure has a very significant influence on the subsequent manifestation of hereditary information. Our current knowledge is summarized by Šimková et al. (2024) , and they consider the possibilities and limitations of different approaches to analyse genome-folding into a 3D architecture. They give particular focus on describing proximity ligation Hi-C methods. Detailed knowledge of the chromatin structure of individual chromosomes leads us to the goal of studying the functional wiring of individual chromosomes. In this context, sex chromosomes are an interesting research target, and this topic is reviewed by Hobza et al. (2024) .

Gene expression ends with the creation of the desired molecules, such as enzymes. Enzymes are essential components of metabolic processes that generate further compounds. All the molecules produced are transported to the site of their function, whether as enzymes, structural components, or energy sources. Molecules do not exist in a vacuum and are constantly interacting with other molecules. In this special issue, three reviews examine methods used to study the interactions of specific types of molecules: Cuadrado and van Damme (2024) focus on advances in protein–protein interactions, whilst Neubergerová and Pleskot (2024) and Škrabálková et al. (2024) focus on interactions between proteins and lipids, with each paper looking at the subject from a different perspective. The usage and power of molecular dynamics are described by Neubergerová and Pleskot, and biochemical, ‘wet’ molecular biology and microscopic approaches are presented by Škrabálková et al .

The molecules that influence most physiological phenotypes in plants are phytohormones, also referred to in the past as plant growth regulators. Perhaps surprisingly, phytohormones are also involved in the regulation of epigenetic events in plants: by their interactions with receptors or binding proteins, they are able to cause epigenetic changes. These interactions, their mechanisms of action, and the methods used to study the links between phytohormones and epigenetic regulation are discussed by Rudolf et al . (2024) .

Plants are ingenious chemical engineers. As such, they are able to synthesize a plethora of molecules, and they serve as a source of inspiration for organic chemists. In this special issue, Ćavar Zeljković et al. (2024) present an improved protocol for determining nitrogen-related metabolites, which include amino acids together with biogenic amines and their acetylated and methylated derivates. The method that they present enables the determination of the concentration of a total of 74 metabolites, some of the which have a limit of detection in the lower units of femtomoles per injection. The authors demonstrate the functionality of the protocol in the model dicot plant Arabidopsis thaliana , as well as in a variety of crop species: Solanum lycopersicum and Nicotiana tabacum (C 3 dicots), Zea mays (C 4 monocot) and Hordeum vulgare and Triticum aesetivum (C 3 monocots).

Petrik et al. (2024) focus on the analysis of phytohormones. A detailed knowledge of phytohormone concentrations in different parts of the plant and under different growing and stress conditions is needed to understand their effects. Phytohormones represent diverse types of molecules, and Petrik et al. refer to the art of their analysis as ‘hormonomics’, thereby placing it alongside traditional ‘-omics’ such as transcriptomics, proteomics, and metabolomics. They highlight the boom in phytohormonal analyses as evidenced by the large increase in publications focusing on their determination in recent years. Phytohormones can be analysed by targeted or non-targeted approaches and, in combination with microscopic techniques, it is now becoming possible to consider the analysis of their concentrations not only at organ-level resolution, but to approximate their concentrations at the cellular and even subcellular levels as well ( Petrik et al ., 2024 ), thus approaching that of single-cell transcriptomics as noted above. It is expected that using parallel reaction monitoring for high-resolution mass spectrometry, a sensitivity at 50 amol can be achieved, which makes the future use of this approach in ‘hormonics’ look promising.

Microscopy is a technique without which it is hard to imagine plant research. Its use in plants dates back to 1665 when Robert Hooke published the book Micrographia . From this early origin, the technique expanded massively, and today it is divided into many branches, such as bright-field, fluorescence, confocal, and electron microscopy. Each has its distinct limits in resolution and its own (dis)advantages. This special issue includes two original research articles that use microscopy as their main method. Daněk et al . (2024) improve the analysis and quantification of autophagosomes in a 3D perspective by critically evaluating the limitations of previously used techniques and comparing them with the modified approach that they present. Narasimhan et al. (2024) present a tool box for analysing the binding of peptides to their receptors, for which they have developed genetic tools to use at good spatiotemporal resolution in vivo . In addition, a review by Müller-Schüssele (2024) describes the creation of biosensors for the determination of redox dynamics that are analysed using microscopy..

A microscope is not always necessary to prove that ‘seeing is believing’. Sotta and Fujiwara (2024) describe the development of a new, inexpensive, and reliable high-throughput method for monitoring herbivore insect feeding in Arabidopsis, which ultilizes a scanner and custom-built macro software for the free ImageJ and R packages.

Good quality visualization contributes to a better understanding of the processes taking place in plants, and Jo and Kajala (2024) introduce a new open-source ggPlantmap programme for visualization of experimental data, in a similar fashion to that provided by ePlant ( Waese et al. , 2017 ).

Since the description of the revolutionary gene-editing technology using CRISPR/Cas9 ( Jinek et al ., 2012 ), researchers in nearly all fields of biology have harnessed its possibility of creating desired mutant organisms. In plants, the technology is being used and improved extensively. The database created by the EU-SAGE initiative, which aims to open the European Union to the use of new genomic techniques in agriculture, contains more than 800 studies that have already utilized CRISPR to create genome-edited crop plants ( https://www.eu-sage.eu/genome-search ).

In this special issue, Přibylová and Fischer (2024) have written a detailed ‘cookbook’ for planning and performing gene editing using CRISPR. The authors highlight details that are not easy (or are indeed impossible) to obtain from original research articles, such as the effect of the chromatin state on CRISPR/Cas9 activity. In another paper, Vats et al . (2024) discuss prime editing in plants, noting its advantages and, at the same time, the challenges that need to be overcome to use this technique effectively, such as the fact that prime editing works with very low efficiency in dicots.

CRISPR/Cas9 has opened up unexpected possibilities for transcribing genomes and made it possible to think about editing in virtually all plants. As a result, the current bottleneck in research is no longer editing itself but determining the most efficient and reliable transformation process for the plant chosen for gene editing.

From the very beginning, plant research has benefited from the most advanced methodological approaches that have been available and, in turn, it has often been part of their evolution, such as the birth of genetics ( Mendel, 1865 ) or gene silencing ( Baulcombe, 2002 ). It is rather like the ‘chicken-and-egg’ question: which came first, knowledge or method? One cannot be without the other. In the past, one available method (e.g. microscopy) and excellent observational talent were often enough to make significant scientific progress: such a situation is perhaps not imaginable today even in basic research. Now, significant advances in knowledge require a detailed description of the mechanisms of how the plant works, and this in turn requires a combination of the multiple, often state-of-the-art, methods that are available, many of which have only been developed just recently.

I am confident that the collection of articles assembled in this special issue will help a broad range of our colleagues obtain a good overview and better understanding of the techniques now available in plant biology. As an example, the fictional case study in Box 1 summarizes the possibilities of the methods and approaches presented in this issue to deal with a complex, long-term scientific goal.

This virtual case study examines the effects of herbivore-associated molecular patterns on plants, and serves to highlight the application of methods considered in this special issue. All the ‘results’ referred to and the associated discussion are fictitious.

graphic

(A) We have identified a novel peptide from larvae that can be used to treat plants and trigger their resistance. We monitored the damage caused by larvae to the plant using the methodology presented by Sotta and Fujiwara (2024) , [1] , which provides a very cheap and high-throughput approach. In addition to protection against larvae, we also observed a triggered immune response after the peptide treatment. This prompted us to ask what the molecular mechanism was, and we assumed that there must be a plant receptor recognizing the peptide. (B) Due to the availability of a collection of sequenced ecotypes of our plant, we were able to perform a genome-wide association study (GWAS) analysis and obtained a candidate gene for the receptor ( John et al ., 2024 ) [2] . The structure of the receptor revealed its localization on the plasma membrane. (C) We analysed the binding of the peptide to its receptor using the techniques described by Cuadrado and van Damme (2024) and Narasimhan et al . (2024) , [3,4] . We next characterized the receptor complex and the interaction between the two proteins. So far, we have not been able to perform prime editing on our plant to confirm the direct site of the peptide binding to the receptor ( Vats et al ., 2024 ) [5] . Using molecular dynamics, we have predicted the local composition of the plasma membrane around the receptor in the absence and presence of the peptide ( Neubergerová and Pleskot, 2024 ) [6] . A combination of molecular biology, biochemistry, and microscopy techniques confirmed the results obtained using molecular dynamics ( Škrabálková et al ., 2024 ) [7] . Using the approach described by Daňek et al. (2024) , [8] , we were able to demonstrate that the binding of the peptide to the receptor does not lead to autophagy. We performed a detailed single-cell transcriptomic analysis, as described by Tenorio Berrío and Dubois (2024) , [9] , and the results showed a clear similarity to typical biotic stress transcriptomic responses. Utilizing complex methods described by Rudolf et al . (2024) , [10] , we did not find any influence of the peptide treatment on the plant epigenetic landscape. One of the typical responses to biotic stress is changes in redox, which we analysed using the biosensors described by Müller-Schüssele (2024) , [11] . Another typical stress response is a change in the concentration of phytohormones, and we used the state-of-the-art methods described by Petrik et al . (2024) , [12] to examine these at a detailed level. Unsurprisingly, we found significant changes in the jasmonic and salicylic acid levels in distinct parts of the leaf. An intriguing observation from our transcriptomic analysis was that there might be changes in the abundance of nitrogen-related compounds. Using the techniques described by Ćavar Zeljković et al . (2024) , [13] , we were able to analyse these changes and identify altered concentrations in particular nitrogen-related compounds. We visualized the results obtained in (C) using the ggPlantap package ( Jo and Kajala, 2024 ) [14] . We have found that plants from other families, for example, sunflowers, do not respond well to the peptide. Focusing on sunflower, we determined that the homolog receptor protein contains a number of structural differences, effectively abolishing its interaction with the aphid peptide. (D) Therefore, we decided to edit the receptor gene in sunflower to make its sequence identical to that of our original model plant. For this, we utilized the CRISPR/Cas9 approach, as described in detail by Přibylová and Fischer (2024) [15] . As predicted, our genetic modification led to decreased damage caused by the larvae in sunflowers. Created with BioRender.com.

Many thanks to all the people who contributed to the success of the conference ‘Methods in Plant Sciences 2023’ in Srní, Czech Republic, and to all the authors who have contributed to this special issue, on whose work this editorial is based. I would like to thank Dr Štěpán Jeřábek of Columbia University (NY, USA) for help with editing the final draft of this text. Last but not least, I would like to thank the editorial office of the Journal of Experimental Botany for all their help and patience with the special issue.

The author declares that he has no conflict of interest in relation to this work.

1001 Genomes Consortium. 2016 . 1,135 Genomes reveal the global pattern of polymorphism in Arabidopsis thaliana . Cell 166 , 481 – 491 .

Baulcombe D. 2002 . RNA silencing . Current Biology 12 , R82 – R84 .

Google Scholar

Bennett HM , Stephenson W , Rose CM , Darmanis S. 2023 . Single-cell proteomics enabled by next-generation sequencing or mass spectrometry . Nature Methods 20 , 363 – 374 .

Ćavar Zeljković S , De Diego N , Drašar L , Nisler J , Havlíček L , Spíchal L , Tarkowski P. 2024 . Comprehensive LC-MS/MS analysis of nitrogen-related plant metabolites . Journal of Experimental Botany 75 , 5390 – 5411 . https://doi.org/10.1093/jxb/erae129

Cuadrado AF , Van Damme D. 2024 . Unlocking protein–protein interactions in plants: a comprehensive review of established and emerging techniques . Journal of Experimental Botany 75 , 5220 – 5236 . https://doi.org/10.1093/jxb/erae088

Daněk M , Kocourková D , Podmanická TK , Eliášová K , Nesvadbová K , Krupař P , Martinec J. 2024 . A novel workflow for unbiased 3D quantification of autophagosomes in Arabidopsis thaliana roots . Journal of Experimental Botany 75 , 5412 – 5427 . https://doi.org/10.1093/jxb/erae084

Hobza R , Bačovský V , Čegan R , et al. . 2024 . Sexy ways: approaches to studying plant sex chromosomes . Journal of Experimental Botany 75 , 5204 – 5219 . https://doi.org/10.1093/jxb/erae173

Jinek M , Chylinski K , Fonfara I , Hauer M , Doudna JA , Charpentier E. 2012 . A programmable dual-RNA–guided DNA endonuclease in adaptive bacterial immunity . Science 337 , 816 – 821 .

Jo L , Kajala K. 2024 . ggPlantmap: an open-source R package for the creation of informative and quantitative ggplot maps derived from plant images . Journal of Experimental Botany 75 , 5366 – 5376 . https://doi.org/10.1093/jxb/erae043

John M , Korte A , Grimm DG. 2024 . The benefits of permutation-based genome-wide association studies . Journal of Experimental Botany 75 , 5377 – 5389 . doi: https://doi.org/10.1093/jxb/erae280

Kwasniak-Owczarek M , Janska H. 2024 . . Experimental approaches to studying translation in plant semi-autonomous organelles . Journal of Experimental Botany 75 , 5175 – 5187 . https://doi.org/10.1093/jxb/erae151

Li FW , Harkess A. 2018 . A guide to sequence yourfavorite plant genomes . Applications in Plant Sciences 6 , e1030 .

Manten AA. 1969 . On the earliest microscopical observations of pollen grains . Review of Paleobotany and Palynology 9 , 5 – 16 .

Mendel GJ. 1865 . Versuche über Pflanzen-Hybriden [Experiments Concerning Plant Hybrids] . In: Verhandlungen des naturforschenden Vereines in Brünn [Proceedings of the Natural History Society of Brünn] IV . 3 – 47 .

Google Preview

Müller-Schüssele SJ. 2024 . Chloroplast thiol redox dynamics through the lens of genetically encoded biosensors . Journal of Experimental Botany 75 , 5312 – 5324 . https://doi.org/10.1093/jxb/erae075

Narasimhan M , Jahnke N , Kallert F , Bahafid E , Böhmer F , Hartmann L , Simon R. 2024 . Macromolecular tool box to elucidate CLAVATA3/Embryo Surrounding Region-Related-RLK binding, signaling and downstream effects . Journal of Experimental Botany 75 , 5438 – 5456 . https://doi.org/10.1093/jxb/erae206

Neubergerová M , Pleskot R. 2024 . Plant protein–lipid interfaces studied by molecular dynamics simulations . Journal of Experimental Botany 75 , 5237 – 5250 . https://doi.org/10.1093/jxb/erae228

Nordborg M , Weigel D. 2008 . Next-generation genetics in plants . Nature 456 , 720 – 723 .

Petrik I , Hladik P , Zhang C , Pencik A , Novak O. 2024 . Spatio-temporal plant hormonomics: from tissue to subcellular resolution . Journal of Experimental Botany 75 , 5295 – 5311 . https:/10.1093/jxb/erae267

Přibylová A , Fischer L. 2024 . How to use CRISPR/Cas9 in plants: from target site selection to DNA repair . Journal of Experimental Botany 75 , 5325 – 5343 . https://doi.org/10.1093/jxb/erae147

Rudolf J , Tomovicova L , Panzarova K , Fajkus J , Hejatko J , Skalak J. 2024 . Epigenetics and plant hormone dynamics: a functional and methodological perspective . Journal of Experimental Botany 75 , 5267 – 5294 . https://doi.org/10.1093/jxb/erae054

Sotta N , Fujiwara T. 2024 . Time-course analysis system for leaf feeding marks reveals effects of Arabidopsis trichomes on insect herbivore feeding behavior . Journal of Experimental Botany 75 , 5428 – 5437 . https://doi.org/10.1093/jxb/erae184

Škrabálková E , Pejchar P , Potocký M. 2024 . Exploring lipid–protein interactions in plant membranes . Journal of Experimental Botany 75 , 5251 – 5266 . https://doi.org/10.1093/jxb/erae199

Šimková H , Câmara AS , Mascher M. 2024 . Hi-C techniques: from genome assemblies to transcription regulation . Journal of Experimental Botany 75 , 5357 – 5365 . https://doi.org/10.1093/jxb/erae085

Štorchová H , Krüger M. 2024 . Methods for assembling complex mitochondrial genomes in land plants . Journal of Experimental Botany 75 , 5169 – 5174 . https://doi.org/10.1093/jxb/erae034

Tang F , Barbacioru C , Wang Y , et al. . 2009 . mRNA-seq whole-transcriptome analysis of a single cell . Nature Methods 6 , 377 – 382 .

Tenorio Berrío R , Dubois M. 2024 . Single-cell transcriptomics reveals heterogeneity in plant responses to the environment: a focus on biotic and abiotic interactions . Journal of Experimental Botany 75 , 5188 – 5203 . https://doi.org/10.1093/jxb/erae107

The Arabidopsis Genome Initiative 2000 . Analysis of the genome sequence of the flowering plant Arabidopsis thaliana . Nature 408 , 796 – 815 .

Vats S , Kumar J , Sonah H , Zhang F , Deshmukh R. 2024 . Prime editing in plants: prospects and challenges . Journal of Experimental Botany 75 , 5344 – 5356 . https://doi.org/10.1093/jxb/erae053

Waese J , Fan J , Pasha A , et al. . 2017 . ePlant: visualizing and exploring multiple levels of data for hypothesis generation in plant biology . The Plant Cell 29 , 1806 – 1821 .

Month: Total Views:
September 2024 246

Email alerts

Citing articles via.

  • Recommend to your Library

Affiliations

  • Online ISSN 1460-2431
  • Print ISSN 0022-0957
  • Copyright © 2024 Society for Experimental Biology
  • About Oxford Academic
  • Publish journals with us
  • University press partners
  • What we publish
  • New features  
  • Open access
  • Institutional account management
  • Rights and permissions
  • Get help with access
  • Accessibility
  • Advertising
  • Media enquiries
  • Oxford University Press
  • Oxford Languages
  • University of Oxford

Oxford University Press is a department of the University of Oxford. It furthers the University's objective of excellence in research, scholarship, and education by publishing worldwide

  • Copyright © 2024 Oxford University Press
  • Cookie settings
  • Cookie policy
  • Privacy policy
  • Legal notice

This Feature Is Available To Subscribers Only

Sign In or Create an Account

This PDF is available to Subscribers Only

For full access to this pdf, sign in to an existing account, or purchase an annual subscription.

science experimental methods

News from EUROfusion, Members and Partners

Eurofusion spearheads advances in artificial intelligence and machine learning to unlock fusion energy.

  • September 12, 2024
  • Gieljan de Vries
  • EUROfusion news

Fusion energy promises to deliver safe, sustainable, and low-carbon baseload power, complementing other clean energy sources like solar and wind.

To achieve this, we need to address complex physics and engineering challenges, including understanding the collective movements of charged particles in magnetic fields, mitigating disruption events, analysing material erosion effects, and processing data rapidly enough for use in control loops. Artificial Intelligence and Machine Learning offer new opportunities to deepen our understanding of these phenomena.

Artist's impression of Artificial Intelligence research for fusion. Credit: Pexels / GoogleDeepMind

Building on the world’s largest dataset

“With new research projects on Artificial Intelligence and Machine Learning, EUROfusion aims to accelerate progress towards fusion energy and support the ongoing efforts in its work packages”, explains Sara Moradi of the EUROfusion Programme Management Unit. “Machine learning and Artificial Intelligence are powerful tools for extracting insight from data, uncovering patterns and suggest control schemes that are too computationally intensive to identify with traditional computer models.”

EUROfusion’s extensive dataset of fusion experiments spans decades of research, from the earliest fusion machines to the most advanced systems currently in operation. This unparalleled resource positions EUROfusion uniquely to drive forward Artificial Intelligence applications in fusion research.

Fusion is a great sandbox for Artificial Intelligence and Machine Learning, agrees José Vicente (University of Lisbon), the principal investigator of one of the fifteen projects. “As a very complex system, it has many open questions. We can already address those with today’s large amounts of experimental data and realistic numerical simulations of the key physics, but not all of them — that is the gap that Artificial Intelligence may help close.”

Artificial Intelligence for Fusion projects

Following a call and selection process, the EUROfusion General Assembly approved the support for 15 new Artificial Intelligence Fusion research projects. The strong response to the call highlights the scientific community’s commitment to using state-of-the-art approaches to advance computational techniques for magnetically confined plasmas.

The 15 projects will receive a total amount of €2.659 million, of which half is provided by collaborative co-funding from the researchers’ home institutes and half from EUROfusion. The research projects will run for a period of two years.

Different methods to estimate the group delay for different radar frequencies in reflectometry data from a fusion plasma. The anticipated machine learning method (ML) outperforms traditional analysis methods (MP, BP). Credit: J.Vicente and J.Santos, IST, University of Lisbon

Quotes from participating researchers

“Our project uses probability theory and machine learning to understand the conditions for so-called ‘hybrid scenarios’. These scenarios offer an improved energy confinement in fusion machines, but the necessary conditions change between machines with different characteristics. If successful, our approach will let us find the underlying patterns and even predict the optimal conditions for future facilities like ITER and the power plants that follow.”

—Professor Geert Verdoolaege, department of Applied Physics, Ghent University, Belgium Identification and confinement scaling of hybrid scenarios across multiple devices

“This project is about using machine learning to get faster simulations of the stability of the edge plasma in tokamaks, the ‘pedestal’ region. These kinds of calculations can already be done, but they are computationally heavy. Our goal is to substantially accelerate these tools so they can be used in fast everyday data analysis as well as in real-time applications like control.”

—Dr Aaro Järvinen at VTT, Finland Machine learning accelerated pedestal MHD stability simulations

“We will investigate the application of deep learning techniques to improve the reconstruction of signals detected with radar-like instruments in fusion devices. This is important because it will allow better and automatic control of the fusion plasma that needs to be efficiently confined in those devices and in future fusion reactors.”

—Dr José Vicente at IST, University of Lisbon, Portugal Deep Learning for Spectrogram Analysis of Reflectometry Data

These projects underscore the potential of Artificial Intelligence and Machine Learning to address key challenges in fusion research, paving the way for more efficient and effective control strategies as we move closer to realizing fusion energy.

Using machine learning to get faster simulations of the stability of the edge plasma ('pedestal') in tokamaks. Credit: A. Järvinen et al., VTT

Supported projects

David Zarzoso ( CEA / CNRS, France) Artificial Intelligence augmented Scrape Off Layer modelling for capturing impact of filaments on transport and PWI in mean field codes simulations.

Feda Almuhisen (CEA / Aix-Marseille Université, France) Towards Tokamak operations Conversational Artificial Intelligence Interface Using Multimodal Large Language Models

Augusto Pereira (CIEMAT, Spain) Testing cutting-edge Artificial Intelligence research to increase pattern recognition and image classification in nuclear fusion databases

Sven Wiesen ( DIFFER , the Netherlands) Machine learning accelerated Scrape Off Layer L simulations: SOLPS-NN

Gergő Pokol (EK-CER, Hungary) Fast inference methods of advanced diagnostics for real-time control

Riccardo Rossi (ENEA / Università di Roma Tor Vergata, Italy) Artificial Intelligence-assisted Causality Detection and Modelling of Plasma Instabilities for Tokamak Disruption Prediction and Control

Michela Gelfusa (ENEA / Università di Roma Tor Vergata, Italy) Development of Physics Informed Neural Networks (PINNs) for Modelling and Prediction of Data in the Form of Time Series

Alessandro Pau (EPFL, Switzerland) Artificial Intelligence-assisted Plasma State Monitoring for Control and Disruption-free Operations in Tokamaks

Pawel Gasior (IPPLM, Poland) Laser Induced Breakdown Spectrocopy data-processing with Deep Neural Networks and Convolutional Neural Networks for chemical composition quantification in the wall of the next step-fusion reactors

Jose Vicente (IST, Portugal) Deep Learning for Spectrogram Analysis of Reflectometry Data

Geert Verdoolaege (LPP-ERM-KMS / Ghent University, Belgium) Identification and confinement scaling of hybrid scenarios across multiple devices

Marcin Jakubowski (IPP, Germany) Leveraging Generative Artificial Intelligence Models for Thermal Load Control in High-Performance Steady-State Operation of Fusion Devices

Daniel Böckenhoff (IPP, Germany) Surrogate modelling of ray-tracing and radiation transport code for faster real-time plasma profile inference in a magnetic confinement device

Antti Snicker (VTT, Finland) Applying Artificial Intelligence/Machine Learning for Neutral Beam Injection ionization and slowing-down simulations using ASCOT/BBNBI

Aaro Järvinen (VTT, Finland) Machine learning accelerated pedestal Magneto Hydro Dynamics stability simulations

About EUROfusion

The EUROfusion consortium coordinates experts, students and facilities from across Europe to advance fusion energy in line with the EUROfusion fusion roadmap . Co-funded through the Euratom Research and Training Programme, EUROfusion supports the preparation for experiments at ITER and the development of the European demonstration fusion power plant DEMO . The programme also fosters fusion education, training, and industry collaboration.

You may also enjoy these articles:

European fusion technology marketplace lauched.

  • September 18, 2024
  • EUROfusion news , Fusion For Energy , Partner news

2nd JT-60SA International Fusion School in 2024

  • September 11, 2024

Cycling tour Ride4Fusion promotes fusion energy

  • September 6, 2024

Effects of curing condition and solder mask on substrate warpage: an experimental and simulation study

  • Published: 16 September 2024
  • Volume 35 , article number  1734 , ( 2024 )

Cite this article

science experimental methods

  • Guowei Fan 1 , 2 ,
  • Zengming Hu 1 , 2 ,
  • Jie Xu 1 , 2 ,
  • Junqi Tang 3 ,
  • Dashun Liu 4 ,
  • Zeming Fang 1 , 2 ,
  • Qianfa Liu 3 ,
  • Dong Lu 5 ,
  • Ke Xue 5 &
  • Ke Wang   ORCID: orcid.org/0000-0001-8221-3488 1 , 2  

The warpage of the package substrate mainly originates from the material property and size variations of individual components, especially when multiple components are involved. To maintain the substrate warpage within acceptable limits, it’s crucial to fine-tune component parameters carefully, choosing appropriate materials and processing conditions. The current study investigates the factors influencing substrate warpage and explores the methods to mitigate it with the assistance of finite element analysis. Glass fiber reinforced epoxy-based substrates were prepared under different curing temperatures, and characterized by thermo-mechanical analysis and mechanical testing. A finite element simulation model of the bare carrier board was developed using ABAQUS software. The results show that the curing temperature impacts the coefficient of thermal expansion (CTE), strength and modulus of the substrate. The differences in CTE and dimensional parameters among the component materials strongly influence substrate warpage. While the curing conditions affect bare carrier board warpage to some extent, the type and thickness of solder mask have more significant effects and warpage can be mitigated by properly choosing and applying solder masks.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save.

  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime

Price includes VAT (Russian Federation)

Instant access to the full article PDF.

Rent this article via DeepDyve

Institutional subscriptions

science experimental methods

Similar content being viewed by others

science experimental methods

Strain behaviors of solder bump with underfill for flip chip package under thermal loading condition

science experimental methods

Solder Joint Reliability: Thermo-mechanical analysis on Power Flat Packages

Realistic creep characterization for sn3.0ag0.5cu solder joints in flip chip bga package, explore related subjects, data availability.

Data will be made available on reasonable request.

S.Y. Park, S.Y. On, J. Kim, J. Lee, T.-S. Kim, B.L. Wardle, S.S. Kim, ACS Appl. Mater. Interfaces 15 , 11024–11032 (2023). https://doi.org/10.1021/acsami.2c21229

Article   CAS   PubMed   PubMed Central   Google Scholar  

S.Y. Yang, Y.-D. Jeon, S.-B. Lee, K.-W. Paik, Microelectron. Reliab. 46 , 512–522 (2006). https://doi.org/10.1016/j.microrel.2005.06.007

Article   CAS   Google Scholar  

J.-H. Baek, D.-W. Park, G.-H. Oh, D.-O. Kawk, S.S. Park, H.-S. Kim, Mater. Sci. Semicond. Process. 148 , 106758 (2022). https://doi.org/10.1016/j.mssp.2022.106758

H. Wang, Y. Liu, J. Zhang, T. Li, Z. Hu, Y. Yu, RSC Adv. 5 , 11358–11370 (2015). https://doi.org/10.1039/c4ra13678k

R. Saito, Y. Yamaguchi, S. Matsubara, S. Moriguchi, Y. Mihara, T. Kobayashi, K. Terada, Int. J. Solids Struct. 190 , 199–215 (2020). https://doi.org/10.1016/j.ijsolstr.2019.11.010

L. Granado, S. Kempa, S. Bremmert, L.J. Gregoriades, F. Brüning, E. Anglaret, N. Fréty, J. Microelectron. Electron. Packag. 14 , 45–50 (2017). https://doi.org/10.4071/imaps.359903

Article   Google Scholar  

C Kim, T Lee, H Choi, M Kim, T Kim, 2014 Electronic Components & Technology Conference. 1004, 1009.

C. Kim, T.-I. Lee, M. Kim, T.-S. Kim, Polymers 7 , 985–1004 (2015). https://doi.org/10.3390/polym7060985

J Kim, S Lee, J Lee, S Jung, C Ryu, Unit Process Development of Advanced Circuit Interconnect. 1, 12.

C. Selvanayagam, R. Mandal, N. Raghavan, Microelectron. Reliab. 88–90 , 817–823 (2018). https://doi.org/10.1016/j.microrel.2018.07.110

Y. Duan, G. Liu, W. Wang, Q. Deng, J. Li, R. Cao, C. Wang, Microelectron. Reliab. 151 , 115260 (2023). https://doi.org/10.1016/j.microrel.2023.115260

M. Shih, K. Chen, T. Lee, D. Tarng, C.P. Hung, IEEE transactions on components. Packag. Manuf. Technol. 11 , 690–696 (2021). https://doi.org/10.1109/tcpmt.2021.3065647

J Talledo, R Real, A Cadag, 29th ASEMEP national technical symposium. 1, 7.

A. Inamdar, Y.-H. Yang, A. Prisacaru, P. Gromala, B. Han, Polym. Degrad. Stab. (2021). https://doi.org/10.1016/j.polymdegradstab.2021.109572

D.S. Kim, M.J. Han, J.R. Lee, Polym. Eng. Sci. 35 , 1353–1358 (2004). https://doi.org/10.1002/pen.760351705

X. Liu, Y. Yu, S. Li, Polymer 47 , 3767–3773 (2006). https://doi.org/10.1016/j.polymer.2006.03.102

F. Gao, J. Shi, X. Zhang, X. Wang, L. Weng, X. Sun, L. Yu, C. Li, Polym. Compos. 43 , 7200–7210 (2022). https://doi.org/10.1002/pc.26783

H.A. Khonakdar, J. Morshedian, U. Wagenknecht, S.H. Jafari, Polymer 44 , 4301–4309 (2003). https://doi.org/10.1016/s0032-3861(03)00363-x

Z. Lan, J. Deng, Z. Xu, Z. Ye, Y. Nie, Polymers (2023). https://doi.org/10.3390/polym15122734

Article   PubMed   PubMed Central   Google Scholar  

C.L. Hsieh, W.H. Tuan, Mater. Sci. Eng. A 396 , 202–205 (2005). https://doi.org/10.1016/j.msea.2005.01.029

C. Tz-Cheng, G. Che-Li, H. Hong-Wei, L. Yi-Shao, IEEE Trans. Device Mater. Reliab. 11 , 339–348 (2011). https://doi.org/10.1109/tdmr.2011.2135860

P. Hutapea, J.L. Grenestedt, M. Modi, M. Mello, K. Frutschy, Microelectron. Eng. 83 , 557–569 (2006). https://doi.org/10.1016/j.mee.2005.12.009

Download references

Acknowledgements

The authors sincerely acknowledge the Department of Science and Technology of Guangdong Province (Project No. 2023B0101020004), National Natural Science Foundation of China (NSFC, No. 62374080), and Shenzhen Key Laboratory of Intelligent Manufacturing for Continuous Carbon Fibre Reinforced Composites (Project No. ZDSYS20220527171404011) for the financial support.

This work was supported by Ministry of Industry and Information Technology of China, National Natural Science Foundation of China (NSFC, No. 62374080), and Shenzhen Key Laboratory of Intelligent Manufacturing for Continuous Carbon Fibre Reinforced Composites (Project No. ZDSYS20220527171404011).

Author information

Authors and affiliations.

Shenzhen Key Laboratory of Intelligent Manufacturing for Continuous Carbon Fiber Reinforced Composites, Southern University of Science and Technology, Shenzhen, 518055, China

Guowei Fan, Zengming Hu, Jie Xu, Zeming Fang & Ke Wang

School of System Design and Intelligent Manufacturing, Southern University of Science and Technology, Shenzhen, 518055, China

Guangdong Shengyi Technology Co., Ltd., Dongguan, 523000, China

Junqi Tang, Li Luo & Qianfa Liu

Division of Advanced Engineering Materials, Guangzhou HKUST Fok Ying Tung Research Institute, Guangzhou, 511458, China

The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China

Dong Lu & Ke Xue

You can also search for this author in PubMed   Google Scholar

Contributions

All authors contributed to the study conception and design. Guowei Fan: Material preparation, data collection and analysis, simulation, manuscript drafting and revision; Jie Xu, Junqi Tang, Zeming Fang, Li Luo and Qianfa Liu: Experiment design and execution, data analysis; Dashun Liu, Dong Lu and Ke Xue: Data validation, simulation, manuscript review; Ke Wang: Supervision, conceptualization, manuscript review and editing. All authors have read and approved the final manuscript.

Corresponding author

Correspondence to Ke Wang .

Ethics declarations

Competing interests.

The authors declare that they have no conflict of interest. All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non‐financial interest in the subject matter or materials discussed in this manuscript.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Fan, G., Hu, Z., Xu, J. et al. Effects of curing condition and solder mask on substrate warpage: an experimental and simulation study. J Mater Sci: Mater Electron 35 , 1734 (2024). https://doi.org/10.1007/s10854-024-13499-z

Download citation

Received : 18 July 2024

Accepted : 10 September 2024

Published : 16 September 2024

DOI : https://doi.org/10.1007/s10854-024-13499-z

Share this article

Anyone you share the following link with will be able to read this content:

Sorry, a shareable link is not currently available for this article.

Provided by the Springer Nature SharedIt content-sharing initiative

  • Find a journal
  • Publish with us
  • Track your research

COMMENTS

  1. Steps of the Scientific Method

    The six steps of the scientific method include: 1) asking a question about something you observe, 2) doing background research to learn what is already known about the topic, 3) constructing a hypothesis, 4) experimenting to test the hypothesis, 5) analyzing the data from the experiment and drawing conclusions, and 6) communicating the results ...

  2. Scientific method

    scientific method, mathematical and experimental technique employed in the sciences. More specifically, it is the technique used in the construction and testing of a scientific hypothesis. The process of observing, asking questions, and seeking answers through tests and experiments is not unique to any one field of science.

  3. Experimentation in Scientific Research

    Experimentation is one scientific research method, perhaps the most recognizable, in a spectrum of methods that also includes description, comparison, and modeling (see our Description, Comparison, and Modeling modules). While all of these methods share in common a scientific approach, experimentation is unique in that it involves the conscious ...

  4. Scientific method

    The scientific method is an empirical method for acquiring knowledge that has characterized the development of science since at least the 17th century. The scientific method involves careful observation coupled with rigorous scepticism, because cognitive assumptions can distort the interpretation of the observation.Scientific inquiry includes creating a hypothesis through inductive reasoning ...

  5. Science and the scientific method: Definitions and examples

    Science is a systematic and logical approach to discovering how things in the universe work. Scientists use the scientific method to make observations, form hypotheses and gather evidence in an ...

  6. Guide to Experimental Design

    Table of contents. Step 1: Define your variables. Step 2: Write your hypothesis. Step 3: Design your experimental treatments. Step 4: Assign your subjects to treatment groups. Step 5: Measure your dependent variable. Other interesting articles. Frequently asked questions about experiments.

  7. Scientific Method

    The scientific method, developed during the Scientific Revolution (1500-1700), changed theoretical philosophy into practical science when experiments to demonstrate observable results were used to confirm, adjust, or deny specific hypotheses. Experimental results were then shared and critically reviewed by peers until universal laws could be made.

  8. Controlled experiments (article)

    Controlled experiments (article) | Khan Academy. If you're seeing this message, it means we're having trouble loading external resources on our website. If you're behind a web filter, please make sure that the domains *.kastatic.org and *.kasandbox.org are unblocked. Khanmigo is now free for all US educators! Plan lessons, develop exit tickets ...

  9. Scientific Method: Definition and Examples

    Regina Bailey. Updated on August 16, 2024. The scientific method is a series of steps that scientific investigators follow to answer specific questions about the natural world. Scientists use the scientific method to make observations, formulate hypotheses, and conduct scientific experiments. A scientific inquiry starts with an observation.

  10. Experiment Definition in Science

    In science, an experiment is a procedure that tests a hypothesis. In science, an experiment is simply a test of a hypothesis in the scientific method. It is a controlled examination of cause and effect. Here is a look at what a science experiment is (and is not), the key factors in an experiment, examples, and types of experiments.

  11. Scientific Method

    Science is an enormously successful human enterprise. The study of scientific method is the attempt to discern the activities by which that success is achieved. Among the activities often identified as characteristic of science are systematic observation and experimentation, inductive and deductive reasoning, and the formation and testing of ...

  12. Four Ways to Teach the Scientific Method

    Teaching Steps of the Scientific Method. Whether preparing students to do classroom science experiments and science fair projects or reviewing the way science experiments work and how scientists approach testing science questions, educators teach and use the scientific method with K-12 students at all grade levels.

  13. Experimental research

    10 Experimental research. 10. Experimental research. Experimental research—often considered to be the 'gold standard' in research designs—is one of the most rigorous of all research designs. In this design, one or more independent variables are manipulated by the researcher (as treatments), subjects are randomly assigned to different ...

  14. PDF and Engineering Students Experimental Methods for Science

    Experimental Methods for Science and Engineering StudentsResponding to the developments of the past 20 years, Les Kirkup has thoroughly revised his popular book on experimental methods, while retaining the exten. ive coverage and practical advice from the first edition. Many topics from that edition remain, including documenting experiments ...

  15. Data, measurement and empirical methods in the science of science

    Here we review three quasi-experimental methods: difference-in-differences, instrumental variables and regression discontinuity (Fig. 3). Fig. 3: Quasi-experiment methods. a - c, This figure ...

  16. Experimentology

    The story of this book. Experimental Methods (Psych 251) is the foundational course for incoming graduate students in the Stanford psychology department. The course goal is to orient students to the nuts and bolts of doing behavioral experiments, including how to plan and design a solid experiment and how to avoid common pitfalls regarding ...

  17. Experiment

    Experiments might be categorized according to a number of dimensions, depending upon professional norms and standards in different fields of study. In some disciplines (e.g., psychology or political science), a 'true experiment' is a method of social research in which there are two kinds of variables.

  18. Experimental method

    V The place of experiments in social science. Experimental methods are one kind of data collection that may be used to assess theoretical knowledge. They are certainly not the only method, for many others are available: surveys, content analyses, structured and unstructured observation, and others. As noted previously, all research searches for ...

  19. Experimental Design

    Experimental design methods refer to the techniques and procedures used to design and conduct experiments in scientific research. Here are some common experimental design methods: ... Environmental science: Experimental design is used to study the effects of environmental factors, such as pollution or climate change, on ecosystems and wildlife. ...

  20. The past, present, and future of experimental methods in the social

    The first trend to emphasize from Fig. 1 is that all three disciplines are utilizing the experimental method more today than they were 30 years ago. On average across the three disciplines, roughly four percent of articles published in the early 1990s used experimental methods; in contrast, over fourteen percent of published articles used experiments in the late 2010s, more than a threefold ...

  21. How the Experimental Method Works in Psychology

    The experimental method is a type of research procedure that involves manipulating variables to determine if there is a cause-and-effect relationship. The results obtained through the experimental method are useful but do not prove with 100% certainty that a singular cause always creates a specific effect.

  22. 70 Easy Science Experiments Using Materials You Already Have

    Go Science Kids. 43. "Flip" a drawing with water. Light refraction causes some really cool effects, and there are multiple easy science experiments you can do with it. This one uses refraction to "flip" a drawing; you can also try the famous "disappearing penny" trick.

  23. Critical assessment of water enthalpy characterization ...

    Theoretical and experimental evidence is provided, which shows that these temperature differences invalidate the equal energy input assumption. ... it is easy to see how misleading results of vaporization enthalpy reduction are obtained using this method. These experiments and analyses show clearly that, when energy inputs from the environment ...

  24. Methods in plant science

    Martin Janda, Methods in plant science, Journal of Experimental Botany, Volume 75, Issue 17, 11 September 2024, Pages 5163-5168, ... The idea to put together this special issue came via the support of the Journal of Experimental Botany for the conference 'Methods in Plant Sciences 2023', held in Srní, Czech Republic. The meeting brought ...

  25. EUROfusion spearheads advances in Artificial Intelligence and Machine

    By launching 15 new research projects, EUROfusion is engaging data science experts across Europe to apply Artificial Intelligence and Machine Learning techniques to fusion energy. These projects will leverage the world's largest and most diverse dataset of fusion experiments to identify optimal methods for understanding and controling the fusion process, ultimately shortening the road to ...

  26. Cats are (almost) liquid!—Cats selectively rely on body size awareness

    The study was approved by the Institutional Animal Welfare Committee (Eötvös Loránd University, Budapest), who checked and accepted the experimental method (Ref. no.: ELTE-AWC-016/2023). All tests were performed in accordance with the Hungarian regulations on animal experimentation and the Guidelines for the use of animals in research ...

  27. Effects of curing condition and solder mask on substrate ...

    The warpage of the package substrate mainly originates from the material property and size variations of individual components, especially when multiple components are involved. To maintain the substrate warpage within acceptable limits, it's crucial to fine-tune component parameters carefully, choosing appropriate materials and processing conditions. The current study investigates the ...