Think about something strange and unexplainable in your life. Maybe you get a headache right before it rains, or maybe you think your favorite sports team wins when you wear a certain color. If you wanted to see whether these are just coincidences or scientific fact, you would form a hypothesis, then create an experiment to see whether that hypothesis is true or not.
But what is a hypothesis, anyway? If you’re not sure about what a hypothesis is--or how to test for one!--you’re in the right place. This article will teach you everything you need to know about hypotheses, including:
So let’s get started!
Merriam-Webster defines a hypothesis as “an assumption or concession made for the sake of argument.” In other words, a hypothesis is an educated guess. Scientists make a reasonable assumption--or a hypothesis--then design an experiment to test whether it’s true or not. Keep in mind that in science, a hypothesis should be testable. You have to be able to design an experiment that tests your hypothesis in order for it to be valid.
As you could assume from that statement, it’s easy to make a bad hypothesis. But when you’re holding an experiment, it’s even more important that your guesses be good...after all, you’re spending time (and maybe money!) to figure out more about your observation. That’s why we refer to a hypothesis as an educated guess--good hypotheses are based on existing data and research to make them as sound as possible.
Hypotheses are one part of what’s called the scientific method. Every (good) experiment or study is based on the scientific method. The scientific method gives order and structure to experiments and ensures that interference from scientists or outside influences does not skew the results. It’s important that you understand the concepts of the scientific method before holding your own experiment. Though it may vary among scientists, the scientific method is generally made up of six steps (in order):
You’ll notice that the hypothesis comes pretty early on when conducting an experiment. That’s because experiments work best when they’re trying to answer one specific question. And you can’t conduct an experiment until you know what you’re trying to prove!
After doing your research, you’re ready for another important step in forming your hypothesis: identifying variables. A variable is any factor that could influence the outcome of your experiment. Variables have to be measurable and related to the topic being studied.
There are two types of variables: independent variables and dependent variables. Independent variables are not changed by the other variables in the experiment; they are the factors that researchers select or vary. For example, age is an independent variable; nothing in the study alters a participant’s age, and researchers can look at different ages to see if age has an effect on the dependent variable.
Speaking of dependent variables... dependent variables are subject to the influence of the independent variable, meaning that they change in response to it. Let’s say you want to test whether a person’s age affects how much sleep they need. In that case, the independent variable is age (like we mentioned above), and the dependent variable is how much sleep a person gets.
Variables will be crucial in writing your hypothesis. You need to be able to identify which variable is which, as both the independent and dependent variables will be written into your hypothesis. For instance, in a study about exercise, the independent variable might be the speed at which the respondents walk for thirty minutes, and the dependent variable would be their heart rate. In your study and in your hypothesis, you’re trying to understand the relationship between the two variables.
The best hypotheses start by asking the right questions. For instance, if you’ve observed that the grass is greener when it rains twice a week, you could ask what kind of grass it is, what elevation it’s at, and if the grass across the street responds to rain in the same way. Any of these questions could become the backbone of experiments to test why the grass gets greener when it rains fairly frequently.
As you’re asking more questions about your first observation, make sure you’re also making more observations. If it doesn’t rain for two weeks and the grass still looks green, that’s an important observation that could influence your hypothesis. You'll continue observing all throughout your experiment, but until the hypothesis is finalized, every observation should be noted.
Finally, you should consult secondary research before writing your hypothesis. Secondary research is composed of results found and published by other people. You can usually find this information online or at your library. Additionally, make sure the research you find is credible and related to your topic. If you’re studying the correlation between rain and grass growth, it would help you to research rain patterns over the past twenty years for your county, published by a local agricultural association. You should also research the types of grass common in your area, the type of grass in your lawn, and whether anyone else has conducted experiments about your hypothesis. Also be sure you’re checking the quality of your research. Research done by a middle school student about what minerals can be found in rainwater would be less useful than an article published by a local university.
Once you’ve considered all of the factors above, you’re ready to start writing your hypothesis. Hypotheses usually take a certain form when they’re written out in a research report.
When you boil down your hypothesis statement, you are writing down your best guess and not the question at hand. This means that your statement should be written as if it is fact already, even though you are simply testing it.
The reason for this is that, after you have completed your study, you'll either accept or reject your if-then or your null hypothesis. All hypotheses should be measurable and able to be confirmed or denied. You cannot confirm a question, only a statement!
In fact, you come up with hypothesis examples all the time! For instance, when you guess on the outcome of a basketball game, you don’t say, “Will the Miami Heat beat the Boston Celtics?” but instead, “I think the Miami Heat will beat the Boston Celtics.” You state it as if it is already true, even if it turns out you’re wrong. You do the same thing when writing your hypothesis.
Additionally, keep in mind that hypotheses can range from very specific to very broad. A hypothesis about a single, narrow effect can be specific, but if your study involves a broad range of causes and effects, your hypothesis can also be broad.
Now that you understand what goes into a hypothesis, it’s time to look more closely at the two most common types of hypothesis: the if-then hypothesis and the null hypothesis.
First of all, if-then hypotheses typically follow this formula:
If ____ happens, then ____ will happen.
The goal of this type of hypothesis is to test the causal relationship between the independent and dependent variable. It’s fairly simple, and each hypothesis can vary in how detailed it can be. We create if-then hypotheses all the time with our daily predictions. Here are some examples of hypotheses that use an if-then structure from daily life:
In each of these situations, you’re making a guess on how an independent variable (sleep, time, or studying) will affect a dependent variable (the amount of work you can do, making it to a party on time, or getting better grades).
You may still be asking, “What is an example of a hypothesis used in scientific research?” Take the hypothesis from a real-world study on whether using technology before bed affects children’s sleep patterns. The hypothesis reads:
“We hypothesized that increased hours of tablet- and phone-based screen time at bedtime would be inversely correlated with sleep quality and child attention.”
It might not look like it, but this is an if-then statement. The researchers basically said, “If children have more screen usage at bedtime, then their quality of sleep and attention will be worse.” The sleep quality and attention are the dependent variables and the screen usage is the independent variable. (Usually, the independent variable comes after the “if” and the dependent variable comes after the “then,” as it is the independent variable that affects the dependent variable.) This is an excellent example of how flexible hypothesis statements can be, as long as the general idea of “if-then” and the independent and dependent variables are present.
Your if-then hypothesis is not the only one needed to complete a successful experiment, however. You also need a null hypothesis to test it against. In its most basic form, the null hypothesis is the opposite of your if-then hypothesis. When you write your null hypothesis, you are writing a hypothesis that suggests that your guess is not true, and that the independent and dependent variables have no relationship.
One null hypothesis for the screen time and sleep study from the last section might say:
“If children have more screen usage at bedtime, their quality of sleep and attention will not be worse.”
In this case, this is a null hypothesis because it states the opposite of the original hypothesis!
Conversely, if your if-then hypothesis suggests that your two variables have no relationship, then your null hypothesis would suggest that there is one. So, pretend that there is a study that is asking the question, “Does the number of followers on Instagram influence how long people spend on the app?” The independent variable is the number of followers, and the dependent variable is the time spent. But if you, as the researcher, don’t think there is a relationship between the number of followers and time spent, you might write an if-then hypothesis that reads:
“If people have many followers on Instagram, they will not spend more time on the app than people who have fewer.”
Here, the if-then suggests there isn’t a relationship between the variables. In that case, the null hypothesis might say:
“If people have many followers on Instagram, they will spend more time on the app than people who have fewer.”
You then test both the if-then and the null hypothesis to gauge if there is a relationship between the variables, and if so, how much of a relationship.
If you’re going to take the time to hold an experiment, whether in school or by yourself, you’re also going to want to take the time to make sure your hypothesis is a good one. The best hypotheses have four major elements in common: plausibility, defined concepts, observability, and general explanation.
At first glance, this quality of a hypothesis might seem obvious. When your hypothesis is plausible, that means it’s possible given what we know about science and general common sense. However, improbable hypotheses are more common than you might think.
Imagine you’re studying weight gain and television watching habits. If you hypothesize that people who watch more than twenty hours of television a week will gain two hundred pounds or more over the course of a year, this might be improbable (though it’s potentially possible). Consequently, common sense can tell us the results of the study before the study even begins.
Improbable hypotheses generally go against science, as well. Take this hypothesis example:
“If a person smokes one cigarette a day, then they will have lungs just as healthy as the average person’s.”
This hypothesis is obviously untrue, as studies have shown again and again that cigarettes negatively affect lung health. You must be careful that your hypotheses do not reflect your own personal opinion more than they do scientifically-supported findings. This plausibility points to the necessity of research before the hypothesis is written to make sure that your hypothesis has not already been disproven.
The more advanced you are in your studies, the more likely that the terms you’re using in your hypothesis are specific to a limited set of knowledge. For example, a hypothesis about the readability of printed text in newspapers might use words like “kerning” and “x-height.” Unless your readers have a background in graphic design, it’s likely that they won’t know what you mean by these terms. Thus, it’s important to explain what they mean, either in the hypothesis itself or in the report before the hypothesis.
Here’s what we mean. Which of the following sentences makes more sense to the common person?
If the kerning is greater than average, more words will be read per minute.
If the space between letters is greater than average, more words will be read per minute.
For people reading your report that are not experts in typography, simply adding a few more words will be helpful in clarifying exactly what the experiment is all about. It’s always a good idea to make your research and findings as accessible as possible.
Good hypotheses ensure that you can observe the results.
In order to measure the truth or falsity of your hypothesis, you must be able to see your variables and the way they interact. For instance, if your hypothesis is that the flight patterns of satellites affect the strength of certain television signals, yet you don’t have a telescope to view the satellites or a television to monitor the signal strength, you cannot properly observe your hypothesis and thus cannot continue your study.
Some variables may seem easy to observe, but if you do not have a system of measurement in place, you cannot observe your hypothesis properly. Here’s an example: if you’re experimenting on the effect of healthy food on overall happiness, but you don’t have a way to monitor and measure what “overall happiness” means, your results will not reflect the truth. Monitoring how often someone smiles for a whole day is not reasonably observable, but having the participants state how happy they feel on a scale of one to ten is more observable.
In writing your hypothesis, always keep in mind how you'll execute the experiment.
Perhaps you’d like to study what color your best friend wears the most often by observing and documenting the colors she wears each day of the week. This might be fun information for her and you to know, but beyond you two, there aren’t many people who could benefit from this experiment. When you start an experiment, you should note how generalizable your findings may be if they are confirmed. Generalizability is basically how common a particular phenomenon is to other people’s everyday life.
If you’re asking a question about the health benefits of eating an apple for one day only, you need to realize that the experiment may be too specific to be helpful. It does not help to explain a phenomenon that many people experience. If you find yourself with too specific of a hypothesis, go back to asking the big question: what is it that you want to know, and what do you think will happen between your two variables?
We know it can be hard to write a good hypothesis unless you’ve seen some good hypothesis examples. We’ve included four hypothesis examples based on some made-up experiments. Use these as templates or launch pads for coming up with your own hypotheses.
You are a student at PrepScholar University. When you walk around campus, you notice that, when the temperature is above 60 degrees, more students study in the quad. You want to know when your fellow students are more likely to study outside. With this information, how do you make the best hypothesis possible?
You must remember to make additional observations and do secondary research before writing your hypothesis. In doing so, you notice that no one studies outside when it’s 75 degrees and raining, so this should be included in your experiment. Also, studies done on the topic beforehand suggested that students are more likely to study in temperatures less than 85 degrees. With this in mind, you feel confident that you can identify your variables and write your hypotheses:
If-then: “If the temperature in Fahrenheit is less than 60 degrees, significantly fewer students will study outside.”
Null: “If the temperature in Fahrenheit is less than 60 degrees, the same number of students will study outside as when it is more than 60 degrees.”
These hypotheses are plausible, as the temperatures are reasonably within the bounds of what is possible. The number of people in the quad is also easily observable. It is also not a phenomenon specific to only one person or at one time, but instead can explain a phenomenon for a broader group of people.
To complete this experiment, you pick the month of October to observe the quad. Every day (except on the days when it’s raining) from 3 to 4 PM, when most classes have released for the day, you observe how many people are on the quad. You measure how many people come and how many leave. You also write down the temperature on the hour.
After writing down all of your observations and putting them on a graph, you find that the most students study on the quad when it is 70 degrees outside, and that the number of students drops a lot once the temperature reaches 60 degrees or below. In this case, your research report would state that you accept, or “fail to reject,” your first hypothesis with your findings.
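As a rough illustration of how the October observations could be summarized, the sketch below splits a handful of readings at the 60-degree mark from the hypothesis and compares the average head count on each side. The observation values, the `split_by_threshold` helper, and the variable names are all invented for this example; they are not data from the study described above.

```python
from statistics import mean

# Hypothetical (made-up) observations: (temperature in °F, students on the quad)
observations = [
    (52, 4), (55, 6), (58, 5), (59, 7),
    (63, 15), (66, 19), (70, 24), (72, 22),
]

THRESHOLD_F = 60  # cut-off taken from the if-then hypothesis


def split_by_threshold(obs, threshold):
    """Return (counts below the threshold, counts at or above it)."""
    cold = [count for temp, count in obs if temp < threshold]
    warm = [count for temp, count in obs if temp >= threshold]
    return cold, warm


cold_counts, warm_counts = split_by_threshold(observations, THRESHOLD_F)
mean_cold = mean(cold_counts)
mean_warm = mean(warm_counts)

print(f"Average below {THRESHOLD_F}°F: {mean_cold:.1f} students")
print(f"Average at or above {THRESHOLD_F}°F: {mean_warm:.1f} students")
```

A large gap between the two averages would be consistent with the if-then hypothesis, though a comparison of averages alone does not show whether the gap could have arisen by chance.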
Let’s say that you work at a bakery. You specialize in cupcakes, and you make only two colors of frosting: yellow and purple. You want to know what kind of customers are more likely to buy what kind of cupcake, so you set up an experiment. Your independent variable is the customer’s gender, and the dependent variable is the color of the frosting. What is an example of a hypothesis that might answer the question of this study?
Here’s what your hypotheses might look like:
If-then: “If customers’ gender is female, then they will buy more yellow cupcakes than purple cupcakes.”
Null: “If customers’ gender is female, then they will be just as likely to buy purple cupcakes as yellow cupcakes.”
This is a pretty simple experiment! It passes the test of plausibility (there could easily be a difference), defined concepts (there’s nothing complicated about cupcakes!), observability (both color and gender can be easily observed), and general explanation (this would potentially help you make better business decisions).
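One way to check the null hypothesis here is an exact binomial test, which asks how surprising the observed split would be if each customer were equally likely to pick either color. The article does not prescribe a particular test, so treat this as one possible sketch; the purchase counts and the `two_sided_binomial_p` helper are made up for illustration.

```python
from math import comb


def two_sided_binomial_p(successes: int, trials: int) -> float:
    """Exact two-sided p-value for a fair 50/50 null hypothesis.

    Adds up the probability of every outcome at least as far from
    trials / 2 as the observed count.
    """
    observed_gap = abs(successes - trials / 2)
    extreme_outcomes = sum(
        comb(trials, k)
        for k in range(trials + 1)
        if abs(k - trials / 2) >= observed_gap
    )
    return extreme_outcomes / 2 ** trials


# Hypothetical tally: 20 cupcakes bought by female customers, 15 of them yellow.
yellow, total = 15, 20
p_value = two_sided_binomial_p(yellow, total)
print(f"p-value under the null hypothesis: {p_value:.4f}")
```

By convention, a p-value below 0.05 is often read as evidence against the null hypothesis, but that threshold is a choice you would state in your report rather than a fixed rule.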
While watching your backyard bird feeder, you realized that different birds come on the days when you change the types of seeds. You decide that you want to see more cardinals in your backyard, so you decide to see what type of food they like the best and set up an experiment.
However, one morning, you notice that, while some cardinals are present, blue jays are eating out of your backyard feeder filled with millet. You decide that, of all of the other birds, you would like to see the blue jays the least. This means you'll have more than one variable in your hypothesis. Your new hypotheses might look like this:
If-then: “If sunflower seeds are placed in the bird feeders, then more cardinals will come than blue jays. If millet is placed in the bird feeders, then more blue jays will come than cardinals.”
Null: “If either sunflower seeds or millet are placed in the bird feeders, equal numbers of cardinals and blue jays will come.”
Through simple observation, you actually find that cardinals come as often as blue jays when sunflower seeds or millet is in the bird feeder. In this case, you would reject your “if-then” hypothesis and “fail to reject” your null hypothesis. You cannot accept your first hypothesis, because it’s clearly not true. Instead you found that there was actually no relation between your different variables. Consequently, you would need to run more experiments with different variables to see if the new variables impact the results.
You’re about to give a speech in one of your classes about the importance of paying attention. You want to take this opportunity to test a hypothesis you’ve had for a while:
If-then: If students sit in the first two rows of the classroom, then they will listen better than students who do not.
Null: If students sit in the first two rows of the classroom, then they will not listen better or worse than students who do not.
You give your speech and then ask your teacher if you can hand out a short survey to the class. On the survey, you’ve included questions about some of the topics you talked about. When you get back the results, you’re surprised to see that not only do the students in the first two rows not pay better attention, but they also scored worse than students in other parts of the classroom! Here, both your if-then and your null hypotheses are not representative of your findings. What do you do?
This is when you reject both your if-then and null hypotheses and instead create an alternative hypothesis. This type of hypothesis is used in the rare circumstance that neither of your hypotheses is able to capture your findings. Now you can use what you’ve learned to draft new hypotheses and test again!
The more comfortable you become with writing hypotheses, the better they will become. The structure of hypotheses is flexible and may need to be changed depending on what topic you are studying. The most important thing to remember is the purpose of your hypothesis and the difference between the if-then and the null. From there, in forming your hypothesis, you should constantly be asking questions, making observations, doing secondary research, and considering your variables. After you have written your hypothesis, be sure to edit it so that it is plausible, clearly defined, observable, and helpful in explaining a general phenomenon.
Writing a hypothesis is something that everyone, from elementary school children competing in a science fair to professional scientists in a lab, needs to know how to do. Hypotheses are vital in experiments and in properly executing the scientific method. When done correctly, hypotheses will set up your studies for success and help you to understand the world a little better, one experiment at a time.
If you’re studying for the science portion of the ACT, there’s definitely a lot you need to know. We’ve got the tools to help, though! Start by checking out our ultimate study guide for the ACT Science subject test. Once you read through that, be sure to download our recommended ACT Science practice tests, since they’re one of the most foolproof ways to improve your score. (And don’t forget to check out our expert guide book, too.)
If you love science and want to major in a scientific field, you should start preparing in high school. Here are the science classes you should take to set yourself up for success.
If you’re trying to think of science experiments you can do for class (or for a science fair!), here’s a list of 37 awesome science experiments you can do at home.
Ashley Sufflé Robinson has a Ph.D. in 19th Century English Literature. As a content writer for PrepScholar, Ashley is passionate about giving college-bound students the in-depth information they need to get into the school of their dreams.
Basic Elements of the Scientific Method: Hypotheses
A hypothesis states what one is looking for in an experiment. When facts are assembled, ordered, and seen in a relationship, they build up to become a theory. For the facts to be confirmed further, deductions must be drawn from this theory, and the formulation of these deductions constitutes a hypothesis. Because a theory states a logical relationship between facts, the propositions deduced from it should be true. Hence, these deduced propositions are called hypotheses.
There are three major difficulties in the formulation of a hypothesis. They are as follows:
Sometimes the deduction of a hypothesis may be difficult as there would be many variables and the necessity to take them all into consideration becomes a challenge. For instance, observing two cases:
Deduction: This situation makes more sense for people in professions such as psychotherapy, psychiatry and, to some extent, law. They have a very intimate relationship with their clients and are thus more susceptible than other professions to emotional strains in the client-practitioner relationship and to implicit and explicit controls over both participants.
Deduction: There can be numerous ways to approach this principle. One could apply the comparison to the marital relationships of the members and further argue that such differential pressures could be observed through divorce rates. This hypothesis would show inverse correlations between class position and divorce rates. There would be a very strong need to define the terms carefully to show the deduction from the principle in question.
Science and hypothesis.
The statement “The general culture in which a science develops furnishes many of its basic hypotheses” holds true: science has developed most in the West, and it is no accident that this development is a function of the culture itself. This is evident in Western culture’s emphasis on morals, science, and happiness. After examining many variables, one can say that the cultural emphasis upon happiness has been productive of an almost limitless range of hypotheses.
Analogies are a source of useful hypotheses, but they are not without dangers: an analogy may not account for all variables, as no civilization has a perfect system.
Hypotheses are also the consequence of personal, idiosyncratic experience as the manner in which the individual reacts to the hypotheses is also important and should be accounted for in the experiment.
The formulation of a hypothesis is probably the most necessary step in good research practice, and it is essential for getting the thought process started. It helps the researcher to keep a specific goal in mind and to deduce the end result of an experiment with ease and efficiency. History shows that asking the right questions pays off.
Kartik is studying for a BA in International Relations at Amity. He dropped out of engineering at NIT Hamirpur and has lived in more than five countries.
Defining the hypothesis, the role of a hypothesis in the scientific method, types of hypotheses, hypothesis formulation, hypotheses and variables.
In sociology, as in other scientific disciplines, the hypothesis serves as a crucial building block for research. It is a central element that directs the inquiry and provides a framework for testing the relationships between social phenomena. This article will explore what a hypothesis is, how it is formulated, and its role within the broader scientific method. By understanding the hypothesis, students of sociology can grasp how sociologists construct and test theories about the social world.
A hypothesis is a specific, testable statement about the relationship between two or more variables. It acts as a proposed explanation or prediction based on limited evidence, which researchers then test through empirical investigation. In essence, it is a statement that can be supported or refuted by data gathered from observation, experimentation, or other forms of systematic inquiry. The hypothesis typically takes the form of an “if-then” statement: if one variable changes, then another will change in response.
In sociological research, a hypothesis helps to focus the investigation by offering a clear proposition that can be tested. For instance, a sociologist might hypothesize that an increase in education levels leads to a decrease in crime rates. This hypothesis gives the researcher a direction, guiding them to collect data on education and crime, and analyze the relationship between the two variables. By doing so, the hypothesis serves as a tool for making sense of complex social phenomena.
The hypothesis is a key component of the scientific method, which is the systematic process by which sociologists and other scientists investigate the world. The scientific method begins with an observation of the world, followed by the formulation of a question or problem. Based on prior knowledge, theory, or preliminary observations, researchers then develop a hypothesis, which predicts an outcome or proposes a relationship between variables.
Once a hypothesis is established, researchers gather data to test it. If the data supports the hypothesis, it may be used to build a broader theory or to further refine the understanding of the social phenomenon in question. If the data contradicts the hypothesis, researchers may revise their hypothesis or abandon it altogether, depending on the strength of the evidence. In either case, the hypothesis helps to organize the research process, ensuring that it remains focused and methodologically sound.
In sociology, this method is particularly important because the social world is highly complex. Researchers must navigate a vast range of variables—age, gender, class, race, education, and countless others—that interact in unpredictable ways. A well-constructed hypothesis allows sociologists to narrow their focus to a manageable set of variables, making the investigation more precise and efficient.
Sociologists use different types of hypotheses, depending on the nature of their research question and the methods they plan to use. Broadly speaking, hypotheses can be classified into two main types: null hypotheses and alternative (or research) hypotheses.
The null hypothesis, denoted as H0, states that there is no relationship between the variables being studied. It is a default assumption that any observed differences or relationships are due to random chance rather than a real underlying cause. In research, the null hypothesis serves as a point of comparison. Researchers collect data to see if the results allow them to reject the null hypothesis in favor of an alternative explanation.
For example, a sociologist studying the relationship between income and political participation might propose a null hypothesis that income has no effect on political participation. The goal of the research would then be to determine whether this null hypothesis can be rejected based on the data. If the data shows a significant correlation between income and political participation, the null hypothesis would be rejected.
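To make the idea of rejecting a null hypothesis more concrete, here is a minimal sketch of a permutation test, one of several ways to judge whether an observed difference could plausibly be due to chance. The participation scores, the `permutation_p_value` helper, the fixed seed, and the 0.05 cut-off are all assumptions made for illustration, not part of the study described above.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_shuffles=10_000, seed=0):
    """Estimate how often random relabeling produces a gap in means
    at least as large as the observed one (two-sided)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    size_a = len(group_a)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        gap = abs(mean(pooled[:size_a]) - mean(pooled[size_a:]))
        if gap >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_shuffles + 1)


# Hypothetical participation scores (e.g. elections voted in) by income group.
lower_income = [1, 2, 2, 3, 1, 2]
higher_income = [4, 5, 3, 5, 4, 6]

p_value = permutation_p_value(lower_income, higher_income)
reject_null = p_value < 0.05  # 0.05 is a conventional, assumed threshold
print(f"p-value: {p_value:.4f}, reject H0: {reject_null}")
```

The logic mirrors the paragraph above: if shuffling the income labels rarely reproduces a gap as large as the real one, the assumption of "no relationship" looks hard to sustain for this sample.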
The alternative hypothesis, denoted as H1 or Ha, proposes that there is a significant relationship between the variables. This is the hypothesis that researchers aim to support with their data. In contrast to the null hypothesis, the alternative hypothesis predicts a specific direction or effect. For example, a researcher might hypothesize that higher levels of education lead to greater political engagement. In this case, the alternative hypothesis is proposing a positive correlation between the two variables.
The alternative hypothesis is the one that guides the research design, as it directs the researcher toward gathering evidence that will either support or refute the predicted relationship. The research process is structured around testing this hypothesis and determining whether the evidence is strong enough to reject the null hypothesis.
The process of formulating a hypothesis is both an art and a science. It requires a deep understanding of the social phenomena under investigation, as well as a clear sense of what is possible to observe and measure. Hypothesis formulation is closely linked to the theoretical framework that guides the research. Sociologists draw on existing theories to generate hypotheses, ensuring that their predictions are grounded in established knowledge.
To formulate a good hypothesis, a researcher must identify the key variables and determine how they are expected to relate to one another. Variables are the factors or characteristics that are being measured in a study. In sociology, these variables often include social attributes such as class, race, gender, age, education, and income, as well as behavioral variables like voting, criminal activity, or social participation.
For example, a sociologist studying the effects of social media on self-esteem might propose the following hypothesis: “Increased time spent on social media leads to lower levels of self-esteem among adolescents.” Here, the independent variable is the time spent on social media, and the dependent variable is the level of self-esteem. The hypothesis predicts a negative relationship between the two variables: as time spent on social media increases, self-esteem decreases.
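As a rough illustration of what a "negative relationship" looks like in data, the sketch below computes a Pearson correlation coefficient for the social media example. The numbers are invented purely for illustration (they are not survey results), and the function name `pearson_r` is an assumption of this sketch, not part of any particular library.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of the two variables around their means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Invented illustrative data, NOT real measurements.
hours_on_social_media = [1, 2, 3, 4, 5, 6]    # independent variable
self_esteem_score = [32, 30, 29, 27, 24, 22]  # dependent variable

r = pearson_r(hours_on_social_media, self_esteem_score)
# A negative r is consistent with the hypothesized negative relationship.
print(f"r = {r:.2f}")
```

A negative coefficient only describes the direction of the association in the sample; whether it is strong enough to reject the null hypothesis is a separate question that requires a significance test.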
A strong hypothesis has several key characteristics. It should be clear and specific, meaning that it unambiguously states the relationship between the variables. It should also be testable, meaning that it can be supported or refuted through empirical investigation. Finally, it should be grounded in theory, meaning that it is based on existing knowledge about the social phenomenon in question.
Mr Edwards has a PhD in sociology and 10 years of experience in sociological knowledge
Easy Sociology makes sociology as easy as possible. Our aim is to make sociology accessible for everybody. © 2023 Easy Sociology
All research studies involve the use of the scientific method, which is a mathematical and experimental technique used to conduct experiments by developing and testing a hypothesis or a prediction about an outcome. Simply put, a hypothesis is a suggested solution to a problem. It includes elements that are expressed in terms of relationships with each other to explain a condition or an assumption that hasn’t been verified using facts. 1 The typical steps in a scientific method include developing such a hypothesis, testing it through various methods, and then modifying it based on the outcomes of the experiments.
A research hypothesis can be defined as a specific, testable prediction about the anticipated results of a study. 2 Hypotheses help guide the research process and supplement the aim of the study. After several rounds of testing, hypotheses can help develop scientific theories. 3 Hypotheses are often written as if-then statements.
Here are two hypothesis examples:
Dandelions growing in nitrogen-rich soils for two weeks develop larger leaves than those in nitrogen-poor soils because nitrogen stimulates vegetative growth. 4
If a company offers flexible work hours, then their employees will be happier at work. 5
A hypothesis expresses an expected relationship between variables in a study and is developed before conducting any research. Hypotheses are not opinions but rather are expected relationships based on facts and observations. They help support scientific research and expand existing knowledge. An incorrectly formulated hypothesis can affect the entire experiment leading to errors in the results so it’s important to know how to formulate a hypothesis and develop it carefully.
A few sources of a hypothesis include observations from prior studies, current research and experiences, competitors, scientific theories, and general conditions that can influence people. Figure 1 depicts the different steps in a research design and shows where exactly in the process a hypothesis is developed. 4
Hypotheses are commonly grouped into seven types: simple, complex, directional, nondirectional, associative and causal (counted together), null, and alternative. An example of each is given below. 5, 6, 7
Simple hypothesis
Example: Exercising in the morning every day will increase your productivity.
Complex hypothesis
Example: Spending three hours or more on social media daily will negatively affect children’s mental health and productivity, more than that of adults.
Directional hypothesis
Example: The inclusion of intervention X decreases infant mortality compared to the original treatment.
Nondirectional hypothesis
Example: Cats and dogs differ in the amount of affection they express.
Associative and causal hypotheses
Example (associative): There is a positive association between physical activity levels and overall health.
A causal hypothesis, on the other hand, expresses a cause-and-effect association between variables.
Example (causal): Long-term alcohol use causes liver damage.
Null hypothesis
Example: Sleep duration does not have any effect on productivity.
Alternative hypothesis
Example: Sleep duration affects productivity.
So, what makes a good hypothesis? Here are some important characteristics of a hypothesis. 8,9
The following list mentions some important functions of a hypothesis: 1
To summarize, a hypothesis provides the conceptual elements that complete the known data, conceptual relationships that systematize unordered elements, and conceptual meanings and interpretations that explain the unknown phenomena. 1
Listed below are the main steps explaining how to write a hypothesis. 2,4,5
For example, if you notice that an office’s vending machine frequently runs out of a specific snack, you may predict that more people in the office choose that snack over another.
For example, after observing employees’ break times at work, you could ask “why do more employees take breaks in the morning rather than in the afternoon?”
For example, based on your observations you might state a hypothesis that employees work more efficiently when the air conditioning in the office is set at a lower temperature. However, during your preliminary research you find that this hypothesis was proven incorrect by a prior study.
Population: The specific group or individual who is the main subject of the research
Interest: The main concern of the study/research question
Comparison: The main alternative group
Outcome: The expected results
Time: Duration of the experiment
Once you’ve finalized your hypothesis statement you would need to conduct experiments to test whether the hypothesis is true or false.
The following table provides examples of different types of hypotheses. 10 ,11
Type | Example 1 | Example 2 |
Null | Hyperactivity is not related to eating sugar. | There is no relationship between height and shoe size. |
Alternative | Hyperactivity is positively related to eating sugar. | There is a positive association between height and shoe size. |
Simple | Students who eat breakfast perform better in exams than students who don’t eat breakfast. | Reduced screen time improves sleep quality. |
Complex | People with high-sugar diet and sedentary activity levels are more likely to develop depression. | Motorists who talk on the phone while driving will be more likely to make errors on a driving course than those who do not talk on the phone. |
Directional | As job satisfaction increases, the rate of employee turnover decreases. | Increase in sun exposure increases the risk of skin cancer. |
Non-directional | College students will perform differently from elementary school students on a memory task. | Advertising exposure correlates with variations in purchase decisions among consumers. |
Associative | Hospitals have more sick people in them than other institutions in society. | Watching TV is related to increased snacking. |
Causal | Inadequate sleep decreases memory retention. | Recreational drugs cause psychosis. |
Key takeaways
Here’s a summary of all the key points discussed in this article about how to write a hypothesis.
Hypotheses and research questions have different objectives and structure. The following table lists some major differences between the two. 9
Hypothesis | Research question |
Includes a prediction based on the proposed research | No prediction is made |
Designed to forecast the relationship of and between two or more variables | Variables may be explored |
Closed ended | Open ended, invites discussion |
Used if the research topic is well established and there is certainty about the relationship between the variables | Used for new topics that haven’t been researched extensively. The relationship between different variables is less known |
Here are a few examples to differentiate between a research question and hypothesis.
Research question | Hypothesis |
What is the effect of eating an apple a day by adults aged over 60 years on the frequency of physician visits? | Eating an apple each day, after the age of 60, will result in a reduction of frequency of physician visits. |
What is the effect of flexible or fixed working hours on employee job satisfaction? | Workplaces that offer flexible working hours report higher levels of employee job satisfaction than workplaces with fixed hours. |
Does drinking coffee in the morning affect employees’ productivity? | Drinking coffee in the morning improves employees’ productivity. |
Yes, here’s a simple checklist to help you gauge the effectiveness of your hypothesis. 9 When writing a hypothesis statement, check if it:
1. Predicts the relationship between the stated variables and the expected outcome.
2. Uses simple and concise language and is not wordy.
3. Does not assume readers’ knowledge about the subject.
4. Has observable, falsifiable, and testable results.
As mentioned earlier in this article, a hypothesis is an assumption or prediction about an association between variables based on observations and simple evidence. These statements are usually generic. Research objectives, on the other hand, are more specific and dictated by hypotheses. The same hypothesis can be tested using different methods and the research objectives could be different in each case. For example, Louis Pasteur observed that food lasts longer at higher altitudes, reasoned that it could be because the air at higher altitudes is cleaner (with fewer or no germs), and tested the hypothesis by exposing food to air cleaned in the laboratory. 12 Thus, a hypothesis is predictive—if the reasoning is correct, X will lead to Y—and research objectives are developed to test these predictions.
Null hypothesis testing is a method for deciding between two competing claims (the null and alternative hypotheses) about a statistical relationship between variables, based on a sample. The null hypothesis, denoted as H0, claims that no relationship exists between the variables in the population and that any relationship in the sample reflects sampling error or chance. The alternative hypothesis, denoted as H1, claims that there is a relationship in the population. In every study, researchers need to decide whether the relationship in a sample occurred by chance or reflects a relationship in the population. This is done by hypothesis testing using the following steps: 13
1. Assume that the null hypothesis is true.
2. Determine how likely a sample relationship at least as strong as the one observed would be if the null hypothesis were true. This probability is called the p value.
3. If the sample relationship would be extremely unlikely, reject the null hypothesis in favor of the alternative hypothesis. If it would not be unlikely, retain the null hypothesis; this means the data did not provide enough evidence against it, not that it has been proven true.
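The three steps can be sketched in a few lines of Python. The example below is a minimal illustration under invented assumptions: in a hypothetical sample, 16 of 20 participants show the predicted pattern, and the null hypothesis says each participant has a 50% chance of showing it. The function name `binomial_two_sided_p` and the 0.05 threshold are choices made for this sketch.

```python
from math import comb

def binomial_two_sided_p(successes, trials, p_null=0.5):
    """Exact two-sided p value: the total probability, under the null
    hypothesis, of every outcome that is no more likely than the observed one."""
    def pmf(k):
        return comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)

    observed = pmf(successes)
    # The small tolerance guards against floating-point ties.
    return sum(pmf(k) for k in range(trials + 1)
               if pmf(k) <= observed * (1 + 1e-9))

alpha = 0.05  # significance threshold, fixed before looking at the data

# Step 1: assume H0 (chance of the pattern is 0.5).
# Step 2: compute how likely a result this extreme would be under H0.
p_value = binomial_two_sided_p(successes=16, trials=20)

# Step 3: reject H0 only if that probability is below the threshold.
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"p = {p_value:.4f} -> {decision}")
```

An exact binomial test is only one of many possible tests; the right choice depends on the type of data and the study design.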
To summarize, researchers should know how to write a good hypothesis to ensure that their research progresses in the required direction. A hypothesis is a testable prediction about any behavior or relationship between variables, usually based on facts and observation, and states an expected outcome.
We hope this article has provided you with essential insight into the different types of hypotheses and their functions so that you can use them appropriately in your next research project.
References
Updated December 26, 2023
A research hypothesis is a statement that a researcher makes at the beginning of their research to outline what they expect the outcome to be.
If the hypothesis is “ More air pollution in an area can lead to more respiratory diseases ,” researchers expect that an increase in air pollution will cause more respiratory diseases in that area.
This hypothesis is needed because it provides focus, structure, and purpose to research, helping researchers test their ideas and make meaningful conclusions based on evidence gathered during their study.
Detailed explanation of each type is as follows:
This type looks at how two variables might be related to each other. These variables are the dependent variable and independent variable. A dependent variable is a factor that changes with the changes in the independent variable.
Suppose you want to study the relationship between studying for long hours and grades. In this relationship, grades are the dependent variable, and hours of study are the independent variable, as getting a high or low grade depends on how much you study. And your simple hypothesis could be: “ More study time leads to higher grades .”
Unlike the simple hypothesis, a complex hypothesis predicts a relationship between multiple variables.
Imagine you want to understand how sleep, diet, and exercise affect health. In this case, you have one dependent variable, health, and three independent variables – sleep, diet, and exercise. Your complex hypothesis can be something like: “ A combination of enough sleep, a balanced diet, and regular exercise positively impacts overall health. “
This hypothesis assumes no relationship between variables. It is a negative statement. It’s usually the opposite of your actual hypothesis.
Suppose you are studying whether shoe size affects intelligence; the null hypothesis would say: “ There is no association between shoe size and intelligence. “
This is the opposite of the null hypothesis. It is a statement specifying a relationship between variables.
Suppose you are researching the effect of water intake on memory. An alternative hypothesis could be: “ Increased water intake improves memory performance. “
A directional hypothesis predicts the specific direction or nature of the relationship between two or more variables.
Let’s say you want to investigate the effect of practicing an instrument on musical skills. A directional hypothesis could be something like: “ Increased practice time improves musical skill. ” In this case, it is clear how one variable impacts the other.
In a non-directional hypothesis, a researcher states that two variables are related but doesn’t specify how.
Suppose you are researching the relationship between caffeine intake and heart rate. A non-directional hypothesis might state: “There is a relationship between caffeine intake and heart rate.” This hypothesis doesn’t tell you whether higher caffeine intake is expected to raise or lower the heart rate, only that the two are related.
This hypothesis focuses on a relationship between variables but doesn’t claim that one causes the other.
Imagine you want to study if TV watching and increased snacking are related. An associative hypothesis might state: “ Watching more TV is related to increased snacking .”
On the other hand, a causal hypothesis suggests that the changes in one thing directly cause changes in another.
If you are studying sunlight exposure and vitamin D levels, a causal hypothesis could be: “ Lack of sunlight exposure causes vitamin D deficiency. “
A research hypothesis should have the following characteristics:
It’s important for research to have a research hypothesis because of the following reasons:
Below is the step-by-step guide to writing a research hypothesis.
Start by identifying what you want to research. Say, for instance, you are interested in understanding the relationship between AI and Productivity; this will form the basis of your research question. Your research question could be:
In the above example, the two variables are AI integration and employee productivity. Now, define which variable is dependent and which one is independent. The independent variable is the one you think will influence, and the dependent variable is the one that will be influenced.
Before you formulate your research hypothesis, you need to find out what past research on this subject is saying. This will help you understand what direction your research might take.
Based on past research, you can now write a clear and specific statement predicting how your dependent and independent variables are connected. Now, write down your research and null hypothesis.
Once you have developed your alternative/research and null hypothesis, your next task is to test your research hypothesis. Here’s how you do that:
Decide how you will gather information or conduct experiments to test your hypothesis. Determine what tools or methods you will use, the research population, the research sample, sample size, etc.
Carry out your experiments or observations and gather data related to your hypothesis. For example, if you are studying the impact of study time on grades, write down how many hours each student participating in your research spends on studying and the grades they get.
Use statistical tools or other analysis methods to study the collected data.
Based on your analysis, determine if the evidence supports your hypothesis. If the data backs up your prediction, your hypothesis is supported.
Share your results with others through reports or presentations, explaining how your experiments or observations relate to your hypothesis.
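The collect, analyze, and decide steps above can be sketched for the study-time example with a simple permutation test. Everything here is an assumption made for illustration: the data are invented, the helper names are hypothetical, and a permutation test is just one reasonable analysis method among several.

```python
import random
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) *
                      sum((y - my) ** 2 for y in ys))

def permutation_p_value(xs, ys, n_permutations=5000, seed=0):
    """One-sided p value: the share of random re-pairings whose correlation
    is at least as large as the observed one."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = correlation(xs, ys)
    shuffled = list(ys)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any real link between xs and ys
        if correlation(xs, shuffled) >= observed:
            hits += 1
    # Adding 1 to both counts keeps the estimate away from exactly zero.
    return (hits + 1) / (n_permutations + 1)

# Collected data (invented for illustration): one entry per student.
study_hours = [2, 4, 5, 7, 8, 10, 11, 13]
grades = [55, 60, 58, 70, 72, 78, 85, 88]

# Analyze, then decide against a threshold chosen in advance.
p_value = permutation_p_value(study_hours, grades)
supported = p_value < 0.05
print(f"r = {correlation(study_hours, grades):.2f}, p = {p_value:.4f}, "
      f"hypothesis supported: {supported}")
```

Shuffling the grades simulates a world in which study time and grades are unrelated, which is exactly what the null hypothesis claims.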
Let’s take the example of Dr. Lily Perry, a researcher from New York City. She wants to investigate if there is a relationship between respiratory diseases and air pollution in New York.
She starts by creating her research hypothesis and null hypothesis.
Dr. Perry followed a detailed plan to do her research:
After an intensive three-year study, Dr. Perry found interesting results:
Based on these results, Dr. Perry concluded that: There is a correlation between increased air pollution and respiratory diseases in New York.
She recommended the following:
Following are the main advantages and disadvantages of a research hypothesis:
Advantages | Disadvantages |
It gives your research a specific direction. | It can stop you from exploring all aspects of the study. |
It helps you predict the outcome of your study. | It can cause bias as you already have an outcome in mind. |
It assists in finding an appropriate data collection method. | You might need to revise your hypothesis based on the results of the study. |
The following are the main differences between research, null, and statistical hypothesis.
Research hypothesis | Null hypothesis | Statistical hypothesis |
States the expected relationship between two or more variables. | Assumes there is no relationship between the variables involved in a study. | A mathematical explanation of the relationship between variables. |
Individuals exercising regularly have a lower risk of heart disease. | There is no link between exercise and heart disease. | Individuals exercising regularly have a lower resting heart rate than those who do not. |
To prove a certain relationship exists between the variables. | To prove the variables are not related to each other. | To be proven using statistical methods. |
Usually states how two variables are related. | Has no direction and emphasizes there is no correlation. | Could be directional as well as non-directional based on the context. |
Needs to be tested through statistical and non-statistical methods. | Assumed true until disproven by research. | Is tested with statistical methods. |
Formulating a research hypothesis is usually the first step in conducting any research. However, it is important to know that your hypothesis might be disproven on occasion as well. The purpose of the research is to determine if your predictions about a specific relationship hold in light of evidence.
Q1. How long should a research hypothesis be?
Answer: A good research hypothesis should be just one or two sentences. For example: Increasing the amount of water that a cucumber plant receives will lead to increased production.
Answer: In the research paper, the hypothesis is usually placed after the introduction section. The introduction section is added after the background section and before the research methodology.
Answer: To understand this concept, let’s use an example. Let’s say you want to investigate whether there is a difference in the average marks of students in four different divisions. For this, you can use ANOVA (it helps determine if there is a significant difference between the means of three or more samples). So, your research hypothesis would be: There is a difference in the average scores of students in the four divisions. Your null hypothesis would be: There is no difference in the average scores of students in the four divisions. To test these hypotheses, you would collect data (marks of the students) from the four divisions. You would then analyze the data using ANOVA and determine whether to reject the null hypothesis in favor of the research hypothesis or to fail to reject it.
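The four-division example can be sketched as follows. The marks are invented for illustration, and the helper names are hypothetical. The F statistic is the standard one-way ANOVA ratio; because the Python standard library has no F distribution, this sketch estimates the p value by permutation, which is a substitute for looking F up in a table and not what a statistics package would normally do.

```python
import random

def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA: between-group variance
    divided by within-group variance."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

def permutation_p_value(groups, n_permutations=2000, seed=0):
    """Share of random regroupings with an F at least as large as observed."""
    rng = random.Random(seed)
    observed_f = one_way_anova_f(groups)
    pooled = [x for g in groups for x in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # assign marks to divisions at random
        regrouped, start = [], 0
        for size in sizes:
            regrouped.append(pooled[start:start + size])
            start += size
        if one_way_anova_f(regrouped) >= observed_f:
            hits += 1
    return (hits + 1) / (n_permutations + 1)

# Invented marks for four divisions, for illustration only.
divisions = [
    [72, 75, 78, 71, 74],
    [65, 68, 70, 66, 64],
    [80, 82, 85, 79, 84],
    [73, 70, 76, 72, 74],
]

f_stat = one_way_anova_f(divisions)
p_value = permutation_p_value(divisions)
reject_null = p_value < 0.05
print(f"F = {f_stat:.2f}, p = {p_value:.4f}, reject H0: {reject_null}")
```

A significant result says only that at least one division mean differs from the others; it does not say which one.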
Answer: Qualitative and descriptive research typically do not have a hypothesis. Instead, they have research questions to help the researcher conduct a detailed analysis.
Examples of research questions:
Answer: A research question is what you want to explore, while a research hypothesis is what you expect the outcome of the study to be.
Answer: “ Increased exercise leads to improved heart health ” is an example of a strong hypothesis as it predicts a clear relationship between variables. Furthermore, it is possible to test the hypothesis. On the other hand, “ Apples are better than oranges ” is an example of a bad or poor hypothesis as it is a subjective statement and can’t be tested.
In Statistics, hypothesis testing is used to determine whether the variation observed between groups of data reflects a true difference rather than chance. Sample data are drawn from a population, and conclusions about the population parameters rest on stated assumptions. Hypotheses can be classified into various types. In this article, let us discuss the definition of a hypothesis, the various types of hypotheses, and the significance of hypothesis testing.
In Statistics, a hypothesis is defined as a formal statement, which gives the explanation about the relationship between the two or more variables of the specified population. It helps the researcher to translate the given problem to a clear explanation for the outcome of the study. It clearly explains and predicts the expected outcome. It indicates the types of experimental design and directs the study of the research process.
The hypothesis can be broadly classified into different types. They are:
Simple Hypothesis
A simple hypothesis is a hypothesis that there exists a relationship between two variables. One is called a dependent variable, and the other is called an independent variable.
Complex Hypothesis
A complex hypothesis states a relationship among more than two variables. In this hypothesis, there is more than one dependent variable, more than one independent variable, or both.
Null Hypothesis
The null hypothesis states that there is no significant difference between the populations specified in the experiment, and that any observed difference is due to experimental or sampling error. The null hypothesis is denoted by H0.
Alternative Hypothesis
The alternative hypothesis states that the observations are influenced by some non-random cause. It is denoted by Ha or H1.
Empirical Hypothesis
An empirical hypothesis is formed by the experiments and based on the evidence.
Statistical Hypothesis
In a statistical hypothesis, the statement, whether it seems logical or illogical, is one that can be verified statistically.
Apart from these types, other hypotheses include directional and non-directional hypotheses, associative hypotheses, and causal hypotheses.
The important characteristics of the hypothesis are:
Hypothesis testing is the act of testing a hypothesis or a supposition in relation to a statistical parameter. Analysts implement hypothesis testing in order to test if a hypothesis is plausible or not.
In data science and statistics , hypothesis testing is an important step as it involves the verification of an assumption that could help develop a statistical parameter. For instance, a researcher establishes a hypothesis assuming that the average of all odd numbers is an even number.
In order to find the plausibility of this hypothesis, the researcher will have to test the hypothesis using hypothesis testing methods. Unlike a hypothesis that is ‘supposed’ to stand true on the basis of little or no evidence, hypothesis testing is required to have plausible evidence in order to establish that a statistical hypothesis is true.
Perhaps this is where statistics play an important role. A number of components are involved in this process. But before understanding the process involved in hypothesis testing in research methodology, we shall first understand the types of hypotheses that are involved in the process. Let us get started!
In data sampling, different types of hypothesis are involved in finding whether the tested samples test positive for a hypothesis or not. In this segment, we shall discover the different types of hypotheses and understand the role they play in hypothesis testing.
Alternative Hypothesis (H1) or the research hypothesis states that there is a relationship between two variables (where one variable affects the other). The alternative hypothesis is the main driving force for hypothesis testing.
It implies that the two variables are related to each other and the relationship that exists between them is not due to chance or coincidence.
When the process of hypothesis testing is carried out, the alternative hypothesis is the main subject of the testing process. The analyst intends to test the alternative hypothesis and verifies its plausibility.
The Null Hypothesis (H0) aims to nullify the alternative hypothesis by implying that there exists no relation between two variables in statistics. It states that the effect of one variable on the other is solely due to chance and no empirical cause lies behind it.
The null hypothesis is established alongside the alternative hypothesis and is recognized as important as the latter. In hypothesis testing, the null hypothesis has a major role to play as it influences the testing against the alternative hypothesis.
The Non-directional hypothesis states that a relation exists between two variables without specifying its direction.
Simply put, it asserts that the two variables are related, but does not predict whether an increase in variable A goes with an increase or a decrease in variable B.
The Directional hypothesis, on the other hand, asserts the direction of the relationship that exists between two variables.
Herein, the hypothesis clearly states whether variable B is expected to increase or decrease as variable A increases.
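In statistical practice, the choice between a directional and a non-directional hypothesis usually shows up as the choice between a one-tailed and a two-tailed test. The sketch below assumes a test statistic z that follows a standard normal distribution under the null hypothesis; the value 1.8 is an arbitrary illustration, and the function name is hypothetical.

```python
from statistics import NormalDist

def p_values_from_z(z):
    """Return (one-tailed, two-tailed) p values for a z statistic.

    The one-tailed value assumes the directional hypothesis predicts
    a positive effect (z > 0).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    upper_tail = 1 - std_normal.cdf(z)
    both_tails = 2 * (1 - std_normal.cdf(abs(z)))
    return upper_tail, both_tails

one_tailed, two_tailed = p_values_from_z(1.8)
alpha = 0.05
print(f"directional:     p = {one_tailed:.4f}, reject H0: {one_tailed < alpha}")
print(f"non-directional: p = {two_tailed:.4f}, reject H0: {two_tailed < alpha}")
```

The same data can lead to different decisions under the two hypotheses, which is why the direction should be fixed before the data are examined.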
A statistical hypothesis is a hypothesis that can be verified to be plausible on the basis of statistics.
By using data sampling and statistical knowledge, one can determine the plausibility of a statistical hypothesis and find out if it stands true or not.
Now that we have understood the types of hypotheses and the role they play in hypothesis testing, let us now move on to understand the process in a better manner.
In hypothesis testing, a researcher is first required to establish two hypotheses - alternative hypothesis and null hypothesis in order to begin with the procedure.
To establish these two hypotheses, one is required to study data samples, find a plausible pattern among the samples, and pen down a statistical hypothesis that they wish to test.
To begin hypothesis testing, a random sample can be drawn from the population. The alternative and null hypotheses are mutually exclusive, so both cannot be true at once, and both must be stated for the test to be meaningful.
At the end of the hypothesis testing procedure, the null hypothesis is either rejected, which supports the alternative hypothesis, or not rejected. Even when the data favor one of the two hypotheses, no hypothesis can ever be verified 100%.
Therefore, a hypothesis can only be supported based on the statistical samples and verified data. Here is a step-by-step guide for hypothesis testing.
First things first, one is required to establish two hypotheses - alternative and null, that will set the foundation for hypothesis testing.
These hypotheses initiate the testing process that involves the researcher working on data samples in order to either support the alternative hypothesis or the null hypothesis.
Once the hypotheses have been formulated, it is now time to generate a testing plan. A testing plan or an analysis plan involves the accumulation of data samples, determining which statistic is to be considered and laying out the sample size.
All these factors are very important while one is working on hypothesis testing.
As soon as a testing plan is ready, it is time to move on to the analysis part. Analysis of data samples involves configuring statistical values of samples, drawing them together, and deriving a pattern out of these samples.
While analyzing the data samples, a researcher needs to determine the following -
Significance Level - The significance level (often written as alpha) is the probability of rejecting the null hypothesis when it is actually true. It is chosen before the analysis and serves as the cut-off against which the P-value is compared.
Testing Method - The testing method pairs a sampling distribution with a test statistic. A number of testing methods can assist in the analysis of data samples.
Test statistic - The test statistic is a numerical summary of a data set that is used to perform the hypothesis test.
P-value - The P-value is the probability of obtaining a sample statistic at least as extreme as the observed test statistic, assuming the null hypothesis is true. A small P-value indicates that the data are hard to reconcile with the null hypothesis.
The analysis of data samples leads to the inference of results that establishes whether the alternative hypothesis stands true or not. When the P-value is less than the significance level, the null hypothesis is rejected and the alternative hypothesis turns out to be plausible.
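The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coin-flip scenario, the data (61 heads in 100 flips), the 0.05 significance level, and the helper name `binomial_two_sided_p` are all hypothetical, and an exact binomial test is just one of many possible testing methods.

```python
from math import comb

def binomial_two_sided_p(successes: int, trials: int, p0: float = 0.5) -> float:
    """Exact two-sided P-value: the total probability, under the null
    hypothesis, of every outcome that is no more likely than the observed one."""
    pmf = [comb(trials, k) * p0**k * (1 - p0) ** (trials - k)
           for k in range(trials + 1)]
    observed = pmf[successes]
    # The small tolerance guards against floating-point ties.
    return sum(p for p in pmf if p <= observed * (1 + 1e-9))

# Step 1: hypotheses. Null: the coin is fair (p = 0.5). Alternative: it is not.
# Step 2: testing plan. Flip 100 times, count heads, use alpha = 0.05.
alpha = 0.05
heads, flips = 61, 100  # hypothetical data, not from the article

# Step 3: analysis. The test statistic is the head count; compute its P-value.
p_value = binomial_two_sided_p(heads, flips)

# Step 4: decision. Reject the null when the P-value is below alpha.
reject_null = p_value < alpha
print(f"P-value = {p_value:.4f}, reject null hypothesis: {reject_null}")
```

With these made-up numbers the P-value falls below 0.05, so the null hypothesis of a fair coin would be rejected. A count closer to 50 heads would not lead to rejection.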
Having looked at the different aspects of hypothesis testing, we shall now look at the different methods. The two most common types of hypothesis testing methods are as follows -
The frequentist approach, or the traditional approach to hypothesis testing, is a method that draws its conclusions from the current data alone.
The supposed truths and assumptions are based on the current data, and a set of two hypotheses is formulated. A very popular subtype of the frequentist approach is Null Hypothesis Significance Testing (NHST).
The NHST approach (involving the null and alternative hypotheses) has been one of the most widely used methods of hypothesis testing in statistics since the mid-20th century.
A less conventional and more modern method, Bayesian hypothesis testing evaluates a hypothesis by combining what was believed before seeing the data, known as the prior probability, with the current data.
The result obtained is the posterior probability of the hypothesis. In this method, the researcher relies on the prior probability and the posterior probability to conduct the hypothesis test at hand.
Starting from the prior probability, the Bayesian approach assesses how probable a hypothesis is in light of the data. The Bayes factor, a major component of this method, is the ratio of the likelihood of the data under one hypothesis to the likelihood of the data under the other.
The Bayes factor is the indicator of the plausibility of either of the two hypotheses that are established for hypothesis testing.
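As a rough illustration of how the prior probability, the Bayes factor, and the posterior probability fit together, here is a minimal sketch. Everything specific in it is an assumption made for the example: the coin-flip data, the two point hypotheses (p = 0.5 versus p = 0.6), the 50/50 prior, and the helper name `binomial_likelihood`.

```python
from math import comb

def binomial_likelihood(successes: int, trials: int, p: float) -> float:
    """Probability of the observed head count if the coin's head rate is p."""
    return comb(trials, successes) * p**successes * (1 - p) ** (trials - successes)

heads, flips = 61, 100  # hypothetical data, not from the article

# Null hypothesis: p = 0.5. Alternative hypothesis: p = 0.6 (illustrative choice).
likelihood_null = binomial_likelihood(heads, flips, 0.5)
likelihood_alt = binomial_likelihood(heads, flips, 0.6)

# Bayes factor: how much better the alternative explains the data than the null.
bayes_factor = likelihood_alt / likelihood_null

# Prior probability of the alternative (assumed 50/50 before seeing any data).
prior_alt = 0.5
prior_odds = prior_alt / (1 - prior_alt)

# Posterior odds = Bayes factor x prior odds; convert odds back to a probability.
posterior_odds = bayes_factor * prior_odds
posterior_alt = posterior_odds / (1 + posterior_odds)

print(f"Bayes factor = {bayes_factor:.1f}, posterior P(alternative) = {posterior_alt:.2f}")
```

A Bayes factor above 1 favors the alternative and a value below 1 favors the null. A different prior would give a different posterior from the same data.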
To conclude, hypothesis testing, a way to check the plausibility of an assumption, can be done through different methods - the Bayesian approach or the frequentist approach.
Although the Bayesian approach relies on a prior probability for the hypothesis, the frequentist approach draws its conclusions from the current data without one. The elements involved in hypothesis testing include the significance level, P-value, test statistic, and method of hypothesis testing.
A significant way to determine whether a hypothesis stands true or not is to verify the data samples and identify the plausible hypothesis among the null hypothesis and alternative hypothesis.
You must have heard about hypotheses that led to major scientific achievements. A hypothesis is a milestone in any research; it is the point where we propose an explanation to analyze. The hypothesis of any research corresponds to the assumptions we draw from the evidence gathered so far. It consists of points or concepts that are still to be tested. Now, let us learn what exactly a hypothesis means and the types of hypotheses, along with examples.
An assumption that is made based on some limited evidence collected is known as a hypothesis. It is the beginning point of study that translates research questions into predictions that might or might not be true. It depends on the variables and population used, also the relation between the variables. The hypothesis used to test the relationship between two or multiple variables is known as the research hypothesis.
The properties of the hypothesis are as follows:
It should be empirically testable, irrespective of whether it turns out to be right or wrong.
It should establish the relationship between the variables that are considered.
It must be specific, clear, and precise.
It should leave scope for future studies and further tests.
It should be testable within a reasonable time, and it must be reliable.
Hypotheses can be classified as follows:
Null hypothesis
Simple hypothesis
Directional hypothesis
Complex hypothesis
Non-directional hypothesis
Causal and associative hypothesis
The null hypothesis states that one variable doesn't affect the other variables being studied. It asserts that two factors or groups are independent of each other, or that some traits of a population or process are identical. To contradict or invalidate the null hypothesis, we must assess the likelihood of the alternative hypothesis in addition to the null hypothesis.
There are two types of variables, i.e., dependent and independent variables. A simple hypothesis shows the relationship between one dependent and one independent variable. For example, if you pump petrol into your bike, you can go for long rides. Here the distance you can ride is the dependent variable and the petrol is the independent one.
A directional hypothesis is a researcher's prediction of a positive or negative change, relationship, or difference between two variables in a population. This statement is often supported by prior research, a widely established theory, considerable experience, or relevant literature.
For example, students who do proper revision and assignments could score more marks than the students who skipped. Here, we already know the process and its impact on the outcome. This is what we call a directional hypothesis.
The complex hypothesis shows the relationship that comes between two or more dependent and independent variables. For example, if you pump petrol in your bike, you can go for long rides, also you become an expert in riding a bike, you explore more places and come across new things.
A non-directional hypothesis is used when no theory predicts the outcome. Unlike the directional hypothesis, it makes no prediction about direction. We can say there is a relation between the variables, but its direction and nature are unknown.
If a change in one variable is accompanied by a change in the other, without one necessarily causing the other, we say the hypothesis is associative. Meanwhile, the causal hypothesis comes into play when a cause-and-effect interaction occurs between two or more variables.
The major sources of hypothesis are:
Scientific theories
Personal experience and the conclusions drawn from it
Studies conducted in the past
Resemblances between phenomena, that is, the patterns they have in common
Common thoughts and thinking
The functions of hypothesis are as follows:
It tells us the specific aspects of studies we investigate. It provides study with focus.
The construction of the hypothesis leads to objectivity in the investigation.
It helps to formulate the theory for the research work and sort out what is wrong and right.
It filters out the data that have to be collected for the work.
Some examples of hypotheses are as follows
Consumption of tobacco leads to cancer. This is an example of a simple hypothesis.
If a person works out daily, his/her skin, body, and mind remain healthy and fresh. This is an example of a directional hypothesis.
If you consume tobacco, it not only causes cancer but also affects your brain, turns your lips black, etc. This is an example of a complex hypothesis.
Role of hypothesis in the scientific method
Experimental designing
Predicting results
Background research
Question formation
Data collection
Verification of results
Concluding the experiment
Being a future reference for the further studies
In conclusion, it can be understood that a hypothesis is an assumption that researchers make on the basis of the limited evidence collected. It is the starting point of study that translates research questions into predictions. The various types of hypotheses include Null Hypothesis, Simple hypothesis, Directional hypothesis, Complex hypothesis, Non-directional hypothesis, and Causal and associative hypothesis. We proceed with our research or experiments according to the hypothesis we design.
1. Why is a hypothesis important?
Hypothesis plays an important role in any research project; it's a stepping stone to proving a theory. Hypothesis serves in establishing a connection to the underlying theory and particular research subject. It helps in data processing and evaluates the reliability and validity of the study. It offers a foundation or supporting evidence to demonstrate the accuracy of the study. A hypothesis allows researchers not only to get a relationship between variables, but also to predict a relationship based on theoretical guidelines and/or empirical proof.
2. How do I write a hypothesis?
Writing a good hypothesis starts before you even begin to type. As with many tasks, preparation is vital, so you begin by conducting your own analysis and reading all you can about the subject you decide to research. From there, you’ll gain the information you need to decide where your focus within the subject will lie. Keep in mind that a hypothesis is a prediction of the relationship that exists between two or more variables. The hypothesis should be straightforward and concise, and the expected result should be stated clearly, with no assumptions about the reader's knowledge.
3. What are a few examples of hypotheses?
"Consumption of drugs leads to depression" is an example of a simple hypothesis. "If a person has a proper diet plan, his/her skin, body, and mind remain healthy and fresh" is an example of a directional hypothesis. Two further examples: if you consume drugs, it not only causes depression but also affects your brain, leads to addiction, etc.; and if you pump petrol in your bike, you can go for long rides, you also become an expert in riding a bike, and you explore more places and come across new things.
Adam Hayes, Ph.D., CFA, is a financial writer with 15+ years Wall Street experience as a derivatives trader. Besides his extensive derivative trading expertise, Adam is an expert in economics and behavioral finance. Adam received his master's in economics from The New School for Social Research and his Ph.D. from the University of Wisconsin-Madison in sociology. He is a CFA charterholder as well as holding FINRA Series 7, 55 & 63 licenses. He currently researches and teaches economic sociology and the social studies of finance at the Hebrew University in Jerusalem.
Yarilet Perez is an experienced multimedia journalist and fact-checker with a Master of Science in Journalism. She has worked in multiple cities covering breaking news, politics, education, and more. Her expertise is in personal finance and investing, and real estate.
A null hypothesis is a type of statistical hypothesis that proposes that no statistical significance exists in a set of given observations. Sometimes referred to simply as the “null,” it is represented as H0. Hypothesis testing is used to assess the credibility of a hypothesis by using sample data.
The null hypothesis, also known as “the conjecture,” is used in quantitative analysis to test theories about markets, investing strategies, and economies to decide if an idea is true or false.
A gambler may be interested in whether a game of chance is fair. If it is, then the expected earnings per play come to zero for both players. If it is not, then the expected earnings are positive for one player and negative for the other.
To test whether the game is fair, the gambler collects earnings data from many repetitions of the game, calculates the average earnings from these data, then tests the null hypothesis that the expected earnings are not different from zero.
If the average earnings from the sample data are sufficiently far from zero, then the gambler will reject the null hypothesis and conclude the alternative hypothesis—namely, that the expected earnings per play are different from zero. If the average earnings from the sample data are near zero, then the gambler will not reject the null hypothesis, concluding instead that the difference between the average from the data and zero is explainable by chance alone.
A null hypothesis can only be rejected, not proven.
The null hypothesis assumes that any kind of difference between the chosen characteristics that you see in a set of data is due to chance. For example, if the expected earnings for the gambling game are truly equal to zero, then any difference between the average earnings in the data and zero is due to chance.
Analysts look to reject the null hypothesis because doing so is a strong conclusion. This requires evidence in the form of an observed difference that is too large to be explained solely by chance. Failing to reject the null hypothesis—that the results are explainable by chance alone—is a weak conclusion because it allows that while factors other than chance may be at work, they may not be strong enough for the statistical test to detect them.
An important point to note is that we are testing the null hypothesis because there is an element of doubt about its validity. Whatever information goes against the stated null hypothesis is captured in the alternative (alternate) hypothesis (H1).
For the examples below, the alternative hypotheses would be:
Students score an average that is not equal to seven.
The mean annual return of the mutual fund is not equal to 8%.
In other words, the alternative hypothesis is a direct contradiction of the null hypothesis.
Here is a simple example: A school principal claims that students in their school score an average of seven out of 10 in exams. The null hypothesis is that the population mean is 7.0. To test this null hypothesis, we record marks of, say, 30 students (sample) from the entire student population of the school (say, 300) and calculate the mean of that sample.
We can then compare the (calculated) sample mean to the (hypothesized) population mean of 7.0 and attempt to reject the null hypothesis. (The null hypothesis here, that the population mean is 7.0, cannot be proved using the sample data. It can only be rejected.)
Take another example: The annual return of a particular mutual fund is claimed to be 8%. Assume that the mutual fund has been in existence for 20 years. The null hypothesis is that the mean return is 8% for the mutual fund. We take a random sample of annual returns of the mutual fund for, say, five years (sample) and calculate the sample mean. We then compare the (calculated) sample mean to the (claimed) population mean (8%) to test the null hypothesis.
For the above examples, the null hypotheses are:
Example A: Students in the school score an average of seven out of 10 in exams.
Example B: The mean annual return of the mutual fund is 8%.
For the purposes of determining whether to reject the null hypothesis (abbreviated H0), said hypothesis is assumed, for the sake of argument, to be true. Then the likely range of possible values of the calculated statistic (e.g., the average score on 30 students’ tests) is determined under this presumption (e.g., the range of plausible averages might range from 6.2 to 7.8 if the population mean is 7.0).
If the sample average is outside of this range, the null hypothesis is rejected. Otherwise, the difference is said to be “explainable by chance alone,” being within the range that is determined by chance alone.
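To make the "range of plausible averages" concrete, here is a small sketch for the exam example. It rests on assumptions the article does not state: a population standard deviation of 2.24 marks (picked only because it roughly reproduces the 6.2 to 7.8 range quoted above), a 5% significance level, a normal model for the sample mean, and no correction for the finite school population. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def plausible_range(null_mean, population_sd, sample_size, alpha=0.05):
    """Range of sample averages that chance alone can explain if the null is true."""
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    margin = z_critical * population_sd / sqrt(sample_size)
    return null_mean - margin, null_mean + margin

def reject_null(sample_mean, low, high):
    """Reject the null when the sample average falls outside the plausible range."""
    return not (low <= sample_mean <= high)

# Null hypothesis: population mean = 7.0; sample of 30 students.
low, high = plausible_range(null_mean=7.0, population_sd=2.24, sample_size=30)
print(f"plausible range: {low:.2f} to {high:.2f}")

print(reject_null(6.0, low, high))   # sample average outside the range
print(reject_null(7.3, low, high))   # sample average inside the range
```

A larger sample or a smaller standard deviation narrows the range, so smaller departures from 7.0 become detectable.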
As an example related to financial markets, assume Alice sees that her investment strategy produces higher average returns than simply buying and holding a stock . The null hypothesis states that there is no difference between the two average returns, and Alice is inclined to believe this until she can conclude contradictory results.
Refuting the null hypothesis would require showing statistical significance, which can be found by a variety of tests. The alternative hypothesis would state that the investment strategy has a higher average return than a traditional buy-and-hold strategy.
One tool that can determine the statistical significance of the results is the p-value. A p-value represents the probability that a difference as large as or larger than the observed difference between the two average returns could occur solely by chance, assuming the null hypothesis is true.
A p-value that is less than or equal to 0.05 is often taken as evidence against the null hypothesis. If Alice conducts one of these tests, such as a test using the normal model, and it shows a significant difference between her returns and the buy-and-hold returns (the p-value is less than or equal to 0.05), she can then reject the null hypothesis and conclude the alternative hypothesis.
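A test "using the normal model" for Alice's question might look like the sketch below. The return figures, sample sizes, and the function name `one_sided_p_value` are invented for illustration, and the sketch treats the two return series as independent samples, which real strategy and benchmark returns often are not.

```python
from math import sqrt
from statistics import NormalDist

def one_sided_p_value(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """P-value for the alternative 'A has a higher mean than B' (normal model)."""
    standard_error = sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    z = (mean_a - mean_b) / standard_error
    # Probability of a difference at least this large if the true means are equal.
    return 1 - NormalDist().cdf(z)

# Hypothetical monthly returns in percent over 120 months.
# Case 1: the strategy averages 1.2% versus 0.7% for buy-and-hold.
p_small_gap = one_sided_p_value(1.2, 4.0, 120, 0.7, 4.5, 120)

# Case 2: same volatility, but the strategy averages 1.7%.
p_large_gap = one_sided_p_value(1.7, 4.0, 120, 0.7, 4.5, 120)

for label, p in [("small gap", p_small_gap), ("large gap", p_large_gap)]:
    print(f"{label}: p = {p:.3f}, reject null: {p <= 0.05}")
```

In the first case the strategy's average is higher, yet the p-value stays above 0.05, so Alice could not reject the null hypothesis. Only the larger gap in the second case clears the threshold.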
The analyst or researcher establishes a null hypothesis based on the research question or problem they are trying to answer. Depending on the question, the null may be identified differently. For example, if the question is simply whether an effect exists (e.g., does X influence Y?), the null hypothesis could be H0: X = 0. If the question is instead, is X the same as Y, the H0 would be X = Y. If it is that the effect of X on Y is positive, H0 would be X ≤ 0, with the positive effect (X > 0) as the alternative. If the resulting analysis shows an effect that is statistically significantly different from zero, the null can be rejected.
In finance, a null hypothesis is used in quantitative analysis. It tests the premise of an investing strategy, the markets, or an economy to determine if it is true or false.
For instance, an analyst may want to see if two stocks, ABC and XYZ, are closely correlated. The null hypothesis would be that there is no correlation between ABC and XYZ.
Statistical hypotheses are tested by a four-step process. The first is for the analyst to state the two hypotheses so that only one can be right. The second is to formulate an analysis plan, which outlines how the data will be evaluated. The third is to carry out the plan and physically analyze the sample data. The fourth and final step is to analyze the results and either reject the null hypothesis or claim that the observed differences are explainable by chance alone.
An alternative hypothesis is a direct contradiction of a null hypothesis. This means that if one of the two hypotheses is true, the other is false.
A null hypothesis states there is no difference between groups or relationship between variables. It is a type of statistical hypothesis and proposes that no statistical significance exists in a set of given observations. “Null” means nothing.
The null hypothesis is used in quantitative analysis to test theories about economies, investing strategies, and markets to decide if an idea is true or false. Hypothesis testing assesses the credibility of a hypothesis by using sample data. It is represented as H0 and is sometimes simply known as “the null.”
Correction—July 23, 2024: This article was corrected to state accurate examples of null hypothesis in the Null Hypothesis Examples section.
It's essential for Ree's pot roast!
A good collection of cookware will make it easier to get dinner on the table. But if you're looking for a go-to pot that can do it all, shop for a Dutch oven. Sure, there's a time and place for more modern appliances, like Instant Pots or slow cookers, but a Dutch oven has a long history of being a trusty vessel for making big braises, soups, and one-pot meals that can go straight from the stovetop to the table. But what exactly is a Dutch oven and where did the name come from?
Dutch ovens are by far some of the most popular cooking pots in the kitchen. So, it's no wonder Ree Drummond has added quite a few varieties to The Pioneer Woman collection. Over the years, she's come out with both large and small sizes, pretty colors, and even some floral Dutch ovens featuring her signature prints. Not to mention, Dutch ovens can range in shape and material depending on your needs. So, should you go with a cast iron Dutch oven, enameled, or even a ceramic pot? Keep reading for a run-down on everything you need to know about the kitchen staple, including different types of Dutch ovens, how to use them, and how to care for them.
A Dutch oven is a heavy, thick-walled pot with a tight-fitting lid. You might recognize the enameled cast iron pots that come in different colors, but they can also come in other materials as well (more on that below). What makes a Dutch oven stand out from other pots and pans on the shelf is that it is ultra-sturdy and retains heat well. That’s why Dutch ovens are so useful for making soups, stews, sauces, and braises. Not to mention, many Dutch ovens are known to be timeless vessels that can last for years.
Dutch ovens may have been around since the 17th century! As the story goes, an Englishman named Abraham Darby visited the Netherlands where they were known to create shallow pots made of brass using a casting process with sand molds. Darby took inspiration from the process but used a cheaper material—iron—to perfect the results. He later patented the process for casting iron in sand in 1707.
Many people say the name "Dutch" oven comes from Darby's visits to the Dutch factories. More than 300 years later and we're still calling them Dutch ovens. However, in the Netherlands, these types of pots are simply referred to as braadpan, which means roasting pan.
As you now know, Dutch ovens were all originally made from cast iron. But in later years, the French company Le Creuset started coating them with enamel and labeled them French ovens instead. The name didn't quite stick, but now you know the difference between a Dutch and French oven.
The best part about a Dutch oven is its versatility! You can use it in so many different ways. Ree likes to put her Dutch oven to good use for one-pot dinners like her perfect pot roast or homemade chili, but you can also use it to cook beans, roast a whole chicken, or bake a crusty loaf of bread. Not only can Dutch ovens get searingly hot, but they'll also hold their heat well. So, whether you're simmering something on the stovetop, deep frying something in oil, or baking something in the oven, there's nothing this sturdy pot can't do.
Dutch ovens can come in different materials and sizes. They can be as small as mini 1- to 2-quart Dutch ovens or as large as a 15-quart Dutch oven that can feed a crowd. The most common size is a 5-quart Dutch oven, like The Pioneer Woman's Enamel on Cast Iron Dutch Oven with a pretty embossed lid. Speaking of enameled cast iron, that’s just one of the materials for Dutch ovens that you can find. There are also non-enameled cast-iron Dutch ovens, which are great for Dutch oven recipes of all kinds but do require seasoning and special care, along with ceramic Dutch ovens, which tend to be lighter weight but less durable than the enameled kind.
Keep these care tips in mind to make the most of your Dutch oven and ensure that it lasts for years to come. To start, avoid using metal utensils on enameled cast iron to prevent scratches. Instead, opt for wooden spoons or silicone spatulas. When cleaning your Dutch oven, it's best to hand wash rather than sticking it in the dishwasher. For stubborn foods, allow the Dutch oven to soak in soapy water for 15 minutes, then use a soft sponge and rinse away. Note: Non-enameled cast iron Dutch ovens should be cleaned the same way you would clean a cast-iron skillet.
This story was updated to correct a misspelling/typo.
Sheldon Kennedy remembers playing his first hockey game when he was 4 years old, on an outdoor rink in Winnipeg, Manitoba.
His family lived on a dairy farm several hours west in the Canadian province. Kennedy played tournaments on weekends and, in between, the games shifted to the road once the street lights came on.
His love of hockey softened the hard work on the farm and the anxiety brought on by a father he describes as angry and violent.
“We weren't modeled with a great loving relationship between mother and father, I can tell you that,” Kennedy, who played parts of eight seasons in the NHL, recalled in a 2022 Players’ Tribune podcast .
Kennedy’s junior hockey coach offered much more of a connection. He called Kennedy’s parents and invited the young player to stay at his house and discuss Kennedy's future. They couldn’t get Kennedy on the bus fast enough.
It was a decision that altered his life forever. The coach abused him for a number of years.
“I had the love of the game stolen from me,” Kennedy said.
The experience drove Kennedy to depression, substance abuse and suicide attempts. In 1996, late in his NHL career, he became one of the first prominent male athletes to come forward about being sexually abused.
He became a hero in Canada and in 2004, he started the Respect Group with Wayne McNeil to help sports organizations across his nation prevent what happened to him.
Sexual abuse in youth serving organizations is a recognized problem in countries all over the world, according to Canada’s Child Advocacy Centre. Protect Youth Sports, a U.S.-based organization, runs more than 1.1 million volunteers through background checks each year.
“The stories are scary,” says RJ Frasca, Protect Youth Sports’ vice president. “It's incredibly important to have communication lines open with your kids. I think it's much more effective than the background screening itself."
USA TODAY Sports spoke with Frasca and McNeil, Kennedy’s business partner, about how we can recognize signs of physical and emotional abuse and prevent it in youth sports.
(Questions and responses are edited for length and clarity.)
In the United States, a number of companies, such as Protect Youth Sports and NCSI, offer background screenings for schools, leagues, camps and other youth organizations and institutions.
The Respect Group focuses on abuse prevention for coaches, parents, officials and athletes. McNeil says screening, for which his company’s clients use a third party, can offer a false sense of security.
USA TODAY: Is there a success rate with background checks?
RJ Frasca: Of the 1.1 million visitors we run, about 6% come back with some type of criminal conviction or they're on the sex offender registry. Most recently, we had a coach come through and we caught him on a very recent criminal charge. Then he resubmitted four times with different dates of birth, social security numbers, really trying to get around the system. And I don't know his backstory, maybe he just really wanted to coach his son's team or daughter's team, but kept going through to the point where then he finally admitted it. But he said that he had legal court documents that showed that (the charge) had been dismissed and he had some documents that falsified a judge’s signatures on it.
This one was not a sexual charge (but) it was a criminal conviction. We verified it with the courts. And (they said), ‘It's not been dismissed.’
Wayne McNeil: A police check feels good, but coaches that have been convicted often know how to play around the system, or they go to organizations that don't demand screening or a police check. And (there's people) that have been accused but never convicted (and) they don't show up in a police check.
I think what our approach is, empower the bystander.
The Respect In Sport program has trained about 2.5 million volunteers through its online programs in preventing abuse.
They learn we can’t just look for stranger danger: The white van or the guy with the mask. The vast majority of sexual abusers know their victims and go to great lengths to not only get close to the victim, but to establish themselves with the victim’s family. This process is known as grooming.
Grooming can come in the form of offering gifts to a child and making them feel special with one-on-one meetings away from the team.
Kennedy says his abuser got to know his parents and brother and then made himself the most trusted figure in Kennedy’s life by isolating Kennedy from them.
Abusers are known to target kids from broken homes where parents may be absent at times. Victims feel alone and trapped. Kennedy kept his secret for so long because, like many kids, he felt no one would believe him.
USA TODAY: What are some of the detecting skills that you highlight within your programs?
Wayne McNeil: A coach starts treating your kid a bit differently, maybe saying, 'Why don't we have a one-on-one practice? Can I pick you up?' A lot of parents turn off those signals because they want the kid to succeed. And they’re like, 'Oh my god, this Olympic coach is starting to spend a lot of time with my kid, that's probably a good thing.' Well, it could be, but highly unlikely that it is because they’ve got several people they need to coach and maybe this behavior is going down a different path. I always used to say that the kiss of death is when a coach tells a parent that your kid has potential and I personally can take them all the way to the podium or to the pros, just entrust your child to me.
RJ Frasca: Anybody can see a bruise (although kids may cover it with long sleeves out of embarrassment) but not necessarily a different behavior, how the coach is speaking to the kids or how your child is off in a corner somewhere and isolated. You need to be watching if there's an uptick in anxiety or fear, or if a specific kid on the team does not want to associate with a certain adult … you see the drawing away socially, social awkwardness or distancing; watch for those types of things, and if they can't be explained otherwise.
According to the latest data available from the U.S. Department of Health & Human Services, of the 558,899 victims of child abuse and neglect in 2022, 74.3% of victims experienced neglect, 17% were physically abused, 10.6% were sexually abused, and 6.8% were psychologically abused.
All of these types of abuse can unfortunately be found in youth sports.
BAHD behavior, as the Respect Group refers to bullying, abuse, harassment and discrimination, can overlap and be difficult to separate.
Its training module, to which USA TODAY Sports was granted access, stresses a volunteer doesn’t have to name a behavior in order to take action to stop it. We can just use our gut if something doesn’t seem right.
Related: Watch a safety training video provided by Protect Youth Sports (note: contains some graphic content)
USA TODAY: What are signs in kids' behavior that indicate there might be a problem?
Wayne McNeil: If there's a sudden change in the kid's behavior relative to the sport, you need to figure out what's going on. And oftentimes it could be bullying between other kids. They're not feeling accepted by their peers. Oftentimes it could be the coach excluding them from the play. There's all sorts of things that lead to sudden changes in a kid's behavior. And when that happens, whatever the change is, you need to be wondering, why doesn't my kid want to practice? Why doesn't my kid want to go to the game when they used to be so pumped about going to the game? And maybe there's something going on with the peer group or the coach that's causing that to happen. Conversely, if a coach is seeing behavior changes in the kid, they need to be aware of the fact that, maybe there's something going on at home, maybe there's some ugly things happening at school.
Predatory or abusive behavior often builds over time, according to the Respect Group. We can watch for behavior that seems out of the ordinary and catch the abuse before it starts. Don’t dismiss flirting between a coach and a younger participant, a coach who encourages inappropriate attention with vulgarities or excessive hugging or physical contact, even when it appears consensual.
Watch for a coach spending one-on-one time after practice or away from the group with one player while it’s still going on.
Establish a “Rule of Two,” which the Respect Group teaches, requiring there to be at least two leaders (one of the same gender) with a young person. One-on-one interaction must be within earshot of the other.
Learn to identify the forms of emotional abuse, which consists of both physical abuse (a coach who throws objects at or near someone to cause them to feel afraid or intimidated) and neglect (a coach who frequently ignores a player as a way to “toughen up” or motivate them to perform better), regardless of intent.
These are no longer behaviors accepted in coaching, and they can lead to emotional harm.
Coach Steve: What young athletes can learn from the late Frank Howard – and not Bob Knight
Have the courage to speak up if something doesn’t seem right.
USA TODAY: In terms of preventing abuse, are there things you teach adults to say to their kids about what to look out for?
RJ Frasca: It's really just working with the league and the community that you're putting your kids in the sports, to make sure that there's a good code of conduct reporting mechanism and program in place. And then, you know, participating in it. Don’t just drop your kids off; be part of it.
Only 38% of youth that are abused in any form or fashion ever report it. That's a terrifying number. So, are we encouraging them to really, really communicate and ask those right questions as they go along through the season? There's a lot of opportunity to increase your chances of communication, if you're asking the right questions, if you're talking to them at their level, at their age group, and really starting to dig in on like, 'What do you think of this? And how are the other kids? What happened today at practice? How did that happen? Why did you say this? And how do you feel?' Those type of questions.
When he was first beginning his journey of self-discovery, Kennedy rollerbladed across Canada to spread awareness for victims of abuse.
Speaking to people, he says he came to learn perpetrators prey on communities' ignorance and indifference.
In the United States, the SafeSport Act requires amateur athletics governing bodies to report awareness of any case of abuse immediately to local or federal law enforcement or to a child-welfare agency designated by the U.S. Justice Department and have education and training for adults who are in direct contact with athletes who are minors.
“Sheldon has a great saying,” McNeil says. “It's not common sense; it’s good sense that we need to be common again.”
USA TODAY: How do you promote a different type of coaching in your program that gives people the proper training on how to be coaches?
Wayne McNeil: If I were to encapsulate it, I would say it’s trying to create an environment that is psychologically safe, and, obviously, with concussions and so on, physically safe. It's not about skills development: Here's how to kick a soccer ball or shoot a puck. It's really giving coaches, if we're talking about the coach program, the insights and guidelines on how to create an environment that's respectful, welcoming and psychologically safe. Hey, if I treat my kid, the ref, the coach with respect and vice versa, chances are I'm going to have a psychologically safe environment for which that kid can participate.
People are like, 'Why do I have to take a program?' Well, the answer is, you're a good person, and you’ve probably never had this education, so we give you more tools to be a little bit better; it's advantageous for you and it's advantageous for your child.
We're under no illusions that our program is going to catch an abuser, and that's really not our focus. Our focus is, if you can empower everybody in a situation with a good education, they will be the ones that call out the anomalies, whether it be a parent or coach.
Steve Borelli, aka Coach Steve, has been an editor and writer with USA TODAY since 1999. He spent 10 years coaching his two sons’ baseball and basketball teams. He and his wife, Colleen, are now sports parents for two high schoolers. His column is posted weekly.
Got a question for Coach Steve you want answered in a column? Email him at [email protected]
Scientific Reports volume 14, Article number: 21922 (2024)
Land use and land cover change (LULCC) have profoundly altered land surface properties and ecosystem functions, including carbon and water production. While mapping these changes from local to global scales has become more achievable due to advancements in earth observations and remote sensing, linking land cover changes to ecosystem functions remains challenging, especially at regional scale. Our study attempts to fill this gap by employing a computationally efficient method and two types of widely used high-resolution satellite images. We first investigated the contribution of landscape composition to ecosystem function by examining how land cover and proportion affected gross primary production (GPP) and evapotranspiration (ET) at six macro-landscapes in Mongolia and Kazakhstan. We hypothesized that both ecosystem and landscape GPP and ET are disproportionate to their composition and, therefore, changes in land cover will have asymmetrical influences on landscape functions. We leveraged a computationally friendly linear downscaling approach to align the coarse spatial resolution of MODIS (500 m) with a fine-grain and localized land cover map developed from Landsat (30 m) for six provinces in countries where intensive LULCC occurred in recent decades. By establishing two metrics, the function-to-composition ratio (F/C) and the ratio of functional change to compositional change (ΔF/ΔC), we tested our hypothesis and evaluated the impact of land cover change on ecosystem functions within and among the landscapes. Our results show three major themes. (1) The five land cover types have signature downscaled ET and GPP that appear to vary between the two countries as well as within each country. (2) F/C of ET and GPP of forests is statistically greater than 1 (i.e., over-contributing), whereas F/C of grasslands and croplands is close to or slightly less than 1 (i.e., under-contributing). F/C of barrens is clearly lower than 1 but greater than zero.
Specifically, a unit of forest generates 1.085 unit of ET and 1.123 unit of GPP, a unit of grassland generates 0.993 unit of ET and GPP, and a unit of cropland produces 0.987 unit of ET and 0.983 unit of GPP. The divergent F/C values among the land cover classes support the hypothesis that landscape function is disproportionate to its composition. (3) ΔET/ΔC and ΔGPP/ΔC of forests and croplands showed negative values, while grasslands and barrens showed positive values, indicating that converting a unit of forest to other land cover leads to a decrease in ET and GPP, while converting units of grassland or barren to other land cover classes will result in increased ET and GPP. This linear downscaling approach for calculating F/C and ΔF/ΔC is labor-saving and cost-effective for rapid assessment on the impact of land use land cover change on ecosystem functions.
Introduction.
Land use and land cover change (LULCC) have profoundly and extensively modified land surface properties and, consequently, ecosystem functions. Direct alterations of land surfaces during LULCC include changes in vegetation (e.g., cover type, species composition, canopies, biomass), soil (e.g., bare coverage, texture) and microclimate (land surface temperature, vapor pressure deficit), whereas indirect influences touch all aspects of ecosystem processes and functions 1 , 2 , 3 , 4 , 5 . Scientific investigations of LULCC, as well as its causes and its consequences for ecosystem properties, people and societies, have been a central concern for several decades 6 , 7 , 8 , 9 . Across the Asia Drylands Belt (ADB), intensive and extensive LULCC has been jointly driven by rapid economic development, population growth, urban expansion, abrupt political shifts, and climate change 10 , 11 . Here the water-stressed ecosystems are under extreme pressure, with predictions of a drier and warmer climate and more frequent extreme climate events (e.g., droughts, heatwaves, dzuds), anticipating additional pressures on fragile ecosystems and societies with relatively less-advanced infrastructures. As in all the terrestrial regions, the most pressing question has been: What are the independent and interactive forcings from LULCC and climate change on ecosystems and societies? Taking a macroecology approach, we focus on the spatial and temporal changes of two major terrestrial ecosystem functions: gross primary production (GPP) and evapotranspiration (ET). GPP, the largest carbon flux term, is total carbon uptake through photosynthesis. ET is the sum of water loss from the land surface to the atmosphere via evaporation and transpiration. Both GPP and ET are tightly coupled with LULCC and climate.
Our specific study objectives are to: (1) quantify the magnitude of GPP and ET in major land cover classes in different regions and time periods; (2) explore how different direction of land cover changes (LCC) may result in similar GPP and ET in different parts of the ADB; and (3) identify the key LCCs that may cause disproportional changes in GPP and ET.
Rich earth observation satellites provide regular reflectance of land surfaces across continuous space, which can be used to model GPP, ET, land cover, and other properties. Among the remote sensing products, Moderate Resolution Imaging Spectroradiometer (MODIS) provides GPP and ET products at 500-m spatial resolution and 8-day frequency. Land cover maps and other products are also available. Ideally, we should be able to use land cover maps to examine the changes in ET or GPP. Unfortunately, the coarse spatial resolution from MODIS often prevents us from connecting GPP or ET directly to a specific land cover because many ecosystems are smaller than 500 m. Fortunately, NASA’s Landsat and ESA’s Sentinel satellites provide 10–30 m resolution reflectance measures for developing accurate land cover classifications 12 , 13 , 14 . Therefore, if MODIS GPP or ET can be downscaled to 30 m resolution, we can compare the interdependent changes between LCC and ecosystem functions.
Here we leverage a computational-friendly linear downscaling approach 15 , 16 , 17 , 18 to align the coarse resolution GPP and ET products with high-resolution land cover maps of Kazakhstan and Mongolia 19 to examine the changes in ecosystem functions with LULCC. Two specific hypotheses will be tested on how ecosystem GPP and ET might be affected by landscape composition and land cover changes. First, ecosystem functional contributions of a land cover type (F i ) are not proportional to its proportion of the land area (C i ) of a landscape or a region (where i indicates a specific land cover class). While this hypothesis is intuitive, a more interesting premise is that these disproportionate contributions vary significantly among the landscapes of a region and/or among regions. This hypothesis can be mathematically expressed as.
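One form consistent with the definitions of F i and C i given above can be written as follows. This is an assumed notation for illustration and not necessarily the authors' exact expression:

```latex
\frac{F_i}{C_i} \neq 1
```

Here F_i is the share of landscape function (ET or GPP) contributed by land cover class i and C_i is the share of land area occupied by that class, so a ratio of exactly 1 would mean a proportional contribution.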
Lessons learned from testing this hypothesis in multiple regions of Kazakhstan and Mongolia will have profound consequences on understanding and managing landscapes. Our second hypothesis is that changes in landscape function are not equally caused by the changes of a land cover class (i):
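One form consistent with the wording above can be written as follows. The use of a second class index j is an assumption made for illustration, not necessarily the authors' exact expression:

```latex
\frac{\Delta F_i}{\Delta C_i} \neq \frac{\Delta F_j}{\Delta C_j} \quad \text{for } i \neq j
```

Here \Delta C_i is the change in the areal share of class i between two dates and \Delta F_i is the accompanying change in its functional contribution.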
In other words, a same amount of land conversion from a cover type will yield very different ΔF depending on the cover type after conversion. By testing this hypothesis, we can directly connect land cover change (∆C i ) to functional consequences (ΔF i ). Similarly, the unequal contributions of a cover type may vary significantly among the regions. These hypotheses are tested in grassland biomes 20 in six provinces in Kazakhstan and Mongolia (three provinces in each country) where detailed land cover classes for the past three decades are available 19 .
We selected three provinces in each country as our macrosystems in Kazakhstan and Mongolia based on the climate, dominant vegetation and soils, and socioeconomic positions in the country. Kazakhstan is the largest landlocked country in the world (2,724,900 km²) and divided into 13 regions (provinces). It has a climate ranging from hot summer humid continental climate in the north to cold semi-arid climate in the south according to Köppen-Geiger Climate Classification (Table 1 ). Mongolia is the second largest landlocked country (1,564,116 km²) and divided into 21 Aimags (provinces). It ranges from Monsoon-influenced subarctic climate to semi-arid climate in the middle, to cold desert climate in the south. The six study provinces are Aktobe, Akmola and Almaty in Kazakhstan, and Arkhangai, Tov and Dornod in Mongolia. Overall, the selected provinces in Kazakhstan are larger in size than those in Mongolia (Fig. 1 ). Aktobe is the largest among the six provinces (301,697 km²), and the smallest one is Arkhangai (54,952 km²). Mongolia sits on the Mongolian Plateau that has a base elevation of 800 m a.s.l. Major geomorphological features of the study provinces are provided in Table 1 . Other comparisons of the states and changes in basic biophysical, social, and economic conditions of Kazakhstan and Mongolia can be found in Chen et al. (2022) 11 .
Spatial distribution of average ET and GPP in Kazakhstan and Mongolia in 2020. This map was generated on Google Earth Engine with the color theme stretched to two standard deviations. Gobi Desert in southwest Mongolia does not have data coverage from MODIS databases. Polygons with white lines are the six study provinces in Kazakhstan (Aktobe, Akmola, and Almaty) and Mongolia (Arkhangai, Tov and Dornod).
Two spatial datasets are needed to downscale the coarse resolution MODIS products (500 m) of ET and GPP to fine resolution (30 m) land cover classifications, following the linear downscaling method of Chen et al. (2019) 16 (a.k.a. Dasymetric modeling approach), which has been widely used in human geography 21 and landscape studies 15 , 17 , 18 . In brief, the cumulative ET (or GPP) of a MODIS pixel is a linear combination of the ET (or GPP) values of all land cover classes weighted by the compositional proportion (0–1) of each land cover class. This downscaling modeling approach generates one instance ET (or GPP) value for each land cover class of the study landscape. The land cover maps from Yuan et al. (2022) 19 were the most recent available at the time of this study, and the maps from 2000, 2010, and 2020 are used in this study, which can be accessed upon request. MODIS ET and GPP data can be accessed through the Earth Engine Code Editor and downloaded from Earth Engine Data Catalog ( https://developers.google.com/earth-engine/datasets/catalog/modis ).
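The linear-combination idea described above can be sketched in a few lines. The following is a minimal illustration in plain Python, not the authors' R workflow: it assumes each coarse pixel value equals the sum of class values weighted by class proportions, and solves for one value per class by ordinary least squares. The function names, class order, and numbers are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so that row operations act on both sides at once.
    m = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: a class may be absent or collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def downscale_tile(compositions, pixel_values):
    """Least-squares estimate of one ET (or GPP) value per land cover class.

    compositions: per-pixel class proportions (0-1), one row per coarse pixel.
    pixel_values: coarse-pixel ET or GPP, one value per row.
    """
    k = len(compositions[0])
    # Normal equations: (P^T P) beta = P^T y
    ptp = [[sum(row[i] * row[j] for row in compositions) for j in range(k)]
           for i in range(k)]
    pty = [sum(row[i] * y for row, y in zip(compositions, pixel_values))
           for i in range(k)]
    return solve_linear_system(ptp, pty)


# Hypothetical class values (forest, grassland, barren) used to build test pixels.
true_values = [330.0, 200.0, 80.0]
compositions = [
    [0.6, 0.3, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.2, 0.6],
    [0.3, 0.5, 0.2],
]
pixel_values = [sum(p * v for p, v in zip(row, true_values)) for row in compositions]
estimates = downscale_tile(compositions, pixel_values)
```

With noise-free synthetic pixels like these, the estimates recover the class values used to build them. The solver raises an error when a class is missing or the proportions are collinear, which is the practical reason a tile needs all classes to be represented.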
A prerequisite to applying the abovementioned downscaling approach is that the ecosystem function of any given cover class within the study landscape should be the same. One way to meet this requirement is to limit the size of the ‘landscape,’ but it would be an obvious violation if we treated each province as a whole landscape to downscale ET and GPP. In this study, we tile-cut each province into 50 × 50 km segments to serve as the landscapes (Fig. 2 A). To address any mismatches between tile boundaries and provincial boundaries, tiles with ≥ 80% of their area falling outside the province were excluded. Each tile has 10,000 MODIS pixels (Fig. 2 B), and each MODIS pixel includes 289 Landsat pixels (Fig. 2 C). For example, Dornod province ended up having 156 tiles, with 50 tiles excluded from further downscaling. The linear downscaling was independently applied for each tile to compute ET (and GPP) of each land cover type in 2000, 2010, and 2020.
Method for dividing a province into 50 × 50 km tiles for linearly downscaling ET and GPP to match land cover class at 30 m resolution. ( A ) Example of a tile in Dornod province on a classified land cover map in 2020 (Yuan et al. 2022). ( B ) ET in 2020 for a demonstrative tile (50 × 50 km) in Dornod. ( C ) Land cover map of a demonstrative MODIS pixel (500 m) showing a heterogeneous cover distribution at 30 m resolution.
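The tile bookkeeping above can be checked with a short sketch. It is illustrative Python rather than the authors' code, and the function names and the way the outside fraction is supplied are assumptions.

```python
TILE_SIZE_M = 50_000      # 50 km tile edge, as stated in the text
MODIS_PIXEL_M = 500       # MODIS pixel edge, as stated in the text


def modis_pixels_per_tile(tile_size_m=TILE_SIZE_M, pixel_size_m=MODIS_PIXEL_M):
    """Number of MODIS pixels that fit in one square tile."""
    per_side = tile_size_m // pixel_size_m
    return per_side * per_side


def keep_tile(fraction_outside, max_outside_fraction=0.8):
    """Apply the exclusion rule: drop tiles with >= 80% of their area outside the province."""
    return fraction_outside < max_outside_fraction


pixels = modis_pixels_per_tile()          # 100 pixels per side
kept = [keep_tile(f) for f in (0.10, 0.79, 0.80, 0.95)]
```

The pixel count reproduces the 10,000 MODIS pixels per tile given in the text; the fraction of a tile lying outside the province would in practice come from a polygon intersection, which is not shown here.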
There is a mismatch in temporal resolution between MODIS and Landsat datasets. MODIS ET and GPP are annually available from 2001 through 2020, whereas land cover products are for decadal periods during 2000–2020 (i.e., no annual land cover maps are available). We generated pseudo annual land cover maps by duplicating decadal year land cover maps. For example, the 2010 land cover map is used for the years 2008, 2009, 2010 and 2011. This approach is premised on the assumption that LCC is expected to be negligible over a 5-year period, and the decadal land cover map was classified based on a temporal composite of images from 1–2 years (Yuan et al. 2022). Consequently, the 2000 land cover map served as the pseudo land cover map for 2001, 2002, 2003, 2004, and 2005. Similarly, the 2020 land cover map was used for 2016, 2017, 2018 and 2019.
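The year pairing described above amounts to a lookup table. The sketch below, in illustrative Python with hypothetical names, encodes only the years the text lists; years the text does not mention return None rather than a guessed assignment.

```python
# MODIS year -> decadal land cover map year, limited to the pairings stated in the text.
PSEUDO_MAP_YEAR = {}
PSEUDO_MAP_YEAR.update({year: 2000 for year in range(2001, 2006)})  # 2001-2005
PSEUDO_MAP_YEAR.update({year: 2010 for year in range(2008, 2012)})  # 2008-2011
PSEUDO_MAP_YEAR.update({year: 2020 for year in range(2016, 2020)})  # 2016-2019


def land_cover_map_year(modis_year):
    """Return the decadal map paired with a MODIS year, or None if the text gives no pairing."""
    return PSEUDO_MAP_YEAR.get(modis_year)


paired = {year: land_cover_map_year(year) for year in (2003, 2009, 2013, 2018)}
```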
Another prerequisite to the downscaling method is that land cover class should be more or less equal in a landscape 16 . If one land cover type dominates a landscape, the linear downscaling method may not converge as the model needs presence of all cover classes 11 . In Mongolia and Kazakhstan, the grassland class often dominates the landscapes 19 , 22 , 23 , as is illustrated in extreme cases such as Dornod, where grasslands make up > 90% and barrens < 10%. Such a grassland monopoly becomes even more prominent when a landscape is divided into small tiles and the grassland class occasionally gets as high as 95%. It clearly violates the assumption of linear downscaling method that land cover classes should be equally represented in a landscape. The solution for this is to lump rare land cover classes into the dominant land cover class to meet the requirements. A rare land cover class is operationally defined as a class whose proportion is < 5% of a MODIS pixel, whereas a dominant land cover class is defined as > 70%. In most cases in our study landscape where classes were combined, barren is the rare cover class, and it is merged into grassland class.
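The lumping rule can be sketched as follows. This is an illustrative Python reading of the thresholds stated above (< 5% rare, > 70% dominant), not the authors' implementation; the function name and the choice to lump only when a dominant class exists are assumptions.

```python
def lump_rare_classes(proportions, rare_threshold=0.05, dominant_threshold=0.70):
    """Merge rare classes into the dominant class of one MODIS pixel.

    proportions: dict mapping class name -> proportion (0-1) of the pixel.
    Returns a new dict; the input is left unchanged.
    """
    dominant = max(proportions, key=proportions.get)
    # Assumption: lumping applies only when one class exceeds the dominance threshold.
    if proportions[dominant] <= dominant_threshold:
        return dict(proportions)
    merged = {}
    moved = 0.0
    for name, share in proportions.items():
        if name != dominant and 0 < share < rare_threshold:
            moved += share       # rare class: hand its share to the dominant class
            merged[name] = 0.0
        else:
            merged[name] = share
    merged[dominant] += moved
    return merged


pixel = {"grassland": 0.92, "barren": 0.03, "cropland": 0.05}
lumped = lump_rare_classes(pixel)
```

In this example the barren share falls below the 5% threshold and is merged into grassland, while cropland sits exactly at 5% and is left alone because the rule is a strict inequality.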
The large spatial coverage (924,441 km² in total) and the average size of provinces (283,642 km²) in this study amount to more than 300 million Landsat pixels, requiring extreme computing power. For this we accessed the High-Performance Computing Center (HPCC) of the Institute for Cyber-Enabled Research at Michigan State University. All spatial analyses were performed in R on HPCC. Data preparation began with aligning pixels from two different coordinate systems. The default coordinate reference system (CRS) of land cover maps from Landsat images is the geographic coordinate system (WGS 84), which is different from the native sinusoidal projection of the MODIS ET/GPP products. Raster layers of land cover maps and MODIS products (ET and GPP) were reprojected to Asia North Albers Equal Area Conic. After the reprojection, four steps were followed for each pair of land cover maps and ET (or GPP). (1) A 500 × 500 m grid representing MODIS pixels was generated based on the geographic extent of a province using the “sf” package in R. These grids were overlaid on the land cover maps to extract land cover compositions for each MODIS grid cell using the “exactextractr” package in R 24 . (2) MODIS grids were overlaid on MODIS ET/GPP products to extract ET (or GPP) values for each MODIS grid cell using “exact_extract”. (3) A 50 × 50 km grid layer (i.e., tile) was generated for each province and spatially joined to the MODIS grid layer to create a foreign key table where the results from step 1 and step 2 were joined. (4) The linear regression model was fit between land cover composition and ET (or GPP) for each tile by province, country and year. R packages, including “parallel”, “doParallel” 25 , “doSNOW” 26 , and “foreach” 27 , were loaded to set up the parallel computing environment on HPCC for steps 1 through 3. The computing time was 6 h on HPCC for one variable for Almaty, compared with 7 days when using ArcGIS on a personal computer.
Downscaled ET and GPP were tallied by land cover class, period, and province. Descriptive statistics were also generated for each country to visualize the distribution of downscaled ET and GPP for the three periods.
Based on downscaled ET and GPP, we first calculated the functional contribution by land cover class at provincial scale: functional proportion of a land cover class in a province to its composition proportion (F/C). To understand the importance of land cover change (ΔC) in affecting functional changes (ΔF) between the consecutive decades, we also calculated ΔF/ΔC. A simple linear regression model with zero intercept (F = β * C ) was used to explore the relationship between land cover composition and function (ET and GPP) to test the first hypothesis. The slope mean (β 0 ) and its standard error were used to construct 95% confidence intervals (CI) for comparisons by land cover class, province, country, and decade using “broom”, “purrr” 28 , “tidyr” 29 , “ggplot2” 30 and “ggpubr” 31 in R. The second metric (ΔF/ΔC) between the two decadal years (i.e., from 2000 to 2010 and from 2010 to 2020) reflects the functional change enumerating the difference in ET (or GPP) as a result of land cover change between the two periods. ΔC and ΔF were first quantified by land cover class, and each time step at the 50-km tile before the mean and standard deviations of ET and GPP were tabulated at the provincial and country level.
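The two metrics can be illustrated with a short sketch. It is plain Python rather than the R packages named above; the function names are hypothetical, the sample numbers are made up, and the 95% interval uses a normal approximation (1.96 standard errors), which is an assumption about how the interval could be built rather than a statement of the authors' procedure.

```python
import math


def zero_intercept_slope(c, f, z=1.96):
    """Fit F = beta * C through the origin; return (slope, standard error, CI)."""
    if len(c) != len(f) or len(c) < 2:
        raise ValueError("need two or more paired observations")
    sum_cc = sum(x * x for x in c)
    if sum_cc == 0:
        raise ValueError("all composition values are zero")
    slope = sum(x * y for x, y in zip(c, f)) / sum_cc
    residual_ss = sum((y - slope * x) ** 2 for x, y in zip(c, f))
    dof = len(c) - 1                      # one parameter estimated, no intercept
    std_err = math.sqrt(residual_ss / dof / sum_cc)
    return slope, std_err, (slope - z * std_err, slope + z * std_err)


def change_ratio(f_start, f_end, c_start, c_end):
    """Delta F / Delta C for one class in one tile; None when composition did not change."""
    delta_c = c_end - c_start
    if delta_c == 0:
        return None
    return (f_end - f_start) / delta_c


# Made-up tile data: composition shares and functional shares of one class.
composition = [0.1, 0.2, 0.3, 0.4]
function = [0.11, 0.22, 0.33, 0.44]
slope, std_err, ci = zero_intercept_slope(composition, function)
ratio = change_ratio(f_start=0.30, f_end=0.27, c_start=0.25, c_end=0.20)
```

A slope above 1 corresponds to a class contributing more function than its areal share, and a slope below 1 to the opposite. Tiles whose composition did not change are skipped for ΔF/ΔC because the ratio is undefined there.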
Downscaled ET and GPP of the five land cover classes were produced by the three study times and for the six provinces in Kazakhstan and Mongolia (Tables 2 and 3 ; Figs. 3 and 4 ). Among the land cover classes in Kazakhstan, the overall mean ET and GPP were lowest for barrens (82.25 mm year −1 and 170.50 gC m −2 year −1 , respectively) and highest for forests (333.29 mm year −1 and 635.30 gC m −2 year −1 , respectively). Cropland and grassland classes have similar mean ET and GPP, with slightly higher values in the cropland class. Similar differences by land cover classes exist in Mongolia, although both ET and GPP of all land cover classes in Mongolia are larger than those in Kazakhstan, except forest class where ET MG (312.85 ± 62.44) is lower than ET KZ (333.29 ± 94.23), and GPP MG (595.14 ± 110.94) is lower than GPP KZ (678.84 ± 242.47). However, standard deviations of ET and GPP are large, resulting in insignificant differences. Within a country, downscaled ET and GPP differ significantly among the provinces. In Kazakhstan, Akmola has the highest ET and GPP in barren, cropland, forest, and grassland classes but not in water class where Almaty edges over Akmola. In Mongolia, Tov has the highest ET in barren and water classes and Dornod has the highest ET and GPP in cropland and forest classes. Interestingly, the variances of ET and GPP in the water class are high in both countries, particularly for Almaty and Tov provinces at 164.78 ± 131.61 mm year −1 and 290.69 ± 120.74 mm, respectively. It is also worth noting that the data suggests some trends in ET and GPP during the three decades. A pairwise t-test of ET and GPP by year indicates that 2010 is statistically different from 2000 and 2020.
Boxplot of downscaled evapotranspiration (ET, mm year⁻ 1 ) and gross primary production (GPP, gC m⁻ 2 year⁻ 1 ) by land cover class in the three study provinces of Kazakhstan for 2000, 2010, and 2020. This panel consists of two rows and five columns of boxplots. The five columns correspond to different land cover classes, while the two rows represent downscaled evapotranspiration (ET, mm year⁻ 1 ) and gross primary production (GPP, gC m⁻ 2 year⁻ 1 ). For each boxplot, the x-axis displays three decades: 2000, 2010, and 2020. Each boxplot represents data for the three study provinces in Kazakhstan (Aktobe, Akmola, and Almaty from left to right), with the data points representing downscaled ET/GPP values for 50 × 50 km landscape tiles within each province. Outliers, identified using the interquartile range (IQR) criterion, are indicated by red star symbols.
Boxplot of downscaled evapotranspiration (ET, mm year⁻ 1 ) and gross primary production (GPP, gC m⁻ 2 year⁻ 1 ) by land cover class in the three study provinces of Mongolia for 2000, 2010, and 2020. This panel consists of two rows and five columns of boxplots. The five columns correspond to different land cover classes, while the two rows represent downscaled evapotranspiration (ET, mm year⁻ 1 ) and gross primary production (GPP, gC m⁻ 2 year⁻ 1 ). For each boxplot, the x-axis displays three decades: 2000, 2010, and 2020. Each boxplot represents data for the three study provinces in Mongolia (Arkhangai, Tov and Dornod from left to right), with the data points representing downscaled ET/GPP values for 50 × 50 km landscape tiles within each province. Outliers, identified using the interquartile range (IQR) criterion, are indicated by red star symbols.
Our first hypothesis, a positive causal relationship between ecosystem functional contribution and compositional amount, is supported for landscapes in Kazakhstan and Mongolia (Fig. 5 ). However, these positive relationships vary by land cover class for ET and GPP at the national (Fig. 5 ) and provincial levels (Figs. S1 and S2 ). At the national level, ET and GPP of the forest class are clearly above the overall mean (i.e., the 1:1 line) albeit within the 2:1 ratio (Fig. 5 ). The regression slopes (β 0 ) of ET/C and GPP/C of forest are 1.086 (± 0.005) and 1.123 (± 0.006), respectively (Table S1 ), indicating that the contributions of the forests to the landscape exceeded the landscape average ET and GPP by 8.6% and 12.3%, respectively. Interestingly, ET/C and GPP/C of the grassland and cropland classes have β 0 near the 1:1 line, although estimated mean (CI) β 0 are < 1 (0.988 ± 0.004 for cropland ET; 0.982 ± 0.004 for cropland GPP; 0.993 ± 0.003 for grassland ET; and 0.994 ± 0.002 for grassland GPP) (Table S2 ). The β 0 of regression lines for ET and GPP of the barren class are much lower than 1 but greater than zero (0.941 ± 0.015 for ET and 0.809 ± 0.013 for GPP). Estimated β 0 of ET and GPP for water is 0.971 ± 0.008 and 0.949 ± 0.007, respectively.
( A ) Changes in ET and GPP with landscape composition by land cover class of the six study provinces. ( B ) Slope and confidence interval (CI) of fitted linear regression lines in A. See Table S1 for slope and CI values by cover type and province. The solid lines denote a ratio of 1:1, the dotted line indicates a ratio of 1:2, and the dashed line refers to a ratio of 2:1.
At the provincial level, similar causal relationships by land cover class exist but with some degrees of differences by province (Figure S1 and S2 ). For three provinces in Kazakhstan, β 0 of forest ET and GPP are all > 1. β 0 for GPP of Aktobe and Akmola are 1.510 ± 0.014 and 1.513 ± 0.012, respectively, compared with 1.108 ± 0.015 for Almaty. Similar differences exist for β 0 values among the three provinces. For the grassland class, β 0 of GPP in Aktobe and Akmola are 1.013 ± 0.004 and 1.016 ± 0.006, respectively, whereas in Almaty it is 0.968 ± 0.005. β 0 of grassland ET in the three provinces show similar contrasts: 1.009 ± 0.004 for Aktobe, 1.009 ± 0.007 for Akmola, and 0.964 ± 0.006 for Almaty. Interestingly, β 0 for cropland GPP in Aktobe and Akmola is slightly smaller than 1, but it is greater than 1 in Almaty (1.048 ± 0.018). β 0 for cropland ET in Aktobe is > 1 but < 1 in Akmola and Almaty. As expected, β 0 for ET and GPP for barren class are always < 1. Large differences in β 0 for ET of the water class are observed among the three provinces, with 0.008 ± 0.005 for Aktobe, 1.028 ± 0.010 for Almaty, and 0.765 ± 0.017 for Akmola. β 0 for GPP of water mirror the pattern found in ET, with 0.129 ± 0.017 for Aktobe, 0.999 ± 0.008 for Almaty, and 0.688 ± 0.015 for Akmola. For provinces in Mongolia, β 0 of forest ET and GPP are all > 1; near 1 for grassland but < 1 for cropland and barren. β 0 of cropland ET and GPP of Arkhangai are smaller than those of Tov and Dornod.
Our second hypothesis is also supported, albeit with large differences by land cover class and landscape, suggesting that the changes in landscape ET over the 30-year study period are not caused equally by the changes in land cover class (Fig. 6 , Figs. S3 and S4 ). At the national level, ΔET/ΔC and ΔGPP/ΔC of the forest and cropland classes are negative, while those of the grassland and barren classes are positive. For the water class, mean ΔET/ΔC is negative but mean ΔGPP/ΔC is positive. Mean ΔF/ΔC of a province does not always agree with that of the country. Cropland in Arkhangai has positive mean ΔET/ΔC and ΔGPP/ΔC; these values are negative in the other two provinces in Mongolia. The mean grassland ΔET/ΔC and ΔGPP/ΔC in Aktobe and Akmola are negative but positive in Almaty. For water class in Tov and Dornod, mean ΔET/ΔC and ΔGPP/ΔC are negative, but they are positive in Arkhangai. Barren class has positive ΔET/ΔC and ΔGPP/ΔC across all provinces.
Changes in land cover composition with ΔF/ΔC by land cover class. The solid line denotes a ratio of −1, the dotted line denotes a ratio of positive 1, and the dashed line denotes a ratio of negative 1. ΔF/ΔC is ratio of functional change to compositional change of a landscape tile (50 km × 50 km).
Through downscaling MODIS ET and GPP at a coarse resolution of 500 m we produced landscape (i.e., tile) and province ET and GP by land cover class for three periods in six provinces of Kazakhstan and Mongolia. This an essential task for understanding the spatial and temporal changes in ecosystem functions during land cover change where the patch size is often smaller than MODIS resolution. Otherwise, one cannot pin the magnitude and dynamics of change for any land cover patch in a landscape. Because our downscaling is based on MODIS products, ET and GPP values for the provinces and landscapes should match well with other values using MODIS reported in the literature 32 . In the most recent report on MODIS GPP and ET for these two countries, Chen et al. (2020) 10 estimated the average GPP in Kazakhstan and Mongolia is 225.9 gC m −2 year −1 and 181.9 gC m −2 year −1 , respectively. These national averages of GPP are at the lower end of our study-area estimates, ranging 170.50–635.20 gC m −2 year −1 for Kazakhstan and 176.08 181.9–595.14 181.9 gC m −2 year −1 for Mongolia. This is likely because the six provinces included in this study are dominated by grasslands, while other provinces in these countries include desert biomes that have substantially lower GPP and ET. Similarly, the ET estimate by Chen et al. (2020) 10 is 182 mm year −1 for Kazakhstan and 259 mm year −1 for Mongolia, while our ET estimate is 82.45–333.29 mm year −1 for Kazakhstan and 127.86–312.85 mm year −1 for Mongolia. In another study on GPP using a light use efficiency model in Kazakhstan, Propastin and Kappas (2012) 33 reported a steppe grassland GPP of 243.70 (± 59.50) gC m −2 year −1 , which is close to our estimate of 254.13 (± 101.97) gC m −2 year −1 . Liu et al. (2013) 34 reported ET of three land cover classes on the Mongolian Plateau using a process-based model, including grassland (242–374 mm year −1 ), boreal forest (213–278 mm year −1 ) and semi-desert/desert (100–199 mm year −1 ). 
The corresponding estimates for these land cover classes in our study are 204.74 (± 53.94) mm year −1 , 312.85 (± 49.43) mm year −1 , and 127.86 (± 45.95) mm year −1 , respectively, which are within the ranges of the modeled values. However, it is worth noting that drought, which can reduce GPP, is often underestimated by remote sensing methods 35 . In Kazakhstan and Mongolia, where arid landscapes are under constant threat of water deficit and warmer and drier trends are predicted due to global climate change, drought can also lead to decreases in GPP 36 , suggesting that our GPP values are likely underestimated.
This study presents a highly effective approach that is applicable not only to Asian drylands but also to global investigations of land cover and land cover changes, focusing on differences in carbon production, ET, water use efficiency, and other ecosystem functions at high spatial resolution under varying climate gradients and scenarios. A substantial body of literature supports our findings using alternative methods such as remote sensing modeling and field observations 5 , 37 , 38 . These recent studies serve as examples for interested readers.
We advocate for applying this innovative method to ecosystems beyond arid and semi-arid regions, extending to non-water-constrained ecosystems in the future. Additionally, we acknowledge the importance of understanding the mechanisms or drivers of ecosystem functional changes resulting from land use and cover change (LULCC). However, this study focuses primarily on developing computationally efficient approaches and rapid assessment tools/metrics for evaluating the impact of LULCC on ecosystem functions.
Disproportionate functional contributions of land cover classes relative to their compositional makeup have been intensively studied at plot-to-ecosystem scales. We designed our research to quantify this asymmetrical functional-to-compositional association by land cover class at the landscape scale, which provides a unique perspective for understanding the changing landscape. The slopes of the linear regressions for F/C can be interpreted in practical and operational terms: a unit area of forest generates 1.085 units of ET and 1.123 units of GPP, a unit area of grassland generates 0.993 units of ET and GPP, a unit area of cropland produces 0.987 units of ET and 0.983 units of GPP, a unit area of water produces 0.949 units of GPP and 0.971 units of ET, and a unit area of barren land generates 0.941 units of ET and 0.809 units of GPP (Fig. 5 ). Interestingly, we found that not all land cover classes have functions proportionate to their compositional shares. Grassland and cropland classes have ecosystem functions approximately commensurate with their size, whereas forest, barren and water classes have asymmetrical F/C in opposite directions: forest's functional contribution is larger than its area, while barren and water land cover have functional contributions smaller than their compositional amounts. The 1:1 line can be considered the equal contribution line, which means a unit of land cover generates a unit of ecosystem function measured by ET or GPP. A land cover class near this line (e.g., grassland and cropland) is considered to perform at an average level, whereas a class with F/C = 2:1 makes a double contribution, i.e., a unit of land cover generates 2 units of ecosystem function. Importantly, the land cover classes above the 1:1 line contribute more functions to the landscape total; forest and sometimes water (depending on whether the water class is dominated by wetlands or open water) belong to this group.
Similarly, the 1:2 line indicates a half contribution, i.e., a unit of land cover generates 0.5 unit of ecosystem function. The barren land cover class clearly belongs to this group.
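As an illustration of how an F/C slope of this kind could be obtained, the following minimal sketch fits an ordinary least-squares line to the compositional share and functional share of one land cover class across tiles. It is a sketch under stated assumptions, not the authors' code: the function names `fc_slope` and `classify_slope`, the input values, and the 0.05 tolerance used to call a slope "near 1:1" are all hypothetical, and the source does not state whether the regressions were fitted with an intercept.

```python
import numpy as np


def fc_slope(composition_share, function_share):
    """Fit function_share = slope * composition_share + intercept.

    Both inputs hold one value per tile for a single land cover class.
    Fitting with an intercept is an assumption made for this sketch.
    """
    c = np.asarray(composition_share, dtype=float)
    f = np.asarray(function_share, dtype=float)
    slope, intercept = np.polyfit(c, f, 1)
    return slope, intercept


def classify_slope(slope, tolerance=0.05):
    """Label a slope relative to the 1:1 equal contribution line.

    The tolerance is a hypothetical cut-off, not a value from the study.
    """
    if slope > 1 + tolerance:
        return "above 1:1"
    if slope < 1 - tolerance:
        return "below 1:1"
    return "near 1:1"


# Hypothetical tile data: share of tile area and share of tile ET for one class.
area_share = [0.10, 0.20, 0.40, 0.80]
et_share = [0.11, 0.22, 0.44, 0.88]
slope, intercept = fc_slope(area_share, et_share)
print(round(slope, 3), classify_slope(slope))

# Slopes reported in the text for ET, labelled with the same hypothetical cut-off.
for name, value in [("forest", 1.085), ("grassland", 0.993), ("barren", 0.941)]:
    print(name, classify_slope(value))
```

With a cut-off this loose, grassland and cropland fall near the 1:1 line while forest and barren fall on opposite sides of it, which matches the qualitative reading given in the text.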
The divergent F/C among land cover classes further accounts for ΔF/ΔC, an indicator of functional changes during land cover change. For example, because the F/C of forests is the highest among the five classes, mean ΔET/ΔC and ΔGPP/ΔC of forests are always negative across the provinces in both countries (Table 4 ), indicating that converting a unit of forest to another land cover class will lead to a 0.67 unit decrease in ET and a 0.6 unit decrease in GPP. Conversely, the ΔET/ΔC and ΔGPP/ΔC values of the barren class are greater than zero, meaning that converting a unit of barren land cover to another land cover class will lead to a 0.3 unit increase in ET and a 0.28 unit increase in GPP, partially because F/C of the barren class is the lowest among the five land cover classes. In Mongolia, removal of a unit area of cropland will cause a 0.12 unit decrease in ET and a 0.11 unit decrease in GPP (Table 4 ). Losing a unit area of grassland will increase ET by 0.14 unit and GPP by 0.05 unit. Interestingly, in Kazakhstan ΔF/ΔC of grasslands and croplands varied among the provinces. ΔET/ΔC and ΔGPP/ΔC of grasslands are less than zero, except in Almaty, while in Mongolia grassland ΔF/ΔC is invariably above zero. ΔET/ΔC of croplands is above zero in Arkhangai and below zero in Aktobe and Akmola. Conversely, ΔF/ΔC of grassland is below zero in Arkhangai and above zero in the other two provinces.
The utility of downscaled ET and GPP is that it allows us to connect the functional changes of ecosystems to the compositional changes of landscapes during LULCC. With the development of fine resolution land cover products worldwide, our downscaling method can be applied to produce ecosystem function measurements, such as ET and GPP, at the corresponding resolution. This rapid assessment tool will potentially become more useful for evaluating the impacts of LULCC on ecosystem functions than in-situ field-based methods. Facing a growing global demand for food and fiber, certain types of land use and land cover conversions (e.g., deforestation, afforestation, reclamation, cropland abandonment) will likely expand. The consequences for water and carbon budgets beyond the ET and GPP we examined should be considered in land use planning and decision-making, especially in light of the increasing frequency and intensity of climate extremes.
For example, Kazakhstan and Mongolia experienced extensive LULCC in the mid-twentieth century, through the virgin land campaign in Kazakhstan in the 1950s and through Atar Ezemshik I in Mongolia in the 1940s. If we use our proposed metric F/C as a quick reference to calculate the impacts of land use and land cover change on ecosystem function, the conversion of 5522 km 2 of grassland to cropland from 1990 to 2000 in Tov 19 would result in an increase of 1.45 × 10 8 m 3 year −1 of water evaporated to the air ((cropland ET/C − grassland ET/C) × ΔArea = [(158.67 − 132.39) mm year −1 × 10 −3 m × 5522 km 2 × 10 6 m 2 ] ≈ 1.45 × 10 8 m 3 year −1 ) and a carbon gain of 4.12 × 10 11 g C year −1 ((cropland GPP/C − grassland GPP/C) × ΔArea = [(300.30 − 225.70) gC m −2 year −1 × 5522 km 2 × 10 6 m 2 ] ≈ 4.12 × 10 11 g C year −1 ). In other words, this type of conversion would cost an additional 1.45 × 10 8 m 3 of water per year via increased ET but add 4.12 × 10 11 g C of carbon production per year, excluding all other factors affecting the water and carbon cycles. Another LULCC scenario emerging on the Mongolian Plateau is the conversion of native steppe to forest under the Three-North Shelter Forest of China, an afforestation program that converted 1.64 million ha of grassland to forest in northern China between 1990 and 2005 39 . Our proposed metric ΔF/ΔC serves as a handy calculator to assess the impact of this type of land conversion on ecosystem function. Grassland ΔET/ΔC is positive, meaning that conversion of grassland to another land cover results in an increase of 1219 m 3 ET over a decade, while gaining 4.43 × 10 6 g C over a decade. In other words, this type of conversion would lead to a 1219 m 3 water deficit.
Contextualizing these types of land use conversions in semi-arid/arid environments where water resources are scarce illustrates the outsized impact of directional LULCC and the need for caution as well as scientific investigation in environmental planning and decision-making. Our proposed metrics can be utilized as a rapid assessment tool to weigh costs and benefits in the context of landscape scale ecosystem function and services.
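The Tov grassland-to-cropland arithmetic above can be checked with a short sketch. It multiplies the difference in per-area ET or GPP between the two classes by the converted area, with explicit unit conversions. The ET, GPP and area values are the ones quoted in the text; the function and constant names are hypothetical. Note that multiplying gC m −2 year −1 by an area in m 2 yields g C year −1, so the area term cancels in the carbon total.

```python
MM_TO_M = 1e-3      # millimetres of water depth to metres
KM2_TO_M2 = 1e6     # square kilometres to square metres


def et_volume_change_m3_per_year(et_from_mm, et_to_mm, area_km2):
    """Change in evaporated water volume (m^3 per year) after conversion."""
    depth_change_m = (et_to_mm - et_from_mm) * MM_TO_M
    return depth_change_m * area_km2 * KM2_TO_M2


def gpp_change_g_c_per_year(gpp_from, gpp_to, area_km2):
    """Change in carbon uptake (g C per year); inputs are in gC m^-2 year^-1."""
    return (gpp_to - gpp_from) * area_km2 * KM2_TO_M2


# Values quoted in the text for grassland converted to cropland in Tov.
water = et_volume_change_m3_per_year(132.39, 158.67, 5522)
carbon = gpp_change_g_c_per_year(225.70, 300.30, 5522)
print(f"{water:.3e} m^3 per year")   # about 1.45e8
print(f"{carbon:.3e} g C per year")  # about 4.12e11
```

Both results are positive, that is, under these per-area values the conversion increases both the volume of water evaporated and the carbon fixed.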
We are aware of several major drawbacks in estimating high resolution ET and GPP by land cover class. The first issue is associated with the sample design. We generated 50 km × 50 km tiles as the landscape for downscaling, which is an assumption of the method 16 . This tile size is arbitrary and may need adjustment according to the size of provinces. Our aim in selecting this size was to establish an analysis unit that represented the whole landscape in theory and, at the same time, was easy to convert in practice in the downscaling method. We ended up multiplying the pixel size of ET and GPP products from MODIS (500 × 500 m) by 100 as our spatial unit. This tiling approach can occasionally fragment the study landscape, albeit serving our purpose. Nevertheless, we propose that future research should address this issue and optimize the spatial sampling unit. Performing a sensitivity analysis would likely be helpful and informative to test how landscape properties (e.g., patch composition, configuration, and connectivity) change in response to a varying tile size. Another concern is that climate trends have not been taken into consideration. We compiled ET and GPP from 2001 to 2015 and pooled them to match the decadal years of land cover maps. We found that decadal year 2000 and 2020 products have similar ET and GPP values. However, we propose using advanced statistics to de-trend ET and GPP datasets in the future. More importantly, we are not measuring ET and GPP changes as a result of LULCC but providing a landscape scale method to estimate ET and GPP by land cover class, and ultimately this method can serve as a rapid assessment to quantify potential consequences of LULCC for ET and GPP. Another hoped-for improvement for future research would be the production of a land cover product with finer thematic resolution.
The water land cover class could be improved with more specific classification, such as wetland and open water, and forest could also be further divided into coniferous and deciduous forests (both are common in Kazakhstan and Mongolia).
Linearly downscaled ecosystem measurements based on land cover change products reflect the ecosystem properties in this arid/semi-arid region and reveal the distinctness of ET and GPP by land cover class. Land cover types have signature ecosystem ET and GPP, and their values vary within and between countries. This landscape scale evaluation of ecosystem function provides a unique perspective for understanding the consequences of LULCC on the ecosystem water cycle and carbon budget at the macrosystem level. The spatial resolution of our method (30 m) establishes a link between MODIS products (500 m) and land cover products (30 m), which are widely but separately used to assess landscape scale LULCC. By establishing two metrics, the function to composition ratio (F/C) and changes in function to changes in composition (ΔF/ΔC), our hypothesis that a landscape's function measured by ET and GPP is disproportionate to its composition was tested and supported. The slopes of the linear regressions for F/C can be interpreted in practical and operational terms as follows: a unit area of forest generates 1.085 units of ET and 1.123 units of GPP, a unit area of grassland generates 0.993 units of ET and GPP, a unit area of cropland produces 0.987 units of ET and 0.983 units of GPP, a unit area of water produces 0.949 units of GPP and 0.971 units of ET, and a unit area of barren land generates 0.941 units of ET and 0.809 units of GPP. In addition, the hypothesis that changes in land cover composition will have asymmetrical impacts on a landscape's functioning was also supported. The divergent F/C among land cover classes further accounted for ΔF/ΔC, an indicator of functional changes during land cover change.
ΔET/ΔC and ΔGPP/ΔC of the forest and cropland classes have negative values, and those of grassland and barren have positive values, indicating that conversion of a unit of forest to another land cover leads to a 0.67 unit decrease in ET and a 0.6 unit decrease in GPP, but conversion of a unit of grassland or barren to another land cover class results in an increase in ET and GPP. The utility of this linear downscaling approach in macroecology, together with the two metrics F/C and ΔF/ΔC, is that it is labor-saving and cost-effective and provides a rapid assessment of the impact of LULCC on ecosystem functions.
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
The code used in this study is available from the corresponding author on reasonable request.
Luyssaert, S. et al. Land management and land-cover change have impacts of similar magnitude on surface temperature. Nat. Clim. Change 4 , 389–393 (2014).
Powers, R. P. & Jetz, W. Global habitat loss and extinction risk of terrestrial vertebrates under future land-use-change scenarios. Nat. Clim. Chang. 9 , 323–329 (2019).
Pielke, R. A. et al. Land use/land cover changes and climate: Modeling analysis and observational evidence. Wiley Interdiscip. Rev. Clim. Change 2 , 828–850 (2011).
Mahmood, R. et al. Land cover changes and their biogeophysical effects on climate. Int. J. Climatol. 34 , 929–953 (2014).
Chen, J., North, M. & Franklin, J. F. The contributions of microclimatic information in advancing ecosystem science. Agric. For. Meteorol. 355 , 110105 (2024).
Lambin, E. F. et al. The causes of land-use and land-cover change: Moving beyond the myths. Glob. Environ. Change 11 , 261–269 (2001).
Lambin, E. F. & Meyfroidt, P. Global land use change, economic globalization, and the looming land scarcity. Proc. Natl. Acad. Sci. USA 108 , 3465–3472 (2011).
Goldewijk, K. K., Beusen, A., Doelman, J. & Stehfest, E. Anthropogenic land use estimates for the Holocene-HYDE 3.2. Earth Syst. Sci. Data 9 , 927–953 (2017).
Technical Summary—Special Report on Climate Change and Land . https://www.ipcc.ch/srccl/chapter/technical-summary/ .
Gutman, G., Chen, J., Henebry, G. M. & Kappas, M. Landscape Dynamics of Drylands Across Greater Central Asia: People, Societies and Ecosystems . Vol. 17 (Springer, 2020).
Chen, J. et al. Sustainability challenges for the social-environmental systems across the Asian Drylands Belt. Environ. Res. Lett. 17 , 023001 (2022).
Potapov, P. et al. Global maps of cropland extent and change show accelerated cropland expansion in the twenty-first century. Nat. Food https://doi.org/10.1038/s43016-021-00429-z (2021).
Potapov, P. et al. The global 2000–2020 land cover and land use change dataset derived from the landsat archive: First results. Front. Remote Sens. 3 , 18 (2022).
Wulder, M. A., Masek, J. G., Cohen, W. B., Loveland, T. R. & Woodcock, C. E. Opening the archive: How free data has enabled the science and monitoring promise of Landsat. Remote Sens. Environ. 122 , 2–10 (2012).
Shakya, A. K., Ramola, A. & Vidyarthi, A. Statistical quantification of texture visual features for pattern recognition by analyzing pre- and post-multispectral landsat satellite imagery. Nat. Hazards Rev. 22 , 05021011 (2021).
Chen, J. et al. Linear downscaling from MODIS to landsat: Connecting landscape composition with ecosystem functions. Landsc. Ecol. 34 , 2917–2934 (2019).
Sciusco, P. et al. Albedo-induced global warming impact at multiple temporal scales within an Upper Midwest USA watershed. Land 11 , 283 (2022).
Shirkey, G. et al. Fine resolution remote sensing spectra improves estimates of gross primary production of croplands. Agric. For. Meteorol. 326 , 109175 (2022).
Yuan, J. et al. Land use hotspots of the two largest landlocked countries: Kazakhstan and Mongolia. Remote Sens. 14 , 1805 (2022).
Olson, D. M. et al. Terrestrial ecoregions of the world: A new map of life on Earth: A new global map of terrestrial ecoregions provides an innovative tool for conserving biodiversity. Bioscience 51 , 933–938 (2001).
Nagle, N. N., Buttenfield, B. P., Leyk, S. & Spielman, S. Dasymetric modeling and uncertainty. Ann. Assoc. Am. Geogr. 104 , 80–95 (2014).
Venkatesh, K. et al. Untangling the impacts of socioeconomic and climatic changes on vegetation greenness and productivity in Kazakhstan. Environ. Res. Lett. 17 , 095007 (2022).
Venkatesh, K. et al. Optimal ranges of social-environmental drivers and their impacts on vegetation dynamics in Kazakhstan. Sci. Total Environ. 847 , 157562 (2022).
Baston, D. exactextractr: Fast Extraction from Raster Datasets Using Polygons. R Package Version 0.7.0 (2021).
Weston, S. doParallel: For Each Parallel Adaptor for the ‘Parallel’ Package (2022).
Weston, S. doSNOW: Foreach Parallel Adaptor for the ‘Snow’ Package (2022).
Weston, S. foreach: Provides Foreach Looping Construct (2022).
Wickham, H. & Henry, L. purrr: Functional Programming Tools (2023).
Wickham, H., Vaughan, D. & Girlich, M. tidyr: Tidy Messy Data (2023).
Wickham, H. Ggplot2: Elegant Graphics for Data Analysis . Use R! (Springer, 2016).
Kassambara, A. ggpubr: ‘ggplot2’ Based Publication Ready Plots (2023).
John, R. et al. Modelling gross primary production in semi-arid Inner Mongolia using MODIS imagery and eddy covariance data. Int. J. Remote Sens. 34 , 2829–2857 (2013).
Propastin, P. A., Kappas, M. W., Herrmann, S. M. & Tucker, C. J. Modified light use efficiency model for assessment of carbon sequestration in grasslands of Kazakhstan: Combining ground biomass data and remote-sensing. Int. J. Remote Sens. 33 , 1465–1487 (2012).
Liu, Y. et al. Response of evapotranspiration and water availability to changing climate and land cover on the Mongolian Plateau during the 21st century. Glob. Planet. Change 108 , 85–99 (2013).
Stocker, B. D. et al. Drought impacts on terrestrial primary production underestimated by satellite monitoring. Nat. Geosci. 12 , 264–270 (2019).
Wei, X. et al. Global assessment of lagged and cumulative effects of drought on grassland gross primary production. Ecol. Indic. 136 , 45 (2022).
Yao, J. et al. Accelerated dryland expansion regulates future variability in dryland gross primary production. Nat. Commun. 11 , 1665 (2020).
Zhang, H. et al. Regular and irregular vegetation pattern formation in semiarid regions: A study on discrete Klausmeier model. Complexity 2020 , 54 (2020).
He, B., Miao, L., Cui, X. & Wu, Z. Carbon sequestration from China’s afforestation projects. Environ. Earth Sci. 74 , 5491–5499 (2015).
This study was supported by National Aeronautics and Space Administration (Grant No. 80NSSC20K0410).
Jing Yuan & Jiquan Chen
Present address: California Department of Water Resources, Sacramento, CA, 95814, USA
Center for Global Change and Earth Observations, Michigan State University, East Lansing, MI, 48823, USA
Department of Geography, Environment and Spatial Sciences, Michigan State University, East Lansing, MI, 48823, USA
Jiquan Chen
J.Y. and J.C. both contributed to manuscript writing and data analysis. All authors reviewed the manuscript.
Correspondence to Jing Yuan .
Competing interests.
The authors declare no competing interests.
Publisher's note.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions.
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .
Cite this article.
Yuan, J., Chen, J. Disproportionate contributions of land cover and changes to ecosystem functions in Kazakhstan and Mongolia. Sci Rep 14 , 21922 (2024). https://doi.org/10.1038/s41598-024-72231-3
Received : 20 October 2023
Accepted : 04 September 2024
Published : 20 September 2024
DOI : https://doi.org/10.1038/s41598-024-72231-3
BMC Genomics volume 25 , Article number: 888 (2024)
Arbuscular mycorrhizal fungi (AMF) form mutualistic partnerships with approximately 80% of plant species. AMF, and their diversity, play a fundamental role in plant growth, driving plant diversity, and global carbon cycles. Knowing whether AMF are sexual or asexual has fundamental consequences for how they can be used in agricultural applications. Evidence for and against sexuality in the model AMF, Rhizophagus irregularis, has been proposed. The discovery of a putative mating-type locus (MAT locus) in R. irregularis, and the previously suggested recombination among nuclei of a dikaryon R. irregularis isolate, potentially suggested sexuality. Unless undergoing frequent sexual reproduction, evolution of MAT-locus diversity is expected to be very low. Additionally, in sexual species, MAT-locus evolution is decoupled from the evolution of arbitrary genome-wide loci.
We studied MAT-locus diversity of R. irregularis. This was then compared to diversification in a phosphate transporter gene (PTG), which is not involved in sex, and to genome-wide divergence, defined by 47,378 single nucleotide polymorphisms. Strikingly, we found unexpectedly high MAT-locus diversity, indicating that either it is not involved in sex, or that AMF are highly active in sex. However, a strongly congruent evolutionary history of the MAT-locus, PTG and genome-wide arbitrary loci allows us to reject both the hypothesis that the MAT-locus is involved in mating and the hypothesis that the R. irregularis lineage is sexual.
Our finding shapes the approach to developing more effective AMF strains and is highly informative as it suggests that introduced strains applied in agriculture will not exchange DNA with native populations.
The symbiosis between plants and arbuscular mycorrhizal fungi (AMF; phylum Glomeromycota) is one of the most successful mutualistic partnerships on earth. The fungi colonize the roots of vascular plants and help plants acquire various nutrients (especially phosphate) from the soil, as well as mitigating plant stress [ 1 ]. The symbiosis occurs with plants in almost all terrestrial ecosystems [ 2 ], influencing global ecosystem functioning, especially carbon, phosphate and nitrogen cycles. Genetic variation in AMF differentially alters plant productivity and growth [ 3 , 4 , 5 , 6 , 7 , 8 ]. Identifying the mechanisms influencing or maintaining the genetic diversity in these important fungi is, thus, crucial for understanding global ecosystem functioning.
Even though AMF have successfully lived with plants in terrestrial ecosystems for approximately 460 million years [ 1 ], how genetic diversity is generated and maintained in AMF remains largely unknown. Mating and consequent sexual recombination are considered to be the fundamental mechanisms for generating genetic diversity in Eukaryotes. Previously, AMF were considered ancient asexuals [ 9 ]. This was based on circumstantial evidence; namely, the lack of any observed sexual structures and the finding of AMF-like structures in fossilized roots of the earliest land plants [ 1 ]. These fungi are unique in that they form multi-nucleated spores (several hundreds to thousands of nuclei per single spore) and never produce a single-nucleus or two-nucleus stage in their whole lifecycle [ 1 ]. This highlights a physiological difficulty for mating and sexual recombination in AMF. Molecular studies have shown that AMF nuclei are haploid and to date no diploid nuclei have ever been observed [ 1 , 10 ]. More recently, however, evidence for sexuality in AMF has been suggested. Three lines of evidence support possible sexual reproduction in AMF. First, the genome of the model AMF species, Rhizophagus irregularis , contains a conserved partial set of genes thought necessary for meiosis, although the role of these genes in AMF has not been demonstrated [ 11 ]. Second, putative recombination in an AMF population has been suggested [ 12 , 13 ]. Third, R. irregularis exists as homokaryons (carrying a population of genetically identical haploid nuclei) and as dikaryons (carrying a population of two different haploid nucleus genotypes) even though the stage is not diploid. Although AMF lack most of the known sex loci involved in fungal mating, a genomic region was identified in R. irregularis that is similar to a mating type locus (MAT-locus) of Basidiomycetes, a fungal phylum that is evolutionarily distant from the Glomeromycota [ 14 ]. In R. irregularis, each haploid nucleus carries one copy of the MAT-locus [ 14 ]. To date, all R. irregularis dikaryons have been shown to possess two different MAT alleles [ 14 ]. MAT-loci define the sexual identity of fungi in all uni- or bifactorial mating systems found in different fungal lineages [ 15 ]. The MAT-locus in Basidiomycetes contains genes encoding homeodomain transcription factors named HD1 and HD2. Allelic variation of these genes determines sexual compatibility with other individuals. In heterothallic fungi, only individuals with different alleles at the MAT-locus can engage in sexual reproduction [ 16 , 17 , 18 ]. For an organism with facultative or rare sex, the diversity of MAT-types is expected to remain low because rare MAT-types are difficult to maintain. Indeed, most sexual fungi have only two MAT-types [ 19 ]. There are some extreme cases in obligatory sexual fungi where high numbers of MAT-types have evolved and been maintained. For example, Coprinellus disseminatus has 143 MAT-types, Schizophyllum commune has > 23,000 MAT-types [ 15 , 20 ] and Trichaptum species have over 17,000 predicted MAT-types across 2 MAT-loci [ 21 ]. However, in these fungi, frequent mating is necessary to ensure that rare MAT-alleles will not go extinct [ 19 ].
In other fungi, the MAT-locus can also be involved in functions that are not related to sexual reproduction, such as asexual sporulation and related cell cycle regulation [ 22 ]. Therefore, the existence of the MAT-locus does not prove sexuality in this organism. Nevertheless, this MAT-locus has become the main candidate locus for sex in AMF because no other promising candidate loci are known. The current sex model in AMF assumes MAT-locus based non-self-recognition and the formation of diploid nuclei, followed by eventual sexual recombination to generate genetic diversity [ 14 ]. The existence of the locus, with two different alleles located on the two nucleus genotypes in dikaryons, formed the basis for recent genomic studies proposing sexual recombination in AMF. Several studies attempted to demonstrate recombination events between nuclei of R. irregularis dikaryons carrying different MAT alleles. Chen, et al. [ 23 ] surveyed three genetically different dikaryon isolates and found a very small number of potential recombination sites in only one isolate. However, this was questioned by Auxier and Bazzicalupo [ 24 ], who suggested that these may be artefacts. A re-analysis of the data from Chen, et al. [ 23 ] also revealed a small number of potential recombination sites between nuclei [ 25 ]. Sperschneider, et al. [ 26 ] conducted similar analyses with phased genome assemblies of dikaryon isolates but could not detect reciprocal recombination between co-existing nuclei. To date, none of those studies were able to determine whether any possible recombination events were from meiotic or mitotic recombination.
Population genomics studies on R. irregularis do not support sexuality in this fungus. The occurrence of genetically highly similar R. irregularis isolates in very distant geographical locations, and even on different continents, is entirely inconsistent with a species that exhibits frequent sexual recombination [ 27 ]. Consequently, high MAT-type diversity is not expected in R. irregularis. To date, sequencing of the MAT-locus has revealed some diversity, with 7 MAT-types among 114 different isolates [ 14 , 28 ]. However, the clustering of MAT-types reported in previous studies was based on nucleotide sequence similarity, with an arbitrary similarity threshold. Consequently, it is possible that two alleles with small sequence differences could be reported as identical MAT-types, even if those sequence differences are non-synonymous. If the MAT-locus confers mating identity, then sequence divergence at the MAT-locus should be carefully investigated at both the nucleotide and amino acid sequence levels to define the MAT-types. Consequently, the full diversity of the MAT-locus may not have been elucidated.
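The concern about similarity thresholds can be made concrete with a toy example. The sketch below compares three short, invented coding sequences: all pairs share the same nucleotide identity, yet only one of the single-base differences changes the encoded protein. The sequences, the function names and the 0.90 threshold are hypothetical and are not MAT-locus data; translation uses the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def nucleotide_identity(seq_a, seq_b):
    """Fraction of identical positions between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return matches / len(seq_a)


def translate(seq):
    """Translate an in-frame coding sequence into one-letter amino acids."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


# Invented alleles: b differs from a by a non-synonymous change (CTG to CCG),
# c differs from a by a synonymous change (GCT to GCC).
allele_a = "ATGGCTGAAAAACTGGGT"
allele_b = "ATGGCTGAAAAACCGGGT"
allele_c = "ATGGCCGAAAAACTGGGT"

THRESHOLD = 0.90  # hypothetical clustering cut-off
for name, other in [("b", allele_b), ("c", allele_c)]:
    identity = nucleotide_identity(allele_a, other)
    same_cluster = identity >= THRESHOLD
    same_protein = translate(allele_a) == translate(other)
    print(name, round(identity, 3), same_cluster, same_protein)
```

Under the threshold, both alleles would be grouped with allele a, although only allele c encodes the same protein. This is why the text argues for examining divergence at both the nucleotide and amino acid levels.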
Understanding whether AMF are sexual or not is not only essential for understanding their ecology and how they have evolved, but also for their use in agriculture. Enormous variation in the effects of genetically different R. irregularis isolates on growth and yield of globally important crops [ 5 , 8 ] means that understanding how genetic variation in these fungi is generated is essential. Furthermore, the application of AMF in agriculture raises questions about their impact in local environments [ 29 ], with the concern that an introduced AMF inoculum may mate with the local population, thus, altering the genetic composition of the local population, with unknown consequences. Finally, knowing whether AMF are sexual or not will also determine whether using genetic variation in AMF to improve crop growth can rely on fungal breeding approaches or whether programs will have to rely on the existing genetic variation in the absence of recombination.
Mating and subsequent sexual recombination decouple the evolution of different loci in populations. Loci of asexual organisms share the same genealogical history and, thus, the fate of each locus depends on the fitness afforded by the entire genome [ 30 ]. In a population of a sexual species where the MAT-locus serves as the sex locus, MAT-locus alleles (also sometimes referred to as MAT-types) are expected to display two characteristics: 1) The sequences should be highly conserved, as they determine the mating identity of the organism and allow non-self recognition, which is a prerequisite for mating. 2) The divergence of genes at non-related loci (functionally unrelated, as well as unrelated in genetic distance) or ultimately, the whole genome of a nucleus (represented by thousands of loci) should be independent from MAT-locus identity, because mating decouples the evolution of different loci (Fig. 1 ) [ 31 , 32 , 33 , 34 ].
Schematic illustration of hypothetical asexual vs sexual scenarios of genome evolution. a Genomes of different individuals represented as coloured bars. The MAT-locus is indicated by a circle, while an arbitrarily chosen locus that is unrelated to sex is shown as a square. b In the scenario of sexual reproduction, due to genome recombination, no correlation is expected among the divergence of the genome (without the MAT-locus and without the locus containing a gene unrelated to mating or sex), the divergence of the MAT-locus and the divergence of the gene unrelated to mating or sex. c In the scenario of asexuality, different loci of the genome, including the MAT-locus (marked as a circle) and another locus unrelated to mating or sex (in this case, the phosphate transporter gene), will share the same genealogical history because there is no sexual recombination among genomes [ 30 ]
Here, we studied and compared the diversity of the MAT-locus, a phosphate transporter gene (PTG) and genome divergence at arbitrarily chosen loci in the model AMF R. irregularis from Africa, Europe and North America in order to address the question of sexuality in this important fungal species. The PTG was chosen because it is a functionally important gene in the symbiosis between AMF and plants. Therefore, its divergence reflects functional divergence of the fungi with respect to the symbiosis. At the same time, the PTG encodes a protein that is not functionally linked to, or expected to be involved in, sex or mating. Genome divergence was based on genome-wide single nucleotide polymorphisms (SNPs) that represent a set of alleles at thousands of arbitrarily chosen loci. We first hypothesized that AMF are sexual and, thus, that MAT-type diversity is low compared to genome diversity, as is the case in global yeast populations [ 32 ]. In the scenario of sexual reproduction, one gene, or a randomly selected coding region, will not evolve together with all other loci (including both coding and non-coding regions). Therefore, our second hypothesis was that there will be no correlation between any pair of the 3 datasets (a selected gene that is unrelated to sex or mating, the MAT-locus and genome-wide arbitrary loci). In order to test these hypotheses, we first carried out MAT-locus sequencing of many R. irregularis isolates. We then re-analysed all population datasets available for Rhizophagus [ 27 , 35 ] together with newly generated genome and PTG datasets.
Based on sequence divergence at the MAT-locus, the Rhizophagus isolates were grouped into distinctive MAT-types (Fig. 2 a). We found that the MAT phylogeny discriminated different species of the Rhizophagus genus. The phylogeny showed a clear clustering of R. intraradices isolates, supported with a high posterior probability of 1.00. Another well-supported group, R. proliferus, clustered with R. intraradices with a posterior probability of 1.00. This is congruent with a previous publication based on double digest restriction-site associated DNA (ddRAD) sequencing [ 27 ]. Overall, the sequences at the MAT-locus were able to resolve the interspecific diversity within the Rhizophagus genus and to discriminate potentially diverged isolates within a group. Interestingly, isolates LPA54, ESQLS69 and KUVA were previously reported as R. irregularis by Savary et al. [ 27 ], although their genomes showed divergence from the main genetic group of R. irregularis. Even though the MAT-types of these isolates showed higher similarity to the R. irregularis group by forming a monophyletic cluster, the group was still separated from the main R. irregularis group with high posterior probability (1.00). Each of the five isolates of the Rhizophagus sp. differed from the others, with differences that even affected the amino acid sequences of their MAT-loci in the coding region of the HD2 protein (Figure S1). The result clearly indicates several different MAT-types among the isolates of this group.
A Bayesian phylogeny based on MAT-types of 81 sequences of Rhizophagus spp. and isolates. MAT-types in the main R. irregularis cluster are further divided into different MAT-types, where the number before the decimal point represents the MAT-types defined by previous studies [ 14 , 28 ] and the number after the decimal point represents strongly supported divergence within a given MAT-type. MAT-type 8 is newly identified in this study. Text in parentheses following the name of an isolate represents the Genbank accession code for sequences used from previous publications [ 14 , 36 ]. Both the high posterior probability values (> 0.9) and internal reference sequences (recognisable by the Genbank accession code in parentheses) were used to verify different Rhizophagus species and MAT-types reported in previous studies [ 14 , 28 ]. The sequence of Rhizophagus clarus (KU550091) served as a root for the tree. Each cluster was assigned a specific colour, while the groups that are not R. irregularis are displayed in different shades of grey. The revealed MAT-types were supported not only by nucleotide sequence divergence ( a ), but also by amino acid sequence divergence ( b )
Our initial assignment of MAT-types to each R. irregularis isolate was based on clustering with the seven previously published MAT-types. Each MAT-type was defined by a customary threshold of posterior probability, ranging between 0.90 and 1.00 (Fig. 2 a). Using this procedure, we observed 7 main clusters representing 6 of the 7 previously identified MAT-types and an 8th, previously unreported MAT-type. This newly identified MAT-type showed closest sequence similarity to MAT2 and is labelled here as MAT8. However, we found considerable sequence divergence within some of the main MAT-types observed in this study. The majority of sequence differences within a MAT-type were non-synonymous mutations and were directly linked to amino acid sequence divergence of the HD2 protein (Figure S2b). We, thus, defined different alleles within each MAT-type as those containing substitutions that altered the amino acid sequence. Subdivisions, based on non-synonymous substitutions, were observed in MAT-types 1, 3, 4, 5 and 6. Two sub-groups were observed in each of MAT1 (MAT1.1 and MAT1.2), MAT3 (MAT3.1 and MAT3.2) and MAT4 (MAT4.1 and MAT4.2). This divergence was supported by high posterior probability values and sequence divergence. MAT5 was divided into three sub-groups (MAT5.1, MAT5.2 and MAT5.3). Lastly, MAT6 was further separated into four MAT-types, referred to here as MAT6.1 to MAT6.4 (Figure S2). In summary, R. irregularis displayed an unexpected diversity of MAT-types, based on amino acid and DNA sequence divergence, with 15 different alleles.
In a scenario of sexual reproduction, a given gene, or a randomly selected coding region, is unlikely to evolve together with another locus, especially when the two loci are not functionally linked [ 30 ]. We tested the linear correlation between the divergence of the MAT and PTG loci. Unexpectedly, a significant correlation was observed (Mantel test; ρ = 0.7447, p < 1E-04) between divergence of alleles at the MAT and PTG loci in data comprising all Rhizophagus species and isolates (N = 37). The linear correlation was also significant when we tested for a correlation between intraspecific divergence in the MAT and PTG loci in R. irregularis (N = 30) (Mantel test; ρ = 0.4076, p < 7E-04). Because some isolates that were previously described as R. irregularis grouped outside this species in the MAT phylogeny (Fig. 2 ), we also performed a Mantel test after excluding those isolates from the analysis. This more conservative Mantel test was also significant, showing a correlation between the MAT and PTG phylogenies (N = 27) (Mantel test; ρ = 0.3615, p < 1.4E-03). All correlations were positive, meaning that more divergence in the MAT-locus is associated with more divergence in the PTG. A further test of congruency between the phylogenies, using Baker’s gamma, was performed with the most conservative grouping, which only included R. irregularis isolates that clustered as R. irregularis in the MAT phylogeny. This also showed significant similarity between the MAT-locus and PTG phylogenies (p = 0.032) (Figs. 3 a and S3a).
Congruence among three phylogenetic trees based on genome-wide SNPs, the MAT-locus and PTG in R. irregularis. Distances among isolates based on genome-wide data were calculated using data on 47,378 SNPs without locus overlap with the MAT-locus or PTG. Distances of divergence in the partial MAT-locus (261 bp) and PTG (639 bp) were calculated with the Tamura 3-parameter model [ 37 ]. Matching nodes or clusters between phylogenies are highlighted in the same colour. a Significant correlation between the MAT-locus and PTG phylogenies. b Significant correlation of the phylogeny from genome-wide SNPs with the MAT-locus phylogeny and with the PTG phylogeny
To determine whether the observed correlation between MAT-locus and PTG divergence is likely to be a result of purifying or positive selection, we tested codon evolution in both the MAT-locus and PTG (Table 1 ). Purifying selection removes deleterious variation and, therefore, contributes to the functional stability of a gene. On the other hand, positive selection promotes the spread of beneficial alleles and contributes to gene diversity for environmental adaptation [ 38 ]. Both loci were shown to have evolved under purifying selection (Table 1 ), allowing us to reject the hypothesis that they were under positive selection. Together with the observed high diversity of MAT-alleles carrying non-synonymous mutations, the results imply that the diversification of MAT-alleles still occurred under purifying selection. Additionally, both loci showed high codon usage similarity, between loci as well as among the isolates (MAT-locus average codon adaptation index (CAI): 0.830, expected value of CAI (eCAI): 0.857 (P < 0.05); PTG average CAI: 0.809, eCAI: 0.821 (P < 0.05)) (Fig. 4 ). The observed inter-isolate CAI of the MAT-locus and PTG had overlapping distribution peaks (Fig. 4 ), which also overlapped with the intra-isolate CAI distribution peak of 26,183 genes in the reference isolate (DAOM197198; average CAI: 0.818, eCAI: 0.862; P < 0.05). This showed that codon usage convergence in these two genes is not an exception compared to the overall genome-wide codon usage similarity observed in R. irregularis. Our results suggest that, even though the two loci are not functionally related, both MAT-locus and PTG evolution were driven by purifying selection, resulting in converged codon usage and correlated divergence.
Codon adaptation index (CAI) analysis reveals genome-wide codon usage similarity in the reference isolate (DAOM197198) and among-isolate codon usage similarity of the MAT-locus and PTG. All CDS (blue) represents the CAI of a total of 26,183 genes in the reference isolate (DAOM197198). CAI_MAT (red) and CAI_PTG (green) represent the CAI of the respective target gene in each isolate. The high CAI observed in all datasets (distribution plot peak at CAI > 0.8) and the clear overlap of the peaks among datasets show that both the MAT-locus and PTG of different isolates have similar codon usage, and that the CAI values of the two genes are not exceptions to the overall converged codon usage detected in the reference isolate
The ddRAD-seq dataset provided us with a total of 47,378 SNPs, representing random coding and non-coding regions of the R. irregularis genome. By focusing only on homokaryons, which avoids the potential SNP-calling bias incurred by dikaryons, and by having a dataset with over six times more SNPs than previous population studies [ 27 ], we were able to achieve a higher resolution of genome diversity in R. irregularis (Fig. 3 b). Geographic origin could not explain the similarities among genomes (PERMANOVA with Jaccard distance, p > 0.05), but isolate MAT-type explained genome similarity (PERMANOVA with Jaccard distance, p < 0.05). A Mantel test was performed to examine the correlation. We found a significant correlation between the divergence of genomes and the divergence of the MAT-locus, which implies co-evolution or associated evolution of the sets of random coding and non-coding regions of the genome together with the MAT-locus (N = 37) (Mantel test; ρ = 0.4601, p < 1E-04). The correlation was positive, indicating that greater divergence in genomes was reflected in greater divergence in their MAT alleles. The significance of the correlation observed in the entire dataset was not influenced by additional filtering for isolate selection, showing that the correlation holds true at both interspecific and intraspecific levels. A significant correlation also occurred within R. irregularis isolates (N = 30) (Mantel test; ρ = 0.3303, p = 0.0067), as well as within the more conservative group excluding isolates that clustered outside the R. irregularis group of the MAT-locus phylogeny (N = 27) (Mantel test; ρ = 0.2684, p = 0.0093). With this more conservative group of R. irregularis isolates, we also found a linear correlation between the distance among genome-wide SNPs and the PTG divergence (Pearson ρ = 0.5996, p = 0.0044).
Together with the previously described positive linear correlation between PTG and MAT sequence divergence, all 3 possible pairwise comparisons among genome-wide SNPs, MAT-locus and PTG distance matrices of R. irregularis showed significant correlation. The results are also congruent with the detected intra- and inter-isolate levels of codon usage similarities (Fig. 4 ). Further analysis of phylogenetic congruency with Baker’s gamma among these three datasets also confirmed the significant phylogenetic congruency observed in all pairwise correlations among genome-wide SNPs, MAT-locus and PTG ( p < 0.05 in all comparisons) (Fig. 3 and S3).
In this study, we found that the model AMF, R. irregularis, harbours diversity at the MAT-locus that is higher than that expected for a fungus that could, at most, exhibit facultative or rare sexual reproduction. This result, on its own, would signify that either this locus is not involved in sex, or that extremely frequent sex has maintained its diversity. However, we also tested whether R. irregularis is likely a sexually reproducing species, based on the fundamental evolutionary concept of decoupling between the evolutionary history of the MAT-locus, another locus independent of mating and genome-wide arbitrarily chosen loci [ 30 ]. In R. irregularis, the congruence between the evolutionary history of these three datasets, including codon usage, matches that of a clonal organism, and thus contradicts the notion that very frequent sex has occurred to maintain diversity at the MAT-locus. Taken together, these results allow us to reject the hypothesis of sexual reproduction in this important fungal species (Fig. 1 c). Credence for the hypothesis that AMF are sexual rests largely on the existence of the MAT-locus in AMF and on the observation that dikaryons carry two different copies. The results of this study suggest that it is highly unlikely that this species is sexually reproducing. Our findings have a number of important consequences that we discuss in more detail.
Recent studies of sexuality and potential recombination in AMF have relied on detecting recombination between genomes carrying different MAT alleles. Seven MAT-types had been reported in R. irregularis, which suggested possible cryptic, but active, sex and recombination [ 28 ]. In that study, the approach to define different MAT-types was based on phylogenetic analyses of MAT-locus sequence similarity with a defined node-support value, or threshold, for each cluster. However, the posterior probability values for the nodes depend on the relative similarity between MAT-locus sequences and do not reflect the true sequence difference. We found that nucleotide sequence variation at the MAT-locus affects the amino acid sequence of HD2, a homeodomain transcription factor that plays an important role in self/non-self recognition in other fungi [ 16 ]. If the sequence at the R. irregularis MAT-locus defines mating types in this species, the sequences should be highly conserved, as is the case for other fungi [ 34 ]. Relying on an arbitrary threshold of node support to define MAT-type clusters overlooks the actual diversity of existing MAT-types. Previous studies targeted a less variable region of the locus, and this likely hindered the discovery of true MAT-locus variation. We found 15 MAT-types among 50 unambiguous R. irregularis isolates. The frequency (approx. 0.3) is almost five times higher than that previously reported (0.061; 7 MAT-types in 114 isolates). However, several of the 50 isolates in our study are indistinguishable from each other with ddRAD-seq data. Thus, it is likely that the 15 MAT-types occur in considerably fewer than 50 genetically different R. irregularis isolates. Furthermore, our study and previous studies only considered partial sequences at the MAT-locus. Full-length sequences could contain more nucleotide sequence variation.
To maintain such a number of MAT-alleles, the species would have to undergo frequent sex, as is the case in other fungal species [ 15 , 19 , 20 , 21 ]. We conclude that the higher-than-expected level of MAT-type variation is inconsistent with the proposed cryptic, rare sexuality in these fungi. The alternative explanation is that the MAT-locus, which fundamentally underpins the current sexuality paradigm of AMF, is not involved in mating.
Mating, followed by recombination, decouples the evolution of different loci in the population. In contrast, loci of asexual organisms share the same genealogical history [ 30 ]. If R. irregularis is sexual and recombination takes place, it is highly unlikely that the intraspecific diversification of two unrelated loci is linked to the evolutionary history of the entire genome. The PTG is an important AMF gene, as the symbiosis between plants and AMF is built on nutrient exchange, especially translocation of soil phosphate from the fungus to the plant [ 1 ]. The genes of the MAT-locus and the PTG are known to have clearly different functions [ 16 , 40 ]. The two loci are located on two different chromosomes (chromosome 11 for HD2 and 18 for PTG in R. irregularis DAOM197198) [ 41 ]. Strikingly, we found a clear positive linear correlation in the divergence between these two functionally unrelated and physically distant loci, which is expected in the absence of recombination. Moreover, we found that the sequence divergence based on multiple genome-wide coding and non-coding regions across the R. irregularis genome, from which both the MAT-locus and PTG were excluded, was congruent with the MAT-locus and PTG phylogenies. The observed congruency in sequence divergence is further supported by the clear codon usage similarities between the MAT-locus and PTG. More surprisingly, the known coding regions of the reference isolate genome showed strong genome-wide codon usage convergence, which is consistent with the correlated genealogical history among different loci in the genome. The fact that the MAT-locus, PTG and R. irregularis genome (excluding the MAT-locus and PTG) are correlated, and share similar codon usage, represents a genomic feature that is consistent with a clonal organism.
The conserved putative MAT-locus reported in Rhizophagus spp. is the homolog of a fungal mating type locus that is conserved in Basidiomycete fungi. From the evidence presented here, it seems highly unlikely that this putative MAT-locus is involved in mating in AMF. Some other known genes of the Dikarya or Mucoromycota (ancient and recent fungal lineages) that are involved in mating and recognition are present in AMF genomes [ 11 ] and are expressed during co-inoculation of roots with two genetically different R. irregularis isolates [ 42 , 43 ]. However, those genes are not all fully conserved, and most also have other known functions in fungi (for example, conidia development and germination, mycotoxin production and oxidative stress response) [ 22 , 44 ] and none of them have two different alleles in dikaryons. For these reasons, they are unlikely candidates as MAT-loci in R. irregularis . Thus, while R. irregularis genomic features do not point to sexuality, if endeavours towards finding a true MAT-locus in AMF continue then we propose two additional important criteria that must be satisfied. First, it will be a previously undescribed locus in the fungal kingdom. Second, the locus should display a very low degree of allele diversity as frequent sex in this fungus can be excluded.
So, what is the role of the MAT-locus studied here? One possibility is that R. irregularis was very active sexually in the past, giving rise to high MAT-type diversity, and at a certain point in the evolution of the lineage, the fungi lost the ability to sexually reproduce. However, this locus still contains a conserved HD1-like and HD2 protein. MAT-loci identified in other asexual fungi were found to be involved in asexual functions, such as asexual sporulation and cell cycle regulation [ 22 ]. It is well known that R. irregularis anastomoses with hyphae of the same genotype and that four different stages of recognition and compatibility between pairs of genetically different R. irregularis isolates have been described [ 12 ]. In the case of successful fusion, this allows cytoplasm of the two individuals to flow rapidly in both directions [ 12 ]. It is conceivable that the locus could be involved in some of these recognition mechanisms allowing, or preventing, the fusion of hyphae of compatible individuals to distribute nutrients and improve structural integrity of hyphal networks. This could also allow the co-existence of genetically different multiple nuclei in one cytoplasm, even if no recombination takes place between them.
Explaining the existence of ancient asexual lineages is problematic in evolutionary biology because a fundamental role of recombination is to purge deleterious mutations [ 45 ]. The Glomeromycota are thought to be an ancient lineage that formed symbioses with plants since the colonisation of land. Coupled with their seemingly low morphological diversification, they were suggested to be ancient asexuals [ 9 ]. The fossil record for Glomeromycota is extremely poor and there could have been great diversification in the Glomeromycota in the past, as seen for the species diversity of major plant lineages that preceded angiosperm radiation. There is a danger that our results on the asexuality of R. irregularis will be interpreted as evidence for the long-term asexuality of the Glomeromycota lineage. While our results strongly support asexuality in R. irregularis , we are not claiming the whole Glomeromycota lineage to be asexual. While there are hardly any confirmed examples of long-lived asexual lineages, there are many examples in nature where an order or genus contains sexual and asexual species [ 46 ]. This may be the case in the Glomeromycota. To answer the separate question of sexuality versus asexuality in the Glomeromycota lineage, we urge researchers to carry out similar studies on other Glomeromycota species spread widely across the phylogeny.
The question of sexuality in R. irregularis will greatly affect how AMF can be applied to improve agricultural production and ecosystem functions. Our findings may disappoint researchers intending to develop a breeding program relying on crossing to improve AMF. Our results show that this will likely not be possible with R. irregularis . However, other genetic mechanisms in R. irregularis allow the development of new strains of this fungus that have been shown to greatly alter productivity of globally important crops [ 5 , 8 , 47 ].
However, there are positive consequences of our findings. First, R. irregularis is a safe AMF species to develop for agricultural applications because it can be produced readily in vitro without other unwanted microorganisms. That an introduced R. irregularis strain will not recombine with local AMF populations is of great benefit for applications, because there should be no introgression of introduced genes into the local population and the introduced fungus should retain its functional characteristics. Secondly, there is currently no method to track an introduced R. irregularis in soil where Rhizophagus spp. already occur (which is usually the case in agricultural soils). Each nucleus of R. irregularis is haploid and, thus, represents the genome of the individual, and all nuclei in a homokaryon individual are identical. Each nucleus carries one MAT-type. Because MAT-type variation is positively correlated with variation in the R. irregularis genome, MAT-type variation represents an excellent proxy for studying Rhizophagus variation in populations, which was previously not possible. This will allow direct tracking of introduced Rhizophagus and, finally, the study of AMF invasiveness. Furthermore, it will be the first time researchers have a tool for directly studying the population biology of this important fungus, to measure diversity and to study AMF competition and co-existence.
Fungal isolates included in the study of MAT-locus diversity.
A total of 51 isolates of Rhizophagus species (comprising R. clarus, R. cerebriforme, R. irregularis, R. intraradices, R. proliferus and an undescribed Rhizophagus species) were used in the study for sequencing of the MAT-locus. The isolates originated from the soils of 13 countries on four continents and were isolated between 1981 and 2013 (Table S1). All the isolates were maintained as monoxenic in vitro cultures with root-inducing (Ri) T-DNA transformed carrot (Daucus carota) roots in Petri dishes containing minimal (M) medium solidified with 0.4% phytagel [ 48 ]. All in vitro cultures were initiated from a single spore. The cultures were incubated at 25 °C in the dark for 12 weeks.
For collection of fungal material for DNA extraction, citrate buffer was first used to dissolve the medium [ 35 ]. After dissolving the medium, the mycelium was collected and washed three times with Milli-Q water. Samples were then immediately frozen in liquid nitrogen and stored at -80 °C until DNA extraction. The DNA was extracted with the DNeasy® Plant Mini Kit (Qiagen, Switzerland), following the manufacturer’s protocol. After extraction, DNA samples were purified using the Monarch PCR & DNA Cleanup Kit (New England BioLabs, United States) and quantified using the Qubit™ dsDNA HS assay kit (ThermoFisher Scientific, Switzerland). After quantification, all DNA samples were diluted with Milli-Q water to obtain a final concentration of 2 ng/µL.
Full-length MAT-locus sequences in AMF genomes from previously published studies [ 14 , 36 ] were retrieved and aligned using MUSCLE v5 [ 49 ]. Based on the multiple sequence alignment (MSA), regions were surveyed for genetic variability. Degenerate primers (forward primer (SJF): 5’-CGTGRGCGKATTACCAAGGA-3’ and reverse primer (SJR): 5’-GACATGGTTCAATAATAGAAGAAATCG-3’) were designed manually to yield an amplicon of approximately 300 bp (Table S2). The primers were tested in silico with IDT OligoAnalyzer ( https://eu.idtdna.com/pages/tools/oligoanalyzer ) for melting temperature (Tm) and potential homo- and hetero-dimer formation. Target specificity was also tested and confirmed with the National Center for Biotechnology Information (NCBI) PrimerBLAST tool, using a targeted search against the non-redundant sequence database (NR).
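As an illustration of what the degenerate positions in the forward primer imply, the following minimal Python sketch enumerates the concrete oligonucleotides encoded by each primer. It assumes the standard IUPAC nucleotide ambiguity codes (R = A or G, K = G or T); the function name `expand_degenerate` is hypothetical and is not part of the tools used in this study.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]


# Primer sequences as given in the text (5' to 3').
SJF = "CGTGRGCGKATTACCAAGGA"
SJR = "GACATGGTTCAATAATAGAAGAAATCG"

forward_variants = expand_degenerate(SJF)
reverse_variants = expand_degenerate(SJR)
```

Under these assumptions the forward primer, with two two-fold degenerate positions, expands to four concrete sequences, whereas the reverse primer contains no ambiguity codes and expands only to itself.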
Polymerase chain reaction (PCR) was conducted with Taq PCR Master Mix (Qiagen, Switzerland). Total reaction volume was 20 µL with 2 × Taq PCR Master Mix, 2 µL of primer pair (1 µM) and 4 ng of template DNA. Amplifications were performed in a SimpliAmp™ Thermal Cycler (Applied Biosystems, Switzerland) with 32 cycles of 1 min at 94 °C, 45 s at 55 °C and 1 min at 72 °C, followed by a final extension step of 10 min at 72 °C. PCR amplicons were purified using the Monarch PCR & DNA Cleanup Kit (New England Biolabs, United States) and quantified using a Qubit™ dsDNA HS assay kit (ThermoFisher Scientific, Switzerland). The purified amplicons were diluted to obtain a final concentration of 8 ng/µL and sequenced from both ends using Sanger sequencing technology, with two technical replicates and two biological replicates of each isolate. The sequences were deposited in the International Nucleotide Sequence Database Collaboration (INSDC) and are publicly available through NCBI Genbank ( https://www.ncbi.nlm.nih.gov/nuccore/ ) under accession codes LC738554 to LC738607.
Sequences of the MAT-locus from 51 homokaryon isolates of the present study and 30 publicly available MAT sequences of homokaryons and dikaryons from previous studies [ 14 , 36 ] were aligned using MUSCLE v5 [ 49 ]. After trimming, the 81 sequences, each 281 bp long, were subjected to model testing for Bayesian phylogenetic analysis with jModelTest v2.1.10 [ 50 ]. The resulting best nucleotide substitution model was the Hasegawa-Kishino-Yano (HKY) model with a gamma distribution. To test protein-coding gene divergence, we used the translated amino acid sequences of the partial HD2 gene covered by the sequenced amplicons. The protein evolution model test was conducted on a multiple sequence alignment of the 81 amino acid sequences with ProtTest v3.4.2 [ 51 ]. The selected best amino acid substitution model was the Jones-Taylor-Thornton (JTT) model with a gamma distribution. The Bayesian phylogenetic analyses were conducted with BEAST v2.5 [ 52 ], with 10,000,000 generations and a burn-in of the first 20% of generations. The resulting phylogenies were visualised using iTOL ( https://itol.embl.de ).
Publicly available phosphate transporter gene (PTG) sequences of homokaryon isolates were retrieved if corresponding genome-wide SNP data were also available [ 27 ]. The sequences from a total of 37 isolates were retrieved and used for downstream analyses. Pairwise distance matrices of the published PTG sequences and the corresponding MAT sequences of the current study were built from pairwise nucleotide sequence divergence calculated with the Tamura 3-parameter model [ 37 ]. Codon evolution of MAT alleles and PTG alleles was also tested by analysing the numbers of nonsynonymous ($d_N$) and synonymous ($d_S$) substitutions and their variances, $\mathrm{Var}(d_N)$ and $\mathrm{Var}(d_S)$. Analyses were conducted using the Nei-Gojobori method [ 39 ]. The Z-value used for testing the null hypothesis was $Z = (d_N - d_S) / \sqrt{\mathrm{Var}(d_N) + \mathrm{Var}(d_S)}$. The null hypothesis (H0) for the tests of positive or negative selection was strict neutrality ($d_N = d_S$, $Z = 0$). The threshold to reject the null hypothesis was set to 0.05. Synonymous codon usage bias among the isolates was measured by calculating the codon adaptation index (CAI) of MAT and PTG sequences with CAIcal [ 53 ]. For the intra-isolate codon usage bias calculation, the coding sequences of a total of 26,183 genes in the model R. irregularis isolate (DAOM197198) were used as a reference.
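The Z statistic defined above can be sketched in a few lines of Python. This is a minimal illustration rather than the software used in the study: it assumes that Z is compared against a standard normal distribution with one-sided alternatives, and the input values in the example are hypothetical numbers chosen only to show the calculation.

```python
import math


def z_test_selection(d_n, d_s, var_d_n, var_d_s):
    """Codon-based Z-test of neutrality (H0: dN = dS, Z = 0).

    Returns (Z, p_purifying, p_positive), where p_purifying is the
    one-sided p-value for dN < dS and p_positive the one-sided
    p-value for dN > dS, assuming Z is standard normal under H0.
    """
    z = (d_n - d_s) / math.sqrt(var_d_n + var_d_s)
    # Standard normal cumulative distribution function evaluated at z.
    cdf = 0.5 * math.erfc(-z / math.sqrt(2.0))
    return z, cdf, 1.0 - cdf


# Hypothetical inputs, not values from this study.
z, p_purifying, p_positive = z_test_selection(
    d_n=0.02, d_s=0.10, var_d_n=0.0001, var_d_s=0.0009
)
```

With these illustrative inputs Z is negative and only the purifying-selection alternative falls below the 0.05 threshold, which is the pattern reported for both loci in Table 1.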
To compare the overall genome variation of each isolate with the variation in the PTG and MAT sequences, we built a genome-wide SNP database. Raw reads of homokaryon isolates from two previous studies using double digest restriction-site associated DNA sequencing (ddRAD-seq) [ 27 , 35 ] were downloaded from NCBI and analysed using Stacks v2.3 [ 54 ]. We downloaded data for the same 37 isolates used for building the pairwise distance matrices of MAT-types and PTG. However, for accurate calculation of the genetic differences among isolates, we removed the data of any isolates that were considered ambiguous. By ambiguous, we mean those isolates that were previously assigned to R. irregularis according to Savary et al. [ 27 ], but that did not cluster with R. irregularis according to the MAT or PTG phylogenies of this study. For the calculation of SNPs across the isolates, we did not include any data from dikaryons from previous studies, as they contain mixed reads originating from two different genomes. After excluding ambiguous isolates, 27 R. irregularis homokaryons were retained that could be compared with both MAT and PTG sequences. Low-quality reads were trimmed with PRINSEQ-lite v0.20.4 [ 55 ] with default parameters. Demultiplexing of sequences was performed using the Stacks command “process_radtags”. Demultiplexed sequences from homokaryon isolates were mapped to the version 2.0 genome of R. irregularis DAOM197198 from the Joint Genome Institute (JGI) [ 56 ] as reference using the Burrows-Wheeler Alignment tool (BWA) v0.7.17 [ 57 ], with the default parameters of “bwa mem”. The obtained .sam files were then converted into .bam files with SAMtools v1.10 [ 58 ]. SNP calling at each locus was performed using gstacks from Stacks [ 54 ] and the results were exported as .genepop files using the command “populations --genepop”.
The --min-mapq parameter of gstacks, which sets the minimum PHRED-scaled mapping quality, was set to 60. For populations, the minimum allele count required to process a SNP was set to 2 with --min-mac, and the maximum observed heterozygosity was set to 0 with --max-obs-het. The resulting .genepop files were converted into .gen files. All of the analyses described above were performed on the high-performance computing server of the University of Lausanne, Switzerland. The generated SNP data of the isolates were then converted into a genind object using the adegenet package [59] in R v4.0.0 [60]. Further filtering was applied to the SNP dataset. SNPs in both coding and non-coding regions were retained if they were covered by at least 10 reads. Only SNPs supported by more than 80% of the reads of each isolate were considered. SNPs in the MAT locus and PTG encoding regions were removed to avoid the effect of sequence divergence at those loci on the genome similarity/dissimilarity calculation. This was necessary for subsequent tests of congruence between phylogenies generated using the SNP database and the phylogenies of the PTG and MAT-locus.
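The post-calling filters described above (at least 10 reads of coverage, more than 80% of reads supporting the call, and exclusion of SNPs in the MAT and PTG regions) can be illustrated with a short Python sketch. The `SnpCall` record, the `keep_snp` function, the locus names and the read counts are all hypothetical; the study applied its filters in R, and its exact data structures are not described in the text.

```python
from dataclasses import dataclass


@dataclass
class SnpCall:
    locus: str             # hypothetical locus identifier
    depth: int             # total reads covering the site in one isolate
    supporting_reads: int  # reads supporting the called allele


def keep_snp(call, excluded_loci, min_depth=10, min_support=0.80):
    """Apply the three filters from the text to a single SNP call."""
    if call.locus in excluded_loci:
        return False  # SNP lies in the MAT locus or a PTG region
    if call.depth < min_depth:
        return False  # fewer than 10 reads of coverage
    # Strictly more than 80% of the reads must support the call.
    return call.supporting_reads / call.depth > min_support


# Hypothetical calls for illustration only.
excluded = {"MAT_locus", "PTG"}
calls = [
    SnpCall("scaffold1_1200", depth=25, supporting_reads=24),
    SnpCall("scaffold1_3400", depth=9, supporting_reads=9),
    SnpCall("scaffold2_0150", depth=10, supporting_reads=8),
    SnpCall("MAT_locus", depth=40, supporting_reads=40),
]
kept = [c.locus for c in calls if keep_snp(c, excluded)]
print(kept)
```

In this sketch, a call with exactly 80% support is rejected because the text specifies "more than 80%".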
A Jaccard distance matrix of the isolates based on the genome-wide SNP data was computed with “vegdist” from the vegan R package [61]. Permutational multivariate analysis of variance (PERMANOVA) was computed with “adonis” to test for clustering by geographic origin or MAT-type. The null hypothesis ($H_0$) tested was: There is no difference in nucleus genotype clustering by geographic origin or MAT-type. The genome-wide phylogeny, based on hierarchical clustering analysis, was built using the R package pvclust [62]. The R package ggplot2 [63] was used to produce the plots, which were then annotated with related metadata in Inkscape v1.1. To test the correlations among the divergences of the genome, the MAT-locus and PTG at the distance-matrix level, pairwise Mantel tests were applied with the vegan package [61] in R v4.0.0 [60]. The null hypothesis ($H_0$) tested was: There is no linear correlation between pairs of matrices. To test the congruence of genome, MAT-locus and PTG divergence, dendrograms were built from the corresponding distance matrices with the Ward clustering option in the dendextend R package [64]. Pairwise Baker’s gamma [65] was then calculated, and its statistical significance was tested with 1000 permutations. The null hypothesis ($H_0$) tested was: There is no association between the two phylogenetic trees. To test the codon preferences of the MAT-locus and PTG, as well as of all coding sequences in the reference isolate (DAOM197198), the expected value of the codon adaptation index (eCAI) was calculated with E-CAI [53]. The null hypothesis ($H_0$) tested was: Measured eCAI values are artefacts that arise from internal biases in the G + C composition and/or amino acid composition of the target sequences. In all statistical analyses, the threshold to reject the null hypothesis was set to 0.05.
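The logic of the pairwise Mantel test mentioned above can be sketched in plain Python. This is an illustrative re-implementation, not the vegan code used in the study: the function names, the toy distance matrices, the number of permutations and the one-sided p-value convention are all assumptions.

```python
import math
import random


def upper_triangle(matrix, order):
    """Off-diagonal upper-triangle entries after reordering rows and columns."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def mantel_test(d1, d2, permutations=999, seed=0):
    """Return (r, p) for H0: no linear correlation between d1 and d2.

    Rows and columns of d2 are permuted together, which keeps each
    isolate's distances intact while breaking the pairing with d1.
    """
    rng = random.Random(seed)
    identity = list(range(len(d1)))
    x = upper_triangle(d1, identity)
    observed = pearson(x, upper_triangle(d2, identity))
    hits = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)
        if pearson(x, upper_triangle(d2, perm)) >= observed:
            hits += 1
    # One-sided p-value; the observed statistic counts as one permutation.
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical distance matrices for five isolates (illustration only).
genome_dist = [
    [0.0, 0.1, 0.5, 0.6, 0.7],
    [0.1, 0.0, 0.4, 0.6, 0.8],
    [0.5, 0.4, 0.0, 0.2, 0.3],
    [0.6, 0.6, 0.2, 0.0, 0.1],
    [0.7, 0.8, 0.3, 0.1, 0.0],
]
mat_dist = [
    [0.0, 0.2, 0.6, 0.5, 0.9],
    [0.2, 0.0, 0.5, 0.7, 0.8],
    [0.6, 0.5, 0.0, 0.3, 0.2],
    [0.5, 0.7, 0.3, 0.0, 0.2],
    [0.9, 0.8, 0.2, 0.2, 0.0],
]
r, p = mantel_test(genome_dist, mat_dist)
print(f"Mantel r = {r:.3f}, p = {p:.3f}")
```

Permuting whole rows and columns, rather than individual matrix entries, is what distinguishes a Mantel test from an ordinary correlation test, because the pairwise distances within one matrix are not independent of each other.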
The data generated and/or analysed during the current study are deposited in the International Nucleotide Sequence Database Collaboration (INSDC) and are publicly available through NCBI GenBank (https://www.ncbi.nlm.nih.gov/nuccore/) under accession codes LC738554 to LC738607.
AMF: Arbuscular mycorrhizal fungi
ddRAD: Double digest restriction-site associated DNA
MAT: Mating-type locus
PTG: Phosphate transporter gene
SNP: Single nucleotide polymorphism
Smith S, Read D. Mycorrhizal Symbiosis. 3rd ed. London: Academic Press; 2008.
Brundrett MC, Tedersoo L. Evolutionary history of mycorrhizal symbioses and global host plant diversity. New Phytol. 2018;220(4):1108–15.
van der Heijden MG, Klironomos JN, Ursic M, Moutoglis P, Streitwolf-Engel R, Boller T, Wiemken A, Sanders IR. Mycorrhizal fungal diversity determines plant biodiversity, ecosystem variability and productivity. Nature. 1998;396:69–72.
van der Heijden MG, Wiemken A, Sanders IR. Different arbuscular mycorrhizal fungi alter coexistence and resource distribution between co-occurring plant. New Phytol. 2003;157(3):569–78.
Angelard C, Colard A, Niculita-Hirzel H, Croll D, Sanders IR. Segregation in a mycorrhizal fungus alters rice growth and symbiosis-specific gene transcription. Curr Biol. 2010;20(13):1216–21.
Koch AM, Antunes PM, Klironomos JN. Diversity effects on productivity are stronger within than between trophic groups in the arbuscular mycorrhizal symbiosis. PLoS ONE. 2012;7(5): e36950.
Nuccio EE, Hodge A, Pett-Ridge J, Herman DJ, Weber PK, Firestone MK. An arbuscular mycorrhizal fungus significantly modifies the soil bacterial community and nitrogen cycling during litter decomposition. Environ Microbiol. 2013;15(6):1870–81.
Pena Venegas RA, Lee SJ, Thuita M, Mlay DP, Masso C, Vanlauwe B, Rodriguez A, Sanders IR. The Phosphate Inhibition Paradigm: Host and Fungal Genotypes Determine Arbuscular Mycorrhizal Fungal Colonization and Responsiveness to Inoculation in Cassava With Increasing Phosphorus Supply. Front Plant Sci. 2021;12: 693037.
Judson OP, Normark BB. Ancient asexual scandals. Trends Ecol Evol. 1996;11(2):41–6.
Yamato M, Yamada H, Maeda T, Yamamoto K, Kusakabe R, Orihara T. Clonal spore populations in sporocarps of arbuscular mycorrhizal fungi. Mycorrhiza. 2022;32(5–6):373–85.
Halary S, Daubois L, Terrat Y, Ellenberger S, Wostemeyer J, Hijri M. Mating type gene homologues and putative sex pheromone-sensing pathway in arbuscular mycorrhizal fungi, a presumably asexual plant root symbiont. PLoS ONE. 2013;8(11): e80729.
Croll D, Sanders IR. Recombination in Glomus intraradices, a supposed ancient asexual arbuscular mycorrhizal fungus. BMC Evol Biol. 2009;9:13.
Gandolfi A, Sanders IR, Rossi V, Menozzi P. Evidence of recombination in putative ancient asexuals. Mol Biol Evol. 2003;20(5):754–61.
Ropars J, Toro KS, Noel J, Pelin A, Charron P, Farinelli L, Marton T, Kruger M, Fuchs J, Brachmann A, et al. Evidence for the sexual origin of heterokaryosis in arbuscular mycorrhizal fungi. Nat Microbiol. 2016;1(6):16033.
Nieuwenhuis BP, Billiard S, Vuilleumier S, Petit E, Hood ME, Giraud T. Evolution of uni- and bifactorial sexual compatibility systems in fungi. Heredity (Edinb). 2013;111(6):445–55.
Casselton LA, Olesnicky NS. Molecular genetics of mating recognition in basidiomycete fungi. Microbiol Mol Biol Rev. 1998;62(1):55–70.
May G, Shaw F, Badrane H, Vekemans X. The signature of balancing selection: fungal mating compatibility gene evolution. Proc Natl Acad Sci U S A. 1999;96(16):9172–7.
Casselton LA. Mate recognition in fungi. Heredity (Edinb). 2002;88(2):142–7.
Constable GWA, Kokko H. The rate of facultative sex governs the number of expected mating types in isogamous species. Nat Ecol Evol. 2018;2(7):1168–75.
Kothe E. Tetrapolar fungal mating types: sexes by the thousands. FEMS Microbiol Rev. 1996;18(1):65–87.
Peris D, Lu DS, Kinneberg VB, Methlie IS, Dahl MS, James TY, Kauserud H, Skrede I. Large-scale fungal strain sequencing unravels the molecular diversity in mating loci maintained by long-term balancing selection. PLoS Genet. 2022;18(3): e1010097.
Wang Q, Wang S, Xiong CL, James TY, Zhang XG. Mating-type genes of the anamorphic fungus Ulocladium botrytis affect both asexual sporulation and sexual reproduction. Sci Rep. 2017;7(1):7932.
Chen EC, Mathieu S, Hoffrichter A, Sedzielewska-Toro K, Peart M, Pelin A, Ndikumana S, Ropars J, Dreissig S, Fuchs J, et al. Single nucleus sequencing reveals evidence of inter-nucleus recombination in arbuscular mycorrhizal fungi. Elife. 2018;7:e39813.
Auxier B, Bazzicalupo A. Comment on “Single nucleus sequencing reveals evidence of inter-nucleus recombination in arbuscular mycorrhizal fungi.” Elife. 2019;8:e47301.
Mateus ID, Auxier B, Ndiaye MMS, Cruz J, Lee SJ, Sanders IR. Reciprocal recombination genomic signatures in the symbiotic arbuscular mycorrhizal fungi Rhizophagus irregularis. PLoS ONE. 2022;17(7): e0270481.
Sperschneider J, Yildirir G, Rizzi Y, Malar CM, Sorwar E, Chen E, Iwasaki W, Brauer E, Bosnich W, Gutjahr C, et al. Resolving the haplotypes of arbuscular mycorrhizal fungi highlights the role of two nuclear populations in host interactions. bioRxiv. 2023:2023.01.15.524138.
Savary R, Masclaux FG, Wyss T, Droh G, Cruz Corella J, Machado AP, Morton JB, Sanders IR. A population genomics approach shows widespread geographical distribution of cryptic genomic forms of the symbiotic fungus Rhizophagus irregularis. ISME J. 2018;12(1):17–30.
Kokkoris V, Chagnon PL, Yildirir G, Clarke K, Goh D, MacLean AM, Dettman J, Stefani F, Corradi N. Host identity influences nuclear dynamics in arbuscular mycorrhizal fungi. Curr Biol. 2021;31(7):1531-1538 e1536.
Hart MM, Antunes PM, Chaudhary VB, Abbott LK. Fungal inoculants in the field: Is the reward greater than the risk? Funct Ecol. 2018;32(1):126–35.
Neher RA, Kessinger TA, Shraiman BI. Coalescence and genetic diversity in sexual populations under selection. Proc Natl Acad Sci U S A. 2013;110(39):15836–41.
Dapper AL, Payseur BA. Connecting theory and data to understand recombination rate evolution. Philos Trans R Soc Lond B Biol Sci. 2017;372(1736):20160469.
Peter J, De Chiara M, Friedrich A, Yue JX, Pflieger D, Bergstrom A, Sigwalt A, Barre B, Freel K, Llored A, et al. Genome evolution across 1,011 Saccharomyces cerevisiae isolates. Nature. 2018;556(7701):339–44.
Mouresan EF, Gonzalez-Rodriguez A, Canas-Alvarez JJ, Munilla S, Altarriba J, Diaz C, Baro JA, Molina A, Lopez-Buesa P, Piedrafita J, et al. Mapping Recombination Rate on the Autosomal Chromosomes Based on the Persistency of Linkage Disequilibrium Phase Among Autochthonous Beef Cattle Populations in Spain. Front Genet. 2019;10:1170.
Hartmann FE, Duhamel M, Carpentier F, Hood ME, Foulongne-Oriol M, Silar P, Malagnac F, Grognet P, Giraud T. Recombination suppression and evolutionary strata around mating-type loci in fungi: documenting patterns and understanding evolutionary and mechanistic causes. New Phytol. 2021;229(5):2470–91.
Wyss T, Masclaux FG, Rosikiewicz P, Pagni M, Sanders IR. Population genomics reveals that within-fungus polymorphism is common and maintained in populations of the mycorrhizal fungus Rhizophagus irregularis. ISME J. 2016;10(10):2514–26.
Chaturvedi A, Cruz Corella J, Robbins C, Loha A, Menin L, Gasilova N, Masclaux FG, Lee SJ, Sanders IR. The methylome of the model arbuscular mycorrhizal fungus, Rhizophagus irregularis, shares characteristics with early diverging fungi and Dikarya. Commun Biol. 2021;4(1):901.
Tamura K. Estimation of the number of nucleotide substitutions when there are strong transition-transversion and G+C-content biases. Mol Biol Evol. 1992;9(4):678–87.
Cvijovic I, Good BH, Desai MM. The Effect of Strong Purifying Selection on Genetic Diversity. Genetics. 2018;209(4):1235–78.
Nei M, Gojobori T. Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol Biol Evol. 1986;3(5):418–26.
Sokolski S, Dalpe Y, Piche Y. Phosphate transporter genes as reliable gene markers for the identification and discrimination of arbuscular mycorrhizal fungi in the genus Glomus. Appl Environ Microbiol. 2011;77(5):1888–91.
Yildirir G, Sperschneider J, Malar CM, Chen ECH, Iwasaki W, Cornell C, Corradi N. Long reads and Hi-C sequencing illuminate the two-compartment genome of the model arbuscular mycorrhizal symbiont Rhizophagus irregularis. New Phytol. 2022;233(3):1097–107.
Mateus ID, Lee SJ, Sanders IR. Co-existence of AMF with different putative MAT-alleles induces genes homologous to those involved in mating in other fungi: a reply to Malar, et al. ISME J. 2021;15(8):2180–2.
Mateus ID, Rojas EC, Savary R, Dupuis C, Masclaux FG, Aletti C, Sanders IR. Coexistence of genetically different Rhizophagus irregularis isolates induces genes involved in a putative fungal mating response. ISME J. 2020;14(10):2381–94.
Karacsony Z, Gacser A, Vagvolgyi C, Scazzocchio C, Hamari Z. A dually located multi-HMG-box protein of Aspergillus nidulans has a crucial role in conidial and ascospore germination. Mol Microbiol. 2014;94(2):383–402.
Smith MJ. The Evolution of Sex. Cambridge: Cambridge University Press; 1976.
Schwander T, Henry L, Crespi BJ. Molecular evidence for ancient asexuality in Timema stick insects. Curr Biol. 2011;21(13):1129–34.
Ceballos I, Mateus ID, Peña R, Peña-Quemba DC, Robbins C, Ordoñez YM, Rosikiewicz P, Rojas EC, Thuita M, Mlay DP, et al. Using variation in arbuscular mycorrhizal fungi to drive the productivity of the food security crop cassava. bioRxiv. 2019:830547.
St-Arnaud M, Hamel C, Vimard B, Caron M, Fortin JA. Enhanced hyphal growth and spore production of the arbuscular mycorrhizal fungus Glomus intraradices in an in vitro system in the absence of host roots. Mycol Res. 1996;100(3):328–32.
Edgar RC. MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics. 2004;5:113.
Darriba D, Taboada GL, Doallo R, Posada D. jModelTest 2: more models, new heuristics and parallel computing. Nat Methods. 2012;9(8):772.
Darriba D, Taboada GL, Doallo R, Posada D. ProtTest 3: fast selection of best-fit models of protein evolution. Bioinformatics. 2011;27(8):1164–5.
Bouckaert R, Vaughan TG, Barido-Sottani J, Duchene S, Fourment M, Gavryushkina A, Heled J, Jones G, Kuhnert D, De Maio N, et al. BEAST 2.5: an advanced software platform for Bayesian evolutionary analysis. PLoS Comput Biol. 2019;15(4):e1006650.
Puigbo P, Bravo IG, Garcia-Vallve S. E-CAI: a novel server to estimate an expected value of Codon Adaptation Index (eCAI). BMC Bioinformatics. 2008;9:65.
Rochette NC, Rivera-Colon AG, Catchen JM. Stacks 2: Analytical methods for paired-end sequencing improve RADseq-based population genomics. Mol Ecol. 2019;28(21):4737–54.
Schmieder R, Edwards R. Quality control and preprocessing of metagenomic datasets. Bioinformatics. 2011;27(6):863–4.
Chen ECH, Morin E, Beaudet D, Noel J, Yildirir G, Ndikumana S, Charron P, St-Onge C, Giorgi J, Kruger M, et al. High intraspecific genome diversity in the model arbuscular mycorrhizal symbiont Rhizophagus irregularis. New Phytol. 2018;220(4):1161–71.
Li H, Durbin R. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics. 2010;26(5):589–95.
Li H. A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics. 2011;27(21):2987–93.
Jombart T. adegenet: a R package for the multivariate analysis of genetic markers. Bioinformatics. 2008;24(11):1403–5.
R Core Team. A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing; 2020.
Oksanen J, Kindt R, Legendre P, O'Hara B, Simpson G, Solymos P, Stevens M, Wagner H. vegan: Community Ecology Package. 2020.
Suzuki R, Shimodaira H. Pvclust: an R package for assessing the uncertainty in hierarchical clustering. Bioinformatics. 2006;22(12):1540–2.
Wickham H. ggplot2: Elegant graphics for data analysis. New York: Springer-Verlag; 2016.
Galili T. dendextend: an R package for visualizing, adjusting and comparing trees of hierarchical clustering. Bioinformatics. 2015;31(22):3718–20.
Baker FB. Stability of two hierarchical grouping techniques case 1: sensitivity to data errors. J Am Stat Assoc. 1974;69(346):440–5.
We would like to thank Jerome Gippet, Daniel Croll and Tristan Cumer for critical discussion and insightful comments on the manuscript. We thank Jinwon Kim for the assistance in graphical illustrations. We thank anonymous reviewers for their insightful comments for improving the manuscript quality.
Open access funding provided by University of Lausanne. The research was funded by the Swiss National Science Foundation (project no. 310030B_182826).
Soon-Jae Lee and Eric Risse contributed equally to this work.
Department of Ecology and Evolution, University of Lausanne, Lausanne, 1015, Switzerland
Soon-Jae Lee, Eric Risse, Ivan D. Mateus & Ian R. Sanders
SJL and IRS designed the study; SJL and ER performed the experiment; SJL and ER analysed data; SJL and IRS interpreted data and wrote the manuscript; ER and IM participated in the data interpretation and manuscript writing. IRS acquired project funding.
Correspondence to Ian R. Sanders .
Ethics approval and consent to participate.
Not applicable: Ethics approval and consent to participate was not required as the study did not involve human or animal subjects, human data or human tissues.
Not applicable: This study does not include any individual person’s data and thus no consent to publish was necessary.
The authors declare no competing interest.
Publisher’s note.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .
Cite this article.
Lee, SJ., Risse, E., Mateus, I.D. et al. Evolution of unexpected diversity in a putative mating type locus and its correlation with genome variability reveals likely asexuality in the model mycorrhizal fungus Rhizophagus irregularis . BMC Genomics 25 , 888 (2024). https://doi.org/10.1186/s12864-024-10770-9
Received : 18 August 2023
Accepted : 04 September 2024
Published : 20 September 2024
DOI : https://doi.org/10.1186/s12864-024-10770-9
ISSN: 1471-2164
A hypothesis is a fundamental concept in the world of research and statistics. It is a testable statement that explains what is happening or observed. It proposes the relation between the various participating variables. A hypothesis is also called a theory, thesis, guess, assumption, or suggestion. A hypothesis creates a structure that ...
Definition: Hypothesis is an educated guess or proposed explanation for a phenomenon, based on some initial observations or data. It is a tentative statement that can be tested and potentially proven or disproven through further investigation and experimentation. Hypothesis is often used in scientific research to guide the design of experiments ...
A hypothesis is a tentative statement about the relationship between two or more variables. It is a specific, testable prediction about what you expect to happen in a study. It is a preliminary answer to your question that helps guide the research process. Consider a study designed to examine the relationship between sleep deprivation and test ...
13 Different Types of Hypothesis. There are 13 different types of hypothesis. These include simple, complex, null, alternative, composite, directional, non-directional, logical, empirical, statistical, associative, exact, and inexact. A hypothesis can be categorized into one or more of these types. However, some are mutually exclusive and ...
It seeks to explore and understand a particular aspect of the research subject. In contrast, a research hypothesis is a specific statement or prediction that suggests an expected relationship between variables. It is formulated based on existing knowledge or theories and guides the research design and data analysis.
Simple hypothesis. A simple hypothesis is a statement that reflects the relation between exactly two variables: one independent and one dependent. Consider the example, "Smoking is a prominent cause of lung cancer." The dependent variable, lung cancer, is dependent on the independent variable, smoking.
Here are a few different types of hypotheses: Simple hypothesis: A simple hypothesis predicts a relationship between an independent and a dependent variable. Complex hypothesis: A complex hypothesis looks at the relationship between two or more independent variables and two or more dependent variables. Empirical hypothesis: An empirical ...
A research hypothesis helps test theories. A hypothesis plays a pivotal role in the scientific method by providing a basis for testing existing theories. For example, a hypothesis might test the predictive power of a psychological theory on human behavior. It serves as a great platform for investigation activities.
Examples. A research hypothesis, in its plural form "hypotheses," is a specific, testable prediction about the anticipated results of a study, established at its outset. It is a key component of the scientific method. Hypotheses connect theory to data and guide the research process towards expanding scientific understanding.
5. Phrase your hypothesis in three ways. To identify the variables, you can write a simple prediction in if…then form. The first part of the sentence states the independent variable and the second part states the dependent variable. If a first-year student starts attending more lectures, then their exam scores will improve.
A hypothesis is a prediction of what will be found at the outcome of a research project and is typically focused on the relationship between two different variables studied in the research. It is usually based on both theoretical expectations about how things work and already existing scientific evidence. Within social science, a hypothesis can ...
If the hypothesis is a relational hypothesis, then it should be stating the relationship between variables. The hypothesis must be specific and should have scope for conducting more tests. The way of explanation of the hypothesis must be very simple and it should also be understood that the simplicity of the hypothesis is not related to its ...
A hypothesis is a research-based prediction of an outcome involving at least two variables in an experiment or test. For a prediction to be considered a hypothesis, it must be testable. In other words, you need to be able to manipulate the two variables to prove your prediction. It must also be falsifiable.
Types of Research Hypotheses. There are seven different types of research hypotheses. Simple Hypothesis. A simple hypothesis predicts the relationship between a single dependent variable and a single independent variable. Complex Hypothesis. A complex hypothesis predicts the relationship between two or more independent and dependent variables.
Hypothesis is a prediction of the outcome of a study. Hypotheses are drawn from theories and research questions or from direct observations. In fact, a research problem can be formulated as a hypothesis. To test the hypothesis we need to formulate it in terms that can actually be analysed with statistical tools.
The Function of the Hypotheses. A hypothesis states what one is looking for in an experiment. When facts are assembled, ordered, and seen in a relationship, they build up to become a theory. This theory needs to be deduced for further confirmation of the facts, this formulation of the deductions constitutes of a hypothesis.
Types of Hypotheses; Hypothesis Formulation; Hypotheses and Variables; The Importance of Testing Hypotheses; The Hypothesis and Sociological Theory; Conclusion; In sociology, as in other scientific disciplines, the hypothesis serves as a crucial building block for research. It is a central element that directs the inquiry and provides a ...
The seven types of hypotheses are listed below: 5,6,7. Simple: Predicts the relationship between a single dependent variable and a single independent variable. Example: Exercising in the morning every day will increase your productivity. Complex: Predicts the relationship between two or more variables.
Types. Here's a snapshot to help you differentiate between all types of research hypothesis easily. Detailed explanation of each type is as follows: 1. Simple Hypothesis. This type looks at how two variables might be related to each other. These variables are the dependent variable and independent variable.
Types of Hypothesis. The hypothesis can be broadly classified into different types. They are: Simple Hypothesis. A simple hypothesis is a hypothesis that there exists a relationship between two variables. One is called a dependent variable, and the other is called an independent variable. Complex Hypothesis.
Directional Hypothesis . The Directional hypothesis, on the other hand, asserts the direction of effect of the relationship that exists between two variables. Herein, the hypothesis clearly states that variable A affects variable B, or vice versa. Statistical Hypothesis . A statistical hypothesis is a hypothesis that can be verified to be ...
Hypothesis plays an important role in any research project; it's a stepping stone to proving a theory. Hypothesis serves in establishing a connection to the underlying theory and particular research subject. It helps in data processing and evaluates the reliability and validity of the study.
A null hypothesis is a type of conjecture in statistics that proposes that there is no difference between certain characteristics of a population or data-generating process. The alternative ...