
What are sampling methods and how do you choose the best one?

Posted on 18th November 2020 by Mohamed Khalifa


This tutorial will introduce sampling methods and potential sampling errors to avoid when conducting medical research.

Introduction to sampling methods

Examples of different sampling methods, choosing the best sampling method.

It is important to understand why we sample the population. Studies are built, for example, to investigate the relationships between risk factors and disease. In other words, we want to find out whether an association is true, while keeping the risk of errors such as chance, bias, or confounding to a minimum.

However, it would not be feasible to experiment on the whole population. Instead, we need to take a good sample and reduce the risk of error through proper sampling technique.

What is a sampling frame?

A sampling frame is a record of the target population containing all participants of interest. In other words, it is a list from which we can extract a sample.

What makes a good sample?

A good sample should be a representative subset of the population we are interested in studying, with each participant having an equal chance of being randomly selected into the study.

We could choose a sampling method based on whether we want to account for sampling bias; a random sampling method is often preferred over a non-random method for this reason. Random sampling examples include: simple, systematic, stratified, and cluster sampling. Non-random sampling methods are liable to bias, and common examples include: convenience, purposive, snowballing, and quota sampling. For the purposes of this blog we will be focusing on random sampling methods.

Simple random sampling

Example: We want to conduct an experimental trial in a small population, such as employees in a company or students in a college. We include everyone in a list and use a random number generator to select the participants.

Advantages: Generalisable results possible, random sampling, the sampling frame is the whole population, every participant has an equal probability of being selected

Disadvantages: Less precise than stratified method, less representative than the systematic method

Simple sampling method example in stick men.
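The list-plus-random-number-generator procedure in the example above can be sketched in a few lines of Python. This is only an illustration: the employee list, the sample size of 20 and the seed are assumptions made for the sketch, not values from the tutorial.

```python
import random

def simple_random_sample(sampling_frame, sample_size, seed=None):
    """Draw `sample_size` distinct members; every member is equally likely."""
    rng = random.Random(seed)  # a seed only makes the illustration repeatable
    return rng.sample(sampling_frame, sample_size)

# Hypothetical sampling frame: a list of every employee in a small company.
employees = [f"employee_{i:03d}" for i in range(1, 201)]

participants = simple_random_sample(employees, sample_size=20, seed=42)
print(participants)
```

Because `random.sample` draws without replacement, nobody can be selected twice, which matches drawing names from a list.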

Systematic sampling

Example: Every nth patient entering the out-patient clinic is selected and included in our sample.

Advantages: More feasible than simple or stratified methods, sampling frame is not always required

Disadvantages:  Generalisability may decrease if baseline characteristics repeat across every nth participant

Systematic sampling method example in stick men
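As a rough sketch of the "every nth patient" rule, the snippet below selects from a stream of arrivals, so no sampling frame is needed in advance. The patient labels, the interval of 10 and the random starting offset are assumptions chosen for illustration.

```python
import random
from itertools import islice

def systematic_sample(arrivals, interval, seed=None):
    """Yield every `interval`-th item of an iterable after a random start."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    start = random.Random(seed).randrange(interval)  # offset in 0..interval-1
    # islice walks any iterable lazily, so a complete list is not required.
    yield from islice(arrivals, start, None, interval)

# Hypothetical stream of patients entering an out-patient clinic.
patients = (f"patient_{i}" for i in range(1, 101))

selected = list(systematic_sample(patients, interval=10, seed=1))
print(selected)
```

If some characteristic repeats with the same period as the interval (for instance, a specialist clinic held every tenth slot), the selected patients will share it, which is the generalisability risk noted above.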

Stratified sampling

Example: We have a big population (a city) and we want to ensure representativeness of all groups with a pre-determined characteristic, such as age group, ethnic origin, or gender.

Advantages:  Inclusive of strata (subgroups), reliable and generalisable results

Disadvantages: Does not work well with multiple variables

Stratified sampling method example stick men
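One way to picture stratified sampling is to group the population by the chosen characteristic and then draw a simple random sample inside each group. The sketch below assumes a single stratifying variable (age group), a made-up population of 1,000 residents and a 10% sampling fraction; none of these figures come from the tutorial.

```python
import random
from collections import defaultdict

def stratified_sample(people, stratum_of, fraction, seed=None):
    """Sample the same fraction at random from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in people:
        strata[stratum_of(person)].append(person)
    sample = []
    for label in sorted(strata):
        members = strata[label]
        quota = round(len(members) * fraction)  # proportional allocation
        sample.extend(rng.sample(members, quota))
    return sample

# Hypothetical city residents, each tagged with an age group.
residents = (
    [{"id": i, "age_group": "18-39"} for i in range(500)]
    + [{"id": i, "age_group": "40-64"} for i in range(500, 800)]
    + [{"id": i, "age_group": "65+"} for i in range(800, 1000)]
)

sample = stratified_sample(residents, lambda p: p["age_group"], fraction=0.1, seed=7)
```

Stratifying on several variables at once multiplies the number of strata, and small strata quickly become hard to fill, which is one reading of the "multiple variables" disadvantage above.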

Cluster sampling

Example: 10 schools across the county have the same number of students. We can randomly select 3 out of 10 schools as our clusters.

Advantages: Readily doable with most budgets, does not require a sampling frame of individuals (only a list of clusters)

Disadvantages: Results may be neither reliable nor generalisable

Cluster sampling method example with stick men
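The school example can be sketched as follows: we randomise over whole schools rather than over students, and then take everyone in the chosen schools. The school names and the figure of 50 students per school are invented for the illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters and include every member of each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    members = [m for name in chosen for m in clusters[name]]
    return chosen, members

# Hypothetical county: 10 schools with 50 students each.
schools = {
    f"school_{s}": [f"school_{s}_student_{i}" for i in range(1, 51)]
    for s in range(1, 11)
}

chosen_schools, students = cluster_sample(schools, n_clusters=3, seed=3)
print(chosen_schools, len(students))
```

Only the list of schools is needed up front; student lists are required only for the three schools that are drawn.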

How can you identify sampling errors?

Non-random selection increases the probability of sampling (selection) bias if the sample does not represent the population we want to study. We could avoid this by random sampling and by ensuring that our sample is large enough to be representative.

An inadequate sample size decreases the confidence in our results, as we may conclude that there is no significant difference when there actually is one. This type II error results from having a small sample size, or from participants dropping out of the sample.

In medical research of disease, if we select people with certain diseases while strictly excluding participants with other co-morbidities, we run the risk of diagnostic purity bias, where important sub-groups of the population are not represented.

Furthermore, measurement bias may occur during recollection of risk factors by participants (recall bias) or during assessment of outcome, where people who live longer are associated with treatment success when in fact people who died were not included in the sample or data analysis (survivorship bias).
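To make the link between sample size and type II error concrete, the following simulation repeatedly runs an imaginary two-group study in which a real difference exists and counts how often the study fails to detect it. Everything here is an assumption for illustration: the outcome is normally distributed with a known standard deviation of 1, the true difference is 0.5, and a two-sided z-test at the 5% level is used.

```python
import random
from statistics import NormalDist, fmean

def type_ii_error_rate(n_per_group, true_diff, sd=1.0, alpha=0.05,
                       trials=2000, seed=0):
    """Fraction of simulated studies that miss a difference that truly exists."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    se = sd * (2 / n_per_group) ** 0.5             # SE of a difference in means
    misses = 0
    for _ in range(trials):
        control = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        treated = [rng.gauss(true_diff, sd) for _ in range(n_per_group)]
        z = (fmean(treated) - fmean(control)) / se
        if abs(z) < z_crit:                        # "no significant difference"
            misses += 1
    return misses / trials

small = type_ii_error_rate(n_per_group=10, true_diff=0.5)
large = type_ii_error_rate(n_per_group=100, true_diff=0.5)
print(f"type II error rate with 10 per group:  {small:.2f}")
print(f"type II error rate with 100 per group: {large:.2f}")
```

Under these assumptions the small study should miss the true difference far more often than the large one, which is why a formal sample size calculation is normally done before recruitment.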

By following the steps below we could choose the best sampling method for our study in an orderly fashion.

Research objectives

Firstly, a refined research question and goal would help us define our population of interest. If our calculated sample size is small then it would be easier to get a random sample. If, however, the sample size is large, then we should check if our budget and resources can handle a random sampling method.

Sampling frame availability

Secondly, we need to check whether a sampling frame is available (simple random sampling); if not, could we make a list of our own (stratified sampling)? If neither option is possible, we could still use other random sampling methods, for instance, systematic or cluster sampling.

Study design

Moreover, we could consider the prevalence of the topic (exposure or outcome) in the population, and which study design would be suitable. In addition, we should check whether our target population varies widely in its baseline characteristics. For example, a population with large ethnic subgroups could best be studied using a stratified sampling method.

Random sampling

Finally, the best sampling method is always the one that could best answer our research question while also allowing for others to make use of our results (generalisability of results). When we cannot afford a random sampling method, we can always choose from the non-random sampling methods.

To sum up, we now understand that choosing between random and non-random sampling methods is multifactorial. We might often be tempted to choose a convenience sample from the start, but that would not only decrease the precision of our results but also make us miss out on producing research that is more robust and reliable.




Emerg (Tehran), v.5(1); 2017

Sampling methods in Clinical Research; an Educational Review

Mohamed Elfil

1 Faculty of Medicine, Alexandria University, Egypt.

Ahmed Negida

2 Faculty of Medicine, Zagazig University, Egypt.

Clinical research usually involves patients with a certain disease or a condition. The generalizability of clinical research findings is based on multiple factors related to the internal and external validity of the research methods. The main methodological issue that influences the generalizability of clinical research findings is the sampling method. In this educational article, we are explaining the different sampling methods in clinical research.

Introduction

In clinical research, we define the population as a group of people who share a common characteristic or condition, usually the disease. If we are conducting a study on patients with ischemic stroke, it will be difficult to include the whole population of ischemic stroke patients all over the world, because it is difficult to locate and gain access to the entire population. Therefore, the practical approach in clinical research is to include a part of this population, called the “sample population”. The whole population is sometimes called the “target population”, while the sample population is called the “study population”. When doing a research study, we should ensure that the sample is as representative of the target population as possible, with the least possible error and without substitution or incompleteness. The process of selecting a sample population from the target population is called the “sampling method”.

Sampling types

There are two major categories of sampling methods (figure 1): (1) probability sampling methods, where all subjects in the target population have equal chances of being selected in the sample [1, 2], and (2) non-probability sampling methods, where the sample population is selected in a non-systematic process that does not guarantee equal chances for each subject in the target population [2, 3]. Samples selected using probability sampling methods are more representative of the target population.

Figure 1. Sampling methods.

Probability sampling method

Simple random sampling

This method is used when the whole population is accessible and the investigators have a list of all subjects in this target population. The list of all subjects in this population is called the “sampling frame”. From this list, we draw a random sample using the lottery method or a computer-generated random list [4].

Stratified random sampling

This method is a modification of simple random sampling; therefore, it also requires a sampling frame to be available. However, in this method, the whole population is divided into homogeneous strata or subgroups according to a demographic factor (e.g. gender, age, religion, socio-economic level, education, or diagnosis). Then, the researchers draw a random sample from each of the different strata [3, 4]. The advantages of this method are: (1) it allows researchers to obtain an effect size from each stratum separately, as if it were a different study, so that between-group differences become apparent; and (2) it allows obtaining samples from minority or under-represented populations. If the researchers used simple random sampling, a minority population would remain under-represented in the sample as well, simply because the simple random method usually reproduces the composition of the whole target population. In such cases, investigators are better served by the stratified random sample, which obtains adequate samples from all strata in the population.
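The point about minority populations can be illustrated with a small comparison. In the sketch below, the population of 950 majority-group and 50 minority-group patients, the sample sizes and the fixed quota of 40 subjects per stratum are all assumptions chosen for illustration; a fixed quota per stratum is only one of several ways to allocate a stratified sample.

```python
import random

rng = random.Random(11)

# Hypothetical target population: 950 majority-group and 50 minority-group patients.
population = [("majority", i) for i in range(950)] + [("minority", i) for i in range(50)]

# Simple random sample of 100: the minority count tends to mirror its ~5% share.
srs = rng.sample(population, 100)
srs_minority = sum(1 for group, _ in srs if group == "minority")

# Stratified sample with a fixed quota of 40 subjects drawn from each stratum.
strata = {"majority": [], "minority": []}
for person in population:
    strata[person[0]].append(person)
stratified = [p for members in strata.values() for p in rng.sample(members, 40)]
strat_minority = sum(1 for group, _ in stratified if group == "minority")

print("minority subjects in the simple random sample:", srs_minority)
print("minority subjects in the stratified sample:   ", strat_minority)
```

When strata are deliberately over-sampled like this, estimates for the whole population need to be weighted back to the true stratum sizes.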

Systematic random sampling (Interval sampling)

In this method, the investigators select subjects to be included in the sample based on a systematic rule, using a fixed interval. For example, if the rule is to include the last patient from every 5 patients, we will include patients with these numbers (5, 10, 15, 20, 25, etc.). In some situations, it is not necessary to have the sampling frame if there is a specific hospital or center which the patients are visiting regularly. In this case, the researcher can start randomly and then systematically choose the next patients using a fixed interval [4].

Cluster sampling (Multistage sampling)

It is used when creating a sampling frame is nearly impossible due to the large size of the population. In this method, the population is divided by geographic location into clusters. A list of all clusters is made and investigators draw a random number of clusters to be included. Then, they list all individuals within these clusters and run another round of random selection to get a final random sample, exactly as in simple random sampling. This method is called multistage because the selection passes through two stages: first, the selection of eligible clusters, and then the selection of the sample from individuals within these clusters. For example, if we are conducting a research project on primary school students in Iran, it will be very difficult to get a list of all primary school students all over the country. In this case, a list of primary schools is made and the researcher randomly picks a number of schools, then picks a random sample from the eligible schools [3].
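A minimal sketch of the two stages, using invented school and pupil labels, is given below. Drawing the same number of pupils from each selected school is an assumption of this sketch; the text above would equally allow pooling the pupils of all selected schools and drawing one random sample from the pooled list.

```python
import random

def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: pick clusters at random. Stage 2: sample individuals within them."""
    rng = random.Random(seed)
    # Stage 1: only the list of clusters is needed here.
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        members = clusters[name]              # listed only for chosen clusters
        k = min(n_per_cluster, len(members))
        # Stage 2: simple random sample inside the selected cluster.
        sample.extend(rng.sample(members, k))
    return chosen, sample

# Hypothetical list of 40 primary schools with 120 pupils each.
primary_schools = {
    f"school_{s:02d}": [f"school_{s:02d}/pupil_{i}" for i in range(1, 121)]
    for s in range(1, 41)
}

selected_schools, pupils = two_stage_sample(
    primary_schools, n_clusters=5, n_per_cluster=30, seed=5
)
print(selected_schools, len(pupils))
```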

Non-probability sampling method

Convenience sampling

Although it is a non-probability sampling method, it is the most applicable and widely used method in clinical research. In this method, the investigators enroll subjects according to their availability and accessibility. Therefore, this method is quick, inexpensive, and convenient. It is called convenience sampling because the researcher selects the sample elements according to their convenient accessibility and proximity [3, 6]. For example, assume that we will perform a cohort study on Egyptian patients with hepatitis C virus (HCV). The convenience sample here will be confined to the population accessible to the research team. The accessible population is HCV patients attending Zagazig University Hospital and Cairo University Hospitals. Therefore, within the study period, all patients who attend these two hospitals and meet the eligibility criteria will be included in this study.

Judgmental sampling

In this method, the subjects are selected by the choice of the investigators. The researcher assumes specific characteristics for the sample (e.g. male/female ratio = 2/1) and therefore, they judge the sample to be suitable for representing the population. This method is widely criticized due to the likelihood of bias by investigator judgement [ 5 ].

Snow-ball sampling

This method is used when the population cannot be located in a specific place and is therefore difficult to access. In this method, the investigator asks each subject to provide access to their peers from the same population. This situation is common in social science research. For example, if we are running a survey on street children, there will be no list of homeless children, and it will be difficult to locate this population in one place such as a school or hospital. Here, the investigators will deliver the survey to one child and then ask that child to take them to other children or to deliver the surveys to them.
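The referral chain can be pictured as a walk over a network of contacts. The sketch below is only an illustration: the referral map, the single starting subject "A" and the cap of 10 participants are invented, and real recruitment would depend on who agrees to take part.

```python
from collections import deque

def snowball_sample(referrals, seeds, max_size):
    """Recruit breadth-first by following referrals outward from the seeds."""
    recruited = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(recruited) < max_size:
        person = queue.popleft()
        recruited.append(person)
        for contact in referrals.get(person, []):
            if contact not in seen:       # do not approach anyone twice
                seen.add(contact)
                queue.append(contact)
    return recruited

# Hypothetical referral map: who each subject can introduce us to.
referrals = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "D": [],
    "E": ["F"],
    "G": ["H"],  # no referral chain links G or H to the starting subject
}

sample = snowball_sample(referrals, seeds=["A"], max_size=10)
print(sample)  # ['A', 'B', 'C', 'D', 'E', 'F']
```

Subjects G and H are never reached because nobody in the chain knows them, which shows how a snowball sample can miss isolated parts of the population.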

Conflict of interest:

Sampling Methods In Research: Types, Techniques, & Examples

Saul McLeod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul McLeod, PhD., is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.


Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.


Sampling methods in psychology refer to strategies used to select a subset of individuals (a sample) from a larger population, to study and draw inferences about the entire population. Common methods include random sampling, stratified sampling, cluster sampling, and convenience sampling. Proper sampling ensures representative, generalizable, and valid research results.
  • Sampling : the process of selecting a representative group from the population under study.
  • Target population : the total group of individuals from which the sample might be drawn.
  • Sample: a subset of individuals selected from a larger population for study or investigation. Those included in the sample are termed “participants.”
  • Generalizability : the ability to apply research findings from a sample to the broader target population, contingent on the sample being representative of that population.

For instance, if the advert for volunteers is published in the New York Times, this limits how much the study’s findings can be generalized to the whole population, because NYT readers may not represent the entire population in certain respects (e.g., politically, socio-economically).

The Purpose of Sampling

In psychological research, we are interested in learning about large groups of people who have something in common. We call the group we are interested in studying our “target population.”

In some types of research, the target population might be as broad as all humans. Still, in other types of research, the target population might be a smaller group, such as teenagers, preschool children, or people who misuse drugs.


Studying every person in a target population is more or less impossible. Hence, psychologists select a sample or sub-group of the population that is likely to be representative of the target population we are interested in.

This is important because we want to generalize from the sample to the target population. The more representative the sample, the more confident the researcher can be that the results can be generalized to the target population.

One of the problems that can occur when selecting a sample from a target population is sampling bias. Sampling bias refers to situations where the sample does not reflect the characteristics of the target population.

Many psychology studies have a biased sample because they have used an opportunity sample that comprises university students as their participants (e.g., Asch ).

OK, so you’ve thought up this brilliant psychological study and designed it perfectly. But who will you try it out on, and how will you select your participants?

There are various sampling methods. The one chosen will depend on a number of factors (such as time, money, etc.).

Probability and Non-Probability Samples

Random Sampling

Random sampling is a type of probability sampling where everyone in the entire target population has an equal chance of being selected.

This is similar to the national lottery. If the “population” is everyone who bought a lottery ticket, then everyone has an equal chance of winning the lottery (assuming they all have one ticket each).

Random samples require naming or numbering the target population and then using some raffle method to choose those to make up the sample. Random samples are the best method of selecting your sample from the population of interest.

  • The advantage is that your sample should represent the target population and minimise sampling bias.
  • The disadvantage is that it is very difficult to achieve in practice (it costs time, effort, and money).

Stratified Sampling

During stratified sampling , the researcher identifies the different types of people that make up the target population and works out the proportions needed for the sample to be representative.

A list is made of each variable (e.g., IQ, gender, etc.) that might have an effect on the research. For example, if we are interested in the money spent on books by undergraduates, then the main subject studied may be an important variable.

For example, students studying English Literature may spend more money on books than engineering students, so if we use a large percentage of English students or engineering students, our results will not be accurate.

We have to determine the relative percentage of each group at a university, e.g., Engineering 10%, Social Sciences 15%, English 20%, Sciences 25%, Languages 10%, Law 5%, and Medicine 15%. The sample must then contain all these groups in the same proportion as the target population (university students).
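Turning those percentages into head counts is simple arithmetic, sketched below. The overall sample size of 200 is an assumption made for illustration, and handing out any rounding leftovers to the strata with the largest fractional parts is just one reasonable convention.

```python
def proportional_allocation(shares, sample_size):
    """Split `sample_size` across strata in proportion to `shares` (percentages)."""
    total = sum(shares.values())
    exact = {name: sample_size * pct / total for name, pct in shares.items()}
    counts = {name: int(value) for name, value in exact.items()}  # round down
    # Hand out the places lost to rounding, largest fractional part first.
    leftover = sample_size - sum(counts.values())
    by_remainder = sorted(exact, key=lambda name: exact[name] - counts[name],
                          reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts

# Percentages taken from the university example above.
shares = {
    "Engineering": 10, "Social Sciences": 15, "English": 20, "Sciences": 25,
    "Languages": 10, "Law": 5, "Medicine": 15,
}

quotas = proportional_allocation(shares, sample_size=200)
print(quotas)
```

With a sample of 200, the quota for English students, for instance, is 20% of 200, that is, 40 participants; each quota would then be filled by random selection within that subject group.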

  • The disadvantage of stratified sampling is that gathering such a sample would be extremely time-consuming and difficult to do. This method is rarely used in Psychology.
  • However, the advantage is that the sample should be highly representative of the target population, and therefore we can generalize from the results obtained.

Opportunity Sampling

Opportunity sampling is a method in which participants are chosen based on their ease of availability and proximity to the researcher, rather than using random or systematic criteria. It’s a type of convenience sampling .

An opportunity sample is obtained by asking members of the population of interest if they would participate in your research. An example would be selecting a sample of students from those coming out of the library.

  • This is a quick and easy way of choosing participants (advantage)
  • It may not provide a representative sample and could be biased (disadvantage).

Systematic Sampling

Systematic sampling is a method where every nth individual is selected from a list or sequence to form a sample, ensuring even and regular intervals between chosen subjects.

Participants are systematically selected (i.e., orderly/logical) from the target population, like every nth participant on a list of names.

To take a systematic sample, you list all the population members and then decide upon the sample size you would like. By dividing the number of people in the population by the number of people you want in your sample, you get a number we will call n.

If you take every nth name, you will get a systematic sample of the correct size. If, for example, you wanted to sample 150 children from a school of 1,500, you would take every 10th name.

  • The advantage of this method is that it should provide a representative sample.
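The school example works out as 1,500 / 150 = 10, so every 10th name is taken. A short sketch of that calculation follows; the generated names are placeholders, and starting at a random position within the first interval is an assumption added here rather than something stated above.

```python
import random

def systematic_from_list(names, sample_size, seed=None):
    """Take every nth name, where n = population size // sample size."""
    interval = len(names) // sample_size            # the "n" in "every nth name"
    start = random.Random(seed).randrange(interval)  # random position in 0..n-1
    return names[start::interval][:sample_size]

# Hypothetical school roll of 1,500 children.
school_roll = [f"child_{i:04d}" for i in range(1, 1501)]

sample = systematic_from_list(school_roll, sample_size=150, seed=2)
print(len(sample), sample[:3])
```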

Sample size

The sample size is a critical factor in determining the reliability and validity of a study’s findings. While increasing the sample size can enhance the generalizability of results, it’s also essential to balance practical considerations, such as resource constraints and diminishing returns from ever-larger samples.

Reliability and Validity

Reliability refers to the consistency and reproducibility of research findings across different occasions, researchers, or instruments. A small sample size may lead to inconsistent results due to increased susceptibility to random error or the influence of outliers. In contrast, a larger sample minimizes these errors, promoting more reliable results.

Validity pertains to the accuracy and truthfulness of research findings. For a study to be valid, it should accurately measure what it intends to do. A small, unrepresentative sample can compromise external validity, meaning the results don’t generalize well to the larger population. A larger sample captures more variability, ensuring that specific subgroups or anomalies don’t overly influence results.

Practical Considerations

Resource Constraints : Larger samples demand more time, money, and resources. Data collection becomes more extensive, data analysis more complex, and logistics more challenging.

Diminishing Returns : While increasing the sample size generally leads to improved accuracy and precision, there’s a point where adding more participants yields only marginal benefits. For instance, going from 50 to 500 participants might significantly boost a study’s robustness, but jumping from 10,000 to 10,500 might not offer a comparable advantage, especially considering the added costs.
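The diminishing returns follow from the fact that sampling error shrinks with the square root of the sample size. As a rough illustration, the snippet below computes the standard error of an estimated proportion for the four sample sizes mentioned above; the proportion of 0.5 is an assumption chosen because it gives the largest standard error.

```python
import math

def standard_error(p, n):
    """Standard error of a sample proportion p estimated from n participants."""
    return math.sqrt(p * (1 - p) / n)

for n in (50, 500, 10_000, 10_500):
    print(f"n = {n:>6}: standard error = {standard_error(0.5, n):.4f}")

gain_small = standard_error(0.5, 50) - standard_error(0.5, 500)
gain_large = standard_error(0.5, 10_000) - standard_error(0.5, 10_500)
print(f"reduction from 50 to 500:        {gain_small:.4f}")
print(f"reduction from 10,000 to 10,500: {gain_large:.4f}")
```

Going from 50 to 500 participants cuts the standard error by roughly two thirds, whereas adding another 500 participants to a sample of 10,000 changes it only in the fourth decimal place.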


What are Sampling Methods? Techniques, Types, and Examples

Every type of research includes samples from which inferences are drawn. The sample could be biological specimens or a subset of a specific group or population selected for analysis. The goal is often to draw conclusions about the entire population based on the characteristics observed in the sample. Now, the question comes to mind: how does one collect the samples? Answer: Using sampling methods. Various sampling strategies are available to researchers to define and collect samples that will form the basis of their research study.

In a study focusing on individuals experiencing anxiety, gathering data from the entire population is practically impossible due to the widespread prevalence of anxiety. Consequently, a sample is carefully selected: a subset of individuals meant to represent (though in some cases not accurately) the demographics of those experiencing anxiety. The study’s outcomes hinge significantly on the chosen sample, emphasizing the critical importance of a thoughtful and precise selection process. The conclusions drawn about the broader population rely heavily on the selected sample’s characteristics and diversity.


What is sampling?

Sampling involves the strategic selection of individuals or a subset from a population, aiming to derive statistical inferences and predict the characteristics of the entire population. It offers a pragmatic and practical approach to examining the features of the whole population, which would otherwise be difficult to achieve because studying the total population is expensive, time-consuming, and often impossible. Market researchers use various sampling methods to collect samples from a large population to acquire relevant insights. The best sampling strategy for research is determined by criteria such as the purpose of the study, available resources (time and money), and research hypothesis.

For example, if a pet food manufacturer wants to investigate the positive impact of a new cat food on feline growth, studying all the cats in the country is impractical. In such cases, employing an appropriate sampling technique from the extensive dataset allows the researcher to focus on a manageable subset. This enables the researcher to study the growth-promoting effects of the new pet food. This article will delve into the standard sampling methods and explore the situations in which each is most appropriately applied.


What are sampling methods or sampling techniques?

Sampling methods or sampling techniques in research are statistical methods for selecting a sample representative of the whole population to study the population’s characteristics. Sampling methods serve as invaluable tools for researchers, enabling the collection of meaningful data and facilitating analysis to identify distinctive features of the population. Different sampling strategies can be used based on the characteristics of the population, the study purpose, and the available resources. Now that we understand why sampling methods are essential in research, we review the various sampling methods in the following sections.

Types of sampling methods  


Before we go into the specifics of each sampling method, it’s vital to understand terms like sample, sample frame, and sample space. In probability theory, the sample space comprises all possible outcomes of a random experiment, while the sample frame is the list or source guiding sample selection in statistical research. The  sample  represents the group of individuals participating in the study, forming the basis for the research findings. Selecting the correct sample is critical to ensuring the validity and reliability of any research; the sample should be representative of the population. 

The two most common categories of sampling methods are:

  • Probability sampling: A sampling method in which each unit or element in the population has an equal chance of being selected in the final sample. This is also called random sampling, emphasizing that selection is random and that every unit has a non-zero probability of being chosen. Such a sampling technique ensures a more representative and unbiased sample, enabling robust inferences about the entire population.
  • Non-probability sampling:  Another sampling method is non-probability sampling, which involves collecting data conveniently through a non-random selection based on predefined criteria. This offers a straightforward way to gather data, although the resulting sample may or may not accurately represent the entire population. 

  Irrespective of the research method you opt for, it is essential to explicitly state the chosen sampling technique in the methodology section of your research article. Now, we will explore the different characteristics of both sampling methods, along with various subtypes falling under these categories. 

What is probability sampling?  

The probability sampling method is based on probability theory, which means that the sample selection involves some form of random selection. The probability sampling method provides an equal opportunity for all elements or units within the entire population to be chosen. While it can be labor-intensive and expensive, the advantage lies in its ability to offer a more accurate representation of the population, thereby enhancing confidence in the inferences drawn in the research.

Types of probability sampling  

Various probability sampling methods exist, such as simple random sampling, systematic sampling, stratified sampling, and clustered sampling. Here, we provide detailed discussions and illustrative examples for each of these sampling methods: 

Simple Random Sampling

  • Simple random sampling:  In simple random sampling, each individual has an equal probability of being chosen, and each selection is independent of the others. Because the choice is entirely based on chance, this is also known as the method of chance selection. In the simple random sampling method, the sample frame comprises the entire population. 

For example, a fitness sports brand is launching a new protein drink and aims to select 20 individuals from a 200-person fitness center to try it. Employing a simple random sampling approach, each of the 200 people is assigned a unique identifier. Of these, 20 individuals are then chosen by generating random numbers between 1 and 200, either manually or through a computer program. Matching these numbers with the individuals creates a randomly selected group of 20 people. This method minimizes sampling bias and helps to obtain a representative subset of the entire population under study.

Systematic Random Sampling

  • Systematic sampling:  The systematic sampling approach involves selecting units or elements at regular intervals from an ordered list of the population. Because the starting point of this sampling method is chosen at random, it is more convenient than essential random sampling. For a better understanding, consider the following example.  

For example, building on the previous scenario, individuals at the fitness facility are arranged alphabetically. The manufacturer then initiates the process by randomly selecting a starting point from the first ten positions, let’s say 8. Starting from the 8th position, every tenth person on the list is then chosen (e.g., 8, 18, 28, 38, and so forth) until a sample of 20 individuals is obtained.
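The same procedure can be written as a short Python sketch. The member labels and the function name `systematic_sample` are hypothetical; the interval of 10 and the starting position of 8 come from the example above.

```python
import random

def systematic_sample(ordered_units, interval, start=None, seed=None):
    """Pick every interval-th unit from an ordered list.

    start is a 1-based position within the first interval; if omitted,
    it is chosen at random, which is the only random step in the method.
    """
    if start is None:
        start = random.Random(seed).randint(1, interval)
    return ordered_units[start - 1::interval]

# Hypothetical alphabetically ordered list of 200 members
members = [f"member_{i:03d}" for i in range(1, 201)]

fixed_start_sample = systematic_sample(members, interval=10, start=8)
random_start_sample = systematic_sample(members, interval=10, seed=1)
print(fixed_start_sample[:4])
```

With 200 members and an interval of 10, any starting position from 1 to 10 produces exactly 20 people.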

Stratified Sampling

  • Stratified sampling: Stratified sampling divides the population into subgroups (strata), and random samples are drawn from each stratum in proportion to its size in the population. Stratified sampling provides improved representation because each subgroup that differs in significant ways is included in the final sample. 

For example, expanding on the previous simple random sampling example, suppose the manufacturer aims for a more comprehensive representation of genders in a population of 200 people, consisting of 90 males, 80 females, and 30 others. The manufacturer categorizes the population into three gender strata (Male, Female, and Others). Within each group, random sampling is employed to select nine males, eight females, and three individuals from the others category, resulting in a well-rounded and representative sample of 20 individuals.
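The allocation of nine, eight, and three follows from each stratum's share of the population (for instance, 90/200 of 20 is 9). A minimal Python sketch of proportional allocation is shown below; the member labels and the function name are assumptions, and simple rounding is used, which happens to be exact for these numbers but may need adjusting when shares do not divide evenly.

```python
import random

def proportional_stratified_sample(strata, total_sample_size, seed=None):
    """Randomly sample from each stratum in proportion to its size."""
    rng = random.Random(seed)
    population_size = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        # Stratum quota = total sample size x stratum share of the population
        n_stratum = round(total_sample_size * len(units) / population_size)
        sample[name] = rng.sample(units, n_stratum)
    return sample

# Hypothetical member labels for the three strata in the example
gender_strata = {
    "male": [f"M{i}" for i in range(90)],
    "female": [f"F{i}" for i in range(80)],
    "other": [f"O{i}" for i in range(30)],
}

stratified = proportional_stratified_sample(gender_strata, 20, seed=0)
print({name: len(units) for name, units in stratified.items()})
```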

  • Clustered sampling: In this sampling method, the population is divided into clusters, and then a random sample of clusters is included in the final sample. Clustered sampling, distinct from stratified sampling, involves subgroups (clusters) that each exhibit characteristics similar to the whole population. In the case of small clusters, all members can be included in the final sample, whereas for larger clusters, individuals within each cluster may be sampled using the sampling methods above. This approach is referred to as multistage sampling. This sampling method is well-suited for large and widely distributed populations; however, there is a potential risk of sampling error because ensuring that the sampled clusters truly represent the entire population can be challenging.

Clustered Sampling

For example, Researchers conducting a nationwide health study can select specific geographic clusters, like cities or regions, instead of trying to survey the entire population individually. Within each chosen cluster, they sample individuals, providing a representative subset without the logistical challenges of attempting a nationwide survey. 
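A two-stage version of this design can be sketched as follows. Everything concrete here is hypothetical: eight cities of 500 residents each, three cities drawn at the first stage, and 50 residents drawn per city at the second stage.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: randomly pick clusters. Stage 2: randomly pick units within each."""
    rng = random.Random(seed)
    chosen_names = rng.sample(sorted(clusters), n_clusters)
    sample = {}
    for name in chosen_names:
        residents = clusters[name]
        # Small clusters are taken whole; larger ones are subsampled
        k = min(n_per_cluster, len(residents))
        sample[name] = rng.sample(residents, k)
    return sample

# Hypothetical population grouped by city
cities = {
    f"city_{c}": [f"city_{c}_resident_{i}" for i in range(500)]
    for c in "ABCDEFGH"
}

study_sample = two_stage_cluster_sample(cities, n_clusters=3, n_per_cluster=50, seed=7)
print({city: len(people) for city, people in study_sample.items()})
```

Only the three selected cities need to be visited, which is where the logistical saving comes from; the cost is that residents of the other five cities have no chance of appearing in this particular sample.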

Uses of probability sampling

Probability sampling methods find widespread use across diverse research disciplines because of their ability to yield representative and unbiased samples. The advantages of employing probability sampling include the following: 

  • Representativeness  

Probability sampling assures that every element in the population has a non-zero chance of being included in the sample, which supports representativeness of the entire population and reduces selection-related research bias. The researcher can acquire higher-quality data via probability sampling, increasing confidence in the conclusions.

  • Statistical inference  

Statistical methods, like confidence intervals and hypothesis testing, depend on probability sampling to generalize findings from a sample to the broader population. Probability sampling methods ensure unbiased representation, allowing inferences about the population based on the characteristics of the sample. 

  • Precision and reliability  

The use of probability sampling improves the precision and reliability of study results. Because the probability of selecting any single element/individual is known, sampling error can be quantified and controlled, which is not possible with non-probability sampling methods, resulting in more dependable and precise estimations.

  • Generalizability  

Probability sampling enables the researcher to generalize study findings to the entire population from which they were derived. The results produced through probability sampling methods are more likely to be applicable to the larger population, laying the foundation for making broad predictions or recommendations. 

  • Minimization of Selection Bias  

By ensuring that each member of the population has a known, non-zero chance of being selected in the sample, probability sampling lowers the possibility of selection bias. This reduces the impact of systematic errors that may occur in non-probability sampling methods, where data may be skewed toward a specific demographic due to inadequate representation of each segment of the population.

What is non-probability sampling?  

Non-probability sampling methods involve selecting individuals based on non-random criteria, often relying on the researcher’s judgment or predefined criteria. While it is easier and more economical, it tends to introduce sampling bias, resulting in weaker inferences compared to probability sampling techniques in research. 

Types of Non-probability Sampling   

Non-probability sampling methods are further classified as convenience sampling, consecutive sampling, quota sampling, purposive or judgmental sampling, and snowball sampling. Let’s explore these types of sampling methods in detail. 

  • Convenience sampling:  In convenience sampling, individuals are recruited directly from the population based on the accessibility and proximity to the researcher. It is a simple, inexpensive, and practical method of sample selection, yet convenience sampling suffers from both sampling and selection bias due to a lack of appropriate population representation. 

Convenience sampling

For example, imagine you’re a researcher investigating smartphone usage patterns in your city. The most convenient way to select participants is by approaching people in a shopping mall on a weekday afternoon. However, this convenience sampling method may not be an accurate representation of the city’s overall smartphone usage patterns as the sample is limited to individuals present at the mall during weekdays, excluding those who visit on other days or never visit the mall.

  • Consecutive sampling: Participants in consecutive sampling (or sequential sampling) are chosen based on their availability and willingness to participate in the study as they become available. This strategy entails sequentially recruiting individuals who fulfill the researcher’s requirements.

For example, in researching the prevalence of stroke in a hospital, instead of randomly selecting patients from the entire population, the researcher can opt to include all eligible patients admitted over three months. Participants are then consecutively recruited upon admission during that timeframe, forming the study sample.

  • Quota sampling: In quota sampling, individuals are selected non-randomly so that the sample contains predetermined proportions of participants with certain traits, mirroring the population. Quota sampling involves setting predetermined quotas for specific subgroups based on key demographics or other relevant characteristics. This sampling method involves dividing the population into mutually exclusive subgroups and then selecting sample units until the set quota is reached.

Quota sampling

For example, in a survey on a college campus to assess student interest in a new policy, the researcher can establish quotas aligned with the distribution of student majors, ensuring representation from various academic disciplines. If the campus has 20% biology majors, 30% engineering majors, 20% business majors, and 30% liberal arts majors, participants should be recruited to mirror these proportions.
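To make the mechanics concrete, the sketch below fills quotas for a sample of 100 students from a stream of arrivals. The sample size of 100, the arrival order, and the function names are assumptions for illustration; the percentages are those in the example. Note that people are accepted in the order they show up, which is what makes this a non-probability method.

```python
from collections import Counter

def quotas_from_percentages(percentages, sample_size):
    """Convert whole-number percentage shares into per-group head counts."""
    return {group: pct * sample_size // 100 for group, pct in percentages.items()}

def quota_sample(arrivals, quotas):
    """Accept people in arrival order (non-random) until every quota is full."""
    remaining = dict(quotas)
    selected = []
    for person, group in arrivals:
        if remaining.get(group, 0) > 0:
            selected.append((person, group))
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quotas reached, stop recruiting
    return selected

major_shares = {"biology": 20, "engineering": 30, "business": 20, "liberal arts": 30}
quotas = quotas_from_percentages(major_shares, sample_size=100)

# Hypothetical stream of students passing the survey desk
majors = list(major_shares)
arrivals = [(f"student_{i}", majors[i % 4]) for i in range(400)]

recruited = quota_sample(arrivals, quotas)
print(Counter(group for _, group in recruited))
```

The resulting sample matches the campus proportions by major, but nothing in the procedure protects against bias on other characteristics, such as which students tend to pass the desk.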

  • Purposive or judgmental sampling: In purposive sampling, the researcher leverages expertise to select a sample relevant to the study’s specific questions. This sampling method is commonly applied in qualitative research, mainly when aiming to understand a particular phenomenon, and is suitable for smaller population sizes. 

Purposive Sampling

For example, imagine a researcher who wants to study public policy issues for a focus group. The researcher might purposely select participants with expertise in economics, law, and public administration to take advantage of their knowledge and ensure a depth of understanding.  

  • Snowball sampling:  This sampling method is used when accessing the population is challenging. It involves collecting the sample through a chain-referral process, where each recruited candidate aids in finding others. These candidates share common traits, representing the targeted population. This method is often used in qualitative research, particularly when studying phenomena related to stigmatized or hidden populations. 

Snowball Sampling

For example, in a study focusing on understanding the experiences and challenges of individuals in hidden or stigmatized communities (e.g., LGBTQ+ individuals in specific cultural contexts), the snowball sampling technique can be employed. The researcher initiates contact with one community member, who then assists in identifying additional candidates until the desired sample size is achieved.
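The chain-referral process can be pictured as walking outward through a network of contacts. The sketch below assumes a hypothetical referral map (who is willing to introduce whom) and recruits breadth-first from one initial contact until a target size is reached; real recruitment is of course not this tidy.

```python
from collections import deque

def snowball_sample(referrals, seeds, target_size):
    """Recruit via chain referral, starting from the initial contacts (seeds)."""
    recruited = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(recruited) < target_size:
        person = queue.popleft()
        recruited.append(person)
        # Each recruited participant introduces contacts not yet approached
        for contact in referrals.get(person, []):
            if contact not in seen:
                seen.add(contact)
                queue.append(contact)
    return recruited

# Hypothetical referral network
referral_map = {
    "P1": ["P2", "P3"],
    "P2": ["P4"],
    "P3": ["P4", "P5"],
    "P5": ["P6"],
}

participants = snowball_sample(referral_map, seeds=["P1"], target_size=5)
print(participants)
```

Because everyone in the sample is reachable from the first contact, the sample reflects that person's network, which is one reason snowball samples are not treated as representative.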

Uses of non-probability sampling  

Non-probability sampling approaches are employed in qualitative or exploratory research where the goal is to investigate underlying population traits rather than generalizability. Non-probability sampling methods are also helpful for the following purposes: 

  • Generating a hypothesis  

In the initial stages of exploratory research, non-probability methods such as purposive or convenience sampling allow researchers to quickly gather information and generate hypotheses that help build a future research plan.

  • Qualitative research  

Qualitative research is usually focused on understanding the depth and complexity of human experiences, behaviors, and perspectives. Non-probability methods like purposive or snowball sampling are commonly used to select participants with specific traits that are relevant to the research question.  

  • Convenience and pragmatism  

Non-probability sampling methods are valuable when resources and time are limited or when preliminary data are required for a pilot study. For example, a researcher might conduct a survey at a local shopping mall to gather opinions on a consumer product because potential participants are easy to reach there.

Probability vs Non-probability Sampling Methods  

     
| Aspect | Probability sampling | Non-probability sampling |
| --- | --- | --- |
| Selection of participants | Random selection of participants from the population using randomization methods | Non-random selection of participants from the population based on convenience or criteria |
| Representativeness | Likely to yield a representative sample of the whole population, allowing for generalizations | May not yield a representative sample of the whole population; poor generalizability |
| Precision and accuracy | Provides more precise and accurate estimates of population characteristics | May have less precision and accuracy due to non-random selection |
| Bias | Minimizes selection bias | May introduce selection bias if criteria are subjective and not well-defined |
| Statistical inference | Suited for statistical inference and hypothesis testing and for making generalizations to the population | Less suited for statistical inference and hypothesis testing on the population |
| Application | Useful for quantitative research where generalizability is crucial | Commonly used in qualitative and exploratory research where in-depth insights are the goal |

Frequently asked questions  

  • What is multistage sampling? Multistage sampling is a form of probability sampling that involves the progressive selection of samples in stages, going from larger clusters to a small number of participants, making it suited for large-scale research with enormous population lists.
  • What are the methods of probability sampling? Probability sampling methods are simple random sampling, stratified random sampling, systematic sampling, cluster sampling, and multistage sampling.
  • How to decide which type of sampling method to use? Choose a sampling method based on the goals, population, and resources of the study. Consider probability sampling when statistical inference is needed and non-probability sampling when efficiency or qualitative insight matters more. Also, consider the population characteristics, size, and alignment with study objectives.
  • What are the methods of non-probability sampling? Non-probability sampling methods are convenience sampling, consecutive sampling, purposive sampling, snowball sampling, and quota sampling.
  • Why are sampling methods used in research? Sampling methods in research are employed to efficiently gather representative data from a subset of a larger population, enabling valid conclusions and generalizations while minimizing costs and time.  


Sampling Methods | Types, Techniques, & Examples

Published on 3 May 2022 by Shona McCombes. Revised on 10 October 2022.

When you conduct research about a group of people, it’s rarely possible to collect data from every person in that group. Instead, you select a sample. The sample is the group of individuals who will actually participate in the research.

To draw valid conclusions from your results, you have to carefully decide how you will select a sample that is representative of the group as a whole. There are two types of sampling methods:

  • Probability sampling involves random selection, allowing you to make strong statistical inferences about the whole group. It minimises the risk of selection bias.
  • Non-probability sampling involves non-random selection based on convenience or other criteria, allowing you to easily collect data.

You should clearly explain how you selected your sample in the methodology section of your paper or thesis.


First, you need to understand the difference between a population and a sample, and identify the target population of your research.

  • The population is the entire group that you want to draw conclusions about.
  • The sample is the specific group of individuals that you will collect data from.

The population can be defined in terms of geographical location, age, income, and many other characteristics.

Population vs sample

It is important to carefully define your target population according to the purpose and practicalities of your project.

If the population is very large, demographically mixed, and geographically dispersed, it might be difficult to gain access to a representative sample.

Sampling frame

The sampling frame is the actual list of individuals that the sample will be drawn from. Ideally, it should include the entire target population (and nobody who is not part of that population).

You are doing research on working conditions at Company X. Your population is all 1,000 employees of the company. Your sampling frame is the company’s HR database, which lists the names and contact details of every employee.

Sample size

The number of individuals you should include in your sample depends on various factors, including the size and variability of the population and your research design. There are different sample size calculators and formulas depending on what you want to achieve with statistical analysis.
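As one illustration of such a formula, the sketch below uses Cochran's formula for estimating a proportion, with a finite population correction. The 95% confidence level (z = 1.96), the 5% margin of error, and the conservative p = 0.5 are assumptions chosen here, not values given for the Company X example; other research designs call for other formulas.

```python
import math

def sample_size_for_proportion(population_size, margin_of_error=0.05, z=1.96, p=0.5):
    """Cochran's sample size for a proportion, corrected for a finite population."""
    # Size needed for an effectively infinite population
    n0 = (z ** 2) * p * (1 - p) / margin_of_error ** 2
    # Finite population correction shrinks the requirement for small populations
    n = n0 / (1 + (n0 - 1) / population_size)
    return math.ceil(n)

# Population of 1,000 employees, as in the Company X example
print(sample_size_for_proportion(1000))
```

Under these assumptions, a population of 1,000 needs 278 respondents, compared with 385 for a very large population, which shows how population size and desired precision both feed into the calculation.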


Probability sampling means that every member of the population has a chance of being selected. It is mainly used in quantitative research. If you want to produce results that are representative of the whole population, probability sampling techniques are the most valid choice.

There are four main types of probability sample.

Probability sampling

1. Simple random sampling

In a simple random sample, every member of the population has an equal chance of being selected. Your sampling frame should include the whole population.

To conduct this type of sampling, you can use tools like random number generators or other techniques that are based entirely on chance.

You want to select a simple random sample of 100 employees of Company X. You assign a number to every employee in the company database from 1 to 1000, and use a random number generator to select 100 numbers.

2. Systematic sampling

Systematic sampling is similar to simple random sampling, but it is usually slightly easier to conduct. Every member of the population is listed with a number, but instead of randomly generating numbers, individuals are chosen at regular intervals.

All employees of the company are listed in alphabetical order. From the first 10 numbers, you randomly select a starting point: number 6. From number 6 onwards, every 10th person on the list is selected (6, 16, 26, 36, and so on), and you end up with a sample of 100 people.

If you use this technique, it is important to make sure that there is no hidden pattern in the list that might skew the sample. For example, if the HR database groups employees by team, and team members are listed in order of seniority, there is a risk that your interval might skip over people in junior roles, resulting in a sample that is skewed towards senior employees.
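The risk of a hidden pattern can be demonstrated with a deliberately constructed worst case. The sketch assumes a hypothetical roster of 100 teams of exactly 10 people, listed team by team in order of seniority, and reuses the starting point of 6 and interval of 10 from the example above.

```python
def build_roster(n_teams, team_size):
    """List employees team by team, most senior (rank 1) first within each team."""
    return [
        (team, rank)
        for team in range(1, n_teams + 1)
        for rank in range(1, team_size + 1)
    ]

roster = build_roster(n_teams=100, team_size=10)  # 1,000 hypothetical employees
start, interval = 6, 10
picked = roster[start - 1::interval]

# Which seniority ranks made it into the sample?
ranks_in_sample = {rank for _, rank in picked}
print(len(picked), ranks_in_sample)
```

Because the team size equals the sampling interval, every one of the 100 selected employees holds seniority rank 6, and no other rank is represented at all. Shuffling the list first, or checking that the interval does not line up with a cycle in the ordering, avoids this.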

3. Stratified sampling

Stratified sampling involves dividing the population into subpopulations that may differ in important ways. It allows you to draw more precise conclusions by ensuring that every subgroup is properly represented in the sample.

To use this sampling method, you divide the population into subgroups (called strata) based on the relevant characteristic (e.g., gender, age range, income bracket, job role).

Based on the overall proportions of the population, you calculate how many people should be sampled from each subgroup. Then you use random or systematic sampling to select a sample from each subgroup.

The company has 800 female employees and 200 male employees. You want to ensure that the sample reflects the gender balance of the company, so you sort the population into two strata based on gender. Then you use random sampling on each group, selecting 80 women and 20 men, which gives you a representative sample of 100 people.

4. Cluster sampling

Cluster sampling also involves dividing the population into subgroups, but each subgroup should have similar characteristics to the whole population. Instead of sampling individuals from each subgroup, you randomly select entire subgroups.

If it is practically possible, you might include every individual from each sampled cluster. If the clusters themselves are large, you can also sample individuals from within each cluster using one of the techniques above. This is called multistage sampling.

This method is good for dealing with large and dispersed populations, but there is more risk of error in the sample, as there could be substantial differences between clusters. It’s difficult to guarantee that the sampled clusters are really representative of the whole population.

The company has offices in 10 cities across the country (all with roughly the same number of employees in similar roles). You don’t have the capacity to travel to every office to collect your data, so you use random sampling to select 3 offices – these are your clusters.

In a non-probability sample, individuals are selected based on non-random criteria, and not every individual has a chance of being included.

This type of sample is easier and cheaper to access, but it has a higher risk of sampling bias. That means the inferences you can make about the population are weaker than with probability samples, and your conclusions may be more limited. If you use a non-probability sample, you should still aim to make it as representative of the population as possible.

Non-probability sampling techniques are often used in exploratory and qualitative research. In these types of research, the aim is not to test a hypothesis about a broad population, but to develop an initial understanding of a small or under-researched population.

Non probability sampling

1. Convenience sampling

A convenience sample simply includes the individuals who happen to be most accessible to the researcher.

This is an easy and inexpensive way to gather initial data, but there is no way to tell if the sample is representative of the population, so it can’t produce generalisable results.

You are researching opinions about student support services in your university, so after each of your classes, you ask your fellow students to complete a survey on the topic. This is a convenient way to gather data, but as you only surveyed students taking the same classes as you at the same level, the sample is not representative of all the students at your university.

2. Voluntary response sampling

Similar to a convenience sample, a voluntary response sample is mainly based on ease of access. Instead of the researcher choosing participants and directly contacting them, people volunteer themselves (e.g., by responding to a public online survey).

Voluntary response samples are always at least somewhat biased, as some people will inherently be more likely to volunteer than others.

You send out the survey to all students at your university and many students decide to complete it. This can certainly give you some insight into the topic, but the people who responded are more likely to be those who have strong opinions about the student support services, so you can’t be sure that their opinions are representative of all students.

3. Purposive sampling

Purposive sampling, also known as judgement sampling, involves the researcher using their expertise to select a sample that is most useful to the purposes of the research.

It is often used in qualitative research, where the researcher wants to gain detailed knowledge about a specific phenomenon rather than make statistical inferences, or where the population is very small and specific. An effective purposive sample must have clear criteria and rationale for inclusion.

You want to know more about the opinions and experiences of students with a disability at your university, so you purposely select a number of students with different support needs in order to gather a varied range of data on their experiences with student services.

4. Snowball sampling

If the population is hard to access, snowball sampling can be used to recruit participants via other participants. The number of people you have access to ‘snowballs’ as you get in contact with more people.

You are researching experiences of homelessness in your city. Since there is no list of all homeless people in the city, probability sampling isn’t possible. You meet one person who agrees to participate in the research, and she puts you in contact with other homeless people she knows in the area.

A sample is a subset of individuals from a larger population. Sampling means selecting the group that you will actually collect data from in your research.

For example, if you are researching the opinions of students in your university, you could survey a sample of 100 students.

Statistical sampling allows you to test a hypothesis about the characteristics of a population. There are various sampling methods you can use to ensure that your sample is representative of the population as a whole.

Samples are used to make inferences about populations. Samples are easier to collect data from because they are practical, cost-effective, convenient, and manageable.

Probability sampling means that every member of the target population has a known chance of being included in the sample.

Probability sampling methods include simple random sampling, systematic sampling, stratified sampling, and cluster sampling.

In non-probability sampling, the sample is selected based on non-random criteria, and not every member of the population has a chance of being included.

Common non-probability sampling methods include convenience sampling, voluntary response sampling, purposive sampling, snowball sampling, and quota sampling.

Sampling bias occurs when some members of a population are systematically more likely to be selected in a sample than others.


McCombes, S. (2022, October 10). Sampling Methods | Types, Techniques, & Examples. Scribbr. Retrieved 9 September 2024, from https://www.scribbr.co.uk/research-methods/sampling/


Sampling Techniques in Education Essay


Random sampling is a sampling technique where all elements in a population have an equal probability of being selected to form the sample. It means, therefore, that elements are chosen purely by chance rather than at the discretion of the researcher (Babbie, 2010). This technique minimizes selection bias and tends to give representative statistics, especially when the sample size is large.

In addition, the technique requires minimal prior knowledge of the population. Similarly, it is simple to use since it does not require one to have complex mathematical knowledge (Babbie, 2010).

Systematic sampling, on the other hand, requires arrangement of the population in a given order. The first element is chosen randomly while the subsequent elements are chosen at regular intervals. It should be noted that this type of sampling gives every element an equal chance of selection. This type of sampling is easy to use and to check in case the need arises.

On the same note, since the technique arranges the population in a systematic order, sampling is quick, which saves time and labor (Babbie, 2010). In addition, when the frame used in systematic sampling is up to date, the technique is efficient compared to random sampling.

Convenience sampling refers to a sampling technique where researchers are free to choose sample elements using any method they deem fit. There is no laid-down procedure as to how the elements should be sampled; thus, it applies neither probability nor judgment. The technique is easy for investigators to use because they choose the sample that is useful to their study (Babbie, 2010). It is suitable when one lacks the time and money to gather a large sample, because it does not require specific rules to be met.

In stratified sampling, researchers group the population into different groups using differentiating characteristics. The researcher will then randomly select elements from each stratum using the size of the stratum in relation to the population to determine the number of elements to be picked from each stratum. The elements are then combined to form the sample (Babbie, 2010). The technique allows study of each specific group which might not be possible in a generalized population.

In cases where different segments of the population need different degrees of accuracy, stratified sampling is more applicable. Moreover, the resulting sample is more representative and gives more efficient statistics. Furthermore, stratified sampling gives room for investigators to use different types of sampling methods for each stratum as and when they deem fit (Babbie, 2010).

On the other hand, cluster sampling involves the grouping of the population into groups called clusters. A few clusters are then selected randomly and all the elements in the selected clusters are used to form the sample (Babbie, 2010). The advantage of clustering is that it greatly reduces travel costs as well as administrative costs. However, because elements within a cluster tend to resemble one another, cluster sampling usually increases the variability of the observed statistics compared with a simple random sample of the same size.

Multi-stage sampling involves a combination of two or more sampling techniques. Initially, the researcher divides the population into large clusters. The researcher then subdivides a few selected clusters into sub-clusters. The clusters to be subdivided are selected either randomly, or using information collected from elements in the first clusters.

The process is repeated until the elements in the sub-sets are few enough. Finally, the researcher uses any other sampling technique to select sub-sets whose elements are used as a sample. The method is beneficial in cases where it is difficult to get a complete list of the population. It is an advanced form of cluster sampling (Babbie, 2010).

Multi-stage sampling is more accurate than cluster sampling when the same sample size is used. Moreover, multi-stage sampling is a more convenient way of finding a sample. On the same note, the method is more cost-effective, and in many instances the survey can be done more quickly than with other methods (Babbie, 2010).

Babbie, E. R. (2010). The Basics of Social Research. Stanford: Cengage Learning.


IvyPanda. (2018, November 30). Sampling Techniques in Education. https://ivypanda.com/essays/sampling-techniques/

Cluster Sampling | A Simple Step-by-Step Guide with Examples

Published on September 7, 2020 by Lauren Thomas. Revised on June 22, 2023.

In cluster sampling, researchers divide a population into smaller groups known as clusters. They then randomly select among these clusters to form a sample.

Cluster sampling is a method of probability sampling that is often used to study large populations, particularly those that are widely geographically dispersed. Researchers usually use pre-existing units such as schools or cities as their clusters.

The simplest form of cluster sampling is single-stage cluster sampling. It involves four key steps.

Step 1: Define your population

As with other forms of sampling, you must first begin by clearly defining the population you wish to study.

Step 2: Divide your population into clusters

This is the most important part of the process. The quality of your clusters and how well they represent the larger population determine the validity of your results. Ideally, your clusters should meet the following criteria:

  • Each cluster’s population should be as diverse as possible. You want every potential characteristic of the entire population to be represented in each cluster.
  • Each cluster should have a similar distribution of characteristics as the distribution of the population as a whole.
  • Taken together, the clusters should cover the entire population.
  • There should not be any overlap between clusters (i.e. the same people or units do not appear in more than one cluster).

Ideally, each cluster should be a mini-representation of the entire population. However, in practice, clusters often do not perfectly represent the population’s characteristics, which is why this method provides less statistical certainty than simple random sampling and is more prone to research biases like selection bias.

Because clusters are usually naturally occurring groups, such as schools, cities, or households, they are often more homogeneous than the population as a whole. You should be aware of this when performing your study, as it might affect its validity.

Step 3: Randomly select clusters to use as your sample

If each cluster is itself a mini-representation of the larger population, randomly selecting and sampling from the clusters allows you to imitate simple random sampling, which in turn supports the validity of your results.

Conversely, if the clusters are not representative, then random sampling will allow you to gather data on a diverse array of clusters, which should still provide you with an overview of the population as a whole.

You choose the number of clusters based on how large you want your sample size to be. In a study of seventh-graders’ reading levels, for example, this in turn depends on the estimated size of the entire seventh-grade population, your desired confidence interval and confidence level, and your best guess of the standard deviation (a measure of how spread apart the values in a population are) of the reading levels of the seventh-graders.
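The relationship between these inputs can be sketched with the standard sample-size formula for estimating a mean, n0 = (z · σ / E)², followed by a finite population correction. This is a minimal sketch, not the guide's own procedure: the function names, the example numbers (standard deviation 15, margin of error 2, 10,000 students, 30 students per cluster) and the use of the simple-random-sampling formula are all assumptions made for illustration. Cluster samples usually need a larger sample than this formula suggests, because units within a cluster tend to resemble each other.

```python
import math
from statistics import NormalDist


def required_sample_size(std_dev, margin_of_error, confidence=0.95,
                         population_size=None):
    """Sample size for estimating a mean under simple random sampling."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    n0 = (z * std_dev / margin_of_error) ** 2
    if population_size is not None:
        # Finite population correction shrinks n for small populations.
        n0 = n0 / (1 + (n0 - 1) / population_size)
    return math.ceil(n0)


def clusters_needed(sample_size, avg_cluster_size):
    """Number of whole clusters required to reach the target sample size."""
    return math.ceil(sample_size / avg_cluster_size)


# Hypothetical inputs: reading scores with a guessed standard deviation of 15,
# a margin of error of 2 points, and 10,000 seventh-graders in total.
n = required_sample_size(std_dev=15, margin_of_error=2,
                         confidence=0.95, population_size=10_000)
k = clusters_needed(n, avg_cluster_size=30)
print(n, k)
```

A tighter margin of error or a higher confidence level increases the required sample size, and therefore the number of clusters to select.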

Step 4: Collect data from the sample

You then conduct your study and collect data from every unit in the selected clusters.
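The four steps of single-stage cluster sampling can be sketched in a few lines of Python. This is an illustrative sketch under assumed data: the `clusters` dictionary, the school names and the function name are hypothetical, and a real study would draw clusters from its own sampling frame.

```python
import random


def single_stage_cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters and keep every unit inside them."""
    rng = random.Random(seed)
    # Sort the names so that a fixed seed gives a reproducible draw.
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}


# Steps 1 and 2: a hypothetical population already divided into
# non-overlapping clusters (schools) that together cover everyone.
clusters = {
    "school_a": ["a1", "a2", "a3"],
    "school_b": ["b1", "b2"],
    "school_c": ["c1", "c2", "c3", "c4"],
    "school_d": ["d1", "d2", "d3"],
}

# Step 3: randomly select clusters. Step 4: collect data from every unit.
selected = single_stage_cluster_sample(clusters, n_clusters=2, seed=42)
units = [unit for members in selected.values() for unit in members]
print(sorted(selected), len(units))
```

Because whole clusters are kept, the final number of units depends on which clusters happen to be drawn.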

Multistage cluster sampling

In multistage cluster sampling, rather than collect data from every single unit in the selected clusters, you randomly select individual units from within each cluster to use as your sample.

You can then collect data from each of these individual units; this is known as double-stage sampling.

You can also continue this procedure, taking progressively smaller and smaller random samples, which is usually called multistage sampling.

You should use this method when it is infeasible or too expensive to test the entire cluster.

For example, in a study that uses schools as clusters:

  • From each selected school, you randomly select a sample of seventh-grade classes.
  • From within those classes, you randomly select a sample of students.
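The school, class and student stages can be sketched as nested random draws. This is a minimal sketch under assumptions: the nested `schools` dictionary, the counts at each stage and the function name are hypothetical and only illustrate the idea of sampling within previously sampled groups.

```python
import random


def multistage_sample(schools, n_schools, n_classes, n_students, seed=None):
    """Sample schools, then classes within them, then students within those."""
    rng = random.Random(seed)
    sample = []
    for school in rng.sample(sorted(schools), n_schools):
        classes = schools[school]
        n_c = min(n_classes, len(classes))  # never ask for more than exist
        for class_name in rng.sample(sorted(classes), n_c):
            students = classes[class_name]
            n_s = min(n_students, len(students))
            sample.extend(rng.sample(students, n_s))
    return sample


# Hypothetical population: 6 schools, 4 classes each, 20 students per class.
schools = {
    f"school_{s}": {
        f"class_{c}": [f"s{s}c{c}_student{i}" for i in range(20)]
        for c in range(4)
    }
    for s in range(6)
}

sample = multistage_sample(schools, n_schools=3, n_classes=2,
                           n_students=5, seed=1)
print(len(sample))
```

With these settings the sample holds 3 × 2 × 5 = 30 students, far fewer than the 240 students that single-stage sampling of three whole schools would require.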

Advantages and disadvantages

Cluster sampling is commonly used for its practical advantages, but it has some disadvantages in terms of statistical validity.

Advantages

  • Cluster sampling is time- and cost-efficient, especially for samples that are widely geographically spread and would be difficult to properly sample otherwise.
  • Because cluster sampling uses randomization, if the population is clustered properly, your study will have high external validity because your sample will reflect the characteristics of the larger population.

Disadvantages

  • Internal validity is less strong than with simple random sampling, particularly as you use more stages of clustering.
  • If your clusters are not a good mini-representation of the population as a whole, it is harder to rely on your sample to provide valid results, and the sample is very likely to be biased.
  • Cluster sampling is much more complex to plan than other forms of sampling.

Frequently asked questions about cluster sampling

Cluster sampling is a probability sampling method in which you divide a population into clusters, such as districts or schools, and then randomly select some of these clusters as your sample.

The clusters should ideally each be mini-representations of the population as a whole.

There are three types of cluster sampling: single-stage, double-stage, and multi-stage clustering. In all three types, you first divide the population into clusters, then randomly select clusters for use in your sample.

  • In single-stage sampling, you collect data from every unit within the selected clusters.
  • In double-stage sampling, you select a random sample of units from within the clusters.
  • In multi-stage sampling, you repeat the procedure of randomly sampling elements from within the clusters until you have reached a manageable sample.

Cluster sampling is more time- and cost-efficient than other probability sampling methods, particularly when it comes to large samples spread across a wide geographical area.

However, it provides less statistical certainty than other methods, such as simple random sampling, because it is difficult to ensure that your clusters properly represent the population as a whole.

Cite this Scribbr article

Thomas, L. (2023, June 22). Cluster Sampling | A Simple Step-by-Step Guide with Examples. Scribbr. Retrieved September 9, 2024, from https://www.scribbr.com/methodology/cluster-sampling/

Peer Reviewed

GPT-fabricated scientific papers on Google Scholar: Key features, spread, and implications for preempting evidence manipulation

Academic journals, archives, and repositories are seeing an increasing number of questionable research papers clearly produced using generative AI. They are often created with widely available, general-purpose AI applications, most likely ChatGPT, and mimic scientific writing. Google Scholar easily locates and lists these questionable papers alongside reputable, quality-controlled research. Our analysis of a selection of questionable GPT-fabricated scientific papers found in Google Scholar shows that many are about applied, often controversial topics susceptible to disinformation: the environment, health, and computing. The resulting enhanced potential for malicious manipulation of society’s evidence base, particularly in politically divisive domains, is a growing concern.

Swedish School of Library and Information Science, University of Borås, Sweden

Department of Arts and Cultural Sciences, Lund University, Sweden

Division of Environmental Communication, Swedish University of Agricultural Sciences, Sweden

Research Questions

  • Where are questionable publications produced with generative pre-trained transformers (GPTs) that can be found via Google Scholar published or deposited?
  • What are the main characteristics of these publications in relation to predominant subject categories?
  • How are these publications spread in the research infrastructure for scholarly communication?
  • How is the role of the scholarly communication infrastructure challenged in maintaining public trust in science and evidence through inappropriate use of generative AI?

Research Note Summary

  • A sample of scientific papers with signs of GPT-use found on Google Scholar was retrieved, downloaded, and analyzed using a combination of qualitative coding and descriptive statistics. All papers contained at least one of two common phrases returned by conversational agents that use large language models (LLM) like OpenAI’s ChatGPT. Google Search was then used to determine the extent to which copies of questionable, GPT-fabricated papers were available in various repositories, archives, citation databases, and social media platforms.
  • Roughly two-thirds of the retrieved papers were found to have been produced, at least in part, through undisclosed, potentially deceptive use of GPT. The majority (57%) of these questionable papers dealt with policy-relevant subjects (i.e., environment, health, computing), susceptible to influence operations. Most were available in several copies on different domains (e.g., social media, archives, and repositories).
  • Two main risks arise from the increasingly common use of GPT to (mass-)produce fake, scientific publications. First, the abundance of fabricated “studies” seeping into all areas of the research infrastructure threatens to overwhelm the scholarly communication system and jeopardize the integrity of the scientific record. A second risk lies in the increased possibility that convincingly scientific-looking content was in fact deceitfully created with AI tools and is also optimized to be retrieved by publicly available academic search engines, particularly Google Scholar. However small, this possibility and awareness of it risks undermining the basis for trust in scientific knowledge and poses serious societal risks.

Implications

The use of ChatGPT to generate text for academic papers has raised concerns about research integrity. Discussion of this phenomenon is ongoing in editorials, commentaries, opinion pieces, and on social media (Bom, 2023; Stokel-Walker, 2024; Thorp, 2023). There are now several lists of papers suspected of GPT misuse, and new papers are constantly being added (see, for example, Academ-AI, https://www.academ-ai.info/, and Retraction Watch, https://retractionwatch.com/papers-and-peer-reviews-with-evidence-of-chatgpt-writing/). While many legitimate uses of GPT for research and academic writing exist (Huang & Tan, 2023; Kitamura, 2023; Lund et al., 2023), its undeclared use (beyond proofreading) has potentially far-reaching implications for both science and society, and especially for their relationship. It therefore seems important to extend the discussion to Google Scholar, one of the most accessible and well-known intermediaries between the public and science, but also between the public and certain types of misinformation. Doing so also responds to the legitimate concern that the discussion of generative AI and misinformation needs to be more nuanced and empirically substantiated (Simon et al., 2023).

Google Scholar, https://scholar.google.com , is an easy-to-use academic search engine. It is available for free, and its index is extensive (Gusenbauer & Haddaway, 2020). It is also often touted as a credible source for academic literature and even recommended in library guides, by media and information literacy initiatives, and fact checkers (Tripodi et al., 2023). However, Google Scholar lacks the transparency and adherence to standards that usually characterize citation databases. Instead, Google Scholar uses automated crawlers, like Google’s web search engine (Martín-Martín et al., 2021), and the inclusion criteria are based on primarily technical standards, allowing any individual author—with or without scientific affiliation—to upload papers to be indexed (Google Scholar Help, n.d.). It has been shown that Google Scholar is susceptible to manipulation through citation exploits (Antkare, 2020) and by providing access to fake scientific papers (Dadkhah et al., 2017). A large part of Google Scholar’s index consists of publications from established scientific journals or other forms of quality-controlled, scholarly literature. However, the index also contains a large amount of gray literature, including student papers, working papers, reports, preprint servers, and academic networking sites, as well as material from so-called “questionable” academic journals, including paper mills. The search interface does not offer the possibility to filter the results meaningfully by material type, publication status, or form of quality control, such as limiting the search to peer-reviewed material.

To understand the occurrence of ChatGPT (co-)authored work in Google Scholar’s index, we scraped it for publications that included one of two common ChatGPT responses (see Appendix A) that we encountered on social media and in media reports (DeGeurin, 2024). The results of our descriptive statistical analyses showed that around 62% did not declare the use of GPTs. Most of these GPT-fabricated papers were found in non-indexed journals and working papers, but some cases included research published in mainstream scientific journals and conference proceedings. (Indexed journals are scholarly journals indexed by abstract and citation databases such as Scopus and Web of Science, where indexation implies high scientific quality; non-indexed journals are journals that fall outside of this indexation.) More than half (57%) of these GPT-fabricated papers concerned policy-relevant subject areas susceptible to influence operations. To avoid increasing the visibility of these publications, we abstained from referencing them in this research note. However, we have made the data available in the Harvard Dataverse repository.

The publications were related to three issue areas: health (14.5%), environment (19.5%), and computing (23%). Key terms included “healthcare,” “COVID-19,” or “infection” for health-related papers, and “analysis,” “sustainable,” and “global” for environment-related papers. In several cases, the papers had titles that strung together general keywords and buzzwords, thus alluding to very broad and current research. These terms included “biology,” “telehealth,” “climate policy,” “diversity,” and “disrupting,” to name just a few. While the study’s scope and design did not include a detailed analysis of which parts of the articles included fabricated text, our dataset did contain the surrounding sentences for each occurrence of the suspicious phrases that formed the basis for our search and subsequent selection. Based on that, we can say that the phrases occurred in most sections typically found in scientific publications, including the literature review, methods, conceptual and theoretical frameworks, background, motivation or societal relevance, and even discussion. This was confirmed during the joint coding, where we read and discussed all articles. It became clear that not just the text related to the telltale phrases was created by GPT, but that almost all articles in our sample of questionable articles likely contained traces of GPT-fabricated text everywhere.

Evidence hacking and backfiring effects

Generative pre-trained transformers (GPTs) can be used to produce texts that mimic scientific writing. These texts, when made available online—as we demonstrate—leak into the databases of academic search engines and other parts of the research infrastructure for scholarly communication. This development exacerbates problems that were already present with less sophisticated text generators (Antkare, 2020; Cabanac & Labbé, 2021). Yet, the public release of ChatGPT in 2022, together with the way Google Scholar works, has increased the likelihood of lay people (e.g., media, politicians, patients, students) coming across questionable (or even entirely GPT-fabricated) papers and other problematic research findings. Previous research has emphasized that the ability to determine the value and status of scientific publications for lay people is at stake when misleading articles are passed off as reputable (Haider & Åström, 2017) and that systematic literature reviews risk being compromised (Dadkhah et al., 2017). It has also been highlighted that Google Scholar, in particular, can be and has been exploited for manipulating the evidence base for politically charged issues and to fuel conspiracy narratives (Tripodi et al., 2023). Both concerns are likely to be magnified in the future, increasing the risk of what we suggest calling evidence hacking —the strategic and coordinated malicious manipulation of society’s evidence base.

The authority of quality-controlled research as evidence to support legislation, policy, politics, and other forms of decision-making is undermined by the presence of undeclared GPT-fabricated content in publications professing to be scientific. Due to the large number of archives, repositories, mirror sites, and shadow libraries to which they spread, there is a clear risk that GPT-fabricated, questionable papers will reach audiences even after a possible retraction. There are considerable technical difficulties involved in identifying and tracing computer-fabricated papers (Cabanac & Labbé, 2021; Dadkhah et al., 2023; Jones, 2024), not to mention preventing and curbing their spread and uptake.

However, as the rise of the so-called anti-vaxx movement during the COVID-19 pandemic and the ongoing obstruction and denial of climate change show, retracting erroneous publications often fuels conspiracies and increases the following of these movements rather than stopping them. To illustrate this mechanism, climate deniers frequently question established scientific consensus by pointing to other, supposedly scientific, studies that support their claims. Usually, these are poorly executed, not peer-reviewed, based on obsolete data, or even fraudulent (Dunlap & Brulle, 2020). A similar strategy is successful in the alternative epistemic world of the global anti-vaccination movement (Carrion, 2018) and the persistence of flawed and questionable publications in the scientific record already poses significant problems for health research, policy, and lawmakers, and thus for society as a whole (Littell et al., 2024). Considering that a person’s support for “doing your own research” is associated with increased mistrust in scientific institutions (Chinn & Hasell, 2023), it will be of utmost importance to anticipate and consider such backfiring effects already when designing a technical solution, when suggesting industry or legal regulation, and in the planning of educational measures.

Recommendations

Solutions should be based on simultaneous considerations of technical, educational, and regulatory approaches, as well as incentives, including social ones, across the entire research infrastructure. Paying attention to how these approaches and incentives relate to each other can help identify points and mechanisms for disruption. Recognizing fraudulent academic papers must happen alongside understanding how they reach their audiences and what reasons there might be for some of these papers successfully “sticking around.” A possible way to mitigate some of the risks associated with GPT-fabricated scholarly texts finding their way into academic search engine results would be to provide filtering options for facets such as indexed journals, gray literature, peer review, and similar on the interface of publicly available academic search engines. Furthermore, evaluation tools for indexed journals (such as LiU Journal CheckUp, https://ep.liu.se/JournalCheckup/default.aspx?lang=eng) could be integrated into the graphical user interfaces and the crawlers of these academic search engines. To enable accountability, it is important that the index (database) of such a search engine is populated according to criteria that are transparent, open to scrutiny, and appropriate to the workings of science and other forms of academic research. Moreover, considering that Google Scholar has no real competitor, there is a strong case for establishing a freely accessible, non-specialized academic search engine that is not run for commercial reasons but for reasons of public interest. Such measures, together with educational initiatives aimed particularly at policymakers, science communicators, journalists, and other media workers, will be crucial to reducing the possibilities for and effects of malicious manipulation or evidence hacking.

It is important not to present this as a technical problem that exists only because of AI text generators but to relate it to the wider concerns in which it is embedded. These range from a largely dysfunctional scholarly publishing system (Haider & Åström, 2017) and academia’s “publish or perish” paradigm to Google’s near-monopoly and ideological battles over the control of information and ultimately knowledge. Any intervention is likely to have systemic effects; these effects need to be considered and assessed in advance and, ideally, followed up on.

Our study focused on a selection of papers that were easily recognizable as fraudulent. We used this relatively small sample as a magnifying glass to examine, delineate, and understand a problem that goes beyond the scope of the sample itself and points towards larger concerns that require further investigation. The work of ongoing whistleblowing initiatives (such as Academ-AI, https://www.academ-ai.info/, and Retraction Watch, https://retractionwatch.com/papers-and-peer-reviews-with-evidence-of-chatgpt-writing/), recent media reports of journal closures (Subbaraman, 2024), and GPT-related changes in word use and writing style (Cabanac et al., 2021; Stokel-Walker, 2024) suggest that we only see the tip of the iceberg. There are already more sophisticated cases (Dadkhah et al., 2023) as well as cases involving fabricated images (Gu et al., 2022). Our analysis shows that questionable and potentially manipulative GPT-fabricated papers permeate the research infrastructure and are likely to become a widespread phenomenon. Our findings underline that the risk of fake scientific papers being used to maliciously manipulate evidence (see Dadkhah et al., 2017) must be taken seriously. Manipulation may involve undeclared automatic summaries of texts, inclusion in literature reviews, explicit scientific claims, or the concealment of errors in studies so that they are difficult to detect in peer review. However, the mere possibility of these things happening is a significant risk in its own right that can be strategically exploited and will have ramifications for trust in and perception of science. Society’s methods of evaluating sources and the foundations of media and information literacy are under threat, and public trust in science is at risk of further erosion, with far-reaching consequences for society in dealing with information disorders. To address this multifaceted problem, we first need to understand why it exists and proliferates.

Finding 1: 139 GPT-fabricated, questionable papers were found and listed as regular results on the Google Scholar results page. Non-indexed journals dominate.

Most questionable papers we found were in non-indexed journals or were working papers, but we did also find some in established journals, publications, conferences, and repositories. We found a total of 139 papers with a suspected deceptive use of ChatGPT or similar LLM applications (see Table 1). Out of these, 19 were in indexed journals, 89 were in non-indexed journals, 19 were student papers found in university databases, and 12 were working papers (mostly in preprint databases). Table 1 divides these papers into categories. Health and environment papers made up around 34% (47) of the sample. Of these, 66% were present in non-indexed journals.

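Table 1

Venue | Computing | Environment | Health | Others | Total
Indexed journals* | 5 | 3 | 4 | 7 | 19
Non-indexed journals | 18 | 18 | 13 | 40 | 89
Student papers | 4 | 3 | 1 | 11 | 19
Working papers | 5 | 3 | 2 | 2 | 12
Total | 32 | 27 | 20 | 60 | 139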
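The shares quoted for Table 1 can be reproduced from its counts. The sketch below is illustrative only: the assignment of counts to the categories Computing, Environment, Health and Others is inferred from the totals reported in the text (32, 27, 20 and 60 papers), and the variable names are hypothetical.

```python
# Counts per venue and category; the column assignment is an assumption
# inferred from the category totals reported in the text.
TABLE_1 = {
    "Indexed journals": {"Computing": 5, "Environment": 3, "Health": 4, "Others": 7},
    "Non-indexed journals": {"Computing": 18, "Environment": 18, "Health": 13, "Others": 40},
    "Student papers": {"Computing": 4, "Environment": 3, "Health": 1, "Others": 11},
    "Working papers": {"Computing": 5, "Environment": 3, "Health": 2, "Others": 2},
}

total_papers = sum(sum(row.values()) for row in TABLE_1.values())

# Health and environment papers across all venues.
env_health = sum(row["Environment"] + row["Health"] for row in TABLE_1.values())
env_health_share = env_health / total_papers

# Share of those papers that appeared in non-indexed journals.
non_indexed = TABLE_1["Non-indexed journals"]
non_indexed_share = (non_indexed["Environment"] + non_indexed["Health"]) / env_health

print(total_papers, env_health)
print(f"{env_health_share:.0%} {non_indexed_share:.0%}")
```

Under this reading of the table, 47 of the 139 papers fall into the health and environment categories, and 31 of those 47 were found in non-indexed journals, which matches the rounded percentages given in the text.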

Finding 2: GPT-fabricated, questionable papers are disseminated online, permeating the research infrastructure for scholarly communication, often in multiple copies. Applied topics with practical implications dominate.

The 20 papers concerning health-related issues are distributed across 20 unique domains, accounting for 46 URLs. The 27 papers dealing with environmental issues can be found across 26 unique domains, accounting for 56 URLs.  Most of the identified papers exist in multiple copies and have already spread to several archives, repositories, and social media. It would be difficult, or impossible, to remove them from the scientific record.

As apparent from Table 2, GPT-fabricated, questionable papers are seeping into most parts of the online research infrastructure for scholarly communication. Platforms on which identified papers have appeared include ResearchGate, ORCiD, Journal of Population Therapeutics and Clinical Pharmacology (JPTCP), Easychair, Frontiers, the Institute of Electrical and Electronics Engineers (IEEE), and X/Twitter. Thus, even if they are retracted from their original source, it will prove very difficult to track, remove, or even just mark them up on other platforms. Moreover, unless regulated, Google Scholar will enable their continued and most likely unlabeled discoverability.

Table 2

Category | Top domains (count)
Environment | researchgate.net (13), orcid.org (4), easychair.org (3), ijope.com* (3), publikasiindonesia.id (3)
Health | researchgate.net (15), ieee.org (4), twitter.com (3), jptcp.com** (2), frontiersin.org (2)

A word rain visualization (Centre for Digital Humanities Uppsala, 2023) combines word prominences, measured through TF-IDF scores, with the semantic similarity of the full texts of our sample of GPT-generated articles that fall into the “Environment” and “Health” categories. (TF-IDF, term frequency–inverse document frequency, is a method for measuring the significance of a word in a document compared to its frequency across all documents in a collection.) The visualization reflects the two categories in question. However, as can be seen in Figure 1, it also reveals overlap and sub-areas. The y-axis shows word prominences through word positions and font sizes, while the x-axis indicates semantic similarity. In addition to a certain amount of overlap, this reveals sub-areas, which are best described as two distinct events within the word rain. The event on the left bundles terms related to the development and management of health and healthcare, with “challenges,” “impact,” and “potential of artificial intelligence” emerging as semantically related terms. Terms related to research infrastructures and to environmental, epistemic, and technological concepts are arranged further down in the same event (e.g., “system,” “climate,” “understanding,” “knowledge,” “learning,” “education,” “sustainable”). A second distinct event further to the right bundles terms associated with fish farming and aquatic medicinal plants, highlighting the presence of an aquaculture cluster. Here, the prominence of groups of terms such as “used,” “model,” “-based,” and “traditional” suggests the presence of applied research on these topics. The two events making up the word rain visualization are linked by a less dominant but overlapping cluster of terms related to “energy” and “water.”

[Figure 1. Word rain visualization of the full texts of the environment and health papers.]

The bar chart of the terms in the paper subset (see Figure 2) complements the word rain visualization by depicting the most prominent terms in the full texts along the y-axis. Here, word prominences across health and environment papers are arranged descendingly, where values outside parentheses are TF-IDF values (relative frequencies) and values inside parentheses are raw term frequencies (absolute frequencies).
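The difference between raw term frequencies and TF-IDF values can be sketched with a small from-scratch implementation. This is a minimal sketch under assumptions: it uses a common textbook weighting (relative term frequency multiplied by the logarithm of the inverse document frequency), which may differ from the exact variant used by the word rain tool, and the three toy documents are invented for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)  # raw (absolute) term frequencies
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical, already tokenized documents.
docs = [
    ["climate", "health", "model"],
    ["health", "care", "model"],
    ["fish", "farming", "model"],
]
scores = tf_idf(docs)
print(scores[0])
```

A term that occurs in every document, such as "model" here, receives a score of zero, while terms concentrated in few documents score higher even when their raw counts are identical.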

[Figure 2. Bar chart of the most prominent terms in the full texts of the environment and health papers.]

Finding 3: Google Scholar presents results from quality-controlled and non-controlled citation databases on the same interface, providing unfiltered access to GPT-fabricated questionable papers.

Google Scholar’s central position in the publicly accessible scholarly communication infrastructure, as well as its lack of standards, transparency, and accountability in terms of inclusion criteria, has potentially serious implications for public trust in science. This is likely to exacerbate the already-known potential to exploit Google Scholar for evidence hacking (Tripodi et al., 2023) and will have implications for any attempts to retract or remove fraudulent papers from their original publication venues. Any solution must consider the entirety of the research infrastructure for scholarly communication and the interplay of different actors, interests, and incentives.

We searched and scraped Google Scholar using the Python library Scholarly (Cholewiak et al., 2023) for papers that included specific phrases known to be common responses from ChatGPT and similar applications with the same underlying model (GPT-3.5 or GPT-4): “as of my last knowledge update” and/or “I don’t have access to real-time data” (see Appendix A). This facilitated the identification of papers that likely used generative AI to produce text, resulting in 227 retrieved papers. The papers’ bibliographic information was automatically added to a spreadsheet and downloaded into Zotero, an open-source reference manager (https://zotero.org).
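The phrase-matching idea can be illustrated on text that has already been retrieved. The sketch below is not the authors' scraping code and makes no calls to Scholarly or Google Scholar; the function names and the normalization steps (lowercasing, collapsing whitespace, treating curly and straight apostrophes alike) are assumptions made for illustration.

```python
import re

# The two telltale phrases named in the text, stored in normalized form.
TELLTALE_PHRASES = (
    "as of my last knowledge update",
    "i don't have access to real-time data",
)


def normalize(text):
    """Lowercase, unify apostrophes, and collapse runs of whitespace."""
    text = text.replace("\u2019", "'")  # curly apostrophe to straight
    return re.sub(r"\s+", " ", text).lower()


def find_telltale_phrases(text):
    """Return the telltale phrases that occur in the given text."""
    normalized = normalize(text)
    return [phrase for phrase in TELLTALE_PHRASES if phrase in normalized]


# Hypothetical snippets standing in for retrieved full texts.
flagged = find_telltale_phrases(
    "As of my last knowledge\nupdate, the policy had not been revised."
)
clean = find_telltale_phrases("We surveyed 200 students about reading habits.")
print(flagged, clean)
```

A match only signals that a paper deserves a closer look; as the coding step shows, some matching papers declare or legitimately discuss their use of GPT.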

We employed multiple coding (Barbour, 2001) to classify the papers based on their content. First, we jointly assessed whether the paper was suspected of fraudulent use of ChatGPT (or similar) based on how the text was integrated into the papers and whether the paper was presented as original research output or the AI tool’s role was acknowledged. Second, in analyzing the content of the papers, we continued the multiple coding by classifying the fraudulent papers into four categories identified during an initial round of analysis—health, environment, computing, and others—and then determining which subjects were most affected by this issue (see Table 1). Out of the 227 retrieved papers, 88 papers were written with legitimate and/or declared use of GPTs (i.e., false positives, which were excluded from further analysis), and 139 papers were written with undeclared and/or fraudulent use (i.e., true positives, which were included in further analysis). The multiple coding was conducted jointly by all authors of the present article, who collaboratively coded and cross-checked each other’s interpretation of the data simultaneously in a shared spreadsheet file. This was done to single out coding discrepancies and settle coding disagreements, which in turn ensured methodological thoroughness and analytical consensus (see Barbour, 2001). Redoing the category coding later based on our established coding schedule, we achieved an intercoder reliability (Cohen’s kappa) of 0.806 after eradicating obvious differences.
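Cohen's kappa compares the agreement two coders actually reach with the agreement expected by chance: κ = (p_o − p_e) / (1 − p_e). The following is a minimal sketch of that calculation; the function name and the toy labels are hypothetical and do not reproduce the study's coding data.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two equally long sequences of category labels."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must label the same non-empty item set.")
    n = len(coder_a)
    # Observed agreement: share of items given the same label by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of the coders' marginal label proportions.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a.keys() | freq_b.keys()) / n**2
    if expected == 1:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels for four papers.
coder_a = ["health", "health", "environment", "environment"]
coder_b = ["health", "environment", "environment", "environment"]
kappa = cohens_kappa(coder_a, coder_b)
print(kappa)
```

In this toy example the coders agree on three of four papers (p_o = 0.75) while chance agreement is 0.5, giving κ = 0.5. Values above 0.8, such as the 0.806 reported here, are conventionally read as strong agreement.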

The ranking algorithm of Google Scholar prioritizes highly cited and older publications (Martín-Martín et al., 2016). Therefore, the position of the articles on the search engine results pages was not particularly informative, considering the relatively small number of results in combination with the recency of the publications. Only the query “as of my last knowledge update” had more than two search engine result pages. On those, questionable articles with undeclared use of GPTs were evenly distributed across all result pages (min: 4, max: 9, mode: 8), with the proportion of undeclared use being slightly higher on average on later search result pages.

To understand how the papers making fraudulent use of generative AI were disseminated online, we programmatically searched for the paper titles (with exact string matching) in Google Search from our local IP address (see Appendix B) using the googlesearch-python library (Vikramaditya, 2020). We manually verified each search result to filter out false positives—results that were not related to the paper—and then compiled the most prominent URLs by field. This enabled the identification of other platforms through which the papers had been spread. We did not, however, investigate whether copies had spread into SciHub or other shadow libraries, or if they were referenced in Wikipedia.

We used descriptive statistics to count GPT-fabricated papers across topics and venues and to identify the top domains by subject. The pandas software library for the Python programming language (The pandas development team, 2024) was used for this part of the analysis. Based on the multiple coding, paper occurrences were counted in relation to their categories, divided into indexed journals, non-indexed journals, student papers, and working papers. The schemes, subdomains, and subdirectories of the URL strings were filtered out while top-level domains and second-level domains were kept, which led to normalizing domain names. This, in turn, allowed the counting of domain frequencies in the environment and health categories. To distinguish word prominences and meanings in the environment and health-related GPT-fabricated questionable papers, a semantically-aware word cloud visualization was produced through the use of a word rain (Centre for Digital Humanities Uppsala, 2023) for full-text versions of the papers. Font size and y-axis positions indicate word prominences through TF-IDF scores for the environment and health papers (also visualized in a separate bar chart with raw term frequencies in parentheses), and words are positioned along the x-axis to reflect semantic similarity (Skeppstedt et al., 2024), with an English Word2vec skip-gram model space (Fares et al., 2017). An English stop word list was used, along with a manually produced list including terms such as “https,” “volume,” or “years.”
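The domain normalization step described above can be sketched as follows. This is an illustration using only the Python standard library rather than the authors' pandas code. It assumes every URL carries a scheme, and it naively keeps the last two host labels, so it would mishandle multi-part suffixes such as `.co.uk`. The function names and example URLs are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit


def normalize_domain(url: str) -> str:
    """Drop scheme, subdomains, and path; keep second-level and top-level domain."""
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    # Naive assumption: the registrable domain is the last two labels.
    return ".".join(labels[-2:])


def domain_frequencies(urls: list[str]) -> Counter:
    """Count how often each normalized domain occurs, skipping unparsable URLs."""
    domains = (normalize_domain(url) for url in urls)
    return Counter(domain for domain in domains if domain)


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    urls = [
        "https://www.example.org/publication/1",
        "http://example.org/x",
        "https://papers.example.com/a/b?c=1",
    ]
    print(domain_frequencies(urls).most_common())
```

Collapsing `www.example.org` and `example.org` into one key is what makes the frequency counts comparable across platforms.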


Cite this Essay

Haider, J., Söderström, K. R., Ekström, B., & Rödl, M. (2024). GPT-fabricated scientific papers on Google Scholar: Key features, spread, and implications for preempting evidence manipulation. Harvard Kennedy School (HKS) Misinformation Review . https://doi.org/10.37016/mr-2020-156


Bibliography

Antkare, I. (2020). Ike Antkare, his publications, and those of his disciples. In M. Biagioli & A. Lippman (Eds.), Gaming the metrics (pp. 177–200). The MIT Press. https://doi.org/10.7551/mitpress/11087.003.0018

Barbour, R. S. (2001). Checklists for improving rigour in qualitative research: A case of the tail wagging the dog? BMJ , 322 (7294), 1115–1117. https://doi.org/10.1136/bmj.322.7294.1115

Bom, H.-S. H. (2023). Exploring the opportunities and challenges of ChatGPT in academic writing: A roundtable discussion. Nuclear Medicine and Molecular Imaging , 57 (4), 165–167. https://doi.org/10.1007/s13139-023-00809-2

Cabanac, G., & Labbé, C. (2021). Prevalence of nonsensical algorithmically generated papers in the scientific literature. Journal of the Association for Information Science and Technology , 72 (12), 1461–1476. https://doi.org/10.1002/asi.24495

Cabanac, G., Labbé, C., & Magazinov, A. (2021). Tortured phrases: A dubious writing style emerging in science. Evidence of critical issues affecting established journals . arXiv. https://doi.org/10.48550/arXiv.2107.06751

Carrion, M. L. (2018). “You need to do your research”: Vaccines, contestable science, and maternal epistemology. Public Understanding of Science , 27 (3), 310–324. https://doi.org/10.1177/0963662517728024

Centre for Digital Humanities Uppsala (2023). CDHUppsala/word-rain [Computer software]. https://github.com/CDHUppsala/word-rain

Chinn, S., & Hasell, A. (2023). Support for “doing your own research” is associated with COVID-19 misperceptions and scientific mistrust. Harvard Kennedy School (HKS) Misinformation Review, 4 (3). https://doi.org/10.37016/mr-2020-117

Cholewiak, S. A., Ipeirotis, P., Silva, V., & Kannawadi, A. (2023). SCHOLARLY: Simple access to Google Scholar authors and citation using Python (1.5.0) [Computer software]. https://doi.org/10.5281/zenodo.5764801

Dadkhah, M., Lagzian, M., & Borchardt, G. (2017). Questionable papers in citation databases as an issue for literature review. Journal of Cell Communication and Signaling , 11 (2), 181–185. https://doi.org/10.1007/s12079-016-0370-6

Dadkhah, M., Oermann, M. H., Hegedüs, M., Raman, R., & Dávid, L. D. (2023). Detection of fake papers in the era of artificial intelligence. Diagnosis , 10 (4), 390–397. https://doi.org/10.1515/dx-2023-0090

DeGeurin, M. (2024, March 19). AI-generated nonsense is leaking into scientific journals. Popular Science. https://www.popsci.com/technology/ai-generated-text-scientific-journals/

Dunlap, R. E., & Brulle, R. J. (2020). Sources and amplifiers of climate change denial. In D.C. Holmes & L. M. Richardson (Eds.), Research handbook on communicating climate change (pp. 49–61). Edward Elgar Publishing. https://doi.org/10.4337/9781789900408.00013

Fares, M., Kutuzov, A., Oepen, S., & Velldal, E. (2017). Word vectors, reuse, and replicability: Towards a community repository of large-text resources. In J. Tiedemann & N. Tahmasebi (Eds.), Proceedings of the 21st Nordic Conference on Computational Linguistics (pp. 271–276). Association for Computational Linguistics. https://aclanthology.org/W17-0237

Google Scholar Help. (n.d.). Inclusion guidelines for webmasters . https://scholar.google.com/intl/en/scholar/inclusion.html

Gu, J., Wang, X., Li, C., Zhao, J., Fu, W., Liang, G., & Qiu, J. (2022). AI-enabled image fraud in scientific publications. Patterns , 3 (7), 100511. https://doi.org/10.1016/j.patter.2022.100511

Gusenbauer, M., & Haddaway, N. R. (2020). Which academic search systems are suitable for systematic reviews or meta-analyses? Evaluating retrieval qualities of Google Scholar, PubMed, and 26 other resources. Research Synthesis Methods , 11 (2), 181–217. https://doi.org/10.1002/jrsm.1378

Haider, J., & Åström, F. (2017). Dimensions of trust in scholarly communication: Problematizing peer review in the aftermath of John Bohannon’s “Sting” in science. Journal of the Association for Information Science and Technology , 68 (2), 450–467. https://doi.org/10.1002/asi.23669

Huang, J., & Tan, M. (2023). The role of ChatGPT in scientific communication: Writing better scientific review articles. American Journal of Cancer Research , 13 (4), 1148–1154. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10164801/

Jones, N. (2024). How journals are fighting back against a wave of questionable images. Nature , 626 (8000), 697–698. https://doi.org/10.1038/d41586-024-00372-6

Kitamura, F. C. (2023). ChatGPT is shaping the future of medical writing but still requires human judgment. Radiology , 307 (2), e230171. https://doi.org/10.1148/radiol.230171

Littell, J. H., Abel, K. M., Biggs, M. A., Blum, R. W., Foster, D. G., Haddad, L. B., Major, B., Munk-Olsen, T., Polis, C. B., Robinson, G. E., Rocca, C. H., Russo, N. F., Steinberg, J. R., Stewart, D. E., Stotland, N. L., Upadhyay, U. D., & Ditzhuijzen, J. van. (2024). Correcting the scientific record on abortion and mental health outcomes. BMJ , 384 , e076518. https://doi.org/10.1136/bmj-2023-076518

Lund, B. D., Wang, T., Mannuru, N. R., Nie, B., Shimray, S., & Wang, Z. (2023). ChatGPT and a new academic reality: Artificial Intelligence-written research papers and the ethics of the large language models in scholarly publishing. Journal of the Association for Information Science and Technology, 74 (5), 570–581. https://doi.org/10.1002/asi.24750

Martín-Martín, A., Orduna-Malea, E., Ayllón, J. M., & Delgado López-Cózar, E. (2016). Back to the past: On the shoulders of an academic search engine giant. Scientometrics , 107 , 1477–1487. https://doi.org/10.1007/s11192-016-1917-2

Martín-Martín, A., Thelwall, M., Orduna-Malea, E., & Delgado López-Cózar, E. (2021). Google Scholar, Microsoft Academic, Scopus, Dimensions, Web of Science, and OpenCitations’ COCI: A multidisciplinary comparison of coverage via citations. Scientometrics , 126 (1), 871–906. https://doi.org/10.1007/s11192-020-03690-4

Simon, F. M., Altay, S., & Mercier, H. (2023). Misinformation reloaded? Fears about the impact of generative AI on misinformation are overblown. Harvard Kennedy School (HKS) Misinformation Review, 4 (5). https://doi.org/10.37016/mr-2020-127

Skeppstedt, M., Ahltorp, M., Kucher, K., & Lindström, M. (2024). From word clouds to Word Rain: Revisiting the classic word cloud to visualize climate change texts. Information Visualization , 23 (3), 217–238. https://doi.org/10.1177/14738716241236188

Swedish Research Council. (2017). Good research practice. Vetenskapsrådet.

Stokel-Walker, C. (2024, May 1). AI Chatbots Have Thoroughly Infiltrated Scientific Publishing . Scientific American. https://www.scientificamerican.com/article/chatbots-have-thoroughly-infiltrated-scientific-publishing/

Subbaraman, N. (2024, May 14). Flood of fake science forces multiple journal closures: Wiley to shutter 19 more journals, some tainted by fraud. The Wall Street Journal . https://www.wsj.com/science/academic-studies-research-paper-mills-journals-publishing-f5a3d4bc

The pandas development team. (2024). pandas-dev/pandas: Pandas (v2.2.2) [Computer software]. Zenodo. https://doi.org/10.5281/zenodo.10957263

Thorp, H. H. (2023). ChatGPT is fun, but not an author. Science , 379 (6630), 313–313. https://doi.org/10.1126/science.adg7879

Tripodi, F. B., Garcia, L. C., & Marwick, A. E. (2023). ‘Do your own research’: Affordance activation and disinformation spread. Information, Communication & Society , 27 (6), 1212–1228. https://doi.org/10.1080/1369118X.2023.2245869

Vikramaditya, N. (2020). Nv7-GitHub/googlesearch [Computer software]. https://github.com/Nv7-GitHub/googlesearch

This research has been supported by Mistra, the Swedish Foundation for Strategic Environmental Research, through the research program Mistra Environmental Communication (Haider, Ekström, Rödl) and the Marcus and Amalia Wallenberg Foundation [2020.0004] (Söderström).

Competing Interests

The authors declare no competing interests.

The research described in this article was carried out under Swedish legislation. According to the relevant EU and Swedish legislation (2003:460) on the ethical review of research involving humans (“Ethical Review Act”), the research reported on here is not subject to authorization by the Swedish Ethical Review Authority (“etikprövningsmyndigheten”) (SRC, 2017).

This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided that the original author and source are properly credited.

Data Availability

All data needed to replicate this study are available at the Harvard Dataverse: https://doi.org/10.7910/DVN/WUVD8X

Acknowledgements

The authors wish to thank two anonymous reviewers for their valuable comments on the article manuscript as well as the editorial group of Harvard Kennedy School (HKS) Misinformation Review for their thoughtful feedback and input.


CBSE Class 11 Food Production Sample Papers 2024-25 Released for Skill Subjects: Download Now!

CBSE Sample Papers and Marking Scheme 2025: CBSE has released the sample papers for the Food Production skill subject for Class 11 for the 2025 board exams. Download the Food Production sample question papers with marking scheme in PDF here.

Anisha Mishra

CBSE Class 11 Food Production Skill Subject Sample Papers 2025: The Central Board of Secondary Education (CBSE) has made sample papers for all Class 11 subjects available on its official website. These sample papers help students practice and perform better in examinations. This article provides the CBSE Class 11 Food Production sample paper 2025, along with section-wise questions and a direct link to download the paper for preparation and practice. For now, students can take a look at the Skill Subject Sample Papers. Read the complete article to download the free PDF of the Food Production sample paper and the marking scheme.

CBSE Class 11 Food Production Skill Subject: General Instructions:

1. Please read the instructions carefully.

2. This Question Paper consists of 24 questions in two sections – Section A & Section B.

3. Section A has Objective type questions whereas Section B contains Subjective type questions.

4. Out of the given (6 + 18 =) 24 questions, a candidate has to answer (6 + 11 =) 17 questions in  the allotted (maximum) time of 3 hours.

5. All questions of a particular section must be attempted in the correct order.

  • This section has 06 questions.
  • There is no negative marking.
  • Do as per the instructions given.
  • Marks allotted are mentioned against each question/part.
  • This section contains 18 questions.
  • A candidate has to do 11 questions.

CBSE Class 11 Food Production Sample Question Papers of Skill Subjects 2024-25 

Section A: Objective Type Questions

  • 1 Answer any 4 out of the given 6 questions on Employability Skills (1 x 4 = 4 marks)
  • Nature System.
  • Earth System.
  • None of the above.
  • Three – way process.
  • Two – way process.
  • One – way process.
  • Self-Confident.
  • Self-Control.
  • Self-Motivated.
  • Self-Aware.
  • I am good at understanding other peoples.
  • Dealing with strangers, I am confident.
  • I don’t know, how to play chess.
  • I help my parents in household chores.
  • Adding Substitutes.
  • Scaling Up.
  • All of the above.

2 Answer any 5 out of the given 7 questions (1 x 5 = 5 marks)

  • Meetings, Incentive tours, Conferences & Exhibitions.
  • Meet ups, Incentive travel, Class & Exhibitions.
  • Meetings, Incentive transport, Conferences & Events.
  • Meetings, Inclination, Conferences & Events.
  • Garde manger.
  • Commissary.
  • Patisserie.
  • Oils and fat.
  • Cool & Moist store.
  • Cool & Dry store.
  • Warm & Moist store.
  • Chiffonade.
  • Fruit salad.
  • Vegetable salad.
  • Protein salad.
  • Pasta salad

3 Answer any 6 out of the given 7 questions (1 x 6 = 6 marks)

  • Bar counter.
  • Chef’s scarf.
  • Safety shoes.
  • Musk melon.


  • Black pepper.

4 Answer any 5 out of the given 6 questions (1 x 5 = 5 marks)

  • F&B Manager.
  • Executive Chef.
  • HR Manager.
  • Event coordinator.
  • Commis chef.
  • 5˚C to 45˚C.
  • 5˚C to 40˚C.
  • 5˚C to 50˚C.
  • 5˚C to 60˚C.
  • Dough maker.
  • Transportation.
  • all of the above.
  • Curry Leaves.

5. Answer any 5 out of the given 6 questions (1 x 5 = 5 marks)

  • 3˚C to 4˚C.
  • 5˚C to 6˚C.
  • 8˚C to 10˚C.
  • 5˚C to 7˚C.
  • To make the food more palatable.
  • Cooked food cannot be stored for a longer time.
  • It improves the eye appeal of the food.
  • It kills the bacteria and keeps the food sterile.
  • Conduction.
  • Convection.
  • Shallow fry.

To view the complete set of questions and sections, click on the link below to download the PDF:

CBSE Class 11 Food Production Marking Scheme 2024-25

The marking scheme helps students by giving them a clear idea of what is needed to score well in the examination. It explains how each answer will be scored and the weightage of each question, and shows what teachers are looking for in an answer. By knowing the marking scheme, students can focus on important topics, practice accordingly, and track how well they are doing. To access the marking scheme for the Class 11 Food Production sample paper 2025, click on the link below to download it in PDF format:


IMAGES

  1. Random Sampling Method Free Essay Example

  2. Non Probability Sampling Method Which Is Judgement Sampling Accounting

  3. Sampling Methods: Types of Sampling Methods & Examples

  4. Sampling and techniques

  5. Systematic Matching Sampling Essay Example

  6. Sampling Method and Study Design

VIDEO

  1. Sampling in Research

  2. Method of sampling #education #commercewalesawsir #virqlshorts #waliacommerceclasses

  3. Sampling Techniques Part-5 (Cluster Sampling)

  4. Selecting a Sampling Method

  5. Probability Sampling Technique || Part 27 || By Sunil Sir||

  6. Simple Random Sampling

COMMENTS

  1. Sampling Methods

    Sampling Methods | Types, Techniques & Examples

  2. (PDF) Sampling Methods in Research: A Review

    (PDF) Sampling Methods in Research: A Review

  3. Sampling Methods

    Abstract. Knowledge of sampling methods is essential to design quality research. Critical questions are provided to help researchers choose a sampling method. This article reviews probability and non-probability sampling methods, lists and defines specific sampling techniques, and provides pros and cons for consideration.

  4. What are sampling methods and how do you choose the best one?

    What are sampling methods and how do you choose the best ...

  5. Sampling methods in Clinical Research; an Educational Review

    Sampling methods in Clinical Research; an Educational ...

  6. Sampling Methods In Reseach: Types, Techniques, & Examples

    Sampling Methods In Reseach: Types, Techniques, & ...

  7. What are Sampling Methods? Techniques, Types, and Examples

    What Are Sampling Methods? Techniques, Types, and ...

  8. Systematic Sampling

    Systematic Sampling | A Step-by-Step Guide with Examples

  9. Types of Sampling Methods in Human Research: Why, When and How?

    Also, in the case of a small sample set, a representation of the entire population is more likely to be compromised (Bhardwaj, 2019; Sharma, 2017). 3.2. Systematic Sampling. Systematic sampling ...

  10. Sampling Methods in Research Methodology; How to Choose a Sampling

    Sampling Methods in Research Methodology; How to ...

  11. PDF Sampling Strategies in Qualitative Research

    SAGE Research Methods. Page 2 of 21. Sampling Strategies in Qualitative Research. 1. 1. Sampling can be divided in a number of different ways. At a basic level, with the exception ... papers on delay in diagnosis, which outline some of the factors tied to delay. So, for example, in rheumatoid arthritis in adults, the central issue was family

  12. Sampling Types and Processes

    Introduction. Sampling refers to a statistical process where part of a given data from a target population is selected and used as a representative of the whole population. There are various types of sampling processes, including random, systematic, and cluster sampling methods, among others. In regards to random sampling, each element of the ...

  13. Simple Random Sampling

    Simple Random Sampling | Definition, Steps & Examples

  14. Sampling Methods

    Sampling Methods | Types, Techniques, & Examples - Scribbr

  15. Analysis of Sampling Methods

    Analysis of Sampling Methods Essay. Exclusively available on IvyPanda®. In the study by Grantham et al. (2006), convenience or accidental sampling was used, and the most convenient people were selected as the study participants. Regardless of the convenience and frugality of this sampling method, it is necessary to consider its disadvantages ...

  16. Sampling Techniques in Education

    Sampling Techniques in Education Essay. Random sampling is a sampling technique where all elements in a population have an equal probability of being selected to form the sample. It means, therefore, that elements are chosen arbitrarily without following any formulae (Babbie, 2010). This technique is unbiased and it gives true representative ...

  17. (PDF) Types of sampling in research

    (PDF) Types of sampling in research

  18. Sampling Methods Essay

    Sampling Methods Essay. Sampling is the framework on which any form of research is carried out. A suitable sample that meets the inclusion and exclusion criteria of a research design must be chosen from a given population to carry out studies. In this essay comparison is made between stratified random sampling and convenience sampling.

  19. Different Types Of Sampling Method Education Essay

    The sampling process comprises several stages: Defining the population of concern. Specifying a sampling frame, a set of items or events possible to measure. Specifying a sampling method for selecting items or events from the frame. Determining the sample size. Implementing the sampling plan.

  20. Sampling methods and techniques

    A sample design is a definite plan for obtaining a sample from a given population. It refers to the technique or the procedure the researcher would adopt in selecting items for the sample. Sample design also leads to a procedure to tell the number of items to be included in the sample i.e., the size of the sample.

  21. Cluster Sampling

    Cluster Sampling | A Simple Step-by-Step Guide ... - Scribbr

  22. (PDF) Research Sampling and Sample Size Determination: A practical

    (PDF) Research Sampling and Sample Size Determination

  23. GPT-fabricated scientific papers on Google Scholar: Key features

    A sample of scientific papers with signs of GPT-use found on Google Scholar was retrieved, downloaded, and analyzed using a combination of qualitative coding and descriptive statistics. All papers contained at least one of two common phrases returned by conversational agents that use large language models (LLM) like OpenAI's ChatGPT.

  24. CBSE Class 11 Food Production Sample Papers 2025: Sample Question

    CBSE Class 11 Food Production Skill Subject: General Instructions: 1. Please read the instructions carefully. 2. This Question Paper consists of 24 questions in two sections - Section A & Section B.

  25. 253 questions with answers in SAMPLING METHODS

    Question. 6 answers. Apr 4, 2018. Does cluster sampling still apply with Probability Proportional to Size (PPS) sampling in the following scenario: Does you still need to use cluster samples if ...