Descriptive Statistics | Definitions, Types, Examples

Published on July 9, 2020 by Pritha Bhandari. Revised on June 21, 2023.

Descriptive statistics summarize and organize characteristics of a data set. A data set is a collection of responses or observations from a sample or entire population.

In quantitative research, after collecting data, the first step of statistical analysis is to describe characteristics of the responses, such as the average of one variable (e.g., age), or the relation between two variables (e.g., age and creativity).

The next step is inferential statistics, which help you decide whether your data confirms or refutes your hypothesis and whether it is generalizable to a larger population.

Types of descriptive statistics

There are 3 main types of descriptive statistics:

  • The distribution concerns the frequency of each value.
  • The central tendency concerns the averages of the values.
  • The variability or dispersion concerns how spread out the values are.

You can apply these to assess only one variable at a time, in univariate analysis, or to compare two or more, in bivariate and multivariate analysis.

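The examples in this article come from a survey that asked participants how many times in the past year they did each of the following:

  • Go to a library
  • Watch a movie at a theater
  • Visit a national park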

A data set is made up of a distribution of values, or scores. In tables or graphs, you can summarize the frequency of every possible value of a variable in numbers or percentages. This is called a frequency distribution.

Simple frequency distribution table

Gender | Number
Male | 182
Female | 235
Other | 27

From this table, you can see that more women than men or people with another gender identity took part in the study.

In a grouped frequency distribution, you can group numerical response values and add up the number of responses for each group. You can also convert each of these numbers to percentages.

Grouped frequency distribution table

Library visits in the past year | Percent
0–4 | 6%
5–8 | 20%
9–12 | 42%
13–16 | 24%
17+ | 8%
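
As a rough illustration, the two kinds of frequency distribution can be sketched in Python. The gender counts come from the simple table above, and the six library-visit values are the responses used in the worked examples later in this article. The helper names (`to_percentages`, `group_label`, `grouped_frequency`) are made up for this sketch and are not part of any library.

```python
from collections import Counter

# Counts taken from the simple frequency distribution table.
gender_counts = {"Male": 182, "Female": 235, "Other": 27}

def to_percentages(counts):
    """Convert raw counts to percentages of the total (1 decimal place)."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}

# Bin edges mirror the grouped table: 0-4, 5-8, 9-12, 13-16, 17+.
BINS = [(0, 4, "0-4"), (5, 8, "5-8"), (9, 12, "9-12"), (13, 16, "13-16")]

def group_label(value):
    """Return the label of the bin that a non-negative count falls into."""
    if value < 0:
        raise ValueError("number of visits cannot be negative")
    for low, high, label in BINS:
        if low <= value <= high:
            return label
    return "17+"

def grouped_frequency(values):
    """Count how many responses fall into each bin, keeping empty bins."""
    counts = Counter(group_label(v) for v in values)
    labels = [label for _, _, label in BINS] + ["17+"]
    return {label: counts.get(label, 0) for label in labels}

gender_percent = to_percentages(gender_counts)
visits = [15, 3, 12, 0, 24, 3]  # first 6 survey responses
grouped = grouped_frequency(visits)

print(gender_percent)
print(grouped)
```

Keeping empty bins in the result (here, 5-8) makes the grouped table easier to compare across samples.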

Measures of central tendency estimate the center, or average, of a data set. The mean, median and mode are 3 ways of finding the average.

Here we will demonstrate how to calculate the mean, median, and mode using the first 6 responses of our survey.

The mean, or M, is the most commonly used method for finding the average.

To find the mean, simply add up all response values and divide the sum by the total number of responses. The total number of responses or observations is called N.

Mean number of library visits

Data set | 15, 3, 12, 0, 24, 3
Sum of all values | 15 + 3 + 12 + 0 + 24 + 3 = 57
Total number of responses | N = 6
Mean | Divide the sum of values by N to find M: 57/6 = 9.5

The median is the value that’s exactly in the middle of a data set.

To find the median, order each response value from the smallest to the biggest. Then, the median is the number in the middle. If there are two numbers in the middle, find their mean.

Median number of library visits

Ordered data set | 0, 3, 3, 12, 15, 24
Middle numbers | 3, 12
Median | Find the mean of the two middle numbers: (3 + 12)/2 = 7.5

The mode is simply the most popular or most frequent response value. A data set can have no mode, one mode, or more than one mode.

To find the mode, order your data set from lowest to highest and find the response that occurs most frequently.

Mode number of library visits

Ordered data set | 0, 3, 3, 12, 15, 24
Mode | Find the most frequently occurring response: 3
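
The three calculations above can be checked with Python's standard `statistics` module. This is a minimal sketch using the same six responses, not a prescribed workflow.

```python
import statistics

visits = [15, 3, 12, 0, 24, 3]  # first 6 survey responses

# Mean: sum of all values divided by N (57 / 6).
mean = statistics.mean(visits)

# Median: middle value of the ordered data; with an even N,
# the mean of the two middle values (3 and 12).
median = statistics.median(visits)

# Mode: multimode returns a list of every most-frequent value, so it
# also handles data sets with more than one mode. If every value
# occurs exactly once, it returns all of them.
modes = statistics.multimode(visits)

print(mean, median, modes)
```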

Measures of variability give you a sense of how spread out the response values are. The range, standard deviation and variance each reflect different aspects of spread.

The range gives you an idea of how far apart the most extreme response scores are. To find the range, simply subtract the lowest value from the highest value. In the example data set, the range is 24 – 0 = 24.

Standard deviation

The standard deviation (s or SD) is the average amount of variability in your data set. It tells you, on average, how far each score lies from the mean. The larger the standard deviation, the more variable the data set is.

There are six steps for finding the standard deviation:

  1. List each score and find their mean.
  2. Subtract the mean from each score to get the deviation from the mean.
  3. Square each of these deviations.
  4. Add up all of the squared deviations.
  5. Divide the sum of the squared deviations by N – 1.
  6. Find the square root of the number you found.

Raw data | Deviation from mean | Squared deviation
15 | 15 – 9.5 = 5.5 | 30.25
3 | 3 – 9.5 = -6.5 | 42.25
12 | 12 – 9.5 = 2.5 | 6.25
0 | 0 – 9.5 = -9.5 | 90.25
24 | 24 – 9.5 = 14.5 | 210.25
3 | 3 – 9.5 = -6.5 | 42.25
M = 9.5 | Sum = 0 | Sum of squares = 421.5

Step 5: 421.5/5 = 84.3

Step 6: √84.3 = 9.18
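
The six steps translate almost line for line into code. The sketch below uses a made-up helper name, `sample_standard_deviation`, and cross-checks the result against `statistics.stdev` from the standard library, which also divides by N – 1.

```python
import math
import statistics

def sample_standard_deviation(scores):
    """Follow the six steps; return (s, variance, sum of squares)."""
    n = len(scores)
    mean = sum(scores) / n                       # Step 1: find the mean
    deviations = [x - mean for x in scores]      # Step 2: deviations from the mean
    squared = [d ** 2 for d in deviations]       # Step 3: square each deviation
    sum_of_squares = sum(squared)                # Step 4: add them up
    variance = sum_of_squares / (n - 1)          # Step 5: divide by N - 1
    return math.sqrt(variance), variance, sum_of_squares  # Step 6: square root

visits = [15, 3, 12, 0, 24, 3]
s, variance, sum_of_squares = sample_standard_deviation(visits)
library_s = statistics.stdev(visits)  # standard-library cross-check

print(round(s, 2), round(variance, 1), sum_of_squares)
```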

The variance is the average of squared deviations from the mean. Variance reflects the degree of spread in the data set. The more spread the data, the larger the variance is in relation to the mean.

To find the variance, simply square the standard deviation. The symbol for variance is s². In the example above, the variance is 84.3, the value found in step 5.

Univariate descriptive statistics focus on only one variable at a time. It’s important to examine data from each variable separately using multiple measures of distribution, central tendency and spread. Programs like SPSS and Excel can be used to easily calculate these.

Visits to the library

N | 6
Mean | 9.5
Median | 7.5
Mode | 3
Standard deviation | 9.18
Variance | 84.3
Range | 24

If you consider only the mean as a measure of central tendency, your impression of the “middle” of the data set can be skewed by outliers. The median and mode are less affected by extreme values.

Likewise, because the range is sensitive to outliers, you should also consider the standard deviation and variance to get easily comparable measures of spread.
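
To see how a single outlier affects each measure, the sketch below summarizes the six library-visit responses and then a hypothetical variant in which the value 24 is replaced by 240. The `describe` helper and the altered data set are illustrative assumptions, not part of the survey.

```python
import statistics

def describe(values):
    """Return the univariate summary measures discussed in this article."""
    return {
        "N": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.multimode(values),
        "standard deviation": round(statistics.stdev(values), 2),
        "variance": round(statistics.variance(values), 1),
        "range": max(values) - min(values),
    }

visits = [15, 3, 12, 0, 24, 3]         # responses from the example
with_outlier = [15, 3, 12, 0, 240, 3]  # hypothetical: 24 replaced by 240

summary = describe(visits)
outlier_summary = describe(with_outlier)

for name in summary:
    print(f"{name}: {summary[name]} -> {outlier_summary[name]}")
```

In this sketch the mean and range change sharply with the outlier, while the median and mode stay the same.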

If you’ve collected data on more than one variable, you can use bivariate or multivariate descriptive statistics to explore whether there are relationships between them.

In bivariate analysis, you simultaneously study the frequency and variability of two variables to see if they vary together. You can also compare the central tendency of the two variables before performing further statistical tests.

Multivariate analysis is the same as bivariate analysis but with more than two variables.

Contingency table

In a contingency table, each cell represents the intersection of two variables. Usually, an independent variable (e.g., gender) appears along the vertical axis and a dependent one appears along the horizontal axis (e.g., activities). You read “across” the table to see how the independent and dependent variables relate to each other.

Number of visits to the library in the past year

Group | 0–4 | 5–8 | 9–12 | 13–16 | 17+
Children | 32 | 68 | 37 | 23 | 22
Adults | 36 | 48 | 43 | 83 | 25

Interpreting a contingency table is easier when the raw data is converted to percentages. Percentages make each row comparable to the other by making it seem as if each group had only 100 observations or participants. When creating a percentage-based contingency table, you add the N for each independent variable on the end.

Visits to the library in the past year (Percentages)

Group | 0–4 | 5–8 | 9–12 | 13–16 | 17+ | N
Children | 18% | 37% | 20% | 13% | 12% | 182
Adults | 15% | 20% | 18% | 35% | 11% | 235

From this table, it is clearer that similar proportions of children and adults go to the library 17 or more times a year. Additionally, children most commonly went to the library between 5 and 8 times, while for adults, this number was between 13 and 16.
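
The conversion from raw counts to row percentages can be sketched as follows. The counts are those of the contingency table above. The function name `row_percentages` is an illustrative choice.

```python
# Column labels and raw counts from the contingency table.
BINS = ["0-4", "5-8", "9-12", "13-16", "17+"]
counts = {
    "Children": [32, 68, 37, 23, 22],
    "Adults": [36, 48, 43, 83, 25],
}

def row_percentages(table):
    """Convert each row to whole-number percentages and append its N."""
    result = {}
    for group, row in table.items():
        n = sum(row)  # row total, reported as N at the end of the row
        result[group] = {
            "percent": [round(100 * count / n) for count in row],
            "N": n,
        }
    return result

percent_table = row_percentages(counts)

for group, row in percent_table.items():
    cells = ", ".join(f"{b}: {p}%" for b, p in zip(BINS, row["percent"]))
    print(f"{group} ({cells}) N = {row['N']}")
```

Because each percentage is rounded independently, a row may not always add up to exactly 100.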

Scatter plots

A scatter plot is a chart that shows you the relationship between two or three variables. It’s a visual representation of the strength of a relationship.

In a scatter plot, you plot one variable along the x-axis and another one along the y-axis. Each data point is represented by a point in the chart.

From your scatter plot, you see that as the number of movies seen at movie theaters increases, the number of visits to the library decreases. Based on your visual assessment of a possible linear relationship, you perform further tests of correlation and regression.

Descriptive statistics: Scatter plot
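
One common follow-up to a visual assessment is a correlation coefficient. The sketch below computes Pearson's r by hand for paired values. The data are hypothetical and invented purely for illustration (the survey's movie-theater responses are not listed in this article), and Pearson's r is only one of several possible correlation measures.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of the deviations from each mean.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

# Hypothetical paired responses: movies seen at a theater vs. library visits.
movies = [0, 2, 4, 6, 8, 10]
library = [24, 15, 12, 3, 3, 0]

r = pearson_r(movies, library)
print(round(r, 2))
```

A negative r matches the downward pattern described above: more movie-theater visits go with fewer library visits.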

Frequently asked questions about descriptive statistics

Descriptive statistics summarize the characteristics of a data set. Inferential statistics allow you to test a hypothesis or assess whether your data is generalizable to the broader population.

The 3 main types of descriptive statistics concern the frequency distribution, central tendency, and variability of a data set.

  • Distribution refers to the frequencies of different responses.
  • Measures of central tendency give you the average value of the responses.
  • Measures of variability show you the spread or dispersion of your data set.

Univariate, bivariate, and multivariate statistics differ in the number of variables they cover.

  • Univariate statistics summarize only one variable at a time.
  • Bivariate statistics compare two variables.
  • Multivariate statistics compare more than two variables.

Cite this Scribbr article

Bhandari, P. (2023, June 21). Descriptive Statistics | Definitions, Types, Examples. Scribbr. Retrieved September 3, 2024, from https://www.scribbr.com/statistics/descriptive-statistics/

What is Descriptive Research? Definition, Methods, Types and Examples

Descriptive research is a methodological approach that seeks to depict the characteristics of a phenomenon or subject under investigation. In scientific inquiry, it serves as a foundational tool for researchers aiming to observe, record, and analyze the intricate details of a particular topic. This method provides a rich and detailed account that aids in understanding, categorizing, and interpreting the subject matter.

Descriptive research design is widely employed across diverse fields, and its primary objective is to systematically observe and document all variables and conditions influencing the phenomenon.

After this descriptive research definition, let’s look at this example. Consider a researcher working on climate change adaptation, who wants to understand water management trends in an arid village in a specific study area. She must conduct a demographic survey of the region, gather population data, and then conduct descriptive research on this demographic segment. The study will then uncover details on “what are the water management practices and trends in village X.” Note, however, that it will not cover any investigative information about “why” the patterns exist.

What is descriptive research?

If you’ve been wondering “What is descriptive research,” we’ve got you covered in this post! In a nutshell, descriptive research is an exploratory research method that helps a researcher describe a population, circumstance, or phenomenon. It can help answer what, where, when, and how questions, but not why questions. In other words, it does not involve changing the study variables and does not seek to establish cause-and-effect relationships.

Importance of descriptive research

Now, let’s delve into the importance of descriptive research. This research method acts as the cornerstone for various academic and applied disciplines. Its primary significance lies in its ability to provide a comprehensive overview of a phenomenon, enabling researchers to gain a nuanced understanding of the variables at play. This method aids in forming hypotheses, generating insights, and laying the groundwork for further in-depth investigations. The following points further illustrate its importance:

Provides insights into a population or phenomenon: Descriptive research furnishes a comprehensive overview of the characteristics and behaviors of a specific population or phenomenon, thereby guiding and shaping the research project.

Offers baseline data: The data acquired through this type of research acts as a reference for subsequent investigations, laying the groundwork for further studies.

Allows validation of sampling methods: Descriptive research validates sampling methods, aiding in the selection of the most effective approach for the study.

Helps reduce time and costs: It is cost-effective and time-efficient, making this an economical means of gathering information about a specific population or phenomenon.

Ensures replicability: Descriptive research is easily replicable, ensuring a reliable way to collect and compare information from various sources.

When to use descriptive research design?

Determining when to use descriptive research depends on the nature of the research question. Before diving into the reasons behind an occurrence, understanding the how, when, and where aspects is essential. Descriptive research design is a suitable option when the research objective is to discern characteristics, frequencies, trends, and categories without manipulating variables. It is therefore often employed in the initial stages of a study before progressing to more complex research designs. To put it another way, descriptive research precedes the hypotheses of explanatory research. It is particularly valuable when there is limited existing knowledge about the subject.

The following examples illustrate questions that arise before a clear outline of the research plan is established:

  • In the last two decades, what changes have occurred in patterns of urban gardening in Mumbai?
  • What are the differences in climate change perceptions of farmers in coastal versus inland villages in the Philippines?

Characteristics of descriptive research

Descriptive research is characterized by its focus on observing and documenting the features of a subject. Its specific characteristics are as follows.

  • Quantitative nature: Some descriptive research types involve quantitative research methods to gather quantifiable information for statistical analysis of the population sample.
  • Qualitative nature: Some descriptive research examples include those using the qualitative research method to describe or explain the research problem.
  • Observational nature: This approach is non-invasive and observational because the study variables remain untouched. Researchers merely observe and report, without introducing interventions that could impact the subject(s).
  • Cross-sectional nature: In descriptive research, different sections belonging to the same group are studied, providing a “snapshot” of sorts.
  • Springboard for further research: The data collected are further studied and analyzed using different research techniques. This approach helps guide the suitable research methods to be employed.

Types of descriptive research

There are various descriptive research types, each suited to different research objectives. Take a look at the different types below.

  • Surveys: This involves collecting data through questionnaires or interviews to gather qualitative and quantitative data.
  • Observational studies: This involves observing and collecting data on a particular population or phenomenon without influencing the study variables or manipulating the conditions. These may be further divided into cohort studies, case studies, and cross-sectional studies:
      • Cohort studies: Also known as longitudinal studies, these studies involve the collection of data over an extended period, allowing researchers to track changes and trends.
      • Case studies: These deal with a single individual, group, or event, which might be rare or unusual.
      • Cross-sectional studies: A researcher collects data at a single point in time, in order to obtain a snapshot of a specific moment.
  • Focus groups: In this approach, a small group of people are brought together to discuss a topic. The researcher moderates and records the group discussion. This can also be considered a “participatory” observational method.
  • Descriptive classification: Relevant to the biological sciences, this type of approach may be used to classify living organisms.

Descriptive research methods

Several descriptive research methods can be employed, and these are more or less similar to the types of approaches mentioned above.

  • Surveys: This method involves the collection of data through questionnaires or interviews. Surveys may be done online or offline, and the target subjects might be hyper-local, regional, or global.
  • Observational studies: These entail the direct observation of subjects in their natural environment. These include case studies, dealing with a single case or individual, as well as cross-sectional and longitudinal studies, for a glimpse into a population or changes in trends over time, respectively. Participatory observational studies such as focus group discussions may also fall under this method.

Researchers must carefully consider descriptive research methods, types, and examples to harness their full potential in contributing to scientific knowledge.

Examples of descriptive research

Now, let’s consider some descriptive research examples.

  • In social sciences, an example could be a study analyzing the demographics of a specific community to understand its socio-economic characteristics.
  • In business, a market research survey aiming to describe consumer preferences would be a descriptive study.
  • In ecology, a researcher might undertake a survey of all the types of monocots naturally occurring in a region and classify them up to species level.

These examples showcase the versatility of descriptive research across diverse fields.

Advantages of descriptive research

There are several advantages to this approach, which every researcher must be aware of. These are as follows:

  • Owing to the numerous descriptive research methods and types, primary data can be obtained in diverse ways and be used for developing a research hypothesis.
  • It is a versatile research method and allows flexibility.
  • Detailed and comprehensive information can be obtained because the data collected can be qualitative or quantitative.
  • It is carried out in the natural environment, which greatly minimizes certain types of bias and ethical concerns.
  • It is an inexpensive and efficient approach, even with large sample sizes.

Disadvantages of descriptive research

On the other hand, this design has some drawbacks as well:

  • It is limited in its scope as it does not determine cause-and-effect relationships.
  • The approach does not generate new information and simply depends on existing data.
  • Study variables are not manipulated or controlled, and this limits the conclusions to be drawn.
  • Descriptive research findings may not be generalizable to other populations.
  • Finally, it offers a preliminary understanding rather than an in-depth understanding.

To reiterate, the advantages of descriptive research lie in its ability to provide a comprehensive overview, aid hypothesis generation, and serve as a preliminary step in the research process. However, its limitations include a potential lack of depth, inability to establish cause-and-effect relationships, and susceptibility to bias.

Frequently asked questions

When should researchers conduct descriptive research?

Descriptive research is most appropriate when researchers aim to portray and understand the characteristics of a phenomenon without manipulating variables. It is particularly valuable in the early stages of a study.

What is the difference between descriptive and exploratory research?

Descriptive research focuses on providing a detailed depiction of a phenomenon, while exploratory research aims to explore and generate insights into an issue where little is known.

What is the difference between descriptive and experimental research?

Descriptive research observes and documents without manipulating variables, whereas experimental research involves intentional interventions to establish cause-and-effect relationships.

Is descriptive research only for social sciences?

No, various descriptive research types may be applicable to all fields of study, including social science, humanities, physical science, and biological science.

How important is descriptive research?

The importance of descriptive research lies in its ability to provide a glimpse of the current state of a phenomenon, offering valuable insights and establishing a basic understanding. Further, the advantages of descriptive research include its capacity to offer a straightforward depiction of a situation or phenomenon, facilitate the identification of patterns or trends, and serve as a useful starting point for more in-depth investigations. Additionally, descriptive research can contribute to the development of hypotheses and guide the formulation of research questions for subsequent studies.

Descriptive Research: Definition, Characteristics, Methods + Examples

Suppose an apparel brand wants to understand the fashion purchasing trends among New York’s buyers. It must conduct a demographic survey of the region, gather population data, and then conduct descriptive research on this demographic segment.

The study will then uncover details on “what is the purchasing pattern of New York buyers,” but will not cover any investigative information about “why” the patterns exist. For an apparel brand trying to break into this market, understanding the nature of the market is the study’s main goal.

What is descriptive research?

Descriptive research is a research method describing the characteristics of the population or phenomenon studied. This descriptive methodology focuses more on the “what” of the research subject than the “why” of the research subject.

The method primarily focuses on describing the nature of a demographic segment without focusing on “why” a particular phenomenon occurs. In other words, it “describes” the research subject without covering “why” it happens.

Characteristics of descriptive research

The term descriptive research then refers to research questions, the design of the study, and data analysis conducted on that topic. We call it an observational research method because none of the research study variables are influenced in any capacity.

Some distinctive characteristics of descriptive research are:

  • Quantitative research: It is a quantitative research method that attempts to collect quantifiable information for statistical analysis of the population sample. It is a popular market research tool that allows us to collect and describe the demographic segment’s nature.
  • Uncontrolled variables: In descriptive research, none of the variables are influenced in any way. The research relies on observational methods, so the nature of the variables and their behavior is not in the hands of the researcher.
  • Cross-sectional studies: It is generally a cross-sectional study where different sections belonging to the same group are studied.
  • The basis for further research: Researchers can examine the data collected and analyzed in descriptive research further using different research techniques. The data can also help point towards the types of research methods to use for the subsequent research.

Applications of descriptive research with examples

A descriptive research method can be used in multiple ways and for various reasons. Before starting any survey, though, defining the survey goals and survey design is crucial. Even with these steps in place, there is no guarantee that the research will achieve its intended outcome. Below are some ways organizations currently use descriptive research:

  • Define respondent characteristics: The aim of using closed-ended questions is to draw concrete conclusions about the respondents. This could be the need to derive patterns, traits, and behaviors of the respondents. It could also be to understand a respondent’s attitude or opinion about the phenomenon. For example, a study might measure the hours per week that millennials spend browsing the internet. All this information helps the organization conducting the research make informed business decisions.
  • Measure data trends: Researchers measure data trends over time with a descriptive research design’s statistical capabilities. Consider an apparel company that researches different demographics, such as the 24–35 and 36–45 age groups, on the launch of a new range of autumn wear. If one of those groups doesn’t take well to the new launch, it provides insight into which clothes customers like and which they do not. The brand can then drop the clothes and apparel that customers don’t like.
  • Conduct comparisons: Organizations also use a descriptive research design to understand how different groups respond to a specific product or service. For example, an apparel brand creates a survey asking general questions that measure the brand’s image. The same study also asks demographic questions like age, income, gender, geographical location, geographic segmentation, etc. This consumer research helps the organization understand what aspects of the brand appeal to the population and what aspects do not. It also helps make product or marketing fixes or even create a new product line to cater to high-growth potential groups.
  • Validate existing conditions: Researchers widely use descriptive research to help ascertain the research object’s prevailing conditions and underlying patterns. Due to the non-invasive research method and the use of quantitative observation and some aspects of qualitative observation, researchers observe each variable and conduct an in-depth analysis. Researchers also use it to validate any existing conditions that may be prevalent in a population.
  • Conduct research at different times: The analysis can be conducted at different periods to ascertain any similarities or differences. This also allows any number of variables to be evaluated. For verification, studies on prevailing conditions can also be repeated to draw trends.

Advantages of descriptive research

Some of the significant advantages of descriptive research are:

  • Data collection: A researcher can conduct descriptive research using specific methods like observational method, case study method, and survey method. Between these three, all primary data collection methods are covered, which provides a lot of information. This can be used for future research or even for developing a hypothesis for your research object.
  • Varied: Since the data collected is qualitative and quantitative, it gives a holistic understanding of a research topic. The information is varied, diverse, and thorough.
  • Natural environment: Descriptive research allows for the research to be conducted in the respondent’s natural environment, which ensures that high-quality and honest data is collected.
  • Quick to perform and cheap: As the sample size is generally large in descriptive research, the data collection is quick to conduct and is inexpensive.

Descriptive research methods

There are three distinctive methods to conduct descriptive research. They are:

Observational method

The observational method is the most effective method to conduct this research, and researchers make use of both quantitative and qualitative observations.

A quantitative observation is the objective collection of data primarily focused on numbers and values. The term “quantitative” means “associated with, of, or depicted in terms of a quantity.” Results of quantitative observation are derived using statistical and numerical analysis methods. It implies observation of any entity associated with a numeric value such as age, shape, weight, volume, scale, etc. For example, the researcher can track whether current customers will refer the brand using a simple Net Promoter Score question.
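
As a small illustration of a quantitative observation, the sketch below scores a Net Promoter Score question. It assumes the commonly used convention of a 0–10 scale on which ratings of 9–10 count as promoters and 0–6 as detractors. The responses are hypothetical, and the function name is an illustrative choice.

```python
def net_promoter_score(ratings):
    """Percentage of promoters minus percentage of detractors.

    Assumes the common convention: ratings are on a 0-10 scale,
    9-10 are promoters, 0-6 are detractors, 7-8 are passives.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 0 or r > 10 for r in ratings):
        raise ValueError("ratings must be between 0 and 10")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100 * (promoters - detractors) / len(ratings)

# Hypothetical answers to "How likely are you to recommend this brand?"
ratings = [10, 9, 8, 7, 6, 10, 3, 9, 5, 10]
nps = net_promoter_score(ratings)
print(nps)
```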

Qualitative observation doesn’t involve measurements or numbers but instead just monitoring characteristics. In this case, the researcher observes the respondents from a distance. Since the respondents are in a comfortable environment, the characteristics observed are natural and effective. In a descriptive research design, the researcher can choose to be either a complete observer, an observer as a participant, a participant as an observer, or a full participant. For example, in a supermarket, a researcher can from afar monitor and track the customers’ selection and purchasing trends. This offers a more in-depth insight into the purchasing experience of the customer.

Case study method

Case studies involve in-depth research and study of individuals or groups. Case studies lead to a hypothesis and widen a further scope of studying a phenomenon. However, case studies should not be used to determine cause and effect as they can’t make accurate predictions because there could be a bias on the researcher’s part. The other reason why case studies are not a reliable way of conducting descriptive research is that there could be an atypical respondent in the survey. Describing them leads to weak generalizations and moving away from external validity.

Survey research

In survey research, respondents answer through surveys, questionnaires, or polls. They are a popular market research tool to collect feedback from respondents. A study that aims to gather useful data needs the right survey questions: a balanced mix of open-ended and closed-ended questions. The survey method can be conducted online or offline, making it the go-to option for descriptive research where the sample size is enormous.

Examples of descriptive research

Some examples of descriptive research are:

  • A specialty food group launching a new range of barbecue rubs would like to understand what flavors of rubs are favored by different people. To understand the preferred flavor palette, they conduct this type of research study using various methods, such as observation in supermarkets. Surveying shoppers while collecting in-depth demographic information also offers insights into the preferences of different markets. This can help tailor the rubs and spreads to the meats preferred in each demographic. Conducting this type of research helps the organization tweak its business model and amplify marketing in core markets.
  • Another example of where this research can be used is if a school district wishes to evaluate teachers’ attitudes about using technology in the classroom. By conducting surveys and observing teachers’ comfort with technology, the researcher can gauge whether a full-fledged implementation is likely to face issues. This also helps in understanding whether students are affected in any way by this change.

Some other research problems and research questions that can lead to descriptive research are:

  • Market researchers want to observe the habits of consumers.
  • A company wants to evaluate the morale of its staff.
  • A school district wants to understand if students will access online lessons rather than textbooks.
  • An organization wants to understand if its wellness questionnaire programs enhance the overall health of its employees.

Descriptive Analytics – Methods, Tools and Examples

Descriptive Analytics

Definition:

Descriptive analytics focuses on describing or summarizing raw data and making it interpretable. This type of analytics provides insight into what has happened in the past. It involves the analysis of historical data to identify patterns, trends, and insights. Descriptive analytics often uses visualization tools to represent the data in a way that is easy to interpret.

Descriptive Analytics in Research

Descriptive analytics plays a crucial role in research, helping investigators understand and describe the data collected in their studies. Here’s how descriptive analytics is typically used in a research setting:

  • Descriptive Statistics: In research, descriptive analytics often takes the form of descriptive statistics. This includes calculating measures of central tendency (like mean, median, and mode), measures of dispersion (like range, variance, and standard deviation), and measures of frequency (like count, percent, and frequency). These calculations help researchers summarize and understand their data.
  • Visualizing Data: Descriptive analytics also involves creating visual representations of data to better understand and communicate research findings. This might involve creating bar graphs, line graphs, pie charts, scatter plots, box plots, and other visualizations.
  • Exploratory Data Analysis: Before conducting any formal statistical tests, researchers often conduct an exploratory data analysis, which is a form of descriptive analytics. This might involve looking at distributions of variables, checking for outliers, and exploring relationships between variables.
  • Initial Findings: Descriptive analytics are often reported in the results section of a research study to provide readers with an overview of the data. For example, a researcher might report average scores, demographic breakdowns, or the percentage of participants who endorsed each response on a survey.
  • Establishing Patterns and Relationships: Descriptive analytics helps in identifying patterns, trends, or relationships in the data, which can guide subsequent analysis or future research. For instance, researchers might look at the correlation between variables as a part of descriptive analytics.
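
The measures listed above can be sketched with Python's standard library. This is a minimal example on made-up survey ratings; the variable names are illustrative, and the variance shown is the sample variance (dividing by n - 1), which is one of two common conventions.

```python
import statistics
from collections import Counter

# Hypothetical ratings on a 1-5 scale from eight respondents
scores = [4, 5, 3, 5, 2, 5, 4, 3]

# Central tendency
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Dispersion
value_range = max(scores) - min(scores)
variance = statistics.variance(scores)  # sample variance (n - 1)
std_dev = statistics.stdev(scores)      # square root of the sample variance

# Frequency: count and percent for each distinct value
counts = Counter(scores)
percents = {value: 100 * n / len(scores) for value, n in counts.items()}

print(mean, median, mode, value_range)
print(round(variance, 3), round(std_dev, 3))
for value in sorted(counts):
    print(value, counts[value], f"{percents[value]:.1f}%")
```

If the data covers an entire population rather than a sample, `statistics.pvariance` and `statistics.pstdev` would be the corresponding calls.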

Descriptive Analytics Techniques

Descriptive analytics involves a variety of techniques to summarize, interpret, and visualize historical data. Some commonly used techniques include:

Statistical Analysis

This includes basic statistical methods like mean, median, mode (central tendency), standard deviation, variance (dispersion), correlation, and regression (relationships between variables).
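
For the relationship measures, a hand-rolled Pearson correlation coefficient is one way to see what is being computed. The sketch below is an assumption-laden illustration: the function name and the two variables (hours studied and test scores) are invented for the example.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical data: hours studied and test scores for five students
hours_studied = [1, 2, 3, 4, 5]
test_scores = [52, 55, 61, 70, 72]
r = pearson_r(hours_studied, test_scores)
print(f"r = {r:.3f}")
```

A value of r close to 1 or -1 describes a strong linear association in the data; on its own it says nothing about which variable, if either, causes the other.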

Data Aggregation

It is the process of compiling and summarizing data to obtain a general perspective. It can involve methods like sum, count, average, min, max, etc., often applied to a group of data.
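
As a small sketch of aggregation applied to groups, the example below computes sum, count, average, min, and max per region. The order records and region names are hypothetical.

```python
from collections import defaultdict

# Hypothetical order records: (region, order amount)
orders = [
    ("North", 120.0),
    ("South", 80.0),
    ("North", 60.0),
    ("South", 100.0),
    ("South", 30.0),
]

# Group the amounts by region
groups = defaultdict(list)
for region, amount in orders:
    groups[region].append(amount)

# Apply the aggregate functions to each group
summary = {
    region: {
        "sum": sum(amounts),
        "count": len(amounts),
        "average": sum(amounts) / len(amounts),
        "min": min(amounts),
        "max": max(amounts),
    }
    for region, amounts in groups.items()
}

for region, stats in sorted(summary.items()):
    print(region, stats)
```

The same group-then-summarize pattern is what a spreadsheet pivot table or a SQL `GROUP BY` query performs.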

Data Mining

This involves analyzing large volumes of data to discover patterns, trends, and insights. Techniques used in data mining can include clustering (grouping similar data), classification (assigning data into categories), association rules (finding relationships between variables), and anomaly detection (identifying outliers).

Data Visualization

This involves presenting data in a graphical or pictorial format to provide clear and easy understanding of the data patterns, trends, and insights. Common data visualization methods include bar charts, line graphs, pie charts, scatter plots, histograms, and more complex forms like heat maps and interactive dashboards.

Reporting

This involves organizing data into informational summaries to monitor how different areas of a business are performing. Reports can be generated manually or automatically and can be presented in tables, graphs, or dashboards.

Cross-tabulation (or Pivot Tables)

It involves displaying the relationship between two or more variables in a tabular form. It can provide a deeper understanding of the data by allowing comparisons and revealing patterns and correlations that may not be readily apparent in raw data.
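
A cross-tabulation can be sketched by counting each pair of values and laying the counts out as rows and columns. The age groups and yes/no answers below are invented for illustration.

```python
from collections import Counter

# Hypothetical survey rows: (age group, would recommend?)
responses = [
    ("18-34", "Yes"),
    ("18-34", "No"),
    ("35-54", "Yes"),
    ("18-34", "Yes"),
    ("55+", "No"),
    ("35-54", "No"),
    ("55+", "No"),
]

# Count every (row value, column value) combination
pair_counts = Counter(responses)
rows = sorted({age for age, _ in responses})
cols = sorted({answer for _, answer in responses})

# Build the table; a Counter returns 0 for combinations that never occur
crosstab = {r: {c: pair_counts[(r, c)] for c in cols} for r in rows}

print("group".ljust(8) + "".join(c.rjust(5) for c in cols))
for r in rows:
    print(r.ljust(8) + "".join(str(crosstab[r][c]).rjust(5) for c in cols))
```

Reading across a row shows how one group's answers split; reading down a column compares groups on the same answer.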

Descriptive Modeling

Some techniques use complex algorithms to interpret data. Examples include decision tree analysis, which provides a graphical representation of decision-making situations, and neural networks, which are used to identify correlations and patterns in large data sets.

Descriptive Analytics Tools

Some common Descriptive Analytics Tools are as follows:

Excel: Microsoft Excel is a widely used tool that can be used for simple descriptive analytics. It has powerful statistical and data visualization capabilities. Pivot tables are a particularly useful feature for summarizing and analyzing large data sets.

Tableau: Tableau is a data visualization tool that is used to represent data in a graphical or pictorial format. It can handle large data sets and allows for real-time data analysis.

Power BI: Power BI, another product from Microsoft, is a business analytics tool that provides interactive visualizations with self-service business intelligence capabilities.

QlikView: QlikView is a data visualization and discovery tool. It allows users to analyze data and use this data to support decision-making.

SAS: SAS is a software suite that can mine, alter, manage and retrieve data from a variety of sources and perform statistical analysis on it.

SPSS: SPSS (Statistical Package for the Social Sciences) is a software package used for statistical analysis. It’s widely used in social sciences research but also in other industries.

Google Analytics: For web data, Google Analytics is a popular tool. It allows businesses to analyze in-depth detail about the visitors on their website, providing valuable insights that can help shape the success strategy of a business.

R and Python: Both are programming languages that have robust capabilities for statistical analysis and data visualization. With packages like pandas, matplotlib, seaborn in Python and ggplot2, dplyr in R, these languages are powerful tools for descriptive analytics.

Looker: Looker is a modern data platform that can take data from any database and let you start exploring and visualizing.

When to use Descriptive Analytics

Descriptive analytics forms the base of the data analysis workflow and is typically the first step in understanding your business or organization’s data. Here are some situations when you might use descriptive analytics:

Understanding Past Behavior: Descriptive analytics is essential for understanding what has happened in the past. If you need to understand past sales trends, customer behavior, or operational performance, descriptive analytics is the tool you’d use.

Reporting Key Metrics: Descriptive analytics is used to establish and report key performance indicators (KPIs). It can help in tracking and presenting these KPIs in dashboards or regular reports.

Identifying Patterns and Trends: If you need to identify patterns or trends in your data, descriptive analytics can provide these insights. This might include identifying seasonality in sales data, understanding peak operational times, or spotting trends in customer behavior.

Informing Business Decisions: The insights provided by descriptive analytics can inform business strategy and decision-making. By understanding what has happened in the past, you can make more informed decisions about what steps to take in the future.

Benchmarking Performance: Descriptive analytics can be used to compare current performance against historical data. This can be used for benchmarking and setting performance goals.

Auditing and Regulatory Compliance: In sectors where compliance and auditing are essential, descriptive analytics can provide the necessary data and trends over specific periods.

Initial Data Exploration: When you first acquire a dataset, descriptive analytics is useful to understand the structure of the data, the relationships between variables, and any apparent anomalies or outliers.

Examples of Descriptive Analytics

Examples of Descriptive Analytics are as follows:

Retail Industry: A retail company might use descriptive analytics to analyze sales data from the past year. They could break down sales by month to identify any seasonality trends. For example, they might find that sales increase in November and December due to holiday shopping. They could also break down sales by product to identify which items are the most popular. This analysis could inform their purchasing and stocking decisions for the next year. Additionally, data on customer demographics could be analyzed to understand who their primary customers are, guiding their marketing strategies.
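
The monthly breakdown described in the retail example might look like the following sketch. The transactions are made up, and a real analysis would use a full year of data rather than five rows.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactions: (sale date, amount)
sales = [
    (date(2023, 1, 15), 200.0),
    (date(2023, 6, 9), 250.0),
    (date(2023, 11, 3), 450.0),
    (date(2023, 11, 24), 600.0),
    (date(2023, 12, 20), 700.0),
]

# Total sales per calendar month
monthly_totals = defaultdict(float)
for day, amount in sales:
    monthly_totals[day.month] += amount

# The month with the highest total is a first hint of seasonality
peak_month = max(monthly_totals, key=monthly_totals.get)

for month in sorted(monthly_totals):
    print(f"{month:02d}: {monthly_totals[month]:.2f}")
print("Peak month:", peak_month)
```

Grouping by product instead of by month would follow the same pattern, with the product name as the dictionary key.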

Healthcare Industry: In healthcare, descriptive analytics could be used to analyze patient data over time. For instance, a hospital might analyze data on patient admissions to identify trends in admission rates. They might find that admissions for certain conditions are higher at certain times of the year. This could help them allocate resources more effectively. Also, analyzing patient outcomes data can help identify the most effective treatments or highlight areas where improvement is needed.

Finance Industry: A financial firm might use descriptive analytics to analyze historical market data. They could look at trends in stock prices, trading volume, or economic indicators to inform their investment decisions. For example, analyzing the price-earnings ratios of stocks in a certain sector over time could reveal patterns that suggest whether the sector is currently overvalued or undervalued. Similarly, credit card companies can analyze transaction data to detect any unusual patterns, which could be signs of fraud.

Advantages of Descriptive Analytics

Descriptive analytics plays a vital role in the world of data analysis, providing numerous advantages:

  • Understanding the Past: Descriptive analytics provides an understanding of what has happened in the past, offering valuable context for future decision-making.
  • Data Summarization: Descriptive analytics is used to simplify and summarize complex datasets, which can make the information more understandable and accessible.
  • Identifying Patterns and Trends: With descriptive analytics, organizations can identify patterns, trends, and correlations in their data, which can provide valuable insights.
  • Inform Decision-Making: The insights generated through descriptive analytics can inform strategic decisions and help organizations to react more quickly to events or changes in behavior.
  • Basis for Further Analysis: Descriptive analytics lays the groundwork for further analytical activities. It’s the first necessary step before moving on to more advanced forms of analytics like predictive analytics (forecasting future events) or prescriptive analytics (advising on possible outcomes).
  • Performance Evaluation: It allows organizations to evaluate their performance by comparing current results with past results, enabling them to see where improvements have been made and where further improvements can be targeted.
  • Enhanced Reporting and Dashboards: Through the use of visualization techniques, descriptive analytics can improve the quality of reports and dashboards, making the data more understandable and easier to interpret for stakeholders at all levels of the organization.
  • Immediate Value: Unlike some other types of analytics, descriptive analytics can provide immediate insights, as it doesn’t require complex models or deep analytical capabilities to provide value.

Disadvantages of Descriptive Analytics

While descriptive analytics offers numerous benefits, it also has certain limitations or disadvantages. Here are a few to consider:

  • Limited to Past Data: Descriptive analytics primarily deals with historical data and provides insights about past events. It does not predict future events or trends and can’t help you understand possible future outcomes on its own.
  • Lack of Deep Insights: While descriptive analytics helps in identifying what happened, it does not answer why it happened. For deeper insights, you would need to use diagnostic analytics, which analyzes data to understand the root cause of a particular outcome.
  • Can Be Misleading: If not properly executed, descriptive analytics can sometimes lead to incorrect conclusions. For example, correlation does not imply causation, but descriptive analytics might tempt one to make such an inference.
  • Data Quality Issues: The accuracy and usefulness of descriptive analytics are heavily reliant on the quality of the underlying data. If the data is incomplete, incorrect, or biased, the results of the descriptive analytics will be too.
  • Over-reliance on Descriptive Analytics: Businesses may rely too much on descriptive analytics and not enough on predictive and prescriptive analytics. While understanding past and present data is important, it’s equally vital to forecast future trends and make data-driven decisions based on those predictions.
  • Doesn’t Provide Actionable Insights: Descriptive analytics is used to interpret historical data and identify patterns and trends, but it doesn’t provide recommendations or courses of action. For that, prescriptive analytics is needed.

About the author

Muhammad Hassan

Researcher, Academic Writer, Web developer

Descriptive Research 101: Definition, Methods and Examples

Parvathi Vijayamohan

Last Updated: 16 July 2024

What is Descriptive Research?

Imagine you are a detective called to a crime scene. Your job is to study the scene and report whatever you find: whether that’s the half-smoked cigarette on the table or the large “RACHE” written in blood on the wall. That, in a nutshell, is descriptive research.

Researchers often need to do descriptive research on a problem before they attempt to solve it. So in this guide, we’ll take you through:

  • What is descriptive research + its characteristics
  • Descriptive research methods
  • Types of descriptive research
  • Descriptive research examples
  • Tips to excel at the descriptive method

Let’s begin by going through what descriptive studies can and cannot do.

Definition: As its name says, descriptive research describes the characteristics of the problem, phenomenon, situation, or group under study.

So the goal of all descriptive studies is to explore the background, details, and existing patterns in the problem to fully understand it. In other words, preliminary research.

However, descriptive research can be both preliminary and conclusive. You can use the data from a descriptive study to make reports and get insights for further planning.

What descriptive research isn’t: Descriptive research finds the what/when/where of a problem, not the why/how.

Because of this, we can’t use the descriptive method to explore cause-and-effect relationships where one variable (like a person’s job role) affects another variable (like their monthly income).

Key Characteristics of Descriptive Research

  • Answers the “what,” “when,” and “where” of a research problem. For this reason, it is popularly used in market research, awareness surveys, and opinion polls.
  • Sets the stage for a research problem. As an early part of the research process, descriptive studies help you dive deeper into the topic.
  • Opens the door for further research. You can use descriptive data as the basis for more profound research, analysis and studies.
  • Qualitative and quantitative research. It is possible to get a balanced mix of numerical responses and open-ended answers from the descriptive method.
  • No control or interference with the variables. The researcher simply observes and reports on them. However, specific research software has filters that allow her to zoom in on one variable.
  • Done in natural settings. You can get the best results from descriptive research by talking to people, surveying them, or observing them in a suitable environment. For example, suppose you run a website and are beta testing an app feature. In that case, descriptive research involves inviting users to try the feature, tracking their behavior, and then asking their opinions.
  • Can be applied to many research methods and areas. Examples include healthcare, SaaS, psychology, political studies, education, and pop culture.

Descriptive Research Methods: The Top Three You Need to Know!

1. Surveys

In short, survey research is a brief interview or conversation with a set of prepared questions about a topic. So you create a questionnaire, share it, and analyze the data you collect for further action.

  • Surveys can be hyper-local, regional, or global, depending on your objectives.
  • Share surveys in-person, offline, via SMS, email, or QR codes – so many options!
  • Easy to automate if you want to conduct many surveys over a period.

FYI: If you’re looking for the perfect tool to conduct descriptive research, SurveySparrow’s got you covered. Our AI-powered text and sentiment analysis help you instantly capture detailed insights for your studies.

With 1,000+ customizable (and free) survey templates , 20+ question types, and 1500+ integrations , SurveySparrow makes research super-easy.

2. Observation

The observational method is a type of descriptive research in which you, the researcher, observe ongoing behavior.

Now, there are several (non-creepy) ways you can observe someone. In fact, observational research has three main approaches:

  • Covert observation: In true spy fashion, the researcher mixes in with the group undetected or observes from a distance.
  • Overt observation: The researcher identifies himself as a researcher – “The name’s Bond. J. Bond.” – and explains the purpose of the study.
  • Participatory observation: The researcher participates in what he is observing to understand his topic better.

  • Observation is one of the most accurate ways to get data on a subject’s behavior in a natural setting.
  • You don’t need to rely on people’s willingness to share information.
  • Observation is a universal method that can be applied to any area of research.

3. Case Studies

In the case study method, you do a detailed study of a specific group, person, or event over a period.

This brings us to a frequently asked question: “What’s the difference between case studies and longitudinal studies?”

A case study goes very in-depth into its subject through one-on-one interviews, observations, and archival research. Case studies are also qualitative, though they sometimes use numbers and stats.

An example of longitudinal research would be a study of the health of night shift employees vs. general shift employees over a decade. An example of a case study would involve in-depth interviews with Casey, an assistant director of nursing who’s handled the night shift at the hospital for ten years now.

  • Due to the focus on a few people, case studies can give you a tremendous amount of information.
  • Because of the time and effort involved, a case study engages both researchers and participants.
  • Case studies are helpful for ethically investigating unusual, complex, or challenging subjects. An example would be a study of the habits of long-term cocaine users.

7 Types of Descriptive Research

  • Cross-sectional research: Studies a particular group of people or their sections at a given point in time. Example: current social attitudes of Gen Z in the US.
  • Longitudinal research: Studies a group of people over a long period of time. Example: tracking changes in social attitudes among Gen-Zers from 2022 – 2032.
  • Normative research: Compares the results of a study against the existing norms. Example: comparing a verdict in a legal case against similar cases.
  • Correlational/relational research: Investigates the type of relationship and patterns between 2 variables. Example: music genres and mental states.
  • Comparative research: Compares 2 or more similar people, groups or conditions based on specific traits. Example: job roles of employees in similar positions from two different companies.
  • Classification research: Arranges the data into classes according to certain criteria for better analysis. Example: the classification of newly discovered insects into species.
  • Archival research: Searches for and extracts information from past records. Example: tracking US Census data over the decades.

Descriptive Research Question Examples

  • How have teen social media habits changed in 10 years?
  • What causes high employee turnover in tech?
  • How do urban and rural diets differ in India?
  • What are consumer preferences for electric vs. gasoline cars in Germany?
  • How common is smartphone addiction among UK college students?
  • What drives customer satisfaction in banking?
  • How have adolescent mental health issues changed in 15 years?
  • What leisure activities are popular among retirees in Japan?
  • How do commute times vary in US metro areas?
  • What makes e-commerce websites successful?

Descriptive Research: Real-World Examples To Build Your Next Study

1. Case Study – Airbnb’s Growth Strategy

In an excellent case study, Tam Al Saad, Principal Consultant, Strategy + Growth at Webprofits, deep dives into how Airbnb attracted and retained 150 million users .

“What Airbnb offers isn’t a cheap place to sleep when you’re on holiday; it’s the opportunity to experience your destination as a local would. It’s the chance to meet the locals, experience the markets, and find non-touristy places.

Sure, you can visit the Louvre, see Buckingham Palace, and climb the Empire State Building, but you can do it as if it were your hometown while staying in a place that has character and feels like a home.” – Tam Al Saad, Principal Consultant, Strategy + Growth at Webprofits

2. Observation – Better Tech Experiences for the Elderly

We often think that our elders are so hopeless with technology. But we’re not getting any younger either, and tech is changing at a hair trigger! This article by Annemieke Hendricks shares a wonderful example where researchers compare the levels of technological familiarity between age groups and how that influences usage.

“It is generally assumed that older adults have difficulty using modern electronic devices, such as mobile telephones or computers. Because this age group is growing in most countries, changing products and processes to adapt to their needs is increasingly more important.” – Annemieke Hendricks, Marketing Communication Specialist, Noldus

3. Surveys – Decoding Sleep with SurveySparrow

SRI International (formerly Stanford Research Institute) – an independent, non-profit research center – wanted to investigate the impact of stress on an adolescent’s sleep. To get those insights, two actions were essential: tracking sleep patterns through wearable devices and sending surveys at a pre-set time – the pre-sleep period.

“With SurveySparrow’s recurring surveys feature, SRI was able to share engaging surveys with their participants exactly at the time they wanted and at the frequency they preferred.”

Read more about this project : How SRI International decoded sleep patterns with SurveySparrow

Tips to Excel at Descriptive Research

#1: Answer the six Ws

  • Who should we consider?
  • What information do we need?
  • When should we collect the information?
  • Where should we collect the information?
  • Why are we obtaining the information?
  • In what way should we collect the information?

#2: Introduce and explain your methodological approach

#3: Describe your methods of data collection and/or selection.

#4: Describe your methods of analysis.

#5: Explain the reasoning behind your choices.

#6: Collect data.

#7: Analyze the data. Use software to speed up the process and reduce overthinking and human error.

#8: Report your conclusions and how you drew the results.

Wrapping Up

Whether it’s social media habits, consumer preferences, or mental health trends, descriptive research provides a clear snapshot into what people actually think.

Content marketer at SurveySparrow.

Parvathi is a sociologist turned marketer. After 6 years as a copywriter, she pivoted to B2B, diving into growth marketing for SaaS. Now she uses content and conversion optimization to fuel growth - focusing on CX, reputation management and feedback methodology for businesses.

Descriptive research: what it is and how to use it.

Understanding the who, what and where of a situation or target group is an essential part of effective research and making informed business decisions.

For example you might want to understand what percentage of CEOs have a bachelor’s degree or higher. Or you might want to understand what percentage of low income families receive government support – or what kind of support they receive.

Descriptive research is what will be used in these types of studies.

In this guide we’ll look through the main issues relating to descriptive research to give you a better understanding of what it is, and how and why you can use it.

What is descriptive research?

Descriptive research is a research method used to try and determine the characteristics of a population or particular phenomenon.

Using descriptive research you can identify patterns in the characteristics of a group to essentially establish everything you need to understand apart from why something has happened.

Market researchers use descriptive research for a range of commercial purposes to guide key decisions.

For example, you could use descriptive research to understand fashion trends in a given city when planning your clothing collection for the year. Using descriptive research, you can conduct in-depth analysis of the demographic makeup of your target area and use the data analysis to establish buying patterns.

Conducting descriptive research wouldn’t, however, tell you why shoppers are buying a particular type of fashion item.

Descriptive research design

Descriptive research design draws on both qualitative and quantitative data (although quantitative research is the primary research method) to gather information about a particular problem or hypothesis and make accurate predictions about it.

As a survey method, descriptive research designs will help researchers identify characteristics in their target market or particular population.

These characteristics in the population sample can be identified, observed and measured to guide decisions.

Descriptive research characteristics

While there are a number of descriptive research methods you can deploy for data collection, descriptive research does have a number of predictable characteristics.

Here are a few of the things to consider:

Measure data trends with statistical outcomes

Descriptive research is often popular for survey research because it generates answers in a statistical form, which makes it easy for researchers to carry out a simple statistical analysis to interpret what the data is saying.

Descriptive research design is ideal for further research

Because the data collection for descriptive research produces statistical outcomes, it can also be used as secondary data for another research study.

Plus, the data collected from descriptive research can be subjected to other types of data analysis .

Uncontrolled variables

A key component of the descriptive research method is that it uses random variables that are not controlled by the researchers. This is because descriptive research aims to understand the natural behavior of the research subject.

It’s carried out in a natural environment

Descriptive research is often carried out in a natural environment. This is because researchers aim to gather data in a natural setting to avoid swaying respondents.

Data can be gathered using survey questions or online surveys.

For example, if you want to understand the fashion trends we mentioned earlier, you would set up a study in which a researcher observes people in the respondent’s natural environment to understand their habits and preferences.

Descriptive research allows for cross sectional study

Because of the nature of descriptive research design and the randomness of the sample group being observed, descriptive research is ideal for cross sectional studies – essentially the demographics of the group can vary widely and your aim is to gain insights from within the group.

This can be highly beneficial when you’re looking to understand the behaviors or preferences of a wider population.

Descriptive research advantages

There are many advantages to using descriptive research. Some of them include:

Cost effectiveness

Because the elements needed for descriptive research design are not specific or highly targeted (and occur within the respondent’s natural environment) this type of study is relatively cheap to carry out.

Multiple types of data can be collected

A big advantage of this research type is that you can use it to collect both quantitative and qualitative data. This means you can use the stats gathered to easily identify underlying patterns in your respondents’ behavior.

Descriptive research disadvantages

Potential reliability issues

When conducting descriptive research it’s important that the initial survey questions are properly formulated.

If not, it could make the answers unreliable and risk the credibility of your study.

Potential limitations

As we’ve mentioned, descriptive research design is ideal for understanding the what, who or where of a situation or phenomenon.

However, it can’t help you understand the cause or effect of the behavior. This means you’ll need to conduct further research to get a more complete picture of a situation.

Descriptive research methods

Because descriptive research methods include a range of quantitative and qualitative research, there are several research methods you can use.

Use case studies

Case studies in descriptive research involve conducting in-depth and detailed studies in which researchers get a specific person or case to answer questions.

Case studies shouldn’t be used to generate results. Rather, they should be used to build or establish hypotheses that you can expand into further market research .

For example, you could gather detailed data about a specific business phenomenon, and then use this deeper understanding of that specific case to form hypotheses for further research.

Use observational methods

This type of study uses qualitative observations to understand human behavior within a particular group.

By understanding how the different demographics respond within your sample you can identify patterns and trends.

As an observational method, descriptive research will not tell you the cause of any particular behaviors, but that could be established with further research.

Use survey research

Surveys are one of the most cost effective ways to gather descriptive data.

An online survey or questionnaire can be used in descriptive studies to gather quantitative information about a particular problem.

Survey research is ideal if you’re using descriptive research as your primary research.

Descriptive research examples

Descriptive research is used for a number of commercial purposes or when organizations need to understand the behaviors or opinions of a population.

One of the biggest examples of descriptive research, used in every democratic country, is election polling.

Researchers use surveys to understand which of the available parties or candidates voters are more likely to choose.

Using the data provided, researchers can estimate what the election result is likely to be.

In a commercial setting, retailers often use descriptive research to figure out trends in shopping and buying decisions.

By gathering information on the habits of shoppers, retailers can get a better understanding of the purchases being made.

Another example that is widely used around the world is the national census that takes place to understand the population.

The research will provide a more accurate picture of a population’s demographic makeup and help to understand changes over time in areas like population age, health and education level.

Where Qualtrics helps with descriptive research

Whatever type of research you want to carry out, there’s a survey type that will work.

Qualtrics can help you determine the appropriate method and ensure you design a study that will deliver the insights you need.

Our experts can help you with your market research needs , ensuring you get the most out of Qualtrics market research software to design, launch and analyze your data to guide better, more accurate decisions for your organization.

Descriptive Statistics | Definitions, Types, Examples

Published on 4 November 2022 by Pritha Bhandari . Revised on 9 January 2023.

Descriptive statistics summarise and organise characteristics of a data set. A data set is a collection of responses or observations from a sample or entire population .

In quantitative research , after collecting data, the first step of statistical analysis is to describe characteristics of the responses, such as the average of one variable (e.g., age), or the relation between two variables (e.g., age and creativity).

The next step is inferential statistics , which help you decide whether your data confirms or refutes your hypothesis and whether it is generalisable to a larger population.

Table of contents

Types of descriptive statistics, frequency distribution, measures of central tendency, measures of variability, univariate descriptive statistics, bivariate descriptive statistics, frequently asked questions.

There are 3 main types of descriptive statistics:

  • The distribution concerns the frequency of each value.
  • The central tendency concerns the averages of the values.
  • The variability or dispersion concerns how spread out the values are.

Types of descriptive statistics

You can apply these to assess only one variable at a time, in univariate analysis, or to compare two or more, in bivariate and multivariate analysis.

  • Go to a library
  • Watch a movie at a theater
  • Visit a national park

A data set is made up of a distribution of values, or scores. In tables or graphs, you can summarise the frequency of every possible value of a variable in numbers or percentages.

  • Simple frequency distribution table
  • Grouped frequency distribution table
Gender Number
Male 182
Female 235
Other 27

From this table, you can see that more women than men or people with another gender identity took part in the study. In a grouped frequency distribution, you can group numerical response values and add up the number of responses for each group. You can also convert each of these numbers to percentages.

Library visits in the past year Percent
0–4 6%
5–8 20%
9–12 42%
13–16 24%
17+ 8%
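As a rough illustration of how the two kinds of table are built, the sketch below counts a small set of made-up responses (the `visits` list is hypothetical, not the survey data) and then groups them into the same bins as the table above.

```python
from collections import Counter

# Hypothetical raw responses: number of library visits in the past year.
visits = [15, 3, 12, 0, 24, 3, 7, 10, 9, 14]

# Simple frequency distribution: how often each distinct value occurs.
simple_freq = Counter(visits)

# Bins taken from the grouped table; None means "no upper limit".
BINS = [("0–4", 0, 4), ("5–8", 5, 8), ("9–12", 9, 12),
        ("13–16", 13, 16), ("17+", 17, None)]

def grouped_percentages(values):
    """Return {bin label: percentage of values that fall in the bin}."""
    result = {}
    for label, low, high in BINS:
        count = sum(1 for v in values
                    if v >= low and (high is None or v <= high))
        result[label] = 100 * count / len(values)
    return result

grouped = grouped_percentages(visits)
print(simple_freq)
print(grouped)
```

Percentages are usually easier to compare across samples of different sizes than raw counts.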

Measures of central tendency estimate the center, or average, of a data set. The mean , median and mode are 3 ways of finding the average.

Here we will demonstrate how to calculate the mean, median, and mode using the first 6 responses of our survey.

The mean , or M , is the most commonly used method for finding the average.

To find the mean, simply add up all response values and divide the sum by the total number of responses. The total number of responses or observations is called N .

Mean number of library visits
Data set 15, 3, 12, 0, 24, 3
Sum of all values 15 + 3 + 12 + 0 + 24 + 3 = 57
Total number of responses N = 6
Mean Divide the sum of values by N to find M : 57/6 = 9.5

The median is the value that’s exactly in the middle of a data set.

To find the median, order each response value from the smallest to the biggest. Then, the median is the number in the middle. If there are two numbers in the middle, find their mean.

Median number of library visits
Ordered data set 0, 3, 3, 12, 15, 24
Middle numbers 3, 12
Median Find the mean of the two middle numbers: (3 + 12)/2 = 7.5

The mode is simply the most popular or most frequent response value. A data set can have no mode, one mode, or more than one mode.

To find the mode, order your data set from lowest to highest and find the response that occurs most frequently.

Mode number of library visits
Ordered data set 0, 3, 3, 12, 15, 24
Mode Find the most frequently occurring response: 3
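The three calculations above can be checked with Python's standard `statistics` module. This is only a sketch using the six responses from the example; `multimode` is used because a data set can have more than one mode.

```python
import statistics

# The first six survey responses from the worked example.
visits = [15, 3, 12, 0, 24, 3]

mean = statistics.mean(visits)        # sum of values divided by N: 57 / 6
median = statistics.median(visits)    # mean of the two middle values, 3 and 12
modes = statistics.multimode(visits)  # list of every most-frequent value

print(mean, median, modes)
```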

Measures of variability give you a sense of how spread out the response values are. The range, standard deviation and variance each reflect different aspects of spread.

The range gives you an idea of how far apart the most extreme response scores are. To find the range, simply subtract the lowest value from the highest value. For the six library-visit responses used above, the range is 24 – 0 = 24.

Standard deviation

The standard deviation ( s ) is the average amount of variability in your dataset. It tells you, on average, how far each score lies from the mean. The larger the standard deviation, the more variable the data set is.

There are six steps for finding the standard deviation:

  • List each score and find their mean.
  • Subtract the mean from each score to get the deviation from the mean.
  • Square each of these deviations.
  • Add up all of the squared deviations.
  • Divide the sum of the squared deviations by N – 1.
  • Find the square root of the number you found.
Raw data Deviation from mean Squared deviation
15 15 – 9.5 = 5.5 30.25
3 3 – 9.5 = -6.5 42.25
12 12 – 9.5 = 2.5 6.25
0 0 – 9.5 = -9.5 90.25
24 24 – 9.5 = 14.5 210.25
3 3 – 9.5 = -6.5 42.25
M = 9.5 Sum = 0 Sum of squares = 421.5

Step 5: 421.5/5 = 84.3

Step 6: √84.3 = 9.18

The variance is the average of squared deviations from the mean. Variance reflects the degree of spread in the data set. The more spread the data, the larger the variance is in relation to the mean.

To find the variance, simply square the standard deviation. The symbol for variance is s² .
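The six steps for the standard deviation, together with the range and variance, can be followed line by line in a short sketch. The variable names are illustrative only; the data are the six responses from the table above.

```python
import math

visits = [15, 3, 12, 0, 24, 3]
n = len(visits)

mean = sum(visits) / n                        # Step 1: find the mean
deviations = [x - mean for x in visits]       # Step 2: deviation from the mean
squared = [d ** 2 for d in deviations]        # Step 3: square each deviation
sum_of_squares = sum(squared)                 # Step 4: add them up
variance = sum_of_squares / (n - 1)           # Step 5: divide by N - 1
std_dev = math.sqrt(variance)                 # Step 6: take the square root

value_range = max(visits) - min(visits)       # highest minus lowest value

print(sum_of_squares, variance, round(std_dev, 2), value_range)
```

Note that step 5 already yields the variance, so squaring the standard deviation simply undoes step 6.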

Univariate descriptive statistics focus on only one variable at a time. It’s important to examine data from each variable separately using multiple measures of distribution, central tendency and spread. Programs like SPSS and Excel can be used to easily calculate these.

Visits to the library
N 6
Mean 9.5
Median 7.5
Mode 3
Standard deviation 9.18
Variance 84.3
Range 24

If you were to only consider the mean as a measure of central tendency, your impression of the ‘middle’ of the data set can be skewed by outliers, unlike the median or mode.

Likewise, while the range is sensitive to extreme values, you should also consider the standard deviation and variance to get easily comparable measures of spread.

If you’ve collected data on more than one variable, you can use bivariate or multivariate descriptive statistics to explore whether there are relationships between them.

In bivariate analysis, you simultaneously study the frequency and variability of two variables to see if they vary together. You can also compare the central tendency of the two variables before performing further statistical tests .

Multivariate analysis is the same as bivariate analysis but with more than two variables.

Contingency table

In a contingency table, each cell represents the intersection of two variables. Usually, an independent variable (e.g., gender) appears along the vertical axis and a dependent one appears along the horizontal axis (e.g., activities). You read ‘across’ the table to see how the independent and dependent variables relate to each other.

Number of visits to the library in the past year
Group 0–4 5–8 9–12 13–16 17+
Children 32 68 37 23 22
Adults 36 48 43 83 25

Interpreting a contingency table is easier when the raw data is converted to percentages. Percentages make each row comparable to the other by making it seem as if each group had only 100 observations or participants. When creating a percentage-based contingency table, you add the N for each independent variable on the end.

Visits to the library in the past year (Percentages)
Group 0–4 5–8 9–12 13–16 17+ N
Children 18% 37% 20% 13% 12% 182
Adults 15% 20% 18% 35% 11% 235
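The conversion from raw counts to row percentages can be sketched as follows. The counts are those from the contingency table above; the function name is only illustrative.

```python
# Raw counts per group for the bins 0–4, 5–8, 9–12, 13–16, 17+.
counts = {
    "Children": [32, 68, 37, 23, 22],
    "Adults": [36, 48, 43, 83, 25],
}

def row_percentages(row):
    """Return (rounded percentages, N) for one row of the table."""
    total = sum(row)
    return [round(100 * c / total) for c in row], total

table = {group: row_percentages(row) for group, row in counts.items()}

for group, (percentages, n) in table.items():
    print(group, percentages, n)
```

Because each row is divided by its own total, the two groups become directly comparable even though their sizes differ.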

From this table, it is clearer that similar proportions of children and adults go to the library over 17 times a year. Additionally, children most commonly went to the library between 5 and 8 times, while for adults, this number was between 13 and 16.

Scatter plots

A scatter plot is a chart that shows you the relationship between two or three variables. It’s a visual representation of the strength of a relationship.

In a scatter plot, you plot one variable along the x-axis and another one along the y-axis. Each data point is represented by a point in the chart.

From your scatter plot, you see that as the number of movies seen at movie theaters increases, the number of visits to the library decreases. Based on your visual assessment of a possible linear relationship, you perform further tests of correlation and regression.

Descriptive statistics: Scatter plot

Descriptive statistics summarise the characteristics of a data set. Inferential statistics allow you to test a hypothesis or assess whether your data is generalisable to the broader population.

The 3 main types of descriptive statistics concern the frequency distribution, central tendency, and variability of a dataset.

  • Distribution refers to the frequencies of different responses.
  • Measures of central tendency give you the average for each response.
  • Measures of variability show you the spread or dispersion of your dataset.
  • Univariate statistics summarise only one variable at a time.
  • Bivariate statistics compare two variables.
  • Multivariate statistics compare more than two variables.

Cite this Scribbr article

Bhandari, P. (2023, January 09). Descriptive Statistics | Definitions, Types, Examples. Scribbr. Retrieved 3 September 2024, from https://www.scribbr.co.uk/stats/descriptive-statistics-explained/

A Guide on Data Analysis

3 Descriptive Statistics

When you have an area of interest that you want to research, a problem that you want to solve, or a relationship that you want to investigate, theoretical and empirical processes will help you.

Estimand is defined as “a quantity of scientific interest that can be calculated in the population and does not change its value depending on the data collection design used to measure it (i.e., it does not vary with sample size and survey design, or the number of non-respondents, or follow-up efforts).” ( Rubin 1996 )

Estimands include:

  • population means
  • population variances
  • correlations
  • factor loadings
  • regression coefficients

3.1 Numerical Measures

There are differences between a population and a sample

| Measures of | Category | Population | Sample |
| - | What is it? | Reality | A small fraction of reality (inference) |
| - | Characteristics described by | Parameters | Statistics |
| Central Tendency | Mean | \(\mu = E(Y)\) | \(\hat{\mu} = \overline{y}\) |
| Central Tendency | Median | 50th percentile | \(y_{(\frac{n+1}{2})}\) |
| Dispersion | Variance | \(\sigma^2 = var(Y) = E[(Y-\mu)^2]\) | \(s^2=\frac{1}{n-1} \sum_{i = 1}^{n} (y_i-\overline{y})^2\) |
| Dispersion | Coefficient of Variation | \(\frac{\sigma}{\mu}\) | \(\frac{s}{\overline{y}}\) |
| Dispersion | Interquartile Range | Difference between the 25th and 75th percentiles. Robust to outliers | |
| Shape | Skewness | Standardized 3rd central moment (unitless) \(g_1=\frac{\mu_3}{\mu_2^{3/2}}\) | \(\hat{g_1}=\frac{m_3}{m_2\sqrt{m_2}}\) |
| Shape | Central moments | \(\mu=E(Y)\), \(\mu_2 = \sigma^2=E(Y-\mu)^2\), \(\mu_3 = E(Y-\mu)^3\), \(\mu_4 = E(Y-\mu)^4\) | \(m_2=\sum_{i=1}^{n}(y_i-\overline{y})^2/n\), \(m_3=\sum_{i=1}^{n}(y_i-\overline{y})^3/n\) |
| Shape | Kurtosis (peakedness and tail thickness) | Standardized 4th central moment \(g_2^*=\frac{E(Y-\mu)^4}{\sigma^4}\) | \(\hat{g_2}=\frac{m_4}{m_2^2}-3\) |

Order Statistics: \(y_{(1)},y_{(2)},...,y_{(n)}\) where \(y_{(1)}<y_{(2)}<...<y_{(n)}\)

Coefficient of variation: standard deviation over mean. This metric is a stable, dimensionless statistic for comparison.

Symmetric: mean = median, skewness = 0

Skewed right: mean > median, skewness > 0

Skewed left: mean < median, skewness < 0

Central moments: \(\mu=E(Y)\) , \(\mu_2 = \sigma^2=E(Y-\mu)^2\) , \(\mu_3 = E(Y-\mu)^3\) , \(\mu_4 = E(Y-\mu)^4\)

For normal distributions, \(\mu_3=0\) , so \(g_1=0\)

\(\hat{g_1}\) is distributed approximately as \(N(0,6/n)\) if sample is from a normal population. (valid when \(n > 150\) )

  • For large samples, inference on skewness can be based on normal tables with 95% confidence interval for \(g_1\) as \(\hat{g_1}\pm1.96\sqrt{6/n}\)
  • For small samples, use the special tables in Snedecor and Cochran 1989, Table A 19(i), or a Monte Carlo test

Kurtosis > 0 (leptokurtic): heavier tails than a normal distribution with the same \(\sigma\) (e.g., t-distribution)
Kurtosis < 0 (platykurtic): lighter tails than a normal distribution with the same \(\sigma\)

For a normal distribution, \(g_2^*=3\) . Kurtosis is often redefined as: \(g_2=\frac{E(Y-\mu)^4}{\sigma^4}-3\) where the 4th central moment is estimated by \(m_4=\sum_{i=1}^{n}(y_i-\overline{y})^4/n\)

  • The asymptotic sampling distribution for \(\hat{g_2}\) is approximately \(N(0,24/n)\) (with \(n > 1000\) )
  • For large samples, inference on kurtosis uses standard normal tables
  • For small samples, use the tables in Snedecor and Cochran, 1989, Table A 19(ii) or Geary 1936
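To make the moment-based definitions concrete, here is a minimal sketch that computes \(m_2\), \(m_3\), \(m_4\), \(\hat{g_1}\) and \(\hat{g_2}\) exactly as defined above, plus the large-sample 95% interval for skewness. The function names are illustrative, and the interval is only meaningful for the large \(n\) stated above.

```python
import math

def central_moment(data, k):
    """k-th sample central moment m_k = sum((y_i - ybar)^k) / n."""
    n = len(data)
    ybar = sum(data) / n
    return sum((y - ybar) ** k for y in data) / n

def skewness(data):
    """g1_hat = m3 / (m2 * sqrt(m2))."""
    m2 = central_moment(data, 2)
    return central_moment(data, 3) / (m2 * math.sqrt(m2))

def excess_kurtosis(data):
    """g2_hat = m4 / m2^2 - 3, which is 0 for a normal distribution."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3

def skewness_interval(data, z=1.96):
    """Large-sample interval g1_hat +/- z * sqrt(6 / n)."""
    half_width = z * math.sqrt(6 / len(data))
    g1 = skewness(data)
    return g1 - half_width, g1 + half_width

sample = [1, 2, 3, 4, 5]  # tiny symmetric toy sample, for illustration only
print(skewness(sample), excess_kurtosis(sample))
```

The toy sample is symmetric, so its skewness is zero, and its negative excess kurtosis reflects tails lighter than the normal.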

3.2 Graphical Measures

3.2.1 Shape

It’s a good habit to label your graph, so others can easily follow.

Other, more advanced plots

3.2.2 Scatterplot

3.3 Normality Assessment

Since the Normal (Gaussian) distribution has many applications, we often want our data or variable to be normally distributed. Hence, we assess normality using not only Numerical Measures but also Graphical Measures.

3.3.1 Graphical Assessment

The straight line represents the theoretical line for normally distributed data. The dots represent real empirical data that we are checking. If all the dots fall on the straight line, we can be confident that our data follow a normal distribution. If our data wiggle and deviate from the line, we should be concerned with the normality assumption.

3.3.2 Summary Statistics

Sometimes it’s hard to tell whether your data follow the normal distribution by just looking at the graph. Hence, we often have to conduct a statistical test to aid our decision. Common tests are

Methods based on normal probability plot

  • Correlation Coefficient with Normal Probability Plots
  • Shapiro-Wilk Test

Methods based on empirical cumulative distribution function

  • Anderson-Darling Test
  • Kolmogorov-Smirnov Test
  • Cramer-von Mises Test
  • Jarque–Bera Test

3.3.2.1 Methods based on normal probability plot

3.3.2.1.1 Correlation Coefficient with Normal Probability Plots

( Looney and Gulledge Jr 1985 ) ( Samuel S. Shapiro and Francia 1972 ) The correlation coefficient between \(y_{(i)}\) and \(m_i^*\) as given on the normal probability plot:

\[W^*=\frac{\sum_{i=1}^{n}(y_{(i)}-\bar{y})(m_i^*-0)}{\left(\sum_{i=1}^{n}(y_{(i)}-\bar{y})^2\sum_{i=1}^{n}(m_i^*-0)^2\right)^{0.5}}\]

where \(\bar{m^*}=0\)

Pearson product moment formula for correlation:

\[\hat{\rho}=\frac{\sum_{i=1}^{n}(y_i-\bar{y})(x_i-\bar{x})}{\left(\sum_{i=1}^{n}(y_{i}-\bar{y})^2\sum_{i=1}^{n}(x_i-\bar{x})^2\right)^{0.5}}\]

  • When the correlation is 1, the plot is exactly linear and normality is assumed.
  • The closer the correlation is to zero, the stronger the evidence against normality.
  • Inference on \(W^*\) needs to be based on special tables ( Looney and Gulledge Jr 1985 )
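A minimal sketch of \(W^*\) follows. The expected normal order statistics \(m_i^*\) are not defined in the text, so this sketch assumes Blom's approximation \(\Phi^{-1}\left(\frac{i-0.375}{n+0.25}\right)\), which is a common choice but not necessarily the one behind the published tables.

```python
import math
from statistics import NormalDist

def normal_scores(n):
    """Approximate expected normal order statistics m_i* (Blom's formula, an assumption)."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]

def probability_plot_correlation(data):
    """W*: correlation between the ordered data y_(i) and the scores m_i*."""
    y = sorted(data)
    n = len(y)
    m = normal_scores(n)          # these scores have mean 0 by symmetry
    ybar = sum(y) / n
    numerator = sum((yi - ybar) * mi for yi, mi in zip(y, m))
    denominator = math.sqrt(sum((yi - ybar) ** 2 for yi in y) * sum(mi ** 2 for mi in m))
    return numerator / denominator

# Data that are an exact linear function of the scores give W* = 1.
perfect = [10 + 2 * m for m in normal_scores(15)]
skewed = [math.exp(m) for m in normal_scores(15)]
print(probability_plot_correlation(perfect), probability_plot_correlation(skewed))
```

The skewed sample bends away from the straight line in a normal probability plot, which shows up as a correlation below 1.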

3.3.2.1.2 Shapiro-Wilk Test

( Samuel Sanford Shapiro and Wilk 1965 )

\[W=\left(\frac{\sum_{i=1}^{n}a_i(y_{(i)}-\bar{y})(m_i^*-0)}{\left(\sum_{i=1}^{n}a_i^2(y_{(i)}-\bar{y})^2\sum_{i=1}^{n}(m_i^*-0)^2\right)^{0.5}}\right)^2\]

where \(a_1,..,a_n\) are weights computed from the covariance matrix for the order statistics.

  • Researchers typically use this test to assess normality (n < 2000). Under normality, W is close to 1, just like \(W^*\) . Notice that the only difference between W and \(W^*\) is the “weights”.

3.3.2.2 Methods based on empirical cumulative distribution function

The formula for the empirical cumulative distribution function (CDF) is:

\(F_n(t)\) = estimate of the probability that an observation \(\le\) t = (number of observations \(\le\) t)/n

This method requires large sample sizes. However, it can apply to distributions other than the normal (Gaussian) one.
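The empirical CDF definition translates almost directly into code. The following is a small sketch with an illustrative function name and toy data.

```python
def ecdf(data, t):
    """F_n(t): proportion of observations less than or equal to t."""
    return sum(1 for x in data if x <= t) / len(data)

observations = [3, 1, 2]
print(ecdf(observations, 2))  # two of the three observations are <= 2
```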

3.3.2.2.1 Anderson-Darling Test

The Anderson-Darling statistic ( T. W. Anderson and Darling 1952 ) :

\[A^2=n\int_{-\infty}^{\infty}(F_n(t)-F(t))^2\frac{dF(t)}{F(t)(1-F(t))}\]

  • a weighted average of squared deviations (it weights small and large values of t more heavily)

For the normal distribution,

\(A^2 = -\frac{1}{n}\sum_{i=1}^{n}(2i-1)\left(\ln(p_i)+\ln(1-p_{n+1-i})\right)-n\)

where \(p_i=\Phi(\frac{y_{(i)}-\bar{y}}{s})\) , the probability that a standard normal variable is less than \(\frac{y_{(i)}-\bar{y}}{s}\)

Reject normal assumption when \(A^2\) is too large

Evaluate the null hypothesis that the observations are randomly selected from a normal population based on the critical value provided by ( Marsaglia and Marsaglia 2004 ) and ( Stephens 1974 )
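The computational form of \(A^2\) for the normal case can be sketched as below, with \(p_i\) computed from the sample mean and the sample standard deviation \(s\) as described above. Critical values are not included; they would have to come from the cited tables.

```python
import math
from statistics import NormalDist, mean, stdev

def anderson_darling_normal(data):
    """A^2 for normality, with mean and standard deviation estimated from the data."""
    y = sorted(data)
    n = len(y)
    ybar, s = mean(y), stdev(y)   # stdev uses the n - 1 denominator
    std_normal = NormalDist()
    p = [std_normal.cdf((yi - ybar) / s) for yi in y]
    # p[i - 1] is p_i and p[n - i] is p_{n+1-i} in the 1-based notation of the text.
    total = sum((2 * i - 1) * (math.log(p[i - 1]) + math.log(1 - p[n - i]))
                for i in range(1, n + 1))
    return -total / n - n

# Toy data: evenly spaced normal quantiles, and a skewed transformation of them.
roughly_normal = [NormalDist().inv_cdf((i - 0.5) / 20) for i in range(1, 21)]
skewed = [math.exp(v) for v in roughly_normal]
print(anderson_darling_normal(roughly_normal), anderson_darling_normal(skewed))
```

The skewed sample produces the larger statistic, in line with rejecting normality when \(A^2\) is too large.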

This test can be applied to other distributions:

  • Exponential
  • Extreme-value
  • Weibull: log(Weibull) = Gumbel
  • Log-normal (two-parameter)

Consult ( Stephens 1974 ) for more detailed transformation and critical values.

3.3.2.2.2 Kolmogorov-Smirnov Test

  • Based on the largest absolute difference between empirical and expected cumulative distribution
  • A variant of the K-S test is Kuiper’s test
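As a sketch of the "largest absolute difference" idea, the function below computes the K-S statistic against a fully specified normal distribution. Taking `mu` and `sigma` as given (rather than estimated from the data) is an assumption of this example; no p-value is computed.

```python
from statistics import NormalDist

def ks_statistic(data, mu, sigma):
    """D = largest absolute gap between the empirical CDF and the normal CDF."""
    y = sorted(data)
    n = len(y)
    dist = NormalDist(mu, sigma)
    d = 0.0
    for i, yi in enumerate(y, start=1):
        f = dist.cdf(yi)
        # The empirical CDF jumps from (i - 1)/n to i/n at y_(i); check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

print(ks_statistic([-1.2, -0.3, 0.1, 0.4, 1.5], mu=0, sigma=1))
```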

3.3.2.2.3 Cramer-von Mises Test

  • Based on the average squared discrepancy between the empirical distribution and a given theoretical distribution. Each discrepancy is weighted equally (unlike the Anderson-Darling test, which weights the end points more heavily)

3.3.2.2.4 Jarque–Bera Test

( Bera and Jarque 1981 )

Based on the skewness and kurtosis to test normality.

\(JB = \frac{n}{6}(S^2+(K-3)^2/4)\) where \(S\) is the sample skewness and \(K\) is the sample kurtosis

\(S=\frac{\hat{\mu_3}}{\hat{\sigma}^3}=\frac{\sum_{i=1}^{n}(x_i-\bar{x})^3/n}{(\sum_{i=1}^{n}(x_i-\bar{x})^2/n)^\frac{3}{2}}\)

\(K=\frac{\hat{\mu_4}}{\hat{\sigma}^4}=\frac{\sum_{i=1}^{n}(x_i-\bar{x})^4/n}{(\sum_{i=1}^{n}(x_i-\bar{x})^2/n)^2}\)

Recall that \(\hat{\sigma}^2\) is the estimate of the second central moment (variance), and \(\hat{\mu_3}\) and \(\hat{\mu_4}\) are the estimates of the third and fourth central moments.

If the data comes from a normal distribution, the JB statistic asymptotically has a chi-squared distribution with two degrees of freedom.

The null hypothesis is a joint hypothesis of the skewness being zero and the excess kurtosis being zero.
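The JB statistic can be computed directly from the formulas for \(S\) and \(K\) above. In this sketch the p-value uses the fact that a chi-squared variable with two degrees of freedom has survival function \(e^{-x/2}\); since the result is asymptotic, the p-value is only a rough guide for small samples.

```python
import math

def jarque_bera(data):
    """Return (JB statistic, asymptotic p-value) for a sample."""
    n = len(data)
    xbar = sum(data) / n
    m2 = sum((x - xbar) ** 2 for x in data) / n
    m3 = sum((x - xbar) ** 3 for x in data) / n
    m4 = sum((x - xbar) ** 4 for x in data) / n
    s = m3 / m2 ** 1.5            # sample skewness S
    k = m4 / m2 ** 2              # sample kurtosis K (not excess kurtosis)
    jb = n / 6 * (s ** 2 + (k - 3) ** 2 / 4)
    p_value = math.exp(-jb / 2)   # chi-squared survival function with 2 df
    return jb, p_value

jb, p_value = jarque_bera([1, 2, 3, 4, 5])
print(jb, p_value)
```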

3.4 Bivariate Statistics

Correlation between

  • Two Continuous variables
  • Two Discrete variables
  • Categorical and Continuous
Categorical Continuous

Questions to keep in mind:

  • Is the relationship linear or non-linear?
  • If the variable is continuous, is it normal and homoskedastic?
  • How big is your dataset?

3.4.1 Two Continuous

3.4.1.1 Pearson Correlation

  • Good with linear relationship

3.4.1.2 Spearman Correlation

3.4.2 Categorical and Continuous

3.4.2.1 Point-Biserial Correlation

Similar to the Pearson correlation coefficient, the point-biserial correlation coefficient is between -1 and 1 where:

-1 means a perfectly negative correlation between two variables

0 means no correlation between two variables

1 means a perfectly positive correlation between two variables
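Numerically, the point-biserial coefficient is the Pearson correlation computed with the binary variable coded as 0 and 1. A minimal sketch with made-up data:

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    syy = sum((b - ybar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: a binary group indicator and a continuous score.
group = [0, 0, 1, 1]
score = [1.0, 2.0, 3.0, 4.0]

r_pb = pearson(group, score)   # point-biserial correlation
print(r_pb)
```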

Alternatively

3.4.2.2 Logistic Regression

See 3.4.2.2

3.4.3 Two Discrete

3.4.3.1 Distance Metrics

Some consider that distance is not a correlation metric because it isn’t unit independent (i.e., if you rescale the data, the metric will change), but it’s still a useful proxy. Distance metrics are more likely to be used as similarity measures.

Euclidean Distance

Manhattan Distance

Chessboard Distance

Minkowski Distance

Canberra Distance

Hamming Distance

Cosine Distance

Sum of Absolute Distance

Sum of Squared Distance

Mean-Absolute Error
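For orientation, a few of the distances listed above can be written in a handful of lines. This sketch uses the usual textbook definitions; "chessboard" is taken here to mean the Chebyshev (maximum-coordinate) distance, which is an interpretation rather than something stated in the text.

```python
def minkowski(u, v, p):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1 / p)

def euclidean(u, v):
    return minkowski(u, v, 2)      # Minkowski with p = 2

def manhattan(u, v):
    return minkowski(u, v, 1)      # Minkowski with p = 1

def chessboard(u, v):
    """Largest coordinate-wise difference (Chebyshev distance)."""
    return max(abs(a - b) for a, b in zip(u, v))

def hamming(u, v):
    """Number of positions at which two equal-length sequences differ."""
    return sum(1 for a, b in zip(u, v) if a != b)

print(euclidean((0, 0), (3, 4)), manhattan((0, 0), (3, 4)), chessboard((0, 0), (3, 4)))
```

Rescaling the inputs changes every one of these values, which is the unit dependence mentioned above.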

3.4.3.2 Statistical Metrics

3.4.3.2.1 Chi-squared Test

3.4.3.2.1.1 Phi Coefficient

3.4.3.2.1.2 Cramer’s V

  • between nominal categorical variables (no natural order)

\[ \text{Cramer's V} = \sqrt{\frac{\chi^2/n}{\min(c-1,r-1)}} \]

\(\chi^2\) = Chi-square statistic

\(n\) = sample size

\(r\) = # of rows

\(c\) = # of columns
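The formula for Cramer's V can be traced with a short sketch that first computes the Pearson chi-square statistic from a table of counts. The function names and the two toy tables are illustrative; no continuity or bias correction is applied.

```python
import math

def chi_square(table):
    """Return (Pearson chi-square statistic, sample size n) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n

def cramers_v(table):
    """Cramer's V = sqrt((chi2 / n) / min(c - 1, r - 1))."""
    chi2, n = chi_square(table)
    r, c = len(table), len(table[0])
    return math.sqrt((chi2 / n) / min(c - 1, r - 1))

print(cramers_v([[10, 0], [0, 10]]))  # perfectly associated toy table
print(cramers_v([[5, 5], [5, 5]]))    # no association
```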

Alternatively,

ncchisq noncentral Chi-square

nchisqadj Adjusted noncentral Chi-square

fisher Fisher Z transformation

fisheradj bias correction Fisher z transformation

3.4.3.2.1.3 Tschuprow’s T

  • 2 nominal variables

3.4.3.3 Ordinal Association (Rank correlation)

  • Good with non-linear relationship

3.4.3.3.1 Ordinal and Nominal

3.4.3.3.1.1 Freeman’s Theta

  • Ordinal and nominal

3.4.3.3.1.2 Epsilon-squared

3.4.3.3.2 Two Ordinal

3.4.3.3.2.1 Goodman Kruskal’s Gamma

  • 2 ordinal variables

3.4.3.3.2.2 Somers’ D

or Somers’ Delta

3.4.3.3.2.3 Kendall’s Tau-b

3.4.3.3.2.4 Yule’s Q and Y

Special version \((2 \times 2)\) of the Goodman Kruskal’s Gamma coefficient.

Variable 1
a b
c d

\[ \text{Yule's Q} = \frac{ad - bc}{ad + bc} \]

We typically use Yule’s \(Q\) in practice while Yule’s Y has the following relationship with \(Q\) .

\[ \text{Yule's Y} = \frac{\sqrt{ad} - \sqrt{bc}}{\sqrt{ad} + \sqrt{bc}} \]

\[ Q = \frac{2Y}{1 + Y^2} \]

\[ Y = \frac{1 - \sqrt{1-Q^2}}{Q} \]
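The relationship between \(Q\) and \(Y\) is easy to check numerically. The sketch below uses a made-up \(2 \times 2\) table with cells a, b, c, d laid out as above.

```python
import math

def yules_q(a, b, c, d):
    """Yule's Q = (ad - bc) / (ad + bc)."""
    return (a * d - b * c) / (a * d + b * c)

def yules_y(a, b, c, d):
    """Yule's Y = (sqrt(ad) - sqrt(bc)) / (sqrt(ad) + sqrt(bc))."""
    return (math.sqrt(a * d) - math.sqrt(b * c)) / (math.sqrt(a * d) + math.sqrt(b * c))

# Hypothetical cell counts for a 2 x 2 table.
a, b, c, d = 4, 1, 1, 4
q = yules_q(a, b, c, d)
y = yules_y(a, b, c, d)

print(q, y)
print(2 * y / (1 + y ** 2))              # recovers Q from Y
print((1 - math.sqrt(1 - q ** 2)) / q)   # recovers Y from Q
```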

3.4.3.3.2.5 Tetrachoric Correlation

  • is a special case of Polychoric Correlation when both variables are binary

3.4.3.3.2.6 Polychoric Correlation

  • between ordinal categorical variables (natural order).
  • Assumption: Ordinal variable is a discrete representation of a latent normally distributed continuous variable. (Income = low, normal, high).

3.5 Summary

Get the correlation table for continuous variables only

Alternatively, you can also have the

cyl vs carb
cyl 1 . .
vs −.81 1 .
carb .53 −.57 1

Comparing correlations computed with different measures for different types of variables (e.g., continuous vs. categorical) can be problematic. Moreover, detecting non-linear as opposed to linear relationships is a further problem. One solution is to use mutual information from information theory (i.e., knowing one variable can reduce uncertainty about the other).

To implement mutual information, we have the following approximations

\[ \downarrow \text{prediction error} \approx \downarrow \text{uncertainty} \approx \downarrow \text{association strength} \]

More specifically, following the X2Y metric , we have the following steps:

  1. Predict \(y\) without \(x\) (i.e., baseline model):
       • the average of \(y\) when \(y\) is continuous
       • the most frequent value when \(y\) is categorical
  2. Predict \(y\) with \(x\) (e.g., linear, random forest, etc.)
  3. Calculate the prediction error difference between 1 and 2
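For a continuous \(y\), the steps above might be sketched as follows. This is only an illustration under stated assumptions: a simple least-squares line stands in for the predictive model, mean absolute error is the error measure, and the difference is reported as a percentage reduction relative to the baseline. None of these choices is prescribed by the text.

```python
def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    slope = (sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
             / sum((a - xbar) ** 2 for a in x))
    return slope, ybar - slope * xbar

def error_reduction(x, y):
    """Percentage reduction in MAE when predicting y with x versus without x."""
    baseline = [sum(y) / len(y)] * len(y)             # step 1: predict the mean of y
    baseline_error = mean_absolute_error(y, baseline)
    slope, intercept = fit_line(x, y)                 # step 2: predict y from x
    model_error = mean_absolute_error(y, [intercept + slope * a for a in x])
    return 100 * (baseline_error - model_error) / baseline_error  # step 3

print(error_reduction([1, 2, 3, 4], [2, 4, 6, 8]))    # y is fully determined by x
print(error_reduction([1, 2, 3, 4], [1, -1, -1, 1]))  # a line does not help here
```

The second example shows why the choice of model matters: the relationship is non-linear, so a linear stand-in reports no reduction even though \(x\) determines \(y\).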

To have a comprehensive table that could handle

continuous vs. continuous

categorical vs. continuous

continuous vs. categorical

categorical vs. categorical

the suggested model would be Classification and Regression Trees (CART). But we can certainly use other models as well.

The downside of this method is that it lacks:

  • Symmetry: the score for \((x,y)\) need not equal the score for \((y,x)\)
  • Comparability: different pairs of variables might use different error metrics (e.g., misclassification error vs. MAE)

3.5.1 Visualization

More general form,

Both heat map and correlation at the same time

More elaboration with ggplot2

14 Quantitative analysis: Descriptive statistics

Numeric data collected in a research project can be analysed quantitatively using statistical tools in two different ways. Descriptive analysis refers to statistically describing, aggregating, and presenting the constructs of interest or associations between these constructs. Inferential analysis refers to the statistical testing of hypotheses (theory testing). In this chapter, we will examine statistical techniques used for descriptive analysis, and the next chapter will examine statistical techniques for inferential analysis. Much of today’s quantitative data analysis is conducted using software programs such as SPSS or SAS. Readers are advised to familiarise themselves with one of these programs for understanding the concepts described in this chapter.

Data preparation

In research projects, data may be collected from a variety of sources: postal surveys, interviews, pretest or posttest experimental data, observational data, and so forth. This data must be converted into a machine-readable, numeric format, such as in a spreadsheet or a text file, so that they can be analysed by computer programs like SPSS or SAS. Data preparation usually follows the following steps:

Data coding. Coding is the process of converting data into numeric format. A codebook should be created to guide the coding process. A codebook is a comprehensive document containing a detailed description of each variable in a research study, items or measures for that variable, the format of each item (numeric, text, etc.), the response scale for each item (i.e., whether it is measured on a nominal, ordinal, interval, or ratio scale, and whether this scale is a five-point, seven-point scale, etc.), and how to code each value into a numeric format. For instance, if we have a measurement item on a seven-point Likert scale with anchors ranging from ‘strongly disagree’ to ‘strongly agree’, we may code that item as 1 for strongly disagree, 4 for neutral, and 7 for strongly agree, with the intermediate anchors in between. Nominal data such as industry type can be coded in numeric form using a coding scheme such as: 1 for manufacturing, 2 for retailing, 3 for financial, 4 for healthcare, and so forth (of course, nominal data cannot be analysed statistically). Ratio scale data such as age, income, or test scores can be coded as entered by the respondent. Sometimes, data may need to be aggregated into a different form than the format used for data collection. For instance, if a survey measuring a construct such as ‘benefits of computers’ provided respondents with a checklist of benefits that they could select from, and respondents were encouraged to choose as many of those benefits as they wanted, then the total number of checked items could be used as an aggregate measure of benefits. Note that many other forms of data—such as interview transcripts—cannot be converted into a numeric format for statistical analysis. 
Codebooks are especially important for large complex studies involving many variables and measurement items, where the coding process is conducted by different people, to help the coding team code data in a consistent manner, and also to help others understand and interpret the coded data.

Data entry. Coded data can be entered into a spreadsheet, database, text file, or directly into a statistical program like SPSS. Most statistical programs provide a data editor for entering data. However, these programs store data in their own native format—e.g., SPSS stores data as .sav files—which makes it difficult to share that data with other statistical programs. Hence, it is often better to enter data into a spreadsheet or database where it can be reorganised as needed, shared across programs, and subsets of data can be extracted for analysis. Smaller data sets with less than 65,000 observations and 256 items can be stored in a spreadsheet created using a program such as Microsoft Excel, while larger datasets with millions of observations will require a database. Each observation can be entered as one row in the spreadsheet, and each measurement item can be represented as one column. Data should be checked for accuracy during and after entry via occasional spot checks on a set of items or observations. Furthermore, while entering data, the coder should watch out for obvious evidence of bad data, such as the respondent selecting the ‘strongly agree’ response to all items irrespective of content, including reverse-coded items. If so, such data can be entered but should be excluded from subsequent analysis.


Data transformation. Sometimes, it is necessary to transform data values before they can be meaningfully interpreted. For instance, reverse coded items—where items convey the opposite meaning of that of their underlying construct—should be reversed (e.g., in a 1-7 interval scale, 8 minus the observed value will reverse the value) before they can be compared or combined with items that are not reverse coded. Other kinds of transformations may include creating scale measures by adding individual scale items, creating a weighted index from a set of observed measures, and collapsing multiple values into fewer categories (e.g., collapsing incomes into income ranges).
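As a minimal sketch of the reverse-coding rule described above (8 minus the observed value on a 1-7 scale), followed by a summed scale score. The item names and responses are hypothetical.

```python
def reverse_code(value, scale_min=1, scale_max=7):
    """Reverse a response on a scale_min..scale_max scale (8 - value for 1..7)."""
    return scale_min + scale_max - value


# Hypothetical respondent; item3 is assumed to be the reverse-coded item.
responses = {"item1": 6, "item2": 7, "item3": 2}
reverse_coded_items = {"item3"}

aligned = {
    item: reverse_code(value) if item in reverse_coded_items else value
    for item, value in responses.items()
}
scale_score = sum(aligned.values())  # scale measure built by adding the items
print(aligned, scale_score)
```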

Univariate analysis

Univariate analysis—or analysis of a single variable—refers to a set of statistical techniques that can describe the general properties of one variable. Univariate statistics include: frequency distribution, central tendency, and dispersion. The frequency distribution of a variable is a summary of the frequency—or percentages—of individual values or ranges of values for that variable. For instance, we can measure how many times a sample of respondents attend religious services—as a gauge of their ‘religiosity’—using a categorical scale: never, once per year, several times per year, about once a month, several times per month, several times per week, and an optional category for ‘did not answer’. If we count the number or percentage of observations within each category—except ‘did not answer’ which is really a missing value rather than a category—and display it in the form of a table, as shown in Figure 14.1, what we have is a frequency distribution. This distribution can also be depicted in the form of a bar chart, as shown on the right panel of Figure 14.1, with the horizontal axis representing each category of that variable and the vertical axis representing the frequency or percentage of observations within each category.
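A frequency distribution like the one described can be tabulated as follows. The responses are made up for illustration, and 'did not answer' is dropped as a missing value before counting.

```python
from collections import Counter

# Hypothetical responses to the religiosity item.
responses = [
    "never", "once per year", "never", "about once a month",
    "did not answer", "several times per year", "never",
    "once per year", "did not answer", "several times per week",
]

# Treat 'did not answer' as missing rather than as a category.
valid = [r for r in responses if r != "did not answer"]
frequencies = Counter(valid)
n_valid = len(valid)

for category, count in frequencies.most_common():
    print(f"{category:<25}{count:>3}{100 * count / n_valid:>8.1f}%")
```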

Figure 14.1. Frequency distribution of religiosity

With very large samples, where observations are independent and random, the frequency distribution tends to follow a plot that looks like a bell-shaped curve—a smoothed bar chart of the frequency distribution—similar to that shown in Figure 14.2. Here most observations are clustered toward the centre of the range of values, with fewer and fewer observations clustered toward the extreme ends of the range. Such a curve is called a normal distribution .

(15 + 20 + 21 + 20 + 36 + 15 + 25 + 15)/8=20.875

Lastly, the mode is the most frequently occurring value in a distribution of values. In the previous example, the most frequently occurring value is 15, which is the mode of the above set of test scores. Note that any value that is estimated from a sample, such as the mean, median, mode, or any of the later estimates, is called a statistic .

36-15=21
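The calculations for this set of eight test scores can be checked with Python's standard `statistics` module; this is just a convenience sketch, not part of the original text.

```python
from statistics import mean, median, mode

scores = [15, 20, 21, 20, 36, 15, 25, 15]

score_mean = mean(scores)                 # sum of scores divided by 8
score_median = median(scores)             # middle of the sorted scores
score_mode = mode(scores)                 # most frequently occurring value
score_range = max(scores) - min(scores)   # highest minus lowest

print(score_mean, score_median, score_mode, score_range)
```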

Bivariate analysis

Bivariate analysis examines how two variables are related to one another. The most common bivariate statistic is the bivariate correlation (often simply called 'correlation'), which is a number between -1 and +1 denoting the strength of the relationship between two variables. Say that we wish to study how age is related to self-esteem in a sample of 20 respondents, i.e., as age increases, does self-esteem increase, decrease, or remain unchanged? If self-esteem increases, then we have a positive correlation between the two variables; if self-esteem decreases, then we have a negative correlation; and if it remains the same, we have a zero correlation. To calculate the value of this correlation, consider the hypothetical dataset shown in Table 14.1.
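A minimal sketch of how a Pearson correlation can be computed by hand. The ages and self-esteem scores below are invented for illustration and are not the values of Table 14.1; `pearson_r` is a hypothetical helper name.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation: co-variation divided by the product of spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / sqrt(spread_x * spread_y)


# Hypothetical data: self-esteem rises with age, so r should be positive.
age = [21, 25, 30, 35, 40, 45]
self_esteem = [3.2, 3.5, 3.9, 4.1, 4.4, 4.8]
r = pearson_r(age, self_esteem)
print(round(r, 3))
```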

Figure 14.2. Normal distribution

After computing bivariate correlation, researchers are often interested in knowing whether the correlation is significant (i.e., a real one) or caused by mere chance. Answering such a question would require testing the following hypothesis:

\[H_0:\quad r = 0 \]

Social Science Research: Principles, Methods and Practices (Revised edition) Copyright © 2019 by Anol Bhattacherjee is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.


Once the data has been coded and double-checked, the next step is to calculate descriptive statistics. The three main types of descriptive statistics are frequencies, measures of central tendency (also called averages), and measures of variability. Frequency statistics simply count the number of times that each value of a variable occurs, such as the number of males and females within the sample. Measures of central tendency give one number that represents the entire set of scores, such as the mean. Measures of variability indicate the degree to which scores differ around the average.

Descriptive research designs typically only require descriptive statistics. However, all other types of research designs will require both descriptive and inferential statistics. Since it is important for the reader to have a good understanding of the sample that the study was conducted on, the first statistics for all research designs should include descriptive statistics of the personal information for the sample. The types of personal information to be included will vary depending on the type of research study. All studies should report descriptive statistics on gender and age. Other variables might include grade level, marital status, years of work experience, educational qualifications, socio-economic status, etc. It is up to the researcher to thoughtfully consider what the reader needs to know about the sample to make an informed decision about whether the sample is representative of the overall population.

Frequency and percentage statistics should be used to represent most personal information variables. However, if participants reported their exact age, then the mean and standard deviation should be calculated for the age variable. Frequency statistics should be reported whenever the data is discrete, meaning that there are separate categories that the participant can tick. For example, marital status can have categories of single, married, divorced, widowed, and separated. Educational qualifications can have categories of secondary school, diploma, degree, post-graduate diploma, masters, and doctorate.

However, measures of central tendency and variability should be reported for variables that have continuous data, meaning that the scores can vary along a continuum of numbers. For example, age is on a continuum from 0 to 100 or so, academic achievement generally varies from 0 to 100, and number of pages a student reads in a week can vary from 0 to maybe 300. These are all continuous variables, so a measure of central tendency and variability should be reported to represent these variables.

Recall that a frequency is simply the number of participants who indicated a given category (e.g., "Male"). However, frequency distributions are often difficult to interpret because a frequency by itself is meaningless without a reference point. Percentages are easier to understand because they can be read as follows: imagine there were exactly 100 participants in the sample; how many of those 100 would fall in that category? In Table 3, if there were 100 participants in the study, 55 would be female. A percentage is calculated by dividing the frequency in the category by the total number of participants and multiplying by 100%. To calculate the percentage of males in Table 3, divide the frequency for males (80) by the total number in the sample (200), then multiply by 100%, resulting in 40%. At this point, a simple table with the frequency and percentage of personal information variables will suffice. In the Tables and Figures page, I will describe how to convert these tables into APA format or graphically represent them in a figure.
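The percentage calculation for males can be written as a one-line function; `percentage` is a hypothetical helper name, and the figures are the ones quoted in the paragraph above.

```python
def percentage(frequency, total):
    """Frequency in a category as a percentage of all participants."""
    return 100 * frequency / total


male_percentage = percentage(80, 200)
print(f"{male_percentage:.2f}%")
```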

Table 3: Frequency and percentage of participants by gender (rows: Male, Female, Missing, Total)

Once descriptive statistics for the personal information have been calculated, then it is time to move onto the variables under study. In most cases, a total score for each variable will have been calculated in the previous step, Coding the Data . APA standards require that researchers report descriptive statistics on the major variables under study, even for studies that will use inferential statistics, so the nature of any effect can be understood by the reader. This means that all research studies must report the mean and standard deviation for all variables under study. The mean is necessary to summarize that variable across all participants; the standard deviation is necessary to understand how much each participant varies around that mean.

At the moment, it is enough to calculate the mean and standard deviation and combine them all in one table. If a causal-comparative design is used that compares two or more groups on these variables (e.g., compares males and females on academic achievement), then it is necessary to calculate the mean and standard deviation separately for each group. If a pre-test/post-test design is used, then the mean and standard deviation will need to be calculated separately for the pre-test and the post-test. Table 4 gives the means and standard deviations for a study that compares teachers in private and public schools on three variables associated with early literacy practices. Recall that the mean is calculated by summing the scores and then dividing this sum by the number of scores. Calculating the standard deviation is a bit more complicated. Microsoft Excel will quickly and automatically calculate each statistic using the =AVERAGE and =STDEV functions. For examples of how to calculate frequencies, averages, and variation by hand, click here .
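Outside of Excel, the same per-group summary can be sketched in Python. The scores below are hypothetical and are not the data behind Table 4; `statistics.stdev` uses the sample (n-1) formula, as Excel's =STDEV does.

```python
from statistics import mean, stdev

# Hypothetical scores on one variable for two groups of teachers.
scores = {
    "private": [4, 5, 3, 4, 4],
    "public": [2, 3, 3, 2, 5],
}

# Mean and sample standard deviation, calculated separately per group.
summary = {group: (mean(values), stdev(values)) for group, values in scores.items()}

for group, (group_mean, group_sd) in summary.items():
    print(f"{group:<8} M = {group_mean:.2f}, SD = {group_sd:.2f}")
```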

Table 4: Means and standard deviations for private and public school teachers (rows: Read Books Aloud, Tell Stories, Sight Words)

When reporting frequencies, do not add any places after the decimal point; only report whole numbers. When reporting percentages, means, and standard deviations, typically include two decimal places.

At this point, we cannot say that there is a significant difference between Public or Private school teachers on any of these variables. There will always be differences between scores for different groups of people. Inferential statistics are necessary to determine whether these differences are big enough to be considered significant. In other words, to determine if the differences between groups are large enough to say that there is any meaningful difference between the two, one must calculate an inferential statistic, described in the next chapter.

If a research study has Research Questions, then either a percentage or a mean will likely be calculated to answer the research question. Once the descriptive statistics for the personal information and key variables have been calculated, then it is time to answer any research questions. Refer to the Research Questions that were developed. Calculate the appropriate statistic to answer each research question separately. Refer to the Methods of Data Analysis to determine which statistics should be calculated to answer each research question.

Again, it is very important that the researcher is very careful when calculating statistics to avoid careless errors. Incorrect calculations can lead a researcher to draw incorrect conclusions, making the study invalid and untrue. Therefore, check every calculation multiple times in order to maintain the highest ethical standards in research.

For a step-by-step example of a descriptive research study and how to calculate descriptive statistics, click here .

Copyright 2013, Katrina A. Korb, All Rights Reserved


Chapter 1. Descriptive Statistics and Frequency Distributions

This chapter is about describing populations and samples, a subject known as descriptive statistics. This will all make more sense if you keep in mind that the information you want to produce is a description of the population or sample as a whole, not a description of one member of the population. The first topic in this chapter is a discussion of distributions , essentially pictures of populations (or samples). Second will be the discussion of descriptive statistics. The topics are arranged in this order because the descriptive statistics can be thought of as ways to describe the picture of a population, the distribution.

Distributions

The first step in turning data into information is to create a distribution. The most primitive way to present a distribution is to simply list, in one column, each value that occurs in the population and, in the next column, the number of times it occurs. It is customary to list the values from lowest to highest. This simple listing is called a frequency distribution . A more elegant way to turn data into information is to draw a graph of the distribution. Customarily, the values that occur are put along the horizontal axis and the frequency of the value is on the vertical axis.

Ann is the equipment manager for the Chargers athletic teams at Camosun College, located in Victoria, British Columbia. She called the basketball and volleyball team managers and collected the following data on sock sizes used by their players. Ann found out that last year the basketball team used 14 pairs of size 7 socks, 18 pairs of size 8, 15 pairs of size 9, and 6 pairs of size 10. The volleyball team used 3 pairs of size 6, 10 pairs of size 7, 15 pairs of size 8, 5 pairs of size 9, and 11 pairs of size 10. Ann arranged her data into a distribution and then drew a graph called a histogram. Ann could have created a relative frequency distribution as well as a frequency distribution. The difference is that instead of listing how many times each value occurred, Ann would list what proportion of her sample was made up of socks of each size.
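As a sketch of what Ann's two distributions look like in code, using the counts quoted above (the variable names are illustrative only):

```python
from collections import Counter

# Pairs of socks used last year, by size, as quoted in the text.
basketball = Counter({7: 14, 8: 18, 9: 15, 10: 6})
volleyball = Counter({6: 3, 7: 10, 8: 15, 9: 5, 10: 11})

# Frequency distribution: how many times each size occurs overall.
frequency = basketball + volleyball
total = sum(frequency.values())

# Relative frequency distribution: proportion of the sample at each size.
relative_frequency = {size: frequency[size] / total for size in sorted(frequency)}

for size, share in relative_frequency.items():
    print(f"size {size:>2}: {frequency[size]:>3} pairs, {share:.4f}")
```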

You can use the Excel template below (Figure 1.1) to see all the histograms and frequencies she has created. You may also change her numbers in the yellow cells to see how the graphs will change automatically.

Notice that Ann has drawn the graphs differently. In the first graph, she has used bars for each value, while on the second, she has drawn a point for the relative frequency of each size, and then “connected the dots”. While both methods are correct, when you have values that are continuous, you will want to do something more like the “connect the dots” graph. Sock sizes are discrete , they only take on a limited number of values. Other things have continuous values; they can take on an infinite number of values, though we are often in the habit of rounding them off. An example is how much students weigh. While we usually give our weight in whole kilograms in Canada (“I weigh 60 kilograms”), few have a weight that is exactly so many kilograms. When you say “I weigh 60”, you actually mean that you weigh between 59 1/2 and 60 1/2 kilograms. We are heading toward a graph of a distribution of a continuous variable where the relative frequency of any exact value is very small, but the relative frequency of observations between two values is measurable. What we want to do is to get used to the idea that the total area under a “connect the dots” relative frequency graph, from the lowest to the highest possible value, is one. Then the part of the area under the graph between two values is the relative frequency of observations with values within that range. The height of the line above any particular value has lost any direct meaning, because it is now the area under the line between two values that is the relative frequency of an observation between those two values occurring.

You can get some idea of how this works if you go back to the bar graph of the distribution of sock sizes, but draw it with relative frequency on the vertical axis. If you arbitrarily decide that each bar has a width of one, then the area under the curve between 7.5 and 8.5 is simply the height times the width of the bar for sock size 8: .3510 × 1. If you wanted to find the relative frequency of sock sizes between 6.5 and 8.5, you could simply add together the area of the bar for size 7 (that's between 6.5 and 7.5) and the bar for size 8 (between 7.5 and 8.5).
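The "area of the bars" idea can be sketched as below, assuming the combined sock counts quoted in Ann's example and a bar width of one. With those counts the bar for size 8 has height 33/97, roughly 0.34; the .3510 in the text presumably reflects the values in the Excel template. `relative_frequency_between` is a hypothetical helper.

```python
# Combined basketball and volleyball counts by sock size (from Ann's example).
counts = {6: 3, 7: 24, 8: 33, 9: 20, 10: 17}
total = sum(counts.values())
BAR_WIDTH = 1  # arbitrary choice, so bar area equals relative frequency


def relative_frequency_between(low, high):
    """Add the areas of all bars lying entirely between low and high."""
    return sum(
        (count / total) * BAR_WIDTH
        for size, count in counts.items()
        if low <= size - 0.5 and size + 0.5 <= high
    )


print(relative_frequency_between(6.5, 8.5))   # bars for sizes 7 and 8
print(relative_frequency_between(5.5, 10.5))  # whole distribution
```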

Descriptive statistics

Now that you see how a distribution is created, you are ready to learn how to describe one. There are two main things that need to be described about a distribution: its location and its shape. Generally, it is best to give a single measure as the description of the location and a single measure as the description of the shape.

To describe the location of a distribution, statisticians use a typical value from the distribution. There are a number of different ways to find the typical value, but by far the most used is the arithmetic mean , usually simply called the mean . You already know how to find the arithmetic mean, you are just used to calling it the average . Statisticians use average more generally — the arithmetic mean is one of a number of different averages. Look at the formula for the arithmetic mean:

[latex]\mu = \dfrac{\sum{x}}{N}[/latex]

All you do is add up all of the members of the population, [latex]\sum{x}[/latex], and divide by how many members there are, N . The only trick is to remember that if there is more than one member of the population with a certain value, to add that value once for every member that has it. To reflect this, the equation for the mean sometimes is written:

[latex]\mu = \dfrac{\sum{f_i(x_i)}}{N}[/latex]

where [latex]f_i[/latex] is the frequency of members of the population with the value [latex]x_i[/latex].

This is really the same formula as above. If there are seven members with a value of ten, the first formula would have you add ten seven times. The second formula simply has you multiply ten by seven, which is the same thing as adding together seven tens.
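A small check that the two ways of writing the mean agree, using a hypothetical population in which seven members have the value ten and three have the value twelve:

```python
# Hypothetical population stored as {value: frequency}.
values_with_frequencies = {10: 7, 12: 3}

# First formula: list every member, then add them all up.
members = [x for x, f in values_with_frequencies.items() for _ in range(f)]
N = len(members)
mean_from_members = sum(members) / N

# Second formula: multiply each value by its frequency before adding.
mean_from_frequencies = sum(f * x for x, f in values_with_frequencies.items()) / N

print(mean_from_members, mean_from_frequencies)
```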

Other measures of location are the median and the mode. The median is the value of the member of the population that is in the middle when the members are sorted from smallest to largest. Half of the members of the population have values higher than the median, and half have values lower. The median is a better measure of location if there are one or two members of the population that are a lot larger (or a lot smaller) than all the rest. Such extreme values can make the mean a poor measure of location, while they have little effect on the median. If there are an odd number of members of the population, there is no problem finding which member has the median value. If there are an even number of members of the population, then there is no single member in the middle. In that case, just average together the values of the two members that share the middle.
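To illustrate how one extreme member pulls the mean but not the median, and how the even-sized case is handled, here is a short sketch with made-up values:

```python
from statistics import mean, median

typical = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 100]  # one member much larger than the rest

print(mean(typical), median(typical))
print(mean(with_outlier), median(with_outlier))

# Even number of members: average the two values that share the middle.
even_sized = [1, 2, 3, 4]
print(median(even_sized))
```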

The third common measure of location is the mode . If you have arranged the population into a frequency or relative frequency distribution, the mode is easy to find because it is the value that occurs most often. While in some sense, the mode is really the most typical member of the population, it is often not very near the middle of the population. You can also have multiple modes. I am sure you have heard someone say that “it was a bimodal distribution “. That simply means that there were two modes, two values that occurred equally most often.
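A quick way to find every mode, including the bimodal case, is `statistics.multimode` (available from Python 3.8); the sock sizes here are invented for the example.

```python
from statistics import multimode

one_mode = [7, 8, 8, 8, 9, 10]
two_modes = [7, 8, 8, 9, 9, 10]  # sizes 8 and 9 occur equally most often

print(multimode(one_mode))
print(multimode(two_modes))
```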

If you think about it, you should not be surprised to learn that for bell-shaped distributions, the mean, median, and mode will be equal. Most of what statisticians do when describing or inferring the location of a population is done with the mean. Another thing to think about is using a spreadsheet program, like Microsoft Excel, when arranging data into a frequency distribution or when finding the median or mode. By using the sort and distribution commands in 1-2-3, or similar commands in Excel, data can quickly be arranged in order or placed into value classes and the number in each class found. Excel also has a function, =AVERAGE(…), for finding the arithmetic mean. You can also have the spreadsheet program draw your frequency or relative frequency distribution.

One of the reasons that the arithmetic mean is the most used measure of location is because the mean of a sample is an unbiased estimator of the population mean. Because the sample mean is an unbiased estimator of the population mean, the sample mean is a good way to make an inference about the population mean. If you have a sample from a population, and you want to guess what the mean of that population is, you can legitimately guess that the population mean is equal to the mean of your sample. This is a legitimate way to make this inference because the mean of all the sample means equals the mean of the population, so if you used this method many times to infer the population mean, on average you’d be correct.
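The claim that the mean of all the sample means equals the population mean can be checked by brute force on a tiny, hypothetical population. The sketch enumerates every sample of size two drawn without replacement and uses exact fractions to avoid rounding.

```python
from fractions import Fraction
from itertools import combinations

population = [2, 4, 6, 8, 10]  # hypothetical population
sample_size = 2

population_mean = Fraction(sum(population), len(population))

# Mean of every possible sample of the chosen size.
sample_means = [
    Fraction(sum(sample), sample_size)
    for sample in combinations(population, sample_size)
]
mean_of_sample_means = sum(sample_means) / len(sample_means)

print(population_mean, mean_of_sample_means)
```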

All of these measures of location can be found for samples as well as populations, using the same formulas. Generally, μ is used for a population mean, and [latex]\bar{x}[/latex] is used for sample means. Upper-case N, really a Greek nu, is used for the size of a population, while lower-case n is used for sample size. Though it is not universal, statisticians tend to use the Greek alphabet for population characteristics and the Roman alphabet for sample characteristics.

Measuring population shape

Measuring the shape of a distribution is more difficult. Location has only one dimension ("where?"), but shape has a lot of dimensions. We will talk about two, and you will find that most of the time, only one dimension of shape is measured. The two dimensions of shape discussed here are the width and symmetry of the distribution. The simplest way to measure the width is to do just that: the range is the distance between the lowest and highest members of the population. The range is obviously affected by one or two population members that are much higher or lower than all the rest.

The most common measures of distribution width are the standard deviation and the variance. The standard deviation is simply the square root of the variance, so if you know one (and have a calculator that does squares and square roots) you know the other. The standard deviation is just a strange measure of the mean distance between the members of a population and the mean of the population. This is easiest to see if you start out by looking at the formula for the variance:

[latex]\sigma^2 = \dfrac{\sum{(x-\mu)^2}}{N}[/latex]

Look at the numerator. To find the variance, the first step (after you have the mean, μ ) is to take each member of the population, and find the difference between its value and the mean; you should have N differences. Square each of those, and add them together, dividing the sum by N , the number of members of the population. Since you find the mean of a group of things by adding them together and then dividing by the number in the group, the variance is simply the mean of the squared distances between members of the population and the population mean.
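The steps just described can be followed literally in code. The eight population values are hypothetical, chosen so the arithmetic comes out evenly, and the result is compared with `statistics.pvariance` and `statistics.pstdev`.

```python
from math import sqrt
from statistics import pstdev, pvariance

population = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical population
N = len(population)
mu = sum(population) / N

# One difference per member, squared, summed, then divided by N.
squared_distances = [(x - mu) ** 2 for x in population]
variance = sum(squared_distances) / N
standard_deviation = sqrt(variance)

print(variance, standard_deviation)
print(pvariance(population), pstdev(population))
```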

Notice that this is the formula for a population characteristic, so we use the Greek σ, and that we write the variance as σ², or sigma squared. Because the standard deviation is simply the square root of the variance, its symbol is simply sigma, σ.

One of the things statisticians have discovered is that at least 75 per cent of the members of any population are within two standard deviations of the mean of the population. This is known as Chebyshev's theorem . If the mean of a population of shoe sizes is 9.6 and the standard deviation is 1.1, then at least 75 per cent of the shoe sizes are between 7.4 (two standard deviations below the mean) and 11.8 (two standard deviations above the mean). This same theorem can be stated in probability terms: the probability that anything is within two standard deviations of the mean of its population is at least .75.
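Chebyshev's theorem gives a lower bound of 1 − 1/k² for the share of a population within k standard deviations of the mean, which is .75 for k = 2. The sketch below checks the bound on a deliberately lopsided, hypothetical population; the helper names are illustrative.

```python
from statistics import mean, pstdev


def chebyshev_lower_bound(k):
    """Minimum share of any population within k standard deviations."""
    return 1 - 1 / k**2


def share_within(population, k):
    """Observed share of members within k standard deviations of the mean."""
    mu = mean(population)
    sigma = pstdev(population)
    inside = [x for x in population if abs(x - mu) <= k * sigma]
    return len(inside) / len(population)


skewed_population = [1] * 9 + [11]  # hypothetical: one member far from the rest
print(chebyshev_lower_bound(2), share_within(skewed_population, 2))
```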

It is important to be careful when dealing with variances and standard deviations. In later chapters, there are formulas using the variance, and formulas using the standard deviation. Be sure you know which one you are supposed to be using. Here again, spreadsheet programs will figure out the standard deviation for you. In Excel, there is a function, =STDEVP(…), that does all of the arithmetic. Most calculators will also compute the standard deviation. Read the little instruction booklet, and find out how to have your calculator do the numbers before you do any homework or have a test.

The other measure of shape we will discuss here is the measure of skewness. Skewness is simply a measure of whether or not the distribution is symmetric or if it has a long tail on one side, but not the other. There are a number of ways to measure skewness, with many of the measures based on a formula much like the variance. The formula looks a lot like that for the variance, except the distances between the members and the population mean are cubed, rather than squared, before they are added together:

[latex]sk = \dfrac{\sum{(x-\mu)^3}}{N}[/latex]

At first, it might not seem that cubing rather than squaring those distances would make much difference. Remember, however, that when you square either a positive or negative number, you get a positive number, but when you cube a positive, you get a positive and when you cube a negative you get a negative. Also remember that when you square a number, it gets larger, but that when you cube a number, it gets a whole lot larger. Think about a distribution with a long tail out to the left. There are a few members of that population much smaller than the mean, members for which [latex](x-\mu)[/latex] is large and negative. When these are cubed, you end up with some really big negative numbers. Because there are no members with such large, positive [latex](x-\mu)[/latex], there are no corresponding really big positive numbers to add in when you sum up the [latex](x-\mu)^3[/latex], and the sum will be negative. A negative measure of skewness means that there is a tail out to the left, a positive measure means a tail to the right. Take a minute and convince yourself that if the distribution is symmetric, with equal tails on the left and right, the measure of skew is zero.
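The sign behaviour described above is easy to confirm numerically with the cubed-distance formula. The three small populations are hypothetical, and `skew_measure` is an illustrative name for the formula given in the text.

```python
def skew_measure(population):
    """Mean of the cubed distances from the population mean."""
    N = len(population)
    mu = sum(population) / N
    return sum((x - mu) ** 3 for x in population) / N


left_tail = [0, 9, 10, 10, 11]   # one member far below the mean
right_tail = [5, 6, 6, 7, 16]    # mirror image: one member far above
symmetric = [1, 2, 3, 4, 5]

print(skew_measure(left_tail), skew_measure(right_tail), skew_measure(symmetric))
```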

To be really complete, there is one more thing to measure, kurtosis or peakedness . As you might expect by now, it is measured by taking the distances between the members and the mean and raising them to the fourth power before averaging them together.

Measuring sample shape

Measuring the location of a sample is done in exactly the way that the location of a population is done. However, measuring the shape of a sample is done a little differently than measuring the shape of a population. The reason behind the difference is the desire to have the sample measurement serve as an unbiased estimator of the population measurement. If we took all of the possible samples of a certain size, n , from a population and found the variance of each one, and then found the mean of those sample variances, that mean would be a little smaller than the variance of the population.

You can see why this is so if you think it through. If you knew the population mean, you could find [latex]\sum{\dfrac{(x-\mu)^2}{n}}[/latex] for each sample, and have an unbiased estimate for σ². However, you do not know the population mean, so you will have to infer it. The best way to infer the population mean is to use the sample mean [latex]\bar{x}[/latex]. The variance of a sample will then be found by averaging together all of the squared distances from the sample mean, [latex]\sum{\dfrac{(x-\bar{x})^2}{n}}[/latex].

The mean of a sample is obviously determined by where the members of that sample lie. If you have a sample that is mostly from the high (or right) side of a population’s distribution, then the sample mean will almost certainly be greater than the population mean. For such a sample, [latex]\sum{\dfrac{(x-\bar{x})^2}{n}}[/latex] would underestimate σ². The same is true for samples that are mostly from the low (or left) side of the population. If you think about what kind of samples will have [latex]\sum{\dfrac{(x-\bar{x})^2}{n}}[/latex] that is greater than the population σ², you will come to the realization that it is only those samples with a few very high members and a few very low members, and there are not very many samples like that. By now you should have convinced yourself that [latex]\sum{\dfrac{(x-\bar{x})^2}{n}}[/latex] will result in a biased estimate of σ². You can see that, on average, it is too small.

How can an unbiased estimate of the population variance, σ², be found? If [latex]\sum{\dfrac{(x-\bar{x})^2}{n}}[/latex] is on average too small, we need to do something to make it a little bigger. We want to keep the [latex]\sum{(x-\bar{x})^2}[/latex], but if we divide it by something a little smaller, the result will be a little larger. Statisticians have found out that the following way to compute the sample variance results in an unbiased estimator of the population variance:

[latex]s^2 = \dfrac{\sum{(x-\bar{x})^2}}{n-1}[/latex]

If we took all of the possible samples of some size, n, from a population, and found the sample variance for each of those samples, using this formula, the mean of those sample variances would equal the population variance, σ².
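This claim can be checked by brute force on a tiny example. The sketch below makes several assumptions that are not in the text: a made-up population of six values, a sample size of 2, and sampling with replacement (so that every ordered pair counts as one possible sample). Exact fractions are used so that no rounding hides the comparison.

```python
from fractions import Fraction
from itertools import product

def mean(values):
    return Fraction(sum(values), len(values))

def sum_sq_dev(values):
    """Sum of squared distances from the mean of the values themselves."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values)

population = [1, 2, 3, 4, 5, 6]  # hypothetical population
n = 2                            # sample size

sigma2 = sum_sq_dev(population) / len(population)

# Every ordered sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))
avg_div_n = sum(sum_sq_dev(s) / n for s in samples) / len(samples)
avg_div_n_minus_1 = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)

print(sigma2, avg_div_n, avg_div_n_minus_1)
```

Dividing by n averages out to less than the population variance, while dividing by n-1 averages out to exactly the population variance. For sampling without replacement from a small finite population, the bookkeeping is slightly different.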

Note that we use s² instead of σ², and n instead of N (really the Greek letter nu, not the Roman en), since this is for a sample and we want to use the Roman letters rather than the Greek letters, which are used for populations.

There is another way to see why you divide by n-1 . We also have to address something called degrees of freedom before too long, and the degrees of freedom are the key in the other explanation. As we go through this explanation, you should be able to see that the two explanations are related.

Imagine that you have a sample with 10 members, n=10, and you want to use it to estimate the variance of the population from which it was drawn. You write each of the 10 values on a separate scrap of paper. If you know the population mean, you could start by computing all 10 (x – μ)². However, in the usual case, you do not know μ, and you must start by finding x̄ from the values on the 10 scraps to use as an estimate of μ. Once you have found x̄, you could lose any one of the 10 scraps and still be able to find the value that was on the lost scrap from the other 9 scraps. If you are going to use x̄ in the formula for sample variance, only 9 (or n-1) of the x’s are free to take on any value. Because only n-1 of the x’s can vary freely, you should divide [latex]\sum{(x-\bar{x})^2}[/latex] by n-1, the number of x’s that are really free. Once you use x̄ in the formula for sample variance, you use up one degree of freedom, leaving only n-1. Generally, whenever you use something you have previously computed from a sample within a formula, you use up a degree of freedom.
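The lost-scrap argument can be acted out in Python. The ten values below are hypothetical; the point is only that once the sample mean is known, the tenth value is no longer free.

```python
scraps = [4, 7, 1, 9, 3, 8, 5, 6, 2, 5]  # hypothetical sample, n = 10
n = len(scraps)
x_bar = sum(scraps) / n

lost = scraps[-1]        # pretend this scrap goes missing
remaining = scraps[:-1]  # the 9 scraps that are left

# n * x_bar is the total of all 10 values, so the lost value is forced.
recovered = n * x_bar - sum(remaining)
print(recovered)
```

Whichever scrap is removed, the same subtraction recovers it, which is why only n-1 of the values count as free.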

A little thought will link the two explanations. The first explanation is based on the idea that x̄, the estimator of μ, varies with the sample. It is because x̄ varies with the sample that a degree of freedom is used up in the second explanation.

The sample standard deviation is found simply by taking the square root of the sample variance:

[latex]s=\sqrt{\dfrac{\sum{(x-\bar{x})^2}}{n-1}}[/latex]

While the sample variance is an unbiased estimator of population variance, the sample standard deviation is not an unbiased estimator of the population standard deviation, because the square root of the average is not the same as the average of the square roots. This causes statisticians to use variance where it seems as though they are trying to get at standard deviation. In general, statisticians tend to use variance more than standard deviation. Be careful with formulas using sample variance and standard deviation in the following chapters. Make sure you are using the right one. Also note that many calculators will find standard deviation using both the population and sample formulas. Some use σ and s to show the difference between population and sample formulas; some use sₙ and sₙ₋₁ to show the difference.
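Software draws the same distinction as calculators do. As one illustration, Python's standard `statistics` module offers separate functions for the population and sample formulas; the data below are made up.

```python
import statistics

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data

pop_var = statistics.pvariance(sample)   # divides by n
samp_var = statistics.variance(sample)   # divides by n - 1
pop_sd = statistics.pstdev(sample)       # square root of pop_var
samp_sd = statistics.stdev(sample)       # square root of samp_var

print(pop_var, samp_var, pop_sd, samp_sd)
```

The sample versions are always a little larger than the population versions for the same data, because the same sum of squares is divided by a smaller number.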

If Ann wanted to infer what the population distribution of volleyball players’ sock sizes looked like, she could do so from her sample. If she is going to send volleyball coaches packages of socks for the players to try, she will want the packages to contain an assortment of sizes that will allow each player to have a pair that fits. To choose that assortment, she wants to know the mean and variance of that distribution. Her data, again, are shown in Table 1.1.

Table 1.1 Ann’s Data
Sock size | Frequency
6 | 3
7 | 24
8 | 33
9 | 20
10 | 17

The mean sock size can be found: [latex]\bar{x}=\dfrac{3 \times 6+24 \times 7+33 \times 8+20 \times 9+17 \times 10}{97} = 8.25[/latex]

To find the sample variance and standard deviation, Ann decides to use Excel. She lists the sock sizes that were in the sample in column A (see Table 1.2), and the frequency of each of those sizes in column B. For column C, she has the computer find (x – x̄)² for each of the sock sizes, using the formula =(A1-8.25)^2 in the first row, and then copying it down to the other four rows. In D1, she multiplies C1 by the frequency using the formula =B1*C1, and copies it down into the other rows. Finally, she finds the sample variance by adding up the five numbers in column D and dividing by n-1 = 96 using the Excel formula =SUM(D1:D5)/96, and the sample standard deviation by taking the square root of that result. The spreadsheet appears like this when she is done:

Table 1.2 Sock Sizes
Row | A (sock size) | B (frequency) | C = (A – 8.25)² | D = B × C
1 | 6 | 3 | 5.06 | 15.19
2 | 7 | 24 | 1.56 | 37.5
3 | 8 | 33 | 0.06 | 2.06
4 | 9 | 20 | 0.56 | 11.25
5 | 10 | 17 | 3.06 | 52.06
6 | | n = 97 | Var = | 1.2298
7 | | | Std. dev. = | 1.1090

Ann now has an estimate of the variance of the sizes of socks worn by basketball and volleyball players, 1.23. She has inferred that the population of Chargers players’ sock sizes has a mean of 8.25 and a variance of 1.23.
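Ann's spreadsheet steps can also be sketched in Python, using the frequencies from Table 1.1. The variable names are illustrative, and the sketch keeps the unrounded mean rather than 8.25, so the last digits may differ slightly from a spreadsheet built on the rounded value.

```python
import math

# Ann's data from Table 1.1: sock size -> number of players.
frequencies = {6: 3, 7: 24, 8: 33, 9: 20, 10: 17}

n = sum(frequencies.values())
mean = sum(size * count for size, count in frequencies.items()) / n

# Frequency-weighted sum of squared distances from the sample mean (column D).
sum_sq = sum(count * (size - mean) ** 2 for size, count in frequencies.items())

sample_variance = sum_sq / (n - 1)           # divide by n - 1, not n
sample_std_dev = math.sqrt(sample_variance)

print(n, round(mean, 2), round(sample_variance, 4), round(sample_std_dev, 4))
```

Weighting each squared distance by its frequency does the same job as copying the formula down column D and summing.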


To describe a population, you need to describe the picture or graph of its distribution. The two things that need to be described about the distribution are its location and its shape. Location is measured by an average, most often the arithmetic mean. The most important measure of shape is a measure of dispersion, roughly width, most often the variance or its square root, the standard deviation.

Samples need to be described, too. If all we wanted to do with sample descriptions was describe the sample, we could use exactly the same measures for sample location and dispersion that are used for populations. However, we want to use the sample describers for dual purposes: (a) to describe the sample, and (b) to make inferences about the description of the population that sample came from. Because we want to use them to make inferences, we want our sample descriptions to be unbiased estimators . Our desire to measure sample dispersion with an unbiased estimator of population dispersion means that the formula we use for computing sample variance is a little different from the one used for computing population variance.

Introductory Business Statistics with Interactive Spreadsheets - 1st Canadian Edition Copyright © 2015 by Mohammad Mahbobi and Thomas K. Tiemann is licensed under a Creative Commons Attribution 4.0 International License , except where otherwise noted.


Methods and formulas for Descriptive Statistics (Tables)


The mean is the sum of all observations divided by the number of (non-missing) observations. Use the following formula to calculate the mean for each cell or margin using the data corresponding to that cell or margin.

[latex]\bar{x} = \dfrac{\sum{x}}{n}[/latex]

Term | Description
x | data value for each observation
n | count of the number of observations for each cell or margin

The median is the middle value in an ordered data set. Thus, at least half the observations are less than or equal to the median, and at least half the observations are greater than or equal to the median.

If the number of observations in a data set is odd, the median is the value in the middle. If the number of observations in a data set is even, the median is the average of the two middle values.


The minimum is the smallest data value that is in a table cell or margin.

The maximum is the largest data value that is in a table cell or margin.

The sum is the total of all the data values that are in a table cell or margin.

The standard deviation is the most common measure of dispersion, or how spread out the data are about the mean. The more widely the values are spread out, the larger the standard deviation. The standard deviation is calculated by taking the square root of the variance.

Use this formula to calculate the standard deviation for each cell or margin using the data from that cell or margin.

[latex]s = \sqrt{\dfrac{\sum{(x-\bar{x})^2}}{n-1}}[/latex]

Term | Description
x | data value for each observation
x̄ | mean for each cell or margin
n | count of the number of observations for each cell or margin

N nonmissing is the number of non-missing observations that are in a table cell or margin.

N missing is the number of missing observations that are in a table cell or margin.

The count is the number of times each combination of categories occurs.

The row percent is obtained by multiplying the ratio of a cell count to the corresponding row total by 100 and is given by:

[latex]\text{Row percent} = \dfrac{n_{ij}}{n_{i+}} \times 100[/latex]

Term | Description
[latex]n_{i+}[/latex] | number of observations in row i
[latex]n_{ij}[/latex] | number of observations in the cell corresponding to row i and column j

The column percent is obtained by multiplying the ratio of a cell count to the corresponding column total by 100 and is given by:

[latex]\text{Column percent} = \dfrac{n_{ij}}{n_{+j}} \times 100[/latex]

Term | Description
[latex]n_{+j}[/latex] | count of all the observations in column j
[latex]n_{ij}[/latex] | count of observations in the cell corresponding to row i and column j

The total percent is obtained by multiplying the ratio of a cell count to the total number of observations by 100 and is given by:

[latex]\text{Total percent} = \dfrac{n_{ij}}{n_{++}} \times 100[/latex]

Term | Description
[latex]n_{++}[/latex] | number of observations in the table
[latex]n_{ij}[/latex] | number of observations in the cell corresponding to row i and column j
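The three percentages differ only in which total goes in the denominator. A minimal Python sketch, using a made-up 2 x 3 table of cell counts and illustrative variable names:

```python
# Hypothetical 2 x 3 table of cell counts (rows x columns).
counts = [
    [10, 20, 30],
    [5, 15, 20],
]

row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]
grand_total = sum(row_totals)

# Each cell count divided by its row total, column total, or the table total.
row_pct = [[100 * c / row_totals[i] for c in row] for i, row in enumerate(counts)]
col_pct = [[100 * c / col_totals[j] for j, c in enumerate(row)] for row in counts]
total_pct = [[100 * c / grand_total for c in row] for row in counts]

print(row_pct[0])    # percentages within the first row
print(col_pct[0])    # first-row cells as a share of their columns
print(total_pct[0])  # first-row cells as a share of the whole table
```

Row percents sum to 100 across each row, column percents sum to 100 down each column, and total percents sum to 100 over the whole table.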

Descriptive Statistics

Descriptive statistics is a subfield of statistics that deals with characterizing the features of known data. Descriptive statistics give summaries of either population or sample data. Aside from descriptive statistics, inferential statistics is another important discipline of statistics used to draw conclusions about population data.

Descriptive statistics is divided into two categories:

Measures of Central Tendency

Measures of dispersion.

In this article, we will learn about descriptive statistics, including their many categories, formulae, and examples in detail.

What is Descriptive Statistics?

Descriptive statistics is a branch of statistics focused on summarizing, organizing, and presenting data in a clear and understandable way. Its primary aim is to define and analyze the fundamental characteristics of a dataset without making sweeping generalizations or assumptions about the wider population.

The main purpose of descriptive statistics is to provide a straightforward and concise overview of the data, enabling researchers or analysts to gain insights and understand patterns, trends, and distributions within the dataset.

Descriptive statistics typically involve measures of central tendency (such as mean, median, mode), dispersion (such as range, variance, standard deviation), and distribution shape (including skewness and kurtosis). Additionally, graphical representations like charts, graphs, and tables are commonly used to visualize and interpret the data.

Histograms, bar charts, pie charts, scatter plots, and box plots are some examples of widely used graphical techniques in descriptive statistics.

Descriptive Statistics Definition

Descriptive statistics is a type of statistical analysis that uses quantitative methods to summarize the features of a population sample. It is useful to present easy and exact summaries of the sample and observations using metrics such as mean, median, variance, graphs, and charts.

Types of Descriptive Statistics

There are three types of descriptive statistics:

  • Measures of Central Tendency
  • Measures of Dispersion
  • Measures of Frequency Distribution

A measure of central tendency is a statistical measure that describes a complete distribution or dataset with a single value. Each of the central tendency measures summarizes the whole data distribution in a slightly different way. In the following sections, we will look at the central tendency measures, their formulae, applications, and kinds in depth.

Mean is the sum of all the components in a group or collection divided by the number of items in that group or collection. Mean of a data collection is typically represented as x̄ (pronounced “x bar”). The formula for calculating the mean for ungrouped data to express it as the measure is given as follows:

For a series of observations:

x̄ = Σx / n
  • x̄ = Mean Value of Provided Dataset
  • Σx = Sum of All Terms
  • n = Number of Terms

Example: Weights of 7 girls in kg are 54, 32, 45, 61, 20, 66 and 50. Determine the mean weight for the provided collection of data.

Mean = Σx/n = (54 + 32 + 45 + 61 + 20 + 66 + 50)/7 = 328 / 7 = 46.86
Thus, the group’s mean weight is 46.86 kg.

Median of a data set is the value of the middle-most observation obtained after organizing the data in ascending order, which is one of the measures of central tendency. Median formula may be used to compute the median for many types of data, such as grouped and ungrouped data.

Ungrouped Data Median (n is odd): [(n + 1)/2]th term
Ungrouped Data Median (n is even): [(n/2)th term + ((n/2) + 1)th term]/2

Example: Weights of 7 girls in kg are 54, 32, 45, 61, 20, 66 and 50. Determine the median weight for the provided collection of data.

Arrange the provided data collection in ascending order: 20, 32, 45, 50, 54, 61, 66
Median = [(n + 1) / 2]th term = [(7 + 1) / 2]th term = 4th term = 50
Thus, the group’s median weight is 50 kg.

Mode is one of the measures of central tendency, defined as the value that appears the most frequently in the provided data, i.e. the observation with the highest frequency is known as the mode of data. The mode formulae provided below can be used to compute the mode for ungrouped data.

Mode of Ungrouped Data: Most Repeated Observation in Dataset

Example: Weights of 7 girls in kg are 54, 32, 45, 61, 20, 45 and 50. Determine the mode weight for the provided collection of data.

Mode = Most repeated observation in Dataset = 45
Thus, the group’s mode weight is 45 kg.
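The three worked examples can be checked with Python's standard `statistics` module. This is just one convenient way to do it; the variable names are illustrative, and the mode example uses the second list of weights, in which 45 appears twice.

```python
import statistics

weights = [54, 32, 45, 61, 20, 66, 50]       # data for the mean and median examples
weights_mode = [54, 32, 45, 61, 20, 45, 50]  # data for the mode example

mean_weight = statistics.mean(weights)
median_weight = statistics.median(weights)   # sorts the data internally
mode_weight = statistics.mode(weights_mode)

print(round(mean_weight, 2))
print(median_weight)  # 50
print(mode_weight)    # 45
```

`statistics.median` also handles the even-n case by averaging the two middle values, matching the second median formula above.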

If the variability of data within an experiment must be established, absolute measures of variability should be employed. These metrics often reflect differences in a data collection in terms of the average deviations of the observations. The most prevalent absolute measures of deviation are listed below. In the following sections, we will look at these variability measures and their formulae in depth.

  • Range
  • Standard Deviation
  • Variance
  • Mean Deviation
  • Quartile Deviation

The range represents the spread of your data from the lowest to the highest value in the distribution. It is the most straightforward measure of variability to compute. To get the range, subtract the data set’s lowest value from its highest value.

Range = Highest Value – Lowest Value

Example: Calculate the range of the following data series:  5, 13, 32, 42, 15, 84

Arrange the provided data series in ascending order: 5, 13, 15, 32, 42, 84
Range = H – L = 84 – 5 = 79
So, the range is 79.

Standard deviation (s or SD) represents the average level of variability in your dataset. It represents the average deviation of each score from the mean. The higher the standard deviation, the more varied the dataset is.

To calculate standard deviation, follow these six steps:

Step 1: Make a list of each score and calculate the mean.

Step 2: Calculate deviation from the mean, by subtracting the mean from each score.

Step 3: Square each of these differences.

Step 4: Sum up all squared deviations.

Step 5: Divide the total of squared deviations by N-1.

Step 6: Find the square root of the number that you discovered.

Example: Calculate standard deviation of the following data series:  5, 13, 32, 42, 15, 84.

Step 1: Calculate the mean of the series using the formula Σx / n: Mean = 191 / 6 = 31.83

Step 2: Calculate the deviation from the mean by subtracting the mean from each value in the series.

Step 3: Square each deviation from the mean.

Series | Deviation from Mean | Squared Deviation
5 | 5-31.83 = -26.83 | 719.85
13 | 13-31.83 = -18.83 | 354.57
32 | 32-31.83 = 0.17 | 0.0289
42 | 42-31.83 = 10.17 | 103.43
15 | 15-31.83 = -16.83 | 283.25
84 | 84-31.83 = 52.17 | 2721.71
Mean = 191/6 = 31.83 | Sum ≈ 0 | Sum = 4182.84

Step 4: Add up the squared deviations => 4182.84

Step 5: Divide the sum of squared deviations by N-1 => 4182.84 / 5 = 836.57

Step 6: Take the square root => √836.57 = 28.92

So, the standard deviation is 28.92
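The six steps translate almost line for line into Python. The sketch below uses the same data series; the variable names are illustrative, and the unrounded mean is kept throughout, with rounding applied only when printing.

```python
import math

scores = [5, 13, 32, 42, 15, 84]

mean = sum(scores) / len(scores)               # Step 1: mean
deviations = [x - mean for x in scores]        # Step 2: deviation from the mean
squared = [d ** 2 for d in deviations]         # Step 3: square each deviation
sum_of_squares = sum(squared)                  # Step 4: sum the squared deviations
variance = sum_of_squares / (len(scores) - 1)  # Step 5: divide by N - 1
std_dev = math.sqrt(variance)                  # Step 6: square root

print(round(variance, 2), round(std_dev, 2))
```

Keeping the intermediate values unrounded avoids the small discrepancies that appear when a rounded standard deviation is squared again.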

Variance is calculated as average of squared departures from the mean. Variance measures the degree of dispersion in a data collection. The more scattered the data, the larger the variance in relation to the mean. To calculate the variance, square the standard deviation.

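The symbol for variance is s².

Example: Calculate the variance of the following data series:  5, 13, 32, 42, 15, 84.

First we need the standard deviation, which we calculated above: SD = 28.9235 before rounding.
s² = (SD)² = (28.9235)² = 836.57
So, the variance is 836.57. Squaring the rounded value 28.92 gives 836.37 instead, so it is better to carry extra decimal places or to use the sum of squared deviations divided by N-1 directly.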

Mean Deviation

Mean Deviation is the average of the absolute deviations of the data from a central value, which may be the mean, median, or mode. Mean Deviation is sometimes also known as absolute deviation. The formula for mean deviation is given as follows:

Mean Deviation = (Σ |Xᵢ – μ|) / n, with the sum taken over i = 1 to n
  • μ is Central Value
  • n is Number of Observations
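A short Python sketch of the mean deviation formula follows. The function name `mean_deviation` and the data are chosen here for illustration; the central value is passed in explicitly so that the same function works about the mean or the median.

```python
import statistics

def mean_deviation(values, center):
    """Average absolute distance of the values from a chosen central value."""
    return sum(abs(x - center) for x in values) / len(values)

data = [4, 8, 9, 10, 6, 12, 14, 4, 5, 3, 4]  # illustrative data series

about_mean = mean_deviation(data, statistics.mean(data))
about_median = mean_deviation(data, statistics.median(data))

print(round(about_mean, 4), about_median)
```

The mean deviation about the median is never larger than the mean deviation about the mean, which is one reason the median is sometimes preferred as the central value.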

Quartile Deviation

Quartile Deviation is half of the difference between the third and first quartiles. The formula for quartile deviation is given as follows:

Quartile Deviation = (Q₃ − Q₁)/2
  • Q₃ is Third Quartile
  • Q₁ is First Quartile
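As a rough sketch, the quartile deviation can be computed with `statistics.quantiles` from Python's standard library (Python 3.8 or later). Quartile conventions differ between textbooks and software, so the default method used here is an assumption and other methods can give slightly different quartiles for the same data.

```python
import statistics

data = [3, 4, 4, 4, 5, 6, 8, 9, 10, 12, 14]  # illustrative data series

# With n=4, quantiles() returns the three cut points Q1, Q2 (the median), and Q3.
q1, q2, q3 = statistics.quantiles(data, n=4)
quartile_deviation = (q3 - q1) / 2

print(q1, q3, quartile_deviation)
```

Because it depends only on the middle half of the data, the quartile deviation is much less sensitive to extreme values than the range.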

Other measures of dispersion include the relative measures also known as the coefficients of dispersion.

Datasets consist of various scores or values. Statisticians employ graphs and tables to summarize the occurrence of each possible value of a variable, often presented in percentages or numerical figures.

For instance, suppose you were conducting a poll to determine people’s favorite Beatle. You would create one column listing all potential options (John, Paul, George, and Ringo) and another column indicating the number of votes each received. Statisticians represent these frequency distributions through graphs or tables.
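A frequency table like this takes only a few lines of Python with `collections.Counter`. The votes below are invented purely for illustration.

```python
from collections import Counter

# Hypothetical poll responses.
votes = ["John", "Paul", "George", "Paul", "Ringo",
         "John", "Paul", "George", "John", "Paul"]

counts = Counter(votes)
total = sum(counts.values())

# One row per option: name, number of votes, and share of all votes.
for name, count in counts.most_common():
    print(f"{name:<8}{count:>3}{100 * count / total:>7.1f}%")
```

The first column lists the options and the other two give the frequency as a number and as a percentage, the two forms mentioned above.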

Univariate Descriptive Statistics

Univariate descriptive statistics focus on one thing at a time. We look at each thing individually and use different ways to understand it better. Programs like SPSS and Excel can help us with this.

If we only look at the average (mean) of something, like how much people earn, it might not give us the true picture, especially if some people earn a lot more or less than others. Instead, we can also look at other things like the middle value (median) or the one that appears most often (mode). And to understand how spread out the values are, we use things like standard deviation and variance along with the range.

Bivariate Descriptive Statistics

When we have information about more than one thing, we can use bivariate or multivariate descriptive statistics to see if they are related. Bivariate analysis compares two things to see if they change together. Before doing any more complicated tests, it’s important to compare the central tendency (for example, the mean or median) of the two things.

Multivariate analysis is similar to bivariate analysis, but it looks at more than two things at once, which helps us understand relationships even better.

Representations of Data in Descriptive Statistics

Descriptive statistics use a variety of ways to summarize and present data in an understandable manner. This helps us grasp the data set’s patterns, trends, and properties.

Frequency Distribution Tables: Frequency distribution tables divide data into categories or intervals and display the number of observations (frequency) that fall into each one. For example, suppose we have a class of 20 students and are tracking their test scores. We may make a frequency distribution table that contains score ranges (e.g., 0-10, 11-20) and displays how many students scored in each range.

Graphs and Charts: Graphs and charts graphically display data, making it simpler to understand and analyze. For example, using the same test score data, we may generate a bar graph with the x-axis representing score ranges and the y-axis representing the number of students. Each bar on the graph represents a score range, and its height shows the number of students scoring within that range.

These approaches help us summarize and visualize data, making it easier to discover trends, patterns, and outliers, which is critical for making informed decisions and reaching meaningful conclusions in a variety of sectors.

Descriptive Statistics Applications

Descriptive statistics are used in a variety of sectors to summarize, organize, and display data in a meaningful and intelligible way. Here are a few popular applications:

  • Business and Economics: Descriptive statistics are useful for analyzing sales data, market trends, and customer behaviour. They are used to generate averages, medians, and standard deviations in order to better evaluate product performance, pricing strategies, and financial metrics.
  • Healthcare: Descriptive statistics are used to analyze patient data such as demographics, medical histories, and treatment outcomes. They assist healthcare workers in determining illness prevalence, assessing treatment efficacy, and identifying risk factors.
  • Education: Descriptive statistics are useful in education since they summarize student performance on tests and examinations. They assist instructors in assessing instructional techniques, identifying areas for improvement, and monitoring student growth over time.
  • Market Research: Descriptive statistics are used to analyze customer preferences, product demand, and market trends. They enable businesses to make educated decisions about product development, advertising campaigns, and market segmentation.
  • Finance and investment: Descriptive statistics are used to analyze stock market data, portfolio performance, and risk management. They assist investors in determining investment possibilities, tracking asset values, and evaluating financial instruments.

Difference Between Descriptive Statistics and Inferential Statistics

The differences between descriptive statistics and inferential statistics are summarized in the table below.

Descriptive Statistics vs Inferential Statistics

Descriptive Statistics | Inferential Statistics
Does not involve making predictions or generalizations outside the dataset. | Involves making forecasts or generalizations about a wider population.
Gives a basic summary of the sample. | Draws conclusions about the population based on the sample.
Techniques include mean, median, mode, standard deviation, etc. | Techniques include hypothesis testing, confidence intervals, regression analysis, etc.
Focuses on the properties of the current dataset. | Concentrates on drawing conclusions about the population from sample data.
Helpful for comprehending data patterns and linkages. | Useful for making judgements, predictions, and drawing inferences that go beyond the observed facts.

Examples of Descriptive Statistics

Example 1: Calculate the Mean, Median and Mode for the following series: {4, 8, 9, 10, 6, 12, 14, 4, 5, 3, 4}

First, we are going to calculate the mean.
Mean = Σx / n = (4 + 8 + 9 + 10 + 6 + 12 + 14 + 4 + 5 + 3 + 4)/11 = 79 / 11 = 7.1818
Thus, the mean is 7.1818.

Now, we are going to calculate the median.
Arrange the provided data collection in ascending order: 3, 4, 4, 4, 5, 6, 8, 9, 10, 12, 14
Median = [(n + 1) / 2]th term = [(11 + 1) / 2]th term = 6th term = 6
Thus, the median is 6.

Now, we are going to calculate the mode.
Mode = The most repeated observation in the dataset = 4
Thus, the mode is 4.

Example 2: Calculate the Range for the following series: {4, 8, 9, 10, 6, 12, 14, 4, 5, 3, 4}

Arrange the provided data series in ascending order: 3, 4, 4, 4, 5, 6, 8, 9, 10, 12, 14
Range = H – L = 14 – 3 = 11
So, the range is 11.

Example 3: Calculate the standard deviation and variance of following data: {12, 24, 36, 48, 10, 18}

First we are going to compute the standard deviation. To do so, calculate the mean, the deviation of each value from the mean, and the squared deviations.

Series | Deviation from Mean | Squared Deviation
12 | 12-24.67 = -12.67 | 160.53
24 | 24-24.67 = -0.67 | 0.45
36 | 36-24.67 = 11.33 | 128.37
48 | 48-24.67 = 23.33 | 544.29
10 | 10-24.67 = -14.67 | 215.21
18 | 18-24.67 = -6.67 | 44.49
Mean = 148/6 = 24.67 | Sum ≈ 0 | Sum = 1093.34

Dividing the sum of squared deviations by N-1 => 1093.34 / 5 = 218.67

√(218.67) = 14.79

So, the standard deviation is 14.79.

Now we are going to calculate the variance. The variance is the value obtained before taking the square root:

s² = 218.67

So, the variance is 218.67

Practice Problems on Descriptive Statistics

P1) Determine the sample variance of the following series: {17, 21, 52, 28, 26, 23}

P2) Determine the mean and mode of the following series: {21, 14, 56, 41, 18, 15, 18, 21, 15, 18}

P3) Find the median of the following series: {7, 24, 12, 8, 6, 23, 11}

P4) Find the standard deviation and variance of the following series: {17, 28, 42, 48, 36, 42, 20}

FAQs on Descriptive Statistics

What is meant by descriptive statistics?

Descriptive statistics seek to summarize, organize, and display data in an accessible manner while avoiding making sweeping generalizations about the whole population. It aids in discovering patterns, trends, and distributions within the collection.

How is the mean computed in descriptive statistics?

Mean is computed by adding together all of the values in the dataset and dividing them by the total number of observations. It measures the dataset’s central tendency or average value.

What role do measures of variability play in descriptive statistics?

Measures of variability, such as range, standard deviation, and variance, aid in quantifying the spread or dispersion of data points around the mean. They give insights on the dataset’s variety and consistency.

Can you explain the median in descriptive statistics?

The median is the midpoint value of a dataset whether sorted ascending or descending. It measures central tendency and is important when dealing with skewed data or outliers.

How can frequency distribution measurements contribute to descriptive statistics?

Measures of frequency distribution summarize the incidence of various values or categories within a dataset. They give insights into the distribution pattern of the data and are commonly represented by graphs or tables.

How are inferential statistics distinguished from descriptive statistics?

Inferential statistics use sample data to draw inferences or make predictions about a wider population, whereas descriptive statistics summarize aspects of known data. Descriptive statistics concentrate on the present dataset, whereas inferential statistics go beyond the observable data.

Why are descriptive statistics necessary in data analysis?

Descriptive statistics give researchers and analysts a clear and straightforward summary of the dataset, helping them to identify patterns, trends, and distributions. It aids in making educated judgements and gaining valuable insights from data.

What are the four types of descriptive statistics?

There are four major types of descriptive statistics:

  • Measures of Frequency
  • Measures of Central Tendency
  • Measures of Dispersion or Variation
  • Measures of Position

Which is an example of descriptive statistics?

Descriptive statistics examples include the study of mean, median, and mode.

J Korean Med Sci. v.37(16); 2022 Apr 25

A Practical Guide to Writing Quantitative and Qualitative Research Questions and Hypotheses in Scholarly Articles

Edward Barroga

1 Department of General Education, Graduate School of Nursing Science, St. Luke’s International University, Tokyo, Japan.

Glafera Janet Matanguihan

2 Department of Biological Sciences, Messiah University, Mechanicsburg, PA, USA.

The development of research questions and the subsequent hypotheses are prerequisites to defining the main research purpose and specific objectives of a study. Consequently, these objectives determine the study design and research outcome. The development of research questions is a process based on knowledge of current trends, cutting-edge studies, and technological advances in the research field. Excellent research questions are focused and require a comprehensive literature search and in-depth understanding of the problem being investigated. Initially, research questions may be written as descriptive questions which could be developed into inferential questions. These questions must be specific and concise to provide a clear foundation for developing hypotheses. Hypotheses are more formal predictions about the research outcomes. These specify the possible results that may or may not be expected regarding the relationship between groups. Thus, research questions and hypotheses clarify the main purpose and specific objectives of the study, which in turn dictate the design of the study, its direction, and outcome. Studies developed from good research questions and hypotheses will have trustworthy outcomes with wide-ranging social and health implications.

INTRODUCTION

Scientific research is usually initiated by posing evidenced-based research questions which are then explicitly restated as hypotheses. 1 , 2 The hypotheses provide directions to guide the study, solutions, explanations, and expected results. 3 , 4 Both research questions and hypotheses are essentially formulated based on conventional theories and real-world processes, which allow the inception of novel studies and the ethical testing of ideas. 5 , 6

It is crucial to have knowledge of both quantitative and qualitative research 2 as both types of research involve writing research questions and hypotheses. 7 However, these crucial elements of research are sometimes overlooked or, if not overlooked, framed without the forethought and meticulous attention they need. Planning and careful consideration are needed when developing quantitative or qualitative research, particularly when conceptualizing research questions and hypotheses. 4

There is a continuing need to support researchers in the creation of innovative research questions and hypotheses, as well as for journal articles that carefully review these elements. 1 When research questions and hypotheses are not carefully thought out, unethical studies and poor outcomes usually ensue. Carefully formulated research questions and hypotheses define well-founded objectives, which in turn determine the appropriate design, course, and outcome of the study. This article therefore aims to discuss in detail the various aspects of crafting research questions and hypotheses, with the goal of guiding researchers as they develop their own. Examples from the authors and peer-reviewed scientific articles in the healthcare field are provided to illustrate key points.

DEFINITIONS AND RELATIONSHIP OF RESEARCH QUESTIONS AND HYPOTHESES

A research question is what a study aims to answer after data analysis and interpretation. The answer is written at length in the discussion section of the paper. Thus, the research question gives a preview of the different parts and variables of the study meant to address the problem posed in the research question. 1 An excellent research question clarifies the research writing while facilitating understanding of the research topic, objective, scope, and limitations of the study. 5

On the other hand, a research hypothesis is an educated statement of an expected outcome. This statement is based on background research and current knowledge. 8 , 9 The research hypothesis makes a specific prediction about a new phenomenon 10 or a formal statement on the expected relationship between an independent variable and a dependent variable. 3 , 11 It provides a tentative answer to the research question to be tested or explored. 4

Hypotheses employ reasoning to predict a theory-based outcome. 10 These can also be developed from theories by focusing on components of theories that have not yet been observed. 10 The validity of hypotheses is often based on the testability of the prediction made in a reproducible experiment. 8

Conversely, hypotheses can also be rephrased as research questions. Several hypotheses based on existing theories and knowledge may be needed to answer a research question. Developing ethical research questions and hypotheses creates a research design that has logical relationships among variables. These relationships serve as a solid foundation for the conduct of the study. 4 , 11 Haphazardly constructed research questions can result in poorly formulated hypotheses and improper study designs, leading to unreliable results. Thus, the formulations of relevant research questions and verifiable hypotheses are crucial when beginning research. 12

CHARACTERISTICS OF GOOD RESEARCH QUESTIONS AND HYPOTHESES

Excellent research questions are specific and focused. These integrate collective data and observations to confirm or refute the subsequent hypotheses. Well-constructed hypotheses are based on previous reports and verify the research context. These are realistic, in-depth, sufficiently complex, and reproducible. More importantly, these hypotheses can be addressed and tested. 13

There are several characteristics of well-developed hypotheses. Good hypotheses are 1) empirically testable 7 , 10 , 11 , 13 ; 2) backed by preliminary evidence 9 ; 3) testable by ethical research 7 , 9 ; 4) based on original ideas 9 ; 5) grounded in evidence-based logical reasoning 10 ; and 6) predictable. 11 Good hypotheses can infer ethical and positive implications, indicating the presence of a relationship or effect relevant to the research theme. 7 , 11 These are initially developed from a general theory and branch into specific hypotheses by deductive reasoning. In the absence of a theory on which to base the hypotheses, inductive reasoning based on specific observations or findings forms more general hypotheses. 10

TYPES OF RESEARCH QUESTIONS AND HYPOTHESES

Research questions and hypotheses are developed according to the type of research, which can be broadly classified into quantitative and qualitative research. We provide a summary of the types of research questions and hypotheses under quantitative and qualitative research categories in Table 1 .

Quantitative research questions
- Descriptive research questions
- Comparative research questions
- Relationship research questions
Quantitative research hypotheses
- Simple hypothesis
- Complex hypothesis
- Directional hypothesis
- Non-directional hypothesis
- Associative hypothesis
- Causal hypothesis
- Null hypothesis
- Alternative hypothesis
- Working hypothesis
- Statistical hypothesis
- Logical hypothesis
- Hypothesis-testing
Qualitative research questions
- Contextual research questions
- Descriptive research questions
- Evaluation research questions
- Explanatory research questions
- Exploratory research questions
- Generative research questions
- Ideological research questions
- Ethnographic research questions
- Phenomenological research questions
- Grounded theory questions
- Qualitative case study questions
Qualitative research hypotheses
- Hypothesis-generating

Research questions in quantitative research

In quantitative research, research questions inquire about the relationships among variables being investigated and are usually framed at the start of the study. These are precise and typically linked to the subject population, dependent and independent variables, and research design. 1 Research questions may also attempt to describe the behavior of a population in relation to one or more variables, or describe the characteristics of variables to be measured ( descriptive research questions ). 1 , 5 , 14 These questions may also aim to discover differences between groups within the context of an outcome variable ( comparative research questions ), 1 , 5 , 14 or elucidate trends and interactions among variables ( relationship research questions ). 1 , 5 We provide examples of descriptive, comparative, and relationship research questions in quantitative research in Table 2 .

Quantitative research questions
Descriptive research question
- Measures responses of subjects to variables
- Presents variables to measure, analyze, or assess
What is the proportion of resident doctors in the hospital who have mastered ultrasonography (response of subjects to a variable) as a diagnostic technique in their clinical training?
Comparative research question
- Clarifies difference between one group with outcome variable and another group without outcome variable
Is there a difference in the reduction of lung metastasis in osteosarcoma patients who received the vitamin D adjunctive therapy (group with outcome variable) compared with osteosarcoma patients who did not receive the vitamin D adjunctive therapy (group without outcome variable)?
- Compares the effects of variables
How does the vitamin D analogue 22-Oxacalcitriol (variable 1) mimic the antiproliferative activity of 1,25-Dihydroxyvitamin D (variable 2) in osteosarcoma cells?
Relationship research question
- Defines trends, association, relationships, or interactions between dependent variable and independent variable
Is there a relationship between the number of medical student suicides (dependent variable) and the level of medical student stress (independent variable) in Japan during the first wave of the COVID-19 pandemic?

Hypotheses in quantitative research

In quantitative research, hypotheses predict the expected relationships among variables. 15 Relationships among variables that can be predicted include 1) between a single dependent variable and a single independent variable ( simple hypothesis ) or 2) between two or more independent and dependent variables ( complex hypothesis ). 4 , 11 Hypotheses may also specify the expected direction to be followed and imply an intellectual commitment to a particular outcome ( directional hypothesis ). 4 On the other hand, hypotheses may not predict the exact direction and are used in the absence of a theory, or when findings contradict previous studies ( non-directional hypothesis ). 4 In addition, hypotheses can 1) define interdependency between variables ( associative hypothesis ), 4 2) propose an effect on the dependent variable from manipulation of the independent variable ( causal hypothesis ), 4 3) state that there is no relationship or difference between two variables ( null hypothesis ), 4 , 11 , 15 4) replace the working hypothesis if rejected ( alternative hypothesis ), 15 5) explain the relationship of phenomena to possibly generate a theory ( working hypothesis ), 11 6) involve quantifiable variables that can be tested statistically ( statistical hypothesis ), 11 or 7) express a relationship whose interlinks can be verified logically ( logical hypothesis ). 11 We provide examples of simple, complex, directional, non-directional, associative, causal, null, alternative, working, statistical, and logical hypotheses in quantitative research, as well as the definition of quantitative hypothesis-testing research in Table 3 .

Quantitative research hypotheses
Simple hypothesis
- Predicts relationship between single dependent variable and single independent variable
If the dose of the new medication (single independent variable) is high, blood pressure (single dependent variable) is lowered.
Complex hypothesis
- Foretells relationship between two or more independent and dependent variables
The higher the use of anticancer drugs, radiation therapy, and adjunctive agents (3 independent variables), the higher would be the survival rate (1 dependent variable).
Directional hypothesis
- Identifies study direction based on theory towards particular outcome to clarify relationship between variables
Privately funded research projects will have a larger international scope (study direction) than publicly funded research projects.
Non-directional hypothesis
- Nature of relationship between two variables or exact study direction is not identified
- Does not involve a theory
Women and men are different in terms of helpfulness. (Exact study direction is not identified)
Associative hypothesis
- Describes variable interdependency
- Change in one variable causes change in another variable
A larger number of people vaccinated against COVID-19 in the region (change in independent variable) will reduce the region’s incidence of COVID-19 infection (change in dependent variable).
Causal hypothesis
- An effect on dependent variable is predicted from manipulation of independent variable
A change into a high-fiber diet (independent variable) will reduce the blood sugar level (dependent variable) of the patient.
Null hypothesis
- A negative statement indicating no relationship or difference between 2 variables
There is no significant difference in the severity of pulmonary metastases between the new drug (variable 1) and the current drug (variable 2).
Alternative hypothesis
- Following a null hypothesis, an alternative hypothesis predicts a relationship between 2 study variables
The new drug (variable 1) is better on average in reducing the level of pain from pulmonary metastasis than the current drug (variable 2).
Working hypothesis
- A hypothesis that is initially accepted for further research to produce a feasible theory
Dairy cows fed with concentrates of different formulations will produce different amounts of milk.
Statistical hypothesis
- Assumption about the value of population parameter or relationship among several population characteristics
- Validity tested by a statistical experiment or analysis
The mean recovery rate from COVID-19 infection (value of population parameter) is not significantly different between population 1 and population 2.
There is a positive correlation between the level of stress at the workplace and the number of suicides (population characteristics) among working people in Japan.
Logical hypothesis
- Offers or proposes an explanation with limited or no extensive evidence
If healthcare workers provide more educational programs about contraception methods, the number of adolescent pregnancies will be less.
Hypothesis-testing (Quantitative hypothesis-testing research)
- Quantitative research uses deductive reasoning.
- This involves the formation of a hypothesis, collection of data in the investigation of the problem, analysis and use of the data from the investigation, and drawing of conclusions to validate or nullify the hypotheses.
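
The hypothesis-testing sequence described above (form a hypothesis, collect data, analyze, then draw a conclusion) can be illustrated with the null hypothesis example of "no significant difference" between a new drug and a current drug. The following is a minimal sketch only: the pain scores are invented for illustration, and the two-sided permutation test, the function names, and the 0.05 threshold are assumptions chosen here, not a method prescribed by the authors.

```python
import random


def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Null hypothesis: there is no difference between the groups, so the
    group labels can be shuffled freely. Returns an estimated p-value.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        permuted = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if permuted >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical pain scores (0-10 scale), invented for illustration only.
new_drug = [3, 4, 2, 5, 3, 4, 2, 3]
current_drug = [6, 5, 7, 6, 5, 7, 6, 8]

ALPHA = 0.05  # assumed significance threshold
p_value = permutation_test(new_drug, current_drug)
if p_value < ALPHA:
    print(f"p = {p_value:.4f}: reject the null hypothesis of no difference")
else:
    print(f"p = {p_value:.4f}: fail to reject the null hypothesis")
```

Rejecting the null hypothesis lends support to the alternative hypothesis but does not prove it. A directional alternative, such as the new drug being better on average, would call for a one-sided version of the test.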

Research questions in qualitative research

Unlike research questions in quantitative research, research questions in qualitative research are usually continuously reviewed and reformulated. The central question and associated subquestions are stated more than the hypotheses. 15 The central question broadly explores a complex set of factors surrounding the central phenomenon, aiming to present the varied perspectives of participants. 15

There are varied goals for which qualitative research questions are developed. These questions can function in several ways, such as to 1) identify and describe existing conditions ( contextual research questions ); 2) describe a phenomenon ( descriptive research questions ); 3) assess the effectiveness of existing methods, protocols, theories, or procedures ( evaluation research questions ); 4) examine a phenomenon or analyze the reasons or relationships between subjects or phenomena ( explanatory research questions ); or 5) focus on unknown aspects of a particular topic ( exploratory research questions ). 5 In addition, some qualitative research questions provide new ideas for the development of theories and actions ( generative research questions ) or advance specific ideologies of a position ( ideological research questions ). 1 Other qualitative research questions may build on a body of existing literature and become working guidelines ( ethnographic research questions ). Research questions may also be broadly stated without specific reference to the existing literature or a typology of questions ( phenomenological research questions ), may be directed towards generating a theory of some process ( grounded theory questions ), or may address a description of the case and the emerging themes ( qualitative case study questions ). 15 We provide examples of contextual, descriptive, evaluation, explanatory, exploratory, generative, ideological, ethnographic, phenomenological, grounded theory, and qualitative case study research questions in qualitative research in Table 4 , and the definition of qualitative hypothesis-generating research in Table 5 .

Qualitative research questions
Contextual research question
- Asks about the nature of what already exists
- Individuals or groups function to further clarify and understand the natural context of real-world problems
What are the experiences of nurses working night shifts in healthcare during the COVID-19 pandemic? (natural context of real-world problems)
Descriptive research question
- Aims to describe a phenomenon
What are the different forms of disrespect and abuse (phenomenon) experienced by Tanzanian women when giving birth in healthcare facilities?
Evaluation research question
- Examines the effectiveness of existing practice or accepted frameworks
How effective are decision aids (effectiveness of existing practice) in helping decide whether to give birth at home or in a healthcare facility?
Explanatory research question
- Clarifies a previously studied phenomenon and explains why it occurs
Why is there an increase in teenage pregnancy (phenomenon) in Tanzania?
Exploratory research question
- Explores areas that have not been fully investigated to have a deeper understanding of the research problem
What factors affect the mental health of medical students (areas that have not yet been fully investigated) during the COVID-19 pandemic?
Generative research question
- Develops an in-depth understanding of people’s behavior by asking ‘how would’ or ‘what if’ to identify problems and find solutions
How would the extensive research experience of the behavior of new staff impact the success of the novel drug initiative?
Ideological research question
- Aims to advance specific ideas or ideologies of a position
Are Japanese nurses who volunteer in remote African hospitals able to promote humanized care of patients (specific ideas or ideologies) in the areas of safe patient environment, respect of patient privacy, and provision of accurate information related to health and care?
Ethnographic research question
- Clarifies peoples’ nature, activities, their interactions, and the outcomes of their actions in specific settings
What are the demographic characteristics, rehabilitative treatments, community interactions, and disease outcomes (nature, activities, their interactions, and the outcomes) of people in China who are suffering from pneumoconiosis?
Phenomenological research question
- Seeks to know more about the phenomena that have impacted an individual
What are the lived experiences of parents who have been living with and caring for children with a diagnosis of autism? (phenomena that have impacted an individual)
Grounded theory question
- Focuses on social processes asking about what happens and how people interact, or uncovering social relationships and behaviors of groups
What are the problems that pregnant adolescents face in terms of social and cultural norms (social processes), and how can these be addressed?
Qualitative case study question
- Assesses a phenomenon using different sources of data to answer “why” and “how” questions
- Considers how the phenomenon is influenced by its contextual situation.
How does quitting work and assuming the role of a full-time mother (phenomenon assessed) change the lives of women in Japan?
Qualitative research hypotheses
Hypothesis-generating (Qualitative hypothesis-generating research)
- Qualitative research uses inductive reasoning.
- This involves data collection from study participants or the literature regarding a phenomenon of interest, using the collected data to develop a formal hypothesis, and using the formal hypothesis as a framework for testing the hypothesis.
- Qualitative exploratory studies explore areas deeper, clarifying subjective experience and allowing formulation of a formal hypothesis potentially testable in a future quantitative approach.

Qualitative studies usually pose at least one central research question and several subquestions starting with How or What . These research questions use exploratory verbs such as explore or describe . These also focus on one central phenomenon of interest, and may mention the participants and research site. 15

Hypotheses in qualitative research

Hypotheses in qualitative research are stated in the form of a clear statement concerning the problem to be investigated. Unlike in quantitative research where hypotheses are usually developed to be tested, qualitative research can lead to both hypothesis-testing and hypothesis-generating outcomes. 2 When studies require both quantitative and qualitative research questions, this suggests an integrative process between both research methods wherein a single mixed-methods research question can be developed. 1

FRAMEWORKS FOR DEVELOPING RESEARCH QUESTIONS AND HYPOTHESES

Research questions followed by hypotheses should be developed before the start of the study. 1 , 12 , 14 It is crucial to develop feasible research questions on a topic that is interesting to both the researcher and the scientific community. This can be achieved by a meticulous review of previous and current studies to establish a novel topic. Specific areas are subsequently focused on to generate ethical research questions. The relevance of the research questions is evaluated in terms of clarity of the resulting data, specificity of the methodology, objectivity of the outcome, depth of the research, and impact of the study. 1 , 5 These aspects constitute the FINER criteria (i.e., Feasible, Interesting, Novel, Ethical, and Relevant). 1 Clarity and effectiveness are achieved if research questions meet the FINER criteria. In addition to the FINER criteria, Ratan et al. described focus, complexity, novelty, feasibility, and measurability for evaluating the effectiveness of research questions. 14

The PICOT and PEO frameworks are also used when developing research questions. 1 These frameworks address the following elements. PICOT: P-population/patients/problem, I-intervention or indicator being studied, C-comparison group, O-outcome of interest, and T-timeframe of the study. PEO: P-population being studied, E-exposure to preexisting conditions, and O-outcome of interest. 1 Research questions are also considered good if they meet the “FINERMAPS” framework: Feasible, Interesting, Novel, Ethical, Relevant, Manageable, Appropriate, Potential value/publishable, and Systematic. 14
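
As an illustration of how the PICOT elements can be combined, the sketch below assembles a draft research question from the five components and flags any that are left blank. It is a hypothetical helper, not part of the framework itself: the class name, the sentence template, and the 12-month timeframe are assumptions, while the population, intervention, comparison, and outcome are borrowed from the osteosarcoma example given earlier.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class PICOT:
    """Container for the five PICOT elements of a research question."""

    population: str
    intervention: str
    comparison: str
    outcome: str
    timeframe: str

    def missing_elements(self) -> list[str]:
        """Return the names of elements that were left blank."""
        return [name for name, value in vars(self).items() if not value.strip()]

    def to_question(self) -> str:
        """Assemble a draft question using one assumed sentence template."""
        return (
            f"In {self.population}, does {self.intervention}, "
            f"compared with {self.comparison}, affect {self.outcome} "
            f"within {self.timeframe}?"
        )


draft = PICOT(
    population="osteosarcoma patients",
    intervention="vitamin D adjunctive therapy",
    comparison="no vitamin D adjunctive therapy",
    outcome="the reduction of lung metastasis",
    timeframe="12 months of follow-up",  # hypothetical timeframe
)
print(draft.to_question())
print("Missing elements:", draft.missing_elements())
```

A draft produced this way is only a starting point. It still has to be judged against criteria such as FINER before it is adopted.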

As we indicated earlier, research questions and hypotheses that are not carefully formulated result in unethical studies or poor outcomes. To illustrate this, we provide some examples of ambiguous research questions and hypotheses that result in unclear and weak research objectives in quantitative research ( Table 6 ) 16 and qualitative research ( Table 7 ) 17 , and show how to transform these ambiguous research question(s) and hypothesis(es) into clear and good statements.

Research question
- Unclear and weak statement (Statement 1): Which is more effective between smoke moxibustion and smokeless moxibustion?
- Clear and good statement (Statement 2): “Moreover, regarding smoke moxibustion versus smokeless moxibustion, it remains unclear which is more effective, safe, and acceptable to pregnant women, and whether there is any difference in the amount of heat generated.”
- Points to avoid: 1) Vague and unfocused questions; 2) Closed questions simply answerable by yes or no; 3) Questions requiring a simple choice
Hypothesis
- Unclear and weak statement (Statement 1): The smoke moxibustion group will have higher cephalic presentation.
- Clear and good statement (Statement 2): “Hypothesis 1. The smoke moxibustion stick group (SM group) and smokeless moxibustion stick group (SLM group) will have higher rates of cephalic presentation after treatment than the control group. Hypothesis 2. The SM group and SLM group will have higher rates of cephalic presentation at birth than the control group. Hypothesis 3. There will be no significant differences in the well-being of the mother and child among the three groups in terms of the following outcomes: premature birth, premature rupture of membranes (PROM) at < 37 weeks, Apgar score < 7 at 5 min, umbilical cord blood pH < 7.1, admission to neonatal intensive care unit (NICU), and intrauterine fetal death.”
- Points to avoid: 1) Unverifiable hypotheses; 2) Incompletely stated groups of comparison; 3) Insufficiently described variables or outcomes
Research objective
- Unclear and weak statement (Statement 1): To determine which is more effective between smoke moxibustion and smokeless moxibustion.
- Clear and good statement (Statement 2): “The specific aims of this pilot study were (a) to compare the effects of smoke moxibustion and smokeless moxibustion treatments with the control group as a possible supplement to ECV for converting breech presentation to cephalic presentation and increasing adherence to the newly obtained cephalic position, and (b) to assess the effects of these treatments on the well-being of the mother and child.”
- Points to avoid: 1) Poor understanding of the research question and hypotheses; 2) Insufficient description of population, variables, or study outcomes

The Statement 1 entries were composed for comparison and illustrative purposes only.

The Statement 2 entries are direct quotes from Higashihara and Horiuchi. 16

Research question
- Unclear and weak statement (Statement 1): Does disrespect and abuse (D&A) occur in childbirth in Tanzania?
- Clear and good statement (Statement 2): How does disrespect and abuse (D&A) occur and what are the types of physical and psychological abuses observed in midwives’ actual care during facility-based childbirth in urban Tanzania?
- Points to avoid: 1) Ambiguous or oversimplistic questions; 2) Questions unverifiable by data collection and analysis
Hypothesis
- Unclear and weak statement (Statement 1): Disrespect and abuse (D&A) occur in childbirth in Tanzania.
- Clear and good statement (Statement 2): Hypothesis 1: Several types of physical and psychological abuse by midwives in actual care occur during facility-based childbirth in urban Tanzania. Hypothesis 2: Weak nursing and midwifery management contribute to the D&A of women during facility-based childbirth in urban Tanzania.
- Points to avoid: 1) Statements simply expressing facts; 2) Insufficiently described concepts or variables
Research objective
- Unclear and weak statement (Statement 1): To describe disrespect and abuse (D&A) in childbirth in Tanzania.
- Clear and good statement (Statement 2): “This study aimed to describe from actual observations the respectful and disrespectful care received by women from midwives during their labor period in two hospitals in urban Tanzania.”
- Points to avoid: 1) Statements unrelated to the research question and hypotheses; 2) Unattainable or unexplorable objectives

The Statement 2 research objective is a direct quote from Shimoda et al. 17

The other statements were composed for comparison and illustrative purposes only.

CONSTRUCTING RESEARCH QUESTIONS AND HYPOTHESES

To construct effective research questions and hypotheses, it is very important to 1) clarify the background and 2) identify the research problem at the outset of the research, within a specific timeframe. 9 Then, 3) review or conduct preliminary research to collect all available knowledge about the possible research questions by studying theories and previous studies. 18 Afterwards, 4) construct research questions to investigate the research problem. Identify variables to be assessed from the research questions 4 and make operational definitions of constructs from the research problem and questions. Thereafter, 5) construct specific deductive or inductive predictions in the form of hypotheses. 4 Finally, 6) state the study aims. This general flow for constructing effective research questions and hypotheses prior to conducting research is shown in Fig. 1 .

Research questions are used more frequently in qualitative research than objectives or hypotheses. 3 These questions seek to discover, understand, explore or describe experiences by asking “What” or “How.” The questions are open-ended to elicit a description rather than to relate variables or compare groups. The questions are continually reviewed, reformulated, and changed during the qualitative study. 3 In quantitative research, research questions are used more frequently in survey projects, whereas hypotheses are used more frequently in experiments, to compare variables and their relationships.

Hypotheses are constructed based on the variables identified and are written as if-then statements, following the template, ‘If a specific action is taken, then a certain outcome is expected.’ At this stage, some ideas regarding expectations from the research to be conducted must be drawn. 18 Then, the variables to be manipulated (independent) and influenced (dependent) are defined. 4 Thereafter, the hypothesis is stated and refined, and reproducible data tailored to the hypothesis are identified, collected, and analyzed. 4 The hypotheses must be testable and specific, 18 and should describe the variables and their relationships, the specific group being studied, and the predicted research outcome. 18 Hypothesis construction involves deducing a testable proposition from theory and separating the independent and dependent variables so that each is measured separately. 3 Therefore, good hypotheses must be based on good research questions constructed at the start of a study or trial. 12
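
The if-then template can be made concrete with a small sketch that keeps the independent variable, the dependent variable, the study group, and the predicted outcome as separate fields before joining them into one statement. The class and field names are hypothetical, and the study group ("adults with hypertension") is an assumption added for illustration; the variables themselves come from the simple hypothesis example given earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IfThenHypothesis:
    """Parts of a hypothesis written as an if-then statement."""

    independent_variable: str  # variable to be manipulated
    manipulation: str          # the specific action taken on it
    dependent_variable: str    # variable expected to be influenced
    predicted_outcome: str     # the expected change
    study_group: str           # the specific group being studied

    def statement(self) -> str:
        """Join the parts into one 'If ..., then ...' sentence."""
        return (
            f"If {self.independent_variable} {self.manipulation} "
            f"in {self.study_group}, then {self.dependent_variable} "
            f"{self.predicted_outcome}."
        )


hypothesis = IfThenHypothesis(
    independent_variable="the dose of the new medication",
    manipulation="is high",
    dependent_variable="blood pressure",
    predicted_outcome="is lowered",
    study_group="adults with hypertension",  # hypothetical study group
)
print(hypothesis.statement())
```

Keeping the parts in separate fields makes it easy to see when a draft hypothesis omits the study group or leaves the predicted outcome vague.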

In summary, research questions are constructed after establishing the background of the study. Hypotheses are then developed based on the research questions. Thus, it is crucial to have excellent research questions to generate superior hypotheses. In turn, these would determine the research objectives and the design of the study, and ultimately, the outcome of the research. 12 Algorithms for building research questions and hypotheses are shown in Fig. 2 for quantitative research and in Fig. 3 for qualitative research.

EXAMPLES OF RESEARCH QUESTIONS FROM PUBLISHED ARTICLES

  • EXAMPLE 1. Descriptive research question (quantitative research)
  • - Presents research variables to be assessed (distinct phenotypes and subphenotypes)
  • “BACKGROUND: Since COVID-19 was identified, its clinical and biological heterogeneity has been recognized. Identifying COVID-19 phenotypes might help guide basic, clinical, and translational research efforts.
  • RESEARCH QUESTION: Does the clinical spectrum of patients with COVID-19 contain distinct phenotypes and subphenotypes? ” 19
  • EXAMPLE 2. Relationship research question (quantitative research)
  • - Shows interactions between dependent variable (static postural control) and independent variable (peripheral visual field loss)
  • “Background: Integration of visual, vestibular, and proprioceptive sensations contributes to postural control. People with peripheral visual field loss have serious postural instability. However, the directional specificity of postural stability and sensory reweighting caused by gradual peripheral visual field loss remain unclear.
  • Research question: What are the effects of peripheral visual field loss on static postural control ?” 20
  • EXAMPLE 3. Comparative research question (quantitative research)
  • - Clarifies the difference among groups with an outcome variable (patients enrolled in COMPERA with moderate PH or severe PH in COPD) and another group without the outcome variable (patients with idiopathic pulmonary arterial hypertension (IPAH))
  • “BACKGROUND: Pulmonary hypertension (PH) in COPD is a poorly investigated clinical condition.
  • RESEARCH QUESTION: Which factors determine the outcome of PH in COPD?
  • STUDY DESIGN AND METHODS: We analyzed the characteristics and outcome of patients enrolled in the Comparative, Prospective Registry of Newly Initiated Therapies for Pulmonary Hypertension (COMPERA) with moderate or severe PH in COPD as defined during the 6th PH World Symposium who received medical therapy for PH and compared them with patients with idiopathic pulmonary arterial hypertension (IPAH) .” 21
  • EXAMPLE 4. Exploratory research question (qualitative research)
  • - Explores areas that have not been fully investigated (perspectives of families and children who receive care in clinic-based child obesity treatment) to have a deeper understanding of the research problem
  • “Problem: Interventions for children with obesity lead to only modest improvements in BMI and long-term outcomes, and data are limited on the perspectives of families of children with obesity in clinic-based treatment. This scoping review seeks to answer the question: What is known about the perspectives of families and children who receive care in clinic-based child obesity treatment? This review aims to explore the scope of perspectives reported by families of children with obesity who have received individualized outpatient clinic-based obesity treatment.” 22
  • EXAMPLE 5. Relationship research question (quantitative research)
  • - Defines interactions between dependent variable (use of ankle strategies) and independent variable (changes in muscle tone)
  • “Background: To maintain an upright standing posture against external disturbances, the human body mainly employs two types of postural control strategies: “ankle strategy” and “hip strategy.” While it has been reported that the magnitude of the disturbance alters the use of postural control strategies, it has not been elucidated how the level of muscle tone, one of the crucial parameters of bodily function, determines the use of each strategy. We have previously confirmed using forward dynamics simulations of human musculoskeletal models that an increased muscle tone promotes the use of ankle strategies. The objective of the present study was to experimentally evaluate a hypothesis: an increased muscle tone promotes the use of ankle strategies. Research question: Do changes in the muscle tone affect the use of ankle strategies ?” 23

EXAMPLES OF HYPOTHESES IN PUBLISHED ARTICLES

  • EXAMPLE 1. Working hypothesis (quantitative research)
  • - A hypothesis that is initially accepted for further research to produce a feasible theory
  • “As fever may have benefit in shortening the duration of viral illness, it is plausible to hypothesize that the antipyretic efficacy of ibuprofen may be hindering the benefits of a fever response when taken during the early stages of COVID-19 illness.” 24
  • “In conclusion, it is plausible to hypothesize that the antipyretic efficacy of ibuprofen may be hindering the benefits of a fever response. The difference in perceived safety of these agents in COVID-19 illness could be related to the more potent efficacy to reduce fever with ibuprofen compared to acetaminophen. Compelling data on the benefit of fever warrant further research and review to determine when to treat or withhold ibuprofen for early stage fever for COVID-19 and other related viral illnesses.” 24
  • EXAMPLE 2. Exploratory hypothesis (qualitative research)
  • - Explores particular areas deeper to clarify subjective experience and develop a formal hypothesis potentially testable in a future quantitative approach
  • “We hypothesized that when thinking about a past experience of help-seeking, a self-distancing prompt would cause increased help-seeking intentions and more favorable help-seeking outcome expectations.” 25
  • “Conclusion
  • Although a priori hypotheses were not supported, further research is warranted as results indicate the potential for using self-distancing approaches to increasing help-seeking among some people with depressive symptomatology.” 25
  • EXAMPLE 3. Hypothesis-generating research to establish a framework for hypothesis testing (qualitative research)
  • “We hypothesize that compassionate care is beneficial for patients (better outcomes), healthcare systems and payers (lower costs), and healthcare providers (lower burnout).” 26
  • “Compassionomics is the branch of knowledge and scientific study of the effects of compassionate healthcare. Our main hypotheses are that compassionate healthcare is beneficial for (1) patients, by improving clinical outcomes, (2) healthcare systems and payers, by supporting financial sustainability, and (3) HCPs, by lowering burnout and promoting resilience and well-being. The purpose of this paper is to establish a scientific framework for testing the hypotheses above. If these hypotheses are confirmed through rigorous research, compassionomics will belong in the science of evidence-based medicine, with major implications for all healthcare domains.” 26
  • EXAMPLE 4. Statistical hypothesis (quantitative research)
  • - An assumption is made about the relationship among several population characteristics (gender differences in sociodemographic and clinical characteristics of adults with ADHD). Validity is tested by statistical experiment or analysis (chi-square test, Student's t-test, and logistic regression analysis)
  • “Our research investigated gender differences in sociodemographic and clinical characteristics of adults with ADHD in a Japanese clinical sample. Due to unique Japanese cultural ideals and expectations of women's behavior that are in opposition to ADHD symptoms, we hypothesized that women with ADHD experience more difficulties and present more dysfunctions than men. We tested the following hypotheses: first, women with ADHD have more comorbidities than men with ADHD; second, women with ADHD experience more social hardships than men, such as having less full-time employment and being more likely to be divorced.” 27
  • “Statistical Analysis
  • (text omitted) Between-gender comparisons were made using the chi-squared test for categorical variables and Student's t-test for continuous variables…(text omitted). A logistic regression analysis was performed for employment status, marital status, and comorbidity to evaluate the independent effects of gender on these dependent variables.” 27
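
The between-group comparison of a categorical variable mentioned in Example 4 relies on the chi-squared test of independence. As a rough illustration only, the sketch below computes Pearson's chi-squared statistic (without continuity correction) and its p-value for a 2×2 table. The counts are invented for illustration and are not data from the cited study, and the function name `chi_squared_2x2` is hypothetical.

```python
import math


def chi_squared_2x2(table):
    """Return (statistic, p_value) for a 2x2 table of observed counts.

    Pearson's chi-squared test of independence, no continuity correction.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed_count in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed_count - expected) ** 2 / expected

    # A 2x2 table has 1 degree of freedom, for which
    # P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts: rows are two groups, columns are outcome present / absent.
observed = [[10, 20],
            [20, 10]]
statistic, p_value = chi_squared_2x2(observed)
print(round(statistic, 3), round(p_value, 4))
```

A small p-value suggests that the outcome is not distributed identically across the two groups. For continuous variables, a t-test plays the corresponding role.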

EXAMPLES OF HYPOTHESIS AS WRITTEN IN PUBLISHED ARTICLES IN RELATION TO OTHER PARTS

  • EXAMPLE 1. Background, hypotheses, and aims are provided
  • “Pregnant women need skilled care during pregnancy and childbirth, but that skilled care is often delayed in some countries …( text omitted ). The focused antenatal care (FANC) model of WHO recommends that nurses provide information or counseling to all pregnant women …( text omitted ). Job aids are visual support materials that provide the right kind of information using graphics and words in a simple and yet effective manner. When nurses are not highly trained or have many work details to attend to, these job aids can serve as a content reminder for the nurses and can be used for educating their patients (Jennings, Yebadokpo, Affo, & Agbogbe, 2010) ( text omitted ). Importantly, additional evidence is needed to confirm how job aids can further improve the quality of ANC counseling by health workers in maternal care …( text omitted )” 28
  • “This has led us to hypothesize that the quality of ANC counseling would be better if supported by job aids. Consequently, a better quality of ANC counseling is expected to produce higher levels of awareness concerning the danger signs of pregnancy and a more favorable impression of the caring behavior of nurses.” 28
  • “This study aimed to examine the differences in the responses of pregnant women to a job aid-supported intervention during ANC visit in terms of 1) their understanding of the danger signs of pregnancy and 2) their impression of the caring behaviors of nurses to pregnant women in rural Tanzania.” 28
  • EXAMPLE 2. Background, hypotheses, and aims are provided
  • “We conducted a two-arm randomized controlled trial (RCT) to evaluate and compare changes in salivary cortisol and oxytocin levels of first-time pregnant women between experimental and control groups. The women in the experimental group touched and held an infant for 30 min (experimental intervention protocol), whereas those in the control group watched a DVD movie of an infant (control intervention protocol). The primary outcome was salivary cortisol level and the secondary outcome was salivary oxytocin level.” 29
  • “We hypothesize that at 30 min after touching and holding an infant, the salivary cortisol level will significantly decrease and the salivary oxytocin level will increase in the experimental group compared with the control group.” 29
  • EXAMPLE 3. Background, aim, and hypothesis are provided
  • “In countries where the maternal mortality ratio remains high, antenatal education to increase Birth Preparedness and Complication Readiness (BPCR) is considered one of the top priorities [1]. BPCR includes birth plans during the antenatal period, such as the birthplace, birth attendant, transportation, health facility for complications, expenses, and birth materials, as well as family coordination to achieve such birth plans. In Tanzania, although increasing, only about half of all pregnant women attend an antenatal clinic more than four times [4]. Moreover, the information provided during antenatal care (ANC) is insufficient. In the resource-poor settings, antenatal group education is a potential approach because of the limited time for individual counseling at antenatal clinics.” 30
  • “This study aimed to evaluate an antenatal group education program among pregnant women and their families with respect to birth-preparedness and maternal and infant outcomes in rural villages of Tanzania.” 30
  • “The study hypothesis was if Tanzanian pregnant women and their families received a family-oriented antenatal group education, they would (1) have a higher level of BPCR, (2) attend antenatal clinic four or more times, (3) give birth in a health facility, (4) have less complications of women at birth, and (5) have less complications and deaths of infants than those who did not receive the education.” 30

Research questions and hypotheses are crucial components of any type of research, whether quantitative or qualitative. These questions should be developed at the very beginning of the study. Excellent research questions lead to superior hypotheses, which, like a compass, set the direction of research and can often determine the successful conduct of the study. Many research studies have floundered because the development of research questions and subsequent hypotheses was not given the thought and meticulous attention needed. The development of research questions and hypotheses is an iterative process based on extensive knowledge of the literature and an insightful grasp of the knowledge gap. Focused, concise, and specific research questions provide a strong foundation for constructing hypotheses, which serve as formal predictions about the research outcomes. Research questions and hypotheses should therefore not be overlooked; they should be carefully thought out and constructed when planning research. This avoids unethical studies and poor outcomes by defining well-founded objectives that determine the design, course, and outcome of the study.

Disclosure: The authors have no potential conflicts of interest to disclose.

Author Contributions:

  • Conceptualization: Barroga E, Matanguihan GJ.
  • Methodology: Barroga E, Matanguihan GJ.
  • Writing - original draft: Barroga E, Matanguihan GJ.
  • Writing - review & editing: Barroga E, Matanguihan GJ.

Demographic Variations in VEMP Responses: A Cross-Sectional Study of Normative Data from an Indian Population

  • Original Article
  • Published: 06 September 2024

  • Sanjay Kumar (ORCID: orcid.org/0000-0002-9737-7327),
  • Rashmi Natraj &
  • Angshuman Dutta

This study aimed to establish normative data for cervical (cVEMP) and ocular (oVEMP) vestibular evoked myogenic potentials in the Indian population, with a focus on assessing demographic variations across different age groups and genders. A cross-sectional observational study was conducted from January 2023 to December 2023 at a tertiary care center, involving 40 participants with normal hearing thresholds. Standardized cVEMP and oVEMP tests were performed using 500 Hz tone bursts at 95–115 dB nHL. VEMP responses, including latency and amplitude, were statistically analyzed using descriptive statistics, ANOVA, and Kruskal–Wallis tests to determine normative ranges and assess the impact of age and gender. The study established normative VEMP thresholds at about 105 dB for cervical (cVEMP) and ocular (oVEMP) responses, notably higher than the global average of 70–100 dB, suggesting unique regional variations in the Indian demographic. Normative values ranged from 45.68 at the 5th percentile to 92.32 at the 95th percentile for cVEMPs, and 13.12 to 18.58 for oVEMPs. Data analysis showed exceptionally stable VEMP responses with minimal age-related decline—significantly less than typically seen in Western populations, and no significant gender differences, indicating consistent vestibular function across diverse demographic groups. This research significantly enhances the diagnostic framework for vestibular disorders in India by establishing tailored normative VEMP data. These benchmarks facilitate precise clinical assessments and support the development of customized treatment protocols, improving healthcare outcomes for vestibular dysfunctions.
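
The abstract reports normative values as 5th and 95th percentile bounds but does not say how they were computed. As a hedged illustration, the sketch below shows one common way to obtain such bounds with Python's standard library. The readings are invented for illustration and are not the study's data, and the interpolation method chosen here is an assumption.

```python
import statistics

# Hypothetical readings, invented for illustration only.
readings = [42.0, 48.5, 55.1, 60.3, 63.7, 68.2, 71.9, 77.4, 84.0, 95.6]

# n=20 yields 19 cut points at 5%, 10%, ..., 95%.
# method="inclusive" treats the data as including the population's minimum
# and maximum, so cut points never fall outside the observed range.
cuts = statistics.quantiles(readings, n=20, method="inclusive")
p5, p95 = cuts[0], cuts[-1]

print(f"5th percentile: {p5:.2f}, 95th percentile: {p95:.2f}")
```

Different interpolation rules give slightly different bounds on small samples, so a normative range should state which rule was used.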

Data Availability Statement (DAS)

The present study incorporates an accompanying additional data file that has been carefully reformatted to ensure the utmost protection of patient information. To obtain further information or to conduct a more comprehensive analysis of the datasets, individuals can reach out to the corresponding author, Dr. Sanjay Kumar. It is essential to acknowledge that the original raw data are kept confidential because of rigorous privacy regulations. Moreover, all data gathering procedures were carried out in strict adherence to the ethical standards set forth by the Command Hospital Air Force, located in Bangalore, India.

Acknowledgements

We would like to express our gratitude to the committed medical and administrative personnel of Command Hospital Air Force, Bangalore, India, for their consistent support and assistance during the duration of the study. The present study was supported by self-funding, and we express our gratitude to those individuals who, via personal financial resources and unwavering commitment, facilitated the execution of this research.

Author information

Authors and affiliations.

Department of Ear, Nose, Throat - Head and Neck Surgery (ENT-HNS), Command Hospital Air Force, Bangalore, Karnataka, 560007, India

Sanjay Kumar, Rashmi Natraj & Angshuman Dutta

Corresponding author

Correspondence to Sanjay Kumar .

Ethics declarations

Conflict of interests.

No conflicts of interest encountered during the study.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (DOCX 13 KB)

Supplementary file 2 (XLSX 32 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Kumar, S., Natraj, R. & Dutta, A. Demographic Variations in VEMP Responses: A Cross-Sectional Study of Normative Data from an Indian Population. Indian J Otolaryngol Head Neck Surg (2024). https://doi.org/10.1007/s12070-024-05043-6

Received : 16 April 2024

Accepted : 17 July 2024

Published : 06 September 2024

DOI : https://doi.org/10.1007/s12070-024-05043-6

  • Vestibular function
  • Normative data
  • Indian population
  • Diagnostic accuracy
  • Age variations
  • Gender differences

IMAGES

  1. Introduction to Descriptive Statistics

  2. Introduction to Descriptive Statistics

  3. How To Use Descriptive Analysis In Research

  4. Introduction to Descriptive Statistics

  5. 7 Types of Statistical Analysis: Definition and Explanation

  6. PPT

VIDEO

  1. The 'Descriptive Dynamite' Formula: Explode Your English Eloquence Today

  2. Descriptive Statistics Explained and Calculated

  3. Uses of Descriptive Geometry in Mathematics : Math Skills

  4. Calculating Descriptive Statistics of Data in R

  5. Median Calculation Using Karl Pearson's Approximate Relationship Among Mean-Median-Mode

  6. Formula Research Fusion II

COMMENTS

  1. Descriptive Research

    Descriptive research methods. Descriptive research is usually defined as a type of quantitative research, though qualitative research can also be used for descriptive purposes. The research design should be carefully developed to ensure that the results are valid and reliable. Surveys. Survey research allows you to gather large volumes of data that can be analyzed for frequencies, averages ...

  2. Descriptive Research Design

    As discussed earlier, common research methods for descriptive research include surveys, case studies, observational studies, cross-sectional studies, and longitudinal studies. Design your study: Plan the details of your study, including the sampling strategy, data collection methods, and data analysis plan.

  3. Descriptive Statistics

    Descriptive Statistics Formulas. Some of the most commonly used formulas in descriptive statistics: Mean (μ or x̄): The average of all the numbers in the dataset. It is computed by summing all the observations and dividing by the number of observations. Formula: μ = Σx/n or x̄ = Σx/n

  4. Descriptive Statistics

    Descriptive Statistics | Definitions, Types, Examples

  5. What is Descriptive Research? Definition, Methods, Types and Examples

    Descriptive research is a methodological approach that seeks to depict the characteristics of a phenomenon or subject under investigation. In scientific inquiry, it serves as a foundational tool for researchers aiming to observe, record, and analyze the intricate details of a particular topic. This method provides a rich and detailed account ...

  6. Descriptive Research: Characteristics, Methods + Examples

    Descriptive research is a research method describing the characteristics of the population or phenomenon studied. This descriptive methodology focuses more on the "what" of the research subject than the "why" of the research subject. The method primarily focuses on describing the nature of a demographic segment without focusing on ...

  7. Descriptive Analytics

    Some common Descriptive Analytics Tools are as follows: Excel: Microsoft Excel is a widely used tool that can be used for simple descriptive analytics. It has powerful statistical and data visualization capabilities. Pivot tables are a particularly useful feature for summarizing and analyzing large data sets.

  8. Descriptive Research 101: Definition, Methods and Examples

    Definition: As its name says, descriptive research describes the characteristics of the problem, phenomenon, situation, or group under study. So the goal of all descriptive studies is to explore the background, details, and existing patterns in the problem to fully understand it. In other words, preliminary research.

  9. Descriptive Statistics in Research: Your Complete Guide- Qualtrics

    It's also important to note that descriptive statistics can employ and use both quantitative and qualitative research. Describing data is undoubtedly the most critical first step in research as it enables the subsequent organization, simplification and summarization of information — and every survey question and population has summary ...

  10. Descriptive Research Design

    Descriptive research methods. Descriptive research is usually defined as a type of quantitative research, though qualitative research can also be used for descriptive purposes. The research design should be carefully developed to ensure that the results are valid and reliable. Surveys. Survey research allows you to gather large volumes of data that can be analysed for frequencies, averages ...

  11. Descriptive research: What it is and how to use it

    Descriptive research design. Descriptive research design uses a range of both qualitative research and quantitative data (although quantitative research is the primary research method) to gather information to make accurate predictions about a particular problem or hypothesis. As a survey method, descriptive research designs will help ...

  12. Descriptive Statistics: Definitions, Types, Examples

    Descriptive statistics is the study of numerical and graphical ways of describing and displaying data. Here are some important concepts. ... Scientific research, hypothesis testing: ... The standard deviation formula varies for population and sample. Both formulas are similar but not the same. Symbol used for Sample ...

  13. Descriptive Statistics

    Descriptive Statistics | Definitions, Types, Examples. Published on 4 November 2022 by Pritha Bhandari. Revised on 9 January 2023. Descriptive statistics summarise and organise characteristics of a data set. A data set is a collection of responses or observations from a sample or entire population. In quantitative research, after collecting data, the first step of statistical analysis is to ...

  14. Chapter 3 Descriptive Statistics

    3. Descriptive Statistics. When you have an area of interest that you want to research, a problem that you want to solve, a relationship that you want to investigate, theoretical and empirical processes will help you. Estimand is defined as "a quantity of scientific interest that can be calculated in the population ...

  15. Quantitative analysis: Descriptive statistics

    Numeric data collected in a research project can be analysed quantitatively using statistical tools in two different ways. Descriptive analysis refers to statistically describing, ... Using the above formula, as shown in Table 14.1, the manually computed value of correlation between age and self-esteem is 0.79. ...

  16. Descriptive Analysis of Research Data

    This article briefly discusses common descriptive data analysis measures including frequency distributions, central tendency, variability, and correlation. However, statistical textbooks should be consulted for a thorough discussion and computational formulas. Inferential analysis will be discussed in a future article.

  17. Descriptive Statistics

    Descriptive Statistics. Conducting Educational Research Calculating Descriptive Statistics. Once the data has been coded and double-checked, the next step is to calculate Descriptive Statistics. The three main types of descriptive statistics are frequencies, measures of central tendency (also called averages), and measures of variability.

  18. Descriptive research

    Descriptive science is a category of science that involves descriptive research; that is, observing, recording, describing, and classifying phenomena. Descriptive research is sometimes contrasted with hypothesis-driven research, which is focused on testing a particular hypothesis by means of experimentation. [3] David A. Grimaldi and Michael S. Engel suggest that descriptive science in biology ...

  19. Chapter 1. Descriptive Statistics and Frequency Distributions

    Descriptive Statistics and Frequency Distributions. This chapter is about describing populations and samples, a subject known as descriptive statistics. This will all make more sense if you keep in mind that the information you want to produce is a description of the population or sample as a whole, not a description of one member of the population.

  20. Descriptive Statistics for Summarising Data

    Using the data from these three rows, we can draw the following descriptive picture. Mentabil scores spanned a range of 50 (from a minimum score of 85 to a maximum score of 135). Speed scores had a range of 16.05 s (from 1.05 s, the fastest quality decision, to 17.10 s, the slowest quality decision).

  21. Methods and formulas for Descriptive Statistics (Tables)

    If the number of observations in a data set is odd, the median is the value in the middle. If the number of observations in a data set is even, the median is the average of the two middle values. Use the following method to calculate the median for each cell or margin using the data corresponding to that cell or margin.

  22. Descriptive Statistics: Definition, Formulas, Types, Examples

    Descriptive Statistics Definition. Descriptive statistics is a type of statistical analysis that uses quantitative methods to summarize the features of a population sample. It is useful to present easy and exact summaries of the sample and observations using metrics such as mean, median, variance, graphs, and charts.

  23. A Practical Guide to Writing Quantitative and Qualitative Research

    These questions can function in several ways, such as to 1) identify and describe existing conditions (contextual research questions); 2) describe a phenomenon (descriptive research questions); 3) assess the effectiveness of existing methods, protocols, theories, or procedures (evaluation research questions); 4) examine a phenomenon or analyze ...

  24. Motivations of family advisors in engaging in research to improve a

    Design. This study utilized Sally Thorne's (2016) interpretive description methodology to address the research question. Interpretive description is grounded in naturalistic inquiry and holds that objective knowledge is unattainable through empirical analysis; rather, the participants and researcher construct meaning together. Interpretive description is known as a useful methodology to generate ...

  25. Demographic Variations in VEMP Responses: A Cross-Sectional ...

    This study aimed to establish normative data for cervical (cVEMP) and ocular (oVEMP) vestibular evoked myogenic potentials in the Indian population, with a focus on assessing demographic variations across different age groups and genders. A cross-sectional observational study was conducted from January 2023 to December 2023 at a tertiary care center, involving 40 participants with normal ...
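
Several of the snippets above describe formulas without showing them in use: the mean as Σx/n, the median rule for odd and even counts, the range as maximum minus minimum, the separate population and sample forms of the standard deviation, and a manually computed correlation. The sketch below works through each of these with Python's standard library. The data are invented for illustration and come from none of the listed sources. The helper name `pearson_r` is hypothetical, and reading "correlation" as Pearson's r is an assumption.

```python
import math
import statistics

# Hypothetical scores, invented purely for illustration.
scores = [85, 90, 100, 110, 135]
n = len(scores)

mean = sum(scores) / n                        # mean = Σx / n
median_odd = statistics.median(scores)        # odd n: the middle value
median_even = statistics.median(scores[:4])   # even n: average of the two middle values
value_range = max(scores) - min(scores)       # range = maximum - minimum

# Population SD divides the summed squared deviations by n;
# sample SD divides by n - 1, so it is slightly larger.
population_sd = statistics.pstdev(scores)
sample_sd = statistics.stdev(scores)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical paired observations for the correlation example.
r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])

print(mean, median_odd, median_even, value_range)
print(round(population_sd, 2), round(sample_sd, 2), round(r, 2))
```

The choice between `pstdev` and `stdev` depends on whether the data cover the whole population or only a sample of it.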