Normal Hypothesis Testing ( AQA A Level Maths: Statistics )


Normal Hypothesis Testing

How is a hypothesis test carried out with the normal distribution?

  • The population mean is tested by looking at the mean of a sample taken from the population
  • A hypothesis test is used when the value of the assumed population mean is questioned
  • Make sure you clearly define µ before writing the hypotheses, if it has not been defined in the question
  • The null hypothesis will always be H0 : µ = ...
  • The alternative hypothesis will depend on whether it is a one-tailed or two-tailed test
  • For a one-tailed test the alternative hypothesis will be H1 : µ > ... or H1 : µ < ...
  • For a two-tailed test the alternative hypothesis will be H1 : µ ≠ ...
  • Remember that the variance of the sample mean distribution will be the variance of the population distribution divided by n
  • The mean of the sample mean distribution will be the same as the mean of the population distribution
  • The normal distribution is used to carry out the test, either by:
  • calculating the probability of the test statistic taking the observed or a more extreme value (the p-value) and comparing this with the significance level, or by
  • finding the critical region, which can be more useful when considering more than one observed value or for further testing

How is the critical value found in a hypothesis test for the mean of a normal distribution?

  • The probability of the observed value being within the critical region, given that the null hypothesis is true, will be equal to the significance level
  • To find the critical value(s) find the distribution of the sample means, assuming H 0 is true, and use the inverse normal function on your calculator
  • For a two-tailed test you will need to find both critical values, one at each end of the distribution
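As a sketch of this calculation, the critical value(s) can be found with Python's standard library (`statistics.NormalDist`); the population values below are invented for illustration:

```python
from statistics import NormalDist

# Illustrative values (not from the text): population X ~ N(30, 4^2),
# sample of size n = 16, significance level 5%.
mu0, sigma, n, alpha = 30, 4, 16, 0.05

# Distribution of the sample mean under H0: N(mu0, (sigma/sqrt(n))^2)
sample_mean_dist = NormalDist(mu0, sigma / n ** 0.5)

# One-tailed test (H1: mu > mu0): a single critical value at the upper end
upper = sample_mean_dist.inv_cdf(1 - alpha)

# Two-tailed test (H1: mu != mu0): alpha/2 in each tail, so two critical values
lower2 = sample_mean_dist.inv_cdf(alpha / 2)
upper2 = sample_mean_dist.inv_cdf(1 - alpha / 2)

print(round(upper, 3), round(lower2, 3), round(upper2, 3))
```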

What steps should I follow when carrying out a hypothesis test for the mean of a normal distribution?

  • Following these steps will help when carrying out a hypothesis test for the mean of a normal distribution:

Step 1.  Define the population parameter, µ (the population mean), if it has not been defined in the question

Step 2.  Write the null and alternative hypotheses clearly using the form

H0 : μ = ...

H1 : μ ... ... (using >, < or ≠ as appropriate)

Step 3.  Assuming the null hypothesis is true, state the distribution of the sample mean

Step 4.  Calculate either the critical value(s) or the p-value (probability of the observed value) for the test

Step 5.  Compare the observed value of the test statistic with the critical value(s), or the p-value with the significance level

Step 6.  Decide whether there is enough evidence to reject H0 or whether it has to be accepted

Step 7.  Write a conclusion in context
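The steps above can be sketched end-to-end in Python (standard library only; the scenario and numbers are invented for illustration):

```python
from statistics import NormalDist

# Hypothetical scenario: X ~ N(mu, 12^2), H0: mu = 100, H1: mu > 100,
# a sample of n = 36 gives mean 104, tested at the 5% level.
mu0, sigma, n, xbar, alpha = 100, 12, 36, 104, 0.05

# Step 3: distribution of the sample mean under H0
dist = NormalDist(mu0, sigma / n ** 0.5)   # N(100, 2^2)

# Step 4: p-value and critical value
p_value = 1 - dist.cdf(xbar)               # P(Xbar >= 104 | H0 true)
critical = dist.inv_cdf(1 - alpha)         # upper 5% point

# Steps 5-7: compare and conclude
print(f"p-value = {p_value:.4f}, critical value = {critical:.3f}")
if p_value < alpha:
    print("Reject H0: evidence that the mean exceeds 100.")
else:
    print("Do not reject H0.")
```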




Author: Amber

Amber gained a first class degree in Mathematics & Meteorology from the University of Reading before training to become a teacher. She is passionate about teaching, having spent 8 years teaching GCSE and A Level Mathematics both in the UK and internationally. Amber loves creating bright and informative resources to help students reach their potential.


Data analysis: hypothesis testing


4.1 The normal distribution

Here, you will look at the concept of normal distribution and the bell-shaped curve. The peak point (the top of the bell) represents the most probable occurrences, while other possible occurrences are distributed symmetrically around the peak point, creating a downward-sloping curve on either side of the peak point.

Cartoon showing a bell-shaped curve. The x-axis is titled ‘How high the hill is’ and the y-axis is titled ‘Number of hills’. The top of the curve is labelled ‘Average hill’, while a point on the lower right tail is labelled ‘Big hill’.

In order to test hypotheses, you need to calculate the test statistic and compare it with the value in the bell curve. This will be done by using the concept of ‘normal distribution’.

A normal distribution is a probability distribution that is symmetric about the mean, indicating that data near the mean are more likely to occur than data far from it. In graph form, a normal distribution appears as a bell curve. The values in the x-axis of the normal distribution graph represent the z-scores. The test statistic that you wish to use to test the set of hypotheses is the z-score . A z-score is used to measure how far the observation (sample mean) is from the 0 value of the bell curve (population mean). In statistics, this distance is measured by standard deviation. Therefore, when the z-score is equal to 2, the observation is 2 standard deviations away from the value 0 in the normal distribution curve.
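For instance, a z-score for a sample mean can be computed directly (the numbers below are illustrative, not from the text):

```python
from math import sqrt

# Hypothetical values: population mean 50, population sd 10,
# and a sample of n = 25 with sample mean 54
mu, sigma, n, xbar = 50, 10, 25, 54

# z measures how many standard errors the sample mean is from mu
z = (xbar - mu) / (sigma / sqrt(n))
print(z)  # the sample mean sits 2 standard errors above the population mean
```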

A symmetrical, bell-shaped graph labelled ‘Normal distribution’. The peak of the curve sits where the x-axis is at 0.


Module 9: Hypothesis Testing With One Sample

Distribution Needed for Hypothesis Testing

Learning Outcomes

  • Conduct and interpret hypothesis tests for a single population mean, population standard deviation known
  • Conduct and interpret hypothesis tests for a single population mean, population standard deviation unknown

Earlier in the course, we discussed sampling distributions. Particular distributions are associated with hypothesis testing. We perform tests of a population mean using a normal distribution or a Student’s t-distribution. (Remember, use a Student’s t-distribution when the population standard deviation is unknown and the distribution of the sample mean is approximately normal.) We perform tests of a population proportion using a normal distribution (usually when the sample size is large).

If you are testing a  single population mean , the distribution for the test is for means :

[latex]\displaystyle\overline{{X}}\text{~}{N}{\left(\mu_{{X}}\text{ , }\frac{{\sigma_{{X}}}}{\sqrt{{n}}}\right)}{\quad\text{or}\quad}{t}_{{{d}{f}}}[/latex]

The population parameter is [latex]\mu[/latex]. The estimated value (point estimate) for [latex]\mu[/latex] is [latex]\displaystyle\overline{{x}}[/latex], the sample mean.

If you are testing a  single population proportion , the distribution for the test is for proportions or percentages:

[latex]\displaystyle{P}^{\prime}\text{~}{N}{\left({p}\text{ , }\sqrt{{\frac{{{p}{q}}}{{n}}}}\right)}[/latex]

The population parameter is [latex]p[/latex]. The estimated value (point estimate) for [latex]p[/latex] is p′ . [latex]\displaystyle{p}\prime=\frac{{x}}{{n}}[/latex] where [latex]x[/latex] is the number of successes and [latex]n[/latex] is the sample size.

Assumptions

When you perform a  hypothesis test of a single population mean μ using a Student’s t -distribution (often called a t-test), there are fundamental assumptions that need to be met in order for the test to work properly. Your data should be a simple random sample that comes from a population that is approximately normally distributed . You use the sample standard deviation to approximate the population standard deviation. (Note that if the sample size is sufficiently large, a t-test will work even if the population is not approximately normally distributed).

When you perform a  hypothesis test of a single population mean μ using a normal distribution (often called a z -test), you take a simple random sample from the population. The population you are testing is normally distributed or your sample size is sufficiently large. You know the value of the population standard deviation which, in reality, is rarely known.

When you perform a  hypothesis test of a single population proportion p , you take a simple random sample from the population. You must meet the conditions for a binomial distribution which are as follows: there are a certain number n of independent trials, the outcomes of any trial are success or failure, and each trial has the same probability of a success p . The shape of the binomial distribution needs to be similar to the shape of the normal distribution. To ensure this, the quantities np  and nq must both be greater than five ( np > 5 and nq > 5). Then the binomial distribution of a sample (estimated) proportion can be approximated by the normal distribution with μ = p and [latex]\displaystyle\sigma=\sqrt{{\frac{{{p}{q}}}{{n}}}}[/latex] . Remember that q = 1 – p .
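The np > 5 and nq > 5 check can be wrapped in a small helper (a sketch; the sample values are invented):

```python
from math import sqrt

def normal_approx_ok(n, p):
    """Check whether Bin(n, p) is reasonably approximated by a normal
    distribution, using the np > 5 and nq > 5 rule from the text."""
    q = 1 - p
    return n * p > 5 and n * q > 5

# Hypothetical test of H0: p = 0.3 with n = 50 trials
n, p = 50, 0.3
print(normal_approx_ok(n, p))       # np = 15 and nq = 35 both exceed 5
print(p, sqrt(p * (1 - p) / n))     # mean and sd of the approximating normal
```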

Concept Review

In order for a hypothesis test’s results to be generalized to a population, certain requirements must be satisfied.

When testing for a single population mean:

  • A Student’s t -test should be used if the data come from a simple, random sample and the population is approximately normally distributed, or the sample size is large, with an unknown standard deviation.
  • The normal test will work if the data come from a simple, random sample and the population is approximately normally distributed, or the sample size is large, with a known standard deviation.

When testing a single population proportion, use a normal test if the data come from a simple, random sample, meet the requirements for a binomial distribution, and the mean number of successes and the mean number of failures satisfy the conditions np > 5 and nq > 5, where n is the sample size, p is the probability of a success, and q is the probability of a failure.

Formula Review

If there is no given preconceived  α , then use α = 0.05.

Types of Hypothesis Tests

  • Single population mean, known population variance (or standard deviation): Normal test .
  • Single population mean, unknown population variance (or standard deviation): Student’s t -test .
  • Single population proportion: Normal test .
  • For a single population mean , we may use a normal distribution with the following mean and standard deviation. Means: [latex]\displaystyle\mu=\mu_{{\overline{{x}}}}{\quad\text{and}\quad}\sigma_{{\overline{{x}}}}=\frac{{\sigma_{{x}}}}{\sqrt{{n}}}[/latex]
  • For a single population proportion , we may use a normal distribution with the following mean and standard deviation. Proportions: [latex]\displaystyle\mu={p}{\quad\text{and}\quad}\sigma=\sqrt{{\frac{{{p}{q}}}{{n}}}}[/latex].
  • Distribution Needed for Hypothesis Testing. Provided by: OpenStax. License: CC BY: Attribution
  • Introductory Statistics. Authored by: Barbara Illowski, Susan Dean. Provided by: OpenStax. Located at: http://cnx.org/contents/[email protected]. License: CC BY: Attribution. License Terms: Download for free at http://cnx.org/contents/[email protected]

Hypothesis Testing with the Normal Distribution


Introduction

When carrying out hypothesis tests with the standard normal distribution, these are the most important critical values that will be needed.

| Significance Level | $10$% | $5$% | $1$% |
|---|---|---|---|
| $z_{1-\alpha}$ | $1.28$ | $1.645$ | $2.33$ |
| $z_{1-\frac{\alpha}{2}}$ | $1.645$ | $1.96$ | $2.58$ |

Distribution of Sample Means

The hypotheses are $H_0: \mu = \mu_0$ against an alternative $H_1$, where $\mu$ is the true mean and $\mu_0$ is the currently accepted population mean. Draw samples of size $n$ from the population. When $n$ is large enough and the null hypothesis is true, the sample means approximately follow a normal distribution with mean $\mu_0$ and standard deviation $\frac{\sigma}{\sqrt{n}}$. This is called the distribution of sample means and can be denoted by $\bar{X} \sim \mathrm{N}\left(\mu_0, \frac{\sigma}{\sqrt{n}}\right)$. This follows from the central limit theorem .

The $z$-score will this time be obtained with the formula \[Z = \dfrac{\bar{X} - \mu_0}{\frac{\sigma}{\sqrt{n}}}.\]

So if $\mu = \mu_0$, then $\bar{X} \sim \mathrm{N}\left(\mu_0, \frac{\sigma}{\sqrt{n}}\right)$ and $Z \sim \mathrm{N}(0,1)$.

The alternative hypothesis will then take one of the forms $H_1: \mu \neq \mu_0$, $H_1: \mu > \mu_0$ or $H_1: \mu < \mu_0$, depending on what we are testing.

Worked Example

An automobile company is looking for fuel additives that might increase gas mileage. Without additives, their cars are known to average $25$ mpg (miles per gallons) with a standard deviation of $2.4$ mpg on a road trip from London to Edinburgh. The company now asks whether a particular new additive increases this value. In a study, thirty cars are sent on a road trip from London to Edinburgh. Suppose it turns out that the thirty cars averaged $\overline{x}=25.5$ mpg with the additive. Can we conclude from this result that the additive is effective?

We are asked to show if the new additive increases the mean miles per gallon. The current mean $\mu = 25$ so the null hypothesis will be that nothing changes. The alternative hypothesis will be that $\mu > 25$ because this is what we have been asked to test.

\begin{align} &H_0:\mu=25. \\ &H_1:\mu>25. \end{align}

Now we need to calculate the test statistic. We start with the assumption the normal distribution is still valid. This is because the null hypothesis states there is no change in $\mu$. Thus, as the value $\sigma=2.4$ mpg is known, we perform a hypothesis test with the standard normal distribution. So the test statistic will be a $z$ score. We compute the $z$ score using the formula \[z=\frac{\bar{x}-\mu}{\frac{\sigma}{\sqrt{n} } }.\] So \begin{align} z&=\frac{\overline{x}-25}{\frac{2.4}{\sqrt{30} } }\\ &=1.14 \end{align}

We are using a $5$% significance level and a (right-sided) one-tailed test, so $\alpha=0.05$, and from the tables we obtain the critical value $z_{1-\alpha} = 1.645$.

As $1.14<1.645$, the test statistic is not in the critical region so we cannot reject $H_0$. Thus, the observed sample mean $\overline{x}=25.5$ is consistent with the hypothesis $H_0:\mu=25$ on a $5$% significance level.
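As a check, this calculation can be reproduced with Python's standard library:

```python
from math import sqrt
from statistics import NormalDist

# Values from the worked example: H0: mu = 25, H1: mu > 25
mu0, sigma, n, xbar, alpha = 25, 2.4, 30, 25.5, 0.05

z = (xbar - mu0) / (sigma / sqrt(n))          # test statistic
critical = NormalDist().inv_cdf(1 - alpha)    # upper 5% point of N(0,1)

print(f"z = {z:.2f}, critical value = {critical:.3f}")
print("Reject H0" if z > critical else "Cannot reject H0")
```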

Video Example

In this video, Dr Lee Fawcett explains how to conduct a hypothesis test for the mean of a single distribution whose variance is known, using a one-sample z-test.

Approximation to the Binomial Distribution

A supermarket has come under scrutiny after a number of complaints that its carrier bags fall apart when the load they carry is $5$kg. Out of a random sample of $200$ bags, $185$ do not tear when carrying a load of $5$kg. Can the supermarket claim at a $5$% significance level that more than $90$% of the bags will not fall apart?

Let $X$ represent the number of carrier bags which can hold a load of $5$kg. Then $X \sim \mathrm{Bin}(200,p)$ and \begin{align}H_0&: p = 0.9 \\ H_1&: p > 0.9 \end{align}

We need to calculate the mean $\mu$ and variance $\sigma ^2$.

\[\mu = np = 200 \times 0.9 = 180\text{.}\] \[\sigma ^2= np(1-p) = 18\text{.}\]

Using the normal approximation to the binomial distribution we obtain $Y \sim \mathrm{N}(180, 18)$.

\[\mathrm{P}[X \geq 185] = \mathrm{P}\left[Z \geq \dfrac{184.5 - 180}{4.2426} \right] = \mathrm{P}\left[Z \geq 1.0607\right] \text{.}\]

Because we are using a one-tailed test at a $5$% significance level, we obtain the critical value $z_{1-\alpha}=1.645$. Now $1.0607 < 1.645$ so we cannot reject the null hypothesis. There is insufficient evidence, at the $5$% level, for the supermarket to claim that more than $90$% of its carrier bags will not fall apart under a load of $5$kg.
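This calculation, including the continuity correction, can be reproduced as follows (Python standard library):

```python
from math import sqrt
from statistics import NormalDist

# Values from the example: H0: p = 0.9, H1: p > 0.9
n, p0, observed, alpha = 200, 0.9, 185, 0.05

mu = n * p0                        # 180
sigma = sqrt(n * p0 * (1 - p0))    # sqrt(18), about 4.2426

# Continuity correction: P(X >= 185) is approximated by P(Y >= 184.5)
# where Y ~ N(180, 18)
z = (observed - 0.5 - mu) / sigma
critical = NormalDist().inv_cdf(1 - alpha)

print(f"z = {z:.4f}, critical value = {critical:.3f}")
```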

Comparing Two Means

When we test hypotheses with two means, we will look at the difference $\mu_1 - \mu_2$. The null hypothesis will be of the form

\[H_0: \mu_1 - \mu_2 = a,\]

where $a$ is a constant. Often $a=0$ is used to test if the two means are the same. Given two continuous random variables $X_1$ and $X_2$ with means $\mu_1$ and $\mu_2$, the sample means $\bar{X_1}$ and $\bar{X_2}$ have variances $\frac{\sigma_1^2}{n_1}$ and $\frac{\sigma_2^2}{n_2}$ respectively, so \[\mathrm{E} [\bar{X_1} - \bar{X_2} ] = \mathrm{E} [\bar{X_1}] - \mathrm{E} [\bar{X_2}] = \mu_1 - \mu_2\] and \[\mathrm{Var}[\bar{X_1} - \bar{X_2}] = \mathrm{Var}[\bar{X_1}] + \mathrm{Var}[\bar{X_2}]=\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}\text{.}\] Note this last result: the variance of the difference is found by summing the variances.

We then obtain the $z$-score using the formula \[Z = \frac{(\bar{X_1}-\bar{X_2})-(\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}}}\text{.}\]
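As a sketch with invented sample summaries, the two-sample z-score and its two-tailed p-value can be computed like this:

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical summaries for two independent samples (not from the text)
xbar1, sigma1, n1 = 52.0, 8.0, 40
xbar2, sigma2, n2 = 49.0, 7.0, 50
a = 0   # H0: mu1 - mu2 = 0, i.e. the means are the same

# Standard error of the difference: variances add
se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
z = (xbar1 - xbar2 - a) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-tailed

print(round(z, 3), round(p_value, 4))
```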

These workbooks produced by HELM are good revision aids, containing key points for revision and many worked examples.

  • Tests concerning a single sample
  • Tests concerning two samples

Selecting a Hypothesis Test


Hypothesis Testing

Key Topics:

  • Basic approach
  • Null and alternative hypothesis
  • Decision making and the p -value
  • Z-test & Nonparametric alternative

Basic approach to hypothesis testing

  • State a model describing the relationship between the explanatory variables and the outcome variable(s) in the population and the nature of the variability. State all of your assumptions .
  • Specify the null and alternative hypotheses in terms of the parameters of the model.
  • Invent a test statistic that will tend to be different under the null and alternative hypotheses.
  • Using the assumptions of step 1, find the theoretical sampling distribution of the statistic under the null hypothesis of step 2. Ideally the form of the sampling distribution should be one of the “standard distributions”(e.g. normal, t , binomial..)
  • Calculate a p-value , as the area under the sampling distribution more extreme than your statistic; the calculation depends on the form of the alternative hypothesis.
  • Choose your acceptable type 1 error rate (alpha) and apply the decision rule : reject the null hypothesis if the p-value is less than alpha, otherwise do not reject.
  • Assume data are independently sampled from a normal distribution with unknown mean μ and known variance σ². Make an initial assumption, μ0.
  • The hypotheses pair up as: H0: μ = μ0 vs. HA: μ ≠ μ0 (two-sided); H0: μ ≤ μ0 vs. HA: μ > μ0, or H0: μ ≥ μ0 vs. HA: μ < μ0 (one-sided).
  • Test statistic: \(\frac{\bar{X}-\mu_0}{\sigma / \sqrt{n}}\)
  • The general form is: (estimate - value we are testing)/(st.dev of the estimate)
  • The z-statistic follows a N(0,1) distribution
  • p-value: 2 × the area above |z| (two-sided), the area above z, or the area below z (one-sided)
  • Or compare the statistic to a critical value: |z| ≥ z α/2 , z ≥ z α , or z ≤ −z α
  • Choose the acceptable type 1 error rate (e.g. alpha = 0.05) and state the conclusion
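The bullets above can be collected into a small function (a sketch using only the Python standard library; the data at the end are hypothetical):

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z(xbar, mu0, sigma, n, tail="two"):
    """One-sample z-test; tail is 'two', 'right', or 'left'.
    Returns the z-statistic and its p-value."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    N = NormalDist()
    if tail == "two":
        p = 2 * (1 - N.cdf(abs(z)))
    elif tail == "right":
        p = 1 - N.cdf(z)
    else:
        p = N.cdf(z)
    return z, p

# Hypothetical data: H0: mu = 100 vs. HA: mu != 100
z, p = one_sample_z(xbar=103, mu0=100, sigma=15, n=50, tail="two")
print(round(z, 3), round(p, 4))
```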

Making the Decision

It is either likely or unlikely that we would collect the evidence we did given the initial assumption. (Note: “likely” or “unlikely” is measured by calculating a probability!)

If it is likely , then we “ do not reject ” our initial assumption. There is not enough evidence to do otherwise.

If it is unlikely , then:

  • either our initial assumption is correct and we experienced an unusual event or,
  • our initial assumption is incorrect

In statistics, if it is unlikely, we decide to “ reject ” our initial assumption.

Example: Criminal Trial Analogy

First, state 2 hypotheses, the null hypothesis (“H 0 ”) and the alternative hypothesis (“H A ”)

  • H 0 : Defendant is not guilty.
  • H A : Defendant is guilty.

Usually the H 0 is a statement of “no effect”, or “no change”, or “chance only” about a population parameter.

While the H A , depending on the situation, is that there is a difference, trend, effect, or a relationship with respect to a population parameter.

  • It can be one-sided or two-sided.
  • In a two-sided test we only care that there is a difference, not its direction. In a one-sided test we care about a particular direction of the relationship: we want to know if the value is strictly larger or smaller.

Then, collect evidence, such as fingerprints, blood spots, hair samples, carpet fibers, shoe prints, ransom notes, handwriting samples, etc. (In statistics, the data are the evidence.)

Next, you make your initial assumption.

  • Defendant is innocent until proven guilty.

In statistics, we always assume the null hypothesis is true .

Then, make a decision based on the available evidence.

  • If there is sufficient evidence (“beyond a reasonable doubt”), reject the null hypothesis . (Behave as if defendant is guilty.)
  • If there is not enough evidence, do not reject the null hypothesis . (Behave as if defendant is not guilty.)

If the observed outcome, e.g., a sample statistic, is surprising under the assumption that the null hypothesis is true, but more probable if the alternative is true, then this outcome is evidence against H 0 and in favor of H A .

An observed effect so large that it would rarely occur by chance is called statistically significant (i.e., not likely to happen by chance).

Using the p -value to make the decision

The p -value represents how likely we would be to observe such an extreme sample if the null hypothesis were true. The p -value is a probability computed assuming the null hypothesis is true, that the test statistic would take a value as extreme or more extreme than that actually observed. Since it's a probability, it is a number between 0 and 1. The closer the number is to 0 means the event is “unlikely.” So if p -value is “small,” (typically, less than 0.05), we can then reject the null hypothesis.

Significance level and p -value

Significance level, α, is a decisive value for the p -value. In this context, significant does not mean “important”; it means “not likely to have happened just by chance”.

α is the maximum probability of rejecting the null hypothesis when the null hypothesis is true. If α = 1 we always reject the null; if α = 0 we never reject the null hypothesis. In articles, journals, etc. you may read: “The results were significant ( p <0.05).” So if p =0.03, it's significant at the level of α = 0.05 but not at the level of α = 0.01. If we reject the H 0 at the level of α = 0.05 (which corresponds to a 95% CI), we are saying that if H 0 is true, the observed phenomenon would happen no more than 5% of the time (that is 1 in 20). If we choose to compare the p -value to α = 0.01, we are insisting on stronger evidence!

Neither decision, rejecting or not rejecting H 0 , entails proving the null hypothesis or the alternative hypothesis. We merely state there is enough evidence to behave one way or the other. This is always true in statistics!

So, what kind of error could we make? No matter what decision we make, there is always a chance we made an error.


Errors in Hypothesis Testing

Type I error (False positive): The null hypothesis is rejected when it is true.

  • α is the maximum probability of making a Type I error.

Type II error (False negative): The null hypothesis is not rejected when it is false.

  • β is the probability of making a Type II error

There is always a chance of making one of these errors. But, a good scientific study will minimize the chance of doing so!

The power of a statistical test is its probability of rejecting the null hypothesis if the null hypothesis is false. That is, power is the ability to correctly reject H 0 and detect a significant effect. In other words, power is one minus the type II error risk.

\(\text{Power} = 1-\beta = P\left(\text{reject } H_0 \mid H_0 \text{ is false}\right)\)
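For a one-sided z-test the power can be computed explicitly (a sketch; the numbers are invented for illustration):

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical one-sided z-test: H0: mu = 65 vs. HA: mu > 65,
# true mean 66, sigma = 3, n = 54, alpha = 0.05
mu0, mu1, sigma, n, alpha = 65, 66, 3, 54, 0.05

N = NormalDist()
se = sigma / sqrt(n)
cutoff = mu0 + N.inv_cdf(1 - alpha) * se   # reject H0 if xbar >= cutoff
power = 1 - N.cdf((cutoff - mu1) / se)     # P(reject H0 | mu = mu1)

print(f"power = {power:.3f}, beta = {1 - power:.3f}")
```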

Which error is worse?

Type I = you are innocent, yet accused of cheating on the test. Type II = you cheated on the test, but you are found innocent.

This depends on the context of the problem too. But in most cases scientists are trying to be “conservative”; it's worse to make a spurious discovery than to fail to make a good one. Our goal is to increase the power of the test, that is, to minimize the length of the CI.

We need to keep in mind:

  • the effect of the sample size,
  • the correctness of the underlying assumptions about the population,
  • statistical vs. practical significance, etc…

(see the handout). To study the tradeoffs between the sample size, α, and Type II error we can use power and operating characteristic curves.

Assume data are independently sampled from a normal distribution with unknown mean μ and known variance σ² = 9 (so σ = 3). Make an initial assumption that μ = 65.

Specify the hypotheses: H 0 : μ = 65 vs. H A : μ ≠ 65

z-statistic: 3.58

The z-statistic follows a N(0,1) distribution

The p -value, less than 0.001, indicates that, if the average height in the population is 65 inches, it is very unlikely that a sample of 54 students would have an average height of 66.4630.

Alpha = 0.05. Decision: p -value < alpha, thus we reject the null hypothesis.

Conclude that the average height is not equal to 65.
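This example can be checked numerically (taking σ = 3, on the assumption that the stated variance is σ² = 9):

```python
from math import sqrt
from statistics import NormalDist

# Values from the example; sigma = 3 since sigma^2 = 9 (an assumption here)
n, xbar, mu0, sigma = 54, 66.4630, 65, 3

z = (xbar - mu0) / (sigma / sqrt(n))
p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided test

print(f"z = {z:.2f}, p-value = {p_value:.5f}")
```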

What type of error might we have made?

Type I error is claiming that average student height is not 65 inches, when it really is. Type II error is failing to claim that the average student height is not 65in when it is.

We rejected the null hypothesis, i.e., claimed that the height is not 65, thus making potentially a Type I error. But sometimes the p -value is too low because of the large sample size, and we may have statistical significance but not really practical significance! That's why most statisticians are much more comfortable with using CI than tests.

Based on the CI only, how do you know that you should reject the null hypothesis?

The 95% CI is (65.6628,67.2631) ...

What about practical and statistical significance now? Is there another reason to suspect this test, and the p -value calculations?

There is a need for a further generalization. What if we can't assume that σ is known? In this case we would use s (the sample standard deviation) to estimate σ.

If the sample is very large, we can treat σ as known by assuming that σ = s . According to the law of large numbers, this is not too bad a thing to do. But if the sample is small, the fact that we have to estimate both the standard deviation and the mean adds extra uncertainty to our inference. In practice this means that we need a larger multiplier for the standard error.

We need one-sample t -test.

One sample t -test

  • Assume data are independently sampled from a normal distribution with unknown mean μ and unknown variance σ². Make an initial assumption, μ 0 .
  • The hypotheses pair up as before: H0: μ = μ0 vs. HA: μ ≠ μ0 (two-sided); H0: μ ≤ μ0 vs. HA: μ > μ0, or H0: μ ≥ μ0 vs. HA: μ < μ0 (one-sided).
  • t-statistic: \(\frac{\bar{X}-\mu_0}{s / \sqrt{n}}\) where s is a sample st.dev.
  • t-statistic follows t -distribution with df = n - 1
  • Choose alpha (e.g. 0.05), compare the p -value to it, and state the conclusion
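A sketch of the t-statistic calculation (the data are invented; the standard library has no t-distribution CDF, so the statistic is compared against a table value, here the two-tailed 5% point for 7 degrees of freedom, about 2.365):

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample; H0: mu = 10 vs. HA: mu != 10
data = [9.8, 10.4, 11.1, 9.5, 10.9, 10.2, 9.9, 10.7]
mu0 = 10

n = len(data)
xbar = mean(data)
s = stdev(data)                   # sample standard deviation (n - 1 divisor)
t = (xbar - mu0) / (s / sqrt(n))  # t-statistic
df = n - 1                        # degrees of freedom

print(f"t = {t:.3f} with {df} degrees of freedom")
# Compare |t| with the two-tailed critical value t(0.025, 7) ~ 2.365
```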

Testing for the population proportion

Let's go back to our CNN poll. Assume we have a SRS of 1,017 adults.

We are interested in testing the following hypotheses: H 0 : p = 0.50 vs. H A : p > 0.50

What is the test statistic?

If alpha = 0.05, what do we conclude?

We will see more details in the next lesson on proportions, then distributions, and possible tests.
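As a sketch of the test statistic for this poll (the observed count of 550 is invented, since the poll result is not given here):

```python
from math import sqrt
from statistics import NormalDist

n, p0 = 1017, 0.50
x = 550                      # hypothetical number of "successes" in the poll

p_hat = x / n
se = sqrt(p0 * (1 - p0) / n)           # standard deviation of p_hat under H0
z = (p_hat - p0) / se
p_value = 1 - NormalDist().cdf(z)      # right-tailed: HA: p > 0.50

print(round(z, 3), round(p_value, 4))
```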


6 Week 5 Introduction to Hypothesis Testing Reading

An Introduction to Hypothesis Testing

What are you interested in learning about? Perhaps you’d like to know if there is a difference in average final grade between two different versions of a college class? Does the Fort Lewis women’s soccer team score more goals than the national Division II women’s average? Which outdoor sport do Fort Lewis students prefer the most?  Do the pine trees on campus differ in mean height from the aspen trees? For all of these questions, we can collect a sample, analyze the data, then make a statistical inference based on the analysis.  This means determining whether we have enough evidence to reject our null hypothesis (what was originally assumed to be true, until we prove otherwise). The process is called hypothesis testing .

A really good Khan Academy video to introduce the hypothesis test process: Khan Academy Hypothesis Testing . As you watch, please don’t get caught up in the calculations, as we will use SPSS to do these calculations.  We will also use SPSS p-values, instead of the referenced Z-table, to make statistical decisions.

The Six-Step Process

Hypothesis testing requires very specific, detailed steps.  Think of it as a mathematical lab report where you have to write out your work in a particular way.  There are six steps that we will follow for ALL of the hypothesis tests that we learn this semester.

Six Step Hypothesis Testing Process

1. Research Question

All hypothesis tests start with a research question.  This is literally a question that includes what you are trying to prove, like the examples earlier:  Which outdoor sport do Fort Lewis students prefer the most? Is there sufficient evidence to show that the Fort Lewis women’s soccer team scores more goals than the national Division 2 women’s average?

In this step, besides literally being a question, you’ll want to include:

  • mention of your variable(s)
  • wording specific to the type of test that you’ll be conducting (mean, mean difference, relationship, pattern)
  • specific wording that indicates directionality (are you looking for a ‘difference’, are you looking for something to be ‘more than’ or ‘less than’ something else, or are you comparing one pattern to another?)

Consider this research question: Do the pine trees on campus differ in mean height from the aspen trees?

  • The wording of this research question clearly mentions the variables being studied. The independent variable is the type of tree (pine or aspen), and these trees are having their heights compared, so the dependent variable is height.
  • ‘Mean’ is mentioned, so this indicates a test with a quantitative dependent variable.
  • The question also asks if the tree heights ‘differ’. This specific word indicates that the test being performed is a two-tailed (i.e. non-directional) test. More about the meaning of one/two-tailed will come later.

2. Statistical Hypotheses

A statistical hypothesis test has a null hypothesis, the status quo, what we assume to be true.  Notation is H 0, read as “H naught”.  The alternative hypothesis is what you are trying to prove (mentioned in your research question), H 1 or H A .  All hypothesis tests must include a null and an alternative hypothesis.  We also note which hypothesis test is being done in this step.

The notation for your statistical hypotheses will vary depending on the type of test that you’re doing. Writing statistical hypotheses is NOT the same as most scientific hypotheses. You are not writing sentences explaining what you think will happen in the study. Here is an example of what statistical hypotheses look like using the research question: Do the pine trees on campus differ in mean height from the aspen trees?

H 0 : μ(pine) = μ(aspen)

H 1 : μ(pine) ≠ μ(aspen)

3. Decision Rule

In this step, you state which alpha value you will use, and when appropriate, the directionality, or tail, of the test.  You also write a statement: “I will reject the null hypothesis if p < alpha” (insert actual alpha value here).  In this introductory class, alpha is the level of significance, how willing we are to make the wrong statistical decision, and it will be set to 0.05 or 0.01.

Example of a Decision Rule:

Let alpha=0.01, two-tailed. I will reject the null hypothesis if p<0.01.

4. Assumptions, Analysis and Calculations

Quite a bit goes on in this step.  The assumptions for the particular hypothesis test must be checked.  SPSS will be used to create appropriate graphs and test output tables. Where appropriate, calculations of the test’s effect size will also be done in this step.

All hypothesis tests have assumptions that we hope to meet. For example, tests with a quantitative dependent variable consider a histogram(s) to check if the distribution is normal, and whether there are any obvious outliers. Each hypothesis test has different assumptions, so it is important to pay attention to the specific test’s requirements.

Required SPSS output will also depend on the test.

5. Statistical Decision

It is in Step 5 that we determine if we have enough statistical evidence to reject our null hypothesis.  We will consult the SPSS p-value and compare to our chosen alpha (from Step 3: Decision Rule).

Put very simply, the p -value is the probability that, if the null hypothesis is true, the results from another randomly selected sample will be as extreme as or more extreme than the results obtained from the given sample. The p -value can also be thought of, loosely, as the probability that the results we are seeing in the sample are due to chance alone. This concept will be discussed in much further detail in the class notes.

Based on this numerical comparison between the p-value and alpha, we’ll either reject or retain our null hypothesis.  Note: You may NEVER ‘accept’ the null hypothesis. This is because it is impossible to prove a null hypothesis to be true.

Retaining the null means that you just don’t have enough evidence to support your alternative hypothesis, so you fall back to your null. (You retain the null when p is greater than or equal to alpha.)

Rejecting the null means that you did find enough evidence to support your alternative hypothesis. (You reject the null when p is less than alpha.)

Example of a Statistical Decision:

Retain the null hypothesis, because p=0.12 > alpha=0.01.

The p-value will come from SPSS output, and the alpha will have already been determined back in Step 3. You must be very careful when you compare the decimal values of the p-value and alpha. If, for example, you mistakenly think that p=0.12 < alpha=0.01, then you will make the incorrect statistical decision, which will likely lead to an incorrect interpretation of the study’s findings.
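The comparison rule in this step can be sketched as a tiny helper function (illustrative Python; the function name is ours, not part of SPSS):

```python
def statistical_decision(p_value: float, alpha: float) -> str:
    """Compare a p-value to alpha and return the Step 5 decision:
    reject when p < alpha, otherwise retain."""
    if p_value < alpha:
        return "reject the null hypothesis"
    return "retain the null hypothesis"

# The example from the text: p = 0.12 against alpha = 0.01.
print(statistical_decision(0.12, 0.01))  # retain the null hypothesis
```

Because 0.12 is greater than 0.01, the helper returns the “retain” wording, matching the example decision above.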

6. Interpretation

The interpretation is where you write up your findings. The specifics will vary depending on the type of hypothesis test you performed, but you will always include a plain English, contextual conclusion of what your study found (i.e. what it means to reject or retain the null hypothesis in that particular study).  You’ll have statistics that you quote to support your decision.  Some of the statistics will need to be written in APA style citation (the American Psychological Association style of citation).  For some hypothesis tests, you’ll also include an interpretation of the effect size.

Some hypothesis tests will also require an additional (non-parametric) test after the completion of your original test, if the test’s assumptions have not been met. These tests are also called “post-hoc tests”.

As previously stated, hypothesis testing is a very detailed process. Do not be concerned if you have read through all of the steps above and have many questions (and are possibly very confused). It will take time and a lot of practice to learn and apply these steps!

This Reading is just meant as an overview of hypothesis testing. Much more information is forthcoming in the various sets of Notes about the specifics needed in each of these steps. The Hypothesis Test Checklist will be a critical resource for you to refer to during homework assignments and tests.

Student Course Learning Objectives

4.  Choose, administer and interpret the correct tests based on the situation, including identification of appropriate sampling and potential errors

c. Choose the appropriate hypothesis test given a situation

d. Describe the meaning and uses of alpha and p-values

e. Write the appropriate null and alternative hypotheses, including whether the alternative should be one-sided or two-sided

f. Determine and calculate the appropriate test statistic (e.g. z-test, multiple t-tests, Chi-Square, ANOVA)

g. Determine and interpret effect sizes.

h. Interpret results of a hypothesis test

  • Use technology in the statistical analysis of data
  • Communicate in writing the results of statistical analyses of data

Attributions

Adapted from “Week 5 Introduction to Hypothesis Testing Reading” by Sherri Spriggs and Sandi Dang, licensed under CC BY-NC-SA 4.0 .

Math 132 Introduction to Statistics Readings Copyright © by Sherri Spriggs is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License , except where otherwise noted.


An Introduction to Bayesian Thinking

Chapter 5 Hypothesis Testing with Normal Populations

In Section 3.5 , we described how the Bayes factors can be used for hypothesis testing. Now we will use the Bayes factors to compare normal means, i.e., test whether the mean of a population is zero or compare the means of two groups of normally-distributed populations. We divide this mission into three cases: known variance for a single population, unknown variance for a single population using paired data, and unknown variance using two independent groups.

Also note that some of the examples in this section use an updated version of the bayes_inference function. If your local output is different from what is seen in this chapter, or the provided code fails to run for you please make sure that you have the most recent version of the package.

5.1 Bayes Factors for Testing a Normal Mean: variance known

Now we show how to obtain Bayes factors for testing hypotheses about a normal mean, where the variance is known . To start, let’s consider a random sample of observations from a normal population with mean \(\mu\) and pre-specified variance \(\sigma^2\) . We consider testing whether the population mean \(\mu\) is equal to \(m_0\) or not.

Therefore, we can formulate the data and hypotheses as below:

Data \[Y_1, \cdots, Y_n \mathrel{\mathop{\sim}\limits^{\rm iid}}\textsf{Normal}(\mu, \sigma^2)\]

  • \(H_1: \mu = m_0\)
  • \(H_2: \mu \neq m_0\)

We also need to specify priors for \(\mu\) under both hypotheses. Under \(H_1\) , we assume that \(\mu\) is exactly \(m_0\) , so this occurs with probability 1 under \(H_1\) . Now under \(H_2\) , \(\mu\) is unspecified, so we describe our prior uncertainty with the conjugate normal distribution centered at \(m_0\) and with variance \(\sigma^2/\mathbf{n_0}\) . Centering the prior at the hypothesized value \(m_0\) reflects that, a priori, the mean is equally likely to be larger or smaller than \(m_0\) ; the population variance is divided by the factor \(n_0\) . The hyperparameter \(n_0\) controls the precision of the prior as before.

In mathematical terms, the priors are:

  • \(H_1: \mu = m_0 \text{ with probability 1}\)
  • \(H_2: \mu \sim \textsf{Normal}(m_0, \sigma^2/\mathbf{n_0})\)

Bayes Factor

Now the Bayes factor for comparing \(H_1\) to \(H_2\) is the ratio of the distribution of the data under the assumption that \(\mu = m_0\) to the distribution of the data under \(H_2\) .

\[\begin{aligned} \textit{BF}[H_1 : H_2] &= \frac{p(\text{data}\mid \mu = m_0, \sigma^2 )} {\int p(\text{data}\mid \mu, \sigma^2) p(\mu \mid m_0, \mathbf{n_0}, \sigma^2)\, d \mu} \\ \textit{BF}[H_1 : H_2] &=\left(\frac{n + \mathbf{n_0}}{\mathbf{n_0}} \right)^{1/2} \exp\left\{-\frac 1 2 \frac{n }{n + \mathbf{n_0}} Z^2 \right\} \\ Z &= \frac{(\bar{Y} - m_0)}{\sigma/\sqrt{n}} \end{aligned}\]

The term in the denominator requires integration to account for the uncertainty in \(\mu\) under \(H_2\) . It can be shown that the Bayes factor is a function of the observed sample size \(n\) , the prior sample size \(n_0\) and a \(Z\) score.

Let’s explore how the hyperparameter \(n_0\) influences the Bayes factor in Equation (5.1) . For illustration we will use a sample size of 100. Recall that for estimation, we interpreted \(n_0\) as a prior sample size and considered the limiting case where \(n_0\) goes to zero as a non-informative or reference prior.

\[\begin{equation} \textsf{BF}[H_1 : H_2] = \left(\frac{n + \mathbf{n_0}}{\mathbf{n_0}}\right)^{1/2} \exp\left\{-\frac{1}{2} \frac{n }{n + \mathbf{n_0}} Z^2 \right\} \tag{5.1} \end{equation}\]
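A quick numerical sketch of Equation (5.1) shows the behavior discussed next: holding the data fixed, shrinking \(n_0\) inflates the Bayes factor in favor of \(H_1\) (illustrative Python; the \(Z\) score and sample size here are arbitrary choices of ours):

```python
import math

def bf_h1_h2(z: float, n: int, n0: float) -> float:
    """Bayes factor BF[H1:H2] from Equation (5.1):
    sqrt((n + n0)/n0) * exp(-0.5 * n/(n + n0) * z^2)."""
    return math.sqrt((n + n0) / n0) * math.exp(-0.5 * n / (n + n0) * z * z)

# n = 100 as in Figure 5.1. Even with a large Z score (data far from m0),
# the Bayes factor grows without bound as n0 shrinks toward zero,
# i.e. it increasingly favors H1.
for n0 in (1.0, 0.01, 0.0001):
    print(n0, bf_h1_h2(z=3.0, n=100, n0=n0))
```

The square-root term diverges as \(n_0 \to 0\) while the exponential term approaches the constant \(e^{-Z^2/2}\), which is the source of the paradox described below.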

Figure 5.1 shows the Bayes factor for comparing \(H_1\) to \(H_2\) on the y-axis as \(n_0\) changes on the x-axis. The different lines correspond to different values of the \(Z\) score or how many standard errors \(\bar{y}\) is from the hypothesized mean. As expected, larger values of the \(Z\) score favor \(H_2\) .


Figure 5.1: Vague prior for mu: n=100

But as \(n_0\) becomes smaller and approaches 0, the first term in the Bayes factor goes to infinity, while the exponential term involving the data goes to a constant. In the limit as \(n_0 \rightarrow 0\) under this noninformative prior, the Bayes factor paradoxically ends up favoring \(H_1\) regardless of the value of \(\bar{y}\) .

The takeaway from this is that we cannot use improper priors with \(n_0 = 0\) if we are going to test the hypothesis that \(\mu = m_0\) . Similarly, vague priors that use a small value of \(n_0\) are not recommended, due to the sensitivity of the results to the choice of an arbitrarily small value of \(n_0\) .

This problem – that with vague priors the Bayes factor favors the null model \(H_1\) even when the data are far away from the value under the null – is known as Bartlett’s paradox or the Jeffreys-Lindley paradox.

Now, one way to understand the effect of prior is through the standard effect size

\[\delta = \frac{\mu - m_0}{\sigma}.\] The prior of the standard effect size is

\[\delta \mid H_2 \sim \textsf{Normal}(0, \frac{1}{\mathbf{n_0}})\]

This allows us to think about a standardized effect, independent of the units of the problem. One default choice is the unit information prior, where the prior sample size \(n_0\) is 1, leading to a standard normal prior for the standardized effect size. This is depicted by the blue normal density in Figure 5.2 . It says we expect the mean to be within \(\pm 1.96\) standard deviations of the hypothesized mean with probability 0.95 . (Note that we can make this statement only in a Bayesian setting.)

In many fields we expect that the effect will be small relative to \(\sigma\) . If we do not expect to see large effects, then we may want to use a more informative prior on the effect size, such as the density in orange with \(n_0 = 4\) . Under that prior we expect the standardized effect to be within \(\pm 1/\sqrt{n_0}\) , or one half of a standard deviation, of the prior mean.

Prior on standard effect size

Figure 5.2: Prior on standard effect size

Example 1.1 To illustrate, we give an example from parapsychological research. The case involved testing a subject’s claim that he could affect a series of randomly generated 0’s and 1’s by means of extrasensory perception (ESP). The random sequence of 0’s and 1’s is generated by a machine with probability 0.5 of generating a 1. The subject claims that his ESP would make the sample mean differ significantly from 0.5.

Therefore, we are testing \(H_1: \mu = 0.5\) versus \(H_2: \mu \neq 0.5\) . Let’s use a prior that reflects that we do not expect a large effect, which leads to the following choice of \(n_0\) : if we want the effect \(\mu - 0.5\) to be within \(\pm 0.03\) with 95% prior probability, then the standardized effect is between \((-0.03/\sigma, 0.03/\sigma)\) , giving \(n_0 = (1.96\sigma/0.03)^2 = 32.7^2\) .
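The arithmetic for \(n_0\) can be checked directly (a short sketch of the calculation from the example):

```python
import math

sigma = 0.5    # known standard deviation of the 0/1 trials
effect = 0.03  # we want |mu - 0.5| < 0.03 with 95% prior probability
# delta | H2 ~ Normal(0, 1/n0), so we need 1.96 / sqrt(n0) = effect / sigma
n0 = (1.96 * sigma / effect) ** 2
print(round(math.sqrt(n0), 1))  # 32.7
```

Solving \(1.96/\sqrt{n_0} = 0.03/\sigma\) for \(\sqrt{n_0}\) gives \(1.96\sigma/0.03 \approx 32.7\), so \(n_0 \approx 32.7^2 \approx 1067\).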

Figure 5.3 shows our informative prior in blue, while the unit information prior is in orange. On this scale, the unit information prior needs to be almost uniform for the range that we are interested.

Prior effect in the extra sensory perception test

Figure 5.3: Prior effect in the extra sensory perception test

A very large data set with over 104 million trials was collected to test this hypothesis, so we use a normal distribution to approximate the distribution of the sample mean.

  • Sample size: \(n = 1.0449 \times 10^8\)
  • Sample mean: \(\bar{y} = 0.500177\) , standard deviation \(\sigma = 0.5\)
  • \(Z\) -score: 3.61

Using this prior with the data, the Bayes factor for \(H_1\) to \(H_2\) is 0.46, implying evidence against the hypothesis \(H_1\) that \(\mu = 0.5\) .

  • Informative \(\textit{BF}[H_1:H_2] = 0.46\)
  • \(\textit{BF}[H_2:H_1] = 1/\textit{BF}[H_1:H_2] = 2.19\)
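Plugging the reported summaries into Equation (5.1) approximately reproduces these numbers (a sketch; small discrepancies with the quoted values come from rounding of the \(Z\) score and the Bayes factor):

```python
import math

n = 1.0449e8                   # number of trials
z = 3.61                       # reported Z score
n0 = (1.96 * 0.5 / 0.03) ** 2  # informative prior sample size from above
# Equation (5.1): BF[H1:H2] = sqrt((n + n0)/n0) * exp(-0.5 * n/(n + n0) * z^2)
bf_12 = math.sqrt((n + n0) / n0) * math.exp(-0.5 * n / (n + n0) * z * z)
print(round(bf_12, 2), round(1 / bf_12, 2))
```

With these rounded inputs the Bayes factor comes out near 0.46, and its reciprocal near 2.2, in line with the evidence for \(H_2\) described below.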

Now, this can be inverted to provide the evidence in favor of \(H_2\) . The evidence suggests that the hypothesis that the machine operates with a probability other than 0.5 is 2.19 times more likely than the hypothesis that the probability is 0.5. Based on the interpretation of Bayes factors from Table 3.5 , this is in the range of “not worth more than a bare mention”.

To recap, we present expressions for calculating Bayes factors for a normal model with a specified variance. We show that the improper reference priors for \(\mu\) when \(n_0 = 0\) , or vague priors where \(n_0\) is arbitrarily small, lead to Bayes factors that favor the null hypothesis regardless of the data, and thus should not be used for hypothesis testing.

Bayes factors with normal priors can be sensitive to the choice of \(n_0\) . While the default value of \(n_0 = 1\) is reasonable in many cases, it may be too non-informative if one expects small effects. Wherever possible, think about how large an effect you expect and use that information to help select \(n_0\) .

All the ESP examples suggested weak evidence in favor of the machine generating random 0’s and 1’s with a probability different from 0.5. Note that ESP is not the only explanation – a deviation from 0.5 can also occur if the random number generator is biased. Bias in a stream of pseudorandom numbers has huge implications for the numerous fields that depend on simulation. If the context had been about detecting a small bias in random numbers, what prior would you use, and how would it change the outcome? You can experiment with this in R or other software packages that generate random Bernoulli trials.

Next, we will look at Bayes factors in normal models with unknown variances using the Cauchy prior so that results are less sensitive to the choice of \(n_0\) .

5.2 Comparing Two Paired Means using Bayes Factors

We previously learned that we can use a paired t-test to compare means from two paired samples. In this section, we will show how Bayes factors can be expressed as a function of the t-statistic for comparing the means, and provide posterior probabilities of the hypotheses that the means are equal or different.

Example 5.1 Trace metals in drinking water affect the flavor, and unusually high concentrations can pose a health hazard. Ten pairs of data were taken measuring the zinc concentration in bottom and surface water at ten randomly sampled locations, as listed in Table 5.1 .

Water samples collected at the same location, on the surface and the bottom, cannot be assumed to be independent of each other. However, it may be reasonable to assume that the differences in the concentration at the bottom and the surface in randomly sampled locations are independent of each other.

Table 5.1: Zinc in drinking water

location   bottom   surface   difference
1          0.430    0.415     0.015
2          0.266    0.238     0.028
3          0.567    0.390     0.177
4          0.531    0.410     0.121
5          0.707    0.605     0.102
6          0.716    0.609     0.107
7          0.651    0.632     0.019
8          0.589    0.523     0.066
9          0.469    0.411     0.058
10         0.723    0.612     0.111

To start modeling, we will treat the ten differences as a random sample from a normal population where the parameter of interest is the difference between the average zinc concentration at the bottom and the average zinc concentration at the surface, or the mean difference, \(\mu\) .

In mathematical terms, we have

  • Random sample of \(n= 10\) differences \(Y_1, \ldots, Y_n\)
  • Normal population with mean \(\mu \equiv \mu_B - \mu_S\)

In this case, we have no information about the variability in the data, and we will treat the variance, \(\sigma^2\) , as unknown.

The hypothesis that the mean concentration at the surface and at the bottom are the same is equivalent to saying \(\mu = 0\) . The second hypothesis is that there is a difference between the mean bottom and surface concentrations, or equivalently that the mean difference \(\mu \neq 0\) .

In other words, we are going to compare the following hypotheses:

  • \(H_1: \mu_B = \mu_S \Leftrightarrow \mu = 0\)
  • \(H_2: \mu_B \neq \mu_S \Leftrightarrow \mu \neq 0\)

The Bayes factor is the ratio between the distributions of the data under each hypothesis, which does not depend on any unknown parameters.

\[\textit{BF}[H_1 : H_2] = \frac{p(\text{data}\mid H_1)} {p(\text{data}\mid H_2)}\]

To obtain the Bayes factor, we need to integrate over the prior distributions under each hypothesis to obtain the marginal distributions of the data. For example, the denominator is

\[p(\text{data}\mid H_2) = \iint p(\text{data}\mid \mu, \sigma^2)\, p(\mu \mid \sigma^2)\, p(\sigma^2)\, d \mu \, d\sigma^2\]

This requires specifying the following priors:

  • \(\mu \mid \sigma^2, H_2 \sim \textsf{Normal}(0, \sigma^2/n_0)\)
  • \(p(\sigma^2) \propto 1/\sigma^2\) for both \(H_1\) and \(H_2\)

\(\mu\) is exactly zero under the hypothesis \(H_1\) . For \(\mu\) under \(H_2\) , we start with the same conjugate normal prior as we used in Section 5.1 for testing the normal mean with known variance. Because the prior variance of \(\mu\) scales with \(\sigma^2\) , we model \(\mu \mid \sigma^2\) instead of \(\mu\) itself.

The \(\sigma^2\) appears in both the numerator and the denominator of the Bayes factor. For the default or reference case, we use the Jeffreys prior (a.k.a. reference prior) on \(\sigma^2\) . As long as we have more than two observations, this (improper) prior will lead to a proper posterior.

After integration and rearranging, one can derive a simple expression for the Bayes factor:

\[\textit{BF}[H_1 : H_2] = \left(\frac{n + n_0}{n_0} \right)^{1/2} \left( \frac{ t^2 \frac{n_0}{n + n_0} + \nu } { t^2 + \nu} \right)^{\frac{\nu + 1}{2}}\]

This is a function of the t-statistic

\[t = \frac{|\bar{Y}|}{s/\sqrt{n}},\]

where \(s\) is the sample standard deviation and the degrees of freedom \(\nu = n-1\) (sample size minus one).
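As a sketch of this formula, we can evaluate it on the zinc differences from Table 5.1. Note that the choice \(n_0 = 1\) (the unit information normal prior) is ours for illustration; it is not the Cauchy/JZS prior used for the output discussed below, so the resulting Bayes factor differs from the one quoted there:

```python
import math
from statistics import mean, stdev

diffs = [0.015, 0.028, 0.177, 0.121, 0.102, 0.107,
         0.019, 0.066, 0.058, 0.111]   # bottom - surface, Table 5.1
n = len(diffs)
nu = n - 1                             # degrees of freedom
t = abs(mean(diffs)) / (stdev(diffs) / math.sqrt(n))

n0 = 1.0                               # unit information prior (our choice)
bf_12 = (math.sqrt((n + n0) / n0) *
         ((t * t * n0 / (n + n0) + nu) / (t * t + nu)) ** ((nu + 1) / 2))
print(round(t, 2), round(1 / bf_12, 1))
```

The t-statistic is about 4.86, and under this normal prior the evidence in favor of \(H_2\) (the reciprocal of \(\textit{BF}[H_1:H_2]\)) is strong, broadly consistent with the JZS result reported below.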

As we saw in the case of Bayes factors with known variance, we cannot use an improper prior on \(\mu\) , because when \(n_0 \to 0\) , \(\textit{BF}[H_1:H_2] \to \infty\) , favoring \(H_1\) regardless of the magnitude of the t-statistic. Arbitrarily small, vague choices for \(n_0\) also lead to arbitrarily large Bayes factors in favor of \(H_1\) – another example of Bartlett’s or the Jeffreys-Lindley paradox.

Sir Harold Jeffreys discovered another paradox when testing with the conjugate normal prior, known as the information paradox . His thought experiment fixed the sample size \(n\) and the prior sample size \(n_0\) . He then considered what would happen to the Bayes factor as the sample mean moved further and further away from the hypothesized mean, measured in terms of standard errors with the t-statistic, i.e., \(|t| \to \infty\) . As the t-statistic, or the information about the mean, moved further and further from zero, the Bayes factor goes to a constant depending on \(n, n_0\) rather than providing overwhelming support for \(H_2\) .

The bounded Bayes factor is

\[\textit{BF}[H_1 : H_2] \to \left( \frac{n_0}{n_0 + n} \right)^{\frac{n - 1}{2}}\]

Jeffreys wanted a prior with \(\textit{BF}[H_1 : H_2] \to 0\) (or equivalently, \(\textit{BF}[H_2 : H_1] \to \infty\) ) as the information from the t-statistic grows, since a sample mean far from the hypothesized mean should favor \(H_2\) .

To resolve the paradox that arises when the information in the t-statistic favors \(H_2\) but the Bayes factor does not, Jeffreys showed that no normal prior on \(\mu\) could work.

But a Cauchy prior on \(\mu\) would resolve it: \(\textit{BF}[H_2 : H_1]\) goes to infinity as the sample mean moves further from the hypothesized mean. Recall that the Cauchy prior is written as \(\textsf{C}(0, r^2 \sigma^2)\) . While Jeffreys used a default of \(r = 1\) , smaller values of \(r\) can be used if smaller effects are expected.

The combination of the Jeffreys prior on \(\sigma^2\) and this Cauchy prior on \(\mu\) under \(H_2\) is sometimes referred to as the Jeffreys-Zellner-Siow prior .

However, there is no closed-form expression for the Bayes factor under the Cauchy prior. To obtain the Bayes factor, we must use numerical integration or simulation methods.

We will use the bayes_inference function from the package to test whether the mean difference is zero in Example 5.1 (zinc), using the JZS (Jeffreys-Zellner-Siow) prior.


With equal prior probabilities on the two hypotheses, the Bayes factor equals the posterior odds. From the output, we see that the hypothesis \(H_2\) , that the mean difference is different from 0, is almost 51 times more likely than the hypothesis \(H_1\) that the average concentration is the same at the surface and the bottom.

To sum up, we have used the Cauchy prior as a default prior for testing hypotheses about a normal mean when the variance is unknown. This does require numerical integration, but it is implemented in the bayes_inference function. If you expect that the effect sizes will be small, smaller values of \(r\) are recommended.

It is often important to quantify the magnitude of the difference in addition to testing. The Cauchy Prior provides a default prior for both testing and inference; it avoids problems that arise with choosing a value of \(n_0\) (prior sample size) in both cases. In the next section, we will illustrate using the Cauchy prior for comparing two means from independent normal samples.

5.3 Comparing Independent Means: Hypothesis Testing

In the previous section, we described Bayes factors for testing whether the mean difference of paired samples was zero. In this section, we will consider a slightly different problem – we have two independent samples, and we would like to test the hypothesis that the means are different or equal.

Example 5.2 We illustrate the testing of independent groups with data from a 2004 survey of birth records from North Carolina, which are available in the package.

The variable of interest is the weight gain of mothers during pregnancy. We have two groups defined by a categorical variable with two levels, younger mom and older mom.

Question of interest : Do the data provide convincing evidence of a difference between the average weight gain of older moms and the average weight gain of younger moms?

We will view the data as a random sample from two populations, older and younger moms. The two groups are modeled as:

\[\begin{equation} \begin{aligned} Y_{O,i} & \mathrel{\mathop{\sim}\limits^{\rm iid}} \textsf{N}(\mu + \alpha/2, \sigma^2) \\ Y_{Y,i} & \mathrel{\mathop{\sim}\limits^{\rm iid}} \textsf{N}(\mu - \alpha/2, \sigma^2) \end{aligned} \tag{5.2} \end{equation}\]

The model for weight gain for older moms uses the subscript \(O\) , and it assumes that the observations are independent and identically distributed, with mean \(\mu+\alpha/2\) and variance \(\sigma^2\) .

For the younger women, the observations with the subscript \(Y\) are independent and identically distributed with a mean \(\mu-\alpha/2\) and variance \(\sigma^2\) .

Using this representation of the means in the two groups, the difference in means simplifies to \(\alpha\) – the parameter of interest.

\[(\mu + \alpha/2) - (\mu - \alpha/2) = \alpha\]

You may ask, “Why don’t we set the average weight gain of older women to \(\mu+\alpha\) , and the average weight gain of younger women to \(\mu\) ?” We need the parameter \(\alpha\) to be present in both \(Y_{O,i}\) (the group of older women) and \(Y_{Y,i}\) (the group of younger women).

We have the following competing hypotheses:

  • \(H_1: \alpha = 0 \Leftrightarrow\) The means are not different.
  • \(H_2: \alpha \neq 0 \Leftrightarrow\) The means are different.

In this representation, \(\mu\) represents the overall weight gain for all women. (Does the model in Equation (5.2) make more sense now?) To test the hypothesis, we need to specify prior distributions for \(\alpha\) under \(H_2\) (c.f. \(\alpha = 0\) under \(H_1\) ) and priors for \(\mu,\sigma^2\) under both hypotheses.

Recall that the Bayes factor is the ratio of the distribution of the data under the two hypotheses.

\[\begin{aligned} \textit{BF}[H_1 : H_2] &= \frac{p(\text{data}\mid H_1)} {p(\text{data}\mid H_2)} \\ &= \frac{\iint p(\text{data}\mid \alpha = 0,\mu, \sigma^2 )p(\mu, \sigma^2 \mid H_1) \, d\mu \,d\sigma^2} {\int \iint p(\text{data}\mid \alpha, \mu, \sigma^2) p(\alpha \mid \sigma^2) p(\mu, \sigma^2 \mid H_2) \, d \mu \, d\sigma^2 \, d \alpha} \end{aligned}\]

As before, we need to average over the uncertainty in the parameters to obtain the unconditional distribution of the data. As in the test about a single mean, we cannot use improper or non-informative priors on \(\alpha\) for testing.

Under \(H_2\) , we use the Cauchy prior for \(\alpha\) , or equivalently, the Cauchy prior on the standardized effect \(\delta\) with the scale of \(r\) :

\[\delta = \alpha/\sigma \sim \textsf{C}(0, r^2)\]

Now, under both \(H_1\) and \(H_2\) , we use the Jeffreys reference prior on \(\mu\) and \(\sigma^2\) :

\[p(\mu, \sigma^2) \propto 1/\sigma^2\]

While this is an improper prior on \(\mu\) , it does not suffer from Bartlett’s or the Jeffreys-Lindley paradox, because \(\mu\) is a common parameter in the model under \(H_1\) and \(H_2\) . This is another example of the Jeffreys-Zellner-Siow prior.

As in the single-mean case, we will need numerical algorithms to obtain the Bayes factor. The following output illustrates the testing, using the bayes_inference function from the package.


We see that the Bayes factor for \(H_1\) to \(H_2\) is about 5.7, with positive support for \(H_1\) that there is no difference in average weight gain between younger and older women. Using equal prior probabilities, the probability that there is a difference in average weight gain between the two groups is about 0.15 given the data. Based on the interpretation of Bayes factors from Table 3.5 , this is in the range of “positive” (between 3 and 20).

To recap, we have illustrated testing hypotheses about population means with two independent samples, using a Cauchy prior on the difference in the means. One assumption that we have made is that the variances are equal in both groups . The case where the variances are unequal is referred to as the Behrens-Fisher problem, and it is beyond the scope of this course. In the next section, we will look at another example to put everything together with testing and discuss summarizing results.

5.4 Inference after Testing

In this section, we will work through another example for comparing two means using both hypothesis tests and interval estimates, with an informative prior. We will also illustrate how to adjust the credible interval after testing.

Example 5.3 We will use the North Carolina survey data to examine the relationship between infant birth weight and whether the mother smoked during pregnancy. The response variable is the birth weight of the baby in pounds. A categorical variable provides the status of the mother as a smoker or non-smoker.

We would like to answer two questions:

Is there a difference in average birth weight between the two groups?

If there is a difference, how large is the effect?

As before, we need to specify models for the data and priors. We treat the data as a random sample for the two populations, smokers and non-smokers.

The birth weights of babies born to non-smokers, designated by the subscript \(N\) , are assumed to be independent and identically distributed from a normal distribution with mean \(\mu + \alpha/2\) , as in Section 5.3 .

\[Y_{N,i} \mathrel{\mathop{\sim}\limits^{\rm iid}}\textsf{Normal}(\mu + \alpha/2, \sigma^2)\]

The birth weights of the babies born to smokers, designated by the subscript \(S\) , are also assumed to have a normal distribution, but with mean \(\mu - \alpha/2\) .

\[Y_{S,i} \mathrel{\mathop{\sim}\limits^{\rm iid}}\textsf{Normal}(\mu - \alpha/2, \sigma^2)\]

The difference in the average birth weights is the parameter \(\alpha\) , because

\[(\mu + \alpha/2) - (\mu - \alpha/2) = \alpha.\]

The hypotheses that we will test are \(H_1: \alpha = 0\) versus \(H_2: \alpha \ne 0\) .

We will still use the Jeffreys-Zellner-Siow Cauchy prior. However, since we may expect the standardized effect size to not be as strong, we will use a scale of \(r = 0.5\) rather than 1.

Therefore, under \(H_2\) , we have \[\delta = \alpha/\sigma \sim \textsf{C}(0, r^2), \text{ with } r = 0.5.\]

Under both \(H_1\) and \(H_2\) , we will use the reference priors on \(\mu\) and \(\sigma^2\) :

\[\begin{aligned} p(\mu) &\propto 1 \\ p(\sigma^2) &\propto 1/\sigma^2 \end{aligned}\]

The input to the bayes_inference function is similar, but now we will specify that \(r = 0.5\) .


We see that the Bayes factor is 1.44, which weakly favors there being a difference in average birth weights for babies whose mothers smoked versus those who did not. Converting this to a probability, we find that there is about a 60% chance that the average birth weights are different.

While looking at evidence of there being a difference is useful, Bayes factors and posterior probabilities do not convey any information about the magnitude of the effect. Reporting a credible interval or the complete posterior distribution is more relevant for quantifying the magnitude of the effect.

Using the same function, we can also generate samples from the posterior distribution under \(H_2\) .

The 2.5 and 97.5 percentiles for the difference in the means provide a 95% credible interval of 0.023 to 0.57 pounds for the difference in average birth weight. The MCMC output shows not only summaries of the difference in means \(\alpha\) , but also of the other parameters in the model.

In particular, the Cauchy prior arises by placing a gamma prior on \(n_0\) in the conjugate normal prior. The output therefore also provides quantiles for \(n_0\) after updating with the current data.

The row labeled effect size is the standardized effect size \(\delta\) , indicating that the effects are indeed small relative to the noise in the data.

Estimates of effect under H2

Figure 5.4: Estimates of effect under H2

Figure 5.4 shows the posterior density for the difference in means, with the 95% credible interval indicated by the shaded area. Under \(H_2\) , there is a 95% chance that the average birth weight of babies born to non-smokers is 0.023 to 0.57 pounds higher than that of babies born to smokers.

The previous statement assumes that \(H_2\) is true and is a conditional probability statement. In mathematical terms, the statement is equivalent to

\[P(0.023 < \alpha < 0.57 \mid \text{data}, H_2) = 0.95\]

However, we still have quite a bit of uncertainty based on the current data, because given the data, the probability of \(H_2\) being true is 0.59.

\[P(H_2 \mid \text{data}) = 0.59\]

Using the law of total probability, we can compute the probability that \(\mu\) is between 0.023 and 0.57 as below:

\[\begin{aligned} & P(0.023 < \alpha < 0.57 \mid \text{data}) \\ = & P(0.023 < \alpha < 0.57 \mid \text{data}, H_1)P(H_1 \mid \text{data}) + P(0.023 < \alpha < 0.57 \mid \text{data}, H_2)P(H_2 \mid \text{data}) \\ = & I( 0 \text{ in CI }) P(H_1 \mid \text{data}) + 0.95 \times P(H_2 \mid \text{data}) \\ = & 0 \times 0.41 + 0.95 \times 0.59 = 0.5605 \end{aligned}\]
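The averaging step can be verified directly (numbers taken from the example):

```python
p_h2 = 0.59                 # posterior probability of H2
p_h1 = 1 - p_h2             # posterior probability of H1
in_ci_given_h1 = 0.0        # alpha = 0 under H1, and 0 is not in (0.023, 0.57)
in_ci_given_h2 = 0.95       # credible level of the interval under H2

# Law of total probability over the two hypotheses
p_total = in_ci_given_h1 * p_h1 + in_ci_given_h2 * p_h2
print(round(p_total, 4))  # 0.5605
```

Because the interval excludes 0, the \(H_1\) term contributes nothing, and the unconditional probability is simply \(0.95 \times 0.59 = 0.5605\).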

Finally, we get that the probability that \(\alpha\) is in the interval, given the data, averaging over both hypotheses, is roughly 0.56. The unconditional statement is the average birth weight of babies born to nonsmokers is 0.023 to 0.57 pounds higher than that of babies born to smokers with probability 0.56. This adjustment addresses the posterior uncertainty and how likely \(H_2\) is.
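The averaging over the two hypotheses is simple arithmetic; a short Python sketch using the posterior probabilities reported above:

```python
# Law of total probability: P(0.023 < alpha < 0.57 | data),
# averaging over H1 (where alpha = 0) and H2.
p_h1, p_h2 = 0.41, 0.59       # posterior probabilities of the hypotheses
p_in_ci_given_h1 = 0.0        # 0 is not inside the interval (0.023, 0.57)
p_in_ci_given_h2 = 0.95       # 95% credible interval under H2

p_in_ci = p_in_ci_given_h1 * p_h1 + p_in_ci_given_h2 * p_h2
print(round(p_in_ci, 4))      # 0.5605
```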

To recap, we have illustrated testing, followed by reporting credible intervals, and using a Cauchy prior distribution that assumed smaller standardized effects. After testing, it is common to report credible intervals conditional on \(H_2\) . We also have shown how to adjust the probability of the interval to reflect our posterior uncertainty about \(H_2\) . In the next chapter, we will turn to regression models to incorporate continuous explanatory variables.

8.1 A Single Population Mean Using the Normal Distribution

A confidence interval for a population mean with a known standard deviation is based on the fact that the sample means follow an approximately normal distribution. Suppose that our sample has a mean of x̄ = 10 and we have constructed the 90 percent confidence interval (5, 15), where the margin of error = 5.

Calculating the Confidence Interval

To construct a confidence interval for a single unknown population mean, μ , where the population standard deviation is known , we need x̄ as an estimate for μ , and we need the margin of error. The margin of error for the population mean is called the error bound for a population mean ( EBM ). The sample mean, x̄, is the point estimate of the unknown population mean, μ .

The confidence interval ( CI ) estimate will have the form:

(point estimate – error bound, point estimate + error bound) or, in symbols, (x̄ − EBM, x̄ + EBM).

The margin of error ( EBM ) depends on the confidence level ( CL ). The confidence level is often considered the probability that the calculated confidence interval estimate will contain the true population parameter. However, it is more accurate to state that the confidence level is the percentage of confidence intervals that contain the true population parameter when repeated samples are taken. Most often, the person constructing the confidence interval will choose a confidence level of 90 percent or higher, because that person wants to be reasonably certain of his or her conclusions.

Another probability, which is called alpha ( α ), is related to the confidence level, CL . Alpha is the probability that the confidence interval does not contain the unknown population parameter. Mathematically, alpha can be computed as α = 1 − CL.

Example 8.1

  • Suppose we have collected data from a sample. We know the sample mean, but we do not know the mean for the entire population.
  • The sample mean is seven, and the error bound for the mean is 2.5.

The confidence interval is (7 – 2.5, 7 + 2.5), and calculating the values gives (4.5, 9.5).

If the confidence level is 95 percent, then we say, "We estimate with 95 percent confidence that the true value of the population mean is between 4.5 and 9.5."

Suppose we have data from a sample. The sample mean is 15, and the error bound for the mean is 3.2.

What is the confidence interval estimate for the population mean?


To get a 90 percent confidence interval, we must include the central 90 percent of the probability of the normal distribution. If we include the central 90 percent, we leave out a total of α = 10 percent in both tails, or 5 percent in each tail, of the normal distribution.

The critical value 1.645 is the z -score in a standard normal probability distribution that puts an area of 0.90 in the center, an area of 0.05 in the far left tail, and an area of 0.05 in the far right tail. To capture the central 90 percent, we must go out 1.645 standard deviations on either side of the calculated sample mean. The critical value will change depending on the confidence level of the interval.

It is important that the standard deviation used be appropriate for the parameter we are estimating, so in this section, we need to use the standard deviation that applies to sample means, which is σ/√n. The fraction σ/√n is commonly called the standard error of the mean in order to distinguish clearly the standard deviation for a mean from the population standard deviation, σ .

  • X̄ is normally distributed, that is, X̄ ~ N(μ_X, σ/√n).
  • When the population standard deviation σ is known, we use a normal distribution to calculate the error bound.

To construct a confidence interval estimate for an unknown population mean, we need data from a random sample. The steps to construct and interpret the confidence interval are as follows:

  • Calculate the sample mean, x̄, from the sample data. Remember, in this section, we already know the population standard deviation, σ .
  • Find the z -score that corresponds to the confidence level.
  • Calculate the error bound EBM .
  • Construct the confidence interval.
  • If we denote the critical z -score by z α/2, and the sample size by n , then the formula for the confidence interval with confidence level CL = 1 − α is given by (x̄ − z α/2 × σ/√n, x̄ + z α/2 × σ/√n).
  • Write a sentence that interprets the estimate in the context of the situation in the problem. (Explain what the confidence interval means, in the words of the problem.)

We will first examine each step in more detail and then illustrate the process with some examples.
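The steps above can also be sketched in a few lines of Python; this is a minimal sketch, not part of the original text, in which the helper name `z_confidence_interval` is ours and the standard library's `NormalDist.inv_cdf` plays the role of the calculator's invNorm:

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, cl):
    """CI for a population mean when the population sigma is known."""
    alpha = 1 - cl
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value z_{alpha/2}
    ebm = z * sigma / sqrt(n)                # error bound for the mean
    return xbar - ebm, xbar + ebm

lo, hi = z_confidence_interval(xbar=68, sigma=3, n=36, cl=0.90)
print(round(lo, 4), round(hi, 4))  # 67.1776 68.8224
```

Using the unrounded critical value 1.6449 gives (67.1776, 68.8224); the text's rounded value 1.645 gives (67.1775, 68.8225).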

Finding the z -Score for the Stated Confidence Level

When we know the population standard deviation, σ , we use a standard normal distribution to calculate the error bound EBM and construct the confidence interval. We need to find the value of z that puts an area equal to the confidence level (in decimal form) in the middle of the standard normal distribution Z ~ N (0, 1).

The confidence level, CL , is the area in the middle of the standard normal distribution. CL = 1 – α , so α is the area that is split equally between the two tails. Each of the tails contains an area equal to α/2.

The z -score that has an area of α/2 to its right is denoted by z α/2.

For example, when CL = 0.95, α = 0.05, and α/2 = 0.025, we write z α/2 = z 0.025 .

The area to the right of z 0.025 is 0.025 and the area to the left of z 0.025 is 1 – 0.025 = 0.975.

z α/2 = z 0.025 = 1.96, using a calculator, computer, or standard normal probability table.

A standard normal table (see appendices) shows that the probability from 0 to 1.96 is 0.4750, so the probability in the right tail beyond the critical value 1.96 is 0.5 – 0.475 = 0.025.
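The table lookup can be checked with Python's standard library, whose `NormalDist.inv_cdf` is the analogue of the calculator's invNorm:

```python
from statistics import NormalDist

# Area of 0.975 to the LEFT of the critical value, i.e. 0.025 in the right tail.
z = NormalDist(mu=0, sigma=1).inv_cdf(0.975)
print(round(z, 2))  # 1.96
```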

Using the TI-83, 83+, 84, 84+ Calculator

invNorm (0.975, 0, 1) = 1.96. In this command, the value 0.975 is the total area to the left of the critical value that we are looking to calculate. The parameters 0 and 1 are the mean value and the standard deviation of the standard normal distribution Z.

Remember to use the area to the LEFT of z α/2. In this chapter, the last two inputs in the invNorm command are 0, 1, because you are using a standard normal distribution Z with mean 0 and standard deviation 1.

Calculating the Margin of Error EBM

The error bound formula for an unknown population mean, μ , when the population standard deviation, σ , is known is

Margin of error = (z α/2)(σ/√n).

Constructing the Confidence Interval

The confidence interval estimate has the format sample mean plus or minus the margin of error.

The graph gives a picture of the entire situation:

CL + α/2 + α/2 = CL + α = 1.

Writing the Interpretation

The interpretation should clearly state the confidence level ( CL ), explain which population parameter is being estimated (here, a population mean ), and state the confidence interval (both endpoints): "We estimate with ___percent confidence that the true population mean (include the context of the problem) is between ___ and ___ (include appropriate units)."

Example 8.2

Suppose scores on exams in statistics are normally distributed with an unknown population mean and a population standard deviation of three points. A random sample of 36 scores is taken and gives a sample mean score of 68. Find a confidence interval estimate for the population mean exam score (the mean score on all exams).

Find a 90 percent confidence interval for the true (population) mean of statistics exam scores.

  • You can use technology to calculate the confidence interval directly.
  • The first solution is shown step-by-step (Solution A).
  • The second solution uses the TI-83, 83+, and 84+ calculators (Solution B).
  • x̄ = 68
  • EBM = (z α/2)(σ/√n)
  • σ = 3; n = 36
  • The confidence level is 90 percent ( CL = 0.90).

The area to the right of z 0.05 is 0.05 and the area to the left of z 0.05 is 1 – 0.05 = 0.95.

The critical value is z α/2 = z 0.05 = 1.645, found using invNorm(0.95, 0, 1) on the TI-83, 83+, and 84+ calculators. This can also be found using appropriate commands on other calculators, using a computer, or using a probability table for the standard normal distribution.

EBM = (1.645)(3/√36) = 0.8225

x̄ – EBM = 68 – 0.8225 = 67.1775

x̄ + EBM = 68 + 0.8225 = 68.8225

The 90 percent confidence interval is (67.1775, 68.8225).

Press STAT and arrow over to TESTS . Arrow down to 7:ZInterval . Press ENTER . Arrow to Stats and press ENTER . Arrow down and enter 3 for σ , 68 for x̄, 36 for n , and .90 for C-level . Arrow down to Calculate and press ENTER . The confidence interval is (to three decimal places) (67.178, 68.822).
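Without a TI calculator, the same interval can be reproduced in a few lines of Python (a sketch using the text's rounded critical value 1.645):

```python
from math import sqrt

xbar, sigma, n = 68, 3, 36
z = 1.645                      # z_{0.05}, rounded as in the text
ebm = z * sigma / sqrt(n)      # 1.645 * 3 / 6 = 0.8225
print(round(xbar - ebm, 4), round(xbar + ebm, 4))  # 67.1775 68.8225
```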

Interpretation

Explanation of 90 percent confidence level.

Suppose average pizza delivery times are normally distributed with an unknown population mean and a population standard deviation of 6 minutes. A random sample of 28 pizza delivery restaurants is taken and has a sample mean delivery time of 36 min.

Find a 90 percent confidence interval estimate for the population mean delivery time.

Example 8.3

The specific absorption rate (SAR) for a cell phone measures the amount of radio frequency (RF) energy absorbed by the user’s body when using the handset. Every cell phone emits RF energy. Different phone models have different SAR measures. For certification from the Federal Communications Commission for sale in the United States, the SAR level for a cell phone must be no more than 1.6 watts per kilogram. Table 8.1 shows the highest SAR level for a random selection of cell phone models from a cell phone company.

Phone Model # SAR Phone Model # SAR Phone Model # SAR
800 1.11 1800 1.36 2800 0.74
900 1.48 1900 1.34 2900 0.5
1000 1.43 2000 1.18 3000 0.4
1100 1.3 2100 1.3 3100 0.867
1200 1.09 2200 1.26 3200 0.68
1300 0.455 2300 1.29 3300 0.51
1400 1.41 2400 0.36 3400 1.13
1500 0.82 2500 0.52 3500 0.3
1600 0.78 2600 1.6 3600 1.48
1700 1.25 2700 1.39 3700 1.38

Find a 98 percent confidence interval for the true (population) mean of the SARs for cell phones. Assume that the population standard deviation is σ = 0.337.

The sample mean is x̄ = 1.024. This is calculated by adding the specific absorption rates for the 30 cell phones in the sample, and dividing the result by 30.

Next, find the EBM . Because you are creating a 98 percent confidence interval, CL = 0.98.

You need to find z 0.01 , having the property that the area under the normal density curve to the right of z 0.01 is 0.01 and the area to the left is 0.99. Use your calculator, a computer, or a probability table for the standard normal distribution to find z 0.01 = 2.326.

To find the 98 percent confidence interval, find x̄ ± EBM.

We estimate with 98 percent confidence that the true SAR mean for the population of cell phones in the United States is between 0.8809 and 1.1671 watts per kilogram.

  • Press STAT and arrow over to TESTS.
  • Arrow down to 7:ZInterval.
  • Press ENTER.
  • Arrow to Stats and press ENTER.
  • σ: 0.337; x̄: 1.024; n: 30
  • C-level: 0.98
  • Arrow down to Calculate and press ENTER.
  • The confidence interval is (to three decimal places) (0.881, 1.167).
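The same 98 percent interval can be sketched in Python using the text's rounded critical value 2.326:

```python
from math import sqrt

xbar, sigma, n = 1.024, 0.337, 30
z = 2.326                      # z_{0.01} for a 98% confidence level
ebm = z * sigma / sqrt(n)      # error bound for the mean
print(round(xbar - ebm, 4), round(xbar + ebm, 4))  # 0.8809 1.1671
```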

Table 8.2 shows a different random sampling of 20 cell phone models. Use these data to calculate a 93 percent confidence interval for the true mean SAR for cell phones certified for use in the United States. As previously, assume that the population standard deviation is σ = 0.337.

Phone Model # SAR Phone Model # SAR
450 1.48 1450 1.53
550 0.8 1550 0.68
650 1.15 1650 1.4
750 1.36 1750 1.24
850 0.77 1850 0.57
950 0.462 1950 0.2
1050 1.36 2050 0.51
1150 1.39 2150 0.3
1250 1.3 2250 0.73
1350 0.7 2350 0.869

Notice the difference in the confidence intervals calculated in Example 8.3 and the following Try It exercise. These intervals are different for several reasons: they are calculated from different samples, the samples are different sizes, and the intervals are calculated for different levels of confidence. Even though the intervals are different, they do not yield conflicting information. The effects of these kinds of changes are the subject of the next section in this chapter.

Changing the Confidence Level or Sample Size

Example 8.4

Suppose we change the original problem in Example 8.2 by using a 95 percent confidence level. Find a 95 percent confidence interval for the true (population) mean statistics exam score.

To find the confidence interval, you need the sample mean, x̄, and the EBM .

  • x̄ = 68
  • EBM = (z α/2)(σ/√n)
  • σ = 3; n = 36
  • The confidence level is 95 percent ( CL = 0.95).

The area to the right of z 0.025 is 0.025, and the area to the left of z 0.025 is 1 – 0.025 = 0.975.

z α/2 = z 0.025 = 1.96, found using invNorm(0.975, 0, 1) on the TI-83, 83+, or 84+ calculators. (This can also be found using appropriate commands on other calculators, using a computer, or using a probability table for the standard normal distribution.)

Notice that the EBM is larger for the 95 percent confidence level than it was for the 90 percent confidence level in the original problem.

Explanation of 95 percent Confidence Level

Summary: Effect of Changing the Confidence Level

  • Increasing the confidence level increases the error bound, making the confidence interval wider.
  • Decreasing the confidence level decreases the error bound, making the confidence interval narrower.

Refer back to the pizza-delivery Try It exercise. The population standard deviation is six minutes and the sample mean delivery time is 36 minutes. Use a sample size of 20. Find a 95 percent confidence interval estimate for the true mean pizza-delivery time.

Example 8.5

Suppose we change the original problem in Example 8.2 to see what happens to the error bound if the sample size is changed.

Leave everything the same except the sample size. Use the original 90 percent confidence level. What happens to the error bound and the confidence interval if we increase the sample size and use n = 100 instead of n = 36? What happens if we decrease the sample size to n = 25 instead of n = 36?

  • x̄ = 68
  • EBM = (z α/2)(σ/√n)
  • σ = 3, the confidence level is 90 percent ( CL = 0.90), z α/2 = z 0.05 = 1.645.

When n = 100, EBM = (z α/2)(σ/√n) = (1.645)(3/√100) = 0.4935.

When n = 25, EBM = (z α/2)(σ/√n) = (1.645)(3/√25) = 0.987.
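The effect of the sample size on the error bound can be made concrete with a short loop (same σ and critical value as above):

```python
from math import sqrt

sigma, z = 3, 1.645            # z_{0.05} for a 90% confidence level
for n in (25, 36, 100):
    ebm = z * sigma / sqrt(n)  # error bound shrinks as n grows
    print(n, round(ebm, 4))
```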

Summary: Effect of Changing the Sample Size

  • Increasing the sample size causes the error bound to decrease, making the confidence interval narrower.
  • Decreasing the sample size causes the error bound to increase, making the confidence interval wider.

Refer back to the pizza-delivery Try It exercise. The mean delivery time is 36 minutes and the population standard deviation is six minutes. Assume the sample size is changed to 50 restaurants with the same sample mean. Find a 90 percent confidence interval estimate for the population mean delivery time.

Working Backward to Find the Error Bound or Sample Mean

When we calculate a confidence interval, we find the sample mean, calculate the error bound, and use them to calculate the confidence interval. However, sometimes when we read statistical studies, the study may state the confidence interval only. If we know the confidence interval, we can work backward to find both the error bound and the sample mean.

  • To find the error bound: from the upper value of the interval, subtract the sample mean,
  • or subtract the lower value of the interval from the upper value, then divide the difference by 2.
  • To find the sample mean: subtract the error bound from the upper value of the confidence interval,
  • or average the upper and lower endpoints of the confidence interval.

Notice that there are two methods to perform each calculation. You can choose the method that is easier to use with the information you know.

Example 8.6

Suppose we know that a confidence interval is (67.18, 68.82) and we want to find the error bound. We may know that the sample mean is 68, or perhaps our source only gives the confidence interval and does not tell us the value of the sample mean.

Calculate the error bound:

  • If we know that the sample mean is 68, EBM = 68.82 – 68 = 0.82.
  • If we do not know the sample mean, EBM = (68.82 − 67.18)/2 = 0.82. The margin of error is the quantity that we add and subtract from the sample mean to obtain the confidence interval. Therefore, the margin of error is half of the length of the interval.

Calculate the sample mean:

  • If we know the error bound, x̄ = 68.82 – 0.82 = 68.
  • If we do not know the error bound, x̄ = (67.18 + 68.82)/2 = 68.
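Both backward calculations are a line of arithmetic each; a sketch for the interval (67.18, 68.82):

```python
lower, upper = 67.18, 68.82

ebm = (upper - lower) / 2      # error bound: half the length of the interval
xbar = (upper + lower) / 2     # sample mean: midpoint of the interval
print(round(ebm, 2), round(xbar, 2))  # 0.82 68.0
```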

Suppose we know that a confidence interval is (42.12, 47.88). Find the error bound and the sample mean.

Calculating the Sample Size n

If researchers desire a specific margin of error, then they can use the error bound formula to calculate the required sample size. In this situation, we are given the desired margin of error, EBM , and we need to compute the sample size n .

The formula for sample size is n = z²σ²/EBM², found by solving the error bound formula for n . Always round the value of n up to the next integer.

In this formula, z is the critical value z α/2, corresponding to the desired confidence level. A researcher planning a study who wants a specified confidence level and error bound can use this formula to calculate the size of the sample needed for the study.

Example 8.7

The population standard deviation for the age of Foothill College students is 15 years. If we want to be 95 percent confident that the sample mean age is within two years of the true population mean age of Foothill College students, how many randomly selected Foothill College students must be surveyed?

  • From the problem, we know that σ = 15 and EBM = 2.
  • z = z 0.025 = 1.96, because the confidence level is 95 percent.
  • n = z²σ²/EBM² = (1.96)²(15)²/2² = 216.09, using the sample size equation.
  • Use n = 217. Always round the answer up to the next higher integer to ensure that the sample size is large enough.

Therefore, 217 Foothill College students should be surveyed in order to be 95 percent confident that we are within two years of the true population mean age of Foothill College students.
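A short sketch of this sample-size calculation, with the round-up made explicit via `math.ceil`:

```python
from math import ceil

sigma, ebm = 15, 2             # population sigma and desired margin of error
z = 1.96                       # z_{0.025} for 95% confidence
n = (z**2 * sigma**2) / ebm**2 # = 216.09
print(ceil(n))                 # 217 students must be surveyed
```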

The population standard deviation for the height of high school basketball players is three inches. If we want to be 95 percent confident that the sample mean height is within one inch of the true population mean height, how many randomly selected students must be surveyed?

This book may not be used in the training of large language models or otherwise be ingested into large language models or generative AI offerings without OpenStax's permission.

Want to cite, share, or modify this book? This book uses the Creative Commons Attribution License and you must attribute Texas Education Agency (TEA). The original material is available at: https://www.texasgateway.org/book/tea-statistics . Changes were made to the original material, including updates to art, structure, and other content updates.

Access for free at https://openstax.org/books/statistics/pages/1-introduction
  • Authors: Barbara Illowsky, Susan Dean
  • Publisher/website: OpenStax
  • Book title: Statistics
  • Publication date: Mar 27, 2020
  • Location: Houston, Texas
  • Book URL: https://openstax.org/books/statistics/pages/1-introduction
  • Section URL: https://openstax.org/books/statistics/pages/8-1-a-single-population-mean-using-the-normal-distribution

© Apr 16, 2024 Texas Education Agency (TEA). The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.


Improved Bayes-Based Reliability Prediction of Small-Sample Hall Current Sensors


1. Introduction
2. Design and Data Analysis of the Acceleration Test Scheme for Hall Current Sensors
2.1. Accelerated Degradation Testing Design
2.2. Degradation Data Processing and Analysis
3. Hall Current Sensor Prior Information Processing
3.1. Extrapolated Pseudo-Failure Life
3.2. Pseudo-Failure Life Distribution Test
3.3. Pseudo-Failure Life Expansion
4. Improved Bayes Method for Solving Weibull Distribution Parameters
4.1. Solving for the Prior Distribution
4.2. Calculating the Posterior Distribution
5. Reliability Prediction of Hall Current Sensors under Normal Stress Levels
6. Conclusions
Author Contributions, Data Availability Statement, Conflicts of Interest



| Fitting Function | Sample A1 SSE | Sample A1 R | Sample B1 SSE | Sample B1 R | Sample C1 SSE | Sample C1 R |
|---|---|---|---|---|---|---|
| Linear model | 8.96 × 10 | 0.9955 | 0.0074 | 0.9869 | 0.0087 | 0.9896 |
| Power function model | 0.0188 | 0.8909 | 0.0551 | 0.8864 | 0.0180 | 0.8424 |
| Logarithmic model | 0.0286 | 0.8341 | 0.0820 | 0.8309 | 0.0327 | 0.8407 |
| Exponential model | 0.2359 | 0.8840 | 0.0162 | 0.9643 | 0.0177 | 0.9624 |
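The model comparison above (SSE and correlation coefficient R for four candidate degradation laws) can be sketched with `scipy.optimize.curve_fit`. The time points and drift values below are hypothetical stand-ins, not the paper's measurements:

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical degradation record (days vs. drift); the paper's raw
# sample data are not reproduced here. These values follow a linear law.
t = np.array([30.0, 60.0, 90.0, 120.0, 150.0, 180.0])
y = np.array([0.36, 0.57, 0.78, 0.99, 1.20, 1.41])

models = {
    "linear":      lambda t, a, b: a * t + b,
    "power":       lambda t, a, b: a * np.power(t, b),
    "logarithmic": lambda t, a, b: a * np.log(t) + b,
    "exponential": lambda t, a, b: a * np.exp(b * t),
}
p0 = {"linear": (0.01, 0.0), "power": (0.01, 1.0),
      "logarithmic": (0.5, -1.0), "exponential": (0.3, 0.009)}

scores = {}
for name, f in models.items():
    popt, _ = curve_fit(f, t, y, p0=p0[name], maxfev=10000)
    fitted = f(t, *popt)
    sse = float(np.sum((y - fitted) ** 2))      # sum of squared errors
    r = float(np.corrcoef(fitted, y)[0, 1])     # correlation coefficient
    scores[name] = (sse, r)

best = min(scores, key=lambda k: scores[k][0])  # lowest SSE wins
```

With these synthetic data the linear model wins, mirroring how the table ranks the candidate functions per sample.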
| Sample Number | Pseudo-Failure Life (Days) |
|---|---|
| B1 | 255.50 |
| B2 | 246.33 |
| B3 | 282.59 |
| B4 | 268.58 |
| B5 | 271.25 |
| B6 | 278.08 |
| B7 | 262.12 |
| B8 | 271.54 |
| Distribution Type | H | P | AD* | CV |
|---|---|---|---|---|
| Weibull distribution | 0 | 0.9974 | 0.1682 | 0.7170 |
| Log-normal distribution | 0 | 0.9826 | 0.2251 | 0.6667 |
| Normal distribution | 0 | 0.9889 | 0.2079 | 0.6667 |
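The distribution-type comparison can be sketched by MLE-fitting each candidate to the pseudo-failure lives of samples B1–B8 and scoring goodness of fit. The paper ranks by an AD* (Anderson–Darling) statistic; a Kolmogorov–Smirnov statistic is used below as a stand-in:

```python
import numpy as np
from scipy import stats

# Pseudo-failure lives (days) of samples B1-B8 from the table above
life = np.array([255.50, 246.33, 282.59, 268.58,
                 271.25, 278.08, 262.12, 271.54])

dists = {"weibull": stats.weibull_min,
         "lognormal": stats.lognorm,
         "normal": stats.norm}

# Fix the location at 0 for the two-parameter Weibull and log-normal fits
fits = {"weibull": stats.weibull_min.fit(life, floc=0),
        "lognormal": stats.lognorm.fit(life, floc=0),
        "normal": stats.norm.fit(life)}

# Smaller K-S statistic = closer agreement between data and fitted CDF
ks = {name: stats.kstest(life, dists[name].cdf, args=fits[name]).statistic
      for name in dists}
best = min(ks, key=ks.get)
```

This is a sketch of the ranking idea only; with eight observations, any such statistic has limited discriminating power.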
| Number | Original Value | Predicted Value | Relative Error (%) |
|---|---|---|---|
| B1 | 255.50 | 255.59 | 0.035 |
| B2 | 246.33 | 246.32 | −0.004 |
| B3 | 282.59 | 282.58 | 0.003 |
| B4 | 268.58 | 268.69 | 0.041 |
| B5 | 271.25 | 271.37 | 0.044 |
| B6 | 278.08 | 278.32 | 0.086 |
| B7 | 262.12 | 262.12 | 0.000 |
| B8 | 271.54 | 271.55 | 0.003 |
| Starting Point | Sample Size | Mean | Standard Deviation | MC Error |
|---|---|---|---|---|
| 1000 | 19,001 | 269.1 | 0.5230 | 0.0077 |
| 1000 | 39,001 | 269.1 | 0.5259 | 0.0054 |
| 1000 | 59,001 | 269.1 | 0.5232 | 0.0042 |
| Starting Point | Sample Size | Mean | Standard Deviation | MC Error |
|---|---|---|---|---|
| 1000 | 19,001 | 23.0 | 5.377 | 0.2988 |
| 1000 | 39,001 | 23.11 | 5.028 | 0.1809 |
| 1000 | 59,001 | 23.11 | 5.002 | 0.1348 |
| Stress Combination | Magnitude Parameter α | Shape Parameter β |
|---|---|---|
| 50 °C-60%RH | 482.9 | 25.42 |
| 65 °C-70%RH | 269.1 | 23.11 |
| 80 °C-80%RH | 138.5 | 21.88 |
| Reliability Threshold | Service Life (Days) | Service Life (Years) |
|---|---|---|
| 0.80 | 2821 | 7.7 |
| 0.85 | 2785 | 7.6 |
| 0.90 | 2725 | 7.5 |
| 0.95 | 2654 | 7.3 |
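The service-life figures follow from inverting the two-parameter Weibull reliability function, R(t) = exp(−(t/α)^β), which gives t = α(−ln R)^(1/β). The α and β behind the service-life table correspond to extrapolated normal-use conditions that are not listed here, so the sketch below simply demonstrates the inversion using the 65 °C-70%RH parameters from the stress table:

```python
import math

def weibull_reliability(t, alpha, beta):
    """Two-parameter Weibull survival function R(t) = exp(-(t/alpha)**beta)."""
    return math.exp(-((t / alpha) ** beta))

def life_at_reliability(r, alpha, beta):
    """Invert R(t) = r for t:  t = alpha * (-ln r)**(1/beta)."""
    return alpha * (-math.log(r)) ** (1.0 / beta)

# Posterior-mean parameters for the 65 C / 70%RH stress combination (table above)
alpha, beta = 269.1, 23.11
t90 = life_at_reliability(0.90, alpha, beta)  # days until reliability falls to 0.9
```

Because β is large (a steep wear-out distribution), the life at R = 0.9 sits only modestly below the scale parameter α.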

Share and Cite

Chen, T.; Liu, Z.; Ju, L.; Lu, Y.; Wei, S. Improved Bayes-Based Reliability Prediction of Small-Sample Hall Current Sensors. Machines 2024 , 12 , 618. https://doi.org/10.3390/machines12090618




  • Open access
  • Published: 04 September 2024

Hypothesis paper: GDF15 demonstrated promising potential in Cancer diagnosis and correlated with cardiac biomarkers

  • Xiaohe Hao 1,
  • Zhenyu Zhang 1,
  • Jing Kong 2,
  • Rufei Ma 3,
  • Cuiping Mao 1,
  • Xun Peng 1,
  • Lisheng Liu 1,
  • Chuanxi Zhao 1,
  • Xinkai Mo 1,
  • Meijuan Cai 4,
  • Xiangguo Yu 1 &
  • Qinghai Lin 1

Cardio-Oncology, volume 10, Article number: 56 (2024)


Cardiovascular toxicity represents a significant adverse consequence of cancer therapies, yet effective biomarkers for its timely monitoring and diagnosis remain scarce. This study aims to provide first evidence elucidating the role of Growth Differentiation Factor 15 (GDF15) in cancer diagnosis and its specific association with cardiac indicators in cancer patients, thereby testing its potential for predicting the risk of CTRCD (cancer therapy related cardiac dysfunction).

Analysis of differentially expressed genes (DEGs), including GDF15, was performed using data from the public repositories of The Cancer Genome Atlas (TCGA) and the Gene Expression Omnibus (GEO). Cardiomyopathy is the most common heart disease, and its main clinical manifestations, such as heart failure and arrhythmia, are similar to those of CTRCD. GDF15 expression was examined in various normal and cancerous tissues and sera, using available databases and serum samples. The study further explored the correlation between GDF15 expression and the combined detection of cardiac troponin-T (c-TnT) and N-terminal prohormone of brain natriuretic peptide (NT-proBNP), assessing the combined diagnostic utility of these markers in predicting the risk of CTRCD through longitudinal electrocardiograms (ECG).

GDF15 emerged as a significant DEG in both cancer and cardiomyopathy disease models, demonstrating good diagnostic efficacy across multiple cancer types compared to healthy controls. GDF15 levels in cancer patients correlated with the established cardiac biomarkers c-TnT and NT-proBNP. Moreover, higher GDF15 levels correlated with an increased risk of ECG changes in the cancer cohort.

GDF15 demonstrated promising diagnostic potential in cancer identification; higher GDF15, combined with elevated cardiac markers, may play a role in the monitoring and prediction of CTRCD risk.

Introduction

Cancer and cardiovascular disease (CVD) share common risk factors such as obesity, smoking, and diabetes, and exhibit overlap in the signaling pathways that govern both normal cardiovascular physiology and tumor growth [ 1 , 2 ]. The incidence of cardiovascular toxicity during or after cancer treatment has been on the rise, with heart failure (HF) being the most prevalent and severe cardiovascular complication associated with cancer therapy [ 3 ]. This trend may be attributed to improved survival rates among cancer patients, which has led to an increased prevalence of cardiomyopathy associated with aging and changes in immune function. Additionally, the cardiotoxic effects of specific cancer treatments (including chemotherapy, targeted therapy, biological agents, and irradiation) have become more pronounced [ 4 ]. Consequently, there is a pressing need to enhance the prevention, surveillance, and early management of cardiovascular diseases in patients who are at high risk of cardiac dysfunction related to cancer therapeutics throughout their treatment journey [ 5 ]. While cardiac biomarkers such as cardiac troponin-T (c-TnT) and N-terminal prohormone of brain natriuretic peptide (NT-proBNP) have been somewhat effective in guiding the initiation and monitoring of heart-protective therapy in cancer patients, there is a high demand for more sensitive and specific markers [ 6 ].

Growth Differentiation Factor 15 (GDF-15), a member of the transforming growth factor-beta (TGF-β) superfamily, is also known as macrophage inhibitory cytokine-1 (MIC-1) due to its role in inhibiting macrophage secretion of pro-inflammatory factors [ 7 ]. GDF15 is associated with a wide range of biological functions in both physiological and pathological processes, as evidenced by its alternative names [ 8 ]. Under normal conditions, GDF15 expression remains low in various tissues and serum but markedly increases in response to inflammation, tissue damage, and various disease states, including malignant tumors, CVD, diabetes, and obesity, thus acting as a stress response molecule [ 9 , 10 , 11 , 12 , 13 ]. As a diagnostic and prognostic marker for tumors, the expression of GDF15 correlates with the degree of cachexia [ 14 ]. Similarly, as a cardiovascular disease marker, it is closely related to heart failure and myocardial infarction severity [ 15 , 16 ]. However, the potential of GDF15 as a predictive biomarker for CTRCD (cancer therapy related cardiac dysfunction) remains unclear, and its efficacy in assessing and monitoring cardiovascular toxicity during cancer treatment necessitates further experimental validation [ 17 , 18 ].

In the current study, we identified cardiac biomarkers (including GDF15) in cancer patients using TCGA and GEO databases, and confirmed the efficiency of GDF15 in cancer identification by using serum samples, revealing the potential of GDF15 in the monitoring and predicting risk of CTRCD.

Materials and methods

Overall method framework

Recent cancer statistics indicate that approximately a quarter of all estimated cancer deaths can be attributed to digestive system tumors (DSTs) [ 19 ]. To identify cardiac biomarkers in cancer patients, we analyzed differentially expressed genes (DEGs) common to various DSTs using the TCGA database, supplemented by GEO database data related to cardiomyopathy (Figure S1 ). Among the five extracellular differential molecules identified across the two disease model datasets, GDF15 emerged as the most significant. Subsequent analysis revealed widespread expression of GDF15 across various normal and tumor tissues. Serum samples from 30 healthy donors and 507 cancer patients indicated that GDF15 is highly expressed in nearly all tumors, signifying significant diagnostic efficacy. Moreover, GDF15 serum levels showed a significant correlation with the cardiac markers NT-proBNP and c-TnT, particularly in cases involving acute heart failure and myocardial injury (Figure S2 ). An evaluation of electrocardiogram (ECG) results for cancer patients with varying GDF15 expression levels revealed a higher incidence of arrhythmic (e.g., sinus bradycardia, sinus tachycardia) and ischemic (e.g., ST changes, T-wave alterations) conditions among patients with elevated GDF15 levels, whereas patients with lower expression levels frequently exhibited normal ECG results.

Patients and healthy donors

This study enrolled a cohort comprising 30 healthy donors and 507 cancer patients treated at the Shandong Cancer Hospital and Institute from January 2022 to June 2023. The healthy donors consisted of individuals undergoing routine physical examinations, all of whom underwent serum GDF15 testing. Of these, five cases were also assessed for both c-TnT and NT-proBNP. The cancer patient group encompassed various malignancies, with 450 patients completing the full spectrum of tests for GDF15, cTnT, and NT-proBNP, and 57 liver cancer patients undergoing testing for GDF15 expression only. Comprehensive clinical data for the 450 patients across various tumor types (Table S1 ) were extracted from electronic medical records and the Ruimei Laboratory Information System version 6.0 (rmlis, Huangpu District, Shanghai, China), as summarized in Table  1 . Informed consent was obtained from all participants prior to the study, with ethical approval granted by the Ethics Committee of the Shandong Cancer Hospital and Institute, aligning with the Declaration of Helsinki.

Data collection

Data for this research were sourced from electronic medical records and the Ruimei Laboratory Information System. The c-TnT and NT-proBNP levels in cancer patients were determined by electrochemiluminescence on the Cobas e801 analyzer (Roche Diagnostics, GmbH, Mannheim, Germany). The normal range for NT-proBNP was considered to be 0-125 pg/ml, with < 125 pg/ml excluding chronic heart failure and < 300 pg/ml excluding acute heart failure. For the diagnosis of acute heart failure using NT-proBNP levels, the criteria vary by age: for individuals younger than 50 years, a level greater than 450 pg/ml is indicative; for those aged between 50 and 75 years, the threshold is above 900 pg/ml; and for those over 75 years, a level exceeding 1800 pg/ml is suggestive of acute heart failure. Additionally, for patients with a glomerular filtration rate (GFR) below 60 ml/min, a NT-proBNP level greater than 1200 pg/ml is indicative. Regarding cardiac troponin-T (c-TnT), normal levels range from 0 to 14 pg/ml. Levels between 15 and 52 pg/ml suggest myocardial injury, while levels above 52 pg/ml are indicative of acute myocardial injury. These test ranges and criteria are diagnostic thresholds based on many previous studies and on Roche's long-term accumulation of hospital testing data [ 20 , 21 ].
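The age- and GFR-dependent cut-offs above translate directly into a rule set. A minimal sketch follows; the function names are illustrative, and giving the low-GFR rule precedence over the age rules is an assumption not stated in the text:

```python
def ntprobnp_acute_hf_cutoff(age_years, gfr_ml_min):
    """NT-proBNP level (pg/ml) above which acute heart failure is suggested,
    per the criteria described above. Low GFR is checked first (assumed
    precedence when both conditions apply)."""
    if gfr_ml_min < 60:
        return 1200.0
    if age_years < 50:
        return 450.0
    if age_years <= 75:
        return 900.0
    return 1800.0

def classify_ctnt(ctnt_pg_ml):
    """Bucket cardiac troponin-T (pg/ml) into the ranges used in the text."""
    if ctnt_pg_ml <= 14:
        return "normal"
    if ctnt_pg_ml <= 52:
        return "myocardial injury"
    return "acute myocardial injury"
```

For example, `ntprobnp_acute_hf_cutoff(60, 90)` returns the 900 pg/ml threshold for a 60-year-old with normal renal function.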

Enzyme‑linked immunosorbent assay (ELISA)

Plasma samples were collected, centrifuged again at 3500 ×g for 10 min to eliminate hemocytes, including red blood cells, white blood cells, and platelets, and the supernatant was retained. The plasma levels of GDF15 from cancer patients and healthy donors were quantified using Human GDF-15 ELISA Kits (ab155432, Abcam, Cambridge, UK). Samples underwent a 10-fold dilution prior to testing, with subsequent procedures conducted as per the kit’s protocol.

Database and processing

Analysis of differentially expressed genes (DEGs) in colon adenocarcinoma (COAD), esophageal carcinoma (ESCA), liver hepatocellular carcinoma (LIHC), stomach adenocarcinoma (STAD), lung adenocarcinoma (LUAD) tissues, and adjacent tissues was conducted using the GEPIA2 online database ( http://gepia2.cancer-pku.cn/#dataset ). Detailed clinical data of cancer patients were obtained from The Cancer Genome Atlas (TCGA) database.

Additionally, public gene expression profiles (GSE116250) covering 14 non-failing donors (NF), 37 dilated cardiomyopathy (DCM), and 13 ischemic cardiomyopathy (ICM) were examined. DEGs identification and visualization between NF, DCM and ICM were executed through volcano plots and heatmaps. Extracellular gene analysis for protein subcellular localization utilized Hum-mPLoc 3.0.

ECG analysis

ECG data, collected on the day of or within one day before or after blood sampling, were recorded with 12-lead ECG devices and interpreted by a minimum of two cardiologists. The analysis included normal ECG readings and identification of arrhythmic (e.g., sinus bradycardia, sinus tachycardia, incomplete right bundle branch block, intraventricular block), ischemic (e.g., ST changes, T-wave alterations), and non-specific (e.g., low-voltage QRS, QT interval variations) findings.

Statistical analysis

Statistical analysis was performed using GraphPad Prism version 9.0 (GraphPad Software, CA, USA). Data were analyzed using the unpaired t-test for two groups with normal distribution, the Mann-Whitney test for non-normally distributed data, the Kruskal-Wallis test for multiple groups with non-normally distributed data, and one-way ANOVA for normally distributed datasets. Results are presented as mean ± standard deviation (SD), with a p-value of < 0.05 considered statistically significant.
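The test-selection logic described above can be sketched for the two-group case. The Shapiro-Wilk normality check is an assumed choice (the paper does not name its normality test), and the serum values below are synthetic:

```python
import numpy as np
from scipy import stats

def compare_two_groups(a, b, alpha=0.05):
    """Unpaired t-test when both groups look normal, Mann-Whitney otherwise."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    normal = (stats.shapiro(a).pvalue > alpha) and (stats.shapiro(b).pvalue > alpha)
    if normal:
        return "unpaired t-test", stats.ttest_ind(a, b).pvalue
    return "Mann-Whitney", stats.mannwhitneyu(a, b).pvalue

# Synthetic serum levels mirroring the reported scale (pg/mL)
rng = np.random.default_rng(0)
healthy = rng.normal(760, 60, size=30)
cancer = rng.normal(2600, 650, size=100)
name, p = compare_two_groups(healthy, cancer)
```

For more than two groups, the analogous branch would be one-way ANOVA (`stats.f_oneway`) versus Kruskal-Wallis (`stats.kruskal`).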

Screening of DEGs in TCGA database

Given the high incidence and mortality associated with digestive system tumors (DSTs), we sourced data for colon adenocarcinoma (COAD), esophageal carcinoma (ESCA), liver hepatocellular carcinoma (LIHC), and stomach adenocarcinoma (STAD) from the TCGA database, comprising a total of 1234 cancerous and 1006 non-cancerous tissue samples. The data included colon adenocarcinoma (COAD; 275 cancer vs. 349 control tissues), esophageal carcinoma (ESCA; 182 cancer vs. 286 control tissues), liver hepatocellular carcinoma (LIHC; 369 cancer vs. 160 control tissues), and stomach adenocarcinoma (STAD; 408 cancer vs. 211 control tissues), as detailed in Table S2 . DEGs were determined based on a fold change > 2.0 and a P-value < 0.05, and their distribution was illustrated in volcano plots (Fig.  1 A-D, Tables S3 - 6 ). An intersection of DEGs across these DSTs revealed 381 common genes (Fig.  1 E, Tables S7 - 10 ), with LIHC-specific DEGs showcased in a heatmap (Fig.  1 F). Given the need for biomarkers detectable in serum, we conducted a sub-localization analysis using Hum-mPLoc 3.0, identifying 48 extracellularly localized DEGs (Tables S11 - 14 ), among which GDF15 was highlighted as a significant finding (Fig.  1 G).
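The DEG screen above (fold change > 2.0, P < 0.05, then intersection across the four DSTs) reduces to set operations. The gene names and numbers below are toy values, not TCGA results, and treating a fold change below 0.5 as down-regulated is an assumption:

```python
def select_degs(table, fc=2.0, p=0.05):
    """table: gene -> (fold_change, p_value). Keep up-regulated (FC > 2)
    and down-regulated (FC < 1/2, an assumed convention) genes with
    significant p-values."""
    return {g for g, (f, pv) in table.items()
            if (f > fc or f < 1.0 / fc) and pv < p}

# Toy per-cancer DEG tables (hypothetical values)
coad = {"GDF15": (4.1, 1e-6), "ACTB": (1.1, 0.40), "MUC2": (0.3, 1e-3)}
esca = {"GDF15": (3.2, 1e-4), "MUC2": (0.8, 0.20)}
lihc = {"GDF15": (5.0, 1e-8), "AFP": (9.0, 1e-9)}
stad = {"GDF15": (2.8, 1e-3), "MUC2": (0.2, 1e-2)}

# Genes passing the screen in every DST, i.e. the Venn-diagram core
shared = select_degs(coad) & select_degs(esca) & select_degs(lihc) & select_degs(stad)
```

Here only GDF15 survives all four screens, the set-algebra analogue of the Venn intersection in Fig. 1E.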

figure 1

DEGs identification in the TCGA database. Volcano plots comparing the expression fold-change of DEGs in COAD tissues (A) , ESCA tissues (B) , LIHC tissues (C) , STAD tissues (D) compared with adjacent normal tissues. (E) Venn plot showing the shared genes among the DEGs in 4 kinds of DST tissues vs. healthy control tissues, and displayed in a heatmap ( F , up-regulated marked in red or down-regulated marked in blue). (G) Shared DEGs that extracellular localized were screened and presented by heatmap (LIHC vs. normal tissues), and GDF15 was among them

Identification of DEGs clusters attributed to cardiomyopathy and DSTs patients

Integrating cardiomyopathy data from GSE116250, including 14 non-failing donors (NF), 37 dilated cardiomyopathy (DCM), and 13 ischemic cardiomyopathy (ICM) cases, we screened for DEGs with fold changes > 2.0 and P  < 0.05, as depicted in volcano plots (Fig.  2 A-B, Tables S15 - 16 ). This analysis identified 629 DEGs common between the two cardiomyopathy types (Fig.  2 C). A further screen of serum biomarkers using Hum-mPLoc 3.0 revealed 105 extracellular DEGs, as displayed in heatmaps (Fig.  2 D-E, Tables S17 - 18 ). A Venn diagram pinpointed 19 DEGs shared between the 4 types of DST and the 2 types of cardiomyopathy patients, with 5 extracellular molecules identified for potential clinical blood detection, most notably GDF15 (Fig.  3 B-C, Tables S19 - 22 ). The differential expression of GDF15 in lung adenocarcinoma (LUAD)/adjacent tissues was not as pronounced as in DSTs (Fig.  3 D).

figure 2

Screening of DEG clusters attributed to cardiomyopathy. Volcano plots comparing the expression fold-change of DEGs in dilated cardiomyopathy (DCM) vs. non-failing donors (NF) (A) , and ischemic cardiomyopathy (ICM) vs. NF (B) . (C) Venn plot showing the shared genes among the DEGs in DCM and ICM vs. NF, and the shared extracellular localized DEGs, presented in heatmaps ( D , E ); these include GDF15

figure 3

Identification of DEG clusters attributed to cardiomyopathy and DST patients. Venn plot showing DEGs shared among DST patients (A) , from which five differential extracellular molecules were identified (B) and are depicted in a heatmap (C) . Additionally, panel (D) displays GDF15 expression levels in various tumor tissues compared to control tissues, as recorded in the TCGA database. Statistical data are expressed as mean ± standard deviation (SD), with significance determined by an unpaired two-tailed t-test: * indicates p  < 0.05

GDF15 as a diagnostic biomarker for cancer patients

In pursuit of clinical insights on GDF15, we utilized integrated databases such as the Protein Atlas ( https://www.proteinatlas.org/ENSG00000130513-GDF15 ) to examine its expression across 44 diverse tissues. This analysis, underpinned by knowledge-based annotations, utilized color-coding to demarcate tissue groups sharing functional similarities. Notably, GDF15 exhibited significant spatial expression specificity, prominently within most digestive system tissues, albeit with notable exceptions in the liver and esophagus. Immunohistochemical assays revealed distinct cytoplasmic staining patterns in colorectal and prostate cancers, among others, demonstrating the presence of GDF15. In contrast, lung cancer and certain other tumors displayed minimal to no GDF15 staining, a finding that aligns with previous analyses from the TCGA database (Fig.  4 A-B). We utilized the proximity extension assay (PEA) to measure plasma concentrations of GDF15 across various cancer types. Notably, the serum levels of GDF15 in different cancer types did not always align with the patterns observed in our tissue staining (Fig.  4 C). Serum samples from 30 healthy donors and 507 cancer patients representing a diverse array of cancers were analyzed to ascertain GDF15 concentrations (Table  1 ). In the healthy cohort ( N  = 30), the mean GDF15 concentration was 760.5 ± 60.30 pg/mL. In cancer patients ( N  = 507), however, GDF15 levels were markedly elevated, with mean concentrations ranging from 1710 ± 656.2 to 4162 ± 214.8 pg/mL (Fig.  4 D). Earlier observations had indicated that normal liver tissues exhibited minimal or no GDF15 staining, whereas liver cancer tissues and serum samples displayed significantly higher GDF15 levels, the highest among all examined cancer types. Similarly, GDF15 staining was either weak or absent in both healthy and cancerous lung tissues, yet serum levels of GDF15 were considerably increased in lung cancer patients.

figure 4

Expression profile of GDF15 in human normal / tumor tissues and blood. The expression of GDF15 provided by available analysis platforms in normal human tissues (A) against tumor tissues (B) . Panel (C) shows the serum GDF15 levels in patients with various tumors, as reported in an online database, while panel (D) illustrates the concentrations of GDF15 that we determined in the serum of cancer patients in comparison to healthy controls. The data are expressed as mean ± standard deviation (SD), with statistical analysis conducted via an unpaired two-tailed t-test, where * signifies p  < 0.05, *** signifies p  < 0.001, and **** signifies p  < 0.0001

To evaluate the diagnostic utility of GDF15 for cancer, we compared serum GDF15 levels between representative cancer patient groups and healthy controls, employing receiver operating characteristic (ROC) curves. GDF15 demonstrated significant discriminative ability for pan-cancer detection, achieving an area under the curve (AUC) of 0.91 (95% confidence interval [CI] 0.8719–0.9396), with a sensitivity of 84.02% and a specificity of 86.67% compared to healthy controls (Fig.  5 A). The diagnostic performance of GDF15 varied across the various cancers, exhibiting particularly strong diagnostic efficacy in liver cancer, with an AUC of 0.99 (95% confidence interval [CI] 0.9731–1.001), 92.86% sensitivity, and 96.67% specificity (Fig.  5 B-F). However, in breast cancer, the diagnostic value of GDF15 was notably lower, evidenced by an AUC of 0.51 ( p  = 0.9578), with 30.77% sensitivity and 96.67% specificity (Fig.  5 E), possibly due to the limited number of breast cancer cases ( N  = 13) included in the study.
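AUC, sensitivity, and specificity of the kind reported above can be computed from raw marker values. A dependency-light sketch using the Mann-Whitney identity for AUC follows; the GDF15 values and the 1200 pg/mL cutoff are hypothetical illustrations, not study data:

```python
import numpy as np

def roc_auc(pos, neg):
    """AUC as the probability that a random case outscores a random control
    (Mann-Whitney identity); ties count half."""
    pos = np.asarray(pos, dtype=float)[:, None]
    neg = np.asarray(neg, dtype=float)[None, :]
    wins = (pos > neg).sum() + 0.5 * (pos == neg).sum()
    return wins / (pos.size * neg.size)

def sens_spec(cutoff, pos, neg):
    """Sensitivity/specificity when values >= cutoff are called positive."""
    sens = float(np.mean(np.asarray(pos) >= cutoff))
    spec = float(np.mean(np.asarray(neg) < cutoff))
    return sens, spec

# Hypothetical serum GDF15 concentrations (pg/mL)
cancer = [2400, 3100, 1900, 4100, 2800, 1500, 3600, 900]
healthy = [700, 820, 760, 650, 880, 1000, 740, 810]
auc = roc_auc(cancer, healthy)
sens, spec = sens_spec(1200, cancer, healthy)
```

Sweeping the cutoff over all observed values traces out the full ROC curve; the reported operating points correspond to the cutoff maximizing some criterion such as Youden's index.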

figure 5

Diagnostic utility of GDF15 in various cancers. (A) ROC curves revealed the AUC of pan-cancer to be 0.9057, P  < 0.0001. ROC curves for GDF15 in the diagnosis of lung cancer (B) , liver cancer (C) , esophageal cancer (D) , breast cancer (E) , and lymphoma (F)

Serum GDF15 levels in cancer patients correlated with cardiac indicators

Cancer therapy related cardiovascular toxicity includes myocardial injury and heart failure, immune myocarditis, hypertension, arrhythmia, coronary heart disease, venous thromboembolism, and dyslipidemia. The correlation of these markers with cardiac dysfunction, as detailed in the Materials and Methods section, varies across the spectrum of potential values. Because most c-TnT and NT-proBNP values fell within or near the normal range, with outliers presenting scattered results, it was challenging to deduce a straightforward relationship between GDF15 levels and these cardiac markers using a simple correlation analysis. Thus, we observed GDF15 expression across various degrees of cardiac disease in a pan-cancer context. We discovered that as c-TnT and NT-proBNP levels rose, indicating worsening cardiac disease, GDF15 expression also significantly increased (Fig.  6 A-B). The mean concentration of GDF15 in individuals with NT-proBNP within the normal range (< 125 pg/ml, N  = 287) was approximately 2507 ± 110.4 pg/mL. This included individuals with NT-proBNP levels below 10 pg/ml ( N  = 35), where GDF15 levels averaged about 1523 ± 184.4 pg/mL. When NT-proBNP levels suggested chronic heart failure (> 125 and < 300 pg/ml, N  = 88), GDF15 concentrations averaged around 3044 ± 226.0 pg/mL, P  = 0.0230. For NT-proBNP levels indicating potential acute heart failure (> 450 pg/ml, N  = 55), GDF15 levels rose to an average of 4850 ± 306.7 pg/mL, P  < 0.0001. Notably, in cases with NT-proBNP exceeding 1500 pg/ml ( N  = 17), GDF15 levels reached a particularly high average of 5933 ± 395.4 pg/mL, P  < 0.0001 (Fig.  6 A).

figure 6

Association Between Serum GDF15 Levels and Cardiac Biomarkers in Cancer Patients. This figure displays the correlation of serum GDF15 levels with cardiac biomarkers across all cancer types, with panels (A) and (B) illustrating GDF15 levels across varying NT-proBNP and c-TnT ranges, respectively. Panels (C) and (D) detail GDF15 levels in lung cancer patients across different NT-proBNP ranges, while panels (E) and (F) depict these levels in liver cancer patients across varying c-TnT ranges. Data represent mean ± standard error of the mean (SEM), with normal ranges for c-TnT and NT-proBNP serving as controls. N: represented the number of cases. Statistical significance was assessed using an unpaired two-tailed t-test, where * indicates p  < 0.05, ** indicates p  < 0.01, and **** indicates P  < 0.0001

When evaluating c-TnT levels, individuals within the normal range (< 14 pg/ml, N  = 366) had a mean GDF15 concentration of 2615 ± 101.1 pg/mL. This included a subset ( N  = 13) with c-TnT levels below 3 pg/ml, where GDF15 averaged approximately 1283 ± 277.5 pg/mL. For patients with c-TnT levels indicative of myocardial injury (15–52 pg/ml, N  = 82), GDF15 levels averaged around 4303 ± 242.2 pg/mL, P  < 0.0001. In cases of acute myocardial injury (c-TnT > 52 pg/ml, N  = 4), GDF15 concentrations escalated to an average of 5789 ± 1151 pg/mL, P  = 0.0012 (Fig.  6 B).
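The banded comparisons above (mean GDF15 within each c-TnT range) amount to a group-by; a stdlib-only sketch with made-up patient pairs:

```python
import statistics

def ctnt_band(v):
    """c-TnT bands (pg/ml) as defined in the Methods."""
    if v <= 14:
        return "normal"
    if v <= 52:
        return "myocardial injury"
    return "acute myocardial injury"

# Hypothetical (c-TnT, GDF15) pairs in pg/ml -- not patient data
records = [(5, 1400), (9, 2100), (12, 2600), (20, 3900),
           (35, 4600), (48, 4200), (60, 5600), (75, 6100)]

groups = {}
for ctnt, gdf15 in records:
    groups.setdefault(ctnt_band(ctnt), []).append(gdf15)

mean_gdf15 = {band: statistics.mean(vals) for band, vals in groups.items()}
```

The per-band means would then be compared pairwise against the normal-range band, as done in Fig. 6.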

Further analysis was conducted on lung and liver cancer patients, representing the largest subsets in this study, to more closely explore the correlation between GDF15 expression and cardiac biomarkers. Given that most c-TnT or NT-proBNP values were within or near normal ranges, with a limited number of cases showing abnormal expression, the correlation findings for these specific cancer types were not as pronounced as those observed in the broader cancer cohort. GDF15 levels were only significantly correlated with acute heart failure (NT-proBNP > 450 pg/ml) in lung cancer patients (Fig.  6 C-D). In instances of myocardial injury (c-TnT > 14 pg/ml), GDF15 expression markedly increased in both lung and liver cancer (Fig.  6 E-F).

The levels of serum GDF15 associated with ECG changes

Electrocardiography (ECG) has historically been used for screening and monitoring of cardiovascular toxicity caused by cancer treatments [ 22 ]; however, it could not be completed for all patients at the time of blood collection due to various circumstances. For instance, the patient with the highest serum level of GDF15 (7493.66 pg/ml) died the day following blood collection, leaving no ECG data. Consequently, we selected a subset of 26 patients, characterized by either relatively high or low serum GDF15 levels, for analysis and comparison of ECG results, as summarized in Table  2 . The ECG findings of six representative cases are illustrated in Fig.  7 . This analysis revealed that patients with lower serum levels of GDF15 generally had low levels of c-TnT or NT-proBNP, correlating with predominantly normal ECG outcomes (Fig.  7 A-B). Conversely, when serum GDF15 exceeded 600 pg/ml, ECGs exhibited sinus rhythm and ST changes, despite c-TnT and NT-proBNP values remaining within normal limits (Table  2 ). Moreover, patients presenting with higher GDF15 expression were found to have elevated levels of c-TnT or NT-proBNP, alongside arrhythmic alterations (such as sinus bradycardia and tachycardia) and significant ischemic changes (including ST changes and T-wave alterations), suggesting a notable correlation among these biomarkers (Fig.  7 C-D; Table  2 ). One particular case involved a patient with a high GDF15 expression (6637.38 pg/ml) whose serum levels of c-TnT (6.84 pg/ml) and NT-proBNP (16.57 pg/ml) were within normal ranges, yet the ECG revealed ST depression in leads III and avF, along with T-wave inversion among other abnormal alterations (Fig.  7 F; Table  2 ).
Another significant observation was made in a patient with an exceptionally high serum GDF15 level (7493.66 pg/ml); while the c-TnT (23.03 pg/ml) and NT-proBNP (277.70 pg/ml) levels were not markedly elevated, the ECG indicated serious myocardial ischemia, as evidenced by ST segment elevation in leads II, III, avF, V5-V6 (Fig.  7 E).

figure 7

ECG Outcomes Related to Serum GDF15 Concentrations in Cancer Patients. This figure shows ECG findings in cancer patients, categorized by serum GDF15 levels. Panels (A) and (B) demonstrate patients with low GDF15 levels and corresponding low c-TnT and NT-proBNP levels, resulting in normal ECG outcomes. In contrast, panels ( C - F ) present cases with elevated GDF15 levels alongside more complex c-TnT and NT-proBNP readings, showing a heightened incidence of ECG abnormalities. All ECG recordings were conducted at a standard speed of 25 mm/s

Cardiovascular disease and cancer are the two leading causes of death worldwide [ 23 ]. Accumulating clinical evidence demonstrates the increased risk of developing cardiac disease during cancer treatment. Cancer incidence and mortality were significantly increased in patients with heart failure [ 24 , 25 ]. Addressing the cardiotoxic effects associated with anti-cancer therapies represents a formidable challenge currently confronting cardiologists and oncologists [ 26 ]. This underscores the need for a reliable serum biomarker for monitoring cardiovascular toxicity during cancer treatment. Research indicates that elevated GDF15 levels are linked to a spectrum of cardiovascular conditions, including myocardial hypertrophy, heart failure, atherosclerosis, and endothelial dysfunction. Moreover, GDF15 has been shown to precipitate cachexia and provide protection against obesity and insulin resistance in murine models [ 9 , 27 ]. Notably, cardiovascular diseases, heart failure, and organ failure emerge as the prevalent clinical manifestations of cachexia induced by malignant tumors [ 28 ].

In this study, we elucidated the role of DEGs across various DSTs utilizing the TCGA database, highlighting 48 secreted proteins as potential serum biomarkers. Subsequent analyses, leveraging the GEO database, allowed us to delve into DEGs pertinent to cardiomyopathy. This dual-disease model approach culminated in the identification of five extracellular molecules, with GDF15 emerging as a notably significant biomarker for clinical detection.

Despite the variability observed in GDF15 immunohistochemical staining across normal and cancerous tissues, serum levels of GDF15 in patients with tumors significantly exceeded those in healthy controls. This disparity might be attributed to the limited sample size of healthy individuals ( n  = 30) and the complexity surrounding the treatment histories of cancer patients, potentially influencing GDF15 expression. Nonetheless, GDF15 demonstrated robust diagnostic utility across a spectrum of cancers, particularly standing out in DSTs where its levels were markedly elevated, aligning with findings from prior research [ 29 ].

Our results suggested an aberrant expression of GDF15 in both cancerous conditions and cardiomyopathy, posing the question of its utility as a marker for cardiac dysfunction induced by cancer treatments. While existing studies suggest GDF15’s potential as a predictive marker for CTRCD [ 18 , 30 ], our analysis extends this narrative by showing a correlation between increased GDF15 levels and elevated cardiac markers across a pan-cancer cohort. Notably, patients exhibiting the highest GDF15 serum levels also displayed elevated cardiac dysfunction markers but succumbed shortly after testing, underscoring the marker’s prognostic significance. Specifically, the patient with the highest serum GDF15 concentration (7937.295 pg/ml) presented with significantly high levels of c-TnT (342.40 pg/ml) and NT-proBNP (4348.00 pg/ml), yet died merely two days after testing. Similarly, another patient, registering extremely high GDF15 levels (7804.815 pg/ml) alongside a history of premature cardiac beats and serum levels of c-TnT (15.24 pg/ml) and NT-proBNP (1265.00 pg/ml), succumbed within a month of discharge. A review of the medical records showed that these two patients with extremely high levels of GDF15 were terminal-stage (stage IV) patients with multiple organ failure caused by bone metastasis and cachexia, which may be an important cause of their death. In contrast, when analyzing lung and liver cancer patients independently, the link between GDF15 levels and cardiac biomarkers appeared less pronounced. This reduced correlation suggests that the majority of these cancer patients may not exhibit pronounced cardiomyopathy post-treatment, with most cardiac indicator values falling within the normal range, except for a few samples showing aberrant expression, thereby influencing the overall statistical outcomes.

ECG has historically been critical for the diagnosis and management of cardiac injury, cardiomyopathy, and cardiovascular toxicity [ 22 , 31 , 32 ]. In this study, we examined the ECG patterns of patients exhibiting varying levels of GDF15 expression. In general, elevated GDF15 levels were associated with increased c-TnT and NT-proBNP levels, alongside more pronounced arrhythmic alterations (such as sinus bradycardia) and ischemic changes (including ST-segment and T-wave variations). In some instances, however, GDF15 levels were markedly high while c-TnT and NT-proBNP remained within normal limits or were not significantly elevated, yet the ECG demonstrated substantial changes.

It is important to acknowledge certain limitations. Not every patient had an ECG recorded close to the time of blood collection, so some of the data lack a temporally proximal correlation. Echocardiography is the standard monitoring method for CTRCD according to the ESC guidelines [ 33 ], and artificial-intelligence electrocardiography has served as a screening tool to detect a newly abnormal left ventricular ejection fraction (LVEF) after anthracycline-based cancer therapy [ 34 ]; the lack of LVEF evaluation is therefore a limitation of this study. Furthermore, GDF15 has been shown in many studies to be a potent pro-inflammatory factor, and both cardiac disease and cancer are linked to inflammation; the observed associations between GDF15 and cardiac biomarkers may partly reflect this shared inflammatory background, so whether GDF15 can be applied accurately and specifically in this setting remains open to discussion. In addition, cancer subtypes and comorbidities may affect the expression of GDF15.

In conclusion, this hypothesis-generating study reinforces the association between GDF15 expression and both cancer and cardiac damage after chemotherapy, underscoring its diagnostic efficacy in cancer patients and its potential for monitoring cardiovascular toxicity. While GDF15 levels generally correlate with traditional cardiac markers, the study reveals instances of discordance, suggesting a complementary role for GDF15 in the complex landscape of cancer treatment-related cardiac care.

Data availability

No datasets were generated or analysed during the current study.

References

1. Bertero E, Canepa M, Maack C, Ameri P. Linking heart failure to cancer: background evidence and research perspectives. Circulation. 2018;138:735–42.

2. de Wit S, Glen C, de Boer RA, Lang NN. Mechanisms shared between cancer, heart failure, and targeted anti-cancer therapies. Cardiovasc Res. 2023;118:3451–66.

3. Otto CM. Heartbeat: heart failure induced by cancer therapy. Heart. 2019;105:1–3.

4. Zamorano JL, Lancellotti P, Rodriguez MD, Aboyans V, Asteggiano R, Galderisi M, et al. 2016 ESC position paper on cancer treatments and cardiovascular toxicity developed under the auspices of the ESC Committee for Practice Guidelines: the Task Force for cancer treatments and cardiovascular toxicity of the European Society of Cardiology (ESC). Eur J Heart Fail. 2017;19:9–42.

5. Finet JE. Management of heart failure in cancer patients and cancer survivors. Heart Fail Clin. 2017;13:253–88.

6. Pudil R, Mueller C, Celutkiene J, Henriksen PA, Lenihan D, Dent S, et al. Role of serum biomarkers in cancer patients receiving cardiotoxic cancer therapies: a position statement from the Cardio-Oncology Study Group of the Heart Failure Association and the Cardio-Oncology Council of the European Society of Cardiology. Eur J Heart Fail. 2020;22:1966–83.

7. Bootcov MR, Bauskin AR, Valenzuela SM, Moore AG, Bansal M, He XY, et al. MIC-1, a novel macrophage inhibitory cytokine, is a divergent member of the TGF-beta superfamily. Proc Natl Acad Sci U S A. 1997;94:11514–9.

8. Assadi A, Zahabi A, Hart RA. GDF15, an update of the physiological and pathological roles it plays: a review. Pflugers Arch. 2020;472:1535–46.

9. Wang D, Day EA, Townsend LK, Djordjevic D, Jorgensen SB, Steinberg GR. GDF15: emerging biology and therapeutic applications for obesity and cardiometabolic disease. Nat Rev Endocrinol. 2021;17:592–607.

10. Asrih M, Wei S, Nguyen TT, Yi HS, Ryu D, Gariani K. Overview of growth differentiation factor 15 in metabolic syndrome. J Cell Mol Med. 2023;27:1157–67.

11. Breit SN, Brown DA, Tsai V. GDF15 analogs as obesity therapeutics. Cell Metab. 2023;35:227–8.

12. Sjoberg KA, Sigvardsen CM, Alvarado-Diaz A, Andersen NR, Larance M, Seeley RJ, et al. GDF15 increases insulin action in the liver and adipose tissue via a beta-adrenergic receptor-mediated mechanism. Cell Metab. 2023;35:1327–40.e5.

13. Wang D, Townsend LK, DesOrmeaux GJ, Frangos SM, Batchuluun B, Dumont L, et al. GDF15 promotes weight loss by enhancing energy expenditure in muscle. Nature. 2023;619:143–50.

14. Ling T, Zhang J, Ding F, Ma L. Role of growth differentiation factor 15 in cancer cachexia (review). Oncol Lett. 2023;26:462.

15. Wollert KC, Kempf T, Wallentin L. Growth differentiation factor 15 as a biomarker in cardiovascular disease. Clin Chem. 2017;63:140–51.

16. Sawalha K, Norgard NB, Drees BM, Lopez-Candales A. Growth differentiation factor 15 (GDF-15), a new biomarker in heart failure management. Curr Heart Fail Rep. 2023;20:287–99.

17. Yu LR, Cao Z, Makhoul I, Daniels JR, Klimberg S, Wei JY, et al. Immune response proteins as predictive biomarkers of doxorubicin-induced cardiotoxicity in breast cancer patients. Exp Biol Med (Maywood). 2018;243:248–55.

18. Ananthan K, Lyon AR. The role of biomarkers in cardio-oncology. J Cardiovasc Transl Res. 2020;13:431–50.

19. Siegel RL, Miller KD, Wagle NS, Jemal A. Cancer statistics, 2023. CA Cancer J Clin. 2023;73:17–48.

20. Januzzi JJ, Chen-Tournoux AA, Moe G. Amino-terminal pro-B-type natriuretic peptide testing for the diagnosis or exclusion of heart failure in patients with acute symptoms. Am J Cardiol. 2008;101:29–38.

21. Hochholzer W, Morrow DA, Giugliano RP. Novel biomarkers in cardiovascular disease: update 2010. Am Heart J. 2010;160:583–94.

22. Zamorano JL, Lancellotti P, Rodriguez MD, Aboyans V, Asteggiano R, Galderisi M, et al. 2016 ESC position paper on cancer treatments and cardiovascular toxicity developed under the auspices of the ESC Committee for Practice Guidelines: the Task Force for cancer treatments and cardiovascular toxicity of the European Society of Cardiology (ESC). Eur Heart J. 2016;37:2768–801.

23. GBD 2017 Causes of Death Collaborators. Global, regional, and national age-sex-specific mortality for 282 causes of death in 195 countries and territories, 1980–2017: a systematic analysis for the Global Burden of Disease Study 2017. Lancet. 2018;392:1736–88.

24. Camilli M, Chiabrando JG, Lombardi M, Del BM, Montone RA, Lombardo A, et al. Cancer incidence and mortality in patients diagnosed with heart failure: results from an updated systematic review and meta-analysis. Cardiooncology. 2023;9:8.

25. Sayour NV, Paal AM, Ameri P, Meijers WC, Minotti G, Andreadou I, et al. Heart failure pharmacotherapy and cancer: pathways and pre-clinical/clinical evidence. Eur Heart J. 2024;45:1224–40.

26. Fogarassy G, Vathy-Fogarassy A, Kenessey I, Veress G, Polgar C, Forster T. Preventing cancer therapy-related heart failure: the need for novel studies. J Cardiovasc Med (Hagerstown). 2021;22:459–68.

27. Hsu JY, Crawley S, Chen M, Ayupova DA, Lindhout DA, Higbee J, et al. Non-homeostatic body weight regulation through a brainstem-restricted receptor for GDF15. Nature. 2017;550:255–9.

28. Laird B, Jatoi A. Cancer cachexia: learn from yesterday, live for today and hope for tomorrow. Curr Opin Support Palliat Care. 2023;17:161.

29. Wang Y, Jiang T, Jiang M, Gu S. Appraising growth differentiation factor 15 as a promising biomarker in digestive system tumors: a meta-analysis. BMC Cancer. 2019;19:177.

30. Cartas-Espinel I, Telechea-Fernandez M, Manterola DC, Avila BA, Saavedra CN, Riffo-Campos AL. Novel molecular biomarkers of cancer therapy-induced cardiotoxicity in adult population: a scoping review. ESC Heart Fail. 2022;9:1651–65.

31. Valentini F, Anselmi F, Metra M, Cavigli L, Giacomin E, Focardi M, et al. Diagnostic and prognostic value of low QRS voltages in cardiomyopathies: old but gold. Eur J Prev Cardiol. 2022;29:1177–87.

32. Chaudhari GR, Mayfield JJ, Barrios JP, Abreau S, Avram R, Olgin JE, et al. Deep learning augmented ECG analysis to identify biomarker-defined myocardial injury. Sci Rep. 2023;13:3364.

33. Lyon AR, Lopez-Fernandez T, Couch LS, Asteggiano R, Aznar MC, Bergler-Klein J, et al. 2022 ESC guidelines on cardio-oncology developed in collaboration with the European Hematology Association (EHA), the European Society for Therapeutic Radiology and Oncology (ESTRO) and the International Cardio-Oncology Society (IC-OS). Eur Heart J. 2022;43:4229–361.

34. Jacobs J, Greason G, Mangold KE, Wildiers H, Willems R, Janssens S, et al. Artificial intelligence electrocardiogram as a novel screening tool to detect a newly abnormal left ventricular ejection fraction after anthracycline-based cancer therapy. Eur J Prev Cardiol. 2024;31:560–6.


Acknowledgements

We are grateful to all the patients who participated in the study and to the China Scholarship Council.

This work was supported by the National Natural Science Foundation of China (No. 82202836), the Natural Science Foundation of Shandong Province (No. ZR202103020544), the Shandong Province Medical and Health Technology Development Plan Project (No. 202002070690), and the Qingdao Science and Technology Demonstration and Guidance Project (No. 22-3-7-smjk-10-nsh).

Author information

Xiaohe Hao, Zhenyu Zhang and Jing Kong contributed equally to this work.

Authors and Affiliations

Department of Clinical Laboratory, Shandong Cancer Hospital and Institute, Shandong First Medical University, Shandong Academy of Medical Sciences, 440 Ji-Yan Road, Jinan, Shandong Province, 250117, PR China

Xiaohe Hao, Zhenyu Zhang, Cuiping Mao, Xun Peng, Kun Ru, Lisheng Liu, Chuanxi Zhao, Xinkai Mo, Xiangguo Yu & Qinghai Lin

Department of Cardiology, Qilu Hospital of Shandong University, Jinan, Shandong, 250012, China

Electrocardiogram Room, Shandong Cancer Hospital and Institute, Shandong First Medical University, Shandong Academy of Medical Sciences, Jinan, Shandong, 250117, China

Department of Clinical Laboratory, Qilu Hospital of Shandong University, Jinan, Shandong, 250012, China

Meijuan Cai


Contributions

Qinghai Lin designed the experiments; Xiaohe Hao, Jing Kong, Xun Peng, Zhenyu Zhang and Cuiping Mao collected the laboratory results; Kun Ru, Chuanxi Zhao, Xinkai Mo and Rufei Ma provided the information on the cancer patients; Qinghai Lin and Xiaohe Hao prepared Figs. 1, 2, 3, 4 and 5; Jing Kong and Rufei Ma prepared Figs. 6 and 7 and Tables 1 and 2. Qinghai Lin, Xiangguo Yu, Meijuan Cai and Lisheng Liu wrote the manuscript and analyzed the data; Xiaohe Hao polished the manuscript. All authors reviewed the manuscript.

Corresponding authors

Correspondence to Xiangguo Yu or Qinghai Lin .

Ethics declarations

Ethics approval and consent to participate

All procedures performed in studies involving human samples were reviewed and approved by the Ethics Committee of Shandong Cancer Hospital and Institute, in accordance with the Declaration of Helsinki. The participants provided their written informed consent to participate in this study.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below are the links to the electronic supplementary material.

Supplementary Materials 1–27

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .


About this article

Cite this article

Hao, X., Zhang, Z., Kong, J. et al. Hypothesis paper: GDF15 demonstrated promising potential in Cancer diagnosis and correlated with cardiac biomarkers. Cardio-Oncology 10 , 56 (2024). https://doi.org/10.1186/s40959-024-00263-9


Received : 07 June 2024

Accepted : 30 August 2024

Published : 04 September 2024

DOI : https://doi.org/10.1186/s40959-024-00263-9


Keywords: Cardiovascular toxicity

Cardio-Oncology

ISSN: 2057-3804

