In the world of scientific inquiry, you often begin with a null hypothesis (H 0 ), which expresses the currently accepted value for a parameter in the population. The alternative hypothesis (H a ), on the other hand, is the opposite of the null hypothesis and challenges the currently accepted value.
To illustrate this concept of null and alternative hypotheses, you will look at some well-known stories and examples.
In ancient and medieval times, the widely held belief was that all planets orbited around the Earth, as the Earth was considered the centre of the universe. This idea can be considered the null hypothesis, as it represents the currently accepted value for a parameter in the population. Thus, it can be written as:
H 0 : All planets orbit around the Earth.
Planets revolving around the Sun
In the world of business and finance, the idea that paper money must be backed by gold (the gold standard) was also a commonly held belief for a long time. This belief can be considered a null hypothesis. However, following the Great Depression, people began to question this belief and broke the link between banknotes and gold. This alternative hypothesis challenged the gold standard, and it eventually became widely accepted that the value of paper money is not necessarily equal to a fixed amount of gold. Thus, H 0 and H a statements can be written as:
H 0 : The value of paper money is equal to a fixed amount of gold.
H a : The value of paper money is not equal to a fixed amount of gold.
In modern times, people generally place their trust in the value of banknotes issued by central banks or monetary authorities, which are backed by a strong government. This belief can be considered a null hypothesis. However, digital currency, such as Bitcoin, has emerged as an alternative to traditional paper money. Bitcoin is not backed by any central bank or monetary authority, and transactions involving Bitcoin are verified by network nodes using cryptography and recorded in a blockchain. This alternative hypothesis challenges the belief that the value of paper money is solely based on people's trust in central banks or monetary authorities. Thus, H 0 and H a statements can be written as:
H 0 : The value of paper money is equal to people’s trust in central banks or monetary authorities.
H a : The value of paper money is not equal to people’s trust in central banks or monetary authorities.
In conclusion, the alternative hypothesis always challenges the idea expressed in the null hypothesis. By testing the null hypothesis against the alternative hypothesis, you can determine which idea is more supported by the available data. The alternative hypothesis is often referred to as a ‘research hypothesis’ because it initiates the motivation and opportunities for further research.
Let’s return to the first example given in Section 1. If you see that your friends and relatives make more or less than £26,000 annually on average, perhaps you should question the widely accepted proposition of £26,000 as the average annual salary in the UK. This will enable you to develop an alternative hypothesis:
H a : Average annual salary in the UK is not equal to £26,000.
The following activity will test your knowledge of null and alternative hypotheses.
Read the following statements. Can you develop a null hypothesis and an alternative hypothesis?
‘It is believed that a high-end coffee machine produces a cup of caffè latte with an average of 1 cm of foam. The hotel employee claims that after the machine has been repaired, it is no longer able to produce a cup of caffè latte with 1 cm of foam.’
H 0 : a coffee machine makes a cup of caffè latte with 1 cm foam on average.
H a : a coffee machine cannot make a cup of caffè latte with 1 cm foam on average.
If you have developed the hypotheses H 0 and H a as mentioned in the discussion to Activity 1, you have shown that you are familiar with the structure of different types of hypotheses. However, in the next section you will explore the concept of hypothesis formulation further.
The actual test begins by considering two hypotheses . They are called the null hypothesis and the alternative hypothesis . These hypotheses contain opposing viewpoints.
H 0 : The null hypothesis: It is a statement of no difference between the variables; that is, they are not related. This can often be considered the status quo, and as a result, if you cannot accept the null hypothesis, some action is required.
H a : The alternative hypothesis: It is a claim about the population that is contradictory to H 0 and what we conclude when we reject H 0 . This is usually what the researcher is trying to prove.
Since the null and alternative hypotheses are contradictory, you must examine evidence to decide if you have enough evidence to reject the null hypothesis or not. The evidence is in the form of sample data.
After you have determined which hypothesis the sample supports, you make a decision. There are two options for a decision. They are "reject H 0 " if the sample information favors the alternative hypothesis or "do not reject H 0 " or "decline to reject H 0 " if the sample information is insufficient to reject the null hypothesis.
Mathematical Symbols Used in H 0 and H a :
H 0 | H a
equal (=) | not equal (≠) or greater than (>) or less than (<)
greater than or equal to (≥) | less than (<)
less than or equal to (≤) | more than (>)
H 0 always has a symbol with an equal in it. H a never has a symbol with an equal in it. The choice of symbol depends on the wording of the hypothesis test. However, be aware that many researchers (including one of the co-authors in research work) use = in the null hypothesis, even with > or < as the symbol in the alternative hypothesis. This practice is acceptable because we only make the decision to reject or not reject the null hypothesis.
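The link between the symbol in H a and the final decision can be sketched in code. The following is a minimal illustration, assuming a test statistic that follows the standard normal distribution (a z statistic) and a significance level of 0.05; the function names and the significance level are assumptions made for this sketch and do not come from the text.

```python
from statistics import NormalDist


def p_value_from_z(z: float, alternative: str) -> float:
    """Return the p-value of a z statistic for the symbol used in H a.

    alternative is "<", ">" or "!=" (standing in for the symbol ≠).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "<":     # left-tailed test
        return std_normal.cdf(z)
    if alternative == ">":     # right-tailed test
        return 1.0 - std_normal.cdf(z)
    if alternative == "!=":    # two-tailed test
        return 2.0 * (1.0 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be '<', '>' or '!='")


def decide(p_value: float, alpha: float = 0.05) -> str:
    """Only two decisions are possible: reject H0 or do not reject H0."""
    return "reject H0" if p_value < alpha else "do not reject H0"


# Hypothetical z statistic, evaluated against a left-tailed H a.
print(decide(p_value_from_z(-2.1, "<")))
```

Note that the symbol in H 0 never enters the calculation; only the symbol in H a determines which tail of the distribution is used, which is why writing = in the null hypothesis is acceptable.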
H 0 : No more than 30% of the registered voters in Santa Clara County voted in the primary election. p ≤ 0.30
H a : More than 30% of the registered voters in Santa Clara County voted in the primary election. p > 0.30
A medical trial is conducted to test whether or not a new medicine reduces cholesterol by 25%. State the null and alternative hypotheses.
We want to test whether the mean GPA of students in American colleges is different from 2.0 (out of 4.0). The null and alternative hypotheses are: H 0 : μ = 2.0 H a : μ ≠ 2.0
We want to test whether the mean height of eighth graders is 66 inches. State the null and alternative hypotheses. Fill in the correct symbol (=, ≠, ≥, <, ≤, >) for the null and alternative hypotheses.
We want to test if college students take less than five years to graduate from college, on the average. The null and alternative hypotheses are: H 0 : μ ≥ 5 H a : μ < 5
We want to test if it takes fewer than 45 minutes to teach a lesson plan. State the null and alternative hypotheses. Fill in the correct symbol ( =, ≠, ≥, <, ≤, >) for the null and alternative hypotheses.
In an issue of U. S. News and World Report , an article on school standards stated that about half of all students in France, Germany, and Israel take advanced placement exams and a third pass. The same article stated that 6.6% of U.S. students take advanced placement exams and 4.4% pass. Test if the percentage of U.S. students who take advanced placement exams is more than 6.6%. State the null and alternative hypotheses. H 0 : p ≤ 0.066 H a : p > 0.066
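To show how hypotheses such as H 0 : p ≤ 0.066 and H a : p > 0.066 could be tested, here is a minimal sketch of a one-proportion z-test. The sample (80 exam takers out of 1,000 students) is entirely hypothetical and is not taken from the article; the function name is also an assumption made for illustration.

```python
import math
from statistics import NormalDist


def one_proportion_z_test(successes: int, n: int, p0: float):
    """Right-tailed z-test for H0: p <= p0 against Ha: p > p0.

    Returns (z, p_value). Uses the normal approximation, which is
    reasonable only when n*p0 and n*(1 - p0) are both fairly large.
    """
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1.0 - p0) / n)  # computed under H0
    z = (p_hat - p0) / standard_error
    p_value = 1.0 - NormalDist().cdf(z)
    return z, p_value


# Hypothetical sample: 80 of 1,000 students took an advanced placement exam.
z, p_value = one_proportion_z_test(successes=80, n=1000, p0=0.066)
print(f"z = {z:.2f}, p-value = {p_value:.3f}")
```

A small p-value would lead to the decision "reject H 0 "; otherwise the decision is "do not reject H 0 ".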
On a state driver’s test, about 40% pass the test on the first try. We want to test if more than 40% pass on the first try. Fill in the correct symbol (=, ≠, ≥, <, ≤, >) for the null and alternative hypotheses.
Bring to class a newspaper, some news magazines, and some Internet articles . In groups, find articles from which your group can write null and alternative hypotheses. Discuss your hypotheses with the rest of the class.
Access for free at https://openstax.org/books/introductory-statistics-2e/pages/1-introduction
© Jul 18, 2024 OpenStax. Textbook content produced by OpenStax is licensed under a Creative Commons Attribution License . The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.
Agnieszka Glica, Katarzyna Wasilewska, Julia Jurkowska, Jarosław Żygierewicz, Bartosz Kossowski
The neural noise hypothesis of dyslexia posits an imbalance between excitatory and inhibitory (E/I) brain activity as an underlying mechanism of reading difficulties. This study provides the first direct test of this hypothesis using both indirect EEG power spectrum measures in 120 Polish adolescents and young adults (60 with dyslexia, 60 controls) and direct glutamate (Glu) and gamma-aminobutyric acid (GABA) concentrations from magnetic resonance spectroscopy (MRS) at a 7T MRI scanner in half of the sample. Our results, supported by Bayesian statistics, show no evidence of E/I balance differences between groups, challenging the hypothesis that cortical hyperexcitability underlies dyslexia. These findings suggest alternative mechanisms must be explored and highlight the need for further research into the E/I balance and its role in neurodevelopmental disorders.
eLife assessment
The authors combined neurophysiological (electroencephalography [EEG]) and neurochemical (magnetic resonance spectroscopy [MRS]) measures to empirically evaluate the neural noise hypothesis of developmental dyslexia. Their results are solid, supported by consistent findings from the two complementary methodologies and Bayesian statistics. Additional analyses, particularly on the neurochemical measures, are necessary to further substantiate the results. This study is useful for understanding the neural mechanisms of dyslexia and neural development in general.
According to the neural noise hypothesis of dyslexia, reading difficulties stem from an imbalance between excitatory and inhibitory (E/I) neural activity ( Hancock et al., 2017 ). The hypothesis predicts increased cortical excitation leading to more variable and less synchronous neural firing. This instability supposedly results in disrupted sensory representations and impedes phonological awareness and multisensory integration skills, crucial for learning to read ( Hancock et al., 2017 ). Yet, studies testing this hypothesis are lacking.
The non-invasive measurement of the E/I balance can be derived through assessment of glutamate (Glu) and gamma-aminobutyric acid (GABA) neurotransmitters concentration via magnetic resonance spectroscopy (MRS) ( Finkelman et al., 2022 ) or through global, indirect estimations from the electroencephalography (EEG) signal ( Ahmad et al., 2022 ).
Direct measurements of Glu and GABA yielded conflicting findings. Higher Glu concentrations in the midline occipital cortex correlated with poorer reading performance in children ( Del Tufo et al., 2018 ; Pugh et al., 2014 ), while elevated Glu levels in the anterior cingulate cortex (ACC) corresponded to greater phonological skills ( Lebel et al., 2016 ). Elevated GABA in the left inferior frontal gyrus was linked to reduced verbal fluency in adults ( Nakai and Okanoya, 2016 ), and increased GABA in the midline occipital cortex in children was associated with slower reaction times in a linguistic task ( Del Tufo et al., 2018 ). However, notable null findings exist regarding dyslexia status and Glu levels in the ACC among children ( Horowitz-Kraus et al., 2018 ) as well as Glu and GABA levels in the visual and temporo-parietal cortices in both children and adults ( Kossowski et al., 2019 ).
Both beta (∼13-28 Hz) and gamma (> 30 Hz) oscillations may serve as E/I balance indicators ( Ahmad et al., 2022 ), as greater GABA-ergic activity has been associated with greater beta power ( Jensen et al., 2005 ; Porjesz et al., 2002 ) and gamma power or peak frequency ( Brunel and Wang, 2003 ; Chen et al., 2017 ). Resting-state analyses often reported nonsignificant beta power associations with dyslexia ( Babiloni et al., 2012 ; Fraga González et al., 2018 ; Xue et al., 2020 ); however, one study indicated lower beta power in dyslexic compared to control boys ( Fein et al., 1986 ). Mixed results were also observed during tasks. One study found decreased beta power in the dyslexic group ( Spironelli et al., 2008 ), while another found increased beta power relative to the control group ( Rippon and Brunswick, 2000 ). A nonsignificant relationship between resting gamma power and dyslexia was reported ( Babiloni et al., 2012 ; Lasnick et al., 2023 ). When analyzing auditory steady-state responses, the dyslexic group had a lower gamma peak frequency, while no significant differences in gamma power were observed ( Rufener and Zaehle, 2021 ). Essentially, the majority of studies in dyslexia examining gamma frequencies evaluated cortical entrainment to auditory stimuli ( Lehongre et al., 2011 ; Marchesotti et al., 2020 ; Van Hirtum et al., 2019 ). Therefore, the results from these tasks do not provide direct evidence of differences in either gamma power or peak frequency between the dyslexic and control groups.
The EEG signal comprises both oscillatory, periodic activity, and aperiodic activity, characterized by a gradual decrease in power as frequencies rise (1/f signal) ( Donoghue et al., 2020 ). Recently recognized as a biomarker of E/I balance, a lower exponent of signal decay (flatter slope) indicates a greater dominance of excitation over inhibition in the brain, as shown by the simulation models of local field potentials, ratio of AMPA/GABA a synapses in the rat hippocampus ( Gao et al., 2017 ) and recordings under propofol or ketamine in macaques and humans ( Gao et al., 2017 ; Waschke et al., 2021 ). However, there are also pharmacological studies providing mixed results ( Colombo et al., 2019 ; Salvatore et al., 2024 ). Nonetheless, the 1/f signal has shown associations with various conditions putatively characterized by changes in E/I balance, such as early development in infancy ( Schaworonkow and Voytek, 2021 ), healthy aging ( Voytek et al., 2015 ) and neurodevelopmental disorders like ADHD ( Ostlund et al., 2021 ), autism spectrum disorder ( Manyukhina et al., 2022 ) or schizophrenia ( Molina et al., 2020 ). Despite its potential relevance, the evaluation of the 1/f signal in dyslexia remains limited to one study, revealing flatter slopes among dyslexic compared to control participants at rest ( Turri et al., 2023 ), thereby lending support to the notion of neural noise in dyslexia.
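To make the exponent and offset concrete, the sketch below fits a straight line to a power spectrum in log-log coordinates, assuming the common parameterization log10 P(f) = offset - exponent * log10(f). This is a deliberately simplified illustration: it ignores oscillatory peaks, it is not the spectral parameterization algorithm used in the study, and the synthetic spectrum, frequency range, and function name are assumptions chosen for the example.

```python
import math


def fit_aperiodic(freqs, power):
    """Ordinary least-squares line through (log10 f, log10 power).

    Returns (offset, exponent), where the exponent is the negated slope,
    so a flatter spectrum yields a smaller exponent.
    """
    xs = [math.log10(f) for f in freqs]
    ys = [math.log10(p) for p in power]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, -slope


# Synthetic, noise-free 1/f spectrum with hypothetical parameters.
freqs = [float(f) for f in range(1, 44)]  # 1-43 Hz, an assumed range
true_offset, true_exponent = -10.8, 1.5
power = [10 ** true_offset / f ** true_exponent for f in freqs]

offset, exponent = fit_aperiodic(freqs, power)
print(f"offset = {offset:.2f}, exponent = {exponent:.2f}")
```

On real EEG data, oscillatory peaks (for example in the alpha or beta band) would bias such a plain linear fit, which is why dedicated algorithms model periodic and aperiodic components jointly.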
Here, we examined both indirect (1/f signal, beta, and gamma oscillations during both rest and a spoken language task) and direct (Glu and GABA) biomarkers of E/I balance in participants with dyslexia and age-matched controls. The neural noise hypothesis predicts flatter slopes of 1/f signal, decreased beta and gamma power, and higher Glu concentrations in the dyslexic group. Furthermore, we tested the relationships between different E/I measures. Flatter slopes of 1/f signal should be related to higher Glu level, while enhanced beta and gamma power to increased GABA level.
We recruited 120 Polish adolescents and young adults – 60 with dyslexia diagnosis and 60 controls matched in sex, age, and family socio-economic status. The dyslexic group scored lower in all reading and reading-related tasks and higher in the Polish version of the Adult Reading History Questionnaire (ARHQ-PL) ( Bogdanowicz et al., 2015 ), where a higher score indicates a higher risk of dyslexia (see Table S1 in the Supplementary Material). Although all participants were within the intellectual norm, the dyslexic group scored lower on the IQ scale (including nonverbal subscale only) than the control group. However, the Bayesian statistics did not provide evidence for the difference between groups in the nonverbal IQ.
We analyzed the aperiodic (exponent and offset) components of the EEG signal at rest and during a spoken language task, where participants listened to a sentence and had to indicate its veracity. Due to a technical error, the signal from one person (a female from the dyslexic group) was not recorded during most of the language task and was excluded from the analyses. Hence, the results are provided for 119 participants – 59 in the dyslexic and 60 in the control group.
First, aperiodic parameter values were averaged across all electrodes and compared between groups (dyslexic, control) and conditions (resting state, language task) using a 2×2 repeated measures ANOVA. Age negatively correlated both with the exponent ( r = -.27, p = .003, BF 10 = 7.96) and offset ( r = -.40, p < .001, BF 10 = 3174.29) in line with previous investigations ( Cellier et al., 2021 ; McSweeney et al., 2021 ; Schaworonkow and Voytek, 2021 ; Voytek et al., 2015 ), therefore we included age as a covariate. Post-hoc tests are reported with Bonferroni corrected p -values.
For the mean exponent, we found a significant effect of age ( F (1,116) = 8.90, p = .003, η 2 p = .071, BF incl = 10.47), while the effects of condition ( F (1,116) = 2.32, p = .131, η 2 p = .020, BF incl = 0.39) and group ( F (1,116) = 0.08, p = .779, η 2 p = .001, BF incl = 0.40) were not significant and Bayes Factor did not provide evidence for either inclusion or exclusion. Interaction between group and condition ( F (1,116) = 0.16, p = .689, η 2 p = .001, BF incl = 0.21) was not significant and Bayes Factor indicated against including it in the model.
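The reported partial eta squared values can be cross-checked against the F statistics using the standard identity η 2 p = (F × df effect) / (F × df effect + df error). The helper below is a small sketch of that check; the function name is an assumption for illustration.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Effects on the mean exponent, all with F(1,116):
print(round(partial_eta_squared(8.90, 1, 116), 3))  # age: 0.071
print(round(partial_eta_squared(2.32, 1, 116), 3))  # condition: 0.02
print(round(partial_eta_squared(0.08, 1, 116), 3))  # group: 0.001
```

The recomputed values agree with those reported for age, condition, and group.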
For the mean offset, we found significant effects of age ( F (1,116) = 22.57, p < .001, η 2 p = .163, BF incl = 1762.19) and condition ( F (1,116) = 23.04, p < .001, η 2 p = .166, BF incl > 10000) with post-hoc comparison indicating that the offset was lower in the resting state condition ( M = -10.80, SD = 0.21) than in the language task ( M = -10.67, SD = 0.26, p corr < .001). The effect of group ( F (1,116) = 0.00, p = .964, η 2 p = .000, BF incl = 0.54) was not significant while Bayes Factor did not provide evidence for either inclusion or exclusion. Interaction between group and condition was not significant ( F (1,116) = 0.07, p = .795, η 2 p = .001, BF incl = 0.22) and Bayes Factor indicated against including it in the model.
Next, we restricted analyses to language regions and averaged exponent and offset values from the frontal electrodes corresponding to the left (F7, FT7, FC5) and right inferior frontal gyrus (F8, FT8, FC6), as well as temporal electrodes, corresponding to the left (T7, TP7, TP9) and right superior temporal sulcus, STS (T8, TP8, TP10) ( Giacometti et al., 2014 ; Scrivener and Reader, 2022 ). A 2×2×2×2 (group, condition, hemisphere, region) repeated measures ANOVA with age as a covariate was applied. Power spectra from the left STS at rest and during the language task are presented in Figure 1A and C , while the results for the exponent, offset, and beta power are presented in Figure 1B and D .
Overview of the main results obtained in the study. (A) Power spectral densities averaged across 3 electrodes (T7, TP7, TP9) corresponding to the left superior temporal sulcus (STS) separately for dyslexic (DYS) and control (CON) groups at rest and (C) during the language task. (B) Plots illustrating results for the exponent, offset, and the beta power from the left STS electrodes at rest and (D ) during the language task. (E) Group results (CON > DYS) from the fMRI localizer task for words compared to the control stimuli (p < .05 FWE cluster threshold) and overlap of the MRS voxel placement across participants. (F) MRS spectra separately for DYS and CON groups. (G) Plots illustrating results for the Glu, GABA, Glu/GABA ratio and the Glu/GABA imbalance. (H ) Semi-partial correlation between offset at rest (left STS electrodes) and Glu controlling for age and gray matter volume (GMV).
For the exponent, there were significant effects of age ( F (1,116) = 14.00, p < .001, η 2 p = .108, BF incl = 11.46) and condition ( F (1,116) = 4.06, p = .046, η 2 p = .034, BF incl = 1.88); however, Bayesian statistics did not provide evidence for either including or excluding the condition factor. Furthermore, post-hoc comparisons did not reveal significant differences between the exponent at rest ( M = 1.51, SD = 0.17) and during the language task ( M = 1.51, SD = 0.18, p corr = .546). There was also a significant interaction between region and group, although Bayes Factor indicated against including it in the model ( F (1,116) = 4.44, p = .037, η 2 p = .037, BF incl = 0.25). Post-hoc comparisons indicated that the exponent was higher in the frontal than in the temporal region both in the dyslexic ( M frontal = 1.54, SD frontal = 0.15, M temporal = 1.49, SD temporal = 0.18, p corr < .001) and in the control group ( M frontal = 1.54, SD frontal = 0.17, M temporal = 1.46, SD temporal = 0.20, p corr < .001). The difference between groups was not significant either in the frontal ( p corr = .858) or temporal region ( p corr = .441). The effects of region ( F (1,116) = 1.17, p = .282, η 2 p = .010, BF incl > 10000) and hemisphere ( F (1,116) = 1.17, p = .282, η 2 p = .010, BF incl = 12.48) were not significant, although Bayesian statistics indicated in favor of including them in the model. Furthermore, the interactions between condition and group ( F (1,116) = 0.18, p = .673, η 2 p = .002, BF incl = 3.70), and between region, hemisphere, and condition ( F (1,116) = 0.11, p = .747, η 2 p = .001, BF incl = 7.83) were not significant; however, Bayesian statistics indicated in favor of including these interactions in the model. The effect of group ( F (1,116) = 0.12, p = .733, η 2 p = .001, BF incl = 1.19) was not significant, while Bayesian statistics did not provide evidence for either inclusion or exclusion.
Any other interactions were not significant and Bayes Factor indicated against including them in the model.
In the case of offset, there were significant effects of condition ( F (1,116) = 20.88, p < .001, η 2 p = .153, BF incl > 10000) and region ( F (1,116) = 6.18, p = .014, η 2 p = .051, BF incl > 10000). For the main effect of condition, post-hoc comparison indicated that the offset was lower in the resting state condition ( M = -10.88, SD = 0.33) than in the language task ( M = -10.76, SD = 0.38, p corr < .001), while for the main effect of region, post-hoc comparison indicated that the offset was lower in the temporal ( M = -10.94, SD = 0.37) as compared to the frontal region ( M = -10.69, SD = 0.34, p corr < .001). There was also a significant effect of age ( F (1,116) = 20.84, p < .001, η 2 p = .152, BF incl = 0.23) and interaction between condition and hemisphere, ( F (1,116) = 4.35, p = .039, η 2 p = .036, BF incl = 0.21), although Bayes Factor indicated against including these factors in the model. Post-hoc comparisons for the condition*hemisphere interaction indicated that the offset was lower in the resting state condition than in the language task both in the left ( M rest = -10.85, SD rest = 0.34, M task = -10.73, SD task = 0.40, p corr < .001) and in the right hemisphere ( M rest = -10.91, SD rest = 0.31, M task = -10.79, SD task = 0.37, p corr < .001) and that the offset was lower in the right as compared to the left hemisphere both at rest ( p corr < .001) and during the language task ( p corr < .001). The interactions between region and condition ( F (1,116) = 1.76, p = .187, η 2 p = .015, BF incl > 10000), hemisphere and group ( F (1,116) = 1.58, p = .211, η 2 p = .013, BF incl = 1595.18), region and group ( F (1,116) = 0.27, p = .605, η 2 p = .002, BF incl = 9.32), as well as between region, condition, and group ( F (1,116) = 0.21, p = .651, η 2 p = .002, BF incl = 2867.18) were not significant, although Bayesian statistics indicated in favor of including them in the model. 
The effect of group ( F (1,116) = 0.18, p = .673, η 2 p = .002, BF incl < 0.00001) was not significant and Bayesian statistics indicated against including it in the model. Any other interactions were not significant and Bayesian statistics indicated against including them in the model or did not provide evidence for either inclusion or exclusion.
Then, we analyzed the aperiodic-adjusted brain oscillations. Since the algorithm did not find the gamma peak (30-43 Hz) above the aperiodic component in the majority of participants, we report the results only for the beta (14-30 Hz) power. We performed a similar regional analysis as for the exponent and offset with a 2×2×2×2 (group, condition, hemisphere, region) repeated measures ANOVA. However, we did not include age as a covariate, as it did not correlate with any of the periodic measures. The sample size was 117 (DYS n = 57, CON n = 60) since in 2 participants the algorithm did not find the beta peak above the aperiodic component in the left frontal electrodes during the task.
The analysis revealed a significant effect of condition ( F (1,115) = 8.58, p = .004, η 2 p = .069, BF incl = 5.82) with post-hoc comparison indicating that the beta power was greater during the language task ( M = 0.53, SD = 0.22) than at rest ( M = 0.50, SD = 0.19, p corr = .004). There were also significant effects of region ( F (1,115) = 10.98, p = .001, η 2 p = .087, BF incl = 23.71), and hemisphere ( F (1,115) = 12.08, p < .001, η 2 p = .095, BF incl = 23.91). For the main effect of region, post-hoc comparisons indicated that the beta power was greater in the temporal ( M = 0.52, SD = 0.21) as compared to the frontal region ( M = 0.50, SD = 0.19, p corr = .001), while for the main effect of hemisphere, post-hoc comparisons indicated that the beta power was greater in the right ( M = 0.52, SD = 0.20) than in the left hemisphere ( M = 0.51, SD = 0.20, p corr < .001). There was a significant interaction between condition and region ( F (1,115) = 12.68, p < .001, η 2 p = .099, BF incl = 55.26) with greater beta power during the language task as compared to rest significant in the temporal ( M rest = 0.50, SD rest = 0.20, M task = 0.55, SD task = 0.24, p corr < .001), while not in the frontal region ( M rest = 0.49, SD rest = 0.18, M task = 0.51, SD task = 0.22, p corr = .077). Also, greater beta power in the temporal as compared to the frontal region was significant during the language task ( p corr < .001), while not at rest ( p corr = .283). The effect of group ( F (1,115) = 0.05, p = .817, η 2 p = .000, BF incl < 0.00001) was not significant and Bayes Factor indicated against including it in the model. Any other interactions were not significant and Bayesian statistics indicated against including them in the model or did not provide evidence for either inclusion or exclusion.
Additionally, building upon previous findings which demonstrated differences in dyslexia in aperiodic and periodic components within the parieto-occipital region ( Turri et al., 2023 ), we have included analyses for the same cluster of electrodes in the Supplementary Material. However, in this region, we also did not find evidence for group differences either in the exponent, offset or beta power.
In total, 59 out of 120 participants underwent an MRS session at a 7T MRI scanner - 29 from the dyslexic group (13 females, 16 males) and 30 from the control group (14 females, 16 males). The MRS voxel was placed in the left STS, in a region showing highest activation for both visual and auditory words (compared to control stimuli) localized individually in each participant, based on an fMRI task (see Figure 1E for overlap of the MRS voxel placement across participants and Figure 1F for MRS spectra). We decided to analyze the neurometabolites’ levels derived from the left STS, as this region is consistently related to functional and structural differences in dyslexia across languages ( Yan et al., 2021 ).
Due to insufficient magnetic homogeneity or interruption of the study by the participants, 5 participants from the dyslexic group had to be excluded. We excluded a further 4 participants due to poor quality of the obtained spectra; thus, the results for Glu are reported for 50 participants - 21 in the dyslexic (12 females, 9 males) and 29 in the control group (13 females, 16 males). In the case of GABA, we additionally excluded 3 participants based on the Cramér-Rao Lower Bounds (CRLB) > 20%. Therefore, the results for GABA, Glu/GABA ratio and Glu/GABA imbalance are reported for 47 participants - 20 in the dyslexic (12 females, 8 males) and 27 in the control group (11 females, 16 males). Demographic and behavioral characteristics for the subsample of 47 participants are provided in Table S2.
For each metabolite, we performed a separate univariate ANCOVA with the effect of group being tested and voxel’s gray matter volume (GMV) as a covariate (see Figure 1G ). For the Glu analysis, we also included age as a covariate, due to negative correlation between variables ( r = -.35, p = .014, BF 10 = 3.41). The analysis revealed a significant effect of GMV ( F (1,46) = 8.18, p = .006, η 2 p = .151, BF incl = 12.54), while the effects of age ( F (1,46) = 3.01, p = .090, η 2 p = .061, BF incl = 1.15) and group ( F (1,46) = 1.94, p = .170, η 2 p = .040, BF incl = 0.63) were not significant and Bayes Factor did not provide evidence for either inclusion or exclusion.
Conversely, GABA did not correlate with age ( r = -.11, p = .481, BF 10 = 0.23), thus age was not included as a covariate. The analysis revealed a significant effect of GMV ( F (1,44) = 4.39, p = .042, η 2 p = .091, BF incl = 1.64), however Bayes Factor did not provide evidence for either inclusion or exclusion. The effect of group was not significant ( F (1,44) = 0.49, p = .490, η 2 p = .011, BF incl = 0.35) although Bayesian statistics did not provide evidence for either inclusion or exclusion.
Also, Glu/GABA ratio did not correlate with age ( r = -.05, p = .744, BF 10 = 0.19), therefore age was not included as a covariate. The results indicated that the effect of GMV was not significant ( F (1,44) = 0.95, p = .335, η 2 p = .021, BF incl = 0.43) while Bayes Factor did not provide evidence for either inclusion or exclusion. The effect of group was not significant ( F (1,44) = 0.01, p = .933, η 2 p = .000, BF incl = 0.29) and Bayes Factor indicated against including it in the model.
Following a recent study examining developmental changes in both EEG and MRS E/I biomarkers ( McKeon et al., 2024 ), we calculated an additional measure of Glu/GABA imbalance, computed as the absolute residual value from the linear regression of Glu predicted by GABA, with greater values indicating greater Glu/GABA imbalance. As in the previous work ( McKeon et al., 2024 ), we took the square root of this value to ensure a normal distribution of the data. This measure did not correlate with age ( r = -.05, p = .719, BF 10 = 0.19); thus, age was not included as a covariate. The results indicated that the effect of GMV was not significant ( F (1,44) = 0.63, p = .430, η 2 p = .014, BF incl = 0.37) while Bayes Factor did not provide evidence for either inclusion or exclusion. The effect of group was not significant ( F (1,44) = 0.74, p = .396, η 2 p = .016, BF incl = 0.39) although Bayesian statistics did not provide evidence for either inclusion or exclusion.
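The imbalance measure described above (square root of the absolute residual of Glu regressed on GABA) can be sketched as follows. This is a minimal illustration assuming a simple least-squares regression with an intercept; the concentration values and function names are hypothetical and are not data from the study.

```python
import math


def ols_fit(x, y):
    """Simple linear regression of y on x; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def glu_gaba_imbalance(glu, gaba):
    """Square root of |residual| from regressing Glu on GABA, per participant."""
    slope, intercept = ols_fit(gaba, glu)
    return [
        math.sqrt(abs(g - (intercept + slope * b)))
        for g, b in zip(glu, gaba)
    ]


# Hypothetical concentrations for five participants (arbitrary units).
gaba = [1.8, 2.0, 2.1, 2.4, 2.6]
glu = [9.5, 10.4, 9.9, 11.2, 11.0]
imbalance = glu_gaba_imbalance(glu, gaba)
print([round(v, 3) for v in imbalance])
```

A participant whose Glu level lies exactly on the regression line has an imbalance of zero; the further the Glu level departs from what GABA predicts, in either direction, the larger the value.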
Next, we investigated correlations between Glu and GABA concentrations in the left STS and EEG markers of E/I balance. Semi-partial correlations were performed ( Table 1 ) to control for confounding variables - for Glu the effects of age and GMV were regressed, for GABA, Glu/GABA ratio and Glu/GABA imbalance the effect of GMV was regressed, while for exponents and offsets the effect of age was regressed. For zero-order correlations between variables see Table S3.
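The idea of regressing a confound out of each variable before correlating can be sketched as below. This simplified illustration handles only one covariate per variable (for example GMV for GABA and age for the exponent); the Glu case, with two covariates, would need a multiple regression. The data and function names are hypothetical, and the sketch is not the authors' analysis code.

```python
import math


def residualize(y, covariate):
    """Residuals of y after a simple linear regression on one covariate."""
    n = len(y)
    mean_c = sum(covariate) / n
    mean_y = sum(y) / n
    scc = sum((c - mean_c) ** 2 for c in covariate)
    scy = sum((c - mean_c) * (v - mean_y) for c, v in zip(covariate, y))
    slope = scy / scc
    intercept = mean_y - slope * mean_c
    return [v - (intercept + slope * c) for v, c in zip(y, covariate)]


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical data for six participants.
gmv = [0.52, 0.55, 0.58, 0.60, 0.63, 0.66]
age = [15.0, 17.0, 19.0, 21.0, 23.0, 24.0]
gaba = [2.1, 2.0, 2.4, 2.3, 2.6, 2.5]
exponent = [1.62, 1.55, 1.58, 1.47, 1.50, 1.41]

r = pearson_r(residualize(gaba, gmv), residualize(exponent, age))
print(f"r = {r:.2f}")
```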
Glu negatively correlated with offset in the left STS both at rest ( r = -.38, p = .007, BF 10 = 6.28; Figure 1H ) and during the language task ( r = -.37, p = .009, BF 10 = 5.05), while any other correlations between Glu and EEG markers were not significant and Bayesian statistics indicated in favor of null hypothesis or provided absence of evidence for either hypothesis. Furthermore, Glu/GABA imbalance positively correlated with exponent at rest both averaged across all electrodes ( r = .29, p = .048, BF 10 = 1.21), as well as in the left STS electrodes ( r = .35, p = .017, BF 10 = 2.87) although Bayes Factor provided absence of evidence for either alternative or null hypothesis. Conversely, GABA and Glu/GABA ratio were not significantly correlated with any of the EEG markers and Bayesian statistics indicated in favor of null hypothesis or provided absence of evidence for either hypothesis.
The neural noise hypothesis of dyslexia predicts that neural noise affects reading through the impairment of 1) phonological awareness, 2) lexical access and generalization and 3) multisensory integration ( Hancock et al., 2017 ). Therefore, we analyzed correlations between these variables, reading skills and direct and indirect markers of E/I balance. For the composite score of phonological awareness, we averaged z-scores from the phoneme deletion, and phoneme and syllable spoonerisms tasks. For the composite score of lexical access and generalization, we averaged z-scores from the objects, colors, letters and digits subtests of the rapid automatized naming (RAN) task, while for the composite score of reading we averaged z-scores from words and pseudowords read per minute, and text reading time in the reading comprehension task. The outcomes from the RAN and reading comprehension tasks were transformed from raw time scores to items/time scores so that all z-scored measures share the same direction, with greater values indicating better skills. For the multisensory integration score, we used results from the redundant target effect task reported in our previous work ( Glica et al., 2024 ), with greater values indicating a greater magnitude of multisensory integration.
Age positively correlated with multisensory integration ( r = .38, p < .001, BF 10 = 87.98), composite scores of reading ( r = .22, p = .014, BF 10 = 2.24) and phonological awareness ( r = .21, p = .021, BF 10 = 1.59), but not with the composite score of RAN ( r = .13, p = .151, BF 10 = 0.32). Hence, we regressed out the effect of age from the multisensory integration, reading and phonological awareness scores and performed semi-partial correlations ( Table 2 ; for zero-order correlations see Table S4).
Phonological awareness positively correlated with offset in the left STS at rest ( r = .18, p = .049, BF 10 = 0.77) and with beta power in the left STS both at rest ( r = .23, p = .011, BF 10 = 2.73; Figure 2A ) and during the language task ( r = .23, p = .011, BF 10 = 2.84; Figure 2B ), although the Bayes Factor provided no evidence for either the alternative or the null hypothesis. Furthermore, multisensory integration positively correlated with GABA concentration ( r = .31, p = .034, BF 10 = 1.62) and negatively with Glu/GABA ratio ( r = -.32, p = .029, BF 10 = 1.84), although the Bayes Factor again provided no evidence for either the alternative or the null hypothesis. No other correlations between reading skills and E/I balance markers were significant, and Bayesian statistics either favored the null hypothesis or provided no evidence for either hypothesis.
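The composite scores described above could be computed along the following lines. This is a minimal sketch: the helper names and example data are hypothetical, and z-scoring with the sample standard deviation across the whole sample is an assumption, since the text does not state which normalization was used.

```python
from statistics import mean, stdev

def to_rate(n_items, times_s):
    """Convert completion times to items per second so that higher is better."""
    return [n_items / t for t in times_s]

def zscores(values):
    """Standardize values using the sample mean and sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def composite(*measures):
    """Average the z-scores of several measures, participant by participant."""
    standardized = [zscores(m) for m in measures]
    return [mean(scores) for scores in zip(*standardized)]

# Hypothetical data for four participants: RAN completion times in seconds
# for two subtests of 50 items each.
ran_objects = to_rate(50, [40.0, 35.0, 50.0, 45.0])
ran_digits = to_rate(50, [22.0, 20.0, 30.0, 25.0])
ran_composite = composite(ran_objects, ran_digits)
```

Converting times to rates before z-scoring means that a faster participant ends up with a higher composite, matching the direction of the accuracy-based measures.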
Associations between beta power, phonological awareness and reading. (A) Semi-partial correlation between phonological awareness controlling for age and beta power (in the left STS electrodes) at rest and (B) during the language task. (C) Partial correlation between phonological awareness and reading controlling for age. (D) Mediation analysis results. Unstandardized b regression coefficients are presented. Age was included in the analysis as a covariate. 95% CI - 95% confidence intervals. left STS - values averaged across 3 electrodes corresponding to the left superior temporal sulcus (T7, TP7, TP9).
Given that beta power correlated with phonological awareness, and considering the prediction that neural noise impedes reading by affecting phonological awareness, we examined this relationship through a mediation model. Since phonological awareness correlated with beta power in the left STS both at rest and during the language task, the outcomes from these two conditions were averaged prior to the mediation analysis. We used the PROCESS macro v4.2 ( Hayes, 2017 ) in IBM SPSS Statistics v29, with model 4 (simple mediation) and 5000 bootstrap samples to assess the significance of the indirect effect. Since age correlated with both phonological awareness and reading, we also included age as a covariate.
The results indicated that both effects of beta power in the left STS ( b = .96, t (116) = 2.71, p = .008, BF incl = 7.53) and age ( b = .06, t (116) = 2.55, p = .012, BF incl = 5.98) on phonological awareness were significant. The effect of phonological awareness on reading was also significant ( b = .69, t (115) = 8.16, p < .001, BF incl > 10000), while the effects of beta power ( b = -.42, t (115) = -1.25, p = .213, BF incl = 0.52) and age ( b = .03, t (115) = 1.18, p = .241, BF incl = 0.49) on reading were not significant when controlling for phonological awareness. Finally, the indirect effect of beta power on reading through phonological awareness was significant ( b = .66, SE = .24, 95% CI = [.24, 1.18]), while the total effect of beta power was not significant ( b = .24, t (116) = 0.61, p = .546, BF incl = 0.41). The results from the mediation analysis are presented in Figure 2D .
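In a simple mediation model estimated with linear regressions, the indirect effect is the product of the two path coefficients, and the total effect is the sum of the direct and indirect effects. The short check below applies this decomposition to the rounded coefficients reported above; the variable names are illustrative only, and small discrepancies due to rounding are expected.

```python
# Rounded unstandardized coefficients as reported in the text.
a = 0.96         # beta power -> phonological awareness
b = 0.69         # phonological awareness -> reading (controlling for beta power)
c_prime = -0.42  # direct effect of beta power on reading

indirect = a * b            # effect transmitted through phonological awareness
total = c_prime + indirect  # total effect of beta power on reading

print(f"indirect = {indirect:.2f}, total = {total:.2f}")
```

The recomputed values agree with the reported indirect effect ( b = .66) and total effect ( b = .24), which shows how a positive indirect path and a negative direct path can combine into a small, non-significant total effect.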
Although a similar mediation analysis could have been conducted for the Glu/GABA ratio, multisensory integration, and reading based on the correlations between these variables, we did not test this model because the small sample size (47 participants) provided insufficient statistical power.
The current study aimed to validate the neural noise hypothesis of dyslexia ( Hancock et al., 2017 ) utilizing E/I balance biomarkers from EEG power spectra and ultra-high-field MRS. Contrary to its predictions, we did not observe differences in 1/f slope, beta power, or Glu and GABA concentrations in participants with dyslexia. Relations between E/I balance biomarkers were limited to significant correlations between Glu and the offset when controlling for age, and between Glu/GABA imbalance and the exponent.
In terms of indirect markers, our study found no evidence of group differences in the aperiodic components of the EEG signal. In most of the models, we did not find evidence for either including or excluding the effect of the group when Bayesian statistics were evaluated. The only exception was the regional analysis for the offset, where results indicated against including the group factor in the model. These findings diverge from previous research on an Italian cohort, which reported decreased exponent and offset in the dyslexic group at rest, specifically within the parieto-occipital region, but not the frontal region ( Turri et al., 2023 ). Despite our study involving twice the number of participants and utilizing a longer acquisition time, we observed no group differences, even in the same cluster of electrodes (refer to Supplementary Material). The participants in both studies were of similar ages. The only methodological difference – EEG acquisition with eyes open in our study versus both eyes-open and eyes-closed in the work by Turri and colleagues (2023) – cannot fully account for the overall lack of group differences observed. The diverging study outcomes highlight the importance of considering potential inflation of effect sizes in studies with smaller samples.
Although a lower exponent of the EEG power spectrum has been associated with other neurodevelopmental disorders, such as ADHD ( Ostlund et al., 2021 ) or ASD (but only in children with IQ below average) ( Manyukhina et al., 2022 ), our study suggests that this is not the case for dyslexia. Considering the frequent comorbidity of dyslexia and ADHD ( Germanò et al., 2010 ; Langer et al., 2019 ), increased neural noise could serve as a common underlying mechanism for both disorders. However, our specific exclusion of participants with a comorbid ADHD diagnosis indicates that the EEG spectral exponent cannot serve as a neurobiological marker for dyslexia in isolation. No information regarding such exclusion criteria was provided in the study by Turri et al. (2023) ; thus, potential comorbidity with ADHD may explain the positive findings related to dyslexia reported therein.
Regarding the aperiodic-adjusted oscillatory EEG activity, Bayesian statistics for beta power favored excluding the group factor from the model. Non-significant group differences in beta power at rest have been previously reported in studies that did not account for aperiodic components ( Babiloni et al., 2012 ; Fraga González et al., 2018 ; Xue et al., 2020 ). This again contrasts with the study by Turri et al. (2023) , which observed lower aperiodic-adjusted beta power (at 15-25 Hz) in the dyslexic group. Concerning beta power during the task, our results also contrast with previous studies which showed either reduced ( Spironelli et al., 2008 ) or increased ( Rippon and Brunswick, 2000 ) beta activity in participants with dyslexia. Nevertheless, since both of these studies employed phonological tasks and involved samples of children, their relevance to our work is limited.
In terms of direct neurometabolite concentrations derived from the MRS, we found no evidence for group differences in Glu, GABA or Glu/GABA imbalance in the language-sensitive left STS. Conversely, the Bayes Factor provided evidence against including the group factor in the model for the Glu/GABA ratio. While no previous study has localized the MRS voxel based on the individual activation levels, nonsignificant group differences in Glu and GABA concentrations within the temporo-parietal and visual cortices have been reported in both children and adults ( Kossowski et al., 2019 ), as well as in the ACC in children ( Horowitz-Kraus et al., 2018 ). Although our MRS sample size was half that of the EEG sample, previous research reporting group differences in Glu concentrations involved an even smaller dyslexic cohort (10 participants with dyslexia and 45 typical readers in Pugh et al., 2014 ). Consistent with earlier studies that identified group differences in Glu and GABA concentrations ( Del Tufo et al., 2018 ; Pugh et al., 2014 ), we reported neurometabolite levels relative to total creatine (tCr), indicating that the absence of corresponding results cannot be ascribed to reference differences. Notably, our analysis of the fMRI localizer task revealed greater activation in the control group as compared to the dyslexic group within the left STS for words than for control stimuli (see Figure 1E and the Supplementary Material), in line with previous observations ( Blau et al., 2009 ; Dębska et al., 2021 ; Yan et al., 2021 ).
Irrespective of dyslexia status, we found negative correlations between age and exponent and offset, consistent with previous research ( Cellier et al., 2021 ; McSweeney et al., 2021 ; Schaworonkow and Voytek, 2021 ; Voytek et al., 2015 ) and providing further evidence for maturational changes in the aperiodic components (indicative of increased E/I ratio). At the same time, in line with previous MRS works ( Kossowski et al., 2019 ; Marsman et al., 2013 ), we observed a negative correlation between age and Glu concentrations. This suggests a contrasting pattern to EEG results, indicating a decrease in neuronal excitation with age. We also found a condition-dependent change in offset, with a lower offset observed at rest than during the language task. The offset value represents the uniform shift in power across frequencies ( Donoghue et al., 2020 ), with a higher offset linked to increased neuronal spiking rates ( Manning et al., 2009 ). Change in offset between conditions is consistent with observed increased alpha and beta power during the task, indicating elevated activity in both broadband (offset) and narrowband (alpha and beta oscillations) frequency ranges during the language task.
With regard to relationships between EEG and MRS E/I balance biomarkers, we observed a negative correlation between the offset in the left STS (both at rest and during the task) and Glu levels, after controlling for age and GMV. This correlation was not observed in zero-order correlations (see Supplementary Material). Contrary to our predictions, informed by previous studies linking the exponent to E/I ratio ( Colombo et al., 2019 ; Gao et al., 2017 ; Waschke et al., 2021 ), we found the correlation with Glu levels to involve the offset rather than the exponent. This outcome was unexpected, as none of the referenced studies reported results for the offset. However, given the strong correlation between the exponent and offset observed in our study ( r = .68, p < .001, BF 10 > 10000 and r = .72, p < .001, BF 10 > 10000 at rest and during the task respectively), it is conceivable that a similar association would have been identified for the offset in those studies, had it been analyzed.
Nevertheless, previous studies examining relationships between EEG and MRS E/I balance biomarkers ( McKeon et al., 2024 ; van Bueren et al., 2023 ) did not identify a similar negative association between Glu and the offset. Instead, one study noted a positive correlation between the Glu/GABA ratio and the exponent ( van Bueren et al., 2023 ), which was significant in the intraparietal sulcus but not in the middle frontal gyrus. This finding presents counterintuitive evidence, suggesting that an increased E/I balance, as indicated by MRS, is associated with a higher aperiodic exponent, considered indicative of decreased E/I balance. In line with this pattern, another study discovered a positive relationship between the exponent and Glu levels in the dorsolateral prefrontal cortex ( McKeon et al., 2024 ). Furthermore, they observed a positive correlation between the exponent and the Glu/GABA imbalance measure, calculated as the absolute residual value of a linear relationship between Glu and GABA ( McKeon et al., 2024 ), a finding replicated in the current work. This implies that a higher spectral exponent might not be directly linked to MRS-derived Glu or GABA levels, but rather to a greater disproportion (in either direction) between these neurotransmitters. These findings, alongside the contrasting relationships between EEG and MRS biomarkers and age, suggest that these methods may reflect distinct biological mechanisms of E/I balance.
Evidence regarding associations between neurotransmitter levels and oscillatory activity also remains mixed. One study found a positive correlation between gamma peak frequency and GABA concentration in the visual cortex ( Muthukumaraswamy et al., 2009 ), a finding later challenged by a study with a larger sample ( Cousijn et al., 2014 ). Similarly, one study noted a positive correlation between GABA in the left STS and gamma power ( Balz et al., 2016 ), whereas another found a non-significant relation between these measures ( Wyss et al., 2017 ). Moreover, in a simultaneous EEG and MRS study, an event-related increase in Glu following visual stimulation was found to correlate with greater gamma power ( Lally et al., 2014 ). We could not investigate such associations, as the algorithm failed to identify a gamma peak above the aperiodic component for the majority of participants. Also, contrary to previous findings showing associations between GABA in the motor and sensorimotor cortices and beta power ( Cheng et al., 2017 ; Gaetz et al., 2011 ) or beta peak frequency ( Baumgarten et al., 2016 ), we observed no correlation between Glu or GABA levels and beta power. However, these studies placed MRS voxels in motor regions which are typically linked to movement-related beta activity ( Baker et al., 1999 ; Rubino et al., 2006 ; Sanes and Donoghue, 1993 ) and did not adjust beta power for aperiodic components, which limits direct comparisons with our findings.
Finally, we examined pathways posited by the neural noise hypothesis of dyslexia, through which increased neural noise may impact reading: phonological awareness, lexical access and generalization, and multisensory integration ( Hancock et al., 2017 ). Phonological awareness was positively correlated with the offset in the left STS at rest, and with beta power in the left STS, both at rest and during the task. Additionally, multisensory integration showed correlations with GABA and the Glu/GABA ratio. Since the Bayes Factor did not provide conclusive evidence supporting either the alternative or null hypothesis, these associations appear rather weak. Nonetheless, given the hypothesis’s prediction of a causal link between these variables, we further examined a mediation model involving beta power, phonological awareness, and reading skills. The results suggested a positive indirect effect of beta power on reading via phonological awareness, whereas both the direct (controlling for phonological awareness and age) and total effects (without controlling for phonological awareness) were not significant. This finding is noteworthy, considering that participants with dyslexia exhibited reduced phonological awareness and reading skills, despite no observed differences in beta power. Given the cross-sectional nature of our study, further longitudinal research is necessary to confirm the causal relation among these variables. The effects of GABA and the Glu/GABA ratio on reading, mediated by multisensory integration, warrant further investigation. Additionally, considering our finding that only males with dyslexia showed deficits in multisensory integration ( Glica et al., 2024 ), sex should be considered as a potential moderating factor in future analyses. We did not test this model here due to the smaller sample size for GABA measurements.
Our findings suggest that the neural noise hypothesis, as proposed by Hancock and colleagues (2017) , does not fully explain the reading difficulties observed in dyslexia. Despite the innovative use of both EEG and MRS biomarkers to assess excitatory-inhibitory (E/I) balance, neither method provided evidence supporting an E/I imbalance in dyslexic individuals. Importantly, our study focused on adolescents and young adults, and the EEG recordings were conducted during rest and a spoken language task. These factors may limit the generalizability of our results. Future research should include younger populations and incorporate a broader array of tasks, such as reading and phonological processing, to provide a more comprehensive evaluation of the E/I balance hypothesis. Additionally, our findings are consistent with another study by Tan et al. (2022) which found no evidence for increased variability (’noise’) in behavioral and fMRI response patterns in dyslexia. Together, these results highlight the need to explore alternative neural mechanisms underlying dyslexia and suggest that cortical hyperexcitability may not be the primary cause of reading difficulties.
In conclusion, while our study challenges the neural noise hypothesis as a sole explanatory framework for dyslexia, it also underscores the complexity of the disorder and the necessity for multifaceted research approaches. By refining our understanding of the neural underpinnings of dyslexia, we can better inform future studies and develop more effective interventions for those affected by this condition.
Participants.
A total of 120 Polish participants aged between 15.09 and 24.95 years ( M = 19.47, SD = 3.06) took part in the study. This included 60 individuals with a clinical diagnosis of dyslexia issued by psychological and pedagogical counseling centers (28 females and 32 males) and 60 control participants without a history of reading difficulties (28 females and 32 males). All participants were right-handed, born at term, without any reported neurological/psychiatric diagnosis and treatment (including ADHD), without hearing impairment, with normal or corrected-to-normal vision, and IQ higher than 80 as assessed by the Polish version of the Abbreviated Battery of the Stanford-Binet Intelligence Scale-Fifth Edition (SB5) ( Roid et al., 2017 ).
The study was approved by the institutional review board at the University of Warsaw, Poland (reference number 2N/02/2021). All participants (or their parents in the case of underaged participants) provided written informed consent and received monetary remuneration for taking part in the study.
Participants’ reading skills were assessed by multiple paper-pencil tasks described in detail in our previous work ( Glica et al., 2024 ). Briefly, we evaluated words and pseudowords read in one minute ( Szczerbiński and Pelc-Pękała, 2013 ), rapid automatized naming ( Fecenec et al., 2013 ), and reading comprehension speed. We also assessed phonological awareness by a phoneme deletion task ( Szczerbiński and Pelc-Pękała, 2013 ) and spoonerisms tasks ( Bogdanowicz et al., 2016 ), as well as orthographic awareness (Awramiuk and Krasowicz-Kupis, 2013). Furthermore, we evaluated non-verbal perception speed ( Ciechanowicz and Stańczak, 2006 ) and short-term and working memory by forward and backward conditions from the Digit Span subtest from the WAIS-R ( Wechsler, 1981 ). We also assessed participants’ multisensory audiovisual integration by a redundant target effect task, the results of which have been reported in our previous work ( Glica et al., 2024 ).
EEG was recorded from 62 scalp and 2 ear electrodes using the Brain Products system (actiCHamp Plus, Brain Products GmbH, Gilching, Germany). Data were recorded in BrainVision Recorder Software (Vers. 1.22.0002, Brain Products GmbH, Gilching, Germany) with a 500 Hz sampling rate. Electrodes were positioned in line with the extended 10-20 system. Electrode Cz served as an online reference, while Fpz served as the ground electrode. All electrode impedances were kept below 10 kΩ. Participants sat in a chair with their heads on a chin-rest in a dark, sound-attenuated, and electrically shielded room while the EEG was recorded during both a 5-minute eyes-open resting state and the spoken language comprehension task. The paradigm was prepared in the Presentation software (Version 20.1, Neurobehavioral Systems, Inc., Berkeley, CA, www.neurobs.com ).
During rest, participants were instructed to relax and fixate their eyes on a white cross presented centrally on a black background. After 5 minutes, the spoken language comprehension task automatically started. The task consisted of sentences 3 to 5 words long, recorded with a speech synthesizer and presented binaurally through sound-isolating earphones. After hearing a sentence, participants were asked to indicate whether the sentence was true or false by pressing a corresponding button. In total, there were 256 sentences – 128 true (e.g., “Plants need water”) and 128 false (e.g., “Dogs can fly”).
Sentences were presented in a random order in two blocks of 128 trials. At the beginning of each trial, a white fixation cross was presented centrally on a black background for 500 ms, then a blank screen appeared for either 500, 600, 700, or 800 ms (durations set randomly and equiprobably) followed by an auditory sentence presentation. The length of sentences ranged between 1.17 and 2.78 seconds and was balanced between true ( M = 1.82 seconds, SD = 0.29) and false sentences ( M = 1.82 seconds, SD = 0.32; t (254) = -0.21, p = .835; BF 10 = 0.14). After a sentence presentation, a blank screen was displayed for 1000 ms before starting the next trial. To reduce participants’ fatigue, a 1-minute break between two blocks of trials was introduced, and it took approximately 15 minutes to complete the task.
MRI data were acquired using a Siemens 3T Trio system with a 32-channel head coil. Structural data were acquired using a whole-brain 3D T1-weighted image (MP_RAGE, TI = 1100 ms, GRAPPA parallel imaging with acceleration factor PE = 2, voxel resolution = 1 mm³, dimensions = 256×256×176). Functional data were acquired using a whole-brain echo planar imaging sequence (TE = 30 ms, TR = 1410 ms, flip angle FA = 90°, FOV = 212 mm, matrix size = 92×92, 60 axial slices 2.3 mm thick, 2.3×2.3 mm in-plane resolution, multiband acceleration factor = 3). Due to a technical issue, data from two participants were acquired with a 12-channel coil (see Supplementary Material).
The fMRI task served as a localizer for later MRS voxel placement in the language-sensitive left STS. The task was prepared using Presentation software (Version 20.1, Neurobehavioral Systems, Inc., Berkeley, CA, www.neurobs.com ) and consisted of three runs, each lasting 5 minutes and 9 seconds. Two runs involved the presentation of visual stimuli, while the third run involved auditory stimuli. In each run, stimuli were presented in 12 blocks, with 14 stimuli per block. In visual runs, there were four blocks from each category: 1) words 3 to 4 letters long, 2) the same words presented as a false font string (BACS font) ( Vidal et al., 2017 ), and 3) consonant strings 3 to 4 letters long. Similarly, in the auditory run, there were four blocks from each category: 1) words recorded with a speech synthesizer, 2) the same words presented backward, and 3) consonant strings recorded with a speech synthesizer. Stimuli within each block were presented for 800 ms with a 400 ms break in between. The duration of each block was 16.8 seconds. Between blocks, a fixation cross was displayed for 8 seconds. Participants performed a 1-back task to maintain focus. The blocks were presented in a pseudorandom order and each block included 2 to 3 repeated stimuli.
A GE 7T system with a 32-channel coil was used. Structural data were acquired using a whole-brain 3D T1-weighted image (3D-SPGR BRAVO, TI = 450 ms, TE = 2.6 ms, TR = 6.6 ms, flip angle = 12°, bandwidth = ±32.5 kHz, ARC acceleration factor PE = 2, voxel resolution = 1 mm, dimensions = 256×256×180). MRS spectra with 320 averages were acquired from the left STS using a single-voxel spectroscopy semi-LASER sequence ( Deelchand et al., 2021 ) (voxel size = 15×15×15 mm, TE = 28 ms, TR = 4000 ms, 4096 data points, water suppressed using VAPOR). Eight averages with unsuppressed water were collected as a reference.
To localize the left STS, T1-weighted images from the fMRI and MRS sessions were coregistered, and fMRI peak coordinates were used as the center of the MRS voxel. Voxels were then adjusted to include only brain tissue. During the acquisition, participants took part in a simple orthographic task.
The continuous EEG signal was preprocessed in EEGLAB ( Delorme and Makeig, 2004 ). The data were filtered between 0.5 and 45 Hz (Butterworth filter, 4th order) and re-referenced to the average of both ear electrodes. The data recorded during the break between blocks, as well as bad channels, were manually rejected. The number of rejected channels ranged between 0 and 4 ( M = 0.19, SD = 0.63). Next, independent component analysis (ICA) was applied. Components were automatically labeled by ICLabel ( Pion-Tonachini et al., 2019 ), and those classified with 50-100% source probability as eye blinks, muscle activity, heart activity, channel noise, or line noise, or with 0-50% source probability as brain activity, were excluded. Components labeled as “other” were visually inspected, and those identified as eye blinks or muscle activity were also rejected. The number of rejected components ranged between 11 and 46 ( M = 28.43, SD = 7.26). Previously rejected bad channels were interpolated using the nearest neighbor spline ( Perrin et al., 1989 , 1987 ).
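The stated block duration follows directly from the stimulus timing. The snippet below is a small arithmetic check using integer milliseconds; the variable names are illustrative.

```python
stimuli_per_block = 14
stimulus_ms = 800  # presentation time of each stimulus
gap_ms = 400       # break after each stimulus

block_ms = stimuli_per_block * (stimulus_ms + gap_ms)
print(block_ms / 1000)  # block duration in seconds

blocks_per_run = 12
categories = 3
blocks_per_category = blocks_per_run // categories
```

This reproduces the 16.8-second block duration and the four blocks per stimulus category described above.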
The preprocessed data were divided into a 5-minute resting-state signal and a signal recorded during the spoken language comprehension task using MNE ( Gramfort, 2013 ) and custom Python scripts. The signal from the task was segmented based on the event markers indicating the beginning and end of a sentence. Only trials with correct responses given between 0 and 1000 ms after the end of a sentence were included. The signal from each trial was then multiplied by a Tukey window with α = 0.01 to taper signal amplitudes at the beginning and end of the trial. This allowed a smooth concatenation of signals recorded during task trials, resulting in a continuous signal derived only from periods when participants were listening to the sentences.
The continuous signal from the resting state and the language task was epoched into 2-second-long segments. An automatic rejection criterion of ±200 μV was applied to exclude epochs with excessive amplitudes. The number of epochs retained in the analysis ranged from 140 to 150 ( M = 149.66, SD = 1.20) in the resting state condition and from 102 to 226 ( M = 178.24, SD = 28.94) in the spoken language comprehension task.
Power spectral density (PSD) for 0.5-45 Hz in 0.5 Hz increments was calculated for every artifact-free epoch using Welch’s method for 2-second-long data segments windowed with a Hamming window with no overlap. The estimated PSDs were averaged for each participant and each channel separately for the resting state condition and the language task. Aperiodic and periodic (oscillatory) components were parameterized using the FOOOF method ( Donoghue et al., 2020 ). For each PSD, we extracted parameters for the 1-43 Hz frequency range using the following settings: peak_width_limits = [1, 12], max_n_peaks = infinite, peak_threshold = 2.0, min_peak_height = 0.0, aperiodic_mode = ‘fixed’. Apart from broad-band aperiodic parameters (exponent and offset), we also extracted power, bandwidth, and the center frequency parameters for the theta (4-7 Hz), alpha (7-14 Hz), beta (14-30 Hz) and gamma (30-43 Hz) bands. Since in the majority of participants the algorithm did not find a peak above the aperiodic component in the theta and gamma bands, we calculated the results only for the alpha and beta bands. The results for periodic parameters other than beta power are reported in the Supplementary Material.
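The windowing and concatenation step can be sketched as follows. This is a single-channel illustration in plain Python rather than the authors' MNE-based script; the function names are hypothetical, and the window follows the standard Tukey (tapered cosine) definition.

```python
import math

def tukey_window(n_samples, alpha=0.01):
    """Tukey window: cosine tapers over a fraction alpha of the window, flat elsewhere."""
    if n_samples < 2 or alpha <= 0:
        return [1.0] * n_samples
    width = alpha * (n_samples - 1) / 2.0  # taper length per edge, in samples
    last = n_samples - 1
    window = []
    for n in range(n_samples):
        if n < width:            # rising taper at the start of the trial
            w = 0.5 * (1 + math.cos(math.pi * (n / width - 1)))
        elif n > last - width:   # falling taper at the end of the trial
            w = 0.5 * (1 + math.cos(math.pi * ((n - last) / width + 1)))
        else:                    # flat section, signal left unchanged
            w = 1.0
        window.append(w)
    return window

def concatenate_trials(trials, alpha=0.01):
    """Taper each trial with a Tukey window and join the trials end to end."""
    signal = []
    for trial in trials:
        window = tukey_window(len(trial), alpha)
        signal.extend(v * w for v, w in zip(trial, window))
    return signal

# Two hypothetical single-channel trials with constant amplitude.
continuous = concatenate_trials([[1.0] * 100, [2.0] * 200])
```

With α = 0.01 only about 0.5% of each trial is tapered at each edge, so nearly all of the signal is left untouched while the junctions between trials start and end near zero.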
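A minimal single-channel sketch of the epoching and rejection step is given below. The function name is hypothetical, and treating the criterion as an absolute amplitude limit (rather than a peak-to-peak limit) is an assumption, since the text does not specify which was used.

```python
def epoch_and_reject(signal_uv, sfreq=500, epoch_s=2.0, threshold_uv=200.0):
    """Split a signal into non-overlapping epochs and drop those exceeding the threshold."""
    n = int(round(sfreq * epoch_s))  # samples per epoch (1000 at 500 Hz)
    epochs = [signal_uv[i:i + n] for i in range(0, len(signal_uv) - n + 1, n)]
    # Assumption: an epoch is rejected when any sample exceeds +/- threshold_uv.
    return [e for e in epochs if max(abs(v) for v in e) <= threshold_uv]

# Hypothetical 5-minute flat recording at 500 Hz with a single 250 microvolt artifact.
recording = [0.0] * (300 * 500)
recording[1500] = 250.0
kept = epoch_and_reject(recording)
print(len(kept))
```

A 5-minute recording yields at most 150 two-second epochs, which matches the upper end of the range reported for the resting state.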
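In the 'fixed' aperiodic mode, the aperiodic component is a straight line in log-log space: log10(power) = offset − exponent × log10(frequency). The sketch below fits that line by ordinary least squares to a synthetic spectrum. It is not the FOOOF algorithm, which additionally models and removes oscillatory peaks; the function name and synthetic data are assumptions made for illustration.

```python
import math

def fit_aperiodic_fixed(freqs, power):
    """Least-squares fit of log10(power) = offset - exponent * log10(freq)."""
    xs = [math.log10(f) for f in freqs]
    ys = [math.log10(p) for p in power]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    offset = my - slope * mx  # log10 power at 1 Hz
    exponent = -slope         # steeper spectral decay gives a larger exponent
    return offset, exponent

# Synthetic peak-free spectrum over 1-43 Hz in 0.5 Hz steps.
freqs = [1 + 0.5 * i for i in range(85)]
true_offset, true_exponent = 1.0, 1.5
power = [10 ** true_offset / f ** true_exponent for f in freqs]
offset, exponent = fit_aperiodic_fixed(freqs, power)
```

On real spectra a plain line fit is biased by alpha and beta peaks, which is why a peak-aware parameterization is preferable in practice.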
Apart from the frequentist statistics, we also performed Bayesian statistics using JASP ( JASP Team, 2023 ). For Bayesian repeated measures ANOVA, we reported the Bayes Factor for the inclusion of a given effect (BF incl ) with the ’across matched model’ option, as suggested by Keysers and colleagues (2020) , calculated as a likelihood ratio of models with a presence of a specific factor to equivalent models differing only in the absence of the specific factor. For Bayesian t -tests and correlations, we reported the BF 10 value, indicating the ratio of the likelihood of an alternative hypothesis to a null hypothesis. We considered BF incl/10 > 3 and BF incl/10 < 1/3 as evidence for alternative and null hypotheses respectively, while 1/3 < BF incl/10 < 3 as the absence of evidence ( Keysers et al., 2020 ).
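The interpretation thresholds above amount to a three-way classification, sketched here with a hypothetical helper function and checked against three Bayes Factors reported in the Results.

```python
def interpret_bf(bf):
    """Classify a Bayes Factor (BF10 or BFincl) using the 3 and 1/3 thresholds."""
    if bf > 3:
        return "evidence for the alternative hypothesis"
    if bf < 1 / 3:
        return "evidence for the null hypothesis"
    return "absence of evidence"

# Values taken from the Results: Glu-offset correlation at rest, group effect
# on the Glu/GABA ratio, and GMV effect on GABA.
for bf in (6.28, 0.29, 1.64):
    print(bf, "->", interpret_bf(bf))
```

Under these thresholds a significant p-value can coincide with an inconclusive Bayes Factor, as for the GMV effect on GABA (BF incl = 1.64).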
The data were analyzed using Statistical Parametric Mapping (SPM12, Wellcome Trust Centre for Neuroimaging, London, UK) run on MATLAB R2020b (The MathWorks Inc., Natick, MA, USA). First, all functional images were realigned to the participant’s mean. Then, T1-weighted images were coregistered to functional images for each subject. Finally, fMRI data were smoothed with a 6mm isotropic Gaussian kernel.
In each subject, the left STS was localized in the native space as a cluster in the middle and posterior left superior temporal sulcus, exhibiting higher activation for visual words versus false font strings and auditory words versus backward words (logical AND conjunction) at p < .01 uncorrected. For 6 participants, the threshold was lowered to p < .05 uncorrected, while for another 6 participants, the contrast from the auditory run was changed to auditory words versus fixation cross due to a lack of activation for other contrasts.
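The logical AND conjunction can be illustrated voxel by voxel. This is a simplified sketch operating on flat lists of hypothetical uncorrected p-values, not a reproduction of the SPM12 procedure; the function name is an assumption.

```python
def conjunction_mask(p_visual, p_auditory, alpha=0.01):
    """Mark voxels that pass the threshold in both contrasts (logical AND)."""
    return [pv < alpha and pa < alpha for pv, pa in zip(p_visual, p_auditory)]

# Hypothetical p-values for three voxels in each contrast.
p_words_vs_false_font = [0.001, 0.020, 0.005]
p_words_vs_backward = [0.004, 0.001, 0.030]
mask = conjunction_mask(p_words_vs_false_font, p_words_vs_backward)
```

Only the first voxel survives, because it is the only one that exceeds the threshold in both the visual and the auditory contrast.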
We also report a group-level analysis of the fMRI data in the Supplementary Material (Tables S5-S7 and Figure S1).
MRS data were analyzed using fsl-mrs version 2.0.7 (Clarke et al., 2021). Data stored in pfile format were converted into NIfTI-MRS using the spec2nii tool. We then used the fsl_mrs_preproc function to automatically perform coil combination, frequency and phase alignment, bad average removal, combination of spectra, eddy current correction, shifting frequency to reference peak and phase correction.
To obtain the percentages of WM, GM and CSF in the voxel, we used svs_segmentation with the output of fsl_anat as input. Voxel segmentation was performed on structural images from a 3T scanner, coregistered to 7T structural images in SPM12. Next, quantitative fitting was performed using the fsl_mrs function. As a basis set, we used a collection of 27 metabolite spectra simulated using FID-A (Simpson et al., 2017) and a script tailored for our experiment. We supplemented this with synthetic macromolecule spectra provided by fsl_mrs. Signals acquired with unsuppressed water served as the water reference.
Spectra underwent quantitative assessment and visual inspection, and those with a linewidth higher than 20 Hz, %CRLB higher than 20%, and poor fit to the model were excluded from the analysis (see Table S8 in the Supplementary Material for a detailed checklist). Glu and GABA concentrations were expressed as a ratio to total creatine (tCr; Creatine + Phosphocreatine).
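A minimal sketch of how the quantitative part of this screening and the tCr ratios could be coded is given below. It is not the authors' pipeline: the `Spectrum` record, its field names, the single %CRLB value per spectrum, the choice to reject a spectrum when any one criterion fails, and all numbers are assumptions for illustration. The visual-inspection outcome is simply carried as a flag.

```python
from dataclasses import dataclass


@dataclass
class Spectrum:
    participant: str
    linewidth_hz: float
    crlb_percent: float
    good_fit: bool  # outcome of visual inspection, recorded by hand
    glu: float
    gaba: float
    tcr: float      # total creatine (Creatine + Phosphocreatine)


def passes_quality_control(s, max_linewidth_hz=20.0, max_crlb_percent=20.0):
    """Keep a spectrum only if it clears every criterion (an assumed reading)."""
    return (
        s.linewidth_hz <= max_linewidth_hz
        and s.crlb_percent <= max_crlb_percent
        and s.good_fit
    )


def metabolite_ratios(s):
    """Express Glu and GABA relative to total creatine."""
    return {"Glu/tCr": s.glu / s.tcr, "GABA/tCr": s.gaba / s.tcr}


spectra = [  # hypothetical values
    Spectrum("sub-01", 12.5, 8.0, True, 10.0, 2.0, 8.0),
    Spectrum("sub-02", 24.0, 9.0, True, 9.5, 1.8, 8.2),
    Spectrum("sub-03", 11.0, 31.0, True, 9.8, 2.1, 7.9),
]
kept = [s for s in spectra if passes_quality_control(s)]
for s in kept:
    print(s.participant, metabolite_ratios(s))
```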
Behavioral data, raw and preprocessed EEG data, second-level fMRI data, preprocessed MRS data and the Python script for the analysis of preprocessed EEG data can be found at OSF: https://osf.io/4e7ps/
This study was supported by the National Science Centre grant (2019/35/B/HS6/01763) awarded to Katarzyna Jednoróg.
We gratefully acknowledge Ralph Noeske from GE Healthcare for valuable discussions, for his support in setting up the protocol for ultra-high field MR spectroscopy, and for sharing the set-up for basis set simulation in FID-A.
© 2024, Glica et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Conditional quantile regression provides a useful statistical tool for modeling and inferring the relationship between the response and covariates in heterogeneous data. In this paper, we develop a novel testing procedure for the ultrahigh-dimensional partially linear quantile regression model to investigate the significance of ultrahigh-dimensional covariates of interest in the presence of ultrahigh-dimensional nuisance covariates. The proposed test statistic is an \(L_2\)-type statistic. We estimate the nonparametric component with flexible machine learners to handle the complexity and ultrahigh dimensionality of the considered models. We establish the asymptotic normality of the proposed test statistic under the null and local alternative hypotheses. A screening-based testing procedure is further provided to make our test more powerful in practice under the ultrahigh-dimensional regime. We evaluate the finite-sample performance of the proposed method via extensive simulation studies. A real application to a breast cancer dataset is presented to illustrate the proposed method.
Belloni, A., Chernozhukov, V.: \(l_1\)-penalized quantile regression in high-dimensional sparse models. Ann. Stat. 39 (1), 82–130 (2011)
Beyerlein, A., von Kries, R., Ness, A.R., Ong, K.K.: Genetic markers of obesity risk: stronger associations with body composition in overweight compared to normal-weight children. PLoS ONE 6 (4), e19057 (2011)
Cai, L., Guo, X., Li, G., Tan, F.: Tests for high-dimensional single-index models. Electron. J. Stat. 17 (1), 429–463 (2023)
Chen, S.X., Qin, Y.-L.: A two-sample test for high-dimensional data with applications to gene-set testing. Ann. Stat. 38 (2), 808–835 (2010)
Chen, J., Li, Q., Chen, H.Y.: Testing generalized linear models with high-dimensional nuisance parameters. Biometrika 110 (1), 83–99 (2023)
Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W., Robins, J.: Double/debiased machine learning for treatment and structural parameters. Econometr. J. 21 (1), 1–68 (2018)
Cui, H., Guo, W., Zhong, W.: Test for high-dimensional regression coefficients using refitted cross-validation variance estimation. Ann. Stat. 46 (3), 958–988 (2018)
Cui, H., Zou, F., Ling, L.: Feature screening and error variance estimation for ultrahigh-dimensional linear model with measurement errors. Commun. Math. Stat., pp. 1–33 (2023)
Dezeure, R., Bühlmann, P., Zhang, C.-H.: High-dimensional simultaneous inference with the bootstrap. TEST 26, 685–719 (2017)
Du, L., Guo, X., Sun, W., Zou, C.: False discovery rate control under general dependence by symmetrized data aggregation. J. Am. Stat. Assoc. 118 (541), 607–621 (2023)
Fan, J., Lv, J.: Sure independence screening for ultrahigh dimensional feature space. J. R. Stat. Soc. Ser. B 70 (5), 849–911 (2008)
Fan, J., Guo, S., Hao, N.: Variance estimation using refitted cross-validation in ultrahigh dimensional regression. J. R. Stat. Soc. Ser. B 74 (1), 37–65 (2012)
Guo, B., Chen, S.X.: Tests for high dimensional generalized linear models. J. R. Stat. Soc. Ser. B 78 (5), 1079–1102 (2016)
Guo, W., Zhong, W., Duan, S., Cui, H.: Conditional test for ultrahigh dimensional linear regression coefficients. Stat. Sin. 32, 1381–1409 (2022)
Hall, P., Heyde, C.C.: Martingale Limit Theory and Its Application. Academic Press, UK (2014)
Khaled, W., Lin, J., Han, Z., Zhao, Y., Hao, H.: Test for heteroscedasticity in partially linear regression models. J. Syst. Sci. Complex. 32, 1194–1210 (2019)
Koenker, R.: Quantile Regression. Cambridge University Press, Cambridge (2005)
Koenker, R., Bassett, G.: Regression quantiles. Econometrica 46 (1), 33–50 (1978)
Koenker, R., Chernozhukov, V., He, X., Peng, L.: Handbook of Quantile Regression. CRC Press, New York (2017)
Lu, W., Zhu, Z., Lian, H.: Sparse and low-rank matrix quantile estimation with application to quadratic regression. Stat. Sin. 33 (2), 945–959 (2023)
Ma, R., Cai, T., Li, H.: Global and simultaneous hypothesis testing for high-dimensional logistic regression models. J. Am. Stat. Assoc. 116 (534), 984–998 (2021)
Meinshausen, N., Meier, L., Bühlmann, P.: P-values for high-dimensional regression. J. Am. Stat. Assoc. 104 (488), 1671–1681 (2009)
Méndez Civieta, Á., Aguilera-Morillo, M.C., Lillo, R.E.: Asgl: a python package for penalized linear and quantile regression. arXiv preprint arXiv:2111.00472 (2021)
Ning, Y., Liu, H.: A general theory of hypothesis tests and confidence regions for sparse high dimensional models. Ann. Stat. 45 (1), 158–195 (2017)
Parker, J.S., Mullins, M., Cheang, M.C., Leung, S., Voduc, D., Vickery, T., Davies, S., Fauron, C., He, X., Hu, Z.: Supervised risk predictor of breast cancer based on intrinsic subtypes. J. Clin. Oncol. 27 (8), 1160–1167 (2009)
Prat, A., Bianchini, G., Thomas, M., Belousov, A., Cheang, M.C., Koehler, A., Gómez, P., Semiglazov, V., Eiermann, W., Tjulandin, S.: Research-based PAM50 subtype predictor identifies higher responses and improved survival outcomes in HER2-positive breast cancer in the NOAH study. Clin. Cancer Res. 20 (2), 511–521 (2014)
Sherwood, B., Wang, L.: Partially linear additive quantile regression in ultra-high dimension. Ann. Stat. 44 (1), 288–317 (2016)
Shi, H., Sun, B., Yang, W., Guo, X.: Tests for ultrahigh-dimensional partially linear regression models. arXiv preprint arXiv:2304.07546 (2023)
Song, X., Li, G., Zhou, Z., Wang, X., Ionita-Laza, I., Wei, Y.: QRank: a novel quantile regression tool for eQTL discovery. Bioinformatics 33 (14), 2123–2130 (2017)
Tan, F., Jiang, X., Guo, X., Zhu, L.: Testing heteroscedasticity for regression models based on projections. Stat. Sin. 31 (2), 625–646 (2021)
Tang, Y., Wang, Y., Judy Wang, H., Pan, Q.: Conditional marginal test for high dimensional quantile regression. Stat. Sin. 32, 869–892 (2022)
Wang, H.J., Zhu, Z., Zhou, J.: Quantile regression in partially linear varying coefficient models. Ann. Stat. 37 (6B), 3841–3866 (2009)
Wang, H., Jin, H., Jiang, X.: Feature selection for high-dimensional varying coefficient models via ordinary least squares projection. Commun. Math. Stat., pp. 1–42 (2023)
Wu, Y., Yin, G.: Conditional quantile screening in ultrahigh-dimensional heterogeneous data. Biometrika 102 (1), 65–76 (2015)
Yang, W., Guo, X., Zhu, L.: Score function-based tests for ultrahigh-dimensional linear models. arXiv preprint arXiv:2212.08446 (2022)
Zhang, X., Cheng, G.: Simultaneous inference for high-dimensional linear models. J. Am. Stat. Assoc. 112 (518), 757–768 (2017)
Zhang, Y., Lian, H., Yu, Y.: Ultra-high dimensional single-index quantile regression. J. Mach. Learn. Res. 21 (1), 9212–9236 (2020)
Zhong, P.-S., Chen, S.X.: Tests for high-dimensional regression coefficients with factorial designs. J. Am. Stat. Assoc. 106 (493), 260–274 (2011)
The authors would like to thank the editor, the Associate Editor, and the two anonymous reviewers for their valuable comments and constructive suggestions, which led to significant improvements in the paper. Xu Guo was supported by National Natural Science Foundation of China (Nos. 12071038, 12322112); Niwen Zhou was supported by National Natural Science Foundation of China (No. 12301331) and Natural Science Foundation of Guangdong Province (No. 2023A1515010026).
Authors and affiliations.
School of Statistics, Beijing Normal University, 19 Xinjiekouwai Street, Haidian District, 100875, Beijing, People’s Republic of China
Hongwei Shi, Weichao Yang & Xu Guo
Center for Statistics and Data Science, Beijing Normal University, 18 Jinfeng Road, Zhuhai City, 519087, Guangdong Province, People’s Republic of China
Correspondence to Xu Guo.
Conflict of interest.
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
This section contains the proofs of the theoretical results in the main paper. We first describe some additional notation used in the proofs. For two sequences of positive constants \(a_n\) and \(b_n\), we write \(a_n \lesssim b_n\) if there exist a universal constant \(c>0\) and a positive integer N independent of n such that \(a_n \le c b_n\) for all \(n \ge N\). \(a_n \gtrsim b_n\) is equivalent to \(b_n \lesssim a_n\). We use \(a_n \asymp b_n\) to denote that \(a_n \lesssim b_n\) and \(a_n \gtrsim b_n\) hold simultaneously. For a \(d\times q\)-dimensional matrix \({\textbf{M}}\), we define \(\Vert {\textbf{M}}\Vert _{F} = \{\textrm{tr}({\textbf{M}}{\textbf{M}}^\top )\}^{1/2}\) and \(\Vert {\textbf{M}}\Vert _2 = \lambda _{\max }({\textbf{M}})\). If the matrix \({\textbf{M}}\) is symmetric, \(\lambda _{\max }({\textbf{M}})\) is the maximal eigenvalue of \({\textbf{M}}\).
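For brevity, and where no confusion arises, we replace \(n_1\) and \(\hat{m}_{\tau 2}(\cdot )\) in \(U_{n1}\) with n and \(\hat{m}(\cdot )\), respectively. Then we can write \(U_{n1}\) as \(U_n\), which is given by
\[ U_n = \frac{1}{n} \sum _{i \ne j} \varphi _{\tau }(Y_i - \hat{m}({\textbf{Z}}_i))\, \varphi _{\tau }(Y_j - \hat{m}({\textbf{Z}}_j))\, {\textbf{X}}_i^{\top } {\textbf{X}}_j, \]
where \(\varphi _{\tau }(u) = \tau - I(u < 0)\).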
The above \(U_n\) indicates that we estimate \(m_{\tau }(\cdot )\) using the data \({\mathcal {D}}_2\), while the construction of the test statistic relies on the data \({\mathcal {D}}_1\). Thus, \(\{\hat{m}({\textbf{Z}}_i), i \in {\mathcal {D}}_{1}\}\) are i.i.d. random variables given the data \({\mathcal {D}}_2\). For simplicity, we write \(F_{Y}(a|{\textbf{X}}, {\textbf{Z}}) = \Pr (Y \le a | {\textbf{X}}, {\textbf{Z}})\) and \(f_{Y}(\cdot |{\textbf{X}}, {\textbf{Z}})\) as \(F_{Y}(\cdot )\) and \(f_{Y}(\cdot )\), respectively. Here \(F_{Y}(\cdot )\) is the conditional cumulative distribution function of Y given \(({\textbf{X}}, {\textbf{Z}})\), while \(f_{Y}(\cdot )\) is the conditional density function of Y given \(({\textbf{X}}, {\textbf{Z}})\). Furthermore, we denote \(m_i:=m({\textbf{Z}}_i)\), \(\hat{m}_i:=\hat{m}({\textbf{Z}}_i)\), \(F_{Y,i}:= F_Y(m_i)\), \(e_i:= F_{Y,i} - I(Y_i < m_i)\), and \(\hat{e}_i:= F_{Y,i} - I(Y_i < \hat{m}_i)\).
Before giving the proof of Theorem 2.7 , we introduce some useful technical lemmas, whose proofs are deferred to Appendix B.
Lemma 7.1. Let \({\textbf{M}}_{1}\) and \({\textbf{M}}_2\) be two \(d \times d\) positive semidefinite matrices. Suppose that \({\mathbb {E}}(\prod _{i=1}^{2}{\textbf{X}}^\top {\textbf{M}}_{i}{\textbf{X}}) \le C\prod _{i=1}^{2}\textrm{tr}({\textbf{M}}_{i}\varvec{\Sigma }_{{\textbf{X}}})\), where \({\textbf{X}}\) is a d-dimensional random vector with \({\mathbb {E}}({\textbf{X}}) = 0\) and \(\varvec{\Sigma }_{{\textbf{X}}} = {\mathbb {E}}({\textbf{X}}{\textbf{X}}^{\top })\). Let \(e \bot \!\!\!\bot {\textbf{X}}\) be a bounded random variable with mean 0 and variance \(\tau (1-\tau )\), where \(\tau \in (0, 1)\). Assume that \(\textrm{tr}(\varvec{\Sigma }^4_{{\textbf{X}}})=o\{\textrm{tr}^2(\varvec{\Sigma }_{{\textbf{X}}}^2)\}\) and \(\textrm{tr}(\varvec{\Sigma }_{{\textbf{X}}}^2)\rightarrow \infty \) as \((n, d)\rightarrow \infty \). Then the following holds.
Lemma 7.2. Under the conditions of Theorem 2.7, it can be shown that
Here \(\varvec{\Sigma }_{{\textbf{X}}} = {\mathbb {E}}({\textbf{X}}{\textbf{X}}^{\top })\) and \(\varvec{r}_{e{\textbf{X}}}={\mathbb {E}}[(\hat{e}-e) {\textbf{X}}]={\mathbb {E}}[\{I(Y<m_{\tau }({\textbf{Z}}))-I(Y<\hat{m}_{\tau }({\textbf{Z}}))\} {\textbf{X}}]\) .
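Under the null hypothesis, \(\tau = \Pr (Y_i \le m_i | {\textbf{X}}, {\textbf{Z}})=F_{Y,i}\), so \(e_i = F_{Y,i} - I(Y_i< m_i) = \tau - I(Y_i < m_i) = \varphi _{\tau }(Y_i - m_i)\). Similarly, it follows that \(\hat{e}_i = \varphi _{\tau }(Y_i - \hat{m}_i)\). Writing \(\hat{e}_i = e_i + (\hat{e}_i - e_i)\), we rewrite \(U_n\) as the sum of three terms, that is,
\[ U_n = \frac{1}{n} \sum _{i \ne j} \hat{e}_i \hat{e}_j {\textbf{X}}_i^{\top } {\textbf{X}}_j = U_1 + U_2 + U_3, \]
where \(U_1 = \frac{1}{n} \sum _{i \ne j} e_i e_j {\textbf{X}}_i^{\top } {\textbf{X}}_j\), \(U_2 = \frac{1}{n} \sum _{i \ne j} (\hat{e}_i-e_i)(\hat{e}_j-e_j) {\textbf{X}}_i^{\top } {\textbf{X}}_j\) and \(U_3 = \frac{1}{n} \sum _{i \ne j} [e_i(\hat{e}_j-e_j)+e_j(\hat{e}_i-e_i)] {\textbf{X}}_i^{\top } {\textbf{X}}_j\).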
As \(U_1\), \(U_2\) and \(U_3\) are U-statistics, we can prove the theorem based on the properties of U-statistics.
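The three-term split rests on the identity \(\hat{e}_i\hat{e}_j = e_ie_j + (\hat{e}_i-e_i)(\hat{e}_j-e_j) + e_i(\hat{e}_j-e_j) + e_j(\hat{e}_i-e_i)\), using the kernels of \(U_2\) and \(U_3\) stated in this proof. The following sketch checks it numerically on simulated data. The sample size, the dimension, the choice \(\tau = 0.5\) with standard normal Y (so that the true quantile is 0), and the noisy stand-in for \(\hat{m}\) are all assumptions made for illustration.

```python
import random


def pair_sum(a, b, X):
    """Compute (1/n) * sum over i != j of a_i * b_j * X_i^T X_j."""
    n = len(X)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                dot = sum(x * y for x, y in zip(X[i], X[j]))
                total += a[i] * b[j] * dot
    return total / n


rng = random.Random(0)
n, d, tau = 30, 5, 0.5
X = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n)]
Y = [rng.gauss(0, 1) for _ in range(n)]
m = [0.0] * n                                  # true median of N(0, 1)
m_hat = [rng.gauss(0, 0.3) for _ in range(n)]  # hypothetical noisy estimate

# Under the null, e_i = tau - I(Y_i < m_i), and likewise for e_hat.
e = [tau - (y < mi) for y, mi in zip(Y, m)]
e_hat = [tau - (y < mh) for y, mh in zip(Y, m_hat)]
diff = [eh - ei for eh, ei in zip(e_hat, e)]

U_n = pair_sum(e_hat, e_hat, X)
U_1 = pair_sum(e, e, X)
U_2 = pair_sum(diff, diff, X)
U_3 = pair_sum(e, diff, X) + pair_sum(diff, e, X)
print(abs(U_n - (U_1 + U_2 + U_3)))
```

The printed difference should be at the level of floating-point rounding.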
Denote \(u_{n,2}=\frac{1}{n-1} U_{2} = \frac{1}{n(n-1)} \sum _{i \ne j} u_{ij,2}\) with the kernel \(u_{ij,2}=(\hat{e}_i-e_i)(\hat{e}_j-e_j) {\textbf{X}}_i^{\top } {\textbf{X}}_j\), and further define \(\varvec{r}_{e{\textbf{X}}}={\mathbb {E}}[(\hat{e}-e) {\textbf{X}}]={\mathbb {E}}[\{I(Y<m_{\tau }({\textbf{Z}}))-I(Y<\hat{m}_{\tau }({\textbf{Z}}))\} {\textbf{X}}]\). We obtain that
which holds by the equality ( 7.1 ) in Lemma 7.2 .
By the Hoeffding decomposition, the variance of \(u_{n,2}\) is
where \(u_{1i,2} = {\mathbb {E}}(u_{ij,2} | {\textbf{X}}_i, {\textbf{Z}}_i, Y_i) = (\hat{e}_i-e_i) {\textbf{X}}_i^{\top } \varvec{r}_{e{\textbf{X}}}\) is the projection of \(u_{ij,2}\) to the space \(\{{\textbf{X}}_i, {\textbf{Z}}_i, Y_i\}\) . We further have
We derive that
where the first inequality holds based on the definition of variance, the second inequality follows from the Cauchy–Schwarz inequality and the third inequality holds by the inequality ( 7.2 ) in Lemma 7.2 .
Moreover, we have
where the last inequality holds because \((\hat{e}_1-e_1)\) is independent of \((\hat{e}_2-e_2)\), together with the inequality (7.3) in Lemma 7.2.
Combining equations ( 7.7 ), ( 7.8 ) and ( 7.9 ), it then follows that
where the last equality holds by Condition 2.4 along with the equalities ( 7.4 ) and ( 7.5 ) in Lemma 7.2 . Therefore, from the results ( 7.6 ) and ( 7.10 ), we conclude that
Secondly, we turn to consider \(U_3\) . Similar to the term \(U_2\) , we write \(u_{n,3}=\frac{1}{n-1} U_3 = \frac{1}{n(n-1)} \sum _{i \ne j} u_{ij,3}\) with \(u_{ij,3}=[e_i(\hat{e}_j-e_j)+e_j(\hat{e}_i-e_i)] {\textbf{X}}_i^{\top } {\textbf{X}}_j\) . We have \({\mathbb {E}}(u_{n,3})={\mathbb {E}}(u_{ij,3})=0\) ; then,
Furthermore, the projection of \(u_{ij,3}\) to the space \(\{{\textbf{X}}_i, {\textbf{Z}}_i, Y_i\}\) is
Here the last equality holds by the equality (7.4) in Lemma 7.2, while the fourth equality follows from the facts that \({\mathbb {E}}[({\textbf{X}}^{\top } \varvec{r}_{e{\textbf{X}}})^2]={\mathbb {E}}({\varvec{r}_{e{\textbf{X}}}}^{\top } {\textbf{X}}{\textbf{X}}^{\top } \varvec{r}_{e{\textbf{X}}})={\varvec{r}_{e{\textbf{X}}}}^{\top } \varvec{\Sigma }_{{\textbf{X}}} \varvec{r}_{e{\textbf{X}}}\), and
Moreover, we derive that
where the second inequality holds by the equation ( 7.14 ) and Cauchy–Schwarz inequality, the third inequality follows from the inequality ( 7.3 ) in Lemma 7.2 , while the last equality holds based on the equality ( 7.5 ) in Lemma 7.2 .
Combining equations ( 7.13 ) and ( 7.15 ), we have
and then conclude that
From the conclusions presented in ( 7.11 ) for \(U_2\) and ( 7.16 ) for \(U_3\) along with the results for \(U_1\) in Lemma 7.1 , we complete the proof, that is,
\(\square \)
The proof of the following lemma is deferred to Appendix B.
Lemma 7.3. Under the conditions of Theorem 2.8, it can be shown that
Without loss of generality, we suppose that the intercept term \(b_{\tau 0}\) within the model ( 1.1 ) is zero in the proof. Under the alternative hypotheses,
where the third equality follows from mean value theorem, and \(|{\textbf{X}}_i^{\top }\tilde{\varvec{\beta }_{\tau }}| \le |{\textbf{X}}_i^{\top } \varvec{\beta }_{\tau }|\) .
Under the alternative hypotheses, \(U_n\) can be written as
From the proof of Theorem 2.7, we have
For the term \(U_4\) , we denote \(u_{n,4}=\frac{1}{n-1} U_4 = \frac{1}{n(n-1)} \sum _{i \ne j} u_{ij,4}\) , where \(u_{ij,4} = {\textbf{X}}_i^{\top }\varvec{\beta }_{\tau } \tilde{f}_{Y,i} {\textbf{X}}_j^{\top }\varvec{\beta }_{\tau } \tilde{f}_{Y,j} {\textbf{X}}_i^{\top } {\textbf{X}}_j\) . Note that \(\tilde{f}_{Y,i} = f_Y(m_i + {\textbf{X}}_i^{\top }\tilde{\varvec{\beta }_{\tau }}) = f_Y(m_i) +f^{\prime }_{Y}(m_i + {\textbf{X}}_i^{\top } \breve{\varvec{\beta }_{\tau }}){\textbf{X}}_i^{\top }\tilde{\varvec{\beta }_{\tau }} =: f_{Y,i} + \breve{f}^{\prime }_{Y,i}{\textbf{X}}_i^{\top } \tilde{\varvec{\beta }_{\tau }}\) , where \(|{\textbf{X}}_i^{\top } \breve{\varvec{\beta }_{\tau }}| < |{\textbf{X}}_i^{\top }\tilde{\varvec{\beta }_{\tau }}|\) . It follows that
where we denote \(\varvec{\Sigma }_{{\textbf{X}}_f} = {\mathbb {E}}\{f(m_{\tau }({\textbf{Z}})){\textbf{X}}{\textbf{X}}^{\top }\}\). Further, define \(\varvec{\Sigma }_{{\textbf{X}}_{f^{\prime }}} = {\mathbb {E}}\{f^{\prime }(m_{\tau }({\textbf{Z}})){\textbf{X}}{\textbf{X}}^{\top }\}\), and we calculate that
where the first inequality holds by the fact that \(|{\textbf{X}}_i^{\top }\tilde{\varvec{\beta }_{\tau }}| \le |{\textbf{X}}_i^{\top } \varvec{\beta }_{\tau }|\) , the second inequality follows from the Cauchy–Schwarz inequality, the third inequality holds by Condition 2.2 and the last equality holds based on the condition \(\varvec{\beta }_{\tau }^\top \varvec{\Sigma }_{{\textbf{X}}}\varvec{\beta }_{\tau } = o(1)\) in \(\varvec{\beta }_{\tau } \in {\mathscr {L}}^{I}(\varvec{\beta }_{\tau })\) . Moreover, by the similar derivation as ( 7.23 ), we can obtain that
Here the second inequality follows from the fact that \(2ab \le a^2 + b^2\) , and the last inequality holds by the inequality ( 7.17 ) in Lemma 7.3 . Combining equations ( 7.21 )–( 7.24 ), we derive that
where the last equality holds by the boundedness of \(f_{Y}(\cdot )\) and \(f^{\prime }_Y(\cdot )\) in Condition 2.5.
The projection of \(u_{ij,4}\) to the space \(\{{\textbf{X}}_i, {\textbf{Z}}_i, Y_i\}\) is
Here the first inequality follows from the boundedness of \(f_{Y}(\cdot )\) in Condition 2.5. We further derive that
Here the third inequality holds by the Cauchy–Schwarz inequality, and the last inequality follows from inequalities ( 7.17 ) and ( 7.18 ) in Lemma 7.3 .
We then calculate that
where the second inequality follows from Condition 2.5 , the third inequality holds by the Cauchy–Schwarz inequality and the last inequality follows based on the inequality ( 7.17 ) in Lemma 7.3 and the inequality ( 7.3 ) in Lemma 7.2 .
Combining equations ( 7.26 ) and ( 7.27 ), we have
Here the last equality holds by the conditions \(\varvec{\beta }_{\tau }^\top \varvec{\Sigma }_{{\textbf{X}}}\varvec{\beta }_{\tau } = o(1)\) and \(\varvec{\beta }_{\tau }^\top \varvec{\Sigma }_{{\textbf{X}}}^3\varvec{\beta }_{\tau } = o\{\textrm{tr}(\varvec{\Sigma }_{{\textbf{X}}}^2)/{n}\}\) in \(\varvec{\beta }_{\tau } \in {\mathscr {L}}^{I}(\varvec{\beta }_{\tau })\) . In conclusion, from the results ( 7.25 ) and ( 7.28 ), we obtain that
Here the second equality follows from Condition 2.5 , and the last equality holds by the conditions \(\varvec{\beta }_{\tau }^\top \varvec{\Sigma }_{{\textbf{X}}}^{2}\varvec{\beta }_{\tau } = o\left\{ {\textrm{tr}(\varvec{\Sigma }_{{\textbf{X}}}^2)}/(n^2{\varvec{r}_{m{\textbf{X}}}}^\top \varvec{r}_{m{\textbf{X}}})\right\} \) in \(\varvec{\beta }_{\tau } \in {\mathscr {L}}^{I}(\varvec{\beta }_{\tau })\) and \(n{\varvec{r}_{m{\textbf{X}}}}^{\top }{\varvec{r}_{m{\textbf{X}}}} = o\{\textrm{tr}^{1/2}(\varvec{\Sigma }_{{\textbf{X}}}^2)\}\) in Condition 2.3 .
Now we turn to consider \(U_5\) . We define \(u_{n,5}= \frac{1}{n-1} U_5 = \frac{1}{n(n-1)} \sum _{i \ne j} u_{ij,5}\) , where
Observe that \({\mathbb {E}}(U_5) = {\mathbb {E}}(u_{ij,5}) = 0\) . The projection of \(u_{ij,5}\) to the space \(\{{\textbf{X}}_i, {\textbf{Z}}_i, Y_i \}\) is
Here the last inequality follows from the boundedness of \(f_{Y}(\cdot )\) in Condition 2.5. In addition,
where the last equality follows from the equation ( 7.14 ) and the last inequality holds by \(\tau \in (0,1)\) .
Additionally, it follows that
Here the second inequality holds by Condition 2.5 , and the last inequality follows from the inequality ( 7.17 ) in Lemma 7.3 and the inequality ( 7.3 ) in Lemma 7.2 .
Combining equations ( 7.30 ) and ( 7.31 ), we derive that
Here the last equality holds by the conditions \(\varvec{\beta }_{\tau }^\top \varvec{\Sigma }_{{\textbf{X}}}\varvec{\beta }_{\tau } = o(1)\) and \(\varvec{\beta }_{\tau }^\top \varvec{\Sigma }_{{\textbf{X}}}^3\varvec{\beta }_{\tau } = o\{\textrm{tr}(\varvec{\Sigma }_{{\textbf{X}}}^2)/{n}\}\) in \(\varvec{\beta }_{\tau } \in {\mathscr {L}}^{I}(\varvec{\beta }_{\tau })\) . Then, from the results \({\mathbb {E}}(U_5) = 0\) and ( 7.32 ), we have
Lastly, we consider the term \(U_6\) . Define \(u_{n,6} = \frac{1}{n-1}U_6 = \frac{1}{n(n-1)} \sum _{i \ne j} u_{ij,6}\) with the kernel
We obtain that
where the first inequality holds by Condition 2.5 , and thus,
Here the last equality follows from the equality ( 7.19 ) in Lemma 7.3 .
Note that the projection of \(u_{ij,6}\) to the space \(\{{\textbf{X}}_i, {\textbf{Z}}_i, Y_i \}\) is
where the first inequality holds by Condition 2.5 . We further derive that
where the third inequality holds by the Cauchy–Schwarz inequality, and the last inequality follows from the inequality ( 7.17 ) in Lemma 7.3 and the inequality ( 7.2 ) in Lemma 7.2 . Similarly,
Then combining equations ( 7.34 )–( 7.36 ), we have
Furthermore, similar to the derivation of the equation ( 7.31 ), we calculate that
Here the third inequality holds by Condition 2.5 , the fourth inequality follows from the Cauchy–Schwarz inequality, and the last inequality follows based on the inequality ( 7.17 ) in Lemma 7.3 and the inequality ( 7.3 ) in Lemma 7.2 .
Accordingly, combining equations ( 7.37 )–( 7.38 ), we have
where the last equality holds by conditions \(\varvec{\beta }_{\tau }^\top \varvec{\Sigma }_{{\textbf{X}}}\varvec{\beta }_{\tau } = o(1)\) and \(\varvec{\beta }_{\tau }^\top \varvec{\Sigma }_{{\textbf{X}}}^3\varvec{\beta }_{\tau } = o\{\textrm{tr}(\varvec{\Sigma }_{{\textbf{X}}}^2)/{n}\}\) in \(\varvec{\beta }_{\tau } \in {\mathscr {L}}^{I}(\varvec{\beta }_{\tau })\) , along with the equalities ( 7.4 ) and ( 7.5 ) in Lemma 7.2 . Thus, we conclude that
In sum, following the results ( 7.20 ), ( 7.29 ), ( 7.33 ) and ( 7.39 ), we verify that
Lemma 7.4. Under the conditions of Theorem 2.9, it can be shown that
Here \(\hat{\zeta } = F_{Y}(m_{\tau }({\textbf{Z}})) + {\textbf{X}}^{\top }\varvec{\beta }_{\tau } {f}_{Y}(m_{\tau }({\textbf{Z}}) + {\textbf{X}}^{\top } \tilde{\varvec{\beta }_{\tau }})-I\{Y < \hat{m}_{\tau }({\textbf{Z}})\}\) and \(\varvec{r}_{\zeta {\textbf{X}}} = {\mathbb {E}}(\hat{\zeta }{\textbf{X}})\) .
Under the alternative hypotheses, recall that
where we denote \(\hat{\zeta }_i=F_{Y,i} + {\textbf{X}}_i^{\top }\varvec{\beta }_{\tau } \tilde{f}_{Y,i}-I(Y_i<\hat{m}_i)\). Then \(u_n = \frac{1}{n-1} U_n = \frac{1}{n(n-1)} \sum _{i \ne j} u_{ij}\) is a U-statistic with the kernel \(u_{ij} = \hat{\zeta }_i\hat{\zeta }_j{\textbf{X}}_i^{\top } {\textbf{X}}_j\). We denote \(\varvec{r}_{\zeta {\textbf{X}}} = {\mathbb {E}}(\hat{\zeta }{\textbf{X}})\) and obtain that
Here the last inequality follows from the fact that \(\varvec{r}_{\zeta {\textbf{X}}} \asymp \varvec{\Sigma }_{{\textbf{X}}}\varvec{\beta }_{\tau }+\varvec{r}_{m{\textbf{X}}}\), which holds based on the boundedness of \(f_{Y}(\cdot )\) in Condition 2.5, along with the condition \(\textrm{tr}^{1/2}(\varvec{\Sigma }_{{\textbf{X}}}^2)=o\big (n\Vert \varvec{\Sigma }_{{\textbf{X}}}\varvec{\beta }_{\tau }+\varvec{r}_{m{\textbf{X}}}\Vert _2^2\big )\) in \(\varvec{\beta }_{\tau } \in {\mathscr {L}}^{II}(\varvec{\beta }_{\tau })\).
The projection of \(u_n\) to the space \(\{{\textbf{X}}_i, {\textbf{Z}}_i, Y_i\}\) is
We then derive that
Here the last inequality holds by the inequality (7.17) in Lemma 7.3. As a result, from the technical results in Lemma 7.4 and the condition \(n{\varvec{r}_{\zeta {\textbf{X}}}}^{\top }{\varvec{r}_{\zeta {\textbf{X}}}} \gg \textrm{tr}^{1/2}(\varvec{\Sigma }_{{\textbf{X}}}^2)\), we obtain that
Similarly, we calculate that
where the second inequality holds by the Cauchy–Schwarz inequality, and the third inequality holds based on the inequality ( 7.3 ) in Lemma 7.2 and the equality ( 7.40 ) in Lemma 7.4 .
Combining equations ( 7.43 )–( 7.44 ), we derive that
Following the results ( 7.42 ) and ( 7.45 ), we calculate that
From the condition \(n{\varvec{r}_{\zeta {\textbf{X}}}}^{\top }{\varvec{r}_{\zeta {\textbf{X}}}} \gg \textrm{tr}^{1/2}(\varvec{\Sigma }_{{\textbf{X}}}^2)\) , it follows that
This section contains some useful lemmas.
We define \(U_1 = \frac{1}{n}\sum _{i\ne j}e_ie_j{\textbf{X}}_i^{\top }{\textbf{X}}_j\) for brevity. Denote
Define \(S_{n k}=\sum _{i=2}^k \eta _{ni}=\frac{2}{n} \sum _{i=2}^k \sum _{j=1}^{i-1} e_i e_j {\textbf{X}}_i^{\top }{\textbf{X}}_j\), so that \(S_{nk}-S_{n(k-1)}=\eta _{n k}\), and let \({\mathscr {F}}_k = \sigma \{({\textbf{X}}_i, e_i), i=1, \ldots , k\}\). Since \({\mathbb {E}}(\eta _{nk} | {\mathscr {F}}_{k-1})=0\), the \(\eta _{nk}\) are martingale differences and \((S_{nk}, {\mathscr {F}}_{k})\) is a zero-mean martingale sequence. Define \(v_{ni} = \textrm{Var}(\eta _{ni}|{\mathscr {F}}_{i-1})\) and \(V_{n} = \sum _{i=2}^{n}v_{ni}\). Note that
Therefore, by the martingale central limit theorem [ 15 ], it is sufficient to show that the following two conditions hold.
and for all \(\iota >0\) ,
We first establish the equation ( 7.46 ). Observe that
\(S_{nn}=\frac{1}{n} \sum _{i \ne j} e_i e_j {\textbf{X}}_i^{\top }{\textbf{X}}_j\), and that \(u_{n,s}=\frac{1}{n-1} S_{n n} = \frac{1}{n(n-1)} \sum _{i \ne j} u_{ij,s}\) is a U-statistic with the kernel \(u_{i j, s}=e_i e_j {\textbf{X}}_i^{\top }{\textbf{X}}_j\). The projection of \(u_{i j, s}\) to the space \(\{{\textbf{X}}_i, e_i\}\) is \(u_{1i, s}={\mathbb {E}}(u_{i j, s} | {\textbf{X}}_i, e_i)={\mathbb {E}}(e_i e_j {\textbf{X}}_i^{\top } {\textbf{X}}_j | {\textbf{X}}_i, e_i)=0\). Furthermore, by the Hoeffding decomposition,
Then combining equations ( 7.48 ) and ( 7.49 ), we write
Now we need to show that \(R_1 {\mathop {\rightarrow }\limits ^{p}} 1\) and \(R_2 {\mathop {\rightarrow }\limits ^{p}} 0\) . It can be derived that
Here the last equality holds by the equality \({\mathbb {E}}({\textbf{X}}^{\top } \varvec{\Sigma }_{{\textbf{X}}} {\textbf{X}}) = \textrm{tr}(\varvec{\Sigma }_{{\textbf{X}}}^2)\) .
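The equality cited here follows from the cyclic property of the trace. As a short worked step, using only the definition \(\varvec{\Sigma }_{{\textbf{X}}} = {\mathbb {E}}({\textbf{X}}{\textbf{X}}^{\top })\) from the lemma statement and treating \(\varvec{\Sigma }_{{\textbf{X}}}\) as a fixed matrix:

```latex
\[
\begin{aligned}
{\mathbb {E}}({\textbf{X}}^{\top } \varvec{\Sigma }_{{\textbf{X}}} {\textbf{X}})
&= {\mathbb {E}}\{\textrm{tr}({\textbf{X}}^{\top } \varvec{\Sigma }_{{\textbf{X}}} {\textbf{X}})\}
 = {\mathbb {E}}\{\textrm{tr}(\varvec{\Sigma }_{{\textbf{X}}} {\textbf{X}}{\textbf{X}}^{\top })\} \\
&= \textrm{tr}\{\varvec{\Sigma }_{{\textbf{X}}}\, {\mathbb {E}}({\textbf{X}}{\textbf{X}}^{\top })\}
 = \textrm{tr}(\varvec{\Sigma }_{{\textbf{X}}}^2).
\end{aligned}
\]
```

The first step uses the fact that a scalar equals its own trace, and the third exchanges expectation and trace by linearity.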
Observe the fact that
where the first inequality follows from the boundedness of e, and the second inequality holds by the condition \({\mathbb {E}}(\prod _{i=1}^{2}{\textbf{X}}^\top {\textbf{M}}_{i}{\textbf{X}}) \le C\prod _{i=1}^{2}\textrm{tr}({\textbf{M}}_{i}\varvec{\Sigma }_{{\textbf{X}}})\) and the equality \({\mathbb {E}}({\textbf{X}}^{\top } \varvec{\Sigma }_{{\textbf{X}}} {\textbf{X}}) = \textrm{tr}(\varvec{\Sigma }_{{\textbf{X}}}^2)\).
Similar to the derivation of \(\textrm{Var}(R_1)\) , we obtain that
Here the last equality holds by the condition \(\textrm{tr}(\varvec{\Sigma }^4_{{\textbf{X}}})=o\{\textrm{tr}^2(\varvec{\Sigma }_{{\textbf{X}}}^2)\}\). Since \({\mathbb {E}}(R_2) = 0\), combining equations (7.50)–(7.52) with Chebyshev's inequality yields \(R_1 {\mathop {\rightarrow }\limits ^{p}} 1\) and \(R_2 {\mathop {\rightarrow }\limits ^{p}} 0\). This verifies the equation (7.46).
Next, we establish the equation ( 7.47 ). For all \(\iota > 0\) , we have
which holds by Markov's inequality. Furthermore, we obtain that
where the first inequality holds because e is bounded. The last inequality holds by equations (7.55) and (7.56) as follows,
where the last inequality holds by the inequality ( 7.3 ) in Lemma 7.2 , and similarly,
Consequently, ( 7.47 ) can also be established by combining equations ( 7.53 ) and ( 7.54 ). \(\square \)
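The martingale construction used in this proof can be checked numerically: summing the differences \(\eta _{ni}\) over \(i = 2, \ldots , n\) recovers \(S_{nn} = U_1\). The sketch below also shows an equivalent way to compute \(U_1\) without the double loop, which is a standard algebraic shortcut and not something stated in the paper. The sample size, the dimension, and the choice of e taking values ±0.5 (bounded, mean 0, variance \(\tau (1-\tau )\) for \(\tau = 0.5\)) are assumptions for illustration.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


rng = random.Random(1)
n, d = 25, 4
X = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n)]
e = [rng.choice([-0.5, 0.5]) for _ in range(n)]  # independent of X

# eta_{ni} = (2/n) * sum_{j < i} e_i e_j X_i^T X_j for i = 2..n (0-based 1..n-1).
eta = [
    2.0 / n * sum(e[i] * e[j] * dot(X[i], X[j]) for j in range(i))
    for i in range(1, n)
]
S_nn = sum(eta)

# Direct definition: U_1 = (1/n) * sum_{i != j} e_i e_j X_i^T X_j.
U_1_direct = sum(
    e[i] * e[j] * dot(X[i], X[j])
    for i in range(n) for j in range(n) if i != j
) / n

# Shortcut: ||sum_i e_i X_i||^2 minus the diagonal terms e_i^2 ||X_i||^2.
s = [sum(e[i] * X[i][k] for i in range(n)) for k in range(d)]
U_1_fast = (dot(s, s) - sum(e[i] ** 2 * dot(X[i], X[i]) for i in range(n))) / n

print(S_nn, U_1_direct, U_1_fast)
```

The shortcut costs on the order of nd operations instead of n²d, which matters when d is very large.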
We first prove the equality ( 7.1 ). By mean value theorem, we calculate that
Here \(\tilde{m}({\textbf{Z}})\) is a value between \(m_{\tau }({\textbf{Z}})\) and \(\hat{m}_{\tau }({\textbf{Z}})\) and the last inequality follows from Condition 2.5 . It is clear that \(n{\varvec{r}_{e{\textbf{X}}}}^{\top }{\varvec{r}_{e{\textbf{X}}}} = o\{\textrm{tr}^{1/2}(\varvec{\Sigma }_{{\textbf{X}}}^2)\}\) under Condition 2.3 .
Under Condition 2.2 , we derive that
which verifies the inequality ( 7.2 ). Similarly, we prove the inequality ( 7.3 ),
The first and second inequalities hold by Condition 2.2 . Furthermore, following the equality ( 7.1 ), the equality ( 7.4 ) satisfies
where the second inequality holds based on the fact that the Frobenius norm is an upper bound on the spectral norm, and the last equality follows from the equation ( 7.57 ) and Condition 2.3 .
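For reference, the norm comparison invoked here can be seen from the eigenvalues of \({\textbf{M}}{\textbf{M}}^{\top }\), which are nonnegative. The symbol \(\Vert \cdot \Vert _{\textrm{op}}\) for the spectral norm is notation introduced only for this remark. For a symmetric positive semidefinite \({\textbf{M}}\) it coincides with \(\lambda _{\max }({\textbf{M}})\).

```latex
\[
\Vert {\textbf{M}}\Vert _{\textrm{op}}^2
= \lambda _{\max }({\textbf{M}}{\textbf{M}}^{\top })
\le \sum _{k} \lambda _{k}({\textbf{M}}{\textbf{M}}^{\top })
= \textrm{tr}({\textbf{M}}{\textbf{M}}^{\top })
= \Vert {\textbf{M}}\Vert _{F}^2 .
\]
```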
Now we prove the equality ( 7.5 ). Without loss of generality, we assume \(\hat{m}_{\tau }({\textbf{Z}}) < m_{\tau }({\textbf{Z}})\) given \(({\textbf{X}}, {\textbf{Z}})\) . Similar to the derivation in the equality ( 7.57 ), we obtain that
Here the last equality holds by Condition 2.4 . \(\square \)
The inequality ( 7.17 ) can be similarly verified as inequality ( 7.2 ) and thus omitted here. By using similar arguments of the proof in ( 7.58 ), we establish the inequality ( 7.18 ),
Next, we prove the equality ( 7.19 ), and we observe that
Here the first equality holds by the condition \(\varvec{\beta }_{\tau }^\top \varvec{\Sigma }_{{\textbf{X}}}^{2}\varvec{\beta }_{\tau } = o\{{\textrm{tr}(\varvec{\Sigma }_{{\textbf{X}}}^2)}/(n^2{\varvec{r}_{m{\textbf{X}}}}^\top \varvec{r}_{m{\textbf{X}}})\}\) in \(\varvec{\beta }_{\tau } \in {\mathscr {L}}^{I}(\varvec{\beta }_{\tau })\) , and the second inequality follows from the equation ( 7.57 ). \(\square \)
Firstly, we prove the equality ( 7.40 ). Note that \(F_Y(m_{\tau }({\textbf{Z}}))\) , \(I\{Y < \hat{m}_{\tau }({\textbf{Z}})\}\) and \(f_Y(m_{\tau }({\textbf{Z}}) + {\textbf{X}}^{\top }\tilde{\varvec{\beta }_{\tau }})\) are all bounded; thus, we derive that
Here the second inequality follows by the inequality ( 7.17 ) in Lemma 7.3 , and the last equality holds based on the condition \(\varvec{\beta }_{\tau }^\top \varvec{\Sigma }_{{\textbf{X}}}\varvec{\beta }_{\tau } = O(1)\) in \(\varvec{\beta }_{\tau } \in {\mathscr {L}}^{II}(\varvec{\beta }_{\tau })\) .
Secondly, we establish the inequality (7.41). By arguments similar to those in the proof of (7.59), we obtain that
In the main text, we propose the \(L_2\) -type test statistic \(S_n\) and the screening-based test statistic \(\tilde{S}_n\) based on single data splitting. However, sample splitting may introduce additional randomness into the analysis. To better apply our methods in practice, we further propose an algorithm based on multiple data splitting. We adopt the approach in [ 22 ]. The detailed procedure is summarized in Algorithm 3.
Testing procedure for \({\mathbb {H}}_0\) based on multiple data splitting
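The steps of Algorithm 3 and the rule from [22] are not reproduced in this text, so the following is only a minimal sketch of how a multiple-data-splitting test can be organised. It assumes one common aggregation rule (twice the median of the per-split p-values, capped at 1), which may differ from the rule actually used. The names `multi_split_pvalue`, `aggregate_pvalues` and `toy_pvalue_on_split` are hypothetical, and the toy single-split test is a placeholder, not \(S_n\) or \(\tilde{S}_n\).

```python
import random
from statistics import NormalDist, mean, median


def aggregate_pvalues(pvalues):
    """Combine per-split p-values: twice the median, capped at 1 (assumed rule)."""
    return min(1.0, 2.0 * median(pvalues))


def multi_split_pvalue(data, pvalue_on_split, n_splits=50, seed=0):
    """Repeat a single-split test over random half/half splits and aggregate."""
    rng = random.Random(seed)
    indices = list(range(len(data)))
    pvalues = []
    for _ in range(n_splits):
        rng.shuffle(indices)
        half = len(indices) // 2
        first = [data[i] for i in indices[:half]]
        second = [data[i] for i in indices[half:]]
        pvalues.append(pvalue_on_split(first, second))
    return aggregate_pvalues(pvalues)


def toy_pvalue_on_split(first, second):
    """Hypothetical stand-in for a single-split test.

    The first half would normally be used to fit nuisance quantities; here it
    is ignored, and the second half drives a two-sided z-test of mean 0 with
    known standard deviation 1.
    """
    z = mean(second) * len(second) ** 0.5
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


if __name__ == "__main__":
    rng = random.Random(1)
    sample = [rng.gauss(0.0, 1.0) for _ in range(200)]
    print(multi_split_pvalue(sample, toy_pvalue_on_split))
```

Fixing the seed makes the sequence of splits reproducible; the aggregated p-value then depends on the data only through the per-split tests.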
We conduct further simulation studies to evaluate the performance of the ML estimators (based on the Lasso and neural networks) of the unknown smooth function \(m_{\tau }(\cdot )\). Following the settings in Example 4.1 of the main text, we consider the null hypothesis and generate the random error from Case 1. The estimation quality of the ML estimators is measured by the mean absolute error (MAE) over 500 repetitions, with results summarized in Table 6. The table shows that the Lasso outperforms the MLP, attaining smaller MAEs, and that the MAE decreases as the sample size n increases. Because the dimension q of the nuisance covariates \({\textbf{Z}}\) is high and the sample size n is small, the MAEs are not very small. However, the results in Table 1 show that our procedure still controls the empirical size well.
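The exact form of the MAE is not spelled out here. A minimal sketch, assuming it is the average of \(|\hat{m}_{\tau }({\textbf{Z}}_i) - m_{\tau }({\textbf{Z}}_i)|\) over the evaluation points, further averaged across repetitions, is given below. The function names are hypothetical, and the numbers in the usage lines are made up for illustration, not taken from Table 6.

```python
def mean_absolute_error(estimates, truths):
    """Average absolute difference between fitted and true function values."""
    if len(estimates) != len(truths) or not estimates:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(e - t) for e, t in zip(estimates, truths)) / len(estimates)


def average_mae(repetitions):
    """Average the per-repetition MAE over (estimates, truths) pairs."""
    if not repetitions:
        raise ValueError("at least one repetition is required")
    return sum(mean_absolute_error(e, t) for e, t in repetitions) / len(repetitions)


if __name__ == "__main__":
    # Two illustrative repetitions with three evaluation points each.
    reps = [([1.0, 2.0, 4.0], [1.0, 3.0, 2.0]), ([0.5, 0.5, 0.5], [0.0, 0.0, 0.0])]
    print(average_mae(reps))
```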
Shi, H., Yang, W., Zhou, N. et al. Inference for Partially Linear Quantile Regression Models in Ultrahigh Dimension. Commun. Math. Stat. (2024). https://doi.org/10.1007/s40304-023-00389-9
Received: 19 June 2023
Revised: 09 August 2023
Accepted: 12 November 2023
Published: 06 September 2024
DOI: https://doi.org/10.1007/s40304-023-00389-9
Z Test: Uses, Formula & Examples
Z-test Calculator | Definition | Examples
Z-test - Wikipedia
An example of how to perform a two sample z-test. Let's jump in! Two Sample Z-Test: Formula. A two sample z-test uses the following null and alternative hypotheses: \(H_0\): \(\mu_1 = \mu_2\) (the two population means are equal); \(H_A\): \(\mu_1 \ne \mu_2\) (the two population means are not equal). We use the following formula to calculate the z test statistic:
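The formula itself is cut off in the excerpt above. A minimal sketch, assuming the standard two-sample statistic with known population standard deviations, \(z = (\bar{x}_1 - \bar{x}_2)/\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}\), and a two-sided p-value from the standard normal distribution, could look as follows. The function name and the input values are illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(mean1, mean2, sigma1, sigma2, n1, n2):
    """Return (z, two-sided p-value) for H0: mu1 = mu2 with known sigmas."""
    standard_error = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    z = (mean1 - mean2) / standard_error
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


if __name__ == "__main__":
    z, p = two_sample_z_test(mean1=10.0, mean2=8.0, sigma1=2.0, sigma2=2.0, n1=8, n2=8)
    print(f"z = {z:.3f}, p = {p:.4f}")
```

With these inputs the standard error is 1, so z equals the difference in sample means.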
Z Test: Definition & Two Proportion Z-Test
One Sample Z-Test: Definition, Formula, and Example
Null hypothesis: All adults sleep 7 hours a day. Alternative hypothesis: All adults do not sleep 7 hours a day. Great, now that we know what hypothesis testing is, when to apply the z-test, and the orientations of the hypotheses according to the alternative hypothesis, it's time to see a couple of examples. Let's go for it! Example
The actual test begins by considering two hypotheses. They are called the null hypothesis and the alternative hypothesis. These hypotheses contain opposing viewpoints. H 0, the null hypothesis: a statement of no difference between sample means or proportions, or no difference between a sample mean or proportion and a population mean or proportion. In other words, the difference equals 0.
Z-Test for Statistical Hypothesis Testing Explained
Null & Alternative Hypotheses | Definitions, Templates & ...
Chapter 10: Hypothesis Testing with Z. This chapter lays out the basic logic and process of hypothesis testing using z. To compute a test statistic using z, we use the z formula from Chapter 8 and data from a sample mean to make an inference about a population.
The actual test begins by considering two hypotheses. They are called the null hypothesis and the alternative hypothesis. These hypotheses contain opposing viewpoints. \(H_0\): The null hypothesis: It is a statement of no difference between the variables; they are not related. This can often be considered the status quo, and as a result, if you cannot accept the null, it requires some action.
Hypothesis Testing: Upper-, Lower, and Two Tailed Tests
Approximate Hypothesis Tests: the z Test and the t Test . This chapter presents two common tests of the hypothesis that a population mean equals a particular value and of the hypothesis that two population means are equal: the z test and the t test. These tests are approximate: They are based on approximations to the probability distribution of the test statistic when the null hypothesis is ...
A z test is used to check whether the means of two populations differ, provided the data follow a normal distribution. For this purpose, the null hypothesis and the alternative hypothesis must be set up and the value of the z test statistic must be calculated.
Z-test : Formula, Types, Examples - GeeksforGeeks
In a hypothesis test, sample data is evaluated in order to arrive at a decision about some type of claim. If certain conditions about the sample are satisfied, then the claim can be evaluated for a population. In a hypothesis test, we: Evaluate the null hypothesis, typically denoted with \(H_{0}\). The null is not rejected unless the hypothesis ...
Formulation: The null and alternative hypotheses are formulated based on the research question and the hypothesis the researcher seeks to test. Testing: Statistical tests are performed to determine whether there is enough evidence to reject the null hypothesis in favor of the alternative hypothesis. Decision-making: The decision to accept or reject the null hypothesis is based on the results ...
A Z-test is a type of statistical hypothesis test used to test the mean of a normally distributed test statistic. It tests whether there is a significant difference between an observed population mean and the population mean under the null hypothesis, H 0. A Z-test can only be used when the population variance is known (or can be estimated with ...
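As a hedged illustration of the known-variance case described above, the sketch below computes the one-sample statistic \(z = (\bar{x} - \mu_0)/(\sigma/\sqrt{n})\) and a p-value for a two-sided or one-sided alternative. The function name, the `alternative` labels and the input values are assumptions made for this example.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu0, sigma, n, alternative="two-sided"):
    """Return (z, p-value) for H0: mu = mu0 when sigma is known."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    cdf = NormalDist().cdf
    if alternative == "two-sided":
        p_value = 2.0 * (1.0 - cdf(abs(z)))
    elif alternative == "greater":  # Ha: mu > mu0
        p_value = 1.0 - cdf(z)
    elif alternative == "less":  # Ha: mu < mu0
        p_value = cdf(z)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return z, p_value


if __name__ == "__main__":
    z, p = one_sample_z_test(sample_mean=103.0, mu0=100.0, sigma=12.0, n=16)
    print(f"z = {z:.3f}, two-sided p = {p:.4f}")
```

When the population standard deviation has to be estimated from a small sample, a t-test is the usual alternative.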
10.1 - Z-Test: When Population Variance is Known
The null hypothesis \(\left(H_{0}\right)\) is a statement about the comparisons, e.g., between a sample statistic and the population, or between two treatment groups. The former is referred to as a one-sample test whereas the latter is called a two-sample test. The null hypothesis is typically "no statistical difference" between the ...
1.1 Formulating null and alternative hypotheses. In the world of scientific inquiry, you often begin with a null hypothesis (H 0 ), which expresses the currently accepted value for a parameter in the population. The alternative hypothesis (H a ), on the other hand, is the opposite of the null hypothesis and challenges the currently accepted value. To illustrate this concept of null and ...
When the null hypothesis is true, the estimates of effect size are 0.1 and 0.21 if correlation with the selection measure is .2 or .4, respectively. Note that when the null is true, then "power" indicates the false-positive rate, which should be around .05.
For Bayesian t-tests and correlations, we reported the BF 10 value, indicating the ratio of the likelihood of an alternative hypothesis to a null hypothesis. We considered BF incl/10 > 3 and BF incl/10 < 1/3 as evidence for alternative and null hypotheses respectively, while 1/3 < BF incl/10 < 3 as the absence of evidence (Keysers et al., 2020 ...
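The thresholds quoted above can be written as a small decision rule. This is a sketch only: the function name and the labels are hypothetical, and values exactly equal to 3 or 1/3 are treated here as inconclusive, which the excerpt does not specify.

```python
def classify_bayes_factor(bf10, threshold=3.0):
    """Label a Bayes factor BF10 using the > 3 and < 1/3 cut-offs."""
    if bf10 <= 0:
        raise ValueError("a Bayes factor must be positive")
    if bf10 > threshold:
        return "evidence for the alternative"
    if bf10 < 1.0 / threshold:
        return "evidence for the null"
    return "absence of evidence"


if __name__ == "__main__":
    for value in (5.0, 1.0, 0.2):
        print(value, classify_bayes_factor(value))
```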
We establish the asymptotic normality of the proposed test statistic under the null and local alternative hypotheses. A screening-based testing procedure is further provided to make our test more powerful in practice under the ultrahigh-dimensional regime. ... Ma, R., Cai, T., Li, H.: Global and simultaneous hypothesis testing for high ...