4.3 One-Sample Inference
4.3.1 For Single Mean
Consider a scenario where

$$Y_i \overset{\text{i.i.d.}}{\sim} N(\mu, \sigma^2),$$

where i.i.d. stands for "independent and identically distributed." This model can be expressed as:

$$Y_i = \mu + \epsilon_i,$$

where:

- $\epsilon_i \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)$,
- $E(Y_i) = \mu$,
- $\text{Var}(Y_i) = \sigma^2$,
- $\bar{y} \sim N(\mu, \sigma^2/n)$.

When $\sigma^2$ is estimated by $s^2$, the standardized test statistic follows a $t$-distribution:

$$\frac{\bar{y} - \mu}{s/\sqrt{n}} \sim t_{n-1}.$$
A $100(1-\alpha)\%$ confidence interval for $\mu$ is obtained from

$$1 - \alpha = P\left(-t_{\alpha/2; n-1} \le \frac{\bar{y} - \mu}{s/\sqrt{n}} \le t_{\alpha/2; n-1}\right),$$

or equivalently,

$$P\left(\bar{y} - t_{\alpha/2; n-1} \frac{s}{\sqrt{n}} \le \mu \le \bar{y} + t_{\alpha/2; n-1} \frac{s}{\sqrt{n}}\right) = 1 - \alpha.$$

The confidence interval is expressed as:

$$\bar{y} \pm t_{\alpha/2; n-1} \frac{s}{\sqrt{n}},$$

where $s/\sqrt{n}$ is the standard error of $\bar{y}$. If the experiment were repeated many times, $100(1-\alpha)\%$ of these intervals would contain $\mu$.
| Case | $100(1-\alpha)\%$ Confidence Interval | Sample Size (margin of error $d$) | Hypothesis Test Statistic |
|---|---|---|---|
| $\sigma^2$ known, $X$ normal (or $n \ge 25$) | $\bar{X} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$ | $n \approx \frac{z_{\alpha/2}^2 \sigma^2}{d^2}$ | $z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$ |
| $\sigma^2$ unknown, $X$ normal (or $n \ge 25$) | $\bar{X} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$ | $n \approx \frac{z_{\alpha/2}^2 s^2}{d^2}$ | $t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$ |
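As a quick sketch of the $\sigma^2$-unknown row, the interval $\bar{y} \pm t_{\alpha/2; n-1}\, s/\sqrt{n}$ can be computed directly in R (simulated data; the seed, mean, and sd below are arbitrary) and checked against `t.test`:

```r
# Manual t-based confidence interval for a single mean (simulated data)
set.seed(1)
y <- rnorm(30, mean = 5, sd = 2)        # hypothetical sample
n <- length(y)
alpha <- 0.05
se <- sd(y) / sqrt(n)                   # standard error of the mean
t_crit <- qt(1 - alpha / 2, df = n - 1) # t_{alpha/2; n-1}
c(lower = mean(y) - t_crit * se,
  upper = mean(y) + t_crit * se)
# The same interval is returned by t.test(y)$conf.int
```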
4.3.1.1 Power in Hypothesis Testing
The power $\pi(\mu)$ of a hypothesis test is the probability of correctly rejecting the null hypothesis ($H_0$) when it is false (i.e., when the alternative hypothesis $H_A$ is true). Formally, it is expressed as:

$$\text{Power} = \pi(\mu) = 1 - \beta = P(\text{test rejects } H_0 \mid \mu) = P(\text{test rejects } H_0 \mid H_A \text{ is true}),$$

where $\beta$ is the probability of a Type II error (failing to reject $H_0$ when it is false).
To calculate this probability:
- Under $H_0$: the distribution of the test statistic is centered around the null parameter (e.g., $\mu_0$).
- Under $H_A$: the test statistic is distributed differently, shifted according to the true value under $H_A$ (e.g., $\mu_1$).
Hence, to evaluate the power, it is crucial to determine the distribution of the test statistic under the alternative hypothesis, HA.
Below, we derive the power for both one-sided and two-sided z-tests.
4.3.1.1.1 One-Sided z-Test
Consider the hypotheses:
$$H_0: \mu \le \mu_0 \quad \text{vs.} \quad H_A: \mu > \mu_0$$
The power for a one-sided z-test is derived as follows:
- The test rejects $H_0$ if $\bar{y} > \mu_0 + z_\alpha \frac{\sigma}{\sqrt{n}}$, where $z_\alpha$ is the critical value for the test at significance level $\alpha$.
- Under the alternative hypothesis, the distribution of $\bar{y}$ is centered at $\mu$, with standard deviation $\sigma/\sqrt{n}$.
- The power is then:

$$
\begin{aligned}
\pi(\mu) &= P\left(\bar{y} > \mu_0 + z_\alpha \frac{\sigma}{\sqrt{n}} \;\Big|\; \mu\right) \\
&= P\left(Z > z_\alpha + \frac{\mu_0 - \mu}{\sigma/\sqrt{n}}\right), \quad \text{where } Z = \frac{\bar{y} - \mu}{\sigma/\sqrt{n}} \\
&= 1 - \Phi\left(z_\alpha + \frac{(\mu_0 - \mu)\sqrt{n}}{\sigma}\right) \\
&= \Phi\left(-z_\alpha + \frac{(\mu - \mu_0)\sqrt{n}}{\sigma}\right).
\end{aligned}
$$

Here, the last step uses the symmetry of the standard normal distribution: $1 - \Phi(x) = \Phi(-x)$.
Suppose we wish to show that the mean response $\mu$ under the treatment is higher than the mean response $\mu_0$ without treatment (i.e., the treatment effect $\delta = \mu - \mu_0$ is large). Since power is an increasing function of $\mu - \mu_0$, it suffices to find the sample size $n$ that achieves the desired power $1 - \beta$ at $\mu = \mu_0 + \delta$. The power at $\mu = \mu_0 + \delta$ is:

$$\pi(\mu_0 + \delta) = \Phi\left(-z_\alpha + \frac{\delta \sqrt{n}}{\sigma}\right) = 1 - \beta.$$

Given $\Phi(z_\beta) = 1 - \beta$, we have:

$$-z_\alpha + \frac{\delta \sqrt{n}}{\sigma} = z_\beta.$$

Solving for $n$, we obtain:

$$n = \left(\frac{(z_\alpha + z_\beta)\,\sigma}{\delta}\right)^2.$$
Larger sample sizes are required when:

- the sample variability is large ($\sigma$ is large),
- the significance level $\alpha$ is small ($z_\alpha$ is large),
- the desired power $1 - \beta$ is large ($z_\beta$ is large),
- the magnitude of the effect is small ($\delta$ is small).
In practice, $\delta$ and $\sigma$ are often unknown. To estimate $\sigma$, you can:

- Use prior studies or pilot studies.
- Approximate $\sigma$ from the anticipated range of the observations (excluding outliers). For normally distributed data, dividing the range by 4 provides a reasonable estimate of $\sigma$.
These considerations ensure the test is adequately powered to detect meaningful effects while balancing practical constraints such as sample size.
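As a sketch, the sample-size formula above translates directly into R; all planning inputs below ($\alpha$, $\beta$, $\sigma$, $\delta$) are hypothetical values:

```r
# Sample size for a one-sided z-test: n = ((z_alpha + z_beta) * sigma / delta)^2
alpha <- 0.05   # significance level
beta  <- 0.20   # 1 - desired power
sigma <- 10     # assumed population SD (e.g., from a pilot study)
delta <- 5      # smallest treatment effect worth detecting
n <- ((qnorm(1 - alpha) + qnorm(1 - beta)) * sigma / delta)^2
ceiling(n)      # round up to guarantee the desired power
```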
4.3.1.1.2 Two-Sided z-Test
For a two-sided test, the hypotheses are:
$$H_0: \mu = \mu_0 \quad \text{vs.} \quad H_A: \mu \neq \mu_0$$

The test rejects $H_0$ if $\bar{y}$ lies outside the interval $\mu_0 \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$. The power of the test is:

$$
\begin{aligned}
\pi(\mu) &= P\left(\bar{y} < \mu_0 - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \;\Big|\; \mu\right) + P\left(\bar{y} > \mu_0 + z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \;\Big|\; \mu\right) \\
&= \Phi\left(-z_{\alpha/2} + \frac{(\mu - \mu_0)\sqrt{n}}{\sigma}\right) + \Phi\left(-z_{\alpha/2} - \frac{(\mu - \mu_0)\sqrt{n}}{\sigma}\right).
\end{aligned}
$$
To ensure a power of $1 - \beta$ when the treatment effect $\delta = |\mu - \mu_0|$ is at least a certain value, we solve for $n$. Since the power function for a two-sided test is increasing and symmetric in $|\mu - \mu_0|$, it suffices to find $n$ such that the power equals $1 - \beta$ at $\mu = \mu_0 + \delta$. Ignoring the negligible contribution of the far tail $\Phi\left(-z_{\alpha/2} - \delta\sqrt{n}/\sigma\right)$, this gives:

$$n = \left(\frac{(z_{\alpha/2} + z_\beta)\,\sigma}{\delta}\right)^2$$
Alternatively, the required sample size can be determined using a confidence interval approach. For a two-sided $\alpha$-level confidence interval of the form

$$\bar{y} \pm D,$$

where $D = z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$, solving for $n$ gives:

$$n = \left(\frac{z_{\alpha/2}\, \sigma}{D}\right)^2.$$

This value should be rounded up to the nearest integer to ensure the required precision.
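A minimal sketch of this precision-based calculation (with hypothetical $\sigma$ and $D$):

```r
# Sample size from the confidence-interval approach: n = (z_{alpha/2} * sigma / D)^2
alpha <- 0.05
sigma <- 10   # assumed SD (e.g., from a pilot study)
D     <- 2    # desired half-width of the interval
ceiling((qnorm(1 - alpha / 2) * sigma / D)^2)
```

The `t.test` example that follows computes such an interval from data.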
# Generate random data and compute a 95% confidence interval
data <- rnorm(100) # Generate 100 random values
t.test(data, conf.level = 0.95) # Perform t-test with 95% confidence interval
#>
#> One Sample t-test
#>
#> data: data
#> t = -1.3809, df = 99, p-value = 0.1704
#> alternative hypothesis: true mean is not equal to 0
#> 95 percent confidence interval:
#> -0.33722662 0.06046365
#> sample estimates:
#> mean of x
#> -0.1383815
For a one-sided hypothesis test, such as testing $H_0: \mu \ge 30$ versus $H_a: \mu < 30$:
# Perform one-sided t-test
t.test(data, mu = 30, alternative = "less")
#>
#> One Sample t-test
#>
#> data: data
#> t = -300.74, df = 99, p-value < 2.2e-16
#> alternative hypothesis: true mean is less than 30
#> 95 percent confidence interval:
#> -Inf 0.02801196
#> sample estimates:
#> mean of x
#> -0.1383815
4.3.1.1.3 z-Test Summary
- For one-sided tests:

$$\pi(\mu) = \Phi\left(-z_\alpha + \frac{(\mu - \mu_0)\sqrt{n}}{\sigma}\right)$$

- For two-sided tests:

$$\pi(\mu) = \Phi\left(-z_{\alpha/2} + \frac{(\mu - \mu_0)\sqrt{n}}{\sigma}\right) + \Phi\left(-z_{\alpha/2} - \frac{(\mu - \mu_0)\sqrt{n}}{\sigma}\right)$$
Factors Affecting Power
- Effect Size ($\mu - \mu_0$): Larger differences between $\mu$ and $\mu_0$ increase power.
- Sample Size ($n$): Larger $n$ reduces the standard error, increasing power.
- Variance ($\sigma^2$): Smaller variance increases power.
- Significance Level ($\alpha$): Increasing $\alpha$ (making the test more liberal) lowers the critical value $z_\alpha$, increasing power.
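These factors can be visualized by plotting the two-sided power function derived above; the sketch below uses hypothetical values for $\mu_0$, $\sigma$, $n$, and $\alpha$:

```r
# Two-sided z-test power curve pi(mu), illustrating the factors above
mu0 <- 0; sigma <- 1; n <- 25; alpha <- 0.05   # hypothetical values
mu <- seq(-1, 1, by = 0.01)
z  <- qnorm(1 - alpha / 2)
shift <- (mu - mu0) * sqrt(n) / sigma
power <- pnorm(-z + shift) + pnorm(-z - shift)
plot(mu, power, type = "l",
     xlab = expression(mu), ylab = expression(pi(mu)))
abline(h = alpha, lty = 2)  # power equals alpha at mu = mu0
```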
4.3.1.1.4 One-Sample t-test
In hypothesis testing, calculating the power and determining the required sample size for t-tests are more complex than for z-tests. This complexity arises from the involvement of the Student’s t-distribution and its generalized form, the non-central t-distribution.
The power function for a one-sample t-test can be expressed as:
$$\pi(\mu) = P\left(\frac{\bar{y} - \mu_0}{s/\sqrt{n}} > t_{n-1;\alpha} \;\Big|\; \mu\right)$$
Here:

- $\mu_0$ is the hypothesized population mean under the null hypothesis,
- $\bar{y}$ is the sample mean,
- $s$ is the sample standard deviation,
- $n$ is the sample size,
- $t_{n-1;\alpha}$ is the critical t-value from the Student's t-distribution with $n-1$ degrees of freedom at significance level $\alpha$.

When $\mu > \mu_0$ (i.e., $\delta = \mu - \mu_0 > 0$), the random variable

$$T = \frac{\bar{y} - \mu_0}{s/\sqrt{n}}$$

does not follow the Student's t-distribution. Instead, it follows a non-central t-distribution with:

- non-centrality parameter $\lambda = \delta\sqrt{n}/\sigma$, where $\sigma$ is the population standard deviation,
- $n - 1$ degrees of freedom.
Key Properties of the Power Function
- The power $\pi(\mu)$ is an increasing function of the non-centrality parameter $\lambda$.
- For $\delta = 0$ (i.e., when the null hypothesis is true), the non-central t-distribution simplifies to the regular Student's t-distribution.
To calculate the power in practice, numerical procedures (see below) or precomputed charts are typically required.
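In R, the non-central t-distribution is available through the `ncp` argument of `pt()`, so the exact power of a one-sided one-sample t-test can be sketched as follows (all inputs are hypothetical):

```r
# Exact power of a one-sided one-sample t-test via the non-central t-distribution
alpha <- 0.05
n     <- 20
delta <- 0.5   # mu - mu0
sigma <- 1
lambda <- delta * sqrt(n) / sigma      # non-centrality parameter
t_crit <- qt(1 - alpha, df = n - 1)    # critical value under H0
power  <- 1 - pt(t_crit, df = n - 1, ncp = lambda)
power
```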
Approximate Sample Size Adjustment for t-tests
When planning a study, researchers often start with an approximation based on z-tests and then adjust for the specifics of the t-test. Here’s the process:
1. Start with the sample size for a z-test. For a two-sided test:

$$n_z = \frac{(z_{\alpha/2} + z_\beta)^2 \sigma^2}{\delta^2},$$

where:

- $z_{\alpha/2}$ is the critical value from the standard normal distribution for a two-tailed test,
- $z_\beta$ corresponds to the desired power $1 - \beta$,
- $\delta$ is the effect size $\mu - \mu_0$,
- $\sigma$ is the population standard deviation.
2. Adjust for the t-distribution. Let $v = n - 1$, where $n$ is the sample size derived from the z-test. For a two-sided t-test, the approximate sample size is:

$$n^* = \frac{(t_{v;\alpha/2} + t_{v;\beta})^2 \sigma^2}{\delta^2},$$

where $t_{v;\alpha/2}$ and $t_{v;\beta}$ are the critical values from the Student's t-distribution for the significance level $\alpha$ and the desired power, respectively. Since $v$ depends on $n^*$, this process may require iterative refinement, as sketched below.
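A sketch of this iteration in R (planning inputs are hypothetical):

```r
# Iterative refinement: start from the z-test n, then update with t quantiles
alpha <- 0.05; beta <- 0.20; sigma <- 10; delta <- 5
n <- ceiling(((qnorm(1 - alpha / 2) + qnorm(1 - beta)) * sigma / delta)^2)
for (i in 1:20) {                      # iterate until n stabilizes
  v <- n - 1
  n_new <- ceiling(((qt(1 - alpha / 2, v) + qt(1 - beta, v)) * sigma / delta)^2)
  if (n_new == n) break
  n <- n_new
}
n
```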
Notes:
- Approximations: The above formulas provide an intuitive starting point but may require adjustments based on exact numerical solutions.
- Insights: Power increases with the effect size $\delta$ and the sample size $n$, and decreases with the population variability $\sigma$.
# Example: Power calculation for a one-sample t-test
library(pwr)
# Parameters
effect_size <- 0.5 # Cohen's d
alpha <- 0.05 # Significance level
power <- 0.8 # Desired power
# Compute sample size
sample_size <-
pwr.t.test(
d = effect_size,
sig.level = alpha,
power = power,
type = "one.sample"
)$n
# Print result
cat("Required sample size for one-sample t-test:",
ceiling(sample_size),
"\n")
#> Required sample size for one-sample t-test: 34
# Power calculation for a given sample size
calculated_power <-
pwr.t.test(
n = ceiling(sample_size),
d = effect_size,
sig.level = alpha,
type = "one.sample"
)$power
cat("Achieved power with computed sample size:",
calculated_power,
"\n")
#> Achieved power with computed sample size: 0.8077775
4.3.2 For Difference of Means, Independent Samples
| Case | $100(1-\alpha)\%$ Confidence Interval | Hypothesis Test Statistic | Notes |
|---|---|---|---|
| $\sigma_1^2, \sigma_2^2$ known | $\bar{X}_1 - \bar{X}_2 \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$ | $z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$ | |
| Variances unknown, assumed EQUAL | $\bar{X}_1 - \bar{X}_2 \pm t_{\alpha/2}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$ | $t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_0}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$ | Pooled variance: $s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$; degrees of freedom: $\gamma = n_1 + n_2 - 2$ |
| Variances unknown, assumed UNEQUAL | $\bar{X}_1 - \bar{X}_2 \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$ | $t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$ | Degrees of freedom (Satterthwaite): $\gamma = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$ |
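In R, both versions of the test in the table are available through `t.test`; a minimal sketch with simulated data:

```r
# Pooled vs. Welch two-sample t-tests on simulated data
set.seed(1)
x1 <- rnorm(15, mean = 10, sd = 2)  # hypothetical group 1
x2 <- rnorm(20, mean = 12, sd = 3)  # hypothetical group 2
t.test(x1, x2, var.equal = TRUE)    # pooled variance, df = n1 + n2 - 2
t.test(x1, x2)                      # Welch (default), Satterthwaite df
```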
4.3.3 For Difference of Means, Paired Samples
| Metric | Formula |
|---|---|
| Confidence Interval | $\bar{D} \pm t_{\alpha/2} \frac{s_d}{\sqrt{n}}$ |
| Hypothesis Test Statistic | $t = \frac{\bar{D} - D_0}{s_d/\sqrt{n}}$ |
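A minimal sketch of the paired test in R, with simulated before/after measurements:

```r
# Paired t-test on simulated before/after data
set.seed(1)
before <- rnorm(12, mean = 100, sd = 10)        # hypothetical baseline
after  <- before + rnorm(12, mean = 3, sd = 4)  # hypothetical shift
t.test(after, before, paired = TRUE)
# Equivalent to a one-sample t-test on the differences:
# t.test(after - before)
```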
4.3.4 For Difference of Two Proportions
The difference between two population proportions is estimated by the difference in sample proportions:

$$\hat{p}_1 - \hat{p}_2$$

The variance of this difference is:

$$\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}$$

A $100(1-\alpha)\%$ confidence interval for the difference in proportions is calculated as:

$$\hat{p}_1 - \hat{p}_2 \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}},$$

where:

- $z_{\alpha/2}$: the critical value from the standard normal distribution,
- $\hat{p}_1$, $\hat{p}_2$: sample proportions,
- $n_1$, $n_2$: sample sizes.
Sample Size for a Desired Confidence Level and Margin of Error

To achieve a margin of error $d$ at a given confidence level, the required sample size per group can be estimated as follows:

- With prior estimates of $\hat{p}_1$ and $\hat{p}_2$: $$n \approx \frac{z_{\alpha/2}^2 \left[\hat{p}_1(1-\hat{p}_1) + \hat{p}_2(1-\hat{p}_2)\right]}{d^2}$$
- Without prior estimates (assuming maximum variability, $\hat{p} = 0.5$): $$n \approx \frac{z_{\alpha/2}^2}{2d^2}$$
Hypothesis Testing for Difference in Proportions

The test statistic depends on the hypothesized difference:

- When $(p_1 - p_2)_0 \neq 0$: $$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)_0}{\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}}$$
- When $(p_1 - p_2)_0 = 0$ (testing equality of proportions): $$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$

where $\hat{p}$ is the pooled sample proportion:

$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}$$
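In R, the pooled test of equality corresponds to `prop.test`; with the continuity correction turned off, its chi-squared statistic equals $z^2$ from the formula above. The counts below are hypothetical:

```r
# Testing equality of two proportions with hypothetical counts
x <- c(40, 25)   # successes in each group
n <- c(100, 90)  # sample sizes
prop.test(x, n, correct = FALSE)  # chi-squared statistic equals z^2
```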
4.3.5 For Single Proportion
The $100(1-\alpha)\%$ confidence interval for a population proportion $p$ is:

$$\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

Sample Size Determination

- With a prior estimate $\hat{p}$: $$n \approx \frac{z_{\alpha/2}^2\, \hat{p}(1-\hat{p})}{d^2}$$
- Without a prior estimate: $$n \approx \frac{z_{\alpha/2}^2}{4d^2}$$

The test statistic for $H_0: p = p_0$ is:

$$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}$$
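A minimal sketch in R with hypothetical counts; `prop.test` gives the large-sample test, `binom.test` the exact one, and the last line applies the no-prior-estimate sample-size formula:

```r
# One-sample proportion test, exact test, and sample size for margin of error d
x <- 45; n <- 100; p0 <- 0.5              # hypothetical counts and null value
prop.test(x, n, p = p0, correct = FALSE)  # large-sample test
binom.test(x, n, p = p0)                  # exact test for small samples
d <- 0.03                                 # desired margin of error
ceiling(qnorm(0.975)^2 / (4 * d^2))       # n without a prior estimate of p
```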
4.3.6 For Single Variance
For a sample variance $s^2$ based on $n$ observations, the $100(1-\alpha)\%$ confidence interval for the population variance $\sigma^2$ follows from

$$
\begin{aligned}
1 - \alpha &= P\left(\chi^2_{1-\alpha/2; n-1} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{\alpha/2; n-1}\right) \\
&= P\left(\frac{(n-1)s^2}{\chi^2_{\alpha/2; n-1}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2; n-1}}\right),
\end{aligned}
$$

where $\chi^2_{\alpha/2; n-1}$ denotes the upper $\alpha/2$ critical value of the chi-squared distribution with $n-1$ degrees of freedom. Equivalently, the confidence interval can be written as:

$$\left(\frac{(n-1)s^2}{\chi^2_{\alpha/2}}, \; \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}\right)$$

To find confidence limits for $\sigma$, compute the square root of the interval bounds:

$$\text{Confidence interval for } \sigma: \left(\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2}}}, \; \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}}\right)$$
Hypothesis Testing for Variance
The test statistic for testing a null hypothesis about a population variance ($\sigma_0^2$) is:

$$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$$
This test statistic follows a chi-squared distribution with n−1 degrees of freedom under the null hypothesis.
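Base R has no built-in one-sample variance test, so the interval and test statistic can be computed directly; a sketch with simulated data and a hypothetical $\sigma_0^2$:

```r
# Chi-squared CI and test for a single variance (simulated data)
set.seed(1)
y <- rnorm(25, sd = 2)
n <- length(y); s2 <- var(y)
alpha <- 0.05; sigma0_sq <- 4   # hypothesized variance under H0
# 95% CI for sigma^2 (upper-tail chi-squared quantile in the lower bound)
ci <- (n - 1) * s2 / qchisq(c(1 - alpha / 2, alpha / 2), df = n - 1)
ci          # (lower, upper); take sqrt(ci) for a CI on sigma
# Test statistic and two-sided p-value
chisq_stat <- (n - 1) * s2 / sigma0_sq
p_val <- 2 * min(pchisq(chisq_stat, n - 1), 1 - pchisq(chisq_stat, n - 1))
c(statistic = chisq_stat, p.value = p_val)
```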
4.3.7 Non-parametric Tests
Method | Purpose | Assumptions |
---|---|---|
Sign Test | Test median | None (ordinal data sufficient) |
Wilcoxon Signed Rank Test | Test symmetry around a value | Symmetry of distribution |
Wald-Wolfowitz Runs Test | Test for randomness | Independent observations |
Quantile (or Percentile) Test | Test specific quantile | None (ordinal data sufficient) |
4.3.7.1 Sign Test
The Sign Test is used to test hypotheses about the median of a population, $\mu_{(0.5)}$, without assuming a specific distribution for the data. This test is well suited to small sample sizes or settings where normality assumptions are not met.

To test the population median, consider the hypotheses:

- Null Hypothesis: $H_0: \mu_{(0.5)} = 0$
- Alternative Hypothesis: $H_a: \mu_{(0.5)} > 0$ (one-sided test)
Steps:
Count Positive and Negative Deviations:

- Count observations ($y_i$) greater than 0: $s_+$ (number of positive signs).
- Count observations less than 0: $s_-$ (number of negative signs).
- $s_- = n - s_+$.

Decision Rule:

- Reject $H_0$ if $s_+$ is large (or equivalently, $s_-$ is small).
- To determine how large $s_+$ must be, use the distribution of $S_+$ under $H_0$, which is Binomial with $p = 0.5$.
Null Distribution:

Under $H_0$, $S_+$ follows $S_+ \sim \text{Binomial}(n, p = 0.5)$.

Critical Value:

Reject $H_0$ if $s_+ \ge b_{n,\alpha}$, where $b_{n,\alpha}$ is the upper $\alpha$ critical value of the binomial distribution.

p-value Calculation:

Compute the p-value for the observed (one-tailed) $s_+$ as:

$$\text{p-value} = P(S \ge s_+) = \sum_{i=s_+}^{n} \binom{n}{i} \left(\frac{1}{2}\right)^n$$

Alternatively:

$$P(S \le s_-) = \sum_{i=0}^{s_-} \binom{n}{i} \left(\frac{1}{2}\right)^n$$
Large Sample Normal Approximation
For large $n$, use a normal approximation for the binomial test. Reject $H_0$ if

$$s_+ \ge \frac{n}{2} + \frac{1}{2} + z_{\alpha} \sqrt{\frac{n}{4}},$$

where $z_\alpha$ is the critical value for a one-sided test.
For two-sided tests, use the maximum or minimum of $s_+$ and $s_-$:

$$s_{\text{max}} = \max(s_+, s_-), \quad s_{\text{min}} = \min(s_+, s_-)$$

Reject $H_0$ if the p-value is less than $\alpha$, where:

$$p\text{-value} = 2 \sum_{i=s_{\text{max}}}^{n} \binom{n}{i} \left(\frac{1}{2}\right)^n = 2 \sum_{i=0}^{s_{\text{min}}} \binom{n}{i} \left(\frac{1}{2}\right)^n$$

Equivalently, reject $H_0$ if $s_{\text{max}} \ge b_{n,\alpha/2}$.
For large $n$, the normal approximation uses:

$$z = \frac{s_{\text{max}} - \frac{n}{2} - \frac{1}{2}}{\sqrt{\frac{n}{4}}}$$

Reject $H_0$ at level $\alpha$ if $z \ge z_{\alpha/2}$.
Handling zeros in the data is a common issue with the Sign Test:
- Random Assignment: Assign zeros randomly to either $s_+$ or $s_-$ (two researchers may then reach different conclusions from the same data).
- Fractional Assignment: Count each zero as 0.5 toward both $s_+$ and $s_-$ (but the binomial distribution then no longer applies exactly).
- Ignore Zeros: Discard zeros, noting that this reduces the sample size and hence the power.
# Example Data
data <- c(0.76, 0.82, 0.80, 0.79, 1.06, 0.83, -0.43, -0.34, 3.34, 2.33)
# Count positive signs
s_plus <- sum(data > 0)
# Sample size excluding zeros
n <- length(data)
# Perform a one-sided binomial test
binom.test(s_plus, n, p = 0.5, alternative = "greater")
#>
#> Exact binomial test
#>
#> data: s_plus and n
#> number of successes = 8, number of trials = 10, p-value = 0.05469
#> alternative hypothesis: true probability of success is greater than 0.5
#> 95 percent confidence interval:
#> 0.4930987 1.0000000
#> sample estimates:
#> probability of success
#> 0.8
4.3.7.2 Wilcoxon Signed Rank Test
The Wilcoxon Signed Rank Test is an improvement over the Sign Test as it considers both the magnitude and direction of deviations from the null hypothesis value (e.g., 0). However, this test assumes that the data are symmetrically distributed around the median, unlike the Sign Test.
We test the following hypotheses:
$$H_0: \mu_{(0.5)} = 0 \quad \text{vs.} \quad H_a: \mu_{(0.5)} > 0$$
This example assumes no ties or duplicate observations in the data.
Procedure for the Signed Rank Test
- Rank the Absolute Values:
  - Rank the observations $y_i$ based on their absolute values.
  - Let $r_i$ denote the rank of $y_i$.
  - Since there are no ties, the ranks $r_i$ are uniquely determined and form a permutation of the integers $1, 2, \dots, n$.
- Calculate $w_+$ and $w_-$:
  - $w_+$ is the sum of the ranks corresponding to positive values of $y_i$.
  - $w_-$ is the sum of the ranks corresponding to negative values of $y_i$.
  - By definition: $w_+ + w_- = \sum_{i=1}^n r_i = \frac{n(n+1)}{2}$
- Decision Rule:
  - Reject $H_0$ if $w_+$ is large (or equivalently, if $w_-$ is small).

Null Distribution of $W_+$

Under the null hypothesis, the distributions of $W_+$ and $W_-$ are identical and symmetric. The p-value for a one-sided test is:

$$\text{p-value} = P(W \ge w_+) = P(W \le w_-)$$

An $\alpha$-level test rejects $H_0$ if $w_+ \ge w_{n,\alpha}$, where $w_{n,\alpha}$ is the critical value from a table of the null distribution of $W_+$.

For two-sided tests, use:

$$p\text{-value} = 2P(W \ge w_{\text{max}}) = 2P(W \le w_{\text{min}})$$
Normal Approximation for Large Samples
For large $n$, the null distribution of $W_+$ can be approximated by a normal distribution:

$$z = \frac{w_+ - \frac{n(n+1)}{4} - \frac{1}{2}}{\sqrt{\frac{n(n+1)(2n+1)}{24}}}$$

The test rejects $H_0$ at level $\alpha$ if:

$$w_+ \ge \frac{n(n+1)}{4} + \frac{1}{2} + z_{\alpha} \sqrt{\frac{n(n+1)(2n+1)}{24}} \approx w_{n,\alpha}$$

For a two-sided test, the decision rule uses the maximum or minimum of $w_+$ and $w_-$:

$$w_{\text{max}} = \max(w_+, w_-), \quad w_{\text{min}} = \min(w_+, w_-)$$

The p-value is computed as:

$$p\text{-value} = 2P(W \ge w_{\text{max}}) = 2P(W \le w_{\text{min}})$$
Handling Tied Ranks

If some observations $|y_i|$ are tied in absolute value, assign the average rank (or "midrank") to all tied values. For example:

- Suppose $y_1 = -1$, $y_2 = 3$, $y_3 = -3$, and $y_4 = 5$.
- The ranks for $|y_i|$ are:
  - $|y_1| = 1$: $r_1 = 1$
  - $|y_2| = |y_3| = 3$: $r_2 = r_3 = \frac{2+3}{2} = 2.5$
  - $|y_4| = 5$: $r_4 = 4$
# Example Data
data <- c(0.76, 0.82, 0.80, 0.79, 1.06, 0.83, -0.43, -0.34, 3.34, 2.33)
# Perform Wilcoxon Signed Rank Test (exact test)
wilcox_exact <- wilcox.test(data, exact = TRUE)
# Display results
wilcox_exact
#>
#> Wilcoxon signed rank exact test
#>
#> data: data
#> V = 52, p-value = 0.009766
#> alternative hypothesis: true location is not equal to 0
For large samples, you can use the normal approximation by setting `exact = FALSE`:
# Perform Wilcoxon Signed Rank Test (normal approximation)
wilcox_normal <- wilcox.test(data, exact = FALSE)
# Display results
wilcox_normal
#>
#> Wilcoxon signed rank test with continuity correction
#>
#> data: data
#> V = 52, p-value = 0.01443
#> alternative hypothesis: true location is not equal to 0
4.3.7.3 Wald-Wolfowitz Runs Test
The Runs Test is a non-parametric test used to examine the randomness of a sequence. Specifically, it tests whether the order of observations in a sequence is random. This test is useful in detecting non-random patterns, such as trends, clustering, or periodicity.
The hypotheses for the Runs Test are:

- Null Hypothesis: $H_0$: The sequence is random.
- Alternative Hypothesis: $H_a$: The sequence is not random.
A run is a sequence of consecutive observations of the same type. For example, the binary sequence `+ + - - + - + +` contains 5 runs: `++`, `--`, `+`, `-`, `++`.

Runs can be formed from any binary classification, such as:

- positive vs. negative values,
- above vs. below the median,
- success vs. failure in binary outcomes.
Test Statistic

Number of Runs ($R$): the observed number of runs in the sequence.

Expected Number of Runs ($E[R]$): under the null hypothesis of randomness, the expected number of runs is

$$E[R] = \frac{2 n_1 n_2}{n_1 + n_2} + 1,$$

where:

- $n_1$: number of observations in the first category (e.g., positives),
- $n_2$: number of observations in the second category (e.g., negatives),
- $n = n_1 + n_2$: total number of observations.

Variance of Runs ($\text{Var}[R]$): the variance of the number of runs is given by

$$\text{Var}[R] = \frac{2 n_1 n_2 (2 n_1 n_2 - n)}{n^2 (n - 1)}$$

Standardized Test Statistic ($z$): for large samples ($n \ge 20$), the test statistic is approximately normally distributed:

$$z = \frac{R - E[R]}{\sqrt{\text{Var}[R]}}$$
Decision Rule
- Compute the z-value and compare it to the critical value of the standard normal distribution.
- For a significance level $\alpha$:
  - Reject $H_0$ if $|z| \ge z_{\alpha/2}$ (two-sided test).
  - Reject $H_0$ if $z \ge z_\alpha$ or $z \le -z_\alpha$ for one-sided tests.
Steps for Conducting a Runs Test:
- Classify the data into two groups (e.g., above/below median, positive/negative).
- Count the total number of runs ($R$).
- Compute $E[R]$ and $\text{Var}[R]$ from $n_1$ and $n_2$.
- Compute the z-value for the observed number of runs.
- Compare the z-value to the critical value to decide whether to reject $H_0$.
For a numerical dataset where the test is based on values above and below the median:
# Example dataset
data <- c(1.2, -0.5, 3.4, -1.1, 2.8, -0.8, 4.5, 0.7)
library(randtests)
# Perform Runs Test (above/below median)
runs.test(data)
#>
#> Runs Test
#>
#> data: data
#> statistic = 2.2913, runs = 8, n1 = 4, n2 = 4, n = 8, p-value = 0.02195
#> alternative hypothesis: nonrandomness
The output of the `runs.test` function includes:

- Observed runs: the actual number of runs in the sequence.
- Expected runs: the expected number of runs under $H_0$.
- p-value: the probability of observing a number of runs at least as extreme as the observed one under $H_0$.

If the p-value is less than $\alpha$, reject $H_0$ and conclude that the sequence is not random.
Limitations of the Runs Test

- The test assumes that observations are independent.
- For small sample sizes, the test may have limited power.
- Ties in the data must be resolved by a predefined rule (e.g., assigning ties to one group or excluding them).
4.3.7.4 Quantile (or Percentile) Test
The Quantile Test (also called the Percentile Test) is a non-parametric test used to evaluate whether the proportion of observations falling within a specific quantile matches the expected proportion under the null hypothesis. This test is useful for assessing the distribution of data when specific quantiles (e.g., medians or percentiles) are of interest.
Suppose we want to test whether the true proportion of data below a specified quantile $q$ matches a given probability $p$. The hypotheses are:

- Null Hypothesis: $H_0$: The true proportion is equal to $p$.
- Alternative Hypothesis: $H_a$: The true proportion is not equal to $p$ (two-sided), greater than $p$ (right-tailed), or less than $p$ (left-tailed).
Test Statistic
The test statistic is based on the observed count of data points below the specified quantile.
Observed Count ($k$): the number of data points $y_i$ such that $y_i \le q$.

Expected Count ($E[k]$): the expected number of observations below the quantile $q$ under $H_0$ is

$$E[k] = n \cdot p$$

Variance: under the binomial distribution, the variance is

$$\text{Var}[k] = n \cdot p \cdot (1 - p)$$

Standardized Test Statistic ($z$): for large $n$, the test statistic is approximately normally distributed:

$$z = \frac{k - E[k]}{\sqrt{\text{Var}[k]}} = \frac{k - np}{\sqrt{np(1 - p)}}$$
Decision Rule
- Compute the z-value for the observed count.
- Compare the z-value to the critical value of the standard normal distribution:
  - For a two-sided test, reject $H_0$ if $|z| \ge z_{\alpha/2}$.
  - For a one-sided test, reject $H_0$ if $z \ge z_\alpha$ (right-tailed) or $z \le -z_\alpha$ (left-tailed).

Alternatively, calculate the p-value and reject $H_0$ if the p-value is at most $\alpha$.
Suppose we have a dataset and want to test whether the proportion of observations below the 50th percentile (median) matches the expected value of $p = 0.5$.
# Example data
data <- c(12, 15, 14, 10, 13, 11, 14, 16, 15, 13)
# Define the quantile to test
quantile_value <- quantile(data, 0.5) # Median
p <- 0.5 # Proportion under H0
# Count observed values below or equal to the quantile
k <- sum(data <= quantile_value)
# Sample size
n <- length(data)
# Expected count under H0
expected_count <- n * p
# Variance
variance <- n * p * (1 - p)
# Test statistic (z-value)
z <- (k - expected_count) / sqrt(variance)
# Calculate p-value for two-sided test
p_value <- 2 * (1 - pnorm(abs(z)))
# Output results
list(
quantile_value = quantile_value,
observed_count = k,
expected_count = expected_count,
z_value = z,
p_value = p_value
)
#> $quantile_value
#> 50%
#> 13.5
#>
#> $observed_count
#> [1] 5
#>
#> $expected_count
#> [1] 5
#>
#> $z_value
#> [1] 0
#>
#> $p_value
#> [1] 1
For a one-sided test (e.g., testing whether the proportion is greater than p):
# Calculate one-sided p-value
p_value_one_sided <- 1 - pnorm(z)
# Output one-sided p-value
p_value_one_sided
#> [1] 0.5
Interpretation of Results

- p-value: if the p-value is less than $\alpha$, reject $H_0$ and conclude that the proportion of observations below the quantile deviates significantly from $p$.
- Test statistic ($z$): the z-value indicates how many standard deviations the observed count is from the expected count under the null hypothesis. Large positive or negative z-values suggest non-random deviations.
Assumptions of the Test

- Observations are independent.
- The sample size is large enough for the normal approximation to the binomial distribution to be valid ($np \ge 5$ and $n(1-p) \ge 5$).
Limitations of the Test

- For small sample sizes, the normal approximation may not hold; in such cases, an exact binomial test is more appropriate (see the sketch below).
- The test assumes that the quantile used (e.g., the median) is well-defined and correctly calculated from the data.
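For the small sample above, the exact version replaces the normal approximation with `binom.test`, reusing the observed count ($k = 5$ of $n = 10$ at $p = 0.5$):

```r
# Exact binomial version of the quantile test (small-sample alternative)
binom.test(5, 10, p = 0.5, alternative = "two.sided")
```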