Chapter 8 Testing Variance Terms

8.1 \(\chi^2\) and F Tests

\(\chi^2\) and F tests are used to test hypotheses about variances. The \(\chi^2\) test considers whether a sample variance is equal to a null value of the population variance, \(\sigma^2\).

The F-test is used to test whether two different sample variances are equal to one another. The F-test is very useful in linear regression to compare different models and determine which one provides the best fit.

I am going to assume you know some things about linear regression (but if not, no worries, we will cover it later). Let’s take a simple example.\[Y =X\beta+e\] You can think of \(X\beta\) as the part of the variation in Y we can explain with the regression line and \(e\) as the part that remains unknown. If we add the variables in the matrix W to the regression, how do we know if we are doing a better job? \[Y =X\beta+W\gamma+e\] The parameter \(\gamma\) is a whole vector of coefficients, and a t-test can only test one coefficient at a time. We need something else. Here is where the F-test comes in. It compares the variance of the error term from the first regression (without W) to the variance of the error term from the second regression (with W). If W matters, then the variance of the error term should have decreased; that is, there is less that is unknown.

8.2 \(\chi^2\) Test

The \(\chi^2\) statistic can be used to test whether a sample variance is statistically different from a particular value. We know that \[ \frac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1} \]

A two-sided interval at a significance level of \(\alpha\) is then \[ \chi^2_{n-1}(\alpha/2) < \frac{(n-1)s^2}{\sigma^2} < \chi^2_{n-1}(1-\alpha/2)\]

which can be solved for \(\sigma^2\) \[ \frac{(n-1)s^2}{\chi^2_{n-1}(\alpha/2)} > \sigma^2 > \frac{(n-1)s^2}{\chi^2_{n-1}(1-\alpha/2)}\]

8.3 \(\chi^2\) Example

Let’s suppose you have a sample of earnings data and you want to test whether wages for married women and single women have the same volatility.

  • You find a study that states the wage variance for married women is $10,000.
  • You ask 21 single women their wage and find a sample variance equal to $14,400.

To test whether these two variance terms are the same at the 10% level, we need to find \([\chi^2_{20}(5\%),\chi^2_{20}(95\%)]\).

  • Using a \(\chi^2\) table, we find \(\chi^2_{20}(5\%)=10.851\) and \(\chi^2_{20}(95\%)=31.41\).

  • Using \(\sigma^2 = 10{,}000\) and \(s^{2} = 14{,}400\), we find the \(\chi^2\) statistic to equal \[ \frac{(20)(14{,}400)}{10{,}000}=28.8 \]

Note, this value lies between our two critical values. Therefore, we fail to reject the null hypothesis of equal wage variances among married and single women.
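A quick numerical check of this example, sketched in Python (the chapter's own snippets use R, but the arithmetic is identical):

```python
# Chi-square test that a sample variance equals a hypothesized value.
# Numbers from the example: n = 21 single women, s^2 = 14,400,
# and the null value sigma0^2 = 10,000 from the married-women study.
n = 21
s2 = 14_400
sigma0_sq = 10_000

chi2_stat = (n - 1) * s2 / sigma0_sq  # (20 * 14,400) / 10,000 = 28.8

# Critical values from a chi-square table with 20 df (10% two-sided test).
lower, upper = 10.851, 31.410
reject = chi2_stat < lower or chi2_stat > upper
print(chi2_stat, reject)  # 28.8 False -> fail to reject
```

Because 28.8 falls inside the interval \([10.851, 31.410]\), the test fails to reject, matching the conclusion above.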

8.4 \(\chi^2\) as a Goodness of Fit Measure

The chi square distribution is also used to judge how well a set of observed frequencies matches the frequencies expected under a hypothesized distribution. In this case, the chi square statistic is equal to \[ \sum_{i=1}^{n} \frac{(O_{i}-E_{i})^2}{E_{i}} \sim \chi^2_{n-1} \] where \(O_{i}\) stands for the observed count and \(E_{i}\) stands for the expected count in category \(i\), and \(n\) is the number of categories.

8.5 Example 1

At the population level, 80% of the population is white, 12% is black, and 8% is of some other race. Are the General Social Survey data consistent with these proportions?

Pop. %   Expected   Observed   \(\frac{(O - E)^2}{E}\)
------   --------   --------   ------------------------
0.80     1908.8     1897       0.073
0.12     286.32     322        4.45
0.08     190.88     167        2.99
------   --------   --------   ------------------------
1.00     2386       2386       7.51

The critical value for the \(\chi^2\) statistic at 2 df and a significance level of 5% is 5.99. The test statistic (7.51) is greater than the critical value; therefore, we reject the null hypothesis that the GSS distribution of race is equivalent to the actual distribution of race in the population.
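The table's arithmetic can be reproduced in a few lines. This is a Python sketch; the counts, proportions, and the 5.99 critical value all come from the example above:

```python
# Goodness-of-fit: GSS race counts vs. hypothesized population proportions.
observed = [1897, 322, 167]        # white, black, other
proportions = [0.80, 0.12, 0.08]
n = sum(observed)                  # 2386 respondents

expected = [p * n for p in proportions]
chi2_stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

critical = 5.99  # chi-square table, 2 df, 5% significance
print(round(chi2_stat, 2), chi2_stat > critical)  # 7.51 True -> reject
```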

8.6 Example 2

A company wants to examine the relationship between customer satisfaction and product preferences. They survey 320 randomly selected customers and ask them to rate their satisfaction level as “Satisfied,” “Neutral,” or “Dissatisfied,” and also inquire about their preferred product category: “Electronics,” “Fashion,” or “Home Goods.” The data collected is as follows:

               Electronics   Fashion   Home Goods   Total
-------------  -----------   -------   ----------   -----
Satisfied      50            30        20           100
Neutral        40            70        30           140
Dissatisfied   10            20        50           80
-------------  -----------   -------   ----------   -----
Total          100           120       100          320

Hypothesis Formulation:

  • Null Hypothesis (H0): Customer satisfaction levels are independent of product preferences.
  • Alternative Hypothesis (Ha): There is a significant association between customer satisfaction levels and product preferences.

Calculating Expected Frequencies: To calculate the expected frequencies, assuming independence, we use the formula: Expected Frequency (E) = (Row Total * Column Total) / Grand Total

For example, the expected frequency for the cell in the “Satisfied” row and “Electronics” column would be: E(Satisfied, Electronics) = (100 * 100) / 320 = 31.25

Performing this calculation for each cell in the contingency table, we get the following expected frequencies:

               Electronics   Fashion   Home Goods   Total
-------------  -----------   -------   ----------   -----
Satisfied      31.25         37.50     31.25        100
Neutral        43.75         52.50     43.75        140
Dissatisfied   25.00         30.00     25.00        80
-------------  -----------   -------   ----------   -----
Total          100           120       100          320

Computing the Chi-Square Test Statistic: Using the observed and expected frequencies, we can compute the chi-square test statistic. The formula for the chi-square test statistic is:


\[ \chi^2 = \sum_{i} \frac{(O_{i} - E_{i})^2}{E_{i}} \]

After calculating this term for each of the nine cells and summing them up, we get a chi-square test statistic of approximately 64.61.
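The full computation can be sketched in Python. The observed counts are taken from the contingency table above, and each expected frequency is recomputed from the row and column totals:

```python
# Chi-square test of independence for satisfaction vs. product category.
observed = [
    [50, 30, 20],   # Satisfied
    [40, 70, 30],   # Neutral
    [10, 20, 50],   # Dissatisfied
]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)  # 320

chi2_stat = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total  # expected count
        chi2_stat += (o - e) ** 2 / e

print(round(chi2_stat, 2))  # about 64.61
```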

Degrees of Freedom: The degrees of freedom for a chi-square test of independence is given by (Number of Rows - 1) * (Number of Columns - 1). In this case, it’s (3 - 1) * (3 - 1) = 2 * 2 = 4.

Comparison with Critical Value: Using a chi-square distribution table or statistical software, the critical value for a significance level of α = 0.05 and 4 degrees of freedom is approximately 9.488.

Conclusion: If the computed chi-square test statistic is greater than the critical value (9.488), we reject the null hypothesis and conclude that there is a significant association between customer satisfaction levels and product preferences; if it is less than the critical value, we fail to reject the null hypothesis. Here the statistic of roughly 64.61 far exceeds 9.488, so we reject the null hypothesis of independence.

8.7 F Test

An F test is used to determine if there is a statistical difference between two sample variances.

  • This test is very important in linear regression. In linear regression, we are trying to choose the correct set of explanatory variables that minimizes the variance between predicted and observed values of the dependent variable.

  • If you consider a regression that measures educational attainment, then we may think that both family income and IQ will affect this dependent variable.

  • How do we know if adding family income into the regression equation will give us more explanatory power?

  • An F-test will help us determine the additional power.

8.8 Conduct an F Test

  1. Assume we ran two regressions, one which uses only IQ and the other which uses both IQ and family income. In both cases we obtain predicted values of educational attainment alongside the observed values.

  2. Calculate the sample variance between observed and predicted variables in both cases.

  3. Calculate the F-statistic as \[ F = \frac{s_{1}^2}{s_{2}^2} \sim F_{n,m}\] where n is the degrees of freedom of the numerator and m is the degrees of freedom of the denominator.

  4. Then compare this value to the critical value \(F_{\alpha,n,m}\). If \(F_{\alpha,n,m}<F\), we reject the null hypothesis of equal variances and conclude that adding family income improves our prediction power in a statistically significant manner.
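The steps above can be sketched numerically in Python. The residual variances below are invented for illustration; in practice they would come from the two fitted regressions:

```python
# F-statistic for comparing residual variances from the two regressions.
# Hypothetical sample variances of the prediction errors (made-up numbers):
s1_sq = 120.0   # model with IQ only
s2_sq = 90.0    # model with IQ and family income

F = s1_sq / s2_sq
print(round(F, 3))  # 1.333 -- compare against F(alpha, n, m) from a table
```

If this ratio exceeded the tabulated critical value for the appropriate degrees of freedom, we would conclude that family income adds statistically significant explanatory power.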

8.9 F Test Example

Suppose you have two samples.

  • The first is of size 5 and the second is of size 4.
  • The sample variance for the first sample is 10 and the sample variance for the second sample is 8.
  • The ratio of the two sample variances is equal to 1.25. This is the observed F statistic with 4 degrees of freedom in the numerator and 3 degrees of freedom in the denominator.
qf(.95, df1=4, df2=3)
## [1] 9.117182

In this case, we would fail to reject the null hypothesis of equal variances because the critical value of 9.117 is greater than the observed value of 1.25.

8.10 Business Application of an F-test

Let’s consider a business example where an F test can be used to compare the performance of two different marketing strategies to determine which one is more effective in increasing sales.

Suppose you have an e-commerce company that sells electronic gadgets. You want to assess whether a new marketing strategy, let’s call it Strategy A, is more effective than your current marketing strategy, Strategy B, in driving sales. To do this, you decide to run a controlled experiment where you randomly divide your customer base into two groups:

Group 1: Customers exposed to Strategy A (e.g., targeted email campaigns, social media ads, and discounts for a specific duration). Group 2: Customers exposed to Strategy B (e.g., traditional advertising, regular promotions).

After a specific time period, you collect data on the total sales generated from each group. Now, you want to determine if there is a statistically significant difference between the two strategies in terms of sales performance.

This is where the F test comes into play. In the form of a one-way analysis of variance (ANOVA), an F test can be used to compare the means of multiple groups and determine if there are significant differences between them.

The null hypothesis (H0) for the F test in this scenario would be: “There is no significant difference in sales performance between Strategy A and Strategy B.” The alternative hypothesis (Ha) would be: “There is a significant difference in sales performance between Strategy A and Strategy B.”

You can perform the F test on the sales data collected from both groups to calculate the F-statistic. The F-statistic will help you assess whether the variation in sales performance between the two strategies is significantly greater than what you would expect from random chance.

If the F-statistic is large enough and exceeds a critical value (determined based on the degrees of freedom and chosen significance level), you would reject the null hypothesis. It means that there is a significant difference between the two marketing strategies, and you can conclude that one strategy is more effective in driving sales than the other.

Conversely, if the F-statistic is not significant, you would fail to reject the null hypothesis, and you wouldn’t have enough evidence to claim that one strategy is superior to the other in terms of increasing sales.
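As a sketch of what this comparison looks like in practice, here is a one-way ANOVA in Python on made-up weekly sales figures (every number below is invented for illustration):

```python
from statistics import mean

# Hypothetical weekly sales (in $1,000s) for each group.
strategy_a = [52, 55, 60, 58, 54]   # invented data for Strategy A
strategy_b = [48, 50, 47, 53, 49]   # invented data for Strategy B

groups = [strategy_a, strategy_b]
grand_mean = mean(strategy_a + strategy_b)
k = len(groups)                      # number of groups
n = sum(len(g) for g in groups)      # total observations

# Between-group and within-group sums of squares.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

F = (ss_between / (k - 1)) / (ss_within / (n - k))
critical = 5.32  # F table: alpha = 0.05, df1 = 1, df2 = 8

print(round(F, 2), F > critical)  # 13.21 True -> reject H0
```

With this fabricated data the F-statistic (13.21) exceeds the critical value (5.32), so we would reject the null hypothesis of equal mean sales across the two strategies.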

In summary, the F test allows you to make data-driven decisions in your business by comparing the effectiveness of different marketing strategies or other interventions on various performance metrics. It helps you identify which approach is statistically better, enabling you to allocate your resources and efforts more efficiently.