# Topic 7 Independent samples t-test

Before, with the one-sample t-test, we used our sample mean to test whether the population mean was equal to our guess for it.

Now, we are interested in testing if the population mean of a group is equal to the population mean of another group.

This is called the *independent samples t-test*.

We use this test to examine the means of one continuous variable (we call it the dependent variable) across two different groups defined by a categorical variable (we call it the independent variable).

## 7.1 Formula

Note that in the formulas below the subscripts 1 and 2 refer, respectively, to groups 1 and 2 defined by the categorical variable you are interested in.

Your t-calculated is now defined by:

\[t = \frac{\bar {x}_1 - \bar{x}_2}{se_{diff}}\]

where

\[se_{diff} = \sqrt{\frac{{s}^{2}_p}{n_1}+\frac{{s}^{2}_p}{n_2}}\]

and \({s}^{2}_p\), which is the pooled variance, is:

\[{s}^{2}_p = \frac{SS_1 + SS_2}{df_1 +df_2} \]

Note that \(SS\) is the sum of squares for each group. With \(f\) denoting the frequency of each value, it can be calculated by:

\[SS_1 = \sum f \cdot {(x_1 - \bar {x}_1)}^2\]

\[SS_2 = \sum f \cdot {(x_2 - \bar {x}_2)}^2\]

You should get those values easily from the table you write to calculate the mean and standard deviation. See below:

| \(x\) | \(f\) | \(fx\) | \(x - \overline{x}\) | \((x - \overline{x})^2\) | \(f(x - \overline{x})^2\) |
|---|---|---|---|---|---|
| … | … | … | … | … | … |
| | \(\sum f\) | \(\sum fx\) | | | \(\sum f(x - \overline{x})^2 = SS\) |

Therefore, to hand-calculate the independent samples t-test, you need to draw two of the above tables (one for each group). After that, you simply apply the formulas above.
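The hand calculation can also be sketched in a few lines of Python. This is only an illustration: the two frequency tables are made-up numbers, the function names `group_summary` and `pooled_t` are hypothetical, and the sketch assumes the usual degrees of freedom \(df_1 = n_1 - 1\) and \(df_2 = n_2 - 1\).

```python
import math

def group_summary(freq_table):
    """Return (n, mean, SS) for a frequency table given as {x: f}."""
    n = sum(freq_table.values())                                   # sum of f
    mean = sum(x * f for x, f in freq_table.items()) / n           # sum of fx / n
    ss = sum(f * (x - mean) ** 2 for x, f in freq_table.items())   # sum of f(x - mean)^2
    return n, mean, ss

def pooled_t(table_1, table_2):
    """Return (t, df) using the pooled-variance formulas of section 7.1."""
    n1, mean1, ss1 = group_summary(table_1)
    n2, mean2, ss2 = group_summary(table_2)
    df1, df2 = n1 - 1, n2 - 1                    # assumed: df = n - 1 per group
    pooled_var = (ss1 + ss2) / (df1 + df2)
    se_diff = math.sqrt(pooled_var / n1 + pooled_var / n2)
    return (mean1 - mean2) / se_diff, df1 + df2

# Hypothetical data: value -> frequency for each group
group_1 = {2: 1, 3: 2, 4: 1}
group_2 = {4: 1, 5: 2, 6: 1}

t_stat, df = pooled_t(group_1, group_2)
print(f"t = {t_stat:.4f}, df = {df}")
```

Each call to `group_summary` plays the role of one of the two tables: it returns the column totals you would otherwise read off the last row.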

## 7.2 Assumptions

For the *independent samples t-test* to work, three assumptions need to hold:

**Normality:** The variable of interest should be normally distributed within each group.

- We usually do not perform mathematical tests to check for normality.
- Looking at measures of central tendency and histograms is probably the best way to do so.
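As a rough sketch of this informal check, the snippet below compares the mean and the median of one group and prints a crude text histogram. The data and the helper names `describe` and `text_histogram` are hypothetical; in practice you would inspect the histogram SPSS draws for each group.

```python
import statistics
from collections import Counter

def describe(values):
    """Return (mean, median); similar values hint at a roughly symmetric shape."""
    return statistics.mean(values), statistics.median(values)

def text_histogram(values):
    """Return one line per distinct value, with one star per observation."""
    counts = Counter(values)
    return [f"{x:>3} | {'*' * counts[x]}" for x in sorted(counts)]

# Hypothetical scores for a single group
group = [2, 3, 3, 4, 4, 4, 5, 5, 6]

mean, median = describe(group)
print(f"mean = {mean}, median = {median}")
for line in text_histogram(group):
    print(line)
```

A mean close to the median and a single-peaked, symmetric histogram are consistent with normality, but they do not prove it.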

**Independence:** There must be no relationship between the observations in the two groups.

- This is not simple to test.
- Usually, we assume it holds if each group is composed of observations from different units. In other words, there must be no individual that belongs to both groups.

**Equality of variances:** The population variances of the two groups must be equal.

- If variances are equal, we say we have *homoskedasticity*.
- If variances are not equal, we say we have *heteroskedasticity*.
- We can test it!

## 7.3 Levene’s test for equality of variances

We test the following:

- Null hypothesis: population variances are equal
- Alternative hypothesis: population variances are not equal

This is an application of the F-test (we will cover it in the following weeks). Do not worry about the math behind it for now.

**Notes:**

- When you perform your independent samples t-test, SPSS performs Levene’s test automatically!
- Homogeneity of variance test = Levene’s test
- Factor variable is the variable that defines your groups (our independent variable).

**Interpretation**

As before, we will look at the p-value:

- \(p \leq \alpha\) we reject the null
- \(p > \alpha\) we fail to reject the null

**Conclusion**

If we reject the null, then variances are *not* equal. Therefore, the third assumption (equality of variances) for the independent samples t-test *does not hold*.
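Outside SPSS, the same decision rule can be sketched with SciPy's `scipy.stats.levene`. The two samples below are hypothetical, and \(\alpha = 0.05\) is an assumed significance level. SciPy centers on the median by default, so `center="mean"` is passed here to request the classic mean-based Levene's test.

```python
from scipy import stats

# Hypothetical samples: group_b is visibly more spread out than group_a
group_a = [10, 11, 9, 10, 12, 10, 11, 9]
group_b = [2, 18, 5, 15, 1, 20, 8, 12]

alpha = 0.05  # assumed significance level

# center="mean" gives the mean-based Levene's test (SciPy's default is the median)
result = stats.levene(group_a, group_b, center="mean")

if result.pvalue <= alpha:
    decision = "reject the null: variances are not equal"
else:
    decision = "fail to reject the null: no evidence of unequal variances"

print(f"W = {result.statistic:.3f}, p = {result.pvalue:.4f}")
print(decision)
```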

## 7.4 Interpretation of the independent samples t-test

Once again, we look at the p-value:

- \(p \leq \alpha\) we reject the null
- \(p > \alpha\) we fail to reject the null

**Conclusion**

If we reject the null, we conclude that the population means of the two groups are *not* equal.
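For readers working outside SPSS, a minimal sketch of this decision with SciPy's `scipy.stats.ttest_ind` follows. The data are hypothetical and \(\alpha = 0.05\) is an assumed significance level. Passing `equal_var=True` requests the pooled-variance version of the test, which matches the formulas in section 7.1.

```python
from scipy import stats

# Hypothetical scores for two independent groups
group_1 = [2, 3, 3, 4]
group_2 = [4, 5, 5, 6]

alpha = 0.05  # assumed significance level

# equal_var=True uses the pooled variance; the p-value is two-sided
result = stats.ttest_ind(group_1, group_2, equal_var=True)

if result.pvalue <= alpha:
    decision = "reject the null: the population means are not equal"
else:
    decision = "fail to reject the null"

print(f"t = {result.statistic:.4f}, p = {result.pvalue:.4f}")
print(decision)
```

If Levene's test had rejected equal variances, `equal_var=False` would switch to Welch's t-test, which does not pool the variances.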

## 7.5 Exercise

First, I will illustrate the test by looking at differences in wages between married and non-married individuals.

Second, using the “earnings_data.sav” file, carry out the necessary procedures to check whether college graduates earn higher wages than non-college graduates.

- What is your null hypothesis?
- What is the alternative hypothesis?
- What is your alpha?
- Interpret your p-value.