# Topic 7 Independent samples t-test

Previously, with the one-sample t-test, we used a sample mean to test whether the population mean was equal to our guess for it.

Now, we are interested in testing if the population mean of a group is equal to the population mean of another group.

This is called the independent samples t-test.

We use this test to examine the means of one continuous variable (we call it the dependent variable) across two different groups defined by a categorical variable (we call it the independent variable).

## 7.1 Formula

Note that in the formulas below the subscripts 1 and 2 refer, respectively, to groups 1 and 2 defined by the categorical variable you are interested in.

Your t-calculated is now defined by:

$t = \frac{\bar {x}_1 - \bar{x}_2}{se_{diff}}$

where

$se_{diff} = \sqrt{\frac{{s}^{2}_p}{n_1}+\frac{{s}^{2}_p}{n_2}}$

and $${s}^{2}_p$$, which is the pooled variance, is:

${s}^{2}_p = \frac{SS_1 + SS_2}{df_1 +df_2}$

Here $$df_1 = n_1 - 1$$ and $$df_2 = n_2 - 1$$ are the degrees of freedom of each group, and $$SS$$ is the sum of squares for each group, which can be calculated by:

$SS_1 = \sum f \cdot {(x_1 - \bar {x}_1)}^2$

$SS_2 = \sum f \cdot {(x_2 - \bar {x}_2)}^2$

You can read these values directly from the table you build to calculate the mean and standard deviation. See below:

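| $$x$$ | $$f$$ | $$fx$$ | $$x - \overline{x}$$ | $$(x - \overline{x})^2$$ | $$f(x - \overline{x})^2$$ |
|---|---|---|---|---|---|
| . | . | . | . | . | . |
| Total | $$\sum f$$ | $$\sum fx$$ | | | $$\sum f(x - \overline{x})^2 = SS$$ |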

Therefore, to hand-calculate the independent samples t-test, you need to draw two of the above tables (one for each group). After that, you simply apply the formulas above.
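The hand calculation can also be sketched in a few lines of Python, which is a handy way to check your tables. This is only an illustration: the two frequency tables are made-up numbers, and the function names `group_summary` and `pooled_t` are invented for this example. Each table is written as `{value: frequency}`, matching the $$x$$ and $$f$$ columns.

```python
from math import sqrt

def group_summary(freq_table):
    """Return (n, mean, SS) for one group given as {value: frequency}."""
    n = sum(freq_table.values())                                   # sum of f
    mean = sum(x * f for x, f in freq_table.items()) / n           # sum of fx / sum of f
    ss = sum(f * (x - mean) ** 2 for x, f in freq_table.items())   # sum of f(x - mean)^2
    return n, mean, ss

def pooled_t(table_1, table_2):
    """Return (t, df) using the pooled-variance formulas of this section."""
    n1, mean1, ss1 = group_summary(table_1)
    n2, mean2, ss2 = group_summary(table_2)
    df1, df2 = n1 - 1, n2 - 1                    # degrees of freedom per group
    s2_p = (ss1 + ss2) / (df1 + df2)             # pooled variance
    se_diff = sqrt(s2_p / n1 + s2_p / n2)        # standard error of the difference
    t = (mean1 - mean2) / se_diff
    return t, df1 + df2

# Hypothetical data, not taken from any course dataset.
group_1 = {2: 1, 4: 2, 6: 1}   # observations 2, 4, 4, 6
group_2 = {1: 1, 2: 2, 3: 1}   # observations 1, 2, 2, 3

t, df = pooled_t(group_1, group_2)
print(f"t = {t:.3f}, df = {df}")
```

For these made-up numbers, $$SS_1 = 8$$, $$SS_2 = 2$$ and the pooled variance is $$10/6$$, so you can follow every intermediate value against the formulas above.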

## 7.2 Assumptions

For the independent samples t-test to work, three assumptions need to hold:

1. Normality: The variable of interest should be normally distributed within each group.
• We usually do not perform mathematical tests to check for normality.
• Looking at measures of central tendency and histograms is probably the best way to do so.
2. Independence: The observations in one group must be unrelated to the observations in the other group.
• This is not simple to test.
• Usually, we assume it holds if each group is composed of observations from different units. In other words, there must be no individual that belongs to both groups.
3. Equality of variances: The population variances of the two groups must be equal.
• If variances are equal, we say we have homoskedasticity.
• If variances are not equal, we say we have heteroskedasticity.
• We can test this assumption.
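As a rough illustration of the informal normality check in assumption (1), the sketch below prints the mean, the median, and a text histogram for one group. The data and the function name `describe_group` are hypothetical. A mean close to the median and a roughly symmetric, single-peaked histogram are consistent with normality, but they do not prove it.

```python
from collections import Counter
from statistics import mean, median

def describe_group(values, label):
    """Print central tendency and a simple text histogram for one group."""
    print(f"{label}: mean = {mean(values):.2f}, median = {median(values):.2f}")
    for value, count in sorted(Counter(values).items()):
        print(f"  {value:>4} | {'#' * count}")   # one '#' per observation

# Hypothetical hourly wages for one group (made-up numbers).
wages_group_a = [10, 12, 12, 13, 13, 13, 14, 14, 16]

describe_group(wages_group_a, "Group A")
```

You would repeat the same check for the second group, since normality should hold within each group.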

## 7.3 Levene’s test for equality of variances

We test the following:

• Null hypothesis: population variances are equal
• Alternative hypothesis: population variances are not equal

This is an application of the F-test (we will cover it in the following weeks). Do not worry about the math behind it for now.

Notes:

• When you perform your independent samples t-test, SPSS performs Levene’s test automatically.
• Homogeneity of variance test = Levene’s test
• Factor variable is the variable that defines your groups (our independent variable).

Interpretation

As before, we will look at the p-value:

• $$p \leq \alpha$$: we reject the null
• $$p > \alpha$$: we fail to reject the null

Conclusion

If we reject the null, we conclude that the variances are not equal, so assumption (3) of the independent samples t-test does not hold. If we fail to reject the null, we keep treating the variances as equal.
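You do not need the math to read the output, but a small sketch may make the idea less mysterious. The code below computes Levene's statistic $$W$$ for two groups, assuming the classic version that measures each observation's absolute distance from its own group mean (some software centers on the median instead, which gives slightly different values). The data and the function name `levene_w` are hypothetical.

```python
from statistics import mean

def levene_w(group_1, group_2):
    """Levene's W (mean-centred version) for two groups of raw observations."""
    groups = [group_1, group_2]
    k = len(groups)
    # Absolute deviation of each observation from its own group mean.
    z = [[abs(x - mean(g)) for x in g] for g in groups]
    z_means = [mean(zi) for zi in z]
    n_total = sum(len(zi) for zi in z)
    z_grand = sum(sum(zi) for zi in z) / n_total
    # Spread of the group averages around the overall average deviation.
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    # Spread of the individual deviations around their own group average.
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within

# Hypothetical data (made-up numbers).
w = levene_w([2, 4, 4, 6], [1, 2, 2, 3])
print(f"W = {w:.3f}")
```

Under the null hypothesis, $$W$$ approximately follows an F distribution with $$1$$ and $$n_1 + n_2 - 2$$ degrees of freedom for two groups. The p-value that you compare with $$\alpha$$ comes from that distribution.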

## 7.4 Interpretation of the independent samples t-test

Once again, we look at the p-value:

• $$p \leq \alpha$$: we reject the null
• $$p > \alpha$$: we fail to reject the null

Conclusion

If we reject the null, we conclude that the population means of the two groups are not equal.
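To see where the p-value comes from, here is a minimal sketch that turns a t-calculated and its degrees of freedom into a two-sided p-value and then applies the decision rule. It assumes the pooled test, where $$df = n_1 + n_2 - 2$$, and it approximates the area under the t distribution numerically with Simpson's rule. The input values and the function names (`t_pdf`, `two_sided_p`, `decide`) are hypothetical. In practice you would read the p-value from the SPSS output.

```python
from math import exp, lgamma, log, pi

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + x * x / df))

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: area in both tails beyond |t| (steps must be even)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)   # Simpson weights
    central_area = 2 * (h / 3) * total   # area between -|t| and +|t|
    return 1 - central_area

def decide(p, alpha=0.05):
    """Apply the decision rule of this section."""
    return "reject the null" if p <= alpha else "fail to reject the null"

# Hypothetical inputs: a t-calculated of 2.19 with 6 degrees of freedom.
p = two_sided_p(2.19, 6)
print(f"p = {p:.3f}: we {decide(p)} at alpha = 0.05")
```

The default `alpha=0.05` is only a common convention. Use whatever significance level your analysis calls for.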

## 7.5 Exercise

First, I will illustrate the test by looking at differences in wages between married and non-married individuals.

Second, using the “earnings_data.sav” dataset, carry out the necessary procedures to check whether college graduates earn higher wages than non-college graduates.

1. What is your null hypothesis?
2. What is the alternative hypothesis?