Topic 7 Independent samples t-test
Before, with the one-sample t-test, we used our sample mean to test whether the population mean was equal to our guess for it.
Now, we are interested in testing if the population mean of a group is equal to the population mean of another group.
This is called the independent samples t-test.
We use this test to examine the means of one continuous variable (we call it the dependent variable) across two different groups defined by a categorical variable (we call it the independent variable).
7.1 Formula
Note that in the formulas below the subscripts 1 and 2 refer, respectively, to groups 1 and 2 defined by the categorical variable you are interested in.
Your t-calculated is now defined by:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{se_{diff}}$$
where
$$se_{diff} = \sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}$$
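and $s_p^2$, which is the pooled variance, is:
$$s_p^2 = \frac{SS_1 + SS_2}{df_1 + df_2}$$
Here $df_1 = n_1 - 1$ and $df_2 = n_2 - 1$ are the degrees of freedom of each group.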
Note that SS is the sum of squares for each group and can be calculated by:
$$SS_1 = \sum f \cdot (x_1 - \bar{x}_1)^2 \qquad SS_2 = \sum f \cdot (x_2 - \bar{x}_2)^2$$
You can read these values directly from the table you build to calculate the mean and standard deviation. See below:
x | f | fx | $x - \bar{x}$ | $(x - \bar{x})^2$ | $f(x - \bar{x})^2$ |
---|---|---|---|---|---|
… | … | … | … | … | … |
. | $\sum f$ | $\sum fx$ | | | $\sum$ above $= SS$ |
Therefore, to hand-calculate the independent samples t-test, you need to draw two of the above tables (one for each group). After that, you simply apply the formulas above.
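The hand calculation described above can be sketched in Python. This is only a minimal illustration under stated assumptions: the two frequency tables are made-up numbers, the function names `table_summary` and `pooled_t` are hypothetical, and each group's degrees of freedom are taken to be n − 1.

```python
from math import sqrt

def table_summary(freq):
    """Return (n, mean, SS) from a frequency table {value x: frequency f}."""
    n = sum(freq.values())                                   # sum of f
    mean = sum(x * f for x, f in freq.items()) / n           # sum of f*x divided by sum of f
    ss = sum(f * (x - mean) ** 2 for x, f in freq.items())   # sum of f*(x - mean)^2
    return n, mean, ss

def pooled_t(freq1, freq2):
    """Return (t, df) for the pooled-variance independent samples t-test."""
    n1, mean1, ss1 = table_summary(freq1)
    n2, mean2, ss2 = table_summary(freq2)
    df1, df2 = n1 - 1, n2 - 1                 # assumption: df = n - 1 per group
    sp2 = (ss1 + ss2) / (df1 + df2)           # pooled variance
    se_diff = sqrt(sp2 / n1 + sp2 / n2)       # standard error of the difference
    t = (mean1 - mean2) / se_diff
    return t, df1 + df2

# Made-up frequency tables, one per group (illustrative values only)
group1 = {2: 1, 4: 2, 6: 1}   # observations 2, 4, 4, 6
group2 = {5: 1, 7: 2, 9: 1}   # observations 5, 7, 7, 9

t, df = pooled_t(group1, group2)
print(f"t = {t:.3f}, df = {df}")
```

Each call to `table_summary` plays the role of one of the two tables: its three return values are the $\sum f$, the mean, and the SS you would read off the bottom row.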
7.2 Assumptions
For the independent samples t-test to be valid, three assumptions need to hold:
- Normality: The variable of interest should be normally distributed within each group.
  - We usually do not perform formal statistical tests to check for normality.
  - Comparing measures of central tendency and looking at histograms is probably the best way to check it.
- Independence: The observations in one group must be unrelated to the observations in the other group.
  - This is not simple to test.
  - Usually, we assume it holds if each group is composed of observations from different units. In other words, no individual may belong to both groups.
- Equality of variances: The population variances of the two groups must be equal.
  - If variances are equal, we say we have homoskedasticity.
  - If variances are not equal, we say we have heteroskedasticity.
  - We can test it!
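The informal normality check mentioned in the list above (central tendency plus a histogram) could be sketched as follows. The data are made up, and the helper names `central_tendency_gap` and `text_histogram` are hypothetical; in practice you would run this once per group and judge the result by eye.

```python
from statistics import mean, median

def central_tendency_gap(values):
    """Return (mean, median, mean - median).

    A gap close to zero is consistent with a symmetric distribution;
    a large gap suggests skewness.
    """
    m, md = mean(values), median(values)
    return m, md, m - md

def text_histogram(values, bin_width):
    """Return one text row per non-empty bin, with one '#' per observation."""
    counts = {}
    for v in values:
        start = (v // bin_width) * bin_width   # lower edge of the bin containing v
        counts[start] = counts.get(start, 0) + 1
    return [f"{start:>6} | " + "#" * counts[start] for start in sorted(counts)]

# Made-up wages for a single group (illustrative values only)
wages = [10, 12, 12, 14, 14, 14, 16, 16, 18]

m, md, gap = central_tendency_gap(wages)
print(f"mean = {m}, median = {md}, gap = {gap}")
for row in text_histogram(wages, bin_width=4):
    print(row)
```

Empty bins are not printed, so gaps in the data will not show up as blank rows in this simple version.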
7.3 Levene’s test for equality of variances
We test the following:
- Null hypothesis: population variances are equal
- Alternative hypothesis: population variances are not equal
This is an application of the F-test (we will cover it in the following weeks). Do not worry about the math behind it for now.
Notes:
- When you run your independent samples t-test, SPSS performs Levene’s test automatically!
- Homogeneity of variance test = Levene’s test
- The factor variable is the variable that defines your groups (our independent variable).
Interpretation
As before, we will look at the p-value:
- If p ≤ α, we reject the null.
- If p > α, we fail to reject the null.
Conclusion
If we reject the null, we conclude that the population variances are not equal. Therefore, the equality of variances assumption (the third assumption) for the independent samples t-test does not hold.
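Outside SPSS, the same kind of check could be sketched in Python with SciPy's `scipy.stats.levene`. The wage values below are made up, and `center="mean"` is chosen on the assumption that it corresponds to the classic mean-based version of Levene's test; SciPy's default is the median-based variant.

```python
from scipy import stats

# Made-up hourly wages for the two groups (illustrative values only)
married = [22.0, 25.5, 19.0, 30.0, 27.5, 24.0, 21.0, 26.0]
not_married = [18.0, 35.0, 12.5, 40.0, 22.0, 9.0, 31.0, 15.5]

alpha = 0.05

# Null hypothesis: the population variances are equal
stat, p_value = stats.levene(married, not_married, center="mean")

# Decision rule: reject the null when p <= alpha
variances_equal = p_value > alpha

print(f"Levene statistic = {stat:.3f}, p = {p_value:.3f}")
if variances_equal:
    print("Fail to reject the null: no evidence that the variances differ.")
else:
    print("Reject the null: the equality of variances assumption does not hold.")
```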
7.4 Interpretation of the independent samples t-test
Once again, we look at the p-value:
- If p ≤ α, we reject the null.
- If p > α, we fail to reject the null.
Conclusion
If we reject the null, we conclude that the population means of the two groups are not equal.
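As a rough sketch of this decision rule in Python, SciPy's `scipy.stats.ttest_ind` with `equal_var=True` runs the pooled-variance test and returns a two-sided p-value. The data are made up and the wrapper name `independent_t_test` is hypothetical.

```python
from scipy import stats

def independent_t_test(group1, group2, alpha=0.05):
    """Pooled-variance independent samples t-test with a two-sided p-value."""
    # equal_var=True assumes the equality of variances assumption holds
    t_stat, p_value = stats.ttest_ind(group1, group2, equal_var=True)
    decision = "reject the null" if p_value <= alpha else "fail to reject the null"
    return t_stat, p_value, decision

# Made-up observations for the two groups (illustrative values only)
group1 = [2, 4, 4, 6]
group2 = [5, 7, 7, 9]

t_stat, p_value, decision = independent_t_test(group1, group2)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}: {decision}")
```

With these numbers the p-value falls below 0.05, so the sketch rejects the null of equal population means at α = 0.05.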
7.5 Exercise
First, I will illustrate the test by looking at differences in wages between married and non-married individuals.
Second, using the “earnings_data.sav” file, carry out the necessary procedures to check whether college graduates earn higher wages than non-college graduates.
- What is your null hypothesis?
- What is the alternative hypothesis?
- What is your alpha?
- Interpret your p-value.