# Topic 3 Z-scores

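Z-scores are transformations of the data that put the values on a standardized scale: each Z-score tells us how many standard deviations a value lies above or below the mean.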

Here is how we calculate Z-scores:

$$z_i=\frac{x_i-\overline{x}}{sd}$$

It is important to note that Z-scores always have:

• Mean = 0
• SD = 1

Standardizing only shifts and rescales the values, so the shape of the Z-score distribution is the same as the shape of the distribution of the original values. We can check this with SPSS using the commands from Labs 1 and 2.
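As a rough illustration outside SPSS, the formula above can be sketched in Python with the standard library. The data values are hypothetical, and the sketch assumes the sample standard deviation (the $$n-1$$ denominator) is the one used for $$sd$$.

```python
import statistics

# Hypothetical sample values, used only for illustration.
x = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]

mean_x = statistics.mean(x)
sd_x = statistics.stdev(x)  # sample SD (n - 1 denominator), an assumption

# Apply z_i = (x_i - mean) / sd to every value.
z = [(xi - mean_x) / sd_x for xi in x]

# The Z-scores should have mean 0 and SD 1 (up to floating-point error).
z_mean = statistics.mean(z)
z_sd = statistics.stdev(z)
print(round(z_mean, 6), round(z_sd, 6))
```

Because every value is shifted by the same mean and divided by the same SD, the ordering of the values and the relative spacing between them are unchanged, which is why the shape of the distribution is preserved.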

## 3.1 The Z-distribution

What if our original distribution was a normal distribution?

Because Z-scores standardize the values of any distribution, we can use them to generalize the properties of ANY normal distribution.

This idea is fundamental for inferential statistics.

The Z-distribution, therefore, is a distribution of Z-scores created from the values of a perfectly normal distribution.
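A small sketch of why this generalization works, using Python's `statistics.NormalDist`. The mean of 100 and SD of 15 are hypothetical values chosen for illustration: the share of values below $$x$$ in that normal distribution equals the share of values below the corresponding Z-score in the Z-distribution.

```python
from statistics import NormalDist

# Hypothetical normal distribution: mean 100, SD 15 (illustrative only).
mu, sigma = 100.0, 15.0
original = NormalDist(mu=mu, sigma=sigma)
standard = NormalDist()  # the Z-distribution: mean 0, SD 1

x = 130.0
z = (x - mu) / sigma  # Z-score of x

# Proportion of values below x in the original distribution ...
p_original = original.cdf(x)
# ... and proportion of values below z in the Z-distribution.
p_standard = standard.cdf(z)

print(z, round(p_original, 4), round(p_standard, 4))
```

The two proportions agree, so one table for the Z-distribution is enough to answer questions about any normal distribution.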

## 3.2 Important concepts

### 3.2.1 Percentiles

Remember what percentile meant when you took the SATs or ACTs?

For example, if the $$95^{th}$$ percentile is 750, then 95% of all values fall below 750.

Notation: The $$X^{th}$$ percentile of a given variable is often referred to as $$C_{X}$$.

• E.g. $$C_{95}$$ is the $$95^{th}$$ percentile of a given variable.
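To make the definition concrete, here is a minimal sketch that computes $$C_{95}$$ for a made-up set of scores and checks what share of the values falls below it. The scores are hypothetical, and `statistics.quantiles` uses one particular interpolation rule, so other software may report a slightly different value.

```python
import statistics

# Hypothetical test scores: 200, 210, ..., 800.
scores = list(range(200, 801, 10))

# With n=100, quantiles() returns the 99 cut points C_1, ..., C_99.
cuts = statistics.quantiles(scores, n=100)
c95 = cuts[94]  # index 94 holds the 95th percentile

# Share of the scores that fall below C_95.
share_below = sum(s < c95 for s in scores) / len(scores)
print(c95, round(share_below, 3))
```

With a small sample the share is only approximately 95%, since percentiles have to be interpolated between observed values.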

### 3.2.2 Quartiles

Quartiles divide the ordered data into 4 equal parts. They correspond to the $$25^{th}$$, $$50^{th}$$, $$75^{th}$$ and $$100^{th}$$ percentiles.

Notation: There are only 4 quartiles. We denote them as $$Q_{1}$$ through $$Q_{4}$$.

• 1st quartile = $$25^{th}$$ percentile = $$Q_{1}$$
• 2nd quartile = $$50^{th}$$ percentile = $$Q_{2}$$
• 3rd quartile = $$75^{th}$$ percentile = $$Q_{3}$$
• 4th quartile = $$100^{th}$$ percentile = $$Q_{4}$$

Note: Since the 4th quartile is the highest value, we are often more concerned about $$Q_{1}$$, $$Q_{2}$$ and $$Q_{3}$$.
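The quartiles can be sketched the same way. The values below are hypothetical, and the exact cut points depend on the interpolation rule, so SPSS may not match `statistics.quantiles` digit for digit.

```python
import statistics

# Hypothetical data values, already sorted for readability.
values = [1, 3, 4, 6, 7, 8, 10, 12, 15, 18, 21]

# With n=4, quantiles() returns the three cut points Q1, Q2 and Q3.
q1, q2, q3 = statistics.quantiles(values, n=4)

# Q4 is the 100th percentile, i.e. the highest value.
q4 = max(values)

print(q1, q2, q3, q4)
```

In this sketch $$Q_{2}$$ coincides with the median, which is another name for the $$50^{th}$$ percentile.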

## 3.3 The empirical rules of the normal curve

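Recall the rules of the normal curve from lecture: about 68%, 95% and 99.7% (often rounded to 99%) of the values fall within 1, 2 and 3 standard deviations of the mean, respectively. Keep them in mind as we discuss Z-scores.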
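These rules can be checked numerically. The sketch below uses the cumulative distribution function of the standard normal from Python's standard library to compute the share of values within 1, 2 and 3 standard deviations of the mean.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, SD 1

# Share of values with a Z-score between -k and +k, for k = 1, 2, 3.
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, share in within.items():
    print(f"within {k} SD of the mean: {share:.4f}")
```

The results are close to 0.68, 0.95 and 0.997, which is where the rounded figures in the rule come from.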

## 3.4 The Z-table

Because the standard normal distribution is used so often, it is useful to have a table that summarizes its percentiles.

This is what Z-tables do. They give you the percentile for a given z value in a perfectly normal distribution. See the z-table at NYU classes.
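A Z-table lookup can be imitated in code, which may help when checking a value read from the table. This is a sketch only: the helper names `percentile_of` and `z_for_percentile` are hypothetical, and a printed table rounds its entries, so small differences are expected.

```python
from statistics import NormalDist

std_normal = NormalDist()  # the Z-distribution: mean 0, SD 1

def percentile_of(z):
    """Percentile (0-100) of a given z value, like reading the table forwards."""
    return 100 * std_normal.cdf(z)

def z_for_percentile(p):
    """z value at a given percentile (0-100), like reading the table backwards."""
    return std_normal.inv_cdf(p / 100)

for z in (-1.0, 0.0, 1.0, 2.0):
    print(f"z = {z:+.2f} -> percentile {percentile_of(z):.2f}")

print(f"75th percentile -> z = {z_for_percentile(75):.4f}")
```

Reading the table forwards corresponds to `percentile_of`, and searching the body of the table for a percentile and reading off the z value corresponds to `z_for_percentile`.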

## 3.5 Exercises

Part 1

Open the data set “Standard normal_N_10000.sav” from NYU classes.

This is a standard normal distribution with $$n=10000$$. We will use it to illustrate the properties of the standard normal and to understand the Z-table.

Using SPSS, perform the following commands:

1. Calculate measures of central tendency and measures of dispersion for the $$x$$ variable.
2. Check the histogram for the $$x$$ variable. Be sure to overlay a normal curve on it.
3. Create Z-scores for the values of $$x$$. SPSS will create the variable $$Zx$$.
4. Calculate measures of central tendency and measures of dispersion for the $$Zx$$ variable.
5. Check the histogram for the $$Zx$$ variable. Be sure to overlay a normal curve on it.
6. Calculate the 90th, 95th, 97.5th and 99th percentiles of the $$Zx$$ distribution. Find those values in the Z-table.

Part 2

Open the data set “earnings_data.sav” from NYU classes.

Perform the calculations in Part 1 for the variable $$wages$$.