# Topic 3 Z-scores

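Z-scores are transformations of the data that put values on a standardized scale: each Z-score expresses how many standard deviations a value lies above or below the mean.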

Here is how we calculate Z-scores:

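\[\begin{equation} z_i=\frac{x_i-\overline{x}}{sd} \end{equation}\]

where \(x_i\) is an individual value, \(\overline{x}\) is the mean and \(sd\) is the standard deviation of the variable.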

It is important to note that Z-scores always have:

- Mean = 0
- SD = 1

The Z-score distribution has the same shape as the distribution of the original values: standardizing shifts and rescales the data, but it does not change the skewness or the number of peaks. We can check this with SPSS using the commands from Labs 1 and 2.
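The calculation can also be sketched outside SPSS. The Python snippet below is only an illustration: the values in `x` are hypothetical, and it assumes the sample standard deviation (with \(n-1\) in the denominator) is the one used to standardize.

```python
import statistics

# Hypothetical sample values (not taken from the lab data sets).
x = [52, 61, 47, 70, 58, 66, 49, 77]

mean_x = statistics.mean(x)
sd_x = statistics.stdev(x)  # sample SD, n - 1 in the denominator (assumption)

# z_i = (x_i - mean) / sd
z = [(xi - mean_x) / sd_x for xi in x]

# Up to floating-point rounding, the mean is 0 and the SD is 1.
print(statistics.mean(z))
print(statistics.stdev(z))

# The ordering of the observations is unchanged, so the shape is preserved.
print(sorted(range(len(x)), key=x.__getitem__) ==
      sorted(range(len(z)), key=z.__getitem__))
```

Whatever values are placed in `x`, the two summary statistics of `z` should come out as 0 and 1, apart from tiny rounding differences.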

## 3.1 The Z-distribution

What if our original distribution was a normal distribution?

Because Z-scores standardize the values of any distribution, we can use them to generalize the properties of ANY normal distribution.

This idea is fundamental for inferential statistics.

The Z-distribution, therefore, is a distribution of Z-scores created from the values of a perfectly normal distribution.
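To see why this matters, the sketch below compares two normal distributions with very different scales. The means and standard deviations are made up for illustration; the point is that a value 1.5 standard deviations above the mean has the same proportion of values below it in both, and that proportion can be read from the Z-distribution.

```python
from statistics import NormalDist

# Two hypothetical normal distributions on very different scales.
heights = NormalDist(mu=170, sigma=10)
scores = NormalDist(mu=500, sigma=100)
standard = NormalDist(mu=0, sigma=1)  # the Z-distribution

z = 1.5
for dist in (heights, scores):
    x = dist.mean + z * dist.stdev      # value 1.5 SDs above the mean
    print(x, round(dist.cdf(x), 4))     # proportion of values below x

print(z, round(standard.cdf(z), 4))     # same proportion, read from Z
```

All three lines report the same proportion, which is why a single table for the Z-distribution is enough for any normal distribution.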

## 3.2 Important concepts

### 3.2.1 Percentiles

Remember what percentile meant when you took the SATs or ACTs?

A \(95^{th}\) percentile of 750 means that 95% of all values fall below 750.

**Notation:** The \(X^{th}\) percentile of a given variable is often referred to as \(C_{X}\).

- E.g. \(C_{95}\) is the \(95^{th}\) percentile of a given variable.
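As a rough illustration of the definition, the Python sketch below simulates hypothetical test scores, finds \(C_{95}\), and checks what share of the scores falls below it. The simulated data and the variable names are assumptions made for this example only.

```python
import random
from statistics import quantiles

random.seed(0)
# Hypothetical test scores, simulated for illustration only.
scores = [random.gauss(500, 100) for _ in range(10_000)]

# n=100 returns the 99 cut points C_1 ... C_99.
cuts = quantiles(scores, n=100)
c95 = cuts[94]  # C_95 sits at index 94 because indexing starts at 0

# Share of scores that fall below C_95; this should be close to 0.95.
share_below = sum(s < c95 for s in scores) / len(scores)
print(round(c95, 1), round(share_below, 3))
```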

### 3.2.2 Quartiles

Quartiles divide the data into 4 equal parts. They correspond to four percentiles: the \(25^{th}\) percentile; the \(50^{th}\) percentile; the \(75^{th}\) percentile; the \(100^{th}\) percentile.

**Notation:** There are only 4 quartiles. We refer to them as \(Q_{X}\), where \(X\) runs from 1 to 4.

- 1st quartile = \(25^{th}\) percentile = \(Q_{1}\)
- 2nd quartile = \(50^{th}\) percentile = \(Q_{2}\)
- 3rd quartile = \(75^{th}\) percentile = \(Q_{3}\)
- 4th quartile = \(100^{th}\) percentile = \(Q_{4}\)

**Note:** Since the 4th quartile is simply the highest value in the data, we are usually more interested in \(Q_{1}\), \(Q_{2}\) and \(Q_{3}\).
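A small sketch of the quartiles in Python, using made-up values. Note that statistical packages use slightly different interpolation rules for percentiles, so SPSS may report slightly different cut points for the same data.

```python
from statistics import median, quantiles

# Hypothetical values, chosen only to illustrate the cut points.
data = [2, 4, 4, 5, 7, 8, 9, 11, 12, 15, 18]

q1, q2, q3 = quantiles(data, n=4)  # the three interior cut points
q4 = max(data)                     # Q_4 is the highest value

print(q1, q2, q3, q4)
print(q2 == median(data))          # Q_2 is the median
```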

## 3.3 The empirical rules of the normal curve

Recall the rules of the normal curve from lecture: about 68%, 95% and 99.7% of values fall within 1, 2 and 3 standard deviations of the mean, respectively. Keep them in mind as we discuss Z-scores.
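These figures can be checked against the standard normal distribution directly. The sketch below uses Python's `statistics.NormalDist` to compute the share of values within 1, 2 and 3 standard deviations of the mean; the third figure comes out closer to 99.7% than to 99%.

```python
from statistics import NormalDist

z_dist = NormalDist()  # mean 0, SD 1

for k in (1, 2, 3):
    # Area under the curve between -k and +k standard deviations.
    share = z_dist.cdf(k) - z_dist.cdf(-k)
    print(f"within {k} SD: {share:.4f}")
```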

## 3.4 The Z-table

Because the standard normal distribution is used very often, it is useful to have a table that summarizes its percentiles.

This is what Z-tables do. They give you the percentile for a given z value in a perfectly normal distribution. See the z-table at NYU classes.
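A Z-table lookup can be mimicked in a few lines. The sketch below is not a replacement for the course table; it assumes a table that reports the proportion of values below a given z, and uses `NormalDist.cdf` for the forward lookup and `NormalDist.inv_cdf` for the reverse one.

```python
from statistics import NormalDist

z_dist = NormalDist()  # standard normal: mean 0, SD 1

# Forward lookup: z value -> proportion of values below it.
for z in (0.0, 1.0, 1.96):
    print(z, round(z_dist.cdf(z), 4))

# Reverse lookup: percentile -> z value.
print(round(z_dist.inv_cdf(0.975), 2))
```

For example, z = 1.96 corresponds to roughly the 97.5th percentile, and looking up the 97.5th percentile returns z = 1.96.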

## 3.5 Exercises

**Part 1**

Open the data set “Standard normal_N_10000.sav” from NYU classes.

This is a standard normal distribution with \(n=10000\). We will use it to illustrate the properties of the standard normal and to understand the Z-table.

Using SPSS, perform the following commands:

- Calculate measures of central tendency and measures of dispersion for the \(x\) variable.
- Check the histogram for the \(x\) variable. Be sure to overlay a normal curve on it.
- Create Z-scores for the values of \(x\). SPSS will create the variable \(Zx\).
- Calculate measures of central tendency and measures of dispersion for the \(Zx\) variable.
- Check the histogram for the \(Zx\) variable. Be sure to overlay a normal curve on it.
- Calculate the 90th, 95th, 97.5th and 99th percentiles of the \(Zx\) distribution. Find those values in the Z-table.

**Part 2**

Open the data set “earnings_data.sav” from NYU classes.

Perform the calculations in Part 1 for the variable \(wages\).