Chapter 10 MORAN’S I: AN INDEX OF AUTOCORRELATION

In this section, the exploratory approaches of Section 7.3 will be taken a step further. As stated earlier, autocorrelation is the tendency of the \(z_i\) values of nearby polygons to be related. Just as the Pearson correlation coefficient measures the dependency between a pair of variables, there are coefficients (or indices) that measure autocorrelation. One that is very commonly used is the Moran’s \(I\) coefficient (???), defined by

\[ I = \frac{n}{\sum_i \sum_j w_{ij}} \frac{\sum_i \sum_j w_{ij} (z_i - \bar{z}) (z_j - \bar{z})}{\sum_i (z_i - \bar{z})^2} \label{morani} \]

where \(w_{ij}\) is the \(i,j\)th element of a weights matrix \(\mathbf{W}\), specifying the degree of dependency between polygons \(i\) and \(j\). As before, this could be a neighbour indicator, so that \(w_{ij} = 1\) if polygons \(i\) and \(j\) are neighbours and \(w_{ij} = 0\) otherwise, or the rows of the matrix could be standardised to sum to 1, in which case \(\mathbf{Wz}\) is the vector of lagged means \(\tilde{z}_i\) as defined in section 7.3.

It is also worth noting that, if \(\mathbf{W}\) is standardised so that its rows sum to 1, then \(\sum_i \sum_j w_{ij} = n\). In this case, the equation simplifies to

\[ I = \frac{\sum_i \sum_j w_{ij} q_i q_j }{\sum_i q_i^2} \label{morani2} \]

where \(q_i = z_i - \bar{z}\), that is, \(q_i\) is \(z_i\) re-centred around the mean value of \(z\). If the vector of \(q_i\)’s is written as \(\mathbf{q}\), then in vector-matrix form the equation may be written as

\[ I = \frac{\mathbf{q}^T\mathbf{Wq}}{\mathbf{q}^T\mathbf{q}} \label{moranvec} \]

It may be checked that if the \(q_i\)’s are plotted in a lagged mean plot, as in the figures of section 7.3, and a regression line is fitted, then \(I\) is the slope of this line. This helps to interpret the coefficient: larger values of \(I\) suggest a stronger relationship between nearby \(z_i\) values. Furthermore, \(I\) may be negative in some circumstances, suggesting that there can be a degree of inverse correlation between nearby \(z_i\) values, giving a checkerboard pattern on the map. For example, a company may choose to site a chain of stores so that they are spread evenly across the state, so that the occurrence of a store in one county may imply that there is no store in a neighbouring county.
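To make the vector-matrix form concrete, Moran’s \(I\) can be computed directly in base R. The small neighbour matrix and data below are hypothetical, chosen to give the checkerboard-style pattern just described:

```r
# Moran's I via the vector-matrix form I = q'Wq / q'q, for a
# row-standardised W.  Hypothetical 4-polygon checkerboard layout:
# polygons 1 and 3 neighbour polygons 2 and 4, and vice versa.
W <- matrix(c(0, 1, 0, 1,
              1, 0, 1, 0,
              0, 1, 0, 1,
              1, 0, 1, 0), nrow = 4, byrow = TRUE)
W <- W / rowSums(W)            # row-standardise
z <- c(2, 4, 3, 5)             # high values alternate with low ones
q <- z - mean(z)               # re-centre around the mean
I <- as.numeric(t(q) %*% W %*% q / (t(q) %*% q))
I                              # -0.8: negative, reflecting the checkerboard
```

Here \(I\) comes out negative, as the checkerboard arrangement of high and low values suggests it should.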

10.1 Moran’s-\(I\) in R

The package spdep provides functions to evaluate Moran’s-\(I\) for a given data set and \(\mathbf{W}\) matrix. As noted in the earlier information box, it is sometimes more effective to store the \(\mathbf{W}\) matrix in listw form, and this is done for the computation of Moran’s \(I\) here. The function used to compute Moran’s \(I\) is called moran.test, and can be used as below:

    moran.test(penn.state.utm$smk, penn.state.lw)

    Moran I test under randomisation

data:  penn.state.utm$smk  
weights: penn.state.lw    

Moran I statistic standard deviate = 5.4175, p-value
= 3.022e-08
alternative hypothesis: greater
sample estimates:
Moran I statistic       Expectation          Variance 
      0.404431265      -0.015151515       0.005998405 

The above output supplies more than the Moran’s \(I\) estimate itself, but for now note that the value is about 0.404 for the Penn State smoking uptake data.

This is fine, but one problem is deciding whether the above value of \(I\) is sufficiently high to suggest that an autocorrelated process model is a plausible alternative to an assumption that the smoking uptake rates are independent. There are two issues here:

  1. Is this value of \(I\) a relatively large level on an absolute scale?
  2. How likely is the observed \(I\) value, or a larger one, to occur if the rates were independent?

The first of these is a benchmarking problem. Like correlation, Moran’s \(I\) is a dimensionless quantity, so that, for example, with a given set of polygons and associated \(\mathbf{W}\)-matrix, area-normalised rates of rainfall would have the same Moran’s \(I\) regardless of whether rainfall was measured in millimetres or inches. However, while correlation is always restricted to lie within the range \([-1,1]\), making, say, a value of 0.8 reasonably easy to interpret, the range of Moran’s \(I\) varies with the \(\mathbf{W}\)-matrix. The maximum and minimum values of \(I\) are shown (???) to be the maximum and minimum eigenvalues (???) of \((\mathbf{W} + \mathbf{W}^T)/2\). For the \(\mathbf{W}\)-matrix here, \(I\) can range between -0.579 and 1.020. Thus, on an absolute scale, the reported value suggests a reasonable degree of spatial autocorrelation.

An R function to find the maximum and minimum \(I\)-values from a listw object is defined below. The function listw2mat converts a listw object to a matrix.
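A possible implementation is sketched below (the function name moran.range is an assumption, not part of spdep); it applies the eigenvalue result quoted above, converting a listw object to a matrix with listw2mat when one is supplied:

```r
# Sketch: attainable range of Moran's I for a given weights specification,
# i.e. the extreme eigenvalues of (W + W^T)/2.
moran.range <- function(lw) {
  # Accept either a listw object (converted via spdep::listw2mat)
  # or an ordinary W matrix
  wmat <- if (inherits(lw, "listw")) spdep::listw2mat(lw) else lw
  range(eigen((wmat + t(wmat)) / 2, only.values = TRUE)$values)
}
# moran.range(penn.state.lw)   # approximately -0.579 and 1.020
```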

However, the null hypothesis statement of ‘no spatial autocorrelation’ is quite broad, and two more specific hypotheses will be considered here. The first is the assumption that each \(z_i\) is drawn from an independent Gaussian distribution, with mean \(\mu\) and variance \(\sigma^2\). Under this assumption, it can be shown that \(I\) is approximately normally distributed with mean \(\textrm{E}(I) = -1/(n - 1)\). The variance of this distribution is quite complex; readers interested in seeing the formula could consult, for example, (???). If the variance is denoted \(V_{\textrm{Norm}}(I)\) then the test statistic is

\[ \frac{I - \textrm{E}(I)}{\sqrt{V_\textrm{Norm}(I)}} \label{normZ} \]

This will be approximately normally distributed with mean zero and variance one, so that \(p\)-values may be obtained by comparison with the standard normal distribution.

The other form of the test is a more formal working of the randomisation idea set out in section 7.3. In this case, no assumption is made about the distribution of the \(z_i\)’s; instead, it is assumed that any permutation of the \(z_i\)’s against the polygons is equally likely. Thus, the null hypothesis is still one of ‘no spatial pattern’, but it is conditional on the observed data. Under this hypothesis, it is also possible to compute the mean and variance of \(I\). As before, the expected value of \(I\) is \(\textrm{E}(I) = -1/(n-1)\); the formula for the variance differs from that under the normality assumption, but is also complex, and again is given in (???). If this variance is denoted by \(V_{\textrm{Rand}}(I)\) then the test statistic is

\[ \frac{I - \textrm{E}(I)}{\sqrt{V_\textrm{Rand}(I)}} \textrm{ .} \label{randZ} \]

In this case, the distribution of the test statistic is also close to the Normal distribution, and the quantity in this expression can also be compared to the normal distribution with mean zero and variance one to obtain \(p\)-values. Both kinds of test are available in R via the moran.test function shown earlier. As noted earlier, as well as the Moran’s-\(I\) statistic itself, this function prints out some further information. In particular, looking again at this output, it can be seen that the expectation, variance and test statistic for the Moran’s-\(I\) statistic are output (the test statistic is labelled ‘Moran I statistic standard deviate’), as well as the associated \(p\)-value. By default, the output refers to the randomisation hypothesis, that is, \(V_\textrm{Rand}(I)\) is used. Thus, looking at the output from moran.test(penn.state.utm$smk,penn.state.lw) again, it can be seen that there is strong evidence to reject the randomisation null hypothesis in favour of an alternative hypothesis of \(I > 0\) for the smoking uptake rates.
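The standard deviate and \(p\)-value in the moran.test output can be reproduced directly from the reported estimates, which makes the mechanics of the test clear (the numbers below are copied from the randomisation output above):

```r
# Reproduce the 'standard deviate' and p-value from the moran.test output:
# standardise I by its null expectation and the square root of its null
# variance, then compare with the standard normal (alternative: I > 0).
I.obs <- 0.404431265
E.I   <- -0.015151515
V.I   <- 0.005998405
z.stat <- (I.obs - E.I) / sqrt(V.I)
p.val  <- pnorm(z.stat, lower.tail = FALSE)
round(z.stat, 4)     # 5.4175, as reported
```

The resulting \(p\)-value matches the value of about 3.0e-08 printed by moran.test.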

The keyword argument randomisation allows the Normal distribution assumption, and hence \(V_\textrm{Norm}(I)\), to be used instead:

    moran.test(penn.state.utm$smk, penn.state.lw, randomisation = FALSE)

    Moran I test under normality

data:  penn.state.utm$smk  
weights: penn.state.lw    

Moran I statistic standard deviate = 5.4492, p-value
= 2.53e-08
alternative hypothesis: greater
sample estimates:
Moran I statistic       Expectation          Variance 
      0.404431265      -0.015151515       0.005928887 

From this, it can be seen that there is also strong evidence to reject the null hypothesis of \(z_i\)’s being independently normally distributed, again in favour of an alternative that \(I > 0\).

10.2 A Simulation-Based Approach

The previous tests approximate the test statistic by a normal distribution with mean 0 and variance 1. However, this distribution is asymptotic – that is, as \(n\) increases, the actual distribution of the test statistic gets closer to the normal distribution. The rate at which this happens is affected by the arrangement of the polygons – essentially, in some cases, the value of \(n\) for which a normal approximation is reasonable is lower than for others (???; ???).

For this reason, it may be reasonable to employ a simulation-based approach here, instead of a theoretical but approximate one. In this approach, which applies to the permutation-based hypothesis, a number of random permutations (say 10,000) of the data are drawn and assigned to polygons, using the sample function in R, as in section 7.3. For each randomly drawn permutation, Moran’s \(I\) is computed. This provides a simulated sample of draws of Moran’s-\(I\) under the randomisation null hypothesis. The true Moran’s-\(I\) is then computed from the data. If the null hypothesis is true, then the probability of drawing the observed data is the same as that of any other permutation of the \(z_i\)’s amongst the polygons. Thus, if \(m\) is the number of simulated Moran’s-\(I\) values exceeding the observed one, and \(M\) is the total number of simulations, then the probability of obtaining the observed Moran’s-\(I\) or a greater one is

\[ p = \frac{m+1}{M+1} \label{mc} \]
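The whole procedure can be sketched in base R, using the vector-matrix form of \(I\) from earlier in the chapter (the small row-standardised \(\mathbf{W}\) matrix and data z here are hypothetical):

```r
# Base-R sketch of the permutation test: compare the observed Moran's I
# with values for random permutations of z over the polygons.
moran.I <- function(z, W) {        # I = q'Wq / q'q for row-standardised W
  q <- z - mean(z)
  as.numeric(t(q) %*% W %*% q / (t(q) %*% q))
}
# Hypothetical: 4 polygons in a row, each neighbouring the adjacent one(s)
W <- matrix(c(0,   1,   0,   0,
              0.5, 0,   0.5, 0,
              0,   0.5, 0,   0.5,
              0,   0,   1,   0), nrow = 4, byrow = TRUE)
z <- c(1, 2, 3, 4)
set.seed(123)                      # reproducible permutations
I.obs <- moran.I(z, W)
M     <- 9999
I.sim <- replicate(M, moran.I(sample(z), W))  # randomisation draws
m     <- sum(I.sim >= I.obs)       # at least as large as observed
p     <- (m + 1) / (M + 1)
```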

This methodology is due to (???). The function moran.mc in spdep allows this to be computed:

    moran.mc(penn.state.utm$smk, penn.state.lw, 10000)

    Monte-Carlo simulation of Moran I

data:  penn.state.utm$smk 
weights: penn.state.lw  
number of simulations + 1: 10001 

statistic = 0.40443, observed rank = 10001, p-value =
9.999e-05
alternative hypothesis: greater

Note that the third argument provides the number of simulations. Once again, there is evidence to reject the null hypothesis that any permutation of the \(z_i\)’s is equally likely, in favour of the alternative that \(I > 0\).