Today, we’re going to look at *local spatial autocorrelation*. Like a kind of *outlier* diagnostic, local spatial autocorrelation measures how the *local* structure of a spatial relationship around each site either conforms to or departs from what we expect. Together, *local* spatial statistics are a general branch of statistics that aims to characterize the relationship between *a single observation* and the sites surrounding it.

Often, *local* spatial autocorrelation is contrasted with *global* spatial autocorrelation, which is the structural relationship between *sites* (in the abstract) and their *surroundings* (again, in the abstract). This structure may differ strongly depending on what we take to *surround* each observation. Thus, local statistics are an attempt to measure the geographical behavior of a given social, physical, or behavioral process around each observation.

```
library(sf)
library(mosaic)
bristol = sf::st_read('./data/bristol-imd.shp')
```

```
## Reading layer `bristol-imd' from data source `/home/lw17329/OneDrive/teaching/second_year_methods/spatial-practicals/data/bristol-imd.shp' using driver `ESRI Shapefile'
## Simple feature collection with 263 features and 12 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: 350239.4 ymin: 166638.9 xmax: 364618.1 ymax: 183052.8
## epsg (SRID): 27700
## proj4string: +proj=tmerc +lat_0=49 +lon_0=-2 +k=0.9996012717 +x_0=400000 +y_0=-100000 +ellps=airy +towgs84=446.448,-125.157,542.06,0.15,0.247,0.842,-20.489 +units=m +no_defs
```

First, though, let’s refresh our understanding of plain, a-spatial *correlation*, and how our knowledge of outliers works in two dimensions for a standard relationship between two variables. Here, let’s look at the relationship between housing deprivation and crime in Bristol:

```
correlation = cor.test(housing ~ crime, data=bristol)
correlation
```

```
##
## Pearson's product-moment correlation
##
## data: housing and crime
## t = 7.8199, df = 261, p-value = 1.309e-13
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.3322359 0.5287749
## sample estimates:
## cor
## 0.4356841
```

While there is some correlation between these two scores (areas with more housing deprivation tend to have higher crime), we can *also* see that not every observation agrees with this trend. Namely, we can highlight one such observation with a really low crime score, but relatively high housing deprivation:

```
plot(housing ~ crime, data=bristol, pch=19)
abline(lm(housing ~ crime, data=bristol),
       col='orangered', lwd=2)
abline(v=mean(bristol$crime), lwd=1.5, col='slategrey', lty='dashed')
abline(h=mean(bristol$housing), lwd=1.5, col='slategrey', lty='dashed')
housing_outlier = bristol[bristol$LSOA11CD == 'E01014714',]
points(housing_outlier$crime, housing_outlier$housing, col='red', pch=20)
```

- There are about 4 values with our possible outlier’s level of `housing` deprivation.
  - Approximately (using your eyeballs, not `R`), what’s the mean of those four observations’ `crime` levels?
  - Is this mean substantially different from the value at our potential outlier?
- There are approximately three observations with the same `crime` levels as our candidate outlier.
  - Approximately (using your eyeballs, not `R`), what’s the mean of those three observations’ `housing` deprivation?
  - Is this mean substantially different from the value at our potential outlier?
- Make a map of Bristol `housing` and `crime` values. Highlight the potential outlier (don’t forget, named `housing_outlier`), in white. Describe the difference in the two maps around the `housing_outlier`. Where is the `housing_outlier`?
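If you are stuck on the mapping exercise, one possible sketch uses `sf` plotting directly; the column names and the `housing_outlier` object come from the code above, but the styling choices here are only suggestions:

```
# map the housing deprivation score; reset=FALSE lets us add layers on top
plot(bristol['housing'], reset=FALSE)
# overlay the candidate outlier's geometry in white
plot(sf::st_geometry(housing_outlier), col='white', add=TRUE)

# repeat for the crime score
plot(bristol['crime'], reset=FALSE)
plot(sf::st_geometry(housing_outlier), col='white', add=TRUE)
```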

Humans do a few things when they “see” outliers in a scatterplot. Mainly, our intuition about which observations are (visual) outliers comes from the distance between an observation and the center of the data being analyzed. This is why we are apt to think that the red observation is an outlier, but the blue one may not be.
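We can make this intuition concrete with a short sketch: if we standardize both variables so their scales are comparable, each observation’s distance from the center of the scatterplot is just its distance from the origin in standardized coordinates. (The `LSOA11CD` code below is the candidate outlier identified earlier.)

```
# standardize both scores so the two axes are comparable
z_housing = scale(bristol$housing)
z_crime = scale(bristol$crime)

# Euclidean distance of each site from the center of the point cloud
center_distance = sqrt(z_housing^2 + z_crime^2)

# the candidate outlier should sit relatively far from the center
center_distance[bristol$LSOA11CD == 'E01014714']
```

Observations with a large `center_distance` are the ones our eyes tend to flag as outliers, regardless of which direction from the center they lie in.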