Testing for covariance and correlation

Correlation

It's easy to compute a correlation with the "cor()" function, which returns Pearson's correlation coefficient by default. Just pass the two vectors to the function.

TheVector1=rnorm(1000,mean=0,sd=1)
TheVector2=rnorm(1000,mean=5,sd=1)
TheCorrelation=cor(TheVector1, TheVector2)

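The result ranges from -1 to 1: it is less than 0 for negative correlation and greater than 0 for positive correlation. The magnitude of the value indicates how strong the correlation is. Because the two vectors above are drawn independently of each other, their correlation should be close to 0.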
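Note that cor() only reports the coefficient. As a minimal sketch of a significance test, base R's cor.test() returns the estimate together with a p-value. The variable names, the fixed seed, and the linear relationship below are assumptions for illustration, not part of the original example.

```r
set.seed(42)  # fixed seed so the sketch is repeatable (an assumption)

# Y depends on X plus random noise, so a positive correlation is expected
X = rnorm(1000, mean=0, sd=1)
Y = 2*X + rnorm(1000, mean=0, sd=1)

TheTest = cor.test(X, Y)  # Pearson's correlation by default
print(TheTest$estimate)   # the correlation coefficient
print(TheTest$p.value)    # small values suggest the correlation is not 0
```

A small p-value only says the correlation is unlikely to be exactly 0. The estimate itself still tells you how strong the relationship is.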

Try this with a few data sets generated from a linear relationship. First, start with two lines with the slope set the same (the "1"s in the code below).

XValues=1:20
YValues1=1*XValues+10
YValues2=1*XValues+10
TheCorrelation=cor(YValues1, YValues2)
print(TheCorrelation)

Now, change the values of the slope coefficients and see what happens to the correlation value. Then, change one of the slope coefficients to be negative.
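As a rough sketch of what to expect from this exercise (the slope values, seed, and noise level here are arbitrary choices): for exact straight lines, the correlation depends only on the signs of the slopes, not their sizes. Random noise is added in the last case to get a value between 0 and 1.

```r
set.seed(1)  # fixed seed so the sketch is repeatable (an assumption)
XValues = 1:20

SamePositive = cor(1*XValues+10, 5*XValues+10)    # both slopes positive: about 1
OneNegative  = cor(1*XValues+10, -10*XValues+10)  # one slope negative: about -1

# add random noise so the points no longer fall exactly on a line
NoisyY = 1*XValues+10 + rnorm(20, mean=0, sd=5)
WithNoise = cor(XValues, NoisyY)                  # positive, but below 1

print(c(SamePositive, OneNegative, WithNoise))
```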

Covariance

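Covariance is closely related to correlation, but its value depends on the units and variances of the two variables. Correlation is the covariance divided by the product of the two standard deviations, which scales it to the range -1 to 1, so correlation is typically used.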
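A minimal sketch of this difference, using base R's cov(), cor(), and sd(). The height and weight data are made up for illustration: changing the units of one variable changes the covariance but leaves the correlation alone.

```r
set.seed(1)  # fixed seed so the sketch is repeatable (an assumption)

# made-up data: weight depends on height plus random noise
HeightMeters = rnorm(100, mean=1.7, sd=0.1)
WeightKg = 40*HeightMeters + rnorm(100, mean=0, sd=5)
HeightCm = HeightMeters*100  # same data, different units

CovMeters = cov(HeightMeters, WeightKg)
CovCm = cov(HeightCm, WeightKg)          # 100 times CovMeters
CorMeters = cor(HeightMeters, WeightKg)
CorCm = cor(HeightCm, WeightKg)          # same as CorMeters

# correlation is covariance divided by the product of the standard deviations
CorByHand = CovMeters / (sd(HeightMeters)*sd(WeightKg))

print(c(CovMeters, CovCm, CorMeters, CorCm, CorByHand))
```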

Correlation Plots

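The code below will create an NxN chart with scatterplots and their Pearson's correlation coefficients together. Note that the panel function shows the absolute value of each coefficient, so the sign of the correlation is not displayed. While this is great for analysis, please simplify and break up charts like this for presentation in reports and papers.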

# panel function that writes the absolute value of Pearson's correlation
# coefficient in the center of each panel
panel.pearson <- function(x, y, ...) {
  horizontal <- (par("usr")[1] + par("usr")[2]) / 2;
  vertical <- (par("usr")[3] + par("usr")[4]) / 2;
  text(horizontal, vertical, format(abs(cor(x,y)), digits=2))
}

# Command that creates the actual pairwise correlation plots. "lower.panel"
# defaults to scatterplots. "TheData" is a data frame or matrix of numeric columns.
pairs(TheData, upper.panel=panel.pearson, cex=0.7, main="Title")
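"TheData" is not defined in the code above. As one way to try the chart out, the sketch below uses the mtcars data set that ships with R. The choice of data set and columns is an assumption for illustration only.

```r
# same panel function as above, repeated so this block runs on its own
panel.pearson <- function(x, y, ...) {
  horizontal <- (par("usr")[1] + par("usr")[2]) / 2
  vertical <- (par("usr")[3] + par("usr")[4]) / 2
  text(horizontal, vertical, format(abs(cor(x,y)), digits=2))
}

# mtcars ships with R; these four numeric columns are an arbitrary choice
TheData = mtcars[, c("mpg", "disp", "hp", "wt")]

# scatterplots below the diagonal, correlation values above it
pairs(TheData, upper.panel=panel.pearson, cex=0.7, main="Title")
```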

The covariate labels in the chart produced by the code above are just the variable names, which are often hard to read. The code below replaces them with more readable labels.

###########################
# Pairwise Correlation Plots

## Creates a function that calculates and places the Pearson's correlation values for each covariate
## I found this function online along with the pairs() command below.
panel.pearson <- function(x, y, ...) {
  horizontal <- (par("usr")[1] + par("usr")[2]) / 2;
  vertical <- (par("usr")[3] + par("usr")[4]) / 2;
  text(horizontal, vertical, format(abs(cor(x,y)), digits=2))
}


## Creates a matrix with one column for each of the covariates I want in the pairwise plots from the .csv file.
## Note that cbind() stores the factor (mechsite2) as its integer codes.
z= cbind (streetdist, aspect, slope, tri, relativetp, waterdist, ndvi, ndmi, greenness, wetness, 
          bulkdensity, factor(mechsite2), percentclay)

## Creates a vector of names to be placed in the diagonal boxes of the pairwise plots.
colnames (z)=c("Distance to roadway", "Aspect","Slope","TRI","RPI","Distance to water",
               "NDVI","NDMI","Greenness","Wetness", "Bulk Density", "Mech site", "% clay")

## Command that creates the actual pairwise correlation plots. "lower.panel" defaults to scatterplots
pairs (z, upper.panel=panel.pearson, cex=0.7, main="Correlation plots between predictor variables")

Code provided by Matt Lau

Below is an example of creating correlation plots for two rasters.

############################################################
# Pairwise Correlation Plots

## Creates a function that calculates and places the Pearson's correlation values for each covariate
## I found this function online along with the pairs() command below.
panel.pearson <- function(x, y, ...) 
{
  horizontal <- (par("usr")[1] + par("usr")[2]) / 2;
  
  vertical <- (par("usr")[3] + par("usr")[4]) / 2;
  
  text(horizontal, vertical, format(abs(cor(x,y)), digits=2))
}

############################################################
# Load the terra package, then load the rasters and optionally crop them (makes the code go faster)
library(terra) # provides ext(), rast(), and crop()

TheExtent=ext(-124,-120,40,44) # extent to crop to in spatial reference units (e.g. degrees)

# load the raster
AnnualMeanTempFilePath="BioClim_CONUS_1km (1)/BioClim_CONUS_1km/bio_1_AnnualMeanTemp_CONUS_1.img"
AnnualMeanTempRaster = rast(AnnualMeanTempFilePath) # loads data as a spatraster (terra)

# Crop the raster
AnnualMeanTempRaster=crop(AnnualMeanTempRaster,TheExtent)
plot(AnnualMeanTempRaster)

# Convert the raster to a data frame and then get the pixel values as a vector
# "Layer_1" is the layer name in this file; check names(AnnualMeanTempRaster) if yours differs
AnnualMeanTempDataFrame=as.data.frame(AnnualMeanTempRaster,xy=TRUE)
AnnualMeanTempVector=AnnualMeanTempDataFrame$Layer_1

# repeat this for each raster
AnnualPrecipFilePath="BioClim_CONUS_1km (1)/BioClim_CONUS_1km/bio_12_AnnualPrecip_CONUS_1.img"
AnnualPrecipRaster = rast(AnnualPrecipFilePath) # loads data as a spatraster (terra)
AnnualPrecipRaster=crop(AnnualPrecipRaster,TheExtent)
plot(AnnualPrecipRaster)
AnnualPrecipDataFrame=as.data.frame(AnnualPrecipRaster,xy=TRUE)
AnnualPrecipVector=AnnualPrecipDataFrame$Layer_1


## Combines the pixel-value vectors from the rasters into a matrix with one column per covariate.
## The vectors must be the same length for this to work.
z= cbind (AnnualMeanTempVector, AnnualPrecipVector)

## Creates a vector of names to be placed in the diagonal boxes of the pairwise plots.
colnames (z)=c("Annual Mean Temp", "Annual Precip")

## Command that creates the actual pairwise correlation plots. "lower.panel" defaults to scatterplots
pairs (z, upper.panel=panel.pearson, cex=0.7, main="Correlation plots between predictor variables")

Other Resources

Wikipedia on Covariance vs. Correlation

R for Spatial Statistics
