Testing for covariance and correlation
Correlation
It's easy to compute correlation with the "cor()" function, which returns Pearson's correlation coefficient by default. Just pass the two vectors to the function.
TheVector1=rnorm(1000,mean=0,sd=1)
TheVector2=rnorm(1000,mean=5,sd=1)
TheCorrelation=cor(TheVector1, TheVector2)
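The result ranges from -1 to 1: it will be less than 0 for negative correlation and greater than 0 for positive. The magnitude of the value indicates how strong the correlation is. Because the two vectors above are generated independently, the result here should be close to 0.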
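As a minimal sketch of how to read these values, the code below contrasts an independent pair of vectors with a pair that is related by construction. The seed, the "Related" vector, and the use of "cor.test()" (which reports a p-value for the null hypothesis of zero correlation) are illustrative additions, not part of the original example.

```r
set.seed(42)  # assumed seed, used only so the run is repeatable

TheVector1 = rnorm(1000, mean=0, sd=1)
TheVector2 = rnorm(1000, mean=5, sd=1)               # generated independently of TheVector1
Related = 2*TheVector1 + rnorm(1000, mean=0, sd=1)   # built from TheVector1 plus noise

# Independent vectors: the correlation should be close to 0
IndependentCor = cor(TheVector1, TheVector2)
print(IndependentCor)

# Related vectors: the correlation should be strongly positive
RelatedCor = cor(TheVector1, Related)
print(RelatedCor)

# cor.test() adds a significance test to the same estimate
TheTest = cor.test(TheVector1, Related)
print(TheTest$p.value)
```

A different mean (5 versus 0) does not create correlation on its own; only a shared pattern of variation does.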
Try this with a few data sets generated from a linear relationship. First, start with two lines with the slope set the same (the "1"s in the code below).
XValues=1:20
YValues1=1*XValues+10
YValues2=1*XValues+10
TheCorrelation=cor(YValues1, YValues2)
print(TheCorrelation)
Now, change the values of the slope coefficients and see what happens to the correlation value. Then, change one of the slope coefficients to be negative.
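One way to work through this exercise is sketched below. The slope values 5 and -10, the noise level, and the seed are arbitrary choices for illustration. For exact straight lines, the size of the slope does not change the correlation, only its sign does; the correlation drops below 1 once noise is added.

```r
XValues = 1:20
Baseline = 1*XValues + 10
Steeper = 5*XValues + 10      # larger positive slope
Negative = -10*XValues + 10   # negative slope

SteeperCor = cor(Baseline, Steeper)    # 1, up to floating-point rounding
NegativeCor = cor(Baseline, Negative)  # -1, up to floating-point rounding
print(SteeperCor)
print(NegativeCor)

# Adding random noise weakens the relationship, so the value falls below 1
set.seed(1)  # assumed seed, used only so the run is repeatable
Noisy = 1*XValues + 10 + rnorm(20, mean=0, sd=5)
NoisyCor = cor(Baseline, Noisy)
print(NoisyCor)
```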
Covariance
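Covariance is closely related to correlation, but its value depends on the scale (variance) of the variables. Correlation is the covariance divided by the product of the two standard deviations, which gives a unitless value between -1 and 1 that is easier to compare, so correlation is typically used.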
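The relationship between the two measures can be checked with the "cov()", "cor()", and "sd()" functions. This is a small sketch with made-up data: the line 3*XValues+10 and the factor of 100 are arbitrary. Rescaling one variable changes the covariance but leaves the correlation alone.

```r
XValues = 1:20
YValues = 3*XValues + 10

TheCovariance = cov(XValues, YValues)
TheCorrelation = cor(XValues, YValues)
print(TheCovariance)   # 105

# Correlation is the covariance divided by the product of the standard deviations
ByHand = TheCovariance / (sd(XValues) * sd(YValues))
print(ByHand)

# Multiply X by 100 (for example, changing meters to centimeters)
ScaledCovariance = cov(100*XValues, YValues)
ScaledCorrelation = cor(100*XValues, YValues)
print(ScaledCovariance)   # 10500, one hundred times larger
print(ScaledCorrelation)  # same as TheCorrelation
```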
Correlation Plots
The code below creates an NxN chart of scatter plots paired with their Pearson's correlation coefficients. While this is great for analysis, please simplify and break up charts like this for presentation in reports and papers.
# Panel function that prints the Pearson's correlation coefficient in the center of each panel.
# Note that abs() drops the sign, so only the strength of the correlation is shown.
panel.pearson <- function(x, y, ...) {
  horizontal <- (par("usr")[1] + par("usr")[2]) / 2
  vertical <- (par("usr")[3] + par("usr")[4]) / 2
  text(horizontal, vertical, format(abs(cor(x, y)), digits=2))
}
# Command that creates the actual pairwise plots. TheData stands for your own data frame or matrix of numeric columns.
# "lower.panel" is left at its default, which draws the scatter plots.
pairs(TheData, upper.panel=panel.pearson, cex=0.7, main="Title")
The names of the covariates in the chart from the code above are simply the variable names, which are often hard to read. The code below replaces them with more readable labels.
###########################
# Pairwise Correlation Plots
## Creates a function that calculates and places the Pearson's correlation value for each pair of covariates.
## I found this function online along with the pairs() command below.
panel.pearson <- function(x, y, ...) {
  horizontal <- (par("usr")[1] + par("usr")[2]) / 2
  vertical <- (par("usr")[3] + par("usr")[4]) / 2
  text(horizontal, vertical, format(abs(cor(x, y)), digits=2))
}
## Creates a matrix with one column for each covariate I want in the pairwise plots, from the .csv file.
## cbind() stores factor(mechsite2) as its integer level codes.
z = cbind(streetdist, aspect, slope, tri, relativetp, waterdist, ndvi, ndmi, greenness, wetness, bulkdensity, factor(mechsite2), percentclay)
## Replaces the column names with the labels to be placed in the diagonal boxes of the pairwise plots.
colnames(z) = c("Distance to roadway", "Aspect", "Slope", "TRI", "RPI", "Distance to water", "NDVI", "NDMI", "Greenness", "Wetness", "Bulk Density", "Mech site", "% clay")
## Command that creates the actual pairwise plots. "lower.panel" defaults to the scatter plots.
pairs(z, upper.panel=panel.pearson, cex=0.7, main="Correlation plots between predictor variables")
Code provided by Matt Lau
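The code above depends on variables from a .csv file that is not included here. As a self-contained sketch of the same technique, the version below uses R's built-in mtcars data set instead. The choice of data set, the four columns, and the labels are assumptions for illustration only. This variation also prints the signed correlation rather than its absolute value, so negative relationships remain visible.

```r
# Same panel function as above, except that the sign of the correlation is kept
panel.pearson <- function(x, y, ...) {
  horizontal <- (par("usr")[1] + par("usr")[2]) / 2
  vertical <- (par("usr")[3] + par("usr")[4]) / 2
  text(horizontal, vertical, format(cor(x, y), digits=2))
}

# Pick a few numeric columns from the built-in mtcars data frame
ExampleData = mtcars[, c("mpg", "disp", "hp", "wt")]

# Replace the variable names with more readable labels for the diagonal boxes
colnames(ExampleData) = c("Miles per gallon", "Displacement", "Horsepower", "Weight")

# Scatter plots appear below the diagonal and correlation values above it
pairs(ExampleData, upper.panel=panel.pearson, cex=0.7, main="Correlation plots for mtcars")
```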
Other Resources
Wikipedia on Covariance vs. Correlation