This lab introduces modeling presence/absence data with generalized linear models (GLMs). It is also the first full-modeling lab. A key element of this lab is comparing the observed response in the data with the predicted output of the model and examining how both relate to the predictor variables.
Note: Remember to load the glm2 library before running the code below.
The function below creates synthetic presence/absence data for evaluating GLMs. Copy it into R now and run it so that the function is defined.
############################################################################
# Creates a data frame with Xs ranging from 1 to the number of entries
# and "Measures" with 1/2 set to 0 and 1/2 set to 1. The lower half of
# the X values have measures at 0.
# ProportionRandom - amount of uniform randomness to add
############################################################################
Categories1D_Random=function(NumEntries=10,ProportionRandom=0.4)
{
  # half-width of the zone around the midpoint where 0s and 1s can overlap
  Range=NumEntries*ProportionRandom/2

  Ys=as.vector(array(1:NumEntries))
  Xs=as.vector(array(1:NumEntries))
  Measures=as.vector(array(1:NumEntries))

  for (Index in 1:NumEntries)
  {
    Xs[Index]=Index #runif(1,0,100)
    Ys[Index]=Index #runif(1,0,100)

    Threshold=0.5
    Random=0
    if (ProportionRandom!=0) Random=runif(1,-Range,Range)

    # presence (1) when X is above the randomly shifted midpoint
    if (Xs[Index]>NumEntries/2+Random) Measures[Index]=1
    else Measures[Index]=0
  }
  TheDataFrame = data.frame(Ys, Xs, Measures)
  return(TheDataFrame)
}
The code below creates a synthetic data set for a logistic model using the function above. Try it now in R. Note that the data include some randomness, so the 0 and 1 values overlap a bit near the middle of the X range.
TheData=Categories1D_Random(100) # create a set of binary data
plot(TheData$Xs,TheData$Measures) # plot the data
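As a rough sketch of the comparison this lab is built around, the code below fits a logistic model to data of this kind and overlays the predicted probabilities on the observed 0/1 values. Several things here are assumptions rather than part of the lab: the data are generated with a vectorized stand-in that applies the same rule as Categories1D_Random(100) so that the block runs on its own, the seed is fixed only for reproducibility, the name TheModel is made up for illustration, and base R's glm() is used so that no extra package is needed.

```r
set.seed(42) # assumption: fixed seed so the sketch is repeatable

# Vectorized stand-in for Categories1D_Random(100) with the default
# ProportionRandom=0.4 (same rule as the function, written without a loop)
NumEntries=100
Range=NumEntries*0.4/2
Xs=1:NumEntries
Measures=as.integer(Xs>NumEntries/2+runif(NumEntries,-Range,Range))
TheData=data.frame(Xs,Measures)

# Fit a logistic regression: binomial family with the default logit link
TheModel=glm(Measures~Xs,data=TheData,family=binomial)

# Predicted probability of presence for each X value
TheData$Predicted=predict(TheModel,type="response")

# Observed response (points) against the model prediction (line)
plot(TheData$Xs,TheData$Measures,xlab="Xs",ylab="Measures")
lines(TheData$Xs,TheData$Predicted)
```

The points sit at 0 or 1, while the line shows the fitted probability rising with Xs. Running summary(TheModel) afterwards prints the estimated coefficients.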
© Copyright 2018 HSU - All rights reserved.