Correlation/Regression Modeling Steps
Below is an overview of the steps in regression modeling, which seeks to find and quantify a relationship between predictor variables (covariates) and a response variable.
1. Defining the Question
2. Collect and/or Find Data
3. Qualify Data
4. Data Exploration
Looking for: relationships between the response variable and covariates
- Response variable
- Occurrences
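- Plot a histogram of the occurrences against each covariate and compare it with the histogram of that covariate over the entire sample area
- If the two histograms differ, you may be able to model the relationship; if they do not, that covariate is unlikely to help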
- Convert to density (continuous)? Use kernel density functions
- Convert to "count"
- Convert to a raster using a "count" operation, then convert back to points
- Binary (presence/absence)
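The histogram comparison above can be sketched in a few lines. This is a minimal illustration, not a prescribed method: the elevation values are simulated, and the function names `histogram` and `histogram_difference` are hypothetical helpers introduced only for this example.

```python
import random

def histogram(values, lo, hi, bins):
    """Return the fraction of values falling in each of `bins` equal-width bins."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        # Values equal to `hi` are placed in the last bin.
        i = min(int((v - lo) / width), bins - 1)
        counts[i] += 1
    total = len(values)
    return [c / total for c in counts]

def histogram_difference(a, b):
    """Half the summed absolute difference: 0 = identical, 1 = no overlap."""
    return 0.5 * sum(abs(x - y) for x, y in zip(a, b))

random.seed(0)
# Simulated (assumed) covariate: elevation over the whole sample area
# and elevation at the occurrence points.
background = [random.uniform(0, 1000) for _ in range(5000)]
occurrences = [random.gauss(300, 80) for _ in range(500)]
occurrences = [v for v in occurrences if 0 <= v <= 1000]

h_bg = histogram(background, 0, 1000, 10)
h_occ = histogram(occurrences, 0, 1000, 10)
diff = histogram_difference(h_bg, h_occ)
print("difference between histograms:", round(diff, 2))
```

A difference near 0 suggests the occurrences simply mirror what is available in the sample area; a larger value suggests the covariate may be worth modeling.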
4. Prepare the Data sets
- How will the model be validated?
- Test and training data sets
- Three options for predictors:
- Use predictors that were collected with samples
- Use continuous predictors
- Use a combination of predictors collected with samples and continuous predictors
- Gridding data
- Eliminating duplicates
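As a rough sketch of the preparation steps above, the following grids points into cells (which yields a count per cell and removes within-cell duplicates) and then splits the cells into training and test sets. The coordinates, cell size, and test fraction are made-up values, and the function names are hypothetical.

```python
import random

def grid_cell(x, y, cell_size):
    """Return the (column, row) index of the grid cell containing (x, y)."""
    return (int(x // cell_size), int(y // cell_size))

def count_per_cell(points, cell_size):
    """Count points per grid cell; each cell appears once in the result."""
    counts = {}
    for x, y in points:
        key = grid_cell(x, y, cell_size)
        counts[key] = counts.get(key, 0) + 1
    return counts

def train_test_split(items, test_fraction, seed):
    """Shuffle a copy of `items` and split it into (training, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Assumed example points; the first and last are exact duplicates.
points = [(12.0, 3.5), (12.4, 3.9), (47.1, 8.0), (88.8, 61.2), (12.0, 3.5)]
counts = count_per_cell(points, cell_size=10.0)
cells = sorted(counts)
train, test = train_test_split(cells, test_fraction=0.33, seed=1)
print(counts)
print(len(train), "training cells,", len(test), "test cells")
```

Treating any cell with a count above zero as a presence gives a binary response instead of a count.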
5. Define Modeling Approaches
- Type of response variable:
- Occurrences (i.e., presence-only records, just "1's")
- Binary (e.g. presence/absence, alive/dead)
- Categorical
- Continuous
- Type of covariates/predictors
- Combination of continuous and categorical
- Residuals:
- Constant variance?
- Correcting variance (transforming data)
- Shape of the expected response:
- Linear
- Logistic: Presence/Absence, True/False
- Exponential
- Logarithmic
- Complex
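The questions about residual variance and transformation can be illustrated with a small experiment. This sketch assumes a simulated exponential response with multiplicative noise; it fits a straight line by ordinary least squares and compares the residual variance in the upper and lower halves of the predictor range, before and after a log transform. The names `fit_line` and `variance_ratio` are hypothetical helpers.

```python
import math
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def variance_ratio(xs, ys):
    """Mean squared residual for the upper half of x divided by the lower half."""
    a, b = fit_line(xs, ys)
    pairs = sorted(zip(xs, ys))
    res = [y - (a + b * x) for x, y in pairs]
    half = len(res) // 2
    low = sum(r * r for r in res[:half]) / half
    high = sum(r * r for r in res[half:]) / (len(res) - half)
    return high / low

random.seed(0)
xs = [i / 10 for i in range(1, 201)]
# Simulated (assumed) exponential response with multiplicative noise.
ys = [math.exp(0.2 * x + random.gauss(0, 0.3)) for x in xs]

raw_ratio = variance_ratio(xs, ys)
log_ratio = variance_ratio(xs, [math.log(y) for y in ys])
print("ratio on raw response:", round(raw_ratio, 2))
print("ratio on log response:", round(log_ratio, 2))
```

A ratio far from 1 points to non-constant variance (or a mis-specified shape); a ratio close to 1 after transforming suggests the transformation helped.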
6. Run Models
- Bootstrapping
- Jackknifing
- Optimizing parameters with Monte Carlo methods
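Bootstrapping and jackknifing both re-estimate a statistic on modified copies of the data to see how much it varies. The sketch below applies them to the sample mean of a made-up data set; the same functions would accept any statistic (for example a fitted slope), and their names are hypothetical.

```python
import random
import statistics

def bootstrap_se(data, statistic, n_resamples, seed):
    """Standard error of `statistic` from resampling with replacement."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_resamples):
        sample = [data[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(sample))
    return statistics.stdev(estimates)

def jackknife_se(data, statistic):
    """Standard error of `statistic` from leave-one-out resampling."""
    n = len(data)
    estimates = [statistic(data[:i] + data[i + 1:]) for i in range(n)]
    mean_est = sum(estimates) / n
    return ((n - 1) / n * sum((e - mean_est) ** 2 for e in estimates)) ** 0.5

# Assumed example data.
data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.8, 2.7, 4.9, 3.1, 3.6]
boot = bootstrap_se(data, statistics.mean, n_resamples=2000, seed=0)
jack = jackknife_se(data, statistics.mean)
print("bootstrap SE:", round(boot, 3))
print("jackknife SE:", round(jack, 3))
```

For the sample mean, the jackknife standard error equals the usual standard deviation divided by the square root of n, which makes it a convenient check.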
7. Validating and Selecting Models
- AIC, AICc, BIC
- AUC
- Deviance explained
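The measures listed above can be computed directly once a model has been fit. The following is a minimal sketch assuming a least-squares fit with Gaussian errors for the information criteria, and a rank-based definition of AUC; the residual sums of squares and scores are invented for illustration.

```python
import math

def information_criteria(rss, n, k):
    """AIC, AICc and BIC for a least-squares fit with Gaussian errors.

    rss: residual sum of squares; n: number of observations;
    k: number of estimated parameters (including the error variance).
    """
    log_lik = -0.5 * n * (math.log(2 * math.pi * rss / n) + 1)
    aic = 2 * k - 2 * log_lik
    aicc = aic + (2 * k * (k + 1)) / (n - k - 1)
    bic = k * math.log(n) - 2 * log_lik
    return aic, aicc, bic

def auc(scores_present, scores_absent):
    """Probability that a presence outscores an absence (ties count half)."""
    wins = 0.0
    for p in scores_present:
        for a in scores_absent:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(scores_present) * len(scores_absent))

def deviance_explained(null_deviance, residual_deviance):
    """Fraction of the null deviance accounted for by the model."""
    return 1.0 - residual_deviance / null_deviance

# Two assumed candidate models fit to the same 50 observations.
aic_a, aicc_a, bic_a = information_criteria(rss=120.0, n=50, k=3)
aic_b, aicc_b, bic_b = information_criteria(rss=115.0, n=50, k=5)
print("model A preferred by AIC:", aic_a < aic_b)
print("AUC:", auc([0.9, 0.8, 0.4], [0.3, 0.5, 0.1]))
```

Lower AIC, AICc, or BIC indicates the preferred model among candidates fit to the same data; AUC ranges from 0.5 (no better than chance) to 1 (perfect separation).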
8. Estimating Uncertainty
- Computing precision & accuracy
- Jiggling
- RMSE
- etc.
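One common way to summarize prediction error is sketched below, under the assumption that accuracy is read as the mean error (bias) and precision as the spread of the errors around that mean; RMSE combines both. The predicted and observed values are made up, and `error_summary` is a hypothetical helper.

```python
import math
import statistics

def error_summary(predicted, observed):
    """Return (bias, spread, rmse) of the prediction errors."""
    errors = [p - o for p, o in zip(predicted, observed)]
    bias = statistics.mean(errors)       # accuracy: systematic offset
    spread = statistics.pstdev(errors)   # precision: scatter about the bias
    rmse = math.sqrt(statistics.mean([e * e for e in errors]))
    return bias, spread, rmse

# Assumed example values.
predicted = [2.5, 0.0, 2.1, 7.8]
observed = [3.0, -0.5, 2.0, 7.0]
bias, spread, rmse = error_summary(predicted, observed)
print("bias:", round(bias, 3), "spread:", round(spread, 3), "RMSE:", round(rmse, 3))
```

With these definitions, RMSE squared equals bias squared plus spread squared, so a large RMSE can come from either a systematic offset or noisy predictions.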
9. Documenting Results
- Types of maps:
- Gridded
- Interpolated
- Continuous
- Uncertainty maps
- Caveats
- Significant Digits
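When reporting results, values can be rounded to a number of significant digits that the estimated uncertainty supports. The helper below is a small sketch; `round_sig` is a hypothetical name, and the choice of how many digits to keep remains a judgment call.

```python
import math

def round_sig(value, digits):
    """Round `value` to `digits` significant digits."""
    if value == 0:
        return 0.0
    # Position of the leading digit, e.g. 3 for 1234.5 and -2 for 0.0123.
    exponent = math.floor(math.log10(abs(value)))
    return round(value, digits - 1 - exponent)

print(round_sig(1234.567, 2))
print(round_sig(0.012345, 3))
```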