Filtering Datatables

Sometimes we need to subset datatables. This could include restricting a dataset of trees by a diameter threshold such as 10, selecting only rows that have the name "Redwood", etc.

The simplest approach I have found is to use the "filter" function in the dplyr library. This function takes the dataframe as the first parameter and a filter condition as the second parameter, and returns a new dataframe containing only the rows where the condition is true. The example below will remove all rows that have a value of less than or equal to 24 in the MaxDia column.

 library(dplyr)
 NewData <- filter(TheData, MaxDia > 24.0)
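As a minimal, self-contained sketch of the call above: the dataframe here is made up for illustration (the names and diameters are hypothetical, not from a real dataset), and it assumes the dplyr package is installed.

```r
library(dplyr)

# Hypothetical sample data; values are invented for illustration only
TheData <- data.frame(
  Name   = c("Redwood", "Oak", "Pine", "Redwood"),
  MaxDia = c(30.5, 24.0, 12.0, 18.0)
)

# Keep only rows where MaxDia is greater than 24. Writing dplyr::filter
# avoids accidentally calling the unrelated stats::filter from base R.
NewData <- dplyr::filter(TheData, MaxDia > 24.0)

print(NewData)
```

Note that the row with a MaxDia of exactly 24 is dropped, because ">" is a strict comparison.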

While you can filter using the "&" character to combine multiple criteria, I recommend just filtering based on the first condition and then filtering that result based on the second, as in the example below, which will limit a dataframe to rows that have values in the MaxDia column from 10 up to (but not including) 20.

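 # Step 1: keep rows with MaxDia below 20
 NewData <- filter(TheData, MaxDia < 20.0)
 # Step 2: filter the result of step 1 (not TheData) to keep MaxDia of at least 10
 NewData <- filter(NewData, MaxDia >= 10)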
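To compare the two approaches side by side, here is a small sketch using made-up diameters (hypothetical values) and assuming dplyr is installed. It filters in two steps, then with a single "&" condition, and checks that both keep the same values. The names Step1, TwoStep, and Combined are introduced only for this illustration.

```r
library(dplyr)

# Hypothetical diameters, invented for illustration only
TheData <- data.frame(MaxDia = c(5, 10, 15, 19.9, 20, 30))

# Two-step approach: the second call filters the result of the first
Step1   <- dplyr::filter(TheData, MaxDia < 20.0)
TwoStep <- dplyr::filter(Step1, MaxDia >= 10)

# Single call combining both criteria with "&"
Combined <- dplyr::filter(TheData, MaxDia >= 10 & MaxDia < 20.0)

# Both approaches keep the same MaxDia values
print(identical(TwoStep$MaxDia, Combined$MaxDia))
```

The key detail in the two-step version is that the second call must be given the output of the first call. If it is given the original dataframe again, the first condition is lost.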

Comparison operators include <, <=, >, >=, == and !=.
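As a quick sketch of how R's comparison operators behave, each one below is applied to a made-up vector of diameters (hypothetical values) and returns one TRUE or FALSE per element. The filter function keeps the rows where the condition is TRUE. This example uses base R only, and the result names are introduced just for illustration.

```r
# Hypothetical diameters, invented for illustration only
MaxDia <- c(5, 10, 24, 30)

LessThan       <- MaxDia <  24   # less than
LessOrEqual    <- MaxDia <= 24   # less than or equal to
GreaterThan    <- MaxDia >  24   # greater than
GreaterOrEqual <- MaxDia >= 24   # greater than or equal to
EqualTo        <- MaxDia == 24   # equal to (two equals signs)
NotEqualTo     <- MaxDia != 24   # not equal to

print(GreaterThan)
```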

Other Resources

Quick-R Operators - note that what Quick-R refers to as "logical operators" are actually comparison operators. Logical operators are operators such as and ("&").