Function to calculate frequency distributions for categorical variables
tab( data, x, sort = FALSE, maxcat = NULL, minp = NULL, na.rm = FALSE, total = FALSE, digits = 2, cum = FALSE, plot = FALSE )
data | A data frame. |
---|---|
x | A factor variable in the data frame. |
sort | logical. Sort levels from high to low. |
maxcat | Maximum number of categories to be included. Smaller categories will be combined into an "Other" category. |
minp | Minimum proportion for a category to be included. Categories representing smaller proportions will be combined into an "Other" category. maxcat and minp cannot both be specified. |
na.rm | logical. Removes missing values when TRUE. |
total | logical. Include a total category when TRUE. |
digits | Number of digits the percents should be rounded to. |
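cum | logical. If TRUE, include cumulative counts and percents (the cum_n and cum_percent columns). |
plot | logical. If TRUE, return a ggplot2 bar chart instead of a data frame. |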
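The minp rule described above can be illustrated with a small base R sketch. This is not the package's implementation: `lump_minp` is a hypothetical helper, the strict `<` comparison is an assumption, and the built-in `mtcars` data set stands in for the data used in the examples.

```r
# Hypothetical helper (not part of the package): combine levels whose
# proportion falls below minp into a single "Other" level.
lump_minp <- function(x, minp) {
  x <- as.character(x)
  props <- prop.table(table(x))        # proportion of each level
  rare <- names(props)[props < minp]   # assumption: strict "<" comparison
  x[x %in% rare] <- "Other"
  factor(x)
}

# carb values 6 and 8 each occur once in 32 rows (about 3%), so both
# fall below a 5% threshold and are merged.
carb <- lump_minp(mtcars$carb, minp = 0.05)
table(carb)
```

A maxcat-style rule would differ only in how the rare levels are chosen, for example by rank of frequency rather than by proportion.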
If plot = TRUE, returns a ggplot2 bar chart. Otherwise, returns a data frame.
The function tab calculates the frequency distribution for a categorical variable and outputs a data frame with three columns: level, n, and percent.
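For readers who want to see what such a table involves, here is a minimal base R sketch that builds the same three columns by hand. It is an illustration under assumptions, not the function's source: it uses the built-in `mtcars` data set as a stand-in, and it keeps percent as a number rather than a formatted string with a "%" sign.

```r
# Stand-in data: mtcars ships with base R.
x <- factor(mtcars$carb)
counts <- table(x)                     # frequency of each level

freq <- data.frame(
  level   = names(counts),
  n       = as.vector(counts),
  # percent is kept numeric here, rounded to 2 digits
  percent = round(100 * as.vector(counts) / sum(counts), 2)
)
freq
```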
tab(cars74, carb)
#>  level  n percent
#>  carb1  7  21.88%
#>  carb2 10  31.25%
#>  carb3  3   9.38%
#>  carb4 10  31.25%
#>  carb6  1   3.12%
#>  carb8  1   3.12%

tab(cars74, carb, plot=TRUE)

tab(cars74, carb, sort=TRUE)
#>  level  n percent
#>  carb2 10  31.25%
#>  carb4 10  31.25%
#>  carb1  7  21.88%
#>  carb3  3   9.38%
#>  carb6  1   3.12%
#>  carb8  1   3.12%

tab(cars74, carb, sort=TRUE, plot=TRUE)

tab(cars74, carb, cum=TRUE)
#>  level  n percent cum_n cum_percent
#>  carb1  7  21.88%     7      21.88%
#>  carb2 10  31.25%    17      53.12%
#>  carb3  3   9.38%    20       62.5%
#>  carb4 10  31.25%    30      93.75%
#>  carb6  1   3.12%    31      96.88%
#>  carb8  1   3.12%    32        100%

tab(cars74, carb, cum=TRUE, plot=TRUE)
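The cumulative columns in the cum=TRUE output are running totals of n and of the percentages. A rough base R sketch of that arithmetic follows. It assumes the built-in `mtcars` data as a stand-in and is not taken from the package source; `cum_tab` is an illustrative name.

```r
# Running totals over the levels of a categorical variable.
counts <- table(factor(mtcars$carb))
n <- as.vector(counts)

cum_tab <- data.frame(
  level       = names(counts),
  n           = n,
  cum_n       = cumsum(n),                            # running count
  cum_percent = round(100 * cumsum(n) / sum(n), 2)    # running share, numeric
)
cum_tab
```

The last row of cum_n always equals the number of non-missing observations, and the last cum_percent is 100.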