Function to calculate frequency distributions for categorical variables
tab( data, x, sort = FALSE, maxcat = NULL, minp = NULL, na.rm = FALSE, total = FALSE, digits = 2, cum = FALSE, plot = FALSE )
data | A data frame. |
---|---|
x | A factor variable in the data frame. |
sort | logical. Sort levels from high to low. |
maxcat | Maximum number of categories to be included. Smaller categories will be combined into an "Other" category. |
minp | Minimum proportion for a category to be included. Categories representing smaller proportions will be combined into an "Other" category. maxcat and minp cannot both be specified. |
na.rm | logical. Removes missing values when TRUE. |
total | logical. Include a total category when TRUE. |
digits | Number of digits the percents should be rounded to. |
cum | logical. If TRUE, include cumulative counts and percents. |
plot | logical. If TRUE, return a ggplot2 bar chart instead of a data frame. |
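The lumping rule described for maxcat and minp can be sketched in base R. This is an illustration under stated assumptions, not the package's code: `lump_counts` is a hypothetical helper, it takes a plain vector rather than a data frame and column name, and the built-in `mtcars` data stands in for the package's own data sets.

```r
# Hypothetical helper (not part of the package): keep the largest
# categories and fold the rest into "Other".
lump_counts <- function(x, maxcat = NULL, minp = NULL) {
  # Mirror the documented restriction on the two arguments.
  if (!is.null(maxcat) && !is.null(minp)) {
    stop("maxcat and minp cannot both be specified")
  }
  # Counts per level, largest first (missing values are dropped by table()).
  counts <- sort(table(x), decreasing = TRUE)
  keep <- rep(TRUE, length(counts))
  # maxcat: keep only the first `maxcat` levels.
  if (!is.null(maxcat)) keep <- seq_along(counts) <= maxcat
  # minp: keep only levels whose proportion is at least `minp`.
  if (!is.null(minp)) keep <- as.vector(counts) / sum(counts) >= minp
  out <- as.vector(counts)[keep]
  names(out) <- names(counts)[keep]
  # Everything that was not kept is summed into a single "Other" entry.
  if (any(!keep)) out <- c(out, Other = sum(counts[!keep]))
  out
}

lumped <- lump_counts(mtcars$carb, maxcat = 3)
lumped_p <- lump_counts(mtcars$carb, minp = 0.1)
```

With `mtcars$carb`, both calls keep the three largest carburetor groups and combine the remaining levels into "Other". Whether `tab` breaks ties or orders "Other" the same way is not stated in this page.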
If plot = TRUE, returns a ggplot2 bar chart. Otherwise, returns a data frame.
The function tab calculates the frequency distribution for a categorical variable and outputs a data frame with three columns: level, n, and percent.
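For readers who want to see the shape of that output without the package installed, here is a rough base-R approximation. It is a sketch only: `freq_table` is a hypothetical helper, it takes the column name as a string (unlike `tab`, which is called with an unquoted name), and it uses the built-in `mtcars` data.

```r
# Hypothetical helper (not part of the package): build a data frame
# with the three columns described above: level, n, percent.
freq_table <- function(data, x, digits = 2) {
  counts <- table(data[[x]])  # counts per level; NA is dropped by default
  n <- as.vector(counts)
  data.frame(
    level = names(counts),
    n = n,
    # Percent of the total, rounded and suffixed with "%".
    percent = paste0(round(100 * n / sum(n), digits), "%"),
    stringsAsFactors = FALSE
  )
}

res <- freq_table(mtcars, "carb")
```

The percent column here is a character vector, since the "%" suffix is pasted on. Whether `tab` stores percent the same way is an assumption.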
tab(venues, state, sort = TRUE, na.rm = TRUE, maxcat = 10, digits = 3)
#>  level   n percent
#>  CA     70  8.495%
#>  TX     59   7.16%
#>  FL     43  5.218%
#>  NY     43  5.218%
#>  NC     34  4.126%
#>  OH     34  4.126%
#>  IL     28  3.398%
#>  PA     27  3.277%
#>  LA     24  2.913%
#>  VA     23  2.791%
#>  Other 439 53.277%

tab(cars74, carb, cum = TRUE, plot = TRUE)