Function to calculate frequency distributions for categorical variables
tab(
  data,
  x,
  sort = FALSE,
  maxcat = NULL,
  minp = NULL,
  na.rm = FALSE,
  total = FALSE,
  digits = 2,
  cum = FALSE,
  plot = FALSE
)
data: A data frame.
x: A factor variable in the data frame.
sort: logical. Sort levels by frequency, from high to low.
maxcat: Maximum number of categories to be included. Smaller categories will be combined into an "Other" category.
minp: Minimum proportion for a category to be included. Categories representing smaller proportions will be combined into an "Other" category. maxcat and minp cannot both be specified.
na.rm: logical. Removes missing values when TRUE.
total: logical. Include a total category when TRUE.
digits: Number of digits the percents should be rounded to.
cum: logical. If TRUE, include cumulative counts and percents. In this case total will be set to FALSE.
plot: logical. If TRUE, generate a bar chart rather than a frequency table.
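The combining of small categories into "Other" can be illustrated with a short base R sketch. This is not the package's implementation: lump_minp is a hypothetical helper, the strict "less than" comparison against the threshold is an assumption, and the built-in mtcars data stands in for cars74.

```r
# Hypothetical helper, not part of the documented package: lump levels whose
# share of the observations is below `minp` into a single "Other" level.
lump_minp <- function(x, minp) {
  x <- as.character(x)
  share <- prop.table(table(x))          # proportion of each level
  small <- names(share)[share < minp]    # levels under the threshold (assumed strict)
  x[x %in% small] <- "Other"
  factor(x)
}

# mtcars ships with R; its carb column stands in for cars74$carb here.
carb <- paste0("carb", mtcars$carb)
table(lump_minp(carb, minp = 0.10))
```

With a threshold of 0.10, the three levels that each cover less than 10% of the rows (carb3, carb6, carb8) end up in one "Other" level.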
If plot = TRUE, returns a ggplot2 bar chart. Otherwise, returns a data frame.
The function tab calculates the frequency distribution for a categorical variable and outputs a data frame with three columns: level, n, and percent. With cum = TRUE, the columns cum_n and cum_percent are added.
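For readers who want to see what such a table contains, the following base R sketch builds the same three columns by hand. It is only an approximation under stated assumptions: mtcars stands in for cars74, the variable names are illustrative, and the percents are left as unrounded numbers rather than formatted strings.

```r
# mtcars ships with R; its carb column stands in for cars74$carb here.
carb <- factor(paste0("carb", mtcars$carb))

counts <- table(carb)                    # number of rows per level

freq <- data.frame(
  level   = names(counts),
  n       = as.vector(counts),
  percent = as.vector(counts) / sum(counts) * 100  # unrounded percent
)
freq
```

Rounding percent to the requested number of digits and appending a percent sign would bring the result closer to the formatted output shown in the examples.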
tab(cars74, carb)
#>  level  n percent
#>  carb1  7  21.88%
#>  carb2 10  31.25%
#>  carb3  3   9.38%
#>  carb4 10  31.25%
#>  carb6  1   3.12%
#>  carb8  1   3.12%
tab(cars74, carb, plot=TRUE)
tab(cars74, carb, sort=TRUE)
#>  level  n percent
#>  carb2 10  31.25%
#>  carb4 10  31.25%
#>  carb1  7  21.88%
#>  carb3  3   9.38%
#>  carb6  1   3.12%
#>  carb8  1   3.12%
tab(cars74, carb, sort=TRUE, plot=TRUE)
tab(cars74, carb, cum=TRUE)
#>  level  n percent cum_n cum_percent
#>  carb1  7  21.88%     7      21.88%
#>  carb2 10  31.25%    17      53.12%
#>  carb3  3   9.38%    20       62.5%
#>  carb4 10  31.25%    30      93.75%
#>  carb6  1   3.12%    31      96.88%
#>  carb8  1   3.12%    32        100%
tab(cars74, carb, cum=TRUE, plot=TRUE)
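The cumulative columns in the cum = TRUE output are running totals of the counts. A minimal base R sketch of that arithmetic, assuming mtcars as a stand-in for cars74 and using illustrative variable names, might look like this:

```r
# mtcars ships with R; its carb column stands in for cars74$carb here.
carb <- factor(paste0("carb", mtcars$carb))
counts <- table(carb)

cum_tab <- data.frame(
  level       = names(counts),
  n           = as.vector(counts),
  cum_n       = cumsum(as.vector(counts)),                      # running count
  cum_percent = cumsum(as.vector(counts)) / sum(counts) * 100   # running percent, unrounded
)
cum_tab
```

The last row of cum_n equals the number of observations, and the last cum_percent is 100.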