Predicting medical expenses

insurance

Format

A data frame with 1338 rows and 7 variables:

age

integer. Age of the primary beneficiary.

sex

character. Sex of the insurance policyholder: female or male.

bmi

double. Body mass index (BMI), an objective measure of whether body weight is high or low relative to height. It is computed as weight divided by height squared (kg / m ^ 2); the ideal range is 18.5 to 24.9.

children

integer. Number of children (dependents) covered by the health insurance plan.

smoker

character. Smoking status: yes or no.

region

character. The beneficiary's residential area in the US: northeast, southeast, southwest, or northwest.

charges

double. Individual medical costs billed by health insurance (in dollars).
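The bmi definition above (weight in kilograms divided by height in metres squared) can be sketched in R as follows. The helper name `bmi_from` and the input values are assumptions made for illustration only; they are not part of the dataset or its package, and the dataset itself stores only the resulting bmi value.

```r
# Hypothetical helper (not part of the dataset's package):
# body mass index from weight in kg and height in m.
bmi_from <- function(weight_kg, height_m) {
  stopifnot(is.numeric(weight_kg), is.numeric(height_m), all(height_m > 0))
  weight_kg / height_m^2
}

# Illustrative inputs, not rows taken from `insurance`.
bmi_from(70, 1.75)
```

For these inputs the result is roughly 22.9, which falls inside the 18.5 to 24.9 range quoted above.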

Source

From Machine Learning with R by Brett Lantz. Data downloaded from GitHub.

Examples

summary(insurance)
#>       age            sex                 bmi           children    
#>  Min.   :18.00   Length:1338        Min.   :15.96   Min.   :0.000  
#>  1st Qu.:27.00   Class :character   1st Qu.:26.30   1st Qu.:0.000  
#>  Median :39.00   Mode  :character   Median :30.40   Median :1.000  
#>  Mean   :39.21                      Mean   :30.66   Mean   :1.095  
#>  3rd Qu.:51.00                      3rd Qu.:34.69   3rd Qu.:2.000  
#>  Max.   :64.00                      Max.   :53.13   Max.   :5.000  
#>     smoker             region             charges     
#>  Length:1338        Length:1338        Min.   : 1122  
#>  Class :character   Class :character   1st Qu.: 4740  
#>  Mode  :character   Mode  :character   Median : 9382  
#>                                        Mean   :13270  
#>                                        3rd Qu.:16640  
#>                                        Max.   :63770  
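
Since the topic title is about predicting medical expenses, one possible starting point is an ordinary linear regression of charges on the other six variables. This is a minimal sketch, not the method used in the source book: the helper name `fit_charges` is hypothetical, and the code assumes a data frame with the seven columns described under Format.

```r
# Hypothetical helper: linear regression of charges on the other six variables.
fit_charges <- function(data) {
  # Treat the character columns as categorical predictors.
  for (col in c("sex", "smoker", "region")) {
    data[[col]] <- factor(data[[col]])
  }
  stats::lm(charges ~ age + sex + bmi + children + smoker + region, data = data)
}

# Runs only if the `insurance` data frame is already loaded in the session.
if (exists("insurance")) {
  fit <- fit_charges(insurance)
  print(summary(fit))
}
```

The explicit factor conversion is optional, because lm() would convert character columns itself, but it makes the categorical treatment of sex, smoker, and region visible.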