Predicting no-show medical appointments

appointments

Format

A data frame with 110527 rows and 14 variables:

PatientId

double. Identification of a patient.

AppointmentID

double. Identification of each appointment.

Gender

factor. Male, Female.

ScheduledDay

datetime. The day and time someone called or registered the appointment. This is always before the appointment itself.

AppointmentDay

datetime. The day of the actual appointment, when the patient has to visit the doctor. No time of day is recorded.

Age

double. Age of the patient.

Neighbourhood

character. Where the appointment takes place.

Scholarship

integer. 0=FALSE, 1=TRUE. Whether the patient is enrolled in Scholarship, a social welfare program providing financial aid to poor Brazilian families.

Hypertension

integer. 0=FALSE, 1=TRUE.

Diabetes

integer. 0=FALSE, 1=TRUE.

Alcoholism

integer. 0=FALSE, 1=TRUE.

Handcap

integer. 0=FALSE; 1 or more=TRUE. The data contain values from 0 to 4, so this is not a strict 0/1 flag.

SMS_received

integer. 0=FALSE, 1=TRUE. Whether one or more messages were sent to the patient.

No_show

factor. No, Yes. Yes means the patient did not show up for the appointment.

Source

Joni Hoppen, Kaggle Medical Appointment No Shows https://www.kaggle.com/joniarroba/noshowappointments.

Details

This Kaggle dataset was designed to challenge participants to predict which patients will miss their appointments. It is also a good dataset for practicing date and time manipulation.
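
As a minimal sketch of the date and time manipulation mentioned above, the lead time between booking and appointment can be computed in base R. The `toy` data frame and the `lead_days` and `appt_wday` columns are hypothetical names introduced here for illustration. The sketch assumes both date columns are stored as POSIXct in UTC, which may not match the packaged data.

```r
# Hypothetical toy rows mimicking the two date columns (not real records).
toy <- data.frame(
  ScheduledDay   = as.POSIXct(c("2016-04-29 10:27:01", "2016-05-10 12:13:17"), tz = "UTC"),
  AppointmentDay = as.POSIXct(c("2016-05-09 00:00:00", "2016-05-10 00:00:00"), tz = "UTC")
)

# AppointmentDay carries no time of day, so compare calendar dates only.
toy$lead_days <- as.numeric(difftime(
  as.Date(toy$AppointmentDay, tz = "UTC"),
  as.Date(toy$ScheduledDay, tz = "UTC"),
  units = "days"
))

# ISO weekday number of the appointment (1 = Monday), independent of locale.
toy$appt_wday <- as.integer(format(toy$AppointmentDay, "%u"))

toy[, c("lead_days", "appt_wday")]
```

Comparing dates rather than timestamps avoids negative lead times for same-day appointments, where the booking time of day falls after midnight of the appointment day.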

Examples

summary(appointments)
#>    PatientId         AppointmentID        Gender
#>  Min.   :3.920e+04   Min.   :5030230   Male  :71840
#>  1st Qu.:4.173e+12   1st Qu.:5640286   Female:38687
#>  Median :3.173e+13   Median :5680573
#>  Mean   :1.475e+14   Mean   :5675305
#>  3rd Qu.:9.439e+13   3rd Qu.:5725524
#>  Max.   :1.000e+15   Max.   :5790484
#>   ScheduledDay                  AppointmentDay                     Age
#>  Min.   :2015-11-10 07:13:56   Min.   :2016-04-29 00:00:00   Min.   : -1.00
#>  1st Qu.:2016-04-29 10:27:01   1st Qu.:2016-05-09 00:00:00   1st Qu.: 18.00
#>  Median :2016-05-10 12:13:17   Median :2016-05-18 00:00:00   Median : 37.00
#>  Mean   :2016-05-09 07:49:15   Mean   :2016-05-19 00:57:50   Mean   : 37.09
#>  3rd Qu.:2016-05-20 11:18:37   3rd Qu.:2016-05-31 00:00:00   3rd Qu.: 55.00
#>  Max.   :2016-06-08 20:07:23   Max.   :2016-06-08 00:00:00   Max.   :115.00
#>  Neighbourhood       Scholarship         Diabetes         Alcoholism
#>  Length:110527      Min.   :0.00000   Min.   :0.00000   Min.   :0.0000
#>  Class :character   1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.0000
#>  Mode  :character   Median :0.00000   Median :0.00000   Median :0.0000
#>                     Mean   :0.09827   Mean   :0.07186   Mean   :0.0304
#>                     3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.0000
#>                     Max.   :1.00000   Max.   :1.00000   Max.   :1.0000
#>     Handcap         SMS_received    No_show      Hypertension
#>  Min.   :0.00000   Min.   :0.000   No :88208   Min.   :0.0000
#>  1st Qu.:0.00000   1st Qu.:0.000   Yes:22319   1st Qu.:0.0000
#>  Median :0.00000   Median :0.000               Median :0.0000
#>  Mean   :0.02225   Mean   :0.321               Mean   :0.1972
#>  3rd Qu.:0.00000   3rd Qu.:1.000               3rd Qu.:0.0000
#>  Max.   :4.00000   Max.   :1.000               Max.   :1.0000
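
The summary above shows a minimum Age of -1 and a maximum Handcap of 4, so a quick sanity check may be worth running before modelling. The following is a minimal sketch on a hypothetical `toy` data frame that reuses the documented column names. The values are arbitrary and not taken from the data, and `suspect` and `no_show_rate` are illustrative names.

```r
# Hypothetical toy rows using the documented column names (not real records).
toy <- data.frame(
  Age          = c(-1, 18, 37, 55, 62, 115),
  Handcap      = c(0L, 0L, 1L, 0L, 4L, 0L),
  SMS_received = c(0L, 1L, 1L, 0L, 0L, 1L),
  No_show      = factor(c("No", "Yes", "No", "No", "Yes", "Yes"),
                        levels = c("No", "Yes"))
)

# Rows worth inspecting: negative ages and Handcap values above 1.
suspect <- toy$Age < 0 | toy$Handcap > 1L
sum(suspect)

# Share of missed appointments, split by whether an SMS was sent.
# Assumption: No_show == "Yes" means the patient did not attend.
no_show_rate <- tapply(toy$No_show == "Yes", toy$SMS_received, mean)
no_show_rate
```

The same two steps should carry over to `appointments` by replacing `toy`, provided the columns have the types listed under Format.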