Predicting no-show medical appointments
appointments
A data frame with 110527 rows and 14 variables:
PatientId
double. Identification of a patient.
AppointmentID
double. Identification of each appointment.
Gender
factor. Male, Female.
ScheduledDay
datetime. The day and time someone called or registered the appointment. This is on or before the appointment day.
AppointmentDay
datetime. The day of the actual appointment, when the patient has to visit the doctor. No time of day is recorded.
Age
double. Age of the patient.
Neighbourhood
character. Where the appointment takes place.
Scholarship
integer. 0=FALSE, 1=TRUE. Scholarship is a social welfare program providing financial aid to poor Brazilian families.
Hypertension
integer. 0=FALSE, 1=TRUE.
Diabetes
integer. 0=FALSE, 1=TRUE.
Alcoholism
integer. 0=FALSE, 1=TRUE.
Handcap
integer. 0=FALSE, 1=TRUE. The data also contain values up to 4.
SMS_received
integer. 0=FALSE, 1=TRUE. Whether 1 or more messages were sent to the patient.
No_show
factor. Yes, No.
Joni Hoppen, Kaggle Medical Appointment No Shows https://www.kaggle.com/joniarroba/noshowappointments.
This Kaggle competition was designed to challenge participants to predict office no-shows. It is also a good dataset to practice date and time manipulation.
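As an illustration of the date and time practice mentioned above, the following is a minimal base R sketch that derives the waiting time in days between booking and appointment. It uses a small made-up data frame, toy, that only mimics the ScheduledDay and AppointmentDay columns. The names toy, wait_days and appt_weekday are assumptions introduced for this example, and the "UTC" time zone is chosen only to keep the result reproducible.

```r
# Toy rows mimicking the two date-time columns (values are illustrative only).
toy <- data.frame(
  ScheduledDay   = as.POSIXct(c("2016-04-29 10:27:01", "2016-05-10 12:13:17"),
                              tz = "UTC"),
  AppointmentDay = as.POSIXct(c("2016-05-09 00:00:00", "2016-05-10 00:00:00"),
                              tz = "UTC")
)

# AppointmentDay has no time of day, so compare calendar dates rather than
# raw timestamps; otherwise a same-day booking would give a negative wait.
toy$wait_days <- as.integer(
  as.Date(toy$AppointmentDay, tz = "UTC") - as.Date(toy$ScheduledDay, tz = "UTC")
)

# ISO weekday number of the appointment (1 = Monday, ..., 7 = Sunday).
toy$appt_weekday <- format(toy$AppointmentDay, "%u")

toy$wait_days
```

Comparing dates instead of timestamps is a design choice for this sketch; on the full data it may still be worth checking for negative waits before modelling.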
summary(appointments)
#> PatientId AppointmentID Gender
#> Min. :3.920e+04 Min. :5030230 Male :71840
#> 1st Qu.:4.173e+12 1st Qu.:5640286 Female:38687
#> Median :3.173e+13 Median :5680573
#> Mean :1.475e+14 Mean :5675305
#> 3rd Qu.:9.439e+13 3rd Qu.:5725524
#> Max. :1.000e+15 Max. :5790484
#> ScheduledDay AppointmentDay Age
#> Min. :2015-11-10 07:13:56 Min. :2016-04-29 00:00:00 Min. : -1.00
#> 1st Qu.:2016-04-29 10:27:01 1st Qu.:2016-05-09 00:00:00 1st Qu.: 18.00
#> Median :2016-05-10 12:13:17 Median :2016-05-18 00:00:00 Median : 37.00
#> Mean :2016-05-09 07:49:15 Mean :2016-05-19 00:57:50 Mean : 37.09
#> 3rd Qu.:2016-05-20 11:18:37 3rd Qu.:2016-05-31 00:00:00 3rd Qu.: 55.00
#> Max. :2016-06-08 20:07:23 Max. :2016-06-08 00:00:00 Max. :115.00
#> Neighbourhood Scholarship Diabetes Alcoholism
#> Length:110527 Min. :0.00000 Min. :0.00000 Min. :0.0000
#> Class :character 1st Qu.:0.00000 1st Qu.:0.00000 1st Qu.:0.0000
#> Mode :character Median :0.00000 Median :0.00000 Median :0.0000
#> Mean :0.09827 Mean :0.07186 Mean :0.0304
#> 3rd Qu.:0.00000 3rd Qu.:0.00000 3rd Qu.:0.0000
#> Max. :1.00000 Max. :1.00000 Max. :1.0000
#> Handcap SMS_received No_show Hypertension
#> Min. :0.00000 Min. :0.000 No :88208 Min. :0.0000
#> 1st Qu.:0.00000 1st Qu.:0.000 Yes:22319 1st Qu.:0.0000
#> Median :0.00000 Median :0.000 Median :0.0000
#> Mean :0.02225 Mean :0.321 Mean :0.1972
#> 3rd Qu.:0.00000 3rd Qu.:1.000 3rd Qu.:0.0000
#> Max. :4.00000 Max. :1.000 Max. :1.0000
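The summary above reports 22319 "Yes" values out of 110527 rows for No_show, roughly a 20% no-show rate. A minimal sketch of how such a rate could be computed, overall and by SMS status, is shown below on a small made-up data frame. The object toy and the derived names any_handicap, no_show_rate and rate_by_sms are hypothetical and only mimic the column types documented here.

```r
# Toy rows mimicking three of the documented columns (values are illustrative only).
toy <- data.frame(
  SMS_received = c(0L, 1L, 1L, 0L, 1L),
  Handcap      = c(0L, 0L, 2L, 1L, 0L),
  No_show      = factor(c("No", "Yes", "No", "No", "Yes"), levels = c("No", "Yes"))
)

# A 0/1 indicator converts cleanly to logical.
toy$SMS_received <- as.logical(toy$SMS_received)

# Handcap can exceed 1, so derive an explicit "any" flag instead of
# treating the column as a plain 0/1 indicator.
toy$any_handicap <- toy$Handcap > 0L

# Share of missed appointments, overall and split by SMS status.
no_show_rate <- mean(toy$No_show == "Yes")
rate_by_sms  <- tapply(toy$No_show == "Yes", toy$SMS_received, mean)

no_show_rate
rate_by_sms
```

The same two lines at the end should work unchanged on appointments, assuming SMS_received and No_show have the types listed in the format section.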