Breast cancer data

breast

Format

A data frame with 286 rows and 10 variables:

age

factor. 20-29, 30-39, 40-49, 50-59, 60-69, 70-79.

menopause

factor. lt40, ge40, premeno.

tumor.size

factor. 0-4, 5-9, 10-14, 15-19, 20-24, 25-29, 30-34, 35-39, 40-44, 45-49, 50-54.

inv.nodes

factor. 0-2, 3-5, 6-8, 9-11, 12-14, 15-17, 18+.

node.caps

factor. yes, no. 8 values are missing (NA).

deg.malig

factor. 1, 2, 3. Higher numbers indicate greater malignancy.

breast

factor. left, right.

breast.quad

factor. left-up, left-low, right-up, right-low, central. 1 value is missing (NA).

irradiate

factor. yes, no.

recurrence

factor. yes, no.

Source

This breast cancer domain was obtained from the University Medical Centre, Institute of Oncology, Ljubljana, Yugoslavia. Thanks go to M. Zwitter and M. Soklic for providing the data. Downloaded from OpenML.

Note

recurrence is the response (outcome) variable; the other nine variables are predictors.
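
Because recurrence is the outcome, its class balance is worth checking before modelling. The following is a minimal sketch that uses only the counts reported by summary(breast) on this page (201 "no", 85 "yes"); the object names are illustrative and are not part of the package.

```r
# Counts copied from the summary(breast) output on this page.
recurrence_counts <- c(no = 201, yes = 85)

# Proportion of each outcome class.
recurrence_prop <- recurrence_counts / sum(recurrence_counts)
print(round(recurrence_prop, 3))

# Accuracy obtained by always predicting the majority class ("no").
# Any classifier fitted to these data should be compared against this baseline.
baseline_accuracy <- max(recurrence_prop)
print(round(baseline_accuracy, 3))
```

With the data set loaded, the same proportions should be obtainable with prop.table(table(breast$recurrence)).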

Examples

summary(breast)
#>     age      menopause   tumor.size   inv.nodes   node.caps  deg.malig
#>  20-29: 1   ge40   :129   30-34  :60   0-2  :213   no  :222   1: 71
#>  30-39:36   lt40   :  7   25-29  :54   3-5  : 36   yes : 56   2:130
#>  40-49:90   premeno:150   20-24  :50   6-8  : 17   NA's:  8   3: 85
#>  50-59:96                 15-19  :30   9-11 : 10
#>  60-69:57                 10-14  :28   12-14:  3
#>  70-79: 6                 40-44  :22   15-17:  6
#>                           (Other):42   18+  :  1
#>    breast      breast.quad   irradiate recurrence
#>  left :152   central  : 21   no :218   no :201
#>  right:134   left-low :110   yes: 68   yes: 85
#>              left-up  : 97
#>              right-low: 24
#>              right-up : 33
#>              NA's     :  1
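
The summary shows missing values in node.caps and breast.quad, and deg.malig has a natural ordering. The sketch below shows one possible way to handle both. It runs on a small hypothetical stand-in data frame (toy) so that it is self-contained; the rows are invented for illustration and are not taken from breast. Dropping incomplete rows is only one option and may not suit every analysis.

```r
# Hypothetical stand-in rows; with the package loaded, use `breast` instead.
toy <- data.frame(
  node.caps  = factor(c("no", "yes", NA, "no"), levels = c("no", "yes")),
  deg.malig  = factor(c("1", "3", "2", "2"), levels = c("1", "2", "3")),
  recurrence = factor(c("no", "yes", "no", "no"), levels = c("no", "yes"))
)

# Count missing values per column.
na_counts <- colSums(is.na(toy))
print(na_counts)

# Keep only rows with no missing values.
toy_complete <- toy[complete.cases(toy), ]

# Treat degree of malignancy as ordered, since higher values
# indicate greater malignancy.
toy_complete$deg.malig <- factor(
  toy_complete$deg.malig,
  levels = c("1", "2", "3"),
  ordered = TRUE
)
print(is.ordered(toy_complete$deg.malig))
```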