Breast cancer data
breast
A data frame with 286 rows and 10 variables:
age
factor. 20-29, 30-39, 40-49, 50-59, 60-69, 70-79.
menopause
factor. lt40, ge40, premeno.
tumor.size
factor. 0-4, 5-9, 10-14, 15-19, 20-24, 25-29, 30-34, 35-39, 40-44, 45-49, 50-54.
inv.nodes
factor. 0-2, 3-5, 6-8, 9-11, 12-14, 15-17, 18+.
node.caps
factor. yes, no. Contains 8 missing values.
deg.malig
factor. 1, 2, 3. Higher numbers indicate greater malignancy.
breast
factor. left, right.
breast.quad
factor. left-up, left-low, right-up, right-low, central. Contains 1 missing value.
irradiate
factor. yes, no.
recurrence
factor. yes, no.
This breast cancer domain was obtained from the University Medical Centre, Institute of Oncology, Ljubljana, Yugoslavia. Thanks go to M. Zwitter and M. Soklic for providing the data. Downloaded from OpenML.
recurrence is the response (outcome) variable.
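Because recurrence is the outcome, it can help to check how unbalanced its two classes are before modelling. The sketch below is a minimal illustration that hard-codes the recurrence counts reported by summary(breast) on this page, so it runs without the data loaded. The object names recurrence_counts, class_share and baseline_accuracy are illustrative assumptions, not part of the package.

```r
# Counts copied from the summary(breast) output on this page
recurrence_counts <- c(no = 201, yes = 85)

# Share of each outcome class among the 286 rows
class_share <- recurrence_counts / sum(recurrence_counts)
round(class_share, 3)
#>    no   yes
#> 0.703 0.297

# Accuracy obtained by always predicting the majority class ("no");
# a fitted classifier should beat this to be useful
baseline_accuracy <- max(class_share)
```

With the data loaded, table(breast$recurrence) should give the same counts.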
summary(breast)
#>  age        menopause     tumor.size   inv.nodes   node.caps   deg.malig
#>  20-29: 1   ge40   :129   30-34  :60   0-2  :213   no  :222    1: 71
#>  30-39:36   lt40   :  7   25-29  :54   3-5  : 36   yes : 56    2:130
#>  40-49:90   premeno:150   20-24  :50   6-8  : 17   NA's:  8    3: 85
#>  50-59:96                 15-19  :30   9-11 : 10
#>  60-69:57                 10-14  :28   12-14:  3
#>  70-79: 6                 40-44  :22   15-17:  6
#>                           (Other):42   18+  :  1
#>  breast      breast.quad     irradiate   recurrence
#>  left :152   central  : 21   no :218     no :201
#>  right:134   left-low :110   yes: 68     yes: 85
#>              left-up  : 97
#>              right-low: 24
#>              right-up : 33
#>              NA's     :  1
#>
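The summary shows missing values in node.caps (8) and breast.quad (1), which many modelling functions drop or reject. As a rough sketch, the hypothetical helper na_report() below counts missing values per column and the number of complete rows. It is demonstrated on a small toy data frame whose values are illustrative only and are not rows from breast.

```r
# Hypothetical helper: missing values per column and number of complete rows
na_report <- function(df) {
  list(
    missing_per_column = colSums(is.na(df)),
    complete_rows = sum(complete.cases(df))
  )
}

# Toy stand-in that reuses two column names from `breast`;
# the values are made up for illustration
toy <- data.frame(
  node.caps = factor(c("no", "yes", NA, "no")),
  breast.quad = factor(c("left-low", NA, "central", "left-up"))
)

report <- na_report(toy)
report$missing_per_column  # one NA in each toy column
report$complete_rows       # two toy rows have no NA

# With the package data loaded, the same call would be: na_report(breast)
```

Whether to drop incomplete rows or impute them depends on the analysis; with at most 9 of 286 rows affected, either choice changes little.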