Housing Values in Suburbs of Boston
Boston
A data frame with 506 rows and 14 variables:
crim
per capita crime rate by town.
zn
proportion of residential land zoned for lots over 25,000 sq.ft.
indus
proportion of non-retail business acres per town.
chas
Charles River dummy variable (= 1 if tract bounds river; 0 otherwise).
nox
nitrogen oxides concentration (parts per 10 million).
rm
average number of rooms per dwelling.
age
proportion of owner-occupied units built prior to 1940.
dis
weighted mean of distances to five Boston employment centres.
rad
index of accessibility to radial highways.
tax
full-value property-tax rate per $10,000.
ptratio
pupil-teacher ratio by town.
black
proportion of blacks by town.
lstat
lower status of the population (percent).
medv
median value of owner-occupied homes in $1000s.
Harrison, D. and Rubinfeld, D.L. (1978) Hedonic prices and the demand for clean air. J. Environ. Economics and Management 5, 81–102.
Belsley, D.A., Kuh, E. and Welsch, R.E. (1980) Regression Diagnostics. Identifying Influential Data and Sources of Collinearity. New York: Wiley.
The Boston data frame was obtained from Venables and Ripley's MASS package.
The data utilize census tracts in the Boston Standard Metropolitan
Statistical Area in 1970. Two changes have been made from the original
dataset. The dollar values for tax and medv have been converted to 2020
US dollars (assuming a 299.4% cumulative inflation rate). Additionally,
the black variable has been transformed from its original metric
(1000*(proportion of blacks by town - 0.63)^2) to the simple proportion
of blacks by town.
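The two changes described above can be sketched in base R. This is an illustration under stated assumptions, not the package's actual conversion code: the input values are hypothetical, the multiplier is taken to be 1 + 299.4/100, and the positive square root is assumed when inverting the original black metric (which is consistent with the minimum of 0.65 in the summary below, but is not stated in the documentation).

```r
# Hypothetical original-scale values, chosen only for illustration.
tax_orig   <- c(200, 400)    # property-tax rate per $10,000, original dollars
medv_orig  <- c(10, 25)      # median home value in $1000s, original dollars
black_orig <- c(0, 10, 90)   # original metric: 1000 * (p - 0.63)^2

# A 299.4% cumulative increase multiplies each dollar amount by 1 + 2.994.
# (Assumption: the exact multiplier the package applied may differ.)
inflation_multiplier <- 1 + 299.4 / 100

tax_2020  <- tax_orig * inflation_multiplier
medv_2020 <- medv_orig * inflation_multiplier

# Invert the original metric to recover a proportion p.
# (p - 0.63)^2 has two roots; taking the positive one is an assumption.
black_prop <- 0.63 + sqrt(black_orig / 1000)

print(round(tax_2020, 2))
print(round(medv_2020, 2))
print(round(black_prop, 2))
```

Because the original metric squares the deviation from 0.63, the inversion cannot tell whether a town was above or below 0.63 without additional information, so the recovered values should be read with that ambiguity in mind.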
#>
#> The data frame Boston has 506 observations and 14 variables.
#>
#> Overall
#> pos varname type    n_unique n_miss pct_miss
#>   1 crim    numeric      504      0       0%
#>   2 zn      numeric       26      0       0%
#>   3 indus   numeric       76      0       0%
#>   4 chas    integer        2      0       0%
#>   5 nox     numeric       81      0       0%
#>   6 rm      numeric      446      0       0%
#>   7 age     numeric      356      0       0%
#>   8 dis     numeric      412      0       0%
#>   9 rad     integer        9      0       0%
#>  10 tax     numeric       66      0       0%
#>  11 ptratio numeric       46      0       0%
#>  12 black   numeric      357      0       0%
#>  13 lstat   numeric      455      0       0%
#>  14 medv    numeric      229      0       0%
#>
#> Numeric Variables
#>           n    mean     sd  skew    min     p25  median     p75     max
#> crim    506    3.61   8.60  5.19   0.01    0.08    0.26    3.68   88.98
#> zn      506   11.36  23.32  2.21   0.00    0.00    0.00   12.50  100.00
#> indus   506   11.14   6.86  0.29   0.46    5.19    9.69   18.10   27.74
#> chas    506    0.07   0.25  3.39   0.00    0.00    0.00    0.00    1.00
#> nox     506    0.55   0.12  0.72   0.38    0.45    0.54    0.62    0.87
#> rm      506    6.28   0.70  0.40   3.56    5.89    6.21    6.62    8.78
#> age     506   68.57  28.15 -0.60   2.90   45.02   77.50   94.07  100.00
#> dis     506    3.80   2.11  1.01   1.13    2.10    3.21    5.19   12.13
#> rad     506    9.55   8.71  1.00   1.00    4.00    5.00   24.00   24.00
#> tax     506 1628.87 672.46  0.67 746.13 1113.21 1316.70 2657.34 2836.89
#> ptratio 506   18.46   2.16 -0.80  12.60   17.40   19.05   20.20   22.00
#> black   506    1.22   0.11 -3.34   0.65    1.24    1.26    1.26    1.26
#> lstat   506   12.65   7.14  0.90   1.73    6.95   11.36   16.96   37.97
#> medv    506   89.91  36.70  1.10  19.95   67.93   84.59   99.75  199.50