A data frame containing wine reviews for wines made in the United States, France, and Italy.

wine

Format

A data frame with 5,000 rows and 13 variables The variables are as follows:

id

a unique identifier

country

country that the wine is from

description

description of the wine, written by the taster

designation

vineyard within the winery where the wine is from

points

number of points the wine was rated on a scale of 1 - 100

price

cost of the bottle of wine($)

province

the province or state that the wine is from

region

the wine growing area within the province or state

taster_name

name of the reviewer

taster_twitter_handle

twitter handle of the reviewer

title

title of the wine review; contains the vintage of the wine

variety

type of grape used to make the wine

winery

winery where the wine was produced

Source

The full data set can be found on Kaggle https://www.kaggle.com/zynicide/wine-reviews

Note

This is a good dataset for both quantitative and text mining.

Examples

summary(wine)
#>       id              country          description        designation       
#>  Length:5000        Length:5000        Length:5000        Length:5000       
#>  Class :character   Class :character   Class :character   Class :character  
#>  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
#>                                                                             
#>                                                                             
#>                                                                             
#>                                                                             
#>      points           price              province   
#>  Min.   : 80.00   Min.   :  7.00   California:1639  
#>  1st Qu.: 87.00   1st Qu.: 22.00   Washington: 701  
#>  Median : 89.00   Median : 35.00   Oregon    : 437  
#>  Mean   : 89.34   Mean   : 41.76   Burgundy  : 252  
#>  3rd Qu.: 91.00   3rd Qu.: 50.00   Tuscany   : 217  
#>  Max.   :100.00   Max.   :550.00   Alsace    : 202  
#>                                    (Other)   :1552  
#>                   region               taster_name   taster_twitter_handle
#>  Columbia Valley (WA): 299   Roger Voss      :1100   Length:5000          
#>  Willamette Valley   : 184   Virginie Boone  : 795   Class :character     
#>  Alsace              : 179   Kerin O’Keefe   : 785   Mode  :character     
#>  Russian River Valley: 159   Paul Gregutt    : 735                        
#>  Champagne           : 157   Matt Kettmann   : 533                        
#>  Napa Valley         : 126   Sean P. Sullivan: 413                        
#>  (Other)             :3896   (Other)         : 639                        
#>     title                               variety    
#>  Length:5000        Pinot Noir              : 770  
#>  Class :character   Chardonnay              : 497  
#>  Mode  :character   Red Blend               : 420  
#>                     Cabernet Sauvignon      : 298  
#>                     Bordeaux-style Red Blend: 243  
#>                     Syrah                   : 220  
#>                     (Other)                 :2552  
#>                    winery    
#>  Columbia Crest       :  25  
#>  Louis Latour         :  18  
#>  Chateau Ste. Michelle:  16  
#>  Cayuse               :  13  
#>  Chehalem             :  13  
#>  :Nota Bene           :  12  
#>  (Other)              :4903  

table(wine$points)
#> 
#>  80  81  82  83  84  85  86  87  88  89  90  91  92  93  94  95  96  97  98  99 
#>   4   6  27  56 129 232 355 503 687 549 661 621 468 361 219  88  26   3   3   1 
#> 100 
#>   1 

hist(wine$points)


plot(wine$points, wine$price, 
     main = "Wine Prices by Score", 
     xlab = "Score (1-100)", 
     ylab = "Price ($)")

 
wine[1, ]
#>   id country
#> 1  1      US
#>                                                                                                                                                                                                                                                                                                description
#> 1 Cabernet Sauvignon (80%) makes up the majority of this wine, with the rest equal parts Cabernet Franc and Merlot. Aromas of high-toned green herbs, spice, black cherry and blackberry lead to lively cranberry and cherry flavors. It shows a sense of freshness, though at times it seems a bit green.
#>              designation points price   province                  region
#> 1 Frederick Estate Grown     89    50 Washington Walla Walla Valley (WA)
#>        taster_name taster_twitter_handle
#> 1 Sean P. Sullivan         @wawinereport
#>                                                                              title
#> 1 Spring Valley Vineyard 2013 Frederick Estate Grown Red (Walla Walla Valley (WA))
#>                    variety                 winery
#> 1 Bordeaux-style Red Blend Spring Valley Vineyard