A dataset containing detailed plot descriptions of 3670 British movies released between 1920 and 2017, taken from their Wikipedia pages.
movies
A data frame with 3670 rows and 7 variables. The variables are as follows:
year: Year of film release
title: Film title
director: Director of film
cast: Main cast of actors and actresses in film
genre: Genre of film
url: Wikipedia web page
plot: Film's plot from Wikipedia page
The data are a subset of the Kaggle Wikipedia movie plots dataset: https://www.kaggle.com/jrobischon/wikipedia-movie-plots
The free-text plot variable makes this a good dataset for text mining.
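As a rough illustration of the kind of text mining the plot text supports, the sketch below counts word frequencies using base R only. The small `movies` data frame it builds is a stand-in holding just the first row shown in the example output on this page, so that the snippet runs on its own; with the real dataset loaded, that stand-in step would be skipped. The tokenisation rule (lower-case, split on anything that is not a letter or apostrophe) is an assumption chosen for simplicity, not a recommended pipeline.

```r
# Stand-in for the packaged data: only the first row from this page's
# example output, with three of the seven variables (an assumption made
# so the sketch is self-contained).
movies <- data.frame(
  year = 1920,
  title = "The Amateur Gentleman",
  plot = paste(
    "In Regency Britain a young man tries to establish his father's",
    "innocence of an accused crime, by travelling to London disguised",
    "as a gentleman."
  ),
  stringsAsFactors = FALSE
)

# Lower-case each plot and split on runs of characters that are neither
# letters nor apostrophes.
tokens <- strsplit(tolower(movies$plot), "[^a-z']+")

# Flatten to a single character vector and drop any empty strings.
words <- unlist(tokens)
words <- words[nchar(words) > 0]

# Tabulate the words, most frequent first.
word_counts <- sort(table(words), decreasing = TRUE)
head(word_counts, 5)
```

On the full data the same steps apply to all 3670 plots at once. Removing common stop words before tabulating would usually give more informative counts.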
summary(movies)
#>       year         title             director             cast
#>  Min.   :1920   Length:3670        Length:3670        Length:3670
#>  1st Qu.:1951   Class :character   Class :character   Class :character
#>  Median :1967   Mode  :character   Mode  :character   Mode  :character
#>  Mean   :1973
#>  3rd Qu.:2001
#>  Max.   :2017
#>     genre               url                plot
#>  Length:3670        Length:3670        Length:3670
#>  Class :character   Class :character   Class :character
#>  Mode  :character   Mode  :character   Mode  :character
#>
#>
#>
movies[1, ]
#>   year                 title      director                             cast
#> 1 1920 The Amateur Gentleman Maurice Elvey Langhorn Burton, Cecil Humphreys
#>   genre                                                             url
#> 1 drama https://en.wikipedia.org/wiki/The_Amateur_Gentleman_(1920_film)
#>   plot
#> 1 In Regency Britain a young man tries to establish his father's innocence of an accused crime, by travelling to London disguised as a gentleman.