Difference between revisions of "TheBigScreen"
Line 47: | Line 47: | ||
==Related Work== | ==Related Work== | ||
+ | |||
+ | {| class="wikitable" width="100%" | ||
+ | |- | ||
+ | ! style="width: 50%;" | Visualizations | ||
+ | ! Learning Points | ||
+ | |- | ||
+ | | | ||
+ | <p><center>'''Top 20 Most Profitable Movies''' </center></p> | ||
+ | [[File:20 Most Profitable Movies.png|400px|center]] | ||
+ | <p><center>'''Source''': https://www.kaggle.com/param1/d/deepmatrix/imdb-5000-movie-dataset/the-money-makers</center></p> | ||
+ | || | ||
+ | * Simple and easy to read | ||
+ | * Not aesthetically pleasing | ||
+ | * Data points not properly explained e.g. why do some points have tails | ||
+ | |- | ||
+ | | <p><center> '''Duration of Movie vs. IMDB Score''' </center></p> | ||
+ | [[File:Duration vs IMDB Score.png|400px|center]] | ||
+ | <p><center> '''Source''': https://www.kaggle.com/benjaminlott/d/deepmatrix/imdb-5000-movie-dataset/imdb-5000-general-data-analysis </center> </p> | ||
+ | || | ||
+ | * Colours are visually appealing | ||
+ | * Messy | ||
+ | * Visualization was so big that legend could not fit on the same window | ||
+ | * Tooltip tags are unformatted and messy | ||
+ | |||
+ | |- | ||
+ | | <p><center> '''Age Ratings vs IMDB Score''' </center></p> | ||
+ | [[File:Ratings vs Score.png|400px|center]] | ||
+ | <p><center> '''Source''': https://www.kaggle.com/adhok93/d/deepmatrix/imdb-5000-movie-dataset/eda-with-plotly</center></p> | ||
+ | || | ||
+ | * Appropriate and informative use of boxplots to visualize continuous variable, IMDB scores | ||
+ | * Messy, especially when tooltip is displayed | ||
+ | * Unnecessary legend and use of colours | ||
+ | * No y-axis title | ||
+ | |} | ||
==Technical Challenges== | ==Technical Challenges== |
Revision as of 16:01, 4 October 2016
Problem and Motivation
Is it possible to predict how good a movie will be before it even screens? This is a subjective question. While some rely on movie critics and early reviews, others depend on instinct. However, reviews can take a long time to gather and human instinct is unreliable. Thousands of movies are produced every year, and all of them are clamouring for the $11 we spend on movie tickets! Our group wants to know if we can predict which movies are worth spending your money and time on.
Data
We are using the IMDB 5000 Movie Dataset from Kaggle. The Internet Movie Database (IMDB) is an online database of information related to films, television programs and video games [1]. Among its functions, IMDB allows users to rate movies on a scale of 1 to 10.
The dataset's variables include, but are not limited to:
- movie title
- director name
- actors’ names and Facebook likes
- length of movie
- year
- gross earnings
- genres
- language
- country
- content rating
- budget
- IMDB rating
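As a rough illustration of how the variables listed above might be read into typed records, here is a minimal Python sketch. The column names (`movie_title`, `imdb_score`, and so on) and the two sample rows are placeholders invented for this example, not values taken from the dataset; check the names against the header of the actual Kaggle file. The actors' names and Facebook likes are left out for brevity.

```python
import csv
import io
from dataclasses import dataclass
from typing import List, Optional

# Placeholder rows for illustration only; the column names are assumptions
# and the values are invented, not copied from the real dataset.
SAMPLE_CSV = """movie_title,director_name,duration,title_year,gross,genres,language,country,content_rating,budget,imdb_score
Example Movie A,Jane Doe,120,2010,50000000,Action|Drama,English,USA,PG-13,20000000,7.1
Example Movie B,John Roe,95,2014,,Comedy,English,UK,R,5000000,6.3
"""


@dataclass
class Movie:
    title: str
    director: str
    duration_min: Optional[int]
    year: Optional[int]
    gross: Optional[float]
    genres: List[str]
    language: str
    country: str
    content_rating: str
    budget: Optional[float]
    imdb_score: Optional[float]


def _num(value, cast):
    """Convert a CSV cell to a number, or None when the cell is empty."""
    value = value.strip()
    return cast(value) if value else None


def load_movies(handle):
    """Read movie rows from an open CSV file handle into Movie records."""
    movies = []
    for row in csv.DictReader(handle):
        movies.append(Movie(
            title=row["movie_title"].strip(),
            director=row["director_name"].strip(),
            duration_min=_num(row["duration"], int),
            year=_num(row["title_year"], int),
            gross=_num(row["gross"], float),
            # Assumes several genres share one cell, separated by "|".
            genres=row["genres"].split("|") if row["genres"] else [],
            language=row["language"].strip(),
            country=row["country"].strip(),
            content_rating=row["content_rating"].strip(),
            budget=_num(row["budget"], float),
            imdb_score=_num(row["imdb_score"], float),
        ))
    return movies


movies = load_movies(io.StringIO(SAMPLE_CSV))
```

To read a real file instead of the inline sample, pass an open file handle, for example `load_movies(open(path, newline="", encoding="utf-8"))`, where `path` is wherever the downloaded CSV is stored. Empty numeric cells become `None` so that missing gross or budget figures are not mistaken for zero.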
Related Work
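Visualizations | Learning Points |
---|---|
Top 20 Most Profitable Movies (image: 20 Most Profitable Movies.png). Source: https://www.kaggle.com/param1/d/deepmatrix/imdb-5000-movie-dataset/the-money-makers | Simple and easy to read; Not aesthetically pleasing; Data points not properly explained e.g. why do some points have tails |
Duration of Movie vs. IMDB Score (image: Duration vs IMDB Score.png). Source: https://www.kaggle.com/benjaminlott/d/deepmatrix/imdb-5000-movie-dataset/imdb-5000-general-data-analysis | Colours are visually appealing; Messy; Visualization was so big that legend could not fit on the same window; Tooltip tags are unformatted and messy |
Age Ratings vs IMDB Score (image: Ratings vs Score.png). Source: https://www.kaggle.com/adhok93/d/deepmatrix/imdb-5000-movie-dataset/eda-with-plotly | Appropriate and informative use of boxplots to visualize continuous variable, IMDB scores; Messy, especially when tooltip is displayed; Unnecessary legend and use of colours; No y-axis title |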