ISSS608 2016 17T1 Group9 Report

From Visual Analytics and Applications


Introduction

IMDB (Internet Movie Database) is an online database of information about films from all over the world. The dataset considered here lists 5,043 movies reviewed by users. It contains 28 variables and covers films from 66 countries spanning 100 years.

Interactive tools have been developed to let the user explore movies by genre. A dashboard lets the user explore the top movies, top actors, top directors, and the earnings and profits of the top movies. Treemap explorations let the user view the distribution of movies by geography, language, income and rating.

Motivation

The motivation behind this project is to cater to movie enthusiasts, enabling them to explore the IMDB movie dataset at a glance using various interactive visualization techniques. The visualization preferences of a wide variety of users have been taken into consideration.

Review and Critique of Past Work

The dataset was made available through Kaggle, where a user scraped 5000+ movies from the IMDB website using a Python library called "scrapy". We used this data after minor cleansing. We drew on D3.js libraries to develop the interactive sunburst diagrams, and on the work of Jason Davies on word clouds to create the word cloud visualizations.

While many predictive and exploratory models have been built on this dataset, we found limited past work that offers interactive visualization to the user. Our project aims to leverage the interactivity of our tools to create visualizations that, to our knowledge, have not been created before.

Design Framework

To build effective interactive exploratory visualizations of a movie dataset, we need to keep each user's preferences in mind. While some users may want to explore movies by genre, others may want to look at movies by rating. Some users may want information on their favourite actors, while others may have favourite directors whose data they want to explore.

We have tried to design a wide range of tools keeping in mind the preferences a user may have. The designs of the visualization tools are described below.

Dashboard Showcasing Top Grossing Movies

A dashboard has been developed in Tableau, leveraging its interactivity, to let the user view various parameters of the top movies by gross revenue. Using the slider in the top right corner of the dashboard, the user selects a gross revenue threshold. The tool then displays the following parameters for all movies with gross revenue above the selected threshold:
• Movie title with gross revenue
• Actor Names
• Director Names
• IMDB Rating
• Net profit that the movie made
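
The threshold filter behind the dashboard can be sketched in a few lines of Python. This is only an illustration, not the Tableau implementation: the `Movie` fields, the function name `movies_above_threshold`, the sample records, and the definition of net profit as gross revenue minus budget are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Movie:
    title: str
    director: str
    actors: List[str]
    imdb_rating: float
    gross: float   # gross revenue (hypothetical field name)
    budget: float  # production budget (hypothetical field name)


def movies_above_threshold(movies, min_gross):
    """Return one dashboard row per movie grossing more than min_gross,
    ordered from highest to lowest gross revenue."""
    selected = [m for m in movies if m.gross > min_gross]
    selected.sort(key=lambda m: m.gross, reverse=True)
    return [
        {
            "title": m.title,
            "gross": m.gross,
            "actors": m.actors,
            "director": m.director,
            "imdb_rating": m.imdb_rating,
            # Assumption: net profit is gross revenue minus budget.
            "net_profit": m.gross - m.budget,
        }
        for m in selected
    ]


# Made-up records, for illustration only.
sample_movies = [
    Movie("Film A", "Director X", ["Actor 1", "Actor 2"], 7.9, 500e6, 200e6),
    Movie("Film B", "Director Y", ["Actor 3"], 6.5, 120e6, 150e6),
    Movie("Film C", "Director Z", ["Actor 1"], 8.4, 300e6, 100e6),
]
rows = movies_above_threshold(sample_movies, min_gross=250e6)
```

Moving the slider corresponds to calling the function again with a different `min_gross`.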

Tableau Dashboard

Top Rated Movies By Genre

Most users have favourite genres. Some people like action movies while others love romantic movies, and some may want to explore movies that are both action and romantic. To enable this exploration, we designed an interactive tool in D3.js that shows the distribution of the top 100 movies by genre. If a user wants to view the top movies that belong to the crime, drama and thriller genres, they can click on the respective titles, and the top movies that belong to all of those genres are displayed.
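
The selection logic described above (keep only the movies that carry every chosen genre, then rank by rating) might look roughly like the following Python sketch. The record layout, the function name `top_movies_in_genres`, and the sample data are hypothetical; the actual tool is written in D3.js.

```python
def top_movies_in_genres(movies, selected_genres, limit=100):
    """Return titles of the highest-rated movies that belong to
    every genre in selected_genres."""
    wanted = set(selected_genres)
    # A movie qualifies only if it has all of the selected genres.
    matches = [m for m in movies if wanted.issubset(m["genres"])]
    matches.sort(key=lambda m: m["rating"], reverse=True)
    return [m["title"] for m in matches[:limit]]


# Made-up records, for illustration only.
genre_sample = [
    {"title": "Film A", "rating": 8.1, "genres": {"Crime", "Drama", "Thriller"}},
    {"title": "Film B", "rating": 8.6, "genres": {"Crime", "Drama"}},
    {"title": "Film C", "rating": 9.0,
     "genres": {"Crime", "Drama", "Thriller", "Mystery"}},
    {"title": "Film D", "rating": 7.2, "genres": {"Action", "Romance"}},
]
picked = top_movies_in_genres(genre_sample, ["Crime", "Drama", "Thriller"])
```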

Zoomable Sunburst

Distribution of Movies by Genre

This tool is designed for a user who wants to view the overall distribution of movies by genre. It uses D3.js to provide insight into what percentage of movies belongs to each sequence of genres.

It differs from the previous tool in that it gives the user an aggregated view of what percentage of movies belongs to each genre sequence.
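
As a rough illustration of the aggregation behind such a view, the Python sketch below computes, for every genre sequence prefix, the percentage of movies whose genre list starts with it; each prefix would correspond to one arc of the sunburst. It assumes each movie's genres arrive as a single delimiter-separated string (for example `Action|Adventure`); that format, the function name `genre_prefix_shares`, and the sample strings are assumptions, not details taken from the project.

```python
from collections import Counter


def genre_prefix_shares(genre_strings, sep="|"):
    """Map each genre-sequence prefix to the percentage of movies
    whose genre sequence starts with that prefix."""
    prefix_counts = Counter()
    for s in genre_strings:
        genres = s.split(sep)
        # Count the movie once for every prefix of its genre sequence.
        for depth in range(1, len(genres) + 1):
            prefix_counts[tuple(genres[:depth])] += 1
    total = len(genre_strings)
    return {p: 100.0 * n / total for p, n in prefix_counts.items()}


# Made-up genre strings, for illustration only.
shares = genre_prefix_shares(
    ["Action|Adventure", "Action|Thriller", "Drama", "Action|Adventure"]
)
```

Here `shares[("Action",)]` is the share of the inner-ring arc for Action, and `shares[("Action", "Adventure")]` is the share of the arc one ring further out.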

Sequential Sunburst

Movie Distribution by Geography and Language

For users who would like to view movies by the country or language in which they were produced, a treemap provides an interactive way to explore movies by geographic location and language. The treemap has the following features:
• Shows the distribution of movies by geographic location in the form of a hierarchical tree.
• Shows the distribution of movies by language in the form of a hierarchical tree.
• Shows the gross earnings of each movie by the size of its rectangle.
• Shows the IMDB rating of each movie by the colour intensity of its rectangle.
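
The features above map naturally onto a nested structure in which each leaf carries a size and a colour value. The Python sketch below assumes one particular nesting (country, then language, then movie) and scales the rating to the range 0 to 1 by dividing by 10; the nesting order, the key names, the function name `build_treemap`, and the sample records are assumptions for illustration and may differ from the actual treemap.

```python
def build_treemap(movies):
    """Nest movies as country -> language -> movie, with gross earnings
    as rectangle size and the scaled rating as colour intensity."""
    root = {"name": "movies", "children": []}
    countries = {}
    languages = {}
    for m in movies:
        country = m["country"]
        if country not in countries:
            countries[country] = {"name": country, "children": []}
            root["children"].append(countries[country])
        key = (country, m["language"])
        if key not in languages:
            languages[key] = {"name": m["language"], "children": []}
            countries[country]["children"].append(languages[key])
        languages[key]["children"].append({
            "name": m["title"],
            "size": m["gross"],              # rectangle area
            "colour": m["rating"] / 10.0,    # 0 (pale) to 1 (intense)
        })
    return root


# Made-up records, for illustration only.
tree = build_treemap([
    {"title": "Film A", "country": "USA", "language": "English",
     "gross": 500e6, "rating": 8.0},
    {"title": "Film B", "country": "USA", "language": "Spanish",
     "gross": 40e6, "rating": 5.0},
    {"title": "Film C", "country": "France", "language": "French",
     "gross": 90e6, "rating": 7.0},
])
```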

Treemap

Word Cloud of Actors and Directors

A simple yet highly effective word cloud helps the user visualise the artists who appear in the most movies. The word cloud covers actors and directors. The size of each name indicates the number of movies that person has appeared in or directed.
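
The sizing rule can be illustrated with a short Python sketch that counts how often each name occurs and scales the count linearly to a font size. The linear scale, the size range, the function name `word_sizes`, and the sample names are assumptions for the example; the word cloud layout itself is not reproduced here.

```python
from collections import Counter


def word_sizes(names, min_size=10, max_size=60):
    """Map each name to a font size that grows linearly with the
    number of times the name occurs in names."""
    counts = Counter(names)
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    span = hi - lo
    sizes = {}
    for name, n in counts.items():
        # If every name occurs equally often, draw all at max_size.
        frac = (n - lo) / span if span else 1.0
        sizes[name] = min_size + frac * (max_size - min_size)
    return sizes


# One entry per movie credit; made-up names, for illustration only.
sizes = word_sizes(["Actor A", "Actor A", "Actor A",
                    "Actor B", "Actor B", "Actor C"])
```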

Actors Worldly

Discussion

Our project is a humble attempt to let the user explore movies from around the world interactively. The tools designed as part of the project can be used by movie enthusiasts for the following:
• To view the top movies in the genre of their choice without going through a series of steps on the IMDB website.
• To explore movies from the country and language they choose, along with their ratings and monetary performance at the US box office, in a few clicks.
• To understand which actors and directors are associated with the top grossing movies and how much profit these movies have made, all through a simple interactive dashboard.
• To see at a glance, from the word cloud, which actors and directors have worked on the most movies, without getting into the data.

Future Work

The motivation behind this project is to make the life of a movie enthusiast simpler. To extend our work, we would attempt to integrate the various interactive tools into a single easy-to-use dashboard that would be convenient for movie enthusiasts.

To cater to a wider base of movie enthusiasts, we would like to integrate the IMDB database with other movie databases to get a more comprehensive view of the film industries across the world.