IS428 2013-14 Term1 Assign2 Jeremy ZHONG Jiahao

From Visual Analytics for Business Intelligence

Assignment 2: Interactive Data Visualisation

In this digital economy age, massive and complex data have been captured and stored in organization databases and/or data warehouses. By and large, these data contain a large number of variables describing a particular product or activity. Due to limitations in perceptual and screen space, data visualization techniques available in traditional business intelligence systems tend to be confined to univariate and bivariate displays such as the bar chart, pie chart, histogram, and scatter plot. As a result, many important relationships that live in these data remain undiscovered. In this assignment, you are required to design a data visualization system for analyzing and visualizing high-dimensional attributes from a dataset of your choice. The goal of this assignment is not to develop a new visualization tool, but to apply the interactive visualization techniques you have learned by using commercial-off-the-shelf software. It also aims to let you gain hands-on experience in using the visualization tool and, at the same time, evaluate the pros and cons of the tool in real-world applications.

Step 1: Theme of Interest

Gaming Industry - Console Games

I chose this because it's an industry I'm interested in and can relate to.

Step 2: Questions for Investigation

Over the years, technological advancement has led to increased popularity of console games. However, are console games really getting better? Or are developers just jumping on the bandwagon?

This assignment seeks to answer this main question:

  • Has the growing popularity of console games (over the released years) affected the quality trend of games produced (game ratings)?
    • In addition, were sales affected?

Step 3: Data Selection

3.1. Data Required


Unlike government (and many other) datasets, video game datasets are not readily available for download.

This posed a huge problem, as I had to resort to scraping data from websites (VgChartz and GameRankings.com) - a technique that I wasn't familiar with. Well, let's take up the challenge!

3.2. Tools Required for Scraping

I combed the web and... settled on two scraper tools (the Scraper Extension Tool and uBot Studios) to get data out of web pages into spreadsheets.

Note: The Scraper Extension Tool has a minimal learning curve. However, it's only capable of scraping data from the current page. GameRankings.com paginates its search results, which makes it a huge pain to acquire data from: I can't automate the "Click-Next-Page-And-Scrape" process, and manually scraping across ~500 search result pages was never an option. I turned to uBot Studios for that - more to be explained shortly.

3.3. Procedure - Data Scraping of Sales for VgChartz

I installed the Scraper Extension Tool and navigated to http://www.vgchartz.com/gamedb/?platform=X360&results=200

Based on the diagram below, here's what I did...

  1. Highlighted the data I wanted to extract and right-clicked to open the Chrome context menu.
  2. Clicked "Scrape Similar".

Jeremy-A2-1.png


I was presented with this screen and the option to "Export to Google Docs". Well, perfect!

Jeremy-A2-2.png

Once exported... VOILA!!

Jeremy-A2-3.png


Yet Another Problem
Scraper only allows you to scrape data from the current page. If you want more, you have to manually click on the next page, and then hit Scraper again.

VgChartz, apparently, only displays up to 200 search results per page.

Jeremy-A2-4.png

I can't imagine repeating the same process 17 (3400 / 200) times for one platform. I have 3 platforms to extract data from.

A neat trick was to change the results parameter on the URL to my desired amount:

Jeremy-A2-5.png
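
For anyone who'd rather build these URLs in a script than edit them by hand, here's a minimal sketch of the trick. The base URL and the platform and results parameters come from the link above; the platform codes "Wii" and "PS3" and the value 3400 for every platform are my assumptions (only X360 appears in the original link), so check them against the site before relying on them.

```python
from urllib.parse import urlencode

# Base URL taken from the VgChartz link used earlier in this section.
BASE_URL = "http://www.vgchartz.com/gamedb/"


def gamedb_url(platform: str, results: int) -> str:
    """Build a game database URL that asks for `results` rows on one page."""
    query = urlencode({"platform": platform, "results": results})
    return f"{BASE_URL}?{query}"


# Platform codes other than "X360" are assumed, not confirmed by the source.
for platform in ("X360", "Wii", "PS3"):
    print(gamedb_url(platform, 3400))
```

Whether the site honours an arbitrarily large results value is something to verify in the browser first; the sketch only builds the URL.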

This same scraping process was repeated across the top 3 popular game consoles - Xbox 360, Wii and PS3. Thank god for the Scraper Tool.



3.4. Procedure - Data Scraping of Game Reviews from GameRankings.com

Just when I thought my life would be easier with the Scraper Tool, GameRankings.com proved me wrong.

They paginate their search results and don't allow me to specify the size of the returned result set.

Jeremy-A2-6.png

I had to turn to uBot Studios - http://ubotstudio.com/ - A tool that automates common tasks online, such as web scraping.

I wrote a custom bot from scratch to scrape data from multiple pages across the 3 popular console game platforms.

As this assignment's main focus is not on scraping, I won't explain how this bot was created, but will instead give a summary of what it does.

Jeremy-A2-7.png

In a nutshell, this bot would:

  • Scrape the necessary fields (i.e. GameTitle and Game Reviews)
  • Automate clicking of "Next-Page" (if any)
  • Repeat process till done
  • Populate all extracted data into a .csv file
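
The bot itself was built in uBot Studios, so the following is not its actual code. It's only a rough Python sketch of the same scrape / next page / repeat / write-to-CSV loop. The `fetch_page` callable is hypothetical: it stands in for whatever downloads and parses one results page, and I'm assuming it returns a list of (title, review) rows, or nothing once the pages run out. The column names are also made up for illustration.

```python
import csv
from typing import Callable, List, Optional, TextIO, Tuple

Row = Tuple[str, str]  # (game title, review score), both kept as text


def scrape_all_pages(fetch_page: Callable[[int], Optional[List[Row]]]) -> List[Row]:
    """Call fetch_page(0), fetch_page(1), ... until a page comes back empty."""
    rows: List[Row] = []
    page = 0
    while True:
        page_rows = fetch_page(page)  # hypothetical: fetch + parse one page
        if not page_rows:             # None or [] means there is no next page
            break
        rows.extend(page_rows)
        page += 1                     # the "click Next Page" step
    return rows


def write_csv(rows: List[Row], out: TextIO) -> None:
    """Write a header plus all scraped rows to an open text file."""
    writer = csv.writer(out)
    writer.writerow(["GameTitle", "GameReview"])  # assumed column names
    writer.writerows(rows)
```

To save to disk, open the file with `open("reviews.csv", "w", newline="", encoding="utf-8")` and pass it to `write_csv`.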


Do contact me if you're interested to try out this custom bot.

3.5. Merging/Joining 2 Data Files by Game Title

Jeremy-A2-8.png
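
For reference, a join by game title can also be scripted. This is a minimal sketch and not the tool shown in the screenshot: it assumes both files were already loaded as lists of dicts (e.g. with `csv.DictReader`), that both have a `GameTitle` column, and that the review file has a `GameReview` column. Those column names are my assumptions. Titles are matched case-insensitively with extra spaces ignored, and games without a matching review are dropped (an inner join).

```python
from typing import Dict, List


def normalise(title: str) -> str:
    """Lower-case a title and collapse whitespace so near-identical titles match."""
    return " ".join(title.lower().split())


def join_by_title(sales_rows: List[Dict[str, str]],
                  review_rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Inner-join sales rows with review rows on the (normalised) game title."""
    reviews = {normalise(r["GameTitle"]): r for r in review_rows}
    merged = []
    for sale in sales_rows:
        review = reviews.get(normalise(sale["GameTitle"]))
        if review is not None:
            # Keep the sales columns and attach the review score.
            merged.append({**sale, "GameReview": review["GameReview"]})
    return merged
```

Exact-title matching like this will still miss games whose names are spelt differently on the two sites, so some manual clean-up is likely.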

3.6. Merge-ception - Merging 3 Merged Data Files (Across 3 Platforms) into ONE Master Copy

Jeremy-A2-9.png
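
The master copy is just the three per-platform files stacked on top of each other, with a column recording which platform each row came from. A minimal sketch, assuming each merged file is already a list of dicts and using a made-up `Platform` column name:

```python
from typing import Dict, List


def build_master(per_platform: Dict[str, List[Dict[str, str]]]) -> List[Dict[str, str]]:
    """Stack the per-platform rows into one list, tagging each row with its platform."""
    master = []
    for platform, rows in per_platform.items():
        for row in rows:
            master.append({"Platform": platform, **row})
    return master
```

Keeping the platform as a column (rather than as separate files) is what lets the visualisation tool filter or colour by console later.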



Step 4: Introduction of Visual Analysis

Over the years, technological advancement has led to increased popularity of console games. However, are games really getting better? Or are people just jumping on the bandwagon? Were sales affected?

Recap: Main Question for Investigation

  • Has the growing popularity of console games (over the released years) affected the quality trend of games produced (game ratings)? In addition, were sales affected?

Note: Only data for the top 3 game consoles from 2005 to 2011 were used for this analysis; namely, Xbox 360, PS3 and Wii.

Visualisation 1: Games Published Over the Years

Before we even begin, I would like to find out how many games were published across the years.

Jeremy-A2-10.png

Observations and Key Findings

  • Whoa! The number of console games published has indeed grown over the years. Not that interesting, eh? Hang in there!



Visualisation 2: Quality of Games Over the Years

Next, I would like to find out how the average quality of games (Game Ratings) has changed over the years (Released Year).

Jeremy-A2-11.png

Observations and Key Findings

  1. What?! The visualisation above shows that as the years go by, the average quality of games gets progressively worse (with a minor uptick from 2008 onwards).
  2. This got me really curious. Logically, if the quality of games were bad, sales would be bad as well, right?



Visualisation 3: Global Sales over the Years



Jeremy-A2-12.png

Observations and Key Findings

Nah, I was proven otherwise.

  • Sales were, in fact, increasing despite the (lower) quality of games produced. Note, however, that sales actually declined from 2010 onwards.
    • That got me really curious - my prime suspect would be the release of new game consoles (e.g. Nintendo Wii U, Nintendo 3DS, PlayStation Vita).

Anyway, let's get back on track.

No wait, I can't. I'm still confused. More games suck but people buy more?!!

Let's visualise it.



Visualisation 4: More Games, Lesser Quality, Higher Sales



Jeremy-A2-13.png

Observations and Key Findings

  1. The above visualisation shows that the average quality of games got progressively worse over the years. However, there's still an increase in sales.
    • I can't wrap my head around this fact.
  2. I had to investigate - firstly, by finding the sales breakdown of games across the years.

Note: The bigger the circle, the more games were produced for that released year.
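
The three numbers this visualisation combines (games per year, average rating per year, total sales per year) can be sanity-checked outside the visualisation tool. Here's a minimal sketch, assuming the master file is loaded as a list of dicts with numeric `year`, `rating` and `sales` fields. Those field names are hypothetical, and this is not how the charts above were actually built.

```python
from collections import defaultdict
from typing import Dict, List


def summarise_by_year(games: List[dict]) -> Dict[int, dict]:
    """Per release year: number of games, mean rating, and summed sales."""
    by_year = defaultdict(list)
    for game in games:
        by_year[game["year"]].append(game)

    summary = {}
    for year in sorted(by_year):
        rows = by_year[year]
        summary[year] = {
            "games": len(rows),                                         # circle size
            "avg_rating": sum(r["rating"] for r in rows) / len(rows),  # quality
            "total_sales": sum(r["sales"] for r in rows),              # sales
        }
    return summary
```

Note that the average rating and the total sales behave differently as more games are added: every extra game adds to the sales total, but a low-rated game pulls the average down.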



Visualisation 5: Sales Breakdown of Games across the Years



Jeremy-A2-14.png

Observations and Key Findings

  1. This visualisation shows that in every year, a handful of games dominated the top of the sales chart. For example, Year 2009's sales came mostly from Wii Sports Resort, New Super Mario, COD MW2, etc… Most of these are games with high review ratings.

Note: Game sales were broken down across the years (indicated by different colour codes).
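
One way to put a number on "a handful of games dominated" is the share of a year's sales that comes from its top few titles. This is only a sketch of that idea; the choice of k = 3 is arbitrary and the function is not something the original analysis used.

```python
from typing import List


def top_share(sales: List[float], k: int = 3) -> float:
    """Fraction of total sales contributed by the k best-selling games."""
    total = sum(sales)
    if total == 0:
        return 0.0  # no sales at all, so no meaningful share
    top_k = sorted(sales, reverse=True)[:k]
    return sum(top_k) / total
```

A value close to 1 for a small k would back up the observation that a few blockbuster titles carry each year's sales.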



Step 5: Final Conclusion

With all the above analysis, I can finally conclude that while high-quality console games with high game ratings enjoy abundant sales, the truckload of other (lower-quality) games drags the average rating down, and the growing popularity of console games (more games) keeps total sales climbing regardless of the quality trend.

This explains the phrase, "More Games, Lesser Quality. Yet, Higher Sales."



Final Dashboard

Jeremy-A2-Dashboard.png

(( Click here to View Full Interactive Visualisation ))



Feedback

Would love to hear your feedback! :)



Awesome dashboard! XD

--fuhua