FoodScapers Proposal

From Visual Analytics for Business Intelligence
Revision as of 00:01, 26 November 2018 by Mallikang.2015 (talk | contribs) (→‎Datasets and Sources)




Problem and Motivation

Singapore is well known for its diverse, fused food culture, with influences ranging from Chinese, Indian, and Malay to French and even Italian cuisine. Interestingly, this unique blend of authentic Singaporean food can be found right here in the little red dot. More often than not, restaurants in present-day Singapore serve fusion food inspired by Singapore’s diversity. Instead of dividing the country, Singapore’s diversity serves as the spark for its unique melting pot of food culture.

As the Singapore government seeks to place more emphasis on our food landscape as integral to our intangible heritage[1], it becomes all the more urgent that our policy-planners from the Ministry of Culture, Community and Youth (MCCY), National Heritage Board (NHB), and the Singapore Tourism Board (STB) fully understand the current gastronomic landscape of Singapore.

One way to do that is through data.

Our data visualisation project hopes to provide planners with a cockpit view of the local food landscape by mapping the distribution of the different cuisines in Singapore. This should help planners identify ethnic food enclaves, trace the popularity trends of specific local foods, and understand how the price elasticity of local food affects its economic accessibility and review ratings.

Objectives

This data visualisation project aims to provide policy makers with a dashboard that will allow planners to gain insights into the local food landscape.
With this dashboard, planners will be able to:

  1. Get an overview of the distribution of restaurants by cuisine type, and discover ethnic food enclaves by filtering by cuisine type.
  2. Get a sense of how the local food landscape changes by the time of day. Users will be able to filter restaurants by opening hours.
  3. Understand the price elasticity of different local dishes. Users will be able to filter restaurants for a specific local dish, and see the price elasticity of this selection on a chart.
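The opening-hours filter in objective 2 could be sketched as follows. This is a minimal illustration only, assuming hours are stored as strings such as "11:30 am - 10:00 pm" (the format shown for the scraped restaurant details); the function names `parse_hours` and `is_open` are hypothetical and not part of the project's code.

```python
from datetime import datetime, time
from typing import Tuple


def parse_hours(text: str) -> Tuple[time, time]:
    """Parse a string such as '11:30 am - 10:00 pm' into (opens, closes)."""
    start, end = (part.strip() for part in text.split("-"))
    fmt = "%I:%M %p"  # 12-hour clock with an am/pm marker
    return (datetime.strptime(start, fmt).time(),
            datetime.strptime(end, fmt).time())


def is_open(text: str, at: time) -> bool:
    """Return True if a restaurant with the given hours is open at `at`."""
    opens, closes = parse_hours(text)
    if opens <= closes:
        return opens <= at < closes
    # Closing time falls after midnight, e.g. '6:00 pm - 2:00 am'.
    return at >= opens or at < closes
```

A dashboard filter could then keep only the restaurants for which `is_open(hours, selected_time)` is true. Restaurants with several opening periods per day would need one such string per period.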

Related Works

Visualizations Explanation
SL Tableau.png


Yelp Insights : Data Visualization
Source: http://shirleylei.com/projects/yelp.html

Shirley Lei's work on Yelp's restaurant data uses Tableau as a visualisation platform to draw correlations between the neighbourhood a restaurant is in, its primary attributes (type of cuisine, outlet type, etc.), and its secondary attributes (WIFI and child-friendly category tags).

GovTechDirty.png


Is Dirty Delicious?
Source: https://blog.data.gov.sg/is-dirty-delicious-f176bbb6e15e

Lin Zhaowei & Gaille Teo's work on visualising the cleanliness ratings of hawker stalls combines Yelp's restaurant ratings dataset with NEA's cleanliness score dataset to draw parallels between a stall's cleanliness rating and its Yelp user review ratings. The results attempt to debunk the common misconception that 'dirtier' hawker stalls are reviewed better.


Screen Shot 2018-10-14 at 5.41.42 PM.png


From Hawaii to the Heartland: Poke’s journey from local favorite to food craze to national staple
Source: https://www.yelpblog.com/2018/08/pokes-journey-from-local-favorite-to-national-staple

Carl Bialik is a Data Editor at Yelp. In this work, he uses data visualisations to tell the story of the rapid rise in popularity of Hawaiian poke between 2013 and 2018. Visualising the rise of poke over time, in relation to the popularity of other food categories, gives users better context on the American food landscape, one in which Hawaiian poke is an outlier.



Datasets and Sources

We need a vast dataset that covers the various aspects of dining in Singapore, and for this reason, we have chosen the following data sources to obtain a rich dataset.

  1. Postal Codes - List of all possible postal codes where a restaurant/hawker centre could exist within Singapore. Retrieved from the OneMap API.
  2. Possible Restaurants of Interest - List of all restaurants that exist on the Yelp portal at the postal codes provided. Retrieved from the Yelp API.
  3. Restaurant Details - Details posted on Yelp for the restaurants of interest. Retrieved using a crawling and scraping script on the Yelp website.

The tables below list the name, description, and an example of each attribute in each dataset.

1. Postal Codes: Retrieved in JSON format using the OneMap API. Only showing attributes used.
Source : http://developers.onemap.sg

Name | Description | Example
Address | Full address of the postal code | "7 MAXWELL ROAD AMOY STREET FOOD CENTRE SINGAPORE 069111"
Coordinates | Latitude and longitude of the postal code | { "latitude": 1.28035, "longitude": 103.84472 }

2. List of Restaurants at a postal code: Retrieved as a JSON Array. Only showing attributes used.
Source : https://api.yelp.com/v3/businesses/search?location=069120

Name | Description | Example
id | Unique ID of the restaurant | "fY1IkBnRft1KR0O2tqu7pg"
Name | Name of the restaurant | "Tian Tian Hainanese Chicken Rice"
URL | URL of the restaurant's Yelp page, used for web scraping | https://www.yelp.com/biz/tian-tian-hainanese-chicken-rice-singapore-7
Categories | Type of restaurant cuisine | "Hainan"
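As a rough sketch of how the per-postal-code lookup might work, the snippet below builds a request against the search URL shown above and keeps only the four attributes listed in the table. It assumes the Yelp API key is sent as a Bearer token and that each business carries a `categories` list of objects with a `title` field; the helper names, the placeholder key, and the sample payload are hypothetical. No network call is made here.

```python
from urllib.parse import urlencode
from urllib.request import Request

YELP_SEARCH_URL = "https://api.yelp.com/v3/businesses/search"


def build_search_request(postal_code: str, api_key: str) -> Request:
    """Build (but do not send) a search request for one postal code."""
    query = urlencode({"location": postal_code})
    return Request(f"{YELP_SEARCH_URL}?{query}",
                   headers={"Authorization": f"Bearer {api_key}"})


def extract_restaurants(payload: dict) -> list:
    """Keep only the attributes used by the project from a search response."""
    rows = []
    for biz in payload.get("businesses", []):
        rows.append({
            "id": biz["id"],
            "name": biz["name"],
            "url": biz["url"],
            "categories": [c["title"] for c in biz.get("categories", [])],
        })
    return rows


# Hand-written sample shaped like the table above (not real API output).
sample_payload = {
    "businesses": [{
        "id": "fY1IkBnRft1KR0O2tqu7pg",
        "name": "Tian Tian Hainanese Chicken Rice",
        "url": "https://www.yelp.com/biz/tian-tian-hainanese-chicken-rice-singapore-7",
        "categories": [{"alias": "hainan", "title": "Hainan"}],
    }]
}
restaurants = extract_restaurants(sample_payload)
request = build_search_request("069111", "YOUR_API_KEY")
```

Sending the request (for example with `urllib.request.urlopen`) and decoding the JSON body would yield the `payload` passed to `extract_restaurants`.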


3. Restaurant Details: Retrieved using Python and returned in JSON format
Source : https://www.yelp.com/biz/din-tai-fung-singapore-5?osq=Din+Tai+Fung

Name | Description | Example
Name | Name of the restaurant | Din Tai Fung
Location | Address at which the restaurant is located | Wisma Atria, 435 Orchard Rd, Level 4, Singapore 238877, Singapore, Orchard
Opening Hours | Opening hours of the restaurant | 11:30 am - 10:00 pm
Cuisines - Menu | Cuisines offered by the restaurant | Taiwanese, Dim Sum
Price Range | Price range of food offered by the restaurant | $$ (later converted to a price bucket)
Rating | Average overall rating provided by patrons | 4.5 stars
Reviews | Number of reviews of the restaurant | 116
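The conversion of scraped fields into analysis-ready values might look like the sketch below. It maps the "$" price range to a bucket and extracts the number from a rating string such as "4.5 stars". The bucket labels are purely illustrative assumptions; the proposal does not say which buckets were actually used.

```python
import re
from typing import Optional

# Hypothetical bucket labels, keyed by the number of "$" symbols.
PRICE_BUCKETS = {1: "budget", 2: "moderate", 3: "pricey", 4: "premium"}


def price_bucket(price_range: str) -> Optional[str]:
    """Map a price range such as '$$' to a bucket label, or None if unknown."""
    return PRICE_BUCKETS.get(price_range.count("$"))


def parse_rating(text: str) -> float:
    """Extract the numeric rating from a string such as '4.5 stars'."""
    match = re.search(r"\d+(?:\.\d+)?", text)
    if match is None:
        raise ValueError(f"no rating found in {text!r}")
    return float(match.group())
```

Returning `None` for a missing price range keeps restaurants without price information in the dataset instead of silently assigning them a bucket.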

Based on the retrieved data, we will do data cleaning and entity extraction in order to understand some of the dishes on the menu and their cost (if available).
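One way the dish extraction could be approached is dictionary lookup with fuzzy matching, which tolerates spelling errors and colloquial fillers in review text. The sketch below is an assumption-laden illustration: the dish dictionary, the filler list, and the 0.8 similarity cutoff are all hypothetical, and it uses Python's standard `difflib` in place of a dedicated entity-recognition tool.

```python
import difflib
import re

# Hypothetical dictionary of local dishes and colloquial fillers.
DISH_DICTIONARY = ["chicken rice", "laksa", "char kway teow",
                   "nasi lemak", "xiao long bao"]
FILLERS = {"lah", "leh", "lor"}


def normalise(text: str) -> list:
    """Lower-case the text, keep alphabetic tokens, and drop fillers."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in FILLERS]


def find_dishes(review: str, cutoff: float = 0.8) -> list:
    """Return dictionary dishes that approximately appear in the review."""
    tokens = normalise(review)
    found = []
    for dish in DISH_DICTIONARY:
        n = len(dish.split())
        # Slide a window of n tokens over the review and compare to the dish.
        for i in range(len(tokens) - n + 1):
            window = " ".join(tokens[i:i + n])
            score = difflib.SequenceMatcher(None, window, dish).ratio()
            if score >= cutoff:
                found.append(dish)
                break
    return found
```

A lower cutoff catches more misspellings but also raises the chance of false matches, so the threshold would need tuning against real reviews.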


Storyboard

This dashboard will be designed according to the data visualisation task taxonomy posited by Ben Shneiderman [2], which suggests that interactions should first provide an overview, then allow zooming and filtering, and finally provide details on demand. This flow should give users a more intuitive interaction with the dashboard.

The storyboard below displays an instance of how Shneiderman's task taxonomy is applied:

Storyboard Step Explanation
Step 1 : Overview
Screenshot 2018-11-21 at 4.11.04 AM.png
  • First, we will show users an overall distribution of all restaurants as a choropleth map.
  • Users will be able to change the aggregation of the map between Planning Subzone (Masterplan 2014) and a 200 m hexagonal grid. Users will also be given the option to turn off aggregation and choose a point map layer (with jitter) instead.
  • All panels will be movable, and the Visualisations panel will also be hideable, in addition to being movable.
  • The Navigation bar will allow users to switch between the different themes we wish to explore, namely "Interactive Map" and "Data Explorer".


Step 2 : Filter
Filter Panel
FiltersFinal.jpg
  • Users will be able to pick from an array of filter options to control the subzone, opening hours, and cuisine types.
  • Filters will remain constant throughout the application, and will not reset between tabs.
Step 3 : Zoom into the details
Analytics Panel
ScatterPlot.jpg

The analytics panel includes three functions: a Hexbin Density Plot, an Opening Hours Heatmap, and an F&B Cuisine Type Treemap.


  • The Hexbin Density Plot will reveal the relationship between ratings and the number of reviews on Yelp.
  • The Opening Hours Heatmap will show which restaurants are open within the time range set by the user.
  • The F&B Cuisine Type Treemap will show the main cuisine categories and branch out into their subcategories to reflect the diversity of food in Singapore.
Step 4 : Details on Demand
Popup on Hover
Details.jpg
  • Zooming into a chosen Hexagon or planning area displays the Restaurants as a point/pin map.
  • Hovering over the pin will reveal a popup, with the selected restaurant's details.
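The hexagonal aggregation described in Step 1 can be illustrated with a small sketch that assigns each restaurant to a hexagon and counts restaurants per cell. Several things here are assumptions, not the project's actual method: "200 m" is taken as the flat-to-flat width of a pointy-top hexagon, coordinates are projected with a simple equirectangular approximation around an arbitrary origin near central Singapore, and all names are hypothetical. The application itself is built in R, so this only shows the idea.

```python
import math
from collections import Counter

EARTH_RADIUS_M = 6_371_000.0
REF_LAT, REF_LON = 1.3521, 103.8198      # assumed projection origin
HEX_WIDTH_M = 200.0                      # assumed flat-to-flat hexagon width
HEX_SIZE_M = HEX_WIDTH_M / math.sqrt(3)  # centre-to-corner distance


def to_metres(lat: float, lon: float) -> tuple:
    """Approximate local x/y offsets in metres from the reference point."""
    x = math.radians(lon - REF_LON) * EARTH_RADIUS_M * math.cos(math.radians(REF_LAT))
    y = math.radians(lat - REF_LAT) * EARTH_RADIUS_M
    return x, y


def _hex_round(q: float, r: float) -> tuple:
    """Round fractional axial coordinates to the nearest hexagon."""
    x, z = q, r
    y = -x - z
    rx, ry, rz = round(x), round(y), round(z)
    dx, dy, dz = abs(rx - x), abs(ry - y), abs(rz - z)
    # Recompute the component with the largest rounding error so x+y+z == 0.
    if dx > dy and dx > dz:
        rx = -ry - rz
    elif dy > dz:
        ry = -rx - rz
    else:
        rz = -rx - ry
    return rx, rz


def hex_from_metres(x: float, y: float) -> tuple:
    """Axial (q, r) index of the pointy-top hexagon containing (x, y)."""
    q = (math.sqrt(3) / 3 * x - y / 3) / HEX_SIZE_M
    r = (2 / 3 * y) / HEX_SIZE_M
    return _hex_round(q, r)


def hex_cell(lat: float, lon: float) -> tuple:
    """Hexagon index for a latitude/longitude pair."""
    return hex_from_metres(*to_metres(lat, lon))


def count_by_cell(points: list) -> Counter:
    """Count how many (lat, lon) points fall into each hexagon."""
    return Counter(hex_cell(lat, lon) for lat, lon in points)
```

The resulting counts per cell are what a choropleth layer would colour; switching to the point layer simply skips this aggregation.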


Tools/ Libraries

Explanation

UI Layer

  • leaflet
  • maptools
  • shinythemes


Data Modelling Layer

  • plot.ly
  • gridExtra
  • lattice
  • maptools
  • dplyr
  • spatstat
  • sf
  • flexclust
  • rgdal


Data Layer

  • Yelp API


AARCH.png

Project Timeline

The Project Timeline is live and constantly updated. You can view our progress via the link below:

Food Scrapers Project Timeline (Google Sheets)

Timeline Foodscrappers.png

Ideation Drafts

This section is updated based on meetings and the ideation process behind the visualization that will be our final output.

Data of Essence to our visualization: An attempt to select relevant variables from the data-rich source that is Yelp
Data Discussion.jpg
Storyboard objectives: An attempt to clarify and narrow our scope
storyboard objectives
Storyboard Draft: An attempt to visualise our ideas based on the scope and variables chosen.
Storyboard 1.jpg
Storyboard 2.jpg
Storyboard Outline: An attempt to segment the visualisation into Exploratory: Explicit, Implicit Filters, and Explanatory works.
Dashboard ideation.png


Challenges and Assumptions

The challenges we face include:

No. | Challenges | Description | Proposed Solution
1. | Data Cleaning and Transformation [Country's culture] | The data collected from patrons' reviews generally reflects the patrons' own characteristics and often carries their cultural identity through fillers like "lah", as well as spelling errors, which can make entity recognition difficult. | Use a dictionary and spell-check tool to improve data quality so that entity recognition can be applied successfully.
2. | Data Cleaning and Transformation [Coding language] | While the data was scraped using Python, it was meant to be visualised in an R Shiny application. The difference in language disrupted the loading of the dataset into the R Shiny application. For example, a list is not recognised as a list, but is instead read as a character string. |
3. | Information Presentation | Determining the most effective way to visualise and display the data in an interactive format is of the essence. The most important information must be easily discernible from the visualisations. | Gain exposure to different visualisation techniques; follow and revisit techniques explored in class; look at DataCamp courses.
4. | Information Presentation [Limitations of Shiny App] | The colours of the density function in the R Shiny application are preset and cannot be adjusted to facilitate a better user experience. |
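For challenge 2, one possible mitigation (a suggestion only, not something the team reports having adopted) is to serialise list-valued fields as JSON text when exporting from Python. Python's default string form of a list uses single quotes and is not valid JSON, whereas a JSON string can be decoded on the R side, for example with `jsonlite::fromJSON`. The function and field names below are hypothetical.

```python
import csv
import io
import json


def write_rows(rows: list, list_fields: list, handle) -> None:
    """Write rows to CSV, encoding the named list-valued fields as JSON text."""
    writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    for row in rows:
        out = dict(row)
        for field in list_fields:
            out[field] = json.dumps(row[field])  # e.g. ["Taiwanese", "Dim Sum"]
        writer.writerow(out)


# Round-trip check on an in-memory file using values from the example tables.
buffer = io.StringIO()
write_rows([{"name": "Din Tai Fung", "cuisines": ["Taiwanese", "Dim Sum"]}],
           ["cuisines"], buffer)
buffer.seek(0)
loaded = list(csv.DictReader(buffer))
cuisines = json.loads(loaded[0]["cuisines"])
```

The CSV cell still arrives in R as a character string, but because it is valid JSON it can be parsed back into a vector per row rather than being treated as opaque text.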

The assumptions we made include:

  • If data regarding a restaurant is not available on Yelp, we assume that the restaurant is not popular, i.e. it receives little footfall, and would therefore not add much value to our visualisation.

References

[1] NHB. (n.d.). Hawker Culture. Retrieved October 14, 2018, from https://www.oursgheritage.sg/hawker-culture/
[2] Shneiderman, B. (1996). The eyes have it: A task by data type taxonomy for information visualizations. IEEE Symposium on Visual Languages (VL96), pp. 336-343.
[3] Zaccheus, M. (2018, August 20). Singapore hawker culture to be nominated for Unesco listing. Retrieved November 20, 2018, from https://www.straitstimes.com/singapore/spore-hawker-culture-to-be-nominated-for-unesco-listing
[4] Lei, S. (n.d.). Yelp Insights : Data Visualization. Retrieved November 20, 2018, from http://shirleylei.com/projects/yelp.html
[5] Lin, Z., & Teo, G. (2016, November 18). Is dirty delicious? – Data.gov.sg Blog. Retrieved November 20, 2018, from https://blog.data.gov.sg/is-dirty-delicious-f176bbb6e15e
[6] Bialik, C. (2018, August 23). From Hawaii to the Heartland: Poke's journey from local favorite to food craze to national staple. Retrieved November 20, 2018, from https://www.yelpblog.com/2018/08/pokes-journey-from-local-favorite-to-national-staple
[7] Ong, T. (2013, June 22). Will Kampong Glam Turn into Kampong Glum? Retrieved November 20, 2018, from https://sg.asia-city.com/city-living/article/will-kampong-glam-turn-into-kampong-glum
