JJJ: Proposal

From Visual Analytics for Business Intelligence
Revision as of 11:38, 14 November 2016 by James.chua.2013 (talk | contribs)


Proposal



Problem & Motivation

Land scarcity has been a persistent issue for Singapore since its independence: the entire country is only about the size of a typical city in other developed nations, or smaller. As such, one of Singapore’s main urban planning challenges is to optimize land use without compromising residents' standard of living.

One way to assess the effectiveness of urban planning is to study commuter patterns, that is, how people travel for work and education. Those who live near their workplaces enjoy shorter journeys. Others live far from their workplaces and spend long hours travelling, for example an individual who lives in Tampines but works in Tuas. Hence, we would like to create a tool that gives a closer look at current commuter patterns in Singapore. We believe such a tool would help urban planners identify potential problems and patterns in the current design, so that they can improve the urban landscape in preparation for population growth.

Objectives:

  • To explore recent commuter data for bus travel in Singapore
  • To visualize commuter patterns during the morning peak hours
  • To explore the impact of current commuter patterns on possible challenges in urban planning
  • To create a visualization for an easy and intuitive understanding of the current situation for the average Singaporean

Background Survey of Related Work

Related Works | What We Can Learn Based on Sources

Commuting Patterns of Industrial Workers

% Trips by Industrial Workers - Working Social Document.JPG

Source: http://web.mit.edu/11.521/papers/WorkingSocialDocument_Aug2012_v2.pdf

  • The static chart is not intuitive and prevents users from tracing a single path
  • The colour scheme makes the lines difficult to differentiate (green vs. yellow)
  • Too many lines make it difficult for users to identify the region names
  • Brushing and filtering are needed to focus on the area of concern while muting other points to reduce clutter in the visualisation

An analysis of Bus Travelling Time

Bus travel time.png

Source: http://sgtptr.chrissng.net/

  • The legend lacks clear and uniform intervals
  • Decimal values for travel time in minutes are not appropriate in this context (i.e. people usually round to an integer when indicating a range)
  • Every origin is also a destination, which makes separate origin and destination legends redundant
  • The names of the places are well labelled and clear

Traveller Distances

Traveller distances - Working Social Document.JPG

Source: http://web.mit.edu/11.521/papers/WorkingSocialDocument_Aug2012_v2.pdf

  • Effective visualisation of the clustering effect
  • Can be improved for multivariate analysis by incorporating shapes or colours to categorise the data

Choice of Dataset

Data Source: https://data.dex.sg/organization/land-transport-authority

  1. Preparation of Data

We downloaded a total of 8 CSV files, which consist of commuter and bus data, from the data source above. The main dataset contains up to 1,000,000 commuter records, so we need to analyse carefully which parts we need. The attributes of concern to us when cleaning the data are:

  1. Commuter ID
  2. Ride Start Time
  3. Ride End Time
  4. Boarding Bus Stop
  5. Alighting Bus Stop
  6. Bus Service Number

Other datasets provide additional attributes, such as the coordinates of the bus stops.
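As a minimal sketch of how the commuter records could be combined with the bus stop coordinates, the example below joins two small in-memory CSV samples on the bus stop code. The column names, stop codes and coordinates are all made up for illustration; the actual files may be laid out differently.

```python
import csv
import io

# Hypothetical sample data: column names, stop codes and coordinates are
# invented for illustration and may not match the real files.
TRIPS_CSV = """commuter_id,ride_start,ride_end,boarding_stop,alighting_stop,service
C001,07:42:10,08:15:30,75009,22009,969
C002,08:05:00,08:20:45,75009,76059,10
C003,18:30:00,19:05:00,22009,99999,969
"""

STOPS_CSV = """stop_code,latitude,longitude
75009,1.3541,103.9436
76059,1.3521,103.9440
22009,1.3330,103.7423
"""

def load_stops(text):
    """Index bus stops by code so that each lookup is a dictionary access."""
    return {
        row["stop_code"]: (float(row["latitude"]), float(row["longitude"]))
        for row in csv.DictReader(io.StringIO(text))
    }

def attach_coordinates(trips_text, stops):
    """Return (joined, unmatched): trips with coordinates attached, and
    trips whose boarding or alighting stop is missing from the lookup."""
    joined, unmatched = [], []
    for trip in csv.DictReader(io.StringIO(trips_text)):
        origin = stops.get(trip["boarding_stop"])
        destination = stops.get(trip["alighting_stop"])
        if origin is None or destination is None:
            unmatched.append(trip)
            continue
        joined.append({**trip, "origin": origin, "destination": destination})
    return joined, unmatched

stops = load_stops(STOPS_CSV)
joined, unmatched = attach_coordinates(TRIPS_CSV, stops)
print(len(joined), "trips joined;", len(unmatched), "trips with unknown stops")
```

Keeping the unmatched trips in a separate list, rather than silently dropping them, makes it easier to spot stop codes that are missing from the coordinate file.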

Description of the approach

Type of Visual | Why do we think it is useful

Proportional Symbol Map

Photo 2016-10-07 18-12-00.jpg

Source: https://www.e-education.psu.edu/natureofgeoinfo/c3_p17.html

  • To show the proportion of people travelling out of or within each region
  • A bar chart will be generated on each region of the map, e.g. East
  • Each bar chart represents the proportion of people travelling between regions (East-West, East-North, etc.) and within the region (East-East)
  • Further analysis can be added by binning the journey start time into morning, afternoon and night, used as filters to show commuter flow by time of day
  • Interactive visual that can be used with brushing to highlight and narrow down to an area of interest
  • To provide an easy and intuitive understanding of existing commuter patterns
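The proportions behind each bar chart, together with the time-of-day filter, could be computed along the following lines. This is only a sketch: the hour cut-offs for morning, afternoon and night are our own assumption, and the trips are made-up (origin region, destination region, start hour) tuples rather than real data.

```python
from collections import Counter, defaultdict

def time_period(hour):
    """Bin a start hour (0-23) into a period of the day.
    Assumed cut-offs: morning 05:00-11:59, afternoon 12:00-17:59, night otherwise."""
    if 5 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    return "night"

def flow_proportions(trips, period=None):
    """For each origin region, return the share of trips going to each
    destination region, optionally restricted to one period of the day."""
    counts = defaultdict(Counter)
    for origin, destination, hour in trips:
        if period is not None and time_period(hour) != period:
            continue
        counts[origin][destination] += 1
    return {
        origin: {dest: n / sum(dests.values()) for dest, n in dests.items()}
        for origin, dests in counts.items()
    }

# Made-up trips: (origin region, destination region, start hour).
trips = [
    ("East", "East", 7), ("East", "West", 8), ("East", "East", 9),
    ("East", "North", 8), ("West", "West", 7), ("West", "East", 19),
]

morning = flow_proportions(trips, "morning")
whole_day = flow_proportions(trips)
print(morning["East"])
```

Each inner dictionary corresponds to one bar chart on the map: its keys are the destination regions and its values sum to 1 for that origin.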

Sunburst Diagram

Photo 2016-10-07 18-35-05.jpg

Source: https://bl.ocks.org/kerryrodden/7090426

  • Possible alternative to the above Proportional Symbol Map
  • Focuses more on the hierarchical breakdown with the specific proportion at each hierarchy
  • Each level of the hierarchy needs to be broken down further to show the full potential of this visual
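One possible way to prepare data for such a hierarchical view is to aggregate trips into a nested name/children/value structure, which is a shape commonly passed to hierarchy-based layouts. The sketch below assumes a three-level breakdown (origin region, destination region, time of day) that we chose purely for illustration, and the trips are made up.

```python
import json

def build_hierarchy(paths):
    """Aggregate equal-length paths into a nested tree of
    {"name": ..., "children": [...]} nodes, with a trip count
    stored as "value" on each leaf."""
    root = {"name": "all trips", "children": []}
    for path in paths:
        node = root
        for name in path:
            children = node.setdefault("children", [])
            child = next((c for c in children if c["name"] == name), None)
            if child is None:
                child = {"name": name}
                children.append(child)
            node = child
        node["value"] = node.get("value", 0) + 1
    return root

# Made-up trips: (origin region, destination region, period of day).
paths = [
    ("East", "East", "morning"),
    ("East", "West", "morning"),
    ("East", "West", "morning"),
    ("West", "East", "night"),
]

tree = build_hierarchy(paths)
print(json.dumps(tree, indent=2))
```

Adding a further level, for example the planning area within each region, only requires longer tuples in `paths`.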

Interactive dashboard with line graph, box plot, and data table

  • To show the distribution of people boarding a particular bus service at different bus stops
  • Box plot to show the distribution of the number of people boarding, with summary statistics, e.g. median
  • Line graph with the number of people boarding on the y-axis and, on the x-axis, each bus stop served by the bus service in sequence
  • Brushing the line graph or box plot will highlight the same data point(s) on the other graph, with their details displayed in the data table
  • Could be used to identify redundant bus stops along a route and suggest improvements to the bus services
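The numbers feeding the line graph and the box plot could be derived as in the sketch below. The route, the stop names and the boarding records are hypothetical, and flagging stops below the first quartile is just one possible rule of thumb for spotting candidates for review, not a method we have settled on.

```python
from collections import Counter
from statistics import median, quantiles

# Hypothetical stop sequence for one bus service, in route order.
route = ["A", "B", "C", "D", "E"]

# Hypothetical boarding records: one entry per commuter boarding at a stop.
boardings = ["A", "A", "A", "B", "C", "C", "C", "C", "C", "E", "E"]

counts = Counter(boardings)

# Line graph data: (stop, number of people boarding), in route order.
series = [(stop, counts.get(stop, 0)) for stop in route]

# Box plot data: quartiles of the boarding counts across stops.
values = [n for _, n in series]
q1, q2, q3 = quantiles(values, n=4, method="inclusive")

# Assumed rule of thumb: stops below the first quartile deserve a closer look.
low_usage = [stop for stop, n in series if n < q1]

print(series)
print("median:", median(values), "quartiles:", (q1, q2, q3))
print("low-usage stops:", low_usage)
```

Stops with no boardings still appear in `series` with a count of 0, so gaps along the route stay visible on the line graph.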

References

  1. http://worksingapore.com/articles/live_4.php
  2. https://www.lta.gov.sg/content/dam/ltaweb/corp/PublicationsResearch/files/ReportNewsletter/LTMP2013Report.pdf
  3. https://www.quora.com/In-Singapore-it-takes-more-than-an-hour-to-reach-a-destination-via-the-public-transport-bus-train-but-just-quarter-of-the-time-if-I-were-to-take-the-taxi-Would-we-still-call-the-public-transport-successful
  4. http://web.mit.edu/11.521/papers/WorkingSocialDocument_Aug2012_v2.pdf
  5. http://www.enterpriseinnovation.net/article/singapores-transport-vision-analytics-new-interfaces-autonomous-vehicles-1298824564
  6. http://business.asiaone.com/career/news/3-factors-determine-if-singaporeans-leave-their-jobs
  7. http://community.jobscentral.com.sg/articles/your-daily-work-commute-ruining-your-life
  8. http://blog.moneysmart.sg/lifestyle/cheap-fast-and-painless-commuting-in-singapore-is-it-possible/
  9. http://lkyspp.nus.edu.sg/wp-content/uploads/2014/01/Transport-Planning-for-Singapore.pdf
  10. http://lkyspp.nus.edu.sg/wp-content/uploads/2013/04/Barter-Sg-urban-transport-sustainable-by-design-or-necessity.pdf
  11. https://www.ura.gov.sg/uol/master-plan/view-master-plan/master-plan-2014/Growth-Area
  12. http://www.smartnation.sg/initiatives/Mobility/spearheading-research-in-standards-for-sdvs
  13. https://www.ura.gov.sg/skyline/skyline09/skyline09-02/text/04.htm

Key Technical Challenges

Big Data Processing

Processes such as data cleaning need to be done to ensure the data is in an appropriate format for our analysis. Given the data, we might need to make estimations with appropriate assumptions to group the data, such as categorizing the commuter groups. Certain data points may be excluded from our analysis because they are out of scope, for example commuters travelling to and from bus stops in Johor Bahru, Malaysia.
As we have multiple datasets, we need to ensure that they are merged without errors.
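A cleaning pass of this kind might look like the following sketch. The out-of-scope stop codes, the timestamp format, the field names and the sample rows are all assumptions made for illustration. The function counts what it drops so that the exclusions can be checked afterwards.

```python
from datetime import datetime

# Hypothetical stop codes standing in for the out-of-scope Johor Bahru stops.
OUT_OF_SCOPE_STOPS = {"46211", "46219"}

# Assumed timestamp format; the real files may use a different one.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"

def clean_trips(rows):
    """Return (kept, dropped): parsed in-scope trips, and a tally of
    excluded rows by reason."""
    kept = []
    dropped = {"out_of_scope": 0, "bad_time": 0}
    for row in rows:
        if (row["boarding_stop"] in OUT_OF_SCOPE_STOPS
                or row["alighting_stop"] in OUT_OF_SCOPE_STOPS):
            dropped["out_of_scope"] += 1
            continue
        try:
            start = datetime.strptime(row["ride_start"], TIME_FORMAT)
            end = datetime.strptime(row["ride_end"], TIME_FORMAT)
        except ValueError:
            dropped["bad_time"] += 1
            continue
        if end < start:
            dropped["bad_time"] += 1
            continue
        minutes = (end - start).total_seconds() / 60
        kept.append({**row, "ride_start": start, "ride_end": end, "minutes": minutes})
    return kept, dropped

# Made-up sample rows.
rows = [
    {"commuter_id": "C001", "ride_start": "2016-10-03 07:40:00",
     "ride_end": "2016-10-03 08:10:00", "boarding_stop": "75009", "alighting_stop": "22009"},
    {"commuter_id": "C002", "ride_start": "2016-10-03 08:00:00",
     "ride_end": "2016-10-03 08:45:00", "boarding_stop": "46211", "alighting_stop": "22009"},
    {"commuter_id": "C003", "ride_start": "not recorded",
     "ride_end": "2016-10-03 09:00:00", "boarding_stop": "75009", "alighting_stop": "76059"},
]

kept, dropped = clean_trips(rows)
print(len(kept), "trips kept; dropped:", dropped)
```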

Unfamiliarity with visualization tools

As we have yet to gain in-depth, hands-on experience with some of the suggested visuals, we might need to research and practise creating them with our specific data. The software we use may limit the visuals we can generate. We may not end up with what we had in mind, or we may miss out on exploring other software that could have helped us achieve what we wanted.
D3.js is highly customizable, but requires much more manual coding to achieve the visualizations we want than commercial tools such as Tableau, whose off-the-shelf functions can be adjusted through the user interface.

Managing the trade-offs between static and interactive visualizations

Interactive and static visualizations each have their own strengths, and we need to understand the trade-offs of choosing one over the other. Interactive visualizations may engage the user more, but their effectiveness may depend on the user's knowledge and ability to manipulate the tools to gain insight. Static visualizations, on the other hand, may present a more comprehensive view at a glance, but may require more technical knowledge for the user to fully understand them.

Milestones

Project Timeline.JPG

Comments

Please feel free to comment on our proposal.