Difference between revisions of "IS428 2018 19T1 Group11 Proposal"
Latest revision as of 20:01, 14 October 2018
Despite Grab’s strong presence in the Southeast Asian rideshare market following its acquisition of Uber’s Southeast Asian operations in March 2018, the growing number of players across the fields in which Grab operates incentivises the company to adopt non-traditional methods to improve its business operations.
While meeting the bottom line is important, the company’s success also depends heavily on public perception and on Grab’s positioning in its markets. As such, this project aims to create a systematic method that Grab can use to understand public sentiment towards its products and its company image.
The NLP algorithm Grab uses to describe public sentiment follows the Latent Dirichlet Allocation (LDA) model, a generative statistical model in which sets of observations are explained by unobserved groups, or topics, that account for why some parts of the data are similar. With this model, Grab hopes to identify latent topics of interest and understand the public’s perception of each topic. Grab can then use this information to address inadequacies in its business practices, create more effective marketing campaigns and improve its business operations to build a stronger overall brand image.
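To make the "generative" part of LDA concrete, the sketch below simulates how the model assumes a single comment is produced: a topic mixture is drawn for the document, then each word is generated by first picking a topic and then picking a word from that topic. This is only an illustration under stated assumptions, not the model used in the project: the vocabulary, the three topics and their word probabilities are made up, and the topic-word distributions are fixed by hand rather than drawn from a Dirichlet prior as in the full model.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalising independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(topic_word_probs, vocab, alpha, n_words, rng):
    """Generate one document following the LDA generative story."""
    # Step 1: draw the document's topic mixture, theta ~ Dirichlet(alpha).
    theta = sample_dirichlet(alpha, rng)
    topics, words = [], []
    for _ in range(n_words):
        # Step 2: pick a topic for this word position according to theta.
        z = rng.choices(range(len(theta)), weights=theta, k=1)[0]
        # Step 3: pick a word from the chosen topic's word distribution.
        w = rng.choices(vocab, weights=topic_word_probs[z], k=1)[0]
        topics.append(z)
        words.append(w)
    return theta, topics, words


# Hypothetical toy vocabulary and topics (not taken from the project data).
VOCAB = ["driver", "late", "promo", "discount", "app", "crash"]
TOPIC_WORD_PROBS = [
    [0.45, 0.45, 0.02, 0.02, 0.03, 0.03],  # topic 0: ride experience
    [0.02, 0.02, 0.45, 0.45, 0.03, 0.03],  # topic 1: promotions
    [0.03, 0.03, 0.02, 0.02, 0.45, 0.45],  # topic 2: app stability
]

rng = random.Random(42)  # fixed seed so the sketch is repeatable
theta, topics, words = generate_document(
    TOPIC_WORD_PROBS, VOCAB, alpha=[0.5, 0.5, 0.5], n_words=8, rng=rng
)
print("topic mixture:", [round(p, 3) for p in theta])
print("words:", words)
```

Fitting LDA runs this story in reverse: given only the observed words, it estimates the topic mixtures and topic-word distributions that most plausibly produced them.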
The objective of the visualisation is to bridge the gap between the analytics and business teams. While the findings from the LDA model might be intuitive for the analytics team, business users may find them difficult to internalise. Hence, we aim to create a scalable way to present these findings to business users in an easy-to-understand format.
As mentioned before, we hope that by employing the LDA model, the business teams will be able to get a sense of how the community currently perceives Grab’s products. In the model, the user inputs the total number of topics that they believe strikes an ideal balance between granularity and actionability (i.e. too many topics may result in low actionability, while too few topics may not give enough cause for action). As the ideal number of topics is often subjective, our proposed visualisation will let users adjust the number of topics to what they believe is ideal. We will also include a covariance score plot that can guide users towards the ideal number of topics.
Data Source
The data was obtained by web scraping comments from various online platforms such as Instagram, Twitter, Reddit and Google Play.
The data set consists of 9000 scraped comments.
Data Attributes
The following is a snapshot of the data collected, and a description of the data attributes:
Data Attributes | Description of attributes |
---|---|
Document | A scraped comment may contain more than one sentence. Each comment is split into sentences, and each sentence is identified as a separate document. Hence, a document represents one sentence of a comment. |
Dominant_Topic | The topic that the document is most likely to be sorted into. |
Topic_Perc_Contrib | The probability that the comment will be found in the topic amongst all other comments with similar keywords. |
Keywords | Keywords that belong in each of the topics. |
Text | Words in each comment after the removal of stop words (e.g. the, is, to, on). |
Original_Comment | Original comment. |
Comment_Date | Date that the comment was posted. |
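The sketch below shows one way a row with the attributes above could be assembled from a sentence and its per-topic probabilities. It is a minimal illustration under assumptions, not the project's pipeline: the stop-word list, the helper names (`remove_stop_words`, `build_record`), the sample sentence, probabilities, keywords and date are all hypothetical, and `Topic_Perc_Contrib` is taken here to be the probability of the dominant topic for the document.

```python
from dataclasses import dataclass

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"the", "is", "to", "on", "a", "and", "was", "my"}


@dataclass
class CommentRecord:
    document: int              # Document: id of one sentence of a comment
    dominant_topic: int        # Dominant_Topic
    topic_perc_contrib: float  # Topic_Perc_Contrib (assumed: dominant topic's probability)
    keywords: str              # Keywords of the dominant topic
    text: list                 # Text: tokens left after stop-word removal
    original_comment: str      # Original_Comment
    comment_date: str          # Comment_Date


def remove_stop_words(sentence):
    """Lower-case, strip basic punctuation and drop stop words."""
    tokens = [t.strip(".,!?").lower() for t in sentence.split()]
    return [t for t in tokens if t and t not in STOP_WORDS]


def build_record(doc_id, sentence, topic_probs, topic_keywords, comment_date):
    """Pick the most probable topic and assemble one row of the data set."""
    dominant = max(range(len(topic_probs)), key=lambda k: topic_probs[k])
    return CommentRecord(
        document=doc_id,
        dominant_topic=dominant,
        topic_perc_contrib=round(topic_probs[dominant], 4),
        keywords=", ".join(topic_keywords[dominant]),
        text=remove_stop_words(sentence),
        original_comment=sentence,
        comment_date=comment_date,
    )


# Made-up inputs: in practice topic_probs would come from the fitted LDA model.
record = build_record(
    doc_id=0,
    sentence="The driver was late to my pickup.",
    topic_probs=[0.7, 0.2, 0.1],
    topic_keywords=[["driver", "late"], ["promo", "discount"], ["app", "crash"]],
    comment_date="2018-10-01",
)
print(record)
```

Keeping both the cleaned `text` and the `original_comment` in the same record lets a visualisation group sentences by topic while still showing business users the comment as it was written.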
Related Works | What We Can Learn |
---|---|
Tweet Sentiment Visualisation Source: https://www.csc2.ncsu.edu/faculty/healey/tweet_viz/tweet_app/ |
|
Tweet topic cluster visualisation
Source: https://www.csc2.ncsu.edu/faculty/healey/tweet_viz/tweet_app/ |
|
Tweet sentiment across time
Source: https://www.csc2.ncsu.edu/faculty/healey/tweet_viz/tweet_app/ |
|
Sketches | Description of Approach |
---|---|
Visualisation 1: User modelling interface |
Idea:
* Suggested cluster topic based on the keyword that appears most frequently |
Visualisation 2: Topic Cluster Visualisation Option 1
Inspired by: https://www.csc2.ncsu.edu/faculty/healey/tweet_viz/ |
Idea:
*subject to data availability |
Visualisation 2: Topic Cluster Visualisation Option 2 |
Idea:
|
Visualisation 2: Topic Cluster Visualisation Option 3 Inspired by: http://lcs.ios.ac.cn/~shil/paper/VISA_VINCI.pdf |
Idea:
*subject to data availability |
Key challenges | Proposed solution to overcome challenges |
---|---|
Data cleaning and ensuring good data quality |
|
Lack of experience using relevant tools for analysis such as RShiny |
|
Determining the most effective way to visualise the data |
|
Technology and Tools | Explanation |
---|---|
Shiny | We will primarily use Shiny for our visualisation. Shiny is an open-source R package that provides an elegant and powerful web framework for building web applications in R. |
RStudio | We will build the machine learning model and the visualisation in RStudio. |
Photoshop | We will use Photoshop to design our poster. |
https://www.csc2.ncsu.edu/faculty/healey/tweet_viz/
https://rstudio-pubs-static.s3.amazonaws.com/236186_d311ea00291d42509864aa0a77d340e8.html
http://lcs.ios.ac.cn/~shil/paper/VISA_VINCI.pdf
These pages inspired several of our alternatives for visualising topic sentiments.
Feel free to leave comments, suggestions and feedback to help us improve our project! :D