Difference between revisions of "Project Groups"

From Visual Analytics and Applications
Line 494: Line 494:
  
 
||
 
||
'''A Visual Application for Better Business Decision Making: Combining Association Rule Mining with Network Analysis'''
+
<b>VRshiny</b>: An Application for <b>V</b>isualizing Association <b>R</b>ules with Network Diagram in <b>Shiny</b>
<br>Association rules mining (ARM), also known as Market Basket Analysis (MBA), was developed based on the simple yet powerful concept of statistics—probabilities. It is being widely used to find out the combinations of items that people are more likely to purchase together. ARM helps us to identify groups of variables that are highly correlated, to observe the behaviors that tend to occur together and to analyze the combination of variables are likely to result in the occurrence of another event. In our application, we aim to extend the application of ARM to different business domains, on different datasets, for the user to have a better understanding of his/her current business situation, before making decisions for the future. A network analysis is included in the application to better visualize the interactions among the occurrences of different events.
 
  
Besides the conventional Market Basket Analysis, our application can be used to analyze different business scenarios similar to below examples:
+
<br>Association rules mining (ARM) was developed based on the fundamental yet powerful concept of statistics—probabilities. It is being widely used to find out the combinations of items that people are more likely to purchase together. We deployed various applications of ARM in VRshiny, allowing the users – especially small to medium sized enterprises – to investigate their current business status before making decisions for the future. A network diagram is incorporated in VRshiny to better visualize the association rules.
  
* Pharmacosurveillance: which drugs with adverse-effects tend to be purchased together?
+
<br>VRshiny was made interactive with R Shiny framework for the users to calibrate the model and explore the model statistics before choosing their rules of interest for analysis. The application allows users to upload different datasets and perform association rules mining beyond the conventional Market Basket Analysis. We intend to fulfill the preliminary analytics needs of business owners from all industries with our handy visual analytics application, before they invest into more expensive and complex analytics capabilities.
 
+
 
* Sentiment analysis: which combination of key words in the reviews tend to lead to higher/lower review rating?
 
 
 
* HR analytics: which combination of signs indicates higher probability of employee leaving?
 
 
 
* E-Commerce: which links or pages the customer tend to click/browse before making a purchase?
 
 
 
The application was made interactive with R Shiny and Shiny Dashboard for the users to calibrate the model and explore the model statistics before choosing his/her interested associations for analysis. We hope to fulfill the analytics needs of the small to medium business owners with our handy exploratory visual analytics application, before they invest into more expensive and complex analytics capabilities.
 
 
||
 
||
  

Revision as of 23:17, 31 July 2017

ISSS608 Visual Analytics and Applications

 


Project Groups

Please change Your Team name to your project topic and change student name to your own name

Project Team Project Title/Description Project Artifacts Project Member

Group 1

Understanding traffic patterns using network graph visualisations

The project aims to illustrate the power of visual analytics in highlighting patterns exhibited by vehicles as they traverse various traffic corridors. By linking the information captured in RFID tags when vehicles move through checkpoints, an interactive application is designed. This will help to uncover insights such as frequently travelled corridors, preferred routes amongst vehicles, traffic density, etc. The application will be developed primarily in R, specifically with the versatile ggraph package, which helps to build powerful network visualisations. ggraph has been chosen because it is a recent release (Feb 2017) that brings network visualisation into R's ggplot architecture. Though ggplot-based tools for visualising network patterns existed previously, this package produces neater visualisations that are easier for the user to understand. The motivation for this project stems from traffic accumulation, a key problem in most cities. Though the dataset pertains to a set of vehicles travelling through a wildlife preserve, the approach can be applied to the planning of roads and associated establishments. Urban planning needs robust planning of vehicle corridors to minimise disruptions in flow and improve productivity. The interactive application helps the user understand linkages between various points in a predefined vicinity. The timestamp information of vehicle passages present in the data helps to understand parameters such as traffic density, preferred corridors for vehicles and their speeds. The application is also intended to help traffic authorities identify alternative corridors for travelling from point A to point B. In addition, network measures such as the betweenness, connectivity and closeness of various nodes are provided.
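The network measures mentioned above can be illustrated with a minimal sketch. The project itself uses R and ggraph; the Python below is only an illustration, and the checkpoint names and the tiny graph are hypothetical, not taken from the project's data. It computes closeness by breadth-first search and unnormalised betweenness with Brandes' algorithm on an unweighted, undirected graph.

```python
from collections import deque

def bfs_distances(graph, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def closeness(graph, node):
    """(reachable nodes - 1) / sum of hop distances; 0.0 for an isolated node."""
    dist = bfs_distances(graph, node)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

def betweenness(graph):
    """Brandes' algorithm, unnormalised, for an unweighted undirected graph."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        stack = []
        preds = {v: [] for v in graph}   # predecessors on shortest paths from s
        sigma = {v: 0 for v in graph}    # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        while stack:                      # accumulate dependencies, farthest first
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair was counted from both endpoints.
    return {v: b / 2 for v, b in score.items()}

# Hypothetical checkpoints: three locations that all connect through one junction.
corridors = {
    "junction": ["gate", "camp", "exit"],
    "gate": ["junction"],
    "camp": ["junction"],
    "exit": ["junction"],
}
between = betweenness(corridors)
close_junction = closeness(corridors, "junction")
close_gate = closeness(corridors, "gate")
```

In this toy graph every route between two outer checkpoints passes through the junction, so the junction gets the highest betweenness and closeness, which is the kind of node a traffic authority would want to spot.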



  • Kishan Bharadwaj Shridhar
  • Ong Guan Jie Jason
  • Zhang Yanrong

Group 2 - Team ADA

Analysing Rise in Temperatures and Its Causes Globally Through Interactive Visualizations

Climate change and global warming are material, contemporary issues that are gaining attention from countries all over the world. Global citizens of all ages and economic backgrounds face the unwanted effects of climate change today. As the buzz around global warming continues to increase, the issue has prompted many relevant visualisations. Through this project, we analyse the key contributors to climate change, namely fossil fuel consumption, adoption/rejection of renewable energy, electricity consumption, deforestation rate, and greenhouse gas emissions. The residual impact of these factors on the rise of global temperature has been captured for 86 countries around the world.

While most contemporary visualisations focus on individual environmental hazards such as increased rates of carbon emissions or the rapid rise in temperature, our analysis attempts to connect the dots to better understand the cause-and-effect nature of global warming. Through our visualisations, we depict the causal effect between the factors which contribute to greenhouse gas emissions and the resulting impact on increase in temperature from the year 1990 to 2012. Furthermore, we attempt to forecast the aforementioned causal effects and the net rise in temperature for ten subsequent years to better understand the variation in each factor over time.

  • Akangsha Bandalkul
  • Angad Srivastava
  • Dipti Kalyandurgmath

Group 3: Team S-MALL

Smart Mall Application: Visual Application Design for Brick-and-Mortar Retailers

With the growing popularity of e-commerce and online shopping, traditional brick-and-mortar retail malls face a stiff challenge to reinvent themselves with a unique proposition. As part of the Smart Nation drive and transformation, retail malls can and need to leverage this digital transformation journey. In this connected era, shoppers actively leave digital footprints, and hence they can be tracked just like online customers. Retail malls are able to gather this data from Wi-Fi. Along with the typical transaction data gathered from daily operations and customer profile data obtained from loyalty programs, how can we bridge the information and reproduce the shopper's journey? We also notice that the industry is keen to know the product mix in order to up-sell or cross-sell. So how can visualization techniques assist in analyzing data, deriving insights and forming business strategies?

This project attempts to develop an interactive application to visualize and answer:

  • What does traffic look like in the shopping mall? When is the peak hour, and where is the busiest area?
  • Who are the shoppers in the shopping mall?
  • How to analyze product mix?

  • CHEN Yun-Chen
  • CHIAM Zhan Peng
  • ZHENG Bijun

Why did the Migrant cross the border?
Exploring Migration Flows: An Analytical Dashboard

As a phenomenon, migration is not new, and people have been moving from place to place since the dawn of time. Migration can be either permanent or temporary. The causes of migration are varied, ranging from social and political to economic reasons. Theories of migration tend to focus on push-pull factors of source and destination countries. However, any migration flow is a product of both these factors and the immigration policies that regulate cross-border flows. Given all these factors, it is difficult to study the determinants of migration. Currently, attempts at visualising migration flows are more descriptive than analytic, focusing on the flows (particularly on the type) rather than the reasons for the flows. Examples range from general migration data to refugees, and even human trafficking flows. There have also been attempts to visualize migration using proxies, such as tax return data and remittance data. We developed an analytical framework to explore determinants of migration, i.e. how the attributes of source and destination countries are related to the in- and out-migration rates of these countries. Using R/Shiny and other packages, we attempted to integrate bilateral migration flow data with data describing the characteristics of both source and destination countries, drawing data from the World Bank, the Polity IV dataset, and measures from Hofstede’s Cultural Dimensions Theory. This culminated in the design of an analytical dashboard that allows users to perform exploratory data analysis to aid policy and academic research on migration.

  • CHEN Xiaoqing
  • Vincent Mack Zhi Wei
  • David Ten Kao Yuan

Group 7 - Trench-coat Detectives

Geo-spatial analysis of vehicle movement data to uncover patterns and detect anomalies

Geo-spatial analysis is a subject of growing interest for numerous reasons, but a clear driver is data availability and accessibility. The rise of sensors and the IoT era has made data capture by independent organizations and bodies feasible, which improves access to movement data that was earlier limited to government bodies and leading researchers.

Our study utilizes data from a Natural Preserve to explore vehicle movement patterns. The scope includes the analysis and visual representation of frequency related findings such as ‘peak and non-peak periods’ and route related findings such as ‘path navigation through the preserve’ with the aid of interactive dashboards.

The primary framework used for analysis and visualization is R Shiny. Several R packages enable us to create a seamless interactive interface for connecting and exploring the data. We have incorporated calendar heat maps to understand the peak and non-peak cycles across months, days of the week and hours of the day. This has enabled us to plot all the movement data over time in a single view. The patterns and trends identified can be used to drill down for further exploration. The routes taken by the vehicles were explored using sunburst diagrams, which give a summary of the paths taken and show the more popular paths. Common destinations and starting points can be easily identified and compared.
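The aggregation behind a calendar heat map of this kind can be sketched as follows. This is only an illustration in Python, not the team's R Shiny code; the timestamp format and the sample readings are assumptions made for the example. Each vehicle passage is bucketed by day of the week and hour of the day, and the resulting counts are what a heat map cell would colour.

```python
from collections import Counter
from datetime import datetime

def passage_counts(timestamps):
    """Count passages per (weekday, hour); weekday 0 is Monday, 6 is Sunday."""
    counts = Counter()
    for stamp in timestamps:
        # Assumed format, e.g. "2015-05-01 08:15:00"
        moment = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        counts[(moment.weekday(), moment.hour)] += 1
    return counts

# Hypothetical sensor readings: two on a Friday morning, one on a Saturday evening.
readings = [
    "2015-05-01 08:15:00",
    "2015-05-01 08:45:00",
    "2015-05-02 17:30:00",
]
heat = passage_counts(readings)
```

Cells with no readings are simply absent from the counter and read back as zero, which keeps the sketch short; a plotting layer would fill the full 7 by 24 grid.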

The interactive dashboards have been developed to accommodate other sources of movement-related data, so that they remain reusable and can be extended to other purposes where suitable. Future work can include incorporating speed-related elements in the analysis.

  • Anuthama Murugesan
  • Krutika Balveer Choudhary
  • Sumalika Kodumuru

Although the Zika virus was identified as early as the 1940s, it was not widely reported until it spread rapidly in South America in 2015. In Singapore, the first case was found in August 2016. Within two months, more than 400 cases had been identified locally. In this project, we will examine the spread pattern of the Zika virus and whether there is any relationship between weather or geolocation and the spread of the virus.

  • Ye JiaTao
  • Yang YuWei
  • Chen YiFan

Group 6

World Happiness Visualization

In philosophy, happiness refers to the Greek concept of eudaimonia, and reflects the good life, or flourishing, rather than simply an emotion. In addition, numerous factors can affect mental happiness, which in turn largely influences people's overall well-being, especially in today's high-stress world. Careful investigation and analysis is therefore meaningful.

According to the United Nations' 2017 World Happiness Report, Singapore is the happiest country in Asia. To add value to the data collected in the World Happiness Report, which only provides plain text, R is used together with R Shiny to interactively visualize the happiness level of 150+ countries and happiness-related factors, including economy, health, family, freedom, trust and generosity. Moreover, this data visualization can show details and insights of the selected dataset interactively at different levels: global, regional and individual country.

The application built for this project could enable different stakeholders to gain more insight into global happiness levels, and may help national governments better understand the happiness level of their citizens in order to make their countries happier places to live in.

  • HE Lingfei
  • MAO Chenxin
  • WANG Yingbei

Performance Decomposition of China Listed Firms

Since initiating market reforms in 1978, China has shifted from a centrally-planned to a market-based economy and has experienced rapid economic and social development. With a population of 1.3 billion, China is the second largest economy and is playing an increasingly important and influential role in development and in the global economy. China has been the largest contributor to world growth since the global financial crisis of 2008. An increasing number of foreign investors are looking for opportunities to invest in China. Therefore, understanding the performance of China listed firms has become essential. This project identifies the need for an interactive dashboard to display the performance of China listed firms, which will be very helpful in understanding how these companies in different industries have developed over the past 12 years.

To give a comprehensive view of the performance of China listed firms, our application covers four parts. The first part is an interactive tree map, which shows 4 layers from industry to province to city. The tree map provides a top-down overview of the distribution of listed firms in terms of total assets and profits. The second part is a geo_facet line graph, which is linked to the tree map by click action. The geo_facet map compares the yearly revenue performance of each province by industry from 2003 to 2015. The third part is a scatter plot, which is also linked to the tree map by click action. The scatter plot shows how each stock performs in terms of earnings and cash flow for the selected province, which provides a more granular view of the stock’s performance. The fourth part is a spark table, an information dashboard design that displays all the important indicators of each stock so that investors can examine the data in more detail.

  • Wei Yunna
  • Chen Yinjue
  • Xu Yue

AnalyTweets

Visualising Twitter: Hashtags and User Mentions Network

Twitter is described as the SMS of the internet. When key events occur, the buzz on an information network such as Twitter tells you what is happening right now. Our aim is to visually analyze the association of #tags, each representing a key idea, with @mentions.

  • Kuar Kah Ling
  • Meenakshi Gopalakrishnan
  • Parikshit Mayee

Group 10 VisualizeR

CROWD FUNDING – Visualizing and Analyzing with R

Crowdfunding is the practice of using small amounts of capital from a relatively large number of individuals to fund a project or venture typically through the Internet. Crowdfunding makes use of the easy accessibility of vast networks of friends, family and colleagues through social media websites like Facebook, Twitter and LinkedIn to get the word out about a new business or campaign and attract investors. Mobile Apps are a popular growing medium along with the above mentioned social media websites for helping campaigns and projects to publicize and seek funding for their work. Campaigns can range anywhere from technology, business, nonprofit, political, charity, commercial, or financing for a startup. With the rise of such online platforms allowing people to easily create campaigns, crowdfunding has emerged as an area that is ripe for research.

As an area of analysis, crowdfunding has largely featured literature that focuses on predicting the success or failure of campaigns. However, as a field of visualization, the data has been left relatively untapped; most visualizations that exist simply show the accuracy of these prediction algorithms.

Through this project and the application of R and its tools, we provide a platform to explore the datasets gathered by crowdfunding apps and to understand and visualize patterns between viewers and investors. The application supports exploratory data analysis (via choropleths, heatmaps and calendar maps), for example by showing the age group that contributes most or the states that contribute heavily to crowdfunding projects. The application helps us find specific segments of users who show interest in a specific category of project (Health/Environmental/Technological/Sports/Politics, etc.) that the app launches or publishes. It helps reveal user behavior through sunburst charts for various regions/states and helps us find the regions that engage in cautious investing or impulsive funding. The clustering methods demonstrated in CFVAR (k-means with parallel coordinates visualization) help us segment the users in ways that matter to individual users or corporations for their ongoing as well as upcoming projects. Both researchers of crowdfunding and people interested in starting their own campaigns can benefit from such tools, as they can use these visualizations to make better sense of the data. Because this is an emerging domain, the visualizations explored here are just the beginning of what can be an ever-increasing area of research and analysis for this growing field.

  • Eric Prabowo C
  • Asmit Adgaonkar
  • Shuo Zhang

IMDb

The project aims to enable the user to explore various aspects of IMDb data set through interactive visualization tools and techniques.

  • Abhinav Ghildiyal
  • Agrim Gairola
  • Vishal Bansal

R Application to Project Singapore’s Historical Rainfall Data


Singapore has made great efforts to make government agency data easily accessible to everyone, and this project aims to make use of such government data by transforming it into an easily consumable form for the general audience. The group is tapping into the historical rainfall data from the website of the National Environment Agency of Singapore and using various libraries available in R to show, through geospatial analysis, how rainfall data collected from weather stations across Singapore can be used to project the rain experienced over a specific time period. By building an application, the team hopes to provide an example of how open source resources can augment the vast amount of available government data to share meaningful insights with both public and private organizations.

  • Arunkumar Chavarukulangara Rajan
  • Josef Carlo Exconde
  • Sandeep Challa

Video Game Analytics

Vgchartz.com is a video game sales tracking website that provides weekly sales figures of console software and hardware by region. The site was launched in June 2005 and is run by a small team of ten. Presently, users find its visualizations time-consuming and difficult to comprehend and navigate.


  • Zhang Jinchuan
  • Li Nanxun
  • Chris Thng
  • Ling Jingting

A View at HDB Car Parking System

The HDB car parking system in Singapore is a very mature one. This project aims to visualize this car parking system. The various types of car parking are understood and visualized. These car parking systems are broken down area-wise and block-wise and visualized using various tools and visualizations such as hierarchical tree and sunburst diagrams.

  • Arcchit Mittal
  • Mukund Krishna Ravi
  • Shishir Pravin Nehete

Group 15

Characterising Pandemic Spread Using R

A pandemic is an epidemic or outbreak of infectious disease that spreads rapidly not only to many people, but across countries. The unprecedented mobility of people and food over the last 30 years has seen a steady increase in the frequency and diversity of disease outbreaks. No country is immune to this growing global threat. Scientists are predicting that it is not a matter of if, but when the next pandemic will happen. Singapore, as a small city state with the highest population density in the world and some of the highest air passenger traffic, is particularly vulnerable.

There are reasons to remain optimistic, as Singapore’s SMART Nation initiatives and the electronic records of modern healthcare systems have opened up new possibilities in the fight against potential infectious disease outbreaks in the country. Data will be increasingly ubiquitous as the world, including Singapore, continues to make significant advances in the age of digitalisation. Insights from the data have the potential to offer a critical line of preparedness through early identification, rapid effective response, and containment of disease outbreaks.

Using R programming to analyse a synthetic dataset (i.e. computer- and human-generated data) relating to a major disease outbreak that spanned several cities across the world in 2009, we have developed a visualisation tool and deployed it as an interactive dashboard prototype via R Shiny. This visualisation tool can potentially be used by health officials to analyse the hospitalisation data and characterise the spread of the pandemic across countries should an actual disease outbreak happen. We have demonstrated the capabilities of this visualisation tool through the use of calendar heatmap, trellis plot and other new visualisation graphing methods. The efficacy of each of these visual analytics techniques will be discussed in detail. We will also suggest possibilities for future works by combining hospital records with other data sources.

[VAST Challenge 2010 - Characterisation of Pandemic Spread]

  • Chua Gim Hong
  • Huang Liwei
  • Ngo Siew Hui

The Indian Story

This project aims to explore the census dataset published by the Government of India on the literacy rate at state and city levels. This data is grouped by gender, age group and education level of the Indian population. Our team, through the aid of visual geo-plotting, intends to find interesting insights on literacy. The end objective is to create a dashboard in R that will give the user the ability to interactively visualise the data and find the areas and groups of the Indian population that need to be targeted to improve literacy rates across the country.


  • Mandi Luo
  • Priyadarshini Majumdar
  • Sandhya Vasudeva Rao

Airlines Network in India

For a developing country like India, transportation infrastructure is one of the most important indicators of economic growth. It forms the backbone of the tourism industry, supports the movement of goods and people across the country and hence drives the country’s economic growth. Roadways, railways and airways are the major means of transportation in India; however, the contribution of airways is small compared to that of the other two. With the coming of the Narendra Modi government, the civil aviation sector is growing at a rapid rate, and with the shift in government policy several low-cost private air service providers are expected to enter the region. These players often offer competitive pricing, thereby driving air traffic. Understanding these transportation systems is important for many reasons of policy, administration and efficiency.

Our main aim is to investigate the airport network infrastructure of India (ANI) to explore its various properties and its traffic dynamics in terms of the different airlines operating. The study could help us understand how low-cost airlines could share their networks to connect remote parts of India. We propose to build a visual network exploration tool using R. This tool can be used to explore not only the airport network in India but also any other airport network. We plan to use some of the novel methods provided by R to integrate geographic data with network data.


  • Debasish Behera
  • Manish Mittal
  • Roger Ganga Sundaraja


Stock evaluation and portfolio decision

Stock is one of the common investment methods for HNWI (High Net Worth Investors). Traditionally, the visualization of different stocks is just line charts showing the price or the trend of the stock. However, with the development of visualization technologies, investors may now ask more of stock graphs. This project aims to work out a visualization solution for stock investors by providing more analytical and interactive functions. There are three main modules in our visualization application. The first module covers the fundamental analysis of a stock, starting from a treemap analysis of the industries, followed by an interactive stock candlestick chart. Investors can also compare two stocks in terms of some key ratios and view a stock forecast graph. The second module covers the analysis of the correlation of stocks, and the third covers the portfolio decision. With these three modules together, investors may find the application helpful for their stock analysis.

  • Jiaqi Zhang
  • Xintian Liu
  • Hongjun Qian
Group 8

VRshiny: An Application for Visualizing Association Rules with Network Diagram in Shiny


Association rules mining (ARM) was developed based on the fundamental yet powerful concept of statistics—probabilities. It is being widely used to find out the combinations of items that people are more likely to purchase together. We deployed various applications of ARM in VRshiny, allowing the users – especially small to medium sized enterprises – to investigate their current business status before making decisions for the future. A network diagram is incorporated in VRshiny to better visualize the association rules.


VRshiny was made interactive with R Shiny framework for the users to calibrate the model and explore the model statistics before choosing their rules of interest for analysis. The application allows users to upload different datasets and perform association rules mining beyond the conventional Market Basket Analysis. We intend to fulfill the preliminary analytics needs of business owners from all industries with our handy visual analytics application, before they invest into more expensive and complex analytics capabilities.
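The model statistics that users typically explore for an association rule are support, confidence and lift. The sketch below is a minimal illustration in Python under that assumption; it is not VRshiny's code (which is built in R Shiny), and the function name and the toy baskets are hypothetical. For a rule A -> B, support is the share of transactions containing both A and B, confidence is that count divided by the number of transactions containing A, and lift is confidence divided by the share of transactions containing B.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(transactions)
    a, c = set(antecedent), set(consequent)
    count_a = sum(1 for t in transactions if a <= t)         # baskets containing A
    count_c = sum(1 for t in transactions if c <= t)         # baskets containing B
    count_ac = sum(1 for t in transactions if (a | c) <= t)  # baskets containing both
    support = count_ac / n
    confidence = count_ac / count_a if count_a else 0.0
    lift = confidence / (count_c / n) if count_c else 0.0
    return support, confidence, lift

# Hypothetical toy baskets, each a set of purchased items.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
support, confidence, lift = rule_metrics(baskets, {"bread"}, {"milk"})
```

Here the rule bread -> milk holds in 2 of 4 baskets (support 0.5) and in 2 of the 3 baskets with bread (confidence 2/3); because milk already appears in 3 of 4 baskets, the lift is 8/9, slightly below 1, so the rule is no better than chance. A network diagram of rules would usually draw only rules whose lift clears a user-chosen threshold.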

  • Bo Cao
  • Yuhui Zhou
  • Yifei Guan

Project Groups

Please change Your Team name to your project topic and change student name to your own name


Team | Members
Investigation of vehicle traffic corridors using visual analytics | Kishan Bharadwaj Shridhar, Ong Guan Jie Jason, Zhang Yanrong
Global Warming: A Tale of Rising Temperatures | Angad Srivastava, Akangsha Bandalkul, Dipti Kalyandurgmath
S-MALL | Chen Yun-Chen, Chiam Zhan Peng, Zheng Bijun, Ghost Lin
Group 4 | Akanksha Mittal, Sivagamy Balamourougane, Sanghavy Balamourougane
Group 5 | Vincent Mack Zhi Wei, Chen Xiaoqing, David Ten Kao Yuan
Group 6 | HE Lingfei, MAO Chenxin, WANG Yingbei
Trenchcoat Detectives | Anuthama Murugesan, Krutika Balveer Choudhary, Sumalika Kodumuru
theArules | Cao Bo, Guan Yifei, Zhou Yuhui
The Indian Story | Luo Mandi, Sandhya Vasudeva Rao, Priyadarshini Majumdar
Group 10 visualizeR | Eric Prabowo, Asmit Adgaonkar, Shuo Zhang
Group11 TripleY | Wei Yunna, Chen Yin Jue, Xu Yue
Group 12 | Arunkumar Chavarukulangara Rajan, Josef Carlo Exconde, Sandeep Chala
Group 15: Characterising Pandemic Spread Using R | Chua Gim Hong, Huang Liwei, Ngo Siew Hui
Group 16 | Jiaqi Zhang, Xintian Liu, Hongjun Qian
Group 18 - Intelligent Airlines Network | Debasish Behera, Manish Mittal, Roger Ganga Sundaraj
Group 13 | Rishi Tandon