Difference between revisions of "Project Groups"

From Visual Analytics and Applications
 
(5 intermediate revisions by 3 users not shown)
Line 64: Line 64:
 
<center>Group 2</center>
 
<center>Group 2</center>
 
||
 
||
'''Combatting Greenhouse Gas Emissions through Exploratory and Panel Data Analysis'''
+
'''DEGGED - Dynamic Exploration of Greenhouse Gas Emissions and its Determinants using R and Shiny'''
  
Global warming is expected to result in a rise of the average global temperature between 1.1 to 6.4 degree celsius over the century, if there are no interventions taken to reduce emissions of greenhouse gases. With this impending global situation, the European Union (EU) leaders committed to an ambitious goal of reducing greenhouse gases by 55% by 2030 to tackle climate change. The availability of a broad range of climate change related statistics on Eurostat allowed our group to investigate the impacts of the drivers and mitigation measures on the greenhouse gas emissions for the EU countries. Exploratory analysis will be used to understand the current situation, and Panel data analysis will be performed to glean insights on the determinants of the greenhouse gas emissions, allowing us to monitor EU's progress towards achieving its 2030 goal.
+
In December 2020, European Union (EU) leaders committed to an ambitious goal of reducing greenhouse gas emission levels by 55% by 2030 to tackle climate change. As the EU is the world's third largest emitter of greenhouse gases, it is important to determine the factors that contribute significantly to its emissions. Most existing literature focuses largely on the relationship between drivers and greenhouse gas emission levels, without considering mitigation factors, which also play a role in reducing emissions. Earlier research also often presented findings in static form, limiting the amount of data exploitation that can be performed. Hence, our research aims to study the relationship of both drivers and mitigation measures with greenhouse gas emission levels using Ordinary Least Squares regression and Panel Data regression. We designed and developed DEGGED, an interactive web-based dashboard, to allow policymakers and environmentalists to explore and analyse determinants of greenhouse gas emissions. DEGGED is intended to be intuitive, letting non-technical users perform fundamental data analysis and regression modelling without any coding.
 
||
 
||
[https://greenhouseemission.netlify.app/posts/2021-02-25-proposal/ Project Blog Link]
+
[https://greenhouseemission.netlify.app/ Project Blog Link]
 
||
 
||
 
* [https://selenechoong.netlify.app/ Choong Shi Lian Selene]
 
* [https://selenechoong.netlify.app/ Choong Shi Lian Selene]
Line 133: Line 133:
  
 
||
 
||
[https://ourshinypet.netlify.app/ Project Blog Link]
+
[https://ourshinypet.netlify.app/ Project Blog] <br>
 +
[https://kgalbindo.shinyapps.io/shinyPET/ Shiny Application] <br>
 +
[https://ourshinypet.netlify.app/files/ShinyPET_paper.pdf/ Research paper] <br>
 +
[https://ourshinypet.netlify.app/files/ShinyPET_poster.pdf/ Poster] <br>
 +
[https://github.com/suyiinang/ourshinyPET/tree/main/deliverables/ Github link]
 +
 
 
||
 
||
 
* [https://suyiinang.netlify.app/ Ang Su Yiin]
 
* [https://suyiinang.netlify.app/ Ang Su Yiin]
Line 149: Line 154:
 
In this project, we analyse and identify patterns in the Global Innovation Score, comparing the pre- and post-Covid-19 pandemic periods across countries based on the Global Innovation Index 2020, and specifically seek to better understand the impact in Singapore.
 
In this project, we analyse and identify patterns in the Global Innovation Score, comparing the pre- and post-Covid-19 pandemic periods across countries based on the Global Innovation Index 2020, and specifically seek to better understand the impact in Singapore.
  
Our approach includes developing a R-Shiny application for an interactive '''(1) Exploratory Data Analysis which includes a Choropleth Map, Bubble Plot, Visualising Uncertainty, and Time-series Analysis''', and '''(2) Statistical Analysis which includes Correlation Analysis, Multiple Linear Regression Model, and Hierarchical Clustering'''.  
+
Our approach includes developing an R Shiny application for an interactive '''(1) Exploratory Data Analysis which includes a Choropleth Map, Bubble Plot, Radar Chart, and Slope Chart''', and '''(2) Statistical Analysis which includes Correlation Analysis, Statistical Plots, and Hierarchical Clustering'''.
 
||
 
||
[https://innovation-amidst-covid-19.netlify.app/proposal.html/ Project Blog Link]
+
[https://innovation-amidst-covid-19.netlify.app/ Project Blog]<br>
 +
[https://github.com/ctteo2019/innovation-amidst-covid-19/ Project GitHub]<br>
 +
[https://lanceteo89.shinyapps.io/INNOVAC/ Shiny Application]<br>
 +
[https://github.com/ctteo2019/INNOVAC/ Shiny GitHub]<br>
 +
[https://innovation-amidst-covid-19.netlify.app/posts/2021-04-25-application-user-guide/ Application User Guide]<br>
 +
[https://github.com/ctteo2019/innovation-amidst-covid-19/blob/8c4387a26ce3e67f4447899ec2d16415ee098c0d/Project%20Poster/Project-Poster.pdf/ Project Poster]<br>
 +
[https://github.com/ctteo2019/innovation-amidst-covid-19/blob/8c4387a26ce3e67f4447899ec2d16415ee098c0d/Practice%20Research%20Paper/Practice-Research-Paper.pdf/ Practice Research Paper]
 
||
 
||
 
* [https://elaine-lee.netlify.app/ Elaine Lee]
 
* [https://elaine-lee.netlify.app/ Elaine Lee]
Line 281: Line 292:
 
'''Data Visualization Survey Analysis'''
 
'''Data Visualization Survey Analysis'''
  
Data visualization is the graphical representation of information and data. It has been an important factor in data analytics pipeline, to reveal insights that are often difficult to be delivered in other forms. It is commonly used in various scenarios, such as data cleaning, exploring data structure, detecting pattern, identifying trends and clusters. It helps operations and management make informative decisions. Understanding the current state of data visualization is crucial. It gives organizations and practitioners in the field a better idea of where data visualization stands today, and where it’s headed.  
+
Data visualization is the graphical representation of information and data. It has been an important part of the data analytics pipeline, revealing insights that are often difficult to deliver in other forms. It is commonly used in various scenarios, such as cleaning data, exploring data structure, detecting patterns, and identifying trends and clusters. It provides organizations and practitioners with a handy tool to analyse data and enables them to make informed decisions based on the insights gained. Understanding the current state of data visualization is crucial. It gives organizations and practitioners in the field a better idea of where data visualization stands today, and where it’s headed. It also helps people who have an interest in data visualization know how to enter the field.
  
In this research study, we want to build a R Shiny application to illustrate the current state of data visualization. The goal is to draw a comprehensive picture of data visualization for organizations, practitioners and people having an interest in data visualization, by analyzing Annual Data visualization Community Survey.  
+
In this research study, we will build an R Shiny application to illustrate the current trends in data visualization. The goal is to draw a comprehensive picture of data visualization for organizations, practitioners and people with an interest in data visualization, by analysing the annual Data Visualization Community Survey. The analysis and visualization consist of three parts: interactive '''exploratory data analysis''', '''cluster analysis''' and '''association analysis'''.
 
||
 
||
[https://flamboyant-meninsky-d13fd9.netlify.app/posts/2021-02-28-group-project-proposal-data-visualization-survey-analysis/ Project Blog Link]
+
[https://group14.netlify.app/posts/2021-02-28-group-project-proposal-data-visualization-survey-analysis/ Project Blog Link]
 
||
 
||
 
* Bai Xinyue
 
* Bai Xinyue

Latest revision as of 22:32, 27 April 2021

ISSS608 Visual Analytics and Applications



Project Groups

Please provide the project title and an abstract of your project. The abstract should not be more than 350 words. You are also required to include the project blog link and the names of the team members.


Project Team Project Title/Description Project Web Blog Project Member
Group 1

Understanding Airbnb listings in Australia

The abundance of Airbnb data provides a great opportunity to conduct a variety of data analyses to understand the residential short-lease rental market. The dataset, which has been scraped from the Airbnb website and made publicly available by Inside Airbnb, provides geospatial, textual, and quantitative data on each of the listings on the site. This project provides an analytics platform for interested parties (especially non-data specialists) to conduct exploratory spatial data, text, cluster, and regression analysis on the Australia Airbnb dataset using simple and user-friendly interactive dashboards that do not require programming knowledge.

Project Blog Link

  • Jason TEY Shou Heng
  • Louelle TEO Fengmin
  • WONG Kian Hoong (Andy)
Group 2

DEGGED - Dynamic Exploration of Greenhouse Gas Emissions and its Determinants using R and Shiny

In December 2020, European Union (EU) leaders committed to an ambitious goal of reducing greenhouse gas emission levels by 55% by 2030 to tackle climate change. As the EU is the world's third largest emitter of greenhouse gases, it is important to determine the factors that contribute significantly to its emissions. Most existing literature focuses largely on the relationship between drivers and greenhouse gas emission levels, without considering mitigation factors, which also play a role in reducing emissions. Earlier research also often presented findings in static form, limiting the amount of data exploitation that can be performed. Hence, our research aims to study the relationship of both drivers and mitigation measures with greenhouse gas emission levels using Ordinary Least Squares regression and Panel Data regression. We designed and developed DEGGED, an interactive web-based dashboard, to allow policymakers and environmentalists to explore and analyse determinants of greenhouse gas emissions. DEGGED is intended to be intuitive, letting non-technical users perform fundamental data analysis and regression modelling without any coding.

Project Blog Link

Group 3

Understanding Key Stories Covered In the Media and How Readers Engaged With News

As we become more and more inundated with news from various digital sources today, understanding what the key stories are across the digital spectrum is becoming more and more challenging. As such, we are interested in understanding how to best present a visual snapshot of the key stories that are covered in local media and identifying how readers engaged with the news.

Project Blog Link

Group 4

COVID ExploreR - Interactive Visual Analysis with R Shiny for Exploring COVID-19 Data

The Coronavirus (COVID-19) has caught the world’s attention with the first COVID-19 cases reported in Wuhan, Hubei, China, in December 2019. In the global battle against the virus, countries seek to understand the virus, its spread, impact and more recently, receptivity towards the COVID-19 vaccination.

Our project aims to leverage the richness of the COVID-19 data to provide an interactive experience in generating insights and analyses using R Shiny from three key aspects: (1) new cases; (2) deaths; and (3) vaccination receptivity.

Project Blog Link

Group 5

Predicting whether an individual would go for the H1N1 vaccine

Vaccination is a crucial public health measure to flatten the curve in a pandemic. By looking at a dataset that contains the personal demographics and attitudes of respondents in the USA towards H1N1 vaccination, we hope to predict whether an individual would go for the vaccine.

Project Blog Link

  • Hai Dan
  • Lim Pek Loong Desmond
  • Tay Kai Lin
Group 6

Our Shiny PET: A Predictive, Exploratory and Text Application for Airbnb Data

The increasing availability of data has resulted in increased demand for data-driven decisions. Although there is an extensive range of commercial statistical tools, they are often subscription-based and demand good technical knowledge to mine and draw insights from. Therefore, they may not appeal to the average user.

As such, our project aims to develop a user-friendly application that will enable users to make data-driven decisions without the need to understand programming languages or have extensive statistical knowledge. We will use Airbnb data as our baseline for this project, as it is rich in information and consists of structured, unstructured (textual), and location data.

With this application, users will be able to perform text analysis on review and listing data to generate more quantitative insights. The exploratory module allows users to identify interesting patterns based on selected variables. Findings from the exploratory module will be further augmented in the confirmatory module, where the selection of statistical methods is guided by the user’s chosen variables. Finally, the predictive module enables users to prepare and build a variety of prediction models without needing an in-depth understanding of the predictive models and their algorithms.

Project Blog
Shiny Application
Research paper
Poster
Github link

Group 7

Innovation Amidst Covid-19

Covid-19’s impacts on workers and workplaces across the globe have been dramatic. Indeed, there will be global business realignment triggered by the disruptions of the Covid-19 pandemic.

In this project, we analyse and identify patterns in the Global Innovation Score, comparing the pre- and post-Covid-19 pandemic periods across countries based on the Global Innovation Index 2020, and specifically seek to better understand the impact in Singapore.

Our approach includes developing an R Shiny application for an interactive (1) Exploratory Data Analysis which includes a Choropleth Map, Bubble Plot, Radar Chart, and Slope Chart, and (2) Statistical Analysis which includes Correlation Analysis, Statistical Plots, and Hierarchical Clustering.

Project Blog
Project GitHub
Shiny Application
Shiny GitHub
Application User Guide
Project Poster
Practice Research Paper

Group 8

A Simple Stock Analyzer

The individual investor is often overwhelmed with data and information, with no tools to analyse, visualize or forecast stock performance without subscribing to expensive services. The Simple Stock Analyzer (SSA), a highly interactive and visually driven application, leverages the recently available R packages `timetk`, `tidyquant`, and `modeltime` to present a tool that empowers the individual investor with a simple-to-use graphical user interface built with R and Shiny.

Project Website Shiny Application

  • Anh Hoang Bui
  • Evelyn Phang
  • Ling Huang
Group 9

Enabling optimization of bike-sharing operations – Bluebikes

The advent of shared bikes has provided people with a new way of commuting, and has picked up rapidly due to its convenience and low cost. However, there are still some problems at the current stage, such as an over-accumulation of bikes at certain areas leading to inconveniences to the public. On the flip side, there could be insufficient supply of bikes at selected stations during peak periods leading to potential users choosing an alternate form of transport. There is also the issue of overused bikes lacking maintenance/servicing at the right time intervals.

There is currently no platform that provides an integrated analytics capability to perform exploratory analysis of the trip data and gather insights to improve operations. This is the gap that our team aims to close. We would like to design an interactive application that will help the executives of Bluebikes to analyze and visualize users’ trip data. This application would serve as the go-to analytics platform for gathering insights on the bike-sharing operations and would facilitate decision making on improvement ideas.

The objective of this project is to create an app using R-Shiny that will enable Bluebikes to focus on the operational optimization of their bike fleet supply at each of the stations via:

• Providing an exploratory and confirmatory interface to analyze bike trip duration and the intensity of bike station activity.

• Analyzing the deficit or excess of bikes moving in and out of the numerous bike stations.

• Optimizing the utilization rate of the entire bike fleet.

• Tracking and determining the right time to perform servicing and maintenance on the bike fleet.


Project Blog Link

  • Liu Jie
  • Wang Ziqi
  • Vikram Shashank Dandekar
Group 10

Understanding Prime Mover (PM) Productivity in Yard (UPMPY)

The objective of the study is to gain insight and create an application using Shiny for the R programming language to enable managers to improve Prime Mover (PM) operations. This is done by helping managers obtain an in-depth understanding of PM characteristics, terminal, time of day, and movement status using exploratory data analysis. Confirmatory data analysis is carried out to understand correlations and differences in the distributions. Following this, quality control, using Pareto and control charts, takes center stage to identify the main contributors to poor productivity. All of this is presented in an R Shiny app, which enables interactivity: users can adjust input controls and see the output update in real time.

Project Blog Link / Project Submission

  • Li Zhenglong
  • Lim Kai Chin
Group 11

The Prime Crime Area Spatio-Temporal Analysis

Given limited police resources and the possible adverse impact when crime occurs, analytics on crime has been done as far back as the 1800s (Hunt, 2019). Crime occurrence was found to have spatial patterns, and thus predictive analytics should be possible. However, research into whether predictive policing results in lower crime rates has produced mixed results (Meijer & Wessels, 2019). Thus, it is more beneficial to use analytics to determine areas with a higher risk of crime and to discover the underlying factors behind the increased risk.

Traditionally, crime analysis is done manually or through a spreadsheet program (RAND Corporation, 2013). Using demographic, socio-economic and crime rate data of the Greater London Region, retrieved from the London Datastore, this project would give users an easier way to do crime analysis using a web application. In this project, three key analyses will be performed:

  1. Exploratory Data Analysis: Finding spatial hotspots and how crime rates have changed over the years.
  2. Clustering Analysis: Finding similar local authority districts (LADs) based on the crime rates and other influencing factors.
  3. Regression Modelling: Forecasting crime rate in each LAD.

Project Blog Link

Group 12

Investing 101: A visual and predictive guide for the rookie investor

Existing financial data websites such as Yahoo Finance do a good job in providing historical price data and technical indicators, but the beginner investor lacks the knowledge to properly utilise and benefit from these. In addition, we have also identified several gaps in such websites.

For one, these websites do not provide tools to allow the user to compare stocks meaningfully or zoom in to the statistical properties of financial returns. For example, a user is unable to conduct correlation analysis or visualize the distribution of returns. Secondly, these websites also do not provide any form of forecasting to aid in investors’ decisions.

This project aims to improve on the current offering of financial data websites by including the following key modules:

  1. Exploratory Data Analysis: Key visualizations and analysis of key financial asset returns metrics
  2. Time-Series Forecasting: Predicting financial asset prices using an ARIMA model
  3. Time-Series Clustering: Clustering financial assets based on historical returns

Project Blog Link

  • Andre Lee
  • Boey Yi Heen
  • Ng Weekien
Group 13

The Impact of Lifestyle and Family Background on Grades of High School Students

For many years, there has been an emphasis on education around the world because of the impact it has on a person, be it in terms of employment opportunities or quality of life. It is hence important to know which factors affect one’s academic performance. While there are many factors that can impact a person’s academic performance, family background and lifestyle are two of the larger ones.

Since there are many sub-factors in family background and lifestyle choices, the motivation of this study is to look deeper at these sub-factors to see which have a greater correlation with a student’s grades. More specifically, this study aims to examine the correlation between each factor and a student’s grades, and to build a model that can accurately determine the academic performance of a student. From the findings, targeted help may be administered to students in the specific areas contributing to poor grades in school, thereby giving them a higher chance of a better future.

Project Proposal Weblog

  • Lim Jun Jie Timothy
  • Tang Haozheng
  • Wu Yufeng
Group 14

Data Visualization Survey Analysis

Data visualization is the graphical representation of information and data. It has been an important part of the data analytics pipeline, revealing insights that are often difficult to deliver in other forms. It is commonly used in various scenarios, such as cleaning data, exploring data structure, detecting patterns, and identifying trends and clusters. It provides organizations and practitioners with a handy tool to analyse data and enables them to make informed decisions based on the insights gained. Understanding the current state of data visualization is crucial. It gives organizations and practitioners in the field a better idea of where data visualization stands today, and where it’s headed. It also helps people who have an interest in data visualization know how to enter the field.

In this research study, we will build an R Shiny application to illustrate the current trends in data visualization. The goal is to draw a comprehensive picture of data visualization for organizations, practitioners and people with an interest in data visualization, by analysing the annual Data Visualization Community Survey. The analysis and visualization consist of three parts: interactive exploratory data analysis, cluster analysis and association analysis.

Project Blog Link

  • Bai Xinyue
  • Li Hongting
  • Zhang Weimin
Group 15

Project Title and Abstract

Project Blog Link

  • Team member 1
  • Team member 2
  • Team member 3
Group 16

VISTAS - Visualising Industry Skill TAlent Shifts

LinkedIn and the World Bank Group have partnered and released data from 2015 to 2019 that focuses on 100+ countries with at least 100,000 LinkedIn members each, distributed across 148 industries and 50,000 skill categories. This data aims to help governments and researchers understand rapidly evolving labor markets with detailed and dynamic data.

Through our project, we want to provide individuals and countries with insights into various interest areas to benchmark themselves against the global landscape. As an extension, we will add macroeconomic indicators of GDP growth from the World Bank to our data visualisations. We envision that our project will help individuals and countries answer questions on employability, employment opportunities, and migration and skill trends.

Project Blog Link
Shiny App
Github