Difference between revisions of "Project Groups"

From Visual Analytics and Applications
 
(107 intermediate revisions by 19 users not shown)
Line 44: Line 44:
 
||
 
||
 
<div style="text-align:center;">
 
<div style="text-align:center;">
Group01
+
Group 1: Hawk-R Stall Rentals
 +
[[File:G1-ProjThumb.jpg|thumb]]
 
</div>
 
</div>
 
||
 
||
Title
+
'''Auntie, Stall Rental How Much R?'''
  
Abstract: Not more than 300 words
+
Hawker stalls are a staple in the Singapore food scene. With the recent furore over hawker stall management and outrageous stall bids, do hawkers actually know what the market rate of hawker stall rentals is and how much to bid?
 +
 
 +
Our project aims to help existing and aspiring hawkers understand hawker stall rentals through appropriate visualisations and to analyse the factors that affect stall rentals. Our app, built in R Shiny, will allow users to explore Singapore’s publicly managed hawker centre rental data. Users will be able to view price trends and look up hawker stall rental prices in a selected hawker centre. We will also include a geographically weighted regression, so users can see how different location-based variables affect the rental price of hawker stalls.
 +
 
||
 
||
 
*[[Group01_Proposal|Proposal]]
 
*[[Group01_Proposal|Proposal]]
Line 56: Line 60:
 
*[[Group01_Report|Report]]
 
*[[Group01_Report|Report]]
 
||
 
||
* Member 1
+
* Chia Yong Qing
* Member 2
+
* Choo Mei Xuan
* Member 3
+
* Clara Chua
 
|-
 
|-
 
||
 
||
<div style="text-align:center;">Group02</div>
+
<div style="text-align:center;">
 +
Group02: Casualties of Commodity Trade War
 +
[[File:Cargo_Port.jpg|300px]]
 +
</div>
 
||
 
||
'''Environmental Criminology: The Missing "W" in Whodunnit'''
+
'''Casualties of Commodity Trade War'''
 
+
<br>
With the increased availability of crime data rich in geospatial-temporal variables, exploratory, statistical and predictive analytics can be leveraged to understand crime occurrences through the lens of environmental criminology. The application produced from this research builds on previous work analysing interactions and associations among crime data variables, supplemented with population data. With Los Angeles city crimes as our case study, we demonstrate how results from various analytical methods can be displayed visually and intuitively for exploration by the casual user, with interactivity catering to varying needs. In particular, the application displays exploratory and predictive statistical analytics results using radar charts, calendar plots, choropleths, small multiples of choropleths, multimodal network graphs, heat maps and geographical maps.
+
Trade has always been an essential economic activity for mankind. Over the years, globalization has created complex inter-dependencies between countries. The recent US announcement of hefty tariffs on steel and aluminum from most countries, as part of its economic policy, has shattered the delicate balance of world trade. A full-blown global trade war seems to loom ahead, and among the many questions that arise from this possibility, we seek to provide visual insights on the “Casualties-Of-War”.
  
 +
Our interactive R Shiny application will allow policymakers to identify trends, patterns and dependencies in commodity trade across geographic, regional and economic communities, and to identify economies that are sensitive to trade, along with the particular commodities that give rise to this sensitivity.
 
||
 
||
*[[Grp2_Proposal|Proposal]]
+
*[[Group02_Proposal|Proposal]]
*[[Group02_Report|Report]]
 
*[https://rchlt.shinyapps.io/va-g2-lightapp/ Light Version of App^]
 
*[https://tinyurl.com/yaapkvx7/ Code for full-scale App]
 
*[https://tinyurl.com/ybmxe3d8/ Data for full-scale App]
 
 
*[[Group02_Poster|Poster]]
 
*[[Group02_Poster|Poster]]
 
+
*[[Group02_Application|Application]]
^ Light Version contains 10 months of data (Jan, Feb, Aug, Sep, Dec for 2016 and 2017)
+
*[[Group02_ResearchPaper|Report]]
 
 
 
||
 
||
* Matilda Tan Ying Xuan
+
* Brian Chen
* Nurul Asyikeen Binte Azhar
+
* Hyder Ali
* Rachel Tong
+
* Matthias Oh
 
|-
 
|-
 
||
 
||
[[Group_3_Overview|Group 3: Shiny-GWR Geovisual Analytics Application]]<br><br>[[File:Icont3.JPG|209px|center]]
+
<div style="text-align:center;">
 +
Group 3: Tourism Investigator
 +
[[File:Singapore Tourism.jpg|300px]]
 +
</div>
 
||
 
||
 +
'''Welcome to Singapore! - Insights on Tourists Arrival and Expenditure on the Sunny Island'''
  
'''Building a geo-visualization application to analyse district economy in east region of China with geographically weighted regression (GWR) technique'''
+
Singapore has been fast emerging as a global tourism destination, with visitor arrivals and tourism receipts hitting record highs year on year.
 
+
Using data extracted from CEIC, we strive to produce new observations that can help businesses and enterprises make informed decisions on tourism management. With the insights generated, these enterprises and businesses will be in a better position to tap the growing tourism revenue and to create impactful services that leave a profound and lasting impression on travellers.
Geospatial analysis was developed for problems in the environmental and life sciences and has since extended to almost all industries, including economy, defence, utilities, social sciences, and public safety. Geographically weighted regression (GWR) is an exploratory technique mainly intended to indicate where non-stationarity is taking place on the map. It creates a set of location-based parameter estimates that can be mapped and analysed to give spatial information on the relationship between the explanatory variables and the response variable.
+
<br> Using the various visualisation techniques and tools in R Shiny learnt in class, we aspire to identify the most lucrative visitors and the fastest-growing segments among those who set foot on this sunny island. Moving forward, our team will attempt to build a forecasting model using ARIMA analysis techniques to forecast future tourist arrivals.
 
 
 
 
Our study uses economic data to explore district GDP conditions in the northern region of China. The project scope covers the analysis, modelling and visual representation of multivariate factors such as GDP, Industry Output, Usual Residence, Average Wage, Area, City Construction Rate, No. of higher institutions, and Teacher/Student ratio, which contribute to economic development in each city area of the province or municipality, with the assistance of interactive charts and graphs.
 
 
 
 
||
 
||
*[[Group_3_Proposal|Proposal]]
+
*[[Group03_Proposal|Proposal]]
*[[Group_3_Application|Application]]
+
*[[Group03_Poster|Poster]]
*[[Group_3_Poster|Poster]]
+
*[[Group03_Application|Application]]
*[[Group_3_Report|Report]]
+
*[[Group03_Report|Report]]
 
||
 
||
* Xiao Zhenyu
+
* SHEN He
* Chen Zhengjian
+
* SOO Zhi Kai
* Zheng Mianyi
+
* ZUO Anna
 
|-
 
|-
 
||
 
||
[[Group_4_Overview|Group 4: A tale of Bitcoin]]<br><br>[[File:Bitcoin.png|209px|center]]
+
<div style="text-align:center;">
 +
[[File:4507954714 0e51720a8f m.jpg|200px]]
 +
</div>
 
||
 
||
 +
'''R-CsI: An R-ConSumerInsights Business Application to better understand Customers'''
  
'''Ever wondered how far bitcoin's value could go?'''
+
Technology in today's world is advancing faster than ever before. With the concepts of digital transformation, the Internet of Things (IoT) and cloud computing becoming more and more prevalent, it has also become far easier to obtain and access large amounts of data on a variety of consumer activities in an ever-widening list of industries. By using various visual, statistical and data mining techniques on these data sets, businesses will be able to harness the power of hindsight with regard to customer behavior, allowing them to learn more about the activities, purchases or other transactions made by their customer base. Businesses will then be able to use the insights gleaned from data exploration and discovery to address fundamental issues, such as customer acquisition, development and retention. <br><br>
 
 
Bitcoin has recently garnered mixed reviews from two extreme ends, from China banning bitcoin to Chicago Mercantile Exchange supporting the futures trading of bitcoin. There are even more varying opinions from big investment banks to regulators. All this recent excitement is due to bitcoin’s value rising by more than 700% (as of October 2017) from the start of 2017.  
 
 
 
It is very tempting to speculate that the price will continue to go up. If it does, by how much? If it doesn’t, how hard will it fall? How is its relative performance compared to other instruments? There are many more questions from both investors as well as curious academics alike. This paper’s focus will be on the following:
 
 
 
# price movement patterns and trends; and
 
# the risk and return profile of bitcoin
 
 
 
 
 
These questions are answered through various visualisation techniques built in R.
 
 
 
 
 
  
 +
This project aims to discover insights on the segments that exist in a selected retailer’s customer base, as well as identify groups of products that are highly associated during purchase. This will be done through analysis of the "Dunnhumby - The Complete Journey" dataset, obtained from the data science company Dunnhumby, which tracks the purchases of 2500 households over 2 years. By understanding the differences among consumer groups and the patterns and hidden relationships in the transactional data, we hope that businesses will be able to obtain invaluable insights into their customer profiles and focus their efforts on developing more effective customer-based strategies.<br/>
 
||
 
||
*[[Group_4_Overview|Proposal]]
+
*[[Group04_Proposal|Proposal]]
*[[Group_4_Application|Application]]
+
*[[Group04_Poster|Poster]]
*[[Group_4_Poster|Poster]]
+
*[[Group04_Application|Application]]
*[[Group_4_Report|Report]]
+
*[[Group04_Report|Report]]
 
||
 
||
* DENG Yuetong
+
* LEE Kern Choong
* YAU Hon Tak
+
* LEE Yeng Ling
 
+
* Debbie SIAH Mei Ping
 
 
 
|-
 
|-
 
||
 
||
 
<div style="text-align:center;">
 
<div style="text-align:center;">
[[ISSS608_2017-18_T1_Group5_Report|Group 5: Aviation Expansion]]<br><br>[[File:G5_pic.jpg|250px|center]]
+
Group05: VizTS: Clustering Edition
 +
[[Image:Time Series Clustering.jpg|250px]]  
 
</div>
 
</div>
 
||
 
||
'''Linking the globe: An Interactive Dashboard for Exploring Aviation Expansion Along the "Belt and Road"'''<br>
+
'''Visual Application for Time Series Clustering'''
''How does civil aviation contribute to the “Belt and Road” initiative?''<br>
 
 
 
The Belt and Road (B&R) refers to the land-based "Silk Road Economic Belt" and the seagoing "21st Century Maritime Silk Road". Unveiled in 2013, the strategy underlines China's push to take a larger role in global affairs with a China-centered trading network by reinvigorating the seamless flow of capital, goods and services between Asia and the rest of the world. With the aim of promoting further market integration and forging new ties among communities, the routes cover more than 60 countries and regions from Asia to Europe via Southeast Asia, South Asia, Central Asia, West Asia and the Middle East.<br><br>
 
  
By investigating all air-routes and flights between China and the “Belt and Road” countries between 2013 and 2017, the main purpose of this project is to explore the growing trend, regional connectivity and development potential in this aviation network. <br><br>
+
Time series clustering partitions time series data into groups based on similarity or distance, so that time series in the same cluster are similar. Time-series datasets contain valuable information that can be obtained through pattern discovery, and clustering is a common way to uncover these patterns. Representing the time-series cluster structures as visual images (visualization of time-series data) can help users quickly understand the structure of data, clusters, anomalies, and other regularities in datasets.
  
A web-based visual analytics tool, implemented using R Shiny with the Leaflet package, can be used to easily explore and understand the flight network.
+
Time series clustering offers a wide variety of strategies, including a family specific to the Dynamic Time Warping (DTW) distance. <b>dtwclust</b> is an R package that implements many algorithms specifically tailored to DTW. A great amount of effort went into implementing them as efficiently as possible, and the functions were designed with flexibility and extensibility in mind. As such, the functions of <b>dtwclust</b> are comparable to, if not better than, those of expensive commercial off-the-shelf analytical toolkits such as SAS Enterprise Miner. However, to date, use of the <b>dtwclust</b> package has tended to be confined to academic research, as it requires intermediate R programming skills.
  
 +
The project aims to provide a user-friendly interface to the <b>dtwclust</b> package using the R Shiny framework. The interface design allows casual users to import data and to manage, explore, calibrate, visualise and evaluate clusters without having to type a single line of code. In addition, the application aims to incorporate graph visualization to enhance data exploration, to aid interpretation of the clustering outputs and to investigate the similarities or dissimilarities within each cluster.
 
||
 
||
*[[ISSS608_2017-18_T1_Group5_Proposal|Proposal]]
+
*[[Group05_Proposal|Proposal]]
*[[ISSS608_2017-18_T1_Group5_Report|Report]]
+
*[[Group05_Poster|Poster]]
*[[ISSS608_2017-18_T1_Group5_Poster|Poster]]
+
*[[Group05_Application|Application]]
*[[ISSS608_2017-18_T1_Group5_Application|Application]]
+
*[[Group05_Report|Report]]
 
||
 
||
* Wang Rui
+
* Arief Sulistio
* Wu Yuqing
+
* Goh I Vy
* Xing Siyuan
+
* Lim Si Ling Evelyn
|-
 
 
 
 
|-
 
|-
 
||
 
||
 
<div style="text-align:center;">
 
<div style="text-align:center;">
[[Group_6_Overview|Group 6: Beijing Air Quality]]<br><br>[[File: Air3.jpg|209px|center]]
+
[[File:Group 6 Logo.png|300px|frameless|center|link=https://wiki.smu.edu.sg/18191isss608g1/Group06_Overview]]
 
</div>
 
</div>
 
||
 
||
 +
'''Gee-Whiz: Singapore and the suppliers that make her tick'''
  
'''How is Beijing Air Quality in 2017?'''
+
Singapore is recognized worldwide for its efficient and clean public service. Ranked 6th in the world in a corruption perceptions index by Transparency International, the incidence of public sector corruption here remains one of the lowest in the world. Efforts to maintain openness and transparency in all government activities can be witnessed throughout the established systems and processes.
 
 
On November 4th, the Beijing Environmental Protection Agency announced that, owing to adverse weather conditions, early winter heating and other factors, four consecutive days of heavy regional air pollution were expected in Beijing-Tianjin-Hebei and surrounding areas from November 4th; in addition, the air quality in some cities might reach the serious pollution level….
 
 
 
 
 
Beijing, the capital of China, is one of the most seriously polluted cities. As air pollution escalates, many people who work and live in Beijing face tracheitis, pneumoconiosis and asthma, to name just a few. Gradually, many people have become terrified of living and working in Beijing.
 
 
 
 
 
In our project, we visualize and analyze Beijing air quality according to its main existing indicators, such as AQI, NO, SO2, CO and PM2.5. To better display the visualization results, we use R Shiny Dashboard for the page design. Using the R package ggplot2, we visualize the fluctuation of AQI and the frequency of each AQI level. We also generate a spider chart, using the fmsb package, which shows the severity of each pollutant. Finally, we display a raster map and geofacet line graphs for 8 main view points using the packages ggplot2, maptools, gstat, raster and geofacet.
 
 
 
  
All in all, we hope to show the air quality clearly, help people learn more about the surroundings they live in, and raise public environmental awareness.
+
GeBIZ is Singapore’s public eProcurement portal for suppliers to bid on tenders published by various agencies and ministries. While public agencies enjoy the economies of scale which come with the electronic purchase of goods and services, suppliers have broader access to government tenders and quotations. GeBIZ encourages greater transparency together with fair and open market competition as all procurement operations are published online.
  
 +
Using network analysis and other visualization techniques, we will explore the relationships between suppliers and the agencies with which they trade. Are there unknown biases in the tendering process which favors certain suppliers over others? Are there strong relationships between certain ministries and suppliers for specific types of projects? Are some suppliers providing such high value services to the government as to pose a concentration risk? Join us as we find out.
 
||
 
||
 
*[[Group06_Proposal|Proposal]]
 
*[[Group06_Proposal|Proposal]]
 
*[[Group06_Poster|Poster]]
 
*[[Group06_Poster|Poster]]
*[[RShinyApp]]
+
*[[Group06_Application|Application]]
 
*[[Group06_Report|Report]]
 
*[[Group06_Report|Report]]
 
||
 
||
* Wang Yizhou
+
* Charu Malik
* Zhou Chen
+
* Kateryna Mazurenko
* Zhang Lidan
+
* Qiao Xueyu
|-
 
 
 
 
|-
 
|-
 
||
 
||
 
<div style="text-align:center;">
 
<div style="text-align:center;">
[[Group 7 Overview|Group 7: Bike Sharing]]<br><br>[[Image:Shareing_bicycle.png|250px|center]]
+
(Group07) Corn: The A-maize-ing Crop
 +
[[File:Group07 header3.jpg|300px|frameless|center|link=https://wiki.smu.edu.sg/18191isss608g1/ISSS608_Group07_Proposal]]
 +
 
 
</div>
 
</div>
 
||
 
||
 +
'''Corn: The A-maize-ing Crop'''
  
'''Bike Sharing'''
+
Corn, or maize as it is called in some countries, was first grown in ancient Central America. It has become a staple in many parts of the world, providing not only food for our tables but also the raw ingredient for corn ethanol, animal feed and more. The Corn Belt in the US has about 96,000,000 acres of land devoted to corn production, characterised by level land and fertile, highly organic soils.
 
 
Pronto Cycle Share, branded as Pronto!, was a public bicycle sharing system in Seattle, Washington, that operated from 2014 to 2017. The system, owned initially by a non-profit and later by the Seattle Department of Transportation, included 58 stations in the city's central neighbourhoods and more than 500 bicycles.
 
 
 
Bike-sharing is a form of short-distance transportation that makes people's lives more convenient. Users of shared bikes can borrow and return them at any station in the service area. Some stations receive too many incoming bikes and become jammed, without enough docks for arriving bikes, while other stations empty quickly and lack enough bikes for people to check out.
 
 
 
'''Which station has the most passenger flow?'''
 
 
 
In our project, we calculate the in-degree and out-degree of each station to help users understand how passengers usually use the bike-sharing service at each station. We also divide the time range into different periods, so that users can see much more detail yearly, monthly and even hourly, and better understand bike usage patterns.
 
  
'''How to re-distribute bike at a lower cost?'''
+
Corn is known to grow in a wide range of climatic conditions, so it is a challenge to set precise conditions for corn production. Breeders have therefore been experimenting with various corn hybrids, each created to give a high yield regardless of the environment in which it is planted. Over the years, farmers have used trial and error to identify the best hybrids, planting each in different locations with different environmental factors; this process has proven slow and not very effective.
  
Because passengers take bikes from one station to another every day, the company must ask employees to redistribute bikes among the existing stations. In our project, we use a real map, combined with the degree data calculated earlier, to visualize a shortest path that helps employees redistribute bikes more efficiently.
+
Visualisation tools such as geofacet and isoline graphs give a good overview of which parts of the Corn Belt have better yields. We will use '''GWmodel''' to build our prediction model, which aims to explore the meteorological and geographical factors that make corn the ''a-maize-ing'' crop we know today, to the benefit of corn breeders.<br>
  
 
||
 
||
*[[G7 Project Introduction|Proposal]]
+
*[[ISSS608_Group07_Proposal|Proposal]]
*[[G7 Poster|Poster]]
+
*[[ISSS608_Group07_Poster|Poster]]
*[[G7 Application|Application]]
+
*[[ISSS608_Group07_Application|Application]]
*[[G7 Report|Report]]
+
*[[ISSS608_Group07_Report|Report]]
 
||
 
||
* Zhang Peng
+
* Tan Le Wen Angelina
* Wang Shang
+
* Pu Yiran
|-
+
* Stanley Alexander DION
 
 
 
|-
 
|-
 
||
 
||
[[Group_8_Overview|Group 8: Time Series Explorer]]<br><br>[[File: Group8ProjectBanner.png|350px|center]]
+
<div style="text-align:center;">
 +
[[File:Group 8 Logo.jpg|300px|frameless|center|link=https://wiki.smu.edu.sg/18191isss608g1/Group08_Proposal]]
 +
</div>
 
||
 
||
'''Time-series Explorer: Building interactive data visualisation for time series analysis'''
+
'''Visualizing Future of Crowd Funding with 余额宝'''
  
Time-series analysis is a time and effort consuming endeavour. As budding data analysts, we spent considerable resources in experimenting with many variations of parameter configurations to analyse time-series data. This difficulty stems from the lack of automatic tools that can help calculate the optimized time-series parameters during model training. To tackle this challenge, we created an easy-to-use time-series exploration system that is accessible even to the uninitiated analyst. The system is able to decompose the time series data to its constituent parts, namely Seasonality, Trend and Random (Noise). It can generate several forecasting models, using Exponential Smoothing and ARIMA analysis techniques, to predict future time periods using optimization techniques. The system also allows other forms of time series data to be displayed and their forecasts compared using the given forecasting methods, within certain formats. To test the system capabilities, we adopted the Singapore Consumer Price Index (CPI) as our use case. The CPI, with its short-term forecasts, is often used for tuning Governmental policies to steer inflation rates in countries like Singapore and for foreign investors to consider allocating potential investment funds into the country.
+
[http://yuebao.thfund.com.cn/ Yu’e Bao (余额宝)] is an investment product offered by [https://www.alipay.com/ Alipay (支付宝)], a mobile and online payment platform established by China’s multinational conglomerate [https://www.alibabagroup.com/en/global/home Alibaba Group]. In June 2013, Alibaba Group launched Yu’e Bao, in collaboration with [http://www.thfund.com.cn/en/index.html Tianhong Asset Management Co., Ltd.], to form the first internet fund in China. Since then, Yu’e Bao has become the nation’s largest money market fund and, by Feb 2018, had [https://yourstory.com/2018/08/alibaba-yue-bao-unearthed-hidden-treasure-from-digital-wallets/ US$251 billion] under its management. In Chinese, Yu’e Bao means “Leftover Treasure”. Alipay users can deposit their extra cash, for example, leftover from online shopping, into this investment product. The money is invested via a money market fund with no minimum amount or exit charges, with interest paid daily. While major banks offer 0.35% annual interest on deposits, Yu’e Bao may offer users 6% interest with the convenience and freedom to deposit and withdraw anytime via the Alipay mobile app. Thus, Yu’e Bao became extremely popular in China.
 +
<br><br>
 +
Using various data visualization methodologies, coupled with survival and time-series analysis, this project aims to build an interactive tool on the R Shiny framework to unearth the underlying treasures of associations between Yu’e Bao’s user profile, behaviour, time and other financial factors.
  
 
||
 
||
*[[Group_8_Overview|Proposal]]
+
<div style="text-align:center;">
*[[Group_8_Poster|Poster]]
+
</div>
*[[Group_8_Application|Application]]
+
*[[Group08_Proposal|Proposal]]
*[[Group_8_Report|Report]]
+
*[[Group08_Poster|Poster]]
 +
*[[Group08_Application|Application]]
 +
*[[Group08_Report|Report]]
 +
*[[Group08_Academic_Paper|Academic Paper]]
 
||
 
||
* Fam Guo Teng
+
* Wong Yam Yip
* Wang Yuchen
+
* Wu Jing Long
* Xu Yanru
+
* Song Chen Xi
 
|-
 
|-
 
||
 
||
[[Group10_Overview|Group 10:China Property Trend]]<br><br>[[File:Geocluster.jpeg|209px|center]]
+
<div style="text-align:center;">
 +
[[File:Cover.jpg|300px|frameless|center]]
 +
Group09
 +
</div>
 
||
 
||
'''China property analysis'''
+
'''Visualization on China Stock Market Data'''
 
 
The real-estate market is ever growing and has more and more stakeholders. We are building an app that makes analysis of the housing price market easy and effective with just a few clicks and hovers, allowing the major stakeholders to perform their analysis and plan their decisions more efficiently.
 
 
 
We have used various packages, such as 'Recharts', 'Timekt', 'Sweep' and 'ggplot2', that allow users to model and visualize the housing price indexes in different ways for different purposes.
 
 
 
Time Series Analysis: The application allows users to choose the city they are interested in and the time period they want to look at, and shows the price trend during that period. It is built for analysts, agents and government officials who would like to review performance over a certain period and compare different cities, so that they can find outliers or particular patterns in the indices. The time series is related to economic policy, and the effect is highlighted based on the chosen policy time period.
 
 
 
Cluster Analysis: We further cluster the cities based on how their housing indices react, grouping cities whose housing markets behave or respond to the market in a similar way. This helps government officials and local agents understand the markets better and plan their policies better. The government can then design a waiver or a cluster-centric development policy.
 
 
 
Forecast Analysis: Forecast analysis is done using Geofacet, so that we can compare the forecasted prices between the different regions of the country. Geofaceting arranges a sequence of plots of data for different geographical entities into a grid that strives to preserve some of the original geographical orientation of the entities.
 
  
This app can be applied to any other economic variable in China. This will be greatly helpful for economists, agents and government officials to look into the specific data and make some judgments and decisions based on it.
+
When we access a trading platform to make an investment, we generally need to deal with plenty of price data to reach a decision, including the opening price, closing price, highest price of the day and lowest price of the day. The K line is the most popular visualization chart of stock data for investors to refer to. However, a new investor who is not very familiar with or sensitive to financial data may be distracted by the various price data and unable to make appropriate financial decisions.
 +
Therefore, visualization of stock market data is quite useful for technical stock market analysis and will help investors gain a comprehensive understanding of how the stock market is changing, which leads to our analysis objective for this project.
  
 
||
 
||
*[[Group10_Overview|Proposal]]
+
*[[Group09_Proposal|Proposal]]
*[[Group_10_Poster|Poster]]
+
*[[Group09_Poster|Poster]]
*[[Group_10_Application|Application]]
+
*[[Group09_Application|Application]]
*[[Group_10_Report|Report]]
+
*[[Group09_Report|Report]]
 
||
 
||
* Aishwarya Mohan
+
* Cao Xinjie
* Deng Chunling
+
* Chen Jingyi
* Ma Xiaoliu
+
* Wang Yixuan
|-
 
 
 
 
|-
 
|-
 
||
 
||
[[Group_11_Overview|Group 11: CrimeModeler: A Visually-Driven Geospatial Modelling Tool for Crime Applications]]<br><br>[[File:police.jpeg|220px|center]]
+
<div style="text-align:center;">
 +
[[Image:Group10-logo.jpg|300px]]
 +
Group10
 +
</div>
 
||
 
||
'''CrimeModeler: A Visually-Driven Geospatial Modelling Tool for Crime Applications'''
+
'''China Stock Data Visualization'''
 
+
Stock market data is seemingly endless and widely available on the web. Stock exchange movements depend on a complex mix of factors and are difficult to predict. Exploring the patterns of stock market data using different data visualization techniques will be very helpful for stock market investors and traders. This project aims to provide advanced data visualization of stock market data to reveal the hidden patterns of market movement.
Based on the UN’s Survey of Crime Trends published in 2006, England and Wales have one of the highest crime rates among OECD countries. We have developed CrimeModeler, a geospatial modelling tool to investigate the spatial variation of crime across different districts in England and Wales, and the relationship between crime and socio-economic characteristics for each district. As it is common for neighbouring regions to have correlated crime rates, we compare geographically weighted regression (GWR) with a conventional (or global) multiple regression model to see whether GWR yields a better result. We will also investigate whether certain variables have an impact on crime rate in one area but not in another. Local governments may use this information to come up with better policies to tackle crime.
 
 
 
 
 
 
||
 
||
*[[Group11 Proposal|Proposal]]
+
*[[Group10_Proposal|Proposal]]
*[[Group11 Poster|Poster]]
+
*[[Group10_Poster|Poster]]
*[http://crime.raymondfoo.host/ Application]
+
*[[Group10_Application|Application]]
*[[Group11 Report|Report]]
+
*[[Group10_Report|Report]]
 
||
 
||
* Raymond FOO Celong
+
* Hou Xuelin
* Anthony GOH Jun Jie
+
* Yan Huilin
* Karan Jyoti KHANNA
+
* Zhang Yanli
 
|-
 
|-
 
 
||
 
||
 
<div style="text-align:center;">
 
<div style="text-align:center;">
[[Group12_Proposal|Group 12:Cross Shareholding]]<br><br>[[File:Group12Title.JPG|209px|center]]
+
[[File:Group_11.png|300px|frameless|center]]
 +
Group11
 +
</div>
 
||
 
||
'''Cross Shareholding'''
+
'''Visualization Analysis of Citi Bike Data of New York City'''
 
 
Cross shareholding is a situation in which a corporation owns stock in another company. So, technically, corporations own securities issued by other corporations. Cross shareholding can lead to double counting, whereby the equity of each company is counted twice when determining value. When double counting occurs, the security's value is counted twice, which can result in estimating the wrong value of the two companies.
 
 
 
Cross shareholding is very common in corporate world. Sometimes, there can be more than 10 companies involved and it is very difficult for investors and regulators to track who owns how much.
 
 
 
In this project, our group choose 1 or 2 big groups of companies from Korea and China with heavy cross shareholding between each other and conduct visualization and relationship analysis on their networks using R-Shiny so that people can have better picture of these companies’ network and easier to understand relationship between companies.
 
  
 +
Over the past decade, bicycle-sharing systems have been growing in number and popularity in cities across the world. Citi Bike is New York City’s bike share system and the largest in the nation. Citi Bike launched in May 2013 and has become an essential part of the city’s transportation network. It’s fun, efficient and affordable, not to mention healthy and good for the environment. In this project, we will perform an exploratory analysis of the data provided in Citibike_stations and Citibike_trips and illustrate the power of visual analysis using R Shiny. We will build a user-friendly application to help riders find out which station is most convenient for borrowing and returning bikes. From Citi Bike’s point of view, it is critical to understand which stations are the most popular and to consider the geographical distribution of bike stations.
 
||
 
||
*[[Group12 Proposal|Proposal]]
+
*[[Group11_Proposal|Proposal]]
*[[Group12 Poster|Poster]]
+
*[[Group11_Poster|Poster]]
*[https://koreastockcrossholding.shinyapps.io/ksch1203/ Application]
+
*[[Group11_Application|Application]]
*[[Group12 Report|Report]]
+
*[[Group11_Report|Report]]
 
||
 
||
* KYONG HWAN KIM
+
* Huang Shan
* GONGQIANG
+
* Kouhei Takesita
* HE ZIWEN
+
* Zhang Kexin
 
|-
 
|-
 
  
 
|}
 
|}

Latest revision as of 21:04, 8 December 2018

ISSS608 Visual Analytics and Applications



Please change Your Team name to your project topic and change student name to your own name.

Project Groups

Project Team Project Title/Description Project Artifacts Project Member

Group 1: Hawk-R Stall Rentals

Auntie, Stall Rental How Much R?

Hawker stalls are a staple in the Singapore food scene. With the recent furore over hawker stall management and outrageous stall bids, do hawkers actually know the market rate for hawker stall rentals and how much to bid?

Our project aims to help existing and aspiring hawkers understand hawker stall rentals through appropriate visualisations and to analyse the factors that affect stall rentals. Our app, built in R Shiny, will allow users to explore Singapore’s publicly managed hawker centre rental data. Users will be able to view price trends and look up hawker stall rental prices in a selected hawker centre. We will also include a geographically weighted regression so that users can see how different location-based variables affect the rental price of hawker stalls.

  • Chia Yong Qing
  • Choo Mei Xuan
  • Clara Chua

Group02: Casualties of Commodity Trade War

Casualties of Commodity Trade War
Trade has always been an essential economic activity for mankind. Over the years, globalization has created complex inter-dependencies between countries. The recent US announcement of hefty tariffs on steel and aluminum imports from most countries, as part of its economic policy, has shattered the delicate balance of world trade. A full-blown global trade war seems to loom ahead, and amongst the many questions that arise from this possibility, we seek to provide visual insights on the “Casualties-Of-War”.

Our interactive R Shiny application will allow policymakers to identify trends, patterns and dependencies in commodity trade at the geographic, regional and economic-community levels, and to identify economies that are sensitive to trade, along with the particular commodities that give rise to this sensitivity.

  • Brian Chen
  • Hyder Ali
  • Matthias Oh

Group 3: Tourism Investigator

Welcome to Singapore! - Insights on Tourists Arrival and Expenditure on the Sunny Island

Singapore has been fast emerging as a global tourism destination, with visitor arrivals and tourism receipts hitting record highs year on year. From the data extracted from CEIC, we strive to create new observations that can help businesses and enterprises make informed decisions on tourism management. Leveraging the insights generated, these enterprises and businesses will be in a better position to tap into the growing tourism revenue. By guiding tourism industry players in creating impactful services, these insights can help leave a profound and lasting impression on travellers.
Using the various visualisation techniques and tools in R Shiny learnt in class, we aspire to identify the most lucrative visitors and the fast-growing segments among those who set foot on this sunny island. Moving forward, our team will attempt to build an ARIMA model, tuned using optimization techniques, to forecast future tourism arrivals.

  • SHEN He
  • SOO Zhi Kai
  • ZUO Anna

R-CsI: An R-ConSumerInsights Business Application to better understand Customers

Technology in today's world is advancing faster than ever before. With the concepts of digital transformation, the Internet of Things (IoT) and cloud computing becoming more and more prevalent, it has also become far easier to obtain and access large amounts of data on a variety of consumer activities in an ever-widening list of industries. By using various visual, statistical and data mining techniques on these data sets, businesses will be able to harness the power of hindsight with regard to customer behavior, allowing them to learn more about the activities, purchases or other transactions made by their customer base. Businesses will then be able to use the insights gleaned from data exploration and discovery to address fundamental issues, such as customer acquisition, development and retention.

This project aims to discover insights on the segments that exist in a selected retailer’s customer base, as well as to identify groups of products that are highly associated during purchase. This will be done through the analysis of “The Complete Journey” dataset from the data science company Dunnhumby, which tracks the purchases of 2500 households over 2 years. By understanding the differences among consumer groups and the patterns and hidden relationships in the transactional data, we hope that businesses will be able to obtain invaluable insights into their customer profiles and focus their efforts on developing more effective customer-based strategies.

  • LEE Kern Choong
  • LEE Yeng Ling
  • Debbie SIAH Mei Ping

Group05: VizTS: Clustering Edition

Visual Application for Time Series Clustering

Time series clustering partitions time series data into groups based on similarity or distance, so that time series in the same cluster are similar. Time-series datasets contain valuable information that can be obtained through pattern discovery, and clustering is a common way to uncover these patterns. Representing the time-series cluster structures as visual images can help users quickly understand the structure of the data, its clusters, anomalies, and other regularities.

Time series clustering offers a wide variety of strategies, including a family specific to the Dynamic Time Warping (DTW) distance. dtwclust is an R package that implements many algorithms specifically tailored to DTW. A great amount of effort went into implementing them as efficiently as possible, and the functions were designed with flexibility and extensibility in mind. As such, the functions of dtwclust are comparable to, if not superior to, those of expensive commercial off-the-shelf analytical toolkits such as SAS Enterprise Miner. However, to date, usage of the dtwclust package tends to be confined to academic research, as it requires intermediate R programming skills.

The project aims to provide a user-friendly interface to the dtwclust package using the R Shiny framework. The interface allows casual users to import data and to manage, explore, calibrate, visualise and evaluate clusters without having to type a single line of code. In addition, the application aims to incorporate graph visualization to enhance data exploration, to aid the interpretability of the clustering outputs, and to investigate the similarities or dissimilarities within each cluster.

  • Arief Sulistio
  • Goh I Vy
  • Lim Si Ling Evelyn
Group 6 Logo.png

Gee-Whiz: Singapore and the suppliers that make her tick

Singapore is recognized worldwide for its efficient and clean public service. Ranked 6th in the world in a corruption perceptions index by Transparency International, the incidence of public sector corruption here remains one of the lowest in the world. Efforts to maintain openness and transparency in all government activities can be witnessed throughout the established systems and processes.

GeBIZ is Singapore’s public eProcurement portal for suppliers to bid on tenders published by various agencies and ministries. While public agencies enjoy the economies of scale which come with the electronic purchase of goods and services, suppliers have broader access to government tenders and quotations. GeBIZ encourages greater transparency together with fair and open market competition as all procurement operations are published online.

Using network analysis and other visualization techniques, we will explore the relationships between suppliers and the agencies with which they trade. Are there unknown biases in the tendering process which favor certain suppliers over others? Are there strong relationships between certain ministries and suppliers for specific types of projects? Are some suppliers providing such high-value services to the government as to pose a concentration risk? Join us as we find out.

  • Charu Malik
  • Kateryna Mazurenko
  • Qiao Xueyu

(Group07) Corn: The A-maize-ing Crop

Corn: The A-maize-ing Crop

Corn, or maize as it is called in some countries, was first grown in ancient Central America. Corn has become a staple in many parts of the world, not only providing food but also serving as the raw ingredient for corn ethanol, animal feed, etc. The Corn Belt in the US has about 96,000,000 acres of land just for corn production and is characterised by level land and fertile, highly organic soils.

Corn is known to grow in a wide range of climatic conditions, so it is a challenge to set precise conditions for corn production. Breeders have therefore been experimenting with various types of corn hybrids, each specifically created to have a high yield regardless of the environment it is planted in. Over the years, farmers have used trial and error to identify the best hybrids, planting each of them in different locations with different environmental factors; this process has proven to be slow and not very effective.

Visualisation tools such as geofacet and isoline graphs will give a good overview of which parts of the Corn Belt have better yields. We will use GWmodel to build our prediction model. This model aims to explore the meteorological and geographical factors that make corn the a-maize-ing crop that we know today, which would benefit corn breeders greatly.

  • Tan Le Wen Angelina
  • Pu Yiran
  • Stanley Alexander DION
Group 8 Logo.jpg

Visualizing Future of Crowd Funding with 余额宝

Yu’e Bao (余额宝) is an investment product offered by Alipay (支付宝), a mobile and online payment platform established by China’s multinational conglomerate Alibaba Group. In June 2013, Alibaba Group launched Yu’e Bao, in collaboration with Tianhong Asset Management Co., Ltd., to form the first internet fund in China. Since then, Yu’e Bao has become the nation’s largest money market fund and, by February 2018, had US$251 billion under its management. In Chinese, Yu’e Bao means “Leftover Treasure”. Alipay users can deposit their extra cash, for example, leftover from online shopping, into this investment product. The money is invested via a money market fund with no minimum amount or exit charges, with interest paid on a daily basis. While major banks offer 0.35% annual interest on deposits, Yu’e Bao may offer users 6% interest, with the convenience and freedom to deposit and withdraw at any time via the Alipay mobile app. Thus, Yu’e Bao became extremely popular in China.

Using various data visualization methodologies, coupled with survival and time-series analysis, this project aims to build an interactive tool on the R Shiny framework to unearth the underlying treasures of associations between Yu’e Bao’s user profile, behaviour, time and other financial factors.

  • Wong Yam Yip
  • Wu Jing Long
  • Song Chen Xi

Group09

Visualization on China Stock Market Data

When investors use a trading platform, they must work through a large amount of price data before making a decision, including the opening price, the closing price, and the highest and lowest prices of the day. The K line (candlestick) chart is the most popular visualization of stock data for investors to refer to. However, a new investor who is not yet experienced with or sensitive to financial data may be distracted by the various prices and unable to make an appropriate financial decision. Visualization of stock market data is therefore useful for technical stock market analysis and helps investors gain a comprehensive understanding of how the stock market is changing, which is the analysis objective of this project.

  • Cao Xinjie
  • Chen Jingyi
  • Wang Yixuan

Group10

China Stock Data Visualization

Stock market data is seemingly endless and widely available on the web. The movement of a stock exchange depends on a complex mix of factors and is difficult to predict. Exploring the patterns in stock market data using different data visualization techniques will be very helpful for stock market investors and traders. This project aims to provide advanced data visualization of stock market data to reveal the hidden patterns of market movement.

  • Hou Xuelin
  • Yan Huilin
  • Zhang Yanli

Group11

Visualization Analysis of Citi Bike Data of New York City

Over the past decade, bicycle-sharing systems have been growing in number and popularity in cities across the world. Citi Bike is New York City’s bike share system and the largest in the nation. Citi Bike launched in May 2013 and has become an essential part of the city’s transportation network. It’s fun, efficient and affordable, not to mention healthy and good for the environment. In this project, we will perform an exploratory analysis of the data provided in Citibike_stations and Citibike_trips and illustrate the power of visual analysis using R Shiny. We will build a user-friendly application to help riders find out which station is most convenient for borrowing and returning bikes. From Citi Bike’s point of view, it is critical to understand which stations are the most popular and to consider the geographical distribution of bike stations.

  • Huang Shan
  • Kouhei Takesita
  • Zhang Kexin