Difference between revisions of "Main Page"

From Analytics Practicum

Revision as of 14:44, 24 April 2015

Introduction to Analytics Practicum


The Analytics Practicum module (ANLY482) is a compulsory module for students taking the Analytics Second Major programme. It involves a project that assesses the students' ability to apply analytics extensively to real-world problems. These projects come from both academia and industry. Students also get a good sense of how analytics is used in their field of study.

Welcome to the Analytics Practicum (ANLY482)!

Practicum Projects for AY2014/2015 Term 2

Team Project Name Student Member(s) Project Supervisor Sponsor
Social Media & Public Opinion

Unstructured data is challenging, and unstructured textual data in particular calls for a broader analytical toolkit. This project aims to quantify and study the trends in human emotions expressed by Twitter users over a period of time. The data set provided comprises social media data in the form of tweets published by Singapore-based Twitter users over several months. Each team has to produce its own granular analysis of changes in mood trends, periods of significance (such as weekends or particular weekdays) and other noteworthy, actionable insights arising from the analysis. The results are expected to be presented as a web-based visualization that summarizes the trend of the happiness level over time and allows inspection of the factors associated with the happiness level at a given point in time. It should also provide end users with various drill-down features to choose from while interacting with the system.

Types of analysis: Text mining and analytics is a vast topic so to help you get started we suggest you look into techniques like Sentiment analysis, word stemming, word frequency analysis, social network analysis (centrality, network diameters and density etc.), Influencer analysis and visual analytics techniques. File:ReferenceDocument01.docx
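As a rough illustration of two of the techniques named above (word stemming and word frequency analysis), the following is a minimal Python sketch using only the standard library. The stop-word list, the suffix rules and the sample tweets are hypothetical and chosen for illustration; a real project would more likely use a proper stemmer such as the ones shipped with NLTK.

```python
import re
from collections import Counter

# Hypothetical stop-word list and suffix rules, for illustration only.
STOP_WORDS = {"the", "a", "an", "is", "to", "and", "of", "in", "i", "so", "my"}
SUFFIXES = ("ing", "ed", "s")


def naive_stem(word):
    """Strip one common suffix; a crude stand-in for a real stemmer."""
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are not mangled.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def word_frequencies(tweets):
    """Count stemmed, lower-cased tokens across tweets, ignoring stop words."""
    counts = Counter()
    for tweet in tweets:
        for token in re.findall(r"[a-z']+", tweet.lower()):
            if token not in STOP_WORDS:
                counts[naive_stem(token)] += 1
    return counts


# Made-up sample tweets, not taken from the project data set.
sample_tweets = [
    "Loving the weekend in Singapore",
    "Weekends are so relaxing",
    "Stuck in traffic, hating Mondays",
]
freqs = word_frequencies(sample_tweets)
print(freqs.most_common(3))
```

Grouping the same counts by day or week would give a first, very simple view of how word usage shifts over time.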

Suggested Platforms: SAS EM, Python, R, Gephi, NodeXL (Microsoft)

Libraries to explore: SAS EM sentiment analysis package, NLTK tools & libraries (both Python 2.7 and R), D3.js, C3.js

Other recommendations: No prior coursework is assumed for this project, but prior knowledge of Social and Contextual Analytics and Visual Analytics will help with advanced analysis during project-related tasks.

Recommended Team Composition: Students are free to come up with their own teams but forming a team with diverse backgrounds and skill-sets is highly recommended.

Kean KWOK Jin
Miguel Nicholas
Sherman TAN

Prof. Seema Chokshi

Lecturer of Information Systems, Programme Head, SMU Undergraduate Second Major in Analytics

Prof. KAM Tin Seong

Associate Professor of Information Systems

Senior Advisor, SIS Programmes in Analytics

Palakorn Achananuparp

Research Scientist at LARC

Emergency Department and Queuing Theory

Queuing theory and optimization have long been problems of interest in computer science, operations and analytics. They find applications in traffic engineering, telecommunications, banks, hospitals and many other operations research use cases. Various models and algorithms have been constructed so that queue lengths and waiting times can be predicted.
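To make the idea of predicting queue lengths and waiting times concrete, here is a minimal Python sketch of the textbook M/M/1 queue (a single server with Poisson arrivals and exponential service times). The choice of M/M/1 is an assumption made purely for illustration; the project description does not say which queuing model applies to the emergency department, and the rates below are invented.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state metrics for an M/M/1 queue.

    arrival_rate (lambda) and service_rate (mu) must use the same time unit,
    for example patients per hour.
    """
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable: arrival_rate must be below service_rate")

    utilisation = arrival_rate / service_rate  # rho
    return {
        "utilisation": utilisation,
        # Average number of patients waiting (excluding the one being served).
        "mean_queue_length": utilisation ** 2 / (1 - utilisation),
        # Average time spent waiting before service starts.
        "mean_wait_in_queue": utilisation / (service_rate - arrival_rate),
        # Average total time in the system (waiting plus service).
        "mean_time_in_system": 1 / (service_rate - arrival_rate),
    }


# Hypothetical figures: 8 arrivals per hour, 10 patients served per hour.
metrics = mm1_metrics(arrival_rate=8, service_rate=10)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Real emergency departments have several servers, priorities and time-varying arrivals, which is one reason discrete event simulation is often used alongside closed-form models.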

The current project applies queuing theory and hospital-related domain knowledge to dynamically managing queues in an emergency department.
Suggested Platforms: SAS EM, Python, R
Other recommendations: To get a better understanding of the subject matter, you may first read through the links provided below and materials on queuing theory and discrete event simulation.
Improving Patient Length-of-Stay in Emergency Department Through Dynamic Queue Management
Code Blue
Analysis of hospital bed capacity via queuing theory and simulation
Research on genetic optimization algorithm of queue rules based on simulation model
Smart Priority Queue Algorithms for Self-Optimizing Event Storage
Recommended Team Composition: Students are free to come up with their own teams but forming a team with diverse backgrounds and skill-sets is highly recommended.

Jinq-Yi
Marcus TAN
Muhammad Faris

Prof. Seema Chokshi

Lecturer of Information Systems, Programme Head, SMU Undergraduate Second Major in Analytics

Prof. KAM Tin Seong

Associate Professor of Information Systems

Senior Advisor, SIS Programmes in Analytics

Prof. TAN Kar Way

Assistant Professor of Information Systems (Practice)

Sustainability Jobs: Analyse the trends

The Green Transformation Lab (GTL) is a joint initiative by SMU and DHL aimed at accelerating the evolution of sustainable logistics across Asia Pacific and creating solutions that help companies transform their supply chains, becoming greener, more resource efficient and sustainable. GTL’s Sustainability Heat-map is a group of heat-maps with global profiling of areas and trends on:

a. sustainability-related jobs and
b. sustainability-related topics on Twitter.
This project aims at analyzing the crawled data to get deeper actionable insights and answer questions such as:
a. Is there a trend in the type of sustainability jobs in the various regions, e.g., Europe, Asia, America?
b. Why do sustainability jobs matter? Are they more attractive to potential employees? Or are they policy-driven in some countries?
c. Which is the more important measurement? Number of sustainability jobs per unit GDP or number of sustainability jobs per person? Could there be any other measurements?
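Question (c) compares two ways of normalising job counts. The Python sketch below computes both for two invented regions; every figure is hypothetical and serves only to show that the two measurements need not agree with a ranking by raw job counts.

```python
def jobs_per_million_people(jobs, population):
    """Sustainability jobs per one million inhabitants."""
    return jobs / population * 1_000_000


def jobs_per_billion_gdp(jobs, gdp_usd):
    """Sustainability jobs per one billion US dollars of GDP."""
    return jobs / gdp_usd * 1_000_000_000


# Hypothetical regions and figures, not taken from the GTL data.
regions = {
    "Region A": {"jobs": 1200, "population": 50_000_000, "gdp_usd": 2_000_000_000_000},
    "Region B": {"jobs": 300, "population": 5_000_000, "gdp_usd": 300_000_000_000},
}

for name, r in regions.items():
    per_person = jobs_per_million_people(r["jobs"], r["population"])
    per_gdp = jobs_per_billion_gdp(r["jobs"], r["gdp_usd"])
    print(f"{name}: {per_person:.1f} jobs per million people, "
          f"{per_gdp:.2f} jobs per billion USD of GDP")
```

In this made-up case Region A has four times as many postings, yet Region B scores higher on both normalised measurements, which is the kind of contrast the question asks teams to examine.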
Types of analysis: Students need to understand the term sustainability well, along with its applied meaning in the context of sustainability jobs. This project requires text mining and analytics with sustainability as the key area of focus. You may look into techniques like clustering, noun phrase analysis, social network analysis (centrality, network diameter, density, etc.) and visual analytics techniques. You need to (but are not restricted to) unravel trends in sustainability jobs in various domains across the globe, see what kinds of profiles come under this term and examine whether any patterns emerge. File:ReferenceDocument03.docx
Suggested Platforms: Python, R, Gephi, NodeXL (Microsoft), Tableau

Libraries to explore: SAS EM sentiment analysis package, NLTK tools & libraries (both Python 2.7 and R), D3.js, C3.js
Other recommendations: Students are free to bring in any data inputs which they see fit to support their findings such as any government policy that mandates issuance of sustainability jobs in a particular field.
No prior coursework is assumed for this project, but prior knowledge of Social and Contextual Analytics and Visual Analytics will help with advanced analysis during project-related tasks. To get a better understanding of the subject matter, you may first read through the links and materials provided below.
GT Lab gLab SMU
Recommended Team Composition: Students are free to come up with their own teams but forming a team with diverse backgrounds and skill-sets is highly recommended.

TAN Siong Min
Janice KOH
TAY Hui Shia

Prof. Seema Chokshi

Lecturer of Information Systems, Programme Head, SMU Undergraduate Second Major in Analytics

Prof. KAM Tin Seong

Associate Professor of Information Systems

Senior Advisor, SIS Programmes in Analytics

Prof. TAN Kar Way

Assistant Professor of Information Systems (Practice)

Geospatial Visualisation of Global Consumption Patterns

Arisaig Partners (Asia) Pte Ltd is an independent investment management company established in 1996. Every year, Arisaig Partners holds a Consumer Symposium for potential clients around the globe, and it wishes to increase the effectiveness of its presentation by building an interactive dashboard. The project covers two main research areas:
Macro: Demographics & Economic indicators
Micro: Individual sector matrices

Considering the research areas and sample data, the project is structured as follows:
Macro Visualisation #1: Exploration of Demographic Data Relationship
a. Birth rate per woman
b. Average age of country/region
c. Income
d. Household size & Household Income
Macro Visualisation #2: Exploration of Economic Data
a. GDP, GDP growth rate
b. Debt to GDP ratios, Savings rate
c. Capital Market Size
Micro Visualisations: Exploration of Consumption per Capita

Platforms Considered: d3

Tan Kei Rong Benjamin
Sean Chua Kian Shun
Zoey Teo Kai Ying

Prof. KAM Tin Seong

Associate Professor of Information Systems

Senior Advisor, SIS Programmes in Analytics

Gordon Yeo, Investment Analyst

Arisaig Partners (Asia) Pte Ltd

Network Analysis of Interlocking Directorates

Interlocking directorates have been a topic of interest to researchers and the public for more than a century. Many methods have been developed to analyze this two-mode network, revealing its practical application in diverse fields of study.

In this study, we examine data from the Singapore interlocking directorates network, visualize it using visual analytics tools, determine its patterns and propose some applications based on the visualization and analysis results. Our proposed applications include using the interlocking directorates network for early detection of accounting fraud, as well as to aid the Singapore authorities in urban development planning. The project integrates knowledge we have acquired from accounting, information systems, business administration and social sciences courses, and we hope the idea can be applied in real life.

Key features of our project include:

  1. Visualization of the network of interlocking directorates
  2. Using Patterns of Interlocking Directorates Network to predict Accounting Fraud
  3. Using Interlocking Directorates Network Analysis to aid in urban development planning
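A two-mode network of directors and companies is often analysed by projecting it onto one mode. The Python sketch below shows one such projection under stated assumptions: it links two companies whenever they share a director and weights the link by the number of shared directors. The board-seat records are invented, and this is not necessarily how the project team (which lists NodeXL and Gephi) performs the analysis.

```python
from collections import defaultdict
from itertools import combinations


def project_to_company_network(board_seats):
    """Turn (director, company) pairs into weighted company-company links.

    Returns a dict mapping a sorted (company_a, company_b) pair to the
    number of directors the two companies share.
    """
    companies_by_director = defaultdict(set)
    for director, company in board_seats:
        companies_by_director[director].add(company)

    shared_directors = defaultdict(int)
    for companies in companies_by_director.values():
        # Every pair of boards a director sits on is one interlock.
        for a, b in combinations(sorted(companies), 2):
            shared_directors[(a, b)] += 1
    return dict(shared_directors)


def interlock_degree(links):
    """Number of distinct companies each company is interlocked with."""
    degree = defaultdict(int)
    for a, b in links:
        degree[a] += 1
        degree[b] += 1
    return dict(degree)


# Hypothetical board seats, for illustration only.
board_seats = [
    ("Director 1", "Company A"),
    ("Director 1", "Company B"),
    ("Director 2", "Company B"),
    ("Director 2", "Company C"),
    ("Director 3", "Company A"),
    ("Director 3", "Company B"),
]
links = project_to_company_network(board_seats)
print(links)
print(interlock_degree(links))
```

The resulting edge list can be exported to a tool such as Gephi or NodeXL for layout and centrality measures.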

Technologies considered: NodeXL, Gephi

Le Hoang Trinh
Zheng Tianwei

Prof. KAM Tin Seong

Associate Professor of Information Systems

Senior Advisor, SIS Programmes in Analytics

Prof. KAM Tin Seong

Associate Professor of Information Systems

Senior Advisor, SIS Programmes in Analytics

GeoVisual Analytics Tool for population health analysis

The Health Promotion Board (HPB) was established to promote national health in Singapore. It looks after the entire population, making sure that health care is available to Singapore citizens when they need it. One of HPB’s objectives is to make health care facilities accessible to everyone, and this project aims to define the accessibility of these facilities to the Singapore population.

We aim to show the distribution of the Health Promotion Board’s health care agencies and determine the accessibility of each agency. We will visualize the distribution of facilities using Geographical Information Systems (GIS) and then zoom in to look at individual regions and facilities.

Key features of our project include:

  1. Facility and Population distribution visualisation
  2. Dashboard to filter regions and facilities
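One simple way to quantify accessibility is the straight-line distance from a residential location to its nearest facility. The Python sketch below uses the haversine great-circle formula for this. It is an illustrative assumption rather than the project's stated method (travel time or network distance in QGIS or PostGIS may be more appropriate), and the facility names and coordinates are made up.

```python
import math

EARTH_RADIUS_KM = 6371.0  # Mean Earth radius.


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_facility(location, facilities):
    """Return (name, distance_km) of the facility closest to location."""
    lat, lon = location
    return min(
        ((name, haversine_km(lat, lon, f_lat, f_lon))
         for name, (f_lat, f_lon) in facilities.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical facilities and resident location, for illustration only.
facilities = {
    "Clinic North": (1.43, 103.79),
    "Clinic East": (1.35, 103.94),
}
name, distance = nearest_facility((1.42, 103.80), facilities)
print(f"Nearest facility: {name} ({distance:.2f} km)")
```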


Technologies considered: QGIS, PostGIS, R

Song Chengyue
Wang Jing

Prof. KAM Tin Seong

Associate Professor of Information Systems

Senior Advisor, SIS Programmes in Analytics

Health Promotion Board

Prof. KAM Tin Seong

Associate Professor of Information Systems

Senior Advisor, SIS Programmes in Analytics

Time-series Analysis on Singapore Public Transportation Train Network

The adoption of Ezlink smart card technology allows transportation analysts to discover new insights into the consumption patterns and lifestyles of commuters in the transportation network. As smart cards contain rich data and all transactions are recorded in temporal sequence, they offer an opportunity to analyse complex and voluminous time-series data using time-series data mining techniques. This is particularly interesting because there is a need to transform this rich data into actionable information and knowledge that users can understand. Therefore, this project seeks to explore the problems of the transportation network, validate the findings against the current implementation of the free rides policy, and discuss the use of time-series data mining techniques to obtain insights that show whether the policy matches the findings.
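A common first step in turning smart card transactions into a time series is to aggregate them into counts per station and time bucket. The Python sketch below does this by hour. The record layout (station name plus a timestamp string) and the sample values are assumptions made for illustration; the actual Ezlink data format is not described here.

```python
from collections import Counter
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # Assumed format of the tap-in time.


def hourly_tap_ins(records):
    """Count tap-ins per (station, hour of day) from (station, timestamp) pairs."""
    counts = Counter()
    for station, timestamp in records:
        hour = datetime.strptime(timestamp, TIMESTAMP_FORMAT).hour
        counts[(station, hour)] += 1
    return counts


def peak_hour(counts, station):
    """Hour of day with the most tap-ins at a station, or None if it has no data."""
    by_hour = {hour: n for (s, hour), n in counts.items() if s == station}
    if not by_hour:
        return None
    return max(by_hour, key=by_hour.get)


# Hypothetical transactions, for illustration only.
records = [
    ("Station X", "2015-03-02 07:10:00"),
    ("Station X", "2015-03-02 08:05:00"),
    ("Station X", "2015-03-02 08:40:00"),
    ("Station Y", "2015-03-02 18:15:00"),
]
counts = hourly_tap_ins(records)
print(counts)
print(peak_hour(counts, "Station X"))
```

Comparing such hourly profiles before and after a policy change is one simple way to check whether ridership shifted towards the intended time window.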

Koh Ying Ying Trecia
Luqman Haqim Bin Ab Rahman

Prof. KAM Tin Seong

Associate Professor of Information Systems

Senior Advisor, SIS Programmes in Analytics

Prof. KAM Tin Seong

Faculty Staff of Learning Analytics Research Centre (LARC)

Come back after 30 days!

At the moment of discharge from hospital,
Nurse: "Thank you for choosing our hospital, we hope that you have a speedy recovery!"
Patient: "Thank you, I feel much more at ease if I stay under the care of your hospital staff. If I feel the slightest discomfort, I will come back immediately, ok!"
Nurse: *with a horrified face* "Oh.. uh.. actually, if it's really something minor, you do not need to come back. But well, we can't stop any patients from visiting us... repeatedly"
Patient: "Yeah, so that's it. I will come back whenever I do not feel at ease, even though it may eventually turn out to be minor. You know, just gotta be safe."

Hospitals have been studying the likelihood of patients being readmitted within 30 days of discharge, primarily to reduce costs and operational overhead. It has grown into a governmental concern, as health systems, especially in the UK, have decided to incentivize hospitals that abide by the rules and penalize those that do not manage to reduce their number of 30-day readmissions. Furthermore, for governments with welfare systems that provide for health care, patients readmitted within 30 days may be an avoidable expense if hospitals are able to identify such patients during the first diagnosis. At the same time, various researchers have claimed that their models are superior to those of other studies. Even though the LACE (Length of stay, Acuity of admission, Comorbidity, Emergency department visits) index has been acknowledged as the gold standard in predicting 30-day readmission rates, authors have claimed that their models perform better, often with caveats.

The outcomes of such a study allow hospitals to identify patients with a higher risk of readmission and prescribe interventions, in the form of house visits or a wide-spectrum treatment, to mitigate the problem. Data analytics, especially predictive analytics, enables hospitals to do so. If effectively implemented, hospitals can reduce their costs and focus their limited resources on preventing avoidable readmissions. Consequently, hospitals may be able to stretch their resources to care for a larger number of patients, instead of serving readmitted patients.

Objective: To predict the likelihood of a patient being readmitted within 30 days using two models (i.e. decision tree and multivariable logistic regression), along with their corresponding ROC curves to determine predictive power.
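The predictive power read off an ROC curve is usually summarised as the area under the curve (AUC). As a minimal sketch, the Python function below computes AUC directly from its rank interpretation: the probability that a randomly chosen readmitted patient receives a higher predicted score than a randomly chosen non-readmitted patient. The labels and scores are invented, and the project itself lists SAS EM, JMP and RapidMiner, which report this figure directly.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for readmitted within 30 days, 0 otherwise.
    scores: model-predicted readmission risk, higher means riskier.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # Positive case ranked above negative case.
            elif p == n:
                wins += 0.5   # Ties count as half.
    return wins / (len(positives) * len(negatives))


# Hypothetical outcomes and predicted risks, for illustration only.
labels = [1, 0, 1, 0, 0]
scores = [0.90, 0.40, 0.35, 0.20, 0.10]
print(f"AUC = {roc_auc(labels, scores):.3f}")
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect separation, so the same function can be used to compare the decision tree and logistic regression scores on a held-out set. The pairwise loop is quadratic in the number of cases, which is fine for a sketch but slow for large data sets.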

Suggested Platforms: SAS EM, JMP, RapidMiner, Tableau


Nicholas Lee Desheng
Goh Jian Hao

Prof. KAM Tin Seong

Associate Professor of Information Systems

Senior Advisor, SIS Programmes in Analytics


Prof. KAM Tin Seong

Associate Professor of Information Systems

Senior Advisor, SIS Programmes in Analytics

GLC

GLC is an international postal and logistics company with a global network spanning over 220 countries and territories. The company offers a wide range of services such as international express deliveries; global freight forwarding by air, sea, road and rail; warehousing solutions from packaging, to repairs, to storage; mail deliveries worldwide; and other customised logistics services. In recent years, the company’s sales volume and revenue in the Asia-Pacific have grown rapidly. However, the company is also experiencing increasing competition. In order to stay competitive in the Asia-Pacific market, besides producing cutting-edge products, the company believes that it is also very important to understand consumers’ needs.

Objective and task: GLC wishes to maximise its sales revenue and market share in 2015 by formulating appropriate product strategies and distribution channel policies based on the analysis of historical sales data. Actionable recommendations may be useful in helping the management to meet the business objectives and to shape this aspect of its business strategy and operations.

CHENG Fu Mei
LEONG Wai Sum
Lynette SEOW Hui Xin

Prof. KAM Tin Seong

Associate Professor of Information Systems

Senior Advisor, SIS Programmes in Analytics

GLC (anonymous)

Prof. KAM Tin Seong

Associate Professor of Information Systems

Senior Advisor, SIS Programmes in Analytics

Practicum Projects for AY2014/2015 Term 1

Team Project Name Student Member(s) Project Supervisor Sponsor
Kolaveri Di Social Analytics Project

"This Kolaveri Di" is a Tamil song from the soundtrack of Tamil film 3. It was written and sung by actor Dhanush and composed by music director Anirudh Ravichander. The song was officially released on 16 November 2011, and it instantly went viral on social networking sites for its quirky "Tanglish" lyrics. Soon, the song became the most searched YouTube video in India and an internet phenomenon across Asia. Within a few weeks, YouTube honoured the video with a Recently Most Popular Gold Medal Award for receiving a large number of hits in a short time. The objective of this project is to identify the key element(s) that explain the success of this video, particularly its ability to draw listeners and spread virally online. The analysts are required to submit a report detailing these elements, along with some recommendations that could help to replicate its success.
  1. Lee Jaehyun
  2. Chan Wei Yin
Prof. Seema Chokshi

Lecturer of Information Systems, Programme Head, SMU Undergraduate Second Major in Analytics

Prof. Srinivas K Reddy

Professor of Marketing, Director, Centre for Marketing Excellence, Academic Director, LVMH-SMU Asia Luxury Brand Research Initiative, Area

Visualization of Consumer Satisfaction

Consumer research has been a hot topic. Businesses and government agencies are interested in knowing the satisfaction levels of Singaporean consumers so that they can take actions that create valuable and meaningful impact in society. This project explores these satisfaction levels. It uses respondent-level data from the Satisfaction Index of Singapore (2008-2013). The objective of this project is to produce a dashboard that shows trends in consumer satisfaction visually.
  1. Mohamed Yousof Bin Shamsul Hameed
  2. Kee Eng Sen
Prof. Seema Chokshi

Lecturer of Information Systems, Programme Head, SMU Undergraduate Second Major in Analytics

Prof. Marcus Lee

Assistant Professor of Marketing (Practice), Academic Director for the Institute of Service Excellence at SMU (ISES)

Twitter Analytics

The background of the project is horizon scanning: creating an analytical platform that scans online sources (social/established data) to identify upcoming topics and keyword clusters. The objective is not to stop at creating the cloud, but to provide a time-series analysis and forecast of the ‘relevance’ of a topic over the course of X number of days.
  1. Fransisca Fortunata
Prof. Seema Chokshi

Lecturer of Information Systems, Programme Head, SMU Undergraduate Second Major in Analytics

David Hardoon

Head, Analytics, SAS Institute Pte Ltd, Singapore

Grading

Project Proposal

Update of Wikipage: 1%

Proposal Report: 14%

Mid-Term Presentation and Report

Mid-Term Presentation: 10%

Mid-Term Report: 15%

Mid-Term Update of Wikipage: 5%

Final Presentation and Report

Final Report: 25%

Final Presentation: 15%

Project Poster: 10%

Update of Wikipage: 5%
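The component weights listed above add up to 100%. As a small worked example, the Python sketch below checks that total and combines per-component marks into an overall mark. The component keys are labels invented here to tell the two "Update of Wikipage" items apart, and the sample marks are hypothetical.

```python
# Weights in percent, as listed in the grading breakdown.
WEIGHTS = {
    "Proposal: Update of Wikipage": 1,
    "Proposal Report": 14,
    "Mid-Term Presentation": 10,
    "Mid-Term Report": 15,
    "Mid-Term Update of Wikipage": 5,
    "Final Report": 25,
    "Final Presentation": 15,
    "Project Poster": 10,
    "Final: Update of Wikipage": 5,
}


def overall_mark(marks):
    """Weighted average of component marks, each given on a 0-100 scale."""
    missing = set(WEIGHTS) - set(marks)
    if missing:
        raise ValueError(f"missing marks for: {sorted(missing)}")
    return sum(WEIGHTS[name] * marks[name] for name in WEIGHTS) / 100


total_weight = sum(WEIGHTS.values())
print(f"Total weight: {total_weight}%")

# Hypothetical marks: 80 for every component except the final report.
sample_marks = {name: 80 for name in WEIGHTS}
sample_marks["Final Report"] = 60
print(f"Overall mark: {overall_mark(sample_marks):.1f}")
```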