Difference between revisions of "T15 Overview"

From Analytics Practicum

Revision as of 22:35, 28 February 2016


Project Introduction

Introduction of PISA

The Programme for International Student Assessment (PISA) is an international survey that evaluates education systems worldwide by testing the skills and knowledge of 15-year-old students. To date, students representing more than 70 economies have participated in the assessment. The most recently published results are from the 2012 assessment.

Around 510,000 students in 65 economies took part in the PISA 2012 assessment of reading, mathematics and science, representing about 28 million 15-year-olds globally. Because PISA is an ongoing triennial survey, countries and economies that participate in successive surveys can compare their students' performance over time and assess the impact of education policy decisions.

Every three years since 2000, fifteen-year-old students from randomly selected schools worldwide have taken tests in three key subjects (reading, mathematics and science), with one subject as the focus of each assessment cycle. Each student takes a test that lasts 2 hours. The tests mix open-ended and multiple-choice questions, organized in groups based on a passage that sets out a real-life situation. In total, the test items cover about 390 minutes, so different students take different combinations of tests. The students and their school principals also answer questionnaires that provide information about the students' backgrounds, schools and learning experiences, and about the broader school system and learning environment.

Project Introduction

Our project uses the PISA data collected for Singapore during the latest survey, in 2012. The aim of the project is to explore the relationship between computer use in school and the performance of secondary-school students in reading and mathematics. Building on the international work already done by PISA, our project brings the analysis to the Singapore national level and studies various aspects of student performance relative to students' access to computers in and outside of school, in order to provide insights for education policy makers at Singapore's Ministry of Education (MOE).

Business Problem

The Ministry of Education (MOE) of Singapore collects and analyses data from schools island-wide to continually improve policies and practices in education. However, most of these data are not publicly available for research and analysis by those outside the Ministry. Hence, the sponsor seeks to gain insights about education in Singapore from the publicly available data collected by the OECD through its “Programme for International Student Assessment” (PISA) survey. PISA is a triennial international survey that evaluates education systems worldwide by testing the skills and knowledge of 15-year-old students. The most recently published results are from the 2012 assessment.

Project Objectives

This project is a follow-up to an IS480 project by team Cinquefoil. Our aim is to improve the KTPH dashboard by adopting a richer set of visualization techniques, so as to enable a more user-centric data querying and discovery process. KTPH users will be able to use the dashboard to identify unhealthy individuals in the population and the areas they are in, take appropriate action, and monitor the results of that action.

As such, the objectives of our analytics project consist of the following:

  • To visualize effectively the current health condition of the public across various regions of Singapore
  • To allow health officers to track the health progress of individuals at risk
  • To assist health officers in monitoring the penetration rate of KTPH alignment programs targeted at the general public
  • To allow users to interact with visualizations, thereby forming their own query and arriving at their own findings

Data

The data are provided by the KTPH Health Population team and consist of 6,744 patient records with the following attributes:

Demographics

  • Gender
  • Age/Age group
  • Race
  • Education level
  • Occupation
  • Home address

Health measurements

  • Weight
  • Height
  • Waist
  • BMI
  • Glucose measure
  • Cholesterol level
  • Blood pressure (systolic and diastolic)
  • Instances of strokes, heart attacks, diabetes
  • Other health measurements

Lifestyle

  • Smoking habit
  • Stress level
  • Exercise
  • Diet

Intervention records

  • Nurse intervention
  • Doctor outcome
  • Doctor revisit
  • Follow up at clinics

Sample dataset

Methodology

Technology

As KTPH prefers a versatile tool that the Health Population team can use without complex setup or installation, d3.js was used to develop a web application served from an Apache server. D3.js is a JavaScript library for developing visualizations on the web. It renders visualizations as SVG objects, which allows for more flexibility; SVG objects are also scalable and display well on mobile devices. Because it is a JavaScript library, it runs in all modern browsers without users having to install additional software.

In addition, we propose to explore the dc.js library, which is closely related to d3.js. Dc.js is built on top of d3.js and the crossfilter library; it allows effective cross-filtering across different charts, with fast filtering and re-aggregation when a selection changes. This addition will boost the story-telling capability of the current dashboard and allow users to formulate their own queries in the process of data discovery.
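
The cross-filtering idea can be illustrated with a minimal sketch in plain JavaScript. It does not use the dc.js API; the function `countByDimension`, the `filters` object and the record fields (`gender`, `smoker`) are hypothetical names chosen for illustration. The sketch assumes the usual cross-filter convention that a chart's counts respect the filters set on every other chart but not the filter on its own dimension.

```javascript
// Illustrative only: hand-rolled cross-filtering, not the dc.js API.
// Field names such as "gender" and "smoker" are assumptions.
function countByDimension(records, filters, groupBy) {
  const counts = {};
  for (const record of records) {
    // A chart ignores the filter on its own dimension, so selecting one bar
    // does not make the other bars of the same chart disappear.
    const passes = Object.entries(filters).every(
      ([dimension, accept]) => dimension === groupBy || accept(record[dimension])
    );
    if (passes) {
      const key = record[groupBy];
      counts[key] = (counts[key] || 0) + 1;
    }
  }
  return counts;
}

// Made-up sample records, not taken from the KTPH dataset.
const sampleRecords = [
  { gender: "F", smoker: "yes" },
  { gender: "F", smoker: "no" },
  { gender: "M", smoker: "yes" },
  { gender: "M", smoker: "yes" },
];

// The user selects "F" on the gender chart.
const activeFilters = { gender: (value) => value === "F" };

// The smoker chart is recomputed for female residents only,
// while the gender chart still shows both groups.
const smokerCounts = countByDimension(sampleRecords, activeFilters, "smoker");
const genderCounts = countByDimension(sampleRecords, activeFilters, "gender");
console.log(smokerCounts, genderCounts);
```

In a real dashboard this bookkeeping would be delegated to the library rather than written by hand; the sketch only shows why a selection on one chart changes the counts on the others.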

Visualization

Treemap

A treemap is a powerful tool that simultaneously shows the big picture, compares related items and allows navigation to the details. One important aspect of healthcare visual analytics is the ability to drill down to details for further investigation. Using a treemap to show the health indicators, as in the example below, gives users a bird's-eye view, so that they can observe patterns among the indicators before drilling down to study the details. This technique will be used in the Screening Result module.


Treemap.png
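
As a rough sketch of the data preparation a treemap needs, the plain JavaScript below nests flat screening records into a tree whose node values are resident counts, so that each rectangle's area could be drawn in proportion to `value`. The function `buildTreemapData` and the fields `region` and `bmiStatus` are hypothetical; the `{ name, value, children }` node shape is an assumption, chosen because hierarchical layouts commonly accept something similar.

```javascript
// Illustrative only: nests flat records into a tree with counts per node.
// "region" and "bmiStatus" are assumed field names, not the real schema.
function buildTreemapData(records, levels) {
  const root = { name: "All residents", value: 0, children: [] };
  for (const record of records) {
    let node = root;
    node.value += 1;
    // Walk down one level per grouping field, creating nodes on demand.
    for (const level of levels) {
      const name = record[level];
      let child = node.children.find((c) => c.name === name);
      if (!child) {
        child = { name, value: 0, children: [] };
        node.children.push(child);
      }
      child.value += 1;
      node = child;
    }
  }
  return root;
}

// Made-up sample records for demonstration.
const screeningRecords = [
  { region: "North", bmiStatus: "Normal" },
  { region: "North", bmiStatus: "Overweight" },
  { region: "North", bmiStatus: "Overweight" },
  { region: "East", bmiStatus: "Normal" },
];

const tree = buildTreemapData(screeningRecords, ["region", "bmiStatus"]);
console.log(JSON.stringify(tree, null, 2));
```

Drilling down then amounts to re-rendering the treemap with one of the child nodes as the new root.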

Parallel Coordinates

This technique can be used to analyze multiple clinical variables. Each axis represents one numerical clinical variable (e.g. BMI, cholesterol level, systolic and diastolic levels). Users can look at the lines and quickly spot any sample line that falls outside the normal range. A separate line representing the national average could be used as a benchmark; alternatively, an expert-defined healthy level for each indicator could be used. This technique should be used at the intermediate level of drill-down so that the number of lines does not grow too large and clutter the chart.


Parallel coordinates.png
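
The idea of spotting lines that leave the normal range can be sketched in plain JavaScript as follows. The reference ranges here are placeholders for illustration only and are not clinical guidance; in practice they would come from the national average or expert-defined healthy levels mentioned above. The names `referenceRanges`, `axisPosition` and `axesOutOfRange` are hypothetical.

```javascript
// Placeholder reference ranges for illustration only, NOT clinical guidance.
const referenceRanges = {
  bmi: { min: 18.5, max: 25 },
  systolic: { min: 90, max: 120 },
  diastolic: { min: 60, max: 80 },
};

// Map a raw value onto a 0..1 position along its axis, so that variables
// with different units can share one chart height.
function axisPosition(value, axisMin, axisMax) {
  return (value - axisMin) / (axisMax - axisMin);
}

// Return the axes on which a record falls outside its reference range.
function axesOutOfRange(record, ranges) {
  return Object.keys(ranges).filter((axis) => {
    const { min, max } = ranges[axis];
    return record[axis] < min || record[axis] > max;
  });
}

// Made-up resident record for demonstration.
const resident = { bmi: 27.3, systolic: 118, diastolic: 85 };
const flaggedAxes = axesOutOfRange(resident, referenceRanges);
console.log(flaggedAxes);
```

A line with a non-empty list of flagged axes could then be drawn in a highlight colour, while the remaining lines stay muted to limit clutter.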

Chord Visualization

This chart is used to study the association between clinical variables. Clinical variables often have some relationship with one another; for example, a patient whose BMI is in the overweight range is more likely to have a high cholesterol level and a higher risk of diabetes. Chord visualization allows data exploration that reveals such a pattern, and potentially helps to identify individuals at risk of diseases like diabetes based on their other health indicators.


Chord visualisation.png
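
A chord diagram is typically driven by a square matrix in which entry (i, j) measures the strength of the link between item i and item j. The plain JavaScript sketch below builds such a matrix by counting residents who have both conditions. The condition flags (`overweight`, `highCholesterol`, `diabetes`) and the function `coOccurrenceMatrix` are hypothetical, and simple co-occurrence counting is only one possible measure of association.

```javascript
// Hypothetical condition flags; the real dataset's field names may differ.
const conditions = ["overweight", "highCholesterol", "diabetes"];

// Count, for every pair of conditions, how many records have both.
// The diagonal is left at zero because a condition is not linked to itself.
function coOccurrenceMatrix(records, names) {
  const matrix = names.map(() => names.map(() => 0));
  for (const record of records) {
    for (let i = 0; i < names.length; i++) {
      for (let j = 0; j < names.length; j++) {
        if (i !== j && record[names[i]] && record[names[j]]) {
          matrix[i][j] += 1;
        }
      }
    }
  }
  return matrix;
}

// Made-up sample records for demonstration.
const patients = [
  { overweight: true, highCholesterol: true, diabetes: false },
  { overweight: true, highCholesterol: true, diabetes: true },
  { overweight: false, highCholesterol: false, diabetes: true },
];

const chordMatrix = coOccurrenceMatrix(patients, conditions);
console.log(chordMatrix);
```

The resulting matrix is symmetric, so each ribbon in the chart would have the same width at both ends.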

Funnel Plot

A funnel plot is essentially a scatter plot with two sets of boundary lines: one set for 95% confidence and one for 99.8% confidence. Points that lie inside the boundaries reflect random variation that happens by chance, whereas points that lie outside the boundaries will be highlighted as non-random variations that are extremely rare and should be examined more closely. In our case, the data points will represent households, the x-axis is the percentage of the population above a certain age, and the y-axis is the percentage of that population that responds to an alignment program. The chart can thus show the penetration rate of KTPH health initiatives to improve public health.


Funnel plot.png
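
One common way to draw the two sets of boundary lines is a normal approximation around the overall response rate, with limits that narrow as the number of eligible residents behind a point grows. The plain JavaScript sketch below follows that approach. It is an assumption rather than the method confirmed for this project: the function names are hypothetical, and it presumes that each point carries a count of eligible residents and a count of responders.

```javascript
// z-values for two-sided limits under a normal approximation.
const Z_95 = 1.96; // 95% limits
const Z_998 = 3.09; // 99.8% limits

// Control limits around the overall rate for a point based on n residents.
function funnelLimits(overallRate, n, z) {
  const standardError = Math.sqrt((overallRate * (1 - overallRate)) / n);
  return {
    lower: Math.max(0, overallRate - z * standardError),
    upper: Math.min(1, overallRate + z * standardError),
  };
}

// Decide which band a point falls into, checking the outer limits first.
function classifyPoint(responders, eligible, overallRate) {
  const rate = responders / eligible;
  const inner = funnelLimits(overallRate, eligible, Z_95);
  const outer = funnelLimits(overallRate, eligible, Z_998);
  if (rate < outer.lower || rate > outer.upper) return "outside 99.8%";
  if (rate < inner.lower || rate > inner.upper) return "outside 95%";
  return "within limits";
}

// Made-up numbers: 61 responders out of 100 eligible, overall rate 50%.
console.log(classifyPoint(61, 100, 0.5)); // prints "outside 95%"
```

Points classified as "outside 99.8%" are the ones the text above describes as extremely rare and worth a closer look.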

Geospatial Intelligence

The current version does not show the percentage of households participating in KTPH health initiatives; instead, it shows the absolute number of households reached. We will modify the module's current OpenStreetMap view to reflect the percentage and penetration rate by region.
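
The change from absolute counts to percentages can be sketched as a simple aggregation in plain JavaScript. The function `penetrationByRegion` and the fields `region` and `participated` are hypothetical names; the actual household data may be structured differently.

```javascript
// Illustrative only: "region" and "participated" are assumed field names.
function penetrationByRegion(households) {
  const totals = new Map();
  for (const household of households) {
    const entry = totals.get(household.region) || { reached: 0, total: 0 };
    entry.total += 1;
    if (household.participated) entry.reached += 1;
    totals.set(household.region, entry);
  }
  // Report both the absolute count and the percentage for each region.
  return [...totals].map(([region, entry]) => ({
    region,
    reached: entry.reached,
    total: entry.total,
    ratePct: (100 * entry.reached) / entry.total,
  }));
}

// Made-up sample households for demonstration.
const households = [
  { region: "North", participated: true },
  { region: "North", participated: false },
  { region: "North", participated: true },
  { region: "North", participated: false },
  { region: "East", participated: true },
];

const rates = penetrationByRegion(households);
console.log(rates);
```

In this made-up sample the North region reaches more households in absolute terms (2 versus 1) but has the lower penetration rate (50% versus 100%), which is the distinction the modified map view is meant to expose.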

Scope of Work

The visualization should allow KTPH users in the Health Population team to see an overview of public health conditions, based on screening results, and then drill down to the region and patient-group level to investigate the various factors that contribute to the status quo. Users can also examine the possible correlations between these factors.

The components to be examined and improved are:

Screening Result Module

  • Stratification & Visual Presentation of Health Screening Results

Health Classification Module

  • Health Classification
  • Risk Analysis for Disease
  • Summary of Unhealthy Screening Results

Geospatial Intelligence Module

  • Public Health Screening Penetration Rate
  • Public Health Status Ratio

Repeat Analysis Module (secondary)

  • Flow Analysis of Population Health Screening Results
  • Trend Analysis of Key Health Indicators

Patient Journey Module (secondary)

  • Individual Resident Progress View
  • Temporal Event Sequence Analysis
