John Bau

From Visual Analytics for Business Intelligence
Revision as of 16:54, 13 November 2019 by Sunho.lee.2017

John Bau 300.gif



Problem

South Korea, despite being one of the most rapidly growing economies in the world, has the highest suicide rate in the world. According to OECD statistics, 25.6 out of every 100,000 people commit suicide (OECD, 2018), much higher than the OECD average of 11.6. In other words, suicide claims about 36 people per day, or 13,092 people per year.
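
The figures quoted above can be cross-checked with simple arithmetic. The sketch below uses only the numbers from the paragraph; the 365-day year and the variable names are our own assumptions, and the implied population is a derived estimate rather than an official statistic.

```python
# Figures quoted in the Problem statement (OECD, 2018).
RATE_PER_100K = 25.6      # suicides per 100,000 people
DEATHS_PER_YEAR = 13_092  # suicides per year
OECD_AVERAGE = 11.6       # OECD average per 100,000 people

# Assumption: a 365-day year.
deaths_per_day = DEATHS_PER_YEAR / 365

# Population implied by the quoted rate and yearly total (derived, not official).
implied_population = DEATHS_PER_YEAR / RATE_PER_100K * 100_000

# How many times higher the quoted rate is than the OECD average.
ratio_to_oecd = RATE_PER_100K / OECD_AVERAGE

print(f"Deaths per day: {deaths_per_day:.1f}")
print(f"Implied population: {implied_population / 1e6:.1f} million")
print(f"Ratio to OECD average: {ratio_to_oecd:.2f}")
```

Rounded to the nearest whole number, the daily figure agrees with the 36 people per day stated in the text.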

Hence, our group would like to analyse multiple datasets covering factors such as gender, social status, and age to understand the underlying social issues in South Korea that lead to suicide.

Every number in the datasets represents a precious life and affects family and loved ones. Throughout our journey, we hope to derive meaningful insights from the datasets to highlight the severity of suicide as a social issue in South Korea, one that the government and society tend to overlook.

Motivation

The blockbuster Korean drama Sky Castle was the source of inspiration for the initial idea of our project. The drama portrays the heavy weight that Korea’s education system and socioeconomic status (SES) carry in measuring an individual’s success, both of which are deeply ingrained in Korean culture. When the pressure to conform to the expectations of family and society becomes too great, suicide is increasingly seen as a way to escape those expectations.

Our motivation is to review the pattern of suicide numbers over the years across several factors, such as occupation and income. The Korean government's website does provide a basic suicide visualisation; however, it considers only one dataset at a time and shows no correlations. Hence, it is essential to identify the relevant factors and how each of them correlates with the suicide rate.

Objectives

  • To analyse the suicide growth rate over the years.
  • To analyse and compare which regions are impacted the most and the least.
  • To understand how significantly each factor affects the suicide rate.
  • To correlate the causes of suicide with the suicide growth rate.
  • To understand which demographic groups, by age and gender, committed suicide.
  • To analyse, for each demographic group, the relationship with the cause of suicide in order to discover the most common reasons.
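
Two of the objectives above rest on simple calculations: a year-over-year growth rate and a correlation between a factor and the suicide rate. The following is a minimal sketch of both, assuming Pearson's coefficient as the correlation measure (the proposal does not name one). The function names are hypothetical and the numbers are placeholders, not values from the datasets.

```python
from math import sqrt


def growth_rates(values):
    """Year-over-year growth in percent between consecutive values.

    Assumes no value is zero, since each value is used as a divisor.
    """
    return [(curr - prev) / prev * 100 for prev, curr in zip(values, values[1:])]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes neither sequence is constant (non-zero variance).
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Placeholder numbers for illustration only; NOT taken from the datasets.
suicide_rate = [20.0, 22.0, 21.0, 24.0]  # hypothetical rate per 100,000
factor = [3.0, 3.5, 3.1, 3.9]            # hypothetical explanatory factor

print([round(g, 1) for g in growth_rates(suicide_rate)])
print(round(pearson(suicide_rate, factor), 3))
```

A coefficient close to 1 or -1 would indicate a strong linear association, but it would not by itself show that the factor causes the change in the suicide rate.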

Selected Datasets

The datasets we will use for our analysis and our application are listed below:

Dataset/Source Data Attributes Rationale Of Usage

impulse_to_commit_suicide
(2008 - 2018)
(13 years and above)

  • Type(1)
  • Type(2)
  • Year

This dataset will be used to understand the reasons behind suicidal thoughts in South Korea from 2008 to 2018.

suicide_rate_region
(1998 - 2017)

  • By Region
  • Gender
  • Year

This dataset will be used to understand the suicide rate in the different major cities in Korea. It will also give us insight into the number of suicides.

suicide_by_occupation
(1993 - 2017)

  • Cause of death
  • Gender
  • Occupation
  • Unit
  • Year

This dataset will be used to understand the occupations of those who committed suicide in Korea from 1993 to 2017. We will be able to gain descriptive insights into which occupations tend to have higher suicide rates.

suicide_attampt_student
(2005 - 2018)

  • Gender
  • Respondent Characteristics
  • Respondent Characteristic
  • Year

This dataset will be used to understand the general demographics of students at different education levels who attempted suicide in Korea from 2005 to 2018. We will be able to gain descriptive insights into student demographics by gender and education level.

suicide_by_age
(1983 - 2017)

  • Reasons
  • Gender
  • Age
  • Year

This dataset will be used to understand the age groups of those who committed suicide in Korea from 1983 to 2017. We will be able to gain descriptive insights into how the age trend of suicides has changed over the years.

suicide_marital_status
(1983 - 2017)

  • Reason for Death
  • Gender
  • Marital status
  • Year

This dataset will be used to understand the suicide rate by marital status in Korea. We will be able to gain descriptive insights into how suicides within each marital status have trended over the years.

suicide_intension_survey
(2010 - 2017)
  • Suicide Intent
  • Response
  • Year
  • Year

This dataset will be used to understand survey responses on suicide intention in Korea from 2010 to 2017. We will be able to gain descriptive insights into how stated suicide intent has trended over the years.

Background Survey

Examples Takeaways
Jb bs 01.png
A line chart is used to detect anomalies in the trend. However, because absolute values are used, the chart shows only slight fluctuations and a relatively constant trend.

Jb bs 02.png

Raw values tend to inflate the apparent number of suicide attempts.

Jb bs 03.png

Absolute values are used to visualise the data. KOSIS generally uses bar or pie charts, so its visualisations are either unattractive or overly simple.

Jb bs 04.png

The colours do not represent any meaningful information. The total is also combined into the same chart as the individual categories.

Jb bs 05.png

Our World in Data uses a scatter chart with a trend line to visualise the violence rate. The size of each circle also represents the violence rate.

Jb bs 06.png

The chart displays the top 10 scams and the change from 2017 to 2018. It uses stock-market colours to differentiate increases (red) from decreases (green). The handcuffs image helps to signal a cause for concern.
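
Several of the takeaways above concern absolute values hiding or distorting the picture. One common remedy is to normalise counts by population, as the OECD figure of 25.6 per 100,000 does. The sketch below illustrates the idea; the function name, the regions and all figures are hypothetical placeholders, not data from KOSIS or our datasets.

```python
def rate_per_100k(deaths, population):
    """Convert an absolute count into a rate per 100,000 people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return deaths / population * 100_000


# Hypothetical regions and figures for illustration only; NOT real data.
regions = {
    "Region A": {"deaths": 2_000, "population": 10_000_000},
    "Region B": {"deaths": 500, "population": 1_500_000},
}

rates = {
    name: rate_per_100k(r["deaths"], r["population"])
    for name, r in regions.items()
}

# Region A has four times as many deaths, but Region B has the higher rate.
for name, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {rate:.1f} per 100,000")
```

In this made-up example the ranking by rate is the opposite of the ranking by raw count, which is the kind of distortion the takeaways point to.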

Consideration and Visual Selection

Below are a few visualisations and charts we considered for our project.

Considerations Pros Cons

Jb cvs 01.png

  Pros:
  • Shows the general relationship between different clusters
  • Able to visualise the situation, patterns and correlations
  • Able to show differences by gender, age and occupation
  Cons:
  • Does not explain why and how the situation occurred
  • Circles are hard to compare, and exact values are hard to read

Jb cvs 02.png

  Pros:
  • Easy to see which regions tend to have higher suicide rates
  Cons:
  • Hard to compare, classify or rank individual regions

Jb cvs 03.png

  Pros:
  • Useful for cross-examining multivariate data.
  • Good for showing variance across multiple variables, revealing patterns, and displaying whether any variables are similar to each other.
  • Able to identify which season, year, month or date has the highest and lowest suicide rate.
  Cons:
  • Colour schemes need to take cultural considerations into account. In Korea, brighter colours such as orange are not favoured by citizens.
  • Excessive use of colours neglects the fundamentals of visualisation and makes the chart difficult to interpret.

Brainstorming Sessions

Examples Remark
Jb 0000.png
Display the distribution of age groups using a pyramid chart. We can identify the most vulnerable age group and analyse the cause further.

Use a scatter plot to identify how age groups cluster together. With the addition of an interactive animation across years, a trail function will help reveal the trend over time.

Jb 0001.png

Jb 0002.png

A radar chart, also known as a spider chart, radial chart or web chart, is a graphical method of displaying multivariate data as a two-dimensional chart of three or more quantitative variables represented on axes starting from the same point.
We will use it to show the reasons behind suicidal thoughts among Koreans.

Jb 0003.png

Identification of anomalies will be clear across categories.

Jb 0004.png

Use a cartogram to analyse which regions have higher or lower suicide rates.

Jb 0005.png

We will also use a bar chart to give a clearer picture of the exact percentage for each region and to compare regions.
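
The radar chart described in the remarks above places each variable on its own axis, with all axes starting from the same point. A minimal sketch of that geometry follows. The function name, the clockwise-from-top axis order, and the category labels and values are all our own assumptions for illustration; none of the numbers come from the datasets.

```python
from math import cos, pi, sin


def radar_vertices(values, max_value):
    """Return (x, y) polygon vertices for a radar chart.

    Each variable gets its own axis; the axes start at the origin and are
    spaced evenly around the circle, with the first axis pointing up.
    """
    n = len(values)
    if n < 3:
        raise ValueError("a radar chart needs at least three variables")
    if max_value <= 0:
        raise ValueError("max_value must be positive")
    vertices = []
    for i, value in enumerate(values):
        angle = pi / 2 - 2 * pi * i / n  # clockwise, starting from the top
        radius = value / max_value       # scale so max_value maps to radius 1
        vertices.append((radius * cos(angle), radius * sin(angle)))
    return vertices


# Hypothetical categories and percentages for illustration only.
reasons = ["Reason A", "Reason B", "Reason C", "Reason D"]
share = [40.0, 25.0, 20.0, 15.0]

for label, (x, y) in zip(reasons, radar_vertices(share, max_value=50.0)):
    print(f"{label}: x={x:+.2f}, y={y:+.2f}")
```

Joining the returned vertices in order, and closing the shape back to the first vertex, gives the polygon that a charting tool would draw.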

Proposed Storyboard

Layout Learning Points
Jb 0006.png
Use a map chart to indicate which regions have the fewest to the most suicides, based on the intensity of the colour.

Jb 0007.png

Jb 0008.png

Based on the split bubble chart, we can identify which occupation has the highest suicide rate in percentage terms; within each occupation, the chart also shows the proportion by gender.

The time series will be positioned above the split bubble chart.

Credit to Professor Kam Tin Seong
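
The colour-intensity encoding proposed for the map chart can be sketched as a linear rescaling of each region's value onto a single-hue colour ramp. This is only an illustration under our own assumptions: the function names are hypothetical, the white-to-blue ramp is a placeholder rather than the project's chosen palette, and the rates are made-up values.

```python
def intensity(value, low, high):
    """Linearly rescale value from [low, high] to [0, 1]."""
    if high == low:
        return 0.0  # all regions equal: no contrast to show
    return (value - low) / (high - low)


def shade(value, low, high):
    """Hex colour from white (lowest value) to full blue (highest value)."""
    t = min(max(intensity(value, low, high), 0.0), 1.0)  # clamp to [0, 1]
    other = round(255 * (1 - t))  # red and green fade out as t grows
    return f"#{other:02x}{other:02x}ff"


# Hypothetical rates per 100,000 for illustration only; NOT real data.
rates = {"Region A": 20.0, "Region B": 30.0, "Region C": 25.0}
low, high = min(rates.values()), max(rates.values())

for name, rate in rates.items():
    print(name, shade(rate, low, high))
```

A single-hue ramp keeps the encoding easy to read, since darker always means a higher value; the actual hue would need to respect the cultural considerations on colour noted earlier.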

Technologies

Jb 0009.png

Timeline

Capturell.png

Comments

Feel free to leave us some comments so that we can improve!

No. Name Date Comments
1. <your name> <date> <comment>
2. <your name> <date> <comment>
3. <your name> <date> <comment>