Assignment Dropbox

From Visual Analytics for Business Intelligence
Revision as of 16:12, 23 August 2016 by Jx.wang.2013

Background

Education Matters
Education, often thought of as the prerequisite to a successful career and better income, has long been Singapore’s priority. But does better education lead to better income? What other factors allow one to command a higher pay? What can Singaporeans do to get a better income?

Motivation

One of the choices I faced after my Polytechnic education was whether to continue on to university or to start work to repay my education debt. My friend decided that he would be able to earn more. I would love to test that claim using this visual analytics project. However, a project based on a personal agenda would not be very useful to readers, and projecting estimated earnings seems off track for this project. Hence, I shall focus on the income disparities due to education.

Rationale of data selection

In line with this story, I searched for and found a dataset on Singaporeans’ income groups separated by education level and gender. This dataset was particularly useful as it gave both education level as a whole and an option to separate by gender. This allowed me to generate graphs at the level of abstraction I wanted (education level in the first set of visualization graphs, and zooming into gender difference in the second set). File:Va ass 1 dataset.xlsx

Don’t quit!

While analyzing the dataset from data.gov.sg regarding Singaporeans, their education level and income, I noticed a difference in the upward skewness of the graphs by education level. File:Income by education.png
Higher education does command a higher pay. It is apparent from the chart that every increase in education level also raises the median income by one income bin. Primary and University level of education graph.

Simply for numeric comparison:
1. The top 80% of university graduates earn more than 50% of polytechnic, 70% of secondary school and 90% of primary school graduates.
2. The top half of university graduates earn more than 80% of polytechnic, 90% of secondary school and 97% of primary school graduates.

The difference makes it clear that education gives individuals a better shot at a higher income.

Rationale for chart design

For the chart (Monthly income by Education Level), I placed a bar chart for each of the four major education categories side by side horizontally, ordered from lower to higher education level. This helps readers understand the situation quickly and see how the skew differs at each level.

Since different colors (whether by gender or education level) did not seem to help readers, no coloring was used.

Preparations

Bar chart across education levels
Using data from data.gov.sg, I managed to get data on people’s income aggregated by income bin and highest attained education level. Simply by removing the excess rows at the top, I was able to generate the above chart using Tableau.

Numeric comparison
Using the same dataset, we find the lowest income bin reached by the top 80% of university graduates and, using that bin, count how many people at each other education level fall below it.
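The numeric comparison can be sketched in a few lines of Python. This is only an illustration of the idea, not the steps actually carried out in Tableau: the bin labels, the head counts and the function names `cutoff_bin` and `share_below` are all made up for the example.

```python
# Hypothetical head counts per income bin, lowest bin first.
# These numbers are made up for illustration; they are NOT the data.gov.sg figures.
BINS = ["Below 1,000", "1,000 - 1,999", "2,000 - 2,999", "3,000 - 3,999", "4,000 & over"]
COUNTS = {
    "Primary": [50, 30, 15, 4, 1],
    "Polytechnic": [10, 30, 35, 15, 10],
    "University": [5, 10, 20, 25, 40],
}


def cutoff_bin(counts, top_percent):
    """Index of the lowest income bin occupied by the top `top_percent` of a group."""
    total = sum(counts)
    running = 0
    for index, count in enumerate(counts):
        running += count
        # Integer arithmetic: stop once more than (100 - top_percent)%
        # of the group lies in this bin or below it.
        if running * 100 > (100 - top_percent) * total:
            return index
    return len(counts) - 1


def share_below(counts, bin_index):
    """Fraction of a group that sits strictly below the given bin."""
    return sum(counts[:bin_index]) / sum(counts)


cutoff = cutoff_bin(COUNTS["University"], 80)
for level in ("Primary", "Polytechnic"):
    share = share_below(COUNTS[level], cutoff)
    print(f"{level}: {share:.0%} earn below the '{BINS[cutoff]}' bin")
```

Because the data is binned, the result is only accurate to the width of a bin: everyone inside the cut-off bin is treated as earning the same.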

Does it pay to be a man?

An interesting observation came up when analyzing the dataset. Females across all education levels were skewed towards the lower income bins. To simplify the visualization, I will use the University level to do a side-by-side comparison. File:Income uni gender.png

It seems that gender equality has yet to find its place in the workforce, at least for the time being and within the top education level group. The chart above demonstrates that women have a smaller presence in the upper half of the bins.

If we take the difference between the genders in each income bin and plot it as a chart, we see that females have a greater presence in the lower income bins while males dominate the higher income bins. An area line graph was used so that readers can see the comparison on either side of the 0 difference mark, which shows the income bins in which each gender has the larger presence.

File:DifferenceInMaleVsFemaleInBinsColored.jpg 

Possible reason for the difference
• Jobs that are dominated by women, like teaching, do not pay as much as jobs that are dominated by men, like engineering.

• There are more men in the workforce. Women are more likely than men to become homemakers after marriage (or after giving birth), which makes the workforce population uneven. Also, the data we look at covers only university degree holders, and there were far fewer female university graduates in the past.

• It takes years for a female university graduate to reach the higher income bins, and given the social stereotypes of the past, it will take many years before female university graduates have a bigger presence in the higher income bins.

File:More males than females.png
More men than women in the workforce with a university degree

Rationale for chart design

With the dataset taken from data.gov.sg, only a boxplot and a side-by-side bar chart highlight the difference between the genders without requiring readers to strain their eyes to work out whether there is a difference, as a packed bubbles chart (or any other area-based chart) would.

In terms of color, the area chart and the side-by-side bar chart use the gender-stereotyped colors (pink for female and blue for male). The purpose was to let readers understand the graph quickly even without a color legend. The area line graph was further edited (photoshopped to be pink and blue with text) to make it more obvious which gender has a higher presence in which income bin.

No sorting by value was done on any of my graphs, as it would not make sense to the user, especially when handling data separated by income bins. I did reorder some of the items such as “Below 500”, which was originally sorted automatically after the highest income bin.
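Outside Tableau, the same reordering could be done with a custom sort key. The sketch below is an assumption-laden illustration: the label formats are hypothetical (the real dataset's labels may differ), and `bin_sort_key` is a name invented for this example.

```python
import re


def bin_sort_key(label):
    """Sort 'Below ...' bins first, then order by the first number in the label."""
    # Hypothetical label formats: "Below 500", "500 - 999", "12,000 & Over".
    first_number = int(re.search(r"\d[\d,]*", label).group().replace(",", ""))
    is_open_low_bin = label.lower().startswith("below")
    return (0 if is_open_low_bin else 1, first_number)


# Starting order is arbitrary, with "Below 500" misplaced after the highest bin.
labels = ["1,000 - 1,499", "12,000 & Over", "500 - 999", "Below 500"]
ordered = sorted(labels, key=bin_sort_key)
print(ordered)
```

Sorting on the numeric lower bound rather than on the label text is what keeps “Below 500” ahead of the other bins.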

The side by side bar chart was deliberately set horizontal to aid users in the reading of the labels.
Other unused chart
The dataset from data.gov.sg was an aggregated count based on income bins making it difficult to easily generate a correct boxplot. Data massaging was required to change the dataset from an aggregated form to a list of incomes based on the counts of each bin. The generated dataset was used to make a boxplot seen below.

File:Boxplot uni gender.png
Do not use boxplot for aggregated dataset

This box plot highlights the difference in pay between men and women as a whole instead of by income bracket. As we can see from the box plot, men receive significantly more than women in both mean and median. Although a boxplot could illustrate the difference by percentile and other mathematical comparisons, this chart was based on a generated dataset, which means it is not solid enough to support a statement. Therefore, the boxplot was not used in our analysis.

File:Pre-generated data.png
Before generating
File:3mb.png
After generating

Preparations

Side-by-side bar chart
Similar to how the bar chart was done, the dataset only required some massaging of its header before feeding it into Tableau. I decided to use a side-by-side bar chart with gender-stereotyped coloring for easier viewing of the difference in pay between the genders.

Area line chart
Using the dataset of female and male university graduates and their numbers in each income bin, I used a calculated field to get the difference between the genders across the bins.
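The calculated field amounts to a per-bin subtraction. A minimal Python sketch of that calculation follows; the counts are invented for illustration, the male-minus-female direction is an assumption, and `gender_gap` is a hypothetical helper, not the Tableau formula actually used.

```python
def gender_gap(male_counts, female_counts):
    """Per-bin difference (male minus female); positive means more males in that bin."""
    if len(male_counts) != len(female_counts):
        raise ValueError("both genders need one count per income bin")
    return [m - f for m, f in zip(male_counts, female_counts)]


# Hypothetical counts per income bin, lowest bin first (not the real figures).
male = [10, 20, 30, 40]
female = [25, 30, 25, 20]

gap = gender_gap(male, female)
for bin_number, difference in enumerate(gap, start=1):
    if difference > 0:
        leader = "male"
    elif difference < 0:
        leader = "female"
    else:
        leader = "neither"
    print(f"bin {bin_number}: difference {difference:+d}, larger presence: {leader}")
```

Plotting these differences against the bins gives the line that crosses the 0 mark where the larger presence switches from one gender to the other.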

Box plot chart
The above box plot of university graduates separated by gender required much more massaging of the data before it could be used. Because the data given to us was aggregated into counts of occurrence, I used Java code to expand each count into that many actual values (using the upper limit of each bin).
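The expansion step can be sketched as follows. The original work used Java; this Python version is only an illustration of the same idea, with made-up upper limits and counts and a hypothetical function name `expand_bins`. Note that an open-ended top bin has no upper limit, so some value would have to be assumed for it.

```python
from statistics import median


def expand_bins(upper_limits, counts):
    """Repeat each bin's upper limit once per person counted in that bin."""
    values = []
    for limit, count in zip(upper_limits, counts):
        values.extend([limit] * count)
    return values


# Hypothetical bins: upper income limit of each bin and the head count in it.
upper_limits = [999, 1999, 2999]
counts = [2, 3, 1]

incomes = expand_bins(upper_limits, counts)
print(incomes)          # one row per person, each set to the bin's upper limit
print(median(incomes))  # summary statistics can now be computed on the list
```

Every person in a bin ends up with the same value, so the spread inside each bin is lost. This is the reason the resulting boxplot was treated as too weak to draw conclusions from.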

Conclusion

In the first set of visualization graphs (Don’t Quit!), we have shown that education does directly affect an individual’s income and that low education may turn into an income ceiling. Hence, I strongly recommend that students keep progressing to higher education as far as possible. The second set of visualization graphs (Does it pay to be a man?), which was derived from an observation of the same dataset, seems to suggest that females are at a disadvantage. Unfortunately, I was unable to obtain the same dataset arranged by year, which could have been used to form a trendline showing the changes over time. Given the social stereotypes that limited females in industry as well as in their studies in the past, we can see that the lack of education has played its part in giving females a smaller presence in the higher income bins.

infographic


Comments?

Good job! I like the way you presented your topic, looking at the demographic first, then zoom in looking at the different gender and also how you created a unique graph through your insights.

-- Benjamin Tan