Difference between revisions of "IS428 2016-17 Term1 Assign2 Li Weiqiao"

Latest revision as of 06:35, 26 September 2016

Abstract

Safety should be a priority for everyone in the workplace. According to the WSHI report in 2014, the overall number of reported injuries increased compared to 2013.

Using the data set from the Workplace Safety and Health Institute, this report studies the frequency of injuries by time and by industry category, as well as the factors that affect the injury rate, to provide a better understanding of workplace safety. With this understanding, measures can be introduced more effectively to protect workers from accidents and injuries.

Theme of Interest

Final Questions:
• Is there any pattern of the injury rate rising in certain months or on certain weekdays?
• Which industry has a higher workplace injury rate?
• Is there any correlation between injury and workers' gender?
• Is there any correlation between injury and the company's SSCI index?
• Is there any correlation between injury and workers' length of service?
• Is there any correlation between injury and workers' age?


Findings

1. Frequency of injury by time

[Figure: Frequency of accidents by weekday]
[Figure: Frequency of accidents by working hours]

From the treemap by weekday, no particular day of the week shows a higher probability of injury in the workplace. However, the row category on the left, which is the month, shows that there are fewer injuries from November to January. After some research, this is probably because fewer people are working during the winter break.
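The counts behind a month-by-weekday treemap can be sketched in a few lines. This is only an illustration of the idea, not the Tableau workflow used in this report: the dates in `accident_dates` are made up, and the function names are hypothetical.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

# Hypothetical accident dates; real values would come from the data set.
accident_dates = [
    date(2014, 3, 3),
    date(2014, 3, 4),
    date(2014, 3, 10),
    date(2014, 9, 17),
    date(2014, 12, 5),
]

def count_by_month_and_weekday(dates):
    """Count accidents per (month number, weekday) cell, as in the treemap."""
    return Counter((d.month, WEEKDAYS[d.weekday()]) for d in dates)

def count_by_month(dates):
    """Count accidents per month number (1 = January)."""
    return Counter(d.month for d in dates)

cells = count_by_month_and_weekday(accident_dates)
months = count_by_month(accident_dates)

for (month, weekday), n in sorted(cells.items()):
    print(month, weekday, n)
```

Comparing the monthly totals with the per-cell counts makes it easier to tell whether a quiet cell reflects a quiet month or a quiet weekday.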

From the treemap by working hours, we can see, interestingly, that injuries are most likely to happen between 10 a.m. and 12 p.m. Governments can formulate and push for a protection policy or program for all organizations and companies to protect people in the workplace.
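A minimal sketch of how accidents could be grouped into working-hour slots is shown here. The two-hour slot width, the `accident_times` sample and the function name are assumptions made for illustration; the data set's own time format may differ.

```python
from collections import Counter

# Hypothetical accident times in 24-hour "HH:MM" form.
accident_times = ["10:15", "11:40", "10:55", "14:05", "08:30", "11:05"]

def count_by_two_hour_slot(times):
    """Group times into two-hour slots such as '10:00-12:00' and count them."""
    slots = Counter()
    for t in times:
        hour = int(t.split(":")[0])
        start = hour - hour % 2  # round down to an even hour
        slots[f"{start:02d}:00-{start + 2:02d}:00"] += 1
    return slots

slots = count_by_two_hour_slot(accident_times)
busiest_slot, busiest_count = slots.most_common(1)[0]
print(busiest_slot, busiest_count)
```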

2. More Dangerous Industry

[Figure: Work injury rate by sub-industry]

From the bubble chart, we can clearly see that construction is a higher-risk industry for workers. Some other industries, such as manufacturing, logistics and marine, also have more injury cases. For industries with more injury cases, we can formulate more protection programs and provide safety facilities.
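The ranking that the bubble sizes convey can be approximated as below. This sketch assumes one record per injury case and uses a made-up `records` list; it counts cases rather than computing a true rate, which would also need the number of workers in each industry.

```python
from collections import Counter

# Hypothetical industry label of each injury record.
records = [
    "Construction", "Manufacturing", "Construction", "Marine",
    "Logistics", "Construction", "Manufacturing",
]

def rank_industries(industries):
    """Return (industry, cases, share of all cases), most cases first."""
    counts = Counter(industries)
    total = sum(counts.values())
    return [(name, n, n / total) for name, n in counts.most_common()]

ranking = rank_industries(records)
for name, n, share in ranking:
    print(f"{name}: {n} cases ({share:.0%})")
```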

3. Factors that affect injury rate

[Figure: Factors that affect work injury rate]

Several factors may affect the injury rate, such as workers' age, gender and length of service.
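One simple way to start exploring these factors is to count injuries per age band and per gender. The sketch below assumes ten-year age bands and a made-up `victims` list of (age, gender) pairs; both are illustrative choices rather than the method used for the chart.

```python
from collections import Counter

# Hypothetical (age, gender) pairs for injured workers.
victims = [(23, "M"), (35, "M"), (41, "F"), (29, "M"), (52, "M"), (37, "F")]

def age_band(age, width=10):
    """Map an age to a band label such as '20-29'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def count_by(rows, key):
    """Count rows grouped by the value returned from key(row)."""
    return Counter(key(row) for row in rows)

by_age = count_by(victims, lambda v: age_band(v[0]))
by_gender = count_by(victims, lambda v: v[1])
print(dict(by_age))
print(dict(by_gender))
```

Raw counts like these only hint at a relationship; comparing them with the size of each group in the workforce would be needed before calling it a correlation.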


Evolving Questions

1. How has workplace safety changed over the years? Why is it changing (if there is any trend)?

This is impossible to answer because the data is limited to 2014.

-> Can we find any patterns by month, weekday or working hours?


2. What causes these accidents?

-> Is there any relationship between cause and workers' gender?


3. Are there any effective measures to protect workers from injury? If not, why are they not effective?

-> Approached by looking at high-risk industries and high-risk working hours.

4. Which industry or sector has the highest workplace injury rate? Why?

-> Which industry/sector/company has the most frequent accidents and injuries?

5. Questions which are not visualized:

How is the compensation, e.g. the number of MC days?

Is there any delay between the date of injury and the report date?

What kind of accident is the most common?

How often do workplace injuries happen?
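Although these questions were not visualized, the reporting-delay question could be checked with a small calculation like the one below. It is a sketch under assumptions: the `cases` list of (accident date, reported date) pairs is made up, and the median is just one reasonable summary.

```python
from datetime import date
from statistics import median

# Hypothetical (accident date, reported date) pairs.
cases = [
    (date(2014, 3, 3), date(2014, 3, 5)),
    (date(2014, 3, 10), date(2014, 3, 10)),
    (date(2014, 9, 17), date(2014, 9, 27)),
]

def report_delays(pairs):
    """Return the delay in days between each accident and its report."""
    return [(reported - occurred).days for occurred, reported in pairs]

delays = report_delays(cases)
print("delays in days:", delays)
print("median delay:", median(delays))
```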


Intermediate Visualization

1. Intermediate pattern of injury rate
[Figure: Interim pattern of accident date]

At first, I wanted to show the injury rate by month, weekday and working hours in one treemap. After building it, I found that it was unclear: the labels were hard to read and the patterns were hard to find. So I split it into two treemaps, which show the patterns of the injury rate clearly.

2. [Figure: Intermediate visualization for question 2, sub-industry names sized by number of records]

Because the size of each word (sub-industry) can represent the number of records, I first visualized the data this way. However, because all the words have the same color, it is hard for readers to tell the words apart. So I also mapped the number of records to color, which emphasizes the industries with more injury cases in the workplace.

3. [Figure: Intermediate visualization for question 2, with color and size both showing number of records]

Even with both color and size representing the number of records, I still found it a bit messy to read. So I changed to a bubble chart, which also lets me put labels on the bubbles.


Tools Utilized

  1. Excel 2013 for data preparation
  2. Tableau for visualization