IS428 2016-17 Term1 Assign2 Zheng Xiye

From Visual Analytics for Business Intelligence
Revision as of 17:36, 25 September 2016 by Xiye.zheng.2013 (talk | contribs)

Theme of Interest

Workplace security and safety have long been essential considerations when the Singapore government structures its policies. The government's strong concern for the physical well-being of the workforce has not only reassured people of a conducive working environment but also attracted talent from nearby countries, helping to sustain the growth of Singapore's economy. However, the SMRT track accident on 22 March earlier this year sparked another round of discussion about the extent of the safety measures that should be implemented in Singapore's workplaces. The debate went beyond SMRT's operational safety protocols to every aspect of the measures needed to prevent workplace injuries. In addition, according to the WSHI National Statistics Report 2014, although the number of fatal workplace injuries decreased from 73 in 2013 to 60 in 2014, the numbers of major and minor injuries remained high and continued to rise, from 640 and 11,740 in 2013 to 672 and 12,863 in 2014, respectively. From this perspective, exploring potential correlations between various factors and the injury rate may help in structuring preventive and mitigation actions.
Workplace Stats.JPG

Analytical & Investigation Questions

The questions below describe potential correlations that this exploration aims to substantiate with concrete data:

  1. Which periods of the year have the highest work injury rates?
  2. Which sub-industries account for most of the work injuries?
  3. Which groups of victims, defined by combinations of characteristics, experience the largest number of work injuries?

Tools

  1. Tableau Desktop
  2. JMP Pro 12
  3. Microsoft Excel

Approaches

Data Preprocessing

1. Data Selection

Of the 48 data variables provided, only those relevant to this data exploration are selected and grouped into 4 distinct data categories based on their nature.

S/D  | Data Category          | Data Variables
I.   | Accident Time Data     | Reported Date; Accident Date
II.  | Accident Industry Data | Major Industry; Sub Industry
III. | Accident Type Data     | Accident Type Category
IV.  | Accident Victims Data  | Victims' Gender; Victims' Age; Occupation; Injured (Overtime); Injured (Official Work Duties)
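
The selection step can also be expressed outside JMP Pro. The following is a minimal Python sketch, not the procedure used in this assignment: it assumes the raw data is a CSV file whose headers match the variable names listed above (the exact header strings are an assumption), and the helper name select_columns and the sample record are made up for illustration.

```python
import csv
import io

# Assumed header strings: the write-up lists the variables, but not the
# exact column headers used in the raw data file.
SELECTED_COLUMNS = {
    "Accident Time Data": ["Reported Date", "Accident Date"],
    "Accident Industry Data": ["Major Industry", "Sub Industry"],
    "Accident Type Data": ["Accident Type Category"],
    "Accident Victims Data": [
        "Victims' Gender",
        "Victims' Age",
        "Occupation",
        "Injured (Overtime)",
        "Injured (Official Work Duties)",
    ],
}


def select_columns(rows, groups=SELECTED_COLUMNS):
    """Keep only the selected variables from each record (one dict per row)."""
    keep = [name for names in groups.values() for name in names]
    # Columns missing from a record are filled with an empty string.
    return [{name: row.get(name, "") for name in keep} for row in rows]


# Made-up sample standing in for the full 48-column file.
sample = io.StringIO(
    "Accident Date,Sub Industry,Occupation,Unused Column\n"
    "2014-03-01,Construction,Welder,x\n"
)
rows = select_columns(csv.DictReader(sample))
```

Keeping the category-to-variable mapping in one dictionary makes it easy to see which of the 4 categories each retained column belongs to.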

2. Lowercase Formatting - Sub Industry & Occupation

While browsing the data, I noticed that the values in both Sub Industry and Occupation appear in inconsistent letter cases, so the same value is treated as several distinct categories and counts are split across these duplicates ('double-counting'). To fix this, I used JMP Pro's Character - Lowercase formula function to standardize the casing of all values in both columns. This is achieved by:

  • Importing the data into JMP Pro and creating a new column, 'Sub Industry LC'.
  • Right-clicking the header of 'Sub Industry LC' and choosing 'Formula'.
  • In the 'Formula' panel, clicking 'Character', then 'Lowercase', and finally selecting 'Sub Industry' as the argument of the 'Lowercase' function.
  • Repeating the same process to convert all 'Occupation' values to lower case.
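
For readers without JMP Pro, the effect of the lowercase step can be illustrated with a short Python sketch. This is an illustration under stated assumptions, not the procedure used above: the records are made up, and lowercase_column is a hypothetical helper that mimics adding a formula column such as 'Sub Industry LC'.

```python
from collections import Counter


def lowercase_column(rows, source, target):
    """Add a lower-cased copy of `source` under `target`, leaving `source` intact."""
    return [{**row, target: row[source].lower()} for row in rows]


# Made-up records showing the same sub-industry in three different casings.
records = [
    {"Sub Industry": "Construction"},
    {"Sub Industry": "CONSTRUCTION"},
    {"Sub Industry": "construction"},
    {"Sub Industry": "Marine"},
]

# Before cleaning, each casing is counted as its own category.
before = Counter(r["Sub Industry"] for r in records)

# After cleaning, the three variants collapse into a single category.
cleaned = lowercase_column(records, "Sub Industry", "Sub Industry LC")
after = Counter(r["Sub Industry LC"] for r in cleaned)
```

Writing the result to a new column rather than overwriting the original mirrors the JMP approach and keeps the raw values available for checking.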

Lowercase Pre-processing.JPG


Conclusion

Reference

  • WSHI National Statistics Report 2014
  • SMRT Track Accident

Comments