ANLY482 AY2017-18T2 Group13 Analysis & Findings
Contents
Manipulation and Cleaning of Training Records
In order to rectify these missing data points, we replaced the values whenever possible as the team hopes to analyze as many data points as possible. When replacement of values are not possible, records are excluded from analysis.
Missing data (Field) | Action |
---|---|
Start and End date | 1. Attempted to find corresponding dates from exact course titles, unsuccessful
2. Attempt to take 'Create date' however inaccurate as record could have been created after actual start and end date 3. Excluded (Final action) |
Category | Replaced missing data with corresponding records with exact course title |
Groupings within Training Records
Course Titles
Using text explorer and vetting through our Sponsor, course titles were grouped in order to narrow down our analysis to specific areas. Grouping was done with a calculated field to facilitated breakdown analysis in later stage. Staff department groupings were also applied in Training Records
[Text explorer screenshot]
Groupings :
- Bulk
- HSEQ
- Equipment and Maintenance
- Miscellaneous
- Packaging
- Packaging and Raw Material
- Warehouse
- Others
Manipulation and Cleaning of Staff List
1. As the staff list provided were in separate sheets, data fields were standardized and merged into 1 data set.
Previous field name | New field name | Sheet affected |
---|---|---|
Job Code | Job Title | Dec'13 & Dec '14 |
Job Group | Staff Group | Dec'14 |
Designation | Job Title | Dec'15, Dec'16, Dec'17 |
Staff Grade | Staff Group | Dec'15, Dec'16, Dec'17 |
Section | Department | Dec'15, Dec'16, Dec'17 |
Department | Location | Dec '14, Dec'15, Dec'16, Dec'17 |
2. Standardization of variables within columns were also made to ensure consistency throughout the years.
Issue | Action taken | Sheet affected |
---|---|---|
Location column missing from sheets and location information was combined in the naming assigned to the departments (e.g. EMOS7000) | Location Group column created to extract the front letters from the department in Excel with a calculated field | Dec'13 & Dec '14 |
Department variables inconsistent with other years as it was combined with location. Other sheets used a single descriptive field to specify departments (e.g. EMOS Shared Services) [SNAPSHOT] | Used corresponding descriptive department name based on matching of employee number to other sheets | Dec’13 & Dec'14 |
Variables as Job Title was inconsistent with other years and were in the form of acronyms | Cross referenced employee number to other years and Training Records for corresponding job title | Dec'13 & Dec ‘14 |
Variables in Cost Centre , Citizenship and Serial Number fields were inconsistent over the years (e.g. Missing entire field) and had missing records that could not be matched with other years | As these fields were not within the scope and were advised by the Sponsor to exclude, these fields were excluded from the data analysis entirely | Dec'13, Dec'14, Dec'17 |
Groupings within Staff Records
As provided by the Sponsor, staff departments were grouped into the following : - Bagging - CSR - Operations - Technical Towhead - Transport - Warehouse - Others: Includes all other departments not belonging to either of the above
Mapping Staff List and Training Record data set
Employee number is unique to each employee hence it was used an identifier. While an left join was considered, a full outer join was used instead to include all the records from both data sets independent from whether there were matching employee numbers.
As distinct count of employees in the Training Record is greater than that in the Staff List, the team attributed this to the fact that there are employees who received training but left in the short term to be captured in the Stall List records.
A full outer join would : - Include employees who undergone training but left the company in a short term hence will not be traceable in the annual Staff List - Include employees who did not undergo training However, with a full outer join, crucial fields such as employee number, location etc will be duplicated and may present differing variables. Meanwhile, employee records who appeared in staff list but not training records would have null values in training employee fields.
[Screenshot of tableau]
As such additional columns were created generalize columns that combined data from both sets
1.Employee Number
2.Location Group Filter
3.Staff Department Group Filter
Created to first obtain the department group information from the Training Records, if it is unavailable, information from the Staff List would be used.
4.Job Title Group Filter
Upon mapping, discrepancies in the staff department groups were found. Since employees underwent training based on the job title that they are categorized for training purposes,job title group in Training Records took precedence over job title group in Staff Records.
To be updated
To be updated