ANLY482 AY2016-17 T2 Group 2 Project Overview Data Source

From Analytics Practicum
Revision as of 16:02, 29 December 2016 by Jonathanlow.2013 (talk | contribs)



Preliminary Data Source

To facilitate our initial analysis, GovTech provided us with a dataset consisting of job postings for January 2016. For every posting, the dataset contains the job title, description and requirements. Relevant skills can be found in all three columns.

Data Dictionary

Data Field Description
[empty] Serial number of the job posting.
jobtitle The position being hired for. The title indicates the type of role, for example engineering or finance.
Description Describes in detail the company’s profile, the characteristics sought in a candidate and the candidate’s expected scope of work.
requirements Lists the certifications, experience and skills required for the job posting.
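
As a rough illustration of the layout above, the sketch below reads rows with this column structure using Python's standard csv module. The sample rows are invented for illustration, the column spellings are assumed from the table, and `load_postings` is a hypothetical helper; the actual file provided by GovTech is not reproduced here.

```python
import csv
import io

# Invented sample rows using the columns from the data dictionary.
# The first column has an empty header and holds the serial number.
SAMPLE_CSV = """,jobtitle,Description,requirements
1,Software Engineer,Builds internal web services.,Degree in Computer Science; Python; SQL
2,Finance Analyst,Prepares monthly reports.,ACCA certification; Excel
"""

def load_postings(handle):
    """Return a list of dicts, renaming the unnamed serial column to 'serial'."""
    reader = csv.DictReader(handle)
    postings = []
    for row in reader:
        # DictReader stores the empty header under the key "".
        row["serial"] = row.pop("")
        postings.append(row)
    return postings

postings = load_postings(io.StringIO(SAMPLE_CSV))
```

Renaming the unnamed column up front avoids having to refer to an empty key later in the analysis.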


Using this dataset, we can build labelled data by identifying skill words in the ‘description’ and ‘requirements’ fields and by scraping websites for common skillsets. This labelled data will then be used to build a model. Subsequently, further data will be scraped from jobsbank.gov.sg and used to train the model.
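
One way to read the labelling step is as dictionary matching: each word in the text fields is tagged as a skill or not by comparing it with a list of known skills. The sketch below assumes that interpretation. The skill list, the `label_tokens` function and the sample posting are all hypothetical, and the page does not specify the actual labelling procedure or model.

```python
import re

# Hypothetical skill list; in the project this would come from scraped websites.
KNOWN_SKILLS = {"python", "sql", "excel"}

def label_tokens(text, skills=KNOWN_SKILLS):
    """Split text into alphabetic tokens and pair each with 'SKILL' or 'O'."""
    tokens = re.findall(r"[A-Za-z]+", text)
    return [(tok, "SKILL" if tok.lower() in skills else "O") for tok in tokens]

# Invented posting with the two text fields named in the paragraph above.
posting = {
    "Description": "Builds reports using Excel.",
    "requirements": "Python and SQL experience",
}

labelled = label_tokens(posting["Description"]) + label_tokens(posting["requirements"])
```

This sketch only matches single-word skills; multi-word skills such as "project management" would need phrase matching instead of per-token lookup.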