ANLY482 AY2016-17 T2 Group 2 Project Overview Data Source
Revision as of 16:02, 29 December 2016 by Jonathanlow.2013
Preliminary Data Source
To facilitate our initial analysis, GovTech provided us with a dataset of job postings for January 2016. For each posting, the dataset contains the job title, description and requirements. Relevant skills can be found in all three columns.
Data Dictionary
Data Field | Description |
---|---|
[empty] | Serial number of the job posting. |
jobtitle | The position being hired for. The title indicates whether it is an engineering or finance role. |
Description | Describes in detail the company’s profile, the characteristics sought in a candidate and the candidate’s expected scope of work. |
requirements | Lists the certifications, experience and skills required for the job. |
Using this dataset, we can gather labelled data by identifying skill-related words in the ‘description’ and ‘requirements’ fields and by scraping websites for common skill sets. This labelled data will then be used to build a model.
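The labelling step described above could be sketched roughly as follows. This is only an illustration under our own assumptions, not the project's actual pipeline: the skill list `SKILLS`, the helper names `find_skills` and `label_posting`, the lowercase field keys and the example posting are all hypothetical, and in practice the vocabulary would come from the scraped websites.

```python
import re

# Hypothetical skill vocabulary (assumption); in the project this would be
# compiled from websites listing common skill sets.
SKILLS = ["python", "sql", "financial modelling", "autocad"]


def find_skills(text, skills=SKILLS):
    """Return the skills from `skills` that occur in `text` as whole words or phrases."""
    lowered = text.lower()
    found = []
    for skill in skills:
        # Guard both sides so that "sql" does not match inside "mysql".
        pattern = r"(?<!\w)" + re.escape(skill) + r"(?!\w)"
        if re.search(pattern, lowered):
            found.append(skill)
    return found


def label_posting(posting):
    """Collect skills from the 'description' and 'requirements' fields of one posting."""
    combined = " ".join(
        posting.get(field, "") for field in ("description", "requirements")
    )
    return find_skills(combined)


# Made-up posting, for illustration only (not taken from the GovTech dataset).
example = {
    "jobtitle": "Finance Analyst",
    "description": "Build financial modelling tools for the treasury team.",
    "requirements": "Proficiency in SQL and Python is required.",
}
labels = label_posting(example)
```

Matching on whole words keeps short skill names from firing inside longer tokens, at the cost of missing spelling variants; a real vocabulary would likely need synonyms and normalisation on top of this.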
Subsequently, data will be scraped from jobsbank.gov.sg and used to train the model.