Difference between revisions of "ANLY482 AY2016-17 T2 Group 2 Project Overview Data Source"

From Analytics Practicum
Line 46: Line 46:
 
==<div style="background: #6A8D9D; line-height: 0.3em; font-family:helvetica;  border-left: #466675 solid 15px;"><div style="border-left: #FFFFFF solid 5px; padding:15px;font-size:15px;"><font color= "#F2F1EF"><strong>Preliminary Data Source</strong></font></div></div>==
 
==<div style="background: #6A8D9D; line-height: 0.3em; font-family:helvetica;  border-left: #466675 solid 15px;"><div style="border-left: #FFFFFF solid 5px; padding:15px;font-size:15px;"><font color= "#F2F1EF"><strong>Preliminary Data Source</strong></font></div></div>==
 
<div style="margin:20px; padding: 10px; background: #ffffff; text-align:left; font-size: 95%;-webkit-border-radius: 15px;-webkit-box-shadow: 7px 4px 14px rgba(176, 155, 121, 0.96); -moz-box-shadow: 7px 4px 14px rgba(176, 155, 121, 0.96);box-shadow: 7px 4px 14px rgba(176, 155, 121, 0.96);">
 
<div style="margin:20px; padding: 10px; background: #ffffff; text-align:left; font-size: 95%;-webkit-border-radius: 15px;-webkit-box-shadow: 7px 4px 14px rgba(176, 155, 121, 0.96); -moz-box-shadow: 7px 4px 14px rgba(176, 155, 121, 0.96);box-shadow: 7px 4px 14px rgba(176, 155, 121, 0.96);">
For our preliminary analysis, we are provided with 1 month of data (January 2016). Due to the insufficient data provided, our team will build a program to scrape more data from Jobsbank.gov.sg.
+
To facilitate our initial analysis, GovTech provided us with a dataset that consists of job postings for January 2016. The dataset contains every instance’s job title, description and requirements. Relevant skills can be found in all 3 columns. <br><br>
 
 
The dataset provided to us contains every instance's job title, description and requirements. Relevant skills can be found in all 3 columns.<br><br>
 
 
'''Data Dictionary'''
 
'''Data Dictionary'''
 
{|class="wikitable" width="60%"
 
{|class="wikitable" width="60%"
Line 71: Line 69:
 
| This lists out the certifications, experiences, skills required for the job post.
 
| This lists out the certifications, experiences, skills required for the job post.
 
|}
 
|}
 +
<br>
 +
Using this data, we can gather labelled data by identifying the words from the ‘description’ and ‘requirements’ fields and scraping websites for common skillsets. This labelled data will then be used to build a model.
 +
Subsequently, data will be scraped from jobsbank.gov.sg and used to train the model.
 +
 
</div>
 
</div>

Revision as of 16:02, 29 December 2016



Preliminary Data Source

To facilitate our initial analysis, GovTech provided us with a dataset that consists of job postings for January 2016. The dataset contains every instance’s job title, description and requirements. Relevant skills can be found in all 3 columns.

Data Dictionary

{|class="wikitable" width="60%"
! Data Field !! Description
|-
| [empty] || Serial number of the job posting.
|-
| jobtitle || The hiring post for the job. The job title indicates whether it is an engineering or finance role.
|-
| Description || Describes in detail the company’s profile, the candidate characteristics the company is looking for, and the candidate’s expected work scope.
|-
| requirements || Lists the certifications, experience and skills required for the job post.
|}


Using this data, we can gather labelled data by identifying skill-related words in the ‘description’ and ‘requirements’ fields and by scraping websites for common skillsets. This labelled data will then be used to build a model. Subsequently, data will be scraped from jobsbank.gov.sg and used to train the model.
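
The labelling step described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not our actual implementation: the skill list <code>SKILLS</code>, the function <code>label_skills</code> and the sample posting are all hypothetical, and the field names are assumed to be lowercase as quoted in the paragraph above. In practice the skill list would come from the scraped skillset websites.

```python
import re

# Hypothetical skill list; in practice this would be built from the
# common skillsets scraped from external websites.
SKILLS = ["python", "sql", "project management", "autocad"]


def label_skills(posting, skills=SKILLS):
    """Return the skills mentioned in a posting's description or requirements."""
    # Combine the two free-text fields; a missing or empty field counts as "".
    text = " ".join(
        posting.get(field) or "" for field in ("description", "requirements")
    ).lower()

    found = []
    for skill in skills:
        # Match the skill as a whole word or phrase, so that "java"
        # does not match inside "javascript".
        pattern = r"(?<!\w)" + re.escape(skill.lower()) + r"(?!\w)"
        if re.search(pattern, text):
            found.append(skill)
    return found


# Made-up posting for illustration only (not taken from the GovTech dataset).
posting = {
    "jobtitle": "Data Engineer",
    "description": "Build data pipelines in Python.",
    "requirements": "Strong SQL skills; project management experience preferred.",
}
labels = label_skills(posting)
```

Each posting paired with its matched skills then forms one labelled example. Exact phrase matching is the simplest option; it will miss spelling variants and abbreviations, which is one reason to train a model on the labelled data afterwards.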