Difference between revisions of "Kabak: Research Paper Data Preparation"
(10 intermediate revisions by the same user not shown) | |||
Line 10: | Line 10: | ||
| style="vertical-align:top;width:16%;" | <div style="padding: 3px; font-weight: bold; text-align:center; line-height: wrap_content; font-size:16px; border-bottom:1px solid #000000; border-top:1px solid #000000; font-family:Trebuchet MS"> [[Kabak: Application | <font color="#35383c"><b>APPLICATION</b>]] | | style="vertical-align:top;width:16%;" | <div style="padding: 3px; font-weight: bold; text-align:center; line-height: wrap_content; font-size:16px; border-bottom:1px solid #000000; border-top:1px solid #000000; font-family:Trebuchet MS"> [[Kabak: Application | <font color="#35383c"><b>APPLICATION</b>]] | ||
− | | style="vertical-align:top;width:16%;" | <div style="padding: 3px; font-weight: bold; text-align:center; line-height: wrap_content; font-size:16px; border-bottom:1px solid #000000; border-top:1px solid #000000; font-family:Trebuchet MS; background-color:#35383c;"> [[Kabak: | + | | style="vertical-align:top;width:16%;" | <div style="padding: 3px; font-weight: bold; text-align:center; line-height: wrap_content; font-size:16px; border-bottom:1px solid #000000; border-top:1px solid #000000; font-family:Trebuchet MS; background-color:#35383c;"> [[Kabak: Report | <font color="#FFFFFF"><b>REPORT</b>]] |
+ | |||
+ | | style="vertical-align:top;width:16%;" | <div style="padding: 3px; font-weight: bold; text-align:center; line-height: wrap_content; font-size:16px; border-bottom:1px solid #000000; border-top:1px solid #000000; font-family:Trebuchet MS"> [[Project_Groups | <font color="#35383c"><b>OTHER PROJECT GROUPS</b>]] | ||
|} | |} | ||
Line 18: | Line 20: | ||
{| style="background-color:#ffffff ; margin: 3px 11px 3px 11px;" width="80%"| | {| style="background-color:#ffffff ; margin: 3px 11px 3px 11px;" width="80%"| | ||
| style="font-family:Trebuchet MS; font-size:11px; text-align: center; border:solid 1px #35383c; background-color: #FFFFFF" width="200px" | | | style="font-family:Trebuchet MS; font-size:11px; text-align: center; border:solid 1px #35383c; background-color: #FFFFFF" width="200px" | | ||
− | [[Kabak: | + | [[Kabak: Report|<font color="#35383c"><strong>OVERVIEW</strong></font>]] |
| style="font-family:Trebuchet MS; font-size:11px; text-align: center; border:solid 1px #35383c; background-color: #35383c" width="200px" | | | style="font-family:Trebuchet MS; font-size:11px; text-align: center; border:solid 1px #35383c; background-color: #35383c" width="200px" | | ||
− | [[Kabak: | + | [[Kabak: Report Data Preparation|<font color="#FFFFFF"><strong>DATA PREPARATION</strong></font>]] |
− | |||
− | |||
− | |||
| style="font-family:Trebuchet MS; font-size:11px; text-align: center; border:solid 1px #35383c; background-color: #FFFFFF" width="200px" | | | style="font-family:Trebuchet MS; font-size:11px; text-align: center; border:solid 1px #35383c; background-color: #FFFFFF" width="200px" | | ||
− | [[Kabak: | + | [[Kabak: Report Analysis|<font color="#35383c"><strong>ANALYSIS</strong></font>]] |
|} | |} | ||
</center> | </center> | ||
Line 89: | Line 88: | ||
|- | |- | ||
| | | | ||
− | + | * Stack the data to consolidate the data table into two columns (Postal Code, Housing Type) | |
− | + | * Remove rows with missing data | |
− | |||
|| | || | ||
− | + | [[File: Kabakdatacleaning1.png|400px|center]] | |
− | + | |- | |
− | * | + | | |
− | **By | + | * Concatenate all 12 months of data into one consolidated data table |
− | + | **At the end of this phase of data cleaning, the consolidated table has a total of 177,053 rows | |
− | + | || | |
− | + | [[File: Kabakdatacleaning2.png|400px|center]]
|- | |- | ||
| | | | ||
− | + | * Merge the private housing data with the public housing data | |
− | + | **The final consolidated data consists of 241,766 rows | |
|| | || | ||
− | + | [[File: Kabakdatacleaning3.png|400px|center]] | |
− | |||
− | |||
− | |||
− | |||
|- | |- | ||
| | | | ||
− | + | * Geocode postal codes with missing latitude and longitude using the Google Maps Geocoding API (https://developers.google.com/maps/documentation/geocoding/intro) | |
− | + | **Public housing data: 223 entries with missing coordinates | |
+ | **Private housing data: 338 entries with missing coordinates | ||
|| | || | ||
− | + | [[File: GEOCODING.PNG|400px|center]] | |
− | |||
− | |||
− | |||
− | |||
|} | |} | ||
<br/> | <br/> |
Latest revision as of 15:46, 22 November 2016
Initial Dataset
DATASET | DESCRIPTION | DATA USED |
---|---|---|
Average Monthly Household Electricity Consumption. Link (1H): https://www.ema.gov.sg/cmsmedia/Publications_and_Statistics/Statistics/23RSU.xls; Link (2H): https://www.ema.gov.sg/cmsmedia/Publications_and_Statistics/Statistics/25RSU.xls | | |
Average Monthly Household Electricity Consumption by Postal Code (Private Apartments), 2015. Link: https://www.ema.gov.sg/cmsmedia/Publications_and_Statistics/Statistics/2RSU.xls | | |
Basic Demographics Characteristics (2015) | | |
Data Cleaning
METHOD | DESCRIPTION |
---|---|
|
|
|
|- |
|
|
|
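
The cleaning steps recorded in the revision above (stack each monthly table, remove rows with missing data, concatenate the 12 months, then merge in the private housing data) can be sketched in pandas. This is only an illustration under stated assumptions, not the project's actual workflow: the column names (`postal_code`, `housing_type`, `kwh`, `month`, `sector`), the toy values, the use of `melt` as the stand-in for "stack", and the reading of "merge" as a row-wise append are all hypothetical.

```python
import pandas as pd


def stack_month(wide: pd.DataFrame, month: int) -> pd.DataFrame:
    """Reshape one month's wide table to long form and drop incomplete rows."""
    # Assumed layout: one row per postal code, one column per housing type.
    long = wide.melt(
        id_vars="postal_code", var_name="housing_type", value_name="kwh"
    )
    # Remove rows with missing consumption values.
    long = long.dropna(subset=["kwh"])
    return long.assign(month=month).reset_index(drop=True)


def consolidate(monthly: dict, private: pd.DataFrame) -> pd.DataFrame:
    """Concatenate the stacked monthly tables, then append the private data."""
    public = pd.concat(
        [stack_month(df, m) for m, df in monthly.items()], ignore_index=True
    )
    public = public.assign(sector="public")
    private = private.assign(sector="private")
    # Assumption: "merge" here means a row-wise union of the two tables.
    return pd.concat([public, private], ignore_index=True)


# Toy inputs (hypothetical values, two months instead of twelve).
jan = pd.DataFrame(
    {"postal_code": ["560123", "560124"],
     "3-room": [250.0, None],
     "4-room": [310.0, 330.0]}
)
feb = pd.DataFrame(
    {"postal_code": ["560123", "560124"],
     "3-room": [240.0, 260.0],
     "4-room": [None, 320.0]}
)
private = pd.DataFrame(
    {"postal_code": ["238801"],
     "housing_type": ["private apartment"],
     "kwh": [520.0]}
)

combined = consolidate({1: jan, 2: feb}, private)
print(combined)
```

Comparing `len(combined)` after each stage is a simple way to reproduce the kind of row-count checks quoted in the cleaning notes (177,053 rows after concatenation, 241,766 after merging).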