Difference between revisions of "ZAN Project Findings"
Line 43: | Line 43: | ||
<div style="background: #F5FFFA; padding: 12px; font-family: Arimo; font-size: 18px; font-weight: bold; line-height: 1em; text-indent: 15px; border-left: #2E8B57 solid 32px;"><font color="##4682B4">Data Cleaning</font></div> | <div style="background: #F5FFFA; padding: 12px; font-family: Arimo; font-size: 18px; font-weight: bold; line-height: 1em; text-indent: 15px; border-left: #2E8B57 solid 32px;"><font color="##4682B4">Data Cleaning</font></div> | ||
<br/> | <br/> | ||
− | The data had 77,205 records initially. | + | The data had 77,205 records initially. The following diagram shows our team's general data cleaning procedures. |
− | + | <center> | |
+ | [[Image:AY2017_ZAN_Data_Cleaning.png|700px]] | ||
+ | </center> | ||
+ | <br/> | ||
+ | After data cleaning, the data has 63,511 records. | ||
+ | <br/> | ||
<div align="left"> | <div align="left"> | ||
<div style="background: #F5FFFA; padding: 12px; font-family: Arimo; font-size: 18px; font-weight: bold; line-height: 1em; text-indent: 15px; border-left: #2E8B57 solid 32px;"><font color="##4682B4">Data Exploration</font></div> | <div style="background: #F5FFFA; padding: 12px; font-family: Arimo; font-size: 18px; font-weight: bold; line-height: 1em; text-indent: 15px; border-left: #2E8B57 solid 32px;"><font color="##4682B4">Data Exploration</font></div> | ||
<br/> | <br/> | ||
+ | Due to the sensitivity and confidentiality of the data, please refer to the elearn dropbox or send us an email. | ||
<div align="left"> | <div align="left"> | ||
<div style="background: #F5FFFA; padding: 12px; font-family: Arimo; font-size: 18px; font-weight: bold; line-height: 1em; text-indent: 15px; border-left: #2E8B57 solid 32px;"><font color="##4682B4">Data Modelling</font></div> | <div style="background: #F5FFFA; padding: 12px; font-family: Arimo; font-size: 18px; font-weight: bold; line-height: 1em; text-indent: 15px; border-left: #2E8B57 solid 32px;"><font color="##4682B4">Data Modelling</font></div> | ||
<br/> | <br/> | ||
+ | Due to the nature of the data, our team has decided to prepare three separate analytical sandboxes for the models. |
Revision as of 11:16, 22 February 2017
Mid-Term Progress | Final Progress (new!)
Data Cleaning
The data had 77,205 records initially. The following diagram shows our team's general data cleaning procedures.
After data cleaning, the data has 63,511 records.
Data Exploration
Due to the sensitivity and confidentiality of the data, please refer to the elearn dropbox or send us an email.
Data Modelling
Due to the nature of the data, our team has decided to prepare three separate analytical sandboxes for the models.