Data Preparation Process
Revision as of 20:25, 15 October 2017 by Chen.zhou.2016
Raw data processing
1. Exclude rows with missing values
2. Split the location column
- Split the location column into two columns, named N and W
- Recode the values in column W as negative so that the points match the orientation of the given map
3. Label the area zone for each record
- Use Graph Builder to plot all the points on the map: choose the map as the background in Graph Builder, then select N as the y-axis and W as the x-axis
- Use the Lasso tool to select the points zone by zone, and use the magnifier tool to double-check the points located on zone borders
- Label the selected rows with the corresponding zone
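For readers who want to reproduce steps 1 and 2 outside the point-and-click tool, the following is a minimal sketch in Python. It rests on several assumptions that the page does not state: each record is a dictionary, the location field is called `location`, and it holds two numbers separated by whitespace (north value first, west value second). The function name `prepare_rows` is hypothetical. Step 3 was done by hand with the Lasso tool and is not reproduced here.

```python
def prepare_rows(rows, location_key="location"):
    """Drop incomplete rows and split the location into N and W.

    Assumptions (not from the original page): rows are dicts, and the
    location value looks like "42.22717 93.33772" (north, then west).
    """
    cleaned = []
    for row in rows:
        # Step 1: exclude any row that has a missing (None or blank) value.
        if any(v is None or str(v).strip() == "" for v in row.values()):
            continue

        # Step 2a: split the location column into two parts, N and W.
        parts = str(row[location_key]).split()
        if len(parts) != 2:
            continue  # assumption: an unparseable location counts as missing
        try:
            n, w = float(parts[0]), float(parts[1])
        except ValueError:
            continue  # assumption: non-numeric coordinates count as missing

        # Step 2b: recode W as negative so it plots on the map's x-axis.
        new_row = dict(row)
        new_row["N"] = n
        new_row["W"] = -abs(w)
        cleaned.append(new_row)
    return cleaned


if __name__ == "__main__":
    sample = [
        {"id": 1, "location": "42.22717 93.33772", "text": "feeling sick"},
        {"id": 2, "location": "", "text": "no location here"},
    ]
    for record in prepare_rows(sample):
        print(record["id"], record["N"], record["W"])
```

Treating a malformed location as a missing value is a design choice made for this sketch; the original process only says that rows with missing values are excluded.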
Text mining
Because the microblogs were posted by the general public, they cover many topics, whereas this assignment is interested only in those related to flu. It is therefore important to extract the records that relate to the flu topic.
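One simple way to extract flu-related records is a keyword filter. The sketch below is only an illustration, not the method used on this page: the keyword list `FLU_KEYWORDS` and the function names are hypothetical and would need to be tuned against the actual microblog text.

```python
import re

# Hypothetical keyword list; the original page does not specify one.
FLU_KEYWORDS = ("flu", "fever", "cough", "chills", "sore throat")

# Match whole words or phrases only, ignoring case, so that "influence"
# does not count as a hit for "flu".
_FLU_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in FLU_KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def is_flu_related(text):
    """Return True if the text mentions any of the flu keywords."""
    return bool(_FLU_PATTERN.search(text))


def filter_flu_posts(posts):
    """Keep only the posts whose text is flu related."""
    return [p for p in posts if is_flu_related(p)]


if __name__ == "__main__":
    posts = ["Bad fever today", "Nice weather in the park"]
    print(filter_flu_posts(posts))
```

Whole-word matching keeps false positives low but misses inflected forms such as "coughing"; adding those variants to the list, or stemming the text first, would be a reasonable next step.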