Difference between revisions of "ISS608 2017-18 T1 Assign KyonghwanKim Data Preparation"
Kh.kim.2016 (talk | contribs) |
Kh.kim.2016 (talk | contribs) |
||
Line 47: | Line 47: | ||
*Created_at column is splitted to Date and Time columns. Date column is used in other analytics.<br/> | *Created_at column is splitted to Date and Time columns. Date column is used in other analytics.<br/> | ||
*Also, Location column is splitted to Latitude and Longitude columns. These data is used to plot in Vastropolis map. | *Also, Location column is splitted to Latitude and Longitude columns. These data is used to plot in Vastropolis map. | ||
− | |[[file:microblog_split.png]] | + | |[[file:microblog_split.png|450px]] |
|- | |- | ||
|'''2. Outliers'''<br/> | |'''2. Outliers'''<br/> | ||
Line 53: | Line 53: | ||
*Also, there are 6 items with Longitude outside of given map range. They are removed as well so that all data are within parameters. | *Also, there are 6 items with Longitude outside of given map range. They are removed as well so that all data are within parameters. | ||
*Total 27 rows are removed and 1,023,050 rows are used for analysis with file name "Microblog_Final.csv". | *Total 27 rows are removed and 1,023,050 rows are used for analysis with file name "Microblog_Final.csv". | ||
− | |[[file:missing_time.png]] [[file:outlier_Longitude.png]] | + | |[[file:missing_time.png|150px]] [[file:outlier_Longitude.png|300px]] |
|- | |- | ||
|} | |} | ||
==2. Key Words== | ==2. Key Words== | ||
− | + | Some of key words are given. However, there may be additional key words to enhance accuracy of analysis. <br/> | |
− | + | ||
+ | [[file:outbreak.png|500px]] | ||
+ | |||
+ | Above graph shows text distribution with key word ''"flu"'' and ''"cold"''. Text traffic shoots up from May 18th and remain high until 20th. Text distribution between 3 days of outbreak using JMP text explorer is shown below. | ||
+ | |||
+ | [[file:could18.png|250px]][[file:could19.png|250px]][[file:could20.png|250px]] | ||
+ | |||
{| class="wikitable" | {| class="wikitable" | ||
|- | |- | ||
Line 69: | Line 75: | ||
|- | |- | ||
|} | |} | ||
− | |||
− | |||
− |
Revision as of 05:23, 15 October 2017
Vastropolis Epidemic Report
Microblog
1. Data cleaning
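The cleaning steps recorded in the revision text (splitting Created_at into Date and Time, splitting Location into Latitude and Longitude, and dropping rows with an unusable time or a Longitude outside the map) could be sketched roughly as follows. This is a minimal illustration, not the author's actual workflow: the timestamp format, the space-separated Location format, the helper names `split_row` and `clean`, and the longitude bounds passed in are all assumptions.

```python
from datetime import datetime

# Assumed formats (not confirmed by the source):
#   Created_at: "5/18/2011 13:26"    Location: "42.22717 93.33772"
CREATED_AT_FORMAT = "%m/%d/%Y %H:%M"


def split_row(row):
    """Split Created_at into Date/Time and Location into Latitude/Longitude.

    Returns None when the timestamp or the location cannot be parsed,
    for example when the time part is missing.
    """
    try:
        stamp = datetime.strptime(row["Created_at"].strip(), CREATED_AT_FORMAT)
        lat_text, lon_text = row["Location"].split()
        latitude, longitude = float(lat_text), float(lon_text)
    except (ValueError, KeyError):
        return None
    cleaned = dict(row)
    cleaned.update(
        Date=stamp.date().isoformat(),
        Time=stamp.strftime("%H:%M"),
        Latitude=latitude,
        Longitude=longitude,
    )
    return cleaned


def clean(rows, lon_min, lon_max):
    """Keep rows that parse and whose Longitude lies within [lon_min, lon_max].

    lon_min and lon_max are placeholders for the map range; the real
    bounds are not stated in the source.
    """
    kept, dropped = [], 0
    for row in rows:
        cleaned = split_row(row)
        if cleaned is None or not (lon_min <= cleaned["Longitude"] <= lon_max):
            dropped += 1
            continue
        kept.append(cleaned)
    return kept, dropped
```

Returning the number of dropped rows alongside the kept rows makes it easy to check the result against the reported total of 27 removed rows.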
2. Key Words
Some key words are given. However, additional key words may improve the accuracy of the analysis.
The graph above shows the distribution of texts containing the key words "flu" and "cold". Text traffic shoots up on May 18th and remains high until the 20th. The text distribution for the 3 days of the outbreak, produced with JMP Text Explorer, is shown below.
Diagnosis | Symptoms |
flu, cold | fever, chill, fatigue, cough, breath, nausea, vomit, diarrhea, sweat, pain, sore throat, muscle, letharg (-y or -ic), runny nose, doctor, sick |
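As a rough illustration of how the key words in the table could be used to count outbreak-related texts per day, here is a minimal sketch. The analysis itself was done in JMP, so this code is only an assumed equivalent; the function names `build_pattern` and `daily_counts` and the `(date, text)` input shape are hypothetical. Stems are matched at the start of a word, so "vomit" also matches "vomiting" and "letharg" covers both "lethargy" and "lethargic".

```python
import re
from collections import Counter

# Key words taken from the table above; "letharg" is kept as a stem.
DIAGNOSIS = ["flu", "cold"]
SYMPTOMS = [
    "fever", "chill", "fatigue", "cough", "breath", "nausea", "vomit",
    "diarrhea", "sweat", "pain", "sore throat", "muscle", "letharg",
    "runny nose", "doctor", "sick",
]


def build_pattern(stems):
    """Compile a case-insensitive regex matching any stem at a word start."""
    # Longer stems first so multi-word phrases are tried before short ones.
    ordered = sorted(stems, key=len, reverse=True)
    alternatives = "|".join(re.escape(stem) for stem in ordered)
    return re.compile(r"\b(?:" + alternatives + r")", re.IGNORECASE)


def daily_counts(posts, pattern):
    """Count posts per date whose text contains at least one key word.

    posts is an iterable of (date, text) pairs.
    """
    counts = Counter()
    for date, text in posts:
        if pattern.search(text):
            counts[date] += 1
    return counts
```

Prefix matching is a deliberate simplification: it also picks up unrelated words such as "fluent" or "painting", so a stricter whole-word match may be preferable for short stems like "flu" and "pain".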