Difference between revisions of "ISSS608 2017-18 T1 Assign WANG SHANG"

From Visual Analytics and Applications

Revision as of 03:16, 15 October 2017

Mini Challenge: What's happened in Smartpolis?

Background

Smartpolis is a major metropolitan area with a population of approximately two million residents. During the last few days, health professionals at local hospitals have noticed a dramatic increase in reported illnesses.

I want to use visual analytics tools to track how the illness spreads, and to give the government insights into what it can do to control the spread better.


Data Description

I have three datasets and one map of Smartpolis for analysis. The first dataset contains microblog messages collected from various devices with GPS capabilities, including laptop computers, handheld computers, and cellular phones. The other two contain population statistics and observed weather data. Some additional information is also provided in a Words file.


Data Preparation

The microblog dataset has a column that records the text each person published to the social platform, together with the creation time and location of each message. I import this dataset into JMP and use the Word function to split the location data into two columns, latitude and longitude. Then I use JMP's Text Explorer to split each text record into words and phrases with no stemming. My reasoning is that someone who is ill will usually post a message about the illness. So if a text contains a word that represents a symptom or illness, it probably means that its author has that illness. Hence, I can extract a key symptom to represent the current status of a person.
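The splitting and tokenizing steps above were done in JMP. As a rough illustration only, the same idea can be sketched in Python. The sample record, the field names, and the assumption that the location is stored as a single space-separated "latitude longitude" string are all hypothetical and not taken from the dataset.

```python
import re


def split_location(location):
    """Split a 'latitude longitude' string into two floats.

    Assumes the two values are separated by whitespace (an assumption
    about the data layout, not a documented fact).
    """
    lat_text, lon_text = location.split()
    return float(lat_text), float(lon_text)


def tokenize(text):
    """Lower-case a message and split it into word tokens, with no stemming."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


# Made-up sample record for illustration only.
sample = {"Location": "42.22717 93.33772", "text": "Woke up with a fever and chills"}

lat, lon = split_location(sample["Location"])
tokens = tokenize(sample["text"])
```

Keeping the tokens unstemmed mirrors the "no stemming" choice above, so a word such as "coughing" stays distinct from "cough".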

For example, I use the symptoms listed in the overview part of the assignment introduction page as my illness word list: flu-like, fever, chills, sweats, aches and pains, fatigue, coughing, breathing difficulty, nausea, vomiting, diarrhea, and enlarged lymph nodes. I search for each word from this list in JMP's Text Explorer, collect the matching text records, and put them into a new table. In this new table, I create a new column called Key_Symptom, whose value is the matched word.
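The keyword search was done interactively in JMP. A minimal Python sketch of the same matching logic is shown here, assuming whole-word, case-insensitive matching; the function names and sample messages are hypothetical. As in the per-word process described above, a message that mentions two symptoms produces two rows.

```python
import re

# Illness word list taken from the assignment overview.
SYMPTOMS = [
    "flu-like", "fever", "chills", "sweats", "aches and pains", "fatigue",
    "coughing", "breathing difficulty", "nausea", "vomiting", "diarrhea",
    "enlarged lymph nodes",
]


def find_symptoms(text):
    """Return every symptom term that appears in the text as a whole word or phrase."""
    lowered = text.lower()
    return [
        symptom
        for symptom in SYMPTOMS
        if re.search(r"\b" + re.escape(symptom) + r"\b", lowered)
    ]


def build_symptom_rows(messages):
    """Create one row per (message, matched symptom) pair."""
    rows = []
    for message in messages:
        for symptom in find_symptoms(message):
            rows.append({"text": message, "Key_Symptom": symptom})
    return rows


# Made-up messages for illustration only.
rows = build_symptom_rows([
    "I have a Fever and nausea today",
    "Lovely weather in Smartpolis",
])
```

Whole-word matching is a design choice in this sketch: it stops "fever" from matching inside "feverish", which may or may not be the behaviour wanted for this analysis.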

pic1. finding key symptom

After repeating the same process for all words in my illness word list, I concatenate the resulting tables to generate the table used for visualization. Before importing it into Tableau, I also create a new column named DayNight based on the Created_at column. In this column, the value "1" means Day (the hour of the created time is between 6 and 17), and the value "2" means Night (the hour is less than 6 or greater than 17). This completes the data preparation. I will use this table together with the weather and population data for the visual analysis.
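The DayNight rule can be written as a small function. This is only a sketch in Python (the column was actually created in JMP), and the timestamp format string is an assumption about how Created_at is stored; the sample timestamps are made up.

```python
from datetime import datetime


def day_night(created_at, fmt="%m/%d/%Y %H:%M"):
    """Return 1 (Day) when the hour is between 6 and 17 inclusive, else 2 (Night).

    The default format string is an assumed layout for Created_at.
    """
    hour = datetime.strptime(created_at, fmt).hour
    return 1 if 6 <= hour <= 17 else 2


# Made-up timestamps for illustration only.
labels = [day_night(ts) for ts in ["10/15/2017 05:59", "10/15/2017 06:00",
                                   "10/15/2017 17:59", "10/15/2017 18:00"]]
```

Because the rule uses the hour only, every message posted from 17:00 to 17:59 still counts as Day.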


Tasks & Solutions

Task 1: Origin and Epidemic Spread

In my opinion,