Difference between revisions of "ISSS608 2017-18 T1 Assign ZHANG LIDAN"

From Visual Analytics and Applications
 
(7 intermediate revisions by the same user not shown)
Line 1: Line 1:
 
<div style=background:#2B3820 border:#A3BFB1>
 
<div style=background:#2B3820 border:#A3BFB1>
 
[[Image:title_momo.png|150px]]  
 
[[Image:title_momo.png|150px]]  
<font size = 5; color="#FFFFFF">To be a Visual Detective</font>
+
<font size = 5; color="#FFFFFF">Epidemic Spread in Smartpolis</font>
 
</div>
 
</div>
 
<!--MAIN HEADER -->
 
<!--MAIN HEADER -->
Line 7: Line 7:
 
| style="font-family:Century Gothic; font-size:100%; solid #000000; background:#B0620E; text-align:center;" width="25%" |  
 
| style="font-family:Century Gothic; font-size:100%; solid #000000; background:#B0620E; text-align:center;" width="25%" |  
 
;
 
;
[[ISSS608_2017-18_T1_Assign_ZHANG_Lidan| <font color="#FFFFFF">Background</font>]]
+
[[ISSS608_2017-18_T1_Assign_ZHANG_LIDAN| <font color="#FFFFFF">Background</font>]]
  
 
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#2B3820; text-align:center;" width="25%" |  
 
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#2B3820; text-align:center;" width="25%" |  
Line 25: Line 25:
 
==Background==
 
==Background==
 
===Motivation===
 
===Motivation===
Over the past few days, local hospitals in Vastopolis City have reported a sharp increase in illness. To help city officials mitigate the impact of these symptoms, I explore what happened using the microblog comments together with the weather and event data in the given datasets.
+
Smartpolis is a major metropolitan area with a population of approximately two million residents. During the last few days, health professionals at local hospitals have noticed a dramatic increase in reported illnesses.<br/>
 +
Observed symptoms are largely flu-like and include fever, chills, sweats, aches and pains, fatigue, coughing, breathing difficulty, nausea and vomiting, diarrhea, and enlarged lymph nodes. More recently, there have been several deaths believed to be associated with the current outbreak. City officials fear a possible epidemic and are mobilizing emergency management resources to mitigate the impact. <br/>
 +
To clearly explore the origin and epidemic spread in Smartpolis, it is essential to identify approximately where the outbreak started on the map (ground zero location) and outline the affected area.<br/>
 +
In addition, a hypothesis is needed on how the infection is being transmitted. For example, is the method of transmission person-to-person, airborne, waterborne, or something else? Is the outbreak contained? Is it necessary for emergency management personnel to deploy treatment resources outside the affected area?
 +
 
 
===About Dataset===
 
===About Dataset===
There are 3 datasets given in csv format. Here, I mainly focus on the Microblog dataset at first.  
+
There are 3 datasets given in csv format and one map in png format.<br/>
 +
[[File:Dataset.PNG|200px]]
 +
* Microblog<br/>
 +
Attributes:<br/>
 +
ID – personal identifier of the individual posting the message<br/>
 +
Created_at – date and time of the post<br/>
 +
Location – latitude and longitude coordinates of the mobile device at the time of post<br/>
 +
Text – the posted message<br/>
 +
[[File:Microblogs.PNG|400px]]
 +
* Population<br/>
 +
Attributes:<br/>
 +
Zone_Name – the name of one of the 13 city zones within the metropolitan area<br/>
 +
Population_Density – the number of residents in the zone<br/>
 +
Daytime_Population – the estimated population in the zone due to commuting during work hours<br/>
 +
[[File:Weather.PNG|200px]]
 +
* Weather<br/>
 +
Attributes:<br/>
 +
Date – date of observed weather by weather station<br/>
 +
Weather – weather conditions for a particular day<br/>
 +
Average_Wind_Speed – measured in miles per hour<br/>
 +
Wind_Direction – the direction from which the wind is blowing or from which it originates<br/>
 +
[[File:Population.PNG|250px]]
 +
* Smartpolis<br/>
 +
Attributes:<br/>
 +
Longitude: -93.5673W ~ -93.1923W<br/>
 +
Latitude: 42.3017N ~ 42.1609N<br/>
 +
[[File:MAP.PNG|400px]]
 
===Tools Application===
 
===Tools Application===
I use JMP to do the data pre-processing, then plot the data by using Tableau.
+
* JMP 13: Data pre-processing<br/>
 +
* Tableau 10.3: Data Visualization<br/>
 
===Work Flow===
 
===Work Flow===
[[Image:workflow.PNG|500px]]
+
[[Image:workflow.PNG|600px]]
 
 
==Data Preparation==
 
I first import the microblog dataset into JMP. This dataset contains a lot of useful information. For example, the location coordinates and the timestamp show where and when each message was posted. Then, by tokenizing and stemming the words in each message, I can filter for high-frequency words and flulike keywords for further data exploration.
 
The microblogs dataset contains 1,023,077 rows.
 
First, I split the location into separate latitude and longitude columns. Because these locations are in the western hemisphere, I then convert the longitude values to negative numbers.
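
The split itself was done in JMP. As a minimal sketch of the same step in Python, assuming each Location value is a string of the form "latitude longitude" separated by whitespace, with the longitude stored unsigned (the function name and the sample value are hypothetical):

```python
from typing import Tuple


def split_location(location: str) -> Tuple[float, float]:
    """Split a 'latitude longitude' string into signed coordinates.

    Assumption: the two values are whitespace-separated and the
    longitude is an unsigned western-hemisphere value.
    """
    lat_text, lon_text = location.split()
    latitude = float(lat_text)
    # Western longitudes are negative in the signed convention.
    longitude = -abs(float(lon_text))
    return latitude, longitude


# Hypothetical sample value inside the Smartpolis map bounds.
print(split_location("42.2250 93.3900"))
```

Using `-abs(...)` keeps the function safe to run twice on the same value, since an already negative longitude stays negative.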
 
Next, to exclude irrelevant messages, I create a subset that contains only messages mentioning the main flulike symptoms, such as chill, flu, fever, sweat, pain, fatigue, ache, cough, breath, nausea, vomit, diarrhea. Here, I use the Text Explorer in JMP to generate these new columns.
 
[[File:1.png|600px|center]]
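
The keyword columns were produced with the JMP Text Explorer. The following Python sketch illustrates the idea only and is not the JMP procedure: it uses simple prefix matching as a stand-in for stemming, and the function name is hypothetical.

```python
import re

# Symptom stems taken from the list above.
SYMPTOM_STEMS = ["chill", "flu", "fever", "sweat", "pain", "fatigue",
                 "ache", "cough", "breath", "nausea", "vomit", "diarrhea"]

# \b anchors each stem at the start of a word; \w* accepts any suffix,
# so "coughing" counts as "cough" and "chills" as "chill".
SYMPTOM_PATTERN = re.compile(
    r"\b(" + "|".join(SYMPTOM_STEMS) + r")\w*", re.IGNORECASE
)


def symptoms_in(text):
    """Return the sorted list of symptom stems mentioned in a message."""
    return sorted({m.group(1).lower() for m in SYMPTOM_PATTERN.finditer(text)})


print(symptoms_in("Coughing all night, fever and chills"))
```

Prefix matching is crude: "flu" also matches unrelated words such as "fluent", and "ache" does not match "headache" because the stem must start the word. A proper stemmer or a curated word list would reduce both kinds of error.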
 
Next, I create a bar chart of the number of microblogs that include the symptom words. The chart shows a sharp increase in frequency from May 18 to May 20, 2011.
 
[[File:2.png|1000px|center]]
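
The daily counts behind such a bar chart can be sketched in Python as follows. The timestamp layout of Created_at is an assumption (month/day/year hour:minute), and the sample timestamps are made up for illustration.

```python
from collections import Counter
from datetime import date, datetime

# Assumed layout of the Created_at field; adjust if the CSV differs.
TIMESTAMP_FORMAT = "%m/%d/%Y %H:%M"


def posts_per_day(timestamps):
    """Count posts per calendar day from Created_at strings."""
    return Counter(
        datetime.strptime(t, TIMESTAMP_FORMAT).date() for t in timestamps
    )


# Hypothetical timestamps, not taken from the dataset.
sample = ["5/17/2011 23:10", "5/18/2011 8:05", "5/18/2011 9:30"]
for day, count in sorted(posts_per_day(sample).items()):
    print(day, count)
```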
 
To explore what happened from May 18 to May 20, I reload the microblog dataset into JMP. Reading the message text, I find words related not only to flulike symptoms but also to stomach problems. I therefore generate two datasets: one containing flulike symptoms such as breath, cough, fatigue, fever, flu, and pneumonia, and another containing stomach symptoms such as diarrhea, nausea, stomach, and vomit.
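
A rough Python equivalent of this two-way split, assuming prefix matching on lowercase words is an acceptable stand-in for the JMP text processing (the dictionary and function names are hypothetical):

```python
import re

# Keyword stems for the two symptom groups described above.
SYMPTOM_GROUPS = {
    "flulike": ("breath", "cough", "fatigue", "fever", "flu", "pneumonia"),
    "stomach": ("diarrhea", "nausea", "stomach", "vomit"),
}


def symptom_groups(text):
    """Return the set of symptom groups a message belongs to."""
    words = re.findall(r"[a-z]+", text.lower())
    return {
        group
        for group, stems in SYMPTOM_GROUPS.items()
        # str.startswith accepts a tuple and checks every stem in it.
        if any(word.startswith(stems) for word in words)
    }


print(symptom_groups("Can't stop coughing and my stomach hurts"))
```

A message can fall into both groups, so the two resulting datasets may overlap.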
 
==Origin and epidemic spread==
 
===Flulike problems distribution===
 
29,243 rows mention at least one flulike symptom. After importing the dataset and the map and adjusting the latitude and longitude coordinates, I use the Pages function to observe how the distribution changes hour by hour from April 30 to May 20.
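
Aligning the points with the map image comes down to a linear mapping from the map bounds to image coordinates. The sketch below assumes the image edges coincide exactly with the stated Smartpolis bounds and that a flat linear mapping is accurate enough over such a small area. The image width and height are hypothetical parameters.

```python
# Map bounds as stated for the Smartpolis map (signed degrees).
WEST, EAST = -93.5673, -93.1923
NORTH, SOUTH = 42.3017, 42.1609


def to_pixel(longitude, latitude, width, height):
    """Map a coordinate to (x, y) pixels, with the origin at the top-left."""
    x = (longitude - WEST) / (EAST - WEST) * width
    # Image y grows downward, so measure from the northern edge.
    y = (NORTH - latitude) / (NORTH - SOUTH) * height
    return x, y


# The map corners should land on the image corners.
print(to_pixel(WEST, NORTH, 1000, 500))
print(to_pixel(EAST, SOUTH, 1000, 500))
```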
 
At 8 am on May 18, there was an outbreak of flulike disease in Downtown. I suspect it may have occurred at the Vastopolis Dome and the Convention Center.
 
[[File:3.png|600px|center]]
 
To confirm whether the outbreak occurred in these areas, I build a word cloud to identify the location.
 
[[File:4.png|500px|center]]
 
===Reasons for the flulike outbreak===
 
The messages from May 17 and 18 suggest that an art festival or concert may have taken place in Downtown, and that food was supplied there. The outbreak may therefore have started through the food or through person-to-person contact.
 
According to the weather dataset, the wind blew from the west from May 16 to May 19. People in Eastside were infected with the flulike disease, whereas people in the western part of the city were not. I therefore conclude that the flu spread mainly through airborne transmission.
 
[[File:5.png|600px|center]]
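
The downwind reasoning can be checked numerically. The sketch below tests whether a point lies downwind of a suspected source, given the direction the wind blows from. It assumes compass-letter wind directions, a flat-earth approximation over the city, and a 45-degree tolerance cone. All names and sample coordinates are hypothetical.

```python
import math

# Compass bearing (degrees clockwise from north) the wind blows FROM.
WIND_FROM_DEGREES = {"N": 0, "NE": 45, "E": 90, "SE": 135,
                     "S": 180, "SW": 225, "W": 270, "NW": 315}


def is_downwind(source, point, wind_from, tolerance_degrees=45.0):
    """Return True if point lies downwind of source.

    source and point are (latitude, longitude) pairs in signed degrees.
    """
    src_lat, src_lon = source
    pt_lat, pt_lon = point
    # Local flat approximation: shrink longitude differences by cos(latitude).
    north = pt_lat - src_lat
    east = (pt_lon - src_lon) * math.cos(math.radians(src_lat))
    if north == 0 and east == 0:
        return False
    bearing_to_point = math.degrees(math.atan2(east, north)) % 360
    # The wind travels toward the opposite of the bearing it comes from.
    travel_bearing = (WIND_FROM_DEGREES[wind_from] + 180) % 360
    # Smallest angle between the two bearings, in the range 0 to 180.
    difference = abs((bearing_to_point - travel_bearing + 180) % 360 - 180)
    return difference <= tolerance_degrees


source = (42.23, -93.40)  # hypothetical ground zero
print(is_downwind(source, (42.23, -93.30), "W"))  # point to the east
print(is_downwind(source, (42.23, -93.50), "W"))  # point to the west
```

With a west wind, only points east of the source fall inside the cone, which matches the pattern of infections in Eastside but not in the western part of the city.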
 
After 6 pm, most people went home from work, so other areas were gradually infected. The following bar chart shows that the daytime population increases in Downtown, Westside, and Uptown. It is therefore very likely that people who worked in Downtown and Uptown but lived in other areas infected their families after work.
 
[[File:6.png|600px|center]]
 
[[File:7.png|700px|center]]
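
The commuter effect can be expressed as the difference between daytime and resident population per zone. In this small sketch the zone names and numbers are made-up placeholders, not values from the Population dataset.

```python
def daytime_gain(zones):
    """Rank zones by daytime population minus resident population.

    zones maps a zone name to (resident_population, daytime_population).
    """
    gains = {
        name: daytime - residents
        for name, (residents, daytime) in zones.items()
    }
    # Largest net inflow of commuters first.
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Placeholder figures for illustration only.
sample_zones = {"Zone A": (1000, 1500), "Zone B": (2000, 1800)}
print(daytime_gain(sample_zones))
```

Zones with a positive gain receive commuters during work hours, and zones with a negative gain send them out.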
 
From 12 am, most people were going to sleep, so fewer messages mentioned the flu. However, another outbreak of flu began at 2 am on May 19. I create another chart to show how each symptom changes from hour to hour. The following chart confirms an outbreak starting at 2 am on May 19: a large number of people caught the flu, and more people were coughing and had a fever. In addition, the number of flulike reports did not decrease over the next 48 hours.
 
[[File:8.png|600px|center]]
 
From 7 am on May 20, many people went to hospital because of serious flulike symptoms, as the following map shows.
 
[[File:9.png|700px|center]]
 
The good news is that the wind speed was lower on May 20, so the number of infected people may increase more slowly.
 
[[File:10.png|600px|center]]
 
===Stomach problems distribution===

Another outbreak, this time of stomach problems, began at 2 am on May 19. It occurred below the 610 Highway and along the downstream stretch of the Vast River.

[[File:11.png|600px|center]]

==Reference==
 
==Feedback==
 

Latest revision as of 17:27, 15 October 2017

