Difference between revisions of "ISSS608 2017-18 T1 Assign ZHANG LIDAN"

From Visual Analytics and Applications
 
(14 intermediate revisions by the same user not shown)
<div style="background: #3b3b3b; padding: 15px; font-weight: bold; line-height: 1em; text-indent: 15px; border-left: #a9a9a9 solid 32px; font-size: 20px"><font color="white">Assignment 1 - To be a Visual Detective: D</font></div>
+
<div style=background:#2B3820 border:#A3BFB1>
 +
[[Image:title_momo.png|150px]]
 +
<font size = 5; color="#FFFFFF">Epidemic Spread in Smartpolis</font>
 +
</div>
 +
<!--MAIN HEADER -->
 +
{|style="background-color:#1B338F;" width="100%" cellspacing="0" cellpadding="0" valign="top" border="0"  |
 +
| style="font-family:Century Gothic; font-size:100%; solid #000000; background:#B0620E; text-align:center;" width="25%" |
 +
;
 +
[[ISSS608_2017-18_T1_Assign_ZHANG_LIDAN| <font color="#FFFFFF">Background</font>]]
  
==Background==
+
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#2B3820; text-align:center;" width="25%" |
 +
;
 +
[[ISSS608_2017-18_T1_Assign_ZHANG_Lidan_Data_Preparation| <font color="#FFFFFF">Data Preparation</font>]]
  
==Data Preparation==
+
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#2B3820; text-align:center;" width="25%" |
To make the data easier to work with, I first import the microblog dataset into JMP. The dataset contains a lot of useful information. For example, I can use the location coordinates and the timestamp to identify where and when each message was posted. Then, by tokenizing and stemming the words in each message, I can filter for high-frequency words and flu-related keywords for further data exploration.
+
;
The microblogs dataset contains 1,023,077 rows.
+
[[ISSS608_2017-18_T1_Assign_ZHANG_Lidan_Visualization| <font color="#FFFFFF">Data Visualization</font>]]
First, I split the location into latitude and longitude. Then, because these locations are in the western hemisphere, I convert the longitude values to negative numbers.
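
The same step can be sketched outside JMP. This is only an illustration, not the JMP procedure itself: the helper name `split_location` is hypothetical, and it assumes the Location field holds two whitespace-separated numbers, latitude first, with the longitude stored as an unsigned magnitude.

```python
from typing import Tuple


def split_location(location: str) -> Tuple[float, float]:
    """Return (latitude, longitude) parsed from a 'lat lon' string.

    Assumption: whitespace-separated values, latitude first, and a
    longitude stored without its sign.
    """
    lat_text, lon_text = location.split()
    latitude = float(lat_text)
    # Western-hemisphere longitudes are negative in signed decimal degrees.
    longitude = -abs(float(lon_text))
    return latitude, longitude


if __name__ == "__main__":
    # Made-up sample value inside the map extent, for illustration only.
    print(split_location("42.22717 93.33772"))
```

Using `-abs(...)` rather than a plain negation keeps the result negative even if some rows already carry a minus sign.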
 
Next, to exclude irrelevant information, I create a subset of the data containing the main flu-like symptoms, such as chill, flu, fever, sweat, pain, fatigue, ache, cough, breath, nausea, vomit, and diarrhea. I use the Text Explorer in JMP to generate these new columns.
 
[[File:1.png|600px|center]]
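
As a rough illustration of what these indicator columns contain, the sketch below flags each symptom keyword in a message. It is not the JMP Text Explorer algorithm: the function name `symptom_flags` is hypothetical, and simple prefix matching stands in for stemming, so it will also match unrelated words such as "fluent" or "paint".

```python
import re

# Symptom stems taken from the list above.
SYMPTOM_STEMS = [
    "chill", "flu", "fever", "sweat", "pain", "fatigue",
    "ache", "cough", "breath", "nausea", "vomit", "diarrhea",
]


def symptom_flags(text):
    """Return {stem: 0 or 1} marking which symptom stems occur in text.

    Assumption: a token matches a stem when it starts with that stem,
    which is a crude substitute for real stemming.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    return {
        stem: int(any(token.startswith(stem) for token in tokens))
        for stem in SYMPTOM_STEMS
    }


if __name__ == "__main__":
    # Made-up message, for illustration only.
    print(symptom_flags("Coughing all night, fever and chills"))
```

Rows where every flag is 0 can then be dropped to form the symptom subset.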
 
Next, I create a bar chart to display the frequency of microblogs that include the symptom words. The chart shows a sharp increase in frequency from May 18 to May 20, 2011.
 
[[File:2.png|1000px|center]]
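
The counts behind such a bar chart can be sketched as follows. The function name `daily_counts` is hypothetical, and the timestamp format `%m/%d/%Y %H:%M` is an assumption about how Created_at is stored; adjust it to the actual file.

```python
from collections import Counter
from datetime import date, datetime


def daily_counts(timestamps, fmt="%m/%d/%Y %H:%M"):
    """Count posts per calendar day.

    Assumption: each timestamp string follows fmt.
    """
    days = (datetime.strptime(stamp, fmt).date() for stamp in timestamps)
    return Counter(days)


if __name__ == "__main__":
    # Made-up timestamps, for illustration only.
    sample = ["5/17/2011 23:10", "5/18/2011 8:05", "5/18/2011 9:30"]
    counts = daily_counts(sample)
    for day in sorted(counts):
        print(day.isoformat(), counts[day])
```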
 
To explore what happened from May 18 to May 20, I reload the microblog dataset into JMP. Looking at the words in the text, I find that they relate not only to flu-like symptoms but also to stomach problems. I then generate two datasets: one containing flu-like symptoms such as breath, cough, fatigue, fever, flu, and pneumonia, and another containing stomach symptoms such as diarrhea, nausea, stomach, and vomit.
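
The split into two datasets can be illustrated with a small sketch. The function name `symptom_groups` and the group labels are hypothetical, and prefix matching again stands in for the stemming done in JMP.

```python
import re

# Keyword stems for the two symptom groups described above.
FLU_STEMS = ("breath", "cough", "fatigue", "fever", "flu", "pneumonia")
STOMACH_STEMS = ("diarrhea", "nausea", "stomach", "vomit")


def symptom_groups(text):
    """Return the set of symptom groups mentioned in a message.

    Assumption: a token belongs to a group when it starts with one of
    that group's stems; a message may fall into both groups or neither.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    groups = set()
    if any(token.startswith(FLU_STEMS) for token in tokens):
        groups.add("flu-like")
    if any(token.startswith(STOMACH_STEMS) for token in tokens):
        groups.add("stomach")
    return groups


if __name__ == "__main__":
    # Made-up messages, for illustration only.
    for message in ("Bad cough and fever today",
                    "vomiting since lunch, stomach hurts"):
        print(message, "->", symptom_groups(message))
```

Each message is then written to the dataset of every group it belongs to.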
 
==Origin and epidemic spread==
 
===Flulike problems distribution===
 
There are 29,243 rows that mention at least one of the flu-like symptoms. After importing the dataset and the map and adjusting the latitude and longitude coordinates, I use the Pages function to observe how the distribution changes hour by hour from April 30 to May 20.
 
At 8 am on May 18, there was an outbreak of flu-like illness in Downtown. I suspect it may have started at the Vastopolis Dome and the Convention Center.
 
[[File:3.png|600px|center]]
 
  
==reference==
+
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#2B3820; text-align:center;" width="25%" |
 +
;
 +
[[ISSS608_2017-18_T1_Assign_ZHANG_Lidan_Conclusion| <font color="#FFFFFF">Conclusion</font>]]
 +
|  &nbsp;
 +
|}
 +
<br/>
 +
==Background==
 +
===Motivation===
 +
Smartpolis is a major metropolitan area with a population of approximately two million residents. During the last few days, health professionals at local hospitals have noticed a dramatic increase in reported illnesses.<br/>
 +
Observed symptoms are largely flu-like and include fever, chills, sweats, aches and pains, fatigue, coughing, breathing difficulty, nausea and vomiting, diarrhea, and enlarged lymph nodes. More recently, there have been several deaths believed to be associated with the current outbreak. City officials fear a possible epidemic and are mobilizing emergency management resources to mitigate the impact. <br/>
 +
To trace the origin and spread of the epidemic in Smartpolis, it is essential to identify approximately where the outbreak started on the map (ground zero location) and outline the affected area.<br/>
 +
In addition, a hypothesis is needed on how the infection is being transmitted. For example, is the method of transmission person-to-person, airborne, waterborne, or something else? Is the outbreak contained? Is it necessary for emergency management personnel to deploy treatment resources outside the affected area?
  
==feedback==
+
===About Dataset===
 +
Three datasets are provided in CSV format, along with one map in PNG format.<br/>
 +
[[File:Dataset.PNG|200px]]
 +
* Microblog<br/>
 +
Attributes:<br/>
 +
ID – personal identifier of the individual posting the message<br/>
 +
Created_at – date and time of the post<br/>
 +
Location – latitude and longitude coordinates of the mobile device at the time of post<br/>
 +
Text – the posted message<br/>
 +
[[File:Microblogs.PNG|400px]]
 +
* Population<br/>
 +
Attributes:<br/>
 +
Zone_Name – the name of one of the 13 city zones within the metropolitan area<br/>
 +
Population_Density – the number of residents in the zone<br/>
 +
Daytime_Population – the estimated population in the zone due to commuting during work hours<br/>
 +
[[File:Weather.PNG|200px]]
 +
* Weather<br/>
 +
Attributes:<br/>
 +
Date – date of observed weather by weather station<br/>
 +
Weather – weather conditions for a particular day<br/>
 +
Average_Wind_Speed – measured in miles per hour<br/>
 +
Wind_Direction – the direction from which the wind is blowing or from which it originates<br/>
 +
[[File:Population.PNG|250px]]
 +
* Smartpolis<br/>
 +
Attributes:<br/>
 +
Longitude: 93.5673W ~ 93.1923W<br/>
 +
Latitude: 42.3017N ~ 42.1609N<br/>
 +
[[File:MAP.PNG|400px]]
 +
===Tools Application===
 +
* JMP 13: Data pre-processing<br/>
 +
* Tableau 10.3: Data Visualization<br/>
 +
===Work Flow===
 +
[[Image:workflow.PNG|600px]]

Latest revision as of 17:27, 15 October 2017
