Difference between revisions of "ISSS608 2017-18 T1 Assign ZHANG PENG"

From Visual Analytics and Applications
 
(64 intermediate revisions by the same user not shown)
Line 1: Line 1:
 +
<div style=background:#468499 border:#A3BFB1>
 +
[[Image:ZP_title1111.jpg|250px]]
 +
<font size = 5; color="#FFFFFF">Epidemic Spread in Smartpolis</font>
 +
</div>
 +
<!--MAIN HEADER -->
 +
 
<!--- Challenge Introduction --->
 
<!--- Challenge Introduction --->
= VAST Challenge: Characterization of an Epidemic Spread =
+
= Overview =
<div width=100%>
+
<table width=90%>
<div width=30%>
+
<tr><td>
</div>
+
<p align="justify">Smartpolis is a major metropolitan area with a population of approximately two million residents. During the last few days, health professionals at local hospitals have noticed a dramatic increase in reported illnesses. Observed symptoms are largely flu-like and include fever, chills, sweats, aches and pains, fatigue, coughing, breathing difficulty, nausea and vomiting, diarrhea, and enlarged lymph nodes. More recently, there have been several deaths believed to be associated with the current outbreak. City officials fear a possible epidemic and are mobilizing emergency management resources to mitigate the impact.</p>
<div width=70%>
+
</td></tr>
<p align="justify"> assignment overview </p>
+
<tr><td>
 +
<p><b>Our Tasks</b></p>
 +
<p><b>Task 1:</b></p>
 
<ul>
 
<ul>
 
<li>Identify approximately where the outbreak started on the map (ground zero location). Outline the affected area.</li>
 
<li>Identify approximately where the outbreak started on the map (ground zero location). Outline the affected area.</li>
 +
</ul>
 +
<p><b>Task 2:</b></p>
 +
<ul>
 
<li>Present a hypothesis on how the infection is being transmitted. For example, is the method of transmission person-to-person, airborne, waterborne, or something else? Identify the trends that support your hypothesis.</li>
 
<li>Present a hypothesis on how the infection is being transmitted. For example, is the method of transmission person-to-person, airborne, waterborne, or something else? Identify the trends that support your hypothesis.</li>
<li>Is the outbreak contained? Is it necessary for emergency management personnel to deploy treatment resources outside the affected area?</li>
+
<li>Is the outbreak contained? Is it necessary for emergency management personnel to deploy treatment resources outside the affected area? </li>
 
</ul>
 
</ul>
<br>
+
</td></tr>
 +
</table>
 
<br>
 
<br>
 
<!------------------------->
 
<!------------------------->
 +
<!--- Data Introduction --->
  
<!--- Data Introduction --->
 
 
= Data Description =
 
= Data Description =
<p align="justify">xxx</p>
 
</div>
 
</div>
 
===== Dataset Overview =====
 
 
<table width=90%>
 
<table width=90%>
 
<tr>
 
<tr>
<td><b>1 dataset1</b></td>
+
<td><b>1 Microblog Messages</b></td>
<td><b>2 dataset2</b></td>
+
<td><b>2 Map</b></td>
 
</tr>
 
</tr>
 
<tr>
 
<tr>
<td>xxx</td>
+
<td>The provided CSV file contains a number of microblog messages. 
<td>xxx</td>
+
Attributes:
 +
<ul>
 +
<li>ID – personal identifier of the individual posting the message</li>
 +
<li>Created_at – date and time of the post</li>
 +
<li>Location – latitude and longitude coordinates of the mobile device at the time of post</li>
 +
<li>Text – the posted message</li>
 +
</ul>
 +
</td>
 +
<td>
 +
[[File:ZP MAP.png|left|500 px]]
 +
</td>
 +
</tr>
 +
<tr>
 +
<td><b>3 Population Statistics</b></td>
 +
<td><b>4 Observed Weather</b></td>
 +
</tr>
 +
<tr>
 +
<td>The provided CSV file contains a number of population statistics.
 +
Attributes:
 +
<ul>
 +
<li>Zone_Name – the name of one of the 13 city zones within the metropolitan area</li>
 +
<li>Population_Density – the number of residents in the zone</li>
 +
<li>Daytime_Population – the estimated population in the zone due to commuting during work hours</li>
 +
</ul>
 +
</td>
 +
<td>The provided CSV file contains a number of weather statistics.
 +
Attributes:
 +
<ul>
 +
<li>Date – date of observed weather by weather station</li>
 +
<li>Weather – weather conditions for a particular day</li>
 +
<li>Average_Wind_Speed – measured in miles per hour</li>
 +
<li>Wind_Direction – the direction from which the wind is blowing or from which it originates</li>
 +
</ul>
 +
</td>
 
</tr>
 
</tr>
 
</table>
 
</table>
<br>
+
<p><b>5 Additional Information</b></p>
 +
<ul>
 +
<li>Economy – The economy of Vastopolis is based on commerce, entertainment, finance, trucking services, shipping services, health care, and industry.</li>
 +
<li>Water Supply - Residents and businesses get their drinking water by pumping water from nearby reservoirs or rivers.  These distributed water systems are both public and privately owned.</li>
 +
<li>Entertainment – Vastopolis has two stadiums (Vastopolis Dome and Westside Stadium) for sports, concerts, and other events. The various lakes and the Vast River, which flows south at a steady rate of three miles per hour, are used for water-based sports and recreation.</li>
 +
<li>City Administration – Vastopolis has several locations of significance including a state courthouse, a capitol building, convention center, and a large airport.</li>
 +
</ul>
 
<br>
 
<br>
 
<!------------------------->
 
<!------------------------->
  
 
<!--- Data Preparation --->
 
<!--- Data Preparation --->
 +
 
= Data Preparation =
 
= Data Preparation =
<table></table>
+
<table>
 +
<tr><td>
 +
<p><b>1 Prepare keywords for Microblogs texts</b></p>
 +
<p align="justify">Microblog texts contain a large amount of irrelevant information. To keep only the messages related to the outbreak, I used the observed symptoms as keywords and filtered the microblogs in JMP. The observed symptoms include chills, sweats, aches and pains, fatigue, coughing, breathing difficulty, nausea and vomiting, diarrhea, and enlarged lymph nodes. The symptom-related keywords and their frequencies form the basis of the detailed analysis.</p>
 +
<p><b>Keywords: </b>'vomit,vomiting', 'sweats', 'pains,painful,pain', 'nausea', 'flu', 'fatigue', 'diarrhea', 'cough,coughing', 'chills', 'stomach', 'breath,breathing', 'ache,headache,aches'</p>
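<p align="justify">The filtering itself was done in JMP. As a rough illustration only, the same idea can be sketched in Python, assuming each message is a plain string. The function name <code>match_keywords</code>, the group labels and the sample sentence are hypothetical; only the keyword variants come from the list above.</p>

```python
import re

# Keyword variants copied from the list above; the group labels (dict keys)
# are hypothetical names chosen for this sketch.
KEYWORDS = {
    "vomit": ["vomit", "vomiting"],
    "sweats": ["sweats"],
    "pain": ["pains", "painful", "pain"],
    "nausea": ["nausea"],
    "flu": ["flu"],
    "fatigue": ["fatigue"],
    "diarrhea": ["diarrhea"],
    "cough": ["cough", "coughing"],
    "chills": ["chills"],
    "stomach": ["stomach"],
    "breath": ["breath", "breathing"],
    "ache": ["ache", "headache", "aches"],
}


def match_keywords(text):
    """Return the sorted keyword groups whose variants occur as whole words."""
    # Lower-case the text and split it into alphabetic words so that
    # "flute" does not count as a match for "flu".
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(
        group for group, variants in KEYWORDS.items() if words & set(variants)
    )


# Hypothetical message, not taken from the dataset.
print(match_keywords("Coughing all night, and now a headache"))
```

<p align="justify">Matching on whole words rather than substrings is a design choice of this sketch; the JMP Text Explorer search may behave differently.</p>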
 +
</td></tr>
 +
<tr><td>
 +
<p><b>2 Categorise data to each keyword</b></p>
 +
<p align="justify">First, I used the JMP Text Explorer function to analyse the words in the ‘text’ column.</p>
 +
<p>[[File:ZP DP1.png|left|400 px]]</p>
 +
</td></tr>
 +
<tr><td>
 +
<p>Then I searched for each keyword, which gives the keyword and its frequency count, and selected all the rows that contain a keyword.</p>
 +
<p>[[File:ZP DP3.PNG|left|400 px]]</p>
 +
</td></tr>
 +
<tr><td>
 +
<p>The last step is to concatenate the filtered data.</p>
 +
<p>[[File:ZP DP2.png|left|400 px]]</p>
 +
</td></tr>
 +
<tr><td>
 +
<p><b>3 Set longitude and latitude</b></p>
 +
<p align="justify">To plot the messages on a map, the location has to be split into two columns. I named the longitude Location_X and the latitude Location_Y. Because longitudes in the western hemisphere are negative, the longitude is multiplied by minus one.</p>
 +
<p>[[File:ZP DP4.PNG|left|400 px]]</p>
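<p align="justify">The split was done in JMP. A minimal Python sketch of the same step is shown here, assuming the Location field is a single string with latitude first and an unsigned longitude second, separated by whitespace. That layout and the function name <code>split_location</code> are assumptions; the sample value is made up.</p>

```python
def split_location(location):
    """Split a 'latitude longitude' string into (Location_X, Location_Y).

    Assumes the longitude is stored without a sign and lies in the
    western hemisphere, so it is multiplied by minus one.
    """
    lat_text, lon_text = location.split()
    location_y = float(lat_text)   # latitude
    location_x = -float(lon_text)  # western longitude becomes negative
    return location_x, location_y


# Hypothetical coordinate pair, not taken from the dataset.
print(split_location("42.25 93.5"))
```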
 +
</td></tr>
 +
<tr><td>
 +
<p><b>4 Build Map background in Tableau</b></p>
 +
<p align="justify">The background map has to be set up in Tableau so that each point can be matched to its position on the map.</p>
 +
<p>[[File:ZP DP5.png|left|400 px]]</p>
 +
</td></tr>
 +
</table>
 
<!------------------------->
 
<!------------------------->
  
Line 43: Line 120:
  
 
= Interactive Visualization =
 
= Interactive Visualization =
You may have your own investigation here: link
+
You can explore the data yourself here: https://public.tableau.com/profile/zhang.peng8803#!/vizhome/keyword2/EpidemicSpreadStory?publish=yes
 +
<br>
 
<br>
 
<br>
 
<!------------------------>
 
<!------------------------>
Line 49: Line 127:
 
<!--- Patterns of Life Analysis --->
 
<!--- Patterns of Life Analysis --->
  
= Patterns of Life Analysis =
+
= Analysis Results =
===== Daily Patterns =====
+
== Question 1 ==
<table width=100%>
+
<table>
<tr bgcolor="#86af49">
+
<tr><td>
<th><font color="#ffffff">Images</font></th>
+
'''<big><p><i>Part 1: Identify approximately where the outbreak started on the map (ground zero location). Outline the affected area. Explain how you arrived at your conclusion. </i></p></big>'''
<th><font color="#ffffff">Interpretations</font></th>
+
</td></tr>
</tr>
+
<tr><td>
<tr>
+
<p align="justify"><b>1. Filter microblog texts by keyword</b></p>
<td>[[File:ZBJ 2pnoshow.jpg|600 px|left]]<br>
+
<p align="justify">I used the observed symptoms as keywords to filter the related microblogs in JMP. The observed symptoms include chills, sweats, aches and pains, fatigue, coughing, breathing difficulty, nausea and vomiting, diarrhea, and enlarged lymph nodes. I used the symptom-related keywords and their frequencies to improve accuracy.</p>
[[File:ZBJ ArrivalLine-ranger.jpg|600 px|left]]</td>
+
</td></tr>
<td>
+
<tr><td>
<ul>
+
<p align="justify"><b>2. Find out the outbreak time</b></p>
<li>Park service vehicles never showed up between 4am to 5am in any kind of gates. This might be the off-duty period of the rangers, and the off-duty period varied across different days of week. For example, rangers were never on patrol from 1am to 5am on Saturday.</li>
+
<p align="justify">The line chart below shows that most symptom keywords increase sharply from May 18, and some only from May 19. The detailed analysis therefore only needs to cover May 17 to May 20.</p>
<li>If we look at the arrival time (in this case is the time when rangers started each patrol trip), the first shift always started at 6am and the last shift started at 17pm.</li>
+
<p>[[File:ZP FINDING1.png|left|Figure 1|600 px]]</p>
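<p align="justify">The chart was built in Tableau. As a hedged sketch of the counting behind such a chart, the Python below tallies messages per day and keyword. It assumes each row has already been reduced to a timestamp and one matched keyword; the function name, the sample rows and the year used in them are placeholders, not values from the dataset.</p>

```python
from collections import Counter
from datetime import datetime


def daily_keyword_counts(rows):
    """Count messages per (calendar day, keyword) from (created_at, keyword) pairs."""
    counts = Counter()
    for created_at, keyword in rows:
        counts[(created_at.date(), keyword)] += 1
    return counts


# Hypothetical rows; the year is a placeholder.
sample_rows = [
    (datetime(2011, 5, 17, 9, 0), "cough"),
    (datetime(2011, 5, 18, 8, 30), "cough"),
    (datetime(2011, 5, 18, 17, 45), "cough"),
    (datetime(2011, 5, 19, 10, 15), "nausea"),
]

for (day, keyword), n in sorted(daily_keyword_counts(sample_rows).items()):
    print(day, keyword, n)
```

<p align="justify">A sharp rise in a keyword's daily count from one day to the next is the kind of jump the line chart makes visible.</p>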
</ul></td>
+
</td></tr>
</tr>
+
<tr><td>
<tr>
+
<p align="justify"><b>3. Find out the affected areas</b></p>
<td>[[File:ZBJ Nocamping.jpg|600 px|left]]</td>
+
<p align="justify">The symptom words used on May 18 relate to aches, breath, chills, cough, fatigue and sweats. These symptoms broke out mainly in Uptown, Downtown and Eastside.</p>
<td>The two types of buses and 4+ axle trucks, all large vehicles, had no appearance in any camping areas. It might represent that these three car-types can only be passing through the preserve, and camping area is not allowed for large vehicles.</td>
+
<p>[[File:ZP FINDING2.png|left|Figure 2|400 px]]</p>
</tr>
+
</td></tr>
<tr>
+
<tr><td>
<td>[[File:ZBJ Campinghour.jpg|600 px|left]]</td>
+
<p align="justify">The symptoms in Uptown, Downtown and Eastside resemble flu, so on May 19 the word ‘flu’ was used more frequently in the same places (Figure 3 compares the frequency of ‘flu’ on May 18 and May 19).</p>
<td>Majority of traffics through camping areas only happened between 5am to 22pm, except for one car-id 20154519024544-322, which is discussed in later section. It might indicate that traffics were not allowed in camping areas after 22pm to ensure the safety and rest of overnight campers.</td>
+
<p>
</tr>
+
[[File:ZP FINDING3.png|left|Figure 3|500 px]]
<tr>
+
</p>
<td>[[File:ZBJ Activetime.jpg|600 px|left]]</td>
+
</td></tr>
<td>2 axle car/motorcycle, 2 axle truck, and 3 axle truck were most active vehicles in the preserve. Their activities started to increase at 6am and started to flatten out at around 18pm. 7am to 17pm had most vehicle activities. </td>
+
<tr><td>
</tr>
+
<p align="justify">Conclusion 1: The first affected area is clustered near the centre, in Uptown and Downtown. The symptoms are flu-like.
<tr>
+
However, a different set of symptoms is clustered downstream in Southville and Smogtown on May 19 and May 20: diarrhea, nausea, stomach and vomit. These point to a stomach-ache-like illness (Figure 4).</p>
<td>[[File:ZBJ Trespassing.jpg|600 px|left]]</td>
+
<p>
<td>There were vehicles that simply passed through the preserve without making any stops or looking around. These vehicles can be identified by investigating the number of gates they passed through. This pattern only applies to non-campers and happened within a short time period. The graph on the left shows all the possible routes for trespassing.<br>
+
[[File:ZP FINDING4.png|left|Figure 4|500 px]]
<ul>
+
</p>
<li>Entrance0<->entrance3</li>
+
</td></tr>
<li>Entrance2<->entrance4</li>
+
<tr><td>
<li>Entrance1<->general-gate7<->entrance3</li>
+
<p align="justify">Conclusion 2: The second affected area is downstream, in Plainville and Smogtown, and appears one day later than the first affected area. The symptoms are stomach-ache-like.</p>
<li>Entrance0<->general-gate7<-> general-gate4<-> entrance1</li>
+
<p align="justify">Combining conclusion 1 and conclusion 2, the first outbreak area is near the centre and the second outbreak area is downstream along the river. The symptoms in these two areas are completely different.</p>
</ul></td>
+
<P>[[File:ZP FINDING5.PNG|left|Figure 5|500 px]]</P>
</tr>
+
</td></tr>
</table>
 
 
 
===== Longer-Period Patterns =====
 
<table width=100%>
 
<tr bgcolor="#86af49">
 
<th><font color="#ffffff">Images</font></th>
 
<th><font color="#ffffff">Interpretations</font></th>
 
</tr>
 
<tr>
 
<td>[[File:ZBJ Line-month.jpg|600 px|left]]</td>
 
<td>Traffic increased since May and reached highest in July, then started to decrease. November to March were the least popular months for visitors and it is possible that these are winter months.</td>
 
</tr>
 
<tr>
 
<td>[[File:ZBJ Line-weekend.jpg|600 px|left]]</td>
 
<td>Activities of 2 axle car/motorcycle, 2 axle truck and 3 axle truck increased on Friday and decreased on Monday. This can be explained by the overnight camping during weekends.</td>
 
</tr>
 
<tr>
 
<td>[[File:ZBJ Ranger.jpg|600 px|left]]<br>
 
[[File:ZBJ RangerCampDuration.jpg|600 px|left]]</td>
 
<td>The duration rangers spent at ranger-stops and camping areas were less than 1 hour. </td>
 
</tr>
 
<tr>
 
<td>[[File:ZBJ MapRoute-rangermost.jpg|600 px|left]]</td>
 
<td>The graph shows the route that had most rangers' episodes. It was the most frequent patrol route of the rangers, and it was almost twice as frequent as the second most frequent patrol route. It is possible that the east side of the preserve required more care and protection. <br>
 
<ul>
 
<li>ranger-base>gate8>general-gate5>gate3>ranger-stop3>ranger-stop3>gate3>camping8>general-gate3>gate4>ranger-stop5>ranger-stop5>gate4>gate5>ranger-stop6>ranger-stop6>gate5>gate8>ranger-base</li>
 
</ul> </td>
 
</tr>
 
<tr>
 
<td>[[File:ZBJ ArrTime-campers.jpg|600 px|left]]</td>
 
<td>Campers arrived at the preserve between 5am and 17pm. Friday to Sunday were more popular as expected. </td>
 
</tr>
 
</table>
 
 
 
===== Unusual Patterns =====
 
<table width=100%>
 
<tr bgcolor="#86af49">
 
<th><font color="#ffffff">Images</font></th>
 
<th><font color="#ffffff">Interpretations</font></th>
 
</tr>
 
<tr>
 
<td>[[File:ZBJ Route-322.jpg|600 px|left]]</td>
 
<td>This table displays the route of car-id 20154519024544-322 (a 2 axle truck), which passed through camping gates after 22pm. This vehicle had 16 episodes, and each episode had exactly the same route except for the first episode. This vehicle came to the preserve each Friday and left on the following Monday.</td>
 
</tr>
 
<tr>
 
<td>[[File:ZBJ Route-multiepisode.jpg|600 px|left]]</td>
 
<td>Apart from 20154519024544-322, there were other car-ids that had multi-episodes, which means they did not render their car-id by the time they exited the preserve. And every time they came to the preserve, they followed the same routes and went for overnight camping. This group of visitors might hold a regular pass for their visits.</td>
 
</tr>
 
<tr>
 
<td>[[File:ZBJ Line-gateanomaly.jpg|600 px|left]]<br>
 
[[File:ZBJ Route-gateanomaly.jpg|600 px|left]]
 
<td>
 
<ul>
 
<li>Unauthorized 4+ axle truck appeared in gates only on Tuesday and Thursday, though not every week.</li>
 
<li>They arrived at the preserve between 2am and 4am. More interestingly, the times they passed through gates avoided the times when park service vehicles passed through those gates.</li>
 
<li>The 4+ axle trucks that passed through gates had different car-ids but they all followed exactly the same route: <br>
 
entrance3>gate6>ranger-stop6>gate5>general-gate5>gate3>ranger-stop3>ranger-stop3>gate3>general-gate5>gate5>ranger-stop6>gate6>entrance3</li>
 
<li>Recalling from previous section, this route was in the area where the most frequent ranger patrol route covered.</li>
 
</ul></td>
 
</tr>
 
<tr>
 
<td>[[File:ZBJ Map-skipgate.jpg|600 px|left]]</td>
 
<td>
 
<ul>
 
<li>There were vehicles going between entrance1 and ranger-stop1 without records from gate2. However, entrance1<->gate2<->ranger-stop1 is the only path between the entrance1 and ranger-stop1.</li>
 
<li>There were 6 episodes, all were 2 axle car/motorcycle, followed the exact same path entrance1>ranger-stop1>ranger-stop1>entrance1. They happened on the same day and at the same time. </li>
 
<li>They stayed for almost 4 hours in ranger-stop1, which was long and suspicious.</li>
 
</ul>
 
</td>
 
</tr>
 
<tr>
 
<td>[[File:ZBJ Roundtrip.jpg|600 px|left]]</td>
 
<td>Apart from the gate-skippers mentioned above, there were another 3 episodes made a simple round-trip in the preserve: they entered the preserve, passed through general-gate, then made the same route back to the entrance.</td>
 
</tr>
 
 
</table>
 
</table>
===== Top 3 Possible Causes =====
 
<ol>
 
<li>Long term visitor with car-id 20154519024544-322 and his behavior to travel pass camping areas during midnight. </li>
 
<li>Unauthorized 4+ axle trucks invading restricted areas because the route they traveled was part of the most frequent patrol route of the rangers. This area could be where pipits resided, and therefore needed more care and protection from rangers. And the fact that they went through the restricted area when the rangers were off-duty makes them extremely suspicious.</li>
 
<li>Possible over speeding which requires further investigations, especially for trespassing routes.</li>
 
</ol>
 
<br>
 
<!-------------------------------->
 
  
<!--- Discussion --->
+
== Question 2 ==
= Comments & Discussions =
+
<table>
<!--- Please replace with your comments in the <td></td> tags :)--->
+
<tr><td>
<table border=1 style="width:90%;border-collapse: collapse; font-family: Calibri;">
+
'''<big><p><i>Part 1: Present a hypothesis on how the infection is being transmitted. For example, is the method of transmission person-to-person, airborne, waterborne, or something else? Identify the trends that support your hypothesis.</i></p></big>'''
<tr>
+
</td></tr>
<td>comment1: Hi,Bijun. Amazing work! From you analysis pack, I can see the level of efforts that you have devoted into this assignment. Yet, I have the following suggestions that hopefully are useful in further improving your work:
+
<tr><td>
<ul>
+
<p align="justify"><b>1. The flu-like and stomach-ache symptoms are transmitted in different ways</b></p>
<b>Aesthetic</b><br>
+
<p align="justify">Reason: Q1 showed that there are two different sets of symptoms. The first is flu-like and the second is stomach-ache-like. On the second and third days of the outbreak, people in Downtown and Uptown did not report stomach-ache symptoms. In other words, the stomach-ache infection did not spread to other places.</p>
<li>1. I love the way you present your analysis. However, a big part of the analysis findings are demonstrated by line graphs. I am wondering if you can try other types of graphs to make the findings more visually clear?</li>
+
<p>[[File:ZP FINDING6.png|left|Figure 6|600 px]]</p>
<li>2. For the first graph of daily pattern. You are trying to say that the rangers do not work from 4-5 am. Yet the graphs x axis only shows hour of 3,9,15,21. I think the pattern will be more obvious if you construct the graph in a way that X axis displays every hour of the day.</li>
+
</td></tr>
<li>3. the story structure in your tableau workbook is clear. Look forward to viewing it interactively on line soon.</li>
+
<tr><td>
</ul>
+
<p align="justify">The stomach-ache-like symptom stayed within the same areas and did not affect other areas. In contrast, comparing the first outbreak day with the last day shows that the flu-like symptom spread across the whole area. We can therefore conclude that the flu-like and stomach-ache infections are transmitted in different ways, and the two symptoms need to be analysed separately.</p>
<ul>
+
<p>[[File:ZP FINDING7.png|left|Figure 7|600 px]]</p>
<b>Clarity</b><br>
+
</td></tr>
<li>1. the tableau workbook contains lots of useful information, yet it is a bit complex and confusing. There are quite a few selectors and parameters defined and the graphs are controlled by different selectors, which is not straightforward. Audience probably need to spend quite some time in understanding the dashboard, particularly for those who do not know the background of vast challenge.</li>
+
<tr><td>
<li>2. the legends are not closely attached to their corresponding graphs, which also adds confusion to the dashboard.</li>
+
<p align="justify"><b>2. The stomach-ache symptom is more likely to be waterborne</b></p>
</ul>
+
<ul><li>Is the stomach-ache symptom waterborne?</li></ul>
<b>Overall, fantastic work! Hopefully my comments can add value</b>
+
<p align="justify">The answer is yes. On May 18, the stomach-ache-like symptom broke out simultaneously on both sides of the river near its downstream end. This pattern matches the flow of the river from north to south.</P>
<br>
+
<p align="justify">Stomach-ache is commonly caused by food or water. Residents and businesses get their drinking water by pumping it from nearby reservoirs or rivers. As a result, the stomach-ache symptom is very likely to be waterborne.</p>
Best Regards
+
</td></tr>
<br>
+
<tr><td>
Yunna
+
<ul><li>Is the stomach-ache symptom spread from person to person?</li></ul>
</td>
+
<p align="justify">The answer is no. The daytime population of Plainville is smaller than its night-time population, which means many working people would have returned home to Plainville on May 19. However, on May 20 these people went back to work and were not affected. The stomach-ache symptom stays limited to the downstream areas.</P>
</tr>
+
<p>[[File:ZP FINDING8.png|left|Figure 8|500 px]]</p>
<tr>
+
</td></tr>
<td>Hi Joyce, <br>
+
<tr><td>
Great overall effort and very engaging. Some of my feedback as below 😉<br>
+
<ul><li>Is the stomach-ache symptom airborne?</li></ul>
<b>Aesthetics:</b><br>
+
<p align="justify">The answer is no. Although the wind blows roughly from west to east, no stomach-ache-like symptoms were found in the Southville and Lakeside areas.</P>
<ol>
+
</td></tr>
* Though it is very interactive with a lot of interactive filters and legends, I feel that it is abit overwhelming, confusing and took me a while to understand and link them together; which I believe also is causing some dashboard performance issue to load slowly. Selecting some of the values also caused the whole dashboard to blank out and getting lost in visualization, can consider to reduce the number of variables of filters for interactive visualization.
+
<tr><td>
* I guess storybook should be story-telling and easy to follow for anyone. Currently, it is designed for exploratory purposes.  
+
<p align="justify"><b>3. The flu-like symptom is more likely to be transmitted person-to-person and through the air</b></p>
* The colors of the titles, filter and legends are well-designed and implemented and all well-linked across the various graphs! Only thing is that descriptions fonts are a bit small for old folks.  
+
<ul><li>Is the flu-like symptom waterborne?</li></ul>
</ol>
+
<p align="justify">The answer is no. To analyse the symptom in more depth, I plotted the distribution for each hour of May 18 and selected the most significant hours: 7, 8, 17 and 18 o’clock. The outbreak starts at around 8 o’clock, clustered in Downtown and Uptown, with a few cases in Eastside. These three affected areas are on the right side of the river. However, the areas on the left side of the river, opposite Downtown and Uptown, are not affected between 8 and 17 o’clock. As a result, the flu-like symptom is not waterborne.</P>
<b>Clarity:</b><br>
+
<p>[[File:ZP FINDING9.PNG|left|Figure 9|600 px]]</p>
<ol>
+
</td></tr>
* I don’t understand the “top x and bottom x” & “Timestamp Slector” (typo? Sector or Selector?) but I guess you are trying to compare the traffic at each gate with the arrival time, though I can’t tell anything obvious from the arrival time from pattern detection. In such case, it may be interesting to include and look at their departure time as well.  
+
<tr><td>
* Currently it only shows the route on the map; I think you can also consider the intensity of the path traveled illustrating by thickness of line.  
+
<ul><li>Is the flu-like symptom spread from person to person?</li></ul>
* Coordinates of the checkpoints are offset from the actual 200x200 grid or actual distance/area and may cause confusion. Zooming feature in the map is good as it allows for better visibility similar to “How long did they spend”. Only point is to consider expanding the box to allow complete view of it; currently it requires scrolling.  
+
<p align="justify">My answer is yes. As the graph above shows, the affected area does not change between 8 and 17 o’clock. After 18 o’clock, however, people start sending related messages from many other areas. My guess is that 8 to 17 o’clock is working time and that people start going home after 18 o’clock. To test this, I selected the IDs of people in the affected areas of Downtown, Uptown and Westside.</p>
</ol>
+
<p>[[File:ZP FINDING10.png|left|Figure 10|600 px]]</p>
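<p align="justify">The selection was made interactively in Tableau. The Python sketch below illustrates the same idea under stated assumptions: each message is a dictionary with an ID, a day of the month, an hour and the Location_X and Location_Y coordinates, and the affected area is approximated by a rectangle. The field names, the rectangle and the sample messages are all hypothetical.</p>

```python
def ids_in_area(messages, box, day, start_hour=8, end_hour=17):
    """Return the IDs that posted inside box on the given day during the given hours.

    box is (x_min, x_max, y_min, y_max); both hour bounds are inclusive.
    """
    x_min, x_max, y_min, y_max = box
    return {
        m["id"]
        for m in messages
        if m["day"] == day
        and start_hour <= m["hour"] <= end_hour
        and x_min <= m["x"] <= x_max
        and y_min <= m["y"] <= y_max
    }


# Hypothetical rectangle and messages, not taken from the dataset.
affected_box = (-93.5, -93.3, 42.1, 42.3)
sample_messages = [
    {"id": 1, "day": 18, "hour": 9, "x": -93.4, "y": 42.2},
    {"id": 2, "day": 18, "hour": 20, "x": -93.4, "y": 42.2},   # outside work hours
    {"id": 3, "day": 18, "hour": 10, "x": -93.9, "y": 42.2},   # outside the rectangle
    {"id": 1, "day": 19, "hour": 11, "x": -93.45, "y": 42.25},
]

day18_ids = ids_in_area(sample_messages, affected_box, day=18)
day19_ids = ids_in_area(sample_messages, affected_box, day=19)
# IDs seen in the affected area during work hours on both days.
returning_ids = day18_ids & day19_ids
print(returning_ids)
```

<p align="justify">Intersecting the two ID sets gives the people who were in the area during working hours on both days, which is the group followed in the next step.</p>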
Cheers, <br>
+
</td></tr>
Zac
+
<tr><td>
</td>
+
<p align="justify">As the following graph shows, these selected people were in the same area between 8 and 17 o’clock on both May 18 and May 19.</P>
</tr>
+
<p>[[File:ZP FINDING11.png|left|Figure 11|500 px]]</p>
<tr>
+
</td></tr>
<td>comment3
+
<tr><td>
<p>Hi Zheng Bijun,</p>
+
<p align="justify">After 17 o’clock, the people in the affected area go home. This is roughly supported by the population distribution: after work, many people leave Uptown and Downtown and go back to Lakeside, Plainville and Suburbia.</p>
<p>Please find my feedback comments as follows. You present a very nice analysis, which is answering the questions of the challenge. With regard to the clarity and aesthetics aspect, I have the following to add.</p>
+
<p>[[File:ZP FINDING12.PNG|left|Figure 12|600 px]]</p>
<p><strong>Clarity :</strong></p>
+
</td></tr>
<p><strong>&nbsp;</strong></p>
+
<tr><td>
<ul>
+
<p align="justify">Now let’s compare the selected people with all affected people who had flu-like symptoms between 8 and 17 o’clock on May 18 and May 19. (The deep blue points are the people infected on May 18; they are also highlighted in deep blue on May 19 if they sent a message that day.) Although the infected people went back to work on May 19 and stayed in the same areas, other areas such as Riverside and Villa were also affected. The most convincing explanation is that the infected people went home from the central areas and spread the disease to their friends and families. As a result, the flu-like symptom can spread from person to person.</p>
*In the daily pattern plots, you have used the &lsquo;select&rsquo; feature effectively to illustrate clearly the trend you wish to explain. This is a good practice, as it help to retain the background information clearly, whilst projecting the focus for the user.</li>
+
<p>[[File:ZP FINDING13.png|left|Figure 13|500 px]]</p>
*When using map images, you might want to use the Cartesian coordinates more effectively. You can use Tableau to import the map as a background image, and then geocode it so that you will be able to use annotations. The current texts you have indicated does help the user identify the locations inside the preserve, such as entrance 0, entrance 3, etc. but having annotations will help them to pop out of the plane, thereby presenting better clarity.</li>
+
</td></tr>
*In your 2<sup>nd</sup> plot inside longer period patterns, I assume the x axis shows the days of the week (1-7). You might want to add an axis label, or you might want to use aliases for the days of the week and label the axis. (for e.g. 1-Sunday, 2-Monday, etc.).</li>
+
<tr><td>
*On most of the plots, you have a well defined title, so you might not want to show the headers on the Y axis (# of cars) since it can already be known that the chart shows the trends of traffic.</li>
+
<ul><li>Is the flu-like symptom airborne?</li></ul>
</ul>
+
<p align="justify">The answer depends on the conditions below. As analysed above, the starting point of the stomach-ache symptom is near the river, at the boundary between Downtown and Plainville.</p>
<p><strong>Aesthetics: </strong></p>
+
</td></tr>
<ul>
+
<tr><td>
*I notice that you have tweaked the background colour. Maybe, you would want to also explore format axis feature in Tableau that might change the text to more bolder and visible formats. This would lend more readability to the plots.</li>
+
<p align="justify"><b>Condition 1: </b>If the stomach-ache-like and flu-like symptoms are caused by the same infection, and the different symptoms result from different carriers, then airborne spread is highly likely, because the west-to-east wind on May 18 and 19 matches the direction of the spread. The spread started at the river, at the boundary between Downtown and Plainville, and then moved to Downtown, Uptown and Eastside.</p>
*On the arrival time calendar plot you have developed, when you try to visualize the number of episodes, the gradation in colors is good, and helps to quickly infer, which times of the day have higher episodes.</li>
+
<p align="justify"><b>Condition 2: </b>If the two symptoms are caused by different infections, the flu-like symptom may have broken out in the centre of Downtown and Uptown. In that case we cannot judge whether it is airborne.</p>
</ul>
+
</td></tr>
<p style="margin-left: .25in;">&nbsp;&nbsp;Hope the feedback helps, and please leave out a feedback on my page as well. You may access it [[ISSS608_2016-17_T3_Assign_KISHAN_BHARADWAJ_SHRIDHAR|here]]. <br />
+
<tr><td>
Navigate to the bottom of the main page, after reading the 3 sub pages.</p>
+
'''<big><p><i>Part 2: Is the outbreak contained? Is it necessary for emergency management personnel to deploy treatment resources outside the affected area? Explain your reasoning.</i></p></big>'''
<p style="margin-left: .25in;">&nbsp;</p>
+
</td></tr>
<p style="margin-left: .25in;">Thank You,</p>
+
<tr><td>
<p style="margin-left: .25in;">Kishan Bharadwaj Shridhar</p>
+
<p align="justify">The outbreak of the flu-like symptom is not contained. Although the outbreak started in Downtown, Uptown and Eastside, all areas were affected.
</td>
+
Furthermore, I selected the people who showed flu-like symptoms between 8 and 17 o’clock on May 18. On May 19, between 8 and 17 o’clock, these people were back at work, which means they were still able to work. On May 20, however, many of them went to hospitals in each area, which shows that the epidemic was becoming much more serious. To prevent the flu-like symptom from spreading faster and becoming more severe, emergency management personnel need to deploy treatment resources outside the affected area.</p>
</tr>
+
<p>[[File:ZP FINDING14.png|left|Figure 14|500 px]]</p>
<tr>
+
</td></tr>
<td>comment4</td>
+
<tr><td>
</tr>
+
<p align="justify">As the graph shows, the people who had stomach-ache on May 18 stayed in the same areas and went neither to hospital nor to work. This symptom is unlikely to become more serious, since they did not even go to hospital.</p>
<tr>
+
<p>[[File:ZP FINDING15.png|left|Figure 15|500 px]]</p>
<td>comment5</td>
+
</td></tr>
</tr>
 
 
</table>
 
</table>
 
 
 
<!--- Discussion --->
 
<!--- Discussion --->

Latest revision as of 23:15, 15 October 2017

Epidemic Spread in Smartpolis

Overview

Smartpolis is a major metropolitan area with a population of approximately two million residents. During the last few days, health professionals at local hospitals have noticed a dramatic increase in reported illnesses. Observed symptoms are largely flu-like and include fever, chills, sweats, aches and pains, fatigue, coughing, breathing difficulty, nausea and vomiting, diarrhea, and enlarged lymph nodes. More recently, there have been several deaths believed to be associated with the current outbreak. City officials fear a possible epidemic and are mobilizing emergency management resources to mitigate the impact.

Our Tasks

Task 1:

  • Identify approximately where the outbreak started on the map (ground zero location). Outline the affected area.

Task 2:

  • Present a hypothesis on how the infection is being transmitted. For example, is the method of transmission person-to-person, airborne, waterborne, or something else? Identify the trends that support your hypothesis.
  • Is the outbreak contained? Is it necessary for emergency management personnel to deploy treatment resources outside the affected area?


Data Description

1 Microblog Messages

The provided CSV file contains a number of microblog messages.

Attributes:

  • ID – personal identifier of the individual posting the message
  • Created_at – date and time of the post
  • Location – latitude and longitude coordinates of the mobile device at the time of post
  • Text – the posted message

2 Map

ZP MAP.png

3 Population Statistics

The provided CSV file contains a number of population statistics.

Attributes:

  • Zone_Name – the name of one of the 13 city zones within the metropolitan area
  • Population_Density – the number of residents in the zone
  • Daytime_Population – the estimated population in the zone due to commuting during work hours

4 Observed Weather

The provided CSV file contains a number of weather statistics.

Attributes:

  • Date – date of observed weather by weather station
  • Weather – weather conditions for a particular day
  • Average_Wind_Speed – measured in miles per hour
  • Wind_Direction – the direction from which the wind is blowing or from which it originates

5 Additional Information

  • Economy – The economy of Vastopolis is based on commerce, entertainment, finance, trucking services, shipping services, health care, and industry.
  • Water Supply – Residents and businesses get their drinking water by pumping water from nearby reservoirs or rivers. These distributed water systems are both public and privately owned.
  • Entertainment – Vastopolis has two stadiums (Vastopolis Dome and Westside Stadium) for sports, concerts, and other events. The various lakes and the Vast River, which flows south at a steady rate of three miles per hour, are used for water-based sports and recreation.
  • City Administration – Vastopolis has several locations of significance including a state courthouse, a capitol building, convention center, and a large airport.



Data Preparation

1 Prepare keywords for Microblogs texts

The microblog texts contain a large amount of irrelevant information. To filter out the relevant messages efficiently, I designed a set of keywords based on the observed symptoms and used JMP to select the matching microblogs. The observed symptoms include chills, sweats, aches and pains, fatigue, coughing, breathing difficulty, nausea and vomiting, diarrhea, and enlarged lymph nodes. The symptom-related keywords and their frequencies form the basis of the further analysis.

Keywords: 'vomit,vomiting', 'sweats', 'pains,painful,pain', 'nausea', 'flu', 'fatigue', 'diarrhea', 'cough,coughing', 'chills', 'stomach', 'breath,breathing', 'ache,headache,aches'
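
The keyword filtering above was done in JMP. As a rough illustration only, the same idea can be sketched in Python. The group labels, the whole-word matching rule and the sample messages are assumptions made for this sketch and are not taken from the JMP workflow.

```python
import re

# Keyword groups copied from the list above; the dictionary keys are
# illustrative labels chosen for this sketch.
KEYWORD_GROUPS = {
    "vomit": ["vomit", "vomiting"],
    "sweats": ["sweats"],
    "pain": ["pains", "painful", "pain"],
    "nausea": ["nausea"],
    "flu": ["flu"],
    "fatigue": ["fatigue"],
    "diarrhea": ["diarrhea"],
    "cough": ["cough", "coughing"],
    "chills": ["chills"],
    "stomach": ["stomach"],
    "breath": ["breath", "breathing"],
    "ache": ["ache", "headache", "aches"],
}

# One case-insensitive, whole-word pattern per group (assumed matching rule).
PATTERNS = {
    label: re.compile(
        r"\b(?:" + "|".join(map(re.escape, words)) + r")\b", re.IGNORECASE
    )
    for label, words in KEYWORD_GROUPS.items()
}


def match_keywords(text):
    """Return the sorted labels of every keyword group found in a message."""
    return sorted(label for label, pat in PATTERNS.items() if pat.search(text))


# Invented sample messages, not rows from the provided CSV file.
sample_messages = [
    "I have a terrible headache and keep coughing",
    "Great concert at the Dome tonight",
]
for message in sample_messages:
    labels = match_keywords(message)
    if labels:  # keep only messages that mention at least one symptom
        print(labels, message)
```

Whole-word matching keeps a word such as "influence" from counting as "flu". Whether JMP applies the same rule is not stated in this report.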

2 Categorise data to each keyword

First, I used the JMP Text Explorer function to analyse the words in the ‘text’ column.

ZP DP1.png

Then I searched for each keyword, which returns the keyword together with its frequency count, and selected all rows that contain the keyword.

ZP DP3.PNG

The last step is to concatenate filtered data together.

ZP DP2.png

3 Set longitude and latitude

To build the map graph for further analysis, the location has to be split into two columns. I named the longitude Location_X and the latitude Location_Y. Because longitudes in the western hemisphere are negative, the longitude values are multiplied by minus one.

ZP DP4.PNG
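
The location split described above can be sketched in Python as follows. This is only an illustration: it assumes that the Location column holds the latitude and the longitude as two whitespace-separated numbers with an unsigned longitude, which this report does not confirm.

```python
def split_location(location):
    """Split a 'latitude longitude' string into (Location_X, Location_Y).

    Assumption: latitude comes first and the longitude is stored without
    its sign, so it is multiplied by -1 to place it west of Greenwich.
    """
    lat_text, lon_text = location.split()
    location_x = -1.0 * float(lon_text)  # longitude, negated for the west
    location_y = float(lat_text)         # latitude, unchanged
    return location_x, location_y


# Invented sample value, not a row from the provided CSV file.
print(split_location("42.25 93.5"))
```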

4 Build Map background in Tableau

The background map has to be set up in Tableau so that each data point is drawn at the matching position on the map.

ZP DP5.png


Interactive Visualization

You can explore the data yourself in the interactive visualization: https://public.tableau.com/profile/zhang.peng8803#!/vizhome/keyword2/EpidemicSpreadStory?publish=yes


Analysis Results

Question 1

Part 1: Identify approximately where the outbreak started on the map (ground zero location). Outline the affected area. Explain how you arrived at your conclusion.

1. Filter keywords from the microblog texts

I chose the observed symptoms as keywords and used JMP to filter out the related microblogs. The observed symptoms include chills, sweats, aches and pains, fatigue, coughing, breathing difficulty, nausea and vomiting, diarrhea, and enlarged lymph nodes. I used the symptom-related keywords and their frequencies to improve accuracy.

2. Find out the outbreak time

The line chart below shows that most symptoms increase sharply from May 18, and some of them from May 19. The detailed analysis therefore only needs to cover May 17 to May 20.

Figure 1
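
The line chart was built in Tableau. A minimal Python sketch of the underlying daily count per keyword is given below. The timestamp format, the year and the sample rows are assumptions for illustration only.

```python
from collections import Counter
from datetime import datetime


def daily_counts(rows, fmt="%m/%d/%Y %H:%M"):
    """Count messages per (date, keyword).

    `rows` is an iterable of (created_at, keyword) pairs. The default
    timestamp format is an assumption about the Created_at column.
    """
    counts = Counter()
    for created_at, keyword in rows:
        day = datetime.strptime(created_at, fmt).date()
        counts[(day, keyword)] += 1
    return counts


# Invented sample rows; the year is a placeholder.
sample_rows = [
    ("5/17/2011 9:00", "flu"),
    ("5/18/2011 8:05", "flu"),
    ("5/18/2011 17:30", "flu"),
    ("5/18/2011 8:10", "cough"),
]
counts = daily_counts(sample_rows)
for (day, keyword), n in sorted(counts.items()):
    print(day, keyword, n)
```

A sharp rise in these counts from one day to the next is what marks the start of the outbreak in the chart.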

3. Find out the affected areas

The symptom words used on May 18 relate to aches, breath, chills, cough, fatigue and sweats. These symptoms broke out mainly in Uptown, Downtown and Eastside.

Figure 2

The symptoms in Uptown, Downtown and Eastside resemble flu. Accordingly, the word ‘flu’ was used more frequently in the same areas on May 19 (Figure 3 compares the frequency of ‘flu’ on May 18 and May 19).

Figure 3

Conclusion 1: The first affected area is clustered near the centre, in Uptown and Downtown, and the symptoms resemble flu. However, a different group of symptoms (diarrhea, nausea, stomach and vomit) is clustered downstream in Plainville and Smogtown on May 19 and May 20. These resemble a stomach-ache illness (Figure 4).

Figure 4

Conclusion 2: The second affected area is downstream, in Plainville and Smogtown, and appears one day later than the first affected area. The symptoms resemble stomach ache.

Combining Conclusion 1 and Conclusion 2, the first outbreak area is near the centre and the second outbreak area is near the downstream part of the river. The symptoms in these two areas are completely different.

Figure 5

Question 2

Part 1: Present a hypothesis on how the infection is being transmitted. For example, is the method of transmission person-to-person, airborne, waterborne, or something else? Identify the trends that support your hypothesis.

1. The flu-like and stomach-ache symptoms are transmitted in different ways

Reason: From Q1 we found that there are two different groups of symptoms: the first is flu-like and the second is stomach-ache-like. On the second and third days of the outbreak, people in Downtown and Uptown did not show the stomach-ache symptoms. In other words, the stomach-ache infection did not spread to other places.

Figure 6

The stomach-ache-like symptom stays within the same areas and does not affect other areas. However, comparing the flu-like symptom on the first day of the outbreak with the last day shows that it has spread to all areas of the city. As a result, we can conclude that the flu-like and stomach-ache infections are transmitted in different ways, and the two symptoms need to be analysed separately.

Figure 7

2. The stomach-ache symptom is more likely waterborne

  • Is the stomach-ache symptom waterborne?

The answer is yes. On May 18, the stomach-ache-like symptom broke out simultaneously on both sides of the river in the downstream area. This pattern corresponds with the flow of the river from north to south.

In fact, stomach ache is commonly caused by food or water. Residents and businesses get their drinking water by pumping water from nearby reservoirs or rivers. As a result, the stomach-ache symptom is highly likely to be waterborne.

  • Is the stomach-ache symptom spread person to person?

The answer is no. The daytime population of Plainville is smaller than its night-time population, which means that many working people returned to Plainville on May 19. However, when these people went back to work on May 20, the symptom did not appear in the areas where they work. The stomach-ache symptom remains limited to the downstream area.

Figure 8

  • Is the stomach-ache symptom airborne?

The answer is no. Although the wind blows roughly from west to east, no stomach-ache-like symptoms were found in the Southville and Lakeside areas.

3. The flu-like symptom is more likely transmitted person to person and by air

  • Is the flu-like symptom waterborne?

The answer is no. To analyse the symptom in more detail, I plotted the distribution for each hour of May 18 and selected the most significant hours: 7, 8, 17 and 18 o’clock. The outbreak starts at around 8 o’clock, clustered in Downtown and Uptown with a few cases in Eastside. These three affected areas are on the right side of the river. However, the areas on the left side of the river, opposite Downtown and Uptown, are not affected between 8 and 17 o’clock. As a result, the flu-like symptom is not waterborne.

Figure 9

  • Is the flu-like symptom spread person to person?

My answer is yes. As the graph above shows, the affected area does not change between 8 and 17 o’clock. After 18 o’clock, however, people start sending related messages from many other areas. My guess is that 8 to 17 o’clock is working time and that people start going home after 18 o’clock. To check this, I selected the IDs from the affected areas in Downtown, Uptown and Westside.

Figure 10

As the following graph shows, the selected people were in the same areas between 8 and 17 o’clock on both May 18 and May 19.

Figure 11
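
The ID selection above was done interactively in Tableau. The logic can be sketched in Python as shown below. The sketch assumes that each post has already been assigned to a zone (the `zone` field is hypothetical), treats the working window as 8:00 up to but not including 17:00, and uses invented posts with a placeholder year.

```python
from datetime import date, datetime


def in_window(post, day, start_hour, end_hour):
    """True if the post was sent on `day` within [start_hour, end_hour)."""
    return post["time"].date() == day and start_hour <= post["time"].hour < end_hour


def ids_in_zone(posts, zone, day, start_hour=8, end_hour=17):
    """IDs that posted from `zone` on `day` during the working window."""
    return {
        p["id"]
        for p in posts
        if p["zone"] == zone and in_window(p, day, start_hour, end_hour)
    }


def zones_for_ids(posts, ids, day, start_hour=8, end_hour=17):
    """Map each selected ID to the zones it posted from on `day`."""
    zones = {}
    for p in posts:
        if p["id"] in ids and in_window(p, day, start_hour, end_hour):
            zones.setdefault(p["id"], set()).add(p["zone"])
    return zones


# Invented posts; the `zone` field and the year are assumptions.
sample_posts = [
    {"id": 1, "zone": "Downtown", "time": datetime(2011, 5, 18, 9, 0)},
    {"id": 2, "zone": "Downtown", "time": datetime(2011, 5, 18, 20, 0)},
    {"id": 3, "zone": "Uptown", "time": datetime(2011, 5, 18, 10, 0)},
    {"id": 1, "zone": "Downtown", "time": datetime(2011, 5, 19, 11, 0)},
    {"id": 1, "zone": "Lakeside", "time": datetime(2011, 5, 19, 19, 0)},
]

selected = ids_in_zone(sample_posts, "Downtown", date(2011, 5, 18))
next_day = zones_for_ids(sample_posts, selected, date(2011, 5, 19))
print(selected, next_day)
```

If the selected IDs appear in the same zones during working hours on both days, this supports the reading that they are commuters who work in the affected area.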

After 17 o’clock, the people in the affected areas go home. This can be roughly inferred from the population distribution: after work, many people leave Uptown and Downtown and return to Lakeside, Plainville and Suburbia.

Figure 12

Now let’s compare the selected people with all people who show flu-like symptoms between 8 and 17 o’clock on May 18 and May 19. (The deep blue points are the people infected on May 18; they are also highlighted in deep blue on May 19 if they sent a message that day.) Although the infected people went back to work on May 19 and stayed in the same areas, other areas such as Riverside and Villa were also affected. The most convincing explanation is that the infected people went home from the central areas and spread the disease to their friends and families. As a result, the flu-like symptom can spread from person to person.

Figure 13

  • Is the flu-like symptom airborne?

The answer depends on the conditions. As analysed above, the starting point of the stomach-ache symptom is near the river, along the boundary between Downtown and Plainville.

Condition 1: If the stomach-ache-like and flu-like symptoms are caused by the same infection, and the different symptoms result from different transmission routes, airborne transmission is highly likely. The west-to-east wind on May 18 and 19 is consistent with the direction of the spread: it started from the river along the boundary between Downtown and Plainville, then moved to Downtown, Uptown and Eastside.

Condition 2: If the two symptoms are caused by different infections, the flu-like infection may have broken out in the centre of Downtown and Uptown. In that case we cannot judge whether it is airborne.

Part 2: Is the outbreak contained? Is it necessary for emergency management personnel to deploy treatment resources outside the affected area? Explain your reasoning.

The outbreak of the flu-like symptom is not contained. Although the outbreak started in Downtown, Uptown and Eastside, all areas were eventually affected. Furthermore, I selected the people who showed flu-like symptoms between 8 and 17 o’clock on May 18. On May 19 between 8 and 17 o’clock, these people were back at work, which means they were still able to work. However, on May 20 many of them went to hospitals in each area, which indicates that the epidemic was becoming much more serious. To prevent the flu-like illness from spreading faster and becoming more severe, emergency management personnel need to deploy treatment resources outside the affected area.

Figure 14

As the graph shows, people who had stomach ache on May 18 stayed in the same areas and went neither to the hospital nor to work. This symptom is unlikely to become more serious, because these people did not even go to the hospital.

Figure 15