Difference between revisions of "ISSS608 2017-18 T1 Assign XU YANRU"

 

Latest revision as of 20:29, 15 October 2017

ISSS608_2017-18_T1_Assign_XU YANRU

Mini Challenge: Illness in Smartpolis

Background

In the past few days, health professionals at Smartpolis hospitals have noticed a significant increase in reported illness. To assess whether an epidemic is developing in this major metropolitan area, city officials provided several datasets: collected microblog messages, a satellite map, weather records, and population data for Smartpolis.

Data Preparation

1. Coordinate detail

The location collected with each microblog message contains the latitude and longitude of where the message was posted. However, the two values are stored in a single column. In JMP, this column is split into two numeric columns named "Latitude" and "Longitude", and the original "Location" column is hidden to avoid confusion.
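The split described here is done in JMP, but the same step can be sketched in Python for readers who want to reproduce it elsewhere. This is a minimal sketch, assuming each "Location" value is one string holding the latitude first and the longitude second, separated by whitespace or a comma. The function name `split_location` and the sample value are hypothetical and are not taken from the dataset.

```python
import re


def split_location(location):
    """Split a combined "Location" string into (latitude, longitude) floats.

    Assumption: latitude comes first, and the two numbers are separated by
    whitespace and/or a comma. Adjust the pattern if the real data differs.
    """
    parts = [p for p in re.split(r"[\s,]+", location.strip()) if p]
    if len(parts) != 2:
        raise ValueError(f"expected two values, got {location!r}")
    latitude, longitude = float(parts[0]), float(parts[1])
    return latitude, longitude


# Hypothetical sample value, for illustration only.
lat, lon = split_location("42.25 93.5")
print(lat, lon)
```

Raising an error on malformed values, rather than silently returning blanks, makes it easier to spot rows that would otherwise be plotted at the wrong place on the map.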

2. Symptom

The microblog postings are unstructured data. Text Explorer in JMP is used to extract and analyze those messages.

It reveals the terms and phrases that are repeated across messages. From this list, phrases that clearly indicate illness are selected and recorded in a new column, "Symptom".

  • Aching Muscles
  • Breathing (Including "shortness of breath")
  • Caught a fever
  • Caught a Pneumonia
  • Chill
  • Declining Health
  • Dry Cough
  • Hurt to Move
  • Running Nose
  • Sore Throat
  • Sick Sucks
  • Medicine Medicine

These records are exported into a .csv file for visualization in Tableau.
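The tagging and export steps can be approximated outside JMP as follows. This is only a sketch under stated assumptions: it uses plain substring matching rather than JMP's Text Explorer, the search phrases are simply the lowercased labels from the list above (plus "shortness of breath" for "Breathing"), and the column names, function names `tag_symptom` and `to_csv`, and sample message are hypothetical.

```python
import csv
import io

# Symptom label -> phrases to search for. The phrases are an assumption:
# the lowercased labels from the list above, not the exact JMP terms.
SYMPTOMS = {
    "Aching Muscles": ["aching muscles"],
    "Breathing": ["breathing", "shortness of breath"],
    "Caught a fever": ["caught a fever"],
    "Caught a Pneumonia": ["caught a pneumonia"],
    "Chill": ["chill"],
    "Declining Health": ["declining health"],
    "Dry Cough": ["dry cough"],
    "Hurt to Move": ["hurt to move"],
    "Running Nose": ["running nose"],
    "Sore Throat": ["sore throat"],
    "Sick Sucks": ["sick sucks"],
    "Medicine Medicine": ["medicine medicine"],
}


def tag_symptom(message):
    """Return the first symptom label whose phrase occurs in the message.

    Returns an empty string when no phrase matches. Matching is a simple
    case-insensitive substring test, so "chill" also matches "chilly".
    """
    text = message.lower()
    for label, phrases in SYMPTOMS.items():
        if any(phrase in text for phrase in phrases):
            return label
    return ""


def to_csv(records):
    """Render (message, latitude, longitude) records as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Message", "Latitude", "Longitude", "Symptom"])
    for message, latitude, longitude in records:
        writer.writerow([message, latitude, longitude, tag_symptom(message)])
    return buffer.getvalue()


# Hypothetical sample record, for illustration only.
print(to_csv([("sore throat again", 42.25, 93.5)]))
```

A message that mentions several symptoms receives only the first matching label here; a real workflow may prefer one row per symptom.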

Data Visualization and Interpretation

Only a few microblog postings mention illness between 30 Apr 2011 and 17 May 2011. From 18 May onward, illness is reported throughout the city. An interactive Tableau view (https://public.tableau.com/views/Symptom/Symptoms?:embed=y&:display_count=yes&publish=yes) can be used to explore the data and the symptoms posted.
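The day-by-day comparison is made visually in Tableau. As a rough cross-check, the same idea can be sketched in Python by counting symptom-tagged postings per day and finding the first day that reaches a chosen count. The function names, the threshold, and the sample records below are hypothetical and are not drawn from the dataset.

```python
from collections import Counter
from datetime import date


def daily_counts(records):
    """Count postings with a non-empty symptom per calendar day.

    records: iterable of (posting_date, symptom) pairs.
    """
    counts = Counter(day for day, symptom in records if symptom)
    return dict(sorted(counts.items()))


def first_day_at_or_above(counts, threshold):
    """Return the earliest day whose count reaches the threshold, or None."""
    for day in sorted(counts):
        if counts[day] >= threshold:
            return day
    return None


# Made-up records, for illustration only.
sample = [
    (date(2011, 5, 1), "Chill"),
    (date(2011, 5, 1), ""),
    (date(2011, 5, 2), "Dry Cough"),
    (date(2011, 5, 2), "Sore Throat"),
    (date(2011, 5, 2), "Chill"),
]
counts = daily_counts(sample)
print(counts)
print(first_day_at_or_above(counts, threshold=3))
```

The threshold is a judgment call; with the real data it would be chosen by looking at the typical daily count before the increase.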

Illness postings captured on 18 May 2011:

Illness postings captured on 30 Apr 2011:

Judging from the change in the number of illness postings over the three weeks, the outbreak likely started downtown: most of the cases posted in the microblog data came from downtown before the illness spread across the city on 18 May 2011.

The illness postings from the first 18 days are all located along the rivers.

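It rained from 3 May to 7 May 2011 (Tableau weather view: https://public.tableau.com/views/Weather_50/Weather?:embed=y&:display_count=yes&publish=yes).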

Taken together, the information above suggests that the illness is transmitted through water.

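Residents who work across different zones of the city may affect the estimate of the affected areas (Tableau population view: https://public.tableau.com/views/Population_139/Population?:embed=y&:display_count=yes&publish=yes).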

Nevertheless, the Smartpolis authorities should also deploy treatment resources outside the affected areas, especially in downstream areas that use water from the same rivers. This will help prevent the infection from spreading to other cities.