ISSS608 2017-18 T1 Assign WANG RUI Investigation step1

From Visual Analytics and Applications
Revision as of 23:14, 15 October 2017 by Ruiwang.2016


5 Steps to the Truth:
Step #1: Understand the data.


First, let's look at the microblog message dataset:
“ID” is the identifier of the person who posted the message. 73,928 users are identified, and these IDs can be used to label persons of interest in the following steps.
“Created_at” gives the date and time when each message was created. The records span 21 days, from April 30th to May 20th. Some records with wrong timestamps were identified, as shown below:

Wrclean1.png

21 of the 1,023,077 records carry wrong timestamps. Because they make up only about 0.002% of the data and their content is not related to the disease, they are removed from the microblog data. After this cleaning step, 1,023,056 message records remain.
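The cleaning step above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used for this page: it assumes that a "wrong" timestamp is one that either fails to parse or falls outside the 21-day window, that timestamps look like `5/18/2011 13:26`, and that the year is 2011. The function names and the sample records are hypothetical.

```python
from datetime import datetime

# Window taken from the text: April 30th to May 20th (21 days).
# ASSUMPTION: the year 2011 is not stated on this page.
WINDOW_START = datetime(2011, 4, 30)
WINDOW_END = datetime(2011, 5, 21)  # exclusive upper bound, i.e. end of May 20th


def parse_timestamp(text, fmt="%m/%d/%Y %H:%M"):
    """Return a datetime, or None if the text does not match the assumed format."""
    try:
        return datetime.strptime(text.strip(), fmt)
    except ValueError:
        return None


def split_clean_dirty(records):
    """Separate records whose Created_at is valid and inside the window from the rest."""
    clean, dirty = [], []
    for rec in records:
        ts = parse_timestamp(rec["Created_at"])
        if ts is not None and WINDOW_START <= ts < WINDOW_END:
            clean.append(rec)
        else:
            dirty.append(rec)
    return clean, dirty


# Hypothetical sample rows, for illustration only.
sample = [
    {"ID": 1, "Created_at": "5/18/2011 13:26"},
    {"ID": 2, "Created_at": "4/31/2011 10:00"},  # impossible date, fails to parse
    {"ID": 3, "Created_at": "6/1/2011 9:00"},    # outside the 21-day window
    {"ID": 4, "Created_at": "4/30/2011 0:00"},
]
clean, dirty = split_clean_dirty(sample)
```

Keeping the rejected rows in a separate list, rather than dropping them silently, makes it easy to inspect their content before deciding that they are unrelated to the disease.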


“Location” gives the latitude and longitude of the place where each message was posted. To map this information onto the dashboard, the combined latitude and longitude value is split into two columns. (Longitude is converted to negative values because the locations are at west longitude.) The process is shown below:

Wrclean2.png
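A minimal sketch of the same split in Python, assuming the combined field holds latitude and longitude as two whitespace-separated numbers with the longitude stored unsigned. The function name `split_location` and the sample value are hypothetical and only illustrate the transformation described above.

```python
def split_location(location):
    """Split a combined 'latitude longitude' string into two floats.

    ASSUMPTION: the two numbers are separated by whitespace and the
    longitude is stored without a sign although it is west longitude.
    """
    lat_text, lon_text = location.split()
    latitude = float(lat_text)
    # West longitude is negative in the usual signed-degree convention.
    longitude = -abs(float(lon_text))
    return latitude, longitude


# Illustrative value only, not taken from the dataset.
lat, lon = split_location("42.22717 93.33772")
```

Using `-abs(...)` rather than a plain negation keeps the result negative even if some rows already carry a minus sign.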


Second, let's look at the weather dataset:
“Wind Speed” and “Wind Direction” are useful supporting evidence for whether the disease is transmitted through the air. To make the analysis more intuitive, shaped icons are used to visualize them.

Wrwind.png
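One way to picture the icon idea is a lookup from wind direction to an arrow glyph. The sketch below is not how the icons on this page were built. It assumes the direction is recorded as a compass abbreviation such as `W` or `NW` and that it follows the meteorological convention of naming where the wind comes from, so the arrow points the opposite way, toward where the air is carried. The name `wind_icon` is hypothetical.

```python
# ASSUMPTION: direction values are compass abbreviations naming where the
# wind blows FROM; each arrow points where the wind blows TO.
ARROW_FOR_WIND_FROM = {
    "N": "↓",
    "NE": "↙",
    "E": "←",
    "SE": "↖",
    "S": "↑",
    "SW": "↗",
    "W": "→",
    "NW": "↘",
}


def wind_icon(direction):
    """Return an arrow glyph for a compass direction, or '?' if unrecognized."""
    return ARROW_FOR_WIND_FROM.get(direction.strip().upper(), "?")
```

For example, a wind recorded as `W` yields an arrow pointing east, which is the direction in which an airborne disease would be expected to spread.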


“Weather”

Wrweather.png


Now, let's look at the population information:
“Population”