DataPreparation

From Visual Analytics and Applications
Revision as of 01:04, 15 October 2017 by Ziwenhe.2016 (talk | contribs)

ISSS608 Visual Analytics and Applications Assignment



Data Preparation

Transforming data

After importing the original dataset into JMP, two columns were found to be dirty and needed to be cleaned. The column "Created_at" was split into two columns, "Date" and "Time". Using the same method, the column "location" was split into "latitude" and "longitude".
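The same two splits can be sketched outside JMP. The following is a minimal Python illustration, not the JMP procedure used in the assignment. It assumes that "Created_at" holds a date and a time separated by whitespace and that "location" holds two numbers separated by whitespace or a comma; the sample row is made up for illustration.

```python
import re


def split_created_at(created_at):
    """Split a timestamp string into (date, time) at the first run of whitespace."""
    date_part, time_part = created_at.strip().split(maxsplit=1)
    return date_part, time_part


def split_location(location):
    """Split a location string into (latitude, longitude) as floats.

    Assumption: the two numbers are separated by whitespace and/or a comma.
    """
    parts = re.split(r"[,\s]+", location.strip())
    if len(parts) != 2:
        raise ValueError(f"expected two coordinates, got: {location!r}")
    return float(parts[0]), float(parts[1])


# Made-up sample row, only to show the shape of the result.
row = {"Created_at": "5/18/2011 13:26", "location": "42.22717 93.33772"}
date_part, time_part = split_created_at(row["Created_at"])
latitude, longitude = split_location(row["location"])
print(date_part, time_part, latitude, longitude)
```

Rows whose location does not split into exactly two parts raise an error instead of being passed through silently, which makes remaining dirty values easy to spot.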

Key Words Selection

The assignment requirements themselves contain many key words. The Text Explorer function in JMP also produced a list of key words from the data. The top 5 key words in that list are all related to illness.

Combining these sources, 13 key words were finally chosen for the report.
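A term list of the kind Text Explorer produces can be approximated with a simple word count. The sketch below is only an illustration of the idea, not a reproduction of JMP's algorithm; the function name top_terms, the stop-word parameter and the sample messages are assumptions made for this example.

```python
import re
from collections import Counter


def top_terms(messages, n=5, stop_words=frozenset()):
    """Return the n most frequent words across all messages as (word, count) pairs."""
    counts = Counter()
    for message in messages:
        # Lower-case the text and keep runs of letters (and apostrophes) as words.
        words = re.findall(r"[a-z']+", message.lower())
        counts.update(word for word in words if word not in stop_words)
    return counts.most_common(n)


# Made-up messages, only to show how the function is called.
sample = ["fever and chills", "bad fever today", "chills again"]
print(top_terms(sample, n=2, stop_words=frozenset({"and"})))
```

The resulting list still has to be reviewed by hand, as was done here when the 13 key words were chosen from several sources.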

Excluding & Hiding Data

After the above steps, all the rows related to the 13 chosen key words were selected and labelled. The selection was then inverted to exclude and hide the rows that do not contain any of the key words.
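The select-then-invert step amounts to keeping only the rows whose message mentions at least one key word. Below is a minimal Python sketch of that filter. The key words shown are placeholders, since the actual 13 key words are not listed here, and the field name "text" is a hypothetical column name.

```python
# Placeholder key words for illustration only; the report uses 13 key words
# that are not reproduced in this sketch.
KEYWORDS = ["fever", "flu", "cough"]


def mentions_keyword(text, keywords=KEYWORDS):
    """Return True if the text contains at least one key word (case-insensitive)."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in keywords)


def keep_relevant(rows, text_field="text"):
    """Keep the rows that mention a key word; all other rows are dropped."""
    return [row for row in rows if mentions_keyword(row[text_field])]


# Made-up rows; "text" is an assumed column name.
rows = [
    {"text": "Got a bad FEVER tonight"},
    {"text": "Traffic is terrible downtown"},
]
print(keep_relevant(rows))
```

This sketch uses plain substring matching, so "flu" would also match inside a longer word such as "influence". Whole-word matching is stricter if that matters for the analysis.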

Processing Data

Tag the Key Words

To tag all 13 key words, a formula was used to create one new column that holds the tags.

After tagging, the table should contain a tag column that covers all the key words.
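The tagging formula can be pictured as a function that checks each message against the key word list and writes the matches into the new column. The Python sketch below shows one possible version under stated assumptions: the function name tag_message is hypothetical, the key words are placeholders, and a message that mentions several key words receives all of them joined by commas, which may differ from the JMP formula actually used.

```python
import re


def tag_message(text, keywords):
    """Return a comma-separated string of the key words found in the text.

    Matching is case-insensitive and on whole words only.
    An empty string means that no key word was found.
    """
    lowered = text.lower()
    found = [
        keyword
        for keyword in keywords
        if re.search(r"\b" + re.escape(keyword.lower()) + r"\b", lowered)
    ]
    return ", ".join(found)


# Placeholder key words and a made-up message, for illustration only.
keywords = ["fever", "flu", "cough"]
print(tag_message("Fever and a bad cough", keywords))
```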

Processing the Time

The time is in HH:MM format, which is not convenient for analysing the final results. The time was therefore transformed into two coarser groupings. One has two categories, WorkingHour and Night. The other has four categories: WorkingHour, Evening, EarlyMorning and Midnight.
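The two groupings can be sketched as functions that map an HH:MM string to a category. The category names come from the text above, but the hour boundaries in this Python sketch are assumptions chosen for illustration (WorkingHour as 09:00 to 17:59, Evening as 18:00 to 23:59, Midnight as 00:00 to 04:59, EarlyMorning as 05:00 to 08:59); the boundaries used in the assignment are not stated.

```python
def hour_of(time_text):
    """Extract the hour as an integer from an HH:MM string."""
    return int(time_text.split(":")[0])


def two_way_period(time_text):
    """Two categories: WorkingHour or Night (assumed boundary: 09:00 to 17:59)."""
    hour = hour_of(time_text)
    return "WorkingHour" if 9 <= hour < 18 else "Night"


def four_way_period(time_text):
    """Four categories; every hour boundary here is an assumption."""
    hour = hour_of(time_text)
    if 9 <= hour < 18:
        return "WorkingHour"
    if 18 <= hour < 24:
        return "Evening"
    if 0 <= hour < 5:
        return "Midnight"
    return "EarlyMorning"  # the remaining hours, 05:00 to 08:59


for sample_time in ["13:26", "19:00", "02:10", "06:30"]:
    print(sample_time, two_way_period(sample_time), four_way_period(sample_time))
```

Keeping the boundaries in one place makes it easy to adjust them and re-run the grouping if a different definition of working hours is preferred.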

Visualising Data in Tableau

Once all the steps above were done, the data cleaning was finished. All the data tagged with key words was then exported to Excel and imported into Tableau. In Tableau, all the key words were plotted on the map via the Map (Background Images) function. The scatterplots for the key words display the trend of the data, and the bar chart for the population shows the population distribution during the day and at night.

Recommendations