ISSS608 2017-18 T1 Assign ZHANG LIDAN

From Visual Analytics and Applications
Revision as of 19:32, 12 October 2017 by Lidan.zhang.2016
Assignment 1 - To be a Visual Detective: D

Background

Data Preparation

To work with the data more easily, I first import the microblog dataset into JMP. This dataset contains a lot of useful information. For example, I can use the location coordinates and the timestamp to identify where and when each message was posted. Then, by tokenizing and stemming the words in each message, I can pick out the high-frequency words and flu-related keywords for further data exploration. The microblog dataset contains 1,023,077 rows. First, I need to separate the location field into longitude and latitude. Because these locations are in the Western Hemisphere, I then convert the longitude coordinates to negative values. Next, to exclude irrelevant information, I create a subset of the dataset consisting of messages that mention the main flu-like symptoms: chill, flu, fever, sweat, pain, fatigue, ache, cough, breath, nausea, vomit, and diarrhea. Here, I use the Text Explorer in JMP to generate the new keyword columns.
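The preparation above was done in JMP. For readers without JMP, the same three steps (splitting the location, negating the longitude, and flagging symptom keywords) can be sketched roughly in Python. This is only an illustration, not the workflow actually used: the field names `text` and `location`, the assumption that the location is stored as a space-separated "latitude longitude" string, and the helper names are all hypothetical, and prefix matching on tokens is a crude stand-in for Text Explorer's tokenizing and stemming.

```python
import re

# Symptom stems listed in the text above.
SYMPTOM_STEMS = ("chill", "flu", "fever", "sweat", "pain", "fatigue",
                 "ache", "cough", "breath", "nausea", "vomit", "diarrhea")


def parse_location(location):
    """Split an assumed 'lat lon' string and force the longitude negative."""
    lat_text, lon_text = location.split()
    # -abs() keeps the result negative even if the value is already negated.
    return float(lat_text), -abs(float(lon_text))


def symptom_flags(message):
    """Map each stem to True if some token in the message starts with it.

    Prefix matching is crude: it catches 'coughing' for 'cough', but misses
    compounds such as 'headache' and can over-match (e.g. 'paint' for 'pain').
    """
    tokens = re.findall(r"[a-z]+", message.lower())
    return {stem: any(tok.startswith(stem) for tok in tokens)
            for stem in SYMPTOM_STEMS}


def prepare(rows):
    """Yield only rows mentioning a symptom, with coordinates and flags added."""
    for row in rows:
        flags = symptom_flags(row["text"])
        if any(flags.values()):
            lat, lon = parse_location(row["location"])
            yield {**row, "latitude": lat, "longitude": lon, **flags}
```

Calling `list(prepare(rows))` on a list of dictionaries with `text` and `location` keys returns the symptom-related subset, with one boolean column per keyword alongside the separated latitude and longitude.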


disease

reference

feedback