ISSS608 2017-18 T1 Assign FOO CELONG RAYMOND/MakingSenseOfTheChatter

From Visual Analytics and Applications
Revision as of 21:02, 13 October 2017 by Raymondfoo.2016
RaymHeader.png



Exploring and Organising the Data

First, I checked the data for dirty records. Sure enough, there are 21 microblogs whose date field has a problem with the time value. I cleaned these records by setting their time to midnight (00:00).

RaymDirtyDates.png
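The write-up does not say which tool was used for the cleaning, so the following is only a minimal Python sketch of the idea: try to parse the full timestamp and, if the time part is unusable, keep the date and fall back to midnight. The format strings, the sample values and the name `clean_timestamp` are assumptions for illustration, not taken from the data set.

```python
from datetime import datetime

# Assumed timestamp layout; the real format of the data is not stated in the write-up.
FULL_FORMAT = "%m/%d/%Y %H:%M"
DATE_FORMAT = "%m/%d/%Y"


def clean_timestamp(raw):
    """Return (timestamp, was_dirty); dirty time values are replaced by midnight."""
    raw = raw.strip()
    try:
        return datetime.strptime(raw, FULL_FORMAT), False
    except ValueError:
        # Keep only the date part; parsing a date without a time yields 00:00.
        date_part = raw.split()[0]
        return datetime.strptime(date_part, DATE_FORMAT), True


# Hypothetical sample values, for illustration only.
samples = ["5/18/2011 13:26", "5/18/2011 ??:??", "5/19/2011"]
cleaned = [clean_timestamp(s) for s in samples]
dirty_count = sum(1 for _, was_dirty in cleaned if was_dirty)
```

Returning the `was_dirty` flag alongside the cleaned value makes it easy to count the affected records and to exclude them later from any time-of-day analysis.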

The number of microblogs is massive, so I looked at how the posts are distributed over time, first by day and then by hour.

RaymMicroblogPerDay.png


RaymMicroblogPerHour.png
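The two distributions above are simple counts. As a rough sketch, assuming the timestamps have already been parsed into `datetime` objects and that "per hour" means hour of day (the chart may equally show an hourly timeline), the counts could be computed as follows. The function names and sample timestamps are hypothetical.

```python
from collections import Counter
from datetime import datetime


def posts_per_day(timestamps):
    """Count microblogs for each calendar day."""
    return Counter(ts.date() for ts in timestamps)


def posts_per_hour(timestamps):
    """Count microblogs for each hour of the day (0 to 23)."""
    return Counter(ts.hour for ts in timestamps)


# Hypothetical sample timestamps, for illustration only.
timestamps = [
    datetime(2011, 5, 18, 9, 5),
    datetime(2011, 5, 18, 9, 40),
    datetime(2011, 5, 19, 0, 0),
]
per_day = posts_per_day(timestamps)
per_hour = posts_per_hour(timestamps)
```

One point to keep in mind with this approach: records whose time was reset to midnight during cleaning are all counted in hour 0, so that bucket is slightly inflated.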

The data need to be organised so that they can be analysed easily later. The obvious choice is to organise them by city zone. I grouped the microblogs by zone, creating a column that stores the zone from which each microblog was transmitted.

RaymMicroblogByZone.png
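How the zone column was derived is not described, so here is one possible sketch in Python. It assumes each microblog carries a latitude and longitude and simplifies every zone to a rectangular bounding box; real city zones are usually irregular polygons and would need a point-in-polygon test instead. The zone names and coordinates below are placeholders, not values from the data.

```python
from typing import NamedTuple


class Zone(NamedTuple):
    name: str
    min_lat: float
    max_lat: float
    min_lon: float
    max_lon: float


# Placeholder zones; the real zone boundaries are not given in the write-up.
ZONES = [
    Zone("Zone A", 0.0, 1.0, 0.0, 1.0),
    Zone("Zone B", 0.0, 1.0, 1.0, 2.0),
]


def assign_zone(lat, lon, zones=ZONES):
    """Return the name of the first zone whose bounding box contains the point."""
    for zone in zones:
        if zone.min_lat <= lat < zone.max_lat and zone.min_lon <= lon < zone.max_lon:
            return zone.name
    return "Unassigned"


# Hypothetical microblog locations as (latitude, longitude) pairs.
locations = [(0.5, 0.5), (0.5, 1.5), (5.0, 5.0)]
zone_column = [assign_zone(lat, lon) for lat, lon in locations]
```

Using half-open intervals means a point on a shared edge falls in exactly one zone, and the explicit "Unassigned" label makes records outside every zone easy to spot.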

Next, I also created an indicator column for microblogs that were transmitted near the various points of interest.

RaymMicroblogByPlaceOfInterest.png
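The write-up does not define "near", so the sketch below makes an assumption: a microblog counts as near a point of interest when it lies within a fixed radius of it, measured with the haversine great-circle distance. The point-of-interest names, coordinates and the 1 km radius are all hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0
RADIUS_KM = 1.0  # assumed threshold for "near"; not stated in the write-up

# Placeholder points of interest as name -> (latitude, longitude).
POINTS_OF_INTEREST = {
    "POI 1": (0.0, 0.0),
    "POI 2": (0.0, 1.0),
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def near_poi(lat, lon, pois=POINTS_OF_INTEREST, radius_km=RADIUS_KM):
    """Return the name of the closest point of interest within the radius, or None."""
    best_name, best_dist = None, radius_km
    for name, (poi_lat, poi_lon) in pois.items():
        dist = haversine_km(lat, lon, poi_lat, poi_lon)
        if dist <= best_dist:
            best_name, best_dist = name, dist
    return best_name


# The indicator column is then simply whether a nearby point of interest exists.
locations = [(0.005, 0.0), (0.02, 0.0), (0.0, 0.999)]
indicator_column = [near_poi(lat, lon) is not None for lat, lon in locations]
```

Returning the name rather than a bare flag keeps the option of filtering by a specific point of interest later, while the boolean indicator is still one comparison away.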