ISSS608 2016-17 T1 Assign3 Franky Eddy

From Visual Analytics and Applications

Abstract

DinoFun World held a weekend tribute to Scott Jones, internationally renowned football star. He was a classic hometown hero, with thousands of fans who cheered his success as if he were a beloved family member. However, the event did not go as planned. Scott’s weekend was marred by crime and mayhem perpetrated by a poor, misguided and disgruntled figure from Scott’s past. While the crimes were rapidly solved, park officials and law enforcement figures are interested in understanding just what happened during that weekend to better prepare themselves for future events. They are interested in understanding how people move and communicate in the park, how these patterns change and evolve over time, and what can be understood about the motivations for changing patterns.

To help answer these questions, data visualization is used for two tasks:

  • Identify those IDs that stand out for their large volumes of communication
  • Hypothesize, from the communication patterns, when the vandalism was discovered


Overview of Data


DinoFun World is a typical modest-sized amusement park, sitting on about 215 hectares and hosting thousands of visitors each day. It has a small town feel, but it is well known for its exciting rides and events.

One event last year was a weekend tribute to Scott Jones, internationally renowned football (“soccer,” in US terminology) star. Scott Jones is from a town nearby DinoFun World. He was a classic hometown hero, with thousands of fans who cheered his success as if he were a beloved family member. To celebrate his years of stardom in international play, DinoFun World declared “Scott Jones Weekend”, where Scott was scheduled to appear in two stage shows each on Friday, Saturday, and Sunday to talk about his life and career. In addition, a show of memorabilia related to his illustrious career would be displayed in the park’s Pavilion. However, the event did not go as planned. Scott’s weekend was marred by crime and mayhem perpetrated by a poor, misguided and disgruntled figure from Scott’s past.

While the crimes were rapidly solved, park officials and law enforcement figures are interested in understanding just what happened during that weekend to better prepare themselves for future events. They are interested in understanding how people move and communicate in the park, how these patterns change and evolve over time, and what can be understood about the motivations for changing patterns.

Objectives


The data consists of in-app communication records from the three days of the Scott Jones celebration. It includes communications between the paying park visitors, as well as communications between the visitors and park services. In addition, the data contains records indicating if and when a user sent a text to an external party.

The objective is to use visual analytics to analyze the available data and develop responses to the questions below.

  • Identify those IDs that stand out for their large volumes of communication.
    • Characterize the communication patterns you see.
    • Based on these patterns, what do you hypothesize about these IDs?
    • Describe up to 10 communications patterns in the data. Characterize who is communicating, with whom, when and where.
    • From this data, can you hypothesize when the vandalism was discovered? Describe your rationale.


Approaches


The step-by-step approach is described below.

  • Identify those IDs that stand out for their large volumes of communication

To identify the IDs that have a large volume of communication, JMP Pro is used to examine the communication data for each of the three days (Friday, Saturday, and Sunday). Some interesting observations can be seen below.
Comm Friday Franky.png
Comm Saturday Franky.png
Comm Sunday Franky.png

Based on these observations, three IDs have a very large volume of communication compared to the other IDs: 1278894, 839736, and external.
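The ranking of IDs by communication volume was done in JMP Pro, but the same idea can be sketched in a few lines of Python. The sketch below is only an illustration: it assumes each record is a (sender, receiver) pair, and the toy records and the helper name `communication_volume` are hypothetical, not taken from the park data.

```python
from collections import Counter

# Toy (sender, receiver) records; hypothetical values, not the park data.
messages = [
    ("A", "B"),
    ("A", "C"),
    ("B", "A"),
    ("C", "external"),
    ("A", "external"),
]

def communication_volume(records):
    """Count how many messages each ID sent or received."""
    volume = Counter()
    for sender, receiver in records:
        volume[sender] += 1
        volume[receiver] += 1
    return volume

volume = communication_volume(messages)
# IDs ordered by total volume, highest first.
top_ids = volume.most_common(3)
print(top_ids)
```

Counting sent and received messages together is one possible design choice; counting them separately would additionally show IDs that only send or only receive.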

  • Characterize the communication patterns you see

After identifying the IDs that stand out, the next step is characterizing the communication pattern.

1. 1278894
Characteristics of pattern:

  • This ID makes the most communications over the three days of the event
  • This ID is located only at Entry Corridor
  • This ID communicates only during certain time windows (12 PM to 1 PM, 2 PM to 3 PM, 4 PM to 5 PM, 6 PM to 7 PM, and 8 PM to 9 PM), with an interval of 5 minutes between messages


From 1278894 Franky.png
Based on these communication patterns, this ID is most probably the Cindysaurus Trivia Game from the DinoFun World app.
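A regular sending pattern such as the one described for this ID can be checked by looking at the gaps between consecutive messages and at the hours in which the ID is active. The following is a minimal sketch, assuming timestamps are available as "HH:MM" strings for a single day; the sample times and the function name `gaps_in_minutes` are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Toy send times for one ID on one day; hypothetical values for illustration.
raw_times = ["12:00", "12:05", "12:10", "14:00", "14:05"]

def gaps_in_minutes(time_strings):
    """Return the gaps, in minutes, between consecutive sorted timestamps."""
    times = sorted(datetime.strptime(t, "%H:%M") for t in time_strings)
    return [int((b - a).total_seconds() // 60) for a, b in zip(times, times[1:])]

gaps = gaps_in_minutes(raw_times)
# A dominant gap value suggests a scheduled, automated sender.
gap_counts = Counter(gaps)
# Hours of the day in which the ID sent at least one message.
active_hours = sorted({datetime.strptime(t, "%H:%M").hour for t in raw_times})

print(gap_counts.most_common(1), active_hours)
```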

2. 839736
Characteristics of pattern:

  • This ID's location is also only at Entry Corridor


Location Franky.png

  • Automatically responds to each received message in 5 minutes
  • Same number of communications sent and received


Based on these communication patterns, this ID is most probably the Information Center.
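The two observations above (balanced sent and received counts, and a fixed reply delay) can be verified with a simple matching pass over the records. This is a sketch under stated assumptions: each record is a (minute, sender, receiver) tuple, `SERVICE` is a hypothetical stand-in for the ID under study, and each reply is matched to the oldest unanswered message from the same visitor.

```python
SERVICE = "S"  # hypothetical stand-in for the ID being examined

# Toy (minute, sender, receiver) records; not taken from the park data.
records = [
    (0, "v1", "S"),
    (3, "v2", "S"),
    (5, "S", "v1"),
    (8, "S", "v2"),
]

def sent_received(records, target):
    """Return (messages sent by target, messages received by target)."""
    sent = sum(1 for _, sender, _ in records if sender == target)
    received = sum(1 for _, _, receiver in records if receiver == target)
    return sent, received

def reply_delays(records, target):
    """Match each reply from target to the oldest unanswered incoming message."""
    pending = {}  # visitor ID -> minutes of messages still awaiting a reply
    delays = []
    for minute, sender, receiver in sorted(records):
        if receiver == target:
            pending.setdefault(sender, []).append(minute)
        elif sender == target and pending.get(receiver):
            delays.append(minute - pending[receiver].pop(0))
    return delays

print(sent_received(records, SERVICE), reply_delays(records, SERVICE))
```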

3. External
Characteristics of pattern:

  • Only receives messages and never sends any
  • Never communicates with the two IDs with highest number of communications (1278894 and 839736)
  • Communications happen anytime in the day during the event


Based on these communication patterns, this ID most probably represents an external party, that is, a recipient outside the park.

  • Communications Patterns in the Data


  • Communication to 839736 on Sunday

Communication to 839736 Franky.png
From the graph, it can be seen that the number of communications increased suddenly and sharply between about 11:45 AM and 12:00 PM on Sunday. There is another, smaller peak in communications between about 2:40 PM and 3:00 PM. These spikes could indicate that the vandalism was discovered at around those times.
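Spikes like these were read off a Tableau chart, but the underlying step is a count of messages per time bin followed by a comparison against a baseline. The sketch below assumes "HH:MM" timestamps, 15-minute bins, and a threshold of three times the median bin count; the bin width, the threshold, and the sample times are all illustrative assumptions rather than values used in the analysis.

```python
from collections import Counter
from statistics import median

BIN_MINUTES = 15  # assumed bin width

def to_minutes(hhmm):
    """Convert an 'HH:MM' string to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)

def bin_counts(times, bin_minutes=BIN_MINUTES):
    """Count messages per time bin, keyed by the bin's start time."""
    counts = Counter()
    for t in times:
        start = to_minutes(t) // bin_minutes * bin_minutes
        counts[f"{start // 60:02d}:{start % 60:02d}"] += 1
    return counts

def spikes(counts, factor=3):
    """Return bins whose count exceeds factor times the median bin count."""
    baseline = median(counts.values())
    return sorted(label for label, n in counts.items() if n > factor * baseline)

# Toy timestamps; hypothetical values, not the park data.
times = ["09:01", "09:16", "09:31", "09:33", "09:35",
         "09:38", "09:40", "09:47", "10:02"]
counts = bin_counts(times)
print(spikes(counts))
```

Using the median as the baseline keeps a single very large bin from inflating the threshold, which a mean-based baseline would do.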

  • Communication to External on Sunday


Comm to External Franky.png
From the graph above, it can be seen that there is a very large increase in the number of communications to External between about 11:45 AM and 12:00 PM on Sunday, the same interval as the first spike in communications to 839736.

  • Location of Communication to Information Center when the Spikes Happened


Comm to 839736 Location Franky.png

From this graph, it can be seen that most of the communications to 839736 come from Wet Land. This could indicate that something happened at Wet Land between 11:45 AM and 12:00 PM.
The earlier graph also showed a smaller spike of communications between about 2:40 PM and 3:00 PM. Therefore, the locations of the communications during this second spike also need to be investigated.
Comm Location Franky.png
From the graph, it can be seen that most of the communications to 839736 between 2:40 PM and 3:00 PM come from Coaster Alley. This suggests that the location of the activity has shifted from Wet Land to Coaster Alley.
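The location breakdown for a given spike amounts to filtering the messages to a time window and counting them per location. A minimal sketch follows, assuming each record is a (time, location) pair with zero-padded "HH:MM" times so that string comparison matches chronological order; the area names and times are placeholders, not park data.

```python
from collections import Counter

def location_counts(records, start, end):
    """Count messages per location with start <= time < end."""
    return Counter(loc for t, loc in records if start <= t < end)

# Toy (time, location) records; hypothetical values for illustration.
records = [
    ("09:31", "Area X"),
    ("09:33", "Area X"),
    ("09:35", "Area Y"),
    ("09:50", "Area Y"),
    ("10:05", "Area Y"),
]

window = location_counts(records, "09:30", "09:45")
# The most frequent location inside the window.
busiest = window.most_common(1)[0][0]
print(window, busiest)
```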

  • Communications to 839736

Another interesting thing to find out is who communicates the most with the Information Center.
Comm to 839736 Franky.png
From this graph, it can be seen that a few IDs communicate much more than the others. The top three communicators are 1149884, 1601276, and 1217381.
The timing of the communications from these top three communicators is shown in the figure below.

  • High Communication to 839736

High Communication to 839736 Franky.png
From this graph, it can be seen that the three IDs "take turns" communicating with the Information Center (839736). Between 11:30 AM and 12:30 PM, ID 1601276 communicates most with the Information Center. From 12:30 PM to about 2:00 PM, ID 1149884 communicates most. From about 2:00 PM to 3:00 PM, ID 1217381 communicates most. Based on these communication patterns, these three IDs are most probably security officers.
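The "taking turns" pattern can be made explicit by finding, for each time bin, the sender with the most messages to the ID of interest. The sketch below assumes records of (minute of day, sender) for messages addressed to a single receiver and uses 30-minute bins; the bin width, the sample records, and the function name are hypothetical.

```python
from collections import Counter, defaultdict

def dominant_sender_per_bin(records, bin_minutes=30):
    """Map each bin start (minute of day) to the sender with the most messages."""
    per_bin = defaultdict(Counter)
    for minute, sender in records:
        per_bin[minute // bin_minutes * bin_minutes][sender] += 1
    return {start: c.most_common(1)[0][0] for start, c in sorted(per_bin.items())}

# Toy (minute_of_day, sender) records for one receiver; not the park data.
records = [
    (690, "p"), (700, "p"), (705, "q"),
    (730, "q"), (735, "q"), (740, "p"),
]

dominant = dominant_sender_per_bin(records)
print(dominant)
```

If the dominant sender changes from one bin to the next while the overall volume stays high, that is consistent with senders handing over to each other.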

  • Communications Visualization


The visualization of communications to the Information Center (839736) can be seen below.
Comm Visualization Franky.png

The visualization above shows the communications to the Information Center on Sunday between 11:45 AM and 12:30 PM. There are many more communications to the Information Center around that time, which indicates that something was wrong during this period.
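A network view of this kind needs the messages aggregated into weighted edges before they are loaded into Gephi. The sketch below shows one way to build such an edge list as CSV text with Source, Target, and Weight columns, which Gephi's spreadsheet import recognizes for edge tables. The sample pairs and the function name `edge_list_csv` are hypothetical, and this is not necessarily how the data was prepared for the figure.

```python
import csv
import io
from collections import Counter

def edge_list_csv(messages):
    """Aggregate (sender, receiver) pairs into a weighted edge list as CSV text."""
    weights = Counter(messages)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(weights.items()):
        writer.writerow([source, target, weight])
    return buffer.getvalue()

# Toy (sender, receiver) pairs; hypothetical values for illustration.
messages = [("a", "hub"), ("a", "hub"), ("b", "hub")]
csv_text = edge_list_csv(messages)
print(csv_text)
```

To produce a file for import, the returned text can be written to disk with an ordinary `open(path, "w")` call.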

Conclusion


From this data and communication patterns, it can be hypothesized that the vandalism was discovered at about 11:45 AM to 12:00 PM on Sunday.

Visualisation Software

To perform the visual analysis, the following software is used:

  • Tableau : to visualize the data
  • JMP Pro : to prepare the data
  • Gephi : to visualize the communication data

