Difference between revisions of "ISSS608 2016-17 T1 Assign3 Frandy Eddy"

From Visual Analytics and Applications
Revision as of 00:09, 28 October 2016

Abstract

DinoFun


Overview

DinoFun World is a typical modest-sized amusement park, sitting on about 215 hectares and hosting thousands of visitors each day. It has a small town feel, but it is well known for its exciting rides and events.
One event last year was a weekend tribute to Scott Jones, an internationally renowned football (“soccer,” in US terminology) star. Scott Jones is from a town near DinoFun World. He was a classic hometown hero, with thousands of fans who cheered his success as if he were a beloved family member. To celebrate his years of stardom in international play, DinoFun World declared “Scott Jones Weekend”, where Scott was scheduled to appear in two stage shows each on Friday, Saturday, and Sunday to talk about his life and career. In addition, a show of memorabilia related to his illustrious career would be displayed in the park’s Pavilion. However, the event did not go as planned. Scott’s weekend was marred by crime and mayhem perpetrated by a poor, misguided and disgruntled figure from Scott’s past.
While the crimes were rapidly solved, park officials and law enforcement figures are interested in understanding just what happened during that weekend to better prepare themselves for future events. They are interested in understanding how people move and communicate in the park, how patterns change and evolve over time, and what can be understood about the motivations for changing patterns.


The Task

To be a visual detective

  1. Identify those IDs that stand out for their large volumes of communication. For each of these IDs
    1. Characterize the communication patterns you see.
    2. Based on these patterns, what do you hypothesize about these IDs?
  2. Describe up to 10 communications patterns in the data. Characterize who is communicating, with whom, when and where. If you have more than 10 patterns to report, please prioritize those patterns that are most likely to relate to the crime.
  3. From this data, can you hypothesize when the vandalism was discovered? Describe your rationale.


Data Preparation

Before starting the analysis, we need to prepare the data so that it is in a usable form.


Results & Findings

A few IDs have a much larger volume of communication than the others: 1278894, 839736, and external.
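As an illustration of how such IDs could be singled out, the following is a minimal sketch that counts messages sent plus received per ID. It is not the method used in this report (the analysis was done in Tableau, JMP Pro, and Gephi). The record layout (sender, receiver) and the toy records are assumptions for illustration only.

```python
from collections import Counter

def communication_volume(records):
    """Count messages sent plus received for every ID.

    `records` is an iterable of (sender, receiver) pairs; this layout is
    an assumption, not the actual schema of the park data.
    """
    volume = Counter()
    for sender, receiver in records:
        volume[sender] += 1
        volume[receiver] += 1
    return volume

# Hypothetical toy records, not taken from the real data set.
toy_records = [
    ("1278894", "101"),
    ("1278894", "102"),
    ("101", "839736"),
    ("839736", "101"),
    ("102", "external"),
]

volume = communication_volume(toy_records)
# The IDs with the largest totals come first.
ranked = volume.most_common()
```

On the real data, the IDs at the head of `ranked` would be the candidates to inspect further.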

1. ID 1278894
- Patterns:

  • This ID has the highest volume of communication on all 3 days.
  • The volumes of communication from and to this ID are almost the same, with slightly more messages sent than received (~0.30% difference).
  • This ID is always located at the Entry Corridor.
  • Communications from this ID only happen at certain times. It sends broadcast messages at 12:00 PM - 12:55 PM, 2:00 PM - 2:55 PM, 4:00 PM - 4:55 PM, 6:00 PM - 6:55 PM, and 8:00 PM - 8:55 PM, with an interval of 5 minutes between broadcasts.
  • This ID never communicates with ID 839736 or external, which have the second and third highest volumes of communication over the 3 days.

- Based on these patterns, it can be hypothesized that this ID is the Cindysaurus Trivia Game, which is available from the DinoFun World app.
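The sent-versus-received comparison above could be checked with a sketch like the one below. How the percentage difference in this report was computed is not stated, so the definition used here (difference relative to the number of received messages) is an assumption, as are the record layout and the toy data.

```python
def sent_received_difference(records, target):
    """Return (sent, received, percent_difference) for `target`.

    percent_difference is 100 * (sent - received) / received, which is
    one possible definition and an assumption of this sketch. It is None
    when the ID received nothing.
    """
    sent = sum(1 for sender, _ in records if sender == target)
    received = sum(1 for _, receiver in records if receiver == target)
    if received == 0:
        return sent, received, None
    return sent, received, 100.0 * (sent - received) / received

# Hypothetical (sender, receiver) pairs, not real data.
toy_pairs = [
    ("T", "a"), ("T", "b"), ("T", "c"),
    ("a", "T"), ("b", "T"),
]
result = sent_received_difference(toy_pairs, "T")
```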


2. ID 839736
- Patterns:

  • This ID always responds within 5 minutes after it receives a message.
  • The volumes of communication from and to this ID are almost exactly the same (~0.01% difference).
  • This ID is always located at the Entry Corridor.
  • Communications involving this ID happen at all times throughout the day.
  • This ID never communicates with ID 1278894 or external, which have the highest and third highest volumes of communication over the 3 days.

- Based on these patterns, it can be hypothesized that this ID is the visitor information center.
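The "responds within 5 minutes" pattern could be verified with a sketch along these lines. It lists incoming messages that received no reply from the ID within the window. The (timestamp, sender, receiver) layout, the function name, and the toy records are all hypothetical.

```python
from datetime import datetime, timedelta

def unanswered_messages(records, target, window=timedelta(minutes=5)):
    """Return (timestamp, sender) for messages to `target` that got no
    reply from `target` to that sender within `window`."""
    # Collect the times at which `target` wrote to each correspondent.
    replies = {}
    for ts, sender, receiver in records:
        if sender == target:
            replies.setdefault(receiver, []).append(ts)

    missed = []
    for ts, sender, receiver in records:
        if receiver != target:
            continue
        answered = any(ts <= r <= ts + window for r in replies.get(sender, []))
        if not answered:
            missed.append((ts, sender))
    return missed

# Hypothetical records on a made-up date, not real data.
day = datetime(2000, 1, 1)
toy_log = [
    (day.replace(hour=10, minute=0), "a", "T"),
    (day.replace(hour=10, minute=3), "T", "a"),   # reply after 3 minutes
    (day.replace(hour=11, minute=0), "b", "T"),
    (day.replace(hour=11, minute=10), "T", "b"),  # reply after 10 minutes
]
missed = unanswered_messages(toy_log, "T")
```

An empty result on the real data would be consistent with the pattern described above.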


3. External
- Patterns:

  • This ID only receives messages and does not send any messages.
  • Communications involving this ID happen at all times throughout the day.
  • This ID never communicates with ID 1278894 or 839736, which are the two IDs with the highest volumes of communication over the 3 days.

- Based on these patterns, it can be hypothesized that this ID represents an external party.
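The observation that the three high-volume IDs never communicate with one another could be checked with a small sketch such as the following. The (sender, receiver) layout and the toy pairs are assumptions for illustration.

```python
def ever_communicate(records, id_a, id_b):
    """Return True if any message goes directly between id_a and id_b,
    in either direction."""
    pair = {id_a, id_b}
    return any({sender, receiver} == pair for sender, receiver in records)

# Hypothetical (sender, receiver) pairs, not real data.
toy_links = [
    ("1278894", "101"),
    ("101", "external"),
    ("839736", "102"),
]
direct = ever_communicate(toy_links, "1278894", "839736")
```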

Because the data set is very large, we only analyze the communication patterns of a subset of the data that we find interesting. The following communication patterns were found in the data.
1. Communication patterns of ID with high volume of communication
2. Period when there is a spike in the volume of communication
3. Location where the spike in the volume of communication comes from
4. Abc
5. Abc

From the communication patterns, the vandalism was probably discovered at around 11:45 AM – 12:00 PM on Sunday.
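One way to look for such a moment is to count messages in fixed time bins and find the bin with the largest increase over the previous one. The sketch below illustrates that idea only. It is not the procedure used in this report, and the 15-minute bin size, the function names, and the toy timestamps are assumptions.

```python
from collections import Counter
from datetime import datetime

def counts_per_bin(timestamps, minutes=15):
    """Count messages per time bin, keyed by the start of each bin."""
    bins = Counter()
    for ts in timestamps:
        start = ts.replace(minute=ts.minute - ts.minute % minutes,
                           second=0, microsecond=0)
        bins[start] += 1
    return dict(sorted(bins.items()))

def largest_jump(bins):
    """Return (bin_start, increase) for the biggest rise between
    consecutive non-empty bins. Empty bins are not represented, so a
    gap in the data is treated as if the neighbouring bins were adjacent."""
    keys = list(bins)
    best = None
    for prev, cur in zip(keys, keys[1:]):
        jump = bins[cur] - bins[prev]
        if best is None or jump > best[1]:
            best = (cur, jump)
    return best

# Hypothetical timestamps on a made-up date, not real data.
d = datetime(2000, 1, 2)
toy_times = [
    d.replace(hour=11, minute=31), d.replace(hour=11, minute=40),
    d.replace(hour=11, minute=46), d.replace(hour=11, minute=47),
    d.replace(hour=11, minute=50), d.replace(hour=11, minute=58),
    d.replace(hour=12, minute=5),
]
spike = largest_jump(counts_per_bin(toy_times))
```

A spike found this way only marks when communication volume changed sharply. Linking it to the discovery of the vandalism still requires looking at where the messages came from.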


Software Used

  • Tableau 10.0 - Used for data visualization
  • JMP Pro - Used for data preparation and analysis
  • Gephi - Used for visualization of communication pattern