ISSS608 2016-17 T1 Assign3 Ong Han Ying - Act 2

From Visual Analytics and Applications
Where is the crime?


ACT#02 - Begin With an Open Mind
“I make a point of never having any prejudices, and of following docilely wherever fact may lead me.” – The Reigate Puzzle


Part 1 : Who are they?
Part 2 : When and Where?
Part 3 : Diving Deep: Who are they, again?
Part 4 : What's next?




From Part 1 : Who are they?
_____Question 1: Who are those that have a high communication exchange?
__________Answer #1 2 Distinct IDs!
_____Question 2: Where and when do they talk, and whom do they receive messages from?
__________Answer #2A ID 1278894
_______________FINDINGS #1
_______________FINDINGS #2
__________Answer #2B ID 839736
_______________FINDINGS #3
_______________FINDINGS #4
_______________FINDINGS #5
From Part 2 : When and Where?
_____Question 3: When and where are these communications conducted?
__________Answer #3A An Overview
_______________FINDINGS #6
__________Answer #3B Distribution by Time & Venue
_______________FINDINGS #7
From Part 3 - Diving Deep: Who are they, again?
_____Question 4: Whom Do They Communicate With?
__________Answer #4A Communication between 2330-2335 on Saturday
_______________FINDINGS #8
__________Answer #4B High communication frequency at Coaster Alley at 11AM for all 3 days
_______________FINDINGS #9
__________Answer #4C High communication frequency at Coaster Alley at 4PM For Fri & Sat only
_______________FINDINGS #10
__________Answer #4D High Communication at Kiddie Land, at 5PM
_______________FINDINGS #11

From Part 1 : Who are they?

Question 1: Who are those that have a high communication exchange?

Answer #1 2 Distinct IDs!

A boxplot reveals the following:

2 Distinct Outliers

For more details, please go to: Behind the Scene - #Act02 - Fun Fact#01
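
The boxplot reading above can be reproduced numerically. The sketch below is only an illustration, assuming the communication records are available as (sender, receiver) pairs and using the usual boxplot rule (anything above Q3 + 1.5 x IQR is an outlier). The function name, the record layout and the toy IDs are assumptions, not the actual workflow used for the chart.

```python
from collections import Counter
from statistics import quantiles


def boxplot_outliers(messages, k=1.5):
    """Return sender IDs whose message count lies above Q3 + k * IQR."""
    # Count how many messages each sender sent.
    counts = Counter(sender for sender, _receiver in messages)
    # Quartiles of the per-sender counts (needs at least two senders).
    q1, _median, q3 = quantiles(counts.values(), n=4, method="inclusive")
    upper_fence = q3 + k * (q3 - q1)
    return sorted(s for s, c in counts.items() if c > upper_fence)


# Toy data, invented for illustration: five ordinary guests and one heavy sender.
toy = (
    [("a", "x")] * 2 + [("b", "x")] * 2 + [("c", "x")] * 3
    + [("d", "x")] * 3 + [("e", "x")] * 4 + [("hub", "x")] * 50
)
print(boxplot_outliers(toy))
```

The same function can be run on receivers instead of senders by swapping the two fields, which is how a heavy receiver such as a helpdesk would stand out.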

Question 2: Where and when do they talk, and whom do they receive messages from?

Answer #2A ID 1278894

At the top of the list is ID 1278894; his/her communication pattern is shown below:

Communication Pattern of ID 1278894

  1. From the heatmap, we can see that ID 1278894 sends out a message every 5 minutes, repeating at equal 2-hour intervals, between 1200 and 2000.
  2. Further analysis shows that this ID sends only from the Entry Corridor and never moves.
  3. This should be a server, and is likely the DinoFun World application.
  4. It looks like most people came on Sunday.

  1. Since the server sends its messages at regular intervals, most guests respond within the time frame. However, 2 outliers respond more than 30 minutes after receiving the message!
  2. Upon further investigation, both outliers belong to the same ID, 1765818. This makes him/her suspicious.
Suspect #1 ID 1765818
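
The delay check behind this finding can be sketched as below. This is a hedged illustration only: it assumes each record is a (timestamp, sender, receiver) tuple and measures each reply against the most recent server message sent to that guest. The function name, the record layout and the toy IDs are assumptions.

```python
from datetime import datetime, timedelta


def slow_responders(messages, server_id, threshold=timedelta(minutes=30)):
    """List (guest, delay) for replies to the server slower than threshold."""
    last_received = {}  # guest -> time of the latest unanswered server message
    slow = []
    for ts, sender, receiver in sorted(messages):
        if sender == server_id:
            last_received[receiver] = ts
        elif receiver == server_id and sender in last_received:
            # Pair the reply with the pending server message, then clear it.
            delay = ts - last_received.pop(sender)
            if delay > threshold:
                slow.append((sender, delay))
    return slow


# Toy data, invented for illustration (placeholder date and IDs).
t0 = datetime(2000, 1, 1, 12, 0)
toy = [
    (t0, "server", "g1"),
    (t0, "server", "g2"),
    (t0 + timedelta(minutes=3), "g1", "server"),
    (t0 + timedelta(minutes=45), "g2", "server"),
]
print(slow_responders(toy, "server"))
```

Pairing each reply with only the latest server message is a simplification; a guest who ignores one broadcast and answers the next would be measured against the later one.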

Answer #2B ID 839736

Communication Pattern of ID 839736

  1. From the heatmap, we can see that ID 839736 receives messages between 0800 and 2330, with messages arriving every minute.
  2. Further analysis shows that this ID sends only from the Entry Corridor and never moves.
  3. This should be the service helpdesk!

  1. The helpdesk receives the largest number of messages between 1200 and 1223.
  2. It responds readily between 1201 and 1225.
  3. This high volume of communication occurs in the Wetland.
The peak between 1220-1225
The peak between 1220-1225 - At Wetland

  1. The helpdesk receives a significant number of messages between 1440 and 1442.
  2. These are found in Coaster Alley instead.
The 2nd peak between 1440-1442
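
Peaks like these can be located by counting the messages a given ID receives per minute and per location. The sketch below assumes records of the form (timestamp, sender, receiver, location); the function name, the field order and the toy records are assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime


def busiest_minutes(messages, receiver_id, top=3):
    """Return the top (location, minute) bins by messages sent to receiver_id."""
    counts = Counter(
        # Truncate each timestamp to the start of its minute.
        (location, ts.replace(second=0, microsecond=0))
        for ts, _sender, receiver, location in messages
        if receiver == receiver_id
    )
    return counts.most_common(top)


# Toy records, invented for illustration (placeholder date and IDs).
toy = [
    (datetime(2000, 1, 1, 12, 20, 5), "g1", "helpdesk", "Wetland"),
    (datetime(2000, 1, 1, 12, 20, 40), "g2", "helpdesk", "Wetland"),
    (datetime(2000, 1, 1, 14, 40, 10), "g3", "helpdesk", "Coaster Alley"),
    (datetime(2000, 1, 1, 14, 41, 0), "g3", "g1", "Coaster Alley"),
]
print(busiest_minutes(toy, "helpdesk", top=1))
```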

From Part 2 : When and Where?

Question 3: When and where are these communications conducted?

Answer #3A An Overview

The distribution of messages sent over time, excluding the server and helpdesk, is shown below:

Overview - excluding Server & Helpdesk

  1. There is unusually high communication after 2330 on Saturday. Further investigation is required.

Answer #3B Distribution by Time & Venue

The distribution of messages sent over time, by venue, is shown below:

Communication by Time & Venue

  1. There is a common peak at 11AM at Coaster Alley on all 3 days, and at 4PM at Coaster Alley on Friday and Saturday.
  2. There is a significantly higher spike at 1700 at Kiddie Land.
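
One way to turn the chart reading above into a check is to count messages per venue, day and hour, drop the server and helpdesk, and flag the slots that are far above the venue's typical hourly count. The sketch below is an assumption-laden illustration: the threshold (3 times the venue median), the record layout (timestamp, sender, receiver, venue) and all names are hypothetical, and the findings above were read off the charts rather than computed this way.

```python
from collections import Counter, defaultdict
from datetime import datetime
from statistics import median


def hourly_spikes(messages, excluded_ids, factor=3.0):
    """Return (venue, date, hour) slots with more than factor * venue median."""
    counts = Counter()
    for ts, sender, receiver, venue in messages:
        if sender in excluded_ids or receiver in excluded_ids:
            continue  # skip server and helpdesk traffic
        counts[(venue, ts.date(), ts.hour)] += 1
    # Collect the hourly counts per venue (only slots with at least one message).
    per_venue = defaultdict(list)
    for (venue, _day, _hour), n in counts.items():
        per_venue[venue].append(n)
    return sorted(
        key for key, n in counts.items()
        if n > factor * median(per_venue[key[0]])
    )


def _toy_slot(venue, hour, n):
    """Build n invented messages in one hour slot (placeholder date and IDs)."""
    return [(datetime(2000, 1, 1, hour, i), f"g{i}", f"h{i}", venue) for i in range(n)]


toy = (
    _toy_slot("Kiddie Land", 10, 2) + _toy_slot("Kiddie Land", 11, 2)
    + _toy_slot("Kiddie Land", 12, 2) + _toy_slot("Kiddie Land", 17, 10)
    + [(datetime(2000, 1, 1, 17, 30), "server", "g1", "Kiddie Land")]
)
print(hourly_spikes(toy, excluded_ids={"server", "helpdesk"}))
```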

From Part 3 - Diving Deep: Who are they, again?

Question 4: Whom Do They Communicate With?

Answer #4A Communication between 2330-2335 on Saturday

The selected network graph is shown below:

Betweenness - Network graph at 2331-2335, Sat

For more details, please go to: Behind the Scene - #Act02 - Fun Fact#02

  1. From the graph, we can see 2 main coordinators among those who communicate during this time window.
  2. It might be a tour group that stays late.
  3. There is also communication to "external". (We shall not make any wild guesses yet.)
  4. We will take note of this, and find out whether anyone who does not belong to the group is still in the park.
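
For readers who want to see how "coordinators" fall out of a betweenness measure, here is a small sketch of betweenness centrality on an undirected, unweighted graph, following Brandes' algorithm. It is not the tool used to draw the graph above; the edge list and node names are invented, and treating messages as undirected links is an assumption.

```python
from collections import defaultdict, deque


def betweenness(edges):
    """Unnormalised betweenness centrality (undirected, unweighted graph)."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    score = dict.fromkeys(adj, 0.0)
    for source in adj:
        # Breadth-first search, counting shortest paths (sigma) to each node.
        dist = {source: 0}
        sigma = defaultdict(float)
        sigma[source] = 1.0
        preds = defaultdict(list)
        order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating path dependencies.
        delta = defaultdict(float)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Every unordered pair was counted from both ends, so halve the totals.
    return {v: s / 2 for v, s in score.items()}


# Toy graph, invented for illustration: two hubs, each with two followers.
toy_edges = [("c1", "a"), ("c1", "b"), ("c1", "c2"), ("c2", "d"), ("c2", "e")]
scores = betweenness(toy_edges)
print(sorted(scores, key=scores.get, reverse=True)[:2])
```

In the toy graph the two hubs sit on every path between the two halves, so they get the highest scores while the followers score zero, which is the pattern one would expect from 2 coordinators.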

Answer #4B High communication frequency at Coaster Alley at 11AM for all 3 days

Out-degree is chosen as the node-size measure because it shows the tour groups (or clusters of tourists) more clearly than the other measures.

In-Degree - Friday at 11AM to 11.01AM, Coaster Alley
Out-Degree - Friday at 11AM to 11.01AM, Coaster Alley
Out-Degree - Saturday at 11AM to 11.01AM, Coaster Alley
Out-Degree - Sunday at 11AM to 11.01AM, Coaster Alley

For more details, please go to: Behind the Scene - #Act02 - Fun Fact#03

  1. Based on the out-degree network diagrams, we can see that people are communicating in groups, with certain IDs (the bigger nodes) sending more messages to the others.
  2. These people are likely to be the leaders of their groups (or tour guides).
  3. We can also see small, separate clusters, which might be groups of friends visiting together.
  4. On Friday at 11AM, a higher number of messages are sent to "external" (high in-degree).
  5. Given this high frequency of message exchange, this is likely to be the showtime of Scott Jones, since it draws large crowds and groups.
  6. We will confirm with the movement data at a later stage whether these tour groups can be ruled out as suspects.
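
The in-degree and out-degree used to size the nodes can be computed directly from the message list. The sketch below assumes (sender, receiver) pairs for one time window and counts messages, so a guide who sends three messages to the same person gets an out-degree of 3. The names and toy data are invented, and the network tool used for the diagrams may count distinct neighbours instead.

```python
from collections import Counter


def degree_counts(edges):
    """Return (in_degree, out_degree) counters, counting one per message."""
    in_degree = Counter()
    out_degree = Counter()
    for sender, receiver in edges:
        out_degree[sender] += 1
        in_degree[receiver] += 1
    return in_degree, out_degree


# Toy window, invented for illustration: one guide messaging three guests.
toy_edges = [
    ("guide", "a"), ("guide", "b"), ("guide", "c"),
    ("a", "guide"), ("b", "external"),
]
in_deg, out_deg = degree_counts(toy_edges)
print(out_deg.most_common(1), in_deg["external"])
```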

Answer #4C High communication frequency at Coaster Alley at 4PM For Fri & Sat only

For communication at 4PM, "out-degree" is selected because it shows the groups more clearly than "betweenness", which shows only the distinct groups. This matters especially because there seem to be many small clusters of group communication at this hour.

Out-Degree - Friday at 4PM, Coaster Alley
Out-Degree - Saturday at 4PM, Coaster Alley

For more details, please go to: Behind the Scene - #Act02 - Fun Fact#04

  1. On both days, a big group is clearly present, and its members communicate among themselves.
  2. The timing of the peak seems to suggest the showtime (either start or end time) of the 6 shows.
  3. Without a similar peak on Sunday, it is likely that the "event" was "canceled" or was not organized that day, since there are supposed to be 6 shows.

Answer #4D High Communication at Kiddie Land, at 5PM

For communication at Kiddie Land at 5PM, out-degree is analyzed to identify the volume of communication sent out, mainly by the "leader" of each group. The network graph is shown below:

Out-Degree - Kiddie Land, Sat
  1. Based on the out-degree diagram, this communication is also made up of tour groups.
  2. However, this is the only time slot over the 3 days with such a spike, so some special event might have occurred.
  3. This is especially suspicious because this area is not near the performance and exhibition, and the spike likely comes after the Scott Jones event.
  4. We are unable to conclude whether this is related to the crime, but a sudden increase in the crowd at a place further away from the crime scene is suspicious.
  5. If the sudden increase in the crowd is related to the crime, we can identify non-suspects via the tour groups/communication groups.
  6. Thus, this is worth noting and analyzing further with the movement data.
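
If communication groups are later used to rule people out, one simple way to extract them is to treat every message as an undirected link and take the connected components. The sketch below is only one possible reading of "communication groups"; the function name and the toy IDs are assumptions.

```python
from collections import defaultdict


def communication_groups(edges):
    """Return groups of IDs linked by messages, largest group first."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen = set()
    groups = []
    for start in adj:
        if start in seen:
            continue
        # Depth-first walk collecting everyone reachable from start.
        stack = [start]
        seen.add(start)
        group = set()
        while stack:
            v = stack.pop()
            group.add(v)
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        groups.append(sorted(group))
    return sorted(groups, key=len, reverse=True)


# Toy messages, invented for illustration: one tour group and one pair.
toy_edges = [("guide", "a"), ("guide", "b"), ("x", "y")]
print(communication_groups(toy_edges))
```

A single message to a shared contact (for example the helpdesk) would merge otherwise unrelated groups, so such IDs should be filtered out before grouping.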

Detective Board

Detective Board from Act#2


Timeline from ACT#2

Detective Board
Watch on Github.
