ISSS608 2016-17 T1 Assign3 ZHANG Zhe
Introduction
Last year, DinoFun World held a series of events featuring Scott Jones, an internationally renowned football star. Scott was scheduled to appear in two stage shows on each of Friday, Saturday, and Sunday to talk about his life and career, and his memorabilia were shown in the park’s Pavilion (Area 32). However, the event was marred by crime.
Although the crimes were rapidly solved, the park wants to know how to prepare better in the future. The task is to understand how people move and communicate in the park, how these patterns change and evolve over time, and what can be understood about the motivations for changing patterns.
IDs that Stand Out for Their Large Volumes of Communication
The 2 IDs that stand out for their large volumes of communication are 1278894 and 839736.
ID 1278894 sent and received messages at regular intervals: the dots on the blue line are relatively sparse and evenly spaced. It communicated with a large number of visitors, but a minority of visitors did not respond. Given this pattern, it may be a source of advertisements, service tips, or similar broadcasts.
Hypothesis: According to the official website of DinoFun World, ID 1278894 should be the Cindysaurus Trivia Game, which is available from the app.
ID 839736 communicated irregularly but continuously: the dots on the orange line are so close together that they form a line. In addition, every receiver responded to this ID.
Hypothesis: ID 839736 should be a status-check sensor. Security personnel check every visitor's movement status through the responses.
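The ranking behind this section can be approximated by counting, for every ID, the messages it sent plus the messages it received. The sketch below is only an illustration: the row layout (timestamp, sender, receiver, location), the function name top_communicators, and the sample rows are assumptions, not the actual contents of comm_data.csv.

```python
from collections import Counter

# Hypothetical sample rows; the layout (timestamp, sender, receiver, location)
# is an assumption about comm_data.csv, and the visitor IDs 101-106 are made up.
SAMPLE_ROWS = [
    ("2014-06-06 08:03:00", "1278894", "101", "Coaster Alley"),
    ("2014-06-06 08:04:00", "101", "1278894", "Coaster Alley"),
    ("2014-06-06 08:08:00", "1278894", "102", "Wet Land"),
    ("2014-06-06 08:13:00", "1278894", "103", "Wet Land"),
    ("2014-06-06 08:14:00", "839736", "106", "Coaster Alley"),
    ("2014-06-06 08:14:30", "104", "839736", "Coaster Alley"),
    ("2014-06-06 08:15:00", "839736", "105", "Wet Land"),
    ("2014-06-06 08:20:00", "102", "103", "Wet Land"),
    ("2014-06-06 08:21:00", "104", "external", "Wet Land"),
]


def top_communicators(rows, n=2, exclude=("external",)):
    """Return the n IDs with the most messages sent plus received."""
    volume = Counter()
    for _timestamp, sender, receiver, _location in rows:
        for party in (sender, receiver):
            # Skip pseudo-recipients such as "external" that are not park IDs.
            if party not in exclude:
                volume[party] += 1
    return volume.most_common(n)


print(top_communicators(SAMPLE_ROWS))
```

Counting both directions matters here because the two IDs are distinguished partly by whether receivers answered them.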
Who Are the Criminals?
Logic Chain:
Data Filter
- Time of the Vandalism
Our main task is to identify the offenders. The first step is to filter the data to narrow the range of suspects, and to do that we need to identify the time of the incident.
The number of external communications increased sharply at 11:44 AM on Sunday, which is abnormal compared with the previous 2 days.
We know the crime scene was Creighton Pavilion (Attraction 32, coordinates (32,33)). From the timeline of check-ins at the Pavilion, we can see that it was open from 8:10 AM to 9:30 AM and closed from 9:31 AM to 11:29 AM on Sunday.
Hypothesis: The offender(s) checked in at the Pavilion, hid inside, and then vandalized it while it was closed. When the Pavilion reopened, they escaped while many visitors were entering. Meanwhile, people discovered the damage and the number of messages sent reached its peak.
The crime therefore occurred between 9:30 AM and 11:30 AM on Sunday. Based on the hypothesis, visitors who checked in at the Pavilion between 8:10 AM and 9:30 AM on Sunday were extracted.
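The extraction step can be sketched as a filter on check-in records. This is a minimal illustration under stated assumptions: the record layout (timestamp, visitor ID, record type, x, y), the "check-in" type label, the calendar date used for Sunday, and the sample visitors A to F are all hypothetical. Only the Pavilion coordinates and the 8:10 AM to 9:30 AM window come from the text.

```python
from datetime import datetime

PAVILION = (32, 33)  # coordinates of Creighton Pavilion given in the text
# The calendar date for "Sunday" is an assumption made for this sketch.
OPEN_START = datetime(2014, 6, 8, 8, 10)
OPEN_END = datetime(2014, 6, 8, 9, 30)

# Hypothetical movement records: (timestamp, visitor, record type, x, y).
SAMPLE_RECORDS = [
    ("2014-06-08 08:15:00", "A", "check-in", 32, 33),
    ("2014-06-08 09:45:00", "B", "check-in", 32, 33),  # after the window
    ("2014-06-08 08:20:00", "C", "movement", 32, 33),  # passed by, no check-in
    ("2014-06-08 09:00:00", "D", "check-in", 10, 10),  # different attraction
    ("2014-06-07 08:30:00", "E", "check-in", 32, 33),  # previous day
    ("2014-06-08 09:30:00", "F", "check-in", 32, 33),  # window end, inclusive
]


def pavilion_checkins(records, start=OPEN_START, end=OPEN_END):
    """Return the IDs that checked in at the Pavilion within [start, end]."""
    ids = set()
    for timestamp, visitor, kind, x, y in records:
        when = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
        if kind == "check-in" and (x, y) == PAVILION and start <= when <= end:
            ids.add(visitor)
    return ids


print(sorted(pavilion_checkins(SAMPLE_RECORDS)))
```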
- Remove Children
There is a very small probability that the offender is a child. Likewise, an offender would neither check in at the Kiddie Rides (Attractions 9-19) nor commit a crime with children in tow. We therefore exclude this group.
Based on the hypothesis, remove visitors who checked in at the Kiddie Rides.
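This exclusion amounts to a set difference between the candidates and everyone who checked in at a Kiddie Ride. The sketch below assumes the same hypothetical record layout (timestamp, visitor ID, record type, x, y). The ride coordinates are placeholders, because the text does not list the coordinates of Attractions 9-19.

```python
# Placeholder coordinates standing in for the Kiddie Rides (Attractions 9-19);
# the real coordinates are not given in the text.
KIDDIE_COORDS = {(1, 1), (2, 2)}

# Hypothetical movement records: (timestamp, visitor, record type, x, y).
SAMPLE_RECORDS = [
    ("2014-06-08 10:00:00", "F", "check-in", 1, 1),   # rode a Kiddie Ride
    ("2014-06-08 10:05:00", "G", "movement", 1, 1),   # only walked past
    ("2014-06-08 10:10:00", "A", "check-in", 50, 50), # unrelated attraction
]


def remove_kiddie_visitors(candidates, records, kiddie_coords):
    """Drop every candidate who checked in at any Kiddie Ride coordinate."""
    kiddie_visitors = {
        visitor
        for _timestamp, visitor, kind, x, y in records
        if kind == "check-in" and (x, y) in kiddie_coords
    }
    return set(candidates) - kiddie_visitors


print(sorted(remove_kiddie_visitors({"A", "F", "G"}, SAMPLE_RECORDS, KIDDIE_COORDS)))
```

Matching on check-ins rather than on any movement record keeps visitors who merely walked past a ride.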
- Remove Visitors Who Have an Alibi
Based on our hypothesis, the offenders were vandalizing the Pavilion from 9:31 AM to 11:29 AM. While visitors are indoors, the sensors cannot get responses, so there is no movement data for them during that time.
Eliminate visitors who had movement records during the time of the crime. Only 4 IDs remain (1983765, 461004, 1502920, 416790).
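The alibi check can be sketched as follows: any candidate with a record inside the Pavilion's closed window (9:31 AM to 11:29 AM on Sunday, per the check-in timeline) was visibly somewhere else and is dropped. The record layout, the calendar date, and the visitors A, G, and H are assumptions for illustration.

```python
from datetime import datetime

# Closed window of the Pavilion on Sunday; the calendar date is an assumption.
CLOSED_START = datetime(2014, 6, 8, 9, 31)
CLOSED_END = datetime(2014, 6, 8, 11, 29)

# Hypothetical movement records: (timestamp, visitor, record type, x, y).
SAMPLE_RECORDS = [
    ("2014-06-08 09:20:00", "A", "movement", 32, 34),  # before the window
    ("2014-06-08 11:35:00", "A", "movement", 33, 34),  # after the window
    ("2014-06-08 10:05:00", "G", "movement", 60, 40),  # seen elsewhere: alibi
]


def without_alibi(candidates, records, start=CLOSED_START, end=CLOSED_END):
    """Keep only candidates with no record of any kind inside [start, end]."""
    seen_during_crime = set()
    for timestamp, visitor, *_rest in records:
        when = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
        if start <= when <= end:
            seen_during_crime.add(visitor)
    return set(candidates) - seen_during_crime


print(sorted(without_alibi({"A", "G", "H"}, SAMPLE_RECORDS)))
```

A visitor with no records at all in the window (H in the sample) stays on the list, which is the behavior the hypothesis relies on.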
Select Corresponding Comm_data
Extract the unique IDs that remain after filtering. These IDs are our suspects. Map these IDs onto comm_data to obtain their communication information.
It is worth noting that the 2 special IDs (1278894 and 839736) should be removed.
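A minimal sketch of this selection, assuming communication rows of the form (timestamp, sender, receiver, location): keep a row when either party is a suspect and neither party is one of the 2 special IDs. The sample rows and the visitor ID 555 are hypothetical, and the suspect IDs are used only as labels.

```python
SPECIAL_IDS = {"1278894", "839736"}  # the 2 high-volume IDs named in the text

# Hypothetical rows; they do not reproduce real messages from comm_data.csv.
SAMPLE_ROWS = [
    ("2014-06-08 09:00:00", "461004", "1502920", "Wet Land"),
    ("2014-06-08 09:01:00", "1278894", "461004", "Wet Land"),   # special sender
    ("2014-06-08 09:02:00", "555", "external", "Coaster Alley"),  # no suspect
    ("2014-06-08 09:03:00", "416790", "external", "Wet Land"),
    ("2014-06-08 09:04:00", "461004", "839736", "Wet Land"),    # special receiver
]


def suspect_communications(rows, suspects, special_ids=SPECIAL_IDS):
    """Return rows that involve a suspect and neither of the special IDs."""
    selected = []
    for row in rows:
        _timestamp, sender, receiver, _location = row
        involves_suspect = sender in suspects or receiver in suspects
        involves_special = sender in special_ids or receiver in special_ids
        if involves_suspect and not involves_special:
            selected.append(row)
    return selected


suspects = {"1983765", "461004", "1502920", "416790"}
for row in suspect_communications(SAMPLE_ROWS, suspects):
    print(row)
```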
Identify Criminals
Import the new communication data into Gephi and set it as edges.
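One way to prepare such an edge table is to aggregate sender and receiver pairs into weighted edges and write them as CSV with Source, Target, and Weight columns, which Gephi's spreadsheet import recognizes for edge tables. The function name edges_csv and the sample pairs S1 and S2 are hypothetical.

```python
import csv
import io
from collections import Counter

# Hypothetical (sender, receiver) pairs taken from the filtered communications.
SAMPLE_PAIRS = [
    ("S1", "S2"),
    ("S1", "S2"),
    ("S2", "external"),
]


def edges_csv(pairs):
    """Aggregate directed pairs into a weighted edge list in CSV form."""
    weights = Counter(pairs)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    # Sort so the output is stable regardless of input order.
    for (source, target), weight in sorted(weights.items()):
        writer.writerow([source, target, weight])
    return buffer.getvalue()


print(edges_csv(SAMPLE_PAIRS))
```

Aggregating repeated pairs into a weight keeps the graph small and lets edge thickness reflect how often two IDs communicated.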
We noticed that ID 1983765 had no communications. The others formed a group with 4 other IDs and external. We considered them as 2 parts.
- ID 1983765: We observed his movement route. On Sunday, he entered Scholtz Expression at 9:13 AM and came out at 11:33 AM, which explains why he disappeared during the crime. We can rule him out as a suspect.
- IDs 461004, 1502920 and 416790: We observed the movement routes of the 7-person group. The main details are shown below.
Hypothesis: The criminal gang includes 3 principal offenders (IDs 461004, 1502920 and 416790) and 4 accomplices (IDs 1350546, 1123214, 1187909 and 1350546). The 3 principal offenders entered while the Pavilion was open and hid inside, then vandalized it while it was closed, while 2 accomplices (IDs 1350546 and 1123214) stood watch in the west zone and the others (IDs 1187909 and 1350546) were responsible for another zone.
Communications Patterns
- Public IDs that communicated with everyone. They sent and responded at fixed intervals or continuously, with no specific times or locations.
- Small groups of several visitors who communicated with each other within the group. The closer they were to the Pavilion, the more they communicated.
- There were 2 peaks on Friday and Saturday, at the same times and location. From this, we can conclude that the 2 daily stage shows were at 11:00 AM and 4:00 PM, respectively, and that the performance venue was located in Coaster Alley.
On Sunday, however, the number of communications increased sharply at 11:39 AM, after the 11:00 AM peak, and the location was Wet Land, where the exit of Creighton Pavilion is located. This shows that people had found the Pavilion vandalized by 11:39 AM on Sunday.
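The peaks described above can be located by counting messages per time bin and location. The sketch below is one possible approach and not necessarily how the dashboards were built: the row layout (timestamp, sender, receiver, location), the 5-minute bin width, the calendar date, and the sample rows are assumptions.

```python
from collections import Counter
from datetime import datetime

# Hypothetical rows: (timestamp, sender, receiver, location).
SAMPLE_ROWS = [
    ("2014-06-08 11:00:30", "101", "102", "Coaster Alley"),
    ("2014-06-08 11:02:00", "102", "101", "Coaster Alley"),
    ("2014-06-08 11:36:05", "103", "external", "Wet Land"),
    ("2014-06-08 11:39:10", "104", "external", "Wet Land"),
    ("2014-06-08 11:39:40", "105", "external", "Wet Land"),
    ("2014-06-08 11:50:00", "106", "107", "Coaster Alley"),
]


def busiest_bins(rows, minutes=5, n=1):
    """Return the n (bin start, location) pairs with the most messages."""
    counts = Counter()
    for timestamp, _sender, _receiver, location in rows:
        when = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
        # Round the timestamp down to the start of its bin.
        bin_start = when.replace(
            minute=when.minute - when.minute % minutes, second=0, microsecond=0
        )
        counts[(bin_start, location)] += 1
    return counts.most_common(n)


for (bin_start, location), count in busiest_bins(SAMPLE_ROWS, n=2):
    print(bin_start, location, count)
```

Comparing the busiest bins across the three days is what separates the recurring show peaks from the additional Sunday spike.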
Conclusion
The criminal gang includes 3 principal offenders (IDs 461004, 1502920 and 416790) and 4 accomplices (IDs 1350546, 1123214, 1187909 and 1350546). The 3 principal offenders entered while the Pavilion was open and hid inside, then vandalized it while it was closed, while 2 accomplices (IDs 1350546 and 1123214) stood watch in the west zone and the others (IDs 1187909 and 1350546) were responsible for another zone.
Interactive Visualization
Due to the limitations of Tableau, the dashboards built on Park_Movement.csv could not be published. The link below therefore leads to the dashboard built on comm_data.csv only. https://public.tableau.com/profile/publish/Book1_14510/Story1#!/publish-confirm