ISSS608 2016-17 T1 Assign3 CHIA Yong Jian When was the vandalism discovered

From Visual Analytics and Applications

CHIA YONG JIAN Assign3 CSI Logo.png Episode VA3: "A Crime-Filled Weekend at DinoFun World"

[Home]

[Data Review and Prep]

[Large Volume Comms IDs]

[Ten Comms Patterns]

[Discovery of Vandalism]


CHIA YONG JIAN Assign3 PuzzleLogo.png

"When was the vandalism discovered?"

Based on the observations made in earlier sections regarding communications for ID 839736 and "External" on Sunday 8 June 2014, we have the following facts:

  • At 11:57 AM, park visitors began to realise that the Pavilion had been vandalised and started communicating with external parties or social media, causing a message spike.
  • At 12:03 PM, the park information and service line (ID 839736) began receiving messages from park visitors about the vandalism, causing a second message spike.


Hence, it is hypothesized that the vandalism was discovered at around 11:57 AM on Sunday. We shall now look into the movement data to see whether any clues point to possible suspects.

"Who could possibly have done the crime then?"

Suspicious Park Visitor ID 1983765

A database technology called KDB (http://code.kx.com/wiki/A_Brief_Introduction_to_kdb%2B), which can handle large volumes of data, was used to analyse the data quickly through queries. The large movement data CSV files were loaded into a local KDB process. Let's first assume that the movement data generated is clean. This means that, at any one time, an individual can be at only one location. A simple KDB query (with syntax similar to SQL) was run to group records by timestamp, ID and type (renamed to movementtype in the KDB query because "type" is a reserved word), so that any group holding more than one record would contradict this assumption. A sample query is shown below:

CHIA YONG JIAN Assign3 KDBquery.png
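
For readers without a KDB process at hand, the same grouping idea can be sketched in plain Python. This is only an illustration and not the query that was actually run: the row layout (timestamp, id, movementtype, x, y), the function name duplicated_keys and the toy rows with visitor IDs 1 and 2 are all assumptions made for the example.

```python
from collections import defaultdict

# Toy rows in an assumed layout: (timestamp, id, movementtype, x, y).
# Visitor IDs 1 and 2 and all values are hypothetical, not park data.
rows = [
    ("2014-06-07 20:00:01", 1, "movement", 10, 20),
    ("2014-06-07 20:00:01", 1, "movement", 11, 20),  # same key, second position
    ("2014-06-07 20:00:02", 1, "movement", 11, 21),
    ("2014-06-07 20:00:01", 2, "movement", 40, 50),
]

def duplicated_keys(rows):
    """Group rows by (timestamp, id, movementtype) and keep groups with more than one row."""
    groups = defaultdict(list)
    for timestamp, visitor_id, movementtype, x, y in rows:
        groups[(timestamp, visitor_id, movementtype)].append((x, y))
    return {key: positions for key, positions in groups.items() if len(positions) > 1}

dups = duplicated_keys(rows)
for key, positions in dups.items():
    print(key, positions)
```

If the clean-data assumption holds, the returned dictionary is empty; every key that remains marks an ID recorded more than once for the same timestamp and type.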


Strangely, one particular ID, 1983765, had duplicated movement records from 8:18 PM until 8:34 PM on Saturday (7 June 2014). No other IDs were affected. Either the system happened to have problems with this ID, or something suspicious was going on, with someone attempting to tamper with the system (a sign of a criminal wanting to cover his tracks). A sample of these timestamps is as follows:

CHIA YONG JIAN Assign3 Task3 KDB 1983765.png


Adding to the suspicion, when a data filter was applied in JMP to review the communications data for 1983765 (and to generate a network graph), no results showed up at all:

CHIA YONG JIAN Assign3 Task3 missing comms 1983765.png
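
The check was done with a JMP data filter, but the logic is simple enough to sketch in Python as well. The field names "from" and "to", the function name messages_involving and the toy rows below are assumptions for illustration only; they are not the actual communications records.

```python
# Toy communication rows; field names and IDs are hypothetical.
comms = [
    {"timestamp": "2014-06-08 10:00:00", "from": 1, "to": 2},
    {"timestamp": "2014-06-08 10:00:05", "from": 2, "to": "external"},
    {"timestamp": "2014-06-08 10:00:09", "from": 3, "to": 1},
]

def messages_involving(rows, visitor_id):
    """Return every message in which the ID appears as sender or recipient."""
    return [row for row in rows if visitor_id in (row["from"], row["to"])]

print(len(messages_involving(comms, 1)))  # ID 1 sent one message and received one
print(len(messages_involving(comms, 4)))  # ID 4 never appears in the toy rows
```

An ID that moves around the park yet returns an empty list here has neither sent nor received a single message, which is the situation observed for 1983765.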


Review of movement and check-in data for ID 1983765

Using the available movement data, the route taken by ID 1983765 on Sunday, 8 June 2014 was inspected with Tableau and JMP. The following observations were made:

  • ID 1983765 first showed up at the park at 8:15 AM, with a check-in at the Raptor Restroom (49).
  • The person then moved through the park until reaching the Creighton Pavilion (32) at 8:32 AM, again as a check-in.
  • At 9:08 AM, the person left the Creighton Pavilion and boarded the Scholtz Express.
  • The next time the person appeared in the movement data was at 11:33 AM. The person then took the shortest path and left via the Raptor Restroom (49), where they had first entered the park.
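
The silence between 9:08 AM and 11:33 AM can also be found programmatically by looking for the longest interval between consecutive records of one ID. The sketch below is a minimal illustration under stated assumptions: it uses only the four times quoted in the observations above, at minute resolution, whereas the real movement data holds many more records per ID; the function name largest_gap is hypothetical.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"

# The four times quoted in the observations (minute resolution only).
sightings = [
    "2014-06-08 08:15",
    "2014-06-08 08:32",
    "2014-06-08 09:08",
    "2014-06-08 11:33",
]

def largest_gap(timestamps):
    """Return (gap, start, end) for the longest interval between consecutive records."""
    times = sorted(datetime.strptime(t, FMT) for t in timestamps)
    gaps = [(later - earlier, earlier, later) for earlier, later in zip(times, times[1:])]
    return max(gaps) if gaps else None

gap, start, end = largest_gap(sightings)
print(gap, start.time(), end.time())  # 2:25:00 09:08:00 11:33:00
```

Running the same function over every ID would make it possible to rank visitors by how long they went untracked, which could be one way to look for other suspects.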

A GIF file of the person's movements can be seen below:

CHIA YONG JIAN Assign3 Task3 GIF 1983765.gif


However, the Creighton Pavilion appears to have been closed that day between 9:30 AM and 11:30 AM, as there were no check-ins during that period. See the sample of the raw data below (note that the Pavilion's X and Y coordinates are inferred from other check-ins made at the Pavilion):

CHIA YONG JIAN Assign3 Pavillon Check-In Sun.png
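
A check of this kind can be sketched as a filter on record type, coordinates and time window. Everything named here is an assumption for illustration: the row layout, the coordinates in PAVILION_XY (a placeholder, not the inferred Pavilion coordinates), the function name checkins_in_window and the toy rows.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"
PAVILION_XY = (0, 0)  # placeholder coordinates, not the real Pavilion location

# Toy rows in an assumed layout: (timestamp, id, movementtype, x, y).
rows = [
    ("2014-06-08 09:20:00", 1, "check-in", 0, 0),
    ("2014-06-08 10:00:00", 2, "movement", 5, 5),
    ("2014-06-08 10:15:00", 3, "check-in", 7, 7),  # check-in at another location
    ("2014-06-08 11:40:00", 4, "check-in", 0, 0),
]

def checkins_in_window(rows, xy, start, end):
    """Return check-in rows at location xy whose timestamp lies within [start, end]."""
    lo, hi = datetime.strptime(start, FMT), datetime.strptime(end, FMT)
    return [
        row for row in rows
        if row[2] == "check-in"
        and (row[3], row[4]) == xy
        and lo <= datetime.strptime(row[0], FMT) <= hi
    ]

quiet = checkins_in_window(rows, PAVILION_XY, "2014-06-08 09:30:00", "2014-06-08 11:30:00")
print(len(quiet))  # 0 for the toy rows
```

An empty result for a window in which the rest of the park is busy is what suggests that the location was closed during that time.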


Logically, anyone wanting to vandalise the Pavilion without being seen would choose a time when nobody else was around. One candidate is Saturday night, when the duplicated movements were recorded. However, if the vandalism had happened then, the spikes in communications should have appeared soon after visitors started checking in from 8 AM onwards on Sunday, yet they only occurred at around 11:57 AM. All factors considered, the time slot of 9:30 AM to 11:30 AM, after the last batch of people had left the Pavilion, appears to be the most likely period for the vandalism. The movement data nevertheless shows the person leaving the Pavilion at 9:08 AM. That only clears the person if we assume that the person is always together with the device, and this assumption can be challenged: a decoy could have been planned precisely to avoid attention and to frustrate investigations like this one.

Final thoughts

Given the suspicious duplicated movements on Saturday night and the movement pattern on Sunday, ID 1983765 remains a likely suspect. It would be worthwhile for the police to call in Park Visitor ID 1983765 to assist with the investigation. Further analysis could be performed to find out whether there were other suspects or accomplices, such as the IDs noted in the section "Ten Communications Patterns". This is just one angle of the story - for alternative detective analyses, the following could be read: