ISSS608 2016-17 T1 Assign3 WEI Jingxian
Revision as of 10:57, 28 October 2016
Introduction
DinoFun World is a tropical amusement park that hosts thousands of visitors each day. Besides the Entry Corridor, the park has four districts: Coaster Alley, Kiddie Land, Tundra Land and Wet Land. Facilities with different levels of excitement are available in these four areas.
The famous soccer star Scott Jones was scheduled to attend a set of events from 6 June to 8 June. Unfortunately, “Scott Jones Weekend” was marred by vandalism, and mayhem disrupted the event in the park’s Creighton Pavilion. Although the crimes were rapidly solved, park officials and law enforcement figures would like to know what happened during the weekend, and would like to explore the communication and movement data to identify notable patterns that may be related to the crime.
Data Information
The dataset consists of the communication and movement data from the DinoFun World app. Visitors can use the app as an electronic ticket, and sensors around the park record their movements while they are using the app.
IDs with high-volume communication
Data Preparation
First, we combine the communication data for Friday, Saturday and Sunday. Since we want to find the IDs with high-volume communications, we use JMP to summarise the data by unique ID and calculate the communication count for each ID. We then keep only the top 100 IDs by communication count.
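The preparation above was done in JMP. For readers without JMP, the same counting step can be sketched in Python. This is only an illustration under stated assumptions: each record is reduced to a hypothetical `(from_id, to_id)` pair, a communication is counted for both the sender and the receiver, and the sample data is made up.

```python
from collections import Counter

def top_communicators(records, n=100):
    """Count communications per ID (sent plus received) and return the top n.

    `records` is an iterable of (from_id, to_id) pairs; this layout is an
    assumption for illustration, not the actual file format.
    """
    counts = Counter()
    for from_id, to_id in records:
        counts[from_id] += 1  # the sender took part in this communication
        counts[to_id] += 1    # so did the receiver
    return counts.most_common(n)

# Tiny made-up sample, for illustration only.
sample = [("A", "B"), ("A", "C"), ("B", "A"), ("D", "A")]
top = top_communicators(sample, n=2)
print(top)
```

If only sent messages should be counted, the line that increments `counts[to_id]` can simply be dropped.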
Findings
Two IDs clearly stand out: 1278894 has almost 400k total communications and 839736 has around 120k, while the other IDs have at most about 70k.
After identifying these two IDs, we explore their communication patterns. The figure below shows that 1278894 and 839736 only call or send texts from the Entry Corridor (red area in the chart), whereas the messages sent to them come from all around the park.
From the communication timeline of 1278894, we can easily see that every day it starts sending messages or calling at 12pm and stops at 8:55pm. The sending times are the same on all three days. In addition, the periods when 1278894 paused sending were the periods when it received large volumes of communications.
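A comparison like this one (where an ID sends from versus where its incoming messages originate) can be sketched as follows. The record layout `(from_id, to_id, location)` is hypothetical, `location` is assumed to be the area the message was sent from, and the sample rows are invented.

```python
from collections import Counter

def location_profile(records, target_id):
    """Return two Counters: locations the target sent from, and
    locations that messages addressed to the target were sent from."""
    sent = Counter()
    received = Counter()
    for from_id, to_id, location in records:
        if from_id == target_id:
            sent[location] += 1
        if to_id == target_id:
            received[location] += 1
    return sent, received

# Made-up sample rows: (from_id, to_id, location of the sender).
sample = [
    ("X", "A", "Entry Corridor"),
    ("X", "B", "Entry Corridor"),
    ("A", "X", "Wet Land"),
    ("B", "X", "Kiddie Land"),
]
sent, received = location_profile(sample, "X")
print(dict(sent), dict(received))
```

An ID whose `sent` counter has a single location while its `received` counter is spread across the park would match the pattern described above.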
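The daily start and stop times can also be checked programmatically rather than read off a timeline chart. The sketch below assumes hypothetical `(timestamp, from_id)` records with timestamps formatted as `YYYY-MM-DD HH:MM:SS`; the sample values, including the year, are invented to mirror the pattern described above.

```python
from datetime import datetime

def daily_send_window(records, target_id):
    """Return {day: (first_send_time, last_send_time)} for one sender."""
    windows = {}
    for timestamp, from_id in records:
        if from_id != target_id:
            continue
        ts = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
        day, clock = ts.date(), ts.time()
        first, last = windows.get(day, (clock, clock))
        windows[day] = (min(first, clock), max(last, clock))
    # Convert to plain strings for easy display and comparison.
    return {
        day.isoformat(): (first.isoformat(), last.isoformat())
        for day, (first, last) in windows.items()
    }

# Made-up sample records, for illustration only.
sample = [
    ("2014-06-06 12:00:00", "X"),
    ("2014-06-06 15:30:10", "X"),
    ("2014-06-06 20:55:00", "X"),
    ("2014-06-06 09:00:00", "Y"),
    ("2014-06-07 20:55:00", "X"),
    ("2014-06-07 12:00:00", "X"),
]
windows = daily_send_window(sample, "X")
print(windows)
```

Identical windows on every day, as in this sample, would suggest automated or scheduled sending rather than an ordinary visitor.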