ISSS608 2016-17 T1 Assign3 Ye Jiatao

From Visual Analytics and Applications
Revision as of 15:43, 28 October 2016 by Jiatao.ye.2015

Overview

DinoFun World is a typical modest-sized amusement park, sitting on about 215 hectares and hosting thousands of visitors each day. It has a small-town feel, but it is well known for its exciting rides and events. One event last year was a weekend tribute to Scott Jones, internationally renowned football (“soccer,” in US terminology) star. Scott Jones is from a town near DinoFun World. He was a classic hometown hero, with thousands of fans who cheered his success as if he were a beloved family member. To celebrate his years of stardom in international play, DinoFun World declared “Scott Jones Weekend”, where Scott was scheduled to appear in two stage shows each on Friday, Saturday, and Sunday to talk about his life and career. In addition, a show of memorabilia related to his illustrious career would be displayed in the park’s Pavilion. However, the event did not go as planned. Scott’s weekend was marred by crime and mayhem perpetrated by a poor, misguided and disgruntled figure from Scott’s past. While the crimes were rapidly solved, park officials and law enforcement figures are interested in understanding just what happened during that weekend to better prepare themselves for future events. They are interested in understanding how people move and communicate in the park, as well as how patterns change and evolve over time, and what can be understood about motivations for changing patterns.

The Task

In this case, we mainly need to answer the following questions using the in-app communication and visitor movement data:

  1. Identify those IDs that stand out for their large volumes of communication. For each of these IDs
    1. Characterize the communication patterns you see.
    2. Based on these patterns, what do you hypothesize about these IDs? Note: Please limit your response to no more than 4 images and 300 words.
  2. Describe up to 10 communications patterns in the data. Characterize who is communicating, with whom, when and where. If you have more than 10 patterns to report, please prioritize those patterns that are most likely to relate to the crime. Note: Please limit your response to no more than 10 images and 1000 words.
  3. From this data, can you hypothesize when the vandalism was discovered? Describe your rationale. Note: Please limit your response to no more than 3 images and 300 words.


Data Set

  1. DinoFunWorld_CommData.zip consists of in-app communication data over the three days of the Scott Jones celebration.
  2. DinoFunWorld_MoveData.zip consists of three days of park movement data. The park movement datasets are in CSV format.
  3. DinoFunWorld_LayoutMap.zip consists of a jpg file.
  4. DinoFunWorld_Website.zip consists of webpages of DinoFun World Park.

Approaches

Identify those IDs that stand out for their large volumes of communication. For each of these IDs

  1. Characterize the communication patterns you see.
  2. Based on these patterns, what do you hypothesize about these IDs?
Y 01.jpg


From the network figure above, 3 IDs stand out for their large volumes of communication: 1278894, 839736 and external. Because external represents messages sent to recipients outside the park, in this step we focus only on IDs 1278894 and 839736.
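The ranking read off the network figure could also be checked numerically. The following is a minimal sketch, not the method used for the figure: it assumes each communication record is a tuple of (timestamp, sender, receiver, location), and the function name, sample records, dates and the "Other Area" location are hypothetical placeholders rather than values from the real data set.

```python
from collections import Counter

def rank_ids_by_volume(messages, top_n=3):
    """Return the top_n IDs by total messages sent plus received."""
    volume = Counter()
    for _timestamp, sender, receiver, _location in messages:
        volume[sender] += 1    # one message sent
        volume[receiver] += 1  # one message received
    return volume.most_common(top_n)

# Hypothetical toy records (assumed layout: timestamp, sender, receiver, location).
sample = [
    ("2016-01-01 12:00:00", "1278894", "100", "Entry Corridor"),
    ("2016-01-01 12:00:00", "1278894", "101", "Entry Corridor"),
    ("2016-01-01 12:00:00", "1278894", "102", "Entry Corridor"),
    ("2016-01-01 12:00:00", "1278894", "103", "Entry Corridor"),
    ("2016-01-01 12:00:00", "1278894", "104", "Entry Corridor"),
    ("2016-01-01 12:01:00", "100", "839736", "Other Area"),
    ("2016-01-01 12:02:00", "839736", "100", "Entry Corridor"),
    ("2016-01-01 12:03:00", "101", "839736", "Other Area"),
    ("2016-01-01 12:04:00", "839736", "101", "Entry Corridor"),
]

top = rank_ids_by_volume(sample, top_n=2)
print(top)
```

On the real data, the same count would be run over all three days of records, and the IDs at the head of the ranking would be the candidates to inspect further.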

  • ID 1278894
  1. There is no connection among ID 1278894, 839736 and external; that is, these 3 accounts have not sent messages to or received messages from each other.
  2. ID 1278894 shows a clear cyclical messaging pattern over these 3 days. Starting at 12:00 each day, this ID sends out a large number of messages every 5 minutes for an hour, pauses for an hour, and then repeats the 5-minute messaging during the following hour.
  3. This cycle repeats 5 times a day, running from 12:00 to 20:00 in the evening.
  4. The communication volume increases steadily from Friday to Sunday, which suggests that more and more people visited the park over this period.
  5. This ID only has records of sending messages from the Entry Corridor.

At this point, we can hypothesize that ID 1278894 is an account for a park employee who is always stationed at the entrance. Because the number of messages sent out by this account remains relatively steady within a day, we can assume that this ID sends messages to all visitors in the park about attraction opening information or the Cindysaurus trivia game mentioned on the park's official website.
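The cyclical pattern described for ID 1278894 could be verified by binning its outgoing messages into 5-minute intervals and listing the hours in which it was active. This is only a sketch under assumptions: records are taken to be (timestamp, sender, receiver, location) tuples with timestamps in "YYYY-MM-DD HH:MM:SS" form, and the function names, sender "A" and sample rows are hypothetical.

```python
from collections import Counter
from datetime import datetime

def sends_per_bin(messages, sender, bin_minutes=5):
    """Count messages sent by `sender` in each bin_minutes-wide time bin."""
    counts = Counter()
    for timestamp, msg_sender, _receiver, _location in messages:
        if msg_sender != sender:
            continue
        # Assumed timestamp format; adjust if the real files differ.
        t = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
        # Floor the timestamp to the start of its bin.
        floored = t.replace(minute=t.minute - t.minute % bin_minutes, second=0)
        counts[floored] += 1
    return counts

def active_hours(bin_counts):
    """Return the sorted (date, hour) pairs containing at least one non-empty bin."""
    return sorted({(t.date().isoformat(), t.hour) for t in bin_counts})

# Hypothetical toy records: sender "A" is active in hours 12 and 14 only.
sample = [
    ("2016-01-01 12:00:10", "A", "1", "Entry Corridor"),
    ("2016-01-01 12:00:40", "A", "2", "Entry Corridor"),
    ("2016-01-01 12:05:02", "A", "1", "Entry Corridor"),
    ("2016-01-01 13:10:00", "B", "A", "Entry Corridor"),
    ("2016-01-01 14:00:05", "A", "3", "Entry Corridor"),
    ("2016-01-01 14:07:30", "A", "3", "Entry Corridor"),
]

bins = sends_per_bin(sample, "A")
hours = active_hours(bins)
print(hours)
```

If the hypothesis holds, the active hours on the real data would alternate with idle hours, and the per-bin counts within an active hour would be roughly equal.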

  • ID 839736
  1. The messaging pattern of ID 839736 is not as clear as that of ID 1278894. It sent and received up to 25 messages per minute from 8:00 am to 11:30 pm each day, except on Sunday, when an abnormal communication peak appeared, reaching around 1400 messages in 1 minute for receiving at 12:00 and for sending at 12:03.
  2. The sending and receiving patterns of this ID are correlated in time, which suggests that it responds to inquiries or questions manually.
  3. This ID also only has records in the Entry Corridor.

From the evidence above, we can confidently hypothesize that this ID is another park employee, one who is in charge of emergency events and of responding to visitors' inquiries.
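The Sunday peak and the lag between receiving and sending could be located by counting messages per minute in each direction. The sketch below assumes the same hypothetical record layout of (timestamp, sender, receiver, location) with fixed-width "YYYY-MM-DD HH:MM:SS" timestamps; the function names and toy rows are illustrative only and do not reproduce the real counts.

```python
from collections import Counter

def per_minute_counts(messages, target):
    """Return (received, sent) Counters keyed by minute for the `target` ID."""
    received, sent = Counter(), Counter()
    for timestamp, sender, receiver, _location in messages:
        minute = timestamp[:16]  # "YYYY-MM-DD HH:MM", assumes fixed-width timestamps
        if receiver == target:
            received[minute] += 1
        if sender == target:
            sent[minute] += 1
    return received, sent

def peak(counter):
    """Return the (minute, count) pair with the highest count."""
    return max(counter.items(), key=lambda kv: kv[1])

# Hypothetical toy records: a burst received at 12:00 and answered at 12:03.
sample = [
    ("2016-01-03 11:58:10", "7", "839736", "Other Area"),
    ("2016-01-03 11:59:00", "839736", "7", "Entry Corridor"),
    ("2016-01-03 12:00:05", "8", "839736", "Other Area"),
    ("2016-01-03 12:00:20", "9", "839736", "Other Area"),
    ("2016-01-03 12:00:45", "10", "839736", "Other Area"),
    ("2016-01-03 12:03:01", "839736", "8", "Entry Corridor"),
    ("2016-01-03 12:03:02", "839736", "9", "Entry Corridor"),
    ("2016-01-03 12:03:03", "839736", "10", "Entry Corridor"),
]

received, sent = per_minute_counts(sample, "839736")
print(peak(received), peak(sent))
```

Comparing the two peak minutes gives the response delay; a consistent gap between the received and sent series would support the reading that this ID replies to incoming messages.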





Result