ISSS608 2017-18 T1 Assign ZHANG PENG
VAST Challenge: Characterization of an Epidemic Spread
Assignment Overview
- Identify approximately where the outbreak started on the map (ground zero location). Outline the affected area.
- Present a hypothesis on how the infection is being transmitted. For example, is the method of transmission person-to-person, airborne, waterborne, or something else? Identify the trends that support your hypothesis.
- Is the outbreak contained? Is it necessary for emergency management personnel to deploy treatment resources outside the affected area?
Data Description
xxx
Dataset Overview
1 dataset1 | 2 dataset2 |
xxx | xxx |
Data Preparation
1 Define episodes
Intuitively, each car-id (except for park service vehicles, 2P) should have exactly two entrance records: one for entering and one for exiting. However, many car-ids had more than two entrance records. It is reasonable to assume that these cars made multiple visits to the preserve. Therefore, I split the records of a car-id with multiple entries and exits into separate episodes. Each episode can be treated as a complete trip, and the analysis is performed on episodes.
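The episode split described above can be sketched as follows. This is a minimal illustration, not the workflow actually used for the assignment: the `(timestamp, gate_name)` record layout, the function name `split_into_episodes`, and the rule that any gate name starting with `entrance` marks an entry or exit are all assumptions.

```python
from datetime import datetime


def split_into_episodes(records):
    """Split one car-id's sensor readings into episodes.

    `records` is a list of (timestamp, gate_name) tuples (assumed layout).
    An episode opens at an entrance reading and closes at the next one.
    """
    episodes, current = [], []
    for timestamp, gate in sorted(records):
        current.append((timestamp, gate))
        # A second entrance reading inside an open episode is the exit.
        if gate.startswith("entrance") and len(current) > 1:
            episodes.append(current)
            current = []
    if current:
        # Trailing readings with no exit form an incomplete episode.
        episodes.append(current)
    return episodes


# Hypothetical readings for a single car-id that visited twice.
records = [
    (datetime(2015, 5, 1, 8, 0), "entrance1"),
    (datetime(2015, 5, 1, 8, 20), "general-gate1"),
    (datetime(2015, 5, 1, 9, 0), "entrance3"),
    (datetime(2015, 5, 8, 10, 0), "entrance3"),
    (datetime(2015, 5, 8, 10, 30), "camping4"),
    (datetime(2015, 5, 10, 16, 0), "entrance0"),
]
episodes = split_into_episodes(records)
print(len(episodes))  # 2
```

The sketch assumes every trip starts with an entrance reading; a car-id whose first reading is some other gate would need separate handling.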
2 Exclude incomplete trips
The maximum timestamp of the entire dataset indicates that the data provided was generated before June 2016. Some vehicles therefore have incomplete trips (the maximum timestamps of these car-ids are around the end of May), so I exclude car-ids with only one entrance record. The figure on the right shows the car-ids with only one entrance record and their maximum timestamps.
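A minimal sketch of this filter, assuming rows of `(car_id, gate_name)` tuples and gate names that start with `entrance` for entrance sensors. The function name and the sample car-ids are hypothetical.

```python
from collections import Counter


def drop_single_entrance_cars(rows):
    """Remove every row of a car-id that has exactly one entrance reading.

    `rows` is a list of (car_id, gate_name) tuples (assumed layout).
    Car-ids with no entrance reading at all are left untouched.
    """
    entrance_counts = Counter(
        car_id for car_id, gate in rows if gate.startswith("entrance")
    )
    return [row for row in rows if entrance_counts[row[0]] != 1]


# Hypothetical rows: car-B entered but has no exit reading.
rows = [
    ("car-A", "entrance1"), ("car-A", "camping2"), ("car-A", "entrance1"),
    ("car-B", "entrance2"), ("car-B", "general-gate3"),
    ("car-R", "ranger-base"),
]
kept = drop_single_entrance_cars(rows)
print(sorted({car_id for car_id, _ in kept}))  # ['car-A', 'car-R']
```

Keeping car-ids with zero entrance readings is a design choice made here so that vehicles that never use an entrance gate are not dropped by accident.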
3 Remove duplicate records
Some sensor records are duplicated in the dataset. In each of these cases the car-id has three entrance records: the first two have the same (or nearly the same) timestamp, while the third has a different timestamp and is consistent with the activities that follow. I therefore remove the first two records, assuming that something went wrong in the data entry, because keeping only the third makes more sense when interpreting the trip. The figure on the right shows an example of this kind of data anomaly.
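One way to express this rule in code is sketched below. It assumes that the three entrance readings are the first three records of the car-id, and it uses a hypothetical one-minute `tolerance` to decide what counts as a "very close" timestamp; neither detail is stated in the text above.

```python
from datetime import datetime, timedelta


def drop_leading_duplicate_entrances(records, tolerance=timedelta(minutes=1)):
    """Drop the first two of three leading entrance readings.

    `records` is a list of (timestamp, gate_name) tuples for one car-id.
    The rule fires only when the first three readings are all entrance
    readings and the first two are at most `tolerance` apart (assumption).
    """
    ordered = sorted(records)
    if len(ordered) >= 3:
        first_three = ordered[:3]
        all_entrances = all(g.startswith("entrance") for _, g in first_three)
        near_identical = first_three[1][0] - first_three[0][0] <= tolerance
        if all_entrances and near_identical:
            return ordered[2:]
    return ordered


# Hypothetical anomaly: two identical entrance readings, then the real entry.
t0 = datetime(2015, 6, 1, 9, 0)
records = [
    (t0, "entrance2"),
    (t0, "entrance2"),
    (t0 + timedelta(hours=2), "entrance2"),
    (t0 + timedelta(hours=2, minutes=20), "general-gate1"),
    (t0 + timedelta(hours=3), "entrance4"),
]
cleaned = drop_leading_duplicate_entrances(records)
print(len(cleaned))  # 3
```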
4 Label the sequence of gates of each episode
In order to plot the entire routes of the vehicles, I create a new column named 'Sequence' to
5 Concatenate the routes
Concatenate the gate-names of each episode to form a route.
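Steps 4 and 5 can be sketched together. The sketch assumes that 'Sequence' is a 1-based position of each gate reading within its episode, and it uses a hypothetical `" > "` separator for the concatenated route; the function names are illustrative.

```python
def label_sequence(episode):
    """Return (sequence, timestamp, gate_name) rows in time order, 1-based."""
    ordered = sorted(episode)
    return [(seq, ts, gate) for seq, (ts, gate) in enumerate(ordered, start=1)]


def concatenate_route(episode, separator=" > "):
    """Join the gate names of one episode, in time order, into a route."""
    return separator.join(gate for _, _, gate in label_sequence(episode))


# Hypothetical episode; ISO-formatted timestamps sort correctly as strings.
episode = [
    ("2015-05-01 08:20:00", "general-gate1"),
    ("2015-05-01 08:00:00", "entrance1"),
    ("2015-05-01 09:00:00", "entrance3"),
]
print(label_sequence(episode)[0])  # (1, '2015-05-01 08:00:00', 'entrance1')
print(concatenate_route(episode))  # entrance1 > general-gate1 > entrance3
```

Two episodes with the same concatenated string followed the same route, which makes route frequencies easy to count.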
6 Extract the gate-to-gate directions
7 Calculate the gate-to-gate duration
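Steps 6 and 7 both work on consecutive pairs of gate readings within an episode. The following is a rough sketch under the assumption that an episode is a list of `(timestamp, gate_name)` tuples; the `direction` label format and the function name are hypothetical.

```python
from datetime import datetime


def gate_to_gate_legs(episode):
    """Pair consecutive readings into legs with a direction and a duration."""
    ordered = sorted(episode)
    legs = []
    for (start, origin), (end, destination) in zip(ordered, ordered[1:]):
        legs.append({
            "direction": f"{origin} -> {destination}",
            "duration": end - start,  # a datetime.timedelta
        })
    return legs


# Hypothetical episode with three gate readings, hence two legs.
episode = [
    (datetime(2015, 5, 1, 8, 0), "entrance1"),
    (datetime(2015, 5, 1, 8, 20), "general-gate1"),
    (datetime(2015, 5, 1, 9, 0), "entrance3"),
]
legs = gate_to_gate_legs(episode)
print(legs[0]["direction"], legs[0]["duration"])  # entrance1 -> general-gate1 0:20:00
```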
8 Extract the arrival timestamp of each episode
9 Segment the visitor types
10 Map coordinates of gates
Interactive Visualization
You can carry out your own investigation here: Link to interactive visualization
- Please note that the link may not work properly because of an unknown Tableau server issue; in that case, please download the workbook from the Tableau Public landing page.
Patterns of Life Analysis
Daily Patterns
Images | Interpretations |
---|---|
The two types of buses and the 4+ axle trucks, all large vehicles, never appeared in any camping area. This may indicate that these three car-types only pass through the preserve and that large vehicles are not allowed in camping areas. | |
Most traffic through camping areas occurred between 5am and 10pm, except for one car-id, 20154519024544-322, which is discussed in a later section. This may indicate that traffic was not allowed in camping areas after 10pm, to ensure the safety and rest of overnight campers. | |
The 2 axle car/motorcycle, 2 axle truck, and 3 axle truck car-types were the most active vehicles in the preserve. Their activity started to increase at 6am and flattened out at around 6pm. Vehicle activity was highest from 7am to 5pm. | |
Some vehicles simply passed through the preserve without making any stops. These vehicles can be identified by the number of gates they passed through. This pattern applies only to non-campers and happened within a short time period. The graph on the left shows all the possible routes for passing through.
|
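The pass-through check mentioned in the daily patterns can be sketched as a simple filter on episodes. The thresholds `max_gates` and `max_duration` are hypothetical values chosen for illustration, as is the rule that camping gates have names starting with `camping`; the workbook may use different criteria.

```python
from datetime import datetime, timedelta


def is_pass_through(episode, max_gates=4, max_duration=timedelta(hours=1)):
    """Flag a short, non-camping episode that touches only a few gates.

    `episode` is a list of (timestamp, gate_name) tuples. Both thresholds
    are illustrative assumptions, not values taken from the analysis.
    """
    ordered = sorted(episode)
    if not ordered:
        return False
    gates = [gate for _, gate in ordered]
    visited_camping = any(gate.startswith("camping") for gate in gates)
    duration = ordered[-1][0] - ordered[0][0]
    return (
        not visited_camping
        and len(gates) <= max_gates
        and duration <= max_duration
    )


# Hypothetical episodes: a quick crossing and an overnight camping trip.
t0 = datetime(2015, 7, 3, 14, 0)
crossing = [
    (t0, "entrance1"),
    (t0 + timedelta(minutes=10), "general-gate7"),
    (t0 + timedelta(minutes=25), "entrance3"),
]
camping_trip = [
    (t0, "entrance1"),
    (t0 + timedelta(minutes=30), "camping4"),
    (t0 + timedelta(days=2), "entrance1"),
]
print(is_pass_through(crossing), is_pass_through(camping_trip))  # True False
```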
Longer-Period Patterns
Images | Interpretations |
---|---|
Traffic increased from May, peaked in July, and then started to decrease. November to March were the least popular months for visitors, possibly because these are the winter months. | |
Activity of the 2 axle car/motorcycle, 2 axle truck, and 3 axle truck car-types increased on Fridays and decreased on Mondays. This can be explained by overnight camping over the weekend. | |
Rangers spent less than 1 hour at ranger-stops and camping areas. | |
The graph shows the route with the most ranger episodes. It was the rangers' most frequent patrol route, almost twice as frequent as the second most frequent one. It is possible that the east side of the preserve required more care and protection.
|
|
Campers arrived at the preserve between 5am and 5pm. Friday to Sunday were the most popular arrival days, as expected. |
Unusual Patterns
Images | Interpretations |
---|---|
This table displays the route of car-id 20154519024544-322 (a 2 axle truck), which passed through camping gates after 10pm. This vehicle had 16 episodes, and every episode except the first followed exactly the same route. The vehicle came to the preserve each Friday and left on the following Monday. | |
Apart from 20154519024544-322, other car-ids also had multiple episodes, which means they did not return their car-id when they exited the preserve. Every time they came to the preserve, they followed the same route and camped overnight. This group of visitors might hold a regular pass for their visits. | |
Apart from the gate-skippers mentioned above, another 3 episodes made a simple round trip in the preserve: they entered the preserve, passed through a general-gate, and then took the same route back to the entrance. |
Top 3 Possible Causes
- The long-term visitor with car-id 20154519024544-322, who traveled past camping areas around midnight.
- Unauthorized 4+ axle trucks entering restricted areas. The route they traveled was part of the rangers' most frequent patrol route. This area could be where the pipits resided, and therefore needed more care and protection from the rangers. The fact that the trucks went through the restricted area when the rangers were off duty makes them extremely suspicious.
- Possible speeding, which requires further investigation, especially on the pass-through routes.
Comments & Discussions
comment1: Hi, Bijun. Amazing work! From your analysis pack, I can see the level of effort that you have devoted to this assignment. Still, I have the following suggestions, which I hope are useful in further improving your work:
Overall, fantastic work! Hopefully my comments can add value.
Hi Joyce, Great overall effort and very engaging. Some of my feedback is below 😉
Clarity:
Cheers,
comment3
Hi Zheng Bijun, Please find my feedback comments as follows. You present a very nice analysis, which answers the questions of the challenge. With regard to clarity and aesthetics, I have the following to add. Clarity:
Aesthetics:
Hope the feedback helps, and please leave feedback on my page as well. You may access it here.
Thank You, Kishan Bharadwaj Shridhar
comment4 |
comment5 |