Difference between revisions of "ISSS608 2017-18 T1 Assign ZHANG PENG"

From Visual Analytics and Applications
(Created page with "<!--- Challenge Introduction ---> = VAST Challenge: Characterization of an Epidemic Spread = <div width=100%> <div width=30%> </div> <div width=70%> <p align="justify"> assign...")
 
Line 37: Line 37:
 
<!--- Data Preparation --->
 
<!--- Data Preparation --->
 
= Data Preparation =
 
<b>1 Define episodes</b><br>
+
<table></table>
<p align="justify">Intuitively, each car-id (except for park service vehicles, car-type 2p) should have two entrance records: one for entering and one for exiting. However, many car-ids had more than two entrance records. It is reasonable to assume that these cars made multiple visits to the preserve. Therefore, I split the records of any car-id with multiple entries and exits into separate episodes. Each episode can be seen as a complete trip, and the analysis is performed on episodes. </p>
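A minimal Python sketch of this splitting rule is given here for illustration only; it is not the tooling used for this assignment. It assumes each sensor record is a <code>(car_id, timestamp, gate_name)</code> tuple, that timestamps are sortable strings, and that entrance gates are the ones whose names start with <code>entrance</code>. The function name <code>split_into_episodes</code> is hypothetical.

```python
from collections import defaultdict


def split_into_episodes(records):
    """Group (car_id, timestamp, gate_name) records into episodes per car-id.

    Assumption: an episode opens at an entrance reading and closes at the
    next entrance reading of the same car-id.
    """
    by_car = defaultdict(list)
    for car_id, timestamp, gate in records:
        by_car[car_id].append((timestamp, gate))

    episodes = defaultdict(list)
    for car_id, readings in by_car.items():
        readings.sort()  # chronological order within one car-id
        current = []
        for timestamp, gate in readings:
            current.append((timestamp, gate))
            # A second entrance reading closes the open episode.
            if gate.startswith("entrance") and len(current) > 1:
                episodes[car_id].append(current)
                current = []
        if current:
            # Leftover readings with no closing entrance: an incomplete trip.
            episodes[car_id].append(current)
    return dict(episodes)
```

A car-id with four entrance readings therefore yields two episodes, while a car-id with a single entrance reading yields one incomplete episode.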
 
<br>
 
<table>
 
<tr>
 
<td valign="top"><b>2 Exclude incomplete trips</b><br>
 
<p align="justify">Judging by the maximum timestamp of the entire dataset, it is safe to conclude that the data provided was generated before June 2016. Therefore, the dataset contains vehicles with incomplete trips (the maximum timestamps of these car-ids are around the end of May), and I exclude car-ids with only one entrance record. The figure on the right shows the car-ids with only one entrance record and their maximum timestamps.</p></td>
 
<td>[[File:ZBJ 1entrance-list.PNG|left|150 px]]</td>
 
 
 
<td valign="top"><b>3 Remove duplicate records</b>
 
<p align="justify">Some sensor records are duplicated in the dataset. In each of these cases the car-id has three entrance records: the first two have the same (or very close) timestamps, and the third has a different timestamp that is consistent with the activities that follow. Hence, I remove the first two records, assuming that something went wrong with the data entry, and keep the third one, which makes more sense when interpreting the trip. The figure on the right shows an example of this kind of data anomaly.</p></td>
 
<td valign="top">[[File:ZBJ 3entrance-1.PNG|right|400 px]]</td>
 
</tr>
 
</table>
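The exclusion rule in step 2 can be sketched in Python as follows. This is only an illustration under stated assumptions: records are <code>(car_id, timestamp, gate_name)</code> tuples, entrance gates have names starting with <code>entrance</code>, and the function name <code>drop_single_entrance_cars</code> is hypothetical. The duplicate-removal rule in step 3 is not covered here.

```python
from collections import Counter


def drop_single_entrance_cars(records):
    """Remove every record of a car-id that has exactly one entrance reading."""
    entrance_counts = Counter(
        car_id for car_id, _, gate in records if gate.startswith("entrance")
    )
    incomplete = {car_id for car_id, n in entrance_counts.items() if n == 1}
    # Car-ids with no entrance reading at all (for example ranger vehicles
    # that start from ranger-base) are left untouched.
    return [record for record in records if record[0] not in incomplete]
```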
 
<table>
 
<tr>
 
<td width=50% valign="top">
 
<b>4 Label the sequence of gates of each episode</b><br>
 
<p align="justify">To plot the entire routes of the vehicles, I create a new column named 'Sequence' to mark the order of the gates they passed through.</p>
 
</td>
 
<td valign="top">
 
<b>5 Concatenate the routes</b><br>
 
<p>Concatenate the gate-names of each episode to form a route. </p>
 
</td>
 
</tr>
 
<tr>
 
<td>[[File:ZBJ Seq.JPG|left|500 px]]</td>
 
<td>[[File:ZBJ Rule.JPG|left|600 px]]</td>
 
</tr>
 
<tr>
 
<td valign="top"><b>6 Extract the gate-to-gate directions</b><br></td>
 
<td valign="top"><b>7 Calculate the gate-to-gate duration</b><br></td>
 
</tr>
 
<tr>
 
<td>[[File:ZBJ Gatetogate.JPG|left|550 px]]</td>
 
<td>[[File:ZBJ Duration.JPG|left|600 px]]</td>
 
</tr>
 
<tr>
 
<td valign="top"><b>8 Extract the arrival timestamp of each episode</b><br></td>
 
<td valign="top"><b>9 Segment the visitor types</b><br>
 
<ul>
 
<li>Camper: normal visitors who went to camping areas</li>
 
<li>Non-camper: normal visitors who never went to camping areas</li>
 
<li>Rangers: vehicles with car-type as 2p</li>
 
</ul>
 
</td>
 
</tr>
 
<tr>
 
<td>[[File:ZBJ Arrtime.JPG|left|550 px]]</td>
 
<td>[[File:ZBJ Visitortype.JPG|left|450 px]]</td>
 
</tr>
 
</table>
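Steps 4 to 9 above can be sketched in Python roughly as follows. This is an illustration, not the workflow shown in the screenshots: it assumes an episode is a time-ordered list of <code>(timestamp, gate_name)</code> pairs, that timestamps use the format <code>YYYY-MM-DD HH:MM:SS</code>, and that camping gates have names starting with <code>camping</code>. The function names <code>describe_episode</code> and <code>visitor_type</code> are hypothetical.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed format, not confirmed by the source


def describe_episode(readings):
    """Derive the per-episode fields of steps 4 to 8 from ordered readings."""
    legs = []
    for (t0, g0), (t1, g1) in zip(readings, readings[1:]):
        start = datetime.strptime(t0, TIMESTAMP_FORMAT)
        end = datetime.strptime(t1, TIMESTAMP_FORMAT)
        minutes = (end - start).total_seconds() / 60
        legs.append((g0 + ">" + g1, minutes))  # gate-to-gate direction and duration
    return {
        "sequence": list(range(1, len(readings) + 1)),  # order of gates passed
        "route": ">".join(gate for _, gate in readings),  # concatenated route
        "legs": legs,
        "arrival": readings[0][0],  # timestamp of the first reading
    }


def visitor_type(car_type, readings):
    """Segment an episode into Ranger, Camper or Non-camper (step 9)."""
    if car_type.lower() == "2p":
        return "Ranger"
    if any(gate.startswith("camping") for _, gate in readings):
        return "Camper"
    return "Non-camper"
```

The route string uses <code>></code> as the separator so that it matches the route notation used in the analysis sections.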
 
<table>
 
<tr>
 
<td valign="top"><b>10 Map coordinates of gates</b><br>
 
<ul>
 
<li>Use JMP Pro 13 Custom Map Creator add-in to point the gates on the map and generate the coordinates.</li>
 
<li>The scale of the map is set to 12*12, matching the area indicated in the data description. </li>
 
</ul></td>
 
</tr>
 
<tr>
 
<td>[[File:ZBJ Mapcoord.jpg|left|700 px]]</td>
 
</tr>
 
</table>
 
<br>
 
 
 
 
<!------------------------->
 
<!------------------------->
  
Line 108: Line 43:
  
 
= Interactive Visualization =
 
You can explore the data yourself here: [https://public.tableau.com/profile/zheng.bijun#!/vizhome/VAST2017MC1/StoryPatternsofLifeAnalysis Link to interactive visualization]
+
You can explore the data yourself here: link
* Please note that the link is not working well due to an unknown Tableau server issue; please download the workbook via the Tableau Public landing page.
 
<table>
 
<tr>
 
<td>[[File:ZBJ Cover.JPG|600 px|left]]</td>
 
<td>[[File:ZBJ Trend.JPG|600 px|left]]</td>
 
</tr>
 
<tr>
 
<td>[[File:ZBJ Pattern.JPG|600 px|left]]</td>
 
<td>[[File:ZBJ Vehicle.JPG|600 px|left]]</td>
 
</tr>
 
</table>
 
 
<br>
 
<br>
 
<!------------------------>
 
<!------------------------>

Revision as of 14:02, 15 October 2017

VAST Challenge: Characterization of an Epidemic Spread

assignment overview

  • Identify approximately where the outbreak started on the map (ground zero location). Outline the affected area.
  • Present a hypothesis on how the infection is being transmitted. For example, is the method of transmission person-to-person, airborne, waterborne, or something else? Identify the trends that support your hypothesis.
  • Is the outbreak contained? Is it necessary for emergency management personnel to deploy treatment resources outside the affected area?



Data Description

xxx

Dataset Overview
1 dataset1 2 dataset2
xxx xxx



Data Preparation


Interactive Visualization

You can explore the data yourself here: link


Patterns of Life Analysis

Daily Patterns
Images Interpretations

  • Park service vehicles never showed up between 4am and 5am at any type of gate. This might be the off-duty period of the rangers, and the off-duty period varied across days of the week. For example, rangers were never on patrol from 1am to 5am on Saturdays.
  • If we look at the arrival time (in this case, the time when rangers started each patrol trip), the first shift always started at 6am and the last shift started at 5pm.
The two types of buses and the 4+ axle trucks, all large vehicles, never appeared in any camping area. This might mean that these three car-types can only pass through the preserve and that large vehicles are not allowed in camping areas.
The majority of traffic through camping areas occurred only between 5am and 10pm, except for one car-id, 20154519024544-322, which is discussed in a later section. This might indicate that traffic was not allowed in camping areas after 10pm to ensure the safety and rest of overnight campers.
2 axle car/motorcycle, 2 axle truck, and 3 axle truck were the most active vehicles in the preserve. Their activity started to increase at 6am and flattened out at around 6pm. Most vehicle activity occurred between 7am and 5pm.
There were vehicles that simply passed through the preserve without stopping or looking around. These vehicles can be identified by investigating the number of gates they passed through. This pattern only applies to non-campers and happened within a short time period. The graph on the left shows all the possible routes for trespassing (passing straight through).
  • Entrance0<->entrance3
  • Entrance2<->entrance4
  • Entrance1<->general-gate7<->entrance3
  • Entrance0<->general-gate7<->general-gate4<->entrance1
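As an illustration of how such pass-through episodes could be flagged, the Python sketch below matches an episode's route against the four routes listed above, in either direction. It assumes routes are stored as gate-names joined by <code>></code> and written in lower case; the names <code>PASS_THROUGH_ROUTES</code> and <code>is_pass_through</code> are hypothetical.

```python
# The four pass-through routes listed above, written in one direction only.
PASS_THROUGH_ROUTES = [
    ["entrance0", "entrance3"],
    ["entrance2", "entrance4"],
    ["entrance1", "general-gate7", "entrance3"],
    ["entrance0", "general-gate7", "general-gate4", "entrance1"],
]


def is_pass_through(route):
    """Return True if the route equals a listed route in either direction."""
    gates = route.split(">")
    return any(
        gates == listed or gates == listed[::-1] for listed in PASS_THROUGH_ROUTES
    )
```

Because every listed route has at most four gates, a simple filter on the number of gates per episode narrows the candidates before the exact match.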
Longer-Period Patterns
Images Interpretations
Traffic increased from May, peaked in July, and then started to decrease. November to March were the least popular months for visitors, possibly because these are the winter months.
Activities of 2 axle car/motorcycle, 2 axle truck and 3 axle truck increased on Fridays and decreased on Mondays. This can be explained by overnight camping during weekends.

The time rangers spent at ranger-stops and camping areas was less than 1 hour.
The graph shows the route with the most ranger episodes. It was the most frequent patrol route of the rangers, almost twice as frequent as the second most frequent patrol route. It is possible that the east side of the preserve required more care and protection.
  • ranger-base>gate8>general-gate5>gate3>ranger-stop3>ranger-stop3>gate3>camping8>general-gate3>gate4>ranger-stop5>ranger-stop5>gate4>gate5>ranger-stop6>ranger-stop6>gate5>gate8>ranger-base
Campers arrived at the preserve between 5am and 5pm. Friday to Sunday were more popular, as expected.
Unusual Patterns
Images Interpretations
This table displays the route of car-id 20154519024544-322 (a 2 axle truck), which passed through camping gates after 10pm. This vehicle had 16 episodes, and every episode except the first followed exactly the same route. This vehicle came to the preserve each Friday and left on the following Monday.
Apart from 20154519024544-322, there were other car-ids with multiple episodes, which means they did not surrender their car-id when they exited the preserve. Every time they came to the preserve, they followed the same routes and went overnight camping. This group of visitors might hold a regular pass for their visits.

  • Unauthorized 4+ axle trucks appeared at gates only on Tuesdays and Thursdays, though not every week.
  • They arrived at the preserve between 2am and 4am. More interestingly, the times at which they passed through gates avoided the times at which park service vehicles passed through those gates.
  • The 4+ axle trucks that passed through gates had different car-ids, but they all followed the exact same route:
    entrance3>gate6>ranger-stop6>gate5>general-gate5>gate3>ranger-stop3>ranger-stop3>gate3>general-gate5>gate5>ranger-stop6>gate6>entrance3
  • Recall from the previous section that this route lay within the area covered by the most frequent ranger patrol route.
  • There were vehicles going between entrance1 and ranger-stop1 without records from gate2. However, entrance1<->gate2<->ranger-stop1 is the only path between entrance1 and ranger-stop1.
  • There were 6 episodes, all by 2 axle cars/motorcycles, that followed the exact same path entrance1>ranger-stop1>ranger-stop1>entrance1. They happened on the same day and at the same time.
  • They stayed at ranger-stop1 for almost 4 hours, which was long and suspicious.
Apart from the gate-skippers mentioned above, another 3 episodes made a simple round trip in the preserve: the vehicles entered the preserve, passed through a general-gate, then took the same route back to the entrance.
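The gate-skipping check described above can be sketched in Python as below. It is a hedged illustration: it encodes only the one constraint stated in the text (gate2 lies on the only path between entrance1 and ranger-stop1), assumes routes are gate-names joined by <code>></code>, and uses the hypothetical names <code>REQUIRED_BETWEEN</code> and <code>skipped_gates</code>.

```python
# Gate that must be recorded between two locations, per the text above.
# frozenset keys make the lookup independent of travel direction.
REQUIRED_BETWEEN = {
    frozenset(("entrance1", "ranger-stop1")): "gate2",
}


def skipped_gates(route):
    """List (from_gate, to_gate, missing_gate) for each skipped mandatory gate."""
    gates = route.split(">")
    skipped = []
    for origin, destination in zip(gates, gates[1:]):
        missing = REQUIRED_BETWEEN.get(frozenset((origin, destination)))
        if missing is not None:
            skipped.append((origin, destination, missing))
    return skipped
```

An empty result means that no mandatory gate was skipped on the consecutive pairs this table knows about; further constraints would need to be added from the map.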
Top 3 Possible Causes
  1. The long-term visitor with car-id 20154519024544-322 and their habit of traveling past camping areas around midnight.
  2. Unauthorized 4+ axle trucks invading restricted areas, because the route they traveled was part of the most frequent patrol route of the rangers. This area could be where the pipits resided and therefore needed more care and protection from rangers. The fact that they went through the restricted area when the rangers were off-duty makes them extremely suspicious.
  3. Possible speeding, which requires further investigation, especially on trespassing routes.


Comments & Discussions

comment1: Hi, Bijun. Amazing work! From your analysis pack, I can see the level of effort that you have devoted to this assignment. I have the following suggestions, which I hope are useful in further improving your work:
    Aesthetic
  • 1. I love the way you present your analysis. However, a big part of the analysis findings is shown with line graphs. I am wondering if you could try other types of graphs to make the findings visually clearer?
  • 2. In the first graph of the daily pattern, you are trying to say that the rangers do not work from 4am to 5am. Yet the graph's x axis only shows hours 3, 9, 15 and 21. I think the pattern will be more obvious if you construct the graph so that the x axis displays every hour of the day.
  • 3. The story structure in your Tableau workbook is clear. I look forward to viewing it interactively online soon.
    Clarity
  • 1. The Tableau workbook contains lots of useful information, yet it is a bit complex and confusing. There are quite a few selectors and parameters defined, and the graphs are controlled by different selectors, which is not straightforward. The audience will probably need to spend quite some time understanding the dashboard, particularly those who do not know the background of the VAST Challenge.
  • 2. The legends are not closely attached to their corresponding graphs, which also adds confusion to the dashboard.

Overall, fantastic work! Hopefully my comments can add value
Best Regards
Yunna

Hi Joyce,

Great overall effort and very engaging. Some of my feedback as below 😉
Aesthetics:

    • Though it is very interactive, with a lot of filters and legends, I feel that it is a bit overwhelming and confusing, and it took me a while to understand and link them together; I believe this is also causing the dashboard to load slowly. Selecting some of the values also caused the whole dashboard to blank out, and I got lost in the visualization. Consider reducing the number of filter variables for the interactive visualization.
    • I guess storybook should be story-telling and easy to follow for anyone. Currently, it is designed for exploratory purposes.
    • The colors of the titles, filters and legends are well designed and implemented, and all are well linked across the various graphs! The only thing is that the description fonts are a bit small for old folks.

Clarity:

    • I don’t understand the “top x and bottom x” & “Timestamp Slector” (typo? Sector or Selector?) but I guess you are trying to compare the traffic at each gate with the arrival time, though I can’t tell anything obvious from the arrival time from pattern detection. In such case, it may be interesting to include and look at their departure time as well.
    • Currently it only shows the route on the map; I think you can also consider showing the intensity of the path traveled through the thickness of the line.
    • Coordinates of the checkpoints are offset from the actual 200x200 grid or actual distance/area and may cause confusion. Zooming feature in the map is good as it allows for better visibility similar to “How long did they spend”. Only point is to consider expanding the box to allow complete view of it; currently it requires scrolling.

Cheers,
Zac

comment3

Hi Zheng Bijun,

Please find my feedback comments as follows. You present a very nice analysis, which is answering the questions of the challenge. With regard to the clarity and aesthetics aspect, I have the following to add.

Clarity :

 

    • In the daily pattern plots, you have used the ‘select’ feature effectively to illustrate clearly the trend you wish to explain. This is a good practice, as it helps to retain the background information clearly, whilst projecting the focus for the user.
    • When using map images, you might want to use the Cartesian coordinates more effectively. You can use Tableau to import the map as a background image, and then geocode it so that you will be able to use annotations. The current text labels do help the user identify the locations inside the preserve, such as entrance 0, entrance 3, etc., but having annotations will help them to pop out of the plane, thereby presenting better clarity.
    • In your 2nd plot inside longer period patterns, I assume the x axis shows the days of the week (1-7). You might want to add an axis label, or you might want to use aliases for the days of the week and label the axis. (for e.g. 1-Sunday, 2-Monday, etc.).
    • On most of the plots, you have a well defined title, so you might not want to show the headers on the Y axis (# of cars) since it can already be known that the chart shows the trends of traffic.

Aesthetics:

    • I notice that you have tweaked the background colour. Maybe you would also want to explore the format axis feature in Tableau, which might change the text to bolder and more visible formats. This would lend more readability to the plots.
    • On the arrival time calendar plot you have developed, when you try to visualize the number of episodes, the gradation in colors is good, and helps to quickly infer, which times of the day have higher episodes.

  Hope the feedback helps, and please leave feedback on my page as well. You may access it here.
Navigate to the bottom of the main page, after reading the 3 sub pages.

 

Thank You,

Kishan Bharadwaj Shridhar

comment4
comment5