Difference between revisions of "ISSS608 2016-17 T1 Assign3 Liu Jialin"

From Visual Analytics and Applications
 
Line 1: Line 1:
 
<!--MAIN HEADER -->
 
<!--MAIN HEADER -->
 
{|style="background-color:#1B338F;" width="100%" cellspacing="0" cellpadding="0" valign="top" border="0"  |
 
{|style="background-color:#1B338F;" width="100%" cellspacing="0" cellpadding="0" valign="top" border="0"  |
| style="font-family:Century Gothic; font-size:100%; solid #000000; background:#2B3856; text-align:center;" width="25%" |  
+
| style="font-family:Century Gothic; font-size:100%; solid #000000; background:#2B3856; text-align:center;" width="20%" |  
 
;
 
;
[[ISSS608_2016-17_T1_Assign3_Liu_Jialin| <font color="#FFFFFF">Overview and Data Preparation</font>]]
+
[[ISSS608_2016-17_T1_Assign3_Liu_Jialin| <font color="#FFFFFF">Overview</font>]]
  
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#2B3856; text-align:center;" width="25%" |  
+
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#2B3856; text-align:center;" width="20%" |
 +
;
 +
[[Data Preparation| <font color="#FFFFFF">Data Preparation</font>]]
 +
 
 +
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#2B3856; text-align:center;" width="20%" |  
 
;
 
;
 
[[Question 1| <font color="#FFFFFF">Question 1</font>]]
 
[[Question 1| <font color="#FFFFFF">Question 1</font>]]
  
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#2B3856; text-align:center;" width="25%" |  
+
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#2B3856; text-align:center;" width="20%" |  
 
;
 
;
 
[[Question 2| <font color="#FFFFFF">Question 2</font>]]
 
[[Question 2| <font color="#FFFFFF">Question 2</font>]]
  
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#2B3856; text-align:center;" width="25%" |  
+
| style="font-family:Century Gothic; font-size:100%; solid #1B338F; background:#2B3856; text-align:center;" width="20%" |  
 
;
 
;
 
[[Question 3| <font color="#FFFFFF">Question 3</font>]]
 
[[Question 3| <font color="#FFFFFF">Question 3</font>]]
Line 19: Line 23:
 
|  &nbsp;
 
|  &nbsp;
 
|}
 
|}
<br/>
+
<h2>Problem Overview</h2>
Data preparation using JMP: <br>
+
DinoFun World is a typical modest-sized amusement park, sitting on about 215 hectares and hosting thousands of visitors each day. It has a small town feel, but it is well known for its exciting rides and events.
• Use the Concatenate function to combine the communication records from all 3 days into one file, and name the file <b>Communication in 3 days</b>.<br>
+
 
• Sort ascending on “timestamp”, then sort ascending on “from”. The messages sent by the same ID now appear together, in time order.<br>
+
One event last year was a weekend tribute to Scott Jones, an internationally renowned football (“soccer,” in US terminology) star. Scott Jones is from a town near DinoFun World. He was a classic hometown hero, with thousands of fans who cheered his success as if he were a beloved family member. To celebrate his years of stardom in international play, DinoFun World declared “Scott Jones Weekend”, where Scott was scheduled to appear in two stage shows each on Friday, Saturday, and Sunday to talk about his life and career. In addition, a show of memorabilia related to his illustrious career would be displayed in the park’s Pavilion.
• Hide and exclude all messages to and from 1278894 and 839736. Unexclude the rows accordingly when needed.<br>
+
However, the event did not go as planned. Scott’s weekend was marred by crime and mayhem perpetrated by a poor, misguided and disgruntled figure from Scott’s past.
• Create a column, name it "Unique Combination", apply formula: “Char(:from) || Char(:to)”.<br>
+
 
• Tabulate "Unique Combination" and N, make into data table, name file <b>Unique direction count of messages</b>.<br>
+
While the crimes were rapidly solved, park officials and law enforcement figures are interested in understanding just what happened during that weekend to better prepare themselves for future events. They are interested in understanding how people move and communicate in the park, as well as how patterns change and evolve over time, and what can be understood about motivations for changing patterns.
• In <b>Unique direction count of messages</b>, change column name “N” to “weight”.<br>
+
<br>
• Update <b>Communication in 3 days</b> from <b>Unique direction count of messages</b>, update with “weight” column.<br>
+
<h2>Dataset Available</h2>
• In <b>Communication in 3 days</b>, create a column called “Timestamp difference in min”, apply formula “Dif(:Timestamp, 1) / 60”.<br>
+
* ''DinoFunWorld_CommData.zip'' consists of in-app communication data over the three days of the Scott Jones celebration.
• Save file as <b>Edges for communication</b>. Remove column “Timestamp difference in min”, sort ascending by “Unique Combination”.<br>
+
* ''DinoFunWorld_MoveData.zip'' consists of three days of park movement data. The park movement datasets are in CSV format.
• Unlock the “Unique Combination” column, then change its data type from character to numeric and its modeling type to continuous.<br>
+
* ''DinoFunWorld_LayoutMap.zip'' consists of a jpg file.
• Create a new column, name it “remove duplicates”, and apply formula “Dif(:Unique Combination, 1)”.<br>
+
* ''DinoFunWorld_Website.zip'' consists of webpages of DinoFun World Park.
• Select all rows where “remove duplicates” = 0. These are duplicate rows; delete them.<br>
 
• Delete columns “Location”, “Timestamp” and “remove duplicates”.<br>
 
• Change “from” to “Source” and “to” to “Target”.<br>
 
• Save the file, export it to Excel, and name the exported file <b>Nodes for communication</b>.<br>
 
• In Excel, append all the Target nodes to the end of the Source column. Remove duplicates from this column. Delete the Target column.<br>
 
• Change “Source” to “ID”.<br>
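
The JMP steps above can be approximated in code. The following is a minimal Python sketch, not the workflow actually used: the function name <code>build_edges_and_nodes</code> and the in-memory record layout (dicts with "timestamp", "from" and "to" keys) are assumptions made for illustration. It also keys each pair as a tuple rather than a concatenated string, so that different ID pairs cannot produce the same key.

```python
from collections import Counter

# IDs hidden and excluded in the JMP steps.
EXCLUDED_IDS = {1278894, 839736}


def build_edges_and_nodes(daily_records):
    """Build weighted edges and a node list from per-day message records.

    daily_records is a list with one entry per day; each entry is a list of
    dicts with "timestamp", "from" and "to" keys (an assumed layout).
    """
    # Concatenate the records of all days into one list.
    messages = [row for day in daily_records for row in day]

    # Sort by timestamp, then by sender. Python's sort is stable, so messages
    # from the same ID end up together and in time order.
    messages.sort(key=lambda row: row["timestamp"])
    messages.sort(key=lambda row: row["from"])

    # Drop every message to or from an excluded ID.
    kept = [
        row for row in messages
        if row["from"] not in EXCLUDED_IDS and row["to"] not in EXCLUDED_IDS
    ]

    # Count messages per (from, to) direction; the count is the edge weight.
    weights = Counter((row["from"], row["to"]) for row in kept)

    # One row per unique direction, matching the deduplicated edge table.
    edges = [
        {"Source": source, "Target": target, "weight": weight}
        for (source, target), weight in sorted(weights.items())
    ]

    # Node list: every ID that appears as a Source or a Target, without duplicates.
    nodes = sorted({s for s, _ in weights} | {t for _, t in weights})
    return edges, nodes
```

Calling <code>build_edges_and_nodes</code> with one list of records per day returns the edge rows (Source, Target, weight) and the node IDs, which correspond to the two files imported into Gephi.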
 
 
<br>
 
<br>
Gephi:<br>
+
<h2>Analysis Tools Used</h2>
• Import into Gephi using <b>Nodes for communication</b> and <b>Edges for communication</b>.<br>
+
<ul>
• In Gephi, select the Yifan Hu layout, change the optimal distance to 200, and run the layout until it is satisfactory.<br>
+
<li>JMP</li>
• Set node size to depend on Degree and node colour to depend on Out-degree.<br>
+
<li>Gephi</li>
• Set colour of edges to depend on weight.<br>
 
• In filter, select topology, drag Mutual Degree into filter. Change the filters to obtain the filtered layouts.<br>
 
• In Context, check the number of nodes remaining under this filter.<br>
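
To make the mutual-degree filter more concrete, here is a small Python sketch of the underlying idea. It assumes that a node's mutual degree is the number of neighbours it is linked to in both directions; the function names and the <code>minimum</code> threshold are hypothetical, and the code is not Gephi's own implementation.

```python
def mutual_degree(edges):
    """Return {node: number of neighbours linked in both directions}.

    edges is an iterable of (source, target) pairs from a directed graph.
    """
    # Ignore self-loops and repeated edges.
    edge_set = {(s, t) for s, t in edges if s != t}

    # Start every node at zero so nodes without mutual links are still listed.
    degree = {node: 0 for edge in edge_set for node in edge}

    for source, target in edge_set:
        # Count the neighbour only if the reverse edge also exists.
        if (target, source) in edge_set:
            degree[source] += 1
    return degree


def filter_by_mutual_degree(edges, minimum):
    """Return the sorted nodes whose mutual degree is at least `minimum`."""
    degree = mutual_degree(edges)
    return sorted(node for node, value in degree.items() if value >= minimum)
```

The length of the list returned by <code>filter_by_mutual_degree</code> plays the same role as the node count shown in Gephi's Context panel after the filter is applied.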
 

Latest revision as of 22:10, 28 October 2016

Overview

Data Preparation

Question 1

Question 2

Question 3

 

Problem Overview

DinoFun World is a typical modest-sized amusement park, sitting on about 215 hectares and hosting thousands of visitors each day. It has a small town feel, but it is well known for its exciting rides and events.

One event last year was a weekend tribute to Scott Jones, an internationally renowned football (“soccer,” in US terminology) star. Scott Jones is from a town near DinoFun World. He was a classic hometown hero, with thousands of fans who cheered his success as if he were a beloved family member. To celebrate his years of stardom in international play, DinoFun World declared “Scott Jones Weekend”, where Scott was scheduled to appear in two stage shows each on Friday, Saturday, and Sunday to talk about his life and career. In addition, a show of memorabilia related to his illustrious career would be displayed in the park’s Pavilion. However, the event did not go as planned. Scott’s weekend was marred by crime and mayhem perpetrated by a poor, misguided and disgruntled figure from Scott’s past.

While the crimes were rapidly solved, park officials and law enforcement figures are interested in understanding just what happened during that weekend to better prepare themselves for future events. They are interested in understanding how people move and communicate in the park, as well as how patterns change and evolve over time, and what can be understood about motivations for changing patterns.

Dataset Available

  • DinoFunWorld_CommData.zip consists of in-app communication data over the three days of the Scott Jones celebration.
  • DinoFunWorld_MoveData.zip consists of three days of park movement data. The park movement datasets are in CSV format.
  • DinoFunWorld_LayoutMap.zip consists of a jpg file.
  • DinoFunWorld_Website.zip consists of webpages of DinoFun World Park.


Analysis Tools Used

  • JMP
  • Gephi