Difference between revisions of "ISSS608 2016-17 T1 Assign2 Ong Han Ying - Data Preparation After Design Iteration"

From Visual Analytics and Applications
Jump to navigation Jump to search
 
(2 intermediate revisions by the same user not shown)
Line 40: Line 40:
 
{|style= width="101%; margin-left: 0 auto ; margin-right: 0auto ; border: none " cellspacing="3" cellpadding="0" valign="top" border="0"  |
 
{|style= width="101%; margin-left: 0 auto ; margin-right: 0auto ; border: none " cellspacing="3" cellpadding="0" valign="top" border="0"  |
  
| style="font-family:Arial; font-size:100%; solid #000000;background:#EAEDED; border-top:10px solid #FFFFFF; border-bottom:4px solid #A6ACAF; text-align:center;" width="50%" |  
+
| style="font-family:Arial; font-size:100%; solid #000000;background:#F7F9F9; border-top:10px solid #FFFFFF; border-bottom:3px solid #A6ACAF; text-align:center;" width="50%" |  
 
;
 
;
 
[[ISSS608 2016-17 T1 Assign2 Ong Han Ying - Data Preparation|<font color="#0B5345"><big>Initial Preparation Work</big></font> ]]  
 
[[ISSS608 2016-17 T1 Assign2 Ong Han Ying - Data Preparation|<font color="#0B5345"><big>Initial Preparation Work</big></font> ]]  
  
| style="font-family:Arial; font-size:100%; solid #000000; background:#E8F6F3; border-bottom:4px solid #7DCEA0; border-top:10px solid #FFFFFF; text-align:center;" width="50%" |  
+
| style="font-family:Arial; font-size:100%; solid #000000; background:#E8F6F3; border-bottom:3px solid #7DCEA0; border-top:10px solid #FFFFFF; text-align:center;" width="50%" |  
 
;
 
;
 
[[ISSS608 2016-17 T1 Assign2 Ong Han Ying - Data Preparation_After_Design_Iteration|<font color="#0B5345"><big>'''After Design Iteration'''</big></font> ]]   
 
[[ISSS608 2016-17 T1 Assign2 Ong Han Ying - Data Preparation_After_Design_Iteration|<font color="#0B5345"><big>'''After Design Iteration'''</big></font> ]]   
Line 68: Line 68:
 
</font>
 
</font>
  
==5.5: Additional Data Identified from Design Iteration ==
+
=5.5: Additional Data Identified from Design Iteration =
  
 
This section highlights the additional data preparation work that is identified in [[ISSS608 2016-17 T1 Assign2 Ong Han Ying - Approach|'''Section 6: Approach''']].  
 
This section highlights the additional data preparation work that is identified in [[ISSS608 2016-17 T1 Assign2 Ong Han Ying - Approach|'''Section 6: Approach''']].  
Line 74: Line 74:
 
This is aligned with the overview of data preparation in [[ISSS608 2016-17 T1 Assign2 Ong Han Ying - Data Preparation|'''Section 5.0: Overview''']] where additional data may be required at the visualization & analysis stage.  
 
This is aligned with the overview of data preparation in [[ISSS608 2016-17 T1 Assign2 Ong Han Ying - Data Preparation|'''Section 5.0: Overview''']] where additional data may be required at the visualization & analysis stage.  
  
===5.5.1: Rephrasing the Survey Questions ===
+
==5.5.1: Rephrasing the Survey Questions ==
  
 
With reference to [[ISSS608 2016-17 T1 Assign2 Ong Han Ying - Approach_Data_Presentation|'''6.2.2: Analysing Categorical Data''']], the survey questions has been shortened, as below;  
 
With reference to [[ISSS608 2016-17 T1 Assign2 Ong Han Ying - Approach_Data_Presentation|'''6.2.2: Analysing Categorical Data''']], the survey questions has been shortened, as below;  
Line 80: Line 80:
 
[[File:OHY Qn Rephrase.jpg |1000px|centre|Rephrasing of Survey Questions]]
 
[[File:OHY Qn Rephrase.jpg |1000px|centre|Rephrasing of Survey Questions]]
  
===5.5.2: Creating Manual Binning ===
+
==5.5.2: Creating Manual Binning ==
  
 
With reference to [[ISSS608 2016-17 T1 Assign2 Ong Han Ying - Approach Data Presentation|'''6.2.3: Analysing Interval Data''']], manual binning are created in the dataset (through MS EXCEL's formula), as below;  
 
With reference to [[ISSS608 2016-17 T1 Assign2 Ong Han Ying - Approach Data Presentation|'''6.2.3: Analysing Interval Data''']], manual binning are created in the dataset (through MS EXCEL's formula), as below;  
Line 95: Line 95:
 
|}
 
|}
  
===5.5.3: Adding #ID to the Survey Result's Dataset===
+
==5.5.3: Adding #ID to the Survey Result's Dataset==
  
 
From [[ISSS608 2016-17 T1 Assign2 Ong Han Ying - Approach Data Presentation|'''6.2.4: Computing Survey Results - Using Frequency Tables''']], it was identified that #ID has to be added to the dataset. As such, #ID is added to the excel file, and "wiki4H_SurveyResultTable_Final_V2.xlsx" is created, to support the analysis.  The implementation steps as display below;
 
From [[ISSS608 2016-17 T1 Assign2 Ong Han Ying - Approach Data Presentation|'''6.2.4: Computing Survey Results - Using Frequency Tables''']], it was identified that #ID has to be added to the dataset. As such, #ID is added to the excel file, and "wiki4H_SurveyResultTable_Final_V2.xlsx" is created, to support the analysis.  The implementation steps as display below;
Line 107: Line 107:
 
[[File:OHY_Data_Prep_11.png |600px|centre|Pivoting in Tableau]]
 
[[File:OHY_Data_Prep_11.png |600px|centre|Pivoting in Tableau]]
  
===5.5.4: Joining Survey Results' Table and Survey Master's Table===
+
==5.5.4: Joining Survey Results' Table and Survey Master's Table==
  
 
From [[ISSS608 2016-17 T1 Assign2 Ong Han Ying - Approach Data Presentation|'''6.2.4: Computing Survey Results - Using Frequency Tables''']], it was also identified that the 2 tables of "Survey Results", and "Survey Master" has to be joined, so to support the analysis.  The implementation steps as completed in tableau, and as display below;
 
From [[ISSS608 2016-17 T1 Assign2 Ong Han Ying - Approach Data Presentation|'''6.2.4: Computing Survey Results - Using Frequency Tables''']], it was also identified that the 2 tables of "Survey Results", and "Survey Master" has to be joined, so to support the analysis.  The implementation steps as completed in tableau, and as display below;
Line 113: Line 113:
 
[[File:OHY_Data_Prep_12.png |500px|centre|Joining the Tables in Tableau ]]
 
[[File:OHY_Data_Prep_12.png |500px|centre|Joining the Tables in Tableau ]]
  
===5.5.5: Creating an "Indicator_Master" and Joining the Tables===
+
==5.5.5: Creating an "Indicator_Master" and Joining the Tables==
  
 
From [[ISSS608 2016-17 T1 Assign2 Ong Han Ying - Approach Data Presentation|'''6.2.4: Computing Survey Results - Using Frequency Tables''']], it was identified that the a new table of "Indicator_Master" has to be created, and then, to be joined to the Master Dataset, so to support the analysis.  The implementation steps as completed in tableau, and as display below;
 
From [[ISSS608 2016-17 T1 Assign2 Ong Han Ying - Approach Data Presentation|'''6.2.4: Computing Survey Results - Using Frequency Tables''']], it was identified that the a new table of "Indicator_Master" has to be created, and then, to be joined to the Master Dataset, so to support the analysis.  The implementation steps as completed in tableau, and as display below;
Line 123: Line 123:
 
[[File:OHY_Data_Prep_14.png |500px|centre|Joining the Tables in Tableau]]
 
[[File:OHY_Data_Prep_14.png |500px|centre|Joining the Tables in Tableau]]
  
===5.5.6: Research Paper's Hypothesis ===
+
==5.5.6: Research Paper's Hypothesis ==
  
 
From [[ISSS608 2016-17 T1 Assign2 Ong Han Ying - Approach Inital Layout|'''6.1.2: Objective 2 - To Identify Further Insights to the Conclusion of the Research Paper''']], it was identified that the summary of the hypothesis from the research paper is not available. As such, a new table of "Hypothese" is created in the excel database of "Survey_QN_Master" to support the analysis.  The summary of the table as below;  
 
From [[ISSS608 2016-17 T1 Assign2 Ong Han Ying - Approach Inital Layout|'''6.1.2: Objective 2 - To Identify Further Insights to the Conclusion of the Research Paper''']], it was identified that the summary of the hypothesis from the research paper is not available. As such, a new table of "Hypothese" is created in the excel database of "Survey_QN_Master" to support the analysis.  The summary of the table as below;  

Latest revision as of 12:03, 15 October 2016

OHY Header.jpg ISSS608 2016-17 Assignment 2 - Ong Han Ying

About

Executive Summary

Introduction

Data Preparation

Approach

Results

Conclusion

 

Initial Preparation Work

After Design Iteration

 
CONTENT

_5.5: Additional Data Identified from Design Iteration
_____5.5.1: Adding #ID to the Survey Result's Dataset
_____5.5.2: Joining Survey Results' Table and Survey Master's Table
_____5.5.3: Adding #ID to the Survey Result's Dataset
_____5.5.4: Joining Survey Results' Table and Survey Master's Table
_____5.5.5: Creating an "Indicator_Master" and Joining the Tables
_____5.5.6: Research Paper's Hypothesis

5.5: Additional Data Identified from Design Iteration

This section highlights the additional data preparation work that is identified in Section 6: Approach.

This is aligned with the overview of data preparation in Section 5.0: Overview where additional data may be required at the visualization & analysis stage.

5.5.1: Rephrasing the Survey Questions

With reference to 6.2.2: Analysing Categorical Data, the survey questions has been shortened, as below;

Rephrasing of Survey Questions

5.5.2: Creating Manual Binning

With reference to 6.2.3: Analysing Interval Data, manual binning are created in the dataset (through MS EXCEL's formula), as below;

Field Name Formula Output
AGE_Manual_Bin_1 =IF([AGE]="","",IF([AGE]<=30,"<=30",IF([AGE]<=40,"31-40",IF([AGE]<=50,"41-50",IF([AGE]<=60,"51-60",">60"))))) "<=30", "31-40", "41-50", "51-60", ">60"
AGE_Manual_Bin_2 =IF([AGE]="","",IF([AGE]<=30,"21-30",IF([AGE]<=40,"31-40",IF([AGE]<=50,"41-50",IF([AGE]<=60,"51-60","61-70"))))) "21-30", "31-40", "41-50", "51-60", "61-70"
YEARSEXP_Manual_binning_2 =IF([YEAREXP]="","",IF([YEAREXP]<=10,"0-10",IF([YEAREXP]<=20,"11-20",IF([YEAREXP]<=30,"21-30",IF([YEAREXP]<=40,"31-40","41-50"))))) "0-10", "11-20","21-30","31-40", "41-50"

5.5.3: Adding #ID to the Survey Result's Dataset

From 6.2.4: Computing Survey Results - Using Frequency Tables, it was identified that #ID has to be added to the dataset. As such, #ID is added to the excel file, and "wiki4H_SurveyResultTable_Final_V2.xlsx" is created, to support the analysis. The implementation steps as display below;

Step1 : Assigning Unique #ID into the dataset,

Adding #ID to Dataset

Step2 : Load the Dataset into Tableau, and do a "pivot", as below;

Pivoting in Tableau

5.5.4: Joining Survey Results' Table and Survey Master's Table

From 6.2.4: Computing Survey Results - Using Frequency Tables, it was also identified that the 2 tables of "Survey Results", and "Survey Master" has to be joined, so to support the analysis. The implementation steps as completed in tableau, and as display below;

Joining the Tables in Tableau

5.5.5: Creating an "Indicator_Master" and Joining the Tables

From 6.2.4: Computing Survey Results - Using Frequency Tables, it was identified that the a new table of "Indicator_Master" has to be created, and then, to be joined to the Master Dataset, so to support the analysis. The implementation steps as completed in tableau, and as display below;

Step 1: Creating a new worksheet of "Indicator_Master" in "Survey_Master.xlsx";

Indicator Master

Step 2: Joining "Indicator_Master" table, to the master table, as below;

Joining the Tables in Tableau

5.5.6: Research Paper's Hypothesis

From 6.1.2: Objective 2 - To Identify Further Insights to the Conclusion of the Research Paper, it was identified that the summary of the hypothesis from the research paper is not available. As such, a new table of "Hypothese" is created in the excel database of "Survey_QN_Master" to support the analysis. The summary of the table as below;

Field Name Description
Hypothesis Code Hypothesis Code, such as H1 etc
Description Description of the Hypothesis
Qn_Code Question Code

Next, this table is joined to the other table, as below;

Adding #ID to Dataset

Previous Sub-section: Initial Preparation Work


Previous Section: Introduction