Difference between revisions of "IS428 AY2019-20T2 Assign JIANG XI"

From Visual Analytics for Business Intelligence
Jump to navigation Jump to search
(Created page with "--<div style="margin-bottom:30px;">frameless|center|1000px</div> <!--Header--> <div style="width:100%; text-align:center;"> {|style="background-col...")
 
 
(27 intermediate revisions by the same user not shown)
Line 1: Line 1:
--<div style="margin-bottom:30px;">[[File:Library Image.jpg|frameless|center|1000px]]</div>
 
  
<!--Header-->
+
== Overview ==
<div style="width:100%; text-align:center;">
 
{|style="background-color:#432c67; padding: 10 0 10 0;" width="100%" cellspacing="0" cellpadding="0" valign="top"|
 
| style="padding:0.2em; font-weight: bold; font-size:100%; background-color:#432c67; text-align:center; color:#F5F5F5" width="10%" |
 
<font color="#F5F5F5" size=5 face="Helvetica">LIBRARY SURVEY 2018</font>
 
|}
 
</div>
 
<!--/Header-->
 
<div></div>
 
<br /><br />
 
==<div style="background:#432c67; padding: 15px; font-weight: bold; line-height: 0.3em; font-size:20px"><font color=#fbfcfd face="Helvetica"><center>PROBLEM & MOTIVATION</center></font></div>==
 
  
===<center>PROBLEM</center>===
+
Every two years, SMU Libraries conduct a comprehensive survey in which faculty, students and staff have the opportunity to rate various aspects of SMU library's services. The survey provides SMU libraries with input to help enhance existing services and to anticipate emerging needs of SMU faculty, students and staff.
<p>
 
</p>
 
  
===<center>MOTIVATION</center>===
+
== Objectives for the VA Assignment ==
<p>
+
To generate useful insights for the library management team to figure out the areas of improvements by analysing the survey feedback from the different user groups.
 +
#the undergraduate students,
 +
#the postgraduate students,
 +
#the faculty,
 +
#the staff.
  
</p>
+
== Dataset ==
 +
There are 3 files have been provided to the student namely:
 +
<br>
 +
1. SMU Library Survey Comments MAC.xls    >  The text archive for the feedbacks
  
==<div style="background:#432c67; padding: 15px; font-weight: bold; line-height: 0.3em;font-size:20px"><font color=#fbfcfd face="Helvetica"><center>DATA PREPARATION</center></font></div>==
+
2. Raw data 2018-03-07 SMU LCS data file - KLG.xlsx  > The raw data file with the legend  
===<center>MODIFICATIONS</center>===
 
<p>
 
<center>
 
{| class="wikitable" width="100%"
 
|-
 
! style="font-weight: bold;background: #899db3;color:#fbfcfd;width: 7%;" |Tool
 
! style="font-weight: bold;background: #899db3;color:#fbfcfd;width: 30%;" |Tasks and Modifications
 
|-
 
|
 
<center>
 
Microsoft Excel
 
</center>
 
|
 
<ol>
 
</ol>
 
|-
 
|
 
<center>
 
Tableau Prep
 
</center>
 
|
 
<ol>
 
<li> Pivoting the dataset so as to bring the different rows for each question to columns. This make analysis on Tableau easier. </li>
 
   
 
</ol>
 
|-
 
|}
 
</center>
 
</p>
 
  
===<center>TABLEAU PREP FLOW</center> <br>===
+
3. SMULibraries_BeHeardSurvey_FullReport.pdf > The sample result given by the external consultancy
<p>
+
<br>
<center>
 
 
</center>
 
</p>
 
  
===<center>DATA TRANSFORMATION</center>===
+
== Data Preparation==
<p>
 
<center>
 
{| class="wikitable" width="100%"
 
|-
 
! style="font-weight: bold;background: #899db3;color:#fbfcfd;width: 30%;" |Original Data
 
! style="font-weight: bold;background: #899db3;color:#fbfcfd;width: 30%;" |Processed Data
 
|-
 
|
 
 
|
 
 
|}
 
</center>
 
</p>
 
  
==<div style="background:#432c67; padding: 15px; font-weight: bold; line-height: 0.3em; font-size:20px"><font color=#fbfcfd face="Century Gothic"><center>STORYBOARD</center></font></div>==
+
===Remapping Values===
 +
Issue faced:
 +
The actual survey data has been encoded into the Enumerated type representations for better computing performance and smaller storage used.
 +
However, it has dramatically increased the difficulty for the us to interpret the raw dataset as we have to conduct additional data mapping procedure.
 +
<br>
  
===<center>SKETCH 1: Survey Overview Dashboard</center>===
+
[[File: RAW DATA.png|400px]]
<p>
+
<br>
+
Solution:  
</p>
+
Since the excel has provided multiple useful formulae for mapping, I then decided to use them to retrieve the encoded values from the legend.
===<center>SKETCH 2: Survey Dashboard 1</center>===
 
<p>
 
 
</p>
 
===<center>SKETCH 3: Survey Dashboard 2</center>===
 
<p>
 
 
</p>
 
  
==<div style="background:#432c67; padding: 15px; font-weight: bold; line-height: 0.3em; font-size:20px"><font color=#fbfcfd face="Century Gothic"><center>KEY TECHNICAL CHALLENGES & MITIGATION</center></font></div>==
+
<br>
 +
====-<b>Step 1. Mapping for question headers</b>====
 +
Used Vloop function to match the Code and its question
  
{| class="wikitable" width="100%"
+
Example for Campus-Which Library do you use more?
|-
+
 
! style="font-weight: bold;background: #899db3;color:#fbfcfd;width: 5%;" |No.
+
[[File:Which Library do you use more.png|400px]]
! style="font-weight: bold;background: #899db3;color:#fbfcfd;width: 15%;" |Challenge
+
 
! style="font-weight: bold;background: #899db3;color:#fbfcfd;width: 40%;" |Description
+
=IFERROR(VLOOKUP(SMU!B1,Legend!$A:$B,2,FALSE),B1)
! style="font-weight: bold;background: #899db3;color:#fbfcfd;width: 40%;" |Mitigation
+
<br>
|-
+
 
| 1.
+
====-<b>Step 2. Mapping for question answers</b>====
| Lack of Familiarity with Tools
+
 
| Everyone in the group do not know how to program in RShiny for visualisation
+
Used Hloop function to match the Code and its answers of the following question.
| We will learn Rshiny during class, call for consultation and rely on Googling for any programming challenges. Alternatively, there is also Datacamp available for us.
+
 
|-
+
Position (Select ONE that best describes you)
| 2.
+
What is your major area of study, research or teaching? (Select ONE that best describes you)
| Viability of Ideas
+
Are you an international (non-exchange) student?
| We do not know if the current dataset is sufficient in providing all the information needed to conduct analysis and building of planned visualizations.
+
How frequently do you visit the library?
| There are multiple dataset online to use and we can use Prof Kam's REALIS dataset provided to us to supplement our dataset if we are lacking of certain variables. We could also derive our own variables based on the current dataset if needed (e.g. Geocoding).
+
How frequently do you visit the Campus?
|-
+
How often do you access library resources (e.g. online articles, databases, ebooks)?
| 3.
+
<br>
| Lack of Domain Knowledge
+
Example :
| HDB resale prices are affected by a spectrum of different factors such as policy measures and redevelopment. It is hard for us to understand without domain knowledge.
+
=HLOOKUP(BM4,Legend!$1:$10,6)
| Learn from informative websites such as from HDB and iteratively discover and learn insights into the dataset
+
<br>
|}
+
[[File:Hlookup.png|400px]]
</center>
+
<br>
 +
 
 +
===Invalid Values===
 +
====<b>-Step 3. Indentifying Invalid inputs</b>====
 +
Use the 2 questions' response to create an additional column called "Valid" to check whether it is a usable data.
 +
 
 +
Which Library do you use more?
 +
Position (Select ONE that best describes you)
 +
 
 +
[[File:INVALID.png|1400px]]
 +
 
 +
True : Valid
 +
False: Invalid
 +
 
 +
Formula: =AND(B4>0,D4>0)
 +
 
 +
====<b>-Step 4. Removing NA response </b>====
 +
Use excel to drop the Columns for NA-01 to 26.
 +
 
 +
 
 +
Completed excel file:
 +
[https://drive.google.com/file/d/12Nm0v64SLI2TlJULAiA5VVk7qGgY1-jo/view?usp=sharing Prepared data file]
 +
 
 +
== Data visualization ==
 +
[https://public.tableau.com/profile/jiangxi4511#!/vizhome/JIANGXI/Dashboard1?publish=yes Link]
 +
 
 +
==Insights summary==
 +
 
 +
#Overall
 +
#the undergraduate students,
 +
#the postgraduate students,
 +
#the faculty,
 +
#the staff.

Latest revision as of 00:10, 16 March 2020

Overview

Every two years, SMU Libraries conduct a comprehensive survey in which faculty, students and staff have the opportunity to rate various aspects of SMU library's services. The survey provides SMU libraries with input to help enhance existing services and to anticipate emerging needs of SMU faculty, students and staff.

Objectives for the VA Assignment

To generate useful insights for the library management team to figure out the areas of improvements by analysing the survey feedback from the different user groups.

  1. the undergraduate students,
  2. the postgraduate students,
  3. the faculty,
  4. the staff.

Dataset

There are 3 files have been provided to the student namely:
1. SMU Library Survey Comments MAC.xls > The text archive for the feedbacks

2. Raw data 2018-03-07 SMU LCS data file - KLG.xlsx > The raw data file with the legend

3. SMULibraries_BeHeardSurvey_FullReport.pdf > The sample result given by the external consultancy

Data Preparation

Remapping Values

Issue faced: The actual survey data has been encoded into the Enumerated type representations for better computing performance and smaller storage used. However, it has dramatically increased the difficulty for the us to interpret the raw dataset as we have to conduct additional data mapping procedure.

RAW DATA.png
Solution: Since the excel has provided multiple useful formulae for mapping, I then decided to use them to retrieve the encoded values from the legend.


-Step 1. Mapping for question headers

Used Vloop function to match the Code and its question

Example for Campus-Which Library do you use more?

Which Library do you use more.png

=IFERROR(VLOOKUP(SMU!B1,Legend!$A:$B,2,FALSE),B1)

-Step 2. Mapping for question answers

Used Hloop function to match the Code and its answers of the following question.

Position (Select ONE that best describes you) What is your major area of study, research or teaching? (Select ONE that best describes you) Are you an international (non-exchange) student? How frequently do you visit the library? How frequently do you visit the Campus? How often do you access library resources (e.g. online articles, databases, ebooks)?
Example : =HLOOKUP(BM4,Legend!$1:$10,6)
Hlookup.png

Invalid Values

-Step 3. Indentifying Invalid inputs

Use the 2 questions' response to create an additional column called "Valid" to check whether it is a usable data.

Which Library do you use more? Position (Select ONE that best describes you)

INVALID.png

True : Valid False: Invalid

Formula: =AND(B4>0,D4>0)

-Step 4. Removing NA response

Use excel to drop the Columns for NA-01 to 26.


Completed excel file: Prepared data file

Data visualization

Link

Insights summary

  1. Overall
  2. the undergraduate students,
  3. the postgraduate students,
  4. the faculty,
  5. the staff.