IS428 AY2019-20T2 Assign DAVID CHOW JING SHAN

From Visual Analytics for Business Intelligence
Assignment 3 - To be a Visual Detective: SMU libraries

Background

Every two years, SMU libraries conduct a comprehensive survey to gather feedback from faculty, students, and staff on the libraries' services. With the data collected, SMU libraries hope to gain insights that help them enhance their existing services and meet the emerging needs of SMU faculty, students, and staff.

However, past reports could not give SMU libraries a comprehensive yet condensed summary, as they are mainly made up of pages of tables. With so much information, it is difficult for SMU libraries to get the overview they need to draw useful insights.

Objectives

With such a massive amount of data being collected, it is important to summarize the key areas and provide proper visualizations for them. Hence, using the visualization and analytical techniques that I have learned in class, I aim to create an interactive data visualization that helps SMU libraries better understand their service feedback so that they can make the necessary improvements to satisfy the needs of their respective users.

With this interactive visualization, SMU libraries would be able to easily gain insights from the feedback given by the following stakeholders:

  • undergraduate students
  • postgraduate students
  • faculty
  • staff


Data and Visualization

About Data

As with any visualization process, it is vital to look through and understand the given data (the 2018 Survey dataset) before using any visualization software. I will be using Tableau as my visualization software, and there is some work to be done on the given datasets before I plot my visualizations. The given data consists of two main files: one contains the tabular data, while the other contains the feedback comments.

For the tabular data, I plan to separate them into 'identifier questions' and 'feedback questions'.

'Identifier questions' are mainly categorical questions that are used to categorize the respondents. The important questions include:

  1. StudyArea - Area of study/research/teaching
  2. Position - Years of study/Occupation
  3. ID - local or international student
  4. HowOftenL - Frequency of library visits
  5. HowOftenC - Frequency of campus visits
  6. HowOftenW - Frequency of accessing library resources


'Feedback questions' are the main feedback from respondents. The data collected are mostly on a Likert scale (from 1, the lowest, to 7, the highest). Most of these questions are repeated three times for different purposes: Importance (I), Performance (P) and NA. I and NA share the same 26 questions, while P has one extra question, "Overall how satisfied are you with the Library?". With some references from the past reports, I will also segment the questions into four main categories:

  1. Information Resources
  2. Facilities & Equipment
  3. Communication
  4. Service Delivery

This will help me to provide a more organized visualization for the user. But before moving on to making visualizations, I will need to do some pre-processing of the data.


Data Preparation

Screenshot
Steps Taken
Delete invalid.JPG

Upon initial inspection of the tabular dataset ('Raw data 2018-03-07 SMU LCS data file - KLG.xlsx'), I found one invalid row: ResponseID 833. It is invalid because, apart from having many empty fields, it has no input under the 'Position' field. This makes it impossible to categorize the response properly, so I removed it from the dataset by deleting that row.
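The row removal above was done by hand in Excel. As a rough illustration only, the same rule (drop any response with a blank 'Position') could be sketched in Python. The sample rows and the helper name `drop_uncategorizable` are hypothetical; only the field names ResponseID and Position come from the dataset description.

```python
# Hypothetical sample rows; only the field names come from the write-up.
rows = [
    {"ResponseID": 832, "Position": "Undergraduate Year 2"},
    {"ResponseID": 833, "Position": ""},  # blank Position, cannot be categorized
    {"ResponseID": 834, "Position": "Faculty"},
]


def drop_uncategorizable(rows, key="Position"):
    """Keep only rows whose `key` field is present and non-blank."""
    return [r for r in rows if str(r.get(key) or "").strip()]


clean_rows = drop_uncategorizable(rows)
print([r["ResponseID"] for r in clean_rows])
```

Filtering on a rule rather than on a hard-coded ResponseID would also catch any similar rows in future survey rounds.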

Rename field.JPG

Firstly, I copied the field/question names (I1 to I26) from the 'Legend' tab and used them to rename the corresponding question IDs in the 'SMU' tab, using the transpose paste option in Excel. I then did the same for the 'Performance' fields (P1 to P27) and saved the file. Secondly, I split every word in the Comment column into its own column in Excel.
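The renaming step above amounts to a lookup from question ID to question text. A minimal sketch, assuming the legend has been read into a dictionary; the legend entries and the helper `rename_fields` below are placeholders, not the actual survey wording:

```python
# Placeholder legend entries; the real question text lives in the 'Legend' tab.
legend = {
    "I1": "Importance: example question A",
    "P1": "Performance: example question A",
}


def rename_fields(row, legend):
    """Return a copy of `row` with question IDs replaced by their legend text.

    Fields without a legend entry (e.g. ResponseID) keep their original name.
    """
    return {legend.get(name, name): value for name, value in row.items()}


row = {"ResponseID": 1, "I1": 6, "P1": 5}
renamed = rename_fields(row, legend)
print(renamed)
```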

Data pivot.png

Then, I used Tableau Prep Builder to pivot all the comment words and saved the output file for use in Tableau. Lastly, in Tableau, I opened the 'SMU' sheet from the newly saved file, pivoted all the question columns, and renamed the pivoted fields 'Survey questions' and their respective answers 'Survey responses'.
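For readers unfamiliar with pivoting in Tableau, the operation turns one wide row per respondent into one long row per respondent and question. The sketch below mimics that reshaping in plain Python; it is not what Tableau runs internally, and the sample row and the helper `pivot_long` are hypothetical.

```python
def pivot_long(rows, id_fields,
               name_col="Survey questions", value_col="Survey responses"):
    """Reshape wide rows into long rows: one output row per non-ID field."""
    long_rows = []
    for row in rows:
        ids = {field: row[field] for field in id_fields}
        for field, value in row.items():
            if field in id_fields:
                continue
            long_rows.append({**ids, name_col: field, value_col: value})
    return long_rows


# Hypothetical wide row with two question columns.
wide = [{"ResponseID": 1, "Position": "Faculty", "I1": 6, "P1": 5}]
long_rows = pivot_long(wide, id_fields=("ResponseID", "Position"))
for r in long_rows:
    print(r)
```

The long layout is what lets a single 'Survey questions' field drive filters and chart rows for every question at once.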

Word filter.JPG

Before building the visualizations that use text data, I used a Tableau filter to exclude common stop words and punctuation.
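The word splitting and stop-word filtering were done in Excel and Tableau. An equivalent, purely illustrative version in Python could look like the following; the stop-word list and the sample comment are assumptions, since the actual list used in the workbook is not given.

```python
import string

# A tiny illustrative stop-word list; the list used in Tableau is not documented.
STOP_WORDS = {"the", "a", "an", "and", "but", "to", "is", "are", "of", "in", "for"}


def tokenize(comment):
    """Split a comment into lower-case words, dropping punctuation and stop words."""
    words = []
    for raw in comment.split():
        word = raw.strip(string.punctuation).lower()
        if word and word not in STOP_WORDS:
            words.append(word)
    return words


# Hypothetical comment, not taken from the survey data.
print(tokenize("The library is great, but seats are limited!"))
```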


Interactive Visualization

The interactive visualization can be accessed here: https://public.tableau.com/profile/david.chow.jing.shan#!/vizhome/VA_assignment_15838961627190/HomePage

Home Dashboard

The following shows the home dashboard: New homepage.jpg

A homepage is created with 5 tabs/dashboards: Home, Overview, Importance & Performance, Common words and Comment segment. The tabs are ordered to provide a smooth and logical flow for the user when he/she is reviewing the survey feedback.

The basic flow:

  1. After viewing the Home page, the user can look at the Overview page to get a rough idea of the distribution of the survey respondents. Concurrently, he/she will be able to see the overall satisfaction of the respondents.
  2. Then the user would most likely want to find out which aspects of the libraries are the most important for these respondents. He/she will be able to view it on the Importance and Performance dashboard.
  3. After finding out which features/aspects matter most to these respondents, the user can see how well the libraries performed on them from the same dashboard.
  4. Lastly, to get a general sense of what most respondents are saying in their comments, the user can find out in the Common words page.

To read the full comments(segmented by the user type), the user can visit the Comment segment.


Alternatively, the user can hover over each tab button to get a summary tooltip and select a dashboard that he/she is interested in.

Overview

The following shows the Overview/demographic dashboard:

New overview.JPG

The horizontal bar chart visualizes the proportion of survey respondents. It is sorted in descending order, with the biggest participating group at the top. Hovering over any bar on the left brings up a tooltip showing the library and resource usage for the target group. In the library usage stacked bars, two distinct colors segment the two libraries, so the user will be able to tell which library is used more frequently by the target group.

On the right-hand side, the summary of overall satisfaction and the overall likelihood of recommending library services to other students are shown. For more specific information about a particular group, simply click on any of the respondent bars on the left. This will cause both the satisfaction and recommendation likelihood bar charts on the right to change according to the selected target group.

Importance & Performance

The following shows the dashboard for Importance & Performance:

New importance.jpg

Based on the given dataset, I used diverging bar charts to display the importance and performance data, because the data is on a Likert scale and the diverging bar chart is very effective for visualizing this kind of data. The Importance and Performance charts are placed side by side for easier comparison and reference. To focus on a specific category, a category filter (Communication, Facilities & Equipment, Information Resources and Service Delivery) is also available.

To assist with the investigation, a Position group filter is available. It consists of Undergraduates, Postgraduates, Exchange students, Faculty, Staff, and Others. By default, all the sub-groups are selected. However, the user can refine the visualization with a Position filter that narrows down to a specific sub-group (e.g. Year 1, 2, 3 or 4 students under Undergraduates).

The neutral score of the diverging bar charts defaults to 4. However, it is customizable with the neutral score filter. The reference line will shift according to this filter to suit the user's needs.
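To make the role of the neutral score concrete, here is a minimal sketch of how the bar segments of a diverging chart can be positioned. It assumes one common convention, in which the neutral segment straddles the zero line; the Tableau workbook may place it differently. The function name `diverging_segments` and the sample responses are hypothetical.

```python
from collections import Counter


def diverging_segments(responses, neutral=4, low=1, high=7):
    """Return {score: (start, end)} spans in percent, centred on `neutral`.

    Scores below `neutral` fall left of zero, scores above fall right of zero,
    and the neutral segment is split evenly across the zero line (an assumption).
    """
    total = len(responses)
    if total == 0:
        raise ValueError("no responses to plot")
    counts = Counter(responses)
    pct = {s: 100.0 * counts.get(s, 0) / total for s in range(low, high + 1)}
    # Left edge of the whole bar: everything below neutral plus half of neutral.
    start = -(sum(pct[s] for s in range(low, neutral)) + pct[neutral] / 2)
    spans = {}
    for score in range(low, high + 1):
        spans[score] = (start, start + pct[score])
        start += pct[score]
    return spans


spans = diverging_segments([1, 4, 4, 7])
print(spans[1], spans[4], spans[7])
```

Changing `neutral` shifts which scores fall on each side of zero, which is the effect the neutral score filter has on the reference line.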

Common words

The following shows the Common words dashboard: New common words.jpg

For a quick glimpse of the most commonly used words in the comment section, I have used a word cloud to visualize them. I have also added a simple complementary bar chart so that the user can compare how often the common words are used. This gives the user a rough idea of what most respondents would like to say about the SMU libraries.

There is also a filter to alter the number of top-mentioned words. This can either help the user to expand or narrow his/her focus. Additionally, there is also a Position filter for the user to explore words commonly used by a specific group of respondents.
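The top-mentioned-words filter corresponds to ranking words by frequency and keeping the first N. A small sketch using Python's `collections.Counter`; the word list and its counts are made up for illustration and the helper `top_words` is hypothetical.

```python
from collections import Counter


def top_words(words, n):
    """Return the `n` most frequent words as (word, count) pairs."""
    return Counter(words).most_common(n)


# Illustrative words only; the counts do not reflect the survey data.
words = ["study", "seat", "study", "table", "seat", "study"]
print(top_words(words, 2))
```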

Comment segment

The following shows the Comment segment dashboard: New comments.jpg

This is a simple list that displays all the comments in the given data. There are filters available for the user to customize and narrow down his/her reading. This dashboard serves as a follow-up for the Common words dashboard. It provides a convenient way for the user to read all the comments in a more organized/structured manner.


Insights

Interesting insights

According to the given data, Undergraduate Year 2 has the highest participation. This is followed by Undergraduate Year 1, 4 and 3.

Overall Satisfaction

Generally, the SMU libraries are quite well-received as the overall satisfaction with the library is mostly positive, with over 50% voting for 6 and 7 in terms of the level of satisfaction (1 being the lowest, while 7 is the highest). Similarly, most students would also be very likely to recommend library services to other students, with over 80% of the respondents voting 7 and above (0 being the lowest, while 10 is the highest) in terms of the likelihood of a recommendation.


Overview insight.jpg



Importance

At first glance, the area that garnered the most 'Very High' importance votes is the ability to get wireless access whenever needed (Facilities & Equipment). Meanwhile, the lowest importance goes to the availability of computers (Facilities & Equipment) and library workshops (Service Delivery).

For undergraduates, postgraduates and staff alike, the ability to get wireless access whenever needed has the highest number of 'Very High' importance votes. This means that wireless access is the most important area for each of these target groups.


Importance under.jpg Importance post.jpg Importance staff.jpg


After wireless access, the second most important area is the ability to find a quiet place in the library to study(Facilities & Equipment) for undergraduates, postgraduates and staff.

Meanwhile, most faculty members value Useful Online Resources (Information Resources) over the other areas, as it has the most 'Very High' importance votes. Their second most important area is a library search engine that enables users to find relevant library resources quickly.

Importance faculty.jpg

Performance

In terms of performance, wireless access also did very well according to most respondents as it has the most votes in 'Very High' performance for the respective groups that value it the most(Undergraduates, Postgraduates and Staff).


Performance under.jpg Performance post.jpg Performance staff.jpg

However, for the ability to find a quiet place to study, perceived performance is significantly lower than importance: far fewer respondents voted 'Very High' for performance (only 16.5% of undergraduates and 30.7% of postgraduates) than for importance (70% of undergraduates and 69.3% of postgraduates). Thus, it may be worthwhile to look into creating a quieter and more conducive environment for the users, for example by sound-proofing the meeting rooms.
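The size of this gap can be worked out directly from the 'Very High' shares quoted above. The short calculation below uses only those four percentages; the variable names are illustrative.

```python
# 'Very High' vote shares for "quiet place to study", in percent, as quoted above.
importance = {"Undergraduates": 70.0, "Postgraduates": 69.3}
performance = {"Undergraduates": 16.5, "Postgraduates": 30.7}

# Gap in percentage points between importance and perceived performance.
gaps = {group: round(importance[group] - performance[group], 1)
        for group in importance}
print(gaps)  # {'Undergraduates': 53.5, 'Postgraduates': 38.6}
```

The gap is about 53.5 percentage points for undergraduates and 38.6 for postgraduates, so the shortfall is most pronounced among undergraduates.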


For most faculty members, Useful Online Resources did not receive the highest number of 'Very High' performance votes. Instead, prompt delivery of requested books and articles (Service Delivery) received the most 'Very High' performance votes, followed by the ability to get help from library staff (Service Delivery) and wireless access (Facilities & Equipment). As Useful Online Resources is rated the most important by most faculty members, it may be advisable for SMU libraries to improve the availability and variety of online resources that are useful for faculty members. Improving the library search engine is also recommended.


Performance faculty.jpg

Common words

The most common words used in the given comments are 'study', 'seat' and 'table'. Among undergraduates, such as Year 2 students, the word 'hogging' is also fairly prevalent. SMU libraries should consider adding more chairs and tables and introducing policies to deal with the hogging issue.

Year2 common words.jpg

Reference

Likert Scale Tableau: https://www.youtube.com/watch?v=JodWmiIxl2c

Word cloud Tableau: https://www.youtube.com/watch?v=_dh0OipfrYI