IS428 AY2019-20T2 Assign PEH ANQI

From Visual Analytics for Business Intelligence
SMU Libraries


Overview

Singapore Management University (SMU) has two libraries, the Li Ka Shing Library and the Kwa Geok Choo Law Library. Every two years, the libraries conduct a comprehensive survey in which faculty, students and staff rate various aspects of the libraries' services. The survey gives SMU Libraries input to help enhance existing services and anticipate the emerging needs of SMU faculty, students and staff.

Past survey reports consisted mainly of pages of tables, which are very difficult to comprehend. Hence, the task is to create an interactive data visualisation that transforms these tables into visual representations, allowing SMU Libraries to gain useful insights and revealing the level of service provided by SMU Libraries as perceived by the following stakeholders:

  • Faculty
  • Undergraduate students
  • Postgraduate students
  • Staff


Data

Data Preparation

About the Data

The 2018 library survey data is used for this assignment. Two Excel files were provided: Raw data 2018-03-07 SMU LCS data file - KLG.xlsx and 2018-02-16 SMU Library Survey Comments MAC.xls. The 2018-02-16 SMU Library Survey Comments MAC.xls file consists of the comments provided by each respondent, which are also available in the Raw data 2018-03-07 SMU LCS data file - KLG.xlsx file. Hence, I will be using only the Raw data 2018-03-07 SMU LCS data file - KLG.xlsx file.

A total of 2,639 responses were collected from the 2018 survey. However, one respondent, ID 833, answered only 4 questions out of a total of 87 and did not provide any information on position, campus or study area. Hence, respondent 833's input is omitted from the visualization.
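The exclusion rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the procedure actually used (the removal was done in Excel); the toy rows and the exact column names are assumptions.

```python
# Toy rows standing in for the survey sheet. The values and the exact
# column names are assumptions made for illustration only.
rows = [
    {"ResponseID": 832, "Campus": 1, "Position": 2, "StudyArea": 3},
    {"ResponseID": 833, "Campus": None, "Position": None, "StudyArea": None},
    {"ResponseID": 834, "Campus": 2, "Position": 1, "StudyArea": 5},
]

# Attributes that identify which stakeholder group a respondent belongs to.
ID_FIELDS = ("Campus", "Position", "StudyArea")


def has_identity(row):
    """Return True when at least one identification attribute is filled in."""
    return any(row[field] is not None for field in ID_FIELDS)


# Keep only respondents that can be assigned to a stakeholder group.
kept = [row for row in rows if has_identity(row)]
kept_ids = [row["ResponseID"] for row in kept]
print(kept_ids)
```

Filtering on the identification attributes, rather than hard-coding the ID, would also catch any other respondent with the same problem.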

The dataset contains 88 columns. 7 of them provide basic information about the respondents, such as their position, study area and how frequently they visit the library. 2 of them cover the respondents' satisfaction level and likelihood of recommending the library to others. 78 of them are Likert scale questions in which respondents rate the different services provided by the library, both by how important each service is and by how well the library performs in providing it. Lastly, 1 of them is a free-text field containing additional comments from the respondents.

Data cleaning in Excel

Data Cleaning
Steps Taken
1. Remove respondent 833
PAQ Pre-process 1.png
The respondent with responseID 833 is removed because the identification attributes (Campus, Position, StudyArea) are blank and the respondent answered only 4 out of the 88 questions. Hence, I decided to omit this respondent from the visualization.
2. Replace the numerical values in the Campus, Position, StudyArea and ID columns
PAQ Processing 2.png
The values in the Excel file are stored as numerical codes, making it difficult to interpret the data without referring to the legend. Hence, I used the “LOOKUP” function in Excel to retrieve the values for the Campus, Position, StudyArea and ID fields from the legend worksheet.
3. Replace the column names for the Likert scale questions
PAQ Preprocessing 3.png
The column names for the Likert scale questions on importance and performance are codes, e.g. I01, P01, NA01. This makes it difficult to know what each column represents. Hence, I used the "HLOOKUP" function in Excel to replace each column code with its column name from the legend worksheet.
4. Pre-processing Comments
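The LOOKUP and HLOOKUP steps above both replace a code with a label taken from a legend. A minimal Python sketch of the same idea follows; the legend entries, the question labels and the sample row are invented for illustration, since the real legend worksheet is not reproduced here.

```python
# Hypothetical value legend: numeric code -> label (not the real legend).
position_legend = {1: "Undergraduate", 2: "Postgraduate", 3: "Faculty", 4: "Staff"}

# Hypothetical column legend: question code -> readable column name.
question_legend = {"I01": "Importance: Q1", "P01": "Performance: Q1"}

# One made-up respondent row using coded values and coded column names.
row = {"ResponseID": 1, "Position": 1, "I01": 6, "P01": 5}


def decode_row(row, value_legends, column_legend):
    """Swap coded values and coded column names for their legend labels."""
    decoded = {}
    for column, value in row.items():
        legend = value_legends.get(column)
        if legend is not None:
            # Like LOOKUP: translate the cell value, keep it if no match.
            value = legend.get(value, value)
        # Like HLOOKUP on the header row: translate the column name.
        decoded[column_legend.get(column, column)] = value
    return decoded


decoded = decode_row(row, {"Position": position_legend}, question_legend)
print(decoded)
```

Codes that are missing from a legend are left unchanged, so gaps in the legend stay visible instead of silently becoming blanks.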

Data cleaning using Python for comments

Comments provided by the respondents are free text, which is difficult to work with in Tableau. To provide an overview of the respondents' main concerns, I decided to pre-process the comments using the Python NLTK package, removing commonly used words and punctuation, and then build a word cloud in Tableau to show the dominant words mentioned.

I moved the comment column out of the main Excel file into a separate Excel file containing the comments, respondent ID and position. Using Python, I then carried out stop word removal and lemmatization.

Step 1. Remove empty rows
Quite a number of rows have empty comments, so I started by removing the empty rows from the dataset.

PAQ pre-processing4.png
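Since the code itself is only available as a screenshot, here is a minimal plain-Python sketch of what removing empty comment rows can look like. The records and field names are made up, and treating whitespace-only comments as empty is an assumption.

```python
# Made-up comment records; the field names are assumptions.
comments = [
    {"ID": 1, "Position": "Undergraduate", "Comment": "More seats please!"},
    {"ID": 2, "Position": "Staff", "Comment": None},
    {"ID": 3, "Position": "Faculty", "Comment": "   "},
]


def has_comment(record):
    """True when the comment is a string with at least one visible character."""
    text = record["Comment"]
    return isinstance(text, str) and text.strip() != ""


# Drop rows whose comment is missing or blank.
non_empty = [record for record in comments if has_comment(record)]
print([record["ID"] for record in non_empty])
```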

Step 2. Convert text to lowercase and remove punctuation

Text process 2.png
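As a rough illustration of this step (the original code is only shown as an image), lowercasing and punctuation removal can be done with the standard library alone. The function name `normalise` is hypothetical.

```python
import string

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def normalise(text):
    """Lowercase the text and strip ASCII punctuation."""
    return text.lower().translate(_PUNCT_TABLE)


cleaned = normalise("More SEATS, please!")
print(cleaned)
```

One caveat of deleting punctuation outright is that words joined by a slash or hyphen are merged into one token; replacing punctuation with a space is an alternative if that matters.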


Step 3. Lemmatizing
People phrase things differently, and depending on the sentence structure the same word can appear in several inflected forms. For example, "seat" and "seats" share the same root word, "seat". Since these forms carry the same meaning, I would like them to be counted together when plotting the word cloud. Lemmatizing is one method of converting words back to their root form. Hence, I lemmatized each word in the comments.

Text process3.png
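To make the idea concrete without requiring any downloads, the sketch below uses a tiny hand-written lookup table as a stand-in for a real lemmatizer. The table and the function name are assumptions for illustration; with NLTK installed and its WordNet corpus downloaded, `WordNetLemmatizer().lemmatize(word)` could take the place of the dictionary lookup.

```python
# Toy lemma table standing in for a real lemmatizer such as NLTK's
# WordNetLemmatizer. The entries are illustrative, not exhaustive.
LEMMAS = {"seats": "seat", "books": "book", "tables": "table", "hours": "hour"}


def lemmatize_words(text):
    """Replace each whitespace-separated word with its lemma, if known."""
    return " ".join(LEMMAS.get(word, word) for word in text.split())


print(lemmatize_words("more seats and tables"))
```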


Step 4. Stop words removal
Stop words are words commonly used in sentences, such as "I", "you" and "are". They occur frequently but carry little meaning on their own. Hence, I removed these words.

Text process 4.png
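A minimal sketch of stop word removal is shown below. The short stop word set is an assumption chosen to keep the example self-contained; NLTK's `stopwords.words("english")` list could be used instead once its corpus has been downloaded.

```python
# Small illustrative stop word set; a real run would use a fuller list.
STOP_WORDS = {"i", "you", "are", "the", "a", "is", "to", "and"}


def remove_stop_words(text):
    """Drop stop words from an already lowercased, punctuation-free string."""
    return " ".join(word for word in text.split() if word not in STOP_WORDS)


print(remove_stop_words("i think the seats are limited"))
```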

Below is a sample of how a comment changes across the different steps of text processing.

After text processing.png

After the pre-processing, I exported the data back to an Excel file and used the "Text to Columns" function to split each processed comment so that every word sits in its own column.
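The word split was done in Excel, but the same result, and the word frequencies a word cloud is built on, can be sketched in Python. The processed comments below are made up for illustration.

```python
from collections import Counter

# Made-up examples of comments after the text processing steps.
processed = ["seat study space", "quiet study area", "seat hogging"]

# One entry per word, the equivalent of one word per cell.
words = [word for comment in processed for word in comment.split()]

# Word frequencies, which is what a word cloud sizes its words by.
counts = Counter(words)
print(counts.most_common(3))
```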

Data Preparation in Tableau

After pre-processing the data, I loaded both Excel files into Tableau. The columns holding the Likert scale questions on importance and performance are not in a format Tableau can work with directly: the data has to be reshaped into one column holding the question and one column holding the rating. Therefore, I pivoted the question columns.

PAQ Process.png
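The pivot was done inside Tableau, but the wide-to-long reshaping it performs can be illustrated in plain Python. The rows, the `ID` field and the output names `Question` and `Rating` are assumptions for this sketch.

```python
# Made-up wide rows: one column per question code.
wide = [
    {"ID": 1, "I01": 7, "P01": 5},
    {"ID": 2, "I01": 6, "P01": 6},
]


def pivot_long(rows, id_field="ID"):
    """Turn one row per respondent into one row per respondent and question."""
    long_rows = []
    for row in rows:
        for column, rating in row.items():
            if column == id_field:
                continue  # the identifier is repeated on every output row
            long_rows.append(
                {id_field: row[id_field], "Question": column, "Rating": rating}
            )
    return long_rows


long_rows = pivot_long(wide)
for entry in long_rows:
    print(entry)
```

In pandas, `DataFrame.melt` with `id_vars`, `var_name` and `value_name` performs the same reshaping.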

Visualization

The dashboard created can be found here: https://public.tableau.com/profile/peh.anqi#!/vizhome/IS428_Assignment_PehAnqi/Home

A home page is used to provide readers with information on what the visualization is about and buttons for them to navigate into the different dashboards.

Respondent Demographics

PAQ Respondent Demographics 2.png

This dashboard shows a breakdown of the respondents' demographics. Users can click on any of the charts to filter the rest of the dashboard and click the "Reset All Filters" button to reset the filters.

Overall Satisfaction Level


This dashboard shows the respondents' overall satisfaction level and their likelihood of recommending the library to others. Readers can hover over the charts to view the distribution of respondents by the library they use most often.

Importance of Service and Service Performance perceived by Respondents


This dashboard shows a breakdown of the respondents' satisfaction with the different services provided by the library. The services fall under 4 main service groups: Communication, Facilities & Equipment, Information Resources and Service Delivery. Besides rating SMU Libraries' performance, the respondents also indicated how important each service is to them. Ideally, the library performs well in the aspects that a large number of respondents deem important.

Top words mentioned


This dashboard highlights the dominant words mentioned in the comments provided by respondents. Filtering the word cloud by Respondent Position and Position Group helps readers get a quick overview of the top concerns of most respondents. In the overall word cloud, Study, Seat, Space, Hogging and Quiet Environment are mentioned frequently and are areas the library can look into.

Comments

PAQ Comments.png

Once readers understand the dominant words mentioned in the comments, this dashboard provides context by showing the comments that contain those words. If readers are interested in a certain keyword, for example "hogging" (highlighted in the word cloud previously), they can type "hogging" under "Free Text Search" to show only the comments that contain the word. Readers can also filter the comments by Position and Position Group.
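The behaviour of such a keyword filter can be approximated in Python as below. This is a sketch of the general idea only; whether the Tableau filter is case-insensitive or matches partial words is an assumption, and the sample comments are invented.

```python
# Invented sample comments for illustration.
comments = [
    "Too much seat hogging during exams.",
    "Love the quiet study area.",
    "Hogging of tables is common.",
]


def search_comments(comments, keyword):
    """Return the comments containing the keyword, ignoring case."""
    needle = keyword.lower()
    return [comment for comment in comments if needle in comment.lower()]


hits = search_comments(comments, "hogging")
print(hits)
```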

Insights

Undergraduate students

Overall, undergraduates are the largest group of respondents, making up 79.95% of those who participated in this survey. Most of them visit the library weekly, except for students from the School of Economics and the School of Law, the majority of whom visit the library daily.

Overall Satisfaction Level
Finding 1:
The majority of undergraduates are satisfied with the service provided: 93.89% of respondents gave a satisfaction level of 5 and above (out of 7), and 86.94% gave 7 and above (out of 10) for their likelihood of recommending the libraries to others.

PAQ Undergrad satisfaction.png

Between the two libraries (LKS Library and KGC Law Library), respondents are slightly more satisfied with the KGC Law Library. Only 4.79% of KGC Law Library respondents gave a satisfaction level of 4 and below, compared with 6.56% for the LKS Library.

Between 2 lib.png
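The percentages quoted in this finding are shares of respondents at or above, or at or below, a rating threshold. A small sketch of that arithmetic follows; the ratings list is made-up toy data, not survey data, and the function names are hypothetical.

```python
def share_at_least(ratings, threshold):
    """Percentage of ratings that are greater than or equal to the threshold."""
    return 100 * sum(rating >= threshold for rating in ratings) / len(ratings)


def share_at_most(ratings, threshold):
    """Percentage of ratings that are less than or equal to the threshold."""
    return 100 * sum(rating <= threshold for rating in ratings) / len(ratings)


# Toy 7-point satisfaction ratings, invented for illustration only.
toy_ratings = [7, 6, 5, 4, 6, 3, 7, 5]

print(share_at_least(toy_ratings, 5))  # 6 of the 8 toy ratings are 5 or above
print(share_at_most(toy_ratings, 4))   # 2 of the 8 toy ratings are 4 or below
```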


Performance and Level of importance for services provided
Finding 2:
The top 2 most important factors for undergraduates in Facilities & Equipment are having a quiet place to study and a place to work in groups. However, the LKS Library is not performing well in either area. The Facilities & Equipment ratings for the LKS Library are shown below:

Facilities service LKS 2.png


Finding 3:
For the KGC Law Library, the same two factors are the most important to undergraduates and are lacking in performance. Photocopying, scanning and printing services are another area for improvement at the KGC Law Library. The availability of computers also received quite a number of low performance ratings; however, most respondents indicated that it is not very important to them. The Facilities & Equipment ratings for the KGC Law Library are shown below:

Facilities service KGC2.png


Commonly mentioned words in comments
Finding 4: The most commonly mentioned words for undergraduates are Study, Space, Seats, Table and Area. This suggests that most of the undergraduates' concerns are about the space available in the library.

Undergrad comments.png

Postgraduate students

Overall, postgraduate students are the second largest group of respondents, making up 10.58% of those who participated in this survey. Most of them visit the library weekly, except for students from the School of Accountancy and the School of Law, the majority of whom visit the library daily.

Overall Satisfaction Level
Finding 1:
The majority of postgraduates are satisfied with the service provided: 94.61% of respondents gave a satisfaction level of 5 and above (out of 7), and 90% gave 7 and above (out of 10) for their likelihood of recommending the libraries to others.

Graduate sat level.png

Between the two libraries (LKS Library and KGC Law Library), respondents are slightly more satisfied with the LKS Library than with the KGC Law Library and are more likely to recommend the LKS Library to others.

Recommendation grad.png


Performance and Level of importance for services provided
Finding 2:
The top 2 most important factors for postgraduates in Facilities & Equipment are having a quiet place to study and a place to work in groups. However, neither library is performing well in these two areas for postgraduates.

Grad facilities2.png


Finding 3:
Another area that is of high importance to postgraduates but does not perform as well is accessing online resources via mobile devices, under Information Resources. This is likely an important factor for postgraduates because many of them are not full-time students, and a mobile phone is a way for them to access resources while off campus.

Grad information resources service 2.png


Commonly mentioned words in comments
Finding 4: The most commonly mentioned words for postgraduates are Student, Study, Book, Hour and Area. This suggests that most of the postgraduates' concerns are about the space available in the library.

Grad comments.png

Faculty

Overall, 2.46% of respondents are faculty. Most of them visit the library quarterly but access the library's resources weekly.

Overall Satisfaction Level
Finding 1:
The majority of faculty are satisfied with the service provided: there is no rating below 4 for overall satisfaction level and no rating below 5 for likelihood of recommending SMU Libraries to others. Faculty are more satisfied with the KGC Law Library than with the LKS Library, though both libraries have relatively high ratings.

Faculty Satisfaction.png


Performance and Level of importance for services provided
Finding 2:
The service group with the highest importance for faculty members is Information Resources. Almost all of the areas asked about are rated above 4 (out of 7). Three areas can be improved: whether the information resources located at the library meet their learning and research needs, online resources, and whether the library search engine lets them find relevant resources quickly. As faculty members are in charge of research work and teaching students, it is understandable that these areas are important to them and that they are the group that commonly uses this service.

Faculty resources.png


Finding 3:
Another area that is of high importance to faculty members but does not perform as well falls under Service Delivery: library items being on the shelves when faculty members need them. It is rated high for importance but not as high for performance.

Faculty Service Delivery.png


Commonly mentioned words in comments
Finding 4: The most commonly mentioned words by faculty are resources, book and available. This suggests that resource availability is the most important factor to faculty members. In the word cloud, many positive words such as great, good and like are also frequently mentioned.

Faculty comment.png

Staff

Overall, 5.23% of respondents are staff. Most of them visit the library monthly.

Overall Satisfaction Level
Finding 1:
The majority of staff members are satisfied with the service provided: there is no rating below 3 for either the overall satisfaction level or the likelihood of recommending SMU Libraries to others. Staff members are more satisfied with the KGC Law Library than with the LKS Library.

Staff satisfaction.png


Performance and Level of importance for services provided
Finding 2:
The service group that needs improvement is Information Resources. Two areas, using mobile devices to access online resources and using the library search engine to find relevant resources quickly, received lower performance ratings than the other services provided. Though not ranked very highly for importance, these two areas are still important, and the services can be further improved for staff members.

Staff information resources.png


Commonly mentioned words in comments
Finding 3: The most commonly mentioned words by staff are book, space and student. This suggests that book availability is one of the most important factors to staff members. Many of the comments also raise concerns that staff members think students would face; hence, Student is also frequently mentioned.

Staff comments.png