Difference between revisions of "ANLY482 AY2016-17 T2 Group7: Project Findings"

From Analytics Practicum
 
(51 intermediate revisions by 2 users not shown)
Line 8: Line 8:
  
 
| style="border-bottom:7px solid #005192;" width="20%" |
 
| style="border-bottom:7px solid #005192;" width="20%" |
[[ANLY482_AY2016-17_T2_Group7: Project Overview | <font color="#fff">Project Overview</font>]]
+
[[ANLY482_AY2016-17_T2_Group7: Project Overview | <font color="#bbdefb">Project Overview</font>]]
  
 
| style="border-bottom:7px solid #febd3d;" width="20%" |
 
| style="border-bottom:7px solid #febd3d;" width="20%" |
[[ANLY482_AY2016-17_T2_Group7: Project Findings | <font color="#bbdefb">Project Findings</font>]]
+
[[ANLY482_AY2016-17_T2_Group7: Methodology | <font color="#fff">Project Findings</font>]]
  
 
| style="border-bottom:7px solid #005192;" width="20%" |
 
| style="border-bottom:7px solid #005192;" width="20%" |
Line 20: Line 20:
 
|}
 
|}
 
<!-- End Main Navigation Bar -->
 
<!-- End Main Navigation Bar -->
 +
 +
<!--Sub Header-->
 +
{| style="background-color:white; color:000000 padding: 5px 0 0 0;" width="100%" height=50px cellspacing="0" cellpadding="0" valign="top" border="0" |
 +
 +
| style="vertical-align:top;width:14%;" | <div style="padding: 3px; text-align:center; line-height: wrap_content; font-size:15px; border-bottom:2px solid #0163bd; font-family:Century Gothic"> [[ANLY482_AY2016-17_T2_Group7: Methodology| <b>Methodology</b>]]
 +
 +
| style="vertical-align:top;width:14%;" | <div style="padding: 3px; text-align:center; line-height: wrap_content; font-size:15px; border-bottom:5px solid #0163bd; font-family:Century Gothic"> [[ANLY482_AY2016-17_T2_Group7: Exploratory Data Analysis| <b>Exploratory Data Analysis</b>]]
 +
 +
| style="vertical-align:top;width:14%;" | <div style="padding: 3px; text-align:center; line-height: wrap_content; font-size:15px; border-bottom:2px solid #0163bd; font-family:Century Gothic"> [[ANLY482_AY2016-17_T2_Group7: Text Mining| <b>Text Analytics</b>]]
 +
 +
| style="vertical-align:top;width:14%;" | <div style="padding: 3px; text-align:center; line-height: wrap_content; font-size:15px; border-bottom:2px solid #0163bd; font-family:Century Gothic"> [[ANLY482_AY2016-17_T2_Group7: Gap Analysis| <b>Gap Analysis</b>]]
 +
 +
|}
 +
<!--/Sub Header-->
 +
 +
<!-- Please do not make changes to above -->
  
 
<br />
 
<br />
  
 
<!-- Start Information -->
 
<!-- Start Information -->
 +
<div style="background:#307FBB; line-height:0.3em; font-family:sans-serif; font-size:120%; border-left:#bbdefb solid 15px;"><div style="border-left:#fff solid 5px; padding:15px;"><font color="#fff"><strong>Exploratory Data Analysis</strong></font></div></div>
  
<div style="background:#307FBB; line-height:0.3em; font-family:sans-serif; font-size:120%; border-left:#bbdefb solid 15px;"><div style="border-left:#fff solid 5px; padding:15px;"><font color="#fff"><strong>Findings</strong></font></div></div>
+
<div style="color:#212121;">
 +
[[File:BJJ1.png|700px]]<br/>
 +
''Chart 1: Overall Search Counts by Month for All Users''<br/>
 +
 
 +
[[File:Overall search by existing students.png|700px]]<br/>
 +
''Chart 1.1: Overall Search by Month for Existing Students''<br/>
  
<div style="color:#212121;">
+
[[File:Search counts by existing students during academic weeks1.png|1040px]]<br/>
<big>Analysis 1: Insights on Search Count by Date:</big><br/>
+
''Chart 1.2: Search Count by Existing Students during Academic Weeks''<br/>
==overview==
 
[[File:overall_search_count_by_month.jpg]]<br/>
 
Chart 1: Overall Search Counts by Month<br/>
 
  
 +
[[File:Overall search by alumni.png|700px]]<br/>
 +
''Chart 1.3: Overall Search by Month for Alumni''<br/>
  
[[File:user_group_search_counts.jpg]][[File:others_search_counts.jpg]]<br/>
+
[[File:Search Counts by Alumni during Academic Weeks1.png|1040px]]<br/>
Chart 2: User Group Search Counts
+
''Chart 1.4: Search Count by Alumni during Academic Weeks''<br/>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 
Chart 3: Search Count of 'Others'
 
<br/>
 
  
 +
[[File:user_group_search_counts.jpg|300px]]<br/>
 +
''Chart 2: User Group Search Counts''<br/>
  
 +
[[File:others_search_counts.jpg|300px]]<br/>
 +
''Chart 3: Search Count of 'Others'''<br/>
  
 
{| class="wikitable"
 
{| class="wikitable"
Line 48: Line 70:
 
|-
 
|-
 
| Thought Process:
 
| Thought Process:
| We want to understand the number of searches throughout the year and see if there are any observable trends. From our personal anecdotes as Seniors in SMU, we believe there will be a spike in searches especially during the project weeks (Week 4 onwards). Thus, we initiated the break down of the number of searches by months, to have a better look at where the peak periods are.
+
| We want to understand the number of searches throughout the year and see if there are any observable trends.
 +
Thus, we initiated the break down of the number of searches by months, to have a better look at where the peak periods are.
  
 
|-
 
|-
Line 54: Line 77:
 
| There is great variation in the number of searches across the span of a year, and these searches on the Ezproxy are contributed by students - Undergraduate, Masters, PhD and others (international exchange, local exchange, visiting students). As the users of the Ezproxy site are students of Singapore Management University, the spike in the number of searches can be seen during the months of the regular semesters (Term 1 and 2) - January to March and Mid-August to November.
 
| There is great variation in the number of searches across the span of a year, and these searches on the Ezproxy are contributed by students - Undergraduate, Masters, PhD and others (international exchange, local exchange, visiting students). As the users of the Ezproxy site are students of Singapore Management University, the spike in the number of searches can be seen during the months of the regular semesters (Term 1 and 2) - January to March and Mid-August to November.
  
 +
'''Identifying the start and end of regular terms just by looking at the number of searches'''
  
|}
+
In Chart 1, we could potentially identify the start and end of the 2 regular terms just by observing where the number of searches dips gradually.
 +
 
 +
The overall trend in the number of searches forms the shape of a jagged mountain for each term, and the start and end of each mountain fall around the start and end of the corresponding term.
 +
 
 +
'''Existing Students and Alumni in Charts 1.1 & 1.3 respectively'''
  
[[File:Chart4.png|1040px]]<br/>
+
We discovered that Chart 1 does not reveal much about which users are actually performing the searches. Thus, we decided to filter by Graduation Year to show the Overall Search by Month for Existing Students and Alumni in Charts 1.1 & 1.3 respectively.
Chart 4: Search Count by Days for Semester 1: Jan-March 2016<br/>
 
  
[[File:Chart5.png|1040px]]<br/>
+
The filtering and classification are as follows:
Chart 5: Search Count by Days for Semester 2: Aug-Nov 2016<br/>
 
  
 
{| class="wikitable"
 
{| class="wikitable"
 
| Subject Matter:
 
| Understanding the students’ behaviours in searches during the semesters.
 
 
|-
 
|-
| Thought Process:
+
! Graduating Year !! Type of User (Existing Student or Alumni) !! Thought Process
| In SMU, one common perception is that students study all day, every day, even on weekends. Thus, we want to see whether this perception is indeed true.
+
|-
 
+
| Null || Existing Student || This consists of users who are still far away from their graduating year
Next, we want to see if there is a surge in searches when the semester reaches the week in which group projects are released. This is because projects in SMU largely require students to perform desk research, and one of the many places to do so is the SMU library’s EzProxy e-resources database.
+
|-
 
+
| GY_2012 || Alumni || This indicates the students who have graduated in 2012 and are considered ‘alumni’ in 2016 where this dataset is based.
 +
|-
 +
| GY_2013 || Alumni || This indicates the students who have graduated in 2013 and are considered ‘alumni’ in 2016 where this dataset is based.
 +
|-
 +
| GY_2014 || Alumni || This indicates the students who have graduated in 2014 and are considered ‘alumni’ in 2016 where this dataset is based.
 +
|-
 +
| GY_2015 || Alumni || This indicates the students who have graduated in 2015 and are considered ‘alumni’ in 2016 where this dataset is based.
 +
|-
 +
| GY_2016 || Existing Student || This indicates students who are graduating in 2016 but are still considered students in the year 2016.
 
|-
 
|-
| Analysis:
+
| GY_2017 || Existing Student || This indicates students who are graduating in 2017 but are still considered students in the year 2016.
| '''Dip in Weekend Searches'''
+
|}
  
From the charts in Semesters 1 & 2, we noticed that there is a dip in the number of searches performed every weekend (Saturdays & Sundays). For example, there is a plunge in the number of searches on 16th of January (Saturday). Thus, this may show that the perception of SMU students studying all day everyday and even the weekends may be untrue. Or it could be that SMU students generally do not perform as many searches for their research on weekends.
+
We observed that the line chart in Chart 1.1 follows about the same shape as that of Chart 1. This could be because existing students contribute the majority of the overall searches.
  
'''Release of Project Requirements in Week 4'''
+
In Chart 1.3, however, the shape is vastly different from that of Chart 1. The dip after April 2016 could be because alumni typically receive their job offers around that period and are thus less involved in searching for e-resources than when they were still students.
  
Generally, the release of the project requirements would be on Week 4 of Term 2, which is 25 Jan - 31 Jan. From Chart 4, we can observe that there is a steep increase in the number of searches from 25 Jan which peaks at 26 Jan and then decreases steeply again to the end of Week 4, 31 Jan.
+
'''Chart 1.2: Students in Academic Weeks'''
  
In contrast, for Term 1 the project requirements would be released in Week 4, which is 5 Sep - 11 Sep. From Chart 5, we can observe no major increase in the number of searches from 5 Sep to 11 Sep. Thus, this does not support our belief that students do their searches in the week the project requirements are released.
+
From Chart 1.1, we decided to generate another chart showing how students search throughout the weeks of the academic terms. We observed that the peaks in the regular terms, Terms 2 and 1, occur during Week 8, which is the recess week. This could be because the majority of students start their research during recess week.
  
'''Research on Recess Week?'''
+
Next, we observed a decrease in the number of searches in the weeks following the recess week (Week 8), and then an unusual increase in Week 14, which is the study week. The same trend can be seen in both Terms 2 and 1. We believe that this increase could be due to students performing searches as they revise for their final examinations.
  
However, upon further contrasting of the trends in Charts 4 and 5 side-by-side, we discovered that there is always a spike in the number of searches on the first day of Recess week in both Terms. For Term 2, Recess week starts on 22 Feb where there is a visible spike from 21 Feb to 22 Feb. And for Term 1, Recess week starts on 3 Oct where there is also a visible spike from 2 Oct to 3 Oct (this spike happens to be the highest in the entire Term 1). And in both cases, the number of searches decreases gradually until the end of the Recess Week (28 Feb for Term 2 and 9 Oct for Term 1 respectively). This is a very interesting discovery as it potentially shows that students typically start their research on the first day of Recess week, thereby contributing to the spike in number of searches, and then as the Recess week comes to a close, the amount of research students performed becomes lesser too.
+
'''Chart 1.4: Alumni in Academic Weeks'''
  
'''Highest Spike in Term 2: 11 Feb, end of CNY?'''
+
From Chart 1.4, we observed that the number of searches for Term 2 is close to none. The data for Term 1 shows no recognizable pattern.
 +
|}
  
From Chart 4, we observed the highest spike in Term 2, which takes place on 11 Feb 2016. We could not find a possible explanation for this other than it being the end of the Chinese New Year holidays (9 & 10 Feb 2016) and students may be guilty from enjoying their CNY a tad too much and thus began to do their research on the library databases on 11 Feb.
+
[[File:Chart4.png|1040px]]<br/>
 +
''Chart 4: Dip in Weekends for Term 2: Jan-March 2016''<br/>
  
'''Identifying the start and end of regular terms just by looking at the number of searches'''
+
[[File:BJJ5.png|1040px]]<br/>
 +
''Chart 5: Dip in Weekends for Term 1: Aug-Nov 2016''<br/>
  
In both Charts 4 & 5, we could potentially identify the start and end of the 2 regular terms just by observing where the number of searches experience a gradual dip.
+
[[File:Bjj6.png|1040px]]<br/>
 +
''Chart 6: Search Count by Days for Term 2: Jan-March 2016''<br/>
  
The overall trend of the number of searches forms the shape of a jagged mountain for both terms, thus the start and ends of the mountains fall around the start and ends of the terms.
+
[[File:Bjj6.png|1040px]]<br/>
 +
''Chart 7: Search Count by Days for Term 1: Aug-Nov 2016''<br/>
  
 +
[[File:Chart5.png|1040px]]<br/>
 +
''Chart 7: Search Count by Days for Term 1: Aug-Nov 2016''<br/>
  
|}
+
[[File:Bjj8.png|1040px]]<br/>
 
+
''Chart 8: Chinese New Year in 2016''<br/>
[[File:Percent_of_Search_Counts_by_Degrees_in_Weekends.png]]<br/>
 
Chart 6: Percentage of Search Counts by Degrees in Weekends for Semester 2: Jan-March 2016<br/>
 
 
 
[[File:Percent_of_Search_Counts_by_Degrees_in_Weekends_from_Sep_to_Dec.png]]<br/>
 
Chart 7: Percentage of Search Counts by Degrees in Weekends for Semester 1: Sep to Nov 2016<br/>
 
  
 
{| class="wikitable"
 
{| class="wikitable"
  
 
| Subject Matter:
 
| Subject Matter:
| Understanding the percentage of searches contributed by students across their Degrees during weekends
+
| Understanding the students’ behaviours in searches during the semesters.
 
|-
 
|-
 
| Thought Process:
 
| Thought Process:
| We want to dive deeper into the analysis of weekend searches and find out who are the ones still contributing to it, despite the dip in number of weekend searches.  
+
| In SMU, one common perception is that students study all day, every day, even on weekends. Thus, we want to see whether this perception is indeed true.
 +
 
 +
Next, we want to see if there is a surge in searches when the semester reaches the week in which group projects are released. This is because projects in SMU largely require students to perform desk research, and one of the many places to do so is the SMU library’s EzProxy e-resources database.
  
 
|-
 
|-
 
| Analysis:
 
| Analysis:
| In Chart 6, we noticed that 56.75% of searches were done by students enrolled in the Bachelor of Laws programme, which occupies a majority of the total number of searches performed on weekends. Additionally, 16.91% of searches were done by students from Bachelor of Business Management and 7.36% from the Juris Doctor programme.
+
| '''Dip in Weekend Searches'''
  
One possible conclusion from this observation is that students enrolled in the Law field (Bachelor of Laws & Juris Doctor programmes) do not typically stop performing searches or researching simply because it is the weekend. In addition, students in the Bachelor of Business Management programme contribute significantly to the number of searches on weekends too, perhaps because the programme is research-intensive. This is in contrast to students from other, less research-intensive programmes such as Bachelor of Science (Information Systems), at 1.64% of the total number of searches.
+
From Charts 4 & 5, we noticed that there is a dip in the number of searches performed every weekend (Saturdays & Sundays). For example, there is a plunge in the number of searches on 16 January (Saturday). This may show that the perception of SMU students studying all day, every day, even on weekends, is untrue. Alternatively, it could be that SMU students simply do not perform as many searches for their research on weekends.
 +
 
 +
'''Research on Recess Week?'''
  
In Chart 7, we can observe that the abovementioned trend is consistent for students in the Bachelor of Laws, Bachelor of Business Management and Juris Doctor. Thus, our trend analysis holds consistent for both Semesters 1 and 2.  
+
Comparing the trends in Charts 6 and 7 side by side, we discovered a spike in the number of searches on the first day of recess week in both terms. For Term 2, recess week starts on 22 Feb, with a visible spike from 21 Feb to 22 Feb. For Term 1, recess week starts on 3 Oct, again with a visible spike from 2 Oct to 3 Oct (this spike is the highest in the whole of Term 1). In both cases, the number of searches then decreases gradually until the end of recess week (28 Feb for Term 2 and 9 Oct for Term 1). This suggests that students typically start their research on the first day of recess week, producing the spike, and that the amount of research tapers off as recess week comes to a close.
  
 +
'''Highest Spike in Term 2: 11 Feb, end of CNY?'''
 +
 +
From Chart 7, we observed the highest spike in Term 2, which takes place on 11 Feb 2016. We could not find a possible explanation for this other than it being the end of the Chinese New Year holidays (9 & 10 Feb 2016) and students may be picking up on their research, thus explaining the spike in number of searches performed on 11 Feb 2016.
  
 
|}
 
|}
  
[[File:Search_count_by_school.jpg]]<br/>
+
[[File:Percent_of_Search_Counts_by_Degrees_in_Weekends.png|800px]]<br/>
Chart 8: Search Count by Schools for 2016<br/>
+
''Chart 9: Percentage of Search Counts by Degrees in Weekends for Term 2: Jan-March 2016''<br/>
  
[[File:Percentage_of_Search_count_by_school.jpg]]<br/>
+
[[File:Percent_of_Search_Counts_by_Degrees_in_Weekends_from_Sep_to_Dec.png|800px]]<br/>
Chart 9: Percentage of Search Counts by Schools & Months<br/>
+
''Chart 10: Percentage of Search Counts by Degrees in Weekends for Term 1: Sep to Nov 2016''<br/>
  
 
{| class="wikitable"
 
{| class="wikitable"
  
 
| Subject Matter:
 
| Subject Matter:
| Understanding the percentage of searches contributed by students across their Degrees for 2016
+
| Understanding the percentage of searches contributed by students across their Degrees during weekends
 
|-
 
|-
 
| Thought Process:
 
| Thought Process:
| After learning about the different spikes as a whole, we then consider the possibility of some schools being greater contributors to the searches. The main contributors to these searches should most likely be similar to that of the weekends, and thus, we broke down the search count by schools over the months to analyze the general trend of searches across each degree.
+
| We want to dive deeper into the analysis of weekend searches and find out who are the ones still contributing to it, despite the dip in number of weekend searches.  
  
 
|-
 
|-
 
| Analysis:
 
| Analysis:
| Similar to the weekend searches, we observed that the top 3 percentages of searches still come from the students enrolled in the Bachelor of Laws, Bachelor of Business Management and Juris Doctor.  
+
| In Chart 9, we noticed that 56.75% of searches were done by students enrolled in the Bachelor of Laws programme, which occupies a majority of the total number of searches performed on weekends. Additionally, 16.91% of searches were done by students from Bachelor of Business Management and 7.36% from the Juris Doctor programme.  
  
In Chart 8, we observed that even though Semester 1 and Semester 2 starts in Aug and Jan respectively, the total number of searches in Jan is significantly higher than that of Aug. This is where additional information such as the exact dates of the start of Semesters 1 & 2 comes in handy; Semester 2 starts on 4 Jan 2016, thereby occupying the entire month of Jan whereby Semester 1 starts only on 15 Aug 2016, thereby occupying only half of the month of Aug. Without such additional information, analysts may conclude that perhaps students in Semester 2 are more hardworking than in Semester 1 in terms of the number of searches they perform.  
+
One possible conclusion from this observation is that students enrolled in the Law field (Bachelor of Laws & Juris Doctor programmes) do not typically stop performing searches or researching simply because it is the weekend. In addition, students in the Bachelor of Business Management programme contribute significantly to the number of searches on weekends too, perhaps because the programme is research-intensive. This is in contrast to students from other, less research-intensive programmes such as Bachelor of Science (Information Systems), at 1.64% of the total number of searches.
  
 +
In Chart 10, we can observe that the abovementioned trend is consistent for students in the Bachelor of Laws, Bachelor of Business Management and Juris Doctor. Thus, our trend analysis holds consistent for both Terms 1 and 2.   
  
 
|}
 
|}
 +
 
<br/>
 
<br/>
[[File:usage_of_databases_by_schools.jpg]]<br/>
+
[[File:usage_of_databases_by_schools.jpg|900px]]<br/>
Chart 10: Usage of Database by Schools<br/>
+
''Chart 11: Usage of Database by Schools''<br/>
 
 
With reference to Chart 10, we have selected 2 databases, Lawnet and Euromonitor, to focus on for this interim phase. This is due to the fact that these 2 databases are the most commonly used amongst the Law and Business students respectively, as these 2 schools are the 2 biggest contributors to the searches during the semester.
 
 
 
From the following actions applied to these 2 databases, we could then repeat these steps for the rest of the databases in the next phase following the interim report.
 
 
 
 
 
<div style="background:#307FBB; line-height:0.3em; font-family:sans-serif; font-size:120%; border-left:#bbdefb solid 15px;"><div style="border-left:#fff solid 5px; padding:15px;"><font color="#fff"><strong>Interim Gap Analysis</strong></font></div></div>
 
 
 
<big>'''Excessive System Logging of Search Queries'''</big>
 
 
 
In our EDA, we discovered that there exists a problem of excessive system logging of search queries. We have found 2 examples of such occurrence:
 
 
 
{| class="wikitable"
 
|-
 
! '''Time''' !! '''Search Query Logged'''
 
|-
 
| 12:55:02PM || Re
 
|-
 
| 12:55:04PM || Resol
 
|-
 
| 12:55:06PM || Resoluti
 
|-
 
| 12:55:08PM || Resolution
 
|}
 
Example 1: Log data is logged every 2 seconds
 
 
 
{| class="wikitable"
 
|-
 
! Key Press!! Search Query Logged
 
|-
 
| 1st Key Press: T || T
 
|-
 
| 2nd Key Press: r || Tr
 
|-
 
| 3rd Key Press: u || Tru
 
|-
 
| 4th Key Press: m || Trum
 
|-
 
| 5th Key Press: p || Trump
 
|}
 
Example 2: Log Data is logged with every key press
 
 
 
This presents a problem for our analysis: how do we determine which logged entry is the actual search query that a user intended? As the example of ‘User A’ below illustrates, a single session may contain multiple search queries; here we use 3 search queries as an example. The challenge is to sieve out the actual queries (e.g. Jack, Singapore) that User A searched for from the partial entries logged along the way, since the session does not end after each query.
 
 
 
Eg. List of 3 Search Queries being logged with every key press by User A:
 
 
 
[ Start of Session for User A ]
 
 
 
Re
 
 
 
Regu
 
 
 
Regula
 
 
 
Regulati
 
 
 
Regulation
 
 
 
Ja
 
 
 
Jack
 
 
 
Si
 
 
 
Sing
 
 
 
Singap
 
 
 
Singapor
 
 
 
Singapore
 
 
 
[ End of Session for User A ]
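
One possible way to sieve out the full queries from a session like the one above is a simple prefix heuristic: treat an entry as partial whenever the next logged entry extends it. The sketch below is only an illustration under that assumption, not the method used in this project; the function name `final_queries` is hypothetical, and the heuristic would not handle backspaces, corrections or pasted text.

```python
def final_queries(logged_entries):
    """Keep only entries that are not extended by the entry logged right after.

    Assumption (hypothetical heuristic): while a user is still typing,
    each new log entry starts with the previous one.
    """
    finals = []
    # Pair each entry with its successor; the last entry has no successor.
    for current, following in zip(logged_entries, logged_entries[1:] + [None]):
        if following is None or not following.startswith(current):
            finals.append(current)
    return finals


# The session logged for User A in the example above.
session_user_a = [
    "Re", "Regu", "Regula", "Regulati", "Regulation",
    "Ja", "Jack",
    "Si", "Sing", "Singap", "Singapor", "Singapore",
]
print(final_queries(session_user_a))
```

For User A's session this keeps one entry per search. A real log would also need the entries grouped by session and ordered by timestamp before applying such a rule.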
 
 
 
We concluded that this shortfall affects not only us as project analysts but other stakeholders as well.
 
 
 
<big>'''Interim Gap Analysis by Stakeholders'''</big>
 
 
 
The '''Actual Performance''' in this case would be if everything remains status quo, meaning the problem of multiple logging of search queries would persist.
 
 
 
The '''Desired Performance''' in this case would be if this problem does not exist and 1 line of logging is created for 1 full, actual search query.
 
 
 
{| class="wikitable"
 
|-
 
! Stakeholders Involved/Impact of Performance !! Actual Performance !! Desired Performance
 
|-
 
| Our Team as Project Analysts || Presents a problem whereby we need to determine which logged line is the actual, full search query entered by end-users before we can begin the analysis || Every logged line would be the actual, full search query entered by end-users, so we need not clean the dataset further, thereby reducing the amount of work we have to do and saving time that can be better spent on progressing the analysis
 
|-
 
| End-Users of Library’s e-Resources || Presents a problem whereby end-users may experience unnecessary lag in obtaining the results of their search queries || No lag when completing searches would mean a better overall user experience. Furthermore, such a seamless experience would mean that the system does not stand in the way of the intensive research that students have to do in their course of study, but rather serves as an effective aid to them.
 
|-
 
| Library Team as Project Sponsors for this Practicum || Presents a problem whereby the project sponsors run a risk of the project analysts not being able to sieve out the line of search queries which are full, actual and useful to determine the accurate search queries that users are actually searching for || No such problem as whatever the search query is, it would be logged as exactly that.
 
|-
 
| Library Team in charge of ensuring that the EzProxy server serves the users in the best possible way || Wastage of resources and can potentially slow down the servers when multiple logs are triggered and recorded before searches are completed. This utilizes processing RAM of the server unnecessarily and takes up precious memory space when being recorded as a line of search query. || There would be no wastage of server’s processing RAM and memory space as 1 line of logging would be created for 1 full, actual search query entered by users.
 
|}
 
  
  

Latest revision as of 02:40, 7 April 2017



Exploratory Data Analysis

BJJ1.png
Chart 1: Overall Search Counts by Month for All Users

Overall search by existing students.png
Chart 1.1: Overall Search by Month for Existing Students

Search counts by existing students during academic weeks1.png
Chart 1.2: Search Count by Existing Students during Academic Weeks

Overall search by alumni.png
Chart 1.3: Overall Search by Month for Alumni

Search Counts by Alumni during Academic Weeks1.png
Chart 1.4: Search Count by Alumni during Academic Weeks

User group search counts.jpg
Chart 2: User Group Search Counts

Others search counts.jpg
Chart 3: Search Count of 'Others'

Subject Matter: Awareness of the number of searches throughout the year
Thought Process: We want to understand the number of searches throughout the year and see if there are any observable trends.

Thus, we initiated the break down of the number of searches by months, to have a better look at where the peak periods are.

Analysis: There is great variation in the number of searches across the span of a year, and these searches on the Ezproxy are contributed by students - Undergraduate, Masters, PhD and others (international exchange, local exchange, visiting students). As the users of the Ezproxy site are students of Singapore Management University, the spike in the number of searches can be seen during the months of the regular semesters (Term 1 and 2) - January to March and Mid-August to November.

Identifying the start and end of regular terms just by looking at the number of searches

In Chart 1, we could potentially identify the start and end of the 2 regular terms just by observing where the number of searches dips gradually.

The overall trend in the number of searches forms the shape of a jagged mountain for each term, and the start and end of each mountain fall around the start and end of the corresponding term.

Existing Students and Alumni in Charts 1.1 & 1.3 respectively

We discovered that Chart 1 does not reveal much about which users are actually performing the searches. Thus, we decided to filter by Graduation Year to show the Overall Search by Month for Existing Students and Alumni in Charts 1.1 & 1.3 respectively.

The filtering and classification are as follows:

Graduating Year | Type of User (Existing Student or Alumni) | Thought Process
Null | Existing Student | This consists of users who are still far away from their graduating year.
GY_2012 | Alumni | This indicates students who graduated in 2012 and are considered ‘alumni’ in 2016, the year this dataset is based on.
GY_2013 | Alumni | This indicates students who graduated in 2013 and are considered ‘alumni’ in 2016, the year this dataset is based on.
GY_2014 | Alumni | This indicates students who graduated in 2014 and are considered ‘alumni’ in 2016, the year this dataset is based on.
GY_2015 | Alumni | This indicates students who graduated in 2015 and are considered ‘alumni’ in 2016, the year this dataset is based on.
GY_2016 | Existing Student | This indicates students who are graduating in 2016 but are still considered students in the year 2016.
GY_2017 | Existing Student | This indicates students who are graduating in 2017 but are still considered students in the year 2016.
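
The rule in the table above can be sketched as a small Python function. The function name `classify_user` and the handling of the `GY_` label format are assumptions made for illustration; only the rule itself (no graduating year, or graduating in 2016 or later, means an existing student) comes from the table.

```python
from typing import Optional

DATASET_YEAR = 2016  # the year the search logs are based on, per the table


def classify_user(graduating_year: Optional[str]) -> str:
    """Return 'Existing Student' or 'Alumni' for a label such as 'GY_2014'."""
    # A missing graduating year is treated as an existing student.
    if graduating_year is None or graduating_year.strip() in ("", "Null"):
        return "Existing Student"
    # Assumed label format: 'GY_' followed by a four-digit year.
    year = int(graduating_year.strip().split("_")[-1])
    return "Existing Student" if year >= DATASET_YEAR else "Alumni"


for label in (None, "GY_2012", "GY_2015", "GY_2016"):
    print(label, "->", classify_user(label))
```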

We observed that the line chart in Chart 1.1 follows about the same shape as that of Chart 1. This could be because existing students contribute the majority of the overall searches.

In Chart 1.3, however, the shape is vastly different from that of Chart 1. The dip after April 2016 could be because alumni typically receive their job offers around that period and are thus less involved in searching for e-resources than when they were still students.

Chart 1.2: Students in Academic Weeks

From Chart 1.1, we decided to generate another chart showing how students search throughout the weeks of the academic terms. We observed that the peaks in the regular terms, Terms 2 and 1, occur during Week 8, which is the recess week. This could be because the majority of students start their research during recess week.

Next, we observed a decrease in the number of searches in the weeks following the recess week (Week 8), and then an unusual increase in Week 14, which is the study week. The same trend can be seen in both Terms 2 and 1. We believe that this increase could be due to students performing searches as they revise for their final examinations.
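
To group searches by academic week, each search date has to be mapped to a week number within its term. The sketch below shows one way this could be done; the term start dates (4 Jan 2016 for Term 2 and 15 Aug 2016 for Term 1) are assumptions here, chosen because they place the recess weeks starting 22 Feb and 3 Oct in Week 8, and the names `TERM_STARTS` and `academic_week` are hypothetical.

```python
from datetime import date

# Assumed first day (a Monday) of each regular term in 2016.
TERM_STARTS = {
    "Term 2": date(2016, 1, 4),
    "Term 1": date(2016, 8, 15),
}


def academic_week(day: date, term_start: date) -> int:
    """Return the 1-based academic week that `day` falls in."""
    return (day - term_start).days // 7 + 1


# First day of recess week in each term.
print(academic_week(date(2016, 2, 22), TERM_STARTS["Term 2"]))
print(academic_week(date(2016, 10, 3), TERM_STARTS["Term 1"]))
```

Search counts can then be summed per week number to produce a chart like Chart 1.2.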

Chart 1.4: Alumni in Academic Weeks

From Chart 1.4, we observed that the number of searches for Term 2 is close to none. The data for Term 1 shows no recognizable pattern.

Chart4.png
Chart 4: Dip in Weekends for Term 2: Jan-March 2016

BJJ5.png
Chart 5: Dip in Weekends for Term 1: Aug-Nov 2016

Bjj6.png
Chart 6: Search Count by Days for Term 2: Jan-March 2016

Chart5.png
Chart 7: Search Count by Days for Term 1: Aug-Nov 2016

Bjj8.png
Chart 8: Chinese New Year in 2016

Subject Matter: Understanding students’ search behaviour during the semesters.
Thought Process: In SMU, one common perception is that SMU students study all day, every day, even on weekends. We therefore want to see whether this perception of SMU students is indeed true.

Next, we want to see whether there is a surge in searches when the semester reaches the week in which group projects are released. This is because projects in SMU largely require students to perform desk research, and one of the many places to do so is the SMU library’s EzProxy e-resources database.

Analysis: Dip in Weekend Searches

From Charts 4 and 5, we noticed a dip in the number of searches performed every weekend (Saturdays and Sundays). For example, there is a plunge in the number of searches on 16 January (Saturday). This may show that the perception of SMU students studying all day, every day, even on weekends, is untrue. Alternatively, it could be that SMU students simply do not perform as many searches for their research on weekends.
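One way to quantify the weekend dip is to compare the average number of searches per day on weekdays with that on weekends. The sketch below is illustrative only. The function name and the sample log are hypothetical, and it assumes the log has one date per search.

```python
from collections import Counter
from datetime import date


def mean_daily_searches(search_dates):
    """Return (weekday_mean, weekend_mean) of searches per day.

    Only days with at least one search are averaged; days with no
    searches do not appear in the log and are not counted as zeros.
    """
    per_day = Counter(search_dates)
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    weekday = [n for day, n in per_day.items() if day.weekday() < 5]
    weekend = [n for day, n in per_day.items() if day.weekday() >= 5]

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    return mean(weekday), mean(weekend)


# Hypothetical log: three searches on Fri 15 Jan 2016, one on Sat 16 Jan 2016.
sample_dates = [date(2016, 1, 15)] * 3 + [date(2016, 1, 16)]
weekday_mean, weekend_mean = mean_daily_searches(sample_dates)
```

A weekend mean well below the weekday mean would be consistent with the dip seen in the charts.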

Research on Recess Week?

However, on comparing the trends in Charts 5 and 6 side by side, we discovered a spike in the number of searches on the first day of Recess Week in both terms. For Term 2, Recess Week starts on 22 Feb, with a visible spike from 21 Feb to 22 Feb. For Term 1, Recess Week starts on 3 Oct, with a visible spike from 2 Oct to 3 Oct (this spike happens to be the highest in the entire Term 1). In both cases, the number of searches then decreases gradually until the end of Recess Week (28 Feb for Term 2 and 9 Oct for Term 1). This is an interesting discovery, as it suggests that students typically start their research on the first day of Recess Week, which produces the spike, and that the amount of research they perform tapers off as Recess Week comes to a close.
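A spike such as the one from 21 Feb to 22 Feb can be located by computing the day-over-day change in search counts and taking the largest increase. This is a minimal sketch with a hypothetical function name. The counts in the example are invented for illustration and are not the project's data.

```python
from datetime import date, timedelta


def largest_daily_jump(daily_counts):
    """Return (day, increase) for the largest rise over the previous calendar day.

    Only pairs of consecutive calendar days present in `daily_counts` are
    compared; returns None if there is no such pair.
    """
    best = None
    for day, count in daily_counts.items():
        previous = daily_counts.get(day - timedelta(days=1))
        if previous is None:
            continue
        increase = count - previous
        if best is None or increase > best[1]:
            best = (day, increase)
    return best


# Invented daily totals around the start of Term 2 Recess Week.
sample_counts = {
    date(2016, 2, 21): 120,
    date(2016, 2, 22): 480,
    date(2016, 2, 23): 410,
    date(2016, 2, 24): 390,
}
spike = largest_daily_jump(sample_counts)
```

Here `spike` identifies 22 Feb as the day with the largest increase, followed by the gradual decline described above.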

Highest Spike in Term 2: 11 Feb, end of CNY?

From Chart 8, we observed the highest spike in Term 2, which takes place on 11 Feb 2016. The only possible explanation we could find is that it follows the end of the Chinese New Year holidays (9 & 10 Feb 2016), when students may be picking up their research again, which would explain the spike in the number of searches performed on 11 Feb 2016.

Percent of Search Counts by Degrees in Weekends.png
Chart 9: Percentage of Search Counts by Degrees in Weekends for Term 2: Jan-March 2016

Percent of Search Counts by Degrees in Weekends from Sep to Dec.png
Chart 10: Percentage of Search Counts by Degrees in Weekends for Term 1: Sep to Nov 2016

Subject Matter: Understanding the percentage of searches contributed by students across their Degrees during weekends
Thought Process: We want to dive deeper into the analysis of weekend searches and find out who still contributes to them, despite the dip in the number of weekend searches.
Analysis: In Chart 9, we noticed that 56.75% of searches were done by students enrolled in the Bachelor of Laws programme, which makes up a majority of the total number of searches performed on weekends. Additionally, 16.91% of searches were done by students from the Bachelor of Business Management programme and 7.36% by students from the Juris Doctor programme.

One possible conclusion from this observation is that students in the Law field (the Bachelor of Laws and Juris Doctor programmes) do not typically stop searching or researching simply because it is the weekend. In addition, students in the Bachelor of Business Management programme contribute significantly to the number of searches on weekends too, perhaps because the programme is research-intensive. This is in contrast to students from less research-intensive programmes such as Bachelor of Science (Information Systems), at 1.64% of the total number of searches.

In Chart 10, we can observe that the abovementioned trend is consistent for students in the Bachelor of Laws, Bachelor of Business Management and Juris Doctor programmes. Thus, our trend analysis holds for both Terms 1 and 2.
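The percentages in Charts 9 and 10 are each degree's share of all weekend searches. A share of this kind could be computed as sketched below, assuming each log record is a (date, degree) pair. The function name and the sample records are hypothetical.

```python
from collections import Counter
from datetime import date


def weekend_share_by_degree(records):
    """Return {degree: percent of weekend searches} for (date, degree) records."""
    # Keep only Saturday (5) and Sunday (6) searches, counted per degree.
    weekend = Counter(
        degree for day, degree in records if day.weekday() >= 5
    )
    total = sum(weekend.values())
    if total == 0:
        return {}
    return {degree: round(100 * n / total, 2) for degree, n in weekend.items()}


# Hypothetical records; 16-17 Jan 2016 is a weekend, 18 Jan 2016 is a Monday.
sample_records = [
    (date(2016, 1, 16), "Bachelor of Laws"),
    (date(2016, 1, 16), "Bachelor of Laws"),
    (date(2016, 1, 16), "Bachelor of Business Management"),
    (date(2016, 1, 17), "Juris Doctor"),
    (date(2016, 1, 18), "Bachelor of Science (Information Systems)"),
]
shares = weekend_share_by_degree(sample_records)
```

Weekday records are excluded before the shares are computed, so the percentages sum to 100 across the degrees that searched on weekends.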


Usage of databases by schools.jpg
Chart 11: Usage of Database by Schools

