Difference between revisions of "Maximum Project Overview"

From Analytics Practicum

Revision as of 20:28, 14 January 2018


Project Motivation


The library aims to optimise its resource availability and distribution channels to maximise the learning effectiveness of its students. This could mean increasing the resources available for highly searched topics, altering current trainings and workshops to address common mistakes students make while using the assets, or finding unexpected trends in the user journey through digital and physical touch points. The library also wants to know whether usage patterns vary between students based on attributes such as Programme, Year of Graduation and Education Level. For this purpose, it conducted an initial survey of the freshman batch of 2017 to evaluate the difference in their confidence level in various research skills before and after joining SMU, factoring in considerations such as modules taken and library workshops attended. The library would like us to determine whether this survey contains any actionable insights.


Project Objectives


We had an initial discussion with our project sponsor, who would like us to create a visual dashboard to ascertain any cause-and-effect relationship between the library’s initiatives and resources and student performance (in terms of confidence and optimal usage of resources).

The objectives of the project are as follows:

  1. Analysis of the Library’s proxy server logs (usage of online resources) and turnstile logs (usage of physical resources)
    • Initial exploratory analysis to identify clusters in usage patterns
    • To understand if these patterns affect confidence level of students (limited to freshmen)
    • Alternatively, determine if confidence level spurs certain search behaviour.
  2. Recommendations based on findings
    • To help stakeholders understand the analysis of the findings
    • To validate or suggest changes in current workshops and trainings
    • To validate or suggest changes in the current availability of resources

The current objectives may be subject to further changes after we have obtained and examined the actual data.


Data


The sponsor has provided us with five datasets: student data, request log data, turnstile data, pre-survey data and post-survey data.

The student dataset contains information about the current students of SMU across all batches. The record attributes are the following:

  • email (hashed to a 64-digit hexadecimal number for non-disclosure reasons)
  • education level
  • faculty
  • admission year
  • graduation year
  • degree program
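
A 64-digit hexadecimal value is the length of a 256-bit digest. The hashing scheme used by the sponsor is not stated, so the sketch below simply assumes SHA-256 to illustrate how a hashed email can serve as a join key across the datasets without exposing the address. The function name `hash_email` and the sample address are hypothetical.

```python
import hashlib

def hash_email(email: str) -> str:
    """Return a 64-character hexadecimal digest of a normalised email address.

    Assumption: SHA-256 is used purely for illustration; the sponsor's
    actual hashing scheme is not documented on this page.
    """
    normalised = email.strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()

# The same address always maps to the same key, so records from different
# datasets can be linked on the hashed value alone.
key = hash_email("Student@Example.edu ")
```

Normalising case and whitespace before hashing is a design choice in this sketch: without it, two spellings of the same address would produce different keys and fail to join.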

The request log dataset contains records captured by the library’s URL rewriting proxy server throughout the year of 2017. This dataset captures all user requests to external databases. The record attributes are the following:

  • user ID
  • session ID
  • search database
  • timestamp
  • search query
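
As a rough illustration of how the request log attributes above could be turned into per-user features for the exploratory analysis, the sketch below counts requests, distinct sessions and distinct databases per user. The sample rows, the timestamp format and the function name `summarise_usage` are assumptions, since the actual data has not yet been received.

```python
from collections import defaultdict

# Hypothetical rows: (user ID, session ID, search database, timestamp, search query)
request_log = [
    ("u1", "s1", "DatabaseA", "2017-03-01 10:00:00", "market entry"),
    ("u1", "s1", "DatabaseA", "2017-03-01 10:05:00", "market entry asia"),
    ("u1", "s2", "DatabaseB", "2017-03-04 21:30:00", "contract law"),
    ("u2", "s3", "DatabaseB", "2017-03-02 09:15:00", "tort"),
]

def summarise_usage(rows):
    """Aggregate request-log rows into simple per-user usage features."""
    sessions = defaultdict(set)
    databases = defaultdict(set)
    requests = defaultdict(int)
    for user_id, session_id, database, _timestamp, _query in rows:
        sessions[user_id].add(session_id)
        databases[user_id].add(database)
        requests[user_id] += 1
    return {
        user_id: {
            "requests": requests[user_id],
            "sessions": len(sessions[user_id]),
            "databases": len(databases[user_id]),
        }
        for user_id in requests
    }

usage = summarise_usage(request_log)
```

Features of this kind could later feed a clustering step, although the choice of features and model remains open until the data is available.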

The turnstile dataset contains records captured by the library’s gantries throughout the year of 2017. This dataset captures physical taps on the gantries of the library. The record attributes are the following:

  • date
  • time
  • device name
  • email (hashed to a 64-digit hexadecimal number for non-disclosure reasons)

The pre and post survey dataset contains responses of students before and after the first semester of freshman year on their confidence level in various research skills. Some of the record attributes are as follows:

  • email (hashed to a 64-digit hexadecimal number for non-disclosure reasons)
  • school
  • modules taken
  • library workshops attended
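
One way the before-and-after comparison might be computed is sketched below: for each student who appears in both surveys, the change in confidence is the post-survey score minus the pre-survey score for each skill. The skill names, the 1 to 5 scale, the shortened keys and the function name `confidence_change` are all assumptions for illustration.

```python
# Hypothetical responses keyed by (shortened) hashed email; scores on an assumed 1-5 scale.
pre_survey = {
    "a1f3": {"searching": 2, "citing": 3},
    "b7c9": {"searching": 4, "citing": 2},
}
post_survey = {
    "a1f3": {"searching": 4, "citing": 3},
    "c0de": {"searching": 5, "citing": 5},
}

def confidence_change(pre, post):
    """Return post-minus-pre confidence per skill for students present in both surveys."""
    changes = {}
    for student in pre.keys() & post.keys():
        changes[student] = {
            skill: post[student][skill] - pre[student][skill]
            for skill in pre[student]
            if skill in post[student]
        }
    return changes

changes = confidence_change(pre_survey, post_survey)
```

Students who answered only one of the two surveys are dropped in this sketch, because no change can be computed for them.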

Project Methodology


As we will not obtain the data until the NDA is signed, we can only share our initial thoughts on how we will tackle the project. We shall closely follow the Data Analytics Lifecycle approach.

Our plan of action is to discern the effectiveness of the library eBook databases in meeting the research needs of students. By analysing the proxy entries, we can define the usage patterns of users and divide them into distinct clusters based on demographic and behavioural traits. Furthermore, we intend to track the student user journey as students interact sequentially with the various physical and digital touch points. To this end, we have also conducted secondary research on articles published by various universities to gain a broad understanding of how turnstile and proxy data could be used to draw insights.
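
A minimal sketch of the user-journey idea follows: turnstile taps and proxy requests for one student are merged into a single list ordered by time. It assumes that the request log's user ID matches the hashed email in the turnstile data and that dates and times follow the formats shown; neither has been confirmed, and the sample rows and the function name `build_journey` are hypothetical.

```python
from datetime import datetime

# Hypothetical rows; field order follows the attribute lists in the Data section.
turnstile_rows = [  # date, time, device name, hashed email
    ("2017-03-01", "09:50:00", "Gate 1", "a1f3"),
    ("2017-03-01", "13:20:00", "Gate 2", "a1f3"),
]
proxy_rows = [  # user ID, session ID, search database, timestamp, search query
    ("a1f3", "s1", "DatabaseA", "2017-03-01 10:05:00", "market entry"),
    ("b7c9", "s2", "DatabaseB", "2017-03-01 11:00:00", "tort"),
]

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed format

def build_journey(user, turnstile, proxy):
    """Merge one user's physical and digital events into chronological order."""
    events = []
    for date, time, device, email in turnstile:
        if email == user:
            when = datetime.strptime(f"{date} {time}", TIME_FORMAT)
            events.append((when, "physical", device))
    for user_id, _session, database, timestamp, _query in proxy:
        if user_id == user:
            when = datetime.strptime(timestamp, TIME_FORMAT)
            events.append((when, "digital", database))
    return sorted(events)

journey = build_journey("a1f3", turnstile_rows, proxy_rows)
```

Combining the turnstile's separate date and time fields into one timestamp is what allows the two sources to be compared on a common timeline.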

At this phase of the project, we will focus on understanding and cleaning the given datasets. Concurrently, we will decide on the analytical model and prepare the data accordingly.


Project Scope


While the project will be primarily focussed on answering the questions mentioned above, our client has been supportive enough to let us experiment with different analytical tools and present any other significant insights we derive.

  • We will be unable to conduct a yearly or seasonal analysis as the dataset is limited to records from 2017 only.
  • The dataset pertains to all SMU students who used the library resources during 2017. However, the survey was conducted only for the freshman batch.