Difference between revisions of "ANLY482 AY2016-17 T2 Group15 Analysis & Findings"


Revision as of 02:08, 24 February 2017



Data Source

All of the data we obtained was provided by Edufy Secondary School. In total, we received data covering three batches of students, who graduated in 2014, 2015 and 2016. Each batch's data covers the four years of secondary school that its students went through. To be clear, the data consists of the following:

Batch of 2014      | Batch of 2015      | Batch of 2016
Secondary 1 (2011) | Secondary 1 (2012) | Secondary 1 (2013)
Secondary 2 (2012) | Secondary 2 (2013) | Secondary 2 (2014)
Secondary 3 (2013) | Secondary 3 (2014) | Secondary 3 (2015)
Secondary 4 (2014) | Secondary 4 (2015) | Secondary 4 (2016)

For each year, we were also given a breakdown of the examinations that each student takes in that year:

  • Secondary 1: CA1, SA1, CA2, SA2, Overall (5 sets of data)
  • Secondary 2: CA1, SA1, CA2, SA2, Overall (5 sets of data)
  • Secondary 3: CA1, SA1, CA2, SA2, Overall (5 sets of data)
  • Secondary 4: CA1 or CA2, SA1, SA2 (also known as Prelims), Overall (4 sets of data)
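
To make the structure of the data concrete, here is a minimal Python sketch that lists every data set for one batch. It assumes a batch is named after its graduating (Secondary 4) year, as the table above suggests. The names ASSESSMENTS, calendar_year and datasets_for_batch are hypothetical and used only for illustration; they are not part of the sponsor's files.

```python
# Assessments recorded at each secondary level, as listed above.
ASSESSMENTS = {
    1: ["CA1", "SA1", "CA2", "SA2", "Overall"],
    2: ["CA1", "SA1", "CA2", "SA2", "Overall"],
    3: ["CA1", "SA1", "CA2", "SA2", "Overall"],
    4: ["CA1 or CA2", "SA1", "SA2 (Prelims)", "Overall"],
}


def calendar_year(batch, level):
    """Calendar year in which a batch (named by graduating year) sat a level."""
    return batch - (4 - level)


def datasets_for_batch(batch):
    """List (level, calendar year, assessment) for every data set of a batch."""
    return [
        (level, calendar_year(batch, level), name)
        for level, names in ASSESSMENTS.items()
        for name in names
    ]


sets_2016 = datasets_for_batch(2016)
print(len(sets_2016))  # 19 data sets per batch (5 + 5 + 5 + 4)
print(sets_2016[0])    # (1, 2013, 'CA1')
```

Under these assumptions, each batch contributes 19 sets of data, so the three batches together give 57.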


'Overall' refers to the overall score a student obtains for the entire academic year. It is calculated from two combined scores: CA1 and SA1 (weighted 37.5% CA1, 62.5% SA1), which make up 40% of the total, and CA2 and SA2 (weighted 25% CA2, 75% SA2), which make up the remaining 60%.
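
The weighting can be written out as a short Python sketch. It assumes all four scores are on the same scale (for example, percentages out of 100) and applies only to years with all four components, that is Secondary 1 to 3; how the Secondary 4 overall is computed is not stated here. The function name overall_score is hypothetical.

```python
def overall_score(ca1, sa1, ca2, sa2):
    """Overall score for one academic year, using the weights described above.

    Assumes all four inputs are on the same scale (e.g. percentages).
    """
    first_half = 0.375 * ca1 + 0.625 * sa1   # CA1 & SA1 combined
    second_half = 0.25 * ca2 + 0.75 * sa2    # CA2 & SA2 combined
    return 0.4 * first_half + 0.6 * second_half


# Example with made-up scores (not taken from the sponsor's data).
print(round(overall_score(60, 70, 80, 90), 2))  # 79.0
```

Multiplying the weights through, CA1 contributes 15%, SA1 25%, CA2 15% and SA2 45% of the overall score.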


Edufy sample data.png


The image above shows a small sample of the data we received from our sponsor: the first few columns of the Batch of 2016 CA1 data. This file contains the Secondary 1, Secondary 2, Secondary 3 and Secondary 4 CA1 results for the Batch of 2016.


Each student's name is replaced by a code. For example, in the image shown, the first student is a Secondary 4 student from class S4-1 with index number 1. This protects the identities of the students we are analyzing. Besides the main academic results, we also have other columns, such as the student's second language, PSLE and 'O' Level results (our main objective), gender, and class in Secondary 1 and Secondary 2 (for inter-class analysis).


After asking our sponsor for more data, we also obtained the students' CCA data, but only for their graduating year. Here is a sample of the CCA data for the Batch of 2016:


Edufy sample data cca.png


As the sample shows, we are given the name of the CCA each student was involved in, along with the number of points and the corresponding grade the student received at the end of their four years of secondary school. We are not given CCA records for the end of each academic year.


Data Preparation

For all of our data preparation and analysis, we will be using the following software:

Edufy excel.png Edufy jmppro.png



Exploratory Data Analysis


Time-series Analysis