AY1516 T2 Team13 Natasha Studio Findings

From Analytics Practicum


Purchases Data

Exploratory data analysis allows us to better understand Natasha Studio's business performance. As mentioned earlier, the purchase data for 2010-2012 and for 2014-2015 are analyzed separately because of the missing observations.

Price Packages

Bar chart of packages bought in 2010-2012
Bar chart of packages bought in 2014-2015

The graphs above show the most popular package types in 2010-2012 and in 2014-2015.

For 2010-2012, we can see that the “06 Weeks (Full-Course)” package is the most popular, contributing 31% of package purchases at 856 purchases. The “Unlimited Any Classes: 1 Month” package follows closely behind at 26.5% of total package purchases, followed by the “04 Open Class” package at 17.9% and the “04 Weeks (Full-Course)” package at 9% of the total number of packages bought.

In contrast, frequency analysis of the 2014-2015 package purchase data reveals that the “Unlimited Any Classes: 1 Month” package became the most popular, contributing 39.1% of all package sales at 992 packages. This exceeds the 856 purchases of the top package in 2010-2012 even though the 2010-2012 dataset spans a longer period, pointing to a clear shift toward the “Unlimited Any Classes: 1 Month” package as a significant revenue contributor to Natasha Studio. In addition, the next most popular packages are each less than half as popular as the “Unlimited Any Classes: 1 Month” package. The “04 Open Classes” package remains popular and contributes a stable 17% of total packages bought. However, the “06 Weeks (Full-Course)” package is relegated to third place, contributing 14.7% of total packages bought, and the number purchased is less than half of that in the 2010-2012 dataset.
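The frequency analysis described above amounts to counting purchases per package and dividing by the total. The sketch below shows one way to do this in Python; the function name `package_shares` and the sample records are assumptions made for illustration and are not the team's actual tooling or data.

```python
from collections import Counter

def package_shares(purchases):
    """Return (package, count, percent share) tuples, most frequent first."""
    counts = Counter(purchases)
    total = sum(counts.values())
    return [(name, n, round(100 * n / total, 1))
            for name, n in counts.most_common()]

# Made-up records for illustration only; not Natasha Studio data.
sample = (
    ["Unlimited Any Classes: 1 Month"] * 4
    + ["04 Open Class"] * 2
    + ["06 Weeks (Full-Course)"] * 2
)

for name, n, pct in package_shares(sample):
    print(f"{name}: {n} purchases ({pct}%)")
```

Running the same function separately on each period's records would give the shares compared in the two bar charts.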

Price Package Vs. Amount Paid

Bar-line chart of sum of amount paid for packages and number of packages bought in 2010-2012
Bar-line chart of sum of amount paid for packages and number of packages bought in 2014-2015

We observe that in 2010-2012 the highest revenue-generating package is the “Unlimited Any Classes: 1 Month” package, followed by the “06 Weeks (Full-Course)” package. The difference between minimum and maximum prices reflects the varying pricing decisions adopted by Natasha Studio for each package. Furthermore, packages with a minimum amount paid of $0 indicate promotional strategies, such as the “Buy 2 Free 1” promotion for the unlimited any class packages. Packages included in the free bundle promotions are listed as $0 paid in the Purchase data. In addition, Figure 6 shows that the “Unlimited Any Classes: 1 Month” and “30 Any Classes” packages bring in a disproportionate amount of revenue for the number of such packages sold. Hence, retaining these two package types could be beneficial for Natasha Studio in terms of revenue.

In the 2014-2015 Purchase Data, the “Unlimited Any Classes: 1 Month” package is the greatest revenue generator for Natasha Studio. This is likely caused by the large number of “Unlimited Any Classes: 1 Month” packages bought in 2014-2015. One interesting observation is that although the “06 Weeks (Full-Course)” package is the 3rd most popular package in terms of frequency, it is the 2nd highest revenue generator for Natasha Studio, followed by the “08 Open Classes” package, which is ranked 4th in terms of purchase frequency. This indicates that for the same number of lessons, courses tend to generate more revenue. This observation might be related to the dance genre associated with the courses offered, and will be analyzed in later sections.
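A package can rank differently by frequency and by revenue, as noted above. The following sketch shows how the two rankings can be derived from (package, amount paid) rows. The helper names `revenue_summary` and `rank`, the package labels and the prices are all hypothetical; a $0 row is included to mimic a free bundled package.

```python
from collections import defaultdict

def revenue_summary(rows):
    """Map each package to (count, total paid, average paid per purchase)."""
    count = defaultdict(int)
    total = defaultdict(float)
    for package, amount in rows:
        count[package] += 1
        total[package] += amount
    return {p: (count[p], total[p], total[p] / count[p]) for p in count}

def rank(summary, index):
    """Package names sorted descending by one field (0 = count, 1 = total)."""
    return sorted(summary, key=lambda p: summary[p][index], reverse=True)

# Made-up rows for illustration only; not Natasha Studio prices.
rows = [("Open", 50.0)] * 3 + [("Course", 120.0)] * 2 + [("Open", 0.0)]

summary = revenue_summary(rows)
print(rank(summary, 0))  # ranking by number of purchases
print(rank(summary, 1))  # ranking by total amount paid
```

In this toy data the open class package leads on frequency while the course package leads on revenue, which is the same kind of rank reversal described in the paragraph above.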

Course/Open & Level

Summary statistics of amount paid for different class difficulty levels in 2010-2012
Summary statistics of amount paid for different class difficulty levels in 2014-2015

Next, the summary statistics in Table 18 show the difficulty levels associated with the classes. The 2010-2012 dataset reveals that Natasha Studio appeals to beginner dancers: the top 3 course difficulty levels are ABC, I and II. Similarly, beginner open class packages are the most popular among the open class packages.

As mentioned previously, difficulty levels for open classes in 2014-2015 are not available due to the nature of the hardcopy data records provided. Once again, the lower-level courses are more popular than those of higher difficulty. Furthermore, despite the larger number of open class packages sold, the total revenue generated by course packages is greater than that of open class packages. This highlights the importance of offering course packages for Natasha Studio's revenue.

Time Series Analysis

Lastly, a time-series analysis of package purchases reveals no distinct seasonal trend for any particular package type in either the 2010-2012 or the 2014-2015 dataset.

Line plot of number of different packages bought in 2010-2012
Line plot of number of different packages bought in 2014-2015

Yet a time-series analysis of the total number of packages purchased does indicate a time trend in purchase behavior. November, December, May and June are observed to have higher sales than other months. This coincides with school holiday periods, indicating that a seasonal trend in package purchases does exist.
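One simple way to look for the monthly pattern described above is to count purchases per calendar month and flag the months that exceed the monthly average. The sketch below assumes each purchase carries a date; the function names and the sample dates are hypothetical. Note that pooling all years by month, as done here, hides any year-on-year trend.

```python
from collections import Counter
from datetime import date

def purchases_per_month(dates):
    """Count purchases per calendar month (1-12), pooling all years."""
    counts = Counter({m: 0 for m in range(1, 13)})
    counts.update(d.month for d in dates)
    return counts

def above_average_months(dates):
    """Months whose purchase count exceeds the mean over all 12 months."""
    counts = purchases_per_month(dates)
    mean = sum(counts.values()) / 12
    return sorted(m for m, n in counts.items() if n > mean)

# Made-up dates for illustration only: one purchase in every month,
# plus extra purchases in June and November.
sample = (
    [date(2011, m, 1) for m in range(1, 13)]
    + [date(2011, 6, 15)] * 3
    + [date(2011, 11, 20)] * 2
)

print(above_average_months(sample))  # [6, 11]
```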

However, a worrying trend is observed in 2014-2015, where the later months of 2015 show a clear decline in the total number of packages bought. This evidence indicates that Larry’s concerns regarding Natasha Studio’s competitiveness and recent business performance are not unfounded. Hence, the team hopes that the data modelling provided in this project will assist Natasha Studio in its product offering.

Line plot of total packages bought in 2010-2012
Line plot of total packages bought in 2014-2015

Attendance Data

Genre

Bar-line chart of different genre's attendance and revenue generated

We can see that the most popular dance genre in Natasha Studio is Hiphop, followed by Bboying and Kpop. (Note that not all genres are shown here; some were hidden for clarity of illustration.) However, once class prices are taken into account, the picture changes: although Hiphop is the most popular and highest revenue-generating genre, the graph above highlights how much revenue Hiphop (Kids) classes generate despite there being fewer of them. This greater efficiency in revenue generation is due to the higher price charged for Hiphop (Kids) classes. Natasha Studio could consider offering more of these classes to improve its overall revenue.

Course / Open & Level

Attendance of classes based on class difficulty levels

Natasha Studio is seen to be more focused on introductory-level courses and open classes. This is seen in the high frequency of Open Class: Beginner/Introductory as well as Course: Level ABC. Again, once class prices are taken into account, we get a more interesting picture. The graph above identifies “Course: PTP” as the best revenue generator per participant. Performance Training Programmes (PTP) are priced higher than traditional courses because of the higher level of performance and rigor expected from students.

Time Series Analysis

Line plot of attendance based on months

We can see that there is a seasonal monthly trend, with June and November being the more popular months. As with the observations in the Purchase data, the higher attendance in these months is likely due to school holidays. As students are less involved in school work, they are more likely to pursue non-academic interests and attend Natasha Studio’s dance classes.

Bar chart of attendance based on days

We can see that Saturday and Sunday are more popular, likely because they fall on the weekend. Thus, Natasha Studio should focus on holding more classes on Saturdays and Sundays to capture more of the market.

Line plot of attendance based on genre and days

If we further break attendance down by genre, we obtain greater insights. For instance, the two most popular genres, Bboying and Hiphop, are popular on different days of the week. Bboying is most popular on Sunday, whereas Hiphop is most popular on Monday, Wednesday and Saturday. This is likely because Bboying and Hiphop are traditionally taught by two different sets of instructors, who likely set different schedules.
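The genre-by-day breakdown above is essentially a cross-tabulation of attendance records. A minimal sketch, assuming each record is a (genre, weekday) pair, is shown below; the function name `peak_days` and the sample records are hypothetical and only illustrate the technique.

```python
from collections import Counter, defaultdict

def peak_days(records):
    """Map each genre to the weekday with its highest attendance count."""
    table = defaultdict(Counter)  # genre -> Counter of weekdays
    for genre, day in records:
        table[genre][day] += 1
    return {genre: days.most_common(1)[0][0] for genre, days in table.items()}

# Made-up attendance records for illustration only.
records = (
    [("Bboying", "Sun")] * 3
    + [("Bboying", "Sat")]
    + [("Hiphop", "Mon")] * 2
    + [("Hiphop", "Sun")]
)

print(peak_days(records))
```

This returns a single peak day per genre; where several days are close, as with Hiphop in the plot, inspecting the full per-genre counts is more informative than the top day alone.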

Bar chart of attendance based on timing

Evening time is also seen in Figure 19 to be the most popular. However, we also note a significant number of null values: 1,461 out of 19,712 records, or about 7.4% of the total dataset. Thus, there is a high level of missing data. Even if we recode all the null values as 830pm (the 3rd most popular slot), it still would not exceed the popularity of 630pm to 730pm. However, in light of our current project scope, we will leave this variable as it is unless we decide to use it in our subsequent analysis.
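The share of missing values quoted above can be checked directly from the two counts in the text. The sketch below also includes a small helper for computing the same share from a column of values, treating `None` as null; the helper name `missing_share` and the sample time slots are assumptions for illustration.

```python
def missing_share(values):
    """Percentage of entries that are None (treated here as null)."""
    values = list(values)
    return 100 * sum(v is None for v in values) / len(values)

# Check of the figure quoted in the text, using its own counts.
print(round(100 * 1461 / 19712, 1))  # 7.4

# Made-up time slots for illustration only.
sample = ["630pm", None, "730pm", "630pm"]
print(missing_share(sample))  # 25.0
```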

Instructor

Bar chart of attendance based on faculty

Larry, the owner, is also observed to be the instructor who teaches the largest number of members. However, we also note the high number of null values: 2,558 out of 19,712 records, or 13% of the total dataset. Instructors are typically tied to genres, so for the purpose of our project we focus on analyzing attendance patterns by dance genre. Hence, in light of our project scope, we will leave this variable as it is, as we are likely to exclude this field from our further analysis.