ANLY482 AY2017-18T2 Group 25 : Project Findings / Final
Conversion Rate Estimation
Computing open rates and clickthrough rates for Chope's emails was trivial, but estimating conversion rates proved more difficult. The datasets had little information on whether a user made a booking after opening or clicking through an EDM (electronic direct mail), so conversions had to be estimated.
Conversion Definition:
A user who made and completed a booking on the Chope platform after receiving an email
Computation:
Reservations made within 24 hours of a user opening or clicking through an EDM are attributed as conversions
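The 24-hour attribution rule can be sketched roughly as follows. This is a minimal illustration, not the project's actual computation: the function name `attributed_conversions` and the `(user_id, timestamp)` record layout are assumptions, since the structure of Chope's datasets is not described here.

```python
from datetime import datetime, timedelta

# Attribution window taken from the 24-hour rule described above.
ATTRIBUTION_WINDOW = timedelta(hours=24)


def attributed_conversions(email_actions, reservations, window=ATTRIBUTION_WINDOW):
    """Return the set of users with a reservation inside the window.

    email_actions: list of (user_id, action_time) for opens and clickthroughs.
    reservations:  list of (user_id, reservation_time) for completed bookings.
    Both record layouts are hypothetical; the real datasets may differ.
    """
    # Group reservation times by user for quick lookup.
    bookings_by_user = {}
    for user_id, booked_at in reservations:
        bookings_by_user.setdefault(user_id, []).append(booked_at)

    converted = set()
    for user_id, acted_at in email_actions:
        for booked_at in bookings_by_user.get(user_id, []):
            # A booking counts only if it follows the action within the window.
            if timedelta(0) <= booked_at - acted_at <= window:
                converted.add(user_id)
                break
    return converted
```

Bookings made before the open or click, or more than 24 hours after it, are ignored. Dividing the number of converted users by a campaign's audience size would give a rate, but the choice of denominator is not stated in the findings and is left open here.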
Text Mining
The word cloud visualises the most frequently used words and phrases, such as “buffet” and “new additions”.
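The frequencies behind a word cloud like this could be computed along the following lines. This is a sketch under assumptions: it treats the mined text as email subject lines, counts both single words and two-word phrases (so that a term like “new additions” is captured), and uses a hypothetical helper name, `term_frequencies`. No stop-word removal is applied.

```python
import re
from collections import Counter


def term_frequencies(subject_lines, max_ngram=2):
    """Count one-word and two-word terms across a list of subject lines."""
    counts = Counter()
    for line in subject_lines:
        # Lowercase and split into simple alphanumeric tokens.
        tokens = re.findall(r"[a-z0-9']+", line.lower())
        for n in range(1, max_ngram + 1):
            for i in range(len(tokens) - n + 1):
                counts[" ".join(tokens[i:i + n])] += 1
    return counts
```

Calling `term_frequencies(lines).most_common(20)` would then list the 20 most frequent terms, which are the ones a word cloud draws largest.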
Logistic Regression
- The highest-frequency terms were stored in a document-term matrix for use in a logistic regression model
- The logistic regression was fitted by adding variables stepwise, based on p-value thresholds
DV (dependent variable): Binary variable denoting whether the campaign was in the top-performing quartile of campaigns
IVs (independent variables): One binary variable per term, denoting whether the term was used in the subject line
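The DV and IVs described above could be assembled roughly as shown below. This is a sketch under assumptions: the function name `build_model_inputs`, the `(subject_line, performance)` layout, and the use of a single numeric performance score are hypothetical, since the findings do not state which metric defines the top quartile. The stepwise model fit itself is not shown; the resulting matrix would be passed to a statistics package of choice.

```python
from statistics import quantiles


def build_model_inputs(campaigns, terms):
    """Build binary regression inputs from campaign subject lines.

    campaigns: list of (subject_line, performance) pairs (hypothetical layout).
    terms:     candidate terms, e.g. the highest-frequency ones.
    Returns (X, y): X has one row per campaign and one 0/1 column per term;
    y is 1 when the campaign falls in the top-performing quartile.
    """
    scores = [score for _, score in campaigns]
    # Third quartile cut point; campaigns above it form the top quartile.
    q3 = quantiles(scores, n=4)[2]
    y = [1 if score > q3 else 0 for score in scores]
    # Simple case-insensitive substring match; "sale" would also match
    # "wholesale", so a real pipeline may want token-level matching.
    X = [
        [1 if term in subject.lower() else 0 for term in terms]
        for subject, _ in campaigns
    ]
    return X, y
```

Each row of `X` corresponds to one row of a document-term matrix reduced to presence or absence, which matches the binary IV definition above.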
Results:
The terms “voucher”, “new additions” and “birthday” were statistically significant in explaining the better performance of top campaigns.
Other frequently appearing terms, such as “top eats” and “sale”, turned out to be insignificant.
The logistic regression model had an overall accuracy of 76%.