Difference between revisions of "ANLY482 AY2016-17 T2 Group 2 Project Overview Data Source"

From Analytics Practicum

Revision as of 16:19, 8 January 2017



Preliminary Data Source

To facilitate our initial analysis, Kaisou has provided us with sample datasets that consist of transaction data from November 2016. The three datasets are the musical transaction data, the concert transaction data and the customer profile data.



'''Musical data'''

This dataset contains every transaction record for customer musical purchases. A musical can be either a local or an overseas production.

{|class="wikitable" width="60%"
|-
! width="15%" | Data Field
! Description
|-
! Currency
| The currency in which the transaction was made. Should be “SGD” for all transactions.
|-
! AccDummy
| The account number that made the purchase. The value has been anonymised.
|-
! TicketStatus
| S is for Single and M is for Multiple.
|-
! TicketType
| The type of ticket.
|-
! Channel
| I is for Internet and P is for Phone. We will use this column to identify the channel through which the purchase was made.
|-
! MusicalDate
| The date on which the musical is held.
|-
! QuickPick
| Y means that the machine picked the number, while N means that the customer picked the number.
|-
! DrawNumber
| Unique number for each musical.
|-
! Product
| 23 is a local production and 9 is an overseas production.
|-
! SettleDate
| The settlement date for the purchase.
|-
! Selection
| Seat number that the customer selected.
|-
! TicketDate
| The date of purchase.
|-
! TicketTime
| The time of purchase.
|-
! TicketAmount
| The total amount of the purchase.
|}
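The coded columns in the table above can be translated into readable labels when the data is loaded. The following is a minimal sketch, assuming the musical dataset is available as CSV text whose header row uses the field names listed above. The file layout, the sample rows and the helper names `decode_musical_row` and `load_musical_rows` are illustrative assumptions and are not part of the data provided by Kaisou.

```python
import csv
import io

# Code-to-label mappings taken from the field descriptions in the table above.
CHANNEL_LABELS = {"I": "Internet", "P": "Phone"}
PRODUCT_LABELS = {"23": "Local production", "9": "Overseas production"}
TICKET_STATUS_LABELS = {"S": "Single", "M": "Multiple"}

# Hypothetical sample rows for illustration only; these are not real transactions.
SAMPLE_CSV = """Currency,AccDummy,TicketStatus,Channel,Product,TicketAmount
SGD,A001,S,I,23,50.00
SGD,A002,M,P,9,120.00
"""


def decode_musical_row(row):
    """Return a copy of the row with coded columns replaced by readable labels."""
    decoded = dict(row)
    # Unknown codes are kept unchanged so that unexpected values stay visible.
    decoded["Channel"] = CHANNEL_LABELS.get(row["Channel"], row["Channel"])
    decoded["Product"] = PRODUCT_LABELS.get(row["Product"], row["Product"])
    decoded["TicketStatus"] = TICKET_STATUS_LABELS.get(
        row["TicketStatus"], row["TicketStatus"]
    )
    decoded["TicketAmount"] = float(row["TicketAmount"])
    return decoded


def load_musical_rows(csv_text):
    """Parse CSV text and decode every row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [decode_musical_row(row) for row in reader]


rows = load_musical_rows(SAMPLE_CSV)
for r in rows:
    print(r["AccDummy"], r["Channel"], r["Product"], r["TicketAmount"])
```

Keeping unknown codes unchanged, rather than raising an error, is a design choice for exploratory work: values outside the documented codes remain visible in later summaries.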


'''Concert data'''

This dataset contains every transaction record for customer concert purchases.

{|class="wikitable" width="60%"
|-
! width="15%" | Data Field
! Description
|-
! Currency
| The currency in which the transaction was made. Should be “SGD” for all transactions.
|-
! AccountDummy
| The account number that made the purchase. The value has been anonymised.
|-
! TicketStatus
| A is for Active.
|-
! TicketType
| The type of ticket.
|-
! Channel
| I is for Internet and P is for Phone. We will use this column to identify the channel through which the purchase was made.
|-
! LiveInd
| Y means that the purchase was for a live concert, while N means otherwise.
|-
! TicketType
| The type of ticket.
|-
! LegStatus
|
|-
! MarketName
|
|-
! Odds
|
|-
! SettleDate
| The settlement date for the tickets.
|-
! SettleInfo
|
|-
! TicketDate
| The date of purchase.
|-
! TicketTime
| The time of purchase.
|-
! ArtistCode
| The concert artist that the ticket belongs to.
|-
! TicketAmount
| The total amount of the purchase.
|}
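Because the Currency field is expected to be “SGD” for every transaction, that expectation can be checked before any analysis. The sketch below is one possible check, assuming the concert dataset can be read as CSV text with the field names listed above. The sample rows (including the deliberately wrong `USD` row) and the helper names `find_unexpected_currency` and `count_by_channel` are hypothetical.

```python
import csv
import io
from collections import Counter

EXPECTED_CURRENCY = "SGD"  # stated in the Currency field description

# Hypothetical sample rows for illustration only; these are not real transactions.
SAMPLE_CSV = """Currency,AccountDummy,TicketStatus,Channel,LiveInd,TicketAmount
SGD,A001,A,I,Y,80.00
SGD,A002,A,P,N,45.50
USD,A003,A,I,N,30.00
"""


def find_unexpected_currency(rows):
    """Return the rows whose Currency differs from the expected value."""
    return [row for row in rows if row["Currency"] != EXPECTED_CURRENCY]


def count_by_channel(rows):
    """Count transactions per Channel code (I = Internet, P = Phone)."""
    return Counter(row["Channel"] for row in rows)


concert_rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
bad_rows = find_unexpected_currency(concert_rows)
print(len(bad_rows), "row(s) with an unexpected currency")
print(dict(count_by_channel(concert_rows)))
```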


'''Customer Profile data'''

This dataset consists of each customer’s account number and the associated account details.

{|class="wikitable" width="60%"
|-
! width="15%" | Data Field
! Description
|-
! ACCOUNTNUMBER
| Account number of the customer. Each account number is anonymised in the same way as in the transaction data.
|-
! GENDER
| The gender of the customer.
|-
! NATIONALITY
| The nationality of the customer.
|-
! Age
| The age of the customer.
|-
! New
| 0 means that the customer is an existing customer, while 1 means that the customer is a new customer.
|}
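Since the account numbers are anonymised in the same way across the datasets, transactions can in principle be linked to customer profiles through the account number. The following is a minimal sketch of such a link, assuming the anonymised values match exactly between `AccDummy` (or `AccountDummy` in the concert data) and `ACCOUNTNUMBER`. The sample records, the gender codes and the helper names `join_profiles` and `total_spend_by_new_flag` are hypothetical.

```python
# Hypothetical sample records for illustration only; these are not real data.
transactions = [
    {"AccDummy": "A001", "TicketAmount": 50.0},
    {"AccDummy": "A002", "TicketAmount": 120.0},
    {"AccDummy": "A001", "TicketAmount": 30.0},
    {"AccDummy": "A999", "TicketAmount": 10.0},  # no matching profile
]
profiles = [
    {"ACCOUNTNUMBER": "A001", "GENDER": "F", "Age": 34, "New": 0},
    {"ACCOUNTNUMBER": "A002", "GENDER": "M", "Age": 27, "New": 1},
]


def join_profiles(transactions, profiles, key="AccDummy"):
    """Left-join each transaction to its customer profile by account number."""
    by_account = {p["ACCOUNTNUMBER"]: p for p in profiles}
    joined = []
    for tx in transactions:
        merged = dict(tx)
        # The profile is None when the account has no profile record.
        merged["profile"] = by_account.get(tx[key])
        joined.append(merged)
    return joined


def total_spend_by_new_flag(joined):
    """Sum TicketAmount per value of the New flag, skipping unmatched rows."""
    totals = {}
    for row in joined:
        if row["profile"] is None:
            continue
        flag = row["profile"]["New"]
        totals[flag] = totals.get(flag, 0.0) + row["TicketAmount"]
    return totals


joined = join_profiles(transactions, profiles)
print(total_spend_by_new_flag(joined))
```

For the concert data, the same function could be called with `key="AccountDummy"`. A left join is used here so that transactions without a profile are kept and can be counted, rather than dropped silently.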