ANLY482 AY2016-17 T1 Group6/Midterm Progress

From Analytics Practicum



Recap & Objectives

Our sponsor, TrustSphere, is a software company that provides relationship analytics solutions. Its products deliver insights that help clients across the globe address key business issues, including sales force effectiveness, enterprise-wide collaboration and corporate governance. The company engaged our team to apply our technical and analytical capabilities to help it understand and tackle its business problem: little growth in sales and a longer-than-ideal sales cycle.

While the field of Sales Analytics has received plenty of attention in the past, recent studies reveal that few companies have delved into the area of Sales People Analytics. Especially in the B2B sphere, salespeople’s communications with potential clients are relied upon wholly to market the company’s product. Furthermore, Steward et al. (2010) found that higher-performing salespeople also regularly activated their internal company networks to coordinate a team of experts tailored to serve a particular customer. Evaluating salespeople on sales figures alone gives a very narrow perspective, as it disregards cycle time and in-progress pitches. Our team has therefore defined our scope as follows: to analyse the sales team’s internal and external communications to gain insight into their relationships with internal and external parties, and to identify the sales stages that act as bottlenecks in the sales process.


Data Provided by Client

For this project, our team is working with two sets of data provided to us by TrustSphere:

A) Daily email communication data (main dataset)

This dataset contains year-to-date records (up to 31 August 2016) of the daily email communications of all 19 TrustSphere salespeople across the globe. It includes the following variables:
  • Date: Date and time at which the email was sent
  • Originator address: Sender email address
  • Recipient address: Receiver email address
  • Direction: Nature of communication (internal, inbound or outbound)
  • MsgID: Unique message ID of emails sent
  • Email Subject: Email subject header


B) Staff List

This dataset lists all 57 TrustSphere staff with the following variables:
  • Name
  • Hierarchy
  • Department
  • Position
  • Location


We were also provided with a Relationship dataset, as mentioned in the proposal, which contains individual records of salespeople’s relationships. However, we are not using this dataset in any of our analyses.


Scope of Work

After repeated interaction with the Sales team, our team decided to divide the scope into the following sections:

1. RELATIONSHIP REPORT
  • Analyses the number and strength of the internal and external relationships that salespeople have developed over the analysis period.
  • Takes into account the frequency and recency of emails exchanged by and with each salesperson to highlight their communication and collaboration effort.


2. CLIENTS AND SALES STAGES
  • Reports the sales progress for each account up to 31 August, that is, how many accounts are active and which stage of the sales cycle they are in.
  • Evaluates the performance of each salesperson based on how many accounts they have in each stage, their response trends in each stage, the progress indicated by historical communication, etc.
  • Provides a postmortem on inactive accounts, reporting how long communications with a client lasted, at what stage communications ended and which salespeople were responsible.


3. SOCIAL NETWORK ANALYSIS
  • Analyses collaboration trends, that is, which departments salespeople interact with most. For example, does interacting with C-Suite employees correlate with better performance?
  • Examines overlap trends to see whether multiple salespeople interact with the same client, and whether abnormalities exist within the overlap.
  • Examines how networks differ by location, and whether there is a communication gap between teams based in different regions.


Methodology

Area of Interest: Relationship Report
Analyses: Data Transformation, Summary Statistics and Visualisations
Purpose:
  • Condense email data into unique Salesperson-Person X relationships
  • Classify whether the relationships are Internal/External
  • Establish the number of emails exchanged in each relationship
  • Establish the timeline of emails exchanged within the relationship

Area of Interest: Clients & Stages
Analyses: Data Transformation, Text Mining, Summary Statistics
Purpose:
  • Condense email data into unique records of email threads (1 record -> 1 email thread)
  • Use text mining to determine which thread each email belongs to
  • Establish the timeline of emails exchanged within an email thread
  • Calculate the average response time within the thread

Area of Interest: Salespeople Network
Analyses: Social Network Analysis
Purpose:
  • Observe which departments salespeople communicate with
  • Observe how salespeople from different locations interact with each other
  • Observe overlap between salespeople

Classifications and Metrics

Following our exploratory analysis of the data and the scope, and repeated meetings with the client, we have established classifications and metrics that ease the reporting and evaluation of sales performance.

STAGES AND CLIENTS

1. Sales Stages

a. Prospecting Stage: When a prospecting email is sent out to a prospective client. They may or may not respond.
b. Meeting Stage: When the client has responded favourably to the prospecting email and TrustSphere has scheduled a pitch meeting with the client.
c. POC Stage: When the client has agreed to commission a product trial.
d. After POC Stage: Follow ups, quotations, contracts etc.

The classification of these stages was provided to us by TrustSphere. Their commission structure rewards salespeople once they get a client into the Meeting and POC stages. Therefore, all emails after the POC stage are classified into the ‘After POC’ stage.


2. Active and Inactive Clients

a. Active Client: Contact has taken place within the past 30 days
b. Inactive Client: No contact has taken place in the past 30 days
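
The 30-day rule can be expressed as a small check. The sketch below is illustrative only: the function name `client_status` and the choice to count contact exactly 30 days ago as active are assumptions, not part of TrustSphere's definition.

```python
from datetime import date

ACTIVE_WINDOW_DAYS = 30  # threshold taken from the classification above


def client_status(last_contact: date, as_of: date) -> str:
    """Return 'Active' if the last contact falls within the past 30 days."""
    days_since_contact = (as_of - last_contact).days
    # Assumption: contact exactly 30 days ago still counts as active.
    return "Active" if days_since_contact <= ACTIVE_WINDOW_DAYS else "Inactive"


# Hypothetical contacts, measured against the 31 August 2016 data cut-off.
cutoff = date(2016, 8, 31)
print(client_status(date(2016, 8, 10), cutoff))  # contact 21 days before cut-off
print(client_status(date(2016, 6, 1), cutoff))   # contact 91 days before cut-off
```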


3. Failed Prospects

a. Failed Prospect: The percentage of prospecting emails sent out that did not make it to the Meeting stage.
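
As a rough illustration of this metric, the percentage could be computed as below. The function name, its two count parameters and the handling of a zero denominator are assumptions made for this sketch.

```python
def failed_prospect_rate(prospecting_sent: int, reached_meeting: int) -> float:
    """Percentage of prospecting emails that never reached the Meeting stage."""
    if prospecting_sent == 0:
        return 0.0  # assumption: report 0% when no prospecting emails were sent
    failed = prospecting_sent - reached_meeting
    return 100.0 * failed / prospecting_sent


# Hypothetical counts: 40 prospecting emails, 10 of which led to a meeting.
print(failed_prospect_rate(40, 10))  # 75.0
```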


RELATIONSHIP REPORT

1. Hot & Cold Relationships

a. Hot Relationship: Last contact was made less than 3 days ago
b. Cold Relationship: Last Contact was made more than 3 days ago

2. Strong & Weak Relationships

a. Strong Relationship: Above average number of emails exchanged AND is a hot relationship
b. Weak Relationship: Below average number of emails exchanged AND/OR Cold Relationship
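
The two relationship classifications combine as follows. This is a minimal sketch: the function names are hypothetical, and because the definitions do not say how a last contact of exactly 3 days ago is treated, the code assumes it counts as cold.

```python
HOT_THRESHOLD_DAYS = 3  # threshold from the Hot & Cold definition


def temperature(days_since_last_contact: int) -> str:
    # Assumption: exactly 3 days since last contact is treated as cold.
    return "Hot" if days_since_last_contact < HOT_THRESHOLD_DAYS else "Cold"


def strength(total_emails: int, average_emails: float,
             days_since_last_contact: int) -> str:
    """Strong only if above-average volume AND hot; otherwise weak."""
    is_hot = temperature(days_since_last_contact) == "Hot"
    return "Strong" if total_emails > average_emails and is_hot else "Weak"


# Hypothetical relationships: (contact, total emails, days since last contact)
relationships = [("a", 40, 1), ("b", 5, 1), ("c", 45, 10)]
average = sum(total for _, total, _ in relationships) / len(relationships)
for contact, total, days in relationships:
    print(contact, temperature(days), strength(total, average, days))
```

Here only the first relationship is both above the average volume and hot, so it is the only one labelled strong.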


Data Cleaning & Transforming

1. SALES STAGES AND CLIENTS


A. REMOVING JUNK AND INTERNAL EMAILS


Mst 1A.png


B. CONDENSING EMAIL RECORDS INTO UNIQUE THREAD RECORDS

Following the data cleaning in the previous step, we reduced the emails to unique threads, i.e. 1 record for each thread instead of 1 record for each email.

Mst 1B.png
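
One way to sketch this condensing step is to normalise each subject line and group emails by salesperson, contact and thread subject. The code below is illustrative rather than the team's actual workflow: the record layout, the function names and the rule that "RE:"/"FW:" prefixes identify emails in the same thread are all assumptions.

```python
import re
from collections import defaultdict

# Assumption: reply/forward prefixes such as "RE:" or "FW:" mark the same thread.
PREFIX = re.compile(r"^\s*((re|fw|fwd)\s*:\s*)+", re.IGNORECASE)


def thread_subject(subject: str) -> str:
    """Strip reply/forward prefixes and normalise case."""
    return PREFIX.sub("", subject).strip().lower()


def condense(emails):
    """emails: dicts with keys salesperson, contact, direction, subject."""
    threads = defaultdict(lambda: {"incoming": 0, "outgoing": 0})
    for email in emails:
        key = (email["salesperson"], email["contact"],
               thread_subject(email["subject"]))
        if email["direction"] == "inbound":
            threads[key]["incoming"] += 1
        elif email["direction"] == "outbound":
            threads[key]["outgoing"] += 1
    return dict(threads)


# Hypothetical emails between one salesperson and one contact.
sample = [
    {"salesperson": "alice", "contact": "bob", "direction": "outbound",
     "subject": "Intro to Relationship Analytics"},
    {"salesperson": "alice", "contact": "bob", "direction": "inbound",
     "subject": "RE: Intro to Relationship Analytics"},
    {"salesperson": "alice", "contact": "bob", "direction": "outbound",
     "subject": "Re: RE: Intro to Relationship Analytics"},
    {"salesperson": "alice", "contact": "bob", "direction": "outbound",
     "subject": "Lunch next week"},
]
for key, counts in condense(sample).items():
    print(key, counts)
```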


C. COMPUTED FIELDS

Mst 1C.png

Of these 20090 unique salesperson – contact – subject thread records, some had replies and some did not. Further steps were needed to remove irrelevant emails.


D. FURTHER REMOVAL OF IRRELEVANT EMAILS

Two key criteria were used to determine whether each unique salesperson – contact communication is relevant:
• Is there at least 1 reply within each unique salesperson – contact relationship?
• Do the subject threads between each unique salesperson – contact relationship contain specific keywords (“Relationship Analytics”, “POC”)?

There needs to be at least 1 reply between a salesperson and a client for the communication to be relevant. The sales cycle consists of 4 stages, and email subject headers may change during these stages. No specific templates or subject headers are used apart from the 2 keywords (“Relationship Analytics”, “POC”), and some subject headers require a reply while others do not. Our team has therefore concluded that within each sales process there should be a minimum of 1 reply between a salesperson and their contact, and we treat a unique salesperson – contact relationship as relevant only if at least 1 reply exists in any of its threads.

The protocol keywords “Relationship Analytics” or “POC” also need to be present in the communication threads for a salesperson-client communication to be relevant.

Although a unique salesperson – contact relationship may have at least 1 reply throughout its email exchange, that exchange may not be relevant to sales or to the process of acquiring a potential client. TrustSphere has a number of important partners, so some of these communications may be email exchanges between TrustSphere and its partners, which are irrelevant. For each unique salesperson – contact – thread record, our team has identified whether the thread contains one of the keywords (“Relationship Analytics”, “POC”). A unique salesperson – contact communication is relevant only if there is at least 1 reply and at least 1 thread with the stated keywords throughout the communication exchange.
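
The two relevance criteria can be combined per salesperson – contact pair, roughly as sketched below. The record layout and function name are hypothetical, and the code assumes that a thread with emails in both directions contains a reply; the team's actual reply detection may differ.

```python
import re
from collections import defaultdict

# The two keywords named above; "poc" is matched as a whole word.
KEYWORDS = re.compile(r"relationship analytics|\bpoc\b", re.IGNORECASE)


def relevant_pairs(threads):
    """threads: dicts with keys salesperson, contact, thread, incoming, outgoing."""
    has_reply = defaultdict(bool)
    has_keyword = defaultdict(bool)
    for t in threads:
        pair = (t["salesperson"], t["contact"])
        # Assumption: traffic in both directions within a thread implies a reply.
        has_reply[pair] |= t["incoming"] > 0 and t["outgoing"] > 0
        has_keyword[pair] |= bool(KEYWORDS.search(t["thread"]))
    # A pair is relevant only if BOTH criteria hold somewhere in its threads.
    return {pair for pair in has_reply if has_reply[pair] and has_keyword[pair]}


# Hypothetical thread records.
sample_threads = [
    {"salesperson": "alice", "contact": "bob", "thread": "poc kickoff",
     "incoming": 0, "outgoing": 1},
    {"salesperson": "alice", "contact": "bob", "thread": "lunch",
     "incoming": 1, "outgoing": 1},
    {"salesperson": "alice", "contact": "carol",
     "thread": "relationship analytics intro", "incoming": 0, "outgoing": 2},
    {"salesperson": "alice", "contact": "dave", "thread": "partner newsletter",
     "incoming": 3, "outgoing": 2},
]
print(relevant_pairs(sample_threads))
```

In this sample only the first pair passes: the second has a keyword but no reply, and the third has replies but no keyword.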


E. DETERMINE STAGES

At the end of these steps, a total of 1814 / 20090 unique salesperson – contact – thread records are identified as relevant. These unique records contain a total of 5864 emails sent and received. Our next step is to look at each unique salesperson – contact communication and determine its sales stage. TrustSphere has provided the team with a list of their partners. Our team has categorised each of these 1814 unique salesperson – contact – thread records into one of the following groups:
1. Prospecting
2. Meeting
3. POC
4. After POC

The team will consult with TrustSphere to ensure these categorisations are accurate. The team will also explore the use of text analytics for further insights and alternative categorisation methodologies.


F. FINAL DATASET

Salesperson: TrustSphere Sales Staff
Contact: Email address of person a salesperson is in communication with
Thread: ThreadSubject
Incoming count: Total count of inbound emails in the thread
Outgoing count: Total count of outbound emails in the thread
Date First In / Out: Date email is first sent or received for a particular thread
Date Last In: Date email is last received for a particular thread
Date Last Out: Date email is last sent for a particular thread
Average Response Time: Average time taken for an email reply
Total Response Time: Total time taken for all email replies
Relevant: True if a Salesperson – Contact communication has at least 1 reply in their entire email communication. False if otherwise.
Keyword: True if a Salesperson – Contact communication has at least 1 thread with the keywords “relationship analytics” or “poc” in their entire email communication. False if otherwise.
Partner: True if Contact is a TrustSphere partner. False if otherwise.
Phase: Stage of the sales process (Prospecting, Meeting, POC, After POC)
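
The Average Response Time and Total Response Time fields could be derived per thread along the following lines. This is a sketch under an assumed definition: a response is any email whose direction differs from that of the email immediately before it in the thread, and the response time is the gap between the two. The function name and input layout are hypothetical.

```python
from datetime import datetime, timedelta


def response_times(emails):
    """emails: list of (timestamp, direction) tuples for one thread.

    Returns (average, total) response time; average is None if no replies.
    """
    ordered = sorted(emails)
    gaps = []
    for (prev_time, prev_dir), (time, direction) in zip(ordered, ordered[1:]):
        # Assumption: a change of direction marks a reply to the previous email.
        if direction != prev_dir:
            gaps.append(time - prev_time)
    total = sum(gaps, timedelta())
    average = total / len(gaps) if gaps else None
    return average, total


# Hypothetical thread: two replies, 4 hours and 2 hours after the prior email.
thread = [
    (datetime(2016, 8, 1, 9, 0), "outbound"),
    (datetime(2016, 8, 1, 13, 0), "inbound"),
    (datetime(2016, 8, 1, 14, 0), "inbound"),
    (datetime(2016, 8, 1, 16, 0), "outbound"),
]
average, total = response_times(thread)
print(average, total)
```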


2. RELATIONSHIP REPORT


A. RAW DATASET

Mst 2A.jpg


B. DATA TRANSFORMATION

The data was then transformed from 1 record per email communication between an originator and a receiver to 1 record per salesperson – contact person relationship. The computed columns include the number of outbound and inbound emails sent and received between the salesperson and the contact person, and the number of days since the salesperson last sent an email to, and last received an email from, the contact person.

Mst 2B.png
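
A minimal sketch of this transformation is shown below. The tuple layout, the function name and the use of a known set of salesperson addresses to orient each email are assumptions for illustration; the relationship Type (internal/external) is left out.

```python
from datetime import date


def build_relationships(emails, salespeople, as_of):
    """emails: iterable of (sent_on, originator, recipient) tuples."""
    rel = {}

    def record(salesperson, contact, kind, sent_on):
        row = rel.setdefault(
            (salesperson, contact),
            {"outbound": 0, "inbound": 0, "last_out": None, "last_in": None},
        )
        row[kind] += 1
        days_ago = (as_of - sent_on).days
        last_key = "last_out" if kind == "outbound" else "last_in"
        # Keep the most recent email, i.e. the smallest number of days ago.
        if row[last_key] is None or days_ago < row[last_key]:
            row[last_key] = days_ago

    for sent_on, originator, recipient in emails:
        if originator in salespeople:
            record(originator, recipient, "outbound", sent_on)
        if recipient in salespeople:
            record(recipient, originator, "inbound", sent_on)

    for row in rel.values():
        row["total"] = row["outbound"] + row["inbound"]
        known = [d for d in (row["last_out"], row["last_in"]) if d is not None]
        row["last_contact"] = min(known)  # the lesser of Last In and Last Out
    return rel


# Hypothetical emails, measured against the 31 August 2016 cut-off.
sample_emails = [
    (date(2016, 8, 20), "alice@example.com", "bob@client.example"),
    (date(2016, 8, 25), "bob@client.example", "alice@example.com"),
    (date(2016, 8, 29), "alice@example.com", "bob@client.example"),
]
relationships = build_relationships(
    sample_emails, {"alice@example.com"}, date(2016, 8, 31)
)
print(relationships)
```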


C. FINAL DATASET

Originator Address: Sales Person
Receiver Address: Contact Person
Type: Whether it is an external/internal relationship
Outbound Count: Number of Emails sent from Salesperson to Contact Person
Inbound Count: Number of Emails received by Salesperson from Contact Person
Total Frequency: Sum of Inbound and Outbound Count. Total emails exchanged between Salesperson and Contact Person
Last Out: Number of days ago the last email was sent from Salesperson to Contact Person
Last In: Number of days ago the last email was received by Salesperson from Contact Person
Last Contact: The lesser of Last in and Last Out. The number of days ago the last communication was sent or received.

Table: The month the record belongs to

Hot/Cold: Whether the relationship is hot or cold (Refer to Classifications and Metrics)
Strong/Weak: Whether the relationship is strong or weak (Refer to Classifications and Metrics)



Future Tasks and Deliverables

Now that our datasets are ready, our next steps are to:

A. Calculate and visualise metrics from the above datasets to gather insights
  • Using SAS, JMP and Tableau, we will calculate and visualise these metrics on a per-month basis to determine time series and seasonality trends and to identify any red flags in salesperson performance or the sales cycle process.
B. Create a dashboard to report insights to the Sales Manager
  • Using JavaScript and D3, we will create a web-based dashboard to report insights to the Sales Manager/Director. (Mock-up created in PowerPoint, Appendix B)