Difference between revisions of "APA Project Overview"

From Analytics Practicum

Revision as of 22:56, 8 January 2017

Motivation

Deloitte University Press has rated people analytics as the second-biggest overall capability gap in organizations. Through people analytics, companies are able to find better hires, improve retention, and find more suitable leaders. This has a direct impact on the direction of the organization and hence its growth, so business leaders want to keep a check on this area of the firm. In this project, we focus on four main categories of analysis, answering several questions and developing several metrics under each. The four categories are given below, with some of the questions we aim to answer. Over the course of the project, we will add more relevant questions to the project scope.
Network Strengths

  • Find the number of internal and external relationships of every employee, broken down by strength and date (later dates indicate recent communication). This insight will show which employees, departments and locations are best at building and nurturing a large number of relationships internally and externally.
Influence
  • Identify the top 10 employees who influence information flow within the organization. This insight can help identify agents for change as well as pinpoint employees who are overly relied on in the organization’s structure.
  • Find the social networks of junior employees with colleagues in managerial positions. This insight will give an idea of which employees managers are likely to turn to for advice or to trust with issues.
Collaboration
  • Measure interaction within and between departments, employees and geographies. This will also highlight individual employees who collaborate well within the organization.
  • Quantify the value an employee adds to the internal and external network. This can be used to anticipate the impact a departing employee will have on the network (the number of relationships that will be lost as a result of the employee leaving).
  • Quantify a manager’s effectiveness at building relationships with his or her team and the whole organization. This behavior can then be benchmarked against ideal performers in the organization.
  • Identify potential leaders within the organization. Research has shown that employees who build strong relationships with an organization’s various departments often possess leadership potential.
  • Find employees who like to work in silos.
Email Analysis
  • Find the average number of emails sent and received by an employee on a daily or weekly basis. How connected are these employees? Calculate a departmental average to understand collaboration between departments as well.
Answers to such questions usually rely on qualitative observations, as few standard metrics can be used directly to track these details. Because they are derived this way, the results are often considered unreliable and hence non-actionable. The main motivation of our project is to help TrustSphere derive reliable and actionable insights through a data-driven approach.
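As an illustration of how one of the Email Analysis questions could be turned into a metric, the following Python sketch computes the average number of messages each sender sends per day. It is only a sketch under stated assumptions: the function name `daily_average_sent`, the (date, sender) input shape and the sample addresses are hypothetical, and the average is taken over days on which the sender sent at least one message, which is one possible definition among several.

```python
from collections import defaultdict
from datetime import date


def daily_average_sent(records):
    """Average messages sent per active day, per sender.

    `records` is an iterable of (sent_on, sender) pairs. Days on which a
    sender sent nothing are not counted (an assumption of this sketch).
    """
    per_day = defaultdict(lambda: defaultdict(int))
    for sent_on, sender in records:
        per_day[sender][sent_on] += 1
    return {
        sender: sum(days.values()) / len(days)
        for sender, days in per_day.items()
    }


# Made-up records, for illustration only.
records = [
    (date(2017, 1, 3), "alice@example.com"),
    (date(2017, 1, 3), "alice@example.com"),
    (date(2017, 1, 4), "alice@example.com"),
    (date(2017, 1, 3), "bob@example.com"),
]
averages = daily_average_sent(records)
```

A departmental average could be obtained the same way by grouping on a department label instead of the sender, provided such a mapping is available.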

Objective

The primary objective of our project is to use social network analysis theories to create a hybrid centrality scoring method, along with other metrics to assess networks, influence, collaboration and email activity of employees. Additionally, we will build a comprehensive dashboard to provide TrustSphere with a clear and structured platform to view the generated metrics.

Data

We are provided with an Excel sheet containing a large log of email exchanges via the TrustSphere domain. The data provided is clean (a screenshot of the data is shown below).
We will be collecting more data through a survey sent out to all employees of TrustSphere.

Data sample.png

The data consists of the following attributes:
  1. Date: date the email was sent/received
  2. Originator address: email address of the sender
  3. Recipient address: email address of the recipient
  4. Direction:
    a. ‘Inbound’ – email received by an employee of TrustSphere from an external sender
    b. ‘Outbound’ – email sent by an employee of TrustSphere to an external recipient
    c. ‘Internal’ – email exchanged between TrustSphere employees
  5. Type:
    a. ‘em’ – message sent via email
    b. ‘im’ – message sent via instant messaging
  6. Size: number of characters in the message
  7. Msg ID: unique ID given to every email chain
  8. Email Subject: subject of the email
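
The attributes above could be represented in code roughly as follows. This is a minimal Python sketch, assuming the sheet has been exported to CSV with column headers matching the attribute names and with ISO-formatted dates; the actual headers and date format are not stated here, and the `Message` class, the `read_log` function and the sample rows are hypothetical.

```python
import csv
import io
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Message:
    sent_on: date
    originator: str
    recipient: str
    direction: str  # 'Inbound', 'Outbound' or 'Internal'
    kind: str       # 'em' (email) or 'im' (instant message)
    size: int       # number of characters in the message
    msg_id: str
    subject: str


def read_log(handle):
    """Parse a CSV export of the log into Message records."""
    messages = []
    for row in csv.DictReader(handle):
        messages.append(Message(
            sent_on=date.fromisoformat(row["Date"]),  # assumed ISO dates
            originator=row["Originator address"].strip().lower(),
            recipient=row["Recipient address"].strip().lower(),
            direction=row["Direction"],
            kind=row["Type"],
            size=int(row["Size"]),
            msg_id=row["Msg ID"],
            subject=row["Email Subject"],
        ))
    return messages


# Made-up sample rows, for illustration only.
SAMPLE = """Date,Originator address,Recipient address,Direction,Type,Size,Msg ID,Email Subject
2017-01-03,alice@example.com,bob@example.com,Internal,em,120,m1,Kickoff
2017-01-03,bob@example.com,carol@example.org,Outbound,em,80,m2,Quote
"""

messages = read_log(io.StringIO(SAMPLE))
```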

METHODOLOGY

We plan to take a social network analysis (SNA) approach to the data, since the goal is to analyze different attributes (preference to work in silos, importance, popularity) of the actors (employees) and the relationships between them (collaboration via email). Viewed this way, the data is an ideal social network dataset: each email address represents a node and every email represents a relationship between two nodes.

SN diagram.png
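
The node-and-relationship view described above can be sketched in a few lines of Python. The sketch assumes the log has already been reduced to (sender, recipient) pairs, and it uses the number of messages between a pair as the relationship strength, which is one plausible weighting and not necessarily the one the project will adopt. The name `build_edges` and the sample addresses are hypothetical.

```python
from collections import Counter


def build_edges(pairs):
    """Count messages per directed (sender, recipient) pair.

    The count serves as the edge weight (an assumption of this sketch).
    """
    edges = Counter()
    for sender, recipient in pairs:
        if sender != recipient:  # ignore self-addressed messages
            edges[(sender, recipient)] += 1
    return edges


# Made-up pairs, for illustration only.
pairs = [
    ("alice", "bob"),
    ("alice", "bob"),
    ("bob", "alice"),
    ("bob", "carol"),
]
edges = build_edges(pairs)
nodes = sorted({address for edge in edges for address in edge})
```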



Various SNA measures have been proposed over the years to help determine the role and importance of a node in the network. The following are a few examples:

• Degree centrality
• Closeness centrality
• Betweenness centrality
• Eigenvector centrality
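
To make the first two measures concrete, here is a small Python sketch that uses only the standard library. It treats the network as undirected and unweighted, and assumes it is connected, which are simplifications. Betweenness and eigenvector centrality are omitted for brevity. The function names and the four-person example are hypothetical.

```python
from collections import deque


def neighbours(edges):
    """Undirected adjacency sets from an iterable of (a, b) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def degree_centrality(adj):
    """Fraction of the other nodes each node is directly connected to."""
    n = len(adj)
    if n < 2:
        return {v: 0.0 for v in adj}
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}


def closeness_centrality(adj):
    """Reachable nodes divided by the sum of shortest-path lengths."""
    result = {}
    for source in adj:
        # Breadth-first search gives shortest path lengths in hops.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


# Made-up star network with bob at the centre, for illustration only.
adj = neighbours([("alice", "bob"), ("bob", "carol"), ("bob", "dave")])
degree = degree_centrality(adj)
closeness = closeness_centrality(adj)
```

In this example bob is directly connected to everyone, so he scores highest on both measures.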

Our first step will be to explore the network using these centralities, identifying hubs, brokers and groups as well as delving into other SNA concepts discussed by various academic researchers. We will use software such as Gephi, along with R modelling packages such as those for weighted networks, for these analyses.

Our goal is to come up with our own hybrid centrality score that quantifies the overall importance of each node in the network. Using the insights from step one, we will create multiple surveys (a source of additional data) for the employees at TrustSphere to find influences (for the hybrid centrality). During the process, we will refer to the work of Karen Stephenson and Rob Cross, both of whom specialize in the field of organizational social networks.
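
The form of the hybrid score is still to be determined. Purely as an illustration of what combining several measures into one number can look like, the Python sketch below rescales each measure to the range 0 to 1 and takes a weighted average. This is an assumed construction and not the project's method; the function names, the equal weights and the sample values are all hypothetical.

```python
def min_max(scores):
    """Rescale a {node: value} mapping to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {node: 0.0 for node in scores}
    return {node: (value - lo) / (hi - lo) for node, value in scores.items()}


def hybrid_score(measures, weights):
    """Weighted average of min-max normalized centrality measures.

    `measures` maps a measure name to a {node: value} mapping; every
    measure is assumed to cover the same set of nodes.
    """
    total_weight = sum(weights.values())
    normalized = {name: min_max(scores) for name, scores in measures.items()}
    nodes = next(iter(measures.values())).keys()
    return {
        node: sum(weights[name] * normalized[name][node] for name in measures)
        / total_weight
        for node in nodes
    }


# Made-up centrality values, for illustration only.
measures = {
    "degree": {"alice": 1, "bob": 3, "carol": 2},
    "betweenness": {"alice": 0.0, "bob": 1.0, "carol": 0.0},
}
weights = {"degree": 1.0, "betweenness": 1.0}  # hypothetical equal weights
scores = hybrid_score(measures, weights)
```

Survey results could enter such a scheme as one more measure with its own weight, though how they are actually used depends on what the surveys collect.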

In the end, along with the hybrid centrality score algorithm, we will be delivering a comprehensive dynamic dashboard visualizing the most relevant measures that we identify during our project.

SCOPE OF WORK

1. Create a hybrid centrality score as an overall comprehensive measure of the network
2. Identify Silos
3. Assess Collaboration
  a. Within departments
  b. Within different geographical regions
  c. Within projects
4. Assess Influence, Network Strength and Email collaboration
5. Develop a dynamic dashboard to visualize relevant measures
6. The scope is fluid and will become more specific as the project progresses