Difference between revisions of "APA Final Progress"
Line 225: | Line 225: | ||
'''Email distribution for associates'''<br> | '''Email distribution for associates'''<br> | ||
Every Bubble indicates number of emails sent by an employee on a single day. The size represents the number of emails for the day. It can be observed that most associates seem to have regular email communication. It can also be observed that Bersileus, Hansel and Lemuel hardly have any communication. This is because they are new to the company. <br> | Every Bubble indicates number of emails sent by an employee on a single day. The size represents the number of emails for the day. It can be observed that most associates seem to have regular email communication. It can also be observed that Bersileus, Hansel and Lemuel hardly have any communication. This is because they are new to the company. <br> | ||
− | [[Image:associates.png| | + | [[Image:associates.png|400px]] <br> |
'''Email distribution for the C-Suite'''<br> | '''Email distribution for the C-Suite'''<br> | ||
Every bubble indicates number of emails sent by an employee on a single day. The size represents the number of emails for the day. It can be observed that all C-Suite employees seem to have regular email communication. The instances of huge bubbles are possibly instances of mass emails sent to the company by the C-suite employees.<br> | Every bubble indicates number of emails sent by an employee on a single day. The size represents the number of emails for the day. It can be observed that all C-Suite employees seem to have regular email communication. The instances of huge bubbles are possibly instances of mass emails sent to the company by the C-suite employees.<br> | ||
− | [[Image:csuite.png| | + | [[Image:csuite.png|400px]]<br> |
'''Average size of email by Hierarchy'''<br> | '''Average size of email by Hierarchy'''<br> | ||
Size of an email is a potential indicator of the amount of email content. Figure 10 below shows who sends the largest emails among different hierarchies. Associates seem to have the highest email size. This can be explained by their relatively more hands-on technical work which involves a lot of data. Operational has the lowest size due to their independent nature and role at the company. <br> | Size of an email is a potential indicator of the amount of email content. Figure 10 below shows who sends the largest emails among different hierarchies. Associates seem to have the highest email size. This can be explained by their relatively more hands-on technical work which involves a lot of data. Operational has the lowest size due to their independent nature and role at the company. <br> | ||
− | [[Image:hier.png| | + | [[Image:hier.png|400px]]<br><br> |
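The comparison of average email size by hierarchy can be sketched as a simple group-and-average step. This is only an illustration: the records, the size unit (KB), and the function name `average_size_by_hierarchy` are assumptions, not the project data or code.

```python
from collections import defaultdict

# Hypothetical records: (hierarchy level, email size in KB). Not project data.
emails = [
    ("Associate", 120.0),
    ("Associate", 80.0),
    ("C-Suite", 40.0),
    ("Operational", 10.0),
    ("Operational", 30.0),
]

def average_size_by_hierarchy(records):
    """Return the mean email size for each hierarchy level."""
    totals = defaultdict(lambda: [0.0, 0])  # level -> [sum of sizes, count]
    for level, size in records:
        totals[level][0] += size
        totals[level][1] += 1
    return {level: total / count for level, (total, count) in totals.items()}

averages = average_size_by_hierarchy(emails)
for level, avg in sorted(averages.items()):
    print(level, avg)
```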
<big>'''Centralities Without Edge Weights'''</big><br> | <big>'''Centralities Without Edge Weights'''</big><br> | ||
'''Eigenvector and Betweenness Centrality''' | '''Eigenvector and Betweenness Centrality''' | ||
A social network was created using email data where the edges had no weights. Eigenvector and betweenness centrality were applied to this network as visible in the figure below. <br> | A social network was created using email data where the edges had no weights. Eigenvector and betweenness centrality were applied to this network as visible in the figure below. <br> | ||
− | [[Image:eigbet.png| | + | [[Image:eigbet.png|400px]]<br> |
Eigenvector Centrality is a measure of the influence of a node in a network. It is a global measure. Betweenness centrality of a node reflects the amount of control that this node exerts over the interactions of other nodes in the network (Information Flow). It is a relatively localized measure. People with high eigenvector centralities and low betweenness centralities may be connected to highly influential individuals. | Eigenvector Centrality is a measure of the influence of a node in a network. It is a global measure. Betweenness centrality of a node reflects the amount of control that this node exerts over the interactions of other nodes in the network (Information Flow). It is a relatively localized measure. People with high eigenvector centralities and low betweenness centralities may be connected to highly influential individuals. | ||
Arun Sundar (Chief Strategy Officer) and Manish Goel (Chief Executive Officer) have high centralities as they are the most important people at the firm. | Arun Sundar (Chief Strategy Officer) and Manish Goel (Chief Executive Officer) have high centralities as they are the most important people at the firm. | ||
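To make the two measures concrete, the sketch below computes both on a small unweighted, undirected toy graph. It is an illustration under assumptions, not the tool used in this study: the node labels and edges are hypothetical, eigenvector centrality is approximated by power iteration on A + I (a common trick to avoid oscillation on bipartite graphs), and betweenness uses Brandes' algorithm without normalization.

```python
import math
from collections import deque

# Hypothetical toy network: a triangle (ceo, cso, a1) with a tail a1-a2-a3.
edges = [("ceo", "cso"), ("ceo", "a1"), ("cso", "a1"), ("a1", "a2"), ("a2", "a3")]
adj = {}
for u, v in edges:
    adj.setdefault(u, []).append(v)
    adj.setdefault(v, []).append(u)

def eigenvector_centrality(adj, iters=500, tol=1e-12):
    """Power iteration on A + I; returns a unit-length score vector."""
    x = {v: 1.0 for v in adj}
    for _ in range(iters):
        new = {v: x[v] + sum(x[w] for w in adj[v]) for v in adj}
        norm = math.sqrt(sum(val * val for val in new.values()))
        new = {v: val / norm for v, val in new.items()}
        if sum(abs(new[v] - x[v]) for v in adj) < tol:
            return new
        x = new
    return x

def betweenness_centrality(adj):
    """Brandes' algorithm for unweighted, undirected graphs (unnormalized)."""
    bet = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                  # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                  # accumulate dependencies in reverse order
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bet[w] += delta[w]
    # Each unordered pair was counted from both endpoints, so halve the totals.
    return {v: b / 2 for v, b in bet.items()}

eig = eigenvector_centrality(adj)
bet = betweenness_centrality(adj)
for node in adj:
    print(node, round(eig[node], 3), bet[node])
```

In this toy graph the two triangle nodes `ceo` and `cso` have sizeable eigenvector scores but zero betweenness, since no shortest path runs through them, while `a2` has a lower eigenvector score yet bridges the tail to the rest of the network.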
− | Furthermore, Associate Hana Owens who has relatively high global connections (eigenvector centrality), is highly connected to the high-level employees. This is also evident in the survey results. A social network was created using the survey data where employees answered how much they work with each other, i.e. work network. In the network below | + | Furthermore, Associate Hana Owens who has relatively high global connections (eigenvector centrality), is highly connected to the high-level employees. This is also evident in the survey results. A social network was created using the survey data where employees answered how much they work with each other, i.e. work network. In the network below, size of the node is the betweenness centrality. The filters applied are 1) tie weight >= 3 (Strong) (Scale 1-5), and 2) mutual edges. <br> |
− | [[Image:hana.png| | + | [[Image:hana.png|400px]]<br> |
− | Figure above shows that Hana Owens is connected to high level employees like Manish Goel and Adesh Goel. | + | Figure above shows that Hana Owens is connected to high level employees like Manish Goel and Adesh Goel. <br> |
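The two filters applied to the work network can be sketched as follows. The ratings and employee labels are hypothetical, and the sketch assumes that "mutual edges" means both directions must survive the tie-weight filter; the actual tool may define mutuality differently.

```python
# Hypothetical survey answers: (rater, ratee) -> tie weight on the 1-5 scale.
ratings = {
    ("emp_a", "emp_b"): 4,
    ("emp_b", "emp_a"): 3,
    ("emp_a", "emp_c"): 5,
    ("emp_c", "emp_a"): 2,  # below the threshold, so the a-c tie is dropped
    ("emp_b", "emp_c"): 4,  # emp_c did not rate emp_b, so this tie is not mutual
}

def strong_mutual_ties(ratings, threshold=3):
    """Keep ties rated >= threshold in both directions (assumed meaning of 'mutual')."""
    strong = {pair for pair, weight in ratings.items() if weight >= threshold}
    return {frozenset((a, b)) for a, b in strong if a != b and (b, a) in strong}

ties = strong_mutual_ties(ratings)
print(sorted(sorted(tie) for tie in ties))
```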
− | + | '''Degree Centrality'''<br> | |
+ | Degree centrality was applied to the social network created using email data where the edges had no weights and the graph below was derived. It can be observed that the degree centrality results are similar to the in-degree and out-degree results. Thus, the study will just consider degree centrality for social network comparison instead of focusing on in-degree and out-degree. | ||
+ | [[Image:deg.png|400px]]<br> | ||
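As a rough illustration of how the three degree measures relate, the sketch below counts them on a set of directed, unweighted email edges. The edges are hypothetical, and it assumes the usual directed-graph convention that degree is the sum of in-degree and out-degree.

```python
from collections import Counter

# Hypothetical directed email edges (sender, recipient); unweighted, so each
# sender-recipient pair appears at most once regardless of email volume.
edges = {("a", "b"), ("b", "a"), ("a", "c"), ("d", "a")}

def degree_measures(edges):
    """Return in-degree, out-degree and total degree for every node."""
    out_deg = Counter(sender for sender, _ in edges)
    in_deg = Counter(recipient for _, recipient in edges)
    nodes = set(out_deg) | set(in_deg)
    return {
        n: {"in": in_deg[n], "out": out_deg[n], "degree": in_deg[n] + out_deg[n]}
        for n in nodes
    }

degrees = degree_measures(edges)
for node in sorted(degrees):
    print(node, degrees[node])
```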
+ | '''Closeness Centrality''' <br> | ||
+ | Closeness centrality was applied to the social network created using email data where the edges had no weights and the figure below was derived. It can be observed that closeness centrality shows very little variation among the employees. Thus, the study will not consider closeness centrality for social network comparison. | ||
+ | [[Image:clo.png|400px]]<br> | ||
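For reference, closeness centrality on an unweighted network can be sketched with a breadth-first search from each node. The path graph below is hypothetical, and the sketch assumes a connected, undirected network; it is not the implementation used to produce the figure.

```python
from collections import deque

# Hypothetical path graph a - b - c - d (undirected, unweighted).
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}

def closeness_centrality(adj):
    """(reachable nodes - 1) / sum of shortest-path distances, per node."""
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search gives hop distances
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return result

closeness = closeness_centrality(adj)
for node in sorted(closeness):
    print(node, closeness[node])
```

In a small, densely connected email network most employees are only one or two hops apart, so these scores tend to cluster together.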
+ | '''Correlation Analysis'''<br> | ||
+ | [[Image:corr.png|600px]]<br> | ||
+ | R-square of 0.895 between Eigenvector and Degree Centrality shows a very high correlation. Since Eigenvector gives a global view, it is preferable over degree centrality as a measure of analysis. Therefore, there is no need to consider degree centrality as an individual analysis. | ||
+ | R-square of 0.5336 between Eigenvector and Betweenness Centrality shows a weak correlation. Therefore, we will consider both eigenvector and betweenness measures for separate analysis. | ||
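The R-square between two centrality measures can be sketched as the squared Pearson correlation of the per-employee scores, which equals the R-square of a simple linear fit of one measure on the other. The score lists below are made-up numbers, not the study's results, and `r_squared` is a hypothetical helper name.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equally long score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)

# Made-up centrality scores for three employees, for illustration only.
measure_one = [1.0, 2.0, 3.0]
measure_two = [1.0, 3.0, 2.0]
print(r_squared(measure_one, measure_two))
```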
Line 258: | Line 267: | ||
'''Target Sample:''' All employees in the company (across geographies)<br> | '''Target Sample:''' All employees in the company (across geographies)<br> | ||
'''Aim:''' To use the survey to validate if an email exchange network is a good tool to calculate influence score. We define Influence Score as the extent to which an individual sways information flow in the workplace. <br> | '''Aim:''' To use the survey to validate if an email exchange network is a good tool to calculate influence score. We define Influence Score as the extent to which an individual sways information flow in the workplace. <br> | ||
− | '''Summary:''' The purpose of the survey is to validate the use email exchange network for calculating influence | + | '''Summary:''' The purpose of the survey is to validate the use of an email exchange network for calculating a combination of collaboration and influence.<br> |
− | + | Work Network: We define work network as the network of employees with whom one interacts on a daily basis for work purposes. |
<br> | <br> | ||
− | <big>'''You may view the'''</big> | + | <big>'''You may view the'''</big>[https://smusg.asia.qualtrics.com/jfe/form/SV_6eVxySZKg8NAW2N <font face ="Century Gothic" color="#00C5CD"><strong><i><big>survey here.</big></i></strong></font>] |
− | [https://smusg.asia.qualtrics.com/jfe/form/SV_6eVxySZKg8NAW2N <font face ="Century Gothic" color="#00C5CD"><strong><i><big>survey here.</big></i></strong></font>] | ||
Revision as of 14:54, 22 April 2017
Data Email Data
Cleaning Email Data
4. Removing unnecessary columns such as:
After the cleaning of data, there were 29,797 rows of data with no missing data instances.
Staff and Email Data Comparison
Cleaning survey data