Network Analysis of Interlocking Directorates/Findings Insights

From Analytics Practicum
Revision as of 18:06, 14 February 2015 by Tw.zheng.2011 (talk | contribs) (Added data preparation)


Data Files

The data set extracted from OneSource consists of two files: a list of companies in Singapore and a list of executives in those companies.

List of Companies

This file contains a list of companies currently operating in Singapore and their relevant business information. The initial extraction produced 66,023 rows, but only 50,677 rows remained after we removed the duplicates introduced by the selection process. Each row carries 22 company-level attributes, such as industry classification and parent company. The most important attribute is the company name, which is used to join this list to the list of executives. The industry classification, parent company and country information will be used to find relationships during our social network analysis (SNA).

While the list produced by the initial extraction was extensive, several columns had missing data. In particular, many companies had empty cells in the parent company and parent country columns. As this information is pertinent to our project, we ran a basic internet search on the companies' profiles to fill in as many of the gaps as we could. For companies on which we could not find information, the team assumes that they have no parent company and that their parent country is Singapore. We consider this a valid assumption because most of the companies missing the information were small private Singapore companies.
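The fallback rule above can be sketched in a few lines of Python. This is only an illustration of the stated assumption, not the team's actual procedure: the column names (`company_name`, `parent_company`, `parent_country`) and the `parent_assumed` flag are hypothetical and do not come from the OneSource export.

```python
# Hypothetical column names; the real OneSource headers may differ.
DEFAULT_PARENT_COUNTRY = "Singapore"


def fill_missing_parents(companies):
    """Return copies of the rows with blank parent fields filled by the stated assumption."""
    filled = []
    for row in companies:
        row = dict(row)  # copy so the caller's rows are not mutated
        parent = (row.get("parent_company") or "").strip()
        country = (row.get("parent_country") or "").strip()
        # Flag rows where at least one field was assumed rather than sourced.
        row["parent_assumed"] = not parent or not country
        row["parent_company"] = parent or None  # None means "no parent company"
        row["parent_country"] = country or DEFAULT_PARENT_COUNTRY
        filled.append(row)
    return filled


sample = [
    {"company_name": "Alpha Pte Ltd", "parent_company": "", "parent_country": ""},
    {"company_name": "Beta Singapore", "parent_company": "Beta Holdings", "parent_country": "Japan"},
]
filled = fill_missing_parents(sample)
```

Keeping a flag such as `parent_assumed` makes it possible to separate assumed values from sourced ones later, for example when checking how sensitive a parent-country finding is to the assumption.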

List of Executives

This file contains the personal details and titles of executives currently working in the companies above. The initial extraction produced 117,370 rows, which we reduced to 79,330 rows after removing duplicate entries. The list contains 16 attributes, of which only five will be used. Three of these make up the executive's name and will be the basis of our edges in the SNA. The company name is used to join our two datasets, and the executive titles may help us draw inferences for our conclusion.
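As a rough sketch of how the name attributes could yield edges, the snippet below links two companies whenever the same executive name appears in both. The field names (`first_name`, `middle_name`, `last_name`, `company_name`) are hypothetical, and matching on names alone is an assumption: two different people who share a name would be merged into one.

```python
from collections import defaultdict
from itertools import combinations

NAME_FIELDS = ("first_name", "middle_name", "last_name")  # hypothetical headers


def interlock_edges(executives):
    """Map each company pair to the number of executive names they share."""
    companies_by_person = defaultdict(set)
    for row in executives:
        # Normalise the three name parts into a single comparison key.
        key = tuple(row[field].strip().lower() for field in NAME_FIELDS)
        # A set ignores duplicate (person, company) rows.
        companies_by_person[key].add(row["company_name"])

    edges = defaultdict(int)
    for companies in companies_by_person.values():
        # Sorting gives every undirected pair one canonical ordering.
        for a, b in combinations(sorted(companies), 2):
            edges[(a, b)] += 1
    return dict(edges)


sample = [
    {"first_name": "Wei", "middle_name": "Ming", "last_name": "Tan", "company_name": "Alpha Pte Ltd"},
    {"first_name": "Wei", "middle_name": "Ming", "last_name": "Tan", "company_name": "Beta Holdings"},
    {"first_name": "wei", "middle_name": "ming", "last_name": "tan", "company_name": "Beta Holdings"},
    {"first_name": "Hui", "middle_name": "", "last_name": "Lim", "company_name": "Beta Holdings"},
]
edges = interlock_edges(sample)
```

The edge weight counts shared executive names, so a pair of companies with several common executives would show up as a stronger tie.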

Although OneSource provided us with clear options for executive titles during extraction, the resulting titles varied widely. This is to be expected, because companies differ in how they label their executives. To make the analysis clearer, our team further categorized the titles in line with the options provided by OneSource.
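One possible way to group free-text titles is a keyword lookup, sketched below. The category names and keywords are purely illustrative placeholders; they are not the options offered by OneSource, and the actual categorization may well have been done by hand.

```python
import re

# Illustrative categories and keywords only; checked in order, first match wins.
CATEGORY_KEYWORDS = [
    ("Board", {"chairman", "director"}),
    ("C-Suite", {"chief", "ceo", "cfo", "coo"}),
    ("Management", {"manager", "head", "president"}),
]


def categorize_title(title):
    """Return the first category whose keyword appears as a whole word in the title."""
    # Whole-word matching avoids false hits such as "coo" inside "coordinator".
    words = set(re.findall(r"[a-z]+", title.lower()))
    for category, keywords in CATEGORY_KEYWORDS:
        if words & keywords:
            return category
    return "Other"


titles = ["Managing Director", "Chief Financial Officer", "Regional Sales Manager", "Coordinator"]
categories = [categorize_title(t) for t in titles]
```

Because the first matching category wins, the order of `CATEGORY_KEYWORDS` decides how combined titles such as "Chief Executive Officer & Director" are labelled; here that title would fall under "Board".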

Findings & Insights will be added after the analysis has been completed.
Please check back later.
