Difference between revisions of "ANLY482 AY2017-18T2 Group18/TeamDAcct Project Methodology"

From Analytics Practicum

Revision as of 16:21, 14 January 2018



 


Methodology


Data Cleaning:

We will first clean the data we receive. This step is especially important because records prior to June 2016 were entered manually and stored in text format, which most software cannot read directly. Cleaning ensures that the pre-June 2016 data is consistent with the rest of the data we will be using.
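As a rough illustration of this step, the sketch below converts text-format records into typed values. The project itself uses SAS JMP Pro 13; Python is used here only for illustration. The field names (`site`, `month`, `amount`) and the text layouts are assumptions, since the actual format of the pre-June 2016 records is not described.

```python
from datetime import datetime

def clean_record(raw):
    """Convert one text-format record into typed values.

    Assumed (hypothetical) layout: amounts may carry a currency sign,
    thousands separators or stray spaces; months look like "Mar 2016".
    """
    amount_text = raw["amount"].replace("$", "").replace(",", "").strip()
    return {
        "site": raw["site"].strip(),
        "month": datetime.strptime(raw["month"].strip(), "%b %Y").date(),
        "amount": float(amount_text),
    }

# Made-up legacy rows, not real client data.
legacy_rows = [
    {"site": " Site A ", "month": "Mar 2016", "amount": "$1,250.50"},
    {"site": "Site B", "month": "May 2016", "amount": " 980 "},
]

cleaned = [clean_record(row) for row in legacy_rows]
for row in cleaned:
    print(row)
```

Once every record is parsed into the same types, the older records can be appended to the post-June 2016 data without special handling.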


Data Preparation:

The client is a cleaning company that provides a variety of cleaning services (e.g. landscape care and maintenance services). We will therefore separate the project sites by category of cleaning service and conduct the exploratory data analysis, as well as all subsequent analyses, separately for each category. The rationale is that the main factor driving expenses for one category of cleaning service may differ from the main factor driving expenses for another. Analysing each category separately should therefore give the client more comprehensive insights.
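The separation described above could be sketched as follows. This is a minimal Python illustration rather than the team's JMP workflow; the category labels, field names and figures are hypothetical.

```python
from collections import defaultdict

# Made-up project sites; category labels and values are assumptions.
sites = [
    {"site": "Site A", "category": "landscape", "expenses": 1200.0},
    {"site": "Site B", "category": "maintenance", "expenses": 950.0},
    {"site": "Site C", "category": "landscape", "expenses": 1430.0},
]

def split_by_category(records):
    """Group project-site records by their cleaning-service category."""
    groups = defaultdict(list)
    for record in records:
        groups[record["category"]].append(record)
    return dict(groups)

by_category = split_by_category(sites)
for category, records in by_category.items():
    print(category, [r["site"] for r in records])
```

Each group can then be analysed on its own, so a cost driver that matters for only one category is not diluted by the others.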


Exploratory Data Analysis (EDA):

Next, EDA (e.g. graphs or tables of summary measures) will be conducted to gain a better understanding of the data. The EDA will help us understand the relationships amongst the explanatory variables (factors that drive expenses, such as wages) and will give us a general sense of the direction and size of the relationship between the explanatory and outcome variables.

We will use SAS JMP Pro 13 as our main tool for data cleaning, data preparation and EDA. We chose this software because it allows us to conduct statistical analysis on large datasets and generates results that are easy for end users to understand. In addition, because it is widely used, tutorials are readily available on the web should we encounter any problems.
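To make the EDA step concrete, the sketch below computes summary measures for expenses and a Pearson correlation between one explanatory variable (wages) and the outcome (expenses). It is written in Python purely for illustration and does not reproduce JMP output; all figures are made up, and the correlation coefficient is one assumed way of gauging the direction and size of a relationship.

```python
from math import sqrt
from statistics import mean, median

# Hypothetical monthly figures for one service category (not real client data).
wages = [1000.0, 1500.0, 2000.0, 2500.0]
expenses = [1800.0, 2400.0, 3100.0, 3600.0]

def summarise(values):
    """Return basic summary measures for a list of numbers."""
    return {
        "min": min(values),
        "median": median(values),
        "mean": mean(values),
        "max": max(values),
    }

def pearson_r(xs, ys):
    """Pearson correlation: the sign gives direction, the magnitude gives strength."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(sxx * syy)

expense_summary = summarise(expenses)
r = pearson_r(wages, expenses)
print(expense_summary)
print(f"wages vs expenses: r = {r:.3f}")
```

A value of r close to +1 or -1 suggests a strong linear relationship, while a value near 0 suggests little linear association.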


Thereafter, we can formulate and propose to the client a model that forecasts the expenses incurred for an anticipated project site with given characteristics. From a business perspective, the model would inform the approach management wishes to take (i.e. minimising cost, bidding at a higher price, pegging to the industry price, or growing market share), assist in shortlisting future project sites, and help drive overall business strategy.
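The form of the forecasting model is not specified here. As one possible sketch, an ordinary least-squares line with a single explanatory variable could be fitted and used to forecast expenses for a new site. The choice of simple linear regression, the variable names and the data are all assumptions made for illustration.

```python
from statistics import mean

# Hypothetical historical data for one service category (not real client data).
wages = [1000.0, 1500.0, 2000.0, 2500.0]
expenses = [1800.0, 2400.0, 3100.0, 3600.0]

def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    mx, my = mean(xs), mean(ys)
    slope = (
        sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        / sum((x - mx) ** 2 for x in xs)
    )
    intercept = my - slope * mx
    return intercept, slope

intercept, slope = fit_line(wages, expenses)

def forecast_expenses(expected_wages):
    """Forecast expenses for an anticipated site from its expected wage bill."""
    return intercept + slope * expected_wages

print(f"forecast for wages of 3000: {forecast_expenses(3000.0):.2f}")
```

In practice the model would likely include several site characteristics and would be fitted separately for each category of cleaning service.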