Difference between revisions of "Methodology"

From Analytics Practicum
Line 62: Line 62:
 
|}
 
|}
 
==<div id="mw-content-text" lang="en-GB" dir="ltr" class="mw-content-ltr"><div style="background: #6FB1D0; padding: 10px; font-weight: bold; line-height: 1em; text-indent: 15px; border-left: #6A8295 solid 15px; font-size: 20px; font-family:DIN Alternate"><font color="white">Literature Review</font></div>==
 
==<div id="mw-content-text" lang="en-GB" dir="ltr" class="mw-content-ltr"><div style="background: #6FB1D0; padding: 10px; font-weight: bold; line-height: 1em; text-indent: 15px; border-left: #6A8295 solid 15px; font-size: 20px; font-family:DIN Alternate"><font color="white">Literature Review</font></div>==
Before applying Huff’s model and the MCI (multiplicative competitive interaction) model in our project, our group explored the contexts in which those models have been applied elsewhere. This helps us understand the models better and identify lessons that could be applied in our own project.
+
The MCI Model is an “econometric model for analysing market shares in a competitive environment where the market is divided into i submarkets (e.g. groups of customers, time periods or geographical regions) and served by j suppliers (e.g. firms, brands or locations)”. The resulting market share of each supplier depends on the attractiveness/utility of the alternatives j in the submarket. The MCI model is nonlinear, but the multi-step log-centering transformation linearises it so that it can be estimated by Ordinary Least Squares (OLS) regression.
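The log-centering step can be illustrated with a minimal Python sketch. It is not the code of the MCI package: the outlet sizes, distances, exponents and all function names below are hypothetical and chosen only for illustration. Shares are generated from known exponents, every variable is divided by its geometric mean within the submarket and logged, and an OLS fit without intercept should then recover the exponents up to floating-point error.

```python
import math

def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def log_center(values):
    """Log of each value divided by the geometric mean of its submarket."""
    g = geometric_mean(values)
    return [math.log(v / g) for v in values]

def mci_shares(sizes, distances, gamma, lam):
    """Shares in one submarket, with utility = size**gamma * distance**lam."""
    utilities = [s ** gamma * d ** lam for s, d in zip(sizes, distances)]
    total = sum(utilities)
    return [u / total for u in utilities]

def ols_two_regressors(y, x1, x2):
    """OLS without intercept for y = b1*x1 + b2*x2 (2x2 normal equations)."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    return (s1y * s22 - s2y * s12) / det, (s2y * s11 - s1y * s12) / det

# Hypothetical data: 3 outlets (floor size) seen from 3 submarkets (distance in km).
sizes = [1000.0, 2500.0, 4000.0]
distances_by_submarket = [[1.0, 3.0, 6.0], [5.0, 2.0, 1.5], [2.0, 2.5, 4.0]]
TRUE_GAMMA, TRUE_LAMBDA = 1.2, -2.0  # assumed exponents used to simulate shares

y, x_size, x_dist = [], [], []
for distances in distances_by_submarket:
    shares = mci_shares(sizes, distances, TRUE_GAMMA, TRUE_LAMBDA)
    y += log_center(shares)            # centred within the submarket
    x_size += log_center(sizes)
    x_dist += log_center(distances)

gamma_hat, lambda_hat = ols_two_regressors(y, x_size, x_dist)
print(round(gamma_hat, 6), round(lambda_hat, 6))
```

No intercept is needed here because each log-centred variable sums to zero within its submarket, so the normalising denominator of the share formula cancels out.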
 
+
<br><br>
Huff’s model, first proposed by David Huff in 1962 (Huff D.L, 1962), states that the probability of an individual choosing a store is the ratio of the utility of that store to the sum of the utilities of all stores that the individual considers. In the library context, utility/attractiveness is determined by the size of the library and the transportation costs involved.
+
The MCI model is essentially an extension of Huff’s Model. First developed by David Huff (1964), Huff’s Model is a gravity model that estimates the probability of a consumer patronising a retail area out of a list of available retail areas. Nakanishi and Cooper (1974) then incorporated competitive interaction between retail areas into the analysis of their attractiveness to consumers, giving the MCI model.
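As a rough illustration of the gravity idea, the Python sketch below computes Huff-style probabilities for one consumer location. The attractiveness values, distances and the distance-decay exponent of 2 are assumptions made for this example, not values taken from the project.

```python
def huff_probabilities(attractiveness, distances, distance_decay=2.0):
    """Probability of choosing each retail area from one origin.

    The utility of an area is its attractiveness divided by distance raised
    to an assumed decay exponent; the probability is utility / sum of utilities.
    """
    utilities = [a / d ** distance_decay for a, d in zip(attractiveness, distances)]
    total = sum(utilities)
    return [u / total for u in utilities]

# Hypothetical example: a small area 1 km away and an area four times as
# large but 2 km away end up equally likely under a decay exponent of 2.
probs = huff_probabilities([1000.0, 4000.0], [1.0, 2.0])
print(probs)
```

The example shows the trade-off the model encodes: a larger retail area can offset a longer travel distance.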
 
+
<br><br>
Huff’s basic model was later modified to include additional determinants of attractiveness, giving a more complete picture of the utility of a retail area (Thomas, 1976). The reputation of the library would be one such determinant.
+
Since then, the MCI model has been adapted and improved many times to suit the particulars of each case to which it was applied. Óscar González-Benito, Michael Greatorex and Pablo Muñoz-Gallego (2000) applied the MCI model to assess how well segmentation variables group consumers with similar probabilities of patronising retail areas. Patrons were divided into groups based on variables such as income or age. The results of applying the MCI model were compared across the segmentation variables to analyse how consumers in different segments select retail areas. The results highlighted the importance of evaluating demographic and socio-economic variables as indicators of shopping behaviour (González-Benito, Greatorex, & Muñoz-Gallego, 2000).
 
+
<br><br>
The MCI model was introduced later. It extended Huff’s Model by stating that attractiveness should capture competitive interactions (Nakanishi & Cooper, 1974). In the library context, this means that some factors in the attractiveness index could act as substitutes for the library and draw patrons away from public libraries. Examples of such factors include bookstores and other entertainment facilities.
+
To identify the main determinants of store choice behaviour, Nermin Oruc and Boris Tihi (2012) used a retail gravity model augmented with coefficients produced by the MCI model. They presented the interaction between the attractiveness of store attributes (such as cashier effectiveness, store size and product offering) and the deterrent effect of the distance between store and customer, in order to analyse the trade-off a customer makes in visiting a store for its attributes despite a long distance (Oruc & Tihi, 2012).
 
+
<br><br>
A further extension of the model showed that both external and internal factors of attractiveness should be considered (Jain and Mahajan, 1979). This extension was later used in a project in Germany to determine the agricultural mix of the farming sector, i.e. whether farmers should go into cash crops, dairy production and so on (Neuenfelt, 2014). Besides external factors of attractiveness such as socioeconomic factors, internal factors relating to the farms’ capabilities (land, labour, capital etc.) were also considered. In the context of our project, this would mean looking at the internal facilities of the library, such as the availability of study spaces or a café.
+
Gerard Cliquet (1995) discusses how the MCI Model was applied to the furniture market to calculate store attractiveness. He describes the MCI model as a market share model or, more specifically, a spatial-interaction model (Ghosh and McLafferty, 1987), in which the index i can carry a spatial dimension by representing geographical choice situations. In such a case, a specific implementation method is needed, based on a geographical division of the trade area.
<br>
+
<br><br>
 +
In light of the research above, and adapting the MCI model to the context of our project, we consider the market to be Singapore and the submarkets i to be the geographical divisions of Singapore in the form of subzones. The suppliers j are the outlets of ABC Retail in Singapore.
 
</div>
 
</div>
==<div id="mw-content-text" lang="en-GB" dir="ltr" class="mw-content-ltr"><div style="background: #6FB1D0; padding: 10px; font-weight: bold; line-height: 1em; text-indent: 15px; border-left: #6A8295 solid 15px; font-size: 20px; font-family:DIN Alternate"><font color="white">Revision of Methodology</font></div>==
+
==<div id="mw-content-text" lang="en-GB" dir="ltr" class="mw-content-ltr"><div style="background: #6FB1D0; padding: 10px; font-weight: bold; line-height: 1em; text-indent: 15px; border-left: #6A8295 solid 15px; font-size: 20px; font-family:DIN Alternate"><font color="white">Methodology</font></div>==
 
<div style="border-left: #96C0CE solid 8px;font-family: Avenir; padding: 0px 30px 0px 18px; ">
 
<div style="border-left: #96C0CE solid 8px;font-family: Avenir; padding: 0px 30px 0px 18px; ">
<b>Adaptation of the Multiplicative Competitive Interaction Model</b>
+
Over the years, ABC Retail has successfully experimented with multiple outlet-related decisions, such as upgrading its present outlets and introducing new ones. However, because setting up an outlet at a new location involves a large overhead cost and is a long-term commitment, ABC Retail needs to understand its existing customer flow. Currently, ABC Retail actively collects three types of data: outlet attribute data, customer attribute data and transactional data of items purchased by customers. Even so, ABC Retail has been unable to use this data to derive insights into its as-is outlet demand. Our team therefore believes that there is a strong need to adapt the MCI package to the context of ABC Retail.
 
+
<br><br>
Based on multiple sessions with our supervisor, our team has decided to use the MCI package in R to analyse attractiveness using Huff’s model. The Multiplicative Competitive Interaction (MCI) Model (Nakanishi/Cooper 1974) is an “econometric model for analyzing market shares in a competitive environment where the market is divided in i submarkets (e.g. groups of customers, time periods or geographical regions) and served by j suppliers (e.g. firms, brands or locations)”. Besides the resulting market share of the suppliers (libraries in our case), this model also analyses the attraction/utility of the alternatives in the submarket.
+
From our literature review, we learnt that although the MCI Model is an extremely powerful tool, the model alone is inadequate: it requires large sets of data in a given format in order to learn from them. Thus, as mentioned earlier, our team took the three sets of data provided by ABC Retail and first cleaned and then pre-processed them so that they would fit the MCI model better.
Unlike the senior group’s model, the MCI model is nonlinear, but the multi-step log-centering transformation linearises it so that it can be estimated by Ordinary Least Squares (OLS) regression. Our team will also rearrange the raw data into an interaction matrix to fit the model.
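Rearranging raw records into an interaction matrix could look roughly like the Python sketch below. The visit records, outlet names and the `interaction_matrix` helper are hypothetical; the sketch only shows the general shape of the result (one row per subzone and outlet pair, with the observed local share), not the format that the MCI package itself expects.

```python
from collections import Counter

def interaction_matrix(visits):
    """Build one row per (subzone, outlet) pair with visit counts and shares.

    `visits` is a list of (subzone, outlet) tuples, one per recorded visit.
    Pairs with no recorded visit still get a row, with a count of zero.
    """
    counts = Counter(visits)
    subzones = sorted({subzone for subzone, _ in visits})
    outlets = sorted({outlet for _, outlet in visits})
    rows = []
    for subzone in subzones:
        total = sum(counts[(subzone, outlet)] for outlet in outlets)
        for outlet in outlets:
            n = counts[(subzone, outlet)]
            rows.append({"subzone": subzone, "outlet": outlet,
                         "visits": n, "share": n / total})
    return rows

# Hypothetical raw records: (origin subzone, outlet visited).
raw_visits = ([("AMSZ01", "Outlet A")] * 3 + [("AMSZ01", "Outlet B")]
              + [("AMSZ02", "Outlet A")] + [("AMSZ02", "Outlet B")] * 3)

matrix = interaction_matrix(raw_visits)
for row in matrix:
    print(row)
```

Zero shares would need special handling before any log transformation, since the logarithm of zero is undefined; how to treat them is a modelling decision left open here.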
+
<br><br>
The purpose is first to use the MCI model, together with the new attractiveness variables, to assign weights to the variables. This would tell us which variables actually contribute to the attractiveness of a library. We would then use the Huff’s model function in the MCI package to obtain the probability (𝑷_𝒊𝒋) that a patron from a given Subzone i would visit a given Library j. One of the main motivations for shifting to the MCI model is that the hard-coded Huff’s model calibrations no longer need to be stored within the dashboard, which is how the previous team kept their results. With the MCI model implemented, users of the dashboard can recalibrate the model within the dashboard to produce new Huff’s model attractiveness values based on the buffer they specify.
+
After cleaning and processing the existing data, our team gathered additional contextual data, such as the subzones (e.g. AMSZ01, AMSZ02) of customers and outlets and the population of each subzone, which can be used to calculate other variables that define the attractiveness of a given outlet in the MCI Model. As variables for the MCI Model, we chose the distance from the customer’s residing subzone to the outlet’s location, the scope of service of the outlet, the floor size of the outlet, and the number of MRT stations, tuition centres and malls within 1 km of the outlet. The choice of these variables rests on the assumption that being more easily accessible and having more relevant amenities nearby plays a big role in an outlet’s attractiveness to its customers. As this is only an assumption, we need the MCI Model to calibrate the actual weight that should be assigned to each variable based on its effect on the attractiveness of an outlet.
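A buffer-based variable such as the number of MRT stations within 1 km of an outlet can be derived as in the following Python sketch. The coordinates are invented for illustration, and the great-circle (haversine) distance is an assumption: the project may have used a projected coordinate system or network distance instead.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def count_within(outlet, points, radius_km=1.0):
    """Number of points lying within radius_km of the outlet."""
    return sum(1 for lat, lon in points
               if haversine_km(outlet[0], outlet[1], lat, lon) <= radius_km)

# Hypothetical coordinates (latitude, longitude), not real locations.
outlet = (1.3700, 103.8490)
mrt_stations = [
    (1.3702, 103.8495),  # roughly 60 m away
    (1.3760, 103.8490),  # roughly 0.67 km away
    (1.3900, 103.8490),  # roughly 2.2 km away
]

mrt_count = count_within(outlet, mrt_stations, radius_km=1.0)
print(mrt_count)
```

The same helper could be reused for tuition centres and malls by passing a different list of points.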
<br>
+
<br><br>
 
+
Once the MCI Model has calibrated the weights that affect the attractiveness of an outlet, calculating the probability of a visit by a given customer in a subzone based on the attractiveness value alone would be incomplete. For a more accurate representation, it is necessary to look at the market share of ABC Retail in each subzone as well. This is because a customer’s choice to go to an outlet that is further away, instead of another outlet in his/her vicinity, can be explained through market share but not by the attractiveness variables. Therefore, our team would then look into the MCI.Shares function within the MCI package to incorporate the attraction variables and their corresponding weights and calculate the estimated probability (𝑷_𝒊𝒋) that a customer from a given Subzone i would visit a given Outlet j.
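To make the last step concrete, the Python sketch below turns a set of calibrated weights into estimated probabilities for one subzone. It is an illustration of the multiplicative share formula only, not the package’s own implementation; the variable names, weights and outlet values are hypothetical.

```python
def estimated_shares(outlets, weights):
    """Estimated probability of visiting each outlet from one subzone.

    The utility of an outlet is the product of its variables, each raised
    to its calibrated weight; the probability is utility / sum of utilities.
    """
    utilities = {}
    for name, variables in outlets.items():
        utility = 1.0
        for variable, weight in weights.items():
            utility *= variables[variable] ** weight
        utilities[name] = utility
    total = sum(utilities.values())
    return {name: utility / total for name, utility in utilities.items()}

# Hypothetical calibrated weights (exponents) for three variables.
weights = {"distance_km": -2.0, "floor_area": 1.0, "mrt_count": 1.0}

# Hypothetical outlet variables as seen from one subzone.
outlets_from_subzone = {
    "Outlet A": {"distance_km": 1.0, "floor_area": 1000.0, "mrt_count": 1.0},
    "Outlet B": {"distance_km": 2.0, "floor_area": 2000.0, "mrt_count": 6.0},
}

p_ij = estimated_shares(outlets_from_subzone, weights)
print(p_ij)
```

Because the model is multiplicative, a variable with a value of zero (for example, no MRT station within the buffer) would reduce the whole utility to zero, so count variables usually need an offset or a different encoding before they are used this way.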
Overall, based on our literature review, we are currently looking into including the following factors in our analysis. (They are subject to change depending on the progress of the project and the practicality of applying these factors.)
+
Lastly, our team acknowledges that the output from such models comes only as statistical summaries and lists, and is thus interpretable only by a small group of users who understand R and its complex results. To allow ABC Retail to make sense of the customer flow probabilities easily, our team built a geospatial R dashboard that maps the results of the MCI Model in an intuitive manner for easy analysis. The dashboard is shown in detail in the following sections of this paper.
 
 
::1. Distance from Subzones/Planning Area to Library
 
::2. Number of MRTs present within a Buffer
 
::3. Number of Malls present within a Buffer
 
::4. Number of Tuition Centres within a Buffer
 
::5. Collection Size of the Library
 
::6. Gross Floor Area of the Library
 
::7. Carpark accessibility
 
::8. Branch Type (Mall, Stand-Alone, Regional)
 
::9. MRT Centrality
 
  
 
</div>
 
</div>

Revision as of 22:11, 23 April 2017

