Difference between revisions of "Qui Vivra Verra - Methodology"

Latest revision as of 00:56, 31 August 2016

Data Preparation

Further analysis of the data set can be accomplished through market segmentation. K-means clustering can be applied to the Transaction Dataset, with the clustering variables set as:

Recency (number of days from last transaction to end of the FY)
Frequency (number of transactions performed within the FY)
Monetary (average number of books borrowed per transaction)

Each patron will then be assigned to a cluster, such that patrons within a cluster are similar to one another and dissimilar to patrons in other clusters. From here, we can determine the dominant cluster of library members that each library caters to, which can provide some operational insights by showing the demographics of the bulk of each library's patrons.
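The segmentation step described above can be sketched as follows. This is a minimal illustration only, not the project's implementation: the patron values, the library assignments and the choice of k = 2 are all hypothetical, and the variables are standardised first (an assumption) so that Recency, measured in days, does not dominate the distance calculation.

```python
import random
from collections import Counter
from statistics import mean, pstdev


def standardise(rows):
    """Z-score each column so Recency, Frequency and Monetary share a scale."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    sds = [pstdev(c) or 1.0 for c in columns]  # guard against zero variance
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, seed=0, max_iter=100):
    """Plain Lloyd's algorithm: assign to nearest centroid, then recompute."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [mean(col) for col in zip(*members)]
    return labels, centroids


# Hypothetical patrons: (recency in days, frequency, average books per transaction)
rfm = [
    (5, 40, 3.0), (8, 35, 2.5), (10, 38, 3.5),    # recent, frequent borrowers
    (300, 2, 1.0), (280, 1, 2.0), (320, 3, 1.5),  # lapsed, occasional borrowers
]
# Hypothetical library that each patron uses
library = ["A", "A", "A", "B", "B", "A"]

labels, _ = kmeans(standardise(rfm), k=2)

# Dominant cluster per library: the most common cluster label among its patrons
dominant = {}
for lib in sorted(set(library)):
    counts = Counter(lab for lab, home in zip(labels, library) if home == lib)
    dominant[lib] = counts.most_common(1)[0][0]

print(labels, dominant)
```

The cluster labels themselves are arbitrary integers; only the grouping is meaningful. In practice k would be chosen by inspecting the data rather than fixed in advance.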

Application of the Huff Model

An adaptation of the Huff model (Huff, 1964) will be applied in the analyses.

To quote a paper by Okabe & Sugihara (2012):

To state a general form of the Huff model, we consider a space S (which may be a plane or a network), in which n stores are located at p₁, …, p_n. Let a_i be the attractiveness of store i, which may be a function of its floor area, the number of items sold, its parking area and so forth; let d(p, p_i) be the distance between a point p on S and the store at p_i, which may be the Euclidean distance or the shortest-path distance; and let F(d(p, p_i)) be a monotonically decreasing function of d(p, p_i), referred to as a distance decay function or distance deterrence function. In these terms, the Huff model showing the probability of a consumer at p choosing the store at p_i is generally written as:
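The original page shows this formula as an image (Huff's Model Formula.png). A reconstruction that is consistent with the definitions in the quotation, offered as the standard general form rather than a verbatim copy of the image, is:

```latex
P(p, p_i) = \frac{a_i \, F\bigl(d(p, p_i)\bigr)}{\sum_{j=1}^{n} a_j \, F\bigl(d(p, p_j)\bigr)}
```

The denominator sums over all n stores, so the probabilities for a given point p add up to 1.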

Adapting the Huff model to the context of our project, we consider Singapore as the space S, in which n libraries are located at p₁, …, p_n. Let a_i be the attractiveness of library i, which is estimated by a multinomial generalised linear regression equation that takes into account the following factors (non-exhaustive):

Size of the library’s collection
Gross floor area of the library
Type of facility the library is located in (e.g. mall, stand-alone)
Size of the facility the library is in (e.g. if the library is located in a mall, this refers to the gross floor area of the mall)
Number of MRT stations within a set distance (to be determined) from the library
Number of bus stops within a set distance (to be determined) from the library
Number of bus routes within a set distance (to be determined) from the library
Opening hours of the library
Number of educational institutes (i.e. primary/secondary schools, junior colleges, polytechnics, ITE, universities) within a set distance (to be determined) from the library
Number of other libraries (only considering the list under NLB) within a set distance from the library

Let d(p, p_i) be the distance between an area (geographical subzone) p on S and the library at p_i, which may be the Euclidean distance or the shortest-path distance; and let F(d(p, p_i)) be a monotonically decreasing function of d(p, p_i), referred to as a distance decay function or distance deterrence function. The Huff formula stated above can therefore be interpreted as the probability of a patron in subzone p choosing the library at p_i.
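As an illustration of the adapted model, the sketch below computes the choice probabilities for a single subzone. It assumes a power-law decay function, F(d) = d^(-α), which is a common choice for the Huff model but is not stated explicitly in the text; the function name `huff_probabilities` and all numbers are hypothetical.

```python
def huff_probabilities(attractiveness, distances, alpha):
    """Probability that a patron in one subzone chooses each library.

    Assumes the power-law decay F(d) = d ** (-alpha); distances must be > 0.
    """
    weights = [a * d ** (-alpha) for a, d in zip(attractiveness, distances)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical subzone with two libraries: the first is four times as
# attractive but twice as far away.
probs = huff_probabilities(attractiveness=[4.0, 1.0], distances=[2.0, 1.0], alpha=2.0)
print(probs)  # [0.5, 0.5]
```

With α = 2, doubling the distance divides a library's pull by four, which here exactly offsets the fourfold difference in attractiveness.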

Dividing the number of patrons in subzone p who visited library i by the total number of patrons in subzone p gives the observed proportion of visits from subzone p that go to library i in any given FY, which serves as an estimate of the choice probability in the model. Then, by substituting the known values of a_i (to be determined by the regression model) and d(p, p_i) into the adapted Huff model, we are able to derive possible values of the power parameter (α) that governs the distance decay function. By repeating this process iteratively, we can obtain an unbiased estimate for α that is accurate to a chosen significance level.
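One way the iterative estimation could be carried out is sketched below, again assuming the power-law decay F(d) = d^(-α). The search strategy (a grid over α that is repeatedly narrowed around the value with the smallest squared error) is an illustrative choice, not the project's stated method, and the attractiveness values, distances and observed shares are synthetic: the shares are generated from a known α = 1.5 so that the sketch can be checked against it.

```python
def huff_probabilities(attractiveness, distances, alpha):
    """Huff choice probabilities for one subzone, with F(d) = d ** (-alpha)."""
    weights = [a * d ** (-alpha) for a, d in zip(attractiveness, distances)]
    total = sum(weights)
    return [w / total for w in weights]


def squared_error(alpha, attractiveness, distance_rows, observed_rows):
    """Sum of squared gaps between predicted and observed shares, over all subzones."""
    error = 0.0
    for distances, observed in zip(distance_rows, observed_rows):
        predicted = huff_probabilities(attractiveness, distances, alpha)
        error += sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return error


def estimate_alpha(attractiveness, distance_rows, observed_rows,
                   lo=0.0, hi=5.0, grid=11, rounds=8):
    """Evaluate a grid of candidate alphas, then narrow the grid around the best one."""
    best = lo
    for _ in range(rounds):
        step = (hi - lo) / (grid - 1)
        candidates = [lo + i * step for i in range(grid)]
        best = min(
            candidates,
            key=lambda a: squared_error(a, attractiveness, distance_rows, observed_rows),
        )
        lo, hi = max(best - step, 0.0), best + step
    return best


# Synthetic example: three libraries, three subzones (all values hypothetical)
attractiveness = [3.0, 2.0, 1.0]
distance_rows = [
    [1.0, 2.0, 4.0],  # distances from subzone 1 to each library
    [3.0, 1.0, 2.0],  # subzone 2
    [5.0, 4.0, 1.0],  # subzone 3
]
# "Observed" shares generated from a known alpha; real shares would come from
# patron counts (visitors to library i from subzone p / all patrons in p).
observed_rows = [huff_probabilities(attractiveness, d, 1.5) for d in distance_rows]

alpha_hat = estimate_alpha(attractiveness, distance_rows, observed_rows)
print(round(alpha_hat, 3))
```

With real data the observed shares will not fit any single α exactly, so the minimum squared error will be non-zero and the search bounds and stopping rule would need to be chosen with that in mind.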

