Maptimization Details

From IS415-Geospatial Analytics for Business Intelligence
Jump to navigation Jump to search

HOME

PROPOSAL

DETAILS

POSTER

APPLICATION

RESEARCH PAPER


Jury


Motivation and Problem Discovery

Motivation

With families in Singapore moving towards modern times where both parents or couple work full-time, domestic household chores are performed much less due to time-constraints. Jelivery Service(JS) Pte Ltd aims to alleviate this issue by providing convenient laundry washing services.

Problem

Currently, Jelivery Services operations function on delivering laundry to each customer's doorsteps regardless of distribution demand nor distance. This creates an issue where deliveries are sometimes time-inefficient and very fuel-consuming for its employees. This results a high operational cost on the company's expenses, making their profit margin very thin.

Proposed Solution

Our group, Maptimization, aims to optimize Jelivery Service's business process by streamlining it via the creation of single-best-point(s) for laundry collections/returns based on the layout of the area and zone. This would increase efficiency and ultimately improve customer satisfaction by providing the best convenience to them possible. Through the use of K-means clustering and sufficient data points, we would be able to create an optimization model.

Project Sketch

First sketch

Maptimization presketch.jpg


Our first sketch sets a direction of what we would like users to be able to see and utilize in our final application. As seen above, we would like to provide three features to our users
1. Filtering by Region(North, South, East, West): This would allow users of different regions to only select laundry collection points near their regions. This is to facilitate convenience and disallow "noise(undesired)" collection points from showing up in their interface
2. Filter by Distance: This means that each collection point can only go up to a certain range from them(e.g. 1KM away from them). The application would recommend how many collection points are within the selected range
3. Filter by K-Clusters: This would allow users to know how many collection points are clustering together, as each collection point is catered to 10 household collection point. If a cluster is more dense, he will more likely be able to collect his laundry easier.

Final sketch

Maptimization sketch1.jpg


Our final sketch shows an in-depth view of what we would expect the application to look and feel like. Through the selection of options on the top right of our application, users will be able to select Jelivery Service laundry collection points to their own convenience.

For illustration purposes, we have shown that a user have selected the "North" Region and a K-Cluster of 3. The application shows that 3 K-Clusters are shown to the user. The user will then be able to make their way down to whichever is the nearest collection point or whichever is the densest cluster to be served quicker.

Algorithm and Solution

Kmean Algorithm Explained

The Κ-means clustering algorithm uses iterative refinement to produce a final result. The algorithm inputs are the number of clusters Κ and the data set. The data set is a collection of features for each data point. The algorithms starts with initial estimates for the Κ centroids, which can either be randomly generated or randomly selected from the data set. The algorithm then iterates between two steps: 1. Data assignment step: Each centroid defines one of the clusters. In this step, each data point is assigned to its nearest centroid, based on the squared Euclidean distance. More formally, if Ci is the collection of centroids in set C, then each data point x is assigned to a cluster based on

Kmean.jpg

where dist( · ) is the standard (L2) Euclidean distance. Let the set of data point assignments for each i^th cluster centroid be Si. 2. Centroid update step: In this step, the centroids are recomputed. This is done by taking the mean of all data points assigned to that centroid's cluster.

 Kmean2.jpg

The algorithm iterates between steps one and two until a stopping criteria is met (i.e., no data points change clusters, the sum of the distances is minimized, or some maximum number of iterations is reached).

This algorithm is guaranteed to converge to a result. The result may be a local optimum (i.e. not necessarily the best possible outcome), meaning that assessing more than one run of the algorithm with randomized starting centroids may give a better outcome.

Technologies used

Technologies:
- R Shiny
- R Studio
- R Language
- OMS
- Excel


Challenges & Limitations

Data

  • There was lack of sufficient data available for Toa Payoh
  • Datasets found are often incomplete or had missing fields

Data cleaning and transformation

  • Addresses found often had invalid postal codes or unknown postal codes, which required research to amend it
  • Data transformation to respective regions(North/South/East/West) was resource-draining and tedious

Novel technologies

  • Steep learning curve
  • A lot of consultations were needed with Prof Kam
  • Self-learning


Future work

This Spatial Clustering with kmean is an optimization model in aiding business to analyze the best collection point or clustering for business needs. With the option to upload own data set, it would allow the flexibility to be transferable to other business functions and industry as long as they met the data set requirements.
Furthermore, base on the needs of the business, this model can be easily extended to include criteria in its consideration when doing spatial clustering. For example, we can add in distance into consideration such that the minimum number of distance should not be under 10km or maximum distance should not be over 15km. These would ensure a more fair distribution of k clustering base on the business needs.