Two Eyes One Pizza Data
Contents
- 1 Preliminary Data Observations
- 2 1. Geometry and Coordinate Reference System
- 3 2. Extracting each POI from the different shp files
- 4 3. Formation of polygons for each branch to align to ppt slides' trade area
- 5 4. Creation of competitors' POI
- 6 5. Aggregation of count of each POI to each trade area (Script)
- 7 6. Investigation on areas not covered by trade area
- 8 Comments
Preliminary Data Observations
These are the datasets we are using for data transformation:
Dataset | Rationale |
---|---|
| |
| |
| |
| |
|
Steps for data transformation:
- Align to Geometry and Coordinate Reference System for all files: EPSG:3828
- Extracting each POI from the different shp files
- Formation of polygons for each branch to align to ppt slides' trade area
- Creation of competitors' POI
- Aggregation of count of each POI to each trade area
- Investigation on areas not covered by trade area
- Data pre-processing for Sales Data
- Aggregating Sales Data with each branch
- Regression Analysis
1. Geometry and Coordinate Reference System
Every QGIS file we create has to use the same coordinate reference system as the Taiwan maps, so that positions are accurate and all layers share the same unit of measurement.
Information on 3828 Reference system: https://epsg.io/3828
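A quick way to check the alignment is to list every layer whose CRS is not EPSG:3828. The snippet below is a minimal sketch, not the procedure we actually followed. The helper names are hypothetical. It only assumes that each layer offers `name()` and `crs().authid()`, as QGIS vector layers do.

```python
TARGET_CRS = "EPSG:3828"

def layers_needing_reprojection(layers, target=TARGET_CRS):
    """Return the names of layers whose CRS authority ID differs from target."""
    return [layer.name() for layer in layers if layer.crs().authid() != target]

def reproject_params(layer, target=TARGET_CRS):
    """Build a parameter dict for the 'native:reprojectlayer' algorithm."""
    return {"INPUT": layer, "TARGET_CRS": target, "OUTPUT": "TEMPORARY_OUTPUT"}

# Inside the QGIS Python Console this could be used roughly as follows:
#   from qgis.core import QgsProject
#   import processing
#   layers = list(QgsProject.instance().mapLayers().values())
#   print(layers_needing_reprojection(layers))
#   fixed = processing.run("native:reprojectlayer", reproject_params(layers[0]))["OUTPUT"]
```

Any layer reported by the first helper would need to be reprojected before it is used for counting or area calculations.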
2. Extracting each POI from the different shp files
We are given the following POIs to extract:
- ATM
- Bank
- Bar or Pub
- Bookstore
- Bowling Centre
- Bus Station
- Business Facility
- Cinema
- Clothing Store
- Coffee Shop
- Commuter Rail Station
- Consumer Electronics Store
- Convenience Store
- Department Store
- Government Office
- Grocery Store
- Higher Education
- Hospital
- Hotel
- Medical Service
- Pharmacy
- Residential Area/ Building
- Restaurant
- School
- Shopping
- Sports Centre
- Sports Complex
- Train Station
These POIs have to be extracted by finding their respective Facility Type codes in the shp files provided.
For example, features with Facility Type 9583 were selected using the filter function in QGIS and exported as a layer to the GeoPackage.
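The same filter can also be expressed in a script. The sketch below builds the parameters for the QGIS algorithm `native:extractbyattribute`. The attribute name `FAC_TYPE` is an assumption, because the actual column name in the shp files is not stated on this page, and the helper name is hypothetical.

```python
def facility_extract_params(layer, facility_code, field="FAC_TYPE"):
    """Build parameters that keep only features with one Facility Type.

    'field' is an assumed column name; replace it with the real one.
    """
    return {
        "INPUT": layer,
        "FIELD": field,
        "OPERATOR": 0,  # 0 means "=" in native:extractbyattribute
        "VALUE": str(facility_code),
        "OUTPUT": "TEMPORARY_OUTPUT",
    }

# Inside the QGIS Python Console:
#   import processing
#   params = facility_extract_params(poi_layer, 9583)
#   extracted = processing.run("native:extractbyattribute", params)["OUTPUT"]
```

The result is a temporary layer, which would still have to be saved into the GeoPackage.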
3. Formation of polygons for each branch to align to ppt slides' trade area
As the trade areas are predefined in the ppt slides, we had to manually digitise the trade area for each outlet. The QGIS tools Split Features and Merge Features were used in the process, along with our artistic skills. We also added the store codes and area codes given in the ppt slides to each polygon so that the polygons can be identified easily.
Below are the generated polygons for all of our trade areas.
4. Creation of competitors' POI
Five competitors were identified: Domino's Pizza, Napolean Pizza, McDonald's, KFC and MOS Burger. These data points are important for our analysis because the presence of competitors may be correlated with the sales data. Hence, the branch locations of these competitors have to be extracted from "Restaurant 5800".
After extraction, they are exported as layers in the GeoPackage to be used for aggregation.
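One way to pick out a competitor's branches is a name filter on the restaurant layer. The sketch below only builds the QGIS filter expression. The column name `POI_NAME` and the keywords are assumptions: the real attribute may have a different name, and the names in the data may be written in Chinese.

```python
# Illustrative keywords only; adjust them to how the brands appear in the data.
COMPETITOR_KEYWORDS = {
    "dominos": "Domino",
    "napolean": "Napolean",
    "mcdonalds": "McDonald",
    "kfc": "KFC",
    "mosburger": "Mos",
}

def competitor_expression(keyword, name_field="POI_NAME"):
    """Return a case-insensitive 'contains' filter for one competitor."""
    safe = keyword.replace("'", "''")  # double single quotes inside the literal
    return f"\"{name_field}\" ILIKE '%{safe}%'"

# Inside the QGIS Python Console, applied to the restaurant layer:
#   restaurant_layer.setSubsetString(competitor_expression("KFC"))
```

A short keyword such as "Mos" can match unrelated restaurants, so the filtered features should be checked by eye before exporting.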
5. Aggregation of count of each POI to each trade area (Script)
Aggregating each POI layer into each trade area one at a time with the "Count Points in Polygon" tool is an exhausting task.
The batch processing tool was tried, but it was unable to append each newly created POI count to an existing GeoPackage. Hence, a Python script was written and run in the QGIS Python Console. It uses "processing.run('qgis:countpointsinpolygon', parameters)", a built-in QGIS processing function.
Learn more about the function here: https://docs.qgis.org/2.8/en/docs/user_manual/processing_algs/qgis/vector_analysis_tools/countpointsinpolygon.html
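The script itself is not reproduced on this page. The following is a minimal sketch of how such a loop could look, not the original script. It assumes the QGIS 3 parameter names `POLYGONS`, `POINTS`, `FIELD` and `OUTPUT`. The function name and the layer variables are hypothetical. The output of each run becomes the polygon input of the next run, so every POI adds one count column to the same trade area layer.

```python
def count_pois_per_trade_area(trade_areas, poi_layers, run):
    """Add one count column per POI layer to the trade area layer.

    trade_areas: polygon layer with the trade areas
    poi_layers:  dict mapping a column name to a point layer
    run:         the processing function, normally processing.run
    """
    current = trade_areas
    for column, points in poi_layers.items():
        params = {
            "POLYGONS": current,
            "POINTS": points,
            "FIELD": column,  # name of the new count column
            "OUTPUT": "TEMPORARY_OUTPUT",
        }
        # Chain the runs so that earlier count columns are kept.
        current = run("qgis:countpointsinpolygon", params)["OUTPUT"]
    return current

# Inside the QGIS Python Console:
#   import processing
#   from qgis.core import QgsProject
#   result = count_pois_per_trade_area(
#       trade_area_layer, {"ATM": atm_layer, "Bank": bank_layer}, processing.run)
#   QgsProject.instance().addMapLayer(result)
```

Passing `run` as an argument keeps the loop independent of QGIS, so it can be tried with a stand-in function before running it on the real layers.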
After the aggregation of POIs into each trade area, below is the screenshot of the columns for a particular trade area.
6. Investigation on areas not covered by trade area
Looking at the generated trade areas, some areas are not covered by any of the branches. Hence, we decided to analyse these areas.
In the above figure, the uncovered area is Daan Forest Park (大安森林公园), which is a park. Hence, it is reasonable that such a location is not covered by any delivery area.
In the above figure, we do see buildings that are not in any trade area. Therefore, this uncovered area may have been omitted by mistake and should be served by one of the branches.
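Such gaps can also be found without inspecting the map by eye, by testing whether building locations fall inside any trade area. The sketch below uses a plain ray casting test on simple polygons given as lists of (x, y) coordinates. It is an illustration only: it ignores holes and multi-part polygons, and the function names and data layout are hypothetical.

```python
def point_in_polygon(x, y, ring):
    """Ray casting test: True if (x, y) lies inside the ring of (x, y) vertices."""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        # Does the edge (j -> i) cross the horizontal line through y?
        if (yi > y) != (yj > y):
            x_cross = xi + (y - yi) * (xj - xi) / (yj - yi)
            if x < x_cross:
                inside = not inside
        j = i
    return inside

def uncovered_points(points, trade_areas):
    """Return the points that fall in none of the trade area rings.

    trade_areas: dict mapping a store code to a list of (x, y) vertices
    """
    return [
        (x, y)
        for (x, y) in points
        if not any(point_in_polygon(x, y, ring) for ring in trade_areas.values())
    ]

areas = {"store_a": [(0, 0), (10, 0), (10, 10), (0, 10)]}
print(uncovered_points([(5, 5), (15, 5)], areas))
```

Points returned by `uncovered_points` are candidates for a closer look: some will be parks or other places with no customers, while others may be buildings that a trade area should have included.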
Comments
Feel free to leave comments / suggestions!