GeViz
PROBLEM & MOTIVATION
GeBIZ is the Singapore Government's one-stop e-procurement portal, which facilitates tender activities between the Singapore government and local and overseas suppliers. Currently, there is no tool available to help the public and the ministries understand and gain insights into the procurement made by the government under each ministry. Hence, we are motivated to create an interactive visualisation tool for government procurement spending that allows the public and the ministries to identify spending patterns under each ministry.
OBJECTIVES
In this project, we are creating a visualisation tool that allows users to:
- Gain an overview of the procurement spending of each ministry and agency
- Identify the relationships between ministries, agencies and suppliers
- Identify the goods and services procured by ministries and agencies under each category
SELECTED DATASETS
The following datasets will be used for analysis, as elaborated below:
Dataset/Source | Data Attributes | Rationale of Usage |
---|---|---|
Government Procurement Data (https://data.gov.sg/dataset/government-procurement) | | To gain information on government procurement such as tender description, amount and supplier information |
Ministry and Agencies List | | We will be looking through the Singapore Government Directory (https://www.gov.sg/sgdi/ministries) to categorise the agencies into their respective ministries. This will allow us to visualise the procurement spending on a ministry level. |
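The ministry-level roll-up described in the table above could be sketched as follows. This is a minimal illustration only: the agency and ministry names, the field names `agency` and `awarded_amt`, and the helper `spending_by_ministry` are all hypothetical, and the real lookup would be compiled from the Singapore Government Directory.

```python
from collections import defaultdict

# Hypothetical agency-to-ministry lookup; the real mapping would be
# compiled by hand from the Singapore Government Directory.
AGENCY_TO_MINISTRY = {
    "Agency A": "Ministry X",
    "Agency B": "Ministry X",
    "Agency C": "Ministry Y",
}

# Hypothetical procurement records; the field names are assumptions.
records = [
    {"agency": "Agency A", "awarded_amt": 1000.0},
    {"agency": "Agency B", "awarded_amt": 2500.0},
    {"agency": "Agency C", "awarded_amt": 400.0},
    {"agency": "Agency D", "awarded_amt": 50.0},  # not in the lookup
]


def spending_by_ministry(records, agency_to_ministry):
    """Sum awarded amounts per ministry; unknown agencies go to 'Unmapped'."""
    totals = defaultdict(float)
    for record in records:
        ministry = agency_to_ministry.get(record["agency"], "Unmapped")
        totals[ministry] += record["awarded_amt"]
    return dict(totals)


ministry_totals = spending_by_ministry(records, AGENCY_TO_MINISTRY)
print(ministry_totals)
```

Keeping an explicit "Unmapped" bucket makes it easy to spot agencies that still need to be assigned to a ministry.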
APPROACH
Text Classification using Support Vector Classifier (SVC)
One of the key challenges of working with the provided procurement dataset is that the procurement transactions are not categorised. Instead of labelling them manually, we applied machine learning to classify the tender descriptions into categories. We first scraped procurement descriptions and their categories from the GeBIZ website using the Selenium and BeautifulSoup libraries in Python, and used them as the training and validation dataset for our Support Vector Classifier model. The model achieved a training accuracy of 90% before we used it to predict the categories of the procurement dataset.
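A classification step of this kind could look roughly like the sketch below, which assumes scikit-learn is used. The sample descriptions, the category names, the TF-IDF features and the linear kernel are all illustrative assumptions and not the project's actual data or configuration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical labelled examples standing in for the descriptions and
# categories scraped from GeBIZ.
train_descriptions = [
    "Supply and delivery of desktop computers and monitors",
    "Maintenance of network servers and software licences",
    "Provision of catering services for staff events",
    "Supply of packed meals and refreshments for seminars",
    "Construction of covered linkway and drainage works",
    "Repair and repainting works for office building",
]
train_categories = [
    "IT", "IT", "Food", "Food", "Construction", "Construction",
]

# TF-IDF turns each description into a weighted bag-of-words vector,
# which the Support Vector Classifier then separates by category.
model = make_pipeline(
    TfidfVectorizer(stop_words="english"),
    SVC(kernel="linear"),
)
model.fit(train_descriptions, train_categories)

# Predict categories for uncategorised tender descriptions.
new_descriptions = ["Supply of laptop computers and software"]
predicted = model.predict(new_descriptions)
print(list(zip(new_descriptions, predicted)))
```

In practice, part of the scraped data would be held out as a validation set so that accuracy is not measured only on the examples the model was trained on.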
Government Procurement Dataset after Text Classification
Dataset/Source | Data Attributes |
---|---|
Government Procurement Data | |
BACKGROUND SURVEY OF RELATED WORKS
Some of the visualizations that we drew inspiration from are as follows:
Reference of Other Interactive Visualization | What We Can Learn |
---|---|
BRAINSTORMING SESSIONS
First Draft
[1] Treemap to show the spending breakdown for each category across all agencies under the selected ministry. The filters are year and ministry.
[2] Network diagram to show the relationships between the agencies and suppliers of the selected ministry. The filters are year and ministry.
[3] Sankey diagram to show the cash flow between the selected agency and its suppliers for the selected category. The filters are year, ministry, agency and category.
[4] Word cloud to show an overview of the tender descriptions for the selected agency and category. The filters are year, ministry, agency and category.
After consulting our professor, we improved on our first draft. Below is the second and final draft of our procurement dashboard.
Second Draft
[1] Treemap to show the spending breakdown for each category across all agencies under the selected ministry. The filters are year and ministry.
[2] Network diagram to show the relationships between the agencies and suppliers of the selected ministry. The filters are year and ministry. We added a new filter that lets the user filter the suppliers by procurement amount.
[3] Sankey diagram to show the cash flow between the selected agency and its suppliers for the selected category. The filters are year, ministry, agency and category.
[4] Word cloud to show an overview of the tender descriptions for the selected agency and category. The filters are year, ministry, agency and category. We added a searchable table below the word cloud so that the user can search for keywords and view the exact tender descriptions.
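The data preparation behind views [2] to [4] could be sketched as below. This Python sketch only illustrates the logic and says nothing about how the dashboard itself is built; the records, the field names and the three helper functions are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Hypothetical procurement records; the field names are assumptions.
records = [
    {"agency": "Agency A", "supplier": "Supplier 1", "amount": 50000.0,
     "description": "Supply of laptop computers"},
    {"agency": "Agency A", "supplier": "Supplier 1", "amount": 25000.0,
     "description": "Maintenance of laptop computers"},
    {"agency": "Agency A", "supplier": "Supplier 2", "amount": 4000.0,
     "description": "Catering services for seminar"},
    {"agency": "Agency B", "supplier": "Supplier 2", "amount": 90000.0,
     "description": "Catering services for events"},
]


def supplier_edges(records, min_amount=0.0):
    """Total amount per (agency, supplier) pair, kept if >= min_amount.

    The pairs can serve as edges of the network diagram or as flows of
    the Sankey diagram; min_amount mirrors the procurement amount filter.
    """
    totals = defaultdict(float)
    for r in records:
        totals[(r["agency"], r["supplier"])] += r["amount"]
    return {pair: total for pair, total in totals.items()
            if total >= min_amount}


def word_frequencies(records):
    """Count lowercase words across descriptions (stop words are kept)."""
    counts = Counter()
    for r in records:
        counts.update(re.findall(r"[a-z]+", r["description"].lower()))
    return counts


def search_descriptions(records, keyword):
    """Return descriptions containing the keyword, ignoring case."""
    needle = keyword.lower()
    return [r["description"] for r in records
            if needle in r["description"].lower()]


print(supplier_edges(records, min_amount=10000.0))
print(word_frequencies(records).most_common(3))
print(search_descriptions(records, "catering"))
```

A real word cloud would normally drop common words such as "of" and "for" before counting, which this sketch does not do.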
PROPOSED STORYBOARD
To be filled!
TECHNOLOGIES
Tools and technologies
Data Architecture
KEY CHALLENGES
The following are some of the key technical challenges that we may face throughout the course of the project:
Key Challenges | Mitigation Plan |
---|---|
Unfamiliarity with R and R Shiny Libraries | |
Unfamiliarity with Libraries for Machine Learning and Web Crawling | |
Data Cleaning and Transformation | |
TIMELINE
To be filled!
COMMENTS
Feel free to leave us some comments so that we can improve! We don't bite :)
No. | Name | Date | Comments |
---|---|---|---|
1. | Insert your name here | Insert date here | Insert comment here |
2. | Insert your name here | Insert date here | Insert comment here |
3. | Insert your name here | Insert date here | Insert comment here |