INTRODUCTION
Blockchain, artificial intelligence, data science, edtech and the Internet of Things are all buzzwords in today's innovation era. More and more founders and investors are beginning to see the potential of innovation in Asia. Meanwhile, governments in the region have introduced new policies and initiatives to explore new frontiers of technological innovation in order to boost the competitiveness of various knowledge-based industries. In recent years, both the Hong Kong and Singapore governments have pumped in resources, including start-up clusters, grants and funding, to boost their start-up ecosystems.
Two of the Asian Tigers, Singapore and Hong Kong, will be the main contexts for this project. They are two high-growth metropolitan economies in Asia that share many characteristics, such as GDP per capita and population density. Beyond that, both Hong Kong and Singapore offer comprehensive financial and technical infrastructure and have attracted a considerable amount of foreign investment. It is commendable that both economies have achieved stellar economic performance despite lacking natural resources and having a small land area. Unicorn Ventures strives to study the current start-up ecosystems in these two city-based regions and hopes to generate new insights for policy-makers, founders and investors.
MOTIVATION
Our research is motivated by the lack of a comprehensive comparison between the two regions' startup ecosystems. Currently, there are many fragmented datasets on different aspects of the startup ecosystem, such as top-funded companies, industry breakdown and funding analysis. Our project aims to consolidate these data sources and study the two startup ecosystems through a centralized interactive web application.
This project will focus on start-up companies and funding organizations in the tech ecosystem. The insights generated could help:
- Hong Kong and Singapore policy makers improve their existing infrastructure and policies to cultivate a more robust startup ecosystem
- Potential and current founders understand the growing industries, competitive landscape and investors
- Potential investors identify the growing industries and dominant players
OBJECTIVES
Our project aims to explore and compare the following aspects of the startup ecosystems in Singapore and Hong Kong, considering startups founded after 2000.
Ecosystem Index Analysis:
- How do Hong Kong and Singapore differ in terms of the Global Entrepreneurship Index and the Global Competitiveness Index?
Startup Analysis:
- Time-series analysis of startups that have formed or exited across the years, by industry
- What is the current breakdown of startups by industries, key industries, team size, funding stage, age and gender of founders?
Funding Analysis:
- What are the top-funded startups, and what are their funding stages and industries?
- Time-series analysis of the disclosed funding over the years by industries
- Where do the investors originate from and what are the investor types?
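The time-series questions above come down to counting records per year and per industry. The following is a minimal sketch of that aggregation in plain Python, used here only to illustrate the logic and not as the project's implementation. The sample rows are invented, the function name is hypothetical, and the code assumes that `Founded_Date` is stored as an ISO-style `YYYY-MM-DD` string, which the dataset description does not confirm.

```python
from collections import Counter

# Hypothetical sample rows; keys mirror the "Categories" and "Founded_Date"
# fields, and the values are invented purely for illustration.
SAMPLE_STARTUPS = [
    {"Categories": "FinTech", "Founded_Date": "2012-05-01"},
    {"Categories": "FinTech", "Founded_Date": "2012-11-20"},
    {"Categories": "EdTech", "Founded_Date": "2015-03-14"},
    {"Categories": "EdTech", "Founded_Date": None},  # missing date is skipped
]


def count_founded_by_year_and_industry(rows):
    """Count startups per (founding year, industry) pair."""
    counts = Counter()
    for row in rows:
        founded = row.get("Founded_Date")
        if not founded:
            continue  # a startup without a date cannot be placed on the time axis
        year = int(founded[:4])  # assumes an ISO-style "YYYY-MM-DD" string
        counts[(year, row["Categories"])] += 1
    return counts


if __name__ == "__main__":
    result = count_founded_by_year_and_industry(SAMPLE_STARTUPS)
    for (year, industry), n in sorted(result.items()):
        print(year, industry, n)
```

The same grouping could be applied to exit dates or to disclosed funding amounts by swapping the date field and summing a value instead of counting rows.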
SELECTED DATASET
Basic Startup Information in Singapore and Hong Kong
- Description: This dataset includes key attributes of startups in Singapore and Hong Kong that were founded after 2000.
- Source: Crunchbase
- Dataset
- Components:
| Field | Type | Description |
|---|---|---|
| Organization Name | String | Organization name |
| Categories | String | Industry the organization belongs to |
| Sub Category 1 | String | Sub category the organization belongs to |
| Sub Category 1 | String | Sub category the organization belongs to |
| Headquarter Location | String | Organization headquarter location |
| Description | String | Organization description |
| Founded_Date | String | Organization founding date |
| Exit Date | String | Organization exit date |
Investment and Funding Information in Singapore and Hong Kong
- Description: This dataset details the individual disclosed funding transactions that are public and published on Crunchbase.
- Source: Crunchbase
- Dataset
- Components:
| Field | Type | Description |
|---|---|---|
| Transaction Name | String | Auto-generated name of the transaction (e.g. Angel-Uber) |
| Organization Name | String | Name of the organization that was funded |
| Categories | String | Industry the organization belongs to |
| Location | String | Location of the organization that was funded |
| Funding Type | String | Type of funding round (e.g. Seed, Series A, Private Equity, Debt Financing) |
| Money Raised Currency (in USD) | Integer | Amount of money raised in the funding round |
| Announced Date | Date | Date on which the funding round was publicly announced |
| Funding Stage | String | The funding stage of the funding round |
Investor Information in Singapore and Hong Kong
- Description: This dataset shows investor locations and the number of investments made by investors from each location.
| Field | Type | Description |
|---|---|---|
| Location | String | Investor location |
| Lat | Double | Latitude of the location |
| Long | Double | Longitude of the location |
| Number_of_Investments | Integer | Number of investments made by investors from the location |
Other Startup Information in Singapore and Hong Kong
- Description: This dataset includes other attributes of startups in Singapore and Hong Kong that were founded after 2000.
- Source: Crunchbase
- Dataset
- Components:
| Field | Type | Description |
|---|---|---|
| Organization Name | String | Organization name |
| Categories | String | Industry the organization belongs to |
| Sector | String | The sector the organization belongs to |
| Headquarter Location | String | Organization headquarter location |
| CB_Rank | Integer | Crunchbase rank |
| Age_days | Integer | Age of the organization, measured in days |
| SimilarWeb_Average_Visits | Double | Average visits to the organization's website, as recorded by SimilarWeb |
| SimilarWeb_Visit_Duration | Integer | Visit duration on the organization's website, as recorded by SimilarWeb |
| SimilarWeb_Global_Traffic_Rank | Double | Global traffic ranking of the organization's website among all organization websites, as recorded by SimilarWeb |
| BuiltWith_Active_Tech_Count | Integer | Count of active technologies in use, as recorded by BuiltWith |
BACKGROUND SURVEY OF RELATED WORKS
Related works and what we can learn from each:
Heat Map:
- A heat map can display time-series data for multiple dimensions across a fixed set of categories, which provides more information in a single chart.
- By looking at the overall pattern of color intensity, users are able to identify the overall pattern of the data and spot any exceptions or outliers.
- The color intensity displays the quantity, while a tooltip lets users hover over an individual shaded area and see the exact quantity.
Cleveland Dot Plot:
- A Cleveland dot plot is effective for comparing data from two parties on a single chart. It can easily communicate large gaps between two objects to the user.
- Unlike a bar chart, the dot plot reduces clutter and maximizes the data-ink ratio.
Violin Plot:
A violin plot shows more information in a single plot:
- Interquartile range (25th percentile, median, 75th percentile), maximum and minimum
- 95% confidence interval
- Average
- Density plot indicating the concentration of data
This can be used in our industry analysis to achieve a more insightful visualization.
Dot Density Map:
- A dot density map is useful for showing the density of values in different locations.
- The color and size of the dots can represent different things.
SKETCHES STORYBOARD
How analysts can conduct analysis with each sketch:
Heat Map:
- Shows a time-series analysis of startups that have formed or exited, and their funding amounts, across the years according to different industries
- Allows the user to identify the growing industries as well as the trends in start-up formation and funding amount
Cleveland Dot Plot with Filters:
- Presents the current breakdown of startups by industry
- Colour of the dots represents the region
Violin Plot with Filters:
- Displays values across sectors, and across the industries within each sector, for the two regions. The values are: company age, SimilarWeb visit duration, Crunchbase rank, SimilarWeb average visits, BuiltWith active technology count and SimilarWeb global traffic rank.
- The values shown are accessible through tabs.
- The breakdown of industries in each sector is shown by selecting the sector in a dropdown list.
- Users can switch from the violin plot to a bar chart to get an overview of the data.
Dot Density Map:
- Color shows the number of investments by percentile (to standardize the skewed counts)
- Size shows the number of investments
- The style of the map can be changed using the radio button
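The percentile colouring described for the Dot Density Map can be sketched as a percentile-rank transform over the investment counts. This is only an illustration in plain Python under stated assumptions: the function name is hypothetical, the sample counts are invented, and the definition used (share of values less than or equal to each value) is one of several common percentile-rank definitions, not necessarily the one the project adopts.

```python
from bisect import bisect_right


def percentile_ranks(values):
    """Return, for each value, the percentage of values that are <= it."""
    ordered = sorted(values)
    n = len(ordered)
    # bisect_right gives the number of sorted items that are <= v
    return [100.0 * bisect_right(ordered, v) / n for v in values]


if __name__ == "__main__":
    # Invented, heavily skewed investment counts per investor location
    number_of_investments = [1, 2, 2, 5, 400]
    print(percentile_ranks(number_of_investments))
    # prints [20.0, 60.0, 60.0, 80.0, 100.0]
```

Mapping colour to these ranks keeps a single very large location (400 in the sample) from washing out the colour differences among the smaller ones, while dot size can still carry the raw count.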
ARCHITECTURE DIAGRAM
KEY TECHNICAL CHALLENGES
Domain Knowledge Understanding:
- As the datasets involve many technical terms from the startup ecosystem, the group has to study the terminology in more depth in order to draw meaningful insights. This includes the definitions of the different funding stages, types of funding, types of investors and startup industry categories.
Data Preprocessing:
- Missing data: how to deal with missing values
- Data integration and calculation: understanding the column attributes and performing meaningful summations or calculations
- Multiple values for certain observations: how to deal with such attributes and ensure that the visualizations account for start-ups that have attributes with multiple values
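One possible way to approach the missing-value and multi-value challenges above is sketched below in plain Python, purely as an illustration of the logic. The assumptions are that multiple categories arrive as a single comma-separated string, that missing values appear as `None`, and that rows without a category are grouped under an "Unknown" label. The helper names and sample rows are hypothetical.

```python
FUNDING_FIELD = "Money Raised Currency (in USD)"


def split_categories(raw):
    """Split a comma-separated category string into clean labels."""
    if raw is None:
        return []
    return [part.strip() for part in raw.split(",") if part.strip()]


def explode_by_category(rows):
    """Emit one row per category so each category can be counted separately."""
    exploded = []
    for row in rows:
        # Rows with no category are kept under an explicit "Unknown" label
        for category in split_categories(row.get("Categories")) or ["Unknown"]:
            exploded.append({**row, "Categories": category})
    return exploded


def total_raised(rows):
    """Sum disclosed funding, skipping rows where the amount is missing."""
    return sum(r[FUNDING_FIELD] for r in rows if r.get(FUNDING_FIELD) is not None)


if __name__ == "__main__":
    # Invented sample rows for illustration only
    rows = [
        {"Organization Name": "A", "Categories": "FinTech, Payments", FUNDING_FIELD: 100},
        {"Organization Name": "B", "Categories": None, FUNDING_FIELD: None},
    ]
    print([r["Categories"] for r in explode_by_category(rows)])
    print(total_raised(rows))
```

Note that summing funding over the exploded rows would count a multi-category startup once per category, so totals are computed on the original rows here and the exploded rows are reserved for per-category breakdowns.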
Technological Expertise:
- Learning relevant packages under R such as ggplot2, tidyverse, shiny and plotly
- Learning how to integrate D3.js with R to achieve both advanced analytics functions as well as interactive visualization
- Learning how to integrate different charts and enhance the interactivity and animation of the storyboard
PROJECT TIMELINE
REFERENCES
COMMENTS
Feel free to leave comments, suggestions and feedback to help us improve our project! :D