IS428 2018-19 T1 Group 03 Unicorn Ventures: Proposal

From Visual Analytics for Business Intelligence



INTRODUCTION


Blockchain, artificial intelligence, data science, edtech and the internet of things are all buzzwords in today's innovation era. More and more founders and investors are beginning to see the potential of innovation in Asia. Meanwhile, governments in the region have introduced new policies and initiatives to explore new frontiers of technological innovation and to boost the competitiveness of various knowledge-based industries. In recent years, the governments of both Hong Kong and Singapore have committed resources, including start-up clusters, grants and funding, to boost their start-up ecosystems.

The two Asian Tigers, Singapore and Hong Kong, are the main contexts for this project. They are two high-growth metropolitan city-states in Asia that are similar in GDP per capita and population density. Beyond that, both Hong Kong and Singapore offer comprehensive financial and technical infrastructure and have attracted a considerable amount of foreign investment. It is commendable that both economies have achieved stellar economic performance despite their lack of natural resources and their small land area. Unicorn Ventures aims to study the current start-up ecosystems in these two city-based regions and hopes to generate new insights for policy makers, founders and investors.

MOTIVATION


There is a general lack of effective, user-friendly visualizations for discovering country-specific differences between start-up ecosystems. The main motivation behind this project is therefore to create a centralized, dynamic and interactive dashboard for quantitative comparison of various aspects of start-ups and funding organizations in Singapore and Hong Kong. Following the User-Centric Dashboard Design Guide, this dashboard takes a broad, strategic, customizable, drillable and exploratory approach and is targeted at potential entrepreneurs, policy makers and investors. This project focuses on start-up companies and funding organizations in the tech ecosystem. The insights generated could:

  • Enable potential entrepreneurs to understand growing and declining industries and investors’ profiles, and to pinpoint the top funded start-ups
  • Help policy makers identify potentially profitable and leading industries and dedicate more resources to specific industries
  • Assist investors in identifying the differences between Singapore’s and Hong Kong’s start-up industries and in strategizing future investments in these two regions


OBJECTIVES


Our project aims to explore and compare the following aspects of the start-up ecosystems in Singapore and Hong Kong, considering start-ups founded after 2000. We hope to address the following questions.

For Entrepreneurs:

  • Time-Series Analysis: When were the start-ups founded, funded and exited?
  • Funding Analysis: Which start-ups are the top funded?
  • Spatial Analysis: Where do the investors originate from?

For Policy Makers/Investors:

  • Profile Analysis: How does start-up formation differ across the years?
  • Industry Analysis: How do start-ups in various industries and sectors compare on different performance indicators?


SELECTED DATASET
Dataset

Basic Startup Information in Singapore and Hong Kong

  • Description: This dataset includes various key attributes of startups in Singapore and Hong Kong that were founded after 2000.
  • Source: Crunchbase
  • Dataset
  • Components:
Field | Type | Description
Organization Name | String | Organization Name
Categories | String | Industry the organization belongs to
Sub Category 1 | String | Sub category the organization belongs to
Sub Category 2 | String | Sub category the organization belongs to
Headquarter Location | String | Organization headquarter location
Description | String | Organization description
Founded_Date | String | Organization founding date
Exit Date | String | Organization exit date
Last Funding Date | Date | Last Funding Date
Last Funding Type | String | Last Funding Type
Last Funding Amount Currency (in USD) | String | Last Funding Amount in USD
Total Funding Amount Currency in USD | String | Total Funding Amount in USD

Investment and Funding Information in Singapore and Hong Kong

  • Description: This dataset details the individual, publicly disclosed funding transactions published on Crunchbase.
  • Source: Crunchbase
  • Dataset
  • Components:
Field | Type | Description
Transaction Name | String | Auto-generated name of transaction (e.g. Angel-Uber)
Organization Name | String | Name of the organization that got funded
Categories | String | Industry the organization belongs to
Location | String | Location of the organization that got funded
Funding Type | String | Type of Funding Round (e.g. Seed, Series A, Private Equity, Debt Financing)
Money Raised Currency (in USD) | Integer | Amount of money raised in Funding Round
Announced Date | Date | Date that the Funding Round was publicly announced
Funding Stage | String | The funding stage of a funding round

Investor Information in Singapore and Hong Kong

  • Description: This dataset shows the investor locations and the number of investments made by investors from each location.
Field | Type | Description
Location | String | Investor locations
Lat | Double | Latitude of the location
Categories | Double | Longitude of the location
Number_of_Investments | Integer | Number of investments made by investors from the location

Other Startup Information in Singapore and Hong Kong

  • Description: This dataset includes other attributes of startups in Singapore and Hong Kong that were founded after 2000.
  • Source: Crunchbase
  • Dataset
  • Components:
Field | Type | Description
Organization Name | String | Organization Name
Categories | String | Industry the organization belongs to
Sector | String | The sector the organization belongs to
Headquarter Location | String | Organization headquarter location
CB_Rank | Integer | Crunchbase Rank
Age_days | Integer | Age of the organization measured in days
SimilarWeb_Average_Visits | Double | Average visits to the organization website recorded by SimilarWeb
SimilarWeb_Visit_Duration | Integer | Visit duration on the organization website recorded by SimilarWeb
SimilarWeb_Global_Traffic_Rank | Double | Traffic ranking of the organization website among all organization websites globally, recorded by SimilarWeb
BuiltWith_Active_Tech_Count | Integer | Count of active technologies in use, recorded by BuiltWith
BACKGROUND SURVEY OF RELATED WORKS
Related Works What We Can Learn
Heat Map
Heatmap unicorn.PNG
  • A heat map can display time series data for multiple dimensions across a fixed set of categories and this provides more information in one single chart.
  • By looking at the overall color intensity, users are able to identify the overall pattern of the data and spot any exceptions/outliers.
  • The color intensity displays the quantity while tooltip enables users to hover over the individual shaded area and understand the exact quantity.
Cleveland Dot Plot
Cleveland dot plot unicorn.PNG
  • A Cleveland Dot Plot is effective for comparing the data of two parties on a single chart. It can easily communicate large gaps between the two to the user.
  • Unlike a bar chart, the dot plot reduces clutter and maximizes the data-ink ratio.


Violin Plot
Violin plot 1.png

A violin plot shows more information in one plot:

  • Interquartile range (25th percentile, median, 75th percentile), maximum and minimum
  • 95% confidence interval
  • Average
  • Density plot indicating the concentration of data

This can be used in our industry analysis to achieve a more insightful visualization.
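As a rough illustration of the summary statistics a violin plot layers on top of its density shape, the sketch below computes the quartiles, extremes and average for one group of values. It is written in Python purely for illustration (the project itself plans to use R), the `ages_in_days` values are made up, and the "inclusive" quantile method is an assumption; plotting libraries may use a different convention.

```python
from statistics import mean, quantiles

def violin_summary(values):
    """Return the summary statistics typically annotated on a violin plot."""
    # quantiles(..., n=4) returns the three cut points: Q1, median, Q3.
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "mean": mean(values),
    }

# Hypothetical company ages (in days) for one sector; not real data.
ages_in_days = [100, 200, 300, 400, 500]
summary = violin_summary(ages_in_days)
```

The density outline and the confidence interval are left out here; they would normally come from the plotting library rather than being computed by hand.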

Dot Density Map
Map-dot.png
  • A Dot Density Map is useful for showing the density of values in different locations.
  • The color and size of the dots can represent different attributes.
SKETCHES STORYBOARD
Sketches How Analyst Can Conduct Analysis
Heatmap unicorn sketch.PNG

Heat Map:

  • Shows a time-series analysis of startups that were formed or exited, and of their funding amounts, across the years by industry
  • Allows the user to identify growing industries as well as trends in start-up formation and funding amount
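The grid behind such a heat map is a count of start-ups per industry and founding year. The following is a minimal sketch of that aggregation, in Python for illustration only (the project plans to use R). The records are invented, and the `YYYY-MM-DD` date format is an assumption about how Founded_Date is stored.

```python
from collections import defaultdict

def heatmap_counts(records):
    """Count start-ups per (industry, founding year) cell."""
    counts = defaultdict(int)
    for industry, founded_date in records:
        year = int(founded_date[:4])  # assumes the string starts with a 4-digit year
        counts[(industry, year)] += 1
    return dict(counts)

def to_grid(counts):
    """Lay the cell counts out as rows (industries) by columns (years)."""
    industries = sorted({industry for industry, _ in counts})
    years = sorted({year for _, year in counts})
    grid = [[counts.get((industry, year), 0) for year in years]
            for industry in industries]
    return industries, years, grid

# Hypothetical (industry, Founded_Date) pairs; not taken from the dataset.
records = [
    ("FinTech", "2015-03-01"),
    ("FinTech", "2015-11-20"),
    ("EdTech", "2016-06-05"),
    ("FinTech", "2016-01-15"),
]
cells = heatmap_counts(records)
industries, years, grid = to_grid(cells)
```

The same grid could hold exit counts or summed funding amounts instead of formation counts; each cell value would then drive the color intensity.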
Cleveland dot plot unicorn sketch.PNG

Cleveland Dot Plot with Filters:

  • Presents the current breakdown of startups by industry
  • The colour of the dots represents the region
Violin-own-2.png

Violin Plot with filters:

  • Displays values across sectors, and across the industries inside each sector, for the 2 regions. The values are: company age, SimilarWeb visit duration, Crunchbase rank, SimilarWeb average visits, BuiltWith active technology count and SimilarWeb global traffic rank.
  • The values shown are accessible through tabs.
  • The breakdown of industries in each sector is shown by selecting the sector in a dropdown list.
  • The user can choose a bar chart instead of the violin plot to get an overview of the data.

Density-map.png

Dot Density Map:

  • Color shows the number of investments by percentile (to standardize the skewed counts)
  • Size shows the number of investments
  • The style of the map can be changed using the radio button
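To show why percentiles help with skewed counts, here is a small sketch that converts raw investment counts into percentile ranks. It is in Python for illustration only (the project plans to use R), the counts are made up, and the "share of values less than or equal to this one" definition is one assumed convention among several.

```python
from bisect import bisect_right

def percentile_ranks(values):
    """For each value, return the percentage of values less than or equal to it."""
    ordered = sorted(values)
    n = len(ordered)
    return [100.0 * bisect_right(ordered, value) / n for value in values]

# Hypothetical Number_of_Investments per investor location; one extreme outlier.
counts = [1, 2, 2, 5, 400]
ranks = percentile_ranks(counts)
```

With raw counts, the location with 400 investments would dominate a linear color scale and the others would look identical. Percentile ranks spread the locations across the whole scale, while dot size can still carry the raw count.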
ARCHITECTURE DIAGRAM


Technology diagram VA.png


KEY TECHNICAL CHALLENGES


Domain Knowledge Understanding:

  • As the datasets involve many technical terms from the startup ecosystem, the group has to study the terminology in more depth in order to draw meaningful insights. This includes the definitions of the different funding stages, funding types, investor types, startup industry categories, etc.

Data Preprocessing:

  • Missing data: how to deal with missing values
  • Data integration and calculation: understanding the column attributes and performing meaningful summations or calculations
  • Multiple values for certain observations: how to deal with such attributes and ensure that the visualizations account for start-ups that have attributes with multiple values
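One possible way to approach the missing-value and multi-value challenges is sketched below, in Python for illustration only (the project plans to use R). The rows, the shortened `Total Funding` column name, the comma separator and the "Unknown" placeholder are all assumptions, not properties of the actual Crunchbase export.

```python
from collections import Counter

# Hypothetical rows; real column names and formats may differ.
rows = [
    {"Organization Name": "Alpha", "Categories": "FinTech, Payments", "Total Funding": "1500000"},
    {"Organization Name": "Beta", "Categories": "EdTech", "Total Funding": ""},
    {"Organization Name": "Gamma", "Categories": "", "Total Funding": "250000"},
]

def parse_amount(raw):
    """Return the amount as an int, or None when the value is missing."""
    raw = raw.strip()
    return int(raw) if raw else None

def split_categories(raw):
    """Split a comma-separated category string into a clean list."""
    return [part.strip() for part in raw.split(",") if part.strip()]

def explode_by_category(rows):
    """Emit one record per (start-up, category) pair."""
    exploded = []
    for row in rows:
        # Keep start-ups without a category under an explicit placeholder.
        categories = split_categories(row["Categories"]) or ["Unknown"]
        for category in categories:
            exploded.append({
                "name": row["Organization Name"],
                "category": category,
                "funding": parse_amount(row["Total Funding"]),
            })
    return exploded

exploded = explode_by_category(rows)
startups_per_category = Counter(record["category"] for record in exploded)
```

Missing amounts stay as `None` rather than being turned into zero, so they can be excluded from averages. Note that after this step a start-up with several categories appears once per category, so funding totals summed across categories would count it more than once.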

Technological Expertise:

  • Learning the relevant R packages, such as ggplot2, tidyverse, shiny and plotly
  • Learning how to integrate D3.js with R to achieve both advanced analytics functions and interactive visualization
  • Learning how to integrate different charts and enhance the interactivity and animation of the storyboard


PROJECT TIMELINE


Timeline Diagram.png


COMMENTS

Feel free to leave comments, suggestions and feedback to help us improve our project! :D