VisualizeR Report
Revision as of 18:03, 6 August 2017
Motivation of the application
Crowdfunding, the practice of using small amounts of capital from a relatively large number of individuals to fund a project or venture, typically through the Internet, has risen to prominence almost exponentially.
Crowdfunding makes use of the easy accessibility of vast networks of friends, family and colleagues through social media websites like Facebook, Twitter and LinkedIn to get the word out about a new business or campaign and attract investors.
Mobile application-mediated crowdfunding, especially, is an emerging paradigm used by individuals to solicit funds from other individuals to realize projects. Crowdfunding platforms such as RocketHub, Kickstarter, and IndieGoGo give anyone with Internet access the opportunity to pitch an idea to their social network and beyond, and to gather funding to realize their work.
Currently, there are more than 100 crowdfunding websites in the US, and they are experiencing exponential growth in popularity. Kickstarter.com, which started in 2009, now has more than $9,000,000 pledged per month. Given the outlook for technology, this field is likely to continue to expand, provided that the right rules and regulations are in place.
Campaigns span many market sectors and domains: technology, business, nonprofit organizations, politics, charity, commerce, and even startup financing.
With this rise in online platforms that let people easily create campaigns, crowdfunding has emerged as an area that is ripe for research.
Review and critique of past works
Despite the growing popularity of crowdfunding, there is little scholarly research in this domain.
Economists study consumer behavior and how consumers continually make choices among products and services. They examine advantages of crowdfunding such as practicing menu pricing and extracting a larger share of the consumer surplus, and disadvantages of crowdfunding such as constraining the choices of prices to attract a large number of funders.
Management scholars find crowdfunding eliminates the effects of distance from funders whom creators did not previously know.
As an area of analysis, the crowdfunding literature has largely focused on predicting the success or failure of campaigns.
As a field of visualization, the data has been left relatively untapped; most visualizations that exist simply show the accuracy of these prediction algorithms.
Design framework
Through this project, and using R and its tools, we have tried to build a platform for exploring the datasets gathered by crowdfunding apps, in order to understand and visualize patterns among viewers and investors. The application supports exploratory data analysis (via choropleths, heatmaps and calendar maps) by showing, for example, which age group contributes most or which states contribute heavily to crowdfunding projects. It helps us find specific segments of users who show interest in a specific category of project (Health, Environmental, Technological, Sports, Politics, etc.) that the app publishes. It reveals user behavior through sunburst charts for various regions and states, and helps us find the regions that lean toward cautious investing or impulsive funding. The clustering techniques demonstrated in CFVAR (k-means with parallel coordinates visualization) help us segment users in ways that matter to individual users or corporations for their ongoing and upcoming projects. Both researchers of crowdfunding and people interested in starting their own campaigns can benefit from such tools, as they can use these visualizations to make better sense of the data. Because the domain is still emerging, the visualizations explored here are just the beginning of what can be a growing area of research and analysis.
DESIGN WORKFLOW
For this analysis, we used a publicly available dataset from the Bootloader app, an app that collects information on the viewing and funding activity of users on crowdfunding sites.
The dataset consists of 50,000 observations of 10,466 distinct users/visitors across 5 project categories (Environment, Games, Sports, Fashion, Technology).
The dataset covers US users and includes their demographics along with the location (latitude, longitude) of each visitor.
Data View
An overview of the data set is included for the user to have a basic understanding of our data.
The columns can be rearranged, and a search function lets the user look up specific information.
Data Exploratory
A Data Exploratory tab is included for the user to explore the relationships between different variables.
Tab Panel
We divided the Data Exploratory tab into three parts.
- The first one is a distribution of the total amount of money funded per user.
- The second one is a line plot that allows the user to select up to four different numeric variables and see how they change day by day over one month.
- The third one is a bar chart for different categorical variables selected.
Selection and Checkbox
We include a selection panel for the user to select different variables and a checkbox to decide whether to include them in the chart.
A line plot is chosen because our data are time-based and it is useful for deriving insights about how the variables change over time.
Analytics Dashboard
The crowdfunding dashboard is largely split into 4 areas
- A choropleth map of the USA in which each state's color intensity is proportional to the amount funded from that state.
- A calendar map showing on which day of the month and at which hour funding was received.
- A bar chart showing the proportion of funded categories for each state.
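The aggregations behind the choropleth and the calendar map can be sketched as below. This is only an illustration in Python, not the R code of the app itself; the record layout (state, timestamp, amount) and the sample values are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical funding records: (state, timestamp, amount in USD).
# The field layout and the values are assumptions for illustration.
funding = [
    ("CA", "2017-07-01 10:05", 25.0),
    ("CA", "2017-07-01 10:40", 10.0),
    ("NY", "2017-07-15 21:10", 50.0),
    ("TX", "2017-07-15 21:55", 5.0),
]


def total_by_state(rows):
    """Sum funded amounts per state (drives the choropleth color intensity)."""
    totals = defaultdict(float)
    for state, _, amount in rows:
        totals[state] += amount
    return dict(totals)


def total_by_day_hour(rows):
    """Sum funded amounts per (day of month, hour) cell of a calendar map."""
    totals = defaultdict(float)
    for _, timestamp, amount in rows:
        moment = datetime.strptime(timestamp, "%Y-%m-%d %H:%M")
        totals[(moment.day, moment.hour)] += amount
    return dict(totals)


state_totals = total_by_state(funding)
calendar_totals = total_by_day_hour(funding)
print(state_totals)
print(calendar_totals)
```

Each function returns a plain dictionary, so the same totals could feed any plotting layer; the per-state category proportions for the bar chart would follow the same grouping pattern with a (state, category) key.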
Hover
Click
User Behavior Diagram
Since our data capture when a user views or funds a project and which category that project belongs to, one user ID has multiple rows with different activities but the same demographic characteristics. It is therefore more insightful to uncover the sequence of each user's behavior.
Sunburst visualizations are widely used in web analytics and clickstream analysis to answer questions such as the following:
- What journey do most users take towards viewing or funding on the app?
- What do users do after viewing certain projects?
- Which paths end in churn?
A sunburst visualization can be thought of as a form of unsupervised clickstream clustering for user behavior analysis. It helps us segment users by how they navigate through mobile apps or websites, thus revealing their interests and decision making. The sunburst is one clear way to find users who are interested in only one specific category. For example, if a user views technology projects after viewing games instead of ending the session, we can tentatively conclude that the user keeps their interest open to other projects and does not bind themselves to just one category. The sunburst is an effective way to display multiple paths: the round layout lets the most common paths shine, while behavioral anomalies stand out as spikes. Built as a dynamic report, it lets you select a path, or a step in a path, to get more detailed information.
To do so, data manipulation is needed to construct a new user behavior sequence table to output a sunburst chart. The sequence order is defined according to the time the user performed that action. After that, a data frame that only captures the sequences and the counts of different sequences is constructed.
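The reshaping described above can be sketched as follows. This is an illustrative Python version, not the app's own R code; the event layout (user ID, timestamp, activity, category), the step labels and the hyphen-delimited path format are assumptions chosen for the example.

```python
from collections import Counter, defaultdict

# Hypothetical activity log: (user_id, timestamp, activity, category).
# Field names and values are assumptions for illustration only.
events = [
    ("u1", "2017-07-01 10:05", "view", "Games"),
    ("u1", "2017-07-01 10:02", "view", "Sports"),
    ("u1", "2017-07-01 10:09", "fund", "Sports"),
    ("u2", "2017-07-02 18:30", "view", "Sports"),
    ("u2", "2017-07-02 18:41", "fund", "Sports"),
    ("u3", "2017-07-03 09:00", "view", "Games"),
    ("u4", "2017-07-04 12:15", "fund", "Sports"),
    ("u4", "2017-07-04 12:00", "view", "Sports"),
]


def build_sequence_counts(rows):
    """Map each hyphen-delimited behavior path to the number of users on it."""
    steps_per_user = defaultdict(list)
    for user_id, timestamp, activity, category in rows:
        steps_per_user[user_id].append((timestamp, f"{activity}_{category}"))

    path_counts = Counter()
    for steps in steps_per_user.values():
        # Timestamps in this fixed format sort chronologically as strings.
        steps.sort()
        path = "-".join(label for _, label in steps)
        path_counts[path] += 1
    return path_counts


sequence_counts = build_sequence_counts(events)
for path, count in sequence_counts.most_common():
    print(path, count)
```

The result is a two-column table of path and count, which is the general shape a sunburst chart needs. Step labels use an underscore so that the hyphen stays free as the path delimiter.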
Selection Panel
However, a single sunburst chart that captures all user behavior sequences across the whole U.S. would not be very informative, as it would be too general to segment the users. Therefore, we let the user apply filters to the data set to focus on users of a certain location, gender, marital status and device.
In addition, we incorporated five sunburst charts to compare the behavior sequences of different age groups.
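A minimal sketch of this filter-then-split step is shown below, again in Python rather than the app's R code. The user fields and the five age brackets are assumptions; the report does not list the actual brackets beyond mentioning a 25-34 group.

```python
from collections import defaultdict

# Hypothetical user records; field names and values are assumptions.
users = [
    {"user_id": "u1", "age": 29, "gender": "M", "region": "North East", "device": "iOS"},
    {"user_id": "u2", "age": 41, "gender": "M", "region": "North East", "device": "Android"},
    {"user_id": "u3", "age": 33, "gender": "F", "region": "West", "device": "iOS"},
    {"user_id": "u4", "age": 26, "gender": "M", "region": "North East", "device": "iOS"},
]

# Assumed brackets, one per sunburst chart; the last one is open-ended.
AGE_BRACKETS = [(18, 24), (25, 34), (35, 44), (45, 54), (55, None)]


def age_group(age):
    """Return the bracket label for an age, or None if below all brackets."""
    for low, high in AGE_BRACKETS:
        if age >= low and (high is None or age <= high):
            return f"{low}+" if high is None else f"{low}-{high}"
    return None


def filter_users(records, **criteria):
    """Keep records whose fields equal every given criterion."""
    return [r for r in records
            if all(r.get(key) == value for key, value in criteria.items())]


def split_by_age_group(records):
    """Group user IDs by age bracket, one group per sunburst chart."""
    groups = defaultdict(list)
    for record in records:
        label = age_group(record["age"])
        if label is not None:
            groups[label].append(record["user_id"])
    return dict(groups)


selected = filter_users(users, gender="M", region="North East")
groups = split_by_age_group(selected)
print(groups)
```

Each resulting group of user IDs would then be used to build its own sequence table and chart.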
Demonstration
DATA EXPLORATORY
The Distribution tab shows the distribution of the USD amounts funded by users. As the graph below shows, the distribution is highly skewed toward 0 because view activities carry a null dollar amount in the data.
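The effect of view-only activity on this distribution can be illustrated with a small sketch. It is written in Python for illustration only; the record layout and the `funders_only` switch are hypothetical and not taken from the app.

```python
from collections import defaultdict

# Hypothetical activity rows: (user_id, activity, amount in USD or None).
# View rows carry no dollar amount, mirroring the null values in the data.
activity = [
    ("u1", "view", None),
    ("u1", "fund", 20.0),
    ("u2", "view", None),
    ("u3", "view", None),
    ("u3", "fund", 5.0),
    ("u3", "fund", 15.0),
]


def total_funded_per_user(rows, funders_only=False):
    """Sum funded amounts per user.

    With funders_only=False, users who only viewed appear with a total of 0,
    which is what piles the distribution up at zero.
    """
    totals = defaultdict(float)
    for user_id, act, amount in rows:
        if act == "fund" and amount is not None:
            totals[user_id] += amount
        elif not funders_only:
            totals[user_id] += 0.0  # register the user with a zero total
    return dict(totals)


all_users = total_funded_per_user(activity)
funders = total_funded_per_user(activity, funders_only=True)
print(all_users)
print(funders)
```

Plotting only the funders removes the spike at zero, at the cost of hiding how many users never fund at all.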
ANALYTICS DASHBOARD
SUNBURST
Here is a very simple example of interacting with the sunburst charts. Hover over one part of the rings to see the count and the percentage of that behavior. Male users aged 25-34 in the North East are particularly interested in sports projects, and they are more likely to fund sports projects after viewing them than projects of other categories.
Though interpreting the sunburst chart is fairly straightforward, a couple of details are worth bearing in mind.
- First, sessions are defined as gaps in action. The interval of time between two activities of a user is not reflected in the sunburst.
- Second,
Discussion
With an inclusion of a Data Exploratory Tab to allow users to visually explore the distribution of the variables and their relationship with one another, it is hoped that users can gain further insights about our data.
Through the Analytics Dashboard, users can gain further insights into where the app users are located, when they interacted with the app and how much they contributed to the projects.
The sunburst charts divided into age groups can help the user find the pattern of behaviors from different segments of the app-users and decide which specific group to target.
We hope that users will be inspired to perform deeper data-driven and visual analysis with the help of the dashboard.
Future Work
- First, we plan to collect more data and do a deeper analysis. Ideally, the data would include an ID for each project, to reveal patterns of viewing and funding for specific projects coming from the creators. Any information about the creators of the project (viz. the rating or expertise of the creator) would also be valuable.
- Second, we would like to consider how one project leads to other projects or innovations, and how many of them turn into mega projects or companies at record pace. It would be good to find out whether investors also play the role of creators at any point in time, and how varied or similar the project scope is compared with the projects they have invested in before.
- Third, we would like to perform time series analysis to find cyclical patterns and to understand how investments are linked to the financial calendar of the investors.
- In sum, the application has set a good foundation for us to perform data analytics on this area of research and it can be further strengthened and made robust with the right sort of data.
Installation guide
No installation is required; you can access the application at the following link: [the visualizeR app]
To run the application in RStudio: after setting up RStudio (https://www.rstudio.com/products/rstudio/download/), the end user will need to install and load the following packages:
- shinydashboard
- plotly
- tidyverse (lubridate, dplyr, readr)
- sunburstR
User Guide
- Step-by-step guide on how to use the data visualisation functions designed.