Difference between revisions of "Group27 Report"

From Visual Analytics and Applications
Revision as of 18:08, 13 August 2018


Group27: Airbnb Crisis in NYC



INTRODUCTION

  • 1.1 Background

Airbnb has grown quickly in recent years: it boasts almost two million listings in 34,000 cities and, according to data from Inside Airbnb, an independent data analysis website, lists about 36,000 apartments in New York, one of its biggest markets. However, the New York State Attorney General concluded that 72% of all units used as private short-term rentals on Airbnb during recent years appeared to violate both state and local New York laws. New York’s short-term rental laws, which were last updated in 2010, prohibit most apartments (in buildings with three or more units) in New York City from being rented out for less than 30 days. Therefore, the majority of entire home/apartment listings found on Airbnb and other sites for New York City would be considered illegal, especially if they can be booked for a period of less than 30 days. Meanwhile, some hosts are purchasing more properties to extend their operations; we call them commercial operators. Our project has two main objectives: 1) explore and visualize how Airbnb operates in New York City; 2) determine how New York’s short-term rental laws affected illegal listings and commercial operators over three years.

  • 1.2 Motivation

As Airbnb grows in New York, an increasing number of hosts have an incentive to list their property on Airbnb for short-term rental instead of leasing it out. In this context, two kinds of host have emerged. Commercial Airbnb operators have multiple entire-home listings or large portfolios of private rooms and run Airbnb as a business. Illegal hosts may not operate many listings or treat Airbnb as a business; they are more interested in profiting from their own property, and their strategy is to look for a quick return. Given this situation, we explore the relationship among hosts, Airbnb in NYC and New York’s short-term rental laws.

Data Preparation

Our dataset is from Inside Airbnb. It consists of three CSV files containing summary information and metrics for listings in New York City from 2015 to 2017. Using JMP and R, we removed the variables we did not need and flagged which listings are illegal and which hosts are commercial operators.
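
The flagging rules described in the Background and Motivation sections can be sketched as follows. This is a minimal illustration in Python rather than the JMP and R workflow the team used, and it is not their exact method. The field names (`room_type`, `minimum_nights`, `host_id`) are assumed to match the Inside Airbnb summary files, the 30-night threshold comes from the Background section, and the private-room portfolio cut-off of 3 is a hypothetical parameter. The data carries no building size, so the three-unit condition of the law cannot be checked here.

```python
from collections import defaultdict

SHORT_TERM_NIGHTS = 30       # legal threshold stated in the Background section
PRIVATE_ROOM_PORTFOLIO = 3   # hypothetical cut-off for a "large portfolio"


def is_illegal(listing):
    """Entire home/apartment that can be booked for fewer than 30 nights."""
    return (listing["room_type"] == "Entire home/apt"
            and listing["minimum_nights"] < SHORT_TERM_NIGHTS)


def commercial_hosts(listings):
    """Return host ids with multiple entire homes or many private rooms."""
    entire = defaultdict(int)
    private = defaultdict(int)
    for row in listings:
        if row["room_type"] == "Entire home/apt":
            entire[row["host_id"]] += 1
        elif row["room_type"] == "Private room":
            private[row["host_id"]] += 1
    hosts = {h for h, n in entire.items() if n >= 2}
    hosts |= {h for h, n in private.items() if n >= PRIVATE_ROOM_PORTFOLIO}
    return hosts


# Made-up rows for illustration only; not taken from the project data.
listings = [
    {"id": 1, "host_id": "A", "room_type": "Entire home/apt", "minimum_nights": 2},
    {"id": 2, "host_id": "A", "room_type": "Entire home/apt", "minimum_nights": 30},
    {"id": 3, "host_id": "B", "room_type": "Private room", "minimum_nights": 1},
    {"id": 4, "host_id": "C", "room_type": "Entire home/apt", "minimum_nights": 3},
]
illegal_ids = [row["id"] for row in listings if is_illegal(row)]
commercial = commercial_hosts(listings)
```

In this toy input, host A is counted as commercial because of two entire-home listings, even though only one of them falls under the 30-night threshold.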

TECHNICAL APPROACH

To visualize the overview of Airbnb in NYC and our statistical analysis, we applied several useful R packages:

[Figures: Tech01.JPG, Tech02.JPG]

OVERVIEW OF AIRBNB IN NEW YORK CITY

  • 1. Geo Spatial Visualization, Inside Airbnb

Inside Airbnb is an independent, non-commercial set of tools and data that allows users to explore how Airbnb is being used in cities around the world. Because we want to examine how Airbnb operates in New York City, we selected NYC in Inside Airbnb and took Queens as an example. Below is the Inside Airbnb application:

From Inside Airbnb, users can easily check the number of listings, room type, activity, availability and listings per host. The listings-per-host section makes an interesting observation:

…Hosts with multiple listings are more likely to be running a business without a license and not paying taxes, and if they are renting out an entire home or apartment and aren't present, are probably doing so illegally.

In addition, in this part users can check the number and proportion of multi-listings in NYC, as well as the top hosts who have more than one listing. For example, in Queens the top 3 hosts have 18, 15 and 15 listings respectively.

However, Inside Airbnb only mentions the illegality issue in NYC; it does not provide detailed information or analysis on the topic. To learn more about illegal listings, we turned to an academic article, The High Cost of Short-Term Rentals in New York City. The article introduces Airbnb and its critics in New York and argues that:

…a large amount of activity on short-term rental platforms is not “home sharing” as the term is normally understood (occasional short-term rentals of a family’s primary residence or a room within the primary residence), but rather a new form of de facto hotel.

The article also discusses illegal listings and introduces a new concept, the commercial operator: a host who controls multiple entire-home/apartment listings or a large portfolio of private rooms. Commercial operators are only 12% of hosts, but they earn more than 28% of revenue in New York City.
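
As a rough illustration of how a pair of figures such as "12% of hosts, 28% of revenue" can be computed, the Python sketch below takes per-host revenue and a set of commercial host ids and returns both shares. The function name and the sample numbers are hypothetical; they are not taken from the article or from our data.

```python
def commercial_shares(host_revenue, commercial):
    """Return (share of hosts, share of revenue) held by commercial operators.

    host_revenue maps host id -> revenue; commercial is a set of host ids.
    """
    total_hosts = len(host_revenue)
    total_revenue = sum(host_revenue.values())
    commercial_present = [h for h in host_revenue if h in commercial]
    commercial_revenue = sum(host_revenue[h] for h in commercial_present)
    return (len(commercial_present) / total_hosts,
            commercial_revenue / total_revenue)


# Made-up numbers: one commercial host out of four earns 60 of 100.
host_revenue = {"A": 60, "B": 20, "C": 10, "D": 10}
host_share, revenue_share = commercial_shares(host_revenue, {"A"})
```

A revenue share well above the host share is the pattern the article describes: few hosts, disproportionate earnings.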

According to this article, the map below shows the growth in revenue-earning Airbnb listings in 2017, indicating where listing density is highest and where revenue is growing fastest:

Based on this information, we know which boroughs to focus on and have an overall impression of Airbnb in NYC. We were next drawn to the part that analyses commercial operators. The left figure shows the distribution of entire-home listings for the four commercial operators of entire-home listings in NYC.

However, the article says little about commercial operators, and this figure shows only four commercial operators’ listings and relations. To analyse further and help users learn more about illegal operation and commercial operators of Airbnb in NYC, our research will establish the current and past state of illegal listings and commercial operators, find out whether New York’s short-term rental law has had any effect on Airbnb in New York, summarize the relationship between illegal listings and commercial operators, and finally visualize all our results in an interactive R Shiny application.

Conclusion and Insight

In conclusion, the insights discovered are:

  • Airbnb is flourishing in New York City and is especially popular in Manhattan and Brooklyn, with high prices and tight booking schedules.
  • Manhattan contributes over half of total revenue, more than the aggregated revenue of the remaining boroughs.
  • Driven by revenue, both commercial operators and illegal hosts prefer to operate their Airbnb properties in Brooklyn and Queens, where they can better capture market needs.
  • With New York’s short-term rental laws in practice, illegal listings have been declining. However, some arbitrageurs are still operating illegally in popular districts such as Queens and Brooklyn.



Discussion

On 12 August 2018, we had the opportunity to showcase our application to our classmates and some invited guests at a presentation event, and we received many valuable comments on our work. Our audience gave us many suggestions for implementation. In particular, our professor and project advisor guided the team to modify the application by changing graph types and adding labels and other details. Some classmates suggested adding more variables to the revenue analysis to make the conclusions more convincing, and using more graphs in the poster instead of long paragraphs of text.



Future Work

Given time constraints, the current application still has room to improve. For our Airbnb overview analysis, we could take population, housing price and housing stock in the different boroughs into consideration, since these variables are important indicators when analysing overall Airbnb listings. Otherwise, our conclusions risk being too absolute. For our revenue algorithm, we could improve accuracy by including the probability, the overall score and the number of reviews as factors in the calculation.

On the technical side, the application could be enhanced with more multiple-view interactive visualizations that are easier for users to understand than the current plot and line chart representation. For the illegal investigation part, the listing tree should be replaced with a clear bar chart that lets users select the top 5 or 10 areas by listings.
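
The report does not spell out the revenue algorithm, so the Python sketch below is only one possible way the number of reviews could enter a revenue estimate, not our implementation. It assumes that a fixed fraction of stays leaves a review, that each stay lasts an average number of nights, and that occupancy is capped. The function name and all default parameter values are hypothetical.

```python
def estimated_annual_revenue(price, reviews_per_month, minimum_nights,
                             review_rate=0.5, avg_stay=3.0, max_occupancy=0.7):
    """Estimate yearly revenue from the nightly price and review activity.

    review_rate   -- assumed fraction of stays that leave a review
    avg_stay      -- assumed average nights per stay
    max_occupancy -- assumed cap on the fraction of the year that is booked
    """
    # Scale reviews up to bookings: not every guest writes a review.
    bookings_per_year = reviews_per_month * 12 / review_rate
    # A stay cannot be shorter than the listing's minimum nights.
    nights_per_booking = max(avg_stay, minimum_nights)
    booked_nights = min(bookings_per_year * nights_per_booking,
                        365 * max_occupancy)
    return price * booked_nights


moderate = estimated_annual_revenue(price=100, reviews_per_month=1, minimum_nights=2)
capped = estimated_annual_revenue(price=100, reviews_per_month=5, minimum_nights=2)
```

With these defaults, the first listing is estimated at 72 booked nights, while the second hits the occupancy cap. Other factors named above, such as the overall score, could be added as further multipliers.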



Installation Guide

At the time of the project, RStudio version 1.1.383 was used to create the application.

1) Install RStudio version 1.1.383 from: https://www.rstudio.com/
2) Open the installed RStudio application and, under the top menu, select Tools > Install Packages... Type the following package names in the "Packages" field and click 'Install': shiny, DT, tidyverse, dplyr, ggplot2, RJSONIO and GEOJSON, shinythemes and plotly.
3) After the packages have finished installing, open the project files and click on "Run App".

User Guide

This section details the steps for using the application and viewing the visualization results. Below is the main page of our application, which has three separate interfaces:

  • [1] On the first page, users can explore overall Airbnb activity in New York. In the right-hand side panel, users can select a year range from 2015 to 2017 to see Airbnb housing prices in different periods, and the ‘Borough’ filter selects among the five main boroughs in New York to compare areas. Because the rent term is an important factor influencing house price, the Rent term button shows the difference between short-term and long-term rent in each area. The bar chart ‘Room Type Distribution’ updates based on the selections above.
  • [2] The second interface lets users explore the revenue of Airbnb activities. The left side panel filters by year and rent term, since rent term is an important factor influencing housing price. In the tree map ‘Revenue vs Listing’, clicking on a borough shows the more specific areas within it in terms of revenue and listings. ‘Price distribution’ shows the price differences across boroughs. The two bar charts show the top 5 areas by revenue and by listings respectively, based on the user’s selection of borough.
  • [3] The third interface explores the illegal investigation. Users can filter the sub-graphs using the drop-down boxes and checkboxes at the top. Based on the selection, it shows the distributions of the commercial revenue percentile and the illegal revenue percentile in New York City, so users can compare the illegal investigation results across boroughs.
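
The filtering behind the first two interfaces can be approximated outside Shiny. The Python sketch below is an illustration under stated assumptions, not the application's code: the field names (`neighbourhood_group`, `neighbourhood`, `room_type`, `minimum_nights`) are assumed to match the Inside Airbnb summary files, the 30-night boundary between short-term and long-term rent is an assumption, and the helper names are hypothetical.

```python
from collections import Counter


def filter_listings(listings, borough=None, short_term=None, threshold=30):
    """Keep rows matching the chosen borough and rent term (None = no filter)."""
    kept = []
    for row in listings:
        if borough is not None and row["neighbourhood_group"] != borough:
            continue
        is_short = row["minimum_nights"] < threshold
        if short_term is not None and is_short != short_term:
            continue
        kept.append(row)
    return kept


def room_type_distribution(listings):
    """Count listings per room type, as in the 'Room Type Distribution' chart."""
    return Counter(row["room_type"] for row in listings)


def top_areas(listings, n=5):
    """Return the n neighbourhoods with the most listings."""
    return Counter(row["neighbourhood"] for row in listings).most_common(n)


# Made-up rows for illustration only.
sample = [
    {"neighbourhood_group": "Queens", "neighbourhood": "Astoria",
     "room_type": "Private room", "minimum_nights": 2},
    {"neighbourhood_group": "Queens", "neighbourhood": "Astoria",
     "room_type": "Entire home/apt", "minimum_nights": 1},
    {"neighbourhood_group": "Queens", "neighbourhood": "Flushing",
     "room_type": "Private room", "minimum_nights": 30},
    {"neighbourhood_group": "Manhattan", "neighbourhood": "Harlem",
     "room_type": "Entire home/apt", "minimum_nights": 3},
]
queens_short = filter_listings(sample, borough="Queens", short_term=True)
distribution = room_type_distribution(queens_short)
busiest = top_areas(filter_listings(sample, borough="Queens"), n=1)
```

Each user selection in the side panel corresponds to one call to `filter_listings`; the charts are then recomputed from the filtered rows.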


Acknowledgements

The authors wish to thank Ting Seong KAM, Professor of Visual Analytics at the School of Information Systems, Singapore Management University, for his ongoing support.
