
NYC Taxi

New York City Taxi Trip Patterns in a Day

ezgif.com-gif-maker.gif

Resources

Team Size: 1

Time Frame: 12 weeks

Tools: QGIS, Kepler.gl, NYC

My Role: Data Analytics, Data Visualization

Taxi Service

Taxi service picture from WIX

Why This Topic

I am interested in this subject mainly because of my experience using Uber while working in Beijing, China. When Uber had just entered the Chinese market around 2016, a price war with local competitors brought coupons, discounts, and numerous incentives, and I had a wonderful experience. Two years later, however, I had a hard time hailing an Uber after work at 10 pm, waiting in a queue of more than 130 people. Out of curiosity, I decided to map where those vehicles go, building on what I had learned about mapping geographic data with tools such as QGIS and Leaflet.

The research question of this project is what the patterns of NYC taxi trips look like, including features such as rush hours, busy areas, and large fares. The research involves three steps, each with its own tool:

  1. Using QGIS to do data analysis and static mapping

  2. Using Excel to extract hourly data and visualize traffic volume

  3. Using Kepler to do interactive data visualization

Data Analytics

Data Requested from Taxi & Limousine Commission

image8.png

After comparing datasets from Uber Movement, the Taxi & Limousine Commission, NYC Open Data, and Kaggle, I settled on a dataset provided on Kepler.gl. It includes both fare and coordinate variables, which makes the analysis clear and precise. The following variables are used in this research: tpep_pickup_datetime, tpep_dropoff_datetime, pickup_longitude, pickup_latitude, Dropoff_longitude, Dropoff_latitude, fare_amount.
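As a rough illustration of how these variables could be read into a script, here is a minimal sketch using only the Python standard library. The column names are copied from the list above (the capitalization in the real file may differ), while `read_trips`, the dictionary keys, and the two sample rows are hypothetical and made up for this example.

```python
import csv
import io

# Column names as listed in the text above; the actual file header may differ in case.
FIELDS = [
    "tpep_pickup_datetime", "tpep_dropoff_datetime",
    "pickup_longitude", "pickup_latitude",
    "Dropoff_longitude", "Dropoff_latitude", "fare_amount",
]

def read_trips(csv_file):
    """Parse trip rows into dicts, skipping rows with missing or non-numeric values."""
    trips = []
    for row in csv.DictReader(csv_file):
        try:
            trip = {
                "pickup_time": row["tpep_pickup_datetime"],
                "dropoff_time": row["tpep_dropoff_datetime"],
                "pickup": (float(row["pickup_longitude"]), float(row["pickup_latitude"])),
                "dropoff": (float(row["Dropoff_longitude"]), float(row["Dropoff_latitude"])),
                "fare": float(row["fare_amount"]),
            }
        except (KeyError, ValueError):
            continue  # skip malformed rows instead of failing
        trips.append(trip)
    return trips

# Hypothetical sample data: one valid row and one row with an unreadable fare.
sample = (
    ",".join(FIELDS) + "\n"
    "2015-01-15 19:05:39,2015-01-15 19:23:42,-73.99,40.75,-73.97,40.76,12.5\n"
    "2015-01-15 20:00:00,2015-01-15 20:10:00,-73.98,40.74,-73.96,40.77,n/a\n"
)
trips = read_trips(io.StringIO(sample))
```

With a real export, `read_trips(open(path, newline=""))` would replace the in-memory sample, where `path` is whatever file the dataset was saved to.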

Busy Taxi in NYC

image4.png

Functions involved:

  • Counting points in polygon

  • Styling based on criteria: counts of pickups in gradient 

  • Picking an SVG car as the marker
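The "counting points in polygon" step was done with QGIS, but the idea behind it can be sketched in a few lines of plain Python. The version below uses the standard ray-casting test and is only an illustration, not how QGIS implements it. The function names, the two rectangular zones, and the pickup points are hypothetical.

```python
from collections import Counter

def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that horizontal line
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

def count_points_in_polygons(points, zones):
    """Count how many points fall inside each named zone polygon."""
    counts = Counter({name: 0 for name in zones})
    for lon, lat in points:
        for name, polygon in zones.items():
            if point_in_polygon(lon, lat, polygon):
                counts[name] += 1
                break  # assume zones do not overlap
    return counts

# Hypothetical zones (simple rectangles) and pickup coordinates.
zones = {
    "zone_a": [(-74.00, 40.70), (-73.98, 40.70), (-73.98, 40.72), (-74.00, 40.72)],
    "zone_b": [(-73.98, 40.70), (-73.96, 40.70), (-73.96, 40.72), (-73.98, 40.72)],
}
pickups = [(-73.99, 40.71), (-73.995, 40.705), (-73.97, 40.71), (-73.90, 40.80)]
pickup_counts = count_points_in_polygons(pickups, zones)
```

The resulting per-zone counts are what the gradient styling is based on. The last sample point lies outside both zones and is not counted.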

Busy pickup zones

image1.png

Functions involved:

  • Counting points in polygon

  • Styling based on criteria: golden dollar

  • Labelling based on criteria: top 8 pickup places

Fat dollars

image3.png

Functions involved:

  • Counting points in polygon

  • Styling based on criteria: golden chicken drumstick

  • Labelling based on criteria: top 8 drop-off places

More Pickups Than Drop-offs

image1.png

Functions involved:

  • Joining columns to calculate pickups minus drop-offs

  • Counting points in polygon

  • Styling based on criteria: golden dollar (for real)

  • Labelling based on criteria: top 8 places with the most pickups (for real)

Fewer Pickups Than Drop-offs

image2.png

Functions involved:

  • Ranking pickups minus drop-offs in reverse order

  • Counting points in polygon

  • Styling based on criteria: destination flag

  • Labelling based on criteria: top 8 places with the most drop-offs (for real)
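The "for real" rankings in the last two maps come down to subtracting drop-off counts from pickup counts per zone and sorting the result in either direction. A minimal sketch of that calculation follows. The function names and the zone counts are hypothetical, and in the project itself this was done by joining columns in QGIS.

```python
def net_pickups(pickup_counts, dropoff_counts):
    """Pickups minus drop-offs for every zone seen in either table."""
    zones = set(pickup_counts) | set(dropoff_counts)
    return {z: pickup_counts.get(z, 0) - dropoff_counts.get(z, 0) for z in zones}

def top_n(net, n=8, reverse=True):
    """Top n zones by net count; reverse=False ranks from the other end."""
    return sorted(net.items(), key=lambda item: item[1], reverse=reverse)[:n]

# Hypothetical per-zone counts.
pickup_counts = {"A": 120, "B": 80, "C": 40}
dropoff_counts = {"A": 90, "B": 100, "D": 30}

net = net_pickups(pickup_counts, dropoff_counts)
most_net_pickups = top_n(net, n=2)                   # more pickups than drop-offs
most_net_dropoffs = top_n(net, n=2, reverse=False)   # fewer pickups than drop-offs
```

Zones that appear in only one table are treated as having a count of zero in the other. This is an assumption of the sketch rather than something the source states.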

Hourly Rides in a Day

image6.png

Functions involved:

  • Extracting the hour from the date and time in Excel

  • Plotting a bar chart with the count of rides for each hour

  • Labelling the number of rides, especially for the hours with the most pickups (for real) and the fewest pickups (for real)
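The hour extraction was done in Excel, but the same step can be sketched in Python for comparison. The timestamp format string is an assumption about how tpep_pickup_datetime is written, and the function name and sample timestamps are hypothetical.

```python
from collections import Counter
from datetime import datetime

def rides_per_hour(pickup_times, fmt="%Y-%m-%d %H:%M:%S"):
    """Return a list of 24 ride counts, one for each hour of the day."""
    hours = Counter(datetime.strptime(t, fmt).hour for t in pickup_times)
    return [hours.get(h, 0) for h in range(24)]

# Hypothetical pickup timestamps.
sample_times = [
    "2015-01-15 19:05:39",
    "2015-01-15 19:23:42",
    "2015-01-15 08:12:00",
    "2015-01-15 00:01:10",
]
hourly = rides_per_hour(sample_times)
busiest_hour = max(range(24), key=lambda h: hourly[h])
```

The `hourly` list maps directly onto the bars of the chart, and `busiest_hour` is the bar that would receive the "most pickups" label.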

Data Visualization

Dynamic Pickup Counts in a Day

image7.gif

Functions involved:

  • Plotting counts of pickups as columns

  • Styling column color based on counts, with the lightest color for the largest counts

  • Styling column height based on counts, with the tallest columns for the largest counts

  • Applying pickup time as a filter
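To make the column styling concrete, the sketch below filters trips by pickup hour, groups them into square grid cells, and scales column height by count. It illustrates the idea only and is not Kepler.gl's actual implementation. The cell size, the maximum height, the field names `hour`, `lon`, and `lat`, and the sample trips are all hypothetical.

```python
import math
from collections import Counter

def filter_by_hour(trips, start_hour, end_hour):
    """Keep trips whose pickup hour lies in [start_hour, end_hour)."""
    return [t for t in trips if start_hour <= t["hour"] < end_hour]

def grid_cell(lon, lat, cell_size=0.01):
    """Index of the square grid cell that contains the point."""
    return (math.floor(lon / cell_size), math.floor(lat / cell_size))

def column_heights(trips, max_height=1000.0, cell_size=0.01):
    """Height per cell, proportional to pickup count; the busiest cell gets max_height."""
    counts = Counter(grid_cell(t["lon"], t["lat"], cell_size) for t in trips)
    if not counts:
        return {}
    peak = max(counts.values())
    return {cell: max_height * c / peak for cell, c in counts.items()}

# Hypothetical trips: three in the morning window, one in the evening.
sample_trips = [
    {"hour": 8, "lon": -73.985, "lat": 40.755},
    {"hour": 8, "lon": -73.984, "lat": 40.756},
    {"hour": 9, "lon": -73.965, "lat": 40.775},
    {"hour": 20, "lon": -73.985, "lat": 40.755},
]
morning = filter_by_hour(sample_trips, 6, 10)
heights = column_heights(morning)
```

Color can be driven by the same normalized count, so the busiest cell is both the tallest and the lightest.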

Dynamic Drop-off fares in a Day

image9.gif

Function involved:

  • Plotting counts of pickups in column

  • Styling columns color based on counts, with the lightest for largest counts

  • Styling columns height based on counts, with the highest for largest counts

  • Applying pickup time to filters

Taxi On Road

Taxi on Roads picture from WIX 

Key Takeaways

This analysis is most useful from the perspective of taxi drivers, especially new ones. A driver who wants to get business quickly should go to the neighborhoods with the most pickups. A driver who wants longer trips will find a friend in the fat-dollar neighborhoods. A more experienced driver would do better in the places labeled with the most pickups (for real), that is, where pickups exceed drop-offs. A driver who dislikes running an empty taxi may want to avoid places with the fewest pickups but still many drop-offs.

 

As for Kepler, it is more of a visualization tool than a geodata analysis tool like QGIS. Kepler's strengths are a more intuitive interface and interactive labelling, but it lacks data analysis functions, especially joining datasets.

 

This project looks at trips by hour of the day; a similar analysis could be done by week of the month or by month of the year. Drivers would be more interested in pickups, while for passengers, drop-offs or available vehicles in the area could be more valuable. If Uber Movement data becomes available for New York City, it would also be interesting to see whether Uber's trip counts exceed those of NYC taxis in 2019.

Created by De Han
