
Filming NYC

Time-Based Analysis and Location Centroid Analytics


Tools

Team Size: 1

Time Frame: 6 weeks

Tools: Carto, Kepler.gl, Excel

My Role: Data Analytics Data Visualization Engineer

Friends 25th Anniversary Pop-Up in SoHo


Friends 25th Anniversary Pop-Up in SoHo | Real scene from Friends


Why This Topic

In September I went to the “Friends” 25th anniversary pop-up, which was held in a first-floor space in SoHo. I knew it was only a commemorative exhibit, not a real filming location, but visitors could still feel the energy of the show and enjoy it.

Later that month, a friend told me that the “Friends” apartment building is just half a mile away, in Greenwich Village. It surprised me again how many free places of interest there are to visit in New York before visitors ever need to take out their wallets.

Data Analytics

The Process of Time-Based Analysis and Data Size


I started exploring the data by opening the XML dataset in Excel. I changed the date column from string to date type so that Carto and other platforms would recognize it. After some data cleaning, the dataset contains 233 locations, 142 directors, and 178 films. These counts alone show that each director made more than one film on average (178 films / 142 directors ≈ 1.25). I was eager to find out which directors filmed the most in New York City from 1945 to 2006.
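The date conversion and the counts above were done in Excel. As a rough sketch of the same steps in Python, assuming hypothetical rows, column names, and a date format that the dataset may not actually use:

```python
from datetime import datetime

# Hypothetical rows standing in for the cleaned dataset; the real column
# names and date format are not given here.
rows = [
    {"film": "Film A", "director": "Director X", "location": "Loc 1", "date": "1968-05-01"},
    {"film": "Film B", "director": "Director X", "location": "Loc 2", "date": "1977-04-20"},
    {"film": "Film C", "director": "Director Y", "location": "Loc 3", "date": "1989-06-30"},
]

def parse_date(text, fmt="%Y-%m-%d"):
    """Convert a date string to a date object (the format is an assumption)."""
    return datetime.strptime(text, fmt).date()

# Replace each string date with a real date so it can be sorted and grouped.
for row in rows:
    row["date"] = parse_date(row["date"])

# Count distinct values, as in the summary of locations, directors, and films.
n_locations = len({r["location"] for r in rows})
n_directors = len({r["director"] for r in rows})
n_films = len({r["film"] for r in rows})
films_per_director = n_films / n_directors

# The same ratio using the counts reported in the text.
reported_ratio = 178 / 142  # about 1.25 films per director
```

With the real data, the three counts should come out as 233, 142, and 178.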

Number of Filming Locations By Directors


Number of Filming Locations By Film


With a bar chart created in Excel, it is easy to find the “champions”. Two directors filmed at the most locations: Spike Lee and Woody Allen, each with 11 locations in New York City.

Similarly, it is interesting to see which movie was filmed at the most locations in NYC. The data shows that “Godspell” is the champion, with 8 NYC filming locations in a single film. “Annie Hall” is in second place with 5 filming locations.
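The two rankings were read off Excel bar charts. A minimal sketch of the same counting in Python, using made-up records and hypothetical helper names (count_locations, top_ties), could look like this:

```python
from collections import Counter

# Hypothetical (director, film, location) records; not the real dataset.
records = [
    ("Director X", "Film A", "Loc 1"),
    ("Director X", "Film A", "Loc 2"),
    ("Director Y", "Film B", "Loc 3"),
    ("Director Y", "Film C", "Loc 4"),
    ("Director Z", "Film D", "Loc 5"),
]

def count_locations(records, key_index):
    """Count distinct locations per director (key_index=0) or film (key_index=1)."""
    groups = {}
    for rec in records:
        groups.setdefault(rec[key_index], set()).add(rec[2])
    return Counter({key: len(locs) for key, locs in groups.items()})

def top_ties(counter):
    """Return every key that shares the highest count, plus that count."""
    best = max(counter.values())
    return sorted(k for k, v in counter.items() if v == best), best

by_director = count_locations(records, 0)
by_film = count_locations(records, 1)
top_directors = top_ties(by_director)
top_films = top_ties(by_film)
```

Returning all keys that tie for first place matters here, because two directors share the top spot.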

How the Number of Filming Locations in NYC Changed From 1945 to 2006


There is a noticeable jump from 1966 to 1968: the number of filming locations rose from 2 to 12 in two years. This reminded me of the introduction to the book “Scene from the cities”, which notes that the Mayor’s Office of Film, Theatre and Broadcasting was formed in 1966. This could well be one reason for the jump.
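The jump was spotted by eye on the chart. One way to find it programmatically is sketched below; only the 1966 and 1968 values come from the text, the other years are hypothetical placeholders, and largest_jump is an illustrative helper name:

```python
# Locations per year. Only the 1966 and 1968 values come from the text;
# the other years are hypothetical placeholders.
locations_per_year = {1964: 1, 1966: 2, 1968: 12, 1970: 9}

def largest_jump(counts):
    """Return (from_year, to_year, increase) for the biggest rise between
    consecutive years that appear in the data."""
    years = sorted(counts)
    best = None
    for prev, curr in zip(years, years[1:]):
        increase = counts[curr] - counts[prev]
        if best is None or increase > best[2]:
            best = (prev, curr, increase)
    return best

jump = largest_jump(locations_per_year)
```

The function compares only years present in the data, so a gap such as 1967 is skipped rather than treated as zero.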

Pattern of Clusters and Centroid Shifts of Filming Locations From the 1940s to the 2000s

After some research, I decided to switch to CARTO in order to use its analysis tools. I first calculated the centroid of the filming locations for each decade separately, then used another analysis tool to connect the centroids in chronological order. The result shows that the centroids shifted over the decades, but not by much. Because New York City covers a large area, averaging the locations hides the differences between the small clusters.

Now that I know where the locations are, in the following part I find out what types of places they are by joining information from another dataset: MapPLUTO.
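The centroids and connecting lines were produced with CARTO's analysis tools. The sketch below only illustrates the idea, assuming a centroid is the plain average of longitude and latitude (CARTO may compute it differently) and using made-up points:

```python
from collections import defaultdict

# Hypothetical (year, longitude, latitude) points; not the real locations.
points = [
    (1945, -74.0, 40.5),
    (1948, -73.5, 41.0),
    (1953, -74.0, 40.75),
]

def decade_of(year):
    """Map a year to the start of its decade, e.g. 1968 -> 1960."""
    return year // 10 * 10

def centroids_by_decade(points):
    """Average longitude and latitude of the points in each decade."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for year, lon, lat in points:
        s = sums[decade_of(year)]
        s[0] += lon
        s[1] += lat
        s[2] += 1
    return {d: (sx / n, sy / n) for d, (sx, sy, n) in sorted(sums.items())}

def connect_in_sequence(centroids):
    """Pair each decade's centroid with the next one, in chronological order."""
    ordered = [centroids[d] for d in sorted(centroids)]
    return list(zip(ordered, ordered[1:]))

centroids = centroids_by_decade(points)
segments = connect_in_sequence(centroids)
```

Because each centroid is an average over the whole city, one far-away cluster can pull it only slightly, which is why the shifts between decades look small.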

A collaborative workflow with Kepler.gl, QGIS, and Excel

2 methods

Use Distance Matrix instead of Join Layer By Locations

One great thing about QGIS is that, because it is open-source software, there is often more than one way to solve a problem, which I was surprised to discover. I also learned a valuable lesson: asking for help early could have saved me a large amount of time.

 

Methods Before

At the beginning, I used the “Join Layer By Locations” method. However, after trying the join a couple of times, I found that as many as 100 locations could not be matched to a polygon in the MapPLUTO dataset. I spent a large amount of time trying to solve this “problem”. It turned out not to be a problem at all: MapPLUTO purposefully excludes parks because the dataset was originally developed for tax lot purposes.

When I first worked with the MapPLUTO dataset, I was so overwhelmed by its large number of rows and columns that I did not learn this sooner.

Methods After

Professor Doyle reminded me that I could approach the problem differently. If I calculate the distance from each point to the polygons in MapPLUTO, there will always be a “closest polygon”, which can be used to represent the land use type of each filming location.
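The actual work was done with the QGIS distance matrix. The following is only a sketch of the underlying idea, assuming planar coordinates, simple polygons without holes, and two made-up lots; it is not how QGIS implements it:

```python
import math

def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def point_in_polygon(p, ring):
    """Ray-casting test for a simple polygon given as a list of vertices."""
    x, y = p
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def distance_to_polygon(p, ring):
    """Zero if p is inside the polygon, else the distance to its nearest edge."""
    if point_in_polygon(p, ring):
        return 0.0
    edges = zip(ring, ring[1:] + ring[:1])
    return min(point_segment_distance(p, a, b) for a, b in edges)

def nearest_lot(p, lots):
    """Pick the lot whose polygon is closest to p; there is always one."""
    return min(lots, key=lambda lot: distance_to_polygon(p, lot["ring"]))

# Two hypothetical tax lots with land use labels.
lots = [
    {"land_use": "Commercial & Office Building",
     "ring": [(0, 0), (1, 0), (1, 1), (0, 1)]},
    {"land_use": "Multi-Family Residential Building",
     "ring": [(3, 0), (4, 0), (4, 1), (3, 1)]},
]

# A point that falls in no lot (for example, in a park) still gets a match.
park_point_use = nearest_lot((1.5, 0.5), lots)["land_use"]
inside_point_use = nearest_lot((3.5, 0.5), lots)["land_use"]
```

Unlike a spatial join, this approach never leaves a point unmatched, at the cost of assigning some points to a lot they are only near, not in.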

 

My work process involved QGIS, Excel, and Kepler.gl. I started by mapping in Kepler.gl, which shows that the largest circles are mainly blue or green, representing “Commercial & Office Building” and “Multi-Family Residential Building”.

Filming Streets


Key Takeaways

From the analysis above, I was able to rank the directors and find that Spike Lee and Woody Allen have the most filming locations in NYC, with 11 each. In addition, “Godspell” holds the record in this dataset, with 8 filming locations in NYC. Along the timeline, it is easy to spot the first peak of 12 filming locations in 1968, which may be linked to the founding of the Mayor's Office of Film, Theatre and Broadcasting in 1966. In terms of land use, I found that more filming locations are at residential buildings than at commercial and office buildings. The two main clusters of Commercial & Office locations are in the Financial District and Times Square; the two main clusters of Residential Building locations are at the southern end of Central Park and where Greenwich Village meets the East Village.

Created by De Han
