leaving New York City (a case study)
i thought it would be fun to examine articles written by folks who proclaim their exit from New York City in a blog post. we’ve seen these sorts of posts get published across Medium (and the like) for years now. actually, the oldest one in the sample is from 2010.
to give you a bit more backstory, i used an approach very similar to Ryan Kulp’s (but swapped out SF for NYC) and brought in a contractor from Upwork to help with the scraping portion.
i needed to compile a list of articles to work with. i used Google search operators to get my initial list.
site:medium.com "leaving new york city"
site:medium.com "leaving nyc"
site:medium.com "goodbye nyc"
+ a few other variations of this…
the MozBar Chrome Extension was super helpful for downloading the results from Google for each search.
i ended up with ~ 275 articles after merging all results into 1 Google Sheet. i then manually scrubbed those to remove any links to article replies, profiles and other irrelevant pages.
result: 188 articles
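the merge and scrub above was done by hand in a Google Sheet, but for anyone repeating it, here is a minimal sketch of how part of it could be scripted. everything in it is an assumption for illustration: the function names are made up, and the "profile" check relies on the guess that a Medium profile URL is just `/@username` with nothing after it. replies can't be spotted reliably from the URL alone, so those would still need a manual pass.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url: str) -> str:
    """Drop the query string and fragment, lowercase the host, trim a trailing slash."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, "", ""))


def looks_like_profile(url: str) -> bool:
    """Assumed heuristic: a profile URL has a single path segment starting with '@'."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    return len(segments) == 1 and segments[0].startswith("@")


def merge_results(*result_lists):
    """Merge several lists of search-result URLs, keeping first-seen order."""
    seen = set()
    merged = []
    for results in result_lists:
        for url in results:
            clean = normalize(url)
            if clean in seen or looks_like_profile(clean):
                continue  # skip duplicates and bare profile pages
            seen.add(clean)
            merged.append(clean)
    return merged
```

each argument to `merge_results` would be the list of URLs exported for one search query; the same article showing up under two queries is kept once.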
leaving New York City metadata
here’s where things got a bit more difficult for my technical know-how (or lack thereof). no big deal, Upwork to the rescue.
i found an individual who is well-versed in scraping. (thanks for the help, Shourya!) i asked Shourya to get a handful of attributes from each of the 188 posts:
- published date
- word count
we had access to everything aside from what the authors ate for breakfast, which we can infer was likely a vegan dish anyway.
i asked Shourya for a write-up of his process so i could include it in this article. see here:
Scripting language used for the task – “Python”
Modules used –
2) Beautiful Soup
1) Opened the given file through CSV module.
2) Read the url from the file
3) Requests module was used to send a GET request to the url and to fetch the response from the GET request.
4) Response was parsed with the help of Beautiful Soup.
5) Located the elements to be extracted using chrome dev tools.
6) Dumped the text from the articles into the text file.
7) Wrote ‘URL’, ‘Words’, ‘Platform’, ‘Query’, ‘Title’, ‘Description’, ‘Language’, ‘Published Date’ to the csv file through CSV module.
Repeated the above process until all the urls were parsed from the given file.
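Shourya's actual script isn't included here, so what follows is only a rough sketch of the same loop (read URLs from a CSV, fetch each page, parse it, dump the text, write a CSV row). to keep it dependency-free it swaps in Python's standard library (`urllib` and `html.parser`) for the Requests and Beautiful Soup modules he used. the input column name `URL`, the `<article>` tag as the container for the post body, and the `description` / `article:published_time` meta tags are all assumptions, and only a subset of his output columns is written.

```python
import csv
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class ArticleParser(HTMLParser):
    """Collects <title>, <meta> values, and text found inside <article>."""
    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self.text_parts = []
        self._in_title = False
        self._article_depth = 0
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "article":
            self._article_depth += 1
        elif tag in self.SKIP:
            self._skip_depth += 1
        elif tag == "meta":
            key = attrs.get("property") or attrs.get("name")
            if key and attrs.get("content") is not None:
                self.meta[key] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "article" and self._article_depth:
            self._article_depth -= 1
        elif tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._article_depth and not self._skip_depth:
            self.text_parts.append(data)


def extract(html: str) -> dict:
    """Pull title, description, published date, word count, and text from one page."""
    parser = ArticleParser()
    parser.feed(html)
    text = " ".join(" ".join(parser.text_parts).split())
    return {
        "Title": parser.title.strip(),
        "Description": parser.meta.get("description", ""),
        "Published Date": parser.meta.get("article:published_time", ""),
        "Words": len(text.split()),
        "Text": text,
    }


def fetch(url: str) -> str:
    """Send a GET request and return the decoded response body."""
    req = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(req, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


def scrape(in_csv: str, out_csv: str, text_path: str) -> None:
    """Repeat fetch + extract for every URL in in_csv."""
    fields = ["URL", "Words", "Title", "Description", "Published Date"]
    with open(in_csv, newline="", encoding="utf-8") as src, \
         open(out_csv, "w", newline="", encoding="utf-8") as dst, \
         open(text_path, "w", encoding="utf-8") as dump:
        writer = csv.DictWriter(dst, fieldnames=fields)
        writer.writeheader()
        for row in csv.DictReader(src):
            info = extract(fetch(row["URL"]))
            dump.write(info.pop("Text") + "\n\n")  # article text goes to the text file
            writer.writerow({"URL": row["URL"], **info})
```

calling `scrape("articles.csv", "metadata.csv", "all_text.txt")` (hypothetical file names) would produce one metadata row per article plus a single text dump. real pages may not expose these tags, which is why the write-up mentions locating elements with chrome dev tools first.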
the fruits of Shourya’s labor
a few extremely helpful documents:
- (1) master article Google Sheet with important attributes
- (1) large Google Doc with every single word from every article (254k words)
- (1) Google Sheet with word frequency
because i’m treating this as a pseudo-statistical study, we must not forget to make predictions (step 4 of the scientific method).
- people leave New York City because it is expensive.
- people leave New York City because they want to escape the hustle and bustle.
- people leave New York City to move to Brooklyn/Queens/Long Island.
a collection of stats about leaving New York City
this one doesn’t account for the increase in popularity of blogging on Medium. as noted in the chart above, 2020 is a projection based on the percent change year over year since 2013. i didn’t include the % change from 2010 to 2013 because i didn’t have any data for 2011 and 2012 and the % change from 2010 to 2013 is 800%. it would have skewed the data.
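the exact formula behind the 2020 projection isn't spelled out, so here is one plausible reading as a minimal sketch: average the year-over-year percent changes and apply that average to the latest year. the function names and the counts in the usage note are made up for illustration and are not the article's data.

```python
def pct_change(old, new):
    """Percent change from old to new."""
    return (new - old) / old * 100


def project_next(counts):
    """Project the next year's count from consecutive yearly counts.

    Assumption: the projection applies the mean year-over-year percent
    change to the most recent year. Returns (projected_count, mean_change).
    """
    changes = [pct_change(a, b) for a, b in zip(counts, counts[1:])]
    mean_change = sum(changes) / len(changes)
    return counts[-1] * (1 + mean_change / 100), mean_change
```

with illustrative counts of `[10, 15, 18]`, the yearly changes are 50% and 20%, the mean is 35%, and the projection is about 24.3 articles. it also shows why an 800% jump is a problem: `pct_change(1, 9)` is 800, and averaging a value like that in with changes of 20% to 50% would drag the mean far above anything the later years support.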
for those interested in seeing the data, boom:
believe it or not, that took more thinking than i thought it would…
this is based on keywords/queries that i used to find articles. some of these articles likely appear for more than 1 keyword, but as i mentioned, this is simply based on the SERPs from my queries.
based on the above, we can conclude that:
- more and more people are leaving New York City
- some of them are blogging about it
it would be more exciting to know why these people are leaving. let’s take a look…
Shourya combined every word from every article into 1 Google Doc then created a word frequency map.
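how Shourya built the frequency map isn't documented, but a word count like this is a few lines with Python's standard library. the sketch below is an assumption about the approach, not his code; `word_frequency` and `phrase_count` are hypothetical names.

```python
import re
from collections import Counter


def word_frequency(text):
    """Count lowercase words; apostrophes inside a word (i'm, don't) are kept."""
    normalized = text.lower().replace("\u2019", "'")  # curly to straight apostrophe
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", normalized)
    return Counter(words)


def phrase_count(text, phrase):
    """Count whole-phrase matches, for multi-word places like 'long island'."""
    pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
    return len(re.findall(pattern, text.lower()))
```

`word_frequency(all_text).most_common(50)` would give the top 50 words. a single-word counter splits "Long Island" into "long" and "island", so multi-word places need something like `phrase_count` instead.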
i manually analyzed the data in a Google Sheet and grouped the words by theme:
- job titles
and here’s what i came up with. words and places on the left and number of mentions on the right.
i wasn’t sure what to call Long Island, so i categorized it as a state. i know it’s not a state. i also know Brooklyn and Queens aren’t cities, but for the sake of my categories, both will remain as is. shhh.
this was interesting for a few reasons…
first, you must remember the above data is only counting mentions. i don’t quite have the capacity to properly measure where these folks are moving to. i can only surmise based on the data i was able to gather.
lots of Brooklyn talk!
wasn’t sure what to expect here. not much to work with. from the looks of it, people didn’t often mention their new job title, even when the move away from NYC was related to work.
this category was random and varied drastically. certainly a fair share of work/job and money mentions. i wasn’t surprised by the mentions of suburbs and backyards – there was a good bit of it as you can see. i don’t blame them; i’d also want to escape ~ 750 sq ft of livable space.
reasons people leave New York City
i didn’t read all of the articles collected, but i promise i did read some. many reasons for people leaving New York City boiled down to:
- got a new job somewhere else
oddly enough, the 2nd most mentioned state is California. based on the number of mentions, that makes me wonder why folks are leaving NYC for some place in California (likely Los Angeles). it’s also likely those folks fall into reasons #2 and #3.
if you scroll up and look at prediction #1, i’d say we nailed it. that was the obvious one. NYC is expensive.
as for escaping the hustle and bustle (prediction #2), i guess “backyards” speaks to that.
prediction #3 was that people leave NYC to move to Brooklyn/Queens/Long Island. the data definitely points to that. the word frequency map ranks them: Brooklyn, Queens, Long Island – in that order – highest mentions first. though, based on the articles i personally read, the jobs in new cities were more important than the cities themselves.
(this piece was largely inspired by Ryan Kulp’s article on leaving San Francisco)
an obligatory disclaimer to point the mob to (if they come). this is a sort of pseudo-statistical study – if you are trying to find something that better suits your needs, write it.