In this post we will use the ogr2ogr command-line tool from GDAL to convert a shapefile of NYC zip code boundary data to GeoJSON format and reproject its coordinates to latitude and longitude, all in one line of code.
A Voronoi diagram divides up a space into regions of influence based on a set of points. In this post we will generate a Voronoi diagram from a map of NYC subway station locations, which can be used to find the closest subway station to any location.
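To make the "closest station" idea concrete: a location falls inside a station's Voronoi region exactly when that station is nearer than every other station. The sketch below is not the method from the post. It is a brute-force nearest-point lookup that gives the same answer as locating the region, and the station names and flat x/y coordinates are made up for illustration.

```python
import math

# Hypothetical station coordinates on a flat x/y grid (not real subway data).
stations = {
    "station_a": (0.0, 0.0),
    "station_b": (4.0, 0.0),
    "station_c": (2.0, 3.0),
}


def nearest_station(point, stations):
    """Return the name of the station closest to `point`.

    By definition this is the station whose Voronoi region contains `point`.
    """
    return min(stations, key=lambda name: math.dist(point, stations[name]))


print(nearest_station((1.0, 0.5), stations))
```

A precomputed diagram pays off when there are many stations or many lookups. For a handful of points, the linear scan above is enough.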
This is an overview of the COVID-19 Open Research Dataset (CORD-19), which is a corpus of research papers related to the coronavirus pandemic, and the Kaggle challenge to develop tools to process them using natural language processing techniques.
Feature engineering is an important skill in data science: it is the process of turning raw data into features that can be used as inputs for training machine learning algorithms. In this post we will look at 311 noise complaints data.
Cleaning up dirty, corrupted data with Python and Pandas. Dirty, corrupted data leads to flawed analysis and conclusions. Who wants that? In this post we will go through a cleaning checklist with Pandas and a dataset from NYC Open Data.
Determining if a point lies in a polygon is a pretty common task in computational geometry. In this post we will use it to answer questions like 'which NYC neighborhood is this apartment building in?' using GeoDjango and data from NYC Open Data.
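For a feel of what a point-in-polygon test does under the hood, here is a minimal ray-casting sketch in plain Python. It is not the GeoDjango API the post uses. The function name and the square "neighborhood" are hypothetical, and the code assumes a simple polygon given as a list of (x, y) vertices, with no special handling for points exactly on the boundary.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: count how many edges a ray going right from (x, y) crosses.

    An odd number of crossings means the point is inside the polygon.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical square "neighborhood" with made-up coordinates.
square = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(point_in_polygon(2, 2, square))
print(point_in_polygon(5, 2, square))
```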