How to embed a Google Font into an SVG

If you use a Google Font in an SVG visualization and then try to save it as a file, you might find that the font was not preserved in the saved file. To remedy that, we will look at how to embed a custom font into an SVG with base64 encoding.
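
As a rough sketch of the idea, the font file can be base64-encoded into a data URI inside an @font-face rule, and that rule placed in a style element in the SVG. The helper names font_face_css and embed_font, the default MIME type, and the simple search for the opening svg tag are all assumptions made for illustration, not the code from the post.

```python
import base64


def font_face_css(font_bytes: bytes, family: str, mime: str = "font/woff2") -> str:
    """Build an @font-face rule whose src is a base64 data URI (hypothetical helper)."""
    encoded = base64.b64encode(font_bytes).decode("ascii")
    return (
        "@font-face {"
        f" font-family: '{family}';"
        f" src: url(data:{mime};base64,{encoded});"
        " }"
    )


def embed_font(svg_text: str, css: str) -> str:
    """Insert a <style> element right after the opening <svg ...> tag.

    Assumes the opening tag has no '>' character inside an attribute value.
    """
    end_of_open_tag = svg_text.index(">", svg_text.index("<svg")) + 1
    style = f"<style>{css}</style>"
    return svg_text[:end_of_open_tag] + style + svg_text[end_of_open_tag:]
```

In practice font_bytes would come from reading the downloaded font file in binary mode, and family should match the font-family used by the text elements in the SVG.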

Using ogr2ogr to convert Shapefiles to GeoJSON
June 20, 2020

In this post we will use the ogr2ogr command line tool from GDAL to convert a shapefile of NYC zip code boundary data to GeoJSON format, as well as convert the projected coordinates to latitude and longitude, in one line of code.

Building a Multi-Foci Force Layout Bubble Chart in D3.js
June 12, 2020

You might be familiar with force layouts in D3.js to create things like bubble charts, network graphs and many other types of visualizations. In this post we will create a force layout bubble chart with multiple clusters along a timeline.

Building a custom Scikit-learn Transformer using GloVe vectors from Spacy as features
May 23, 2020

Word vectors are useful in NLP tasks to preserve the context or meaning of text data. In this post we will use Spacy to obtain word vectors, and transform the vectors into a feature matrix that can be used in a Scikit-learn pipeline.

Python Maze Generator Part II: Voronoi Diagrams
May 14, 2020

Voronoi diagrams are used in many fields, including art and design. This post is Part II in a series on mazes, where I will generate and solve random mazes from Voronoi diagrams using Python and Matplotlib.

Coreference resolution in Python with Spacy + NeuralCoref
May 6, 2020

Coreference resolution is a task in Natural Language Processing that aims to group together all references to an entity, for example, a person like Rihanna, in text. In this post we use NeuralCoref - a Spacy extension - to do this in Python.

Convex hulls in Python: the Graham scan algorithm
April 26, 2020

Computing the convex hull of a set of points is a fundamental problem in computational geometry, and the Graham scan is a common algorithm for it. In this post we will implement the algorithm in Python and look at interesting uses of convex hulls.
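
For a flavor of the algorithm, here is a minimal sketch of a Graham scan, assuming points are (x, y) tuples. The function names are hypothetical, and this is one common way to write it rather than the exact implementation from the post. It sorts with a cross-product comparison instead of computing angles, which keeps integer inputs exact.

```python
from functools import cmp_to_key


def cross(o, a, b):
    """Cross product of OA x OB; positive means a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def graham_scan(points):
    """Return the convex hull in counter-clockwise order, starting at the pivot."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return pts

    # Pivot: the lowest point, breaking ties by the smallest x.
    pivot = min(pts, key=lambda p: (p[1], p[0]))

    def by_angle(a, b):
        turn = cross(pivot, a, b)
        if turn != 0:
            return -1 if turn > 0 else 1
        # Same direction from the pivot: nearer point first.
        da = (a[0] - pivot[0]) ** 2 + (a[1] - pivot[1]) ** 2
        db = (b[0] - pivot[0]) ** 2 + (b[1] - pivot[1]) ** 2
        return -1 if da < db else (1 if da > db else 0)

    rest = sorted((p for p in pts if p != pivot), key=cmp_to_key(by_angle))

    hull = [pivot]
    for p in rest:
        # Drop points that would make a clockwise turn or lie on a straight line.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull
```

Points that sit on a hull edge are dropped, so only the corner points are returned.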

Finding the nearest NYC subway station with a Voronoi map
April 19, 2020

A Voronoi diagram divides up a space into regions of influence based on a set of points. In this post we will generate a Voronoi diagram from a map of NYC subway station locations, which can be used to find the closest subway station to any location.

Classification vs. Clustering in Machine Learning
April 12, 2020

Two broad categories in machine learning are supervised and unsupervised learning. Classification is an example of the former and clustering of the latter, and in this post I will go over the differences between them and when you might use each.

Determining how similar two images are with Python + Perceptual Hashing
March 27, 2020

Years ago I had an app idea where users could upload an image of a fashion item like shoes, and it would identify them. In this post I will go over how I approached the problem using perceptual hashing in Python with Pillow and the imagehash library.
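
To give a sense of how perceptual hashing compares images, here is a simplified average-hash sketch in plain Python. It assumes the image has already been shrunk to a small grayscale grid (a list of rows of brightness values), which is the step Pillow would normally handle. The function names are hypothetical and this is not the imagehash API.

```python
def average_hash(pixels):
    """Hash a small grayscale grid: one bit per pixel, set if brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Count the bits that differ between two hashes; smaller means more similar."""
    return bin(hash_a ^ hash_b).count("1")
```

Two images whose hashes differ in only a few bits are likely to look alike, while a large distance suggests they are different pictures.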

Text Normalization for Natural Language Processing in Python
March 22, 2020

Text Normalization is an important part of preprocessing text for Natural Language Processing. There are several common techniques that we will go over in this post, using the Natural Language Toolkit (NLTK) in Python.

Overview of the COVID-19 Open Research Dataset (CORD-19) + Kaggle Challenge
March 20, 2020

This is an overview of the COVID-19 Open Research Dataset (CORD-19), which is a corpus of research papers related to the coronavirus pandemic, and the Kaggle challenge to develop tools to process them using natural language processing techniques.

Python Project: Which cocktails can you make from a list of ingredients?
March 18, 2020

For many of us, going out to restaurants and bars is but a distant memory, and you might want to make your own cocktails at home. In this post we will build a program in Python to tell you what cocktails you can make from a list of input ingredients.
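
One simple way to frame the problem is as a subset check: a cocktail is makeable if every ingredient it needs is in your pantry. The sketch below assumes recipes are stored as a dictionary of name to ingredient list; the function name and the data layout are hypothetical and may differ from what the post builds.

```python
def makeable_cocktails(recipes, pantry):
    """Return the names of recipes whose ingredients are all in the pantry."""
    have = {item.lower() for item in pantry}
    return sorted(
        name
        for name, needed in recipes.items()
        if {ingredient.lower() for ingredient in needed} <= have
    )
```

Ingredient names are compared case-insensitively, so "Gin" and "gin" count as the same item.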

Accessing the Google Analytics Reporting API (V4) with Python
March 8, 2020

How to access the Google Analytics API with Python and create reports with your analytics data. This API seems complicated at first, but once you get the hang of how things work it's easy to generate new and interesting reports.

Generating and solving Sudoku puzzles with Python
March 4, 2020

You might be familiar with Sudoku - the single-player puzzle that involves filling a 9x9 grid with the numbers 1-9 so that each row, column and 3x3 box contains every number exactly once. In this post we will generate and solve Sudoku puzzles with Python using a depth-first search backtracking algorithm.
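
A minimal sketch of a backtracking solver, assuming the grid is a 9x9 list of lists with 0 marking an empty cell. The function names are hypothetical and the post's own version may be organized differently.

```python
def is_valid(grid, row, col, digit):
    """Check that digit does not already appear in the row, column or 3x3 box."""
    if digit in grid[row]:
        return False
    if any(grid[r][col] == digit for r in range(9)):
        return False
    top, left = 3 * (row // 3), 3 * (col // 3)
    return all(
        grid[r][c] != digit
        for r in range(top, top + 3)
        for c in range(left, left + 3)
    )


def solve(grid):
    """Fill the grid in place using depth-first search; return True if solved."""
    for row in range(9):
        for col in range(9):
            if grid[row][col] == 0:
                for digit in range(1, 10):
                    if is_valid(grid, row, col, digit):
                        grid[row][col] = digit
                        if solve(grid):
                            return True
                        grid[row][col] = 0  # undo and try the next digit
                return False  # no digit fits here, so backtrack
    return True  # no empty cells left
```

The solver tries digits in order at the first empty cell, recurses, and undoes its choice whenever a later cell has no valid digit.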

Feature Engineering with Python + Pandas: An Introduction
Feb. 26, 2020

Feature Engineering is an important skill in data science, and is the process of taking raw data and turning it into features that can be used as inputs for training machine learning algorithms. We will look at 311 noise complaints data in this post.

Data cleaning with Python + Pandas: An Introduction
Feb. 16, 2020

Cleaning up dirty, corrupted data with Python and Pandas. Dirty, corrupted data leads to dirty and corrupt analysis and conclusions. Who wants that? In this post we will go through a cleaning checklist with Pandas and a dataset from NYC Open Data.

How to train a custom Named Entity Recognizer with Spacy
Feb. 10, 2020

In this post we will train a custom Named Entity Recognizer in Python with Spacy. I will go through the steps to prepare your data and train a model with it. Inspiration credit: text for the graphic is from Vogue magazine - link in post.

Point in Polygon search with GeoDjango
Feb. 2, 2020

Determining if a point lies in a polygon is a pretty common task in computational geometry. In this post we will use it to answer questions like 'which NYC neighborhood is this apartment building in?' using GeoDjango and data from NYC Open Data.
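
To illustrate the underlying test outside of GeoDjango, here is a plain-Python sketch of the ray casting approach: count how many polygon edges a horizontal ray from the point crosses, and an odd count means the point is inside. This is a swapped-in illustration, not GeoDjango's API, and the function name is hypothetical. It assumes the polygon is a list of (x, y) vertices in order and does not treat points exactly on an edge specially.

```python
def point_in_polygon(x, y, polygon):
    """Return True if (x, y) is inside the polygon, using ray casting."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside
```

For geographic data the same question is usually answered by a spatial database query, which also handles map projections and indexing.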

Accessing NYC Open Data with Python + the Socrata Open Data API
Jan. 26, 2020

If the walls in NYC could talk, they would likely tell you a story similar to the one you can glean from 311 complaints: noise complaints, building complaints, rat sightings, and so on. NYC Open Data provides this data, which we can access using Python.

Generating and Solving Mazes with Python
Jan. 19, 2020

We've been into mazes for thousands of years. Some can be tricky to navigate, but we can solve them pretty quickly in a few lines of code, using well-known path-finding algorithms. All visualized in matplotlib.

How to train a custom Named Entity Recognizer with Stanford NLP
Jan. 12, 2020

When you want to label text data with named entities like people and location names, sometimes the out-of-the-box NER taggers do not quite meet your needs. Today we'll walk through the steps of training a Stanford NER model with a custom dataset.

Named Entity Recognition in Python with Stanford-NER and Spacy
Jan. 6, 2020

Named Entity Recognition is a common task in Natural Language Processing that aims to label things like person or location names in text data. Today we will look at two examples in Python, using the popular libraries Stanford NLP and Spacy.

Anatomy Of A Web-Scraping Robot
Jan. 2, 2020

What is a bot? Robots are bad, right? Not always. At its core a robot is just a program to automate various things you could do as a human, such as visiting websites. I will outline the parts, or anatomy, making up such a robot in Python.

Tweeting with Python
Dec. 22, 2019

How to access the Twitter API with Python using the Tweepy library. I will demonstrate how to connect to the API and do regular Twitter things like tweeting, following and favoriting, all using the API.

Classifying Fashion Articles with Python and Scikit-learn
Dec. 21, 2019

Text classification is a popular and important problem that we deal with on a daily basis. I will be creating a text classifier with Python and scikit-learn to filter a collection of articles based on whether or not they are fashion-related.

Building A Force-Directed Network Graph with D3.js
Dec. 15, 2019

Today I will go over what a force-directed graph is and how to build one in D3.js. This graph is built using data extracted from New York Times articles to show the items discussed in them.

What is web scraping?
Dec. 8, 2019

Web scraping can mean a lot of things, but it usually refers to writing a program to visit websites and extract information from them. It can be a great tool when you need customized data, and I will demonstrate this with a scraper written in Python.
