Solving the Lowest Common Ancestor Problem in Python

Finding the Lowest Common Ancestor of a pair of nodes in a tree is useful in areas such as information retrieval, where it is used with suffix trees for string matching. Read on for the basics of solving it in Python.

How to write a custom fragment shader in GLSL and use it with three.js
April 16, 2023

This blog post walks through the process of writing a fragment shader in GLSL, and using it within the three.js library for working with WebGL. We will render a visually appealing grid of rotating rectangles that can be used as a website background.

Streaming data with Flask and Fetch + the Streams API
April 10, 2023

Streaming can be a great way to transfer and process large amounts of data. It can save space and time when the data uses a lot of memory, or when you want to start processing or visualizing the data as it comes in.

Point in Polygon search with MongoDB
Nov. 12, 2022

In a recent project, we had a large number of points on a canvas, where a user could draw a region of interest to see only the points within that area. Here is a demo of how to do that using MongoDB with a geospatial 2d index, visualized using D3.

Image Similarity with Python Part II: Nearest Neighbor Search
Feb. 18, 2022

This is Part II of my post on image similarity in Python with perceptual hashing. In this post, we will use Spotify's Annoy library to perform a nearest neighbor search on a collection of images to find images similar to a query image.

Kruskal's Algorithm Animation + Maze Generation
Feb. 7, 2022

Kruskal's algorithm finds a minimum spanning tree in an undirected, connected and weighted graph. We will use a union-find algorithm to do this, and generate a random maze from a grid of points.

Puckering Lips Animation in D3
Jan. 28, 2022

Just in time for Valentine's Day, create a puckering lips animation in D3 from an SVG path, using interpolations and .attrTween(). We will go through each step, from generating points from an SVG path to interpolating lines in D3 to animate them.

Deploying a Flask app on AWS Lambda with Zappa
July 10, 2021

In this post I wanted to document the steps I took to deploy a Flask app to AWS Lambda using Zappa. Configuring everything properly in AWS was the most complicated part, so I hope this post can help other AWS noobs who might be struggling!

Data Joins in D3
March 10, 2021

Data joins in D3 can be a tricky thing to wrap your mind around, but once you do, you can take your visualizations to the next level with animations. Data joins are a core concept in D3, so it's a good idea to get acquainted with them.

Rectangular Collision Detection in D3 Force Layouts
Feb. 4, 2021

D3 has a built-in force to detect circle collisions in force layouts, but what if you're working with rectangles? In this post we will go over how to detect and resolve collisions, and then adapt D3's built-in forceCollide to work on rectangles.

Accessing the Spotify API with Python
Dec. 13, 2020

You can do a lot of interesting things with the Spotify API, like searching for artists and playlists, following and sharing them, and more. In this post we will access the API using Python to get featured playlists and associated artists and genres.

Building Pictogram Grids in D3.js
Nov. 22, 2020

Pictograms have been around for a long time, and with good reason. They are interesting and engaging, and might even help your audience to remember the information better. In this post we will build a pictogram grid in D3.js.

Solving Minesweeper in Python as a Constraint Satisfaction Problem
Nov. 15, 2020

Let's play Minesweeper in Python. In this post we will treat Minesweeper as a constraint satisfaction problem and use common algorithms like constraint propagation and backtracking search to mimic logic we would use to play the game as humans.

Generating Minesweeper boards in Python
Nov. 6, 2020

In the next couple of posts we're playing Minesweeper in Python. You may be familiar with it since it probably can be found on your nearest computer. First we need to generate a board - that's this post - and then in the next, we will play the game.

The Flood Fill Algorithm in Python
Oct. 11, 2020

The flood fill algorithm has several high-profile uses, most notably the bucket fill tool in image editing programs, as well as in games like Minesweeper. In this post we will go over how the tool works, as well as how to implement the algorithm.

Arc Diagrams in D3.js Part II
Sept. 7, 2020

In Part II of building arc diagrams in D3.js we will build the actual diagram with the data from ride-hailing app trips that we prepared in Part I. Drawing the arc is the most complicated part of this visualization, and we will go through it step by step.

Accessing the FRED API with Python
Aug. 28, 2020

FRED is a database with time series data on economic indicators from a wide variety of sources. There is an API to access all of this data, and in this post I will go over a recent project where I needed to collect all of it.

Arc Diagrams in D3.js: Visualizing Taxi Pickup and Dropoff Data
July 30, 2020

An arc diagram is a type of network graph where the nodes lie along one axis, with arcs connecting them. This post is part one of two, where we will prepare the data to visualize pickup and dropoff locations for ride-hailing app rides in NYC.

Time Series Data In Pandas: An Introduction
July 20, 2020

Time series data is all the rage these days, and not just in fields like finance. In this post we will look at working with time series data in Pandas, including basic time-based manipulations and calculations such as rolling means and data shifting.

How to embed a Google Font into an SVG
July 1, 2020

If you use a Google Font in an SVG visualization and then try to save it as a file, you might find that the font was not preserved in the saved file. To remedy that, we will look at how to embed a custom font into an SVG with base64 encoding.

Using ogr2ogr to convert Shapefiles to GeoJSON
June 20, 2020

In this post we will use the ogr2ogr command line tool from GDAL to convert a shapefile of NYC zip code boundary data to GeoJSON format, as well as convert the projected coordinates to latitude and longitude, in one line of code.

Building a Multi-Foci Force Layout Bubble Chart in D3.js
June 12, 2020

You might be familiar with force layouts in D3.js to create things like bubble charts, network graphs and many other types of visualizations. In this post we will create a force layout bubble chart with multiple clusters along a timeline.

Building a custom Scikit-learn Transformer using GloVe vectors from Spacy as features
May 23, 2020

Word vectors are useful in NLP tasks to preserve the context or meaning of text data. In this post we will use Spacy to obtain word vectors, and transform the vectors into a feature matrix that can be used in a Scikit-learn pipeline.

Python Maze Generator Part II: Voronoi Diagrams
May 14, 2020

Voronoi diagrams are used in a variety of fields for a variety of reasons, including the art and design world. This post is Part II in a series on mazes, where I will generate and solve random mazes from Voronoi diagrams using Python and Matplotlib.

Coreference resolution in Python with Spacy + NeuralCoref
May 6, 2020

Coreference resolution is a task in Natural Language Processing that aims to group together all references to an entity, for example, a person like Rihanna, in text. In this post we use NeuralCoref - a Spacy extension - to do this in Python.

Convex hulls in Python: the Graham scan algorithm
April 26, 2020

Computing the convex hull of a set of points is a fundamental problem in computational geometry, and the Graham scan is a common algorithm for it. In this post we will implement the algorithm in Python and look at interesting uses of convex hulls.

Finding the nearest NYC subway station with a Voronoi map
April 19, 2020

A Voronoi diagram divides up a space into regions of influence based on a set of points. In this post we will generate a Voronoi diagram from a map of NYC subway station locations, which can be used to find the closest subway station to any location.

Classification vs. Clustering in Machine Learning
April 12, 2020

Two broad categories in machine learning are supervised and unsupervised learning. Classification and clustering are examples of each of those respectively, and in this post I will go over the differences between them and when you might use them.

Determining how similar two images are with Python + Perceptual Hashing
March 27, 2020

Years ago I had an app idea where users could upload an image of a fashion item like shoes, and it would identify them. In this post I will go over how I approached the problem using perceptual hashing in Python with Pillow and the imagehash library.

Text Normalization for Natural Language Processing in Python
March 22, 2020

Text Normalization is an important part of preprocessing text for Natural Language Processing. There are several common techniques that we will go over in this post, using the Natural Language Toolkit (NLTK) in Python.

Overview of the COVID-19 Open Research Dataset (CORD-19) + Kaggle Challenge
March 20, 2020

This is an overview of the COVID-19 Open Research Dataset (CORD-19), which is a corpus of research papers related to the coronavirus pandemic, and the Kaggle challenge to develop tools to process them using natural language processing techniques.

Python Project: Which cocktails can you make from a list of ingredients?
March 18, 2020

For many of us, going out to restaurants and bars is but a distant memory, and you might want to make your own cocktails at home. In this post we will build a program in Python to tell you what cocktails you can make from a list of input ingredients.

Accessing the Google Analytics Reporting API (V4) with Python
March 8, 2020

How to access the Google Analytics API with Python and create reports with your analytics data. This API seems complicated at first, but once you get the hang of how things work it's easy to generate new and interesting reports.

Generating and solving Sudoku puzzles with Python
March 4, 2020

You might be familiar with Sudoku - the single-player puzzle that involves inserting the numbers 1-9 into a grid in a certain way. In this post we will generate and solve Sudoku puzzles with Python using a depth-first search backtracking algorithm.

Feature Engineering with Python + Pandas: An Introduction
Feb. 26, 2020

Feature Engineering is an important skill in data science, and is the process of taking raw data and turning it into features that can be used as inputs for training machine learning algorithms. We will look at 311 noise complaints data in this post.

Data cleaning with Python + Pandas: An Introduction
Feb. 16, 2020

Cleaning up dirty, corrupted data with Python and Pandas. Dirty, corrupted data leads to dirty and corrupt analysis and conclusions. Who wants that? In this post we will go through a cleaning checklist with Pandas and a dataset from NYC Open Data.

How to train a custom Named Entity Recognizer with Spacy
Feb. 10, 2020

In this post we will train a custom Named Entity Recognizer in Python with Spacy. I will go through the steps to prepare your data and train a model with it. Inspiration credit: text for the graphic is from Vogue magazine - link in post.

Point in Polygon search with GeoDjango
Feb. 2, 2020

Determining if a point lies in a polygon is a pretty common task in computational geometry. In this post we will use it to answer questions like 'which NYC neighborhood is this apartment building in?' using GeoDjango and data from NYC Open Data.

Accessing NYC Open Data with Python + the Socrata Open Data API
Jan. 26, 2020

If the walls in NYC could talk, they would likely tell you a story similar to the one you can glean from 311 complaints: noise complaints, building complaints, rat sightings, and so on. NYC Open Data provides this data, which we can access using Python.

Generating and Solving Mazes with Python
Jan. 19, 2020

We've been into mazes for thousands of years. Some can be tricky to navigate, but we can solve them pretty quickly in a few lines of code, using well-known path-finding algorithms. All visualized in matplotlib.

How to train a custom Named Entity Recognizer with Stanford NLP
Jan. 12, 2020

When you want to label text data with named entities like people and location names, sometimes the out-of-the-box NER taggers do not quite meet your needs. Today we'll walk through the steps of training a Stanford NER model with a custom dataset.

Named Entity Recognition in Python with Stanford-NER and Spacy
Jan. 6, 2020

Named Entity Recognition is a common task in Natural Language Processing that aims to label things like person or location names in text data. Today we will look at two examples in Python, using the popular libraries Stanford NLP and Spacy.

Anatomy Of A Web-Scraping Robot
Jan. 2, 2020

What is a bot? Robots are bad, right? Not always. At its core a robot is just a program to automate various things you could do as a human, such as visiting websites. I will outline the parts, or anatomy, making up such a robot in Python.

Make a Twitter Bot with Python and Tweepy
Dec. 22, 2019

How to access the Twitter API with Python using the Tweepy library. I will demonstrate how to connect to the API and do regular Twitter things like tweeting, following and favoriting, all using the API.

Classifying Fashion Articles with Python and Scikit-learn
Dec. 21, 2019

Text classification is a popular and important problem that we deal with on a daily basis. I will be creating a text classifier with Python and scikit-learn to filter a collection of articles based on whether or not they are fashion-related.

Building A Force-Directed Network Graph with D3.js
Dec. 15, 2019

Today I will go over what a force-directed graph is and how to build one in D3.js. This graph is built using data extracted from New York Times articles to show items that are talked about in the articles.

What is web scraping?
Dec. 8, 2019

Web scraping can mean a lot of things, but it usually refers to writing a program to visit websites and extract information from them. It can be a great tool when you need customized data, and I will demonstrate this with a scraper written in Python.
