Starting out with general purpose computing on the GPU, we are going to write a WebGPU compute shader to compute Morton Codes from an array of 3-D coordinates. This is the first step to detecting collisions between pairs of points.
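As a rough taste of the idea, here is a minimal CPU-side sketch in Python rather than the WebGPU shader itself. It assumes each coordinate has already been quantized to a 10-bit integer (0 to 1023); the function name `morton3d` and the bit width are illustrative choices, not taken from the post.

```python
BITS = 10  # assumed precision per axis; 3 * 10 = 30 bits fit in a 32-bit code


def morton3d(x: int, y: int, z: int) -> int:
    """Interleave the low BITS bits of x, y and z into one Morton code.

    Bit i of x lands at position 3*i, bit i of y at 3*i + 1,
    and bit i of z at 3*i + 2.
    """
    code = 0
    for i in range(BITS):
        code |= ((x >> i) & 1) << (3 * i)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i + 2)
    return code
```

Sorting points by this code tends to place spatially nearby points close together in the array, which is what makes it a useful first step for collision detection.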
In this post, I am dipping my toes into the world of compute shaders in WebGPU. This is the first of a series on building a particle simulation with collision detection using the GPU.
Finding the Lowest Common Ancestor of a pair of nodes in a tree can be helpful in a variety of problems in areas such as information retrieval, where it is used with suffix trees for string matching. Read on for the basics of this in Python.
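For a quick feel of the problem, here is a minimal sketch of the simplest approach: walk up from one node collecting its ancestors, then walk up from the other until the paths meet. The tree representation (a hypothetical `parent` dictionary mapping each node to its parent, with the root mapped to `None`) is an assumption for illustration and may differ from the one used in the post.

```python
def lowest_common_ancestor(parent, a, b):
    """Return the deepest node that is an ancestor of both a and b.

    `parent` maps each node to its parent; the root maps to None.
    A node counts as its own ancestor. Returns None if no common
    ancestor exists.
    """
    # Collect a and all of its ancestors.
    ancestors = set()
    node = a
    while node is not None:
        ancestors.add(node)
        node = parent.get(node)

    # Walk up from b until we hit one of them.
    node = b
    while node is not None:
        if node in ancestors:
            return node
        node = parent.get(node)
    return None
```

This takes time proportional to the depth of the tree per query; faster methods preprocess the tree so that repeated queries are cheaper.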
This blog post walks through the process of writing a fragment shader in GLSL, and using it within the three.js library for working with WebGL. We will render a visually appealing grid of rotating rectangles that can be used as a website background.
Streaming can be a great way to transfer and process large amounts of data. It can help save space and/or time, if the data uses a lot of memory, or if you want to start processing or visualizing the data as it comes in.
In a recent project, we had a large number of points on a canvas, where a user could draw a region of interest to see only the points within that area. Here is a demo of how to do that using MongoDB with a geospatial 2D-index. Visualized using D3.
This is Part II of my post on image similarity in Python with perceptual hashing. In this post, we will use Spotify's Annoy library to perform nearest neighbors search on a collection of images to find similar images to a query image.
Kruskal's algorithm finds a minimum spanning tree in an undirected, connected and weighted graph. We will use a union-find algorithm to do this, and generate a random maze from a grid of points.
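To make the union-find part concrete, here is a minimal sketch of Kruskal's algorithm in Python. The edge format `(weight, u, v)` with nodes numbered from 0 is an assumption for illustration; the maze generation from the post is not shown.

```python
def find(parent, i):
    """Return the root of i's set, halving the path as we go."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def kruskal(num_nodes, edges):
    """Return the edges of a minimum spanning tree.

    `edges` is a list of (weight, u, v) tuples over nodes 0..num_nodes-1.
    """
    parent = list(range(num_nodes))  # every node starts in its own set
    tree = []
    for weight, u, v in sorted(edges):  # cheapest edges first
        root_u, root_v = find(parent, u), find(parent, v)
        if root_u != root_v:  # the edge joins two separate components
            parent[root_u] = root_v
            tree.append((weight, u, v))
    return tree
```

For a maze, one common trick is to give the grid edges random weights, so that the resulting spanning tree carves a random set of passages.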
Just in time for Valentine's day, create a puckering lips animation in D3 from an SVG path, using interpolations and .attrTween(). We will go through the steps from generating points from an SVG path, to interpolating lines in D3 to animate them.
In this post I wanted to document the steps I took to deploy a Flask app to AWS Lambda using Zappa. Configuring everything properly in AWS was the most complicated part, so I hope this post can help other AWS noobs who might be struggling!
Data joins in D3 can be a tricky thing to wrap your mind around, but once you do, you can take your visualizations to the next level with animations. Data joins are a core concept in D3, so it's a good idea to get acquainted with them.
D3 has a built-in force to detect circle collisions in force layouts, but what if you're working with rectangles? In this post we will go over how to detect and resolve collisions, and then adapt D3's built-in forceCollide to work on rectangles.
You can do a lot of interesting things with the Spotify API, like searching for artists and playlists, following and sharing them, and more. In this post we will access the API using Python to get featured playlists and associated artists and genres.
Pictograms have been around for a long time, and with good reason. They are interesting and engaging, and might even help your audience to remember the information better. In this post we will build a pictogram grid in D3.js.
Let's play Minesweeper in Python. In this post we will treat Minesweeper as a constraint satisfaction problem and use common algorithms like constraint propagation and backtracking search to mimic logic we would use to play the game as humans.
In the next couple of posts we're playing Minesweeper in Python. You may be familiar with it, since it can probably be found on your nearest computer. First we need to generate a board - that's this post - and then in the next one, we will play the game.
The flood fill algorithm has several high profile uses, most notably the bucket fill tool in image editing programs, as well as in games like Minesweeper. In this post we will go over how the tool works, as well as how to implement the algorithm.
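As a small illustration, here is a minimal sketch of a 4-connected flood fill using a breadth-first queue. The grid representation (a list of lists) and the function name `flood_fill` are assumptions for this example; the post may implement it differently, for instance recursively.

```python
from collections import deque


def flood_fill(grid, start, new_value):
    """Replace the connected region containing `start` with `new_value`.

    `grid` is a list of lists, `start` is a (row, col) tuple, and cells
    are connected horizontally and vertically. The grid is modified in
    place and also returned.
    """
    rows, cols = len(grid), len(grid[0])
    r0, c0 = start
    old_value = grid[r0][c0]
    if old_value == new_value:
        return grid  # nothing to do, and avoids looping forever

    grid[r0][c0] = new_value
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == old_value:
                grid[nr][nc] = new_value  # mark before enqueueing
                queue.append((nr, nc))
    return grid
```

Using an explicit queue instead of recursion sidesteps Python's recursion limit on large regions.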
In part II of building arc diagrams in D3.js we will build the actual diagram with data from ride hailing app trips we prepared in Part I. Drawing the arc is the most complicated part of this visualization, and we will go through it step by step.
FRED is a database with time series data on economic indicators from a wide variety of sources. There is an API to access all of this data, and in this post I will go over a recent project where I needed to collect all of it.
An arc diagram is a type of network graph where the nodes lie along one axis, with arcs connecting them. This post is part one of two, where we will prepare the data to visualize pickup and dropoff locations for ride hailing app rides in NYC.
Time series data is all the rage these days, and not just in fields like finance. In this post we will look at working with time series data in Pandas, including basic time-based manipulations and calculations such as rolling means and data shifting.
If you use a Google Font in an SVG visualization and then try to save it as a file, you might find that the font was not preserved in the saved file. To remedy that, we will look at how to embed a custom font into an SVG with base64 encoding.
In this post we will use the ogr2ogr command line tool from GDAL to convert a shapefile of NYC zip code boundary data to GeoJSON format, as well as convert the projected coordinates to latitude and longitude, in one line of code.
You might be familiar with force layouts in D3.js to create things like bubble charts, network graphs and many other types of visualizations. In this post we will create a force layout bubble chart with multiple clusters along a timeline.
Word vectors are useful in NLP tasks to preserve the context or meaning of text data. In this post we will use Spacy to obtain word vectors, and transform the vectors into a feature matrix that can be used in a Scikit-learn pipeline.
Voronoi diagrams are used in a variety of fields for a variety of reasons, including the art and design world. This post is Part II in a series on mazes, where I will generate and solve random mazes from Voronoi diagrams using Python and Matplotlib.
Coreference resolution is a task in Natural Language Processing that aims to group together all references to an entity, for example, a person like Rihanna, in text. In this post we use NeuralCoref - a Spacy extension - to do this in Python.
Computing the convex hull of a set of points is a fundamental problem in computational geometry, and the Graham scan is a common algorithm for it. In this post we will implement the algorithm in Python and look at interesting uses of convex hulls.
A Voronoi diagram divides up a space into regions of influence based on a set of points. In this post we will generate a Voronoi diagram from a map of NYC subway station locations, which can be used to find the closest subway station to any location.
Two broad categories in machine learning are supervised and unsupervised learning. Classification and clustering are examples of each of those respectively, and in this post I will go over the differences between them and when you might use them.
Years ago I had an app idea where users could upload an image of a fashion item like shoes, and it would identify them. In this post I will go over how I approached the problem using perceptual hashing in Python with Pillow and the imagehash library.
Text Normalization is an important part of preprocessing text for Natural Language Processing. There are several common techniques that we will go over in this post, using the Natural Language Toolkit (NLTK) in Python.
This is an overview of the COVID-19 Open Research Dataset (CORD-19), which is a corpus of research papers related to the coronavirus pandemic, and the Kaggle challenge to develop tools to process them using natural language processing techniques.
For many of us, going out to restaurants and bars is but a distant memory, and you might want to make your own cocktails at home. In this post we will build a program in Python to tell you what cocktails you can make from a list of input ingredients.
How to access the Google Analytics API with Python and create reports with your analytics data. This API seems complicated at first, but once you get the hang of how things work it's easy to generate new and interesting reports.
You might be familiar with Sudoku - the single-player puzzle that involves inserting the numbers 1-9 into a grid in a certain way. In this post we will generate and solve Sudoku puzzles with Python using a depth-first search backtracking algorithm.
Feature Engineering is an important skill in data science, and is the process of taking raw data and turning it into features that can be used as inputs for training machine learning algorithms. We will look at 311 noise complaints data in this post.
Cleaning up dirty, corrupted data with Python and Pandas. Dirty, corrupted data leads to dirty and corrupt analysis and conclusions. Who wants that? In this post we will go through a cleaning checklist with Pandas and a dataset from NYC Open Data.
In this post we will train a custom Named Entity Recognizer in Python with Spacy. I will go through the steps to prepare your data and train a model with it. Inspiration credit: text for the graphic is from Vogue magazine - link in post.
Determining if a point lies in a polygon is a pretty common task in computational geometry. In this post we will use it to answer questions like 'which NYC neighborhood is this apartment building in?' using GeoDjango and data from NYC Open Data.
If the walls in NYC could talk, they would likely tell you a story similar to the one you can glean from 311 complaints: noise complaints, building complaints, rat sightings, etc. NYC Open Data provides us this data, which we can access using Python.
We've been into mazes for thousands of years. Some can be tricky to navigate, but we can solve them pretty quickly in a few lines of code, using well-known path-finding algorithms. All visualized in matplotlib.
When you want to label text data with named entities like people and location names, sometimes the out-of-the-box NER taggers do not quite meet your needs. Today we'll walk through the steps of training a Stanford NER model with a custom dataset.
Named Entity Recognition is a common task in Natural Language Processing that aims to label things like person or location names in text data. Today we will look at two examples in Python, using the popular libraries Stanford NLP and Spacy.
What is a bot? Robots are bad, right? Not always. At its core a robot is just a program to automate various things you could do as a human, such as visiting websites. I will outline the parts, or anatomy, making up such a robot in Python.
How to access the Twitter API with Python using the Tweepy library. I will demonstrate how to connect to the API and do regular Twitter things like tweeting, following and favoriting, all using the API.
Text classification is a popular and important problem that we deal with on a daily basis. I will be creating a text classifier with Python and scikit-learn to filter a collection of articles based on whether or not they are fashion-related.
Today I will go over what a force-directed graph is and how to build one in D3.js. This graph is built using data extracted from New York Times articles to show items that are talked about in the articles.
Web scraping can mean a lot of things, but it usually refers to writing a program to visit websites and extract information from them. It can be a great tool when you need customized data, and I will demonstrate this with a scraper written in Python.