In this tutorial, we streamline model predictions with an API endpoint that accepts a raw string for inference. We also experiment with model signatures for a more customizable payload.

Photo by Markus Winkler on Unsplash

Around one year ago, I wrote “Deploy a Keras Model for Text Classification using TensorFlow Serving”, which covered deploying text classifiers in TensorFlow 1.X. Since then, I’ve spent a lot of time migrating older projects to TensorFlow 2.X. While I found 2.X much easier to work with, I had a hard time finding the documentation I needed, especially around deployment and migrating deprecated V1 features for TensorFlow Serving.

I’ve gathered what I learned and will demonstrate model deployment with a toy example.
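As a rough illustration of what sending a raw string to a served model can look like, here is a minimal sketch that builds a request for TensorFlow Serving's REST predict endpoint. The model name `text_classifier`, the host, and the sample text are assumptions made for illustration and do not come from the original article. Port 8501 and the `serving_default` signature are the usual defaults, but your deployment may differ.

```python
import json


def build_predict_request(texts, model_name="text_classifier",
                          host="localhost", port=8501,
                          signature_name="serving_default"):
    """Build the URL and JSON body for a TensorFlow Serving REST predict call.

    model_name, host and port are hypothetical placeholders.
    """
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    # Row format: one entry in "instances" per input example (here, raw strings).
    payload = {"signature_name": signature_name, "instances": list(texts)}
    return url, json.dumps(payload)


url, body = build_predict_request(["this movie was great"])
print(url)
print(body)
```

The body would then be sent as an HTTP POST to the URL with any HTTP client. Keeping the payload construction in its own function makes it easy to switch to another signature name when experimenting with custom signatures.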

If this is your first time working with TensorFlow 2.X, …


Built something cool? Make it easily accessible by publishing it to PyPI!

Photo by Hitesh Choudhary on Unsplash

Python Package Index (PyPI)

The Python Package Index is a repository of software libraries available for Python programming. PyPI makes it easy to distribute and access useful projects that are not a part of the standard Python libraries.

It’s very simple to publish your own open-source project to PyPI. In this article, we will cover how to:

  • Prepare your package for PyPI
  • Manually build and upload your package to PyPI
  • Test your package with TestPyPI prior to release
  • Version your releases
  • Sync the main branch of your repository with PyPI using CircleCI

I will be using one of my repositories as an example: pdf-wrangler


Tips to stand out in a competitive job market

Photo by Magnet.me on Unsplash

This article is cross-posted on my website https://www.dscrashcourse.com/ alongside other guides we have been curating for aspiring data scientists.

The data scientist role is a popular career choice for anyone who likes to work with numbers and analytics. Harvard Business Review once called it “the sexiest job of the 21st century”, and that popularity has left the field oversaturated with job seekers and bootcamps.

With so much interest and competition, it has become harder for aspiring data scientists to stand out and get noticed in the job market.

So, what can an aspiring data scientist do to stand out?

First, it’s important to understand the reality…


Deploy an AI assistant using Rasa — from conception to Facebook — within an hour.

Photo by Volodymyr Hryshchenko on Unsplash

Rasa is an open-source conversational AI framework that uses machine learning to build chatbots and AI assistants. Today, I’m going to show you how to build your own simple chatbot using Rasa and deploy it as a bot on Facebook Messenger, all within an hour. All you need is some simple Python programming and a working internet connection.

The complete code can be found here: GitHub Repo

The code was developed and tested in Python 3.7. Rasa currently only supports Python up to 3.8 (see here for updates).

The bot we build today will be very simple and will…


Set up GPT-2 demo in under 5 minutes using Streamlit

Photo by Halacious on Unsplash

GitHub Repo: ml-streamlit-demo

Bringing a machine learning model outside of a notebook environment and turning it into a beautiful data product used to be a lot of work. Luckily, plenty of tooling is being developed in this area to make prototyping easier. A while ago, I came across Streamlit, an open-source Python library for building custom web apps.

It’s quick and easy to get started: it took me less than 30 minutes to build an app with a pre-trained model. Since then, I have been using Streamlit for prototyping models and demonstrating their capabilities. …


Share your app with the internet in under 5 minutes

Photo by Susan Lewis-Penix on Unsplash

If you are not working with an existing app, you can use my Iris Classifier FastAPI App as a reference. I wrote an article about how to set that up (the Docker component is optional).

Note: I put the Iris app together quickly to demonstrate how to set up FastAPI. It does not adhere to best practices for serving model predictions.

This is my GitHub repo for the Iris Classifier app. We will be using this as an example in this tutorial. …


Step-by-step guide for using Airflow + Docker to deliver weather forecasts to Slack

Photo by Austin Distel on Unsplash

Tech Stack: Python 3.7, Airflow (1.10.10), Docker

GitHub link: All of the code can be found here.

Airflow + Slack

Slack is an increasingly popular chat app used in the workplace. Apache Airflow is an open-source platform for orchestrating workflows. One of the biggest advantages of using Airflow is the versatility of its hooks and operators. Hooks are interfaces to external platforms and databases, and they also serve as the basic building blocks of Operators.

The Slack Webhook Operator can be used to integrate Airflow with Slack. …
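To make the idea concrete without depending on Airflow itself, the sketch below shows the kind of HTTP request that posts a message to a Slack incoming webhook, using only the Python standard library. This is not the operator's implementation. The webhook URL and the message are hypothetical placeholders, and the request is built but not sent.

```python
import json
import urllib.request


def build_slack_request(webhook_url, message):
    """Build (but do not send) a POST request for a Slack incoming webhook."""
    # Slack incoming webhooks accept a JSON body with a "text" field.
    body = json.dumps({"text": message}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical placeholder URL; a real one comes from your Slack app settings.
request = build_slack_request(
    "https://hooks.slack.com/services/T000/B000/XXXX",
    "Today's forecast: sunny with a high of 21 C",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To actually send it: urllib.request.urlopen(request)
```

Inside an Airflow DAG, the operator handles this step for you, so a sketch like this is mainly useful for checking a webhook by hand before wiring it into a workflow.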


Can our model accurately detect churn to help retain these customers?

Photo by Andre Hunter on Unsplash

Customer churn, also known as attrition, occurs when a customer stops doing business with a company. Understanding and detecting churn is the first step to retaining these customers and improving the company’s offerings.

Telco Dataset

We will be training our churn model on the Telco-Customer-Churn Dataset to predict the likelihood of customers leaving the fictional telecommunications company, Telco. This synthetic dataset was put together by IBM and includes a label indicating whether or not the customer left within the last month.

Goal: predict whether a customer will churn based on their demographic and service information.

Data Exploration

The exploration and modelling will be conducted…


Data Science Crash Course

The Ultimate Python library for working with Relational Data

Photo by Pascal Müller on Unsplash

Earlier this month, Edward Qian and I started working on a set of comprehensive lessons for aspiring Data Scientists, which can be found on our website www.dscrashcourse.com.

I will be cross-posting slightly modified lessons to Medium to make them available to a broader audience. If you find these articles helpful, check out the site for more lessons and practice problems!

pandas is a Python library that makes it easy to read, export and work with relational data. This lesson will expand on its functionality and usage. We typically import pandas as pd to refer to the library using the abbreviated…


Data Science Crash Course

Understanding Probabilistic Models starting with Logistic Regression

Photo by Jonathan Petersson on Unsplash

Logistic regression is used to model the probability of an event occurring by estimating its log odds. …
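As a small worked example of that relationship, the sketch below computes the log odds as a linear function of a single feature and converts them into a probability with the logistic (sigmoid) function. The intercept, coefficient, and feature value are made-up numbers chosen for illustration, not fitted parameters.

```python
import math


def predict_probability(x, intercept, coefficient):
    """Return P(event) for one feature value under a logistic regression model."""
    # The model is linear in the log odds: log(p / (1 - p)) = intercept + coefficient * x
    log_odds = intercept + coefficient * x
    # The sigmoid function maps log odds back to a probability between 0 and 1.
    return 1.0 / (1.0 + math.exp(-log_odds))


# Illustrative values: -1.0 + 0.5 * 2.0 gives log odds of 0.
p = predict_probability(x=2.0, intercept=-1.0, coefficient=0.5)
print(p)  # log odds of 0 correspond to a probability of 0.5
```

Positive log odds give probabilities above 0.5 and negative log odds give probabilities below it, which is why the sign of each coefficient tells you the direction of a feature's effect.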

Mandy Gu

Data Scientist @ Wealthsimple
