state-of-the-art machine learning

deep learning, SVMs, random forests and more on one platform

Product Tour

Ersatz provides a unified machine learning environment with support for deep learning, data wrangling, a variety of "model backends", model and data visualization, team collaboration, and GPU computing--all from a browser.

Using Ersatz for our senior project was a fantastic choice. The service is fast, easy, and powerful. Most importantly, the accuracy we received from the trained models was exactly what we were looking for.

Austin Fox, Student
Cal Poly San Luis Obispo
Case study

What is Ersatz?

Ersatz is a web-based, general-purpose platform for machine learning with support for GPU-based deep learning. It's geared towards aspiring and working data scientists with stuff to do. Ersatz has a number of components designed to make modern machine learning workflows much more efficient. Primarily, these include tools for data wrangling, model training, and machine learning infrastructure.

Let's walk through it step by step.

Problem formulation and data wrangling

Any machine learning project starts with a clear understanding of your goals. This means answering questions like what kind of data you have access to, whether that data is labeled or unlabeled, how accurate a solution needs to be to be effective, and how fast it needs to run. The importance of this step cannot be overstated: if it's done incorrectly, you end up with "garbage in, garbage out".

For those in need of guidance, Ersatz Labs provides corporate training and consulting services to help with problem definition and data collection. We can help you answer the question, "Which machine learning techniques are most effective for my problem?" definitively. For more information, please email us at info@ersatzlabs.com or sign up for a free trial below.

Data processing and warehousing

Ersatz can be used to manage and share datasets across an organization. Datasets can be uploaded directly through the web interface or through our API. Once uploaded, data is parsed and converted to a distributed set of HDF5 files for storage.

Ersatz provides web-based graphical tools to identify duplicate columns or rows, fill in missing values, and otherwise wrangle your data into shape. When you are satisfied with your data, you can move on to actually building predictive models.
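
For readers who want a feel for what these wrangling steps involve, here is a minimal sketch in plain Python. It is not Ersatz code: the function names, the list-of-rows layout, the use of None for a missing value, and the choice of mean imputation are all assumptions made for illustration.

```python
from statistics import mean


def find_duplicate_rows(rows):
    """Return the indices of rows that repeat an earlier row."""
    seen = set()
    duplicates = []
    for index, row in enumerate(rows):
        key = tuple(row)
        if key in seen:
            duplicates.append(index)
        else:
            seen.add(key)
    return duplicates


def find_duplicate_columns(rows):
    """Return (earlier, later) index pairs of columns holding identical values."""
    first_seen = {}
    pairs = []
    for index, column in enumerate(zip(*rows)):
        if column in first_seen:
            pairs.append((first_seen[column], index))
        else:
            first_seen[column] = index
    return pairs


def fill_missing_with_mean(rows):
    """Replace None with the mean of the known values in the same column."""
    fills = []
    for column in zip(*rows):
        known = [value for value in column if value is not None]
        fills.append(mean(known) if known else None)
    return [
        [fills[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]


# Hypothetical toy dataset: row 2 repeats row 0, column 2 repeats column 0,
# and row 1 is missing its middle value.
rows = [
    [1.0, 2.0, 1.0],
    [3.0, None, 3.0],
    [1.0, 2.0, 1.0],
    [5.0, 5.0, 5.0],
]
duplicate_rows = find_duplicate_rows(rows)
duplicate_columns = find_duplicate_columns(rows)
filled = fill_missing_with_mean(rows)
```

Mean imputation is only one of several reasonable ways to fill a gap; the right choice depends on the column and the problem.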

A screenshot from the Ersatz Column Selection Wizard

Ensembles and Models

Now we're getting to the meat and potatoes of Ersatz: the ensemble manager, parameter chooser, and model backends. We'll explain each.

A list of models in an ensemble

Ensemble Manager

An ensemble is simply a group of machine learning models with a common objective. An easy way to think of an ensemble is as a "project" or a "folder" containing models. The real benefit of an ensemble is how it gets the various models to work together. The ensemble manager is a key differentiator for Ersatz compared to other web-based machine learning platforms. It allows you to, for example, create a deep neural network that combines its outputs with a support vector machine.
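
One common way to combine the outputs of two different models is to average their predicted class probabilities. The sketch below illustrates that idea in plain Python. It is an assumption for illustration only: this page does not say how the ensemble manager combines outputs, and the function names and probability values are hypothetical.

```python
def average_probabilities(model_outputs, weights=None):
    """Combine per-class probabilities from several models by weighted average."""
    if weights is None:
        weights = [1.0] * len(model_outputs)
    total_weight = sum(weights)
    n_classes = len(model_outputs[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, model_outputs)) / total_weight
        for c in range(n_classes)
    ]


def predict_class(probabilities):
    """Return the index of the most probable class."""
    return max(range(len(probabilities)), key=probabilities.__getitem__)


# Hypothetical outputs for one example from two models over three classes.
neural_net_probs = [0.6, 0.3, 0.1]
svm_probs = [0.2, 0.7, 0.1]

combined = average_probabilities([neural_net_probs, svm_probs])
winner = predict_class(combined)
```

Here the neural network alone would pick class 0, while the averaged ensemble picks class 1 because the second model is more confident about it.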

Parameter Chooser

Most machine learning models require the user to set some mix of parameters, and setting them correctly often takes specific expertise. Ersatz automates this process by using machine learning to search the parameter space for the set that trains the best model.

Essentially, it takes the guesswork out so you can simply leave several experiments running and see which ones net out the best. Our parameter search is more effective and faster than both random search and grid search.
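
For context, here is a minimal sketch of the two baselines named above, grid search and random search. It does not show the Ersatz parameter chooser, whose method is not described on this page. The validation_error function is a hypothetical stand-in for training a model and scoring it, and the parameter names and ranges are assumptions.

```python
import itertools
import random


def validation_error(learning_rate, hidden_units):
    """Hypothetical stand-in for training a model and measuring its error."""
    return (learning_rate - 0.1) ** 2 + ((hidden_units - 64) / 100) ** 2


def grid_search(grid):
    """Try every combination of the listed values; return (error, params)."""
    names = list(grid)
    best = None
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        error = validation_error(**params)
        if best is None or error < best[0]:
            best = (error, params)
    return best


def random_search(sample, trials, seed=0):
    """Draw `trials` random parameter sets; return the best (error, params)."""
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        params = sample(rng)
        error = validation_error(**params)
        if best is None or error < best[0]:
            best = (error, params)
    return best


def sample_params(rng):
    """Sample the learning rate on a log scale and the layer size uniformly."""
    return {
        "learning_rate": 10 ** rng.uniform(-3, 0),
        "hidden_units": rng.randint(16, 256),
    }


grid = {"learning_rate": [0.001, 0.01, 0.1, 1.0], "hidden_units": [16, 64, 256]}
grid_best = grid_search(grid)                          # 12 evaluations
random_best = random_search(sample_params, trials=12)  # same budget
```

Both baselines pick their trials without looking at earlier results. A search that learns from completed experiments can spend the same budget more selectively.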

Model backends

An important design goal of Ersatz is to be relatively future-proof. We have achieved this by building a robust backend/runner system that makes it trivial to integrate with machine learning tools such as pylearn2 or sklearn.

This means Ersatz is compatible with a wide variety of machine learning models and techniques. You can think of Ersatz as a GUI manager for running these types of models and combining them into machine learning pipelines.

Monitor training and evaluation

Models are trained and run on specialized worker machines with high-performance GPUs, for computation up to 40x faster than CPU-based architectures. The data warehouse is responsible for sampling and allocating data to specific workers. The worker processes send training statistics back to the web interface, where they are rendered as charts and tables for analysis.

Having all of your data, models, and statistics in one place makes for a much more efficient machine learning experience.

A confusion matrix in Ersatz
Training statistics get displayed on a dashboard
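
As a small illustration of one statistic mentioned above, here is a sketch of how a confusion matrix can be computed from a model's predictions. It is plain Python written for this example, not the code behind the Ersatz dashboard, and the labels and predictions are made up.

```python
def confusion_matrix(actual, predicted, labels):
    """Count outcomes: rows are actual labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of examples on the diagonal, i.e. predicted correctly."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


# Hypothetical labels and predictions for five examples.
labels = ["cat", "dog"]
actual = ["cat", "cat", "dog", "dog", "dog"]
predicted = ["cat", "dog", "dog", "dog", "cat"]

matrix = confusion_matrix(actual, predicted, labels)
score = accuracy(matrix)
```

Reading across a row shows how examples of one true class were classified, which makes it easy to spot the classes a model tends to confuse.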

A production-ready pipeline

In machine learning, it's not uncommon to have to use several very different tools in your journey from prototype to production-ready. With Ersatz, you can do everything in one environment. All of the models you train can be accessed through the API, so as soon as you have a prototype that works, it's ready to scale up.

If you deal with data and machine learning, Ersatz makes your job easier. It does this by reducing the time you spend preparing data, giving you access to lots of algorithms in one environment, and providing dashboarding and ensembling capabilities.

Ready to give Ersatz a try?

No credit card required. Try it today.

Ersatz is a product of Ersatz Labs

Ersatz Labs is a team of web developers and machine learning researchers dedicated to providing the best machine learning tools on the market. While data science may be hard, we think we can make it easier.
Ersatz Labs, Inc.

28 Second St, 3rd Floor
San Francisco, CA 94105

tel: 415.504.3794
info: info@ersatzlabs.com

© 2014 Ersatz Labs, Inc.
All Rights Reserved.