Value Proposition

Unified Interface

ThirdAI’s Universal Deep Transformers (UDT) library for AutoML can tackle a broad range of machine learning tasks and data modalities, all through the same unified API. Whether you are solving problems in natural language processing, tabular data analytics, time series forecasting, information retrieval, text reformulation, or beyond, UDT provides a one-stop machine learning solution in just a few lines of Python code.
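For concreteness, the snippet below sketches what this workflow can look like for a simple text classification task. It follows the pattern of ThirdAI’s public examples, but treat the specific type constructors, column names, and file paths as illustrative assumptions rather than a verbatim reproduction of the API:

    from thirdai import bolt

    # One model class covers many tasks; the data_types mapping tells UDT
    # how to interpret each column of the raw CSV.
    model = bolt.UniversalDeepTransformer(
        data_types={
            "title": bolt.types.text(),
            "category": bolt.types.categorical(),
        },
        target="category",
        n_target_classes=100,
    )

    model.train("products_train.csv", epochs=3)
    model.evaluate("products_test.csv")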

This universal API drastically simplifies machine learning workflows and model management, especially for customers looking to solve multiple machine learning problems. With UDT, machine learning practitioners can avoid the tedious process of building a distinct model training pipeline, and maintaining separate AI software and hardware infrastructure, for each new task. Instead, they can rely on the consistent and simple UDT interface. This standardized workflow frees businesses to spend more time on high-leverage, differentiating product development.

While other deep learning frameworks require strong ML expertise and considerable time to tune models for maximum performance, UDT removes these pain points entirely. One simply invokes the UDT interface on a given dataset, and ThirdAI’s proprietary technology does the rest. We have eliminated the bottleneck of model tuning by developing new mathematical techniques for automatically selecting the optimal hyper-parameters for our innovative deep learning algorithms. As a customer, you can sit back and relax knowing that ThirdAI’s technology will select optimal hyper-parameters and perform feature engineering automatically, without any additional computational overhead.
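To illustrate the point, here is the same hypothetical interface applied to a second, tabular task; note the absence of any tuning loop or hyper-parameter search (again, the type constructors and column names are assumptions for illustration):

    from thirdai import bolt

    # Fraud detection on tabular data with the same interface as before.
    # No hyper-parameters are hand-tuned; per the description above, UDT
    # selects them and performs feature engineering automatically.
    fraud_model = bolt.UniversalDeepTransformer(
        data_types={
            "amount": bolt.types.numerical(range=(0, 10_000)),
            "merchant": bolt.types.categorical(),
            "is_fraud": bolt.types.categorical(),
        },
        target="is_fraud",
        n_target_classes=2,
    )

    fraud_model.train("transactions.csv", epochs=5)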

Dramatically Lower Cost of Model Training and Refreshing

ThirdAI’s Big Ol’ Layer Training (BOLT) algorithm is the engine powering our UDT product. Based on a decade of research advances in efficient deep learning (much of it driven by ThirdAI team members), BOLT is the only available production-ready software solution for efficiently training large neural network models with up to billions of parameters on standard CPUs, as opposed to expensive, specialized hardware like GPUs that can cost hundreds of thousands to millions of dollars to operate in the cloud.

BOLT delivers revolutionary improvements in neural network training through proprietary sparse operations that activate only a small subset of neurons for a given input, as shown in the figure below. These sparse computations allow us to sidestep the need for specialized accelerators and train massive neural networks consisting of up to billions of parameters on everyday CPU machines. Moreover, BOLT’s sparse training is extremely fast on many standard machine learning workloads, as shown in the table below. This speed allows customers to refresh models regularly and at low cost on CPU machines, ensuring they always have access to high-quality model predictions in production systems.
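The cost structure of sparse activation can be illustrated with a toy NumPy sketch. This is not BOLT’s proprietary kernel: in particular, real systems select the active neurons with a cheap, input-dependent mechanism such as locality-sensitive hashing, whereas this sketch substitutes random sampling purely to show how the arithmetic cost scales:

    import numpy as np

    def sparse_layer_forward(x, W, b, sparsity=0.05):
        """Toy forward pass that activates only a small subset of neurons."""
        n_neurons = W.shape[0]
        k = max(1, int(sparsity * n_neurons))
        # Stand-in for a hash-based lookup: pick k of n neurons to activate.
        active = np.random.choice(n_neurons, size=k, replace=False)
        out = np.zeros(n_neurons)
        # Only k rows of W are touched: O(k * d) work instead of O(n * d).
        out[active] = W[active] @ x + b[active]
        return np.maximum(out, 0.0)  # ReLU on the active neurons

    x = np.random.randn(256)            # input features
    W = np.random.randn(100_000, 256)   # a 100k-neuron layer
    b = np.zeros(100_000)
    y = sparse_layer_forward(x, W, b)   # touches ~5k neurons, not 100k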

Improve Inference by 100X or More

As explained in our other article, larger models need larger batch sizes for both speed and generalization. However, GPUs have limited memory and cannot scale to the needs of a billion-parameter network.

We found that a top-of-the-line A100 GPU with 48 GB of memory can barely accommodate a batch size of 256 for our 1.6B-parameter network. To run batch sizes of 2048, we need to distribute the model and training across 4 A100 GPUs (on a machine that costs around $70,000); even 2 A100s cannot accommodate the training.
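A rough back-of-envelope estimate makes the memory pressure concrete. The layer shapes below are illustrative assumptions chosen to total roughly 1.6B parameters (only that total comes from the text), and real frameworks add workspace buffers and fragmentation on top of these figures:

    # fp32 training-state estimate for a dense model with a 2,000-unit
    # hidden layer feeding an 800,000-class output layer (~1.6B weights).
    params = 2_000 * 800_000
    bytes_fp32 = 4

    weights  = params * bytes_fp32           # ~6.4 GB
    grads    = params * bytes_fp32           # ~6.4 GB
    adam_m_v = 2 * params * bytes_fp32       # ~12.8 GB (Adam moments)
    state_gb = (weights + grads + adam_m_v) / 1e9   # ~25.6 GB before activations

    def activation_gb(batch):
        # logits plus their gradients for the 800k-way output layer
        return 2 * batch * 800_000 * bytes_fp32 / 1e9

    print(state_gb + activation_gb(256))    # ~27 GB
    print(state_gb + activation_gb(2048))   # ~39 GB, before framework overhead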

With the BOLT Engine, we can effortlessly scale up to batch sizes of a few thousand with no change in model memory. Moreover, the BOLT experiments were run on an older dual-socket Intel Xeon v3 processor purchased for under $1,000. Running the same model on the same CPU with TensorFlow is 5x slower. More details here.
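The reason batch size scales so cheaply under sparsity follows from the same arithmetic (reusing the illustrative shapes from the estimate above): activation memory grows with the number of active neurons, not the full layer width.

    bytes_fp32 = 4
    out_dim, sparsity, batch = 800_000, 0.05, 2048

    dense_gb  = batch * out_dim * bytes_fp32 / 1e9                  # ~6.6 GB of logits
    sparse_gb = batch * int(sparsity * out_dim) * bytes_fp32 / 1e9  # ~0.33 GB
    print(dense_gb, sparse_gb)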

Immediately Production-Ready

It is estimated that 87% of AI models that are developed never reach production. This is because existing AI software optimizes only for ease of prototyping, and converting a prototype into a production system is an expensive and time-consuming effort.
We designed our Universal Deep Transformer (UDT) library with production deployment as a first priority. From raw feature processing to final model prediction, everything built in UDT is production-ready from the inception of the pipeline itself. Unlike other machine learning libraries that may require additional compilation steps or force developers to write their own serialization logic, UDT handles all of these challenges within a single save operation and allows for push-button deployment across a variety of platforms, clouds, or even on-premises environments.
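As a sketch of what this looks like in practice (method names follow ThirdAI’s public examples and should be treated as assumptions if your version differs):

    from thirdai import bolt

    model = bolt.UniversalDeepTransformer(
        data_types={"title": bolt.types.text(), "label": bolt.types.categorical()},
        target="label",
        n_target_classes=100,
    )
    model.train("train.csv", epochs=3)

    # A single call captures featurization and model weights together,
    # so no hand-written serialization code is needed.
    model.save("udt_model.bolt")

    served = bolt.UniversalDeepTransformer.load("udt_model.bolt")
    print(served.predict({"title": "wireless noise-cancelling headphones"}))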
Furthermore, since UDT can perform large-model inference in as little as 1 millisecond (link to wayfair blog), customers do not need to add model compression steps such as knowledge distillation, quantization, or pruning to their machine learning pipelines. This further simplifies the journey of taking a model into production with ThirdAI.

We have also tested UDT within several popular cloud ML model serving frameworks, validating both the robustness of our model serialization design and the performance of our models at inference time. In particular, we have successfully integrated UDT with the following industry-standard frameworks, and we anticipate that UDT will work nicely with additional popular services as well; a minimal serving wrapper is sketched after the list.

  • Databricks
  • Google Cloud Vertex AI
  • Microsoft Azure Machine Learning
  • Amazon SageMaker
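As an example of how such an integration can look, the sketch below wraps a saved UDT model behind an HTTP endpoint, much as one might do inside any of the managed serving frameworks above. FastAPI is used purely for illustration, and the UDT method names are assumptions carried over from the earlier sketches:

    from fastapi import FastAPI
    from thirdai import bolt

    app = FastAPI()
    model = bolt.UniversalDeepTransformer.load("udt_model.bolt")

    @app.post("/predict")
    def predict(sample: dict):
        # UDT consumes raw feature dicts, so no separate preprocessing
        # service is needed in front of the endpoint.
        result = model.predict(sample)
        return {"prediction": result.tolist() if hasattr(result, "tolist") else result}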

Explainable/Trustworthy AI

Deep learning is a remarkable technology that is leading the AI revolution. By its nature, however, a deep learning model is a black-box decision maker that learns patterns from data and makes accurate predictions. Cracking open this black box requires significant effort and investment in both infrastructure and engineering; it can take a company several years to build the infrastructure needed to make deep learning explainable and to monitor it in production.

ThirdAI’s production-ready UDT software is inherently designed to provide attribution and explanation for any AI decision it makes. As a result, from day one our AI models can provide root-cause analysis, explanations, and attributions for their decisions. There is no need to invest in specialized infrastructure to understand and monitor deep learning models: the attributions and analysis the models provide, on any input in any production environment, can be used both to understand individual decisions and to monitor the AI over time.
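As a sketch, per-prediction attribution might look like the following; the explain method name is an assumption based on the capability described above, so consult the UDT documentation for the exact call in your version:

    from thirdai import bolt

    model = bolt.UniversalDeepTransformer.load("udt_model.bolt")
    sample = {"title": "wireless noise-cancelling headphones"}

    prediction = model.predict(sample)
    # Hypothetical attribution call: rank input features by their
    # influence on this particular prediction.
    explanation = model.explain(sample)
    for item in explanation:
        print(item)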

Sustainable/Green AI

Traditional AI workloads require a significant amount of computation and hence produce significant carbon emissions. Accelerating AI with specialized hardware such as GPUs has made the situation worse: a GPU’s carbon emission rate is roughly 2-3x that of a commodity CPU. Existing solutions were never designed with reducing the compute and energy demands of AI algorithms in mind.

At ThirdAI, we pride ourselves on our sparsity-based AI technology, which is designed to dramatically cut down the computation associated with AI workloads, whether training or inference. As a result, our software delivers some of the most climate-friendly AI algorithms available, providing an orders-of-magnitude reduction in energy footprint. And since all of our computations run on commodity CPUs, which sit idle much of the time anyway, we are building and deploying AI with a minimal carbon footprint.

Our UDT can help achieve significantly lower carbon emissions for some of the most complex AI workloads without sacrificing accuracy.