Are we thinking hard enough about how exactly AI translates into business impact?
Let’s assume we are running an e-commerce search engine that uses machine learning on user-issued queries to identify the intended product category. Say the model in production incurs a 20ms prediction latency and has 90% accuracy.
A natural next goal from a modeling perspective would be to drive the accuracy higher, say to 95% or beyond. However, improving accuracy almost always requires more computational resources for training and may also increase inference latency. In addition, the experimentation process and the associated engineering effort can take months or even years.
Meanwhile, how does a 5-percentage-point increase in accuracy translate to the business goal? Given a user query, the search engine should drive relevance and minimize the time for product discovery. Predicting the intended category is only an intermediate step toward achieving the business objective. We may not even see improvements in the downstream business performance by driving the accuracy to 95%.
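To make the size of that gain concrete, here is a minimal sketch of the arithmetic behind moving from 90% to 95% accuracy. The function name and the batch of 1,000 queries are illustrative assumptions, not figures from a real system.

```python
def accuracy_gain(old_acc: float, new_acc: float, queries: int) -> dict:
    """Summarize what an accuracy change means for a batch of queries."""
    old_errors = round((1 - old_acc) * queries)
    new_errors = round((1 - new_acc) * queries)
    # Guard against dividing by zero when the old model made no errors.
    relative = 1 - new_errors / old_errors if old_errors else 0.0
    return {
        "point_gain": round((new_acc - old_acc) * 100, 2),  # percentage points
        "relative_error_reduction": round(relative, 4),     # share of errors removed
        "extra_correct": old_errors - new_errors,           # additional correct predictions
    }


# 90% and 95% come from the example above; 1,000 queries is a hypothetical batch size.
summary = accuracy_gain(0.90, 0.95, queries=1000)
print(summary)
```

Under these assumptions, the model misclassifies half as many queries. That is a statement about the intermediate category prediction only; it says nothing about whether shoppers find products faster, which is the metric the business cares about.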
To move the needle on the customer experience and make an impact on the e-commerce business, we should think beyond incrementally increasing the accuracy of this specific query relevance model. We could have another model that predicts the intended “price range.” It may also be a good idea to develop another AI solution to predict the intended “query attributes,” and perhaps another to predict the propensity of a shopper to buy vs. explore.
If we only invest in improving a single task, we are likely not realizing AI’s full potential. However, expanding to multiple tasks is normally difficult to achieve without hurting latency or operational cost. Such an endeavor would typically also require hiring additional experts to tackle each of these problems. We must have a broad focus to deliver business impact, but it is not scalable or sustainable to handle every new hypothesis by adding headcount. We need automated solutions.
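The latency concern can be illustrated with a rough budget calculation. In the sketch below, the 20ms figure for the category model comes from the example above; the other three latencies are hypothetical placeholders, and the two helper functions are illustrative names.

```python
def sequential_latency(latencies_ms):
    """Total latency when the models run one after another."""
    return sum(latencies_ms)


def parallel_latency(latencies_ms):
    """Latency when the models run concurrently: the slowest one dominates."""
    return max(latencies_ms)


# Only the 20 ms category model is from the text; the rest are assumed values.
models_ms = {
    "category": 20,
    "price_range": 15,
    "query_attributes": 25,
    "buy_vs_explore": 10,
}

sequential_ms = sequential_latency(models_ms.values())
parallel_ms = parallel_latency(models_ms.values())
print(f"sequential: {sequential_ms} ms, parallel: {parallel_ms} ms")
```

Running the models one after another adds their latencies together, while running them concurrently keeps latency near that of the slowest model but needs enough hardware to serve all of them at once. Either way, each extra task costs latency or operational spend unless the individual models are very cheap to serve.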
To validate a machine learning hypothesis in business settings, we need three main components: (1) data, (2) a production-ready model, and (3) infrastructure for A/B testing to select a model to deploy. While data and testing harnesses are often domain specific, the challenge of building machine learning models shares many commonalities across disciplines.
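As an illustration of the third component, the sketch below compares a control model against a candidate with a standard two-proportion z-test. This is one common way to read an A/B test, not necessarily the method any particular platform uses, and the conversion counts are made-up numbers for demonstration.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both arms convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("conversion rate is 0 or 1 in both arms; test undefined")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: control model (A) vs. candidate model (B).
z, p_value = two_proportion_z_test(conv_a=500, n_a=10_000, conv_b=560, n_b=10_000)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

A small p-value suggests the difference between the two arms is unlikely to be noise. The threshold for deploying the candidate is a business decision that depends on the cost of a wrong call.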
Moreover, an efficient model development workflow provides tremendous value. Building AI solutions with less developer effort is critical to executing more experiments, testing more hypotheses, and ultimately making a transformational business impact, as the e-commerce example above illustrates. The best way to have a good idea is to try lots of ideas.
The machine learning libraries in wide use today each come with significant limitations. Libraries for training large-scale neural networks, while effective at learning from raw, unstructured data, require access to costly specialized computing hardware, such as GPUs, and can take weeks or months to train. Furthermore, these deep learning solutions often involve costly post-processing steps to compress models into a size that can be deployed efficiently. Conversely, data science tools that run on ordinary CPU hardware are largely restricted to structured datasets, such as tables and spreadsheets, and thus cannot take advantage of the vast troves of multi-modal information available to many organizations.
Most AutoML tools are also constrained by the amount and type of information they can leverage, which leaves model capabilities sub-optimal. In the case of personalization, an ML framework should be able to handle a large number of categories and build predictive models that are conditioned on each user. Sequential information, which is almost always available in the form of timestamps, provides critical insights for building truly personalized models that are aware of ever-evolving temporal behaviors. In addition, metadata in the form of text will likely require some form of natural language processing (NLP). Existing AutoML tools typically do not handle these modalities in conjunction.
Ultimately, to unlock business value, we need an AutoML solution that addresses all of these shortcomings by: (1) training large neural network models efficiently on low-cost hardware, (2) handling multiple modalities of data, such as text, metadata, and sequential relationships, and (3) leveraging the power of large quantities of data.
This is where ThirdAI comes in.
UDT, ThirdAI’s AutoML solution, performs inference in as little as 1ms, so developers can put a promising model into production immediately, without the costly compression operations commonplace in modern deep learning pipelines.
With UDT, business stakeholders can quickly train numerous models for a variety of tasks with minimal computational costs and no tedious parameter tuning. This AutoML approach aligns AI with business objectives by enabling developers to try ideas freely and ultimately converge on the differentiating solutions. By trying more ideas at scale and quickly, organizations can unlock the true value of AI.