A self-supervised learning framework for building a recommender system even in the absence of behavioral data
For many businesses, search and recommendation systems are a core part of the customer experience. A good recommendation model can match users with a personalized product suggestion that will best meet their needs, increasing sales and conversion metrics. Recommendation models also arise in other important areas, such as support ticketing.
State-of-the-art recommender systems are notorious for needing millions of training examples to produce good results. The model must also be updatable with new information, since customer preferences are often time-varying.
To address these problems, ThirdAI has developed a cold-start solution that provides a quick and easy way to get your recommendation system running, even if you do not have months or years of prior data. Our cold-start models perform well out of the box and can be dynamically updated with new information as users interact with the system.
If we have lots of historical data, we can use previous user behavior to inform the search results. For example, if many customers purchase a specific pair of tennis shoes after searching for “gym training shoes,” then we can train a model to associate this type of query with that specific product. This supervised learning process yields good recommendations for well-established products that have been seen by many users, and many machine learning models can leverage this kind of query-product information.
However, there are many cases where we do not have much behavioral data. For example, new products won’t have been seen by many customers, so we won’t know when to recommend them. We might also have seasonal products, work in a product niche with exceptionally low sales volume, or want to start a new e-commerce store. All of these settings require cold-start recommendation, where we need to start our model without a rich dataset of query-product information.
How is it possible to make recommendations without behavioral data from users? The answer lies in the metadata. By carefully using metadata about the items in our collection, we can generate reasonable recommendations without needing to observe users as they interact with the system.
In product search, the metadata often takes the form of a detailed description or review of each product. Using our tennis shoe example from earlier, suppose we have the following description of the shoes: “Grey color men’s tennis shoes. Lightweight with excellent traction, great for workouts, endurance training and gym use.” Based on this description, we might be able to guess that these shoes would be interesting to a user searching for “gym training shoes,” even if we haven’t directly observed this query-product pair.
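To make the idea of matching queries against metadata concrete, here is a minimal sketch that ranks products by TF-IDF cosine similarity between a query and each product description. This is a deliberately simple stand-in for illustration, not ThirdAI's UDT model; the catalog contents, the product IDs, and the helper names (`tokenize`, `build_index`, `rank`) are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical catalog: product ID -> description (metadata only, no user behavior).
CATALOG = {
    "shoe-001": "Grey color men's tennis shoes. Lightweight with excellent "
                "traction, great for workouts, endurance training and gym use.",
    "boot-002": "Waterproof leather hiking boots with ankle support for rocky trails.",
    "sandal-003": "Open-toe beach sandals with a cushioned sole for summer walks.",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(catalog):
    """Build unit-length TF-IDF vectors for every description."""
    counts = {pid: Counter(tokenize(desc)) for pid, desc in catalog.items()}
    n_docs = len(catalog)
    doc_freq = Counter()
    for c in counts.values():
        doc_freq.update(c.keys())
    # Smoothed inverse document frequency: rarer tokens get larger weights.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = {}
    for pid, c in counts.items():
        vec = {t: n * idf[t] for t, n in c.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors[pid] = {t: w / norm for t, w in vec.items()}
    return vectors, idf


def rank(query, vectors, idf, top_k=3):
    """Return (product_id, cosine_similarity) pairs, best match first."""
    q_counts = Counter(t for t in tokenize(query) if t in idf)
    q_vec = {t: n * idf[t] for t, n in q_counts.items()}
    q_norm = math.sqrt(sum(w * w for w in q_vec.values()))
    if q_norm == 0:
        return []  # no query token appears anywhere in the catalog
    scores = {
        pid: sum(w * vec.get(t, 0.0) for t, w in q_vec.items()) / q_norm
        for pid, vec in vectors.items()
    }
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


vectors, idf = build_index(CATALOG)
results = rank("gym training shoes", vectors, idf)
print(results)
```

With this toy catalog, the tennis shoes rank first for “gym training shoes” because the description shares the tokens “gym,” “training,” and “shoes” with the query, even though no user has ever issued that query. A learned model can go further than exact token overlap, but the principle of recommending from metadata alone is the same.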
As users interact with the system, we gain access to new data that can be used to refine the system. This allows us to bootstrap the system; by observing how users interact with the current model, we gather data that can be used to train a better model. Bootstrapping is a strong advantage of our cold-start modeling approach because we can update the model with new data at any time.
Bootstrapping stands in contrast to unsupervised indexing methods, such as Elasticsearch. These methods can produce recommendations via keyword search or via semantic search over pre-trained word embeddings. However, they cannot consume new data or dynamically change their recommendations based on recent observations of user behavior.
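The bootstrapping loop described above can be sketched in a few lines. The example below starts from metadata-only scores and then folds in observed clicks, so that recommendations change as behavioral data arrives. It is a toy illustration under stated assumptions, not ThirdAI's implementation: the `BootstrappedRecommender` class, its scoring rule (token overlap plus a click-share term), and the `feedback_weight` parameter are all hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


class BootstrappedRecommender:
    """Metadata-only scores at cold start, refined by click feedback over time."""

    def __init__(self, catalog, feedback_weight=1.0):
        self.item_tokens = {pid: tokenize(desc) for pid, desc in catalog.items()}
        # clicks[token][product_id] = number of observed clicks for that pairing
        self.clicks = defaultdict(lambda: defaultdict(int))
        self.feedback_weight = feedback_weight

    def score(self, query, pid):
        q = tokenize(query)
        if not q:
            return 0.0
        # Cold-start signal: fraction of query tokens found in the description.
        overlap = len(q & self.item_tokens[pid]) / len(q)
        # Behavioral signal: share of past clicks per query token that went to pid.
        click_share = 0.0
        for t in q:
            per_item = self.clicks.get(t)
            if per_item:
                click_share += per_item.get(pid, 0) / sum(per_item.values())
        return overlap + self.feedback_weight * click_share / len(q)

    def recommend(self, query, top_k=3):
        """Rank product IDs by score; ties are broken alphabetically."""
        ranked = sorted(self.item_tokens, key=lambda pid: (-self.score(query, pid), pid))
        return ranked[:top_k]

    def record_click(self, query, pid):
        """Update the model with one observed (query, clicked product) pair."""
        for t in tokenize(query):
            self.clicks[t][pid] += 1


catalog = {
    "shoe-001": "Grey color men's tennis shoes, great for gym use.",
    "boot-002": "Waterproof leather hiking boots for rocky trails.",
}
model = BootstrappedRecommender(catalog)
before = model.recommend("sneakers")       # no metadata match, no feedback yet
model.record_click("sneakers", "shoe-001")
after = model.recommend("sneakers")        # feedback now favors shoe-001
print(before, after)
```

In this sketch the query “sneakers” matches no description, so the initial ranking is arbitrary (alphabetical). After a single recorded click, the tennis shoes move to the top. A static keyword index has no equivalent of `record_click`, which is the contrast drawn above.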
We introduced the cold-start problem in the context of product recommendation, but our solution is much more general. Broadly speaking, our approach applies to any recommendation task where we wish to identify a small number of relevant items from a collection based on a user-issued textual query, and we have metadata about each item in the form of text. Here are just a few possibilities:
ThirdAI provides a solution for cold-start recommendation that works well for large and complex recommendation problems. By using ThirdAI’s Universal Deep Transformer (UDT) engine, we are able to efficiently cold-start and deploy recommender models on any standard CPU. Our system performs well out of the box, can be dynamically updated with new data, and easily scales to collections with millions of items.
ThirdAI’s solution for cold-start recommendation uses our Universal Deep Transformer (UDT) modeling toolkit. Using UDT we can set up a robust and performant recommendation model with just a few lines of code:
The model is ready to answer queries after seeing only the product catalog. As more supervised data is generated, a simple train call on that data further improves the model’s performance on real examples.