Improvements & Evaluations
KPIs
Key performance indicators (KPIs) are the metrics a company uses to evaluate its own performance. To show how recommendations impact a business, measuring against that business's own KPIs is the natural choice: they capture, in quantifiable terms, the quality of the user experience.
Typical KPIs assess the impact on sales and revenue, user engagement (e.g., click-through rate), conversion rates, and user retention. The exact definition and relevance of these KPIs depend on the domain and the business model of the customer.
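As an illustration, the KPIs above can be computed directly from an event log. This is a minimal sketch, not a Crossing Minds API: the event structure and field names (`"event"`, `"revenue"`) are assumptions for the example.

```python
def compute_kpis(events):
    """Compute CTR, conversion rate, and AOV from a list of event dicts.

    Assumed schema (illustrative only): each event is a dict with an
    "event" key ("impression", "click", or "purchase"); purchase events
    also carry a "revenue" key.
    """
    impressions = sum(1 for e in events if e["event"] == "impression")
    clicks = sum(1 for e in events if e["event"] == "click")
    purchases = [e for e in events if e["event"] == "purchase"]

    # Click-through rate: clicks per recommendation impression.
    ctr = clicks / impressions if impressions else 0.0
    # Conversion rate: purchases per click.
    conversion_rate = len(purchases) / clicks if clicks else 0.0
    # Average order value: mean revenue per purchase.
    aov = sum(e["revenue"] for e in purchases) / len(purchases) if purchases else 0.0
    return {"ctr": ctr, "conversion_rate": conversion_rate, "aov": aov}
```

In practice each business would plug in its own event schema and add the retention and engagement metrics relevant to its domain.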
One of the main complications with today's recommendation engines is that there is no clear or definitive correlation between how well a model performs offline and how well it will perform live for customers.
This gap partly reflects how impractical much academic research is for real-time recommendation, but it also means that offline gains too often fail to translate into improvements in a business's KPIs. To close this gap, several iterations and proper qualitative analysis are needed so that machine learning experts can fine-tune the models until they truly move the KPIs.
It is also important to understand that not all KPIs matter equally to every customer, and some are in tension with one another. For instance, an entertainment company will optimize recommendations to retain and engage users, while an e-commerce marketplace will not necessarily optimize for "time spent online" but instead for the average order value (AOV). Both are important metrics; however, a model tuned for the first objective will not necessarily perform as well for direct sales.
To support this iteration, we need to build, set up, and scale A/B test pipelines that compare models and incrementally improve the recommendation pipeline.
A/B Test
One of the best ways to compare models, learn, and improve recommendations is to run an A/B test: serving different variants of the same use case to randomly assigned groups of users.
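Random assignment is usually implemented by hashing the user ID, so that each user's variant looks random yet stays stable across sessions. This is a generic sketch under that assumption, not Crossing Minds' actual assignment logic; the function and parameter names are illustrative.

```python
import hashlib

def assign_variant(user_id, experiment, variants=("A", "B")):
    """Deterministically assign a user to a test variant.

    Hashing the experiment name together with the user ID makes the
    assignment stable for a given user (no flip-flopping between
    sessions) while remaining statistically independent across
    different experiments.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]
```

Because the assignment is a pure function of the inputs, no per-user state needs to be stored to keep groups consistent for the duration of the test.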
There are two kinds of A/B tests we can employ to evaluate the success of the recommendation pipeline.
- The first is an external or customer A/B test, where Crossing Minds is compared head-to-head against another recommendation solution — generally one built in-house or a third-party solution the customer already uses. While harder to organize, since the burden of data collection falls on the prospective customer, this gives us the opportunity to prove the value of our solution, and it typically happens during pre-implementation.
- Internal A/B tests are done with continuous improvements in mind. Two different recommendation strategies (comprising Crossing Minds models and business rules) are tested against one another to find the best option. Conducting these tests on a regular basis allows CM to constantly evaluate where we can provide more value and better KPIs to customers.