During the past six months we have witnessed some incredible developments in AI. The release of Stable Diffusion changed the art world forever, and ChatGPT shook up the internet with its ability to write songs, mimic research papers, and provide thorough, seemingly intelligent answers to questions people would otherwise ask Google.
This advancement in generative AI provides further evidence that we are on the verge of an AI revolution.
However, most of these generative AI models are foundation models: high-capacity systems trained on massive amounts of data without supervision, at a cost of millions of dollars in computing power. Currently, only well-funded institutions with access to massive amounts of GPU power are able to build these models.
The majority of companies developing the application-layer AI that is driving widespread adoption of the technology still rely on supervised learning, using large amounts of labeled training data. Despite the impressive performance of base models, we are still in the early days of the AI revolution and numerous bottlenecks are holding back the spread of AI at the application layer.
Downstream from the well-known data labeling problem, there are additional data bottlenecks that will hamper later-stage AI development and deployment in production environments.
These issues are why, despite the early promise and a deluge of investment, technologies such as self-driving cars have been "just a year away" every year since 2014.
These exciting proof-of-concept models perform well on benchmark datasets in research environments, but they struggle to make accurate predictions when released into the real world. A major problem is that the technology struggles to meet the higher performance threshold required in high-stakes production environments, falling short on key metrics for robustness, reliability, and maintainability.
For example, these models often cannot handle outliers and edge cases, causing self-driving cars to mistake reflections of bicycles for actual bicycles. They are not reliable or robust, so a robot barista makes a perfect cappuccino two times out of five, but spills the cup the other three times.
As a result, the AI production gap, the divide between “that’s neat” and “that’s useful,” is much wider and more formidable than ML engineers initially anticipated.
Counterintuitively, the best systems also have the most human interaction.
Fortunately, as more and more ML engineers embrace a data-centric approach to AI development, the implementation of active learning strategies is on the rise. The most advanced companies will use this technology to bridge the AI production gap and build models that perform reliably in the wild.
What is Active Learning?
Active learning makes training a supervised model an iterative process. The model trains on an initial subset of labeled data from a large dataset. It then tries to make predictions about the rest of the unlabeled data based on what it has learned. ML engineers evaluate how confident the model is in its predictions and, using a variety of acquisition functions, can quantify the performance benefit of annotating any given unlabeled example.
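One common acquisition function scores each unlabeled example by the entropy of the model's predicted class probabilities: the more evenly the probability mass is spread across classes, the less certain the model is, and the more informative a label for that example is likely to be. Here is a minimal sketch in NumPy; the probability values are made up for illustration:

```python
import numpy as np

def entropy_acquisition(probs: np.ndarray) -> np.ndarray:
    """Score each example by the entropy of its predicted class
    probabilities; higher score = more uncertain = more informative."""
    eps = 1e-12  # avoid log(0)
    return -np.sum(probs * np.log(probs + eps), axis=1)

# Three unlabeled examples with predicted probabilities over two classes.
probs = np.array([
    [0.99, 0.01],  # confident prediction -> low score
    [0.50, 0.50],  # maximally uncertain -> highest score
    [0.80, 0.20],  # somewhat confident -> middling score
])
scores = entropy_acquisition(probs)
# Index of the example whose annotation would help most:
most_informative = int(np.argmax(scores))  # -> 1
```

Other popular choices, such as least-confidence or margin sampling, follow the same pattern: turn the model's predictive distribution into a single "how much would a label here help?" score.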
By expressing uncertainty in its predictions, the model itself decides which additional data would be most useful for its training. In effect, it asks annotators to provide more examples of that specific type of data so that it can train more intensively on that subset during the next round of training. Think of it like quizzing a student to find out where their knowledge gaps are. Once you know which topics they are struggling with, you can give them textbooks, presentations, and other materials so they can focus their learning on that particular aspect of the subject.
With active learning, training a model changes from a linear process to a circular process with a strong feedback loop.
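The feedback loop described above can be sketched end to end. This is a toy illustration, not a production recipe: the synthetic two-blob dataset, the seed size, the query budget, and the least-confidence acquisition rule are all assumptions chosen to keep the example small, and the "annotator" is simulated by looking up the known label.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic pool: two Gaussian blobs standing in for a real dataset.
X_pool = np.vstack([rng.normal(-1, 1, (200, 2)), rng.normal(1, 1, (200, 2))])
y_pool = np.array([0] * 200 + [1] * 200)

# Start from a small labeled seed; everything else is "unlabeled".
labeled = list(range(0, 400, 80))          # 5 seed examples
unlabeled = [i for i in range(400) if i not in labeled]

model = LogisticRegression()
for _ in range(10):                        # 10 query rounds
    # 1. Train on the currently labeled subset.
    model.fit(X_pool[labeled], y_pool[labeled])
    # 2. Score the unlabeled pool by least confidence: the example
    #    whose top predicted probability is lowest is queried next.
    probs = model.predict_proba(X_pool[unlabeled])
    query = unlabeled[int(np.argmin(probs.max(axis=1)))]
    # 3. "Ask the annotator" for its label (simulated here) and
    #    move it into the labeled set for the next round.
    labeled.append(query)
    unlabeled.remove(query)

accuracy = model.score(X_pool, y_pool)
```

Each pass through the loop retrains, queries, and annotates, which is exactly the circular process with a feedback loop described above, in contrast to a single linear train-once pass over a fixed labeled set.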
Why advanced companies need to be ready to leverage active learning
Active learning is fundamental to closing the prototype production gap and increasing model reliability.
It is a common mistake to think of AI systems as static pieces of software; in reality, these systems must constantly learn and evolve. If they don't, they repeat the same mistakes, or, when released into the wild, they encounter new scenarios, make new mistakes, and never get a chance to learn from them.