Like humans, machines need to continually learn from non-stationary information streams. While this is a natural skill for humans, it’s challenging for AI systems built on neural networks.
One inherent problem in artificial neural networks is catastrophic forgetting: when a network is trained on new data, it tends to abruptly lose what it learned from earlier data. Deep learning researchers are working extensively to solve this problem in their pursuit of AI agents that can continually learn like humans.
Research in continual learning, and AI in general, has drawn inspiration from human intelligence and computational neuroscience. Researchers aim to bridge the gap between the cognitive sciences and modern deep learning, and a key objective on the path toward Artificial General Intelligence (AGI) is to reproduce the brain's capacity for continual learning in artificial neural networks.
So, let’s take a look.
Continual Learning refers to the ability to learn from non-stationary information streams incrementally.
“Non-stationary” means that the data distribution keeps changing over time.
“Incremental” learning refers to preserving previous knowledge while continuously learning new information.
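To make these two terms concrete, here is a minimal sketch of what goes wrong when a model learns from a non-stationary stream without preserving earlier knowledge. Everything in it is an assumption made for illustration: the one-parameter model, the two toy "distributions" (`task_a` and `task_b`), and the helper names are hypothetical, and the two tasks conflict by construction so that the effect is easy to see.

```python
import random


def make_task(slope, n, rng):
    """Return n (x, y) pairs from y = slope * x, a toy stand-in for one data distribution."""
    return [(x, slope * x) for x in (rng.uniform(-1.0, 1.0) for _ in range(n))]


def mse(w, data):
    """Mean squared error of the one-parameter model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def sgd(w, data, lr=0.1, epochs=20):
    """Plain stochastic gradient descent with no mechanism to protect old knowledge."""
    for _ in range(epochs):
        for x, y in data:
            w -= lr * 2.0 * (w * x - y) * x
    return w


rng = random.Random(0)
task_a = make_task(2.0, 50, rng)    # first distribution in the stream
task_b = make_task(-1.0, 50, rng)   # the distribution shifts

w = sgd(0.0, task_a)
error_a_before = mse(w, task_a)     # error on task A right after learning it

w = sgd(w, task_b)                  # keep training, now only on task B
error_a_after = mse(w, task_a)      # error on task A after the shift

print(f"task A error before shift: {error_a_before:.6f}")
print(f"task A error after shift:  {error_a_after:.6f}")
```

The error on the first task rises sharply after training on the second, because the single parameter is simply overwritten. Real networks have many parameters and tasks rarely conflict this directly, but the same overwriting of shared parameters is what incremental learning has to avoid.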
For example, an AI image classifier for self-driving vehicles is trained on a data distribution of cars. The model is continuously exposed to different images of different vehicle form factors, models and types. While the model can learn to classify vehicles of different sizes with high accuracy, it must also correctly classify other objects visible in an open road environment, including pedestrians, trees, road signs, traffic lights and road blocks.
At the time of inference — where the AI model needs to make an intelligent decision to classify objects in its peripheral view — the model should retain all of its previously learned knowledge.
In order to achieve this goal, continual learning requires the following key characteristics:
Continual learning AI systems can adapt to learn new data distributions without requiring significant (re)training on new datasets.
In a real-world setting, information about the surroundings can change rapidly. Artificial neural networks can suffer from loss of plasticity: they gradually become less able to change their predictions in response to new data. (The term is borrowed from neural plasticity in the human brain, which refers to the capacity of the nervous system to modify its structure and functionality.)
Continual learning systems are expected to adapt quickly while losing as little as possible of their plasticity, that is, their ability to learn from new information.
Continual learning can take advantage of task and context similarity between related learning tasks. When training a neural network on one task also improves its performance on a related task, the effect is called positive transfer.
Humans behave similarly: an athlete who has excelled at one format of a sport can also perform well and compete in other related sports.
Another desirable property of continual learning models is the ability to perform well without being told which task an input belongs to, or when the task switches during training.
For example, a model training to classify cars should be able to recognize that an airplane belongs to a different data distribution, despite similarities such as wheels and windows.
AI models train on large datasets. These datasets contain noise: unwanted signal errors in an image, sound or video stream that are not part of the data sample itself. Noise is common in sensor data, where fluctuations in the surrounding environment or in the device itself distort the information picked up from the source.
Continual learning models should be able to learn the true data distribution without the noise components added to it.
While a sufficiently large AI model trained on large datasets can learn to generalize well across multiple data distributions, it is not necessarily the most sustainable, cost-effective and resource-efficient method. Continual learning models should be compact and resource-efficient in terms of:
Storage
Computing
Energy requirements
To better understand the importance of AI ethics in business, we spoke with Dan Corbin, an instructor at Pragmatic Institute.
Dan has more than 25 years of experience in product management, mentoring, and professional coaching, and has led product teams around the world. His experience managing a variety of products in diverse industries informs his insight into what makes an effective product team. Dan’s past roles include Senior Systems Analyst at Akin Gump Strauss Hauer & Feld LLP, Vice President of Operations and Chief Product Owner at TrialSmith, and Sr. Director of Product Management at Return Path. Dan’s passion for teaching, coaching, and mentorship shines through in his work as an Instructor at Pragmatic Institute, enabling him to guide students to strategic solutions for their most complex product and AI questions.
In this section, we've included Dan's responses to our prompts.
A significant challenge for business professionals lies in the misconception that integrating artificial intelligence into their workflow is a straightforward solution that can automate complex decision-making processes. Organizations must recognize that AI tools, while powerful, require a sophisticated level of oversight.
Successful organizations encourage their staff to adopt a growth mindset and they provide various training opportunities for employees. Similarly, for AI tools to evolve and improve, a structured approach is essential. Fortunately, numerous techniques can facilitate this process.
Strategies such as monitoring performance metrics, using diverse and high-quality inputs, incorporating few-shot prompting for context, providing feedback on outputs, and reporting errors or inconsistencies can enhance AI over time. Additionally, companies can leverage Retrieval-Augmented Generation (RAG) to mitigate the limitations of AI. RAG improves AI models by enhancing their ability to access and integrate the latest information from extensive datasets, ensuring up-to-date and accurate outputs.
Each group has much to learn from the other, but if I had to choose one area where product management could benefit AI research, it would be in deeply understanding user needs.
Product managers are customer-centric, employing various market research methods to explore customer pain points, preferences, and behaviors. AI researchers should consider these techniques when developing new tools to ensure their models and algorithms address real-world problems. Examples of product management techniques that AI researchers could adopt include user interviews, observations, ethnographic research, user personas, user scenarios, and use cases. Even with highly technical AI products, it’s essential to start with customer problems and work backward.
When determining how to implement continual learning principles, companies should assess their AI model's current performance, the rate of change in their data environment, and their long-term product goals. For instance, if a product requires a high level of personalization for customers, continual learning will be a crucial part of the development strategy. This approach is equally important if the data environment is dynamic or if quickly incorporating customer feedback is essential.
Teams building AI-powered products need to understand these principles and be prepared to apply them proactively to prevent AI performance degradation. Address continual learning by providing regular feedback on the model, correcting errors and inconsistencies, and enhancing the underlying data set. Most digital product teams incorporate product instrumentation into their beta versions to gather feedback and measure progress as early as possible. Similarly, you should provide early feedback to the AI systems driving your products.
Teams developing AI-powered products must conduct thorough risk mitigation and understand both the capabilities and limitations of AI. Continual learning can be a crucial part of that risk mitigation strategy. Leveraging continual learning adds adaptability to your model, enhancing its performance over time. As market conditions, user needs, and the data powering your AI evolve, teams and companies will face unforeseen challenges. Maintaining a robust yet adaptable model will enable teams to better manage these changes and uncertainties.
So how do you train a continual learning model?
One of the most common approaches is replay-based continual learning. In this approach, the model is periodically re-exposed to data from previous distributions while it trains on new data, so that it does not catastrophically forget them.
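As a rough illustration of the replay idea, the sketch below keeps a small, fixed-size memory of past examples and mixes a few of them into every new training batch. It is one possible design, not a reference implementation: the `ReplayBuffer` class, the `mixed_batches` helper, the buffer capacity and the batch sizes are all hypothetical, and reservoir sampling is just one common way to decide which old examples to keep.

```python
import random


class ReplayBuffer:
    """Fixed-size memory of past examples, filled by reservoir sampling."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0  # total number of examples offered so far
        self._rng = random.Random(seed)

    def add(self, example):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(example)
            return
        # Keep each example seen so far with equal probability capacity / seen.
        slot = self._rng.randrange(self.seen)
        if slot < self.capacity:
            self.items[slot] = example

    def sample(self, k):
        """Return up to k stored examples, chosen without replacement."""
        return self._rng.sample(self.items, min(k, len(self.items)))


def mixed_batches(stream, buffer, batch_size=4, replay_size=4):
    """Yield batches of new examples followed by replayed old ones."""
    fresh = []
    for example in stream:
        fresh.append(example)
        if len(fresh) == batch_size:
            yield fresh + buffer.sample(replay_size)
            for item in fresh:  # store new examples only after using them
                buffer.add(item)
            fresh = []
    if fresh:  # final partial batch, if the stream length is not a multiple
        yield fresh + buffer.sample(replay_size)


# Toy stream: the data distribution shifts from "cars" to "road_signs".
stream = [("cars", i) for i in range(20)] + [("road_signs", i) for i in range(20)]
buffer = ReplayBuffer(capacity=8)
batches = list(mixed_batches(stream, buffer))
```

Each batch would then be passed to whatever training step the model uses. Because the buffer has a fixed capacity, memory cost stays constant no matter how long the stream runs, at the price of only keeping a sample of the past.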
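Another popular approach is parameter regularization. This method imposes constraints on the model parameters while the model trains on new data. In continual learning, the constraint typically penalizes changes to parameters that were important for earlier tasks, which also encourages the model to learn simpler and more generalizable representations of the data distribution.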
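One way such a constraint is often written in continual learning is as a quadratic penalty that pulls each parameter back toward the value it had after earlier training, weighted by how much that parameter mattered. The sketch below shows that penalty in the style of elastic weight consolidation (EWC). The article does not name a specific method, so treat this as an assumed example: the function names, the `importance` weights (which EWC would estimate from the Fisher information) and the `strength` factor are illustrative.

```python
def penalized_loss(task_loss, params, old_params, importance, strength=1.0):
    """New-task loss plus a penalty for moving important parameters.

    penalty = (strength / 2) * sum_i importance[i] * (params[i] - old_params[i]) ** 2
    """
    penalty = sum(
        weight * (new - old) ** 2
        for new, old, weight in zip(params, old_params, importance)
    )
    return task_loss + 0.5 * strength * penalty


def penalized_gradient(task_grad, params, old_params, importance, strength=1.0):
    """Gradient of penalized_loss with respect to each parameter."""
    return [
        grad + strength * weight * (new - old)
        for grad, new, old, weight in zip(task_grad, params, old_params, importance)
    ]


# Hypothetical values: the first parameter mattered a lot for the old task,
# the second one much less.
old_params = [1.0, 0.0]
importance = [5.0, 0.5]
params = [1.0, 2.0]  # the second parameter has drifted during new training

loss = penalized_loss(1.0, params, old_params, importance, strength=2.0)
grad = penalized_gradient([0.0, 0.0], params, old_params, importance, strength=2.0)
```

Parameters with a high importance weight are expensive to move, so the optimizer is steered toward changing the ones the earlier tasks did not rely on. Unlike replay, this stores no old data, only the old parameter values and one weight per parameter.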
Recent advances involve adding context to the model architecture itself: different parts of the neural network are tuned to perform well on different tasks and data distributions. The end-to-end network may comprise a set of smaller expert models, each specializing in a distinct task.
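A very small sketch of this idea is a model with a shared part and one task-specific "head" per task, where adding a head for a new task leaves the existing heads untouched. The `MultiHeadModel` class, the `toy_features` function and the task names are hypothetical, and a real system would use learned network layers rather than hand-written weights.

```python
class MultiHeadModel:
    """Shared feature extractor plus one small expert head per task."""

    def __init__(self, shared_features):
        self.shared_features = shared_features  # callable: input -> list of floats
        self.heads = {}                         # task id -> list of head weights

    def add_head(self, task_id, weights):
        """Attach an expert for a new task without modifying existing experts."""
        if task_id in self.heads:
            raise ValueError(f"head for task {task_id!r} already exists")
        self.heads[task_id] = list(weights)

    def predict(self, task_id, x):
        """Route the input to the expert that matches the given task."""
        if task_id not in self.heads:
            raise KeyError(f"no expert head for task {task_id!r}")
        feats = self.shared_features(x)
        return sum(w * f for w, f in zip(self.heads[task_id], feats))


def toy_features(x):
    """Stand-in for a shared network trunk."""
    return [1.0, x, x * x]


model = MultiHeadModel(toy_features)
model.add_head("vehicles", [0.0, 1.0, 0.0])
before = model.predict("vehicles", 3.0)

model.add_head("road_signs", [1.0, 0.0, 2.0])  # learn a new task later
after = model.predict("vehicles", 3.0)         # old expert is unaffected
```

Note that `predict` has to be told which task an input belongs to. That requirement is the practical limitation of this family of approaches.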
An obvious assumption here is that we have sufficient knowledge of the task itself. In real-world scenarios, that is not always the case.
For example, a self-driving car is likely to encounter objects that it has never observed before. Forcing such objects into one of its known classes, and learning from the result, may only contribute to catastrophic forgetting.
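One simple heuristic for this situation, which the article does not prescribe and which is shown here only as an assumed example, is to flag an input as unknown when the classifier's highest softmax probability falls below a threshold, instead of forcing it into a known class. The function names, the labels and the threshold of 0.8 are hypothetical, and this baseline is known to be unreliable when a model is overconfident.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_or_flag(logits, labels, threshold=0.8):
    """Return (label, confidence), or ("unknown", confidence) below the threshold."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "unknown", probs[best]
    return labels[best], probs[best]


labels = ["car", "pedestrian", "road_sign"]

confident = classify_or_flag([5.0, 0.0, 0.0], labels)  # one class clearly dominates
uncertain = classify_or_flag([1.0, 0.9, 1.1], labels)  # no class stands out
```

Inputs flagged as unknown can be set aside for review or for learning a new class later, rather than being used to update an existing one.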
See an error or have a suggestion? Please let us know by emailing ssg-blogs@splunk.com.
This posting does not necessarily represent Splunk's position, strategies or opinion.