To get the most out of large language models (LLMs), enterprises must customize them by fine-tuning them using domain-specific data. This helps polish a model so that it generates relevant outputs.
However, fine-tuning pre-trained models poses a significant, potentially dangerous problem: Training a model on a distribution that differs from its original training data reorients its weights toward the new inputs.
As fine-tuning proceeds, the model ultimately loses much of the information it memorized during its original training, a phenomenon known as “catastrophic forgetting.” This degrades the LLM’s knowledge and reasoning skills, and with them its performance and usability.
Voice AI agent company Tenyx is today announcing a fine-tuning method to help address this glaring problem. The platform helps businesses adapt LLMs to their unique requirements without compromising foundational knowledge or protective safeguards.
“Catastrophic forgetting has been a longstanding challenge within the machine learning community,” Itamar Arel, Tenyx CEO and founder, told TechForgePulse in an exclusive interview. “For many years, the assumption has been that one can always continue to train on new data while also including the old.”
Loss of critical capabilities, exposure to harmful and biased content
Fine-tuning is gaining traction as a crucial tool within the “arsenal of methods aimed at harnessing LLMs” for enterprise use cases, Arel said.
However, data scientists fine-tuning LLMs via standard techniques typically do not have access to the full dataset on which the model was originally trained, and conventional schemes do not address the risk of forgetting effects. This leads to the loss of critical capabilities, as well as potential exposure to harmful content (which can pose legal liability).
For instance, Arel said, LLaMA 7B may be used as an engine for a customer service chatbot such as a hotel reservations agent. But since it is off-the-shelf and not optimized for this specific domain, data scientists need to fine-tune it based on, say, a set of typical conversations between human agents and customers seeking to book a hotel room. This will likely use conventional fine-tuning techniques such as Low-Rank Adaptation (LoRA).
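As an illustration, a conventional LoRA fine-tune of this kind might look like the following minimal sketch, built on the open-source Hugging Face Transformers, PEFT and Datasets libraries. The model name, dataset file and hyperparameters are placeholders for the hotel-booking scenario Arel describes, not anyone's production configuration:

```python
# Illustrative sketch of a conventional LoRA fine-tune with Hugging Face libraries.
# The model name, dataset file and hyperparameters are placeholders.
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from peft import LoraConfig, get_peft_model
from datasets import load_dataset

base = "meta-llama/Llama-2-7b-hf"                      # off-the-shelf base model
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token              # Llama has no pad token by default
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA injects small low-rank adapter matrices into the attention projections;
# only these adapter weights are trained, which keeps memory and compute low.
lora_cfg = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                      target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora_cfg)

# Hypothetical corpus of agent-customer hotel-booking dialogues.
data = load_dataset("json", data_files="hotel_conversations.json")["train"]
data = data.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
                batched=True, remove_columns=data.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="lora-hotel-agent", num_train_epochs=3,
                           per_device_train_batch_size=4, learning_rate=2e-4),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()  # adapts the model to the booking domain, and risks forgetting
```

Nothing in this recipe constrains how the adapted weights affect the model's behavior outside the booking domain, which is where the forgetting problem enters.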
Very quickly, both knowledge (like the answer to ‘What’s the distance from the hotel to the airport?’) and reasoning capabilities (for example, correctly drawing inferences from statements such as ‘I’ll be arriving on Dec. 7 for four nights’) may be lost.
“The resulting fine-tuned model may respond better to specific inputs, but may suddenly produce wrong or potentially biased answers regarding general knowledge and reasoning tasks,” said Arel.
In another scenario, an LLM is trained with a corpus of English sentences, making it capable of reasoning and answering general knowledge questions. Later fine-tuning on structurally and syntactically different coding language datasets will alter how the model captures information, transforms it and outputs new information.
“Such a change will cause the network to lose its abilities in producing 100% coherent English language statements,” said Arel.
Limitations of LoRA
The parameter-efficient fine-tuning technique LoRA has been widely adopted due to its low memory and computational requirements.
However, Arel explained, it was never intended to mitigate catastrophic forgetting. When weights are updated during training on a data distribution that doesn’t match the original training data, the resulting distortions are difficult to predict.
“Our results show that although LoRA is computationally efficient, it suffers from the same drawbacks when it comes to inducing memory and reasoning loss,” said Arel.
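Conceptually, LoRA leaves the pre-trained weight matrix W frozen and adds a learned low-rank correction, so the effective weight at inference becomes W + (α/r)·BA. Nothing in that update is restricted to the fine-tuning distribution, which is one way to see the distortion Arel describes. The toy sketch below, with arbitrary values, makes the point:

```python
# Toy numerical illustration (not LoRA training code): merging a low-rank update
# into a frozen weight shifts the layer's output for *every* input, including
# inputs drawn from the original pre-training distribution.
import torch

d, r, alpha = 16, 2, 16              # hidden size and LoRA rank (arbitrary toy values)
W = torch.randn(d, d)                # frozen pre-trained weight
A, B = 0.1 * torch.randn(r, d), 0.1 * torch.randn(d, r)   # learned LoRA factors

delta = (alpha / r) * (B @ A)        # low-rank weight delta merged at inference time
x_old = torch.randn(d)               # stand-in for an input from the original distribution

# Non-zero in general: the layer's old behavior is perturbed, and nothing in the
# update itself bounds how large that perturbation becomes.
print(torch.norm(W @ x_old - (W + delta) @ x_old))
```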
Model complexity also makes it difficult to identify and fix distortions. Furthermore, fine-tuning through LoRA and other existing methods can weaken or outright remove safety measures established via reinforcement learning from human feedback (RLHF), which is vital for preventing biased and harmful model outputs.
“It is crucial to note that RLHF is also a training procedure,” said Arel, “and as such is affected as much as the knowledge and reasoning capabilities during fine-tuning.”
Existing mitigation processes inconsistent, unreliable
One current approach to mitigating catastrophic forgetting relies on large numbers of machine learning (ML) engineers tasked with limiting fine-tuning as much as possible and leaning on prompt engineering to achieve the desired performance.
However, this process is unreliable and inconsistent across models, and there is (at least to date) no understanding of how, why and when it works. Evaluations that quantify the knowledge, reasoning capability and safety of fine-tuned models are also run while fine-tuning takes place, so the process can be “early-stopped” at the best point in time.
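In practice, that manual loop amounts to something like the following sketch: periodically score the model on held-out knowledge, reasoning and safety suites and stop as soon as general capability drifts too far from the baseline. The training step, scoring callables and tolerance here are placeholders, not a real library API:

```python
# Hypothetical early-stopping loop, not a real library API. `train_step` and the
# scoring callables are placeholders for whatever harness a team actually uses.

def fine_tune_with_early_stop(model, train_batches, eval_suites, max_drop=0.02):
    # eval_suites: dict mapping suite name -> callable returning a score in [0, 1]
    baseline = {name: score(model) for name, score in eval_suites.items()}
    best_state = model.state_dict()                      # last checkpoint that passed

    for step, batch in enumerate(train_batches):
        model.train_step(batch)                          # one fine-tuning update (placeholder)
        if step % 100 == 0:
            current = {name: score(model) for name, score in eval_suites.items()}
            # Stop as soon as any general-capability score drops beyond tolerance.
            if any(baseline[n] - current[n] > max_drop for n in baseline):
                model.load_state_dict(best_state)        # roll back to the last good point
                break
            best_state = model.state_dict()
    return model
```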
“These solutions are costly, require manual work by ML engineers and are time-consuming,” Arel asserted. “There are no known ways to automate this human-intense process.”
Tenyx exhibits significant gains in safety, proficiency and knowledge
The Tenyx fine-tuning method attempts to determine the subset of model parameters that can be updated so that learning on the new data takes place, while at the same time, the model retains almost all prior learned input-output mappings, Arel explained.
The platform then projects the updates made to those neurons during fine-tuning into a space where they do not disturb how the neurons capture information from the pre-training data distribution.
“In other words, by analyzing a trained LLM, our method is able to determine how and which of the billions of weights can be updated so that minimal to no catastrophic forgetting takes place as learning on the new data is achieved,” said Arel.
Tenyx’s platform is based on a novel mathematical interpretation of the geometric representations formed during initial LLM training, he explained. It captures the geometry of the data represented within the transformer networks that power today’s LLMs.
This geometric interpretation allows Tenyx to select a subset of the network weights and constrain updates of selected neurons, “with strong guarantees that effectively all previously learned information is retained,” said Arel.
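Tenyx has not published the details of its algorithm, but the broader family of projection-constrained updates it alludes to can be illustrated generically: gradient steps for a layer are projected onto a subspace that activations from the original data barely occupy, so the layer's outputs on those old inputs stay approximately fixed while learning proceeds on the new data. The sketch below is that generic illustration only, not Tenyx's method; the function names, threshold and shapes are invented for the example:

```python
# Generic illustration of projection-constrained weight updates (NOT Tenyx's
# proprietary algorithm): the gradient step for one linear layer is projected onto
# the subspace that activations from the original data barely occupy, so the
# layer's outputs on those old inputs stay approximately unchanged.
import torch

def null_space_projector(old_acts: torch.Tensor, eps: float = 1e-3) -> torch.Tensor:
    """Projector P such that (grad @ P) @ a is ~0 for old activations a."""
    # old_acts: (n_samples, d_in) activations recorded on pre-training-like data
    cov = old_acts.T @ old_acts / old_acts.shape[0]
    eigvals, eigvecs = torch.linalg.eigh(cov)            # eigenvalues in ascending order
    unused = eigvecs[:, eigvals < eps]                   # directions the old data barely uses
    return unused @ unused.T                             # (d_in, d_in) projector

def constrained_update(W: torch.Tensor, grad: torch.Tensor,
                       projector: torch.Tensor, lr: float = 1e-3) -> torch.Tensor:
    # Only the component of the step lying in the "unused" subspace is applied,
    # so W @ a is nearly preserved for old activations a while the layer still
    # learns on the new data in the remaining directions.
    return W - lr * (grad @ projector)
```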
The method retains RLHF protections and is aligned with regulatory changes — specifically the White House Executive Order on Safe, Secure, and Trustworthy AI.
Through the evaluation of popular enterprise and open-source fine-tuning algorithms in a pilot study, Tenyx exhibited the following results:
- Safety: Tenyx fine-tuning reduced safety performance by 11%, compared to reductions of 66% for OpenAI, 94% for Together AI and 91% for LoRA.
- Proficiency: OpenAI’s GPT-3.5 Turbo was initially more proficient because the model has more parameters, but Tenyx’s fine-tuned Llama-2 7B was the most proficient after fine-tuning.
- Knowledge: Tenyx mitigated catastrophic forgetting the most, with a 3% loss, compared to 10% for OpenAI, 40% for Together AI and 43% for LoRA.
“Catastrophic forgetting is a well-known problem in deep learning, which still affects even large, capable models,” said Noah Goodman, associate professor at Stanford University. “When trained on data from a new domain, models generally perform better on that domain while unintentionally modifying earlier capabilities.”
He added, “Tenyx has a strong research team who are exploring important new ideas for addressing this difficult challenge.”