The future of generative AI in enterprises could be smaller, more targeted language models

by Ana Lopez

The amazing abilities of OpenAI’s ChatGPT would not be possible without large language models. These models are trained on billions, sometimes trillions, of samples of text. The idea behind ChatGPT is for the model to understand language so well that it can anticipate, in a fraction of a second, which word is likely to come next. Pulling this off takes a lot of training, computing resources and developer knowledge.

But perhaps the future of these models is more focused than the boil-the-ocean approach we’ve seen from OpenAI and others, who want to be able to answer every question under the sun. What if each industry, or even each company, trained its own model to understand its particular jargon, language and approach? Perhaps then we would get fewer completely fabricated answers, because the answers would come from a more limited universe of words and sentences.

In the AI-driven future, each company’s own data could be its most valuable asset. If you’re an insurance company, you have a very different lexicon than a hospital, car company, or law firm, and when you combine that with your customer data and the full body of content across the organization, you have the makings of a language model. While it may not be large in the sense of a truly large language model, it would be just the model you need, one made for a single company and not for the masses.

This also requires a set of tools to collect, aggregate, and continually update the business dataset in a way that makes it digestible for these smaller large language models (sLLMs).

Building these models can be challenging. They will likely start from an existing LLM, whether open source or from a private company, and then fine-tune it on industry or company data to bring it into sharper focus, all in a more secure environment than a generic LLM offers.

This represents a huge opportunity for the startup community, and we are seeing many companies get ahead of the curve on this idea.
