2022 has seen incredible growth in foundation models — AI models trained on a massive scale — a revolution that began with Google’s BERT in 2018, picked up steam with OpenAI’s GPT-3 in 2020, and entered the zeitgeist with the company’s DALL-E text-to-image generator in early 2021.
The pace has only accelerated this year and moved firmly into the mainstream, thanks to the jaw-dropping text-to-image possibilities of DALL-E 2, Google’s Imagen and Midjourney, as well as the options for computer vision applications from Microsoft’s Florence and the multimodal options from DeepMind’s Gato.
That turbocharged speed of development, as well as the ethical concerns around model bias that accompany it, is why one year ago, the Stanford Institute for Human-Centered AI founded the Center for Research on Foundation Models (CRFM) and published “On the Opportunities and Risks of Foundation Models” — a report that put a name to this powerful transformation.
“We coined the term ‘foundation models’ because we felt there needed to be a name to cover the importance of this set of technologies,” said Percy Liang, associate professor in computer science at Stanford University and director of the CRFM.
Since then, the progress of foundation models “made us more confident that this was a good move,” he added. However, it has also led to a growing need for transparency, which he said has been hard to come by.
“There is confusion about what these models actually are and what they’re doing,” Liang explained, adding that the pace of model development has been so fast that many foundation models are already commercialized, or are underpinning point systems that the public is not aware of, such as search.
“We’re trying to understand the ecosystem and document and benchmark everything that’s happening,” he said.
The CRFM defines a foundation model as one that is trained on broad data and can be adapted to a wide range of downstream tasks.
“It’s a single model like a piece of infrastructure that is very versatile,” said Liang, in stark contrast to the previous generation of AI, in which bespoke models were built for different applications.
“This is a paradigm shift in the way that applications are built,” he explained. “You can build all sorts of interesting applications that were just impossible, or at the very least took a huge team of engineers months to build.”
Foundation models like DALL-E and GPT-3 offer new creative opportunities as well as new ways to interact with systems, said Rishi Bommasani, a Ph.D. student in the computer science department at Stanford whose research focuses on foundation models.
“One of the things we’re seeing, in language and vision and code, is that these systems may lower the barrier for entry,” he added. “Now we can specify things in natural language and therefore enable a far larger class of people.”
That is exciting to see, he said, “but it also entails thinking about new types of risks.”
The challenge, according to Liang and Bommasani, is that there’s not enough information to assess the social impact or explore solutions to risks of foundation models, including biased data sets that lead to racist or sexist output.
“We’re trying to map out the ecosystem, like what datasets were used, how models are trained, how the models are being used,” Liang said. “We’re talking to the various companies and trying to glean information by reading between the lines.”
The CRFM is also attempting to allow companies to share details about their foundation models while still protecting company interests and proprietary IP.
“I think people would be happy to share, but there’s a fear that oversharing might lead to some consequences,” he said. “It’s also if everyone were sharing it might be actually okay, but no one [wants] to be the first to share.”
This makes it challenging to proceed.
“Even basic things like whether these models can be released is a hot topic of contention,” he said. “This is something I wish the community would discuss a bit more and get a bit more consensus on how you can guard against the risks of misuse, while still maintaining open access and transparency so that these models can be studied by people in academia.”
“Foundation models cut down on data labeling requirements anywhere from a factor of like 10 times, 200 times, depending on the use case,” Dakshi Agrawal, IBM fellow and CTO of IBM AI, told VentureBeat. “Essentially, it’s the opportunity of a decade for enterprises.”
Certain enterprise use cases, such as interpreting very nuanced clauses in contracts, require more accuracy than traditional AI has been able to deliver.
“Foundation models provide that leap in accuracy which enables these additional use cases,” he said.
Foundation models were born in natural language processing (NLP) and have transformed that space in areas such as customer care analysis, he added. Industry 4.0 also has a tremendous number of use cases, he explained. The same AI breakthroughs happening in language are happening in chemistry, for example, as foundation models learn the language of chemistry from data (atoms, molecules and properties) and power a multitude of tasks.
“There are so many other areas where companies would love to use the foundation model, but we are not there yet,” he said, offering high-fidelity data synthesis and more natural conversational assistance as examples, but “we will be there maybe in a year or so. Or maybe two.”
Agrawal points out that regulated industries are hesitant to use current public large language models, so it’s essential that input data is controlled and trusted, while output should be controlled so as not to produce biased or harmful content. In addition, the output should be consistent with the input and facts — hallucinations, or interpretation errors, cannot be tolerated.
For the CEO who has already started their AI journey, “I would encourage them to experiment with foundation models,” he said.
Most AI projects, he explained, get stuck trying to shorten time to value. “I would urge them to try foundation models to see that time to value shrinks and how little time it takes away from day-to-day business.”
If an organization has not started on its AI journey or is at a very early stage, “I would say you can just leapfrog,” he said. “Try this very low-friction way of getting started on AI.”
Going forward, Agrawal thinks the cost of foundation models, and the energy they use, will go down dramatically, thanks in part to hardware and software designed specifically for training them and for using the technology more efficiently.
“I expect energy to be exponentially decreasing for a given use case in the coming years,” he said.
Overall, Liang said that foundation models will have a “transformative” impact, but that getting there requires a balanced and objective approach.
“We can’t let the hype make us lose our heads,” he said. “The hope is that in a year we’ll at least be at a definitively better place in terms of our ability to make informed decisions or take informed actions.”
(Copyright: VentureBeat Foundation models: 2022's AI paradigm shift | VentureBeat)