
Multimodal Intelligence the Future of GenAI, Per Gartner

Liz Dominguez

As generative AI advances, CPG manufacturers must decide which formats and models will be most effective when integrating artificial intelligence into their tech stacks. 

The environment is growing increasingly complex, from open-source large language models to domain-specific formats. Multimodal solutions, however, may have an advantage in human-AI interactions, per Gartner.

The company said 40% of generative AI solutions will be multimodal by 2027, encompassing text, image, audio, and video, up from just 1% last year. The growth is largely attributed to the format's ability to adapt to different industries and use cases, making it applicable at any touchpoint between AI and humans. 


“As the GenAI market evolves towards models natively trained on more than one modality, this helps capture relationships between different data streams and has the potential to scale the benefits of GenAI across all data types and applications,” said Erick Brethenoux, distinguished VP analyst at Gartner. “It also allows AI to support humans in performing more tasks, regardless of the environment.”

In practice, people sift through varied information formats, including auditory, visual, and other sensory inputs, Brethenoux noted, so multimodal AI is more efficient because the data itself is multimodal.

While many of these models are currently limited to two to three modalities, Gartner expects this to increase over the next few years. The challenge with single-modality formats is that combining them to support multimodal use cases often leads to delays and inaccurate results, hurting the overall experience. 

Overall, Gartner also expects LLMs to grow in impact over the next five years, and two technologies, domain-specific GenAI models and autonomous agents, hold the highest potential as GenAI reaches mainstream adoption within the next decade. 


More GenAI Formats at a Glance

Open Source: These have a deep-learning foundation, democratize commercial access, and allow developers to optimize models for specific use cases. Arun Chandrasekaran, distinguished VP analyst at Gartner, said they are highly customizable; offer better control over privacy, security, and transparency; and include smaller models that are more cost-effective to train and implement. 

Domain-Specific: These are optimized for specific business functions and industries, offering improved contextualized answers that better align with business goals. They reduce the need for advanced prompt engineering, have a lower hallucination rate, provide added security, and can be implemented for industry-specific tasks. 

Autonomous Agents: These combined systems can achieve business goals without human intervention. They make decisions based on environmental patterns, allowing them to tackle more complex tasks independently. "This will likely deliver cost savings, granting a competitive edge. It also poses an organizational workforce shift from delivery to supervision,” said Chandrasekaran.
