Google’s new Gemini AI beats GPT-4 in 30 of 32 tests

But will the difference be enough to matter in real life?

Tech giant Google has finally unveiled its much-hyped Gemini AI, a series of generative AI models it claims are its “largest and most capable” to date. 

“This new era of models represents one of the biggest science and engineering efforts we’ve undertaken as a company,” said Google CEO Sundar Pichai. 

Multimodal AI: Generative AIs are algorithms trained to create original content in response to user prompts. OpenAI’s first iteration of ChatGPT, for example, can understand and produce human-like text, while its DALL-E 2 system can generate images based on text prompts. 

While those systems understand and generate just one type of content, a multimodal generative AI can work with several — in September, OpenAI announced a multimodal version of ChatGPT that could understand image, voice, and text inputs.

The Gemini era: According to Google, multimodal AIs are traditionally created by combining separate, specialized models into one program, but the company took a different approach with Gemini, training it to be multimodal from the start.

“This helps Gemini seamlessly understand and reason about all kinds of inputs from the ground up, far better than existing multimodal models — and its capabilities are state-of-the-art in nearly every domain,” wrote Demis Hassabis, CEO and cofounder of Google DeepMind.

In addition to being highly capable, Google says the Gemini AI is also its “most flexible” model. This has allowed the company to create three different sizes of the AI: Ultra, Nano, and Pro. 

  • Gemini Ultra is the most powerful model, designed for complex tasks. According to Google, it’s the first generative AI model to outperform human experts on the MMLU, a benchmark assessing knowledge across 57 subjects. Google is currently soliciting feedback on Ultra from select users, but expects to make it widely available in 2024.
  • Gemini Nano is the least capable model, but it’s small and efficient enough to run locally on smartphones. Google has already made it available on its Pixel 8 Pro — owners of that smartphone can use the AI to summarize audio recordings or generate responses to WhatsApp messages.
  • Gemini Pro, meanwhile, falls between Nano and Ultra in terms of capabilities and size. Google has integrated an English-language version of that model into its ChatGPT-like Bard, which will reportedly get an Ultra upgrade in 2024.

The big picture: Like the rest of the tech industry, Google has been racing to catch up with OpenAI in the generative AI space ever since the release of ChatGPT in 2022, and it’s been hyping the Gemini AI for months as the tech that will put it ahead. 

While Gemini Ultra did outperform OpenAI’s GPT-4 on 30 of 32 benchmarks tested (including the MMLU), the difference was often just a percentage point or two. Google may be ahead, but only by a little, and only compared to an AI model that has already been out for nine months.

“It’s clear that Gemini is a very sophisticated AI system … [but] it’s not obvious to me that Gemini is actually substantially more capable than GPT-4,” Melanie Mitchell, an AI researcher at the Santa Fe Institute in New Mexico, told MIT Technology Review.
