On Wednesday, Google made Gemini 2.0, its most advanced suite of artificial intelligence models to date, available to the public.
The company had given developers and trusted testers access in December and built some of the capabilities into Google products, but it is describing this launch as a "general release."
The suite includes 2.0 Flash, described as a "workhorse model, ideal for high-volume, high-frequency tasks at scale"; 2.0 Pro Experimental, aimed at coding performance; and 2.0 Flash-Lite, billed as the company's "most cost-effective model to date."
Gemini Flash costs developers 10 cents per million tokens for text, image, and video inputs, while Flash-Lite, its lower-cost counterpart, charges 0.75 cents for the same inputs. A token is a discrete unit of data that the model processes.
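As a rough illustration of how per-million-token pricing translates into dollar costs, here is a minimal sketch using the rates reported above (the rates are as stated in this article; actual Google billing tiers may differ):

```python
# Input pricing as reported: dollars per million tokens.
# These figures come from the article above; real billing may differ.
PRICE_PER_MILLION = {
    "flash": 0.10,         # 10 cents per million tokens
    "flash-lite": 0.0075,  # 0.75 cents per million tokens
}

def input_cost(model: str, tokens: int) -> float:
    """Return the input cost in dollars for a given token count."""
    return tokens / 1_000_000 * PRICE_PER_MILLION[model]

# Processing 50 million input tokens:
print(input_cost("flash", 50_000_000))       # 5.0 dollars
print(input_cost("flash-lite", 50_000_000))  # 0.375 dollars
```

At that scale, the roughly 13x price gap is what makes Flash-Lite attractive for high-volume workloads.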
The releases are part of Google's broader strategy of investing heavily in AI agents as competition intensifies among tech giants and startups.
Meta, Amazon, Microsoft, OpenAI, and Anthropic are all moving toward agentic AI: models that can complete complex multistep tasks on a user's behalf, rather than requiring the user to walk them through every individual step.
"In the past year, we have focused on creating more autonomous models that possess a greater understanding of their surroundings, can anticipate future scenarios, and can act on your behalf under your oversight," Google stated in a December blog post. The company added that Gemini 2.0 features "new advancements in multimodality—such as integrated image and audio output—and inherent tool utilisation," and said the model family will enable the development of new AI agents in line with its vision of a universal assistant.
Anthropic, the Amazon-backed AI startup founded by former OpenAI research executives, is a major contender in the race to build AI agents. In October, Anthropic said its AI agents could use computers the way humans do to complete complex tasks. According to the startup, its technology can interpret what is on a computer screen, select buttons, enter text, navigate websites, and carry out tasks across any software application, with real-time internet browsing.
Jared Kaplan, Anthropic's chief science officer, told CNBC in an interview that the tool can "utilise computers in fundamentally the same manner as we do" and is capable of carrying out tasks with "tens or even hundreds of steps."
OpenAI recently introduced a similar product called Operator, which automates tasks such as planning vacations, filling out forms, making restaurant reservations, and ordering groceries. The Microsoft-backed startup described Operator as "an agent capable of navigating the web to execute tasks on your behalf."
This week, OpenAI launched Deep Research, which lets an AI agent compile detailed research reports and analyze questions and topics of the user's choosing. In December, Google introduced a tool of the same name, Deep Research, designed to act as a "research assistant" that explores complex subjects and compiles reports on the user's behalf.
CNBC first reported in December that Google planned to unveil multiple AI products in early 2025.
“In history, it is not always essential to be the first; however, it is imperative to execute proficiently and truly excel as a premier product,” stated CEO Sundar Pichai during a strategy discussion at that time. “I believe that encapsulates the essence of 2025.”