Google has made further progress in developing its artificial intelligence by **introducing Gemini 2.0**, a model set to transform how we engage with technology.
The debut of Gemini 2.0 signifies a new chapter in Google’s AI model lineage. While Gemini 1.0 and 1.5 were notable for their ability to process text, images, audio, video, and code in a multimodal fashion, this new model takes it a step further.
Gemini 2.0 not only comprehends these types of inputs but also generates multimodal outputs, like native images and synthesized audio. This is a significant advancement for applications such as generating complex reports or employing virtual assistants for sophisticated tasks.
“We are thrilled to introduce our next era of models designed for this new era of AI agents: Gemini 2.0, our most capable model to date. The new advancements in multimodality—like native image and audio generation and the seamless integration of tools—will enable us to create new AI agents that bring us closer to our vision of a universal assistant,” explained Sundar Pichai, CEO of Google, in a company statement.
Google has launched the first model in the Gemini 2.0 family: an experimental version called Gemini 2.0 Flash. This is a workhorse model that combines low latency with improved performance.
“In addition to supporting multimodal inputs like images, video, and audio, Flash 2.0 now supports multimodal outputs, such as natively generated images mixed with text and multilingual audio synthesized from text (TTS). It is also natively integrated with tools like Google Search and code execution, as well as user-defined third-party functions,” the technology giant explained.
The experimental Gemini 2.0 Flash model builds on the success of Gemini 1.5 Flash and is already accessible to developers through platforms like Google AI Studio and Vertex AI. Additionally, it supports new tools like the Multimodal Live API, which enables real-time video input and streaming audio.
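For developers, access works through the standard Gemini API. The following is a minimal sketch using the `google-generativeai` Python SDK, assuming the experimental model is exposed under a name like `gemini-2.0-flash-exp` and that an API key is available in the environment; the exact model identifier and SDK surface may change, so treat this as an illustration rather than a definitive integration.

```python
# Hedged sketch: sending a text prompt to the experimental Gemini 2.0 Flash
# model via the google-generativeai SDK. The model name "gemini-2.0-flash-exp"
# and the GEMINI_API_KEY variable are assumptions for illustration.
import os

try:
    import google.generativeai as genai
except ImportError:
    genai = None  # SDK not installed; the sketch degrades gracefully


def ask_gemini(prompt: str) -> str:
    """Send a text prompt to Gemini 2.0 Flash and return the reply text."""
    if genai is None or "GEMINI_API_KEY" not in os.environ:
        # No SDK or no credentials: skip the live call instead of failing.
        return "(SDK or API key missing: skipping live call)"
    genai.configure(api_key=os.environ["GEMINI_API_KEY"])
    model = genai.GenerativeModel("gemini-2.0-flash-exp")
    response = model.generate_content(prompt)
    return response.text


if __name__ == "__main__":
    print(ask_gemini("Summarize what multimodal output means in one sentence."))
```

The Multimodal Live API mentioned above uses a separate streaming (WebSocket-based) interface for real-time audio and video, which this text-only sketch does not cover.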
Google also announced that, starting today, users of the Gemini web experience on desktop and mobile can access a chat-optimized version of Gemini 2.0 Flash by selecting it from the model dropdown menu. Integration into the Gemini mobile app is planned for the near future.
Gemini 2.0 enables advanced experiences with AI agents, offering solutions that can **anticipate user needs** and perform actions on their behalf efficiently, under human supervision. To showcase the model's potential, **Google has introduced several prototypes**, such as Project Astra and Project Mariner, that demonstrate how Gemini 2.0 can transform interactions between humans and machines.
The development of Gemini 2.0 is accompanied by a commitment to safety and ethics. Google has implemented strict risk assessment protocols, including measures to prevent misuse of AI and ensure user privacy. For example:
– Secure information handling: Project Astra includes controls that let users delete sessions and protect sensitive data.
– Fraud protection: Project Mariner prioritizes user instructions over malicious third-party attempts at manipulation.
– Bias and error reduction: Gemini 2.0 incorporates automated evaluations to improve the safety of its multimodal outputs.
Photo: Google