Google has made further progress in developing its artificial intelligence by **introducing Gemini 2.0**, a model set to transform how we engage with technology.
The debut of Gemini 2.0 signifies a new chapter in Google’s AI model lineage. While Gemini 1.0 and 1.5 were notable for their ability to process text, images, audio, video, and code in a multimodal fashion, this new model takes it a step further.
Gemini 2.0 not only comprehends these types of inputs but also generates multimodal outputs, like native images and synthesized audio. This is a significant advancement for applications such as generating complex reports or employing virtual assistants for sophisticated tasks.
“We are thrilled to introduce our next era of models designed for this new era of AI agents: Gemini 2.0, our most capable model to date. The new advancements in multimodality—like native image and audio generation and the seamless integration of tools—will enable us to create new AI agents that bring us closer to our vision of a universal assistant,” explained Sundar Pichai, CEO of Google, in a company statement.
Google has launched the first model in the Gemini 2.0 family: an experimental version called Gemini 2.0 Flash. This is a workhorse model that combines low latency with improved performance.
“In addition to supporting multimodal inputs like images, video, and audio, Flash 2.0 now supports multimodal outputs, such as natively generated images mixed with text and multilingual audio synthesized from text (TTS). It is also natively integrated with tools like Google Search and code execution, as well as user-defined third-party functions,” the technology giant explained.
The experimental Gemini 2.0 Flash model builds on the success of Gemini 1.5 Flash and is already accessible to developers through platforms like Google AI Studio and Vertex AI. Additionally, it supports new tools like the Multimodal Live API, which enables real-time video input and streaming audio.
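For developers, access works through the standard Gemini API. The following is a minimal sketch using the `google-generativeai` Python SDK, assuming the experimental model is exposed under a name like `gemini-2.0-flash-exp` and that an API key is available in the environment; the exact model identifier and SDK surface may change, so treat this as an illustration rather than a definitive integration.

```python
# Hedged sketch: sending a text prompt to the experimental Gemini 2.0 Flash
# model via the google-generativeai SDK. The model name "gemini-2.0-flash-exp"
# and the GEMINI_API_KEY variable are assumptions for illustration.
import os

try:
    import google.generativeai as genai
except ImportError:
    genai = None  # SDK not installed; the sketch degrades gracefully


def ask_gemini(prompt: str) -> str:
    """Send a text prompt to Gemini 2.0 Flash and return the reply text."""
    if genai is None or "GEMINI_API_KEY" not in os.environ:
        # No SDK or no credentials: skip the live call instead of failing.
        return "(SDK or API key missing: skipping live call)"
    genai.configure(api_key=os.environ["GEMINI_API_KEY"])
    model = genai.GenerativeModel("gemini-2.0-flash-exp")
    response = model.generate_content(prompt)
    return response.text


if __name__ == "__main__":
    print(ask_gemini("Summarize what multimodal output means in one sentence."))
```

The Multimodal Live API mentioned above uses a separate streaming (WebSocket-based) interface for real-time audio and video, which this text-only sketch does not cover.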
Google also announced that, starting today, users of the Gemini web experience on desktop and mobile can access a chat-optimized version of Gemini 2.0 Flash by selecting it from the model dropdown menu. Integration into the Gemini mobile app is planned for the near future.
Gemini 2.0 enables advanced experiences with AI agents, offering solutions that can **anticipate user needs** and perform actions on their behalf efficiently, under human supervision. To showcase the model's potential, **Google has introduced several prototypes**, such as Project Astra and Project Mariner, that demonstrate how Gemini 2.0 can transform interactions between humans and machines.
The development of Gemini 2.0 is accompanied by a commitment to safety and ethics. Google has implemented strict risk assessment protocols, including measures to prevent misuse of AI and ensure user privacy. For example:
– Secure information handling: Project Astra includes controls that let users delete sessions and protect sensitive data.
– Fraud protection: Project Mariner prioritizes user instructions over malicious third-party attempts at manipulation.
– Bias and error reduction: Gemini 2.0 incorporates automated evaluations to improve the safety of its multimodal outputs.
Photo: Google