Key Point 1: Google officially launches flagship AI model Gemini 2.0 Pro Experimental and introduces Gemini 2.0 Flash Thinking model, enhancing its competitiveness in the AI field.
Key Point 2: Facing competition from China’s AI startup DeepSeek and its low-cost and efficient AI models, Google attempts to increase its market share by integrating the Gemini 2.0 Flash Thinking model into its Gemini application.
Key Point 3: Gemini 2.0 Pro, the flagship of the Gemini series, performs well at coding and handling complex prompts, and has stronger world-knowledge understanding and reasoning capabilities. Its 2 million token context window lets it process a very large amount of text.
Google Introduces Gemini 2.0 Series
In response to the low-cost, high-efficiency trend set by DeepSeek, Google officially launched its flagship AI model, Gemini 2.0 Pro Experimental, on Wednesday. It also released the Gemini 2.0 Flash Thinking model, a move seen as Google actively responding to competition in the AI field and consolidating its market position.
Gemini 2.0 Pro: Upgraded coding capability, expanded context window
Gemini 2.0 Pro is the successor to Gemini 1.5 Pro, which Google released in February last year. Google claims it is now the flagship of the Gemini AI model series. The model excels at coding and handling complex prompts, and has "better world-knowledge understanding and reasoning capabilities" than any previous model.
According to TechCrunch, Gemini 2.0 Pro can also access tools like Google Search and execute code on behalf of users.
Notably, Gemini 2.0 Pro has a context window of 2 million tokens, meaning it can process approximately 1.5 million English words in one go. That is enough to fit all seven books of the "Harry Potter" series in a single prompt, with roughly 400,000 words of space to spare.
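The back-of-envelope arithmetic behind that claim can be sketched as follows. The 0.75 words-per-token ratio is a common rule of thumb for English text (not a figure from Google), and the Harry Potter word count is an approximation:

```python
# Rough estimate of what a 2-million-token context window can hold.
# Assumptions (illustrative, not from Google's documentation):
#   - ~0.75 English words per token (a common heuristic)
#   - ~1,084,000 words total across all seven Harry Potter books
WORDS_PER_TOKEN = 0.75
CONTEXT_TOKENS = 2_000_000
HARRY_POTTER_WORDS = 1_084_000  # approximate series total

capacity_words = int(CONTEXT_TOKENS * WORDS_PER_TOKEN)  # ~1.5 million words
remaining_words = capacity_words - HARRY_POTTER_WORDS   # ~416,000 words left

print(f"Window capacity: ~{capacity_words:,} words")
print(f"Room left after the full series: ~{remaining_words:,} words")
```

Actual capacity varies with the tokenizer and the text: code, non-English languages, and unusual vocabulary all consume more tokens per word than plain English prose.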
Image: The Gemini 2.0 series models have been officially launched. (Source: Google)
Facing DeepSeek! Gemini 2.0 Flash Thinking enters the competition
Both Google and DeepSeek released AI reasoning models in December last year, but DeepSeek's R1 drew more attention. DeepSeek's models match, and in some cases surpass, the performance of leading models from US technology companies, and businesses can access them through its API at relatively low prices.
To counter the competition from DeepSeek, Google is trying to put the Gemini 2.0 Flash Thinking model in front of more users through the Gemini application. With the introduction of Gemini 2.0 Pro and Gemini 2.0 Flash Thinking, Google aims to hold its leading position in a fiercely competitive AI market.
Comparison of Gemini 2.0 series models
Gemini 2.0 Flash
The main model in the Gemini series, suitable for everyday tasks. Compared to 1.5 Flash, quality is significantly improved; compared to 1.5 Pro, latency is lower, closer to real-time response, while quality is slightly better.
Key features:
Equipped with a real-time multimodal API supporting low-latency bidirectional voice and video interaction. It outperforms Gemini 1.5 Pro on most quality benchmarks. Improvements in multimodal understanding, coding, complex instruction following, and function calling support a better user experience. It adds built-in image generation and controllable text-to-speech, enabling image editing, localized art creation, and expressive storytelling.
Applicable scenarios:
Suitable for daily applications that require quick response and high-quality output, such as real-time translation, video recognition, etc.
Gemini 2.0 Flash-Lite
The fastest and most cost-effective of the Flash models, suited to scenarios that need to balance speed and cost.
Key features:
At the same price and speed as 1.5 Flash, it delivers better quality. It supports multimodal input and text output, with a 1M-token input context window and an 8K-token output limit. However, it lacks Gemini 2.0 Flash's multimodal output generation, real-time multimodal API integration, thinking mode, and built-in tool use.
Applicable scenarios:
Suitable for large-scale text output applications, such as generating titles for a large number of photos.
Gemini 2.0 Pro
The model in the Gemini series with the strongest coding capability and world knowledge, with a 2M-token context window, suited to scenarios that require processing large amounts of information and complex coding tasks.
Key features:
It performs well at coding and handling complex prompts, and has stronger understanding and reasoning over world knowledge. Its very large context window of 2 million tokens enables comprehensive analysis and understanding of large volumes of information, and it can invoke tools such as Google Search.
Applicable scenarios:
Suitable for scenarios that demand strong coding capability and complex problem solving, such as converting Python code to Java. Researchers can also use Gemini 2.0 Pro to quickly read and digest large bodies of academic literature and automatically generate literature reviews, saving considerable time and effort.
Further reading: Are there countless DeepSeeks? OpenAI’s potential rivals are not limited to it, listing the top 5 potential AI companies in China.
Source: TechCrunch, Google
This article was initially generated and compiled by AI, and edited by Li Xiantai.
This article is a collaborative reprint from the Digital Times.