Remix Eth Pulse
Google Developer Conference Unveiled! Video Search Opens Up, Enhanced Reasoning Capabilities, and 6 Major AI Revolutions Revealed

By admin | May 15, 2024 | Updated: Jul. 15, 2024

Google’s annual developer conference, Google I/O, took place in the early hours of May 15th, Taiwan time. This year, the focus was on updating AI capabilities. Throughout the event, the term “AI” was mentioned a total of 122 times.

One of the major updates was bringing Gemini's multimodal capabilities into Google's search engine and assistant. Starting this year, users will be able to search Google with video. Google also highlighted AI Overview, which uses AI to summarize search results. The intelligent assistant, Astra, can recognize objects and actions in video and answer related questions in real time. Additionally, Google unveiled the new Gemini 1.5 Flash model and the Veo video generation model.

During the event, Demis Hassabis, CEO of Google DeepMind, made his first appearance at Google I/O.

AI Revolution 1: Search Engine! Ability to search with video and understand complex queries!

Search, the product that underpins Google’s leading position, received a fundamental update built on Gemini’s new capabilities. It can now not only interpret audiovisual content but also understand longer and more complex queries.

Google can now search using videos!

While Google search has primarily relied on text and images for a long time, it has finally advanced to include “video search.” Users can now shoot videos and ask simple questions using voice or text, and the search engine will automatically analyze the video content and provide relevant responses.

In a demonstration, a user whose record player’s needle kept moving erratically while playing a vinyl record filmed the problem and asked, “Why is it happening?” Google analyzed the video, ran the search, and returned an AI-generated summary through the AI Overview feature.

AI Overview: Understanding longer and more complex queries

AI Overview, a technology introduced by Google last year, summarizes and organizes search content at the top of the search results page. With the Gemini model’s new “multi-step reasoning” capability, AI Overview can handle complex questions in a single query, even when the question is long, contains many details, or has several specific requirements.

For example, a user looking for a new yoga or Pilates studio in Boston can simply search, “Find me the best yoga or Pilates studios in Boston and tell me their new member offers and the time it takes to walk from Beacon Hill.” AI Overview can address all of these requirements in a single response.

AI Revolution 2: Astra Assistant! Real-time analysis of video content, thinking, and responding

Demis Hassabis, CEO of Google DeepMind, took the stage at Google I/O for the first time this year to showcase Astra, which Google describes as its “future AI assistant.” According to Google, Astra is designed to understand the dynamic and complex world much as a human does.

Thanks to its multimodal capabilities, Astra can analyze video in real time, including fast-changing scenes, and it can remember what it has seen. This feature received thunderous applause during the presentation.

In a demonstration, a user filmed their surroundings while walking and asked Astra, “Where do you think I am right now?” They then pointed the camera at a computer screen, circled a section of code with a brush tool, and asked, “Where do you think there are problems that need improvement?” Before the video ended, they asked, “Do you remember where I put my glasses?” Astra searched the frames from the past few minutes, found the one showing the glasses, analyzed it, and answered, “Next to an apple.”

AI Revolution 3: Google Photos Search! AI helps you find photos and document your life

Google also introduced the Ask Photos with Gemini feature in Google Photos. It uses image analysis to classify objects in photos and add keyword tags. For example, users can quickly find photos showing their car’s license plate, or gather the photos that document their daughter learning to swim. When asked, “When did my daughter learn the backstroke?” Gemini can find the relevant photos and give the date as an answer.

AI Revolution 4: Android! Gemini spans all experiences, including conversations and videos

Android is expected to become the best platform for experiencing Google’s AI capabilities, with Gemini always on hand to help on mobile phones. In the applications demonstrated at the conference, Gemini generated memes during conversations, answered questions about sports videos, and, through the Gemini Advanced app, answered questions about PDF files of more than 80 pages.

Gemini’s ability to process a very large number of tokens at once allows it to read through an entire economics textbook in seconds and provide summaries or answer questions.

AI Revolution 5: Gemini Update! New model Flash, lighter and capable of processing millions of tokens at once

Large language models are the foundation for all the new features introduced this time. Gemini, Google’s core AI language model, has evolved to possess two core abilities: “multimodal” and “massive processing.” It can now process millions of tokens of text, images, and videos in one go.

The new member of the Gemini family, Gemini 1.5 Flash, falls between Gemini 1.5 Pro and Gemini Nano in size. It is more lightweight and efficient than Gemini 1.5 Pro while, according to Google, offering a comparable level of capability. For example, its context window can hold a million tokens, meaning it can analyze documents spanning 1,500 pages or code exceeding 30,000 lines in a single prompt. This lightweight model is achieved through “knowledge distillation” and is better suited to developers who need speed and cost-efficiency.
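The article does not describe how Google applied knowledge distillation to Gemini 1.5 Flash. As a general illustration of the technique only, the Python sketch below computes a classic distillation loss: the divergence between a large "teacher" model's softened output distribution and a smaller "student" model's. The logits, the temperature value, and the function names are made-up assumptions for this example, not details from Google.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened outputs, scaled by T^2."""
    p = softmax(teacher_logits, temperature)  # teacher's soft targets
    q = softmax(student_logits, temperature)  # student's predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

# Hypothetical logits over three candidate tokens (illustrative values only).
teacher = [4.0, 1.0, 0.5]
student_close = [3.5, 1.2, 0.4]  # roughly agrees with the teacher
student_far = [0.5, 1.0, 4.0]    # prefers a different token

print("loss, close student:", distillation_loss(teacher, student_close))
print("loss, far student:  ", distillation_loss(teacher, student_far))
```

Training the student to minimize this loss pushes its output distribution toward the teacher's, which is how a smaller model can inherit behavior from a larger one.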

Gemini 1.5 Pro Update

Gemini 1.5 Pro, which was only announced in February this year, is also set to be upgraded. Its context window will double to 2 million tokens, allowing it to process 2 hours of video, 22 hours of audio, over 60,000 lines of code, or over 1.4 million words of text in a single prompt.
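The capacity figures quoted in this article imply rough per-unit token costs. The Python sketch below simply divides the stated context sizes by the stated amounts of content. The results are back-of-the-envelope estimates derived from the article's numbers, not official tokenizer rates, and because the code and text figures are quoted as "over" a given amount, those ratios are upper bounds. The function and dictionary names are illustrative.

```python
def tokens_per_unit(context_tokens: int, units: float) -> float:
    """Average number of tokens that one unit of content would consume."""
    if units <= 0:
        raise ValueError("units must be positive")
    return context_tokens / units

ONE_MILLION = 1_000_000  # 1M-token context window, as stated in the article
TWO_MILLION = 2_000_000  # upgraded Gemini 1.5 Pro window, as stated

estimates = {
    "page": tokens_per_unit(ONE_MILLION, 1_500),             # 1,500 pages
    "code_line": tokens_per_unit(TWO_MILLION, 60_000),       # 60,000 lines
    "word": tokens_per_unit(TWO_MILLION, 1_400_000),         # 1.4M words
    "video_second": tokens_per_unit(TWO_MILLION, 2 * 3600),  # 2 hours
    "audio_second": tokens_per_unit(TWO_MILLION, 22 * 3600), # 22 hours
}

for unit, tokens in estimates.items():
    print(f"about {tokens:.1f} tokens per {unit}")
```

Note that the 1-million-token figure (30,000 lines of code) and the 2-million-token figure (60,000 lines) give the same ratio per line, so the article's numbers are internally consistent.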

AI Revolution 6: Veo Video Model! Text-based video generation

In video generation, Google presented Veo, a challenger to OpenAI’s Sora. Veo can generate high-quality 1080p videos from natural-language text instructions. It also understands filmmaking and visual-effects terminology, so a prompt can call for techniques such as time-lapse photography.

As for OpenAI’s Sora, it can generate complex scenes with multiple characters, specific actions, and numerous details. The AI not only understands various objects mentioned in the prompt but also knows how these objects exist in the real world, creating a stunningly realistic experience.

OpenAI Releases GPT-4o a Day Before Google I/O

Additionally, a day before Google I/O, OpenAI unveiled its new model, GPT-4o. It offers GPT-4-level intelligence with enhanced voice and video processing, giving users an experience closer to interacting with a real person.

GPT-4o can provide real-time translation during conversations, enabling smooth communication between two people speaking different languages. When asked to tell a bedtime story, it can narrate with a more expressive and lively voice. It can also teach people how to solve simple math problems using a voice that closely resembles human speech.

According to OpenAI, GPT-4o can “read” the user’s facial expressions and tone, know when and how to respond, and quickly switch between different tones, ranging from robotic to lively singing.

With two major AI powerhouses releasing their latest technologies within two days, this AI revolution will continue to impact people’s lives.

Editor: Lin Mei-Xin
