Anthropic Unveils New “Claude 3” Models, Claiming They Outperform GPT-4
Anthropic, backed by investments from Amazon and Google, has announced the release of its Claude 3 model series, claiming it surpasses all competitors, including GPT-4, as the fastest and most powerful models to date. In certain tasks, Anthropic says, the models even exhibit “human-like” capabilities.
“We are pleased to announce the launch of the Claude 3 model series, which sets new industry standards for a wide range of cognitive tasks,” stated Anthropic on its official website.
The Claude 3 series consists of three models: Claude 3 Haiku, Claude 3 Sonnet, and Claude 3 Opus, ranging from lowest to highest performance, so users can choose according to their needs and budget. Opus and Sonnet are available now via Claude.ai and the Claude API (a minimal usage sketch follows the model descriptions below), though Opus requires a $20-per-month Claude Pro subscription; Haiku will launch soon.
Anthropic has positioned each model differently:
Claude 3 Opus:
This is Anthropic’s most advanced model, possessing near-human understanding and fluency, designed for extremely complex tasks and open-ended prompts.
Claude 3 Sonnet:
This model strikes a balance between intelligence and speed, offering a better cost-performance ratio than comparable products, and is engineered for high endurance in large-scale AI deployments.
Claude 3 Haiku:
The smallest and fastest model, with near-real-time responsiveness, capable of quickly answering simple questions, ideal for immediate user interaction.
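For developers, the practical entry point is the Claude API. As a minimal sketch, assuming the official `anthropic` Python SDK and the dated model identifiers Anthropic published at launch (which may differ in later releases), a request looks roughly like this:

```python
# Minimal sketch of calling a Claude 3 model via Anthropic's Messages API.
# Assumes the official `anthropic` Python SDK (pip install anthropic) and an
# ANTHROPIC_API_KEY environment variable; model names are the dated
# identifiers published at launch.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-3-opus-20240229",  # or "claude-3-sonnet-20240229"
    max_tokens=1024,
    messages=[{"role": "user", "content": "Summarize Moby Dick in one paragraph."}],
)
print(response.content[0].text)
```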
Anthropic claims that in most benchmark tests, Opus, the most powerful model in this release, outperforms the leading AI models on the market, including on undergraduate-level expert knowledge (MMLU), graduate-level expert reasoning (GPQA), and grade-school math (GSM8K), surpassing even GPT-4 on complex tasks while exhibiting near-human levels of understanding and fluency.
Additionally, the Claude 3 series models demonstrate comparable performance to competitors in visual capabilities, handling complex visual content such as photos, charts, and technical diagrams.
Anthropic points out that over half of its customers’ knowledge bases consist of visual content in various formats, such as PDFs, flowcharts, and presentation slides, and says it is delighted to offer this new modality to its clients. It is worth noting that while the Claude 3 models can process images, they do not generate image content.
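As an illustration of the image-input modality, here is a hedged sketch, again assuming the `anthropic` SDK: the Messages API accepts base64-encoded image blocks alongside text, and the file name here is a placeholder.

```python
# Sketch of sending an image (e.g., a chart from a slide deck) to Claude 3.
# The file name is a hypothetical placeholder for a local PNG.
import base64
import anthropic

client = anthropic.Anthropic()

with open("quarterly_chart.png", "rb") as f:  # placeholder file
    image_data = base64.b64encode(f.read()).decode("utf-8")

response = client.messages.create(
    model="claude-3-sonnet-20240229",
    max_tokens=512,
    messages=[{
        "role": "user",
        "content": [
            {"type": "image",
             "source": {"type": "base64", "media_type": "image/png", "data": image_data}},
            {"type": "text", "text": "What trend does this chart show?"},
        ],
    }],
)
print(response.content[0].text)
```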
According to CNBC, Anthropic states that Claude 3 can handle approximately 150,000 words (200,000 tokens) of text, equivalent to books like “Moby Dick” or “Harry Potter and the Deathly Hallows,” while previous versions could only handle around 75,000 words.
In data disclosed by Anthropic, the lightweight Haiku model can read a dense research paper of roughly 10,000 tokens, complex charts included, in under three seconds.
Regarding pricing, Anthropic charges $15 per million input tokens and $75 per million output tokens for Opus, significantly higher than GPT-4 Turbo’s $10 per million input tokens and $30 per million output tokens, perhaps a sign of Anthropic’s confidence in its own model.
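A quick back-of-the-envelope comparison makes the gap concrete; the request size below is an illustrative assumption, not a benchmark:

```python
# Back-of-the-envelope cost comparison using the per-million-token prices
# quoted above; the token counts are an illustrative assumption.
PRICES = {                      # (input $, output $) per million tokens
    "claude-3-opus": (15.0, 75.0),
    "gpt-4-turbo": (10.0, 30.0),
}

input_tokens, output_tokens = 50_000, 2_000   # hypothetical single request

for model, (p_in, p_out) in PRICES.items():
    cost = input_tokens / 1e6 * p_in + output_tokens / 1e6 * p_out
    print(f"{model}: ${cost:.3f}")
# claude-3-opus: $0.900
# gpt-4-turbo: $0.560
```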
Anthropic is also committed to significantly reducing the “hallucination rate” of its models to make Claude 3 safer. Chatbots are prone to being misled or fabricating answers when they do not fully understand a question, which can fuel the spread of fake news.
Anthropic attempts to address this issue with Claude 3, claiming that Opus answers a series of challenging, complex questions with more than twice the accuracy of previous models, significantly cutting the share of incorrect responses. Even so, Daniela Amodei, president of Anthropic, acknowledges that eliminating the problem entirely is not easy, since driving the hallucination rate to zero is extremely challenging.
“No model is perfect, and I believe that should be made clear,” emphasized Amodei. “We are doing our best to make the model safer and more powerful, but there are still times when fabricated responses may occur.”
Source:
Anthropic, Bloomberg, CNBC
Editor: Lin Meixin