Claude 3 Haiku: our fastest model yet

minimaxir · on March 13, 2024

This is a new announcement from the previous Claude 3 announcement last week: https://news.ycombinator.com/item?id=39590666

Specifically, the smallest/likely will be the most popular model is now available when it wasn't then. (The model ID is claude-3-haiku-20240307 ). Notably, this is also a cheap model that supports image input, but per the documentation you can only provide 20 images at a time which won't work for video inputs.

Testing around image inputs in the web Workbench, it's surprisingly good for the price.

simonw · on March 13, 2024

Pricing: $0.25/million tokens of input, $1.25/million of output

GPT-3.5 Turbo is $0.50/$1.50

I've updated the Claude 3 plugin for my LLM CLI tool to support the new model: https://github.com/simonw/llm-claude-3/releases/tag/0.3

    pipx install llm
    llm install llm-claude-3
    llm keys set claude
    # Paste Anthropic API key here
    llm -m claude-3-haiku 'Fun facts about armadillos'

It's pretty fast! Animated GIF here: https://github.com/simonw/llm-claude-3/issues/3#issuecomment...

porphyra · on March 13, 2024

For comparison, Groq [1] has (price per million tokens of input vs output):

    Llama 2 70B (4096 Context Length)     ~300 tokens/s $0.70/$0.80
    Llama 2 7B (2048 Context Length)     ~750 tokens/s $0.10/$0.10
    Mixtral, 8x7B SMoE (32K Context Length) ~480 tokens/s $0.27/$0.27
    Gemma 7B (8K Context Length)         ~820 tokens/s $0.10/$0.10

[1] https://wow.groq.com/

BoorishBears · on March 13, 2024

And zero capacity. Groq is coming across a total paper tiger. No billing, unusable rate limits, and most importantly: a request queue that makes it dramatically slower than any other option.

They say they're just waiting on implementing billing, but at this point it reads more like "we wouldn't be able to meet demand of all your request usages".

-

Groq is going through all that to offer 500tk/s theoretically, meanwhile I'm seeing Fireworks.ai come in at 300+tk/s in production use.

BoorishBears · on March 13, 2024

Seeing it be pretty slow in production with long prompts:

- 10-15 seconds for 400 tokens out, and 4,000-10,000 tokens in.

- 6-8 seconds when using Claude Instant for the same prompts

Hoping it's just a rush at launch.

redbell · on March 13, 2024

Oh, Claude.. You can't do this to me!

A few days ago, I decided to give Claude a try , so I created an account, verified my phone number, and successfully logged in. After a warm welcome from Claude and presenting myself, I entered my very first prompt, which reads: "What do you know about Hacker News?". I pressed ENTER, and after a second, it replied:

"Your account has been disabled after an automatic review of your recent activities that violate our Terms of Service. Please review our Terms of Service and Acceptable Use Policy for more information."

I contacted the support team, and after a day, they replied and redirected me to a Google Forms to fill, which I still didn't fill.

jug · on March 13, 2024

From Redditors at /r/ClaudeAI this seems to be a common bug. People have got accounts back upon contacting them and I saw one even without doing anything, after a while...

I very much miss an official comment on this though.

It's not exactly inspiring confidence to subscribe to a product you may randomly be locked out from, with no comment from the company behind it.

bkrausz · on March 14, 2024

In lieu of an official comment: the fraud detection that we run on account signup was overly aggressive, especially in response to the massive attention we've seen/new shapes of attacks. We've tuned it down and unbanned a number of accounts. If you're still seeing issues with your account specifically let me know the account at bkrausz at anthropic.com and I'll dig into it.

josh-sematic · on March 13, 2024

You can play with Haiku for free at https://app.airtrain.ai/playground

impulser_ · on March 14, 2024

Anthropic took the lead from OpenAI in LLM IMO. Claude 3 Opus is by far the best LLM on the market right now. It the first time using LLM where I was actually impressive by the responses. The knowledge, reasoning, and responses from Opus are way better than GPT4.

Haiku seems like it adds to this lead. Having a cheaper and better model than GPT3.5 for processing large amounts of documents is great.

Props to the Anthropic team.

ldjkfkdsjnv · on March 13, 2024

Is Anthropic moving faster than OpenAI? Or is OpenAI working on something so big, that they aren't worried by being outpaced. Regardless, I feel like I am watching history in real time.

a_wild_dandan · on March 13, 2024

OpenAI are presently training toward GPT-5, with periodic GPT-4.x releases planned. These models take immense training resources, red teaming, etc. It'll be a hot minute before you see GPT-4.5, etc.

svdr · on March 13, 2024

I guess they must have put some time in Sora?

gabev · on March 13, 2024

This is fantastic! We just shipped so much of our old Claude Instant calls to Haiku and the results are fantastic.

Zenfetch is now primarily powered by Claude 3 family of models :O https://www.zenfetch.com

GaggiX · on March 13, 2024

The multilingual capabilities of Claude 3 models are incredible, even the smallest model, Haiku, is fluent in Georgian, a language that not even GPT-4 can speak without making a huge amount of mistakes.

brcmthrowaway · on March 13, 2024

What are their training sources for this

jug · on March 13, 2024

I've been impressed by the quality of the Claude 3 lineup. If you don't need advanced math and reasoning, even Sonnet that is free reaches GPT-4 parity according to the LMSYS Chatbot Arena Leaderboard: https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboar...

There, Sonnet is within the margin of error of current GPT-4 and Opus the same but as for the GPT-4 previews.

Above all Anthropic seem to have found themselves a very nice training set, watching how nicely results are retained as they go down in model sizes.

pedalpete · on March 13, 2024

Is there a good reason why they are comparing Claude 3 with ChatGPT 3.5 instead of 4? Does anybody really even care about/use Gemini?

wavemode · on March 14, 2024

Because GPT-3.5 is in the same price class as Haiku. GPT-4 is significantly more expensive.

bryanlarsen · on March 13, 2024

AFAICT Claude 3 Haiku is supposed to be cheap/fast so compares with ChatGPT 3.5, and Claude 3 Ultra is supposed to be the best so compares with ChatGPT 4.

pants2 · on March 13, 2024

You mean Claude 3 Opus, but yes.

devnullbrain · on March 13, 2024

SMS verification seems to be broken. Nothing reported on the status page.

https://status.anthropic.com/

bkrausz · on March 14, 2024

We haven't seen any issues in the last day+ with SMS delivery. Sometimes Twilio will blackhole a phone number if they suspect abuse, but not sure if that's the case here.

If you want to email bkrausz at anthropic.com with your phone number I'm happy to check logs (assuming it's still not working).