𝗟𝗹𝗮𝗺𝗮-𝟯.𝟭 𝗺𝗼𝗱𝗲𝗹𝘀 𝗳𝗶𝗻𝗮𝗹𝗹𝘆 𝗴𝗲𝘁 𝘁𝗵𝗲𝗶𝗿 𝗖𝗵𝗮𝘁𝗯𝗼𝘁 𝗔𝗿𝗲𝗻𝗮 𝗿𝗮𝗻𝗸𝗶𝗻𝗴 🎖️
Given the impressive benchmarks published by Meta for their Llama-3.1 models, I was curious to see how these models would compare to top proprietary models on Chatbot Arena.
Now we've got the results! LMSys released the Elo scores derived from thousands of user votes for the new models, and here are the rankings:
💥 405B Model ranks 5th overall, in front of GPT-4-turbo! But behind GPT-4o, Claude-3.5 Sonnet and Gemini-advanced.
👏 70B Model climbs up to 9th place! Its score goes from 1206 ➡️ 1244.
👍 8B Model improves from 1152 ➡️ 1170.
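To get a feel for what these score gaps mean, here is a minimal sketch using the standard Elo expected-score formula. This assumes the Arena scores can be read on the usual 400-point Elo scale; the leaderboard's own fitting procedure may differ, and the function name `expected_score` is just for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of A against B (1.0 = A always preferred)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


if __name__ == "__main__":
    # Scores quoted in the post for the 70B model: 1206 before, 1244 after.
    p = expected_score(1244, 1206)
    print(f"Expected win rate of a 1244-rated model vs a 1206-rated one: {p:.1%}")
```

Under that assumption, a 38-point gap corresponds to only a modest edge in head-to-head votes, which is why a few dozen points can still move a model several places on the leaderboard.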
✅ This confirms that Llama-3.1 is a good contender for any task: each of its 3 model sizes is much cheaper to run than equivalent proprietary models!
For instance, here are the inference prices for the top models:
➤ GPT-4-Turbo inference price from OpenAI: $5/M input tokens, $15/M output tokens
➤ Llama-3.1-405B from HF API (for testing only): $3/M for input or output tokens (Source linked in the first comment)
➤ Llama-3.1-405B from HF API (for testing only): free ✨
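To make the price gap concrete, here is a rough sketch that turns the per-million-token prices quoted above into a cost for a given workload. The prices are taken as stated in this post, and the workload (2M input tokens, 0.5M output tokens) is a made-up example, as are the names `PRICES_PER_M` and `workload_cost`.

```python
# (input $/M tokens, output $/M tokens), as quoted in the post
PRICES_PER_M = {
    "GPT-4-Turbo": (5.0, 15.0),
    "Llama-3.1-405B (paid, $3/M)": (3.0, 3.0),
}


def workload_cost(input_tokens: int, output_tokens: int,
                  price_in: float, price_out: float) -> float:
    """Cost in dollars for a workload, given prices per million tokens."""
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens, 0.5M output tokens.
    for name, (p_in, p_out) in PRICES_PER_M.items():
        cost = workload_cost(2_000_000, 500_000, p_in, p_out)
        print(f"{name}: ${cost:.2f}")
```

The gap widens as the share of output tokens grows, since the proprietary price quoted here charges three times more for output than for input.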
Get a head start on the HF API (resource by @andrewrreed ) 👉 https://huggingface.co/learn/cookbook/enterprise_hub_serverless_inference_api