Post
925
𝙒𝙧𝙞𝙩𝙞𝙣𝙜 𝙩𝙤𝙤𝙡 𝙘𝙖𝙡𝙡𝙨 𝙞𝙣 𝙘𝙤𝙙𝙚 𝙟𝙪𝙨𝙩 𝙬𝙤𝙧𝙠𝙨 𝙗𝙚𝙩𝙩𝙚𝙧 𝙩𝙝𝙖𝙣 𝙅𝙎𝙊𝙉 💪
I was really happy to learn today by @sergeipetrov that paper 𝘌𝘹𝘦𝘤𝘶𝘵𝘢𝘣𝘭𝘦 𝘊𝘰𝘥𝘦 𝘈𝘤𝘵𝘪𝘰𝘯𝘴 𝘌𝘭𝘪𝘤𝘪𝘵 𝘉𝘦𝘵𝘵𝘦𝘳 𝘓𝘓𝘔 𝘈𝘨𝘦𝘯𝘵𝘴 was accepted at ICLR 2024!
As a reminder, an agent is a system in which you embed a LLM engine, to let it call tools.
These tools are meant like an IronMan suit, to supplement the LLM in areas that it isn't good at.
🧑💻 For instance your friendly LLM may be terrible at calculating powers of floating numbers ("What is X ^0.2947 ?"), so it should use a calculator.
🔎It may be terrible at knowing precise facts ("What was the date of the Golden Bull?") so it should use a web browser.
So the agent system will prompt an agent with "Now you can use these tools: calculator, search,..."
But 𝙝𝙤𝙬 𝙨𝙝𝙤𝙪𝙡𝙙 𝙩𝙝𝙚 𝙖𝙜𝙚𝙣𝙩 𝙚𝙭𝙥𝙧𝙚𝙨𝙨 𝙞𝙩𝙨 𝙖𝙘𝙩𝙞𝙤𝙣𝙨?
All well known frameworks let agents write their actions as JSON strings.
We 𝗽𝗿𝗲𝗳𝗲𝗿𝗿𝗲𝗱 𝘁𝗼 𝗴𝗼 𝘄𝗶𝘁𝗵 𝗳𝗼𝗿𝗺𝘂𝗹𝗮𝘁𝗶𝗻𝗴 𝗮𝗰𝘁𝗶𝗼𝗻𝘀 𝗶𝗻 𝗖𝗼𝗱𝗲, 𝘄𝗵𝗶𝗰𝗵 𝗶𝘀 𝗺𝘂𝗰𝗵 𝗺𝗼𝗿𝗲 𝘃𝗲𝗿𝘀𝗮𝘁𝗶𝗹𝗲 𝗮𝗻𝗱 𝗰𝗼𝗻𝗰𝗶𝘀𝗲, 𝗮𝗻𝗱 𝗮𝗹𝗹𝗼𝘄𝘀 𝘁𝗼 𝗰𝗵𝗮𝗶𝗻 𝗮𝗰𝘁𝗶𝗼𝗻𝘀 𝘀𝗲𝗮𝗺𝗹𝗲𝘀𝘀𝗹𝘆: see the picture attached for an example where Code formulation really shines.
And the paper confirms our choice: researchers show that 𝗰𝗼𝗺𝗽𝗮𝗿𝗲𝗱 𝘁𝗼 𝗝𝗦𝗢𝗡 𝗼𝗿 𝗽𝗹𝗮𝗶𝗻 𝘁𝗲𝘅𝘁, 𝗖𝗼𝗱𝗲 𝗶𝘀 𝗯𝗲𝘁𝘁𝗲𝗿 𝗯𝗼𝘁𝗵 𝗶𝗻 𝗰𝗼𝗻𝗰𝗶𝘀𝗲𝗻𝗲𝘀𝘀 𝗮𝗻𝗱 𝗽𝗲𝗿𝗳𝗼𝗿𝗺𝗮𝗻𝗰𝗲:
➤ Up to 30% fewer steps for the same actions (much more concise)
➤ Up to 20% higher performance on benchmarks
And we find additional benefits, for instance a natural handling of variables.
Read the paper here 📖 Executable Code Actions Elicit Better LLM Agents (2402.01030)
Get your ReactCodeAgent running with our Agents framework! 👉 https://huggingface.co/learn/cookbook/agents
I was really happy to learn today by @sergeipetrov that paper 𝘌𝘹𝘦𝘤𝘶𝘵𝘢𝘣𝘭𝘦 𝘊𝘰𝘥𝘦 𝘈𝘤𝘵𝘪𝘰𝘯𝘴 𝘌𝘭𝘪𝘤𝘪𝘵 𝘉𝘦𝘵𝘵𝘦𝘳 𝘓𝘓𝘔 𝘈𝘨𝘦𝘯𝘵𝘴 was accepted at ICLR 2024!
As a reminder, an agent is a system in which you embed a LLM engine, to let it call tools.
These tools are meant like an IronMan suit, to supplement the LLM in areas that it isn't good at.
🧑💻 For instance your friendly LLM may be terrible at calculating powers of floating numbers ("What is X ^0.2947 ?"), so it should use a calculator.
🔎It may be terrible at knowing precise facts ("What was the date of the Golden Bull?") so it should use a web browser.
So the agent system will prompt an agent with "Now you can use these tools: calculator, search,..."
But 𝙝𝙤𝙬 𝙨𝙝𝙤𝙪𝙡𝙙 𝙩𝙝𝙚 𝙖𝙜𝙚𝙣𝙩 𝙚𝙭𝙥𝙧𝙚𝙨𝙨 𝙞𝙩𝙨 𝙖𝙘𝙩𝙞𝙤𝙣𝙨?
All well known frameworks let agents write their actions as JSON strings.
We 𝗽𝗿𝗲𝗳𝗲𝗿𝗿𝗲𝗱 𝘁𝗼 𝗴𝗼 𝘄𝗶𝘁𝗵 𝗳𝗼𝗿𝗺𝘂𝗹𝗮𝘁𝗶𝗻𝗴 𝗮𝗰𝘁𝗶𝗼𝗻𝘀 𝗶𝗻 𝗖𝗼𝗱𝗲, 𝘄𝗵𝗶𝗰𝗵 𝗶𝘀 𝗺𝘂𝗰𝗵 𝗺𝗼𝗿𝗲 𝘃𝗲𝗿𝘀𝗮𝘁𝗶𝗹𝗲 𝗮𝗻𝗱 𝗰𝗼𝗻𝗰𝗶𝘀𝗲, 𝗮𝗻𝗱 𝗮𝗹𝗹𝗼𝘄𝘀 𝘁𝗼 𝗰𝗵𝗮𝗶𝗻 𝗮𝗰𝘁𝗶𝗼𝗻𝘀 𝘀𝗲𝗮𝗺𝗹𝗲𝘀𝘀𝗹𝘆: see the picture attached for an example where Code formulation really shines.
And the paper confirms our choice: researchers show that 𝗰𝗼𝗺𝗽𝗮𝗿𝗲𝗱 𝘁𝗼 𝗝𝗦𝗢𝗡 𝗼𝗿 𝗽𝗹𝗮𝗶𝗻 𝘁𝗲𝘅𝘁, 𝗖𝗼𝗱𝗲 𝗶𝘀 𝗯𝗲𝘁𝘁𝗲𝗿 𝗯𝗼𝘁𝗵 𝗶𝗻 𝗰𝗼𝗻𝗰𝗶𝘀𝗲𝗻𝗲𝘀𝘀 𝗮𝗻𝗱 𝗽𝗲𝗿𝗳𝗼𝗿𝗺𝗮𝗻𝗰𝗲:
➤ Up to 30% fewer steps for the same actions (much more concise)
➤ Up to 20% higher performance on benchmarks
And we find additional benefits, for instance a natural handling of variables.
Read the paper here 📖 Executable Code Actions Elicit Better LLM Agents (2402.01030)
Get your ReactCodeAgent running with our Agents framework! 👉 https://huggingface.co/learn/cookbook/agents