## Tool Calling
To enable the tool calling feature, you may need to set certain tool-call parser options when starting the service. See [deploy_guidance](./deploy_guidance.md) for details.
In Kimi-K2, a tool calling process includes the following steps (sketched in code after this list):
- Passing function descriptions to Kimi-K2
- Kimi-K2 decides a function call is needed and returns the information required to make that call to the user
- The user performs the function call, collects the results, and passes them back to Kimi-K2
- Kimi-K2 continues generating based on the function call results until it believes it has enough information to answer the user
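Concretely, one round of this loop shows up in the `messages` list as the following sequence of roles. This is an illustrative sketch only; all field values are placeholders, not real model output:
```python
# Illustrative shape of the messages list after one tool-call round.
messages = [
    {"role": "user", "content": "What's the weather like in Beijing today?"},
    # 1. The model asks for a tool call (assistant turn carrying tool_calls)
    {"role": "assistant", "content": "", "tool_calls": [{
        "id": "functions.get_weather:0",
        "type": "function",
        "function": {"name": "get_weather", "arguments": '{"city": "Beijing"}'},
    }]},
    # 2. The user executes the tool and reports the result back
    {"role": "tool", "tool_call_id": "functions.get_weather:0",
     "name": "get_weather", "content": '{"weather": "Sunny"}'},
    # 3. The model then answers in plain text (no further tool_calls)
]
```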
### Preparing Tools
Suppose we have a function `get_weather` that can query weather conditions in real time.
It accepts a city name as a parameter and returns the weather conditions. We need to prepare a structured description for it so that Kimi-K2 can understand its functionality.
```python
def get_weather(city):
    return {"weather": "Sunny"}

# Collect the tool descriptions in tools
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get weather information. Call this tool when the user needs to get weather information",
        "parameters": {
            "type": "object",
            "required": ["city"],
            "properties": {
                "city": {
                    "type": "string",
                    "description": "City name",
                }
            }
        }
    }
}]

# Map tool names to callables for dispatch later
tool_map = {
    "get_weather": get_weather
}
```
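Once more tools are registered, it can help to route every call through a single dispatcher that validates the name and arguments before executing. A minimal sketch; the `call_tool` helper and its error handling are our own additions, not part of the Kimi-K2 API:
```python
import json

def call_tool(tool_map, name, arguments_json):
    """Look up a tool by name and invoke it with JSON-encoded arguments.

    Always returns a dict, so the result can be serialized back
    into a role='tool' message even on failure.
    """
    if name not in tool_map:
        return {"error": f"unknown tool: {name}"}
    try:
        arguments = json.loads(arguments_json)
    except json.JSONDecodeError as e:
        return {"error": f"invalid arguments: {e}"}
    return tool_map[name](**arguments)

# Example: call_tool(tool_map, "get_weather", '{"city": "Beijing"}')
# -> {"weather": "Sunny"}
```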
### Chat with tools
We use `openai.OpenAI` to send messages to Kimi-K2 along with the tool descriptions. Kimi-K2 autonomously decides whether, and how, to use the provided tools.
If Kimi-K2 decides a tool call is needed, it returns a result with `finish_reason='tool_calls'`, and the returned message includes the tool call information.
After executing the tools with the provided arguments, we append the tool call results to the chat history and call Kimi-K2 again.
Kimi-K2 may need to call tools several times before it decides the current results are enough to answer the user's question, so we loop until `finish_reason` is no longer `tool_calls`.
The results obtained from the tool calls should be added to `messages` with `role='tool'`.
```python
import json

from openai import OpenAI

model_name = 'moonshotai/Kimi-K2-Instruct'
endpoint = 'http://localhost:8000/v1'  # placeholder: your service's OpenAI-compatible base URL
client = OpenAI(base_url=endpoint, api_key='xxx')

messages = [
    {"role": "user", "content": "What's the weather like in Beijing today? Let's check using the tool."}
]
finish_reason = None
while finish_reason is None or finish_reason == "tool_calls":
    completion = client.chat.completions.create(
        model=model_name,
        messages=messages,
        temperature=0.3,
        tools=tools,
        tool_choice="auto",
    )
    choice = completion.choices[0]
    finish_reason = choice.finish_reason
    # Note: the finish_reason that marks the end of a tool call may vary across
    # inference engines, so adjust this check accordingly.
    if finish_reason == "tool_calls":
        # Keep the assistant turn (with its tool_calls) in the history
        messages.append(choice.message)
        for tool_call in choice.message.tool_calls:
            tool_call_name = tool_call.function.name
            tool_call_arguments = json.loads(tool_call.function.arguments)
            tool_function = tool_map[tool_call_name]
            tool_result = tool_function(**tool_call_arguments)
            print("tool_result", tool_result)
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "name": tool_call_name,
                "content": json.dumps(tool_result),
            })
print('-' * 100)
print(choice.message.content)
```
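`tool_choice="auto"` leaves the decision to the model. The OpenAI API also allows forcing a specific tool on a given turn; note that support for non-`auto` values may vary across serving engines:
```python
# Force the model to call get_weather on this turn
# (part of the OpenAI API spec; engine support may vary).
completion = client.chat.completions.create(
    model=model_name,
    messages=messages,
    tools=tools,
    tool_choice={"type": "function", "function": {"name": "get_weather"}},
)
```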
### Tool Calling in Streaming Mode
Tool calling can also be used in streaming mode. In this case, we need to collect the tool call information returned in the stream until we have a complete tool call. Please refer to the code below:
```python
messages = [
    {"role": "user", "content": "What's the weather like in Beijing today? Let's check using the tool."}
]
finish_reason = None
msg = ''
while finish_reason is None or finish_reason == "tool_calls":
    completion = client.chat.completions.create(
        model=model_name,
        messages=messages,
        temperature=0.3,
        tools=tools,
        tool_choice="auto",
        stream=True,
    )
    tool_calls = []
    for chunk in completion:
        delta = chunk.choices[0].delta
        if delta.content:
            msg += delta.content
        if delta.tool_calls:
            for tool_call_chunk in delta.tool_calls:
                if tool_call_chunk.index is not None:
                    # Grow the tool_calls list until it covers this index
                    while len(tool_calls) <= tool_call_chunk.index:
                        tool_calls.append({
                            "id": "",
                            "type": "function",
                            "function": {
                                "name": "",
                                "arguments": ""
                            }
                        })
                    # Accumulate the streamed fragments for this tool call
                    tc = tool_calls[tool_call_chunk.index]
                    if tool_call_chunk.id:
                        tc["id"] += tool_call_chunk.id
                    if tool_call_chunk.function.name:
                        tc["function"]["name"] += tool_call_chunk.function.name
                    if tool_call_chunk.function.arguments:
                        tc["function"]["arguments"] += tool_call_chunk.function.arguments
        finish_reason = chunk.choices[0].finish_reason
    # Note: the finish_reason that marks the end of a tool call may vary across
    # inference engines, so adjust this check accordingly.
    if finish_reason == "tool_calls":
        # Record the assistant turn (with its tool calls) before the tool results
        messages.append({"role": "assistant", "content": msg, "tool_calls": tool_calls})
        for tool_call in tool_calls:
            tool_call_name = tool_call['function']['name']
            tool_call_arguments = json.loads(tool_call['function']['arguments'])
            tool_function = tool_map[tool_call_name]
            tool_result = tool_function(**tool_call_arguments)
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call['id'],
                "name": tool_call_name,
                "content": json.dumps(tool_result),
            })
        # Any text generated alongside the tool call is not the final answer; reset msg
        msg = ''
print(msg)
```
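The accumulation step matters because `function.arguments` arrives as string fragments that only form valid JSON once concatenated. A standalone sketch of that merging, using fabricated chunks (`SimpleNamespace` stands in for the SDK's delta objects; the fragment contents are made up):
```python
import json
from types import SimpleNamespace

# Two fabricated delta fragments for the same tool call (index 0);
# neither fragment is valid JSON on its own.
fragments = [
    SimpleNamespace(index=0, id="functions.get_weather:0",
                    function=SimpleNamespace(name="get_weather", arguments='{"ci')),
    SimpleNamespace(index=0, id=None,
                    function=SimpleNamespace(name=None, arguments='ty": "Beijing"}')),
]

tc = {"id": "", "function": {"name": "", "arguments": ""}}
for f in fragments:
    if f.id:
        tc["id"] += f.id
    if f.function.name:
        tc["function"]["name"] += f.function.name
    if f.function.arguments:
        tc["function"]["arguments"] += f.function.arguments

print(json.loads(tc["function"]["arguments"]))  # {'city': 'Beijing'}
```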
### Manually Parsing Tool Calls
The tool call requests generated by Kimi-K2 can also be parsed manually, which is especially useful when the service you are using does not provide a tool-call parser.
Tool call requests generated by Kimi-K2 are wrapped in `<|tool_calls_section_begin|>` and `<|tool_calls_section_end|>`,
with each individual call wrapped in `<|tool_call_begin|>` and `<|tool_call_end|>`. The tool ID and the arguments are separated by `<|tool_call_argument_begin|>`.
The format of the tool ID is `functions.{func_name}:{idx}`, from which we can parse the function name.
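For example, a single call to `get_weather` would be serialized by the model along these lines (illustrative; line breaks added for readability):
```
<|tool_calls_section_begin|>
<|tool_call_begin|>functions.get_weather:0<|tool_call_argument_begin|>{"city": "Beijing"}<|tool_call_end|>
<|tool_calls_section_end|>
```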
Based on the above rules, we can post requests directly to the completions endpoint and parse the tool calls manually.
```python
import requests
from transformers import AutoTokenizer

messages = [
    {"role": "user", "content": "What's the weather like in Beijing today? Let's check using the tool."}
]
msg = ''
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
while True:
    # Render the chat history (including the tool schemas) into a raw prompt
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        tools=tools,
        add_generation_prompt=True,
    )
    payload = {
        "model": model_name,
        "prompt": text,
        "max_tokens": 512
    }
    response = requests.post(
        f"{endpoint}/completions",
        headers={"Content-Type": "application/json"},
        json=payload,
        stream=False,
    )
    raw_output = response.json()["choices"][0]["text"]
    tool_calls = extract_tool_call_info(raw_output)
    if len(tool_calls) == 0:
        # No tool calls: this is the final answer
        msg = raw_output
        break
    else:
        # Keep the assistant turn in the history so the chat template can render
        # the tool calls before the tool results (this assumes the template
        # accepts OpenAI-style tool_calls entries).
        messages.append({"role": "assistant", "content": "", "tool_calls": tool_calls})
        for tool_call in tool_calls:
            tool_call_name = tool_call['function']['name']
            tool_call_arguments = json.loads(tool_call['function']['arguments'])
            tool_function = tool_map[tool_call_name]
            tool_result = tool_function(**tool_call_arguments)
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call['id'],
                "name": tool_call_name,
                "content": json.dumps(tool_result),
            })
print('-' * 100)
print(msg)
```
Here, `extract_tool_call_info` parses the model output and returns the tool call information. A simple implementation would be:
```python
import re

def extract_tool_call_info(tool_call_rsp: str):
    if '<|tool_calls_section_begin|>' not in tool_call_rsp:
        # No tool calls in this response
        return []
    pattern = r"<\|tool_calls_section_begin\|>(.*?)<\|tool_calls_section_end\|>"
    tool_calls_sections = re.findall(pattern, tool_call_rsp, re.DOTALL)
    # Extract the individual tool calls within the section
    func_call_pattern = r"<\|tool_call_begin\|>\s*(?P<tool_call_id>[\w\.]+:\d+)\s*<\|tool_call_argument_begin\|>\s*(?P<function_arguments>.*?)\s*<\|tool_call_end\|>"
    tool_calls = []
    for match in re.findall(func_call_pattern, tool_calls_sections[0], re.DOTALL):
        function_id, function_args = match
        # function_id looks like: functions.get_weather:0
        function_name = function_id.split('.')[1].split(':')[0]
        tool_calls.append(
            {
                "id": function_id,
                "type": "function",
                "function": {
                    "name": function_name,
                    "arguments": function_args
                }
            }
        )
    return tool_calls
```
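A quick check against the serialized example shown earlier (illustrative input; the expected output follows directly from the parsing rules above):
```python
raw = ('<|tool_calls_section_begin|>'
       '<|tool_call_begin|>functions.get_weather:0'
       '<|tool_call_argument_begin|>{"city": "Beijing"}<|tool_call_end|>'
       '<|tool_calls_section_end|>')

print(extract_tool_call_info(raw))
# [{'id': 'functions.get_weather:0', 'type': 'function',
#   'function': {'name': 'get_weather', 'arguments': '{"city": "Beijing"}'}}]
```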