burtenshaw (HF Staff) committed
Commit 3951d28 · verified · 1 parent: 37a95d8

Add notebook examples for structured outputs and function calling


These notebooks show the community how to use `Kimi-K2-Instruct` with Hugging Face Inference Providers for structured outputs and function calling.
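
Both notebooks build on the same basic pattern: point an OpenAI-compatible client at the Inference Providers router and request the model by ID. The minimal sketch below mirrors the Groq route and model ID used in `function_calling.ipynb` and assumes an `HF_TOKEN` environment variable is set; it is an illustration, not part of the committed notebooks.

```python
import os
from openai import OpenAI

# Sketch of the shared setup: OpenAI client pointed at the HF Inference
# Providers router (Groq route, as in function_calling.ipynb).
client = OpenAI(
    api_key=os.environ["HF_TOKEN"],
    base_url="https://router.huggingface.co/groq/openai/v1",
)

response = client.chat.completions.create(
    model="moonshotai/kimi-k2-instruct",
    messages=[{"role": "user", "content": "Say hello in one short sentence."}],
)
print(response.choices[0].message.content)
```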

Files changed (2)
  1. function_calling.ipynb +325 -0
  2. structured_outputs.ipynb +198 -0
function_calling.ipynb ADDED
@@ -0,0 +1,325 @@
+ {
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "eec74b22",
+ "metadata": {
+ "vscode": {
+ "languageId": "raw"
+ }
+ },
+ "source": [
+ "# Function Calling with Hugging Face Inference Providers\n",
+ "\n",
+ "This notebook demonstrates how to use function calling with both OpenAI-compatible and Hugging Face native clients using Hugging Face Inference Providers.\n",
+ "\n",
+ "## Overview\n",
+ "- **OpenAI-Compatible**: Use familiar OpenAI API syntax with HF Inference Providers\n",
+ "- **Hugging Face Native**: Use HF's native InferenceClient with function calling\n",
+ "- **Shared Functions**: Reusable function definitions and schemas across both approaches\n",
+ "\n",
+ "## Installation\n",
+ "\n",
+ "First, install the required dependencies:\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "f23485bd",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%pip install openai huggingface-hub python-dotenv\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "id": "e39a23ae",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "/Users/ben/code/inference-providers-mcp/.venv/lib/python3.11/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
+ " from .autonotebook import tqdm as notebook_tqdm\n"
+ ]
+ }
+ ],
+ "source": [
+ "import json\n",
+ "import os\n",
+ "from typing import Dict, Any, Optional\n",
+ "from openai import OpenAI\n",
+ "from huggingface_hub import InferenceClient\n",
+ "from dotenv import load_dotenv\n",
+ "\n",
+ "# Load environment variables\n",
+ "load_dotenv()\n",
+ "\n",
+ "# Create a shared configuration\n",
+ "HF_TOKEN = os.getenv(\"HF_TOKEN\")\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "0b45612f",
+ "metadata": {},
+ "source": [
+ "# Define some functions"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "5cd13326",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Shared function definitions (mock weather API)\n",
+ "def get_current_weather(location: str) -> Dict[str, Any]:\n",
+ " \"\"\"Get current weather information for a location.\"\"\"\n",
+ " return {\n",
+ " \"location\": location,\n",
+ " \"temperature\": \"22°C\",\n",
+ " \"condition\": \"Sunny\",\n",
+ " \"humidity\": \"65%\",\n",
+ " \"wind_speed\": \"5 km/h\",\n",
+ " }\n",
+ "\n",
+ "\n",
+ "def get_weather_forecast(location: str, date: str) -> Dict[str, Any]:\n",
+ " \"\"\"Get weather forecast for a location on a specific date.\"\"\"\n",
+ " return {\n",
+ " \"location\": location,\n",
+ " \"date\": date,\n",
+ " \"forecast\": \"Sunny with a chance of rain\",\n",
+ " \"temperature\": \"20°C\",\n",
+ " \"humidity\": \"70%\",\n",
+ " }\n",
+ "\n",
+ "\n",
+ "# Available functions registry\n",
+ "AVAILABLE_FUNCTIONS = {\n",
+ " \"get_current_weather\": get_current_weather,\n",
+ " \"get_weather_forecast\": get_weather_forecast,\n",
+ "}\n",
+ "\n",
+ "# Shared tool schemas (compatible with both OpenAI and HF)\n",
+ "TOOL_SCHEMAS = [\n",
+ " {\n",
+ " \"type\": \"function\",\n",
+ " \"function\": {\n",
+ " \"name\": \"get_current_weather\",\n",
+ " \"description\": \"Get current weather information for a location\",\n",
+ " \"parameters\": {\n",
+ " \"type\": \"object\",\n",
+ " \"properties\": {\n",
+ " \"location\": {\n",
+ " \"type\": \"string\",\n",
+ " \"description\": \"City and country (e.g., 'Paris, France')\",\n",
+ " }\n",
+ " },\n",
+ " \"required\": [\"location\"],\n",
+ " },\n",
+ " },\n",
+ " },\n",
+ " {\n",
+ " \"type\": \"function\",\n",
+ " \"function\": {\n",
+ " \"name\": \"get_weather_forecast\",\n",
+ " \"description\": \"Get weather forecast for a location on a specific date\",\n",
+ " \"parameters\": {\n",
+ " \"type\": \"object\",\n",
+ " \"properties\": {\n",
+ " \"location\": {\n",
+ " \"type\": \"string\",\n",
+ " \"description\": \"City and country (e.g., 'London, UK')\",\n",
+ " },\n",
+ " \"date\": {\n",
+ " \"type\": \"string\",\n",
+ " \"description\": \"Date in YYYY-MM-DD format\",\n",
+ " },\n",
+ " },\n",
+ " \"required\": [\"location\", \"date\"],\n",
+ " },\n",
+ " },\n",
+ " },\n",
+ "]\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "f48298c3",
+ "metadata": {},
+ "source": [
+ "# Implement a Function Calling app"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "id": "7c4b21dc",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "SYSTEM_PROMPT = \"\"\"\n",
+ "You are a helpful assistant that can answer questions and help with tasks.\n",
+ "\"\"\""
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "775ae07e",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def process_function_calls(response_message, messages):\n",
+ " \"\"\"Process function calls and return updated messages.\"\"\"\n",
+ " if not response_message.tool_calls:\n",
+ " return messages, False\n",
+ "\n",
+ " # Add assistant's response to messages\n",
+ " messages.append(response_message)\n",
+ "\n",
+ " # Process each tool call\n",
+ " for tool_call in response_message.tool_calls:\n",
+ " function_name = tool_call.function.name\n",
+ " function_args = json.loads(tool_call.function.arguments)\n",
+ "\n",
+ " print(f\"🔧 Calling: {function_name}\")\n",
+ " print(f\"📝 Args: {function_args}\")\n",
+ "\n",
+ " # Call the function\n",
+ " if function_name in AVAILABLE_FUNCTIONS:\n",
+ " func = AVAILABLE_FUNCTIONS[function_name]\n",
+ " result = func(**function_args)\n",
+ " print(f\"✅ Result: {result}\")\n",
+ "\n",
+ " # Add function result to messages\n",
+ " messages.append(\n",
+ " {\n",
+ " \"tool_call_id\": tool_call.id,\n",
+ " \"role\": \"tool\",\n",
+ " \"name\": function_name,\n",
+ " \"content\": json.dumps(result),\n",
+ " }\n",
+ " )\n",
+ " else:\n",
+ " print(f\"❌ Function {function_name} not found\")\n",
+ "\n",
+ " return messages, True\n",
+ "\n",
+ "\n",
+ "def chat_with_functions(user_message, client, model) -> str:\n",
+ " \"\"\"Unified function calling handler for both OpenAI and HF clients.\"\"\"\n",
+ " messages = [\n",
+ " {\"role\": \"system\", \"content\": SYSTEM_PROMPT},\n",
+ " {\"role\": \"user\", \"content\": user_message},\n",
+ " ]\n",
+ "\n",
+ " # Initial API call\n",
+ " response = client.chat.completions.create(\n",
+ " model=model,\n",
+ " messages=messages,\n",
+ " tools=TOOL_SCHEMAS,\n",
+ " tool_choice=\"auto\",\n",
+ " )\n",
+ "\n",
+ " response_message = response.choices[0].message\n",
+ "\n",
+ " # Process function calls if any\n",
+ " messages, had_tool_calls = process_function_calls(response_message, messages)\n",
+ "\n",
+ " if had_tool_calls:\n",
+ " # Get final response after function calls\n",
+ " final_response = client.chat.completions.create(\n",
+ " model=model,\n",
+ " messages=messages,\n",
+ " tools=TOOL_SCHEMAS,\n",
+ " tool_choice=\"auto\",\n",
+ " )\n",
+ " final_content = final_response.choices[0].message.content\n",
+ " else:\n",
+ " final_content = response_message.content\n",
+ "\n",
+ " return final_content\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "id": "8b26419b",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "client = OpenAI(\n",
+ " api_key=HF_TOKEN,\n",
+ " base_url=\"https://router.huggingface.co/groq/openai/v1\",\n",
+ ")\n",
+ "\n",
+ "if False:\n",
+ " # Initialize HF client with inference provider\n",
+ " client = InferenceClient(provider=\"groq\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "c410bafc",
+ "metadata": {},
+ "source": [
+ "# Demo!"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "id": "32ee9713",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "🔧 Calling: get_current_weather\n",
+ "📝 Args: {'location': 'Berlin, Germany'}\n",
+ "✅ Result: {'location': 'Berlin, Germany', 'temperature': '22°C', 'condition': 'Sunny', 'humidity': '65%', 'wind_speed': '5 km/h'}\n"
+ ]
+ }
+ ],
+ "source": [
+ "query = \"What's the current weather in Berlin?\"\n",
+ "\n",
+ "response = chat_with_functions(\n",
+ " user_message=query,\n",
+ " client=client,\n",
+ " model=\"moonshotai/kimi-k2-instruct\",\n",
+ ")"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": ".venv",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.11.10"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+ }
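
The overview in `function_calling.ipynb` also mentions a Hugging Face native path, but the code only stubs it behind a disabled `if False:` branch. A minimal sketch of that variant is shown below; it assumes a recent `huggingface_hub` release where `InferenceClient` exposes the OpenAI-compatible `chat.completions.create` interface (which is all the notebook's `chat_with_functions` helper relies on), and it reuses the helper and schemas defined in the notebook.

```python
import os
from huggingface_hub import InferenceClient

# Hedged sketch: swap the OpenAI client for the HF native client. Because
# chat_with_functions only calls client.chat.completions.create, the rest of
# the notebook should work unchanged (assumption: recent huggingface_hub with
# provider routing support).
hf_client = InferenceClient(provider="groq", token=os.getenv("HF_TOKEN"))

answer = chat_with_functions(
    user_message="What's the weather forecast for Paris, France on 2024-07-01?",
    client=hf_client,
    # Hub model ID; the OpenAI route in the notebook uses the provider-specific
    # lowercase ID instead.
    model="moonshotai/Kimi-K2-Instruct",
)
print(answer)
```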
structured_outputs.ipynb ADDED
@@ -0,0 +1,198 @@
+ {
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "43a342b3",
+ "metadata": {
+ "vscode": {
+ "languageId": "raw"
+ }
+ },
+ "source": [
+ "# Structured Outputs with Hugging Face Inference Providers\n",
+ "\n",
+ "This notebook demonstrates how to use structured outputs with both OpenAI-compatible and Hugging Face native clients using Hugging Face Inference Providers.\n",
+ "\n",
+ "## Overview\n",
+ "- **OpenAI-Compatible**: Use familiar OpenAI structured outputs with HF Inference Providers\n",
+ "- **Hugging Face Native**: Use HF's native InferenceClient with JSON schema validation\n",
+ "- **Shared Models**: Reusable Pydantic models and schemas across both approaches\n",
+ "- **Guaranteed Structure**: Ensure responses match your defined schemas\n",
+ "\n",
+ "## Installation\n",
+ "\n",
+ "First, install the required dependencies:\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "id": "7071d771",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# %pip install openai huggingface-hub pydantic python-dotenv"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "7323b5fb",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import os\n",
+ "import json\n",
+ "from typing import Dict, Any, List, Optional\n",
+ "from openai import OpenAI\n",
+ "from huggingface_hub import InferenceClient\n",
+ "from pydantic import BaseModel, Field\n",
+ "from dotenv import load_dotenv\n",
+ "\n",
+ "# Load environment variables\n",
+ "load_dotenv()\n",
+ "\n",
+ "# Create a shared configuration\n",
+ "HF_TOKEN = os.getenv(\"HF_TOKEN\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "abbe98f5",
+ "metadata": {},
+ "source": [
+ "# Structured Outputs Task\n",
+ "\n",
+ "Let's set up a structured output task, such as analysing a research paper and returning structured output."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 18,
+ "id": "2c1799a9",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Shared Pydantic Models and Sample Data\n",
+ "\n",
+ "# Define structured output models\n",
+ "class PaperAnalysis(BaseModel):\n",
+ " \"\"\"Analysis of a research paper.\"\"\"\n",
+ "\n",
+ " title: str = Field(description=\"The title of the paper\")\n",
+ " abstract_summary: str = Field(description=\"A concise summary of the abstract\")\n",
+ " main_contributions: List[str] = Field(description=\"Key contributions of the paper\")\n",
+ " methodology: str = Field(description=\"Brief description of the methodology used\")\n",
+ "\n",
+ "\n",
+ "# Sample data for testing\n",
+ "SAMPLE_PAPER = \"\"\"Title: Attention Is All You Need\n",
+ "\n",
+ "Abstract: The dominant sequence transduction models are based on complex recurrent \n",
+ "or convolutional neural networks that include an encoder and a decoder. The best \n",
+ "performing models also connect the encoder and decoder through an attention mechanism. \n",
+ "We propose a new simple network architecture, the Transformer, based solely on \n",
+ "attention mechanisms, dispensing with recurrence and convolutions entirely. \n",
+ "Experiments on two machine translation tasks show these models to be superior \n",
+ "in quality while being more parallelizable and requiring significantly less time to train.\n",
+ "\n",
+ "Introduction: Recurrent neural networks, long short-term memory and gated recurrent \n",
+ "neural networks in particular, have been firmly established as state of the art approaches \n",
+ "in sequence modeling and transduction problems such as language modeling and machine translation.\n",
+ "The Transformer architecture introduces multi-head attention mechanisms that allow the model\n",
+ "to jointly attend to information from different representation subspaces.\"\"\"\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "d4cd793c",
+ "metadata": {},
+ "source": [
+ "# Demo!"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "b82ca76b",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Unified Structured Output Handler\n",
+ "system_prompt = \"Analyze the research paper and extract structured information about its title, abstract, contributions, and methodology.\"\n",
+ "\n",
+ "client = OpenAI(\n",
+ " api_key=HF_TOKEN,\n",
+ " base_url=\"https://router.huggingface.co/novita/v3/openai\",\n",
+ ")\n",
+ "\n",
+ "\n",
+ "def get_structured_output(content: str) -> Any:\n",
+ " \"\"\"Get structured output using OpenAI-compatible client.\"\"\"\n",
+ "\n",
+ " messages = [\n",
+ " {\"role\": \"system\", \"content\": system_prompt},\n",
+ " {\"role\": \"user\", \"content\": content},\n",
+ " ]\n",
+ "\n",
+ " # Use OpenAI's structured output parsing\n",
+ " completion = client.beta.chat.completions.parse(\n",
+ " model=\"moonshotai/kimi-k2-instruct\",\n",
+ " messages=messages,\n",
+ " response_format=PaperAnalysis,\n",
+ " )\n",
+ "\n",
+ " return completion.choices[0].message.parsed\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 26,
+ "id": "8519e939",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "📄 Title: Attention Is All You Need\n",
+ "📝 Summary: Proposes the Transformer architecture, a sequence-to-sequence model that replaces all recurrence and convolution with attention mechanisms. Demonstrates state-of-the-art results on machine-translation benchmarks while being more parallelizable and faster to train.\n",
+ "🎯 Contributions: ['Introduces the Transformer architecture, the first transduction model built entirely on attention, eliminating recurrence and convolution.', 'Presents multi-head self-attention to jointly attend to information from different representation subspaces.', 'Shows that attention-only models outperform RNN/CNN baselines in translation quality while offering better parallelization and shorter training times.']\n",
+ "🔬 Methodology: Designs an encoder-decoder architecture composed solely of stacked self-attention and feed-forward layers. Uses multi-head scaled dot-product attention, positional encodings, and residual connections. Evaluates on WMT 2014 English-to-German and English-to-French translation tasks, comparing against previous RNN/CNN-based systems.\n"
+ ]
+ }
+ ],
+ "source": [
+ "paper_analysis = get_structured_output(\n",
+ " content=SAMPLE_PAPER,\n",
+ ")\n",
+ "\n",
+ "print(f\"📄 Title: {paper_analysis.title}\")\n",
+ "print(f\"📝 Summary: {paper_analysis.abstract_summary}\")\n",
+ "print(f\"🎯 Contributions: {paper_analysis.main_contributions}\")\n",
+ "print(f\"🔬 Methodology: {paper_analysis.methodology}\")\n"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": ".venv",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.11.10"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+ }
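
`structured_outputs.ipynb` relies on the `client.beta.chat.completions.parse` helper. When that helper is unavailable, or a provider does not accept a Pydantic class directly, a more explicit variant is to pass the JSON schema yourself and validate the raw reply with Pydantic. The sketch below reuses the notebook's `client`, `system_prompt`, `SAMPLE_PAPER`, and `PaperAnalysis`, and assumes the provider honours the OpenAI-style `json_schema` response format (support varies by provider).

```python
# Hedged alternative sketch: build response_format from the Pydantic model's
# JSON schema and validate the returned JSON string manually.
completion = client.chat.completions.create(
    model="moonshotai/kimi-k2-instruct",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": SAMPLE_PAPER},
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "PaperAnalysis",
            "schema": PaperAnalysis.model_json_schema(),
        },
    },
)

# Pydantic v2: parse and validate the JSON string in one step.
analysis = PaperAnalysis.model_validate_json(completion.choices[0].message.content)
print(analysis.title)
```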