| language: | |
| - en | |
| - fr | |
| - es | |
| - pt | |
| license: other | |
| library_name: transformers | |
| tags: | |
| - falcon3 | |
| license_name: falcon-llm-license | |
| license_link: https://falconllm.tii.ae/falcon-terms-and-conditions.html | |
| model-index: | |
| - name: Falcon3-10B-Base | |
| results: | |
| - task: | |
| type: text-generation | |
| name: Text Generation | |
| dataset: | |
| name: IFEval (0-Shot) | |
| type: HuggingFaceH4/ifeval | |
| args: | |
| num_few_shot: 0 | |
| metrics: | |
| - type: inst_level_strict_acc and prompt_level_strict_acc | |
| value: 36.48 | |
| name: strict accuracy | |
| source: | |
| url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=tiiuae/Falcon3-10B-Base | |
| name: Open LLM Leaderboard | |
| - task: | |
| type: text-generation | |
| name: Text Generation | |
| dataset: | |
| name: BBH (3-Shot) | |
| type: BBH | |
| args: | |
| num_few_shot: 3 | |
| metrics: | |
| - type: acc_norm | |
| value: 41.38 | |
| name: normalized accuracy | |
| source: | |
| url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=tiiuae/Falcon3-10B-Base | |
| name: Open LLM Leaderboard | |
| - task: | |
| type: text-generation | |
| name: Text Generation | |
| dataset: | |
| name: MATH Lvl 5 (4-Shot) | |
| type: hendrycks/competition_math | |
| args: | |
| num_few_shot: 4 | |
| metrics: | |
| - type: exact_match | |
| value: 24.77 | |
| name: exact match | |
| source: | |
| url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=tiiuae/Falcon3-10B-Base | |
| name: Open LLM Leaderboard | |
| - task: | |
| type: text-generation | |
| name: Text Generation | |
| dataset: | |
| name: GPQA (0-shot) | |
| type: Idavidrein/gpqa | |
| args: | |
| num_few_shot: 0 | |
| metrics: | |
| - type: acc_norm | |
| value: 12.75 | |
| name: acc_norm | |
| source: | |
| url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=tiiuae/Falcon3-10B-Base | |
| name: Open LLM Leaderboard | |
| - task: | |
| type: text-generation | |
| name: Text Generation | |
| dataset: | |
| name: MuSR (0-shot) | |
| type: TAUR-Lab/MuSR | |
| args: | |
| num_few_shot: 0 | |
| metrics: | |
| - type: acc_norm | |
| value: 14.17 | |
| name: acc_norm | |
| source: | |
| url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=tiiuae/Falcon3-10B-Base | |
| name: Open LLM Leaderboard | |
| - task: | |
| type: text-generation | |
| name: Text Generation | |
| dataset: | |
| name: MMLU-PRO (5-shot) | |
| type: TIGER-Lab/MMLU-Pro | |
| config: main | |
| split: test | |
| args: | |
| num_few_shot: 5 | |
| metrics: | |
| - type: acc | |
| value: 36.0 | |
| name: accuracy | |
| source: | |
| url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=tiiuae/Falcon3-10B-Base | |
| name: Open LLM Leaderboard | |
| <div align="center"> | |
| <img src="https://huggingface.co/datasets/tiiuae/documentation-images/resolve/main/general/falco3-logo.png" alt="drawing" width="500"/> | |
| </div> | |
| # Falcon3-10B-Base | |
| The **Falcon3** family of Open Foundation Models is a set of pretrained and instruction-tuned LLMs ranging from 1B to 10B parameters. | |
| This repository contains **Falcon3-10B-Base**. It achieves state-of-the-art results (at the time of release) on reasoning, language understanding, instruction following, code, and mathematics tasks. | |
| Falcon3-10B-Base supports 4 languages (English, French, Spanish, Portuguese) and a context length of up to 32K. | |
| ⚠️ **This is a raw, pretrained model, which should be further finetuned using SFT, RLHF, continued pretraining, etc. for most use cases.** | |
| ## Model Details | |
| - Architecture | |
| - Transformer-based causal decoder-only architecture | |
| - 40 decoder blocks | |
| - Grouped Query Attention (GQA) for faster inference: 12 query heads and 4 key-value heads | |
| - Wider head dimension: 256 | |
| - High RoPE base value to support long-context understanding: 1000042 | |
| - Uses SwiGLU and RMSNorm | |
| - 32K context length | |
| - 131K vocab size | |
| - Depth up-scaled from **Falcon3-7B-Base** with continual pretraining on 2 teratokens of data comprising web, code, STEM, high-quality, and multilingual sources, using 1024 H100 GPUs | |
| - Supports EN, FR, ES, PT | |
| - Developed by [Technology Innovation Institute](https://www.tii.ae) | |
| - License: TII Falcon-LLM License 2.0 | |
| - Model Release Date: December 2024 | |
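To make the attention settings listed above concrete, the following is a minimal sketch of how they translate into key-value cache size. It uses only the figures from the list (40 decoder blocks, 12 query heads, 4 key-value heads, head dimension 256). The helper name `kv_cache_bytes`, the reading of "32K" as 32,768 tokens, and the 2-byte (bfloat16) cache entries are assumptions made for illustration, not values stated by the authors.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AttentionConfig:
    """Hyperparameters copied from the Model Details list."""
    num_layers: int = 40       # decoder blocks
    num_query_heads: int = 12
    num_kv_heads: int = 4
    head_dim: int = 256


def kv_cache_bytes(cfg: AttentionConfig, num_tokens: int,
                   bytes_per_value: int = 2,
                   num_kv_heads: Optional[int] = None) -> int:
    """Bytes needed to cache keys and values for one sequence (hypothetical helper)."""
    heads = cfg.num_kv_heads if num_kv_heads is None else num_kv_heads
    # The factor 2 covers the separate key and value tensors in every layer.
    return 2 * cfg.num_layers * heads * cfg.head_dim * bytes_per_value * num_tokens


cfg = AttentionConfig()
context = 32 * 1024  # assumption: "32K" means 32,768 tokens

queries_per_kv_head = cfg.num_query_heads // cfg.num_kv_heads
gqa_bytes = kv_cache_bytes(cfg, context)
# Comparison case: every query head keeps its own key-value head.
mha_bytes = kv_cache_bytes(cfg, context, num_kv_heads=cfg.num_query_heads)

print(f"query heads sharing each KV head: {queries_per_kv_head}")
print(f"KV cache with GQA at full context: {gqa_bytes / 2**30:.1f} GiB")
print(f"KV cache without head sharing: {mha_bytes / 2**30:.1f} GiB")
```

Under these assumptions, sharing each key-value head among three query heads cuts the per-sequence cache to one third of what it would be without sharing, which is the usual motivation for GQA at long context lengths.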
| ## Getting started | |
| <details> | |
| <summary> Click to expand </summary> | |
| ```python | |
| import torch | |
| from transformers import pipeline | |
| pipe = pipeline( | |
| "text-generation", | |
| model="tiiuae/Falcon3-10B-Base", | |
| torch_dtype=torch.bfloat16, | |
| device_map="auto" | |
| ) | |
| response = pipe("Question: How many hours in one day? Answer: ") | |
| print(response[0]['generated_text']) | |
| ``` | |
| </details> | |
| <br> | |
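Because this is a base model rather than an instruction-tuned one, a few worked examples in the prompt may help steer the completion. The sketch below builds such a prompt in the same "Question: ... Answer: " layout as the example above. The helper `build_few_shot_prompt` and the example pairs are hypothetical and are not part of the Falcon3 release.

```python
def build_few_shot_prompt(examples, question):
    """Format (question, answer) pairs followed by a final open question.

    Hypothetical helper: it mirrors the "Question: ... Answer: " layout
    used in the getting-started example.
    """
    lines = [f"Question: {q} Answer: {a}" for q, a in examples]
    # The last line is left open so the model completes the answer.
    lines.append(f"Question: {question} Answer: ")
    return "\n".join(lines)


# Illustrative examples only; any short question/answer pairs would do.
examples = [
    ("How many days are in one week?", "7"),
    ("How many minutes are in one hour?", "60"),
]
prompt = build_few_shot_prompt(examples, "How many hours in one day?")
print(prompt)

# The prompt can then be passed to a text-generation pipeline, for example:
# response = pipe(prompt, max_new_tokens=8, return_full_text=False)
# print(response[0]["generated_text"])
```

Keeping `max_new_tokens` small is a design choice for short factual answers; a base model will otherwise tend to continue the pattern with further question/answer pairs.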
| ## Benchmarks | |
| The following table reports results from our internal benchmarking pipeline. | |
| - We use the [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness). | |
| - We report **raw scores**. | |
| - We use the same batch size across all models. | |
| <table border="1" style="width: 100%; text-align: center; border-collapse: collapse;"> | |
| <colgroup> | |
| <col style="width: 10%;"> | |
| <col style="width: 10%;"> | |
| <col style="width: 7%;"> | |
| <col style="width: 7%;"> | |
| <col style="width: 7%;"> | |
| <col style="background-color: rgba(80, 15, 213, 0.5); width: 7%;"> | |
| </colgroup> | |
| <thead> | |
| <tr> | |
| <th>Category</th> | |
| <th>Benchmark</th> | |
| <th>Gemma2-9B</th> | |
| <th>Yi1.5-9B</th> | |
| <th>Mistral-Nemo-Base-2407 (12B)</th> | |
| <th>Falcon3-10B-Base</th> | |
| </tr> | |
| </thead> | |
| <tbody> | |
| <tr> | |
| <td rowspan="3">General</td> | |
| <td>MMLU (5-shot)</td> | |
| <td>70.8</td> | |
| <td>69.6</td> | |
| <td>68.8</td> | |
| <td><b>73.1</b></td> | |
| </tr> | |
| <tr> | |
| <td>MMLU-PRO (5-shot)</td> | |
| <td>41.4</td> | |
| <td>39.3</td> | |
| <td>34.7</td> | |
| <td><b>42.5</b></td> | |
| </tr> | |
| <tr> | |
| <td>IFEval</td> | |
| <td>21.3</td> | |
| <td>29.1</td> | |
| <td>16.1</td> | |
| <td><b>36.4</b></td> | |
| </tr> | |
| <tr> | |
| <td rowspan="2">Math</td> | |
| <td>GSM8K (5-shot)</td> | |
| <td>69.1</td> | |
| <td>63.8</td> | |
| <td>55.3</td> | |
| <td><b>81.4</b></td> | |
| </tr> | |
| <tr> | |
| <td>MATH Lvl-5 (4-shot)</td> | |
| <td>10.5</td> | |
| <td>9.2</td> | |
| <td>4.9</td> | |
| <td><b>22.9</b></td> | |
| </tr> | |
| <tr> | |
| <td rowspan="4">Reasoning</td> | |
| <td>ARC Challenge (25-shot)</td> | |
| <td><b>67.5</b></td> | |
| <td>61.7</td> | |
| <td>64.4</td> | |
| <td>66.8</td> | |
| </tr> | |
| <tr> | |
| <td>GPQA (0-shot)</td> | |
| <td>33.4</td> | |
| <td><b>36.6</b></td> | |
| <td>28.8</td> | |
| <td>34.1</td> | |
| </tr> | |
| <tr> | |
| <td>MuSR (0-shot)</td> | |
| <td><b>45.3</b></td> | |
| <td>43.3</td> | |
| <td>39.2</td> | |
| <td>44.2</td> | |
| </tr> | |
| <tr> | |
| <td>BBH (3-shot)</td> | |
| <td>54.3</td> | |
| <td>51.3</td> | |
| <td>50.2</td> | |
| <td><b>59.7</b></td> | |
| </tr> | |
| <tr> | |
| <td rowspan="4">CommonSense Understanding</td> | |
| <td>PIQA (0-shot)</td> | |
| <td><b>83.0</b></td> | |
| <td>80.5</td> | |
| <td>82.1</td> | |
| <td>79.4</td> | |
| </tr> | |
| <tr> | |
| <td>SciQ (0-shot)</td> | |
| <td><b>97.1</b></td> | |
| <td>95.2</td> | |
| <td>95.2</td> | |
| <td>93.5</td> | |
| </tr> | |
| <tr> | |
| <td>Winogrande (0-shot)</td> | |
| <td><b>74.2</b></td> | |
| <td>72.7</td> | |
| <td>73.2</td> | |
| <td>73.6</td> | |
| </tr> | |
| <tr> | |
| <td>OpenbookQA (0-shot)</td> | |
| <td><b>47.2</b></td> | |
| <td>45.2</td> | |
| <td><b>47.2</b></td> | |
| <td>45.0</td> | |
| </tr> | |
| </tbody> | |
| </table> | |
| ## Useful links | |
| - View our [release blogpost](https://huggingface.co/blog/falcon3). | |
| - Feel free to join [our discord server](https://discord.gg/fwXpMyGc) if you have any questions or to interact with our researchers and developers. | |
| ## Technical Report | |
| Coming soon. | |
| ## Citation | |
| If the Falcon3 family of models was helpful to your work, please cite us. | |
| ``` | |
| @misc{Falcon3, | |
| title = {The Falcon 3 Family of Open Models}, | |
| url = {https://huggingface.co/blog/falcon3}, | |
| author = {Falcon-LLM Team}, | |
| month = {December}, | |
| year = {2024} | |
| } | |
| ``` | |
| # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard) | |
| Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/tiiuae__Falcon3-10B-Base-details) | |
| | Metric |Value| | |
| |-------------------|----:| | |
| |Avg. |27.59| | |
| |IFEval (0-Shot) |36.48| | |
| |BBH (3-Shot) |41.38| | |
| |MATH Lvl 5 (4-Shot)|24.77| | |
| |GPQA (0-shot) |12.75| | |
| |MuSR (0-shot) |14.17| | |
| |MMLU-PRO (5-shot) |36.00| | |
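The "Avg." row appears to be the unweighted mean of the six benchmark scores in the table above. This is an assumption about how the leaderboard aggregates, and the short check below reproduces the figure from the listed values.

```python
# Scores copied from the leaderboard table above.
scores = {
    "IFEval (0-Shot)": 36.48,
    "BBH (3-Shot)": 41.38,
    "MATH Lvl 5 (4-Shot)": 24.77,
    "GPQA (0-shot)": 12.75,
    "MuSR (0-shot)": 14.17,
    "MMLU-PRO (5-shot)": 36.00,
}

# Unweighted mean over all benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # prints 27.59
```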