Instructions for using IAAR-Shanghai/xFinder-llama38it with libraries and local apps such as vLLM, SGLang, and Docker Model Runner.

Libraries

How to use IAAR-Shanghai/xFinder-llama38it with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="IAAR-Shanghai/xFinder-llama38it")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("IAAR-Shanghai/xFinder-llama38it")
model = AutoModelForCausalLM.from_pretrained("IAAR-Shanghai/xFinder-llama38it")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
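
Because xFinder is an answer extractor rather than a general chat model, a more representative input bundles the question, the evaluated model's response, and the set of admissible answers. The sketch below shows one way to assemble such a message list. The helper name `build_extraction_messages` and the field labels in the prompt are assumptions made for illustration; they are not the official xFinder prompt template, which is documented in the repository.

```python
def build_extraction_messages(question, llm_output, answer_range):
    """Assemble a chat message list for key answer extraction.

    NOTE: the prompt wording below is a hypothetical placeholder,
    not the template that xFinder was trained with.
    """
    user_content = (
        f"Question: {question}\n"
        f"Output sentences: {llm_output}\n"
        f"Answer range: {answer_range}\n"
        "Key extracted answer:"
    )
    return [{"role": "user", "content": user_content}]


messages = build_extraction_messages(
    question="What is the capital of France? (A) Berlin (B) Paris (C) Rome",
    llm_output="Berlin is in Germany, so the answer must be (B).",
    answer_range='[["A", "Berlin"], ["B", "Paris"], ["C", "Rome"]]',
)
print(messages[0]["content"])
```

The resulting `messages` list has the same shape as the one used above, so it can be passed to `pipe(messages)` or to `tokenizer.apply_chat_template(...)` unchanged.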

Local Apps

vLLM

How to use IAAR-Shanghai/xFinder-llama38it with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "IAAR-Shanghai/xFinder-llama38it"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "IAAR-Shanghai/xFinder-llama38it",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
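
The same request can be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `send_chat_request` are hypothetical, and the base URL assumes the vLLM server started above is listening on port 8000.

```python
import json
import urllib.request


def build_chat_request(base_url, model, user_content):
    """Build a POST request for an OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_content}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the text of the first completion choice."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


request = build_chat_request(
    base_url="http://localhost:8000",  # assumed local server address
    model="IAAR-Shanghai/xFinder-llama38it",
    user_content="What is the capital of France?",
)
# Uncomment once the server is running:
# print(send_chat_request(request))
```

Building the request separately from sending it keeps the payload easy to inspect before any network call is made.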

Use Docker

docker model run hf.co/IAAR-Shanghai/xFinder-llama38it

SGLang

How to use IAAR-Shanghai/xFinder-llama38it with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "IAAR-Shanghai/xFinder-llama38it" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "IAAR-Shanghai/xFinder-llama38it",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "IAAR-Shanghai/xFinder-llama38it" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "IAAR-Shanghai/xFinder-llama38it",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'

Docker Model Runner
How to use IAAR-Shanghai/xFinder-llama38it with Docker Model Runner:
```
docker model run hf.co/IAAR-Shanghai/xFinder-llama38it
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

xFinder-llama38it

Model Details

xFinder-llama38it is a model designed to extract key answers from the outputs of large language models (LLMs). It was obtained by fine-tuning Llama3-8B-Instruct.

Developed by: IAAR
Fine-tuned from Model: Llama3-8B-Instruct

Model Sources

Repository: https://github.com/IAAR-Shanghai/xFinder
Paper: https://openreview.net/forum?id=7UqQJUKaLM

Uses

xFinder is primarily used to enhance the evaluation of LLMs by accurately extracting key answers from their outputs. It addresses the limitations of traditional regular expression (RegEx)-based extraction methods, which often fail to handle the diverse and complex outputs generated by LLMs. xFinder improves the reliability of model assessments across various tasks.
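
To make the RegEx limitation concrete, the sketch below applies one hand-written extraction rule to four differently phrased responses to the same multiple-choice question. The pattern and the sample responses are invented for illustration and are not taken from any particular evaluation framework.

```python
import re

# A typical hand-written rule: expect the phrase "answer is (X)".
PATTERN = re.compile(r"answer is \(?([A-D])\)?", re.IGNORECASE)


def regex_extract(output):
    """Return the option letter found by the rule, or None if nothing matches."""
    match = PATTERN.search(output)
    return match.group(1).upper() if match else None


outputs = [
    "The answer is (B).",                                           # rule matches
    "I would go with option B, since Paris is the capital.",        # no match
    "B. Paris",                                                     # no match
    "At first I thought the answer is (A), but it is actually (B).",  # wrong letter
]

for text in outputs:
    print(repr(regex_extract(text)), "<-", text)
```

All four responses mean "B", yet the rule recovers it only from the first one: two responses do not match at all, and the last one yields "A". Each additional pattern added to cover such cases tends to introduce new false matches, which is the gap a learned extractor is meant to close.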

Training Details

xFinder-llama38it is fine-tuned from Llama3-8B-Instruct on approximately 26.9K samples from the Key Answer Finder (KAF) dataset. The dataset covers a variety of tasks and is designed to improve the accuracy and robustness of key answer extraction. It was annotated by GPT-4 and human experts to ensure high-quality training and evaluation. For more details, see the paper and the code repository listed under Model Sources.

Evaluation

xFinder is evaluated on the fully human-annotated test and generalization sets of the KAF dataset. The results show significant improvements in extraction accuracy and robustness compared to traditional RegEx-based methods. For more details, see the paper and the code repository listed under Model Sources.
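
As a rough illustration of how extraction accuracy can be measured, the sketch below computes the share of extracted answers that match human-annotated key answers. It assumes exact match after trimming whitespace and lowercasing; the metric and normalization used in the paper may differ, and the function names and sample values are hypothetical.

```python
def normalize(answer):
    """Trim whitespace and lowercase (an assumed, simplistic normalization)."""
    return answer.strip().lower()


def extraction_accuracy(predicted, gold):
    """Fraction of extracted answers that match the annotated key answers."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    matches = sum(normalize(p) == normalize(g) for p, g in zip(predicted, gold))
    return matches / len(gold)


# Made-up values: extractor outputs versus human-annotated key answers.
predicted = ["B", " paris", "7"]
gold = ["B", "Paris", "8"]
accuracy = extraction_accuracy(predicted, gold)
print(f"extraction accuracy: {accuracy:.3f}")
```

Here two of the three extracted answers agree with the annotations, so the score is two thirds.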

Citation

@inproceedings{
    xFinder,
    title={xFinder: Large Language Models as Automated Evaluators for Reliable Evaluation},
    author={Qingchen Yu and Zifan Zheng and Shichao Song and Zhiyu Li and Feiyu Xiong and Bo Tang and Ding Chen},
    booktitle={The Thirteenth International Conference on Learning Representations},
    year={2025},
    url={https://openreview.net/forum?id=7UqQJUKaLM}
}
