StepFun-Formalizer: Unlocking the Autoformalization Potential of LLMs through Knowledge-Reasoning Fusion


Introduction

We introduce StepFun-Formalizer, a family of large language models that translate natural-language mathematical problems into formal statements in Lean 4. By fusing formal-domain knowledge with informal-to-formal reasoning ability, StepFun-Formalizer achieves strong autoformalization performance. Under BEq verification on mainstream benchmarks, including FormalMATH-Lite, ProverBench, and CombiBench, it matches or exceeds all prior general-purpose and specialized autoformalization models of comparable scale. Please refer to our paper and code for more details.
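To make the task concrete, here is a hand-written sketch (not actual model output) of what an autoformalized Lean 4 statement can look like, using the arithmetic-progression problem from the usage example below. The theorem name my_favorite_theorem follows the prompt convention used later in this card; the exact encoding of the problem is an illustrative assumption.

```lean
import Mathlib

open Real

-- Sketch: "the minimum of |x - y| + |y - z| is 4 - 2√3" encoded via IsLeast
-- over the set of achievable values. The squares of x, y, z forming an
-- arithmetic progression with common difference 2 is written as two
-- difference equations.
theorem my_favorite_theorem :
    IsLeast {s : ℝ | ∃ x y z : ℝ, 0 ≤ x ∧ x ≤ y ∧ y ≤ z ∧ z ≤ 4 ∧
      y ^ 2 - x ^ 2 = 2 ∧ z ^ 2 - y ^ 2 = 2 ∧ s = |x - y| + |y - z|}
      (4 - 2 * Real.sqrt 3) := by
  sorry
```

Autoformalization produces only the statement; the `sorry` marks the proof obligation, which is out of scope for the formalizer itself.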

Models

Model Download
StepFun-Formalizer-7B 🤗HuggingFace
StepFun-Formalizer-32B 🤗HuggingFace

Usage

from vllm import LLM, SamplingParams
from transformers import AutoTokenizer

def get_formal_statement_prompt(informal_problem: str, header: str = "import Mathlib\n") -> str:
    prompt = "Please autoformalize the following problem in Lean 4 with a header. Use the following theorem name: my_favorite_theorem.\n\n"
    prompt += informal_problem
    prompt += f"\n\nYour code should start with:\n```Lean4\n{header}\n```\n"
    return prompt

MODEL_DIR = "stepfun-ai/StepFun-Formalizer-32B"

if __name__ == "__main__":

    system_prompt = "You are an expert in mathematics and Lean 4."
    informal_problem = "The real numbers $x, y, z$ satisfy $0 \\leq x \\leq y \\leq z \\leq 4$. If their squares form an arithmetic progression with common difference 2, determine the minimum possible value of $|x-y|+|y-z|$.\n Prove that the answer is: 4-2\\sqrt{3}"
    header = "import Mathlib\n\nopen Real\n"
    user_prompt = get_formal_statement_prompt(informal_problem, header)

    dialog = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt}
    ] 

    tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
    prompt = tokenizer.apply_chat_template(dialog, tokenize=False, add_generation_prompt=True) + "<think>"
    print(f"prompt: {prompt}")

    model = LLM(
        MODEL_DIR,
        tensor_parallel_size=8,  # 8 for the 32B model, 4 for the 7B model
    )

    sampling_params = SamplingParams(
        temperature=0.6,
        top_p=0.95,
        max_tokens=16384,
        n=1
    )

    responses = model.generate(prompt, sampling_params)
    print(f"response: {responses[0].outputs[0].text}")

License

Both the code repository and the model weights are released under the Apache License (Version 2.0).

Citation

@misc{stepfunformalizer2025,
      title={StepFun-Formalizer: Unlocking the Autoformalization Potential of LLMs through Knowledge-Reasoning Fusion}, 
      author={Yutong Wu and Di Huang and Ruosi Wan and Yue Peng and Shijie Shang and Chenrui Cao and Lei Qi and Rui Zhang and Zidong Du and Jie Yan and Xing Hu},
      year={2025},
      eprint={2508.04440},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2508.04440}, 
}