Falcon3-3B-Instruct-RL-CODE-FIX

This repository hosts the Falcon3-3B-Instruct-RL-CODE-FIX model, a fine-tuned version of Falcon3-3B-Instruct trained with GRPO (Group Relative Policy Optimization) to solve programming tasks in the context of automatic program repair.

🛠️ Model Purpose

This model is designed to:

  • Understand buggy code snippets
  • Propose test cases that expose the bugs
  • Generate fixed versions of the code

It is particularly useful for:

  • Code contests
  • Automated debugging
  • Education and code quality assurance

🧠 Training Details

  • Base model: Falcon3-3B-Instruct
  • Method: GRPO
  • Dataset: Custom dataset of buggy code + test cases + fixes
  • Objective: Improve the model's reasoning over structured code-repair tasks (see the reward sketch below)
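
The exact reward used for GRPO training is not published here, so the following is only a minimal sketch of one plausible reward function for this task, assuming completions follow the ```json``` / ```python``` block format requested in the prompt below; `repair_reward` and `reference_tests` are hypothetical names, not part of this repository:

```python
import json
import re
import subprocess

def repair_reward(completion: str, reference_tests: list[dict]) -> float:
    """Hypothetical GRPO reward: a small bonus for emitting well-formed JSON
    test cases, with the main signal from running the proposed fix."""
    reward = 0.0
    # Bonus if the completion contains a parseable ```json ... ``` block.
    json_block = re.search(r"```json\s*(.*?)```", completion, re.DOTALL)
    if json_block:
        try:
            json.loads(json_block.group(1))
            reward += 0.2
        except json.JSONDecodeError:
            pass
    # Main signal: execute the proposed fix on each reference test.
    code_block = re.search(r"```python\s*(.*?)```", completion, re.DOTALL)
    if code_block is None:
        return reward
    code = code_block.group(1)
    passed = 0
    for test in reference_tests:
        try:
            result = subprocess.run(
                ["python", "-c", code],
                input=test["test_input"],
                capture_output=True, text=True, timeout=5,
            )
            passed += result.stdout.strip() == test["test_output"].strip()
        except subprocess.TimeoutExpired:
            pass
    return reward + 0.8 * passed / max(len(reference_tests), 1)
```

GRPO then normalizes these scalar rewards within each group of completions sampled for the same prompt, so only the relative ranking inside a group drives the policy update.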

🚀 Usage Example

````python
from transformers import AutoModelForCausalLM, AutoTokenizer

faulty = """
def add(x, y):
    \"\"\"return sum of x and y\"\"\"
    return x * y
"""


PROGRAM_REPAIR_TEMPLATE = f"""
You are an expert in the field of software testing.
You are given a buggy Python program. First, generate test cases that can expose the bug;
then, generate the corresponding fixed code. The two tasks are detailed as follows.

1. **Generate a comprehensive set of test cases to expose the bug**:
   - Each test case should include an input and the expected output.
   - Output the test cases as a JSON list, where each entry is a dictionary with keys `"test_input"` and `"test_output"`.
   - Write them in a ```json``` block.

2. **Provide a fixed version**:
   - Write a correct Python program that fixes the bug.
   - Write it in a ```python``` block.
   - The code should read from standard input and write to standard output, matching the input/output format specified in the problem.

Here is an example.
The faulty Python program is:
```python
\"\"\"Please write a Python program to sum two integer inputs\"\"\"
def add(x, y):
    return x - y
x = int(input())
y = int(input())
print(add(x, y))
```

Test cases that can expose the bug:
```json
[
    {{
        "test_input": "1\\n2",
        "test_output": "3"
    }},
    {{
        "test_input": "-1\\n1",
        "test_output": "0"
    }},
    {{
        "test_input": "-1\\n2",
        "test_output": "1"
    }}
]
```

Fixed code:
```python
def add(x, y):
    return x + y
x = int(input())
y = int(input())
print(add(x, y))
```

Now you are given a faulty Python function. Please return:
1. **Test cases** that help expose the bug.
2. **Fixed code** that passes all test cases.

The faulty function is:
```python
{faulty}
```
"""

model = AutoModelForCausalLM.from_pretrained(
    "Neo111x/Falcon3-3B-Instruct-RL-CODE-RL",
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(
    "Neo111x/Falcon3-3B-Instruct-RL-CODE-RL",
    trust_remote_code=True,
)

messages = [
    {"role": "user", "content": PROGRAM_REPAIR_TEMPLATE}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
````
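
The response interleaves the requested ```json``` test cases and ```python``` fix. Below is a minimal sketch of post-processing, assuming the model follows the block format from the prompt; `extract_blocks` is a hypothetical helper, not part of this repository:

```python
import json
import re
import subprocess

def extract_blocks(response: str):
    """Pull the first ```json``` test-case block and the first
    ```python``` fix out of the model response."""
    tests_match = re.search(r"```json\s*(.*?)```", response, re.DOTALL)
    code_match = re.search(r"```python\s*(.*?)```", response, re.DOTALL)
    tests = json.loads(tests_match.group(1)) if tests_match else []
    code = code_match.group(1) if code_match else ""
    return tests, code

tests, fixed_code = extract_blocks(response)
for test in tests:
    result = subprocess.run(
        ["python", "-c", fixed_code],
        input=test["test_input"],
        capture_output=True, text=True, timeout=5,
    )
    ok = result.stdout.strip() == test["test_output"].strip()
    print(f"input={test['test_input']!r} expected={test['test_output']!r} passed={ok}")
```

Running the extracted fix against the extracted test cases gives a quick self-consistency check; for a real evaluation you would substitute trusted reference tests.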