	---
	base_model:
	- tiiuae/Falcon3-3B-Instruct-1.58bit
	tags:
	- Repair
	- Code
	- Test
	- Test-case-generation
	- Code-Repair
	---

	# Falcon3-3B-Instruct-RL-CODE-FIX

	This repository hosts the Falcon3-3B-Instruct-RL-CODE-FIX model, a fine-tuned version of [Falcon3-3B-Instruct](https://huggingface.co/tiiuae/falcon3-3b-instruct) trained with GRPO (Group Relative Policy Optimization) to solve programming tasks in automatic program repair.

	## 🛠️ Model Purpose

	This model is designed to:
	- Understand buggy code snippets
	- Propose test cases that expose the bugs
	- Generate fixed versions of the code

	It is particularly useful for:
	- Code contests
	- Automated debugging
	- Education and code quality assurance
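
To make "test cases that expose the bugs" concrete, here is a minimal sketch of checking a program against test cases in the `test_input` / `test_output` format used by the prompt in the usage example. The helper names `run_program` and `failing_tests` are illustrative assumptions and are not part of this repository.

```python
import subprocess
import sys


def run_program(source: str, stdin_text: str, timeout: float = 5.0) -> str:
    """Run a Python program given as source text and return its stripped stdout."""
    result = subprocess.run(
        [sys.executable, "-c", source],
        input=stdin_text,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout.strip()


def failing_tests(source: str, tests: list) -> list:
    """Return the test cases whose expected output the program does not produce."""
    return [
        t for t in tests
        if run_program(source, t["test_input"]) != t["test_output"].strip()
    ]


# The buggy and fixed programs from the prompt's worked example.
BUGGY = """
def add(x, y):
    return x - y
x = int(input())
y = int(input())
print(add(x, y))
"""
FIXED = BUGGY.replace("x - y", "x + y")

TESTS = [
    {"test_input": "1\n2", "test_output": "3"},
    {"test_input": "-1\n1", "test_output": "0"},
]

print(len(failing_tests(BUGGY, TESTS)))  # number of cases the buggy program fails
print(len(failing_tests(FIXED, TESTS)))  # number of cases the fixed program fails
```

A test case "exposes" the bug when the buggy program fails it and the fixed program passes it. Because this executes arbitrary generated code, it is best run inside a sandbox.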

	## 🧠 Training Details

	- Base model: Falcon3-3B-Instruct
	- Method: GRPO
	- Dataset: Custom dataset of buggy code + test cases + fixes
	- Objective: Improve model reasoning over structured code repair tasks
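
As a rough illustration of the "group relative" part of GRPO: several completions are sampled for the same prompt, each is scored, and each score is normalized against its own group's mean and standard deviation. The sketch below shows only that normalization step. The reward values are hypothetical (this card does not describe the reward function used), and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against the mean and std of its own sampling group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption for this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four completions sampled for the same buggy program,
# for example the fraction of test cases each proposed fix passes.
rewards = [1.0, 0.5, 0.0, 0.5]
advantages = group_relative_advantages(rewards)
print([round(a, 2) for a in advantages])
```

Completions that score above their group's average receive a positive advantage and are reinforced, while those below average are discouraged. No separate value model is needed to estimate a baseline.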

	## 🚀 Usage Example

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer

	faulty = """
	def add (x, y):
	    \"\"\"return sum of x and y\"\"\"
	    return x * y
	"""


	FENCE = "`" * 3  # Markdown code-fence marker, inserted into the prompt below

	PROGRAM_REPAIR_TEMPLATE = f"""
	You are an expert in the field of software testing.
	You are given a buggy Python program, you are supposed to first generate testcases that can expose the bug,
	and then generate the corresponding fixed code. The two tasks are detailed as follows.

	1. Generate a comprehensive set of test cases to expose the bug:
	- Each test case should include an input and the expected output.
	- Output the test cases as a JSON list, where each entry is a dictionary with keys `"test_input"` and `"test_output"`.
	- Write in ```json ``` block.

	2. Provide a fixed version:
	- Write a correct Python program to fix the bug.
	- Write in ```python ``` block.
	- The code should read from standard input and write to standard output, matching the input/output format specified in the problem.

	Here is an example.
	The faulty Python program is:
	{FENCE}python
	\"\"\"Please write a Python program to sum two integer inputs\"\"\"
	def add (x, y):
	    return x - y
	x = int(input())
	y = int(input())
	print(add(x,y))
	{FENCE}

	Testcases that can expose the bug:
	{FENCE}json
	[
	{{
	\"test_input\":\"1\n2\",
	\"test_output\":\"3\"
	}},
	{{
	\"test_input\":\"-1\n1\",
	\"test_output\":\"0\"
	}},
	{{
	\"test_input\":\"-1\n2\",
	\"test_output\":\"1\"
	}}
	]
	{FENCE}

	Fixed code:
	{FENCE}python
	def add (x, y):
	    return x + y
	x = int(input())
	y = int(input())
	print(add(x,y))
	{FENCE}

	Now, you are given a faulty Python function, please return:
	1. Testcases that helps expose the bug.
	2. Fixed code that can pass all testcases.

	The faulty function is:
	{FENCE}python
	{faulty}
	{FENCE}
	<|assistant|>
	"""

	# Load the fine-tuned model and its tokenizer from the Hugging Face Hub.
	model = AutoModelForCausalLM.from_pretrained(
	    "Neo111x/Falcon3-3B-Instruct-RL-CODE-RL",
	    trust_remote_code=True,
	)
	tokenizer = AutoTokenizer.from_pretrained(
	    "Neo111x/Falcon3-3B-Instruct-RL-CODE-RL",
	    trust_remote_code=True,
	)

	# Wrap the prompt in the model's chat format and tokenize it.
	messages = [
	    {"role": "user", "content": PROGRAM_REPAIR_TEMPLATE}
	]
	text = tokenizer.apply_chat_template(
	    messages,
	    tokenize=False,
	)
	model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

	# Generate, then drop the prompt tokens so only the completion is decoded.
	generated_ids = model.generate(
	    **model_inputs,
	    max_new_tokens=512,
	)
	generated_ids = [
	    output_ids[len(input_ids):]
	    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
	]

	response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
	print(response)
	```
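
The prompt asks for test cases in a `json` block followed by fixed code in a `python` block, so the printed response usually needs to be split before use. The following is a minimal sketch of one way to do that. The helper names `extract_block` and `parse_response` are illustrative assumptions, and the sample response is hand-written for the example rather than real model output.

```python
import json
import re
from typing import List, Optional, Tuple

FENCE = "`" * 3  # Markdown code-fence marker


def extract_block(response: str, lang: str) -> Optional[str]:
    """Return the body of the first fenced block tagged `lang`, or None if absent."""
    pattern = re.escape(FENCE + lang) + r"\s*\n(.*?)" + re.escape(FENCE)
    match = re.search(pattern, response, flags=re.DOTALL)
    return match.group(1).strip() if match else None


def parse_response(response: str) -> Tuple[List[dict], Optional[str]]:
    """Split a response into (test cases, fixed code); either part may be missing."""
    raw_tests = extract_block(response, "json")
    try:
        # strict=False tolerates raw newlines inside JSON strings.
        tests = json.loads(raw_tests, strict=False) if raw_tests else []
    except json.JSONDecodeError:
        tests = []  # the JSON block was malformed
    return tests, extract_block(response, "python")


# Hand-written sample in the format the prompt requests.
sample = (
    "Testcases that can expose the bug:\n"
    f"{FENCE}json\n"
    '[{"test_input": "1\\n2", "test_output": "3"}]\n'
    f"{FENCE}\n"
    "Fixed code:\n"
    f"{FENCE}python\n"
    "def add(x, y):\n    return x + y\n"
    f"{FENCE}\n"
)
tests, fixed_code = parse_response(sample)
print(tests)
print(fixed_code)
```

Both parts are treated as optional because a small model can omit a block, emit invalid JSON, or be cut off by `max_new_tokens` before closing a fence.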