Nellyw888/VeriReason-Qwen2.5-3b-RTLCoder-Verilog-GRPO-reasoning-tb

Reinforcement Learning

text-generation

text-generation-inference


Nellyw888 committed on May 20

Commit da6a8a0 · verified · 1 Parent(s): da3ed43

Update README.md

Files changed (1)

README.md +15 -0

@@ -16,6 +16,8 @@ base_model:
 For implementation details, visit our GitHub repository: [VeriReason](https://github.com/NellyW8/VeriReason)
+Check out our paper: [VeriReason: Reinforcement Learning with Testbench Feedback for Reasoning-Enhanced Verilog Generation](https://arxiv.org/abs/2505.11849)
 ## Update Log
 2025.05.17: Initial release of VeriReason-Qwen2.5-3B-Verilog-RTL-GRPO-reasoning-tb
@@ -80,6 +82,19 @@ The GRPO (Generative Reinforcement Learning from Preference Optimization) traini
    ```
 ## Citation
+Please cite our paper if you use our model or dataset:
+```bibtex
+@misc{wang2025verireasonreinforcementlearningtestbench,
+      title={VeriReason: Reinforcement Learning with Testbench Feedback for Reasoning-Enhanced Verilog Generation},
+      author={Yiting Wang and Guoheng Sun and Wanghao Ye and Gang Qu and Ang Li},
+      year={2025},
+      eprint={2505.11849},
+      archivePrefix={arXiv},
+      primaryClass={cs.AI},
+      url={https://arxiv.org/abs/2505.11849},
+}
+```
 ## Acknowledgement
 This repo benefits from OpenR1 and LLamaFactory.