---
license: mit
datasets:
- SoundMind-RL/SoundMindDataset
language:
- en
base_model:
- Qwen/Qwen2.5-Omni-7B
pipeline_tag: audio-to-audio
---
# SoundMind Model
The SoundMind Model is an audio language model (ALM) trained from Qwen/Qwen2.5-Omni-7B with SoundMind, a rule-based reinforcement learning (RL) algorithm designed to equip ALMs with advanced bimodal (audio-text) reasoning capabilities.

[GitHub](https://github.com/xid32/SoundMind) | [Paper](https://arxiv.org/abs/2506.12935) | [Dataset](https://huggingface.co/datasets/SoundMind-RL/SoundMindDataset) | [Model](https://huggingface.co/SoundMind-RL/SoundMindModel)
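## Usage

Since the checkpoint is trained from Qwen/Qwen2.5-Omni-7B, it is expected to load through the standard Qwen2.5-Omni classes in recent `transformers` releases. The snippet below is a minimal sketch under that assumption, not the authors' official inference script: the audio path and question are placeholders, and the `qwen_omni_utils` helper (`pip install qwen-omni-utils`) is assumed for multimodal preprocessing. See the GitHub repository for the exact training and evaluation setup.

```python
# Minimal usage sketch (not the authors' official inference script). It assumes the
# checkpoint keeps the Qwen2.5-Omni interface from recent `transformers` releases;
# the audio path and question below are placeholders.
import torch
from transformers import Qwen2_5OmniForConditionalGeneration, Qwen2_5OmniProcessor
from qwen_omni_utils import process_mm_info  # helper package: pip install qwen-omni-utils

model_id = "SoundMind-RL/SoundMindModel"
model = Qwen2_5OmniForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
processor = Qwen2_5OmniProcessor.from_pretrained(model_id)

# A single-turn conversation with one audio clip and a text question (placeholders).
conversation = [
    {
        "role": "user",
        "content": [
            {"type": "audio", "audio": "example.wav"},
            {"type": "text", "text": "Is the spoken statement logically consistent? Reason step by step."},
        ],
    }
]

# Build the chat prompt and collect the multimodal inputs.
text = processor.apply_chat_template(conversation, add_generation_prompt=True, tokenize=False)
audios, images, videos = process_mm_info(conversation, use_audio_in_video=False)
inputs = processor(
    text=text, audio=audios, images=images, videos=videos,
    return_tensors="pt", padding=True,
)
inputs = inputs.to(model.device).to(model.dtype)

# Text-only decoding of the reasoning trace; spoken answers can also be generated
# via the talker, as shown in the base Qwen/Qwen2.5-Omni-7B model card.
text_ids = model.generate(**inputs, return_audio=False)
print(processor.batch_decode(text_ids, skip_special_tokens=True)[0])
```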
## Citation

If you find our work helpful, please consider citing it:

```bibtex
@article{diao2025soundmind,
  title={SoundMind: RL-Incentivized Logic Reasoning for Audio-Language Models},
  author={Diao, Xingjian and Zhang, Chunhui and Kong, Keyi and Wu, Weiyi and Ma, Chiyu and Ouyang, Zhongyu and Qing, Peijun and Vosoughi, Soroush and Gui, Jiang},
  journal={arXiv preprint arXiv:2506.12935},
  year={2025}
}
```