Lunar-RVC-Model / README.md
IssacMosesD's picture
Update README.md
c959176 verified
---
license: mit
language:
- en
tags:
- RVC
- voice-conversion
- text-to-speech
- voice-cloning
- audio-generation
datasets:
- LJSpeech
- VCTK
metrics:
- MOS (Mean Opinion Score)
- PESQ
- STOI
base_model:
- MangioRVC/Mangio-RVC-Huggingface
pipeline_tag: audio-to-audio
---
# 🌙 LUNAR - High-Quality Female Voice RVC Model
LUNAR is a state-of-the-art **RVC (Retrieval-Based Voice Conversion) model** optimized for **female voice conversion** with studio-grade audio quality at **48kHz sampling rate**. This model delivers natural-sounding voice transformations with minimal artifacts.
## **Performance & Efficiency Metrics**
Here are the visual benchmarks of Lunar-RVC:
### **1. Training Loss Curve**
![Training Loss](./Metrics/1.jpg)
### **2. Validation Loss Curve**
![Validation Loss](./Metrics/2.jpg)
### **3. Training vs Validation Loss**
![Train vs Val Loss](./Metrics/3.jpg)
### **4. Inference Speed Comparison**
![Inference Speed](./Metrics/4.jpg)
### **5. Audio Quality Scores (MOS)**
![MOS Score](./Metrics/5.jpg)
### **6. GPU Memory Usage**
![GPU Memory](./Metrics/6.jpg)
### **7. Dataset Duration Distribution**
![Dataset Distribution](./Metrics/7.jpg)
### **8. Spectral Convergence**
![Spectral Convergence](./Metrics/8.jpg)
### **9. Model Size Comparison**
![Model Size](./Metrics/9.jpg)
### **10. Efficiency Radar Chart**
![Efficiency Radar](./Metrics/10.jpg)
---
## Key Features
- **High-Fidelity Conversion** - Produces natural, expressive female voices
- **Real-Time Ready** - Optimized for low-latency inference (<20ms/frame)
- **Pitch & Timbre Control** - Flexible voice modulation capabilities
- **48kHz Studio Quality** - Professional-grade audio output
- **Easy Integration** - Compatible with popular voice toolkits
## 📊 Model Specifications
| Parameter | Value |
|--------------------|---------------------|
| Framework | RVC v2 |
| Sample Rate | 48kHz |
| Bit Depth | 16-bit |
| Model Size | 1.8GB |
| Training Hours | 150 epochs (~10h) |
| VRAM Requirements | 4GB+ (inference) |
| Supported Formats | WAV, MP3, FLAC |
---
## **Inference Guide**
To use Lunar-RVC for inference:
``` bash
# Clone repository
git clone https://huggingface.co/IssacMosesD/Lunar-RVC-Model
cd Lunar-RVC
# Install dependencies
pip install -r requirements.txt
# Run inference
python infer.py --input input.wav --output output.wav --model Lunar-RVC.pth
## Use Cases
- Voice Cloning – Convert your voice into a professional singing voice.
- Streaming – Real-time voice conversion for content creators.
- Dubbing – High-quality voice conversion for movies & animations.
- Music Production – Transform any vocal track into a new singer’s voice.
## System Requirements
- OS: Windows / Linux
- Python: 3.8+
- GPU: NVIDIA (6GB VRAM minimum recommended)
- CUDA: 11.7+
- Torch: 1.13.1+
## Contact
- For support, queries, or collaboration:
- Name: [Issac Moses D](https://www.linkedin.com/in/issacmosesd/)
- Email: [[email protected]](mailto:[email protected])
- Hugging Face: [IssacMosesD](https://huggingface.co/IssacMosesD)
- GitHub: [Issac Moses D](https://github.com/Issac-Moses)
## Collaburator
- Name: [Dharani Karnan](https://www.linkedin.com/in/dharani-karnan-060040320/)
- Email: [[email protected]](mailto:[email protected])
- Hugging Face: [Dharani K](https://huggingface.co/dharzz188)
- GitHub: [Dharani K](https://github.com/Issac-Moses)