---
license: mit
metrics:
- accuracy
tags:
- mistral
- midi
- miditok
- music
- instrument
pipeline_tag: audio-to-audio
model-index:
- name: Mistral_MidiTok_Transformer_Single_Instrument_Small
  results: []
---
# Mistral_MidiTok_Transformer_Single_Instrument_Small
This model was trained from scratch on tokenized MIDI music.
I trained a MidiTok tokenizer (REMI) on data prepared by splitting multi-track MIDI files into single tracks.
The model was then trained on a small dataset.
The architecture is a Mistral model that has been cut down considerably in size.
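Below is a rough sketch of how generation might look. The repo path, the `tokenizer.json` file name, and the `BOS_None` token name are assumptions rather than confirmed details of this checkpoint.

```python
# Minimal generation sketch (assumptions: the trained REMI config was saved as
# tokenizer.json, and the checkpoint loads as a standard causal LM; neither the
# file name nor the repo path is confirmed by this card).
import torch
from miditok import REMI
from transformers import AutoModelForCausalLM

repo = "Mistral_MidiTok_Transformer_Single_Instrument_Small"  # placeholder path / repo id

tokenizer = REMI(params="tokenizer.json")  # path to the trained REMI config (assumed name)
model = AutoModelForCausalLM.from_pretrained(repo)
model.eval()

# Seed generation with the BOS special token (miditok's default name is "BOS_None").
bos = torch.tensor([[tokenizer["BOS_None"]]])
with torch.no_grad():
    out = model.generate(bos, max_new_tokens=256, do_sample=True, temperature=0.9)

# Decode the generated ids back into a (single-track) MIDI file.
# `decode` is the miditok v3 name; older releases call this `tokens_to_midi`.
score = tokenizer.decode(out[0].tolist())
score.dump_midi("generated.mid")
```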
### What else needs to be done
Update the training setup so the model uses a smaller positional embedding size: 1024 plus a small padding amount such as 8 (see the sketch below).
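For illustration, a sketch of what that config change might look like. Every width and depth below is a made-up placeholder; only the positional budget reflects the note above.

```python
# Sketch of a cut-down MistralConfig with the proposed positional budget.
# All dimensions are illustrative; the card does not state the real ones.
from transformers import MistralConfig, MistralForCausalLM

config = MistralConfig(
    vocab_size=512,                    # assumption: size of the REMI vocabulary
    hidden_size=512,                   # "cut down" widths, illustrative only
    intermediate_size=1024,
    num_hidden_layers=8,
    num_attention_heads=8,
    num_key_value_heads=8,
    max_position_embeddings=1024 + 8,  # 1024 context plus a small padding amount
)
model = MistralForCausalLM(config)
print(f"parameters: {model.num_parameters():,}")
```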
### Training hyperparameters
The following hyperparameters were used during training (a hedged `TrainingArguments` sketch follows the list):
- learning_rate: 0.0001
- train_batch_size: 30
- eval_batch_size: 30
- seed: 444
- gradient_accumulation_steps: 3
- total_train_batch_size: 90
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine_with_restarts
- lr_scheduler_warmup_ratio: 0.3
- training_steps: 20000
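For convenience, here is the same list expressed as a `TrainingArguments` sketch; only the output directory is a placeholder.

```python
# The hyperparameters above, expressed as transformers TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mistral-miditok-small",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=30,
    per_device_eval_batch_size=30,
    seed=444,
    gradient_accumulation_steps=3,       # 30 x 3 = 90 effective train batch size
    lr_scheduler_type="cosine_with_restarts",
    warmup_ratio=0.3,
    max_steps=20_000,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```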
### Framework versions
- Transformers 4.46.2
- Pytorch 2.1.0+cu121
- Datasets 3.1.0
- Tokenizers 0.20.3