---
license: mit
metrics:
- accuracy
tags:
- mistral
- midi
- miditok
- music
- instrument
pipeline_tag: audio-to-audio
model-index:
- name: Mistral_MidiTok_Transformer_Single_Instrument_Small
  results: []
---

# Mistral_MidiTok_Transformer_Single_Instrument_Small

This model was trained from scratch on tokenized MIDI music. I trained a MidiTok (REMI) tokenizer, and built the training data by splitting multi-track MIDI files into single tracks.
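
As a minimal sketch of that tokenization step (assuming MidiTok v3's API; the file path is a placeholder, not part of this repo), a multi-track MIDI file tokenized with the default REMI configuration yields one token sequence per track:

```python
# Hedged sketch: REMI tokenization with MidiTok. With the default config
# (no program tokens), a multi-track file tokenizes to one sequence per track.
from pathlib import Path

from miditok import REMI, TokenizerConfig

tokenizer = REMI(TokenizerConfig())      # default REMI vocabulary
per_track = tokenizer(Path("song.mid"))  # placeholder path; one TokSequence per track
for seq in per_track:
    print(len(seq.ids), seq.tokens[:8])  # token ids and their readable forms
```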

I then trained it on a small dataset. It uses a Mistral architecture that has been cut down quite a bit.

### What else needs to be done
Update the model training to use smaller positional embeddings: 1024 plus a small padding amount such as 8 (see the sketch below).
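
As a rough illustration only: a cut-down `MistralConfig` along these lines might look like the following. The widths and layer counts are assumptions for the sketch, not the exact values of this checkpoint; `max_position_embeddings` reflects the 1024 + 8 idea above.

```python
# Hedged sketch of a cut-down Mistral config; all sizes below are
# illustrative assumptions, not the exact values of this checkpoint.
from transformers import MistralConfig, MistralForCausalLM

config = MistralConfig(
    vocab_size=2000,                   # set to the MidiTok tokenizer's vocab size
    hidden_size=512,                   # assumed reduced width
    intermediate_size=2048,
    num_hidden_layers=8,               # assumed reduced depth
    num_attention_heads=8,
    num_key_value_heads=4,
    max_position_embeddings=1024 + 8,  # 1024 context plus a small padding margin
)
model = MistralForCausalLM(config)
print(model.num_parameters())
```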

### Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):
- learning_rate: 0.0001
- train_batch_size: 30
- eval_batch_size: 30
- seed: 444
- gradient_accumulation_steps: 3
- total_train_batch_size: 90
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine_with_restarts
- lr_scheduler_warmup_ratio: 0.3
- training_steps: 20000
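
As a hedged sketch, the list above maps roughly onto `transformers.TrainingArguments` as follows; the output directory is a placeholder, and the betas/epsilon are the `adamw_torch` defaults:

```python
# Rough mapping of the hyperparameters above onto TrainingArguments.
# betas=(0.9, 0.999) and epsilon=1e-08 are the adamw_torch defaults.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",               # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=30,
    per_device_eval_batch_size=30,
    seed=444,
    gradient_accumulation_steps=3,  # total train batch size: 30 * 3 = 90
    optim="adamw_torch",
    lr_scheduler_type="cosine_with_restarts",
    warmup_ratio=0.3,
    max_steps=20_000,
)
```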

### Framework versions

- Transformers 4.46.2
- Pytorch 2.1.0+cu121
- Datasets 3.1.0
- Tokenizers 0.20.3