---
license: mit
tags:
  - generated_from_trainer
datasets:
  - squad
model-index:
  - name: squad_roberta_base
    results: []
---

# squad_roberta_base

This model is a fine-tuned version of roberta-base on the squad dataset.

## Training and evaluation data

Trained and evaluated on the `squad` dataset (SQuAD v1.1).
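
As a hedged aside (not part of the training script), the same dataset can be loaded directly with the `datasets` library. The slightly larger sample counts reported under "Training results" below come from `run_qa.py` splitting long contexts into overlapping 384-token features.

```python
# Minimal sketch: load the dataset used for fine-tuning.
# Split names and sizes are those of the Hub "squad" (SQuAD v1.1) dataset.
from datasets import load_dataset

squad = load_dataset("squad")
print(squad["train"].num_rows)       # 87,599 raw examples
print(squad["validation"].num_rows)  # 10,570 raw examples
print(squad["train"][0]["question"])
```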

## Training procedure

Trained on 16 Graphcore Mk2 IPUs using optimum-graphcore.

Command line:

```bash
python examples/question-answering/run_qa.py \
  --ipu_config_name Graphcore/roberta-base-ipu \
  --model_name_or_path roberta-base \
  --dataset_name squad \
  --do_train \
  --do_eval \
  --num_train_epochs 2 \
  --per_device_train_batch_size 4 \
  --per_device_eval_batch_size 2 \
  --pod_type pod16 \
  --learning_rate 6e-5 \
  --max_seq_length 384 \
  --doc_stride 128 \
  --seed 1984 \
  --lr_scheduler_type linear \
  --loss_scaling 64 \
  --weight_decay 0.01 \
  --warmup_ratio 0.25 \
  --logging_steps 1 \
  --save_steps -1 \
  --dataloader_num_workers 64 \
  --output_dir squad_roberta_base \
  --overwrite_output_dir \
  --push_to_hub
```
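
After fine-tuning, the checkpoint can be used like any other Transformers question-answering model. A minimal usage sketch follows; the repository id is an assumption and should be replaced with the actual Hub id of this checkpoint.

```python
# Hedged usage sketch: the repo id below is assumed from this model card's name
# and may need to be replaced; the pipeline API itself is standard Transformers.
from transformers import pipeline

model_id = "Graphcore/roberta-base-squad"  # assumption: swap in this card's actual repo id
qa = pipeline("question-answering", model=model_id)

result = qa(
    question="What hardware was the model fine-tuned on?",
    context="The model was fine-tuned on 16 Graphcore Mk2 IPUs using optimum-graphcore.",
)
print(result["answer"], result["score"])
```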

## Training hyperparameters

The following hyperparameters were used during training (an illustrative plain-PyTorch optimizer/scheduler setup is sketched after the list):

- learning_rate: 6e-05
- train_batch_size: 4
- eval_batch_size: 2
- seed: 1984
- distributed_type: IPU
- total_train_batch_size: 256
- total_eval_batch_size: 40
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.25
- num_epochs: 2.0
- training precision: Mixed Precision
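
For reference only, the values above map onto a standard optimizer and scheduler configuration roughly as follows. This is a hedged sketch in plain PyTorch/Transformers; the actual run used the `IPUTrainer` from optimum-graphcore.

```python
# Illustrative only: reproduces the hyperparameters above with plain PyTorch and
# Transformers utilities rather than the optimum-graphcore IPUTrainer.
import torch
from transformers import RobertaForQuestionAnswering, get_linear_schedule_with_warmup

model = RobertaForQuestionAnswering.from_pretrained("roberta-base")

optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=6e-5,
    betas=(0.9, 0.999),
    eps=1e-8,
    weight_decay=0.01,
)

# 88,568 training features at a global batch size of 256 over 2 epochs
# is roughly 690 optimizer steps; 25% of them are linear warmup.
num_training_steps = (88568 // 256) * 2
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=int(0.25 * num_training_steps),
    num_training_steps=num_training_steps,
)
```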

## Training results

```
***** train metrics *****
  epoch                    =        2.0
  train_loss               =     1.2528
  train_runtime            = 0:02:14.50
  train_samples            =      88568
  train_samples_per_second =   1316.952
  train_steps_per_second   =       5.13
```

```
***** eval metrics *****
  epoch            =     2.0
  eval_exact_match = 85.2696
  eval_f1          = 91.7455
  eval_samples     =   10790
```
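
`eval_exact_match` and `eval_f1` are the standard SQuAD v1.1 metrics. As a hedged aside (the original run reports them through `run_qa.py`), they can be reproduced for arbitrary predictions with the `evaluate` library:

```python
# Hedged sketch: compute SQuAD v1.1 exact-match and F1 with the `evaluate` library.
# The prediction/reference pair below is a toy input, not this model's output.
import evaluate

squad_metric = evaluate.load("squad")
predictions = [{"id": "001", "prediction_text": "16 Graphcore Mk2 IPUs"}]
references = [{
    "id": "001",
    "answers": {"text": ["16 Graphcore Mk2 IPUs"], "answer_start": [28]},
}]
print(squad_metric.compute(predictions=predictions, references=references))
# -> {'exact_match': 100.0, 'f1': 100.0}
```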

## Framework versions

- Transformers 4.18.0.dev0
- Pytorch 1.10.0+cpu
- Datasets 2.0.0
- Tokenizers 0.11.6