---
language:
  - ru
license: cc-by-4.0
library_name: transformers
tags:
  - text-classification
  - vulnerability
  - severity
  - cybersecurity
  - fstec
  - generated_from_trainer
datasets:
  - CIRCL/Vulnerability-FSTEC
base_model: ai-forever/ruRoberta-large
pipeline_tag: text-classification
---

VLAI: Automated Vulnerability Severity Classification (Russian Text)

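A fine-tuned ai-forever/ruRoberta-large model that classifies the severity (low, medium, high, or critical) of Russian-language vulnerability descriptions published by FSTEC (Russia's Federal Service for Technical and Export Control).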

The model was trained on the CIRCL/Vulnerability-FSTEC dataset as part of the VulnTrain project.
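A minimal inference sketch, assuming the standard Transformers `pipeline` API for text classification. This card does not state the model's Hub repository ID, so `model_id` is left as a required argument, and `classify_severity` is a hypothetical helper name. The label strings returned depend on the model's configuration and are not confirmed here.

```python
def classify_severity(texts, model_id):
    """Classify Russian vulnerability descriptions by severity.

    `model_id` is the Hub repository ID (or local path) of this model.
    It is not stated in this card, so the caller must supply it.
    """
    # Imported lazily so the helper can be defined without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    # Returns one {"label": ..., "score": ...} dict per input text.
    return classifier(list(texts))


# Illustrative call (the input sentence is a made-up example, not from the dataset):
# results = classify_severity(
#     ["Уязвимость веб-сервера позволяет нарушителю выполнить произвольный код"],
#     model_id="<repository ID of this model>",
# )
# for item in results:
#     print(item["label"], round(item["score"], 4))
```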

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 3e-05
train_batch_size: 64
eval_batch_size: 64
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9,0.999) and epsilon=1e-08, no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 5
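To make the linear schedule concrete, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps and a decay to zero at the final step (5835, taken from the training results table); neither assumption is stated in this card. `linear_lr` is a hypothetical helper, not part of the training code.

```python
BASE_LR = 3e-05      # learning_rate from the hyperparameters
TOTAL_STEPS = 5835   # final step in the training results table
NUM_EPOCHS = 5       # num_epochs from the hyperparameters


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card does not report a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS  # 1167, matching the first table row

for epoch in range(NUM_EPOCHS + 1):
    step = epoch * steps_per_epoch
    print(f"epoch {epoch}: step {step}, lr {linear_lr(step):.2e}")
```

Under these assumptions the learning rate falls by one fifth of its initial value per epoch.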

The model achieves the following results on the evaluation set (these match the final epoch, step 5835, in the training results):

Loss: 2.6495
Accuracy: 0.7417
F1 Macro: 0.6650
Low Precision: 0.6154
Low Recall: 0.3380
Low F1: 0.4364
Medium Precision: 0.7619
Medium Recall: 0.8312
Medium F1: 0.7951
High Precision: 0.6869
High Recall: 0.6080
High F1: 0.6450
Critical Precision: 0.7678
Critical Recall: 0.7996
Critical F1: 0.7834
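As a consistency check, the per-class F1 scores and the macro F1 can be recomputed from the precision and recall values listed above. The sketch assumes F1 is the harmonic mean of precision and recall and that macro F1 is the unweighted mean over the four classes. Small differences in the fourth decimal place are expected because the inputs are already rounded.

```python
# (precision, recall, reported F1) per class, copied from the results above
REPORTED = {
    "Low": (0.6154, 0.3380, 0.4364),
    "Medium": (0.7619, 0.8312, 0.7951),
    "High": (0.6869, 0.6080, 0.6450),
    "Critical": (0.7678, 0.7996, 0.7834),
}
REPORTED_F1_MACRO = 0.6650


def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


per_class_f1 = {name: f1(p, r) for name, (p, r, _) in REPORTED.items()}
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

for name, value in per_class_f1.items():
    print(f"{name}: recomputed {value:.4f}, reported {REPORTED[name][2]:.4f}")
print(f"F1 Macro: recomputed {macro_f1:.4f}, reported {REPORTED_F1_MACRO:.4f}")
```

The Low class has the weakest F1 mainly because of its recall (0.3380): roughly two thirds of low-severity items are assigned to another class.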

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1 Macro	Low Precision	Low Recall	Low F1	Medium Precision	Medium Recall	Medium F1	High Precision	High Recall	High F1	Critical Precision	Critical Recall	Critical F1
3.0373	1.0	1167	3.0503	0.6895	0.5626	0.7959	0.1099	0.1931	0.7233	0.7958	0.7578	0.6083	0.5152	0.5579	0.6947	0.7954	0.7416
2.9084	2.0	2334	2.8601	0.7142	0.6048	0.8	0.1803	0.2943	0.7523	0.8001	0.7754	0.6923	0.5156	0.5910	0.6660	0.8807	0.7584
2.5937	3.0	3501	2.6529	0.7335	0.6349	0.6967	0.2394	0.3564	0.7565	0.8379	0.7952	0.7126	0.5411	0.6152	0.7092	0.8488	0.7727
2.5230	4.0	4668	2.6348	0.7365	0.6549	0.6170	0.3268	0.4273	0.7403	0.8568	0.7943	0.7208	0.5451	0.6207	0.7526	0.8038	0.7773
2.0599	5.0	5835	2.6495	0.7417	0.6650	0.6154	0.3380	0.4364	0.7619	0.8312	0.7951	0.6869	0.6080	0.6450	0.7678	0.7996	0.7834
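The table shows that the best epoch depends on the selection metric: validation loss is lowest at epoch 4, while accuracy and macro F1 peak at epoch 5. The sketch below makes that comparison explicit using values copied from the table. `best_epoch` is a hypothetical helper, and the card does not say which criterion, if any, was used to pick the published checkpoint.

```python
# (epoch, validation loss, accuracy, F1 macro) copied from the training results table
ROWS = [
    (1, 3.0503, 0.6895, 0.5626),
    (2, 2.8601, 0.7142, 0.6048),
    (3, 2.6529, 0.7335, 0.6349),
    (4, 2.6348, 0.7365, 0.6549),
    (5, 2.6495, 0.7417, 0.6650),
]


def best_epoch(rows, column, higher_is_better):
    """Return the epoch whose value in `column` is best."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[column])[0]


best_by_loss = best_epoch(ROWS, column=1, higher_is_better=False)
best_by_accuracy = best_epoch(ROWS, column=2, higher_is_better=True)
best_by_f1 = best_epoch(ROWS, column=3, higher_is_better=True)

print("lowest validation loss: epoch", best_by_loss)
print("highest accuracy: epoch", best_by_accuracy)
print("highest F1 macro: epoch", best_by_f1)
```

Between epochs 4 and 5 the training loss drops from 2.5230 to 2.0599 while validation loss rises slightly, which may indicate the onset of overfitting even though the classification metrics still improve.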

Framework versions

Transformers 5.5.0
PyTorch 2.11.0+cu130
Datasets 4.8.4
Tokenizers 0.22.2