How to use CIRCL/vulnerability-severity-classification-russian-ruRoberta-large with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="CIRCL/vulnerability-severity-classification-russian-ruRoberta-large")

# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("CIRCL/vulnerability-severity-classification-russian-ruRoberta-large")
model = AutoModelForSequenceClassification.from_pretrained("CIRCL/vulnerability-severity-classification-russian-ruRoberta-large")
```
---
language:
- ru
license: cc-by-4.0
library_name: transformers
tags:
- text-classification
- vulnerability
- severity
- cybersecurity
- fstec
- generated_from_trainer
datasets:
- CIRCL/Vulnerability-FSTEC
base_model: ai-forever/ruRoberta-large
pipeline_tag: text-classification
---
VLAI: Automated Vulnerability Severity Classification (Russian Text)
A fine-tuned ai-forever/ruRoberta-large model that classifies Russian-language vulnerability descriptions from FSTEC by severity: low, medium, high, or critical.
Trained on the CIRCL/Vulnerability-FSTEC dataset as part of the VulnTrain project.
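When the model is loaded directly rather than through a pipeline, its raw logits still have to be turned into a severity label. The sketch below shows one way to do that in plain Python. The helper names and the `EXAMPLE_ID2LABEL` mapping are illustrative assumptions; the real index-to-label mapping should be read from the model's own configuration (`model.config.id2label`), and the logits here are made-up numbers, not model output.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits, id2label):
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


# Hypothetical mapping for illustration only; use model.config.id2label in practice.
EXAMPLE_ID2LABEL = {0: "low", 1: "medium", 2: "high", 3: "critical"}

# Made-up logits standing in for one row of model(**inputs).logits
label, score = top_label([-1.2, 0.3, 2.1, 0.9], EXAMPLE_ID2LABEL)
print(label, round(score, 3))
```

Passing the mapping as an argument keeps the helper independent of any particular label order.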
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: ADAMW_TORCH_FUSED (fused AdamW) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 5
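To make the `linear` scheduler setting concrete, the following minimal sketch computes the learning rate at a given optimizer step. It assumes zero warmup steps (the card lists none) and a total of 5835 steps, taken from the final row of the training results table; `linear_lr` is a hypothetical helper, not a function from Transformers.

```python
def linear_lr(step, base_lr=3e-5, total_steps=5835, warmup_steps=0):
    """Linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the end of each epoch (1167 steps per epoch in the results table)
for epoch in range(6):
    print(epoch, linear_lr(epoch * 1167))
```

Under these assumptions the rate falls by one fifth of its starting value per epoch and reaches zero at step 5835.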
After the final epoch (epoch 5), the model achieves the following results on the evaluation set:
- Loss: 2.6495
- Accuracy: 0.7417
- F1 Macro: 0.6650
- Low Precision: 0.6154
- Low Recall: 0.3380
- Low F1: 0.4364
- Medium Precision: 0.7619
- Medium Recall: 0.8312
- Medium F1: 0.7951
- High Precision: 0.6869
- High Recall: 0.6080
- High F1: 0.6450
- Critical Precision: 0.7678
- Critical Recall: 0.7996
- Critical F1: 0.7834
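The per-class F1 values and the macro F1 above follow from the listed precision and recall figures. As a quick consistency check, the sketch below recomputes them; because the inputs are already rounded to four decimals, the recomputed values can differ from the reported ones in the last digit.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the evaluation results above
per_class = {
    "low": (0.6154, 0.3380),
    "medium": (0.7619, 0.8312),
    "high": (0.6869, 0.6080),
    "critical": (0.7678, 0.7996),
}

f1_scores = {name: f1(p, r) for name, (p, r) in per_class.items()}

# Macro F1 is the unweighted mean of the per-class F1 scores
macro_f1 = sum(f1_scores.values()) / len(f1_scores)

for name, value in f1_scores.items():
    print(f"{name}: {value:.4f}")
print(f"macro: {macro_f1:.4f}")
```

Because macro F1 weights every class equally, the weak recall on the low class (0.3380) pulls the macro score well below the overall accuracy of 0.7417.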
Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 Macro | Low Precision | Low Recall | Low F1 | Medium Precision | Medium Recall | Medium F1 | High Precision | High Recall | High F1 | Critical Precision | Critical Recall | Critical F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3.0373 | 1.0 | 1167 | 3.0503 | 0.6895 | 0.5626 | 0.7959 | 0.1099 | 0.1931 | 0.7233 | 0.7958 | 0.7578 | 0.6083 | 0.5152 | 0.5579 | 0.6947 | 0.7954 | 0.7416 |
| 2.9084 | 2.0 | 2334 | 2.8601 | 0.7142 | 0.6048 | 0.8 | 0.1803 | 0.2943 | 0.7523 | 0.8001 | 0.7754 | 0.6923 | 0.5156 | 0.5910 | 0.6660 | 0.8807 | 0.7584 |
| 2.5937 | 3.0 | 3501 | 2.6529 | 0.7335 | 0.6349 | 0.6967 | 0.2394 | 0.3564 | 0.7565 | 0.8379 | 0.7952 | 0.7126 | 0.5411 | 0.6152 | 0.7092 | 0.8488 | 0.7727 |
| 2.5230 | 4.0 | 4668 | 2.6348 | 0.7365 | 0.6549 | 0.6170 | 0.3268 | 0.4273 | 0.7403 | 0.8568 | 0.7943 | 0.7208 | 0.5451 | 0.6207 | 0.7526 | 0.8038 | 0.7773 |
| 2.0599 | 5.0 | 5835 | 2.6495 | 0.7417 | 0.6650 | 0.6154 | 0.3380 | 0.4364 | 0.7619 | 0.8312 | 0.7951 | 0.6869 | 0.6080 | 0.6450 | 0.7678 | 0.7996 | 0.7834 |
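The table shows that the best epoch depends on the metric used to pick it. The sketch below compares two common selection criteria on a subset of the columns above. The card does not state which criterion, if any, was used to choose the published checkpoint; the reported evaluation results match the epoch 5 row.

```python
# Validation metrics copied from the training results table (subset of columns)
epochs = [
    {"epoch": 1, "val_loss": 3.0503, "accuracy": 0.6895, "f1_macro": 0.5626},
    {"epoch": 2, "val_loss": 2.8601, "accuracy": 0.7142, "f1_macro": 0.6048},
    {"epoch": 3, "val_loss": 2.6529, "accuracy": 0.7335, "f1_macro": 0.6349},
    {"epoch": 4, "val_loss": 2.6348, "accuracy": 0.7365, "f1_macro": 0.6549},
    {"epoch": 5, "val_loss": 2.6495, "accuracy": 0.7417, "f1_macro": 0.6650},
]

# Two possible selection criteria: lowest validation loss, highest macro F1
best_by_loss = min(epochs, key=lambda row: row["val_loss"])
best_by_f1 = max(epochs, key=lambda row: row["f1_macro"])

print("lowest validation loss: epoch", best_by_loss["epoch"])
print("highest macro F1: epoch", best_by_f1["epoch"])
```

Validation loss is lowest at epoch 4, while accuracy and macro F1 keep improving through epoch 5, alongside a sharp drop in training loss (2.5230 to 2.0599).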
Framework versions
- Transformers 5.5.0
- PyTorch 2.11.0+cu130
- Datasets 4.8.4
- Tokenizers 0.22.2