How to use kiselyovd/grnti-text-classifier with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="kiselyovd/grnti-text-classifier")
```

```python
# Or load the model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("kiselyovd/grnti-text-classifier")
model = AutoModelForSequenceClassification.from_pretrained("kiselyovd/grnti-text-classifier")
```

Production-grade Russian scientific-text classifier: 28 top-level GRNTI codes.
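When loading the model directly, the raw logits have to be converted to probabilities by hand. A minimal sketch of that post-processing, using dummy logits and hypothetical label names so it runs without downloading the model (the real classifier has 28 GRNTI classes):

```python
import numpy as np

def softmax(logits):
    """Convert raw classifier logits to a probability distribution."""
    z = np.exp(logits - np.max(logits))  # subtract max for numerical stability
    return z / z.sum()

def top_k(probs, labels, k=5):
    """Return the k most probable (label, score) pairs, best first."""
    order = np.argsort(probs)[::-1][:k]
    return [(labels[i], float(probs[i])) for i in order]

# Dummy 4-class logits; label names are illustrative, not the model's real labels
logits = np.array([2.0, 0.5, -1.0, 0.1])
labels = ["PHYSICS", "CHEMISTRY", "BIOLOGY", "MATH"]
probs = softmax(logits)
print(top_k(probs, labels, k=2))
```

With the real model, `logits` would come from `model(**tokenizer(text, return_tensors="pt")).logits`.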
Main model: XLM-RoBERTa-base fine-tuned on ai-forever/ru-scibench-grnti-classification.
| Model | Top-1 | Top-5 | Macro F1 | Weighted F1 |
|---|---|---|---|---|
| FacebookAI/xlm-roberta-base | 72.4% | 96.8% | 72.3% | 72.3% |
| DeepPavlov/rubert-base-cased | 72.9% | 95.9% | 72.8% | 72.8% |
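Top-1 and top-5 accuracy as reported in the table can be computed from per-example probability rankings. A self-contained sketch on toy data (not the card's actual evaluation script):

```python
import numpy as np

def top_k_accuracy(probs, y_true, k):
    """Fraction of examples whose true label is among the k highest-scored classes."""
    topk = np.argsort(probs, axis=1)[:, ::-1][:, :k]  # indices of the k best classes per row
    hits = [y in row for y, row in zip(y_true, topk)]
    return sum(hits) / len(hits)

# Toy example: 3 samples, 4 classes
probs = np.array([
    [0.7, 0.2, 0.05, 0.05],  # true class 0 -> top-1 hit
    [0.1, 0.2, 0.6, 0.1],    # true class 1 -> top-1 miss, top-2 hit
    [0.1, 0.1, 0.1, 0.7],    # true class 3 -> top-1 hit
])
y_true = [0, 1, 3]
print(top_k_accuracy(probs, y_true, k=1))  # 2 of 3 top-1 hits
```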
```python
from transformers import pipeline

clf = pipeline("text-classification", model="kiselyovd/grnti-text-classifier", top_k=5)
# "A study of quantum electrodynamics in crystals."
clf("Исследование квантовой электродинамики в кристаллах.")
```
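With `top_k=5`, the pipeline returns a list of `{"label", "score"}` dicts per input, sorted by score (the exact nesting varies across transformers versions). A sketch of filtering out low-confidence predictions; the threshold, labels, and scores below are illustrative, not from the model card:

```python
def confident_labels(predictions, threshold=0.5):
    """Keep only the labels whose score clears the threshold."""
    return [p["label"] for p in predictions if p["score"] >= threshold]

# Shape of one pipeline(top_k=...) result; the scores here are made up
example = [
    {"label": "PHYSICS", "score": 0.91},
    {"label": "CHEMISTRY", "score": 0.04},
    {"label": "MATH", "score": 0.02},
]
print(confident_labels(example))  # → ['PHYSICS']
```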
This model is trained for top-level section classification under GRNTI (the Russian State Rubricator of Scientific and Technical Information). It has not been evaluated on text outside the Russian scientific domain and should not be used for generic multilingual classification.
Do not rely on this model for high-stakes decisions. Outputs are probabilistic and subject to training-data biases.
Dataset: ai-forever/ru-scibench-grnti-classification (MIT; 28,476 train + 2,772 test examples).
Base models evaluated: FacebookAI/xlm-roberta-base, DeepPavlov/rubert-base-cased.
Source: https://github.com/kiselyovd/grnti-text-classifier
License: MIT.