---
license: apache-2.0
library_name: peft
tags:
- generated_from_trainer
base_model: distilbert-base-multilingual-cased
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: multilabel_lora_distilbert_runews_classifier_tuned
  results: []
datasets:
- pyteach237/news_classify
language:
- ru
- fr
- en
---

# Model Card: DistilBERT with LoRA for Text Classification

## Model Details

**Model Name:** DistilBERT with LoRA for Text Classification  
**Model Type:** Transformer-based Language Model  
**Base Model:** `distilbert-base-multilingual-cased`  
**Fine-tuning Framework:** LoRA (Low-Rank Adaptation of Large Language Models)  
**Trained By:** ABODO Brice Donald  
**License:** Apache 2.0

This model is a fine-tuned version of [distilbert-base-multilingual-cased](https://huggingface.co/distilbert-base-multilingual-cased) on the [pyteach237/news_classify](https://huggingface.co/datasets/pyteach237/news_classify) dataset.
It achieves the following results on the evaluation set:
- Loss: 1.0019
- Accuracy: 0.8276
- F1: 0.8284
- Precision: 0.8317
- Recall: 0.8276

## Model description

This model is a fine-tuned version of `distilbert-base-multilingual-cased` for text classification tasks. The model has been adapted using LoRA (Low-Rank Adaptation) to efficiently train on the target dataset with fewer parameters, allowing for better performance with less computational resources.
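
As a rough sketch of how the adapter is attached, the snippet below wraps the base classifier in a PEFT `LoraConfig`. The rank, alpha, dropout, and target modules shown here are illustrative assumptions, not necessarily the values used to train this checkpoint:

```python
from transformers import DistilBertForSequenceClassification
from peft import LoraConfig, TaskType, get_peft_model

# Base multilingual DistilBERT with a 3-class classification head
base_model = DistilBertForSequenceClassification.from_pretrained(
    "distilbert-base-multilingual-cased",
    num_labels=3,
)

# Illustrative LoRA configuration; r, lora_alpha, lora_dropout and
# target_modules are assumptions, not the values used for this checkpoint
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["q_lin", "v_lin"],  # DistilBERT attention projections
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the LoRA matrices (and head) are trained
```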

## Intended uses & limitations

The model was trained and evaluated on a Russian-language news dataset in which each text is labeled as positive, negative, or neutral. The dataset is split into training and test sets for evaluation.

### Intended Use

This model is intended for text classification tasks, particularly multilabel sentiment analysis. It can be fine-tuned further for other classification tasks by using appropriate datasets and modifying the number of labels.

### Limitations and Risks

- **Bias:** The model may inherit biases present in the training data.
- **Generalization:** Performance may vary on datasets with different distributions from the training data.
- **Resource Usage:** Although more efficient than larger models, fine-tuning and inference still require significant computational resources.

## Training and evaluation data

The model was evaluated on the held-out test split using the following metrics:

- **Accuracy:** Measures the fraction of correct predictions.
- **F1 Score:** Harmonic mean of precision and recall.
- **Precision:** Proportion of positive identifications that are actually correct.
- **Recall:** Proportion of actual positives that are correctly identified.
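
For reference, numbers like these are typically produced by a `compute_metrics` hook passed to the Hugging Face `Trainer`. The sketch below is a minimal version of such a hook; the weighted averaging is an assumption rather than something stated in this card:

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Minimal compute_metrics hook of the kind passed to the HF Trainer;
# "weighted" averaging is an assumption, not confirmed by the training code.
def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="weighted", zero_division=0
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1,
        "precision": precision,
        "recall": recall,
    }
```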

## Training procedure

### Preprocessing

- Tokenization: The text data was tokenized using the `DistilBertTokenizer` with a maximum length of 512 tokens.
- Padding and Truncation: Applied to ensure uniform input size.
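
A minimal sketch of this preprocessing, assuming the dataset stores its input in a `text` column (that column name is an assumption):

```python
from transformers import DistilBertTokenizer

tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-multilingual-cased")

# Tokenize, truncate and pad every example to a fixed length of 512 tokens
def preprocess(batch):
    return tokenizer(
        batch["text"],
        truncation=True,
        padding="max_length",
        max_length=512,
    )
```

It can then be applied to a `datasets.Dataset` with `dataset.map(preprocess, batched=True)`.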

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0009143508688456378
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 7
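
For illustration, these values map onto a `TrainingArguments` object roughly as follows; `output_dir` and the epoch-level evaluation strategy are assumptions, since the actual training script is not shown:

```python
from transformers import TrainingArguments

# Reconstructed from the hyperparameters listed above; output_dir and
# evaluation_strategy are assumptions, not taken from the training script.
training_args = TrainingArguments(
    output_dir="multilabel_lora_distilbert_runews_classifier_tuned",
    learning_rate=0.0009143508688456378,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    num_train_epochs=7,
    lr_scheduler_type="linear",
    seed=42,
    evaluation_strategy="epoch",
)
```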

### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     | Precision | Recall |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|:---------:|:------:|
| No log        | 1.0   | 91   | 0.5987          | 0.7634   | 0.7621 | 0.7648    | 0.7634 |
| No log        | 2.0   | 182  | 0.3768          | 0.8693   | 0.8698 | 0.8767    | 0.8693 |
| No log        | 3.0   | 273  | 0.2620          | 0.9065   | 0.9063 | 0.9093    | 0.9065 |
| No log        | 4.0   | 364  | 0.2427          | 0.9202   | 0.9203 | 0.9220    | 0.9202 |
| No log        | 5.0   | 455  | 0.2244          | 0.9367   | 0.9369 | 0.9387    | 0.9367 |
| 0.3641        | 6.0   | 546  | 0.2385          | 0.9491   | 0.9491 | 0.9495    | 0.9491 |
| 0.3641        | 7.0   | 637  | 0.2560          | 0.9464   | 0.9464 | 0.9465    | 0.9464 |

## How to Use
```python
import torch
from transformers import DistilBertTokenizer, DistilBertForSequenceClassification
from peft import PeftConfig, PeftModel

# Load the tokenizer and the LoRA adapter configuration
tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-multilingual-cased')
model_id = 'pyteach237/multilabel_lora_distilbert_runews_classifier_tuned'
config = PeftConfig.from_pretrained(model_id)

# Load the base model and attach the LoRA adapter
model = DistilBertForSequenceClassification.from_pretrained(
    config.base_model_name_or_path,
    num_labels=3
)
model = PeftModel.from_pretrained(model, model_id, config=config)
model.eval()

text = "Your text here :)"

# Tokenize input
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding='max_length', max_length=512)

# Make predictions
with torch.no_grad():
    outputs = model(**inputs)
predictions = outputs.logits.argmax(dim=-1)

# Convert predictions to labels
labels = ['negative', 'neutral', 'positive']
predicted_label = labels[predictions.item()]
print(f'Predicted label: {predicted_label}')
```
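
If per-class scores are more useful than a single argmax label, the logits from the example above can be converted to probabilities with a softmax (this snippet continues from the variables defined there):

```python
import torch

# Optional: per-class probabilities instead of a single predicted label
probs = torch.softmax(outputs.logits, dim=-1).squeeze().tolist()
for label, p in zip(labels, probs):
    print(f"{label}: {p:.3f}")
```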

## Acknowledgements

This model card template was inspired by the Hugging Face model cards. Special thanks to the contributors of the Hugging Face `transformers` library and the LoRA adaptation framework.

## Contact Information

For further information, please contact Brice Donald at [email protected].


### Framework versions

- PEFT 0.11.1
- Transformers 4.41.2
- Pytorch 2.1.2
- Datasets 2.19.2
- Tokenizers 0.19.1