---
library_name: transformers
license: mit
datasets:
- jhu-clsp/jfleg
language:
- en
base_model:
- google-t5/t5-base
pipeline_tag: text2text-generation
---
# 📚 Model Card for Grammar Correction Model
This is a grammar correction model based on Google's T5 architecture, fine-tuned on the jhu-clsp/jfleg dataset for grammatical error correction in English. ✍️
## Model Details
This model corrects grammatical errors in English sentences. It was fine-tuned on the JFLEG dataset, which pairs ungrammatical sentences with fluent, human-written corrections.
- **Follow the Developer:** Abdul Samad Siddiqui ([@samadpls](https://github.com/samadpls)) 👨‍💻
## Uses
This model can be directly used to correct grammar and spelling mistakes in sentences. ✅
### Example Usage
Here's a basic code snippet to demonstrate how to use the model:
```python
import requests

# Hosted Inference API endpoint for this model
API_URL = "https://api-inference.huggingface.co/models/samadpls/t5-base-grammar-checker"
HEADERS = {"Authorization": "Bearer YOUR_HF_API_KEY"}  # replace with your own token

def query(payload):
    response = requests.post(API_URL, headers=HEADERS, json=payload)
    response.raise_for_status()
    return response.json()

# Inputs carry the "grammar: " task prefix used throughout this card
data = query({"inputs": "grammar: This sentences, has bads grammar and spelling!"})
print(data)
```
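On success, the hosted Inference API typically returns a JSON list of the form `[{"generated_text": "..."}]`; an `{"error": ...}` body usually means the model is still loading and the request should be retried after a short wait.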
Or run the model locally with the `transformers` library:
```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Load the model and tokenizer
model_name = "samadpls/t5-base-grammar-checker"
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

# Example input; the "grammar: " task prefix marks the task for the model
example_1 = "grammar: This sentences, has bads grammar and spelling!"

# Tokenize and generate the corrected output
inputs = tokenizer.encode(example_1, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=64)  # leave room for the full correction
corrected_sentence = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("Corrected Sentence:", corrected_sentence)
```
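Both snippets prepend the `grammar: ` task prefix to the input, following the T5 convention of marking the task in the prompt; inputs without the prefix may not be corrected reliably.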
## Training Details
The model was trained on the jhu-clsp/jfleg dataset, which pairs sentences containing grammatical errors with human-written corrections. 📖
### Training Procedure
- **Training Hardware:** Personal laptop with NVIDIA GeForce MX230 GDDR5 and 16GB RAM 💻
- **Training Time:** Approximately 1 hour ⏳
- **Hyperparameters:** None were explicitly tuned; library defaults were used. A hedged fine-tuning sketch follows this list.
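The exact training script is not included in this card. The sketch below shows one plausible way to reproduce the setup with Hugging Face `Seq2SeqTrainer` defaults; the `jhu-clsp/jfleg` field names (`sentence`, `corrections`), the `validation` split, and the `grammar: ` prefix are assumptions based on the public dataset and the usage examples above, not the author's confirmed script.

```python
# Minimal fine-tuning sketch (assumed setup, not the author's exact script).
from datasets import load_dataset
from transformers import (
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
    T5ForConditionalGeneration,
    T5Tokenizer,
)

model_name = "google-t5/t5-base"
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

# JFLEG pairs each source sentence with several human corrections;
# the public release exposes "validation" and "test" splits (assumption).
raw = load_dataset("jhu-clsp/jfleg", split="validation")

def flatten(batch):
    # Expand each sentence into one (source, correction) pair per reference,
    # prepending the "grammar: " task prefix seen in the usage examples.
    inputs, targets = [], []
    for sentence, corrections in zip(batch["sentence"], batch["corrections"]):
        for correction in corrections:
            inputs.append("grammar: " + sentence)
            targets.append(correction)
    return {"input_text": inputs, "target_text": targets}

pairs = raw.map(flatten, batched=True, remove_columns=raw.column_names)

def tokenize(batch):
    model_inputs = tokenizer(batch["input_text"], truncation=True, max_length=128)
    labels = tokenizer(text_target=batch["target_text"], truncation=True, max_length=128)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = pairs.map(tokenize, batched=True, remove_columns=pairs.column_names)

trainer = Seq2SeqTrainer(
    model=model,
    # One epoch to match the "Final Epoch: 1.0" metric; everything else default.
    args=Seq2SeqTrainingArguments(output_dir="t5-base-grammar-checker", num_train_epochs=1),
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```

JFLEG provides roughly four human corrections per source sentence, so flattening yields about four training pairs per sentence.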
### Training Logs
| Step | Training Loss | Validation Loss |
|------|---------------|-----------------|
| 1 | 0.9282 | 0.6091 |
| 2 | 0.6182 | 0.5561 |
| 3 | 0.6279 | 0.5345 |
| 4 | 0.6345 | 0.5147 |
| 5 | 0.5636 | 0.5076 |
| 6 | 0.6009 | 0.4928 |
| 7 | 0.5469 | 0.4950 |
| 8 | 0.5797 | 0.4834 |
| 9 | 0.5619 | 0.4818 |
| 10 | 0.6342 | 0.4788 |
| 11 | 0.5481 | 0.4786 |
### Final Training Metrics
- **Training Runtime:** 1508.2528 seconds ⏱️
- **Training Samples per Second:** 1.799
- **Training Steps per Second:** 0.225
- **Final Training Loss:** 0.5925
- **Final Epoch:** 1.0
## Model Card Contact
For inquiries, please contact Abdul Samad Siddiqui via GitHub. 📬