metadata

license: apache-2.0
tags:
  - vision
  - image-classification
  - medical
  - glomerulus
  - pathology
  - pytorch
  - transformers
  - vision-transformer
datasets:
  - custom
metrics:
  - accuracy
  - f1
model-index:
  - name: vision-transformer-glomerulus-classifier
    results:
      - task:
          type: image-classification
          name: Image Classification
        dataset:
          name: Glomerulus Classification Dataset
          type: custom
        metrics:
          - type: accuracy
            value: 0.9491
          - type: f1
            value: 0.9493

Vision Transformer for Glomerulus Classification

This model is a Vision Transformer Large (ViT-L/16) fine-tuned to classify renal glomerulus images as normal or crescentic across five histological stains (AZAN, H&E, PAMS, PAS, PICRO).

Model Performance

Accuracy: 0.9491 (94.91%) on the 1,376-image test set
F1-Score: 0.9493 (94.93%), the support-weighted average of the per-class F1 scores

Supported Classes

The model classifies glomeruli into 10 classes:

Normal:

Normal_AZAN - Normal glomerulus with Azan staining
Normal_HE - Normal glomerulus with H&E staining
Normal_PAMS - Normal glomerulus with PAMS staining
Normal_PAS - Normal glomerulus with PAS staining
Normal_PICRO - Normal glomerulus with Picro staining

Crescentic:

Crescente_AZAN - Crescentic glomerulus with Azan staining
Crescente_HE - Crescentic glomerulus with H&E staining
Crescente_PAMS - Crescentic glomerulus with PAMS staining
Crescente_PAS - Crescentic glomerulus with PAS staining
Crescente_PICRO - Crescentic glomerulus with Picro staining
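
The label names above follow a `<condition>_<stain>` pattern, so a predicted label can be split into its two parts. The snippet below is a minimal sketch of that idea; `parse_label` and `CONDITION_NAMES` are hypothetical helpers introduced here for illustration and are not part of the model repository.

```python
# Hypothetical helper: split a class label such as "Crescente_PAS" into
# (condition, stain). Assumes every label follows "<condition>_<stain>".
CONDITION_NAMES = {"Normal": "normal", "Crescente": "crescentic"}


def parse_label(label: str) -> tuple[str, str]:
    """Return (condition in English, stain) for one class label."""
    condition, stain = label.split("_", 1)
    if condition not in CONDITION_NAMES:
        raise ValueError(f"unexpected condition prefix: {condition!r}")
    return CONDITION_NAMES[condition], stain


print(parse_label("Crescente_PAS"))  # ('crescentic', 'PAS')
```

This can be handy when only the condition matters and the stain is already known from the slide.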

How to Use

from transformers import AutoImageProcessor, AutoModelForImageClassification
from PIL import Image
import torch

# Load model and processor
model_name = "CleitonOERocha/vision-transformer-glomerulus-classifier"
processor = AutoImageProcessor.from_pretrained(model_name)
model = AutoModelForImageClassification.from_pretrained(model_name)

# Load the image; convert to RGB so grayscale or RGBA files match the 3-channel input
image = Image.open("your_image.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

# Make prediction
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_class_id = predictions.argmax().item()
    predicted_class = model.config.id2label[predicted_class_id]
    confidence = predictions.max().item()

print(f"Predicted class: {predicted_class}")
print(f"Confidence: {confidence:.4f}")
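
Beyond the single top prediction, it may be useful to look at the top-k classes or at the total probability assigned to the crescentic classes regardless of stain. The following is a sketch under stated assumptions: `softmax`, `top_k`, and `crescentic_probability` are hypothetical helpers written in plain Python, and the toy logits and 4-class mapping are made up for illustration. With the real model, one would presumably pass `outputs.logits[0].tolist()` and `model.config.id2label` instead.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a plain list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, id2label, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return [(id2label[i], p) for i, p in ranked[:k]]


def crescentic_probability(logits, id2label):
    """Sum the probability of every class whose label starts with 'Crescente_'."""
    probs = softmax(logits)
    return sum(p for i, p in enumerate(probs) if id2label[i].startswith("Crescente_"))


# Toy data (made up): a 4-class mapping and arbitrary logits.
toy_id2label = {0: "Crescente_HE", 1: "Crescente_PAS", 2: "Normal_HE", 3: "Normal_PAS"}
toy_logits = [2.0, 0.5, 1.0, -1.0]

for label, prob in top_k(toy_logits, toy_id2label, k=2):
    print(f"{label}: {prob:.4f}")
print(f"P(crescentic) = {crescentic_probability(toy_logits, toy_id2label):.4f}")
```

Summing over stains is only one possible way to obtain a binary normal/crescentic score; it is not something the model card itself prescribes.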

Technical Details

Architecture: Vision Transformer Large (ViT-L/16)
Base Model: google/vit-large-patch16-224-in21k
Input Size: 224x224 pixels
Patch Size: 16x16
Parameters: ~300M
Framework: PyTorch + Transformers
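
As a quick check on how the input size and patch size relate, the arithmetic below derives the number of patches per image. It is a small sketch; the extra classification token is an assumption based on the standard ViT design rather than something stated in this card.

```python
# Values taken from the Technical Details list above.
image_size = 224
patch_size = 16

patches_per_side = image_size // patch_size  # 224 / 16 = 14
num_patches = patches_per_side ** 2          # 14 * 14 = 196

# Assumption: one [CLS] token is prepended, as in the standard ViT design.
sequence_length = num_patches + 1

print(patches_per_side, num_patches, sequence_length)  # 14 196 197
```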

Model Comparison

Accuracy in our benchmark:

Vision Transformer Large: 94.91% (this model)
Vision Transformer Base: 93.53%
ResNet-50: 50.58%

Performance by Class

Class	Precision	Recall	F1-Score	Support
Crescente_AZAN	0.9500	0.9344	0.9421	61
Crescente_HE	0.9665	0.9914	0.9788	233
Crescente_PAMS	0.9737	0.9487	0.9610	78
Crescente_PAS	0.9607	0.9661	0.9634	177
Crescente_PICRO	0.7179	0.6364	0.6747	44
Normal_AZAN	0.8714	0.9683	0.9173	63
Normal_HE	0.9908	0.9908	0.9908	434
Normal_PAMS	0.9785	0.9681	0.9733	94
Normal_PAS	0.9835	0.8151	0.8914	146
Normal_PICRO	0.6667	0.9565	0.7857	46
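
The headline metrics can be recovered from the per-class table. The sketch below copies the rows above and recomputes overall accuracy (from recall and support) and the support-weighted F1; the macro average is included only for comparison and is not a figure reported in this card. Rounding `recall * support` to recover the per-class count of correct predictions is an assumption that holds because recall is given to four decimals.

```python
# (class, precision, recall, f1, support), copied from the table above.
rows = [
    ("Crescente_AZAN", 0.9500, 0.9344, 0.9421, 61),
    ("Crescente_HE", 0.9665, 0.9914, 0.9788, 233),
    ("Crescente_PAMS", 0.9737, 0.9487, 0.9610, 78),
    ("Crescente_PAS", 0.9607, 0.9661, 0.9634, 177),
    ("Crescente_PICRO", 0.7179, 0.6364, 0.6747, 44),
    ("Normal_AZAN", 0.8714, 0.9683, 0.9173, 63),
    ("Normal_HE", 0.9908, 0.9908, 0.9908, 434),
    ("Normal_PAMS", 0.9785, 0.9681, 0.9733, 94),
    ("Normal_PAS", 0.9835, 0.8151, 0.8914, 146),
    ("Normal_PICRO", 0.6667, 0.9565, 0.7857, 46),
]

total = sum(support for *_, support in rows)

# Correct predictions per class = recall * support, rounded to an integer.
correct = sum(round(recall * support) for _, _, recall, _, support in rows)
accuracy = correct / total

# Weighted F1 weights each class by its support; macro F1 treats classes equally.
weighted_f1 = sum(f1 * support for _, _, _, f1, support in rows) / total
macro_f1 = sum(f1 for _, _, _, f1, _ in rows) / len(rows)

print(f"test images: {total}")            # test images: 1376
print(f"accuracy:    {accuracy:.4f}")     # accuracy:    0.9491
print(f"weighted F1: {weighted_f1:.4f}")  # weighted F1: 0.9493
print(f"macro F1:    {macro_f1:.3f}")
```

The gap between the weighted and macro averages comes mostly from the two PICRO classes, which have the lowest F1 scores and the smallest support.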

Dataset

The model was trained on a custom dataset of renal glomerulus images with:

Total: 2,759 images
Training: 1,383 images
Test: 1,376 images
Stains: 5 types (AZAN, H&E, PAMS, PAS, PICRO)
Conditions: Normal vs Crescentic
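
To make the split and the class balance concrete, the sketch below computes the train/test shares from the counts above and aggregates the test-set support by stain and by condition. The per-class counts are the Support column of the Performance by Class table, so they describe the test set only; the training-set distribution is not given in this card.

```python
from collections import Counter

total, train, test = 2759, 1383, 1376
assert train + test == total
print(f"train share: {train / total:.1%}, test share: {test / total:.1%}")
# train share: 50.1%, test share: 49.9%

# Test-set support per class, from the Performance by Class table.
support = {
    "Crescente_AZAN": 61, "Crescente_HE": 233, "Crescente_PAMS": 78,
    "Crescente_PAS": 177, "Crescente_PICRO": 44,
    "Normal_AZAN": 63, "Normal_HE": 434, "Normal_PAMS": 94,
    "Normal_PAS": 146, "Normal_PICRO": 46,
}

by_stain, by_condition = Counter(), Counter()
for label, count in support.items():
    condition, stain = label.split("_", 1)
    by_stain[stain] += count
    by_condition[condition] += count

print(dict(by_stain))
print(dict(by_condition))
```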

Limitations

Trained specifically for renal glomeruli
Requires histology-quality images
Does not replace professional medical diagnosis
Performance may vary with different equipment/protocols

Note: This model is for research and educational purposes. It should not be used as the sole source for medical diagnosis.