🌸 EfficientNet-B4 Flower Classifier
An EfficientNet-B4-based image classification model that identifies 102 flower species from the Oxford Flowers-102 dataset (94.49% top-1 test accuracy).
Model Details
Model Description
This model is built on the EfficientNet-B4 backbone with a custom classifier head, trained using a novel 6-Phase Progressive Training strategy. The training progressively increases image resolution (280px → 400px) and augmentation difficulty (None → MixUp → CutMix → Hybrid).
- Developed by: fth2745
- Model type: Image Classification (CNN)
- License: MIT
- Finetuned from: EfficientNet-B4 (ImageNet pretrained)
Performance
| Metric | Test Set | Validation Set |
|---|---|---|
| Top-1 Accuracy | 94.49% | 97.45% |
| Top-3 Accuracy | 97.61% | 98.82% |
| Top-5 Accuracy | 98.49% | 99.31% |
| Macro F1-Score | 94.75% | 97.13% |
Training Details
Training Data
Oxford Flowers-102 dataset with offline data augmentation (tier-based augmentation for class balancing).
Training Procedure
6-Phase Progressive Training
| Phase | Epochs | Resolution | Augmentation | Dropout |
|---|---|---|---|---|
| 1. Basic | 1-5 | 280×280 | Basic Preprocessing | 0.4 |
| 2. MixUp Soft | 6-10 | 320×320 | MixUp α=0.2 | 0.2 |
| 3. MixUp Hard | 11-15 | 320×320 | MixUp α=0.4 | 0.2 |
| 4. CutMix Soft | 16-20 | 380×380 | CutMix α=0.2 | 0.2 |
| 5. CutMix Hard | 21-30 | 380×380 | CutMix α=0.5 | 0.2 |
| 6. Grand Finale | 31-40 | 400×400 | Hybrid | 0.2 |
Preprocessing
- Resize → RandomCrop → HorizontalFlip → Rotation (±20°) → Affine → ColorJitter → Normalize (ImageNet)
Training Hyperparameters
- Optimizer: AdamW
- Learning Rate: 1e-3
- Weight Decay: 1e-4
- Scheduler: CosineAnnealingWarmRestarts (T_0=5, T_mult=2)
- Loss: CrossEntropyLoss (label_smoothing=0.1)
- Batch Size: 8
- Training Regime: fp16 mixed precision (AMP)
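The sketch below shows how these settings fit together in PyTorch. It is illustrative only: `build_training_setup` and `train_step` are made-up names, and the batch size of 8 would come from the `DataLoader` (not shown).

```python
import torch
from torch import nn, optim

def build_training_setup(model: nn.Module):
    """Optimizer, scheduler, loss, and AMP scaler matching the settings listed above."""
    optimizer = optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-4)
    scheduler = optim.lr_scheduler.CosineAnnealingWarmRestarts(optimizer, T_0=5, T_mult=2)
    criterion = nn.CrossEntropyLoss(label_smoothing=0.1)
    scaler = torch.cuda.amp.GradScaler()  # fp16 mixed precision (AMP)
    return optimizer, scheduler, criterion, scaler

def train_step(model, images, labels, optimizer, criterion, scaler):
    """One mixed-precision step; scheduler.step() is typically called once per epoch."""
    optimizer.zero_grad(set_to_none=True)
    with torch.cuda.amp.autocast():
        loss = criterion(model(images), labels)
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    return loss.item()
```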
🎯 6-Phase Progressive Training
Phase 1 (280px, None) ──→ Phase 2 (320px, MixUp α=0.2) ──→ Phase 3 (320px, MixUp α=0.4) ──→ Phase 4 (380px, CutMix α=0.2) ──→ Phase 5 (380px, CutMix α=0.5) ──→ Phase 6 (400px, Hybrid MixUp+CutMix)
Phase Details
| Phase | Epochs | Resolution | Technique | Alpha | Dropout | Purpose |
|---|---|---|---|---|---|---|
| 1️⃣ Basic | 1-5 | 280×280 | Basic Preprocessing | - | 0.4 | Learn fundamental features |
| 2️⃣ MixUp Soft | 6-10 | 320×320 | MixUp | 0.2 | 0.2 | Gentle texture blending |
| 3️⃣ MixUp Hard | 11-15 | 320×320 | MixUp | 0.4 | 0.2 | Strong texture mixing |
| 4️⃣ CutMix Soft | 16-20 | 380×380 | CutMix | 0.2 | 0.2 | Learn partial structures |
| 5️⃣ CutMix Hard | 21-30 | 380×380 | CutMix | 0.5 | 0.2 | Handle occlusions |
| 6️⃣ Grand Finale | 31-40 | 400×400 | Hybrid | 0.1-0.3 | 0.2 | Final polish with both |
💡 Why Progressive Training? Starting with low resolution helps the model learn general shapes first. Gradual augmentation increase builds robustness incrementally.
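For reference, the schedule above can be captured in a plain data structure. This is a sketch only; the field names (`mix`, `alpha`, etc.) are assumptions, not the author's actual configuration.

```python
# Hypothetical phase schedule mirroring the table above; field names are assumptions.
PHASES = [
    {"name": "Basic",        "epochs": range(1, 6),   "img_size": 280, "mix": None,     "alpha": None,       "dropout": 0.4},
    {"name": "MixUp Soft",   "epochs": range(6, 11),  "img_size": 320, "mix": "mixup",  "alpha": 0.2,        "dropout": 0.2},
    {"name": "MixUp Hard",   "epochs": range(11, 16), "img_size": 320, "mix": "mixup",  "alpha": 0.4,        "dropout": 0.2},
    {"name": "CutMix Soft",  "epochs": range(16, 21), "img_size": 380, "mix": "cutmix", "alpha": 0.2,        "dropout": 0.2},
    {"name": "CutMix Hard",  "epochs": range(21, 31), "img_size": 380, "mix": "cutmix", "alpha": 0.5,        "dropout": 0.2},
    {"name": "Grand Finale", "epochs": range(31, 41), "img_size": 400, "mix": "hybrid", "alpha": (0.1, 0.3), "dropout": 0.2},  # alpha reported as a range
]

def phase_for_epoch(epoch: int) -> dict:
    """Look up the phase configuration covering a given 1-indexed epoch."""
    for phase in PHASES:
        if epoch in phase["epochs"]:
            return phase
    raise ValueError(f"Epoch {epoch} is outside the 40-epoch schedule")
```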
🖼️ Preprocessing Pipeline (All Phases)
⚠️ Note: These preprocessing steps are applied in ALL PHASES. Only img_size changes per phase.
Complete Training Flow
┌─────────────────────────────────────────────────────────────┐
│ 📷 RAW IMAGE INPUT │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ 🔄 STEP 1: IMAGE-LEVEL PREPROCESSING (Per image) │
├─────────────────────────────────────────────────────────────┤
│ 1️⃣ Resize │ (img_size + 32) × (img_size + 32) │
│ 2️⃣ RandomCrop │ img_size × img_size │
│ 3️⃣ HorizontalFlip │ p=0.5 │
│ 4️⃣ RandomRotation │ ±20° │
│ 5️⃣ RandomAffine │ scale=(0.8, 1.2) │
│ 6️⃣ ColorJitter │ brightness, contrast, saturation=0.2 │
│ 7️⃣ ToTensor │ [0-255] → [0.0-1.0] │
│ 8️⃣ Normalize │ ImageNet mean/std │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ 🎲 STEP 2: BATCH-LEVEL AUGMENTATION (Phase-specific) │
├─────────────────────────────────────────────────────────────┤
│ Phase 1: None (Preprocessing only) │
│ Phase 2-3: MixUp (λ×ImageA + (1-λ)×ImageB) │
│ Phase 4-5: CutMix (Patch swap between images) │
│ Phase 6: Hybrid (MixUp + CutMix combined) │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ 🎯 READY FOR MODEL TRAINING │
└─────────────────────────────────────────────────────────────┘
Phase-Specific Image Sizes
| Phase | img_size | Resize To | RandomCrop To |
|---|---|---|---|
| 1️⃣ Basic | 280 | 312×312 | 280×280 |
| 2️⃣ MixUp Soft | 320 | 352×352 | 320×320 |
| 3️⃣ MixUp Hard | 320 | 352×352 | 320×320 |
| 4️⃣ CutMix Soft | 380 | 412×412 | 380×380 |
| 5️⃣ CutMix Hard | 380 | 412×412 | 380×380 |
| 6️⃣ Grand Finale | 400 | 432×432 | 400×400 |
Preprocessing Details (All Phases)
| Step | Transform | Parameters | Purpose |
|---|---|---|---|
| 1️⃣ | Resize | (size+32, size+32) | Prepare for random crop |
| 2️⃣ | RandomCrop | (size, size) | Random position augmentation |
| 3️⃣ | RandomHorizontalFlip | p=0.5 | Left-right invariance |
| 4️⃣ | RandomRotation | degrees=20 | Rotation invariance |
| 5️⃣ | RandomAffine | scale=(0.8, 1.2) | Scale variation |
| 6️⃣ | ColorJitter | (0.2, 0.2, 0.2) | Brightness/Contrast/Saturation |
| 7️⃣ | ToTensor | - | Convert to PyTorch tensor |
| 8️⃣ | Normalize | mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225] | ImageNet normalization |
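A minimal torchvision sketch of these eight steps, assuming the pipeline maps one-to-one onto standard torchvision transforms (the `degrees=0` in RandomAffine is an assumption, since the card only lists its scale range):

```python
from torchvision import transforms

IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]

def build_train_transform(img_size: int) -> transforms.Compose:
    """Training preprocessing for one phase; only img_size changes between phases."""
    return transforms.Compose([
        transforms.Resize((img_size + 32, img_size + 32)),     # 1. prepare for random crop
        transforms.RandomCrop(img_size),                       # 2. random position
        transforms.RandomHorizontalFlip(p=0.5),                # 3. left-right invariance
        transforms.RandomRotation(degrees=20),                 # 4. rotation invariance
        transforms.RandomAffine(degrees=0, scale=(0.8, 1.2)),  # 5. scale variation (rotation handled in step 4)
        transforms.ColorJitter(0.2, 0.2, 0.2),                 # 6. brightness/contrast/saturation
        transforms.ToTensor(),                                 # 7. [0, 255] -> [0.0, 1.0]
        transforms.Normalize(IMAGENET_MEAN, IMAGENET_STD),     # 8. ImageNet normalization
    ])
```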
Test/Validation Preprocessing
| Step | Transform | Parameters |
|---|---|---|
| 1️⃣ | Resize | (size, size) |
| 2️⃣ | ToTensor | - |
| 3️⃣ | Normalize | ImageNet mean/std |
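The matching evaluation pipeline, together with a hedged single-image inference sketch. The `predict` helper and the 400×400 size are illustrative; checkpoint loading is omitted because the repository's file layout is not described here.

```python
import torch
from PIL import Image
from torchvision import transforms

eval_transform = transforms.Compose([
    transforms.Resize((400, 400)),  # the card reports best performance at 400x400
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

def predict(model: torch.nn.Module, image_path: str) -> int:
    """Return the predicted class index for a single image (illustrative helper)."""
    model.eval()
    image = Image.open(image_path).convert("RGB")
    batch = eval_transform(image).unsqueeze(0)  # shape: (1, 3, 400, 400)
    with torch.no_grad():
        logits = model(batch)
    return logits.argmax(dim=1).item()
```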
💡 Key Insight: Preprocessing (8 steps) is applied per image in every phase. MixUp/CutMix is applied AFTER preprocessing as batch-level augmentation.
🔀 Batch-Level Augmentation Techniques (Phase-Specific)
MixUp
Image A (Rose) + Image B (Sunflower)
↓
λ ~ Beta(α, α) → New Image = λ×A + (1-λ)×B
↓
Blended Image (e.g., λ=0.7: 70% Rose + 30% Sunflower features)
Benefits: ✅ Smoother decision boundaries ✅ Reduces overconfidence ✅ Better generalization
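A typical batch-level MixUp implementation looks roughly like the following; this is a generic sketch, not the author's exact code:

```python
import torch

def mixup_batch(images: torch.Tensor, labels: torch.Tensor, alpha: float = 0.2):
    """Blend each image with a randomly paired image from the same batch.

    The loss is then lam * CE(pred, labels) + (1 - lam) * CE(pred, paired_labels).
    """
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    index = torch.randperm(images.size(0))
    mixed = lam * images + (1.0 - lam) * images[index]
    return mixed, labels, labels[index], lam
```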
CutMix
Image A (Rose) + Random BBox from Image B (Sunflower)
↓
Paste B's region onto A
↓
Composite Image (Rose background + Sunflower patch)
Benefits: ✅ Object completion ability ✅ Occlusion robustness ✅ Localization skills
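A CutMix sketch in the same spirit (the box-placement details follow the standard recipe, not necessarily what was used here):

```python
import math
import torch

def cutmix_batch(images: torch.Tensor, labels: torch.Tensor, alpha: float = 0.2):
    """Paste a random box from a shuffled copy of the batch into each image."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    index = torch.randperm(images.size(0))
    _, _, h, w = images.shape

    # Box side lengths follow sqrt(1 - lambda) so the box area matches the mixing ratio.
    cut_h, cut_w = int(h * math.sqrt(1 - lam)), int(w * math.sqrt(1 - lam))
    cy, cx = torch.randint(h, (1,)).item(), torch.randint(w, (1,)).item()
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, h)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, w)

    out = images.clone()
    out[:, :, y1:y2, x1:x2] = images[index, :, y1:y2, x1:x2]
    lam = 1 - ((y2 - y1) * (x2 - x1)) / (h * w)  # correct lambda to the actual box area
    return out, labels, labels[index], lam
```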
Hybrid (Grand Finale)
- Apply MixUp (blend two images)
- Apply CutMix (cut on blended image)
- Result: Maximum augmentation challenge
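One way to realize this hybrid step is sketched below. The label bookkeeping (a list of (labels, weight) pairs) is an assumption; the card does not specify how the Grand-Finale loss combines the two mixes.

```python
import math
import torch

def hybrid_batch(images: torch.Tensor, labels: torch.Tensor, alpha: float = 0.2):
    """Grand-Finale sketch: MixUp blend first, then a CutMix patch on the blend.

    Returns the augmented batch and a list of (labels, weight) pairs whose weights
    sum to 1; the loss would be the weighted sum of cross-entropies over the pairs.
    """
    beta = torch.distributions.Beta(alpha, alpha)
    n, _, h, w = images.shape

    # Step 1 - MixUp: blend with one shuffled copy of the batch.
    lam_mix = beta.sample().item()
    idx_mix = torch.randperm(n)
    blended = lam_mix * images + (1 - lam_mix) * images[idx_mix]

    # Step 2 - CutMix: paste a box from a second shuffled copy of the blended batch.
    lam_cut = beta.sample().item()
    idx_cut = torch.randperm(n)
    cut_h, cut_w = int(h * math.sqrt(1 - lam_cut)), int(w * math.sqrt(1 - lam_cut))
    y1 = torch.randint(h - cut_h + 1, (1,)).item()
    x1 = torch.randint(w - cut_w + 1, (1,)).item()
    out = blended.clone()
    out[:, :, y1:y1 + cut_h, x1:x1 + cut_w] = blended[idx_cut, :, y1:y1 + cut_h, x1:x1 + cut_w]
    lam_cut = 1 - (cut_h * cut_w) / (h * w)  # actual kept-area fraction

    targets = [
        (labels,                   lam_cut * lam_mix),
        (labels[idx_mix],          lam_cut * (1 - lam_mix)),
        (labels[idx_cut],          (1 - lam_cut) * lam_mix),
        (labels[idx_mix][idx_cut], (1 - lam_cut) * (1 - lam_mix)),
    ]
    return out, targets
```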
🛡️ Smart Training Features
Two-Layer Early Stopping
| Layer | Condition | Patience | Action |
|---|---|---|---|
| Phase-level | Train↓ + Val↑ (Overfitting) | 2 epochs | Skip to next phase |
| Global | Val loss not improving | 8 epochs | Stop training |
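A compact sketch of how this two-layer logic could be implemented (class and method names are invented for illustration):

```python
class TwoLayerEarlyStopping:
    """Phase-level overfitting check plus a global validation-loss check (sketch)."""

    def __init__(self, phase_patience: int = 2, global_patience: int = 8):
        self.phase_patience = phase_patience
        self.global_patience = global_patience
        self.overfit_streak = 0       # consecutive epochs with train down / val up
        self.stale_epochs = 0         # epochs since the best validation loss
        self.best_val = float("inf")
        self.prev_train = float("inf")
        self.prev_val = float("inf")

    def update(self, train_loss: float, val_loss: float) -> str:
        """Call once per epoch; returns 'continue', 'next_phase', or 'stop'."""
        overfitting = train_loss < self.prev_train and val_loss > self.prev_val
        self.overfit_streak = self.overfit_streak + 1 if overfitting else 0
        self.prev_train, self.prev_val = train_loss, val_loss

        if val_loss < self.best_val:
            self.best_val, self.stale_epochs = val_loss, 0
        else:
            self.stale_epochs += 1

        if self.stale_epochs >= self.global_patience:
            return "stop"
        if self.overfit_streak >= self.phase_patience:
            self.overfit_streak = 0
            return "next_phase"
        return "continue"
```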
Smart Dropout Mechanism
| Signal | Condition | Action |
|---|---|---|
| ⚠️ Overfitting | Train↓ + Val↑ | Dropout += 0.05 |
| 🚑 Underfitting | Train↑ + Val↑ | Dropout -= 0.05 |
| ✅ Normal | Train↓ + Val↓ | No change |
Bounds: min=0.10, max=0.50
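A sketch of the adjustment rule applied to every nn.Dropout module (the function name and trend inputs are illustrative):

```python
import torch.nn as nn

def adjust_dropout(model: nn.Module, train_delta: float, val_delta: float,
                   step: float = 0.05, low: float = 0.10, high: float = 0.50) -> None:
    """Nudge every Dropout rate based on epoch-over-epoch loss changes (negative = improving)."""
    if train_delta < 0 and val_delta > 0:    # overfitting: train down, val up
        change = +step
    elif train_delta > 0 and val_delta > 0:  # underfitting: both getting worse
        change = -step
    else:                                    # normal: no adjustment
        change = 0.0

    for module in model.modules():
        if isinstance(module, nn.Dropout):
            module.p = min(high, max(low, module.p + change))
```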
Model Architecture
EfficientNet-B4 (pretrained)
└── Custom Classifier Head
├── BatchNorm1d (1792)
├── Dropout
├── Linear (1792 → 512)
├── GELU
├── BatchNorm1d (512)
├── Dropout
└── Linear (512 → 102)
Total Parameters: ~19M (all trainable)
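A torchvision-based sketch of this architecture; the original training code may differ (for example, it could build the backbone with timm instead):

```python
import torch.nn as nn
from torchvision.models import efficientnet_b4, EfficientNet_B4_Weights

def build_model(num_classes: int = 102, dropout: float = 0.4) -> nn.Module:
    """EfficientNet-B4 backbone with the custom classifier head described above."""
    model = efficientnet_b4(weights=EfficientNet_B4_Weights.IMAGENET1K_V1)
    model.classifier = nn.Sequential(  # replaces the default 1792 -> 1000 ImageNet head
        nn.BatchNorm1d(1792),
        nn.Dropout(dropout),
        nn.Linear(1792, 512),
        nn.GELU(),
        nn.BatchNorm1d(512),
        nn.Dropout(dropout),
        nn.Linear(512, num_classes),
    )
    return model
```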
Supported Flower Classes
102 flower species including: Rose, Sunflower, Tulip, Orchid, Lily, Daisy, Hibiscus, Lotus, Magnolia, and 93 more.
Limitations
- Trained only on Oxford Flowers-102 dataset
- Best performance at 400×400 resolution
- Cannot recognize flowers outside the 102 trained classes; out-of-distribution images are still assigned to one of the known classes
Citation
@misc{efficientnet-b4-flowers102,
title={EfficientNet-B4 Flower Classifier with 6-Phase Progressive Training},
author={fth2745},
year={2024},
url={https://huggingface.co/fth2745/efficientnet-b4-flowers102}
}