shreyasmeher/ConflLlama

Text Classification

shreyasmeher committed on Nov 14, 2024

Commit 3db687f · verified · 1 Parent(s): f0bd365

Update README.md

Files changed (1)

README.md +19 -0

@@ -135,6 +135,25 @@ This model is designed for:
 3. Not intended for operational security decisions
 4. Results should be interpreted with appropriate context
+## Training Logs
+<p align="center">
+  <img src="images/training-logs.png" alt="Training Logs" width="800"/>
+</p>
+The training logs show a successful training run with healthy convergence patterns:
+**Loss & Learning Rate:**
+- Loss decreases from 1.95 to ~0.90, with rapid initial improvement
+- Learning rate uses a warmup/decay schedule, peaking at ~1.5x10^-4
+**Training Stability:**
+- Stable gradient norms (0.4-0.6 range)
+- Consistent GPU memory usage (~5800MB allocated, 7080MB reserved)
+- Steady training speed (~3.5s/step) with a brief interruption at step 800
+The graphs indicate effective model training with good optimization dynamics and resource utilization. The loss vs. learning rate plot suggests optimal learning around 10^-4.
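For readers unfamiliar with the warmup/decay learning-rate schedule mentioned in the added section, here is a minimal sketch of one common form: linear warmup to a peak, then linear decay to zero. Only the peak value (~1.5x10^-4) comes from the README text. The decay shape, `warmup_steps`, `total_steps`, and the function name `lr_at_step` are assumptions chosen for illustration, not details of the actual training run.

```python
def lr_at_step(step, peak_lr=1.5e-4, warmup_steps=100, total_steps=1000):
    """Learning rate at a given step: linear warmup, then linear decay.

    peak_lr matches the approximate peak reported in the README;
    warmup_steps and total_steps are hypothetical placeholders.
    """
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to peak_lr.
        return peak_lr * step / warmup_steps
    # Decay phase: ramp linearly from peak_lr down to 0 at total_steps.
    remaining = max(total_steps - step, 0)
    return peak_lr * remaining / (total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the schedule at a few sample steps.
    for s in (0, 50, 100, 550, 1000):
        print(s, lr_at_step(s))
```

Under these assumptions the rate rises for the first 100 steps, reaches its peak at step 100, and falls steadily afterwards, which would produce the rise-then-fall learning-rate curve the logs describe.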
 ## Citation
 ```bibtex
 @misc{conflllama,