shreyasmeher/ConflLlama

+---
+base_model: unsloth/llama-3-8b-bnb-4bit
+tags:
+- llama.cpp
+- gguf
+- quantized
+- q4_k_m
+license: apache-2.0
+language:
+- en
+---
+# ConflLlama: GTD-Finetuned LLaMA-3 8B
+- **Model Type:** GGUF quantized (q4_k_m and q8_0)
+- **Base Model:** unsloth/llama-3-8b-bnb-4bit
+- **Quantization Details:**
+  - Methods: q4_k_m and q8_0
+  - q4_k_m uses Q6_K for half of the attention.wv and feed_forward.w2 tensors and Q4_K for the rest
+  - q4_k_m favors smaller files and faster inference; q8_0 favors output quality closer to the unquantized model
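
To make the size and quality trade-off between the two quantization methods concrete, the sketch below works out bits per weight from the ggml block layouts. The block byte counts are taken from the llama.cpp formats as I understand them, and the parameter count `N_PARAMS` is an approximation for LLaMA-3 8B, not a figure from this card. Real GGUF files also carry metadata and a mix of tensor types, so treat the sizes as rough estimates.

```python
def bits_per_weight(block_bytes: int, weights_per_block: int) -> float:
    """Average storage cost of one weight for a block-quantized format."""
    return block_bytes * 8 / weights_per_block


# Q8_0: 32 int8 weights plus one fp16 scale per block.
Q8_0_BPW = bits_per_weight(32 + 2, 32)

# Q4_K: 256 weights per super-block: 128 bytes of 4-bit values,
# 12 bytes of packed 6-bit scales/mins, two fp16 super-block scales.
Q4_K_BPW = bits_per_weight(128 + 12 + 4, 256)

# Q6_K: 256 weights per super-block: 128 bytes (low 4 bits),
# 64 bytes (high 2 bits), 16 int8 scales, one fp16 super-block scale.
Q6_K_BPW = bits_per_weight(128 + 64 + 16 + 2, 256)

# Assumption: approximate parameter count of LLaMA-3 8B.
N_PARAMS = 8.03e9


def approx_size_gb(bpw: float, n_params: float = N_PARAMS) -> float:
    """Rough tensor payload size in gigabytes (10^9 bytes)."""
    return n_params * bpw / 8 / 1e9


for name, bpw in [("Q4_K", Q4_K_BPW), ("Q6_K", Q6_K_BPW), ("Q8_0", Q8_0_BPW)]:
    print(f"{name}: {bpw:.4f} bits/weight, ~{approx_size_gb(bpw):.2f} GB if used for every tensor")
```

Because q4_k_m mixes Q4_K with Q6_K for some tensors, its average cost falls between the Q4_K and Q6_K figures, well below q8_0.
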
+### Training Data
+- **Dataset:** Global Terrorism Database (GTD)
+- **Time Period:** Events before January 1, 2017
+- **Format:** Event summaries with associated attack types
+- **Labels:** Attack type classifications from GTD
+### Data Processing
+1. **Date Filtering:**
+   - Kept only events occurring before 2017-01-01
+   - Handled missing month or day values by defaulting them to 1
+2. **Data Cleaning:**
+   - Removed entries with missing summaries
+   - Cleaned summary text by removing special characters and formatting
+3. **Attack Type Processing:**
+   - Combined multiple attack types with separator '|'
+   - Included primary, secondary, and tertiary attack types when available
+4. **Training Format:**
+   - Input: Processed event summaries
+   - Output: Combined attack types
+   - Used prompt template:
+     ```
+     Below describes details about terrorist events.
+     >>> Event Details:
+     {summary}
+     >>> Attack Types:
+     {combined_attacks}
+     ```
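
The four processing steps above can be sketched in plain Python as follows. This is an illustration, not the author's actual script: the column names (`iyear`, `imonth`, `iday`, `summary`, `attacktype1_txt`, and so on) follow the public GTD codebook but are assumptions here, and the exact cleaning rule in `clean_summary` is hypothetical because the card does not say which characters were removed.

```python
import re
from datetime import date

CUTOFF = date(2017, 1, 1)

PROMPT_TEMPLATE = (
    "Below describes details about terrorist events.\n"
    ">>> Event Details:\n"
    "{summary}\n"
    ">>> Attack Types:\n"
    "{combined_attacks}"
)

# Assumed GTD column names for primary, secondary, and tertiary attack types.
ATTACK_COLUMNS = ("attacktype1_txt", "attacktype2_txt", "attacktype3_txt")


def event_date(row: dict) -> date:
    """Build a date, defaulting a missing or zero month/day to 1."""
    month = int(row.get("imonth") or 1)
    day = int(row.get("iday") or 1)
    return date(int(row["iyear"]), month, day)


def clean_summary(text: str) -> str:
    """Hypothetical cleaning rule: drop unusual symbols, collapse whitespace."""
    text = re.sub(r"[^\w\s.,;:()'\-/]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def combine_attacks(row: dict) -> str:
    """Join the available attack types with the '|' separator."""
    values = (row.get(col) for col in ATTACK_COLUMNS)
    return "|".join(v.strip() for v in values if v and v.strip())


def build_examples(rows: list) -> list:
    """Apply date filtering, cleaning, label combination, and templating."""
    examples = []
    for row in rows:
        summary = row.get("summary")
        if not summary or not summary.strip():
            continue  # remove entries with missing summaries
        if event_date(row) >= CUTOFF:
            continue  # keep only events before 2017-01-01
        examples.append(
            PROMPT_TEMPLATE.format(
                summary=clean_summary(summary),
                combined_attacks=combine_attacks(row),
            )
        )
    return examples


# Made-up rows for illustration only; they are not GTD records.
rows = [
    {"iyear": 2014, "imonth": 0, "iday": 0,
     "summary": "05/12/2014: Assailants  detonated a bomb* near a market.",
     "attacktype1_txt": "Bombing/Explosion", "attacktype2_txt": "Armed Assault",
     "attacktype3_txt": None},
    {"iyear": 2017, "imonth": 3, "iday": 5,
     "summary": "Later event.", "attacktype1_txt": "Armed Assault"},
    {"iyear": 2010, "imonth": 1, "iday": 1,
     "summary": None, "attacktype1_txt": "Kidnapping"},
]
examples = build_examples(rows)
print(examples[0])
```

Only the first row survives: the second is on or after the cutoff date and the third has no summary.
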
+### Training Details
+- **Framework:** Unsloth optimization framework
+- **Hardware:** NVIDIA A100-SXM4-40GB GPU on Delta Supercomputer
+- **Training Configuration:**
+  - Batch Size: 1 per device
+  - Gradient Accumulation Steps: 8
+  - Learning Rate: 2e-4
+  - Max Steps: 1000
+  - Save Steps: 200
+  - Logging Steps: 10
+- **LoRA Configuration:**
+  - Rank: 8
+  - Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
+  - Alpha: 16
+  - Dropout: 0
+- **Optimizations:**
+  - Gradient Checkpointing: Enabled
+  - 4-bit Quantization: Enabled
+  - Max Sequence Length: 1024
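
A few quantities follow directly from the configuration above, and the sketch below derives them. It assumes a single GPU, a trainer that saves on every multiple of Save Steps, the standard LoRA scaling of alpha divided by rank, and the published LLaMA-3 8B layer shapes (hidden size 4096, MLP size 14336, 32 layers, key/value projection width 1024). None of these assumptions is stated in the card itself, and `TrainConfig` is a hypothetical container, not part of Unsloth.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Values copied from the Training Details list above."""
    per_device_batch_size: int = 1
    gradient_accumulation_steps: int = 8
    max_steps: int = 1000
    save_steps: int = 200
    max_seq_length: int = 1024
    lora_rank: int = 8
    lora_alpha: int = 16


cfg = TrainConfig()

# Sequences contributing to each optimizer step (single GPU assumed).
effective_batch = cfg.per_device_batch_size * cfg.gradient_accumulation_steps
# Total sequences processed over the run, and an upper bound on tokens.
sequences_seen = effective_batch * cfg.max_steps
max_tokens_seen = sequences_seen * cfg.max_seq_length
# Steps at which a checkpoint would be written.
checkpoint_steps = list(range(cfg.save_steps, cfg.max_steps + 1, cfg.save_steps))
# Standard LoRA scaling applied to the low-rank update.
lora_scale = cfg.lora_alpha / cfg.lora_rank

# Assumed LLaMA-3 8B shapes as (in_features, out_features) per target module.
HIDDEN, INTERMEDIATE, LAYERS, KV_DIM = 4096, 14336, 32, 1024
MODULE_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_trainable_params(rank: int) -> int:
    """Each adapted module adds rank * (in_features + out_features) weights."""
    per_layer = sum(rank * (i + o) for i, o in MODULE_SHAPES.values())
    return LAYERS * per_layer


print("effective batch size:", effective_batch)
print("sequences seen:", sequences_seen)
print("checkpoints at steps:", checkpoint_steps)
print("LoRA scale:", lora_scale)
print("trainable LoRA parameters:", lora_trainable_params(cfg.lora_rank))
```

Under these assumptions the adapters add about 21 million trainable parameters, a small fraction of the 8B frozen base weights.
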
+### Memory Optimizations
+- Used 4-bit quantization
+- Gradient accumulation steps: 8
+- Memory-efficient gradient checkpointing
+- Reduced maximum sequence length to 1024
+- Disabled dataloader pin memory
+## Intended Use
+This model is designed for:
+1. Classification of terrorist events based on event descriptions
+2. Research in conflict studies and terrorism analysis
+3. Understanding attack type patterns in historical events
+4. Academic research in security studies
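
For the classification use case, a caller would typically fill in the training prompt up to the `>>> Attack Types:` marker and parse the generated labels. The sketch below shows only that string handling; the generation call itself is left out because it depends on the GGUF runtime in use. The helper names `build_prompt` and `parse_attack_types` are hypothetical, and the parser assumes the model emits the labels on the first non-empty line of its completion.

```python
INFERENCE_TEMPLATE = (
    "Below describes details about terrorist events.\n"
    ">>> Event Details:\n"
    "{summary}\n"
    ">>> Attack Types:\n"
)


def build_prompt(summary: str) -> str:
    """Fill the training template, leaving the attack types for the model."""
    return INFERENCE_TEMPLATE.format(summary=summary.strip())


def parse_attack_types(completion: str) -> list[str]:
    """Split the first non-empty completion line on the '|' separator."""
    for line in completion.splitlines():
        line = line.strip()
        if line:
            return [part.strip() for part in line.split("|") if part.strip()]
    return []


prompt = build_prompt("Assailants detonated a bomb near a market.")
# Stand-in for text returned by the model; not real model output.
fake_completion = " Bombing/Explosion|Armed Assault\n>>> Event Details:"
print(parse_attack_types(fake_completion))
```

Stopping generation at a newline, or truncating at the next `>>>` marker as the parser effectively does, keeps any run-on text out of the predicted labels.
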
+## Limitations
+1. Training data limited to pre-2017 events
+2. Maximum sequence length limited to 1024 tokens
+3. May not capture recent changes in attack patterns
+4. Performance dependent on quality of event descriptions
+## Ethical Considerations
+1. Model trained on sensitive terrorism-related data
+2. Should be used responsibly for research purposes only
+3. Not intended for operational security decisions
+4. Results should be interpreted with appropriate context
+## Citation
+```bibtex
+@misc{conflllama,
+  author = {Meher, Shreyas},
+  title = {ConflLlama: GTD-Finetuned LLaMA-3 8B},
+  year = {2024},
+  publisher = {Hugging Face},
+  note = {Based on Unsloth's LLaMA-3 8B and GTD Dataset}
+}
+```
+## Acknowledgments
+- Unsloth for optimization framework and base model
+- Hugging Face for transformers infrastructure
+- Global Terrorism Database team
+- NCSA Delta for computing resources
+- BBOV project support
+<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>