tomg-group-umd's Collections: Retrofitting Recurrence
Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence
Paper
• 2511.07384
• Published • 19
smcleish/Recurrent-Llama-3.2-train-recurrence-32
Text Generation
• 1B • Updated • 706
• 1
smcleish/Recurrent-Llama-3.2-train-recurrence-16
Text Generation
• 1B • Updated • 4
smcleish/Recurrent-Llama-3.2-train-recurrence-8
Text Generation
• 1B • Updated • 381
smcleish/Recurrent-Llama-3.2-train-recurrence-4
Text Generation
• 1B • Updated • 37
smcleish/Recurrent-TinyLlama-3T-train-recurrence-32
Text Generation
• 0.8B • Updated • 340
• 1
smcleish/Recurrent-TinyLlama-3T-train-recurrence-16
Text Generation
• 0.8B • Updated • 2
• 1
smcleish/Recurrent-TinyLlama-3T-train-recurrence-8
Text Generation
• 0.8B • Updated • 2
smcleish/Recurrent-TinyLlama-3T-train-recurrence-4
Text Generation
• 0.8B • Updated • 3
smcleish/Recurrent-OLMo-2-0425-train-recurrence-32
Text Generation
• 1B • Updated • 422
• 2
smcleish/Recurrent-OLMo-2-0425-train-recurrence-16
Text Generation
• 1B • Updated • 2
smcleish/Recurrent-OLMo-2-0425-train-recurrence-8
Text Generation
• 1B • Updated • 31
smcleish/Recurrent-OLMo-2-0425-train-recurrence-4
Text Generation
• 1B • Updated • 2
• 1
smcleish/Recurrent-TinyLlama-3T-train-recurrence-4-single-phase
Text Generation
• 0.8B • Updated • 2
smcleish/Recurrent-TinyLlama-3T-train-recurrence-4-two-phase
Text Generation
• 0.8B • Updated
smcleish/Recurrent-Llama-3.2-untrained
Text Generation
• 1B • Updated • 44
smcleish/Recurrent-TinyLlama-3T-untrained
Text Generation
• 0.8B • Updated • 2
smcleish/Recurrent-OLMo-2-0425-untrained
Text Generation
• 1B • Updated • 3
smcleish/Recurrent-Llama-3.2-2-4-2-untrained
Text Generation
• 1B • Updated • 1
• 1
smcleish/retrofitting-llama-fineweb-edu-tokenized
Viewer
• Updated • 332M • 209