Peter Szemraj PRO

pszemraj

https://pszemraj.carrd.co/

AI & ML interests

metallic intuition

Recent Activity

upvoted a paper about 5 hours ago

Revisiting the Shape Convention of Transformer Language Models

Paper • 2602.06471 • Published 3 days ago • 3

upvoted a paper 3 days ago

Length-Unbiased Sequence Policy Optimization: Revealing and Controlling Response Length Variation in RLVR

Paper • 2602.05261 • Published 5 days ago • 47

upvoted a paper 4 days ago

Horizon-LM: A RAM-Centric Architecture for LLM Training

Paper • 2602.04816 • Published 5 days ago • 16

upvoted a paper 5 days ago

Quartet II: Accurate LLM Pre-Training in NVFP4 by Improved Unbiased Gradient Estimation

Paper • 2601.22813 • Published 10 days ago • 55

liked a dataset 7 days ago

pbevan11/synthetic-ocr-correction-gpt4o

Viewer • Updated Jul 25, 2024 • 10k • 24 • 6

upvoted 2 papers 7 days ago

Linear representations in language models can change dramatically over a conversation

Paper • 2601.20834 • Published 12 days ago • 21

Do Reasoning Models Enhance Embedding Models?

Paper • 2601.21192 • Published 12 days ago • 24

upvoted a paper 10 days ago

Scaling Embeddings Outperforms Scaling Experts in Language Models

Paper • 2601.21204 • Published 12 days ago • 98

upvoted a paper 12 days ago

CGPT: Cluster-Guided Partial Tables with LLM-Generated Supervision for Table Retrieval

Paper • 2601.15849 • Published 18 days ago • 14

liked a dataset 12 days ago

Pageshift-Entertainment/LongPage

Viewer • Updated 20 days ago • 6.07k • 8.33k • 139

upvoted 2 papers 12 days ago

Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability

Paper • 2601.18778 • Published 14 days ago • 40

Post-LayerNorm Is Back: Stable, ExpressivE, and Deep

Paper • 2601.19895 • Published 13 days ago • 23

liked a model 12 days ago

moonshotai/Kimi-K2.5

Image-Text-to-Text • 171B • Updated 5 days ago • 456k • 1.92k

upvoted an article 12 days ago

Introducing Waypoint-1: Real-time interactive video diffusion from Overworld

Article • 21 days ago

liked a Space 12 days ago

Waypoint 1 Small

🎮

Explore and navigate through AI-generated worlds in real-time

liked a model 12 days ago

Roblox/cube3d-v0.1

Updated Mar 20, 2025 • 86

liked a model 15 days ago

tencent/Hunyuan3D-2.1

Image-to-3D • Updated Oct 17, 2025 • 23.7k • 833

upvoted 3 papers 15 days ago