- Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping
  Paper • 2402.14083 • Published • 49
- Linear Transformers are Versatile In-Context Learners
  Paper • 2402.14180 • Published • 7
- Training-Free Long-Context Scaling of Large Language Models
  Paper • 2402.17463 • Published • 25
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 624
Yang Lee (innovation64)
AI & ML interests: AGI
Recent Activity
- Liked a model (about 19 hours ago): BriLLM/BriLLM0.5
- Upvoted a paper (about 2 months ago): Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers
- Upvoted a paper (3 months ago): Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models