MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining Paper • 2505.07608 • Published May 12, 2025 • 81
ConFiguRe: Exploring Discourse-level Chinese Figures of Speech Paper • 2209.07678 • Published Sep 16, 2022
Long Context Alignment with Short Instructions and Synthesized Positions Paper • 2405.03939 • Published May 7, 2024
More Tokens, Lower Precision: Towards the Optimal Token-Precision Trade-off in KV Cache Compression Paper • 2412.12706 • Published Dec 17, 2024
RestGPT: Connecting Large Language Models with Real-World RESTful APIs Paper • 2306.06624 • Published Jun 11, 2023 • 1
LongEmbed: Extending Embedding Models for Long Context Retrieval Paper • 2404.12096 • Published Apr 18, 2024 • 2
PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training Paper • 2309.10400 • Published Sep 19, 2023 • 26