Efficient AI
DFlash: Block Diffusion for Flash Speculative Decoding
ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference
Create images in seconds. No sign-up, no paywall, no setup.