Shrink your LLM's memory footprint by 6x, speed up attention by 8x, and lose almost nothing in accuracy — no retraining required.