AttentionVC

Built by JustinZ and Jennie

0xSero/turboquant

TurboQuant: Near-optimal KV cache quantization for LLM inference (3-bit keys, 2-bit values) with Triton kernels + vLLM integration

Language: Python
Stars: 173 (+18 today, +20 /wk, +20 /mo)
Forks: 29
Issues: 3
Watchers: 173


Repository Info

License: GPL-3.0
Created: Mar 25, 2026
Last push: 17h ago