Built by JustinZ and Jennie

A product by AttentionVC
0xSero/turboquant

TurboQuant: Near-optimal KV cache quantization for LLM inference (3-bit keys, 2-bit values) with Triton kernels + vLLM integration

Language: Python
Stars: 1.3k (+3 today, +8 /wk, +62 /mo)
Forks: 161
Issues: 12
Watchers: 1.3k

Repository Info

License: GPL-3.0
Created: Mar 25, 2026
Last push: Mar 27, 2026

Snapshot History

Date          Stars   Forks   Issues
May 8, 2026   1.3k    161     12
May 7, 2026   1.3k    161     11
May 6, 2026   1.3k    161     11
May 5, 2026   1.3k    160     12
Apr 7, 2026   831     101     8