
TurboQuant: Near-optimal KV cache quantization for LLM inference (3-bit keys, 2-bit values) with Triton kernels + vLLM integration
| Date | Stars | Forks | Issues |
|---|---|---|---|
| May 8, 2026 | 1.3k | 161 | 12 |
| May 7, 2026 | 1.3k | 161 | 11 |
| May 6, 2026 | 1.3k | 161 | 11 |
| May 5, 2026 | 1.3k | 160 | 12 |
| Apr 7, 2026 | 831 | 101 | 8 |