SKVQ: Sliding-window Key and Value Cache Quantization for Large Language Models

Conference on Language Modeling

7 months ago
