nano-vllm/nanovllm/kvcache at c7ac39dfbd6cd03314f16508e79821a0b96ecc3e

zijie-tian/nano-vllm

Zijie Tian e554d5482b [refactor] Delete unnecessary test, and refactor the offload prefix cache.

2026-01-05 20:31:42 +08:00

..

[feat] Added chunked prefill and kvcache offload mechanism.

2025-12-10 03:47:37 +08:00

[fix] Fixed needle test bug.

2026-01-05 18:34:09 +08:00

__init__.py

[WIP] remove num_prefetch_blocks variable.

2025-12-24 18:22:26 +08:00

base_manager.py

[feat] Added chunked prefill and kvcache offload mechanism.

2025-12-10 03:47:37 +08:00

chunked_attention.py

[WIP] fixing attention compute error.

2025-12-30 00:31:48 +08:00

gpu_manager.py

[feat] Added chunked prefill and kvcache offload mechanism.

2025-12-10 03:47:37 +08:00

hybrid_manager.py

[refactor] Delete unnecessary test, and refactor the offload prefix cache.

2026-01-05 20:31:42 +08:00

kernels.py

[feat] Added chunked prefill and kvcache offload mechanism.

2025-12-10 03:47:37 +08:00

offload_engine.py

[WIP] need to fix model so that it decodes normally.

2026-01-01 05:18:27 +08:00