nano-vllm/nanovllm at e440c45e73061827089bfbbdd71403a724f037ac

zijie-tian/nano-vllm

Zijie Tian 07f5220f40 Merge branch 'tzj/minference' of ssh://git.zijie-tian.site:2222/zijie-tian/nano-vllm into tzj/minference

2026-01-20 02:27:10 +08:00

[WIP] Added sgDMA operator for scatter kvcache communication.

2025-12-24 23:48:52 +08:00

[refactor] Refactor the kvcache offload.

2026-01-04 19:37:03 +08:00

Merge branch 'tzj/minference' of ssh://git.zijie-tian.site:2222/zijie-tian/nano-vllm into tzj/minference

2026-01-20 02:27:10 +08:00

Merge branch 'tzj/minference' of ssh://git.zijie-tian.site:2222/zijie-tian/nano-vllm into tzj/minference

2026-01-20 02:27:10 +08:00

♻️ refactor: remove cross-layer pipeline and rename compute_chunked_prefill

2026-01-20 02:10:40 +08:00

[claudesquad] update from 'add-llama-1' on 10 Jan 26 21:03 CST

2026-01-10 21:03:45 +08:00

[feat] Need to optimized with async prefetch.

2025-12-15 06:58:40 +08:00

__init__.py

better

2025-06-15 10:36:45 +08:00

config.py

[WIP] Before integrate the xattn operator.

2026-01-19 21:19:21 +08:00

llm.py

support tensor parallel

2025-06-15 01:31:24 +08:00

sampling_params.py

compile random sampling

2025-08-31 22:55:34 +08:00