nano-vllm/tests at tzj/layer-offload

zijie-tian/nano-vllm

Zijie Tian 5fb0f67295 [WIP] need refactor.

2026-01-22 22:20:34 +08:00

__init__.py

[WIP] NEED refactor nanovllm mechanism.

2025-12-22 23:52:56 +08:00

modeling_qwen3.py

[refactor] Refactor needle test.

2026-01-03 19:19:37 +08:00

run_parallel_niah.sh

Merge branch 'zijie/fix-dist-3': Fix distributed port conflict

2026-01-12 16:27:25 +08:00

test_minference_gpu.py

[claudesquad] update from 'layer-prefill-1' on 08 Jan 26 03:36 CST

2026-01-08 03:36:39 +08:00

test_needle_ref.py

[refactor] Refactor needle test.

2026-01-03 19:19:37 +08:00

test_needle.py

[WIP] need refactor.

2026-01-22 22:20:34 +08:00

test_port_conflict.py

Merge branch 'zijie/fix-dist-3': Fix distributed port conflict

2026-01-12 16:27:25 +08:00

test_quest_policy.py

[WIP] move metadata to GPU.

2026-01-06 23:32:32 +08:00

test_ruler_niah.py

[docs] Added dist port issue.

2026-01-12 15:16:39 +08:00

test_ruler_niah.sh

✅ test: add parallel multi-GPU RULER NIAH test script

2026-01-12 21:08:27 +08:00

test_ruler.py

feat: add XAttention sparse policy integration

2026-01-14 10:04:46 +08:00

test_sequential.py

[WIP] Before fix bench_offload.py.

2026-01-06 18:41:08 +08:00

test_xattn_estimate_chunked.py

✨ feat: add nanovllm.ops module with XAttention estimation kernels

2026-01-22 06:00:42 +08:00

utils.py

[WIP] Before fix bench_offload.py.

2026-01-06 18:41:08 +08:00