nano-vllm/tests at ccd1b3d4ab635f8dfb59843ba73bfe0cbeaf0b7a

zijie-tian/nano-vllm

Zijie Tian ccd1b3d4ab [WIP] Before modify nanovllm CPU-GPU kvcache.

2025-12-31 22:41:07 +08:00

..

[WIP] Added sgDMA operator for scatter kvcache communication.

2025-12-24 23:48:52 +08:00

__init__.py

[WIP] NEED refactor nanovllm mechanism.

2025-12-22 23:52:56 +08:00

test_attention_offload.py

[opt] optimize nanovllm performance comparable with vllm.

2025-12-25 03:47:07 +08:00

test_chunked_attention.py

[WIP] fixing attention compute error.

2025-12-30 00:31:48 +08:00

test_chunked_decode_hook.py

[WIP] fixing attention compute error.

2025-12-30 00:31:48 +08:00

test_chunked_prefill_hook.py

[WIP] fixing attention compute error.

2025-12-30 00:31:48 +08:00

test_debug_verification.py

[WIP] Before modify nanovllm CPU-GPU kvcache.

2025-12-31 22:41:07 +08:00

test_flash_attn_kvcache.py

[WIP] fixing attention compute error.

2025-12-30 00:31:48 +08:00

test_flashinfer_merge.py

[WIP] Before modify to FlashInfer.

2025-12-30 01:11:13 +08:00

test_needle.py

[WIP] fixing attention compute error.

2025-12-30 00:31:48 +08:00

test_offload_correctness.py

[test] Added offload correctness verification.

2025-12-31 20:59:53 +08:00

test_offload_engine.py

[WIP] NEED refactor nanovllm mechanism.

2025-12-22 23:52:56 +08:00

test_prefill.py

[WIP] remove num_prefetch_blocks variable.

2025-12-24 18:22:26 +08:00

test_sgdma.py

[WIP] replace merge attention with triton kernel.

2025-12-25 01:07:05 +08:00