zijie-tian/nano-vllm
nano-vllm/tests
Commit 7af721c12c7a9000ff066bfab69665ba9d1af2dd
Latest commit: Zijie Tian, 7af721c12c, "[WIP] Before migrating to FlashInfer." (2025-12-30 01:11:13 +08:00)
Name                            Last commit                                                       Date
sgdma_cpp                       [WIP] Added sgDMA operator for scatter KV-cache communication.    2025-12-24 23:48:52 +08:00
__init__.py                     [WIP] NEED to refactor nanovllm mechanism.                        2025-12-22 23:52:56 +08:00
test_attention_offload.py       [opt] Optimize nanovllm performance to be comparable with vllm.   2025-12-25 03:47:07 +08:00
test_chunked_attention.py       [WIP] Fixing attention compute error.                             2025-12-30 00:31:48 +08:00
test_chunked_decode_hook.py     [WIP] Fixing attention compute error.                             2025-12-30 00:31:48 +08:00
test_chunked_prefill_hook.py    [WIP] Fixing attention compute error.                             2025-12-30 00:31:48 +08:00
test_flash_attn_kvcache.py      [WIP] Fixing attention compute error.                             2025-12-30 00:31:48 +08:00
test_flashinfer_merge.py        [WIP] Before migrating to FlashInfer.                             2025-12-30 01:11:13 +08:00
test_needle.py                  [WIP] Fixing attention compute error.                             2025-12-30 00:31:48 +08:00
test_offload_correctness.py     [WIP] Fixing attention compute error.                             2025-12-30 00:31:48 +08:00
test_offload_engine.py          [WIP] NEED to refactor nanovllm mechanism.                        2025-12-22 23:52:56 +08:00
test_prefill.py                 [WIP] Remove num_prefetch_blocks variable.                        2025-12-24 18:22:26 +08:00
test_sgdma.py                   [WIP] Replace merge attention with a Triton kernel.               2025-12-25 01:07:05 +08:00
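
The test names and commit messages above outline the design under construction: the KV cache is offloaded and attention is computed in chunks, with the partial results merged afterwards (first via a Triton kernel, and, per the latest commit, moving to FlashInfer). Below is a minimal reference sketch of that merge step. The function name, tensor layout ([tokens, heads, head_dim] for partial outputs, [tokens, heads] for their log-sum-exp statistics), and the assumption that these tests check against such a reference are all illustrative, not taken from the repository.

```python
import torch

def merge_attention_states(v_a: torch.Tensor, lse_a: torch.Tensor,
                           v_b: torch.Tensor, lse_b: torch.Tensor):
    """Merge two partial attention results into one (hypothetical reference).

    v_a, v_b:     [num_tokens, num_heads, head_dim] attention outputs,
                  each computed over a different slice of the KV cache
    lse_a, lse_b: [num_tokens, num_heads] log-sum-exp of the attention
                  logits over the KV slice each partial output attended to
    """
    # Stabilize by subtracting the running maximum before exponentiating.
    lse_max = torch.maximum(lse_a, lse_b)
    w_a = torch.exp(lse_a - lse_max)
    w_b = torch.exp(lse_b - lse_max)
    denom = w_a + w_b
    # Renormalize each partial output by its share of the total softmax mass.
    v = (v_a * (w_a / denom).unsqueeze(-1) +
         v_b * (w_b / denom).unsqueeze(-1))
    # Log-sum-exp over the union of both KV slices, for further merging.
    lse = lse_max + torch.log(denom)
    return v, lse
```

FlashInfer exposes an equivalent primitive (merge_state, from its cascade-attention API), which is presumably the target of the in-progress migration noted in the latest commit.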