zijie-tian/nano-vllm
Path: nano-vllm/tests (commit 74ee6d0895185d07e3184c555828758ef40ea899)
Last commit: Zijie Tian, 74ee6d0895: "[WIP] need to fix model to decode normally." (2026-01-01 05:18:27 +08:00)
| File | Last commit | Date |
| --- | --- | --- |
| sgdma_cpp/ | [WIP] Added sgDMA operator for scatter kvcache communication. | 2025-12-24 23:48:52 +08:00 |
| __init__.py | [WIP] NEED to refactor nanovllm mechanism. | 2025-12-22 23:52:56 +08:00 |
| test_attention_offload.py | [opt] optimize nanovllm performance to be comparable with vllm. | 2025-12-25 03:47:07 +08:00 |
| test_chunked_attention.py | [WIP] fixing attention compute error. | 2025-12-30 00:31:48 +08:00 |
| test_chunked_decode_hook.py | [WIP] need to fix model to decode normally. | 2026-01-01 05:18:27 +08:00 |
| test_chunked_prefill_hook.py | [refactor] Refactor the test_chunked_prefill/decode. | 2026-01-01 03:32:26 +08:00 |
| test_debug_verification.py | [WIP] need to change flashattention to debug. | 2026-01-01 00:58:22 +08:00 |
| test_flash_attn_kvcache.py | [WIP] fixing attention compute error. | 2025-12-30 00:31:48 +08:00 |
| test_flashinfer_merge.py | [WIP] Before modifying to FlashInfer. | 2025-12-30 01:11:13 +08:00 |
| test_needle.py | [WIP] fixing attention compute error. | 2025-12-30 00:31:48 +08:00 |
| test_offload_correctness.py | [WIP] Before fixing needle. | 2025-12-31 23:35:25 +08:00 |
| test_offload_engine.py | [WIP] NEED to refactor nanovllm mechanism. | 2025-12-22 23:52:56 +08:00 |
| test_prefill.py | [WIP] remove num_prefetch_blocks variable. | 2025-12-24 18:22:26 +08:00 |
| test_sgdma.py | [WIP] replace merge attention with triton kernel. | 2025-12-25 01:07:05 +08:00 |
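
Several of the tests above (test_chunked_attention.py, test_flashinfer_merge.py, test_sgdma.py) revolve around computing attention over the KV cache in chunks and merging the partial results. The sketch below is not code from this repository; it is a minimal, self-contained illustration in plain PyTorch of the log-sum-exp merge that such a chunked-attention correctness test would verify. The helper names (`partial_attention`, `merge_states`) are hypothetical.

```python
import torch


def full_attention(q, k, v):
    # Reference: softmax attention over the entire key/value sequence.
    scores = q @ k.transpose(-1, -2) / k.shape[-1] ** 0.5
    return torch.softmax(scores, dim=-1) @ v


def partial_attention(q, k, v):
    # Attention over a single KV chunk. Returns the chunk output plus the
    # log-sum-exp of its scores so several chunks can be merged later.
    scores = q @ k.transpose(-1, -2) / k.shape[-1] ** 0.5
    lse = torch.logsumexp(scores, dim=-1, keepdim=True)
    out = torch.exp(scores - lse) @ v
    return out, lse


def merge_states(out_a, lse_a, out_b, lse_b):
    # Merge two partial attention results via their log-sum-exps; the result
    # equals attention over the union of the two KV chunks.
    m = torch.maximum(lse_a, lse_b)
    w_a = torch.exp(lse_a - m)
    w_b = torch.exp(lse_b - m)
    return (out_a * w_a + out_b * w_b) / (w_a + w_b)


if __name__ == "__main__":
    torch.manual_seed(0)
    q = torch.randn(1, 4, 64)    # (batch, num_queries, head_dim)
    k = torch.randn(1, 32, 64)   # (batch, kv_len, head_dim)
    v = torch.randn(1, 32, 64)

    ref = full_attention(q, k, v)
    out_a, lse_a = partial_attention(q, k[:, :16], v[:, :16])
    out_b, lse_b = partial_attention(q, k[:, 16:], v[:, 16:])
    merged = merge_states(out_a, lse_a, out_b, lse_b)

    torch.testing.assert_close(merged, ref, rtol=1e-5, atol=1e-5)
    print("chunked attention merge matches full attention")
```

In production code the same reduction is done on the GPU, for example by FlashInfer's merge utilities or a custom Triton kernel, which is presumably what test_flashinfer_merge.py and the "replace merge attention with triton kernel" commit refer to.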