Commit Graph

7 Commits

Author SHA1 Message Date
Zijie Tian
3956a30b14 🔧 chore: add --use-v1 flag to bench_vllm.py
Allow switching between the vLLM V0 and V1 engines via a command-line flag.
Default behavior now uses the legacy V0 engine (VLLM_USE_V1=0).

Generated with [Claude Code](https://claude.ai/code)
via [Happy](https://happy.engineering)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>
2026-01-27 09:14:55 +08:00
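A flag like the one this commit describes might be wired up as follows. This is a hedged sketch, not the actual `bench_vllm.py`: the function names (`build_engine_parser`, `apply_engine_choice`) and the parser description are assumptions; only the `--use-v1` flag and the `VLLM_USE_V1` environment variable come from the commit message.

```python
import argparse
import os


def build_engine_parser() -> argparse.ArgumentParser:
    # Sketch of the flag described in the commit; everything beyond
    # --use-v1 and VLLM_USE_V1 is hypothetical.
    parser = argparse.ArgumentParser(description="vLLM benchmark (sketch)")
    parser.add_argument(
        "--use-v1",
        action="store_true",
        help="Opt in to the vLLM V1 engine (sets VLLM_USE_V1=1); "
             "by default VLLM_USE_V1 is set to 0.",
    )
    return parser


def apply_engine_choice(use_v1: bool) -> None:
    # vLLM reads VLLM_USE_V1 at import time, so the variable must be
    # set before `import vllm` anywhere in the benchmark script.
    os.environ["VLLM_USE_V1"] = "1" if use_v1 else "0"
```

Setting the environment variable before importing `vllm` is the important ordering constraint; flipping it after the import has no effect on engine selection.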
Zijie Tian
59473fa432 🔧 chore: add configurable arguments to bench_vllm.py
Add --model, --gpu-util, and --enforce-eager arguments for flexible
vLLM benchmarking comparisons.

Generated with [Claude Code](https://claude.ai/code)
via [Happy](https://happy.engineering)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>
2026-01-27 09:07:49 +08:00
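The three arguments this commit adds map naturally onto vLLM's `LLM(...)` constructor parameters (`model`, `gpu_memory_utilization`, `enforce_eager`). A minimal sketch of how the CLI might look; the defaults and the function name `build_bench_parser` are assumptions, only the flag names come from the commit message:

```python
import argparse


def build_bench_parser() -> argparse.ArgumentParser:
    # Hedged sketch of the arguments described in the commit;
    # default values here are illustrative assumptions.
    parser = argparse.ArgumentParser(description="vLLM benchmark (sketch)")
    parser.add_argument(
        "--model",
        type=str,
        required=True,
        help="Model name or local path to benchmark.",
    )
    parser.add_argument(
        "--gpu-util",
        type=float,
        default=0.9,
        help="Fraction of GPU memory vLLM may use "
             "(forwarded as gpu_memory_utilization).",
    )
    parser.add_argument(
        "--enforce-eager",
        action="store_true",
        help="Disable CUDA graph capture (forwarded as enforce_eager=True).",
    )
    return parser
```

With a parser like this, the benchmark would construct the engine as `LLM(model=args.model, gpu_memory_utilization=args.gpu_util, enforce_eager=args.enforce_eager)`, making apples-to-apples comparison runs a matter of changing flags rather than editing the script.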
Zijie Tian
aa953ecb59 [refactor] Aligned the benchmark scripts. 2026-01-07 04:25:06 +08:00
Zijie Tian
82ed34fc2d [opt] Optimize nano-vllm performance to be comparable with vLLM. 2025-12-25 03:47:07 +08:00
Zijie Tian
08d83185ce [fix] fix bench*.py. 2025-12-22 19:53:50 +08:00
Zijie Tian
0b6f19242d [feat] Added chunked prefill and KV cache offload mechanism. 2025-12-10 03:47:37 +08:00
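Chunked prefill, as named in this commit, bounds the cost of each scheduling step by feeding the model a fixed number of prompt tokens at a time instead of running one large prefill pass. A minimal illustrative sketch, entirely independent of the actual nano-vllm implementation (the helper name and signature are hypothetical):

```python
from typing import Iterator, List


def chunk_prefill(prompt_tokens: List[int], chunk_size: int) -> Iterator[List[int]]:
    """Yield fixed-size slices of a prompt for chunked prefill.

    Hypothetical helper: rather than one forward pass over the whole
    prompt, the scheduler processes `chunk_size` tokens per step,
    extending the KV cache built by previous chunks each time.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(prompt_tokens), chunk_size):
        yield prompt_tokens[start:start + chunk_size]
```

The same bounded-step idea pairs with KV cache offload: once a chunk's keys and values are written, cold cache blocks can be migrated off the GPU until decode needs them.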
Zijie Tian
761929390e [bench] Added vllm vs nano-vllm bench. 2025-12-10 00:44:57 +08:00