Zijie Tian
|
3956a30b14
|
🔧 chore: add --use-v1 flag to bench_vllm.py
Allow switching between the vLLM V0/V1 engines via a command-line flag.
Default behavior now uses the legacy V0 engine (VLLM_USE_V1=0).
Generated with [Claude Code](https://claude.ai/code)
via [Happy](https://happy.engineering)
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>
|
2026-01-27 09:14:55 +08:00 |
|
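A minimal sketch of how the `--use-v1` flag from the commit above might be wired up in bench_vllm.py. The flag name and the `VLLM_USE_V1` environment variable come from the commit message; the function name and structure here are illustrative assumptions, not the actual implementation.

```python
# Hypothetical wiring for the --use-v1 flag described in the commit message.
# Only the flag name and VLLM_USE_V1 env var are taken from the source;
# the rest is an illustrative sketch.
import argparse
import os

def parse_engine_args(argv=None):
    parser = argparse.ArgumentParser(description="vLLM benchmark (sketch)")
    # store_true flag: passing --use-v1 opts into the V1 engine;
    # omitting it leaves VLLM_USE_V1=0, matching the stated default.
    parser.add_argument("--use-v1", action="store_true",
                        help="use the vLLM V1 engine (sets VLLM_USE_V1=1)")
    args = parser.parse_args(argv)
    # vLLM reads this variable at import time, so it must be set
    # before `import vllm` anywhere in the benchmark script.
    os.environ["VLLM_USE_V1"] = "1" if args.use_v1 else "0"
    return args
```

Setting the environment variable before importing vllm is the important detail: engine selection happens at module load, so a flag parsed after the import would have no effect.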
Zijie Tian
|
59473fa432
|
🔧 chore: add configurable arguments to bench_vllm.py
Add --model, --gpu-util, and --enforce-eager arguments for flexible
vLLM benchmarking comparisons.
Generated with [Claude Code](https://claude.ai/code)
via [Happy](https://happy.engineering)
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>
|
2026-01-27 09:07:49 +08:00 |
|
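The commit above names three new arguments. A hedged sketch of their argparse shape, assuming they map onto the standard vLLM `LLM(...)` parameters (`model`, `gpu_memory_utilization`, `enforce_eager`); the defaults shown are placeholders, not values from the actual script.

```python
# Illustrative argument parser for the flags named in the commit message.
# Defaults and help text are assumptions; only the flag names are from
# the source.
import argparse

def build_bench_parser():
    p = argparse.ArgumentParser(description="bench_vllm.py arguments (sketch)")
    p.add_argument("--model", type=str, required=True,
                   help="HF model id or local path passed to vLLM")
    p.add_argument("--gpu-util", type=float, default=0.9,
                   help="fraction forwarded to vLLM's gpu_memory_utilization")
    p.add_argument("--enforce-eager", action="store_true",
                   help="disable CUDA graph capture (vLLM's enforce_eager)")
    return p
```

Exposing these as flags lets the same script benchmark different models and memory budgets without editing the source, which is what makes side-by-side vLLM comparisons practical.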
Zijie Tian
|
aa953ecb59
|
[refactor] Aligned the benchmark scripts.
|
2026-01-07 04:25:06 +08:00 |
|
Zijie Tian
|
82ed34fc2d
|
[opt] Optimize nano-vllm performance to be comparable with vLLM.
|
2025-12-25 03:47:07 +08:00 |
|
Zijie Tian
|
08d83185ce
|
[fix] Fix the bench*.py scripts.
|
2025-12-22 19:53:50 +08:00 |
|
Zijie Tian
|
0b6f19242d
|
[feat] Added chunked prefill and KV-cache offload mechanisms.
|
2025-12-10 03:47:37 +08:00 |
|
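The chunked-prefill idea from the commit above can be sketched in a few lines: instead of running one prefill pass over the entire prompt, the prompt's token ids are split into fixed-size chunks so each forward step stays within a token budget. This is a simplified illustration of the general technique, not the nano-vllm implementation; the function name and signature are assumptions.

```python
# Simplified illustration of chunked prefill: split a long prompt into
# fixed-size chunks so each prefill step processes at most `chunk_size`
# tokens. Hypothetical helper, not nano-vllm's actual code.
def chunk_prefill(prompt_token_ids, chunk_size):
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Each slice becomes one prefill step; the KV cache produced by
    # earlier chunks is reused by later ones.
    return [prompt_token_ids[i:i + chunk_size]
            for i in range(0, len(prompt_token_ids), chunk_size)]
```

Bounding per-step prefill work this way smooths latency when long prompts and ongoing decode requests share the same batch, and it pairs naturally with KV-cache offload since completed chunks' cache blocks can be moved off-GPU.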
Zijie Tian
|
761929390e
|
[bench] Added a vLLM vs. nano-vllm benchmark.
|
2025-12-10 00:44:57 +08:00 |
|