nano-vllm

Author	SHA1	Message	Date
Zijie Tian	c717072f31	✨ feat: add --model argument to bench.py for configurable model path. Previously bench.py had a hardcoded model path. Now it accepts a --model argument (default: Llama-3.1-8B-Instruct) to align with bench_offload.py. Generated with [Claude Code](https://claude.ai/code) via [Happy](https://happy.engineering). Co-Authored-By: Claude <noreply@anthropic.com>; Co-Authored-By: Happy <yesreply@happy.engineering>	2026-01-27 04:36:17 +08:00
Zijie Tian	aa953ecb59	[refactor] Aligned the bench.	2026-01-07 04:25:06 +08:00
Zijie Tian	82ed34fc2d	[opt] optimize nanovllm performance to be comparable with vllm.	2025-12-25 03:47:07 +08:00
Zijie Tian	08d83185ce	[fix] fix bench*.py.	2025-12-22 19:53:50 +08:00
Zijie Tian	051f2295c9	[feat] Added sparse KVcache feature, NEED VERIFY.	2025-12-22 08:51:02 +08:00
Zijie Tian	0b6f19242d	[feat] Added chunked prefill and kvcache offload mechanism.	2025-12-10 03:47:37 +08:00
Zijie Tian	761929390e	[bench] Added vllm vs nano-vllm bench.	2025-12-10 00:44:57 +08:00
GeeeekExplorer	801365a611	update bench	2025-06-19 23:28:11 +08:00
cheunglei	b5ace32982	use spawn	2025-06-17 23:49:15 +08:00
GeeeekExplorer	59aa3ff57c	better	2025-06-13 13:07:33 +08:00
GeeeekExplorer	135d1b38a2	release	2025-06-13 09:01:08 +08:00
GeeeekExplorer	ec3c60d96f	update bench	2025-06-12 22:54:51 +08:00
GeeeekExplorer	fee58d44e4	fix	2025-06-12 01:00:31 +08:00
GeeeekExplorer	b98e1ca305	fix	2025-06-10 21:25:54 +08:00
GeeeekExplorer	a5a4909e6a	init commit	2025-06-10 00:27:01 +08:00