Commit Graph

14 Commits

Author | SHA1 | Message | Date
Zijie Tian | 9b8165af5a | [fix] Fixed kvcache offload problem. | 2025-12-12 01:35:30 +08:00
Zijie Tian | babfa17354 | [refactor] Translate into english, void Chinese due to claude. | 2025-12-11 00:30:24 +08:00
Zijie Tian | e85c2b4776 | [fix] Fixed kvcache offload bugs. | 2025-12-10 22:34:00 +08:00
Zijie Tian | 190df5f70d | [refactor] Refactor current gpu and cpu block allocation strategy. | 2025-12-10 21:23:31 +08:00
Zijie Tian | 0a247ccb1b | [feat] Added num_gpu_blocks limit gpu blocks. | 2025-12-10 20:17:42 +08:00
Zijie Tian | 87055cc5ce | [refactor] Implement real chunked prefill mechenism. | 2025-12-10 18:34:01 +08:00
Zijie Tian | 0b6f19242d | [feat] Added chunked prefill and kvcache offload mechenism. | 2025-12-10 03:47:37 +08:00
Zijie Tian | 761929390e | [bench] Added vllm vs nano-vllm bench. | 2025-12-10 00:44:57 +08:00
GeeeekExplorer | df99418f7d | simplify | 2025-08-31 20:02:51 +08:00
GeeeekExplorer | 1caeec8dfa | same as vllm | 2025-06-27 18:50:56 +08:00
GeeeekExplorer | 658520b788 | warmup and allocate | 2025-06-27 01:51:57 +08:00
GeeeekExplorer | 386290d69e | refactor | 2025-06-11 21:12:57 +08:00
GeeeekExplorer | b98e1ca305 | fix | 2025-06-10 21:25:54 +08:00
GeeeekExplorer | a5a4909e6a | init commit | 2025-06-10 00:27:01 +08:00