Commit Graph

8 Commits

Author SHA1 Message Date
Zijie Tian 051f2295c9 [feat] Added sparse KVcache feature, NEED VERIFY. 2025-12-22 08:51:02 +08:00
Zijie Tian 91a0f09a24 [feat] Optimized with ASYNC offload. 2025-12-15 07:21:35 +08:00
Zijie Tian b8b6478506 [feat] Need to optimized with async prefetch. 2025-12-15 06:58:40 +08:00
Zijie Tian 61edb8a344 [feat] Finished offload. Still need optimize performance. 2025-12-12 02:27:40 +08:00
Zijie Tian 60d24f7c12 [feat] Added bench_offload.py and GreedySampler. 2025-12-12 00:24:08 +08:00
Zijie Tian 190df5f70d [refactor] Refactor current gpu and cpu block allocation strategy. 2025-12-10 21:23:31 +08:00
Zijie Tian 0a247ccb1b [feat] Added num_gpu_blocks limit gpu blocks. 2025-12-10 20:17:42 +08:00
Zijie Tian 0b6f19242d [feat] Added chunked prefill and kvcache offload mechenism. 2025-12-10 03:47:37 +08:00