Commit Graph

6 Commits

Author      SHA1        Message                                                      Date
Zijie Tian  89f8020d38  [WIP] fixing attention compute error.                        2025-12-30 00:31:48 +08:00
Zijie Tian  16fcf8350b  [WIP] replace merge attention with triton kernel.            2025-12-25 01:07:05 +08:00
Zijie Tian  dc7807a211  [feat] Fixed warmup memory overhead.                         2025-12-15 21:39:14 +08:00
Zijie Tian  0bd7ba7536  [fix] Fixed chunked_attention.py implement.                  2025-12-11 22:39:50 +08:00
Zijie Tian  b9ed77cbbb  [fix] Fix import error.                                      2025-12-11 05:31:06 +08:00
Zijie Tian  0b6f19242d  [feat] Added chunked prefill and kvcache offload mechenism.  2025-12-10 03:47:37 +08:00