Zijie Tian | 89f8020d38 | [WIP] fixing attention compute error. | 2025-12-30 00:31:48 +08:00
Zijie Tian | 16fcf8350b | [WIP] replace merge attention with Triton kernel. | 2025-12-25 01:07:05 +08:00
Zijie Tian | dc7807a211 | [feat] Fixed warmup memory overhead. | 2025-12-15 21:39:14 +08:00
Zijie Tian | 0bd7ba7536 | [fix] Fixed chunked_attention.py implementation. | 2025-12-11 22:39:50 +08:00
Zijie Tian | b9ed77cbbb | [fix] Fix import error. | 2025-12-11 05:31:06 +08:00
Zijie Tian | 0b6f19242d | [feat] Added chunked prefill and kvcache offload mechanism. | 2025-12-10 03:47:37 +08:00