Zijie Tian | babfa17354 | [refactor] Translate into English, avoid Chinese due to Claude. | 2025-12-11 00:30:24 +08:00
Zijie Tian | e85c2b4776 | [fix] Fixed kvcache offload bugs. | 2025-12-10 22:34:00 +08:00
Zijie Tian | 190df5f70d | [refactor] Refactor current GPU and CPU block allocation strategy. | 2025-12-10 21:23:31 +08:00
Zijie Tian | 0a247ccb1b | [feat] Added num_gpu_blocks to limit GPU blocks. | 2025-12-10 20:17:42 +08:00
Zijie Tian | 87055cc5ce | [refactor] Implement real chunked prefill mechanism. | 2025-12-10 18:34:01 +08:00
Zijie Tian | 0b6f19242d | [feat] Added chunked prefill and kvcache offload mechanism. | 2025-12-10 03:47:37 +08:00
GeeeekExplorer | 2f21442653 | support qwen2 | 2025-11-04 01:44:42 +08:00
GeeeekExplorer | df99418f7d | simplify | 2025-08-31 20:02:51 +08:00
PeterDing | f5b4840276 | fix(model_runner): correct position indexing to be 0-based (change position calculation from len(seq) to len(seq) - 1) | 2025-07-04 14:29:12 +08:00
GeeeekExplorer | cb0b3dec3f | remove rng state | 2025-06-27 22:50:33 +08:00
GeeeekExplorer | 1caeec8dfa | same as vllm | 2025-06-27 18:50:56 +08:00
GeeeekExplorer | 658520b788 | warmup and allocate | 2025-06-27 01:51:57 +08:00
GeeeekExplorer | 03cfc13bb3 | faster pickle | 2025-06-23 00:51:52 +08:00
GeeeekExplorer | cde3fc22c2 | simplify | 2025-06-21 17:19:15 +08:00
jinghuan-Chen | ffafaeb133 | Release CUDA Graphs resource before exit. | 2025-06-18 16:17:31 +08:00
GeeeekExplorer | bc0ad5a116 | better | 2025-06-17 23:33:38 +08:00
GeeeekExplorer | 7e42fa6f63 | fix | 2025-06-15 13:28:29 +08:00
GeeeekExplorer | fc778a4da9 | better | 2025-06-15 10:36:45 +08:00
cheunglei | 53b3ef2e32 | support tensor parallel | 2025-06-15 01:31:24 +08:00
GeeeekExplorer | b6136383c9 | support fast pickle | 2025-06-14 13:36:57 +08:00
GeeeekExplorer | 4a8aa090a7 | fix | 2025-06-14 00:56:07 +08:00
GeeeekExplorer | 98a1551a7d | support CUDA_VISIBLE_DEVICES | 2025-06-12 23:14:01 +08:00
GeeeekExplorer | fee58d44e4 | fix | 2025-06-12 01:00:31 +08:00
GeeeekExplorer | 08c84ec08d | multi file loader | 2025-06-12 01:00:09 +08:00
GeeeekExplorer | 386290d69e | refactor | 2025-06-11 21:12:57 +08:00
GeeeekExplorer | b98e1ca305 | fix | 2025-06-10 21:25:54 +08:00
GeeeekExplorer | a5a4909e6a | init commit | 2025-06-10 00:27:01 +08:00