zijie-tian/nano-vllm
224 Commits 3 Branches 0 Tags
8035e4db3d8528642ca1da5c07cfa3a1819d664f
Commit Graph

56 Commits

Author | SHA1 | Message | Date
Zijie Tian | 1081ab51ea | [refactor] Refactor offload code to multi-chunk. | 2025-12-15 01:13:58 +08:00
Zijie Tian | 61edb8a344 | [feat] Finished offload. Still need optimize performance. | 2025-12-12 02:27:40 +08:00
Zijie Tian | babfa17354 | [refactor] Translate into english, void Chinese due to claude. | 2025-12-11 00:30:24 +08:00
Zijie Tian | 190df5f70d | [refactor] Refactor current gpu and cpu block allocation strategy. | 2025-12-10 21:23:31 +08:00
Zijie Tian | 0a247ccb1b | [feat] Added num_gpu_blocks limit gpu blocks. | 2025-12-10 20:17:42 +08:00
Zijie Tian | 0b6f19242d | [feat] Added chunked prefill and kvcache offload mechenism. | 2025-12-10 03:47:37 +08:00