zijie-tian/nano-vllm
e440c45e73061827089bfbbdd71403a724f037ac
nano-vllm/nanovllm
Zijie Tian 07f5220f40 Merge branch 'tzj/minference' of ssh://git.zijie-tian.site:2222/zijie-tian/nano-vllm into tzj/minference
2026-01-20 02:27:10 +08:00
comm
[WIP] Added sgDMA operator for scatter kvcache communication.
2025-12-24 23:48:52 +08:00
debug
[refactor] Refactor the kvcache offload.
2026-01-04 19:37:03 +08:00
engine
Merge branch 'tzj/minference' of ssh://git.zijie-tian.site:2222/zijie-tian/nano-vllm into tzj/minference
2026-01-20 02:27:10 +08:00
kvcache
Merge branch 'tzj/minference' of ssh://git.zijie-tian.site:2222/zijie-tian/nano-vllm into tzj/minference
2026-01-20 02:27:10 +08:00
layers
♻️ refactor: remove cross-layer pipeline and rename compute_chunked_prefill
2026-01-20 02:10:40 +08:00
models
[claudesquad] update from 'add-llama-1' on 10 Jan 26 21:03 CST
2026-01-10 21:03:45 +08:00
utils
[feat] Need to optimized with async prefetch.
2025-12-15 06:58:40 +08:00
__init__.py
better
2025-06-15 10:36:45 +08:00
config.py
[WIP] Before integrate the xattn operator.
2026-01-19 21:19:21 +08:00
llm.py
support tensor parallel
2025-06-15 01:31:24 +08:00
sampling_params.py
compile random sampling
2025-08-31 22:55:34 +08:00