Commit Graph

20 Commits

Author SHA1 Message Date
Zijie Tian
0ad86eb449 [claudesquad] update from 'perf_opt-2' on 07 Jan 26 05:58 CST 2026-01-07 05:58:10 +08:00
Zijie Tian
2fe50bab50 [claudesquad] update from 'debug_chunk-2' on 07 Jan 26 03:27 CST 2026-01-07 03:27:27 +08:00
Zijie Tian
f240903013 [docs] Add GPU mutex instructions for multi-instance debugging
Add instructions for Claude instances to check GPU availability before
running CUDA operations, preventing conflicts when multiple instances
debug in parallel on a single GPU.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-07 01:42:59 +08:00
Zijie Tian
edb5273e34 [WIP] Added basic test for quest. 2026-01-06 22:30:31 +08:00
Zijie Tian
e554d5482b [refactor] Delete unnesscessory test, and refacrtor the offload prefix cache. 2026-01-05 20:31:42 +08:00
Zijie Tian
054aaff403 [fix] Fixed needle test bug. 2026-01-05 18:34:09 +08:00
Zijie Tian
9b52d25866 [docs] Update CLAUDE.md. 2026-01-03 20:46:00 +08:00
Zijie Tian
bf4c63c7ec [docs] Added Sparse Attn. 2025-12-29 19:56:54 +08:00
Zijie Tian
82ed34fc2d [opt] optimize nanovllm performance compareable with vllm. 2025-12-25 03:47:07 +08:00
Zijie Tian
16fcf8350b [WIP] replace merge attention with triton kernel. 2025-12-25 01:07:05 +08:00
Zijie Tian
6ec1b23982 [WIP] NEED to modify communication. 2025-12-24 21:57:51 +08:00
Zijie Tian
782437c486 [WIP] remove num_prefetch_blocks varible. 2025-12-24 18:22:26 +08:00
Zijie Tian
1907b625b6 [refactor] Remove legacy mode path. 2025-12-22 20:17:56 +08:00
Zijie Tian
08d83185ce [fix] fix bench*.py. 2025-12-22 19:53:50 +08:00
Zijie Tian
8df0c7517b [docs] refactor CLAUDE.md. 2025-12-15 21:43:33 +08:00
Zijie Tian
b8b6478506 [feat] Need to optimized with async prefetch. 2025-12-15 06:58:40 +08:00
Zijie Tian
1081ab51ea [refactor] Refactor offload code to multi-chunk. 2025-12-15 01:13:58 +08:00
Zijie Tian
5949537faf [docs] Start ues CLAUDE rules. 2025-12-15 00:20:54 +08:00
Zijie Tian
a37f07943c [docs] Update the CLAUDE.md. 2025-12-15 00:13:27 +08:00
Zijie Tian
761929390e [bench] Added vllm vs nano-vllm bench. 2025-12-10 00:44:57 +08:00