Commit Graph

11 Commits

Author SHA1 Message Date
Zijie Tian
fa7601f4b8 ♻️ refactor: remove cross-layer pipeline and rename compute_chunked_prefill
- Remove cross-layer pipeline from OffloadEngine (saves ~1GB GPU memory for long sequences)
  - Delete layer_k/v_buffer_a/b double buffers
  - Remove start_decode_pipeline, get_decode_layer_kv, end_decode_pipeline methods
  - Remove pipeline state tracking variables
- Simplify decode to use ring buffer pipeline only (more efficient for long sequences)
- Rename compute_chunked_attention → compute_chunked_prefill for clarity
- Add mandatory needle test requirements: --enable-offload --input-len 32768

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-20 02:10:40 +08:00
Zijie Tian
6080bf7554 🙈 chore: exclude planning-with-files from git tracking
- Add planning files (task_plan.md, findings.md, progress.md) to .gitignore
- Remove existing planning files from git index (keep local)
- Update planning-with-files rule with git management policy

These temporary session files should not be version controlled.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-20 02:06:28 +08:00
Zijie Tian
a36f8569fc [WIP] Before refactor. 2026-01-20 01:25:46 +08:00
Zijie Tian
b97b0b96a0 [WIP] Before refactoring the nanovllm sparse policy. 2026-01-19 22:34:44 +08:00
Zijie Tian
d1bbb7efe2 chore: update claude configuration and rules from tzj/vs_offload
- Add /sc:git command with smart commit functionality
- Add /sc:ultra-think command for deep thinking
- Update .claude/rules/ with improved documentation:
  - commands.md: command usage guidelines
  - doc-management.md: documentation policy
  - no-extra-docs.md: documentation creation policy
  - gpu-testing.md: GPU type detection and testing rules
- Update .claude/settings.json with claude-flow MCP configuration

These improvements provide a better development experience and tooling support.
2026-01-18 18:56:49 +08:00
Zijie Tian
c254c8c330 chore: add planning-with-files rule configuration 2026-01-18 18:55:55 +08:00
Zijie Tian
03a8c033cb [claudesquad] update from 'add-llama-1' on 10 Jan 26 21:03 CST 2026-01-10 21:03:45 +08:00
Zijie Tian
6ec1b23982 [WIP] NEED to modify communication. 2025-12-24 21:57:51 +08:00
Zijie Tian
4dcef16c13 [WIP] NEED to refactor nanovllm mechanism. 2025-12-22 23:52:56 +08:00
Zijie Tian
8df0c7517b [docs] refactor CLAUDE.md. 2025-12-15 21:43:33 +08:00
Zijie Tian
5949537faf [docs] Start using CLAUDE rules. 2025-12-15 00:20:54 +08:00