[claudesquad] update from 'perf_opt-2' on 07 Jan 26 05:58 CST

This commit is contained in:
Zijie Tian
2026-01-07 05:58:10 +08:00
parent aa953ecb59
commit 0ad86eb449
4 changed files with 175 additions and 68 deletions

View File

@@ -66,6 +66,12 @@ python bench_offload.py
**Note**: The Python version in the path (python3.10) should match your environment.
**CRITICAL**: After making code changes to `nanovllm/` source files, you MUST reinstall the package for changes to take effect:
```bash
pip install -e . --prefix=./.local --no-deps
```
Without reinstallation, Python will use the old cached version and your changes will NOT be reflected!
## Sparse Attention
For sparse attention related content (block sparse attention, MInference, FlexPrefill, XAttention, AvgPool, etc.), refer to [`docs/sparse_attention_guide.md`](docs/sparse_attention_guide.md).