nano-vllm/docs
Zijie Tian 1eb7521994 📝 docs: add XAttention density types documentation
Document the difference between compute density (BSA block level)
and communication density (CPU block level).

Key finding: even at 37% compute density, communication density can
reach 100%, because any() aggregation across heads and Q positions
spreads the selected sparse blocks across all CPU blocks.
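
The gap between the two densities can be illustrated with a minimal sketch. It is not the repository's code: the mask layout (heads x Q blocks x K blocks), the function names, and the `bsa_per_cpu` grouping factor are all assumptions made for illustration.

```python
# Hypothetical layout: mask[head][q_block][k_block] is True when that
# BSA block is selected for attention. Not taken from nano-vllm.

def compute_density(mask):
    """Fraction of selected (head, q_block, k_block) entries."""
    total = sum(len(row) for head in mask for row in head)
    selected = sum(v for head in mask for row in head for v in row)
    return selected / total

def comm_density(mask, bsa_per_cpu):
    """Fraction of CPU blocks containing at least one selected BSA block.

    Equivalent to an any() reduction over heads, Q blocks, and the
    BSA blocks that fall inside each CPU block (assumed grouping).
    """
    num_k = len(mask[0][0])
    num_cpu = (num_k + bsa_per_cpu - 1) // bsa_per_cpu  # ceiling division
    needed = [False] * num_cpu
    for head in mask:
        for row in head:
            for k, v in enumerate(row):
                if v:
                    needed[k // bsa_per_cpu] = True
    return sum(needed) / num_cpu

# Two heads, one Q block, four K blocks; each head picks a different block.
mask = [
    [[True, False, False, False]],
    [[False, False, False, True]],
]
print(compute_density(mask))   # 0.25
print(comm_density(mask, 2))   # 1.0
```

Each head selects only a quarter of its blocks, yet the two selections land in different CPU blocks, so every CPU block must still be transferred.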

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 01:44:11 +08:00