Zijie Tian
|
e440c45e73
|
📝 docs: add XAttention algorithm guide based on COMPASS implementation
- Create docs/xattention_algorithm_guide.md with detailed algorithm explanation
  - Stride reshape (inverse mode) for Q/K interleaved sampling
  - Triton kernels: flat_group_gemm_fuse_reshape, softmax_fuse_block_sum
  - Block selection via find_blocks_chunked with cumulative threshold
  - BSA (block_sparse_attn) dependency for sparse computation
- Update docs/sparse_attention_guide.md XAttention section with accurate description
- Add documentation index entry in CLAUDE.md
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
2026-01-20 02:50:03 +08:00