nano-vllm/tests/test_xattn_estimate_alignment.py at 193ef55d18dca009922fd7dadc18782181ef5b67

Zijie Tian 193ef55d18 ♻️ refactor: use Q-chunked processing in xattn alignment test

Match xattn_estimate internal logic by processing Q in chunks:
- Reduces peak memory for attn_scores tensor
- Enables testing 64K sequences without OOM
- All 5 test files pass (3.6K to 64K)
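
As a rough illustration of why chunking over Q lowers peak memory without changing results, here is a minimal NumPy sketch. It is not the code from this test file or from xattn_estimate: the function names (`block_sums_full`, `block_sums_q_chunked`), the `block` and `chunk` parameters, and the absence of a causal mask are all assumptions made for the example. Softmax is taken row by row, so each Q chunk can be scored against all of K independently, and only a `(chunk, k_len)` slice of the score matrix is alive at any time.

```python
import numpy as np


def softmax_rows(x):
    """Numerically stable softmax over the last axis."""
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)


def block_sums_full(q, k, block):
    """Reference path: materialise the full (q_len, k_len) score matrix."""
    scores = softmax_rows(q @ k.T / np.sqrt(q.shape[-1]))
    q_len, k_len = scores.shape
    # Sum attention mass inside each (block x block) tile.
    tiles = scores.reshape(q_len // block, block, k_len // block, block)
    return tiles.sum(axis=(1, 3))


def block_sums_q_chunked(q, k, block, chunk):
    """Chunked path: only a (chunk, k_len) slice of scores exists at once.

    Assumes q_len and chunk are multiples of block (hypothetical constraint).
    """
    k_len = k.shape[0]
    out = []
    for start in range(0, q.shape[0], chunk):
        q_chunk = q[start:start + chunk]
        scores = softmax_rows(q_chunk @ k.T / np.sqrt(q.shape[-1]))
        rows = scores.shape[0]
        tiles = scores.reshape(rows // block, block, k_len // block, block)
        out.append(tiles.sum(axis=(1, 3)))
    return np.concatenate(out, axis=0)


rng = np.random.default_rng(0)
q = rng.standard_normal((64, 16))
k = rng.standard_normal((64, 16))

full = block_sums_full(q, k, block=8)
chunked = block_sums_q_chunked(q, k, block=8, chunk=16)
print(np.allclose(full, chunked))
```

With these toy sizes the saving is negligible, but the peak score buffer shrinks from `q_len * k_len` to `chunk * k_len` elements, which is the effect the commit relies on for long sequences.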

Generated with [Claude Code](https://claude.ai/code)
via [Happy](https://happy.engineering)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>

2026-02-01 18:08:15 +08:00

8.7 KiB
