📝 docs: update density alignment test with Offload mode results

- Rename doc to "Density Alignment Test Results" (covers both modes)
- Add Offload mode test results (3.7K-64.9K tokens, all passed)
- Add Layer 5 GPU-only test results (threshold=0.9, density=6.24%)
- Enhance test script to support both GPU-only and Offload data formats
- Add batch testing commands for all data files
- Update CLAUDE.md index

Generated with [Claude Code](https://claude.ai/code)
via [Happy](https://happy.engineering)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>
This commit is contained in:
Zijie Tian
2026-02-02 14:22:40 +08:00
parent 232fcf043e
commit dc51972777
3 changed files with 283 additions and 39 deletions

View File

@@ -40,7 +40,7 @@ Nano-vLLM is a lightweight vLLM implementation (~1,200 lines) for fast offline L
| [`docs/new_model_integration_guide.md`](docs/new_model_integration_guide.md) | 🔧 GUIDE: 新模型整合指南 - 配置映射、RoPE变体、EOS处理、权重转换、验证清单 |
| [`docs/xattn_density_alignment_analysis.md`](docs/xattn_density_alignment_analysis.md) | 📊 ANALYSIS: GPU-only vs Offload 模式 density 对齐分析chunked softmax 边界效应5-7% 差异根因 |
| [`docs/xattn_kv_chunking_density_test.md`](docs/xattn_kv_chunking_density_test.md) | 🧪 TEST: XAttention KV chunking density 验证threshold=1.0 对齐threshold<1.0 差异 10-13% |
| [`docs/gpuonly_density_alignment_test.md`](docs/gpuonly_density_alignment_test.md) | ✅ TEST: GPU-only density 对齐验证 (4K-64K)xattn_bsa vs xattn_estimate 完全一致 |
| [`docs/gpuonly_density_alignment_test.md`](docs/gpuonly_density_alignment_test.md) | ✅ TEST: Density 对齐验证 (GPU-only + Offload, 4K-64K)xattn_estimate vs KV chunking 完全一致 |
## Rules Index