📝 docs: add sparse policy None constraint rule
- Add "Policy must never be None (CRITICAL)" section
- Document that sparse_policy must always be at least FullAttentionPolicy
- Document warmup phase as the only exception where kvcache_manager can be None

Generated with [Claude Code](https://claude.ai/code) via [Happy](https://happy.engineering)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>
@@ -1,5 +1,39 @@
# Sparse Policy Code Conventions

## Policy Must Never Be None (CRITICAL)

**Mandatory rule**: the `sparse_policy` parameter **must never be None**; it must be at least `FullAttentionPolicy`.

```python
# ❌ Wrong: allows None
sparse_policy = getattr(config, 'sparse_policy', None)

# ✅ Correct: handle None explicitly and default to FULL
sparse_policy_type = getattr(config, 'sparse_policy', None)
if sparse_policy_type is None:
    sparse_policy_type = SparsePolicyType.FULL
```

**Reasons**:

1. Unified API: every code path computes attention through the policy.
2. No null-pointer errors: `policy.xxx` calls no longer need a None check first.
3. Simpler logic: no `if policy is not None` branches are needed.
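
The rule can be sketched as a single resolver that always returns a policy object. This is a minimal illustration, not the project's implementation: `resolve_sparse_policy` and `_POLICY_CLASSES` are hypothetical names, and `SparsePolicyType` and `FullAttentionPolicy` are replaced by stand-ins so the block runs on its own.

```python
from enum import Enum, auto
from types import SimpleNamespace


class SparsePolicyType(Enum):
    # Stand-in enum; only FULL is named in the rule above.
    FULL = auto()


class FullAttentionPolicy:
    """Stand-in for the project's dense-attention policy."""


# Hypothetical registry mapping each policy type to its class.
_POLICY_CLASSES = {SparsePolicyType.FULL: FullAttentionPolicy}


def resolve_sparse_policy(config):
    """Return a policy instance for `config`; the result is never None."""
    policy_type = getattr(config, "sparse_policy", None)
    if policy_type is None:
        # A missing attribute and an explicit None both fall back to FULL.
        policy_type = SparsePolicyType.FULL
    return _POLICY_CLASSES[policy_type]()


missing = resolve_sparse_policy(SimpleNamespace())
explicit_none = resolve_sparse_policy(SimpleNamespace(sparse_policy=None))
```

Because the fallback lives in one place, callers can use the returned object directly without checking for None.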

**The only exception: the warmup phase**

During `model_runner.warmup_model()`, the kvcache_manager has not been allocated yet. In that case `attention.py` falls back to flash_attn:

```python
# Warmup handling in attention.py
if context.kvcache_manager is None:
    # Warmup phase: use flash_attn directly
    return flash_attn_varlen_func(...) if context.is_prefill else flash_attn_with_kvcache(...)
```

This is the only case in which kvcache_manager may be None. During actual inference, the policy must exist.
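
How the warmup exception and the never-None rule fit together can be sketched as below. Everything here is an assumption made for illustration: `dispatch_attention`, `warmup_fallback`, and the `compute` method are hypothetical names, the flash_attn calls are replaced by string placeholders, and the fail-fast check on `policy` is one possible way to enforce the rule at a single boundary rather than something the source prescribes.

```python
from types import SimpleNamespace


def warmup_fallback(context):
    # Placeholder for the direct flash_attn calls used during warmup.
    return "flash_attn_varlen" if context.is_prefill else "flash_attn_with_kvcache"


class FullAttentionPolicy:
    # Hypothetical method name; the real policy interface may differ.
    def compute(self, context):
        return "policy_prefill" if context.is_prefill else "policy_decode"


def dispatch_attention(context, policy):
    if context.kvcache_manager is None:
        # Warmup: no KV cache is allocated yet, so the policy is bypassed.
        return warmup_fallback(context)
    if policy is None:
        # Outside warmup a missing policy is a bug; fail loudly at this boundary.
        raise ValueError("sparse_policy must not be None outside warmup")
    return policy.compute(context)


warmup_ctx = SimpleNamespace(kvcache_manager=None, is_prefill=True)
decode_ctx = SimpleNamespace(kvcache_manager=object(), is_prefill=False)
warmup_result = dispatch_attention(warmup_ctx, None)
decode_result = dispatch_attention(decode_ctx, FullAttentionPolicy())
```

The warmup branch is the only path that never touches the policy, which matches the single exception described above.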

---

## Base Class Requirements (MANDATORY)

Every `SparsePolicy` subclass **must** satisfy the following requirements: