Commit Graph

1 Commits

Author SHA1 Message Date
Zijie Tian
0d31b3f71f 📝 docs: add CPU offload optimization strategies guide
- Document chunk size optimization (simplest, most effective)
- Analyze CUDA Graph limitations for offload scenarios
- Cover CUDA Graph applicability for MLP/Proj layers
- Survey frontier research: InfiniGen, ShadowKV, L2 Prefetch, KVPR
- Add optimization priority recommendations

Generated with [Claude Code](https://claude.ai/code)
via [Happy](https://happy.engineering)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>
2026-01-27 04:44:36 +08:00