nano-vllm/nanovllm/layers
Zijie Tian c51a640a29 🐛 fix: remove torch.compile from add_rms_forward to avoid recompilation
The add_rms_forward method processes two input tensors (x and residual),
which causes torch.compile recompilation issues. Keep @torch.compile only
on rms_forward which processes a single input.

This prevents unnecessary recompilation overhead during inference.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-14 07:02:02 +08:00