GeeeekExplorer
2025-06-14 00:36:32 +08:00
parent 9b59dae751
commit 4a8aa090a7
5 changed files with 20 additions and 8 deletions

@@ -5,8 +5,8 @@ A lightweight vLLM implementation built from scratch.
## Key Features
* 🚀 **Fast offline inference** - Comparable inference speeds to vLLM
-* 📖 **Readable codebase** - Clean implementation under 1,200 lines of Python code
-* **Optimization Suite** - Prefix caching, Torch compilation, CUDA graph, etc
+* 📖 **Readable codebase** - Clean implementation in ~ 1,200 lines of Python code
+* **Optimization Suite** - Prefix caching, Torch compilation, CUDA graph, etc.
## Installation