nano-vllm/docs at dce6ad6b74c006aa707d8d85c1d6b979c4782c07

zijie-tian/nano-vllm

Zijie Tian cf168fd9b9 ✅ test: add comprehensive RULER benchmark test suite

- Add test_ruler.py supporting all 13 RULER tasks (NIAH, QA, CWE, FWE, VT)
- Implement RULER official evaluation metrics (string_match_all/part)
- Fix max_model_len to 32896 to prevent decode OOM on long inputs
- Add ruler_benchmark_report.md with full test results (92.1% accuracy)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-01-14 00:51:30 +08:00

architecture_guide.md

[claudesquad] update from 'lw-offload-2' on 08 Jan 26 21:19 CST

2026-01-08 21:19:38 +08:00

cuda_graph_offload_guide.md

[claudesquad] update from 'fix-bug-2' on 09 Jan 26 16:10 CST

2026-01-09 16:10:28 +08:00

debugging_guide.md

[claudesquad] update from 'lw-offload-2' on 08 Jan 26 21:19 CST

2026-01-08 21:19:38 +08:00

gpu_only_performance_issue.md

[claudesquad] update from 'int-minference-1' on 08 Jan 26 23:22 CST

2026-01-08 23:22:38 +08:00

layerwise_offload_memory_analysis.md

[claudesquad] update from 'lw-offload-2' on 08 Jan 26 21:19 CST

2026-01-08 21:19:38 +08:00

multi_model_support.md

[claudesquad] update from 'add-llama-1' on 10 Jan 26 21:14 CST

2026-01-10 21:14:32 +08:00

offload_accuracy_issue.md

📝 docs: update offload accuracy issue with independent testing results

2026-01-12 21:08:35 +08:00

ruler_benchmark_report.md

✅ test: add comprehensive RULER benchmark test suite

2026-01-14 00:51:30 +08:00

ruler_niah_standalone_test.md

[tests] Added test_niah_standalone.py.

2026-01-12 00:16:37 +08:00

sparse_attention_guide.md

[claudesquad] update from 'lw-offload-2' on 08 Jan 26 21:19 CST

2026-01-08 21:19:38 +08:00

sparse_offload_integration.md

[claudesquad] update from 'int-minference-1' on 08 Jan 26 23:42 CST

2026-01-08 23:42:30 +08:00

sparse_prefill_integration_plan.md

[docs] Add sparse prefill integration plan from int-minference analysis

2026-01-10 23:33:09 +08:00

transformers_compatibility.md

[docs] Added transformers error description.

2026-01-11 18:48:50 +08:00