Zijie Tian
1ab4676396
♻️ refactor: consolidate RULER test files and document root cause
- test_ruler.py: add --fresh-llm, --sample-indices, --json-output options
- test_ruler.py: consolidate test_ruler_single_sample.py, test_ruler_sequential.py, test_ruler_samples.py
- docs: update chunked offload issue with root cause (state leakage confirmed)
- docs: add single-sample test results showing 100% accuracy for niah_single_1
Deleted redundant test files:
- tests/test_ruler_single_sample.py
- tests/test_ruler_sequential.py
- tests/test_ruler_samples.py
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-20 23:41:17 +08:00