zijie-tian
  • Joined on 2026-01-03
zijie-tian pushed to tzj/minference at zijie-tian/nano-vllm 2026-01-21 01:12:28 +08:00
78050aef9f 🐛 fix: resolve CPU KV cache state leakage between requests
zijie-tian pushed to tzj/minference at zijie-tian/nano-vllm 2026-01-20 07:07:10 +08:00
c48753fc4e [WIP] Before work start.
zijie-tian pushed to tzj/minference at zijie-tian/nano-vllm 2026-01-20 04:49:20 +08:00
6180055ed8 📝 docs: add chunked attention solutions guide and update doc index
zijie-tian pushed to tzj/minference at zijie-tian/nano-vllm 2026-01-20 04:26:56 +08:00
4cbd451af7 📝 docs: add BSA interface documentation and cleanup temp files
3aef6fc3a2 feat: add XAttention Triton operators for sparse attention estimation
zijie-tian pushed to tzj/minference at zijie-tian/nano-vllm 2026-01-20 02:49:46 +08:00
690456dbf9 ♻️ refactor: create ops module and move chunked_attention
e440c45e73 📝 docs: add XAttention algorithm guide based on COMPASS implementation
zijie-tian pushed to tzj/minference at zijie-tian/nano-vllm 2026-01-20 02:26:36 +08:00
07f5220f40 Merge branch 'tzj/minference' of ssh://git.zijie-tian.site:2222/zijie-tian/nano-vllm into tzj/minference
37aecd4d52 📝 docs: add SparsePolicy implementation guide and update rules
fa7601f4b8 ♻️ refactor: remove cross-layer pipeline and rename compute_chunked_prefill
6080bf7554 🙈 chore: exclude planning-with-files from git tracking
e5a17c832c 📝 docs: add SparsePolicy architecture documentation
6 commits in this push (5 listed above)
zijie-tian pushed to tzj/minference at zijie-tian/nano-vllm 2026-01-20 02:18:55 +08:00
b1f292cf22 Merge branch 'tzj/minference' of ssh://git.zijie-tian.site:2222/zijie-tian/nano-vllm into tzj/minference
16fbcf9e4c docs: add RULER 32K chunked offload issue documentation
50520a6c3c [fix] fixed request to request error.
zijie-tian pushed to tzj/minference at zijie-tian/nano-vllm 2026-01-20 01:25:05 +08:00
a36f8569fc [WIP] Before refactor.
d3b41b2f64 🔧 chore: clean up claude-flow configuration
baa4be7e2e ♻️ refactor: migrate chunked prefill attention to SparsePolicy
6783a45e6f 🚧 wip: update sparse policy refactoring plan to v4
16b269d897 🚧 wip: update sparse policy refactoring plan to v4
zijie-tian pushed to tzj/minference at zijie-tian/nano-vllm 2026-01-19 22:34:02 +08:00
b97b0b96a0 [WIP] Before refactor the nanovllm sparse policy.
zijie-tian pushed to tzj/minference at zijie-tian/nano-vllm 2026-01-19 21:18:39 +08:00
b5da802dff [WIP] Before integrate the xattn operator.
zijie-tian pushed to tzj/minference at zijie-tian/nano-vllm 2026-01-19 03:30:11 +08:00
9e6fdc0650 [WIP] Before plan execute.
zijie-tian pushed to tzj/minference at zijie-tian/nano-vllm 2026-01-19 00:55:30 +08:00
50520a6c3c [fix] fixed request to request error.
zijie-tian pushed to tzj/minference at zijie-tian/nano-vllm 2026-01-18 20:34:57 +08:00
e6e0dc5d7d feat: add comprehensive RULER benchmark testing
0550a64339 feat: add dynamic port allocation from tzj/vs_offload
zijie-tian pushed to tzj/vs_offload at zijie-tian/nano-vllm 2026-01-18 19:34:25 +08:00
b8c00399af chore: sync submodule URL with tzj/minference (use HTTPS)
13586e689b docs: add chunked prefill integration plan
zijie-tian pushed to tzj/minference at zijie-tian/nano-vllm 2026-01-18 19:23:36 +08:00
d9890aa2cd chore: add Block-SparseAttention submodule from tzj/vs_offload
5a837c8c83 chore: update .gitignore with tzj/vs_offload configuration
d1bbb7efe2 chore: update claude configuration and rules from tzj/vs_offload
1a78ae74d5 feat: add claude-flow MCP configuration
c254c8c330 chore: add planning-with-files rule configuration
zijie-tian pushed to tzj/vs_offload at zijie-tian/nano-vllm 2026-01-18 10:42:43 +08:00
e72725c12b test: add OffloadedTensor unified test suite
zijie-tian pushed to tzj/vs_offload at zijie-tian/nano-vllm 2026-01-18 10:31:46 +08:00
cfb188c34a docs: add chunked prefill analysis for ultra-long sequences
zijie-tian pushed to tzj/vs_offload at zijie-tian/nano-vllm 2026-01-15 02:37:08 +08:00
2826a649de docs: add XAttention integration guide
24baeb6d5a chore: add planning-with-files rule configuration
57f4e9c6e6 docs: reorganize documentation files
ac1ccbceaa feat: add XAttention sparse policy integration
029894118d feat: add claude-flow MCP configuration
10 commits in this push (5 listed above)
zijie-tian created branch tzj/vs_offload in zijie-tian/nano-vllm 2026-01-15 02:37:08 +08:00
zijie-tian pushed to tzj/minference at zijie-tian/nano-vllm 2026-01-10 21:07:46 +08:00
03a8c033cb [claudesquad] update from 'add-llama-1' on 10 Jan 26 21:03 CST