What's Changed
- [Store] support resolving master RPC address from interface by @YiXR in https://github.com/kvcache-ai/Mooncake/pull/1784
- Bump google.golang.org/grpc from 1.59.0 to 1.79.3 in /mooncake-common/etcd by @dependabot[bot] in https://github.com/kvcache-ai/Mooncake/pull/1785
- [Transfer Engine] Round-robin slice batch across QPs in RdmaEndPoint::submitPostSend by @usernamehaha2022 in https://github.com/kvcache-ai/Mooncake/pull/1721
- [CI] Optimize CI/CD workflow execution order to implement a fail-fast mechanism by @LujhCoconut in https://github.com/kvcache-ai/Mooncake/pull/1782
- [STORE] split HA runtime and unify standby lifecycle by @YiXR in https://github.com/kvcache-ai/Mooncake/pull/1777
- [Build] add yalantinglibs submodule by @stmatengss in https://github.com/kvcache-ai/Mooncake/pull/1781
- [Store] Add C API for Mooncake Store by @jiangyukunok in https://github.com/kvcache-ai/Mooncake/pull/1763
- [P2P] fix: cannot disable NV_PEERMEM and enable CUDA at the same time by @stmatengss in https://github.com/kvcache-ai/Mooncake/pull/1797
- [Bugfix][Build] Fix S3SnapshotObjectStore Pimpl error & improve build reliability by @timzhang0727 in https://github.com/kvcache-ai/Mooncake/pull/1796
- fix: fix eviction notification unit test to ensure deterministic FIFO order by @00fish0 in https://github.com/kvcache-ai/Mooncake/pull/1800
- [TENT] Fix duplicate notify recv WR posting and PLOG misuse in RDMA endpoint by @dtcccc in https://github.com/kvcache-ai/Mooncake/pull/1803
- [CI] fix bugs with CI pr1782: fail-fast on format check and restore Ascend/Integration as PR gates by @LujhCoconut in https://github.com/kvcache-ai/Mooncake/pull/1806
- [Store] Add Upsert API for in-place object updates by @00fish0 in https://github.com/kvcache-ai/Mooncake/pull/1662
- [TENT] Disconnect before registering memory by @Cheng-China in https://github.com/kvcache-ai/Mooncake/pull/1807
- Update codeowners by @ykwd in https://github.com/kvcache-ai/Mooncake/pull/1819
- [Store] Add Go language bindings for Mooncake Store by @jiangyukunok in https://github.com/kvcache-ai/Mooncake/pull/1764
- [Misc] Fix silent failure in `code_format.sh` when clang-format is missing by @caozhanhao in https://github.com/kvcache-ai/Mooncake/pull/1824
- [PG] Implement graceful shutdown and reland CPU-only tests to CI by @caozhanhao in https://github.com/kvcache-ai/Mooncake/pull/1795
- [PG]: fix barrier implementation problem by @KMSorSMS in https://github.com/kvcache-ai/Mooncake/pull/1792
- [TE] Enabling UB Transport on the Kunpeng SuperNode Phase 1 by @zchuango in https://github.com/kvcache-ai/Mooncake/pull/1805
- [PG] Introduce comprehensive test suite by @yuechen-sys in https://github.com/kvcache-ai/Mooncake/pull/1790
- [STORE] tighten snapshot correctness and reload snapshot-only standby from catalog by @YiXR in https://github.com/kvcache-ai/Mooncake/pull/1801
- Add native Rust bindings for Mooncake Store with usage example and CI integration by @Copilot in https://github.com/kvcache-ai/Mooncake/pull/1810
- [TENT] Fix NVLink IPC address for sub-allocated GPU tensors by @he-yufeng in https://github.com/kvcache-ai/Mooncake/pull/1831
- [PG][TE][TENT] Add dedicated peer liveness probe for recovery and enable elastic GPU test by @KMSorSMS in https://github.com/kvcache-ai/Mooncake/pull/1808
- [TE] feat: setup the RDMA for mlu device. by @phantomlei3 in https://github.com/kvcache-ai/Mooncake/pull/1799
- Optimize ci fail-fast scheme by @LujhCoconut in https://github.com/kvcache-ai/Mooncake/pull/1813
- fix: increase ParallelAllocation test pool to 32MB to avoid flaky slab race by @00fish0 in https://github.com/kvcache-ai/Mooncake/pull/1841
- [Bug fix] Fix get tcp port collision by @zhangzuo21 in https://github.com/kvcache-ai/Mooncake/pull/1816
- [TENT] Register base address of a buffer instead of its sub-allocated address into BufferDesc by @shuoerw in https://github.com/kvcache-ai/Mooncake/pull/1837
- [TE] Add Multi-Protocol Support for DRAM-CXL-SSD tiered storage by @hemist in https://github.com/kvcache-ai/Mooncake/pull/1832
- [TENT] Fix stale segment cache via withCachedSegment and async invalidation by @caozhanhao in https://github.com/kvcache-ai/Mooncake/pull/1826
- [Store] Introduce HA OpLog abstraction and LocalFS oplog store by @duhaode520 in https://github.com/kvcache-ai/Mooncake/pull/1804
- [TRANSFER_ENGINE] align USE_MACA with MUSA GPU paths and docs by @Dayuxiaoshui in https://github.com/kvcache-ai/Mooncake/pull/1814
- feat(store): expose drain job control via master HTTP API by @XucSh in https://github.com/kvcache-ai/Mooncake/pull/1815
- [TENT] Fix potential deadlock and UAF in `synchronizeLocal` in [#1826] by @caozhanhao in https://github.com/kvcache-ai/Mooncake/pull/1849
- fix(docker): respect PYTHON_VERSION build-arg when building wheel by @staryxchen in https://github.com/kvcache-ai/Mooncake/pull/1745
- [CI] Harden CI pipeline: path filtering, concurrency, on-demand E2E, and security fixes by @00fish0 in https://github.com/kvcache-ai/Mooncake/pull/1846
- [store] Add get_into_ranges to support Grouped Scatter RDMA Reads by @zxpdemonio in https://github.com/kvcache-ai/Mooncake/pull/1717
- [CI] fix: slash command /run-e2e-ci fails for fork PRs by @00fish0 in https://github.com/kvcache-ai/Mooncake/pull/1859
- [Store] Fix `with_hard_pin` failure in python API by @0oshowero0 in https://github.com/kvcache-ai/Mooncake/pull/1873
- [Bug fix] Prevent redundant replica pinning during offloading by @ertcmm in https://github.com/kvcache-ai/Mooncake/pull/1853
- [TransferEngine] Add retry, async execution, and graceful shutdown for TENT TCP transport by @staryxchen in https://github.com/kvcache-ai/Mooncake/pull/1866
- [TransferEngine][ROCm] Add ROCm HIP support to the Mooncake Python package by @knitcapcat-amd in https://github.com/kvcache-ai/Mooncake/pull/1742
- [Docs] Add SSD offload benchmark results by @zhangzuo21 in https://github.com/kvcache-ai/Mooncake/pull/1835
- [Store] Support SSD offload via Python setup() interface by @zhangzuo21 in https://github.com/kvcache-ai/Mooncake/pull/1857
- [MISC] Add CODEOWNERS for efa_transport directory by @stmatengss in https://github.com/kvcache-ai/Mooncake/pull/1885
- [TE] Enabling UB Transport on the Kunpeng SuperNode Phase 2 by @zchuango in https://github.com/kvcache-ai/Mooncake/pull/1855
- [TransferEngine][MACA] Align MACA build paths with CMake options by @Dayuxiaoshui in https://github.com/kvcache-ai/Mooncake/pull/1888
- [TENT] Wire up cross-transport failover with safety limits and observability by @staryxchen in https://github.com/kvcache-ai/Mooncake/pull/1878
- [TENT] Enhance memory registration with transport type support by @staryxchen in https://github.com/kvcache-ai/Mooncake/pull/1877
- [Store] Support SSD Metrics by @LujhCoconut in https://github.com/kvcache-ai/Mooncake/pull/1879
- [TENT] Fix crashes caused by negative numa_node and incorrect config type inference by @dtcccc in https://github.com/kvcache-ai/Mooncake/pull/1894
- [Store] Enable NUMA-segmented allocation for RDMA RealClient-only mode by @00fish0 in https://github.com/kvcache-ai/Mooncake/pull/1838
- [CI] fix: health_check_test flaky due to fixed sleep, use polling for master down detection by @herbertskyper in https://github.com/kvcache-ai/Mooncake/pull/1868
- [CI] add configurable GitHub mirror fallback for Ascend checkout by @staryxchen in https://github.com/kvcache-ai/Mooncake/pull/1896
- feat(tent): replace raw RdmaEndPoint* with weak_ptr for endpoint lifecycle safety by @staryxchen in https://github.com/kvcache-ai/Mooncake/pull/1897
- [Store] Fix hardcoded 127.0.0.1 bind address in standalone client RPC… by @zhangzuo21 in https://github.com/kvcache-ai/Mooncake/pull/1900
- [TENT] Fix static library link group for final targets and reformat related CMake files by @staryxchen in https://github.com/kvcache-ai/Mooncake/pull/1893
- [PG][TENT] Fix CUDA collective wait semantics and NVLink small-transfer completion by @KMSorSMS in https://github.com/kvcache-ai/Mooncake/pull/1863
- [Doc] Update Client Explanation by @ykwd in https://github.com/kvcache-ai/Mooncake/pull/1905
- [TE] Add fi_read support, endpoint LRU eviction, and multi-NIC striping for EFA transport by @whn09 in https://github.com/kvcache-ai/Mooncake/pull/1821
- [Store] Enabling setting SSD offload path using python interface by @zhangzuo21 in https://github.com/kvcache-ai/Mooncake/pull/1884
- [Store] Expose batch_replica_clear in Python binding by @hnts03-moreh in https://github.com/kvcache-ai/Mooncake/pull/1848
- [CI] add format hook by @stmatengss in https://github.com/kvcache-ai/Mooncake/pull/1904
- [TENT] Batch transfer requests using cudaMemcpyBatchAsync by @shuoerw in https://github.com/kvcache-ai/Mooncake/pull/1890
- Minor bug fixes and improvements for Mooncake Store and Transfer Engine by @nickyc975 in https://github.com/kvcache-ai/Mooncake/pull/1895
- [Store] Fix segfault in disk-replica/offload paths when handling GPU VRAM pointers by @LujhCoconut in https://github.com/kvcache-ai/Mooncake/pull/1892
- fix(ci): retry ascend submodule update via GitHub mirrors by @staryxchen in https://github.com/kvcache-ai/Mooncake/pull/1924
- [TENT] add FaultProxyTransport for fault injection testing by @staryxchen in https://github.com/kvcache-ai/Mooncake/pull/1907
- [TE] PTE-aware auto-split large MR registration for EFA transport by @whn09 in https://github.com/kvcache-ai/Mooncake/pull/1912
- [Store] Add client bandwidth metrics for real and dummy clients by @stmatengss in https://github.com/kvcache-ai/Mooncake/pull/1874
- [store] Bug Fix: Local Disk Replica Metadata Not Cleaned Up After Store Node Offline by @Colors-111 in https://github.com/kvcache-ai/Mooncake/pull/1914
- [Store] unify file storage backend env vars under MOONCAKE_OFFLOAD_ p… by @LujhCoconut in https://github.com/kvcache-ai/Mooncake/pull/1929
- [Integration] connector_v1: subclass SupportsHMA so PD-disagg works for hybrid models by @HarshavardhanK in https://github.com/kvcache-ai/Mooncake/pull/1931
- [Store] auto-enable MC_STORE_MEMCPY in TCP-only environments by @dtcccc in https://github.com/kvcache-ai/Mooncake/pull/1936
- [TE] Fix a 1-second stall in RDMA WorkerPool due to store-buffer reordering by @chestnut-Q in https://github.com/kvcache-ai/Mooncake/pull/1932
- [CI] Restore auto-triggered ascend-test and integration-test in ci.yml by @LujhCoconut in https://github.com/kvcache-ai/Mooncake/pull/1943
- [PG] Fix wait() hang during CUDA Graph capture by @caozhanhao in https://github.com/kvcache-ai/Mooncake/pull/1933
- [Store]: Wait for all tasks to complete before completing a batch by @nickyc975 in https://github.com/kvcache-ai/Mooncake/pull/1906
- Pin a patched Go toolchain and track etcd go.sum for wrapper builds by @Copilot in https://github.com/kvcache-ai/Mooncake/pull/1937
- Refactor ASIO shared target into mooncake-common by @Copilot in https://github.com/kvcache-ai/Mooncake/pull/1926
- Bump version to 0.3.10.post2 in pyproject.toml by @ShangmingCai in https://github.com/kvcache-ai/Mooncake/pull/1949
New Contributors
- @usernamehaha2022 made their first contribution in https://github.com/kvcache-ai/Mooncake/pull/1721
- @jiangyukunok made their first contribution in https://github.com/kvcache-ai/Mooncake/pull/1763
- @timzhang0727 made their first contribution in https://github.com/kvcache-ai/Mooncake/pull/1796
- @zchuango made their first contribution in https://github.com/kvcache-ai/Mooncake/pull/1805
- @phantomlei3 made their first contribution in https://github.com/kvcache-ai/Mooncake/pull/1799
- @shuoerw made their first contribution in https://github.com/kvcache-ai/Mooncake/pull/1837
- @0oshowero0 made their first contribution in https://github.com/kvcache-ai/Mooncake/pull/1873
- @ertcmm made their first contribution in https://github.com/kvcache-ai/Mooncake/pull/1853
- @knitcapcat-amd made their first contribution in https://github.com/kvcache-ai/Mooncake/pull/1742
- @herbertskyper made their first contribution in https://github.com/kvcache-ai/Mooncake/pull/1868
- @hnts03-moreh made their first contribution in https://github.com/kvcache-ai/Mooncake/pull/1848
- @Colors-111 made their first contribution in https://github.com/kvcache-ai/Mooncake/pull/1914
- @HarshavardhanK made their first contribution in https://github.com/kvcache-ai/Mooncake/pull/1931
Full Changelog: https://github.com/kvcache-ai/Mooncake/compare/v0.3.10.post1...v0.3.10.post2