-
Notifications
You must be signed in to change notification settings - Fork 865
perf(store): lazy CacheMultiStore — defer cachekv alloc until access #2806
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2806 +/- ##
===========================================
+ Coverage 46.28% 56.79% +10.51%
===========================================
Files 2005 2070 +65
Lines 163018 168396 +5378
===========================================
+ Hits 75449 95640 +20191
+ Misses 81047 64263 -16784
- Partials 6522 8493 +1971
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
| for k, parent := range cms.storeParents { | ||
| if _, exists := cms.stores[k]; !exists { | ||
| cms.stores[k] = handler(k, parent.(types.KVStore)) | ||
| } | ||
| } |
Check warning
Code scanning / CodeQL
Iteration over map Warning
| for k := range cms.storeParents { | ||
| delete(cms.storeParents, k) | ||
| } |
Check warning
Code scanning / CodeQL
Iteration over map Warning
| for k, parent := range cms.gigaStoreParents { | ||
| if _, exists := cms.gigaStores[k]; !exists { | ||
| cms.gigaStores[k] = handler(k, parent) | ||
| } | ||
| } |
Check warning
Code scanning / CodeQL
Iteration over map Warning
| for k := range cms.gigaStoreParents { | ||
| delete(cms.gigaStoreParents, k) | ||
| } |
Check warning
Code scanning / CodeQL
Iteration over map Warning
| for k := range cms.storeParents { | ||
| cms.ensureStoreLocked(k) | ||
| } |
Check warning
Code scanning / CodeQL
Iteration over map Warning
| for k := range cms.gigaStoreParents { | ||
| cms.ensureGigaStoreLocked(k) | ||
| } |
Check warning
Code scanning / CodeQL
Iteration over map Warning
d52d4a1 to
cc0301f
Compare
7abf219 to
6ac2e41
Compare
…l access CacheMultiStore previously created cachekv.NewStore wrappers eagerly for all ~21 module stores on every snapshot. This was wasteful because: 1. The OCC scheduler's prepareTask creates a CMS then immediately replaces all stores with VersionIndexedStores via SetKVStores — the cachekv stores allocated in CMS were discarded unused. 2. EVM transactions only access ~3 of 21 modules, so ~85% of cachekv stores per statedb snapshot were never touched. This change defers cachekv.NewStore allocation by storing parent references in storeParents/gigaStoreParents maps and materializing cachekv wrappers lazily on first GetKVStore/GetStore/GetGigaKVStore access. Benchmark results (M4 Max, 1000 EVM transfers/block): - TPS: 8,000-8,400 → 8,000-8,800 (~5% median uplift) - cachekv.NewStore allocations: -20,614 MB (-360M objects) - cachekv.Write allocations: -8,058 MB - memclrNoHeapPointers CPU: -9.94s - mallocgc CPU: -7.23s cumulative Co-Authored-By: Claude Opus 4.5 <[email protected]>
When CacheMultiStore branches (CacheContext, CacheTxContext), lazy parents were passed directly to the child CMS. This caused child.Write() to write directly to the raw commit store, bypassing the parent's caching layer — leading to state leaks (e.g., Simulate leaking account sequence increments to deliverState). Fix: force-materialize all lazy parents on the parent CMS before creating the child branch. This preserves proper cachekv isolation while keeping lazy init for the OCC hot path (SetKVStores). Co-Authored-By: Claude Opus 4.6 <[email protected]>
The lazy ensureStore/ensureGigaStore methods write to the stores and storeParents maps on first access. When multiple goroutines share a CMS (e.g., slashing BeginBlocker's HandleValidatorSignatureConcurrent), concurrent calls to GetKVStore race on these map operations. Add *sync.RWMutex with double-checked locking: the fast path (store already materialized) takes only an RLock; the slow path (first access) takes a full Lock. After all stores are materialized, every call hits the RLock-only fast path with ~10-20ns overhead. Co-Authored-By: Claude Opus 4.6 <[email protected]>
cc0301f to
8755c45
Compare
f8b8fc9 to
12550b7
Compare
Summary
cachekv.NewStoreallocation inCacheMultiStoreuntil the store is actually accessed viaGetKVStore/GetStore/GetGigaKVStorestoreParents/gigaStoreParentsmaps; materialize cachekv wrappers on-demand viaensureStore()/ensureGigaStore()prepareTaskdiscards them immediately, and EVM txs only touch ~3 of 21 modulesProblem
After lazy-init sortedCache (#2804), the next largest allocator was
cachekv.NewStoreitself — still called eagerly for all ~21 module stores perCacheMultiStore, even though EVM transactions only access ~3 modules. OCC'sprepareTaskcreates a CMS then immediately replaces stores with VersionIndexedStores, discarding the cachekv wrappers. Pure waste.Profiling after #2804 (30s, 1000 EVM transfers/block):
cachekv.NewStorecachekv.Write(re-creates MemDB)cachemulti.newCacheMultiStoreFromCMSmemclrNoHeapPointersChanges
sei-cosmos/store/cachemulti/store.go:newCacheMultiStoreFromCMS: Store parent KVStore references instoreParents/gigaStoreParentsmaps instead of eagerly wrapping withcachekv.NewStoreensureStore()/ensureGigaStore(): Lazy materializers that create cachekv wrapper on first accessGetKVStore/GetStore/GetGigaKVStore: CallensureStore()/ensureGigaStore()before returningBenchmark Results (M4 Max, 1000 EVM transfers/block, 30s profile)
cachekv.NewStorealloccachekv.WriteallocmemclrNoHeapPointersCPUmallocgcCPU (cum)runtime.(*mspan).heapBitsSmallForAddrCPUNote: total CPU samples increase because workers complete faster → more goroutines actively executing instead of blocked on allocator contention.
pprof -top -diff_basehighlights (alloc_space)pprof -top -diff_basehighlights (CPU)Test plan
cd giga/tests && go test ./... -count=1)cd sei-cosmos/store/cachemulti && go test ./... -count=1)gofmt -s -lclean🤖 Generated with Claude Code