Use DeferredChainMonitor for non-VSS storage backends #782
joostjager wants to merge 1 commit into lightningdevkit:main
Conversation
👋 Hi! I see this is a draft PR.
Hmm, that's unfortunate. I imagine especially mobile and VSS-driven nodes would benefit the most from any change improving on the CM/CM inconsistency situation?
Yea, I think it's an open question what we should do. On the one hand, nodes with remote persistence are going to be the most impacted by the increase in sending latency (which is probably something where we're currently in an unacceptably bad state, given how single-threaded some of LDK's logic around the BP is!). OTOH, they are also somewhat more likely to hit the FC-due-to-out-of-sync issues because they have high-latency persistence. I've mentioned this to Joost, but another option we have is to do the chanman and monitor writes at the same time while spawning them in order, which will at least give us likely (though not guaranteed) ordering protection. We should maybe discuss live which option we want to go with. In any case, since this is now using the async pipeline for monitor persistence anyway, we should probably switch to actual async persistence for monitors at the same time.
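A minimal sketch of the spawn-in-order idea, assuming tokio; both persist functions are hypothetical placeholders for the real ChannelManager and ChannelMonitor writes, not LDK API:

```rust
// Hypothetical async writes standing in for the real ChannelManager and
// ChannelMonitor persistence calls.
async fn persist_chanman() { /* serialize + write the ChannelManager */ }
async fn persist_monitor() { /* serialize + write the ChannelMonitor */ }

async fn persist_both_in_order() {
    // Spawn the manager write first so it *likely* reaches the store first;
    // both futures then run concurrently, so the added latency is roughly
    // one round trip instead of two. Completion order is not guaranteed,
    // which is why this only shrinks the race window.
    let chanman = tokio::spawn(persist_chanman());
    let monitor = tokio::spawn(persist_monitor());
    let _ = tokio::join!(chanman, monitor);
}
```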
Parallel writes started in order still don't fully close the gap, though. We'd remain in "mostly works" territory where the race window is smaller but not eliminated. As discussed offline, for high-latency backends, an option to avoid unnecessary round trips is batched writes. These don't need to be atomic (which would require all KV stores to support transactions), just ordered: write chanman first, then the monitors, but send them together. This would fix the FC problem without being unnecessarily slow for remote storage. The downside is extending the KVStore interface with a batch write method, but we could provide a blanket implementation for existing KV stores that just iterates through the writes sequentially. For VSS specifically, we'd implement actual batch sending to get the latency benefit.
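For illustration, a sketch of what the extension could look like. The trait below is a simplified stand-in for LDK's KVStore (the real one lives in lightning::util::persist and differs in detail), and write_batch is a hypothetical name; the concrete changes are in the illustration PR linked in the next comment.

```rust
use std::io;

// Simplified stand-in for LDK's KVStore trait; not the real interface.
pub trait KVStore {
    fn write(
        &self, primary_namespace: &str, secondary_namespace: &str, key: &str, buf: &[u8],
    ) -> Result<(), io::Error>;

    // Hypothetical batch extension: ordered, not atomic. The default
    // implementation just iterates, so every existing store works unchanged;
    // a remote backend can override it to send the batch in one round trip.
    fn write_batch(
        &self, writes: &[(&str, &str, &str, &[u8])],
    ) -> Result<(), io::Error> {
        for (pns, sns, key, buf) in writes {
            // Apply in the given order: the chanman entry first, monitors after.
            self.write(pns, sns, key, buf)?;
        }
        Ok(())
    }
}
```

A VSS-backed store could override write_batch to ship all entries in a single request while preserving their order, which is where the latency win would come from.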
Illustration of the code changes for batch writes: lightningdevkit/rust-lightning#4379
Updated lightningdevkit/rust-lightning#4379 to also show how the batch writes can be used in the background processor. Note that this is based on
Replace the ChainMonitor type alias to point to DeferredChainMonitor instead of the regular ChainMonitor. This enables safe persistence ordering (ChannelManager persisted before channel monitors) for all backends, including VSS.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
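For illustration, the shape of the swap, heavily simplified; the stand-in types below exist only for this sketch, and the real alias in ldk-node carries a long list of generic parameters (signer, broadcaster, fee estimator, logger, persister, ...):

```rust
// Stand-in types for illustration only.
mod chainmonitor {
    pub struct ChainMonitor; // regular: persists each monitor update inline
    pub struct DeferredChainMonitor; // defers persistence until flushed
}

// Before:
// pub(crate) type ChainMonitor = chainmonitor::ChainMonitor;
// After:
pub(crate) type ChainMonitor = chainmonitor::DeferredChainMonitor;
```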
joostjager force-pushed from aa8bf83 to a48ba95
After offline discussion, we've landed on using DeferredChainMonitor for all backends (including VSS). Pushed the simplified version. Next step is further parallelization of the background processor loop to reduce the impact of deferred writing. If more performance improvements are needed after that, we can again look into multi-key writes to bundle everything in a single round trip.
PoC branch to evaluate integrating DeferredChainMonitor (lightningdevkit/rust-lightning#4345) into ldk-node.
For local storage backends (SQLite, filesystem), this uses DeferredChainMonitor, which defers monitor persistence operations until explicitly flushed.
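To make the ordering concrete, a minimal sketch of a persistence round under this scheme; both method names are assumptions, not published APIs:

```rust
use std::io;

// Assumed method names standing in for whatever the background processor
// and DeferredChainMonitor actually expose.
trait PersistenceHandles {
    async fn persist_channel_manager(&self) -> io::Result<()>;
    async fn flush_chain_monitor(&self) -> io::Result<()>;
}

async fn persist_round(node: &impl PersistenceHandles) -> io::Result<()> {
    // 1. Write the ChannelManager while monitor persistence is still deferred.
    node.persist_channel_manager().await?;
    // 2. Only now flush the queued monitor updates. A crash between steps 1
    //    and 2 leaves the monitors behind the manager on disk; the reverse
    //    order is what risks restarting with a stale ChannelManager, i.e.
    //    the force-close scenario discussed in the conversation above.
    node.flush_chain_monitor().await?;
    Ok(())
}
```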
VSS continues to use the regular ChainMonitor. While this isn't safe against force-closes, it avoids introducing potentially high-latency channel manager writes into the critical path. Note that this exemption currently provides no practical benefit: the background processor loop isn't sufficiently parallelized, so payment latency wouldn't actually increase even if we used deferred writing for VSS as well. It is primarily a forward-looking optimization for when that parallelization is addressed.