generated from MetaMask/metamask-module-template
-
Notifications
You must be signed in to change notification settings - Fork 6
feat(remote-comms): Add kernel incarnation detection protocol #788
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
sirtimid
wants to merge
23
commits into
main
Choose a base branch
from
sirtimid/issue-689-incarnation-detection
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Contributor
Coverage Report
File Coverage |
2 tasks
Add Handshake and HandshakeAck message types to RemoteMessageBase for kernel incarnation detection protocol. These types will be used to exchange incarnation IDs during connection establishment. Part of #689 Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Generate unique incarnation ID (UUID) in Kernel constructor - Store incarnation ID in memory only (not persisted) - Pass incarnation ID through RemoteManager to remote-comms - Update PlatformServices type and RPC handlers to accept incarnationId The incarnation ID is used to detect when a peer has lost its state and reconnected with the same peer ID. Part of #689 Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Add remoteIncarnationId field to PeerState type - Add setRemoteIncarnation() method that returns true on incarnation change - Add getRemoteIncarnation() method for retrieving stored incarnation - Add unit tests for incarnation tracking - Update existing tests to include remoteIncarnationId field Part of #689 Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Add handshake message types and detection logic - Send handshake on outbound connections when incarnationId is provided - Handle incoming handshake and reply with handshakeAck - Handle incoming handshakeAck messages - Track remote incarnation IDs in PeerStateManager - Trigger onRemoteGiveUp callback when remote incarnation changes - Update reconnection-lifecycle to pass isOutbound flag for handshake - Add comprehensive unit tests for handshake protocol The handshake protocol allows kernels to detect when a peer has lost its state (incarnation changed) and trigger promise rejection for any pending work with that peer. Part of #689 Co-Authored-By: Claude Opus 4.5 <[email protected]>
Add e2e tests for the incarnation detection feature: - Test that handshake is exchanged on connection establishment - Test that incarnation change is detected when peer restarts with fresh state - Mark test for message delivery after restart as todo (needs investigation) Part of #689 Co-Authored-By: Claude Opus 4.5 <[email protected]>
Replace crypto.randomUUID() with a manual UUID v4 generation using crypto.getRandomValues(), which is supported in older browsers: - randomUUID(): Chrome 92+, Firefox 95+, Safari 15.4+ - getRandomValues(): Chrome 11+, Firefox 21+, Safari 6.1+ Co-Authored-By: Claude Opus 4.5 <[email protected]>
…ility" This reverts commit fd0a785.
- Move handshake to connection level in new handshake.ts module - Persist incarnation ID to store via getOrCreateIncarnationId() - Remove incarnationId parameter from RemoteManager (uses store) - Consolidate identity reset logic in Kernel.resetKernelState() - Remove Handshake/HandshakeAck from RemoteMessageBase (handled at transport) The incarnation ID now persists across kernel restarts but resets when storage is cleared, correctly detecting actual state loss vs. restarts. Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Remove reconnectionHolder indirection pattern in transport.ts Use definite assignment assertion instead of holder object - Remove duplicate writeWithTimeout from handshake.ts, import from channel-utils - Fix PlatformServicesServer to accept incarnationId parameter - Remove redundant e2e tests for handshake exchange and todo test Co-Authored-By: Claude Opus 4.5 <[email protected]>
The isHandshakeMessage type guard was only checking for the method property but not validating that params.incarnationId exists and is a string. This could cause runtime errors when a malformed message with the correct method but missing params or incarnationId was received. Co-Authored-By: Claude Opus 4.5 <[email protected]>
The previous implementation used AbortSignal.timeout() which creates an internal timer that continues running even after the read succeeds. This could cause minor memory leaks as the timer and event listener weren't properly cleaned up. Now using AbortController with a manual setTimeout/clearTimeout to ensure the timer is always cleaned up in the finally block. Co-Authored-By: Claude Opus 4.5 <[email protected]>
Replace the definite assignment assertion (!) with a safer holder pattern that throws a clear error if handleConnectionLoss is called before initialization. This provides runtime safety if code is refactored incorrectly in the future, while keeping TypeScript happy. Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Remove unnecessary delay(3000) in incarnation detection test since await will naturally wait for the promise to settle - Add cleanup for the fresh database file created during the test to avoid accumulating test artifacts on disk Co-Authored-By: Claude Opus 4.5 <[email protected]>
Implement the 5 tests that were marked as .todo() due to async handshake flow complexity: - times out after 10 seconds when write hangs - handles timeout errors and triggers connection loss handling - error message includes correct timeout duration - handles multiple concurrent writes with timeout - reuses existing channel when inbound connection arrives during reconnection dial All tests now use proper mocking with makeAbortSignalMock to simulate timeout behavior in a controllable way. Co-Authored-By: Claude Opus 4.5 <[email protected]>
Extract the abort handler to a named function so it can be removed in the finally block, preventing a minor memory leak where the event listener would remain registered after the read completes. Co-Authored-By: Claude Opus 4.5 <[email protected]>
RemoteManager independently gets the incarnation ID from kernelStore, making the Kernel field redundant. Co-Authored-By: Claude Opus 4.5 <[email protected]>
The HandshakeDeps interface defined getRemoteIncarnation but neither performOutboundHandshake nor performInboundHandshake used it. Co-Authored-By: Claude Opus 4.5 <[email protected]>
…ient The PlatformServicesClient.initializeRemoteComms method was missing the incarnationId parameter, causing the handshake feature to silently fail in the browser runtime. Co-Authored-By: Claude Opus 4.5 <[email protected]>
62c07ec to
606f03f
Compare
…ages If JSON.parse returns null, accessing parsed.method throws a TypeError before the ?? 'unknown' fallback can evaluate. Use optional chaining to safely handle null values. Co-Authored-By: Claude Opus 4.5 <[email protected]>
Merged connection limit try-catch handling from main with handshake logic from this branch. Also integrated closeRejectedChannel helper for cleaner inbound connection handling. Co-Authored-By: Claude Opus 4.5 <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
packages/ocap-kernel/src/remotes/platform/peer-state-manager.ts
Outdated
Show resolved
Hide resolved
- Remove unused getRemoteIncarnation method from PeerStateManager - Change Logger import to type-only import in handshake.ts - Update tests to use getState() instead of removed method Co-Authored-By: Claude Opus 4.5 <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Closes #689
Summary
Implements a handshake protocol for kernel incarnation detection. When kernels connect, they exchange incarnation IDs to detect when a peer has lost its state (e.g., storage cleared) but retained the same peer ID.
Key Changes
Incarnation ID Management
kernelStore.getOrCreateIncarnationId()RemoteManager→initRemoteComms→initTransportHandshake Protocol (
handshake.ts)handshakemessage, wait forhandshakeAckhandshake, reply withhandshakeAckparams.incarnationIdIncarnation Change Detection
PeerStateManagertracks remote incarnation IDs per peersetRemoteIncarnation()returnstrueif incarnation changed from a known previous valueonRemoteGiveUpcallback triggers promise rejection for pending workPromise Rejection Flow
onRemoteGiveUp(peerId)→RemoteManager.#handleRemoteGiveUp()RemoteHandle.rejectPendingRedemptions()for pending URL redemptionsFiles Changed
handshake.tstransport.tspeer-state-manager.tsreconnection-lifecycle.tsstore/index.tsgetOrCreateIncarnationId()methodKernel.tsRemoteManager.tsRemoteHandle.tsrejectPendingRedemptions()methodTesting
🤖 Generated with Claude Code
Note
Medium Risk
Introduces a new connection-level handshake and propagates a new
incarnationIdparameter through the remote-comms stack; mistakes could break connectivity or cause unexpected remote give-ups/retries.Overview
Adds kernel incarnation detection to remote comms by generating/persisting an
incarnationIdinKernelStoreand threading it throughRemoteManager→initRemoteComms/RPC →PlatformServices→initTransport.transportnow performs an explicit handshake (handshake/handshakeAck, with timeout) on both outbound connects and inbound accepts (including during reconnection), tracksremoteIncarnationIdper peer, and triggersonRemoteGiveUpwhen a known peer reconnects with a different incarnation.Updates reset semantics to optionally clear identity (including
incarnationId), extends RPC validation forinitializeRemoteComms, and adds/adjusts unit + e2e coverage for handshake behavior, reconnection ordering, and incarnation-change failure handling.Written by Cursor Bugbot for commit 516cd65. This will update automatically on new commits. Configure here.