XIP-68 (Draft): Automated Fork Recovery

XIP: Automated Fork Recovery

This XIP is a work in progress. Previous discussion: ‘Fork recovery design sketch’, Issue #2062 in xmtp/libxmtp on GitHub.

Overview

The MLS protocol requires all group members to maintain and advance the same encryption state as incoming commits update the group. Any deviation results in parallel ‘forks’ of the group, in which subsets of the group’s members are unable to communicate with each other. Forks may be caused by inconsistencies in commit processing between client versions, by bugs in message processing logic, and by concurrency issues, both locally on the client and across the distributed system. More information on forks can be found here.

The best way to address forks is to prevent them, via robust system design and comprehensive testing. However, in the event that the root cause of a fork was not discovered beforehand, fork recovery provides an ‘insurance plan’ that can give blanket coverage over a wide variety of issues.

The following describes an intuition for how this may be done.

  1. All installations maintain a local log containing the success and failure status of each commit they applied, as well as the resulting encryption state.
  2. Superadmin installations will append each local commit result to the group’s remote commit log. Updates to the remote commit log form a causal chain. In the event of conflicting updates, all readers follow the rule of ‘first write wins’.
  3. When any installation discovers that their local commit log does not match the remote commit log, they will send a ‘readd request’ to all superadmin installations as well as their own inbox.
  4. Any recipient of a ‘readd request’ will verify that their own state matches the remote commit log, before removing and readding the member to their fork.

Commit Log

As a pre-requisite for fork detection, there must be accurate tracking of the commit state. Here we introduce two concepts, the local commit log and the remote commit log.

Local Commit Log

All installations must maintain a local commit log for each group they are a member of. The following is an example schema:

CREATE TABLE local_commit_log (
    -- A locally assigned ID for the local log entry
    "rowid" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    "group_id" BLOB NOT NULL,
    -- The sequence ID of the commit being applied
    "commit_sequence_id" BIGINT NOT NULL,
    -- The encryption state of the group before the commit was applied
    "last_epoch_authenticator" BLOB NOT NULL,
    -- Whether the commit was successfully applied or not
    -- 1 = Applied, all other values are failures matching the protobuf
    "commit_result" INT NOT NULL,
    "error" TEXT,
    -- Items below this line are only set if the commit was applied
    "applied_epoch_number" BIGINT,
    "applied_epoch_authenticator" BLOB,
    -- Items below this line are for debugging purposes
    "sender_inbox_id" TEXT,
    "sender_installation_id" BLOB,
    "commit_type" INT
);

The epoch_authenticator indicates the group encryption state after the commit was applied, and is defined in the MLS protocol spec.

Each time the local group state has been transformed, or the cursor has been advanced past a failed commit, an entry must be added to the local commit log. Note that the commit log is permitted to contain duplicate entries (entries with the same commit_sequence_id but different rowid values) as well as out-of-order entries. Either case would indicate bugs in the client’s processing logic.
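As a rough illustration of the paragraph above, the following sketch appends entries to an in-memory SQLite copy of the example schema and lists any commit_sequence_id that was logged more than once. The debugging columns are omitted, and the helper names (log_commit, duplicate_commits) and the RESULT_* constants are hypothetical, not part of libxmtp.

```python
import sqlite3

# Example schema from the text, minus the debugging-only columns.
SCHEMA = """
CREATE TABLE local_commit_log (
    "rowid" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    "group_id" BLOB NOT NULL,
    "commit_sequence_id" BIGINT NOT NULL,
    "last_epoch_authenticator" BLOB NOT NULL,
    "commit_result" INT NOT NULL,
    "error" TEXT,
    "applied_epoch_number" BIGINT,
    "applied_epoch_authenticator" BLOB
);
"""

RESULT_APPLIED = 1      # "1 = Applied" in the schema comment
RESULT_WRONG_EPOCH = 2  # COMMIT_RESULT_WRONG_EPOCH in the protobuf enum


def log_commit(conn, group_id, commit_sequence_id, last_auth, result,
               error=None, applied_epoch=None, applied_auth=None):
    """Append one entry; returns the locally assigned rowid (hypothetical helper)."""
    cur = conn.execute(
        'INSERT INTO local_commit_log ("group_id", "commit_sequence_id", '
        '"last_epoch_authenticator", "commit_result", "error", '
        '"applied_epoch_number", "applied_epoch_authenticator") '
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        (group_id, commit_sequence_id, last_auth, result, error,
         applied_epoch, applied_auth),
    )
    return cur.lastrowid


def duplicate_commits(conn, group_id):
    """Commit sequence IDs logged more than once, which hints at a client bug."""
    rows = conn.execute(
        'SELECT "commit_sequence_id" FROM local_commit_log '
        'WHERE "group_id" = ? GROUP BY "commit_sequence_id" '
        'HAVING COUNT(*) > 1 ORDER BY "commit_sequence_id"',
        (group_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

The table deliberately has no uniqueness constraint on commit_sequence_id, so a duplicate is recorded rather than rejected and can be surfaced later for debugging.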

Remote Commit Log

For each group, all superadmin installations should additionally publish updates to a remote commit log. This can be performed asynchronously via a worker process, using the local commit log as a reference.

enum CommitResult {
  COMMIT_RESULT_UNSPECIFIED = 0;
  COMMIT_RESULT_APPLIED = 1;
  COMMIT_RESULT_WRONG_EPOCH = 2;
  COMMIT_RESULT_UNDECRYPTABLE = 3;
  COMMIT_RESULT_INVALID = 4;
}

message PlaintextCommitLogEntry {
  // The group_id of the group that the commit belongs to.
  bytes group_id = 1;
  // The sequence ID of the commit payload being validated.
  uint64 commit_sequence_id = 2;
  // The encryption state before the commit was applied.
  bytes last_epoch_authenticator = 3;
  // Indicates whether the commit was successful, or why it failed.
  CommitResult commit_result = 4;
  // The epoch number after the commit was applied, if successful.
  uint64 applied_epoch_number = 5;
  // The encryption state after the commit was applied, if successful.
  bytes applied_epoch_authenticator = 6;
}

message CommitLogEntry {
  uint64 sequence_id = 1;
  bytes encrypted_commit_log_entry = 2;
}
  
message PublishCommitLogEntryRequest {
  bytes group_id = 1;
  bytes encrypted_commit_log_entry = 2;
}

service MlsApi {
  rpc PublishCommitLogEntry(PublishCommitLogEntryRequest) returns (google.protobuf.Empty) {
    option (google.api.http) = {
      post: "/mls/v1/publish-commit-log-entry"
      body: "*"
    };
  }
}

These updates are encrypted with an AES-256 key specified in the group’s immutable metadata. On group creation, the creator of the group must generate a random 256-bit key and store it in the immutable metadata under the key commit_log_encryption_key. If this key is not present in the immutable metadata, superadmins are not required to publish updates to the commit log.
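A minimal sketch of the key handling described above, assuming the immutable metadata can be modelled as a plain dictionary (the real metadata encoding is not shown in this document). The function names are hypothetical, and the cipher mode used to encrypt entries is left out because the text does not specify it.

```python
import secrets

# Metadata key name taken from the text.
COMMIT_LOG_KEY_FIELD = "commit_log_encryption_key"


def new_immutable_metadata(creator_inbox_id: str) -> dict:
    """Build immutable metadata for a new group (hypothetical dict layout)."""
    return {
        "creator_inbox_id": creator_inbox_id,
        # 32 random bytes = a 256-bit key, drawn from the OS CSPRNG.
        COMMIT_LOG_KEY_FIELD: secrets.token_bytes(32),
    }


def can_publish_commit_log(metadata: dict) -> bool:
    """Superadmins only publish when a usable key is present.

    Rejecting keys of the wrong length is an assumption of this sketch.
    """
    key = metadata.get(COMMIT_LOG_KEY_FIELD)
    return isinstance(key, bytes) and len(key) == 32
```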

Fork Detection

Parsing the remote commit log

All installations should run a worker process which periodically pulls updates from the remote commit log, and stores them in a local cache, for example in a table named remote_commit_log.

Installations should start from the beginning of the remote commit log, and process entries sequentially as follows:

  1. Skip over the entry if any of the following apply:
    1. The entry is undecryptable.
    2. The group_id of the entry does not match the requested group_id.
    3. The last_epoch_authenticator does not match the applied_epoch_authenticator of the most recent entry with a CommitResult of COMMIT_RESULT_APPLIED.
    4. The entry has a CommitResult of COMMIT_RESULT_APPLIED, but the epoch number is not exactly 1 greater than the most recent entry with a result of COMMIT_RESULT_APPLIED.
  2. Otherwise, the remote entry is stored in the local cache.
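The skip rules above can be sketched as a single pass over the decrypted log. This is an illustration only: the RemoteEntry type and select_valid_entries function are hypothetical, an undecryptable entry is represented as None, and the sketch assumes that the authenticator and epoch checks are not enforced until a first applied entry has been accepted, which the text leaves open.

```python
from dataclasses import dataclass
from typing import List, Optional

RESULT_APPLIED = 1  # value 1 in the CommitResult enum


@dataclass(frozen=True)
class RemoteEntry:
    log_sequence_id: int
    group_id: bytes
    commit_sequence_id: int
    last_epoch_authenticator: bytes
    commit_result: int
    applied_epoch_number: int = 0
    applied_epoch_authenticator: bytes = b""


def select_valid_entries(
    group_id: bytes, decrypted: List[Optional[RemoteEntry]]
) -> List[RemoteEntry]:
    """Return the entries to cache, in server order."""
    cache: List[RemoteEntry] = []
    last_applied: Optional[RemoteEntry] = None
    for entry in decrypted:
        if entry is None:               # rule 1: undecryptable
            continue
        if entry.group_id != group_id:  # rule 2: wrong group
            continue
        if last_applied is not None:
            # Rule 3: the entry must build on the latest applied state.
            if entry.last_epoch_authenticator != last_applied.applied_epoch_authenticator:
                continue
            # Rule 4: applied entries must advance the epoch by exactly one.
            if (entry.commit_result == RESULT_APPLIED
                    and entry.applied_epoch_number != last_applied.applied_epoch_number + 1):
                continue
        cache.append(entry)
        if entry.commit_result == RESULT_APPLIED:
            last_applied = entry
    return cache
```

Because rules 3 and 4 are evaluated against the most recently accepted applied entry, a later conflicting write for the same epoch is skipped, which gives the ‘first write wins’ behaviour described in the overview.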

The following is an example schema for the cache:

CREATE TABLE remote_commit_log (
    -- The sequence ID of the log entry on the server
    "log_sequence_id" BIGINT NOT NULL,
    "group_id" BLOB NOT NULL,
    -- The sequence ID of the commit being referenced
    "commit_sequence_id" BIGINT NOT NULL,
    -- Whether the commit was successfully applied or not
    -- 1 = Applied, all other values are failures matching the protobuf
    "commit_result" INT NOT NULL,
    -- Items below this line are only set if the commit was successfully applied
    "applied_epoch_number" BIGINT,
    "applied_epoch_authenticator" BLOB
);

Comparing the local state with the remote state

Installations should also periodically determine if their current state is correct:

  1. Compute the (epoch_number, epoch_authenticator) of the group’s current MLS state. Search for the corresponding epoch_number with a commit_result of APPLIED in the cached remote commit log. If found:
    1. If the epoch_authenticator does not match, the group is likely forked.
    2. Otherwise, find the corresponding entry in the local commit log, if it exists, and for each entry after this entry, compare it with the remote commit log.
      1. If no mismatch is found, the group is not forked.
      2. Otherwise, the group is likely forked.
  2. Otherwise, the remote commit log has not reached this epoch number yet. In the remote commit log cache, find the entry with the latest epoch_number with a commit_result of APPLIED. For each entry starting from that entry in the remote commit log, check that the remote commit log entry matches the corresponding local commit log entry.
    1. If no mismatch is found, the group is not forked.
    2. Otherwise, the group is likely forked.
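The comparison above could look roughly like the following. The text does not define precisely what it means for two log entries to ‘match’, so this sketch assumes that entries are paired by commit_sequence_id, that a pair matches when the commit_result agrees and (for applied commits) the resulting epoch number and authenticator agree, and that entries with no counterpart in the other log are ignored. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

RESULT_APPLIED = 1


@dataclass(frozen=True)
class LogEntry:
    commit_sequence_id: int
    commit_result: int
    applied_epoch_number: Optional[int] = None
    applied_epoch_authenticator: Optional[bytes] = None


def entries_match(local: LogEntry, remote: LogEntry) -> bool:
    """Assumed matching rule: same result and, if applied, same resulting state."""
    if local.commit_result != remote.commit_result:
        return False
    if local.commit_result != RESULT_APPLIED:
        return True
    return (local.applied_epoch_number == remote.applied_epoch_number
            and local.applied_epoch_authenticator == remote.applied_epoch_authenticator)


def is_likely_forked(epoch_number: int, epoch_authenticator: bytes,
                     local_log: List[LogEntry], remote_cache: List[LogEntry]) -> bool:
    remote_by_commit, local_by_commit = {}, {}
    for r in remote_cache:
        remote_by_commit.setdefault(r.commit_sequence_id, r)  # first write wins
    for l in local_log:
        local_by_commit[l.commit_sequence_id] = l             # latest local entry

    anchor = next((r for r in remote_cache
                   if r.commit_result == RESULT_APPLIED
                   and r.applied_epoch_number == epoch_number), None)
    if anchor is not None:
        # Step 1: the remote log already covers the current epoch.
        if anchor.applied_epoch_authenticator != epoch_authenticator:
            return True
        idx = next((i for i, l in enumerate(local_log)
                    if l.commit_result == RESULT_APPLIED
                    and l.applied_epoch_number == epoch_number), None)
        later_local = [] if idx is None else local_log[idx + 1:]
        return any(l.commit_sequence_id in remote_by_commit
                   and not entries_match(l, remote_by_commit[l.commit_sequence_id])
                   for l in later_local)

    # Step 2: the remote log is behind; compare from its latest applied entry.
    applied = [i for i, r in enumerate(remote_cache) if r.commit_result == RESULT_APPLIED]
    start = max(applied, key=lambda i: remote_cache[i].applied_epoch_number) if applied else 0
    return any(r.commit_sequence_id in local_by_commit
               and not entries_match(local_by_commit[r.commit_sequence_id], r)
               for r in remote_cache[start:])
```

A result of True only means the group is likely forked. As noted below, a malicious writer can produce the same signal.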

Repeated queries can be optimized, for example via a cursor into both logs.

Note that malicious group members may publish incorrect entries that cause the status of the group to incorrectly be reported as forked, so this cannot be treated as an absolute detection mechanism. In ‘Fork Recovery’, we describe how this case is mitigated. In ‘Future Work’, we describe how this could be prevented in the future.

Fork Recovery

Readd commits

Sending a readd request

When a fork is detected, the installation should send a readd request. In order to send it, a single-use MLS group is constructed containing all superadmin installations (according to the group metadata on the local forked state), as well as all other installations under its own inbox. The readd payload is encrypted and sent on this group in standard MLS fashion.

The readd request contains the group ID, and the latest epoch according to the remote commit log.

message ReaddRequest {
  bytes group_id = 1;
  uint64 commit_log_epoch = 2;
}

The readd request is also stored in the local state as follows:

Receiving a readd request

When an installation receives a readd request, the following steps must be taken:

  1. If the readd request is for a group of which either the recipient’s inbox or the sender’s inbox is not a member, the request is ignored.
  2. The recipient notes the sender’s identity and inserts or updates the (group_id, inbox_id, installation_id, last_requested_epoch, last_responded_epoch, result=PENDING) readd record in its local DB. The last_responded_epoch is preserved if it already exists, or set to 0.
  3. The recipient sets the scheduled_readd_ns on its local copy of the group to a random (or deterministic?) jitter 0-5 minutes after the local time. If a time was already set, it is not overwritten. This mitigates the ‘thundering herd’ problem while also allowing for batch readds.
  4. Anytime a readd commit is received on a group, for each affected installation, if the commit’s epoch is greater than the last_responded_epoch on the installation’s readd record, the last_responded_epoch is updated. If the last_responded_epoch is greater than the last_requested_epoch, the result is set to SUCCESS_REMOTE.
  5. A worker process on the recipient is started anytime after the scheduled_readd_ns on the group. The worker process:
    1. Syncs the group, and then the commit log.
    2. If the recipient is no longer a member of the group, or if the local commit log conflicts with the remote commit log, all pending readds on the group are marked with result=FAILED.
    3. If the local commit log is ahead of the remote commit log, scheduled_readd_ns is set to 10 minutes after the local time, and the current run of the process is aborted.
    4. For each readd with result=PENDING, if the installation is no longer a member of the group, the readd is marked with result=FAILED.
    5. All remaining installations are added to a readd commit that is published to the group. This installation list is also preserved on the local intent state.
    6. The scheduled_readd_ns for the group is cleared.
  6. When the published readd commit is received:
    1. If there are no epoch errors or conflicts, the last_responded_epoch and result are updated for each installation in the local intent state. Welcomes should be constructed and sent for each affected installation.
    2. Otherwise, the intent is failed, and the scheduled_readd_ns is set to the local time so that the worker process can retry.
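Steps 2 to 4 above are mostly bookkeeping, and can be sketched as follows. This is an in-memory illustration under stated assumptions: the ReaddState and ReaddRecord classes are hypothetical stand-ins for the local DB, the jitter is drawn uniformly at random (the text leaves random versus deterministic open), and the worker process in steps 5 and 6 is not modelled.

```python
import random
from dataclasses import dataclass
from typing import Dict, Tuple

MAX_JITTER_NS = 5 * 60 * 1_000_000_000  # 5 minutes in nanoseconds

# (group_id, inbox_id, installation_id)
Key = Tuple[bytes, str, bytes]


@dataclass
class ReaddRecord:
    last_requested_epoch: int
    last_responded_epoch: int = 0  # set to 0 when the record is first created
    result: str = "PENDING"


class ReaddState:
    def __init__(self) -> None:
        self.records: Dict[Key, ReaddRecord] = {}
        self.scheduled_readd_ns: Dict[bytes, int] = {}

    def on_request(self, key: Key, commit_log_epoch: int, now_ns: int, rng=random) -> None:
        """Steps 2 and 3: upsert the record, then schedule the worker."""
        record = self.records.get(key)
        if record is None:
            self.records[key] = ReaddRecord(last_requested_epoch=commit_log_epoch)
        else:
            # last_responded_epoch is preserved on update.
            record.last_requested_epoch = commit_log_epoch
            record.result = "PENDING"
        group_id = key[0]
        # An existing schedule is never overwritten, which allows batching.
        if group_id not in self.scheduled_readd_ns:
            self.scheduled_readd_ns[group_id] = now_ns + rng.randint(0, MAX_JITTER_NS)

    def on_readd_commit(self, key: Key, commit_epoch: int) -> None:
        """Step 4: record a readd commit observed for this installation."""
        record = self.records.get(key)
        if record is None:
            return
        if commit_epoch > record.last_responded_epoch:
            record.last_responded_epoch = commit_epoch
        if record.last_responded_epoch > record.last_requested_epoch:
            record.result = "SUCCESS_REMOTE"
```

Passing the random source as a parameter keeps the scheduling testable, and would make it straightforward to swap in a deterministic jitter if the draft settles on one.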

Receiving a welcome

Work in progress. The installation must validate that a readd was requested, and that the sender of the welcome is either an installation from the same inbox or a superadmin.
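Since this part of the design is still in progress, the following is only a sketch of the validation rule as currently stated. The function name and parameters are hypothetical, and the superadmin set is assumed to come from the metadata on the installation’s own (possibly forked) copy of the group.

```python
from typing import Set


def should_accept_readd_welcome(
    readd_requested: bool,
    sender_inbox_id: str,
    sender_installation_id: bytes,
    own_inbox_id: str,
    superadmin_installation_ids: Set[bytes],
) -> bool:
    """Accept a welcome only if we asked for a readd and trust the sender."""
    if not readd_requested:
        # Unsolicited welcomes for an existing group are rejected.
        return False
    if sender_inbox_id == own_inbox_id:
        # Another installation under the same inbox.
        return True
    return sender_installation_id in superadmin_installation_ids
```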

Security

  1. The remote commit log is encrypted by an AES-256 key written by the creator of the group onto the group’s immutable metadata.
  2. By convention, only superadmins may write to the remote commit log, but nothing technically prevents other group members (or former group members) from writing to and reading from the log.
  3. When receiving a welcome payload that was issued in response to a ‘readd request’, installations will verify that it was issued by a superadmin of the group (according to the metadata on their fork), or another installation under its own inbox, before accepting it.

In this model, malicious group members may write malicious updates to the remote log, causing all group members to send readd requests. However, no superadmin will match the state of the remote log, and hence no installation will be able to service the readd request, resulting in a no-op. This has the practical effect of disabling automated fork recovery, while all group members continue processing payloads on the main ‘fork’.

Future work

Decentralization

The remote commit log is required to be totally ordered; however, the trust assumptions for the log are relaxed. In order to achieve this ordering, the creator of the group may nominate an originator that should receive all publishes to the log, with all other installations following this recommendation. In the event that the originator becomes unavailable, or tampers with the ordering of the log, fork recovery simply becomes disabled, as described in ‘Security’.

Restricting updates to the commit log, and/or recovering from bad updates

Creating an encryption key if none exists on the immutable metadata

Rollover for inactive superadmins