XIP-68 (Draft): Automated Fork Recovery


The high-level approach in this XIP is unlikely to change, but some specifics are subject to revision. Previous iterations are discussed in the GitHub issue ‘Fork recovery design sketch’ (xmtp/libxmtp, Issue #2062).

Overview

The MLS protocol requires all group members to maintain and advance the same encryption state as incoming commits update the group. Deviations result in parallel ‘forks’ of the group, in which subsets of the group’s members are unable to communicate with each other. Forks may be caused by inconsistencies in commit processing between client versions, bugs in message processing logic, and concurrency issues, both local to the client and arising from the distributed system. More information on forks can be found here.

The best way to address forks is to prevent them, via robust system design and comprehensive testing. However, in the event that the root cause of a fork was not discovered beforehand, fork recovery provides an ‘insurance plan’ that can give blanket coverage over a wide variety of issues.

The following describes an intuition for how this may be done.

  1. All installations maintain a local log containing the success and failure status of each commit they applied, as well as the resulting encryption state.
  2. Superadmin installations will append each local commit result to the group’s remote commit log. The remote commit log may contain duplicates or conflicting updates from multiple installations. Readers follow the rule of ‘first write wins’, reading the log sequentially and discarding conflicting updates.
  3. When any installation discovers that their local commit log conflicts with the remote commit log, they will send a ‘readd request’ to all superadmin installations as well as their own inbox.
  4. Any recipient of a ‘readd request’ will verify that their own state matches the remote commit log, before removing and readding the member to their fork.

Commit Log

As a prerequisite for fork detection, the commit state must be tracked accurately. Here we introduce two concepts: the local commit log and the remote commit log.

Local Commit Log

All installations must maintain a local commit log for each group they are a member of. The following is an example schema:

CREATE TABLE local_commit_log (
    -- A locally assigned ID for the local log entry
    "rowid" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    "group_id" BLOB NOT NULL,
    -- The sequence ID of the commit being applied
    -- For welcomes, this is the sequence ID of the commit that spawned the welcome
    -- For group creation, this is 0
    "commit_sequence_id" BIGINT NOT NULL,
    -- The encryption state of the group before the commit was applied
    "last_epoch_authenticator" BLOB NOT NULL,
    -- Whether the commit was successfully applied or not
    "commit_result" INT NOT NULL,
    -- The state after the commit was applied, or the existing state otherwise
    "applied_epoch_number" BIGINT NOT NULL,
    "applied_epoch_authenticator" BLOB NOT NULL
);

The epoch_authenticator is derived from the epoch secret and uniquely identifies the group encryption state after the commit was applied, but is not itself sensitive information. It is defined in the MLS protocol spec.

Each time the local group state has been transformed, or the cursor has been advanced past a failed commit, an entry must be added to the local commit log. Note that the commit log is permitted to contain duplicate entries (the same commit_sequence_id with different rowid values) as well as out-of-order entries. Either would indicate bugs in the client’s processing logic.

Group creation, device sync restores, and welcomes without a corresponding cursor may be stored with a commit_sequence_id of 0. These entries may be useful for local debugging purposes even if they cannot be compared remotely.
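
As an illustration only, the following Python sketch appends entries to a local commit log using the example schema and then lists any commit_sequence_id that was logged more than once. The helper names append_local_entry and duplicated_sequence_ids are hypothetical and not part of this XIP. The sketch assumes SQLite.

```python
import sqlite3

# Same columns as the example schema above (comments omitted for brevity).
SCHEMA = """
CREATE TABLE local_commit_log (
    "rowid" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    "group_id" BLOB NOT NULL,
    "commit_sequence_id" BIGINT NOT NULL,
    "last_epoch_authenticator" BLOB NOT NULL,
    "commit_result" INT NOT NULL,
    "applied_epoch_number" BIGINT NOT NULL,
    "applied_epoch_authenticator" BLOB NOT NULL
);
"""


def append_local_entry(conn, group_id, commit_sequence_id, last_auth,
                       commit_result, applied_epoch, applied_auth):
    """Append one entry and return its locally assigned rowid (hypothetical helper)."""
    cur = conn.execute(
        "INSERT INTO local_commit_log (group_id, commit_sequence_id, "
        "last_epoch_authenticator, commit_result, applied_epoch_number, "
        "applied_epoch_authenticator) VALUES (?, ?, ?, ?, ?, ?)",
        (group_id, commit_sequence_id, last_auth, commit_result,
         applied_epoch, applied_auth),
    )
    return cur.lastrowid


def duplicated_sequence_ids(conn, group_id):
    """Return commit sequence IDs logged more than once for a group.

    Entries with a commit_sequence_id of 0 (group creation, restores) are
    ignored, since many of them may legitimately exist.
    """
    rows = conn.execute(
        "SELECT commit_sequence_id FROM local_commit_log "
        "WHERE group_id = ? AND commit_sequence_id > 0 "
        "GROUP BY commit_sequence_id HAVING COUNT(*) > 1 "
        "ORDER BY commit_sequence_id",
        (group_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

A client could run a check like duplicated_sequence_ids in debug builds to surface processing bugs early, without rejecting the duplicate writes themselves.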

Remote Commit Log

For each group, all superadmin installations should additionally publish updates to a remote commit log. This can be performed asynchronously via a worker process, using the local commit log as a reference. Note that group creation operations (those with a commit_sequence_id of 0) should be omitted from the remote commit log.

enum CommitResult {
  COMMIT_RESULT_UNSPECIFIED = 0;
  COMMIT_RESULT_APPLIED = 1;
  COMMIT_RESULT_WRONG_EPOCH = 2;
  COMMIT_RESULT_UNDECRYPTABLE = 3;
  COMMIT_RESULT_INVALID = 4;
}

message PlaintextCommitLogEntry {
  // The group_id of the group that the commit belongs to.
  bytes group_id = 1;
  // The sequence ID of the commit payload being validated.
  uint64 commit_sequence_id = 2;
  // The encryption state before the commit was applied.
  bytes last_epoch_authenticator = 3;
  // Indicates whether the commit was successful, or why it failed.
  CommitResult commit_result = 4;
  // The epoch number after the commit was applied, or the existing state otherwise
  uint64 applied_epoch_number = 5;
  // The encryption state after the commit was applied, or the existing state otherwise
  bytes applied_epoch_authenticator = 6;
}

message CommitLogEntry {
  uint64 sequence_id = 1;
  bytes encrypted_commit_log_entry = 2;
}
  
message PublishCommitLogEntryRequest {
  bytes group_id = 1;
  bytes encrypted_commit_log_entry = 2;
}

service MlsApi {
  rpc PublishCommitLogEntry(PublishCommitLogEntryRequest) returns (google.protobuf.Empty) {
    option (google.api.http) = {
      post: "/mls/v1/publish-commit-log-entry"
      body: "*"
    };
  }
}

These updates are encrypted with an AES-256 key specified in the group’s immutable metadata. On group creation, the creator of the group must generate a random 256-bit key and store it in the immutable metadata under the key commit_log_encryption_key. If this key is not present in the immutable metadata, superadmins are not required to publish updates to the commit log.
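
A minimal sketch of the key handling described above, assuming for illustration that the immutable metadata can be modelled as a Python dict. The functions new_immutable_metadata and commit_log_key, and the creator_inbox_id field, are hypothetical. The AES mode and the wire encoding are not specified in this section and are deliberately left out.

```python
import secrets

COMMIT_LOG_KEY_FIELD = "commit_log_encryption_key"
KEY_LEN_BYTES = 32  # 256 bits


def new_immutable_metadata(creator_inbox_id):
    """Build immutable metadata for a new group, including a fresh random key.

    The dict layout and the creator_inbox_id field are assumptions made for
    this sketch, not the actual metadata encoding.
    """
    return {
        "creator_inbox_id": creator_inbox_id,
        COMMIT_LOG_KEY_FIELD: secrets.token_bytes(KEY_LEN_BYTES),
    }


def commit_log_key(immutable_metadata):
    """Return the commit log key, or None if the group has no usable key.

    A None result means superadmins are not required to publish commit log
    updates for this group.
    """
    key = immutable_metadata.get(COMMIT_LOG_KEY_FIELD)
    if key is None or len(key) != KEY_LEN_BYTES:
        return None
    return key
```

Treating a key of the wrong length as absent is a choice made in this sketch; the text above only covers the case where the key is missing.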

Fork Detection

Parsing the remote commit log

All installations should run a worker process which periodically pulls updates from the remote commit log, and stores them in a local cache, for example in a table named remote_commit_log.

Installations should store entries sequentially starting from the beginning of the remote commit log, skipping over entries for which any of the following apply:

  1. The entry is undecryptable.
  2. Any field of the decrypted entry is NULL.
  3. The group_id of the entry does not match the requested group_id.
  4. The commit_sequence_id of the entry is <= 0.
  5. The commit_sequence_id of the entry is not greater than the most recently stored entry, if one exists.
  6. The last_epoch_authenticator does not match the applied_epoch_authenticator of the most recently stored entry with a CommitResult of COMMIT_RESULT_APPLIED, if one exists.
  7. The entry has a CommitResult of COMMIT_RESULT_APPLIED, but its epoch number is not exactly 1 greater than that of the most recently stored entry with a result of COMMIT_RESULT_APPLIED, if one exists.
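
The seven skip rules above can be sketched as a single sequential pass, which is also what gives ‘first write wins’ semantics. This is an illustrative sketch rather than a reference implementation. The Entry class mirrors PlaintextCommitLogEntry, and the decrypt callable is a hypothetical stand-in that returns None for undecryptable payloads.

```python
from dataclasses import astuple, dataclass
from typing import Callable, Iterable, List, Optional, Tuple

COMMIT_RESULT_APPLIED = 1


@dataclass(frozen=True)
class Entry:
    """Decrypted log entry; None models a missing (NULL) field."""
    group_id: Optional[bytes]
    commit_sequence_id: Optional[int]
    last_epoch_authenticator: Optional[bytes]
    commit_result: Optional[int]
    applied_epoch_number: Optional[int]
    applied_epoch_authenticator: Optional[bytes]


def filter_remote_log(
    group_id: bytes,
    raw_entries: Iterable[Tuple[int, bytes]],
    decrypt: Callable[[bytes], Optional[Entry]],
) -> List[Tuple[int, Entry]]:
    """Return the (log_sequence_id, entry) pairs that should be cached."""
    stored: List[Tuple[int, Entry]] = []
    last_applied: Optional[Entry] = None
    for log_sequence_id, ciphertext in raw_entries:
        entry = decrypt(ciphertext)
        if entry is None:                                  # rule 1
            continue
        if any(value is None for value in astuple(entry)):  # rule 2
            continue
        if entry.group_id != group_id:                     # rule 3
            continue
        if entry.commit_sequence_id <= 0:                  # rule 4
            continue
        if stored and entry.commit_sequence_id <= stored[-1][1].commit_sequence_id:
            continue                                       # rule 5
        if last_applied is not None:
            if entry.last_epoch_authenticator != last_applied.applied_epoch_authenticator:
                continue                                   # rule 6
            if (entry.commit_result == COMMIT_RESULT_APPLIED
                    and entry.applied_epoch_number != last_applied.applied_epoch_number + 1):
                continue                                   # rule 7
        stored.append((log_sequence_id, entry))
        if entry.commit_result == COMMIT_RESULT_APPLIED:
            last_applied = entry
    return stored
```

Because rule 5 rejects any commit_sequence_id that is not strictly greater than the last stored one, a later conflicting entry for an already-stored commit is discarded without any explicit conflict check.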

The following is an example schema for the cache:

CREATE TABLE remote_commit_log (
    -- The sequence ID of the log entry on the server
    "log_sequence_id" BIGINT NOT NULL,
    "group_id" BLOB NOT NULL,
    -- The sequence ID of the commit being referenced
    "commit_sequence_id" BIGINT NOT NULL,
    -- Whether the commit was successfully applied or not
    -- 1 = Applied, all other values are failures matching the protobuf
    "commit_result" INT NOT NULL,
    -- The state after the commit was applied, or the existing state otherwise
    "applied_epoch_number" BIGINT NOT NULL,
    "applied_epoch_authenticator" BLOB NOT NULL
);

Comparing the local state with the remote state

Installations should also periodically determine if their current state matches the remote state:

  1. Iterate through the local_commit_log by rowid in descending order (from most recent to least recent). For each row:
    1. If the commit_sequence_id of the row is present in the cached remote commit log, compare the applied_epoch_authenticator of both entries, and return.
    2. If the row is a Welcome, stop iterating after the check has been applied - the installation has been readded, so earlier state should not be checked.
  2. If no matching commit_sequence_id has been found, then the result is indeterminate.

If a mismatching applied_epoch_authenticator is found, then the installation may be forked, and should perform the steps described in ‘Fork Recovery’.
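
The comparison can be sketched as follows. LocalRow, ForkStatus and check_fork are hypothetical names. The is_welcome flag is an assumption of this sketch, since the example local schema does not show how welcomes are marked. The remote cache is assumed to be available as a mapping from commit_sequence_id to applied_epoch_authenticator.

```python
from enum import Enum
from typing import Dict, Iterable, NamedTuple


class ForkStatus(Enum):
    MATCHES = "matches"
    FORKED = "forked"
    INDETERMINATE = "indeterminate"


class LocalRow(NamedTuple):
    rowid: int
    commit_sequence_id: int
    applied_epoch_authenticator: bytes
    is_welcome: bool  # assumed marker; not a column in the example schema


def check_fork(local_rows: Iterable[LocalRow],
               remote_by_seq: Dict[int, bytes]) -> ForkStatus:
    """Compare the most recent comparable local entry with the remote cache."""
    for row in sorted(local_rows, key=lambda r: r.rowid, reverse=True):
        remote_auth = remote_by_seq.get(row.commit_sequence_id)
        if remote_auth is not None:
            # The first commit found in both logs decides the outcome.
            if remote_auth == row.applied_epoch_authenticator:
                return ForkStatus.MATCHES
            return ForkStatus.FORKED
        if row.is_welcome:
            # State from before a (re)add must not be checked.
            break
    return ForkStatus.INDETERMINATE
```

Only a FORKED result should trigger the recovery steps. INDETERMINATE just means that no commit is yet present in both logs.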

Repeated checks can be cached, for example by the latest seen rowid from both logs.

Note that malicious group members may publish incorrect entries that cause the status of the group to incorrectly be reported as forked, so this cannot be treated as an absolute detection mechanism. In ‘Security’, we describe how this case is mitigated. In ‘Future Work’, we describe how this could be prevented in the future.

Fork Recovery

Readd commits

Sending a readd request

When a fork is detected, the installation should send a readd request. In order to send it, a single-use MLS group is constructed containing all superadmin installations (according to the group metadata on the local forked state), as well as all other installations under its own inbox. The readd payload is encrypted and sent on this group in standard MLS fashion.

The readd request contains the group ID, and the latest epoch according to the remote commit log.

message ReaddRequest {
  bytes group_id = 1;
  uint64 commit_log_epoch = 2;
}

The readd request is also stored in the local state as follows:

Receiving a readd request

When an installation receives a readd request, the following steps must be taken:

  1. If the readd request refers to a group that either the recipient’s or the sender’s inbox is not a member of, the request is ignored.
  2. The recipient notes the sender’s identity and inserts or updates the (group_id, inbox_id, installation_id, last_requested_epoch, last_responded_epoch, result=PENDING) readd record in its local DB. The last_responded_epoch is preserved if it already exists, or set to 0.
  3. The recipient sets the scheduled_readd_ns on its local copy of the group to the local time plus a jitter of 0-5 minutes (whether the jitter should be random or deterministic is still an open question). If a time was already set, it is not overwritten. This mitigates the ‘thundering herd’ problem while also allowing for batch readds.
  4. Anytime a readd commit is received on a group, for each affected installation, if the commit’s epoch is greater than the last_responded_epoch on the installation’s readd record, the last_responded_epoch is updated. If the last_responded_epoch is greater than the last_requested_epoch, the result is set to SUCCESS_REMOTE.
  5. A worker process on the recipient is started anytime after the scheduled_readd_ns on the group. The worker process:
    1. Syncs the group, and then the commit log.
    2. If the recipient is no longer a member of the group, or if the local commit log conflicts with the remote commit log, all pending readds on the group are marked with result=FAILED.
    3. If the local commit log is ahead of the remote commit log, scheduled_readd_ns is set to 10 minutes after the local time, and the current run of the process is aborted.
    4. For each readd with result=PENDING, if the installation is no longer a member of the group, the readd is marked with result=FAILED.
    5. All remaining installations are added to a readd commit that is published to the group. This installation list is also preserved on the local intent state.
    6. The scheduled_readd_ns for the group is cleared.
  6. When the published readd commit is received:
    1. If there are no epoch errors or conflicts, the last_responded_epoch and result are updated for each installation in the local intent state. Welcomes should be constructed and sent for each affected installation.
    2. Otherwise, the intent is failed, and the scheduled_readd_ns is set to the local time so that the worker process can retry.
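
Steps 2 to 4 above can be sketched as three small functions over an in-memory record store. This is a simplified illustration: the ReaddRecord class, the dict keyed by (group_id, inbox_id, installation_id), and the function names are all hypothetical, and random jitter is assumed even though the choice between random and deterministic jitter is still open.

```python
import random
from dataclasses import dataclass

PENDING = "PENDING"
SUCCESS_REMOTE = "SUCCESS_REMOTE"
MAX_JITTER_NS = 5 * 60 * 10**9  # 5 minutes in nanoseconds


@dataclass
class ReaddRecord:
    last_requested_epoch: int
    last_responded_epoch: int = 0
    result: str = PENDING


def record_readd_request(records, key, requested_epoch):
    """Step 2: insert or update the record, preserving last_responded_epoch."""
    record = records.get(key)
    if record is None:
        record = ReaddRecord(last_requested_epoch=requested_epoch)
        records[key] = record
    else:
        record.last_requested_epoch = requested_epoch
        record.result = PENDING
    return record


def schedule_readd(scheduled_readd_ns, now_ns, rng=random):
    """Step 3: pick a time 0-5 minutes from now unless one is already set."""
    if scheduled_readd_ns is not None:
        return scheduled_readd_ns
    return now_ns + rng.randrange(MAX_JITTER_NS + 1)


def observe_readd_commit(record, commit_epoch):
    """Step 4: note a readd commit that affected this installation."""
    if commit_epoch > record.last_responded_epoch:
        record.last_responded_epoch = commit_epoch
    if record.last_responded_epoch > record.last_requested_epoch:
        record.result = SUCCESS_REMOTE
```

Keeping an already scheduled time in schedule_readd is what allows several requests arriving within the jitter window to be served by one batched readd commit.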

Receiving a welcome

Work in progress. The installation must validate that a readd was requested, and that the sender of the welcome is either an installation from the same inbox or a superadmin.
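
Since this step is still work in progress, the following is only a sketch of the two checks named above, under the assumption that the recipient already knows whether it has an outstanding readd request for the group. The function and parameter names are hypothetical.

```python
def should_accept_readd_welcome(readd_requested,
                                sender_installation_id,
                                sender_inbox_id,
                                own_inbox_id,
                                superadmin_installation_ids):
    """Decide whether a welcome answering a readd request may be accepted.

    superadmin_installation_ids is taken from the group metadata on the
    recipient's own (possibly forked) state.
    """
    if not readd_requested:
        # No readd was requested for this group, so the welcome is unsolicited.
        return False
    if sender_inbox_id == own_inbox_id:
        # Another installation under the recipient's own inbox.
        return True
    return sender_installation_id in superadmin_installation_ids
```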

Security

  1. The remote commit log is encrypted by an AES-256 key written by the creator of the group onto the group’s immutable metadata.
  2. In normal operation, only superadmins may write to the remote commit log, but nothing technically prevents other group members (or former group members) from writing to and reading from the log.
  3. When receiving a welcome payload that was issued in response to a ‘readd request’, installations will verify that it was issued by a superadmin of the group (according to the metadata on their fork), or another installation under its own inbox, before accepting it.

In this model, malicious group members may write malicious updates to the remote log, causing all group members to send readd requests. However, no superadmin will match the state of the remote log, and hence no installation will be able to service the readd request, resulting in a no-op. This has the practical effect of disabling automated fork recovery, while all group members continue processing payloads on the main ‘fork’.

Future work

Decentralization

The remote commit log is required to be totally ordered; however, the trust assumptions for the log are relaxed. In order to achieve this ordering, the creator of the group may nominate an originator that should receive all publishes to the log, with all other installations following this recommendation. In the event that the originator becomes unavailable, or tampers with the ordering of the log, fork recovery simply becomes disabled, as described in ‘Security’.

Permissioning updates to the commit log, and/or recovering from log entries written by malicious group members (which currently have the effect of disabling fork recovery)

Creating an encryption key if none exists on the immutable metadata (to enable fork recovery on older groups)

Rollover for inactive superadmins
