XIP-35: Message sender identifier in topic

Abstract

This XRC proposes adding a signature to the public envelope of each message. A notification server could use this signature to detect whether one of its subscribed users sent the message and, if so, skip the notification for that user (i.e. users avoid receiving notifications for their own messages).

Motivation

Right now, a message sent in a topic needs to have its payload decoded to understand who sent the message.

When a notification server sees a message in a topic and decides to send it to one of its subscribers, it has no way to know whether the message is from that subscriber or not.

This means a mobile XMTP client will subscribe to a topic then receive notifications for its own messages in the topic.

This would be fine if the XMTP client could just drop the notification after having decoded the payload.

On Apple devices, to be able to decode the payload before showing the notification, we must use a Notification Service Extension - and the only way to drop a notification from a Notification Service Extension is to obtain a specific entitlement from Apple: com.apple.developer.usernotifications.filtering

Obtaining this entitlement from Apple has proven very hard - they might consider XMTP clients too web3, and they just don’t provide this entitlement to apps that provide financial services.

Specifically, Coinbase Wallet never managed to obtain this entitlement, and Converse had it for a few months before losing it.

Specification

The SDKs would generate a private/public key pair for each topic.

The private/public key pair should always be the same for a given user and a given topic - not dependent on installations / SDK language.

Before publishing an envelope on a topic, the SDK would use the topic private key to sign the message payload and attach the resulting signature to the envelope.

The envelope of a message would be updated to

message Envelope {
  string content_topic = 1;
  uint64 timestamp_ns = 2;
  bytes message = 3;
  
  optional string signature = 4; // New field added
}

When the client subscribes to a topic on its notification server, it would also provide the topic public key to the notification server.

When the notification server sees a message in a specific topic, it would do the following

  1. Check if it has clients that have subscribed to this topic
  2. For each client that has subscribed to the topic, try to verify the signature using the public key the client has provided when subscribing to the topic
  3. If the signature is verified, it means that the message was indeed sent by this person (not necessarily from this client) and the notification server can decide whether to send the notification or not (could be a parameter configured by the developer)

Backwards Compatibility

The notification server should allow subscribing to topics without sending a public key. Such a client would then receive all notifications.

The notification server should also parse envelopes that don’t have any signature. These messages would be sent to every client that subscribed to the topic, whether they provided a public key for this topic or not.
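
The decision a notification server makes per subscriber, including the two fallback cases above, could be sketched as follows. This is only an illustration: the proposal does not name a signature scheme, so verification is passed in as a hypothetical `verify` callable, and `should_notify` and `notify_own_messages` are names invented here.

```python
from typing import Callable, Optional


def should_notify(
    subscriber_public_key: Optional[bytes],
    message: bytes,
    signature: Optional[bytes],
    verify: Callable[[bytes, bytes, bytes], bool],
    notify_own_messages: bool = False,
) -> bool:
    """Decide whether one subscriber gets a notification for one envelope.

    `verify(public_key, message, signature)` is a stand-in for whatever
    signature scheme the SDKs would use (an assumption, not part of the XIP).
    """
    # Backwards compatibility: no key from the client, or no signature on
    # the envelope, means the sender cannot be identified, so always notify.
    if subscriber_public_key is None or signature is None:
        return True
    # A valid signature under the subscriber's topic key means the
    # subscriber sent the message; the developer setting decides the rest.
    if verify(subscriber_public_key, message, signature):
        return notify_own_messages
    return True
```

The `notify_own_messages` parameter corresponds to the developer-configurable choice mentioned in step 3 of the specification.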

Copyright

Copyright and related rights waived via CC0.

This is a good proposal for what is a very difficult problem. Some additional suggestions are below. Thank you for submitting it, @Noe_Malzieu!

It seems that there are multiple kinds of push notifications that apps would like to suppress:

  • Messages sent by the subscriber
  • Messages of specific content types - read receipts, reaction removals, etc
  • Messages sent by blocked or unknown users

As a general principle, the protocol should expose as little information to the server as possible, and where forced to do so, for as short a duration as possible.

Messages sent by the subscriber

The original proposal has the desirable property that it only reveals which messages were sent by the current subscriber in a given topic, rather than revealing all senders. Some suggested additions:

  • It would be better to use an HMAC rather than a signature (we can add a sender_hmac field to the envelope instead of signature). Signatures do not necessarily provide anonymity (the inability to link signatures given many message-signature pairs), nor is it reasonable to assume that signatures only successfully verify under one public key.
  • The hmac key can be derived from the existing symmetric shared encryption key for the topic, with the user as input. This means that there is no need to distribute a new secret between clients, for both XMTP v2 and v3, and leaves open the option of supporting deniability in the rest of the protocol in the future.
  • The hmac key can be additionally derived from a time-based value such as thirty-day periods since the unix epoch. This ensures that the visibility granted to an app’s push server will automatically expire unless the user continues using that app, as well as providing protection from compromises and leaks.
thirty_day_periods_since_epoch = floor(seconds_since_epoch / 60 / 60 / 24 / 30)
current_sender_hmac_key = hkdf({
    algo: 'sha256',
    input_key: topic_encryption_key,
    info: thirty_day_periods_since_epoch + "-" + sender_address,
    salt: null
})
sender_hmac = hmac({
    key: current_sender_hmac_key,
    content: message_header_bytes
})
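
As a rough, runnable counterpart to the pseudocode above, here is a minimal Python sketch using only the standard library. The HKDF routine follows RFC 5869; the function names, the 32-byte key length, and the UTF-8 encoding of the `info` string are assumptions made for illustration, not what the XMTP SDKs necessarily do.

```python
import hashlib
import hmac

PERIOD_SECONDS = 60 * 60 * 24 * 30  # one thirty-day period


def hkdf_sha256(input_key: bytes, info: bytes, salt: bytes = b"", length: int = 32) -> bytes:
    """HKDF with SHA-256 (RFC 5869): extract a PRK, then expand it."""
    if not salt:
        # RFC 5869: a missing salt is treated as a string of zero bytes.
        salt = bytes(hashlib.sha256().digest_size)
    prk = hmac.new(salt, input_key, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def thirty_day_period(seconds_since_epoch: int) -> int:
    """Number of whole thirty-day periods since the unix epoch."""
    return seconds_since_epoch // PERIOD_SECONDS


def sender_hmac_key(topic_encryption_key: bytes, sender_address: str, period: int) -> bytes:
    """Derive the per-sender, per-period HMAC key from the topic key."""
    info = f"{period}-{sender_address}".encode("utf-8")  # encoding is an assumption
    return hkdf_sha256(topic_encryption_key, info)


def sender_hmac(key: bytes, message_header_bytes: bytes) -> bytes:
    """MAC that the sender would attach to the envelope."""
    return hmac.new(key, message_header_bytes, hashlib.sha256).digest()
```

Because the period and the sender address both feed into the derivation, a key handed to a push server only identifies one sender on one topic for one thirty-day window.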

When subscribing to a push server, an app could provide a list of conversation topics it would like to receive notifications for, and for each conversation, provide the current, previous, and next hmac key to prevent rollover issues. The logic for generating this list of keys can be encapsulated within a method in the XMTP SDK. Apps would regularly call this method, both to fetch hmac keys for newly subscribed conversation topics over time, as well as to periodically refresh the hmac keys.
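
The SDK method described above might look roughly like this sketch. `hmac_keys_for_topic` is a hypothetical name, and the actual key derivation is passed in as a `derive_key` callable so the example stays independent of any particular HKDF code.

```python
import time
from typing import Callable, Dict, Optional

PERIOD_SECONDS = 60 * 60 * 24 * 30  # one thirty-day period


def hmac_keys_for_topic(
    derive_key: Callable[[int], bytes],
    now_seconds: Optional[int] = None,
) -> Dict[int, bytes]:
    """Return the previous, current, and next period keys for one topic.

    `derive_key(period)` is assumed to return the sender HMAC key for the
    given thirty-day period index. Keys are indexed by period so the push
    server can pick the one matching a message's timestamp.
    """
    if now_seconds is None:
        now_seconds = int(time.time())
    current = now_seconds // PERIOD_SECONDS
    return {period: derive_key(period) for period in (current - 1, current, current + 1)}
```

An app would call this for each subscribed topic and send the resulting maps to its push server, repeating the call periodically so the server always holds a key for the upcoming period.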

When a push server is determining whether to push a notification for a given message, it would attempt to verify the mac using the hmac key corresponding to the message time. If the mac verifies successfully, the message should not be pushed. This logic can be encapsulated within a reference XMTP push server implementation.
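
A push server's check could be sketched as below. This assumes the server stores each subscriber's keys indexed by thirty-day period and that the MAC is HMAC-SHA256 over the message header bytes; the function and parameter names are illustrative only.

```python
import hashlib
import hmac
from typing import Dict, Optional

PERIOD_SECONDS = 60 * 60 * 24 * 30  # one thirty-day period


def should_push(
    keys_by_period: Dict[int, bytes],
    timestamp_ns: int,
    message_header_bytes: bytes,
    sender_hmac: Optional[bytes],
) -> bool:
    """Return False only when the MAC proves the subscriber sent the message."""
    if sender_hmac is None:
        return True  # envelope carries no MAC: sender unknown, so push
    # Pick the key for the thirty-day period the message was sent in.
    period = (timestamp_ns // 1_000_000_000) // PERIOD_SECONDS
    key = keys_by_period.get(period)
    if key is None:
        return True  # no key for that period (e.g. expired): push
    expected = hmac.new(key, message_header_bytes, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    return not hmac.compare_digest(expected, sender_hmac)
```

Defaulting to "push" whenever the sender cannot be identified means a missing or expired key can cause an unwanted notification but never a dropped one.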

Tradeoffs and limitations:

  • Given the hmac key for a given user on a given topic, the push server is able to forge the mac on other messages, thereby suppressing push notifications to that user. Note that the push server is able to suppress push notifications regardless, if it is the user’s only push server. This is partially mitigated by the limited lifetime of the hmac key.
  • Users within a conversation are able to derive each other’s hmac keys and therefore forge macs for each other, thereby suppressing push notifications for specific messages within that conversation. This is out of scope and could be a desirable feature in the future.
  • If a user does not open an app within a 30-60 day period, the hmac keys used by the app’s push server may expire. Given that this is only relevant if the user is still sending messages from other apps, this should be rare. Provided that the app is still installed on the user’s phone, the push server could choose to periodically send a background notification to the app to trigger a key refresh.

Messages of specific content types

Instead of exposing fine-grained information about message content types to the push server, messages can be annotated with a plaintext ‘shouldPush’ flag that is deterministically generated based on the content type on the sender’s side. The logic for deriving this flag should be consistently defined based on consensus from all app implementors and encapsulated inside the XMTP SDK.
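
As a small illustration of a deterministic flag, the sender-side logic could be as simple as the sketch below. The content type identifiers are hypothetical placeholders based on the examples mentioned earlier (read receipts, reaction removals); the real list would come from implementor consensus.

```python
# Hypothetical identifiers for content types that should not trigger a push.
NON_PUSH_CONTENT_TYPES = frozenset({"readReceipt", "reactionRemoval"})


def should_push_flag(content_type: str) -> bool:
    """Derive the plaintext shouldPush flag from the content type alone."""
    return content_type not in NON_PUSH_CONTENT_TYPES
```

Since the flag depends only on the content type, every SDK that shares the same list produces the same value, and the push server learns a single bit rather than the type itself.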

Messages from blocked and unknown users

In XMTP v2, only 1:1 conversations exist. All clients can deterministically map between allowed users and the conversation topic corresponding to each user, and only subscribe to those topics. By default, this means that inbound messages from unrecognized senders will not be pushed.

Hi @richardhua, thanks for your proposal.

For the HMAC signature: I thought this had indeed been decided in the last discussions we had with the XMTP team during the ISC meeting. I’m ok with it.

For making it time-based: I don’t think this was part of the initial discussion. I’m honestly not against it. However, we do have a hard deadline in March on Converse’s side to ship all this, so it all depends on how much more work it is on XMTP’s side, and then on our side, to implement it.

I understand that the client part would be encapsulated in the XMTP Client SDK, so why not. As for the server-side, we have a custom implementation of the Push server but I guess with XMTP guidance we can also make it work.

Basically, now that we have shared all the requirements, I’m ok with your proposal, or with the simpler, non time based one that I thought we agreed on before.

Hi @noe, for making the HMAC time-based, there is low additional cost on XMTP’s side (just including an extra time value in the key derivation input), and it significantly improves security. For implementors such as Converse, this should be abstracted away - periodically call a method in the XMTP SDK to obtain the hmac keys for the topics being subscribed to, and pass the result to the push server.

We are treating this issue as a priority and have been working on the implementation - it is complete on JS and in-progress on iOS and React Native. We will also be working on a reference server example shortly. @rygine is leading the coordination for this work.