Hybrid Groups
This page documents a new addition to the Cwtch protocol that is currently undergoing review. All material presented here should be considered provisional, and may contain errors.
The Problem with Legacy Cwtch Groups
One of the unique features of Cwtch is that groups are dependent on untrusted infrastructure.
Because of this, at its most basic, a Cwtch group is simply an agreement between a set of peers on a common cryptographic key, and a common (set of) untrusted server(s).
This provides Cwtch Groups with very nice properties such as anonymity to anyone not in the group, but it does mean that certain other nice properties like member flexibility, and credential rotation are difficult to achieve.
We want to allow people to make the right trade-off when it comes to their own risk models, i.e. to be able to trade efficiency for trust when that decision makes sense.
To do that we need to introduce a new class of group into Cwtch, something we are calling Hybrid Groups.
What Are Hybrid Groups?
The goal of hybrid groups is to balance the security properties of Cwtch peer-to-peer communication with the properties of untrusted infrastructure.
This is done by augmenting existing Cwtch Groups with an additional layer of peer-to-peer communication in order to provide efficient participant management, key rotation, and other useful features.
Desirable Properties for Hybrid Groups
As with the rest of Cwtch, our ultimate goal is that no metadata (and specifically as part of this work, no group metadata e.g. membership, message timing) be available to a party outside of the group.
Traditional Cwtch Groups take this to the extreme, at the expense of long syncing times and a high possibility of disruption. Managed Groups and Augmented Groups will allow communities to make the right trade-offs, allowing for greater resilience and faster syncing.
No amount of cryptography can prevent a member betraying the group as a whole, by leaking the key or leaking sensitive transcripts.
We do require that our group messaging protocol ensures the following properties (as defined in Unger et al):
- confidentiality, integrity, authentication i.e. assuming key secrecy: only group members can read a message, no honest party will accept a modified message, and each participant can verify the source of each message (and is aware of who else is in the group).
- transcript consistency i.e. all members can resolve the same transcript (eventually).
- causality i.e. all messages can be strictly ordered
- speaker consistency i.e. all members agree on the content and order of messages sent by each member
- expandable and contractable membership i.e. we can add and remove members from our group at any point in time
- forward secrecy i.e. compromising all key material at a given point in time does not compromise previously sent communications
- future secrecy i.e. compromising all key material at a given point in time does not compromise future communications (assuming the compromised party is excluded from future updates)
However, we also operate under the following constraints governed by the decentralized nature of Cwtch:
- we cannot assume any group member will be online at the same time as any other group member (unless we explicitly make that assumption in the case of untrusted infrastructure or group management bots).
- relatedly, we cannot assume that any group member will be able to directly connect and message any other group member.
Because of this, we require a few additional properties:
- asynchronicity i.e. messages can be sent to group members when they are offline (through untrusted infrastructure or always-online managers).
- dropped message resilience i.e. messages can be decrypted without the recipient being aware of all previous messages
- out-of-order arrival i.e. if a message is late in arriving (e.g. it was prepared offline and then sent later), it can still be decrypted and included in the transcript.
Given all of that, we are willing to make a few concessions:
- computational and trust inequality i.e. we are willing to allow some group members / protocol participants to be more trusted / do more computation than others.
- no subgroup messaging i.e. once a given group is established, we do not require the ability for members to send messages to only a subgroup of those members without forming a new group.
- limited participant/message repudiation i.e. the practical protection provided by cryptographic repudiation is contested. In contexts outside of cryptography the bar for evidence to meet is "beyond reasonable doubt" and thus, in the context of a malicious insider attempting to prove that a given person was a member of a group, or sent a particular message, this bar is trivial to meet in both the online and offline settings. Thus, to simplify implementation and analysis we will not require any group protocol to provide repudiation outside of the properties provided by the peer-to-peer layer of the Cwtch protocol itself.
There exist many protocols that provide all of our criteria, and more. However, we also have a few extra properties we can leverage:
- pre-existing cryptographic identifiers - we can rely on all members having a cwtch address and thus an ed25519 public and private key pair that can be used to establish both authenticated p2p sessions and authenticated signatures.
- cwtch untrusted servers - we already have a (crude) mechanism for offline delivery that is metadata-resistant.
There is little sense introducing additional cryptographic primitives if we can avoid it. Thus the final desirable property of hybrid groups is:
- simplified implementation i.e. avoid introducing new cryptography into the core cwtch library, unless a desired property cannot be established without it.
It is for this reason that we are willing to make the concessions noted above. We are not striving to design the perfect group protocol, but one that builds on Cwtch in a controlled and compatible way.
Managed Groups Formal Model
With all of that out of the way, we can now introduce the concept of Managed Groups.
In Managed Groups, participants are split into 3 distinct categories:
- The Group Manager - an actor that is assumed to be always online such that the other participants can connect to them and sync group protocol messages and metadata
- The Group Leader(s) - a set of actors that are able to issue protocol messages that fundamentally change the nature of the group e.g. adding or removing participants, changing participant permissions, and moderating messages.
- The Group Members - regular group participants whose privileges within the group are governed by the Group Leaders.
Message Identifiers
When joining a new group, each group member randomly generates a group member id: an unsigned 32-bit number. For each new message this member posts to the group, they sequentially increment a counter and append it to the group member id. For example, Bob is invited to join a new group. Bob first generates their public group id for this group, 0x1D3F92D1, and then sends their first message with id 0x1D3F92D100000001.
This scheme prevents cut-and-paste attacks where a malicious group manager creates two groups with a member, and then re-encrypts a message from one to the other. Under this scheme, such an attack is easily detectable by the presence of multiple messages with conflicting counters.
The choice of 32-bit identifiers and counters allows 2^32 (~4 billion) group identifiers, each allowing 2^32 (~4 billion) messages per group member before such an attack even becomes possible.
If a malicious group member re-uses a group member id in different groups then it only reduces the security of their own messages (as it allows them to be cut-and-pasted into other conversations).
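The identifier layout described above can be sketched in Go. This is a minimal illustration, assuming the counter is packed into the low 32 bits of a 64-bit value (which matches Bob's example id); the function names `NewMemberGroupID`, `ComposeMessageID` and `SplitMessageID` are hypothetical and not part of the Cwtch library.

```go
package main

import (
	"crypto/rand"
	"encoding/binary"
	"fmt"
)

// NewMemberGroupID draws a random 32-bit group member id (hypothetical helper).
func NewMemberGroupID() (uint32, error) {
	var b [4]byte
	if _, err := rand.Read(b[:]); err != nil {
		return 0, err
	}
	return binary.BigEndian.Uint32(b[:]), nil
}

// ComposeMessageID places the group member id in the high 32 bits and the
// per-member message counter in the low 32 bits (an assumed layout).
func ComposeMessageID(memberGroupID, counter uint32) uint64 {
	return uint64(memberGroupID)<<32 | uint64(counter)
}

// SplitMessageID recovers the group member id and counter from a message id.
func SplitMessageID(id uint64) (memberGroupID, counter uint32) {
	return uint32(id >> 32), uint32(id)
}

func main() {
	// Bob's example from the text: member id 0x1D3F92D1, first message.
	id := ComposeMessageID(0x1D3F92D1, 1)
	fmt.Printf("0x%016X\n", id) // prints 0x1D3F92D100000001

	member, err := NewMemberGroupID()
	if err != nil {
		panic(err)
	}
	fmt.Printf("fresh member id: 0x%08X\n", member)
}
```

Keeping the two halves separable lets a recipient compare the member id half against the id that author announced for this group before looking at the counter.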
Transcript CRDT
This sequential numbering scheme for group messages allows us to build up the conversation transcript in a CRDT (conflict-free replicated data type). Each group member will maintain their own copy of the transcript tree assembled from each individual member message.
Each member message will contain the following fields:
	type GroupMessage struct {
		Author          string // the author's cwtch address
		MemberGroupID   uint32
		MemberMessageID uint32
		MessageBody     string
		Sent            uint64 // milliseconds since epoch
		Signature       []byte // of json-encoded content (including empty sig)
	}
When assembling the tree, the recipient will first check the signature and, if it verifies, insert the message into its position.
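The signing and checking steps might look like the following Go sketch. It assumes the signature covers the JSON encoding of the message with an empty `Signature` field, as the struct comment suggests, and that the caller has already resolved the author's cwtch address to an ed25519 public key (that lookup is not shown). `Sign`, `Verify`, `signingBytes` and `roundTrip` are illustrative names only.

```go
package main

import (
	"crypto/ed25519"
	"encoding/json"
	"fmt"
)

type GroupMessage struct {
	Author          string
	MemberGroupID   uint32
	MemberMessageID uint32
	MessageBody     string
	Sent            uint64
	Signature       []byte
}

// signingBytes JSON-encodes a copy of the message with an empty signature.
func signingBytes(m GroupMessage) ([]byte, error) {
	m.Signature = []byte{} // m is a copy; the caller's message is untouched
	return json.Marshal(m)
}

// Sign fills in m.Signature using the author's private key.
func Sign(m *GroupMessage, priv ed25519.PrivateKey) error {
	data, err := signingBytes(*m)
	if err != nil {
		return err
	}
	m.Signature = ed25519.Sign(priv, data)
	return nil
}

// Verify reports whether m.Signature is valid for the given public key.
func Verify(m GroupMessage, pub ed25519.PublicKey) bool {
	data, err := signingBytes(m)
	if err != nil {
		return false
	}
	return ed25519.Verify(pub, data, m.Signature)
}

// roundTrip signs a message, verifies it, then verifies a tampered copy.
func roundTrip() (valid, tamperedValid bool) {
	pub, priv, err := ed25519.GenerateKey(nil) // nil selects crypto/rand
	if err != nil {
		panic(err)
	}
	msg := GroupMessage{
		Author:          "placeholder-address", // not a real cwtch address
		MemberGroupID:   0x1D3F92D1,
		MemberMessageID: 1,
		MessageBody:     "hello",
		Sent:            1700000000000,
	}
	if err := Sign(&msg, priv); err != nil {
		panic(err)
	}
	valid = Verify(msg, pub)
	msg.MessageBody = "tampered"
	tamperedValid = Verify(msg, pub)
	return valid, tamperedValid
}

func main() {
	valid, tamperedValid := roundTrip()
	fmt.Println(valid, tamperedValid) // prints: true false
}
```

Because the member id and counter are inside the signed bytes, a message cannot be re-labelled with a different identifier without invalidating the signature.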
Message Identifiers and Resolving Conflicts
With the exception of message (id:0), a message (id:n) should not be included in a conversation tree until message (id:n-1) has been included.
The group manager will reject messages with reused identifiers - this suffices as protection in the managed group setting. However, at this point it seems prudent to define a more robust protocol for resolving such incidents.
We note that the existence of two conflicting messages (messages from the same user with the same message id) is proof enough that the user is not following the protocol, and is likely acting maliciously. Any member can present this information to other members of the group. Further, we will define a new Cwtch Overlay message: the Attestation Message.
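The ordering rule and the conflict check can be illustrated with a small per-member log in Go. This is a sketch under stated assumptions: the type and function names are hypothetical, the first expected counter is passed in by the caller, and messages are compared by body only, whereas a real implementation would compare the full signed content.

```go
package main

import "fmt"

type Message struct {
	Counter uint32
	Body    string
}

type InsertResult int

const (
	Included  InsertResult = iota // appended to the transcript
	Buffered                      // held until earlier messages arrive
	Duplicate                     // identical copy already known
	Conflict                      // same counter, different content
)

// MemberLog tracks the messages of a single group member.
type MemberLog struct {
	next     uint32             // counter of the next message to include
	known    map[uint32]Message // every message seen so far
	included []Message          // gap-free, in-order transcript
}

func NewMemberLog(first uint32) *MemberLog {
	return &MemberLog{next: first, known: make(map[uint32]Message)}
}

// Insert records m and includes it once all of its predecessors are present.
func (l *MemberLog) Insert(m Message) InsertResult {
	if prev, ok := l.known[m.Counter]; ok {
		if prev.Body != m.Body {
			return Conflict // evidence that the author reused an identifier
		}
		return Duplicate
	}
	l.known[m.Counter] = m
	if m.Counter != l.next {
		return Buffered
	}
	// Include m and any buffered messages that directly follow it.
	for {
		queued, ok := l.known[l.next]
		if !ok {
			break
		}
		l.included = append(l.included, queued)
		l.next++
	}
	return Included
}

func (l *MemberLog) Transcript() []Message { return l.included }

func main() {
	log := NewMemberLog(0)
	fmt.Println(log.Insert(Message{1, "second"}) == Buffered) // true
	fmt.Println(log.Insert(Message{0, "first"}) == Included)  // true
	fmt.Println(len(log.Transcript()))                        // 2
	fmt.Println(log.Insert(Message{1, "forged"}) == Conflict) // true
}
```

A `Conflict` result corresponds to the situation described above: the two differing messages together are the evidence a member can show to the rest of the group.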
Attestation
At any point any member can issue an Attestation Message. This message effectively commits the sender to an overall view of the conversation and can be checked by anyone else. If a member receives a conflicting or unverifiable attestation they can present the attestation messages themselves as proof and/or otherwise mark the member/conversation as compromised.
An attestation message defines a root hash formed by constructing a hash for each group member tree, concatenating them in Author handle order and hashing the result.
	type AttestationMessage struct {
		authors          []string // sorted in alphanumeric order
		groupids         []uint32 // sorted by authors
		latestmessageids []uint32 // sorted by authors
		hash             []byte   // see attest() below
	}
	fn attest(conversation_tree) AttestationMessage {
		attestation := []
		for conversation := range conversation_tree {
			chash = conversation.Hash()
			attestation = append(attestation, (conversation.Author, conversation.MemberGroupID, conversation.MemberMessageID, chash))
		}
		attestation = sort(attestation); // by Author handle
		am := AttestationMessage.from(attestation); // concatenates chash from each element and hashes the result to get hash.
		return am;
	}
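A runnable Go rendering of the attestation construction is sketched here. The choice of SHA-256 is an assumption (this document does not fix a hash function), and `MemberTree` with its precomputed `TreeHash` is a hypothetical stand-in for the per-member message tree and its hash.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
)

// MemberTree summarises one member's message tree (hypothetical type).
type MemberTree struct {
	Author          string
	MemberGroupID   uint32
	LatestMessageID uint32
	TreeHash        []byte // fixed-length hash of the tree, computed elsewhere
}

type AttestationMessage struct {
	Authors          []string // sorted in alphanumeric order
	GroupIDs         []uint32 // same order as Authors
	LatestMessageIDs []uint32 // same order as Authors
	Hash             []byte   // root hash over the per-member tree hashes
}

// Attest sorts the member trees by author handle, concatenates their hashes
// in that order, and hashes the result to obtain the root hash.
func Attest(trees []MemberTree) AttestationMessage {
	sorted := append([]MemberTree(nil), trees...) // leave the caller's slice alone
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].Author < sorted[j].Author
	})

	var am AttestationMessage
	h := sha256.New() // assumed hash function
	for _, t := range sorted {
		am.Authors = append(am.Authors, t.Author)
		am.GroupIDs = append(am.GroupIDs, t.MemberGroupID)
		am.LatestMessageIDs = append(am.LatestMessageIDs, t.LatestMessageID)
		h.Write(t.TreeHash)
	}
	am.Hash = h.Sum(nil)
	return am
}

func main() {
	am := Attest([]MemberTree{
		{Author: "bob", MemberGroupID: 2, LatestMessageID: 5, TreeHash: []byte{0x02}},
		{Author: "alice", MemberGroupID: 1, LatestMessageID: 3, TreeHash: []byte{0x01}},
	})
	fmt.Println(am.Authors) // prints [alice bob]
	fmt.Printf("%x\n", am.Hash)
}
```

Two members holding the same set of per-member trees produce the same root hash regardless of the order in which they stored them, which is what makes attestations comparable.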
Message Syncing
When a group member reconnects to the group manager they first perform a Sync action, by sending the last known message ID for each member. The group manager will respond with any missing messages (as per the rest of the Cwtch protocol, these will be streamed one message at a time through the regular channel).
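The manager's side of the Sync action could be sketched as follows in Go. The types `StoredMessage` and `SyncRequest` and the function `MissingMessages` are hypothetical; the sketch assumes the manager holds all messages in memory and that a member absent from the request has not been seen by the requester at all.

```go
package main

import (
	"fmt"
	"sort"
)

// StoredMessage is a message held by the group manager (hypothetical type).
type StoredMessage struct {
	Author    string
	MessageID uint32
	Body      string
}

// SyncRequest maps each author to the last message id the requester has seen.
type SyncRequest map[string]uint32

// MissingMessages returns every stored message newer than what the requester
// reported, ordered by author and then by message id.
func MissingMessages(store []StoredMessage, req SyncRequest) []StoredMessage {
	var out []StoredMessage
	for _, m := range store {
		last, seen := req[m.Author]
		if !seen || m.MessageID > last {
			out = append(out, m)
		}
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Author != out[j].Author {
			return out[i].Author < out[j].Author
		}
		return out[i].MessageID < out[j].MessageID
	})
	return out
}

func main() {
	store := []StoredMessage{
		{"alice", 0, "a0"}, {"alice", 1, "a1"}, {"alice", 2, "a2"},
		{"bob", 0, "b0"},
	}
	// The requester has alice's messages up to id 1 and nothing from bob.
	for _, m := range MissingMessages(store, SyncRequest{"alice": 1}) {
		fmt.Println(m.Author, m.MessageID) // prints "alice 2" then "bob 0"
	}
}
```

In the protocol itself the results would be streamed one message at a time rather than returned as a slice; the slice is used here only to keep the example short.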
In-band Group Metadata
The Group Leader can occasionally send group metadata related messages. These include:
- AddMember - after which point, all members must include an attestation to the new member's state.
- RemoveMember - after which point no further messages from the member should be included in attestations (but the existing attestation should continue to be sent) - this does mean that future members of the group will be able to determine previous members of the group until...
It is worth noting that message identifiers play a role here. Conflicting requests e.g. (remove member then add member) can be resolved by taking max(messageID).
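Resolving conflicting membership requests by max(messageID) can be sketched as below. This assumes all metadata messages come from a single leader whose message ids are directly comparable; the type and function names are illustrative only.

```go
package main

import "fmt"

type MetadataOp int

const (
	AddMember MetadataOp = iota
	RemoveMember
)

// MetadataMessage is a leader-issued membership change (hypothetical type).
type MetadataMessage struct {
	MessageID uint32
	Op        MetadataOp
	Member    string
}

// ResolveMembership keeps, for each member, only the request with the highest
// message id and reports whether that member is currently in the group.
func ResolveMembership(msgs []MetadataMessage) map[string]bool {
	latest := make(map[string]MetadataMessage)
	for _, m := range msgs {
		cur, seen := latest[m.Member]
		if !seen || m.MessageID > cur.MessageID {
			latest[m.Member] = m
		}
	}
	members := make(map[string]bool, len(latest))
	for name, m := range latest {
		members[name] = m.Op == AddMember
	}
	return members
}

func main() {
	// Requests may arrive in any order; only the message ids matter.
	members := ResolveMembership([]MetadataMessage{
		{MessageID: 7, Op: AddMember, Member: "carol"},
		{MessageID: 5, Op: RemoveMember, Member: "carol"},
		{MessageID: 2, Op: AddMember, Member: "dave"},
		{MessageID: 9, Op: RemoveMember, Member: "dave"},
	})
	fmt.Println(members["carol"], members["dave"]) // prints: true false
}
```

Because the outcome depends only on the identifiers, every member who has seen the same set of metadata messages arrives at the same membership, whatever the arrival order.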
Additional Metadata - Permissions
At this point we can begin to imagine additional metadata messages that may be desirable for certain groups:
- (Un)Mute a Member - advise members to ignore future messages from a given member (or undo this action).
- Restrict Message Types - disallow all or some participants the ability to post images or file shares to a group (the messages are still included in the CRDT but are ignored at the application level)
- Change Group Information - it is often desirable for groups to have a name / topic and/or other associated information.
We will note that there is one permission we have explicitly excluded from this list: Change Group Leadership i.e. Promote or Demote Member. To simplify the design of the protocol we assume that the initial leaders of the group, defined in the constitution message, are forever leaders of the group and cannot be removed or augmented - i.e. they are fundamental to the existence of the group itself.
This allows us to ignore a whole category of attacks that exploit asymmetry in leadership update messages, at the cost of forcing a new group to be created when such fundamental facts change.
Key Establishment and Rotation
Interestingly, in the most naive implementation, we could do away with group encryption entirely in the managed group setting. The Manager is already trusted and all messages are protected via cwtch session encryption - which allows the Manager to authenticate members and only distribute messages to authenticated parties.
However, in extensions of this approach we will want to reduce or remove the role of the Manager, and thus it is desirable to introduce some level of group-level encryption of messages. Doing so also allows us to avoid storing unencrypted group messages, which has other long term benefits.
Because we trust the Manager, we allow them to unilaterally rotate the group key and distribute the new key to active group members the next time they sync to the Manager. Members encrypt GroupMessages using this key, as they do with Legacy Group messages, i.e. this key establishment is done out-of-band of the actual group conversation.
There is nothing to prevent an extension of this protocol where the new group key is derived from peer-to-peer interactions i.e. group key agreement. We omit this from our managed group consideration due to the trust we are willing to invest in a per-group manager. However, in other levels of hybrid groups such a protocol will be fully defined.
Security Properties
While the group manager is trusted in this design, there are strict limitations on their power:
- A group manager has all the privileges of a group member
- A group manager cannot forge a message from any other member - the signatures on each message prevent this.
- A group manager cannot censor messages from a member without censoring all future messages from that member (or without other members noticing the missing message)
- Any member's ability to carry out sub-group attacks (cut-and-paste attacks) is strictly limited by the attestations that a message carries. In order to include a valid message from one conversation in another conversation, that conversation will need to include all the same participants - otherwise the attestation will be rejected by honest members. (We additionally note that in the managed group setting any malicious member would require the collusion of the group manager in order to include attempted forgeries).
Managed groups meet all of our criteria, while giving up trust equality (the group manager is explicitly trusted as a member of the group) and participant/message repudiation: by signing all messages, members cannot later claim they did not send a particular message / did not see a particular message being sent.
References
- Unger, Nik et al. “SoK: Secure Messaging”. In: Security and Privacy (SP), 2015 IEEE Symposium on. IEEE. 2015, pp. 232–249