Notebook 01 — Protocol overview


Before we write code, we lay out the threat model, the protocol (hybrid X3DH + symmetric ratchet), and where each building block comes from.

Threat model

In our toy world, Alice and Bob want to exchange messages over a public channel (our shared file queue). We assume an adversary that can:

Eavesdrop on every ciphertext
Tamper with ciphertexts (we use AEAD to detect this)
Later compromise Alice’s long-term private keys (forward secrecy goal)

We do not attempt to provide:

Metadata privacy (sender/recipient are plaintext)
Protection against active server impersonation (no PKI beyond TOFU)
Deniability or future secrecy, also called post-compromise security (no DH ratchet)
Resistance to side-channel or fault attacks on our Python code

Why KEM alone is not enough

You might think: Alice and Bob each have ML-KEM keypairs, so Alice can simply encapsulate to Bob’s public key for every message. That gives confidentiality, but it has two problems:

No forward secrecy: if Bob’s ML-KEM private key is compromised tomorrow, an attacker who recorded the traffic can decrypt every message ever sent.
No efficient continuation: each message would redo the full KEM handshake, so every message carries roughly 1 KB of KEM ciphertext even for a short “ok”.
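The size argument can be checked with a little arithmetic. The sketch below assumes ML-KEM-768 (whose ciphertext is 1088 bytes) and a 16-byte Poly1305 tag, and it ignores nonces and JSON/base64 framing; the function names are hypothetical.

```python
# Rough per-message size comparison (raw bytes, framing ignored).
ML_KEM_768_CIPHERTEXT_BYTES = 1088  # ML-KEM-768 ciphertext size (assumed parameter set)
POLY1305_TAG_BYTES = 16             # AEAD authentication tag


def kem_per_message_size(plaintext_len: int) -> int:
    """Bytes sent if every message carries a fresh KEM ciphertext."""
    return ML_KEM_768_CIPHERTEXT_BYTES + plaintext_len + POLY1305_TAG_BYTES


def ratchet_message_size(plaintext_len: int) -> int:
    """Bytes sent when the key comes from an already established chain."""
    return plaintext_len + POLY1305_TAG_BYTES


payload_len = len(b"ok")
kem_only = kem_per_message_size(payload_len)
ratcheted = ratchet_message_size(payload_len)
print(f"KEM per message: {kem_only} B, ratcheted: {ratcheted} B")
```

Under these assumptions the KEM ciphertext dominates the message size for any short payload, which is the motivation for paying that cost only once per session.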

Signal solved this in the classical elliptic-curve Diffie-Hellman setting with the Double Ratchet. We use its simpler symmetric-only half.

Hybrid X3DH (simplified)

Alice’s first message to Bob runs a one-shot handshake:

Alice generates an ephemeral X25519 keypair \((eph_{sk}, eph_{pk})\).
Alice computes a classical shared secret: \(dh = X25519(eph_{sk}, bob_{pub})\).
Alice encapsulates a post-quantum shared secret to Bob: \((k_{kem}, c_{kem}) = \text{ML-KEM-Encaps}(bob_{ek})\).
Alice derives 96 bytes of session key material (the root key plus two chain keys): \(SK = \text{HKDF-SHAKE256}(salt, dh \| k_{kem}, info)\).
Alice sends \((eph_{pk}, c_{kem}, \text{nonce}, \text{body ciphertext})\).
Bob: \(dh = X25519(bob_{sk}, eph_{pk})\), \(k_{kem} = \text{ML-KEM-Decaps}(bob_{dk}, c_{kem})\), derives same \(SK\).

Both sides now hold 96 bytes of derived key material: a 32 B root key, a 32 B Alice-to-Bob chain key, and a 32 B Bob-to-Alice chain key.
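The derive-and-split step can be sketched with the standard library alone. This is a minimal sketch, not the notebook's implementation: it substitutes HKDF-SHA256 (RFC 5869) for HKDF-SHAKE256, uses random bytes as placeholders for the X25519 and ML-KEM outputs, and assumes an all-zero salt. The names `hkdf_sha256` and `derive_session_keys` and the `info` label are hypothetical.

```python
import hashlib
import hmac
import os


def hkdf_sha256(salt: bytes, ikm: bytes, info: bytes, length: int) -> bytes:
    """HKDF (RFC 5869) with SHA-256; a stand-in for the HKDF-SHAKE256 named in the text."""
    # Extract: condense the input keying material into a pseudorandom key.
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    # Expand: generate output blocks until we have enough bytes.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_session_keys(dh: bytes, k_kem: bytes, salt: bytes, info: bytes):
    """Combine both shared secrets and split 96 bytes into three 32-byte keys."""
    sk = hkdf_sha256(salt, dh + k_kem, info, 96)
    root_key, ck_a2b, ck_b2a = sk[:32], sk[32:64], sk[64:96]
    return root_key, ck_a2b, ck_b2a


# Placeholders: in the real handshake these come from X25519 and ML-KEM.
dh = os.urandom(32)
k_kem = os.urandom(32)
salt = b"\x00" * 32          # assumption: fixed all-zero salt
info = b"pq-messenger-demo"  # hypothetical context label

root_key, ck_a2b, ck_b2a = derive_session_keys(dh, k_kem, salt, info)
print(len(root_key), len(ck_a2b), len(ck_b2a))
```

Because both secrets are concatenated before the KDF, changing either one changes all three derived keys, which is the property the hybrid construction relies on.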

The construction is hybrid: an attacker must break both X25519 and ML-KEM-768 to recover \(SK\). This covers two failure modes. A quantum adversary running Shor’s algorithm breaks X25519 but not ML-KEM, and an as-yet-unknown lattice attack might break ML-KEM but would leave X25519 intact.

See ML-KEM spec and Hybrid KEM in the companion book for the construction of ML-KEM-Encaps/Decaps and the rationale behind hybridization.

sequenceDiagram
    autonumber
    participant A as Alice
    participant B as Bob
    Note over A,B: Bob has long-term (X25519, ML-KEM) public keys published
    A->>A: gen ephemeral X25519 (eph_sk, eph_pk)
    A->>A: dh = X25519(eph_sk, bob_x25519_pub)
    A->>A: (k_kem, c_kem) = ML-KEM.Encaps(bob_ml_kem_pub)
    A->>A: SK = HKDF(salt, dh ‖ k_kem, info)
    A->>A: split SK → root_key, ck_a2b, ck_b2a
    A->>B: { eph_pk, c_kem, ciphertext, nonce }
    B->>B: dh = X25519(bob_x25519_sk, eph_pk)
    B->>B: k_kem = ML-KEM.Decaps(bob_ml_kem_sk, c_kem)
    B->>B: SK = HKDF(salt, dh ‖ k_kem, info)
    B->>B: split SK → same root_key, ck_a2b, ck_b2a
    Note over A,B: Both sides hold matching 96 B of keying material

Symmetric ratchet

Once the root key is established, each side has a chain key per direction. To encrypt message \(i\):

message_key_i   = HKDF(chain_key, info="msg_key")
new_chain_key   = HKDF(chain_key, info="chain_advance")
chain_key      := new_chain_key
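The pseudocode above could be made concrete roughly as follows. This is a sketch under assumptions: HKDF-SHA256 stands in for the notebook's KDF, the salt is left empty, the starting chain key is a placeholder, and `ratchet_step` is a hypothetical helper name. Only the two `info` labels come from the text.

```python
import hashlib
import hmac


def hkdf_sha256(salt: bytes, ikm: bytes, info: bytes, length: int) -> bytes:
    """HKDF (RFC 5869) with SHA-256, used here as an assumed stand-in KDF."""
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def ratchet_step(chain_key: bytes) -> tuple[bytes, bytes]:
    """Derive one message key and the next chain key from the current chain key."""
    message_key = hkdf_sha256(b"", chain_key, b"msg_key", 32)
    next_chain_key = hkdf_sha256(b"", chain_key, b"chain_advance", 32)
    return message_key, next_chain_key


# Placeholder chain key; in the protocol it comes from the handshake split.
chain_key = bytes(32)
message_keys = []
for _ in range(3):
    message_key, chain_key = ratchet_step(chain_key)  # old chain key is overwritten
    message_keys.append(message_key)

print(len(set(message_keys)))  # distinct key per message
```

Overwriting `chain_key` in the loop mirrors the `:=` assignment: once the old value is discarded, earlier message keys can no longer be recomputed from local state.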

Each message_key_i is used exactly once with ChaCha20-Poly1305 AEAD. Because HKDF is one-way:

Forward secrecy: learning chain_key at message \(i+1\) does not reveal message keys 0..i.
Backward non-secrecy: learning chain_key at message \(i+1\) reveals the message keys for message \(i+1\) and every later message.
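The second property is easy to simulate. In the sketch below an attacker who learns the chain key at index 2 recomputes exactly the keys an honest party derives for messages 2, 3, and 4, while the keys for messages 0 and 1 never appear in the attacker's output. It assumes the same HKDF-SHA256 stand-in as a KDF, restricted to a single 32-byte output block; `kdf32` and `derive_message_keys` are hypothetical names.

```python
import hashlib
import hmac


def kdf32(chain_key: bytes, info: bytes) -> bytes:
    """HKDF-SHA256 with a zero salt, limited to one 32-byte output block (assumed KDF)."""
    prk = hmac.new(b"\x00" * 32, chain_key, hashlib.sha256).digest()
    return hmac.new(prk, info + b"\x01", hashlib.sha256).digest()


def derive_message_keys(chain_key: bytes, count: int):
    """Run the ratchet `count` times; return the message keys and the final chain key."""
    keys = []
    for _ in range(count):
        keys.append(kdf32(chain_key, b"msg_key"))
        chain_key = kdf32(chain_key, b"chain_advance")
    return keys, chain_key


ck0 = bytes(32)  # placeholder starting chain key
honest_keys, _ = derive_message_keys(ck0, 5)

# Suppose the chain key in use at message index 2 leaks.
_, leaked_ck2 = derive_message_keys(ck0, 2)
attacker_keys, _ = derive_message_keys(leaked_ck2, 3)

print(attacker_keys == honest_keys[2:])               # later keys are exposed
print(any(k in attacker_keys for k in honest_keys[:2]))  # earlier keys are not derived
```

The simulation only shows that earlier keys are not produced by running the ratchet forward; that they cannot be recovered at all rests on the one-wayness of the KDF.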

The Double Ratchet’s DH step fixes backward non-secrecy by periodically re-running a fresh DH exchange. We leave that out here — notebook 03 shows exactly where this limitation bites.

flowchart LR
    CK0[chain_key i] -->|HKDF info=msg_key| MK[message_key i]
    CK0 -->|HKDF info=chain_advance| CK1[chain_key i+1]
    MK --> AEAD[ChaCha20-Poly1305]
    PT[plaintext i] --> AEAD
    AEAD --> CT[ciphertext i]
    CK1 -.->|next message| CK1B[chain_key i+1]

Wire format

Every message is a JSON object with base64-encoded byte fields:

{
  "version": 1,
  "sender": "alice",
  "recipient": "bob",
  "msg_index": 0,
  "kem_ciphertext": "base64...",
  "ephemeral_pk": "base64...",
  "nonce": "base64...",
  "ciphertext": "base64...",
  "sent_at": "2026-04-22T15:30:00Z"
}

Only the first message carries kem_ciphertext and ephemeral_pk — subsequent messages rely on the established chain. Our transport writes each message as one file in ~/.pq-messenger/inbox/<recipient>/<uuid>.json.
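A message in this format could be assembled as sketched below. The field names and the inbox path layout come from the text; the helper names (`b64`, `build_message`, `inbox_path`) are hypothetical, the byte values are placeholders, and the sketch assumes that later messages omit the two handshake fields entirely rather than setting them to null.

```python
import base64
import json
import uuid
from datetime import datetime, timezone
from pathlib import Path


def b64(data: bytes) -> str:
    """Encode raw bytes as an ASCII base64 string for JSON transport."""
    return base64.b64encode(data).decode("ascii")


def build_message(sender, recipient, msg_index, nonce, ciphertext,
                  kem_ciphertext=None, ephemeral_pk=None) -> dict:
    """Build the wire dict; handshake fields are included only when provided."""
    msg = {"version": 1, "sender": sender, "recipient": recipient,
           "msg_index": msg_index}
    if kem_ciphertext is not None and ephemeral_pk is not None:
        msg["kem_ciphertext"] = b64(kem_ciphertext)
        msg["ephemeral_pk"] = b64(ephemeral_pk)
    msg["nonce"] = b64(nonce)
    msg["ciphertext"] = b64(ciphertext)
    msg["sent_at"] = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return msg


def inbox_path(recipient: str) -> Path:
    """Path of a new message file in the recipient's inbox (nothing is written here)."""
    return Path.home() / ".pq-messenger" / "inbox" / recipient / f"{uuid.uuid4()}.json"


# Placeholder bytes stand in for real handshake and AEAD outputs.
first = build_message("alice", "bob", 0, nonce=bytes(12), ciphertext=b"\x01" * 20,
                      kem_ciphertext=b"\x02" * 1088, ephemeral_pk=b"\x03" * 32)
later = build_message("alice", "bob", 1, nonce=bytes(12), ciphertext=b"\x04" * 20)

print(json.dumps(later, indent=2))
print(inbox_path("bob"))
```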

Reading order

02: Walk through hybrid X3DH step by step with real keys.
03: Demonstrate forward secrecy and the ratchet’s limitation under key compromise.
04: Run a live two-process session over the file queue.