Spec: Encrypted Append-Only Feed

Prototype 4: Local-First Encrypted Social Feed
Status: Draft v0.1, 2026-04-23


Overview

This page is the technical specification for Prototype 4 of the Connect Research project. The prototype goal is described in Research Prototypes — Prototype 4: Encrypted Append-Only Feed (Local First), and the corresponding plain-language scenario is Story 4: The Community Garden Feed. Prototype 4 directly exercises the Feed Format Spec: every field defined there — envelope structure, canonical serialization, audience-keyed encryption, hash chain, content types, and the SQLite storage schema — is implemented here in working code for the first time. No other prototype touches the feed data model; this is where that specification becomes executable.

The prototype is deliberately transport-agnostic. It does not include BLE discovery (Prototype 2) or NFC key exchange (Prototype 3). Two instances synchronize through whichever channel is simplest to run: JSON file export and import, adb push over USB, or a local TCP socket. Transport independence means the data model can be validated on a laptop before a single phone is involved.


API Surface / Key Interfaces

The interface definitions below use Rust. The Python equivalents are noted where they diverge. All type names map directly to concepts in the feed format spec.

Feed Author API

use ed25519_dalek::SigningKey;

/// A local user's feed. Owns the signing key and the local message store.
pub struct Feed {
    identity_keypair: Ed25519KeyPair,
    dh_keypair:       X25519KeyPair,
    feed_id:          FeedId,          // base64url(SHA-256(identity_pub))
    store:            MessageStore,
}

impl Feed {
    /// Create a new feed from an existing Ed25519 identity keypair,
    /// generating a fresh X25519 DH keypair alongside it.
    /// Writes the genesis message (profile_update) to the store.
    pub fn new(identity_keypair: Ed25519KeyPair) -> Feed;

    /// Append a new message to the feed.
    ///
    /// - Derives the next sequence number (store.max_seq() + 1).
    /// - Computes `previous` = SHA-256(canonical(prev_envelope)).
    /// - Encrypts `content` for `audience`.
    /// - Signs the envelope.
    /// - Writes to the store.
    /// - Returns the signed envelope.
    pub fn append(
        &mut self,
        content:  PostContent,
        audience: Audience,
    ) -> Result<MessageEnvelope, FeedError>;

    /// Returns the sequence number of the most recently appended message.
    pub fn get_sequence(&self) -> u64;

    /// Re-verifies the entire chain from genesis.
    ///
    /// For each message N, checks:
    ///   1. SHA-256(canonical(msg[N-1])) == msg[N].previous (for N > 0; the genesis message has previous = null)
    ///   2. Ed25519 signature over canonical(envelope_without_sig) is valid
    ///   3. Sequence numbers are contiguous (no gaps)
    pub fn verify_chain(&self) -> Result<(), ChainError>;
}

/// Which audience this message is addressed to.
pub enum Audience {
    /// journal_key — never replicated.
    SelfOnly,

    /// Double Ratchet encryption to a single peer.
    Direct(FeedId),

    /// MLS group encryption.
    Group(MlsGroupId),

    /// contacts_key per-contact encryption.
    /// The author encrypts the plaintext once per known contact.
    Contacts,
}

/// Plaintext content variant. Maps to the content types in the feed format spec.
pub enum PostContent {
    Post           { body: String, reply_to: Option<MessageId>, attachments: Vec<BlobHash> },
    Reaction       { target_message: MessageId, emoji: String },
    ProfileUpdate  { display_name: Option<String>, avatar_hash: Option<BlobHash> },
    Tombstone      { target_message: MessageId, reason: Option<String> },
    KeyRotation    { new_identity_key: Vec<u8>, new_dh_key: Vec<u8>, new_pq_key: Vec<u8>,
                     previous_key_sig: Option<Vec<u8>>, reason: Option<String> },
}
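The chain rules in append() and verify_chain() above can be sketched in a few lines of stdlib Python. This is a minimal illustration, not the reference implementation: it omits signing and encryption, and the canonical() helper (sorted keys, no whitespace, UTF-8) is an assumed stand-in for the canonical serialization defined in the feed format spec. The envelope field names are also illustrative.

```python
import hashlib
import json
from typing import Optional


def canonical(envelope: dict) -> bytes:
    # Assumed stand-in for the spec's canonical form: sorted keys, no whitespace.
    return json.dumps(envelope, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")


def envelope_hash(envelope: dict) -> str:
    return hashlib.sha256(canonical(envelope)).hexdigest()


def append(chain: list, content: str) -> dict:
    """Append an unsigned, unencrypted envelope linked to its predecessor."""
    prev = chain[-1] if chain else None
    envelope = {
        "sequence": prev["sequence"] + 1 if prev else 0,   # genesis is seq 0
        "previous": envelope_hash(prev) if prev else None,  # genesis has no link
        "content": content,
    }
    chain.append(envelope)
    return envelope


def verify_chain(chain: list) -> Optional[int]:
    """Return the sequence number of the first broken link, or None if intact."""
    for n, env in enumerate(chain):
        if env["sequence"] != n:                 # sequence numbers must be contiguous
            return n
        expected = envelope_hash(chain[n - 1]) if n > 0 else None
        if env["previous"] != expected:          # hash link to the previous envelope
            return n
    return None
```

Because this sketch has no signatures, changing msg[N] is detected at N+1, where the stored `previous` no longer matches. With the Ed25519 check added, the same change would already fail at N.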

Encryption Layer

/// Wrapper around the per-audience encryption logic.
pub struct EncryptionLayer;

impl EncryptionLayer {
    /// Encrypt `plaintext` using the shared contacts_key for a single contact.
    ///
    /// Nonce: first 12 bytes of SHA-256(sequence_le_bytes || feed_id_bytes).
    /// AEAD:  ChaCha20-Poly1305.
    pub fn encrypt_for_contact(
        plaintext:   &[u8],
        contact_key: &ContactsKey,
        sequence:    u64,
        feed_id:     &FeedId,
    ) -> Vec<u8>;

    /// Decrypt ciphertext given a decryption context that carries the correct key.
    pub fn decrypt(
        ciphertext: &[u8],
        context:    DecryptContext,
    ) -> Result<Vec<u8>, DecryptError>;

    /// Encrypt with journal_key for audience=self.
    pub fn encrypt_self(
        plaintext:   &[u8],
        journal_key: &[u8; 32],
        sequence:    u64,
        feed_id:     &FeedId,
    ) -> Vec<u8>;
}

/// Decryption context — carries the appropriate key material for the audience.
pub enum DecryptContext {
    SelfOnly    { journal_key: [u8; 32] },
    Direct      { dr_session: Box<dyn DoubleRatchetSession> },
    Group       { mls_epoch_key: Vec<u8> },
    Contacts    { contacts_key: ContactsKey },
}

/// A symmetric key shared between two contacts.
/// Derived at contact-add time; stored in the platform keychain.
pub struct ContactsKey([u8; 32]);

impl ContactsKey {
    /// HKDF-SHA-256(X25519(my_dh, their_dh_pub),
    ///              salt="ProximityApp_ContactsKey_v1",
    ///              info=sorted(my_feed_id, their_feed_id))
    ///
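    /// The result is symmetric, A.derive(B) == B.derive(A), because X25519
    /// yields the same shared secret on both sides and the feed IDs in
    /// `info` are sorted before use.
    pub fn derive(
        my_dh_key:     &X25519SecretKey,
        their_dh_pub:  &X25519PublicKey,
        my_feed_id:    &FeedId,
        their_feed_id: &FeedId,
    ) -> ContactsKey;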
    ) -> ContactsKey;
}
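The two derivations described in the comments above (the contacts_key HKDF and the per-message nonce) can be illustrated with the Python standard library. This is a sketch under stated assumptions: the X25519 shared secret is passed in as opaque bytes because the stdlib has no X25519, `sorted(...)` is assumed to mean the two feed IDs concatenated in byte order, and the sequence number is assumed to be encoded as 8 little-endian bytes. None of these encodings is confirmed by this page.

```python
import hashlib
import hmac
import struct

SALT = b"ProximityApp_ContactsKey_v1"  # salt string quoted from the spec comment


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF extract-then-expand (RFC 5869) with SHA-256."""
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()      # extract
    okm, block, counter = b"", b"", 1
    while len(okm) < length:                                 # expand
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_contacts_key(shared_secret: bytes, my_feed_id: bytes,
                        their_feed_id: bytes) -> bytes:
    # Assumption: info is the two feed IDs concatenated in sorted order,
    # so both sides feed identical input to HKDF.
    info = b"".join(sorted([my_feed_id, their_feed_id]))
    return hkdf_sha256(shared_secret, SALT, info)


def message_nonce(sequence: int, feed_id: bytes) -> bytes:
    # Assumption: sequence is a u64 encoded as 8 little-endian bytes.
    digest = hashlib.sha256(struct.pack("<Q", sequence) + feed_id).digest()
    return digest[:12]  # ChaCha20-Poly1305 takes a 96-bit nonce
```

Sorting the feed IDs is what makes the derivation order-independent: swapping the two ID arguments leaves the HKDF input unchanged. The nonce depends only on the author's feed ID and sequence number, so it repeats across contacts for one post but never under the same key.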

Sync Layer

pub struct FeedSync;

impl FeedSync {
    /// Compute the delta between our feed and a remote peer that reports
    /// having messages up to `their_seq`.
    ///
    /// Returns messages [their_seq+1 .. our_max_seq] in ascending order.
    /// Excludes messages with audience=self (never replicated).
    pub fn diff(
        my_feed:   &Feed,
        their_seq: u64,
    ) -> Vec<MessageEnvelope>;

    /// Validate and persist a batch of incoming envelopes from a peer.
    ///
    /// For each envelope:
    ///   1. Verify feed_id matches derived SHA-256(signing_key.pub).
    ///   2. Verify Ed25519 signature.
    ///   3. Verify hash chain linkage with the locally-known previous message.
    ///   4. Detect sequence gaps — buffer out-of-order messages; reject if gap
    ///      is not filled within a configurable timeout.
    ///   5. Write to MessageStore.
    ///   6. Attempt decryption; store content_json if successful.
    pub fn merge(
        envelopes: Vec<MessageEnvelope>,
        store:     &mut MessageStore,
        contacts:  &ContactBook,
    ) -> Result<MergeReport, SyncError>;
}

pub struct MergeReport {
    pub accepted:  usize,
    pub rejected:  usize,
    pub buffered:  usize,   // awaiting gap fill
    pub errors:    Vec<SyncError>,
}
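As a rough illustration of the diff() contract above, the delta is a filter over the local feed: everything after the peer's reported sequence number, minus audience=self messages. The Envelope class and its field names are hypothetical simplifications of the real envelope.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Envelope:
    # Hypothetical, reduced envelope: only the fields diff() needs.
    sequence: int
    audience: str
    payload: str


def diff(my_feed: List[Envelope], their_seq: int) -> List[Envelope]:
    """Messages with sequence > their_seq, ascending, never audience=self."""
    delta = [e for e in my_feed
             if e.sequence > their_seq and e.audience != "self"]
    return sorted(delta, key=lambda e: e.sequence)
```

In this sketch a peer that holds nothing passes -1, which mirrors the last_seen_seq default in the contacts table. How the Rust signature with `their_seq: u64` represents that case is not stated here.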

Storage

/// Wraps the SQLite connection. Schema matches the feed format spec section 9.1.
pub struct MessageStore {
    conn: rusqlite::Connection,
}

impl MessageStore {
    pub fn open(path: &Path) -> Result<MessageStore, StorageError>;

    /// Append a signed envelope. Returns Err if the sequence already exists
    /// (duplicate) or if the chain linkage is broken.
    pub fn append(&mut self, env: MessageEnvelope) -> Result<(), StorageError>;

    /// Retrieve a contiguous range [from_seq, to_seq] for the given feed.
    pub fn get_range(
        &self,
        feed_id:  &FeedId,
        from_seq: u64,
        to_seq:   u64,
    ) -> Vec<MessageEnvelope>;

    /// Return the highest sequence number held for feed_id, or None.
    pub fn max_seq(&self, feed_id: &FeedId) -> Option<u64>;
}

SQLite schema (identical to feed format spec section 9.1, reproduced for completeness):

CREATE TABLE messages (
  message_id    TEXT PRIMARY KEY,   -- base64url(SHA-256(envelope))
  feed_id       TEXT NOT NULL,
  sequence      INTEGER NOT NULL,
  timestamp     INTEGER NOT NULL,
  type          TEXT NOT NULL,
  audience      TEXT NOT NULL,
  envelope_json TEXT NOT NULL,      -- full signed envelope JSON
  content_json  TEXT,               -- decrypted content, NULL if not decryptable
  tombstoned    INTEGER DEFAULT 0,
  received_at   INTEGER NOT NULL,
  UNIQUE(feed_id, sequence)
);
CREATE INDEX idx_messages_feed_seq ON messages(feed_id, sequence);

CREATE TABLE blobs (
  content_hash TEXT PRIMARY KEY,
  data         BLOB NOT NULL,
  mime_type    TEXT,
  size_bytes   INTEGER NOT NULL,
  stored_at    INTEGER NOT NULL
);

CREATE TABLE contacts (
  feed_id          TEXT PRIMARY KEY,
  display_name     TEXT,
  avatar_hash      TEXT,
  identity_key_pub TEXT NOT NULL,
  dh_key_pub       TEXT NOT NULL,
  pq_key_pub       TEXT NOT NULL,
  contacts_key     TEXT NOT NULL,   -- opaque handle; key bytes live in the platform keychain
  added_at         INTEGER NOT NULL,
  last_seen_seq    INTEGER DEFAULT -1
);

CREATE TABLE dr_sessions (
  peer_feed_id  TEXT PRIMARY KEY REFERENCES contacts(feed_id),
  session_state BLOB NOT NULL,
  last_updated  INTEGER NOT NULL
);

CREATE TABLE mls_groups (
  group_id       TEXT PRIMARY KEY,
  group_name     TEXT,
  epoch          INTEGER NOT NULL DEFAULT 0,
  mls_state_blob BLOB NOT NULL,
  created_at     INTEGER NOT NULL
);

Private keys and the contacts_key are never stored in SQLite; they live in the platform keystore (Android Keystore / macOS Keychain). The contacts_key column holds a reference handle, not raw key bytes, and the *_key_pub columns hold public keys only.
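To show how the messages table supports MessageStore::append and MessageStore::max_seq, here is a small sketch using Python's stdlib sqlite3 module. The helper names and the placeholder values for timestamp, type, audience, and received_at are illustrative only. Duplicate detection relies on the UNIQUE(feed_id, sequence) constraint; chain-linkage checks are left out.

```python
import sqlite3

MESSAGES_DDL = """
CREATE TABLE messages (
  message_id    TEXT PRIMARY KEY,
  feed_id       TEXT NOT NULL,
  sequence      INTEGER NOT NULL,
  timestamp     INTEGER NOT NULL,
  type          TEXT NOT NULL,
  audience      TEXT NOT NULL,
  envelope_json TEXT NOT NULL,
  content_json  TEXT,
  tombstoned    INTEGER DEFAULT 0,
  received_at   INTEGER NOT NULL,
  UNIQUE(feed_id, sequence)
);
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.executescript(MESSAGES_DDL)
    return conn


def append(conn, message_id, feed_id, sequence, envelope_json) -> bool:
    """Insert one envelope; False if the id or (feed_id, sequence) already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO messages (message_id, feed_id, sequence, timestamp,"
                " type, audience, envelope_json, received_at)"
                " VALUES (?, ?, ?, 0, 'post', 'contacts', ?, 0)",
                (message_id, feed_id, sequence, envelope_json),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def max_seq(conn, feed_id):
    """Highest sequence held for feed_id, or None if the feed is unknown."""
    row = conn.execute(
        "SELECT MAX(sequence) FROM messages WHERE feed_id = ?", (feed_id,)
    ).fetchone()
    return row[0]
```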


Dependencies & Libraries

p2panda Path (Rust)

| Library | Version | Purpose | License |
|---|---|---|---|
| p2panda-core | 0.3.x (crates.io) | Append-only log primitives, operation encoding | MIT / Apache-2.0 |
| p2panda-store | 0.3.x | SQLite-backed operation storage | MIT / Apache-2.0 |
| p2panda-encryption | 0.1.x (early/unstable) | MLS group state management built on openmls | MIT / Apache-2.0 |
| ed25519-dalek | ^2.0 | Ed25519 signing and verification | MIT / Apache-2.0 |
| x25519-dalek | ^2.0 | X25519 Diffie-Hellman | MIT / Apache-2.0 |
| sha2 | ^0.10 | SHA-256 (hash chain, feed_id derivation) | MIT / Apache-2.0 |
| hkdf | ^0.12 | HKDF-SHA-256 key derivation | MIT / Apache-2.0 |
| chacha20poly1305 | ^0.10 | ChaCha20-Poly1305 AEAD | MIT / Apache-2.0 |
| serde_json | ^1 | Canonical JSON serialization for signing | MIT / Apache-2.0 |
| serde | ^1 | Derive macros for envelope/content structs | MIT / Apache-2.0 |
| rusqlite | ^0.31 | SQLite storage (bundled feature recommended) | MIT |
| base64 | ^0.22 | base64url encoding/decoding | MIT / Apache-2.0 |
| rand | ^0.8 | CSPRNG for key generation | MIT / Apache-2.0 |

Note on p2panda-encryption: As of April 2026 this crate is early-stage and API-unstable. Budget time for breakage. The alternative is to integrate openmls directly and skip p2panda's wrapper.

SSB-Inspired Minimal Path (Python)

| Library | Version | Purpose | License |
|---|---|---|---|
| cryptography | ^42 | Ed25519, X25519, ChaCha20-Poly1305, HKDF, SHA-256 | Apache-2.0 / BSD |
| sqlite3 | stdlib | SQLite storage (no additional install required) | PSF |
| base64 | stdlib | base64url encoding | PSF |
| json | stdlib | Canonical JSON serialization | PSF |
| hashlib | stdlib | SHA-256 | PSF |

SSB-Inspired Minimal Path (Kotlin / Android)

| Library | Version | Purpose | License |
|---|---|---|---|
| tink-android | ^1.13 | Ed25519, ChaCha20-Poly1305, HKDF (Google-audited) | Apache-2.0 |
| BouncyCastle | 1.77+ | X25519, fallback crypto if Tink insufficient | MIT |
| Room | ^2.6 | SQLite ORM for Android | Apache-2.0 |
| kotlinx.serialization | ^1.6 | JSON serialization for envelopes | Apache-2.0 |

MLS for Group Encryption

| Library | Version | Purpose | License |
|---|---|---|---|
| openmls | ^0.5 | MLS RFC 9420 group state machine (Rust) | MIT / Apache-2.0 |
| openmls_rust_crypto | ^0.2 | Cryptographic backend for openmls | MIT / Apache-2.0 |
| openmls_dart | 0.1.x (Feb 2026) | Flutter wrapper around openmls, if Flutter path chosen | MIT |

CRDT Alternatives (Evaluate Later)

These libraries are not in scope for the two-week prototype but are worth evaluating in a follow-on iteration if conflict resolution or multi-writer feeds become a requirement.

| Library | Version | Notes |
|---|---|---|
| loro | ^1.0 (Rust + WASM + Swift) | JSON CRDT; Swift bindings make it mobile-ready |
| cr-sqlite | latest | SQLite extension for CRDT replication; pragmatic drop-in |
| automerge | ^2.0 | Mature Rust CRDT; Keyhive adds E2E encryption layer |

Platform Constraints

Transport-independent by design. This prototype does not implement BLE or WiFi Direct transport — that is Prototype 2's domain. Synchronization between two instances is done manually: file export/import, adb push over USB, scp over LAN, or a simple TCP socket. This is intentional — decoupling the data model from the transport means the feed logic can be fully validated on a desktop before any mobile platform work begins.

Start on desktop. A Rust CLI or Python script runs on Linux/macOS/Windows with zero mobile platform constraints. This allows the fastest iteration cycle: edit → cargo test → verify, with no emulator or phone required.

Android port. If the Rust path is chosen, the crate can be cross-compiled for Android with cargo ndk (Android NDK cross-compilation), and a JNI boundary can expose the Rust Feed struct as a Kotlin-callable API. The Kotlin path avoids the JNI layer entirely, at the cost of reimplementing the Rust crypto and feed logic on the JVM.

Feed size growth. Append-only feeds grow indefinitely. At an average of 500 bytes per envelope (JSON + base64url overhead), 10,000 messages ≈ 5 MB per feed. 100 contacts × 10,000 messages each ≈ 500 MB. The storage limits defined in the feed format spec (10,000 messages per contact feed, 500 MB total blob store) are enforced at the MessageStore layer.

Per-audience encryption cost. audience = contacts requires encrypting the plaintext separately for each known contact. This is O(N) ChaCha20-Poly1305 encrypt operations per post, where N is the contact count. At 500 contacts and ~1 µs per encrypt, one post costs ~0.5 ms for the fan-out. Acceptable for social-feed cadence (posts per hour), not acceptable for high-frequency messaging.
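The storage and fan-out estimates in the two paragraphs above are simple multiplications. The sketch below reproduces them in integer arithmetic so the figures can be rechecked when the assumptions change. The constants are the ones quoted in the text; the function names are illustrative.

```python
AVG_ENVELOPE_BYTES = 500        # average envelope size assumed in the text
MAX_MESSAGES_PER_FEED = 10_000  # per-contact feed limit from the feed format spec
ENCRYPT_COST_NS = 1_000         # ~1 µs per ChaCha20-Poly1305 encrypt (estimate)


def storage_bytes(contacts: int,
                  messages_per_feed: int = MAX_MESSAGES_PER_FEED) -> int:
    """Envelope storage for `contacts` feeds, each at the message cap."""
    return contacts * messages_per_feed * AVG_ENVELOPE_BYTES


def fanout_cost_ns(contacts: int, posts: int = 1) -> int:
    """Encryption time for audience=contacts: one encrypt per contact per post."""
    return contacts * posts * ENCRYPT_COST_NS
```

With these inputs, storage_bytes(1) is 5,000,000 bytes (5 MB), storage_bytes(100) is 500,000,000 bytes (500 MB), and fanout_cost_ns(500) is 500,000 ns (0.5 ms).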

Chain integrity and sequence gaps. The hash chain requires contiguous sequence numbers. If messages arrive out of order (which BLE transport makes likely), they must be buffered rather than rejected. The FeedSync::merge implementation maintains a per-feed gap buffer. Messages older than a configurable TTL that still cannot be linked are discarded.
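One possible shape for the per-feed gap buffer described above is sketched here. The GapBuffer class, its method names, and the explicit `now` argument are hypothetical. The sketch only orders messages by sequence number and leaves hash-link and signature checks to the caller.

```python
from typing import Any, Dict, List, Tuple


class GapBuffer:
    """Holds out-of-order envelopes for one feed until the gap is filled."""

    def __init__(self, next_seq: int = 0, ttl_seconds: float = 60.0):
        self.next_seq = next_seq          # lowest sequence not yet committed
        self.ttl = ttl_seconds            # configurable time-to-live
        self.pending: Dict[int, Tuple[float, Any]] = {}

    def offer(self, sequence: int, envelope: Any, now: float) -> List[Any]:
        """Accept one envelope; return those that can now be committed, in order."""
        if sequence < self.next_seq or sequence in self.pending:
            return []                     # already committed or already buffered
        self.pending[sequence] = (now, envelope)
        ready = []
        while self.next_seq in self.pending:   # drain the contiguous run
            ready.append(self.pending.pop(self.next_seq)[1])
            self.next_seq += 1
        return ready

    def expire(self, now: float) -> List[int]:
        """Discard entries buffered longer than the TTL; return their sequences."""
        stale = sorted(s for s, (t, _) in self.pending.items()
                       if now - t > self.ttl)
        for s in stale:
            del self.pending[s]
        return stale
```

Passing `now` explicitly keeps the buffer deterministic under test; a real implementation would more likely read a monotonic clock.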

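SQLite concurrency. SQLite allows only one writer at a time. This prototype has two write entry points, Feed::append and FeedSync::merge, and both run on a single write path, so they never contend. With the database in WAL mode (PRAGMA journal_mode=WAL), read-only queries such as rendering the feed UI run concurrently with that writer.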
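The reader/writer behaviour described above can be observed with the stdlib sqlite3 module. The sketch below opens a file-backed database in WAL mode (in-memory databases do not support WAL) and shows a second connection reading while the first holds an uncommitted write. The one-column table is a throwaway stand-in for the real schema.

```python
import os
import sqlite3
import tempfile


def wal_reader_demo():
    """Return (rows seen during the open write, rows seen after commit)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "feed.db")

        writer = sqlite3.connect(path)
        mode = writer.execute("PRAGMA journal_mode=WAL").fetchall()[0][0]
        if mode != "wal":
            raise RuntimeError(f"WAL not available here (got {mode!r})")
        writer.execute("CREATE TABLE messages (sequence INTEGER PRIMARY KEY)")
        writer.commit()

        reader = sqlite3.connect(path)

        # The INSERT starts a write transaction that stays open until commit().
        writer.execute("INSERT INTO messages VALUES (0)")
        during = reader.execute(
            "SELECT COUNT(*) FROM messages").fetchall()[0][0]

        writer.commit()
        after = reader.execute(
            "SELECT COUNT(*) FROM messages").fetchall()[0][0]

        reader.close()
        writer.close()
        return during, after
```

The reader is not blocked by the open write transaction; it sees the last committed snapshot and picks up the new row once the writer commits.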

Key material storage. Private keys must never be stored in SQLite. On desktop, use the OS keychain (macOS Keychain via security APIs, Linux via libsecret). On Android, use Android Keystore. The contacts_key (shared symmetric key) is stored encrypted in the platform keychain; the SQLite contacts table holds only an opaque handle.


Build & Test Instructions

Rust / p2panda Path

# 1. Install Rust toolchain (skip if already installed)
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
rustup update stable

# 2. Create the library project (demo binaries go under src/bin/)
cargo new prototype-feed --lib
cd prototype-feed

# 3. Add dependencies — paste into Cargo.toml [dependencies]:
#    ed25519-dalek = { version = "2", features = ["rand_core"] }
#    x25519-dalek  = { version = "2", features = ["static_secrets"] }
#    sha2          = "0.10"
#    hkdf          = "0.12"
#    chacha20poly1305 = "0.10"
#    serde         = { version = "1", features = ["derive"] }
#    serde_json    = "1"
#    rusqlite      = { version = "0.31", features = ["bundled"] }
#    base64        = "0.22"
#    rand          = "0.8"
#    p2panda-core  = "0.3"   # optional — omit if building minimal custom
#    p2panda-store = "0.3"   # optional

cargo build

# 4. Run unit tests
cargo test

# 5. Run the two-feeds sync demo
# (src/bin/two_feeds_sync.rs — create Alice and Bob feeds, export delta,
#  import on the other side, verify identical state)
cargo run --bin two_feeds_sync

# 6. Android cross-compilation (after core logic is stable)
cargo install cargo-ndk
rustup target add aarch64-linux-android
cargo ndk -t arm64-v8a build --release
# Output: target/aarch64-linux-android/release/libprototype_feed.so
# (the .so is only produced if Cargo.toml sets crate-type = ["cdylib"] under [lib])

Python Path

# 1. Create and activate virtual environment
python -m venv .venv && source .venv/bin/activate

# 2. Install dependencies
pip install cryptography pytest

# 3. Run unit tests
python -m pytest tests/ -v

# 4. Run the CLI demo
#    Creates Alice and Bob feeds in /tmp/feeds/,
#    Alice posts 3 messages, exports delta to /tmp/sync_alice.json,
#    Bob imports and verifies chain integrity.
python demo.py

# 5. Verify chain integrity after sync
python -c "from feed import Feed; f = Feed.load('/tmp/feeds/bob'); print(f.verify_chain())"

Test Matrix

| Layer | Test | Assertion |
|---|---|---|
| Unit | append() genesis message | sequence=0, previous=null, signature valid |
| Unit | append() chain link | SHA-256(canonical(msg[N-1])) == msg[N].previous |
| Unit | verify_chain() tamper detection | Modify any byte of msg[N]; verify_chain() fails (signature check at N, hash link at N+1) |
| Unit | ContactsKey::derive() symmetry | A.derive(B_pub) == B.derive(A_pub) (X25519 is commutative and the HKDF info is sorted) |
| Unit | Encrypt/decrypt roundtrip (Contacts) | decrypt(encrypt_for_contact(pt, k), k) == pt |
| Unit | Encrypt/decrypt roundtrip (SelfOnly) | decrypt(encrypt_self(pt, jk), SelfOnly context with jk) == pt |
| Unit | Encrypt/decrypt roundtrip (Direct) | Double Ratchet roundtrip (simplified stub acceptable) |
| Unit | Nonce uniqueness | Two messages to the same contact produce different nonces |
| Unit | Tombstone | tombstoned=1 in DB; content_json cleared; chain still valid |
| Integration | Two-feed file sync | Alice posts 5, exports delta, Bob imports; Bob.verify_chain() passes; Bob renders all 5 |
| Integration | Partial sync | Bob has seq 0-2, Alice sends 3-5 only; merge succeeds |
| Integration | Sequence gap handling | Deliver msg[5] before msg[4]; msg[5] buffered; on msg[4] arrival both commit |
| Integration | Persistence | Feed survives process restart; verify_chain() still passes |
| Performance | 1000 append + encrypt (Contacts, 10 contacts) | Completes in < 10 seconds on 2020-era laptop |
| Performance | 100 append + encrypt (Contacts, 10 contacts) | Completes in < 1 second |
| Performance | verify_chain() on 1000-message feed | Completes in < 2 seconds |

Success Criteria

  1. Append + sign: Calling feed.append() produces an envelope with a valid Ed25519 signature verifiable by the feed's public identity key.
  2. Hash chain: Calling feed.verify_chain() returns Ok(()) on an untampered feed. Mutating any byte of any stored envelope causes verify_chain() to return a ChainError pointing to the first broken link.
  3. Per-audience encryption independence: The same plaintext encrypted to contact A and contact B produces two independent ciphertexts. A can only decrypt their own ciphertext; B can only decrypt theirs. Cross-decryption returns a DecryptError.
  4. ContactsKey symmetry: ContactsKey::derive(alice_dh, bob_pub, alice_id, bob_id) is byte-identical to ContactsKey::derive(bob_dh, alice_pub, bob_id, alice_id).
  5. Sync convergence: After Alice exports her delta and Bob imports it, both instances have identical sets of message IDs for Alice's feed. verify_chain() passes on Bob's copy.
  6. Performance: Appending and encrypting 100 messages to 10 contacts completes in under 1 second on 2020-era commodity hardware (M1 MacBook Air, Core i5 laptop, or equivalent Android device).
  7. Tombstone behaviour: A tombstoned message has tombstoned=1 and null content_json in the messages table. The hash chain remains valid across the tombstone sequence position. The message is absent from feed render output.
  8. Persistence: A feed written to SQLite survives a process restart. After reload, feed.verify_chain() passes and feed.get_sequence() returns the correct value.
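Criterion 7 (tombstone behaviour) lends itself to a compact check. The sketch below uses a reduced version of the messages table and hypothetical helper names. The point is that a tombstone clears content_json and sets the flag but leaves envelope_json untouched, which is why the hash chain still verifies across that position.

```python
import sqlite3


def open_demo_store() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    # Reduced column set: only what the tombstone check needs.
    conn.execute(
        "CREATE TABLE messages ("
        " message_id TEXT PRIMARY KEY, sequence INTEGER NOT NULL,"
        " type TEXT NOT NULL, envelope_json TEXT NOT NULL,"
        " content_json TEXT, tombstoned INTEGER DEFAULT 0)"
    )
    return conn


def store(conn, message_id, sequence, msg_type, envelope_json, content_json):
    with conn:
        conn.execute(
            "INSERT INTO messages VALUES (?, ?, ?, ?, ?, 0)",
            (message_id, sequence, msg_type, envelope_json, content_json),
        )


def apply_tombstone(conn, target_message: str) -> bool:
    """Mark the target as tombstoned and drop its decrypted content."""
    with conn:
        cur = conn.execute(
            "UPDATE messages SET tombstoned = 1, content_json = NULL"
            " WHERE message_id = ?",
            (target_message,),
        )
    return cur.rowcount == 1


def render(conn):
    """Posts visible in the feed: tombstoned rows are skipped."""
    rows = conn.execute(
        "SELECT content_json FROM messages"
        " WHERE tombstoned = 0 AND type = 'post' ORDER BY sequence"
    ).fetchall()
    return [r[0] for r in rows]
```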

Risks & Unknowns

| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| p2panda API instability (pre-1.0, v0.1–0.3): crate interface may change between patch versions | High | Medium | Pin exact versions in Cargo.toml (using `=` version requirements) and commit Cargo.lock; review changelog before upgrading. If too unstable, implement the data model directly (the spec is simple enough to not need the library). |
| Per-audience encryption is O(N contacts): large contact counts slow append() noticeably | Medium | Medium | Benchmark at 50 / 100 / 500 contacts. If slow, batch encrypt in parallel using rayon. For the prototype, a 50-contact cap is acceptable. |
| Sequence gap handling complexity: BLE transport will deliver messages out of order; gap-buffer logic is non-trivial | Medium | High | For the first pass, fail fast on gaps (reject out-of-order) and document the limitation. A limited buffer is scheduled for Day 12; harden it in a follow-on iteration. |
| Feed storage growth: no pruning or archival strategy defined; 100 contacts × 10K messages = 500 MB | Low | Medium | Enforce max_messages_per_feed = 10000 with FIFO eviction at the MessageStore layer. Blobs GC'd after 7 days if unreferenced. |
| openmls integration complexity: MLS group state machine has many edge cases (epoch advancement, commit ordering, Welcome message flow) | High | High | Stub out group encryption for Days 1–12 using a placeholder symmetric key. Add real MLS on Day 13 if time allows; otherwise document as future work. |
| Canonical JSON serialization portability: field ordering and encoding must be byte-identical across Rust, Python, Kotlin | Medium | High | Publish test vectors (a set of known input → expected canonical bytes → expected SHA-256) before implementing in more than one language. Rust is the reference implementation. |
| Private key material leaking into SQLite: easy to accidentally store a key in the wrong table | Medium | High | Code review checklist: grep for INSERT INTO in any file that imports crypto types. Integration test: scan every column of every table and assert that no value contains the test fixture's private key or contacts_key bytes, raw or base64url-encoded. (PRAGMA integrity_check only verifies database structure and cannot detect this.) |
| Double Ratchet state management: ratchet state must be persisted and restored correctly; session corruption causes permanent message loss | Medium | High | For prototype, stub DM encryption as a single-key AEAD (document the simplification). Full Double Ratchet can be added incrementally after the core data model is validated. |
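The test-vector mitigation for canonical JSON portability could take roughly the following form. This is an illustrative sketch: canonical_bytes() uses sorted keys, compact separators, and UTF-8 output as an assumed approximation of the rules in the feed format spec, and the vector field names are hypothetical. Number formatting and Unicode normalization, which are common sources of cross-language drift, are not addressed.

```python
import hashlib
import json


def canonical_bytes(obj) -> bytes:
    # Assumed canonical form: sorted keys, no insignificant whitespace, UTF-8.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")


def make_vector(name: str, obj) -> dict:
    """Build one test vector: input -> canonical bytes (hex) -> SHA-256 (hex)."""
    data = canonical_bytes(obj)
    return {
        "name": name,
        "input": obj,
        "canonical_hex": data.hex(),
        "sha256_hex": hashlib.sha256(data).hexdigest(),
    }


def check_vector(vector: dict) -> bool:
    """Re-serialize the input and compare against the recorded bytes and hash."""
    data = canonical_bytes(vector["input"])
    return (data.hex() == vector["canonical_hex"]
            and hashlib.sha256(data).hexdigest() == vector["sha256_hex"])
```

Each language port would implement only check_vector() against a shared JSON file of vectors generated once by the reference implementation.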

2-Week Day-by-Day Build Plan

| Day | Goal | Deliverable | Dependencies |
|---|---|---|---|
| 1 | Project scaffolding; keypair generation; genesis message | cargo new / python -m venv; Ed25519 keypair; Feed::new() writes seq=0 profile_update to SQLite | Rust toolchain or Python ≥3.11 |
| 2 | Hash-chained append; verify_chain() | Feed::append() computes previous, signs, writes to DB; verify_chain() iterates chain and checks hashes + sigs | Day 1 |
| 3 | ContactsKey derivation; contacts-audience encrypt/decrypt | ContactsKey::derive() (X25519 + HKDF); EncryptionLayer::encrypt_for_contact() + decrypt(); symmetry unit test | Day 2 |
| 4 | Self-only (journal) messages | audience = self; journal_key derivation from seed; encrypt/decrypt roundtrip; verify these messages are skipped by FeedSync::diff() | Day 3 |
| 5 | Direct message encryption (stub) | audience = direct:{peer_id}; stub with single AEAD key (no full Double Ratchet); document limitation; DM roundtrip test | Day 3 |
| 6 | SQLite schema + MessageStore full implementation | All five tables created on open(); append(), get_range(), max_seq() implemented and unit-tested | Day 1 |
| 7 | FeedSync::diff() + FeedSync::merge() (happy path) | diff() returns correct delta given their_seq; merge() validates signatures and chain, writes to DB | Days 2, 6 |
| 8 | CLI demo: Alice and Bob sync via file export | Two-process (or two-directory) demo: Alice posts 5 messages, writes delta.json, Bob loads and verifies; identical feed state confirmed | Day 7 |
| 9 | Tombstone support | tombstone message type; MessageStore marks target as tombstoned=1, clears content_json; chain still valid; tombstone test | Days 2, 6 |
| 10 | profile_update message type + contact display name index | profile_update content type; MessageStore indexes the most recent one per feed; retrieved by contacts layer | Days 2, 6 |
| 11 | Performance benchmark: 1000-message append + verify | Benchmarks for append(), verify_chain(), diff() at 1K messages; assert under target times; profile any bottlenecks | Days 2, 7 |
| 12 | Partial sync and gap-buffer (limited) | merge() handles receiving messages out of order; at minimum: logs gap, stores buffered messages, links them when the gap is filled; integration test | Day 7 |
| 13 | MLS group encryption (openmls or stub) | If time: integrate openmls for audience = group:{id}; create group, add member, encrypt/decrypt one message. If not: document stub and leave Err(NotImplemented) | Days 3, 7 |
| 14 | Test vectors; findings document; cleanup | Publish canonical-serialization test vectors as a JSON file in the repo; write FINDINGS.md covering what worked, what didn't, what the next prototype should address | All |

Further Reading