Storage Layer

calimero-store + calimero-store-rocksdb + calimero-dag + calimero-storage

11 column families · 26+ group key prefixes · 4 crates · optional AES-GCM encryption

Purpose

The storage layer provides a column-family key-value abstraction over RocksDB. The core Database trait exposes has, get, put, delete, iter, and apply(Transaction). Keys are typed via generic_array, giving compile-time guarantees on key size and column assignment. An optional AES-GCM encryption layer transparently encrypts values at rest.

pub trait Database {
    fn has(&self, col: Column, key: &[u8]) -> Result<bool>;
    fn get(&self, col: Column, key: &[u8]) -> Result<Option<Slice>>;
    fn put(&self, col: Column, key: &[u8], value: &[u8]) -> Result<()>;
    fn delete(&self, col: Column, key: &[u8]) -> Result<()>;
    fn iter(&self, col: Column) -> Result<DBIterator>;
    fn apply(&self, tx: Transaction) -> Result<()>;
}
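As a rough illustration of how the trait's operations fit together, here is a minimal in-memory sketch. It is not the calimero-store implementation: the two-variant Column enum, the Op type, and MemDb are hypothetical names introduced only for this example, and error handling is omitted.

```rust
use std::collections::BTreeMap;

// Hypothetical two-column set; the real Column enum has more variants.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum Column {
    Meta,
    State,
}

// One buffered write inside a transaction (illustrative type).
pub enum Op {
    Put { col: Column, key: Vec<u8>, value: Vec<u8> },
    Delete { col: Column, key: Vec<u8> },
}

// Accumulates writes; nothing is visible to readers until `apply`.
#[derive(Default)]
pub struct Transaction {
    ops: Vec<Op>,
}

impl Transaction {
    pub fn put(&mut self, col: Column, key: &[u8], value: &[u8]) {
        self.ops.push(Op::Put { col, key: key.to_vec(), value: value.to_vec() });
    }

    pub fn delete(&mut self, col: Column, key: &[u8]) {
        self.ops.push(Op::Delete { col, key: key.to_vec() });
    }
}

// In-memory stand-in for a column-family store: rows ordered by (column, key).
#[derive(Default)]
pub struct MemDb {
    rows: BTreeMap<(Column, Vec<u8>), Vec<u8>>,
}

impl MemDb {
    pub fn has(&self, col: Column, key: &[u8]) -> bool {
        self.rows.contains_key(&(col, key.to_vec()))
    }

    pub fn get(&self, col: Column, key: &[u8]) -> Option<&[u8]> {
        self.rows.get(&(col, key.to_vec())).map(Vec::as_slice)
    }

    // Applies every buffered op in order. Taking `&mut self` means no reader
    // can observe a half-applied transaction in this single-threaded sketch.
    pub fn apply(&mut self, tx: Transaction) {
        for op in tx.ops {
            match op {
                Op::Put { col, key, value } => {
                    self.rows.insert((col, key), value);
                }
                Op::Delete { col, key } => {
                    self.rows.remove(&(col, key));
                }
            }
        }
    }
}
```

The same key bytes can live in two columns without colliding, because the column is part of the row identity.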

Column Architecture

All persistent data is partitioned into 11 column families. Each maps to a dedicated RocksDB column family with independent compaction and bloom filters. The Group column is the most complex, containing 26+ logical key prefixes for governance state including namespace identity, namespace governance ops, group hierarchy, and group encryption keys.

[Diagram: Group column key prefixes, grouped into Membership, Resources, OpLog / Hashing, and Lifecycle]

Informal group::… labels in the diagram map to the typed keys and single-byte prefixes in Storage Schema (Group column):

group::info → GroupMeta (0x20)
group::member → GroupMember (0x21)
group::role → role data on GroupMember values (0x21)
group::context → GroupContextIndex (0x22) / reverse index ContextGroupRef (0x23)
group::upgrade → GroupUpgradeKey (0x24)
group::signing_key → GroupSigningKey (0x25)
group::cap → GroupMemberCapability (0x26)
group::settings (defaults) → GroupDefaultCaps (0x29)
migration markers → GroupContextLastMigration (0x2B)
group::nonce → GroupLocalGovNonce (0x2C)
group::alias (member) → GroupMemberAlias (0x2D)
group/context names → GroupAlias (0x2E), GroupContextAlias (0x2F)
group::oplog → GroupOpLog (0x30)
group::oplog_head → GroupOpHead (0x31)
member–context links → GroupMemberContext (0x32), GroupContextMemberCap (0x33)

Some cells (e.g. group::invite, group::state_hash) are illustrative and do not match a single prefix; use the schema tables as the source of truth.

Key Model

All keys are statically typed via Key<T>, a newtype over GenericArray<u8, T::Size>. The AsKeyParts / FromKeyParts traits define how a key is decomposed into column + byte components, giving compile-time column assignment.

pub struct Key<T: AsKeyParts>(GenericArray<u8, T::Size>);

pub trait AsKeyParts {
    type Size: ArrayLength<u8>;
    const COLUMN: Column;
    fn as_key(&self) -> Key<Self>;
}

pub trait FromKeyParts: AsKeyParts {
    fn from_key(key: Key<Self>) -> Self;
}
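The idea of tying key size and column to a type can be sketched without generic_array. The example below swaps in plain fixed-size arrays through an associated type, so it is an analogy to the pattern above rather than the crate's actual definitions. KeyKind, ExampleMetaKey, ExampleStateKey, and the two-variant Column are hypothetical names.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Column {
    Meta,
    State,
}

// Describes one key layout: which column it lives in and how many bytes it has.
// Both facts are fixed at compile time.
pub trait KeyKind {
    const COLUMN: Column;
    type Bytes: AsRef<[u8]> + Copy;
}

// A key is just the bytes, tagged with its layout type.
pub struct Key<T: KeyKind>(T::Bytes);

impl<T: KeyKind> Key<T> {
    pub fn new(bytes: T::Bytes) -> Self {
        Key(bytes)
    }

    // The column is a property of the type, not a runtime field.
    pub fn column(&self) -> Column {
        T::COLUMN
    }

    pub fn as_bytes(&self) -> &[u8] {
        self.0.as_ref()
    }
}

// Hypothetical 32-byte key stored in the Meta column.
pub struct ExampleMetaKey;
impl KeyKind for ExampleMetaKey {
    const COLUMN: Column = Column::Meta;
    type Bytes = [u8; 32];
}

// Hypothetical 64-byte key stored in the State column.
pub struct ExampleStateKey;
impl KeyKind for ExampleStateKey {
    const COLUMN: Column = Column::State;
    type Bytes = [u8; 64];
}
```

With this shape, `Key::<ExampleMetaKey>::new([0u8; 16])` fails to compile because the array length does not match the layout, which is the kind of guarantee the typed keys are meant to give.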

Key Types by Column

calimero-dag

A generic, in-memory causal DAG used for both context state deltas and governance operation logs. Provides topological ordering, pending queues for out-of-order delivery, and missing-parent detection for catch-up.

crates/dag

CausalDelta<T>

pub struct CausalDelta<T> {
    id: DeltaId,
    parents: Vec<DeltaId>,
    payload: T,
    timestamp: HLC,
    expected_root_hash: Hash,
}

Each delta records its causal parents, forming a partial order. The expected_root_hash enables fast consistency checks — a peer can verify it arrived at the same state after applying a delta.

DagStore<T>

pub struct DagStore<T> {
    deltas: HashMap<DeltaId, CausalDelta<T>>,
    applied: HashSet<DeltaId>,
    pending: Vec<CausalDelta<T>>,
    heads: HashSet<DeltaId>,
}

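Tracks all known deltas, which of them have been applied, which are pending (missing parents), and the current DAG head set. The head set is maintained automatically: when a delta is applied it becomes a head, and any of its parents that were heads stop being heads.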

DeltaApplier<T> trait

pub trait DeltaApplier<T> {
    fn apply_delta(&mut self, delta: &CausalDelta<T>) -> Result<()>;
    fn restore_applied_delta(&mut self, delta: &CausalDelta<T>) -> Result<()>;
}

Key Operations

Topological Ordering

Before applying, all pending deltas are sorted in topological order (parents before children). This ensures deterministic replay regardless of arrival order.
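One common way to obtain a parents-before-children order is Kahn's algorithm. The sketch below is an assumption about how such an ordering could be computed, not the crate's code: DeltaId is simplified to u32, a batch is modelled as a map from delta id to parent ids, and ties are broken by id so that the result does not depend on arrival order.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Stand-in for the real DeltaId type.
type DeltaId = u32;

// Returns the ids in `deltas` so that every parent that is part of the batch
// comes before its children. Parents outside the batch are treated as already
// applied. Among ready deltas the smallest id goes first, which keeps the
// result deterministic. Deltas on a cycle (impossible in a DAG) are left out.
pub fn topo_order(deltas: &BTreeMap<DeltaId, Vec<DeltaId>>) -> Vec<DeltaId> {
    let mut indegree: BTreeMap<DeltaId, usize> = BTreeMap::new();
    let mut children: BTreeMap<DeltaId, Vec<DeltaId>> = BTreeMap::new();

    for (&id, parents) in deltas {
        // Deduplicate parents and ignore those that are not in this batch.
        let local: BTreeSet<DeltaId> = parents
            .iter()
            .copied()
            .filter(|p| deltas.contains_key(p))
            .collect();
        indegree.insert(id, local.len());
        for parent in local {
            children.entry(parent).or_default().push(id);
        }
    }

    // Deltas with no unapplied parent are ready immediately.
    let mut ready: BTreeSet<DeltaId> = indegree
        .iter()
        .filter(|(_, n)| **n == 0)
        .map(|(&id, _)| id)
        .collect();

    let mut order = Vec::with_capacity(deltas.len());
    // `pop_first` needs Rust 1.66 or newer.
    while let Some(id) = ready.pop_first() {
        order.push(id);
        for &child in children.get(&id).into_iter().flatten() {
            let n = indegree.get_mut(&child).expect("child is in the batch");
            *n -= 1;
            if *n == 0 {
                ready.insert(child);
            }
        }
    }
    order
}
```

Using ordered collections throughout means two nodes that hold the same batch produce the same sequence, whatever order the deltas arrived in.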

Pending Queue

If a delta's parents haven't been seen yet, it enters the pending queue. When missing parents arrive, queued deltas are automatically drained and applied.

restore_applied_delta

Used during node restart to rebuild the in-memory DAG from persisted deltas without re-executing the payload (state is already in storage).

get_missing_parents

Returns the set of delta IDs referenced as parents but not yet received. Used by the sync protocol to request specific deltas from peers.
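The pending queue, head tracking, and missing-parent lookup described in this section can be pictured with a small model. This is a simplified sketch under stated assumptions, not DagStore itself: payloads and hashes are dropped, DeltaId is a u32, and MiniDag, Delta, and the log field are names invented for the example.

```rust
use std::collections::{BTreeSet, HashMap, HashSet};

// Stand-in for the real DeltaId type.
type DeltaId = u32;

// A delta reduced to its causal links.
pub struct Delta {
    pub id: DeltaId,
    pub parents: Vec<DeltaId>,
}

#[derive(Default)]
pub struct MiniDag {
    applied: HashSet<DeltaId>,
    pending: HashMap<DeltaId, Delta>,
    heads: BTreeSet<DeltaId>,
    // Order in which deltas were applied, kept for inspection.
    pub log: Vec<DeltaId>,
}

impl MiniDag {
    // Accepts a delta in any order; it is applied as soon as all parents are.
    pub fn add(&mut self, delta: Delta) {
        if self.applied.contains(&delta.id) {
            return; // duplicate delivery
        }
        self.pending.insert(delta.id, delta);
        self.drain();
    }

    // Repeatedly applies every pending delta whose parents are all applied.
    fn drain(&mut self) {
        loop {
            let mut ready: Vec<DeltaId> = self
                .pending
                .values()
                .filter(|d| d.parents.iter().all(|p| self.applied.contains(p)))
                .map(|d| d.id)
                .collect();
            if ready.is_empty() {
                break;
            }
            ready.sort_unstable();
            for id in ready {
                let delta = self.pending.remove(&id).expect("id came from pending");
                // The new delta supersedes its parents as a head.
                for parent in &delta.parents {
                    self.heads.remove(parent);
                }
                self.heads.insert(id);
                self.applied.insert(id);
                self.log.push(id);
            }
        }
    }

    // Parent ids that are referenced by pending deltas but were never received.
    pub fn get_missing_parents(&self) -> BTreeSet<DeltaId> {
        self.pending
            .values()
            .flat_map(|d| d.parents.iter().copied())
            .filter(|p| !self.applied.contains(p) && !self.pending.contains_key(p))
            .collect()
    }

    pub fn heads(&self) -> &BTreeSet<DeltaId> {
        &self.heads
    }
}
```

Delivering deltas 3, then 2, then 1 (a simple chain) leaves the first two queued with delta 1 reported as missing. Once delta 1 arrives the whole chain drains in causal order.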

calimero-storage

Provides CRDT collections used by the WASM runtime for conflict-free replicated state. Each collection implements the Mergeable trait for automatic conflict resolution during sync.

crates/storage

UnorderedMap

Observed-remove map. Concurrent puts to the same key are resolved by LWW using HybridTimestamp. Deletions are tracked as tombstones until causally stable.

UnorderedSet

Observed-remove set. Add/remove conflicts resolved in favor of add (add-wins semantics). Internally backed by an UnorderedMap with unit values.

LwwRegister

Last-writer-wins register. Stores a single value with a HybridTimestamp. On merge, the value with the highest timestamp wins.

Core Traits

pub struct HybridTimestamp(Timestamp); // from uhlc: NTP64 wall-clock + 128-bit node ID tiebreaker

pub trait Mergeable {
    fn merge(&mut self, other: &Self) -> Result<(), MergeError>;
}

The Mergeable trait is the fundamental building block — any type that implements it can be used as a CRDT value in the storage layer. The runtime's host functions call merge when applying remote deltas.
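To make the trait concrete, here is a minimal last-writer-wins register implementing a Mergeable trait of the same shape. It is a sketch, not the crate's LwwRegister: Stamp is a hypothetical stand-in for HybridTimestamp (a time plus a node id used as tiebreaker), and MergeError is an empty placeholder.

```rust
// Placeholder error type for the sketch.
#[derive(Debug, PartialEq, Eq)]
pub struct MergeError;

pub trait Mergeable {
    fn merge(&mut self, other: &Self) -> Result<(), MergeError>;
}

// Stand-in for HybridTimestamp. The derived ordering compares `time` first
// and falls back to `node`, so two distinct writers never tie.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct Stamp {
    pub time: u64,
    pub node: u128,
}

#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Lww<T> {
    pub value: T,
    pub stamp: Stamp,
}

impl<T: Clone> Mergeable for Lww<T> {
    // Keeps whichever side carries the larger stamp.
    fn merge(&mut self, other: &Self) -> Result<(), MergeError> {
        if other.stamp > self.stamp {
            self.value = other.value.clone();
            self.stamp = other.stamp;
        }
        Ok(())
    }
}
```

Because the comparison is a total order over stamps, merging in either direction ends with the same value on both sides.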

RocksDB Implementation

The calimero-store-rocksdb crate provides the concrete Database implementation backed by RocksDB.

crates/store/rocksdb

Column Family Mapping

Each Column enum variant maps 1:1 to a RocksDB column family. CFs are created at DB open time. Each has independent compaction, bloom filters (10 bits/key), and block cache partitions.

WriteBatch Transactions

The Transaction type accumulates puts and deletes, then is atomically committed via WriteBatch. Guarantees all-or-nothing semantics for multi-key operations like delta application.

Snapshot Iteration

Iterators are backed by RocksDB snapshots for consistent point-in-time reads. Prefix iteration uses set_iterate_range for efficient scans within a column family.

Pinned Gets

Uses get_pinned_cf for zero-copy reads where possible. The returned Slice borrows directly from the block cache, avoiding allocation for large values.

CRDT Collections

The calimero-storage crate provides application-level CRDT collections built on top of the storage layer. These are what SDK applications use for state management.

Available Collections

UnorderedMap

Key-value store with LWW (Last-Write-Wins) semantics per entry. Entries can be inserted, updated, and removed independently across nodes.

Vector

Ordered append-only list. Items are pushed to the end. Positional inserts and removes use index-based CRDT logic.

Counter

Generic over ALLOW_DECREMENT: bool. Default Counter<false> (alias GCounter) is grow-only: each node increments its own slot and value() returns the sum across all nodes. Counter<true> (alias PNCounter) layers a second per-node map to also support decrement. Both variants are commutative and idempotent.
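The per-node slot design can be sketched as follows. This is an illustration of the general G-Counter / PN-Counter technique with a const generic flag, assuming a u8 node id and an element-wise maximum as the merge rule. It is not the crate's Counter and its field names are invented.

```rust
use std::collections::BTreeMap;

// Stand-in for a node identity.
type NodeId = u8;

#[derive(Clone, Debug, Default, PartialEq, Eq)]
pub struct Counter<const ALLOW_DECREMENT: bool = false> {
    // Per-node increment totals.
    inc: BTreeMap<NodeId, u64>,
    // Per-node decrement totals; stays empty for the grow-only variant.
    dec: BTreeMap<NodeId, u64>,
}

pub type GCounter = Counter<false>;
pub type PNCounter = Counter<true>;

impl<const ALLOW_DECREMENT: bool> Counter<ALLOW_DECREMENT> {
    // Each node only ever bumps its own slot.
    pub fn increment(&mut self, node: NodeId) {
        *self.inc.entry(node).or_insert(0) += 1;
    }

    // Sum of all increments minus sum of all decrements.
    pub fn value(&self) -> i128 {
        let up: u64 = self.inc.values().sum();
        let down: u64 = self.dec.values().sum();
        i128::from(up) - i128::from(down)
    }

    // Element-wise maximum per node slot.
    pub fn merge(&mut self, other: &Self) {
        for (node, &count) in &other.inc {
            let slot = self.inc.entry(*node).or_insert(0);
            *slot = (*slot).max(count);
        }
        for (node, &count) in &other.dec {
            let slot = self.dec.entry(*node).or_insert(0);
            *slot = (*slot).max(count);
        }
    }
}

// Only the decrement-enabled variant exposes `decrement`.
impl Counter<true> {
    pub fn decrement(&mut self, node: NodeId) {
        *self.dec.entry(node).or_insert(0) += 1;
    }
}
```

Since slot totals only grow and merge takes the maximum, applying the same remote state twice changes nothing, and the merge order does not matter. Calling `decrement` on a GCounter is a compile error in this sketch.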

Storage Primitives

Beyond the public CRDT collections above (UnorderedMap, Vector, UnorderedSet, LwwRegister, Counter), the storage layer provides wrappers that constrain who can write what. The constraint comes in three flavours:

Signature-based (UserStorage, SharedStorage, AuthoredMap, AuthoredVector) — the context manager signs each mutating action with the executor's private key (sign_authorized_actions in crates/context/src/handlers/execute/mod.rs) after WASM returns its outcome. Peers verify the signature at merge time in Interface::apply_action in crates/storage/src/interface.rs using the runtime's ed25519_verify host function and reject the action with InvalidSignature on mismatch.
Structural (FrozenStorage) — no per-identity signature; immutability after first write is enforced by content-addressing. Once the SHA-256 hash → value mapping is published, attempts to overwrite the same hash are rejected.
None (the public collections) — anyone in the context can write, update, or remove any entry. Listed below for contrast.

Comparison

Primitive	Storage stamp	Keyspace	Writes new	Mutates existing	Reads
UnorderedMap<K,V> (public baseline — no auth)	—	K → V	anyone	anyone	everyone
UserStorage<T>	User { owner = executor_id }	PublicKey → T, disjoint per-user slots	only into your own slot	only your own slot	everyone (get_for_user)
FrozenStorage<T> (structural — no signature check)	Frozen	Hash(value) → T, content-addressed	anyone	nobody — immutable	everyone
SharedStorage<T>	Shared { writers, frozen }	one slot with T (T often a nested map)	signer ∈ writers	signer ∈ writers; writers rotatable if !frozen	everyone
AuthoredMap<K,V>	User { owner } per-entry	shared keyspace, K → V	anyone (becomes owner)	only owner	everyone
AuthoredVector<V>	User { owner } per-entry	shared sequence, index → V	anyone (push, becomes owner)	only owner (update / tombstone)	everyone

Picking one

what guarantee do you need?
├── none, public → UnorderedMap / Vector / UnorderedSet / LwwRegister / Counter
├── immutable, content-addressed → FrozenStorage<T>
└── identity-bound writes ↓
    ├── one slot per user, disjoint → UserStorage<T>
    ├── one shared slot, named writer set → SharedStorage<T> (alias: PermissionedStorage<T, WriterSetAcl>)
    ├── one shared slot, single transferable owner → Ownable<T> (alias: PermissionedStorage<T, OwnerAcl>)
    ├── shared keyspace, per-entry author → AuthoredMap<K,V>
    └── shared sequence, per-entry author → AuthoredVector<V>

UserStorage<T>

Per-user slot keyed by PublicKey. The executor can only write into their own slot (env::executor_id()); reads are unrestricted. Internally an UnorderedMap<PublicKey, T>.

SharedStorage<T>

A single value writable by any signer in a mutable writers set. Any current writer can rotate the set unless it is frozen. The context manager signs each write after WASM returns its outcome; peers verify the signature against the stored writer set at merge time. See ADR 0001 for the rotation-during-concurrent-write contract.

FrozenStorage<T>

Content-addressable immutable storage. insert returns a SHA-256 hash; reads are by hash. Same value always yields the same hash, and entries cannot be updated once written. Internally an UnorderedMap<Hash, FrozenValue<T>> with first-write-wins semantics.
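Content addressing with first-write-wins can be sketched in a few lines. Note the loud assumption: to stay within the standard library this example hashes with DefaultHasher, which is not cryptographic and is only a stand-in for the SHA-256 hash the real primitive uses. The Frozen type and content_hash helper are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

// Stand-in content hash. NOT SHA-256 and NOT collision resistant; it only
// demonstrates that the key is derived from the value.
fn content_hash<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

pub struct Frozen<T> {
    entries: BTreeMap<u64, T>,
}

impl<T: Hash> Frozen<T> {
    pub fn new() -> Self {
        Frozen { entries: BTreeMap::new() }
    }

    // Stores the value under its own hash. If the hash is already present the
    // existing entry is kept untouched (first write wins).
    pub fn insert(&mut self, value: T) -> u64 {
        let hash = content_hash(&value);
        self.entries.entry(hash).or_insert(value);
        hash
    }

    // Reads are by hash only; there is no update or remove.
    pub fn get(&self, hash: u64) -> Option<&T> {
        self.entries.get(&hash)
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```

Inserting the same value twice returns the same hash and leaves a single entry, which is why no ownership check is needed to keep entries immutable.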

AuthoredMap<K, V>

Shared keyspace map with per-entry ownership. Any member can insert a new key; only the inserter can update or remove their own entries. Each entry carries a StorageType::User { owner } stamp set from the executor's public key at insert time.
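The local, in-process side of per-entry ownership might look like the sketch below. It only models the early rejection of non-owner calls. It does not perform signature verification, which the real primitive relies on at merge time. OwnedMap, AuthError, string keys, and the choice to reject an insert on an occupied key are all assumptions made for the example.

```rust
use std::collections::BTreeMap;

// Stand-in for an ed25519 public key.
type PublicKey = [u8; 32];

#[derive(Debug, PartialEq, Eq)]
pub enum AuthError {
    NotOwner,
    KeyTaken,
    Missing,
}

// Each entry stores the owner stamp next to the value.
pub struct OwnedMap<V> {
    entries: BTreeMap<String, (PublicKey, V)>,
}

impl<V> OwnedMap<V> {
    pub fn new() -> Self {
        OwnedMap { entries: BTreeMap::new() }
    }

    // Anyone may claim a free key; the executor becomes its owner.
    pub fn insert(&mut self, executor: PublicKey, key: &str, value: V) -> Result<(), AuthError> {
        if self.entries.contains_key(key) {
            return Err(AuthError::KeyTaken);
        }
        self.entries.insert(key.to_owned(), (executor, value));
        Ok(())
    }

    // Only the recorded owner may change an existing entry.
    pub fn update(&mut self, executor: PublicKey, key: &str, value: V) -> Result<(), AuthError> {
        match self.entries.get_mut(key) {
            Some((owner, slot)) if *owner == executor => {
                *slot = value;
                Ok(())
            }
            Some(_) => Err(AuthError::NotOwner),
            None => Err(AuthError::Missing),
        }
    }

    // Reads are unrestricted.
    pub fn get(&self, key: &str) -> Option<&V> {
        self.entries.get(key).map(|(_, value)| value)
    }
}
```

A check like this catches mistakes early on the calling node, but a modified node could skip it, which is why the check that actually protects data has to run on the receiving side.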

AuthoredVector<V>

Ordered shared vector with per-entry ownership. Any member can push; only the pusher can update or tombstone their entry. There is no physical remove — shifting indices would complicate concurrent-push merge semantics. Use tombstone(idx) to retract a slot.

Merge-time enforcement (signature-based primitives)

Local update/remove calls short-circuit non-owner attempts so bugs surface in-process. The load-bearing check happens at merge time in Interface::apply_action:

Signature — the runtime's ed25519_verify host function is called against the entry's stored owner (or, for SharedStorage, against the current writer set). Mismatch returns InvalidSignature.
Replay nonce — incoming nonce must be strictly greater than the stored value; equal nonces are rejected as NonceReplay (i.e. the comparison is incoming > stored, not ≥). The nonce is the action's SignatureData.nonce, set from env::time_now() (a wall-clock nanosecond timestamp from SystemTime::now()) at action build time. Caveat: wall-clock time is not strictly monotonic across NTP slews, leap-second corrections, VM-host clock changes, or process restarts. Two consecutive writes on the same node can collide or step backward under those conditions, in which case the second write is rejected as NonceReplay; conversely a node with a far-future clock can write a nonce that effectively locks the entity until later honest writes catch up. Per the inline comment in interface.rs, the nonce check is itself a transitional v2 mechanism that the project plans to retire after a soak period in favour of DAG-causal verification.
Per-entity scope — the stored nonce lives on the individual storage entity (the row keyed by entity id), not on the owner globally. For AuthoredMap each map key is its own entity; for UserStorage each per-user slot; for SharedStorage the single value.
Entity binding — a signed action targeting one entity cannot be replayed against a different entity even when the same key signed both: the entity id is part of the bytes hashed by Action::payload_for_signing(), so the signature is bound to that specific entity.
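The replay-nonce and entity-binding rules above can be sketched without any cryptography. The code below is an illustration only: NonceTable, VerifyError, and signing_payload are hypothetical, and the byte layout in signing_payload is invented to show that the entity id is part of the signed bytes. It is not the layout produced by Action::payload_for_signing().

```rust
use std::collections::HashMap;

// Stand-in for an entity id.
type EntityId = [u8; 32];

#[derive(Debug, PartialEq, Eq)]
pub enum VerifyError {
    NonceReplay,
}

// Highest accepted nonce per entity (per-entity scope, not per owner).
#[derive(Default)]
pub struct NonceTable {
    last: HashMap<EntityId, u64>,
}

impl NonceTable {
    // Accepts the action only if `incoming` is strictly greater than the
    // stored nonce for this entity. Equal nonces are rejected.
    pub fn accept(&mut self, entity: EntityId, incoming: u64) -> Result<(), VerifyError> {
        let stored = self.last.get(&entity).copied();
        if let Some(stored) = stored {
            if incoming <= stored {
                return Err(VerifyError::NonceReplay);
            }
        }
        self.last.insert(entity, incoming);
        Ok(())
    }
}

// Hypothetical signing payload: entity id, then nonce, then the action body.
// Because the entity id is included, a signature over one entity's payload
// does not match the payload of any other entity.
pub fn signing_payload(entity: &EntityId, nonce: u64, body: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(entity.len() + 8 + body.len());
    out.extend_from_slice(entity);
    out.extend_from_slice(&nonce.to_be_bytes());
    out.extend_from_slice(body);
    out
}
```

The sketch also makes the wall-clock caveat easy to see: if a node's clock steps backward between two writes to the same entity, the second call to `accept` fails even though the write is honest.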

PermissionedStorage<T, A> / Ownable<T>

Policy-parameterised wrapper over the underlying WriterSetCell<T> storage primitive. The type parameter A: Authorizer is a zero-sized policy marker that decides, at the API surface, whether the current executor may perform a given Op. Two built-in policies ship:

WriterSetAcl — any member of the writer set may perform any op (the behaviour of plain SharedStorage). Aliased as SharedStorage<T>.
OwnerAcl — the single writer is the owner; Ownable<T> is the ergonomic alias. Supports transfer_ownership and renounce_ownership.

The security boundary is unchanged: merge-time signature verification against the writer set. The Authorizer is fail-fast UX sugar — it lets a method reject an unauthorised caller early, but cannot substitute for correct storage-type placement (data must live inside the wrapper to be protected).

use calimero_storage::collections::{Ownable, PermissionedStorage, WriterSetAcl};

struct State {
    config: Ownable<LwwRegister<String>>, // single owner, transferable
    policy: PermissionedStorage<LwwRegister<String>, WriterSetAcl>,
}

OpMask — operation-granular capabilities

The writer set on a Shared entity has been extended from BTreeSet<PublicKey> to BTreeMap<PublicKey, OpMask>. Each writer now carries a bitmask of permitted operations, enforced at merge time by the ProtocolAuthorizer after signature verification:

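use bitflags::bitflags;

bitflags! {
    pub struct OpMask: u8 {
        const INSERT = 0b0000_0001; // create new entity
        const UPDATE = 0b0000_0010; // modify existing
        const DELETE = 0b0000_0100; // delete
        const ADMIN  = 0b0000_1000; // rotate writers / grant / revoke
        // Composite masks are built from the bit values of the flags above.
        const APPEND = Self::INSERT.bits() | Self::UPDATE.bits(); // write, no delete
        const WRITE  = Self::INSERT.bits() | Self::UPDATE.bits() | Self::DELETE.bits();
        const FULL   = Self::WRITE.bits() | Self::ADMIN.bits();
    }
}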

A key with no explicit mask resolves to OpMask::FULL — preserving today's "in the set ⇒ anything" behaviour. grant_capability(who, mask) sets a per-principal mask; concurrent conflicting grants merge by bitwise AND (revoke-wins, fail-safe). WRITE/DELETE/ADMIN bits are merge-enforced; INSERT vs UPDATE is deferred (needs exists_at_cut).
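The default-to-FULL rule and the bitwise-AND merge of concurrent grants can be checked with plain integers. The sketch below uses u8 constants instead of the bitflags type so that it has no dependencies. The helper names effective, permits, and merge_concurrent are hypothetical.

```rust
pub const INSERT: u8 = 0b0000_0001;
pub const UPDATE: u8 = 0b0000_0010;
pub const DELETE: u8 = 0b0000_0100;
pub const ADMIN: u8 = 0b0000_1000;
pub const APPEND: u8 = INSERT | UPDATE;
pub const WRITE: u8 = INSERT | UPDATE | DELETE;
pub const FULL: u8 = WRITE | ADMIN;

// A writer with no explicit mask is treated as having every capability.
pub fn effective(mask: Option<u8>) -> u8 {
    mask.unwrap_or(FULL)
}

// True if every bit of `op` is present in the writer's effective mask.
pub fn permits(mask: Option<u8>, op: u8) -> bool {
    (effective(mask) & op) == op
}

// Two concurrent grants for the same principal keep only the bits that both
// sides granted, so a revocation on either side wins.
pub fn merge_concurrent(a: u8, b: u8) -> u8 {
    a & b
}
```

Bitwise AND is commutative, associative, and idempotent, so replicas converge on the same mask whatever order the grants arrive in, and the result can only be as permissive as the most restrictive grant.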

Key distinctions

UserStorage vs AuthoredMap

Both stamp entries with StorageType::User { owner }. The difference is keying:

UserStorage<T>: key is the public key, slots are disjoint per user.
AuthoredMap<K,V>: key is application-defined, owner is recorded in the entry's metadata. Two users can compete to insert the same key; first writer wins, subsequent updates are owner-only.

SharedStorage vs AuthoredMap

Both allow mutation after creation. The difference is the granularity of the writer set:

SharedStorage<T>: one collection-level writer set governs one logical T. Several named members can co-author the same value.
AuthoredMap<K,V>: per-entry writer set (currently size-1 = single author). The writer varies per key.

Why composition doesn't replace these primitives

AuthoredMap is not UnorderedMap<K, SharedStorage<V>>

You can nest SharedStorage inside an UnorderedMap for per-key multi-writer values, but the outer map is still public — outer keys can be inserted, overwritten, or removed by anyone. AuthoredMap puts the ownership stamp on the entry itself within a shared keyspace, so K cannot be replaced by non-owners.

SharedStorage is not UnorderedMap<K, UserStorage<V>>

You can nest UserStorage inside an UnorderedMap for per-key single-author values, but again the outer map is public (key-level tampering remains possible). And there is no way to model "several named people co-own this value" without a writer set — exactly what SharedStorage provides.

Merge Semantics

All collections implement the Mergeable trait. When state deltas arrive from peers, the storage layer calls merge() on each affected entry. The merge is:

Commutative — merge(A, B) = merge(B, A)
Associative — merge(merge(A, B), C) = merge(A, merge(B, C))
Idempotent — merge(A, A) = A

These properties guarantee eventual consistency regardless of message ordering or duplication.
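The three laws can be spot-checked on a concrete merge function. The example below uses a per-key maximum over a small map, which is one simple merge that satisfies all three. It is a stand-in chosen for brevity, not one of the crate's collections, and State and merge are hypothetical names.

```rust
use std::collections::BTreeMap;

// Illustrative replicated state: one monotonically growing number per key.
type State = BTreeMap<u8, u64>;

// Merges two states by taking the larger value for every key.
pub fn merge(a: &State, b: &State) -> State {
    let mut out = a.clone();
    for (&key, &value) in b {
        let slot = out.entry(key).or_insert(0);
        *slot = (*slot).max(value);
    }
    out
}
```

For sample states a, b, and c, the laws read as `merge(&a, &b) == merge(&b, &a)`, `merge(&merge(&a, &b), &c) == merge(&a, &merge(&b, &c))`, and `merge(&a, &a) == a`. A handful of samples is not a proof, but checks of this shape are a cheap way to catch a merge function that breaks convergence.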