Writing App State Migrations
Reshape a context's existing data when an app's state shape changes — convergently, lazily, and recoverably
When you ship a new version of a Calimero app whose state shape changes, you write a migration that reshapes each context's existing data into the new layout. This guide covers when you need one, how to write it (the easy way and the hand-written way), the one rule that actually matters (convergence), how the SDK keeps you from getting it wrong, and how to test and ship it.
For the internals of calimero-sdk, see crates/sdk/AGENTS.md.
1. Do you even need a migration?
A migration is needed only when old serialized state can no longer be read as the new state. Borsh is positional, so:
| Change | Migration? |
|---|---|
| Add a method, fix logic (no field change) | No |
| Append a variant to an enum (kept indices) | No |
| Add a field | Yes — old bytes have no value for it |
| Remove / rename a field | Yes |
| Change a field's type | Yes |
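To see why the "Add a field" row needs a migration, here is a minimal std-only sketch of positional decoding. It does not use the real borsh crate: `put_str`, `take_str`, `encode_v1`, and `decode_v2` are hypothetical helpers that only mimic borsh's string layout (a little-endian u32 length followed by the UTF-8 bytes).

```rust
/// Append a string the way borsh lays it out: u32 little-endian length, then bytes.
fn put_str(buf: &mut Vec<u8>, s: &str) {
    buf.extend_from_slice(&(s.len() as u32).to_le_bytes());
    buf.extend_from_slice(s.as_bytes());
}

/// Read one length-prefixed string from the front of `input`, advancing it.
fn take_str(input: &mut &[u8]) -> Option<String> {
    if input.len() < 4 {
        return None; // not even a length prefix left
    }
    let (len_bytes, rest) = input.split_at(4);
    let len = u32::from_le_bytes([len_bytes[0], len_bytes[1], len_bytes[2], len_bytes[3]]) as usize;
    if rest.len() < len {
        return None; // declared length runs past the end of the buffer
    }
    let (body, rest) = rest.split_at(len);
    *input = rest;
    String::from_utf8(body.to_vec()).ok()
}

/// Toy v1 layout: a single `title` field.
fn encode_v1(title: &str) -> Vec<u8> {
    let mut buf = Vec::new();
    put_str(&mut buf, title);
    buf
}

/// Toy v2 layout: `title` followed by the newly added `notes` field.
fn decode_v2(mut bytes: &[u8]) -> Option<(String, String)> {
    let title = take_str(&mut bytes)?;
    let notes = take_str(&mut bytes)?; // v1 bytes end before this field
    Some((title, notes))
}
```

Feeding `encode_v1(..)` output to `decode_v2` returns `None`, because the decoder reads fields strictly in order and the old bytes carry nothing for `notes`. A migration is what supplies that missing value.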
If you're unsure, the calimero-abi diff CI lint compares the old and new
state-schema.json and tells you whether the change is additive, breaking, or an unsafe
identity downgrade (see §6).
A migration runs once per node, the first time each context is accessed after the
upgrade, under the LazyOnAccess upgrade policy.
2. Quick start — #[derive(Migrate)]
For the common cases — add / remove / rename a field — you don't write the
migration body at all. Declare the new state, derive Migrate, point it at the old
layout, and annotate only what changed:
use calimero_sdk::app;
use calimero_sdk::borsh::BorshDeserialize;
use calimero_storage::collections::{LwwRegister, UnorderedMap};
#[app::state(version = 2)]
#[derive(app::Migrate)]
#[migrate(from = DocV1Data)]
pub struct DocV2 {
entries: UnorderedMap<String, LwwRegister<String>>, // carried automatically
title: LwwRegister<String>, // carried automatically
#[migrate(new = LwwRegister::new("".to_owned()))]
notes: LwwRegister<String>, // added → you give the seed
}
// The old layout, as a borsh-only shadow. Field order MUST match v1's
// `#[app::state]` struct (borsh is positional). Don't import the v1 crate's
// `#[app::state]` — it would pull in v1's full SDK surface.
#[derive(BorshDeserialize)]
#[borsh(crate = "calimero_sdk::borsh")]
struct DocV1Data {
entries: UnorderedMap<String, LwwRegister<String>>,
title: LwwRegister<String>,
}
The derive generates migrate_v1_to_v2() for you. The rules:
| Annotation | Where | Result |
|---|---|---|
| (none) | field | carried from the old state by name (old.field) |
| #[migrate(new = EXPR)] | field | additive — you provide the seed value |
| #[migrate(from = old_name)] | field | renamed — carry old.old_name |
| #[migrate(with = EXPR)] | field | transform — EXPR(old.field) (combine with from to convert a renamed field). Handles type changes, struct→enum, etc. |
| field omitted from the new struct | — | dropped (the remove case) |
| #[migrate(emit = EXPR)] | struct | emit an app event from the migration (e.g. Migrated { from, to }) |
with and emit cover the two most common reasons to drop to a hand-written
body — a type change and an event — so much of what used to need
#[app::migrate] is now a one-line annotation. Example:
#[app::state(emits = for<'a> MigrateEvent<'a>)]
#[derive(app::Migrate)]
#[migrate(from = V1, method = migrate_v1_to_v2,
emit = MigrateEvent::Migrated { from: "1.0.0", to: "2.0.0" })]
pub struct V2 {
items: UnorderedMap<String, LwwRegister<String>>, // carried
#[migrate(from = count, with = u64_reg_to_string)]
count: LwwRegister<String>, // u64 -> String via `with`
}
fn u64_reg_to_string(c: LwwRegister<u64>) -> LwwRegister<String> {
LwwRegister::new(c.get().to_string())
}
A new field you forget to annotate is a compile error ("no field notes
on the old type") — it can't silently misbuild. A dropped field, by contrast, is silent
(that's the remove case), so review your new field list against the old one deliberately.
method defaults to the versioned name migrate_v{N-1}_to_v{N} derived from
#[app::state(version = N)] (plain migrate only for unversioned states), so
omitting it is the norm — names stay unique across releases and across multiple derives in one
module. Explicit method = … remains supported and resolves identically.
3. Writing a migration by hand
When you need real transformation — a type change, splitting a field, seeding from
other data — write the #[app::migrate] function yourself. The derive isn't magic; it
just generates this shape:
use calimero_sdk::state::read_raw;
#[app::migrate]
pub fn migrate_v1_to_v2() -> DocV2 {
// 1. Read the old root bytes (None only if no prior state exists).
let old_bytes = read_raw().unwrap_or_else(|| panic!("no prior state"));
// 2. Deserialize into the old-layout shadow.
let old: DocV1Data = BorshDeserialize::deserialize(&mut &old_bytes[..])
.unwrap_or_else(|e| panic!("v1 deserialize: {e:?}"));
// 3. Build and return the new state.
DocV2 {
entries: old.entries, // carry a collection — handle survives
title: old.title,
notes: LwwRegister::new("".to_owned()), // seed a new field
}
}
Notes:
- A migration returns the new state by value; there is no Result. On unrecoverable input (no prior state, undeserializable bytes) you panic!, which aborts the upgrade and leaves v1 state intact for a retry: a failed migration is non-destructive.
- Carrying a collection (entries: old.entries) reuses its existing storage handle; no re-population needed.
When you must hand-write it
The derive (with with / emit) handles a single-field transform, a type
change, a struct→enum, and emitting an event. You still need a hand-written
#[app::migrate] when one source feeds many fields or a new
field is derived from a field you're also keeping — i.e. the transform crosses fields:
| You need to… | Why the derive can't | Example scenario |
|---|---|---|
| Split one field into several | a with yields one field, not three | scenario-field-split |
| Re-derive a value from a field you also carry | that source would be moved twice | scenario-invariant-reshuffle |
| Seed an ordered collection from another you carry | same double-use, plus ordering logic | scenario-crdt-native |
| Anything genuinely multi-step / imperative | — | — |
(A single-field type change, struct→enum, content transform, or archiving a dropped
field is now a #[migrate(with = …)] one-liner — see §2.)
These are all in apps/migrations/ as scenario-* crate pairs with a matching
merobox workflow — use them as a cookbook.
Common hand-written patterns
Split a field — parse one field into several, with an explicit fallback:
let parts: Vec<&str> = old.address.get().split(", ").collect();
let (street, city, zip) = match parts.as_slice() {
[s, c, z] => (s.to_string(), c.to_string(), z.to_string()),
_ => (old.address.get().clone(), String::new(), String::new()), // handle malformed input
};
Seed an ordered structure (determinism!) — building a Vector from a
map you also carry; sort first, because two nodes may iterate an unordered source in
different orders:
let mut keys: Vec<String> = old.items.entries()?.map(|(k, _)| k).collect();
keys.sort(); // ← required, or the roots diverge
let mut tags = Vector::new();
for k in keys { tags.push(k.into())?; }
DocV2 { items: old.items, tags, .. } // `items` carried + re-read above
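The sort-first rule can be checked in isolation. The sketch below uses a plain `std::collections::HashMap` as a stand-in for the SDK's unordered map (an assumption for illustration only), and `ordered_keys` is a hypothetical helper, not an SDK function.

```rust
use std::collections::HashMap;

/// Collect the keys of an unordered map in a node-independent order.
fn ordered_keys(items: &HashMap<String, u64>) -> Vec<String> {
    let mut keys: Vec<String> = items.keys().cloned().collect();
    keys.sort(); // without this, the order depends on the map's internal layout
    keys
}
```

Two maps holding the same entries yield the same key list regardless of how each map happens to iterate, which is the property the ordered structure needs.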
Drop entries — remove from the carried collection; don't rebuild a fresh one.
A new same-named UnorderedMap::new() is re-keyed to that field's deterministic id during
migrate, so it shares the carried v1 storage and unions with it — the entries you
"skipped" survive, so nothing is dropped. To actually drop, mutate the carried collection (sort first
for determinism):
let mut items = old.items; // carry — same storage id
let mut keys: Vec<String> = items.entries()?.map(|(k, _)| k).collect();
keys.sort();
if let Some(k) = keys.first() { items.remove(k)?; } // actually deletes one entry
DocV2 { items, .. }
Single-field transforms belong in #[migrate(with = …)] (§2); these hand-written patterns are for the cross-field cases the derive can't express. All of them run under the convergence rules in §4 and the category rules in §5, the same as a derived migration. The merge-mode guards (Counter/RGA) still apply.
4. The convergence rule (the important part)
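#[app::migrate] runs independently on every node, against that node's
own (already-synced, byte-identical) v1 state. The migrated state is not sent over
sync; each node re-derives it locally. Therefore the migration must be a deterministic pure function of the v1 state, so that every node arrives at the same v2 state.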
What the SDK handles for you
A migration body runs under storage merge mode, and the SDK removes the two structural sources of per-node entropy automatically:
- Node-local timestamps. LwwRegister::new(...) / .set(...) and Element update times are zeroed during a migration (instead of stamping this node's clock + id).
- Random collection ids. New collections and Vector/AuthoredVector elements are re-keyed deterministically (by field name / append index) instead of the Id::random() the live path uses.
So a migration that only carries fields and adds new ones with ::new()
cannot trigger a determinism bug.
What you must still avoid (app-level)
- Wall-clock / RNG / iteration order. If you materialize an ordered structure (a Vector) from an unordered one (a map/set), sort first: two nodes may iterate the source in different orders.
- See the data categories below for the CRDT-specific rules.
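As an illustration of what "convergent" means here, the following toy model runs one migration as two node identities and compares a hash of the result. Everything in it is hypothetical (`ToyV2`, `toy_root`, `converges`, and both migrations); the real SDK compares merkle roots, which this std-only sketch only approximates with a fixed-key hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy v2 state: a carried title plus a field the migration has to seed.
#[derive(Hash)]
struct ToyV2 {
    title: String,
    seeded_by: [u8; 32],
}

/// Deterministic: the seed is a constant, so every node builds the same state.
fn migrate_good(title: &str, _node_id: [u8; 32]) -> ToyV2 {
    ToyV2 { title: title.to_owned(), seeded_by: [0u8; 32] }
}

/// Non-deterministic: the seed bakes in the identity of the node that ran it.
fn migrate_bad(title: &str, node_id: [u8; 32]) -> ToyV2 {
    ToyV2 { title: title.to_owned(), seeded_by: node_id }
}

/// Stand-in for a root hash: a fixed-key hash over the whole state.
fn toy_root(state: &ToyV2) -> u64 {
    let mut h = DefaultHasher::new();
    state.hash(&mut h);
    h.finish()
}

/// Run one migration as two node identities and report whether the roots match.
fn converges(migrate: fn(&str, [u8; 32]) -> ToyV2) -> bool {
    let a = migrate("my-doc", [1u8; 32]);
    let b = migrate("my-doc", [2u8; 32]);
    toy_root(&a) == toy_root(&b)
}
```

`converges(migrate_good)` holds while `converges(migrate_bad)` does not: any per-node input that reaches the migrated state changes the root on that node.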
5. The three data categories
Which migration moves are safe depends on the field type:
| Category | Types | In a migration |
|---|---|---|
| Convergent | UnorderedMap, Vector, UnorderedSet, UserStorage, FrozenStorage | key/content-addressed — rebuild freely; auto-converges |
| Replayable | Counter/GCounter/PNCounter, RGA | carry across, or replay deterministically — see below |
| Identity-gated | AuthoredMap, AuthoredVector, SharedStorage | carry-through only — re-inserting stamps this node as owner and diverges |
Replayable: Counter and RGA are guarded
Counter::increment / decrement stamp the running node's id;
RGA::insert / insert_str stamp a node-local clock. Calling them in a
migration would silently fork the network — so the SDK makes them panic during a
migration:
Counter::increment() is non-deterministic during a state migration: it stamps this node's identity… Carry the counter across unchanged (`field: old.field`) or replay with `increment_for(executor_id, …)`.
To rebuild one in a migration, carry it (c: old.c) or use the deterministic replay APIs
— increment_for(id, …), decrement_for(id, …),
insert_str_at_timestamp(pos, fixed_ts, s) — which take the identity/clock explicitly so
every node produces the same result.
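A toy model may help show why the identity has to be an explicit argument. `ToyCounter` below is a hypothetical per-identity tally, not the SDK's Counter; its `increment` takes the running node as a parameter only to simulate what the live path reads from the environment.

```rust
use std::collections::BTreeMap;

type NodeId = [u8; 32];

/// Toy per-identity counter: one tally per contributing identity.
#[derive(Default, PartialEq, Debug)]
struct ToyCounter {
    per_node: BTreeMap<NodeId, u64>,
}

impl ToyCounter {
    /// Live-path style: credits whichever node happens to run the code.
    fn increment(&mut self, running_node: NodeId, by: u64) {
        *self.per_node.entry(running_node).or_insert(0) += by;
    }

    /// Replay style: the credited identity comes from the data, not the node.
    fn increment_for(&mut self, id: NodeId, by: u64) {
        *self.per_node.entry(id).or_insert(0) += by;
    }

    /// The user-visible total.
    fn value(&self) -> u64 {
        self.per_node.values().sum()
    }
}

/// Rebuild a counter from recorded (identity, amount) pairs.
fn replay(history: &[(NodeId, u64)]) -> ToyCounter {
    let mut c = ToyCounter::default();
    for &(id, by) in history {
        c.increment_for(id, by);
    }
    c
}
```

Two nodes that each call `increment` with their own identity end up with the same visible `value()` but different internal state, which is the silent fork the guard prevents. `replay` depends only on its input, so every node builds an identical counter.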
Identity-gated: carry, don't re-insert
AuthoredMap/AuthoredVector/SharedStorage record ownership as
the running node's executor_id. Re-inserting their entries during a migration would
stamp each node as the owner and diverge. Carry the whole collection through
(entries: old.entries); the v1 owner stamps are preserved. Inside the
#[app::migrate] body, carry-through is still the only rule — converting an entry to the
new schema happens after the migration, owner-by-owner (below).
Identity-gated: converting entries to the new schema (owner-driven)
Carry-through preserves the v1 entries, but each keeps its v1 schema_version (a
Merkle-invisible tag) until its owner re-signs it — nobody can re-sign another
identity's entry. Until then the entry is served at its v1 shape via dual-read. Two paths re-stamp an
entry to the binary's target:
- Organically: the owner's (or a current writer's) next ordinary signed write of that entry stamps schema_version = target and re-signs on its monotonic nonce, replicating as a normal Action::Update. A non-owner can never drive it.
- One tap: #[app::state] auto-generates a migrate_my_entries() method (a wasm export) for any state with an AuthoredMap/AuthoredVector field. One signed call sweeps every entry the caller owns that is still below target, converting each through the path above; it returns {converted, remaining} and is idempotent. Call it from the app/frontend ("migrate my data") after an upgrade; remaining == 0 means the caller's data is fully converted.
For either path to fire, the new binary must declare its schema target with
#[app::state(version = N)] — the value the convert compares each entry's tag against. It
defaults to 0 (inert), so a v2 binary that omits it never converts its identity-gated
data. WriterSetCell/SharedStorage (group writer-set data) converts only via
the organic writer-write path, never the one-tap batch (it is group data, not single-owner "my data").
You do not write migrate_my_entries — #[app::state(version = N)]
generates it. All you do is declare the version; the method appears on your app and is
exported for RPC:
// v2 binary. `version = 2` both sets the convert target AND generates
// `migrate_my_entries()` because the state has an AuthoredMap field.
#[app::state(version = 2, emits = for<'a> Event<'a>)]
#[derive(app::Migrate)]
#[migrate(from = NotesV1, method = migrate_v1_to_v2)]
pub struct NotesV2 {
notes: AuthoredMap<String, LwwRegister<String>>, // carried by the migrate
#[migrate(new = LwwRegister::new(String::new()))]
migration_note: LwwRegister<String>,
}
// No migrate_my_entries body anywhere — the macro emits it.
Then the owner triggers it like any other method — one signed call, no args, returning
{converted, remaining}:
# after the upgrade, from the owner's node (frontend "migrate my data" button):
app_call(context_id, "migrate_my_entries", {})
→ { "converted": 2, "remaining": 0 } # this owner's 2 stale notes converted
Loop until remaining == 0 if you want to drain everything in one sitting (a single call
already converts all of the caller's currently-stale entries; a second call returns
{converted: 0, remaining: 0}). It only ever touches entries the caller owns, so each user
converts their own data independently.
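The sweep semantics described above (owner-scoped, idempotent, returning a converted/remaining pair) can be modelled in a few lines. This is a sketch of the behaviour, not the generated method: `ToyEntry`, `SweepResult`, and this free-function `migrate_my_entries` are hypothetical, and setting the tag stands in for the real re-sign.

```rust
type Owner = [u8; 32];

/// Toy identity-gated entry: an owner stamp plus a schema tag.
struct ToyEntry {
    owner: Owner,
    schema_version: u32,
}

#[derive(Debug, PartialEq)]
struct SweepResult {
    converted: usize,
    remaining: usize,
}

/// Convert every entry `caller` owns that is still below `target`.
/// Entries owned by anyone else are never touched.
fn migrate_my_entries(entries: &mut [ToyEntry], caller: Owner, target: u32) -> SweepResult {
    let mut converted = 0;
    for e in entries.iter_mut() {
        if e.owner == caller && e.schema_version < target {
            e.schema_version = target; // stands in for "re-sign at the new schema"
            converted += 1;
        }
    }
    // Count what the caller still has below target after the sweep.
    let remaining = entries
        .iter()
        .filter(|e| e.owner == caller && e.schema_version < target)
        .count();
    SweepResult { converted, remaining }
}
```

A second call by the same owner converts nothing and reports nothing remaining, and another owner's entries keep their old tag until that owner runs the sweep.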
Migration UX surfaces
Two node-level events help frontends track migration progress without polling:
AppVersionChanged event — fired once, over the context's SSE/WebSocket
event stream, when the context's application version flips (i.e. a migrate/upgrade was applied).
contextId rides on the outer ContextEvent envelope; the payload carries the
fromVersion / toVersion semver strings (either may be null if the
corresponding ApplicationMeta row was unavailable at emit time):
{
"type": "AppVersionChanged",
"data": {
"fromVersion": "1.0.0",
"toVersion": "2.0.0"
}
}
Subscribe to context events via the node's SSE endpoint and react to this event to prompt owners to
run migrate_my_entries() — this avoids bundle-skew where the frontend loads v2 assets but
the context is still running v1.
authored_remaining in the migration status endpoint — tracks how many
members in the cohort still have unconverted identity-gated entries:
GET /admin-api/groups/{namespace_id}/migration-status
→ { ..., "authored_remaining": 3 } # 3 cohort members still have stale entries
authored_remaining == 0 means all members have run migrate_my_entries() or
had no entries to convert. Use this to drive a "migration complete" indicator in admin UIs.
The migration status endpoint is an /admin-api/ route gated by the same
admin-credential check as the rest of that surface, so the node rejects unauthenticated callers before
any migration state is returned. The authored_remaining count (and the group/namespace
structure it implies) is internal operational data: never expose it on an unauthenticated path, and
scope the admin token to operators of that node/namespace.
6. The no-silent-downgrade rail
Changing an identity-gated type to a plain one — AuthoredMap → UnorderedMap,
SharedStorage → UnorderedMap, AuthoredVector → Vector, or dropping the field
— strips per-entry authorship / the writer ACL across the whole network. This is
refused:
- in CI, by calimero-abi diff (an UNSAFE_IDENTITY_DOWNGRADE finding), and
- at the node, by the upgrade gate, before the upgrade op is even emitted.
If you hit this, the fix is not to strip the type — it's an owner-driven rewrite (each owner re-migrates their own signed entries). Stripping authorship is almost never what you want.
7. Guarding a migration: migration_check + abort
A migration that compiles and runs can still be wrong — drop entries, break an invariant,
orphan a reference. To catch that before it commits, declare an optional
#[app::migration_check]. The migrate runs against an in-memory staging
buffer; the check runs against that same buffer before anything is written to the
live store. If it returns false, the runtime logically aborts —
the staging buffer is dropped, so the context stays on v1 with zero residue (root
and every child entry intact; no byte snapshot/restore needed). An app with no check commits
as before (backwards-compatible).
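The stage, check, then commit-or-drop flow can be sketched with a plain map standing in for the store. All names here (`Store`, `run_guarded`, and the sample migrate/check functions) are hypothetical, and cloning the whole store is a simplification of the runtime's staging buffer.

```rust
use std::collections::BTreeMap;

type Store = BTreeMap<String, String>;

/// Run `migrate` against a staging copy, then `check` it; commit only on success.
/// Returns true if the migration was committed.
fn run_guarded(live: &mut Store, migrate: fn(&mut Store), check: fn(&Store) -> bool) -> bool {
    let mut staging = live.clone(); // the live store is untouched while migrating
    migrate(&mut staging);
    if check(&staging) {
        *live = staging; // commit the staged state
        true
    } else {
        false // staging is dropped here, so `live` keeps its v1 contents
    }
}

/// A well-behaved migration: adds the new field and keeps everything else.
fn add_notes(s: &mut Store) {
    s.insert("notes".to_owned(), String::new());
}

/// A buggy migration: adds the new field but loses an existing one.
fn drops_title(s: &mut Store) {
    s.remove("title");
    s.insert("notes".to_owned(), String::new());
}

/// The invariant the check enforces over the produced state.
fn title_survived(s: &Store) -> bool {
    s.contains_key("title")
}
```

With `drops_title` the check fails and the live store is left exactly as it was; with `add_notes` the staged state replaces it.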
What the check can read (the contract)
The check receives old (the committed v1 root) and new (the produced v2
root). Staging makes these asymmetric:
- new is fully trustworthy: its scalar/inline fields and its lazy collections (read through the staging buffer) reflect the produced v2 state. Read new.items.len(), walk new's collections, etc.
- old's scalar/inline fields are pristine v1 (decoded from the committed v1 root bytes), a safe baseline.
- old's lazy collections are NOT pristine: old.items and new.items resolve to the same deterministic bucket, so in one check execution they read the same (staged) data. Do not diff old vs new collections; the comparison is always trivially equal.
So write the check as an invariant over new (optionally against an
old scalar baseline), not as an old-vs-new collection diff.
Carrying a v1 baseline: the transient migration witness
When the invariant needs a v1 value the v2 schema doesn't keep (e.g. "every item survived"), the
migrate returns a (State, Witness) tuple. The Witness is a borsh blob
delivered to the check and never persisted — it rides out on the runtime Outcome like
logs/events:
#[derive(BorshSerialize, BorshDeserialize)]
#[borsh(crate = "calimero_sdk::borsh")]
struct MigrationWitness { v1_count: u64 }
#[app::migrate]
fn migrate() -> (DocV2, MigrationWitness) {
let mut items = old.items;
let v1_count = items.len().unwrap_or(0) as u64; // captured BEFORE any change
// ... transform ...
(DocV2 { items, /* .. */ }, MigrationWitness { v1_count })
}
#[app::migration_check]
fn check(_old: DocV1, new: DocV2, witness: MigrationWitness) -> bool {
// `new.items` is the produced collection; compare it to the v1 baseline.
matches!(new.items.len(), Ok(n) if n as u64 == witness.v1_count)
}
A migrate returning a plain State (no tuple) and a 2-arg check(old, new) stay
valid — the witness is opt-in. Prefer invariants that need no extra field where you
can: a required key present (new.items.get("alpha")?.is_some()), conservation against an
existing field (new.total == new.items.values().sum()), or a monotonic version
(new.version > old.version, an old scalar).
Built-in helpers (calimero_sdk::migration_check) operate on slices you build from
soundly-readable data (new collections, scalars, a witness):
- entity_count_parity(a, b, delta): counts match within delta
- no_orphaned_refs(refs, keys): every reference still resolves
- conservation(old_total, new_total): a total is preserved
The check and any witness must be a deterministic pure function of
the v1 state, exactly like the migrate: they run independently on every node against byte-identical
input, so all nodes reach the same verdict — either all commit or all abort. (A
non-deterministic check or witness is a split-verdict bug, the same hazard
assert_migrate_converges guards.) A failed check is retryable: no
migration marker is recorded, so the context re-runs migrate+check on its next access
— a transient cause (e.g. not-yet-synced v1) self-heals once the input is complete.
Comparing old vs new collection cardinality directly (without a
witness/baseline) would need a pristine-snapshot read path for old that is not yet
implemented; this is tracked as a follow-up. Use the witness pattern above.
Aborting an in-flight migration (admin)
An operator can call off a migration that's rolling out:
POST /admin-api/groups/{namespace_id}/migration/abort
It flips the group's pending target back to the pre-migration app id and drops the pending marker, cascading to every descendant subgroup carrying the same pending migration. Idempotent — a subtree with nothing pending is a no-op. It's a forward "stop" (un-migrated contexts stop switching), not a rewind of any context that already migrated.
8. Testing your migration
Fast, in-process — TestHost
Run the migration entirely in memory (cargo test, no Docker), and — most importantly —
assert it converges across nodes:
#[cfg(test)]
mod tests {
use calimero_sdk::testing::{assert_migrate_converges, TestHost};
#[test]
fn migrate_carries_and_seeds() {
let mut app = TestHost::new(DocV1::init);
app.call(|s| s.set_title("my-doc".to_owned())).unwrap();
let v2 = app.migrate(migrate_v1_to_v2); // run it in-process
assert_eq!(v2.view(|s| s.title().unwrap()), "my-doc"); // title carried
assert_eq!(v2.view(|s| s.notes().unwrap()), ""); // notes seeded
}
#[test]
fn migration_converges_across_nodes() {
// Runs the migration as two different node identities from an identical
// v1 and asserts the two merkle roots match — a non-deterministic
// migration fails here in milliseconds instead of forking production.
assert_migrate_converges::<DocV1, DocV2>(
install_v1, migrate_v1_to_v2, [1u8; 32], [2u8; 32],
);
}
}
assert_migrate_converges compares the merkle root hash, which folds in
every child-collection entry — so a per-node value baked anywhere in the migrated state (a field,
or a value inside a carried collection) is caught.
Full end-to-end — merobox
For real cross-node behaviour (and iteration-order determinism), run a merobox workflow. See
workflows/app-migration/README.md ("Running locally": merobox bootstrap run …)
and the worked scenarios in apps/migrations/.
#[app::migration_check] has no in-process TestHost entry point — it's exercised
end-to-end by merobox: workflows/app-migration/29-migration-check-pass.yml (the check
passes → the migration commits) and 30-migration-check-fail-abort.yml (the check fails →
zero-residue logical abort, v1 still served). Model new check scenarios on those.
merobox step types for migrations
A merobox workflow is a YAML list of steps run against a throwaway multi-node cluster
(merobox bootstrap run workflows/app-migration/<name>.yml). The migration-relevant step
types — every one of these is used by a real scenario-* workflow you can copy:
| Step | Does | Key fields |
|---|---|---|
| install_application | install a bundle/wasm on a node | node, path, dev, outputs: {app_v1: applicationId} |
| update_group_settings | set the upgrade policy — migrations need lazy_on_access | node, group_id, upgrade_policy: lazy_on_access |
| upgrade_group | trigger the upgrade; the node resolves the migrate from the target's ABI | node, group_id, target_application_id, cascade |
| call | invoke a method — a read counts, so it triggers the lazy migrate | node, context_id, method, args |
| get_migration_status / assert_migration_complete | read / poll the cohort rollup — admin node only | node, namespace_id [, timeout_seconds, poll_interval] |
| get_cascade_status / assert_cascade_complete | per-subgroup cascade rollup | node, namespace_id [, timeout_seconds] |
| resync_context | recover a stranded member (destructive) | node, context_id, force: true |
| abort_migration | admin stop of a rolling migration | node, namespace_id |
| assert_log_present / assert_log_absent | assert a node log line did / didn't fire | nodes: [...], patterns: [...] |
The canonical shape — install v1, write data, set LazyOnAccess, install v2, upgrade, poke each peer to migrate, assert:
steps:
- { type: install_application, node: n1, dev: true,
path: apps/migrations/migration-suite-v1/res/migration-suite-1.0.0.mpk,
outputs: { app_v1: applicationId } }
# ... create_namespace / create_context / node-2 joins / write v1 data via `call` ...
- { type: update_group_settings, node: n1, group_id: "{{group_id}}",
upgrade_policy: lazy_on_access }
- { type: install_application, node: n1, dev: true,
path: apps/migrations/migration-suite-v2/res/migration-suite-2.0.0.mpk,
outputs: { app_v2: applicationId } }
- { type: upgrade_group, node: n1, group_id: "{{namespace_id}}",
target_application_id: "{{app_v2}}", cascade: true }
- { type: call, node: n2, context_id: "{{ctx}}", method: get_something } # read ⇒ lazy migrate
- { type: assert_log_present, nodes: [n2], patterns: ["Migration completed successfully"] }
- { type: assert_migration_complete, node: n1, namespace_id: "{{namespace_id}}" }
See workflows/app-migration/README.md for running locally, and any
scenario-*.yml for one complete, copy-pastable workflow per migration shape.
Worked examples
- apps/migration-harness-example/: #[derive(Migrate)] + TestHost unit tests (carry, seed, rename, convergence, divergence-detection).
- apps/migrations/scenario-* + workflows/app-migration/*.yml: one crate pair and one merobox workflow per migration shape (additive, remove, rename, type-change, CRDT-native, authored-map, …).
9. Shipping a migration
Migrations run only under UpgradePolicy::LazyOnAccess. Trigger the upgrade — note there is
no method name to pass; the node resolves the migration from the version + edge your
app's build embeds in its ABI:
upgrade_group(
target_application = <new bundle's application id>,
cascade = true, // optional: fan out across a namespace subtree
)
Per service, the node compares the state version of the bytecode the group currently runs against the target's and decides: equal → code-only swap (nothing runs); one ahead with a declared edge → run that edge's migrate; version bump with NO edge, more than one hop, or an older target → the upgrade is rejected with an actionable error instead of silently shipping new code onto old-layout state. There is no caller-supplied method at all — the declared ABI is the single source of truth.
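The per-service judgment reads naturally as a small decision function. The sketch below is a reading of the paragraph above, not the node's code: `UpgradeDecision`, `Edge`, and `decide` are hypothetical names, and it models only the single-step judgment (not any multi-rung planning).

```rust
#[derive(Debug, PartialEq)]
enum UpgradeDecision {
    /// Same state version: swap code, run nothing.
    CodeOnly,
    /// One version ahead with a declared edge: run this migrate method.
    RunMigrate(String),
    /// Anything else is refused with a reason.
    Reject(&'static str),
}

/// A declared migration edge: `method` migrates state from `from_version`.
struct Edge {
    method: String,
    from_version: u32,
}

fn decide(current: u32, target: u32, target_edges: &[Edge]) -> UpgradeDecision {
    if target == current {
        return UpgradeDecision::CodeOnly;
    }
    if target < current {
        return UpgradeDecision::Reject("target is older than the running state version");
    }
    if target - current > 1 {
        return UpgradeDecision::Reject("more than one version hop");
    }
    // Exactly one version ahead: an edge starting at the current version must exist.
    match target_edges.iter().find(|e| e.from_version == current) {
        Some(edge) => UpgradeDecision::RunMigrate(edge.method.clone()),
        None => UpgradeDecision::Reject("version bump with no declared edge"),
    }
}
```

Because the method name comes from the declared edge, the caller of `decide` never has to supply one.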
Each node self-migrates on its next context access (logged as Executing migration →
Migrated state written successfully; the extra performing lazy upgrade before
execution line precedes them only on the non-cascade lazy-on-read path — under
cascade: true the cascade propagator drives the migrate instead). Picking
Automatic/Coordinated for a migration is rejected at emit time — only
LazyOnAccess runs the migrate function.
If the app has identity-gated state (AuthoredMap/AuthoredVector),
declare the new binary's schema target with #[app::state(version = N)] so the post-migrate
owner-driven convert (§5) has a target to compare against.
9.1 The policy prerequisite: create groups as LazyOnAccess
Migrations run only under UpgradePolicy::LazyOnAccess; an
upgrade_group whose target declares a migration is rejected at emit time
under Automatic or Coordinated. The trap: if your app creates its namespace
with upgradePolicy: 'Automatic' (a tempting-sounding default), every migration you ever
ship to that workspace fails at the upgrade call. Either create groups as LazyOnAccess from
day one, or heal before upgrading:
PATCH /admin-api/groups/:group_id { "upgradePolicy": "LazyOnAccess" }
Do not switch a group away from LazyOnAccess while a migration is pending: that flip is rejected, because un-accessed
contexts would otherwise swap bytecode without running the migrate.
9.2 Bundle apps: the id never changes — the blob does
A bundle (.mpk) derives ApplicationId = hash(package, signer) — it is
version-stable. v1 and v2 of the same package share one id; installing v2 overwrites
the blob in place under that id. Consequences:
- Never key "is an update available / applied" off the ApplicationId changing. It won't. The version discriminator is the bytecode blob id, recorded on the group as appKey at upgrade time.
- Download ≠ apply. Installing the new bundle (registry download) flips the local node's installed version immediately, but the workspace group still targets the old bytecode until upgrade_group runs. A UI that only compares "installed version vs registry latest" reads "up to date" in the downloaded-but-never-migrated state and strands the admin. The ground truth for "has this workspace been migrated" is: group.appKey == installed_application.blob.bytecode (note: appKey is hex on the admin API, blob ids are base58; decode before comparing).
- Single-wasm apps content-address the id per version, so the old id-comparison instinct works there; bundles are the exception that ships to real users.
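The "decode before comparing" step might look like the std-only sketch below. It assumes the base58 ids use the Bitcoin alphabet, which this guide does not state; `decode_hex`, `decode_base58`, and `workspace_migrated` are hypothetical helpers, and a real client would more likely use an existing hex/base58 library.

```rust
/// Decode a hex string into bytes; None on odd length or a non-hex digit.
fn decode_hex(s: &str) -> Option<Vec<u8>> {
    if s.len() % 2 != 0 {
        return None;
    }
    (0..s.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(s.get(i..i + 2)?, 16).ok())
        .collect()
}

/// Decode base58 (Bitcoin alphabet, assumed); None on an unknown character.
fn decode_base58(s: &str) -> Option<Vec<u8>> {
    const ALPHABET: &[u8] = b"123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";
    let mut bytes: Vec<u8> = Vec::new(); // big integer, least significant byte first
    for ch in s.bytes() {
        let mut carry = ALPHABET.iter().position(|&a| a == ch)? as u32;
        // Multiply the accumulated number by 58 and add the new digit.
        for b in bytes.iter_mut() {
            carry += (*b as u32) * 58;
            *b = (carry & 0xff) as u8;
            carry >>= 8;
        }
        while carry > 0 {
            bytes.push((carry & 0xff) as u8);
            carry >>= 8;
        }
    }
    // Each leading '1' encodes one leading zero byte.
    let zeros = s.bytes().take_while(|&c| c == b'1').count();
    bytes.extend(std::iter::repeat(0u8).take(zeros));
    bytes.reverse();
    Some(bytes)
}

/// Compare the group's hex appKey with the installed blob's base58 id, byte for byte.
fn workspace_migrated(app_key_hex: &str, blob_id_base58: &str) -> bool {
    match (decode_hex(app_key_hex), decode_base58(blob_id_base58)) {
        (Some(a), Some(b)) => a == b,
        _ => false, // undecodable input is treated as "not migrated"
    }
}
```

Comparing the decoded bytes avoids the trap of string-comparing two encodings of the same id, which would never match.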
9.2b The node resolves the method — callers carry nothing
Your build embeds state_version and the declared migration edges
(migrations: [{method, fromVersion}]) in each service's ABI, and the node reads them at
upgrade time. Updater UIs/CLIs never carry a method name; the typo/omission class of failure is
structurally gone. The export name is generated (migrate_v{N-1}_to_v{N}) unless you pass
method = … explicitly in the derive — explicit names remain supported and resolve
identically.
9.2c Multi-service bundles: per-service version arithmetic
A multi-service bundle (e.g. a registry service + a data service) upgrades as one unit, but the node judges each service by its own declared state version:
- a service whose version is unchanged is code-only by arithmetic — no wasm probing, no fake failure, nothing to declare;
- a service one version ahead with a declared edge runs its own migrate;
- a service that bumps its version without declaring an edge rejects the whole upgrade at emit time (mis-built bundle).
So each service simply declares its own truth and the release composes: "only service A changed", "only
B", and "both" all work with zero coordination. One current restriction: if two services declare
different explicit method names for the same release, the upgrade is rejected (the wire carries
one method for pre-v2 receivers) — omit method = … or use the same name; full per-service
actuation lifts this later.
9.3 Reaching every context: cascade: true
upgrade_group without cascade upgrades only the target group.
If your app puts data in subgroup contexts (per-folder/per-channel child contexts), they stay on the
old schema forever and the UI on top of them keeps reading v1. Pass cascade: true to fan
the upgrade out across the namespace subtree as one atomic op.
9.4 What each member's node does (and when)
After the admin's upgrade_group, peers converge lazily:
- the upgrade replicates via the group op stream; a peer's sync gate detects the pending upgrade (id change, or same-id appKey divergence for bundles) and pre-stages the new blob over BlobShare;
- the node migrates on its next access to each context. Reads count, so merely opening the app advances it; an idle node stays on v1 until then (write your UI copy accordingly: "updates next time you use it", not "updates automatically");
- while a migration is in flight, writes are refused with upgrade in progress … writes refused until migration completes; handle this error in the frontend as a status ("workspace updating"), not a raw failure;
- on completion the node emits the AppVersionChanged SSE event; subscribe to flip version displays live (mero.events.onAppVersionChanged).
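Treating the write refusal as a status rather than a failure could be as small as the classifier below. It is only a sketch: `WriteOutcome` and `classify_write_error` are hypothetical, and matching on the message substring is an assumption, since this guide does not say whether the error also carries a structured code.

```rust
#[derive(Debug, PartialEq)]
enum WriteOutcome {
    /// The workspace is mid-migration: show a status, let the user retry later.
    Updating,
    /// Any other error is surfaced as a real failure.
    Failed(String),
}

/// Map a write error message to what the UI should show.
fn classify_write_error(message: &str) -> WriteOutcome {
    if message.contains("upgrade in progress") {
        WriteOutcome::Updating
    } else {
        WriteOutcome::Failed(message.to_owned())
    }
}
```

A client would call this on the error path of every write and render `Updating` as a "workspace updating" banner.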
9.5 Frontend checklist (the part nobody tests until a real user does)
- Show the installed version to every member, not just admins; pair it with honest LazyOnAccess copy.
- Detect the downloaded-but-not-applied state via the appKey-vs-blob comparison (§9.2) and keep offering the apply step. It must survive a page reload (the bundle id is stable, so the group's targetApplicationId still addresses the downloaded bytecode).
- Map the upgrade-gate write refusal (§9.4.3) to an "updating…" banner.
- Subscribe to AppVersionChanged for the completion signal.
- Browser-test the full admin flow once per release: CLI clients skip CORS, so a missing preflight method (e.g. PATCH for updateGroupSettings) surfaces only in browsers, as an opaque HTTP 0 network error.
9.6 Verifying propagation by hand
# the group's target bytecode (hex appKey); same on every member after sync
meroctl --output-format json --node <n> namespace ls
# what's actually installed under the (stable) application id
meroctl --node <n> app ls   # check the Source/Blob columns' version
# poke a context to trigger the lazy migrate (reads count)
meroctl --node <n> call <any_read_method> --context <ctx_id>
appKey equal everywhere + each node's installed blob matching it + the call returning
post-migration data = the migration has fully landed.
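The three-part check above can be written down as a small predicate, which is handy if you script the verification across several nodes. This is a sketch under assumptions: `NodeObservation` and its field names are hypothetical, and you would fill them from the command output yourself.

```typescript
// Hypothetical record of what was observed on one node; field names are assumptions.
interface NodeObservation {
  node: string;
  groupAppKey: string;          // hex appKey the group points at
  installedBlobKey: string;     // key of the blob installed under the application id
  readReturnedNewShape: boolean; // the read call returned post-migration data
}

function migrationFullyLanded(observations: NodeObservation[]): boolean {
  const [first] = observations;
  if (first === undefined) return false; // nothing observed proves nothing
  return observations.every(
    (o) =>
      o.groupAppKey === first.groupAppKey &&  // appKey equal everywhere
      o.installedBlobKey === o.groupAppKey && // installed blob matches it
      o.readReturnedNewShape,                 // read shows migrated data
  );
}
```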
9.7 Multiple versions on one node
A context executes the bytecode its own group points at — not whatever was downloaded last. Installing a newer bundle never changes what existing workspaces run: each migrates when its own admin upgrades it. Consequences you can rely on:
- two workspaces on one node can run different versions of the same app indefinitely;
- createNamespace accepts an optional appKey to pin a new workspace to any installed version (default: latest);
- the admin API lists every retained version of a package (GET /admin-api/applications/:id/versions) and namespace DTOs carry a per-workspace appVersion; display that, never the shared installed version.
9.8 Catching up several versions behind
A context behind its group by more than one version catches up by replaying the group's recorded upgrade ladder hop by hop — each hop runs in that release's own bytecode, not the latest. The node fetches each rung's blob from peers as needed and runs that version's own migrate. Because every member runs the same frozen bytes for a given hop, the result is identical across the group by construction (no cross-node divergence). A single access on a stale context walks all pending hops in order and lands on the current version; nothing is run eagerly.
An admin can also move a group several versions in one action: the upgrade plans a rung per intermediate state version (a release that doesn't change state adds no rung) from the versions installed on the node, and the group advances rung by rung.
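To make the "one rung per state version" rule concrete, here is a toy model of rung planning. It is an illustration only, not the node's planner: `Release`, `planRungs`, and the choice to take the first release that introduced each state version are assumptions made for the example.

```typescript
// Hypothetical model of an installed release; not a node or SDK type.
interface Release {
  appVersion: string;   // e.g. "2.1"
  stateVersion: number; // bumps only when the state shape changes
}

// Plans one rung per state version in (currentState, targetState].
// Assumes `releases` is listed in release order; among releases sharing a
// state version, the first one is used as the rung (an assumption).
function planRungs(releases: Release[], currentState: number, targetState: number): Release[] {
  const sorted = [...releases].sort((a, b) => a.stateVersion - b.stateVersion);
  const rungs: Release[] = [];
  let last = currentState;
  for (const r of sorted) {
    if (r.stateVersion > last && r.stateVersion <= targetState) {
      rungs.push(r);
      last = r.stateVersion; // later releases with the same state version add no rung
    }
  }
  return rungs;
}
```

With releases 1.0 and 1.1 on state version 1, 2.0 and 2.1 on state version 2, and 3.0 on state version 3, a context at state version 1 moving to 3 gets two rungs in this model: 2.0 and 3.0.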
9.9 Support window and recovering a stranded member
The versions whose blobs remain obtainable (from peers or the registry) define which upgrades chain
directly. A member below that window — an intermediate blob is gone everywhere, or it was offline across
several releases and never fetched one — is detected, not silently wedged: it stays on
its current real version and reports failed with no_migration_path in the
migration rollup. It never runs a later hop's migrate against older state.
Two recoveries, both operator-initiated:
- Stepwise reinstall: install an intermediate version from the registry. Any installed version can be an upgrade target (§9.7), so the gap becomes a single hop and the next access finishes the chain.
- Resync: POST /admin-api/contexts/{id}/resync adopts an up-to-date peer's state wholesale (a full-state snapshot), bypassing replay. This is destructive: any local edits the context hasn't broadcast are discarded, so it refuses with the local-head count unless you pass {"force": true}. After it completes the member is back at the current version. (The 37-stranded-resync merobox workflow exercises exactly this path end to end.)
A self-heal needs no action: if the missing blob simply arrives later, the next access resolves the hop,
migrates, and clears the failed marker.
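The stranded and self-heal behaviour can be modelled in a few lines. The sketch below is a simplified mental model, not the node's implementation: `catchUp` and `FetchMigrate` are hypothetical names, state versions are assumed to be consecutive integers, and stopping at the first missing rung is an assumption about how "stays on its current real version" plays out.

```typescript
type State = Record<string, unknown>;
type Migrate = (state: State) => State;
// Hypothetical lookup: returns that version's own migrate, or undefined if the blob is unobtainable.
type FetchMigrate = (stateVersion: number) => Migrate | undefined;

interface CatchUpResult {
  version: number;
  state: State;
  status: "migrated" | "failed";
  reason?: "no_migration_path";
}

function catchUp(state: State, version: number, target: number, fetchMigrate: FetchMigrate): CatchUpResult {
  let current = version;
  let data = state;
  while (current < target) {
    const migrate = fetchMigrate(current + 1);
    if (migrate === undefined) {
      // Never skip ahead: a later hop's migrate must not see older state.
      return { version: current, state: data, status: "failed", reason: "no_migration_path" };
    }
    data = migrate(data); // each hop runs that version's own migrate
    current += 1;
  }
  return { version: current, state: data, status: "migrated" };
}
```

Calling `catchUp` again once the missing blob is obtainable resumes from the reported version, which mirrors the self-heal on the next access.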
10. Wiring it into your app — mero-js & mero-react
The node exposes the migration surface over its admin + RPC + SSE APIs. The JS SDK
(@calimero-network/mero-js) and React bindings (mero-react) — both on the
migrations-v2 3.0.0 line — wrap those so a frontend can drive and observe an upgrade
without hand-rolling HTTP.
mero-js
| Call | What it does |
|---|---|
| admin.upgradeGroup(groupId, { targetApplicationId, cascade }) | trigger the upgrade (§9); cascade fans out the subtree (§9.3). |
| admin.getMigrationStatus(namespaceId) | cohort rollup — migrated / in-progress / unknown / failed, allMigrated, authoredRemaining (§5). |
| admin.getCascadeStatus(namespaceId) | per-subgroup cascade rollup. |
| admin.listApplicationVersions(applicationId) | every retained version of a package — feeds a version picker (§9.7). |
| admin.resyncContext(contextId, { force }) | recover a stranded member (§9.9) — destructive, needs force. |
| rpc.migrateMyEntries(contextId) | owner one-tap convert of identity-gated data → { converted, remaining } (§5). |
| events.onAppVersionChanged(handler) | SSE subscription; fires { contextId, fromVersion, toVersion } when a context's version flips (§9.4). |
// trigger + observe an upgrade
await mero.admin.upgradeGroup(groupId, { targetApplicationId: v2Id, cascade: true });
const unsub = mero.events.onAppVersionChanged(({ toVersion }) => setVersion(toVersion));
// owner converts their own identity-gated entries after the upgrade
let { remaining } = await mero.rpc.migrateMyEntries(contextId);
// (call again until remaining === 0 to drain everything in one sitting)
// recover a member stuck with no_migration_path
await mero.admin.resyncContext(contextId, { force: true });
getMigrationStatus / AppVersionChanged, never the id.
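The "call again until remaining === 0" advice can be wrapped in a small loop. This is a sketch, assuming only the `{ converted, remaining }` result shape from the table above; `drainMyEntries`, `MigrateMyEntries`, and the `maxCalls` guard are hypothetical additions for the example.

```typescript
interface MigrateResult {
  converted: number;
  remaining: number;
}
// Any function with the same shape as rpc.migrateMyEntries can be passed in.
type MigrateMyEntries = (contextId: string) => Promise<MigrateResult>;

async function drainMyEntries(
  migrate: MigrateMyEntries,
  contextId: string,
  maxCalls = 50, // hypothetical guard so a stuck migration cannot loop forever
): Promise<MigrateResult> {
  let converted = 0;
  let remaining = Number.POSITIVE_INFINITY;
  for (let i = 0; i < maxCalls && remaining > 0; i++) {
    const step = await migrate(contextId);
    converted += step.converted;
    remaining = step.remaining;
    if (step.converted === 0) break; // no progress; stop and report what is left
  }
  return { converted, remaining };
}
```

In an app this might be invoked as `drainMyEntries((id) => mero.rpc.migrateMyEntries(id), contextId)`, showing the returned `remaining` if it is not zero.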
mero-react hooks
| Hook | For |
|---|---|
| useUpgradeGroup() | an admin "apply update" action → upgradeGroup with loading/error. |
| useGroupUpgradeStatus(groupId) | live cohort upgrade rollup (upgradeStatus) for a progress UI. |
| useMigrationStatus(namespaceId) | the full migration-status rollup (per-member states + authoredRemaining). |
| useAppVersion(contextId, expected) | context's installed version + isStale vs the app's build constant; auto-updates on AppVersionChanged ("reload to update"). |
| useMyAuthoredMigration(contextId) | pending + authorize() — the per-user "migrate my data" tap (wraps migrate_my_entries). |
| useResyncContext() | resyncContext (force), for a "recover this member" action. |
| useRetryGroupUpgrade() | re-drive a stalled cohort upgrade. |
| useLatestVersion(applicationId) | registry latest + retained versions for a picker. |
function WorkspaceBanner({ contextId }) {
const { appVersion, isStale } = useAppVersion(contextId, BUILD_VERSION);
const { pending, authorize } = useMyAuthoredMigration(contextId);
if (isStale) return <Banner>Update available — reload to use v{appVersion}</Banner>;
if (pending) return <Button onClick={authorize}>Migrate my data</Button>;
return null; // up to date, nothing of mine left to convert
}
mero-js/src/admin-api/admin-types.ts, mero-react/src/hooks/index.ts) — the
canonical source if a field is renamed in a later release.
11. Quick reference
| Do | Don't |
|---|---|
| Carry fields: field: old.field | Re-insert an AuthoredMap/UserStorage/SharedStorage in a migration |
| Seed new fields with ::new() / #[migrate(new = …)] | Call Counter::increment/decrement or RGA::insert in a migration (they panic) |
| Use increment_for / insert_str_at_timestamp to replay a CRDT deterministically | Use wall-clock, RNG, or unsorted iteration order |
| sort() before building a Vector from a map/set | Change an identity-gated type to a plain one (refused) |
| Prove convergence with assert_migrate_converges | Assume single-node tests prove cross-node determinism |
| panic! on bad input (non-destructive abort) | Expect a Result from the migrate fn |
| Keep old version blobs obtainable so behind members can chain | Drop an intermediate version and expect members below it to auto-upgrade (they strand → reinstall or resync) |