Auto-Follow Group Membership

Event-driven context auto-join for group members, with per-member opt-in

  • Opt-in: per-member flag
  • DAG: event-driven
  • 20/min: default rate limit
  • Replay: offline catch-up

Problem

Group membership in Calimero is explicit at every level. When a new context is registered in a group, existing members do not automatically replicate it — each must call join_context explicitly. When a subgroup is nested under a parent, the parent's members are not automatically admitted to the child either (subgroup membership is independent). This works for human-driven clients but breaks two concrete scenarios:

  • TEE fleet HA. A fleet node admitted via fleet-join joins only the contexts that existed at admission time. Later contexts stay invisible until something else triggers a join; before auto-follow, that trigger was the sidecar's polling loop.
  • Regular members observing a growing group. A member who joined when the group had 3 contexts sees nothing when a 4th appears, unless the client polls or the user reacts manually.

Design

Each GroupMember gains a pair of per-member auto-follow flags:

struct AutoFollowFlags {
    contexts: bool,  // auto-join on new ContextRegistered; default true
    subgroups: bool, // auto-admit into newly nested subgroups; default false
}

Defaults (as of #2422): contexts: true, subgroups: false. A new member added via add_group_member auto-follows new contexts in the group by default; subgroup auto-admit stays opt-in. ReadOnlyTee members get both set to true automatically — that's the TEE-fleet use case. The default lives in a single explicit impl Default for AutoFollowFlags in crates/store/src/key/group/mod.rs, so both the borsh-legacy decode fallback and .unwrap_or_default() sites pick it up.
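
As an illustration of why the default needs an explicit impl, here is a minimal sketch. The field names follow the description above; the effective_flags helper is hypothetical and only stands in for the .unwrap_or_default() call sites. A derived Default would yield contexts: false, which is not the documented behaviour.

```rust
/// Per-member auto-follow flags, mirroring the fields described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct AutoFollowFlags {
    /// Auto-join newly registered contexts.
    pub contexts: bool,
    /// Auto-admit into newly nested subgroups.
    pub subgroups: bool,
}

impl Default for AutoFollowFlags {
    // Written by hand: #[derive(Default)] would set both fields to false.
    fn default() -> Self {
        Self {
            contexts: true,
            subgroups: false,
        }
    }
}

/// Hypothetical helper: a member row whose flags may be missing
/// (for example, a legacy record) falls back to the single default.
pub fn effective_flags(stored: Option<AutoFollowFlags>) -> AutoFollowFlags {
    stored.unwrap_or_default()
}
```

Because every fallback path goes through the same impl, changing the default later only touches one place.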

Flags are toggled by the governance op GroupOp::MemberSetAutoFollow with admin-or-self authorization:

  • A group admin can toggle flags for any member.
  • A member can toggle their own flags.
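
The admin-or-self rule can be sketched as a small predicate. This is an assumption-laden illustration, not the check in apply_group_op_mutations: MemberId, AuthError and authorize_set_auto_follow are hypothetical names, and the admin lookup is reduced to a boolean.

```rust
/// Hypothetical member identity (a 32-byte public key).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct MemberId(pub [u8; 32]);

#[derive(Debug, PartialEq, Eq)]
pub enum AuthError {
    /// The signer is neither a group admin nor the target member.
    NotAdminOrSelf,
}

/// Allows the op when the signer is an admin (any target) or when the
/// signer is changing their own flags.
pub fn authorize_set_auto_follow(
    signer: MemberId,
    target: MemberId,
    signer_is_admin: bool,
) -> Result<(), AuthError> {
    if signer_is_admin || signer == target {
        Ok(())
    } else {
        Err(AuthError::NotAdminOrSelf)
    }
}
```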

Propagation

The handler subscribes to an in-process op-apply event channel (module context::op_events). When a relevant op is applied to local state, an OpEvent is broadcast and the handler emits the corresponding join op. Every emitted op is itself a DAG op, so offline catch-up comes for free via DAG replay — no separate reconcile loop.
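
To make the best-effort broadcast concrete, the sketch below models an in-process event bus with std channels. It is not the op_events module: the OpEvents type, the u64 identifiers and the use of std::sync::mpsc are all assumptions chosen to keep the example dependency-free.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Simplified event type; identifiers are plain integers for illustration.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum OpEvent {
    AutoFollowSet { member: u64, contexts: bool, subgroups: bool },
    ContextRegistered { group: u64, context: u64 },
}

/// Hypothetical in-process bus: one sender per subscriber.
#[derive(Default)]
pub struct OpEvents {
    subscribers: Vec<Sender<OpEvent>>,
}

impl OpEvents {
    /// Registers a new subscriber and returns its receiving end.
    pub fn subscribe(&mut self) -> Receiver<OpEvent> {
        let (tx, rx) = channel();
        self.subscribers.push(tx);
        rx
    }

    /// Sends the event to every live subscriber and prunes dead ones.
    /// Delivery is best-effort: a subscriber that has gone away simply
    /// misses the event. Returns how many subscribers were reached.
    pub fn notify(&mut self, event: OpEvent) -> usize {
        self.subscribers
            .retain(|tx| tx.send(event.clone()).is_ok());
        self.subscribers.len()
    }
}
```

A missed event is acceptable in this design only because the emitted joins are DAG ops and can be reconciled by replay.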

Flow

0. (NEW, #2422) A new member is added to the group via
   GroupOp::MemberAdded / GroupOp::MemberJoinedViaTeeAttestation /
   RootOp::MemberJoined (open-subgroup self-join).
   └─ Apply path writes a fresh GroupMember row with default flags
      (contexts: true, subgroups: false).
   └─ emit_auto_follow_set_if_enabled synthesises
      OpEvent::AutoFollowSet { member, contexts: true, subgroups: false }
      so the on-join backfill cascade fires without requiring an
      explicit MemberSetAutoFollow op. This closes the Ronit/Fran
      regression where a joiner saw no pre-existing contexts until an
      admin manually flipped a flag.

1. Admin or member publishes MemberSetAutoFollow { target, contexts, subgroups }.
   └─ Authorized by admin-or-self check in apply_group_op_mutations.
   └─ Store updates GroupMemberValue.auto_follow.
   └─ op_events::notify(OpEvent::AutoFollowSet { ... }).

2. Auto-follow handler (spawned once from ContextManager::started) observes
   AutoFollowSet { member = self, contexts: true } and backfills: enumerate
   up to BACKFILL_LIMIT contexts in the group, emit JoinContext for each.
   Backfill is rate-limited to DEFAULT_BURST / DEFAULT_PER. Idempotent on
   already-joined contexts, so steps 0 + 1 firing for the same member is
   safe (e.g. TEE fleet-join: synthesised from MemberJoinedViaTeeAttestation
   then explicit MemberSetAutoFollow from fleet_join.rs).

3. Later, anyone registers a new context in the group.
   └─ GroupOp::ContextRegistered applied.
   └─ op_events::notify(OpEvent::ContextRegistered { group, context }).
   └─ Handler checks: is self a member with auto_follow.contexts = true?
   └─ If yes, emit JoinContext — same rate limit.

4. Later, an admin creates a subgroup under this group.
   └─ RootOp::GroupCreated { group_id, parent_id } applied on namespace DAG
       (atomic create+nest — strict-tree invariant).
   └─ op_events::notify(OpEvent::SubgroupCreated { parent, child }).
   └─ Handler reacts (subgroup variant — separate follow-up PR).

4b. Admin moves an existing subgroup to a new parent.
    └─ RootOp::GroupReparented { child, new_parent } applied.
    └─ op_events::notify(OpEvent::SubgroupReparented
        { old_parent, new_parent, child }).
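
Steps 2 and 3 of the flow can be sketched as a pure decision function that maps an event to the contexts the local node should join. LocalView and contexts_to_join are hypothetical names, identifiers are plain integers, and rate limiting is left out so the backfill cap and idempotence stay visible.

```rust
use std::collections::BTreeSet;

/// Per-flip cap on backfill enumeration (value taken from the text).
pub const BACKFILL_LIMIT: usize = 1000;

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum OpEvent {
    AutoFollowSet { member: u64, contexts: bool, subgroups: bool },
    ContextRegistered { group: u64, context: u64 },
}

/// Hypothetical snapshot of what the local node knows about one group.
pub struct LocalView {
    pub me: u64,
    pub follow_contexts: bool,
    pub group_contexts: Vec<u64>,
    pub joined: BTreeSet<u64>,
}

/// Returns the contexts for which a JoinContext op should be emitted.
pub fn contexts_to_join(view: &LocalView, event: &OpEvent) -> Vec<u64> {
    match event {
        // Step 2: our own flag was set, so backfill existing contexts.
        // Enumerate at most BACKFILL_LIMIT, then skip already-joined ones.
        OpEvent::AutoFollowSet { member, contexts: true, .. } if *member == view.me => view
            .group_contexts
            .iter()
            .copied()
            .take(BACKFILL_LIMIT)
            .filter(|context| !view.joined.contains(context))
            .collect(),
        // Step 3: a new context appeared and we follow contexts.
        OpEvent::ContextRegistered { context, .. }
            if view.follow_contexts && !view.joined.contains(context) =>
        {
            vec![*context]
        }
        // Everything else (other members, flag cleared) is ignored.
        _ => Vec::new(),
    }
}
```

Filtering on the joined set is what makes a repeated AutoFollowSet for the same member harmless, as in the step 0 plus step 1 case.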

TEE Fleet Integration

A TEE fleet node calls POST /admin-api/tee/fleet-join. The handler in server::admin::handlers::tee::fleet_join:

  1. Generates a TDX attestation quote bound to the node's namespace identity pubkey.
  2. Broadcasts TeeAttestationAnnounce on the namespace topic.
  3. Polls for admission (up to 30 s) by calling list_group_contexts. Once the verifier's MemberJoinedViaTeeAttestation op has propagated, the list succeeds.
  4. Joins all existing contexts in the group.
  5. Publishes MemberSetAutoFollow { target: self, contexts: true, subgroups: true }, signed by the node's own namespace-identity key. The admin-or-self rule is satisfied via the self path — the admitting verifier can't do this on the member's behalf because it usually lacks both admin authority and the member's signing key.
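
The admission wait in step 3 is a poll-until-deadline loop. The sketch below shows that pattern in blocking form; the real handler is presumably async, and poll_until, its parameters and the retry interval are assumptions made for illustration only.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Retries `attempt` until it succeeds or `timeout` has elapsed.
/// On timeout, the error from the last attempt is returned.
pub fn poll_until<T, E>(
    timeout: Duration,
    interval: Duration,
    mut attempt: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let deadline = Instant::now() + timeout;
    loop {
        match attempt() {
            // Admission has propagated: the call now succeeds.
            Ok(value) => return Ok(value),
            // Out of time: surface the most recent failure.
            Err(err) if Instant::now() >= deadline => return Err(err),
            // Not admitted yet: wait and try again.
            Err(_) => sleep(interval),
        }
    }
}
```

In the fleet-join case the attempt would be the list_group_contexts call and the timeout would be 30 s.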

From this point, every new context in the group is auto-joined by the core handler. The mero-tee sidecar's per-group polling loop becomes redundant.

Policy scope — namespace, not per-group. Because auto-follow propagates fleet-node membership down into subgroups without a second admission check, any TEE admission policy set on a subgroup would be inert. As of 2026-04-21 this is made explicit: the canonical TeeAdmissionPolicy lives on the namespace root only. read_tee_admission_policy in group_store::tee resolves its argument to the namespace root before reading, and both the write handler (handlers::set_tee_admission_policy) and the apply path in group_store::apply_group_op_mutations refuse a TeeAdmissionPolicySet targeting a subgroup. Subgroup policy bytes in any legacy op logs are ignored. A deferred follow-up adds a drift guard that validates the root policy against Calimero's canonical fleet measurements.
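
The root-only rule can be illustrated with a child-to-parent map. This is a sketch under simplifying assumptions: GroupId is a plain integer, the hierarchy is an in-memory HashMap, and namespace_root and check_policy_target are hypothetical names rather than the functions in group_store.

```rust
use std::collections::HashMap;

pub type GroupId = u64;

#[derive(Debug, PartialEq, Eq)]
pub enum PolicyError {
    /// A TEE admission policy may only target the namespace root.
    TargetIsSubgroup,
}

/// Walks child -> parent links until a group without a parent is found.
/// Relies on the strict-tree invariant (no cycles) to terminate.
pub fn namespace_root(parents: &HashMap<GroupId, GroupId>, mut group: GroupId) -> GroupId {
    while let Some(parent) = parents.get(&group) {
        group = *parent;
    }
    group
}

/// Rejects a policy write whose target has a parent, i.e. is a subgroup.
pub fn check_policy_target(
    parents: &HashMap<GroupId, GroupId>,
    target: GroupId,
) -> Result<(), PolicyError> {
    if parents.contains_key(&target) {
        Err(PolicyError::TargetIsSubgroup)
    } else {
        Ok(())
    }
}
```

Reads resolve to the root first, so a lookup made from any subgroup sees the same policy.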

Rate Limit & Backpressure

The handler runs behind a token-bucket limiter. Defaults:

  • DEFAULT_BURST = 20 — tokens available at once.
  • DEFAULT_PER = 60 s — bucket refills fully in this window (one token every 3 s).
  • BACKFILL_LIMIT = 1000 — per-flip cap for enumerating existing contexts. Contexts registered later are picked up by the event-driven path, which has no such cap (the rate limit still applies).
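
The arithmetic behind these defaults can be checked with a small token bucket. Note the swap: the text mentions a semaphore and a refill loop, whereas this sketch computes refills lazily from a caller-supplied clock so that it is deterministic. TokenBucket and try_acquire are hypothetical names.

```rust
use std::time::Duration;

pub const DEFAULT_BURST: u32 = 20;
pub const DEFAULT_PER: Duration = Duration::from_secs(60);

/// Lazy token bucket driven by an explicit "time since start" clock.
pub struct TokenBucket {
    burst: u32,
    refill_every: Duration,
    tokens: u32,
    last_refill: Duration,
}

impl TokenBucket {
    /// Starts full. One token is earned every `per / burst`.
    pub fn new(burst: u32, per: Duration) -> Self {
        assert!(burst > 0 && per.as_nanos() >= u128::from(burst));
        Self {
            burst,
            refill_every: per / burst,
            tokens: burst,
            last_refill: Duration::ZERO,
        }
    }

    /// Takes one token if available; `now` is the time since start.
    pub fn try_acquire(&mut self, now: Duration) -> bool {
        let elapsed = now.saturating_sub(self.last_refill);
        let earned = (elapsed.as_nanos() / self.refill_every.as_nanos())
            .min(u128::from(u32::MAX)) as u32;
        if earned > 0 {
            // Credit whole tokens only, capped at the burst size.
            self.tokens = self.tokens.saturating_add(earned).min(self.burst);
            self.last_refill += self.refill_every * earned;
        }
        if self.tokens > 0 {
            self.tokens -= 1;
            true
        } else {
            false
        }
    }
}
```

With the defaults, 20 joins go through immediately, the 21st waits for the next 3 s tick, and an idle minute restores the full burst.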

Semaphore-closed and subscriber-lagged conditions are both surfaced via warn!. The authoritative recovery mechanism is always DAG replay: if an event is missed (best-effort broadcast), the next run of the handler walks the DAG and reconciles state.

Operator Notes

  • Observability. Every auto-join emits a structured info! log line with group_id and context_id. Failures emit warn! with the underlying error. No new metrics — the log stream is enough for postmortems.
  • Shutdown. Call auto_follow::shutdown() to abort the handler task and its refill loop. Subsequent spawn calls will start a fresh handler.
  • Migration. GroupMemberValue was extended with auto_follow via a custom Borsh deserializer. Records written under the pre-auto-follow schema are transparently read with default flags, and transparently upgraded on the next write. A partial trailing byte (data corruption) surfaces as a deserialization error instead of being silently defaulted.
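
The migration rule (legacy records decode with defaults, a partial trailing byte is an error) can be sketched with hand-rolled byte parsing. The layout here is an assumption for illustration: a one-byte role followed by two optional flag bytes. The real GroupMemberValue layout and its Borsh deserializer are not reproduced.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct AutoFollowFlags {
    pub contexts: bool,
    pub subgroups: bool,
}

impl Default for AutoFollowFlags {
    fn default() -> Self {
        Self { contexts: true, subgroups: false }
    }
}

/// Hypothetical, simplified member record.
#[derive(Debug, PartialEq, Eq)]
pub struct MemberValue {
    pub role: u8,
    pub auto_follow: AutoFollowFlags,
}

#[derive(Debug, PartialEq, Eq)]
pub enum DecodeError {
    Truncated,
    InvalidBool(u8),
    TrailingBytes(usize),
}

// Booleans are encoded as a single byte, 0 or 1.
fn decode_bool(byte: u8) -> Result<bool, DecodeError> {
    match byte {
        0 => Ok(false),
        1 => Ok(true),
        other => Err(DecodeError::InvalidBool(other)),
    }
}

pub fn decode_member_value(bytes: &[u8]) -> Result<MemberValue, DecodeError> {
    let (role, rest) = bytes.split_first().ok_or(DecodeError::Truncated)?;
    let auto_follow = match rest {
        // Legacy record: no flag bytes at all, so use the defaults.
        [] => AutoFollowFlags::default(),
        // Current record: both flags present.
        [contexts, subgroups] => AutoFollowFlags {
            contexts: decode_bool(*contexts)?,
            subgroups: decode_bool(*subgroups)?,
        },
        // One stray byte is corruption, not a legacy record.
        [_] => return Err(DecodeError::Truncated),
        extra => return Err(DecodeError::TrailingBytes(extra.len() - 2)),
    };
    Ok(MemberValue { role: *role, auto_follow })
}
```

Distinguishing "no trailing bytes" from "some but not enough" is what keeps corruption from being silently read as a default.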

Key Files

  • crates/context/src/op_events.rs — op-apply event channel + OpEvent enum.
  • crates/context/src/auto_follow.rs — handler task, rate limiter, spawn/shutdown.
  • crates/context/src/group_store/mod.rs — apply_group_op_mutations handles MemberSetAutoFollow.
  • crates/context/src/group_store/membership.rs — set_member_auto_follow helper.
  • crates/context/primitives/src/local_governance/mod.rs — the MemberSetAutoFollow op variant.
  • crates/store/src/key/group/mod.rs — AutoFollowFlags and the backward-compatible BorshDeserialize for GroupMemberValue.
  • crates/context/src/handlers/admit_tee_node.rs — TEE admission publishes the op only; flags are set by the admitted node itself.
  • crates/server/src/admin/handlers/tee/fleet_join.rs — after admission, the member publishes MemberSetAutoFollow signed by self.