Feature Flags 07 · flag registry · runtime evaluation · staged rollout · rollback

Planning · v0.2
Flag registry used by every runtime · per-request · per-tenant · per-user · staged rollout · one-click rollback. Sensitive flags must pass approval_matrix.row-sensitive-override before a flip.
Runtime Phase 1: /runtime/feature-flags/ — registry live.
Runtime Phase 2a: /runtime/feature-flags/evaluator.html — evaluator module + contract + 19 canonical examples (all pass).
Runtime Phase 2a+: /runtime/feature-flags-service/ — HTTP boundary · FastAPI scaffold · 4 endpoints · parity-locked Python port (dev-mode only · no JWT · no cache · no override store yet).
Runtime Phase 2b: /runtime/feature-flags-service/phase-2b.html — additive · Bearer JWT resolver + file-backed override store (8 seed rows) + POST /api/flags/sensitive-eval · parity 19/19 preserved · rationale at feature-flags-phase-2b.html (still dev-mode · no JWKS · no Postgres · no mutation).

Context

Handoff hand-feature-flags: B declares WHICH flags are needed via tenant_scope.feature_flag_handoff · A implements registry + evaluation + rollout + rollback. All 5 runtimes (Dashboard · Cases · Generate · Tenant · Wizard) must have a `*_runtime_v1` flag to control rollout + rollback.

docs/kb/data/tenant_scope.json · feature_flag_handoff · declares required flags
docs/kb/data/publish_workflow.json · Stages define when a flag may flip
docs/kb/data/approval_matrix.json · Sensitive-surface flip requires dual-approval

A-owned Runtime Boundary

B owns (read-only for A)

  • feature_flag_handoff list (which flags needed)
  • Sensitive-surface markers on specific flags
  • Dependency declarations (flag A requires flag B)

A owns (this runtime)

  • Flag registry schema + storage
  • Evaluation service (in-process library + cache)
  • Per-tenant / per-user override storage
  • Staged rollout (% cohort bucketing)
  • Rollback (one-click)
  • Cross-service dependency graph
  • Audit trail per flip

Scope

ER Diagram · flag registry + overrides + evaluations

Feature Flags · entity model (● marks key columns)

feature_flags
  • ● id (PK)
  • key (unique)
  • description
  • default_value (bool/str)
  • type (bool/variant)
  • rollout_pct (0..100)
  • cohort_rules (JSONB)
  • requires_approval
  • sensitive_flag
  • owner_module
  • last_flipped_at
  • last_approval_ref

flag_tenant_overrides
  • ● flag_id
  • ● tenant_id
  • value
  • set_by_user_id
  • set_at
  • reason
  • approval_ref
  • expires_at

flag_dependencies
  • ● id
  • flag_id (FK)
  • requires_flag_id (FK)
  • requires_value
  • rationale

flag_events (WORM)
  • id · flag · actor · before · after · ts

redis_cache
  • key: ff:{flag}
  • TTL 60s · pubsub invalidate

Edges in the diagram: 1..n · 1..n · 1..n · invalidate on flip (toward redis_cache)

Field Mapping — Flag Registry

Source: tenant_scope.feature_flag_handoff.

B Field | B Type | A Runtime Target | A Owner | Approval Gate | Binding | Notes
flag_key | string | feature_flags.key · unique index | Backend | none | mapped-not-bound | Snake_case convention
default_value | any | feature_flags.default_value JSONB | Backend | none | mapped-not-bound | Used when no override
requires_approval | boolean | feature_flags.requires_approval · block flip if no approval_ref | Backend | gate-flag-flip | placeholder | Approval ref from admin plane
sensitive_surface | boolean | feature_flags.sensitive_flag · triggers dual-approval | Backend | gate-sensitive-dual | placeholder | Per approval_matrix.sensitive_surface_markers
owner_module | string | feature_flags.owner_module · filter in admin UI | Backend + FE | none | mapped-not-bound | e.g., "cases" / "wizard"
dependency.requires | array | flag_dependencies table rows | Backend | gate-dep-check | placeholder | Blocker: validator prevents flip if dep off
rollout_strategy | object | feature_flags.cohort_rules JSONB | Backend | none | derived | Computed by evaluator · deterministic bucketing
tenant_override | map | flag_tenant_overrides rows | Backend | gate-tenant-override | mapped-not-bound | Per-tenant flip requires tenant-admin sign-off

API Sketch

GET /api/flags/eval?key=cases.runtime_v1&tenant=pty-zeroth&user=U-001
Evaluate a flag for a (tenant, user) context · cached · deterministic bucketing
// response
{
  "key": "cases.runtime_v1",
  "value": true,
  "source": "tenant_override",  // default | rollout | tenant_override | user_override
  "rollout_pct": 25,
  "bucket": 42,
  "cached": true
}
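The four `source` values suggest a resolution order for the evaluator. A minimal sketch follows, assuming the precedence user_override > tenant_override > rollout > default (the order is an assumption; the contract at evaluator.html is authoritative). `Flag` and `resolve` are hypothetical names, and the bucket is passed in rather than computed here.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Flag:
    """Hypothetical in-memory view of one feature_flags row plus its overrides."""
    key: str
    default_value: Any = False
    rollout_pct: int = 0  # 0..100
    tenant_overrides: Dict[str, Any] = field(default_factory=dict)
    user_overrides: Dict[str, Any] = field(default_factory=dict)


def resolve(flag: Flag, tenant: str, user: str, bucket: int) -> Dict[str, Any]:
    """Resolve a flag value; the precedence order below is an assumption."""
    if user in flag.user_overrides:
        value, source = flag.user_overrides[user], "user_override"
    elif tenant in flag.tenant_overrides:
        value, source = flag.tenant_overrides[tenant], "tenant_override"
    elif bucket < flag.rollout_pct:
        # The user's bucket falls inside the rollout cohort.
        value, source = True, "rollout"
    else:
        value, source = flag.default_value, "default"
    return {
        "key": flag.key,
        "value": value,
        "source": source,
        "rollout_pct": flag.rollout_pct,
        "bucket": bucket,
        "cached": False,  # this sketch never reads from a cache
    }


flag = Flag(key="cases.runtime_v1", rollout_pct=25,
            tenant_overrides={"pty-zeroth": True})
result = resolve(flag, tenant="pty-zeroth", user="U-001", bucket=42)
print(result["value"], result["source"])  # True tenant_override
```

With bucket 42 and rollout_pct 25 the user is outside the cohort, so only the tenant override produces `true`, which matches the sample response.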
PATCH /api/admin/flags/{key}
Flip rollout_pct or cohort rules · sensitive flags need approval_ref
// request
{ "rollout_pct": 50, "approval_ref": "APP-260418-0100", "rationale": "staged rollout wave 2" }
// response
200 · or 428 if approval missing / dep not satisfied
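The 200 / 428 decision can be sketched as a pure function, assuming the flag row carries requires_approval and sensitive_flag and that dependencies arrive as (requires_flag_key, requires_value) pairs. `check_flip`, its argument shapes, and the dependency key `tenant.runtime_v1` are illustrative assumptions, not the service's actual handler.

```python
from typing import Any, Dict, List, Tuple


def check_flip(
    flag_row: Dict[str, Any],
    request: Dict[str, Any],
    dependencies: List[Tuple[str, Any]],
    current_values: Dict[str, Any],
) -> Tuple[int, str]:
    """Return (status, reason) for a proposed flip; hypothetical helper."""
    gated = flag_row.get("requires_approval") or flag_row.get("sensitive_flag")
    if gated and not request.get("approval_ref"):
        return 428, "approval_ref missing"
    for dep_key, required_value in dependencies:
        # A dependency that is off (or unknown) blocks the flip.
        if current_values.get(dep_key) != required_value:
            return 428, f"dependency not satisfied: {dep_key}"
    return 200, "ok"


flag_row = {"key": "cases.runtime_v1", "requires_approval": True,
            "sensitive_flag": False}
deps = [("tenant.runtime_v1", True)]  # hypothetical dependency
request = {"rollout_pct": 50, "approval_ref": "APP-260418-0100",
           "rationale": "staged rollout wave 2"}

print(check_flip(flag_row, request, deps, {"tenant.runtime_v1": True}))
# (200, 'ok')
print(check_flip(flag_row, {"rollout_pct": 50}, deps, {"tenant.runtime_v1": True}))
# (428, 'approval_ref missing')
```

This sketch only checks that an approval_ref is present; verifying it against the admin control plane, and the dual-approval rule for sensitive flags, are out of scope here.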
POST /api/admin/flags/{key}/rollback
One-click rollback · sets rollout_pct=0 · pubsub invalidates cache · WORM audit
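The three rollback effects (zero the rollout, record the flip, invalidate the cache) can be sketched with in-memory stand-ins. `rollback`, the dict-based registry, the list-based event log, and the `publish` callback are all assumptions for illustration; a plain list is not WORM storage, and the real invalidation would go through pubsub.

```python
import time
from typing import Any, Callable, Dict, List


def rollback(
    registry: Dict[str, Dict[str, Any]],
    events: List[Dict[str, Any]],
    publish: Callable[[str], None],
    key: str,
    actor: str,
    now: Callable[[], float] = time.time,
) -> Dict[str, Any]:
    """Set rollout_pct to 0, append an audit event, publish an invalidation."""
    before = registry[key]["rollout_pct"]
    registry[key]["rollout_pct"] = 0
    # Field names follow flag_events: id · flag · actor · before · after · ts
    event = {"id": len(events) + 1, "flag": key, "actor": actor,
             "before": before, "after": 0, "ts": now()}
    events.append(event)   # append-only by convention in this sketch
    publish(f"ff:{key}")   # cache key format ff:{flag}
    return event


registry = {"cases.runtime_v1": {"rollout_pct": 50}}
events: List[Dict[str, Any]] = []
published: List[str] = []

rollback(registry, events, published.append, "cases.runtime_v1", actor="U-ADMIN")
print(registry["cases.runtime_v1"]["rollout_pct"], published)
# 0 ['ff:cases.runtime_v1']
```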

Sequence Flow · evaluate + cache + invalidate on flip

Flag evaluation + rollback
Participants: App Service · Flag Evaluator · Redis Cache · Postgres · Admin · Audit
  1. eval(cases.runtime_v1, tenant, user)
  2. GET ff:cases.runtime_v1
  3a. HIT · value=true
  4a. return true (source=tenant_override)
  3b. MISS · fetch from db · row · cache 60s TTL
  5. PATCH flag rollout_pct=0 (rollback) · pubsub invalidate · flush ff:cases.runtime_v1
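The HIT / MISS / flush path above can be sketched as a read-through cache with a 60s TTL and explicit invalidation. This is an in-process stand-in, not the Redis-backed cache itself; `TTLCache`, `get_or_load`, and the injectable clock are hypothetical names chosen for the example.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Read-through cache: entries expire after ttl_seconds or on invalidate()."""

    def __init__(self, ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}  # key -> (expires_at, value)

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Tuple[Any, bool]:
        """Return (value, hit). On a miss, call loader() and cache the result."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1], True           # step 3a: HIT
        value = loader()                    # step 3b: MISS, fetch from db
        self._entries[key] = (now + self._ttl, value)
        return value, False

    def invalidate(self, key: str) -> None:
        """Flush one key, as the pubsub message does after a flip."""
        self._entries.pop(key, None)


cache = TTLCache(ttl_seconds=60.0)
print(cache.get_or_load("ff:cases.runtime_v1", lambda: True))   # (True, False)
print(cache.get_or_load("ff:cases.runtime_v1", lambda: True))   # (True, True)
cache.invalidate("ff:cases.runtime_v1")                         # step 5: flush
print(cache.get_or_load("ff:cases.runtime_v1", lambda: False))  # (False, False)
```

If an invalidation message is lost, the stale entry still expires after the TTL, which is the 60s worst-case drift listed under Risks.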

Dependencies

Upstream

  • Auth (for admin flip permission)
  • Postgres + Redis
  • Admin control plane (for approval refs)
  • Pubsub (Redis or Kafka) for invalidation

Downstream

  • Every runtime (Dashboard · Cases · Generate · Tenant · Wizard) gates on *.runtime_v1
  • Staged rollout strategy for any new feature

Approval Gates

Risks

Risk | Severity | Mitigation
Flag flipped without approval (bypass) | High | API enforces approval_ref for gated flags · DB constraint + app check
Cache not invalidated across all pods | Med | Pubsub + TTL 60s as backstop · worst-case 60s drift
Dependency off but flag on (inconsistent) | Med | Evaluator returns false if any dep off · validator blocks flip
Deterministic bucket collision (heavy user) | Low | Hash user_id + flag_key · 100-bucket uniform distribution
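The bucketing mitigation can be sketched as follows, assuming SHA-256 over "flag_key:user_id" reduced modulo 100. The hash function, the key format, and the names `bucket_for` and `in_rollout` are assumptions; the registry may use a different scheme.

```python
import hashlib
from collections import Counter

BUCKETS = 100


def bucket_for(flag_key: str, user_id: str) -> int:
    """Map (flag, user) to a stable bucket in 0..99 (assumed hashing scheme)."""
    digest = hashlib.sha256(f"{flag_key}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % BUCKETS


def in_rollout(flag_key: str, user_id: str, rollout_pct: int) -> bool:
    """A user is in the cohort when their bucket is below rollout_pct."""
    return bucket_for(flag_key, user_id) < rollout_pct


# Synthetic user ids, for illustration only.
users = [f"U-{i:05d}" for i in range(10_000)]
counts = Counter(bucket_for("cases.runtime_v1", u) for u in users)

wave1 = {u for u in users if in_rollout("cases.runtime_v1", u, 25)}
wave2 = {u for u in users if in_rollout("cases.runtime_v1", u, 50)}

print(min(counts.values()), max(counts.values()))  # spread across buckets
print(wave1 <= wave2)  # True: raising rollout_pct only ever adds users
```

Including flag_key in the hash gives each flag its own cohort, so one user does not land in the early wave of every rollout. Comparing the bucket against rollout_pct keeps waves nested, so a user enabled at 25% stays enabled at 50%.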

Definition of Done

  1. Registry schema deployed · Redis cache + pubsub functional
  2. Evaluator library in use by all 5 runtime pages
  3. Staged rollout tested with cohort selector
  4. Rollback tested end-to-end (<5s convergence)
  5. Dependency graph validator blocks bad flips
  6. Sensitive-flag flip requires dual-approval with evidence
  7. All flips audited in WORM log

Deferred

Feature Flags Planning · v0.2 · Session A · A-owned