- Status: Accepted
- Date: 2026-05-03
- Affected: supabase/functions/, supabase/migrations/
Context
V0 chained Edge Functions over fire-and-forget HTTP: ingest-audio →
transcribe-step. That works for one step but doesn’t scale to the V0.1
pipeline (transcribe → diarize → reid → summarize → embed → compute_edges).
We need durable retries, decoupled steps, per-step observability, and
reasonable failure handling.
docs/01_SOFTWARE_PLAN.md §1 calls out the migration target as
“pgmq + pg_cron (V0.1) → Inngest (V0.2+) if complexity demands.”
Options considered
Option A — Direct HTTP chaining (V0 status quo)
- Pros: Zero infra; trivial to reason about.
- Cons: No retry on failure; one step crashing wedges the chain; observability is “tail logs across N functions”; can’t easily fan out (parallel embed + diarize) later.
Option B — Inngest
- Pros: Best-in-class developer experience for queues; durable; great UI for stuck jobs; native step functions.
- Cons: Another vendor; adds a service dependency; pricing kicks in fast at scale; pulls work out of Supabase, splitting the source of truth across two systems.
Option C — pgmq + pg_cron (chosen for V0.1)
- Pros: Lives inside Supabase; one source of truth; pgmq has durable retries via visibility timeouts; pg_cron ticks the drain function; zero new infra; free.
- Cons: pg_cron’s seconds-precision schedule needs pg_cron ≥ 1.5 (Supabase ships ≥ 1.6); Vault is needed to hold the service-role key for the cron job to call Edge Functions; missing extensions silently break the migration unless wrapped defensively.
Option D — SQS / Cloud Tasks / Temporal
- Pros: Industry standard; battle-tested.
- Cons: Same “split source of truth” problem as Inngest plus more ops. Overkill for V0.1.
Decision
Use pgmq + pg_cron for V0.1. Each pipeline step is an Edge Function
that pulls from a single pipeline_jobs queue, processes its message,
enqueues the next step on success, and leaves the message for retry on
failure. A pipeline-tick Edge Function is invoked by pg_cron every
30s to drain the queue. ingest-audio falls back to direct HTTP
dispatch when pgmq is unreachable, so local dev works without cron.
Re-evaluate at V0.2 if pgmq’s observability or DLQ story becomes a blocker. Inngest stays on the option table.
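The drain loop described above can be sketched with an in-memory stand-in for the queue. This is only an illustration of the visibility-timeout semantics, not the project's code or the pgmq API: InMemoryQueue, drainOnce, the 60-second timeout, and the batch size of 10 are all assumptions, and handlers are synchronous here to keep the sketch small (real Edge Function handlers would be async).

```typescript
// Hypothetical in-memory model of a single pipeline_jobs queue.
// Names and numbers are illustrative; this is not the pgmq API.
type Step = "transcribe" | "diarize" | "reid" | "summarize" | "embed" | "compute_edges";

interface Job { step: Step; recordingId: string }

interface QueuedMessage { msgId: number; readCount: number; visibleAt: number; body: Job }

const PIPELINE: Step[] = ["transcribe", "diarize", "reid", "summarize", "embed", "compute_edges"];

class InMemoryQueue {
  private messages: QueuedMessage[] = [];
  private nextId = 1;

  // Enqueue a job that is visible immediately.
  send(body: Job, now: number): number {
    const msgId = this.nextId++;
    this.messages.push({ msgId, readCount: 0, visibleAt: now, body });
    return msgId;
  }

  // Return up to `qty` visible messages and hide them for `vtSeconds`.
  read(vtSeconds: number, qty: number, now: number): QueuedMessage[] {
    const visible = this.messages.filter((m) => m.visibleAt <= now).slice(0, qty);
    for (const m of visible) {
      m.visibleAt = now + vtSeconds * 1000;
      m.readCount += 1;
    }
    return visible;
  }

  // Remove a message once its step has succeeded.
  delete(msgId: number): void {
    this.messages = this.messages.filter((m) => m.msgId !== msgId);
  }

  size(): number {
    return this.messages.length;
  }
}

type Handler = (job: Job) => void;

// One drain pass, roughly what a single pipeline-tick invocation might do.
// Returns the number of messages processed successfully.
function drainOnce(
  queue: InMemoryQueue,
  handlers: Record<Step, Handler>,
  now: number,
  vtSeconds = 60,
): number {
  let processed = 0;
  for (const msg of queue.read(vtSeconds, 10, now)) {
    try {
      handlers[msg.body.step](msg.body);
      // Success: enqueue the next step (if any), then remove this message.
      const next = PIPELINE[PIPELINE.indexOf(msg.body.step) + 1];
      if (next !== undefined) {
        queue.send({ step: next, recordingId: msg.body.recordingId }, now);
      }
      queue.delete(msg.msgId);
      processed += 1;
    } catch {
      // Failure: leave the message in place. It becomes visible again
      // once the visibility timeout expires, which is the retry.
    }
  }
  return processed;
}
```

In this model a failed message is invisible at the next 30-second tick and is picked up again by the first tick at or after the 60-second mark. There is no backoff and no retry cap.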
Consequences
- One queue, one drain function, four step functions chain through it.
- Each step is idempotent (upsert / on conflict do nothing / status guards) so message replays don’t duplicate.
- Retries happen for free via pgmq visibility timeout; no exponential backoff yet and no DLQ; see ADR-0008.
- Production deploy requires running vault.create_secret('<service-role-key>', 'service_role_key'). The migration installs a placeholder so it’s obvious if this was missed.
- Local dev uses a host.docker.internal URL in Vault; production uses the project URL. Operators switch between them via the SQL editor.
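Because a message that was processed but not deleted will be redelivered, each step has to tolerate replays. The following is a minimal sketch of the two guards named above, a status check and a first-write-wins insert, using an in-memory map in place of Postgres tables. StepStore and runStepIdempotently are hypothetical names, not part of the codebase.

```typescript
// Hypothetical in-memory stand-in for a step's output table and status column.
type StepStatus = "pending" | "done";

interface StepOutput { recordingId: string; step: string; payload: string }

class StepStore {
  private outputs = new Map<string, StepOutput>();
  private status = new Map<string, StepStatus>();

  private key(recordingId: string, step: string): string {
    return `${recordingId}:${step}`;
  }

  // Analogue of INSERT ... ON CONFLICT DO NOTHING: the first write wins.
  insertIfAbsent(row: StepOutput): boolean {
    const k = this.key(row.recordingId, row.step);
    if (this.outputs.has(k)) return false;
    this.outputs.set(k, row);
    return true;
  }

  isDone(recordingId: string, step: string): boolean {
    return this.status.get(this.key(recordingId, step)) === "done";
  }

  markDone(recordingId: string, step: string): void {
    this.status.set(this.key(recordingId, step), "done");
  }

  count(): number {
    return this.outputs.size;
  }
}

// Runs a step at most once per (recordingId, step), even if the message replays.
function runStepIdempotently(
  store: StepStore,
  recordingId: string,
  step: string,
  compute: () => string,
): "processed" | "skipped" {
  // Status guard: a replayed message for a finished step is a no-op.
  if (store.isDone(recordingId, step)) return "skipped";
  // If a previous attempt crashed after the insert but before markDone,
  // the insert below is ignored and the status is simply completed.
  store.insertIfAbsent({ recordingId, step, payload: compute() });
  store.markDone(recordingId, step);
  return "processed";
}
```

The status guard avoids redoing expensive work on replay, and the conflict-ignoring insert covers a crash between writing the output and marking the step done.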
Notes
The defensive DO ... EXCEPTION blocks in
20260503000002_pgmq_pipeline.sql exist because Supabase CLI versions
before roughly Q1 2026 didn’t auto-install pgmq. Wrapping the calls
means the migration suite still applies cleanly even if the extensions
are missing; the pipeline simply degrades to the direct HTTP fallback.
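The fallback path in ingest-audio can be pictured as a try-enqueue-then-dispatch wrapper. This is a sketch under assumptions rather than the actual function: dispatchFirstStep, IngestJob, and the injected enqueue and directHttp callbacks are hypothetical, and the callbacks stand in for the real queue call and HTTP request so the example runs without a database.

```typescript
// Hypothetical shape of the job handed off by ingest-audio.
interface IngestJob { step: string; recordingId: string }

type DispatchRoute = "queue" | "direct_http";

// Injected so the sketch does not depend on a live queue or network.
interface Dispatchers {
  enqueue: (job: IngestJob) => Promise<void>;
  directHttp: (job: IngestJob) => Promise<void>;
}

// Prefer the queue; fall back to direct HTTP dispatch if enqueueing fails
// (for example, when the pgmq extension is missing in local dev).
async function dispatchFirstStep(job: IngestJob, d: Dispatchers): Promise<DispatchRoute> {
  try {
    await d.enqueue(job);
    return "queue";
  } catch {
    // An error from directHttp is not caught here; it propagates to the caller.
    await d.directHttp(job);
    return "direct_http";
  }
}
```

Returning the route taken lets the caller log which path was used, which helps spot an environment that is silently running without the queue.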