V1.0 Readiness — gap analysis (observability, testing, compliance, operations)

Date: 2026-05-07
Status: Working notes — partner review pending. Not committed plans yet; promote to 03_PROGRESS.md backlog as items get accepted.
Scope: Everything between current V0.3-in-flight and V1.0 production launch that isn’t already in the backlog. Surfaces gaps in observability, testing, compliance, ops, and AI tooling.

TL;DR

We’re in good shape on the things that already shipped — Sentry + PostHog wired across apps/web and the 10 pipeline edge functions, rate limits live, DLQ + retry policies in place. What’s missing is the layer beneath: automated tests for the web flow, Dependabot for security patches, a privacy/ToS page, GDPR data export, uptime monitoring. None of these are V0.3-blockers. All become V1.0-blockers the day we let strangers sign up.

Five concrete items would close most of the V1.0 readiness gap at low cost. They are listed in priority order at the bottom under Recommended next 5.

What we already have

For partner context — these are not gaps. Just the baseline:

Capability	Stack	Where wired	Status
Error monitoring	Sentry	`apps/web` (full SDK) + `supabase/functions/_shared/sentry.ts` consumed by all 10 pipeline edge functions	✅ Shipped
Product analytics + feature flags	PostHog	`apps/web/lib/posthog.ts`, `apps/web/lib/feature-flags.ts`, `apps/web/components/analytics.tsx`	✅ Shipped
Auth deliverability	Resend SMTP via verified `send.arcive.io`	Supabase Auth → SMTP Settings	✅ Shipped 2026-05-04
Pipeline reliability	pgmq + DLQ + max-retry per step + 30-day TTL pruning (ADR-0008)	All 7 pipeline steps	✅ Shipped
Rate limiting	Per-route token bucket	Stripe checkout/portal + `ingest-audio`	✅ Shipped 2026-05-03
Consent gate	Agent retrieval excludes non-consented participants (ADR-0007)	`/api/chat` via MCP `respect_consent=true`	✅ Shipped
AI vendor resilience	Gemini Flash → Anthropic Haiku → Groq Llama 3.3 70B fallback (ADR-0011)	summarize-step, embed-step, agent layer	✅ Shipped
Engineering docs site	Astro Starlight at `apps/docs/` deployed via Cloudflare Pages	https://master.arcive-io.pages.dev (custom domain pending)	✅ Shipped 2026-05-07
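
For partners unfamiliar with the "per-route token bucket" in the rate-limiting row, the technique can be sketched as below. This is a minimal in-memory illustration under stated assumptions, not the shipped implementation: `TokenBucket`, `allowRequest`, and the burst/refill numbers are hypothetical, and a real deployment would presumably keep bucket state in a shared store rather than in process memory.

```typescript
// Minimal in-memory token bucket (illustrative; all names are hypothetical).
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;
  private readonly capacity: number;
  private readonly refillPerSecond: number;

  constructor(capacity: number, refillPerSecond: number, nowMs: number = Date.now()) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity; // start full so an initial burst is allowed
    this.lastRefillMs = nowMs;
  }

  // Refills based on elapsed time, then consumes one token if available.
  tryConsume(nowMs: number = Date.now()): boolean {
    const elapsedSec = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSecond);
    this.lastRefillMs = nowMs;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// One bucket per (route, caller) pair.
const buckets = new Map<string, TokenBucket>();

function allowRequest(route: string, callerId: string, nowMs: number = Date.now()): boolean {
  const key = `${route}:${callerId}`;
  let bucket = buckets.get(key);
  if (!bucket) {
    bucket = new TokenBucket(5, 1, nowMs); // assumed limits: burst of 5, refill 1 token/sec
    buckets.set(key, bucket);
  }
  return bucket.tryConsume(nowMs);
}
```

Passing `nowMs` explicitly keeps the limiter deterministic in tests; production callers would rely on the `Date.now()` default.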

Gaps — V1.0 readiness

Sorted by impact-per-hour-invested. All are post-V0.3, pre-V1.0.

Tier 1 — fill before letting non-design-partners sign up

#	Gap	Why it matters	Cost	Recommendation
1	No Dependabot / Renovate — manual `pnpm update` only.	Security CVEs in Next.js / Astro / AI SDKs accumulate silently.	XS — 5-min `.github/dependabot.yml` config	Weekly cadence, auto-merge patch versions, group all dev deps
2	No external uptime monitoring	When Cloudflare/Supabase/Modal go down, we find out from users. Bad for partner trust.	XS — $0 free tier, ~10 min setup	BetterStack or Cronitor. Ping `docs.arcive.io`, `/api/healthcheck`, the `pipeline-tick` cron URL
3	No privacy policy / Terms of Service	Required by App Store, Play Store, GDPR, Stripe. Magic-link signup form should link to one.	S — 1-2 hours	Termly / Iubenda template + adapt. Host as `/legal/privacy` and `/legal/terms` on `apps/web`
4	No data export (ZIP/JSON) — flagged in 2026-05-06 CTO alignment discussion item #4 as “cheap to add now”	GDPR + CCPA require it. Way easier while schema is small (~14 tables, no indexes spanning users yet).	S — couple of hours	Edge function streaming user’s `memories` + `recordings` + `topics` + `memory_topics` as zipped JSON. Wire to “Export my data” button in account settings
5	No automated tests for `apps/web` — CI typechecks and builds but runs zero tests. Pipeline test (PR #10) covers backend chain only.	Regressions on the core flow ship invisibly. Already a TODO in the 03_PROGRESS Hardening section.	M — 1-2 days	5-10 Playwright tests for the golden path: record → memory appears → search → talk-back. Add Vitest for `packages/agents` consent gate logic
6	No mobile observability — `apps/mobile` has neither Sentry nor PostHog. Crashes on device are invisible.	Pipeline failures from mobile uploads visible only server-side; mobile-side bugs (audio capture, queue drain) silent.	S — ~1 day	`@sentry/react-native` (Expo plugin) + `posthog-react-native`. Stay in Expo Go (per Apple Developer deferral).
7	Per-user AI cost monitoring unverified — ADR-0013 exists; verify it’s actually wired or fold the implementation in here	At free-tier scale, one runaway-prompt user can blow the monthly Gemini/Anthropic budget without triggering any alert	M — depends on ADR-0013 status	If not implemented, log token counts to PostHog per request. Alert on a per-user, per-day threshold.
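
To make gap 4 concrete, the data-export core could look roughly like the sketch below. It only assembles and serializes the JSON bundle; zipping and streaming the response are left out. The table names come from the row above, but everything else is an assumption: `FetchRows` is a hypothetical helper standing in for whatever query scopes each table to a user, and `ExportBundle` is an invented shape.

```typescript
// Tables named in gap 4; how each one is scoped to a user is an assumption.
const EXPORT_TABLES = ["memories", "recordings", "topics", "memory_topics"] as const;
type ExportTable = (typeof EXPORT_TABLES)[number];
type Row = Record<string, unknown>;

// Hypothetical fetcher: returns every row of `table` that belongs to `userId`.
type FetchRows = (table: ExportTable, userId: string) => Promise<Row[]>;

interface ExportBundle {
  userId: string;
  exportedAt: string; // ISO 8601 timestamp
  tables: Record<ExportTable, Row[]>;
}

async function buildExport(
  userId: string,
  fetchRows: FetchRows,
  now: Date = new Date(),
): Promise<ExportBundle> {
  const tables = {} as Record<ExportTable, Row[]>;
  // Fetch one table at a time so a single export does not fan out into parallel heavy queries.
  for (const table of EXPORT_TABLES) {
    tables[table] = await fetchRows(table, userId);
  }
  return { userId, exportedAt: now.toISOString(), tables };
}

// Serialize the bundle as pretty-printed JSON; compression is out of scope for this sketch.
function serializeExport(bundle: ExportBundle): string {
  return JSON.stringify(bundle, null, 2);
}
```

Keeping the table list in one constant means a new user-owned table has to be added in exactly one place, which is the main reason to build this while the schema is still small.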

Tier 2 — fill before public launch (V1.0)

#	Gap	When it bites
8	Public status page (`status.arcive.io`)	Trust signal partners + B2B prospects expect. BetterStack/Statuspage.
9	Marketing site / landing page at `arcive.io` apex	First public marketing push
10	Cookie consent banner	Required for EU GDPR compliance; CCPA additionally requires a “Do Not Sell” link
11	Onboarding email sequence (Day 1 / 3 / 7 reactivation)	When churn rate becomes legible (~100 users)
12	Pricing page	Tiers exist in code (Free / Pro / Family / Marketplace / B2B per Master Plan §8) but no public-facing page.
13	Help center / end-user docs — `docs.arcive.io` is engineering-facing	Public launch needs a separate user-facing docs surface
14	Webhook / public API docs	When public MCP server ships per Master Plan §3 Phase 4

Tier 3 — AI-specific gaps (track, not urgent)

#	Gap	When it bites
15	No eval framework for prompts	First time we swap a model per ADR-0011 fallback chain and quality silently degrades
16	No prompt versioning — prompts live in code, no audit log of changes	When a prompt change correlates with a quality drop, hard to bisect
17	No vendor fallback dry-run testing — chain defined but not regularly exercised	The day Gemini goes down and the Anthropic prompt has a subtle incompat we didn’t notice

LangSmith, Braintrust, or a homegrown CI harness would all work. Pick one once we have ≥10 prompts that meaningfully matter.
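
Gap 17 is the cheapest of the three to prototype. The sketch below contrasts a production-style fallback (first success wins) with a dry run that calls every provider with the same prompt and validates each output, so links that are rarely reached still get exercised. The `Provider` interface and both function names are hypothetical; the real vendor clients and the ADR-0011 chain are not shown in this document.

```typescript
// Hypothetical provider interface; the real vendor clients are not shown here.
interface Provider {
  name: string;
  complete: (prompt: string) => Promise<string>;
}

interface DryRunResult {
  provider: string;
  ok: boolean;
  detail: string; // output text on success, error message on failure
}

function describeError(err: unknown): string {
  return err instanceof Error ? err.message : String(err);
}

// Production-style fallback: try providers in order and return the first success.
async function completeWithFallback(chain: Provider[], prompt: string): Promise<string> {
  const errors: string[] = [];
  for (const provider of chain) {
    try {
      return await provider.complete(prompt);
    } catch (err) {
      errors.push(`${provider.name}: ${describeError(err)}`);
    }
  }
  throw new Error(`all providers failed: ${errors.join("; ")}`);
}

// Dry run: call every provider, and check each output with the caller's validator.
async function dryRunChain(
  chain: Provider[],
  prompt: string,
  validate: (output: string) => boolean,
): Promise<DryRunResult[]> {
  const results: DryRunResult[] = [];
  for (const provider of chain) {
    try {
      const output = await provider.complete(prompt);
      results.push({ provider: provider.name, ok: validate(output), detail: output });
    } catch (err) {
      results.push({ provider: provider.name, ok: false, detail: describeError(err) });
    }
  }
  return results;
}
```

The point of the validator is that a fallback provider can "succeed" while returning output the pipeline cannot use; the plain fallback path would serve that output silently, whereas the dry run reports it as a failure.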

Things that aren’t gaps (worth knowing)

Stuff that looks missing but doesn’t actually need adding for our scale or stack:

Looks like a gap	Why it isn’t
Distributed tracing across pipeline	Sentry has it built in. Wire it when 7-step pipeline observability gets harder than it currently is.
Log aggregation (Datadog / Axiom / Better Stack logs)	Each service’s own UI is fine for ~50 users. Becomes worth it at ~5k+ users.
Infrastructure metrics (CPU / memory / disk)	Everything’s serverless (Supabase, Modal, Cloudflare). They handle their own.
Container orchestration / k8s	Nothing self-hosted that needs it.
CDN	Cloudflare Pages + Workers are the CDN. Done.
Custom WAF	Cloudflare Pages includes it free; Supabase has its own.
Secrets manager	Supabase Vault + Vercel env + Modal secrets cover everything. No dedicated secrets-management product needed.

Discussion items — need a decision call

1. Where do tests go in the workflow?

Three choices for adding the Playwright golden-path:

(a) Manual run before merge. Cheap, easy to skip.
(b) GitHub Actions on every PR. Adds 2-3 min to PR cycle, blocks merge on red.
(c) Pre-deploy gate on Cloudflare Pages. Doesn’t block PRs but blocks bad code from going live.

Recommendation: (b) on apps/web PRs only. CI already typechecks; tests fit the same place.
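
Whichever option is chosen, the Vitest half of the testing gap is cheap if the consent gate can be exercised as a pure function. The sketch below shows the kind of logic such a unit test would pin down. It is an assumption-laden illustration: the real types in `packages/agents` are not shown in this document, `Memory` and `filterByConsent` are hypothetical names, and the rule encoded here (a memory is retrievable only if every participant has consented) is one plausible reading of ADR-0007.

```typescript
// Hypothetical shape; the real types in `packages/agents` are not shown in this document.
interface Memory {
  id: string;
  participantIds: string[];
}

// Assumed rule: with the gate on, a memory is retrievable only if every participant has consented.
function filterByConsent(
  memories: Memory[],
  consentedParticipants: ReadonlySet<string>,
  respectConsent: boolean,
): Memory[] {
  if (!respectConsent) return memories;
  return memories.filter((memory) =>
    memory.participantIds.every((id) => consentedParticipants.has(id)),
  );
}
```

A test for this would cover at least three cases: all participants consented, one participant missing consent, and the gate switched off.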

2. Privacy policy — write or generate?

(a) Termly / Iubenda generated template (~$10-30/mo). Fast, decent, brand-generic.
(b) Hand-written. Hours of work but reflects ARCIVE specifics (HW recording, AI processing, B2B).
(c) Hire a lawyer. Hundreds of dollars but real legal coverage.

Recommendation: (a) for V1.0 launch (covers ~95% of what’s actually scrutinized), (c) before B2B contracts.

3. Status page — public or partner-only?

(a) status.arcive.io fully public. The most common SaaS pattern.
(b) Partner-only via Cloudflare Access. Less public-trust signal but no public failure record either.

Recommendation: (a) once 5+ partners exist. Building trust requires showing the real thing.

4. AI cost attribution — when to wire?

ADR-0013 accepts the idea. Three urgency levels:

Now — log token counts per request to PostHog. Alert on outliers.
Before V1.0 — full per-user dashboard, billing-grade attribution.
Defer — wait until first runaway-cost incident.

Recommendation: now for the “log tokens to PostHog” minimum (XS). Full dashboard before V1.0.
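
The "log tokens to PostHog" minimum could be sketched as follows. This is an illustration under assumptions, not a description of ADR-0013: the event name `ai_tokens_used`, the `TokenUsageEvent` shape, and the `Capture` sink are all hypothetical. With `posthog-node`, the sink could wrap `client.capture({ distinctId, event, properties })`; the threshold check is shown as a plain function over already-collected events so it can run anywhere.

```typescript
interface TokenUsageEvent {
  userId: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
  timestamp: Date;
}

// Hypothetical analytics sink, injected so the logging code does not depend on a specific SDK.
type Capture = (distinctId: string, event: string, properties: Record<string, unknown>) => void;

// Emit one analytics event per AI request.
function logTokenUsage(capture: Capture, usage: TokenUsageEvent): void {
  capture(usage.userId, "ai_tokens_used", {
    model: usage.model,
    input_tokens: usage.inputTokens,
    output_tokens: usage.outputTokens,
    total_tokens: usage.inputTokens + usage.outputTokens,
  });
}

// Sum tokens per user for one UTC day ("YYYY-MM-DD") and return the users over the limit.
function usersOverDailyLimit(
  events: TokenUsageEvent[],
  day: string,
  dailyTokenLimit: number,
): string[] {
  const totals = new Map<string, number>();
  for (const event of events) {
    if (event.timestamp.toISOString().slice(0, 10) !== day) continue;
    const previous = totals.get(event.userId) ?? 0;
    totals.set(event.userId, previous + event.inputTokens + event.outputTokens);
  }
  return Array.from(totals.entries())
    .filter(([, total]) => total > dailyTokenLimit)
    .map(([userId]) => userId);
}
```

Counting tokens rather than dollars keeps this minimum version independent of vendor pricing; billing-grade attribution would need per-model rates on top.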

Recommended next 5

If picking 5 items to do this week in priority order:

1. Dependabot config (~5 min, prevents real security gaps)
2. Uptime monitoring (~10 min, $0)
3. Data export ZIP/JSON (~3 hours, unblocks GDPR + signals partner trust)
4. 5 Playwright tests for the golden path (~1 day, prevents regressions on the core flow)
5. Privacy policy + ToS draft (~2 hours, required before App Store)

Everything else can wait until specific signals warrant it.
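
For the uptime-monitoring item, the external monitor needs an endpoint that reflects dependency health rather than just "the server answered". If `/api/healthcheck` does not already do this, its core could be sketched as below. All names here (`Check`, `HealthReport`, `runHealthChecks`, `withTimeout`) are hypothetical, and wrapping the result in the framework's HTTP response is left out.

```typescript
// A check resolves if its dependency is healthy and rejects otherwise.
type Check = () => Promise<void>;

interface HealthReport {
  status: 200 | 503;
  body: { ok: boolean; checks: Record<string, string> };
}

// Reject if the wrapped promise does not settle within `ms` milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}

// Run all checks in parallel; any failure or timeout turns the report into a 503.
async function runHealthChecks(
  checks: Record<string, Check>,
  timeoutMs: number = 2000,
): Promise<HealthReport> {
  const results: Record<string, string> = {};
  await Promise.all(
    Object.entries(checks).map(async ([name, check]) => {
      try {
        await withTimeout(check(), timeoutMs);
        results[name] = "ok";
      } catch (err) {
        results[name] = err instanceof Error ? err.message : String(err);
      }
    }),
  );
  const ok = Object.values(results).every((result) => result === "ok");
  return { status: ok ? 200 : 503, body: { ok, checks: results } };
}
```

Returning a non-200 status on failure matters because most uptime monitors alert on status code by default, not on response body.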

Promotion path

This doc → discussion / exploratory
Items accepted → 03_PROGRESS.md “Production prep” or “V1.0” backlog (added in same commit as this doc)
Specific tool choices → ADRs (e.g. “ADR-0014: BetterStack for uptime”) only when alternatives were weighed
