Date: 2026-05-07
Status: Working notes — partner review pending. Not committed plans yet; promote to 03_PROGRESS.md backlog as items get accepted.
Scope: Everything between current V0.3-in-flight and V1.0 production launch that isn’t already in the backlog. Surfaces gaps in observability, testing, compliance, ops, and AI tooling.
TL;DR
We’re in good shape on the things that already shipped — Sentry + PostHog wired across apps/web and the 10 pipeline edge functions, rate limits live, DLQ + retry policies in place. What’s missing is the layer beneath: automated tests for the web flow, Dependabot for security patches, a privacy/ToS page, GDPR data export, uptime monitoring. None of these are V0.3-blockers. All become V1.0-blockers the day we let strangers sign up.
Five concrete items would close most of the V1.0 readiness gap at low cost. They are listed in priority order at the bottom under Recommended next 5.
What we already have
For partner context — these are not gaps. Just the baseline:
| Capability | Stack | Where wired | Status |
|---|---|---|---|
| Error monitoring | Sentry | apps/web (full SDK) + supabase/functions/_shared/sentry.ts consumed by all 10 pipeline edge functions | ✅ Shipped |
| Product analytics + feature flags | PostHog | apps/web/lib/posthog.ts, apps/web/lib/feature-flags.ts, apps/web/components/analytics.tsx | ✅ Shipped |
| Auth deliverability | Resend SMTP via verified send.arcive.io | Supabase Auth → SMTP Settings | ✅ Shipped 2026-05-04 |
| Pipeline reliability | pgmq + DLQ + max-retry per step + 30-day TTL pruning (ADR-0008) | All 7 pipeline steps | ✅ Shipped |
| Rate limiting | Per-route token bucket | Stripe checkout/portal + ingest-audio | ✅ Shipped 2026-05-03 |
| Consent gate | Agent retrieval excludes non-consented participants (ADR-0007) | /api/chat via MCP respect_consent=true | ✅ Shipped |
| AI vendor resilience | Gemini Flash → Anthropic Haiku → Groq Llama 3.3 70B fallback (ADR-0011) | summarize-step, embed-step, agent layer | ✅ Shipped |
| Engineering docs site | Astro Starlight at apps/docs/ deployed via Cloudflare Pages | https://master.arcive-io.pages.dev (custom domain pending) | ✅ Shipped 2026-05-07 |
Gaps — V1.0 readiness
Sorted by impact-per-hour-invested. All are post-V0.3, pre-V1.0.
Tier 1 — fill before letting non-design-partners sign up
| # | Gap | Why it matters | Cost | Recommendation |
|---|---|---|---|---|
| 1 | No Dependabot / Renovate — manual pnpm update only. | Security CVEs in Next.js / Astro / AI SDKs accumulate silently. | XS — 5-min .github/dependabot.yml config | Weekly cadence, auto-merge patch versions, group all dev deps |
| 2 | No external uptime monitoring | When Cloudflare/Supabase/Modal go down, we find out from users. Bad for partner trust. | XS — $0 free tier, ~10 min setup | BetterStack or Cronitor. Ping docs.arcive.io, /api/healthcheck, the pipeline-tick cron URL |
| 3 | No privacy policy / Terms of Service | Required by App Store, Play Store, GDPR, Stripe. Magic-link signup form should link to one. | S — 1-2 hours | Termly / Iubenda template + adapt. Host as /legal/privacy and /legal/terms on apps/web |
| 4 | No data export (ZIP/JSON) — flagged in 2026-05-06 CTO alignment discussion item #4 as “cheap to add now” | GDPR + CCPA require it. Way easier while schema is small (~14 tables, no indexes spanning users yet). | S — couple of hours | Edge function streaming user’s memories + recordings + topics + memory_topics as zipped JSON. Wire to “Export my data” button in account settings |
| 5 | No automated tests for apps/web — CI typechecks and builds but runs zero tests. Pipeline test (PR #10) covers the backend chain only. | Regressions on the core flow ship unnoticed. Already a TODO in the 03_PROGRESS Hardening section. | M — 1-2 days | 5-10 Playwright tests for the golden path: record → memory appears → search → talk-back. Add Vitest for packages/agents consent gate logic |
| 6 | No mobile observability — apps/mobile has neither Sentry nor PostHog. Crashes on device are invisible. | Pipeline failures from mobile uploads are visible only server-side; mobile-side bugs (audio capture, queue drain) are silent. | S — ~1 day | @sentry/react-native (Expo plugin) + posthog-react-native. Stay in Expo Go (per Apple Developer deferral); native crash capture likely needs a development build, so verify what coverage Expo Go actually gives us. |
| 7 | AI cost monitoring per user — ADR-0013 exists; verify it’s actually wired or fold the implementation in here | At free tier scale, one runaway-prompt user can blow monthly Gemini/Anthropic budget without alarm | M — depends on ADR-0013 status | If not implemented, log token counts to PostHog per request. Alert on per-user/day threshold. |
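For gap #4, the export payload could be assembled roughly as follows. This is a minimal sketch, not the edge function itself: the row shapes, the `id`, `user_id`, and `memory_id` column names, and the `buildExportBundle` helper are all assumptions for illustration. Only the four table names come from the table above. Fetching rows from Supabase and zipping the result are left out.

```typescript
// Hypothetical row shapes. Only the table names come from the notes above;
// the column names (`id`, `user_id`, `memory_id`, `topic_id`) are assumptions.
interface OwnedRow {
  id: string;
  user_id: string;
  [column: string]: unknown;
}

interface MemoryTopicRow {
  memory_id: string;
  topic_id: string;
}

interface ExportSource {
  memories: OwnedRow[];
  recordings: OwnedRow[];
  topics: OwnedRow[];
  memory_topics: MemoryTopicRow[];
}

interface ExportBundle extends ExportSource {
  exported_at: string; // ISO-8601 timestamp
  user_id: string;
}

// Keep only rows owned by `userId`. The join table has no owner column in
// this sketch, so it is filtered through the user's own memory ids.
function buildExportBundle(
  userId: string,
  source: ExportSource,
  now: Date = new Date(),
): ExportBundle {
  const owned = (rows: OwnedRow[]): OwnedRow[] =>
    rows.filter((r) => r.user_id === userId);
  const memories = owned(source.memories);
  const memoryIds = memories.map((m) => m.id);
  return {
    exported_at: now.toISOString(),
    user_id: userId,
    memories,
    recordings: owned(source.recordings),
    topics: owned(source.topics),
    memory_topics: source.memory_topics.filter(
      (mt) => memoryIds.indexOf(mt.memory_id) !== -1,
    ),
  };
}
```

`JSON.stringify(bundle, null, 2)` would then produce the file that gets zipped and returned from the "Export my data" button. Keeping the assembly as a pure function makes it easy to unit-test that one user's export never contains another user's rows.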
Tier 2 — fill before public launch (V1.0)
| # | Gap | When it bites |
|---|---|---|
| 8 | Public status page (status.arcive.io) | Trust signal partners + B2B prospects expect. BetterStack/Statuspage. |
| 9 | Marketing site / landing page at arcive.io apex | First public marketing push |
| 10 | Cookie consent banner | Required for EU visitors under GDPR; CCPA adds a “Do Not Sell” link requirement |
| 11 | Onboarding email sequence (Day 1 / 3 / 7 reactivation) | When churn rate becomes legible (~100 users) |
| 12 | Pricing page | Tiers exist in code (Free / Pro / Family / Marketplace / B2B per Master Plan §8) but no public-facing page. |
| 13 | Help center / end-user docs — docs.arcive.io is engineering-facing | Public launch needs a separate user-facing docs surface |
| 14 | Webhook / public API docs | When public MCP server ships per Master Plan §3 Phase 4 |
Tier 3 — AI-specific gaps (track, not urgent)
| # | Gap | When it bites |
|---|---|---|
| 15 | No eval framework for prompts | First time we swap a model per ADR-0011 fallback chain and quality silently degrades |
| 16 | No prompt versioning — prompts live in code, no audit log of changes | When a prompt change correlates with a quality drop, hard to bisect |
| 17 | No vendor fallback dry-run testing — chain defined but not regularly exercised | The day Gemini goes down and the Anthropic prompt has a subtle incompat we didn’t notice |
LangSmith, Braintrust, or homegrown CI all work. Pick when you have ≥10 prompts that meaningfully matter.
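Gap #17 can be exercised without waiting for an outage. The following is a minimal sketch, assuming each vendor is wrapped behind a common interface; `Provider`, `completeWithFallback`, and `dryRunChain` are hypothetical names, not the ADR-0011 implementation. The dry run calls every provider directly instead of stopping at the first success, so a broken fallback would show up in a scheduled job rather than during an incident.

```typescript
// Hypothetical common wrapper around each vendor SDK.
interface Provider {
  name: string;
  complete(prompt: string): Promise<string>;
}

interface Completion {
  provider: string;
  text: string;
}

// Normal path: try providers in order and return the first success.
async function completeWithFallback(
  chain: Provider[],
  prompt: string,
): Promise<Completion> {
  const failures: string[] = [];
  for (const p of chain) {
    try {
      return { provider: p.name, text: await p.complete(prompt) };
    } catch (err) {
      failures.push(p.name + ": " + String(err));
    }
  }
  throw new Error("all providers failed: " + failures.join("; "));
}

// Dry run: call every provider, including ones normal traffic never
// reaches, and mark those that throw or return empty output as failed.
async function dryRunChain(
  chain: Provider[],
  prompt: string,
): Promise<Record<string, "ok" | "failed">> {
  const report: Record<string, "ok" | "failed"> = {};
  for (const p of chain) {
    try {
      const text = await p.complete(prompt);
      report[p.name] = text.trim().length > 0 ? "ok" : "failed";
    } catch {
      report[p.name] = "failed";
    }
  }
  return report;
}
```

A non-empty check is the weakest useful assertion. Once an eval framework exists (gap #15), the same loop could score each provider's output instead.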
Things that aren’t gaps (worth knowing)
Stuff that looks missing but doesn’t actually need adding for our scale or stack:
| Looks like a gap | Why it isn’t |
|---|---|
| Distributed tracing across pipeline | Sentry has it built in. Wire it when 7-step pipeline observability gets harder than it currently is. |
| Log aggregation (Datadog / Axiom / Better Stack logs) | Each service’s own UI is fine for ~50 users. Becomes worth it at ~5k+ users. |
| Infrastructure metrics (CPU / memory / disk) | Everything’s serverless (Supabase, Modal, Cloudflare). They handle their own. |
| Container orchestration / k8s | Nothing self-hosted that needs it. |
| CDN | Cloudflare Pages + Workers are the CDN. Done. |
| Custom WAF | Cloudflare Pages includes it free; Supabase has its own. |
| Secrets manager | Supabase Vault + Vercel env + Modal secrets cover everything. No dedicated secrets-management product needed. |
Discussion items — need a decision call
1. Where do tests go in the workflow?
Three choices for adding the Playwright golden-path:
- (a) Manual run before merge. Cheap, easy to skip.
- (b) GitHub Actions on every PR. Adds 2-3 min to PR cycle, blocks merge on red.
- (c) Pre-deploy gate on Cloudflare Pages. Doesn’t block PRs but blocks bad code from going live.
Recommendation: (b) on apps/web PRs only. CI already typechecks; tests fit the same place.
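The Vitest half of gap #5 is easiest if the consent gate is a pure function that CI can test without a database. The following is a minimal sketch under a loudly stated assumption: it treats a memory as retrievable only when every participant has consented. ADR-0007 is the source of truth for the real rule, and the `RetrievedMemory` shape and `applyConsentGate` name are hypothetical.

```typescript
// Hypothetical shape; the real types in packages/agents may differ.
interface RetrievedMemory {
  id: string;
  participantIds: string[];
}

// Assumed rule (not confirmed by ADR-0007): a memory passes the gate only
// when every participant in it appears in the consented list.
function applyConsentGate(
  memories: RetrievedMemory[],
  consentedIds: string[],
  respectConsent: boolean = true,
): RetrievedMemory[] {
  if (!respectConsent) return memories;
  return memories.filter((m) =>
    m.participantIds.every((id) => consentedIds.indexOf(id) !== -1),
  );
}
```

Cases worth covering in CI: all participants consented, one participant missing consent, a memory with no participants (which passes under this rule), and the `respectConsent = false` bypass.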
2. Privacy policy — write or generate?
- (a) Termly / Iubenda generated template (~$10-30/mo). Fast, decent, brand-generic.
- (b) Hand-written. Hours of work but reflects ARCIVE specifics (HW recording, AI processing, B2B).
- (c) Hire a lawyer. Hundreds of dollars but real legal coverage.
Recommendation: (a) for V1.0 launch (covers ~95% of what’s actually scrutinized), (c) before B2B contracts.
3. Status page — public or partner-only?
- (a) status.arcive.io fully public. The most common SaaS pattern.
- (b) Partner-only via Cloudflare Access. Less public-trust signal but no public failure record either.
Recommendation: (a) once 5+ partners exist. Building trust requires showing the real thing.
4. AI cost attribution — when to wire?
ADR-0013 accepts the idea. Three urgency levels:
- Now — log token counts per request to PostHog. Alert on outliers.
- Before V1.0 — full per-user dashboard, billing-grade attribution.
- Defer — wait until first runaway-cost incident.
Recommendation: now for the “log tokens to PostHog” minimum (XS). Full dashboard before V1.0.
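The “log tokens, alert on outliers” minimum amounts to a per-user daily sum checked against a budget. The following is a minimal sketch, assuming usage events shaped like the hypothetical `TokenUsageEvent` below. In practice the events would come from wherever the token counts end up being logged, and the budget value is a placeholder to tune.

```typescript
// Hypothetical shape of one logged request; field names are assumptions.
interface TokenUsageEvent {
  userId: string;
  timestamp: string; // ISO-8601, UTC
  inputTokens: number;
  outputTokens: number;
}

// Sum input + output tokens per user for one UTC day, e.g. "2026-05-07".
function dailyTokensByUser(
  events: TokenUsageEvent[],
  day: string,
): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const e of events) {
    if (e.timestamp.slice(0, 10) !== day) continue;
    totals[e.userId] = (totals[e.userId] || 0) + e.inputTokens + e.outputTokens;
  }
  return totals;
}

// Return users whose daily total exceeds the budget, largest first.
function usersOverBudget(
  totals: Record<string, number>,
  dailyBudget: number,
): string[] {
  return Object.keys(totals)
    .filter((u) => totals[u] > dailyBudget)
    .sort((a, b) => totals[b] - totals[a]);
}
```

A raw token sum ignores price differences between vendors in the fallback chain. That is acceptable for catching runaway usage, but billing-grade attribution would need per-model rates.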
Recommended next 5
If picking 5 items to do this week in priority order:
1. Dependabot config (~5 min, prevents real security gaps)
2. Uptime monitoring (~10 min, $0)
3. Data export ZIP/JSON (~3 hours, unblocks GDPR + signals partner trust)
4. 5 Playwright tests for the golden path (~1 day, prevents regressions on the core flow)
5. Privacy policy + ToS draft (~2 hours, required before App Store)
Everything else can wait until specific signals warrant it.
Promotion path
- This doc → discussion / exploratory
- Items accepted → 03_PROGRESS.md “Production prep” or “V1.0” backlog (added in same commit as this doc)
- Specific tool choices → ADRs (e.g. “ADR-0014: BetterStack for uptime”) only when alternatives were weighed