The single source of truth for the ARCIVE product, across software and hardware. Read this first. Then read 01_SOFTWARE_PLAN.md and 02_HARDWARE_PLAN.md.
1. Product Thesis
ARCIVE is a hardware & software platform that empowers and enriches the lives of its users by lowering the barrier to record, retrieve, interact, and create memories and thoughts — past, present & future — for individuals and/or friends and family.
This is the canonical one-sentence definition. Everything else in this plan derives from it.
- It is a platform, not a single product — one software stack, multiple hardware form factors over time (clip, pendant, tabletop, watch, card, screen-equipped, cellular, local-only).
- The four verbs — record, retrieve, interact, create — define every feature decision. If a feature doesn’t serve one of these, it doesn’t ship.
- Past, present & future: capture (past), companion-in-the-moment (present), and reflective/predictive companion (future) are all in scope. V0 covers past + early present; future scope expands into present + future.
- Individuals and/or friends and family: solo use cases AND shared spaces are first-class. Schema, pricing, and UX accommodate both from V0.
- Hardware and software are co-equal tracks, developed in parallel from week 1, integrated continuously via the contract in §6.
- The app works without any device so adoption is never blocked. The device(s) are how the product is meant to be experienced — frictionless capture, no screen pulling you in.
- Monetization is device + subscription + platform/marketplace + vertical B2B.
User mental model: “I have ARCIVE. It listens, it remembers, it understands me, it’s there when I need it — past, present, and future.”
2. Strategic Principles
| # | Principle | Why |
|---|---|---|
| 1 | Hardware and software are equal, parallel tracks | The device is core product identity, not a peripheral. Both teams ship every phase. The integration contract (§6) is the rail they run on. |
| 2 | App works standalone; device is the intended experience | Phone-mic input is the safety net for adoption, but the device is the differentiated capture surface (always-on, no screen, distraction-free). |
| 3 | Ship V0 in 3 weeks — software AND a working hardware bring-up | Both tracks must demo end-to-end at every phase boundary. No “hardware later.” |
| 4 | One integration contract, two implementations | The HW↔SW contract (§6) is frozen at start of each phase. Both sides build to it. Changes require a joint sync. |
| 5 | Schema & architecture designed for V1 from V0 | Pivot from dictaphone → companion → platform must not require a rewrite. Bones in place from V0. |
| 6 | Cost-per-user must be sub-$2/mo at free tier | Otherwise growth = bankruptcy. VAD on-device + cheap STT (Groq) + cheap embeddings (Voyage) are non-negotiable. |
| 7 | Privacy & consent are V0 features, not later additions | Always-on mics in shared spaces are a legal minefield. Hardware-level mute, non-overridable LED, consent screen all in V0. |
| 8 | Agent layer is swappable from day one | Stateless RAG → Agent SDK → voice-native realtime. The app code shouldn’t care. |
| 9 | MCP-first memory layer | Memory store doubles as B2B/platform offering. Build it as a service from the start. |
| 10 | Distraction-free is a hardware constraint AND a software constraint | Device has no screen by default. App has no infinite feeds, notifications, or engagement loops. Calm by design at every layer. |
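Principle 8 can be illustrated with a minimal sketch in which app code depends on one interface and the driver behind it is swapped freely. The `AgentDriver` interface, the `AgentReply` type, and both driver classes are hypothetical names chosen for illustration; they are not taken from the project's actual agent interface.

```typescript
// Hypothetical shapes: illustrative only, not the real packages/shared interface.
interface AgentReply {
  text: string;
  driver: string; // which implementation produced the reply
}

interface AgentDriver {
  readonly name: string;
  respond(question: string, memories: string[]): Promise<AgentReply>;
}

// Stand-in for a stateless RAG driver: answers from retrieved memories only.
class StatelessRagDriver implements AgentDriver {
  readonly name = "stateless-rag";
  async respond(question: string, memories: string[]): Promise<AgentReply> {
    const hits = memories.filter((m) =>
      m.toLowerCase().includes(question.toLowerCase()),
    );
    return { text: hits.join("\n") || "No matching memory.", driver: this.name };
  }
}

// Stand-in for a later driver (for example an Agent SDK wrapper); same interface.
class EchoAgentDriver implements AgentDriver {
  readonly name = "echo-agent";
  async respond(question: string, memories: string[]): Promise<AgentReply> {
    return { text: `(${memories.length} memories) ${question}`, driver: this.name };
  }
}

// App code sees only AgentDriver, so swapping drivers needs no other change.
async function askReviewer(driver: AgentDriver, question: string): Promise<AgentReply> {
  const memories = ["Dentist on Friday", "Call Sam about the lease"];
  return driver.respond(question, memories);
}
```

Calling `askReviewer(new StatelessRagDriver(), "dentist")` and `askReviewer(new EchoAgentDriver(), "dentist")` exercises the same call site with two different back ends.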
2.5. Hardware Variant Lineup (Platform View)
ARCIVE is a platform. The first device defines the architecture; future variants extend it. All variants share the same software stack, the same /ingest-audio contract, the same BLE GATT schema (where applicable), and the same OTA system. Form factor and connectivity differ.
| Variant | When | Form | Use case | Key constraints |
|---|---|---|---|---|
| Clip / Pendant (V1.0) | Phase 4 | Wearable on lanyard or clip | Always-with-you personal capture | Battery 8–12 hr, BLE+WiFi, no screen |
| Tabletop puck (V1.0 SKU or V1.5) | Phase 4–5 | Desk or conference-table puck | Group conversation, meetings, family dinners, study rooms | Plugged-in option, larger battery, optimized for far-field |
| Pendant variant (V1.5) | Phase 5 | Necklace / pendant form | Always-on capture for users who don’t want a clip | Industrial-design refresh, smaller battery acceptable |
| Watch app companion (V1.1+) | Phase 5+ | Apple Watch / Wear OS app | Intentional dictation when device isn’t worn; quick capture | Software only, no new hardware; uses watch mic |
| Card format (V2.x) | Future | Credit-card-sized device worn in pocket / wallet | Discreet capture, executives / professionals | Slim battery, single MEMS mic likely (sacrifices array for form), BLE-tethered to phone for upload |
| Screen-equipped variant (V2.x) | Future | Tabletop or pendant with e-ink or small OLED | Status, summaries, role selection without phone | Display drives slightly higher BOM; firmware adds UI layer |
| Cellular variant (V2.x) | Future | Any of the above with onboard LTE-M / NB-IoT | Capture without a phone or WiFi nearby; caregiving / kids / safety use cases | LTE module + SIM, eSIM activation flow, higher BOM, recurring connectivity cost passed through to user |
| Local-only variant (V2.x) | Future | Any of the above with on-device summarization | Privacy-maximalist users, regulated industries (legal, healthcare), air-gapped settings | Larger MCU or co-processor (e.g., ESP32-P4 or RK3566), local STT (Whisper-tiny), local embeddings, syncs only to user’s own device or NAS |
Why this list now matters
- The platform contract must accommodate variants from day one: `device.kind` enum, capability flags, optional cellular metadata, optional local-mode flag in the schema.
- Cellular and local-only change the data flow. Cellular adds latency + cost pressure (chunk smaller, upload smarter). Local-only inverts the cloud assumption (memory store can live on user device). Both must be design considerations even if not built until V2.
- Card / pendant / watch shift the input surface. Single-mic and watch-mic are degraded inputs; software diarization & speaker re-ID must gracefully fall back.
- Screen variant changes the firmware UI layer, but does not change the cloud product.
- These variants are NOT a roadmap commitment — they’re the shape of the platform we’re designing for. We commit to V1.0 (clip/pendant + tabletop) and keep the door open to the rest.
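As a rough sketch of what the variant-aware schema could look like, the types below model a `device.kind` enum, capability flags, and optional cellular metadata, plus the fallback the list describes for degraded inputs. Every field name, the `CellularInfo` shape, and the two helper functions are assumptions made for illustration; the real schema is defined in the plan files.

```typescript
// Hypothetical schema sketch: field and enum names are assumptions.
type DeviceKind = "clip" | "pendant" | "tabletop" | "watch" | "card";

interface DeviceCapabilities {
  micArray: boolean;  // false for single-mic or watch-mic inputs
  screen: boolean;
  cellular: boolean;
  localOnly: boolean; // memory store stays on the user's own device
}

interface CellularInfo {
  carrier: string;
}

interface DeviceRecord {
  id: string;
  kind: DeviceKind;
  capabilities: DeviceCapabilities;
  cellular?: CellularInfo; // present only on cellular variants
}

type DiarizationMode = "array+doa" | "single-channel";

// Degraded inputs (no mic array) fall back to single-channel diarization.
function diarizationMode(device: DeviceRecord): DiarizationMode {
  return device.capabilities.micArray ? "array+doa" : "single-channel";
}

// Local-only devices are never routed to the cloud ingest path.
function usesCloudIngest(device: DeviceRecord): boolean {
  return !device.capabilities.localOnly;
}

const clip: DeviceRecord = {
  id: "dev-1",
  kind: "clip",
  capabilities: { micArray: true, screen: false, cellular: false, localOnly: false },
};

const localCard: DeviceRecord = {
  id: "dev-2",
  kind: "card",
  capabilities: { micArray: false, screen: false, cellular: false, localOnly: true },
};
```

Keeping capabilities as flags rather than inferring them from `kind` lets a future variant (for example a cellular tabletop) reuse the same record shape without a migration.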
2.7. Architecture (current state — V0.3 in flight)
Color legend (both diagrams):
- 🟦 Blue — capture / write path (client → ingest → storage)
- 🟪 Purple — pipeline processing (internal step-to-step)
- 🟧 Orange — external API call (Modal worker or AI vendor)
- 🟩 Green — read / Realtime push / query path
- 🟥 Red — failure path (dead-letter queue)
- ⬜ Gray dotted — planned / not yet wired
2.7.a — High-level (5-box overview)
For a 30-second read of how data flows.
2.7.b — Detailed (numbered flow)
Numbered subgraphs follow the data path 1 → 6.
Not in either diagram (planned / paused):
- Pipecat voice service — scaffolded then paused per ADR-0010. Resumes when EAS unblocks mobile native deps AND speech-to-speech-vs-Pipecat decision lands.
- LiveKit group conversation mode — V0.3 deferred with voice talk-back.
- Public MCP server (Cloudflare Workers) — V1.0.
- Stripe + RevenueCat billing — V0.1 scaffolded; tiers live by V1.0.
3. Phased Roadmap — Parallel Tracks
Both tracks ship at every phase boundary. Each phase ends with an integrated demo: software working end-to-end with the current hardware build.
| Phase | Weeks | Version | Software | Hardware |
|---|---|---|---|---|
| 0 | 1–3 | V0 | Web PWA dictaphone (phone mic) | XVF3800 dev-kit bring-up; raw I2S → WiFi upload to same backend |
| 1 | 4–6 | V0.1 | Diarization + re-ID + Reviewer role + Pro tier | Full firmware on dev-kit — VAD, BLE pairing, status LED, OTA scaffold |
| 2 | 7–10 | V0.2 | Voice talk-back¹ + Family + mobile (Expo) | Enclosure prototype #1 (3D-printed), 10 internal units, AEC validated for talk-back |
| 3 | 11–14 | V0.3 | Group conversation mode + Caregiver role | 50-unit pilot batch, DoA fusion live in pipeline, BLE control complete |
| 4 | 15–22 | V1.0 | Marketplace + public MCP + B2B pilot | Enclosure v1 production-intent, 200-unit commercial batch, OTA fleet mgmt |
| 5 | 23+ | V1.1+ | Vertical packages + memory-as-a-service | Industrial-design refresh, optional e-ink screen, certification path |
¹ Voice talk-back scaffolded then paused 2026-05-05 — see ADR-0010.
Both tracks integrate continuously. Software contract endpoints exist from Phase 0 so firmware can target them; firmware sends real audio from Phase 0 even if the enclosure is a breadboard.
4. Version Matrix — Capabilities by Release
Software capabilities
| Capability | V0 | V0.1 | V0.2 | V0.3 | V1.0 | V1.1+ |
|---|---|---|---|---|---|---|
| Web app (PWA) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Native mobile (iOS+Android) | — | — | ✅ | ✅ | ✅ | ✅ |
| Magic-link auth | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Consent screen | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Capture from phone mic | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Capture from ARCIVE device | ✅ dev-kit | ✅ dev-kit | ✅ proto | ✅ pilot | ✅ retail | ✅ |
| List view + memory detail | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Text search | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Semantic search (embeddings) | — | ✅ | ✅ | ✅ | ✅ | ✅ |
| Graph / Universe view | — | ✅ | ✅ | ✅ | ✅ | ✅ |
| Speaker diarization | — | ✅ | ✅ | ✅ | ✅ | ✅ |
| Cross-session speaker re-ID | — | ✅ | ✅ | ✅ | ✅ | ✅ |
| AI roles (text) | — | ✅ Reviewer | ✅ +Tutor | ✅ +Caregiver | ✅ marketplace | ✅ vertical |
| AI talk-back (voice) | — | — | ✅ | ✅ | ✅ | ✅ |
| Group conversation mode | — | — | — | ✅ | ✅ | ✅ |
| MCP server | — | — | — | internal | ✅ public | ✅ |
| Stripe / RevenueCat billing | scaffolded | ✅ Pro | ✅ Family | ✅ | ✅ marketplace | ✅ enterprise |
| Free / Pro / Family tiers | scaffolded | Free+Pro | +Family | ✅ | ✅ | ✅ |
| Marketplace (custom roles) | — | — | — | — | ✅ | ✅ |
| B2B vertical packages | — | — | — | — | pilot | ✅ |
| Export (Markdown / Obsidian) | — | — | ✅ | ✅ | ✅ | ✅ |
| Family / shared spaces | — | — | ✅ | ✅ | ✅ | ✅ |
Hardware capabilities (parallel track)
| Capability | V0 | V0.1 | V0.2 | V0.3 | V1.0 | V1.1+ |
|---|---|---|---|---|---|---|
| Build form | dev-kit (XVF3800+XIAO breadboard) | dev-kit | enclosure proto #1 (3D-print) | pilot enclosure (3D-print refined) | production-intent enclosure (SLA/CNC) | injection-molded |
| Units in field | 5 internal | 10 internal | 10 + 5 design partners | 50 pilot | 200 commercial | 1k+ |
| I2S audio capture (XVF3800 → ESP32-S3) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| WiFi upload to backend | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| On-device VAD (silence not uploaded) | basic | ✅ tuned | ✅ | ✅ | ✅ | ✅ |
| BLE provisioning (WiFi creds + JWT) | — | ✅ | ✅ | ✅ | ✅ | ✅ |
| Status LED (breathing/solid/error) | basic | ✅ | ✅ | ✅ | ✅ | ✅ |
| Hardware mute (gates I2S clock) | — | wired | ✅ | ✅ | ✅ | ✅ |
| Battery + USB-C charge | dev-kit | dev-kit | ✅ proto | ✅ pilot | ✅ retail | ✅ |
| 8–12 hr battery life | n/a | n/a | tested | ✅ | ✅ | ✅ |
| DoA azimuth metadata per chunk | ✅ raw | ✅ | ✅ | ✅ fused with re-ID | ✅ | ✅ |
| AEC validated for talk-back | — | — | ✅ | ✅ | ✅ | ✅ |
| OTA firmware updates | — | scaffold | ✅ | ✅ signed | ✅ A/B partition | ✅ fleet mgmt |
| Memfault / crash telemetry | — | — | ✅ | ✅ | ✅ | ✅ |
| Local circular buffer (offline) | — | basic | ✅ 30-min | ✅ | ✅ | ✅ |
| Optional e-ink/OLED screen | — | — | — | — | — | ✅ optional SKU |
| FCC/CE certification | — | — | — | — | (skipped — research units) | ✅ retail |
5. Deliverables by Phase — Both Tracks
Each phase ends with a joint integration demo: software + hardware working end-to-end together.
Phase 0 — V0 (Week 1–3)
Software
- Next.js 15 PWA deployed to Vercel
- Magic-link auth + consent screen
- `ingest-audio` Edge Function live and accepting uploads from phone-mic AND device dev-kit
- Phone-mic recording (`getUserMedia` + `@ricky0123/vad-web`)
- Synchronous transcribe → store memory via Groq Whisper
- Today (list) + Memory detail views
- Postgres FTS text search
- Full schema deployed (people / roles / role_sessions / memory_participants / subscription_tier)
- Stripe customer pre-created on signup
- PostHog + Sentry instrumented
- 20 invited users
Hardware
- XVF3800 + XIAO ESP32-S3 dev-kit assembled (5 units, breadboard / Seeed reference board)
- I2S firmware flashed on XVF3800 (per Seeed wiki)
- ESP32-S3 firmware: I2S capture → 30s WAV chunk → WiFi → POST `/ingest-audio` (hardcoded WiFi creds + dev JWT for now)
- DoA azimuth queried via I2C, attached as metadata to each chunk
- LED breathing animation on capture
- Forks Seeed’s HTTP audio streaming sample for fastest path to working
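To give a feel for why the contract later moves from WAV to Opus, the sketch below estimates the size of one 30-second chunk in each format. The 16 kHz, 16-bit mono PCM format is an assumption (this plan does not state the sample format); the 24 kbps Opus rate comes from §6.1.

```typescript
// Assumption: 16 kHz, 16-bit mono PCM. The plan does not state the sample format.
const SAMPLE_RATE_HZ = 16000;
const BYTES_PER_SAMPLE = 2;
const WAV_HEADER_BYTES = 44; // canonical PCM WAV header size

// Size of an uncompressed WAV chunk of the given duration.
function wavChunkBytes(seconds: number): number {
  return WAV_HEADER_BYTES + seconds * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE;
}

// Size of an Opus chunk at a constant bitrate (container overhead ignored).
function opusChunkBytes(seconds: number, kbps: number): number {
  return (seconds * kbps * 1000) / 8;
}

const wav30s = wavChunkBytes(30);       // 960044 bytes
const opus30s = opusChunkBytes(30, 24); // 90000 bytes
const ratio = wav30s / opus30s;         // a bit over 10x
```

Under these assumptions a raw WAV chunk is roughly ten times larger than its Opus equivalent, which matters most for battery-powered and cellular variants.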
Integration demo: dev-kit on a desk records a meeting, transcripts appear in the web app feed in real time alongside phone-mic recordings.
Phase 1 — V0.1 (Week 4–6)
Software
- Pipeline moved to `pgmq` queue + step workers
- Diarization (Deepgram Nova-3)
- Pyannote.audio worker on Modal for speaker re-ID
- Voyage-3-lite embeddings + pgvector HNSW
- Semantic search live
- Universe / graph view (`react-force-graph` web)
- First AI role: Reviewer (text-only)
- Pro tier ($12/mo) launches
- Markdown export
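The queue-plus-step-workers shape can be sketched with an in-memory stand-in for `pgmq`. This is not the pgmq API: the step names, their order, the retry budget of three, and the dead-letter behavior are all assumptions made for illustration, loosely following the vendors named in this plan and the failure path in the architecture legend.

```typescript
// Step names and order are assumptions, not the actual pipeline definition.
type Step = "transcribe" | "diarize" | "embed" | "summarize";
const STEPS: Step[] = ["transcribe", "diarize", "embed", "summarize"];
const MAX_ATTEMPTS = 3; // assumed retry budget before dead-lettering

interface Job {
  recordingId: string;
  step: Step;
  attempts: number;
}

type Handler = (job: Job) => void; // a handler throws to signal failure

// Runs one recording through every step; failed jobs are re-queued until the
// retry budget is spent, then moved to the dead-letter list.
function runPipeline(recordingId: string, handlers: Record<Step, Handler>) {
  const queue: Job[] = [{ recordingId, step: "transcribe", attempts: 0 }];
  const deadLetter: Job[] = [];
  const completed: Step[] = [];

  while (queue.length > 0) {
    const job = queue.shift() as Job;
    try {
      handlers[job.step](job);
      completed.push(job.step);
      const next = STEPS[STEPS.indexOf(job.step) + 1];
      if (next !== undefined) {
        queue.push({ recordingId, step: next, attempts: 0 });
      }
    } catch {
      const attempts = job.attempts + 1;
      if (attempts >= MAX_ATTEMPTS) deadLetter.push({ ...job, attempts });
      else queue.push({ ...job, attempts });
    }
  }
  return { completed, deadLetter };
}
```

Because each step enqueues the next one only on success, a failure in one step never blocks unrelated recordings; it only parks that recording's job in the dead-letter list for inspection.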
Hardware
- Same dev-kit, full firmware feature set
- VAD gating: silence never uploaded (uses XVF3800 VAD signal via I2C)
- BLE GATT server implemented (provisioning + control + status characteristics per §6.3)
- Pairing flow: app QR → BLE write of WiFi creds + device JWT → device reboots and joins WiFi
- LED states: idle / recording / muted / uploading / error
- OTA scaffolding (esp_https_ota wired up, manifest poll daily)
- 10 dev-kit units in internal use
Integration demo: paired device captures a multi-speaker meeting; diarization labels appear; same speaker recognized across two separate sessions.
Phase 2 — V0.2 (Week 7–10)
Software
- Expo mobile app (iOS + Android) with feature parity
- pnpm workspace, `packages/db`, `packages/shared`, `packages/agents`
- Voice talk-back loop: Pipecat + Deepgram streaming STT + Cartesia Sonic TTS
- Claude Agent SDK driver replaces stateless RAG
- Roles: Tutor, Brainstorm Partner
- Family tier ($25/mo) — spaces, multi-member, caregiver role
- Offline recording on mobile + queue-and-forward
- App Store + Play Store submissions
Hardware
- Enclosure prototype #1: 3D-printed shell housing XVF3800 + XIAO + LiPo + USB-C charge IC + LED + mute button
- 10 internal units + 5 design-partner units
- AEC validated for talk-back use case (XVF3800 onboard AEC). The device itself has no speaker, but talk-back audio plays through the paired phone or an external speaker, so the test verifies that this playback echo does not pollute uploaded audio
- Hardware mute button wired to physically gate I2S clock (security-critical)
- 8-hour battery test passes
- Local 30-min circular buffer for offline resilience
- OTA channel `dev` live; firmware version reported via BLE characteristic
- Memfault (or equivalent) crash reporting
Integration demo: user wears device on lanyard for a full work day; offline periods buffer locally; device firmware updates over OTA without user intervention; talk-back works via paired phone with the device as input.
Phase 3 — V0.3 (Week 11–14)
Software
- LiveKit-based group conversation mode (continuous WebRTC, server-side multi-party room)
- Backend bridge: device → HTTPS chunked upload (1s Opus chunks) → bridge publishes as a LiveKit participant track. Device → backend speaks HTTP; backend → LiveKit speaks WebRTC. Web/mobile clients join the room directly via LiveKit SDKs.
- Agent `interject()` capability: agent can speak into group conversation as a defined role (audio TTS sent into the LiveKit room as a participant)
- Caregiver role with per-person consent flow
- DoA metadata fused with diarization in pipeline → higher-confidence speaker labels in multi-person rooms
- Internal MCP server stood up (used by agents/roles as retrieval layer)
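One plausible way to fuse DoA metadata with diarization is to attach an average azimuth to each speaker segment. The sketch below is an assumption about how that could work, not the pipeline's actual method: the `Segment` shape is hypothetical, and the circular mean is a choice made here so that readings on either side of 0° average correctly. The `DoaSample` shape follows the `t_ms` / `az` format in §6.1.

```typescript
interface DoaSample {
  t_ms: number;
  az: number; // azimuth in degrees
}

// Hypothetical diarizer output: one speaker label per time span.
interface Segment {
  speaker: string;
  start_ms: number;
  end_ms: number;
}

// Circular mean, so that 350 and 10 degrees average to 0 rather than 180.
function meanAzimuth(samples: DoaSample[]): number | null {
  if (samples.length === 0) return null;
  let x = 0;
  let y = 0;
  for (const s of samples) {
    const rad = (s.az * Math.PI) / 180;
    x += Math.cos(rad);
    y += Math.sin(rad);
  }
  const deg = (Math.atan2(y, x) * 180) / Math.PI;
  return Math.round((deg + 360) % 360) % 360;
}

// Attach the mean azimuth of the samples that fall inside each segment.
function attachAzimuth(segments: Segment[], doa: DoaSample[]) {
  return segments.map((seg) => ({
    ...seg,
    az: meanAzimuth(
      doa.filter((d) => d.t_ms >= seg.start_ms && d.t_ms < seg.end_ms),
    ),
  }));
}
```

A downstream step could then treat two segments with the same speaker label but very different azimuths as a hint that the diarizer merged two people, which is where the higher-confidence labels would come from.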
Hardware
- 50-unit pilot batch of refined enclosure (still 3D-printed, but iterated form factor based on Phase 2 feedback)
- Industrial-design partner engaged for V1.0 enclosure
- Continuous-streaming firmware mode (for group conversation): instead of 30s chunks, opens a sustained WebRTC/HTTPS stream
- Power profile validated for sustained streaming (target ≥4 hr in this mode)
- BLE notify channels for real-time mute/battery push
- Pairing flow polished (sub-30s end-to-end)
- 50 units shipped to design partners + early adopters
- Telemetry dashboard live: per-device upload volume, battery health, crash rate, daily active devices
Integration demo: device placed on conference table during a 4-person meeting; group mode active; speakers identified by name from the second session onward; AI role (Caregiver / Reviewer) interjects appropriately when invoked.
Phase 4 — V1.0 (Week 15–22)
Software
- Public MCP server — Pro users plug ARCIVE memory into Claude Desktop / ChatGPT / Cursor
- Role marketplace UI (browse / install / publish)
- Stripe Connect for marketplace payouts (70/30)
- B2B admin dashboard (multi-staff, audit log, org-level billing)
- First B2B pilot signed (likely caregiving)
- SOC 2 Type 1 prep started
Hardware
- Production-intent enclosure (SLA or CNC; injection-molded slated for v1.5)
- 200-unit commercial batch manufactured
- OTA with A/B partitions + signed firmware + automatic rollback on failed boot
- Fleet management telemetry (firmware version distribution, error rate by version, retire-and-replace flow)
- Hardware mute validation re-tested: cannot be defeated in software (security audit)
- LED VAD-driven indicator validated (cannot be turned off while mic is hot)
- Retail packaging design begins
- App Store listing live with hardware as upsell (“Get the ARCIVE device for $129”)
Integration demo: end-to-end retail experience — user buys device, scans QR in app, pairs in 30s, captures and reviews memories via marketplace role, plugs into Claude Desktop via MCP for cross-tool agent access.
Phase 5 — V1.1+ (Week 23+)
Software
- Memory-as-a-Service API tier (usage-billed)
- ARCIVE for Caregiving (per-resident license, ~$50–100/mo)
- ARCIVE for Education (per-student/school license)
- ARCIVE for Therapy (per-therapist, ~$50–100/mo)
- Smartwatch companion (Apple Watch / Wear OS) for intentional dictation when device isn’t worn
- Multi-language support (start ES / FR / DE)
Hardware
- Industrial-design refresh based on V1.0 field feedback
- Optional e-ink or OLED screen SKU (the original V0.3 “screen” idea, deferred to here once we know what users want to see)
- Replaceable / longer-runtime battery option
- Injection-molded enclosure
- FCC / CE certification (required for retail at scale)
- Retail distribution partnerships (DTC, Amazon, possibly Best Buy)
- Variants: clip / pendant / tabletop puck — based on V1.0 user preference data
6. Hardware ↔ Software Contract
This is the integration surface. Both teams must respect it. See full detail in plan files.
6.1 Audio Upload (HW → Backend)

    POST https://<project>.supabase.co/functions/v1/ingest-audio
    Headers:
      Authorization: Bearer <device_token>     # JWT signed at pairing time
      X-Device-Id: <uuid>
      Content-Type: audio/wav (or audio/opus)
    Body: raw audio chunk (≤ 30s, VAD-trimmed)
    Query: ?recorded_at=<iso8601>&doa_json=<urlencoded>
    Response: 202 { recording_id }

- Chunks ≤ 30 seconds (Edge Function timeout safety)
- Audio format: Opus mono @ 24 kbps (~3 KB/s); server billing assumes a 16 kbps floor (~2 KB/s)
- VAD on-device — silence is never uploaded
- DoA metadata as compact JSON: `[{t_ms: 0, az: 87}, {t_ms: 1200, az: 92}, ...]`
- Retry with exponential backoff if offline; local circular buffer ≥ 30 minutes
- Raw audio is retained in private Storage (`audio/{user_id}/{recording_id}.{ext}`) and is replayable from the memory detail page via short-lived signed URL
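The upload contract above can be sketched as a pure function that builds the URL and headers for one chunk, plus a backoff schedule for retries. The function names, the parameter names, and the backoff base and cap (1 s and 60 s) are assumptions made for illustration; only the endpoint path, header names, and query keys come from the contract. Note that `JSON.stringify` emits quoted keys, so the encoded `doa_json` is strict JSON.

```typescript
interface DoaSample {
  t_ms: number;
  az: number;
}

interface UploadRequest {
  url: string;
  headers: Record<string, string>;
}

// projectRef, deviceToken, and deviceId are placeholders supplied by the caller.
function buildUploadRequest(
  projectRef: string,
  deviceToken: string,
  deviceId: string,
  recordedAt: Date,
  doa: DoaSample[],
  contentType: "audio/wav" | "audio/opus",
): UploadRequest {
  const query =
    `recorded_at=${encodeURIComponent(recordedAt.toISOString())}` +
    `&doa_json=${encodeURIComponent(JSON.stringify(doa))}`;
  return {
    url: `https://${projectRef}.supabase.co/functions/v1/ingest-audio?${query}`,
    headers: {
      Authorization: `Bearer ${deviceToken}`,
      "X-Device-Id": deviceId,
      "Content-Type": contentType,
    },
  };
}

// Delay before retry number `attempt` (0-based): doubles each time, capped.
// The base and cap values are assumed, not taken from the contract.
function backoffMs(attempt: number, baseMs = 1000, capMs = 60000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}
```

A caller would pass the returned `url` and `headers` to whatever HTTP client it uses, sending the raw chunk as the body and waiting `backoffMs(n)` before the n-th retry when offline.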
6.2 Pairing & Provisioning (App ↔ HW)
- App generates a pairing QR code containing: `{ pairing_url, pairing_token, supabase_url }`
- User scans QR in app’s pair-device flow
- HW receives WiFi creds + Supabase device JWT over BLE GATT (one-time, write-only characteristic)
- HW writes its `mac_address` back to app over BLE notify
- App calls `POST /devices` to register, links to user account
- BLE characteristic UUIDs live in `shared/ble-uuids.ts`: single source of truth, imported by both firmware and app
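A minimal sketch of validating the QR payload before pairing proceeds is shown below. It assumes the QR code carries the three fields as a JSON object, which this plan does not state explicitly; `parsePairingPayload` is a hypothetical helper, and a real implementation would more likely use the Zod schemas in `packages/shared`.

```typescript
interface PairingPayload {
  pairing_url: string;
  pairing_token: string;
  supabase_url: string;
}

// Parse the QR text and check that all three fields are non-empty strings.
// Returns null instead of throwing so the UI can show a single "invalid QR" state.
function parsePairingPayload(raw: string): PairingPayload | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;

  const rec = data as Record<string, unknown>;
  const url = rec["pairing_url"];
  const token = rec["pairing_token"];
  const supabase = rec["supabase_url"];
  if (typeof url !== "string" || url === "") return null;
  if (typeof token !== "string" || token === "") return null;
  if (typeof supabase !== "string" || supabase === "") return null;

  return { pairing_url: url, pairing_token: token, supabase_url: supabase };
}
```

Rejecting malformed payloads at this point keeps the BLE write of WiFi credentials and the device JWT from ever starting with incomplete pairing data.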
6.3 BLE GATT Schema (V0.3+)
| Service | Characteristic | Direction | Purpose |
|---|---|---|---|
| ARCIVE_PROV | wifi_creds | App → HW (write) | SSID + password JSON |
| ARCIVE_PROV | device_jwt | App → HW (write) | Supabase upload token |
| ARCIVE_PROV | mac | HW → App (read+notify) | Device MAC |
| ARCIVE_CTRL | mute | App ↔ HW (read+write+notify) | 0=record, 1=muted |
| ARCIVE_STATUS | battery | HW → App (read+notify) | 0–100 |
| ARCIVE_STATUS | recording_state | HW → App (notify) | idle/recording/uploading/error |
| ARCIVE_STATUS | firmware_version | HW → App (read) | Semver string |
UUIDs to be generated once and committed to shared/ble-uuids.ts.
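The table above can be mirrored as typed data so that both sides check the same access rules. This is a sketch only: the `CharacteristicSpec` shape and both helpers are hypothetical, UUIDs are deliberately left out because they have not been generated yet, and the single-byte encoding of `mute` is an assumption.

```typescript
type Prop = "read" | "write" | "notify";
type Service = "ARCIVE_PROV" | "ARCIVE_CTRL" | "ARCIVE_STATUS";

interface CharacteristicSpec {
  service: Service;
  name: string;
  props: Prop[];
}

// Mirrors the schema table. UUIDs are omitted on purpose (not yet generated).
const GATT_SCHEMA: CharacteristicSpec[] = [
  { service: "ARCIVE_PROV", name: "wifi_creds", props: ["write"] },
  { service: "ARCIVE_PROV", name: "device_jwt", props: ["write"] },
  { service: "ARCIVE_PROV", name: "mac", props: ["read", "notify"] },
  { service: "ARCIVE_CTRL", name: "mute", props: ["read", "write", "notify"] },
  { service: "ARCIVE_STATUS", name: "battery", props: ["read", "notify"] },
  { service: "ARCIVE_STATUS", name: "recording_state", props: ["notify"] },
  { service: "ARCIVE_STATUS", name: "firmware_version", props: ["read"] },
];

// Names of characteristics a peer can write but never read back (the secrets).
function writeOnlyCharacteristics(): string[] {
  return GATT_SCHEMA
    .filter((c) => c.props.length === 1 && c.props[0] === "write")
    .map((c) => c.name);
}

// Decode the mute value: 0 = record, 1 = muted. One-byte encoding is assumed.
function decodeMute(byte: number): "record" | "muted" {
  if (byte === 0) return "record";
  if (byte === 1) return "muted";
  throw new RangeError(`unexpected mute value: ${byte}`);
}
```

A unit test asserting that `writeOnlyCharacteristics()` contains exactly `wifi_creds` and `device_jwt` would catch an accidental change that makes credentials readable.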
6.4 Realtime sync (Backend → App)
- App subscribes to Supabase Realtime channel for `recordings` table filtered to current user
- HW upload arrives → row inserted → app sees it appear in real time
- Pipeline status updates (`pending → processing → done`) push the same way
- App never polls
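On the client, push-only sync amounts to folding each incoming row event into local state. The reducer below is a simplified sketch: the `RowEvent` shape is a stand-in rather than the Supabase Realtime payload type, and the rule that a status never moves backwards is a design assumption added here to tolerate late or duplicated events.

```typescript
type Status = "pending" | "processing" | "done";

interface RecordingRow {
  id: string;
  status: Status;
}

// Simplified stand-in for a pushed change event (not the Supabase payload type).
interface RowEvent {
  type: "INSERT" | "UPDATE";
  row: RecordingRow;
}

const ORDER: Status[] = ["pending", "processing", "done"];

// Apply one pushed event to the local list without mutating it.
// Assumed rule: a row's status never regresses, so stale events are ignored.
function applyEvent(state: RecordingRow[], ev: RowEvent): RecordingRow[] {
  const existing = state.find((r) => r.id === ev.row.id);
  if (existing === undefined) return [...state, ev.row];
  if (ORDER.indexOf(ev.row.status) < ORDER.indexOf(existing.status)) return state;
  return state.map((r) => (r.id === ev.row.id ? ev.row : r));
}
```

The subscription callback would call `applyEvent` for every message, which is why the list view and status badges can update without any polling loop.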
6.5 Privacy & Consent Contract
- HW LED must be visibly on (breathing) whenever mic is unmuted — non-overridable from firmware
- Mute button is hardware-level: cuts I2S clock to mic array, not software-bypassable
- App displays a “currently recording” indicator if any paired device is unmuted
- First-run consent screen must be acknowledged before any recording is ingested
- Device token can be revoked from app → HW receives revoke over BLE → wipes local creds
7. Cost Model
V0 (50 users, ~1 hr/day each, phone mic + 5 dev-kit HW units)
| Line | Estimate |
|---|---|
| Supabase | $0 (free tier) |
| Vercel | $0 (free tier) |
| Groq Whisper | ~$60/mo (50 × 30 hr × $0.04) |
| Voyage embeddings | ~$2 |
| Gemini Flash summaries | ~$5 |
| Sentry + PostHog | $0 (free tiers) |
| Recurring total | ~$70/mo |
| HW capex (one-time) | ~$270 (5× Seeed XVF3800 + XIAO dev-kits @ $54.50) |
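The V0 recurring total can be reproduced from the table's own figures and checked against the sub-$2/mo ceiling in Principle 6. The variable names are illustrative; the numbers are copied from the table.

```typescript
// Figures copied from the V0 table.
const users = 50;
const hoursPerUserPerMonth = 30; // ~1 hr/day
const groqPerHour = 0.04;

const transcription = users * hoursPerUserPerMonth * groqPerHour; // 60
const embeddings = 2;
const summaries = 5;

// 67, which the table rounds up to ~$70/mo.
const recurring = transcription + embeddings + summaries;

// About $1.34 per user per month.
const perUser = recurring / users;

// Principle 6: cost per user must stay under $2/mo at the free tier.
const underCeiling = perUser < 2;
```

Transcription is roughly 90% of the recurring total here, so the per-user figure is driven almost entirely by hours captured and the STT rate.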
V0.1 (Wk 4–6, 100 users + 10 HW dev-kits)
| Line | Estimate |
|---|---|
| Supabase | $0–25 |
| Vercel | $0 |
| AI services (Groq + Deepgram + Voyage + Gemini) | ~$150 |
| Pyannote on Modal | ~$30 |
| Recurring total | ~$200/mo |
| HW capex (one-time, marginal) | ~$270 (5 more dev-kits) |
V0.2 (500 users, mix of free + Pro, 15 HW prototype units)
| Line | Estimate |
|---|---|
| Supabase Pro | $25 |
| Vercel Pro | $20 |
| Transcription | ~$300 |
| Embeddings + summaries | ~$50 |
| Voice (Cartesia + Deepgram streaming) for Pro users | ~$200 |
| Pyannote (self-hosted on Modal) | ~$80 |
| Sentry/PostHog | $0–50 |
| Recurring total | ~$700/mo |
| HW capex (one-time) | ~$1,500 (5 more dev-kits + 3D-print materials + LiPo batteries + USB-C + enclosure iteration) |
| Revenue (assume 10% Pro at $12) | $600/mo |
| Net (recurring) | roughly break-even |
V1.0 (5,000 users + 200 HW units in field)
| Line | Estimate |
|---|---|
| Infra (Supabase, Vercel, LiveKit) | ~$500 |
| AI services | ~$3,000 |
| Hardware COGS amortized | ~$1,500/mo (assuming $30 unit cost, 200 sold over 4 mo) |
| Total | ~$5,000/mo |
| Revenue (assume 12% Pro, 3% Family, hardware margin) | ~$15,000/mo |
| Gross margin | ~67% |
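The V1.0 margin can be checked with the arithmetic below. The subscription split uses the percentages in the table and the tier prices from §8; the hardware contribution is not itemized in the table, so the figure here is simply what remains after subscriptions and should be read as an inference, not a stated number.

```typescript
// Inputs from the V1.0 table and the tier prices in section 8.
const users = 5000;
const proShare = 0.12;
const proPrice = 12;
const familyShare = 0.03;
const familyPrice = 25;

const proRevenue = users * proShare * proPrice;          // 7200
const familyRevenue = users * familyShare * familyPrice; // 3750
const subscriptionRevenue = proRevenue + familyRevenue;  // 10950

const statedRevenue = 15000; // table: ~$15,000/mo
const statedCost = 5000;     // table: ~$5,000/mo

// Inferred, not stated: what hardware margin must contribute to reach the total.
const impliedHardwareMargin = statedRevenue - subscriptionRevenue; // about 4050

// Gross margin = (revenue - cost) / revenue, about 0.667.
const grossMargin = (statedRevenue - statedCost) / statedRevenue;
```

Subscriptions alone would put gross margin near 54% on the same cost base, so the ~67% figure depends on hardware margin holding up.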
8. Monetization Tiers
| Tier | Price | Target | Key features |
|---|---|---|---|
| Free | $0 | Trial / casual | 5 hr/mo, 1 default companion role, text-only |
| Pro | $12/mo | Power individual | 50 hr/mo, all built-in roles, voice talk-back (5 hr), graph view, MCP access, exports |
| Family | $25/mo | Households / caregiving | 5 members, shared spaces, caregiver role, group mode, multi-device |
| Marketplace | rev-share | Creators + buyers | Custom roles published by creators, 70/30 split |
| Caregiving B2B | $50–100/resident/mo | Assisted living | Compliance, audit, multi-staff access |
| Education B2B | per-student/school | Schools / tutors | Study companion, syllabus-aware retrieval |
| Therapy B2B | $50–100/therapist/mo | Practitioners | Session capture (with consent), patient role-play for practice |
| API / Memory-as-a-Service | usage-based | AI builders | MCP endpoint, vector + entity APIs |
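The hour limits in the Free and Pro rows translate into a simple quota check at ingest time. The sketch below is illustrative: the function names are hypothetical, and the Family tier is left out because the table gives it no monthly hour limit.

```typescript
// Only tiers with a stated monthly hour limit are modelled.
type Tier = "free" | "pro";

const MONTHLY_HOURS: Record<Tier, number> = { free: 5, pro: 50 };

// Seconds of capture left this month for the given tier.
function remainingSeconds(tier: Tier, usedSeconds: number): number {
  return Math.max(0, MONTHLY_HOURS[tier] * 3600 - usedSeconds);
}

// A chunk is accepted only if it fits entirely within the remaining quota.
function canIngest(tier: Tier, usedSeconds: number, chunkSeconds: number): boolean {
  return chunkSeconds <= remainingSeconds(tier, usedSeconds);
}
```

With 30-second chunks, a free-tier user who has already used 4 hr 59 min 45 s would have the next chunk rejected, since only 15 seconds remain.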
9. Risk Register
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Always-on recording legal exposure | High | Existential | Consent screen V0, hardware LED non-overridable, device-side mute (gates I2S clock at hardware level), 2-party consent default |
| AI vendor price hikes | Medium | High | Multiple STT providers integrated (Groq + Deepgram), swappable agent layer (Pipecat + Claude Agent SDK + fallback OpenAI Realtime driver) |
| Hardware delays block roadmap | Medium | Medium | Both tracks decoupled at the integration contract (§6); HW running on Seeed reference board through Phase 3 means delays are firmware/enclosure scope, not platform-level. App always works without device. |
| Speaker re-ID quality poor | Medium | High | Self-host Pyannote on Modal, allow user manual labeling, build feedback loop into UX |
| Competitors (Limitless, Plaud, Friend) | High | Medium | Wedge = AI roles + group mode + platform/MCP + variants, not capture alone |
| Free-tier abuse | Medium | Medium | 5-hour cap, device-bound tokens, abuse rate-limits in Edge Functions |
| Privacy breach / data leak | Low | Existential | Encryption at rest, short retention defaults, per-user delete-all flow, SOC 2 Type 1 path planned for V1.0 |
| ESP32-S3 device-side WebRTC at edge of chip capability | Medium | Low | V0.3 ships HTTPS chunked upload + backend LiveKit bridge (proven path). esp-webrtc evaluated as Phase 2 research spike; if not viable, bridge remains permanent — group-mode quality unaffected, only data path changes. |
| Voice talk-back latency exceeds 1.5s budget | Medium | High | Phase 2 ships explicit benchmarking (01_SOFTWARE_PLAN.md §1.8). Mitigations: co-locate Deepgram + Cartesia regions with user; switch to Cartesia edge endpoints; downgrade LLM to 7B-on-Groq for faster first-token. |
| iOS App Store rejection of ambient mic capture | Medium | High | Explicit App Review note: user-initiated recording, persistent visual indicator (LED on HW, banner in app), consent screen on first launch; provide test account with HW in dev mode |
| Variant feature creep (cellular, screen, local) before V1.0 lands | Medium | Medium | V1.0 is clip + tabletop only; variants are architected for, not committed. Schema accommodates them without migration. |
10. Repository Layout

    arcive/
    ├── docs/                    # ← you are here
    │   ├── 00_MASTER_PLAN.md
    │   ├── 01_SOFTWARE_PLAN.md
    │   └── 02_HARDWARE_PLAN.md
    │
    ├── apps/
    │   ├── web/                 # Next.js 15 PWA (V0 → onwards)
    │   └── mobile/              # Expo / React Native (V0.2 → onwards)
    │
    ├── packages/
    │   ├── db/                  # Supabase schema, migrations, generated types
    │   ├── shared/              # Zod schemas, agent interface, BLE UUIDs
    │   └── agents/              # Role definitions, system prompts, tools
    │
    ├── backend/
    │   ├── functions/           # Supabase Edge Functions
    │   ├── workers/             # Queue workers (pgmq) for pipeline steps
    │   └── mcp/                 # MCP server (V0.3+)
    │
    ├── firmware/                # ESP32-S3 firmware (V0.3+)
    │   ├── src/
    │   ├── platformio.ini
    │   └── tests/
    │
    └── shared/
        └── ble-uuids.ts         # Single source of truth for HW + App

11. Decision Log
| Date | Decision | Reason |
|---|---|---|
| 2026-05-03 | Build software-first, defer custom HW to Phase 4 | Hardware kills startups; software validates demand cheaply |
| 2026-05-03 | Web PWA before native mobile | 3-week ship, no app review, faster iteration |
| 2026-05-03 | Groq Whisper over AssemblyAI for V0 | ~10x cheaper, no diarization needed yet |
| 2026-05-03 | HNSW over ivfflat in pgvector | Better recall, scales further, available since pgvector 0.5.0 |
| 2026-05-03 | Supabase as full backend | Single vendor, single dashboard, generous free tier |
| 2026-05-03 | Schema includes people/roles/sessions from V0 | Avoids migration pain when companion features land |
| 2026-05-03 | MCP-first internal retrieval API | Doubles as B2B/platform offering later |
| 2026-05-03 | HW v0 = white-label, HW v1 = custom XVF3800 | De-risk hardware before custom PCB investment |
| 2026-05-03 | LiveKit for group mode media layer | Open source, free tier, OpenAI Realtime uses it |
| 2026-05-03 | Stripe + RevenueCat for billing | RevenueCat handles iOS/Android/web in one SDK |