Original prepared: 4 May 2026 · Status updated: 6 May 2026 · Scope: Backend, App & SW | HW arriving separately
Context & Scope
ARCIVE is a HW & SW platform that empowers and enriches the lives of its users by lowering the barrier to record, retrieve, interact with, and create memories and thoughts — past, present & future — for individuals and their friends and family.
This task list covers all software and backend work that can be progressed while hardware is in development. Tasks are organised by software milestone and versioned to align with the product roadmap.
TL;DR — what’s changed since 4 May
- SW-M1 + SW-M2 are complete (V0 + V0.1 shipped, web + mobile)
- SW-M3 is partly shipped — text talk-back live; voice paused per ADR-0010
- SW-M4 mostly remains as V1.0 roadmap
- Auto-tagging shipped beyond spec — typed topic graph (people/places/projects/themes/events), not flat tags. See ADR-0012
- Three items need a decision call — E2E encryption, version-numbering, whether SW-M4 features move up
Product Version Roadmap
| Version | Original scope (4 May) | Status (6 May 2026) |
|---|---|---|
| V0 | Dictaphone + Transcription — HW sends audio to phone. Backend receives, transcribes, and stores. App shows sessions with transcript. Core infrastructure established. | ✅ Done (web + mobile) |
| V0.1 | Semantic Memory Space — Sessions embedded into a vector space. App shows an Obsidian-style graph view of sessions clustered by topic/similarity. Natural language search enabled. | ✅ Done (web + mobile, mobile graph in PR #13) |
| V0.2 | Talk-Back (Conversational Memory) — User can ask questions about their own memories. AI retrieves relevant sessions and generates a grounded conversational response. | ⚠️ Partial — text shipped (/roles/[id]/chat); voice paused per ADR-0010 |
| V0.3 | Status Display (HW-dependent) — Hardware status visible in app (battery, connectivity, recording state). Pending HW delivery — SW prep work only in this phase. | 🔄 Reinterpreted — V0.3 is now Topic Graph + Mobile Parity + Group Conversations. HW status display deferred to release prep (no Apple Developer account yet). See Discussion Items below. |
SW-M1 — Secure Core Archive
Goal: Foundation is solid. Data flows from device to cloud securely. Transcription works. V0 app is functional.
| Milestone | Task | Description | Priority | Status | Notes |
|---|---|---|---|---|---|
| SW-M1 | Secure cloud architecture setup | Design and deploy the core cloud infrastructure. Must support end-to-end encryption for all stored data (audio, transcripts, metadata). Zero-loss storage guarantee. | Critical | ⚠️ Shipped, with caveat | Supabase Postgres + Storage + RLS + per-user isolation, zero-loss via durable storage + DLQ. Encryption is at-rest, not E2E. True E2EE is incompatible with server-side AI features (RAG, semantic search, agent retrieval). See Discussion Items #1. |
| SW-M1 | User authentication & accounts | Implement secure user auth (email/social login). Each user gets an isolated encrypted vault. Multi-device session support. | Critical | ✅ Shipped | Magic-link auth via Resend (send.arcive.io, verified subdomain). Multi-device session via Supabase Auth. |
| SW-M1 | Audio file ingestion pipeline | Backend pipeline to receive raw audio from the HW device via Bluetooth/WiFi. Store securely with timestamp, duration, device ID metadata. HW will send audio files; pipeline must handle variable file sizes and formats. | Critical | ✅ Shipped (HW path scaffolded) | PWA capture + mobile capture (Expo) live and uploading to Supabase Storage. HW BLE/WiFi path uses the same /ingest semantics — wires up when device arrives. |
| SW-M1 | Transcription service integration | Auto-transcribe incoming audio. Store transcript alongside audio with word-level timestamps if possible. Dictaphone + transcription baseline. | Critical | ✅ Shipped | Whisper running on Modal (backend/modal/transcribe.py). Word-level timestamps captured. |
| SW-M1 | Basic companion app — audio playback & transcript view | Mobile app (iOS + Android) that shows a list of recorded sessions. Tap to play audio and read the auto-generated transcript. V0 deliverable — minimal UI, functional only. | High | ✅✅ Shipped beyond spec | Web AND mobile (Expo). Today feed grouped by date bucket, drag-to-seek, resume position, signed-URL playback, offline browse. Far past “minimal UI.” |
| SW-M1 | Session metadata model | Define the data model for a ‘memory session’: timestamp, location (optional), duration, audio URL, transcript text, tags (empty for now). Schema must be extensible for V0.1 embeddings. | High | ✅ Shipped | memories table with timestamp, duration, audio URL, transcript, status. Schema extended with embeddings, typed topics, edges, kinds — held up across every consumer without rework. |
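The extensible session record described in the last row can be sketched as a minimal data model. This is a hedged illustration only — field names are assumptions and may not match the actual `memories` table columns:

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

@dataclass
class MemorySession:
    """One 'memory session' row. Later milestones (V0.1 embeddings,
    typed topics) extend this shape without reworking existing fields."""
    id: str
    timestamp: datetime
    duration_s: float
    audio_url: str
    transcript: str
    status: str = "pending"                        # pipeline state
    location: Optional[str] = None                 # optional GPS/place reference
    tags: list[str] = field(default_factory=list)  # empty in V0, typed topics later
    embedding: Optional[list[float]] = None        # added for the V0.1 semantic space
```

The point of the sketch is the last three fields: they default to empty/None, so V0 rows stay valid as the schema grows.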
SW-M2 — Semantic Memory Space (V0.1)
Goal: AI Memory Retrieval feature live. Sessions live in an embedded space, searchable and visualisable.
| Milestone | Task | Description | Priority | Status | Notes |
|---|---|---|---|---|---|
| SW-M2 | Embedding generation for sessions | After transcription, generate vector embeddings for each session using a language model (e.g. OpenAI embeddings or local model). Store in a vector database. V0.1 — enables the ‘Obsidian graph’ semantic space. | Critical | ✅ Shipped | embed-step edge function in pipeline, pgvector storage. Gemini embeddings primary; vendor strategy in ADR-0011. |
| SW-M2 | Semantic similarity engine | Given a query or session, retrieve the N most semantically related sessions using vector similarity search (e.g. cosine distance). Core of AI Memory Retrieval feature. | Critical | ✅ Shipped | match_memories RPC, cosine distance. Used by search + agent retrieval. |
| SW-M2 | Graph view — session clustering UI | App screen visualizing sessions as nodes in a 2D graph, clustered by semantic similarity. Tap a node to open session. Zoom/pan supported. V0.1 deliverable — the ‘embedded space’ view. | High | ✅ Shipped | Universe view on web; mobile Universe in PR #13 (awaiting merge). React Force Graph (web) / react-native-skia + d3-force (mobile). Pan + pinch + double-tap reset on mobile. |
| SW-M2 | Natural language search | Text search bar that queries the embedding space. ‘Find me notes about the project meeting last month’ returns ranked sessions. Part of AI Memory Retrieval feature. | High | ✅✅ Shipped beyond spec | Hybrid FTS + semantic search, plus topic:Daniel operator with intersection semantics. Substring fallback via ILIKE. |
| SW-M2 | Auto-tagging & categorization | AI auto-assigns category tags to each session (e.g. work, personal, ideas, health). Tags surface in the app for filtering. Feeds into graph clustering logic. | Medium | ✅✅ Shipped well beyond spec | Original spec was flat tags. Delivered: typed topic graph with kinds (person, place, project, theme, event), canonical entity resolution via hybrid pg_trgm + pgvector, topic-shared edges in the Universe, /topic/[id] detail pages, /topics index, inline highlights on transcripts. See ADR-0012. |
| SW-M2 | AI indexing by date, location, topic | Layer on top of embeddings: index sessions by extracted date references, detected topics, and (if available) GPS metadata from device. Enables richer search queries. | Medium | ⚠️ Partial | Date ✅ (timestamps everywhere). Topic ✅✅ (full entity graph). Location ⏳ pending — lands with multimodal δ.1 (EXIF → place topics, see discussions/2026-05-04_multimodal_expansion.md). |
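The core of the semantic similarity engine above can be sketched in plain Python — a toy in-memory stand-in for the pgvector-backed `match_memories` RPC, assuming each session carries an `embedding` list:

```python
import math

def cosine_distance(a, b):
    """Cosine distance between two vectors: 0 = identical direction, 2 = opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def match_memories(query_vec, sessions, n=3):
    """Return the n sessions nearest to the query embedding (smallest distance)."""
    ranked = sorted(sessions, key=lambda s: cosine_distance(query_vec, s["embedding"]))
    return ranked[:n]
```

In production this ranking happens inside Postgres; the sketch just makes the distance-then-top-N logic concrete.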
SW-M3 — Social Interaction Layer & Talk-Back
Goal: Users can converse with their memories (V0.2). Early sharing and circle features enabled.
| Milestone | Task | Description | Priority | Status | Notes |
|---|---|---|---|---|---|
| SW-M3 | Talk-back / conversational memory interface | Chat-style UI where user types or speaks a question and the AI responds using relevant past sessions as context. ‘What was I thinking about last Tuesday?’ V0.2 deliverable — the ‘interact’ pillar of ARCIVE. | Critical | ⚠️ Partial — text yes, voice paused | Text shipped as /roles/[id]/chat and /talkback. Voice paused per ADR-0010 until (a) Pipecat-vs-speech-to-speech decision has a clearer signal, and (b) Apple Developer / EAS unblocks the mobile native dependency. |
| SW-M3 | RAG pipeline for talk-back | Build retrieval-augmented generation (RAG) backend: take a user query, retrieve relevant sessions from the vector store, pass them as context to a generative model, return a grounded answer. Backend for talk-back interface. | Critical | ✅ Shipped | backend/mcp/arcive-memory-mcp/ provides retrieval as an internal MCP server. Used by /api/chat. Topic-aware RPCs (memories_by_topic, related_topics) coming next. |
| SW-M3 | Session sharing (1-to-1) | Allow a user to share a specific session (audio + transcript) with another ARCIVE user via a secure link or in-app share. Early version of the social layer. | Medium | ✅ Shipped as Spaces | Sharing via Spaces (named groups) covers 1-to-1 by creating a 2-person Space. |
| SW-M3 | Family/friend group permissions | Create ‘circles’ — named groups with permission levels. A session can be shared with a circle. Members see shared sessions in their app. SW-M3 Social Interaction Layer milestone. | Medium | ✅✅ Shipped beyond spec | Spaces (= circles) + consent gate on agent retrieval (ADR-0007). Agents will not surface memories involving people who haven’t consented — stronger than original spec. |
| SW-M3 | Notifications & activity feed | Push notifications for new shared sessions from circle members. In-app feed showing recent activity across the user’s network. | Low | ❌ Not built | Easy add when Spaces has real shared activity at scale. Agree with original Low priority. |
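The RAG flow behind talk-back reduces to "retrieve relevant sessions, then assemble a grounded prompt." A hedged sketch of that prompt-assembly step, with illustrative field names (not the actual /api/chat implementation):

```python
def build_grounded_prompt(question, retrieved_sessions, max_ctx=3):
    """Assemble the context block passed to the generative model.
    Each retrieved session is quoted with its timestamp so the answer
    stays grounded in the user's own memories."""
    context_lines = [
        f"[{s['timestamp']}] {s['transcript']}" for s in retrieved_sessions[:max_ctx]
    ]
    context = "\n".join(context_lines)
    return (
        "Answer using ONLY the memory excerpts below. "
        "If they are not relevant, say you don't know.\n\n"
        f"Memories:\n{context}\n\nQuestion: {question}"
    )
```

The "say you don't know" instruction is what keeps responses grounded rather than hallucinated.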
SW-M4 — Full Platform Integration & Legacy Features
Goal: Platform-level completeness. Legacy, future messages, generative artifacts, and HW-SW sync optimised.
| Milestone | Task | Description | Priority | Status | Notes |
|---|---|---|---|---|---|
| SW-M4 | AI memory artifact generation | Generate shareable ‘memory artifacts’ from a set of sessions: summary doc, highlight transcript, or narrative story. User selects sessions, AI composes output. Feeds into Legacy Packaging feature. | High | ❌ Not built | Per-memory summaries exist; multi-session compose does not. Recommend: V1.0 candidate. Real product feature — fits the “create memories” pillar of the thesis. |
| SW-M4 | Legacy archive & chapter curation | AI groups a user’s sessions into life ‘chapters’ (by time period or topic). User can name, annotate, and lock chapters. Chapters form a long-term archive. SW-M4 — Full platform integration. | High | ❌ Not built | Genuinely missing from current roadmap. Recommend adding — strong narrative/legacy hook, distinctive vs other voice-journal products. |
| SW-M4 | Generative future boards | Shared workspace for a circle: pin voice notes, text ideas, and goals to a future-oriented board. Boards are collaborative and timestamped. Future timeframe feature. | Medium | ❌ Not built | Could ride on Spaces + Roles. V1.x candidate. |
| SW-M4 | Future message scheduling | User records a message and sets a delivery date/trigger. On that date, the message unlocks for themselves or a designated recipient. Future timeframe feature. | Medium | ❌ Not built | Small build (cron + delivery). Charming feature. V1.x candidate. |
| SW-M4 | Privacy controls + data export (ZIP) | Granular privacy settings per session and circle. Full data export (ZIP of audio + transcripts + metadata). Required feature set. | High | ⚠️ Partial | Privacy controls ✅ (Spaces consent + ADR-0007 retrieval gate). Data export ❌ — required for GDPR/CCPA. Recommend moving to production-prep checklist now (cheap while schema is small). |
| SW-M4 | HW-SW sync optimization | Reliable sync between device and the cloud. Handle offline buffering (device stores locally, syncs when connected). Full platform integration — SW-M4 milestone. | High | ⏳ Pending HW | Mobile-side offline queue (SQLite + auto-flush on foreground) already shipped — ready to absorb HW path. Aligned with current “Apple Developer + EAS” deferral. |
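The offline-buffering pattern in the HW-SW sync row can be illustrated with a minimal SQLite sketch. Table and function names are hypothetical, not the shipped Expo implementation:

```python
import sqlite3

def make_queue(conn):
    conn.execute("""CREATE TABLE IF NOT EXISTS upload_queue (
        id INTEGER PRIMARY KEY, path TEXT NOT NULL, uploaded INTEGER DEFAULT 0)""")

def enqueue(conn, path):
    """Record a local recording for later upload (works offline)."""
    conn.execute("INSERT INTO upload_queue (path) VALUES (?)", (path,))

def flush(conn, upload_fn):
    """Attempt every pending upload; mark a row done only after its upload
    succeeds, so a crash mid-flush never loses a recording."""
    sent = 0
    for row_id, path in conn.execute(
            "SELECT id, path FROM upload_queue WHERE uploaded = 0").fetchall():
        if upload_fn(path):  # returns False while offline; retried next flush
            conn.execute("UPDATE upload_queue SET uploaded = 1 WHERE id = ?", (row_id,))
            sent += 1
    return sent
```

The same mark-done-after-success discipline applies whether the source is the phone mic or, later, the HW device's local buffer.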
What we built that wasn’t in the original list
These exist today but aren’t in the 4 May doc — flagging so you know what’s there:
| Item | Why it matters |
|---|---|
| Pipeline architecture — pgmq queue, dead-letter table with 30-day TTL, edge functions per step, retry semantics, end-to-end pipeline test (PR #10) re-run on every schema change | Production-grade reliability — failures don’t lose data, every step is observable |
| AI vendor strategy (ADR-0011) — Gemini Flash → Anthropic Haiku → Groq Llama 3.3 70B fallback chain | Cost + availability resilience; one vendor outage doesn’t take the product down |
| Diarization + speaker re-identification — Pyannote on Modal | Multi-speaker handling; Daniel/daniel/Dan collapse to one person |
| Server-side audio transcode — webm/Opus → m4a/AAC on Modal | Cross-browser/iOS playback parity |
| Offline-first mobile — SQLite queue, optimistic rows, resume position per recording, grouped feed, share-to-space | App works on flights, in elevators; uploads catch up on reconnect |
| Realtime sync via Supabase Realtime | Today feed updates without refresh |
| PWA share-target + offline browse | Capture from any iOS/Android share sheet |
| Roles with personas (Tutor, Brainstorm Partner, Caregiver) | Persona layer above raw RAG — different conversational frames over the same memory store |
| Topic graph — full entity-style typed topics with canonical resolution | Promoted from “auto-tagging” (Medium) to first-class navigation surface |
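The retry + dead-letter pattern from the pipeline row can be sketched as a simplified in-process analogue of the pgmq/DLQ setup (not the actual edge-function code):

```python
def run_step(job, step_fn, max_retries=3, dead_letter=None):
    """Run one pipeline step with bounded retries. Jobs that exhaust their
    retries land in the dead-letter list instead of being dropped, so no
    data is silently lost and failures stay observable."""
    if dead_letter is None:
        dead_letter = []
    last_error = None
    for _ in range(max_retries):
        try:
            return step_fn(job)
        except Exception as exc:
            last_error = str(exc)
    dead_letter.append({"job": job, "error": last_error, "attempts": max_retries})
    return None
```

In the real pipeline, pgmq handles the queueing and the dead-letter table adds a 30-day TTL; the sketch only shows the never-drop invariant.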
Discussion items — need a decision call
1. End-to-end encryption — fundamental fork
The original SW-M1 row spec’d “end-to-end encryption for all stored data (audio, transcripts, metadata).” We currently have Supabase at-rest encryption only, not E2EE.
True E2EE is incompatible with server-side AI features — the server can’t compute embeddings, run RAG, or run agent retrieval over data it can’t decrypt. Three plausible paths:
| Option | Implication |
|---|---|
| (a) Walk back the E2EE promise | “Encrypted at rest, server-readable so AI features work.” What we have today. Standard for most AI products. |
| (b) Hybrid | E2EE on raw audio, server-readable on transcripts + embeddings + topics. Audio is recover-on-disk-only; AI features keep working. Modest dev cost. |
| (c) Full E2EE with client-side AI | Embeddings computed on-device; encrypted vector index; all RAG client-side. Multi-month rebuild, breaks public-MCP-server plans, kills mobile UX. |
Recommendation: option (b). Worth a 30-minute call to confirm.
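Option (b)'s field split can be illustrated with a toy sketch. The XOR keystream here stands in for a real AEAD cipher (e.g. AES-GCM) and is for illustration only — never production use; the point is which fields the server can read:

```python
def xor_stream(data: bytes, key: bytes) -> bytes:
    # Stand-in for a real AEAD cipher; XOR with a repeating key is NOT secure.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def prepare_upload(session, client_key):
    """Option (b): audio is encrypted client-side before upload; transcript and
    embedding stay server-readable so RAG and semantic search keep working."""
    return {
        "audio_blob": xor_stream(session["audio"], client_key),  # server cannot read
        "transcript": session["transcript"],                      # server-readable
        "embedding": session["embedding"],                        # server-readable
    }
```

The modest dev cost of (b) is roughly this: one client-held key, one encrypt step on upload, one decrypt step on playback.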
2. Version-numbering reconciliation
Original V0.2 = voice talk-back; original V0.3 = HW status display. Current V0.2 = mobile slice; current V0.3 = topic graph + group conversations. Two fixes:
- Option A: rename current V0.3 → V0.4, slot HW status as V0.3. Keeps original doc as canonical.
- Option B: treat docs/03_PROGRESS.md as canonical and update the task-list doc together.
Recommendation: option B — 03_PROGRESS reflects what actually shipped.
3. Should SW-M4 features move up the stack?
Original task list ranked these as High priority but after RAG/talk-back. Current roadmap has them at V1.0+. Two of them are genuinely distinctive:
- AI memory artifact generation (multi-session compose into doc/narrative/highlight)
- Legacy archive + chapter curation (life chapters)
These reinforce the “create memories” pillar of the thesis more than another infrastructure pass. Recommendation: keep at V1.0 candidate but add a brainstorm session before V0.3 closes, so the data model can absorb them without rework.
4. Data export (ZIP/JSON)
GDPR/CCPA-required, originally bundled into “privacy controls” SW-M4 row. Cheap to add now while schema is small. Recommendation: move to production-prep checklist (XS).
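The export itself is small — a minimal sketch of the ZIP bundle, with illustrative paths and manifest fields:

```python
import io
import json
import zipfile

def export_archive(sessions):
    """Bundle audio + transcripts + a JSON metadata manifest into one ZIP,
    built in memory so it can be streamed straight to the user."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        manifest = []
        for s in sessions:
            zf.writestr(f"audio/{s['id']}.m4a", s["audio"])
            zf.writestr(f"transcripts/{s['id']}.txt", s["transcript"])
            manifest.append({k: s[k] for k in ("id", "timestamp", "duration_s")})
        zf.writestr("metadata.json", json.dumps(manifest, indent=2))
    return buf.getvalue()
```

While the schema is small, this is close to the whole feature; it grows with each new table, which is the argument for doing it now.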
Updated forward roadmap
What’s left, ranked:
| # | Item | Size | Source | Phase |
|---|---|---|---|---|
| 1 | Merge mobile Universe (PR #13) | XS | γ.1 | Now |
| 2 | MCP topic RPCs (memories_by_topic, related_topics) | S | γ.3 | Next |
| 3 | Mobile role chat (text talk-back parity) | M | γ.2 | Next |
| 4 | E2EE decision call | XS | His SW-M1 | Now |
| 5 | Data export (ZIP/JSON) | XS | His SW-M4 | Now |
| 6 | Multimodal ingest groundwork (incl. EXIF→place topics) | L | δ.1 | After γ |
| 7 | AI memory artifact generation | M | His SW-M4 | V1.0 |
| 8 | Legacy archive + chapter curation | M | His SW-M4 | V1.0 |
| 9 | Apple Developer + EAS dev build | S | ε.1 | Release prep |
| 10 | Hardware pilot batch (50 units) | M | Master plan Phase 3 | Release prep |
| 11 | Voice talk-back resume | L | His SW-M3 | After Pipecat decision |
| 12 | Future message scheduling | S | His SW-M4 | V1.x |
| 13 | Generative future boards | M | His SW-M4 | V1.x |
| 14 | Notifications + activity feed | S | His SW-M3 | When Spaces has real activity |
| 15 | Public MCP server + role marketplace | L | Master plan Phase 4 | V1.0 |
Status legend
- ✅ Shipped — done as spec’d
- ✅✅ Shipped beyond spec — delivered more than originally requested
- ⚠️ Partial — some shipped, some pending or paused
- ⏳ Pending — blocked on dependency (HW, decision, etc.)
- ❌ Not built — on roadmap, not yet started
- 🔄 Reinterpreted — scope changed since original doc
Priority levels (unchanged from original)
- Critical — blocks V0/V0.x delivery
- High — core feature completeness
- Medium — important but deferrable
- Low — nice-to-have
Hardware-dependent tasks (V0.3 status display, HW-SW sync) can be prepared in SW but require device availability for end-to-end testing.