Atlas + SFMS — Master Execution Plan
Status: v1.0 · Owner: Engineering Lead · Companion to 00_README.md and Phase_0–9.md
The single roadmap that ties the 10 phases together: timeline, critical path, parallelisation, team/RACI, gate calendar, cross-cutting workstreams, and the design↔engineering integration. Read 00_README.md first for product/stack; this doc is the "how we run it" layer.
1. At a glance
- Product: Multi-tenant Data Platform + Smart Facility Management System (Next.js 15 / MongoDB).
- Shape: 10 gated phases (P0–P9), each closed by its own gate (G0–G9).
- Duration: ~29 weeks single-thread; ~22 weeks with P4–P8 partially parallelised.
- First customer: KTC reference engagement runs against the v1.0 platform (Phase_9.md §3.2).
- What this plan adds to the phase docs: the timeline/critical path, the team model, the gate calendar, the cross-cutting workstreams (design, security, QA, docs, observability) that span all phases, and the design-deliverable integration.
2. Critical path & sequencing
P0 ─▶ P1 ─▶ P2 ─▶ P3 ─┬─▶ P4 ─┬─────────────────────────▶ P9 ─▶ 🚀 v1.0
1w    2w    3w    4w  │   4w  │
                      │       ├─▶ P5 Dashboards (4w) ─────┤
                      │       ├─▶ P6 AI Brain (4w)   ─────┤
                      │       └─▶ P7 Workflow (3w)   ─────┤
                      └───────────▶ P8 Mobile/PWA (2w) ───┘
                                    (starts mid-P4, hardens after P5/P7)
- Strictly sequential: P0 → P1 → P2 → P3 → P4. This is the critical path; any slip here slips the launch.
- P3 before P4 — SFMS modules read the canonical data model.
- After P4 ships, P5/P6/P7 parallelise on the shared foundation (separate squads).
- P8 (Mobile/PWA) depends on shell (P2), modules (P4), dashboards (P5), approvals (P7); it starts its responsive audit mid-P4 and finishes after P5/P7.
- P9 integrates everything — regression, pentest, perf/soak, DR, deploy, hypercare.
Critical-path total (P0→P4 + P9): 1+2+3+4+4+2 = 16 weeks minimum. Adding the longest parallel branch (P5 or P6, 4 weeks) and the P8 hardening window that follows it (2 weeks) gives 16+4+2 = ~22 weeks elapsed.
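The elapsed-time figure is a longest-path calculation over the phase dependency graph. The sketch below is one way to check it, assuming the durations shown in the diagram and modelling P8 as a single 2-week block that starts after P5 and P7 (its early responsive audit is ignored). The `phases` map and `finishWeek` helper are illustrative names, not something defined in the phase docs.

```typescript
type PhaseId = "P0" | "P1" | "P2" | "P3" | "P4" | "P5" | "P6" | "P7" | "P8" | "P9";

interface Phase {
  weeks: number;        // duration from the sequencing diagram
  dependsOn: PhaseId[]; // phases that must finish first (assumed from section 2)
}

const phases: Record<PhaseId, Phase> = {
  P0: { weeks: 1, dependsOn: [] },
  P1: { weeks: 2, dependsOn: ["P0"] },
  P2: { weeks: 3, dependsOn: ["P1"] },
  P3: { weeks: 4, dependsOn: ["P2"] },
  P4: { weeks: 4, dependsOn: ["P3"] },
  P5: { weeks: 4, dependsOn: ["P4"] },
  P6: { weeks: 4, dependsOn: ["P4"] },
  P7: { weeks: 3, dependsOn: ["P4"] },
  // Assumption: P8 hardening is a 2-week block after P5 and P7.
  P8: { weeks: 2, dependsOn: ["P5", "P7"] },
  P9: { weeks: 2, dependsOn: ["P5", "P6", "P7", "P8"] },
};

// Earliest finish week = own duration + latest finish among dependencies.
function finishWeek(id: PhaseId, memo: Partial<Record<PhaseId, number>> = {}): number {
  const cached = memo[id];
  if (cached !== undefined) return cached;
  const phase = phases[id];
  let start = 0;
  for (const dep of phase.dependsOn) {
    start = Math.max(start, finishWeek(dep, memo));
  }
  const finish = start + phase.weeks;
  memo[id] = finish;
  return finish;
}

// Parallel squads: longest path through the graph.
const parallelWeeks = finishWeek("P9");

// Single squad: every phase runs back to back, so durations simply add up.
const singleThreadWeeks = (Object.keys(phases) as PhaseId[])
  .reduce((sum, id) => sum + phases[id].weeks, 0);

console.log(parallelWeeks, singleThreadWeeks); // 22 29
```

Changing a duration or a dependency in `phases` shows immediately whether the slip lands on the critical path or is absorbed by a parallel branch.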
3. Indicative timeline (week-by-week)
Assumes a single calendar start = Week 1. Parallel tracks share the post-P4 window.
| Weeks | Critical path | Parallel tracks | Gate |
|---|---|---|---|
| 1 | P0 Foundation | (design: token contract) | G0 end wk1 |
| 2–3 | P1 Requirements & Arch | (design: personas/journeys/IA/wireframes) | G1 end wk3 |
| 4–6 | P2 Core Platform | (design: hi-fi shell + admin) | G2 end wk6 |
| 7–10 | P3 Data Platform | (gateway agent track) | G3 end wk10 |
| 11–14 | P4 SFMS Modules | P8 responsive audit begins (wk13) | G4 end wk14 |
| 15–18 | — | P5 Dashboards · P6 AI · P7 Workflow (parallel squads) | G5/G6/G7 end wk18 |
| 19–20 | — | P8 Mobile/PWA hardening | G8 end wk20 |
| 21–22 | P9 QA/Sec/Deploy | pentest (started end-P7), hypercare | G9 → 🚀 wk22 |
If staffing only allows one squad post-P4, fall back to single-thread (P5→P6→P7→P8) and the timeline extends to ~29 weeks. The plan is designed to flex on team size without re-architecting.
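For the single-squad fallback, the week ranges follow from laying the phases end to end. A minimal sketch, assuming the phase durations from section 2 and the fallback order P5→P6→P7→P8; `layoutSequential` and the `Slot` shape are hypothetical helpers for illustration only.

```typescript
interface Slot {
  phase: string;
  startWeek: number; // inclusive, Week 1 = calendar start
  endWeek: number;   // inclusive
}

// Places each phase directly after the previous one.
function layoutSequential(plan: ReadonlyArray<[string, number]>): Slot[] {
  const slots: Slot[] = [];
  let nextWeek = 1;
  for (const [phase, weeks] of plan) {
    slots.push({ phase, startWeek: nextWeek, endWeek: nextWeek + weeks - 1 });
    nextWeek += weeks;
  }
  return slots;
}

// [phase, duration in weeks], in single-thread order.
const singleThreadPlan: ReadonlyArray<[string, number]> = [
  ["P0", 1], ["P1", 2], ["P2", 3], ["P3", 4], ["P4", 4],
  ["P5", 4], ["P6", 4], ["P7", 3], ["P8", 2], ["P9", 2],
];

const fallbackCalendar = layoutSequential(singleThreadPlan);

for (const slot of fallbackCalendar) {
  console.log(slot.phase + ": wk " + slot.startWeek + "-" + slot.endWeek);
}
```

Under these assumptions the calendar matches the parallel plan through week 14 (end of P4) and diverges afterwards, with P9 closing in week 29.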
4. Team model & RACI
Roles from the phase docs, consolidated. A = Accountable, R = Responsible, C = Consulted, I = Informed.
| Phase | Eng Lead | Backend | Data | SFMS | Frontend | AI | Workflow | Design | Security | QA | DevOps | Product |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| P0 Foundation | A/R | R | C | C | C | C | R | I | ||||
| P1 Req & Arch | A | C | C | C | C | C | C | R | R | C | C | R |
| P2 Core | A | R | R | R | R | C | C | C | ||||
| P3 Data | A | C | R | C | C | C | C | R | C | |||
| P4 SFMS | A | C | C | R | R | R | C | C | C | |||
| P5 Dashboards | A | C | R | C | C | R | C | C | ||||
| P6 AI | A | C | C | C | R | C | C | R | C | C | ||
| P7 Workflow | A | C | C | C | C | R | C | C | C | C | ||
| P8 Mobile | A | C | R | C | R | R | C | |||||
| P9 QA/Deploy | A | C | C | C | C | C | C | C | R | R | R | R |
Design Lead is R or C on every product-facing phase — design is not a P5-only concern. Security and QA are continuous, not just P9.
5. Gate calendar & sign-off
Each gate needs: all ACs met · required docs produced · Sev-1/Sev-2 closed · demo accepted (00_README.md §7). Sign-off recorded in _gates/Gate_GN_signoff.md.
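The four gate conditions can be evaluated mechanically before a sign-off meeting. The sketch below is one possible shape, not the project's actual tooling: the `GateEvidence` structure, the `Sev-3` level, and the sample IDs and file path are assumptions made for illustration.

```typescript
// Sev-1 and Sev-2 come from the gate rule; Sev-3 is an assumed lower level.
type Severity = "Sev-1" | "Sev-2" | "Sev-3";

interface GateEvidence {
  gate: string;
  acceptanceCriteria: { id: string; met: boolean }[];
  requiredDocs: { path: string; produced: boolean }[];
  openDefects: { id: string; severity: Severity }[];
  demoAccepted: boolean;
}

// Returns every reason the gate cannot be signed; an empty list means ready.
function gateBlockers(evidence: GateEvidence): string[] {
  const blockers: string[] = [];
  for (const ac of evidence.acceptanceCriteria) {
    if (!ac.met) blockers.push("AC not met: " + ac.id);
  }
  for (const doc of evidence.requiredDocs) {
    if (!doc.produced) blockers.push("Doc missing: " + doc.path);
  }
  for (const defect of evidence.openDefects) {
    // Only Sev-1 and Sev-2 block the gate; lower severities may stay open.
    if (defect.severity === "Sev-1" || defect.severity === "Sev-2") {
      blockers.push("Open " + defect.severity + ": " + defect.id);
    }
  }
  if (!evidence.demoAccepted) blockers.push("Demo not accepted");
  return blockers;
}

// Hypothetical evidence for G3.
const g3Evidence: GateEvidence = {
  gate: "G3",
  acceptanceCriteria: [{ id: "AC-1", met: true }],
  requiredDocs: [{ path: "recon_report.md", produced: true }],
  openDefects: [
    { id: "BUG-7", severity: "Sev-2" },
    { id: "BUG-9", severity: "Sev-3" },
  ],
  demoAccepted: true,
};

const g3Blockers = gateBlockers(g3Evidence);
console.log(g3Blockers); // [ 'Open Sev-2: BUG-7' ]
```

Returning the list of blockers rather than a boolean keeps the sign-off record specific about what is still outstanding.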
| Gate | Demo / proof | Signers (add Design Lead to all product gates) |
|---|---|---|
| G0 | Clone→run<15min; CI green; Storybook themes | Eng Lead, Product |
| G1 | FRS/NFRS/Arch/ADRs + personas/journeys/IA/wireframes approved | Eng, Product, Architect, Security, Design |
| G2 | Login→invite→audit→API key; shell in 3 themes; a11y clean | Eng, Backend, Security, Product, Design |
| G3 | KTC import end-to-end; live telemetry; recon report | Eng, Data, Security, Product |
| G4 | KTC fixture walkthrough; WO lifecycle; ceiling-plan | SFMS, Eng, Product, Design |
| G5 | KTC dashboard in all 3 view modes; visual-regression baseline | Frontend, Product, Design |
| G6 | AI chat + categorise ≥85% + RAG citations + safety negative test | AI, Security, Product |
| G7 | Build "monthly KPI report" workflow live; approval on mobile | Workflow, Eng, Product |
| G8 | Offline inspection on a tablet by an FM persona; Lighthouse ≥90 | Frontend, QA, Product, Design |
| G9 | Pentest clean; perf/soak; DR drill; blue-green cutover; 5d hypercare | Eng, Product, Security, DevOps, CS, SalesEng |
Recommendation (added to the plan): the pentest is scheduled to start at end of P7 so closure has runway in P9 (Phase_9.md risk). Likewise, accessibility and i18n are gated per phase, not deferred to P9 — fixing them at the end is the most expensive way.
In practice this means each of G1–G8 checks accessibility and i18n for the UI delivered in that phase, so P9 only has to regress them.
6. Cross-cutting workstreams (span all phases)
These don't live in one phase; they are continuous and owned end-to-end:
| Workstream | Owner | Cadence | Definition of done (per phase) |
|---|---|---|---|
| Design system | Design + Frontend | every phase | New components have Storybook stories (all states × themes); tokens-only; visual-regression baseline |
| Accessibility | QA(a11y) + Frontend | every UI phase | axe clean; keyboard parity; SR pass on new critical flows (WCAG 2.2 AA) |
| i18n (EN/JA) | Frontend + Product | every UI phase | no hardcoded strings; JA review on user-facing routes |
| Security | Security Lead | every phase | threat-model delta; SAST/dep-scan; deny-by-default RBAC tests |
| Observability | Backend + DevOps | from P2 | traces/metrics/logs on new services; SLOs documented |
| Documentation-as-code | All | every PR | ADRs, OpenAPI, runbooks, changelog updated in the same PR |
| Test pyramid | QA + all | every phase | ≥80% unit on business logic; integration + e2e for new flows |
| OpenAPI/SDK | Backend | from P2 | spec is source of truth; CI blocks drift; SDK regenerated |
7. Design ↔ engineering integration (the UX guarantee)
Because UX is a primary success axis, design is wired into the SDLC, not bolted on. Source of truth: docs/design/.
- Design precedes build, per phase. Personas/journeys/IA/wireframes are a G1 input; hi-fi for a module is produced in its phase before the screens are coded.
- Tokens are the contract. One CSS-variable token set (design_system.md) consumed by Tailwind + Figma; no raw colour in code (lint-enforced).
- Storybook = the acceptance surface. A component/widget is "done" only with a story covering all states × themes.
- State matrix is QA-tested. Every data surface implements loading/empty/error/offline/no-permission/etc. (UX_patterns.md §1), checked at the gate.
- A11y + perf budgets are gate conditions, not aspirations (UX_patterns.md §3, §5).
- Design Lead signs the product gates (G1, G2, G4, G5, G8).
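The "no raw colour in code" rule lends itself to a simple automated check. The sketch below is a simplified stand-in for whatever lint rule the project adopts: it flags hex and `rgb()`/`rgba()` literals outside an allow-list of token files. The function names and the file names `tokens.css` and `button.css` are hypothetical, and a regex scan like this will produce false positives (for example, a 3 to 8 character hex-like fragment identifier).

```typescript
// Hex colours (#rgb, #rgba, #rrggbb, #rrggbbaa) and rgb()/rgba() calls.
const RAW_COLOUR = /#(?:[0-9a-fA-F]{8}|[0-9a-fA-F]{6}|[0-9a-fA-F]{3,4})\b|\brgba?\s*\(/g;

interface ColourViolation {
  file: string;
  line: number; // 1-based
  match: string;
}

// Token files are the only place raw colour values may appear.
function lintRawColours(file: string, source: string, tokenFiles: string[]): ColourViolation[] {
  if (tokenFiles.indexOf(file) !== -1) return [];
  const violations: ColourViolation[] = [];
  source.split("\n").forEach((text, index) => {
    const found: string[] = text.match(RAW_COLOUR) || [];
    found.forEach((match) => {
      violations.push({ file, line: index + 1, match });
    });
  });
  return violations;
}

const sampleCss = [
  ".btn { color: #ff00aa; }",
  ".card { background: var(--color-surface); }",
  ".nav { border-color: rgba(0, 0, 0, 0.2); }",
].join("\n");

const colourViolations = lintRawColours("button.css", sampleCss, ["tokens.css"]);
console.log(colourViolations.map((v) => v.line + ": " + v.match)); // [ '1: #ff00aa', '3: rgba(' ]
```

Line 2 passes because it references a CSS variable, which is the pattern the token contract expects everywhere outside the token set.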
8. Consolidated risk register (top items across phases)
| # | Risk | Phase | L | I | Mitigation | Owner |
|---|---|---|---|---|---|---|
| 1 | Multi-tenant data leak | P2 | 1 | 5 | Mandatory tenantId at repo layer + cross-tenant test suite | Security |
| 2 | Time-series throughput at scale | P3 | 2 | 4 | Spike 1.D.1; sharded TS collection; 5k/s load test | Data |
| 3 | Prompt injection → data leak | P6 | 2 | 5 | Sanitiser + output validation + audit + pentest | AI/Sec |
| 4 | Agent acts beyond intent (OT) | P6/P7 | 2 | 5 | Critic step + write-grant gating + human-in-loop for OT | AI/Workflow |
| 5 | Pentest finds Critical near G9 | P9 | 3 | 5 | Start pentest end-P7; closure runway in P9 | Security |
| 6 | Requirements churn after G1 | P1 | 3 | 3 | Freeze v1 baseline; change-control late asks | Product |
| 7 | UX maturity gap (was untracked) | all | 3 | 4 | This docs/design/ set + design gates | Design |
| 8 | TV-mode memory leak (24h) | P5 | 3 | 3 | Soak test; per-widget unmount on rotation | Frontend |
| 9 | Offline conflict UX confusing | P8 | 3 | 3 | Plain-language merge; server-wins default + override | Frontend |
| 10 | Hardcoded colour breaks theming | P5 | 3 | 2 | Lint ban on hex/rgb outside tokens | Frontend |
| 11 | Cost runaway (AI) | P6 | 3 | 3 | Per-tenant budgets + alerts + hard caps | AI |
| 12 | Edge-agent install fragile | P3 | 3 | 3 | Docker image primary path; binary fallback | Data |
Risk #7 is the gap this planning round closed: UX was implicit and ungated; it is now an owned, gated workstream.
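The mitigation for risk #1 (mandatory tenantId at the repo layer plus a cross-tenant test suite) can be illustrated with a small in-memory sketch. This is not the platform's actual data layer: `WorkOrderRepository` and its methods are hypothetical, and an array stands in for the MongoDB collection. The idea it shows is that every repository method takes `tenantId` as a required argument, so no call site can query without a tenant scope.

```typescript
interface WorkOrder {
  tenantId: string;
  id: string;
  title: string;
}

function requireTenant(tenantId: string): void {
  if (tenantId.trim() === "") throw new Error("tenantId is required");
}

class WorkOrderRepository {
  // In-memory stand-in for a database collection.
  private readonly rows: WorkOrder[] = [];

  insert(tenantId: string, order: Omit<WorkOrder, "tenantId">): WorkOrder {
    requireTenant(tenantId);
    // The tenant comes from the call context, never from the payload.
    const row: WorkOrder = { ...order, tenantId };
    this.rows.push(row);
    return row;
  }

  list(tenantId: string): WorkOrder[] {
    requireTenant(tenantId);
    return this.rows.filter((row) => row.tenantId === tenantId);
  }

  findById(tenantId: string, id: string): WorkOrder | undefined {
    // Scoped through list(), so an id from another tenant is never visible.
    return this.list(tenantId).filter((row) => row.id === id)[0];
  }
}

// Cross-tenant check: tenant B must not see tenant A's work order.
const workOrders = new WorkOrderRepository();
workOrders.insert("tenant-a", { id: "wo-1", title: "Replace ceiling panel" });
workOrders.insert("tenant-b", { id: "wo-2", title: "Inspect chiller" });

const ownRead = workOrders.findById("tenant-a", "wo-1");
const crossTenantRead = workOrders.findById("tenant-b", "wo-1");
console.log(ownRead !== undefined, crossTenantRead === undefined); // true true
```

A cross-tenant test suite would repeat the last check for every repository method, including updates and deletes.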
9. Definition of Ready / Definition of Done
Story is Ready when: it has a user-story statement, acceptance criteria (Gherkin), the OpenAPI/Zod contract (if it touches the API), the wireframe/hi-fi reference (if it touches UI), and a named owner.
Story is Done when: code + tests (≥80% unit on logic) merged; OpenAPI/SDK/docs updated in the same PR; Storybook story with all states (if UI); axe + i18n clean (if UI); audit + observability wired (if mutating); demoable against the KTC fixture where applicable.
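Several Done conditions apply only to some stories ("if UI", "if mutating"). One way to keep that conditionality explicit is a rule table in which each rule states when it applies. The sketch below models a subset of the conditions; the `StoryEvidence` fields and rule labels are assumptions for illustration, and the merge and KTC-demo conditions are left out.

```typescript
interface StoryEvidence {
  touchesUi: boolean;
  mutatesData: boolean;
  unitCoverageOnLogic: number; // fraction between 0 and 1
  docsUpdatedInSamePr: boolean;
  storyCoversAllStates: boolean;
  axeAndI18nClean: boolean;
  auditAndObservabilityWired: boolean;
}

interface DoneRule {
  label: string;
  appliesTo: (story: StoryEvidence) => boolean;
  satisfied: (story: StoryEvidence) => boolean;
}

const always = (): boolean => true;
const ifUi = (story: StoryEvidence): boolean => story.touchesUi;
const ifMutating = (story: StoryEvidence): boolean => story.mutatesData;

const doneRules: DoneRule[] = [
  { label: "unit coverage >= 80% on logic", appliesTo: always, satisfied: (s) => s.unitCoverageOnLogic >= 0.8 },
  { label: "OpenAPI/SDK/docs updated in same PR", appliesTo: always, satisfied: (s) => s.docsUpdatedInSamePr },
  { label: "Storybook story with all states", appliesTo: ifUi, satisfied: (s) => s.storyCoversAllStates },
  { label: "axe + i18n clean", appliesTo: ifUi, satisfied: (s) => s.axeAndI18nClean },
  { label: "audit + observability wired", appliesTo: ifMutating, satisfied: (s) => s.auditAndObservabilityWired },
];

// Labels of the applicable rules the story does not yet satisfy.
function missingForDone(story: StoryEvidence): string[] {
  return doneRules
    .filter((rule) => rule.appliesTo(story) && !rule.satisfied(story))
    .map((rule) => rule.label);
}

// Hypothetical backend-only story that mutates data but has no audit wiring yet.
const backendStory: StoryEvidence = {
  touchesUi: false,
  mutatesData: true,
  unitCoverageOnLogic: 0.85,
  docsUpdatedInSamePr: true,
  storyCoversAllStates: false,
  axeAndI18nClean: false,
  auditAndObservabilityWired: false,
};

const backendMissing = missingForDone(backendStory);
console.log(backendMissing); // [ 'audit + observability wired' ]
```

The UI-only rules are skipped for this story, so the missing Storybook and axe evidence does not block it.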
10. Immediate next actions (to start execution)
- Name owners for each phase and the cross-cutting workstreams (§4, §6).
- Provision accounts (MongoDB Atlas, GitHub, container registry) — Phase 0 prerequisite, no lead time to waste.
- Kick off P0 (1 week): repo, CI/CD, Next.js skeleton, design-system token seed (design + frontend pair on this from day 1).
- Schedule G1 design workshops (personas/journeys are drafted in docs/design/; run the review to ratify and produce hi-fi wireframes).
- Confirm the parallelisation decision at G4 based on actual squad count (one squad → single-thread; multiple → parallel P5/P6/P7).
The build sequence is locked through G4; the post-G4 fan-out flexes on team size without re-architecting. Start P0 now.