Atlas + SFMS — Master Execution Plan

Status: v1.0 · Owner: Engineering Lead · Companion to 00_README.md and Phase_0–9.md

The single roadmap that ties the 10 phases together: timeline, critical path, parallelisation, team/RACI, gate calendar, cross-cutting workstreams, and the design↔engineering integration. Read 00_README.md first for product/stack; this doc is the "how we run it" layer.

1. At a glance

Product: Multi-tenant Data Platform + Smart Facility Management System (Next.js 15 / MongoDB).
Shape: 10 gated phases (P0–P9), each closed by its own gate (G0–G9).
Duration: ~29 weeks single-thread; ~22 weeks with P4–P8 partially parallelised.
First customer: KTC reference engagement runs against the v1.0 platform (Phase_9.md §3.2).
What this plan adds to the phase docs: the timeline/critical path, the team model, the gate calendar, the cross-cutting workstreams (design, security, QA, docs, observability) that span all phases, and the design-deliverable integration.

2. Critical path & sequencing

P0 ─▶ P1 ─▶ P2 ─▶ P3 ─┬─▶ P4 ─┬─────────────────────────▶ P9 ─▶ 🚀 v1.0
 1w    2w    3w    4w  │   4w  │
                       │       ├─▶ P5 Dashboards (4w) ─────┤
                       │       ├─▶ P6 AI Brain   (4w) ─────┤
                       │       └─▶ P7 Workflow   (3w) ─────┤
                       └───────────▶ P8 Mobile/PWA (2w) ───┘
                                     (starts mid-P4, hardens after P5/P7)

Strictly sequential: P0 → P1 → P2 → P3 → P4. This is the critical path; any slip here slips the launch.
P3 before P4 — SFMS modules read the canonical data model.
After P4 ships, P5/P6/P7 parallelise on the shared foundation (separate squads).
P8 (Mobile/PWA) depends on shell (P2), modules (P4), dashboards (P5), approvals (P7); it starts its responsive audit mid-P4 and finishes after P5/P7.
P9 integrates everything — regression, pentest, perf/soak, DR, deploy, hypercare.

Critical-path total (P0→P4 + P9): 1+2+3+4+4+2 = 16 weeks minimum. Add the longest parallel branch (P5 or P6, 4w) and the P8 hardening that follows P5/P7 (2w): 16+4+2 → ~22 weeks elapsed.
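The elapsed-time and single-thread figures can be cross-checked with a small longest-path calculation over the phase dependencies. This is a minimal sketch, not part of the plan's tooling: the durations come from the sequencing diagram and timeline, and the dependency list is an assumption that models P8's 2-week hardening as starting only after P5 and P7 (its mid-P4 responsive audit is treated as adding no elapsed time).

```typescript
type PhaseId = "P0" | "P1" | "P2" | "P3" | "P4" | "P5" | "P6" | "P7" | "P8" | "P9";

interface Phase {
  weeks: number;
  dependsOn: PhaseId[];
}

// Durations in weeks from the plan; dependencies are this sketch's reading of the diagram.
const phases: Record<PhaseId, Phase> = {
  P0: { weeks: 1, dependsOn: [] },
  P1: { weeks: 2, dependsOn: ["P0"] },
  P2: { weeks: 3, dependsOn: ["P1"] },
  P3: { weeks: 4, dependsOn: ["P2"] },
  P4: { weeks: 4, dependsOn: ["P3"] },
  P5: { weeks: 4, dependsOn: ["P4"] },
  P6: { weeks: 4, dependsOn: ["P4"] },
  P7: { weeks: 3, dependsOn: ["P4"] },
  P8: { weeks: 2, dependsOn: ["P5", "P7"] }, // hardening only (assumption)
  P9: { weeks: 2, dependsOn: ["P5", "P6", "P7", "P8"] },
};

// Earliest finish week = own duration + latest finish among dependencies.
function finishWeek(id: PhaseId, memo: Map<PhaseId, number> = new Map()): number {
  const cached = memo.get(id);
  if (cached !== undefined) return cached;
  const { weeks, dependsOn } = phases[id];
  const start = Math.max(0, ...dependsOn.map((dep) => finishWeek(dep, memo)));
  const finish = start + weeks;
  memo.set(id, finish);
  return finish;
}

// Parallel squads: length of the longest dependency chain ending at P9.
const parallelWeeks = finishWeek("P9");

// One squad: every phase runs back to back, so durations simply add up.
const singleThreadWeeks = (Object.keys(phases) as PhaseId[])
  .reduce((sum, id) => sum + phases[id].weeks, 0);

console.log({ parallelWeeks, singleThreadWeeks });
```

Under these assumptions the sketch reports 22 weeks for the parallel plan and 29 weeks single-thread, matching the figures in §1. Changing a duration or dependency shows immediately whether a slip lands on the critical path.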

3. Indicative timeline (week-by-week)

Week 1 is the calendar start. Parallel tracks share the post-P4 window.

Weeks	Critical path	Parallel tracks	Gate
1	P0 Foundation	(design: token contract)	G0 end wk1
2–3	P1 Requirements & Arch	(design: personas/journeys/IA/wireframes)	G1 end wk3
4–6	P2 Core Platform	(design: hi-fi shell + admin)	G2 end wk6
7–10	P3 Data Platform	(gateway agent track)	G3 end wk10
11–14	P4 SFMS Modules	P8 responsive audit begins (wk13)	G4 end wk14
15–18	—	P5 Dashboards · P6 AI · P7 Workflow (parallel squads)	G5/G6/G7 end wk18
19–20	—	P8 Mobile/PWA hardening	G8 end wk20
21–22	P9 QA/Sec/Deploy	pentest (started end-P7), hypercare	G9 → 🚀 wk22

If staffing only allows one squad post-P4, fall back to single-thread (P5→P6→P7→P8) and the timeline extends to ~29 weeks. The plan is designed to flex on team size without re-architecting.

4. Team model & RACI

Roles from the phase docs, consolidated. A = Accountable, R = Responsible, C = Consulted, I = Informed.

Phase	Eng Lead	Backend	Data	SFMS	Frontend	AI	Workflow	Design	Security	QA	DevOps	Product
P0 Foundation	A/R	R			C			C	C	C	R	I
P1 Req & Arch	A	C	C	C	C	C	C	R	R	C	C	R
P2 Core	A	R			R			R	R	C	C	C
P3 Data	A	C	R		C			C	C	C	R	C
P4 SFMS	A	C	C	R	R			R	C	C		C
P5 Dashboards	A			C	R	C	C	R		C		C
P6 AI	A	C	C		C	R	C	C	R	C		C
P7 Workflow	A	C		C	C	C	R	C	C	C		C
P8 Mobile	A			C	R		C	R		R		C
P9 QA/Deploy	A	C	C	C	C	C	C	C	R	R	R	R

Design Lead is R or C on every product-facing phase — design is not a P5-only concern. Security and QA are continuous, not just P9.
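A matrix like this is easy to break during edits, so a small consistency check can help. The sketch below encodes the table above and verifies three things: each row has one cell per role, each phase has exactly one Accountable and at least one Responsible role, and Design is R or C everywhere. The "exactly one A" rule is a common RACI convention assumed here, not something this plan states, and `validateRaci` is a hypothetical helper name.

```typescript
const roles = [
  "Eng Lead", "Backend", "Data", "SFMS", "Frontend", "AI",
  "Workflow", "Design", "Security", "QA", "DevOps", "Product",
] as const;

// One row per phase, cells in the same order as `roles`; "" means not involved.
const raci: Record<string, string[]> = {
  P0: ["A/R", "R", "", "", "C", "", "", "C", "C", "C", "R", "I"],
  P1: ["A", "C", "C", "C", "C", "C", "C", "R", "R", "C", "C", "R"],
  P2: ["A", "R", "", "", "R", "", "", "R", "R", "C", "C", "C"],
  P3: ["A", "C", "R", "", "C", "", "", "C", "C", "C", "R", "C"],
  P4: ["A", "C", "C", "R", "R", "", "", "R", "C", "C", "", "C"],
  P5: ["A", "", "", "C", "R", "C", "C", "R", "", "C", "", "C"],
  P6: ["A", "C", "C", "", "C", "R", "C", "C", "R", "C", "", "C"],
  P7: ["A", "C", "", "C", "C", "C", "R", "C", "C", "C", "", "C"],
  P8: ["A", "", "", "C", "R", "", "C", "R", "", "R", "", "C"],
  P9: ["A", "C", "C", "C", "C", "C", "C", "C", "R", "R", "R", "R"],
};

// True when a cell such as "A/R" contains the given RACI letter.
function hasLetter(cell: string, letter: string): boolean {
  return cell.split("/").indexOf(letter) !== -1;
}

// Returns a list of human-readable problems; an empty list means the matrix is consistent.
function validateRaci(matrix: Record<string, string[]>): string[] {
  const problems: string[] = [];
  const designCol = roles.indexOf("Design");
  for (const phase of Object.keys(matrix)) {
    const row = matrix[phase] ?? [];
    if (row.length !== roles.length) {
      problems.push(`${phase}: expected ${roles.length} cells, got ${row.length}`);
      continue;
    }
    const accountable = row.filter((cell) => hasLetter(cell, "A")).length;
    if (accountable !== 1) {
      problems.push(`${phase}: ${accountable} Accountable roles, expected 1`);
    }
    if (!row.some((cell) => hasLetter(cell, "R"))) {
      problems.push(`${phase}: no Responsible role`);
    }
    const design = row[designCol] ?? "";
    if (!hasLetter(design, "R") && !hasLetter(design, "C")) {
      problems.push(`${phase}: Design is neither R nor C`);
    }
  }
  return problems;
}

console.log(validateRaci(raci));
```

For the matrix as written, the check returns an empty list. The same function could run in CI if the RACI is ever kept as data rather than prose.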

5. Gate calendar & sign-off

Each gate needs: all ACs met · required docs produced · Sev-1/Sev-2 closed · demo accepted (00_README.md §7). Sign-off recorded in _gates/Gate_GN_signoff.md.

Gate	Demo / proof	Signers (Design Lead signs all product gates)
G0	Clone→run<15min; CI green; Storybook themes	Eng Lead, Product
G1	FRS/NFRS/Arch/ADRs + personas/journeys/IA/wireframes approved	Eng, Product, Architect, Security, Design
G2	Login→invite→audit→API key; shell in 3 themes; a11y clean	Eng, Backend, Security, Product, Design
G3	KTC import end-to-end; live telemetry; recon report	Eng, Data, Security, Product
G4	KTC fixture walkthrough; WO lifecycle; ceiling-plan	SFMS, Eng, Product, Design
G5	KTC dashboard in all 3 view modes; visual-regression baseline	Frontend, Product, Design
G6	AI chat + categorise ≥85% + RAG citations + safety negative test	AI, Security, Product
G7	Build "monthly KPI report" workflow live; approval on mobile	Workflow, Eng, Product
G8	Offline inspection on a tablet by an FM persona; Lighthouse ≥90	Frontend, QA, Product, Design
G9	Pentest clean; perf/soak; DR drill; blue-green cutover; 5d hypercare	Eng, Product, Security, DevOps, CS, SalesEng

Recommendation (added to the plan): the pentest is scheduled to start at end of P7 so closure has runway in P9 (Phase_9.md risk). Likewise, accessibility and i18n are gated per phase, not deferred to P9 — fixing them at the end is the most expensive way.
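The gate rule in this section (all ACs met, required docs produced, Sev-1/Sev-2 closed, demo accepted, signers recorded) can be expressed as a simple blocker check. This is a sketch under assumptions: the `GateStatus` shape and `gateBlockers` name are hypothetical and do not reflect the actual format of the sign-off files.

```typescript
// Hypothetical status record for one gate; field names are illustrative only.
interface GateStatus {
  gate: string;
  acceptanceCriteriaMet: boolean;
  requiredDocsProduced: boolean;
  openSev1: number;
  openSev2: number;
  demoAccepted: boolean;
  requiredSigners: string[];
  signedBy: string[];
}

// Returns the reasons a gate cannot close yet; an empty list means it can be signed off.
function gateBlockers(status: GateStatus): string[] {
  const blockers: string[] = [];
  if (!status.acceptanceCriteriaMet) blockers.push("acceptance criteria not all met");
  if (!status.requiredDocsProduced) blockers.push("required docs missing");
  if (status.openSev1 + status.openSev2 > 0) {
    blockers.push(`${status.openSev1} Sev-1 and ${status.openSev2} Sev-2 defects still open`);
  }
  if (!status.demoAccepted) blockers.push("demo not accepted");
  const missing = status.requiredSigners.filter(
    (signer) => status.signedBy.indexOf(signer) === -1,
  );
  if (missing.length > 0) blockers.push(`missing sign-off: ${missing.join(", ")}`);
  return blockers;
}

// Example: G3 with the signers from the gate table, one Sev-2 open, Security not yet signed.
const g3: GateStatus = {
  gate: "G3",
  acceptanceCriteriaMet: true,
  requiredDocsProduced: true,
  openSev1: 0,
  openSev2: 1,
  demoAccepted: true,
  requiredSigners: ["Eng", "Data", "Security", "Product"],
  signedBy: ["Eng", "Data", "Product"],
};

console.log(gateBlockers(g3));
```

In this example the gate stays open for two reasons: the open Sev-2 defect and the missing Security sign-off. Returning a list of reasons rather than a boolean keeps the gate review conversation concrete.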

6. Cross-cutting workstreams (span all phases)

These don't live in one phase; they are continuous and owned end-to-end:

Workstream	Owner	Cadence	Definition of done (per phase)
Design system	Design + Frontend	every phase	New components have Storybook stories (all states × themes); tokens-only; visual-regression baseline
Accessibility	QA(a11y) + Frontend	every UI phase	axe clean; keyboard parity; SR pass on new critical flows (WCAG 2.2 AA)
i18n (EN/JA)	Frontend + Product	every UI phase	no hardcoded strings; JA review on user-facing routes
Security	Security Lead	every phase	threat-model delta; SAST/dep-scan; deny-by-default RBAC tests
Observability	Backend + DevOps	from P2	traces/metrics/logs on new services; SLOs documented
Documentation-as-code	All	every PR	ADRs, OpenAPI, runbooks, changelog updated in the same PR
Test pyramid	QA + all	every phase	≥80% unit on business logic; integration + e2e for new flows
OpenAPI/SDK	Backend	from P2	spec is source of truth; CI blocks drift; SDK regenerated

7. Design ↔ engineering integration (the UX guarantee)

Because UX is a primary success axis, design is wired into the SDLC, not bolted on. Source of truth: docs/design/.

Design precedes build, per phase. Personas/journeys/IA/wireframes are a G1 input; hi-fi for a module is produced in its phase before the screens are coded.
Tokens are the contract. One CSS-variable token set (design_system.md) consumed by Tailwind + Figma; no raw colour in code (lint-enforced).
Storybook = the acceptance surface. A component/widget is "done" only with a story covering all states × themes.
State matrix is QA-tested. Every data surface implements loading/empty/error/offline/no-permission/etc. (UX_patterns.md §1) — checked at the gate.
A11y + perf budgets are gate conditions, not aspirations (UX_patterns.md §3, §5).
Design Lead signs the product gates (G1, G2, G4, G5, G8).
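The "no raw colour in code" rule above could be enforced by a check along these lines. This is a minimal sketch, not the project's actual lint configuration: the plan does not name the lint tool, so the example is a standalone function, and `findRawColours` and the `tokens.css` file suffix are hypothetical names. It flags hex literals and `rgb()`/`rgba()` calls everywhere except the token file itself.

```typescript
// Assumption: the token set lives in a file whose name ends with this suffix.
const TOKEN_FILE_SUFFIX = "tokens.css";

// Matches #rgb, #rgba, #rrggbb, #rrggbbaa literals and the start of rgb()/rgba() calls.
const RAW_COLOUR = /#(?:[0-9a-fA-F]{8}|[0-9a-fA-F]{6}|[0-9a-fA-F]{3,4})\b|\brgba?\(/g;

interface Violation {
  line: number; // 1-based line number within the file
  text: string; // the matched raw colour fragment
}

// Scans one file and reports every raw colour found outside the token file.
function findRawColours(fileName: string, content: string): Violation[] {
  if (fileName.endsWith(TOKEN_FILE_SUFFIX)) return [];
  const violations: Violation[] = [];
  content.split("\n").forEach((lineText, index) => {
    const matches = lineText.match(RAW_COLOUR) ?? [];
    for (const text of matches) {
      violations.push({ line: index + 1, text });
    }
  });
  return violations;
}

const sample = [
  ".alert { color: #ff0000; }",
  ".alert-ok { color: var(--color-success); }",
  ".overlay { background: rgba(0, 0, 0, 0.5); }",
].join("\n");

console.log(findRawColours("alert.css", sample));
```

For the sample stylesheet the sketch reports line 1 (`#ff0000`) and line 3 (`rgba(`), while the `var(--color-success)` line passes. A regex this simple will also flag hex-like fragments in non-colour contexts, so a real rule would likely be built on the chosen linter's parser.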

8. Consolidated risk register (top items across phases)

#	Risk	Phase	Likelihood	Impact	Mitigation	Owner
1	Multi-tenant data leak	P2	1	5	Mandatory `tenantId` at repo layer + cross-tenant test suite	Security
2	Time-series throughput at scale	P3	2	4	Spike 1.D.1; sharded TS collection; 5k/s load test	Data
3	Prompt injection → data leak	P6	2	5	Sanitiser + output validation + audit + pentest	AI/Sec
4	Agent acts beyond intent (OT)	P6/P7	2	5	Critic step + write-grant gating + human-in-loop for OT	AI/Workflow
5	Pentest finds Critical near G9	P9	3	5	Start pentest end-P7; closure runway in P9	Security
6	Requirements churn after G1	P1	3	3	Freeze v1 baseline; change-control late asks	Product
7	UX maturity gap (was untracked)	all	3	4	This `docs/design/` set + design gates	Design
8	TV-mode memory leak (24h)	P5	3	3	Soak test; per-widget unmount on rotation	Frontend
9	Offline conflict UX confusing	P8	3	3	Plain-language merge; server-wins default + override	Frontend
10	Hardcoded colour breaks theming	P5	3	2	Lint ban on hex/rgb outside tokens	Frontend
11	Cost runaway (AI)	P6	3	3	Per-tenant budgets + alerts + hard caps	AI
12	Edge-agent install fragile	P3	3	3	Docker image primary path; binary fallback	Data

Risk #7 is the gap this planning round closed: UX was implicit and ungated; it is now an owned, gated workstream.
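Risk #1's mitigation (mandatory `tenantId` at the repo layer plus a cross-tenant test suite) can be illustrated with a small sketch. Everything here is an assumption for illustration: the real platform uses MongoDB, whereas this example uses an in-memory array as the shared store, and `WorkOrder` and `WorkOrderRepo` are hypothetical names. The point is the shape: the tenant is bound once at construction, and no read or write path can omit it.

```typescript
interface WorkOrder {
  id: string;
  tenantId: string;
  title: string;
}

// A repository bound to exactly one tenant; callers cannot pass or override tenantId per call.
class WorkOrderRepo {
  private readonly store: WorkOrder[];
  private readonly tenantId: string;

  constructor(store: WorkOrder[], tenantId: string) {
    // Deny by default: a repo without a tenant must never exist.
    if (tenantId.trim() === "") throw new Error("tenantId is mandatory");
    this.store = store;
    this.tenantId = tenantId;
  }

  // The tenant is stamped by the repo, never taken from the caller's payload.
  insert(order: { id: string; title: string }): WorkOrder {
    const row: WorkOrder = { id: order.id, title: order.title, tenantId: this.tenantId };
    this.store.push(row);
    return row;
  }

  // Every read filters on the bound tenant before anything else.
  findById(id: string): WorkOrder | undefined {
    return this.store.find((o) => o.tenantId === this.tenantId && o.id === id);
  }

  list(): WorkOrder[] {
    return this.store.filter((o) => o.tenantId === this.tenantId);
  }
}

// Two tenants sharing one underlying store, as they would share one collection.
const sharedStore: WorkOrder[] = [];
const tenantA = new WorkOrderRepo(sharedStore, "tenant-a");
const tenantB = new WorkOrderRepo(sharedStore, "tenant-b");

tenantA.insert({ id: "wo-1", title: "Replace ceiling panel" });

console.log(tenantA.list().length, tenantB.list().length, tenantB.findById("wo-1"));
```

A cross-tenant test suite would assert exactly what the last line prints: tenant A sees its work order, tenant B sees nothing and cannot fetch `wo-1` by id even though it sits in the same store.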

9. Definition of Ready / Definition of Done

Story is Ready when: it has a user-story statement, acceptance criteria (Gherkin), the OpenAPI/Zod contract (if it touches the API), the wireframe/hi-fi reference (if it touches UI), and a named owner.

Story is Done when: code + tests (≥80% unit on logic) merged; OpenAPI/SDK/docs updated in the same PR; Storybook story with all states (if UI); axe + i18n clean (if UI); audit + observability wired (if mutating); demoable against the KTC fixture where applicable.

10. Immediate next actions (to start execution)

Name owners for each phase and the cross-cutting workstreams (§4, §6).
Provision accounts (MongoDB Atlas, GitHub, container registry). This is a Phase 0 prerequisite, so start immediately.
Kick off P0 (1 week): repo, CI/CD, Next.js skeleton, design-system token seed (design + frontend pair on this from day 1).
Schedule G1 design workshops (personas/journeys are drafted in docs/design/; run the review to ratify them and produce the wireframes G1 requires).
Confirm the parallelisation decision at G4 based on actual squad count (one squad → single-thread; multiple → parallel P5/P6/P7).

The build sequence is locked through G4; the post-G4 fan-out flexes on team size without re-architecting. Start P0 now.