📋
Section 1 — What's Left
Two remaining blocks · 8 tasks · ~14–18 hours total wall clock
Blocks done: 5/7 · Remaining: 2
Already shipped
✓ Block A — Bug Fixes & Polish
✓ Block B — Admin & Infra
✓ Block D — Feature Improvements
✓ Block E — Corpus Expansion
✓ Block G — User Experience
Block C · Wall clock: ~6–8h · API cost: ~$28.90
Execution Chain (strictly sequential)
C1 Design → C2 NCC Pilot → ⚡ Grant checkpoint → C3 Full rollout
C4 is standalone — can run anytime
C1
Enrichment Design + Test Harness
Design the metadata enrichment schema for NCC clauses. Build before/after benchmark harness to measure search quality impact.
~2h
gate: none
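A before/after harness of the kind C1 describes might look roughly like the sketch below. It is illustrative only: the recall@k metric, the `SearchFn` interface, and the toy queries and clause IDs are all assumptions, not details of the actual system.

```python
from typing import Callable, Dict, List, Set

# Hypothetical interface: a search function maps a query string to ranked clause IDs.
SearchFn = Callable[[str], List[str]]


def recall_at_k(results: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant clause IDs found in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(results[:k]) & relevant) / len(relevant)


def benchmark(search: SearchFn, gold: Dict[str, Set[str]], k: int = 5) -> float:
    """Mean recall@k over a gold set of query -> relevant clause IDs."""
    scores = [recall_at_k(search(query), relevant, k) for query, relevant in gold.items()]
    return sum(scores) / len(scores)


def quality_delta(before: SearchFn, after: SearchFn,
                  gold: Dict[str, Set[str]], k: int = 5) -> float:
    """Positive means the enriched index retrieved more relevant clauses."""
    return benchmark(after, gold, k) - benchmark(before, gold, k)


# Toy data only: these queries and IDs are placeholders, not real NCC clauses.
gold = {"q1": {"a"}, "q2": {"b"}}
baseline = {"q1": ["x", "a"], "q2": ["y", "z"]}
enriched = {"q1": ["a", "x"], "q2": ["b", "y"]}

delta = quality_delta(baseline.__getitem__, enriched.__getitem__, gold, k=1)
print(f"recall@1 delta: {delta:+.2f}")
```

Running the same gold set against the index before and after enrichment keeps the comparison like for like; the real harness may well use a different metric or a larger query set.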
C2
NCC-Only Rollout (1,806 clauses)
Enrich NCC corpus only. Run benchmark comparison. Report quality delta to Grant. This is the pilot before touching all 5,600+ clauses.
~3h
needs C1
Grant checkpoint required — C2 delivers benchmark report.
Grant reviews before C3 starts. C3 cannot run without explicit approval.
C3
Full Corpus Enrichment (5,600+ clauses)
Rolls out enrichment to all corpus clauses if the C2 benchmark proves positive and Grant approves. Largest single API cost task in Phase 2.
~3h
needs C2 + Grant ✓
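The gate in front of C3 (C2 finished, benchmark report delivered, explicit approval from Grant) could be enforced with a check along these lines. This is a minimal sketch under assumptions: `CheckpointState`, `can_start_c3`, and `start_c3` are hypothetical names, and how approval is actually recorded is not specified in the plan.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckpointState:
    """Hypothetical record of where the C1 -> C2 -> checkpoint -> C3 chain stands."""
    c2_complete: bool = False
    benchmark_delta: Optional[float] = None  # quality delta reported by C2
    grant_approved: bool = False             # set only by an explicit human decision


def can_start_c3(state: CheckpointState) -> bool:
    """C3 needs a finished C2, a benchmark report, and explicit approval."""
    return (
        state.c2_complete
        and state.benchmark_delta is not None
        and state.grant_approved
    )


def start_c3(state: CheckpointState) -> str:
    """Refuse to start the full rollout unless the gate is open."""
    if not can_start_c3(state):
        raise PermissionError("C3 blocked: needs C2 benchmark report and Grant approval")
    return "C3 started"


# A positive benchmark alone does not open the gate.
pending = CheckpointState(c2_complete=True, benchmark_delta=0.12)
approved = CheckpointState(c2_complete=True, benchmark_delta=0.12, grant_approved=True)
print(can_start_c3(pending), can_start_c3(approved))
```

Keeping approval as a separate flag, rather than deriving it from the benchmark result, is one way to make an accidental C3 trigger harder.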
C4
Response Depth Review (5-agent report)
Standalone deep review of whether response depth is being artificially limited. 5 parallel reviewer agents, produces a report. No corpus writes.
~1h
independent
Block F (Checklist v2) · Wall clock: ~8–10h · API cost: ~$0.50
Decision status — all approved
✓ HTML template for PDF
✓ Keep photos forever
✓ Lock after signing
✓ Void/unlock escape
F1
PDF Export (HTML template approach)
Generate printable PDF-ready HTML from checklist data. Uses server-side template rendering, not a PDF library. Approved: HTML template.
~3h
independent
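The HTML template approach for F1 can be illustrated with nothing more than the Python standard library. The sketch below is an assumption about shape, not the approved template: the `PAGE` markup, the `render_checklist` function, and the item format are hypothetical.

```python
from html import escape
from string import Template
from typing import List, Tuple

# Hypothetical print-friendly page; the browser's "Print to PDF" produces the file.
PAGE = Template("""<!DOCTYPE html>
<html><head><meta charset="utf-8"><title>$title</title>
<style>@media print { .no-print { display: none; } }</style>
</head><body><h1>$title</h1><ul>
$items
</ul></body></html>""")


def render_checklist(title: str, items: List[Tuple[str, bool]]) -> str:
    """Render (label, done) pairs into printable HTML, escaping user text."""
    rows = "\n".join(
        f"<li>[{'x' if done else ' '}] {escape(label)}</li>" for label, done in items
    )
    return PAGE.substitute(title=escape(title), items=rows)


page = render_checklist("Site <A> inspection", [("Check exits", True), ("Test alarms", False)])
print(page)
```

Escaping every user-supplied string before substitution matters here because checklist labels end up directly in markup.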
F2
Photo Evidence Attachment
Attach photos to checklist line items as evidence. Photos stored permanently (never purged). Decision: keep photos forever.
~2.5h
independent
F3
Signed & Locked Checklists
Sign-off flow that locks the checklist. Decision: lock after signing with void/unlock escape hatch for corrections. Signature stored as evidence.
~2.5h
independent
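The lock-after-signing rule with a void/unlock escape hatch amounts to a small state machine. The following is a sketch under assumptions: the `Checklist` class, its status names, and the audit-log format are hypothetical, and the choice to return a voided checklist to draft while keeping the old signature in the audit trail is one possible reading of the decision.

```python
from enum import Enum
from typing import Dict, List, Optional, Tuple


class Status(Enum):
    DRAFT = "draft"
    LOCKED = "locked"


class ChecklistLocked(Exception):
    """Raised when an edit is attempted on a signed checklist."""


class Checklist:
    def __init__(self) -> None:
        self.status = Status.DRAFT
        self.items: Dict[str, bool] = {}
        self.signature: Optional[str] = None
        self.audit: List[Tuple[str, ...]] = []  # append-only evidence trail

    def set_item(self, key: str, done: bool) -> None:
        if self.status is Status.LOCKED:
            raise ChecklistLocked("checklist is signed; void it before editing")
        self.items[key] = done

    def sign(self, signer: str) -> None:
        if self.status is Status.LOCKED:
            raise ChecklistLocked("already signed")
        self.signature = signer
        self.status = Status.LOCKED
        self.audit.append(("signed", signer))

    def void(self, actor: str, reason: str) -> None:
        """Escape hatch: unlock for corrections, keeping the old signature on record."""
        if self.status is not Status.LOCKED:
            raise ValueError("only a signed checklist can be voided")
        self.audit.append(("voided", actor, reason))
        self.signature = None
        self.status = Status.DRAFT


checklist = Checklist()
checklist.set_item("exits clear", True)
checklist.sign("inspector-1")
```

After `sign`, any `set_item` call raises `ChecklistLocked` until `void` is called, and both events remain in `audit`.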
F4
Docker / On-Prem Deployment Prep
Package app for Docker deployment. Enables on-premise installs for enterprise clients who can't use cloud-hosted version.
~1.5h
independent
Why two very different blocks? Block C is high-value corpus work that could meaningfully improve every answer the system gives — but it carries real risk.
Block F (Checklist v2) is pure feature shipping: new capabilities in isolated components with no shared state.
The question is sequencing: which block earns the next session?
⚠️
Section 2 — Risk Analysis
Corpus enrichment risk vs feature delivery risk · Side-by-side comparison
C
Block C — Corpus Enrichment Risk
HIGH RISK
| Risk Factor | Level |
|---|---|
| Search quality degradation | |
| Blast radius if C3 goes wrong | |
| Rollback complexity | |
| Risk of accidental C3 trigger | |
| API cost exposure | |
Mitigations in place
C1 builds before/after benchmark harness — quantifies quality impact before any writes
C2 pilots NCC-only (1,806 clauses) — controlled test before touching full corpus
Hard checkpoint: Grant must review C2 benchmark before C3 can start — no auto-proceed
Rollback: re-upload original embeddings from backup. Restores pre-enrichment state.
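The rollback mitigation depends on a snapshot being taken before any enrichment writes. A minimal sketch of that backup-then-restore pattern follows; `InMemoryVectorStore` is a hypothetical stand-in for whatever vector store the system uses, and the real backup would presumably live on disk or in object storage.

```python
import copy
from typing import Dict, List


class InMemoryVectorStore:
    """Hypothetical stand-in for the real vector store: clause ID -> embedding."""

    def __init__(self) -> None:
        self.vectors: Dict[str, List[float]] = {}

    def upsert(self, clause_id: str, embedding: List[float]) -> None:
        self.vectors[clause_id] = embedding


def take_backup(store: InMemoryVectorStore) -> Dict[str, List[float]]:
    """Deep-copy the current embeddings so later writes cannot alter the snapshot."""
    return copy.deepcopy(store.vectors)


def rollback(store: InMemoryVectorStore, snapshot: Dict[str, List[float]]) -> None:
    """Replace the store contents with the pre-enrichment snapshot."""
    store.vectors = copy.deepcopy(snapshot)


store = InMemoryVectorStore()
store.upsert("clause-1", [0.1, 0.2])
snapshot = take_backup(store)           # before any enrichment write

store.upsert("clause-1", [0.9, 0.9])    # enriched embedding overwrites the original
store.upsert("clause-2", [0.5, 0.5])    # enrichment may also add entries

rollback(store, snapshot)               # restores the pre-enrichment state
print(store.vectors)
```

Replacing the whole store, rather than re-upserting only the changed IDs, also removes entries that enrichment added.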
F
Checklist v2 — Feature Delivery Risk
LOW RISK
| Risk Factor | Level |
|---|---|
| Impact on existing search/query | |
| Blast radius if feature fails | |
| Rollback complexity | |
| Decision ambiguity | |
| API cost exposure | |
Why this is genuinely low risk
New components (PDF renderer, photo uploader, sign-off UI) are isolated from search pipeline
All decisions already approved 2026-06-17 — no ambiguity, no mid-task pivots
Worst case: a feature doesn't work → users continue existing workflow, zero regression
$0.50 total API cost, compared with ~$28.90 for Block C
The core tension: Block C has the highest potential upside (better answers for every query) but the highest risk (metadata injection into 5,600+ embeddings could degrade global search quality).
The gating chain (C1→C2→checkpoint→C3) is designed to make the risk manageable, but the risk is real.
Block F is pure upside with no systemic risk — the question is whether delaying it costs users anything.
🎯
Section 3 — Three Options
Options are mutually exclusive
📅
Section 4 — Timeline Mockup
Visual Gantt view of each option's execution order · Colour-coded by block
Legend: Block C tasks (corpus) · Block F tasks (Checklist v2) · Grant checkpoint / approval gate · Next run (deferred)
Option A — Block C Only (Recommended)
~6–8 hours · Single focused run · Checklist v2 in next run
Option B — Parallel Streams (Fastest completion)
~8–10 hours · Both blocks complete in the same run · Requires parallel approval
Option C — Checklist v2 First, Block C After
~8–10h (Checklist v2) then ~6–8h (Block C) in a separate run
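The wall-clock figures for the three options follow from the two block estimates: running blocks in parallel is bounded by the longer block, and running them back to back adds the ranges. The short sketch below reproduces that arithmetic; the helper names are hypothetical and the estimates are the ones quoted in Section 1.

```python
from typing import Tuple

Range = Tuple[int, int]  # (low, high) estimate in hours

BLOCK_C: Range = (6, 8)    # corpus enrichment
BLOCK_F: Range = (8, 10)   # Checklist v2


def sequential(*blocks: Range) -> Range:
    """Blocks run one after another: the ranges add."""
    return (sum(b[0] for b in blocks), sum(b[1] for b in blocks))


def parallel(*blocks: Range) -> Range:
    """Blocks run side by side: the longest block sets the wall clock."""
    return (max(b[0] for b in blocks), max(b[1] for b in blocks))


option_a = BLOCK_C                          # Block C only in this run
option_b = parallel(BLOCK_C, BLOCK_F)       # both blocks in one run
option_c = sequential(BLOCK_F, BLOCK_C)     # total across two separate runs

print(option_a, option_b, option_c)
```

The sequential total matches the ~14–18 hours quoted for the two remaining blocks.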
💬
Section 5 — Your Call
Select your preferred option and add any notes · No submit needed — just think it through
Pick the approach that feels right. The recommendation is Option A — it aligns with your stated preference, isolates the corpus risk, and gives Checklist v2 a clean run after. But you know the context better than the plan does.
💡 Quick reminders for the next run kick-off:
If Option A or C: confirm whether C3 may proceed automatically on a positive C2 benchmark or must always wait for explicit confirmation. The current plan requires explicit approval from Grant.
If Option B — confirm parallel agents are approved for this run.
Either way: C4 (depth review) is standalone and low-cost — it can run anytime as a quick win.