Decision Deck — Next Run Planning

🗺️ Situation Report — Where We Are

✅ Five blocks shipped

Blocks A (bug fixes), B (admin/infra), D (features), E (corpus expansion), and G (UX) are all complete and deployed. Core product is stable.

⚠️ Block C — corpus work, gated chain

C1→C2→C3 must run in strict sequence with a Grant checkpoint between C2 and C3. C4 (depth review) is standalone. This is the only block that touches the live corpus.

📋 Checklist v2 — clean slate

4 features already approved by Grant (2026-06-17): PDF export, photo evidence, signed/locked checklists, Docker deploy. Zero risk to existing search functionality.

📋

Section 1 — What's Left

Two remaining blocks · 8 tasks · ~14–18 hours total wall clock

Blocks done: 5/7

Remaining

Already shipped

✓ Block A — Bug Fixes & Polish

✓ Block B — Admin & Infra

✓ Block D — Feature Improvements

✓ Block E — Corpus Expansion

✓ Block G — User Experience

Block C · NCC Corpus Enrichment

Search & Corpus Quality · Strict gated chain C1→C2→C3

Wall clock: ~6–8h · API cost: ~$28.90

Execution Chain (strictly sequential)

C1 Design → C2 NCC Pilot → ⚡ Grant checkpoint → C3 Full rollout

C4 is standalone — can run anytime
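The ordering constraints above can be written down as a small dependency graph. This is only a sketch: the task keys and the `grant_checkpoint` node name are illustrative labels, not identifiers from any real run tooling, and it assumes Python 3.9+ for the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks that must finish before it can start.
# "grant_checkpoint" is a hypothetical node standing in for the manual review.
DEPENDS_ON = {
    "C1": set(),
    "C2": {"C1"},
    "grant_checkpoint": {"C2"},
    "C3": {"grant_checkpoint"},
    "C4": set(),  # standalone: no predecessors, nothing depends on it
}

def execution_order(depends_on):
    """Return one valid run order that respects every dependency."""
    return list(TopologicalSorter(depends_on).static_order())

def is_valid_order(order, depends_on):
    """Check that every task appears after all of its prerequisites."""
    position = {task: i for i, task in enumerate(order)}
    return all(
        position[dep] < position[task]
        for task, deps in depends_on.items()
        for dep in deps
    )
```

Because C4 has no edges, any position for it passes `is_valid_order`, which is what "can run anytime" means in practice.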

C1 · Enrichment Design + Test Harness

Design the metadata enrichment schema for NCC clauses. Build before/after benchmark harness to measure search quality impact.

standalone · no corpus write

~2h

gate: none
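A minimal sketch of what a before/after harness could look like. It assumes search quality is scored as mean recall@k over a fixed set of labelled queries; the deck does not name a metric, so the metric, the `SearchFn` callable shape, and the labelled query data are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Set

# Hypothetical shape: a search function takes a query string and
# returns clause IDs ranked best-first.
SearchFn = Callable[[str], List[str]]

def recall_at_k(results: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant clause IDs that appear in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(results[:k]) & relevant) / len(relevant)

def benchmark(search_fn: SearchFn, labelled: Dict[str, Set[str]], k: int = 5) -> float:
    """Mean recall@k across every labelled query."""
    scores = [recall_at_k(search_fn(query), relevant, k)
              for query, relevant in labelled.items()]
    return sum(scores) / len(scores)

def quality_delta(before: SearchFn, after: SearchFn,
                  labelled: Dict[str, Set[str]], k: int = 5) -> float:
    """Positive means the enriched index scored better on the same queries."""
    return benchmark(after, labelled, k) - benchmark(before, labelled, k)
```

Running the same labelled queries against the pre-enrichment and post-enrichment search paths gives a single signed number for the C2 report; a real harness would likely report per-query results as well.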

C2 · NCC-Only Rollout (1,806 clauses)

Enrich NCC corpus only. Run benchmark comparison. Report quality delta to Grant. This is the pilot before touching all 5,600+ clauses.

corpus write: NCC only · checkpoint follows

~3h

needs C1

⚡

Grant checkpoint required — C2 delivers benchmark report. Grant reviews before C3 starts. C3 cannot run without explicit approval.
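One way to make the "no auto-proceed" rule mechanical is a guard that refuses to start C3 until an approval has been recorded. The sketch below is illustrative only: `Checkpoint`, `GateError`, and `start_c3` are hypothetical names, not part of any existing tooling.

```python
from dataclasses import dataclass
from typing import Optional

class GateError(RuntimeError):
    """Raised when a gated step is attempted without its prerequisites."""

@dataclass
class Checkpoint:
    benchmark_report_delivered: bool = False
    approved_by: Optional[str] = None

    def approve(self, reviewer: str) -> None:
        # Approval is only meaningful once the C2 report exists.
        if not self.benchmark_report_delivered:
            raise GateError("C2 benchmark report has not been delivered yet")
        self.approved_by = reviewer

def start_c3(checkpoint: Checkpoint) -> str:
    # A positive benchmark alone is not enough; a named approval is required.
    if checkpoint.approved_by is None:
        raise GateError("C3 blocked: explicit approval required")
    return f"C3 started (approved by {checkpoint.approved_by})"
```

The design choice here is that the default state blocks C3, so forgetting to record the review fails safe rather than proceeding.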

C3 · Full Corpus Enrichment (5,600+ clauses)

Rolls out enrichment to all corpus clauses, but only if the C2 benchmark is positive and Grant approves. Largest single API cost task in Phase 2.

corpus write: ALL standards · ~$20 API cost

~3h

needs C2 + Grant ✓

C4 · Response Depth Review (5-agent report)

Standalone deep review of whether response depth is being artificially limited. 5 parallel reviewer agents, produces a report. No corpus writes.

standalone · no corpus risk · 5-agent report

~1h

independent

Block F · Checklist v2 + Deployment

4 features · All decisions approved 2026-06-17 · Zero corpus risk

Wall clock: ~8–10h · API cost: ~$0.50

Decision status — all approved

✓ HTML template for PDF · ✓ Keep photos forever · ✓ Lock after signing · ✓ Void/unlock escape

F1 · PDF Export (HTML template approach)

Generate printable PDF-ready HTML from checklist data. Uses server-side template rendering, not a PDF library. Approved: HTML template.

new component · zero search risk

~3h

independent
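To illustrate the HTML template approach, here is a minimal sketch using only the Python standard library. The app's actual template engine and checklist data model are not stated in this deck, so `PAGE`, `render_checklist`, and the `(text, done)` item shape are assumptions.

```python
from html import escape
from string import Template

# A print-friendly page; the browser's "Save as PDF" does the conversion.
PAGE = Template("""<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>$title</title>
  <style>@media print { .no-print { display: none; } }</style>
</head>
<body>
  <h1>$title</h1>
  <ul>
$items
  </ul>
</body>
</html>""")

def render_checklist(title, items):
    """Render (text, done) pairs into printable HTML, escaping user text."""
    rows = "\n".join(
        f"    <li>[{'x' if done else ' '}] {escape(text)}</li>"
        for text, done in items
    )
    return PAGE.substitute(title=escape(title), items=rows)
```

Escaping every user-supplied string before substitution keeps checklist text from breaking the markup, which matters once the output is treated as a signed record.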

F2 · Photo Evidence Attachment

Attach photos to checklist line items as evidence. Photos stored permanently (never purged). Decision: keep photos forever.

new feature · R2 storage

~2.5h

independent

F3 · Signed & Locked Checklists

Sign-off flow that locks the checklist. Decision: lock after signing with void/unlock escape hatch for corrections. Signature stored as evidence.

new feature · lock mechanism

~2.5h

independent
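The lock-after-signing rule with a void/unlock escape hatch can be pictured as a three-state machine. The sketch below is an assumption-laden illustration: the state names, the `Checklist` class, and the choice to keep the original signature in an audit trail after a void are not specified in this deck.

```python
from enum import Enum

class State(Enum):
    DRAFT = "draft"
    SIGNED = "signed"   # locked: no edits allowed
    VOIDED = "voided"   # unlocked again for corrections

class LockedError(RuntimeError):
    """Raised when an edit or second signature hits a locked checklist."""

class Checklist:
    def __init__(self):
        self.state = State.DRAFT
        self.items = {}
        self.audit = []  # append-only record of sign and void events

    def set_item(self, key, value):
        if self.state is State.SIGNED:
            raise LockedError("checklist is signed; void it before editing")
        self.items[key] = value

    def sign(self, signer):
        if self.state is State.SIGNED:
            raise LockedError("checklist is already signed")
        self.state = State.SIGNED
        self.audit.append(("signed", signer))

    def void(self, actor, reason):
        if self.state is not State.SIGNED:
            raise ValueError("only a signed checklist can be voided")
        self.state = State.VOIDED
        # Assumption: the earlier signature stays in the audit trail as evidence.
        self.audit.append(("voided", actor, reason))
```

A voided checklist accepts edits and can be signed again, and each sign or void leaves a trace in `audit`.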

F4 · Docker / On-Prem Deployment Prep

Package app for Docker deployment. Enables on-premise installs for enterprise clients who can't use the cloud-hosted version.

infra · Docker · on-prem path

~1.5h

independent

Why two very different blocks? Block C is high-value corpus work that could meaningfully improve every answer the system gives — but it carries real risk. Checklist v2 is pure feature shipping: new capabilities in isolated components with no shared state. The question is sequencing: which block earns the next session?

⚠️

Section 2 — Risk Analysis

Corpus enrichment risk vs feature delivery risk · Side-by-side comparison

Block C — Corpus Enrichment Risk

HIGH RISK

Risk Factor	Level
Search quality degradation	HIGH
Blast radius if C3 goes wrong	ALL 5,600+
Rollback complexity	MEDIUM
Risk of accidental C3 trigger	LOW
API cost exposure	~$28.90

Mitigations in place

🧪 C1 builds before/after benchmark harness — quantifies quality impact before any writes

🔬 C2 pilots NCC-only (1,806 clauses) — controlled test before touching full corpus

⚡ Hard checkpoint: Grant must review C2 benchmark before C3 can start — no auto-proceed

⏪ Rollback: re-upload original embeddings from backup. Restores pre-enrichment state.

Checklist v2 — Feature Delivery Risk

LOW RISK

Risk Factor	Level
Impact on existing search/query	NONE
Blast radius if feature fails	ISOLATED
Rollback complexity	EASY
Decision ambiguity	NONE
API cost exposure	~$0.50

Why this is genuinely low risk

🏗️ New components (PDF renderer, photo uploader, sign-off UI) are isolated from search pipeline

✅ All decisions already approved 2026-06-17 — no ambiguity, no mid-task pivots

🔄 Worst case: a feature doesn't work → users continue existing workflow, zero regression

💰 $0.50 total API cost — basically free to run relative to Block C

The core tension: Block C has the highest potential upside (better answers for every query) but also the highest risk (metadata injection into 5,600+ embeddings could degrade global search quality). The gating chain (C1→C2→checkpoint→C3) is designed to make that risk manageable, but the risk is real. Checklist v2 (Block F) is pure upside with no systemic risk; the question is whether delaying it costs users anything.

🎯

Section 3 — Three Options

Options are mutually exclusive

★ Recommended

Option A · Block C Only

Tackle the corpus risk in isolation. Everything else is already shipped — give C the focused attention it deserves.

Rationale

Grant's "C last due to risk" instinct is correct. With A, B, D, E, G all shipped, there's a clean rollback baseline. If C3 degrades search quality, nothing else is in flight to confuse the diagnosis.

Execution Order

1. C4 first: depth review report, standalone, no risk, clears the deck

2. C1: enrichment design + test harness

3. C2: NCC pilot (1,806 clauses) + benchmark report

4. ⚡ Grant checkpoint: review benchmark, approve/reject C3

5. C3 (if approved): full corpus enrichment

→ Next run: Checklist v2 (F1–F4), which carries zero corpus risk

Timeline

~6–8 hours · 1–2 sessions · ~$28.90 API

Pros & Cons

Pros

Clean risk isolation — corpus work gets full attention

Benchmark is definitive — nothing else in flight to muddy results

Aligns with Grant's stated preference

Clear rollback baseline

Cons

Checklist v2 delayed by 1–2 sessions

Users wait longer for PDF export and signatures

Option B · Parallel Streams

Block C and Checklist v2 run simultaneously in two independent streams. Fastest total completion.

Rationale

Block C touches corpus files. Checklist v2 touches checklist components. Zero file overlap — they can run in parallel. Stream 1 handles C1→C2→C3. Stream 2 handles F1→F2→F3→F4 independently.

Execution (2 parallel streams)

S1 · C track: C4 → C1 → C2 → [checkpoint] → C3

S2 · F track: F1 → F2 → F3 → F4 (fully independent)

→ Both blocks finish in a single run

Timeline

~8–10 hours (wall clock) · both done together · ~$29.40 API
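The Option B figures follow from the two block estimates: parallel wall clock is bounded by the slower stream, while API cost simply adds up. The sketch below reproduces that arithmetic. It assumes the two streams fully overlap and ignores any waiting time at the Grant checkpoint; the function names are illustrative.

```python
# Estimates quoted in this deck: (min hours, max hours) and API cost in dollars.
BLOCK_C = {"hours": (6, 8), "cost": 28.90}
CHECKLIST_V2 = {"hours": (8, 10), "cost": 0.50}

def sequential_hours(a, b):
    """Back-to-back runs: the ranges add."""
    return (a[0] + b[0], a[1] + b[1])

def parallel_hours(a, b):
    """Fully overlapping streams: the slower stream sets the wall clock."""
    return (max(a[0], b[0]), max(a[1], b[1]))

def combined_cost(*blocks):
    """API cost adds up regardless of how the work is scheduled."""
    return round(sum(block["cost"] for block in blocks), 2)

if __name__ == "__main__":
    print("sequential:", sequential_hours(BLOCK_C["hours"], CHECKLIST_V2["hours"]))
    print("parallel:  ", parallel_hours(BLOCK_C["hours"], CHECKLIST_V2["hours"]))
    print("cost:      ", combined_cost(BLOCK_C, CHECKLIST_V2))
```

The sequential range matches the ~14–18 hours quoted in Section 1, and the parallel range matches the ~8–10 hours quoted for this option.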

Pros & Cons

Pros

Fastest total completion — both blocks done simultaneously

Checklist v2 ships sooner for users

No file overlap between streams

Cons

If C2 shows degradation, debugging corpus issues while also reviewing new feature deploys creates cognitive overhead

Higher cost (2 parallel Sonnet agents)

Parallel requires explicit approval — Grant originally said sequential only

⚡ Approval needed: Grant's 2026-06-17 directive was "sequential only." He approved 3-parallel for the last run but that was a one-off. Confirm parallel is still OK before choosing this option.

Option C · Checklist v2 First

Ship user-visible features now. Corpus enrichment gets its own focused run after — properly resourced, not an afterthought.

Rationale

Ship what users will see and use. PDF export, photos, signatures — these are tangible, demo-able improvements. Block C is backend infrastructure. Users don't see enrichment directly.

Execution Order

1. Next run: F1 (PDF) → F2 (Photos) → F3 (Signed) → F4 (Docker)

2. Run after: C4 → C1 → C2 → [checkpoint] → C3

Timeline

~8–10h (Checklist v2) then ~6–8h (Block C)

Pros & Cons

Pros

Users see new features sooner

More demo-able progress in the near term

Block C gets full focused attention in its own run

Cons

Block C keeps getting pushed back — it's been "next" for a while

If enrichment works as hoped, every day without it means weaker search answers for all users

C's gated chain (C1→C2→checkpoint→C3) needs focused attention, not afterthought treatment

📅

Section 4 — Timeline Mockup

Gantt view of each option's execution order, colour-coded by block (charts not reproduced here)

Legend: Block C tasks (corpus) · Block F tasks (Checklist v2) · Grant checkpoint / approval gate · Next run (deferred)

Option A — Block C Only (Recommended)

~6–8 hours · Single focused run · Checklist v2 in next run

Option B — Parallel Streams (Fastest completion)

~8–10 hours · Both blocks finish simultaneously · Requires parallel approval

Option C — Checklist v2 First, Block C After

~8–10h (Checklist v2) then ~6–8h (Block C) in a separate run

💬

Section 5 — Your Call

Select your preferred option and add any notes · No submit needed — just think it through

Pick the approach that feels right. The recommendation is Option A — it aligns with your stated preference, isolates the corpus risk, and gives Checklist v2 a clean run after. But you know the context better than the plan does.

💡 Quick reminders for the next run kick-off: For any option that runs Block C, the plan as written requires Grant's explicit approval after the C2 benchmark before C3 starts; say so if a positive benchmark should auto-approve C3 instead. If Option B, confirm parallel agents are approved for this run. Either way, C4 (depth review) is standalone and low-cost, so it can run anytime as a quick win.
