BillTracker/docs/QA_PLAN.md

# BillTracker — Master QA Plan (living document)

**Version target:** v0.41.x · **Executor:** Claude (active) · **Last updated:** 2026-07-02
**Cycle 1: COMPLETE ✅** — all 18 batches (B0→B16 + B-UI) run; **15 findings fixed**, verified & archived (3× S2); automated re-run of existing batches clean (0 new); the added **B16** (migrations/secrets/deploy) surfaced + fixed 1 (version-check opt-out). Guard suite green. External-infra items (live TOTP/WebAuthn/OIDC, SMTP delivery, cross-browser, PWA-offline, load, container build) carried to Cycle 2 as non-blocking.

This is a **living, operational** QA document, not a static spec. Claude runs it,
in **batches**, actively hunting for bugs/errors/rough edges, **fixing** them, and
**archiving** each fixed finding to `HISTORY.md`. Update this document whenever a
better approach, a new risk area, or a missed surface is discovered.

> **The prime directive:** don't just confirm the happy path — try to *break*
> the product. Every batch should end with the tree green, the Findings Log
> up to date, and any fixes archived to `HISTORY.md`.

---

## Table of contents

0. [Execution model — find, then fix, then repeat](#0-execution-model--find-then-fix-then-repeat)
1. [Batch plan & progress tracker](#1-batch-plan--progress-tracker)
2. [Active Findings Log](#2-active-findings-log)
3. [Archiving fixed findings to HISTORY.md](#3-archiving-fixed-findings-to-historymd)
4. [Environment & setup](#4-environment--setup)
5. [Test data strategy](#5-test-data-strategy)
6. [Cross-cutting checks (every page)](#6-cross-cutting-checks-every-page)
7. [Batch playbooks (detailed checklists)](#7-batch-playbooks-detailed-checklists)
8. [Appendices](#8-appendices)

---

## 0. Execution model — find, then fix, then repeat

**Separate finding from fixing.** During a QA pass we *hunt and log* — we do **not**
fix as we go (except show-stoppers, see below). Only after the whole plan has run
do we enter a dedicated **fix phase** and fix **every** logged finding. Then we run
the **entire** QA plan again from the top. Repeat until a full pass finds **zero**
errors. Two nested loops:

```
  OUTER — QA CYCLE (repeat until a full pass finds zero findings)
  ┌──────────────────────────────────────────────────────────────────────┐
  │  PHASE 1 · FIND      Run every batch B0→B15 in find-only mode.        │
  │                      Probe hard, LOG everything to the Findings Log.  │
  │                      Do NOT fix (except show-stoppers).               │
  │            ↓                                                          │
  │  PHASE 2 · FIX       QA pass done. Now fix EVERY logged finding —     │
  │                      all of them (S1→IMP). Root-cause, with tests.    │
  │            ↓                                                          │
  │  PHASE 3 · VERIFY    Re-run each fix's repro; `npm run ci` green.     │
  │            ↓                                                          │
  │  PHASE 4 · ARCHIVE   Move every fixed finding to HISTORY.md (§3).     │
  │            ↓                                                          │
  │  PHASE 5 · RE-RUN    Start a new cycle at PHASE 1. If that full pass  │
  │                      logs zero findings → QA is clean, STOP.          │
  └──────────────────────────────────────────────────────────────────────┘

  INNER — per batch during PHASE 1 (find-only)
  PICK next ⬜ batch → SET UP (app, data state, role, console open) →
  PROBE (actively break it, §5 adversarial inputs) → LOG every finding to §2 →
  mark batch status in §1 → next batch.  (No fixing here.)
```

**Show-stopper exception.** A *show-stopper* is a finding that **blocks continued
QA** — the app won't boot, you can't log in, or a page crashes so hard you can't
test the rest of it. Only these get fixed immediately (mid-pass), because you
can't proceed otherwise. Log it, fix it, verify, and note it was a mid-pass fix;
then continue the find pass. **Everything else is logged and left for Phase 2** —
no matter how tempting or trivial.

**Discipline (for best results)**
- **Phase 1 is log-only.** Resist fixing. A clean, complete inventory of findings beats a scattered fix-as-you-go pass and produces better batching.
- Keep each find batch tight and focused — one batch per session — so probing stays thorough.
- **Phase 2 fixes everything**, not just S1/S2. Root-cause over surface patch; add/extend a test in `tests/` or `client/**/*.test.*` for every logic bug so it can't silently return.
- Never leave the repo red at the end of Phase 3 — `npm run ci` must be green before archiving.
- Touch product behavior? Run the `/verify` skill on the affected flow before archiving.
- **The exit is empirical:** you're done only when an entire find pass (every correctness batch in §1: B0→B16 plus B-UI) turns up zero new findings — not when you *think* it's clean. Log the cycle result in the [Cycle Log](#11-qa-cycle-log) each time.
- Improve THIS plan whenever a pass reveals a missed surface, a better repro, or a batch that should be reordered/split.

**Improvement lens (not just bug-hunting).** QA here is also about making the product
*better*, not only *correct*. On every batch, in addition to logging bugs, actively
look through three improvement lenses and log what you find as **IMP** items in the
[Improvement Backlog (§2.1)](#21-improvement-backlog):
- **Code health & consolidation** — duplication to DRY up, dead code to delete,
  overlapping modules to merge, oversized files to split, one canonical path per
  concern. *Consolidate only where it genuinely reduces surface area and is
  behavior-preserving.* (Dedicated pass: **B17**.)
- **User experience** — friction in core flows, unclear states (empty/loading/error),
  weak feedback/affordances, inconsistent patterns, mobile parity. (Dedicated pass: **B18**.)
- **Information architecture / menus** — features that are buried or only reachable by
  URL, actions that belong in a menu (nav, overflow, context, settings groupings), and
  groupings that would make the app more discoverable. *Put things where a user would
  look for them.* (Dedicated pass: **B18**.)

IMP items are **proposals**, not silent changes: log the candidate with a concrete
recommendation, agree the direction, then implement behind a test. They **don't block
sign-off** — but a strong QA cycle leaves the code cleaner and the UX clearer, not just
green.

---

## 1. Batch plan & progress tracker

Batches are ordered **foundation-first** (baseline & auth before features; features
before cross-cutting; regression last). Update **Status** and **Findings** every run.

**Status key:** ⬜ Not started · 🔄 In progress · ✅ Done (green, findings archived) · 🔁 Needs recheck

| # | Batch | Primary surface | Data state | Status | Open / Fixed |
|---|-------|-----------------|-----------|--------|--------------|
| B0 | Baseline, tooling & **coverage recon** | `npm run ci`/`check`, app boots, console clean, **re-scan routes/pages/API vs plan & update it**, **control census** | any | ✅ | 0 / 1 |
| B-UI | **Design-system primitives** | each `client/components/ui/*` × state matrix (default/hover/focus/active/disabled/loading/error/read-only) × light/dark × keyboard | any | ✅ | 0 / 0 |
| B1 | Auth & authorization | login (pw/OIDC/TOTP/WebAuthn), roles, single-user, CSRF, data isolation | multi + single user | ✅ | 0 / 0 |
| B2 | Tracker (core) | `/` buckets, pay/skip/notes/overrides, balance cards, overdue, ledger, drift | seeded + adversarial | ✅ | 0 / 0 |
| B3 | Bills & schedules | `/bills` CRUD, custom schedules, reorder, merchant rules, historical import | adversarial | ✅ | 0 / 0 |
| B4 | Subscriptions & Categories | `/subscriptions`, catalog, `/categories`, groups, reorder | seeded | ✅ | 0 / 0 |
| B5 | Reporting reconciliation | `/summary`, `/calendar`, `/analytics`, `/health` cross-check totals | seeded + large + **live SimpleFIN DB** | ✅ | 0 / 4 |
| B6 | Spending | `/spending` YNAB view, averages, cover-overspending, safe-to-spend | seeded + edge months | ✅ | 0 / 1 |
| B7 | Debt planning (math) | `/snowball`, `/payoff` APR/amortization vs hand-calc | edge (APR=0, $0 debt) | ✅ | 0 / 2 |
| B8 | Banking & bank sync | `/bank-transactions`, SimpleFIN sync, matching, merchant/store, advisory filter | seeded txns + **live SimpleFIN sync** | ✅ | 0 / 0 |
| B9 | Data lifecycle | `/data` import (XLSX/CSV/SQLite), export, ICS feed, backups round-trip | empty + seeded | ✅ | 0 / 1 |
| B10 | Notifications & workers | email + ntfy/Gotify/Discord/Telegram, reminders, cron workers | seeded | ✅ | 0 / 1 |
| B11 | Admin panel | users, login mode, auth methods, backups, cleanup, status, onboarding | admin | ✅ | 0 / 0 |
| B12 | Settings, Profile & global UI | `/settings`, `/profile`, static pages, command palette, sidebar/nav | any | ✅ | 0 / 0 |
| B13 | API / backend direct | all `/api/*`: auth, CSRF, validation, rate limits, error shape, IDOR, cents | via HTTP client | ✅ | 0 / 1 |
| B14 | Non-functional | a11y, performance, PWA/offline, XSS/secrets, timezone/DST | large + adversarial | ✅ | 0 / 4 |
| B15 | Regression & sign-off | full smoke on **production build**, exit criteria | seeded | ✅ | 0 / 0 |
| B16 | Migrations, secrets & deploy | migration idempotency/rollback/fresh==migrated, encryption-key lifecycle, `docker-entrypoint` (perms/first-run/migrate), update-check phone-home | scratch + docker | ✅ | 0 / 1 |
| B17 | **Code health & consolidation** (IMP) | duplication/DRY, dead code, overlapping modules to merge, oversized files to split, one canonical path per concern | whole repo | ⬜ | 0 / 0 |
| B18 | **UX & information architecture** (IMP) | core-flow friction, empty/loading/error states, feedback/affordances, nav/menu discoverability, surfacing actions into sensible menus | any | ⬜ | 0 / 0 |

> After B15, if any batch is 🔁 or has open S1/S2, loop back. Then start a new
> cycle from B0 against the next build/version.
>
> **B17/B18 are improvement (IMP) batches** — they run alongside the correctness
> batches but their findings are enhancements, not defects, and don't gate sign-off.

**✅ means "run complete for this cycle's automatable scope, green, findings archived."**
Cycle 1 built a durable, automated guard for every batch (`npm run ci` · `test:e2e` ·
`test:e2e:probe` · `smoke:prod`). The following need **external infrastructure or a
human** and were **not** exercised — they are **non-blocking** for Cycle 1 sign-off and
carried to Cycle 2:

- **B1** — live TOTP enrollment, WebAuthn/passkeys (browser/OS prompts), OIDC SSO round-trip. (Password login, roles, CSRF, data-isolation, admin authz **are** covered.)
- **B9** — spreadsheet/CSV import *from real files* end-to-end (money-unit handling + SQLite export→import round-trip **are** covered by tests).
- **B10** — real SMTP *delivery* (push delivery + email-HTML building/escaping **are** covered by `tests/notificationDelivery.test.js`).
- **B11** — backup create/restore on a scratch instance (authorization + last-admin guards **are** covered).
- **B14** — Firefox/Safari cross-browser, PWA install/offline, and large/stress load+perf. (axe a11y on 8 pages, XSS/escaping, and prod-bundle perf **are** covered.)

### 1.1 QA Cycle Log

One row per full QA cycle (Phase 1 find → Phase 2 fix → … → Phase 5 re-run). A
cycle is only "clean" when its **find pass logged zero findings**. Keep going
until you get a clean cycle.

| Cycle | Started | Build / commit | Findings logged | Fixed / archived | Result |
|-------|---------|----------------|-----------------|------------------|--------|
| 1 | 2026-07-02 | `bdbf231`→`5ffe2db` (dev) | 14 | **14 → all fixed, verified & archived** (3× S2 incl. broken "Send test push", email XSS, reconciliation family, seed 100× cents) | 🔁 Phase 2 complete — 0 open. Every batch B0→B15 (+B-UI) run; 16 QA commits; guard suite green. |
| 1·re-run | 2026-07-02 | `5ffe2db` (dev) | **0 new** | — | ✅ **Automated re-run clean.** CI (server 109 + client 34, build), UI E2E 27, probe 16 (authz 403, Tracker↔Summary↔Analytics reconcile exactly, seed guard, a11y 8/8), prod-smoke PASS. **All 17 batches ✅ for automatable scope; external-infra residuals listed above (§1) are non-blocking and carried to Cycle 2.** |
| 1·simplefin-live | 2026-07-03 | `5ffe2db` (dev) vs prod DB | **1** (QA-B5-04) | **1 → fixed, verified & archived** | 🔁 Probed a **copy of the live SimpleFIN DB** (19 MB, v1.06: 3 users, 44 bills, 1,159 txns, 19 accounts, active SimpleFIN source). Integrity checks: dedup (1159/1159 distinct), money=integer cents, no double-match, pending have provider ids, no orphan-account txns — all pass **except** 3 matched txns with NULL bill → QA-B5-04 (retention GC + `ON DELETE SET NULL`). Fixed in `cleanupService`; healing verified on a DB copy (3→0, 0 txns lost). **Also ran a real end-to-end sync** (`syncDataSource`, the Sync-button path) against the live connection off a working copy: token decrypted via db-key fallback (no env key), bridge fetch OK (2.2s), 18 accounts upserted, 145 fetched txns **skipped not duplicated**, 0 new, 1159→1159 distinct — **dedup/upsert idempotency proven on the real connection.** |

**Result key:** 🔄 in progress · 🔁 findings fixed, re-run required · ✅ clean (zero findings — QA complete)
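
The Result key can be read as a small decision rule. The sketch below is illustrative only: `cycleResult` and its `logged`/`archived` fields are hypothetical names, not part of the repo.

```javascript
// Hypothetical helper: classify one Cycle Log row using the Result key.
// `logged`   = findings logged during the find pass
// `archived` = findings fixed, verified and archived to HISTORY.md
function cycleResult({ logged, archived }) {
  if (logged === 0) return '✅'; // clean find pass: QA complete
  if (archived >= logged) return '🔁'; // everything fixed: re-run required
  return '🔄'; // fixes still in progress
}

console.log(cycleResult({ logged: 14, archived: 14 })); // 🔁
console.log(cycleResult({ logged: 0, archived: 0 })); // ✅
```

Note that a cycle with findings can never return ✅, however many were fixed; only a find pass that logs zero findings ends the loop.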

---

## 2. Active Findings Log

**This is the live log.** Record every finding here the moment it's found — before
fixing. Keep only **Open / Fixing / Fixed** rows here. Once a finding is
**Fixed + verified + archived to `HISTORY.md`**, delete its row from this table
(its permanent record is the changelog entry).

**Finding ID:** `QA-B{batch}-{nn}` (e.g. `QA-B2-01`).
**Severity:** S1 Critical · S2 Major · S3 Minor · S4 Cosmetic · IMP Improvement (see [Appendix A](#appendix-a--severity-definitions)).
**Status:** 🔴 Open → 🟡 Fixing → 🟢 Fixed (verified, awaiting archive) → then remove on 📦 Archive.

| ID | Sev | Area (`file:line`) | Summary | Status | Notes / repro |
|----|-----|--------------------|---------|--------|---------------|
| _(none — all Cycle 1 findings fixed, verified & archived to `HISTORY.md` v0.41.0)_ | | | | | |
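
If finding IDs are ever generated or linted by a script, the `QA-B{batch}-{nn}` format could be handled roughly as follows. This is a sketch under assumptions: `parseFindingId` and `nextFindingId` are hypothetical helpers, `{nn}` is taken to be two digits, and only numeric batches are matched because the plan does not spell out an ID form for B-UI.

```javascript
// Assumed shape: QA-B<batch number>-<two-digit sequence>, e.g. QA-B2-01.
const FINDING_ID = /^QA-B(\d+)-(\d{2})$/;

// Returns { batch, seq } for a well-formed ID, or null otherwise.
function parseFindingId(id) {
  const m = FINDING_ID.exec(id);
  if (!m) return null;
  return { batch: `B${m[1]}`, seq: Number(m[2]) };
}

// Next free ID for a batch, given every ID already used (open or archived).
function nextFindingId(batch, existingIds) {
  const used = existingIds
    .map(parseFindingId)
    .filter((p) => p !== null && p.batch === batch)
    .map((p) => p.seq);
  const next = used.length > 0 ? Math.max(...used) + 1 : 1;
  return `QA-${batch}-${String(next).padStart(2, '0')}`;
}

console.log(nextFindingId('B5', ['QA-B5-01', 'QA-B5-04'])); // QA-B5-05
```

Because archived rows are deleted from the table, the list passed to `nextFindingId` would need to include IDs already recorded in `HISTORY.md`, otherwise numbers get reused.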

**Finding template** (paste a new row above; keep the full write-up here until archived):

```
ID: QA-B?-??
Severity: S1 / S2 / S3 / S4 / IMP
Environment: browser / viewport / theme / role / auth mode / data state
Area: file:line (if known)
Steps to reproduce:
  1.
  2.
Expected:
Actual:
Evidence: console / network / DB row / screenshot
Fix: (what changed, commit) — Verified by: (repro re-run + ci)
```

Log console errors, failed network requests, and unhandled rejections as findings
**even if the UI looks fine**.

_All Cycle 1 write-ups have been archived to `HISTORY.md` v0.41.0 (see §3)._

### 2.1 Improvement backlog

**IMP-stream, separate from the bug log above.** Enhancement candidates found through
the three improvement lenses (code/consolidation, UX, IA/menus — see §0 and batches
B17/B18). These are **proposals**: log the candidate + a concrete recommendation, then
discuss before implementing. They don't gate sign-off. When one is implemented, archive
it to `HISTORY.md` (`### 🧹 QA` / `### ✨` as fits) and remove the row; deferred ideas
graduate to `roadmap.md`/`FUTURE.md`.

**ID:** `IMP-{stream}-{nn}` where stream = `CODE` (health/consolidation), `UX`, or `IA` (menus/nav).
**Effort:** S (local, <1h) · M (a file or two) · L (cross-cutting / needs design).

**Status:** 🔵 Noted (proposal) → 🟡 Doing → then archive to `HISTORY.md` on implement.

| ID | Lens | Area (`file`/page) | Proposal (what & why) | Effort | Status |
|----|------|--------------------|-----------------------|--------|--------|
| IMP-CODE-01 | Code | `client/lib/money.js` (+16 files) | ~~No shared client money formatter.~~ **Shipped `a15f00c`:** added `client/lib/money.js` (`formatUSD`/`formatUSDWhole`/`formatCentsUSD`); `lib/utils.fmt` delegates to it and 15 local formatters were removed. `null`/`NaN`/`-0` all handled. Test `client/lib/money.test.js`; full client suite + build green. | M | ✅ Shipped |
| IMP-CODE-02 | Code | `db/database.js` (4,174→3,859 ln) | **Partly done `7f2faea`:** extracted the ~315-line static subscription-catalog seed to `db/subscriptionCatalogSeed.js`. **Remaining:** the ~2,700-line migrations array is the biggest block but is the DB core — splitting it needs a dedicated, carefully-tested pass (a bad move corrupts every DB on boot), not a blind autonomous refactor. | L | 🟡 Partial |
| IMP-CODE-03 | Code | `services/transactionMatchState.js` | ~~Overlapping match logic.~~ **Shipped `fa24322`:** added canonical `markMatched`/`markUnmatched`/`markIgnored`; routed the 6 single-transaction transitions through it (guarded bulk sweeps keep their own queries by design). Test `tests/transactionMatchState.test.js`. | M | ✅ Shipped |
| IMP-IA-01 | IA | Sidebar · `/data` | ~~Central features under an overflow menu.~~ **Shipped `0b1c6a8`:** Data moved into the main app nav (desktop dropdown + mobile) alongside Bills/Categories/Spending; same default-admin gate preserved; removed the redundant account-dropdown entry. | S | ✅ Shipped |
| IMP-UX-01 | UX | Bills delete | ~~Retention isn't surfaced.~~ **Shipped `aace5a4`:** Bills shows a "Recently deleted (N)" button opening a restore dialog (amount, category, days-left); `GET /api/bills/deleted` (30-day window). Test `tests/billsDeletedRoute.test.js`. _Categories/payments could get the same treatment later._ | M | ✅ Shipped |
| IMP-UX-02 | UX | all list pages | **State audit.** Systematically verify every list/page has an empty-state with a CTA, a skeleton/loading state, and a recoverable error state (pair with B14 fault-injection) — no dead ends, no silent failures. | M | 🔵 Noted |

---

## 3. Archiving fixed findings to HISTORY.md

`HISTORY.md` is the project changelog (version-organized, emoji section headers).
When a finding is Fixed **and verified**, write a concise entry there, then remove
the row from the Active Findings Log.

**Where:** under the current in-progress version heading (e.g. `## v0.41.x`). If a
QA cycle produces several fixes, group them under a `### 🐛 QA Fixes` (bug fixes)
or `### 🧹 QA` (polish/improvements) section, matching the existing changelog voice.

**Entry format** (match the terse, specific style already in `HISTORY.md`):

```markdown
### 🐛 QA Fixes

- **[Area] Short title** — What was wrong and the user-visible impact, then the
  fix. Reference the file/function and any migration or test added.
  (was QA-B7-03)
```

**Rules**
- One bullet per finding; include the old `QA-B?-??` id in parentheses for traceability.
- If a fix added/changed a test, say which (`tests/…` or `client/…test.*`).
- Don't archive until the fix is verified (repro gone + `npm run ci` green).
- IMP items that were implemented are archived the same way; IMP items merely *noted* stay in the [Improvement Backlog (§2.1)](#21-improvement-backlog) (or graduate to `FUTURE.md`/`roadmap.md` if deferred).
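
As an illustration of the entry format and the rules above, a bullet could be rendered from a finding record like this. `historyBullet` and its field names are hypothetical, and the values in the usage line are placeholders rather than a real finding.

```javascript
const EM_DASH = '\u2014'; // separator used by the entry format above

// Hypothetical renderer: one HISTORY.md bullet per verified finding.
function historyBullet({ id, area, title, body, test, verified }) {
  // Rule: never archive before the repro is gone and `npm run ci` is green.
  if (!verified) {
    throw new Error(`${id}: not verified, do not archive yet`);
  }
  // Rule: if a test was added or changed, name it.
  const testNote = test ? ` Test: \`${test}\`.` : '';
  // Rule: keep the old QA id in parentheses for traceability.
  return `- **[${area}] ${title}** ${EM_DASH} ${body}${testNote}\n  (was ${id})`;
}

console.log(
  historyBullet({
    id: 'QA-B7-03',
    area: 'Area',
    title: 'Short title',
    body: 'What was wrong, then the fix.',
    test: 'tests/example.test.js',
    verified: true,
  }),
);
```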

---

## 4. Environment & setup

### 4.1 Running the app

| Mode | Command | URL |
|------|---------|-----|
| Dev (API + UI, hot reload) | `npm run dev` | UI `http://localhost:5173` (proxies API → `:3000`) |
| API only | `npm run dev:api` | `http://localhost:3000` |
| Production build | `npm run build` then `npm start` | `http://localhost:3000` |
| Docker | `docker-compose up` | per compose config |

- Backend: Node/Express on `PORT` (default `3000`). Frontend dev: Vite on `5173`.
- Data: SQLite at `db/bills.db` (WAL). **Back it up before destructive tests** (`backups/` or a manual copy). Prefer a scratch DB for B9/B11 restore tests.
- Configure a dedicated **test** `.env` from `.env.example`. Never point tests at production data or a live SimpleFIN account with real credentials.
- Test commands: `npm run ci` (check + all tests + build), `npm run check` (syntax + build), `npm run test` (server), `npm run test:client` (vitest).

### 4.2 Test matrix

Full functional pass across reasonable combinations; smoke (B15) across all.

| Dimension | Values |
|-----------|--------|
| Browser | Chrome/Chromium, Firefox, Safari (WebAuthn differs per browser) |
| Viewport (px wide) | Desktop ≥1280, tablet ~768, mobile ~375 (iPhone SE), ~414 |
| Theme | Light, Dark, system-follow |
| Role | `user`, `admin`, default admin (first-run) |
| Auth mode | Multi-user, single-user |
| Density | Normal + compact desktop |
| Network | Online, Slow 3G, offline (PWA shell) |
| Data state | Empty, seeded demo, large/stress, adversarial |

### 4.3 Accounts to prepare
- `admin`, `user`, a **second** `user` (data-isolation), a single-user-mode instance (separate DB).
- Demo reference: `guest / guest123` (do not run destructive flows on any shared demo server).

### 4.4 Automated E2E harness (Playwright)

Manual passes prove a button works **once**; they don't stop it regressing next cycle. The Playwright suite is the regression net — it drives real clicks in a real browser, and it's where visual-regression, axe-a11y, and fault-injection (§B14) are wired so they re-run every cycle for free.

| Command | What it does |
|---------|--------------|
| `npm run test:e2e` | run the E2E suite headless (boots the app via `webServer`) |
| `npm run test:e2e:ui` | Playwright UI mode — watch/debug interactively |
| `npm run test:e2e:update` | re-baseline visual-regression screenshots (review the diff before committing) |
| `npm run smoke:prod` | **B15 production-build smoke** — builds, boots `node server.js` (dist/), drives the real artifact so the split vendor chunks are validated at runtime |

- **Setup (one-time):** `npm install` then `npx playwright install chromium`. Config: `playwright.config.js`; specs in `e2e/`.
- **Scope:** the suite is a **thin critical-path smoke**, not a replacement for the manual playbooks — it locks the happy paths (login → pay bill → skip → note → reconcile), the primitive state matrix, per-page axe scans, and page screenshots. Grow it whenever a manual pass finds a UI regression that a click-test could have caught.
- **Don't** point it at production data or a live SimpleFIN account — it runs against a scratch DB with seeded demo data.

---

## 5. Test data strategy

- **Empty:** brand-new account. Every page must render a sensible empty state — no crash, no `NaN`, no blank white screen.
- **Seeded:** use **Data → Seed Demo Data** for a realistic mid-size dataset.
- **Large/stress:** 500+ bills, 5,000+ transactions, 24+ months history — exercises virtualization (`@tanstack/react-virtual`), charts, query perf.
- **Adversarial (deliberately try to break it):**
  - Amounts: `0`, `0.01`, negative, `9,999,999.99`, fractional cents.
  - Text: emoji, RTL, `<script>` XSS probe, 1,000-char strings, leading/trailing spaces, SQL-ish input.
  - Dates: 1st/14th/15th/31st boundaries; 28/29/30/31-day months; Feb 29; month/year crossing; inactive ranges; skipped months; overrides.
  - Transactions: duplicate amount+date, same-day merchant repeats, refunds/negatives.
  - Debt: APR `0%`, very high APR, `$0` balance, absurd inputs.
  - Non-UTC system timezone + a DST boundary date.
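
The adversarial list above could be kept as reusable fixtures so every batch probes with the same inputs. The following is a minimal sketch: the fixture names and the `boundaryDays` helper are hypothetical, and the sample strings are just one possible choice per category.

```javascript
// Illustrative fixtures only; each value mirrors a bullet in the list above.
const adversarialAmounts = ['0', '0.01', '-5', '9,999,999.99', '1.005'];
const adversarialText = [
  '💸 rent', // emoji
  'שלום עולם', // RTL
  '<script>alert(1)</script>', // XSS probe
  'x'.repeat(1000), // 1,000-char string
  '  padded  ', // leading/trailing spaces
  "Robert'); DROP TABLE bills;--", // SQL-ish input
];

// Last calendar day of a month (month is 1-12). Day 0 of the following
// month rolls back to the final day of the requested month.
function lastDayOfMonth(year, month) {
  return new Date(Date.UTC(year, month, 0)).getUTCDate();
}

// Due days worth probing in one month: the bucket boundaries (1/14/15),
// the 28th, and the month's real last day (28, 29, 30 or 31).
function boundaryDays(year, month) {
  const last = lastDayOfMonth(year, month);
  return [...new Set([1, 14, 15, 28, last])].sort((a, b) => a - b);
}

console.log(boundaryDays(2024, 2)); // [ 1, 14, 15, 28, 29 ]
```

Using `Date.UTC` keeps the month-length calculation independent of the machine's timezone, which matters when the same fixtures are run under the non-UTC/DST setup listed above.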

---

## 6. Cross-cutting checks (every page)

Run on **every** page during its batch — don't assume a shared component behaves the same everywhere.

**Navigation & routing** — reachable from nav and by direct URL (deep link) + after hard refresh · back/forward restores state, no stuck spinners · unknown sub-paths → `NotFoundPage` · active nav highlighted · `simplefinOnly` (Banking) gated · `Ctrl+K` palette finds & opens it.

**Buttons & interactions** — every button/link/icon/dropdown/tab/toggle/menu does something or is disabled with a reason · no dead controls · double-click doesn't duplicate records · **rapid repeated toggling** (spam a switch / pay-skip) resolves to one correct state, no stuck spinner · action started then **navigate away mid-flight** doesn't corrupt or throw · destructive actions confirm + cancel · primary action keyboard-reachable (Tab/Enter/Esc).
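
The double-click and rapid-toggle checks can also be driven from a script. The sketch below assumes two caller-supplied async functions, `action` (the call a click would trigger) and `countRecords` (however the tester counts resulting rows); both names, and `rapidFire` itself, are hypothetical.

```javascript
// Fire the same action `times` in parallel, the way a double-click or a
// spammed toggle would, then report how many records were actually created.
async function rapidFire(action, countRecords, times = 5) {
  const before = await countRecords();
  const results = await Promise.allSettled(
    Array.from({ length: times }, () => action()),
  );
  const after = await countRecords();
  return {
    created: after - before,
    rejected: results.filter((r) => r.status === 'rejected').length,
  };
}
```

For a "pay bill" style action the expected outcome is `created === 1`; anything higher is a duplicate-record finding. Rejected calls are reported rather than thrown so a tester can see whether the extra attempts failed cleanly or silently succeeded.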

**Forms & validation** — required fields enforced · numeric/currency reject letters, handle 0/negative/decimal · errors don't wipe entered data · **paste** into every field (incl. `"$1,234.56"` into currency) · **browser/password-manager autofill** on login & forms · **IME/composition** (emoji, CJK) in text fields commits correctly · success shows toast (sonner) and the view updates without manual refresh (React Query invalidation).

**Number inputs (you have ~45 `type="number"` fields — the highest-risk control type)** — scroll-wheel over a focused field must **not** silently change the value · spinner up/down buttons step correctly and respect min/max · reject `e`/`+`/exponent notation and multiple decimal points · locale decimal comma vs dot · leading zeros · empty field ⇒ no `NaN` submitted · cents fields never accept >2 decimals.
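
To make the expected accept/reject behavior concrete, here is one plausible validation policy written as a pure function. It is a sketch, not the app's actual parser: `parseCurrencyToCents` is a hypothetical name, and the choices to reject negatives and decimal commas are assumptions that individual fields may not share.

```javascript
// Hypothetical validator: returns integer cents, or null when the input
// should be rejected.
function parseCurrencyToCents(raw) {
  if (typeof raw !== 'string') return null;
  // Tolerate a pasted "$1,234.56": trim, drop one leading "$", and remove
  // thousands separators only when they sit in valid positions.
  let s = raw.trim();
  if (s.startsWith('$')) s = s.slice(1);
  if (/^\d{1,3}(,\d{3})+(\.\d*)?$/.test(s)) s = s.replace(/,/g, '');
  // Digits with an optional 1-2 digit fraction. This rejects "", "1e3",
  // "+5", "-5", "1.2.3", "1,5" and anything with more than two decimals.
  const m = /^(\d+)(?:\.(\d{1,2}))?$/.exec(s);
  if (!m) return null;
  const cents = Number(m[1]) * 100 + Number((m[2] || '').padEnd(2, '0'));
  return Number.isSafeInteger(cents) ? cents : null;
}

console.log(parseCurrencyToCents('$1,234.56')); // 123456
console.log(parseCurrencyToCents('1e3')); // null
```

Working on the string and converting straight to integer cents avoids ever holding the amount as a float, so an empty field yields `null` instead of `NaN`.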

**Per-control state matrix** — for each control on the page, verify every applicable state renders and behaves in **both light and dark**: default · hover · keyboard-focus (visible ring) · active/pressed · disabled (and truly non-interactive) · loading/in-flight · error/invalid · read-only · filled-to-overflow (1,000-char string / max-digit number wraps or truncates, no layout break).

> **Note — "sliders":** this app has **no `<input type=range>` sliders.** The `SlidersHorizontal` glyph is just the Bills **filter-panel** button; the closest real thing to a slider is a number stepper. Test those two surfaces where a slider would otherwise be expected.

**States** — loading skeleton/spinner, no layout jump · helpful empty state · error state (4xx/5xx/offline) recovers, `ErrorBoundary` shows a fallback not a white page.

**Visual & responsive** — correct at desktop/tablet/mobile, no overflow/h-scroll · dark mode contrast, no white flash · compact mode readable · long strings/big numbers wrap/truncate.

**Data integrity** — money 2-decimals, no float artifacts (`9.999999`) · dates in expected tz, period boundaries correct · values agree across pages (a bill total on Tracker == Summary == Analytics).
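
The float-artifact and cross-page checks can be illustrated in a few lines. This is a sketch with hypothetical helper names (`sumCents`, `reconcile`); it assumes totals are compared as integer cents.

```javascript
// Floats drift: in IEEE-754 doubles, 0.1 + 0.2 is not exactly 0.3.
console.log(0.1 + 0.2 === 0.3); // false

// Summing integer cents stays exact; anything else is refused loudly.
function sumCents(amounts) {
  return amounts.reduce((total, c) => {
    if (!Number.isInteger(c)) throw new TypeError(`not integer cents: ${c}`);
    return total + c;
  }, 0);
}

// Hypothetical reconciliation check: the same total, as reported by
// several pages, must agree to the cent.
function reconcile(totalsByPage) {
  const values = Object.values(totalsByPage);
  return values.every((v) => v === values[0]);
}

console.log(sumCents([1999, 1])); // 2000
console.log(reconcile({ tracker: 2000, summary: 2000, analytics: 2000 })); // true
```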

---

## 7. Batch playbooks (detailed checklists)

Each batch below is the detailed script for the matching row in [§1](#1-batch-plan--progress-tracker). Apply [§6](#6-cross-cutting-checks-every-page) throughout.

### B0 — Baseline, tooling & coverage recon
**Run FIRST in every cycle.** This is where the plan re-syncs with reality — new
pages, routes, endpoints, or features added since the last cycle get discovered
and folded in **before** testing, so coverage never silently rots.

**Tooling baseline**
- [ ] `npm run ci` — record any failing server/client test or build error as a finding (S1/S2).
- [ ] `npm run check` — server syntax + build clean.
- [ ] App boots via `npm run dev` **and** production `npm start`; note startup warnings.
- [ ] Load the app; browser console + server logs clean on first load and first navigation.
- [ ] Confirm which auth mode / seed state the DB is in; snapshot a backup before proceeding.

**Coverage recon — enumerate the *actual* product and diff it against this plan.**
Run these, then compare the output to the batch playbooks (§7) and the [route map](#appendix-c--page--route--api-quick-map):
- [ ] **Client routes** — `grep -nE "<Route" client/App.jsx` — every path present here must appear in a batch playbook and Appendix C.
- [ ] **Pages** — `ls client/pages/` — every page has an owning batch.
- [ ] **Sidebar / nav entries** — `grep -nE "to:|label:|Only" client/components/layout/Sidebar.jsx` — new nav links (incl. conditional ones like `simplefinOnly`) are covered.
- [ ] **API route mounts** — `grep -nE "app.use\('/api" server.js` — every mounted route group is in B13's list and mapped in Appendix C.
- [ ] **Services & components** — `ls services/` and `ls client/components/**/` — new service/component families have a home in a playbook.
- [ ] **UI primitives** — `ls client/components/ui/` — every shared primitive is covered by the [B-UI](#b-ui--design-system-primitives) playbook; a new primitive gets a row there.
- [ ] **Middleware & workers** — `ls middleware/ workers/` (+ `services/*Worker*`, `*Scheduler*`) — each is covered (csrf/rateLimiter/securityHeaders/requireAuth → B13; dailyWorker/bankSyncWorker/backupScheduler → B10).
- [ ] **Migrations & deploy** — new `db/database.js` migrations, `Dockerfile`/`docker-entrypoint.sh` changes, and `encryptionService`/`updateCheckService` behavior are covered by [B16](#b16--migrations-secrets--deployment).
- [ ] **Interactive-control census (makes "every button tested" *provable*)** — for each page, enumerate every button, link, toggle/switch, checkbox, select, text/number/date/file input, tab, menu, and filter control, and record it in a per-page control checklist (template: [Appendix E](#appendix-e--per-page-control-census)). A control that isn't on a checklist hasn't been tested — the census is the completeness guarantee the batch playbooks alone don't give you. Quick starting inventory: `grep -rnoE "type=[\"'][a-z]+[\"']" client/pages client/components` and `grep -rn "onClick=" client/pages/<Page>.jsx`.
- [ ] **Feature flags / conditional surfaces** — search for `Only`, `enabled`, `featureFlag`, env gates that hide/show pages; ensure each state is tested.
- [ ] **What changed since last cycle** — skim `git log`/`HISTORY.md` since the previous cycle's commit (see [Cycle Log](#11-qa-cycle-log)) for new features/pages.

**Update the plan (do this now, not later)** — for anything the recon surfaced that isn't already covered:
- [ ] Add it to the relevant batch playbook (or create a new batch and a row in the [§1 table](#1-batch-plan--progress-tracker)).
- [ ] Add/adjust its entry in [Appendix C](#appendix-c--page--route--api-quick-map).
- [ ] Note the plan update in the [Cycle Log](#11-qa-cycle-log) row for this cycle.
- [ ] If a whole surface is *missing* from the product that the plan expected (page removed/renamed), reconcile the plan too — don't test ghosts.
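
The route half of the recon can be automated as a plain text diff. The sketch below is an assumption-laden starting point: `extractRoutePaths` and `uncoveredRoutes` are hypothetical helpers, the regex expects `path` to appear before any `>` inside the `<Route ...>` tag, and a route counts as covered only if the plan mentions it in backticks.

```javascript
// Pull every path="..." (or path='...') out of <Route ...> elements.
// Assumes `path` precedes any ">" in the tag, e.g.
//   <Route path="/bills" element={<BillsPage />} />
function extractRoutePaths(appSource) {
  const paths = [];
  const re = /<Route\b[^>]*?\bpath=["']([^"']+)["']/g;
  let m;
  while ((m = re.exec(appSource)) !== null) paths.push(m[1]);
  return paths;
}

// Routes the plan text never mentions in backticks.
function uncoveredRoutes(appSource, planText) {
  return extractRoutePaths(appSource).filter(
    (p) => !planText.includes('`' + p + '`'),
  );
}
```

In practice the two arguments would be the contents of `client/App.jsx` and this plan, read from disk; any path returned needs a playbook entry and an Appendix C row before testing starts.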

### B-UI — Design-system primitives
**Test each shared control once, thoroughly, in isolation — a bug here breaks every page at once.** Drive them wherever they're already mounted (or a scratch page); run each against the [per-control state matrix](#6-cross-cutting-checks-every-page) × light/dark × keyboard-only. One finding row per primitive.

| Primitive (`client/components/ui/`) | Must verify |
|---|---|
| `button.jsx` | every variant (default/destructive/outline/ghost/link) + size; **disabled truly blocks click**; loading state; focus ring; Enter/Space activate |
| `input.jsx` | text/number/password/date/search/file types; placeholder; disabled/read-only; error styling; paste/autofill; number-input rules above |
| `select.jsx` (Radix) | opens by mouse **and** keyboard; type-ahead; long lists scroll; onChange fires in **Firefox+Safari**; disabled options; value persists; Esc closes |
| `checkbox.jsx` / `switch.jsx` | toggles by click **and** Space; indeterminate (if used); disabled; label click toggles; controlled value round-trips |
| `dialog.jsx` / `alert-dialog.jsx` / `confirm-dialog.jsx` / `input-dialog.jsx` | open/close; **focus trap + restore**; Esc closes; overlay click behaves; **Cancel actually cancels (no side effect)**; Confirm fires once; scroll-lock releases |
| `dropdown-menu.jsx` | keyboard arrow nav; Esc; submenu; disabled items; click-outside closes; no clipping at viewport edge |
| `tabs.jsx` | arrow-key nav; active state; content swaps; deep-link/refresh keeps tab (if applicable) |
| `tooltip.jsx` | hover **and** keyboard-focus show it; dismiss on blur; touch behavior; never the only place essential information lives (tooltip-only info is an a11y trap) |
| `table.jsx` | header/zebra/hover; horizontal scroll on narrow viewport (no page h-scroll); empty state |
| `collapsible.jsx` | expand/collapse animation; state persists; keyboard operable |
| `sonner.jsx` (toast) | success/error/loading; **stack + dismiss**; auto-dismiss timing; doesn't cover primary actions; announced to SR |
| `save-status.jsx` | idle/saving/saved/error transitions reflect real autosave (`useAutoSave.test.jsx`) |
| `Skeleton.jsx` | matches final layout (no jump); no infinite skeleton on error |
| `badge.jsx` / `card.jsx` / `separator.jsx` / `label.jsx` | contrast in dark mode; label `htmlFor` focuses its control; no overflow on long text |
| `theme-toggle.jsx` | light↔dark↔system; applied **before first paint** (no flash); persists across reload |

- [ ] Every primitive above passes its row in light **and** dark, keyboard-only, at mobile width.
- [ ] Axe scan (see B14) on a page densely using primitives → zero critical violations.

### B1 — Auth & authorization
- [ ] **Password:** valid login → correct landing (Tracker for `user`, `/admin` for default admin); wrong password → clear error, no user-enumeration timing/message difference; logout clears session; expired session redirects and preserves `state.from`; session persists across refresh.
- [ ] **Rate limiting:** repeated failed logins throttled (`loginLimiter`/`loginUsernameLimiter`), clear message, resets.
- [ ] **TOTP:** enroll (QR + secret), code accepted, backup codes work once, login prompts for TOTP, wrong code rejected+throttled, disable requires re-auth.
- [ ] **WebAuthn:** register/login/remove passkey in Chrome, Firefox, Safari; password fallback works.
- [ ] **OIDC/Authentik:** SSO flow creates/links account; admin config errors surface cleanly; `oidcLimiter` throttles.
- [ ] **Roles/guards:** `user` blocked from `/admin*`, `/status` (redirect) and admin APIs (403); default admin forced to `/admin`; single-user bypass correct but admin surfaces still protected; unauth API → 401.
- [ ] **Data isolation (critical):** user A cannot read/modify user B's bills, payments, transactions, categories, snowball plans — test by ID enumeration on the API.
- [ ] **CSRF:** state-changing request without a valid token → rejected.
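
The data-isolation probe is easier to audit if every response is recorded and anything other than a refusal is flagged automatically. A minimal sketch, assuming a hypothetical `probes` array (`{ resource, id, status }`) collected by whatever HTTP client runs the enumeration as user A; the helper name `findIsolationLeaks` and the sample values are illustrative, not part of the codebase:

```javascript
// Hypothetical helper: flag probe results that suggest a data-isolation leak.
// A probe counts as safe only when the API refuses with 403 or 404.
const SAFE_STATUSES = new Set([403, 404]);

function findIsolationLeaks(probes) {
  return probes.filter((probe) => !SAFE_STATUSES.has(probe.status));
}

// Made-up results: user A's session requesting user B's resource IDs.
const probes = [
  { resource: 'bills', id: 101, status: 404 },
  { resource: 'payments', id: 55, status: 403 },
  { resource: 'categories', id: 7, status: 200 }, // leak: log as a finding
];

const leaks = findIsolationLeaks(probes);
console.log(leaks.map((p) => `${p.resource}/${p.id} -> ${p.status}`));
```

Each entry left in `leaks` would become one row in the Findings Log.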

### B2 — Tracker (`/`)
- [ ] Month nav (prev/next/jump), current month highlighted, data reloads per month.
- [ ] Bills land in correct `1–14` / `15–31` bucket by due date; pin-due sorting works.
- [ ] Quick pay marks paid + updates balance cards/progress; undo works; no double-count.
- [ ] Skip excludes from totals for that month only; unskip restores.
- [ ] Per-month amount override persists, doesn't affect base bill or other months.
- [ ] Notes cell add/edit/clear persists per month.
- [ ] Inactive/date-range bill doesn't show or count outside its range.
- [ ] Balance/starting-amount cards period-aware + editable; income − bills / safe-to-spend correct.
- [ ] Overdue command center: accurate list/count, pay/skip actions work.
- [ ] Cash flow card, drift insight, payment ledger (add/edit/delete reconciles), autopay suggestion apply/dismiss.
- [ ] Editable cells autosave; Esc cancels; invalid input handled. Mobile rows offer the same actions as desktop. Compact mode stays intact.
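
For the bucket check, the boundary days (14 and 15) are where an off-by-one would hide. The sketch below is a hypothetical oracle to compare the Tracker against; the function name and the string labels are assumptions for illustration:

```javascript
// Hypothetical oracle for the Tracker's two due-date buckets.
function bucketForDueDay(dueDay) {
  if (!Number.isInteger(dueDay) || dueDay < 1 || dueDay > 31) {
    throw new RangeError(`dueDay must be an integer 1-31, got ${dueDay}`);
  }
  return dueDay <= 14 ? '1-14' : '15-31';
}

// Probe both sides of the boundary plus the extremes.
for (const day of [1, 14, 15, 31]) {
  console.log(day, bucketForDueDay(day));
}
```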

### B3 — Bills (`/bills`)
- [ ] Create with all fields (name, amount, due date, category, schedule, account, autopay, active range).
- [ ] Edit propagates to Tracker/Summary/Calendar/Analytics; delete confirms + handles orphan payments/history.
- [ ] Custom schedules (weekly/biweekly/monthly/quarterly/annual/custom): next-due & occurrences correct across month/year boundaries.
- [ ] Drag reorder persists (cross-check `billReorder.test.js`); search/filter panel filters + clears; large-list virtualization smooth.
- [ ] Merchant rules: create/match/edit/delete; historical import dialog attributes month-crossing payments correctly.
- [ ] BillModal open/close, validation, cancel discards unsaved changes.
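
For the fixed-interval schedules (weekly, biweekly), expected occurrences can be hand-checked with plain day arithmetic. This is a sketch of such an oracle, not the app's `resolveDueDate` logic; `occurrencesInMonth` and its parameters are hypothetical, and calendar-based schedules (monthly, quarterly, annual) would need separate handling:

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical oracle: dates (YYYY-MM-DD) on which a fixed-interval schedule
// falls inside a given month. All arithmetic is in UTC so DST cannot shift a day.
function occurrencesInMonth(anchorIso, intervalDays, year, month) {
  const [y, m, d] = anchorIso.split('-').map(Number);
  const anchor = Date.UTC(y, m - 1, d);
  const start = Date.UTC(year, month - 1, 1);
  const end = Date.UTC(year, month, 1); // first day of the following month
  const step = intervalDays * DAY_MS;

  // Index of the first occurrence on or after the start of the month.
  const first = Math.max(0, Math.ceil((start - anchor) / step));
  const dates = [];
  for (let t = anchor + first * step; t < end; t += step) {
    dates.push(new Date(t).toISOString().slice(0, 10));
  }
  return dates;
}

// Biweekly bill anchored in December, checked across the year boundary.
console.log(occurrencesInMonth('2025-12-19', 14, 2026, 1));
```

Comparing this output with what Bills and Tracker show for the same month gives a concrete pass/fail for the boundary cases.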

### B4 — Subscriptions & Categories
- [ ] Subscriptions: add/edit/delete, active/cancelled, renewal & annual→monthly normalization; totals feed Tracker/Summary/Analytics.
- [ ] Catalog: browse/search, add-from-catalog pre-fills.
- [ ] Categories: create/edit/delete (in-use handled: reassign/prevent); groups create/assign/reorder (`categoryGroups`/`categoryReorder` tests); colors/icons consistent on Tracker/Spending/Analytics.
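
The annual-to-monthly normalization is a place where rounding can quietly disagree between pages. A small sketch of the arithmetic in integer cents; rounding to the nearest cent is an assumption here (the app may truncate or round differently), and `annualToMonthlyCents` is a hypothetical name:

```javascript
// Hypothetical normalization: annual price in integer cents -> monthly cents.
// Nearest-cent rounding is an assumption, not confirmed app behavior.
function annualToMonthlyCents(annualCents) {
  if (!Number.isInteger(annualCents) || annualCents < 0) {
    throw new RangeError(`annualCents must be a non-negative integer, got ${annualCents}`);
  }
  return Math.round(annualCents / 12);
}

console.log(annualToMonthlyCents(11988)); // divides evenly
console.log(annualToMonthlyCents(10000)); // does not divide evenly
```

When the annual price does not divide evenly, twelve monthly amounts will not sum back to the annual figure, so whichever rule the app uses should be the same on Tracker, Summary, and Analytics.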

### B5 — Reporting reconciliation
- [ ] Summary totals (paid/unpaid/overdue/remaining) reconcile with Tracker for the same month; income breakdown modal matches.
- [ ] Calendar plots bills/payments on correct days (**timezone**: a bill due on the 1st must not render on the 31st); day totals correct.
- [ ] Analytics charts render with data AND empty (no broken SVG/`NaN` axes); period selectors update all charts; figures reconcile with Summary/Tracker; large dataset perf OK.
- [ ] Health indicators compute from real data, no crash on empty; recommendations sane.
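
The "1st renders on the 31st" calendar bug usually comes from JavaScript parsing a date-only string as UTC midnight and then displaying it in a timezone behind UTC. The sketch below shows the distinction; `parseLocalDate` is a hypothetical helper, not a function known to exist in the codebase:

```javascript
// Parse a date-only string into a Date at local midnight.
// `new Date('2026-02-01')` is interpreted as UTC midnight, so in a timezone
// behind UTC its local calendar day is January 31. Building the Date from
// numeric parts keeps the local day equal to the day in the string.
function parseLocalDate(iso) {
  const [year, month, day] = iso.split('-').map(Number);
  return new Date(year, month - 1, day);
}

const due = parseLocalDate('2026-02-01');
console.log(due.getMonth() + 1, due.getDate());

// The UTC-parsed form is only safe when read back with the UTC getters.
const utcParsed = new Date('2026-02-01');
console.log(utcParsed.getUTCMonth() + 1, utcParsed.getUTCDate());
```

Running the Calendar check with the `TZ` environment variable set to a zone behind UTC is one way to make this class of bug reproducible.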

### B6 — Spending (`/spending`)
- [ ] Category-group view assigned/spent/available math correct; 3-month averages correct.
- [ ] Cover-overspending reallocates funds correctly and is reversible.
- [ ] Safe-to-spend matches Tracker (`safeToSpend.test.js`); month nav; empty/partial months handled.

### B7 — Debt planning (`/snowball`, `/payoff`)
- [ ] Add debts (balance/APR/min); snowball vs avalanche ordering correct.
- [ ] Projection + amortization vs a **hand-calculated** example; APR=0 and already-paid debts correct.
- [ ] Extra-payment/budget updates payoff date + total interest; chart renders; plan history saves/restores; status banner accurate.
- [ ] Edge: single debt, many debts, `$0` debt, negative/absurd inputs rejected.
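
A hand-calculated example is easier to trust when the arithmetic is written down once. The sketch below simulates fixed monthly payments on a single debt; the interest model (monthly accrual at APR/12 on the opening balance, rounded to the nearest cent) is an assumption and may differ from `aprService`, and `simulatePayoff` is a hypothetical name:

```javascript
// Hypothetical hand-check oracle for one debt with a fixed monthly payment.
// Assumption: interest accrues monthly at APR/12 and rounds to the nearest cent.
function simulatePayoff(balanceCents, aprPercent, paymentCents) {
  let balance = balanceCents;
  let months = 0;
  let totalInterestCents = 0;
  while (balance > 0) {
    const interest = Math.round((balance * aprPercent) / 1200);
    if (paymentCents <= interest) {
      // Guard against a payment that can never reduce the balance.
      throw new RangeError('payment never reduces the balance');
    }
    balance += interest;
    totalInterestCents += interest;
    balance -= Math.min(paymentCents, balance); // final payment may be partial
    months += 1;
  }
  return { months, totalInterestCents };
}

// $1,000.00 at 12% APR, paying $500.00 per month.
console.log(simulatePayoff(100000, 12, 50000));
// APR = 0 and an already-paid debt, the two edge cases named above.
console.log(simulatePayoff(10000, 0, 2500));
console.log(simulatePayoff(0, 19.99, 2500));
```

If the app's projection differs from this by a cent or a month, the first thing to compare is the rounding and accrual assumptions rather than assuming a bug.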

### B8 — Banking (`/bank-transactions`)
- [ ] Ledger loads/virtualizes/filters (date/account/amount/merchant/status).
- [ ] Transaction matching (match/unmatch), auto-match review approve/reject, no double-match (`transactionMatchService.test.js`).
- [ ] Merchant/store matching rules + confidence/duplicates; advisory non-bill filter flags/hides with override.
- [ ] Matched payments reflect on Tracker/ledger without double-counting; category picker persists.

### B9 — Data lifecycle (`/data`)
- [ ] Imports: spreadsheet (XLSX/CSV) map/preview/commit, malformed rejected, dup/partial handled; transaction CSV (`csvTransactionImportService.test.js`) dedupe + parsing; SQLite user import version-checked + confirms overwrite; seed demo data safe; import history lists + rollback.
- [ ] Exports: download SQLite **round-trips** (export → fresh account → import → matches); Excel export opens uncorrupted; ICS calendar feed valid in a client AND properly **token-gated** (route mounts before auth — verify not open).
- [ ] Backups: manual + scheduled restorable on a scratch instance; permissions not world-readable; old backups pruned (`backupAndCleanup.test.js`).

### B10 — Notifications & workers
- [ ] Each channel (email/SMTP, ntfy, Gotify, Discord, Telegram): test message delivers; bad token/URL → clear error, logged, no secret leak.
- [ ] Reminders fire at configured lead time for upcoming/overdue; no duplicates; paid/skipped excluded; respects per-user prefs.
- [ ] Workers: `dailyWorker`, `bankSyncWorker` (interval + guardrails), `backupScheduler` run on schedule; errors caught/logged, don't crash server, next run unblocked.

### B11 — Admin panel (`/admin`)
- [ ] Onboarding wizard completes without a broken state.
- [ ] Users table: add/edit-role/reset-pw/disable/delete; **cannot remove the last admin**.
- [ ] Login mode switch single↔multi verified live, no lockout; auth-methods enable/disable + bad config surfaced.
- [ ] Email notif config + test send; bank sync admin (configure/manual/auto/status/revoke).
- [ ] Backups create/list/download/restore/delete; cleanup panel previews impact + confirms (counts match `backupAndCleanup.test.js`).
- [ ] Privacy admin edits reflect on public `/privacy`; system status metrics/versions/jobs accurate (`statusService.test.js`); admin actions rate-limited + audited (`auditService` — spot-check log).
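
The "cannot remove the last admin" rule has a few variants worth probing (delete, demote, disable). A sketch of the invariant as a pure check, assuming a hypothetical user shape `{ id, role, disabled }`; treating a disabled admin as not counting toward the remaining admins is also an assumption:

```javascript
// Hypothetical guard: may this user be removed without leaving zero usable admins?
function canRemoveUser(users, userId) {
  const target = users.find((u) => u.id === userId);
  if (!target) return false; // unknown user: nothing to remove
  if (target.role !== 'admin') return true;
  const otherAdmins = users.filter(
    (u) => u.role === 'admin' && u.id !== userId && !u.disabled
  );
  return otherAdmins.length > 0;
}

const users = [
  { id: 1, role: 'admin', disabled: false },
  { id: 2, role: 'admin', disabled: true },
  { id: 3, role: 'user', disabled: false },
];
console.log(canRemoveUser(users, 1)); // only a disabled admin would remain
console.log(canRemoveUser(users, 3));
```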

### B12 — Settings, Profile & global UI
- [ ] Settings: theme (light/dark/system) persists; notification prefs save + reflect in B10; display/density/period/search-panel prefs persist; invalid rejected.
- [ ] Profile: change password (current required, invalidates sessions), manage 2FA/passkeys, sessions revoke (`profileRoute.test.js`).
- [ ] Static: About (public + admin, version shown), Privacy, Release Notes (dialog once per `user`, dismiss persists), Roadmap (admin), NotFound friendly + way home.
- [ ] Global: command palette (`Ctrl+K`) search/keyboard/Esc, hidden for default admin; sidebar collapse/expand + mobile overlay (check overflow issue in `docs/UI_IMPROVEMENTS.md`); toasts stack/dismiss; page transitions no flash/double-fetch; theme applied before first paint.

### B13 — API / backend direct
Route groups: `auth`, `auth/oidc`, `admin`, `tracker`, `bills`, `subscriptions`, `payments`, `data-sources`, `transactions`, `matches`, `categories`, `settings`, `user`, `calendar`, `summary`, `monthly-starting-amounts`, `analytics`, `spending`, `snowball`, `notifications`, `status`, `about`, `about-admin`, `privacy`, `version`, `profile`, `export`, `import`/`imports`.
- [ ] Auth: unauth → 401, wrong role → 403, right role → 200.
- [ ] CSRF: state-changing without valid token rejected; with token succeeds (`middleware/csrf.js`).
- [ ] Validation: bad/missing body → structured 4xx (`middleware/errorFormatter.js`, `utils/apiError.js`), never a raw 500 stack.
- [ ] IDOR/isolation: other user's resource by id → 403/404, no leak.
- [ ] Rate limits: login/admin/export/import/OIDC limiters trigger + reset (`middleware/rateLimiter.js`).
- [ ] Money in **integer cents** end-to-end (per `docs/cents-migration-plan.md`); API and DB agree; no float drift.
- [ ] Idempotency: repeated create doesn't duplicate; concurrent edits resolve sanely.
- [ ] Consistent error JSON + correct status codes; security headers present (`middleware/securityHeaders.js`); public routes (`about`/`privacy`/`version`/calendar feed) leak nothing sensitive.
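
The float-drift check is concrete: binary floating point cannot represent most decimal fractions exactly, which is why money should stay in integer cents. A minimal sketch of a string-to-cents parser that never multiplies a float; `toCents` here is hypothetical and is not claimed to match `utils/money.js`:

```javascript
// Hypothetical parser: decimal-dollar string -> integer cents, without
// passing through floating-point multiplication.
function toCents(amount) {
  const match = /^(-)?(\d+)(?:\.(\d{1,2}))?$/.exec(amount.trim());
  if (!match) throw new TypeError(`not a valid money amount: ${amount}`);
  const [, sign, dollars, fraction = ''] = match;
  const cents = Number(dollars) * 100 + Number(fraction.padEnd(2, '0'));
  return sign && cents !== 0 ? -cents : cents;
}

console.log(0.1 + 0.2 === 0.3); // false: float drift
console.log(toCents('0.10') + toCents('0.20') === toCents('0.30')); // true
console.log(toCents('19.99'), toCents('19.9'), toCents('-5.25'));
```

A useful API probe is to POST an amount such as `19.99`, read it back, and confirm the stored and returned values agree to the cent.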

### B14 — Non-functional
- [ ] **a11y (manual):** keyboard-only reach/operate every control, visible focus, skip-link works; screen-reader labels/roles (Radix `aria-*`); WCAG-AA contrast light+dark; modals trap+restore focus, Esc closes; errors announced not color-only.
- [ ] **a11y (automated):** run **axe-core** on every page (`@axe-core/playwright`, or `jest-axe` for component-level) — **zero critical/serious** violations; triage moderate. Wire it into the E2E suite so it re-runs every cycle, not just once.
- [ ] **Visual regression:** capture a baseline screenshot per page × {desktop, mobile} × {light, dark} (Playwright `toHaveScreenshot`); diff against baseline each cycle. Every non-trivial pixel diff is either an intended change (update the baseline in the same commit) or a finding — never ignore it. This is what makes "every page looks right" repeatable instead of eyeballed.
- [ ] **Performance:** initial load + lazy route splitting OK on Slow 3G; large lists responsive; no memory leak over 10+ navigations; no duplicate/excess requests (React Query `staleTime`).
- [ ] **PWA/offline:** installs; manifest/icon correct; offline shell loads with graceful messaging; SW updates without stale-cache breakage.
- [ ] **Security spot-checks:** XSS in bill names/notes/category names/imported data escaped everywhere (defense = React auto-escaping + the restrictive custom `MarkdownText` renderer — https-only link hrefs, **no** `dangerouslySetInnerHTML` anywhere; NOT rehype-sanitize, which is unused, see QA-B14-03); no secrets (SimpleFIN token, SMTP creds, OIDC secret) in bundle/responses/logs; cookies `HttpOnly`/`Secure`/`SameSite`; `encryptionService` protects at-rest secrets, keys not committed. (Depth: `SECURITY_AUDIT.md`.)
- [ ] **Resilience:** kill API mid-session → recoverable errors, no data loss on next save; locked/corrupt SQLite surfaces clearly; SimpleFIN/SMTP/push down → graceful degrade; two-tab concurrent edits don't silently clobber.
- [ ] **Fault injection (systematic):** with a request-interception harness (Playwright `page.route`, or DevTools network overrides), force each page's API calls to **401 mid-session / 403 / 429 / 500 / network-timeout / malformed-JSON** and confirm the UI shows a recoverable error (toast or `ErrorBoundary` fallback), never a white screen, stuck spinner, or silent success. Do this per page, not once globally — each page handles failure differently.
- [ ] **Timezone/locale:** non-UTC tz + DST boundary — due dates and calendar stay correct.
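
Because fault injection has to run per page, it helps to generate the page × fault matrix rather than enumerate it by hand, so coverage can be counted. A sketch, assuming hypothetical page paths and fault labels taken from the checklist above:

```javascript
// The fault classes listed in the fault-injection item above.
const FAULTS = ['401', '403', '429', '500', 'timeout', 'malformed-json'];

// Hypothetical helper: expand pages x faults into one test case per pair,
// so no page is skipped and the total is easy to verify.
function buildFaultMatrix(pages, faults = FAULTS) {
  return pages.flatMap((page) =>
    faults.map((fault) => ({ page, fault, title: `${page}: survives ${fault}` }))
  );
}

const cases = buildFaultMatrix(['/', '/bills', '/spending']);
console.log(cases.length); // 18
console.log(cases[0].title);
```

In a Playwright harness, each case would typically map to a `page.route` handler that fulfills the matching API request with the given status or aborts it; that wiring is left out here because it depends on the project's E2E setup.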

### B15 — Regression & sign-off
Run on the **production build** (`npm start`), not dev:
- [ ] `npm run ci` green. Log in as `user` and `admin`.
- [ ] `npm run test:e2e` green (Playwright smoke + axe + visual-regression baselines match, §4.4).
- [ ] Tracker: create bill → quick-pay → skip another → add note; reflected on Summary/Calendar/Analytics.
- [ ] Create a category + subscription → appear on Tracker/Spending; Spending safe-to-spend correct.
- [ ] Snowball: add debt → projection. Data: seed → export → import round-trip (scratch DB).
- [ ] Admin: open panel, users, system status, run a backup. Banking loads + matches (if SimpleFIN configured).
- [ ] Notifications: one test message on configured channel. Toggle dark mode; mobile viewport; `Ctrl+K` navigates.
- [ ] Bogus URL → 404; logout → login redirect. Console clean throughout.
- [ ] Confirm [exit criteria](#appendix-b--exit--sign-off-criteria).

### B16 — Migrations, secrets & deployment
Added Cycle 1 (previously uncovered). These run on every boot / container start and
touch money columns and at-rest secrets — a bug here corrupts data or leaks/breaks
secrets silently.

**Migrations** (`db/database.js` migration system, `scripts/migrate-db.js`, `schema_migrations`, `rollbackMigration`)
- [ ] **Idempotent:** boot twice on the same DB → second run applies nothing ("Skipping already applied"), no errors, no duplicate rows/columns.
- [ ] **Fresh == migrated:** a brand-new DB (schema.sql + all migrations) has the same schema as a DB migrated up from an old version — same tables/columns/indexes, money columns are **integer cents**.
- [ ] **Rollback:** `rollbackMigration` on the latest migration reverts cleanly and re-applying works; partial/failed migration leaves the DB consistent (transactions per migration).
- [ ] **Money conversions correct:** v1.03 (dollars→cents) and v1.04 (template JSON) convert exact values, no ×100 drift, run once only.
- [ ] Migrating a large/real DB doesn't lose or duplicate bills/payments/categories.
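
The idempotency property can be stated as: a second run over the same applied-set does nothing. The sketch below models that with an in-memory `Set` standing in for `schema_migrations`; it illustrates the property under test and is not the runner in `db/database.js`:

```javascript
// Hypothetical in-memory model of a migration runner. `applied` stands in
// for the schema_migrations table.
function runMigrations(applied, migrations) {
  const ranNow = [];
  for (const migration of migrations) {
    if (applied.has(migration.version)) continue; // "Skipping already applied"
    migration.up();
    applied.add(migration.version);
    ranNow.push(migration.version);
  }
  return ranNow;
}

// Count side effects so a double-apply would be visible.
let columnAdds = 0;
const migrations = [
  { version: 'v1', up: () => { columnAdds += 1; } },
  { version: 'v2', up: () => { columnAdds += 1; } },
];

const applied = new Set();
const firstBoot = runMigrations(applied, migrations);
const secondBoot = runMigrations(applied, migrations);
console.log(firstBoot, secondBoot, columnAdds);
```

Against the real database, the equivalent check is to boot twice and diff the schema and row counts between the two boots.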

**Encryption-key lifecycle** (`services/encryptionService.js`, `TOKEN_ENCRYPTION_KEY`, HKDF v1/v2)
- [ ] **Key present:** secrets (SMTP pw, OIDC secret, push tokens, login IP/UA) encrypt at rest and decrypt correctly.
- [ ] **Key missing:** app boots; secret features degrade gracefully (no crash); confirm secrets are **not** silently stored/served in plaintext.
- [ ] **Key rotated/wrong:** old ciphertext fails to decrypt **gracefully** (no crash, no stack leak); `safeDecrypt` fallback path is sane; re-encryption migrations (v0.77–0.79) behave.
- [ ] Encryption key is never committed, logged, or returned in any API response.
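
"Fails to decrypt gracefully" can be made concrete with an authenticated cipher: a wrong key should produce a clean null result rather than a thrown error reaching the response. A sketch using Node's built-in `crypto` with AES-256-GCM; the blob layout and the name `safeDecryptSketch` are assumptions, and the real service's HKDF key derivation is not reproduced:

```javascript
const nodeCrypto = require('node:crypto');

// Sketch only: AES-256-GCM with a random 12-byte IV.
// Blob layout (an assumption): iv (12 bytes) | auth tag (16 bytes) | ciphertext.
function encryptSketch(plaintext, key) {
  const iv = nodeCrypto.randomBytes(12);
  const cipher = nodeCrypto.createCipheriv('aes-256-gcm', key, iv);
  const data = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), data]);
}

// Returns null instead of throwing when the key is wrong or the blob was
// tampered with, so callers can degrade without leaking a stack trace.
function safeDecryptSketch(blob, key) {
  try {
    const iv = blob.subarray(0, 12);
    const tag = blob.subarray(12, 28);
    const data = blob.subarray(28);
    const decipher = nodeCrypto.createDecipheriv('aes-256-gcm', key, iv);
    decipher.setAuthTag(tag);
    return Buffer.concat([decipher.update(data), decipher.final()]).toString('utf8');
  } catch {
    return null;
  }
}

const oldKey = nodeCrypto.randomBytes(32);
const rotatedKey = nodeCrypto.randomBytes(32);
const blob = encryptSketch('smtp-password', oldKey);
console.log(safeDecryptSketch(blob, oldKey)); // smtp-password
console.log(safeDecryptSketch(blob, rotatedKey)); // null
```

The same shape of test applies to the real service: encrypt with one `TOKEN_ENCRYPTION_KEY`, restart with another, and confirm the affected feature reports a clear error instead of crashing.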

**Container / deploy** (`Dockerfile`, `docker-compose.yml`, `docker-entrypoint.sh`, `deploy.sh`)
- [ ] Image **builds**; container **starts**; app reachable; `/api/version` responds.
- [ ] Entrypoint: creates `DATA_DIR`/`DB_DIR`/`BACKUP_DIR`, sets **`chmod 700`** (not world-readable), `chown`s to the non-root `bill` user, runs migrations when `RUN_DB_MIGRATIONS=true`.
- [ ] Data **persists** across container restart (mounted volume); DB not re-created.
- [ ] Runs as **non-root**; secrets come from env, not baked into the image.
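
The "not world-readable" check reduces to a bit mask on the directory mode. A small sketch; `isOwnerOnly` is a hypothetical helper, and in practice the mode would come from something like `fs.statSync(dir).mode` inside the running container:

```javascript
// True when no group or other permission bits are set (e.g. 0o700, 0o600).
// File-type bits above the permission bits are ignored by the mask.
function isOwnerOnly(mode) {
  return (mode & 0o077) === 0;
}

console.log(isOwnerOnly(0o700)); // true
console.log(isOwnerOnly(0o755)); // false: group and others can read
console.log(isOwnerOnly(0o40700)); // true: directory type bits do not matter
```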

**Update check / phone-home** (`services/updateCheckService.js`)
- [ ] Confirm the external request to `REPO_API_URL` (default `dream.scheller.ltd`) is **disclosed** on the privacy page and can be **opted out of**; it must send no user data and only fetch the latest release; a failed or offline check degrades silently.

**Rate-limiter completeness** (`middleware/rateLimiter.js`) — beyond B13's list
- [ ] `backupOperationLimiter` throttles admin backup/restore/cleanup; `skipRateLimitIfNoUsers` only relaxes limits on a genuinely empty instance (first-run), never afterward.

### B17 — Code health & consolidation (IMP)

An **improvement** batch: hunt for ways to make the codebase smaller, clearer, and more
consistent — *without changing behavior*. Every candidate is logged as an `IMP-CODE-*`
row in §2.1 with a concrete proposal; nothing is refactored silently. Consolidation
lands only when it's behavior-preserving **and** covered by existing or added tests.

- [ ] **Duplication / DRY:** find logic copy-pasted across services/routes/components and
  propose a shared helper. Known hot spots: money formatting/rounding (`utils/money.js`
  vs inline), the `resolveDueDate` occurrence gate (must stay one implementation),
  error-response shaping (`utils/apiError.js` vs ad-hoc), React data-fetch patterns
  (repeated `useQuery` + toast + error handling → shared hooks).
- [ ] **Dead / unused code:** unused exports, unreachable branches, orphaned files,
  commented-out blocks, unused deps (`depcheck`), unused UI components/CSS, leftover
  scaffolding. Propose deletion (verify no dynamic/`require`-by-string use first).
- [ ] **Overlapping modules:** services that do similar work and could merge or share a
  core — e.g. the matching family (`matchSuggestionService`, `transactionMatchService`,
  `merchantStoreMatchService`), the bank-sync family (`bankSyncService`,
  `bankSyncWorker`, `bankSyncConfigService`, `simplefinService`). Map responsibilities;
  propose a consolidation only where it removes real duplication, not just moves it.
- [ ] **Oversized / low-cohesion files:** split by concern where it aids navigation
  (e.g. `db/database.js` is very large — migrations vs query helpers vs settings could
  be separate modules). Propose the seams; don't split for its own sake.
- [ ] **One canonical path per concern:** cents handling, date/tz, CSRF, error shape,
  pagination — confirm there's a single blessed way and flag divergences.
- [ ] **Consistency:** naming, file layout, async patterns, import ordering; a lint rule
  that would prevent a class of the bugs found in earlier batches is itself an IMP.
- [ ] **Test/infra dedupe:** repeated test setup → shared fixtures/helpers; flag coverage
  gaps a consolidation would risk.

### B18 — UX & information architecture / menus (IMP)

An **improvement** batch focused on the person using the app: is every feature
*discoverable*, is every core flow *smooth*, and does every action live *where a user
would look for it*? Candidates are logged as `IMP-UX-*` or `IMP-IA-*` in §2.1 with a
concrete before/after proposal. Walk the app as a real user (both `user` and `admin`,
desktop and mobile, light and dark), not just as a tester.

**Information architecture & menus**
- [ ] **Discoverability:** is any feature buried, orphaned, or reachable only by typing a
  URL? Everything should be reachable from the nav, a menu, or a clear in-page entry.
- [ ] **Navigation structure:** sidebar/nav grouping is logical; related pages sit
  together; admin vs user separation is clear; active state + page titles are correct.
- [ ] **Menus where they belong:** actions that today are loose buttons or hidden should
  be grouped into sensible menus — overflow (`⋯`) menus on rows/cards, context menus,
  a consolidated **Settings** grouping, an account menu. Put related actions in one menu
  rather than scattering them. Flag anything that would be easier to find as a menu item.
- [ ] **Command palette (`Ctrl+K`) coverage:** every page/primary action is reachable;
  no dead entries.
- [ ] **Redundancy:** the same action offered in three places with different labels, or
  two pages that do nearly the same thing — propose consolidating.

**Experience quality**
- [ ] **Core-flow friction:** count the clicks for the top tasks (pay a bill, add a bill,
  connect SimpleFIN, run a sync, import data). Propose shortcuts where a step is wasted.
- [ ] **States:** every list/page has a clear **empty state with a next-step CTA**, a
  **loading** state (skeleton/spinner, no layout jump), and a **recoverable error**
  state — no dead ends, no silent failures.
- [ ] **Feedback & safety:** state changes confirm (toast); destructive actions confirm
  and, where feasible, offer undo (bills already soft-delete — surface a restore path);
  long actions show progress.
- [ ] **Consistency:** primary-action placement, button hierarchy, iconography,
  terminology, and confirmation patterns are consistent across pages.
- [ ] **Mobile parity:** every action available on desktop is reachable on mobile; touch
  targets adequate; menus/overflow work on touch.
- [ ] **Onboarding:** first-run and empty-account guidance explains the next step; advanced
  features (SimpleFIN, debt planning, backups) have a short in-context explanation.

---

## 8. Appendices

### Appendix A — Severity definitions

| Level | Definition |
|-------|------------|
| **S1 – Critical** | Data loss/corruption, security hole, crash/blank page, wrong money math, cannot log in/save. |
| **S2 – Major** | Feature broken/unusable, wrong results, broken navigation, unhandled error. |
| **S3 – Minor** | Works but wrong edge behavior, confusing UX, missing validation message. |
| **S4 – Cosmetic** | Visual/copy/alignment/dark-mode-contrast, non-blocking. |
| **IMP – Improvement** | Not a bug; enhancement or polish idea. |

### Appendix B — Exit / sign-off criteria

A cycle is release-ready when: **(Cycle 1 — all met ✅)**
- [x] All batches B0–B16 and B-UI ✅ (Chromium desktop + mobile via the E2E projects; light + dark, `user` + `admin` exercised). Cross-browser Firefox/Safari carried to Cycle 2.
- [x] B15 smoke green on the **production build** (`npm run smoke:prod`).
- [x] **Zero open S1/S2** in the Findings Log; S3/S4/IMP all fixed & archived.
- [x] `npm run ci` green (server 109 + client 34 + build); no new console errors (verified in prod-smoke).
- [x] Data export→import round-trip verified with no loss (`tests/exportImportRoundTrip.test.js`).
- [x] Auth/authorization + data-isolation all pass (probe: IDOR → 404, CSRF → 403, admin/status → 403).
- [x] Money and date/period correctness verified vs hand-calculated examples (`tests/money.test.js`, `aprService`, recurrence probe, reconciliation guards).
- [x] All 14 fixes archived to `HISTORY.md` v0.41.0; cycle summary recorded (Cycle Log §1.1).

### Appendix C — Page ↔ route ↔ API quick map

| Page | Route | Primary API |
|------|-------|-------------|
| Tracker | `/` | `/api/tracker`, `/api/bills`, `/api/payments`, `/api/monthly-starting-amounts` |
| Calendar | `/calendar` | `/api/calendar` |
| Summary | `/summary` | `/api/summary` |
| Bills | `/bills` | `/api/bills`, `/api/categories`, `/api/matches` |
| Subscriptions / Catalog | `/subscriptions`, `/subscriptions/catalog` | `/api/subscriptions` |
| Categories | `/categories` | `/api/categories` |
| Health | `/health` | `/api/analytics`, `/api/summary` |
| Analytics | `/analytics` | `/api/analytics` |
| Spending | `/spending` | `/api/spending` |
| Banking | `/bank-transactions` | `/api/transactions`, `/api/matches`, `/api/data-sources` |
| Snowball / Payoff | `/snowball`, `/payoff` | `/api/snowball` |
| Settings | `/settings` | `/api/settings`, `/api/notifications` |
| Profile | `/profile` | `/api/profile`, `/api/user` |
| Data | `/data` | `/api/import`, `/api/export`, `/api/data-sources` |
| Admin | `/admin`, `/admin/status` | `/api/admin`, `/api/status`, `/api/about-admin` |
| About / Privacy / Release Notes / Roadmap | `/about`, `/privacy`, `/release-notes`, `/roadmap` | `/api/about`, `/api/privacy`, `/api/version` |

### Appendix D — Reference docs
`SECURITY_AUDIT.md` (security depth) · `docs/UI_IMPROVEMENTS.md` (known UI issues) · `docs/cents-migration-plan.md` (money-as-cents) · `docs/SIMPLEFIN_CONSUMER_GUARDRAILS.md` (sync limits) · `docs/CSRF-SPA-Setup.md`, `docs/RATE_LIMITING_ENHANCEMENT.md` (security middleware) · `REVIEW.md`, `DEVELOPMENT_LOG.md`, `roadmap.md`, `FUTURE.md` (context/known gaps) · `HISTORY.md` (changelog / fix archive) · `playwright.config.js` + `e2e/` (automated E2E/visual/a11y harness, §4.4).

### Appendix E — Per-page control census

The completeness ledger behind "every button, textbox, slider is right." Fill one table **per page** during [B0](#b0--baseline-tooling--coverage-recon) and check every control off during that page's batch. A control not listed here is a control not tested. Build the starting list with `grep -rnoE "type=[\"'][a-z]+[\"']" client/pages/<Page>.jsx` and `grep -nE "onClick=|<Button|<Select|<Switch|<Checkbox" client/pages/<Page>.jsx` (the `-E` form avoids relying on GNU-only `\|` alternation).

**Template** (copy per page):

| Control | Type | Expected action | States checked (default/focus/disabled/error/loading) | Keyboard | Result |
|---------|------|-----------------|-------------------------------------------------------|----------|--------|
| *e.g.* Quick-pay button | button | marks bill paid, updates balance cards, undo available | default ✓ · disabled-while-saving ✓ | Enter ✓ | ✅ / finding id |
| *e.g.* Amount input | number | per-month override, cents only, no wheel-scroll change | default ✓ · error-on-letters ✓ | Tab/Esc ✓ | ✅ / finding id |

**Pages to census** (from `client/pages/`, keep in sync with [Appendix C](#appendix-c--page--route--api-quick-map)): Tracker, Calendar, Summary, Bills, Subscriptions, SubscriptionCatalog, Categories, Health, Analytics, Spending, Snowball, Payoff, BankTransactions, Data, Settings, Profile, Admin, Status, About, Privacy, ReleaseNotes, Roadmap, Login, NotFound — plus the shared **Sidebar/command-palette/header** chrome once.