# BillTracker — Master QA Plan (living document)
**Version target:** v0.41.x · **Executor:** Claude (active) · **Last updated:** 2026-07-02
**Cycle 1: COMPLETE ✅** — all 18 batches (B0→B16 + B-UI) run; **15 findings fixed**, verified & archived (3× S2); automated re-run of existing batches clean (0 new); the added **B16** (migrations/secrets/deploy) surfaced + fixed 1 (version-check opt-out). Guard suite green. External-infra items (live TOTP/WebAuthn/OIDC, SMTP delivery, cross-browser, PWA-offline, load, container build) carried to Cycle 2 as non-blocking.
This is a **living, operational** QA document, not a static spec. Claude runs it,
in **batches**, actively hunting for bugs/errors/rough edges, **fixing** them, and
**archiving** each fixed finding to `HISTORY.md`. Update this document whenever a
better approach, a new risk area, or a missed surface is discovered.
> **The prime directive:** don't just confirm the happy path — try to *break*
> the product. Every batch should end with the tree green, the Findings Log
> up to date, and any fixes archived to `HISTORY.md`.
---
## Table of contents
1. [Execution model — find, then fix, then repeat](#0-execution-model--find-then-fix-then-repeat)
2. [Batch plan & progress tracker](#1-batch-plan--progress-tracker)
3. [Active Findings Log](#2-active-findings-log)
4. [Archiving fixed findings to HISTORY.md](#3-archiving-fixed-findings-to-historymd)
5. [Environment & setup](#4-environment--setup)
6. [Test data strategy](#5-test-data-strategy)
7. [Cross-cutting checks (every page)](#6-cross-cutting-checks-every-page)
8. [Batch playbooks (detailed checklists)](#7-batch-playbooks-detailed-checklists)
9. [Appendices](#8-appendices)
---
## 0. Execution model — find, then fix, then repeat
**Separate finding from fixing.** During a QA pass we *hunt and log* — we do **not**
fix as we go (except show-stoppers, see below). Only after the whole plan has run
do we enter a dedicated **fix phase** and fix **every** logged finding. Then we run
the **entire** QA plan again from the top. Repeat until a full pass finds **zero**
errors. Two nested loops:
```
OUTER — QA CYCLE (repeat until a full pass finds zero findings)
┌──────────────────────────────────────────────────────────────────────┐
│ PHASE 1 · FIND Run every batch B0→B15 in find-only mode. │
│ Probe hard, LOG everything to the Findings Log. │
│ Do NOT fix (except show-stoppers). │
│ ↓ │
│ PHASE 2 · FIX QA pass done. Now fix EVERY logged finding — │
│ all of them (S1→IMP). Root-cause, with tests. │
│ ↓ │
│ PHASE 3 · VERIFY Re-run each fix's repro; `npm run ci` green. │
│ ↓ │
│ PHASE 4 · ARCHIVE Move every fixed finding to HISTORY.md (§3). │
│ ↓ │
│ PHASE 5 · RE-RUN Start a new cycle at PHASE 1. If that full pass │
│ logs zero findings → QA is clean, STOP. │
└──────────────────────────────────────────────────────────────────────┘
INNER — per batch during PHASE 1 (find-only)
PICK next ⬜ batch → SET UP (app, data state, role, console open) →
PROBE (actively break it, §5 adversarial inputs) → LOG every finding to §2 →
mark batch status in §1 → next batch. (No fixing here.)
```
**Show-stopper exception.** A *show-stopper* is a finding that **blocks continued
QA** — the app won't boot, you can't log in, or a page crashes so hard you can't
test the rest of it. Only these get fixed immediately (mid-pass), because you
can't proceed otherwise. Log it, fix it, verify, and note it was a mid-pass fix;
then continue the find pass. **Everything else is logged and left for Phase 2**
no matter how tempting or trivial.
**Discipline (for best results)**
- **Phase 1 is log-only.** Resist fixing. A clean, complete inventory of findings beats a scattered fix-as-you-go pass and produces better batching.
- Keep each find batch tight and focused — one batch per session — so probing stays thorough.
- **Phase 2 fixes everything**, not just S1/S2. Root-cause over surface patch; add/extend a test in `tests/` or `client/**/*.test.*` for every logic bug so it can't silently return.
- Never leave the repo red at the end of Phase 3 — `npm run ci` must be green before archiving.
- Touch product behavior? Run the `/verify` skill on the affected flow before archiving.
- **The exit is empirical:** you're done only when an entire find pass (B0→B15) turns up zero new findings — not when you *think* it's clean. Log the cycle result in the [Cycle Log](#11-qa-cycle-log) each time.
- Improve THIS plan whenever a pass reveals a missed surface, a better repro, or a batch that should be reordered/split.
**Improvement lens (not just bug-hunting).** QA here is also about making the product
*better*, not only *correct*. On every batch, in addition to logging bugs, actively
look through three improvement lenses and log what you find as **IMP** items in the
[Improvement Backlog (§2.1)](#21-improvement-backlog):
- **Code health & consolidation** — duplication to DRY up, dead code to delete,
overlapping modules to merge, oversized files to split, one canonical path per
concern. *Consolidate only where it genuinely reduces surface area and is
behavior-preserving.* (Dedicated pass: **B17**.)
- **User experience** — friction in core flows, unclear states (empty/loading/error),
weak feedback/affordances, inconsistent patterns, mobile parity. (Dedicated pass: **B18**.)
- **Information architecture / menus** — features that are buried or only reachable by
URL, actions that belong in a menu (nav, overflow, context, settings groupings), and
groupings that would make the app more discoverable. *Put things where a user would
look for them.* (Dedicated pass: **B18**.)
IMP items are **proposals**, not silent changes: log the candidate with a concrete
recommendation, agree the direction, then implement behind a test. They **don't block
sign-off** — but a strong QA cycle leaves the code cleaner and the UX clearer, not just
green.
---
## 1. Batch plan & progress tracker
Batches are ordered **foundation-first** (baseline & auth before features; features
before cross-cutting; regression last). Update **Status** and **Findings** every run.
**Status key:** ⬜ Not started · 🔄 In progress · ✅ Done (green, findings archived) · 🔁 Needs recheck
| # | Batch | Primary surface | Data state | Status | Open / Fixed |
|---|-------|-----------------|-----------|--------|--------------|
| B0 | Baseline, tooling & **coverage recon** | `npm run ci`/`check`, app boots, console clean, **re-scan routes/pages/API vs plan & update it**, **control census** | any | ✅ | 0 / 1 |
| B-UI | **Design-system primitives** | each `client/components/ui/*` × state matrix (default/hover/focus/active/disabled/loading/error/read-only) × light/dark × keyboard | any | ✅ | 0 / 0 |
| B1 | Auth & authorization | login (pw/OIDC/TOTP/WebAuthn), roles, single-user, CSRF, data isolation | multi + single user | ✅ | 0 / 0 |
| B2 | Tracker (core) | `/` buckets, pay/skip/notes/overrides, balance cards, overdue, ledger, drift | seeded + adversarial | ✅ | 0 / 0 |
| B3 | Bills & schedules | `/bills` CRUD, custom schedules, reorder, merchant rules, historical import | adversarial | ✅ | 0 / 0 |
| B4 | Subscriptions & Categories | `/subscriptions`, catalog, `/categories`, groups, reorder | seeded | ✅ | 0 / 0 |
| B5 | Reporting reconciliation | `/summary`, `/calendar`, `/analytics`, `/health` cross-check totals | seeded + large + **live SimpleFIN DB** | ✅ | 0 / 4 |
| B6 | Spending | `/spending` YNAB view, averages, cover-overspending, safe-to-spend | seeded + edge months | ✅ | 0 / 1 |
| B7 | Debt planning (math) | `/snowball`, `/payoff` APR/amortization vs hand-calc | edge (APR=0, $0 debt) | ✅ | 0 / 2 |
| B8 | Banking & bank sync | `/bank-transactions`, SimpleFIN sync, matching, merchant/store, advisory filter | seeded txns + **live SimpleFIN sync** | ✅ | 0 / 0 |
| B9 | Data lifecycle | `/data` import (XLSX/CSV/SQLite), export, ICS feed, backups round-trip | empty + seeded | ✅ | 0 / 1 |
| B10 | Notifications & workers | email + ntfy/Gotify/Discord/Telegram, reminders, cron workers | seeded | ✅ | 0 / 1 |
| B11 | Admin panel | users, login mode, auth methods, backups, cleanup, status, onboarding | admin | ✅ | 0 / 0 |
| B12 | Settings, Profile & global UI | `/settings`, `/profile`, static pages, command palette, sidebar/nav | any | ✅ | 0 / 0 |
| B13 | API / backend direct | all `/api/*`: auth, CSRF, validation, rate limits, error shape, IDOR, cents | via HTTP client | ✅ | 0 / 1 |
| B14 | Non-functional | a11y, performance, PWA/offline, XSS/secrets, timezone/DST | large + adversarial | ✅ | 0 / 4 |
| B15 | Regression & sign-off | full smoke on **production build**, exit criteria | seeded | ✅ | 0 / 0 |
| B16 | Migrations, secrets & deploy | migration idempotency/rollback/fresh==migrated, encryption-key lifecycle, `docker-entrypoint` (perms/first-run/migrate), update-check phone-home | scratch + docker | ✅ | 0 / 1 |
| B17 | **Code health & consolidation** (IMP) | duplication/DRY, dead code, overlapping modules to merge, oversized files to split, one canonical path per concern | whole repo | ⬜ | 0 / 0 |
| B18 | **UX & information architecture** (IMP) | core-flow friction, empty/loading/error states, feedback/affordances, nav/menu discoverability, surfacing actions into sensible menus | any | ⬜ | 0 / 0 |
> After B15, if any batch is 🔁 or has open S1/S2, loop back. Then start a new
> cycle from B0 against the next build/version.
>
> **B17/B18 are improvement (IMP) batches** — they run alongside the correctness
> batches but their findings are enhancements, not defects, and don't gate sign-off.
**✅ means "run complete for this cycle's automatable scope, green, findings archived."**
Cycle 1 built a durable, automated guard for every batch (`npm run ci` · `test:e2e` ·
`test:e2e:probe` · `smoke:prod`). The following need **external infrastructure or a
human** and were **not** exercised — they are **non-blocking** for Cycle 1 sign-off and
carried to Cycle 2:
- **B1** — live TOTP enrollment, WebAuthn/passkeys (browser/OS prompts), OIDC SSO round-trip. (Password login, roles, CSRF, data-isolation, admin authz **are** covered.)
- **B10** — real SMTP *delivery* (push delivery + email-HTML building/escaping **are** covered by `tests/notificationDelivery.test.js`).
- **B11** — backup create/restore on a scratch instance (authorization + last-admin guards **are** covered).
- **B14** — Firefox/Safari cross-browser, PWA install/offline, and large/stress load+perf. (axe a11y on 8 pages, XSS/escaping, and prod-bundle perf **are** covered.)
- **B9** — spreadsheet/CSV import *from real files* end-to-end (money-unit handling + SQLite export→import round-trip **are** covered by tests).
### 1.1 QA Cycle Log
One row per full QA cycle (Phase 1 find → Phase 2 fix → … → Phase 5 re-run). A
cycle is only "clean" when its **find pass logged zero findings**. Keep going
until you get a clean cycle.
| Cycle | Started | Build / commit | Findings logged | Fixed / archived | Result |
|-------|---------|----------------|-----------------|------------------|--------|
| 1 | 2026-07-02 | `bdbf231`→`5ffe2db` (dev) | 14 | **14 → all fixed, verified & archived** (3× S2 incl. broken "Send test push", email XSS, reconciliation family, seed 100× cents) | 🔁 Phase 2 complete — 0 open. Every batch B0→B15 (+B-UI) run; 16 QA commits; guard suite green. |
| 1·re-run | 2026-07-02 | `5ffe2db` (dev) | **0 new** | — | ✅ **Automated re-run clean.** CI (server 109 + client 34, build), UI E2E 27, probe 16 (authz 403, Tracker↔Summary↔Analytics reconcile exactly, seed guard, a11y 8/8), prod-smoke PASS. **All 17 batches ✅ for automatable scope; external-infra residuals listed below are non-blocking and carried to Cycle 2.** |
| 1·simplefin-live | 2026-07-03 | `5ffe2db` (dev) vs prod DB | **1** (QA-B5-04) | **1 → fixed, verified & archived** | 🔁 Probed a **copy of the live SimpleFIN DB** (19 MB, v1.06: 3 users, 44 bills, 1,159 txns, 19 accounts, active SimpleFIN source). Integrity checks: dedup (1159/1159 distinct), money=integer cents, no double-match, pending have provider ids, no orphan-account txns — all pass **except** 3 matched txns with NULL bill → QA-B5-04 (retention GC + `ON DELETE SET NULL`). Fixed in `cleanupService`; healing verified on a DB copy (3→0, 0 txns lost). **Also ran a real end-to-end sync** (`syncDataSource`, the Sync-button path) against the live connection off a working copy: token decrypted via db-key fallback (no env key), bridge fetch OK (2.2s), 18 accounts upserted, 145 fetched txns **skipped not duplicated**, 0 new, 1159→1159 distinct — **dedup/upsert idempotency proven on the real connection.** |
**Result key:** 🔄 in progress · 🔁 findings fixed, re-run required · ✅ clean (zero findings — QA complete)
---
## 2. Active Findings Log
**This is the live log.** Record every finding here the moment it's found — before
fixing. Keep only **Open / Fixing / Fixed** rows here. Once a finding is
**Fixed + verified + archived to `HISTORY.md`**, delete its row from this table
(its permanent record is the changelog entry).
**Finding ID:** `QA-B{batch}-{nn}` (e.g. `QA-B2-01`).
**Severity:** S1 Critical · S2 Major · S3 Minor · S4 Cosmetic · IMP Improvement (see [Appendix A](#appendix-a--severity-definitions)).
**Status:** 🔴 Open → 🟡 Fixing → 🟢 Fixed (verified, awaiting archive) → then remove on 📦 Archive.
| ID | Sev | Area (`file:line`) | Summary | Status | Notes / repro |
|----|-----|--------------------|---------|--------|---------------|
| _(none — all Cycle 1 findings fixed, verified & archived to `HISTORY.md` v0.41.0)_ | | | | | |
**Finding template** (paste a new row above; keep the full write-up here until archived):
```
ID: QA-B?-??
Severity: S1 / S2 / S3 / S4 / IMP
Environment: browser / viewport / theme / role / auth mode / data state
Area: file:line (if known)
Steps to reproduce:
1.
2.
Expected:
Actual:
Evidence: console / network / DB row / screenshot
Fix: (what changed, commit) — Verified by: (repro re-run + ci)
```
Log console errors, failed network requests, and unhandled rejections as findings
**even if the UI looks fine**.
_All Cycle 1 write-ups have been archived to `HISTORY.md` v0.41.0 (see §3)._
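
The ID convention is simple enough to check mechanically, for example when linting this log or a `HISTORY.md` entry. A minimal sketch in JavaScript, not part of the repo: `parseFindingId` is a hypothetical helper, and it assumes a numeric batch and a two-digit sequence, as in the `QA-B2-01` example. How B-UI findings are numbered is not specified here, so they are not handled.

```javascript
// Hypothetical helper (not from the codebase): parse QA-B{batch}-{nn}.
// Assumptions: {batch} is one or two digits, {nn} is exactly two digits.
const FINDING_ID = /^QA-B(\d{1,2})-(\d{2})$/;

function parseFindingId(id) {
  const match = FINDING_ID.exec(id);
  if (!match) return null; // not a well-formed finding ID
  return { batch: `B${match[1]}`, seq: Number(match[2]) };
}

console.log(parseFindingId("QA-B2-01")); // { batch: 'B2', seq: 1 }
console.log(parseFindingId("qa-b2-1")); // null
```
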
### 2.1 Improvement backlog
**IMP-stream, separate from the bug log above.** Enhancement candidates found through
the three improvement lenses (code/consolidation, UX, IA/menus — see §0 and batches
B17/B18). These are **proposals**: log the candidate + a concrete recommendation, then
discuss before implementing. They don't gate sign-off. When one is implemented, archive
it to `HISTORY.md` (`### 🧹 QA` / `### ✨` as fits) and remove the row; deferred ideas
graduate to `roadmap.md`/`FUTURE.md`.
**ID:** `IMP-{stream}-{nn}` where stream = `CODE` (health/consolidation), `UX`, or `IA` (menus/nav).
**Effort:** S (local, <1h) · M (a file or two) · L (cross-cutting / needs design).
**Status:** 🔵 Noted (proposal) → 🟡 Doing → then archive to `HISTORY.md` once implemented.
| ID | Lens | Area (`file`/page) | Proposal (what & why) | Effort | Status |
|----|------|--------------------|-----------------------|--------|--------|
| IMP-CODE-01 | Code | `client/lib/money.js` (+16 files) | ~~No shared client money formatter.~~ **Shipped `a15f00c`:** added `client/lib/money.js` (`formatUSD`/`formatUSDWhole`/`formatCentsUSD`); `lib/utils.fmt` delegates to it and 15 local formatters were removed. `null`/`NaN`/`-0` all handled. Test `client/lib/money.test.js`; full client suite + build green. | M | Shipped |
| IMP-CODE-02 | Code | `db/database.js` (4,174 ln) | **Oversized module.** One file mixes the migration engine, query helpers, settings, and connection lifecycle. Split into cohesive modules (behavior-preserving, test-guarded) for navigability and lower merge-conflict risk. | L | 🔵 Noted |
| IMP-CODE-03 | Code | `services/*match*`, `bankSync*` | **Overlapping match logic.** `transactionMatchService` / `matchSuggestionService` / `merchantStoreMatchService` each write `match_status` + bill/payment links; QA-B5-04 showed how easily these drift out of sync. Extract one canonical "set/clear match" helper so state transitions live in one place. | M | 🔵 Noted |
| IMP-IA-01 | IA | Sidebar · `/data` | ~~Central features under an overflow menu.~~ **Shipped `0b1c6a8`:** Data moved into the main app nav (desktop dropdown + mobile) alongside Bills/Categories/Spending; same default-admin gate preserved; removed the redundant account-dropdown entry. | S | Shipped |
| IMP-UX-01 | UX | Bills / Categories delete | **Retention isn't surfaced.** Soft-delete keeps records 30 days, but the only restore affordance is a transient "Undo" toast: dismiss it, and a bill deleted an hour ago is unrecoverable via the UI. Add a lightweight "Recently deleted / restore" view to actually leverage the retention window. | M | 🔵 Noted |
| IMP-UX-02 | UX | all list pages | **State audit.** Systematically verify every list/page has an empty-state with a CTA, a skeleton/loading state, and a recoverable error state (pair with B14 fault-injection): no dead ends, no silent failures. | M | 🔵 Noted |
---
## 3. Archiving fixed findings to HISTORY.md
`HISTORY.md` is the project changelog (version-organized, emoji section headers).
When a finding is Fixed **and verified**, write a concise entry there, then remove
the row from the Active Findings Log.
**Where:** under the current in-progress version heading (e.g. `## v0.41.x`). If a
QA cycle produces several fixes, group them under a `### 🐛 QA Fixes` (bug fixes)
or `### 🧹 QA` (polish/improvements) section, matching the existing changelog voice.
**Entry format** (match the terse, specific style already in `HISTORY.md`):
```markdown
### 🐛 QA Fixes
- **[Area] Short title** — What was wrong and the user-visible impact, then the
fix. Reference the file/function and any migration or test added.
(was QA-B7-03)
```
**Rules**
- One bullet per finding; include the old `QA-B?-??` id in parentheses for traceability.
- If a fix added/changed a test, say which (`tests/…` or `client/…test.*`).
- Don't archive until the fix is verified (repro gone + `npm run ci` green).
- IMP items that were implemented are archived the same way; IMP items merely *noted* stay in the [Improvement Backlog (§2.1)](#21-improvement-backlog) (or graduate to `FUTURE.md`/`roadmap.md` if deferred).
---
## 4. Environment & setup
### 4.1 Running the app
| Mode | Command | URL |
|------|---------|-----|
| Dev (API + UI, hot reload) | `npm run dev` | UI `http://localhost:5173` (proxies API `:3000`) |
| API only | `npm run dev:api` | `http://localhost:3000` |
| Production build | `npm run build` then `npm start` | `http://localhost:3000` |
| Docker | `docker-compose up` | per compose config |
- Backend: Node/Express on `PORT` (default `3000`). Frontend dev: Vite on `5173`.
- Data: SQLite at `db/bills.db` (WAL). **Back it up before destructive tests** (`backups/` or a manual copy). Prefer a scratch DB for B9/B11 restore tests.
- Configure a dedicated **test** `.env` from `.env.example`. Never point tests at production data or a live SimpleFIN account with real credentials.
- Test commands: `npm run ci` (check + all tests + build), `npm run check` (syntax + build), `npm run test` (server), `npm run test:client` (vitest).
### 4.2 Test matrix
Full functional pass across reasonable combinations; smoke (B15) across all.
| Dimension | Values |
|-----------|--------|
| Browser | Chrome/Chromium, Firefox, Safari (WebAuthn differs per browser) |
| Viewport | Desktop 1280, tablet ~768, mobile ~375 (iPhone SE), ~414 |
| Theme | Light, Dark, system-follow |
| Role | `user`, `admin`, default admin (first-run) |
| Auth mode | Multi-user, single-user |
| Density | Normal + compact desktop |
| Network | Online, Slow 3G, offline (PWA shell) |
| Data state | Empty, seeded demo, large/stress, adversarial |
### 4.3 Accounts to prepare
- `admin`, `user`, a **second** `user` (data-isolation), a single-user-mode instance (separate DB).
- Demo reference: `guest / guest123` (do not run destructive flows on any shared demo server).
### 4.4 Automated E2E harness (Playwright)
Manual passes prove a button works **once**; they don't stop it regressing next cycle. The Playwright suite is the regression net: it drives real clicks in a real browser, and it's where visual-regression, axe-a11y, and fault-injection (B14) are wired so they re-run every cycle for free.
| Command | What it does |
|---------|--------------|
| `npm run test:e2e` | run the E2E suite headless (boots the app via `webServer`) |
| `npm run test:e2e:ui` | Playwright UI mode: watch/debug interactively |
| `npm run test:e2e:update` | re-baseline visual-regression screenshots (review the diff before committing) |
| `npm run smoke:prod` | **B15 production-build smoke:** builds, boots `node server.js` (dist/), drives the real artifact so the split vendor chunks are validated at runtime |
- **Setup (one-time):** `npm install` then `npx playwright install chromium`. Config: `playwright.config.js`; specs in `e2e/`.
- **Scope:** the suite is a **thin critical-path smoke**, not a replacement for the manual playbooks. It locks the happy paths (login → pay bill → skip → note → reconcile), the primitive state matrix, per-page axe scans, and page screenshots. Grow it whenever a manual pass finds a UI regression that a click-test could have caught.
- **Don't** point it at production data or a live SimpleFIN account; it runs against a scratch DB with seeded demo data.
---
## 5. Test data strategy
- **Empty:** brand-new account. Every page must render a sensible empty state: no crash, no `NaN`, no blank white screen.
- **Seeded:** use **Data → Seed Demo Data** for a realistic mid-size dataset.
- **Large/stress:** 500+ bills, 5,000+ transactions, 24+ months of history; exercises virtualization (`@tanstack/react-virtual`), charts, query perf.
- **Adversarial (deliberately try to break it):**
- Amounts: `0`, `0.01`, negative, `9,999,999.99`, fractional cents.
- Text: emoji, RTL, `<script>` XSS probe, 1,000-char strings, leading/trailing spaces, SQL-ish input.
- Dates: 1st/14th/15th/31st boundaries; 28/29/30/31-day months; Feb 29; month/year crossing; inactive ranges; skipped months; overrides.
- Transactions: duplicate amount+date, same-day merchant repeats, refunds/negatives.
- Debt: APR `0%`, very high APR, `$0` balance, absurd inputs.
- Non-UTC system timezone + a DST boundary date.
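
The amount cases above lend themselves to a table-driven check. The sketch below is an illustration, not the app's parser: `toCents` is a hypothetical helper that assumes en-US input (commas are thousands separators, the dot is the decimal point) and works on strings, so no float rounding is involved. Whether a negative amount is acceptable depends on the field, so the sketch only parses it.

```javascript
// Hypothetical helper (not from the codebase): parse a user-entered amount
// into integer cents, or return null when the input should be rejected.
// Assumption: en-US formatting; commas are stripped as thousands separators.
function toCents(input) {
  const text = String(input).trim().replace(/^\$/, "").replace(/,/g, "");
  // Optional sign, digits, optional 1-2 decimal places. No exponent, no "+".
  const match = /^(-?)(\d+)(?:\.(\d{1,2}))?$/.exec(text);
  if (!match) return null;
  const frac = (match[3] || "").padEnd(2, "0"); // "5" -> "50"
  const cents = Number(match[2]) * 100 + Number(frac);
  // Avoid returning -0 for inputs like "-0".
  return match[1] === "-" && cents !== 0 ? -cents : cents;
}

// Adversarial inputs from the list above, paired with the expected result.
const amountCases = [
  ["0", 0],
  ["0.01", 1],
  ["-5", -500],
  ["9,999,999.99", 999999999],
  ["$1,234.56", 123456],
  ["1.005", null], // fractional cents
  ["1e3", null], // exponent notation
  ["  12.5 ", 1250], // leading/trailing spaces
  ["", null],
];

for (const [input, expected] of amountCases) {
  const actual = toCents(input);
  console.log(JSON.stringify(input), "->", actual, actual === expected ? "ok" : "MISMATCH");
}
```

The same table can be pasted into the UI by hand; the point is to fix the expected result for each input before probing, so a surprising value is recognized as a finding.
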
---
## 6. Cross-cutting checks (every page)
Run on **every** page during its batch; don't assume a shared component behaves the same everywhere.
**Navigation & routing** — reachable from nav and by direct URL (deep link) + after hard refresh · back/forward restores state, no stuck spinners · unknown sub-paths → `NotFoundPage` · active nav highlighted · `simplefinOnly` (Banking) gated · `Ctrl+K` palette finds & opens it.
**Buttons & interactions** — every button/link/icon/dropdown/tab/toggle/menu does something or is disabled with a reason · no dead controls · double-click doesn't duplicate records · **rapid repeated toggling** (spam a switch / pay-skip) resolves to one correct state, no stuck spinner · action started then **navigate away mid-flight** doesn't corrupt or throw · destructive actions confirm + cancel · primary action keyboard-reachable (Tab/Enter/Esc).
**Forms & validation** — required fields enforced · numeric/currency reject letters, handle 0/negative/decimal · errors don't wipe entered data · **paste** into every field (incl. `"$1,234.56"` into currency) · **browser/password-manager autofill** on login & forms · **IME/composition** (emoji, CJK) in text fields commits correctly · success shows toast (sonner) and the view updates without manual refresh (React Query invalidation).
**Number inputs (you have ~45 `type="number"` fields — the highest-risk control type)** — scroll-wheel over a focused field must **not** silently change the value · spinner up/down buttons step correctly and respect min/max · reject `e`/`+`/exponent and multiple decimals · locale decimal comma vs dot · leading zeros · empty field → no `NaN` submitted · cents fields never accept >2 decimals.
**Per-control state matrix** — for each control on the page, verify every applicable state renders and behaves in **both light and dark**: default · hover · keyboard-focus (visible ring) · active/pressed · disabled (and truly non-interactive) · loading/in-flight · error/invalid · read-only · filled-to-overflow (1,000-char string / max-digit number wraps or truncates, no layout break).
> **Note — "sliders":** this app has **no `<input type=range>` sliders.** The `SlidersHorizontal` glyph is just the Bills **filter-panel** button; the closest real thing to a slider is a number stepper. Test those two surfaces where a slider would otherwise be expected.
**States** — loading skeleton/spinner, no layout jump · helpful empty state · error state (4xx/5xx/offline) recovers, `ErrorBoundary` shows a fallback not a white page.
**Visual & responsive** — correct at desktop/tablet/mobile, no overflow/h-scroll · dark mode contrast, no white flash · compact mode readable · long strings/big numbers wrap/truncate.
**Data integrity** — money 2-decimals, no float artifacts (`9.999999`) · dates in expected tz, period boundaries correct · values agree across pages (a bill total on Tracker == Summary == Analytics).
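
The float-artifact check is easiest to see with a concrete case. A minimal sketch, assuming money is held as integer cents (the Cycle Log's integrity checks mention money as integer cents); `formatCents` is a hypothetical helper, not the app's formatter:

```javascript
// Hypothetical helper: render integer cents as a 2-decimal string
// without going through floating-point division for the fraction.
function formatCents(cents) {
  if (!Number.isInteger(cents)) throw new TypeError("cents must be an integer");
  const sign = cents < 0 ? "-" : "";
  const abs = Math.abs(cents);
  const whole = Math.trunc(abs / 100);
  const frac = String(abs % 100).padStart(2, "0");
  return `${sign}${whole}.${frac}`;
}

const floatTotal = 0.1 + 0.2; // dollars as floats: 0.30000000000000004
const centsTotal = 10 + 20; // the same amounts as integer cents: 30

console.log(floatTotal === 0.3); // false
console.log(formatCents(centsTotal)); // "0.30"
```

When totals disagree across pages by a cent, summing floats on one page and cents on another is a likely first suspect.
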
---
## 7. Batch playbooks (detailed checklists)
Each batch below is the detailed script for the matching row in [§1](#1-batch-plan--progress-tracker). Apply [§6](#6-cross-cutting-checks-every-page) throughout.
### B0 — Baseline, tooling & coverage recon
**Run FIRST in every cycle.** This is where the plan re-syncs with reality — new
pages, routes, endpoints, or features added since the last cycle get discovered
and folded in **before** testing, so coverage never silently rots.
**Tooling baseline**
- [ ] `npm run ci` — record any failing server/client test or build error as a finding (S1/S2).
- [ ] `npm run check` — server syntax + build clean.
- [ ] App boots via `npm run dev` **and** production `npm start`; note startup warnings.
- [ ] Load the app; browser console + server logs clean on first load and first navigation.
- [ ] Confirm which auth mode / seed state the DB is in; snapshot a backup before proceeding.
**Coverage recon — enumerate the *actual* product and diff it against this plan.**
Run these, then compare the output to the batch playbooks (§7) and the [route map](#appendix-c--page--route--api-quick-map):
- [ ] **Client routes** — `grep -nE "<Route" client/App.jsx` — every path present here must appear in a batch playbook and Appendix C.
- [ ] **Pages** — `ls client/pages/` — every page has an owning batch.
- [ ] **Sidebar / nav entries** — `grep -nE "to:|label:|Only" client/components/layout/Sidebar.jsx` — new nav links (incl. conditional ones like `simplefinOnly`) are covered.
- [ ] **API route mounts** — `grep -nE "app.use\('/api" server.js` — every mounted route group is in B13's list and mapped in Appendix C.
- [ ] **Services & components** — `ls services/` and `ls client/components/**/` — new service/component families have a home in a playbook.
- [ ] **UI primitives** — `ls client/components/ui/` — every shared primitive is covered by the [B-UI](#b-ui--design-system-primitives) playbook; a new primitive gets a row there.
- [ ] **Middleware & workers** — `ls middleware/ workers/` (+ `services/*Worker*`, `*Scheduler*`) — each is covered (csrf/rateLimiter/securityHeaders/requireAuth → B13; dailyWorker/bankSyncWorker/backupScheduler → B10).
- [ ] **Migrations & deploy** — new `db/database.js` migrations, `Dockerfile`/`docker-entrypoint.sh` changes, and `encryptionService`/`updateCheckService` behavior are covered by [B16](#b16--migrations-secrets--deployment).
- [ ] **Interactive-control census (makes "every button tested" *provable*)** — for each page, enumerate every button, link, toggle/switch, checkbox, select, text/number/date/file input, tab, menu, and filter control, and record it in a per-page control checklist (template: [Appendix E](#appendix-e--per-page-control-census)). A control that isn't on a checklist hasn't been tested — the census is the completeness guarantee the batch playbooks alone don't give you. Quick starting inventory: `grep -rnoE "type=[\"'][a-z]+[\"']" client/pages client/components` and `grep -rn "onClick=" client/pages/<Page>.jsx`.
- [ ] **Feature flags / conditional surfaces** — search for `Only`, `enabled`, `featureFlag`, env gates that hide/show pages; ensure each state is tested.
- [ ] **What changed since last cycle** — skim `git log`/`HISTORY.md` since the previous cycle's commit (see [Cycle Log](#11-qa-cycle-log)) for new features/pages.
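
The client-route diff can be scripted rather than eyeballed. A rough sketch, assuming routes are declared as `<Route path="...">` elements (as the `grep` above implies); `extractRoutePaths` and `uncoveredRoutes` are hypothetical helpers, and the regex is a heuristic, not a JSX parser:

```javascript
// Hypothetical helpers (not from the codebase).
// Collect the path="..." value that follows each <Route in JSX source text.
// Heuristic only: index routes without a path are skipped, and dynamically
// built route tables are not seen at all.
function extractRoutePaths(source) {
  const routeTag = /<Route\b[\s\S]*?\bpath=["']([^"']+)["']/g;
  return [...source.matchAll(routeTag)].map((match) => match[1]);
}

// Return the routes found in the source that the plan does not list.
function uncoveredRoutes(source, plannedPaths) {
  const planned = new Set(plannedPaths);
  return extractRoutePaths(source).filter((path) => !planned.has(path));
}

// Example with an inline snippet standing in for client/App.jsx.
const sampleSource = `
  <Routes>
    <Route path="/" element={<TrackerPage />} />
    <Route element={<BillsPage />} path="/bills" />
    <Route path="/spending" element={<SpendingPage />} />
  </Routes>`;

console.log(uncoveredRoutes(sampleSource, ["/", "/bills"])); // [ '/spending' ]
```

In a real run the source would come from `fs.readFileSync("client/App.jsx", "utf8")` and the planned list from Appendix C; any path printed is a surface the plan does not yet own.
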
**Update the plan (do this now, not later)** — for anything the recon surfaced that isn't already covered:
- [ ] Add it to the relevant batch playbook (or create a new batch and a row in the [§1 table](#1-batch-plan--progress-tracker)).
- [ ] Add/adjust its entry in [Appendix C](#appendix-c--page--route--api-quick-map).
- [ ] Note the plan update in the [Cycle Log](#11-qa-cycle-log) row for this cycle.
- [ ] If a whole surface is *missing* from the product that the plan expected (page removed/renamed), reconcile the plan too — don't test ghosts.
### B-UI — Design-system primitives
**Test each shared control once, thoroughly, in isolation — a bug here breaks every page at once.** Drive them wherever they're already mounted (or a scratch page); run each against the [per-control state matrix](#6-cross-cutting-checks-every-page) × light/dark × keyboard-only. One finding row per primitive.
| Primitive (`client/components/ui/`) | Must verify |
|---|---|
| `button.jsx` | every variant (default/destructive/outline/ghost/link) + size; **disabled truly blocks click**; loading state; focus ring; Enter/Space activate |
| `input.jsx` | text/number/password/date/search/file types; placeholder; disabled/read-only; error styling; paste/autofill; number-input rules above |
| `select.jsx` (Radix) | opens by mouse **and** keyboard; type-ahead; long lists scroll; onChange fires in **Firefox+Safari**; disabled options; value persists; Esc closes |
| `checkbox.jsx` / `switch.jsx` | toggles by click **and** Space; indeterminate (if used); disabled; label click toggles; controlled value round-trips |
| `dialog.jsx` / `alert-dialog.jsx` / `confirm-dialog.jsx` / `input-dialog.jsx` | open/close; **focus trap + restore**; Esc closes; overlay click behaves; **Cancel actually cancels (no side effect)**; Confirm fires once; scroll-lock releases |
| `dropdown-menu.jsx` | keyboard arrow nav; Esc; submenu; disabled items; click-outside closes; no clipping at viewport edge |
| `tabs.jsx` | arrow-key nav; active state; content swaps; deep-link/refresh keeps tab (if applicable) |
| `tooltip.jsx` | hover **and** keyboard-focus show it; dismiss on blur; touch behavior; not a11y-only info trap |
| `table.jsx` | header/zebra/hover; horizontal scroll on narrow viewport (no page h-scroll); empty state |
| `collapsible.jsx` | expand/collapse animation; state persists; keyboard operable |
| `sonner.jsx` (toast) | success/error/loading; **stack + dismiss**; auto-dismiss timing; doesn't cover primary actions; announced to SR |
| `save-status.jsx` | idle/saving/saved/error transitions reflect real autosave (`useAutoSave.test.jsx`) |
| `Skeleton.jsx` | matches final layout (no jump); no infinite skeleton on error |
| `badge.jsx` / `card.jsx` / `separator.jsx` / `label.jsx` | contrast in dark mode; label `htmlFor` focuses its control; no overflow on long text |
| `theme-toggle.jsx` | light↔dark↔system; applied **before first paint** (no flash); persists across reload |
- [ ] Every primitive above passes its row in light **and** dark, keyboard-only, at mobile width.
- [ ] Axe scan (see B14) on a page densely using primitives → zero critical violations.
### B1 — Auth & authorization
- [ ] **Password:** valid login → correct landing (Tracker for `user`, `/admin` for default admin); wrong password → clear error, no user-enumeration timing/message difference; logout clears session; expired session redirects and preserves `state.from`; session persists across refresh.
- [ ] **Rate limiting:** repeated failed logins throttled (`loginLimiter`/`loginUsernameLimiter`), clear message, resets.
- [ ] **TOTP:** enroll (QR + secret), code accepted, backup codes work once, login prompts for TOTP, wrong code rejected+throttled, disable requires re-auth.
- [ ] **WebAuthn:** register/login/remove passkey in Chrome, Firefox, Safari; password fallback works.
- [ ] **OIDC/Authentik:** SSO flow creates/links account; admin config errors surface cleanly; `oidcLimiter` throttles.
- [ ] **Roles/guards:** `user` blocked from `/admin*`, `/status` (redirect) and admin APIs (403); default admin forced to `/admin`; single-user bypass correct but admin surfaces still protected; unauth API → 401.
- [ ] **Data isolation (critical):** user A cannot read/modify user B's bills, payments, transactions, categories, snowball plans — test by ID enumeration on the API.
- [ ] **CSRF:** state-changing request without a valid token → rejected.
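
The data-isolation check above can be driven from a generated probe list rather than ad-hoc clicks. The sketch below is one way to do that. It assumes REST-style `/api/<resource>/<id>` paths, which is an assumption and may not match the real routes, and the helper names `buildProbes` and `findLeaks` are hypothetical. It only builds the request list and classifies observed statuses; sending the requests while logged in as user A is left to whatever HTTP client the batch uses.

```javascript
// Resource names come from the checklist; the path shape is an assumption.
const RESOURCES = ['bills', 'payments', 'transactions', 'categories', 'snowball'];
const METHODS = ['GET', 'PUT', 'DELETE'];

// Build one probe per resource x foreign id x method.
// `foreignIds` are ids known to belong to user B.
function buildProbes(foreignIds) {
  return RESOURCES.flatMap((resource) =>
    foreignIds.flatMap((id) =>
      METHODS.map((method) => ({ method, path: `/api/${resource}/${id}` }))
    )
  );
}

// A cross-user request should be refused (403) or look nonexistent (404).
// Anything else, including 200 and 500, is worth a finding.
function findLeaks(results) {
  return results.filter((r) => r.status !== 403 && r.status !== 404);
}

// Example: statuses recorded after replaying the probes as user A.
const observed = [
  { method: 'GET', path: '/api/bills/101', status: 404 },
  { method: 'PUT', path: '/api/bills/101', status: 403 },
  { method: 'GET', path: '/api/payments/7', status: 200 },
];
console.log(buildProbes([101, 102]).length); // 30
console.log(findLeaks(observed)); // only the /api/payments/7 entry
```

Treating a 500 as a leak candidate is deliberate: an unhandled error on a foreign id often means the ownership check ran too late.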
### B2 — Tracker (`/`)
- [ ] Month nav (prev/next/jump), current month highlighted, data reloads per month.
- [ ] Bills land in correct `1–14` / `15–31` bucket by due date; pin-due sorting works.
- [ ] Quick pay marks paid + updates balance cards/progress; undo works; no double-count.
- [ ] Skip excludes from totals for that month only; unskip restores.
- [ ] Per-month amount override persists, doesn't affect base bill or other months.
- [ ] Notes cell add/edit/clear persists per month.
- [ ] Inactive/date-range bill doesn't show or count outside its range.
- [ ] Balance/starting-amount cards period-aware + editable; income bills / safe-to-spend correct.
- [ ] Overdue command center: accurate list/count, pay/skip actions work.
- [ ] Cash flow card, drift insight, payment ledger (add/edit/delete reconciles), autopay suggestion apply/dismiss.
- [ ] Editable cells autosave; Esc cancels; invalid input handled. Mobile rows offer the same actions as desktop. Compact mode intact.
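
For the due-date bucket check, the boundary days are the ones worth asserting. A minimal sketch, assuming the two buckets split the month between the 14th and the 15th and that due dates arrive as `YYYY-MM-DD` strings; the function name `dueBucket` and the bucket labels are hypothetical, not the app's.

```javascript
// Classify a due date into the first-half or second-half bucket.
// Assumption: the split is day <= 14 vs day >= 15.
function dueBucket(isoDate) {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(isoDate);
  if (!match) throw new RangeError(`Expected YYYY-MM-DD, got: ${isoDate}`);
  const day = Number(match[3]);
  if (day < 1 || day > 31) throw new RangeError(`Day out of range: ${isoDate}`);
  return day <= 14 ? '1-14' : '15-31';
}

console.log(dueBucket('2026-07-14')); // '1-14'
console.log(dueBucket('2026-07-15')); // '15-31'
```

The day is read from the string instead of a `Date` object so the result cannot shift with the tester's time zone.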
### B3 — Bills (`/bills`)
- [ ] Create with all fields (name, amount, due date, category, schedule, account, autopay, active range).
- [ ] Edit propagates to Tracker/Summary/Calendar/Analytics; delete confirms + handles orphan payments/history.
- [ ] Custom schedules (weekly/biweekly/monthly/quarterly/annual/custom): next-due & occurrences correct across month/year boundaries.
- [ ] Drag reorder persists (cross-check `billReorder.test.js`); search/filter panel filters + clears; large-list virtualization smooth.
- [ ] Merchant rules: create/matches/edit/delete; historical import dialog attributes month-crossing payments correctly.
- [ ] BillModal open/close, validation, cancel discards unsaved changes.
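
Expected values for the schedule check can be produced independently of the app. The sketch below computes fixed-interval and monthly due dates in UTC. Both helper names are hypothetical, and clamping a day-31 bill to the last day of a shorter month is an assumption about the intended behavior, so confirm it against the product before using these as an oracle.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Fixed-interval schedules (weekly = 7 days, biweekly = 14 days).
// Arithmetic is done on UTC timestamps so DST never adds or drops a day.
function fixedIntervalDates(startIso, intervalDays, count) {
  const [year, month, day] = startIso.split('-').map(Number);
  const start = Date.UTC(year, month - 1, day);
  return Array.from({ length: count }, (_, i) =>
    new Date(start + i * intervalDays * DAY_MS).toISOString().slice(0, 10)
  );
}

// Monthly schedule: `month` is 1-based. Day 0 of the following month is the
// last day of `month`, which is used to clamp (assumed behavior).
function monthlyDueDate(year, month, dayOfMonth) {
  const lastDay = new Date(Date.UTC(year, month, 0)).getUTCDate();
  const day = Math.min(dayOfMonth, lastDay);
  return `${year}-${String(month).padStart(2, '0')}-${String(day).padStart(2, '0')}`;
}

console.log(fixedIntervalDates('2026-12-18', 14, 3));
// [ '2026-12-18', '2027-01-01', '2027-01-15' ]
console.log(monthlyDueDate(2027, 2, 31)); // '2027-02-28'
console.log(monthlyDueDate(2028, 2, 31)); // '2028-02-29' (leap year)
```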
### B4 — Subscriptions & Categories
- [ ] Subscriptions: add/edit/delete, active/cancelled, renewal & annual→monthly normalization; totals feed Tracker/Summary/Analytics.
- [ ] Catalog: browse/search, add-from-catalog pre-fills.
- [ ] Categories: create/edit/delete (in-use handled: reassign/prevent); groups create/assign/reorder (`categoryGroups`/`categoryReorder` tests); colors/icons consistent on Tracker/Spending/Analytics.
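
The annual→monthly normalization is a place where a one-cent discrepancy can appear between pages. A small sketch of the arithmetic in integer cents, assuming round-to-nearest; the rounding rule, the cycle names, and `monthlyEquivalentCents` are all assumptions for illustration.

```javascript
// Months covered by one charge, per billing cycle (illustrative set).
const MONTHS_PER_CYCLE = new Map([
  ['monthly', 1],
  ['quarterly', 3],
  ['annual', 12],
]);

// Convert a per-cycle amount in cents to a per-month amount in cents.
function monthlyEquivalentCents(amountCents, cycle) {
  const months = MONTHS_PER_CYCLE.get(cycle);
  if (months === undefined) throw new RangeError(`Unknown billing cycle: ${cycle}`);
  if (!Number.isInteger(amountCents)) throw new TypeError('amountCents must be an integer');
  return Math.round(amountCents / months);
}

console.log(monthlyEquivalentCents(11999, 'annual')); // 1000
console.log(monthlyEquivalentCents(3000, 'quarterly')); // 1000
```

A $119.99 annual plan normalizes to $10.00 per month, and 12 × $10.00 is $120.00, not $119.99. When reconciling Subscriptions against Summary or Analytics, check whether yearly figures are built from the original amount or from the rounded monthly one.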
### B5 — Reporting reconciliation
- [ ] Summary totals (paid/unpaid/overdue/remaining) reconcile with Tracker for the same month; income breakdown modal matches.
- [ ] Calendar plots bills/payments on correct days (**timezone**: a bill due on the 1st must not render on the 31st); day totals correct.
- [ ] Analytics charts render with data AND empty (no broken SVG/`NaN` axes); period selectors update all charts; figures reconcile with Summary/Tracker; large dataset perf OK.
- [ ] Health indicators compute from real data, no crash on empty; recommendations sane.
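
The calendar timezone item describes a specific, reproducible failure: JavaScript parses a date-only ISO string as UTC midnight, so rendering it in a zone west of UTC lands on the previous day. The sketch below reproduces it deterministically with `Intl.DateTimeFormat`; `dayInZone` and `parseCalendarDay` are hypothetical helpers, not names from the codebase.

```javascript
// Render an instant as a calendar day (YYYY-MM-DD) in a given IANA time zone.
function dayInZone(date, timeZone) {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone,
    year: 'numeric',
    month: '2-digit',
    day: '2-digit',
  }).formatToParts(date);
  const get = (type) => parts.find((p) => p.type === type).value;
  return `${get('year')}-${get('month')}-${get('day')}`;
}

// Safer pattern (an assumption about the fix, not the app's code): keep a due
// date as a plain calendar value and never route it through an instant.
function parseCalendarDay(iso) {
  const [year, month, day] = iso.split('-').map(Number);
  return { year, month, day };
}

const dueDate = '2026-02-01';
const asInstant = new Date(dueDate); // parsed as 2026-02-01T00:00:00Z

console.log(dayInZone(asInstant, 'UTC')); // '2026-02-01'
console.log(dayInZone(asInstant, 'America/Los_Angeles')); // '2026-01-31'
console.log(parseCalendarDay(dueDate).day); // 1
```

Running the Calendar page with the browser set to a zone such as `America/Los_Angeles` is therefore enough to expose the bug if it exists.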
### B6 — Spending (`/spending`)
- [ ] Category-group view assigned/spent/available math correct; 3-month averages correct.
- [ ] Cover-overspending reallocates funds correctly and is reversible.
- [ ] Safe-to-spend matches Tracker (`safeToSpend.test.js`); month nav; empty/partial months handled.
### B7 — Debt planning (`/snowball`, `/payoff`)
- [ ] Add debts (balance/APR/min); snowball vs avalanche ordering correct.
- [ ] Projection + amortization vs a **hand-calculated** example; APR=0 and already-paid debts correct.
- [ ] Extra-payment/budget updates payoff date + total interest; chart renders; plan history saves/restores; status banner accurate.
- [ ] Edge: single debt, many debts, `$0` debt, negative/absurd inputs rejected.
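
A hand-calculated example is easier to trust when the same arithmetic is written down once. The sketch below is an independent oracle, not the app's algorithm. It assumes monthly compounding at APR/12, interest rounded to the nearest cent, and interest accrued before each payment; the app's conventions may differ, so compare the conventions first and the numbers second. All names are hypothetical.

```javascript
// Order debts for payoff. Snowball: smallest balance first.
// Avalanche: highest APR first.
function orderDebts(debts, strategy) {
  const sorted = [...debts];
  if (strategy === 'snowball') sorted.sort((a, b) => a.balanceCents - b.balanceCents);
  else if (strategy === 'avalanche') sorted.sort((a, b) => b.aprPercent - a.aprPercent);
  else throw new RangeError(`Unknown strategy: ${strategy}`);
  return sorted.map((d) => d.name);
}

// Simulate a single debt with a fixed monthly payment, in integer cents.
function simulatePayoff(balanceCents, aprPercent, paymentCents) {
  if (balanceCents < 0 || aprPercent < 0 || paymentCents <= 0) {
    throw new RangeError('Balance and APR must be >= 0 and payment must be > 0');
  }
  let balance = balanceCents;
  let months = 0;
  let interestCents = 0;
  while (balance > 0) {
    const interest = Math.round((balance * aprPercent) / 1200); // APR% / 12 / 100
    if (paymentCents <= interest) throw new RangeError('Payment never reduces the balance');
    balance += interest;
    interestCents += interest;
    balance -= Math.min(paymentCents, balance);
    months += 1;
  }
  return { months, interestCents };
}

const debts = [
  { name: 'Card A', balanceCents: 50000, aprPercent: 24.9 },
  { name: 'Loan B', balanceCents: 200000, aprPercent: 6 },
  { name: 'Card C', balanceCents: 120000, aprPercent: 29.9 },
];
console.log(orderDebts(debts, 'snowball')); // [ 'Card A', 'Card C', 'Loan B' ]
console.log(orderDebts(debts, 'avalanche')); // [ 'Card C', 'Card A', 'Loan B' ]

// $1,000.00 at 12% APR paying $500.00/month:
// month 1: +$10.00 interest -> $510.00 left; month 2: +$5.10 -> $15.10 left;
// month 3: +$0.15 -> final payment of $15.25.
console.log(simulatePayoff(100000, 12, 50000)); // { months: 3, interestCents: 1525 }
console.log(simulatePayoff(100000, 0, 50000)); // { months: 2, interestCents: 0 }
console.log(simulatePayoff(0, 12, 50000)); // { months: 0, interestCents: 0 }
```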
### B8 — Banking (`/bank-transactions`)
- [ ] Ledger loads/virtualizes/filters (date/account/amount/merchant/status).
- [ ] Transaction matching (match/unmatch), auto-match review approve/reject, no double-match (`transactionMatchService.test.js`).
- [ ] Merchant/store matching rules + confidence/duplicates; advisory non-bill filter flags/hides with override.
- [ ] Matched payments reflect on Tracker/ledger without double-counting; category picker persists.
### B9 — Data lifecycle (`/data`)
- [ ] Imports: spreadsheet (XLSX/CSV) map/preview/commit, malformed rejected, dup/partial handled; transaction CSV (`csvTransactionImportService.test.js`) dedupe + parsing; SQLite user import version-checked + confirms overwrite; seed demo data safe; import history lists + rollback.
- [ ] Exports: download SQLite **round-trips** (export → fresh account → import → matches); Excel export opens uncorrupted; ICS calendar feed valid in a client AND properly **token-gated** (route mounts before auth — verify not open).
- [ ] Backups: manual + scheduled restorable on a scratch instance; permissions not world-readable; old backups pruned (`backupAndCleanup.test.js`).
### B10 — Notifications & workers
- [ ] Each channel (email/SMTP, ntfy, Gotify, Discord, Telegram): test message delivers; bad token/URL → clear error, logged, no secret leak.
- [ ] Reminders fire at configured lead time for upcoming/overdue; no duplicates; paid/skipped excluded; respects per-user prefs.
- [ ] Workers: `dailyWorker`, `bankSyncWorker` (interval + guardrails), `backupScheduler` run on schedule; errors caught/logged, don't crash server, next run unblocked.
### B11 — Admin panel (`/admin`)
- [ ] Onboarding wizard completes without a broken state.
- [ ] Users table: add/edit-role/reset-pw/disable/delete; **cannot remove the last admin**.
- [ ] Login mode switch single↔multi verified live, no lockout; auth-methods enable/disable + bad config surfaced.
- [ ] Email notif config + test send; bank sync admin (configure/manual/auto/status/revoke).
- [ ] Backups create/list/download/restore/delete; cleanup panel previews impact + confirms (counts match `backupAndCleanup.test.js`).
- [ ] Privacy admin edits reflect on public `/privacy`; system status metrics/versions/jobs accurate (`statusService.test.js`); admin actions rate-limited + audited (`auditService` — spot-check log).
### B12 — Settings, Profile & global UI
- [ ] Settings: theme (light/dark/system) persists; notification prefs save + reflect in B10; display/density/period/search-panel prefs persist; invalid rejected.
- [ ] Profile: change password (current required, invalidates sessions), manage 2FA/passkeys, sessions revoke (`profileRoute.test.js`).
- [ ] Static: About (public + admin, version shown), Privacy, Release Notes (dialog once per `user`, dismiss persists), Roadmap (admin), NotFound friendly + way home.
- [ ] Global: command palette (`Ctrl+K`) search/keyboard/Esc, hidden for default admin; sidebar collapse/expand + mobile overlay (check overflow issue in `docs/UI_IMPROVEMENTS.md`); toasts stack/dismiss; page transitions no flash/double-fetch; theme applied before first paint.
### B13 — API / backend direct
Route groups: `auth`, `auth/oidc`, `admin`, `tracker`, `bills`, `subscriptions`, `payments`, `data-sources`, `transactions`, `matches`, `categories`, `settings`, `user`, `calendar`, `summary`, `monthly-starting-amounts`, `analytics`, `spending`, `snowball`, `notifications`, `status`, `about`, `about-admin`, `privacy`, `version`, `profile`, `export`, `import`/`imports`.
- [ ] Auth: unauth → 401, wrong role → 403, right role → 200.
- [ ] CSRF: state-changing without valid token rejected; with token succeeds (`middleware/csrf.js`).
- [ ] Validation: bad/missing body → structured 4xx (`middleware/errorFormatter.js`, `utils/apiError.js`), never a raw 500 stack.
- [ ] IDOR/isolation: other user's resource by id → 403/404, no leak.
- [ ] Rate limits: login/admin/export/import/OIDC limiters trigger + reset (`middleware/rateLimiter.js`).
- [ ] Money in **integer cents** end-to-end (per `docs/cents-migration-plan.md`); API and DB agree; no float drift.
- [ ] Idempotency: repeated create doesn't duplicate; concurrent edits resolve sanely.
- [ ] Consistent error JSON + correct status codes; security headers present (`middleware/securityHeaders.js`); public routes (`about`/`privacy`/`version`/calendar feed) leak nothing sensitive.
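
The integer-cents rule is easiest to check next to the float behavior it guards against. The sketch below shows the drift and a string-based conversion that avoids it. `dollarsToCents` is a hypothetical helper for building test inputs, not a claim about what `utils/money.js` contains.

```javascript
// Convert a decimal dollar string to integer cents without float math.
// Accepts at most two decimal places; anything else is rejected.
function dollarsToCents(text) {
  const match = /^(-?)(\d+)(?:\.(\d{1,2}))?$/.exec(text.trim());
  if (!match) throw new RangeError(`Not a valid money amount: ${text}`);
  const [, sign, whole, frac = ''] = match;
  const cents = Number(whole) * 100 + Number(frac.padEnd(2, '0'));
  return sign === '-' && cents !== 0 ? -cents : cents;
}

// Why floats are not acceptable for money:
console.log(0.1 + 0.2 === 0.3); // false
console.log(Math.floor(19.99 * 100)); // 1998, one cent short

// The string-based path stays exact:
console.log(dollarsToCents('19.99')); // 1999
console.log(dollarsToCents('0.1') + dollarsToCents('0.2')); // 30
```

Amounts such as `19.99`, `0.10`, and `1234567.89` make useful probes: send each through create, read it back from the API, and compare against the value in the database.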
### B14 — Non-functional
- [ ] **a11y (manual):** keyboard-only reach/operate every control, visible focus, skip-link works; screen-reader labels/roles (Radix `aria-*`); WCAG-AA contrast light+dark; modals trap+restore focus, Esc closes; errors announced not color-only.
- [ ] **a11y (automated):** run **axe-core** on every page (`@axe-core/playwright`, or `jest-axe` for component-level) — **zero critical/serious** violations; triage moderate. Wire it into the E2E suite so it re-runs every cycle, not just once.
- [ ] **Visual regression:** capture a baseline screenshot per page × {desktop, mobile} × {light, dark} (Playwright `toHaveScreenshot`); diff against baseline each cycle. Every non-trivial pixel diff is either an intended change (update the baseline in the same commit) or a finding — never ignore it. This is what makes "every page looks right" repeatable instead of eyeballed.
- [ ] **Performance:** initial load + lazy route splitting OK on Slow 3G; large lists responsive; no memory leak over 10+ navigations; no duplicate/excess requests (React Query `staleTime`).
- [ ] **PWA/offline:** installs; manifest/icon correct; offline shell loads with graceful messaging; SW updates without stale-cache breakage.
- [ ] **Security spot-checks:** XSS in bill names/notes/category names/imported data escaped everywhere (defense = React auto-escaping + the restrictive custom `MarkdownText` renderer — https-only link hrefs, **no** `dangerouslySetInnerHTML` anywhere; NOT rehype-sanitize, which is unused, see QA-B14-03); no secrets (SimpleFIN token, SMTP creds, OIDC secret) in bundle/responses/logs; cookies `HttpOnly`/`Secure`/`SameSite`; `encryptionService` protects at-rest secrets, keys not committed. (Depth: `SECURITY_AUDIT.md`.)
- [ ] **Resilience:** kill API mid-session → recoverable errors, no data loss on next save; locked/corrupt SQLite surfaces clearly; SimpleFIN/SMTP/push down → graceful degrade; two-tab concurrent edits don't silently clobber.
- [ ] **Fault injection (systematic):** with a request-interception harness (Playwright `page.route`, or DevTools network overrides), force each page's API calls to **401 mid-session / 403 / 429 / 500 / network-timeout / malformed-JSON** and confirm the UI shows a recoverable error (toast or `ErrorBoundary` fallback), never a white screen, stuck spinner, or silent success. Do this per page, not once globally — each page handles failure differently.
- [ ] **Timezone/locale:** non-UTC tz + DST boundary — due dates and calendar stay correct.
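
The fault-injection item can be turned into a page × fault matrix so no combination is skipped. A sketch under stated assumptions: the fault list mirrors the checklist, `route.fulfill` and `route.abort` are the Playwright route methods the handler is meant to call, and everything else (`FAULTS`, `toRouteHandler`, `buildFaultMatrix`, the response bodies) is hypothetical. The handler takes the route object as a parameter, so the block runs without Playwright installed.

```javascript
// One entry per fault named in the checklist. Bodies are placeholders.
const FAULTS = [
  { name: '401 mid-session', status: 401, body: '{"error":"unauthorized"}' },
  { name: '403', status: 403, body: '{"error":"forbidden"}' },
  { name: '429', status: 429, body: '{"error":"rate limited"}' },
  { name: '500', status: 500, body: '{"error":"server error"}' },
  { name: 'network timeout', abort: 'timedout' },
  { name: 'malformed JSON', status: 200, body: '{"bills": [' },
];

// Build a handler suitable for Playwright's page.route, e.g.
//   await page.route('**/api/**', toRouteHandler(fault));
function toRouteHandler(fault) {
  return (route) =>
    fault.abort
      ? route.abort(fault.abort)
      : route.fulfill({
          status: fault.status,
          contentType: 'application/json',
          body: fault.body,
        });
}

// Every page gets every fault: failure handling is checked per page.
function buildFaultMatrix(pages) {
  return pages.flatMap((page) => FAULTS.map((fault) => ({ page, fault: fault.name })));
}

console.log(buildFaultMatrix(['/', '/bills']).length); // 12
```

Each matrix row then maps to one test: install the handler, load the page, and assert that a toast or `ErrorBoundary` fallback appears instead of a blank screen or an endless spinner.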
### B15 — Regression & sign-off
Run on the **production build** (`npm start`), not dev:
- [ ] `npm run ci` green. Log in as `user` and `admin`.
- [ ] `npm run test:e2e` green (Playwright smoke + axe + visual-regression baselines match, §4.4).
- [ ] Tracker: create bill → quick-pay → skip another → add note; reflected on Summary/Calendar/Analytics.
- [ ] Create a category + subscription → appear on Tracker/Spending; Spending safe-to-spend correct.
- [ ] Snowball: add debt → projection. Data: seed → export → import round-trip (scratch DB).
- [ ] Admin: open panel, users, system status, run a backup. Banking loads + matches (if SimpleFIN configured).
- [ ] Notifications: one test message on configured channel. Toggle dark mode; mobile viewport; `Ctrl+K` navigates.
- [ ] Bogus URL → 404; logout → login redirect. Console clean throughout.
- [ ] Confirm [exit criteria](#appendix-b--exit--sign-off-criteria).
### B16 — Migrations, secrets & deployment
Added Cycle 1 (previously uncovered). These run on every boot / container start and
touch money columns and at-rest secrets — a bug here corrupts data or leaks/breaks
secrets silently.
**Migrations** (`db/database.js` migration system, `scripts/migrate-db.js`, `schema_migrations`, `rollbackMigration`)
- [ ] **Idempotent:** boot twice on the same DB → second run applies nothing ("Skipping already applied"), no errors, no duplicate rows/columns.
- [ ] **Fresh == migrated:** a brand-new DB (schema.sql + all migrations) has the same schema as a DB migrated up from an old version — same tables/columns/indexes, money columns are **integer cents**.
- [ ] **Rollback:** `rollbackMigration` on the latest migration reverts cleanly and re-applying works; partial/failed migration leaves the DB consistent (transactions per migration).
- [ ] **Money conversions correct:** v1.03 (dollars→cents) and v1.04 (template JSON) convert exact values, no ×100 drift, run once only.
- [ ] Migrating a large/real DB doesn't lose or duplicate bills/payments/categories.
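
The idempotency and run-once items guard against one failure in particular: a dollars→cents conversion applied twice multiplies every amount by 100 again. The sketch below models that with an in-memory stand-in. It is not the migration system in `db/database.js`; the `db` shape, `runMigrations`, and the migration body are illustrative assumptions, and the snapshot only imitates a per-migration transaction.

```javascript
// Apply each migration at most once, recording versions like a
// schema_migrations table would. Returns the versions applied in this run.
function runMigrations(db, migrations) {
  const appliedNow = [];
  for (const migration of migrations) {
    if (db.schemaMigrations.has(migration.version)) continue; // already applied
    const snapshot = JSON.parse(JSON.stringify(db.rows)); // stand-in for BEGIN
    try {
      migration.up(db);
    } catch (err) {
      db.rows = snapshot; // stand-in for ROLLBACK
      throw err;
    }
    db.schemaMigrations.add(migration.version);
    appliedNow.push(migration.version);
  }
  return appliedNow;
}

// Illustrative dollars -> cents conversion (version label taken from the checklist).
const toCentsMigration = {
  version: 'v1.03',
  up(db) {
    for (const row of db.rows) row.amount = Math.round(row.amount * 100);
  },
};

const db = { schemaMigrations: new Set(), rows: [{ id: 1, amount: 19.99 }] };
const firstRun = runMigrations(db, [toCentsMigration]);
const secondRun = runMigrations(db, [toCentsMigration]);

console.log(firstRun); // [ 'v1.03' ]
console.log(secondRun); // []
console.log(db.rows[0].amount); // 1999, not 199900
```

The equivalent live check is to boot twice against a copy of a real database and diff the money columns between the first and second boot.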
**Encryption-key lifecycle** (`services/encryptionService.js`, `TOKEN_ENCRYPTION_KEY`, HKDF v1/v2)
- [ ] **Key present:** secrets (SMTP pw, OIDC secret, push tokens, login IP/UA) encrypt at rest and decrypt correctly.
- [ ] **Key missing:** app boots; secret features degrade gracefully (no crash); confirm secrets are **not** silently stored/served in plaintext.
- [ ] **Key rotated/wrong:** old ciphertext fails to decrypt **gracefully** (no crash, no stack leak); `safeDecrypt` fallback path is sane; re-encryption migrations (v0.77–0.79) behave.
- [ ] Encryption key is never committed, logged, or returned in any API response.
**Container / deploy** (`Dockerfile`, `docker-compose.yml`, `docker-entrypoint.sh`, `deploy.sh`)
- [ ] Image **builds**; container **starts**; app reachable; `/api/version` responds.
- [ ] Entrypoint: creates `DATA_DIR`/`DB_DIR`/`BACKUP_DIR`, sets **`chmod 700`** (not world-readable), `chown`s to the non-root `bill` user, runs migrations when `RUN_DB_MIGRATIONS=true`.
- [ ] Data **persists** across container restart (mounted volume); DB not re-created.
- [ ] Runs as **non-root**; secrets come from env, not baked into the image.
**Update check / phone-home** (`services/updateCheckService.js`)
- [ ] Confirm the external request to `REPO_API_URL` (default `dream.scheller.ltd`) is **disclosed** (privacy page) and **opt-out-able**; it must send no user data, only fetch the latest release; failure/offline degrades silently.
**Rate-limiter completeness** (`middleware/rateLimiter.js`) — beyond B13's list
- [ ] `backupOperationLimiter` throttles admin backup/restore/cleanup; `skipRateLimitIfNoUsers` only relaxes limits on a genuinely empty instance (first-run), never afterward.
### B17 — Code health & consolidation (IMP)
An **improvement** batch: hunt for ways to make the codebase smaller, clearer, and more
consistent — *without changing behavior*. Every candidate is logged as an `IMP-CODE-*`
row in §2.1 with a concrete proposal; nothing is refactored silently. Consolidation
lands only when it's behavior-preserving **and** covered by existing or added tests.
- [ ] **Duplication / DRY:** find logic copy-pasted across services/routes/components and
propose a shared helper. Known hot spots: money formatting/rounding (`utils/money.js`
vs inline), the `resolveDueDate` occurrence gate (must stay one implementation),
error-response shaping (`utils/apiError.js` vs ad-hoc), React data-fetch patterns
(repeated `useQuery` + toast + error handling → shared hooks).
- [ ] **Dead / unused code:** unused exports, unreachable branches, orphaned files,
commented-out blocks, unused deps (`depcheck`), unused UI components/CSS, leftover
scaffolding. Propose deletion (verify no dynamic/`require`-by-string use first).
- [ ] **Overlapping modules:** services that do similar work and could merge or share a
core — e.g. the matching family (`matchSuggestionService`, `transactionMatchService`,
`merchantStoreMatchService`), the bank-sync family (`bankSyncService`,
`bankSyncWorker`, `bankSyncConfigService`, `simplefinService`). Map responsibilities;
propose a consolidation only where it removes real duplication, not just moves it.
- [ ] **Oversized / low-cohesion files:** split by concern where it aids navigation
(e.g. `db/database.js` is very large — migrations vs query helpers vs settings could
be separate modules). Propose the seams; don't split for its own sake.
- [ ] **One canonical path per concern:** cents handling, date/tz, CSRF, error shape,
pagination — confirm there's a single blessed way and flag divergences.
- [ ] **Consistency:** naming, file layout, async patterns, import ordering; a lint rule
that would prevent a class of the bugs found in earlier batches is itself an IMP.
- [ ] **Test/infra dedupe:** repeated test setup → shared fixtures/helpers; flag coverage
gaps a consolidation would risk.
### B18 — UX & information architecture / menus (IMP)
An **improvement** batch focused on the person using the app: is every feature
*discoverable*, is every core flow *smooth*, and does every action live *where a user
would look for it*? Candidates are logged as `IMP-UX-*` or `IMP-IA-*` in §2.1 with a
concrete before/after proposal. Walk the app as a real user (both `user` and `admin`,
desktop and mobile, light and dark), not just as a tester.
**Information architecture & menus**
- [ ] **Discoverability:** is any feature buried, orphaned, or reachable only by typing a
URL? Everything should be reachable from the nav, a menu, or a clear in-page entry.
- [ ] **Navigation structure:** sidebar/nav grouping is logical; related pages sit
together; admin vs user separation is clear; active state + page titles are correct.
- [ ] **Menus where they belong:** actions that today are loose buttons or hidden should
be grouped into sensible menus — overflow (`⋯`) menus on rows/cards, context menus,
a consolidated **Settings** grouping, an account menu. Put related actions in one menu
rather than scattering them. Flag anything that would be easier to find as a menu item.
- [ ] **Command palette (`Ctrl+K`) coverage:** every page/primary action is reachable;
no dead entries.
- [ ] **Redundancy:** the same action offered in three places with different labels, or
two pages that do nearly the same thing — propose consolidating.
**Experience quality**
- [ ] **Core-flow friction:** count the clicks for the top tasks (pay a bill, add a bill,
connect SimpleFIN, run a sync, import data). Propose shortcuts where a step is wasted.
- [ ] **States:** every list/page has a clear **empty state with a next-step CTA**, a
**loading** state (skeleton/spinner, no layout jump), and a **recoverable error**
state — no dead ends, no silent failures.
- [ ] **Feedback & safety:** state changes confirm (toast); destructive actions confirm
and, where feasible, offer undo (bills already soft-delete — surface a restore path);
long actions show progress.
- [ ] **Consistency:** primary-action placement, button hierarchy, iconography,
terminology, and confirmation patterns are consistent across pages.
- [ ] **Mobile parity:** every action available on desktop is reachable on mobile; touch
targets adequate; menus/overflow work on touch.
- [ ] **Onboarding:** first-run and empty-account guidance explains the next step; advanced
features (SimpleFIN, debt planning, backups) have a short in-context explanation.
---
## 8. Appendices
### Appendix A — Severity definitions
| Level | Definition |
|-------|------------|
| **S1 Critical** | Data loss/corruption, security hole, crash/blank page, wrong money math, cannot log in/save. |
| **S2 Major** | Feature broken/unusable, wrong results, broken navigation, unhandled error. |
| **S3 Minor** | Works but wrong edge behavior, confusing UX, missing validation message. |
| **S4 Cosmetic** | Visual/copy/alignment/dark-mode-contrast, non-blocking. |
| **IMP Improvement** | Not a bug; enhancement or polish idea. |
### Appendix B — Exit / sign-off criteria
A cycle is release-ready when: **(Cycle 1 — all met ✅)**
- [x] All batches B0–B15 ✅ (Chromium desktop + mobile via the E2E projects; light + dark, `user` + `admin` exercised). Cross-browser Firefox/Safari carried to Cycle 2.
- [x] B15 smoke green on the **production build** (`npm run smoke:prod`).
- [x] **Zero open S1/S2** in the Findings Log; S3/S4/IMP all fixed & archived.
- [x] `npm run ci` green (server 109 + client 34 + build); no new console errors (verified in prod-smoke).
- [x] Data export→import round-trip verified with no loss (`tests/exportImportRoundTrip.test.js`).
- [x] Auth/authorization + data-isolation all pass (probe: IDOR → 404, CSRF → 403, admin/status → 403).
- [x] Money and date/period correctness verified vs hand-calculated examples (`tests/money.test.js`, `aprService`, recurrence probe, reconciliation guards).
- [x] All 14 fixes archived to `HISTORY.md` v0.41.0; cycle summary recorded (Cycle Log §1.1).
### Appendix C — Page ↔ route ↔ API quick map
| Page | Route | Primary API |
|------|-------|-------------|
| Tracker | `/` | `/api/tracker`, `/api/bills`, `/api/payments`, `/api/monthly-starting-amounts` |
| Calendar | `/calendar` | `/api/calendar` |
| Summary | `/summary` | `/api/summary` |
| Bills | `/bills` | `/api/bills`, `/api/categories`, `/api/matches` |
| Subscriptions / Catalog | `/subscriptions`, `/subscriptions/catalog` | `/api/subscriptions` |
| Categories | `/categories` | `/api/categories` |
| Health | `/health` | `/api/analytics`, `/api/summary` |
| Analytics | `/analytics` | `/api/analytics` |
| Spending | `/spending` | `/api/spending` |
| Banking | `/bank-transactions` | `/api/transactions`, `/api/matches`, `/api/data-sources` |
| Snowball / Payoff | `/snowball`, `/payoff` | `/api/snowball` |
| Settings | `/settings` | `/api/settings`, `/api/notifications` |
| Profile | `/profile` | `/api/profile`, `/api/user` |
| Data | `/data` | `/api/import`, `/api/export`, `/api/data-sources` |
| Admin | `/admin`, `/admin/status` | `/api/admin`, `/api/status`, `/api/about-admin` |
| About / Privacy / Release Notes / Roadmap | `/about`, `/privacy`, `/release-notes`, `/roadmap` | `/api/about`, `/api/privacy`, `/api/version` |
### Appendix D — Reference docs
`SECURITY_AUDIT.md` (security depth) · `docs/UI_IMPROVEMENTS.md` (known UI issues) · `docs/cents-migration-plan.md` (money-as-cents) · `docs/SIMPLEFIN_CONSUMER_GUARDRAILS.md` (sync limits) · `docs/CSRF-SPA-Setup.md`, `docs/RATE_LIMITING_ENHANCEMENT.md` (security middleware) · `REVIEW.md`, `DEVELOPMENT_LOG.md`, `roadmap.md`, `FUTURE.md` (context/known gaps) · `HISTORY.md` (changelog / fix archive) · `playwright.config.js` + `e2e/` (automated E2E/visual/a11y harness, §4.4).
### Appendix E — Per-page control census
The completeness ledger behind "every button, textbox, slider is right." Fill one table **per page** during [B0](#b0--baseline-tooling--coverage-recon) and check every control off during that page's batch. A control not listed here is a control not tested. Build the starting list with `grep -rnoE "type=[\"'][a-z]+[\"']" client/pages/<Page>.jsx` + `grep -n "onClick=\|<Button\|<Select\|<Switch\|<Checkbox" client/pages/<Page>.jsx`.
**Template** (copy per page):
| Control | Type | Expected action | States checked (default/focus/disabled/error/loading) | Keyboard | Result |
|---------|------|-----------------|-------------------------------------------------------|----------|--------|
| *e.g.* Quick-pay button | button | marks bill paid, updates balance cards, undo available | default ✓ · disabled-while-saving ✓ | Enter ✓ | ✅ / finding id |
| *e.g.* Amount input | number | per-month override, cents only, no wheel-scroll change | default ✓ · error-on-letters ✓ | Tab/Esc ✓ | ✅ / finding id |
**Pages to census** (from `client/pages/`, keep in sync with [Appendix C](#appendix-c--page--route--api-quick-map)): Tracker, Calendar, Summary, Bills, Subscriptions, SubscriptionCatalog, Categories, Health, Analytics, Spending, Snowball, Payoff, BankTransactions, Data, Settings, Profile, Admin, Status, About, Privacy, ReleaseNotes, Roadmap, Login, NotFound — plus the shared **Sidebar/command-palette/header** chrome once.