docs(qa): define per-pass chunk granularity (sub-batch to one context window)

Round-1 calibration: A & D fit as single batches; B/C/E overflowed and got deferred. Add a batch-sizing table: B=1 game/chunk, C=1 screen-group/chunk, D=~4 sub-areas, E=3-5 types/chunk, F=1 dimension/chunk. Chunk = largest unit that finishes+commits in one window; commit + run-state update per chunk. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-24 21:47:04 -05:00 · 2026-06-24 21:47:04 -05:00 · bbd7ef0806
parent 84dd5f1152
commit bbd7ef0806
1 changed files with 19 additions and 0 deletions
--- a/ClaudeQAPlan.md
+++ b/ClaudeQAPlan.md
@ -85,6 +85,25 @@ State lives in **files**, not memory:
  online; (3) **installed build == current HEAD** (rebuild+reinstall if unsure — never QA a stale APK); (4) continue
  at the first `todo` / unverified-fix.

+## Batch sizing — sub-batch each pass to ONE context window (Round-1 calibration)
+A pass is a **category**, not a unit of work. Execute each pass as **sub-batches (chunks)**, where a chunk = the
+**largest coherent unit that reliably finishes AND commits within one context window, with margin**. End every chunk
+with a commit + run-state update. If a chunk starts overflowing, split it; if chunks feel trivial, merge them.
+**Why:** in Round 1, A & D fit as single batches, but B/C/E were too large → got cut off → deferred. Sub-batching
+prevents half-done/lost work and gives cleaner per-chunk verification + revertable commits.
+
+| Pass | Chunk granularity | ~chunks |
+|---|---|---|
+| A Premium | free-state gating sweep; then couple-shared verify (mostly code + a few live taps) | 1–2 |
+| B Games | **one game per chunk** — full two-device playthrough + edges + commit | 7 |
+| C Visual | **one screen-group per chunk** (both themes, ~6–10 screens, montage-reviewed + nav/back for that group) — never "all screens" at once (heaviest, image-bound) | 6–8 |
+| D Security | D1 at-rest · D2 rules + D3 negative · D4 keys/recovery · D5–D7 appcheck/secrets/leaks/migration | ~4 |
+| E Notifications | **3–5 types per chunk** × {foreground/background/killed} + tap-to-open | ~4 |
+| F Resilience | **one dimension per chunk** (concurrency · lifecycle/process-death · network · time · account-lifecycle) | ~5 |
+
+Context-cost tips: prefer **code/admin-read audits** (cheap) before live UI sweeps; **montage** screenshots
+(dark|light pairs) to review many at once; keep one chunk = one TodoWrite focus.
+
 ## Guardrails & efficiency
 - **Never `pm clear` / wipe app data** — breaks the App Check debug token. Pre-pairing QA: sign-out → fresh sign-up.
 - **Never run `seed/build_db.py`.** Admin seeds/writes, entitlement toggles, and any deploys are **user-authorized per occurrence**.