docs(qa): define per-pass chunk granularity (sub-batch to one context window)

Round-1 calibration: A & D fit as single batches; B/C/E overflowed and got deferred.
Add a batch-sizing table: B=1 game/chunk, C=1 screen-group/chunk, D=~4 sub-areas,
E=3-5 types/chunk, F=1 dimension/chunk. Chunk = largest unit that finishes+commits in one
window; commit + run-state update per chunk.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
null 2026-06-24 21:47:04 -05:00
parent 84dd5f1152
commit bbd7ef0806
1 changed files with 19 additions and 0 deletions

View File

@ -85,6 +85,25 @@ State lives in **files**, not memory:
online; (3) **installed build == current HEAD** (rebuild+reinstall if unsure — never QA a stale APK); (4) continue
at the first `todo` / unverified-fix.
## Batch sizing — sub-batch each pass to ONE context window (Round-1 calibration)
A pass is a **category**, not a unit of work. Execute each pass as **sub-batches (chunks)**, where a chunk = the
**largest coherent unit that reliably finishes AND commits within one context window, with margin**. End every chunk
with a commit + run-state update. If a chunk starts overflowing, split it; if chunks feel trivial, merge them.
**Why:** in Round 1, A & D fit as single batches, but B/C/E were too large → got cut off → deferred. Sub-batching
prevents half-done/lost work and gives cleaner per-chunk verification + revertable commits.
| Pass | Chunk granularity | ~chunks |
|---|---|---|
| A Premium | free-state gating sweep; then couple-shared verify (mostly code + a few live taps) | 12 |
| B Games | **one game per chunk** — full two-device playthrough + edges + commit | 7 |
| C Visual | **one screen-group per chunk** (both themes, ~610 screens, montage-reviewed + nav/back for that group) — never "all screens" at once (heaviest, image-bound) | 68 |
| D Security | D1 at-rest · D2 rules + D3 negative · D4 keys/recovery · D5D7 appcheck/secrets/leaks/migration | ~4 |
| E Notifications | **35 types per chunk** × {foreground/background/killed} + tap-to-open | ~4 |
| F Resilience | **one dimension per chunk** (concurrency · lifecycle/process-death · network · time · account-lifecycle) | ~5 |
Context-cost tips: prefer **code/admin-read audits** (cheap) before live UI sweeps; **montage** screenshots
(dark|light pairs) to review many at once; keep one chunk = one TodoWrite focus.
## Guardrails & efficiency
- **Never `pm clear` / wipe app data** — breaks the App Check debug token. Pre-pairing QA: sign-out → fresh sign-up.
- **Never run `seed/build_db.py`.** Admin seeds/writes, entitlement toggles, and any deploys are **user-authorized per occurrence**.