366 lines
34 KiB
Markdown
366 lines
34 KiB
Markdown
# Claude QA Playbook — Full-App QA → Fix → Re-QA until flawless
|
||
|
||
> Reusable QA plan for the Closer app. Run report-only first, fix everything, then re-QA until a clean round.
|
||
> Progress/state is tracked in **ClaudeReport.md** (issues) + **ClaudeQACoverage.md** (coverage matrix), which are
|
||
> the authoritative source of truth. See the Continuity section before resuming.
|
||
>
|
||
> **Program roadmap:** **Part 1** = Android QA (this doc) → **Part 2** = build the iOS app to Android's current
|
||
> parity → **Part 3** = run these same passes on iOS + a cross-platform (Android↔iOS) pass. **Parts 2 & 3 live in
|
||
> `ClaudeiOSPlan.md`** (note: iOS build/run/QA requires macOS — not possible from this Linux box).
|
||
|
||
## Context
|
||
Drive the real app on both emulators, verify each thing live, report, fix, re-verify. Five QA dimensions:
|
||
1. **Couple-shared premium** — if EITHER partner is premium, **all** premium features unlock for **both**.
|
||
2. **Games** — each starts, plays, finishes correctly on both devices.
|
||
3. **Full visual pass, light + dark** — every screen, text readable, nothing clipped/invisible.
|
||
4. **Security & encryption (cornerstone)** — every private field is ciphertext at rest, rules hold against
|
||
non-members, keys/recovery are sound. Findings here default to P0.
|
||
5. **Notifications** — all 17 types deliver to the right partner (foreground/background/killed), deep-link
|
||
correctly, and leak no private content.
|
||
|
||
Scope decisions: **exhaustive** visual pass (all ~50 screens, both modes); **full scope incl. pre-pairing** flows
|
||
(fresh throwaway account); **couple-shared everywhere** — per-user gates are bugs, fixed by routing through
|
||
`core/billing/CouplePremiumChecker.kt`.
|
||
|
||
**Early known signal:** only chat uses `CouplePremiumChecker`; games/packs/dates/wheel gate on the user's own
|
||
`EntitlementChecker.isPremium()` — so premium almost certainly does NOT unlock for the free partner there. Pass A
|
||
confirms + enumerates this; the fix phase applies couple-shared everywhere.
|
||
|
||
## Execution mode — run to completion (autonomous; do NOT stop)
|
||
- **Do not stop to check in or ask for approval.** Run all five passes → the fix phase → re-QA rounds **continuously
|
||
until a flawless round** (zero open P0–P2, Passes D + E clean, every game fully played through, navigation/back-stack
|
||
verified). Don't hand control back early.
|
||
- **Unblock yourself:** if anything **blocks progress** (a stale/blocking session, a crash, a build break, a missing
|
||
prerequisite state, a broken nav path that prevents reaching a screen), **fix it immediately and continue** — even
|
||
though passes are otherwise report-only. Blocking issues are fixed inline so the run can proceed; non-blocking
|
||
findings are still logged and fixed in the fix phase.
|
||
- **"Once executed, complete it":** never declare done before the Definition of Done is met — keep cycling fix → re-QA
|
||
until flawless, then stop.
|
||
- **Context limits ≠ stopping — do NOT hand back to the user when context fills.** The harness auto-summarizes a long
|
||
conversation and continues in the next window; you continue **without the user**. (You cannot self-invoke `/compact`
|
||
— and you don't need to; auto-compaction handles it.) The **committed `ClaudeReport.md` run-state + `ClaudeQACoverage.md`
|
||
are the authoritative state** and survive any compaction — after a summary, **re-read them and continue at the next
|
||
chunk**. Never pause a run merely because context is getting long; only stop for a true blocker (a denied gated action
|
||
even with standing auth, or the macOS requirement for iOS).
|
||
- **Commit before anything interruptible** so a mid-chunk compaction never loses progress. Keep chunks atomic; if a
|
||
chunk is cut off mid-way (e.g., a game session left active), the **session-start ritual recovers it** (clear the stuck
|
||
session via in-app "End their game", then redo that chunk). Right-sized chunks (see Batch sizing) make this rare.
|
||
- **Don't pause for "by-design vs bug":** log the ambiguous finding and keep going (don't unilaterally rewrite
|
||
deliberate design — the log captures it). Never halt the run to ask.
|
||
- **Only true stop = a gated action you cannot perform.** Production deploys, admin Firestore writes/seeds, and
|
||
entitlement toggles still need per-occurrence authorization (the classifier enforces this regardless of this doc).
|
||
If one is genuinely required to proceed and is denied, do **all** other work first, then surface only that single
|
||
blocker — don't halt the whole run for it.
|
||
|
||
## Methodology (every pass)
|
||
- Devices: **5554 (QA)**, **5556 (Sam)**, paired; one **fresh throwaway account** for pre-pairing flows.
|
||
- Drive via adb tap/swipe; resolve coords from `uiautomator dump` bounds; downscale screenshots to read;
|
||
scan `logcat` for `FATAL EXCEPTION`/ANR on each screen.
|
||
- Premium toggled via `scratchpad/set_premium.js` (admin, **user-authorized each time**).
|
||
- Theme toggled via **Settings → Appearance (Light/Dark)** (`MainActivity` `ThemeMode`).
|
||
- **REPORT-ONLY during passes — never fix mid-pass.**
|
||
- **THINK AS A CONSUMER — approach everything from different angles.** Beyond "does it work", constantly ask *"is this
|
||
what a real person would expect / want here? is this delightful, confusing, or annoying?"* Come at each flow from
|
||
multiple angles (first-time user, returning user, the partner who didn't start it, someone tapping fast, someone
|
||
reading carefully, the skeptic, the impatient). Vary inputs, depths, orders, and entry points (don't repeat one
|
||
happy path). A thing can be bug-free yet still *worse than it should be* — notice that too.
|
||
- **CAPTURE IMPROVEMENT / FEATURE IDEAS → `Future.md` (section `## QA`).** Bugs (broken/incorrect behavior) go to
|
||
`ClaudeReport.md` as always. But anything that *works yet could be better* — confusing copy, a missing affordance,
|
||
a rough-but-not-broken flow, a "it'd be great if…" feature idea — append it to **`Future.md` under `## QA`** with a
|
||
short title, what prompted it, and the suggested improvement. This is an idea backlog, **not** the bug log; logging
|
||
here is never a substitute for filing an actual defect in `ClaudeReport.md`.
|
||
- **Environment (senior-QA rec):** prefer the **Firebase Local Emulator Suite or a dedicated staging project** over
|
||
production — isolates test data, makes seeding / entitlement toggles / D3 negative tests **free** (no gated prod
|
||
writes), and avoids polluting real users. Caveat: App Check, RevenueCat IAP, and real FCM/APNs push need real
|
||
services — run those against staging/prod with test accounts. (We've been on prod with test accounts — works, but
|
||
every seed/toggle/deploy hits the gate.)
|
||
- **Device/OS matrix:** don't certify on one emulator only — cover **minSdk + targetSdk**, a **small** and a **large**
|
||
screen, and at least one **physical device** (App Check / Play Integrity behave differently on emulators).
|
||
- **Automate the regression smoke:** capture the smoke checklist as a runnable script (adb/Maestro) so every round
|
||
re-checks it cheaply instead of by hand.
|
||
- **Test-data hygiene:** keep known test accounts; clean up artifacts (stray messages/reactions/sessions) between
|
||
rounds so they don't masquerade as bugs.
|
||
|
||
## Continuity & resumability (this effort WILL span many context windows — don't lose state)
|
||
State lives in **files**, not memory:
|
||
- **`ClaudeReport.md`** = the issue log (committed). Each issue row is **self-contained in text** (repro + expected
|
||
+ actual) — screenshots are session-only and won't survive a compaction; never rely on a screenshot path alone.
|
||
- **`ClaudeQACoverage.md`** = the coverage matrix: every screen×mode, feature×premium-state, game×lifecycle,
|
||
notification×{foreground,background,killed}, each `todo | pass | fail(→issue id)`. The resume anchor.
|
||
- **Persistent memory** (`memory/`): QA methodology + exact commands; emulator↔account↔coupleId mapping;
|
||
`scratchpad/set_premium.js` + admin tooling; the couple-shared-premium-everywhere goal + the per-user-gate gap.
|
||
- **Run-state header** pinned at the TOP of `ClaudeReport.md`, always current: `Round N | Pass X | Chunk Y |
|
||
NEXT ACTION: …` — first thing to read, last thing to update before stopping.
|
||
- **Stable issue IDs**: `A-001 / B-002 / C-… / D-… / E-…` (pass-letter + number); coverage references the ID for
|
||
every `fail`. Never renumber or reuse.
|
||
- **Source of truth**: the two MD files are authoritative; the TodoWrite list is scratch for the current chunk only.
|
||
Update the MD files + run-state header *before* ending a session.
|
||
- **Commit cadence**: commit `ClaudeReport.md` + `ClaudeQACoverage.md` after each pass and each chunk.
|
||
- **Chunking**: run small chunks (Pass C one screen-group; Pass A one feature), checkpoint after each.
|
||
- **Session-start ritual**: (1) read run-state header + both MD files; (2) `adb devices` shows **both** emulators
|
||
online; (3) **installed build == current HEAD** (rebuild+reinstall if unsure — never QA a stale APK); (4) continue
|
||
at the first `todo` / unverified-fix.
|
||
|
||
## Batch sizing — sub-batch each pass to ONE context window (Round-1 calibration)
|
||
A pass is a **category**, not a unit of work. Execute each pass as **sub-batches (chunks)**, where a chunk = the
|
||
**largest coherent unit that reliably finishes AND commits within one context window, with margin**. End every chunk
|
||
with a commit + run-state update. If a chunk starts overflowing, split it; if chunks feel trivial, merge them.
|
||
**Why:** in Round 1, A & D fit as single batches, but B/C/E were too large → got cut off → deferred. Sub-batching
|
||
prevents half-done/lost work and gives cleaner per-chunk verification + revertable commits.
|
||
|
||
| Pass | Chunk granularity | ~chunks |
|
||
|---|---|---|
|
||
| A Premium | free-state gating sweep; then couple-shared verify (mostly code + a few live taps) | 1–2 |
|
||
| B Games | **one game per chunk** — full two-device playthrough + edges + commit | 7 |
|
||
| C Visual | **one screen-group per chunk** (both themes, ~6–10 screens, montage-reviewed + nav/back for that group) — never "all screens" at once (heaviest, image-bound) | 6–8 |
|
||
| D Security | D1 at-rest · D2 rules + D3 negative · D4 keys/recovery · D5–D7 appcheck/secrets/leaks/migration | ~4 |
|
||
| E Notifications | **3–5 types per chunk** × {foreground/background/killed} + tap-to-open | ~4 |
|
||
| F Resilience | **one dimension per chunk** (concurrency · lifecycle/process-death · network · time · account-lifecycle) | ~5 |
|
||
|
||
Context-cost tips: prefer **code/admin-read audits** (cheap) before live UI sweeps; **montage** screenshots
|
||
(dark|light pairs) to review many at once; keep one chunk = one TodoWrite focus.
|
||
|
||
## Guardrails & efficiency
|
||
- **Never `pm clear` / wipe app data** — breaks the App Check debug token. Pre-pairing QA: sign-out → fresh sign-up.
|
||
- **Never run `seed/build_db.py`.** Admin seeds/writes, entitlement toggles, and any deploys are **user-authorized per occurrence**.
|
||
- **By-design vs bug:** if a finding may be intended behavior, **log it and keep going** (don't stop to ask; don't unilaterally rewrite deliberate design — the log captures it).
|
||
- **Pass C parallelism:** set **5554 = Dark, 5556 = Light** to capture both themes at once.
|
||
- Never log decrypted message/answer content.
|
||
|
||
## Severity scale (label every issue)
|
||
- **P0 Critical** — crash/ANR, data loss, encryption/security leak, feature fully broken, premium bypass.
|
||
- **P1 Major** — feature partly broken, premium not unlocking for partner, wrong/missing notification, dead-end nav.
|
||
- **P2 Minor** — readability/contrast, clipping/overflow/truncation, theme not adapting, inconsistent styling.
|
||
- **P3 Polish** — spacing/alignment/copy nits.
|
||
|
||
## QA passes (Round 1 = baseline)
|
||
|
||
### Pass A — Couple-shared premium (target: either partner premium → both unlock)
|
||
Test each gated feature in 3 states: **neither** premium → locked + paywall; **partner-only** premium → BOTH unlock;
|
||
**self** premium → unlock. Toggle Sam premium, confirm QA (free) unlocks; toggle off.
|
||
Features: Play-hub games (Desire Sync + any premium-badged), Connection Challenges, Memory Lane; Question Packs;
|
||
Spin the Wheel / Category Picker / Wheel History; Date Match / Plan Date / Date Builder; chat media + reactions
|
||
(regression — already couple-shared); Subscription/Settings reflects entitlement.
|
||
Gated files (for the fix): `ui/play/PlayHubViewModel`, `ui/desiresync/DesireSyncScreen`,
|
||
`ui/wheel/{CategoryPicker,SpinWheel,WheelHistory}*`, `ui/questions/QuestionPackLibrary*`,
|
||
`ui/dates/{DateMatch,DateMatches}Screen`, `ui/memorylane/MemoryLaneScreen`, `ui/challenges/ConnectionChallengesScreen`.
|
||
|
||
### Pass B — Games lifecycle (MANDATORY: play each game ONE complete time through)
|
||
Games: This or That, How Well Do You Know Me, Desire Sync, Connection Challenges, Memory Lane, Spin the Wheel, + Date Match.
|
||
- **PLAY AS THE USER (mandatory mindset for this pass):** drive every game **the way a real user would** — reach it
|
||
through the actual in-app navigation a person would tap (Play hub → the game's card → its buttons), **not** via
|
||
deep-links, admin pokes, forced state, or any shortcut a user doesn't have. **Expect what the user expects:** if a
|
||
tap/button/flow doesn't do the obvious thing, or a screen doesn't behave the way a normal user would assume, **that
|
||
itself is a finding** — log it.
|
||
- **When something doesn't work: REPORT FIRST, then a minimal workaround (in that order).** Do **not** silently
|
||
engineer around breakage by taking extra steps the user wouldn't take. The moment the natural user path fails:
|
||
(1) **log the issue** in `ClaudeReport.md` with severity + the exact user action that failed and what was expected;
|
||
(2) **only then** apply the smallest workaround needed to keep the pass moving. The workaround **never replaces**
|
||
the report — a flow that needs a workaround to proceed is, by definition, broken and must be filed to fix. If a
|
||
workaround is impossible, mark the game `fail→<id>` (blocked) and continue with the next.
|
||
- **A launch/crash check is NOT sufficient. Each game MUST be played one full way through, end-to-end, on BOTH
|
||
devices** — start → answer/interact through **every** step/round/question on each device → reach the
|
||
**finish/reveal/results** screen → confirm the result renders correctly for both partners. Verify each
|
||
intermediate screen and interaction works (selections register, progress advances, both-answered gating,
|
||
reveal/scoring/summary correct). Premium games (Desire Sync, Memory Lane) need a premium toggle to play.
|
||
- The session lifecycle is exercised by the real playthrough: `status` active→completed; reveal/results correct on both.
|
||
- **VARY THE STYLE OF PLAY (don't just repeat the happy path):** across runs, deliberately exercise *different* ways a
|
||
real couple would play each game, because different inputs hit different code paths:
|
||
- **Different DEPTHS and QUESTION COUNTS — cover the matrix, don't settle for one combo:** play each game across
|
||
**every depth/mood** (Light, Everyday, Deep, All-topics/shuffle) AND **every round length / number of questions**
|
||
(5 / 10 / 15), in *different pairings* across runs (e.g. Light×5, Deep×15, Everyday×10, All×5) — short *and* long
|
||
sessions, shallow *and* deep content. Different depths surface different question sets, tones, and edge content
|
||
(e.g. Deep/Desire-Sync sensitive prompts); different counts stress pacing, progress, and the both-answered gate.
|
||
Also exercise **each distinct answer type** (A/B, Yes/No, True/False, 1–5 scale, multi-select, free-text).
|
||
- **Different answer *patterns* that change the result** — all-match vs all-mismatch vs partial; both-yes vs both-no
|
||
vs split (so reveals show "shared", "all private", "0 matches", "perfect/zero score" — verify each renders right).
|
||
- **Different turn orders / who-starts** — partner A starts vs partner B starts; the guesser opens before vs after
|
||
the subject finishes; both open simultaneously (race); one device much slower than the other.
|
||
- **Different exit/resume styles** — finish normally; quit mid-game; background mid-game then resume; cold-kill
|
||
mid-game then reopen; "End their game"; re-open a completed session for the replay/results; play two games
|
||
back-to-back, and a *different* game type immediately after.
|
||
- **Edge inputs** — submit with nothing selected (should be blocked), rapid double-taps on answer/confirm/next,
|
||
spamming the start button, tapping during the reveal animation. None should crash, duplicate, or desync.
|
||
- Edges: re-open a completed session, leave mid-game (resume), no stuck session, no crash, logcat clean.
|
||
- Game start/finish pushes (`onGameSessionUpdate`) exercised here; full delivery/deep-link audit in **Pass E**.
|
||
- **Media permissions** (CAMERA, RECORD_AUDIO): granted works, denied degrades gracefully.
|
||
- **Done = every game has one verified complete playthrough** (a launch-only "opens, no crash" row is `partial`, not `pass`).
|
||
|
||
### Pass C — Visual pass, light + dark, ALL screens
|
||
Every route in `core/navigation/AppRoute.kt` (~50), in **both** modes: text contrast/readability (no invisible/
|
||
low-contrast), no clipping/overflow/ellipsis breakage, icons visible, backgrounds adapt, controls legible. Groups:
|
||
auth/onboarding/pairing (fresh acct); Home (solo + paired); Play + every game; Today + reveal/history; Messages
|
||
(inbox + conversation); Packs; Dates (Match/Builder/Matches/Bucket List); Wheel (picker/session/complete/history);
|
||
Settings + all sub-pages (Account, Notifications, Appearance, Privacy, Subscription, Relationship, Security, Delete
|
||
Account); Paywall; Your Progress/Activity; Recovery.
|
||
- **Probe:** `ui/theme/Theme.kt` hardcoded brand colors + chat's custom `closerBackgroundBrush` — verify dark mode
|
||
truly adapts; grep screens for hardcoded `Color(0x...)`.
|
||
- **States, not just happy path:** empty / loading / error / not-paired where they exist; many need data setup
|
||
(seeding is user-gated) — note unreachable states in coverage rather than skipping silently.
|
||
- **Readability at scale:** default font size + spot-check largest system font scale on text-heavy screens.
|
||
- **Navigation from every entry point:** reach each screen from **all** the places that link to it and confirm it
|
||
opens correctly each time — e.g. a conversation from the inbox AND from "Discuss" AND from a notification; a game
|
||
from the Play hub AND from a notification; Paywall from each gated feature; Settings sub-pages; reveal from Today
|
||
AND from history AND from `partner_answered`. A screen that works from one entry but breaks/duplicates from another = bug.
|
||
- **TAKE EVERY AVENUE (exhaustive nav fuzzing — actively hunt for nav bugs, don't just walk the happy path):** treat
|
||
navigation as something to *break*. On every screen, **tap every interactive element** — each button, card, row,
|
||
icon, chip, link, tab, header back-arrow, system back, and any "see all / history / edit / manage" affordance — and
|
||
follow where it goes. Then try the *combinations and sequences* a curious user hits:
|
||
- **Every order:** switch bottom tabs in many orders, mid-flow (open a game, jump to Messages, come back); enter a
|
||
deep screen then tab away then back; open A→B→C then back-back-back.
|
||
- **Rapid / repeated input:** double- and triple-tap navigation targets (especially "open game", "Play now",
|
||
"Create/Start session", notification taps) to surface double-push/duplicate-screen/stale-route bugs (cf. B-004).
|
||
- **Interrupt mid-navigation:** background/rotate/lock during a transition; tap a notification while already on that
|
||
screen, on a different screen, and while logged-out/unpaired; cold-start straight onto a deep link.
|
||
- **Dead-ends & traps:** from *every* screen confirm there's always a way out (back/close/home) — no screen that
|
||
strands the user, needs two backs, exits the app unexpectedly, loops, or lands blank. Re-check the asymmetric-game
|
||
waiting screens, replay/results screens, and paywall specifically.
|
||
- Log **every** wrong/duplicate/dead destination with the exact tap sequence to reproduce. Wrong/double-back or
|
||
dead-end = **P2** (P1 if it traps the user or loses their progress).
|
||
- **Back-stack / "double back":** from every entry point, **system back AND the in-app back arrow** return to the
|
||
correct previous screen — no dead-ends, no exiting the app unexpectedly, and **no screen that requires pressing
|
||
back twice** (duplicate/stacked destinations on the back stack = bug). Bottom-tab reselection and deep-link/
|
||
notification entries must land with a sane back stack (back → Home, not off the app or a blank screen). Wrong/
|
||
double back or a dead-end = **P2** (P1 if it traps the user).
|
||
- **D1 At-rest coverage:** admin-read RAW docs/objects, assert ciphertext for every private type — chat text +
|
||
`lastMessagePreview` (`enc:v1:`), chat media bytes (Tink `01 69 59 51 f0…`), answers (`sealed:v1:`/`enc:v1:`),
|
||
date plans + `date_swipes`, Memory Lane capsules, Bucket List. Also: **wrappedCoupleKey** + recovery material never
|
||
plaintext; **invite code (KDF seed) never stored raw**; **no push payload carries private content**.
|
||
- **D2 Rules audit (static):** member-only reads, author/server-only writes, ciphertext enforced on every private
|
||
field, immutability, **no premium self-grant**, entitlements write:false; re-audit conversations/typing/reactions
|
||
+ entitlement partner-read; **no catch-all** `match /{document=**}`; list/query not enumerable; `get()`-rules don't
|
||
over-expose; **no legacy plaintext/downgrade path** (`coupleEncryptionEnabled` holds; no disabled-encryption branch).
|
||
- **D3 Negative access tests:** a **non-member** account is *denied* reading messages/answers/dates/entitlements,
|
||
writing plaintext to encrypted fields, self-granting premium, cross-couple access (live rules or rules-emulator).
|
||
- **D4 Key exchange / management / recovery (E2EE crux):** couple key client-generated, only leaves device **wrapped**
|
||
(KDF from invite seed; server holds only `wrappedCoupleKey`+`kdfSalt`/`kdfParams`+`encryptedRecoveryPhrase`); **KDF
|
||
strength**; Tink AEAD = AES-GCM/256 with **AAD=coupleId**, no weak/custom crypto/nonce reuse; keybox/sealed/commitment
|
||
integrity; **recovery-wrap server-blind**; **unpair revokes decrypt**; invites CSPRNG + single-use + expiry.
|
||
- **D5 App Check / Functions / secrets:** App Check enforced; callables validate auth+membership; webhook authenticity;
|
||
admin-only writes rejected from clients; service-account JSONs never committed; no plaintext/secrets in logcat; temp
|
||
files deleted.
|
||
- **D6 Leak vectors:** no private content in analytics/crash; `allowBackup=false` + backup rules exclude sensitive data;
|
||
deep links re-check membership; clipboard user-initiated; consider `FLAG_SECURE`; repo scan for committed secrets.
|
||
- **D7 Encryption migration:** test the `encryptionVersion` paths (0 plaintext → 1 migrating → 2 strict) on a legacy
|
||
couple — migration completes without exposing plaintext or losing/garbling old content, and a half-migrated couple
|
||
is safe (no mixed read failures, no downgrade). This is the riskiest data path for existing users.
|
||
|
||
### Pass G — Account creation, validation & fake-account abuse (MANDATORY — both the happy path AND the attacks)
|
||
Cover **every account-creation avenue a real user takes** and **every fake/abusive creation attempt an attacker would
|
||
try.** Use throwaway test accounts (sign-out → fresh sign-up; never `pm clear`). Report-first like every pass.
|
||
- **Real creation flows (happy path + validation):** sign-up (email/password and any social/anonymous path), profile
|
||
creation, and pairing — both **create-invite** and **accept-invite** sides. Verify field validation (invalid/empty
|
||
email, weak/short password, mismatched confirm, name length/emoji/unicode), the **error copy is friendly** (no raw
|
||
SDK/Firebase error leaking — cf. A-OBS), loading/disabled states, and that a brand-new unpaired account lands on the
|
||
correct "create or accept invite" home (not a broken/blank or paired view).
|
||
- **Duplicate / conflicting creation:** sign up with an **already-registered email** (clear "already in use", no crash,
|
||
offer sign-in); create a second account while one is signed in; re-run onboarding after completing it; accept an
|
||
invite while **already paired** (must be rejected cleanly); two devices accepting the **same invite** (single-use —
|
||
the second must fail gracefully).
|
||
- **Fake / malicious creation attempts (security — expect DENY, never crash or leak):** create an account that is
|
||
**NOT a member** of the test couple and attempt every cross-couple action (read messages/answers/dates/entitlements,
|
||
write to the couple, self-grant `premium`/`hasPremium`, join/hijack pairing with a guessed/expired/reused invite
|
||
code) — all must be **denied by rules** (this is the live execution of **D3**). Probe **invite-code abuse**: replay a
|
||
used code, use an expired code, brute-force/guess attempts (CSPRNG entropy + single-use + expiry must hold). Probe
|
||
**App Check**: a request without a valid token is rejected. Confirm a malformed/forged sign-up can't bypass profile
|
||
or membership requirements. **Any successful unauthorized create/read/write = P0.**
|
||
- **Account lifecycle around creation:** sign-out → sign-in (state restores, no stale couple); **delete account** then
|
||
re-create with the same email (clean slate, partner notified/unpaired); an unpaired/just-created account tapping a
|
||
stale notification or deep link is handled gracefully (no crash, sane landing).
|
||
- **Done = every creation avenue exercised** (happy + duplicate + malicious) with each attack **denied** and each happy
|
||
path validated end-to-end; findings filed with exact repro.
|
||
|
||
### Pass E — Notifications (every type delivers, deep-links, leaks nothing)
|
||
For each: trigger fires → delivered to the **right partner (never self)** → in **foreground/background/killed** →
|
||
correct channel + copy with **no private content** → **tap opens exactly the right item** (loaded, not generic Home/
|
||
dead-end) → no duplicates → rate limiter (20/day,100/week) doesn't drop legit ones.
|
||
Inventory (type → trigger → destination), all 17: `chat_message`(onMessageWritten→conversation, foreground→chat-head
|
||
bubble), `partner_started_game`/`partner_finished_game`(onGameSessionUpdate→game/results), `partner_answered`
|
||
(onAnswerWritten→reveal), `daily_question`(assignDailyQuestion)/`daily_question_reminder`/`daily_reminder`
|
||
(dailyQuestionReminder→Today), `date_match`(createDateMatch→match), `partner_joined`+`invite_created`
|
||
(acceptInviteCallable→pairing/home), `partner_left`(onCoupleLeave)/`partner_deleted_account`(onUserDelete→home/
|
||
relationship settings), `memory_capsule_unlocked`(scheduled→capsule), `challenge_day_ready`(→Connection Challenges),
|
||
`outcome_reminder`(scheduledOutcomesReminder), `reengagement`(reengagement/gameRetention), `gentle_reminder`
|
||
(sendGentleReminderCallable), `spki`(identify + confirm handled).
|
||
- **Tap-to-open:** every notification opens the **specific item** from foreground/background/killed; tapping in-app
|
||
doesn't stack/duplicate; logged-out/unpaired tap is graceful. Wrong/dead destination = P1.
|
||
- **Scheduled/time-based:** trigger manually (invoke callable/function or seed due condition — user-gated).
|
||
- **Foundations:** FCM token registration on sign-in (`TokenRegistrar`) + `onNewToken`; POST_NOTIFICATIONS prompt +
|
||
denied path; channels (`di/NotificationModule`); deep-link routing (`MainActivity.deepLinkRouteFromIntent` →
|
||
`AppNavigation`); foreground/background split (`core/notifications/AppMessagingService`).
|
||
- Build a delivery matrix (type × {foreground,background,killed}) in ClaudeQACoverage.md. Missed delivery or wrong
|
||
deep-link = P1; private content in any payload = P0.
|
||
|
||
### Pass F — Resilience, concurrency, lifecycle & time (cross-cutting; a 2-user realtime app needs these)
|
||
- **Concurrency / realtime races (two partners at once):** both answer the daily question simultaneously; both start
|
||
a game / swipe a date / react at the same time; partner acts while you're mid-flow. No lost writes, no stuck state,
|
||
no duplicate sessions, reveal still correct. (This is where a couples app breaks.)
|
||
- **Lifecycle / process death:** background mid-flow + return; force-kill the app and relaunch (Android may kill the
|
||
process) — state/auth/draft restore sanely; deep-link/notification after process death still loads (verified for
|
||
chat — extend to all). Rotation/config-change doesn't lose Compose state. Low-memory.
|
||
- **Network resilience:** offline / flaky / airplane mid-action across answers, games, dates (not just chat media) —
|
||
graceful failure + retry/queue, no crash, no silent data loss, recovery on reconnect.
|
||
- **Idempotency / rapid input:** double-tap send/submit, rapid nav, double-start — guarded (no double-send, no crash).
|
||
- **Time-dependent behavior:** daily-question rollover (6 PM CST assignment), streak day-boundary + repair window,
|
||
capsule unlock times, reminder schedules — test across a date change (manipulate device clock / trigger functions).
|
||
- **Account/couple lifecycle:** brand-new (empty) account; unpaired state; pair → unpair → re-pair; partner leaves
|
||
mid-session; account deletion cascade; same account on two devices. No orphaned/broken state.
|
||
- **Crash reporting:** confirm crashes/ANRs are actually captured (Crashlytics) so field issues surface.
|
||
|
||
### Pass H — Branding & artwork (every screen: could it carry more of the brand? where would art help?)
|
||
A consumer-mindset pass focused on **brand presence and delight**, not defects. Walk **every screen and surface** and
|
||
ask: *does this feel like Closer (private, warm, equal, intentional — a ritual for two)? Could brand color, the heart
|
||
mark, a brand message, or an illustration make it warmer or clearer without clutter?* Output is **artwork descriptions
|
||
written as ready-to-paste ChatGPT image-generation prompts** — the user generates the images; we only describe them.
|
||
- **First, lock the house style (do this once per round, refresh if the art evolved):** read `docs/brand/visual-identity.md`
|
||
+ `docs/brand/asset-system.md` AND open 2–3 existing illustrations (`illustration_couple_onboarding`,
|
||
`illustration_reveal_celebration`, `pack_art_*`) to capture the *actual* look. New screens/features since the last
|
||
brand review must be folded in. Keep the canonical **house-style prompt prefix** + palette in the branding deliverable
|
||
(`ClaudeBrandingReview.md`) so every prompt reuses it and **all generated art matches the existing artwork.**
|
||
- **House style (must hold for every prompt):** flat 2D pastel vector illustration; soft rounded shapes, no harsh
|
||
outlines, gentle gradients; palette aubergine `#24122F` / deep purple `#56306F` / lavender `#B98AF4` / soft pink
|
||
`#F7C8E4` / soft lavender `#D9B8FF` / blush white `#FFF8FC`; motifs = two-equal-halves heart, paired/sealed cards,
|
||
floating hearts + petals, candle/mug/lavender-sprig warmth, moon/quiet-hours, calendar/date-card, capsule; mood =
|
||
warm, quiet, equal, intentional. Couple figures balanced + inclusive, faces simple. **Never** show readable answer/
|
||
prompt/message text, invite codes, emails, dating-app clichés, stock photos, alarm/urgency/surveillance imagery.
|
||
- **Per screen, decide the brand opportunity** (pick the lightest that fits — don't over-decorate):
|
||
- none needed (already on-brand, or a dense list/form where art would clutter) — say so;
|
||
- **color/typographic** brand touch (palette, heart mark, a rotating privacy message);
|
||
- **small glyph** (brand glyph for a relationship concept — describe it for the glyph set);
|
||
- **hero/empty-state/celebration illustration** (the high-value case → write the full ChatGPT prompt).
|
||
- **Each artwork item records:** screen/route · placement (hero / empty / header / card / celebration) · why it helps ·
|
||
filename to match the existing scheme (`illustration_*`, `pack_art_*`, `glyph_*`, `particle_*`) · **the ChatGPT
|
||
prompt** (house-style prefix + the specific scene) · aspect ratio/size + light/dark behavior. Cross-check the
|
||
brand doc's "Needed additions" / empty-state list and **mark which already have assets vs still need art** (e.g.
|
||
Android may still lack illustrations that iOS has).
|
||
- **Prioritize** the screens a user feels most: onboarding/pairing, Home, paywall/subscription, reveal/celebration,
|
||
empty states (no messages/dates/capsules/history), Memory Lane, Connection Challenges, date match, quiet-hours.
|
||
- Branding *defects* (mis-colored, clipped, off-brand, low-contrast art) are bugs → `ClaudeReport.md`. Pure
|
||
"works but could be warmer / a feature idea" → `Future.md` `## QA`. New art to create → `ClaudeBrandingReview.md`.
|
||
|
||
## Reporting → ClaudeReport.md (living QA report)
|
||
- Header: date, build, devices, round number + run-state header.
|
||
- One section per pass (A/B/C/D/E/F), each a table: **ID | Area | Screen/Route | Mode | Severity | Description | Repro
|
||
| Evidence | Suggested fix | Status**.
|
||
- Summary: counts by severity. Report only during passes — no fixes recorded until the fix phase.
|
||
|
||
## Fix phase (only AFTER all passes of the round complete)
|
||
- Work strictly by severity: **all P0 → P1 → P2 → P3**.
|
||
- **One issue at a time**: implement → `./gradlew :app:assembleDebug` → install both → verify THAT fix live (correct
|
||
device/theme) + regression smoke (launch/no-crash, send text, inbox loads, a game opens, **content still ciphertext
|
||
in Firestore**) → flip its row to **Fixed** + **commit** (one per issue/cluster) → next. Don't start the next until
|
||
the current is verified.
|
||
- **Couple-shared premium fix**: replace direct `isPremium()` gates with
|
||
`CouplePremiumChecker.coupleHasPremium(partnerId)` in every gated VM/screen (partner-entitlement read rule deployed).
|
||
**High regression risk** — re-verify each feature in BOTH self-premium and free states.
|
||
- Gated actions (entitlement toggles, deploys) are **user-authorized per occurrence**.
|
||
- **New issues found while fixing** are logged (new ID), not silently fixed beyond scope — next re-QA round catches them.
|
||
|
||
**Definition of done:** a **pass** is done when every coverage row is `pass`/`fail→id`; a **round** is done when all
|
||
five passes are done; **flawless** = one full round with **zero open P0–P2 and Passes D + E fully clean**. Then stop
|
||
(P3s optional). Don't re-open a clean pass within the same round.
|
||
|
||
## Re-QA loop (until flawless)
|
||
After the fix phase, re-run Pass A/B/C/D/E/F (regression + confirm fixes). Repeat **fix → re-QA** rounds until a full
|
||
round yields zero P0–P2 and Passes D+E fully clean.
|