Claude QA Playbook — Full-App QA → Fix → Re-QA until flawless
Reusable QA plan for the Closer app. Run report-only first, fix everything, then re-QA until a clean round. Progress/state is tracked in ClaudeReport.md (issues) + ClaudeQACoverage.md (coverage matrix), which are the authoritative source of truth. See the Continuity section before resuming.
Context
Drive the real app on both emulators, verify each thing live, report, fix, re-verify. Five QA dimensions:
- Couple-shared premium — if EITHER partner is premium, all premium features unlock for both.
- Games — each starts, plays, finishes correctly on both devices.
- Full visual pass, light + dark — every screen, text readable, nothing clipped/invisible.
- Security & encryption (cornerstone) — every private field is ciphertext at rest, rules hold against non-members, keys/recovery are sound. Findings here default to P0.
- Notifications — all 17 types deliver to the right partner (foreground/background/killed), deep-link correctly, and leak no private content.
Scope decisions: exhaustive visual pass (all ~50 screens, both modes); full scope incl. pre-pairing flows
(fresh throwaway account); couple-shared everywhere — per-user gates are bugs, fixed by routing through
core/billing/CouplePremiumChecker.kt.
Early known signal: only chat uses CouplePremiumChecker; games/packs/dates/wheel gate on the user's own
EntitlementChecker.isPremium() — so premium almost certainly does NOT unlock for the free partner there. Pass A
confirms + enumerates this; the fix phase applies couple-shared everywhere.
Methodology (every pass)
- Devices: 5554 (QA), 5556 (Sam), paired; one fresh throwaway account for pre-pairing flows.
- Drive via adb tap/swipe; resolve coords from uiautomator dump bounds; downscale screenshots to read; scan logcat for FATAL EXCEPTION/ANR on each screen.
- Premium toggled via scratchpad/set_premium.js (admin, user-authorized each time).
- Theme toggled via Settings → Appearance (Light/Dark) (MainActivity ThemeMode).
- REPORT-ONLY during passes: never fix mid-pass.
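The coordinate lookup and the per-screen crash scan can be scripted. The following is a minimal sketch, not the project's actual tooling: it assumes the hierarchy XML from a uiautomator dump has already been pulled to the host as a string, and that ANR entries in logcat contain the text "ANR in". The function names and sample data are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# uiautomator writes bounds as "[left,top][right,bottom]".
BOUNDS_RE = re.compile(r"\[(\d+),(\d+)\]\[(\d+),(\d+)\]")


def tap_center(dump_xml: str, text: str):
    """Return the (x, y) center of the first node whose text matches, or None."""
    root = ET.fromstring(dump_xml)
    for node in root.iter("node"):
        if node.get("text") == text:
            match = BOUNDS_RE.fullmatch(node.get("bounds", ""))
            if match:
                x1, y1, x2, y2 = map(int, match.groups())
                return ((x1 + x2) // 2, (y1 + y2) // 2)
    return None


def crash_lines(logcat_text: str):
    """Return logcat lines that mention a fatal exception or an ANR."""
    markers = ("FATAL EXCEPTION", "ANR in")  # "ANR in" is an assumed marker
    return [ln for ln in logcat_text.splitlines() if any(m in ln for m in markers)]


# Hypothetical sample data, for illustration only.
SAMPLE_DUMP = (
    '<hierarchy rotation="0">'
    '<node text="" bounds="[0,0][1080,2400]">'
    '<node text="Appearance" bounds="[40,600][1040,720]" />'
    "</node></hierarchy>"
)
SAMPLE_LOG = "I/Closer: screen shown\nE/AndroidRuntime: FATAL EXCEPTION: main\n"

print(tap_center(SAMPLE_DUMP, "Appearance"))
print(crash_lines(SAMPLE_LOG))
```

The returned center is what would be passed to an adb tap on the matching device; an empty list from the scan means the screen's logcat is clean.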
Continuity & resumability (this effort WILL span many context windows — don't lose state)
State lives in files, not memory:
- ClaudeReport.md = the issue log (committed). Each issue row is self-contained in text (repro + expected + actual); screenshots are session-only and won't survive a compaction, so never rely on a screenshot path alone.
- ClaudeQACoverage.md = the coverage matrix: every screen×mode, feature×premium-state, game×lifecycle, notification×{foreground,background,killed}, each todo | pass | fail(→issue id). The resume anchor.
- Persistent memory (memory/): QA methodology + exact commands; emulator↔account↔coupleId mapping; scratchpad/set_premium.js + admin tooling; the couple-shared-premium-everywhere goal + the per-user-gate gap.
- Run-state header pinned at the TOP of ClaudeReport.md, always current: Round N | Pass X | Chunk Y | NEXT ACTION: … (first thing to read, last thing to update before stopping).
- Stable issue IDs: A-001 / B-002 / C-… / D-… / E-… (pass-letter + number); coverage references the ID for every fail. Never renumber or reuse.
- Source of truth: the two MD files are authoritative; the TodoWrite list is scratch for the current chunk only. Update the MD files + run-state header before ending a session.
- Commit cadence: commit ClaudeReport.md + ClaudeQACoverage.md after each pass and each chunk.
- Chunking: run small chunks (Pass C one screen-group; Pass A one feature), checkpoint after each.
- Session-start ritual: (1) read run-state header + both MD files; (2) adb devices shows both emulators online; (3) installed build == current HEAD (rebuild + reinstall if unsure; never QA a stale APK); (4) continue at the first todo / unverified-fix.
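Because the run-state header is the first thing read and the last thing written, it helps if it round-trips mechanically. The sketch below parses and re-renders a header of the form Round N | Pass X | Chunk Y | NEXT ACTION: …; the RunState class and the exact regular expression are illustrative assumptions, not part of the playbook.

```python
import re
from dataclasses import dataclass

# Matches "Round N | Pass X | Chunk Y | NEXT ACTION: ..." anywhere in a line.
HEADER_RE = re.compile(
    r"Round\s+(?P<round>\d+)\s*\|\s*Pass\s+(?P<pass_>[A-E])\s*\|\s*"
    r"Chunk\s+(?P<chunk>[^|]+?)\s*\|\s*NEXT ACTION:\s*(?P<next_action>.+)"
)


@dataclass
class RunState:
    round: int
    pass_: str
    chunk: str
    next_action: str

    def render(self) -> str:
        """Render the header line exactly as it is pinned in the report."""
        return (
            f"Round {self.round} | Pass {self.pass_} | "
            f"Chunk {self.chunk} | NEXT ACTION: {self.next_action}"
        )


def parse_run_state(report_text: str) -> RunState:
    """Return the first run-state header found in the report text."""
    for line in report_text.splitlines():
        m = HEADER_RE.search(line)
        if m:
            return RunState(
                int(m["round"]), m["pass_"], m["chunk"], m["next_action"].strip()
            )
    raise ValueError("no run-state header found")


# Hypothetical report excerpt, for illustration only.
SAMPLE_REPORT = (
    "ClaudeReport\n"
    "Round 2 | Pass C | Chunk Settings | NEXT ACTION: resume at Appearance (dark)\n"
)
state = parse_run_state(SAMPLE_REPORT)
print(state.render())
```

Raising when the header is missing is deliberate: a session that cannot find its resume point should stop rather than guess.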
Guardrails & efficiency
- Never pm clear / wipe app data: it breaks the App Check debug token. Pre-pairing QA: sign-out → fresh sign-up.
- Never run seed/build_db.py. Admin seeds/writes, entitlement toggles, and any deploys are user-authorized per occurrence.
- By-design vs bug: if a finding may be intended behavior, log it and confirm with the user before changing it.
- Pass C parallelism: set 5554 = Dark, 5556 = Light to capture both themes at once.
- Never log decrypted message/answer content.
Severity scale (label every issue)
- P0 Critical — crash/ANR, data loss, encryption/security leak, feature fully broken, premium bypass.
- P1 Major — feature partly broken, premium not unlocking for partner, wrong/missing notification, dead-end nav.
- P2 Minor — readability/contrast, clipping/overflow/truncation, theme not adapting, inconsistent styling.
- P3 Polish — spacing/alignment/copy nits.
QA passes (Round 1 = baseline)
Pass A — Couple-shared premium (target: either partner premium → both unlock)
Test each gated feature in 3 states: neither premium → locked + paywall; partner-only premium → BOTH unlock;
self premium → unlock. Toggle Sam premium, confirm QA (free) unlocks; toggle off.
Features: Play-hub games (Desire Sync + any premium-badged), Connection Challenges, Memory Lane; Question Packs;
Spin the Wheel / Category Picker / Wheel History; Date Match / Plan Date / Date Builder; chat media + reactions
(regression — already couple-shared); Subscription/Settings reflects entitlement.
Gated files (for the fix): ui/play/PlayHubViewModel, ui/desiresync/DesireSyncScreen,
ui/wheel/{CategoryPicker,SpinWheel,WheelHistory}*, ui/questions/QuestionPackLibrary*,
ui/dates/{DateMatch,DateMatches}Screen, ui/memorylane/MemoryLaneScreen, ui/challenges/ConnectionChallengesScreen.
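The three states tested in Pass A amount to a small truth table. The sketch below models it in Python purely for illustration (the app itself is Kotlin); couple_has_premium and per_user_gate are hypothetical stand-ins for the couple-shared rule and the per-user check described above, not the real CouplePremiumChecker or EntitlementChecker APIs.

```python
def couple_has_premium(self_premium: bool, partner_premium: bool) -> bool:
    """Target rule: either partner being premium unlocks the feature for both."""
    return self_premium or partner_premium


def per_user_gate(self_premium: bool, partner_premium: bool) -> bool:
    """The suspected bug: only the user's own entitlement is consulted."""
    return self_premium


# (self_premium, partner_premium) for the three Pass A states.
STATES = {
    "neither": (False, False),
    "partner-only": (False, True),
    "self": (True, False),
}


def gate_mismatches(gate):
    """Return the state names where a gate disagrees with the couple-shared rule."""
    return [
        name
        for name, (mine, theirs) in STATES.items()
        if gate(mine, theirs) != couple_has_premium(mine, theirs)
    ]


print(gate_mismatches(per_user_gate))
```

Under this model only the partner-only state distinguishes the two gates, which is why that state is the one to watch on the free device.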
Pass B — Games lifecycle (MANDATORY: play each game ONE complete time through)
Games: This or That, How Well Do You Know Me, Desire Sync, Connection Challenges, Memory Lane, Spin the Wheel, + Date Match.
- A launch/crash check is NOT sufficient. Each game MUST be played one full way through, end-to-end, on BOTH devices — start → answer/interact through every step/round/question on each device → reach the finish/reveal/results screen → confirm the result renders correctly for both partners. Verify each intermediate screen and interaction works (selections register, progress advances, both-answered gating, reveal/scoring/summary correct). Premium games (Desire Sync, Memory Lane) need a premium toggle to play.
- The session lifecycle is exercised by the real playthrough: status active → completed; reveal/results correct on both.
- Edges: re-open a completed session, leave mid-game (resume), no stuck session, no crash, logcat clean.
- Game start/finish pushes (onGameSessionUpdate) exercised here; full delivery/deep-link audit in Pass E.
- Media permissions (CAMERA, RECORD_AUDIO): granted works, denied degrades gracefully.
- Done = every game has one verified complete playthrough (a launch-only "opens, no crash" row is partial, not pass).
Pass C — Visual pass, light + dark, ALL screens
Every route in core/navigation/AppRoute.kt (~50), in both modes: text contrast/readability (no invisible/
low-contrast), no clipping/overflow/ellipsis breakage, icons visible, backgrounds adapt, controls legible. Groups:
auth/onboarding/pairing (fresh acct); Home (solo + paired); Play + every game; Today + reveal/history; Messages
(inbox + conversation); Packs; Dates (Match/Builder/Matches/Bucket List); Wheel (picker/session/complete/history);
Settings + all sub-pages (Account, Notifications, Appearance, Privacy, Subscription, Relationship, Security, Delete
Account); Paywall; Your Progress/Activity; Recovery.
- Probe: ui/theme/Theme.kt hardcoded brand colors + chat's custom closerBackgroundBrush: verify dark mode truly adapts; grep screens for hardcoded Color(0x...).
- States, not just happy path: empty / loading / error / not-paired where they exist; many need data setup (seeding is user-gated); note unreachable states in coverage rather than skipping silently.
- Readability at scale: default font size + spot-check largest system font scale on text-heavy screens.
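The grep for hardcoded colors can be made repeatable with a small scanner. This is a sketch under assumptions: it looks only for literal Color(0x…) calls in .kt files, and the function names and sample source are hypothetical. Colors defined in the theme file are expected hits; hits inside individual screens are the ones worth checking in dark mode.

```python
import re
from pathlib import Path

# Literal hex colors such as Color(0xFF7A3CFF), with an optional L suffix.
COLOR_LITERAL = re.compile(r"Color\(\s*0x[0-9A-Fa-f]+L?\s*\)")


def scan_source(source: str):
    """Return (line_number, stripped_line) for each line with a hex color literal."""
    return [
        (number, line.strip())
        for number, line in enumerate(source.splitlines(), start=1)
        if COLOR_LITERAL.search(line)
    ]


def scan_tree(root: Path):
    """Scan every .kt file under root; returns (relative_path, line_number, line)."""
    hits = []
    for path in sorted(root.rglob("*.kt")):
        text = path.read_text(encoding="utf-8")
        for number, line in scan_source(text):
            hits.append((str(path.relative_to(root)), number, line))
    return hits


# Hypothetical screen source, for illustration only.
SAMPLE_SOURCE = "\n".join(
    [
        "val Brand = Color(0xFF7A3CFF)",
        "val surface = MaterialTheme.colorScheme.surface",
        "Box(Modifier.background(Color(0x11000000)))",
    ]
)
print(scan_source(SAMPLE_SOURCE))
```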
Pass D — Security & Encryption audit (cornerstone; findings default to P0)
- D1 At-rest coverage: admin-read RAW docs/objects, assert ciphertext for every private type: chat text + lastMessagePreview (enc:v1:), chat media bytes (Tink 01 69 59 51 f0…), answers (sealed:v1:/enc:v1:), date plans + date_swipes, Memory Lane capsules, Bucket List. Also: wrappedCoupleKey + recovery material never plaintext; invite code (KDF seed) never stored raw; no push payload carries private content.
- D2 Rules audit (static): member-only reads, author/server-only writes, ciphertext enforced on every private field, immutability, no premium self-grant, entitlements write:false; re-audit conversations/typing/reactions + entitlement partner-read; no catch-all match /{document=**}; list/query not enumerable; get()-rules don't over-expose; no legacy plaintext/downgrade path (coupleEncryptionEnabled holds; no disabled-encryption branch).
- D3 Negative access tests: a non-member account is denied reading messages/answers/dates/entitlements, writing plaintext to encrypted fields, self-granting premium, cross-couple access (live rules or rules-emulator).
- D4 Key exchange / management / recovery (E2EE crux): couple key client-generated, only leaves device wrapped (KDF from invite seed; server holds only wrappedCoupleKey + kdfSalt/kdfParams + encryptedRecoveryPhrase); KDF strength; Tink AEAD = AES-GCM/256 with AAD=coupleId, no weak/custom crypto/nonce reuse; keybox/sealed/commitment integrity; recovery-wrap server-blind; unpair revokes decrypt; invites CSPRNG + single-use + expiry.
- D5 App Check / Functions / secrets: App Check enforced; callables validate auth+membership; webhook authenticity; admin-only writes rejected from clients; service-account JSONs never committed; no plaintext/secrets in logcat; temp files deleted.
- D6 Leak vectors: no private content in analytics/crash; allowBackup=false + backup rules exclude sensitive data; deep links re-check membership; clipboard user-initiated; consider FLAG_SECURE; repo scan for committed secrets.
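The D1 at-rest assertion can be expressed as a check over raw documents that have already been admin-read into plain dictionaries. The sketch below is an assumption-laden illustration: the prefixes come from the list above, the field name text is hypothetical, and the media check relies only on the standard Tink output prefix (a 0x01 version byte followed by a 4-byte key ID), which matches the hex shown in D1 but proves nothing about the payload beyond its header.

```python
CIPHERTEXT_PREFIXES = ("enc:v1:", "sealed:v1:")


def plaintext_fields(doc: dict, private_fields):
    """Return private fields present in doc whose value lacks a ciphertext prefix."""
    leaks = []
    for name in private_fields:
        if name not in doc:
            continue
        value = doc[name]
        if not (isinstance(value, str) and value.startswith(CIPHERTEXT_PREFIXES)):
            leaks.append(name)
    return leaks


def looks_like_tink(blob: bytes) -> bool:
    """Header-only check: Tink-prefixed ciphertext starts with 0x01 + 4-byte key ID."""
    return len(blob) > 5 and blob[0] == 0x01


# Hypothetical raw message document, for illustration only.
SAMPLE_DOC = {
    "text": "enc:v1:AAAA",
    "lastMessagePreview": "see you at 8",  # plaintext: should be reported
    "senderId": "u1",
}
print(plaintext_fields(SAMPLE_DOC, ["text", "lastMessagePreview"]))
```

Any non-empty result is a D1 finding. The check prints field names only, never values, in keeping with the guardrail against logging private content.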
Pass E — Notifications (every type delivers, deep-links, leaks nothing)
For each: trigger fires → delivered to the right partner (never self) → in foreground/background/killed →
correct channel + copy with no private content → tap opens exactly the right item (loaded, not generic Home/
dead-end) → no duplicates → rate limiter (20/day,100/week) doesn't drop legit ones.
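To judge whether a test run is likely to trip the rate limiter, it can help to replay the planned send times against a model of it. The sketch below assumes rolling 24-hour and 7-day windows over accepted sends for a single recipient; the source states only the limits (20/day, 100/week), so the window semantics and the function name are assumptions.

```python
from datetime import datetime, timedelta

DAILY_LIMIT = 20
WEEKLY_LIMIT = 100


def dropped_sends(timestamps):
    """Return the timestamps an assumed rolling-window limiter would drop."""
    accepted, dropped = [], []
    for ts in sorted(timestamps):
        # Count previously accepted sends inside each rolling window.
        in_day = sum(1 for a in accepted if ts - a < timedelta(days=1))
        in_week = sum(1 for a in accepted if ts - a < timedelta(days=7))
        if in_day >= DAILY_LIMIT or in_week >= WEEKLY_LIMIT:
            dropped.append(ts)
        else:
            accepted.append(ts)
    return dropped


# Hypothetical test plan: 21 notifications one minute apart.
BASE = datetime(2024, 1, 1, 9, 0)
BURST = [BASE + timedelta(minutes=i) for i in range(21)]
print(len(dropped_sends(BURST)))
```

If a planned chunk would exceed the model's limits, spreading triggers across sessions avoids mistaking a limiter drop for a delivery bug.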
Inventory (type → trigger → destination), all 17: chat_message (onMessageWritten → conversation, foreground → chat-head
bubble), partner_started_game / partner_finished_game (onGameSessionUpdate → game/results), partner_answered
(onAnswerWritten → reveal), daily_question (assignDailyQuestion) / daily_question_reminder / daily_reminder
(dailyQuestionReminder → Today), date_match (createDateMatch → match), partner_joined + invite_created
(acceptInviteCallable → pairing/home), partner_left (onCoupleLeave) / partner_deleted_account (onUserDelete → home/
relationship settings), memory_capsule_unlocked (scheduled → capsule), challenge_day_ready (→ Connection Challenges),
outcome_reminder (scheduledOutcomesReminder), reengagement (reengagement/gameRetention), gentle_reminder
(sendGentleReminderCallable), spki (identify + confirm handled).
- Tap-to-open: every notification opens the specific item from foreground/background/killed; tapping in-app doesn't stack/duplicate; logged-out/unpaired tap is graceful. Wrong/dead destination = P1.
- Scheduled/time-based: trigger manually (invoke callable/function or seed due condition — user-gated).
- Foundations: FCM token registration on sign-in (TokenRegistrar) + onNewToken; POST_NOTIFICATIONS prompt + denied path; channels (di/NotificationModule); deep-link routing (MainActivity.deepLinkRouteFromIntent → AppNavigation); foreground/background split (core/notifications/AppMessagingService).
- Build a delivery matrix (type × {foreground,background,killed}) in ClaudeQACoverage.md. Missed delivery or wrong deep-link = P1; private content in any payload = P0.
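The delivery matrix can be generated rather than typed by hand, so no cell is forgotten. The sketch below seeds every type × state cell as todo and renders pipe-separated rows. The type names are taken from the inventory above (spki is left out until it is identified); the fail:ID cell format and the helper names are illustrative choices.

```python
from itertools import product

TYPES = [
    "chat_message", "partner_started_game", "partner_finished_game",
    "partner_answered", "daily_question", "daily_question_reminder",
    "daily_reminder", "date_match", "partner_joined", "invite_created",
    "partner_left", "partner_deleted_account", "memory_capsule_unlocked",
    "challenge_day_ready", "outcome_reminder", "reengagement", "gentle_reminder",
]
STATES = ["foreground", "background", "killed"]


def delivery_matrix():
    """Every (type, app state) cell starts as todo."""
    return {(t, s): "todo" for t, s in product(TYPES, STATES)}


def record(matrix, ntype, state, result):
    """Set a cell to 'pass' or 'fail:<issue id>'; reject unknown cells or values."""
    if (ntype, state) not in matrix:
        raise KeyError((ntype, state))
    if result != "pass" and not result.startswith("fail:"):
        raise ValueError(result)
    matrix[(ntype, state)] = result


def render(matrix):
    """Render one pipe-separated row per notification type."""
    rows = ["Type | " + " | ".join(STATES)]
    for t in TYPES:
        rows.append(t + " | " + " | ".join(matrix[(t, s)] for s in STATES))
    return "\n".join(rows)


matrix = delivery_matrix()
record(matrix, "chat_message", "foreground", "pass")
record(matrix, "chat_message", "killed", "fail:E-001")  # hypothetical issue ID
print(render(matrix))
```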
Reporting → ClaudeReport.md (living QA report)
- Header: date, build, devices, round number + run-state header.
- One section per pass (A/B/C/D/E), each a table: ID | Area | Screen/Route | Mode | Severity | Description | Repro | Evidence | Suggested fix | Status.
- Summary: counts by severity. Report only during passes — no fixes recorded until the fix phase.
Fix phase (only AFTER all passes of the round complete)
- Work strictly by severity: all P0 → P1 → P2 → P3.
- One issue at a time: implement → ./gradlew :app:assembleDebug → install both → verify THAT fix live (correct device/theme) + regression smoke (launch/no-crash, send text, inbox loads, a game opens, content still ciphertext in Firestore) → flip its row to Fixed + commit (one per issue/cluster) → next. Don't start the next until the current is verified.
- Couple-shared premium fix: replace direct isPremium() gates with CouplePremiumChecker.coupleHasPremium(partnerId) in every gated VM/screen (partner-entitlement read rule deployed). High regression risk: re-verify each feature in BOTH self-premium and free states.
- Gated actions (entitlement toggles, deploys) are user-authorized per occurrence.
- New issues found while fixing are logged (new ID), not silently fixed beyond scope — next re-QA round catches them.
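The strict severity ordering of the fix phase can be derived from the issue log rather than tracked by eye. A minimal sketch, assuming each issue has been read into a dictionary with id, severity and status keys (a hypothetical in-memory shape; the report itself is a table):

```python
SEVERITY_ORDER = {"P0": 0, "P1": 1, "P2": 2, "P3": 3}


def fix_queue(issues):
    """Return open issues ordered P0 first, then by ID within a severity."""
    open_issues = [issue for issue in issues if issue["status"] != "Fixed"]
    return sorted(
        open_issues,
        key=lambda issue: (SEVERITY_ORDER[issue["severity"]], issue["id"]),
    )


# Hypothetical issue rows, for illustration only.
SAMPLE_ISSUES = [
    {"id": "C-004", "severity": "P2", "status": "Open"},
    {"id": "A-001", "severity": "P1", "status": "Open"},
    {"id": "D-002", "severity": "P0", "status": "Open"},
    {"id": "B-003", "severity": "P1", "status": "Fixed"},
]
print([issue["id"] for issue in fix_queue(SAMPLE_ISSUES)])
```

The head of the queue is the only issue that should be in progress at any time.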
Definition of done: a pass is done when every coverage row is pass/fail→id; a round is done when all
five passes are done; flawless = one full round with zero open P0–P2 and Passes D + E fully clean. Then stop
(P3s optional). Don't re-open a clean pass within the same round.
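The definition of done can be checked mechanically against the coverage matrix and the open-issue list. The sketch below is one interpretation, not the playbook's own rule: it reads "Passes D + E fully clean" as no open issue of any severity with a D- or E- ID, and it treats a pass with no coverage rows as not done. The data shapes and function names are hypothetical.

```python
def pass_done(statuses):
    """A pass is done when it has rows and none of them is still todo."""
    return bool(statuses) and all(
        s == "pass" or s.startswith("fail") for s in statuses
    )


def is_flawless(coverage, open_issues):
    """coverage maps pass letter -> list of row statuses; open_issues are dicts."""
    if not all(pass_done(coverage.get(letter, [])) for letter in "ABCDE"):
        return False
    for issue in open_issues:
        if issue["severity"] in ("P0", "P1", "P2"):
            return False
        if issue["id"][0] in "DE":  # assumed reading of "D + E fully clean"
            return False
    return True


# Hypothetical round state, for illustration only.
COVERAGE = {letter: ["pass", "pass"] for letter in "ABCDE"}
print(is_flawless(COVERAGE, [{"id": "C-010", "severity": "P3"}]))
```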
Re-QA loop (until flawless)
After the fix phase, re-run Pass A/B/C/D/E (regression + confirm fixes). Repeat fix → re-QA rounds until a full round yields zero P0–P2 and Passes D+E fully clean.