Claude QA Playbook — Full-App QA → Fix → Re-QA until flawless

Reusable QA plan for the Closer app. Run report-only first, fix everything, then re-QA until a clean round. Progress/state is tracked in ClaudeReport.md (issues) + ClaudeQACoverage.md (coverage matrix), which are the authoritative source of truth. See the Continuity section before resuming.

Context

Drive the real app on both emulators, verify each thing live, report, fix, re-verify. Five QA dimensions:

  1. Couple-shared premium — if EITHER partner is premium, all premium features unlock for both.
  2. Games — each starts, plays, finishes correctly on both devices.
  3. Full visual pass, light + dark — every screen, text readable, nothing clipped/invisible.
  4. Security & encryption (cornerstone) — every private field is ciphertext at rest, rules hold against non-members, keys/recovery are sound. Findings here default to P0.
  5. Notifications — all 17 types deliver to the right partner (foreground/background/killed), deep-link correctly, and leak no private content.

Scope decisions: exhaustive visual pass (all ~50 screens, both modes); full scope incl. pre-pairing flows (fresh throwaway account); couple-shared everywhere — per-user gates are bugs, fixed by routing through core/billing/CouplePremiumChecker.kt.

Early known signal: only chat uses CouplePremiumChecker; games/packs/dates/wheel gate on the user's own EntitlementChecker.isPremium() — so premium almost certainly does NOT unlock for the free partner there. Pass A confirms + enumerates this; the fix phase applies couple-shared everywhere.

Methodology (every pass)

  • Devices: 5554 (QA), 5556 (Sam), paired; one fresh throwaway account for pre-pairing flows.
  • Drive via adb tap/swipe; resolve coords from uiautomator dump bounds; downscale screenshots to read; scan logcat for FATAL EXCEPTION/ANR on each screen.
  • Premium toggled via scratchpad/set_premium.js (admin, user-authorized each time).
  • Theme toggled via Settings → Appearance (Light/Dark) (MainActivity ThemeMode).
  • REPORT-ONLY during passes — never fix mid-pass.
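
The coordinate-resolution and logcat-scan steps above can be sketched as below. This is a minimal illustration, not the project's tooling: it assumes the standard `uiautomator dump` XML layout (`node` elements with a `bounds="[x1,y1][x2,y2]"` attribute) and that the emulators appear under the usual `emulator-5554` / `emulator-5556` serials. The helper names (`tap_point`, `tap_command`, `crash_lines`) and the sample dump are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# uiautomator writes bounds as "[left,top][right,bottom]".
BOUNDS_RE = re.compile(r"\[(\d+),(\d+)\]\[(\d+),(\d+)\]")

# Substrings that mark a crash or an ANR in logcat output.
CRASH_MARKERS = ("FATAL EXCEPTION", "ANR in")


def tap_point(dump_xml, text):
    """Return the (x, y) centre of the first node with this exact text, else None."""
    root = ET.fromstring(dump_xml)
    for node in root.iter("node"):
        if node.get("text") == text:
            match = BOUNDS_RE.fullmatch(node.get("bounds", ""))
            if match:
                x1, y1, x2, y2 = map(int, match.groups())
                return ((x1 + x2) // 2, (y1 + y2) // 2)
    return None


def tap_command(serial, point):
    """Build the adb argv for a tap; pass it to subprocess.run to execute it."""
    x, y = point
    return ["adb", "-s", serial, "shell", "input", "tap", str(x), str(y)]


def crash_lines(logcat_text):
    """Return the logcat lines that contain a crash or ANR marker."""
    return [
        line
        for line in logcat_text.splitlines()
        if any(marker in line for marker in CRASH_MARKERS)
    ]


# Hypothetical dump, trimmed to the attributes this sketch reads.
SAMPLE_DUMP = """<hierarchy rotation="0">
  <node text="" bounds="[0,0][1080,2400]">
    <node text="Settings" bounds="[40,200][440,320]" />
  </node>
</hierarchy>"""

print(tap_point(SAMPLE_DUMP, "Settings"))  # (240, 260)
```

Keeping the parsing separate from the `adb` call means the coordinate logic can be checked without an emulator attached. Matching on `ANR in` rather than a bare `ANR` is a choice made here to avoid hits on unrelated log text.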

Continuity & resumability (this effort WILL span many context windows — don't lose state)

State lives in files, not memory:

  • ClaudeReport.md = the issue log (committed). Each issue row is self-contained in text (repro + expected + actual); screenshots are session-only and won't survive a compaction, so never rely on a screenshot path alone.
  • ClaudeQACoverage.md = the coverage matrix: every screen×mode, feature×premium-state, game×lifecycle, notification×{foreground,background,killed}, each todo | pass | fail(→issue id). The resume anchor.
  • Persistent memory (memory/): QA methodology + exact commands; emulator↔account↔coupleId mapping; scratchpad/set_premium.js + admin tooling; the couple-shared-premium-everywhere goal + the per-user-gate gap.
  • Run-state header pinned at the TOP of ClaudeReport.md, always current: Round N | Pass X | Chunk Y | NEXT ACTION: … — first thing to read, last thing to update before stopping.
  • Stable issue IDs: A-001 / B-002 / C-… / D-… / E-… (pass-letter + number); coverage references the ID for every fail. Never renumber or reuse.
  • Source of truth: the two MD files are authoritative; the TodoWrite list is scratch for the current chunk only. Update the MD files + run-state header before ending a session.
  • Commit cadence: commit ClaudeReport.md + ClaudeQACoverage.md after each pass and each chunk.
  • Chunking: run small chunks (Pass C one screen-group; Pass A one feature), checkpoint after each.
  • Session-start ritual: (1) read run-state header + both MD files; (2) adb devices shows both emulators online; (3) installed build == current HEAD (rebuild+reinstall if unsure — never QA a stale APK); (4) continue at the first todo / unverified-fix.
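
As a rough sketch of the resume step, the run-state header and the stable issue IDs can both be read back from the report text. The header shape (`Round N | Pass X | Chunk Y | NEXT ACTION: …`) and the ID shape (`A-001`) come from the bullets above; the function names and the sample header are hypothetical, and the real file layout may differ.

```python
import re

# Matches "Round N | Pass X | Chunk Y | NEXT ACTION: ..." anywhere on a line.
HEADER_RE = re.compile(
    r"Round\s+(?P<round>\d+)\s*\|\s*Pass\s+(?P<pass_>[A-E])\s*\|\s*"
    r"Chunk\s+(?P<chunk>[^|]+?)\s*\|\s*NEXT ACTION:\s*(?P<next_action>.+)"
)

# Stable issue IDs: pass letter, hyphen, three digits.
ISSUE_ID_RE = re.compile(r"\b([A-E])-(\d{3})\b")


def parse_run_state(report_text):
    """Return the first run-state header found as a dict, or None."""
    for line in report_text.splitlines():
        match = HEADER_RE.search(line)
        if match:
            return {
                "round": int(match["round"]),
                "pass": match["pass_"],
                "chunk": match["chunk"],
                "next_action": match["next_action"].strip(),
            }
    return None


def next_issue_id(report_text, pass_letter):
    """Return one past the highest ID seen for this pass, so IDs are not reused."""
    used = [
        int(number)
        for letter, number in ISSUE_ID_RE.findall(report_text)
        if letter == pass_letter
    ]
    return f"{pass_letter}-{max(used, default=0) + 1:03d}"


# Hypothetical report excerpt.
SAMPLE_REPORT = (
    "Round 2 | Pass C | Chunk 4 | NEXT ACTION: re-verify C-007 in dark mode\n"
    "| C-003 | ... |\n| C-007 | ... |\n| A-001 | ... |\n"
)

print(parse_run_state(SAMPLE_REPORT))
print(next_issue_id(SAMPLE_REPORT, "C"))  # C-008
```

Deriving the next ID from the highest number present only avoids reuse as long as issue rows are never deleted from the report.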

Guardrails & efficiency

  • Never pm clear / wipe app data — breaks the App Check debug token. Pre-pairing QA: sign-out → fresh sign-up.
  • Never run seed/build_db.py. Admin seeds/writes, entitlement toggles, and any deploys are user-authorized per occurrence.
  • By-design vs bug: if a finding may be intended behavior, log it and confirm with the user before changing it.
  • Pass C parallelism: set 5554 = Dark, 5556 = Light to capture both themes at once.
  • Never log decrypted message/answer content.

Severity scale (label every issue)

  • P0 Critical — crash/ANR, data loss, encryption/security leak, feature fully broken, premium bypass.
  • P1 Major — feature partly broken, premium not unlocking for partner, wrong/missing notification, dead-end nav.
  • P2 Minor — readability/contrast, clipping/overflow/truncation, theme not adapting, inconsistent styling.
  • P3 Polish — spacing/alignment/copy nits.

QA passes (Round 1 = baseline)

Pass A — Couple-shared premium (target: either partner premium → both unlock)

Test each gated feature in 3 states: neither premium → locked + paywall; partner-only premium → BOTH unlock; self premium → unlock. Toggle Sam premium, confirm QA (free) unlocks; toggle off. Features: Play-hub games (Desire Sync + any premium-badged), Connection Challenges, Memory Lane; Question Packs; Spin the Wheel / Category Picker / Wheel History; Date Match / Plan Date / Date Builder; chat media + reactions (regression — already couple-shared); Subscription/Settings reflects entitlement. Gated files (for the fix): ui/play/PlayHubViewModel, ui/desiresync/DesireSyncScreen, ui/wheel/{CategoryPicker,SpinWheel,WheelHistory}*, ui/questions/QuestionPackLibrary*, ui/dates/{DateMatch,DateMatches}Screen, ui/memorylane/MemoryLaneScreen, ui/challenges/ConnectionChallengesScreen.
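
The three test states reduce to a small truth table, sketched here as a model of the expected outcome. This is not the Kotlin code in CouplePremiumChecker; the Python function names are hypothetical. The partner-only row is the one that tells a couple-shared gate apart from a per-user gate.

```python
def couple_has_premium(self_premium, partner_premium):
    """Couple-shared rule: premium on either side unlocks for both."""
    return self_premium or partner_premium


def per_user_gate(self_premium, partner_premium):
    """The gap described earlier: the gate ignores the partner's entitlement."""
    return self_premium


# The three states from Pass A, as (self_premium, partner_premium).
CASES = {
    "neither premium": (False, False),
    "partner-only premium": (False, True),
    "self premium": (True, False),
}


def expected_states(gate):
    """Map each Pass A state to the outcome a given gate would produce."""
    return {
        name: "unlocked" if gate(*flags) else "locked + paywall"
        for name, flags in CASES.items()
    }


for name, outcome in expected_states(couple_has_premium).items():
    print(f"{name}: {outcome}")
```

A feature that shows `locked + paywall` in the partner-only state is behaving like `per_user_gate` and belongs in the report.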

Pass B — Games lifecycle (start / play / finish + results)

Games: This or That, How Well Do You Know Me, Desire Sync, Connection Challenges, Memory Lane, Spin the Wheel, + Date Match.

  • Start on A → status=active; both play through → status=completed; reveal/results correct on both.
  • Edges: re-open a completed session, leave mid-game, no stuck session, no crash.
  • Game start/finish pushes (onGameSessionUpdate) exercised here; full delivery/deep-link audit in Pass E.
  • Media permissions (CAMERA, RECORD_AUDIO): granted works, denied degrades gracefully.
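
One way to check the lifecycle bullets is to record the session status at each step and validate the sequence afterwards. The sketch below assumes only the two statuses named above (`active`, `completed`); the real session model may have more states, and `lifecycle_problems` is a hypothetical helper.

```python
# Allowed next statuses, keyed by the previous one (None = before the session exists).
ALLOWED = {
    None: {"active"},
    "active": {"active", "completed"},
    "completed": {"completed"},
}


def lifecycle_problems(observed):
    """Return a list of problems for an observed status sequence (empty = OK)."""
    problems = []
    previous = None
    for status in observed:
        if status not in ALLOWED.get(previous, set()):
            problems.append(f"unexpected transition {previous!r} -> {status!r}")
        previous = status
    if previous != "completed":
        problems.append(f"session ended in {previous!r}, expected 'completed'")
    return problems


print(lifecycle_problems(["active", "active", "completed"]))  # []
print(lifecycle_problems(["active"]))  # stuck session
```

Running the same check on the sequence observed from each device covers the "correct on both" requirement; a sequence that never reaches `completed` is the stuck-session edge case.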

Pass C — Visual pass, light + dark, ALL screens

Every route in core/navigation/AppRoute.kt (~50), in both modes: text contrast/readability (no invisible/low-contrast text), no clipping/overflow/ellipsis breakage, icons visible, backgrounds adapt, controls legible. Groups: auth/onboarding/pairing (fresh acct); Home (solo + paired); Play + every game; Today + reveal/history; Messages (inbox + conversation); Packs; Dates (Match/Builder/Matches/Bucket List); Wheel (picker/session/complete/history); Settings + all sub-pages (Account, Notifications, Appearance, Privacy, Subscription, Relationship, Security, Delete Account); Paywall; Your Progress/Activity; Recovery.

  • Probe: ui/theme/Theme.kt hardcoded brand colors + chat's custom closerBackgroundBrush — verify dark mode truly adapts; grep screens for hardcoded Color(0x...).
  • States, not just happy path: empty / loading / error / not-paired where they exist; many need data setup (seeding is user-gated) — note unreachable states in coverage rather than skipping silently.
  • Readability at scale: default font size + spot-check largest system font scale on text-heavy screens.
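
The grep for hardcoded `Color(0x...)` literals could look like the sketch below. It is a plain regex scan over Kotlin source text, so it will also flag literals that are legitimately theme-independent; treat hits as candidates to inspect, not as findings. The function names and the sample snippet are hypothetical.

```python
import re
from pathlib import Path

# Compose colour literals such as Color(0xFFE91E63); underscores are legal in Kotlin hex literals.
COLOR_LITERAL_RE = re.compile(r"Color\(\s*0[xX][0-9A-Fa-f_]+\s*\)")


def hardcoded_colors(source):
    """Return (line_number, literal) for every Color(0x...) literal in the text."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        for match in COLOR_LITERAL_RE.finditer(line):
            hits.append((number, match.group(0)))
    return hits


def scan_tree(root):
    """Scan every .kt file under root; return {path: hits} for files with hits."""
    results = {}
    for path in sorted(Path(root).rglob("*.kt")):
        hits = hardcoded_colors(path.read_text(encoding="utf-8"))
        if hits:
            results[str(path)] = hits
    return results


# Hypothetical Kotlin snippet.
SAMPLE_KT = """val Brand = Color(0xFFE91E63)
val surface = MaterialTheme.colorScheme.surface
Box(Modifier.background(Color(0xff101010)))
"""

print(hardcoded_colors(SAMPLE_KT))
```

Pointing `scan_tree` at the UI source directory gives a per-file list to cross-check against the dark-mode screenshots.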

Pass D — Security & Encryption audit (cornerstone; findings default to P0)

  • D1 At-rest coverage: admin-read RAW docs/objects, assert ciphertext for every private type — chat text + lastMessagePreview (enc:v1:), chat media bytes (Tink 01 69 59 51 f0…), answers (sealed:v1:/enc:v1:), date plans + date_swipes, Memory Lane capsules, Bucket List. Also: wrappedCoupleKey + recovery material never plaintext; invite code (KDF seed) never stored raw; no push payload carries private content.
  • D2 Rules audit (static): member-only reads, author/server-only writes, ciphertext enforced on every private field, immutability, no premium self-grant, entitlements write:false; re-audit conversations/typing/reactions + entitlement partner-read; no catch-all match /{document=**}; list/query not enumerable; get()-rules don't over-expose; no legacy plaintext/downgrade path (coupleEncryptionEnabled holds; no disabled-encryption branch).
  • D3 Negative access tests: a non-member account is denied reading messages/answers/dates/entitlements, writing plaintext to encrypted fields, self-granting premium, cross-couple access (live rules or rules-emulator).
  • D4 Key exchange / management / recovery (E2EE crux): couple key client-generated, only leaves device wrapped (KDF from invite seed; server holds only wrappedCoupleKey+kdfSalt/kdfParams+encryptedRecoveryPhrase); KDF strength; Tink AEAD = AES-GCM/256 with AAD=coupleId, no weak/custom crypto/nonce reuse; keybox/sealed/commitment integrity; recovery-wrap server-blind; unpair revokes decrypt; invites CSPRNG + single-use + expiry.
  • D5 App Check / Functions / secrets: App Check enforced; callables validate auth+membership; webhook authenticity; admin-only writes rejected from clients; service-account JSONs never committed; no plaintext/secrets in logcat; temp files deleted.
  • D6 Leak vectors: no private content in analytics/crash; allowBackup=false + backup rules exclude sensitive data; deep links re-check membership; clipboard user-initiated; consider FLAG_SECURE; repo scan for committed secrets.
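
For the D1 at-rest assertions, a first-pass check can test raw values against the markers listed above (`enc:v1:`, `sealed:v1:`, and the leading Tink bytes `01 69 59 51 f0`). The sketch below is a prefix check only: it catches obvious plaintext but does not prove a value is real ciphertext, and it accepts either text prefix for any field, which is looser than the per-type expectations in D1. The document shape and the `text` field name are assumptions; `lastMessagePreview` is the field named above.

```python
# Text markers from D1; a private string field should start with one of them.
TEXT_PREFIXES = ("enc:v1:", "sealed:v1:")

# Leading bytes listed for encrypted chat media (only the bytes given in D1).
TINK_MEDIA_PREFIX = bytes.fromhex("01695951f0")


def is_ciphertext_text(value):
    """True if the value is a string carrying one of the known ciphertext prefixes."""
    return isinstance(value, str) and value.startswith(TEXT_PREFIXES)


def is_ciphertext_media(blob):
    """True if the raw object bytes start with the expected Tink prefix."""
    return blob.startswith(TINK_MEDIA_PREFIX)


def plaintext_fields(doc, private_fields):
    """Return the private fields present in doc that do not look encrypted."""
    return [
        name
        for name in private_fields
        if name in doc and not is_ciphertext_text(doc[name])
    ]


# Hypothetical raw document as read by an admin script.
SAMPLE_DOC = {"text": "enc:v1:AAAA", "lastMessagePreview": "see you at 8"}

print(plaintext_fields(SAMPLE_DOC, ["text", "lastMessagePreview"]))  # ['lastMessagePreview']
```

Any field returned by `plaintext_fields` is a P0 candidate under the Pass D default. Report the field name only, never the value, in line with the rule against logging decrypted content.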

Pass E — Notifications

For each type: trigger fires → delivered to the right partner (never self) → in foreground/background/killed → correct channel + copy with no private content → tap opens exactly the right item (loaded, not generic Home/dead-end) → no duplicates → rate limiter (20/day, 100/week) doesn't drop legit ones.

Inventory (type → trigger → destination), all 17: chat_message (onMessageWritten → conversation; foreground → chat-head bubble), partner_started_game/partner_finished_game (onGameSessionUpdate → game/results), partner_answered (onAnswerWritten → reveal), daily_question (assignDailyQuestion)/daily_question_reminder/daily_reminder (dailyQuestionReminder → Today), date_match (createDateMatch → match), partner_joined + invite_created (acceptInviteCallable → pairing/home), partner_left (onCoupleLeave)/partner_deleted_account (onUserDelete → home/relationship settings), memory_capsule_unlocked (scheduled → capsule), challenge_day_ready (→ Connection Challenges), outcome_reminder (scheduledOutcomesReminder), reengagement (reengagement/gameRetention), gentle_reminder (sendGentleReminderCallable), spki (identify + confirm handled).

  • Tap-to-open: every notification opens the specific item from foreground/background/killed; tapping in-app doesn't stack/duplicate; logged-out/unpaired tap is graceful. Wrong/dead destination = P1.
  • Scheduled/time-based: trigger manually (invoke callable/function or seed due condition — user-gated).
  • Foundations: FCM token registration on sign-in (TokenRegistrar) + onNewToken; POST_NOTIFICATIONS prompt + denied path; channels (di/NotificationModule); deep-link routing (MainActivity.deepLinkRouteFromIntent → AppNavigation); foreground/background split (core/notifications/AppMessagingService).
  • Build a delivery matrix (type × {foreground,background,killed}) in ClaudeQACoverage.md. Missed delivery or wrong deep-link = P1; private content in any payload = P0.
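
A delivery matrix of this shape can be kept as a small dict and rendered as a Markdown table for ClaudeQACoverage.md. The sketch below uses the cell values already defined for coverage (`todo`, `pass`, `fail(→issue id)`); the helper names are hypothetical, and the types in the usage line are just three entries from the inventory, not the full list.

```python
STATES = ("foreground", "background", "killed")


def empty_matrix(types):
    """Start every type x state cell at 'todo'."""
    return {t: {s: "todo" for s in STATES} for t in types}


def record(matrix, notif_type, state, passed, issue_id=None):
    """Mark one cell; a failing cell must carry the issue ID it maps to."""
    if not passed and not issue_id:
        raise ValueError("a failing cell must reference an issue id")
    matrix[notif_type][state] = "pass" if passed else f"fail(→{issue_id})"


def remaining(matrix):
    """Return the (type, state) cells still marked 'todo'."""
    return [(t, s) for t, cells in matrix.items() for s in STATES if cells[s] == "todo"]


def render(matrix):
    """Render the matrix as a Markdown table."""
    lines = [
        "| Type | " + " | ".join(STATES) + " |",
        "|---" * (len(STATES) + 1) + "|",
    ]
    for notif_type, cells in matrix.items():
        lines.append(f"| {notif_type} | " + " | ".join(cells[s] for s in STATES) + " |")
    return "\n".join(lines)


matrix = empty_matrix(["chat_message", "partner_answered", "date_match"])
record(matrix, "chat_message", "foreground", passed=True)
record(matrix, "chat_message", "killed", passed=False, issue_id="E-001")
print(render(matrix))
```

Requiring an issue ID on every failing cell keeps the matrix consistent with the rule that coverage references the ID for every fail.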

Reporting → ClaudeReport.md (living QA report)

  • Header: date, build, devices, round number + run-state header.
  • One section per pass (A/B/C/D/E), each a table: ID | Area | Screen/Route | Mode | Severity | Description | Repro | Evidence | Suggested fix | Status.
  • Summary: counts by severity. Report only during passes — no fixes recorded until the fix phase.
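
The severity summary can be derived from the issue tables instead of being maintained by hand. The sketch below assumes each issue is one Markdown table row with exactly the ten columns listed above and no `|` characters inside cells. The function names and the sample rows are hypothetical and are not real findings.

```python
from collections import Counter

# Column order from the reporting format above.
COLUMNS = (
    "id", "area", "screen", "mode", "severity",
    "description", "repro", "evidence", "suggested_fix", "status",
)
SEVERITIES = ("P0", "P1", "P2", "P3")


def parse_row(line):
    """Parse one table row into a dict; return None for header/separator/other lines."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    if len(cells) != len(COLUMNS):
        return None
    if cells[0].lower() == "id" or set(cells[0]) <= set("-: "):
        return None
    return dict(zip(COLUMNS, cells))


def parse_issues(report_text):
    """Collect every issue row found in the report text."""
    rows = (parse_row(line) for line in report_text.splitlines())
    return [row for row in rows if row is not None]


def summary_line(issues):
    """Return counts by severity in fixed P0..P3 order."""
    counts = Counter(issue["severity"] for issue in issues)
    return " | ".join(f"{sev}: {counts.get(sev, 0)}" for sev in SEVERITIES)


# Made-up rows, for illustration only.
SAMPLE_TABLE = """| ID | Area | Screen/Route | Mode | Severity | Description | Repro | Evidence | Suggested fix | Status |
|---|---|---|---|---|---|---|---|---|---|
| A-001 | Premium | Play hub | Light | P1 | Free partner stays locked | Toggle partner premium, open Play | text repro | Route through CouplePremiumChecker | Open |
| C-001 | Visual | Settings | Dark | P2 | Label clipped | Open Settings in dark mode | text repro | Allow wrap | Open |
"""

print(summary_line(parse_issues(SAMPLE_TABLE)))  # P0: 0 | P1: 1 | P2: 1 | P3: 0
```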

Fix phase (only AFTER all passes of the round complete)

  • Work strictly by severity: all P0 → P1 → P2 → P3.
  • One issue at a time: implement → ./gradlew :app:assembleDebug → install both → verify THAT fix live (correct device/theme) + regression smoke (launch/no-crash, send text, inbox loads, a game opens, content still ciphertext in Firestore) → flip its row to Fixed + commit (one per issue/cluster) → next. Don't start the next until the current is verified.
  • Couple-shared premium fix: replace direct isPremium() gates with CouplePremiumChecker.coupleHasPremium(partnerId) in every gated VM/screen (partner-entitlement read rule deployed). High regression risk — re-verify each feature in BOTH self-premium and free states.
  • Gated actions (entitlement toggles, deploys) are user-authorized per occurrence.
  • New issues found while fixing are logged (new ID), not silently fixed beyond scope — next re-QA round catches them.

Definition of done: a pass is done when every coverage row is pass/fail→id; a round is done when all five passes are done; flawless = one full round with zero open P0-P2 and Passes D + E fully clean. Then stop (P3s optional). Don't re-open a clean pass within the same round.
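
The stop condition can be written as a predicate over the issue list and the coverage cells. This sketch makes two assumptions that the definition above does not spell out: any status other than `Fixed` counts as open, and "fully clean" for Passes D and E means no open issue of any severity with a `D-` or `E-` ID. The names are hypothetical.

```python
BLOCKING = {"P0", "P1", "P2"}


def is_open(issue):
    """Assumption: anything not marked 'Fixed' is still open."""
    return issue["status"] != "Fixed"


def round_is_done(coverage_cells):
    """A round is done when no coverage cell is still 'todo'."""
    return all(cell != "todo" for cell in coverage_cells)


def is_flawless(issues, coverage_cells):
    """Zero open P0-P2 anywhere, no open D/E issue at all, and full coverage."""
    open_blockers = [i for i in issues if is_open(i) and i["severity"] in BLOCKING]
    open_d_e = [i for i in issues if is_open(i) and i["id"][0] in "DE"]
    return round_is_done(coverage_cells) and not open_blockers and not open_d_e


issues = [
    {"id": "A-001", "severity": "P1", "status": "Fixed"},
    {"id": "C-004", "severity": "P3", "status": "Open"},
]
print(is_flawless(issues, ["pass", "pass"]))  # True: only an optional P3 remains
```

Under these assumptions an open P3 in Pass C does not block the stop, while an open P3 in Pass D or E does.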

Re-QA loop (until flawless)

After the fix phase, re-run Pass A/B/C/D/E (regression + confirm fixes). Repeat fix → re-QA rounds until a full round yields zero open P0-P2 and Passes D + E fully clean.