fix(games): atomic session start to prevent duplicate sessions on concurrent start (F-RACE-001)

Simultaneous game start by both partners created two divergent active sessions (TOCTOU: a
non-transactional check-then-create in GameSessionManager.startGameWithCouple). Each partner
ended up in a separate session with different questions → no shared reveal.

Fix: QuestionSessionRepository.startSessionAtomically runs a Firestore transaction on a
per-couple pointer doc (couples/{cid}/sessions/_active). It reads the pointer (+ the pointed
session) and either returns AlreadyActive (caller joins the existing session) or atomically
creates the new session and re-points the lock. Concurrent starts contend on the one pointer,
so the loser's transaction retries, sees the now-set pointer, and joins instead of duplicating.
The pointer self-heals (checks the pointed session's status) so no clear-on-finish is needed,
and it carries no status/completedAt so it's invisible to the active/history queries.
GameSessionManager routes all 7 games through it. firestore.rules adds member-write for
sessions/_active (deployed).

Verified live on both emulators: atomic create → 1 session + pointer; sequential 2nd start →
joins (1 session); literal parallel-tap race → 1 session (was 2); 0 FATAL.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
null 2026-06-25 21:43:06 -05:00
parent 164acf415d
commit 23dd6a75e8
6 changed files with 140 additions and 15 deletions

View File

@ -105,6 +105,15 @@ conversation w/ content). **partner_started_game**: bg deliver + content-free
present in code. Full 17×{fg/bg/killed} matrix not exhaustively run; routing centralized + code-verified for the rest.
## Pass F — Resilience / lifecycle / concurrency / time
**R7 DEEP DIVE:** **concurrency race — FOUND + FIXED F-RACE-001 (P1):** simultaneous game start created 2 divergent
sessions (TOCTOU in `startGameWithCouple`). **Fixed** via atomic transactional create on a per-couple `sessions/_active`
pointer (`startSessionAtomically`) + rule + deploy. **Verified live:** atomic create → 1 session + pointer; sequential
2nd start → joins (1); **parallel-tap race → 1 session (was 2)**; 0 FATAL. **Malformed/abusive deep-link intents**
(unknown type, missing extras, injection/path-traversal) → 0 crash. **Killed-state cold-start** chat deep-link →
conversation loads, 0 crash. Minor follow-up note: the race-loser sometimes lands on the Play hub rather than
WaitingForPartner→"Join the game" (no dup/crash; pre-existing routing).
**R3:** offline (airplane mode) → Today renders from cache, no crash ✅; rotation/config-change → landscape renders,
state preserved, no crash ✅; process-death/restore → ~6 cold restarts all clean to Home (auth persists) ✅;
concurrency → both devices played games simultaneously, sessions synced + B-001 auto-complete on concurrent finish ✅.

View File

@ -1,6 +1,6 @@
# Claude QA Report — Full-App QA (living report)
> **RUN-STATE: Round 7 (multi-angle DEEP DIVE) — 2026-06-25, client HEAD `f47c8e2`, functions deployed. Plan updated with a "Multi-angle attack mandate" + live raw-API D3.** Attacked security/data/concurrency from multiple angles (admin ground-truth read, raw Firestore REST as member+non-member, killed/cold state, malformed intents, simultaneous-start race). **Security cornerstone = FULLY CLEAN (deep):** D1 at-rest — messages/previews + all 4 game-answer collections (ToT/HowWell/DesireSync/Wheel, both users) + capsules + date-swipe actions all `enc:v1:`; couple key phrase-wrapped (argon2id), recovery phrase server-blind + `encryptedRecoveryPhrase` wiped on acceptance, plaintext `inviteCode` not exploitable (invite readable only by inviter; no code-encrypted secret persisted). D3 raw-API — **non-member denied ALL reads/writes (403)**; real premium path `users/{uid}/entitlements/premium` write **denied (403, server-only) → no self-grant**; cross-couple denied. Robustness — malformed/abusive deep-link intents (unknown type, missing extras, injection/path-traversal values) → 0 crash; killed-state cold-start chat deep-link → conversation loads. **NEW FINDING: F-RACE-001 (P1)** — simultaneous game start creates **two divergent active sessions** (TOCTOU). **SEVERITY BOARD: P1 = 1 open (F-RACE-001), P0/P2/P3 = 0 open.** Baseline restored (duplicate sessions ended, 0 active, couple intact). Two hardening notes → Future.md (App Check not enforced on Firestore; user-doc update rule allows arbitrary non-`hasPremium` fields).**
> **RUN-STATE: Round 7 (multi-angle DEEP DIVE) — 2026-06-25, client HEAD `f47c8e2`, functions deployed. Plan updated with a "Multi-angle attack mandate" + live raw-API D3.** Attacked security/data/concurrency from multiple angles (admin ground-truth read, raw Firestore REST as member+non-member, killed/cold state, malformed intents, simultaneous-start race). **Security cornerstone = FULLY CLEAN (deep):** D1 at-rest — messages/previews + all 4 game-answer collections (ToT/HowWell/DesireSync/Wheel, both users) + capsules + date-swipe actions all `enc:v1:`; couple key phrase-wrapped (argon2id), recovery phrase server-blind + `encryptedRecoveryPhrase` wiped on acceptance, plaintext `inviteCode` not exploitable (invite readable only by inviter; no code-encrypted secret persisted). D3 raw-API — **non-member denied ALL reads/writes (403)**; real premium path `users/{uid}/entitlements/premium` write **denied (403, server-only) → no self-grant**; cross-couple denied. Robustness — malformed/abusive deep-link intents (unknown type, missing extras, injection/path-traversal values) → 0 crash; killed-state cold-start chat deep-link → conversation loads. **F-RACE-001 (P1) — FIXED + VERIFIED LIVE** (atomic transactional session create). **SEVERITY BOARD: 0 open at ALL levels (P0P3).** Baseline restored (duplicate sessions ended, 0 active, couple intact). Two hardening notes → Future.md (App Check not enforced on Firestore; user-doc update rule allows arbitrary non-`hasPremium` fields).**
>
> **F-RACE-001 (P1, NEW — concurrency):** When BOTH partners start the *same* game within ~the same second, the couple
> ends up with **2 active sessions with different question sets** (proven live: QA "Which should end the date?" vs Sam
@ -16,6 +16,15 @@
> else create the session doc (client-generated id) + set the sentinel in the same transaction. Clear the sentinel on
> finish. Needs a `firestore.rules` update (member-only sentinel write) + a rules deploy + re-verify all 7 games. (This
> is an architectural change to the core game flow — flagged for a focused fix-phase implementation.)
> **FIX IMPLEMENTED + VERIFIED LIVE (2026-06-25):** `QuestionSessionRepository.startSessionAtomically` runs a Firestore
> transaction on a per-couple pointer doc `couples/{cid}/sessions/_active` — reads the pointer (+ the pointed session),
> and either returns `AlreadyActive` (caller joins) or atomically `set`s the new session + re-points the lock. Concurrent
> starts contend on the one pointer doc, so the loser's transaction retries, re-reads the now-set pointer, and joins
> instead of creating a duplicate. `GameSessionManager.startGameWithCouple` now calls it (all 7 games funnel through it);
> rule added (`sessions/_active` member-writable) + deployed. **Verified:** atomic create → 1 session + pointer written;
> sequential 2nd start → joins (1 session); **literal parallel-tap race → 1 session (was 2)**; 0 FATAL; pointer
> self-heals on completion (checks pointed session status). Files: `domain/repository/QuestionSessionRepository.kt`,
> `data/repository/QuestionSessionRepositoryImpl.kt`, `domain/usecase/GameSessionManager.kt`, `firestore.rules`.
> **RUN-STATE: Round 6 (branding + Future.md regression QA) COMPLETE — 2026-06-25. Client HEAD `f47c8e2` on both emulators (build == HEAD, reinstalled).** Scope: regression-verify the new surfaces from the branding drop (`95cad84`: white-keyhole launcher/notification icons, animated app-icon-chip loader + fill, cold-launch splash, pairing hero) and the Future.md backlog clear (`f47c8e2`: inclusive gender options, turn-aware Home copy, push rate-limit budget split, results-push suppression via new ActiveGameSessionMonitor, paywall retry/offline/hide-Continue, auth privacy rotator). **0 new issues — SEVERITY BOARD STILL 0 open P0P3.** LIVE-VERIFIED: animated loader (chip+fill, both themes), splash→handoff (white-keyhole icon, no white flash), launcher icon (round mask), This or That + How Well open with no crash (confirms #4's new VM injection is sound), paywall purchase screen shows friendly "Couldn't load plans" + Try again with Continue hidden (no dead button) + online→generic message (#5), onboarding carousel illustration. Unit tests green (NotificationRateLimiter rewritten + PartnerNotificationManagerTest repaired). No FATAL on either device all session. CODE/UNIT-VERIFIED (live deferred, low-risk over proven patterns + fragile multi-text-field/2-device paths): #1 gender step (EditProfile + onboarding sex step — same option list as the shipping Female/Male), #8 rotator on SignUp/Forgot (reuses the Login-proven BrandMessageRotator), #2 "Your turn to play." (static string in the proven GAME_WAITING path), #3 weekly-cap exemption (unit-tested; only triggers at ≥100/wk), #4 results-suppression timing (mechanism + VM wiring verified; simultaneous-finish timing is non-deterministic to drive). Baseline restored: 5554 signed out during the sign-up pass, re-signed-in QA (`Y05AKO2IlTPMa0JQW1BiNIM0uzK2`) via admin custom token; couple `Xal3Kw3gjSdn0niERYKJ` intact, Sam paired.**

View File

@ -5,6 +5,7 @@ import app.closer.data.remote.FirestoreCollections
import app.closer.domain.model.GameType
import app.closer.domain.model.QuestionSession
import app.closer.domain.repository.QuestionSessionRepository
import app.closer.domain.repository.SessionStartOutcome
import com.google.firebase.firestore.FirebaseFirestore
import kotlinx.coroutines.channels.awaitClose
import kotlinx.coroutines.flow.Flow
@ -53,6 +54,76 @@ class QuestionSessionRepositoryImpl @Inject constructor(
doc.id
}
override suspend fun startSessionAtomically(
session: QuestionSession
): Result<SessionStartOutcome> = runCatching {
val sessionsCol = firestore.collection(FirestoreCollections.COUPLES)
.document(session.coupleId)
.collection(FirestoreCollections.Couples.SESSIONS)
val pointerRef = sessionsCol.document(ACTIVE_POINTER_ID)
val newRef = sessionsCol.document() // pre-generate id for the create branch
firestore.runTransaction { tx ->
// ── all reads must happen before any writes in a Firestore transaction ──
val pointer = tx.get(pointerRef)
val activeId = pointer.getString("activeSessionId")?.takeIf { it.isNotBlank() }
val activeSnap = activeId?.let { tx.get(sessionsCol.document(it)) }
if (activeSnap != null && activeSnap.exists() &&
activeSnap.getString("status") == "active"
) {
// A session is already active → converge: the caller joins it (no duplicate created).
// Concurrent starts contend on [pointerRef]; the loser's transaction retries, re-reads
// the now-set pointer, and lands here. Stale/completed pointers fall through to create.
SessionStartOutcome.AlreadyActive(snapToSession(activeSnap, session.coupleId))
} else {
val data = mapOf(
"id" to newRef.id,
"coupleId" to session.coupleId,
"categoryId" to session.categoryId,
"questionIds" to session.questionIds,
"startedByUserId" to session.startedByUserId,
"startedAt" to session.startedAt,
"completedAt" to session.completedAt,
"partnerCompletedAt" to session.partnerCompletedAt,
"isPremium" to session.isPremium,
"status" to session.status,
"gameType" to session.gameType,
"completedByUsers" to session.completedByUsers
)
tx.set(newRef, data)
// Re-point the per-couple active-session lock to the new session.
tx.set(
pointerRef,
mapOf(
"activeSessionId" to newRef.id,
"updatedAt" to System.currentTimeMillis()
)
)
SessionStartOutcome.Created(session.copy(id = newRef.id))
}
}.await()
}
private fun snapToSession(
doc: com.google.firebase.firestore.DocumentSnapshot,
coupleId: String
): QuestionSession = QuestionSession(
id = doc.getString("id") ?: doc.id,
coupleId = doc.getString("coupleId") ?: coupleId,
categoryId = doc.getString("categoryId") ?: "",
questionIds = (doc.get("questionIds") as? List<*>)?.filterIsInstance<String>() ?: emptyList(),
startedByUserId = doc.getString("startedByUserId") ?: "",
startedAt = doc.getLong("startedAt") ?: 0L,
completedAt = doc.getLong("completedAt"),
partnerCompletedAt = doc.getLong("partnerCompletedAt"),
isPremium = doc.getBoolean("isPremium") ?: false,
status = doc.getString("status") ?: "active",
gameType = doc.getString("gameType") ?: GameType.WHEEL,
completedByUsers = (doc.get("completedByUsers") as? List<*>)?.filterIsInstance<String>()
?: emptyList()
)
override suspend fun getSessionsForCouple(coupleId: String): Result<List<QuestionSession>> =
runCatching {
firestore.collection(FirestoreCollections.COUPLES)
@ -223,4 +294,9 @@ class QuestionSessionRepositoryImpl @Inject constructor(
}.onFailure { crashReporter.recordException(it) }.getOrNull()
}
}.getOrNull()
companion object {
/** Per-couple pointer doc that serializes concurrent game starts (F-RACE-001 lock). */
private const val ACTIVE_POINTER_ID = "_active"
}
}

View File

@ -3,9 +3,25 @@ package app.closer.domain.repository
import app.closer.domain.model.QuestionSession
import kotlinx.coroutines.flow.Flow
/** Outcome of an atomic session start: we either created a fresh session, or one was already active. */
sealed interface SessionStartOutcome {
val session: QuestionSession
data class Created(override val session: QuestionSession) : SessionStartOutcome
data class AlreadyActive(override val session: QuestionSession) : SessionStartOutcome
}
interface QuestionSessionRepository {
/** Saves the session and returns its (possibly auto-generated) document id. */
suspend fun saveSession(session: QuestionSession): Result<String>
/**
* Atomically start a session: creates a new active session ONLY if the couple has none, via a
* Firestore transaction on a per-couple pointer doc. Concurrent starts converge to a single
* session the loser gets [SessionStartOutcome.AlreadyActive] instead of creating a duplicate
* (fixes F-RACE-001, the simultaneous-start race that produced two divergent sessions).
*/
suspend fun startSessionAtomically(session: QuestionSession): Result<SessionStartOutcome>
suspend fun getSessionsForCouple(coupleId: String): Result<List<QuestionSession>>
// Active session queries

View File

@ -6,6 +6,7 @@ import app.closer.domain.model.QuestionSession
import app.closer.domain.repository.AuthRepository
import app.closer.domain.repository.CoupleRepository
import app.closer.domain.repository.QuestionSessionRepository
import app.closer.domain.repository.SessionStartOutcome
import app.closer.domain.repository.UserRepository
import kotlinx.coroutines.flow.Flow
import javax.inject.Inject
@ -81,17 +82,6 @@ class GameSessionManager @Inject constructor(
categoryId: String?,
questionIds: List<String>?
): Result<QuestionSession> {
val activeSession = sessionRepository.getActiveSessionForCouple(couple.id)
if (activeSession != null) {
val partnerId = couple.userIds.firstOrNull { it != userId }
val partnerName = partnerId
?.let { runCatching { userRepository.getUser(it) }.getOrNull() }
?.displayName ?: "Partner"
return Result.failure(
Exception("partner_active_session|$partnerName|${gameTypeLabel(gameType)}")
)
}
val session = QuestionSession(
coupleId = couple.id,
categoryId = categoryId ?: "",
@ -101,9 +91,28 @@ class GameSessionManager @Inject constructor(
status = "active"
)
// Use the id assigned by the repository so the started game observes the right doc
// (an empty id builds an invalid Firestore path and crashes the game on open).
return sessionRepository.saveSession(session).map { id -> session.copy(id = id) }
// Atomic "create only if no active session" (Firestore transaction on a per-couple pointer):
// two partners tapping start at the same instant converge to ONE session instead of two
// divergent ones (F-RACE-001). The loser is told a session is already active and joins it.
return sessionRepository.startSessionAtomically(session).fold(
onSuccess = { outcome ->
when (outcome) {
is SessionStartOutcome.Created -> Result.success(outcome.session)
is SessionStartOutcome.AlreadyActive -> {
val partnerId = couple.userIds.firstOrNull { it != userId }
val partnerName = partnerId
?.let { runCatching { userRepository.getUser(it) }.getOrNull() }
?.displayName ?: "Partner"
Result.failure(
Exception(
"partner_active_session|$partnerName|${gameTypeLabel(outcome.session.gameType)}"
)
)
}
}
},
onFailure = { Result.failure(it) }
)
}
/**

View File

@ -305,6 +305,12 @@ service cloud.firestore {
// Read: both members can read
allow read: if isCouplesMember(coupleId);
// Per-couple active-session pointer used as an atomic lock so two partners starting a game
// at the same instant converge to ONE session instead of two divergent ones (F-RACE-001).
// It holds no game content (only activeSessionId + updatedAt) and carries no status/
// completedAt, so it never appears in the active-session or history queries.
allow create, update: if sessionId == '_active' && isCouplesMember(coupleId);
// Create: either member can start a session
allow create: if isCouplesMember(coupleId)
&& request.resource.data.startedByUserId == request.auth.uid;