diff --git a/seed/questions/DAILY_SINGLE_CHOICE_WEEKDAY_SYSTEM.md b/seed/questions/DAILY_SINGLE_CHOICE_WEEKDAY_SYSTEM.md
index 00601708..54dce04b 100644
--- a/seed/questions/DAILY_SINGLE_CHOICE_WEEKDAY_SYSTEM.md
+++ b/seed/questions/DAILY_SINGLE_CHOICE_WEEKDAY_SYSTEM.md
@@ -1,4 +1,4 @@
-# Daily Single Choice Weekday System v5
+# Daily Single Choice Weekday System v6
 
 This document defines the Closer daily weekday question pack.
 
@@ -568,6 +568,76 @@ Each question must include exactly one new weekday tag:
 
 If the app code still uses older mode tags, include the compatibility tag too, but only one new weekday tag.
 
+## Patch Discipline: Fix Only What Fails
+
+Daily pack updates must use patch discipline.
+
+The writer must review the full pack, mark the failing question IDs, and then fix only those marked IDs.
+
+Do not rewrite passing questions just because a rewrite is happening. Passing questions are frozen unless they later fail a specific rule.
+
+For normal content fixes, preserve:
+
+* `id`
+* `type`
+* `access`
+* `depth`
+* `sex`
+* weekday tags
+* app compatibility tags
+
+Change only the prompt and options unless metadata is the thing that failed.
+
+Mass rewrites are allowed only when more than 60 percent of a weekday or pack fails for the same root cause. If that happens, the review report must explain why patching would be worse.
+
+See `DAILY_PATCH_REVIEW_LOOP_POLICY.md` for the full patch loop.
+
+## Fun But Grounded Gate
+
+Fun does not mean random nonsense.
+
+Reject daily questions that feel like a carnival generator instead of a couples app.
+
+Watch for overuse of:
+
+* fake awards
+* snack drafts
+* mascot jokes
+* couch games
+* dramatic compliments
+* random object choices
+* silly phrases that do not match the prompt
+
+A little weird is good. A whole pack of weird becomes wallpaper with confetti on it.
+
+Daily questions should feel playful and usable by adults.
+
+They should not feel like children's party games, chore dice, or therapy cards with glitter.
+
+## Option Answer Test
+
+Every option must pass the answer test.
+
+Read the prompt, then read each option after it. The option must sound like a clean answer.
+
+Bad:
+
+```text
+Which tiny date move fits after dinner?
+Choosing the fun mug
+```
+
+Why it fails: choosing a mug is not really a date move.
+
+Better:
+
+```text
+Which tiny date move fits after dinner?
+A two-song kitchen dance
+```
+
+If one option fails the answer test, fix that option. If two or more fail, rewrite the whole answer set.
+
 ## Production Review Loop
 
 Do not write or rewrite all 500 questions in one blind pass.
@@ -578,12 +648,13 @@ For each weekday:
 2. Read all 20 out loud.
 3. Mark weak questions with reasons.
 4. Fix only the marked questions.
-5. Review the fixed set again.
-6. Continue only when at least 18 of 20 pass.
-7. Continue only when at least 16 of 20 feel fun, playful, sweet, flirty, silly, or date-like.
-8. Expand in batches of 20 to 30.
-9. Repeat mark, fix, review after each batch.
-10. Move to the next weekday only after the current weekday passes.
+5. Do not touch questions that passed.
+6. Review the fixed set again.
+7. Continue only when all 20 have no hard flags.
+8. Continue only when at least 18 of 20 feel fun, playful, sweet, flirty, silly, or date-like.
+9. Expand in batches of 20 to 30.
+10. Repeat mark, fix, review after each batch.
+11. Move to the next weekday only after the current weekday passes.
 
 ## Marking Reasons
 
@@ -602,6 +673,9 @@ Use these reasons when marking weak daily questions:
 * weird_option
 * weak_weekday_fit
 * filler_question
+* too_random
+* mechanic_overuse
+* patch_scope_violation
 
 ## Research Notes Used for This Guide
 
@@ -628,6 +702,10 @@ Before shipping:
 6. Mark anything therapy-coded, boring, weird, logistical, or not fun.
 7. Fix the marked items.
 8. Run a second random sample from each weekday.
-9. Ship only when the second sample passes cleanly.
+9. Fix only the newly marked sample items.
+10. Run the second sample again if any sampled item changed.
+11. Ship only when the second sample passes cleanly and the remaining hard flag count is 0.
 
-The final sample must include no weird domestic options like "The good blanket saved" and no chore-heavy answer sets.
+The final sample must include no weird domestic options like "The good blanket saved", no chore-heavy answer sets, and no random silliness that does not fit the prompt.
+
+The final review report must state how many questions were marked, how many were patched, and how many known flags remain.
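
Review note: the numeric gates this patch introduces (no hard flags across all 20, at least 18 of 20 fun, and the 60 percent mass-rewrite threshold) could be checked mechanically. The following is a minimal sketch, not part of the patch. The `Review` record, its field names, and the three helper functions are hypothetical, and it assumes each marking reason attached to a question counts as a hard flag and stands in for a "root cause", which the guide does not state explicitly.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Review:
    """Review result for one question. All names here are hypothetical."""
    question_id: str
    hard_flags: list[str] = field(default_factory=list)  # e.g. "too_random"
    feels_fun: bool = True


def ids_to_patch(reviews: list[Review]) -> list[str]:
    """Patch discipline: only flagged questions may be touched."""
    return [r.question_id for r in reviews if r.hard_flags]


def weekday_passes(reviews: list[Review], min_fun: int = 18) -> bool:
    """Gate assumed from the review loop: zero hard flags and >= min_fun fun."""
    no_hard_flags = all(not r.hard_flags for r in reviews)
    fun_count = sum(1 for r in reviews if r.feels_fun)
    return no_hard_flags and fun_count >= min_fun


def mass_rewrite_allowed(reviews: list[Review], threshold_percent: int = 60) -> bool:
    """True only if strictly more than threshold_percent share one flag."""
    if not reviews:
        return False
    # Count each reason at most once per question.
    by_reason = Counter(flag for r in reviews for flag in set(r.hard_flags))
    # Integer comparison avoids floating-point edge cases at exactly 60%.
    return any(n * 100 > threshold_percent * len(reviews) for n in by_reason.values())
```

With 20 questions, 12 sharing one flag is exactly 60 percent and does not unlock a mass rewrite under this reading; 13 does.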