From 3b49b76b330d6eeb0592718b723668e3272653ab Mon Sep 17 00:00:00 2001
From: null
Date: Tue, 30 Jun 2026 23:48:14 -0500
Subject: [PATCH] =?UTF-8?q?docs(questions):=20v7=20=E2=80=94=20Patch=20Dis?=
 =?UTF-8?q?cipline=20Checks,=20Fun=20But=20Grounded=20Checks?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 seed/questions/QUESTION_QUALITY_CHECKLIST.md | 74 ++++++++++++++++++--
 1 file changed, 70 insertions(+), 4 deletions(-)

diff --git a/seed/questions/QUESTION_QUALITY_CHECKLIST.md b/seed/questions/QUESTION_QUALITY_CHECKLIST.md
index f6b99824..8eb40315 100644
--- a/seed/questions/QUESTION_QUALITY_CHECKLIST.md
+++ b/seed/questions/QUESTION_QUALITY_CHECKLIST.md
@@ -1,4 +1,4 @@
-# Closer Question Quality Checklist v6
+# Closer Question Quality Checklist v7
 
 **See also:** [QUESTION_CONTENT_GUIDE.md](QUESTION_CONTENT_GUIDE.md) | [QUESTION_SCHEMA.md](QUESTION_SCHEMA.md) | [QUESTION_REWRITE_PLAN.md](QUESTION_REWRITE_PLAN.md)
 
@@ -225,6 +225,66 @@ Better:
 Save me the best couch spot
 ```
 
+## Patch Discipline Checks
+
+Before updating a daily pack, confirm the workflow is patch mode.
+
+Required:
+
+* every failed question has a marked ID
+* every mark has a reason
+* every mark has a fix scope
+* only marked IDs are edited
+* passing IDs are left unchanged
+* metadata is preserved unless metadata failed
+* the report lists marked count, patched count, and remaining flag count
+
+Reject the update if it rewrites passing questions without a mass rewrite exception.
+
+Mass rewrite exception requires:
+
+* more than 60 percent of the weekday or pack fails
+* one shared root cause is named
+* the report explains why patching is worse
+* preserved fields are listed
+
+## Fun But Grounded Checks
+
+Reject daily questions that are fun only because they are random.
+
+Mark as `too_random` or `mechanic_overuse` when the pack overuses:
+
+* snack drafts
+* fake awards
+* mascot jokes
+* couch games
+* dramatic bits
+* random object picks
+* silly phrases that do not fit the prompt
+
+A good daily question should feel playful and usable by adults.
+
+It should not feel like a children's party game, a meme prompt, or a slot machine full of snacks.
+
+## Option Answer Test
+
+For every single-choice question, read the prompt followed by each option.
+
+Each option must sound like a direct answer.
+
+If one option fails, fix that option.
+
+If two or more options fail, rewrite the answer set.
+
+Mark failures as `option_mismatch` or `weird_option`.
+
+Examples that fail:
+
+* Prompt asks for a date move, option is an object.
+* Prompt asks what to do tonight, option is a vague feeling.
+* Prompt asks for a playful choice, option is a chore.
+* Prompt asks for a flirty pick, option is a household task.
+
 ## Repetition Checks
 
 Reject or rewrite if:
@@ -245,9 +305,11 @@ Before approving the full daily pack:
 2. Mark anything therapy-coded, boring, weird, logistical, or not fun.
 3. Fix the marked items.
 4. Run a second random sample from each weekday.
-5. Ship only when the second sample passes cleanly.
+5. Fix only the sampled items that fail.
+6. Run another sample if any sampled item changed.
+7. Ship only when the second clean sample passes and the remaining hard flag count is 0.
 
-The sample must include no chore-heavy answer sets and no weird domestic options.
+The sample must include no chore-heavy answer sets, no weird domestic options, and no random silliness that does not fit the prompt.
 
 ## General Question Checks
 
@@ -277,6 +339,9 @@ Use these reasons when marking weak questions:
 * weird_option
 * weak_weekday_fit
 * filler_question
+* too_random
+* mechanic_overuse
+* patch_scope_violation
 * duplicate_text
 * duplicate_options
 * schema_issue
@@ -286,8 +351,9 @@ Use these reasons when marking weak questions:
 Use one of these labels when reviewing a pack:
 
 * production_ready
+* production_candidate
 * staging_only
 * needs_rewrite
 * reject
 
-Do not call a pack production ready just because the JSON validates.
+Do not call a pack production ready just because the JSON validates. Do not call it production ready while known hard content flags remain.