diff --git a/seed/questions/QUESTION_QUALITY_CHECKLIST.md b/seed/questions/QUESTION_QUALITY_CHECKLIST.md
index f6b99824..8eb40315 100644
--- a/seed/questions/QUESTION_QUALITY_CHECKLIST.md
+++ b/seed/questions/QUESTION_QUALITY_CHECKLIST.md
@@ -1,4 +1,4 @@
-# Closer Question Quality Checklist v6
+# Closer Question Quality Checklist v7
 
 **See also:** [QUESTION_CONTENT_GUIDE.md](QUESTION_CONTENT_GUIDE.md) | [QUESTION_SCHEMA.md](QUESTION_SCHEMA.md) | [QUESTION_REWRITE_PLAN.md](QUESTION_REWRITE_PLAN.md)
 
@@ -225,6 +225,66 @@ Better: Save me the best couch spot
 ```
 
 
+## Patch Discipline Checks
+
+Before updating a daily pack, confirm the workflow is patch mode.
+
+Required:
+
+* every failed question has a marked ID
+* every mark has a reason
+* every mark has a fix scope
+* only marked IDs are edited
+* passing IDs are left unchanged
+* metadata is preserved unless metadata failed
+* the report lists marked count, patched count, and remaining flag count
+
+Reject the update if it rewrites passing questions without a mass rewrite exception.
+
+Mass rewrite exception requires:
+
+* more than 60 percent of the weekday or pack fails
+* one shared root cause is named
+* the report explains why patching is worse
+* preserved fields are listed
+
+## Fun But Grounded Checks
+
+Reject daily questions that are fun only because they are random.
+
+Mark as `too_random` or `mechanic_overuse` when the pack overuses:
+
+* snack drafts
+* fake awards
+* mascot jokes
+* couch games
+* dramatic bits
+* random object picks
+* silly phrases that do not fit the prompt
+
+A good daily question should feel playful and usable by adults.
+
+It should not feel like a children's party game, a meme prompt, or a slot machine full of snacks.
+
+## Option Answer Test
+
+For every single-choice question, read the prompt followed by each option.
+
+Each option must sound like a direct answer.
+
+If one option fails, fix that option.
+
+If two or more options fail, rewrite the answer set.
+
+Mark failures as `option_mismatch` or `weird_option`.
+
+Examples that fail:
+
+* Prompt asks for a date move, option is an object.
+* Prompt asks what to do tonight, option is a vague feeling.
+* Prompt asks for a playful choice, option is a chore.
+* Prompt asks for a flirty pick, option is a household task.
+
 ## Repetition Checks
 
 Reject or rewrite if:
@@ -245,9 +305,11 @@ Before approving the full daily pack:
 2. Mark anything therapy-coded, boring, weird, logistical, or not fun.
 3. Fix the marked items.
 4. Run a second random sample from each weekday.
-5. Ship only when the second sample passes cleanly.
+5. Fix only the sampled items that fail.
+6. Run another sample if any sampled item changed.
+7. Ship only when the second clean sample passes and the remaining hard flag count is 0.
 
-The sample must include no chore-heavy answer sets and no weird domestic options.
+The sample must include no chore-heavy answer sets, no weird domestic options, and no random silliness that does not fit the prompt.
 
 ## General Question Checks
 
@@ -277,6 +339,9 @@ Use these reasons when marking weak questions:
 * weird_option
 * weak_weekday_fit
 * filler_question
+* too_random
+* mechanic_overuse
+* patch_scope_violation
 * duplicate_text
 * duplicate_options
 * schema_issue
@@ -286,8 +351,9 @@ Use these reasons when marking weak questions:
 Use one of these labels when reviewing a pack:
 
 * production_ready
+* production_candidate
 * staging_only
 * needs_rewrite
 * reject
 
-Do not call a pack production ready just because the JSON validates.
+Do not call a pack production ready just because the JSON validates. Do not call it production ready while known hard content flags remain.
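
The Patch Discipline Checks added in this diff are mechanical enough to sketch in code. The Python below is a minimal illustration, not part of the repository: the `Mark` dataclass, the dict-of-questions shape (question ID mapped to question content), and the function names `check_patch_discipline` and `mass_rewrite_allowed` are all assumptions made for the example. It covers the marked-only editing rule and the mass rewrite exception; metadata preservation and report counts are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mark:
    """Hypothetical record for one failed question."""
    question_id: str
    reason: str      # e.g. "too_random", "option_mismatch"
    fix_scope: str   # e.g. "options", "prompt"


def mass_rewrite_allowed(failed_count, total_count, root_cause,
                         why_patching_is_worse, preserved_fields):
    """Apply the four mass rewrite exception requirements from the checklist."""
    if total_count <= 0:
        return False
    # "More than 60 percent" is strict; integer math avoids float rounding.
    over_threshold = failed_count * 100 > 60 * total_count
    return (over_threshold
            and bool(root_cause)
            and bool(why_patching_is_worse)
            and bool(preserved_fields))


def check_patch_discipline(before, after, marks, exception_granted=False):
    """Return a list of violation strings; an empty list means the patch passes.

    `before` and `after` map question IDs to question content (assumed shape).
    """
    violations = []
    for mark in marks:
        if not mark.reason:
            violations.append(f"{mark.question_id}: mark has no reason")
        if not mark.fix_scope:
            violations.append(f"{mark.question_id}: mark has no fix scope")

    marked_ids = {mark.question_id for mark in marks}
    changed_ids = {qid for qid in before if after.get(qid) != before[qid]}

    # Only marked IDs may be edited unless a mass rewrite exception applies.
    if not exception_granted:
        for qid in sorted(changed_ids - marked_ids):
            violations.append(f"{qid}: patch_scope_violation")
    return violations
```

A reviewer script could call `check_patch_discipline` with the pack before and after the patch and reject the update whenever the returned list is non-empty, passing `exception_granted=True` only when `mass_rewrite_allowed` returns `True`.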