Closer/seed/questions/QUESTION_QUALITY_CHECKLIST.md

8.0 KiB

Closer Question Quality Checklist v7

See also: QUESTION_CONTENT_GUIDE.md | QUESTION_SCHEMA.md | QUESTION_REWRITE_PLAN.md

Purpose

This checklist prevents technically valid but boring questions from reaching the app.

Passing JSON validation is not enough.

Every question must also feel human, useful, fun, and worth answering.

Automatic Rejects

Reject any question that contains or strongly resembles:

  • Describe...
  • Reflect on...
  • Discuss...
  • Evaluate...
  • In what ways...
  • How satisfied are you...
  • What boundary around...
  • Explore your feelings...
  • Identify the ways...
  • Rate the effectiveness...
  • Communication style
  • Emotional processing
  • Conflict framework
  • Relationship dynamic

These are therapy worksheet patterns.

Daily Pack Hard Checks

For the daily single choice weekday pack, confirm before content review:

  • 500 total questions
  • 75 free questions
  • 425 premium questions
  • every question is single_choice
  • every question has a weekday tag
  • every question has 4 to 6 options
  • 4 options preferred
  • no duplicate IDs
  • no duplicate question text
  • no duplicate exact option lists

Daily Fun Gate

For daily questions, reject anything that is merely useful but not fun.

A daily question must feel like one of these:

  • a game moment
  • a tiny date choice
  • a sweet choice
  • a flirt
  • a silly prompt
  • a playful debate
  • a low-pressure couple moment

Reject daily questions with answer sets built around:

  • clean counters
  • dishes
  • laundry
  • bills
  • appointments
  • errands
  • bedtime planning
  • household maintenance
  • saved blankets
  • clinical reassurance phrasing

These can exist in the real relationship. They should not dominate the daily fun pack.

Bad:

Before phones win, what would make the night nicer?

Bad options:

  • A clean counter
  • A quick shoulder rub
  • A simple bedtime plan
  • The good blanket saved

Better:

Before phones win, what should we do for fun?

Better options:

  • Pick a ridiculous snack
  • Watch one guilty-pleasure clip
  • Trade dramatic compliments
  • Choose tomorrow's tiny date

Research-Informed Daily Fun Checks

A daily question must feel playable, not merely pleasant.

Pass only if the question uses at least one of these:

  • a tiny mission
  • a funny choice
  • a playful debate
  • a flirty pick
  • a cute mini date
  • a snack or treat choice
  • a memory prompt
  • a silly award
  • a low-pressure dare
  • a small surprise

Reject if the question mainly feels like:

  • relationship maintenance
  • emotional homework
  • household management
  • responsible adult planning
  • generic wellness advice
  • a cute phrase with no actual game inside it

Ask this out loud:

Would two tired people still want to tap this for fun tonight?

If not, mark it as not_fun or filler_question and rewrite it.

Research-Informed Option Checks

Options must feel like choices in a game.

Reject options that are:

  • vague, like "something sweet"
  • clinical, like "more reassurance"
  • logistical, like "a bedtime plan"
  • chore-coded, like "a clean counter"
  • oddly phrased, like "the good blanket saved"
  • too similar to each other
  • too different in effort or intimacy

At least 3 out of 4 options should be visibly fun, sweet, flirty, silly, or date-like. If only 1 or 2 options feel fun, rewrite the whole answer set.

Daily Pack Rejects

Reject daily prompts that feel like:

  • therapy homework
  • self-help content
  • HR wellness surveys
  • communication worksheets
  • abstract emotional processing
  • generic AI relationship advice
  • household admin
  • bedtime logistics
  • chore planning

Reject daily questions using these words or phrases:

  • reset
  • process
  • mental load
  • emotional load
  • autopilot
  • pressure
  • soft landing
  • relationship dynamic
  • name the mood
  • emotional processing
  • communication style
  • conflict framework

Daily Option Checks

Every daily option must:

  • answer the exact prompt
  • be a complete answer
  • sound natural
  • be similar in weight to the other options
  • be fun, sweet, playful, flirty, silly, date-like, or warmly specific

Reject options that are:

  • fragments
  • too abstract
  • weirdly specific
  • chore-heavy
  • clinical
  • not connected to the prompt
  • much better or worse than the other options

Bad fragment:

When I need reassurance

Better:

Tell me one thing you liked about today

Bad weird option:

The good blanket saved

Better:

Save me the best couch spot

Patch Discipline Checks

Before updating a daily pack, confirm the workflow is patch mode.

Required:

  • every failed question has a marked ID
  • every mark has a reason
  • every mark has a fix scope
  • only marked IDs are edited
  • passing IDs are left unchanged
  • metadata is preserved unless metadata failed
  • the report lists marked count, patched count, and remaining flag count

Reject the update if it rewrites passing questions without a mass rewrite exception.

Mass rewrite exception requires:

  • more than 60 percent of the weekday or pack fails
  • one shared root cause is named
  • the report explains why patching is worse
  • preserved fields are listed

Fun But Grounded Checks

Reject daily questions that are fun only because they are random.

Mark as too_random or mechanic_overuse when the pack overuses:

  • snack drafts
  • fake awards
  • mascot jokes
  • couch games
  • dramatic bits
  • random object picks
  • silly phrases that do not fit the prompt

A good daily question should feel playful and usable by adults.

It should not feel like a children's party game, a meme prompt, or a slot machine full of snacks.

Option Answer Test

For every single-choice question, read the prompt followed by each option.

Each option must sound like a direct answer.

If one option fails, fix that option.

If two or more options fail, rewrite the answer set.

Mark failures as option_mismatch or weird_option.

Examples that fail:

  • Prompt asks for a date move, option is an object.
  • Prompt asks what to do tonight, option is a vague feeling.
  • Prompt asks for a playful choice, option is a chore.
  • Prompt asks for a flirty pick, option is a household task.

Repetition Checks

Reject or rewrite if:

  • too many questions start the same way
  • the same option text appears too often
  • the same situation repeats with different nouns
  • the same answer pattern repeats
  • the weekday starts to feel like wallpaper

The pack can pass duplicate checks and still fail repetition review.

Daily Sample Gate

Before approving the full daily pack:

  1. Read 10 random questions from each weekday.
  2. Mark anything therapy-coded, boring, weird, logistical, or not fun.
  3. Fix the marked items.
  4. Run a second random sample from each weekday.
  5. Fix only the sampled items that fail.
  6. Run another sample if any sampled item changed.
  7. Ship only when the second clean sample passes and the remaining hard flag count is 0.

The sample must include no chore-heavy answer sets, no weird domestic options, and no random silliness that does not fit the prompt.

General Question Checks

Every question must pass:

  • Would a real couple answer this willingly?
  • Is it easy to understand on the first read?
  • Does it create a conversation, laugh, flirt, memory, plan, or useful preference?
  • Are the answer options balanced?
  • Is the wording natural out loud?
  • Would this feel premium in the app?

Marking Reasons

Use these reasons when marking weak questions:

  • therapy_voice
  • wellness_voice
  • household_admin
  • not_fun
  • abstract_prompt
  • awkward_split_phrase
  • repeated_stem
  • option_mismatch
  • fragment_options
  • too_generic
  • weird_option
  • weak_weekday_fit
  • filler_question
  • too_random
  • mechanic_overuse
  • patch_scope_violation
  • duplicate_text
  • duplicate_options
  • schema_issue

Final Verdict Labels

Use one of these labels when reviewing a pack:

  • production_ready
  • production_candidate
  • staging_only
  • needs_rewrite
  • reject

Do not call a pack production ready just because the JSON validates. Do not call it production ready while known hard content flags remain.