Building in Public: Day 22 - The Quality Fix That Found Gold

The Plan: Fix Story Quality

After Day 21’s discovery that our stories had only 1-6 choices per path (we need 10+ for a credible MBTI analysis), we had a clear plan:

Phase 1: Update the story generator

Enforce minimum 10 choices per path
Add automated validation
Integrate image optimization
Test with a new story

Simple, right? Update some prompts, run validation, done.

What actually happened: We got that, plus an unexpected 75% content reduction that solved problems we didn’t even know we had.

Part 1: The Implementation

Morning Work: Building the Validation System

First step: Make it impossible to generate invalid stories.

Updated the AI prompts (generate_story_json.py):

ABSOLUTE REQUIREMENTS (NON-NEGOTIABLE):
- MINIMUM 10 CHOICE NODES PER PATH
- All 4 MBTI dimensions tested 2-3 times per path
- Structure: 4 early choices + 4 mid choices + 3 late choices = 11 per path

No more “please” or “preferably.” Just hard requirements.

Built the validation tool (validate_story_paths.py):

Traces every possible path through the story
Counts choices along each path
Validates MBTI dimension coverage
Reports: ✅ PASS or ❌ FAIL with specific issues
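For readers who want to picture the path tracing, here is a minimal sketch of the idea, not the actual validate_story_paths.py. It assumes a hypothetical story schema (a dict of node ids, each node carrying a "type" and a list of "options" with a "next" id and, on choice nodes, a "dimension" label) and an acyclic story graph. The function names, the "start" node id, and the field names are all illustrative assumptions.

```python
from collections import Counter

MIN_CHOICES = 10
DIMENSIONS = ("E/I", "S/N", "T/F", "J/P")


def iter_paths(story, node_id, choices=0, dims=None):
    """Yield (ending_id, choice_count, dimension_counts) for every path.

    Assumes the story graph has no cycles; a loop would recurse forever.
    """
    dims = dims if dims is not None else Counter()
    node = story[node_id]
    if node["type"] == "ending":
        yield node_id, choices, dims
        return
    for option in node["options"]:
        next_choices, next_dims = choices, dims.copy()
        if node["type"] == "choice":
            # Only choice nodes count toward the minimum and the MBTI coverage.
            next_choices += 1
            next_dims[option["dimension"]] += 1
        yield from iter_paths(story, option["next"], next_choices, next_dims)


def validate_story(story, start="start", min_choices=MIN_CHOICES):
    """Return (total_paths, failures), where failures lists (ending_id, issues)."""
    total, failures = 0, []
    for ending, choices, dims in iter_paths(story, start):
        total += 1
        issues = []
        if choices < min_choices:
            issues.append(f"only {choices} choices (need {min_choices}+)")
        missing = [d for d in DIMENSIONS if dims[d] == 0]
        if missing:
            issues.append("untested dimensions: " + ", ".join(missing))
        if issues:
            failures.append((ending, issues))
    return total, failures


# Tiny hypothetical story: one choice, two endings. Both paths should fail.
demo_story = {
    "start": {"type": "choice", "options": [
        {"next": "a", "dimension": "E/I"},
        {"next": "end_b", "dimension": "E/I"},
    ]},
    "a": {"type": "normal", "options": [{"next": "end_a"}]},
    "end_a": {"type": "ending"},
    "end_b": {"type": "ending"},
}

total, failures = validate_story(demo_story)
print("PASS" if not failures else f"FAIL: {len(failures)}/{total} paths")
```

Enumerating every path is fine for small story graphs. If paths converge and the count explodes, a per-node minimum computed with memoization would be the cheaper check.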

Tested it on Frankenstein:

📊 RESULTS:
- Total paths: 10
- FAILED paths: 10/10 (100%)
- Choices per path: min=1, max=6, avg=3.5

❌ VALIDATION FAILED - Story needs regeneration

The validation system worked perfectly. It caught exactly what we found on Day 21.

Added image optimization:

Automatic WebP conversion after each image generation
95%+ size reduction (2MB PNG → 0.1MB WebP)
No manual optimization needed

By noon: All three improvements implemented and tested. Ready for Phase 1.4.

Part 2: The Test Story

Afternoon Work: Generate Great Gatsby

We needed to prove the updated generator actually works. Generated a complete story from scratch:

“The Great Gatsby” by F. Scott Fitzgerald

49 nodes (16 choice nodes, 17 endings, 16 normal nodes)
Non-convergent tree structure
Full MBTI tracking

Ran validation:

✅ Found 1,408 complete paths

Choices per path:
  Min: 10 ✓
  Max: 10
  Avg: 10.0

MBTI Coverage (per path):
  E/I: 10 measurements ✓
  S/N: 10 measurements ✓
  T/F: 10 measurements ✓
  J/P: 10 measurements ✓

✅ VALIDATION PASSED

Perfect. The generator works. Phase 1.4 complete.

Time to document the changes and move to Phase 2—

The User Question That Changed Everything

Then I reviewed the Great Gatsby JSON to write documentation.

Each ending node had this:

"ending_warning_attempt": {
  "likelyPersonalities": ["ESFJ", "ENFJ", "ISFJ"],
  "personalityDescriptions": {
    "ISTJ": "**The Methodical Observer**: ...",
    "ISFJ": "**The Devoted Guardian**: ...",
    "INFJ": "...",
    "INTJ": "...",
    // ... 12 more descriptions ...
  }
}

A user asked: “Are all 16 personality descriptions actually needed? The ‘likely personalities’ has only three possibilities.”

Wait.

likelyPersonalities says only 3 types are likely to reach this ending.

But we’re including descriptions for all 16 types.

That’s 13 unused descriptions… per ending… times 17 endings…

The Math

Current approach:

16 MBTI types per ending
17 endings in Great Gatsby
272 personality descriptions total
But only ~51 descriptions actually used (3 per ending)
221 descriptions completely wasted

Across the 3 existing story packs:

Frankenstein: 12 endings × 16 types = 192 descriptions
Pride & Prejudice: 12 endings × 16 types = 192 descriptions
Jekyll & Hyde: Similar structure
Total: ~560 descriptions
Actually used: ~140 descriptions
Waste: ~420 unused descriptions (75%)

And this waste cascades:

Roughly 4x the content to generate
Roughly 4x the content to translate to Korean
Larger JSON files, with 75% of the description text never shown
Roughly 4x the descriptions to maintain and update

This was why the Gemini API kept hitting token limits. The files were just too bloated.

The Fix

Simple change:

// NEW approach
"ending_warning_attempt": {
  "likelyPersonalities": ["ESFJ", "ENFJ", "ISFJ"],
  "personalityDescriptions": {
    "ESFJ": "**The Loyal Protector**: Your immediate instinct...",
    "ENFJ": "**The Compassionate Hero**: Your attempt to save...",
    "ISFJ": "**The Devoted Guardian**: Your quiet loyalty...",
    "GENERIC": "**The Brave Heart**: Your instinct to warn..."
  }
}

Only include:

Descriptions for types in likelyPersonalities (3-4 types)
One GENERIC fallback for edge cases
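The pruning rule above could be applied to existing endings with something like the following sketch. It assumes each ending is a dict shaped like the JSON shown earlier. The helper name prune_descriptions and its generic_text parameter are hypothetical, not part of the actual generator.

```python
def prune_descriptions(ending, generic_text=None):
    """Return a copy of an ending that keeps only likely-type descriptions plus GENERIC.

    `ending` is assumed to have "likelyPersonalities" (list of MBTI codes) and
    "personalityDescriptions" (dict of code -> text), as in the JSON above.
    """
    likely = ending["likelyPersonalities"]
    descriptions = ending["personalityDescriptions"]

    # Keep descriptions only for the types expected to reach this ending.
    kept = {t: descriptions[t] for t in likely if t in descriptions}

    # Reuse an existing GENERIC entry if present, else the supplied text.
    generic = descriptions.get("GENERIC", generic_text)
    if generic is None:
        raise ValueError("no GENERIC description present and none supplied")
    kept["GENERIC"] = generic

    # Build a new dict so the original ending is left untouched.
    return {**ending, "personalityDescriptions": kept}
```

Raising when no GENERIC text is available is a deliberate choice in this sketch: the fallback chain relies on that entry, so a silent omission would only surface later for a user with an unlikely type.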

JavaScript handles the fallback:

const description = descriptions[userType] ||
                    descriptions['GENERIC'] ||
                    `**${userType}**: Your unique choices...`;

If someone gets an unlikely type, they see the generic description. No broken experience.

The Impact

Content reduction:

Great Gatsby: 272 descriptions → 68 descriptions (75% reduction)
Across all stories: ~560 → ~140 descriptions (75% reduction)
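For anyone checking where 68 comes from, the Great Gatsby figure works out if each ending keeps its 3 likely types plus the GENERIC fallback. A quick sanity check of that arithmetic, assuming exactly 3 likely types per ending (some endings may list 4, which would shift the numbers slightly):

```python
MBTI_TYPES = 16
ENDINGS = 17           # Great Gatsby
LIKELY_PER_ENDING = 3  # assumption: matches the example ending shown above

before = MBTI_TYPES * ENDINGS               # every type described at every ending
after = (LIKELY_PER_ENDING + 1) * ENDINGS   # likely types + one GENERIC per ending
reduction = 1 - after / before

print(f"{before} -> {after} descriptions ({reduction:.0%} reduction)")
```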

Benefits unlocked:

✅ Gemini API can now generate complete stories (no more token-limit failures)
✅ Faster page loads (smaller JSON files)
✅ 75% less translation work for Korean versions
✅ Easier maintenance (fewer descriptions to update)
✅ Better UX (likely types see descriptions written for that ending; everyone else gets a sensible fallback)

Updated documentation:

CLAUDE_StoryPack.md now specifies optimized approach
personality-calculator.js has fallback chain
Future stories automatically use lean structure

The Learning

We set out to fix one problem: Not enough choices per path.

We got:

✅ Fixed the choice count problem (10+ per path enforced)
✅ Added automated validation (catches issues before deployment)
✅ Integrated image optimization (95% size reduction)
✅ Discovered 75% content optimization (solves 4 different problems)

Sometimes the best optimizations come from user questions, not from planning.

I spent days building content generation pipelines, never questioning if we needed all 16 types per ending. It seemed like “more complete” was better.

A user asked one simple question, and it revealed massive waste.

What’s Next

Phase 1: COMPLETE ✅

Prompt engineering enhancement ✓
Path validation logic ✓
Image optimization ✓
Testing strategy ✓
Personality optimization ✓

Phase 2: Regenerate Existing Stories (Next)

Jekyll & Hyde with 10+ choices + optimized personalities
Frankenstein with 10+ choices + optimized personalities
Pride & Prejudice with 10+ choices + optimized personalities
Update Korean translations (75% less work now!)
Validate all stories

Timeline: ~5 hours for Phase 2, ready for marketing by Day 33.

Building in Public: The Value

Day 21 taught us: Building in public prevents launching broken products.

Day 22 taught us: User feedback finds optimizations you’d never see yourself.

Without sharing the work publicly, I would have:

Kept generating bloated JSONs
Struggled with Gemini token limits
Translated roughly 4x as much text as needed
Never questioned the 16-type approach

One user question revealed gold.

This is why we build in public.

Progress: Day 22/100
Status: Phase 1 complete, Phase 2 next
Next post: Regenerating all 3 story packs with quality fixes

Read the full story on GitHub