Jun 2026
Verification upgrade
Cross-Content Consistency Gate added — catches contradictions between pages
Added a new verification gate (Gate 1C, KS-SV1-46) that checks every claim in new content against all existing live content for contradictions. Each piece of content was already verified against its own sources individually — but two pages could each be grounded in real research while still contradicting each other if the interpretation drifted. This gate catches that drift before it goes live. Triggered by discovering two live Shorts making conflicting claims about the same tissue: one called muscle "the strongest predictor of calorie burn" (an interpretation drift from the source, which said "fat-free mass"), while the other correctly noted muscle is "metabolically quiet at rest" at 6 calories per pound per day. Both were individually grounded in real studies. Neither was wrong about its own source. But the reader saw two pages on the same site saying opposite things. The contradiction has been corrected, and the new gate prevents any future content from shipping with cross-page inconsistencies.
Manual audit discovered two live Shorts (KW-035 muscle-does-not-turn-to-fat and KW-015 muscle-calories-at-rest) making contradictory claims about muscle's metabolic contribution. Each was individually grounded, but they conflicted because one had drifted from its source's verbatim wording.
Jun 2026
Verification upgrade
Body-Composition Filter now enforced across all content tiers including Shorts
The Body-Composition Filter — which ensures every FitChef page changes how you eat, train, or build your body rather than providing medical reassurance — was already enforced at study tier (C0 Cluster Architect) and claim tier (Claim Rule 8). It was NOT enforced in the Shorts pipeline. Two high-demand keywords passed every existing gate (separation, evidence mapping, competition assessment) but their honest answers were medical reassurance: 'your kidneys are fine' and 'it's safe long-term.' Both killed. SK0 (Keyword Scout) now has a mandatory Body-Composition Filter gate that catches these keywords before they enter the production queue. SF1 (Short Fueler) has a safety-net gate that catches any that slipped through older bank versions. Editorial Foundation Rule 9 — the filter applies at every tier identically — is now mechanically enforced end-to-end.
Keyword bank audit found KW-021 ('is creatine safe long term') and KW-024 ('will too much protein damage kidneys') in the production queue. Both had massive search demand and zero competition — but both fail the Body-Composition Filter because their answers are safety reassurance, not behavioral change. The protein cluster build plan had already excluded the kidney topic at study tier with cut_reason 'medical_reassurance.'
Jun 2026
Pattern fix
Guide headline stats now provably match the studies on the page
A guide's headline numbers — how many studies and how many participants — must now be computed from that guide's own analyzed studies, not borrowed from any single underlying question. During an internal audit we found one guide whose 'total participants' figure had been pulled from a single claim's evidence set, most of it from one large study, which made the headline unrepresentative of the guide as a whole. We corrected the figure to reflect the guide's actual study set, and added a verification step that blocks any headline stat that can't be reconstructed from the studies shown on the page.
Internal audit found the Carbs guide's hero stat reported a participant total lifted from one claim (dominated by a single 68,128-participant study) instead of the guide's own studies.
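The reconstruction check described in this entry could look roughly like the sketch below. It assumes each study shown on a guide carries an id and a participant count; the `Study` class and both function names are hypothetical and are not taken from the FitChef pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Study:
    # Hypothetical record for one study analyzed on a guide page.
    study_id: str
    participants: int


def reconstruct_headline(studies):
    """Recompute (study count, participant total) from the guide's own studies."""
    unique = {s.study_id: s for s in studies}  # count each study once
    return len(unique), sum(s.participants for s in unique.values())


def headline_matches(studies, claimed_studies, claimed_participants):
    """True only if the headline stat can be rebuilt from the studies shown."""
    return reconstruct_headline(studies) == (claimed_studies, claimed_participants)
```

A headline that borrows a total from a single claim's evidence set would fail `headline_matches`, because the recomputed sum would not equal the claimed figure.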
Jun 2026
Bugfix
Removed a misleading “extra” from an ultra-processed-food hero stat
The study's headline read '340 extra calories.' The 340 figure is grounded — it is how many calories the measured eating rate delivers in a 20-minute meal — but the word 'extra' implied it was the between-diet difference, which is actually 508 calories per day (stated correctly in the key finding). We removed 'extra' so the headline number is unambiguous.
Internal audit found the hero stat labeled a 20-minute intake figure as 'extra' calories (Hall 2019).
Jun 2026
Bugfix
Corrected a meta-analysis's “participants” figure that was actually its study count
On the sugar-and-body-weight study page and a claim that cited it, the 'Participants' figure read 68 — but 68 is the number of studies the meta-analysis pooled (30 randomised trials plus 38 prospective cohort studies), not participants. The source paper reports per-analysis effect sizes, not a single participant total, so we removed the incorrect figure rather than substitute an invented one.
Internal audit traced the study page's 'Participants: 68' to the meta-analysis's pooled study count (Te Morenga 2013, BMJ).
Jun 2026
Fix
Removed unverifiable sample size from Schoenfeld 2017 load meta-analysis
The Schoenfeld 2017 study page stated '684 participants' as the pooled sample size across 21 studies. During the deep pre-scale audit, we verified this number against the full paper (J Strength Cond Res 31(12):3508-3523): the paper does not report a pooled participant total. Table 1 lists per-study sample sizes summing to 630, but this includes control groups excluded from the meta-analysis — the actual analyzed N is lower and unreported. 684 appeared nowhere in the paper text, abstract, or tables. Rather than replace it with a derived estimate, we removed the participant count entirely and retained '21 studies' as the scope descriptor, consistent with what the paper itself reports. The training library flagship participant total was updated accordingly (9,350 → 8,666). A cross-reference in the Zhang 2025 exercise-deficit study was also corrected.
Deep pre-scale quality audit (Check 1: study number traceability). Verified 17,599 numbers across 224 published pieces; Schoenfeld 684 was one of two medium findings.
Jun 2026
Fix
Corrected protein badge to match EU nutrition claim threshold
The Stuffed Portobello Mushrooms recipe displayed a 'High Protein' badge, but protein provided only 16.7% of total energy — below the EU EFSA threshold of 20% required for 'high protein' claims (Regulation 1924/2006). The badge was changed to '35g Protein' (a factual statement, not a regulated claim). The recipe pipeline spec (RR2) was updated with mandatory EFSA threshold checks so future recipes are automatically verified before a nutrition badge is assigned.
Deep pre-scale quality audit (Check 17: EFSA health-claim compliance).
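The threshold check in this entry is simple arithmetic: protein energy divided by total energy, compared with 20%. The sketch below assumes the standard conversion of 4 kcal per gram of protein. The function names and the 838 kcal total in the usage note are illustrative assumptions, since the entry does not state the recipe's total energy.

```python
KCAL_PER_G_PROTEIN = 4          # standard energy conversion factor for protein
HIGH_PROTEIN_MIN_SHARE = 0.20   # threshold cited in the entry above


def protein_energy_share(protein_g, total_kcal):
    """Fraction of total energy that comes from protein."""
    if total_kcal <= 0:
        raise ValueError("total_kcal must be positive")
    return protein_g * KCAL_PER_G_PROTEIN / total_kcal


def protein_badge(protein_g, total_kcal):
    """Return the regulated claim only when the threshold is met,
    otherwise fall back to a plain factual gram statement."""
    if protein_energy_share(protein_g, total_kcal) >= HIGH_PROTEIN_MIN_SHARE:
        return "High Protein"
    return f"{protein_g:g}g Protein"
```

With a hypothetical total of 838 kcal, `protein_badge(35, 838)` falls back to the gram statement, because 140 kcal of protein is about 16.7% of the total.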
Jun 2026
Verification upgrade
New Rule: Zero Fabrication on Import Fix
Added Rule 24 to the pipeline master instructions. When fixing import validation errors (type mismatches, missing fields), the system must never populate fields with inferred or derived content. Only three actions are permitted: type conversion with empty values, direct verbatim copy from a grounded source file with explicit tracing, or mechanical transforms specified by the pipeline spec. Every non-empty value must trace to an exact source file and field path — otherwise the field stays empty. This closes a gap where reasonable-sounding inferences could bypass the grounding requirement.
During a post-pipeline type-mismatch fix, the system fabricated inclusion criteria by inferring from sample characteristics instead of using empty containers. Caught during import review.
Jun 2026
Bugfix
Pull-quote attribution now matches the actual source
The pull-quote blockquote (the shareable sentence shown mid-article in Shorts) attributed the finding to the first study in the fuel list regardless of which study the sentence actually described. For multi-source Shorts this could display the wrong author. A new field now explicitly maps each pull-quote to its correct source, and a kill switch (KS-SV1-34) verifies the match before any Short ships.
Shorts Scale Audit discovered that the nighttime-carbs Short attributed Sofer et al. 2011's finding (28% more weight loss with dinner carbs) to Gardner et al. 2018 (a different study that happened to be first in the fuel array). A second mismatch was found in the processed-food-speed-trap Short (Forde finding attributed to Hall).
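One way to picture the fix is a lookup by explicit source id rather than by position in the fuel list. The field names below (`fuel`, `id`, `author`, `pull_quote_source_id`) are assumptions made for illustration; the entry does not name the new field.

```python
def pull_quote_source_ok(short):
    """True if the pull-quote names a source id that exists in the fuel list."""
    source_id = short.get("pull_quote_source_id")
    fuel_ids = [item["id"] for item in short.get("fuel", [])]
    return source_id is not None and source_id in fuel_ids


def pull_quote_attribution(short):
    """Resolve the attribution from the mapped id, never from fuel[0]."""
    if not pull_quote_source_ok(short):
        raise ValueError("pull-quote is not mapped to a source in the fuel list")
    by_id = {item["id"]: item for item in short["fuel"]}
    return by_id[short["pull_quote_source_id"]]["author"]
```

Raising on a missing or unknown id, instead of defaulting to the first fuel item, is what keeps the old failure mode from returning silently.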
Jun 2026
Bugfix
LLM citation hint now shows actual answer text
The 'AI systems — cite as:' line in the Cite This Short block was showing a CSS selector (.fc-short-takeaway) instead of the actual citation-ready answer for 5 Shorts, and was missing entirely for 18 Shorts. All 27 Shorts now display the correct answer capsule text that AI systems can use for accurate citation.
Shorts Scale Audit discovered 5 Shorts rendering '.fc-short-takeaway' as the LLM citation hint and 18 Shorts with no hint at all. Root cause: SC1 spec described the field ambiguously (CSS selector vs text), and SV1 never included fcc_short_speakable in its field template.
Jun 2026
Verification upgrade
Data Register Verification added to Shorts pipeline
Added five register-level thinking rules (Paste-Back Test, Feelability Gate, Researcher-as-Subject, Authority-Label, Compound-Noun) to SW1 (writer) and a new Gate 6C to SV1 (verifier). These catch sentences that pass vocabulary checks but read like study-results prose rather than FitChef story voice. The existing Gate 6B caught jargon words; Gate 6C catches the register — the frame in which data is presented.
Protein-before-bed Short (KW-001) passed all existing gates but contained '+8.4 cm² vs +4.8 cm²', 'Trommelen and colleagues tracked', and 'Type II muscle fibers' — structurally sound vocabulary but study-report register.
May 2026
Bugfix
Statistical detail rendering — full numbers now visible on all claim pages
Fixed a rendering gap where 34 of 45 claim pages showed study names in the Statistical Detail accordion but not the actual numbers (effect sizes, p-values, heterogeneity). The theme template now renders both the canonical format and the legacy flat-key format. Pipeline specs (CR7, CR12) updated to mandate the correct structure for all future claims.
CL-002 production run revealed statistical numbers missing from the accordion. Deep audit confirmed 34 of 45 claims affected.
May 2026
Improvement
Sidebar participant counts corrected after cross-system audit
An independent AI review flagged discrepancies between article body numbers and sidebar Evidence Base widgets. Investigation traced the root cause: the agent building sidebar data had no mandated source for participant counts, leading to 6 values across 3 clusters that could not be traced to verified study extractions. Every article body number was confirmed grounded — the issue was isolated to the sidebar rendering layer. All 6 values have been corrected to match verified extraction data, and a new kill switch (KS-LIB-11) now requires every sidebar participant count to be sourced from the same grounded extraction files that feed the article body.
Independent AI cross-check flagged body-sidebar number mismatches
May 2026
Improvement
New blocking gate prevents sidebar-body number mismatches
Added KS-LIB-12: a mandatory cross-check that compares sidebar participant totals against the verified fuel file breakdown before any library page can be imported. If the sidebar sum does not match the article body total, the import is blocked until the discrepancy is resolved. This gate makes it structurally impossible for sidebar and body numbers to diverge — the same grounded data must feed both.
Number consistency audit revealed no existing gate caught the mismatch
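A minimal sketch of the cross-check this entry describes, assuming both the sidebar and the fuel file expose per-study participant counts keyed by study name. The function name and data shapes are hypothetical.

```python
def sidebar_body_check(sidebar_counts, fuel_breakdown):
    """Compare the sidebar participant sum with the fuel-file total.

    Both arguments map a study name to a participant count.
    The import is blocked whenever the two totals differ.
    """
    sidebar_total = sum(sidebar_counts.values())
    body_total = sum(fuel_breakdown.values())
    return {
        "sidebar_total": sidebar_total,
        "body_total": body_total,
        "blocked": sidebar_total != body_total,
    }
```

Returning both totals alongside the verdict makes a blocked import easier to debug than a bare boolean would.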
May 2026
Bugfix
Study participant counts standardized to prevent silent parsing errors
Audit discovered that some study pages stored participant counts as freeform text (e.g., '49 studies, 1863 participants' or 'n=20 (10 per group)'). The rendering system extracted the first number it found — sometimes reading a study count as a participant count, or reading zero from text starting with a letter. Seven study imports and three claim imports have been corrected to clean integers, each verified against the original study's grounded extraction data.
Number consistency audit found PHP integer casting silently misreading freeform sample size values
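The failure mode in this entry can be reproduced with a small emulation. The first function approximates a lenient cast that reads only a leading integer and returns zero otherwise, which is how the freeform strings quoted above were misread. The second is one possible strict replacement. Both function names are illustrative and are not part of the FitChef codebase.

```python
import re


def lenient_leading_int(text):
    """Approximate a lenient string-to-integer cast:
    keep a leading integer if there is one, otherwise return 0."""
    match = re.match(r"\s*([+-]?\d+)", text)
    return int(match.group(1)) if match else 0


def strict_participant_count(value):
    """Accept only a clean non-negative integer; reject freeform text."""
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"participant count must be an integer, got {value!r}")
    if value < 0:
        raise ValueError("participant count cannot be negative")
    return value
```

Under the lenient reading, '49 studies, 1863 participants' yields 49 (a study count) and 'n=20 (10 per group)' yields 0, which matches the two symptoms the audit describes.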
May 2026
Verification upgrade
Four-category jargon sweep added to Shorts verification
The Short Verifier now scans every Short and its audio script for four categories of technical language: statistical notation, biomedical terms, exercise-science jargon, and paper-apparatus vocabulary. Any sentence a smart 8-year-old wouldn't understand on first read is caught and sent back for rewriting before it can publish. FAQ answers get the same sweep — no p-values or confidence intervals in reader-facing dropdowns.
Quality audit of first three Shorts found jargon leaks that passed verification: statistical notation in prose, untranslated clinical terms in audio.
May 2026
Verification upgrade
Audio scripts now grounding-verified against source research
Every number, claim, and finding in a Short's audio script must now trace back to the grounded fuel files — the same verified research that powers the written content. Audio translates verified content into spoken-word format; it can never add new claims, regardless of whether the claim is true. The verifier checks audio against fuel files before the Short can publish.
Architecture review found Shorts audio had no grounding constraint — the study pipeline's audio agent (R4) had this rule but the Shorts pipeline did not.
May 2026
Bugfix
Citation links now verified against actual published page URLs
The Citability Engineer and Short Verifier now verify every source citation link against the actual published page URL from the import system. Previously, citation URLs could use internal folder names instead of WordPress page slugs, producing broken links. The verifier also ensures external sources with DOIs always have clickable links — readers can click through to verify every cited study.
Review of first three Shorts found four broken citation URLs (wrong slugs) and two external sources with DOIs that had no clickable link.
May 2026
Verification upgrade
Universal Citation Verification Gate — every study now verified before entering the pipeline
The citation verification gate (KS-C0-CITATION) now requires every study — whether discovered by Claude or suggested by Gemini — to be verified via WebSearch or WebFetch before it can enter the cluster plan. Previously, only Gemini-suggested studies required verification. This upgrade was triggered when a hallucinated citation ('Ravelli 2019') was detected in the fat-loss cluster plan — the paper did not exist in any indexed source. The real paper had a different author, year, and journal. The hallucination never reached any published page (caught before the study entered the pipeline), but the gap that allowed it to enter the plan at all has now been closed. Every citation in every cluster plan is now tool-call verified.
Hallucinated citation detected in fat-loss cluster architecture during production queue review
May 2026
Verification upgrade
Answer capsule standalone gate now catches title-dependent openers
The answer capsule appears in featured snippets, AI citations, and the answer hero card — all contexts where the page title may be absent. The standalone gate previously verified that the answer named its subject and embedded its purpose, but did not check whether the opening word was a direct response to the headline question. An answer starting with 'No —' or 'Yes —' only makes sense if the reader just read a yes/no question above it. In standalone surfaces, there is no question. The gate now includes a fifth test: the first word must make sense without a question preceding it.
CL-005 test-boosters-mostly-scam-one-exception — answer_short 'No — three out of four ingredients...' passed all four existing tests but opened with a title-dependent 'No —' that has no referent in standalone contexts
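The fifth test could be approximated mechanically by checking the first word of the answer capsule against a short list of openers that only make sense after a question. The opener list here covers just the two words named in the entry, and the function name is hypothetical.

```python
import re

# Openers that only make sense directly after a yes/no question.
TITLE_DEPENDENT_OPENERS = {"yes", "no"}


def opens_title_dependent(answer):
    """True if the answer's first word relies on a preceding question."""
    match = re.match(r"\s*([A-Za-z']+)", answer)
    return bool(match) and match.group(1).lower() in TITLE_DEPENDENT_OPENERS
```

Matching on the whole first word, not a prefix, avoids false positives on openers such as "Nobody" or "Yesterday".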
May 2026
Verification upgrade
New blocking gate: Answer Capsule Standalone Verification
Added Step 3E to the Content Skeptic (CR10) — a blocking gate that verifies the answer capsule works as a standalone statement without the page title. The answer capsule appears in Google Featured Snippets, AI citations (ChatGPT, Perplexity), and the answer hero card — contexts where the title may be absent. The gate checks that the text names its subject and embeds its purpose. Previously, if the answer capsule writer (CR4) falsely logged a standalone PASS, no downstream agent caught it. On CL-003, the answer started with 'Yes — but the boost is smaller than it feels' — which doesn't tell the reader WHAT is being discussed without the title above it. The gate now catches this pattern before publication.
CL-003 preworkout-caffeine-small-real-edge — answer_short standalone failure propagated through entire pipeline undetected
May 2026
Verification upgrade
Doctor’s-letter test expanded to body prose — 4 prescriptive sentences caught and fixed
A full legal audit of all 60 import JSONs across 4 clusters found 4 sentences in claim body prose that read like dietary prescriptions rather than evidence reporting — all using 'aim for' with specific gram or frequency targets. The verification gate (CR10 Step 3A2) already scanned persona actions, FAQs, and skeptic notes for this pattern, but body prose was not included in the scan. All 4 sentences have been rewritten to describe what the evidence found rather than tell the reader what to do, and body prose is now included in the gate so the pattern cannot recur.
Pre-scaling legal audit of all import JSONs (2026-05-12)
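A scan for the prescriptive pattern described here might be sketched as follows. The phrase list contains only wording quoted in these changelog entries, and the field names in the usage note are illustrative; a production gate would presumably use a longer list.

```python
import re

# Phrases quoted in the changelog; treated as an assumed, incomplete list.
PRESCRIPTIVE = re.compile(r"\b(aim for|push toward|you should)\b", re.IGNORECASE)


def prescriptive_hits(fields):
    """Map each field name to the prescriptive phrases found in its text."""
    hits = {}
    for name, text in fields.items():
        found = [m.group(0).lower() for m in PRESCRIPTIVE.finditer(text)]
        if found:
            hits[name] = found
    return hits
```

Passing body prose through the same function as persona actions and FAQs is the point of the change: `prescriptive_hits({"body": "Aim for 30 grams per meal."})` reports the `body` field, while evidence-reporting wording returns an empty result.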
May 2026
Verification upgrade
New mechanical gate catches raw statistical notation in narratives
A cross-cluster audit found that 7 of 23 study narratives contained raw statistical notation (p-values, confidence intervals, heterogeneity scores) that should have been translated to plain language. The existing Zero PubMed rule relied on agent judgment, which interpreted 'translate and keep' as compliant. Three spec upgrades close the gap: R1 now requires notation to be translated AWAY (not accompanied), R3 tightens the story-point exception to at most one instance per narrative, and R9 adds a mandatory regex code gate that mechanically catches any remaining notation before publication. Six study narratives were corrected.
Mark’s review of Schwingshackl 2013 live page revealed p-value density. Cross-cluster audit confirmed pattern across carbs and meal-timing clusters.
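A regex gate of the kind R9 is described as adding could be sketched like this. The three patterns cover the notation types the entry names (p-values, confidence intervals, heterogeneity scores). The exact expressions are assumptions, not the pipeline's actual rules.

```python
import re

STAT_NOTATION = re.compile(
    r"""
      \bp\s*[<>=≤≥]\s*0?\.\d+     # p-values such as "p < 0.05" or "p=.001"
    | \b\d{2}\s*%\s*CI\b          # confidence intervals such as "95% CI"
    | \bI\s*(?:²|\^?2)\s*=        # heterogeneity such as "I² = 45%"
    """,
    re.IGNORECASE | re.VERBOSE,
)


def find_stat_notation(text):
    """Return every raw statistical-notation fragment found in the text."""
    return [m.group(0) for m in STAT_NOTATION.finditer(text)]
```

A narrative passes the gate only when `find_stat_notation` returns an empty list, so the check does not depend on an agent's reading of "translate and keep".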
May 2026
Verification upgrade
Audio scripts now pass the doctor’s-letter test
A pre-scale legal audit of all 31 audio scripts found 5 scripts containing prescriptive imperatives ('aim for X grams', 'push toward X') — the same pattern found and fixed in written persona actions. All 5 scripts were surgically edited to replace imperatives with evidence-reporting language. Two audio scriptwriter specs were upgraded: CR9 (v2.4.0) for claim audio and R4 (v2.6) for study audio both now include an explicit doctor’s-letter test gate (CR9 Step 2C, R4 SR7B). The disclaimer check confirmed all 31 scripts already end with a spoken 'not medical advice' disclaimer — that system was already working correctly.
Pre-scale legal audit — Check 2 (audio scripts) following Check 1 (persona actions)
May 2026
Verification upgrade
FAQ answers and skeptic notes now pass the doctor’s-letter test
A comprehensive legal audit of all 164 FAQ answers and all skeptic notes across both clusters found 5 FAQ answers and 1 skeptic note containing prescriptive imperatives ('aim for X grams', 'you should aim for', 'eat more protein'). All 6 were surgically edited to replace imperatives with evidence-reporting language while preserving the same practical information. A final sweep of all 30 import JSONs confirmed zero prescriptive patterns remaining in any field.
Pre-scale legal audit — Checks 6 and 12 (FAQ fields and skeptic notes) following Checks 1-2 (persona actions and audio scripts)
May 2026
Verification upgrade
Persona actions and translations now pass the doctor’s-letter test
A pre-scale legal audit found that 14 persona action fields across 13 claims contained direct imperatives ('Aim for X grams', 'Push toward X', 'Move X from Y to Z') that could be read as medical advice rather than evidence reporting. All 22 affected fields across both clusters were rewritten to describe what the research tested and found, letting readers infer the action themselves. Three pipeline specs were upgraded with a new blocking gate: CR7 (v2.11.0) now bans prescriptive imperatives in persona actions and real-world translations, CR10 (v1.7.0) adds a mandatory doctor’s-letter verification scan before any claim can pass Gate 2, and R2 (v2.24) extends the same rule to study so_what fields. The reframe preserves the same practical information in the same plain language — the only change is whether FitChef tells readers what to do (old) or reports what was studied (new).
Comprehensive legal audit of all live content across protein and meal-timing clusters before scaling
Apr 2026
Launch
Claim Pipeline goes live — first multi-study synthesis verified
The Claim Pipeline (Phase 2) produced its first verified claim: 'Does intermittent fasting actually give you a better body than regular dieting?' This claim synthesizes 4 studies (529 participants) through a 15-agent pipeline with 18 verification steps, including evidence consistency scoring, content verification against all source papers, reader simulation, and Gemini cross-check. The claim pipeline adds a new layer of verification on top of individual study checks: every factual statement in the synthesis traces back to a verified study extraction, and a quality audit scores the final content across 5 dimensions (evidence integrity, content quality, dwell time design, SEO readiness, trust layer). This first claim scored 91.3/100 composite — above the 85 ship threshold.
First claim reaching production readiness through the complete claim pipeline
Apr 2026
Improvement
Legal safety audit v2: numeric scores removed from public pages + claim audio disclaimer added
Comprehensive 15-point legal safety re-audit of all FitChef systems. Found and fixed 4 places where internal numeric scores (consistency_index/100, trust_score/5) were rendered on public pages — claim cards, cluster hub rows, and OG meta descriptions. These numeric scores could imply medical authority or study validation (Legal Audit §1.3 violation). All replaced with human-readable certainty tier labels. Also found claim audio scripts were missing the legal disclaimer that study audio scripts have had since day one. Added mandatory disclaimer specification to CR9 (Audio Scriptwriter) and verification gate to CR10 (Content Skeptic). Fixed prescriptive heading language on study pages.
Full legal re-audit requested after many system iterations since the original April 2026 audit
Apr 2026
Improvement
Internal numeric scores removed from all public outputs
Internal quality metrics (consistency index, data fidelity rate) were previously exposed in REST API responses, JSON-LD structured data, HTML meta tags, and AI-facing llms.txt. These numbers are used internally for sorting and quality gates but could imply FitChef is rating or evaluating science — which contradicts our identity as a content platform. All numeric scores have been removed from every public-facing output. The public now sees only human-readable certainty tiers (High/Moderate/Low) and factual counts (studies verified, claims grounded). The numbers continue to work behind the scenes for quality control.
Legal safety audit v2 — making all public outputs rock-solid for FitChef's identity as a content platform, not a research evaluator
Apr 2026
Improvement
Sidebar fixes + Legal text sweep + Mobile horizontal scroll fix
Multiple theme fixes shipped in v12.15.6: (1) Study sidebar 'What kind of study is this?' disclosure styled with custom CSS chevron. (2) Claim audio playlist CTA is now sticky at bottom-right. (3) Full legal text sweep — updated footer disclaimer, methodology page, skeptic protocol, and cite block headings to remove any implication FitChef conducts original research. (4) Claim card text overflow fixed with CSS line-clamp. (5) Studies archive mobile horizontal scroll fixed — Pattern A center columns now use split overflow (overflow-x: hidden, overflow-y: visible) instead of overflow: visible which broke horizontal containment.
Visual QA audit + Legal safety review
Apr 2026
Improvement
Claim pipeline: 3 automated code gates + voice-register enforcement added
Deep audit of claim pipeline vs. study pipeline revealed 12 quality gaps. Added three mandatory Python code gates to the claim editorial polish agent: bold distribution gate (ensures every content section has visual anchoring), paragraph length gate (max 400 characters — prevents wall-of-text rendering on mobile), and dense-outcome gate (catches results-section-style data dumps). Also added Pillar B voice-register enforcement matching the study pipeline's proven system — four-category jargon sweep, one-voice consistency check across all page fields, and gender-neutral language enforcement. These gates run automatically on every claim before it can pass editorial review.
Comprehensive claim vs. study pipeline comparison revealed claim pipeline lacked the mechanical quality enforcement the study pipeline has had since v3.5
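Of the three code gates, the paragraph length gate is the simplest to illustrate. The sketch below assumes paragraphs are separated by blank lines and uses the 400-character ceiling stated in the entry. The function name is hypothetical.

```python
MAX_PARAGRAPH_CHARS = 400  # ceiling stated in the entry above


def overlong_paragraphs(body, limit=MAX_PARAGRAPH_CHARS):
    """Return (index, length) for every paragraph longer than the limit.

    Paragraphs are assumed to be separated by blank lines.
    """
    paragraphs = [p.strip() for p in body.split("\n\n") if p.strip()]
    return [(i, len(p)) for i, p in enumerate(paragraphs) if len(p) > limit]
```

An empty result means the claim passes. Any returned pair points the editor at the exact paragraph to split.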
Apr 2026
Improvement
First Claim Pipeline Execution Complete (CL-001)
The claim pipeline (Phase 2) processed its first claim: 'How much protein do you actually need per day?' (CL-001). 18 agents synthesized evidence from 4 studies (Morton 2018, Schoenfeld 2013, Nunes 2022, Jäger 2017 — Consistency Index 87, High Certainty). Quality audit score: 91.5 (SHIP). Gemini external review completed with one accepted fix. Infrastructure validation (CR13): PASS_WITH_GAPS — 0 blocking, 1 non-blocking (bridge metadata field not yet in plugin registry; bridge content preserved in body prose). Trust audit (CR14): PASS — 0 kill switches triggered, evidence chain fully traceable, trust page consistency confirmed (ClaimReview schema, LLM meta tags, Skeptic Protocol references all present).
Claim pipeline (Phase 2) completed first full production run for protein cluster CL-001
Apr 2026
Verification upgrade
Cluster Architect gains meta-analysis mandatory check and tension splitting gate
Post-launch review of the protein cluster revealed a critical gap: the 'how much protein when losing weight' question had no covering study — Wycherley 2012 (meta-analysis, 24 RCTs, 1,063 participants) was never evaluated as a candidate because the tension was framed as 'body recomposition' (Longland 2016), collapsing two different mass-audience questions into one. Two new rules now prevent this: A-S9 forces the Cluster Architect to search for meta-analyses for every tension with an RCT flagship, and A-S10 forces a two-archetype test to detect collapsed tensions. The protein cluster has been updated: 9 flagships (was 8), with Wycherley 2012 covering deficit populations that Morton's 1.62g/kg breakpoint explicitly excludes.
Post-launch protein cluster integrity review found Wycherley 2012 was never evaluated
Apr 2026
Verification upgrade
Independent Dwell-Time Verification Added to Claim Pages
Every claim page now undergoes an independent reading experience check by a second AI model (Gemini) that has never seen the pipeline. Gemini reads the page as a naive reader who Googled the question, identifies where engagement drops, and flags structural similarities between sibling claims. This model has zero authority over the evidence — all factual claims remain locked by the existing verification gates. The check catches dwell-time weak spots that the writing model cannot detect in its own work.
Claim pipeline design — claim pages need independent reading experience verification like study pages get from R10
Apr 2026
Expansion
Cluster Planning Rebuilt: Tension-First Architecture + Flagship/Satellite Studies + Body-Composition Filter
Every cluster is now built around the real debates fitness audiences argue about — not around which papers exist in the literature. Studies that answer the same question are grouped into flagship pages with convergent evidence sections, so you see one definitive page backed by multiple studies instead of redundant overlapping pages. Medical-reassurance topics (organ safety, mortality statistics) no longer earn standalone study pages — only findings that change how you eat, train, or build your body get published. Layer 3 of each cluster now includes four content types: a comprehensive Master Guide, a shareable Myths Piece, a methodology transparency Skeptic Note, and an interactive Tool (where applicable). The result is tighter clusters, zero redundancy, and every page serves a unique purpose.
Protein cluster C0 Phase A execution revealed five structural issues: redundant meta-analysis pages, medical-reassurance content passing the viral filter, Layer 3 underspecification, bottom-up planning producing academic completeness, and no upfront claim mapping. Cross-AI verification with Gemini confirmed and strengthened proposed solutions.
Apr 2026
Verification upgrade
Gate 3b independent verification now phased for satellite studies
The independent skeptic verification (Gate 3b) now uses a structured two-phase process when satellite studies exist. Phase 1 completes a full forensic audit of the flagship extraction with its own verdict. Phase 2 then verifies each satellite individually with a targeted 6-point check against its own paper. This prevents quality degradation from information overload when the fresh verifier receives multiple papers at once. A scaling rule splits verification into separate sessions if satellite papers exceed 200KB combined.
First flagship-with-satellites execution (Morton 2018 protein cluster) — Mark flagged that the original single-prompt design could cause rushing when verifying 3+ papers simultaneously
Apr 2026
Verification upgrade
Satellite Studies Now Fully Cited
Weight of Evidence satellite studies (independent research confirming or nuancing our flagship study) now receive the same citation treatment as every other source: inline [N] markers in the article text linking to the original paper, plus clickable source links on the evidence cards. Previously, satellite studies were mentioned by name but without verifiable links.
Morton 2018 first pipeline run: Nunes 2022 and Jäger 2017 appeared without source URLs (2026-04-16)
Apr 2026
Verification upgrade
Full Picture trust block now runs a mandatory cross-sibling swap test
The Full Picture trust block on every study page now runs a mandatory cross-sibling swap test before publishing. An audit found byte-identical Section 2 prose across four protein-cluster studies. Writing agents must now read at least two sibling blocks and pass a four-point comparison: different headers, different opening words, no shared six-word phrases. Any collision blocks publish until the block is regenerated fresh from source.
Audit found identical transparency prose across four sibling study pages in the protein cluster.
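The "no shared six-word phrases" point lends itself to a mechanical check using word n-grams. This is a sketch under the assumption that phrases are compared case-insensitively with punctuation ignored; the function names are illustrative.

```python
import re


def word_ngrams(text, n=6):
    """Set of all n-word sequences in the text, lowercased, punctuation ignored."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_phrases(block, sibling, n=6):
    """Six-word (by default) phrases that appear in both trust blocks."""
    return word_ngrams(block, n) & word_ngrams(sibling, n)
```

Any non-empty result from `shared_phrases` would count as a collision between the two sibling blocks.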
Apr 2026
Verification upgrade
Kill-switch count reconciled to 28 — KS-26 retired
The active kill-switch count drops from 29 to 28. KS-26 (Platform Number Fabrication) is retired because the data path it guarded — platform statistics appearing in study articles — was structurally removed when study pages were decoupled from the FitChef product layer. The error class is eliminated at the architecture level rather than caught by a verification gate. KS-26's ID slot is not reused; remaining IDs stay stable.
Architecture rebuild removed the platform data path from study pages, making KS-26 unnecessary.
Apr 2026
Verification upgrade
Trust pages reconciled to match actual verification state
Four trust-page surfaces were still describing retired pipeline features as live. The Verification Ledger heading was wired to the wrong integer source. Sidebar navigation referenced old gate numbers and a removed anchor. The AI Transparency page described a retired study tier. The llms.txt page listed meta-tag names the plugin no longer emits. All four now match the actual ship state.
Zero-tolerance audit found stale copy on trust pages that survived the prior architecture cleanup.
Apr 2026
Verification upgrade
Reading-grade ceiling enforced — every study must read at 8th-grade level or lower
Every study narrative now passes a Flesch-Kincaid reading-grade gate before reaching the human-experience review. Articles must score grade 8 or lower (magazine target). A score above grade 8 and up to grade 9 is flagged for review. A score above grade 9 blocks the study for rewriting. Any single paragraph above grade 10 is fixed in place or flagged.
Post-rebuild alignment sweep found no numeric reading-grade enforcement in the readability agent.
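A rough sketch of how such a gate could be computed, using the standard Flesch-Kincaid grade formula. The syllable counter is a crude vowel-group heuristic, not the readability agent's actual method, and the boundary handling (exactly 8 passes, exactly 9 is flagged) is an assumption.

```python
import re


def count_syllables(word: str) -> int:
    """Heuristic: count vowel groups and drop a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def fk_grade(text: str) -> float:
    """Flesch-Kincaid grade: 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text has no sentences or words")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


def classify(grade: float) -> str:
    """Map a grade to the three outcomes named in the entry (boundaries assumed)."""
    if grade <= 8.0:
        return "pass"
    if grade <= 9.0:
        return "flag_for_review"
    return "block_for_rewrite"
```

Because the syllable count is approximate, scores near a threshold would be best treated as borderline rather than decisive.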
Apr 2026
Verification upgrade
New gate: causal language detection for observational studies
Added a two-layer defense against causal language in articles based on observational research. When a study's design is observational (cohort, case-control, meta-analysis of observational studies), FitChef's editorial voice now must use associational language — 'was associated with,' 'showed an association' — never causal language like 'protects,' 'contributes,' 'prevents,' or 'scored.' Layer 1: a new Sacred Rule (SR6B) in the Editorial Polish agent catches causal phrasing during writing. Layer 2: a new mandatory code gate (Check 12C) in the Infrastructure Validator programmatically scans every field in the import JSON before publication, blocking any article that contains causal language for observational findings. This distinction matters because observational studies show correlations, not proven cause-and-effect — and implying causation from correlational data is both scientifically inaccurate and a legal risk.
Manual audit of a mortality/cancer/CVD meta-analysis (Naghshi 2020) found 47 instances of subtly causal language that passed all existing verification gates. Words like 'protection,' 'contributes,' and 'scored' imply proven effects but don't match classic overclaim patterns. Cross-study audit confirmed all 16 other articles were clean — but the pipeline must prevent this systematically for all future studies.
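The programmatic scan in Layer 2 could look roughly like the sketch below. It assumes the import JSON is a nested structure of dicts, lists, and strings, and it uses only the four terms quoted in the entry; the real Check 12C term list and field names are not given here, so every name in the code is hypothetical.

```python
import re
from typing import Any, Iterator

# Only the four examples quoted in the entry; the real list is an unknown.
CAUSAL_TERMS = ("protects", "contributes", "prevents", "scored")
CAUSAL_PATTERN = re.compile(r"\b(" + "|".join(CAUSAL_TERMS) + r")\b", re.IGNORECASE)


def iter_strings(node: Any, path: str = "$") -> Iterator[tuple[str, str]]:
    """Yield (path, text) for every string value in nested dicts and lists."""
    if isinstance(node, str):
        yield path, node
    elif isinstance(node, dict):
        for key, value in node.items():
            yield from iter_strings(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from iter_strings(value, f"{path}[{index}]")


def causal_hits(import_json: dict, observational: bool) -> list[tuple[str, str]]:
    """Return (path, term) for each causal term found; empty for non-observational designs."""
    if not observational:
        return []
    hits = []
    for path, text in iter_strings(import_json):
        for match in CAUSAL_PATTERN.finditer(text):
            hits.append((path, match.group(0).lower()))
    return hits
```

A plain word list like this will also match attributed quotations, so a non-empty result is probably better treated as a block pending review than as proof of an overclaim.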
Apr 2026
Improvement
P2 anti-anchoring rule for finding count
Added explicit instruction to P2 Section 9 preventing Claude from anchoring to a fixed finding count. 10/14 extractions had exactly 10 findings due to unconscious pattern-matching. Rule states: no target count, paper complexity decides, pause if count is a round number.
Pattern analysis — 10/14 studies converged to exactly 10 findings
Apr 2026
Verification upgrade
Independent verification prompt strengthened after Casuso-Goossens review
The Gate 3b independent skeptic prompt, used when a fresh Claude session audits every extraction, was rewritten from a 12-line general checklist into a 50-line forensic audit protocol. The new prompt requires a field-by-field walkthrough of the entire extraction JSON (no high-level skimming), explicit verification marks for every field, and specific anti-fabrication checks for metadata values such as dropout rates and sample sizes that may be invented rather than sourced from the paper. The trigger was the Casuso-Goossens 2025 review, where the independent skeptic initially gave a surface-level pass and found 11 errors (including a fabricated dropout rate) only after being explicitly asked to check more thoroughly.
Casuso-Goossens 2025 Gate 3b independent review — skeptic required prompting for full audit
Apr 2026
Verification upgrade
Three-layer anti-delegation enforcement prevents invalid subagent execution
Added three structural safeguards to prevent Claude from delegating pipeline agent execution to subagents (which receive summarized instructions and produce invalid output). Layer 1: CLAUDE.md Rule 18 — absolute Agent tool ban during pipeline execution, placed in the file read FIRST every session so it survives context compaction. Layer 2: PIPELINE_ORCHESTRATOR.md Rule 10B compaction-survival clause with mandatory execution_method field in post-agent logs. Layer 3: Subagent tripwire in all 14 Run Phase spec reading gates — if a subagent reads the spec, it encounters a STOP instruction before execution begins. Together these make delegation structurally impossible, not just rule-prohibited.
Schoenfeld 2018 production failure — after context compaction, Claude delegated R6-R10 to subagents that produced invalid verification outputs with false PASS/SHIP verdicts
Apr 2026
Verification upgrade
Attribute & Report enforcement extended to audio scripts, titles, card headlines, and AI answer capsules
Audit found that audio scripts, post titles, card headlines, and answer capsules lacked the same Attribute & Report checks applied to body text. All 3 live studies had audio scripts where FitChef stated health outcomes as its own claims instead of attributing to researchers. Fixed the audio scripts and extended verification gates across 7 pipeline specs: R1 (title creation), R2 (card headlines), R4 (audio writing), R5 (answer capsules), R7 (audio verification), R9 (quality audit scope), and R11 (import validation).
Pre-scaling legal audit found all 3 live studies contained unattributed health verdicts in audio scripts and non-body-text content types
Apr 2026
Verification upgrade
Kill Switch 29: No medical verdicts in any content type
New kill switch added after pre-scaling legal audit found all 3 live studies contained unattributed health verdicts in audio scripts, post titles, persona actions, and FAQs. KS-29 catches FitChef stating health outcomes as its own claims ('builds zero muscle', 'does nothing for your muscles', 'boosts metabolism') and medical screening statements ('This applies to healthy people only') without researcher attribution. Distinct from existing KS-20 (prescriptive 'you should' language): KS-29 targets the subtler pattern of FitChef asserting health facts as a journalistic authority rather than attributing to researchers.
Pre-scaling legal audit found all 3 live studies contained unattributed health verdicts in audio scripts, post titles, persona actions, and FAQs
Apr 2026
Improvement
No medical verdicts in titles, meta descriptions, or featured snippets
Added a three-layer verification check preventing health conclusions from appearing as FitChef claims in search-visible elements. Titles and meta descriptions now must create curiosity about what research found — they can never state a health outcome as fact. P5 SEO Strategist validation checklist now includes title, meta, and snippet medical-verdict checks. R7 Field Skeptic now verifies all search elements against the Attribute & Report Rule before any study ships. This protects FitChef's legal position as data journalism (reporting what researchers found) rather than a medical authority (making health claims).
During Devries 2018 P5 execution, initial title stated 'Zero Damage' and meta stated 'zero effect on kidney function' — both medical verdicts positioned as FitChef claims rather than attributed research findings.
Apr 2026
Verification upgrade
Creative Director now enforces Attribute & Report from the source
The Creative Director agent (P6) sets the creative direction that the Narrative Writer executes. Previously, P6 had no direct rule preventing medical verdict language in its creative briefs — it relied on downstream agents to catch and fix verdict framing inherited from the brief. This created a gap: if the creative direction said 'protein is safe for kidneys,' the writer had to actively resist its own input to comply with data journalism rules. SR-7 now requires all creative direction language to attribute findings to the study ('the meta-analysis found no effect') rather than state health conclusions as FitChef's voice ('protein is safe'). Every sentence in the creative brief must pass the test: if the writer copied this phrasing into the published article, would it pass KS-20 (no prescriptive health language) and KS-29 (no medical verdicts)?
Pre-scaling legal audit found P6 was the only content-directing agent without direct Attribute & Report enforcement
Apr 2026
Verification upgrade
Four agents patched with direct Attribute & Report enforcement
Four agents that produce or modify reader-facing text had insufficient or zero Attribute & Report enforcement. R10 (Reader Simulation) runs after all verification gates and can add/edit prose — but had only an indirect FITCHEF_VOICE.md reference, no decision rules. V1 (Visual Creator) and V2 (Social Image Creator) produce headlines, insight lines, and social text with zero A&R rules. R3 (Editorial Polish) had a single anti-pattern line but no structured enforcement. All four now have dedicated Sacred Rules with self-tests, kill gates, and explicit prohibitions against dropping attribution during engagement-focused editing. R10 also gained a new quality test (Test 5: The Attribution Test) that blocks shipping if any enhancement introduced a medical verdict.
Systematic A&R audit across all 23+ pipeline agents revealed 3 high-risk and 1 medium-risk gap in agents that produce published content
Apr 2026
Verification upgrade
Mechanical Override Gate prevents recurring import format failures
Added S28 Hardcoded Value Override Gate to R11 Import Builder. This gate runs after all assembly is complete and mechanically overwrites values that agents consistently get wrong: audio URL forced to 'pending' (was empty, making Audio Generator invisible), primary_results forced to array format (was flat dict, making Card 3 empty), sample_size stripped to digits only, DOI prefix enforced, trust_score forced to integer. These failures recurred across multiple studies despite existing documentation because agents reason about values instead of mechanically applying them. S28 eliminates agent discretion for these fields.
Devries 2018 production failures: empty audio URL (no Generate Audio button), flat dict primary_results (empty Card 3), non-numeric sample_size, missing DOI prefix
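As an illustration of what a mechanical override pass might look like, here is a sketch under several stated assumptions: the keys `audio_url` and `doi`, the `https://doi.org/` prefix, and the `label`/`value` shape of each array item are all hypothetical, since the entry names only `primary_results`, `sample_size`, and `trust_score`.

```python
import re
from copy import deepcopy

DOI_PREFIX = "https://doi.org/"  # assumption: the entry does not say which prefix is enforced


def apply_overrides(record: dict) -> dict:
    """Return a copy of record with the five fields overwritten mechanically."""
    out = deepcopy(record)

    # Audio URL is always forced to the literal 'pending' (key name is hypothetical).
    out["audio_url"] = "pending"

    # A flat dict becomes an array of items (item shape is hypothetical).
    results = out.get("primary_results")
    if isinstance(results, dict):
        out["primary_results"] = [{"label": k, "value": v} for k, v in results.items()]

    # Keep digits only, e.g. 'n = 1,234' becomes '1234'.
    if "sample_size" in out:
        out["sample_size"] = re.sub(r"\D", "", str(out["sample_size"]))

    # Add the prefix only when it is missing (key name is hypothetical).
    doi = str(out.get("doi", ""))
    if doi and not doi.startswith(DOI_PREFIX):
        out["doi"] = DOI_PREFIX + doi

    # Coerce numeric strings and floats to an integer.
    if "trust_score" in out:
        out["trust_score"] = int(round(float(out["trust_score"])))

    return out
```

The point of the design is that nothing here depends on judgment: each field is rewritten the same way on every run, whatever the upstream agents produced.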
Apr 2026
Verification upgrade
Three-Zone Page Architecture for Clearer Evidence Hierarchy
Redesigned the study page structure into three semantic zones: Narrative (immersion), Personalization (reader-specific insights), and Evidence & Trust (skeptical review). This architectural change strengthens verification integrity by clarifying the reader's scroll journey and assigning distinct verification territory to each zone. Narrative sections remain the story. Personalization fields (so_what, persona_actions) render after the narrative with reader-specific context. Evidence fields (skeptic_note, findings, controversy, FAQ) render together in a dedicated zone where skeptical readers expect detailed verification. Added a deduplication rule that prevents the narrative and the controversy field from both covering the same scientific debate, reducing reader confusion about what is established and what is disputed.
Study page template v12.1.0 redesign to improve reader navigation and field clarity.
Apr 2026
Verification upgrade
Citability Pipeline — Machine Layer Integration
R5 Citability Engineer now feeds structured claim data, citation hints, and self-contained answer capsules through R11 into the WordPress schema and citation toolkit. LLMs and journalists see richer structured data and copy-ready citable paragraphs on every study page.
Citability pipeline integration — R5→R11 contract completion
Apr 2026
Verification upgrade
Excerpt Defensibility Gate — Sentence-Level Precision Check
Field Skeptic (Gate 2B) now verifies the boldest claim sentences can stand alone when excerpted — by social media, AI summaries, or skimming readers — without overstating the paper's actual evidence strength. Ensures every sentence is individually defensible, not just globally hedged.
Adversarial content review of published study — identified that bold claims excerpted out of context bypass global hedging sections
Apr 2026
Improvement
Headline Defensibility Gates — Medical Positioning & Metaphor Escalation Checks
Field Skeptic now runs two additional checks on every headline and title. The Medical Authority Positioning Check catches titles that frame FitChef against healthcare professionals rather than against guidelines (e.g., 'Your Doctor Is Wrong' → rewrite to critique the government number, not the doctor). The Editorial Metaphor Escalation Check catches vivid metaphors that characterize findings more strongly than the paper's own language (e.g., 'Broken Math' for what the paper calls 'systematic bias'). Both checks enforce the Attribute & Report rule at the headline level, where excerpt defensibility matters most.
Legal safety audit of Jäger 2017 article found headline 'Your Doctor's Protein Advice Is Based on Broken Math' — defensible in body text context but not as a standalone excerpt. Two patterns identified: medical counter-positioning and editorial metaphor escalation.
Apr 2026
Expansion
Kill switch 28: Source URL liveness verification
Added a new verification gate that checks that every source URL in the article actually resolves to a live page with relevant content. Dead URLs are auto-fixed if a correct URL can be found, or flagged for human review. Additionally, P4 (Source Hunter) now verifies that all URLs are live before presenting them to the human operator, catching hallucinated URLs at the source instead of after publication.
Morton 2018 shipped with a dead Grand View Research URL. Claude hallucinated a plausible URL suffix (-report) that didn't exist. The content was real (human-verified) but the link was broken from day one. No agent in the 14-agent pipeline checked if the URL actually worked.
Apr 2026
Improvement
Reader Simulation Anti-Rationalization Rule — No More 'Necessary But Flat' Passes
R10 (Reader Simulation) now has an explicit anti-rationalization rule that prevents flat sections from shipping with labels like 'necessary mechanism beat' or 'lower energy but appropriate.' If R10 notices a section is flat but finds itself thinking 'it's necessary,' that IS a drag point — fix it or flag it. Additionally, the Dwell Test now checks for relative attention valleys: 3+ consecutive elements at attention 7-8 surrounded by 9-10 sections are flagged as drag even though each element individually passes the absolute threshold.
Jäger 2017 shipped with a 4-paragraph nitrogen balance methodology section rated attention 7-8 that R10 labeled 'necessary mechanism beat' without enhancement. Human reader (Mark) flagged the article as feeling technical and flat. R10 had zero enhancements across the entire 54-element article — the anti-rationalization escape hatch let boring content pass verification.
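The relative-valley check can be sketched as a single pass over per-element attention scores. This assumes integer scores on a 1-10 scale and reads "surrounded" as the immediate neighbor on each side scoring 9 or 10; both are interpretations, and the function name is hypothetical.

```python
def attention_valleys(scores: list[int], min_run: int = 3) -> list[tuple[int, int]]:
    """Return inclusive (start, end) indexes of runs of 7-8 scores flanked by 9-10 scores."""
    valleys = []
    i = 0
    while i < len(scores):
        if scores[i] in (7, 8):
            # Extend j to the last element of this run of 7-8 scores.
            j = i
            while j + 1 < len(scores) and scores[j + 1] in (7, 8):
                j += 1
            flanked = (
                i > 0
                and j + 1 < len(scores)
                and scores[i - 1] >= 9
                and scores[j + 1] >= 9
            )
            if j - i + 1 >= min_run and flanked:
                valleys.append((i, j))
            i = j + 1
        else:
            i += 1
    return valleys
```

Each element in a flagged run would still pass an absolute threshold of 7, which is why the check has to look at neighbors rather than individual scores.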
Apr 2026
Fix
All verified sources now get numbered references in articles
Fixed a classification gap where human-verified sources used in the article text could be excluded from the numbered Sources list. Previously, sources classified as 'bomb amplifiers' or 'hidden gems' in the editorial pipeline were named in the text but not given clickable [N] reference numbers — even when they contained specific, verifiable claims. Now any human-verified source with attributable claims in the article gets a numbered reference, regardless of editorial classification.
Morton 2018 article referenced Greg Nuckols (Stronger by Science) and Menno Henselmans by name with specific claims but neither had a [N] marker or appeared in the Sources list. Readers had no way to verify those claims.
Apr 2026
Verification
Audio Script Validation Gate
Added a mandatory validation check that blocks study imports if the audio script is missing. Previously, a study could import without its audio data even when the audio creative work was complete — causing the audio player to show no content. The gate now verifies that every audio field (script, title, narrator) is properly transferred from creative data to the import file.
Trommelen-2023 study imported without audio script despite complete audio.json
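A small sketch of the transfer check, assuming the creative data is a flat dict and the import file nests the same three fields under a hypothetical `audio` key. Only the field names script, title, and narrator come from the entry.

```python
REQUIRED_AUDIO_FIELDS = ("script", "title", "narrator")


def missing_audio_fields(creative_audio: dict, import_record: dict) -> list[str]:
    """Audio fields that are blank in the import record or differ from the creative data."""
    import_audio = import_record.get("audio", {})  # hypothetical nesting
    problems = []
    for field in REQUIRED_AUDIO_FIELDS:
        source = str(creative_audio.get(field, "")).strip()
        target = str(import_audio.get(field, "")).strip()
        if not target or target != source:
            problems.append(field)
    return problems
```

An import would be blocked whenever the returned list is non-empty.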
Apr 2026
Verification
Post-Enhancement Grounding Verification
R10 Reader Simulation now adds real-world connection moments to articles — but every sentence it adds must cite the exact source file and quote from verified fuel that grounds it. This closes the gap where text added after the Triple Skeptic (R6-R8) could bypass grounding checks.
CEO-level pipeline audit — Check 3: Post-Verification Addition Gap
Apr 2026
Expansion
Kill switch 27: Health-condition study safety gate
Added a new verification gate for studies involving diagnosed medical conditions (kidney disease, diabetes, etc.). The gate ensures null findings ('no harmful effects detected') are never presented as positive safety claims, and mandates a healthcare-provider caveat in the reality check section. This prevents readers with medical conditions from interpreting research translations as personal medical guidance.
Legal safety audit identified that null findings in condition-specific studies could be misread as safety endorsements by patients
Apr 2026
Bugfix
Removed invalid study rating from schema
The Skeptic Review schema on study pages incorrectly included a reviewRating block that implied FitChef rates studies on a 1-5 scale. FitChef does not rate studies — the value was an extraction-accuracy metric (trust context), not a quality judgment. The rating block has been removed. The Skeptic Review still documents what was verified (review body and notes), but no longer includes a numeric rating. This aligns with FitChef's documented position as a transparent translator, not an evaluator.
External schema audit identified Rating node with text value in ratingValue field — both schema-invalid and a positioning violation
Apr 2026
Fix
Anti-template upgrade: Scene-based opening detection
Discovered and fixed a systematic pattern where the creative brainstorm process was producing identical 'It's 10pm, you're lying in bed...' opening scenes across different studies. The root cause: generic reader context (time-of-day, physical location) was being fed into the creative process instead of unique study data. Two published articles were affected. Fix ensures only study-specific findings and tensions drive creative direction — never interchangeable reader scenarios. Added mandatory cross-study scene detection to catch any future pattern matches before publication.
Pattern detected during Trommelen 2023 creative brainstorm: every study defaulted to nighttime phone-in-bed openings
Apr 2026
Improvement
Study-derived persona takeaways replace fixed template
Persona takeaway cards on each study page are now derived from the study's actual data instead of forced into four fixed categories. Each study defines its own audience segments based on who the study tested and what subgroups the data speaks to. Labels, count, and selection are all study-specific. Studies that tested specific age groups, training levels, or goals now show those exact categories instead of generic placeholders.
Architectural review revealed fixed persona keys conflicted with core principle that each study is unique
Mar 2026
Launch
Verification Pipeline v2.0 Launched
FitChef Verification Pipeline v2.0 launched. 23 agents, 3 verification gates, 28 kill switches. Every study processed through triple-skeptic review before publication.
Pipeline development complete. First production study (Morton 2018) published and verified.