The Proof

Why six constraints solve what scaling cannot

The Claim

The six-constraint architecture doesn't just describe validity—it produces it. Problems that have resisted five years of scaling, fine-tuning, and patching dissolve under structural analysis.

This document shows the mechanism.


Problem 1: Hallucination

The Failure

System asserts: "The Eiffel Tower was built in 1923."

Confident. Fluent. Wrong.

Why It Happens

The system has no constraint checking. It produces tokens that are probable given the input, not tokens that are true given reality.

How Six Constraints Fix It

Constraint   | Check                        | Result
-------------|------------------------------|---------------------------------
Referential  | Is "Eiffel Tower" grounded?  | Yes: identifiable entity
Contextual   | What's the scope?            | Historical fact claim
Premissive   | What's the source?           | ⚠️ No source cited
Inferential  | How was the date derived?    | ⚠️ Pattern match, not lookup
Constraining | Confidence limits?           | ⚠️ Stated as fact, not qualified
Teleological | Why does the user need this? | Factual accuracy required

Three constraints fail. Output is flagged for revision or source verification.

Mechanism: Hallucination occurs when Premissive (no grounds) and Inferential (no valid derivation) constraints are absent. The architecture catches this before output.
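One way to make the check concrete is to model each constraint as a predicate over a candidate claim. This is a minimal sketch, not a reference implementation; every name and field below is assumed for illustration.

```python
# Illustrative sketch: the six constraints as predicates over a claim.
# All class names, fields, and values here are hypothetical.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Claim:
    text: str
    entity_grounded: bool      # Referential: does the claim name an identifiable entity?
    scope: str                 # Contextual: what kind of claim is this?
    source: Optional[str]      # Premissive: what grounds the claim?
    derivation: str            # Inferential: "lookup" vs. "pattern"
    qualified: bool            # Constraining: is confidence stated as a limit?
    purpose: str               # Teleological: why does the user need it?

def check_constraints(claim: Claim) -> list:
    """Return the names of the constraints the claim fails."""
    failures = []
    if not claim.entity_grounded:
        failures.append("Referential")
    if not claim.scope:
        failures.append("Contextual")
    if claim.source is None:
        failures.append("Premissive")
    if claim.derivation != "lookup":
        failures.append("Inferential")
    if not claim.qualified:
        failures.append("Constraining")
    if not claim.purpose:
        failures.append("Teleological")
    return failures

claim = Claim(
    text="The Eiffel Tower was built in 1923.",
    entity_grounded=True,
    scope="historical fact",
    source=None,              # no source cited
    derivation="pattern",     # pattern match, not lookup
    qualified=False,          # stated as fact, not qualified
    purpose="factual accuracy",
)
print(check_constraints(claim))  # → ['Premissive', 'Inferential', 'Constraining']
```

Three failed predicates means the claim never reaches output unmodified; it is routed to revision or source verification instead.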


The Pattern

Every major failure mode maps to missing constraints:

Failure               | Missing Constraints
----------------------|---------------------------------
Hallucination         | Premissive, Inferential
Semantic drift        | Referential (tracking)
Groundless confidence | Premissive, Constraining
Calibration failure   | Inferential (discrimination)
Inappropriate closure | Teleological (authority routing)
Context degradation   | Referential, Contextual (state)

The architecture doesn't patch symptoms. It provides the structural elements whose absence causes the symptoms.
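This mapping can be inverted into a diagnostic: given the set of constraints a system leaves unchecked, which failure modes are structurally possible? A minimal sketch, with the table encoded as data and all names assumed:

```python
# Hypothetical encoding of the failure-mode table: each failure mode
# paired with the constraints whose absence produces it.
FAILURE_MODES = {
    "hallucination": {"Premissive", "Inferential"},
    "semantic drift": {"Referential"},
    "groundless confidence": {"Premissive", "Constraining"},
    "calibration failure": {"Inferential"},
    "inappropriate closure": {"Teleological"},
    "context degradation": {"Referential", "Contextual"},
}

def predicted_failures(missing: set) -> list:
    """Failure modes structurally possible when `missing` constraints go unchecked."""
    return sorted(
        mode for mode, required in FAILURE_MODES.items()
        if required <= missing  # every causal constraint is absent
    )

print(predicted_failures({"Premissive", "Inferential"}))
# → ['calibration failure', 'hallucination']
```

Note the direction of the claim: a missing constraint makes a failure mode possible, not guaranteed; the diagnostic predicts exposure, not frequency.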


Why Scaling Doesn't Solve This

"We'll just train a bigger model."

Scaling gives you:

- More fluent, more plausible outputs
- Broader coverage of patterns in the training data

Scaling does not give you:

- Grounded premises
- Valid derivations
- Any constraint checking at all
You cannot scale your way to structure. A trillion parameters checking zero constraints is still checking zero constraints.

The problems persist because they're architectural, not statistical.


Empirical Predictions

If the architecture is correct, systems implementing it will show:

Metric                 | Prediction
-----------------------|-------------------------------------------------------
Hallucination rate     | Drops to the rate of source errors, not pattern errors
Calibration            | r > 0.8 between stated and actual confidence
Drift detection        | >95% of meaning shifts caught before compounding
Inappropriate closure  | Near zero (routed to a human)
Long-context coherence | Stable to the context limit

These are testable. We invite validation.
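The calibration prediction, for instance, reduces to a one-function test: compute Pearson's r between stated confidence and observed accuracy, bucketed by confidence level. A sketch with invented data (the numbers below are illustrative, not measurements):

```python
# Sketch of the calibration test: Pearson's r between a system's stated
# confidence and its empirical accuracy per confidence bucket.
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented example: stated confidence per bucket vs. observed accuracy.
stated   = [0.5, 0.6, 0.7, 0.8, 0.9]
observed = [0.48, 0.63, 0.66, 0.81, 0.88]

r = pearson_r(stated, observed)
print(f"r = {r:.3f}")   # the architecture predicts r > 0.8
assert r > 0.8
```

A system failing this test would show stated confidence uncorrelated with outcomes, which is exactly the calibration failure the table attributes to a missing Inferential constraint.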

Every failure maps to a missing constraint. Every constraint is checkable. Every check is implementable. The proof is in the structure.