The Doctrine Firewall: Protecting Ancient Wisdom from LLM Hallucination

1. The Terminological Challenge

Both Western Geomancy and Indian Ramal Shastra rely on precise, rigid vocabularies. The names of the sixteen figures, the planetary associations, and the structural sections (e.g., the Via Puncti or the Court of the Judge) must remain invariant. Unfortunately, Large Language Models (LLMs) generate text by sampling from a probability distribution, so their wording varies from run to run. Ask an LLM to generate the same reading repeatedly and it will eventually substitute a synonym for a rigid technical term, turning "Amissio" into "Loss", or invent an entirely new, culturally mismatched term.

This is unacceptable for a production oracle. A geomancer reading a chart expects canonical terminology. When an LLM injects creative vocabulary into a sacred structure, it corrupts the reading's authenticity.

⚠️ The Danger of Synonymization

In Ramal Shastra, figures have specific phonetic weights and Arabic-derived origins that must be maintained in Hindi translations. An LLM attempting to "improve" the prose might swap out these foundational terms for modern Hindi equivalents, breaking the doctrinal chain of the practice.

2. Enforcing Purity: The Doctrine Firewall

To combat terminological drift, SAGE engineers built the Doctrine Firewall. Rather than trusting the LLM to follow strict vocabulary instructions, the pipeline sanitizes the output after generation using deterministic term registries.

The firewall operates as a multi-stage pass over the generated text. For example, during a Ramal reading, the firewall actively scans for Western geomantic terms (which the base LLM often hallucinates due to its training data biases) and aggressively overwrites them with their canonical Ramal equivalents before the text ever reaches the client.
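A registry pass of this kind could be sketched roughly as follows. This is a minimal illustration and not SAGE's actual implementation: the `WESTERN_TO_RAMAL` table, the function name `enforce_ramal_terms`, and the three transliterations are assumptions chosen for the example, and a real registry would cover all sixteen figures.

```python
import re

# Hypothetical registry mapping Western figure names to Ramal names.
# Illustrative transliterations only; not SAGE's actual table.
WESTERN_TO_RAMAL = {
    "Populus": "Jamaat",
    "Conjunctio": "Ijtima",
    "Carcer": "Uqla",
}

# Build one alternation, longest names first so that a longer term
# is never shadowed by a shorter one sharing the same prefix.
_WESTERN_TERMS = re.compile(
    r"\b("
    + "|".join(sorted(map(re.escape, WESTERN_TO_RAMAL), key=len, reverse=True))
    + r")\b"
)


def enforce_ramal_terms(text: str) -> str:
    """Overwrite any registered Western term with its Ramal equivalent."""
    return _WESTERN_TERMS.sub(lambda m: WESTERN_TO_RAMAL[m.group(1)], text)


if __name__ == "__main__":
    print(enforce_ramal_terms("Populus appears in the first house."))
```

Because the lookup is a plain dictionary, the same input always produces the same output, which is the property the firewall relies on.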

3. The Double-Edged Sword of Normalization

However, aggressive regex-based normalizers introduce a new problem: scrubbing false positives. In an effort to strip redundant structural markers (such as the LLM writing "Via Puncti:" at the start of a paragraph when the UI already provides the header), the normalizer can inadvertently destroy perfectly valid prose.

Consider the prompt instruction: "Start the paragraph with a clear sentence, e.g., 'The Via Puncti begins at the Judge in House 15...'"

A naive stripping regex that looks for technical terms at the start of a line will match Via Puncti, together with its leading article, and remove it. The final reading then renders a dangling, truncated sentence: "begins at the Judge in House 15..."
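The failure can be reproduced with a small sketch. The marker list, the pattern, and the function name `naive_strip` are hypothetical, and the pattern assumes the stripper also swallows an optional leading "The", which is consistent with the truncated output described above.

```python
import re

# Hypothetical list of structural markers the UI already renders as headers.
STRUCTURAL_MARKERS = ["Via Puncti", "Court of the Judge"]

# Naive rule: at the start of any line, remove an optional "The",
# the marker itself, an optional colon, and trailing spaces.
NAIVE_STRIP = re.compile(
    r"^(?:The\s+)?(?:" + "|".join(map(re.escape, STRUCTURAL_MARKERS)) + r")\b:?[ \t]*",
    re.MULTILINE,
)


def naive_strip(text: str) -> str:
    """Strip line-initial markers with no regard for what follows them."""
    return NAIVE_STRIP.sub("", text)


if __name__ == "__main__":
    # Intended case: a redundant header-style marker is removed.
    print(naive_strip("Via Puncti: The path descends through water."))
    # False positive: the subject of a valid sentence is removed too.
    print(naive_strip("The Via Puncti begins at the Judge in House 15..."))
```

The first call behaves as intended, while the second leaves only "begins at the Judge in House 15...", because the pattern has no way of knowing that the term is the grammatical subject of the sentence.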

💡 The Edge Case

How does a deterministic system distinguish between a hallucinated structural marker (which should be stripped) and the subject of a valid, flowing prose sentence (which must be preserved)?

4. The Subject Protection Heuristic

To solve this, we developed the Subject Protection Heuristic: a layer of context-aware regex built into the Doctrine Firewall. Before the firewall strips a technical term, it looks ahead at the text that immediately follows the term.

If the regex detects the term Via Puncti immediately followed by the verb begins, the firewall logs a "Subject detected" state. It halts the stripping operation, recognizing that the LLM is using the term as the subject of a flowing narrative sentence rather than an orphaned structural header.
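One way to express this lookahead is sketched below. It is an illustration under stated assumptions, not the production code: the marker list, the `SUBJECT_VERBS` tuple (only "begins" is named in this article), the logger name, and the function name `strip_markers` are all hypothetical.

```python
import logging
import re

logger = logging.getLogger("doctrine_firewall")  # hypothetical logger name

# Hypothetical marker and verb lists; only "begins" comes from the article.
STRUCTURAL_MARKERS = ["Via Puncti", "Court of the Judge"]
SUBJECT_VERBS = ("begins",)

# A line-initial marker (with optional article), then an optional colon and spaces.
LINE_START_MARKER = re.compile(
    r"^(?P<term>(?:The\s+)?(?:" + "|".join(map(re.escape, STRUCTURAL_MARKERS)) + r")\b)"
    r"(?P<tail>:?[ \t]*)",
    re.MULTILINE,
)

# What must follow the term for it to count as the subject of a sentence.
SUBJECT_FOLLOWS = re.compile(r"\s+(?:" + "|".join(map(re.escape, SUBJECT_VERBS)) + r")\b")


def strip_markers(text: str) -> str:
    """Strip orphaned line-initial markers, but keep terms used as subjects."""

    def decide(match: re.Match) -> str:
        # Look ahead from the end of the term itself, before any colon or spaces.
        rest = match.string[match.end("term"):]
        if SUBJECT_FOLLOWS.match(rest):
            logger.debug("Subject detected: %r", match.group("term"))
            return match.group(0)  # halt stripping, keep the text unchanged
        return ""  # orphaned structural marker, remove it

    return LINE_START_MARKER.sub(decide, text)


if __name__ == "__main__":
    print(strip_markers("Via Puncti: The path descends through water."))
    print(strip_markers("The Via Puncti begins at the Judge in House 15..."))
```

A callback is used here instead of a single negative lookahead so that the "Subject detected" decision can be logged. The verb list is the obvious weak point of such a heuristic: any subject verb missing from it would still be stripped.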

5. The Final Result

The Doctrine Firewall, augmented with the Subject Protection Heuristic, allows us to achieve the best of both worlds. We successfully purge hallucinated structural markers and enforce strict 1,000-year-old vocabulary, while simultaneously protecting the rich, narrative prose generated by the LLM.

This ensures that every reading delivered by SAGE is not only structurally sound and doctrinally pure, but also linguistically beautiful.

Experience the Oracle

The engineering described here protects the integrity of every SAGE reading. Try it for free and witness the power of deterministic AI enforcement.