Foundation layer
Oracle Foundry
The foundation layer of governed inference.
Oracle Foundry transforms authoritative source documents into a queryable, tamper-evident knowledge base. Every downstream governance decision traces back to this layer.
Without this
Without tamper-evident source provenance, every downstream governance decision is an assertion without evidence. Claims can’t be verified, retrieval can’t be audited, and your gate has nothing to check against.
Position in the platform
| System | Layer |
|---|---|
| Oracle Foundry | Foundation |
| SIRE Crosswalk | Post-Foundry |
| Prompt Compiler | L0 |
| Claim Ledger | L1-L4 |
| Process Control System | Cross-layer |
| Forensics Lab | L5 |
Seven-stage foundry pipeline
Canonical oracle frontmatter
Every oracle corpus carries a YAML frontmatter block that combines identity, provenance, SIRE authority metadata, and license enforcement policy into a single canonical schema.
corpus_id: iso-iec-27001-2022-v1
title: "ISO/IEC 27001:2022 — Information Security Management Systems"
tier: tier_2
version: 1
content_type: prose
frameworks: [ISO27001]
industries: [fintech, healthcare, saas, ecommerce, cloud, manufacturing, government]
segments: [enterprise, smb]
source_url: https://www.iso.org/standard/27001
source_publisher: "ISO/IEC Joint Technical Committee JTC 1, Subcommittee SC 27"
last_verified: "2026-02-28"
language: english

license:
  status: licensed
  notes: "ISO/IEC copyright. Reproduction restricted."
  output_policy: citation_only

fact_check:
  status: ai_parsed
  checked_at: "2026-02-28"
  checked_by: openrouter/anthropic/claude-sonnet-4-5

sire:
  subject: information_security_management
  included:
    - ISMS
    - risk assessment
    - risk treatment
    - Statement of Applicability
    - controls
    - Annex A
    - confidentiality
    - integrity
    - availability
  excluded:
    - PHI
    - covered entity
    - business associate
    - HIPAA
    - ePHI
    - data subject
    - personal data
    - controller
    - processor
    - GDPR
    - DPIA
  relevant:
    - ISO-27002:2022
    - NIST-CSF
    - SOC2:CC6
    - HITRUST-CSF
    - ISO-31000:2018

sire
- subject (string)
Domain label (lowercase snake_case). Identity anchor for this corpus.
- included (string[])
Editorial keywords inside this domain. Strengthens discovery but never enforces.
- excluded (string[])
Anti-keywords from other domains. The only deterministic enforcement gate at retrieval time.
- relevant (string[])
Cross-framework references for topological expansion. Discovery only, not enforcement.
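One way to read the `excluded` rule is as a hard filter that runs before any ranking. The sketch below assumes a case-insensitive, whole-phrase match of anti-keywords against the query text. The function name `passes_exclusion_gate` and the matching rule are illustrative assumptions, not documented Foundry behavior.

```python
import re


def passes_exclusion_gate(query: str, excluded: list[str]) -> bool:
    """Return False when the query contains any anti-keyword (hypothetical rule)."""
    for term in excluded:
        # Require non-word characters on both sides so "PHI" does not
        # match inside a longer word such as "philosophy".
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        if re.search(pattern, query, flags=re.IGNORECASE):
            return False
    return True


# A subset of the anti-keywords from the ISO/IEC 27001 frontmatter example.
iso_excluded = ["PHI", "covered entity", "HIPAA", "GDPR", "DPIA"]

print(passes_exclusion_gate("How is the Statement of Applicability scoped?", iso_excluded))  # True
print(passes_exclusion_gate("Which HIPAA safeguards apply to ePHI?", iso_excluded))          # False
```

Because the check is a plain string match with no model in the loop, the same query and the same `excluded` list always give the same answer, which is what makes a gate of this kind deterministic.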
license
- status (enum)
licensed | customer_provided | public_domain | synthetic | unknown
- output_policy (enum)
citation_only | full_text_permitted | unrestricted | attributed_reproduction | restricted_pending
- notes (string)
Human-readable constraint description for the source material.
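The two enums above lend themselves to validation at load time. The following is a minimal sketch that assumes the `license` block has already been parsed into a plain dictionary. The `License` class and `parse_license` helper are hypothetical names introduced for illustration; only the enum values come from the schema.

```python
from dataclasses import dataclass

# Allowed values, copied from the license field reference.
LICENSE_STATUSES = {
    "licensed", "customer_provided", "public_domain", "synthetic", "unknown",
}
OUTPUT_POLICIES = {
    "citation_only", "full_text_permitted", "unrestricted",
    "attributed_reproduction", "restricted_pending",
}


@dataclass(frozen=True)
class License:
    status: str
    output_policy: str
    notes: str = ""

    def __post_init__(self) -> None:
        # Reject anything outside the documented enums.
        if self.status not in LICENSE_STATUSES:
            raise ValueError(f"unknown license status: {self.status!r}")
        if self.output_policy not in OUTPUT_POLICIES:
            raise ValueError(f"unknown output policy: {self.output_policy!r}")


def parse_license(block: dict) -> License:
    """Build a validated License from a parsed frontmatter block (hypothetical helper)."""
    return License(
        status=block.get("status", ""),
        output_policy=block.get("output_policy", ""),
        notes=block.get("notes", ""),
    )


iso_license = parse_license({
    "status": "licensed",
    "notes": "ISO/IEC copyright. Reproduction restricted.",
    "output_policy": "citation_only",
})
print(iso_license.output_policy)  # citation_only
```

Failing fast on an unrecognized value keeps a mistyped policy from being silently treated as permissive further down the pipeline.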
Retrieval modes
Semantic search
Vector similarity retrieval over embedded chunks with metadata filters and thresholding.
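As a rough illustration of that flow, the sketch below filters on metadata first, scores the survivors by cosine similarity, and drops anything under a threshold. The chunk layout, the `tier` metadata key, and the default threshold are assumptions made for the example. A production system would normally delegate this to a vector index rather than a Python loop.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_search(query_vec, chunks, filters, threshold=0.5, top_k=5):
    """Return (score, chunk_id) pairs that pass the filters and the threshold."""
    hits = []
    for chunk in chunks:
        meta = chunk["metadata"]
        # Metadata filter: every requested key must match exactly.
        if any(meta.get(key) != value for key, value in filters.items()):
            continue
        score = cosine(query_vec, chunk["embedding"])
        # Thresholding: discard weak matches instead of returning them.
        if score >= threshold:
            hits.append((score, chunk["id"]))
    hits.sort(reverse=True)
    return hits[:top_k]


chunks = [
    {"id": "c1", "embedding": [1.0, 0.0], "metadata": {"tier": "tier_2"}},
    {"id": "c2", "embedding": [0.0, 1.0], "metadata": {"tier": "tier_2"}},
    {"id": "c3", "embedding": [1.0, 0.0], "metadata": {"tier": "tier_1"}},
]

print(semantic_search([1.0, 0.0], chunks, {"tier": "tier_2"}))  # [(1.0, 'c1')]
```

Here `c2` is dropped by the threshold (similarity 0.0) and `c3` by the metadata filter, even though its vector matches the query exactly.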
Hybrid search
Weighted reciprocal-rank fusion combining vector relevance and full-text ranking.
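The fusion can be sketched in a few lines using the weighting and the constant 20 from this page's hybrid score formula. Two details are assumptions: ranks are taken to be 1-based, and a document that appears in only one result list contributes nothing from the other list. The function `fuse` is a hypothetical helper.

```python
def combined_score(semantic_rank: int, text_rank: int,
                   semantic_weight: float = 0.5, k: int = 20) -> float:
    """Weighted reciprocal-rank score for a document present in both lists."""
    return (semantic_weight * (1 / (k + semantic_rank))
            + (1 - semantic_weight) * (1 / (k + text_rank)))


def fuse(semantic_ids: list[str], text_ids: list[str],
         semantic_weight: float = 0.5, k: int = 20) -> list[str]:
    """Merge two ranked id lists into one list ordered by fused score."""
    sem_rank = {doc: i for i, doc in enumerate(semantic_ids, start=1)}
    txt_rank = {doc: i for i, doc in enumerate(text_ids, start=1)}
    scores = {}
    for doc in set(sem_rank) | set(txt_rank):
        # Assumption: a missing rank contributes 0 rather than a penalty.
        sem = semantic_weight / (k + sem_rank[doc]) if doc in sem_rank else 0.0
        txt = (1 - semantic_weight) / (k + txt_rank[doc]) if doc in txt_rank else 0.0
        scores[doc] = sem + txt
    # Highest score first; ties broken by id so the order is stable.
    return sorted(scores, key=lambda doc: (-scores[doc], doc))


print(fuse(["a", "b", "c"], ["b", "d", "a"], semantic_weight=0.7))
# ['a', 'b', 'c', 'd']
```

With `semantic_weight=0.7`, document `a` (semantic rank 1, text rank 3) narrowly beats `b` (semantic rank 2, text rank 1), which shows how the weight tilts the result toward the vector ranking.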
Hybrid score formula
combined_score = semantic_weight * (1 / (20 + semantic_rank)) + (1 - semantic_weight) * (1 / (20 + text_rank))
Sovereignty and tamper evidence
- Every embedding vector carries an immutable attribution chain: embedding authority, egress policy, and pipeline run attestation
- Un-attributed embeddings are structurally impossible: this is enforced by a database CHECK constraint, not application logic
- Chunk watermarks use HMAC-SHA-256 signatures that verify without database access, so the chunk proves its own provenance
- An immutable event log records every embedding operation (success or failure) for audit and compliance reporting
- Designed for air-gap and VPC deployment where data must never leave the customer's security perimeter
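The watermark idea can be illustrated with Python's standard `hmac` module. This is a sketch under stated assumptions: the signed message layout (corpus id, chunk index, then chunk text), the hex encoding, and the function names are all hypothetical, and key distribution is out of scope.

```python
import hashlib
import hmac


def sign_chunk(key: bytes, corpus_id: str, chunk_index: int, text: str) -> str:
    """Compute an HMAC-SHA-256 watermark over the chunk and its identity."""
    # Assumed layout: identity fields on their own lines, then the text.
    # This relies on corpus_id never containing a newline.
    message = f"{corpus_id}\n{chunk_index}\n".encode("utf-8") + text.encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_chunk(key: bytes, corpus_id: str, chunk_index: int,
                 text: str, watermark: str) -> bool:
    """Recompute the watermark and compare it in constant time."""
    expected = sign_chunk(key, corpus_id, chunk_index, text)
    return hmac.compare_digest(expected, watermark)


key = b"demo-key-not-for-production"
text = "The organization shall define an information security risk assessment process."
mark = sign_chunk(key, "iso-iec-27001-2022-v1", 42, text)

print(verify_chunk(key, "iso-iec-27001-2022-v1", 42, text, mark))             # True
print(verify_chunk(key, "iso-iec-27001-2022-v1", 42, text + " (edit)", mark))  # False
```

Verification needs only the key, the chunk, and its watermark, so it can run without querying the database that stores the corpus.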
Manufacturing quality model
- Stage-gated quality controls from source qualification to production corpus activation
- Per-stage SPC metrics: chunk variation, validation errors, embedding reliability
- Production feedback metrics: retrieval hit rate, citation rate, gate pass rate, freshness age
- Continuous-improvement loop: low-performing corpora are reworked, not ignored
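A per-stage SPC check on a metric such as chunk variation could look like the sketch below. It assumes Shewhart-style limits at three sample standard deviations around a baseline mean, which is a common textbook choice and not necessarily the rule Foundry applies. The sample numbers are invented for illustration.

```python
from statistics import mean, stdev


def control_limits(baseline: list[float], sigmas: float = 3.0) -> tuple[float, float]:
    """Lower and upper control limits derived from baseline runs."""
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation
    return center - sigmas * spread, center + sigmas * spread


def out_of_control(baseline: list[float], new_runs: list[float],
                   sigmas: float = 3.0) -> list[tuple[int, float]]:
    """Return (index, value) for each new run outside the control limits."""
    low, high = control_limits(baseline, sigmas)
    return [(i, value) for i, value in enumerate(new_runs) if not low <= value <= high]


# Hypothetical chunks-per-document counts from earlier pipeline runs.
baseline_counts = [100, 102, 98, 101, 99]
new_counts = [101, 97, 120]

print(out_of_control(baseline_counts, new_counts))  # [(2, 120)]
```

A flagged run is a prompt to inspect that stage before the corpus moves on, which fits the stage-gated model described in the list above.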
Deployment profile
- Works across Studio, Refinery, and Clean Room environments (runtime profiles: Lite, Standard, Full)
- Supports customer-perimeter deployment (air-gap or VPC) for regulated workloads
- Designed for customer-owned data boundary and provenance-preserving operation
See the full architecture
Oracle Foundry is the first layer in the six-system governed inference stack.
Who uses this
Operator
Corpus engineers
Data stewards who curate, version, and maintain authoritative source collections.
Consumer
Every downstream system
Prompt Compiler, Claim Ledger, SIRE Crosswalk, and Forensics Lab all consume oracle artifacts.