TrustGraph
Key Concepts · Intermediate

Ontologies and Context Graphs

Ontologies are the semantic grounding infrastructure that makes context graphs legible to AI. Learn how OWL ontologies define meaning, guide knowledge extraction, and enable explainability in TrustGraph.

9 min read
Updated 3/19/2026
Daniel Davis
#ontologies #context-graphs #owl #ontology-rag #semantic-grounding #knowledge-graphs #explainability

An ontology is often described as a formal schema for knowledge—a definition of what types of things exist, how they relate, and what properties are valid within a domain. In practice, for AI systems built on knowledge graphs, an ontology does something more important: it gives the graph meaning.

Without an ontology, a knowledge graph is a collection of facts. With one, it is a collection of facts that both humans and AI models can interpret without ambiguity. That difference is the foundation of everything TrustGraph calls semantic grounding, and it is why ontologies are inseparable from the context graph concept.

What an Ontology Actually Does

Consider a triple extracted from a medical report:

ex:Alice ex:hasDiagnosis ex:Condition_001 .

The word "diagnosis" here could mean a clinical diagnosis, a software system error diagnosis, or a diagnostic test. A language model reading this triple has no way to know which, unless the graph carries type information.

With ontological grounding, the triple becomes:

ex:Alice a onto:Patient ;
    onto:receivedDiagnosis ex:Diagnosis_001 .

ex:Diagnosis_001 a onto:ClinicalDiagnosis ;
    onto:diagnosedCondition onto:Type2Diabetes ;
    onto:diagnosedBy ex:Physician_Smith ;
    onto:diagnosisDate "2025-11-14"^^xsd:date .

The entity types—onto:Patient, onto:ClinicalDiagnosis—come from the ontology's class hierarchy. The properties—onto:receivedDiagnosis, onto:diagnosedCondition—come from the ontology's property definitions. Every node in the graph now carries meaning that is enforced by schema, not inferred from text.

This is what ontological grounding means: the graph does not just store facts—it stores facts with a formal semantic contract that makes those facts legible to any system that reads the graph, including an LLM receiving a subgraph as context.

The Three Layers of an OWL Ontology

TrustGraph uses OWL (Web Ontology Language) as its ontology standard, the same W3C specification that underpins the semantic web. An OWL ontology defines three things:

| Layer | What it defines | Example |
|---|---|---|
| Classes | Types of entities that exist | onto:Patient, onto:ClinicalDiagnosis |
| Properties | Relationships and attributes between entities | onto:receivedDiagnosis, onto:diagnosisDate |
| Axioms | Constraints and rules | An Observation must have exactly one hasResult |
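As a concrete illustration, the axiom in the last row above could be expressed in OWL as a cardinality restriction. This is a minimal sketch: the prefixes and the onto: namespace URI are illustrative, not TrustGraph's actual schema.

```turtle
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix onto: <http://example.org/onto#> .

# "An Observation must have exactly one hasResult" as an OWL restriction.
onto:Observation a owl:Class ;
    rdfs:subClassOf [
        a owl:Restriction ;
        owl:onProperty onto:hasResult ;
        owl:cardinality "1"^^xsd:nonNegativeInteger
    ] .
```

A reasoner or validator checking a graph against this ontology can flag any onto:Observation with zero or multiple onto:hasResult values as non-conformant.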

These three layers together form the semantic contract for a domain. When TrustGraph extracts knowledge from documents using an ontology, the extracted graph is validated against this contract. Entities not matching defined classes are not extracted. Relationships not matching defined properties are not stored. The graph that results is conformant—it means what the ontology says it means.

Standard Ontologies Worth Knowing

Building an ontology from scratch is possible, but most domains have established ontologies worth building on:

| Ontology | Domain | Common Use |
|---|---|---|
| SOSA/SSN | Sensors and observations | IoT, scientific measurement, intelligence |
| FOAF | Social networks and people | Identity, organizational relationships |
| Dublin Core | Document metadata | Content management, publishing |
| PROV-O | Data provenance | Audit trails, lineage tracking |
| FIBO | Financial industry | Banking, securities, risk |
| SNOMED CT | Medical terminology | Clinical data, healthcare AI |

TrustGraph itself uses PROV-O as its internal ontology for tracking how knowledge entered the system and how it was used—more on that below.

Ontologies and Knowledge Extraction

The first place ontologies do practical work in TrustGraph is during knowledge extraction: the process of turning unstructured documents into graph triples.

TrustGraph supports two extraction modes:

GraphRAG extracts entities and relationships automatically. The LLM reads document chunks and discovers what entities seem to exist and how they relate—without any predefined schema. This is flexible and requires no setup, but the resulting graph is schema-free. "Diagnosis" might appear as a medical finding in one document and as a software term in another, with no formal distinction.

Ontology RAG uses an OWL ontology to guide extraction. For each document chunk, the ontology is loaded into the extraction context, and the LLM is constrained to extract only entities that match defined classes and only relationships that match defined properties. The resulting graph is schema-conformant.

The difference in output quality is significant for domains where precision matters. With Ontology RAG, the question "what sensors were used in this intelligence report?" returns entities specifically typed as intel:MaritimeSensor—not a mix of sensors, detectors, monitors, and surveillance systems that share similar text.
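A schema-conformant extraction result from that intelligence-report scenario might look like the following sketch. The class intel:MaritimeSensor is taken from the article's example; the other entity and property names, and the namespace URIs, are illustrative assumptions.

```turtle
@prefix intel: <http://example.org/intel#> .
@prefix ex:    <http://example.org/data#> .

# The extracted entity carries a precise ontological type, so a query
# for intel:MaritimeSensor will not match generic "detectors" or "monitors".
ex:Sensor_042 a intel:MaritimeSensor ;
    intel:mentionedIn ex:Report_17 ;
    intel:deployedOn ex:Vessel_Aurora .
```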

How Ontology RAG Handles Ontology Size

One practical challenge with ontologies is that a comprehensive domain ontology can be large—large enough to exceed an LLM's context window if loaded in full. TrustGraph addresses this by applying the GraphRAG algorithm to the ontology itself, stored as a graph. At extraction time, vector similarity identifies the subset of the ontology relevant to the document chunk being processed. The LLM receives only the relevant ontology slice, not the entire schema. This makes ontology-guided extraction practical even with large, multi-class ontologies.
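The slice-selection step described above can be sketched in a few lines of Python. This is a toy illustration of the mechanics, not TrustGraph's implementation: a real system would use learned embeddings, whereas here a bag-of-words vector and cosine similarity stand in so the example is self-contained.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words term counts."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def relevant_slice(chunk: str, ontology_classes: dict[str, str], k: int = 2) -> list[str]:
    """Return the k ontology classes whose descriptions best match the chunk."""
    cv = embed(chunk)
    ranked = sorted(ontology_classes,
                    key=lambda c: cosine(cv, embed(ontology_classes[c])),
                    reverse=True)
    return ranked[:k]

# Hypothetical class descriptions; names echo the article's examples.
classes = {
    "intel:MaritimeSensor": "sensor observing maritime vessels at sea",
    "onto:ClinicalDiagnosis": "clinical diagnosis of a medical condition",
    "onto:Patient": "patient person receiving medical care",
}
chunk = "The report describes a sensor array tracking vessels at sea."
print(relevant_slice(chunk, classes, k=1))  # → ['intel:MaritimeSensor']
```

Only the winning slice of the ontology would then be placed in the extraction prompt, keeping the context window small regardless of how large the full schema grows.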

Ontologies and Explainability

The second place ontologies do critical work is in TrustGraph's explainability architecture.

TrustGraph maintains three separate named graphs within the same RDF store:

  1. Default graph — Core knowledge facts (the knowledge graph itself)
  2. urn:graph:source — Extraction provenance: how knowledge entered the system
  3. urn:graph:retrieval — Query-time reasoning traces: how knowledge was used

Both the provenance layer and the reasoning layer are stored as RDF triples conforming to the W3C PROV-O ontology. This means the metadata about knowledge is not a log file or a JSON blob—it is itself a formally typed, queryable graph that follows the same semantic standards as the knowledge it describes.

The provenance chain for any piece of knowledge runs four levels deep:

Document (original file)
  └─ Pages (extracted from document)
       └─ Chunks (text segments from pages)
            └─ Subgraphs (edges extracted from chunks)
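Expressed in PROV-O, the chain above becomes a sequence of prov:wasDerivedFrom links. The entity identifiers in this sketch are illustrative; only prov:wasDerivedFrom is standard PROV-O vocabulary.

```turtle
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix ex:   <http://example.org/data#> .

ex:Subgraph_9 a prov:Entity ; prov:wasDerivedFrom ex:Chunk_3 .
ex:Chunk_3    a prov:Entity ; prov:wasDerivedFrom ex:Page_12 .
ex:Page_12    a prov:Entity ; prov:wasDerivedFrom ex:Document_report .
ex:Document_report a prov:Entity .
```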

At query time, TrustGraph records the full reasoning trace: the question, the graph nodes used as grounding, the traversal path through the graph, the edge selection with reasoning, and the final synthesis. These traces are stored as PROV-O triples and remain permanently queryable.

The practical consequence: for any AI-generated response, you can trace backward through the graph to the exact triples that grounded it, the model parameters in effect, and the source document those triples came from. This is a compliance and trust requirement for regulated industries—and it is made possible by ontological grounding at the metadata layer, not just the knowledge layer.
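That backward trace can be run as an ordinary SPARQL query over the provenance graph. In this sketch, the named-graph URI comes from the article and prov:wasDerivedFrom is standard PROV-O; the four-level chain shape follows the hierarchy described above, though TrustGraph's actual predicates and identifiers may differ.

```sparql
PREFIX prov: <http://www.w3.org/ns/prov#>

# Walk from an extracted subgraph back to its source document.
SELECT ?subgraph ?chunk ?page ?document
WHERE {
  GRAPH <urn:graph:source> {
    ?subgraph prov:wasDerivedFrom ?chunk .
    ?chunk    prov:wasDerivedFrom ?page .
    ?page     prov:wasDerivedFrom ?document .
  }
}
```

Because the metadata is RDF rather than a log file, this audit is a query, not a forensic reconstruction.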

Ontologies in the Context Graph Architecture

The context graph is defined by three layers built on top of a base knowledge graph: ontological grounding, AI-optimized retrieval, and reification of agentic behavior. Ontologies are present in all three.

In the knowledge layer: OWL ontologies define the semantic vocabulary of the graph. Every entity has a class; every relationship has a typed property. The graph is legible to an LLM because the ontology enforces what the nodes mean.

In the retrieval layer: Ontology RAG uses the ontology to guide extraction and to filter retrieval results by type. When the LLM receives a subgraph as context, the structured format—RDF Turtle, JSON-LD, or Markdown—carries ontological type information that the LLM can use to reason correctly about what it is reading.

In the reification layer: When agentic behavior is reified into the graph—user requests, model parameters, reasoning chains, timestamps—that metadata is itself typed according to an ontological schema. The cg:AgentInteraction class, the cg:modelUsed property, the cg:queryTriples relationship: these are ontological definitions that make the reified behavior queryable and auditable in the same way as the core knowledge.
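A reified interaction using the cg: terms named above might be recorded as follows. The class and property names come from the article; the namespace URI, the ex: identifiers, the literal values, and the cg:timestamp property are illustrative assumptions.

```turtle
@prefix cg:  <http://example.org/cg#> .
@prefix ex:  <http://example.org/data#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ex:Interaction_501 a cg:AgentInteraction ;
    cg:modelUsed "example-model-id" ;
    cg:queryTriples ex:Subgraph_9 ;
    cg:timestamp "2025-11-14T10:32:00Z"^^xsd:dateTime .  # hypothetical property
```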

Choosing Between GraphRAG and Ontology RAG

The decision is not either/or—both modes operate on the same underlying graph and can be used together. The practical question is where to invest in ontology definition.

| Situation | Recommendation |
|---|---|
| Exploratory analysis, unknown entity types | Start with GraphRAG |
| Domain has established W3C or industry ontologies | Use Ontology RAG with existing ontology |
| Type precision is a compliance requirement | Ontology RAG required |
| Schema changes frequently | GraphRAG, or keep ontology minimal |
| Need SPARQL queries over typed entities | Ontology RAG required |
| Large document sets, diverse domains | GraphRAG for discovery, Ontology RAG for precision |

A common pattern: use GraphRAG first to discover what entity types naturally emerge from a document set, then use those findings to design a focused ontology for Ontology RAG. The ontology defines the 5–10 entity types that matter most; GraphRAG handles the rest.

What Makes a Good Ontology for Context Graphs

Ontology design for AI extraction is different from ontology design for pure knowledge representation. A few principles that hold in practice:

Keep it focused. A 5-class ontology with 10 well-defined properties extracts more reliably than a 50-class hierarchy with deep inheritance. The LLM's extraction task gets harder as the ontology gets more complex. Define only the types that matter for your retrieval use case.

Use existing ontologies as a base. Extending SOSA/SSN for sensor domains, or FOAF for organizational data, gives you tested semantics and interoperability with external systems. Build custom types as subclasses of established classes where possible.
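Extending an established ontology is typically a one-line subclass axiom. This sketch builds on SOSA/SSN, whose sosa:Sensor class is standard; the intel: namespace is illustrative.

```turtle
@prefix owl:   <http://www.w3.org/2002/07/owl#> .
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sosa:  <http://www.w3.org/ns/sosa/> .
@prefix intel: <http://example.org/intel#> .

# A domain-specific sensor type that inherits SOSA's tested semantics,
# so any tool that understands sosa:Sensor also understands this class.
intel:MaritimeSensor a owl:Class ;
    rdfs:subClassOf sosa:Sensor ;
    rdfs:label "Maritime sensor" .
```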

Iterate based on extraction results. Low extraction rates usually mean the ontology is too restrictive—entity types that exist in your documents are not represented in your schema. Monitor which types of information are being missed and extend the ontology accordingly.

AI can generate first drafts. Providing a domain description to an LLM and asking for an OWL ontology in Turtle format produces a usable starting point that you can then refine. The refinement still requires domain expertise, but the initial structure generation can be automated.
