
Context Graphs: AI-Optimized Knowledge Graphs

Learn how Context Graphs extend Knowledge Graphs by optimizing specifically for AI model consumption, enabling reduced hallucinations, efficient token usage, and intelligent context engineering.

13 min read
Updated 12/25/2025
TrustGraph Team
#context-graphs #knowledge-graphs #context-engineering #llm-context #ai-optimization


Context Graphs are knowledge graphs specifically engineered and optimized for consumption by AI models. They extend traditional knowledge graphs by incorporating AI-specific optimizations like token efficiency, relevance ranking, provenance tracking, and hallucination reduction—making them ideal for providing LLMs with structured, semantically rich context.

From Knowledge Graphs to Context Graphs

Traditional Knowledge Graph

Traditional knowledge graphs excel at comprehensive data storage and human-oriented querying:

// Traditional Knowledge Graph
{
  entities: [
    // Thousands or millions of entities
    { id: "person_1", type: "Person", name: "Alice Johnson", age: 35, ... },
    { id: "person_2", type: "Person", name: "Bob Smith", age: 42, ... },
    { id: "company_1", type: "Company", name: "TechCorp", founded: 2015, ... },
    // ... thousands more entities
  ],
  relationships: [
    // Tens of thousands of relationships
    { from: "person_1", to: "company_1", type: "works_at", since: 2020 },
    { from: "person_2", to: "company_1", type: "works_at", since: 2015 },
    // ... many more relationships
  ]
}

Optimized for:

  • ✅ Comprehensive data storage
  • ✅ Human querying and exploration
  • ✅ Complex analytical queries
  • ✅ Data warehousing
  • ✅ Long-term knowledge retention

Challenges for AI:

  • ❌ Too large for LLM context windows
  • ❌ Contains irrelevant information
  • ❌ No relevance scoring for queries
  • ❌ Verbose representations waste tokens
  • ❌ Missing provenance and confidence

Context Graph

Context graphs transform knowledge graphs into AI-optimized subgraphs:

// Context Graph (AI-Optimized)
{
  query: "Who leads TechCorp and what is their background?",

  entities: [
    {
      id: "person_alice",
      type: "Person",
      name: "Alice Johnson",
      role: "CEO",
      background: "Former VP at StartupX, Stanford MBA",
      relevance: 0.95  // AI relevance score
    },
    {
      id: "company_techcorp",
      type: "Company",
      name: "TechCorp",
      industry: "Enterprise Software",
      relevance: 0.92
    }
  ],

  relationships: [
    {
      from: "person_alice",
      to: "company_techcorp",
      type: "leads",
      since: "2020",
      strength: 1.0,
      relevance: 0.90
    }
  ],

  metadata: {
    extractionTime: "2025-12-24T10:00:00Z",
    sources: ["hr_database", "linkedin_profile"],
    contextWindow: "8k tokens",
    tokensUsed: 350,  // Fits comfortably in context
    confidenceScore: 0.94
  }
}

Optimized for:

  • LLM consumption - Fits within context windows
  • Semantic clarity - Unambiguous entity semantics
  • Reasoning support - Multi-hop reasoning paths
  • Relevance ranking - Most relevant information prioritized
  • Hallucination reduction - Grounded, verifiable facts
  • Token efficiency - Dense information in minimal tokens
  • Provenance tracking - Source and confidence metadata

Key Characteristics of Context Graphs

1. Query-Driven Subgraph Extraction

Context graphs are dynamically extracted from larger knowledge graphs based on query relevance:

# Using TrustGraph CLI to extract context graph
tg-invoke-graph-rag \
  --flow-id my-graphrag-flow \
  --collection enterprise-docs \
  --query "What cybersecurity threats does our company face?" \
  --max-entities 50 \
  --graph-depth 3 \
  --relevance-threshold 0.7

Result: A focused subgraph containing only entities and relationships relevant to cybersecurity threats, optimized to fit within the LLM's context window.

2. Relevance-Based Ranking

Every entity and relationship receives a relevance score:

// Context graph with relevance scoring
{
  entities: [
    {
      id: "threat_ransomware",
      type: "CyberThreat",
      name: "Ransomware",
      relevance: 0.98,  // Highly relevant to query
      severity: "critical"
    },
    {
      id: "threat_phishing",
      type: "CyberThreat",
      name: "Phishing",
      relevance: 0.95,
      severity: "high"
    },
    {
      id: "defense_firewall",
      type: "SecurityControl",
      name: "Next-Gen Firewall",
      relevance: 0.72,  // Moderately relevant
      effectiveness: "high"
    }
  ],

  // Sorted by relevance for optimal LLM processing
  sortedBy: "relevance_descending"
}
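Relevance scores become actionable when assembling context under a token budget: sort entities by score, then include each until the budget is exhausted. A minimal sketch — the entity shape and per-entity token estimates are illustrative, not TrustGraph's API:

```typescript
interface ScoredEntity {
  id: string;
  name: string;
  relevance: number; // 0..1, higher = more relevant to the query
  tokens: number;    // estimated serialization cost
}

// Greedily pack the highest-relevance entities into a token budget.
function packByRelevance(entities: ScoredEntity[], budget: number): ScoredEntity[] {
  const sorted = [...entities].sort((a, b) => b.relevance - a.relevance);
  const picked: ScoredEntity[] = [];
  let used = 0;
  for (const e of sorted) {
    if (used + e.tokens <= budget) {
      picked.push(e);
      used += e.tokens;
    }
  }
  return picked;
}

const scored: ScoredEntity[] = [
  { id: "threat_ransomware", name: "Ransomware", relevance: 0.98, tokens: 40 },
  { id: "threat_phishing", name: "Phishing", relevance: 0.95, tokens: 35 },
  { id: "defense_firewall", name: "Next-Gen Firewall", relevance: 0.72, tokens: 50 },
];

// With an 80-token budget, Ransomware (40) and Phishing (35) fit;
// the firewall (50) would exceed the budget and is dropped.
const packed = packByRelevance(scored, 80);
```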

3. Token-Efficient Representation

Context graphs maximize information density per token:

## Verbose Format (150 tokens)
The person named Alice Johnson currently holds the position of Chief Executive
Officer at the technology company known as TechCorp, which she has been leading
since January 15th, 2020. TechCorp is a software company that was founded in
2015 and operates in the enterprise software industry. Alice previously worked
at another company before joining TechCorp.

## Context Graph Format (45 tokens)
Alice Johnson → CEO → TechCorp (since 2020-01-15)
TechCorp → industry → Enterprise Software
TechCorp → founded → 2015
Alice Johnson → previous_role → VP at StartupX

Result: 70% token reduction while preserving all essential information.
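The compact arrow format above can be produced mechanically from triples. A small sketch, with a hypothetical `Triple` shape for illustration:

```typescript
interface Triple {
  subject: string;
  predicate: string;
  object: string;
  qualifier?: string; // optional annotation, e.g. "since 2020-01-15"
}

// Serialize triples in the dense "arrow" form shown above:
// one fact per line, qualifiers in parentheses.
function toArrowFormat(triples: Triple[]): string {
  return triples
    .map(t =>
      `${t.subject} → ${t.predicate} → ${t.object}` +
      (t.qualifier ? ` (${t.qualifier})` : "")
    )
    .join("\n");
}

const facts: Triple[] = [
  { subject: "Alice Johnson", predicate: "CEO", object: "TechCorp", qualifier: "since 2020-01-15" },
  { subject: "TechCorp", predicate: "industry", object: "Enterprise Software" },
];

console.log(toArrowFormat(facts));
// Alice Johnson → CEO → TechCorp (since 2020-01-15)
// TechCorp → industry → Enterprise Software
```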

4. Provenance and Confidence Tracking

Every fact includes source information and confidence scores:

{
  triple: {
    subject: "Alice Johnson",
    predicate: "works_at",
    object: "TechCorp",

    // Provenance tracking
    sources: [
      {
        type: "hr_database",
        timestamp: "2024-12-01",
        confidence: 1.0,
        verified: true
      },
      {
        type: "linkedin_profile",
        timestamp: "2024-12-20",
        confidence: 0.95,
        verified: false
      }
    ],

    // Aggregated confidence
    overallConfidence: 0.98,
    lastVerified: "2024-12-01",

    // Temporal validity
    validFrom: "2020-01-15",
    validTo: null  // Still current
  }
}
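Aggregating per-source confidences into an overall score can be done many ways; one plausible rule, sketched below, is a weighted mean where verified sources count double. This is illustrative only — the article does not specify TrustGraph's actual formula:

```typescript
interface SourceRecord {
  confidence: number; // 0..1 confidence reported for this source
  verified: boolean;  // whether the source has been independently verified
}

// Weighted mean of source confidences; verified sources weigh double.
// (Illustrative aggregation, not TrustGraph's documented formula.)
function aggregateConfidence(sources: SourceRecord[]): number {
  let weighted = 0;
  let totalWeight = 0;
  for (const s of sources) {
    const w = s.verified ? 2 : 1;
    weighted += w * s.confidence;
    totalWeight += w;
  }
  return totalWeight === 0 ? 0 : weighted / totalWeight;
}

// Same two sources as the example above: (2 * 1.0 + 0.95) / 3 ≈ 0.98
const overall = aggregateConfidence([
  { confidence: 1.0, verified: true },
  { confidence: 0.95, verified: false },
]);
```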

5. Multi-Hop Reasoning Paths

Context graphs include reasoning chains for complex queries:

// Query: "How does climate change affect AI research funding?"
{
  query: "How does climate change affect AI research funding?",

  reasoningPaths: [
    {
      path: [
        "climate_change",
        "→ affects →",
        "government_priorities",
        "→ influences →",
        "research_funding",
        "→ supports →",
        "ai_research"
      ],
      strength: 0.85,
      evidence: [
        "Government climate reports 2024",
        "NSF research funding budgets",
        "AI research grant database"
      ]
    },
    {
      path: [
        "climate_change",
        "→ creates →",
        "environmental_challenges",
        "→ requires →",
        "ai_solutions",
        "→ increases_demand_for →",
        "ai_research"
      ],
      strength: 0.78,
      evidence: [
        "Climate AI research papers",
        "Tech industry sustainability reports"
      ]
    }
  ],

  synthesis: {
    directImpact: "medium-high",
    mechanisms: ["funding_reallocation", "problem_driven_demand"],
    confidence: 0.82
  }
}
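When ranking competing reasoning paths like the two above, a simple rule is to score a path as the product of its per-edge strengths, so longer or weaker chains naturally rank lower. One plausible scoring rule for illustration — the article does not specify TrustGraph's formula:

```typescript
// Multiply per-edge strengths: a chain is only as strong as the
// combination of its links, and each extra hop can only lower the score.
function pathStrength(edgeStrengths: number[]): number {
  return edgeStrengths.reduce((acc, s) => acc * s, 1);
}

pathStrength([0.95, 0.9]); // two-hop chain, weaker than either single edge
```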

Building Context Graphs in TrustGraph

Method 1: GraphRAG (Schema-Free Extraction)

GraphRAG automatically extracts entities and relationships from unstructured text without requiring predefined schemas:

# Step 1: Set up a collection
tg-set-collection \
  --name "Company Documents" \
  --description "Internal company documentation" \
  my_collection

# Step 2: Create a GraphRAG flow
tg-start-flow \
  --name graph-rag \
  --id company-graphrag \
  --description "Extract knowledge graph from company docs"

# Step 3: Add documents to the collection
tg-add-library-document \
  --name "Cybersecurity Report 2024" \
  --id doc-cybersec-2024 \
  --kind text/plain \
  documents/cybersecurity-report.txt

# Step 4: Process documents into knowledge graph
tg-start-library-processing \
  --flow-id company-graphrag \
  --document-id doc-cybersec-2024 \
  --collection my_collection

# Step 5: Query and extract context graph
tg-invoke-graph-rag \
  --flow-id company-graphrag \
  --collection my_collection \
  --query "What are our top cybersecurity vulnerabilities?"

What happens:

  1. TrustGraph chunks the document
  2. Extracts entities (threats, systems, vulnerabilities)
  3. Extracts relationships (targets, exploits, mitigates)
  4. Builds a knowledge graph
  5. For your query, extracts a relevant context subgraph
  6. Returns AI-optimized context for your LLM
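Steps 1-4 above can be sketched as a pipeline: chunk the document, extract triples per chunk, and merge the results into one graph. The toy regex extractor below stands in for TrustGraph's LLM-backed extraction — it is not the real API, just a stand-in to show the data flow:

```typescript
interface Graph {
  entities: Set<string>;
  relationships: [string, string, string][]; // (subject, predicate, object)
}

// Step 1: split a document into fixed-size chunks.
function chunkText(text: string, maxChars: number): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += maxChars) {
    chunks.push(text.slice(i, i + maxChars));
  }
  return chunks;
}

// Steps 2-3 (toy stand-in): treat "A targets B" style sentences as triples.
function extractTriples(chunk: string): [string, string, string][] {
  const out: [string, string, string][] = [];
  const re = /(\w+) (exploits|targets|mitigates) (\w+)/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(chunk)) !== null) {
    out.push([m[1], m[2], m[3]]);
  }
  return out;
}

// Step 4: merge per-chunk extractions into a single knowledge graph.
function buildGraph(doc: string): Graph {
  const g: Graph = { entities: new Set(), relationships: [] };
  for (const chunk of chunkText(doc, 200)) {
    for (const [s, p, o] of extractTriples(chunk)) {
      g.entities.add(s);
      g.entities.add(o);
      g.relationships.push([s, p, o]);
    }
  }
  return g;
}

const g = buildGraph("Ransomware targets Backups. Firewall mitigates Phishing.");
// g now holds 4 entities and 2 relationships
```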

Learn more: GraphRAG: Graph-Based Retrieval-Augmented Generation

Method 2: Ontology RAG (Schema-Driven Extraction)

Ontology RAG uses predefined OWL ontologies to ensure semantic precision:

# Step 1: Upload your ontology
cat domain-ontology.owl | tg-put-config-item \
  --type ontology \
  --key cybersec-ontology \
  --stdin

# Step 2: Create an Ontology RAG flow
tg-start-flow \
  --name onto-rag \
  --id cybersec-onto-rag \
  --description "Ontology-driven cybersecurity knowledge extraction"

# Step 3: Process documents with ontology
tg-start-library-processing \
  --flow-id cybersec-onto-rag \
  --document-id threat-intel-report \
  --collection threat_intel

# Step 4: Query with ontology-conformant context
tg-invoke-graph-rag \
  --flow-id cybersec-onto-rag \
  --collection threat_intel \
  --query "Which vulnerabilities affect our critical infrastructure?"

What happens:

  1. TrustGraph loads the OWL ontology (e.g., SOSA/SSN, cybersecurity ontology)
  2. Extracts entities that conform to ontology classes
  3. Extracts relationships matching ontology properties
  4. Builds a semantically precise knowledge graph
  5. Extracts context graphs with guaranteed type conformance

Learn more: Ontology RAG: Schema-Driven Knowledge Extraction

Method 3: Manual Context Engineering

For fine-grained control, manually construct context graphs via the API:

# Using TrustGraph API to query and extract context
curl -X POST http://localhost:8001/api/invoke/graph-rag \
  -H "Content-Type: application/json" \
  -d '{
    "flow-id": "my-graphrag-flow",
    "collection": "enterprise-docs",
    "query": "What products does TechCorp sell?",
    "max-entities": 20,
    "max-depth": 3,
    "relevance-threshold": 0.7,
    "include-provenance": true,
    "semantic-expansion": true,
    "prioritize-types": ["sells", "manufactures", "offers", "provides"],
    "output-format": "markdown",
    "include-definitions": true,
    "max-tokens": 2000
  }'

API Response:

{
  "context-graph": {
    "entities": [
      {
        "id": "product_x",
        "type": "Product",
        "name": "Enterprise Platform X",
        "relevance": 0.95,
        "properties": {
          "category": "Software",
          "price": "$50k/year",
          "target-market": "Enterprise"
        }
      }
    ],
    "relationships": [
      {
        "from": "company_techcorp",
        "to": "product_x",
        "type": "sells",
        "relevance": 0.92
      }
    ],
    "metadata": {
      "tokens-used": 1842,
      "entities-count": 18,
      "relationships-count": 34,
      "confidence": 0.89
    }
  }
}
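The curl request above can also be issued programmatically. A minimal TypeScript sketch, assuming the same endpoint and field names; the response shape is assumed to match the sample response:

```typescript
interface GraphRagRequest {
  "flow-id": string;
  collection: string;
  query: string;
  "max-entities"?: number;
  "relevance-threshold"?: number;
}

// POST the request body to the graph-rag endpoint and return the
// parsed JSON response (assumed to contain a "context-graph" object).
async function invokeGraphRag(baseUrl: string, req: GraphRagRequest): Promise<unknown> {
  const res = await fetch(`${baseUrl}/api/invoke/graph-rag`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(req),
  });
  if (!res.ok) throw new Error(`graph-rag request failed: ${res.status}`);
  return res.json();
}

// Mirrors the curl example above; call as:
// await invokeGraphRag("http://localhost:8001", sampleRequest)
const sampleRequest: GraphRagRequest = {
  "flow-id": "my-graphrag-flow",
  collection: "enterprise-docs",
  query: "What products does TechCorp sell?",
  "max-entities": 20,
};
```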

Context Graph Optimization Strategies

1. Hierarchical Summarization

Provide multi-level context to maximize information within token budgets:

{
  // High-level summary (always included, ~50 tokens)
  summary: {
    entities: ["TechCorp", "Q4 2024 Sales"],
    keyFact: "Sales declined 15% in Q4 2024 due to competitor pressure"
  },

  // Detailed context (included if tokens available, ~200 tokens)
  details: {
    entities: [
      {
        id: "q4_2024_sales",
        value: "$10M",
        change: "-15%",
        comparedTo: "q4_2023"
      },
      {
        id: "competitor_x",
        action: "launched rival product",
        date: "2024-10-01"
      }
    ],
    relationships: [
      "competitor_x → launched → rival_product",
      "rival_product → caused → market_share_loss",
      "market_share_loss → resulted_in → sales_decline"
    ]
  },

  // Supporting evidence (included only if space allows, ~500 tokens)
  evidence: {
    customerFeedback: ["..."],
    marketAnalysis: ["..."],
    financialReports: ["..."]
  }
}
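The level structure above lends itself to simple assembly logic: include levels in priority order until the token budget runs out, dropping lower-priority levels wholesale. A sketch with illustrative token estimates:

```typescript
interface ContextLevel {
  name: string;   // e.g. "summary", "details", "evidence"
  text: string;   // serialized context for this level
  tokens: number; // estimated token cost of including it
}

// Include levels in priority order; stop at the first level that
// would overflow the budget (lower-priority levels are dropped whole).
function assembleContext(levels: ContextLevel[], budget: number): string[] {
  const included: string[] = [];
  let used = 0;
  for (const level of levels) {
    if (used + level.tokens > budget) break;
    included.push(level.name);
    used += level.tokens;
  }
  return included;
}

const levels: ContextLevel[] = [
  { name: "summary", text: "Sales declined 15% in Q4 2024...", tokens: 50 },
  { name: "details", text: "...", tokens: 200 },
  { name: "evidence", text: "...", tokens: 500 },
];

// With a 300-token budget, summary (50) and details (200) fit;
// evidence (500) would overflow and is dropped.
assembleContext(levels, 300);
```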

2. Temporal Ordering for Reasoning

Order events chronologically to support causal reasoning:

{
  temporalContext: {
    timeline: [
      {
        date: "2024-01-01",
        event: "TechCorp launches Product X",
        entities: ["product_x", "techcorp"]
      },
      {
        date: "2024-06-15",
        event: "Competitor launches rival product",
        entities: ["competitor", "rival_product"],
        impact: "market_share_loss"
      },
      {
        date: "2024-10-01",
        event: "Sales decline becomes apparent",
        entities: ["q3_sales", "sales_decline"],
        cause: ["market_share_loss", "rival_product"]
      },
      {
        date: "2024-12-24",
        event: "Current state (query time)",
        status: "declining_sales"
      }
    ],

    // Causal chains for reasoning
    causality: [
      "rival_product → market_share_loss → sales_decline"
    ]
  }
}
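Producing a timeline like the one above is mostly a matter of sorting extracted events by date before serialization, so the LLM sees causes before effects. A small sketch with an illustrative event shape:

```typescript
interface TimelineEvent {
  date: string;  // ISO-8601, so lexicographic order equals chronological order
  event: string;
}

// Return events in chronological order without mutating the input.
function orderTimeline(events: TimelineEvent[]): TimelineEvent[] {
  return [...events].sort((a, b) => a.date.localeCompare(b.date));
}

const ordered = orderTimeline([
  { date: "2024-10-01", event: "Sales decline becomes apparent" },
  { date: "2024-01-01", event: "TechCorp launches Product X" },
  { date: "2024-06-15", event: "Competitor launches rival product" },
]);
// ordered[0] is now the January launch; the decline comes last
```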

3. Community Detection

Extract entire communities of related entities:

# Extract a context graph focused on a semantic community
tg-invoke-graph-rag \
  --flow-id company-graphrag \
  --collection enterprise-docs \
  --query "Explain our fintech product ecosystem" \
  --strategy community-based \
  --min-community-size 10

Result: Returns a tightly connected subgraph representing the fintech ecosystem, including products, partners, technologies, and their interrelationships.

Context Graphs vs Knowledge Graphs

| Aspect | Knowledge Graph | Context Graph |
|---|---|---|
| Purpose | Comprehensive knowledge storage | AI model consumption |
| Size | Millions of entities | 10-100s of entities per query |
| Optimization | Query performance, storage | Token efficiency, relevance |
| Semantics | Implicit or explicit | Explicit with definitions |
| Provenance | Optional | Required |
| Confidence | Assumed or absent | Scored and tracked |
| Format | Database-optimized | LLM-optimized (Markdown, JSON) |
| Temporal | Snapshot or versioned | Time-ordered for reasoning |
| Relationships | All relationships stored | Prioritized by relevance |
| Scope | Domain-wide | Query-specific |
| Hallucination reduction | Not a design goal | Core design principle |

Use Cases in TrustGraph

1. Conversational AI Agents

Context graphs provide agents with evolving, conversation-aware context:

// Agent maintains context graph across conversation turns
class AIAgent {
  contextGraph: ContextGraph;

  async processQuery(userMessage: string) {
    // Update context graph with new information from conversation
    await this.contextGraph.update({
      entities: this.extractEntities(userMessage),
      relationships: this.inferRelationships(userMessage),
      conversationContext: {
        previousTurn: this.lastResponse,
        userIntent: this.classifyIntent(userMessage),
        turn: this.conversationTurn++
      }
    });

    // Query with accumulated context
    const response = await llm.generate({
      prompt: userMessage,
      context: this.contextGraph.toMarkdown(),
      systemPrompt: "Use the context graph to answer accurately"
    });

    return response;
  }
}
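The class above calls `this.contextGraph.toMarkdown()` without showing it; a sketch of what such a serializer might emit follows — entities as a bulleted list and relationships in the compact arrow form. The field names are assumptions for illustration:

```typescript
interface CtxEntity { name: string; type: string; relevance: number }
interface CtxRel { from: string; type: string; to: string }

// Render a context graph as LLM-friendly Markdown: a heading plus
// bullet list per section, relationships in the arrow notation.
function toMarkdown(entities: CtxEntity[], relationships: CtxRel[]): string {
  const lines = ["## Entities"];
  for (const e of entities) {
    lines.push(`- ${e.name} (${e.type}, relevance ${e.relevance})`);
  }
  lines.push("", "## Relationships");
  for (const r of relationships) {
    lines.push(`- ${r.from} → ${r.type} → ${r.to}`);
  }
  return lines.join("\n");
}

const md = toMarkdown(
  [{ name: "Alice Johnson", type: "Person", relevance: 0.95 }],
  [{ from: "Alice Johnson", type: "leads", to: "TechCorp" }]
);
```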

Learn more: Agent Memory: Persistent Context for AI Agents

2. RAG with Reduced Hallucinations

Context graphs ground LLM responses in verifiable facts:

# GraphRAG query with hallucination reduction
tg-invoke-graph-rag \
  --flow-id company-graphrag \
  --collection product-docs \
  --query "What are the security features of Product X?" \
  --grounding-mode strict \
  --include-provenance true

Result: LLM receives a context graph with:

  • Only verified entities from the knowledge graph
  • Source documents for each fact
  • Confidence scores for each relationship
  • Temporal validity information

The LLM's response can be validated against the graph to detect hallucinations.
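One way to sketch that validation: treat each structured claim in the response as a triple and check it against the context graph, flagging anything ungrounded. This assumes a claim extractor upstream (claims arrive already structured); it is an illustration, not TrustGraph's grounding implementation:

```typescript
type Claim = [string, string, string]; // (subject, predicate, object)

// Return every claim that has no matching triple in the context graph.
// Unmatched claims are candidate hallucinations for review.
function findUngrounded(claims: Claim[], graphTriples: Claim[]): Claim[] {
  const known = new Set(graphTriples.map(t => t.join("|")));
  return claims.filter(c => !known.has(c.join("|")));
}

const graphTriples: Claim[] = [
  ["Product X", "encrypts", "data at rest"],
];
const claims: Claim[] = [
  ["Product X", "encrypts", "data at rest"], // grounded in the graph
  ["Product X", "certified", "ISO 27001"],   // nowhere in the graph
];

const ungrounded = findUngrounded(claims, graphTriples);
// only the ISO 27001 claim is flagged as ungrounded
```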

3. Decision Support Systems

Multi-factor decision making with structured context:

# Query multiple context graphs for decision support
tg-invoke-graph-rag \
  --flow-id decision-support \
  --collection business-intel \
  --query "Should we expand into the European market?" \
  --include-graphs market-analysis,competitive-landscape,regulatory-env,financial-projections \
  --synthesis multi-factor-decision

Result: Context graph combining:

  • Market opportunity analysis
  • Competitive landscape
  • Regulatory requirements
  • Financial projections
  • Reasoning paths connecting these factors

4. Knowledge-Augmented Code Generation

Code assistants use context graphs of codebases:

# Extract code context graph
tg-show-graph \
  --collection codebase-knowledge \
  --entity-types function,class,module \
  --relationship-types calls,imports,extends \
  --format markdown

Result: Context graph showing:

  • Functions and classes in the codebase
  • Import dependencies
  • Call graphs
  • Inheritance hierarchies

Used to generate code that integrates correctly with the existing codebase.

Working with Knowledge Cores

TrustGraph uses Knowledge Cores as modular, isolated knowledge graphs:

# Create a knowledge core (isolated knowledge graph)
echo '{"id": "cybersec-core", "description": "Cybersecurity knowledge"}' | \
  tg-put-kg-core --key cybersec-core --stdin

# Load knowledge core for processing
tg-load-kg-core --core-id cybersec-core

# Show available knowledge cores
tg-show-kg-cores

# Extract context graph from specific knowledge core
tg-invoke-graph-rag \
  --flow-id company-graphrag \
  --kg-core-id cybersec-core \
  --query "What are our vulnerability management processes?"

# Unload knowledge core when done
tg-unload-kg-core --core-id cybersec-core

Knowledge Cores enable:

  • Multi-tenant isolation (separate knowledge graphs per customer)
  • Domain-specific context (cybersecurity core, finance core, etc.)
  • Version control (snapshot knowledge at specific points in time)
  • Access control (different users access different cores)

Learn more: Knowledge Cores: Modular Memory

Exporting Context Graphs

TrustGraph supports multiple serialization formats:

# Export context graph to Turtle (RDF)
tg-show-graph \
  --collection enterprise-docs \
  --output-format turtle

tg-graph-to-turtle \
  --collection enterprise-docs \
  > context-graph.ttl

# Export to JSON-LD
tg-show-graph \
  --collection enterprise-docs \
  --output-format jsonld

# Export to Markdown (LLM-optimized)
tg-show-graph \
  --collection enterprise-docs \
  --output-format markdown

Example Turtle output:

@prefix ex: <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ex:alice a ex:Person ;
    ex:name "Alice Johnson" ;
    ex:worksAt ex:techcorp ;
    ex:role "CEO" ;
    ex:since "2020-01-15"^^xsd:date .

ex:techcorp a ex:Company ;
    ex:name "TechCorp" ;
    ex:industry "Enterprise Software" ;
    ex:founded "2015"^^xsd:gYear .

Best Practices

1. Match Context Window Constraints

# Ensure context fits in model's context window
tg-invoke-graph-rag \
  --flow-id company-graphrag \
  --collection enterprise-docs \
  --query "..." \
  --max-tokens 8000 \
  --reserve-tokens 2000 \
  --target-tokens 6000

2. Include Semantic Definitions

Don't assume LLMs know domain-specific terms:

{
  entity: "SOSA",
  type: "Ontology",
  definition: "Sensor, Observation, Sample, and Actuator ontology",
  context: "W3C standard for IoT semantic interoperability",
  example: "Used to describe sensor measurements in RDF"
}

3. Provide Reasoning Paths

Help LLMs follow logical chains:

{
  query: "Why did revenue increase?",
  reasoningChains: [
    {
      chain: [
        "new_product_launch",
        "→ caused →",
        "customer_acquisition",
        "→ resulted_in →",
        "revenue_increase"
      ],
      evidence: ["sales_report.pdf", "marketing_analytics.csv"]
    }
  ]
}

4. Track Uncertainty

Mark uncertain or low-confidence information:

{
  fact: "Competitor may launch product in Q3",
  confidence: 0.6,
  source: "industry_rumors",
  evidenceStrength: "weak",
  alternativeHypotheses: [
    { hypothesis: "Competitor delays to Q4", probability: 0.3 },
    { hypothesis: "Competitor cancels product", probability: 0.1 }
  ]
}

5. Monitor Token Usage

Track and optimize token consumption:

# Query with token budget monitoring
tg-invoke-graph-rag \
  --flow-id company-graphrag \
  --collection enterprise-docs \
  --query "..." \
  --max-tokens 4000 \
  --report-token-usage true

Summary

Context Graphs extend Knowledge Graphs by:

  1. Optimizing for AI consumption - Token efficiency, context window awareness
  2. Prioritizing relevance - Query-driven subgraph extraction with scoring
  3. Tracking provenance - Source attribution and confidence scores
  4. Enabling reasoning - Multi-hop paths and temporal ordering
  5. Reducing hallucinations - Grounded facts with validation

TrustGraph provides comprehensive tools for building and querying context graphs through:

  • GraphRAG - Schema-free knowledge extraction
  • Ontology RAG - Schema-driven semantic precision
  • CLI tools - Full lifecycle management
  • APIs - Programmatic access and integration
  • Knowledge Cores - Modular, multi-tenant graphs
