Semantic Structures

Semantic structures are formal representations that organize knowledge with explicit meaning and context. Unlike plain data structures, which only store values, semantic structures capture what the information means through defined types, relationships, constraints, and rules. In TrustGraph, semantic structures enable machines to interpret, reason about, and validate knowledge.

What Are Semantic Structures?

Semantic structures provide machine-understandable definitions of:

Types: What kinds of things exist (Person, Organization, Event, Concept)
Properties: What attributes things have (name, age, location, description)
Relationships: How things connect (works_at, founded, influences, part_of)
Constraints: What rules apply (age must be positive, dates must be chronological)
Hierarchies: How types organize (Employee is_a Person, CEO is_a Employee)
Rules: What can be inferred (if X manages Y and Y manages Z, then X supervises Z)

Example: Without vs With Semantic Structure

Without semantic structure (plain data):

{
  "name": "Alice Johnson",
  "thing": "TechCorp",
  "date": "2020-01-15"
}

❌ What does "thing" mean?
❌ What's the relationship between Alice and TechCorp?
❌ What does the date represent?
❌ No validation possible

With semantic structure:

{
  "@type": "Person",
  "name": "Alice Johnson",
  "worksFor": {
    "@type": "Organization",
    "name": "TechCorp"
  },
  "startDate": "2020-01-15",
  "@context": "https://schema.org"
}

✅ Clear types (Person, Organization)
✅ Explicit relationship (worksFor, the Schema.org property linking a Person to an Organization)
✅ Semantic property (startDate)
✅ Can be validated against a schema

Types of Semantic Structures

1. Schemas

Definition: A specification of the structure of data in terms of types and properties.

// Property graph schema, written as a JavaScript object
const schema = {
  nodeTypes: [
    {
      type: "Person",
      properties: {
        name: { type: "string", required: true },
        email: { type: "string", pattern: "^[^@]+@[^@]+$" },
        age: { type: "integer", min: 0, max: 150 },
        dateOfBirth: { type: "date" }
      },
      indexes: ["name", "email"]
    },
    {
      type: "Organization",
      properties: {
        name: { type: "string", required: true },
        founded: { type: "date" },
        industry: { type: "string", enum: ["tech", "finance", "healthcare"] }
      }
    }
  ],

  relationshipTypes: [
    {
      type: "WORKS_AT",
      sourceType: "Person",
      targetType: "Organization",
      properties: {
        startDate: { type: "date", required: true },
        role: { type: "string" },
        department: { type: "string" }
      }
    },
    {
      type: "MANAGES",
      sourceType: "Person",
      targetType: "Person",
      properties: {
        since: { type: "date" }
      }
    }
  ]
};

// Apply schema to TrustGraph
await trustgraph.applySchema(schema);
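
To illustrate what the relationshipTypes part of such a schema makes checkable, here is a minimal, self-contained sketch that tests whether an edge's endpoints match the declared sourceType and targetType. The helper `checkRelationship` and the in-memory `nodes` map are hypothetical and are not part of the TrustGraph API; they only mirror the shape of the schema object above.

```javascript
// Hypothetical sketch: not a TrustGraph API, only the schema shape from above.
const relationshipTypes = [
  { type: "WORKS_AT", sourceType: "Person", targetType: "Organization" },
  { type: "MANAGES", sourceType: "Person", targetType: "Person" }
];

// In-memory stand-in for stored nodes: id -> node type.
const nodes = new Map([
  ["alice", "Person"],
  ["bob", "Person"],
  ["techcorp", "Organization"]
]);

// Returns a list of problems; an empty list means the edge fits the schema.
function checkRelationship(edge) {
  const def = relationshipTypes.find((r) => r.type === edge.type);
  if (!def) return [`unknown relationship type: ${edge.type}`];
  const errors = [];
  if (nodes.get(edge.source) !== def.sourceType) {
    errors.push(`${edge.type} source must be a ${def.sourceType}`);
  }
  if (nodes.get(edge.target) !== def.targetType) {
    errors.push(`${edge.type} target must be a ${def.targetType}`);
  }
  return errors;
}

console.log(checkRelationship({ type: "WORKS_AT", source: "alice", target: "techcorp" }));
console.log(checkRelationship({ type: "MANAGES", source: "alice", target: "techcorp" }));
```

The first call reports no problems, while the second is rejected because MANAGES is declared between two Person nodes.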

2. Ontologies

Definition: Rich semantic models with formal logic, hierarchies, and inference rules.

# OWL Ontology (Turtle syntax)
@prefix : <http://example.org/org#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# Class definitions
:Person a owl:Class ;
    rdfs:label "Person" ;
    rdfs:comment "A human being" .

:Employee a owl:Class ;
    rdfs:subClassOf :Person ;
    rdfs:label "Employee" .

:Manager a owl:Class ;
    rdfs:subClassOf :Employee ;
    rdfs:label "Manager" .

:Organization a owl:Class ;
    rdfs:label "Organization" .

# Property definitions
:worksAt a owl:ObjectProperty ;
    rdfs:domain :Employee ;
    rdfs:range :Organization .

:manages a owl:ObjectProperty ;
    rdfs:domain :Manager ;
    rdfs:range :Employee .

:supervises a owl:ObjectProperty ;
    rdfs:domain :Manager ;
    rdfs:range :Employee ;
    owl:propertyChainAxiom ( :manages :manages ) .  # Inference rule

# Constraints
:email a owl:DatatypeProperty ;
    rdfs:domain :Person ;
    rdfs:range xsd:string .

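:Employee owl:equivalentClass [
    a owl:Restriction ;
    owl:onProperty :worksAt ;
    # An Employee works at at least one Organization; OWL 2 expects a nonNegativeInteger literal here
    owl:minCardinality "1"^^<http://www.w3.org/2001/XMLSchema#nonNegativeInteger>
] .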

Use in TrustGraph:

// Load ontology
await trustgraph.loadOntology("ontologies/organization.owl");

// Reasoning engine applies inference rules
const person = await trustgraph.getNode("person_123");

// Automatic inference: person manages X, who manages Y
// Therefore: person supervises Y (via the manages/manages property chain)
const supervised = await trustgraph.query({
  query: "Who does person_123 supervise?",
  reasoning: true  // Apply ontology rules
});
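
The property chain in the ontology above states that supervises follows from two consecutive manages links. As a rough illustration of what a reasoner derives from that single axiom, the sketch below computes the inferred pairs over an in-memory edge list. The function `inferSupervises` and the sample identifiers are hypothetical; a real OWL reasoner handles many more constructs than this.

```javascript
// Hypothetical sketch of the chain (:manages :manages) => :supervises.
const manages = [
  ["person_123", "person_456"],
  ["person_456", "person_789"],
  ["person_456", "person_790"]
];

// For every "X manages Y" and "Y manages Z", emit "X supervises Z".
function inferSupervises(managesEdges) {
  const reportsOf = new Map(); // manager -> direct reports
  for (const [manager, report] of managesEdges) {
    if (!reportsOf.has(manager)) reportsOf.set(manager, []);
    reportsOf.get(manager).push(report);
  }
  const inferred = [];
  for (const [x, y] of managesEdges) {
    for (const z of reportsOf.get(y) ?? []) {
      inferred.push([x, z]);
    }
  }
  return inferred;
}

console.log(inferSupervises(manages));
```

Direct reports do not appear in the result: the chain has length two, so a single manages link does not imply supervises under this axiom.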

3. Vocabularies

Definition: Standardized terms and their meanings for a domain.

// Schema.org vocabulary usage
const article = {
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Understanding Knowledge Graphs",
  "author": {
    "@type": "Person",
    "name": "Jane Smith"
  },
  "datePublished": "2025-12-24",
  "publisher": {
    "@type": "Organization",
    "name": "TechPress",
    "logo": {
      "@type": "ImageObject",
      "url": "https://techpress.com/logo.png"
    }
  }
};

// TrustGraph understands Schema.org vocabulary
await trustgraph.ingest({
  data: article,
  vocabulary: "https://schema.org"
});

Common vocabularies:

Schema.org: General web content (articles, products, events, people)
FOAF (Friend of a Friend): People and social networks
Dublin Core: Document metadata (title, creator, date)
SKOS: Taxonomies and thesauri
Custom: Domain-specific vocabularies
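
What a shared vocabulary buys you is that short keys such as "name" resolve to globally unique identifiers. The sketch below shows the idea in a deliberately simplified form: it prefixes every key and type with one vocabulary base. The function `expandTerms` and the base IRI passed to it are illustrative assumptions; a real JSON-LD processor follows the mappings in the context document and handles many more cases.

```javascript
// Hypothetical sketch: expand short keys into full IRIs using one vocabulary base.
// Assumption: every plain key belongs to the vocabulary at `vocabBase`.
function expandTerms(doc, vocabBase) {
  const expanded = {};
  for (const [key, value] of Object.entries(doc)) {
    if (key === "@context") continue; // consumed by the expansion
    if (key === "@type") {
      expanded["@type"] = vocabBase + value;
    } else if (value !== null && typeof value === "object" && !Array.isArray(value)) {
      expanded[vocabBase + key] = expandTerms(value, vocabBase); // nested node
    } else {
      expanded[vocabBase + key] = value;
    }
  }
  return expanded;
}

const sampleArticle = {
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Understanding Knowledge Graphs",
  "author": { "@type": "Person", "name": "Jane Smith" }
};

const expandedArticle = expandTerms(sampleArticle, "https://schema.org/");
console.log(JSON.stringify(expandedArticle, null, 2));
```

After expansion, two systems that both wrote "name" can tell that they mean the same property, because both keys resolve to the same identifier.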

4. Taxonomies

Definition: Hierarchical classification of concepts.

// Taxonomy structure
const industryTaxonomy = {
  "Industry": {
    "Technology": {
      "Software": {
        "Enterprise Software": ["CRM", "ERP", "Analytics"],
        "Consumer Software": ["Social Media", "Gaming", "Productivity"]
      },
      "Hardware": {
        "Consumer Electronics": ["Smartphones", "Laptops", "Wearables"],
        "Industrial Hardware": ["Servers", "Networking", "IoT Devices"]
      }
    },
    "Healthcare": {
      "Pharmaceuticals": ["Drug Manufacturing", "Biotech"],
      "Medical Devices": ["Diagnostic Equipment", "Surgical Instruments"],
      "Healthcare Services": ["Hospitals", "Clinics", "Telemedicine"]
    },
    "Finance": {
      "Banking": ["Retail Banking", "Investment Banking"],
      "Insurance": ["Life Insurance", "Property Insurance"],
      "Investment": ["Asset Management", "Venture Capital"]
    }
  }
};

// Load taxonomy into TrustGraph
await trustgraph.loadTaxonomy({
  taxonomy: industryTaxonomy,
  relationshipType: "subcategory_of"
});

// Query using taxonomy
const techCompanies = await trustgraph.query({
  query: "Find all companies in Technology",
  includeSubcategories: true  // Includes Software, Hardware, and their children
});
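
To make the effect of includeSubcategories concrete, the following sketch flattens a small nested taxonomy into child/parent pairs and then collects everything below a given category. The functions `toEdges` and `descendants` are hypothetical helpers written for this illustration and say nothing about how TrustGraph stores taxonomies internally.

```javascript
// Hypothetical sketch: flatten a nested taxonomy into [child, parent] pairs.
const sampleTaxonomy = {
  "Industry": {
    "Technology": {
      "Software": ["CRM", "ERP"],
      "Hardware": ["Servers"]
    },
    "Finance": {
      "Banking": ["Retail Banking"]
    }
  }
};

// Objects are inner categories; arrays hold leaf categories.
function toEdges(node, parent = null, edges = []) {
  if (Array.isArray(node)) {
    for (const leaf of node) edges.push([leaf, parent]);
  } else {
    for (const [name, children] of Object.entries(node)) {
      if (parent !== null) edges.push([name, parent]);
      toEdges(children, name, edges);
    }
  }
  return edges;
}

// Collect every category below `root`, breadth first.
function descendants(edges, root) {
  const found = [];
  const queue = [root];
  while (queue.length > 0) {
    const current = queue.shift();
    for (const [child, parent] of edges) {
      if (parent === current) {
        found.push(child);
        queue.push(child);
      }
    }
  }
  return found;
}

const taxonomyEdges = toEdges(sampleTaxonomy);
console.log(descendants(taxonomyEdges, "Technology"));
```

A query for Technology that includes subcategories would then match entities tagged with any of the returned names, not only with Technology itself.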

Why Semantic Structures Matter

1. Machine Understanding

Enable machines to understand data semantics:

// Without semantics
const data = {
  "Person1": { "relation": "Person2" }
};
// Machine doesn't know what "relation" means

// With semantics
const semanticData = {
  "@type": "Person",
  "id": "person1",
  "parent": {  // Semantic property with clear meaning
    "@type": "Person",
    "id": "person2"
  }
};
// Machine understands "parent" relationship and can reason about it

2. Data Validation

Enforce constraints and rules:

// Schema with constraints
const schema = {
  nodeTypes: [{
    type: "Person",
    properties: {
      age: { type: "integer", min: 0, max: 150 },
      email: { type: "string", pattern: "^[^@]+@[^@]+$" }
    }
  }]
};

// Apply the schema so that its constraints are enforced
await trustgraph.applySchema(schema);

// Validation catches errors
try {
  await trustgraph.createNode({
    type: "Person",
    properties: {
      age: -5,  // ❌ Invalid: negative age
      email: "invalid"  // ❌ Invalid: wrong format
    }
  });
} catch (error) {
  console.log(error.validationErrors);
  // [
  //   "age must be >= 0",
  //   "email must match pattern ^[^@]+@[^@]+$"
  // ]
}
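
For readers who want to see how such constraints can be evaluated, here is a minimal sketch of a property validator that produces messages in the same style as the example above. The function `validateProperties` is a hypothetical stand-in and does not claim to reflect TrustGraph's own validator; it covers only the required, type, min, max, and pattern keys.

```javascript
// Hypothetical sketch of constraint checking; not TrustGraph's validator.
const personProperties = {
  age: { type: "integer", min: 0, max: 150 },
  email: { type: "string", pattern: "^[^@]+@[^@]+$" }
};

function validateProperties(definitions, values) {
  const errors = [];
  for (const [name, rule] of Object.entries(definitions)) {
    const value = values[name];
    if (value === undefined) {
      if (rule.required) errors.push(`${name} is required`);
      continue;
    }
    // Type checks come first so range and pattern checks see the right kind of value.
    if (rule.type === "integer" && !Number.isInteger(value)) {
      errors.push(`${name} must be an integer`);
      continue;
    }
    if (rule.type === "string" && typeof value !== "string") {
      errors.push(`${name} must be a string`);
      continue;
    }
    if (rule.min !== undefined && value < rule.min) {
      errors.push(`${name} must be >= ${rule.min}`);
    }
    if (rule.max !== undefined && value > rule.max) {
      errors.push(`${name} must be <= ${rule.max}`);
    }
    if (rule.pattern !== undefined && !new RegExp(rule.pattern).test(value)) {
      errors.push(`${name} must match pattern ${rule.pattern}`);
    }
  }
  return errors;
}

console.log(validateProperties(personProperties, { age: -5, email: "invalid" }));
console.log(validateProperties(personProperties, { age: 34, email: "alice@example.org" }));
```

Collecting all errors before reporting, rather than stopping at the first one, lets a caller fix a record in a single pass.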

3. Reasoning and Inference

Derive new knowledge from existing knowledge:

// Ontology with a class hierarchy
const ontology = `
  @prefix : <http://example.org/org#> .
  @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

  :Employee rdfs:subClassOf :Person .
  :Manager rdfs:subClassOf :Employee .

  # rdfs:subClassOf is transitive, so a reasoner infers :Manager rdfs:subClassOf :Person
`;

await trustgraph.loadOntology(ontology);

// Query with reasoning
const people = await trustgraph.query({
  query: "PREFIX : <http://example.org/org#> SELECT * WHERE { ?x a :Person }",
  reasoning: true
});

// Returns Managers too, even though they're not explicitly typed as Person
// Inference engine deduced: Manager -> Employee -> Person
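
The subclass inference described here can be sketched without any reasoner by walking up the class hierarchy. The example below is a simplified illustration that assumes each class has at most one direct superclass; `isA` and the sample data are hypothetical names introduced for this sketch.

```javascript
// Hypothetical sketch of rdfs:subClassOf reasoning over in-memory data.
// Assumption: each class has at most one direct superclass.
const subClassOf = new Map([
  ["Employee", "Person"],
  ["Manager", "Employee"]
]);

const typedNodes = [
  { id: "alice", type: "Manager" },
  { id: "bob", type: "Employee" },
  { id: "techcorp", type: "Organization" }
];

// Walk up the hierarchy from `type` until `ancestor` is found or the chain ends.
function isA(type, ancestor) {
  const seen = new Set(); // guards against accidental cycles
  let current = type;
  while (current !== undefined && !seen.has(current)) {
    if (current === ancestor) return true;
    seen.add(current);
    current = subClassOf.get(current);
  }
  return false;
}

const personIds = typedNodes
  .filter((node) => isA(node.type, "Person"))
  .map((node) => node.id);

console.log(personIds);
```

Both alice and bob are returned although neither is stored with the type Person, which is the behavior the reasoning flag is meant to provide.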

4. Interoperability

Enable different systems to understand shared data:

// System A uses Schema.org
const personA = {
  "@context": "https://schema.org",
  "@type": "Person",
  "name": "Alice",
  "worksFor": { "@type": "Organization", "name": "TechCorp" }
};

// System B also uses Schema.org
const personB = {
  "@context": "https://schema.org",
  "@type": "Person",
  "name": "Bob",
  "worksFor": { "@type": "Organization", "name": "TechCorp" }
};

// Both systems understand the shared vocabulary
// Can merge data without ambiguity
await trustgraph.merge([personA, personB]);
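
As a rough picture of what merging on a shared vocabulary involves, the sketch below combines records from two sources and stores the common employer only once. It assumes, for illustration only, that two nodes with the same @type and name denote the same entity; production systems usually match on stable identifiers instead. `mergeRecords` is a hypothetical helper, not the TrustGraph merge implementation.

```javascript
// Hypothetical sketch: merge records that share a vocabulary.
// Assumption: the same @type and name means the same entity.
function mergeRecords(records) {
  const entities = new Map(); // "Type|name" -> entity
  const edges = [];
  const keyOf = (node) => `${node["@type"]}|${node.name}`;

  for (const record of records) {
    entities.set(keyOf(record), { "@type": record["@type"], name: record.name });
    if (record.worksFor) {
      const org = record.worksFor;
      entities.set(keyOf(org), { "@type": org["@type"], name: org.name });
      edges.push({ source: keyOf(record), type: "worksFor", target: keyOf(org) });
    }
  }
  return { entities: [...entities.values()], edges };
}

const merged = mergeRecords([
  { "@type": "Person", name: "Alice", worksFor: { "@type": "Organization", name: "TechCorp" } },
  { "@type": "Person", name: "Bob", worksFor: { "@type": "Organization", name: "TechCorp" } }
]);

console.log(merged.entities.length, merged.edges.length);
```

The merged result holds three entities and two worksFor edges, because both people point at a single TechCorp node.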

Building Semantic Structures in TrustGraph

Step 1: Define Schema

const schema = {
  nodeTypes: [
    {
      type: "Concept",
      properties: {
        name: { type: "string", required: true },
        definition: { type: "string" },
        domain: { type: "string", enum: ["science", "technology", "business"] }
      }
    }
  ],

  relationshipTypes: [
    {
      type: "related_to",
      sourceType: "Concept",
      targetType: "Concept",
      properties: {
        strength: { type: "float", min: 0, max: 1 },
        description: { type: "string" }
      }
    }
  ]
};

await trustgraph.applySchema(schema);

Step 2: Add Ontology (Optional)

// Load OWL ontology for richer semantics
await trustgraph.loadOntology({
  file: "ontologies/domain.owl",
  reasoningEngine: "owlrl"  // or "hermit", "pellet"
});

Step 3: Ingest Data with Semantics

await trustgraph.ingest({
  data: [
    {
      "@type": "Concept",
      "name": "Machine Learning",
      "definition": "Branch of AI focused on learning from data",
      "domain": "technology"
    },
    {
      "@type": "Concept",
      "name": "Neural Networks",
      "definition": "Computing systems inspired by biological neural networks",
      "domain": "technology"
    }
  ],

  relationships: [
    {
      source: "Machine Learning",
      type: "related_to",
      target: "Neural Networks",
      properties: {
        strength: 0.9,
        description: "Neural networks are a key technique in ML"
      }
    }
  ],

  // Validate against schema
  validate: true
});

Step 4: Query with Semantics

// Query with semantic understanding
const results = await trustgraph.query({
  query: "Find all Concepts related to Machine Learning",

  // Use ontology reasoning
  reasoning: true,

  // Traverse relationships semantically
  semanticTraversal: true,

  // Include inferred relationships
  includeInferred: true
});

Advanced Semantic Features

1. Semantic Search

Search by meaning, not just keywords:

const results = await trustgraph.semanticSearch({
  query: "AI techniques for understanding language",

  // Semantic understanding maps query to concepts:
  // "AI" -> "Artificial Intelligence"
  // "understanding language" -> "Natural Language Processing"
  // "techniques" -> methods, algorithms, approaches

  expandSynonyms: true,
  expandHierarchy: true,  // Include child concepts
  reasoning: true
});
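
The mapping from query phrases to concepts can be pictured as a two-stage expansion: synonyms first, then child concepts. The sketch below illustrates that idea with small hand-written tables. The tables, the function `expandQuery`, and the whole-phrase matching are assumptions made for this example; a real semantic search would typically rely on embeddings or a much larger lexicon.

```javascript
// Hypothetical sketch of query expansion; the tables below are illustrative data.
const synonyms = new Map([
  ["ai", "Artificial Intelligence"],
  ["understanding language", "Natural Language Processing"]
]);

const childConcepts = new Map([
  ["Artificial Intelligence", ["Machine Learning"]],
  ["Natural Language Processing", ["Machine Translation", "Question Answering"]]
]);

function expandQuery(query) {
  const text = query.toLowerCase();
  const concepts = new Set();
  for (const [phrase, concept] of synonyms) {
    // Word boundaries keep "ai" from matching inside words such as "maintain".
    if (new RegExp("\\b" + phrase + "\\b").test(text)) {
      concepts.add(concept);
      for (const child of childConcepts.get(concept) ?? []) {
        concepts.add(child);
      }
    }
  }
  return [...concepts];
}

console.log(expandQuery("AI techniques for understanding language"));
```

The expanded concept list can then drive a lookup over concept nodes instead of a keyword match over raw text.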

2. Semantic Similarity

Compute similarity based on semantics:

const similarity = await trustgraph.computeSemanticSimilarity({
  entity1: "machine_learning",
  entity2: "deep_learning",

  method: "ontology-based",  // Use ontology structure

  // Factors:
  // - Hierarchical distance (deep_learning is_a machine_learning)
  // - Shared properties
  // - Relationship overlap
});

console.log(similarity);  // e.g. 0.85; values near 1 indicate high similarity
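
One common way to turn hierarchical distance into a score is a Wu-Palmer style measure: twice the depth of the closest shared ancestor, divided by the sum of the depths of the two concepts. The sketch below applies it to a tiny hand-written hierarchy, counting the root as depth 1. The hierarchy and the function names are hypothetical, and nothing here implies that TrustGraph's "ontology-based" method uses this formula.

```javascript
// Hypothetical sketch of a hierarchy-based similarity (Wu-Palmer style).
const parentOf = new Map([
  ["artificial_intelligence", null],
  ["machine_learning", "artificial_intelligence"],
  ["deep_learning", "machine_learning"],
  ["knowledge_representation", "artificial_intelligence"]
]);

// Path from a concept up to the root, starting with the concept itself.
function pathToRoot(concept) {
  const path = [];
  for (let c = concept; c !== null && c !== undefined; c = parentOf.get(c)) {
    path.push(c);
  }
  return path;
}

function hierarchySimilarity(a, b) {
  const pathA = pathToRoot(a);
  const pathB = pathToRoot(b);
  // Closest shared ancestor: first entry on a's path that also lies on b's path.
  const shared = pathA.find((c) => pathB.includes(c));
  if (shared === undefined) return 0;
  const depth = (c) => pathToRoot(c).length; // the root has depth 1
  return (2 * depth(shared)) / (depth(a) + depth(b));
}

console.log(hierarchySimilarity("machine_learning", "deep_learning"));
console.log(hierarchySimilarity("deep_learning", "knowledge_representation"));
```

A parent and its child score higher than two concepts that only meet at the root, which matches the intuition behind the hierarchical distance factor listed above.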

3. Semantic Validation

Validate data against semantic rules:

const validation = await trustgraph.validateSemantics({
  data: {
    type: "Employee",
    properties: { name: "Alice" },
    relationships: []
  },

  rules: [
    "Employee must have worksAt relationship to Organization",
    "Employee is_a Person"
  ]
});

if (!validation.valid) {
  console.log(validation.errors);
  // ["Missing required relationship: worksAt"]
}
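
The first rule in the example above, a required relationship, can be expressed as data and checked mechanically. The following sketch shows one way to do that. The rule table, the `targetType` field on relationships, and the function `validateRelationships` are assumptions made for illustration; TrustGraph may represent and evaluate such rules differently.

```javascript
// Hypothetical sketch: required relationships declared per entity type.
const requiredRelationships = {
  Employee: [{ type: "worksAt", targetType: "Organization" }]
};

function validateRelationships(entity) {
  const errors = [];
  for (const rule of requiredRelationships[entity.type] ?? []) {
    const satisfied = entity.relationships.some(
      (rel) => rel.type === rule.type && rel.targetType === rule.targetType
    );
    if (!satisfied) errors.push(`Missing required relationship: ${rule.type}`);
  }
  return { valid: errors.length === 0, errors };
}

const incomplete = validateRelationships({
  type: "Employee",
  properties: { name: "Alice" },
  relationships: []
});
console.log(incomplete);

const complete = validateRelationships({
  type: "Employee",
  properties: { name: "Alice" },
  relationships: [{ type: "worksAt", targetType: "Organization", target: "TechCorp" }]
});
console.log(complete);
```

Keeping the rules in a table rather than in code means new required relationships can be added without touching the checking logic.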

Best Practices

Start Simple: Begin with a basic schema and add complexity only as needed
Reuse Standards: Prefer Schema.org, FOAF, and other established vocabularies over inventing new terms
Document Semantics: Write clear definitions for custom types and relationships
Validate Data: Enforce schema constraints during ingestion
Version Ontologies: Track changes to semantic structures over time
Use Reasoning: Leverage inference engines to derive knowledge
Test Thoroughly: Verify that reasoning produces the expected results
Balance Complexity: Weigh richer semantics against their performance cost