
Subgraphs and Embeddings

While the concepts behind the naive extraction process of TrustGraph are decades old, RAG is new by comparison. It is a rapidly evolving domain, with “Graph RAG” being the latest buzzword. In fact, some might consider TrustGraph a “Graph RAG” solution. Until recently, the term RAG referred to performing a semantic similarity search over vector embeddings, which are typically linked to text statements stored in a table. The search returns a list of the most similar indexes, which are then used to retrieve the corresponding statements from the table. These statements are then fed into an LM for a generative response.
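The retrieval flow described above can be sketched in a few lines of Python. This is a minimal illustration, not TrustGraph code: the statements, the hand-made three-dimensional vectors, and the helper names (`cosine`, `top_k_indexes`, `build_prompt`) are all assumptions made for the example. A real system would get its vectors from an embedding model and store them in a vector database.

```python
import math

# Hypothetical table of text statements; the list index acts as the row key.
statements = [
    "Cats are mammals.",
    "The Eiffel Tower is in Paris.",
    "Dogs are mammals.",
]

# Hand-made embeddings, one per statement (illustrative values only).
embeddings = [
    [0.9, 0.1, 0.0],
    [0.0, 0.1, 0.9],
    [0.8, 0.3, 0.0],
]


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_indexes(query_vec, k):
    """Return the indexes of the k embeddings most similar to the query."""
    ranked = sorted(
        range(len(embeddings)),
        key=lambda i: cosine(query_vec, embeddings[i]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, query_vec, k=2):
    """Look up the retrieved statements and place them ahead of the question."""
    hits = top_k_indexes(query_vec, k)
    context = "\n".join(statements[i] for i in hits)
    return f"Context:\n{context}\n\nQuestion: {question}"


# The query vector is also hand-made; it stands in for an embedded question.
prompt = build_prompt("Which animals are mammals?", [1.0, 0.0, 0.0])
```

The resulting `prompt` string is what would be passed to the LM. Only the two statements closest to the query vector end up in the context.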

In addition to a knowledge graph, TrustGraph also creates vector embeddings during the naive extraction process. There are many approaches to building these parallel knowledge stores. For example, embeddings can be created for a triple or for a list of subjects from the graph. With the modular architecture of TrustGraph, these approaches are easily adjusted by changing the queue subscription of a service. In general, the search capabilities of a vector database, in this case Qdrant, serve to extract a subgraph from the knowledge graph. A subgraph is exactly what it sounds like: a subset of the full set of graph edges stored in the knowledge graph. The subgraph is the knowledge that will be reconstructed for input into the LM for a generative response.
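To make the idea of subgraph extraction concrete, here is a minimal sketch under stated assumptions. It does not use Qdrant or any TrustGraph service. An in-memory dictionary stands in for the vector store, and the triples, the vectors, and the function names (`nearest_entities`, `extract_subgraph`) are hypothetical. It shows one of the approaches mentioned above, where embeddings are created for subjects from the graph. The vector search picks seed entities, and the subgraph is every edge that touches a seed.

```python
import math

# Hypothetical knowledge graph edges as (subject, predicate, object) triples.
triples = [
    ("cat", "is-a", "mammal"),
    ("cat", "eats", "fish"),
    ("dog", "is-a", "mammal"),
    ("paris", "capital-of", "france"),
]

# Hand-made embeddings for graph subjects; a stand-in for a vector store.
entity_vectors = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.7, 0.5, 0.0],
    "paris": [0.0, 0.1, 0.9],
}


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_entities(query_vec, k):
    """Return the k entity names whose vectors are most similar to the query."""
    ranked = sorted(
        entity_vectors,
        key=lambda name: cosine(query_vec, entity_vectors[name]),
        reverse=True,
    )
    return ranked[:k]


def extract_subgraph(query_vec, k=1):
    """Keep only the edges whose subject or object is a seed entity."""
    seeds = set(nearest_entities(query_vec, k))
    return [t for t in triples if t[0] in seeds or t[2] in seeds]


# With one seed, only the edges around the closest entity are returned.
subgraph = extract_subgraph([1.0, 0.0, 0.0], k=1)
```

Raising `k` widens the set of seed entities and so enlarges the subgraph. The edge selection rule used here (any edge touching a seed) is the simplest possible choice and is only one of many ways a subgraph could be selected.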

tip

Subgraph retrieval is one of the biggest opportunities for performance improvements in TrustGraph. There are many sophisticated algorithms for querying knowledge graphs that have yet to be tested for the purposes of RAG.