Settings
TrustGraph's Graph RAG approach starts with a semantic similarity search on the request. The search matches the request against the stored vector embeddings and returns a list of the most relevant entities (nodes) in the knowledge graph. Subgraphs are then generated from that list of entities and serve as the input context for the Graph RAG responses. The subgraphs can be adjusted with the following parameters:
entity-limit
triple-limit
max-subgraph-size
Entity Limit
Specifies the number of entities to return from the semantic similarity search of the vector embeddings.
Triple Limit
For each entity, specifies the number of triples to return from the knowledge graph.
Max Subgraph Size
The size of the returned knowledge graph subgraph is at most the product of entity-limit and triple-limit. The max-subgraph-size parameter caps the size of the returned subgraph: if the subgraph exceeds max-subgraph-size, some triples are discarded.
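The interaction between the three parameters can be sketched as a small calculation. This is only an illustration of the arithmetic described above, not TrustGraph's implementation: the function name `max_returned_triples` is hypothetical, and it assumes max-subgraph-size is counted in triples.

```python
def max_returned_triples(entity_limit: int, triple_limit: int,
                         max_subgraph_size: int) -> int:
    """Upper bound on the number of triples in the returned subgraph.

    Hypothetical helper: assumes every entity can contribute up to
    triple_limit triples and that max_subgraph_size is a cap in triples.
    """
    uncapped = entity_limit * triple_limit
    # Anything above the cap would be discarded.
    return min(uncapped, max_subgraph_size)


# Product is below the cap, so the cap has no effect.
print(max_returned_triples(50, 30, 3000))   # 1500
# Product (5000) exceeds the cap, so the result is limited to 3000.
print(max_returned_triples(100, 50, 3000))  # 3000
```

A real query may return fewer triples than this bound, since an entity can have fewer than triple-limit triples in the graph.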
LLM Input Context
With dense knowledge graphs, the returned subgraphs can be quite large. With the default settings, the subgraph can easily exceed 10,000 tokens. While long-context LLMs can handle this many tokens, smaller models may struggle. The exact figure is difficult to predict, but the token count tends to be 3x-5x the subgraph size. For instance, if max-subgraph-size is 1000, the subgraph will likely come to 3,000-5,000 tokens.
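The rule of thumb above can be turned into a rough estimator. This is a sketch only: `estimate_tokens` is a hypothetical helper, and the 3x and 5x multipliers are the approximate range quoted in this section, not measured values.

```python
def estimate_tokens(subgraph_size: int, low: int = 3,
                    high: int = 5) -> tuple[int, int]:
    """Rough (low, high) token range for a subgraph of the given size.

    Hypothetical helper: the multipliers are the 3x-5x rule of thumb
    from the text; actual token counts depend on the graph contents
    and the tokenizer.
    """
    return subgraph_size * low, subgraph_size * high


low, high = estimate_tokens(1000)
print(f"{low:,}-{high:,} tokens")  # 3,000-5,000 tokens
```

Comparing the upper estimate against a model's context window gives a quick check on whether max-subgraph-size should be lowered for smaller models.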
Container Settings
These settings can be adjusted in the command list of the graph-rag container found in the configuration YAML file. The default settings are:
- --entity-limit
- '50'
- --triple-limit
- '30'
- --max-subgraph-size
- '3000'