Core Modules

The Core Modules form the basic building blocks of the knowledge network. These modules are required to deploy a full end-to-end knowledge pipeline.

  • chunker-recursive - Accepts text documents and uses the LangChain recursive chunking algorithm to produce smaller text chunks.
  • chunker-token - Chunks text documents into pieces of a chosen number of tokens.
  • embeddings-hf - A service which analyses text and returns a vector embedding using one of the HuggingFace embeddings models.
  • embeddings-ollama - A service which analyses text and returns a vector embedding using an Ollama embeddings model.
  • embeddings-vectorize - Uses an embeddings service to get a vector embedding which is added to the processor payload.
  • graph-rag - A query service which applies a Graph RAG algorithm to provide a response to a text prompt.
  • triples-write-cassandra - Takes knowledge graph edges and writes them to a Cassandra store.
  • triples-write-neo4j - Takes knowledge graph edges and writes them to a Neo4j store.
  • kg-extract-definitions - A knowledge extractor which examines text and produces graph edges describing discovered terms and their definitions. Definitions are derived from the input documents.
  • kg-extract-relationships - A knowledge extractor which examines text and produces graph edges describing the relationships between discovered terms.
  • loader - Takes a document and loads it into the processing pipeline. Used, for example, to add PDF documents.
  • pdf-decoder - Takes a PDF document and emits extracted text. Text extraction from a PDF is not a perfect science as PDF is a printable format. For instance, the wrapping of text between lines in a PDF document is not semantically encoded, so the decoder will see wrapped lines as space-separated.
  • ge-write-qdrant - Takes graph embedding mappings and writes them to a Qdrant vector embeddings store.
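The two chunkers above differ mainly in the unit they split on. As a rough illustration of the recursive approach, here is a minimal pure-Python sketch: try coarse separators first, and only recurse with finer separators on pieces that are still too large. The real chunker-recursive delegates to LangChain's splitter, which additionally merges small pieces and supports overlap, so this is not the production algorithm.

```python
def recursive_split(text, separators=("\n\n", "\n", " "), chunk_size=200):
    """Illustrative recursive splitter (not LangChain's implementation).

    Splits on the coarsest separator first; any piece still larger than
    chunk_size is split again with the next, finer separator.
    """
    if len(text) <= chunk_size or not separators:
        return [text]
    sep, rest = separators[0], separators[1:]
    chunks = []
    for piece in text.split(sep):
        if len(piece) <= chunk_size:
            chunks.append(piece)
        else:
            chunks.extend(recursive_split(piece, rest, chunk_size))
    return [c for c in chunks if c.strip()]
```

Splitting paragraph-first keeps semantically related text together, falling back to line and word boundaries only when a paragraph is too long.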
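embeddings-vectorize sits between a chunker and the embeddings service. A hedged sketch of what "adding a vector embedding to the processor payload" could look like; the field names and the embed callable are illustrative assumptions, not the actual TrustGraph message schema.

```python
def vectorize(payload, embed):
    """Hypothetical processor step: enrich a payload with its embedding.

    `payload` is assumed to carry the chunk text under "text", and
    `embed` stands in for a call to an embeddings service such as
    embeddings-hf or embeddings-ollama.
    """
    enriched = dict(payload)  # leave the incoming payload untouched
    enriched["vector"] = embed(payload["text"])
    return enriched
```

Keeping the step pure (returning a new payload rather than mutating the input) makes it easy to retry or replay messages in a pipeline.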
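A Graph RAG query typically retrieves a subgraph around the entities mentioned in the prompt and hands those edges to an LLM as context. A naive sketch of the retrieval half, assuming in-memory (subject, predicate, object) triples; a real implementation would also rank and filter the retrieved edges.

```python
def graph_rag_context(query_entities, edges, hops=1):
    """Collect edges within `hops` hops of the query entities.

    `edges` is an iterable of (subject, predicate, object) triples.
    Illustrative only: no ranking, scoring, or size limits.
    """
    seen = set(query_entities)
    context = []
    for _ in range(hops):
        reached = set()
        for s, p, o in edges:
            if s in seen or o in seen:
                if (s, p, o) not in context:
                    context.append((s, p, o))
                reached.update((s, o))
        seen |= reached
    return context
```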
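Writing a knowledge graph edge to Neo4j usually maps it to a parameterized Cypher MERGE so that repeated writes are idempotent. A sketch under an assumed node/relationship schema; the actual schema used by triples-write-neo4j may differ.

```python
def triple_to_cypher(s, p, o):
    """Map a (subject, predicate, object) edge to a Cypher statement.

    The :Node label, `uri` property, and :REL relationship type are
    illustrative assumptions. Parameters avoid injection and let the
    driver cache the query plan.
    """
    query = (
        "MERGE (a:Node {uri: $s}) "
        "MERGE (b:Node {uri: $o}) "
        "MERGE (a)-[:REL {uri: $p}]->(b)"
    )
    return query, {"s": s, "p": p, "o": o}
```

The returned pair would be passed to a Neo4j driver session, e.g. `session.run(query, params)`.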
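Both knowledge extractors emit graph edges. Assuming a simple (subject, predicate, object) triple shape, the output of definition extraction might look like the following; the predicate name is an illustrative assumption, not the module's actual vocabulary.

```python
def definition_edges(definitions):
    """Turn {term: definition} pairs into graph edges.

    The "has-definition" predicate is hypothetical; the extractor
    derives the pairs themselves from the input documents via an LLM.
    """
    return [
        (term, "has-definition", definition)
        for term, definition in definitions.items()
    ]
```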
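Because the pdf-decoder sees wrapped lines as space-separated rather than semantically marked, downstream code sometimes re-flows the extracted text before chunking. A simple heuristic sketch that treats blank lines as paragraph breaks and joins everything in between:

```python
import re

def unwrap(text):
    """Join hard-wrapped lines within paragraphs.

    Heuristic only: paragraphs are assumed to be separated by blank
    lines, which PDF extraction does not always preserve.
    """
    paragraphs = re.split(r"\n\s*\n", text)
    return "\n\n".join(" ".join(p.split()) for p in paragraphs)
```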