Storing Knowledge Cores

Caution

If you plan to store a knowledge core, you must start the storage scripts before beginning the Naive Extraction process.

Knowledge Core CLI

pip3 install trustgraph-parquet==0.11.20

Storing During Extraction

Before beginning a Naive Extraction process, open three terminal windows. Use one window to interact with TrustGraph. The other two run the storage commands while TrustGraph performs the Naive Extraction process.

In the main TrustGraph window:

mkdir dump

In the second terminal window:

triples-dump-parquet -d dump -p pulsar://localhost:6650 --no-metrics

In the third terminal window:

ge-dump-parquet -d dump -p pulsar://localhost:6650 --no-metrics

In the main TrustGraph window, you can now begin the Naive Extraction process with the document loader commands. While extraction runs, the storage scripts periodically dump the extracted graph edges and embeddings to the dump directory. When the Naive Extraction process has finished, combine the dumped graph edges and embeddings into the two Knowledge Core components:

concat-parquet -i dump/graph-embeds-* -o <knowledge-core-embeddings>.parquet
concat-parquet -i dump/triples-* -o <knowledge-core-graph-edges>.parquet

These scripts will save the Knowledge Core components in the current directory.
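
Before running the concat-parquet commands, it can help to confirm that both storage scripts actually wrote files. The following is a minimal sketch, not part of the TrustGraph tooling: the helper names count_dumps and dumps_ready are hypothetical, and the only assumption taken from the commands above is that dump files start with the prefixes graph-embeds- and triples-.

```shell
#!/bin/sh
# Hypothetical helpers (not part of TrustGraph).

# Count the files in a directory whose names start with a given prefix.
count_dumps() {
    dir=$1
    prefix=$2
    n=0
    for f in "$dir"/"$prefix"*; do
        # An unmatched glob stays literal, so confirm the file exists.
        [ -e "$f" ] && n=$((n + 1))
    done
    echo "$n"
}

# Succeed only if both kinds of dump file are present in the directory.
dumps_ready() {
    [ "$(count_dumps "$1" graph-embeds-)" -gt 0 ] &&
    [ "$(count_dumps "$1" triples-)" -gt 0 ]
}
```

For example, running dumps_ready dump before the two concat-parquet commands shows whether either storage script produced nothing, which usually means it was started too late or pointed at the wrong directory.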

Once the Knowledge Core has been saved successfully, you must stop the storage scripts manually. In the second and third terminal windows, press Ctrl+C on Linux. On macOS, press Ctrl+Z and then run kill %.
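
If you prefer not to manage separate terminal windows, one possible alternative is to run each storage script as a background process and stop it by process ID. This is a sketch under assumptions: start_bg and stop_bg are hypothetical helpers, and it assumes the storage scripts exit when they receive SIGTERM, which the steps above do not confirm.

```shell
#!/bin/sh
# Hypothetical helpers (not part of TrustGraph).

# Start a command in the background and record its PID in a file.
start_bg() {
    pidfile=$1
    shift
    "$@" &
    echo $! > "$pidfile"
}

# Stop the command recorded in the PID file and remove the file.
# Sends SIGTERM; assumes the process exits on that signal.
stop_bg() {
    pid=$(cat "$1")
    kill "$pid" 2>/dev/null || true
    # Reap the child; its non-zero exit status is expected here.
    wait "$pid" 2>/dev/null || true
    rm -f "$1"
}

# Example usage with the commands from this page:
#   start_bg triples.pid triples-dump-parquet -d dump -p pulsar://localhost:6650 --no-metrics
#   start_bg ge.pid ge-dump-parquet -d dump -p pulsar://localhost:6650 --no-metrics
#   ... run the extraction and concat-parquet steps ...
#   stop_bg triples.pid
#   stop_bg ge.pid
```

The helpers must be called from the same shell session, because wait can only reap processes that the current shell started.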