
Storing Knowledge Cores

caution

If you plan to store a Knowledge Core, the storage scripts must be started before the Naive Extraction process begins.

Before beginning a Naive Extraction process, open three terminal windows, all in the TrustGraph directory. One window is used to interact with TrustGraph. The other two run the storage scripts while TrustGraph runs the Naive Extraction process.

In the main TrustGraph window:

mkdir dump

In the second terminal window:

scripts/triples-dump-parquet -d dump -p pulsar://localhost:6650 --no-metrics

In the third terminal window:

scripts/ge-dump-parquet -d dump -p pulsar://localhost:6650 --no-metrics
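If you would rather not keep two extra terminal windows open, the same two storage scripts could be run as background jobs from a single shell. The following is only a minimal sketch, not part of TrustGraph: the helper names start_bg and stop_all are hypothetical, and it assumes the storage scripts exit cleanly when sent the default signal from kill, which is the same signal kill % sends.

```shell
# Hypothetical helpers (not part of TrustGraph) for running the storage
# scripts as background jobs instead of in separate terminal windows.

pids=""

# Start a command in the background and remember its process ID.
start_bg() {
    "$@" &
    pids="$pids $!"
}

# Send the default kill signal (SIGTERM) to every remembered process,
# then wait for each one to exit.
stop_all() {
    # $pids is left unquoted on purpose so it splits into one PID per word.
    for pid in $pids; do
        kill "$pid" 2>/dev/null || true
    done
    for pid in $pids; do
        wait "$pid" 2>/dev/null || true
    done
    pids=""
}

# Example usage from the TrustGraph directory (commented out so the
# helpers can be sourced without starting anything):
#
#   start_bg scripts/triples-dump-parquet -d dump -p pulsar://localhost:6650 --no-metrics
#   start_bg scripts/ge-dump-parquet -d dump -p pulsar://localhost:6650 --no-metrics
#   ... run the Naive Extraction process, then build the Knowledge Core ...
#   stop_all
```

With this approach the output of both storage scripts is interleaved in one terminal, so the three-window layout may still be easier to follow when debugging.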

In the main TrustGraph window, you can now begin the Naive Extraction process with scripts/loader. While extraction runs, the storage scripts periodically dump the extracted graph edges and embeddings into the dump directory. When the Naive Extraction process has finished, concatenate the dumped graph edges and embeddings into the two Knowledge Core components:

scripts/concat-parquet -i dump/graph-embeds-* -o <knowledge-core-embeddings>.parquet
scripts/concat-parquet -i dump/triples-* -o <knowledge-core-graph-edges>.parquet

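Replace <knowledge-core-embeddings> and <knowledge-core-graph-edges> with file names of your choice. As written, these commands save the two Knowledge Core components in the main TrustGraph directory.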
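Before running the concat commands above, it can be useful to confirm that both dump patterns actually match files, since an empty match would mean a storage script was not running during extraction. The helper below is a small sketch. The function name count_dumps is hypothetical, and it only counts files by the name prefixes used in the commands above without inspecting their contents.

```shell
# Hypothetical helper: count the files in a directory whose names start
# with a given prefix, e.g. "triples-" or "graph-embeds-".
count_dumps() {
    dir=$1
    prefix=$2
    n=0
    for f in "$dir"/"$prefix"*; do
        # When nothing matches, the glob stays literal, so test existence.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Example usage from the TrustGraph directory (commented out):
#
#   echo "graph edge dumps: $(count_dumps dump triples-)"
#   echo "embedding dumps:  $(count_dumps dump graph-embeds-)"
```

If either count is zero, check that the corresponding storage script was started before extraction began.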

Once the Knowledge Core has been saved successfully, the storage scripts must be stopped manually. In the second and third terminal windows, press Ctrl+C on Linux. On macOS, press Ctrl+Z to suspend the script and then run kill % to terminate the suspended job.