Storing Knowledge Cores
If you plan to store a Knowledge Core, the storage scripts must be started BEFORE the Naive Extraction process begins.
Before beginning a Naive Extraction process, open 3 terminal windows, all in the TrustGraph directory. One window will be used to interact with TrustGraph. The other two terminal windows will run the storage scripts while TrustGraph runs the Naive Extraction process.
In the main TrustGraph window:
mkdir dump
In the second terminal window:
scripts/triples-dump-parquet -d dump -p pulsar://localhost:6650 --no-metrics
In the third terminal window:
scripts/ge-dump-parquet -d dump -p pulsar://localhost:6650 --no-metrics
In the main TrustGraph window, you can now begin the Naive Extraction process with scripts/loader. Periodically, the storage scripts will dump the extracted graph edges and embeddings. When the Naive Extraction process has finished, build the dumped graph edges and embeddings into the 2 Knowledge Core components:
scripts/concat-parquet -i dump/graph-embeds-* -o <knowledge-core-embeddings>.parquet
scripts/concat-parquet -i dump/triples-* -o <knowledge-core-graph-edges>.parquet
These scripts will save the Knowledge Core components in the main TrustGraph directory.
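Before running the two concat-parquet commands above, it can help to confirm that both storage scripts actually wrote files into dump. The sketch below is one way to do that. count_matches is a hypothetical helper, not part of TrustGraph, and it assumes only the file name patterns already used in the commands above.

```shell
# Sketch only: count_matches is a hypothetical helper, not a TrustGraph script.
# It prints how many of its arguments are existing files. When a glob matches
# nothing, the shell passes the pattern through literally, the -e test fails,
# and the count stays at 0.
count_matches() {
    n=0
    for f in "$@"; do
        [ -e "$f" ] && n=$((n + 1))
    done
    echo "$n"
}

# Usage, from the TrustGraph directory:
#   count_matches dump/triples-*
#   count_matches dump/graph-embeds-*
```

A count of 0 for either pattern suggests that the corresponding storage script was not running during extraction or has not yet dumped anything.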
Once the Knowledge Core has been successfully saved, the storage scripts must be stopped manually. In the second and third terminal windows, on Linux press ctrl+c. On macOS, press ctrl+z and then run kill %.
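If keeping three terminal windows open is inconvenient, the storage scripts could instead be run as background processes from a single shell. The following is a minimal sketch under stated assumptions, not a TrustGraph feature: start_bg, stop_bg, and PID_DIR are hypothetical names introduced here. stop_bg sends the default kill signal (SIGTERM), which is the same signal kill % sends in the macOS instructions above. Whether the storage scripts flush pending data on that signal is not verified by this sketch.

```shell
# Sketch only: start_bg, stop_bg and PID_DIR are hypothetical, not part of TrustGraph.
# PID files are kept outside dump/ so they cannot be picked up by the concat globs.
PID_DIR="${PID_DIR:-.storage-pids}"

start_bg() {
    # $1: short name for the PID file; remaining arguments: the command to run
    name="$1"
    shift
    mkdir -p "$PID_DIR"
    "$@" &
    echo "$!" > "$PID_DIR/$name.pid"
}

stop_bg() {
    # $1: the short name previously passed to start_bg
    pid=$(cat "$PID_DIR/$1.pid")
    kill "$pid"                      # default signal is SIGTERM
    wait "$pid" 2>/dev/null || true  # reap the process if it is our child
    rm -f "$PID_DIR/$1.pid"
}

# Usage, from the TrustGraph directory (commands copied from the steps above):
#   start_bg triples scripts/triples-dump-parquet -d dump -p pulsar://localhost:6650 --no-metrics
#   start_bg ge scripts/ge-dump-parquet -d dump -p pulsar://localhost:6650 --no-metrics
#   ... run the Naive Extraction process and build the Knowledge Core ...
#   stop_bg triples
#   stop_bg ge
```

The three-terminal approach described above remains the documented procedure; this variant only trades visible script output for fewer open windows.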