Storing Knowledge Cores
If you plan to store a knowledge core, start the storage scripts before beginning the Naive Extraction process.
Knowledge Core CLI
pip3 install trustgraph-parquet==0.11.20
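After installing, it can help to confirm that the commands used later in this guide are actually on your PATH. The sketch below assumes the package provides the three commands named in this guide; the function name check_tools is hypothetical and not part of TrustGraph.

```shell
# Hypothetical helper: report which CLI commands from this guide are on PATH.
# Assumes the install step provides these three commands.
check_tools() {
    missing=0
    for tool in triples-dump-parquet ge-dump-parquet concat-parquet; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    # Non-zero status if any command was not found.
    return "$missing"
}
```

Calling check_tools prints one line per command and returns a non-zero status if any of them is missing.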
Storing During Extraction
Before beginning a Naive Extraction process, open 3 terminal windows. One window will be used to interact with TrustGraph. The other two terminal windows will run the storage commands while TrustGraph runs the Naive Extraction process.
In the main TrustGraph window:
mkdir dump
In the second terminal window:
triples-dump-parquet -d dump -p pulsar://localhost:6650 --no-metrics
In the third terminal window:
ge-dump-parquet -d dump -p pulsar://localhost:6650 --no-metrics
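If you would rather not keep two extra terminal windows open, the two storage commands can also be run as background jobs from one shell. The following is a minimal sketch, not part of TrustGraph: the function names start_dumpers and stop_dumpers and the variables DUMP_DIR and PULSAR_URL are hypothetical, and it assumes the storage commands shut down acceptably when sent the default signal from kill.

```shell
# Hypothetical wrapper around the two storage commands from this guide.
DUMP_DIR="${DUMP_DIR:-dump}"
PULSAR_URL="${PULSAR_URL:-pulsar://localhost:6650}"

# Start both storage commands in the background and remember their PIDs.
start_dumpers() {
    mkdir -p "$DUMP_DIR"
    triples-dump-parquet -d "$DUMP_DIR" -p "$PULSAR_URL" --no-metrics &
    TRIPLES_PID=$!
    ge-dump-parquet -d "$DUMP_DIR" -p "$PULSAR_URL" --no-metrics &
    GE_PID=$!
}

# Stop both storage commands and wait for them to exit.
# Assumption: the default kill signal (SIGTERM) is an acceptable way to stop them.
stop_dumpers() {
    kill "$TRIPLES_PID" "$GE_PID" 2>/dev/null
    wait "$TRIPLES_PID" "$GE_PID" 2>/dev/null || true
}
```

With these defined, you would call start_dumpers before the extraction and stop_dumpers once the Knowledge Core components have been saved.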
In the main TrustGraph window, you can now begin the Naive Extraction process with the document loader commands. Periodically, the storage scripts will dump the extracted graph edges and embeddings into the dump directory. When the Naive Extraction process has finished, build the dumped graph edges and embeddings into the 2 Knowledge Core components:
concat-parquet -i dump/graph-embeds-* -o <knowledge-core-embeddings>.parquet
concat-parquet -i dump/triples-* -o <knowledge-core-graph-edges>.parquet
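If the extraction produced no output of one kind, the shell passes the unexpanded pattern to concat-parquet. A small guard can catch that case first. This is a sketch under assumptions: the function name concat_dump and the variable DUMP_DIR are hypothetical, and it calls concat-parquet with the same -i and -o options shown above.

```shell
# Hypothetical guard around the concat-parquet commands from this guide.
DUMP_DIR="${DUMP_DIR:-dump}"

# Usage: concat_dump PREFIX OUTPUT_FILE
concat_dump() {
    prefix=$1
    output=$2
    # Expand the glob into the positional parameters.
    set -- "$DUMP_DIR"/"$prefix"-*
    # If nothing matched, the pattern stays unexpanded and names no real file.
    if [ ! -e "$1" ]; then
        echo "no files matching $DUMP_DIR/$prefix-* found" >&2
        return 1
    fi
    concat-parquet -i "$@" -o "$output"
}
```

For example, concat_dump triples my-graph-edges.parquet and concat_dump graph-embeds my-embeddings.parquet would mirror the two commands above, where both output file names are placeholders of your choosing.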
These scripts will save the Knowledge Core components in the current directory.
Once the Knowledge Core has been successfully saved, the storage scripts must be stopped manually in the second and third terminal windows. On Linux, press ctrl+c. On macOS, press ctrl+z to suspend the script and then run kill % to terminate it.