From NLP to LMs
Despite the appearance that AI is “new”, the core concepts that form the building blocks of Language Models (LMs) go back decades. Natural Language Processing (NLP) has roots reaching all the way back to the 1940s. While these concepts are far from new, the terminology is used inconsistently. For TrustGraph, extraction is an information discovery process that ingests raw, unstructured text and converts it into a static, structured knowledge model.
Until recently, NLP models and techniques were considered the most effective approach to information discovery. Many NLP algorithms exist, with popular ones like TF-IDF and RAKE serving as the “go-to” choices. These algorithms will, for example, extract candidate terms and then score them with different methodologies to establish importance, as in the sketch below.
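As a concrete illustration of the classic approach, here is a minimal TF-IDF scoring sketch using scikit-learn. It is not TrustGraph code; the documents and parameters are placeholders.

```python
# Minimal TF-IDF scoring sketch (illustrative, not TrustGraph code).
from sklearn.feature_extraction.text import TfidfVectorizer

documents = [
    "Knowledge graphs model entities and their relationships.",
    "Language models can extract entities from unstructured text.",
    "TF-IDF scores a term by its frequency in one document "
    "weighted against its rarity across all documents.",
]

vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(documents)  # rows: documents, columns: terms

# Rank the terms of the first document by TF-IDF weight.
terms = vectorizer.get_feature_names_out()
weights = tfidf[0].toarray().ravel()
ranked = sorted(zip(terms, weights), key=lambda pair: pair[1], reverse=True)
for term, weight in ranked[:5]:
    print(f"{term}: {weight:.3f}")
```

Note that the output is an explicit numeric score per term: this is the “objective data” discussed next.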
NLP techniques produce objective data: the scoring algorithms can be adjusted, and further calculations can be made with the scores. Yet subjective, empirical results show that LMs have far surpassed NLP techniques in information discovery efficacy. TrustGraph therefore uses LMs for the information discovery process in place of NLP models. The techniques are very similar, simply interchanging LMs for NLP models and algorithms. Of course, this approach gives up the objective scores, yet the empirical evidence strongly supports the shift to LMs for these tasks.
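For comparison, a minimal sketch of LM-driven extraction follows, assuming an OpenAI-compatible client. The prompt, model name, and helper function are illustrative assumptions, not TrustGraph’s actual extraction pipeline.

```python
# Sketch of LM-based entity extraction. The client, model name, and prompt
# are assumptions for illustration; TrustGraph's actual prompts differ.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

PROMPT = (
    "Extract the named entities from the text below. "
    "Respond with only a JSON array of strings.\n\nText: {text}"
)

def extract_entities(text: str) -> list[str]:
    """Let the LM do the work a scoring algorithm did in classic NLP."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": PROMPT.format(text=text)}],
    )
    # Assumes the model returns bare JSON; production code would validate.
    return json.loads(response.choices[0].message.content)

print(extract_entities(
    "TrustGraph ingests unstructured text and builds a knowledge model."
))
```

Unlike TF-IDF, nothing here yields a reproducible numeric score, which is the trade-off described above: objective data exchanged for empirically stronger extraction.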