LLM versus AGI

 

LLMs do everything we need.

You put in a few words, you get hundreds of words, specially crafted for you.

 Not quite.

 An excellent FT article provides a good basis for comparing LLM and AGI approaches to understanding text.

You put in a few words, and the LLM starts writing from the prompt you have given it. If it can't find a link for one of your search terms (the words in the prompt), it may drop that term without comment.

There is nothing that understands the meaning of the words (how can there be, when there are so many pieces of text from so many different fields?). There is nothing that understands how strings of prepositional phrases should be unravelled (because you need meaning for that), or how words should be clumped into objects before being operated on by modifiers (e.g. "an imaginary flat surface" or "he went to jail for murder").

The basic idea, that a few words can be accurately turned into a few hundred or a few thousand words by keying off those words, is deeply flawed. A naïve reader will accept the result as the thoughts of someone who used those words in describing something, but there is no guarantee that person was describing what the reader wanted to know. That is acceptable for marketing fluff or school assignments, but not much use where reliability is a factor.

If you are confronted with a hundred-page (or thousand-page) piece of legislation or a specification, an LLM cannot contribute. The document has to be read on its own, or with specified supporting documents – trying to use other sources to shed light on what you are reading is doomed to failure.

 So is AGI any better?

There may be words in a specification that are being given new meanings because of new technology – these will be in the glossary, or you will have to do some work to unearth them.

The Orion AGI system has a 50,000-word vocabulary and 100,000 definitions. Many words have a single definition; thousands of words are both a noun and a verb (cost, push); a few words have 70 or 80 definitions ("set", "run"); and the same word might be an adverb or a preposition: he turned on the light, he turned on a dime, he turned off the light, he turned off the highway. There is huge complexity just below the surface. The problem is handled by bringing that complexity to the surface.
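To give a feel for the scale of that ambiguity, here is a minimal Python sketch of a lexicon in which one spelling carries several senses and parts of speech. It is illustrative only, with invented glosses, and says nothing about how Orion actually stores its vocabulary.

    # A toy polysemous lexicon: one spelling, many senses.
    # Illustrative only; it does not reflect Orion's internal design.
    from collections import defaultdict
    from dataclasses import dataclass

    @dataclass
    class Sense:
        part_of_speech: str   # "noun", "verb", "adverb", "preposition", ...
        gloss: str            # short human-readable definition

    lexicon: dict[str, list[Sense]] = defaultdict(list)

    # "cost" and "push" are both noun and verb.
    lexicon["cost"] += [Sense("noun", "the price paid"),
                        Sense("verb", "to require payment of")]

    # "set" and "run" each carry dozens of senses; only two are shown.
    lexicon["set"] += [Sense("verb", "to put in position"),
                       Sense("noun", "a collection of things")]

    # "on" can be a preposition ("turned on a dime")
    # or a particle in a phrasal verb ("turned on the light").
    lexicon["on"] += [Sense("preposition", "supported by, located at"),
                      Sense("particle", "forms a phrasal verb, e.g. 'turn on'")]

    def senses(word: str) -> list[Sense]:
        """Every reading a word could have, before context narrows it down."""
        return lexicon[word]

Context then has to pick one sense per word so the whole sentence hangs together; that selection is the complexity being brought to the surface.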

 The AGI system knows how to clump words, and how to unravel prepositional phrases:

                He put the money from the bank in Fresno on the table in his office.        

Four prepositions in a short sentence – eight or ten in a sentence of a complex document is not uncommon, and no amount of looking in other documents will help.

(The system still needs help to unravel everything, but once everything is unravelled, it can hold the whole thing live in its "head" – any change you make will spread everywhere, backwards and forwards.)
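One way to picture the unravelling of the sentence above (a hand-built illustration, not Orion's representation) is as an attachment table: each prepositional phrase has to be hung off the right head word, and the nearest word is often the wrong choice.

    # Hand-built attachments for:
    #   "He put the money from the bank in Fresno on the table in his office."
    # Nearest-neighbour guessing would happily attach "in Fresno" to "money",
    # which is why meaning is needed to resolve the structure.
    attachments = {
        "put":   {"subject": "he", "object": "money", "on": "table"},  # where it ends up
        "money": {"from": "bank"},    # which money: the money from the bank
        "bank":  {"in": "Fresno"},    # which bank: the one in Fresno
        "table": {"in": "office"},    # which table: the one in his office
    }

    def modifiers(head: str) -> dict:
        """The phrases that hang directly off a given head word."""
        return attachments.get(head, {})

With eight or ten prepositions, the number of plausible tables grows quickly, and nothing in neighbouring text will decide between them.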

 Even seemingly very simple statements have an involved backstory.

A complex document breaks many of the rules on which an LLM relies. Closeness (propinquity) is one such rule: a complex document can use punctuation to refer to a set of words far away:

Note:        For variation and revocation, see subsection 33(3) of the Acts Interpretation Act 1901.

tracing information has the meaning given by section 72.
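The same point in miniature: a defined term only makes sense once the far-away definition is pulled in, so the reader, human or machine, has to carry a table of definitions around. The sketch below is a toy with invented names, not any real drafting tool.

    # Toy resolver for defined terms whose meaning lives elsewhere in the document.
    definitions = {
        "tracing information": "section 72",   # the meaning is given far away
    }

    def resolve(term: str, sections: dict[str, str]) -> str:
        """Follow a defined term to the section that actually defines it."""
        where = definitions.get(term)
        if where is None:
            return f"'{term}' is not a defined term"
        return f"'{term}' is defined in {where}: {sections.get(where, '<not loaded>')}"

    # The definition text still has to be loaded from wherever it lives.
    sections = {"section 72": "<the text of section 72 goes here>"}
    print(resolve("tracing information", sections))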

 Orion has required a huge human effort over many years to emulate what our Unconscious Minds do in handling English. It should repay that investment manyfold, and remove a serious limitation that people have had.

Another limitation has been the inability to link the technical vocabularies of different specialists, with the result that the specialists talk past each other and no one understands the whole document.

