a not-entirely-a-shitpost about tagging on a furry porn site by my old internet acquaintance @atomicthumbs mentioned a piece of AI technology called Cyc, which predates the modern LLM craze by literally decades and was the life's work of legendary computer scientist Doug Lenat (R.I.P.)
this brought up some personal memories because for several months in the early 2000s, I was a sysadmin at Lenat's company Cycorp, where I managed servers and workstations for the programmers and ontologists who built and trained the Cyc family of software. I didn't stay long because my disability came back with a vengeance and I couldn't reliably make it to the office anymore, but it was a neat place to work and it resembled the sort of classic MIT-descended hacker environment that things like The Jargon File reference. I was not fond of the fact that Cyc research was significantly funded by Department of Defense contracts, but then so is nearly any useful modern technology so it is what it is
and @atomicthumbs' comment that this was a "branch not taken" led me to think about this further, because Cyc not only predates LLMs and the like, but it's also fundamentally different in the way it approaches the task of machine learning and artificial intelligence. and I replied that it's different in a way which is unattractive to capitalism, which I think is an interesting observation and which I'm expanding on further in this effortpost
modern "AI" of the ChatGPT type consists mostly of Large Language Models, which use neural networks to build (not surprisingly) models of human language based on very large input sets. this is a decent introduction but in short, ChatGPT looks at a lot of text and tries to understand what words show up in what order. it doesn't understand the relationships between letters and words in any way, other than the order in which they appear in its training documents, which are basically The Entire Fucking Internet. this is why tools like ChatGPT are sometimes referred to as stochastic parrots, in that they are like a bird that is imitating a sound it hears but does not understand the words themselves the way humans do
one of the aspects of ChatGPT that is important to understand is that the resulting neural network is not understandable by humans. it does not consist of a set of rules that one can look at, and there is no way to trace a decision path through it. it's a complete black box, with inputs and outputs, and whatever happens inside the box is a mystery. this is why "prompt engineering" is becoming a thing now, which is literally trying to coax your black box into generating useful output through your choice of questions and prompts
a tool like Cyc, on the other hand, is constructed to a large degree by hand, by ontologists who define concepts and the relationships between them. instead of creating a parrot, who says words without understanding them, a tool like Cyc is intended to actually understand what it's telling you, in the sense that it's constructing its statements based on the concepts behind words, and not just the words and the order they typically appear in
with a sufficiently large and complicated set of concepts and their relationships, the software could theoretically begin to add to its own sets of rules by reading documents on its own, and those new rules could be visible within the system to human auditors because they would be functionally identical to the ones created by hand. so far this has not happened, and Cyc has received a lot of criticism and is considered by many to be a dead end in terms of AI technology, although it has produced a number of useful expert systems over the years
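to make that a little more concrete, here's a tiny hand-rolled sketch in python. it is emphatically not CycL and not how Cyc works internally, just my own illustration of the general shape: relationships entered by hand, plus one rule that derives new facts, where the derived facts land in the same human-readable form as the hand-entered ones:

```python
# not CycL, not Cyc -- just a hand-rolled illustration of the shape of the
# thing: concepts and relationships entered by hand, plus one rule that
# derives new facts in the same human-readable form as the hand-entered ones
hand_entered = {
    ("isa", "Fido", "Dog"),
    ("genls", "Dog", "Mammal"),      # "genls" = generalizes: every Dog is a Mammal
    ("genls", "Mammal", "Animal"),
}

def derive(facts):
    """rule: if X isa A and A genls B, then X isa B (applied until nothing new appears)"""
    known = set(facts)
    while True:
        new = {
            ("isa", x, b)
            for (p1, x, a) in known if p1 == "isa"
            for (p2, a2, b) in known if p2 == "genls" and a2 == a
        }
        new -= known
        if not new:
            return known
        known |= new

for fact in sorted(derive(hand_entered) - hand_entered):
    print("derived:", fact)
# derived: ('isa', 'Fido', 'Animal')
# derived: ('isa', 'Fido', 'Mammal')
```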
but one of the reasons, I think, that tools like LLMs have taken off in a way that tools like Cyc never did is not because of the functionality of the technology, but instead because of the values of capitalism. to wit:
- tools like Cyc require the work of many, many experts to build their knowledge systems, since the design is based on creating ontologies of meaning in a way that the systems can interpret. these will, obviously, be people with advanced degrees, people who understand how the world works on a low level, and thus are likely to be very smart, and to demand both a commensurate salary and a work environment that is accepting of what is likely a very quirky human being. (I could tell stories about some of the very interesting folks at Cycorp but I don't want to violate their privacy even two decades later.) capitalism doesn't like paying people a lot (unless you're a C-level) and it wants everyone to be an easily replaceable cog in the profit machine, so it really doesn't like having to deal with these kinds of people
- LLMs, on the other hand, are not only relatively easy to code, since neural network technology is very mature and your average compsci graduate with a bachelor's could put together something with off-the-shelf tooling, but once you've got your code, you can just point it at your document corpus and it will do all the work for you. you're still spending a lot of money, but you're spending it on more hardware to grind through your corpus and run your ever-growing language model. the money goes to another tech company whose CEO you're probably friends with, and hardware doesn't demand decent working hours and a non-hellish workspace. this aligns with the values of capitalism: server hardware is the very definition of capital, and we want to have as much capital as we can with as little labor involved as possible
- a tool like Cyc can also be examined to see why a particular output occurred. if it starts producing biased or incorrect output, for example, it's possible to track the path through the software and see exactly why the decisions were made, examine the flaw in how something was modeled, and correct it. it is debuggable in a way similar to how regular software is (see the sketch after this list for what I mean)
- this is also much less attractive to modern capitalism, which is to a huge degree invested in the idea of avoiding responsibility. our corporations are structured to deflect blame away from management and the company itself, and a system where you can say "the software did it without human intervention" is a plus, not a minus. "the algorithm" can be blamed for whatever happens, so no human being can be blamed for an error, for bias, for plagiarism. this is the opposite of the "a computer must never make a management decision" way of thinking, which got in the way of quarterly profits
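and here's the debuggability sketch I promised a couple of points up. again, this is not Cyc's actual machinery, just a made-up python illustration of a rule system that keeps provenance, so a wrong conclusion can be walked back to the exact assertion that needs fixing:

```python
# not Cyc's real machinery, just a sketch of what "debuggable" means here:
# every derived fact records which rule and which assertions produced it,
# so a wrong or biased conclusion can be traced back to the flawed assertion
hand_entered = {
    ("isa", "Tweety", "Penguin"),
    ("genls", "Penguin", "Bird"),
    ("capableOf", "Bird", "Flying"),    # the flawed assertion; penguins disagree
}

provenance = {}   # derived fact -> (rule name, list of supporting facts)

# rule: if X isa A, A genls B, and B capableOf C, then X capableOf C
for (_, x, a) in (f for f in hand_entered if f[0] == "isa"):
    for (_, a2, b) in (f for f in hand_entered if f[0] == "genls"):
        for (_, b2, c) in (f for f in hand_entered if f[0] == "capableOf"):
            if a == a2 and b == b2:
                provenance[("capableOf", x, c)] = (
                    "inherit-capability-through-genls",
                    [("isa", x, a), ("genls", a, b), ("capableOf", b, c)],
                )

# when the output turns out to be wrong, the trail points at exactly what to fix
for fact, (rule, support) in provenance.items():
    print("derived:", fact, "via rule:", rule)
    for premise in support:
        print("   from:", premise)
```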
if all the money thrown at products like ChatGPT (not just the man-hours but the costs of the hardware, the electricity and cooling, the middlemen who bill you for its use, etc.) had instead been pointed at something like Cyc, would we have more functional generalized AI? would we have expert systems that work better?
maybe? maybe not? but we'll never find out because capitalism doesn't like experts and it doesn't like accuracy. it likes things to be cheap and fast and profitable, and the actual functionality of the system is, at best, a secondary consideration. given the choice between a technology that aligns with the ideals of capital, and a technology that works, it will choose the former every time
[edit] @nic-hartley talked in this repost about the underpaid overseas global-south labor that also drives all of this, which I completely forgot to mention (and which these companies would also like you to forget about), and how talented people like ontologists are more likely to unionize, but those are also in line with how capitalism works, so I feel those points also support the fundamental thesis