vogon

the evil "Website Boy"

member of @staff, lapsed linguist and drummer, electronics hobbyist

zip's bf

no supervisor but ludd means the threads any good


twitter (inactive): twitter.com/vogon
bluesky: if bluesky has a million haters I am one of them, if bluesky has one hater that's me, if bluesky has no haters then I am no more on the earth (more details: https://cohost.org/vogon/post/1845751-bonus-pure-speculati)
irl: seattle, WA

DecayWTF
@DecayWTF

We expose a surprising failure of generalization in auto-regressive large language models (LLMs). If a model is trained on a sentence of the form “A is B”, it will not automatically generalize to the reverse direction “B is A”. This is the Reversal Curse.

This is an interesting paper showing that, again, the same AI scaling problems that plagued us in the 60s affect modern systems too, but the bias and intent of the researchers show through so plainly: a "surprising failure of generalization" instead of a more or less expected result of what LLMs actually do (i.e., predict which language token could come next), plus vague appeals to "well maybe humans have the same problem!!!1"
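
(to make the "this is just what next-token prediction does" point concrete, here's a deliberately dumb sketch -- a lookup-table "model" trained on one made-up sentence, nothing remotely like a real transformer, but fed the same left-to-right next-token signal:)

```python
from collections import Counter, defaultdict

# Toy forward-only "language model": it just counts which token follows
# each exact prefix seen in training. Purely illustrative -- real LLMs
# generalize across prefixes -- but the training signal is the same:
# predict the next token given everything to the left of it.
class PrefixModel:
    def __init__(self):
        self.next_counts = defaultdict(Counter)

    def train(self, sentences):
        for s in sentences:
            tokens = s.split()
            for i in range(1, len(tokens)):
                self.next_counts[tuple(tokens[:i])][tokens[i]] += 1

    def predict(self, prompt):
        counts = self.next_counts.get(tuple(prompt.split()))
        return counts.most_common(1)[0][0] if counts else None

model = PrefixModel()
model.train(["mary is tom's mother"])     # "A is B"

print(model.predict("mary is"))           # "tom's" -- forward direction works
print(model.predict("tom's mother is"))   # None -- nothing was ever learned about "B is A"
```

nothing in the training objective ever asks the model to run the relation backwards, so the asymmetry the paper measures is baked in from the start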


vogon
@vogon

reminded of how -- iirc from artificial intelligence class in college -- part of what precipitated the first AI winter was a decline of interest in neural networks in favor of symbolic systems[1] because it was realized that perceptrons (early, rudimentary precursors to the same AI techniques in favor today) were mathematically incapable of learning the exclusive-or function -- the logical formalization of the concept "A or B but not both" (toy demo after the footnote)


  [1] which everyone eventually lost interest in because hand-building a machine that can reason requires too much work
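
(for the curious, a minimal sketch of the XOR result: a single-layer perceptron trained with the classic perceptron learning rule, which provably can't separate XOR no matter how long you run it)

```python
# Single-layer perceptron with the classic perceptron learning rule.
# XOR is not linearly separable, so no choice of weights/bias can classify
# all four points correctly; training just shuffles the errors around forever.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 1, 1, 0]                 # XOR: "A or B but not both"

w = [0.0, 0.0]
b = 0.0
lr = 0.1

def predict(x):
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

for epoch in range(1000):
    errors = 0
    for xi, target in zip(X, y):
        err = target - predict(xi)
        if err != 0:
            errors += 1
            w[0] += lr * err * xi[0]
            w[1] += lr * err * xi[1]
            b += lr * err
    if errors == 0:
        break                    # never happens for XOR

print([predict(xi) for xi in X])  # always disagrees with [0, 1, 1, 0] somewhere
```

(adding a single hidden layer fixes this, which is part of why neural nets came back once backpropagation made multi-layer training practical in the 80s)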


