jckarter

everyone already knows i'm a dog

the swift programming language is my fault to some degree. mostly here to see dogs, shitpost, fix old computers, and/or talk about math and weird computer programming things. for effortposts check the #longpost pinned tag. asks are open.


DecayWTF
@DecayWTF

We expose a surprising failure of generalization in auto-regressive large language models (LLMs). If a model is trained on a sentence of the form “A is B”, it will not automatically generalize to the reverse direction “B is A”. This is the Reversal Curse.

This is an interesting paper showing that, once again, the same AI scaling problems that plagued us in the 60s affect modern systems too, but the bias and intent of the researchers show through so plainly: a "surprising failure of generalization" instead of a more or less expected result of what LLMs actually do (i.e., predict which language token could come next), plus vague appeals to "well maybe humans have the same problem!!!1"
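To make the "this is just next-token prediction" point concrete, here is a toy sketch of my own (nothing like a real transformer, and not the paper's setup): a lookup-table "model" that only learns which token follows each exact prefix it saw in training. Train it on the forward sentence and a reverse query has no statistics behind it at all.

```python
from collections import Counter, defaultdict

def train(sentences):
    # count which token follows each exact prefix seen in training
    table = defaultdict(Counter)
    for s in sentences:
        tokens = s.split()
        for i in range(1, len(tokens)):
            table[tuple(tokens[:i])][tokens[i]] += 1
    return table

def predict(table, prompt):
    # greedy next-token prediction: most common continuation of this exact prefix
    prefix = tuple(prompt.split())
    if not table[prefix]:
        return None  # no statistics for this prefix in this direction at all
    return table[prefix].most_common(1)[0][0]

model = train(["Valentina Tereshkova was the first woman in space"])

# forward direction: the continuation was literally seen during training
print(predict(model, "Valentina Tereshkova was the first woman in"))  # -> space

# reverse direction: same fact, but no training prefix ever started with
# the description, so there is nothing to condition on
print(predict(model, "The first woman in space was"))                 # -> None
```

A real LLM generalizes far beyond exact prefixes, of course, but the training objective still only ever asks "what comes next", which is the direction the paper is poking at.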



in reply to @DecayWTF's post:

The Reversal Curse is robust across model sizes and model families and is not alleviated by data augmentation.

the funniest thing about this is that AI researchers ever thought making their models larger was going to help, when scaling up, if anything, relieves the very pressures that would conceivably cause the model to generalize in the first place

Oh, I mean, it's not that surprising to me personally, but more... I can't believe the disparity is THAT BAD. Like, even in the direction it can "recall" things, it's a coin flip? Holy shit

the percentages are actually inflated for GPT-4, because they asked each question 10 times and counted a question as successful if the LLM got the right answer even once out of those 10 tries
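rough back-of-the-envelope with made-up numbers rather than the paper's, and pretending the 10 attempts are independent coin flips (they aren't exactly): scoring a question as correct when any of the 10 tries lands turns a small per-attempt rate into a much more respectable-looking one

```python
# purely illustrative numbers, not the paper's; treats the 10 attempts as
# independent coin flips, which sampled LLM outputs aren't exactly
def any_of_n(p, n=10):
    # chance that at least one of n independent attempts succeeds
    return 1 - (1 - p) ** n

for p in (0.02, 0.05, 0.10, 0.30):
    print(f"per-attempt {p:.0%} -> scored as {any_of_n(p):.0%} under best-of-10")

# per-attempt 2% -> scored as 18% under best-of-10
# per-attempt 5% -> scored as 40% under best-of-10
# per-attempt 10% -> scored as 65% under best-of-10
# per-attempt 30% -> scored as 97% under best-of-10
```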