i don't know what else to call it.
i ran across an excerpt from a paper recently, with a title mentioning something like "sparks of AGI" in GPT-4. the excerpt was about how they gave GPT-4 a modified problem from the 2022 IMO (a prestigious, and difficult, math contest) and it provided a solution and proof. holy fuck right
i looked at the proof and
-
it was not a good proof. i don't want to get into it again but... it starts out seemingly okay, then it goes kind of off the rails. and even the okay-looking parts are dubious. the paper calls it a "correct proof" but i am highly skeptical.
maybe i will Get Into It in another post idk. harder to do math on cohost with no mathjax
-
i went to write my own solution for the problem and realized that the GPT proof had missed the whole point of the problem, the key insight about it. it really just handwaved and jumped ahead.
and keep in mind: this was the answer they cherry-picked for their paper, and it was still junk. just plausible junk.
-
to test my "plausible junk" hypothesis, i went and found the original problem and pasted it to GPT-4. if you are curious, the real problem is:
Let ℝ⁺ denote the set of positive real numbers. Find all functions f : ℝ⁺ → ℝ⁺ such that for each x ∈ ℝ⁺, there is exactly one y ∈ ℝ⁺ satisfying x f(y) + y f(x) ≤ 2.
i don't have a fucking clue where to start with this. god damn. it is a thousand times more difficult than the baby high school problem they watered it down to.
but GPT-4 gave me an answer, and wrote a proof for it! it was even formatted beautifully with latex!
just one problem: the answer was completely wrong, and the proof contains basic algebra mistakes! it even contains "we will now prove that this solution is unique ... therefore, it is unique" without even attempting to prove it.
and what really gets under my skin here is that, if they watered down this problem to make a version GPT-4 could solve, they must have tried the original first. surely, right? like if you're studying whether a computer can solve math problems, i can't imagine not even wondering how it would do with the real problem. that was the first thing that came to mind for me!
so why was that not in the fucking paper? what kind of "paper" only contains specific toy examples that you think worked well?
and the answer is a "paper" from microsoft, the same company that has poured ten billion dollars into openai so they can make stuff like gpt-4.
but they don't want to show the attempt to solve the original problem, because it demonstrates that what gpt-4 is actually designed to do is generate gibberish in the shape of human prose, and sometimes that prose coincidentally expresses a coherent idea. it's easier to steal the prestige of the IMO and staple that onto a high school calculus problem.
this whole field is a joke. it is hucksters and fraudsters and marketers. it is a fucking circus. it is people "studying" the thing while they have a strong financial interest in convincing the world at large that it's a thinking machine. it is advertising hosted on arxiv dot org. it is a scam and everyone involved in perpetuating it should be fucking ashamed.
the other paper to have crossed my path recently is this blog post about othello, and whether a language model trained only on sequences of moves has an internal representation of the board or not.
it opens with a metaphor about a crow, because again, these people are trying to manipulate you into thinking that the computer has a brain now. it does not. it is a pattern-detection engine.
the experiment was to do the following:
- train a model on legal games of othello, represented as sequences of board coordinates such as F5 D4 A3 etc (i made that up, it's probably not a legal game) [edit: my bad, they actually used completely arbitrary tokens so it couldn't infer the arrangement of the cells from their names]
- train 64 more models on the original model, one per cell of the board, to see if they can detect anything that looks like a notion of what is in the specific cell (roughly like the sketch after this list)
- compare this to a crow and conclude that the model has learned about the board state. give us funding
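(to be concrete about step two: here is a minimal sketch of a probing setup, not their actual code. i'm assuming the othello model hands you a hidden activation vector per move, i'm using plain linear probes even though i believe the real ones were fancier, and every name and number here is made up.)

```python
# a sketch of the "64 more models" step. assumptions: the trained othello
# model gives you a hidden vector per move, and you already replayed every
# game to get ground-truth boards. all names/sizes here are invented.
import torch
import torch.nn as nn

HIDDEN = 512  # made-up width of the othello model's hidden state

# one tiny classifier per board cell: hidden state -> empty/black/white
probes = [nn.Linear(HIDDEN, 3) for _ in range(64)]

def train_probes(activations, boards, epochs=10):
    # activations: (N, HIDDEN) float tensor pulled from the othello model
    # boards: (N, 64) long tensor of cell contents -- labels that YOU
    #         computed by replaying the games, i.e. you already knew them
    loss_fn = nn.CrossEntropyLoss()
    for cell, probe in enumerate(probes):
        opt = torch.optim.Adam(probe.parameters(), lr=1e-3)
        for _ in range(epochs):
            loss = loss_fn(probe(activations), boards[:, cell])
            opt.zero_grad()
            loss.backward()
            opt.step()
```

note where the labels come from: you replay the games yourself and hand each probe exactly the shape of thing it's supposed to find. hold that thought.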
i'm not qualified to judge the gritty details, especially based on a funny png they made of some nodes, but my takeaway so far is this:
othello is a really mathematical game. i mean you've got a square grid and pieces that toggle in parity. it seems ripe for patterns, especially if you throw a pattern-detection engine at it.
for example: every legal move has to be next to an existing piece. well, that's easy; we can find a bunch of legal moves with no clue what the board looks like, just by looking at the moves that have already been made! if D3 was a move, then the cells next to it are C2, C3, C4, D2, D4, E2, E3, and E4. but you can't play in the same cell twice, so eliminate any that are already in the move list. and there are rules about playing next to opposing colors, so if D3 was an even number of moves ago, then it's probably out. we've already culled huge numbers of illegal moves here, and there are surely other kinds of constraints to be inferred from the rules that would let you make legal moves based only on a transcript of the game. and you could do it without even knowing what the rules are, if you were some kind of pattern-detection engine being fed a lot of transcripts.
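here's the dumbest version of that culling as a sketch, using cell names like before (which, again, the real thing didn't even have). this is not what the model does internally; it's just to show how far the transcript alone gets you:

```python
# cull candidate moves using only the transcript -- no board state at all.
# a cell is a (column letter, row number) pair, like ("D", 3).

def neighbors(cell):
    """the up-to-8 cells adjacent to a cell, e.g. D3 -> C2..E4."""
    col, row = cell
    c = ord(col) - ord("A")
    return {
        (chr(ord("A") + c + dc), row + dr)
        for dc in (-1, 0, 1)
        for dr in (-1, 0, 1)
        if (dc, dr) != (0, 0) and 0 <= c + dc < 8 and 1 <= row + dr <= 8
    }

def candidate_moves(transcript):
    """cells adjacent to a played cell, minus cells already played.
    further filters (the color-parity thing, etc.) would stack on top."""
    played = set(transcript)
    adjacent = set().union(*(neighbors(cell) for cell in transcript))
    return adjacent - played

print(candidate_moves([("D", 3)]))  # the eight cells around D3
```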
remember, this wasn't about whether it could play the game well, only whether it could make a move at all.
and you know that's kind of neat. it would be even neater if it were compared to any kind of previous mathematical analysis of othello, which almost certainly exists, but which this blog post probably doesn't even attempt to mention because it would make othello sound like a math thing which can be analyzed rather than a human thing which we've taught a computer to do.
and even the second model that can find some notion of grid cells in the first model is kind of neat.
but nobody can just stop there at "look what i made the thing do". no we have to claim we've found proof of an inner world. so now i have to spell out what actually happened here:
- they trained a model on legal games of othello, formatted as grid coordinates
- they trained a second model (also a pattern-finding engine) to specifically look for a notion of grid positions in the first model, based on what they knew the board should look like
- it worked
doesn't this seem a little bit like telling the horse when to stop counting? if you already know what shape of thing you're looking for, and you ask a model to find it... that's... what they do. it doesn't mean the first model contains a direct representation of the board; it means the first model contains something from which you can derive a representation of the board. and, fucking, of course it does, because you can derive the board from the list of moves, which the first model contains! all you did was teach the second model how to ferret that out, based on the board state that you already knew!
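to be explicit about "you can derive the board from the list of moves": the board is a plain function of the transcript. here's a sketch of a replay, assuming black moves first from the standard start and ignoring passes (which a real replay would have to handle):

```python
# derive the board state from nothing but the move list.
# moves are (col, row), both 0-7; 1 = black, -1 = white, 0 = empty.

DIRS = [(dc, dr) for dc in (-1, 0, 1) for dr in (-1, 0, 1) if (dc, dr) != (0, 0)]

def replay(moves):
    board = [[0] * 8 for _ in range(8)]
    board[3][3] = board[4][4] = -1  # white starting discs (d4, e5)
    board[3][4] = board[4][3] = 1   # black starting discs (e4, d5)
    player = 1  # black moves first
    for c, r in moves:
        board[r][c] = player
        for dc, dr in DIRS:
            # walk over opponent discs; flip them if we end on our own disc
            flips, x, y = [], c + dc, r + dr
            while 0 <= x < 8 and 0 <= y < 8 and board[y][x] == -player:
                flips.append((x, y))
                x, y = x + dc, y + dr
            if flips and 0 <= x < 8 and 0 <= y < 8 and board[y][x] == player:
                for fx, fy in flips:
                    board[fy][fx] = player
        player = -player
    return board

board = replay([(2, 3)])  # black opens at c4, flipping d4
```

so anything the probes "found" was already derivable from the model's input by construction. the only question is how it got encoded.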
am i missing something here because i swear to god!!!
the other thing about this stuff is that it's being worked into school curriculums, just as the last cryptocurrency and "web3" bubble was driven by universities pushing cryptocurrency curriculums that didn't really give people that many skills outside of it.
