

caffeinatedOtter
@caffeinatedOtter

There's a reason I refer to "AI art" as statistical picture generators: not only do they have no understanding of art, or indeed pictures; they don't even leverage any human understanding of art, or indeed of pictures, in the way they do it.

You feed a bunch of art into one program, and it builds a statistical model relating metadata to pixel values. You feed new metadata ("queries") into another, and that program uses the model to interpolate a new, statistically correlated, pixel collection.
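To make those two stages concrete, here's a deliberately toy sketch in plain numpy, with made-up data and nothing resembling a real diffusion model; it only shows the shape of the pipeline described above: fit statistics relating metadata to pixels, then interpolate from those statistics when given a query.

```python
# Deliberately toy: the "model" is nothing but per-tag pixel statistics,
# and "generation" is interpolation between those statistics.
import numpy as np

def build_model(dataset):
    """Stage 1: relate metadata (tags) to pixel values."""
    by_tag = {}
    for tags, image in dataset:
        for tag in tags:
            by_tag.setdefault(tag, []).append(image)
    # The whole "model" here is just a mean image per tag.
    return {tag: np.mean(images, axis=0) for tag, images in by_tag.items()}

def generate(model, query_tags):
    """Stage 2: interpolate a new, statistically correlated pixel collection."""
    return np.mean([model[tag] for tag in query_tags], axis=0)

# Made-up training data: (tags, 8x8 greyscale "image") pairs.
rng = np.random.default_rng(0)
dataset = [
    (["cat"], rng.random((8, 8))),
    (["cat", "night"], rng.random((8, 8))),
    (["night"], rng.random((8, 8))),
]

model = build_model(dataset)
picture = generate(model, ["cat", "night"])
print(picture.shape)  # (8, 8): pixels correlated with the query, no understanding involved
```

Real systems replace the averaging with a learned denoiser and a text encoder, but the division of labour is the same: one program fits the statistics, another samples from them.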

The trick works, to the extent it even does (it's really visible that it doesn't work on a human level! Hands!), the way that statistical measures usually do: only at scale.

They didn't train the models on literally billions of images of ethically and legally dubious provenance because they thought that was a great idea; they did it because the trick works worse and worse the less data you train it on. The indefensible data slurping is an intrinsic practical necessity.

Read it again, slowly: "AI art" only works, and can only work, on the back of stolen images, because it works, and can only work, when fed images in bulk exceeding any human capacity to legally, never mind ethically or considerately, source.

They have to feed it stolen data. They have to. Or it doesn't even fucking work.

Friends, I understand the impulse to say "well, you know, the problem is the capitalists monetising it, and/or the way it's deployed; surely surely, if we wrest it from them and put 'AI art' in the hands of artists — like a Photoshop filter! — it will no more destroy Art than Photoshop filters did."

But I'm afraid you have to acknowledge that the "Photoshop filters didn't destroy art" thing is an analogy. Photoshop filters are not powered by mass data theft as an unavoidable necessity.

Blood diamonds do not become ethical if they're only worn by nice people. "AI art" picture generators cannot be made ethical by only nice artists using them. There are ethical problems in the supply chain which are fundamental to the picture generators working at all.



in reply to @caffeinatedOtter's post:

the big problem I see with this argument is that, in this framework, it's perfectly fine for Disney to train an AI image generator on their entire corpus of movies/comics/etc (since they own the rights to it) but it's not okay for, say, a good amount of dataerase's work to exist since it's clearly based on old pc-98 (or so) art sprites and the like, and I assume they didn't ask permission.

it's all "hand done" as far as I know. but why the distinction? you can visually recognize the start image in their work, which is more than you can say about any image generator and images in the training set. this seems like it's just as much "stolen data" as any art generative model.

There are two parts to the whole picture generator thing, and most people aren't distinguishing between them. First you run some software that generates a statistical model; then you run different software that uses the model to do statistical picture generation.
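For the second of those two parts, here's a minimal sketch, assuming the Hugging Face diffusers library and a pre-trained Stable Diffusion checkpoint (the model name and prompt are just illustrative); the point is that generation is the cheap half, because all the data-hungry statistics were already baked into the weights by the separate training step.

```python
# Sketch of the generation half only; the model weights loaded here were
# produced by the separate, data-hungry training step.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # pre-trained checkpoint (illustrative choice)
    torch_dtype=torch.float16,
).to("cuda")

# The "query": text metadata in, statistically plausible pixels out.
# No new learning happens at this point.
image = pipe("a watercolour painting of a lighthouse").images[0]
image.save("lighthouse.png")
```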

To build a model that meaningfully works requires literally billions of images. (That's, you know, the entire thing I wrote up there.) That's where the ethics of data scraping come in, including such lovely nuggets as Stable Diffusion having scraped pictures out of people's medical files. Which, idk, I'd call a pretty damn different ethical question to a glitch artist using pixel art from old video games?

It's also intrinsic to the thing. Whatever ethical judgments you make about the picture-generating end, the people and the monetisation and whatever, the ethical issues pertaining to the model generation cannot be elided. These things require that much data to function; nobody can simply say "oh well, we'll cull all the problematic data and re-train", because doing so would directly degrade the models' ability to produce images that humans accept as plausible. (Remember DeepDream and all the eyeball landscapes? That's the low-data, primitive ancestor of all this! The sheer volume of training data is what makes these things output more plausible images.)


it's been a while since I was reading papers in this space, but I'm pretty sure this is not actually true. the Common Crawl dataset, which LAION is a subset of, has been around since 2011 or so. i remember seeing a paper that could generate plausible images in a restricted subset (doors with house numbers on them) in like... 2014 or so. i would say the reason we're only seeing these high-quality images now is a combination of better hardware allowing larger networks and theoretical advances (neural turing machines were introduced in 2014, for example, though i don't think any of the image generators use those).

you can also get stable diffusion to learn new concepts from very few images using textual inversion; the original paper shows it works with five images. of course this relies on the model's existing background knowledge, but it shows that these networks are capable of 'few-shot' learning in some cases.
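for what it's worth, here's a minimal sketch of what that looks like with the diffusers library, assuming a pre-trained pipeline and a published concept embedding (the repo name and the <cat-toy> token are just an illustrative example): the few-shot part is only a tiny new text embedding, and everything else still leans on the big pre-trained model.

```python
# sketch only: load a textual-inversion embedding (trained from a handful
# of images) on top of a full pre-trained Stable Diffusion model.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# the embedding is a few kilobytes; the base model it leans on is gigabytes.
pipe.load_textual_inversion("sd-concepts-library/cat-toy")

# the learned token can now be used in prompts like any other word.
image = pipe("a <cat-toy> sitting on a bookshelf").images[0]
image.save("cat_toy.png")
```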

The Disney thing is interesting, but also straightforward: my issue is largely with the ethics involved in sourcing training data.

Disney are one of the few entities in the world who arguably have enough relevant data under their ownership to meaningfully train a model of their own; and Disney hypothetically training a model solely on pictures they own the rights to would, indeed, answer my specific ethical issues around data sourcing.