arborelia
@arborelia

I think I'm taking a risk writing this post, because unfortunately it's going to involve nuance. I want you to know up front, I'm probably on the same side of the issue as you. I think generative AI in the 2020s is destroying the public sphere and we need to do a lot of different things to stop it.

If you end up reading this post and thinking that I'm making a slippery-slope argument that says generative AI is inevitable and you should give up and let it take over our culture, please back up and read this. Holy shit no I am not saying anything like that. I do not think we should give up. I think we should target it, say what's bad about it, and stop it.

The thing that's been bouncing around my mind is that, if you take such a hard-line position that you want to "ban AI" without being able to say what you mean by that, you are not going to be effective enough at opposing it. I understand taking a nuance-free position as a tactic, especially if your position is that "well I don't understand this and I shouldn't have to understand this, I just know it's bad", but I don't think that works as the only tactic.

Here's the outline of what I mean:

  • The definition of AI is constantly changing
  • Many AI techniques in the past have been normalized to the point where it sounds really silly to call them "AI"
  • If today's generative AI follows that trend and gets normalized, that would be a problem, even if saying so sounds like recency bias
  • Something changed in the 2020s that made generative AI more dangerous and insidious, and if we pin down what changed, we will be able to oppose it better. There were warning signs before, but the call to action is now.

Turfster
@Turfster

We blasted far enough past the point where conmen and flimflam artists experienced any negative side effects¹ of their scamming and grifting, and enough flotsam of humanity decided they were something to aspire to instead of something to despise and stamp out.


¹ like being run out of town, Actual Jail Time, or being beaten to death by an angry mob of duped victims



in reply to @arborelia's post:

broadly agree, and the point about how many things we all broadly like used to be called "AI" is important. but two things:

  • if your main area of differentiation between today's "AI" and yesterday's "AI" is that right now data is being taken without express permission, that is not a very rock-hard defense. adobe et al (and some music companies, I very strongly suspect) are training models with only data they already own the copyright to. should we accept use of these models freely because nobody's work was "stolen"? well, no. the labor issues with this potential model are exactly the same (they're gonna try to replace people with these) and they're still going to output slop. I don't think this is a useful criterion for what we should accept and what we should reject.

  • you do not want to be on the same side as massive copyright holders. you absolutely do not want copyright to be strengthened to the point where the law alone would actually stop generative AI from being feasible to train. i do not think it is just an "uncomfortable tool in the fight against AI;" it is handing enormous slop-generating companies a cudgel and giving them the opportunity to look like saints for brandishing it. i have spent too much time researching modern music copyright cases to feel any other way about this

These are good points, particularly the one about Adobe.

It was just a couple of months ago when Adobe had the PR crisis over their terms of use, and they tried to reassure people, "we're not feeding your work into generative AI. Except for when we feed Adobe Stock into Adobe Firefly™, our amazing generative AI that is good".

They wanted to draw a line between harmful and harmless generative AI, and they drew it in a really precise place that was advantageous for them.

So how can we draw the line knowing that Adobe is on the wrong side of it, the side that is normalizing turning human art into slop?

I still think it's relevant that the people who contributed to Adobe Stock didn't actually agree to this. I mean, they agreed to some clickwrap terms, not knowing that this would happen, which is not the same as actually consenting to it. In some cases, users uploaded work to Adobe Stock that wasn't even theirs, sometimes work generated by a different generative AI.

This would be a hard legal argument to make against Adobe, but it's still a moral argument.

honestly, i don't think we should draw a line to try to fence out adobe - i think it is cleaner to say "look, these models suck no matter what because they're making material conditions worse for both consumers and workers who are getting their jobs replaced, regardless of how the data is gathered"

also not a legal argument but i think it's a convincing moral one for a lot of people, and definitely galvanizes resistance from labor. this also leaves room for the "weird ai" stuff you mention to exist, and i agree with you that it should exist

nods - I appreciate the original post a lot, but yeah, these were two things that stuck out to me and I'm very glad that you addressed them.

I also get the feeling that some (perhaps not all, but at least some) people who are being mercilessly mocked for "making up sci-fi AI gods" are actually trying to argue something along the lines of the first point - that focusing on copyright is like trying to shoot a moving target, because the models are getting more powerful to the point that they won't need to scrape and steal work to operate anymore. That the models are evolving alarmingly quickly, and that the more powerful they get, the more potential they have for doing harm in ways that we haven't even thought of. I feel like a Terminator-esque robot overlord takeover is probably not the most likely of these possibilities - but it's worth considering if that's what the people in question are actually talking about, or if they're talking about something else, something that is relevant.

TBH, even in a hypothetical world where everyone could be assured of a decent quality of life regardless of their job or lack thereof, I think it would be good to curtail how big and powerful AI can get. Again, not because I think it'll become sapient and decide to nuke everyone, but because the bigger anything is, the more catastrophically it can fuck things up. And the more complex anything is, the more likely it'll fuck things up.

fun tidbit re: Deep Blue/Kasparov, i remember reading that actually the reason why it was able to beat him was that it made an error. it made a chess move that didn't make any sense, and Kasparov overthought it, and made his own error, and the machine took the game. Literally won by mind games.

At this point, i think chess is effectively solved (for human purposes, anyway), at roughly the cost of Ireland's power consumption, so it doesn't really matter for your point. doubly so because focusing on "the AI wasn't actually as smart as they say" detracts from the issue. but i thought it was funny

Destroy Google. Okay, I don't have a specific plan for how to do this, but it would sure help.

Tariffs and Protectionism. The PRC and the ROK already do it.
Google (and Microsoft) are foreign entities the Central Government is already openly hostile towards, and it needs to grow the balls to build the public opinion that would allow it to break up their IT sector.
The State already fucks with them and puts down other major IT companies, so we know it can do it.

I think a good point of leverage for breaking it is to:

  • point out the ads antitrust suit they lost,
  • point out how it's mingling with web search (the trial showed that the ads division pressured Search into giving worse results in order to show more ads),
  • and then point out that they're doing things, under the guise of AI, that seem like they'd qualify for antitrust action now with Search: an exclusive deal with Reddit regarding scraping and search-indexing pages, preventing alternatives from having equal footing.

And that's even something legislators and judges can understand when it's said that way.

The thing that's been bouncing around my mind is that, if you take such a hard-line position that you want to "ban AI" without being able to say what you mean by that, you are not going to be effective enough at opposing it.

I'll go even stronger than this: if you take a hard-line position and want to "ban AI" without defining why it's bad, the opposition will choose things it can knock out as low hanging fruit and pretend they were your reasons.

"look, we got rid of the copyright infringement (now that our model is complete and we have enough money to legitimately pay. lowballing of course )"

"look, it no longer makes porn of your crush (because we banned all '18+' content"

etc

i feel very strongly that IP law should not be considered an "uncomfortable ally", it should be considered a poisoned well. strengthening the power of copyright would not help independent artists or prevent AI slop from being generated, it would just concentrate all of that power in big media corporations who can train their own models and/or copyright their own art style.

imagine a world where nintendo can sue every pokémon fanartist ever for "style infringement" on ken sugimori's art. the proposals to try and strengthen copyright by making "art style" a component of IP would do that and it would be devastating.

ai slop isn't great, but strengthening copyright to oppose it would be so, so much worse.

More than that, I think it's extremely shortsighted of any independent artist to assume such a system of "style copyright" would deem them worthy of the copyright to their "own style" or subject matter. Human artistic expression is unique, sure, but it follows patterns and themes. Statistically, someone made a similar piece before you at some point. The only alternative is to claim your artwork is so unique that nobody in history has made anything like it... which is a bit much, I think.

And all of this would be determined by copyright lawyers and such, not people with an appreciation for the arts, so subtleties would be lost.

Heavy sigh, yeah re Google. You can use alternatives for some things, some are easier than others, and some just trade evil globocorp #1 for evil globocorp #2. My favorite alternative is @fastmail, a seriously fantastic paid email service with better filtering and rules than gmail by a thousand goddamn miles. And it is FAST. So much good to say about fastmail.

The common joke about "AI" when I was a CS undergrad years ago was "if it works reliably it's called an algorithm; if it doesn't, it's called artificial intelligence". The sentiment remains true, but mainstream usage has poisoned the word "algorithm" and more or less inverted its meaning from "a precise step-by-step procedure" into "an opaque and unknowable machine-learning model", so it's not as rhetorically sharp as it once was.

For my own writing, I prefer to use terms like "LLMs" and "GANs" over "generative AI", because it nails the discussion down to real-world techniques with tangible faults and failure modes (i.e. being completely fucking useless for most purposes) rather than the hypothetical future technologies that enthusiasts like to pitch. It's like public transit via flying car versus trains: the trains have imperfections but exist, and the cars appear perfect (to gullible marks) because they don't exist.

Personally, my own distrust and dislike of AI extends further than the latest wave of generative models. I think the recent developments ought to recolor our opinions on earlier AI tech, even if it seems mundane by now. If I had to try to pin down where to draw the line, it would roughly be at "AI is bad if it's a black-box model that has been trained on a data set" (yes, I'm aware that this casts a VERY wide net).

There may be no putting that genie back in the bottle, but maybe we can get some guardrails. One I'd suggest--that sadly has no chance in hell of coming true even though it feels like common sense to me--is "AI companies are required to publish their training data". A more modest version that miiiight gain acceptance is "AI companies are required to publish their holdout data" (I don't know how to apply that rule to generative models though, just classifiers). Like, can we get the barest amount of accountability -- can a third party audit whether or not your model even does what you claim it does?
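
To make the holdout-audit idea concrete, here's a minimal sketch of what a third-party check could look like, assuming the vendor published a labeled holdout set and the auditor has black-box query access to the model. Everything here (the file layout, query_model, the claimed accuracy) is hypothetical, not any real company's API:

```python
# Hypothetical sketch of a third-party holdout audit for a classifier.
# Assumes the vendor published a labeled holdout set and the auditor has
# black-box query access to the model. All names here are illustrative.
import json

def audit_claimed_accuracy(holdout_path, query_model, claimed_accuracy,
                           tolerance=0.02):
    """Re-measure accuracy on the published holdout and compare to the claim."""
    with open(holdout_path) as f:
        holdout = json.load(f)  # e.g. [{"input": ..., "label": ...}, ...]

    correct = sum(1 for ex in holdout if query_model(ex["input"]) == ex["label"])
    measured = correct / len(holdout)

    print(f"claimed={claimed_accuracy:.3f}  measured={measured:.3f}")
    # The audit fails if the vendor's claim exceeds what anyone can reproduce.
    return measured >= claimed_accuracy - tolerance
```

The specific metric doesn't matter; the point is that a claim like "our model is 95% accurate" becomes something a party other than the vendor can check.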

I certainly don't mean to come across as saying that older AI systems were morally good. In a lot of cases, they sure aren't. I mentioned the AI moderation tool "Perspective API", which is both mundane and evil.

When it comes to requiring companies to publish their data, at least enough to independently audit their system: that would be great. If only. I feel like academic conferences and publications could try to have some leverage here, if they had a spine, but I guess you don't get conferences sponsored by having a spine.

I got very disillusioned with academia when conferences would give talk slots to corporations who would just boast about an AI tool they have and you don't. I was shocked that Google and Microsoft could get away with giving an irreplicable sales pitch, when everyone else was supposed to convey verifiable knowledge.

Totally agree. I classify a wide range of black-box machine learning as essentially professional malpractice. The harm of LLMs and GANs is not of a fundamentally new and different nature, but the generality of these techniques makes them (mis)applicable on a disturbing new scale.

The burden of proof ought to be on machine learning proponents to demonstrate that their models robustly capture a pattern, but somehow now that LLMs are good at lying convincingly it seems like the public at large expects critics to be able to prove that models don't work correctly without access to any information about how they were made or are operated. Truly absurd.

I hope when you say that stuff about "art being collaged" etc., you're being metaphorical, but I would encourage not using that kind of framing. I've seen people come to the mistaken conclusion that image generation literally works by storing images, digitally cutting them up, and pasting them together, such that the original image could be plucked out of the source code like an asset in a video game.

While I agree that the provenance of datasets is important (e.g. the LAION-5B dataset includes nonconsensual sensitive medical records, which I think on its own is reason not to use it), I believe misrepresenting the function of the technology undermines one's point.
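
For what it's worth, the non-collage point can be shown with a toy: a generator is a fixed set of learned weights mapping random noise to pixels, with no archive of source images inside it to pluck assets from. This is a deliberately tiny sketch, nothing like a production model:

```python
# Toy "generator" (nothing like a production model): output pixels are a
# function of fixed learned weights and fresh random noise. There is no
# archive of source images inside to cut assets out of.
import numpy as np

rng = np.random.default_rng(1)
weights = rng.normal(size=(16, 64))  # stand-in for trained parameters

def generate():
    z = rng.normal(size=16)        # random latent vector
    pixels = np.tanh(z @ weights)  # deterministic function of z and the weights
    return pixels.reshape(8, 8)

img = generate()
print(img.shape)  # (8, 8) -- synthesized on the fly, not retrieved from storage
```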

Optimizing a similarity function to an existing image is a form of copying that image, even if the design can't land on an exact copy of anything and obfuscates which images are involved.
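
A toy illustration of that claim: run plain gradient descent on a mean-squared-error "similarity" loss against one training image, and a noise image converges to an approximate copy of it, even though no pixels are ever cut and pasted. Again, this is a cartoon of the objective, not how any real generator is trained:

```python
# Cartoon of a "similarity" objective: gradient descent on MSE against one
# training image drives a noise image toward an approximate copy of it,
# with no cutting or pasting of pixels anywhere.
import numpy as np

rng = np.random.default_rng(0)
training_image = rng.random((8, 8))  # stand-in for someone's artwork
output = rng.random((8, 8))          # starts as pure noise

for _ in range(500):
    grad = 2 * (output - training_image)  # gradient of the MSE loss
    output -= 0.05 * grad                 # one descent step

print(f"final MSE: {np.mean((output - training_image) ** 2):.2e}")
# effectively zero: the "generated" output has converged to a lossy copy
```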

I experimented briefly with one of the web-based DALL-E implementations (the one they renamed to "Craiyon" because its results were shitty enough to be bad for their DALL-E brand). I asked it for something like "Billy Joel with bottles of malt liquor taped to his hands" (because the phrase "Billy Joel Forty-hands" had come up in conversation for some reason, and I probably would have gotten something more literal and eldritch if I asked for it in that form).

The image I got largely consisted of both album covers for "The Essential Billy Joel" and a Wikimedia Commons image for "bottle".

What changed since then is mostly a matter of how many compressed images it's approximating similarity to.