Hönig told Ars that breaking Glaze was "simple." His team found that "low-effort and 'off-the-shelf' techniques"—such as image upscaling, "using a different finetuning script" when training AI on new data, or "adding Gaussian noise to the images before training"—"are sufficient to create robust mimicry methods that significantly degrade existing protections."
Sometimes, these attack techniques must be combined, but Hönig's team warned that a motivated, well-resourced art forger might try a variety of methods to break protections like Glaze. Hönig said that thieves could also just download glazed art and wait for new techniques to come along, then quietly break protections while leaving no way for the artist to intervene, even if an attack is widely known. This is why his team discourages uploading any art you want protected online.
Ultimately, Hönig's team's attack works by simply removing the adversarial noise that Glaze adds to images, making it once again possible to train an AI model on the art. They described four methods of attack that they claim worked to remove mimicry protections provided by popular tools, including Glaze, Mist, and Anti-DreamBooth. Three were considered "more accessible" because they don't require technical expertise. The fourth was more complex, leveraging algorithms to detect protections and purify the image so that AI can train on it.
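For a sense of how low-effort these transformations are: they're ordinary image operations, nothing exotic. Below is a minimal sketch of what a noise-plus-resampling preprocessing pass might look like, assuming Pillow and NumPy; the function name and the sigma/scale values are illustrative placeholders, not code from the paper.

```python
# Illustrative sketch only: the kind of "off-the-shelf" preprocessing the
# article describes (Gaussian noise plus resampling). Not the paper's code;
# the sigma and scale values are made-up placeholders.
import numpy as np
from PIL import Image

def naive_purify(path, sigma=8.0, scale=2):
    img = np.asarray(Image.open(path).convert("RGB"), dtype=np.float32)

    # Mild Gaussian noise can drown out low-amplitude adversarial perturbations.
    noisy = np.clip(img + np.random.normal(0.0, sigma, img.shape), 0, 255)
    noisy_img = Image.fromarray(noisy.astype(np.uint8))

    # Upscale, then downscale back: resampling filters smooth per-pixel noise.
    w, h = noisy_img.size
    up = noisy_img.resize((w * scale, h * scale), Image.Resampling.LANCZOS)
    return up.resize((w, h), Image.Resampling.LANCZOS)
```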
Don't turn this thread into a debate about AI art or those tools. I'm just sharing because I know people here used those tools on their stuff, but sadly, like all "protection", it was only a matter of time until flaws were found, so I think it's useful to be aware of them.
taking this into its own post but:
If I'm being honest, I never really believed in Nightshade/Glaze/whatever tools as a long-term solution against "adversarial" generative image models. I'm still not 100% clear on whether these did anything when they first came out, but it was always a matter of time until these tools were defeated/bypassed.
Now I fear a lot of non-tech-savvy artists trained themselves to use these tools without realizing they've entered a game of cat and mouse with whoever really wants to copy their shit.
Like, if the attacks described in the article/underlying papers work as well as claimed... it's already over. Image upscaling and random noise are already applied by every other social media platform when making thumbnails, so unless you 100% own your art's distribution, there's probably already a "non-Glazed" copy of your art floating around somewhere. And now you've just spent extra GPU cycles to add a few "poison pixels" to your PNG file for very little benefit.
You shouldn't trust that what you upload to most websites will be replicated 1:1. That's not a criticism of those websites, that's just how Hosting Content works.
I don't know what the solution is. I just know that a technical solution was never gonna hold for very long, and I also know a "legal" solution will be A) too little, too late and B) won't do shit for anybody in countries without laws protecting this, so 😬
But the short version of the story is this: Ben's idea is that artists should add some adversarial noise to their images before releasing them to the public. This adversarial noise is supposed to make it so that, if someone tries to train a machine learning model on these images, the model will be bad.
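Here is a minimal sketch of that general idea, not Glaze's actual algorithm: perturb the image within a small L-infinity budget so that a stand-in feature extractor sees it very differently from the original. The choice of an off-the-shelf ResNet-18 and the eps/step/iteration values are assumptions made purely for illustration.

```python
# Minimal sketch of the *idea* of an adversarial-noise defense, not Glaze's
# actual algorithm: nudge the image within a small L-infinity budget (eps) so
# that a stand-in feature extractor embeds it far from the original. The
# ResNet-18 backbone and the eps/step/iters values are placeholder assumptions.
import torch
import torchvision.models as models
import torchvision.transforms.functional as TF
from PIL import Image

def add_adversarial_noise(path, eps=8 / 255, step=1 / 255, iters=40):
    backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    backbone.fc = torch.nn.Identity()   # keep penultimate features
    backbone.eval()
    for p in backbone.parameters():
        p.requires_grad_(False)

    x = TF.to_tensor(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        original = backbone(x)

    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(iters):
        # Maximize feature distance by descending on its negation.
        loss = -torch.nn.functional.mse_loss(backbone(x + delta), original)
        loss.backward()
        with torch.no_grad():
            delta -= step * delta.grad.sign()           # signed gradient step
            delta.clamp_(-eps, eps)                     # stay inside the budget
            delta.copy_((x + delta).clamp(0, 1) - x)    # keep valid pixel values
        delta.grad.zero_()

    return TF.to_pil_image((x + delta).squeeze(0).detach())
```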
But the most important thing to notice about this defense is that a break is unpatchable in the same way that a break on AES would be unpatchable. Once someone has published their adversarially noised images, they've lost control of them; they can't update them after that. And someone who wants to train a model on these protected images only needs to download a bunch of them all at once, wait a few weeks or months for an attack on the defense, and then retroactively train a model on those images.
As it turns out, the attack is quite simple: it's not that hard to remove this adversarial noise through various techniques, and then you can train a good model on these images. This means any exploit of the defense violates the security of everyone who has uploaded their images up to this point.
Sure, a future (second) version of the defense might prevent this attack, but the damage has already been done for everyone who published their images with the first version of the defense. You can't patch the images that have already been released.
What this means is that it is strictly better to publish the attack on this defense as early as possible. We did provide Ben's group with an initial draft of the paper so they could check for any errors in our work, but it doesn't make sense to wait for a patch the way it would for the other cases I've discussed above. Every day we wait to publish the attack is another day someone might publish more of their images with this defense and become vulnerable to attack later.