Maverynthia


irisjaycomics
@irisjaycomics

https://www.404media.co/tumblr-and-wordpress-to-sell-users-data-to-train-ai-tools/
(reading this entire article is free, you just have to sign up for an account on the site)

TL;DR:

  • tumblr is planning to routinely scrape user data to sell to midjourney/openAI
  • they ALREADY scraped a shit ton of data and fucked up at it, scraping a bunch of private, explicit, deleted and copyrighted material
  • there's going to be an opt-out policy, but it's not an immediate opt-out and it relies entirely on AI companies following the honor system
  • Automattic's head of AI is named Andrew Spittle. not crucial information, but i did want to underline that

from what The Homies and i can tell this doesn't include privately hosted blogs with Wordpress installed, only blogs hosted on Wordpress.com, btw. if you used Wordpress to build your webcomic site or something, you're fine.

anyways. there are probably gonna be more people coming here from tumblr! that means more work for admins and more server load to pay for. maybe toss eggbug a few dollars before the tide rises

also, because i know this post might blow up, if you have more than a few dollars to spare, consider buying a comic or something from my store



in reply to @irisjaycomics's post:

this is extremely true, but i know a lot of webcomic artists who use wordpress for their websites. comicpress and comic easel are good alternatives for this, but if rebuilding your site from scratch isn't an option or just isn't an option right now, you should be safe as long as you're not hosting your comic directly on Wordpress's site

Well, it shouldn't have included private stuff, but 404 Media reports that it probably does:

"the way the data was queried for the initial data dump to Midjourney/OpenAI means we compiled a list of all tumblr’s public post content between 2014 and 2023, but also unfortunately it included, and should not have included:

  • private posts on public blogs
  • posts on deleted or suspended blogs..."

https://www.404media.co/tumblr-and-wordpress-to-sell-users-data-to-train-ai-tools/