Ai Theft - Tumblr Posts
its official: tumblr is selling our data to Midjourney

we'd been hearing rumors about this for a bit but now its open and out there. some details from this article

it goes without saying, but if @staff goes through with this its going to be an utter shitshow and im all but certain the website will not survive it.
Your posts are in an AI model
and then Tumblr decided to sell them to AI models.
Now, don't get me wrong, tumblr selling out the users to AI companies is bad, yes, they shouldn't do that. It sucks.
but don't let's get this confused: your posts were already in there. Tumblr selling them is about tumblr making some money and about the AI models having more exhaustive post collections. It's not about your posts being in an AI model, vs not being in one. That battle has already been lost.
Can you find your post on google? Then it's almost certainly in an AI model already. Think about it: These AI sites showed up before all the sites were making deals to sell their users' content, right? How do you think they built them in the first place?
They scraped the posts. Just like google and bing and such do when they build their search indexes.
It's a fundamental part of how the open web works: you want your posts on tumblr to be visible to users, right? You want them to be readable?* Like, look how much stuff broke when twitter changed their whole read-while-not-logged-in policy, ruining a bunch of thread links/NSFW links. And if it's visible, it's scrapable. That's what the AI models were built on.
I've done website scraping before (not for AI models, of course; I was doing search engines and website archival), and this is just how it works. You hire a few relatively smart CS graduates and tell them "build me a scraper that'll give us a bunch of tumblr posts" and they go off for a month or two and come back with a database of a few billion posts, and you stuff that into your AI model. That's how they got all the deviantart and flickr and twitter and pinterest and so on posts. They didn't pay for them: they just took them.
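To make the "if it's visible, it's scrapable" point concrete, here is a minimal sketch of the extraction step of a scraper. It is only an illustration: the sample page, the `post` class name, and the `PostExtractor` and `extract_posts` names are all made up for this example, real Tumblr markup looks different, and a real scraper would also fetch pages over the network, which this sketch deliberately leaves out.

```python
from html.parser import HTMLParser

# Hypothetical markup for illustration; real Tumblr pages are structured differently.
SAMPLE_PAGE = """
<html><body>
  <article class="post"><p>first public post</p></article>
  <article class="post"><p>second <b>public</b> post</p></article>
  <div class="sidebar">not a post</div>
</body></html>
"""


class PostExtractor(HTMLParser):
    """Collects the visible text inside every <article class="post"> element."""

    def __init__(self):
        super().__init__()
        self.posts = []     # finished post texts
        self._depth = 0     # greater than 0 while inside a post element
        self._chunks = []   # text pieces of the post currently being read

    def handle_starttag(self, tag, attrs):
        if self._depth:
            # Track nested <article> tags so the matching close tag is found.
            if tag == "article":
                self._depth += 1
        elif tag == "article":
            classes = (dict(attrs).get("class") or "").split()
            if "post" in classes:
                self._depth = 1
                self._chunks = []

    def handle_endtag(self, tag):
        if self._depth and tag == "article":
            self._depth -= 1
            if self._depth == 0:
                # Join the pieces and collapse runs of whitespace.
                self.posts.append(" ".join("".join(self._chunks).split()))

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


def extract_posts(html):
    """Return the text of each post found in an HTML page."""
    parser = PostExtractor()
    parser.feed(html)
    parser.close()
    return parser.posts


posts = extract_posts(SAMPLE_PAGE)
print(posts)
```

Anything a logged-out browser can render can be pulled apart this way, which is why public posts were collectable long before any licensing deal.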
They only ever pay for this shit because either:
they fucked up in such a way that the site might be able to sue them for taking rather than paying, or
they can buy them cheaper than they can finish taking them. Maybe they'd need to pay the CS grads for an extra month? Well, that might be more expensive than just throwing the site a couple hundred thousand bucks.
ANYWAY: my point is, don't treat this "oh no tumblr is selling our posts to AI" thing like it's some big thing that might happen and would be bad if it did. Yes, it's bad, tumblr shouldn't do this, this'll let AI models get continual updates of content far more easily than scraping would, tumblr betrayed user trust, and so on...
but realistically, this is not a black and white matter of "if only tumblr didn't do this, then we'd be safe from AI models!"
Nope. We already lost that battle. I'm sorry, and it does suck, but that's just how it is. The avalanche has already started, it's too late for the pebbles to vote.
* I'm assuming here that you don't run a private blog that's set to only followers or something. You'd be safer then, of course, but you're not really my target audience for this rant.
"Yeah art theft and copyright violation is bad and all but hey at least people who can't actually figure out how to google or create reference pictures can have something to inspire them!"
While how AI gets the reference art is fucking awful and it is theft, it is nice for those who may have trouble putting pen to paper or pen to screen or just can’t draw but they have a good idea in mind
By the way you can't oppose big studios attempting to use AI-generated replicas of real people then fawn over AI song covers and deepfakes. They're the same act.

A new tool lets artists add invisible changes to the pixels in their art before they upload it online so that if it’s scraped into an AI training set, it can cause the resulting model to break in chaotic and unpredictable ways.
The tool, called Nightshade, is intended as a way to fight back against AI companies that use artists’ work to train their models without the creator’s permission. Using it to “poison” this training data could damage future iterations of image-generating AI models, such as DALL-E, Midjourney, and Stable Diffusion, by rendering some of their outputs useless—dogs become cats, cars become cows, and so forth. MIT Technology Review got an exclusive preview of the research, which has been submitted for peer review at computer security conference Usenix.
AI companies such as OpenAI, Meta, Google, and Stability AI are facing a slew of lawsuits from artists who claim that their copyrighted material and personal information was scraped without consent or compensation. Ben Zhao, a professor at the University of Chicago, who led the team that created Nightshade, says the hope is that it will help tip the power balance back from AI companies towards artists, by creating a powerful deterrent against disrespecting artists’ copyright and intellectual property. Meta, Google, Stability AI, and OpenAI did not respond to MIT Technology Review’s request for comment on how they might respond.
Zhao’s team also developed Glaze, a tool that allows artists to “mask” their own personal style to prevent it from being scraped by AI companies. It works in a similar way to Nightshade: by changing the pixels of images in subtle ways that are invisible to the human eye but manipulate machine-learning models to interpret the image as something different from what it actually shows.
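The "invisible changes to the pixels" idea rests on keeping every change within a tiny budget per pixel. The sketch below shows only that constraint and is not Nightshade's or Glaze's actual method: those tools compute their changes against a model, whereas this example uses plain random noise, which would not poison anything. The `EPSILON` value, the `perturb` function, and the sample values are assumptions made up for illustration.

```python
import random

EPSILON = 2  # assumed maximum change per channel value, out of 255 (illustrative)


def perturb(pixels, epsilon=EPSILON, seed=0):
    """Return a copy of `pixels` with each value shifted by at most `epsilon`."""
    rng = random.Random(seed)
    out = []
    for value in pixels:
        shifted = value + rng.randint(-epsilon, epsilon)
        out.append(max(0, min(255, shifted)))  # clamp to the valid 8-bit range
    return out


original = [0, 17, 128, 200, 255, 64]  # a made-up strip of 8-bit channel values
changed = perturb(original)

# The largest difference between any original value and its perturbed version.
max_diff = max(abs(a - b) for a, b in zip(original, changed))
print(changed, max_diff)
```

A shift of a couple of steps out of 255 is hard for a person to notice, which is the budget the article describes; the hard part the real tools solve is choosing which shifts mislead a model.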
Deplorable and disgusting. Selling out your users to keep yourself rich. Protect yourselves, everyone. The IP thieves are coming to the neighbourhood.
What is this about the tumblr staff wanting to sell art data to midjourney?
An ex-colleague of mine mentioned yesterday that there may be contacts between Automattic and midjourney in that direction, but nothing is public yet and I don't have any more info. They probably won't have anything specific to share either, since they left the company weeks ago too. That being said:
I have no reason to doubt my ex-coworker's word, they are a trustworthy person.
Tumblr's CEO has been absurdly enthusiastic (comically, even) about AI, and is a big fan of LLMs and 'AI' companies.
A deal with midjourney could solve tumblr's financial issues (not the same company, but OpenAI is paying up to 5 million/year to news companies to use their content as training data... tumblr generates several orders of magnitude more content than any newspaper or any media company and it would only need a 20 to 30 million per year deal to be profitable)
So I don't have any extra info yet, but I'm keeping my ears open.

It's not made of nothing. It's made of all the things you mentioned that have been mined and stripped off the bodies of work created by the efforts of others. It's a Frankenstein's monster of mutilated art displayed as if it's not a garish perverted amalgam of theft; a beautifully arranged construction of stolen creativity and dreams. AI art is the digital equivalent of the British Museum.
ai generated images make me increasingly sad and tired the more i see them in more and more casual contexts. i dont know how to explain, but it just fills the world with a bunch of nothing. no matter how visually stunning the pictures might be, there's nothing behind it for me. no dedication, no emotions, no feelings, no hard work or creativity, nothing i can truly think about, admire or enjoy. i dont think thats how art is supposed to be
me and the gang celebrating the death of ai art

Reblog to make it die faster