Yo automated KKKommunity note$$!?

refolde [she/her, any] · 5 months ago

Yo automated KKKommunity note$$!?

Tabitha ☢️[she/her] · 5 months ago

I wonder if you could engineer an unrelated image to have the same md5 or perceptual hash and get it to auto-debunk.
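For anyone wondering why md5 and a perceptual hash behave so differently here, this is a rough sketch. It assumes a toy "average hash" (one bit per pixel, set when the pixel is brighter than the image mean) standing in for whatever a real service might use; the 4x4 pixel grid and the helper names `average_hash` and `md5_hex` are made up for illustration, and nothing in the thread says this is how Twitter matches images.

```python
import hashlib

def average_hash(pixels):
    """Toy perceptual hash: one bit per pixel, '1' if brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if p > mean else "0" for p in flat)

def md5_hex(pixels):
    """Exact-match hash over the raw pixel bytes."""
    return hashlib.md5(bytes(p for row in pixels for p in row)).hexdigest()

# Hypothetical 4x4 grayscale "image" (values 0-255).
original = [
    [200, 210, 30, 20],
    [220, 205, 25, 15],
    [40, 35, 190, 215],
    [10, 45, 225, 230],
]

# Same picture with a single pixel nudged by one brightness level.
tweaked = [row[:] for row in original]
tweaked[0][0] += 1

same_perceptual = average_hash(original) == average_hash(tweaked)
same_md5 = md5_hex(original) == md5_hex(tweaked)
print(same_perceptual, same_md5)
```

The perceptual hash is built to tolerate small changes, which is also what leaves room to steer an unrelated image toward a target hash. md5 changes completely on a one-pixel edit, so matching it means attacking the hash function itself.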

TrudeauCastroson [he/him] · 5 months ago

If you have access to a quantum computer you could do this easily. With current computing it's hard.

git [he/him, comrade/them] · 5 months ago

This was a form of attack against Apple's on-device CSAM detection that they scrapped, so it's been possible for a while.

Neural hash collider: https://github.com/anishathalye/neural-hash-collider
Example collision: https://github.com/AsuharietYgvar/AppleNeuralHash2ONNX/issues/1
Script to generate collisions: https://gist.github.com/unrealwill/c480371c3a4bf3abb29856c29197c0be
Tainting the client side CSAM database: https://blog.xot.nl/2023/10/11/tainting-the-csam-client-side-scanning-database/index.html

TrudeauCastroson [he/him] · edit-2 5 months ago

Edit: wow, I didn't realize md5 matching a picture was that easy. Looks like you can make any image look enough like that twitter-deboonked one to generate a fake match. How has no one done this yet?

Thanks for the links, it's pretty interesting stuff I haven't kept up with for a while.

I didn't hear about that potential apple attack. I wonder if you could generate a collision between a pic that looks close enough to the twitter image they auto-deboonk and a pic that's completely unrelated, get twitter to add your new similar image to the auto-deboonker, and then troll on twitter by posting the unrelated image.

That'd be similar to that apple attack you linked, but it depends on how twitter auto-deboonking works and how easily you could get them to add a similar-but-different pic to their deboonker database.

bloubz@lemmygrad.ml · 5 months ago

md5. They said md5.

emizeko [they/them] · 5 months ago

mega deeznuts five

Chronicon [they/them] · 5 months ago

Well, he said it's hard, not impossible/impractical.

bloubz@lemmygrad.ml · 5 months ago

You're right

TrudeauCastroson [he/him] · 5 months ago

I thought md5 is vulnerable to generating 2 colliding files, not to trying to generate a match to an existing file.
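A small sketch of that distinction, under a loud assumption: it uses only the first 16 bits of MD5 (the hypothetical `tiny_hash` below) so that brute force finishes in well under a second. It says nothing about the cost of attacking full MD5, where the practical attacks I'm aware of are collision attacks based on cryptanalysis rather than brute force.

```python
import hashlib

def tiny_hash(data: bytes) -> bytes:
    """First 2 bytes (16 bits) of MD5; deliberately weak for the demo."""
    return hashlib.md5(data).digest()[:2]

def find_collision():
    """Birthday search: ANY two distinct inputs that share a tiny hash."""
    seen = {}
    i = 0
    while True:
        msg = b"a-%d" % i
        h = tiny_hash(msg)
        if h in seen:
            return seen[h], msg, i + 1  # earlier input, new input, tries
        seen[h] = msg
        i += 1

def find_second_preimage(target: bytes):
    """Find a different input matching ONE fixed, already-existing input."""
    goal = tiny_hash(target)
    i = 0
    while True:
        msg = b"b-%d" % i
        if msg != target and tiny_hash(msg) == goal:
            return msg, i + 1  # matching input, tries
        i += 1

m1, m2, collision_tries = find_collision()
forged, preimage_tries = find_second_preimage(b"existing file")
print("collision tries:", collision_tries)
print("second-preimage tries:", preimage_tries)
```

With a 16-bit hash, the collision search typically needs on the order of a few hundred tries (roughly the square root of 65,536), while matching a fixed existing input typically needs tens of thousands. The same gap, scaled up, is why "generate two colliding files" and "match an existing file" are very different problems.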

bloubz@lemmygrad.ml · 5 months ago

It's definitely the easiest. But that's why we stopped using it: it's proven that collisions can be generated, so it may be possible to generate a match for a real-life file. I'm not sure where research is at on this (if there is any).

Also, I was actually not trying to make a point, just pointing at md5 as a joke.

TrudeauCastroson [he/him] · 5 months ago

If you're using any hash smaller than your file (not just md5), then it's always possible to have 2 different files that match. This is just the pigeonhole principle: there are more possible files than possible hash values, so no matter what you use there will be collisions.
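To make the pigeonhole point concrete, here's a minimal sketch that assumes a deliberately tiny hash: just the first byte of MD5, via a made-up helper `one_byte_hash`. With only 256 possible outputs, any 257 distinct inputs must contain at least one colliding pair, whatever the hash function is.

```python
import hashlib

def one_byte_hash(data: bytes) -> int:
    """First byte of MD5, so there are only 256 possible outputs."""
    return hashlib.md5(data).digest()[0]

def first_collision(messages):
    """Return the first pair of distinct messages sharing a hash, or None."""
    seen = {}
    for msg in messages:
        h = one_byte_hash(msg)
        if h in seen:
            return seen[h], msg
        seen[h] = msg
    return None

# 257 distinct inputs into 256 buckets: a collision is guaranteed.
messages = [b"file-%d" % i for i in range(257)]
pair = first_collision(messages)
print(pair)
```

The guarantee only says collisions exist. How hard they are to find on purpose depends on the output size and on weaknesses in the specific hash.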

md5 is bad partly because it's small (128 bits) and mostly because of known weaknesses that make collisions cheap to generate. It's also a question of how easy it is to reverse-engineer a match to an existing file, which, going by those links, is easier for picture hashes than I expected.