Edit: wow, I didn't realize matching a picture's md5 was that easy. Looks like you can make any image look enough like that twitter-deboonked one to generate a fake match. How has no one done this yet?
Thanks for the links, it's pretty interesting stuff I haven't kept up with for a while.
I hadn't heard about that potential apple attack. I wonder if you could generate a collision between a pic that looks close enough to the twitter image they auto-deboonk and a pic that's completely unrelated, get twitter to add your new similar image to the auto-deboonker, and then troll on twitter by posting the unrelated image.
That'd be similar to that apple attack you linked, but it depends on how twitter auto-deboonking works and how easily you could get them to add a similar-but-different pic to their deboonker database.
It's definitely the easiest. But that's why we stopped using it: collisions have been demonstrated in practice, so it may be possible to generate a match on a real-life file. I'm not sure where the research is on this (if there is any).
Also, I actually wasn't trying to make a point, just pointing at md5 as a joke.
If you're using any hash smaller than your file (not just md5), then it's always possible to have 2 different files that match. This is just the pigeonhole principle. No matter what you use, there will be collisions.
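To make the pigeonhole point concrete, here's a rough sketch. `tiny_hash` is a made-up toy (just the first 2 bytes of MD5, so only 65,536 possible outputs), not anything a real system uses. Feed it 65,537 different inputs and two of them have to share a hash.

```python
import hashlib


def tiny_hash(data: bytes, n_bytes: int = 2) -> bytes:
    """Hypothetical toy hash: the first n_bytes of the MD5 digest."""
    return hashlib.md5(data).digest()[:n_bytes]


def find_collision(n_bytes: int = 2):
    """Return two distinct inputs with the same tiny_hash, plus that hash."""
    seen = {}
    # 256**n_bytes possible outputs, so 256**n_bytes + 1 distinct inputs
    # cannot all map to different outputs (pigeonhole principle).
    for i in range(256 ** n_bytes + 1):
        msg = str(i).encode()
        h = tiny_hash(msg, n_bytes)
        if h in seen:
            return seen[h], msg, h
        seen[h] = msg
    raise AssertionError("unreachable: pigeonhole guarantees a collision")


a, b, h = find_collision()
print(a, b, h.hex())
```

The same argument applies to full 128-bit MD5 or to any fixed-size hash. The only thing that changes is how many inputs you would have to try.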
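md5 is bad partly because it's small (128 bits), which makes brute-forcing a match easier, and partly because known weaknesses in its design make collisions much cheaper than brute force. It's also a question of how easy it is to reverse engineer a match, which apparently is easier for pictures than I expected.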
In theory a large enough quantum computer would make this much cheaper (Grover's algorithm roughly square-roots the brute-force search), but with current computing it's hard.
This was a form of attack against Apple's on-device CSAM detection that they scrapped, so it's been possible for a while.
md5 They said md5
mega deeznuts five
Well, he said it's hard, not impossible/impractical.
You're right
I thought md5 was vulnerable to generating 2 colliding files, not to generating a match for an existing file.
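The difference between those two attacks shows up even with plain brute force. This is a small sketch on a made-up 16-bit toy hash (`tiny_hash`, the first 2 bytes of MD5, purely for illustration). It counts how many hashes it takes to find any two colliding inputs versus a match for one fixed "existing file". The input strings and function names are my own placeholders.

```python
import hashlib
from itertools import count


def tiny_hash(data: bytes, n_bytes: int = 2) -> bytes:
    """Hypothetical toy hash: the first n_bytes of the MD5 digest."""
    return hashlib.md5(data).digest()[:n_bytes]


def collision_attempts(n_bytes: int = 2) -> int:
    """Hashes computed until ANY two of our own inputs collide."""
    seen = set()
    for i in count():
        h = tiny_hash(b"c%d" % i, n_bytes)
        if h in seen:
            return i + 1
        seen.add(h)


def second_preimage_attempts(target: bytes, n_bytes: int = 2) -> int:
    """Hashes computed until some input matches the hash of a FIXED target."""
    want = tiny_hash(target, n_bytes)
    for i in count():
        if tiny_hash(b"p%d" % i, n_bytes) == want:
            return i + 1


print("collision found after", collision_attempts(), "hashes")
print("second preimage found after",
      second_preimage_attempts(b"existing file"), "hashes")
```

For an n-bit hash, a brute-force collision takes on the order of 2^(n/2) tries (the birthday bound), while matching an existing file takes on the order of 2^n. MD5's known breaks are on the collision side, where they are far cheaper than brute force. As far as I know, no practical attack that matches an arbitrary existing file has been published.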