I used to think that even though Apple's bad, at least they're okay on the privacy front. But what the fuck?

They wanna be able to scan every image I take for the possibility of it being "illegal"? Fuck you.

Is Ubuntu good? And LineageOS for phone?

Edit - At least r/apple seems to be against it. If there's one positive thing I can say about reddit, it's that those nerds are usually all in on privacy stuff.

  • Sphere [he/him, they/them]
    ·
    3 years ago

    I don't have time today to go into this deeply, but here is Apple's description of how NeuralHash works, which makes it clear that it is, in fact, AI-based (meaning anything that can trick a neural net can probably trick the hashing scheme as a whole, which is a problem given that neural nets have been found to be fooled by modifications to images that are undetectable by humans):

    The system generates NeuralHash in two steps. First, an image is passed into a convolutional neural network to generate an N-dimensional, floating-point descriptor. Second, the descriptor is passed through a hashing scheme to convert the N floating-point numbers to M bits. Here, M is much smaller than the number of bits needed to represent the N floating-point numbers. NeuralHash achieves this level of compression and preserves sufficient information about the image so that matches and lookups on image sets are still successful, and the compression meets the storage and transmission requirements.

    The neural network that generates the descriptor is trained through a self-supervised training scheme. Images are perturbed with transformations that keep them perceptually identical to the original, creating an original/perturbed pair. The neural network is taught to generate descriptors that are close to one another for the original/perturbed pair. Similarly, the network is also taught to generate descriptors that are farther away from one another for an original/distractor pair. A distractor is any image that is not considered identical to the original. Descriptors are considered to be close to one another if the cosine of the angle between descriptors is close to 1.

    The trained network’s output is an N-dimensional, floating-point descriptor. These N floating-point numbers are hashed using LSH, resulting in M bits. The M-bit LSH encodes a single bit for each of M hyperplanes, based on whether the descriptor is to the left or the right of the hyperplane. These M bits constitute the NeuralHash for the image.
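    To make that last LSH step concrete, here's a rough toy sketch in Python. The real descriptor size, bit count, and hyperplanes are Apple's and not public, so everything below is made up for illustration:

    ```python
    import numpy as np

    def lsh_bits(descriptor, hyperplanes):
        # One bit per hyperplane: which side of the hyperplane the
        # descriptor falls on (the sign of the dot product).
        return (hyperplanes @ descriptor > 0).astype(np.uint8)

    rng = np.random.default_rng(0)
    N, M = 128, 96                            # made-up sizes; Apple doesn't publish N or M
    hyperplanes = rng.standard_normal((M, N))

    descriptor = rng.standard_normal(N)                      # stand-in for the CNN's output
    perturbed = descriptor + 0.01 * rng.standard_normal(N)   # "perceptually identical" variant

    h1 = lsh_bits(descriptor, hyperplanes)
    h2 = lsh_bits(perturbed, hyperplanes)
    print(int((h1 != h2).sum()))   # few or no differing bits for near-identical descriptors
    ```

    The point being: two descriptors that point in nearly the same direction land on the same side of almost every hyperplane, so their M-bit hashes come out nearly identical.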

    It appears that Apple ran this algorithm on each of the images in the known-CSAM database; otherwise it wouldn't be possible for this system to work.

    • drhead [he/him]
      ·
      3 years ago

      It appears that Apple ran this algorithm on each of the images in the known-CSAM database; otherwise it wouldn’t be possible for this system to work.

      No, that can't be it. There is no database of images that is publicly available. They only have the hashes; the hashes are all they give people access to. If they had such a database, there would be zero chance they'd let anyone outside of law enforcement access it.

      They did make their own AI for scanning NSFW pictures as a parental control, but that's a separate feature.

      I know Microsoft has a similar API that also lumps together an AI check with checking if it matches a known hash from a database, so this type of setup is not unprecedented.

      • Sphere [he/him, they/them]
        ·
        3 years ago

        Not publicly available, no, but I imagine the images are kept somewhere, so as to allow them to be hashed with new algorithms as they're developed. Maybe Apple just supplied the algorithm and got back a set of hashes?

        Anyway, the excerpt I posted above makes it very clear that the neural net step comes first, and then the result from the neural net is hashed. So there's no way to just use the old hash database for the system they're describing.

        • drhead [he/him]
          ·
          3 years ago

          Hmm... okay, now that I read it again I think I understand it better. It sounds like this particular AI setup is there to see past types of obfuscation that a normal perceptual hash algorithm can't. Honestly, given that the standard algorithm in use now is Microsoft's PhotoDNA, it sounds like they made this mostly because they don't want to rely on something made by Microsoft if they can avoid it, even at great expense (and yes, you're right that they probably made a new hash DB for their algorithm; apparently the hash database is newer than I thought, from 2015). As far as I can tell from reading the technical summary, the neural net itself is used to decide which parts of the image are important enough to be factored into the hash, so it's not necessarily the type of thing people are used to tricking AIs with.
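          For contrast, here's roughly what a "normal" perceptual hash looks like. PhotoDNA itself is proprietary, so this is dHash, a simple public algorithm, just to illustrate the kind of fixed, hand-designed reduction the neural net replaces:

          ```python
          from PIL import Image

          def dhash(path, size=8):
              # Difference hash: shrink to a (size+1) x size greyscale thumbnail,
              # then record whether each pixel is brighter than its right neighbour.
              img = Image.open(path).convert("L").resize((size + 1, size))
              px = list(img.getdata())
              bits = 0
              for row in range(size):
                  for col in range(size):
                      left = px[row * (size + 1) + col]
                      right = px[row * (size + 1) + col + 1]
                      bits = (bits << 1) | (left > right)
              return bits  # 64-bit integer; compare two hashes by Hamming distance

          def hamming(a, b):
              return bin(a ^ b).count("1")
          ```

          Something like this survives resizing and recompression but falls apart under cropping or heavier edits, which is presumably the gap the neural-net descriptor is supposed to close.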

          As far as fooling the neural net goes, I don't think there's any real way to test it, since there isn't even a way for the user to know whether an image was flagged. The phone sends a voucher with the result of the check off to Apple, but Apple can't decrypt the vouchers until a certain number of them have been uploaded. Then there's a manual review before any user-visible action happens to the account, so there goes any plan to make something innocuous that triggers it.

          As far as fooling it into a false positive goes, whoever wants to attempt that sure has their work cut out for them. Unless someone can reverse-engineer a neural network to produce a hash collision against a database they can't even see the results of a test against, I think the chances of this happening are slim. Assuming Apple isn't outright lying about parts of this (in which case, why mention it at all?), you'd have to send off several compromised images that look close enough to the real thing to pass manual review. Then they would get to NCMEC, who would check them against their images and find that they're fake, because they have the real images to compare against. So nothing would really come of it.
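          By the way, the "can't decrypt until a certain number are uploaded" part is a standard threshold secret sharing idea. A rough sketch of how that kind of scheme works (this is textbook Shamir secret sharing, not Apple's actual construction, and the numbers are made up):

          ```python
          import random

          PRIME = 2**127 - 1  # prime modulus; all arithmetic happens in this field

          def make_shares(secret, threshold, count):
              # Hide the secret as the constant term of a random degree-(threshold-1)
              # polynomial; each share is one point on that polynomial.
              coeffs = [secret] + [random.randrange(PRIME) for _ in range(threshold - 1)]
              return [(x, sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME)
                      for x in range(1, count + 1)]

          def reconstruct(shares):
              # Lagrange interpolation at x = 0 recovers the constant term (the secret).
              secret = 0
              for i, (xi, yi) in enumerate(shares):
                  num, den = 1, 1
                  for j, (xj, _) in enumerate(shares):
                      if i != j:
                          num = num * (-xj) % PRIME
                          den = den * (xi - xj) % PRIME
                  secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
              return secret

          shares = make_shares(secret=123456789, threshold=5, count=30)
          print(reconstruct(shares[:5]))   # 123456789; any 4 or fewer shares reveal nothing
          ```

          If each matching image contributes one share of the key that unlocks the vouchers, Apple mathematically can't read anything until the threshold is hit, which is the property their summary is claiming.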

          But I've already heard half a dozen variants of "what if see see pee makes apple scan phones for pictures of tiananmen square tank man" in less than a week from this news breaking, so I think the brain damage from this is already done to the general population.

          • Sphere [he/him, they/them]
            ·
            edit-2
            3 years ago

            Well, given that it happens client-side, the software that does the actual hashing has to be loaded onto the phone, right? So the algorithm itself can be retrieved with some reverse engineering, and I doubt nefarious actors would have much trouble getting hold of CSAM images that are likely to be in the database, which means it's certainly possible to game the system. That said, it would probably take a state-level actor to actually pull it off, but that doesn't mean it couldn't happen. You're right that it wouldn't amount to much of anything, though, so I doubt anyone would bother.

            Anyway, your point about privacy in the cloud is quite salient, I think. Personally I don't have an Apple phone, nor do I use cloud storage services, and I sure as hell don't have any CSAM, so I'm not really concerned about it. In fact, I was originally making a point similar to yours, and had to backtrack somewhat when I actually looked into how it all works.