So I'm never updating my Apple devices again...

LibsEatPoop [any] · edit-2 3 years ago

So I'm never updating my Apple devices again...

Sphere [he/him, they/them] · edit-2 3 years ago

~~While there are real and valid concerns with this new system for detecting CP, it's actually 1) not as bad as it sounds, and 2) less effective than their press releases let on.~~

The trick they're using is checking the hash value of each image/video against a list of hash values of known CP files. A hash value is a short binary value generated by running an algorithm on the file, one which always produces exactly the same hash value for a given input, but which gives a drastically different hash value for even a very slightly different input. Also, the algorithm cannot be reversed; finding out the exact nature of the input that produces a given output requires a glorified form of guess-and-check (usually; more on this below). So, Apple will never get hold of your actual files in any way, or even anything they could use to reconstruct those files or any portion thereof; it's all still end-to-end encrypted.
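To make the two properties above concrete, here is a minimal sketch using SHA-256 from Python's standard library. SHA-256 is only a stand-in for "a cryptographic hash" and the byte strings are made-up placeholders for file contents; nothing here is taken from Apple's system.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()

def differing_bits(a: bytes, b: bytes) -> int:
    """Count how many of the 256 digest bits differ between two inputs."""
    da = int.from_bytes(hashlib.sha256(a).digest(), "big")
    db = int.from_bytes(hashlib.sha256(b).digest(), "big")
    return bin(da ^ db).count("1")

# Hypothetical stand-ins for the bytes of an image file.
original = b"pretend these bytes are an image file"
modified = b"pretend these bytes are an image filf"  # only the last byte changed

print(sha256_hex(original))  # same input always gives this same digest
print(sha256_hex(modified))  # one changed byte gives an unrelated-looking digest
print(differing_bits(original, modified), "of 256 bits differ")
```

For a well-behaved cryptographic hash, roughly half of the output bits should flip for any change to the input, which is why an exact-match lookup against a hash list only catches byte-identical files.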

Edit: the hash function being used is not a cryptographic one but an AI-based one, which destroys some of these guarantees, and creates a real risk that Apple employees will end up viewing all kinds of random non-CP images because of flaws in the hashing algorithm.

But:

1. This does not stop people who are actively exploiting children and producing new files, since they won't be on the list yet, and they may never be added if they aren't widely distributed.
2. As mentioned, hash values change if there's even a slight modification made. So taking a CP image, opening it in Paint, and changing the color of a single pixel is enough to change the hash value, thus sidestepping this system entirely. (Edit: see thread below)
3. While it is essentially impossible to find an exact input to match a hash value, older hash algorithms are often broken, which means that sophisticated cryptographers can calculate an input that will produce a specific hash. (If this is done to match an existing hash, the result is known as a "collision," where two different inputs give the same hash. While collisions are inevitable given that hash algorithms can process an arbitrarily sized input, if there is a process that can generate a collision in a reasonable amount of time, the algorithm is thought of as broken.) As such, it is possible to create malicious files which will trigger a match with one of the hashes on the illegal-files list.
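To illustrate the last point about collisions, here is a small sketch with a deliberately weak hash: SHA-256 truncated to 16 bits. The truncation and the function names `tiny_hash` and `find_collision` are assumptions made purely for the demonstration. With only 65,536 possible outputs, two different inputs must share a hash sooner or later, and a plain search finds such a pair almost immediately. Note this shows the easier case (any two inputs that collide); matching one specific target hash takes more work, but the idea is the same.

```python
import hashlib
from itertools import count

def tiny_hash(data: bytes, n_bytes: int = 2) -> bytes:
    """A deliberately weak hash: SHA-256 truncated to n_bytes bytes."""
    return hashlib.sha256(data).digest()[:n_bytes]

def find_collision(n_bytes: int = 2):
    """Return two different inputs that share the same tiny_hash value."""
    seen = {}  # maps hash value -> first input that produced it
    for i in count():
        candidate = str(i).encode()
        h = tiny_hash(candidate, n_bytes)
        if h in seen:
            return seen[h], candidate
        seen[h] = candidate

a, b = find_collision()
print(a, b, tiny_hash(a).hex())  # two different inputs, one shared hash
```

A real hash function is considered broken when something like this becomes feasible at its full output size, not just for a toy 16-bit version.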

Also, there may be additional security and privacy concerns I'm not aware of; I'm not quoting anyone but rather just applying my knowledge of computer security concepts to this system.

LibsEatPoop [any] · 3 years ago

The video goes into how A.I. can help bypass Point 2. And it does mention Point 3.

IDK. Like, I fully expect their capabilities to improve in the future and for them to keep pushing to get more and more access. So, this is basically like a point at which you can say, enough is enough, I'm going to leave.

comi [he/him] · 3 years ago

They don’t use cryptohash though

Sphere [he/him, they/them] · 3 years ago

I'm not sure what you mean by this, but they definitely are using hashes to detect illegal images:

> the system performs on-device matching using a database of known CSAM image hashes provided by NCMEC and other child safety organizations

comi [he/him] · 3 years ago

I thought they hashed the result of some pattern network, to avoid the pixel-changing behaviour of straight hashes?

Sphere [he/him, they/them] · 3 years ago

Ah yes, in the technical summary it details their hashing algorithm; it takes some digging to actually find that, I note. I admit to being dubious that their NeuralHash algorithm is really as bulletproof as they seem to be saying it is; neural nets are, in my view, a shitty foundation for any algorithm.

So, on the one hand, this wipes out point 2 (as OP already mentioned), but on the other, anyone with a very large photo collection on iCloud is at real risk of having their images viewed by random Apple employees, which is pretty gross. (I was wondering why they would need a threshold rather than just booting someone for any such image.)

drhead [he/him] · 3 years ago

I don't think this is an accurate representation of what is going on.

This is a type of perceptual hashing, similar to YouTube's Content ID system. The underlying principle is the same: inputting the same image will give the same result. However, whereas a traditional cryptographic hash will change drastically with a small change to the input, a perceptual image hash is very unlikely to change with just small changes, and will change only slightly with slight changes. From my understanding you can also figure out partial matches from cropped or resized images as well.

The "AI-based" bit is probably deliberately or accidentally conflating it with the parental control feature (where it'll try to flag when accounts flagged as minors are sending nudes, warn them, and snitch to their parents if they do it anyways). NCMEC has a list of hashes for known CSAM. If you want to compare against that list, you need your own hash to compare to that is generated the same way NCMEC generated their hashes, not some AI nonsense. It simply will not work otherwise. People seem to have a perception that you can throw "AI" at any problem and that it will vaguely work but not necessarily well, but there are some problems that simply do not work like that.
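For a rough feel of how a perceptual hash differs from a cryptographic one, here is a sketch of "average hash" (aHash), one of the simplest perceptual hashes. It is not PhotoDNA, Content ID, or whatever Apple uses; the tiny 4x4 grayscale grids are made-up stand-ins for images that have already been downscaled.

```python
def average_hash(pixels):
    """Average hash: one bit per pixel, set when the pixel is brighter
    than the mean brightness of the whole (already downscaled) image."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]

def hamming(a, b):
    """Number of positions where two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))

# Toy 4x4 grayscale "image": dark left half, bright right half.
image = [[10, 20, 200, 210],
         [15, 25, 205, 215],
         [12, 22, 202, 212],
         [18, 28, 208, 218]]

# The same image with a single pixel nudged by one brightness level.
tweaked = [row[:] for row in image]
tweaked[0][0] = 11

# A genuinely different image: bright top half, dark bottom half.
other = [[200, 210, 205, 215],
         [202, 212, 208, 218],
         [10, 20, 15, 25],
         [12, 22, 18, 28]]

print(hamming(average_hash(image), average_hash(tweaked)))  # 0
print(hamming(average_hash(image), average_hash(other)))    # 8
```

Matching is then done by Hamming distance against a threshold rather than by exact equality, which is what lets small edits still count as the same image.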

This is already a system in widespread use. Every other cloud drive service has used this for years. Every user-generated content site has used it for years (especially adult sites). As for point 1, it's not meant to prevent, it's meant to stop distribution of known files, which I would imagine is of great importance to people who have already had images of them widely distributed and want it to stop -- this is likely the closest people can get to truly destroying something that is on the Internet already.

Should you be concerned about this? Not really. Hashing the images locally is the best way to do this, honestly. Hash collisions are fairly unlikely, and at least from their statement you'd have to violate it several times to get reviewed (keep in mind this is only flagging against a list of known images). Hash collisions are possible to create with perceptual hashing, but the results honestly look bizarre, not like something you would want to upload to a cloud service necessarily, and unless Apple is blatantly lying about their review procedures all they would see are some bizarre nonsense images (and if you are assuming they are lying then there was no reason to trust them at any point, and there's nothing at all to discuss). If you're really concerned about your privacy, then -- and I cannot emphasize enough how there is no way around this -- don't use cloud services. Loss of privacy should be treated as one of the costs of using someone else's computer. You are entrusting that data to someone else.

Sphere [he/him, they/them] · 3 years ago

I don't have time today to go into this deeply, but here is Apple's description of how NeuralHash works, which makes it clear that it is, in fact, AI-based (meaning anything that can trick a neural net can probably trick the hashing scheme as a whole, which is a problem given that neural nets have been found to be fooled by modifications to images that are undetectable by humans):

> The system generates NeuralHash in two steps. First, an image is passed into a convolutional neural network to generate an N-dimensional, floating-point descriptor. Second, the descriptor is passed through a hashing scheme to convert the N floating-point numbers to M bits. Here, M is much smaller than the number of bits needed to represent the N floating-point numbers. NeuralHash achieves this level of compression and preserves sufficient information about the image so that matches and lookups on image sets are still successful, and the compression meets the storage and transmission requirements. The neural network that generates the descriptor is trained through a self-supervised training scheme. Images are perturbed with transformations that keep them perceptually identical to the original, creating an original/perturbed pair. The neural network is taught to generate descriptors that are close to one another for the original/perturbed pair. Similarly, the network is also taught to generate descriptors that are farther away from one another for an original/distractor pair. A distractor is any image that is not considered identical to the original. Descriptors are considered to be close to one another if the cosine of the angle between descriptors is close to 1. The trained network’s output is an N-dimensional, floating-point descriptor. These N floating-point numbers are hashed using LSH, resulting in M bits. The M-bit LSH encodes a single bit for each of M hyperplanes, based on whether the descriptor is to the left or the right of the hyperplane. These M bits constitute the NeuralHash for the image.

It appears that Apple ran this algorithm on each of the images in the known-CSAM database; otherwise it wouldn't be possible for this system to work.
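The second step in that description (turning an N-dimensional descriptor into M bits, one per hyperplane) can be sketched in a few lines. This covers only the hyperplane LSH part; the neural network is not modelled at all, the descriptors are random made-up vectors, and the sizes `N = 16` and `M = 32` as well as all function names are assumptions for illustration, not Apple's values.

```python
import math
import random

def cosine(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def make_hyperplanes(m, n, seed=0):
    """M random hyperplanes through the origin, each given by a normal vector."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]

def lsh_bits(descriptor, hyperplanes):
    """One bit per hyperplane: which side of it the descriptor falls on."""
    return [1 if sum(d * h for d, h in zip(descriptor, plane)) >= 0 else 0
            for plane in hyperplanes]

def hamming(a, b):
    """Number of positions where two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))

N, M = 16, 32  # hypothetical sizes, chosen only for the demo
planes = make_hyperplanes(M, N)

rng = random.Random(1)
original = [rng.gauss(0, 1) for _ in range(N)]
perturbed = [x + rng.gauss(0, 0.01) for x in original]  # nearly the same direction
distractor = [rng.gauss(0, 1) for _ in range(N)]        # an unrelated descriptor

print("cosine(original, perturbed): ", cosine(original, perturbed))
print("cosine(original, distractor):", cosine(original, distractor))
print("bits differing, perturbed:   ", hamming(lsh_bits(original, planes), lsh_bits(perturbed, planes)))
print("bits differing, distractor:  ", hamming(lsh_bits(original, planes), lsh_bits(distractor, planes)))
```

Each bit only depends on the sign of a dot product, so descriptors pointing in nearly the same direction tend to agree on most bits, while unrelated ones should disagree on about half. That also means the final hash is only as robust as the descriptor the network produces.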

drhead [he/him] · 3 years ago

> It appears that Apple ran this algorithm on each of the images in the known-CSAM database; otherwise it wouldn’t be possible for this system to work.

No, that can't be it. There is no database of images that is publicly available. They only have the hashes, they only give people access to the hashes. If they had such a database there is zero chance they would let anyone access it outside of law enforcement.

They did make their own AI for scanning NSFW pictures as a parental control feature, but this is a separate feature.

I know Microsoft has a similar API that also lumps together an AI check with checking if it matches a known hash from a database, so this type of setup is not unprecedented.

Sphere [he/him, they/them] · 3 years ago

Not publicly available, no, but I imagine the images are kept somewhere, so as to allow them to be hashed with new algorithms as they're developed. Maybe Apple just supplied the algorithm and got back a set of hashes?

Anyway, the excerpt I posted above makes it very clear that the neural net step comes first, and then the result from the neural net is hashed. So there's no way to just use the old hash database for the system they're describing.

drhead [he/him] · 3 years ago

Hmm... okay, now that I read it again I think I understand it more. It sounds like this particular AI setup is for making it able to see past more types of obfuscation than a normal perceptual hash algorithm can. Honestly, given that the normal algorithm in use now is Microsoft's PhotoDNA, it sounds like they made this mostly because they don't want to implement something made by Microsoft if they can avoid it, even at great expense (and yes, you're right that they probably made a new hash DB for their algorithm -- apparently the hash database is less old than I thought, 2015). As far as I can tell from reading this technical summary it sounds like the neural net itself is used to determine what parts of the image are important enough to be factored into the hash, so not necessarily the type of thing people are used to tricking AIs with.

As far as fooling the neural net goes, I don't think there is any real way to test since there isn't even a way for the user to know whether an image was flagged. It sends a voucher with the result of a test off to Apple, but it is impossible for Apple to decrypt these until a certain number are uploaded. Then there's a manual review before any user-visible action happens to the account, so there go any plans to make something innocuous that triggers it. As far as fooling it into a false positive goes, whoever wants to attempt that sure has their work cut out for them. Unless someone can reverse-engineer a neural network to make a hash collision with a database they cannot see the results of a test against, I think the chances of this happening are slim. Assuming that Apple is not outright lying about parts of this (in which case why mention it at all), you'd have to send off several compromised images that look close enough to the real thing that they pass manual review. Then it would get to NCMEC who would check it against their images, and find that they are fake, because they have the real image to compare against. So nothing would really come of this.
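On the "impossible to decrypt until a certain number are uploaded" part: one standard way to get that kind of property is threshold secret sharing. Below is a textbook Shamir sketch, not Apple's actual construction; the field modulus, the threshold of 3, and the function names are all assumptions for illustration. The idea is that each upload would carry one share of a key, and the key only becomes recoverable once enough shares exist.

```python
import random

PRIME = 2**127 - 1  # a Mersenne prime, used here as the field modulus

def split_secret(secret, threshold, n_shares, rng):
    """Split secret into n_shares points on a random polynomial of degree
    threshold - 1 whose constant term is the secret."""
    coeffs = [secret] + [rng.randrange(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, n_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner's rule, highest degree first
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares

def recover_secret(shares):
    """Lagrange-interpolate the polynomial at x = 0 from the given shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+)
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

# Demo only: a seeded RNG keeps the run repeatable. Real code would need a
# cryptographically secure source such as the secrets module.
key = 123456789
shares = split_secret(key, threshold=3, n_shares=5, rng=random.Random(42))

print(recover_secret(shares[:3]) == key)  # any 3 shares recover the key
print(recover_secret(shares[:2]) == key)  # 2 shares give an unrelated value
```

Below the threshold, the shares are consistent with every possible key, so holding a couple of them tells the server nothing.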

But I've already heard half a dozen variants of "what if see see pee makes apple scan phones for pictures of tiananmen square tank man" in less than a week from this news breaking, so I think the brain damage from this is already done to the general population.

Sphere [he/him, they/them] · edit-2 3 years ago

Well given that it happens client-side, the software that does the actual hash has to be loaded onto the phone, right? So the algorithm itself can be retrieved with some reverse engineering, and I doubt nefarious actors would have much trouble getting hold of CSAM images that are likely to be in the database. Which means that it's certainly possible to game the system. That said, it would probably take a state-level actor to actually do it, but that doesn't mean it couldn't happen. You're right that it wouldn't amount to much of anything, though, so I doubt anyone would bother.

Anyway, your point about privacy in the cloud is quite salient, I think. Personally I don't have an Apple phone, nor do I use cloud storage services, and I sure as hell don't have any CSAM, so I'm not really concerned about it. In fact, I was originally making a point similar to yours, and had to backtrack somewhat when I actually looked into how it all works.

Is Apple's New AI Spying On You?