A moderately technical user can check this themselves today with basic local network traffic monitors or packet sniffing utilities, since even heavily compressed audio data will stand out no matter how it’s encrypted, streamed, chunked, etc.
But for the sake of argument let’s assume some really clever software that allows local voice recognition, processing, inconspicuous delayed batch uploads, etc. In addition let’s assume a completely compromised device, rooted/jailbroken with all sandboxing and system integrity disabled. Even then, a phone will struggle to record and process audio impromptu without an immediately noticeable impact on energy and data use.
I’m not saying advertising companies wouldn’t love to collect that much raw data, just that it seems like quite a challenge to do so quietly.
That is glossing over how they process the data and transmit it to the cloud. The assistant wake word for "Hey Google" invokes an audio stream to an off site audio processor in order to handle the query. So that is easy to identify via traffic because it is immediate and large.
The advertising-wake words do not get processed that way. They are limited in scope and are handled by the low power hardware audio processor used for listening for the assistant wake word. The wake word processor is an FPGA or ASIC - specifically because it allows the integration of customizable words to listen for in an extremely low power raw form. When an advertising wake word is identified, it sends an interrupt to the CPU along with an enumerated value of which word was heard. The OS then stores that value and transmits a batch of them to a server at a later time. An entire day's worth of advertising wake word data may be less than 1 kb in size and it is sent along with other information.
Good luck finding that on wireshark.
Hmm, that’s outside my wheelhouse. So you’re saying phone hardware is designed to listen for not just one but multiple predefined or reprogrammable bank of wake words? I hadn’t read about that yet but it sounds more feasible than the constant livestream idea.
The echo had the capacity for multiple wake words IIRC, but I hadn’t heard of that for mobile devices. I’m curious how many of these key words can they fit?
Smartphones by definition are Spyware, at least if you use the OS as is, because in them all aspects are controlled and logged, either by Google on Android or by Apple on iOS. Adding the default apps that cannot be uninstalled on a mobile that is not rooted. As COX alleges, they also use third-party logs and therefore can track and profile the user very well, even without using this technology that they claim to have.
Although they feel authorized by the user's consent to the TOS and PP, the legality depends directly on the legislation of each country. TOS and PP itself, to be a legal contract, must comply in all its points with local legislation to be applicable to the user. For this reason, I think that these practices are very different in the EU from those in the US, where legislation regarding privacy is conspicuous by its absence, that is, that US users should take these COX statements very seriously in their devices, although in the EU they must also be clear that Google and Apple know exactly what they do and where users live, although they are limited from selling this data to third parties.
Basics:
-- READ ALWAYS TOS AND PP
- Review the permissions of each app, leaving only the most essential ones
- Desactivate GPS if not used
- Review in Android every app with Exodus Privacy, maybe Lookout or MyCyberHome in iOS (Freemium apps !!!)
- Use as less possible apps from the store
- Be aware of discount apps from the Supermarket or Malls
- Don't store important data in the Phone (Banking, Medical...)
Agreed, though I think it’s possible to use smart devices safely. For Android it can be difficult outside custom roms. The OEM flavors tend to have spyware baked in that takes time and root to fully undo, and even then I’m never sure I got it all. These are the most common phones, however, especially in economy price brackets, which is why I’d agree that for the average user most phones are spyware.
Flashing is not useful advice to most. “Just root it bro” doesn’t help your nontechnical relatives who can’t stop downloading toolbars and VPN installers. But with OEM variants undermining privacy at the system level, it feels like a losing battle.
I’d give credit to Apple for their privacy enablement, especially with E2EE, device lockdown, granular access permission control and audits. Unfortunately their devices are not as affordable and I'm not sure how to advise the average Android user beyond general opt-out vigilance.
https://arstechnica.com/tech-policy/2023/12/apple-admits-to-secretly-giving-governments-push-notification-data/
Yeah those push token systems need an overhaul. IIRC tokens are specific to app-device combinations, so invalidation that isn’t automatic should be push-button revocation. Users should have control of it like any other API on their device, if only to get apps to stop spamming coupons or whatever.
It’s funny though: when I first saw those headlines, my first reaction was that it was a positive sign, since this was apparently news worthy even though the magnitude of impact for this sort of systemic breach is demonstrably low. (In particular, it pertains to (1) incidental high-noise data (2) associated with devices and (3) available only by request to (4) governments, who are weak compared to even the smallest data brokers WRT capacity for data mining inference and redistribution, to put it mildly.)
Regardless, those systems need attention. Hopefully in an upcoming release.
What if the processing is don't locally and the only thing they send back home is keywords for marketable products?
Yeah they’d have to it seems, but real time transcription isn’t free. Even late model devices with better inference hardware have limited battery and energy monitoring. I imagine it’d be hard to conceal that behavior especially for an app recording in the background.
WetBeardHairs@lemmy.ml mentioned that mobile devices use the same hardware coprocessing used for wake word behavior to target specific key phrases. I don’t know anything about that, but it’s one way they could work around the technical limitations.
Of course, that’s a relatively bespoke hardware solution that might also be difficult to fully conceal, and it would come with its own limitations. Like in that case, there’s a preset list of high value key words that you can tally, in order to send company servers a small “score card” rather than a heavy audio clip. But the data would be far less rich than what people usually think of with these flashy headlines (your private conversations, your bowel movements, your penchant for musical theater, whatever).
I agree.
What could be possible, would be maybe send tiny bits. For example, a device could categorize some places or times, detect out of pattern behaviours and just record a couple of seconds here and there, then send it to the server when requesting something else to avoid being suspicious. Or just pretend it's a "false positive" or whatever and say "sorry, I didn't get that."
I don't think they're listening to everything, but they could technically get something if they wanted to target you.
Right, I suppose cybersecurity isn’t so different than physical security in that way. Someone who really wants to get to you always can (read: why there are so many burner phones at def con).
But for the average person, who uses consumer grade deadbolts in their home and doesn’t hire a private detail when they travel, does an iPhone fit within their acceptable risk threshold? Probably.
Of course they do. It's just they're no longer afraid of telling us they are
Very surprised by all the advertising amd data broker company boot lickers there are ITT