As in all the data that's gathered from users with tracking software and the like? It's not deleted because it has value: advertising firms promise to extract something useful from it (we all know that recommendation algorithms suck anyway, but the promise is enough to get funding). The impact of storing the data is pretty small, though. As other people mentioned, the main issue is the compute needed when it's time to train new models that use that data.
I use "junk data" to mean data or any information that isn't in use anymore and can be safely retired.