Permanently Deleted

    • crime [she/her, any]
      2 years ago

      It's not operating normally. It's perceived to be operating normally, but it's a ticking time bomb for catastrophic failure, and they laid off all the firefighters.

      It's also absolutely hemorrhaging money: advertisers are fleeing the platform, and a lot of standing business deals have been cancelled because there's no one left to arrange them. It's not taking down videos with DMCA claims because no one is left to deal with them, which is going to get it into legal trouble. The government is mad bc they fired the CIA regime change team. Capital might not care about long-term technical stability, but it does care about short-term profits, intellectual property, and the ability to enforce its hegemony.

      In no way will this be looked at as a "success".

        • crime [she/her, any]
          2 years ago

          I think the next major outage will be scrutinized heavily for sure, and I have a hunch it will come in the next couple of months. Globally watched events, where traffic spikes within seconds of something happening, tend to put huge strain on very complex systems like most modern social media sites.

          Teams of people typically spend weeks doing capacity planning in anticipation of events like the World Cup or New Year's. The World Cup specifically is a notorious SRE nightmare for big social media sites: you'll have people from all over the world posting at the exact same time (like, to the second/minute) when exciting things like goals happen, and they'll be posting photos and video clips and excitedly spamming a million tweets.

          By far the most common cause of major site outages is increased load. You've got a complex system with a million tiny gears, and if one gets overwhelmed and starts to slow down, or a disk or something fills up, or too many things are connected to a database, or whatever, then the whole thing catches on fire in spectacular ways.
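
          To make the "one gear gets overwhelmed" part concrete, here's a toy sketch of a single bottleneck with a fixed capacity. All the numbers and names are made up for illustration; real systems have many interacting queues, timeouts, and retries that this ignores.

```python
def simulate_backlog(arrivals_per_tick, capacity_per_tick):
    """Return the backlog after each tick for one bottleneck component."""
    backlog = 0
    history = []
    for arriving in arrivals_per_tick:
        # Work that can't be served this tick carries over to the next one.
        backlog = max(0, backlog + arriving - capacity_per_tick)
        history.append(backlog)
    return history


# Hypothetical traffic: steady 80 requests/tick, a 5-tick burst of 300, then steady again.
steady = [80] * 5
burst = [300] * 5
traffic = steady + burst + steady

history = simulate_backlog(traffic, capacity_per_tick=100)
print(history)
```

          With these made-up numbers the burst only lasts 5 ticks, but the backlog peaks at 1000 and is still at 900 five ticks after traffic returns to normal, because there are only 20 requests/tick of spare capacity to drain it. That lingering backlog is where the slowdowns and knock-on failures come from.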

          Having everyone in the world tweeting about an incredible save, or a shitty call, or a ridiculous goal all at the same time means the traffic is super concentrated and super high. A lot of work has to go into preparing to keep things online and actively putting out fires while the events are ongoing. I've heard engineers from Instagram talk about how they always have a miserable New Year's bc a lot of things always break with the increased load. And it's just not possible or practical to anticipate every potential failure mode.
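
          Rough arithmetic on why concentration matters so much, with an invented post count (not real traffic data): the same volume squeezed into one minute instead of one hour needs 60 times the peak capacity.

```python
def required_rate(total_posts, window_seconds):
    """Average posts per second needed to absorb total_posts within the window."""
    return total_posts / window_seconds


posts = 600_000  # hypothetical volume, for illustration only

spread_over_hour = required_rate(posts, 3600)    # posts trickle in over an hour
packed_into_minute = required_rate(posts, 60)    # everyone posts right after a goal

print(f"spread over an hour:  {spread_over_hour:.0f} posts/s")
print(f"packed into a minute: {packed_into_minute:.0f} posts/s")
print(f"peak multiplier:      {packed_into_minute / spread_over_hour:.0f}x")
```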

          Long ramble, but I'd expect it to be a slow collapse until it isn't. It can't stay online with a skeleton crew forever.