I'm not a lawyer, but I know tech companies run social media platforms to create data models about users for ad platforms. It seems to me that they could attempt to integrate themselves into a fediverse network and still harvest data, and not even provide services. So perhaps a software license could require that content posted to the platform by users is by default licensed under CC-BY-SA-NC or something that would prevent this.
I know we're not really talking about Git, but since you mentioned a federated GitHub clone of sorts, Radicle is pretty neat and a cool thing to check out!