Well, the big issue here is that we sort of don’t have the power you think we do.
What I mean is, say you have 10 servers. 7 are Lemmy, 3 are kbin. Great, each admin has control over those servers. Then you have Meta. They’ll run 1 huge server. When the 10 other servers enable Federation, Meta now has 10 servers of content that isn’t even on their own platform that they can sell. Your data will literally exist on the Meta server because your data is not contained within your instance/platform once it’s Federated. Meta can then harvest the entire Fediverse for data like this. It’s like an absolute wet dream for them. They don’t even have to coax people to use their own platform!
Meta must be defederated the second they so much as dip a toe into the Fediverse or everything you’ve ever done, or do, on any ActivityPub platform will be scooped up and sold.
Edit: And it’s even worse because all it takes is 1 server to Federate with Meta. If server A is Federated with your server B, Meta can still pull your data from server A, which they Federated with, even if your local server B has Defederated from Meta. This is a huge problem.
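To make the server A / server B scenario concrete, here is a toy model. It assumes a simplified delivery rule: a post made in a community goes to the instance hosting that community, which then forwards it to every instance it delivers to. Real ActivityPub software is more selective than this, and the function name, instance names, and the `delivers_to` map are all made up for illustration.

```python
def instances_holding_post(author_instance, community_host, delivers_to):
    """Return the set of instances that end up with a copy of the post.

    delivers_to maps each instance to the set of peers it sends content to.
    This is a deliberately simplified model, not how any real server decides.
    """
    holders = {author_instance}
    reaches_host = (
        community_host == author_instance
        or community_host in delivers_to.get(author_instance, set())
    )
    if reaches_host:
        holders.add(community_host)
        # The community host forwards the post to all of its own peers.
        holders |= delivers_to.get(community_host, set())
    return holders


# Hypothetical setup: server-b has defederated from meta, server-a has not.
delivers_to = {
    "server-a": {"server-b", "meta"},
    "server-b": {"server-a"},
}

holders = instances_holding_post("server-b", "server-a", delivers_to)
print(sorted(holders))
```

Under those assumptions, a post written on server-b in a community hosted on server-a still lands on meta, even though server-b never sends anything to meta directly.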
Right… But…
ActivityPub is not a protected, encrypted protocol. Everything anyone says on any service using ActivityPub can already be intercepted and harvested by anyone, even blocked instances. Defederation is enforced in software only. For example, if someone wanted to, they could simply request https://mastodon.social/tags/fediverse.rss and there you go, instant access to data from the Fediverse. You can query any Mastodon server for any hashtag you like. That’s just one of many endpoints that will spit out Fediverse content.
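As a rough sketch of how little effort this takes, the snippet below builds a hashtag feed URL in the same form as the one quoted above and pulls the items out of an RSS document using only the Python standard library. The helper names and the sample feed are invented for illustration, and the sample is parsed locally so nothing here touches a real server.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote


def hashtag_feed_url(instance, tag):
    """Build a hashtag RSS URL following the pattern quoted in the comment."""
    return f"https://{instance}/tags/{quote(tag)}.rss"


def parse_feed_items(rss_text):
    """Extract the link and description of each <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        })
    return items


# Made-up sample feed; a real feed would be fetched from hashtag_feed_url(...).
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>#fediverse</title>
    <item>
      <link>https://example.social/@alice/1</link>
      <description>Hello #fediverse</description>
    </item>
    <item>
      <link>https://example.social/@bob/2</link>
      <description>Another public post</description>
    </item>
  </channel>
</rss>"""

url = hashtag_feed_url("mastodon.social", "fediverse")
posts = parse_feed_items(SAMPLE_RSS)
print(url)
for post in posts:
    print(post["link"], "-", post["description"])
```

No account, no federation handshake, and no API key are involved in reading a public feed like this, which is the point being made: defederation does not hide public posts.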
What I’m taking issue with is essentially the same thing that is getting Reddit into hot water. Spez is acting like all the content on Reddit is exclusively his. And legally, it probably is, since it exists on his servers. Now if you extrapolate that out to Meta on ActivityPub, any instance that federates with them immediately puts all of your content directly onto Meta’s servers. Once it’s in their possession, it’s legally theirs to do with as they please. If they want to pull a Facebook or Reddit, using your data, they can, with no way for you to opt out. Sure, nothing is stopping people from doing it already, but Meta does not have your best interest in mind. Ever. They’ve shown it again and again. So I think people are preemptively wanting to cut off this spigot of user data to Meta because their abuse of it is a matter of when, not if. Any other company might deserve the benefit of the doubt, but Meta? We know who they are already.
Also, as I said elsewhere, Meta could already use a bot to scrape Lemmy instances, but you can’t sell a bot to investors. You can sell a platform. Meta will build a slick platform to sell to investors and sit back while federation fills up their instance with data, which they’ll turn around and sell the same way they do on Facebook. And the insidious part of it is that they’ll take your data even though you didn’t use their platform. Right now I can decide not to be data mined by Meta simply by not using Facebook. To do that here, if instances start federating your data onto Meta servers, you’d have to not use ActivityPub at all. Either that or the fediverse fractures into Meta and not-Meta, which also sucks.
This is really a lot more than simply setting up an RSS feed.
I completely agree with the overall point you’re making, but would like to correct the legal aspects. I am not a lawyer, but I do have a pretty good understanding of US copyright law, which is the most relevant in this case.
Having possession of data isn’t sufficient to legally establish the rights to do as a company pleases. In general, an individual author immediately has copyright on a creative work as soon as it’s recorded in any medium. The main exception to this is “work for hire” — a legal agreement that employers hold copyrights since they’re paying for the work. It’s usually part of the paperwork an established company has you sign when you start a job.
Because of this, and because we users aren’t employees of Reddit, they need a license to duplicate and display our copyrighted posts. The terms of service for any online service almost always stipulate a “worldwide, non-exclusive, perpetual license”. In other words: you still own the copyright to your post and can still share it elsewhere, but by sending it to Reddit, they get to put it anywhere they want and you can’t ever take that right away from them.
If Meta begins slurping up data from the Fediverse, things get tricky. They’re probably violating copyright law if they do that, just as ChatGPT, Google Bard, etc. likely have. However, legal enforcement of our rights would be near-impossible. Everyone who has ever had an account with any of Meta’s properties has most likely agreed to a binding arbitration provision. (These are utterly immoral: they force you, as a precondition of doing business, to preemptively waive your legal rights before anything occurs that would cause you to need them.) These provisions also prohibit any sort of class action, so each individual person would have to initiate their own case against Meta. And then you’d have to somehow prove to an arbitrator from an organization selected by and paid by Meta that Meta violated your copyright. And Meta’s high-priced lawyers will have all kinds of ways of referencing prior cases to argue why what they did is fine.
So yeah. But again, I completely agree with your main point. Meta will (if they haven’t already) collect all the data they please from the Fediverse and use it to further their business interests. And those business interests are not aligned with our best interests.
I’m confused about what kind of data you want to protect. If you mean your posts and comments, they are already publicly available on the Internet. Meta doesn’t need to make an ActivityPub app that gets federated with Lemmy (or kbin) to aggregate and sell this data.
Is there another kind of data that is visible only to server administrators?
Edit: Been corrected, the following is NOT how it works! Original Text follows
Someone correct me if I’m getting details wrong, but from reading this post it appears as if fediverse admins are provided both the username and email accounts registered by those users that have visited their instances.
If that’s true, one problematic scenario I can imagine is when someone has registered on the fediverse with a pseudonym, but has an e-mail address they also use on their real-life Facebook profile. Visiting a Facebook-run ActivityPub instance while logged in would give Facebook enough data to link both the pseudonymous account (with past and future post history), and the real-life Facebook profile.
So, even if you’re not signed up for Facebook’s version of ActivityPub, engaging with it could still be giving Facebook a source of ongoing data for building personal profiles and targeted advertisement that people would not provide on their own.