do not fetch if :limit_to_local_content is :all or :unauthenticated #582

Merged
floatingghost merged 1 commit from beerriot/akkoma:develop-no-fetch-with-local-limit into develop 2025-01-05 15:39:14 +00:00
Contributor

If limit_to_local_content is set to unauthenticated, an unauthenticated search for keywords only returns posts that were made locally, but searching for a post URL will cause the fetcher to go grab that post from whatever server and display it in search results (as long as it's not blocked for another reason). This seemed wrong to me - kind of a loophole - so I changed the maybe_fetch function to check on limit_to_local_content first, and not fetch if it is set to unauthenticated (or to all).

I haven't added a test. It seems like it will require either simulating/mocking the fetcher, or an external server to respond to the fetcher. If there's another test demonstrating how to do that easily, please feel free to point me toward it.

If `limit_to_local_content` is set to `unauthenticated`, an unauthenticated search for keywords only returns posts that were made locally, but searching for a post URL will cause the fetcher to go grab that post from whatever server and display it in search results (as long as it's not blocked for another reason). This seemed wrong to me - kind of a loophole - so I changed the `maybe_fetch` function to check on `limit_to_local_content` first, and not fetch if it is set to `unauthenticated` (or to `all`). I haven't added a test. It seems like it will require either simulating/mocking the fetcher, or an external server to respond to the fetcher. If there's another test demonstrating how to do that easily, please feel free to point me toward it.
beerriot added 1 commit 2023-07-07 17:38:05 +00:00
do not fetch if limit_to_local_content is enabled
Some checks are pending
ci/woodpecker/pr/woodpecker Pipeline is pending
ci/woodpecker/pull_request_closed/woodpecker Pipeline is pending approval
97037c0b53
Prior to this change, anyone, authenticated or not, could submit a search
query for an activity by URL, and cause the fetcher to go fetch it. That
shouldn't happen if `limit_to_local_content` is set to `:all` or if it's
set to `:unauthenticated` and the query came from an unauthenticated
source.

hm, this would worry me a little - i get what you're going for, but that search bar is about the only way you have of asking the system to fetch a remote post; it might become extremely difficult to bootstrap new instances that turn this one

hm, this would worry me a little - i get what you're going for, but that search bar is about the only way you have of asking the system to fetch a remote post; it might become extremely difficult to bootstrap new instances that turn this one
Author
Contributor

Thanks for having a look, @floatingghost. I guess the only thing I'd say is that this patch does not change anything for logged-in users on an instance using the default config (limit_to_local_content = :unauthenticated), as far as I can tell. I think that means that the users on a new instance shouldn't have trouble bootstrapping it, because they can fetch all the remote posts via search that they like. But I'm also pretty new to Akkoma, so maybe I've missed another bootstrapping process?

Thanks for having a look, @floatingghost. I guess the only thing I'd say is that this patch does not change anything for logged-in users on an instance using the default config (`limit_to_local_content = :unauthenticated`), as far as I can tell. I think that means that the users on a new instance shouldn't have trouble bootstrapping it, because they can fetch all the remote posts via search that they like. But I'm also pretty new to Akkoma, so maybe I've missed another bootstrapping process?
First-time contributor

hm, this would worry me a little - i get what you're going for, but that search bar is about the only way you have of asking the system to fetch a remote post; it might become extremely difficult to bootstrap new instances that turn this one

what the issue reporter wants is for search to be disabled for unauthenticated users; it is what I want and how I ended up here in the first place. malicious entities being able to pull in arbitrary content to an instance as part of a DOS attack or to get a given instance to serve up content that may introduce liabilities is a huge vulnerability that frankly I'm surprised is still so prevalent in fediverse server software. i'm the only user on my instance, i don't want anyone else pullling in bullshit or garbage.

of course this needs to work out of the box, but you should be able to turn it off for logged-out users and frankly, at your option for all users or non-admin ones as well.

> hm, this would worry me a little - i get what you're going for, but that search bar is about the only way you have of asking the system to fetch a remote post; it might become extremely difficult to bootstrap new instances that turn this one what the issue reporter wants is for search to be disabled for unauthenticated users; it is what I want and how I ended up here in the first place. malicious entities being able to pull in arbitrary content to an instance as part of a DOS attack or to get a given instance to serve up content that may introduce liabilities is a huge vulnerability that frankly I'm surprised is still so prevalent in fediverse server software. i'm the only user on my instance, i don't want anyone else pullling in bullshit or garbage. of course this needs to work out of the box, but you should be able to turn it off for logged-out users and frankly, at your option for all users or non-admin ones as well.
Member

I suspect a misunderstanding occurred when originally discussing this patch. Although the setting is called limit_to_local_content without any further qualifiers, by default it only applies to unauthenticated users. Thus any logged-in user can still ask the system to fetch remote content for e.g. bootstrapping just fine. We merely no longer first fetch remote data just to then immediately hide it, i.e. pointlessly wasting time and other resources.

Note, the current patch here is incomplete; the same check ought to be applied to user search implemented in another module.

at your option for all users

If you don't want remote content at all you already can and will have to disable federation or limit it to an allow list of trusted/relevant servers. Otherwise any other server can still send you arbitrary data which will get processed, by necessity including resolving any further references it contains, and added to your database. See the “instance” and “MRF Policy (simple)” sections in the config cheatsheet

I suspect a misunderstanding occurred when originally discussing this patch. Although the setting is called `limit_to_local_content` without any further qualifiers, by default it only applies to _unauthenticated_ users. Thus any logged-in user can still ask the system to fetch remote content for e.g. bootstrapping just fine. We merely no longer first fetch remote data just to then immediately hide it, i.e. pointlessly wasting time and other resources. Note, the current patch here is incomplete; the same check ought to be applied to _user_ search implemented in another module. > at your option for all users If you don't want remote content at all you already can and will have to disable federation or limit it to an allow list of trusted/relevant servers. Otherwise any other server can still send you arbitrary data which will get processed, by necessity including resolving any further references it contains, and added to your database. See the “instance” and “MRF Policy (simple)” sections in the [config cheatsheet](https://docs.akkoma.dev/stable/configuration/cheatsheet/#federation)
Author
Contributor

Thanks for poking this, @anna. My explicit goal was to make it so that search only ever returned local results for unauthenticated users. It does this already for keyword search. This was fixing it for URL search. Partly I was fixing this for consistency, but I also had in mind that it was a way that someone might be able to evade someone else's block. Your points about introduction of liabilities is a great one too.

Although the setting is called limit_to_local_content without any further qualifiers, by default it only applies to unauthenticated users.

Yes, that is exactly the point we are both making. It is unauthenticated users that we are concerned about.

FWIW, I've been rebasing this patch onto every release, without any conflicts. It's still working for me.

Thanks for poking this, @anna. My explicit goal was to make it so that search only ever returned local results for unauthenticated users. It does this already for keyword search. This was fixing it for URL search. Partly I was fixing this for consistency, but I also had in mind that it was a way that someone might be able to evade someone else's block. Your points about introduction of liabilities is a great one too. > Although the setting is called limit_to_local_content without any further qualifiers, by default it only applies to unauthenticated users. Yes, that is exactly the point we are both making. It is unauthenticated users that we are concerned about. FWIW, I've been rebasing this patch onto every release, without any conflicts. It's still working for me.
Author
Contributor

Note, the current patch here is incomplete; the same check ought to be applied to user search implemented in another module.

I also think this might be incorrect. I just tried searching for a remote user as an unauthenticated user on my server, and I got no results. The response was immediate as well. I then went and searched for that user as myself on my server, and watched the loading circle twirl for a couple seconds before that user was displayed.

You didn't mention which other module should be addressed for user search, but looking at https://akkoma.dev/AkkomaGang/akkoma/src/branch/develop/lib/pleroma/user/search.ex#L202-L220 it seems like maybe_resolve/3 is doing the same thing as my change here. When :limit_to_local_content is set to :unauthenticated, and there is no user object associated with the request, resolve is a :noop.

> Note, the current patch here is incomplete; the same check ought to be applied to user search implemented in another module. I also think this might be incorrect. I just tried searching for a remote user as an unauthenticated user on my server, and I got no results. The response was immediate as well. I then went and searched for that user as myself on my server, and watched the loading circle twirl for a couple seconds before that user was displayed. You didn't mention which other module should be addressed for user search, but looking at https://akkoma.dev/AkkomaGang/akkoma/src/branch/develop/lib/pleroma/user/search.ex#L202-L220 it seems like `maybe_resolve/3` is doing the same thing as my change here. When `:limit_to_local_content` is set to `:unauthenticated`, and there is no user object associated with the request, resolve is a `:noop`.
Member

it seems like maybe_resolve/3 is doing the same thing as my change here. When :limit_to_local_content is set to :unauthenticated, and there is no user object associated with the request, resolve is a :noop.

neat, no need for changes there then (although it might be nice to refactor this to share the restriction decision like in this patch instead of copy-pasting the case)

Your points about introduction of liabilities is a great one too.

I’ll reiterate not-fetching content for unauthenticated searches does not provide any security; there are many other ways to cause a server to fetch objects which cannot be sealed off without breaking federation.
It does however avoid unnecessary work when we’ll just filter out fetched content afterwards anyway and apparently improves consistency, which are a good enough reason by themselves. Afaict all is good; thanks for the patch ^^

> it seems like maybe_resolve/3 is doing the same thing as my change here. When :limit_to_local_content is set to :unauthenticated, and there is no user object associated with the request, resolve is a :noop. neat, no need for changes there then *(although it might be nice to refactor this to share the restriction decision like in this patch instead of copy-pasting the `case`)* > Your points about introduction of liabilities is a great one too. I’ll reiterate not-fetching content for unauthenticated searches does not provide any security; there are many other ways to cause a server to fetch objects which cannot be sealed off without breaking federation. It _does_ however avoid unnecessary work when we’ll just filter out fetched content afterwards anyway _and_ apparently improves consistency, which are a good enough reason by themselves. Afaict all is good; thanks for the patch ^^
Oneric approved these changes 2024-10-08 02:21:40 +00:00
Author
Contributor

I just saw news of the 3.13.3 release, but I also see that this PR didn't make it. @Oneric approved it, but my view says I'm not authorized to merge. Is there something more you need from me here, or are we waiting for someone else's attention?

I just saw news of the 3.13.3 release, but I also see that this PR didn't make it. @Oneric approved it, but my view says I'm not authorized to merge. Is there something more you need from me here, or are we waiting for someone else's attention?

3.13.3 was not a full release per se but more a mandatory fix to stop hammering misskey

3.13.3 was not a full release per se but more a mandatory fix to stop hammering misskey
floatingghost merged commit 7c095a6b70 into develop 2025-01-05 15:39:14 +00:00
floatingghost deleted branch develop-no-fetch-with-local-limit 2025-01-05 15:39:14 +00:00
Sign in to join this conversation.
No description provided.