TwitterCard meta tags are supposed to use the attributes "name" and "content".
OpenGraph tags use the attributes "property" and "content".
Twitter itself is smart enough to detect broken meta tags and discover the TwitterCard
using "property" and "content", but other platforms that only implement parsing of TwitterCards
and not OpenGraph may fail to correctly detect the tags as they're under the wrong attributes.
> "Open Graph protocol also specifies the use of property and content attributes for markup while
> Twitter cards use name and content. Twitter’s parser will fall back to using property and content,
> so there is no need to modify existing Open Graph protocol markup if it already exists." [0]
[0] https://developer.twitter.com/en/docs/twitter-for-websites/cards/guides/getting-started
`context` fields for objects and activities can now be generated based
on the object/activity `inReplyTo` field or its ActivityPub ID, as a
fallback method in cases where `context` fields are missing for incoming
activities and objects.
When doing prune_objects, it's possible that bookmarked objects are deleted.
This gave problems when fetching the bookmark TL.
Here we clean up the bookmarks during pruning in the case were it's possible that bookmarked objects are deleted.
weird values in href will cause base64 encoding to fail later down the
line, so let's make sure the value we're passing on is somewhat sane, or
at the very least a binary
this fixes#482
When no image description is filled in, Pleroma allowed fallbacks.
Those were (based on a setting) either the filename, or a fixed description.
Neither are good options for image descriptions imo, so here we remove this.
Note that there's two tests removed who supposedly tested something else.
But examining closer, they didn't seem to test what they claimed to test,
so I removed them rather than try to "fix" them.
E.g. Flag activities have an array of objects
We prune the activity when NONE of the objects can be found
Note that the cost of finding and deleting these is ~4x higher than finding and deleting the non-array ones
Only string:
Delete on activities (cost=506573.48..506580.38 rows=0 width=0)
Only Array:
Delete on activities (cost=3570359.68..4276365.34 rows=0 width=0)
(They are still executed separately, so the total cost is the sum of the two)
We add an option to also prune remote activities who don't have existing objects any more they reference.
Rn, we only check for activities who only reference one object, not an array or embeded object.
This adds an option to the prune_objects mix task.
The original way deleted all non-local public posts older than a certain time frame.
Here we add a different query which you can call using the option --keep-threads.
We query from the activities table all context id's where
1. the newest activity with this context is still old
2. none of the activities with this context is is local
3. none of the activities with this context is bookmarked
and delete all objects with these contexts.
The idea is that posts with local activities (posts, replies, likes, repeats...) may be interesting to keep.
Besides that, a post lives in a certain context (the thread), so we keep the whole thread as well.
Caveats:
* ~~Quotes have a different context. Therefore, when someone quotes a post, it's possible the quoted post will still be deleted.~~ fixed in #379
* Although undocumented (in docs/docs/administration/CLI_tasks/database.md/#prune-old-remote-posts-from-the-database), the 'normal' delete action still kept old remote non-public posts. I added an option to keep this behaviour, but this also means that you now have to explicitly provide that option. **This could be considered a breaking change!**
* ~~Note that this removes from the objects table, but not from the activities.~~ See #427 for that.
Some statistics from explain analyse:
(cost=1402845.92..1933782.00 rows=3810907 width=62) (actual time=2562455.486..2562455.495 rows=0 loops=1)
Planning Time: 505.327 ms
Trigger for constraint chat_message_references_object_id_fkey: time=651939.797 calls=921740
Trigger for constraint deliveries_object_id_fkey: time=52036.009 calls=921740
Trigger for constraint hashtags_objects_object_id_fkey: time=20665.778 calls=921740
Execution Time: 3287933.902 ms
***
**TODO**
1. [x] **Question:** Is it OK to keep it like this in regard to quote posts? If not (ie post quoted by local users should also be kept), should we give quotes the same context as the post they are quoting? (If we don't want to give them the same context, I'll have to see how/if I can do it without being too costly)
* See #379
2. [x] **Question:** the "original" query only deletes public posts (this is undocumented, but you can check the code). This new one doesn't care for scope. From the docs I get that the idea is that posts can be refetched when needed. But I have from a trusted source that Pleroma can't refetch non-public posts. I assume that's the reason why they are kept here. I see different options to deal with this
1. ~~We keep it as currently implemented and just don't care about scope with this option~~
2. ~~We add logic to not delete non-public posts either (I'll have to see how costly that becomes)~~
3. We add an extra --keep-non-public parameter. This is technically speaking breakage (you didn't have to provide a param before for this, now you do), but I'm inclined to not care much because it wasn't documented nor tested in the first place.
3. [x] See if we can do the query using Elixir
4. [x] Test on a bigger DB to see that we don't run into a timeout
5. [x] Add docs
Co-authored-by: ilja <git@ilja.space>
Reviewed-on: #350
Co-authored-by: ilja <akkoma.dev@ilja.space>
Co-committed-by: ilja <akkoma.dev@ilja.space>
Some users post posts with spoofed timestamp, and some clients will have issues with certain dates. Tusky for example crashes if the date is any sooner than 1 BCE (“year zero” in the representation).
I limited the range of what is considered a valid date to be somewhere between the years 1583 and 9999 (inclusive).
The numbers have been chosen because:
- ISO 8601 only allows years before 1583 with “mutual agreement”
- Years after 9999 could cause issues with certain clients as well
Co-authored-by: Charlotte 🦝 Delenk <lotte@chir.rs>
Reviewed-on: #425
Co-authored-by: darkkirb <lotte@chir.rs>
Co-committed-by: darkkirb <lotte@chir.rs>
See #350 (comment)
When making quotes through Mast-API, they will now have the same context as the quoted post. This also results in them being showed when fetching the thread. I checked Misskey to see how it's there, and they show the quotes there as well, see e.g. <https://mk.toast.cafe/notes/98u1g0tulg>.
An example from Akkoma:
Co-authored-by: ilja <git@ilja.space>
Reviewed-on: #379
Reviewed-by: floatingghost <hannah@coffee-and-dreams.uk>
Co-authored-by: ilja <akkoma.dev@ilja.space>
Co-committed-by: ilja <akkoma.dev@ilja.space>
Argos Translate is a Python module for translation and can be used as a command line tool.
This is also the engine for LibreTranslate, for which we already have a module.
Here we can use the engine directly from our server without doing requests to a third party or having to install our own LibreTranslate webservice (obviously you do have to install Argos Translate).
One thing that's currently still missing from Argos Translate is auto-detection of languages (see <https://github.com/argosopentech/argos-translate/issues/9>). For now, when no source language is provided, we just return the text unchanged, supposedly translated from the target language. That way you get a near immediate response in pleroma-fe when clicking Translate, after which you can select the source language from a dropdown.
Argos Translate also doesn't seem to handle html very well. Therefore we give admins the option to strip the html before translating. I made this an option because I'm unsure if/how this will change in the future.
Co-authored-by: ilja <git@ilja.space>
Reviewed-on: #351
Co-authored-by: ilja <akkoma.dev@ilja.space>
Co-committed-by: ilja <akkoma.dev@ilja.space>
this didn't actually _do_ anything in the past,
the users would be prevented from accessing the resource,
but they shouldn't be able to even create them
Until now it was returning a 500 because the upload plug were going
through the changeset and ending in the JSON encoder, which raised
because struct has to @derive the encoder.
Objects who got updated would just pass through several of the MRF policies, undoing moderation in some situations.
In the relevant cases we now check not only for Create activities, but also Update activities.
I checked which ones checked explicitly on type Create using `grep '"type" => "Create"' lib/pleroma/web/activity_pub/mrf/*`.
The following from that list have not been changed:
* lib/pleroma/web/activity_pub/mrf/follow_bot_policy.ex
* Not relevant for moderation
* lib/pleroma/web/activity_pub/mrf/keyword_policy.ex
* Already had a test for Update
* lib/pleroma/web/activity_pub/mrf/object_age_policy.ex
* In practice only relevant when fetching old objects (e.g. through Like or Announce). These are always wrapped in a Create.
* lib/pleroma/web/activity_pub/mrf/reject_non_public.ex
* We don't allow changing scope with Update, so not relevant here
Objects who got updated would just pass the TagPolicy, undoing the moderation that was set in place for the Actor.
Now we check not only for Create activities, but also Update activities.