Inline URL preview documentation near the implementation.tags/v1.64.0rc1
@@ -1,2 +1 @@ | |||
Add a link to configuration instructions in the URL preview documentation. | |||
Move the documentation for how URL previews work to the URL preview module. |
@@ -0,0 +1 @@ | |||
Move the documentation for how URL previews work to the URL preview module. |
@@ -35,7 +35,6 @@ | |||
- [Application Services](application_services.md) | |||
- [Server Notices](server_notices.md) | |||
- [Consent Tracking](consent_tracking.md) | |||
- [URL Previews](development/url_previews.md) | |||
- [User Directory](user_directory.md) | |||
- [Message Retention Policies](message_retention_policies.md) | |||
- [Pluggable Modules](modules/index.md) | |||
@@ -544,7 +544,7 @@ Gets a list of all local media that a specific `user_id` has created. | |||
These are media that the user has uploaded themselves | |||
([local media](../media_repository.md#local-media)), as well as | |||
[URL preview images](../media_repository.md#url-previews) requested by the user if the | |||
[feature is enabled](../development/url_previews.md). | |||
[feature is enabled](../usage/configuration/config_documentation.md#url_preview_enabled). | |||
By default, the response is ordered by descending creation date and ascending media ID. | |||
The newest media is on top. You can change the order with parameters | |||
@@ -1,62 +0,0 @@ | |||
URL Previews | |||
============ | |||
For information on how to enable URL previews in synapse, please see the [config manual](../usage/configuration/config_documentation.md#url_preview_enabled). | |||
The `GET /_matrix/media/r0/preview_url` endpoint provides a generic preview API | |||
for URLs which outputs [Open Graph](https://ogp.me/) responses (with some Matrix | |||
specific additions). | |||
This does have trade-offs compared to other designs: | |||
* Pros: | |||
* Simple and flexible; can be used by any clients at any point | |||
* Cons: | |||
* If each homeserver provides one of these independently, all the HSes in a | |||
room may needlessly DoS the target URI | |||
* The URL metadata must be stored somewhere, rather than just using Matrix | |||
itself to store the media. | |||
* Matrix cannot be used to distribute the metadata between homeservers. | |||
When Synapse is asked to preview a URL it does the following: | |||
1. Checks against a URL blacklist (defined as `url_preview_url_blacklist` in the | |||
config). | |||
2. Checks the in-memory cache by URLs and returns the result if it exists. (This | |||
is also used to de-duplicate processing of multiple in-flight requests at once.) | |||
3. Kicks off a background process to generate a preview: | |||
1. Checks the database cache by URL and timestamp and returns the result if it | |||
has not expired and was successful (a 2xx return code). | |||
2. Checks if the URL matches an [oEmbed](https://oembed.com/) pattern. If it | |||
does, update the URL to download. | |||
3. Downloads the URL and stores it into a file via the media storage provider | |||
and saves the local media metadata. | |||
4. If the media is an image: | |||
1. Generates thumbnails. | |||
2. Generates an Open Graph response based on image properties. | |||
5. If the media is HTML: | |||
1. Decodes the HTML via the stored file. | |||
2. Generates an Open Graph response from the HTML. | |||
3. If a JSON oEmbed URL was found in the HTML via autodiscovery: | |||
1. Downloads the URL and stores it into a file via the media storage provider | |||
and saves the local media metadata. | |||
2. Convert the oEmbed response to an Open Graph response. | |||
3. Override any Open Graph data from the HTML with data from oEmbed. | |||
4. If an image exists in the Open Graph response: | |||
1. Downloads the URL and stores it into a file via the media storage | |||
provider and saves the local media metadata. | |||
2. Generates thumbnails. | |||
3. Updates the Open Graph response based on image properties. | |||
6. If the media is JSON and an oEmbed URL was found: | |||
1. Convert the oEmbed response to an Open Graph response. | |||
2. If a thumbnail or image is in the oEmbed response: | |||
1. Downloads the URL and stores it into a file via the media storage | |||
provider and saves the local media metadata. | |||
2. Generates thumbnails. | |||
3. Updates the Open Graph response based on image properties. | |||
7. Stores the result in the database cache. | |||
4. Returns the result. | |||
The in-memory cache expires after 1 hour. | |||
Expired entries in the database cache (and their associated media files) are | |||
deleted every 10 seconds. The default expiration time is 1 hour from download. |
@@ -7,8 +7,7 @@ The media repository | |||
users. | |||
* caches avatars, attachments and their thumbnails for media uploaded by remote | |||
users. | |||
* caches resources and thumbnails used for | |||
[URL previews](development/url_previews.md). | |||
* caches resources and thumbnails used for URL previews. | |||
All media in Matrix can be identified by a unique | |||
[MXC URI](https://spec.matrix.org/latest/client-server-api/#matrix-content-mxc-uris), | |||
@@ -59,8 +58,6 @@ remote_thumbnail/matrix.org/aa/bb/cccccccccccccccccccc/128-96-image-jpeg | |||
Note that `remote_thumbnail/` does not have an `s`. | |||
## URL Previews | |||
See [URL Previews](development/url_previews.md) for documentation on the URL preview | |||
process. | |||
When generating previews for URLs, Synapse may download and cache various | |||
resources, including images. These resources are assigned temporary media IDs | |||
@@ -109,10 +109,64 @@ class MediaInfo: | |||
class PreviewUrlResource(DirectServeJsonResource): | |||
""" | |||
Generating URL previews is a complicated task which many potential pitfalls. | |||
See docs/development/url_previews.md for discussion of the design and | |||
algorithm followed in this module. | |||
The `GET /_matrix/media/r0/preview_url` endpoint provides a generic preview API | |||
for URLs which outputs Open Graph (https://ogp.me/) responses (with some Matrix | |||
specific additions). | |||
This does have trade-offs compared to other designs: | |||
* Pros: | |||
* Simple and flexible; can be used by any clients at any point | |||
* Cons: | |||
* If each homeserver provides one of these independently, all the homeservers in a | |||
room may needlessly DoS the target URI | |||
* The URL metadata must be stored somewhere, rather than just using Matrix | |||
itself to store the media. | |||
* Matrix cannot be used to distribute the metadata between homeservers. | |||
When Synapse is asked to preview a URL it does the following: | |||
1. Checks against a URL blacklist (defined as `url_preview_url_blacklist` in the | |||
config). | |||
2. Checks the URL against an in-memory cache and returns the result if it exists. (This | |||
is also used to de-duplicate processing of multiple in-flight requests at once.) | |||
3. Kicks off a background process to generate a preview: | |||
1. Checks URL and timestamp against the database cache and returns the result if it | |||
has not expired and was successful (a 2xx return code). | |||
2. Checks if the URL matches an oEmbed (https://oembed.com/) pattern. If it | |||
does, update the URL to download. | |||
3. Downloads the URL and stores it into a file via the media storage provider | |||
and saves the local media metadata. | |||
4. If the media is an image: | |||
1. Generates thumbnails. | |||
2. Generates an Open Graph response based on image properties. | |||
5. If the media is HTML: | |||
1. Decodes the HTML via the stored file. | |||
2. Generates an Open Graph response from the HTML. | |||
3. If a JSON oEmbed URL was found in the HTML via autodiscovery: | |||
1. Downloads the URL and stores it into a file via the media storage provider | |||
and saves the local media metadata. | |||
2. Convert the oEmbed response to an Open Graph response. | |||
3. Override any Open Graph data from the HTML with data from oEmbed. | |||
4. If an image exists in the Open Graph response: | |||
1. Downloads the URL and stores it into a file via the media storage | |||
provider and saves the local media metadata. | |||
2. Generates thumbnails. | |||
3. Updates the Open Graph response based on image properties. | |||
6. If the media is JSON and an oEmbed URL was found: | |||
1. Convert the oEmbed response to an Open Graph response. | |||
2. If a thumbnail or image is in the oEmbed response: | |||
1. Downloads the URL and stores it into a file via the media storage | |||
provider and saves the local media metadata. | |||
2. Generates thumbnails. | |||
3. Updates the Open Graph response based on image properties. | |||
7. Stores the result in the database cache. | |||
4. Returns the result. | |||
The in-memory cache expires after 1 hour. | |||
Expired entries in the database cache (and their associated media files) are | |||
deleted every 10 seconds. The default expiration time is 1 hour from download. | |||
""" | |||
isLeaf = True | |||