Inline URL preview documentation near the implementation.
tags/v1.64.0rc1
@@ -1,2 +1 @@
Add a link to configuration instructions in the URL preview documentation.
Move the documentation for how URL previews work to the URL preview module.
@@ -0,0 +1 @@
Move the documentation for how URL previews work to the URL preview module.
@@ -35,7 +35,6 @@
- [Application Services](application_services.md)
- [Server Notices](server_notices.md)
- [Consent Tracking](consent_tracking.md)
- [URL Previews](development/url_previews.md)
- [User Directory](user_directory.md)
- [Message Retention Policies](message_retention_policies.md)
- [Pluggable Modules](modules/index.md)
@@ -544,7 +544,7 @@ Gets a list of all local media that a specific `user_id` has created.
These are media that the user has uploaded themselves
([local media](../media_repository.md#local-media)), as well as
[URL preview images](../media_repository.md#url-previews) requested by the user if the
[feature is enabled](../development/url_previews.md).
[feature is enabled](../usage/configuration/config_documentation.md#url_preview_enabled).
By default, the response is ordered by descending creation date and ascending media ID.
The newest media is on top. You can change the order with parameters
@@ -1,62 +0,0 @@
URL Previews
============
For information on how to enable URL previews in synapse, please see the [config manual](../usage/configuration/config_documentation.md#url_preview_enabled).

The `GET /_matrix/media/r0/preview_url` endpoint provides a generic preview API
for URLs which outputs [Open Graph](https://ogp.me/) responses (with some Matrix
specific additions).

This does have trade-offs compared to other designs:

* Pros:
  * Simple and flexible; can be used by any clients at any point
* Cons:
  * If each homeserver provides one of these independently, all the HSes in a
    room may needlessly DoS the target URI
  * The URL metadata must be stored somewhere, rather than just using Matrix
    itself to store the media.
  * Matrix cannot be used to distribute the metadata between homeservers.

When Synapse is asked to preview a URL it does the following:

1. Checks against a URL blacklist (defined as `url_preview_url_blacklist` in the
   config).
2. Checks the in-memory cache by URLs and returns the result if it exists. (This
   is also used to de-duplicate processing of multiple in-flight requests at once.)
3. Kicks off a background process to generate a preview:
   1. Checks the database cache by URL and timestamp and returns the result if it
      has not expired and was successful (a 2xx return code).
   2. Checks if the URL matches an [oEmbed](https://oembed.com/) pattern. If it
      does, update the URL to download.
   3. Downloads the URL and stores it into a file via the media storage provider
      and saves the local media metadata.
   4. If the media is an image:
      1. Generates thumbnails.
      2. Generates an Open Graph response based on image properties.
   5. If the media is HTML:
      1. Decodes the HTML via the stored file.
      2. Generates an Open Graph response from the HTML.
      3. If a JSON oEmbed URL was found in the HTML via autodiscovery:
         1. Downloads the URL and stores it into a file via the media storage provider
            and saves the local media metadata.
         2. Convert the oEmbed response to an Open Graph response.
         3. Override any Open Graph data from the HTML with data from oEmbed.
      4. If an image exists in the Open Graph response:
         1. Downloads the URL and stores it into a file via the media storage
            provider and saves the local media metadata.
         2. Generates thumbnails.
         3. Updates the Open Graph response based on image properties.
   6. If the media is JSON and an oEmbed URL was found:
      1. Convert the oEmbed response to an Open Graph response.
      2. If a thumbnail or image is in the oEmbed response:
         1. Downloads the URL and stores it into a file via the media storage
            provider and saves the local media metadata.
         2. Generates thumbnails.
         3. Updates the Open Graph response based on image properties.
   7. Stores the result in the database cache.
4. Returns the result.

The in-memory cache expires after 1 hour.

Expired entries in the database cache (and their associated media files) are
deleted every 10 seconds. The default expiration time is 1 hour from download.
@@ -7,8 +7,7 @@ The media repository
  users.
* caches avatars, attachments and their thumbnails for media uploaded by remote
  users.
* caches resources and thumbnails used for
  [URL previews](development/url_previews.md).
* caches resources and thumbnails used for URL previews.

All media in Matrix can be identified by a unique
[MXC URI](https://spec.matrix.org/latest/client-server-api/#matrix-content-mxc-uris),
@@ -59,8 +58,6 @@ remote_thumbnail/matrix.org/aa/bb/cccccccccccccccccccc/128-96-image-jpeg
Note that `remote_thumbnail/` does not have an `s`.

## URL Previews

See [URL Previews](development/url_previews.md) for documentation on the URL preview
process.

When generating previews for URLs, Synapse may download and cache various
resources, including images. These resources are assigned temporary media IDs
@@ -109,10 +109,64 @@ class MediaInfo:
class PreviewUrlResource(DirectServeJsonResource):
    """
    Generating URL previews is a complicated task which many potential pitfalls.

    See docs/development/url_previews.md for discussion of the design and
    algorithm followed in this module.
    The `GET /_matrix/media/r0/preview_url` endpoint provides a generic preview API
    for URLs which outputs Open Graph (https://ogp.me/) responses (with some Matrix
    specific additions).

    This does have trade-offs compared to other designs:

    * Pros:
      * Simple and flexible; can be used by any clients at any point
    * Cons:
      * If each homeserver provides one of these independently, all the homeservers in a
        room may needlessly DoS the target URI
      * The URL metadata must be stored somewhere, rather than just using Matrix
        itself to store the media.
      * Matrix cannot be used to distribute the metadata between homeservers.

    When Synapse is asked to preview a URL it does the following:

    1. Checks against a URL blacklist (defined as `url_preview_url_blacklist` in the
       config).
    2. Checks the URL against an in-memory cache and returns the result if it exists. (This
       is also used to de-duplicate processing of multiple in-flight requests at once.)
    3. Kicks off a background process to generate a preview:
       1. Checks URL and timestamp against the database cache and returns the result if it
          has not expired and was successful (a 2xx return code).
       2. Checks if the URL matches an oEmbed (https://oembed.com/) pattern. If it
          does, update the URL to download.
       3. Downloads the URL and stores it into a file via the media storage provider
          and saves the local media metadata.
       4. If the media is an image:
          1. Generates thumbnails.
          2. Generates an Open Graph response based on image properties.
       5. If the media is HTML:
          1. Decodes the HTML via the stored file.
          2. Generates an Open Graph response from the HTML.
          3. If a JSON oEmbed URL was found in the HTML via autodiscovery:
             1. Downloads the URL and stores it into a file via the media storage provider
                and saves the local media metadata.
             2. Convert the oEmbed response to an Open Graph response.
             3. Override any Open Graph data from the HTML with data from oEmbed.
          4. If an image exists in the Open Graph response:
             1. Downloads the URL and stores it into a file via the media storage
                provider and saves the local media metadata.
             2. Generates thumbnails.
             3. Updates the Open Graph response based on image properties.
       6. If the media is JSON and an oEmbed URL was found:
          1. Convert the oEmbed response to an Open Graph response.
          2. If a thumbnail or image is in the oEmbed response:
             1. Downloads the URL and stores it into a file via the media storage
                provider and saves the local media metadata.
             2. Generates thumbnails.
             3. Updates the Open Graph response based on image properties.
       7. Stores the result in the database cache.
    4. Returns the result.

    The in-memory cache expires after 1 hour.

    Expired entries in the database cache (and their associated media files) are
    deleted every 10 seconds. The default expiration time is 1 hour from download.
    """
    isLeaf = True
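
As a rough illustration of step 2 in the docstring (an in-memory cache that also de-duplicates in-flight requests), a minimal sketch follows. `PreviewCache`, `get_or_generate`, the `generate` callback and the injectable `clock` are hypothetical names introduced only for this example and are not Synapse's actual implementation. The one-hour TTL mirrors the expiry stated in the docstring, and dropping failed entries is an assumption of the sketch.

```python
import asyncio
import time
from typing import Awaitable, Callable, Dict, Tuple

Preview = Dict[str, str]  # stand-in for an Open Graph response


class PreviewCache:
    """Hypothetical URL-keyed cache; not Synapse's actual implementation."""

    def __init__(
        self,
        ttl_seconds: float = 3600.0,  # 1 hour, as stated in the docstring
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # url -> (time the entry was created, task producing the preview)
        self._entries: Dict[str, Tuple[float, asyncio.Future]] = {}

    async def get_or_generate(
        self, url: str, generate: Callable[[str], Awaitable[Preview]]
    ) -> Preview:
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            # Fresh entry: either a finished result or an in-flight request.
            # Awaiting the shared task is what de-duplicates concurrent calls.
            return await asyncio.shield(entry[1])

        # Missing or expired: start one background task for this URL.
        task = asyncio.ensure_future(generate(url))
        self._entries[url] = (now, task)
        try:
            # shield() keeps the shared task alive if this caller is cancelled.
            return await asyncio.shield(task)
        except Exception:
            # Assumption of this sketch: failures are not kept in memory,
            # so a later request for the same URL retries.
            current = self._entries.get(url)
            if current is not None and current[1] is task:
                del self._entries[url]
            raise
```

Two concurrent callers asking for the same URL share a single task, so `generate` runs once. A call made after the TTL has elapsed starts a fresh task.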