# Log Contexts

To help track the processing of individual requests, synapse uses a
'`log context`' to track which request it is handling at any given
moment. This is done via a thread-local variable; a `logging.Filter` is
then used to fish the information back out of the thread-local variable
and add it to each log record.

Logcontexts are also used for CPU and database accounting, so that we
can track which requests were responsible for high CPU use or database
activity.

The `synapse.logging.context` module provides facilities for managing
the current log context (as well as providing the `LoggingContextFilter`
class).

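As a point of reference, here is a minimal sketch of how that filter might be
attached to a plain `logging` handler so that each record carries the current
request. (Synapse deployments normally do this via the YAML log config; the
handler, format string and constructor arguments below are illustrative
assumptions rather than the canonical setup.)

```python
import logging

from synapse.logging import context

handler = logging.StreamHandler()
# LoggingContextFilter copies details of the current logcontext (notably
# the request) onto each LogRecord, so the formatter can refer to them.
handler.addFilter(context.LoggingContextFilter(request=""))
handler.setFormatter(logging.Formatter(
    "%(asctime)s - %(name)s - %(levelname)s - %(request)s - %(message)s"
))
logging.getLogger().addHandler(handler)
```
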
Asynchronous functions make the whole thing complicated, so this document describes
how it all works, and how to write code which follows the rules.

In this document, "awaitable" refers to any object which can be `await`ed. In the context of
Synapse, that normally means either a coroutine or a Twisted
[`Deferred`](https://twistedmatrix.com/documents/current/api/twisted.internet.defer.Deferred.html).

## Logcontexts without asynchronous code

In the absence of any asynchronous voodoo, things are simple enough. As with
any code of this nature, the rule is that our function should leave
things as it found them:

```python
from synapse.logging import context  # omitted from future snippets

def handle_request(request_id):
    request_context = context.LoggingContext()

    calling_context = context.set_current_context(request_context)
    try:
        request_context.request = request_id
        do_request_handling()
        logger.debug("finished")
    finally:
        context.set_current_context(calling_context)

def do_request_handling():
    logger.debug("phew")  # this will be logged against request_id
```

LoggingContext implements the context management methods, so the above
can be written much more succinctly as:

```python
def handle_request(request_id):
    with context.LoggingContext() as request_context:
        request_context.request = request_id
        do_request_handling()
        logger.debug("finished")

def do_request_handling():
    logger.debug("phew")
```

## Using logcontexts with awaitables

Awaitables break the linear flow of code so that there is no longer a single entry point
where we should set the logcontext and a single exit point where we should remove it.

Consider the example above, where `do_request_handling` needs to do some
blocking operation, and returns an awaitable:

```python
async def handle_request(request_id):
    with context.LoggingContext() as request_context:
        request_context.request = request_id
        await do_request_handling()
        logger.debug("finished")
```

In the above flow:

- The logcontext is set
- `do_request_handling` is called, and returns an awaitable
- `handle_request` awaits the awaitable
- Execution of `handle_request` is suspended

So we have stopped processing the request (and will probably go on to
start processing the next), without clearing the logcontext.

To circumvent this problem, synapse code assumes that, wherever you have
an awaitable, you will want to `await` it. To that end, wherever
functions return awaitables, we adopt the following conventions:

**Rules for functions returning awaitables:**

> - If the awaitable is already complete, the function returns with the
>   same logcontext it started with.
> - If the awaitable is incomplete, the function clears the logcontext
>   before returning; when the awaitable completes, it restores the
>   logcontext before running any callbacks.

That sounds complicated, but actually it means a lot of code (including
the example above) "just works". There are two cases:

- If `do_request_handling` returns a completed awaitable, then the
  logcontext will still be in place. In this case, execution will
  continue immediately after the `await`; the "finished" line will
  be logged against the right context, and the `with` block restores
  the original context before we return to the caller.

- If the returned awaitable is incomplete, `do_request_handling` clears
  the logcontext before returning. The logcontext is therefore clear
  when `handle_request` `await`s the awaitable.

  Once `do_request_handling`'s awaitable completes, it will reinstate
  the logcontext, before running the second half of `handle_request`,
  so again the "finished" line will be logged against the right context,
  and the `with` block restores the original context.

As an aside, it's worth noting that `handle_request` follows our rules,
though that only matters if the caller has its own logcontext which it
cares about.

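To make that aside concrete, here is an illustrative sketch (the `outer`
function and the request names are invented for the example): because
`handle_request` follows the rules, a caller with its own logcontext gets that
context back after the `await`, whether or not `handle_request` suspended.

```python
async def outer():
    with context.LoggingContext() as ctx:
        ctx.request = "outer-request"

        # handle_request follows the rules, so when this await completes
        # we are back in `ctx`...
        await handle_request("some-request-id")

        # ...and this line is logged against "outer-request".
        logger.debug("handle_request done")
```
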
The following sections describe pitfalls and helpful patterns when
implementing these rules.

## Always await your awaitables

Whenever you get an awaitable back from a function, you should `await` on
it as soon as possible. Do not pass go; do not do any logging; do not
call any other functions.

```python
async def fun():
    logger.debug("starting")
    await do_some_stuff()  # just like this

    coro = more_stuff()
    result = await coro  # also fine, of course

    return result
```

Provided this pattern is followed all the way back up the callchain
to where the logcontext was set, this will make things work out ok:
provided `do_some_stuff` and `more_stuff` follow the rules above, then
so will `fun`.

It's all too easy to forget to `await`: for instance if we forgot that
`do_some_stuff` returned an awaitable, we might plough on regardless. This
leads to a mess; it will probably work itself out eventually, but not
before a load of stuff has been logged against the wrong context.
(Normally, other things will break, more obviously, if you forget to
`await`, so this tends not to be a major problem in practice.)

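For illustration, here is a sketch of that mess (assume `do_some_stuff` starts
some work and returns an incomplete awaitable which follows the rules; the
`broken_fun` name is made up):

```python
async def broken_fun():
    logger.debug("starting")

    # *don't* do this: the awaitable is never awaited. Because
    # do_some_stuff follows the rules, it clears the logcontext before
    # returning its incomplete awaitable...
    do_some_stuff()

    # ...so this line is logged with no context, and when the work
    # eventually completes, its awaitable restores the logcontext with
    # nothing waiting for it, leaking it into the reactor.
    logger.debug("carrying on regardless")
```
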
Of course sometimes you need to do something a bit fancier with your
awaitable - not all code follows the linear A-then-B-then-C pattern.
Notes on implementing more complex patterns are in later sections.

## Where you create a new awaitable, make it follow the rules

Most of the time, an awaitable comes from another synapse function.
Sometimes, though, we need to make up a new awaitable, or we get an awaitable
back from external code. We need to make it follow our rules.

The easy way to do it is by using `context.make_deferred_yieldable`. Suppose we want to implement
`sleep`, which returns a deferred which will run its callbacks after a
given number of seconds. That might look like:

```python
# not a logcontext-rules-compliant function
def get_sleep_deferred(seconds):
    d = defer.Deferred()
    reactor.callLater(seconds, d.callback, None)
    return d
```

That doesn't follow the rules, but we can fix it by calling it through
`context.make_deferred_yieldable`:

```python
async def sleep(seconds):
    return await context.make_deferred_yieldable(get_sleep_deferred(seconds))
```

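With that wrapper, callers can treat `sleep` like any other rules-following
Synapse function. A usage sketch (reusing the `do_request_handling` name from
the earlier examples):

```python
async def do_request_handling():
    logger.debug("about to sleep")  # logged against the request context

    # sleep follows the rules: the logcontext is cleared while we wait,
    # and restored before we resume here.
    await sleep(10)

    logger.debug("woke up")  # still logged against the request context
```
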
## Fire-and-forget

Sometimes you want to fire off a chain of execution, but not wait for
its result. That might look a bit like this:

```python
async def do_request_handling():
    await foreground_operation()

    # *don't* do this
    background_operation()

    logger.debug("Request handling complete")

async def background_operation():
    await first_background_step()
    logger.debug("Completed first step")
    await second_background_step()
    logger.debug("Completed second step")
```

The above code does a couple of steps in the background after
`do_request_handling` has finished. The log lines are still logged
against the `request_context` logcontext, which may or may not be
desirable. There are two big problems with the above, however. The first
problem is that, if `background_operation` returns an incomplete
awaitable, it will expect its caller to `await` immediately, so will have
cleared the logcontext. In this example, that means that 'Request
handling complete' will be logged without any context.

The second problem, which is potentially even worse, is that when the
awaitable returned by `background_operation` completes, it will restore
the original logcontext. There is nothing waiting on that awaitable, so
the logcontext will leak into the reactor and possibly get attached to
some arbitrary future operation.

There are two potential solutions to this.

One option is to surround the call to `background_operation` with a
`PreserveLoggingContext` call. That will reset the logcontext before
starting `background_operation` (so the context restored when the
deferred completes will be the empty logcontext), and will restore the
current logcontext before continuing the foreground process:

```python
async def do_request_handling():
    await foreground_operation()

    # start background_operation off in the empty logcontext, to
    # avoid leaking the current context into the reactor.
    with PreserveLoggingContext():
        background_operation()

    # this will now be logged against the request context
    logger.debug("Request handling complete")
```

Obviously that option means that the operations done in
`background_operation` would not be logged against a logcontext
(though that might be fixed by setting a different logcontext via a
`with LoggingContext(...)` in `background_operation`).

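That fix might look something like the following sketch (passing a name to
`LoggingContext` here is an assumption for illustration; the name itself is
arbitrary):

```python
async def background_operation():
    # give the background work its own logcontext, so its log lines are
    # attributed to something rather than to no context at all
    with context.LoggingContext("background_operation"):
        await first_background_step()
        logger.debug("Completed first step")
        await second_background_step()
        logger.debug("Completed second step")
```
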
The second option is to use `context.run_in_background`, which wraps a
function so that it doesn't reset the logcontext even when it returns
an incomplete awaitable, and adds a callback to the returned awaitable to
reset the logcontext. In other words, it turns a function that follows
the Synapse rules about logcontexts and awaitables into one which behaves
more like an external function --- the opposite operation to that
described in the previous section. It can be used like this:

```python
async def do_request_handling():
    await foreground_operation()

    context.run_in_background(background_operation)

    # this will now be logged against the request context
    logger.debug("Request handling complete")
```

## Passing synapse deferreds into third-party functions

A typical example of this is where we want to collect together two or
more awaitables via `defer.gatherResults`:

```python
a1 = operation1()
a2 = operation2()

a3 = defer.gatherResults([a1, a2])
```

This is really a variation of the fire-and-forget problem above, in that
we are firing off `a1` and `a2` without awaiting on them. The difference
is that we now have third-party code attached to their callbacks. Anyway,
either technique given in the [Fire-and-forget](#fire-and-forget)
section will work.

Of course, the new awaitable returned by `gatherResults` needs to be
wrapped in order to make it follow the logcontext rules before we can
await it, as described in [Where you create a new awaitable, make it
follow the
rules](#where-you-create-a-new-awaitable-make-it-follow-the-rules).

So, option one: reset the logcontext before starting the operations to
be gathered:

```python
async def do_request_handling():
    with PreserveLoggingContext():
        a1 = operation1()
        a2 = operation2()

        result = await defer.gatherResults([a1, a2])
```

In this case particularly, though, option two, of using
`context.run_in_background`, almost certainly makes more sense, so that
`operation1` and `operation2` are both logged against the original
logcontext. This looks like:

```python
async def do_request_handling():
    a1 = context.run_in_background(operation1)
    a2 = context.run_in_background(operation2)

    result = await make_deferred_yieldable(defer.gatherResults([a1, a2]))
```

## A note on garbage-collection of awaitable chains

It turns out that our logcontext rules do not play nicely with awaitable
chains which get orphaned and garbage-collected.

Imagine we have some code that looks like this:

```python
listener_queue = []

def on_something_interesting():
    for d in listener_queue:
        d.callback("foo")

async def await_something_interesting():
    new_awaitable = defer.Deferred()
    listener_queue.append(new_awaitable)

    with PreserveLoggingContext():
        await new_awaitable
```

Obviously, the idea here is that we have a bunch of things which are
waiting for an event. (It's just an example of the problem here, but a
relatively common one.)

Now let's imagine two further things happen. First of all, whatever was
waiting for the interesting thing goes away. (Perhaps the request times
out, or something *even more* interesting happens.)

Secondly, let's suppose that we decide that the interesting thing is
never going to happen, and we reset the listener queue:

```python
def reset_listener_queue():
    listener_queue.clear()
```

So, both ends of the awaitable chain have now dropped their references,
and the awaitable chain is now orphaned, and will be garbage-collected at
some point. Note that `await_something_interesting` is a coroutine,
which Python implements as a generator function. When Python
garbage-collects generator functions, it gives them a chance to
clean up by making the `await` (or `yield`) raise a `GeneratorExit`
exception. In our case, that means that the `__exit__` handler of
`PreserveLoggingContext` will carefully restore the request context, but
there is now nothing waiting for its return, so the request context is
never cleared.

To reiterate, this problem only arises when *both* ends of an awaitable
chain are dropped. Dropping the reference to an awaitable you're
supposed to be awaiting is bad practice, so this doesn't
actually happen too much. Unfortunately, when it does happen, it will
lead to leaked logcontexts which are incredibly hard to track down.