+++
title = "Log all the things"
date = 2021-10-13
draft = true

[taxonomies]
tags = [
  "elixir",
  "programming",
  "observability"
]
+++

Elixir 1.11 landed a set of new features that allow for more powerful logging
by utilising Erlang's [`logger`][erl-log] features. Here I will try to describe
the new possibilities and how you can use them to improve your logs.

<!-- more -->

## New log levels {#levels}

Elixir gained 4 new log levels, for a total of 8 (from most verbose to least verbose):

- debug
- info
- **notice** *
- warning (renamed from warn)
- error
- **critical** *
- **alert** *
- **emergency** *

<small>* new levels</small>

This allows for finer-grained verbosity control. However, for compatibility
reasons, in Elixir backends we need to translate these levels back to the "old"
set of 4. The current translation table looks like this:

| Call level | What Elixir backend will see |
| -- | -- |
| `debug` | `debug` |
| `info` | `info` |
| `notice` | **`info`** * |
| `warning` (or `warn`) | `warn` |
| `error` | `error` |
| `critical` | **`error`** * |
| `alert` | **`error`** * |
| `emergency` | **`error`** * |

<small>* "translated" messages</small>

However, we can set the verbosity to any of the 8 levels. This may be confusing
during the transition period, but we cannot change the behaviour until Elixir 2
(which is not happening any time soon).

Usage of the new levels is "obvious":

```elixir
Logger.notice("Hello")
```

This will produce a message with the `notice` level of verbosity.
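
The other new levels work the same way. Below is a minimal sketch (the messages are made up for illustration) that also uses `Logger.log/2` to pick the level at runtime and `Logger.compare_levels/2` to check where `notice` sits in the ordering:

```elixir
require Logger

# Only needed in a standalone script; Mix projects start :logger for you
{:ok, _} = Application.ensure_all_started(:logger)

# Each new level has its own macro, just like the old ones
Logger.critical("Disk is almost full")
Logger.alert("Replication has stopped")

# The level can also be chosen at runtime
level = :emergency
Logger.log(level, "Node is going down")

# `notice` is more severe than `info`, but less severe than `warning`
Logger.compare_levels(:notice, :info)
Logger.compare_levels(:notice, :warning)
```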

Additionally, the `logger.level` option in the configuration supports 2 extra
verbosity levels that you can use in your config:

- `:all` - all messages will be logged, logically exactly the same as `:debug`
- `:none` - no messages will be logged
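
As a sketch, assuming a regular Mix project with a `config/config.exs` file, these values are used just like any other level:

```elixir
# config/config.exs
import Config

# Log every message, regardless of its level
config :logger, level: :all

# Or silence the logger completely, for example in config/test.exs:
# config :logger, level: :none
```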

## Per module log level {#per-module-level}

This change can be quite handy during debugging sessions. It adds 4 new
functions to the `Logger` module:

- [`get_module_level/1`](https://hexdocs.pm/logger/Logger.html#get_module_level/1)
- [`put_module_level/2`](https://hexdocs.pm/logger/Logger.html#put_module_level/2)
- [`delete_module_level/1`](https://hexdocs.pm/logger/Logger.html#delete_module_level/1)
- [`delete_all_module_levels/0`](https://hexdocs.pm/logger/Logger.html#delete_all_module_levels/0)

These allow us to manipulate the verbosity level on a per-module basis. What is
non-obvious, and super handy, is that they allow both lowering **and raising**
verbosity for a given module. This means that:

```elixir
require Logger

Logger.configure(level: :error)

defmodule Foo do
  def run do
    Logger.debug("I am still there")
  end
end

Foo.run() # Does not log anything

# Set `debug` level for `Foo` module only
Logger.put_module_level(Foo, :debug)
Foo.run()
# `I am still there` is logged
Logger.debug("I will not be printed")
# Nothing got logged as top-level verbosity is still set to `:error`
```

Of course, it will not work if you decide to use [compile time purging](https://hexdocs.pm/logger/Logger.html#module-application-configuration).
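
The remaining functions can be used to inspect and undo such overrides. This is a small sketch; the `Noisy` module is hypothetical and exists only for illustration:

```elixir
defmodule Noisy do
  require Logger

  def run, do: Logger.debug("Noisy is running")
end

# Override the level for a single module
Logger.put_module_level(Noisy, :debug)

# Check which override is in effect (an empty list means "no override")
Logger.get_module_level(Noisy)

# Remove it again, so the module follows the primary level
Logger.delete_module_level(Noisy)

# Or drop the overrides for all modules at once
Logger.delete_all_module_levels()
```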

## Logger handlers {#handlers}

---

**Warning!** This is not yet fully implemented in both Erlang and Elixir.
Writing your own handlers without additional knowledge can cause overload
problems.

---

Erlang, together with its logging implementation, needed to provide a way to
ingest these logs somehow. This is done via Erlang logger handlers (in this
article called *handlers*, in contrast to Elixir backends, called *backends*
here).

Handlers are modules that export at least 1 function, `log/2`, which takes 2
arguments:

- `log_event` which is a map with 3 fields:
  - `:level` - verbosity level
  - `:msg` - tuple describing the message:
    - `{:io.format(), [term()]}` - format string and list of terms that should
      be passed to the `:io_lib.format/2` function
    - `{:report, map() | keyword()}` - report that can be formatted into a string
      by the `report_cb/{1,2}` set in the metadata map (see below)
    - `{:string, :unicode.chardata()}` - raw string that should be printed as
      the message
  - `:meta` - map containing all metadata for the given event. All keys should
    be atoms and values can be anything. Some keys have special meaning, and
    some of them will be populated automatically by the `Logger` macros and
    functions. These are:
    - `:pid` - PID of the process that fired the log event
    - `:gl` - group leader of the process that fired the log event
    - `:mfa` - tuple in the form of `{module(), name :: atom(), arity :: non_neg_integer()}`
      that describes the function that fired the log event
    - `:file` - name of the file that defines the code that fired the log event
    - `:line` - line in the given file where the log event was fired
    - `:domain` - list of atoms that can be used to describe the log event
      hierarchy, which can then be used for filtering. All events fired using
      `Logger` macros and functions will have `:elixir` prepended to their
      domain list.
    - `:report_cb` - function that will be used to format `{:report, map() | keyword()}`
      messages. This can be either a 1-ary function that takes the report and
      returns `{:io.format(), [term()]}`, leaving truncation and further
      formatting up to the main formatter, or a 2-ary function that takes the
      report and a configuration map
      `%{depth: pos_integer() | :unlimited, chars_limit: pos_integer() | :unlimited, single_line: boolean()}`
      and returns already formatted `:unicode.chardata()`. More about it can be
      found in a [separate section](#structured-logging).
- `config` - the handler's configuration map, as set when the handler was added

The return value of this function is ignored. If any exception is raised when
calling this function, it will be caught and the failing handler will be
removed. This is important: if such a handler is the only one, you can be left
without any logging handler and miss logs.

The important difference between Erlang handlers and Elixir backends is that
Erlang handler functions are called **within the caller process**, while Elixir
backends are called in a separate process. This means that a badly written
Erlang handler can put quite a substantial load on the application.

To read about the other, optional callbacks that can be defined by an Erlang
handler, which will not be covered here, I suggest looking into the [Erlang documentation](https://erlang.org/doc/man/logger.html#formatter-callback-functions).
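
To make the shape of `log/2` more concrete, here is a minimal sketch of a handler that prints every event to standard output. The `SketchHandler` module, its `format_message/2` helper, and the `:sketch_handler` ID are all made up for this example, and a real handler would also need overload protection:

```elixir
defmodule SketchHandler do
  # Called in the process that fired the log event, so keep it cheap
  def log(%{level: level, msg: msg, meta: meta}, _config) do
    IO.puts(["[", Atom.to_string(level), "] ", format_message(msg, meta)])
  end

  # Raw string
  def format_message({:string, chardata}, _meta), do: chardata

  # Report with a 1-ary `:report_cb` in the metadata
  def format_message({:report, report}, %{report_cb: callback})
      when is_function(callback, 1) do
    {format, args} = callback.(report)
    :io_lib.format(format, args)
  end

  # Report with a 2-ary `:report_cb`, which returns already formatted chardata
  def format_message({:report, report}, %{report_cb: callback})
      when is_function(callback, 2) do
    callback.(report, %{depth: :unlimited, chars_limit: :unlimited, single_line: true})
  end

  # Report without a usable callback
  def format_message({:report, report}, _meta), do: inspect(report)

  # Format string with a list of arguments
  def format_message({format, args}, _meta), do: :io_lib.format(format, args)
end

# Register the handler under an arbitrary ID
:logger.add_handler(:sketch_handler, SketchHandler, %{})
```

Keeping `log/2` this small matters, because it runs inside whichever process emitted the event.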

## Structured logging {#structured-logging}

One of the biggest new features in Elixir 1.11 is support for structured
logging. This means that the log message does not need to be a free-form
string; instead we can pass a structure that provides more machine-readable
data for processing in log aggregators. In Elixir 1.11 it is as simple as
passing a map as the first argument to the `Logger` macros:

```elixir
Logger.info(%{
  status: :completed,
  response: :ok
})
```

This will produce a message that looks like:

```log
14:08:46.849 [info] [response: :ok, status: :completed]
```

As we can see, the map (called a *report*) is formatted as a keyword list. This
is the default way to present the report data. Unfortunately, we cannot access
the metadata from the Elixir backends, but we have 2 ways to make these messages
more readable for a human operator:

1. Using [`Logger`'s translators](https://hexdocs.pm/logger/Logger.Translator.html)
1. Using the `:report_cb` field in metadata

The 1st option is described quite well in the Elixir documentation and has been
available since Elixir 1.0, as it was used to translate `error_logger` messages
in old Erlang versions. Here I will describe the 2nd option, which provides a
way for the **caller** to define how a report should be formatted into a
human-readable string.

`:report_cb` accepts 2 kinds of functions as an argument:

- a 1-ary function that takes the report as an argument and should return a
  tuple in the form of `{:io.format(), [term()]}`, which will later be formatted
  by the formatters.
- a 2-ary function that takes the report and a configuration map as arguments
  and should return a formatted string.

The 1st option is much easier for most use cases, as it does not force you to
worry about handling width, depth, and multiline logs; all of that will be
handled for you.

For example, instead of doing:

```elixir
Logger.info("Started HTTP server on http://localhost:8080")
```

We can do:

```elixir
Logger.info(
  %{
    protocol: :http,
    port: 8080,
    address: "localhost",
    endpoint: MyEndpoint,
    handler: Plug.Cowboy
  },
  report_cb: &__MODULE__.report_cb/1
)

# …

def report_cb(%{protocol: protocol, port: port, address: address}) do
  {"Started ~s server on ~s://~s:~B", [protocol, protocol, address, port]}
end
```
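
For comparison, a 2-ary callback for the same report could look roughly like the sketch below. The `MyServer` module name is hypothetical, and honouring only `:chars_limit` is a simplification; a complete callback would also respect `:depth` and `:single_line`:

```elixir
defmodule MyServer do
  # Receives the report and the formatter configuration map,
  # and returns the already formatted message
  def report_cb(%{protocol: protocol, port: port, address: address}, config) do
    text = "Started #{protocol} server on #{protocol}://#{address}:#{port}"

    case config do
      # Truncation is now our responsibility
      %{chars_limit: limit} when is_integer(limit) -> String.slice(text, 0, limit)
      _ -> text
    end
  end
end
```

It would be passed to the logger as `report_cb: &MyServer.report_cb/2`.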

While the second entry seems much more verbose, with a proper handler it can
provide much more detailed output. Just imagine that we have a handler that
outputs JSON data, and think about what information such a message could
contain. With the plain string version we would get:

```json
{
  "msg": "Started HTTP server on http://localhost:8080",
  "metadata": {
    "mfa": "MyMod.start/2",
    "file": "foo.ex",
    "line": 42
  }
}
```

Now our log aggregation service needs to parse the `msg` field to extract all
the information contained there, like port, address, and protocol. With
structured logging we can have that information readily available, while
presenting the "human readable" form as well:

```json
{
  "text": "Started HTTP server on http://localhost:8080",
  "msg": {
    "address": "localhost",
    "port": 8080,
    "protocol": "http",
    "endpoint": "MyEndpoint",
    "handler": "Plug.Cowboy"
  },
  "metadata": {
    "mfa": "MyMod.start/2",
    "file": "foo.ex",
    "line": 42
  }
}
```

Additionally, you can see that the structured log can carry more information
that would otherwise need to be crammed somewhere into the text message, even
if it is not important in "regular" Ops observability.
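
A handler producing such output could build its payload along these lines. This is only a sketch: `JsonPayload` is a hypothetical helper, it handles just the 1-ary `:report_cb`, and the final encoding step is left to whatever JSON library you use:

```elixir
defmodule JsonPayload do
  # Structured event: keep the report as data and add a readable text
  def build(%{msg: {:report, report}, meta: meta}) do
    %{text: text(report, meta), msg: Map.new(report), metadata: metadata(meta)}
  end

  # Plain string event: there is only the text to work with
  def build(%{msg: {:string, chardata}, meta: meta}) do
    %{msg: IO.chardata_to_string(chardata), metadata: metadata(meta)}
  end

  # Use the caller-provided callback when there is one
  defp text(report, %{report_cb: callback}) when is_function(callback, 1) do
    {format, args} = callback.(report)
    format |> :io_lib.format(args) |> IO.chardata_to_string()
  end

  defp text(report, _meta), do: inspect(report)

  # Keep only the location-related metadata
  defp metadata(meta) do
    base = Map.take(meta, [:file, :line])

    case meta do
      %{mfa: {mod, fun, arity}} -> Map.put(base, :mfa, "#{inspect(mod)}.#{fun}/#{arity}")
      _ -> base
    end
  end
end
```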

This can raise a question: why not use metadata for such functionality, like it
is available in [`LoggerJSON`][] or [`Ink`][]? The reason is that their purpose
is different. Metadata is meant for "meta" stuff like location or tracing ID,
not for information about the message itself. It is best shown with an example.
For this, let us use Elixir's implementation of the `GenServer` wrapper, which
produces an error log entry when an unknown message is handled by the default
`handle_info/2`:

```elixir
Logger.error(
  # Report
  %{
    label: {GenServer, :no_handle_info},
    report: %{
      module: __MODULE__,
      message: msg,
      name: proc
    }
  },
  # Metadata
  %{
    domain: [:otp, :elixir],
    error_logger: %{tag: :error_msg},
    report_cb: &GenServer.format_report/1
  }
)
```

As we can see, the report contains information like:

- `:label` - describes the type of the event
- `:report` - content of the "main" event
  - `:module` - module that created the event. It is important to notice that
    it is also present in metadata (as part of the `:mfa` key), but their
    meaning is different. The module name here is meant for the operator to
    know the name of the implementor that failed to handle the message, while
    `:mfa` is meant to describe the location of the code that fired the event.
  - `:message` - the message itself that hasn't been handled. Notice that it is
    not stringified in any way; it is simply passed "as is" to the report. It
    is meant to be stringified later by the `:report_cb` function.
  - `:name` - name of the process. Remember, similarly to `:module`, the PID of
    the current process is part of the metadata, so in theory we could use the
    value from there, but their meaning is different (additionally, this one
    may be an atom if the process is locally registered with a name).

Metadata, on the other hand, contains information that will be useful for
filtering or formatting of the event.

The rule of thumb you can follow is:

> If it is a thing that you will want to filter on, then it probably should be
> part of the metadata. If you want to aggregate information or just display
> it, it should be part of the message report.

## Log filtering

Finally we come to the first feature that is not directly accessible from the
Elixir `Logger` API (yet). Erlang's `logger` has powerful functionality for
filtering log messages, which allows us to decide dynamically which messages
should or should not be logged. Filters can even alter messages on the fly.

Currently that functionality is available only via the `:logger` module. It can
be used like this:

```elixir
defmodule MyFilter do
  def filter(log_event, opts) do
    # …
  end
end

:logger.add_primary_filter(:my_filter, {&MyFilter.filter/2, opts})
# Or
:logger.add_handler_filter(handler_id, :my_filter, {&MyFilter.filter/2, opts})
```

A few important things to remember when writing such filters:

- It is best practice to make such functions public and define filters using a
  remote function capture, like `&__MODULE__.process_disabled/2` (so not
  anonymous functions either). This makes such filters much easier for the VM
  to handle (why that is so is a bigger topic; I may cover it in another post).
- Filters are run **within the same process that fired the log event**, so it
  is important to make them as fast as possible and not to do any heavy work
  there.

Filters can be used for 2 different things:

- preventing some messages from being logged
- modifying a message

While the former is much more common, I will try to describe both use cases
here, as the latter is also quite useful.

Filters are defined as 2-ary functions where the 1st argument is the log event
and the 2nd argument is any term that can be used as configuration for the
filter. A filter should return one of these 3 values:

- `:stop` - which will immediately discard the message without running any
  additional filters.
- `:ignore` - which means that the given filter didn't recognise the message
  and leaves it up to other filters to decide on the action. If all filters
  return `:ignore`, then the `:filter_default` option of the handler is used.
  By default it is `:log`, which means that the message will be logged, but the
  default handler has it set to `:stop`, which means that non-matching messages
  will be discarded.
- The log event itself (possibly modified), which will cause the next filter to
  be called with the altered message. The message returned by the last filter
  (or, if that filter returned `:ignore`, by the previous filters) will be the
  message passed to the handler.
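
The three return values can be seen together in one small sketch. Everything here is hypothetical: the `HealthcheckFilter` module, the `:path` key in the report, and the `:ignored_paths` option are made up to illustrate the contract:

```elixir
defmodule HealthcheckFilter do
  # `opts` is whatever term was passed when the filter was added
  def filter(%{msg: {:report, %{path: path}}} = event, opts) do
    if path in Keyword.get(opts, :ignored_paths, []) do
      # Discard the event, no further filters are run
      :stop
    else
      # Pass a (modified) event on to the next filter
      %{event | meta: Map.put(event.meta, :kind, :http_request)}
    end
  end

  # Not an event we recognise, let other filters decide
  def filter(_event, _opts), do: :ignore
end

:logger.add_primary_filter(
  :healthcheck,
  {&HealthcheckFilter.filter/2, [ignored_paths: ["/health"]]}
)
```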

### Preventing some messages from being logged

The most common use case for filters will probably be rejecting messages that
aren't important for us. [Erlang even prepared some useful filters][logger_filters]:

- `domain` - allows filtering by the metadata `:domain` field (remember when I
  said that metadata is for filtering?). It supports multiple possible
  relations between the log domain and the defined domain.
- `level` - allows filtering messages (in or out) depending on their level, in
  both directions. So it will allow you to filter out messages with a higher
  level for some handlers. Just remember that it will not receive messages that
  do not pass the primary/module level.
- `progress` - filters all progress reports from `supervisor` and
  `application_controller`. Simply put, it reduces the startup/process shutdown
  chatter that is meaningless most of the time.
- `remote_gl` - filters messages coming from a group leader on another node.
  Useful when you want to discard/log messages coming from other nodes in the
  cluster.
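
As a sketch of how these built-in filters are wired up, the snippet below uses the `:logger_filters` module directly. The filter IDs are arbitrary names chosen for this example, and attaching the `level` filter to every installed handler is just one possible setup:

```elixir
# Drop progress reports from supervisors and the application controller
:logger.add_primary_filter(:no_progress, {&:logger_filters.progress/2, :stop})

# Discard events whose domain is `[:otp, :sasl]` or one of its sub-domains
:logger.add_primary_filter(
  :no_sasl,
  {&:logger_filters.domain/2, {:stop, :sub, [:otp, :sasl]}}
)

# For every installed handler, discard everything below `warning`
for handler_id <- :logger.get_handler_ids() do
  :logger.add_handler_filter(
    handler_id,
    :warnings_and_up,
    {&:logger_filters.level/2, {:stop, :lt, :warning}}
  )
end
```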

### Modifying a message

Sometimes (hopefully rarely) there is a need to alter messages in the system.
For example, we may need to prevent sensitive information from being logged.
With the "old" Elixir approach you could abuse translators, but that was
error-prone, as the first successful translator broke the pipeline, so you
couldn't just put one on top and keep the rest working as is. With the "new"
approach and structured logging you can just traverse the report and replace
all occurrences of the unsafe data with anonymised data. For example:

```elixir
# The module name is illustrative
defmodule PasswordFilter do
  @filtered "[FILTERED]"

  def filter_out_password(%{msg: {:report, report}} = event, _opts) do
    %{event | msg: {:report, replace(report)}}
  end

  # Leave non-report events for other filters to decide on
  def filter_out_password(_event, _opts), do: :ignore

  # Plain maps: mask password keys and recurse into all other values
  defp replace(map) when is_map(map) and not is_struct(map) do
    Map.new(map, fn
      {key, _} when key in [:password, "password"] -> {key, @filtered}
      {key, value} -> {key, replace(value)}
    end)
  end

  # Lists, including keyword lists
  defp replace(list) when is_list(list) do
    for elem <- list do
      case elem do
        {:password, _} -> {:password, @filtered}
        {"password", _} -> {"password", @filtered}
        {k, v} -> {k, replace(v)}
        other -> replace(other)
      end
    end
  end

  defp replace(other), do: other
end
```

This snippet will replace the values of all `:password` or `"password"` keys
with the filtered-out value.

There is a disadvantage to such an approach, though: it will make all messages
with such fields allowed in case your filter has `:filter_default` set to
`:stop`. This means that if you want some of them rejected anyway, you will
need to manually add an additional step that rejects messages that do not fit
your patterns. Alternatively, you can use `filter_default: :log` and opt-out
logging. There is currently no way to alter a message and let other filters
decide whether to log it or not (as of OTP 24).

## Summary

The new features and possibilities related to logging in Elixir 1.11 can be
overwhelming. Fortunately, all of the new features are optional and provided in
addition to the good ol' `Logger.info("logging")`. But for the people who work
on observability in the BEAM (EEF Observability WG, Sentry, Logflare, etc.) it
brings a lot of new, powerful capabilities.

I am thrilled to see what people will create using all that power.

[erl-log]: https://erlang.org/doc/man/logger.html
[syslog]: https://en.wikipedia.org/wiki/Syslog#Severity_level
[`LoggerJSON`]: https://github.com/Nebo15/logger_json
[`Ink`]: https://hex.pm/packages/ink
[logger_filters]: https://erlang.org/doc/man/logger_filters.html