···11++++
22+title = "Log all the things"
33+date = 2021-10-13
44+draft = true
55+66+[taxonomies]
77+tags = [
88+ "elixir",
99+ "programming",
1010+ "observability"
1111+]
1212+1313+[[extra.thanks]]
1414+name = "Kai Wern Choong"
1515++++
1616+1717+In Elixir 1.11 landed set of new features that allows for more powerful logging
1818+by utilising Erlang's [`logger`][erl-log] features. Here I will try to describe
1919+new possibilities and how You can use them to improve your logs.
2020+2121+<!-- more -->
2222+2323+## New log levels {#levels}
2424+2525+Elixir gained 4 new log levels to total 8 (from most verbose to least verbose):
2626+2727+- debug
2828+- info
2929+- **notice** *
3030+- warning (renamed from warn)
3131+- error
3232+- **critical** *
3333+- **alert** *
3434+- **emergency** *
3535+3636+<small>* new levels</small>
3737+3838+This allow to provide finer graded verbosity control, due to compatibility
3939+reasons, in Elixir backends we need to translate these levels back to "old" set
4040+of 4. The current table looks like:
4141+4242+| Call level | What Elixir backend will see |
4343+| -- | -- |
4444+| `debug` | `debug` |
4545+| `info` | `info` |
4646+| `notice` | **`info`** * |
4747+| `warning` (or `warn`) | `warn` |
4848+| `error` | `error` |
4949+| `critical` | **`error`** * |
5050+| `alert` | **`error`** * |
5151+| `emergency` | **`error`** * |
5252+5353+<small>* "translated" messages</small>
5454+5555+We can set verbosity to all levels. This may be confusing during the transition
5656+period, but we cannot change the behaviour until Elixir 2 (which is not
5757+happening any time soon).
5858+5959+Usage of the new levels is "obvious":
6060+6161+```elixir
6262+Logger.notice("Hello")
6363+```
6464+6565+Will produce message with `notice` level of verbosity.
6666+6767+Additionally the `logger.level` option in configuration supports 2 additional
6868+verbosity levels that you can use in your config:
6969+7070+- `:all` - all messages will be logged, logically exactly the same as `:debug`
7171+- `:none` - no messages will be logged
7272+7373+## Per module log level {#per-module-level}
7474+7575+This is change that can be quite handy during debugging sessions. With this
7676+change we have 4 new functions in `Logger` module:
7777+7878+- [`get_module_level/1`](https://hexdocs.pm/logger/Logger.html#get_module_level/1)
7979+- [`put_module_level/2`](https://hexdocs.pm/logger/Logger.html#put_module_level/2)
8080+- [`delete_module_level/1`](https://hexdocs.pm/logger/Logger.html#delete_module_level/1)
8181+- [`delete_all_module_level/0`](https://hexdocs.pm/logger/Logger.html#delete_all_module_level/0)
8282+8383+These allow us to manipulate verbosity level on per-module basis. What is
8484+non-obvious and is super handy is that it allows both lowering **and raising**
8585+verbosity for given module. This mean that:
8686+8787+```elixir
8888+require Logger
8989+9090+Logger.configure(level: :error)
9191+9292+defmodule Foo do
9393+ def run do
9494+ Logger.debug("I am still there")
9595+ end
9696+end
9797+9898+Foo.run() # Does not log anything
9999+100100+# Set `debug` level for `Foo` module only
101101+Logger.put_module_level(Foo, :debug)
102102+Foo.run()
103103+# `I am still there` is logged
104104+Logger.debug("I will not be printed")
105105+# Nothing got logged as top-level verbositi is still set to `:error`
106106+```
107107+108108+Of course it will not work if you decide to use [compile time purging][logger-purge]
109109+110110+## Logger handlers {#handlers}
111111+112112+---
113113+114114+**Warning!** This is not fully implemented in both - Erlang and Elixir. Writing
115115+your own handlers without additional knowledge can cause overload problems.
116116+117117+---
118118+119119+Erlang together with their logging implementation needed to provide a way to
120120+ingest these logs somehow. This is done via Erlang logger handlers (in this
121121+article called *handlers* in contrast to Elixir backends called *backends*
122122+there).
123123+124124+Handlers are modules that export at least 1 function `log/2` that takes 2
125125+arguments:
126126+127127+- `log_event` which is a map with 3 fields:
128128+ - `:level` - verbosity level
129129+ - `:msg` - tuple describing message:
130130+ - `{:io.format(), [term()]}` - format string and list of terms that should
131131+ be passed to `:io_lib.format/2` function
132132+ - `{:report, map() | keyword()}` - report that can be formatted into string
133133+ by `report_cb/{1,2}` set in metadata map (see below)
134134+ - `{:string, :unicode.chardata()}` - raw string that should be printed as
135135+ a message
136136+ - `:meta` - map containing all metadata for given event. All keys should be
137137+ atoms and values can be anything. Some keys have special meaning, and some
138138+ of them will be populated automatically by the `Logger` macros and functions.
139139+ These are:
140140+ - `:pid` - PID of the process that fired log event
141141+ - `:gl` - group leader of the process that fired log event
142142+ - `:mfa` - tuple in form of `{module(), name :: atom(), arity :: non_neg_integer()}`
143143+ that describe function that fired log event
144144+ - `:file` - filename of file that defines the code that fired log event
145145+ - `:line` - line in the given file where the log event was fired
146146+ - `:domain` - list of atoms that can be used to describe log events
147147+ hierarchy which then can be used for filtering. All events fired using
148148+ `Logger` macros and functions will have `:elixir` prepended to their
149149+ domain list.
150150+ - `:report_cb` - function that will be used to format `{:report, map() |
151151+ keyword()}` messages. This can be either 1-ary function, that takes report
152152+ and returns `{:io.format(), [term()]}` leaving truncation and further
153153+ formatting up to the main formatter, or 2-ary function that takes report
154154+ and configuration map `%{depth: pos_integer() | :unlimited, chars_limit:
155155+ pos_integer() | :unlimited, single_line: boolean()}` and returns already
156156+ formatted `:unicode.chardata()`. More about it can be found in [separate
157157+ section](#structured-logging).
158158+159159+Return value of this function is ignored. If there will be any exception raised
160160+when calling this function, then it will be captured and failing handler will be
161161+removed. This is important, as if such handler is the only one, then you can be
162162+left without any logging handler and miss logs.
163163+164164+The important thing about Erlang handlers and Elixir backends is that Erlang
165165+handlers functions are called **within caller process** while Elixir backends
166166+are called in separate process. This mean that wrongly written Erlang handler
167167+can cause quite substantial load on application.
168168+169169+To read on other, optional, callbacks that can be defined by Erlang handler, that
170170+will not be covered there, I suggest looking into [Erlang documentation][formatter_cb].
171171+172172+## Structured logging {#structured-logging}
173173+174174+One of the biggest new features in the Elixir 1.11 is support for structured
175175+logging. This mean that the log message do not need to be free-form string, but
176176+instead we can pass structure, that can provide more machine-readable data for
177177+processing in log aggregators. In Elixir 1.11 is simple as passing map as a
178178+first argument to the `Logger` macros:
179179+180180+```elixir
181181+Logger.info(%{
182182+ status: :completed,
183183+ response: :ok
184184+})
185185+```
186186+187187+This will produce message that looks like:
188188+189189+```log
190190+14:08:46.849 [info] [response: :ok, status: :completed]
191191+```
192192+193193+As we can see, the map (called *report*) is formatted as a keyword list. This is
194194+default way to present the report data. Unfortunately we cannot access the
195195+metadata from the Elixir backends, but we have 2 ways to make these messages
196196+more readable for the human operator:
197197+198198+1. Utilise [`Logger`'s translators](https://hexdocs.pm/logger/Logger.Translator.html)
199199+1. Using `:report_cb` field in metadata
200200+201201+1st option is described quite good in Elixir documentation and is available
202202+since Elixir 1.0 as it was used to translate `error_logger` messages in old
203203+Erlang versions. Here I will describe the 2nd option which provide way for
204204+**caller** to define how report should be formatted into human-readable string.
205205+206206+`:report_cb` accepts 2 kind of functions as an argument:
207207+208208+- 1-ary function, that takes report as an argument and should return tuple
209209+ in form of `{:io.format(), [term()]}` that will be later formatted
210210+ respectively by the formatters.
211211+- 2-ary function that takes report and configuration map as an arguments and
212212+ should return formatted string.
213213+214214+1st option is much easier for most use cases, as it do not force you to worry
215215+about handling width, depth, and multiline logs, as it will all be handled for
216216+you.
217217+218218+For example, instead of doing:
219219+220220+```elixir
221221+Logger.info("Started HTTP server on http://localhost:8080")
222222+```
223223+224224+We can do:
225225+226226+```elixir
227227+Logger.info(
228228+ %{
229229+ protocol: :http,
230230+ port: 8080,
231231+ address: "localhost",
232232+ endpoint: MyEndpoint,
233233+ handler: Plug.Cowboy
234234+ },
235235+ report_cb: &__MODULE__.report_cb/1
236236+)
237237+238238+# …
239239+240240+def report_cb(%{protocol: protocol, port: port, address: address}) do
241241+ {"Started ~s server on ~s://~s:~B", [protocol, protocol, address, port]}
242242+end
243243+```
244244+245245+While the second entry seems much more verbose, with proper handler, it can
246246+provide much more detailed output. Just imagine that we would have handler that
247247+output JSON data and what information we could contain in such message:
248248+249249+```json
250250+{
251251+ "msg": "Started HTTP server on http://localhost:8080",
252252+ "metadata": {
253253+ "mfa": "MyMod.start/2",
254254+ "file": "foo.ex",
255255+ "line": 42
256256+ }
257257+}
258258+```
259259+260260+Now our log aggregation service need to parse `msg` field to extract all
261261+information that is contained there, like port, address, and protocol. With
262262+structured logging we can have that message available already there while
263263+presenting the "human readable" form as well:
264264+265265+```json
266266+{
267267+ "text": "Started HTTP server on http://localhost:8080",
268268+ "msg": {
269269+ "address": "localhost",
270270+ "port": 8080,
271271+ "protocol": "http",
272272+ "endpoint": "MyEndpoint",
273273+ "handler": "Plug.Cowboy"
274274+ },
275275+ "metadata": {
276276+ "mfa": "MyMod.start/2",
277277+ "file": "foo.ex",
278278+ "line": 42
279279+ }
280280+}
281281+```
282282+283283+You can see there that we can have more information available in the structured
284284+log that would otherwise needed to be crammed somewhere into the text message,
285285+even if it is not important in "regular" Ops observability.
286286+287287+This can raise a question - why not use metadata for such functionality, like it
288288+is available in [`LoggerJSON`][] or [`Ink`][]? The reason is that their reason
289289+existence is different. Metadata meant for "meta" stuff like location, tracing
290290+ID, but not for the information about the message itself. It is best shown on
291291+example. For this use Elixir's implementation of `GenServer` wrapper that
292292+produces error log entry on unknown message handled by default `handle_info/2`:
293293+294294+```elixir
295295+Logger.error(
296296+ # Report
297297+ %{
298298+ label: {GenServer, :no_handle_info},
299299+ report: %{
300300+ module: __MODULE__,
301301+ message: msg,
302302+ name: proc
303303+ }
304304+ },
305305+ # Metadata
306306+ %{
307307+ error_logger: %{tag: :error_msg},
308308+ report_cb: &GenServer.format_report/1
309309+ }
310310+)
311311+```
312312+313313+As we can see there, the report contains informations like:
314314+315315+- `:label` - that describes type of the event
316316+- `:report` - content of the "main" event
317317+ - `:module` - module that created the event, it is important to notice, that
318318+ it is also present in metadata (as part of `:mfa` key), but their meaning is
319319+ different. Module name here is meant for the operator to know the name of
320320+ the implementor that failed to handle message, while `:mfa` is meant to
321321+ describe the location of the code that fired the event.
322322+ - `:message` - the message itself that hasn't been handled. Notice, that it is
323323+ not stringified in any way there, it is simply passed "as is" to the
324324+ report. It is meant to be stringified later by the `:report_cb` function.
325325+ - `:name` - name of the process. Remember, similarly to `:module`, the PID of
326326+ the current process is part of the metadata, so in theory we could use value
327327+ from there, but their meaning is different (additionally this one may be an
328328+ atom in case if the process is locally registered with name).
329329+330330+Metadata on the other hand contains information that will be useful for
331331+filtering or formatting of the event.
332332+333333+The rule of thumb you can follow is:
334334+335335+> If it is thing that you will want to filter on, then it probably should be
336336+> part of the metadata. If you want to aggregate information or just display
337337+> them, it should be part of the message report.
338338+339339+## Log filtering
340340+341341+Finally we come to first feature that is not directly accessible from the Elixir
342342+`Logger` API (yet). Erlang's `logger` have powerful functionality for filtering
343343+log messages which allows us to dynamically decide which message should, or
344344+should not be logged. These even can alter messages on the fly.
345345+346346+Currently that functionality is available only via `:logger` module. It can be
347347+used like:
348348+349349+```elixir
350350+defmodule MyFilter do
351351+ def filter(log_event, opts) do
352352+ # …
353353+ end
354354+end
355355+356356+:logger.add_primary_filter(:my_filter, {&MyFilter.filter/2, opts})
357357+# Or
358358+:logger.add_handler_filter(handler_id, :my_filter, {&MyFilter.filter/2, opts})
359359+```
360360+361361+Few important things that need to be remembered when writing such filters:
362362+363363+- It is best practice to make such functions public and define filters using
364364+ remote function capture, like `&__MODULE__.process_disabled/2` (so not
365365+ anonymous functions either). It will make such filter much easier for VM to
366366+ handle (it is bigger topic why it is that, I may to cover it in another post).
367367+- Filters are ran **within the same process that fired log event**, so it is
368368+ important to make such filters as fast as possible, and do not do any heavy
369369+ work there.
370370+371371+Filters can be used for 2 different things:
372372+373373+- preventing some messages from being logged
374374+- modifying a message
375375+376376+While the former is much more common, I will try to describe both use cases
377377+there, as the latter is also quite useful.
378378+379379+Filters are defined as 2-ary functions where 1st argument is log event, and
380380+second argument is any term that can be used as a configuration for filter.
381381+Filter should return one of these 3 values:
382382+383383+- `:stop` - which will immediately discard message and do not run any additional
384384+ filters.
385385+- `:ignore` - which mean that given filter didn't recognise the given message
386386+ and leaves it up to other filters to decide on the action. If all filters
387387+ return `:ignore` then `:filter_default` option for the handler will be taken.
388388+ By default it is `:log`, which mean that message will be logged, but default
389389+ handler has it set to `:stop` by default, which mean, that non-matching
390390+ messages will be discarded.
391391+- Just log event (possibly modified) that will cause next filter to be called
392392+ with altered message. The message returned by the last filter (or in case of
393393+ `:ignore` return, previous filters) will be the message passed to handler.
394394+395395+### Preventing some messages from being logged
396396+397397+Most common use-case for filters will probably be rejecting messages that aren't
398398+important for us. [Erlang even prepared some useful filters][logger_filters]:
399399+400400+- `domain` - allow filtering by metadata `:domain` field (remember as I said
401401+ that metadata is for filtering?). It supports multiple possible relations
402402+ between the log domain and defined domain.
403403+- `level` - allow filtering (in or out) messages depending on their level, in
404404+ both directions. It will allow you to filter messages with higher level for
405405+ some handlers. Just remember, that it will not receive messages that will not
406406+ pass primary/module level.
407407+- `progress` - filters all reports from `supervisor` and
408408+ `application_controller`. Simply, reduces startup/process shutdown chatter
409409+ that often is meaningless for most time.
410410+- `remote_gl` - filters messages coming from group leader on another node.
411411+ Useful when you want to discard/log messages coming from other nodes in
412412+ cluster.
413413+414414+### Modifying a message
415415+416416+Sometimes there is need to alter messages in the system. For example we may need
417417+to prevent sensitive information from being logged. When using "old" Elixir
418418+approach you could abuse translators, but that was error prone, as first
419419+successful translator was breaking pipeline, so you couldn't just smash one on
420420+top and then keep rest working as is. With "new" approach and structured logging
421421+you can just traverse the report and replace all occurrences of the unsafe data
422422+with anonymised data. For example:
423423+424424+```elixir
425425+def filter_out_password(%{msg: {:report, report}} = event, _opts) do
426426+ %{event | msg: {:report, replace(report)}}
427427+end
428428+429429+@filtered "[FILTERED]"
430430+431431+defp replace(%{password: _} = map) do
432432+ for {k, v} <- %{map | password: @filtered}, into: %{} do
433433+ {k, replace(v)}
434434+ end
435435+end
436436+437437+defp replace(%{"password" => _} = map) do
438438+ for {k, v} <- %{map | "password" => @filtered}, into: %{} do
439439+ {k, replace(v)}
440440+ end
441441+end
442442+443443+defp replace(list) when is_list(list) do
444444+ for elem <- list do
445445+ case elem do
446446+ {:password, _} -> {:password, @filtered}
447447+ {"password", _} -> {"password", @filtered}
448448+ {k, v} -> {k, replace(v)}
449449+ other -> replace(other)
450450+ end
451451+ end
452452+end
453453+454454+defp replace(other), do: other
455455+```
456456+457457+This snippet will replace all occurrences of `:password` or `"password"` with
458458+filtered out value.
459459+460460+The disadvantage of such approach - it will make all messages with such fields
461461+allowed in case if your filter has `:filter_default` set to `:stop`. That mean,
462462+that if you want to make some of them rejected anyway, then you will need to
463463+manually add additional step to reject messages that do not fit into your
464464+patterns. Alternatively you can use `filter_default: :log` and then use opt-out
465465+logging. There currently is no way to alter the message and make other filters
466466+decide whether log it or not (as of OTP 24).
467467+468468+## Summary
469469+470470+New features and possibilities with relation to logging in Elixir 1.11 can be
471471+overwhelming. Fortunately all of the new features are optional and provided in
472472+addition to "good 'ol `Logger.info("logging")`". But for the people who works on
473473+the observability in BEAM (EEF Observability WG, Sentry, Logflare, etc.) it
474474+brings a lot of new powerful capabilities.
475475+476476+I am thrilled to see what will people create using all that power.
477477+478478+[erl-log]: https://erlang.org/doc/man/logger.html
479479+[syslog]: https://en.wikipedia.org/wiki/Syslog#Severity_level
480480+[`LoggerJSON`]: https://github.com/Nebo15/logger_json
481481+[`Ink`]: https://hex.pm/packages/ink
482482+[logger_filters]: https://erlang.org/doc/man/logger_filters.html
483483+[logger-purge]: https://hexdocs.pm/logger/Logger.html#module-application-configuration
484484+[formatter_cb]: https://erlang.org/doc/man/logger.html#formatter-callback-functions