Fault detection and integrity monitoring for kernel isolation structures

fdir

Integrity monitoring for kernel isolation structures.

fdir is a periodic integrity checker. It takes known-good snapshots of kernel isolation structures (page tables, seccomp filters, cgroup configs, memory mappings) at boot, then re-checks the live state against them on a configurable timer to detect radiation-induced corruption. It is designed for space-grade Linux systems, where single-event upsets can silently corrupt kernel state.

The library hashes /proc/self/maps, /proc/self/status, and /proc/self/cgroup using SHA-256 and runs as an Eio daemon fiber that periodically re-checks against the baseline. Anomalies are classified by severity (Log, Isolate, Restart, Degrade, Safe_mode) based on how many subsystems have diverged.
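The baseline-and-compare idea described above can be sketched with the standard library alone. This is an illustration, not fdir's implementation: `Digest` (MD5) stands in for SHA-256 so the example has no dependencies, the `procfs` record and the `snapshot` and `diverged` functions are hypothetical names, and pairing /proc/self/status with the Seccomp subsystem is an assumption.

```ocaml
(* Stand-in sketch: Digest (MD5, from the OCaml standard library) replaces
   SHA-256 here only so that the example needs no extra packages. *)

type subsystem = Memory_maps | Seccomp | Cgroups

(* Hypothetical in-memory view of the three monitored /proc files. *)
type procfs = { maps : string; status : string; cgroup : string }

(* A snapshot pairs each subsystem with the digest of its source text.
   Mapping [status] to [Seccomp] is an assumption made for this sketch. *)
type snapshot = (subsystem * Digest.t) list

let snapshot (fs : procfs) : snapshot =
  [ (Memory_maps, Digest.string fs.maps);
    (Seccomp, Digest.string fs.status);
    (Cgroups, Digest.string fs.cgroup) ]

(* Return the subsystems whose current digest differs from the baseline. *)
let diverged ~(baseline : snapshot) (fs : procfs) : subsystem list =
  List.filter_map
    (fun (sub, digest) ->
      if Digest.equal digest (List.assoc sub baseline) then None else Some sub)
    (snapshot fs)

(* Canned contents standing in for the state captured at boot. *)
let boot =
  { maps = "00400000-00452000 r-xp"; status = "Seccomp:\t2"; cgroup = "0::/app" }

let baseline = snapshot boot

(* Simulated corruption of the seccomp state only. *)
let upset = { boot with status = "Seccomp:\t0" }
```

Calling `diverged ~baseline boot` yields an empty list, while `diverged ~baseline upset` reports only `Seccomp`. The number of entries in that list is what a severity policy would then act on.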

Installation

Install with opam:

$ opam install fdir

If opam cannot find the package, it may not yet be released in the public opam-repository. Add the overlay repository, then install it:

$ opam repo add samoht https://tangled.org/gazagnaire.org/opam-overlay.git
$ opam update
$ opam install fdir

Usage

Snapshot at boot, check on demand

Take a baseline early in startup, then compare the live state against it whenever you want to verify that nothing has drifted:

let () =
  Eio_main.run @@ fun env ->
  let clock = Eio.Stdenv.clock env in
  let fs = Fdir.Procfs.live () in
  let baseline = Fdir.snapshot ~clock fs in
  (* ... application runs, then later: *)
  match Fdir.check ~baseline ~clock fs with
  | Fdir.Ok _ -> ()
  | Fdir.Anomaly { anomalies; _ } ->
      List.iter
        (fun a -> Fmt.epr "drift on %a@." Fdir.pp_subsystem a.Fdir.subsystem)
        anomalies

Monitor continuously

Fdir.start_daemon forks an Eio fiber that calls Fdir.check at the configured interval and invokes a handler whenever a check reports anomalies. The handler returns a severity that the application can use to decide between logging, isolating, restarting, degrading, or dropping to safe mode.

let () =
  Eio_main.run @@ fun env ->
  Eio.Switch.run @@ fun sw ->
  let clock = Eio.Stdenv.clock env in
  let fs = Fdir.Procfs.live () in
  let baseline = Fdir.snapshot ~clock fs in
  let config = Fdir.Config.v ~interval:30.0 () in
  Fdir.start_daemon
    ~sw ~clock ~config ~baseline ~fs
    ~on_anomaly:Fdir.default_handler
    ()

Fdir.default_handler returns Log for one anomaly, Degrade for two, and Safe_mode for three or more.
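As a rough illustration of that policy, the thresholds can be written as a plain function over the anomaly count. The type below mirrors the severity names from this README, but `severity_of_count` and `to_string` are hypothetical helpers rather than part of fdir. The README does not say what happens for zero anomalies, so rejecting that case is an assumption of this sketch.

```ocaml
type severity = Log | Isolate | Restart | Degrade | Safe_mode

(* Map the number of diverged subsystems to a severity using the
   thresholds quoted above. Rejecting a count of zero or less is an
   assumption of this sketch. *)
let severity_of_count = function
  | n when n <= 0 -> invalid_arg "severity_of_count: no anomalies"
  | 1 -> Log
  | 2 -> Degrade
  | _ -> Safe_mode

(* Plain-string rendering, useful when logging the decision. *)
let to_string = function
  | Log -> "Log"
  | Isolate -> "Isolate"
  | Restart -> "Restart"
  | Degrade -> "Degrade"
  | Safe_mode -> "Safe_mode"
```

An application that also wants Isolate or Restart would supply its own handler in place of the default. This mapping never produces either of them.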

API

Snapshot / check

  • Fdir.snapshot ~clock fs -- SHA-256 snapshot of all monitored subsystems.
  • Fdir.check ~baseline ~clock fs -- Ok or Anomaly { anomalies; _ } listing divergent subsystems.
  • Fdir.Procfs.live () reads real /proc; Fdir.Procfs.mock ~maps ~status ~cgroups supplies canned data for tests.

Daemon

  • Fdir.start_daemon ~sw ~clock ~config ~baseline ~fs ~on_anomaly () -- forks the periodic checker and calls on_anomaly when a check reports anomalies.
  • Fdir.Config.v ?interval ?subsystems () -- defaults are a 30 s interval and all three subsystems (Memory_maps, Seccomp, Cgroups).
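A constructor with this shape is usually written with optional labelled arguments that carry their defaults. The sketch below shows that pattern with a hypothetical `config` record; the field names and the record itself are assumptions, and only the defaults are taken from the list above.

```ocaml
type subsystem = Memory_maps | Seccomp | Cgroups

(* Hypothetical record; the real Fdir.Config.t may be shaped differently. *)
type config = { interval : float; subsystems : subsystem list }

(* Optional arguments fall back to the documented defaults. The trailing
   unit argument lets callers omit both of them. *)
let v ?(interval = 30.0) ?(subsystems = [ Memory_maps; Seccomp; Cgroups ]) () =
  { interval; subsystems }

(* All defaults. *)
let default = v ()

(* Check only cgroups, every five seconds. *)
let cgroups_only = v ~interval:5.0 ~subsystems:[ Cgroups ] ()
```

Restricting `subsystems` narrows what each check compares, which in turn caps the anomaly count a handler can see.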

Severity

Log | Isolate | Restart | Degrade | Safe_mode -- pretty-printed by Fdir.pp_severity.