@title Arcanist User Guide: Script and Regex Linter
@group userguide

Explains how to use the Script and Regex linter to invoke an existing
lint engine that is not integrated with Arcanist.

The Script and Regex linter is a simple glue linter which runs some
script on each path, and then uses a regex to parse lint messages from
the script's output. (This linter uses a script and a regex to
interpret the results of some real linter, it does not itself lint
both scripts and regexes).

Configure this linter by setting these keys in your configuration:

  - `script-and-regex.script` Script command to run. This can be
    the path to a linter script, but may also include flags or use shell
    features (see below for examples).
  - `script-and-regex.regex` The regex to process output with. This
    regex uses named capturing groups (detailed below) to interpret output.

The script will be invoked from the project root, so you can specify a
relative path like `scripts/lint.sh` or an absolute path like
`/opt/lint/lint.sh`.

This linter is necessarily more limited in its capabilities than a normal
linter which can perform custom processing, but may be somewhat simpler to
configure.

== Script... ==

The script will be invoked once for each file that is to be linted, with
the file passed as the first argument. The file may begin with a `-`; ensure
your script will not interpret such files as flags (perhaps by ending your
script configuration with `--`, if its argument parser supports that).

Note that when run via `arc diff`, the list of files to be linted does not
include binary files, symlinks, deleted files, or directories. These special
file types are not supported by this linter.

The script should emit lint messages to stdout, which will be parsed with
the provided regex.
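
As a rough illustration of that contract, here is a minimal sketch of a lint script written in Python. Everything in it is hypothetical rather than part of Arcanist: the `lint_lines` and `main` names, the `MAX_LENGTH` threshold, and the "overlong line" check itself. It only shows the shape the linter expects: take the path as the first argument, print one message per line to stdout, and finish with exit status 0.

```python
#!/usr/bin/env python3
"""Hypothetical lint script: reports lines longer than MAX_LENGTH."""
import sys

MAX_LENGTH = 80  # Illustrative threshold; not something Arcanist defines.


def lint_lines(lines):
    """Return one "severity:line message" string per overlong line."""
    messages = []
    for number, text in enumerate(lines, start=1):
        length = len(text.rstrip("\n"))
        if length > MAX_LENGTH:
            messages.append(
                f"warning:{number} Line is {length} characters long."
            )
    return messages


def main(path):
    # The path is used exactly as given and never parsed as an option,
    # so a file name beginning with "-" cannot be mistaken for a flag.
    with open(path, encoding="utf-8", errors="replace") as handle:
        for message in lint_lines(handle):
            # stdout is the only stream the regex is matched against.
            print(message)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Finishing normally yields exit status 0, which the linter requires.
    main(sys.argv[1])
```

If this were saved at a hypothetical path such as `scripts/lint.py`, the configuration might look like this:

  "script-and-regex.script": "python3 scripts/lint.py"
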

For example, you might use a configuration like this:

  "script-and-regex.script": "/opt/lint/lint.sh --flag value --other-flag --"

stderr is ignored. If you have a script which writes messages to stderr,
you can redirect stderr to stdout by using a configuration like this:

  "script-and-regex.script": "sh -c '/opt/lint/lint.sh \"$0\" 2>&1'"

The return code of the script must be 0, or an exception will be raised
reporting that the linter failed. If you have a script which exits nonzero
under normal circumstances, you can force it to always exit 0 by using a
configuration like this:

  "script-and-regex.script": "sh -c '/opt/lint/lint.sh \"$0\" || true'"

Multiple instances of the script will be run in parallel if there are
multiple files to be linted, so they should not use any unique resources.
For instance, this configuration would not work properly, because several
processes may attempt to write to the file at the same time:

  COUNTEREXAMPLE
  "script-and-regex.script": "sh -c '/opt/lint/lint.sh --output /tmp/lint.out \"$0\" && cat /tmp/lint.out'"

There are necessary limits to how gracefully this linter can deal with
edge cases, because it is just a script and a regex. If you need to do
things that this linter can't handle, you can write a phutil linter and move
the logic to handle those cases into PHP. PHP is a better general-purpose
programming language than regular expressions are, if only by a small margin.

== ...and Regex ==

The regex must be a valid PHP PCRE regex, including delimiters and flags.

The regex will be matched against the entire output of the script, so it
should generally be in this form if messages are one-per-line:

  /^...$/m

The regex should capture these named patterns with `(?P<name>...)`:

  - `message` (required) Text describing the lint message. For example,
    "This is a syntax error.".
  - `name` (optional) Text summarizing the lint message.
    For example, "Syntax Error".
  - `severity` (optional) The word "error", "warning", "autofix", "advice",
    or "disabled", in any combination of upper and lower case. Instead, you
    may match groups called `error`, `warning`, `advice`, `autofix`, or
    `disabled`. These allow you to match output formats like "E123" and
    "W123" to indicate errors and warnings, even though the word "error" is
    not present in the output. If no severity capturing group is present,
    messages are raised with "error" severity. If multiple severity capturing
    groups are present, messages are raised with the highest captured
    severity. Capturing groups like `error` supersede the `severity`
    capturing group.
  - `error` (optional) Match some nonempty substring to indicate that this
    message has "error" severity.
  - `warning` (optional) Match some nonempty substring to indicate that this
    message has "warning" severity.
  - `advice` (optional) Match some nonempty substring to indicate that this
    message has "advice" severity.
  - `autofix` (optional) Match some nonempty substring to indicate that this
    message has "autofix" severity.
  - `disabled` (optional) Match some nonempty substring to indicate that this
    message has "disabled" severity.
  - `file` (optional) The name of the file to raise the lint message in. If
    not specified, defaults to the linted file. It is generally not necessary
    to capture this unless the linter can raise messages in files other than
    the one it is linting.
  - `line` (optional) The line number of the message. If no text is
    captured, the message is assumed to affect the entire file.
  - `char` (optional) The character offset of the message.
  - `offset` (optional) The byte offset of the message. If captured, this
    supersedes `line` and `char`.
  - `original` (optional) The text the message affects.
  - `replacement` (optional) The text that the range captured by `original`
    should be automatically replaced by to resolve the message.
  - `code` (optional) A short error type identifier which can be used
    elsewhere to configure handling of specific types of messages. For
    example, "EXAMPLE1", "EXAMPLE2", etc., where each code identifies a
    class of message like "syntax error", "missing whitespace", etc. This
    allows configuration to later change the severity of all whitespace
    messages, for example.
  - `ignore` (optional) Match some nonempty substring to ignore the match.
    You can use this if your linter sometimes emits text like "No lint
    errors".
  - `stop` (optional) Match some nonempty substring to stop processing input.
    Remaining matches for this file will be discarded, but linting will
    continue with other linters and other files.
  - `halt` (optional) Match some nonempty substring to halt all linting of
    this file by any linter. Linting will continue with other files.
  - `throw` (optional) Match some nonempty substring to throw an error, which
    will stop `arc` completely. You can use this to fail abruptly if you
    encounter unexpected output. All processing will abort.

Numbered capturing groups are ignored.

For example, if your lint script's output looks like this:

  error:13 Too many goats!
  warning:22 Not enough boats.

...you could use this regex to parse it:

  /^(?P<severity>warning|error):(?P<line>\d+) (?P<message>.*)$/m

The simplest valid regex for line-oriented output is something like this:

  /^(?P<message>.*)$/m
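
It can help to try a regex against sample output before configuring the linter. The following is a sketch only, and it assumes Python's `re` module as a stand-in for PHP's PCRE: both accept the `(?P<name>...)` syntax used here, but the engines are not identical, so a pattern that works in one may behave differently in the other. The PHP delimiters are dropped and the trailing `m` flag becomes `re.MULTILINE`. The names `PATTERN`, `SAMPLE_OUTPUT`, and `parse_messages` are illustrative and not part of Arcanist.

```python
import re

# The goats-and-boats regex from above, without the PHP "/.../m" wrapper.
# re.MULTILINE plays the role of the "m" flag so ^ and $ match per line.
PATTERN = re.compile(
    r"^(?P<severity>warning|error):(?P<line>\d+) (?P<message>.*)$",
    re.MULTILINE,
)

# The sample script output from above.
SAMPLE_OUTPUT = "error:13 Too many goats!\nwarning:22 Not enough boats.\n"


def parse_messages(output):
    """Return one dict of named captures per match in the script output."""
    messages = []
    for match in PATTERN.finditer(output):
        captures = match.groupdict()
        messages.append({
            # The guide states that a message without a captured severity
            # is raised as "error", and that case is ignored.
            "severity": (captures.get("severity") or "error").lower(),
            "line": int(captures["line"]),
            "message": captures["message"],
        })
    return messages


if __name__ == "__main__":
    for message in parse_messages(SAMPLE_OUTPUT):
        print(message)
```

Because the pattern is matched against the whole output rather than line by line, lines that do not fit it (such as "No lint errors") simply produce no match in this sketch.
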