@recaptime-dev's working patches + fork for Phorge, a community fork of Phabricator. (Upstream dev and stable branches are at upstream/main and upstream/stable respectively.) hq.recaptime.dev/wiki/Phorge

Add documentation about the script and regex linter to the user guide.

Summary:
The big, gigantic comment about the script and regex linter belongs in a more obvious place. I think this is a more obvious place. I also cleaned up a couple things.

I'll update D9084 to remove the big comment block and point here instead.

Test Plan: `bin/diviner generate --book src/docs/book/user.book`

Reviewers: #blessed_reviewers, epriestley

Reviewed By: #blessed_reviewers, epriestley

Subscribers: epriestley, Korvin

Differential Revision: https://secure.phabricator.com/D9100

Authored by William R. Otte and committed by epriestley
0ab192d2 cff721c6

+155
+2
src/docs/user/userguide/arcanist_lint.diviner
···
   - integrating and customizing built-in linters and lint bindings with
     @{article:Arcanist User Guide: Customizing Existing Linters}; or
+  - use a linter that hasn't been integrated into Arcanist with
+    @{article:Arcanist User Guide: Script and Regex Linter}; or
   - learning how to add new linters and lint engines with
     @{article:Arcanist User Guide: Customizing Lint, Unit Tests and Workflows}.
+153
src/docs/user/userguide/arcanist_lint_script_and_regex.diviner
···
+@title Arcanist User Guide: Script and Regex Linter
+@group userguide
+
+Explains how to use the Script and Regex linter to invoke an existing
+lint engine that is not integrated with Arcanist.
+
+The Script and Regex linter is a simple glue linter which runs some
+script on each path, and then uses a regex to parse lint messages from
+the script's output. (This linter uses a script and a regex to
+interpret the results of some real linter, it does not itself lint
+both scripts and regexes).
+
+Configure this linter by setting these keys in your configuration:
+
+  - `script-and-regex.script` Script command to run. This can be
+    the path to a linter script, but may also include flags or use shell
+    features (see below for examples).
+  - `script-and-regex.regex` The regex to process output with. This
+    regex uses named capturing groups (detailed below) to interpret output.
+
+The script will be invoked from the project root, so you can specify a
+relative path like `scripts/lint.sh` or an absolute path like
+`/opt/lint/lint.sh`.
+
+This linter is necessarily more limited in its capabilities than a normal
+linter which can perform custom processing, but may be somewhat simpler to
+configure.
+
+== Script... ==
+
+The script will be invoked once for each file that is to be linted, with
+the file passed as the first argument. The file may begin with a "-"; ensure
+your script will not interpret such files as flags (perhaps by ending your
+script configuration with "--", if its argument parser supports that).
+
+Note that when run via `arc diff`, the list of files to be linted includes
+deleted files and files that were moved away by the change. The linter should
+not assume the path it is given exists, and it is not an error for the
+linter to be invoked with paths which are no longer there. (Every affected
+path is subject to lint because some linters may raise errors in other files
+when a file is removed, or raise an error about its removal.)
+
+The script should emit lint messages to stdout, which will be parsed with
+the provided regex.
+
+For example, you might use a configuration like this:
+
+  "script-and-regex.script": "/opt/lint/lint.sh --flag value --other-flag --"
+
+stderr is ignored. If you have a script which writes messages to stderr,
+you can redirect stderr to stdout by using a configuration like this:
+
+  "script-and-regex.script": "sh -c '/opt/lint/lint.sh \"$0\" 2>&1'"
+
+The return code of the script must be 0, or an exception will be raised
+reporting that the linter failed. If you have a script which exits nonzero
+under normal circumstances, you can force it to always exit 0 by using a
+configuration like this:
+
+  "script-and-regex.script": "sh -c '/opt/lint/lint.sh \"$0\" || true'"
+
+Multiple instances of the script will be run in parallel if there are
+multiple files to be linted, so they should not use any unique resources.
+For instance, this configuration would not work properly, because several
+processes may attempt to write to the file at the same time:
+
+  COUNTEREXAMPLE
+  "script-and-regex.script": "sh -c '/opt/lint/lint.sh --output /tmp/lint.out \"$0\" && cat /tmp/lint.out'"
+
+There are necessary limits to how gracefully this linter can deal with
+edge cases, because it is just a script and a regex. If you need to do
+things that this linter can't handle, you can write a phutil linter and move
+the logic to handle those cases into PHP. PHP is a better general-purpose
+programming language than regular expressions are, if only by a small margin.
+
+== ...and Regex ==
+
+The regex must be a valid PHP PCRE regex, including delimiters and flags.
+
+The regex will be matched against the entire output of the script, so it
+should generally be in this form if messages are one-per-line:
+
+  /^...$/m
+
+The regex should capture these named patterns with `(?P<name>...)`:
+
+  - `message` (required) Text describing the lint message. For example,
+    "This is a syntax error.".
+  - `name` (optional) Text summarizing the lint message. For example,
+    "Syntax Error".
+  - `severity` (optional) The word "error", "warning", "autofix", "advice",
+    or "disabled", in any combination of upper and lower case. Instead, you
+    may match groups called `error`, `warning`, `advice`, `autofix`, or
+    `disabled`. These allow you to match output formats like "E123" and
+    "W123" to indicate errors and warnings, even though the word "error" is
+    not present in the output. If no severity capturing group is present,
+    messages are raised with "error" severity. If multiple severity capturing
+    groups are present, messages are raised with the highest captured
+    severity. Capturing groups like `error` supersede the `severity`
+    capturing group.
+  - `error` (optional) Match some nonempty substring to indicate that this
+    message has "error" severity.
+  - `warning` (optional) Match some nonempty substring to indicate that this
+    message has "warning" severity.
+  - `advice` (optional) Match some nonempty substring to indicate that this
+    message has "advice" severity.
+  - `autofix` (optional) Match some nonempty substring to indicate that this
+    message has "autofix" severity.
+  - `disabled` (optional) Match some nonempty substring to indicate that this
+    message has "disabled" severity.
+  - `file` (optional) The name of the file to raise the lint message in. If
+    not specified, defaults to the linted file. It is generally not necessary
+    to capture this unless the linter can raise messages in files other than
+    the one it is linting.
+  - `line` (optional) The line number of the message.
+  - `char` (optional) The character offset of the message.
+  - `offset` (optional) The byte offset of the message. If captured, this
+    supersedes `line` and `char`.
+  - `original` (optional) The text the message affects.
+  - `replacement` (optional) The text that the range captured by `original`
+    should be automatically replaced by to resolve the message.
+  - `code` (optional) A short error type identifier which can be used
+    elsewhere to configure handling of specific types of messages. For
+    example, "EXAMPLE1", "EXAMPLE2", etc., where each code identifies a
+    class of message like "syntax error", "missing whitespace", etc. This
+    allows configuration to later change the severity of all whitespace
+    messages, for example.
+  - `ignore` (optional) Match some nonempty substring to ignore the match.
+    You can use this if your linter sometimes emits text like "No lint
+    errors".
+  - `stop` (optional) Match some nonempty substring to stop processing input.
+    Remaining matches for this file will be discarded, but linting will
+    continue with other linters and other files.
+  - `halt` (optional) Match some nonempty substring to halt all linting of
+    this file by any linter. Linting will continue with other files.
+  - `throw` (optional) Match some nonempty substring to throw an error, which
+    will stop `arc` completely. You can use this to fail abruptly if you
+    encounter unexpected output. All processing will abort.
+
+Numbered capturing groups are ignored.
+
+For example, if your lint script's output looks like this:
+
+  error:13 Too many goats!
+  warning:22 Not enough boats.
+
+...you could use this regex to parse it:
+
+  /^(?P<severity>warning|error):(?P<line>\d+) (?P<message>.*)$/m
+
+The simplest valid regex for line-oriented output is something like this:
+
+  /^(?P<message>.*)$/m
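
Reviewer's aside, not part of the commit: the example regex from the new guide can be tried against the sample output before it is put into a lint configuration. The sketch below is an approximation in Python rather than the PHP PCRE matching the linter itself performs. Python's `re` module accepts the same `(?P<name>...)` group syntax, and the PHP delimiters and trailing `m` flag are assumed to translate to `re.MULTILINE`. The helper name `parse_lint_output` is hypothetical and does not exist in Arcanist.

```python
import re

# Python rendering of the guide's example regex:
#   /^(?P<severity>warning|error):(?P<line>\d+) (?P<message>.*)$/m
# The PHP delimiters are dropped and the "m" flag becomes re.MULTILINE.
LINT_REGEX = re.compile(
    r"^(?P<severity>warning|error):(?P<line>\d+) (?P<message>.*)$",
    re.MULTILINE,
)


def parse_lint_output(output):
    """Return one dict of named captures per lint message found in output.

    Hypothetical helper for experimentation only; the real linter does
    its matching in PHP.
    """
    return [match.groupdict() for match in LINT_REGEX.finditer(output)]


# Sample script output taken from the guide.
sample_output = "error:13 Too many goats!\nwarning:22 Not enough boats.\n"

for message in parse_lint_output(sample_output):
    print(message["severity"], message["line"], message["message"])
```

Because the pattern is applied to the script's whole output at once, the multiline flag is what lets `^` and `$` anchor to each individual message line.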