cue/literal: reject invalid UTF-8 in Unquote

The scanner already rejects invalid UTF-8 byte sequences per the spec,
but literal.Unquote accepted them. For example, "\xb0" (a lone
continuation byte) was accepted by Unquote even though the parser
rejected it, because unquoteChar passed through the RuneError from
DecodeRuneInString without checking for the size==1 invalid-byte signal.

Fix unquoteChar to return errInvalidUTF8 when DecodeRuneInString
indicates an invalid byte, and update the isSimple fast path to also
bail out on RuneError (covering both invalid UTF-8 and valid U+FFFD,
the latter being handled correctly by the slow path).

Found by the fuzzer.

Signed-off-by: Daniel Martí <mvdan@mvdan.cc>
Change-Id: I047928b8cfd881bb69cfe06028afa20ee16ec537
Reviewed-on: https://review.gerrithub.io/c/cue-lang/cue/+/1235297
Unity-Result: CUE porcuepine <cue.porcuepine@gmail.com>
Reviewed-by: Matthew Sackman <matthew@cue.works>
TryBot-Result: CUEcueckoo <cueckoo@cuelang.org>