fix: treat non-ASCII bytes as word characters to avoid splitting UTF-8

isWordChar now treats bytes >= 0x80 as word constituents, so word
motion no longer stops inside a multi-byte UTF-8 sequence such as the
encoding of an accented character. Added regression tests using
café/über text.