Implement WHATWG Encoding: UTF-8 and UTF-16 codecs

Pure Rust implementation of UTF-8 and UTF-16 text encoding/decoding
per the WHATWG Encoding Standard:
- UTF-8 decoder (streaming state machine with replacement/fatal modes)
- UTF-8 encoder
- UTF-16LE/BE decoders with surrogate pair handling and BOM stripping
- Encoding label lookup (case-insensitive, whitespace-trimmed)
- BOM sniffing utility
- Public API: decode(), decode_strict(), encode(), lookup(), bom_sniff()
- 93 unit tests covering edge cases (overlong sequences, surrogates,
  truncated input, BOM handling)
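
As a rough illustration of the label lookup and BOM sniffing items above, here is a minimal sketch. The names `Encoding`, `lookup_sketch`, and `bom_sniff_sketch` are hypothetical and chosen for this example only; the commit's actual types and signatures are not shown here. The label table is a small subset of the labels the standard defines.

```rust
/// Hypothetical enum for this sketch; not the commit's actual type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Encoding {
    Utf8,
    Utf16Le,
    Utf16Be,
}

/// Returns the encoding indicated by a leading BOM, plus the BOM length in bytes.
fn bom_sniff_sketch(bytes: &[u8]) -> Option<(Encoding, usize)> {
    if bytes.starts_with(&[0xEF, 0xBB, 0xBF]) {
        Some((Encoding::Utf8, 3))
    } else if bytes.starts_with(&[0xFE, 0xFF]) {
        Some((Encoding::Utf16Be, 2))
    } else if bytes.starts_with(&[0xFF, 0xFE]) {
        Some((Encoding::Utf16Le, 2))
    } else {
        None
    }
}

/// Case-insensitive lookup that trims ASCII whitespace first.
/// Only a subset of the standard's labels is listed (assumption: enough for a sketch).
fn lookup_sketch(label: &str) -> Option<Encoding> {
    let trimmed = label.trim_matches(|c: char| c.is_ascii_whitespace());
    match trimmed.to_ascii_lowercase().as_str() {
        "utf-8" | "utf8" | "unicode-1-1-utf-8" => Some(Encoding::Utf8),
        // The bare label "utf-16" maps to UTF-16LE in the WHATWG standard.
        "utf-16" | "utf-16le" => Some(Encoding::Utf16Le),
        "utf-16be" => Some(Encoding::Utf16Be),
        _ => None,
    }
}
```

With these definitions, `lookup_sketch("  UTF-8\n")` would resolve to `Encoding::Utf8`, and `bom_sniff_sketch` reports how many bytes a caller should skip when stripping the BOM.
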
No external dependencies, no unsafe.
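
To show what surrogate pair handling can look like in safe, std-only Rust, here is a minimal sketch of a UTF-16LE decoder in replacement mode. The function name `decode_utf16le_sketch` is hypothetical, and the code is a non-streaming simplification that does not strip a BOM; it is not the implementation in this commit.

```rust
/// Decodes UTF-16LE bytes in replacement mode (sketch, non-streaming).
/// Unpaired surrogates and a truncated final code unit become U+FFFD.
fn decode_utf16le_sketch(bytes: &[u8]) -> String {
    let mut out = String::new();
    let mut chunks = bytes.chunks_exact(2);
    let mut pending_high: Option<u16> = None;

    for pair in &mut chunks {
        let unit = u16::from_le_bytes([pair[0], pair[1]]);

        if let Some(high) = pending_high.take() {
            if (0xDC00..=0xDFFF).contains(&unit) {
                // Combine the high and low surrogate into one scalar value.
                let cp = 0x10000
                    + (((high as u32) - 0xD800) << 10)
                    + ((unit as u32) - 0xDC00);
                out.push(char::from_u32(cp).unwrap_or('\u{FFFD}'));
                continue;
            }
            // High surrogate not followed by a low one: replace it,
            // then process the current unit normally below.
            out.push('\u{FFFD}');
        }

        match unit {
            0xD800..=0xDBFF => pending_high = Some(unit),
            0xDC00..=0xDFFF => out.push('\u{FFFD}'), // lone low surrogate
            _ => out.push(char::from_u32(unit as u32).unwrap_or('\u{FFFD}')),
        }
    }

    // A dangling high surrogate and/or an odd trailing byte at end of input
    // yields a single replacement character.
    if pending_high.is_some() || !chunks.remainder().is_empty() {
        out.push('\u{FFFD}');
    }
    out
}
```

A fatal-mode variant would return an error at each point where this sketch pushes U+FFFD.
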

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
authored by
tangled.org
db9d5fbf
0f8e670c