Make `#🐳` work properly
Summary:
Ref T6223. Two issues:
- We don't use `/u` mode on these regexps. Without `/u`, the `\w`/`\W`/`\s`/`\S` modifiers have bad behavior on non-ASCII bytes. Add the flag to use unicode mode, making `\w` and `\s` behave like we expect.
- We might possibly want to do something different here eventually (for example, if the `/u` flag has some huge performance penalty) but this seems OK for now.
- We use `\b` (word boundary) to terminate the match, but `🐳` is not a word character. Use `(?!\w)` instead ("don't match before a word character") which is what we mean.
Test Plan: {F211498}
Reviewers: btrahan, chad
Reviewed By: chad
Subscribers: epriestley
Maniphest Tasks: T6223
Differential Revision: https://secure.phabricator.com/D10618