@recaptime-dev's working patches + fork for Phorge, a community fork of Phabricator. (Upstream dev and stable branches are at upstream/main and upstream/stable respectively.) hq.recaptime.dev/wiki/Phorge

Make Ferret indexing more robust (UTF8, exception handling)

Summary:
Ref T12819. Two minor improvements from live data:

- Tokenize in a UTF8-aware way.
- When one document fails to index, kill the transaction explicitly (rather than leaving it hanging) so we don't cause other failures later.

Test Plan: Created some UTF8 documents locally, indexed them, got clean results.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T12819

Differential Revision: https://secure.phabricator.com/D18487

+11 -2
+7
src/applications/search/engineextension/PhabricatorFerretFulltextEngineExtension.php
@@ -55,6 +55,8 @@
       ->getNgramsFromString($ngrams_source, 'index');

     $ferret_document->openTransaction();
+
+    try {
     $this->deleteOldDocument($engine, $object, $document);

     $ferret_document->save();
@@ -85,5 +87,10 @@
         $ferret_ngrams->getTableName(),
         $chunk);
     }
+    } catch (Exception $ex) {
+      $ferret_document->killTransaction();
+      throw $ex;
+    }
+
     $ferret_document->saveTransaction();
   }
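
The try/catch added in this file can be sketched in isolation as follows. This is a minimal illustration, not Phorge's storage layer: `DemoStore` and `demo_index()` are hypothetical names, and only the method names `openTransaction()`, `killTransaction()`, and `saveTransaction()` come from the diff. Their bodies here are assumptions about how such a store might behave.

```php
<?php

// Hypothetical in-memory stand-in for a transactional store. Only the three
// transaction method names are taken from the diff; the bodies are assumed.
final class DemoStore {
  private $depth = 0;
  private $pending = array();
  private $rows = array();

  public function openTransaction() {
    $this->depth++;
    return $this;
  }

  public function write($row) {
    // Simulate a document that fails to index.
    if ($row === '') {
      throw new Exception('Refusing to write an empty row.');
    }
    $this->pending[] = $row;
    return $this;
  }

  public function saveTransaction() {
    $this->rows = array_merge($this->rows, $this->pending);
    $this->pending = array();
    $this->depth--;
    return $this;
  }

  public function killTransaction() {
    // Discard uncommitted writes and close the transaction.
    $this->pending = array();
    $this->depth--;
    return $this;
  }

  public function isInTransaction() {
    return $this->depth > 0;
  }

  public function getRows() {
    return $this->rows;
  }
}

function demo_index(DemoStore $store, array $rows) {
  $store->openTransaction();

  try {
    foreach ($rows as $row) {
      $store->write($row);
    }
  } catch (Exception $ex) {
    // Close the transaction before rethrowing, so later work on the same
    // store does not inherit a half-open transaction.
    $store->killTransaction();
    throw $ex;
  }

  $store->saveTransaction();
}

$store = new DemoStore();
demo_index($store, array('a', 'b'));

try {
  demo_index($store, array('c', ''));
} catch (Exception $ex) {
  // The failed document is discarded; the store is usable again.
}

demo_index($store, array('d'));
```

In this sketch the second call fails partway through, its partial write (`'c'`) is dropped, and the third call still commits cleanly. Without the `killTransaction()` call, `isInTransaction()` would stay true after the failure, which is the "hanging" state the summary describes.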
+4 -2
src/applications/search/ngrams/PhabricatorNgramEngine.php
@@ -26,9 +26,11 @@
           break;
       }

-      $len = (strlen($token) - 2);
+      $token_v = phutil_utf8v($token);
+      $len = (count($token_v) - 2);
       for ($ii = 0; $ii < $len; $ii++) {
-        $ngram = substr($token, $ii, 3);
+        $ngram = array_slice($token_v, $ii, 3);
+        $ngram = implode('', $ngram);
         $ngrams[$ngram] = $ngram;
       }
     }
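
To see why the byte-based loop misbehaves on multibyte input, the two versions can be compared side by side. This is a rough sketch with hypothetical helper names. `demo_utf8v()` is a stand-in for `phutil_utf8v()` built on `preg_split()` with the `u` modifier, so that the example runs without libphutil. It assumes valid UTF-8 input and may differ from the real function on invalid input.

```php
<?php

// Stand-in for phutil_utf8v(): split a UTF-8 string into characters.
// Assumes valid UTF-8 input; preg_split() returns false otherwise.
function demo_utf8v($string) {
  return preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
}

// Old behavior: slide a 3-byte window over the token.
function demo_trigrams_bytes($token) {
  $ngrams = array();
  $len = (strlen($token) - 2);
  for ($ii = 0; $ii < $len; $ii++) {
    $ngram = substr($token, $ii, 3);
    $ngrams[$ngram] = $ngram;
  }
  return array_values($ngrams);
}

// New behavior: slide a 3-character window over the token.
function demo_trigrams_utf8($token) {
  $ngrams = array();
  $token_v = demo_utf8v($token);
  $len = (count($token_v) - 2);
  for ($ii = 0; $ii < $len; $ii++) {
    $ngram = implode('', array_slice($token_v, $ii, 3));
    $ngrams[$ngram] = $ngram;
  }
  return array_values($ngrams);
}

// " héllo " with the "é" written as explicit UTF-8 bytes (0xC3 0xA9).
$token = " h\xC3\xA9llo ";

$by_bytes = demo_trigrams_bytes($token);
$by_chars = demo_trigrams_utf8($token);

// Count byte-window ngrams that are not valid UTF-8 on their own.
$broken = 0;
foreach ($by_bytes as $ngram) {
  if (preg_match('//u', $ngram) !== 1) {
    $broken++;
  }
}

// $by_chars is: " hé", "hél", "éll", "llo", "lo "
```

For this token the byte window yields six ngrams, two of which cut the two-byte `é` in half and are not valid UTF-8, while the character window yields five well-formed trigrams.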