@recaptime-dev's working patches + fork for Phorge, a community fork of Phabricator. (Upstream dev and stable branches are at upstream/main and upstream/stable respectively.) hq.recaptime.dev/wiki/Phorge


Limit remarkup URI protocol length to 32 characters to avoid expensive regex behavior

Summary:
Ref T13608. When searching for bare URIs in remarkup text, don't look for URIs with a protocol string longer than 32 characters.

This avoids a case where the regexp engine may be tricked into executing at `O(N^2)` or some similar complexity.
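A rough cost model shows why the cap helps. This sketch assumes the engine tries every starting offset and, at each one, scans word characters until the input ends or the cap is reached before failing to find "://". That is a simplification of what PCRE actually does, and `estimate_scan_steps` is a hypothetical helper, not part of Phorge. It also assumes 64-bit PHP integers.

```php
<?php
// Rough cost model (an assumption, not a measurement): at each starting
// offset the engine scans word characters until the input ends or the cap
// on the protocol length is reached, then fails to find "://".
function estimate_scan_steps(int $length, ?int $cap = null): int {
  if ($cap === null || $cap >= $length) {
    // Unbounded: offset i scans the remaining ($length - i) characters,
    // so the total is 1 + 2 + ... + $length.
    return intdiv($length * ($length + 1), 2);
  }

  // Bounded: the first ($length - $cap + 1) offsets scan $cap characters
  // each; the last ($cap - 1) offsets scan whatever input remains.
  return $cap * ($length - $cap + 1) + intdiv($cap * ($cap - 1), 2);
}

$length = 512 * 1024; // Input size used in the test plan.
printf("unbounded: %d steps\n", estimate_scan_steps($length));
printf("cap of 32: %d steps\n", estimate_scan_steps($length, 32));
```

Under this model the capped pattern does roughly 8,000 times less scanning on a 512KB input. The model only illustrates the quadratic-versus-linear shape; it does not predict wall-clock time.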

Test Plan:
- Applied remarkup to "AAAA..." (512KB).
- Before: 64 seconds to process.
- After: <10ms to process.
- Ran unit tests.
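The timing comparison in the test plan could be reproduced along these lines. This is a sketch, not the harness used for the revision: `time_pattern` is a hypothetical helper, the patterns are simplified stand-ins that omit the `MAGIC_BYTE` exclusion, and the input is smaller than 512KB so the slow case finishes quickly.

```php
<?php
// Hypothetical helper: time one pass of preg_replace_callback() over $input
// and return the elapsed time in seconds.
function time_pattern(string $pattern, string $input): float {
  $start = hrtime(true); // Monotonic clock, in nanoseconds.
  preg_replace_callback(
    $pattern,
    function (array $matches) {
      return $matches[0];
    },
    $input);
  return (hrtime(true) - $start) / 1e9;
}

// Simplified stand-ins for the old and new bare-URI patterns. The real ones
// also exclude PhutilRemarkupBlockStorage::MAGIC_BYTE from the URI part.
$before = '@(\w{3,}://[^\s]+)@';
$after = '(\w{3,32}://[^\s]+)';

// Smaller than the 512KB in the test plan; raise it to see the gap widen.
$input = str_repeat('A', 16 * 1024);

printf("before: %.4fs\n", time_pattern($before, $input));
printf("after:  %.4fs\n", time_pattern($after, $input));
```

Absolute numbers depend on the PCRE version, JIT settings, and hardware, so treat the output as a relative comparison only.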

Maniphest Tasks: T13608

Differential Revision: https://secure.phabricator.com/D21562

+35 -5
src/infrastructure/markup/markuprule/PhutilRemarkupHyperlinkRule.php
@@ -9,18 +9,47 @@
   }

   public function apply($text) {
+    static $angle_pattern;
+    static $curly_pattern;
+    static $bare_pattern;
+
+    if ($angle_pattern === null) {
+      // See T13608. Limit protocol matches to 32 characters to improve the
+      // performance of the "<protocol>://" pattern, which can take a very long
+      // time to match against long inputs if the maximum length of a protocol
+      // sequence is unrestricted.
+
+      $protocol_fragment = '\w{3,32}';
+      $uri_fragment = '[^\s'.PhutilRemarkupBlockStorage::MAGIC_BYTE.']+';
+
+      $angle_pattern = sprintf(
+        '(<(%s://%s?)>)',
+        $protocol_fragment,
+        $uri_fragment);
+
+      $curly_pattern = sprintf(
+        '({(%s://%s?)})',
+        $protocol_fragment,
+        $uri_fragment);
+
+      $bare_pattern = sprintf(
+        '(%s://%s)',
+        $protocol_fragment,
+        $uri_fragment);
+    }
+
     // Hyperlinks with explicit "<>" around them get linked exactly, without
     // the "<>". Angle brackets are basically special and mean "this is a URL
     // with weird characters". This is assumed to be reasonable because they
-    // don't appear in normal text or normal URLs.
+    // don't appear in most normal text or most normal URLs.
     $text = preg_replace_callback(
-      '@<(\w{3,}://[^\s'.PhutilRemarkupBlockStorage::MAGIC_BYTE.']+?)>@',
+      $angle_pattern,
       array($this, 'markupHyperlinkAngle'),
       $text);

     // We match "{uri}", but do not link it by default.
     $text = preg_replace_callback(
-      '@{(\w{3,}://[^\s'.PhutilRemarkupBlockStorage::MAGIC_BYTE.']+?)}@',
+      $curly_pattern,
       array($this, 'markupHyperlinkCurly'),
       $text);

@@ -31,8 +60,9 @@

     // NOTE: We're explicitly avoiding capturing stored blocks, so text like
     // `http://www.example.com/[[x | y]]` doesn't get aggressively captured.
+
     $text = preg_replace_callback(
-      '@(\w{3,}://[^\s'.PhutilRemarkupBlockStorage::MAGIC_BYTE.']+)@',
+      $bare_pattern,
       array($this, 'markupHyperlinkUngreedy'),
       $text);

@@ -110,7 +140,7 @@
   }

   protected function markupHyperlinkUngreedy($matches) {
-    $match = $matches[1];
+    $match = $matches[0];
     $tail = null;
     $trailing = null;
     if (preg_match('/[;,.:!?]+$/', $match, $trailing)) {
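The change from `$matches[1]` to `$matches[0]` follows from the new pattern strings. The old patterns used `@` as the PCRE delimiter, so the parentheses in the bare pattern formed capture group 1. In the new patterns the outermost parentheses are the delimiters themselves, which leaves the bare pattern with no capture group. The sketch below illustrates this outside of Phorge; the `$magic_byte` value is an assumed stand-in for `PhutilRemarkupBlockStorage::MAGIC_BYTE`, and the sample inputs are made up.

```php
<?php
// Stand-in for PhutilRemarkupBlockStorage::MAGIC_BYTE; the byte used here
// is an assumption made for this demo.
$magic_byte = "\x01";

$protocol_fragment = '\w{3,32}';
$uri_fragment = '[^\s'.$magic_byte.']+';

// The outermost "(" and ")" are PCRE delimiters, not a capture group.
$angle_pattern = sprintf('(<(%s://%s?)>)', $protocol_fragment, $uri_fragment);
$bare_pattern = sprintf('(%s://%s)', $protocol_fragment, $uri_fragment);

// Angle form: group 1 holds the URI without the surrounding "<>".
preg_match($angle_pattern, 'see <https://example.com/x> here', $angle);

// Bare form: there is no group 1, so the URI is the whole match at index 0.
preg_match($bare_pattern, 'see https://example.com/x here', $bare);

// A "protocol" longer than 32 characters is not rejected outright: the
// match simply starts at its last 32 word characters.
preg_match($bare_pattern, str_repeat('a', 40).'://x', $long);

var_dump($angle[1], count($bare), $bare[0], strlen($long[0]));
```

The last case is worth noting when reading the commit title: the limit bounds how far the engine looks for a protocol at each offset, but text such as a 40-character word followed by `://x` still yields a match that begins partway through the word.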