@recaptime-dev's working patches + fork for Phorge, a community fork of Phabricator. (Upstream dev and stable branches are at upstream/main and upstream/stable respectively.) hq.recaptime.dev/wiki/Phorge

Improve some behaviors around memory pressure when pushing many and/or large changes

Summary:
Ref T13142. When commits are pushed, we try to handle them on one of two pathways:

- Normal changes: we load these into memory and potentially apply Herald content rules to them.
- "Enormous" changes: we don't load these into memory and skip content rules for them.

The goal is to degrade gracefully when users push huge changes: they should work, just not support all the features.
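The dispatch described above can be sketched as follows (a minimal Python illustration; Phabricator itself is PHP, and the names here are hypothetical, not its actual API):

```python
# Hypothetical sketch of the two pathways; 256MB is the limit this change
# introduces (previously 1GB).
ENORMOUS_BYTE_LIMIT = 256 * 1024 * 1024

def classify_change(diff_size: int) -> str:
    """Decide which pathway a pushed change takes, by raw diff size."""
    if diff_size >= ENORMOUS_BYTE_LIMIT:
        # Never loaded into memory; Herald content rules are skipped.
        return "enormous"
    # Loaded into memory; Herald content rules may run against it.
    return "normal"

print(classify_change(1024))               # "normal"
print(classify_change(512 * 1024 * 1024))  # "enormous"
```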

However, some changes can slip through the cracks right now:

- If you push a lot of commits at once, we'll try to cache all of the changes smaller than 1GB in memory. This can require an arbitrarily large amount of RAM.
- We calculate sizes by just looking at the `strlen()` of the diff, but a changeset takes more RAM in PHP than the raw diff does. So even if a diff is "only" 500MB, it can take much more memory than that. On systems with relatively little memory available, this may result in OOM while processing changes that are close to the "enormous" limit.
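A rough illustration of why `strlen()` undercounts (Python rather than PHP, but the effect is the same: each parsed object carries interpreter overhead well beyond its raw bytes):

```python
import sys

# A crude stand-in for diff parsing: one object per line of the raw diff.
raw_diff = ("+" + "x" * 70 + "\n") * 10_000  # ~720KB of raw diff text

lines = raw_diff.splitlines()
parsed_bytes = sys.getsizeof(lines) + sum(sys.getsizeof(s) for s in lines)

# Even this trivial "parse" costs more memory than the raw text; a full
# changeset object graph (hunks, metadata, per-line objects) multiplies
# the overhead much further.
assert parsed_bytes > len(raw_diff)
print(round(parsed_bytes / len(raw_diff), 1))
```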

This change makes two improvements:

- Instead of caching everything, cache only 64MB of things.
  - For most pushes, this is the same, since they have less than 64MB of diffs.
  - For pushes of single very large changes, this is a bit slower (more CPU) since we have to do some work twice.
  - For pushes of many changes, this is slower (more CPU) since we have to do some work twice, but, critically, doesn't require unlimited memory.
- Instead of flagging changes as "enormous" at 1GB, flag them as "enormous" at 256MB.
  - This reduces how much memory is required to process the largest "non-enormous" changes.
  - This also gets us under Git's hard-coded 512MB "always binary" cutoff; see T13143.
  - This is still completely gigantic and way larger than any normal change should be.
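The 64MB cap amounts to a best-effort cache: entries are kept only while the running total fits, and anything past the cap is simply recomputed later. A sketch of that policy (Python, mirroring the PHP change in spirit; names are illustrative):

```python
CACHE_LIMIT = 64 * 1024 * 1024  # 64MB, as in the patch

class ChangesetCache:
    """Caches parsed changesets only while the total raw size fits the cap."""

    def __init__(self):
        self._entries = {}
        self._total = 0

    def maybe_store(self, identifier, changesets, size):
        # Past the cap we skip caching; later consumers reparse on demand,
        # trading extra CPU for a bounded memory footprint.
        if self._total + size <= CACHE_LIMIT:
            self._entries[identifier] = changesets
            self._total += size
            return True
        return False

    def get(self, identifier):
        return self._entries.get(identifier)

cache = ChangesetCache()
print(cache.maybe_store("abc123", ["..."], 48 * 1024 * 1024))  # True: fits
print(cache.maybe_store("def456", ["..."], 32 * 1024 * 1024))  # False: would exceed cap
```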

An additional improvement would be to try to reduce the amount of memory we need to use to hold a change in process memory. I think the other changes here alone will fix the immediate issue in PHI657, but it would be nice if the "largest non-enormous change" required only a couple gigs of RAM.

Test Plan:
- Used `ini_set('memory_limit', '1G')` to artificially limit memory to 1GB.
- Pushed a series of two commits adding two 550MB text files. (Temporarily, I added a `--binary` flag to trick Git into showing real diffs for these; see T13143.)
- Got a memory limit error.
- Applied the "cache only 64MB of stuff" and "consider 256MB, not 1GB, to be enormous" changes.
- Pushed again, got properly rejected as enormous.
- Added `memory_get_usage()` calls to compare actual memory use against the reported "size" estimate. For these changes, a 639MB diff required 31,479MB of memory, a factor of about 50x. This is, uh, pretty not great.
- Allowed enormous changes, pushed again, push went through.

Reviewers: amckinley

Maniphest Tasks: T13142

Differential Revision: https://secure.phabricator.com/D19455

+23 -5
+22 -4
src/applications/diffusion/engine/DiffusionCommitHookEngine.php
```diff
···
   private $rejectDetails;
   private $emailPHIDs = array();
   private $changesets = array();
+  private $changesetsSize = 0;


 /* -( Config )------------------------------------------------------------- */
···
       return;
     }

+    // See T13142. Don't cache more than 64MB of changesets. For normal small
+    // pushes, caching everything here can let us hit the cache from Herald if
+    // we need to run content rules, which speeds things up a bit. For large
+    // pushes, we may not be able to hold everything in memory.
+    $cache_limit = 1024 * 1024 * 64;
+
     foreach ($content_updates as $update) {
       $identifier = $update->getRefNew();
       try {
-        $changesets = $this->loadChangesetsForCommit($identifier);
-        $this->changesets[$identifier] = $changesets;
+        $info = $this->loadChangesetsForCommit($identifier);
+        list($changesets, $size) = $info;
+
+        if ($this->changesetsSize + $size <= $cache_limit) {
+          $this->changesets[$identifier] = $changesets;
+          $this->changesetsSize += $size;
+        }
       } catch (Exception $ex) {
         $this->changesets[$identifier] = $ex;
···
     $changes = $parser->parseDiff($raw_diff);
     $diff = DifferentialDiff::newEphemeralFromRawChanges(
       $changes);
-    return $diff->getChangesets();
+
+    $changesets = $diff->getChangesets();
+    $size = strlen($raw_diff);
+
+    return array($changesets, $size);
   }

   public function getChangesetsForCommit($identifier) {
···
       return $cached;
     }

-    return $this->loadChangesetsForCommit($identifier);
+    $info = $this->loadChangesetsForCommit($identifier);
+    list($changesets, $size) = $info;
+    return $changesets;
   }

   public function loadCommitRefForCommit($identifier) {
```
+1 -1
src/applications/diffusion/herald/HeraldCommitAdapter.php
```diff
···
   }

   public static function getEnormousByteLimit() {
-    return 1024 * 1024 * 1024; // 1GB
+    return 256 * 1024 * 1024; // 256MB. See T13142 and T13143.
   }

   public static function getEnormousTimeLimit() {
```