@recaptime-dev's working patches + fork for Phorge, a community fork of Phabricator. (Upstream dev and stable branches are at upstream/main and upstream/stable respectively.) hq.recaptime.dev/wiki/Phorge

Instead of retrying safe reads 3 times, retry each eligible service once

Summary: Ref T13286. When retrying a read request, keep retrying as long as we have candidate services. Since we consume a service with each attempt, there's no real reason to abort early, and trying every service allows reads to always succeed even if (for example) 8 nodes of a 16-node cluster are dead because of a severed network link between datacenters.

Test Plan: Ran `git pull` in a clustered repository with an up node and a down node; saw retry count dynamically adjust to available node count.

Maniphest Tasks: T13286

Differential Revision: https://secure.phabricator.com/D20777
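
The behavior described in the Summary can be illustrated with a small standalone sketch. It is not Phorge code: `read_with_fallback()`, the `nodeN` names, and the `$is_up` callback are hypothetical, and the sketch only assumes that each attempt consumes one candidate service and that the request stops at the first success.

```php
<?php
// Hypothetical sketch; none of these names are part of Phorge's API.

/**
 * Try each candidate service at most once, in order.
 *
 * @param array    $refs    Candidate service identifiers (strings).
 * @param callable $attempt function (string $ref): bool, true on success.
 * @return array   array($served_by_or_null, $attempts_made)
 */
function read_with_fallback(array $refs, callable $attempt): array {
  $attempts = 0;
  while ($refs) {
    // Each attempt consumes one service, so the loop always terminates.
    $ref = array_shift($refs);
    $attempts++;
    if ($attempt($ref)) {
      return array($ref, $attempts);
    }
  }
  // Every candidate failed; there is nothing left to fall back to.
  return array(null, $attempts);
}

// A 16-node cluster in which node1..node8 are unreachable.
$nodes = array();
for ($i = 1; $i <= 16; $i++) {
  $nodes[] = 'node'.$i;
}
$is_up = function (string $ref): bool {
  return (int)substr($ref, 4) > 8;
};

list($served_by, $attempts) = read_with_fallback($nodes, $is_up);
echo $served_by.' after '.$attempts." attempts\n";
```

With a fixed cap of 3 attempts this read would have given up at node3; when the limit is the number of candidates, it reaches node9 on the ninth attempt.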

+6 -15
src/applications/diffusion/ssh/DiffusionGitUploadPackSSHWorkflow.php
···
       ->setCommandChannelFromExecFuture($future)
       ->execute();

-    $err = 1;
-
     // TODO: Currently, when proxying, we do not write an event log on the
     // proxy. Perhaps we should write a "proxy log". This is not very useful
     // for statistics or auditing, but could be useful for diagnostics.
···
       // the same host.
       array_shift($refs);

-      // Check if we have more services we can try. If we do, we'll make an
-      // effort to fall back to them below. If not, we can't do anything to
-      // recover so just bail out.
-      if (!$refs) {
-        return $err;
-      }
-
-      $should_retry = $this->shouldRetryRequest();
+      $should_retry = $this->shouldRetryRequest($refs);
       if (!$should_retry) {
         return $err;
       }
···
     return $this;
   }

-  private function shouldRetryRequest() {
+  private function shouldRetryRequest(array $remaining_refs) {
     $this->requestFailures++;

     if ($this->requestFailures > $this->requestAttempts) {
···
           "missing call to \"didBeginRequest()\".\n"));
     }

-    $max_failures = 3;
-    if ($this->requestFailures >= $max_failures) {
+    if (!$remaining_refs) {
       $this->writeClusterEngineLogMessage(
         pht(
-          "# Reached maximum number of retry attempts, giving up.\n"));
+          "# All available services failed to serve the request, ".
+          "giving up.\n"));
       return false;
     }

···
       pht(
         "# Service request failed, retrying (making attempt %s of %s).\n",
         new PhutilNumber($this->requestAttempts + 1),
-        new PhutilNumber($max_failures)));
+        new PhutilNumber($this->requestAttempts + count($remaining_refs))));

     return true;
   }
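
The new "attempt %s of %s" total is the number of attempts already made plus the number of services still untried. The sketch below works through that arithmetic outside of Phorge. `retry_message()` is a hypothetical helper, and it assumes `requestAttempts` counts attempts already begun, which the diff implies through `didBeginRequest()` but does not show.

```php
<?php
// Hypothetical helper mirroring the arithmetic in the diff; not Phorge code.

/**
 * Build the retry log line, or return null when no services remain.
 *
 * @param int   $attempts_made  Attempts already begun (assumed meaning of
 *                              requestAttempts).
 * @param array $remaining_refs Services not yet tried.
 */
function retry_message(int $attempts_made, array $remaining_refs): ?string {
  if (!$remaining_refs) {
    // Mirrors the "All available services failed" branch: give up.
    return null;
  }
  return sprintf(
    'retrying (making attempt %d of %d)',
    $attempts_made + 1,
    $attempts_made + count($remaining_refs));
}

// Three services, all failing: the total stays at 3 as refs are consumed.
$refs = array('a', 'b', 'c');
$attempts_made = 0;
$messages = array();
while ($refs) {
  array_shift($refs);   // this attempt consumes one service
  $attempts_made++;
  $messages[] = retry_message($attempts_made, $refs);
}
```

Here `$messages` ends up as "attempt 2 of 3", then "attempt 3 of 3", then `null` once the last service has failed, so the reported total tracks the node count instead of a hard-coded 3.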