@recaptime-dev's working patches + fork for Phorge, a community fork of Phabricator. (Upstream dev and stable branches are at upstream/main and upstream/stable respectively.) hq.recaptime.dev/wiki/Phorge

Allow Drydock Blueprints to control "supplemental allocation" behavior so all hosts in an Almanac pool get used

Summary:
Fixes T12145. Ref T13210. See PHI570. See PHI536.

Currently, when you give Drydock an Almanac host pool with more than one host, it never voluntarily builds a second host resource: there is no way to say "maximum X working copies per host" (only "maximum X global working copies") to make the first host overflow, and the allocator tries to pack resources as tightly as possible.

If you can force it to allocate the 2nd..Nth host, things will work reasonably well from there (it will spread working copies across the hosts randomly), but tricking it is very hard, especially before D19761.

To deal with this, give blueprints a new behavior around "supplemental allocations". The idea here is that a blueprint may decide that it would prefer to allocate a fresh new resource instead of allowing an otherwise valid acquisition to occur.

These supplemental allocations follow all the normal allocation rules (they can't exceed limits or actually replace existing resources), so they can only happen if there's free space in the resource pool. But a blueprint can elect for a supplemental allocation to provide a "grow the pool" hint.

The only useful policies here are probably "true" (immediately use all resources, like Almanac) or "false" (pack resources as efficiently as possible), but some other policies //might// be useful (for example, "start growing the pool when we're getting a bit full even if we aren't at the limit yet, since our workload is bursty").
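The three policies mentioned above can be sketched as plain predicates. This is a toy Python model rather than Drydock code: `PoolState`, its fields, the function names, and the 0.75 threshold are all hypothetical and exist only to illustrate the idea of a "grow the pool" hint.

```python
from dataclasses import dataclass


@dataclass
class PoolState:
    """Hypothetical snapshot of a blueprint's pool (not a Drydock type)."""
    active_leases: int   # leases currently held across existing resources
    lease_capacity: int  # leases the existing resources could hold in total


def always_grow(state: PoolState) -> bool:
    # The "true" policy: use every resource as soon as possible.
    return True


def never_grow(state: PoolState) -> bool:
    # The "false" policy: pack leases onto existing resources.
    return False


def grow_when_nearly_full(state: PoolState, threshold: float = 0.75) -> bool:
    # A bursty-workload policy: hint at growth before the pool is full.
    # The threshold value is an arbitrary assumption for illustration.
    if state.lease_capacity <= 0:
        return True
    return state.active_leases / state.lease_capacity >= threshold
```

In every case the hint is soft: the normal allocation limits still decide whether a new resource can actually be built.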

Then, give Almanac host resources a "true" policy (always allocate supplemental resources) so they use all hosts once the number of concurrent jobs is similar to the number of hosts.

One aspect of this approach is that we only do supplemental resources if the normal allocation algorithm already decided that the best resource to acquire was part of the same blueprint. I started with an approach like "look at all the blueprints and see if any of them want to be greedy", but then a not-very-desirable blueprint would end up filling up its whole pool before we skipped the supplemental allocation part and ended up picking a different resource. That felt a bit silly and this feels a little cleaner and more focused.

Test Plan:
- Without changing the Almanac blueprint policy, allocated hosts. Got A, A, A, A, ... (second host never used).
- Changed the Almanac policy.
- Allocated hosts, got A, B, random mix of A and B.
- Destroyed B. Destroyed all leases on A. Allocated. Got A. This tests the "don't build a supplemental resource if there are no leases on the natural resource" rule.
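
The test plan can be replayed against a small simulation. This is a toy Python model under simplifying assumptions, not the Drydock allocator: `ToyPool` and its methods are hypothetical, each host holds unlimited leases, and new resources become usable immediately (the real worker may yield instead).

```python
import random


class ToyPool:
    """Hypothetical two-host pool; resources are identified by host name."""

    def __init__(self, hosts, supplemental):
        self.free_hosts = list(hosts)  # hosts with no resource built yet
        self.leases = {}               # built resource -> active lease count
        self.supplemental = supplemental

    def _build(self):
        host = self.free_hosts.pop(0)
        self.leases[host] = 0
        return host

    def lease(self, rng):
        if not self.leases:
            chosen = self._build()
        else:
            # The normal algorithm picks some existing resource.
            chosen = rng.choice(sorted(self.leases))
            # Only build a supplemental resource if the chosen one already
            # has leases, the policy asks for it, and there is free space.
            if self.leases[chosen] > 0 and self.supplemental and self.free_hosts:
                chosen = self._build()
        self.leases[chosen] += 1
        return chosen

    def destroy(self, host):
        del self.leases[host]
        self.free_hosts.append(host)

    def release_all(self, host):
        self.leases[host] = 0


rng = random.Random(0)

packed = ToyPool(["A", "B"], supplemental=False)
packed_result = [packed.lease(rng) for _ in range(4)]

spread = ToyPool(["A", "B"], supplemental=True)
spread_result = [spread.lease(rng) for _ in range(6)]

spread.destroy("B")
spread.release_all("A")
after_reset = spread.lease(rng)
```

With the policy off, `packed_result` is all `A`. With it on, `spread_result` starts with `A`, `B` and then mixes the two. After destroying `B` and releasing the leases on `A`, the next lease lands on `A` because an unleased resource never triggers a supplemental build.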

Reviewers: amckinley

Reviewed By: amckinley

Subscribers: yelirekim, PHID-OPKG-gm6ozazyms6q6i22gyam

Maniphest Tasks: T13210, T12145

Differential Revision: https://secure.phabricator.com/D19762

+123
+10
src/applications/drydock/blueprint/DrydockAlmanacServiceHostBlueprintImplementation.php
···
     return true;
   }

+  public function shouldAllocateSupplementalResource(
+    DrydockBlueprint $blueprint,
+    DrydockResource $resource,
+    DrydockLease $lease) {
+    // We want to use every host in an Almanac service, since the amount of
+    // hardware is fixed and there's normally no value in packing leases onto a
+    // subset of it. Always build a new supplemental resource if we can.
+    return true;
+  }
+
   public function canAllocateResourceForLease(
     DrydockBlueprint $blueprint,
     DrydockLease $lease) {
+32
src/applications/drydock/blueprint/DrydockBlueprintImplementation.php
···
     DrydockResource $resource,
     DrydockLease $lease);

+  /**
+   * Return true to try to allocate a new resource and expand the resource
+   * pool instead of permitting an otherwise valid acquisition on an existing
+   * resource.
+   *
+   * This allows the blueprint to provide a soft hint about when the resource
+   * pool should grow.
+   *
+   * Returning "true" in all cases generally makes sense when a blueprint
+   * controls a fixed pool of resources, like a particular number of physical
+   * hosts: you want to put all the hosts in service, so whenever it is
+   * possible to allocate a new host you want to do this.
+   *
+   * Returning "false" in all cases generally make sense when a blueprint
+   * has a flexible pool of expensive resources and you want to pack leases
+   * onto them as tightly as possible.
+   *
+   * @param DrydockBlueprint The blueprint for an existing resource being
+   *   acquired.
+   * @param DrydockResource The resource being acquired, which we may want to
+   *   build a supplemental resource for.
+   * @param DrydockLease The current lease performing acquisition.
+   * @return bool True to prefer allocating a supplemental resource.
+   *
+   * @task lease
+   */
+  public function shouldAllocateSupplementalResource(
+    DrydockBlueprint $blueprint,
+    DrydockResource $resource,
+    DrydockLease $lease) {
+    return false;
+  }

 /* -( Resource Allocation )------------------------------------------------ */
+9
src/applications/drydock/storage/DrydockBlueprint.php
···
     return $interface;
   }

+  public function shouldAllocateSupplementalResource(
+    DrydockResource $resource,
+    DrydockLease $lease) {
+    return $this->getImplementation()->shouldAllocateSupplementalResource(
+      $this,
+      $resource,
+      $lease);
+  }
+

 /* -( PhabricatorApplicationTransactionInterface )------------------------- */
+72
src/applications/drydock/worker/DrydockLeaseUpdateWorker.php
···
     $allocated = false;
     foreach ($resources as $resource) {
       try {
+        $resource = $this->newResourceForAcquisition($resource, $lease);
         $this->acquireLease($resource, $lease);
         $allocated = true;
         break;
···
         // If a resource was reclaimed or destroyed by the time we actually
         // got around to acquiring it, we just got unlucky. We can yield and
         // try again later.
+        $yields[] = $ex;
+      } catch (PhabricatorWorkerYieldException $ex) {
+        // We can be told to yield, particularly by the supplemental allocator
+        // trying to give us a supplemental resource.
         $yields[] = $ex;
       } catch (Exception $ex) {
         $exceptions[] = $ex;
···
           $blueprint->getClassName(),
           'acquireLease()'));
     }
+  }
+
+  private function newResourceForAcquisition(
+    DrydockResource $resource,
+    DrydockLease $lease) {
+
+    // If the resource has no leases against it, never build a new one. This is
+    // likely already a new resource that just activated.
+    $viewer = $this->getViewer();
+
+    $statuses = array(
+      DrydockLeaseStatus::STATUS_PENDING,
+      DrydockLeaseStatus::STATUS_ACQUIRED,
+      DrydockLeaseStatus::STATUS_ACTIVE,
+    );
+
+    $leases = id(new DrydockLeaseQuery())
+      ->setViewer($viewer)
+      ->withResourcePHIDs(array($resource->getPHID()))
+      ->withStatuses($statuses)
+      ->setLimit(1)
+      ->execute();
+    if (!$leases) {
+      return $resource;
+    }
+
+    // If we're about to get a lease on a resource, check if the blueprint
+    // wants to allocate a supplemental resource. If it does, try to perform a
+    // new allocation instead.
+    $blueprint = $resource->getBlueprint();
+    if (!$blueprint->shouldAllocateSupplementalResource($resource, $lease)) {
+      return $resource;
+    }
+
+    // If the blueprint is already overallocated, we can't allocate a new
+    // resource. Just return the existing resource.
+    $remaining = $this->removeOverallocatedBlueprints(
+      array($blueprint),
+      $lease);
+    if (!$remaining) {
+      return $resource;
+    }
+
+    // Try to build a new resource.
+    try {
+      $new_resource = $this->allocateResource($blueprint, $lease);
+    } catch (Exception $ex) {
+      $blueprint->logEvent(
+        DrydockResourceAllocationFailureLogType::LOGCONST,
+        array(
+          'class' => get_class($ex),
+          'message' => $ex->getMessage(),
+        ));
+
+      return $resource;
+    }
+
+    // If we can't actually acquire the new resource yet, just yield.
+    // (We could try to move forward with the original resource instead.)
+    $acquirable = $this->removeUnacquirableResources(
+      array($new_resource),
+      $lease);
+    if (!$acquirable) {
+      throw new PhabricatorWorkerYieldException(15);
+    }
+
+    return $new_resource;
   }
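
The order of the guards in `newResourceForAcquisition()` can be summarized as a small decision function. This is a Python sketch for illustration only: the `Outcome` enum, the `decide` function, and its boolean parameters are hypothetical stand-ins for the queries and calls made in the PHP above.

```python
from enum import Enum


class Outcome(Enum):
    USE_EXISTING = "acquire the original resource"
    USE_NEW = "acquire the new supplemental resource"
    YIELD = "yield and retry later"


def decide(has_leases: bool,
           wants_supplemental: bool,
           blueprint_has_room: bool,
           allocation_succeeds: bool,
           new_resource_acquirable: bool) -> Outcome:
    # 1. A resource with no pending, acquired, or active leases is likely
    #    brand new, so never build another one.
    if not has_leases:
        return Outcome.USE_EXISTING
    # 2. The blueprint's policy hook decides whether to try growing the pool.
    if not wants_supplemental:
        return Outcome.USE_EXISTING
    # 3. Normal limits still apply: an overallocated blueprint cannot grow.
    if not blueprint_has_room:
        return Outcome.USE_EXISTING
    # 4. A failed allocation is logged and the original resource is used.
    if not allocation_succeeds:
        return Outcome.USE_EXISTING
    # 5. A new resource that is not acquirable yet causes a yield.
    if not new_resource_acquirable:
        return Outcome.YIELD
    return Outcome.USE_NEW
```

Every early exit falls back to the original resource except the last one, which matches the comment in the diff noting that the worker could instead move forward with the original resource.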