@recaptime-dev's working patches + fork for Phorge, a community fork of Phabricator. (Upstream dev and stable branches are at upstream/main and upstream/stable respectively.) hq.recaptime.dev/wiki/Phorge

Provide more useful guidance if a repository is clusterized into an existing multi-device cluster

Summary:
Fixes T12087. When transitioning into a clustered configuration for the first time, the documentation recommends using a one-device cluster as a transitional step.

However, installs may not do this for whatever reason, and we aren't as clear as we could be in warning about clusterizing directly into a multi-device cluster.

Roughly, when you do this, we end up believing that working copies exist on several different devices, but we have no information about which copy or copies are up to date. //Usually// they were all already synchronized and are all up to date, but we can't safely assume this without risking data loss.

Instead, we err on the side of caution, and require a human to tell us which copy we should consider to be up-to-date, using `bin/repository thaw --promote`.
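
The rule described above can be sketched in a few lines of plain PHP. This is a hypothetical model, not the code in `DiffusionRepositoryClusterEngine`: the function name `classify_cluster_state`, the shape of the `$versions` map, and the "highest recorded version wins" rule for the versioned case are all assumptions made for illustration. It only shows why a multi-device cluster with no version information freezes, while a one-device cluster does not.

```php
<?php

// Hypothetical model, not Phabricator's real API. $versions maps a device
// name to its recorded repository version, or null when none is recorded.
function classify_cluster_state(array $versions): array {
  // Keep only the devices that have version information.
  $known = array_filter($versions, function ($version) {
    return $version !== null;
  });

  if (!$known) {
    if (count($versions) > 1) {
      // Several copies and no version data: refuse to guess a leader.
      return array('state' => 'frozen', 'leaders' => array());
    }
    // Zero or one device: a sole copy is trivially authoritative.
    return array('state' => 'ok', 'leaders' => array_keys($versions));
  }

  // Assumption: devices holding the highest recorded version are leaders.
  $leaders = array_keys($known, max($known), true);
  return array('state' => 'ok', 'leaders' => $leaders);
}

$state = classify_cluster_state(array(
  'local001.phacility.net' => null,
  'local002.local' => null,
));
echo $state['state'], "\n"; // prints "frozen"
```

In this model, a frozen result corresponds to the point where an operator would have to pick the authoritative copy with `bin/repository thaw --promote`.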

Test Plan:
```
$ ./bin/repository clusterize rLOCKS --service repos001.phacility.net
Service "repos001.phacility.net" is actively bound to more than one device
(local002.local, local001.phacility.net).

If you clusterize a repository onto this service it will be unclear which
devices have up-to-date copies of the repository. This leader/follower
ambiguity will freeze the repository. You may need to manually promote a
device to unfreeze it. See "Ambiguous Leaders" in the documentation for
discussion.

Continue anyway? [y/N]
```

Read other changes.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T12087

Differential Revision: https://secure.phabricator.com/D17169

+45 -8
+2 -3
src/applications/diffusion/protocol/DiffusionRepositoryClusterEngine.php
@@ -251,9 +251,8 @@
           pht(
             'Repository "%s" exists on more than one device, but no device '.
             'has any repository version information. Phabricator can not '.
-            'guess which copy of the existing data is authoritative. Remove '.
-            'all but one device from service to mark the remaining device '.
-            'as the authority.',
+            'guess which copy of the existing data is authoritative. Promote '.
+            'a device or see "Ambigous Leaders" in the documentation.',
             $repository->getDisplayName()));
       }
 
+34 -1
src/applications/repository/management/PhabricatorRepositoryManagementClusterizeWorkflow.php
@@ -61,6 +61,7 @@
           array(
             AlmanacClusterRepositoryServiceType::SERVICETYPE,
           ))
+        ->needBindings(true)
         ->executeOne();
       if (!$service) {
         throw new PhutilArgumentUsageException(
@@ -70,9 +71,41 @@
       }
     }
 
-
     if ($service) {
       $service_phid = $service->getPHID();
+
+      $bindings = $service->getActiveBindings();
+
+      $unique_devices = array();
+      foreach ($bindings as $binding) {
+        $unique_devices[$binding->getDevicePHID()] = $binding->getDevice();
+      }
+
+      if (count($unique_devices) > 1) {
+        $device_names = mpull($unique_devices, 'getName');
+
+        echo id(new PhutilConsoleBlock())
+          ->addParagraph(
+            pht(
+              'Service "%s" is actively bound to more than one device (%s).',
+              $service_name,
+              implode(', ', $device_names)))
+          ->addParagraph(
+            pht(
+              'If you clusterize a repository onto this service it may be '.
+              'unclear which devices have up-to-date copies of the '.
+              'repository. If so, leader/follower ambiguity will freeze the '.
+              'repository. You may need to manually promote a device to '.
+              'unfreeze it. See "Ambiguous Leaders" in the documentation '.
+              'for discussion.'))
+          ->drawConsoleString();
+
+        $prompt = pht('Continue anyway?');
+        if (!phutil_console_confirm($prompt)) {
+          throw new PhutilArgumentUsageException(
+            pht('User aborted the workflow.'));
+        }
+      }
     } else {
       $service_phid = null;
     }
+9 -4
src/docs/user/cluster/cluster_repositories.diviner
@@ -422,16 +422,21 @@
 =================
 
 Repository clusters can also freeze if the leader devices are ambiguous. This
-can happen if you replace an entire cluster with new devices suddenly, or
-make a mistake with the `--demote` flag. This generally arises from some kind
-of operator error, like this:
+can happen if you replace an entire cluster with new devices suddenly, or make
+a mistake with the `--demote` flag. This may arise from some kind of operator
+error, like these:
 
   - Someone accidentally uses `bin/repository thaw ... --demote` to demote
     every device in a cluster.
   - Someone accidentally deletes all the version information for a repository
     from the database by making a mistake with a `DELETE` or `UPDATE` query.
-  - Someone accidentally disable all of the devices in a cluster, then add
+  - Someone accidentally disables all of the devices in a cluster, then adds
     entirely new ones before repositories can propagate.
+
+If you are moving repositories into cluster services, you can also reach this
+state if you use `clusterize` to associate a repository with a service that is
+bound to multiple active devices. In this case, Phabricator will not know which
+device or devices have up-to-date information.
 
 When Phabricator can not tell which device in a cluster is a leader, it freezes
 the cluster because it is possible that some devices have less data and others