@recaptime-dev's working patches + fork for Phorge, a community fork of Phabricator. (Upstream dev and stable branches are at upstream/main and upstream/stable respectively.) hq.recaptime.dev/wiki/Phorge

Improve construction of commit queries from blame lookups

Summary:
Ref T2450. File blame tends to reference the same commit many times, and we don't currently look up these repeated commits efficiently.

In particular, for a file like `__phutil_library_map__.php`, we would issue a query with ~9,000 clauses like this:

```
(repositoryID = 1 AND commitIdentifier LIKE "XYZ%")
```

...but only a few hundred of those identifiers were unique. Instead, issue only one clause per unique identifier.
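
As a minimal sketch of the deduplication step: the change relies on libphutil's `array_fuse()`, which maps each value to itself so repeated values collapse into one key. The `fuse_identifiers()` function below is a stand-in written for illustration so the example runs without libphutil, and the blame data is made up.

```php
<?php

// Stand-in for libphutil's array_fuse(): maps each value to itself, so
// repeated values collapse into a single array key.
function fuse_identifiers(array $list) {
  if (!$list) {
    return array();
  }
  return array_combine($list, $list);
}

// Hypothetical blame output: one commit identifier per line of the file.
$blame = array('a1b2c3', 'a1b2c3', 'ffee00', 'a1b2c3', 'ffee00');

$unique = fuse_identifiers($blame);

// One clause per unique identifier instead of one clause per blamed line.
echo count($blame).' lines, '.count($unique)." clauses\n";
```

With this input, five blamed lines reduce to two identifiers, so the query would need two clauses rather than five.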

MySQL also seems to do a little better on "commitIdentifier = X" if we have the full hash, so special-case full hashes to use `=` instead of `LIKE`.
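
The clause selection can be sketched roughly as follows. `build_commit_clause()` is a hypothetical helper, and the `?` placeholders stand in for the `qsprintf()` conversions the real code uses; the 40-character check mirrors the one in the diff, which treats a 40-character identifier as a full commit hash.

```php
<?php

// Hypothetical helper: returns a clause template plus its parameters.
// A 40-character identifier is treated as a full hash and matched exactly;
// anything shorter is matched as a prefix.
function build_commit_clause($repository_id, $identifier) {
  if (strlen($identifier) == 40) {
    return array(
      '(repositoryID = ? AND commitIdentifier = ?)',
      array($repository_id, $identifier),
    );
  }

  // Assumption: identifiers here are plain hex strings, so no LIKE
  // wildcard escaping is done. Real code should escape "%" and "_".
  return array(
    '(repositoryID = ? AND commitIdentifier LIKE ?)',
    array($repository_id, $identifier.'%'),
  );
}

$full = str_repeat('a', 40);
list($sql_full, $params_full) = build_commit_clause(1, $full);
list($sql_prefix, $params_prefix) = build_commit_clause(1, 'a1b2c3d');

echo $sql_full."\n";
echo $sql_prefix."\n";
```

Only the full-hash case switches to `=`; abbreviated identifiers still need the prefix match.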

Test Plan:
- Issuing a query for only unique identifiers dropped the cost from 400ms to 100ms locally.
- Swapping to `=` if we have the full hash dropped the cost from 100ms to 75ms locally.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T2450

Differential Revision: https://secure.phabricator.com/D14962

+26 -8
src/applications/diffusion/query/DiffusionCommitQuery.php
@@ -51,6 +51,11 @@
    * they queried for.
    */
   public function withIdentifiers(array $identifiers) {
+    // Some workflows (like blame lookups) can pass in large numbers of
+    // duplicate identifiers. We only care about unique identifiers, so
+    // get rid of duplicates immediately.
+    $identifiers = array_fuse($identifiers);
+
     $this->identifiers = $identifiers;
     return $this;
   }
@@ -185,7 +190,7 @@
 
       // Build the identifierMap
       if ($this->identifiers !== null) {
-        $ids = array_fuse($this->identifiers);
+        $ids = $this->identifiers;
         $prefixes = array(
           'r'.$commit->getRepository()->getCallsign(),
           'r'.$commit->getRepository()->getCallsign().':',
@@ -395,7 +400,6 @@
         $repos->execute();
 
         $repos = $repos->getIdentifierMap();
-
         foreach ($refs as $key => $ref) {
           $repo = idx($repos, $ref['callsign']);
 
@@ -404,7 +408,7 @@
           }
 
           if ($repo->isSVN()) {
-            if (!ctype_digit($ref['identifier'])) {
+            if (!ctype_digit((string)$ref['identifier'])) {
               continue;
             }
             $sql[] = qsprintf(
@@ -419,11 +423,25 @@
             if (strlen($ref['identifier']) < $min_qualified) {
               continue;
             }
-            $sql[] = qsprintf(
-              $conn,
-              '(commit.repositoryID = %d AND commit.commitIdentifier LIKE %>)',
-              $repo->getID(),
-              $ref['identifier']);
+
+            $identifier = $ref['identifier'];
+            if (strlen($identifier) == 40) {
+              // MySQL seems to do slightly better with this version if the
+              // clause, so issue it if we have a full commit hash.
+              $sql[] = qsprintf(
+                $conn,
+                '(commit.repositoryID = %d
+                  AND commit.commitIdentifier = %s)',
+                $repo->getID(),
+                $identifier);
+            } else {
+              $sql[] = qsprintf(
+                $conn,
+                '(commit.repositoryID = %d
+                  AND commit.commitIdentifier LIKE %>)',
+                $repo->getID(),
+                $identifier);
+            }
           }
         }
       }