@recaptime-dev's working patches + fork for Phorge, a community fork of Phabricator. (Upstream dev and stable branches are at upstream/main and upstream/stable respectively.) hq.recaptime.dev/wiki/Phorge


Write "Why does Phabricator need so many databases?"

Summary: We will sell you as many new databases as you want, cheap! Just $1 per database!

Test Plan: (O).(O)

Reviewers: chad

Reviewed By: chad

Differential Revision: https://secure.phabricator.com/D15249

+134 -4
+3 -4
src/docs/contributor/database.diviner
@@ -28,11 +28,10 @@
 =========

 Each Phabricator application has its own database. The names are prefixed by
-`phabricator_` (this is configurable). This design has two advantages:
+`phabricator_` (this is configurable).

-  - Each database is easier to comprehend and to maintain.
-  - We don't do cross-database joins so each database can live on its own
-    machine. This gives us flexibility in sharding data later.
+Phabricator uses a separate database for each application. To understand why,
+see @{article:Why does Phabricator need so many databases?}.

 Connections
 ===========
+131
src/docs/flavor/so_many_databases.diviner
@@ -0,0 +1,131 @@
+@title Why does Phabricator need so many databases?
+@group lore
+
+Phabricator uses about 60 databases (and we may have added more by the time you
+read this document). This sometimes comes as a surprise, since you might assume
+it would only use one database.
+
+The approach we use is designed to work at scale for huge installs with many
+thousands of users. We care a lot about working well for large installs, and
+about scaling up gracefully to meet the needs of growing organizations. We want
+small startups to be able to install Phabricator and have it grow with them as
+they expand to many thousands of employees.
+
+A cost of this approach is that it makes Phabricator more difficult to install
+on shared hosts which require a lot of work to create or authorize access to
+each database. However, Phabricator does a lot of advanced or complex things
+which are difficult to configure or manage on shared hosts, and we don't
+recommend installing it on a shared host. The install documentation explicitly
+discourages installing on shared hosts.
+
+Broadly, in cases where we must choose between operating well at scale for
+growing organizations and installing easily on shared hosts, we prioritize
+operating at scale.
+
+
+Listing Databases
+=================
+
+You can get a full list of the databases Phabricator needs with `bin/storage
+databases`. It will look something like this:
+
+```
+$ /core/lib/phabricator/bin/storage databases
+secure_audit
+secure_calendar
+secure_chatlog
+secure_conduit
+secure_countdown
+secure_daemon
+secure_differential
+secure_draft
+secure_drydock
+secure_feed
+...<dozens more databases>...
+```
+
+Roughly, each application has its own database, and then there are some
+databases which support internal systems or shared infrastructure.
+
+
+Operating at Scale
+==================
+
+This storage design is aimed at large installs that may need more than one
+physical database server to handle the load the install generates.
+
+The primary reason we use a database per application is to allow large installs to
+scale up by spreading database load across more hardware. A large organization
+with many thousands of active users may find themselves limited by the capacity
+of a single database backend.
+
+If so, they can launch a second backend, move some applications over to it, and
+continue piling on more users.
+
+This can't continue forever, but provides a substantial amount of headroom for
+large installs to spread the workload across more hardware and continue scaling
+up.
+
+To make this possible, we put each application in its own database and use
+database boundaries to enforce the logical constraints that the application
+must have in order for this to work. For example, we can not perform joins
+between separable tables, because they may not be on the same hardware.
+
+Establishing boundaries with application databases is a simple, straightforward
+way to partition storage and make administrative operations like spreading load
+realistic.
+
+
+Ease of Development
+===================
+
+This design is also easier for us to work with, and easier for users who
+want to work with the raw database data to understand and interact with.
+
+We have a large number of tables (more than 400) and we can not reasonably
+reduce the number of tables very much (each table generally represents some
+meaningful type of object in some application). It's easier to develop with
+tables which are organized into separate application databases, just like it's
+easier to work with a large project if you organize source files into
+directories.
+
+If you aren't developing Phabricator and never look at the data in the
+database, you probably don't benefit from this organization. However, if you
+are a developer or want to extend Phabricator or look under the hood, it's
+easier to find what you're looking for and work with the tables and data when
+they're organized by application.
+
+
+Databases Have No Cost
+======================
+
+In almost all cases, creating databases has zero cost, just like organizing
+source code into directories has zero cost.
+
+Even if we didn't derive enormous benefits from this approach at scale, there
+is little reason //not// to organize storage like this.
+
+There are a handful of administrative tasks which are very slightly more
+complex to perform on multiple databases, but these are all either automated
+with `bin/storage` or easy to build on top of the list of databases emitted by
+`bin/storage databases`.
+
+For example, you can dump all the databases with `bin/storage dump`, and you
+can destroy all the databases with `bin/storage destroy`.
+
+As mentioned above, an exception to this is that if you're installing on a
+shared host and need to jump through hoops to individually authorize access to
+each database, databases do cost something.
+
+However, this cost is an artificial cost imposed by the selected environment,
+and this is only the first of many issues you'll run into trying to install and
+run Phabricator on a shared host. These issues are why we strongly discourage
+using shared hosts, and recommend against them in the install guide.
+
+
+Next Steps
+==========
+
+Continue by:
+
+  - learning more about databases in @{article:Database Schema}.
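
The new document says per-database administrative tasks are "easy to build on top of the list of databases emitted by `bin/storage databases`". As a rough illustration of that claim, not something this commit adds, the Python sketch below parses output shaped like the sample listing and derives two things from it: the application names behind each database, and one MySQL `GRANT` statement per database (the chore that makes shared hosts tedious). The function names, the `phab_app` user, and the choice of `ALL PRIVILEGES` are assumptions made for illustration; the `secure` namespace is taken from the sample listing.

```python
from __future__ import annotations

# Sample text shaped like the listing in the document, including the echoed
# prompt line and the elision marker, which are not database names.
SAMPLE_OUTPUT = """\
$ /core/lib/phabricator/bin/storage databases
secure_audit
secure_calendar
secure_feed
...<dozens more databases>...
"""


def parse_database_list(output: str) -> list[str]:
    """Return the database names found in `bin/storage databases`-style text."""
    names = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines, an echoed shell prompt, and elision markers.
        if not line or line.startswith("$") or line.startswith("..."):
            continue
        names.append(line)
    return names


def application_names(databases: list[str], namespace: str) -> list[str]:
    """Strip the configurable namespace prefix: "secure_feed" -> "feed"."""
    prefix = namespace + "_"
    return [db[len(prefix):] for db in databases if db.startswith(prefix)]


def grant_statements(
    databases: list[str], user: str, host: str = "localhost"
) -> list[str]:
    """Build one GRANT statement per database (hypothetical privilege set).

    MySQL treats "_" in a database-level GRANT as a single-character
    wildcard, so it is escaped here to match the literal name only.
    """
    statements = []
    for db in databases:
        escaped = db.replace("_", r"\_")
        statements.append(
            f"GRANT ALL PRIVILEGES ON `{escaped}`.* TO '{user}'@'{host}';"
        )
    return statements
```

In practice the input would be the captured output of `bin/storage databases` rather than a hard-coded string. The statements are returned as text so an administrator can review them before running anything.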