@title Why does Phorge need so many databases?
@group lore

Phorge uses about 60 databases (and we may have added more by the time you
read this document). This sometimes comes as a surprise, since you might assume
it would only use one database.

The approach we use is designed to work at scale for huge installs with many
thousands of users. We care a lot about working well for large installs, and
about scaling up gracefully to meet the needs of growing organizations. We want
small startups to be able to install Phorge and have it grow with them as
they expand to many thousands of employees.

A cost of this approach is that it makes Phorge more difficult to install
on shared hosts, where creating or authorizing access to each database can
take a lot of work. However, Phorge does a lot of advanced or complex things
which are difficult to configure or manage on shared hosts, and we don't
recommend installing it on a shared host. The install documentation explicitly
discourages installing on shared hosts.

Broadly, in cases where we must choose between operating well at scale for
growing organizations and installing easily on shared hosts, we prioritize
operating at scale.


Listing Databases
=================

You can get a full list of the databases Phorge needs with `bin/storage
databases`. It will look something like this:

```
$ /core/lib/phorge/bin/storage databases
secure_audit
secure_calendar
secure_chatlog
secure_conduit
secure_countdown
secure_daemon
secure_differential
secure_draft
secure_drydock
secure_feed
...<dozens more databases>...
```

Roughly, each application has its own database, and then there are some
databases which support internal systems or shared infrastructure.
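If you want to work with this list from a script, a minimal sketch might look like the following. It assumes the command prints one database name per line and that each name has the form `<namespace>_<application>`, which is inferred from the sample output above (where the namespace is `secure`). The helper names `parse_database_list` and `application_of` are hypothetical and are not part of Phorge.

```python
def parse_database_list(output):
    """Return the non-empty lines of `bin/storage databases`-style output."""
    return [line.strip() for line in output.splitlines() if line.strip()]


def application_of(database, namespace):
    """Strip the assumed `<namespace>_` prefix from a database name."""
    prefix = namespace + "_"
    if not database.startswith(prefix):
        raise ValueError(f"{database!r} is not in namespace {namespace!r}")
    return database[len(prefix):]


# A few names copied from the sample listing above.
SAMPLE_OUTPUT = """\
secure_audit
secure_calendar
secure_chatlog
secure_conduit
"""

databases = parse_database_list(SAMPLE_OUTPUT)
applications = [application_of(name, "secure") for name in databases]
print(applications)
```

In practice you would feed the function the captured output of `bin/storage databases` instead of the hard-coded sample.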


Operating at Scale
==================

This storage design is aimed at large installs that may need more than one
physical database server to handle the load the install generates.

The primary reason we use a separate database for each application is to allow
large installs to scale up by spreading database load across more hardware. A
large organization with many thousands of active users may find itself
limited by the capacity of a single database backend.

If so, it can launch a second backend, move some applications over to it, and
continue piling on more users.

This can't continue forever, but it provides a substantial amount of headroom
for large installs to spread the workload across more hardware and continue
scaling up.
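As a rough illustration of what "move some applications over" means, the sketch below plans which host each application database would live on. It is not how Phorge is configured. The host names and the helpers `plan_placement` and `databases_by_host` are hypothetical and exist only to show the idea of partitioning by database.

```python
from collections import defaultdict

# Hypothetical host names, used only for illustration.
PRIMARY = "db001.example.internal"
SECONDARY = "db002.example.internal"


def plan_placement(databases, moved, default_host, second_host):
    """Map each database to a host; names listed in `moved` go to the second host."""
    return {
        name: (second_host if name in moved else default_host)
        for name in databases
    }


def databases_by_host(placement):
    """Invert a placement map into host -> sorted list of database names."""
    grouped = defaultdict(list)
    for name, host in sorted(placement.items()):
        grouped[host].append(name)
    return dict(grouped)


databases = ["secure_audit", "secure_differential", "secure_feed"]
placement = plan_placement(databases, {"secure_differential"}, PRIMARY, SECONDARY)
by_host = databases_by_host(placement)
print(by_host)
```

Because the unit of placement is a whole database, moving an application is a matter of changing one entry in the map and never requires splitting a database apart.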

To make this possible, we put each application in its own database and use
database boundaries to enforce the logical constraints that the application
must have in order for this to work. For example, we cannot perform joins
between separable tables, because they may not be on the same hardware.
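To make the join constraint concrete, here is a small sketch of combining data in application code instead of with a SQL join. Two independent in-memory SQLite connections stand in for two application databases that might live on different hardware. The table names, column names, and sample rows are all hypothetical and do not reflect Phorge's actual schema.

```python
import sqlite3

# Two separate connections: no single SQL statement can span both.
revisions_db = sqlite3.connect(":memory:")
users_db = sqlite3.connect(":memory:")

revisions_db.execute(
    "CREATE TABLE revision (id INTEGER PRIMARY KEY, title TEXT, author_id TEXT)"
)
revisions_db.executemany(
    "INSERT INTO revision (title, author_id) VALUES (?, ?)",
    [("Fix typo", "user-a"), ("Add feature", "user-b")],
)
users_db.execute("CREATE TABLE user (id TEXT PRIMARY KEY, name TEXT)")
users_db.executemany(
    "INSERT INTO user (id, name) VALUES (?, ?)",
    [("user-a", "alice"), ("user-b", "bob")],
)


def revisions_with_authors(revisions_db, users_db):
    """Load revisions, then look up their authors with a second query."""
    rows = revisions_db.execute(
        "SELECT title, author_id FROM revision ORDER BY id"
    ).fetchall()
    author_ids = sorted({author_id for _, author_id in rows})
    if not author_ids:
        return []
    placeholders = ", ".join("?" for _ in author_ids)
    names = dict(
        users_db.execute(
            f"SELECT id, name FROM user WHERE id IN ({placeholders})",
            author_ids,
        ).fetchall()
    )
    # The "join" happens here, in application code.
    return [(title, names.get(author_id)) for title, author_id in rows]


result = revisions_with_authors(revisions_db, users_db)
print(result)
```

The cost is one extra query per relationship, but the code keeps working unchanged if the two databases end up on different servers.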

Establishing boundaries with application databases is a simple, straightforward
way to partition storage and make administrative operations like spreading load
realistic.


Ease of Development
===================

This design is also easier for us to work with, and easier for users who
want to work with the raw data in the database.

We have a large number of tables (more than 400) and we cannot reasonably
reduce the number of tables very much (each table generally represents some
meaningful type of object in some application). It's easier to develop with
tables which are organized into separate application databases, just like it's
easier to work with a large project if you organize source files into
directories.

If you aren't developing Phorge and never look at the data in the
database, you probably won't benefit from this organization. However, if you
are a developer or want to extend Phorge or look under the hood, it's
easier to find what you're looking for and work with the tables when they're
organized by application.


More Databases Cost Nothing
===========================

In almost all cases, creating more databases has zero cost, just like
organizing source code into directories has zero cost. Even if we didn't derive
enormous benefits from this approach at scale, there is little reason //not//
to organize storage like this.

There are a handful of administrative tasks which are very slightly more
complex to perform on multiple databases, but these are all either automated
with `bin/storage` or easy to build on top of the list of databases emitted by
`bin/storage databases`.

For example, you can dump all the databases with `bin/storage dump`, and you
can destroy all the databases with `bin/storage destroy`.
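As one example of a task built on top of the database list, the sketch below generates a MySQL-style `GRANT` statement for each database. This is only an illustration of looping over the list. The helper `grant_statements`, the account name `phorge`, and the host `localhost` are assumptions, and your environment may call for narrower privileges than `ALL PRIVILEGES`.

```python
def grant_statements(databases, user, host):
    """Build one GRANT statement per database for the given account."""
    for value in (user, host):
        if "'" in value or "\\" in value:
            raise ValueError(f"unsafe account component: {value!r}")
    statements = []
    for name in databases:
        if "`" in name:
            raise ValueError(f"unsafe database name: {name!r}")
        statements.append(
            f"GRANT ALL PRIVILEGES ON `{name}`.* TO '{user}'@'{host}';"
        )
    return statements


# Names copied from the sample listing; in practice, use the full list
# emitted by `bin/storage databases`.
databases = ["secure_audit", "secure_calendar"]
statements = grant_statements(databases, "phorge", "localhost")
for statement in statements:
    print(statement)
```

The function only builds the statements as text, so you can review them before running anything against a real server.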

As mentioned above, there is one exception: if you're installing on a
shared host and need to jump through hoops to individually authorize access to
each database, databases do cost something.

However, this is an artificial cost imposed by the chosen environment,
and it is only the first of many issues you'll run into trying to install and
run Phorge on a shared host. These issues are why we strongly discourage
using shared hosts, and recommend against them in the install guide.


Next Steps
==========

Continue by:

 - learning more about databases in @{article:Database Schema}.