@title Things You Should Do Soon: Static Resources
@group sundry

Over time, you'll write more JS and CSS and eventually need to put systems in
place to manage it.

This is part of @{article:Things You Should Do Soon}, which describes
architectural problems in web applications which you should begin to consider
before you encounter them.

= Manage Dependencies Automatically =

The naive way to add static resources to a page is to include them at the top
of the page, before rendering begins, by enumerating filenames. Facebook used to
work like that:

  COUNTEREXAMPLE
  <?php

  require_js('js/base.js');
  require_js('js/utils.js');
  require_js('js/ajax.js');
  require_js('js/dialog.js');
  // ...

This was okay for a while but had become unmanageable by 2007. Because
dependencies were managed completely manually and you had to explicitly list
every file you needed in the right order, everyone copy-pasted a giant block
of this stuff into every page. The major problem this created was that each page
pulled in way too much JS, which slowed down frontend performance.

We moved to a system (called //Haste//) which declared JS dependencies in the
files using a docblock-like header:

  /**
   * @provides dialog
   * @requires utils ajax base
   */

We annotated files manually, although theoretically you could use static
analysis instead (we couldn't realistically do that because our JS was pretty
unstructured). This allowed us to pull in the entire dependency chain of a
component with one call:

  require_static('dialog');

...instead of copy-pasting every dependency.
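
To make the idea concrete, here is a minimal sketch of header-driven dependency resolution. It is not Haste's actual implementation: the `sources` table, `parseHeader()`, `buildGraph()`, `resolve()` and `requireStatic()` are all hypothetical names, and the headers for base, utils and ajax are invented for illustration (only the dialog header appears above).

```javascript
// Hypothetical file contents; only the docblock headers matter here.
const sources = {
  "js/base.js": "/**\n * @provides base\n */",
  "js/utils.js": "/**\n * @provides utils\n * @requires base\n */",
  "js/ajax.js": "/**\n * @provides ajax\n * @requires utils base\n */",
  "js/dialog.js": "/**\n * @provides dialog\n * @requires utils ajax base\n */",
};

// Read "@provides name" and "@requires a b c" out of a header.
function parseHeader(source) {
  const provides = /@provides[ \t]+(\S+)/.exec(source);
  const requires = /@requires[ \t]+([^\n*]+)/.exec(source);
  return {
    provides: provides ? provides[1] : null,
    requires: requires ? requires[1].trim().split(/\s+/) : [],
  };
}

// Index every file by the symbol it provides.
function buildGraph(files) {
  const graph = new Map();
  for (const [path, source] of Object.entries(files)) {
    const header = parseHeader(source);
    if (header.provides !== null) {
      graph.set(header.provides, { path, requires: header.requires });
    }
  }
  return graph;
}

// Depth-first walk: dependencies are emitted before whatever needs them.
function resolve(symbol, graph, ordered = [], visiting = new Set()) {
  if (ordered.includes(symbol)) return ordered;
  if (visiting.has(symbol)) throw new Error(`Dependency cycle at "${symbol}"`);
  if (!graph.has(symbol)) throw new Error(`Unknown symbol "${symbol}"`);
  visiting.add(symbol);
  for (const dep of graph.get(symbol).requires) {
    resolve(dep, graph, ordered, visiting);
  }
  visiting.delete(symbol);
  ordered.push(symbol);
  return ordered;
}

// One call pulls in the whole chain, in a loadable order.
function requireStatic(symbol) {
  const graph = buildGraph(sources);
  return resolve(symbol, graph).map((name) => graph.get(name).path);
}
```

With these made-up headers, `requireStatic("dialog")` yields the paths for base, utils, ajax and dialog in that order, which is the same list the counterexample spelled out by hand.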


= Include When Used =

The other part of this problem was that all the resources were required at the
top of the page instead of when they were actually used. This meant two things:

  - you needed to include every resource that //could ever// appear on a page;
  - if you were adding something new to two or more pages, you had a strong
    incentive to put it in base.js.

So every page pulled in a bunch of silly stuff like the CAPTCHA code (because
there was one obscure workflow involving unverified users which could
theoretically show any user a CAPTCHA on any page) and every random thing anyone
had stuck in base.js.

We moved to a system where JS and CSS tags were output **after** page rendering
had run instead. The tags still appeared at the top of the page; they were just
prepended to the rendered output rather than appended before it was sent to the
browser. (There are some complexities here, but they are beyond the immediate
scope.) This meant require_static() could appear anywhere in the code. Then we
moved all the require_static() calls to be proximate to their use sites (so
dialog rendering code would pull in dialog-related CSS and JS, for example,
rather than every page which might need a dialog), and split base.js into a
bunch of smaller files.
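
A rough sketch of the "render first, emit tags afterwards" approach might look like the following. Everything here is hypothetical (`StaticResourceResponse`, `renderDialog()`, `renderPage()`, and the `/js/<name>.js` URI layout), and it glosses over the complexities mentioned above, along with HTML escaping.

```javascript
// Collects resource names while the page renders. Hypothetical class name.
class StaticResourceResponse {
  constructor() {
    this.required = new Set(); // a Set, so repeated requires are harmless
  }

  requireStatic(name) {
    this.required.add(name);
  }

  // Assumed URI layout: one script per resource under /js/.
  renderTags() {
    return [...this.required]
      .map((name) => `<script src="/js/${name}.js"></script>`)
      .join("\n");
  }
}

// The require call sits next to the code that actually needs the resource.
// A real renderer would HTML-escape `message`; this sketch does not.
function renderDialog(response, message) {
  response.requireStatic("dialog");
  return `<div class="dialog">${message}</div>`;
}

// Rendering runs first; only then are the tags built and prepended.
function renderPage(renderBody) {
  const response = new StaticResourceResponse();
  const body = renderBody(response);
  return `${response.renderTags()}\n${body}`;
}
```

A page whose body calls `renderDialog()` gets the dialog script tag at the top of its output, and a page which never renders a dialog gets no tag at all, without either page listing resources up front.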


= Packaging =

The biggest frontend performance killer in most cases is the raw number of HTTP
requests, and the biggest hammer for addressing it is to package related JS
and CSS into larger files, so you send down all the core JS code in one big file
instead of a lot of smaller ones. Once the other groundwork is in place, this is
a relatively easy change. We started with manual package definitions and
eventually moved to automatic generation based on production data.
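
As a sketch of what a manual package definition could look like (the `packages` map, the `core.pkg.js` name, and `packageRequests()` are all made up for illustration), the idea is to replace each required resource with the package that contains it, so several files collapse into one request:

```javascript
// Hypothetical manual package definition: package file -> member resources.
const packages = {
  "core.pkg.js": ["base", "utils", "ajax"],
};

// Map the resources a page needs to the files the browser must fetch.
function packageRequests(required, packageMap) {
  // Invert the definition so each resource knows which package owns it.
  const owner = new Map();
  for (const [pkg, members] of Object.entries(packageMap)) {
    for (const member of members) {
      owner.set(member, pkg);
    }
  }

  // A Set keeps one entry per file even when many members share a package.
  const requests = new Set();
  for (const name of required) {
    requests.add(owner.get(name) ?? `${name}.js`);
  }
  return [...requests];
}
```

Here a page which needs base, utils, ajax and dialog makes two requests (`core.pkg.js` and `dialog.js`) instead of four. Generating the map from production data, as described above, would only change where `packages` comes from.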


= Caches and Serving Content =

In the simplest implementation of static resources, you write out a raw JS tag
with something like `src="/js/base.js"`. This will break disastrously as you
scale, because clients will be running with stale versions of resources. There
are a bunch of subtle problems (especially once you have a CDN), but the big one
is that if a user is browsing your site as you push/deploy, their client will
not make requests for the resources they already have in cache, so even if your
servers respond correctly to If-None-Match (ETags) and If-Modified-Since
(Last-Modified) the site will appear completely broken to everyone who was using
it when you push a breaking change to static resources.

The best way to solve this problem is to version your resources in the URI,
so each version of a resource has a unique URI:

  rsrc/af04d14/js/base.js

When you push, users will receive pages which reference the new URI so their
browsers will retrieve it.
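
One common way to get a unique URI per version is to derive the version segment from the file's content. The text above doesn't say how the `af04d14` segment was produced, so treat this as an assumption: the sketch uses the first seven hex digits of a SHA-256 digest, and `versionedUri()` is a hypothetical helper.

```javascript
const { createHash } = require("crypto");

// Build "rsrc/<version>/<path>", where <version> depends only on the content.
// Assumption: a truncated SHA-256 digest stands in for the version segment.
function versionedUri(path, content) {
  const version = createHash("sha256")
    .update(content)
    .digest("hex")
    .slice(0, 7);
  return `rsrc/${version}/${path}`;
}
```

Because the version is a function of the content, an unchanged file keeps its URI across pushes (so caches stay warm), while any edit produces a new URI which no client has cached yet.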

**But** there's a big problem once you have a bunch of web frontends:

While you're pushing, a user may make a request which is handled by a server
running the new version of the code, which delivers a page with a new resource
URI. Their browser then makes a request for the new resource, but that request
is routed to a server which has not been pushed yet, which delivers an old
version of the resource. They now have a poisoned cache: old resource data for
a new resource URI.

You can do a lot of clever things to solve this, but the solution we chose at
Facebook was to serve resources out of a database instead of off disk. Before a
push begins, new resources are written to the database so that every server is
able to satisfy both old and new resource requests.

This also made it relatively easy to do processing steps (like stripping
comments and whitespace) in one place, and just insert a minified/processed
version of CSS and JS into the database.
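
The shape of that solution can be sketched with an in-memory `Map` standing in for the database. `ResourceStore`, `publish()` and `serve()` are hypothetical names, and the content-hash version segment is an assumption carried over for illustration rather than a description of what Facebook did.

```javascript
const { createHash } = require("crypto");

// Stand-in for the shared database: every web frontend reads the same rows.
class ResourceStore {
  constructor() {
    this.rows = new Map(); // versioned URI -> resource content
  }

  // Run before the push begins. Older versions are left in place.
  publish(path, content) {
    const version = createHash("sha256")
      .update(content)
      .digest("hex")
      .slice(0, 7);
    const uri = `rsrc/${version}/${path}`;
    this.rows.set(uri, content);
    return uri;
  }

  // Any server, pushed or not, answers from the store rather than its disk.
  // An unknown version yields null instead of some other version's data.
  serve(uri) {
    return this.rows.has(uri) ? this.rows.get(uri) : null;
  }
}
```

Since a lookup is keyed by the full versioned URI, a server running old code can still answer a request for the new URI correctly, and a URI nobody has published is a miss rather than stale content under a new name. A processing step such as minification would slot into `publish()` before the row is written.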

= Reference Implementation: Celerity =

Some of the ideas discussed here are implemented in Phorge's //Celerity//
system, which is essentially a simplified version of the //Haste// system used
by Facebook.