···
- Comment heavily, but not redundantly. Explain the "why" behind decisions
  clearly, don't repeat the "what."
- Try to limit line length below 100 characters, but don't be afraid to break
  this rule if it improves readability.
- Do not overcorrect and make lines too short, either. Try to use as much of
  the maximum line length as possible without sacrificing readability.
···
- Add compile-time interface checks whenever a type implements an interface
  we care about. For example: `var _ io.Reader = (*MyType)(nil)`.

## Providers

- Buildkite API: <https://buildkite.com/docs/apis/rest-api.md>
+29 −13
README.md
···
go run . -addr :8080
```

-## Endpoints (planned)
+## Configuration

-- `GET /events` — WebSocket stream of pipeline status events,
-  consumed by the Tangled appview.
-- `POST /webhooks/buildkite` — Buildkite webhook receiver.
-- `POST /xrpc/sh.tangled.pipeline.cancelPipeline` — cancel a running build.
+### Required
+
+| Env var          | Description                                                 |
+| ---------------- | ----------------------------------------------------------- |
+| `TACK_HOSTNAME`  | This spindle's hostname (matches `sh.tangled.repo.spindle`) |
+| `TACK_OWNER_DID` | DID of the spindle operator                                 |

-## Configuration (planned)
+### Optional

-| Env var                | Description                         |
-| ---------------------- | ----------------------------------- |
-| `TACK_BUILDKITE_TOKEN` | Buildkite API token                 |
-| `TACK_BUILDKITE_ORG`   | Buildkite organization slug         |
-| `TACK_JETSTREAM_URL`   | Tangled Jetstream WebSocket URL     |
-| `TACK_DB_PATH`         | Local SQLite path for the event log |
-| `TACK_OWNER_DID`       | DID of the spindle operator         |
+| Env var              | Description                                              |
+| -------------------- | -------------------------------------------------------- |
+| `TACK_LISTEN_ADDR`   | HTTP listen address (default `:8080`)                    |
+| `TACK_DB_PATH`       | Local SQLite path (default `tack.db`)                    |
+| `TACK_JETSTREAM_URL` | Tangled Jetstream WebSocket URL                          |
+| `TACK_DEV`           | Use `ws://` for knot event-streams (any non-empty value) |
+
+### Buildkite
+
+Setting `TACK_BUILDKITE_TOKEN` enables Buildkite mode; when unset, tack
+runs the in-process fake provider for local development. When Buildkite
+mode is enabled, every other variable in this section is required.
+
+| Env var                         | Description                                                                    |
+| ------------------------------- | ------------------------------------------------------------------------------ |
+| `TACK_BUILDKITE_TOKEN`          | Buildkite API token (enables Buildkite mode)                                   |
+| `TACK_BUILDKITE_ORG`            | Buildkite organization slug                                                    |
+| `TACK_BUILDKITE_PIPELINE`       | Buildkite pipeline slug to fire builds on                                      |
+| `TACK_BUILDKITE_WEBHOOK_SECRET` | Shared secret for `/webhooks/buildkite` auth                                   |
+| `TACK_BUILDKITE_WEBHOOK_MODE`   | `token` (default) or `signature` — must match Buildkite's notification setting |
+78 −7
http.go
···
	"encoding/json"
	"errors"
	"fmt"
+	"io"
	"log/slog"
	"net/http"
	"strconv"
···

	"github.com/gorilla/websocket"
	"tangled.org/core/api/tangled"
+
+	"github.com/mitchellh/tack/internal/buildkite"
)

// runHTTP starts the spindle's HTTP server and blocks until ctx is
···
//
// The logger is read from ctx via loggerFrom. The broker is the
// in-process pub/sub used by /events to fan published records out to
-// connected websocket subscribers.
-func runHTTP(ctx context.Context, cfg config, br *broker, provider Provider) error {
+// connected websocket subscribers. bkProvider may be nil — when a
+// deployment runs the fake provider, /webhooks/buildkite still
+// registers but responds 503, so a misdirected Buildkite webhook
+// gets a clear "this spindle isn't accepting Buildkite events" rather
+// than a misleading 200.
+func runHTTP(ctx context.Context, cfg config, br *broker, provider Provider, bkProvider *buildkiteProvider) error {
	logger := loggerFrom(ctx)

	mux := http.NewServeMux()
···
	mux.HandleFunc("GET /events", eventsHandler(logger, br))
	mux.HandleFunc("GET /logs/{knot}/{pipelineRkey}/{workflow}", logsHandler(logger, provider))
	mux.HandleFunc("GET /xrpc/"+tangled.OwnerNSID, ownerHandler(logger, cfg.OwnerDID))
-	mux.HandleFunc("POST /webhooks/buildkite", buildkiteWebhookHandler())
+	mux.HandleFunc("POST /webhooks/buildkite", buildkiteWebhookHandler(logger, bkProvider))

	srv := &http.Server{
		Addr: cfg.Addr,
···
	}
}

-// buildkiteWebhookHandler is a placeholder until we implement Buildkite ->
-// pipeline.status translation.
-func buildkiteWebhookHandler() http.HandlerFunc {
+// buildkiteWebhookHandler receives Buildkite Pipelines webhook events,
+// authenticates the request against whichever scheme the provider was
+// configured with, and hands the decoded payload to the provider for
+// translation into a sh.tangled.pipeline.status publish.
+//
+// Authentication is intentionally fail-closed: when bk is nil (no
+// Buildkite provider configured) we 503 instead of accepting events
+// silently. The body is buffered up front because signature mode
+// HMACs the raw bytes — we can't rely on the JSON decoder reading
+// the request body before verification.
+//
+// Acknowledgement contract with Buildkite: we 200 on any well-formed
+// event we accepted (including events we deliberately ignore, like
+// job.* or builds we don't track), and 5xx only on internal failure
+// the operator should look at. A 4xx/5xx makes Buildkite retry,
+// which we don't want for "this isn't an event we care about".
+func buildkiteWebhookHandler(logger *slog.Logger, bk *buildkiteProvider) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
-		http.Error(w, "not implemented", http.StatusNotImplemented)
+		if bk == nil {
+			http.Error(w, "buildkite provider not configured",
+				http.StatusServiceUnavailable)
+			return
+		}
+
+		// Cap body size so a malicious sender can't exhaust
+		// memory; Buildkite payloads in practice are well under
+		// 64 KiB but a generous-but-bounded ceiling is the
+		// right shape here.
+		body, err := io.ReadAll(io.LimitReader(r.Body, 1<<20))
+		if err != nil {
+			logger.Warn("buildkite webhook: read body", "err", err)
+			http.Error(w, "read body", http.StatusBadRequest)
+			return
+		}
+
+		if err := bk.VerifyWebhook(r.Header, body); err != nil {
+			logger.Warn("buildkite webhook: verify failed",
+				"err", err,
+				"remote", r.RemoteAddr,
+			)
+			http.Error(w, "unauthorized", http.StatusUnauthorized)
+			return
+		}
+
+		var payload buildkite.WebhookPayload
+		if err := json.Unmarshal(body, &payload); err != nil {
+			logger.Warn("buildkite webhook: decode body", "err", err)
+			http.Error(w, "bad payload", http.StatusBadRequest)
+			return
+		}
+		// The X-Buildkite-Event header is authoritative for the
+		// event name; the body field is a convenience but doesn't
+		// always match exactly. Prefer the header.
+		if h := r.Header.Get("X-Buildkite-Event"); h != "" {
+			payload.Event = h
+		}
+
+		// Translate + publish on the request context so a slow
+		// store/broker doesn't outlive an aborted webhook
+		// connection.
+		if err := bk.HandleWebhook(r.Context(), payload); err != nil {
+			logger.Error("buildkite webhook: handle", "err", err,
+				"event", payload.Event,
+				"build_uuid", payload.Build.ID,
+			)
+			http.Error(w, "internal error", http.StatusInternalServerError)
+			return
+		}
+		w.WriteHeader(http.StatusOK)
	}
}

+346
internal/buildkite/buildkite.go
// Package buildkite is a small Buildkite REST + webhook client tack
// uses to drive its Buildkite-backed Provider implementation.
//
// The package deliberately covers a tiny slice of the upstream API
// (create build, get build, fetch job log, decode + authenticate
// webhook payloads). It exists as its own package so the rest of
// tack — particularly the Provider implementation that translates
// Tangled triggers into Buildkite builds — can stay focused on
// translation rather than HTTP plumbing.
//
// Naming convention: types here are *not* prefixed with "Buildkite".
// Imported as `buildkite.Client`, `buildkite.Build`, etc., the package
// path supplies the disambiguation already.
package buildkite

import (
	"bytes"
	"context"
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"net/http"
	"strconv"
	"strings"
	"time"
)

// APIBase is the public Buildkite REST API root. Exported as a var
// (not a const) so tests can swap it for an httptest server URL
// without hooking up a real Buildkite account.
var APIBase = "https://api.buildkite.com"

// ErrNotFound is returned by Get* methods when the upstream returns
// 404. Callers translate it to whatever shape they need (the Provider
// maps it onto its own ErrLogsNotFound for the /logs handler).
var ErrNotFound = errors.New("buildkite: not found")

// Client is a thin wrapper around net/http carrying API credentials
// + organization context so call sites don't repeat them. Safe for
// concurrent use; the embedded http.Client is goroutine-safe.
type Client struct {
	http  *http.Client
	token string
	org   string
}

// NewClient builds a Client with sensible defaults. The 30s timeout
// covers individual requests, not the whole client lifetime —
// long-poll-style endpoints aren't used here, so a generous-but-bounded
// per-request timeout is the right default.
func NewClient(token, org string) *Client {
	return &Client{
		http:  &http.Client{Timeout: 30 * time.Second},
		token: token,
		org:   org,
	}
}

// Job is the subset of a Buildkite job object we care about: the ID
// and name we need to fetch logs for it, plus its state for
// surfacing in webhook handling.
//
// Buildkite jobs come in several "type" values (script, waiter,
// manual) — only "script" jobs have logs, but we decode the slice
// as-is and let the caller decide whether to skip non-script entries.
type Job struct {
	ID    string `json:"id"`
	Type  string `json:"type"`
	Name  string `json:"name"`
	State string `json:"state"`
}

// Build is the subset of a Buildkite build object we care about.
// Fields not present here are dropped silently by the JSON decoder —
// keep this list lean so additions to the upstream schema don't
// force us to touch this file.
type Build struct {
	ID       string                 `json:"id"`
	Number   int64                  `json:"number"`
	State    string                 `json:"state"`
	WebURL   string                 `json:"web_url"`
	Commit   string                 `json:"commit"`
	Branch   string                 `json:"branch"`
	Message  string                 `json:"message"`
	MetaData map[string]string      `json:"meta_data"`
	Jobs     []Job                  `json:"jobs"`
	Pipeline map[string]interface{} `json:"pipeline"`
}

// CreateBuildRequest is the request body for POST /builds. Only the
// fields callers actively use are exposed — the upstream API accepts
// many more, but we'd just be passing through dead options.
//
// IgnorePipelineBranchFilters defaults to false on the wire (omitempty
// elides the zero value); callers that want pipeline-level branch
// filters bypassed should set it to true.
type CreateBuildRequest struct {
	Commit                      string            `json:"commit"`
	Branch                      string            `json:"branch"`
	Message                     string            `json:"message,omitempty"`
	Env                         map[string]string `json:"env,omitempty"`
	MetaData                    map[string]string `json:"meta_data,omitempty"`
	IgnorePipelineBranchFilters bool              `json:"ignore_pipeline_branch_filters,omitempty"`
}

// CreateBuild fires a build on the named pipeline. Returns the
// decoded response so the caller can persist build_uuid + number for
// later webhook lookup.
//
// Buildkite returns 201 on success; anything else is wrapped into an
// error that includes the response body so a misconfigured pipeline
// (e.g. wrong slug, missing branch) surfaces useful diagnostics into
// the caller's log.
func (c *Client) CreateBuild(
	ctx context.Context,
	pipelineSlug string,
	req CreateBuildRequest,
) (*Build, error) {
	body, err := json.Marshal(req)
	if err != nil {
		return nil, fmt.Errorf("marshal build request: %w", err)
	}

	url := fmt.Sprintf("%s/v2/organizations/%s/pipelines/%s/builds",
		APIBase, c.org, pipelineSlug,
	)
	httpReq, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, fmt.Errorf("build request: %w", err)
	}
	httpReq.Header.Set("Authorization", "Bearer "+c.token)
	httpReq.Header.Set("Content-Type", "application/json")

	resp, err := c.http.Do(httpReq)
	if err != nil {
		return nil, fmt.Errorf("create build: %w", err)
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusCreated {
		raw, _ := io.ReadAll(io.LimitReader(resp.Body, 4096))
		return nil, fmt.Errorf("create build: status %d: %s",
			resp.StatusCode, strings.TrimSpace(string(raw)),
		)
	}

	var out Build
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, fmt.Errorf("decode build response: %w", err)
	}
	return &out, nil
}

// GetBuild fetches the full build record by number, including the
// jobs slice. Used by callers that need the current set of jobs for
// a known (pipelineSlug, buildNumber) pair.
//
// Returns ErrNotFound when Buildkite responds 404.
func (c *Client) GetBuild(
	ctx context.Context,
	pipelineSlug string,
	buildNumber int64,
) (*Build, error) {
	url := fmt.Sprintf("%s/v2/organizations/%s/pipelines/%s/builds/%d",
		APIBase, c.org, pipelineSlug, buildNumber,
	)
	httpReq, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return nil, fmt.Errorf("build request: %w", err)
	}
	httpReq.Header.Set("Authorization", "Bearer "+c.token)

	resp, err := c.http.Do(httpReq)
	if err != nil {
		return nil, fmt.Errorf("get build: %w", err)
	}
	defer resp.Body.Close()

	if resp.StatusCode == http.StatusNotFound {
		return nil, ErrNotFound
	}
	if resp.StatusCode != http.StatusOK {
		raw, _ := io.ReadAll(io.LimitReader(resp.Body, 4096))
		return nil, fmt.Errorf("get build: status %d: %s",
			resp.StatusCode, strings.TrimSpace(string(raw)),
		)
	}

	var out Build
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, fmt.Errorf("decode build: %w", err)
	}
	return &out, nil
}

// GetJobLog fetches the plain-text log for a single job. Buildkite
// supports several formats; we ask for text/plain explicitly so the
// response is one big string the caller can split on newlines.
//
// Returns ErrNotFound when Buildkite responds 404 (typical for a
// job that hasn't started yet — it has no log to serve).
func (c *Client) GetJobLog(
	ctx context.Context,
	pipelineSlug string,
	buildNumber int64,
	jobID string,
) (string, error) {
	url := fmt.Sprintf("%s/v2/organizations/%s/pipelines/%s/builds/%d/jobs/%s/log",
		APIBase, c.org, pipelineSlug, buildNumber, jobID,
	)
	httpReq, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return "", fmt.Errorf("build request: %w", err)
	}
	httpReq.Header.Set("Authorization", "Bearer "+c.token)
	httpReq.Header.Set("Accept", "text/plain")

	resp, err := c.http.Do(httpReq)
	if err != nil {
		return "", fmt.Errorf("get job log: %w", err)
	}
	defer resp.Body.Close()

	if resp.StatusCode == http.StatusNotFound {
		return "", ErrNotFound
	}
	if resp.StatusCode != http.StatusOK {
		raw, _ := io.ReadAll(io.LimitReader(resp.Body, 4096))
		return "", fmt.Errorf("get job log: status %d: %s",
			resp.StatusCode, strings.TrimSpace(string(raw)),
		)
	}

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", fmt.Errorf("read job log: %w", err)
	}
	return string(body), nil
}

// WebhookPayload is the small slice of the webhook body callers
// actually decode. Build events all wrap a "build" object; job
// events wrap a "job" — keeping both around lets callers add
// job-level mapping later without changing the decoder.
type WebhookPayload struct {
	Event string `json:"event"`
	Build Build  `json:"build"`
	Job   Job    `json:"job"`
}

// WebhookMode selects how an inbound webhook request is
// authenticated. The two values correspond directly to the two
// settings Buildkite's notification service exposes for this:
// WebhookModeToken sends the secret in the X-Buildkite-Token header
// in plain text; WebhookModeSignature sends an HMAC-SHA256 of the
// body in X-Buildkite-Signature.
type WebhookMode string

const (
	WebhookModeToken     WebhookMode = "token"
	WebhookModeSignature WebhookMode = "signature"
)

// VerifySignature validates the X-Buildkite-Signature header against
// secret using the documented "<timestamp>.<body>" HMAC-SHA256
// scheme. Returns nil when the header is well-formed and the digest
// matches; any other condition returns an error.
//
// We deliberately do NOT enforce a freshness window on timestamp:
// callers in practice consume idempotent or storage-deduplicated
// events, so a replayed event is at worst a duplicate publish.
// Callers that need stricter freshness should layer it on top.
//
// The header format is "timestamp=<unix>,signature=<hex>".
func VerifySignature(header, secret string, body []byte) error {
	if header == "" {
		return errors.New("missing X-Buildkite-Signature header")
	}
	if secret == "" {
		// A misconfigured server is a programmer bug, but we'd
		// rather fail closed than silently accept any signature.
		return errors.New("server has no webhook secret configured")
	}

	var ts, sig string
	for _, part := range strings.Split(header, ",") {
		k, v, ok := strings.Cut(strings.TrimSpace(part), "=")
		if !ok {
			continue
		}
		switch k {
		case "timestamp":
			ts = v
		case "signature":
			sig = v
		}
	}
	if ts == "" || sig == "" {
		return errors.New("malformed signature header")
	}
	// Sanity-check the timestamp is a parseable int. The value
	// itself isn't validated against the clock (see comment above),
	// but a non-numeric timestamp is structurally invalid.
	if _, err := strconv.ParseInt(ts, 10, 64); err != nil {
		return fmt.Errorf("invalid timestamp: %w", err)
	}

	mac := hmac.New(sha256.New, []byte(secret))
	mac.Write([]byte(ts))
	mac.Write([]byte("."))
	mac.Write(body)
	expected := hex.EncodeToString(mac.Sum(nil))

	// Compare in constant time to keep the verifier from leaking
	// the expected digest through timing.
	if !hmac.Equal([]byte(expected), []byte(sig)) {
		return errors.New("signature mismatch")
	}
	return nil
}

// VerifyToken handles the simpler X-Buildkite-Token mode: the
// configured secret is sent verbatim in the header. Constant-time
// comparison keeps a brute-forcing attacker from learning the token
// one byte at a time.
//
// Returns nil on match. Two error cases:
//   - missing header on the request (caller should 401),
//   - server has no expected token configured (caller should 500;
//     fail-closed, never accept-anything).
func VerifyToken(header, expected string) error {
	if header == "" {
		return errors.New("missing X-Buildkite-Token header")
	}
	if expected == "" {
		return errors.New("server has no webhook token configured")
	}
	if !hmac.Equal([]byte(header), []byte(expected)) {
		return errors.New("token mismatch")
	}
	return nil
}
+207
internal/buildkite/buildkite_test.go
package buildkite

// Tests for the Buildkite REST client + webhook signature/token
// verifiers. Provider-level (Tangled translation) tests live with
// the provider in the main package.

import (
	"context"
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"testing"
)

// TestVerifySignature covers the HMAC mode end to end. The reference
// digest is computed with the documented "<timestamp>.<body>"
// preimage so the test pins the wire format, not just the helper.
func TestVerifySignature(t *testing.T) {
	const secret = "shhh"
	body := []byte(`{"event":"build.finished"}`)
	const ts = "1700000000"

	mac := hmac.New(sha256.New, []byte(secret))
	mac.Write([]byte(ts))
	mac.Write([]byte("."))
	mac.Write(body)
	good := hex.EncodeToString(mac.Sum(nil))

	cases := []struct {
		name    string
		header  string
		secret  string
		body    []byte
		wantErr bool
	}{
		{"valid", "timestamp=" + ts + ",signature=" + good, secret, body, false},
		{"valid with whitespace", " timestamp=" + ts + " , signature=" + good + " ", secret, body, false},
		{"empty header", "", secret, body, true},
		{"empty server secret", "timestamp=" + ts + ",signature=" + good, "", body, true},
		{"missing timestamp", "signature=" + good, secret, body, true},
		{"missing signature", "timestamp=" + ts, secret, body, true},
		{"non-numeric timestamp", "timestamp=abc,signature=" + good, secret, body, true},
		{"wrong signature", "timestamp=" + ts + ",signature=00", secret, body, true},
		{"wrong body", "timestamp=" + ts + ",signature=" + good, secret, []byte("nope"), true},
	}
	for _, c := range cases {
		t.Run(c.name, func(t *testing.T) {
			err := VerifySignature(c.header, c.secret, c.body)
			if (err != nil) != c.wantErr {
				t.Fatalf("err=%v wantErr=%v", err, c.wantErr)
			}
		})
	}
}

// TestVerifyToken pins the token-mode behaviour: it must be a
// constant-time exact-match check, never a prefix or substring.
func TestVerifyToken(t *testing.T) {
	cases := []struct {
		name    string
		header  string
		expect  string
		wantErr bool
	}{
		{"match", "abc123", "abc123", false},
		{"empty header", "", "abc123", true},
		{"empty expected", "abc123", "", true},
		{"mismatch", "abc124", "abc123", true},
		{"prefix is not a match", "abc", "abc123", true},
	}
	for _, c := range cases {
		t.Run(c.name, func(t *testing.T) {
			err := VerifyToken(c.header, c.expect)
			if (err != nil) != c.wantErr {
				t.Fatalf("err=%v wantErr=%v", err, c.wantErr)
			}
		})
	}
}

// TestClientCreateBuild covers the request shape we send (auth
// header, JSON body, URL) and the response decoding for the happy
// path.
func TestClientCreateBuild(t *testing.T) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path != "/v2/organizations/myorg/pipelines/mypipe/builds" {
			t.Errorf("bad path %s", r.URL.Path)
		}
		if got := r.Header.Get("Authorization"); got != "Bearer tok" {
			t.Errorf("bad auth %q", got)
		}
		if got := r.Header.Get("Content-Type"); got != "application/json" {
			t.Errorf("bad content-type %q", got)
		}
		var got CreateBuildRequest
		if err := json.NewDecoder(r.Body).Decode(&got); err != nil {
			t.Fatalf("decode: %v", err)
		}
		if got.Commit != "abc" || got.Branch != "main" {
			t.Errorf("bad body: %+v", got)
		}
		if got.MetaData["k"] != "v" {
			t.Errorf("missing meta_data: %+v", got.MetaData)
		}
		w.WriteHeader(http.StatusCreated)
		_ = json.NewEncoder(w).Encode(Build{
			ID:     "uuid-1",
			Number: 42,
			State:  "scheduled",
		})
	}))
	defer srv.Close()

	prev := APIBase
	APIBase = srv.URL
	defer func() { APIBase = prev }()

	c := NewClient("tok", "myorg")
	build, err := c.CreateBuild(context.Background(), "mypipe", CreateBuildRequest{
		Commit:   "abc",
		Branch:   "main",
		MetaData: map[string]string{"k": "v"},
	})
	if err != nil {
		t.Fatalf("CreateBuild: %v", err)
	}
	if build.ID != "uuid-1" || build.Number != 42 || build.State != "scheduled" {
		t.Fatalf("unexpected build: %+v", build)
	}
}

// TestClientCreateBuildError makes sure non-2xx responses surface
// the upstream error body — that text ends up in operator logs, so
// silently dropping it would make misconfigurations very painful to
// diagnose.
func TestClientCreateBuildError(t *testing.T) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusUnprocessableEntity)
		fmt.Fprint(w, `{"message":"branch is required"}`)
	}))
	defer srv.Close()

	prev := APIBase
	APIBase = srv.URL
	defer func() { APIBase = prev }()

	c := NewClient("tok", "myorg")
	_, err := c.CreateBuild(context.Background(), "mypipe", CreateBuildRequest{})
	if err == nil {
		t.Fatal("expected error")
	}
	if !strings.Contains(err.Error(), "branch is required") {
		t.Fatalf("error missing upstream body: %v", err)
	}
}

// TestClientGetJobLog confirms we send the right Accept header (so
// Buildkite returns plain text, not JSON) and surface 404 as
// ErrNotFound for callers to translate.
func TestClientGetJobLog(t *testing.T) {
	t.Run("ok", func(t *testing.T) {
		srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			if got := r.Header.Get("Accept"); got != "text/plain" {
				t.Errorf("bad accept %q", got)
			}
			if !strings.HasSuffix(r.URL.Path, "/builds/7/jobs/job-1/log") {
				t.Errorf("bad path %s", r.URL.Path)
			}
			io.WriteString(w, "line1\nline2\n")
		}))
		defer srv.Close()

		prev := APIBase
		APIBase = srv.URL
		defer func() { APIBase = prev }()

		body, err := NewClient("tok", "myorg").
			GetJobLog(context.Background(), "mypipe", 7, "job-1")
		if err != nil {
			t.Fatalf("GetJobLog: %v", err)
		}
		if body != "line1\nline2\n" {
			t.Fatalf("body = %q", body)
		}
	})
	t.Run("404", func(t *testing.T) {
		srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			http.Error(w, "no", http.StatusNotFound)
		}))
		defer srv.Close()
		prev := APIBase
		APIBase = srv.URL
		defer func() { APIBase = prev }()

		_, err := NewClient("tok", "myorg").
			GetJobLog(context.Background(), "mypipe", 7, "job-1")
		if err != ErrNotFound {
			t.Fatalf("err = %v; want ErrNotFound", err)
		}
	})
}
+1 −1
knot.go
···
		// Spawn is non-blocking — it fans out into provider-owned
		// goroutines so this worker can move on to the next event.
		// The provider keeps ctx around for shutdown coordination.
-		k.provider.Spawn(ctx, src.Key(), msg.Rkey, p.Workflows)
+		k.provider.Spawn(ctx, src.Key(), msg.Rkey, p.TriggerMetadata, p.Workflows)

	default:
		// Knots may publish other record types over the same stream; we
+84 −13
main.go
···
	"context"
	"errors"
	"flag"
+	"fmt"
	"log/slog"
	"os"
	"os/signal"
	"syscall"

	charmlog "github.com/charmbracelet/log"
+
+	"github.com/mitchellh/tack/internal/buildkite"
)

// config is the runtime configuration, sourced from environment variables and
···
	// Dev flips the knot event-stream scheme from wss:// to ws://.
	// Useful when running against a local knot during development.
	Dev bool
+
+	// Buildkite-mode configuration. BuildkiteToken is the switch:
+	// when empty we fall back to the in-process fake provider
+	// (useful for local development against a real Tangled
+	// jetstream); when set, the other Buildkite fields are
+	// required and tack will refuse to start without them.
+	BuildkiteToken         string
+	BuildkiteOrg           string
+	BuildkitePipeline      string
+	BuildkiteWebhookSecret string
+	BuildkiteWebhookMode   buildkite.WebhookMode
}

func loadConfig() (config, error) {
	cfg := config{
-		Addr:         envOr("TACK_LISTEN_ADDR", ":8080"),
-		Hostname:     os.Getenv("TACK_HOSTNAME"),
-		OwnerDID:     os.Getenv("TACK_OWNER_DID"),
-		JetstreamURL: envOr("TACK_JETSTREAM_URL", "wss://jetstream1.us-west.bsky.network/subscribe"),
-		DBPath:       envOr("TACK_DB_PATH", "tack.db"),
-		Dev:          os.Getenv("TACK_DEV") != "",
+		Addr:                   envOr("TACK_LISTEN_ADDR", ":8080"),
+		Hostname:               os.Getenv("TACK_HOSTNAME"),
+		OwnerDID:               os.Getenv("TACK_OWNER_DID"),
+		JetstreamURL:           envOr("TACK_JETSTREAM_URL", "wss://jetstream1.us-west.bsky.network/subscribe"),
+		DBPath:                 envOr("TACK_DB_PATH", "tack.db"),
+		Dev:                    os.Getenv("TACK_DEV") != "",
+		BuildkiteToken:         os.Getenv("TACK_BUILDKITE_TOKEN"),
+		BuildkiteOrg:           os.Getenv("TACK_BUILDKITE_ORG"),
+		BuildkitePipeline:      os.Getenv("TACK_BUILDKITE_PIPELINE"),
+		BuildkiteWebhookSecret: os.Getenv("TACK_BUILDKITE_WEBHOOK_SECRET"),
+		BuildkiteWebhookMode: buildkite.WebhookMode(
+			envOr("TACK_BUILDKITE_WEBHOOK_MODE", string(buildkite.WebhookModeToken)),
+		),
	}
	addrFlag := flag.String("addr", cfg.Addr, "HTTP listen address (overrides TACK_LISTEN_ADDR)")
	flag.Parse()
···
		return cfg, errors.New("TACK_HOSTNAME is required")
	}

+	// If the operator opted into Buildkite mode (by supplying a
+	// token), every other Buildkite knob has to be present. Half-
+	// configured Buildkite leads to confusing failures deep in the
+	// provider; catch it at startup.
+	if cfg.BuildkiteToken != "" {
+		if cfg.BuildkiteOrg == "" {
+			return cfg, errors.New("TACK_BUILDKITE_ORG is required when TACK_BUILDKITE_TOKEN is set")
+		}
+		if cfg.BuildkitePipeline == "" {
+			return cfg, errors.New("TACK_BUILDKITE_PIPELINE is required when TACK_BUILDKITE_TOKEN is set")
+		}
+		if cfg.BuildkiteWebhookSecret == "" {
+			return cfg, errors.New("TACK_BUILDKITE_WEBHOOK_SECRET is required when TACK_BUILDKITE_TOKEN is set")
+		}
+		switch cfg.BuildkiteWebhookMode {
+		case buildkite.WebhookModeToken, buildkite.WebhookModeSignature:
+		default:
+			return cfg, fmt.Errorf("TACK_BUILDKITE_WEBHOOK_MODE must be %q or %q; got %q",
+				buildkite.WebhookModeToken, buildkite.WebhookModeSignature,
+				cfg.BuildkiteWebhookMode,
+			)
+		}
+	}
+
	return cfg, nil
}

···
	br := newBroker(st)

	// Provider that turns Tangled pipeline triggers into
-	// pipeline.status events. The fake provider stands in for a real
-	// CI integration: it emits synthetic running/success heartbeats
-	// over the broker so the entire jetstream → knot → /events flow
-	// is exercisable end-to-end. Swap this for a Buildkite-backed
-	// implementation once that lands.
-	provider := newFakeProvider(br, logger)
+	// pipeline.status events. The Buildkite provider is the real
+	// integration; the fake one stands in when no Buildkite token is
+	// configured so the full jetstream → knot → /events flow is
+	// still exercisable locally without a Buildkite account.
+	//
+	// bkProvider is kept as a typed pointer separately because the
+	// /webhooks/buildkite handler needs the concrete *buildkiteProvider
+	// (for HandleWebhook + signature verification), not the abstract
+	// Provider surface.
+	var (
+		provider   Provider
+		bkProvider *buildkiteProvider
+	)
+	if cfg.BuildkiteToken != "" {
+		bkProvider = newBuildkiteProvider(
178178+ br, st,
179179+ buildkite.NewClient(cfg.BuildkiteToken, cfg.BuildkiteOrg),
180180+ cfg.BuildkitePipeline,
181181+ cfg.BuildkiteWebhookSecret,
182182+ cfg.BuildkiteWebhookMode,
183183+ logger,
184184+ )
185185+ provider = bkProvider
186186+ logger.Info("buildkite provider enabled",
187187+ "org", cfg.BuildkiteOrg,
188188+ "pipeline", cfg.BuildkitePipeline,
189189+ "webhook_mode", cfg.BuildkiteWebhookMode,
190190+ )
191191+ } else {
192192+ provider = newFakeProvider(br, logger)
193193+ logger.Info("fake provider enabled (set TACK_BUILDKITE_TOKEN to use buildkite)")
194194+ }
124195125196 // Start the knot event-stream consumer first so the jetstream
126197 // loop has somewhere to register newly-observed knots into. It
···143214144215 // Run the HTTP server. This blocks until ctx is cancelled or the
145216 // listener errors.
146146- if err := runHTTP(ctx, cfg, br, provider); err != nil {
217217+ if err := runHTTP(ctx, cfg, br, provider, bkProvider); err != nil {
147218 logger.Error("http server error", "err", err)
148219 os.Exit(1)
149220 }
+5-1
provider.go
···7373 // knot is the knot hostname the trigger arrived on; it's the
7474 // authority half of the pipeline ATURI that pipeline.status
7575 // records reference. pipelineRkey is the trigger record's rkey
7676- // on that knot. workflows is the unmodified slice from the
7676+ // on that knot. trigger is the decoded record's TriggerMetadata
7777+ // (may be nil — the lexicon doesn't enforce its presence) and
7878+ // carries the commit/branch/PR data a real CI provider needs to
7979+ // kick off a build. workflows is the unmodified slice from the
7780 // decoded sh.tangled.pipeline record; implementations should
7881 // tolerate nil entries and zero-length names defensively, since
7982 // the lexicon doesn't enforce either.
···8184 ctx context.Context,
8285 knot string,
8386 pipelineRkey string,
8787+ trigger *tangled.Pipeline_TriggerMetadata,
8488 workflows []*tangled.Pipeline_Workflow,
8589 )
8690
+552
provider_buildkite.go
···11+package main
22+33+// buildkiteProvider implements Provider against a real Buildkite
44+// account. Spawn translates a Tangled pipeline trigger into one
55+// Buildkite build per workflow; status updates flow back asynchronously
66+// through the /webhooks/buildkite handler (see http.go), which looks
77+// the build UUID up in the buildkite_builds table to recover the
88+// (knot, pipelineRkey, workflow) tuple this provider persisted at
99+// Spawn time and publishes a sh.tangled.pipeline.status record on
1010+// the in-process broker.
1111+//
1212+// Only one Buildkite pipeline is used per spindle (TACK_BUILDKITE_PIPELINE).
1313+// Every Tangled workflow runs as a build on that single pipeline, with
1414+// the workflow identity plumbed through env + meta_data. The operator
1515+// configures their Buildkite pipeline to read those env vars and
1616+// dispatch accordingly (e.g. via `pipeline upload`). Mapping every
1717+// Tangled workflow to its own Buildkite pipeline would force operators
1818+// to provision Buildkite resources for each workflow file in every
1919+// repo that points at the spindle — friction we don't want to impose.
2020+2121+import (
2222+ "context"
2323+ "encoding/json"
2424+ "errors"
2525+ "fmt"
2626+ "log/slog"
2727+ "net/http"
2828+ "strings"
2929+ "time"
3030+3131+ "tangled.org/core/api/tangled"
3232+3333+ "github.com/mitchellh/tack/internal/buildkite"
3434+)
3535+3636+// Buildkite-side meta_data keys carrying the Tangled identity of a
3737+// build. Mirrored into env vars (see envFromTuple) so an operator's
3838+// Buildkite pipeline script can also reach them via $TACK_*. They
3939+// stay tightly namespaced so a coexisting Buildkite job that uses
4040+// meta_data for its own purposes won't collide.
4141+const (
4242+ bkMetaKnot = "tack:knot"
4343+ bkMetaPipelineRkey = "tack:pipeline_rkey"
4444+ bkMetaWorkflow = "tack:workflow"
4545+)
4646+4747+// buildkiteProvider implements Provider.
4848+//
4949+// webhookSecret + webhookMode live on the provider rather than on
5050+// the HTTP server because the provider is the single owner of
5151+// "everything Buildkite-y": colocating the auth knob with the API
5252+// client and the state translator keeps configuration drift to one
5353+// place and makes the http.go side pure transport.
5454+type buildkiteProvider struct {
5555+ br *broker
5656+ st *store
5757+ log *slog.Logger
5858+ client *buildkite.Client
5959+ pipelineSlug string
6060+ webhookSecret string
6161+ webhookMode buildkite.WebhookMode
6262+}
6363+6464+// Compile-time interface conformance check.
6565+var _ Provider = (*buildkiteProvider)(nil)
6666+6767+// newBuildkiteProvider wires a provider to its Buildkite client and
6868+// to the broker it publishes pipeline.status records on. pipelineSlug
6969+// is the Buildkite pipeline that all builds get fired on (see file
7070+// header for why there's only one). webhookSecret/webhookMode govern
7171+// inbound /webhooks/buildkite request authentication.
7272+func newBuildkiteProvider(
7373+ br *broker,
7474+ st *store,
7575+ client *buildkite.Client,
7676+ pipelineSlug string,
7777+ webhookSecret string,
7878+ webhookMode buildkite.WebhookMode,
7979+ log *slog.Logger,
8080+) *buildkiteProvider {
8181+ return &buildkiteProvider{
8282+ br: br,
8383+ st: st,
8484+ log: log.With("component", "provider", "kind", "buildkite"),
8585+ client: client,
8686+ pipelineSlug: pipelineSlug,
8787+ webhookSecret: webhookSecret,
8888+ webhookMode: webhookMode,
8989+ }
9090+}
9191+9292+// VerifyWebhook authenticates an inbound webhook request using
9393+// whichever mode the provider was configured with. Returns nil on
9494+// success; the HTTP handler maps any returned error to 401.
9595+func (p *buildkiteProvider) VerifyWebhook(headers http.Header, body []byte) error {
9696+ switch p.webhookMode {
9797+ case buildkite.WebhookModeSignature:
9898+ return buildkite.VerifySignature(
9999+ headers.Get("X-Buildkite-Signature"),
100100+ p.webhookSecret, body,
101101+ )
102102+ default:
103103+ // Token mode is the Buildkite default and ours; loadConfig
104104+ // rejects unknown modes at startup, so in practice this arm
105105+ // only ever sees WebhookModeToken.
106106+ return buildkite.VerifyToken(
107107+ headers.Get("X-Buildkite-Token"),
108108+ p.webhookSecret,
109109+ )
110110+ }
111111+}
112112+113113+// Spawn satisfies Provider. For each workflow it fires a separate
114114+// Buildkite build off the configured pipeline so each workflow gets
115115+// its own status timeline. The actual API call runs on a goroutine —
116116+// CreateBuild is one HTTP round-trip, but we still want Spawn to be
117117+// non-blocking per the interface contract.
118118+//
119119+// On a successful create we persist the build UUID → (knot, rkey,
120120+// workflow) mapping and publish a "pending" pipeline.status so the
121121+// appview sees activity immediately, instead of waiting for the
122122+// first webhook to land.
123123+func (p *buildkiteProvider) Spawn(
124124+ ctx context.Context,
125125+ knot string,
126126+ pipelineRkey string,
127127+ trigger *tangled.Pipeline_TriggerMetadata,
128128+ workflows []*tangled.Pipeline_Workflow,
129129+) {
130130+ if len(workflows) == 0 {
131131+ p.log.Warn("pipeline has no workflows; nothing to spawn",
132132+ "knot", knot, "rkey", pipelineRkey,
133133+ )
134134+ return
135135+ }
136136+137137+ // Derive build inputs once. Every workflow on this trigger
138138+ // targets the same commit/branch — only the workflow name
139139+ // varies between the per-workflow goroutines below.
140140+ commit, branch := triggerCommitAndBranch(trigger)
141141+ if commit == "" {
142142+ // Buildkite's create-build API requires a commit; we'd
143143+ // rather log loudly and skip than fire builds on "HEAD"
144144+ // and silently get whatever main happens to look like.
145145+ p.log.Error("trigger has no commit; refusing to spawn",
146146+ "knot", knot, "rkey", pipelineRkey,
147147+ )
148148+ return
149149+ }
150150+151151+ for _, wf := range workflows {
152152+ if wf == nil || wf.Name == "" {
153153+ continue
154154+ }
155155+ wf := wf
156156+ go p.spawnWorkflow(ctx, knot, pipelineRkey, commit, branch, wf)
157157+ }
158158+}
159159+160160+// spawnWorkflow does the per-workflow API + persistence work for
161161+// Spawn. Errors are logged with full context but not returned —
162162+// nothing in tack consumes the result, and a failed Spawn just
163163+// surfaces as the absence of any status update for the affected
164164+// workflow.
165165+func (p *buildkiteProvider) spawnWorkflow(
166166+ ctx context.Context,
167167+ knot string,
168168+ pipelineRkey string,
169169+ commit string,
170170+ branch string,
171171+ wf *tangled.Pipeline_Workflow,
172172+) {
173173+ logger := p.log.With(
174174+ "knot", knot,
175175+ "pipeline_rkey", pipelineRkey,
176176+ "workflow", wf.Name,
177177+ )
178178+179179+ pipelineURI := pipelineATURI(knot, pipelineRkey)
180180+ meta := map[string]string{
181181+ bkMetaKnot: knot,
182182+ bkMetaPipelineRkey: pipelineRkey,
183183+ bkMetaWorkflow: wf.Name,
184184+ }
185185+ env := envFromTuple(knot, pipelineRkey, wf)
186186+187187+ req := buildkite.CreateBuildRequest{
188188+ Commit: commit,
189189+ Branch: branch,
190190+ Message: fmt.Sprintf("tangled: %s", wf.Name),
191191+ Env: env,
192192+ MetaData: meta,
193193+ IgnorePipelineBranchFilters: true,
194194+ }
195195+196196+ build, err := p.client.CreateBuild(ctx, p.pipelineSlug, req)
197197+ if err != nil {
198198+ logger.Error("create buildkite build", "err", err)
199199+ return
200200+ }
201201+ logger.Info("buildkite build created",
202202+ "build_uuid", build.ID,
203203+ "build_number", build.Number,
204204+ "web_url", build.WebURL,
205205+ )
206206+207207+ if err := p.st.InsertBuildkiteBuild(ctx, BuildkiteBuildRef{
208208+ BuildUUID: build.ID,
209209+ BuildNumber: build.Number,
210210+ PipelineSlug: p.pipelineSlug,
211211+ Knot: knot,
212212+ PipelineRkey: pipelineRkey,
213213+ Workflow: wf.Name,
214214+ PipelineURI: pipelineURI,
215215+ }); err != nil {
216216+ // Webhook handlers will fail to translate this build's
217217+ // events because they can't recover the tuple. Surface
218218+ // loudly and bail; we don't want a half-tracked build
219219+ // silently leaking status into the broker.
220220+ logger.Error("persist buildkite build mapping", "err", err,
221221+ "build_uuid", build.ID,
222222+ )
223223+ return
224224+ }
225225+226226+ // Initial status publish so the appview shows the build as
227227+ // queued without waiting for the first webhook. This mirrors
228228+ // the upstream spindle's "schedule then run" cadence.
229229+ if err := p.publishStatus(
230230+ ctx, pipelineURI, wf.Name, "pending", build.ID,
231231+ nil, nil,
232232+ ); err != nil {
233233+ logger.Error("publish initial pending status", "err", err)
234234+ }
235235+}
236236+237237+// Logs satisfies Provider. We resolve the (knot, rkey, workflow)
238238+// tuple to a Buildkite build via the store, fetch the current jobs
239239+// list, then drain each job's plain-text log into the channel as one
240240+// LogLine per output line.
241241+//
242242+// Per-job control frames bracket each job so the appview's renderer
243243+// has start/end markers to lay out timing — same shape as the fake
244244+// provider and the upstream spindle.
245245+//
246246+// This is a snapshot read, not a tail — finished or in-progress, we
247247+// fetch what's there and close. Live tailing would require Buildkite
248248+// agent log streaming, which the public REST API doesn't expose; the
249249+// appview's repeated /logs calls during a running build give us
250250+// "good enough" liveness without that complexity.
251251+func (p *buildkiteProvider) Logs(
252252+ ctx context.Context,
253253+ knot string,
254254+ pipelineRkey string,
255255+ workflow string,
256256+) (<-chan LogLine, error) {
257257+ ref, err := p.st.LookupBuildkiteBuildByTuple(ctx, knot, pipelineRkey, workflow)
258258+ if err != nil {
259259+ return nil, fmt.Errorf("lookup build for logs: %w", err)
260260+ }
261261+ if ref == nil {
262262+ return nil, ErrLogsNotFound
263263+ }
264264+265265+ // Fresh fetch so we get the current job set, not whatever was
266266+ // returned at create time (when most jobs are still nil). The
267267+ // upstream's not-found is mapped to the Provider-shaped one
268268+ // here because the /logs handler only knows about ErrLogsNotFound.
269269+ build, err := p.client.GetBuild(ctx, ref.PipelineSlug, ref.BuildNumber)
270270+ if err != nil {
271271+ if errors.Is(err, buildkite.ErrNotFound) {
272272+ return nil, ErrLogsNotFound
273273+ }
274274+ return nil, fmt.Errorf("get build for logs: %w", err)
275275+ }
276276+277277+ out := make(chan LogLine, 64)
278278+ go func() {
279279+ defer close(out)
280280+ stepID := 0
281281+ for _, job := range build.Jobs {
282282+ // Only "script" jobs have agent-produced logs.
283283+ // Waiter / manual / trigger jobs have no body to
284284+ // fetch; skip them so we don't hit Buildkite with
285285+ // 404-bound requests.
286286+ if job.Type != "" && job.Type != "script" {
287287+ continue
288288+ }
289289+290290+ name := job.Name
291291+ if name == "" {
292292+ name = job.ID
293293+ }
294294+295295+ // Job-level start frame so the appview can bound
296296+ // timing per job.
297297+ if !sendLine(ctx, out, LogLine{
298298+ Kind: LogKindControl,
299299+ Time: time.Now(),
300300+ Content: name,
301301+ StepId: stepID,
302302+ StepStatus: StepStatusStart,
303303+ }) {
304304+ return
305305+ }
306306+307307+ body, err := p.client.GetJobLog(ctx, ref.PipelineSlug, ref.BuildNumber, job.ID)
308308+ if err != nil {
309309+ p.log.Debug("fetch job log",
310310+ "err", err,
311311+ "build_uuid", ref.BuildUUID,
312312+ "job_id", job.ID,
313313+ )
314314+ // Don't fail the whole stream on one job;
315315+ // emit the end frame and move on so the
316316+ // appview at least sees what other jobs
317317+ // produced.
318318+ body = ""
319319+ }
320320+321321+ for _, line := range strings.Split(strings.TrimRight(body, "\n"), "\n") {
322322+ if line == "" {
323323+ // Skip empty lines, including the single
324324+ // empty entry Split yields for an empty body.
325325+ continue
326326+ }
327327+ if !sendLine(ctx, out, LogLine{
328328+ Kind: LogKindData,
329329+ Time: time.Now(),
330330+ Content: line + "\n",
331331+ StepId: stepID,
332332+ Stream: "stdout",
333333+ }) {
334334+ return
335335+ }
336336+ }
337337+338338+ if !sendLine(ctx, out, LogLine{
339339+ Kind: LogKindControl,
340340+ Time: time.Now(),
341341+ Content: name,
342342+ StepId: stepID,
343343+ StepStatus: StepStatusEnd,
344344+ }) {
345345+ return
346346+ }
347347+ stepID++
348348+ }
349349+ }()
350350+ return out, nil
351351+}
352352+353353+// publishStatus assembles a tangled.PipelineStatus record and pushes
354354+// it through the broker. buildUUID is mixed into the rkey so multiple
355355+// status events for the same workflow don't collide on the events
356356+// table's (rkey) uniqueness — and so an operator grepping the log
357357+// can find every record that pertains to a given Buildkite build.
358358+//
359359+// errMsg/exitCode are optional; pass nil for non-failure transitions.
360360+func (p *buildkiteProvider) publishStatus(
361361+ ctx context.Context,
362362+ pipelineURI, workflow, status, buildUUID string,
363363+ errMsg *string,
364364+ exitCode *int64,
365365+) error {
366366+ rec := tangled.PipelineStatus{
367367+ LexiconTypeID: tangled.PipelineStatusNSID,
368368+ Pipeline: pipelineURI,
369369+ Workflow: workflow,
370370+ Status: status,
371371+ CreatedAt: time.Now().UTC().Format(time.RFC3339),
372372+ Error: errMsg,
373373+ ExitCode: exitCode,
374374+ }
375375+ body, err := json.Marshal(rec)
376376+ if err != nil {
377377+ return fmt.Errorf("marshal pipeline.status: %w", err)
378378+ }
379379+ rkey := fmt.Sprintf("bk-%s-%s-%d", buildUUID, status, time.Now().UnixNano())
380380+ if _, err := p.br.Publish(ctx, rkey, tangled.PipelineStatusNSID, body); err != nil {
381381+ return fmt.Errorf("publish pipeline.status: %w", err)
382382+ }
383383+ return nil
384384+}
385385+386386+// HandleWebhook applies a decoded Buildkite webhook payload: looks
387387+// the build up in the store, translates the Buildkite state into a
388388+// Tangled StatusKind, and publishes a pipeline.status record. Used
389389+// by the HTTP webhook handler so both the ingress logic and the
390390+// translation logic live next to each other.
391391+//
392392+// Returns nil for events we intentionally ignore (job.* events,
393393+// build.scheduled which we already publish locally on Spawn, builds
394394+// we don't have a mapping for) so the handler can 200 them — webhook
395395+// retries from Buildkite on a 4xx/5xx are noisy and not what we want
396396+// for "we just don't care about this event".
397397+func (p *buildkiteProvider) HandleWebhook(
398398+ ctx context.Context,
399399+ payload buildkite.WebhookPayload,
400400+) error {
401401+ // Only build.* events drive pipeline.status today. Everything
402402+ // else (job.*, agent.*, ping) is acknowledged silently.
403403+ if !strings.HasPrefix(payload.Event, "build.") {
404404+ return nil
405405+ }
406406+407407+ ref, err := p.st.LookupBuildkiteBuildByUUID(ctx, payload.Build.ID)
408408+ if err != nil {
409409+ return fmt.Errorf("lookup build by uuid: %w", err)
410410+ }
411411+ if ref == nil {
412412+ // Most likely: this build was triggered outside tack and
413413+ // just happens to share our webhook URL. Nothing to do.
414414+ p.log.Debug("webhook for unknown build; ignoring",
415415+ "event", payload.Event,
416416+ "build_uuid", payload.Build.ID,
417417+ )
418418+ return nil
419419+ }
420420+421421+ status, ok := mapBuildkiteState(payload.Build.State)
422422+ if !ok {
423423+ // Unknown / transient state ("blocked", "skipped",
424424+ // "not_run", "waiting"…) — log so we can extend the map
425425+ // later, but don't error out the webhook.
426426+ p.log.Debug("unmapped buildkite state; ignoring",
427427+ "event", payload.Event,
428428+ "state", payload.Build.State,
429429+ "build_uuid", payload.Build.ID,
430430+ )
431431+ return nil
432432+ }
433433+434434+ if err := p.publishStatus(ctx, ref.PipelineURI, ref.Workflow,
435435+ status, ref.BuildUUID, nil, nil); err != nil {
436436+ return fmt.Errorf("publish webhook status: %w", err)
437437+ }
438438+ p.log.Info("buildkite webhook → pipeline.status",
439439+ "event", payload.Event,
440440+ "state", payload.Build.State,
441441+ "status", status,
442442+ "build_uuid", payload.Build.ID,
443443+ "workflow", ref.Workflow,
444444+ )
445445+ return nil
446446+}
447447+448448+// mapBuildkiteState translates Buildkite's build state strings into
449449+// the Tangled spindle StatusKind enum (pending, running, failed,
450450+// cancelled, success), matching the upstream constants;
451451+// states that don't have a direct analogue (blocked, skipped,
452452+// not_run) are reported as not-mapped so the caller can decide
453453+// whether to ignore them.
454454+func mapBuildkiteState(state string) (string, bool) {
455455+ switch state {
456456+ case "scheduled":
457457+ return "pending", true
458458+ case "running", "failing":
459459+ return "running", true
460460+ case "passed":
461461+ return "success", true
462462+ case "failed":
463463+ return "failed", true
464464+ case "canceled", "canceling":
465465+ return "cancelled", true
466466+ default:
467467+ return "", false
468468+ }
469469+}
470470+471471+// envFromTuple builds the env block forwarded into the Buildkite
472472+// build. These are the only handle a user's Buildkite pipeline has
473473+// on the originating Tangled trigger: their pipeline.yml typically
474474+// reads $TACK_WORKFLOW and dispatches based on it (e.g. running a
475475+// `pipeline upload` against a workflow-specific YAML file).
476476+//
477477+// TACK_WORKFLOW_RAW carries the entire YAML body of the workflow as
478478+// captured in the Tangled record. It can be empty if the workflow
479479+// definition omitted it; consumers should defend.
480480+func envFromTuple(knot, pipelineRkey string, wf *tangled.Pipeline_Workflow) map[string]string {
481481+ return map[string]string{
482482+ "TACK_KNOT": knot,
483483+ "TACK_PIPELINE_RKEY": pipelineRkey,
484484+ "TACK_WORKFLOW": wf.Name,
485485+ "TACK_WORKFLOW_RAW": wf.Raw,
486486+ }
487487+}
488488+489489+// pipelineATURI returns the at-uri the appview joins pipeline.status
490490+// records back to their originating pipeline on. Format mirrors the
491491+// upstream spindle; the appview strips the `did:web:` prefix and
492492+// treats the remainder as the knot identifier.
493493+func pipelineATURI(knot, pipelineRkey string) string {
494494+ return fmt.Sprintf("at://did:web:%s/%s/%s",
495495+ knot, tangled.PipelineNSID, pipelineRkey,
496496+ )
497497+}
498498+499499+// triggerCommitAndBranch extracts (commit, branch) from a Tangled
500500+// pipeline trigger, regardless of whether it was a push, a pull
501501+// request, or a manual run. Returns empty strings on a fully-empty
502502+// trigger so the caller can decide whether that's fatal.
503503+func triggerCommitAndBranch(trigger *tangled.Pipeline_TriggerMetadata) (string, string) {
504504+ if trigger == nil {
505505+ return "", ""
506506+ }
507507+ switch {
508508+ case trigger.Push != nil:
509509+ // For push events, NewSha is the commit being built and
510510+ // Ref is the full ref (e.g. "refs/heads/main") — strip
511511+ // the prefix so Buildkite's branch-aware features work.
512512+ return trigger.Push.NewSha, refToBranch(trigger.Push.Ref)
513513+ case trigger.PullRequest != nil:
514514+ // PRs build the source commit on the source branch.
515515+ // Buildkite's pipeline can opt into PR-aware behaviour
516516+ // via pull_request_id (not currently plumbed through).
517517+ return trigger.PullRequest.SourceSha, trigger.PullRequest.SourceBranch
518518+ default:
519519+ // Manual triggers and any future kinds: fall back to the
520520+ // repo default branch with no commit, which the caller
521521+ // will treat as fatal — manual triggers will need
522522+ // additional plumbing to pick a commit.
523523+ if trigger.Repo != nil {
524524+ return "", trigger.Repo.DefaultBranch
525525+ }
526526+ return "", ""
527527+ }
528528+}
529529+530530+// refToBranch strips the conventional refs/heads/ prefix from a git
531531+// ref. Refs that don't match the prefix (tags, refs/pull/N/head) are
532532+// returned as-is so downstream tooling can decide what to do with
533533+// them — Buildkite happily accepts either form in `branch`.
534534+func refToBranch(ref string) string {
535535+ // TrimPrefix already passes non-matching refs through
536536+ // untouched, so no explicit HasPrefix check is needed.
537537+ return strings.TrimPrefix(ref, "refs/heads/")
540540+}
541541+542542+// sendLine pushes one LogLine into out, returning false if ctx
543543+// fired first. Centralised so the per-job loop in Logs stays
544544+// focused on the wire-shape decisions.
545545+func sendLine(ctx context.Context, out chan<- LogLine, line LogLine) bool {
546546+ select {
547547+ case <-ctx.Done():
548548+ return false
549549+ case out <- line:
550550+ return true
551551+ }
552552+}
+405
provider_buildkite_test.go
···11+package main
22+33+// Provider-level integration tests for the Buildkite implementation:
44+// Spawn → CreateBuild + persist + initial pending publish, and
55+// HandleWebhook → translate state + publish status. Buildkite itself
66+// is stubbed with httptest so the tests don't need network access.
77+88+import (
99+ "context"
1010+ "crypto/hmac"
1111+ "crypto/sha256"
1212+ "encoding/hex"
1313+ "encoding/json"
1414+ "fmt"
1515+ "io"
1616+ "log/slog"
1717+ "net/http"
1818+ "net/http/httptest"
1919+ "strings"
2020+ "testing"
2121+ "time"
2222+2323+ "tangled.org/core/api/tangled"
2424+2525+ "github.com/mitchellh/tack/internal/buildkite"
2626+)
2727+2828+// newBuildkiteTestProvider wires a buildkiteProvider against an
2929+// httptest server impersonating api.buildkite.com. Returns the
3030+// store/broker so tests can inspect publishes + persistence.
3131+func newBuildkiteTestProvider(
3232+ t *testing.T,
3333+ mode buildkite.WebhookMode,
3434+ secret string,
3535+ bkHandler http.HandlerFunc,
3636+) (*buildkiteProvider, *store, *broker, *httptest.Server) {
3737+ t.Helper()
3838+ srv := httptest.NewServer(bkHandler)
3939+ t.Cleanup(srv.Close)
4040+4141+ prev := buildkite.APIBase
4242+ buildkite.APIBase = srv.URL
4343+ t.Cleanup(func() { buildkite.APIBase = prev })
4444+4545+ st := newTestStore(t)
4646+ br := newBroker(st)
4747+ logger := slog.Default()
4848+ p := newBuildkiteProvider(
4949+ br, st,
5050+ buildkite.NewClient("tok", "myorg"),
5151+ "mypipe",
5252+ secret, mode,
5353+ logger,
5454+ )
5555+ return p, st, br, srv
5656+}
5757+5858+// TestBuildkiteSpawn covers the full create-build path: trigger →
5959+// API call → DB row → "pending" status on the broker.
6060+func TestBuildkiteSpawn(t *testing.T) {
6161+ bk := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
6262+ w.WriteHeader(http.StatusCreated)
6363+ _ = json.NewEncoder(w).Encode(buildkite.Build{
6464+ ID: "uuid-1",
6565+ Number: 7,
6666+ })
6767+ })
6868+ p, st, _, _ := newBuildkiteTestProvider(t, buildkite.WebhookModeToken, "secret", bk)
6969+7070+ trigger := &tangled.Pipeline_TriggerMetadata{
7171+ Push: &tangled.Pipeline_PushTriggerData{
7272+ NewSha: "abcdef0123",
7373+ Ref: "refs/heads/main",
7474+ },
7575+ }
7676+ workflows := []*tangled.Pipeline_Workflow{
7777+ {Name: "test.yml", Raw: "steps:\n - run: true\n"},
7878+ }
7979+8080+ p.Spawn(context.Background(), "knot.example.com", "rkey-1", trigger, workflows)
8282+ // Spawn fans out into goroutines; wait briefly for the side
8383+ // effects to land. The status publish is spawnWorkflow's last
8484+ // step, so once an event row exists the build mapping has
8585+ // been persisted too.
8686+ deadline := time.Now().Add(2 * time.Second)
8787+ for time.Now().Before(deadline) {
8888+ rows, err := st.EventsAfter(context.Background(), 0)
8989+ if err != nil {
9090+ t.Fatalf("EventsAfter: %v", err)
9191+ }
9292+ if len(rows) > 0 {
9393+ break
9494+ }
9595+ time.Sleep(20 * time.Millisecond)
9696+ }
9797+ ref, err := st.LookupBuildkiteBuildByUUID(context.Background(), "uuid-1")
9898+ if err != nil || ref == nil {
9999+ t.Fatalf("build mapping not persisted within deadline: ref=%+v err=%v", ref, err)
100100+ }
101101+ if ref.Workflow != "test.yml" || ref.Knot != "knot.example.com" || ref.PipelineRkey != "rkey-1" {
102102+ t.Fatalf("ref mismatch: %+v", ref)
103103+ }
104104+ if ref.PipelineSlug != "mypipe" || ref.BuildNumber != 7 {
105105+ t.Fatalf("buildkite ref mismatch: %+v", ref)
106106+ }
107107+108108+ // One pending status should be on the events log.
109109+ rows, err := st.EventsAfter(context.Background(), 0)
110110+ if err != nil {
111111+ t.Fatalf("EventsAfter: %v", err)
112112+ }
113113+ if len(rows) != 1 {
114114+ t.Fatalf("got %d events, want 1", len(rows))
115115+ }
116116+ var rec tangled.PipelineStatus
117117+ if err := json.Unmarshal(rows[0].EventJSON, &rec); err != nil {
118118+ t.Fatalf("decode status: %v", err)
119119+ }
120120+ if rec.Status != "pending" || rec.Workflow != "test.yml" {
121121+ t.Fatalf("unexpected status: %+v", rec)
122122+ }
123123+ if !strings.Contains(rec.Pipeline, "knot.example.com") ||
124124+ !strings.Contains(rec.Pipeline, "rkey-1") {
125125+ t.Fatalf("pipeline ATURI wrong: %s", rec.Pipeline)
126126+ }
127127+}
128128+129129+// TestBuildkiteSpawnNoCommit confirms we don't fire a build when the
130130+// trigger has no commit to build — kicking one off would resolve to
131131+// whatever main looks like at agent-fetch time, which is dangerously
132132+// surprising.
133133+func TestBuildkiteSpawnNoCommit(t *testing.T) {
134134+ // Signal channel instead of a bare bool so the handler
135135+ // goroutine and the test goroutine don't race.
136136+ called := make(chan struct{}, 1)
137137+ bk := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
138138+ select {
139139+ case called <- struct{}{}:
140140+ default:
141141+ }
142142+ })
143143+ p, st, _, _ := newBuildkiteTestProvider(t, buildkite.WebhookModeToken, "secret", bk)
144144+145145+ p.Spawn(context.Background(), "knot.example.com", "rkey-1",
146146+ &tangled.Pipeline_TriggerMetadata{Manual: &tangled.Pipeline_ManualTriggerData{}},
147147+ []*tangled.Pipeline_Workflow{{Name: "test.yml"}},
148148+ )
149149+150150+ // Give any rogue goroutine a moment to land.
151151+ time.Sleep(50 * time.Millisecond)
152152+ select {
153153+ case <-called:
154154+ t.Fatal("CreateBuild called despite missing commit")
155155+ default:
156156+ }
150150+ rows, _ := st.EventsAfter(context.Background(), 0)
151151+ if len(rows) != 0 {
152152+ t.Fatalf("got %d events, want 0", len(rows))
153153+ }
154154+}
155155+156156+// TestBuildkiteHandleWebhook checks the translation pipeline:
157157+// recorded build + matching webhook → success status published.
158158+func TestBuildkiteHandleWebhook(t *testing.T) {
159159+ // t.Error rather than t.Fatal: the handler runs on a server
160160+ // goroutine, and FailNow must only be called from the test
161161+ // goroutine.
162162+ p, st, _, _ := newBuildkiteTestProvider(t, buildkite.WebhookModeToken, "secret",
163163+ func(w http.ResponseWriter, r *http.Request) { t.Error("buildkite shouldn't be called") })
161161+162162+ // Pre-seed a known build mapping.
163163+ if err := st.InsertBuildkiteBuild(context.Background(), BuildkiteBuildRef{
164164+ BuildUUID: "uuid-1",
165165+ BuildNumber: 7,
166166+ PipelineSlug: "mypipe",
167167+ Knot: "knot.example.com",
168168+ PipelineRkey: "rkey-1",
169169+ Workflow: "test.yml",
170170+ PipelineURI: "at://did:web:knot.example.com/sh.tangled.pipeline/rkey-1",
171171+ }); err != nil {
172172+ t.Fatalf("InsertBuildkiteBuild: %v", err)
173173+ }
174174+175175+ err := p.HandleWebhook(context.Background(), buildkite.WebhookPayload{
176176+ Event: "build.finished",
177177+ Build: buildkite.Build{ID: "uuid-1", State: "passed"},
178178+ })
179179+ if err != nil {
180180+ t.Fatalf("HandleWebhook: %v", err)
181181+ }
182182+183183+ rows, err := st.EventsAfter(context.Background(), 0)
184184+ if err != nil {
185185+ t.Fatalf("EventsAfter: %v", err)
186186+ }
187187+ if len(rows) != 1 {
188188+ t.Fatalf("got %d events, want 1", len(rows))
189189+ }
190190+ var rec tangled.PipelineStatus
191191+ if err := json.Unmarshal(rows[0].EventJSON, &rec); err != nil {
192192+ t.Fatalf("decode: %v", err)
193193+ }
194194+ if rec.Status != "success" || rec.Workflow != "test.yml" {
195195+ t.Fatalf("bad status: %+v", rec)
196196+ }
197197+}
198198+
199199+// TestBuildkiteHandleWebhookIgnored covers the "we don't care" paths:
200200+// non-build events and unknown builds must be no-op (no publish, no
201201+// error) so Buildkite doesn't retry them.
202202+func TestBuildkiteHandleWebhookIgnored(t *testing.T) {
203203+ p, st, _, _ := newBuildkiteTestProvider(t, buildkite.WebhookModeToken, "secret",
204204+ func(w http.ResponseWriter, r *http.Request) {})
205205+
206206+	// Non-build event: no lookup, no publish.
207207+ if err := p.HandleWebhook(context.Background(), buildkite.WebhookPayload{
208208+ Event: "job.started",
209209+ Build: buildkite.Build{ID: "uuid-x"},
210210+ }); err != nil {
211211+ t.Fatalf("HandleWebhook (job.started): %v", err)
212212+ }
213213+
214214+	// Build event for unknown UUID: no publish.
215215+ if err := p.HandleWebhook(context.Background(), buildkite.WebhookPayload{
216216+ Event: "build.finished",
217217+ Build: buildkite.Build{ID: "unknown-uuid", State: "passed"},
218218+ }); err != nil {
219219+ t.Fatalf("HandleWebhook (unknown): %v", err)
220220+ }
221221+
222222+	// Known build but unmapped state: no publish.
223223+ if err := st.InsertBuildkiteBuild(context.Background(), BuildkiteBuildRef{
224224+ BuildUUID: "uuid-blocked", PipelineSlug: "mypipe",
225225+ Knot: "k", PipelineRkey: "r", Workflow: "w",
226226+ PipelineURI: "at://x",
227227+ }); err != nil {
228228+ t.Fatalf("seed: %v", err)
229229+ }
230230+ if err := p.HandleWebhook(context.Background(), buildkite.WebhookPayload{
231231+ Event: "build.finished",
232232+ Build: buildkite.Build{ID: "uuid-blocked", State: "blocked"},
233233+ }); err != nil {
234234+ t.Fatalf("HandleWebhook (blocked): %v", err)
235235+ }
236236+
237237+	rows, _ := st.EventsAfter(context.Background(), 0)
238238+ if len(rows) != 0 {
239239+ t.Fatalf("got %d events, want 0", len(rows))
240240+ }
241241+}
242242+
243243+// TestBuildkiteWebhookHandlerHTTP exercises the full HTTP path
244244+// including auth: a request signed with the wrong secret must be
245245+// rejected, and a correctly-signed one must reach the provider.
246246+func TestBuildkiteWebhookHandlerHTTP(t *testing.T) {
247247+ // Signature mode is the more interesting code path; we cover
248248+ // token mode in the verifier-level tests above.
249249+ const secret = "swordfish"
250250+ p, st, _, _ := newBuildkiteTestProvider(t, buildkite.WebhookModeSignature, secret,
251251+ func(w http.ResponseWriter, r *http.Request) { /* unused */ })
252252+
253253+	// Pre-seed so the provider's HandleWebhook can resolve the build.
254254+ if err := st.InsertBuildkiteBuild(context.Background(), BuildkiteBuildRef{
255255+ BuildUUID: "uuid-2",
256256+ BuildNumber: 9,
257257+ PipelineSlug: "mypipe",
258258+ Knot: "knot.example.com",
259259+ PipelineRkey: "rkey-2",
260260+ Workflow: "test.yml",
261261+ PipelineURI: "at://did:web:knot.example.com/sh.tangled.pipeline/rkey-2",
262262+ }); err != nil {
263263+ t.Fatalf("seed: %v", err)
264264+ }
265265+
266266+	body, _ := json.Marshal(map[string]any{
267267+ "event": "build.finished",
268268+ "build": map[string]any{
269269+ "id": "uuid-2",
270270+ "state": "failed",
271271+ },
272272+ })
273273+
274274+	logger := slog.Default()
275275+ handler := buildkiteWebhookHandler(logger, p)
276276+
277277+	// Unsigned request → 401.
278278+ t.Run("rejects unsigned", func(t *testing.T) {
279279+ req := httptest.NewRequest(http.MethodPost, "/webhooks/buildkite",
280280+ strings.NewReader(string(body)))
281281+ req.Header.Set("X-Buildkite-Event", "build.finished")
282282+ w := httptest.NewRecorder()
283283+ handler(w, req)
284284+ if w.Code != http.StatusUnauthorized {
285285+ t.Fatalf("status = %d; want 401", w.Code)
286286+ }
287287+ })
288288+289289+ // Wrong-secret request → 401.
290290+ t.Run("rejects bad signature", func(t *testing.T) {
291291+ ts := fmt.Sprintf("%d", time.Now().Unix())
292292+ mac := hmac.New(sha256.New, []byte("wrong"))
293293+ mac.Write([]byte(ts))
294294+ mac.Write([]byte("."))
295295+ mac.Write(body)
296296+ sig := hex.EncodeToString(mac.Sum(nil))
297297+298298+ req := httptest.NewRequest(http.MethodPost, "/webhooks/buildkite",
299299+ strings.NewReader(string(body)))
300300+ req.Header.Set("X-Buildkite-Event", "build.finished")
301301+ req.Header.Set("X-Buildkite-Signature", "timestamp="+ts+",signature="+sig)
302302+ w := httptest.NewRecorder()
303303+ handler(w, req)
304304+ if w.Code != http.StatusUnauthorized {
305305+ t.Fatalf("status = %d; want 401", w.Code)
306306+ }
307307+ })
308308+
309309+	// Valid request → 200, status published.
310310+ t.Run("accepts valid", func(t *testing.T) {
311311+ ts := fmt.Sprintf("%d", time.Now().Unix())
312312+ mac := hmac.New(sha256.New, []byte(secret))
313313+ mac.Write([]byte(ts))
314314+ mac.Write([]byte("."))
315315+ mac.Write(body)
316316+ sig := hex.EncodeToString(mac.Sum(nil))
317317+318318+ req := httptest.NewRequest(http.MethodPost, "/webhooks/buildkite",
319319+ strings.NewReader(string(body)))
320320+ req.Header.Set("X-Buildkite-Event", "build.finished")
321321+ req.Header.Set("X-Buildkite-Signature", "timestamp="+ts+",signature="+sig)
322322+ w := httptest.NewRecorder()
323323+ handler(w, req)
324324+ if w.Code != http.StatusOK {
325325+ b, _ := io.ReadAll(w.Body)
326326+ t.Fatalf("status = %d body=%s; want 200", w.Code, string(b))
327327+ }
328328+ rows, _ := st.EventsAfter(context.Background(), 0)
329329+ if len(rows) != 1 {
330330+ t.Fatalf("got %d events, want 1", len(rows))
331331+ }
332332+ var rec tangled.PipelineStatus
333333+ if err := json.Unmarshal(rows[0].EventJSON, &rec); err != nil {
334334+ t.Fatalf("decode: %v", err)
335335+ }
336336+ if rec.Status != "failed" {
337337+ t.Fatalf("status = %q; want failed", rec.Status)
338338+ }
339339+ })
340340+}
341341+
342342+// TestBuildkiteWebhookHandlerNoProvider confirms the 503 branch when
343343+// tack is running with the fake provider — a misdirected webhook
344344+// must get a clear "not configured here" instead of a misleading
345345+// 200 OK that silently throws the event away.
346346+func TestBuildkiteWebhookHandlerNoProvider(t *testing.T) {
347347+ handler := buildkiteWebhookHandler(slog.Default(), nil)
348348+ req := httptest.NewRequest(http.MethodPost, "/webhooks/buildkite",
349349+ strings.NewReader("{}"))
350350+ w := httptest.NewRecorder()
351351+ handler(w, req)
352352+ if w.Code != http.StatusServiceUnavailable {
353353+ t.Fatalf("status = %d; want 503", w.Code)
354354+ }
355355+}
356356+
357357+// TestTriggerCommitAndBranch pins the trigger-shape mapping. Each
358358+// case pairs an input trigger with the (commit, branch) tuple a
359359+// real CI provider would feed into its build-creation API.
360360+func TestTriggerCommitAndBranch(t *testing.T) {
361361+ cases := []struct {
362362+ name string
363363+ in *tangled.Pipeline_TriggerMetadata
364364+ wantCommit string
365365+ wantBranch string
366366+ }{
367367+ {"nil", nil, "", ""},
368368+ {"push refs/heads",
369369+ &tangled.Pipeline_TriggerMetadata{
370370+ Push: &tangled.Pipeline_PushTriggerData{NewSha: "abc", Ref: "refs/heads/main"},
371371+ },
372372+ "abc", "main",
373373+ },
374374+ {"push tag ref preserved",
375375+ &tangled.Pipeline_TriggerMetadata{
376376+ Push: &tangled.Pipeline_PushTriggerData{NewSha: "abc", Ref: "refs/tags/v1"},
377377+ },
378378+ "abc", "refs/tags/v1",
379379+ },
380380+ {"pull request",
381381+ &tangled.Pipeline_TriggerMetadata{
382382+ PullRequest: &tangled.Pipeline_PullRequestTriggerData{
383383+ SourceSha: "def", SourceBranch: "feature",
384384+ },
385385+ },
386386+ "def", "feature",
387387+ },
388388+ {"manual with default branch",
389389+ &tangled.Pipeline_TriggerMetadata{
390390+ Manual: &tangled.Pipeline_ManualTriggerData{},
391391+ Repo: &tangled.Pipeline_TriggerRepo{DefaultBranch: "main"},
392392+ },
393393+ "", "main",
394394+ },
395395+ }
396396+ for _, c := range cases {
397397+ t.Run(c.name, func(t *testing.T) {
398398+ gotC, gotB := triggerCommitAndBranch(c.in)
399399+ if gotC != c.wantCommit || gotB != c.wantBranch {
400400+ t.Fatalf("got (%q,%q); want (%q,%q)",
401401+ gotC, gotB, c.wantCommit, c.wantBranch)
402402+ }
403403+ })
404404+ }
405405+}
···322322 return id, nil
323323}
324324
325325+// BuildkiteBuildRef is the persisted mapping from one Buildkite build
326326+// to the Tangled pipeline tuple that spawned it. It's the row written
327327+// by the Buildkite provider at Spawn time and read back from two
328328+// places: the webhook handler (by build UUID) when an event arrives,
329329+// and the /logs handler (by knot+rkey+workflow) when an appview
330330+// client asks for output.
331331+type BuildkiteBuildRef struct {
332332+ BuildUUID string
333333+ BuildNumber int64
334334+ PipelineSlug string
335335+ Knot string
336336+ PipelineRkey string
337337+ Workflow string
338338+ PipelineURI string
339339+}
340340+
341341+// InsertBuildkiteBuild records that a Buildkite build was created on
342342+// behalf of the given (knot, pipelineRkey, workflow) tuple. Uses an
343343+// upsert (ON CONFLICT DO UPDATE) so that an unlikely build-uuid
344344+// collision (or a Buildkite-side rebuild that re-fires us) just
345345+// refreshes the row instead of failing.
346346+func (s *store) InsertBuildkiteBuild(ctx context.Context, ref BuildkiteBuildRef) error {
347347+ _, err := s.db.ExecContext(ctx,
348348+ `INSERT INTO buildkite_builds (
349349+ build_uuid, build_number, pipeline_slug,
350350+ knot, pipeline_rkey, workflow,
351351+ pipeline_uri, created_at
352352+ ) VALUES (?, ?, ?, ?, ?, ?, ?, ?)
353353+ ON CONFLICT(build_uuid) DO UPDATE SET
354354+ build_number = excluded.build_number,
355355+ pipeline_slug = excluded.pipeline_slug,
356356+ knot = excluded.knot,
357357+ pipeline_rkey = excluded.pipeline_rkey,
358358+ workflow = excluded.workflow,
359359+ pipeline_uri = excluded.pipeline_uri,
360360+ created_at = excluded.created_at`,
361361+ ref.BuildUUID, ref.BuildNumber, ref.PipelineSlug,
362362+ ref.Knot, ref.PipelineRkey, ref.Workflow,
363363+		ref.PipelineURI, time.Now().UTC().Format("2006-01-02T15:04:05.000000000Z07:00"),
364364+ )
365365+ if err != nil {
366366+ return fmt.Errorf("insert buildkite_build: %w", err)
367367+ }
368368+ return nil
369369+}
370370+
371371+// LookupBuildkiteBuildByUUID returns the saved mapping for the given
372372+// Buildkite build UUID, or nil when no such build is recorded.
373373+// Returning a nil pointer rather than a sentinel error keeps the
374374+// webhook handler's "we don't know about this build" branch a simple
375375+// nil check.
376376+func (s *store) LookupBuildkiteBuildByUUID(ctx context.Context, buildUUID string) (*BuildkiteBuildRef, error) {
377377+ var ref BuildkiteBuildRef
378378+ err := s.db.QueryRowContext(ctx,
379379+ `SELECT build_uuid, build_number, pipeline_slug,
380380+ knot, pipeline_rkey, workflow, pipeline_uri
381381+ FROM buildkite_builds WHERE build_uuid = ?`,
382382+ buildUUID,
383383+ ).Scan(
384384+ &ref.BuildUUID, &ref.BuildNumber, &ref.PipelineSlug,
385385+ &ref.Knot, &ref.PipelineRkey, &ref.Workflow, &ref.PipelineURI,
386386+ )
387387+ if errors.Is(err, sql.ErrNoRows) {
388388+ return nil, nil
389389+ }
390390+ if err != nil {
391391+ return nil, fmt.Errorf("lookup buildkite_build by uuid: %w", err)
392392+ }
393393+ return &ref, nil
394394+}
395395+
396396+// LookupBuildkiteBuildByTuple finds the most recently created build
397397+// for (knot, pipelineRkey, workflow). Returns nil when no build has
398398+// been recorded for that tuple — used by /logs to translate the
399399+// appview's path-based identity back into something Buildkite knows.
400400+//
401401+// "Most recent" matters because a workflow may have multiple builds
402402+// over time (rebuilds, re-triggers). We always serve logs for the
403403+// latest run; older runs are still queryable by build UUID directly
404404+// if anyone ever wants that.
405405+func (s *store) LookupBuildkiteBuildByTuple(ctx context.Context, knot, pipelineRkey, workflow string) (*BuildkiteBuildRef, error) {
406406+ var ref BuildkiteBuildRef
407407+ err := s.db.QueryRowContext(ctx,
408408+ `SELECT build_uuid, build_number, pipeline_slug,
409409+ knot, pipeline_rkey, workflow, pipeline_uri
410410+ FROM buildkite_builds
411411+ WHERE knot = ? AND pipeline_rkey = ? AND workflow = ?
412412+ ORDER BY created_at DESC
413413+ LIMIT 1`,
414414+ knot, pipelineRkey, workflow,
415415+ ).Scan(
416416+ &ref.BuildUUID, &ref.BuildNumber, &ref.PipelineSlug,
417417+ &ref.Knot, &ref.PipelineRkey, &ref.Workflow, &ref.PipelineURI,
418418+ )
419419+ if errors.Is(err, sql.ErrNoRows) {
420420+ return nil, nil
421421+ }
422422+ if err != nil {
423423+ return nil, fmt.Errorf("lookup buildkite_build by tuple: %w", err)
424424+ }
425425+ return &ref, nil
426426+}
427427+
325428// EventsAfter returns every event row with `created` strictly greater
326429// than cursor, in cursor order. Used by /events to backfill a
327430// reconnecting subscriber and to drain newly-published rows on each
+26
store_migrate.go
···7878 event_json TEXT NOT NULL,
7979 inserted_at TEXT NOT NULL
8080);
8181+
8282+-- Mapping from a Buildkite build back to the Tangled pipeline that
8383+-- spawned it. The Buildkite webhook receiver only knows the build
8484+-- UUID; everything we need to publish a pipeline.status record
8585+-- (knot, pipeline rkey, workflow name, full pipeline ATURI) lives
8686+-- on this row.
8787+--
8888+-- pipeline_uri is denormalized off (knot, pipeline_rkey) so the
8989+-- webhook handler doesn't have to recompute the at:// string on
9090+-- every event — it's a constant for the lifetime of the build and
9191+-- the webhook is the hot path for status fan-out.
9292+--
9393+-- The (knot, pipeline_rkey, workflow) index supports the /logs
9494+-- handler, which only knows that tuple at request time.
9595+CREATE TABLE IF NOT EXISTS buildkite_builds (
9696+ build_uuid TEXT PRIMARY KEY,
9797+ build_number INTEGER NOT NULL,
9898+ pipeline_slug TEXT NOT NULL,
9999+ knot TEXT NOT NULL,
100100+ pipeline_rkey TEXT NOT NULL,
101101+ workflow TEXT NOT NULL,
102102+ pipeline_uri TEXT NOT NULL,
103103+ created_at TEXT NOT NULL
104104+);
105105+CREATE INDEX IF NOT EXISTS buildkite_builds_lookup
106106+ ON buildkite_builds (knot, pipeline_rkey, workflow);
81107`
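A quick way to confirm the lookup index pays off as the comment intends is to ask SQLite for its plan against a scratch database (a hypothetical `sqlite3` session, not part of the migration):

```sql
-- The tuple lookup used by LookupBuildkiteBuildByTuple should hit
-- buildkite_builds_lookup on the three equality columns.
EXPLAIN QUERY PLAN
SELECT build_uuid FROM buildkite_builds
 WHERE knot = 'k' AND pipeline_rkey = 'r' AND workflow = 'w'
 ORDER BY created_at DESC LIMIT 1;
-- Expect a SEARCH using buildkite_builds_lookup; the ORDER BY still
-- sorts the small matching set, since created_at is not in the index.
```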
82108
83109// migrate applies the schema. Safe to call repeatedly.