···

 ## Integrated Development

-Sometimes it is helpful to run a PLC, PDS, BGS, labelmaker, and other components, all locally on your laptop, across languages. This section describes one setup for this.
+Sometimes it is helpful to run a PLC, PDS, BGS, and other components, all locally on your laptop, across languages. This section describes one setup for this.

 First, you need PostgreSQL running locally. This could be via docker, or the following commands assume some kind of debian/ubuntu setup with a postgres server package installed and running.

···

  make run-dev-pds

-In this repo (indigo), start a BGS and labelmaker, in two separate terminals:
+In this repo (indigo), start a BGS, in two separate terminals:

  make run-dev-bgs
- make run-dev-labelmaker

 In a final terminal, run fakermaker to inject data into the system:

-5
Makefile
···
  go build ./cmd/lexgen
  go build ./cmd/stress
  go build ./cmd/fakermaker
- go build ./cmd/labelmaker
  go build ./cmd/hepa
  go build ./cmd/supercollider
  go build -o ./sonar-cli ./cmd/sonar
···
 run-bgs-image:
  docker run -p 2470:2470 bigsky /bigsky --admin-key localdev
 # --crawl-insecure-ws
-
-.PHONY: run-dev-labelmaker
-run-dev-labelmaker: .env ## Runs labelmaker for local dev
- GOLOG_LOG_LEVEL=info go run ./cmd/labelmaker --subscribe-insecure-ws

 .PHONY: run-dev-search
 run-dev-search: .env ## Runs search daemon for local dev
+2-2
README.md
···

 *not to be confused with the [AT command set](https://en.wikipedia.org/wiki/Hayes_command_set) or [Adenosine triphosphate](https://en.wikipedia.org/wiki/Adenosine_triphosphate)*

-The Authenticated Transfer Protocol ("ATP" or "atproto") is a decentralized social media protocol, developed by [Bluesky PBC](https://blueskyweb.xyz). Learn more at:
+The Authenticated Transfer Protocol ("ATP" or "atproto") is a decentralized social media protocol, developed by [Bluesky PBC](https://bsky.social). Learn more at:

 - [Overview and Guides](https://atproto.com/guides/overview) 👈 Best starting point
 - [Github Discussions](https://github.com/bluesky-social/atproto/discussions) 👈 Great place to ask questions
 - [Protocol Specifications](https://atproto.com/specs/atp)
-- [Blogpost on self-authenticating data structures](https://blueskyweb.xyz/blog/3-6-2022-a-self-authenticating-social-protocol)
+- [Blogpost on self-authenticating data structures](https://bsky.social/about/blog/3-6-2022-a-self-authenticating-social-protocol)

 The Bluesky Social application encompasses a set of schemas and APIs built in the overall AT Protocol framework. The namespace for these "Lexicons" is `app.bsky.*`.

···
 func TestCacheDirectory(t *testing.T) {
  t.Skip("TODO: skipping live network test")
  inner := BaseDirectory{}
- d := NewCacheDirectory(&inner, 1000, time.Hour*1, time.Hour*1)
+ d := NewCacheDirectory(&inner, 1000, time.Hour*1, time.Hour*1, time.Hour*1)
  for i := 0; i < 3; i = i + 1 {
  testDirectoryLive(t, &d)
  }
···
  TryAuthoritativeDNS: true,
  SkipDNSDomainSuffixes: []string{".bsky.social"},
  }
- dir := NewCacheDirectory(&base, 1000, time.Hour*1, time.Hour*1)
+ dir := NewCacheDirectory(&base, 1000, time.Hour*1, time.Hour*1, time.Hour*1)
  // All 60 routines launch at the same time, so they should all miss the cache initially
  routines := 60
  wg := sync.WaitGroup{}
+1-1
backfill/gormstore.go
···
 func (s *Gormstore) createJobForRepo(repo, state string) error {
  dbj := &GormDBJob{
  Repo: repo,
- State: StateEnqueued,
+ State: state,
  }
  if err := s.db.Create(dbj).Error; err != nil {
  if err == gorm.ErrDuplicatedKey {
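Review note on the hunk above: before this change, `createJobForRepo` accepted a `state` argument but always stored `StateEnqueued`. Below is a minimal, self-contained sketch of that difference with no database involved. `Job`, `newJobBefore`, `newJobAfter`, and the string values of the constants are hypothetical stand-ins, not the real `GormDBJob` code.

```go
package main

import "fmt"

// Hypothetical state values; the diff names StateEnqueued but not its value.
const (
	StateEnqueued = "enqueued"
	StateComplete = "complete"
)

// Job is a stand-in carrying only the two fields visible in the diff.
type Job struct {
	Repo  string
	State string
}

// newJobBefore mirrors the old behavior: the state argument is ignored.
func newJobBefore(repo, state string) Job {
	return Job{Repo: repo, State: StateEnqueued}
}

// newJobAfter mirrors the fixed behavior: the caller's state is stored.
func newJobAfter(repo, state string) Job {
	return Job{Repo: repo, State: state}
}

func main() {
	fmt.Println(newJobBefore("did:example:alice", StateComplete).State) // prints "enqueued"
	fmt.Println(newJobAfter("did:example:alice", StateComplete).State)  // prints "complete"
}
```

Go does not flag unused function parameters, so a mistake of this shape compiles without any warning.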
+1-1
cmd/hepa/README.md
···
 - which rules are included configured at compile time
 - admin access to fetch private account metadata, and to persist moderation actions, is optional. it is possible for anybody to run a `hepa` instance

-This is not a "labeling service" per say, in that it pushes labels in to an existing moderation service, and doesn't provide API endpoints or label streams. see `labelmaker` for a self-contained labeling service.
+This is not a "labeling service" per say, in that it pushes labels in to an existing moderation service, and doesn't provide API endpoints or label streams.

 Performance is generally slow when first starting up, because account-level metadata is being fetched (and cached) for every firehose event. After the caches have "warmed up", events are processed faster.

···
-# Run this dockerfile from the top level of the indigo git repository like:
-#
-# podman build -f ./cmd/labelmaker/Dockerfile -t labelmaker .
-
-### Compile stage
-FROM golang:1.21-alpine3.18 AS build-env
-RUN apk add --no-cache build-base make git
-
-ADD . /dockerbuild
-WORKDIR /dockerbuild
-
-# timezone data for alpine builds
-ENV GOEXPERIMENT=loopvar
-RUN GIT_VERSION=$(git describe --tags --long --always) && \
- go build -tags timetzdata -o /labelmaker ./cmd/labelmaker
-
-### Run stage
-FROM alpine:3.18
-
-RUN apk add --no-cache --update dumb-init ca-certificates
-ENTRYPOINT ["dumb-init", "--"]
-
-WORKDIR /
-RUN mkdir -p data/labelmaker
-COPY --from=build-env /labelmaker /
-
-# small things to make golang binaries work well under alpine
-ENV GODEBUG=netdns=go
-ENV TZ=Etc/UTC
-
-EXPOSE 2210
-
-CMD ["/labelmaker"]
-
-LABEL org.opencontainers.image.source=https://github.com/bluesky-social/indigo
-LABEL org.opencontainers.image.description="ATP Labeling Service (labelmaker)"
-LABEL org.opencontainers.image.licenses=MIT
-101
cmd/labelmaker/README.md
···
-
-labelmaker
-===========
-
-## Database Setup
-
-PostgreSQL and Sqlite are both supported. When using Sqlite, separate database
-for the BGS database itself and the CarStore are used. With PostgreSQL a single
-database server, user, and database, can all be reused.
-
-Database configuration is passed via the `DATABASE_URL` and
-`CARSTORE_DATABASE_URL` environment variables, or the corresponding CLI args.
-
-For PostgreSQL, the user and database must already be configured. Some example
-SQL commands are:
-
- CREATE DATABASE bgs;
- CREATE DATABASE carstore;
-
- CREATE USER ${username} WITH PASSWORD '${password}';
- GRANT ALL PRIVILEGES ON DATABASE bgs TO ${username};
- GRANT ALL PRIVILEGES ON DATABASE carstore TO ${username};
-
-This service currently uses `gorm` to automatically run database migrations as
-the regular user. There is no concept of running a separate set of migrations
-under more privileged database user.
-
-For database performance with many labels, it is important that `LC_COLLATE=C`.
-That is, the string sort behavior must be by byte order.
-
-## Keyword Labeler
-
-A trivial keyword filter labeler is included. To configure it, create a JSON
-with the same structure as the `example_keywords.json` file in this directory,
-and provide the path to the `--keyword-file` CLI arg (or the corresponding env
-var).
-
-The structure is a list of label values ("value"), each with a list of
-lower-case keyword tokens. If a token is found in post or profile text, the
-corresponding label is generated.
-
-
-## micro-NSFW-img Integration
-
-`micro_nsfw_img` is a simple image classification tool, useful for integration
-testing and local development. You can HTTP POST and image to it and get a set
-of floating point scores back about whether it is hentai, porn, etc. See more
-at <https://gitlab.com/bnewbold/micro-nsfw-img>.
-
-To get it working with labelmaker, download the huge (3+ GByte) dockerfile and
-run it locally:
-
- docker pull bnewbold/micro-nsfw-img:latest
- docker run --network host bnewbold/micro-nsfw-img
-
-Then configure labelmaker with:
-
- # or the '--micro-nsfw-img-url' CLI flag
- LABELMAKER_MICRO_NSFW_IMG_URL="http://localhost:5000/classify-image"
-
-
-## SQRL Integration
-
-SQRL is a moderation system built around a declarative rule language,
-application events, and cached counter values. It is the open source release of
-Smyt, a moderation system acquired and used by Twitter many years ago. See the
-SQRL docs for more: <https://sqrl-lang.github.io/sqrl/index.html>
-
-A local SQRL moderation server can be queried by providing `--sqrl-url` (or the
-corresponding env var). Post and Profile records will be passed, wrapped in a
-top-level JSON field `EventData`.
-
-An example SQRL ruleset for posts and profiles is provided in `sqrl_example`.
-To use this, checkout the SQRL codebase and get it running, then copy the
-`bsky` folder to the top directory and run:
-
- ./sqrl serve bsky/main.sqrl
-
-Counter state will not persist across restarts unless Redis is configured as
-well.
-
-
-## Repo Account Setup
-
-You'll need a DID and handle for the labelmaker service itself.
-
-Generate the secret keys (as JSON files), along with did:key representations,
-and store these in a password manager:
-
- go run ./cmd/laputa/ gen-key -o labelmaker_signing.key
- go run ./cmd/gosky/ did didKey --keypath labeler_signing.key
-
- go run ./cmd/laputa/ gen-key -o labelmaker_recovery.key
- go run ./cmd/gosky/ did didKey --keypath labeler_recovery.key
-
-Use the result to generate a new DID:
-
- go run ./cmd/gosky/ did create --recoverydid did:key:FROMABOVE --signingkey labeler_signing.key your.handle.tld https://your.pds.host
-
-The signing key JSON, along with repo handle and DID, can be passed to
-labelmaker via an environment variables.
···
-
-# EventData, EventType, UserDid already defined
-LET Text := jsonValue(EventData, "$.post.text");
-
-log("Text: %s", Text);
-
-LET HasCryptoKeywords := patternMatches("cryptocurrency_keywords.txt", Text);
-
-log("HasCryptoKeywords: %s", HasCryptoKeywords);
-
-LET NumPostsAboutCrypto := count(BY UserDid WHERE HasCryptoKeywords LAST DAY);
-
-CREATE RULE TooMuchCrypto WHERE NumPostsAboutCrypto > 5 WITH REASON "Repo ${UserDid} posted about crypto ${NumPostsAboutCrypto} times in the last day";
-WHEN TooMuchCrypto THEN blockAction();
-3
cmd/labelmaker/sqrl_example/bsky/profile.sqrl
···
-
-# EventData, EventType, UserDid already defined
-LET Text := jsonValue(EventData, "$.profile.description");