automod context refactor (#500)

Builds on warpfork's progress, and our co-working a couple days back.

Changelog:

- all state mutations in a single dedicated `Effects` struct
- consolidated record event metadata into a single RecordOp struct.
intended to be immutable (kind of like AccountMeta, but with less
fetched context). aligns functionally with repoOp from the atproto
firehose schema, but has somewhat different syntax. encompasses any of
create, update, or delete ops on a single record
- removed the "*Event" types and effectively replaced them with "*Context".
there is no immutable "Event" sub-field on Context, but AccountMeta and
RecordOp basically fill that role. engine and effects are (intentionally)
private fields. `*Context` is the API exposed to rule functions
- cherry-picked an unrelated API/schema hotfix patch into this branch
- merged effects package into engine
- split out capture (and "fetch") code to a package
- moved engine-specific tests into the engine package
- re-export some types from the top-level `automod` package
- nuked the `automod/util` package, for now (just copied around
dedupeStrings)
- cleaned up some code duplication in tests (just use ProcessRecordOp
instead of duplicating code)
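
As a rough illustration of the Context/Effects split described in the changelog above: this is a minimal sketch, not the code in this PR. The simplified `RecordOp`, `Effects`, and `RecordContext` types, the `exampleRule` function, and the `runRule` helper are all hypothetical stand-ins, assuming only what the changelog states (immutable op metadata, mutations collected in one struct, a private effects field reached through thin wrapper methods).

```go
package main

import "fmt"

// RecordOp is a hypothetical, simplified stand-in for the immutable record
// operation metadata described in the changelog (create, update, or delete).
type RecordOp struct {
	Action     string // "create", "update", or "delete"
	DID        string
	Collection string
	RecordKey  string
}

// Effects collects every mutation requested by rules. Nothing here touches
// external services; an engine would persist it after all rules have run.
type Effects struct {
	recordLabels []string
}

// AddRecordLabel records a label once, ignoring duplicates.
func (e *Effects) AddRecordLabel(val string) {
	for _, v := range e.recordLabels {
		if v == val {
			return
		}
	}
	e.recordLabels = append(e.recordLabels, val)
}

// RecordContext is what a rule function receives: public read-only metadata
// plus a private effects field reachable only through wrapper methods.
type RecordContext struct {
	RecordOp RecordOp
	effects  *Effects
}

// AddRecordLabel is a thin wrapper, so rule authors never handle Effects.
func (c *RecordContext) AddRecordLabel(val string) {
	c.effects.AddRecordLabel(val)
}

// exampleRule labels every post record as "spam" (illustration only).
func exampleRule(c *RecordContext) error {
	if c.RecordOp.Collection == "app.bsky.feed.post" {
		c.AddRecordLabel("spam")
		c.AddRecordLabel("spam") // duplicate call is de-duplicated
	}
	return nil
}

// runRule builds a context, runs one rule, and returns the collected labels.
func runRule(op RecordOp, rule func(*RecordContext) error) ([]string, error) {
	eff := &Effects{}
	c := &RecordContext{RecordOp: op, effects: eff}
	if err := rule(c); err != nil {
		return nil, err
	}
	return eff.recordLabels, nil
}

func main() {
	op := RecordOp{
		Action:     "create",
		DID:        "did:plc:abc111",
		Collection: "app.bsky.feed.post",
		RecordKey:  "abc123",
	}
	labels, err := runRule(op, exampleRule)
	fmt.Println(labels, err)
}
```

The point of the shape is that a rule can only express intent through the context methods, while the caller decides if and when the collected effects are persisted.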

Questions for review:

- should canonical-log-line be a method on Engine, or Context, or
neither? feels like rules should not have access to this method. I
suspect the logic around notifications and dev-observability will get
more sophisticated soon, like only logging or notifying in Slack for
specific effects. the effects split-out should help a lot with this, and
with simplifying the existing "persist" methods
- should the "effects" field be exposed ("Effects") on Context, and methods
called directly on that? I decided not to for now, and put thin wrapper
methods directly on Context. I think we've achieved the state/mutation
split in implementation/internals, and rule-writers don't need to know
or think about this distinction
- we *could* hide more fields on the Context structs and make them
accessible only via methods (`c.Account().Identity` instead of
`c.Account.Identity`). I think for the "core" metadata it is fine as
fields, even though this locks us in API-wise. maybe things like the
admin "Private" metadata (which may or may not be present) should be
behind a method call.
- probably should do a review of expected behaviors around
partial-processing (eg, if an error is encountered mid-persist) and
ensure that is sane and documented
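
To make the question about hiding fields behind methods more concrete, here is a hedged sketch of what putting the optional "Private" metadata behind a method call could look like. The lowercase `accountMeta` and `accountPrivate` types, the `Private()` accessor, and the `describe` helper are hypothetical and are not part of this PR:

```go
package main

import "fmt"

// accountPrivate is a hypothetical stand-in for admin-only account metadata.
type accountPrivate struct {
	Email string
}

// accountMeta keeps core metadata as a plain field, but hides the optional
// private metadata behind an accessor method.
type accountMeta struct {
	DID     string
	private *accountPrivate // nil when the runtime has no admin access
}

// Private returns a copy of the private metadata, plus a flag reporting
// whether it was present, so callers never dereference a nil pointer.
func (a *accountMeta) Private() (accountPrivate, bool) {
	if a == nil || a.private == nil {
		return accountPrivate{}, false
	}
	return *a.private, true
}

// describe shows how a rule might branch on the availability of the data.
func describe(a *accountMeta) string {
	if priv, ok := a.Private(); ok {
		return a.DID + " email=" + priv.Email
	}
	return a.DID + " (no private metadata)"
}

func main() {
	withAdmin := &accountMeta{
		DID:     "did:plc:abc111",
		private: &accountPrivate{Email: "user@example.com"},
	}
	without := &accountMeta{DID: "did:plc:abc222"}
	fmt.Println(describe(withAdmin))
	fmt.Println(describe(without))
}
```

The trade-off is the one raised in the question: the accessor forces callers to handle the "not present" case, at the cost of a slightly noisier API than a plain field.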

Current state: all tests updated and passing. README updated. all the
docs/comments likely need a review and update, but I think that can be a
separate PR. This feels pretty good and my disposition is to merge this
pretty soon, as the branch has gotten large and will cause merge
conflicts (we knew this when we started). the last thing I want to do
before merging is test in staging and against the prod firehose (from my laptop).

authored by bnewbold and committed by GitHub
452b41b7 b877bf63

+1802 -1541
+18 -16
automod/README.md
···

 ## Architecture

-The runtime (`automod.Engine`) manages network requests, caching, and configuration. Outside calling code makes concurrent calls to the `Process*Event` methods that the runtime provides. The runtime constructs event structs (eg, `automod.RecordEvent`), hydrates relevant context metadata from (cached) external services, and then executes a configured set of rules on the event. Rules may request additional context, do arbitrary local compute, and mute the event with any moderation "actions". After all rules have run, the runtime will inspect the event, update counter state, and push any new moderation actions to external services.
+The runtime (`automod.Engine`) manages network requests, caching, and configuration. Outside calling code makes concurrent calls to the `Process*` methods that the runtime provides. The runtime constructs event context structs (eg, `automod.RecordContext`), hydrates relevant metadata from (cached) external services, and then executes a configured set of rules on the event. Rules may request additional context, do arbitrary local compute, and update the context with "effects" (such as moderation actions). After all rules have run, the runtime will inspect the context and persist any side-effects, such as updating counter state and pushing any new moderation actions to external services.

 The runtime keeps state in several "stores", each of which has an interface and both in-memory and Redis implementations. It is expected that Redis is used in virtually all deployments. The store types are:

-- `automod.CacheStore`: generic data caching with expiration (TTL) and explicit purging. Used to cache account-level metadata, including identity lookups and (if available) private account metadata
-- `automod.CountStore`: keyed integer counters with time bucketing (eg, "hour", "day", "total"). Also includes probabilistic "distinct value" counters (eg, Redis HyperLogLog counters, with roughly 2% precision)
-- `automod.SetStore`: configurable static string sets. May eventually be runtime configurable
-- `automod.FlagStore`: mechanism to keep track of automod-generated "flags" (like labels or hashtags) on accounts or records. Mostly used to detect *new* flags. May eventually be moved in to the moderation service itself, similar to labels
+- `automod/cachestore`: generic data caching with expiration (TTL) and explicit purging. Used to cache account-level metadata, including identity lookups and (if available) private account metadata
+- `automod/countstore`: keyed integer counters with time bucketing (eg, "hour", "day", "total"). Also includes probabilistic "distinct value" counters (eg, Redis HyperLogLog counters, with roughly 2% precision)
+- `automod/setstore`: configurable static string sets. May eventually be runtime configurable
+- `automod/flagstore`: mechanism to keep track of automod-generated "flags" (like labels or hashtags) on accounts or records. Mostly used to detect *new* flags. May eventually be moved in to the moderation service itself, similar to labels


 ## Rule API
···
 ```golang
 var gtubeString = "XJS*C4JDBQADN1.NSBN3*2IDNEN*GTUBE-STANDARD-ANTI-UBE-TEST-EMAIL*C.34X"

-func GtubePostRule(evt *automod.RecordEvent, post *appbsky.FeedPost) error {
+func GtubePostRule(c *automod.RecordContext, post *appbsky.FeedPost) error {
 	if strings.Contains(post.Text, gtubeString) {
-		evt.AddRecordLabel("spam")
+		c.AddRecordLabel("spam")
 	}
 	return nil
 }
 ```

 Every new post record will be inspected to see if it contains a static test string. If it does, the label `spam` will be applied to the record itself.

-The `evt` parameter provides access to relevant pre-fetched metadata; methods to fetch additional metadata from the network; a `slog` logging interface; and methods to store output decisions. The runtime will catch and recover from unexpected panics, and will log returned errors, but rules are generally expected to run robustly and efficiently, and not have complex control flow needs.
+The `c` parameter provides access to relevant pre-fetched metadata; methods to fetch additional metadata from the network; a `slog` logging interface; and methods to store output decisions. The runtime will catch and recover from unexpected panics, and will log returned errors, but rules are generally expected to run robustly and efficiently, and not have complex control flow needs.

-Some of the more commonly used features of `evt` (`automod.RecordEvent`):
+Some of the more commonly used features of `c` (`automod.RecordContext`):
+
+- `c.Logger`: a `log/slog` logging interface
+- `c.Account.Identity`: atproto identity for the author account, including DID, handle, and PDS endpoint
+- `c.Account.Private`: when not-null (aka, when the runtime has administrator access) will contain things like `.IndexedAt` (account first seen) and `.Email` (the current registered account email)
+- `c.Account.Profile`: a cached subset of the account's `app.bsky.actor.profile` record (if non-null)
+- `c.GetCount(<namespace>, <value>, <time-period>)` and `c.Increment(<namespace>, <value>)`: to access and update simple counters (by hour, day, or total). Incrementing counters is lazy and happens in batch after all rules have executed: this means that multiple calls are de-duplicated, and that `GetCount` will not reflect any prior `Increment` calls in the same rule (or between rules).
+- `c.GetCountDistinct(<namespace>, <bucket>, <time-period>)` and `c.IncrementDistinct(<namespace>, <bucket>, <value>)`: similar to simple counters, but counts "unique distinct values"
+- `c.InSet(<set-name>, <value>)`: checks if a string is in a named set

-- `evt.Logger`: a `log/slog` logging interface
-- `evt.Account.Identity`: atproto identity for the author account, including DID, handle, and PDS endpoint
-- `evt.Account.Private`: when not-null (aka, when the runtime has administrator access) will contain things like `.IndexedAt` (account first seen) and `.Email` (the current registered account email)
-- `evt.Account.Profile`: a cached subset of the account's `app.bsky.actor.profile` record (if non-null)
-- `evt.GetCount(<namespace>, <value>, <time-period>)` and `evt.Increment(<namespace>, <value>)`: to access and update simple counters (by hour, day, or total). Incrementing counters is lazy and happens in batch after all rules have executed: this means that multiple calls are de-duplicated, and that `GetCount` will not reflect any prior `Increment` calls in the same rule (or between rules).
-- `evt.GetCountDistinct(<namespace>, <bucket>, <time-period>)` and `evt.IncrementDistinct(<namespace>, <bucket>, <value>)`: similar to simple counters, but counts "unique distinct values"
-- `evt.InSet(<set-name>, <value>)`: checks if a string is in a named set
+Notice that few (or none) of the context methods return errors. Errors are accumulated internally on the context itself, and error handling takes place before any effects are persisted by the engine.


 ## Developing New Rules
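
The lazy counter semantics described in the README change above (increments are batched and de-duplicated, and `GetCount` does not see pending `Increment` calls) can be sketched roughly as follows. This is an illustrative model only, assuming an in-memory map in place of the real count store and ignoring time periods; `counterRef`, `memCounters`, `ruleContext`, and `persist` are hypothetical names, not the package's API.

```go
package main

import "fmt"

// counterRef identifies one counter by namespace and value.
type counterRef struct {
	name string
	val  string
}

// memCounters is a toy in-memory counter store (time buckets omitted).
type memCounters map[counterRef]int

// ruleContext queues increments instead of applying them immediately.
type ruleContext struct {
	store   memCounters
	pending []counterRef
}

// GetCount reads only persisted state; queued increments are not visible.
func (c *ruleContext) GetCount(name, val string) int {
	return c.store[counterRef{name, val}]
}

// Increment queues a counter update, de-duplicating repeated calls.
func (c *ruleContext) Increment(name, val string) {
	ref := counterRef{name, val}
	for _, p := range c.pending {
		if p == ref {
			return
		}
	}
	c.pending = append(c.pending, ref)
}

// persist applies all queued increments in one batch, as an engine might do
// after every rule has finished running.
func (c *ruleContext) persist() {
	for _, ref := range c.pending {
		c.store[ref]++
	}
	c.pending = nil
}

func main() {
	c := &ruleContext{store: memCounters{}}
	c.Increment("posts", "did:plc:abc111")
	c.Increment("posts", "did:plc:abc111") // de-duplicated
	fmt.Println("before persist:", c.GetCount("posts", "did:plc:abc111"))
	c.persist()
	fmt.Println("after persist:", c.GetCount("posts", "did:plc:abc111"))
}
```

In this model a rule that increments and then immediately reads a counter sees the old value, which is why rules typically read first and treat the increment as applying to future events.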
+3 -31
automod/account_meta.go → automod/engine/fetchaccountmeta.go
···
-package automod
+package engine

 import (
 	"context"
 	"encoding/json"
 	"fmt"
-	"time"

 	comatproto "github.com/bluesky-social/indigo/api/atproto"
 	appbsky "github.com/bluesky-social/indigo/api/bsky"
 	"github.com/bluesky-social/indigo/atproto/identity"
 	"github.com/bluesky-social/indigo/atproto/syntax"
-	"github.com/bluesky-social/indigo/automod/util"
 )
-
-type ProfileSummary struct {
-	HasAvatar   bool
-	Description *string
-	DisplayName *string
-}
-
-type AccountPrivate struct {
-	Email          string
-	EmailConfirmed bool
-	IndexedAt      time.Time
-}
-
-// information about a repo/account/identity, always pre-populated and relevant to many rules
-type AccountMeta struct {
-	Identity             *identity.Identity
-	Profile              ProfileSummary
-	Private              *AccountPrivate
-	AccountLabels        []string
-	AccountNegatedLabels []string
-	AccountFlags         []string
-	FollowersCount       int64
-	FollowsCount         int64
-	PostsCount           int64
-	Takendown            bool
-}

 func (e *Engine) GetAccountMeta(ctx context.Context, ident *identity.Identity) (*AccountMeta, error) {

···
 			Description: pv.Description,
 			DisplayName: pv.DisplayName,
 		},
-		AccountLabels:        util.DedupeStrings(labels),
-		AccountNegatedLabels: util.DedupeStrings(negLabels),
+		AccountLabels:        dedupeStrings(labels),
+		AccountNegatedLabels: dedupeStrings(negLabels),
 		AccountFlags:         flags,
 	}
 	if pv.PostsCount != nil {
-46
automod/action_dedupe_test.go
···
-package automod
-
-import (
-	"context"
-	"testing"
-
-	appbsky "github.com/bluesky-social/indigo/api/bsky"
-	"github.com/bluesky-social/indigo/atproto/identity"
-	"github.com/bluesky-social/indigo/atproto/syntax"
-	"github.com/bluesky-social/indigo/automod/countstore"
-
-	"github.com/stretchr/testify/assert"
-)
-
-func alwaysReportAccountRule(evt *RecordEvent) error {
-	evt.ReportAccount(ReportReasonOther, "test report")
-	return nil
-}
-
-func TestAccountReportDedupe(t *testing.T) {
-	assert := assert.New(t)
-	ctx := context.Background()
-	engine := EngineTestFixture()
-	engine.Rules = RuleSet{
-		RecordRules: []RecordRuleFunc{
-			alwaysReportAccountRule,
-		},
-	}
-
-	path := "app.bsky.feed.post/abc123"
-	cid1 := "cid123"
-	p1 := appbsky.FeedPost{Text: "some post blah"}
-	id1 := identity.Identity{
-		DID:    syntax.DID("did:plc:abc111"),
-		Handle: syntax.Handle("handle.example.com"),
-	}
-
-	// exact same event multiple times; should only report once
-	for i := 0; i < 5; i++ {
-		assert.NoError(engine.ProcessRecord(ctx, id1.DID, path, cid1, &p1))
-	}
-
-	reports, err := engine.GetCount("automod-quota", "report", countstore.PeriodDay)
-	assert.NoError(err)
-	assert.Equal(1, reports)
-}
+45
automod/capture/capture.go
···
+package capture
+
+import (
+	"context"
+
+	comatproto "github.com/bluesky-social/indigo/api/atproto"
+	"github.com/bluesky-social/indigo/atproto/syntax"
+	"github.com/bluesky-social/indigo/automod"
+)
+
+type AccountCapture struct {
+	CapturedAt  syntax.Datetime                     `json:"capturedAt"`
+	AccountMeta automod.AccountMeta                 `json:"accountMeta"`
+	PostRecords []comatproto.RepoListRecords_Record `json:"postRecords"`
+}
+
+func CaptureRecent(ctx context.Context, eng *automod.Engine, atid syntax.AtIdentifier, limit int) (*AccountCapture, error) {
+	ident, records, err := FetchRecent(ctx, eng, atid, limit)
+	if err != nil {
+		return nil, err
+	}
+	pr := []comatproto.RepoListRecords_Record{}
+	for _, r := range records {
+		if r != nil {
+			pr = append(pr, *r)
+		}
+	}
+
+	// clear any pre-parsed key, which would fail to marshal as JSON
+	ident.ParsedPublicKey = nil
+	am, err := eng.GetAccountMeta(ctx, ident)
+	if err != nil {
+		return nil, err
+	}
+
+	// auto-clear sensitive PII (eg, account email)
+	am.Private = nil
+
+	ac := AccountCapture{
+		CapturedAt:  syntax.DatetimeNow(),
+		AccountMeta: *am,
+		PostRecords: pr,
+	}
+	return &ac, nil
+}
+23
automod/capture/capture_test.go
···
+package capture
+
+import (
+	"testing"
+
+	"github.com/bluesky-social/indigo/automod/countstore"
+	"github.com/bluesky-social/indigo/automod/engine"
+	"github.com/stretchr/testify/assert"
+)
+
+func TestNoOpCaptureReplyRule(t *testing.T) {
+	assert := assert.New(t)
+
+	eng := engine.EngineTestFixture()
+	capture := MustLoadCapture("testdata/capture_atprotocom.json")
+	assert.NoError(ProcessCaptureRules(&eng, capture))
+	c, err := eng.GetCount("automod-quota", "report", countstore.PeriodDay)
+	assert.NoError(err)
+	assert.Equal(0, c)
+	c, err = eng.GetCount("automod-quota", "takedown", countstore.PeriodDay)
+	assert.NoError(err)
+	assert.Equal(0, c)
+}
+96
automod/capture/fetch.go
···
+package capture
+
+import (
+	"context"
+	"fmt"
+
+	comatproto "github.com/bluesky-social/indigo/api/atproto"
+	"github.com/bluesky-social/indigo/atproto/identity"
+	"github.com/bluesky-social/indigo/atproto/syntax"
+	"github.com/bluesky-social/indigo/automod"
+	"github.com/bluesky-social/indigo/xrpc"
+)
+
+func FetchAndProcessRecord(ctx context.Context, eng *automod.Engine, aturi syntax.ATURI) error {
+	// resolve URI, identity, and record
+	if aturi.RecordKey() == "" {
+		return fmt.Errorf("need a full, not partial, AT-URI: %s", aturi)
+	}
+	ident, err := eng.Directory.Lookup(ctx, aturi.Authority())
+	if err != nil {
+		return fmt.Errorf("resolving AT-URI authority: %v", err)
+	}
+	pdsURL := ident.PDSEndpoint()
+	if pdsURL == "" {
+		return fmt.Errorf("could not resolve PDS endpoint for AT-URI account: %s", ident.DID.String())
+	}
+	pdsClient := xrpc.Client{Host: ident.PDSEndpoint()}
+
+	eng.Logger.Info("fetching record", "did", ident.DID.String(), "collection", aturi.Collection().String(), "rkey", aturi.RecordKey().String())
+	out, err := comatproto.RepoGetRecord(ctx, &pdsClient, "", aturi.Collection().String(), ident.DID.String(), aturi.RecordKey().String())
+	if err != nil {
+		return fmt.Errorf("fetching record from Relay (%s): %v", aturi, err)
+	}
+	if out.Cid == nil {
+		return fmt.Errorf("expected a CID in getRecord response")
+	}
+	recCID := syntax.CID(*out.Cid)
+	op := automod.RecordOp{
+		Action:     automod.CreateOp,
+		DID:        ident.DID,
+		Collection: aturi.Collection(),
+		RecordKey:  aturi.RecordKey(),
+		CID:        &recCID,
+		Value:      out.Value.Val,
+	}
+	return eng.ProcessRecordOp(ctx, op)
+}
+
+func FetchRecent(ctx context.Context, eng *automod.Engine, atid syntax.AtIdentifier, limit int) (*identity.Identity, []*comatproto.RepoListRecords_Record, error) {
+	ident, err := eng.Directory.Lookup(ctx, atid)
+	if err != nil {
+		return nil, nil, fmt.Errorf("failed to resolve AT identifier: %v", err)
+	}
+	pdsURL := ident.PDSEndpoint()
+	if pdsURL == "" {
+		return nil, nil, fmt.Errorf("could not resolve PDS endpoint for account: %s", ident.DID.String())
+	}
+	pdsClient := xrpc.Client{Host: ident.PDSEndpoint()}
+
+	resp, err := comatproto.RepoListRecords(ctx, &pdsClient, "app.bsky.feed.post", "", int64(limit), ident.DID.String(), false, "", "")
+	if err != nil {
+		return nil, nil, fmt.Errorf("failed to fetch record list: %v", err)
+	}
+	eng.Logger.Info("got recent posts", "did", ident.DID.String(), "pds", pdsURL, "count", len(resp.Records))
+	return ident, resp.Records, nil
+}
+
+func FetchAndProcessRecent(ctx context.Context, eng *automod.Engine, atid syntax.AtIdentifier, limit int) error {
+
+	ident, records, err := FetchRecent(ctx, eng, atid, limit)
+	if err != nil {
+		return err
+	}
+	// records are most-recent first; we want recent but oldest-first, so iterate backwards
+	for i := range records {
+		rec := records[len(records)-i-1]
+		aturi, err := syntax.ParseATURI(rec.Uri)
+		if err != nil {
+			return fmt.Errorf("parsing PDS record response: %v", err)
+		}
+		recCID := syntax.CID(rec.Cid)
+		op := automod.RecordOp{
+			Action:     automod.CreateOp,
+			DID:        ident.DID,
+			Collection: aturi.Collection(),
+			RecordKey:  aturi.RecordKey(),
+			CID:        &recCID,
+			Value:      rec.Value.Val,
+		}
+		err = eng.ProcessRecordOp(ctx, op)
+		if err != nil {
+			return err
+		}
+	}
+	return nil
+}
+70
automod/capture/testing.go
···
+package capture
+
+import (
+	"context"
+	"encoding/json"
+	"io"
+	"os"
+
+	"github.com/bluesky-social/indigo/atproto/identity"
+	"github.com/bluesky-social/indigo/atproto/syntax"
+	"github.com/bluesky-social/indigo/automod"
+)
+
+func MustLoadCapture(capPath string) AccountCapture {
+	f, err := os.Open(capPath)
+	if err != nil {
+		panic(err)
+	}
+	defer func() { _ = f.Close() }()
+
+	raw, err := io.ReadAll(f)
+	if err != nil {
+		panic(err)
+	}
+
+	var capture AccountCapture
+	if err := json.Unmarshal(raw, &capture); err != nil {
+		panic(err)
+	}
+	return capture
+}
+
+// Test helper which processes all the records from a capture. Intentionally exported, for use in other packages.
+//
+// This method replaces any pre-existing directory on the engine with a mock directory.
+func ProcessCaptureRules(eng *automod.Engine, capture AccountCapture) error {
+	ctx := context.Background()
+
+	did := capture.AccountMeta.Identity.DID
+	dir := identity.NewMockDirectory()
+	dir.Insert(*capture.AccountMeta.Identity)
+	eng.Directory = &dir
+
+	// initial identity rules
+	eng.ProcessIdentityEvent(ctx, "new", did)
+
+	// all the post rules
+	for _, pr := range capture.PostRecords {
+		aturi, err := syntax.ParseATURI(pr.Uri)
+		if err != nil {
+			return err
+		}
+		did, err := aturi.Authority().AsDID()
+		if err != nil {
+			return err
+		}
+		recCID := syntax.CID(pr.Cid)
+		eng.Logger.Debug("processing record", "did", did)
+		op := automod.RecordOp{
+			Action:     automod.CreateOp,
+			DID:        did,
+			Collection: aturi.Collection(),
+			RecordKey:  aturi.RecordKey(),
+			CID:        &recCID,
+			Value:      pr.Value.Val,
+		}
+		eng.ProcessRecordOp(ctx, op)
+	}
+	return nil
+}
-22
automod/capture_test.go
···
-package automod
-
-import (
-	"testing"
-
-	"github.com/bluesky-social/indigo/automod/countstore"
-	"github.com/stretchr/testify/assert"
-)
-
-func TestNoOpCaptureReplyRule(t *testing.T) {
-	assert := assert.New(t)
-
-	engine := EngineTestFixture()
-	capture := MustLoadCapture("testdata/capture_atprotocom.json")
-	assert.NoError(ProcessCaptureRules(&engine, capture))
-	c, err := engine.GetCount("automod-quota", "report", countstore.PeriodDay)
-	assert.NoError(err)
-	assert.Equal(0, c)
-	c, err = engine.GetCount("automod-quota", "takedown", countstore.PeriodDay)
-	assert.NoError(err)
-	assert.Equal(0, c)
-}
-95
automod/circuit_breaker_test.go
···
-package automod
-
-import (
-	"context"
-	"fmt"
-	"testing"
-
-	appbsky "github.com/bluesky-social/indigo/api/bsky"
-	"github.com/bluesky-social/indigo/atproto/identity"
-	"github.com/bluesky-social/indigo/atproto/syntax"
-	"github.com/bluesky-social/indigo/automod/countstore"
-
-	"github.com/stretchr/testify/assert"
-)
-
-func alwaysTakedownRecordRule(evt *RecordEvent) error {
-	evt.TakedownRecord()
-	return nil
-}
-
-func alwaysReportRecordRule(evt *RecordEvent) error {
-	evt.ReportRecord(ReportReasonOther, "test report")
-	return nil
-}
-
-func TestTakedownCircuitBreaker(t *testing.T) {
-	assert := assert.New(t)
-	ctx := context.Background()
-	engine := EngineTestFixture()
-	dir := identity.NewMockDirectory()
-	engine.Directory = &dir
-	// note that this is a record-level action, not account-level
-	engine.Rules = RuleSet{
-		RecordRules: []RecordRuleFunc{
-			alwaysTakedownRecordRule,
-		},
-	}
-
-	path := "app.bsky.feed.post/abc123"
-	cid1 := "cid123"
-	p1 := appbsky.FeedPost{Text: "some post blah"}
-
-	// generate double the quote of events; expect to only count the quote worth of actions
-	for i := 0; i < 2*QuotaModTakedownDay; i++ {
-		ident := identity.Identity{
-			DID:    syntax.DID(fmt.Sprintf("did:plc:abc%d", i)),
-			Handle: syntax.Handle("handle.example.com"),
-		}
-		dir.Insert(ident)
-		assert.NoError(engine.ProcessRecord(ctx, ident.DID, path, cid1, &p1))
-	}
-
-	takedowns, err := engine.GetCount("automod-quota", "takedown", countstore.PeriodDay)
-	assert.NoError(err)
-	assert.Equal(QuotaModTakedownDay, takedowns)
-
-	reports, err := engine.GetCount("automod-quota", "report", countstore.PeriodDay)
-	assert.NoError(err)
-	assert.Equal(0, reports)
-}
-
-func TestReportCircuitBreaker(t *testing.T) {
-	assert := assert.New(t)
-	ctx := context.Background()
-	engine := EngineTestFixture()
-	dir := identity.NewMockDirectory()
-	engine.Directory = &dir
-	engine.Rules = RuleSet{
-		RecordRules: []RecordRuleFunc{
-			alwaysReportRecordRule,
-		},
-	}
-
-	path := "app.bsky.feed.post/abc123"
-	cid1 := "cid123"
-	p1 := appbsky.FeedPost{Text: "some post blah"}
-
-	// generate double the quota of events; expect to only count the quota worth of actions
-	for i := 0; i < 2*QuotaModReportDay; i++ {
-		ident := identity.Identity{
-			DID:    syntax.DID(fmt.Sprintf("did:plc:abc%d", i)),
-			Handle: syntax.Handle("handle.example.com"),
-		}
-		dir.Insert(ident)
-		assert.NoError(engine.ProcessRecord(ctx, ident.DID, path, cid1, &p1))
-	}
-
-	takedowns, err := engine.GetCount("automod-quota", "takedown", countstore.PeriodDay)
-	assert.NoError(err)
-	assert.Equal(0, takedowns)
-
-	reports, err := engine.GetCount("automod-quota", "report", countstore.PeriodDay)
-	assert.NoError(err)
-	assert.Equal(QuotaModReportDay, reports)
-}
-221
automod/engine.go
···
-package automod
-
-import (
-	"context"
-	"fmt"
-	"log/slog"
-	"strings"
-
-	"github.com/bluesky-social/indigo/atproto/identity"
-	"github.com/bluesky-social/indigo/atproto/syntax"
-	"github.com/bluesky-social/indigo/automod/cachestore"
-	"github.com/bluesky-social/indigo/automod/countstore"
-	"github.com/bluesky-social/indigo/automod/flagstore"
-	"github.com/bluesky-social/indigo/automod/setstore"
-	"github.com/bluesky-social/indigo/xrpc"
-)
-
-// runtime for executing rules, managing state, and recording moderation actions.
-//
-// TODO: careful when initializing: several fields should not be null or zero, even though they are pointer type.
-type Engine struct {
-	Logger      *slog.Logger
-	Directory   identity.Directory
-	Rules       RuleSet
-	Counters    countstore.CountStore
-	Sets        setstore.SetStore
-	Cache       cachestore.CacheStore
-	Flags       flagstore.FlagStore
-	RelayClient *xrpc.Client
-	BskyClient  *xrpc.Client
-	// used to persist moderation actions in mod service (optional)
-	AdminClient     *xrpc.Client
-	SlackWebhookURL string
-}
-
-func (e *Engine) ProcessIdentityEvent(ctx context.Context, t string, did syntax.DID) error {
-	// similar to an HTTP server, we want to recover any panics from rule execution
-	defer func() {
-		if r := recover(); r != nil {
-			e.Logger.Error("automod event execution exception", "err", r, "did", did, "type", t)
-		}
-	}()
-
-	ident, err := e.Directory.LookupDID(ctx, did)
-	if err != nil {
-		return fmt.Errorf("resolving identity: %w", err)
-	}
-	if ident == nil {
-		return fmt.Errorf("identity not found for did: %s", did.String())
-	}
-
-	am, err := e.GetAccountMeta(ctx, ident)
-	if err != nil {
-		return err
-	}
-	evt := IdentityEvent{
-		RepoEvent{
-			Engine:  e,
-			Logger:  e.Logger.With("did", am.Identity.DID),
-			Account: *am,
-		},
-	}
-	if err := e.Rules.CallIdentityRules(&evt); err != nil {
-		return err
-	}
-	if evt.Err != nil {
-		return evt.Err
-	}
-	evt.CanonicalLogLine()
-	e.PurgeAccountCaches(ctx, am.Identity.DID)
-	if err := evt.PersistActions(ctx); err != nil {
-		return err
-	}
-	if err := evt.PersistCounters(ctx); err != nil {
-		return err
-	}
-	// check for any new errors during persist
-	if evt.Err != nil {
-		return evt.Err
-	}
-	return nil
-}
-
-func (e *Engine) ProcessRecord(ctx context.Context, did syntax.DID, path, recCID string, rec any) error {
-	// similar to an HTTP server, we want to recover any panics from rule execution
-	defer func() {
-		if r := recover(); r != nil {
-			e.Logger.Error("automod event execution exception", "err", r, "did", did, "path", path)
-		}
-	}()
-
-	ident, err := e.Directory.LookupDID(ctx, did)
-	if err != nil {
-		return fmt.Errorf("resolving identity: %w", err)
-	}
-	if ident == nil {
-		return fmt.Errorf("identity not found for did: %s", did.String())
-	}
-
-	am, err := e.GetAccountMeta(ctx, ident)
-	if err != nil {
-		return err
-	}
-	evt := e.NewRecordEvent(*am, path, recCID, rec)
-	e.Logger.Debug("processing record", "did", ident.DID, "path", path)
-	if err := e.Rules.CallRecordRules(&evt); err != nil {
-		return err
-	}
-	if evt.Err != nil {
-		return evt.Err
-	}
-	evt.CanonicalLogLine()
-	// purge the account meta cache when profile is updated
-	if evt.Collection == "app.bsky.actor.profile" {
-		e.PurgeAccountCaches(ctx, am.Identity.DID)
-	}
-	if err := evt.PersistActions(ctx); err != nil {
-		return err
-	}
-	if err := evt.PersistCounters(ctx); err != nil {
-		return err
-	}
-	// check for any new errors during persist
-	if evt.Err != nil {
-		return evt.Err
-	}
-	return nil
-}
-
-func (e *Engine) ProcessRecordDelete(ctx context.Context, did syntax.DID, path string) error {
-	// similar to an HTTP server, we want to recover any panics from rule execution
-	defer func() {
-		if r := recover(); r != nil {
-			e.Logger.Error("automod event execution exception", "err", r, "did", did, "path", path)
-		}
-	}()
-
-	ident, err := e.Directory.LookupDID(ctx, did)
-	if err != nil {
-		return fmt.Errorf("resolving identity: %w", err)
-	}
-	if ident == nil {
-		return fmt.Errorf("identity not found for did: %s", did.String())
-	}
-
-	am, err := e.GetAccountMeta(ctx, ident)
-	if err != nil {
-		return err
-	}
-	evt := e.NewRecordDeleteEvent(*am, path)
-	e.Logger.Debug("processing record deletion", "did", ident.DID, "path", path)
-	if err := e.Rules.CallRecordDeleteRules(&evt); err != nil {
-		return err
-	}
-	if evt.Err != nil {
-		return evt.Err
-	}
-	evt.CanonicalLogLine()
-	// purge the account meta cache when profile is updated
-	if evt.Collection == "app.bsky.actor.profile" {
-		e.PurgeAccountCaches(ctx, am.Identity.DID)
-	}
-	if err := evt.PersistActions(ctx); err != nil {
-		return err
-	}
-	if err := evt.PersistCounters(ctx); err != nil {
-		return err
-	}
-	return nil
-}
-
-func (e *Engine) NewRecordEvent(am AccountMeta, path, recCID string, rec any) RecordEvent {
-	parts := strings.SplitN(path, "/", 2)
-	return RecordEvent{
-		RepoEvent{
-			Engine:  e,
-			Logger:  e.Logger.With("did", am.Identity.DID, "collection", parts[0], "rkey", parts[1]),
-			Account: am,
-		},
-		rec,
-		parts[0],
-		parts[1],
-		recCID,
-		[]string{},
-		false,
-		[]ModReport{},
-		[]string{},
-	}
-}
-
-func (e *Engine) NewRecordDeleteEvent(am AccountMeta, path string) RecordDeleteEvent {
-	parts := strings.SplitN(path, "/", 2)
-	return RecordDeleteEvent{
-		RepoEvent{
-			Engine:  e,
-			Logger:  e.Logger.With("did", am.Identity.DID, "collection", parts[0], "rkey", parts[1]),
-			Account: am,
-		},
-		parts[0],
-		parts[1],
-	}
-}
-
-func (e *Engine) GetCount(name, val, period string) (int, error) {
-	return e.Counters.GetCount(context.TODO(), name, val, period)
-}
-
-func (e *Engine) GetCountDistinct(name, bucket, period string) (int, error) {
-	return e.Counters.GetCountDistinct(context.TODO(), name, bucket, period)
-}
-
-// checks if `val` is an element of set `name`
-func (e *Engine) InSet(name, val string) (bool, error) {
-	return e.Sets.InSet(context.TODO(), name, val)
-}
-
-// purge caches of any exiting metadata
-func (e *Engine) PurgeAccountCaches(ctx context.Context, did syntax.DID) error {
-	e.Directory.Purge(ctx, did.AtIdentifier())
-	return e.Cache.Purge(ctx, "acct", did.String())
-}
+33
automod/engine/account_meta.go
···
+package engine
+
+import (
+	"time"
+
+	"github.com/bluesky-social/indigo/atproto/identity"
+)
+
+// information about a repo/account/identity, always pre-populated and relevant to many rules
+type AccountMeta struct {
+	Identity             *identity.Identity
+	Profile              ProfileSummary
+	Private              *AccountPrivate
+	AccountLabels        []string
+	AccountNegatedLabels []string
+	AccountFlags         []string
+	FollowersCount       int64
+	FollowsCount         int64
+	PostsCount           int64
+	Takendown            bool
+}
+
+type ProfileSummary struct {
+	HasAvatar   bool
+	Description *string
+	DisplayName *string
+}
+
+type AccountPrivate struct {
+	Email          string
+	EmailConfirmed bool
+	IndexedAt      time.Time
+}
+54
automod/engine/action_dedupe_test.go
+package engine
+
+import (
+	"context"
+	"testing"
+
+	appbsky "github.com/bluesky-social/indigo/api/bsky"
+	"github.com/bluesky-social/indigo/atproto/identity"
+	"github.com/bluesky-social/indigo/atproto/syntax"
+	"github.com/bluesky-social/indigo/automod/countstore"
+
+	"github.com/stretchr/testify/assert"
+)
+
+func alwaysReportAccountRule(c *RecordContext) error {
+	c.ReportAccount(ReportReasonOther, "test report")
+	return nil
+}
+
+func TestAccountReportDedupe(t *testing.T) {
+	assert := assert.New(t)
+	ctx := context.Background()
+	eng := EngineTestFixture()
+	eng.Rules = RuleSet{
+		RecordRules: []RecordRuleFunc{
+			alwaysReportAccountRule,
+		},
+	}
+
+	//path := "app.bsky.feed.post/abc123"
+	cid1 := syntax.CID("cid123")
+	p1 := appbsky.FeedPost{Text: "some post blah"}
+	id1 := identity.Identity{
+		DID:    syntax.DID("did:plc:abc111"),
+		Handle: syntax.Handle("handle.example.com"),
+	}
+
+	// exact same event multiple times; should only report once
+	op := RecordOp{
+		Action:     CreateOp,
+		DID:        id1.DID,
+		Collection: "app.bsky.feed.post",
+		RecordKey:  "abc123",
+		CID:        &cid1,
+		Value:      &p1,
+	}
+	for i := 0; i < 5; i++ {
+		assert.NoError(eng.ProcessRecordOp(ctx, op))
+	}
+
+	reports, err := eng.GetCount("automod-quota", "report", countstore.PeriodDay)
+	assert.NoError(err)
+	assert.Equal(1, reports)
+}
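The de-dupe behavior this test exercises can be sketched in isolation. The following is a minimal sketch, not the engine's code: it assumes a hypothetical in-memory map (`reportDeduper`) in place of the count store, and keys it by reason and DID in the same spirit as the `automod-account-report-<reason>` counters. The `report` type and the reason string `"other"` are illustrative names only.

```go
package main

import "fmt"

// report is a simplified, hypothetical stand-in for ModReport.
type report struct {
	Reason  string
	Comment string
}

// reportDeduper is a hypothetical in-memory replacement for the per-day
// counter lookup; the real engine goes through its count store.
type reportDeduper struct {
	seen map[string]int
}

// filter keeps only reports whose (reason, DID) pair has not been seen yet,
// and marks the ones it keeps as seen.
func (d *reportDeduper) filter(did string, reports []report) []report {
	out := []report{}
	for _, r := range reports {
		key := "automod-account-report-" + r.Reason + "/" + did
		if d.seen[key] > 0 {
			continue // already reported for this reason
		}
		d.seen[key]++
		out = append(out, r)
	}
	return out
}

func main() {
	d := &reportDeduper{seen: map[string]int{}}
	filed := 0
	// the same event five times, as in the test above
	for i := 0; i < 5; i++ {
		filed += len(d.filter("did:plc:abc111", []report{{Reason: "other", Comment: "test report"}}))
	}
	fmt.Println(filed) // prints 1
}
```

The sketch leaves out time periods entirely; in the PR the counter is read for a single day, so a repeat report becomes possible again once that period rolls over.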
+109
automod/engine/circuit_breaker_test.go
+package engine
+
+import (
+	"context"
+	"fmt"
+	"testing"
+
+	appbsky "github.com/bluesky-social/indigo/api/bsky"
+	"github.com/bluesky-social/indigo/atproto/identity"
+	"github.com/bluesky-social/indigo/atproto/syntax"
+	"github.com/bluesky-social/indigo/automod/countstore"
+
+	"github.com/stretchr/testify/assert"
+)
+
+func alwaysTakedownRecordRule(c *RecordContext) error {
+	c.TakedownRecord()
+	return nil
+}
+
+func alwaysReportRecordRule(c *RecordContext) error {
+	c.ReportRecord(ReportReasonOther, "test report")
+	return nil
+}
+
+func TestTakedownCircuitBreaker(t *testing.T) {
+	assert := assert.New(t)
+	ctx := context.Background()
+	eng := EngineTestFixture()
+	dir := identity.NewMockDirectory()
+	eng.Directory = &dir
+	// note that this is a record-level action, not account-level
+	eng.Rules = RuleSet{
+		RecordRules: []RecordRuleFunc{
+			alwaysTakedownRecordRule,
+		},
+	}
+
+	cid1 := syntax.CID("cid123")
+	p1 := appbsky.FeedPost{Text: "some post blah"}
+
+	// generate double the quota of events; expect to only count the quota worth of actions
+	for i := 0; i < 2*QuotaModTakedownDay; i++ {
+		ident := identity.Identity{
+			DID:    syntax.DID(fmt.Sprintf("did:plc:abc%d", i)),
+			Handle: syntax.Handle("handle.example.com"),
+		}
+		dir.Insert(ident)
+		op := RecordOp{
+			Action:     CreateOp,
+			DID:        ident.DID,
+			Collection: syntax.NSID("app.bsky.feed.post"),
+			RecordKey:  syntax.RecordKey("abc123"),
+			CID:        &cid1,
+			Value:      &p1,
+		}
+		assert.NoError(eng.ProcessRecordOp(ctx, op))
+	}
+
+	takedowns, err := eng.GetCount("automod-quota", "takedown", countstore.PeriodDay)
+	assert.NoError(err)
+	assert.Equal(QuotaModTakedownDay, takedowns)
+
+	reports, err := eng.GetCount("automod-quota", "report", countstore.PeriodDay)
+	assert.NoError(err)
+	assert.Equal(0, reports)
+}
+
+func TestReportCircuitBreaker(t *testing.T) {
+	assert := assert.New(t)
+	ctx := context.Background()
+	eng := EngineTestFixture()
+	dir := identity.NewMockDirectory()
+	eng.Directory = &dir
+	eng.Rules = RuleSet{
+		RecordRules: []RecordRuleFunc{
+			alwaysReportRecordRule,
+		},
+	}
+
+	cid1 := syntax.CID("cid123")
+	p1 := appbsky.FeedPost{Text: "some post blah"}
+
+	// generate double the quota of events; expect to only count the quota worth of actions
+	for i := 0; i < 2*QuotaModReportDay; i++ {
+		ident := identity.Identity{
+			DID:    syntax.DID(fmt.Sprintf("did:plc:abc%d", i)),
+			Handle: syntax.Handle("handle.example.com"),
+		}
+		dir.Insert(ident)
+		op := RecordOp{
+			Action:     CreateOp,
+			DID:        ident.DID,
+			Collection: syntax.NSID("app.bsky.feed.post"),
+			RecordKey:  syntax.RecordKey("abc123"),
+			CID:        &cid1,
+			Value:      &p1,
+		}
+		assert.NoError(eng.ProcessRecordOp(ctx, op))
+	}
+
+	takedowns, err := eng.GetCount("automod-quota", "takedown", countstore.PeriodDay)
+	assert.NoError(err)
+	assert.Equal(0, takedowns)
+
+	reports, err := eng.GetCount("automod-quota", "report", countstore.PeriodDay)
+	assert.NoError(err)
+	assert.Equal(QuotaModReportDay, reports)
+}
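Both circuit-breaker tests send twice the quota of events and expect the quota counter to stop at the limit. A rough sketch of that check-then-increment logic follows, assuming a hypothetical in-memory `quotaBreaker` rather than the engine's `automod-quota` counter in the count store; the limit of 10 simply mirrors `QuotaModTakedownDay`.

```go
package main

import "fmt"

// quotaBreaker is a hypothetical in-memory stand-in for the "automod-quota"
// counter that the engine reads and increments through its count store.
type quotaBreaker struct {
	limit int
	used  int
}

// allow reports whether one more action fits in the quota, and counts it if so.
func (b *quotaBreaker) allow() bool {
	if b.used >= b.limit {
		return false // circuit breaker tripped: drop the action
	}
	b.used++
	return true
}

func main() {
	b := &quotaBreaker{limit: 10}
	allowed := 0
	// double the quota of attempts, as in the tests above
	for i := 0; i < 2*b.limit; i++ {
		if b.allow() {
			allowed++
		}
	}
	fmt.Println(allowed, b.used) // prints 10 10
}
```

The sketch has no notion of a daily reset, and it is not safe for concurrent use; the PR reads the day-period count, so the breaker re-arms when the period changes.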
+184
automod/engine/context.go
··· 1 + package engine 2 + 3 + import ( 4 + "context" 5 + "fmt" 6 + "log/slog" 7 + 8 + "github.com/bluesky-social/indigo/atproto/identity" 9 + "github.com/bluesky-social/indigo/atproto/syntax" 10 + ) 11 + 12 + // The primary interface exposed to rules. 13 + type BaseContext struct { 14 + // Actual golang "context.Context", if needed for timeouts etc 15 + Ctx context.Context 16 + // Any errors encountered while processing methods on this struct (or sub-types) get rolled up in this nullable field 17 + Err error 18 + // slog logger handle, with event-specific structured fields pre-populated. Pointer, but expected to never be nil. 19 + Logger *slog.Logger 20 + 21 + engine *Engine // NOTE: pointer, but expected never to be nil 22 + effects Effects 23 + } 24 + 25 + type AccountContext struct { 26 + BaseContext 27 + 28 + Account AccountMeta 29 + } 30 + 31 + type RecordContext struct { 32 + AccountContext 33 + 34 + RecordOp RecordOp 35 + // TODO: could consider adding commit-level metadata here. probably nullable if so, commit-level metadata isn't always available. might be best to do a separate event/context type for that 36 + } 37 + 38 + var ( 39 + CreateOp = "create" 40 + UpdateOp = "update" 41 + DeleteOp = "delete" 42 + ) 43 + 44 + // Immutable 45 + type RecordOp struct { 46 + // Indicates type of record mutation: create, update, or delete. 
47 + // The term "action" is copied from com.atproto.sync.subscribeRepos#repoOp 48 + Action string 49 + DID syntax.DID 50 + Collection syntax.NSID 51 + RecordKey syntax.RecordKey 52 + CID *syntax.CID 53 + // NOTE: usually a *pointer*, not the value itself 54 + Value any 55 + } 56 + 57 + // Checks that op has expected fields, based on the action type 58 + func (op *RecordOp) Validate() error { 59 + switch op.Action { 60 + case CreateOp, UpdateOp: 61 + if op.Value == nil || op.CID == nil { 62 + return fmt.Errorf("expected record create/update op to contain both value and CID") 63 + } 64 + case DeleteOp: 65 + if op.Value != nil || op.CID != nil { 66 + return fmt.Errorf("expected record delete op to be empty") 67 + } 68 + default: 69 + return fmt.Errorf("unexpected record op action: %s", op.Action) 70 + } 71 + return nil 72 + } 73 + 74 + func (op *RecordOp) ATURI() syntax.ATURI { 75 + return syntax.ATURI(fmt.Sprintf("at://%s/%s/%s", op.DID, op.Collection, op.RecordKey)) 76 + } 77 + 78 + // TODO: in the future *may* have an IdentityContext with an IdentityOp sub-field 79 + 80 + // Access to engine's identity directory (without access to other engine fields) 81 + func (c *BaseContext) Directory() identity.Directory { 82 + return c.engine.Directory 83 + } 84 + 85 + // request external state via engine (indirect) 86 + func (c *BaseContext) GetCount(name, val, period string) int { 87 + out, err := c.engine.Counters.GetCount(c.Ctx, name, val, period) 88 + if err != nil { 89 + if nil == c.Err { 90 + c.Err = err 91 + } 92 + return 0 93 + } 94 + return out 95 + } 96 + 97 + func (c *BaseContext) GetCountDistinct(name, bucket, period string) int { 98 + out, err := c.engine.Counters.GetCountDistinct(c.Ctx, name, bucket, period) 99 + if err != nil { 100 + if nil == c.Err { 101 + c.Err = err 102 + } 103 + return 0 104 + } 105 + return out 106 + } 107 + 108 + func (c *BaseContext) InSet(name, val string) bool { 109 + out, err := c.engine.Sets.InSet(c.Ctx, name, val) 110 + if err != 
nil { 111 + if nil == c.Err { 112 + c.Err = err 113 + } 114 + return false 115 + } 116 + return out 117 + } 118 + 119 + func NewAccountContext(ctx context.Context, eng *Engine, meta AccountMeta) AccountContext { 120 + return AccountContext{ 121 + BaseContext: BaseContext{ 122 + Ctx: ctx, 123 + Err: nil, 124 + Logger: eng.Logger.With("did", meta.Identity.DID), 125 + engine: eng, 126 + effects: Effects{}, 127 + }, 128 + Account: meta, 129 + } 130 + } 131 + 132 + func NewRecordContext(ctx context.Context, eng *Engine, meta AccountMeta, op RecordOp) RecordContext { 133 + ac := NewAccountContext(ctx, eng, meta) 134 + ac.BaseContext.Logger = ac.BaseContext.Logger.With("collection", op.Collection, "rkey", op.RecordKey) 135 + return RecordContext{ 136 + AccountContext: ac, 137 + RecordOp: op, 138 + } 139 + } 140 + 141 + // update effects (indirect) 142 + func (c *BaseContext) Increment(name, val string) { 143 + c.effects.Increment(name, val) 144 + } 145 + 146 + func (c *BaseContext) IncrementDistinct(name, bucket, val string) { 147 + c.effects.IncrementDistinct(name, bucket, val) 148 + } 149 + 150 + func (c *BaseContext) IncrementPeriod(name, val string, period string) { 151 + c.effects.IncrementPeriod(name, val, period) 152 + } 153 + 154 + func (c *AccountContext) AddAccountFlag(val string) { 155 + c.effects.AddAccountFlag(val) 156 + } 157 + 158 + func (c *AccountContext) AddAccountLabel(val string) { 159 + c.effects.AddAccountLabel(val) 160 + } 161 + 162 + func (c *AccountContext) ReportAccount(reason, comment string) { 163 + c.effects.ReportAccount(reason, comment) 164 + } 165 + 166 + func (c *AccountContext) TakedownAccount() { 167 + c.effects.TakedownAccount() 168 + } 169 + 170 + func (c *RecordContext) AddRecordFlag(val string) { 171 + c.effects.AddRecordFlag(val) 172 + } 173 + 174 + func (c *RecordContext) AddRecordLabel(val string) { 175 + c.effects.AddRecordLabel(val) 176 + } 177 + 178 + func (c *RecordContext) ReportRecord(reason, comment string) { 179 + 
c.effects.ReportRecord(reason, comment) 180 + } 181 + 182 + func (c *RecordContext) TakedownRecord() { 183 + c.effects.TakedownRecord() 184 + }
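The `GetCount`, `GetCountDistinct`, and `InSet` wrappers on `BaseContext` return plain values and keep only the first error in `Err`. The sketch below illustrates that roll-up pattern on its own, under stated assumptions: `counterStore`, `failingStore`, and `ruleContext` are hypothetical names, and the store signature is simplified (no `context.Context`, no period argument).

```go
package main

import (
	"errors"
	"fmt"
)

// counterStore is a hypothetical, simplified stand-in for the engine's count store.
type counterStore interface {
	GetCount(name, val string) (int, error)
}

// failingStore always errors, to show how errors are rolled up.
type failingStore struct{}

func (failingStore) GetCount(name, val string) (int, error) {
	return 0, errors.New("store unavailable: " + name)
}

// ruleContext mirrors the shape of BaseContext: lookups return plain values,
// and the first error encountered is kept in Err for the caller to inspect.
type ruleContext struct {
	Err   error
	store counterStore
}

func (c *ruleContext) GetCount(name, val string) int {
	out, err := c.store.GetCount(name, val)
	if err != nil {
		if c.Err == nil {
			c.Err = err // keep only the first error
		}
		return 0
	}
	return out
}

func main() {
	c := &ruleContext{store: failingStore{}}
	a := c.GetCount("first", "x")
	b := c.GetCount("second", "x")
	fmt.Println(a, b, c.Err) // prints: 0 0 store unavailable: first
}
```

The trade-off is that rule code stays free of error handling, while whoever runs the rules has to remember to check `Err` afterwards; a zero return on its own does not distinguish "no count" from "lookup failed".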
+119
automod/engine/effects.go
+package engine
+
+import (
+	"time"
+)
+
+var (
+	// time period within which automod will not re-report an account for the same reasonType
+	ReportDupePeriod = 7 * 24 * time.Hour
+	// number of reports automod can file per day, for all subjects and types combined (circuit breaker)
+	QuotaModReportDay = 50
+	// number of takedowns automod can action per day, for all subjects combined (circuit breaker)
+	QuotaModTakedownDay = 10
+)
+
+type CounterRef struct {
+	Name   string
+	Val    string
+	Period *string
+}
+
+type CounterDistinctRef struct {
+	Name   string
+	Bucket string
+	Val    string
+}
+
+// Mutable container for all the possible side-effects from rule execution.
+//
+// This single type tracks generic effects (eg, counter increments), account-level actions, and record-level actions (even for processing of account-level events which have no possible record-level effects).
+type Effects struct {
+	// List of counters which should be incremented as part of processing this event. These are collected during rule execution and persisted in bulk at the end.
+	CounterIncrements []CounterRef
+	// Similar to "CounterIncrements", but for "distinct" style counters
+	CounterDistinctIncrements []CounterDistinctRef // TODO: better variable names
+	// Label values which should be applied to the overall account, as a result of rule execution.
+	AccountLabels []string
+	// Moderation flags (similar to labels, but private) which should be applied to the overall account, as a result of rule execution.
+	AccountFlags []string
+	// Reports which should be filed against this account, as a result of rule execution.
+	AccountReports []ModReport
+	// If "true", a rule has indicated that the entire account should be taken down.
+	AccountTakedown bool
+	// Same as "AccountLabels", but at record-level
+	RecordLabels []string
+	// Same as "AccountFlags", but at record-level
+	RecordFlags []string
+	// Same as "AccountReports", but at record-level
+	RecordReports []ModReport
+	// Same as "AccountTakedown", but at record-level
+	RecordTakedown bool
+}
+
+// Enqueues the named counter to be incremented at the end of all rule processing. Will automatically increment for all time periods.
+//
+// "name" is the counter namespace.
+// "val" is the specific counter within that namespace.
+func (e *Effects) Increment(name, val string) {
+	e.CounterIncrements = append(e.CounterIncrements, CounterRef{Name: name, Val: val})
+}
+
+// Enqueues the named counter to be incremented at the end of all rule processing. Will only increment the indicated time period bucket.
+func (e *Effects) IncrementPeriod(name, val string, period string) {
+	e.CounterIncrements = append(e.CounterIncrements, CounterRef{Name: name, Val: val, Period: &period})
+}
+
+// Enqueues the named "distinct value" counter based on the supplied string value ("val") to be incremented at the end of all rule processing. Will automatically increment for all time periods.
+func (e *Effects) IncrementDistinct(name, bucket, val string) {
+	e.CounterDistinctIncrements = append(e.CounterDistinctIncrements, CounterDistinctRef{Name: name, Bucket: bucket, Val: val})
+}
+
+// Enqueues the provided label (string value) to be added to the account at the end of rule processing.
+func (e *Effects) AddAccountLabel(val string) {
+	e.AccountLabels = append(e.AccountLabels, val)
+}
+
+// Enqueues the provided flag (string value) to be recorded (in the Engine's flagstore) at the end of rule processing.
+func (e *Effects) AddAccountFlag(val string) {
+	e.AccountFlags = append(e.AccountFlags, val)
+}
+
+// Enqueues a moderation report to be filed against the account at the end of rule processing.
+func (e *Effects) ReportAccount(reason, comment string) {
+	if comment == "" {
+		comment = "(no comment)"
+	}
+	comment = "automod: " + comment
+	e.AccountReports = append(e.AccountReports, ModReport{ReasonType: reason, Comment: comment})
+}
+
+// Enqueues the entire account to be taken down at the end of rule processing.
+func (e *Effects) TakedownAccount() {
+	e.AccountTakedown = true
+}
+
+// Enqueues the provided label (string value) to be added to the record at the end of rule processing.
+func (e *Effects) AddRecordLabel(val string) {
+	e.RecordLabels = append(e.RecordLabels, val)
+}
+
+// Enqueues the provided flag (string value) to be recorded (in the Engine's flagstore) at the end of rule processing.
+func (e *Effects) AddRecordFlag(val string) {
+	e.RecordFlags = append(e.RecordFlags, val)
+}
+
+// Enqueues a moderation report to be filed against the record at the end of rule processing.
+func (e *Effects) ReportRecord(reason, comment string) {
+	if comment == "" {
+		comment = "(automod)"
+	} else {
+		comment = "automod: " + comment
+	}
+	e.RecordReports = append(e.RecordReports, ModReport{ReasonType: reason, Comment: comment})
+}
+
+// Enqueues the record to be taken down at the end of rule processing.
+func (e *Effects) TakedownRecord() {
+	e.RecordTakedown = true
+}
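The comments in this file describe an enqueue-then-persist split: rules only append to `Effects`, and nothing is written until the engine persists in bulk afterwards. The sketch below shows that split for counters only. It is an illustration under assumptions, not the PR's persistence code: `memCounters` is a hypothetical in-memory store, and the period names in `allPeriods` are assumed for the example (only a "day" period constant appears in this diff).

```go
package main

import "fmt"

// counterRef mirrors CounterRef: a nil Period means "all periods".
type counterRef struct {
	Name   string
	Val    string
	Period *string
}

// effects is a cut-down stand-in for Effects, holding only counter increments.
type effects struct {
	counterIncrements []counterRef
}

func (e *effects) Increment(name, val string) {
	e.counterIncrements = append(e.counterIncrements, counterRef{Name: name, Val: val})
}

func (e *effects) IncrementPeriod(name, val, period string) {
	e.counterIncrements = append(e.counterIncrements, counterRef{Name: name, Val: val, Period: &period})
}

// allPeriods is an assumed list of period names, for illustration only.
var allPeriods = []string{"total", "day", "hour"}

// memCounters is a hypothetical in-memory store keyed by "name/val/period".
type memCounters map[string]int

// persistCounters applies every queued increment to the store in one pass.
func persistCounters(store memCounters, eff *effects) {
	for _, ref := range eff.counterIncrements {
		if ref.Period != nil {
			store[ref.Name+"/"+ref.Val+"/"+*ref.Period]++
			continue
		}
		for _, p := range allPeriods {
			store[ref.Name+"/"+ref.Val+"/"+p]++
		}
	}
}

func main() {
	eff := &effects{}
	eff.Increment("posts", "did:plc:abc111")
	eff.IncrementPeriod("posts", "did:plc:abc111", "day")

	store := memCounters{}
	fmt.Println(len(store)) // prints 0: nothing is written while rules run
	persistCounters(store, eff)
	fmt.Println(store["posts/did:plc:abc111/day"], store["posts/did:plc:abc111/total"]) // prints 2 1
}
```

Keeping the queue separate from the store means a failed or panicking rule run can be dropped without leaving partial counter writes behind, at the cost of rules not seeing their own increments during the same event.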
+169
automod/engine/engine.go
··· 1 + package engine 2 + 3 + import ( 4 + "context" 5 + "fmt" 6 + "log/slog" 7 + "strings" 8 + 9 + "github.com/bluesky-social/indigo/atproto/identity" 10 + "github.com/bluesky-social/indigo/atproto/syntax" 11 + "github.com/bluesky-social/indigo/automod/cachestore" 12 + "github.com/bluesky-social/indigo/automod/countstore" 13 + "github.com/bluesky-social/indigo/automod/flagstore" 14 + "github.com/bluesky-social/indigo/automod/setstore" 15 + "github.com/bluesky-social/indigo/xrpc" 16 + ) 17 + 18 + // runtime for executing rules, managing state, and recording moderation actions. 19 + // 20 + // NOTE: careful when initializing: several fields must not be nil or zero, even though they are pointer type. 21 + type Engine struct { 22 + Logger *slog.Logger 23 + Directory identity.Directory 24 + Rules RuleSet 25 + Counters countstore.CountStore 26 + Sets setstore.SetStore 27 + Cache cachestore.CacheStore 28 + Flags flagstore.FlagStore 29 + RelayClient *xrpc.Client 30 + BskyClient *xrpc.Client 31 + // used to persist moderation actions in mod service (optional) 32 + AdminClient *xrpc.Client 33 + SlackWebhookURL string 34 + } 35 + 36 + func (eng *Engine) ProcessIdentityEvent(ctx context.Context, typ string, did syntax.DID) error { 37 + // similar to an HTTP server, we want to recover any panics from rule execution 38 + defer func() { 39 + if r := recover(); r != nil { 40 + eng.Logger.Error("automod event execution exception", "err", r, "did", did, "type", typ) 41 + } 42 + }() 43 + 44 + ident, err := eng.Directory.LookupDID(ctx, did) 45 + if err != nil { 46 + return fmt.Errorf("resolving identity: %w", err) 47 + } 48 + if ident == nil { 49 + return fmt.Errorf("identity not found for did: %s", did.String()) 50 + } 51 + 52 + am, err := eng.GetAccountMeta(ctx, ident) 53 + if err != nil { 54 + return err 55 + } 56 + ac := NewAccountContext(ctx, eng, *am) 57 + if err := eng.Rules.CallIdentityRules(&ac); err != nil { 58 + return err 59 + } 60 + eng.CanonicalLogLineAccount(&ac) 61 + 
eng.PurgeAccountCaches(ctx, am.Identity.DID) 62 + if err := eng.persistAccountModActions(&ac); err != nil { 63 + return err 64 + } 65 + if err := eng.persistCounters(ctx, &ac.effects); err != nil { 66 + return err 67 + } 68 + return nil 69 + } 70 + 71 + func (eng *Engine) ProcessRecordOp(ctx context.Context, op RecordOp) error { 72 + // similar to an HTTP server, we want to recover any panics from rule execution 73 + defer func() { 74 + if r := recover(); r != nil { 75 + eng.Logger.Error("automod event execution exception", "err", r, "did", op.DID, "collection", op.Collection, "rkey", op.RecordKey) 76 + } 77 + }() 78 + 79 + if err := op.Validate(); err != nil { 80 + return fmt.Errorf("bad record op: %w", err) 81 + } 82 + ident, err := eng.Directory.LookupDID(ctx, op.DID) 83 + if err != nil { 84 + return fmt.Errorf("resolving identity: %w", err) 85 + } 86 + if ident == nil { 87 + return fmt.Errorf("identity not found for did: %s", op.DID) 88 + } 89 + 90 + am, err := eng.GetAccountMeta(ctx, ident) 91 + if err != nil { 92 + return err 93 + } 94 + rc := NewRecordContext(ctx, eng, *am, op) 95 + rc.Logger.Debug("processing record") 96 + switch op.Action { 97 + case CreateOp, UpdateOp: 98 + if err := eng.Rules.CallRecordRules(&rc); err != nil { 99 + return err 100 + } 101 + case DeleteOp: 102 + if err := eng.Rules.CallRecordDeleteRules(&rc); err != nil { 103 + return err 104 + } 105 + default: 106 + return fmt.Errorf("unexpected op action: %s", op.Action) 107 + } 108 + eng.CanonicalLogLineRecord(&rc) 109 + // purge the account meta cache when profile is updated 110 + if rc.RecordOp.Collection == "app.bsky.actor.profile" { 111 + eng.PurgeAccountCaches(ctx, am.Identity.DID) 112 + } 113 + if err := eng.persistRecordModActions(&rc); err != nil { 114 + return err 115 + } 116 + if err := eng.persistCounters(ctx, &rc.effects); err != nil { 117 + return err 118 + } 119 + return nil 120 + } 121 + 122 + func (e *Engine) GetCount(name, val, period string) (int, error) { 123 + return 
e.Counters.GetCount(context.TODO(), name, val, period) 124 + } 125 + 126 + func (e *Engine) GetCountDistinct(name, bucket, period string) (int, error) { 127 + return e.Counters.GetCountDistinct(context.TODO(), name, bucket, period) 128 + } 129 + 130 + // checks if `val` is an element of set `name` 131 + func (e *Engine) InSet(name, val string) (bool, error) { 132 + return e.Sets.InSet(context.TODO(), name, val) 133 + } 134 + 135 + // purge caches of any existing metadata 136 + func (e *Engine) PurgeAccountCaches(ctx context.Context, did syntax.DID) error { 137 + e.Directory.Purge(ctx, did.AtIdentifier()) 138 + return e.Cache.Purge(ctx, "acct", did.String()) 139 + } 140 + 141 + func (e *Engine) CanonicalLogLineAccount(c *AccountContext) { 142 + c.Logger.Info("canonical-event-line", 143 + "accountLabels", c.effects.AccountLabels, 144 + "accountFlags", c.effects.AccountFlags, 145 + "accountTakedown", c.effects.AccountTakedown, 146 + "accountReports", len(c.effects.AccountReports), 147 + ) 148 + } 149 + 150 + func (e *Engine) CanonicalLogLineRecord(c *RecordContext) { 151 + c.Logger.Info("canonical-event-line", 152 + "accountLabels", c.effects.AccountLabels, 153 + "accountFlags", c.effects.AccountFlags, 154 + "accountTakedown", c.effects.AccountTakedown, 155 + "accountReports", len(c.effects.AccountReports), 156 + "recordLabels", c.effects.RecordLabels, 157 + "recordFlags", c.effects.RecordFlags, 158 + "recordTakedown", c.effects.RecordTakedown, 159 + "recordReports", len(c.effects.RecordReports), 160 + ) 161 + } 162 + 163 + func splitRepoPath(path string) (string, string, error) { 164 + parts := strings.SplitN(path, "/", 3) 165 + if len(parts) != 2 { 166 + return "", "", fmt.Errorf("invalid record path: %s", path) 167 + } 168 + return parts[0], parts[1], nil 169 + }
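`ProcessIdentityEvent` and `ProcessRecordOp` in this file recover panics from rule execution and log them. Because their return values are unnamed, a recovered panic leaves the caller with a nil error. The following sketch shows one possible variant, which is not what the PR does: a named return value lets the deferred function turn the panic into an error. The `process` helper is a hypothetical name for illustration.

```go
package main

import "fmt"

// process runs a rule and converts any panic into an error by assigning to
// the named return value from the deferred function.
func process(rule func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("rule panicked: %v", r)
		}
	}()
	return rule()
}

func main() {
	err := process(func() error { panic("boom") })
	fmt.Println(err) // prints: rule panicked: boom

	fmt.Println(process(func() error { return nil })) // prints: <nil>
}
```

Whether surfacing the panic as an error is preferable depends on the caller: a firehose consumer may want to keep going and rely on the log line alone, which is the behavior the code in this diff has.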
+254
automod/engine/persist.go
··· 1 + package engine 2 + 3 + import ( 4 + "context" 5 + "fmt" 6 + 7 + comatproto "github.com/bluesky-social/indigo/api/atproto" 8 + ) 9 + 10 + func (eng *Engine) persistCounters(ctx context.Context, eff *Effects) error { 11 + // TODO: dedupe this array 12 + for _, ref := range eff.CounterIncrements { 13 + if ref.Period != nil { 14 + err := eng.Counters.IncrementPeriod(ctx, ref.Name, ref.Val, *ref.Period) 15 + if err != nil { 16 + return err 17 + } 18 + } else { 19 + err := eng.Counters.Increment(ctx, ref.Name, ref.Val) 20 + if err != nil { 21 + return err 22 + } 23 + } 24 + } 25 + for _, ref := range eff.CounterDistinctIncrements { 26 + err := eng.Counters.IncrementDistinct(ctx, ref.Name, ref.Bucket, ref.Val) 27 + if err != nil { 28 + return err 29 + } 30 + } 31 + return nil 32 + } 33 + 34 + // Persists account-level moderation actions: new labels, new flags, new takedowns, and reports. 35 + // 36 + // If necessary, will "purge" identity and account caches, so that state updates will be picked up for subsequent events. 37 + // 38 + // Note that this method expects to run *before* counts are persisted (it accesses and updates some counts) 39 + func (eng *Engine) persistAccountModActions(c *AccountContext) error { 40 + ctx := c.Ctx 41 + 42 + // de-dupe actions 43 + newLabels := dedupeLabelActions(c.effects.AccountLabels, c.Account.AccountLabels, c.Account.AccountNegatedLabels) 44 + newFlags := dedupeFlagActions(c.effects.AccountFlags, c.Account.AccountFlags) 45 + 46 + // don't report the same account multiple times on the same day for the same reason. this is a quick check; we also query the mod service API just before creating the report. 
47 + partialReports, err := eng.dedupeReportActions(ctx, c.Account.Identity.DID, c.effects.AccountReports) 48 + if err != nil { 49 + return err 50 + } 51 + newReports, err := eng.circuitBreakReports(ctx, partialReports) 52 + if err != nil { 53 + return err 54 + } 55 + newTakedown, err := eng.circuitBreakTakedown(ctx, c.effects.AccountTakedown && !c.Account.Takendown) 56 + if err != nil { 57 + return err 58 + } 59 + 60 + anyModActions := newTakedown || len(newLabels) > 0 || len(newFlags) > 0 || len(newReports) > 0 61 + if anyModActions && eng.SlackWebhookURL != "" { 62 + msg := slackBody("⚠️ Automod Account Action ⚠️\n", c.Account, newLabels, newFlags, newReports, newTakedown) 63 + if err := eng.SendSlackMsg(ctx, msg); err != nil { 64 + eng.Logger.Error("sending slack webhook", "err", err) 65 + } 66 + } 67 + 68 + // flags don't require admin auth 69 + if len(newFlags) > 0 { 70 + eng.Flags.Add(ctx, c.Account.Identity.DID.String(), newFlags) 71 + } 72 + 73 + // if we can't actually talk to service, bail out early 74 + if eng.AdminClient == nil { 75 + return nil 76 + } 77 + 78 + xrpcc := eng.AdminClient 79 + 80 + if len(newLabels) > 0 { 81 + eng.Logger.Info("labeling record", "newLabels", newLabels) 82 + comment := "automod" 83 + _, err := comatproto.AdminEmitModerationEvent(ctx, xrpcc, &comatproto.AdminEmitModerationEvent_Input{ 84 + CreatedBy: xrpcc.Auth.Did, 85 + Event: &comatproto.AdminEmitModerationEvent_Input_Event{ 86 + AdminDefs_ModEventLabel: &comatproto.AdminDefs_ModEventLabel{ 87 + CreateLabelVals: newLabels, 88 + NegateLabelVals: []string{}, 89 + Comment: &comment, 90 + }, 91 + }, 92 + Subject: &comatproto.AdminEmitModerationEvent_Input_Subject{ 93 + AdminDefs_RepoRef: &comatproto.AdminDefs_RepoRef{ 94 + Did: c.Account.Identity.DID.String(), 95 + }, 96 + }, 97 + }) 98 + if err != nil { 99 + return err 100 + } 101 + } 102 + 103 + // reports are additionally de-duped when persisting the action, so track with a flag 104 + createdReports := false 105 + for _, 
mr := range newReports { 106 + created, err := eng.createReportIfFresh(ctx, xrpcc, c.Account.Identity.DID, mr) 107 + if err != nil { 108 + return err 109 + } 110 + if created { 111 + createdReports = true 112 + } 113 + } 114 + 115 + if newTakedown { 116 + eng.Logger.Warn("account-takedown") 117 + comment := "automod" 118 + _, err := comatproto.AdminEmitModerationEvent(ctx, xrpcc, &comatproto.AdminEmitModerationEvent_Input{ 119 + CreatedBy: xrpcc.Auth.Did, 120 + Event: &comatproto.AdminEmitModerationEvent_Input_Event{ 121 + AdminDefs_ModEventTakedown: &comatproto.AdminDefs_ModEventTakedown{ 122 + Comment: &comment, 123 + }, 124 + }, 125 + Subject: &comatproto.AdminEmitModerationEvent_Input_Subject{ 126 + AdminDefs_RepoRef: &comatproto.AdminDefs_RepoRef{ 127 + Did: c.Account.Identity.DID.String(), 128 + }, 129 + }, 130 + }) 131 + if err != nil { 132 + return err 133 + } 134 + } 135 + 136 + needCachePurge := newTakedown || len(newLabels) > 0 || len(newFlags) > 0 || createdReports 137 + if needCachePurge { 138 + return eng.PurgeAccountCaches(ctx, c.Account.Identity.DID) 139 + } 140 + 141 + return nil 142 + } 143 + 144 + // Persists some record-level state: labels, takedowns, reports. 145 + // 146 + // NOTE: this method currently does *not* persist record-level flags to any storage, and does not de-dupe most actions, on the assumption that the record is new (from firehose) and has no existing mod state. 
147 + func (eng *Engine) persistRecordModActions(c *RecordContext) error { 148 + ctx := c.Ctx 149 + if err := eng.persistAccountModActions(&c.AccountContext); err != nil { 150 + return err 151 + } 152 + 153 + // NOTE: record-level actions are *not* currently de-duplicated (aka, the same record could be labeled multiple times, or re-reported, etc) 154 + newLabels := dedupeStrings(c.effects.RecordLabels) 155 + newFlags := dedupeStrings(c.effects.RecordFlags) 156 + newReports, err := eng.circuitBreakReports(ctx, c.effects.RecordReports) 157 + if err != nil { 158 + return err 159 + } 160 + newTakedown, err := eng.circuitBreakTakedown(ctx, c.effects.RecordTakedown) 161 + if err != nil { 162 + return err 163 + } 164 + atURI := fmt.Sprintf("at://%s/%s/%s", c.Account.Identity.DID, c.RecordOp.Collection, c.RecordOp.RecordKey) 165 + 166 + if newTakedown || len(newLabels) > 0 || len(newFlags) > 0 || len(newReports) > 0 { 167 + if eng.SlackWebhookURL != "" { 168 + msg := slackBody("⚠️ Automod Record Action ⚠️\n", c.Account, newLabels, newFlags, newReports, newTakedown) 169 + msg += fmt.Sprintf("`%s`\n", atURI) 170 + if err := eng.SendSlackMsg(ctx, msg); err != nil { 171 + eng.Logger.Error("sending slack webhook", "err", err) 172 + } 173 + } 174 + } 175 + 176 + // flags don't require admin auth 177 + if len(newFlags) > 0 { 178 + eng.Flags.Add(ctx, atURI, newFlags) 179 + } 180 + 181 + // exit early 182 + if !newTakedown && len(newLabels) == 0 && len(newReports) == 0 { 183 + return nil 184 + } 185 + 186 + if eng.AdminClient == nil { 187 + return nil 188 + } 189 + 190 + if c.RecordOp.CID == nil { 191 + eng.Logger.Warn("skipping record actions because CID is nil, can't construct strong ref") 192 + return nil 193 + } 194 + cid := *c.RecordOp.CID 195 + strongRef := comatproto.RepoStrongRef{ 196 + Cid: cid.String(), 197 + Uri: atURI, 198 + } 199 + 200 + xrpcc := eng.AdminClient 201 + if len(newLabels) > 0 { 202 + eng.Logger.Info("labeling record", "newLabels", newLabels) 203 + comment 
:= "automod" 204 + _, err := comatproto.AdminEmitModerationEvent(ctx, xrpcc, &comatproto.AdminEmitModerationEvent_Input{ 205 + CreatedBy: xrpcc.Auth.Did, 206 + Event: &comatproto.AdminEmitModerationEvent_Input_Event{ 207 + AdminDefs_ModEventLabel: &comatproto.AdminDefs_ModEventLabel{ 208 + CreateLabelVals: newLabels, 209 + NegateLabelVals: []string{}, 210 + Comment: &comment, 211 + }, 212 + }, 213 + Subject: &comatproto.AdminEmitModerationEvent_Input_Subject{ 214 + RepoStrongRef: &strongRef, 215 + }, 216 + }) 217 + if err != nil { 218 + return err 219 + } 220 + } 221 + 222 + for _, mr := range newReports { 223 + eng.Logger.Info("reporting record", "reasonType", mr.ReasonType, "comment", mr.Comment) 224 + _, err := comatproto.ModerationCreateReport(ctx, xrpcc, &comatproto.ModerationCreateReport_Input{ 225 + ReasonType: &mr.ReasonType, 226 + Reason: &mr.Comment, 227 + Subject: &comatproto.ModerationCreateReport_Input_Subject{ 228 + RepoStrongRef: &strongRef, 229 + }, 230 + }) 231 + if err != nil { 232 + return err 233 + } 234 + } 235 + if newTakedown { 236 + eng.Logger.Warn("record-takedown") 237 + comment := "automod" 238 + _, err := comatproto.AdminEmitModerationEvent(ctx, xrpcc, &comatproto.AdminEmitModerationEvent_Input{ 239 + CreatedBy: xrpcc.Auth.Did, 240 + Event: &comatproto.AdminEmitModerationEvent_Input_Event{ 241 + AdminDefs_ModEventTakedown: &comatproto.AdminDefs_ModEventTakedown{ 242 + Comment: &comment, 243 + }, 244 + }, 245 + Subject: &comatproto.AdminEmitModerationEvent_Input_Subject{ 246 + RepoStrongRef: &strongRef, 247 + }, 248 + }) 249 + if err != nil { 250 + return err 251 + } 252 + } 253 + return nil 254 + }
+182
automod/engine/persisthelpers.go
+package engine
+
+import (
+	"context"
+	"fmt"
+	"strings"
+	"time"
+
+	comatproto "github.com/bluesky-social/indigo/api/atproto"
+	"github.com/bluesky-social/indigo/atproto/syntax"
+	"github.com/bluesky-social/indigo/automod/countstore"
+	"github.com/bluesky-social/indigo/xrpc"
+)
+
+func dedupeLabelActions(labels, existing, existingNegated []string) []string {
+	newLabels := []string{}
+	for _, val := range dedupeStrings(labels) {
+		exists := false
+		for _, e := range existingNegated {
+			if val == e {
+				exists = true
+				break
+			}
+		}
+		for _, e := range existing {
+			if val == e {
+				exists = true
+				break
+			}
+		}
+		if !exists {
+			newLabels = append(newLabels, val)
+		}
+	}
+	return newLabels
+}
+
+func dedupeFlagActions(flags, existing []string) []string {
+	newFlags := []string{}
+	for _, val := range dedupeStrings(flags) {
+		exists := false
+		for _, e := range existing {
+			if val == e {
+				exists = true
+				break
+			}
+		}
+		if !exists {
+			newFlags = append(newFlags, val)
+		}
+	}
+	return newFlags
+}
+
+func (eng *Engine) dedupeReportActions(ctx context.Context, did syntax.DID, reports []ModReport) ([]ModReport, error) {
+	newReports := []ModReport{}
+	for _, r := range reports {
+		counterName := "automod-account-report-" + ReasonShortName(r.ReasonType)
+		existing, err := eng.GetCount(counterName, did.String(), countstore.PeriodDay)
+		if err != nil {
+			return nil, fmt.Errorf("checking report de-dupe counts: %w", err)
+		}
+		if existing > 0 {
+			eng.Logger.Debug("skipping account report due to counter", "existing", existing, "reason", ReasonShortName(r.ReasonType))
+		} else {
+			err = eng.Counters.Increment(ctx, counterName, did.String())
+			if err != nil {
+				return nil, fmt.Errorf("incrementing report de-dupe count: %w", err)
+			}
+			newReports = append(newReports, r)
+		}
+	}
+	return newReports, nil
+}
+
+func (eng *Engine) circuitBreakReports(ctx context.Context, reports []ModReport) ([]ModReport, error) {
+	if len(reports) == 0 {
+		return []ModReport{}, nil
+	}
+	c, err := eng.GetCount("automod-quota", "report", countstore.PeriodDay)
+	if err != nil {
+		return nil, fmt.Errorf("checking report action quota: %w", err)
+	}
+	if c >= QuotaModReportDay {
+		eng.Logger.Warn("CIRCUIT BREAKER: automod reports")
+		return []ModReport{}, nil
+	}
+	err = eng.Counters.Increment(ctx, "automod-quota", "report")
+	if err != nil {
+		return nil, fmt.Errorf("incrementing report action quota: %w", err)
+	}
+	return reports, nil
+}
+
+func (eng *Engine) circuitBreakTakedown(ctx context.Context, takedown bool) (bool, error) {
+	if !takedown {
+		return false, nil
+	}
+	c, err := eng.GetCount("automod-quota", "takedown", countstore.PeriodDay)
+	if err != nil {
+		return false, fmt.Errorf("checking takedown action quota: %w", err)
+	}
+	if c >= QuotaModTakedownDay {
+		eng.Logger.Warn("CIRCUIT BREAKER: automod takedowns")
+		return false, nil
+	}
+	err = eng.Counters.Increment(ctx, "automod-quota", "takedown")
+	if err != nil {
+		return false, fmt.Errorf("incrementing takedown action quota: %w", err)
+	}
+	return takedown, nil
+}
+
+// Creates a moderation report, but checks first if there was a similar recent one, and skips if so.
+//
+// Returns a bool indicating if a new report was created.
+func (eng *Engine) createReportIfFresh(ctx context.Context, xrpcc *xrpc.Client, did syntax.DID, mr ModReport) (bool, error) {
+	// before creating a report, query to see if automod has already reported this account in the past week for the same reason
+	// NOTE: this is running in an inner loop (if there are multiple reports), which is a bit inefficient, but seems acceptable
+
+	// AdminQueryModerationEvents(ctx context.Context, c *xrpc.Client, createdBy string, cursor string, includeAllUserRecords bool, limit int64, sortDirection string, subject string, types []string)
+	resp, err := comatproto.AdminQueryModerationEvents(ctx, xrpcc, xrpcc.Auth.Did, "", false, 5, "", did.String(), []string{"com.atproto.admin.defs#modEventReport"})
+	if err != nil {
+		return false, err
+	}
+	for _, modEvt := range resp.Events {
+		// defensively ensure that our query params worked correctly
+		if modEvt.Event.AdminDefs_ModEventReport == nil || modEvt.CreatedBy != xrpcc.Auth.Did || modEvt.Subject.AdminDefs_RepoRef == nil || modEvt.Subject.AdminDefs_RepoRef.Did != did.String() || (modEvt.Event.AdminDefs_ModEventReport.ReportType != nil && *modEvt.Event.AdminDefs_ModEventReport.ReportType != mr.ReasonType) {
+			continue
+		}
+		// ignore if older
+		created, err := syntax.ParseDatetime(modEvt.CreatedAt)
+		if err != nil {
+			return false, err
+		}
+		if time.Since(created.Time()) > ReportDupePeriod {
+			continue
+		}
+
+		// there is a recent report which is similar to this one
+		eng.Logger.Info("skipping duplicate account report due to API check")
+		return false, nil
+	}
+
+	eng.Logger.Info("reporting account", "reasonType", mr.ReasonType, "comment", mr.Comment)
+	_, err = comatproto.ModerationCreateReport(ctx, xrpcc, &comatproto.ModerationCreateReport_Input{
+		ReasonType: &mr.ReasonType,
+		Reason:     &mr.Comment,
+		Subject: &comatproto.ModerationCreateReport_Input_Subject{
+			AdminDefs_RepoRef: &comatproto.AdminDefs_RepoRef{
+				Did: did.String(),
+			},
+		},
+	})
+	if err != nil {
+		return false, err
+	}
+	return true, nil
+}
+
+func slackBody(header string, acct AccountMeta, newLabels, newFlags []string, newReports []ModReport, newTakedown bool) string {
+	msg := header
+	msg += fmt.Sprintf("`%s` / `%s` / <https://bsky.app/profile/%s|bsky> / <https://admin.prod.bsky.dev/repositories/%s|ozone>\n",
+		acct.Identity.DID,
+		acct.Identity.Handle,
+		acct.Identity.DID,
+		acct.Identity.DID,
+	)
+	if len(newLabels) > 0 {
+		msg += fmt.Sprintf("New Labels: `%s`\n", strings.Join(newLabels, ", "))
+	}
+	if len(newFlags) > 0 {
+		msg += fmt.Sprintf("New Flags: `%s`\n", strings.Join(newFlags, ", "))
+	}
+	for _, rep := range newReports {
+		msg += fmt.Sprintf("Report `%s`: %s\n", rep.ReasonType, rep.Comment)
+	}
+	if newTakedown {
+		msg += fmt.Sprintf("Takedown!\n")
+	}
+	return msg
+}
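Reviewer note: the circuit-breaker helpers in this file all follow the same read-check-increment shape. A minimal standalone sketch of that pattern is below. The `memCounter` type, the `quotaPerDay` value, and the key format are hypothetical stand-ins for illustration only; they are not the engine's count store or its real quotas, and the sketch ignores time-period buckets and error handling.

```go
package main

import "fmt"

// memCounter is a hypothetical in-memory stand-in for a count store.
type memCounter map[string]int

// quotaPerDay is an illustrative limit, not one of the engine's real quotas.
const quotaPerDay = 3

// circuitBreak reports whether the named action may proceed. It reads the
// current count first and increments it only when the action is allowed,
// so blocked attempts do not consume quota.
func circuitBreak(c memCounter, action string) bool {
	key := "automod-quota/" + action
	if c[key] >= quotaPerDay {
		return false
	}
	c[key]++
	return true
}

func main() {
	c := memCounter{}
	for i := 0; i < 5; i++ {
		fmt.Println(i, circuitBreak(c, "takedown"))
	}
}
```

Because the increment happens only on the allowed path, the counter saturates at the quota rather than growing with every blocked attempt.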
+71
automod/engine/ruleset.go
+package engine
+
+import (
+	"fmt"
+
+	appbsky "github.com/bluesky-social/indigo/api/bsky"
+)
+
+type RuleSet struct {
+	PostRules         []PostRuleFunc
+	ProfileRules      []ProfileRuleFunc
+	RecordRules       []RecordRuleFunc
+	RecordDeleteRules []RecordRuleFunc
+	IdentityRules     []IdentityRuleFunc
+}
+
+func (r *RuleSet) CallRecordRules(c *RecordContext) error {
+	// first the generic rules
+	for _, f := range r.RecordRules {
+		err := f(c)
+		if err != nil {
+			return err
+		}
+	}
+	// then any record-type-specific rules
+	switch c.RecordOp.Collection.String() {
+	case "app.bsky.feed.post":
+		post, ok := c.RecordOp.Value.(*appbsky.FeedPost)
+		if !ok {
+			return fmt.Errorf("mismatch between collection (%s) and type", c.RecordOp.Collection)
+		}
+		for _, f := range r.PostRules {
+			err := f(c, post)
+			if err != nil {
+				return err
+			}
+		}
+	case "app.bsky.actor.profile":
+		profile, ok := c.RecordOp.Value.(*appbsky.ActorProfile)
+		if !ok {
+			return fmt.Errorf("mismatch between collection (%s) and type", c.RecordOp.Collection)
+		}
+		for _, f := range r.ProfileRules {
+			err := f(c, profile)
+			if err != nil {
+				return err
+			}
+		}
+	}
+	return nil
+}
+
+func (r *RuleSet) CallRecordDeleteRules(c *RecordContext) error {
+	for _, f := range r.RecordDeleteRules {
+		err := f(c)
+		if err != nil {
+			return err
+		}
+	}
+	return nil
+}
+
+func (r *RuleSet) CallIdentityRules(c *AccountContext) error {
+	for _, f := range r.IdentityRules {
+		err := f(c)
+		if err != nil {
+			return err
+		}
+	}
+	return nil
+}
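Reviewer note: `CallRecordRules` runs the generic rules first, then dispatches on the collection and type-asserts the record value before calling the typed rules. The sketch below reproduces that dispatch shape with stand-in types so it compiles on its own. `recordCtx`, `post`, and the lowercase rule types are hypothetical simplifications, not the engine's `RecordContext` or `appbsky.FeedPost`.

```go
package main

import "fmt"

// post is a stand-in for a typed record such as a feed post.
type post struct{ Text string }

// recordCtx is a hypothetical, trimmed-down stand-in for a record context.
type recordCtx struct {
	Collection string
	Value      any
	Labels     []string
}

type recordRule = func(c *recordCtx) error
type postRule = func(c *recordCtx, p *post) error

type ruleSet struct {
	RecordRules []recordRule
	PostRules   []postRule
}

// callRecordRules runs generic rules first, then any rules specific to the
// record's collection. A value whose Go type does not match the collection
// is reported as an error instead of being skipped silently.
func (r *ruleSet) callRecordRules(c *recordCtx) error {
	for _, f := range r.RecordRules {
		if err := f(c); err != nil {
			return err
		}
	}
	switch c.Collection {
	case "app.bsky.feed.post":
		p, ok := c.Value.(*post)
		if !ok {
			return fmt.Errorf("mismatch between collection (%s) and type", c.Collection)
		}
		for _, f := range r.PostRules {
			if err := f(c, p); err != nil {
				return err
			}
		}
	}
	return nil
}

func main() {
	rules := ruleSet{
		PostRules: []postRule{func(c *recordCtx, p *post) error {
			if p.Text == "" {
				c.Labels = append(c.Labels, "empty-post")
			}
			return nil
		}},
	}
	c := &recordCtx{Collection: "app.bsky.feed.post", Value: &post{}}
	fmt.Println(rules.callRecordRules(c), c.Labels)
}
```

Collections without a case in the switch only get the generic rules, which matches the behavior of the method in this file.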
+10
automod/engine/ruletypes.go
+package engine
+
+import (
+	appbsky "github.com/bluesky-social/indigo/api/bsky"
+)
+
+type IdentityRuleFunc = func(c *AccountContext) error
+type RecordRuleFunc = func(c *RecordContext) error
+type PostRuleFunc = func(c *RecordContext, post *appbsky.FeedPost) error
+type ProfileRuleFunc = func(c *RecordContext, profile *appbsky.ActorProfile) error
+71
automod/engine/testing.go
+package engine
+
+import (
+	"log/slog"
+	"time"
+
+	appbsky "github.com/bluesky-social/indigo/api/bsky"
+	"github.com/bluesky-social/indigo/atproto/identity"
+	"github.com/bluesky-social/indigo/atproto/syntax"
+	"github.com/bluesky-social/indigo/automod/cachestore"
+	"github.com/bluesky-social/indigo/automod/countstore"
+	"github.com/bluesky-social/indigo/automod/flagstore"
+	"github.com/bluesky-social/indigo/automod/setstore"
+)
+
+var _ PostRuleFunc = simpleRule
+
+func simpleRule(c *RecordContext, post *appbsky.FeedPost) error {
+	for _, tag := range post.Tags {
+		if c.InSet("bad-hashtags", tag) {
+			c.AddRecordLabel("bad-hashtag")
+			break
+		}
+	}
+	for _, facet := range post.Facets {
+		for _, feat := range facet.Features {
+			if feat.RichtextFacet_Tag != nil {
+				tag := feat.RichtextFacet_Tag.Tag
+				if c.InSet("bad-hashtags", tag) {
+					c.AddRecordLabel("bad-hashtag")
+					break
+				}
+			}
+		}
+	}
+	return nil
+}
+
+func EngineTestFixture() Engine {
+	rules := RuleSet{
+		PostRules: []PostRuleFunc{
+			simpleRule,
+		},
+	}
+	cache := cachestore.NewMemCacheStore(10, time.Hour)
+	flags := flagstore.NewMemFlagStore()
+	sets := setstore.NewMemSetStore()
+	sets.Sets["bad-hashtags"] = make(map[string]bool)
+	sets.Sets["bad-hashtags"]["slur"] = true
+	dir := identity.NewMockDirectory()
+	id1 := identity.Identity{
+		DID:    syntax.DID("did:plc:abc111"),
+		Handle: syntax.Handle("handle.example.com"),
+	}
+	dir.Insert(id1)
+	eng := Engine{
+		Logger:    slog.Default(),
+		Directory: &dir,
+		Counters:  countstore.NewMemCountStore(),
+		Sets:      sets,
+		Flags:     flags,
+		Cache:     cache,
+		Rules:     rules,
+	}
+	return eng
+}
+
+// Helper to access the private effects field from a context. Intended for use in test code, *not* from rules.
+func ExtractEffects(c *BaseContext) Effects {
+	return c.effects
+}
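Reviewer note: `var _ PostRuleFunc = simpleRule` is a compile-time check that the function still matches the rule signature, which is useful while the rule API is changing from `*Event` to `*Context`. A minimal standalone sketch of the same idiom follows. The `recordCtx` type, its `InSet` and `AddRecordLabel` methods, and `hashtagRule` are hypothetical stand-ins that only mimic the shape of the real context API.

```go
package main

import "fmt"

// recordCtx is a hypothetical stand-in for the rule-facing context.
type recordCtx struct {
	sets   map[string]map[string]bool
	Labels []string
}

// InSet reports whether val is a member of the named set.
// Indexing a missing (nil) inner map is safe in Go and yields false.
func (c *recordCtx) InSet(name, val string) bool { return c.sets[name][val] }

// AddRecordLabel queues a label on the context.
func (c *recordCtx) AddRecordLabel(val string) { c.Labels = append(c.Labels, val) }

type post struct{ Tags []string }

type postRuleFunc = func(c *recordCtx, p *post) error

// Compile-time assertion: the build fails if hashtagRule's signature
// drifts away from postRuleFunc.
var _ postRuleFunc = hashtagRule

func hashtagRule(c *recordCtx, p *post) error {
	for _, tag := range p.Tags {
		if c.InSet("bad-hashtags", tag) {
			c.AddRecordLabel("bad-hashtag")
			break
		}
	}
	return nil
}

func main() {
	c := &recordCtx{sets: map[string]map[string]bool{"bad-hashtags": {"slur": true}}}
	err := hashtagRule(c, &post{Tags: []string{"one", "slur"}})
	fmt.Println(err, c.Labels)
}
```

The assertion costs nothing at runtime since the blank identifier discards the value.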
+14 -6
automod/engine_test.go → automod/engine/engine_test.go
-package automod
+package engine

 import (
 	"context"
···
 	assert := assert.New(t)
 	ctx := context.Background()

-	engine := EngineTestFixture()
+	eng := EngineTestFixture()
 	id1 := identity.Identity{
 		DID:    syntax.DID("did:plc:abc111"),
 		Handle: syntax.Handle("handle.example.com"),
 	}
-	path := "app.bsky.feed.post/abc123"
-	cid1 := "cid123"
+	cid1 := syntax.CID("cid123")
 	p1 := appbsky.FeedPost{
 		Text: "some post blah",
 	}
-	assert.NoError(engine.ProcessRecord(ctx, id1.DID, path, cid1, &p1))
+	op := RecordOp{
+		Action:     CreateOp,
+		DID:        id1.DID,
+		Collection: syntax.NSID("app.bsky.feed.post"),
+		RecordKey:  syntax.RecordKey("abc123"),
+		CID:        &cid1,
+		Value:      &p1,
+	}
+	assert.NoError(eng.ProcessRecordOp(ctx, op))

 	p2 := appbsky.FeedPost{
 		Text: "some post blah",
 		Tags: []string{"one", "slur"},
 	}
-	assert.NoError(engine.ProcessRecord(ctx, id1.DID, path, cid1, &p2))
+	op.Value = &p2
+	assert.NoError(eng.ProcessRecordOp(ctx, op))
 }
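Reviewer note: the test change above replaces the single `path` string with separate `Collection` and `RecordKey` fields on `RecordOp`. Callers that still hold an old-style repo path would need to split it before building the op. A minimal sketch of such a helper is below; `splitRepoPath` is a hypothetical name that does not appear in this PR, and it only checks the `<collection>/<rkey>` shape, not NSID or record-key syntax.

```go
package main

import (
	"fmt"
	"strings"
)

// splitRepoPath is a hypothetical helper that splits an old-style repo path
// of the form "<collection>/<rkey>" into its two parts. It rejects paths
// with missing parts or extra slashes, but does not validate the parts.
func splitRepoPath(path string) (collection, rkey string, err error) {
	parts := strings.Split(path, "/")
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return "", "", fmt.Errorf("expected <collection>/<rkey>, got %q", path)
	}
	return parts[0], parts[1], nil
}

func main() {
	collection, rkey, err := splitRepoPath("app.bsky.feed.post/abc123")
	fmt.Println(collection, rkey, err)
}
```

In real code the two parts would presumably go through the `syntax` package's parsers before being stored on the op, so that malformed values are caught at the boundary.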
-604
automod/event.go
··· 1 - package automod 2 - 3 - import ( 4 - "context" 5 - "fmt" 6 - "log/slog" 7 - "strings" 8 - "time" 9 - 10 - comatproto "github.com/bluesky-social/indigo/api/atproto" 11 - appbsky "github.com/bluesky-social/indigo/api/bsky" 12 - "github.com/bluesky-social/indigo/atproto/syntax" 13 - "github.com/bluesky-social/indigo/automod/countstore" 14 - "github.com/bluesky-social/indigo/automod/util" 15 - "github.com/bluesky-social/indigo/xrpc" 16 - ) 17 - 18 - var ( 19 - // time period within which automod will not re-report an account for the same reasonType 20 - ReportDupePeriod = 7 * 24 * time.Hour 21 - // number of reports automod can file per day, for all subjects and types combined (circuit breaker) 22 - QuotaModReportDay = 50 23 - // number of takedowns automod can action per day, for all subjects combined (circuit breaker) 24 - QuotaModTakedownDay = 10 25 - ) 26 - 27 - type CounterRef struct { 28 - Name string 29 - Val string 30 - Period *string 31 - } 32 - 33 - type CounterDistinctRef struct { 34 - Name string 35 - Bucket string 36 - Val string 37 - } 38 - 39 - // Base type for events specific to an account, usually derived from a repo event stream message (one such message may result in multiple `RepoEvent`) 40 - // 41 - // Events are both containers for data about the event itself (similar to an HTTP request type); aggregate results and state (counters, mod actions) to be persisted after all rules are run; and act as an API for additional network reads and operations. 42 - // 43 - // Handling of moderation actions (such as labels, flags, and reports) are deferred until the end of all rule execution, then de-duplicated against any pre-existing actions on the account. 44 - type RepoEvent struct { 45 - // Back-reference to Engine that is processing this event. Pointer, but must not be nil. 46 - Engine *Engine 47 - // Any error encountered while processing the event can be stashed in this field and handled at the end of all processing. 
48 - Err error 49 - // slog logger handle, with event-specific structured fields pre-populated. Pointer, but expected to not be nil. 50 - Logger *slog.Logger 51 - // Metadata for the account (identity) associated with this event (aka, the repo owner) 52 - Account AccountMeta 53 - // List of counters which should be incremented as part of processing this event. These are collected during rule execution and persisted in bulk at the end. 54 - CounterIncrements []CounterRef 55 - // Similar to "CounterIncrements", but for "distinct" style counters 56 - CounterDistinctIncrements []CounterDistinctRef // TODO: better variable names 57 - // Label values which should be applied to the overall account, as a result of rule execution. 58 - AccountLabels []string 59 - // Moderation flags (similar to labels, but private) which should be applied to the overall account, as a result of rule execution. 60 - AccountFlags []string 61 - // Reports which should be filed against this account, as a result of rule execution. 62 - AccountReports []ModReport 63 - // If "true", indicates that a rule has determined that the entire account should have a takedown. 64 - AccountTakedown bool 65 - } 66 - 67 - // Immediately fetches a count from the event's engine's countstore. Returns 0 by default (if counter has never been incremented). 68 - // 69 - // "name" is the counter namespace. 70 - // "val" is the specific counter with that namespace. 71 - // "period" is the time period bucket (one of the fixed "Period*" values) 72 - func (e *RepoEvent) GetCount(name, val, period string) int { 73 - v, err := e.Engine.GetCount(name, val, period) 74 - if err != nil { 75 - e.Err = err 76 - return 0 77 - } 78 - return v 79 - } 80 - 81 - // Enqueues the named counter to be incremented at the end of all rule processing. Will automatically increment for all time periods. 82 - // 83 - // "name" is the counter namespace. 84 - // "val" is the specific counter with that namespace. 
85 - func (e *RepoEvent) Increment(name, val string) { 86 - e.CounterIncrements = append(e.CounterIncrements, CounterRef{Name: name, Val: val}) 87 - } 88 - 89 - // Enqueues the named counter to be incremented at the end of all rule processing. Will only increment the indicated time period bucket. 90 - func (e *RepoEvent) IncrementPeriod(name, val string, period string) { 91 - e.CounterIncrements = append(e.CounterIncrements, CounterRef{Name: name, Val: val, Period: &period}) 92 - } 93 - 94 - // Immediately fetches an estimated (statistical) count of distinct string values in the indicated bucket and time period. 95 - func (e *RepoEvent) GetCountDistinct(name, bucket, period string) int { 96 - v, err := e.Engine.GetCountDistinct(name, bucket, period) 97 - if err != nil { 98 - e.Err = err 99 - return 0 100 - } 101 - return v 102 - } 103 - 104 - // Enqueues the named "distinct value" counter based on the supplied string value ("val") to be incremented at the end of all rule processing. Will automatically increment for all time periods. 105 - func (e *RepoEvent) IncrementDistinct(name, bucket, val string) { 106 - e.CounterDistinctIncrements = append(e.CounterDistinctIncrements, CounterDistinctRef{Name: name, Bucket: bucket, Val: val}) 107 - } 108 - 109 - // Checks the Engine's setstore for whether the indicated "val" is a member of the "name" set. 110 - func (e *RepoEvent) InSet(name, val string) bool { 111 - v, err := e.Engine.InSet(name, val) 112 - if err != nil { 113 - e.Err = err 114 - return false 115 - } 116 - return v 117 - } 118 - 119 - // Enqueues the entire account to be taken down at the end of rule processing. 120 - func (e *RepoEvent) TakedownAccount() { 121 - e.AccountTakedown = true 122 - } 123 - 124 - // Enqueues the provided label (string value) to be added to the account at the end of rule processing. 
125 - func (e *RepoEvent) AddAccountLabel(val string) { 126 - e.AccountLabels = append(e.AccountLabels, val) 127 - } 128 - 129 - // Enqueues the provided flag (string value) to be recorded (in the Engine's flagstore) at the end of rule processing. 130 - func (e *RepoEvent) AddAccountFlag(val string) { 131 - e.AccountFlags = append(e.AccountFlags, val) 132 - } 133 - 134 - // Enqueues a moderation report to be filed against the account at the end of rule processing. 135 - func (e *RepoEvent) ReportAccount(reason, comment string) { 136 - if comment == "" { 137 - comment = "(no comment)" 138 - } 139 - comment = "automod: " + comment 140 - e.AccountReports = append(e.AccountReports, ModReport{ReasonType: reason, Comment: comment}) 141 - } 142 - 143 - func slackBody(header string, acct AccountMeta, newLabels, newFlags []string, newReports []ModReport, newTakedown bool) string { 144 - msg := header 145 - msg += fmt.Sprintf("`%s` / `%s` / <https://bsky.app/profile/%s|bsky> / <https://admin.prod.bsky.dev/repositories/%s|ozone>\n", 146 - acct.Identity.DID, 147 - acct.Identity.Handle, 148 - acct.Identity.DID, 149 - acct.Identity.DID, 150 - ) 151 - if len(newLabels) > 0 { 152 - msg += fmt.Sprintf("New Labels: `%s`\n", strings.Join(newLabels, ", ")) 153 - } 154 - if len(newFlags) > 0 { 155 - msg += fmt.Sprintf("New Flags: `%s`\n", strings.Join(newFlags, ", ")) 156 - } 157 - for _, rep := range newReports { 158 - msg += fmt.Sprintf("Report `%s`: %s\n", rep.ReasonType, rep.Comment) 159 - } 160 - if newTakedown { 161 - msg += fmt.Sprintf("Takedown!\n") 162 - } 163 - return msg 164 - } 165 - 166 - func dedupeLabelActions(labels, existing, existingNegated []string) []string { 167 - newLabels := []string{} 168 - for _, val := range util.DedupeStrings(labels) { 169 - exists := false 170 - for _, e := range existingNegated { 171 - if val == e { 172 - exists = true 173 - break 174 - } 175 - } 176 - for _, e := range existing { 177 - if val == e { 178 - exists = true 179 - break 180 - } 
181 - } 182 - if !exists { 183 - newLabels = append(newLabels, val) 184 - } 185 - } 186 - return newLabels 187 - } 188 - 189 - func dedupeFlagActions(flags, existing []string) []string { 190 - newFlags := []string{} 191 - for _, val := range util.DedupeStrings(flags) { 192 - exists := false 193 - for _, e := range existing { 194 - if val == e { 195 - exists = true 196 - break 197 - } 198 - } 199 - if !exists { 200 - newFlags = append(newFlags, val) 201 - } 202 - } 203 - return newFlags 204 - } 205 - 206 - func dedupeReportActions(evt *RepoEvent, reports []ModReport) []ModReport { 207 - newReports := []ModReport{} 208 - for _, r := range reports { 209 - counterName := "automod-account-report-" + reasonShortName(r.ReasonType) 210 - existing := evt.GetCount(counterName, evt.Account.Identity.DID.String(), countstore.PeriodDay) 211 - if existing > 0 { 212 - evt.Logger.Debug("skipping account report due to counter", "existing", existing, "reason", reasonShortName(r.ReasonType)) 213 - } else { 214 - evt.Increment(counterName, evt.Account.Identity.DID.String()) 215 - newReports = append(newReports, r) 216 - } 217 - } 218 - return newReports 219 - } 220 - 221 - func circuitBreakReports(evt *RepoEvent, reports []ModReport) []ModReport { 222 - if len(reports) == 0 { 223 - return []ModReport{} 224 - } 225 - if evt.GetCount("automod-quota", "report", countstore.PeriodDay) >= QuotaModReportDay { 226 - evt.Logger.Warn("CIRCUIT BREAKER: automod reports") 227 - return []ModReport{} 228 - } 229 - evt.Increment("automod-quota", "report") 230 - return reports 231 - } 232 - 233 - func circuitBreakTakedown(evt *RepoEvent, takedown bool) bool { 234 - if !takedown { 235 - return takedown 236 - } 237 - if evt.GetCount("automod-quota", "takedown", countstore.PeriodDay) >= QuotaModTakedownDay { 238 - evt.Logger.Warn("CIRCUIT BREAKER: automod takedowns") 239 - return false 240 - } 241 - evt.Increment("automod-quota", "takedown") 242 - return takedown 243 - } 244 - 245 - // Creates a 
moderation report, but checks first if there was a similar recent one, and skips if so. 246 - // 247 - // Returns a bool indicating if a new report was created. 248 - func createReportIfFresh(ctx context.Context, xrpcc *xrpc.Client, evt RepoEvent, mr ModReport) (bool, error) { 249 - // before creating a report, query to see if automod has already reported this account in the past week for the same reason 250 - // NOTE: this is running in an inner loop (if there are multiple reports), which is a bit inefficient, but seems acceptable 251 - 252 - // AdminQueryModerationEvents(ctx context.Context, c *xrpc.Client, createdBy string, cursor string, includeAllUserRecords bool, limit int64, sortDirection string, subject string, types []string) 253 - resp, err := comatproto.AdminQueryModerationEvents(ctx, xrpcc, xrpcc.Auth.Did, "", false, 5, "", evt.Account.Identity.DID.String(), []string{"com.atproto.admin.defs#modEventReport"}) 254 - if err != nil { 255 - return false, err 256 - } 257 - for _, modEvt := range resp.Events { 258 - // defensively ensure that our query params worked correctly 259 - if modEvt.Event.AdminDefs_ModEventReport == nil || modEvt.CreatedBy != xrpcc.Auth.Did || modEvt.Subject.AdminDefs_RepoRef == nil || modEvt.Subject.AdminDefs_RepoRef.Did != evt.Account.Identity.DID.String() || (modEvt.Event.AdminDefs_ModEventReport.ReportType != nil && *modEvt.Event.AdminDefs_ModEventReport.ReportType != mr.ReasonType) { 260 - continue 261 - } 262 - // ignore if older 263 - created, err := syntax.ParseDatetime(modEvt.CreatedAt) 264 - if err != nil { 265 - return false, err 266 - } 267 - if time.Since(created.Time()) > ReportDupePeriod { 268 - continue 269 - } 270 - 271 - // there is a recent report which is similar to this one 272 - evt.Logger.Info("skipping duplicate account report due to API check") 273 - return false, nil 274 - } 275 - 276 - evt.Logger.Info("reporting account", "reasonType", mr.ReasonType, "comment", mr.Comment) 277 - _, err = 
comatproto.ModerationCreateReport(ctx, xrpcc, &comatproto.ModerationCreateReport_Input{ 278 - ReasonType: &mr.ReasonType, 279 - Reason: &mr.Comment, 280 - Subject: &comatproto.ModerationCreateReport_Input_Subject{ 281 - AdminDefs_RepoRef: &comatproto.AdminDefs_RepoRef{ 282 - Did: evt.Account.Identity.DID.String(), 283 - }, 284 - }, 285 - }) 286 - if err != nil { 287 - return false, err 288 - } 289 - return true, nil 290 - } 291 - 292 - // Persists account-level moderation actions: new labels, new flags, new takedowns, and reports. 293 - // 294 - // If necessary, will "purge" identity and account caches, so that state updates will be picked up for subsequent events. 295 - // 296 - // Note that this method expects to run *before* counts are persisted (it accesses and updates some counts) 297 - func (e *RepoEvent) PersistAccountActions(ctx context.Context) error { 298 - 299 - // de-dupe actions 300 - newLabels := dedupeLabelActions(e.AccountLabels, e.Account.AccountLabels, e.Account.AccountNegatedLabels) 301 - newFlags := dedupeFlagActions(e.AccountFlags, e.Account.AccountFlags) 302 - 303 - // don't report the same account multiple times on the same day for the same reason. this is a quick check; we also query the mod service API just before creating the report. 
304 - newReports := circuitBreakReports(e, dedupeReportActions(e, e.AccountReports)) 305 - newTakedown := circuitBreakTakedown(e, e.AccountTakedown && !e.Account.Takendown) 306 - 307 - anyModActions := newTakedown || len(newLabels) > 0 || len(newFlags) > 0 || len(newReports) > 0 308 - if anyModActions && e.Engine.SlackWebhookURL != "" { 309 - msg := slackBody("⚠️ Automod Account Action ⚠️\n", e.Account, newLabels, newFlags, newReports, newTakedown) 310 - if err := e.Engine.SendSlackMsg(ctx, msg); err != nil { 311 - e.Logger.Error("sending slack webhook", "err", err) 312 - } 313 - } 314 - 315 - // flags don't require admin auth 316 - if len(newFlags) > 0 { 317 - e.Engine.Flags.Add(ctx, e.Account.Identity.DID.String(), newFlags) 318 - } 319 - 320 - // if we can't actually talk to service, bail out early 321 - if e.Engine.AdminClient == nil { 322 - return nil 323 - } 324 - 325 - xrpcc := e.Engine.AdminClient 326 - 327 - if len(newLabels) > 0 { 328 - e.Logger.Info("labeling record", "newLabels", newLabels) 329 - comment := "automod" 330 - _, err := comatproto.AdminEmitModerationEvent(ctx, xrpcc, &comatproto.AdminEmitModerationEvent_Input{ 331 - CreatedBy: xrpcc.Auth.Did, 332 - Event: &comatproto.AdminEmitModerationEvent_Input_Event{ 333 - AdminDefs_ModEventLabel: &comatproto.AdminDefs_ModEventLabel{ 334 - CreateLabelVals: newLabels, 335 - NegateLabelVals: []string{}, 336 - Comment: &comment, 337 - }, 338 - }, 339 - Subject: &comatproto.AdminEmitModerationEvent_Input_Subject{ 340 - AdminDefs_RepoRef: &comatproto.AdminDefs_RepoRef{ 341 - Did: e.Account.Identity.DID.String(), 342 - }, 343 - }, 344 - }) 345 - if err != nil { 346 - return err 347 - } 348 - } 349 - 350 - // reports are additionally de-duped when persisting the action, so track with a flag 351 - createdReports := false 352 - for _, mr := range newReports { 353 - created, err := createReportIfFresh(ctx, xrpcc, *e, mr) 354 - if err != nil { 355 - return err 356 - } 357 - if created { 358 - createdReports = true 
359 - } 360 - } 361 - 362 - if newTakedown { 363 - e.Logger.Warn("account-takedown") 364 - comment := "automod" 365 - _, err := comatproto.AdminEmitModerationEvent(ctx, xrpcc, &comatproto.AdminEmitModerationEvent_Input{ 366 - CreatedBy: xrpcc.Auth.Did, 367 - Event: &comatproto.AdminEmitModerationEvent_Input_Event{ 368 - AdminDefs_ModEventTakedown: &comatproto.AdminDefs_ModEventTakedown{ 369 - Comment: &comment, 370 - }, 371 - }, 372 - Subject: &comatproto.AdminEmitModerationEvent_Input_Subject{ 373 - AdminDefs_RepoRef: &comatproto.AdminDefs_RepoRef{ 374 - Did: e.Account.Identity.DID.String(), 375 - }, 376 - }, 377 - }) 378 - if err != nil { 379 - return err 380 - } 381 - } 382 - 383 - needCachePurge := newTakedown || len(newLabels) > 0 || len(newFlags) > 0 || createdReports 384 - if needCachePurge { 385 - return e.Engine.PurgeAccountCaches(ctx, e.Account.Identity.DID) 386 - } 387 - 388 - return nil 389 - } 390 - 391 - func (e *RepoEvent) PersistActions(ctx context.Context) error { 392 - return e.PersistAccountActions(ctx) 393 - } 394 - 395 - func (e *RepoEvent) PersistCounters(ctx context.Context) error { 396 - // TODO: dedupe this array 397 - for _, ref := range e.CounterIncrements { 398 - if ref.Period != nil { 399 - err := e.Engine.Counters.IncrementPeriod(ctx, ref.Name, ref.Val, *ref.Period) 400 - if err != nil { 401 - return err 402 - } 403 - } else { 404 - err := e.Engine.Counters.Increment(ctx, ref.Name, ref.Val) 405 - if err != nil { 406 - return err 407 - } 408 - } 409 - } 410 - for _, ref := range e.CounterDistinctIncrements { 411 - err := e.Engine.Counters.IncrementDistinct(ctx, ref.Name, ref.Bucket, ref.Val) 412 - if err != nil { 413 - return err 414 - } 415 - } 416 - return nil 417 - } 418 - 419 - func (e *RepoEvent) CanonicalLogLine() { 420 - e.Logger.Info("canonical-event-line", 421 - "accountLabels", e.AccountLabels, 422 - "accountFlags", e.AccountFlags, 423 - "accountTakedown", e.AccountTakedown, 424 - "accountReports", len(e.AccountReports), 425 - 
) 426 - } 427 - 428 - // Alias of RepoEvent 429 - type IdentityEvent struct { 430 - RepoEvent 431 - } 432 - 433 - // Extends RepoEvent. Represents the creation of a single record in the given repository. 434 - type RecordEvent struct { 435 - RepoEvent 436 - 437 - // The un-marshalled record, as a go struct, from the api/atproto or api/bsky type packages. 438 - Record any 439 - // The "collection" part of the repo path for this record. Must be an NSID, though this isn't indicated by the type of this field. 440 - Collection string 441 - // The "record key" (rkey) part of repo path. 442 - RecordKey string 443 - // CID of the canonical CBOR version of the record, as matches the repo value. 444 - CID string 445 - // Same as "AccountLabels", but at record-level 446 - RecordLabels []string 447 - // Same as "AccountTakedown", but at record-level 448 - RecordTakedown bool 449 - // Same as "AccountReports", but at record-level 450 - RecordReports []ModReport 451 - // Same as "AccountFlags", but at record-level 452 - RecordFlags []string 453 - // TODO: commit metadata 454 - } 455 - 456 - // Enqueues the record to be taken down at the end of rule processing. 457 - func (e *RecordEvent) TakedownRecord() { 458 - e.RecordTakedown = true 459 - } 460 - 461 - // Enqueues the provided label (string value) to be added to the record at the end of rule processing. 462 - func (e *RecordEvent) AddRecordLabel(val string) { 463 - e.RecordLabels = append(e.RecordLabels, val) 464 - } 465 - 466 - // Enqueues the provided flag (string value) to be recorded (in the Engine's flagstore) at the end of rule processing. 467 - func (e *RecordEvent) AddRecordFlag(val string) { 468 - e.RecordFlags = append(e.RecordFlags, val) 469 - } 470 - 471 - // Enqueues a moderation report to be filed against the record at the end of rule processing. 
472 - func (e *RecordEvent) ReportRecord(reason, comment string) { 473 - if comment == "" { 474 - comment = "(automod)" 475 - } else { 476 - comment = "automod: " + comment 477 - } 478 - e.RecordReports = append(e.RecordReports, ModReport{ReasonType: reason, Comment: comment}) 479 - } 480 - 481 - // Persists some record-level state: labels, takedowns, reports. 482 - // 483 - // NOTE: this method currently does *not* persist record-level flags to any storage, and does not de-dupe most actions, on the assumption that the record is new (from firehose) and has no existing mod state. 484 - func (e *RecordEvent) PersistRecordActions(ctx context.Context) error { 485 - 486 - // NOTE: record-level actions are *not* currently de-duplicated (aka, the same record could be labeled multiple times, or re-reported, etc) 487 - newLabels := util.DedupeStrings(e.RecordLabels) 488 - newFlags := util.DedupeStrings(e.RecordFlags) 489 - newReports := circuitBreakReports(&e.RepoEvent, e.RecordReports) 490 - newTakedown := circuitBreakTakedown(&e.RepoEvent, e.RecordTakedown) 491 - atURI := fmt.Sprintf("at://%s/%s/%s", e.Account.Identity.DID, e.Collection, e.RecordKey) 492 - 493 - if newTakedown || len(newLabels) > 0 || len(newFlags) > 0 || len(newReports) > 0 { 494 - if e.Engine.SlackWebhookURL != "" { 495 - msg := slackBody("⚠️ Automod Record Action ⚠️\n", e.Account, newLabels, newFlags, newReports, newTakedown) 496 - msg += fmt.Sprintf("`%s`\n", atURI) 497 - if err := e.Engine.SendSlackMsg(ctx, msg); err != nil { 498 - e.Logger.Error("sending slack webhook", "err", err) 499 - } 500 - } 501 - } 502 - 503 - // flags don't require admin auth 504 - if len(newFlags) > 0 { 505 - e.Engine.Flags.Add(ctx, atURI, newFlags) 506 - } 507 - 508 - if e.Engine.AdminClient == nil { 509 - return nil 510 - } 511 - 512 - strongRef := comatproto.RepoStrongRef{ 513 - Cid: e.CID, 514 - Uri: atURI, 515 - } 516 - xrpcc := e.Engine.AdminClient 517 - if len(newLabels) > 0 { 518 - e.Logger.Info("labeling record", 
"newLabels", newLabels) 519 - comment := "automod" 520 - _, err := comatproto.AdminEmitModerationEvent(ctx, xrpcc, &comatproto.AdminEmitModerationEvent_Input{ 521 - CreatedBy: xrpcc.Auth.Did, 522 - Event: &comatproto.AdminEmitModerationEvent_Input_Event{ 523 - AdminDefs_ModEventLabel: &comatproto.AdminDefs_ModEventLabel{ 524 - CreateLabelVals: newLabels, 525 - NegateLabelVals: []string{}, 526 - Comment: &comment, 527 - }, 528 - }, 529 - Subject: &comatproto.AdminEmitModerationEvent_Input_Subject{ 530 - RepoStrongRef: &strongRef, 531 - }, 532 - }) 533 - if err != nil { 534 - return err 535 - } 536 - } 537 - 538 - for _, mr := range newReports { 539 - e.Logger.Info("reporting record", "reasonType", mr.ReasonType, "comment", mr.Comment) 540 - _, err := comatproto.ModerationCreateReport(ctx, xrpcc, &comatproto.ModerationCreateReport_Input{ 541 - ReasonType: &mr.ReasonType, 542 - Reason: &mr.Comment, 543 - Subject: &comatproto.ModerationCreateReport_Input_Subject{ 544 - RepoStrongRef: &strongRef, 545 - }, 546 - }) 547 - if err != nil { 548 - return err 549 - } 550 - } 551 - if newTakedown { 552 - e.Logger.Warn("record-takedown") 553 - comment := "automod" 554 - _, err := comatproto.AdminEmitModerationEvent(ctx, xrpcc, &comatproto.AdminEmitModerationEvent_Input{ 555 - CreatedBy: xrpcc.Auth.Did, 556 - Event: &comatproto.AdminEmitModerationEvent_Input_Event{ 557 - AdminDefs_ModEventTakedown: &comatproto.AdminDefs_ModEventTakedown{ 558 - Comment: &comment, 559 - }, 560 - }, 561 - Subject: &comatproto.AdminEmitModerationEvent_Input_Subject{ 562 - RepoStrongRef: &strongRef, 563 - }, 564 - }) 565 - if err != nil { 566 - return err 567 - } 568 - } 569 - return nil 570 - } 571 - 572 - func (e *RecordEvent) PersistActions(ctx context.Context) error { 573 - if err := e.PersistAccountActions(ctx); err != nil { 574 - return err 575 - } 576 - return e.PersistRecordActions(ctx) 577 - } 578 - 579 - func (e *RecordEvent) CanonicalLogLine() { 580 - e.Logger.Info("canonical-event-line", 
581 - "accountLabels", e.AccountLabels, 582 - "accountFlags", e.AccountFlags, 583 - "accountTakedown", e.AccountTakedown, 584 - "accountReports", len(e.AccountReports), 585 - "recordLabels", e.RecordLabels, 586 - "recordFlags", e.RecordFlags, 587 - "recordTakedown", e.RecordTakedown, 588 - "recordReports", len(e.RecordReports), 589 - ) 590 - } 591 - 592 - // Extends RepoEvent. Represents the deletion of a single record in the given repository. 593 - type RecordDeleteEvent struct { 594 - RepoEvent 595 - 596 - Collection string 597 - RecordKey string 598 - } 599 - 600 - type IdentityRuleFunc = func(evt *IdentityEvent) error 601 - type RecordRuleFunc = func(evt *RecordEvent) error 602 - type PostRuleFunc = func(evt *RecordEvent, post *appbsky.FeedPost) error 603 - type ProfileRuleFunc = func(evt *RecordEvent, profile *appbsky.ActorProfile) error 604 - type RecordDeleteRuleFunc = func(evt *RecordDeleteEvent) error
-113
automod/fetch.go
··· 1 - package automod 2 - 3 - import ( 4 - "context" 5 - "fmt" 6 - 7 - comatproto "github.com/bluesky-social/indigo/api/atproto" 8 - "github.com/bluesky-social/indigo/atproto/identity" 9 - "github.com/bluesky-social/indigo/atproto/syntax" 10 - "github.com/bluesky-social/indigo/xrpc" 11 - ) 12 - 13 - func (e *Engine) FetchAndProcessRecord(ctx context.Context, aturi syntax.ATURI) error { 14 - // resolve URI, identity, and record 15 - if aturi.RecordKey() == "" { 16 - return fmt.Errorf("need a full, not partial, AT-URI: %s", aturi) 17 - } 18 - ident, err := e.Directory.Lookup(ctx, aturi.Authority()) 19 - if err != nil { 20 - return fmt.Errorf("resolving AT-URI authority: %v", err) 21 - } 22 - pdsURL := ident.PDSEndpoint() 23 - if pdsURL == "" { 24 - return fmt.Errorf("could not resolve PDS endpoint for AT-URI account: %s", ident.DID.String()) 25 - } 26 - pdsClient := xrpc.Client{Host: ident.PDSEndpoint()} 27 - 28 - e.Logger.Info("fetching record", "did", ident.DID.String(), "collection", aturi.Collection().String(), "rkey", aturi.RecordKey().String()) 29 - out, err := comatproto.RepoGetRecord(ctx, &pdsClient, "", aturi.Collection().String(), ident.DID.String(), aturi.RecordKey().String()) 30 - if err != nil { 31 - return fmt.Errorf("fetching record from Relay (%s): %v", aturi, err) 32 - } 33 - if out.Cid == nil { 34 - return fmt.Errorf("expected a CID in getRecord response") 35 - } 36 - return e.ProcessRecord(ctx, ident.DID, aturi.Path(), *out.Cid, out.Value.Val) 37 - } 38 - 39 - func (e *Engine) FetchRecent(ctx context.Context, atid syntax.AtIdentifier, limit int) (*identity.Identity, []*comatproto.RepoListRecords_Record, error) { 40 - ident, err := e.Directory.Lookup(ctx, atid) 41 - if err != nil { 42 - return nil, nil, fmt.Errorf("failed to resolve AT identifier: %v", err) 43 - } 44 - pdsURL := ident.PDSEndpoint() 45 - if pdsURL == "" { 46 - return nil, nil, fmt.Errorf("could not resolve PDS endpoint for account: %s", ident.DID.String()) 47 - } 48 - pdsClient := 
xrpc.Client{Host: ident.PDSEndpoint()} 49 - 50 - resp, err := comatproto.RepoListRecords(ctx, &pdsClient, "app.bsky.feed.post", "", int64(limit), ident.DID.String(), false, "", "") 51 - if err != nil { 52 - return nil, nil, fmt.Errorf("failed to fetch record list: %v", err) 53 - } 54 - e.Logger.Info("got recent posts", "did", ident.DID.String(), "pds", pdsURL, "count", len(resp.Records)) 55 - return ident, resp.Records, nil 56 - } 57 - 58 - func (e *Engine) FetchAndProcessRecent(ctx context.Context, atid syntax.AtIdentifier, limit int) error { 59 - 60 - ident, records, err := e.FetchRecent(ctx, atid, limit) 61 - if err != nil { 62 - return err 63 - } 64 - // records are most-recent first; we want recent but oldest-first, so iterate backwards 65 - for i := range records { 66 - rec := records[len(records)-i-1] 67 - aturi, err := syntax.ParseATURI(rec.Uri) 68 - if err != nil { 69 - return fmt.Errorf("parsing PDS record response: %v", err) 70 - } 71 - err = e.ProcessRecord(ctx, ident.DID, aturi.Path(), rec.Cid, rec.Value.Val) 72 - if err != nil { 73 - return err 74 - } 75 - } 76 - return nil 77 - } 78 - 79 - type AccountCapture struct { 80 - CapturedAt syntax.Datetime `json:"capturedAt"` 81 - AccountMeta AccountMeta `json:"accountMeta"` 82 - PostRecords []comatproto.RepoListRecords_Record `json:"postRecords"` 83 - } 84 - 85 - func (e *Engine) CaptureRecent(ctx context.Context, atid syntax.AtIdentifier, limit int) (*AccountCapture, error) { 86 - ident, records, err := e.FetchRecent(ctx, atid, limit) 87 - if err != nil { 88 - return nil, err 89 - } 90 - pr := []comatproto.RepoListRecords_Record{} 91 - for _, r := range records { 92 - if r != nil { 93 - pr = append(pr, *r) 94 - } 95 - } 96 - 97 - // clear any pre-parsed key, which would fail to marshal as JSON 98 - ident.ParsedPublicKey = nil 99 - am, err := e.GetAccountMeta(ctx, ident) 100 - if err != nil { 101 - return nil, err 102 - } 103 - 104 - // auto-clear sensitive PII (eg, account email) 105 - am.Private = nil 
106 - 107 - ac := AccountCapture{ 108 - CapturedAt: syntax.DatetimeNow(), 109 - AccountMeta: *am, 110 - PostRecords: pr, 111 - } 112 - return &ac, nil 113 - }
+1 -3
automod/flagstore/flagstore_mem.go
···
 
 import (
     "context"
-
-    "github.com/bluesky-social/indigo/automod/util"
 )
 
 type MemFlagStore struct {
···
     for _, f := range flags {
         v = append(v, f)
     }
-    v = util.DedupeStrings(v)
+    v = dedupeStrings(v)
     s.Data[key] = v
     return nil
 }
+13
automod/flagstore/util.go
···
+package flagstore
+
+func dedupeStrings(in []string) []string {
+    var out []string
+    seen := make(map[string]bool)
+    for _, v := range in {
+        if !seen[v] {
+            out = append(out, v)
+            seen[v] = true
+        }
+    }
+    return out
+}
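As a quick illustration of what the copied helper does, the sketch below reproduces `dedupeStrings` from the hunk above in a standalone program. The sample flag values are made up for the demo; the behaviour shown (first occurrence wins, input order preserved, `nil` in gives `nil` out) follows directly from the function body.

```go
package main

import "fmt"

// dedupeStrings mirrors the helper added in automod/flagstore/util.go:
// it keeps the first occurrence of each value and preserves input order.
func dedupeStrings(in []string) []string {
	var out []string
	seen := make(map[string]bool)
	for _, v := range in {
		if !seen[v] {
			out = append(out, v)
			seen[v] = true
		}
	}
	return out
}

func main() {
	// Sample flags are illustrative only.
	flags := []string{"bad-word", "many-hashtags", "bad-word"}
	fmt.Println(dedupeStrings(flags)) // prints [bad-word many-hashtags]

	// out is declared with var, so an empty or nil input yields a nil slice.
	fmt.Println(dedupeStrings(nil) == nil) // prints true
}
```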
+36
automod/pkg.go
···
+package automod
+
+import (
+    "github.com/bluesky-social/indigo/automod/countstore"
+    "github.com/bluesky-social/indigo/automod/engine"
+)
+
+type Engine = engine.Engine
+type AccountMeta = engine.AccountMeta
+type RuleSet = engine.RuleSet
+
+type AccountContext = engine.AccountContext
+type RecordContext = engine.RecordContext
+type RecordOp = engine.RecordOp
+
+type IdentityRuleFunc = engine.IdentityRuleFunc
+type RecordRuleFunc = engine.RecordRuleFunc
+type PostRuleFunc = engine.PostRuleFunc
+type ProfileRuleFunc = engine.ProfileRuleFunc
+
+var (
+    ReportReasonSpam       = engine.ReportReasonSpam
+    ReportReasonViolation  = engine.ReportReasonViolation
+    ReportReasonMisleading = engine.ReportReasonMisleading
+    ReportReasonSexual     = engine.ReportReasonSexual
+    ReportReasonRude       = engine.ReportReasonRude
+    ReportReasonOther      = engine.ReportReasonOther
+
+    PeriodTotal = countstore.PeriodTotal
+    PeriodDay   = countstore.PeriodDay
+    PeriodHour  = countstore.PeriodHour
+
+    CreateOp = engine.CreateOp
+    UpdateOp = engine.UpdateOp
+    DeleteOp = engine.DeleteOp
+)
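The re-exports above rely on Go type aliases (`type X = pkg.X`), which make both names refer to the same type, so values pass between the two packages without conversion. The single-file sketch below shows the difference between an alias and a new defined type. `engineImpl`, `Wrapped`, and `describe` are hypothetical names used only for the demo, not part of the repo.

```go
package main

import "fmt"

// engineImpl stands in for a type defined in a sub-package (hypothetical).
type engineImpl struct{ Name string }

// Engine is an alias, in the same style as `type Engine = engine.Engine`.
type Engine = engineImpl

// Wrapped is a distinct defined type, shown only for contrast.
type Wrapped engineImpl

func describe(e *engineImpl) string { return "engine:" + e.Name }

func main() {
	e := Engine{Name: "alias"}
	// No conversion needed: Engine and engineImpl are the same type.
	fmt.Println(describe(&e))

	w := Wrapped{Name: "defined"}
	// A defined type needs an explicit conversion before it can be passed.
	ei := engineImpl(w)
	fmt.Println(describe(&ei))
}
```

This is why rule code can keep importing only the top-level `automod` package while the implementation lives in `automod/engine`.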
+3 -2
automod/report.go automod/engine/report.go
···
-package automod
+package engine
 
+// Simplified variant of input parameters for com.atproto.moderation.createReport, for internal tracking
 type ModReport struct {
     ReasonType string
     Comment    string
···
     ReportReasonOther = "com.atproto.moderation.defs#reasonOther"
 )
 
-func reasonShortName(reason string) string {
+func ReasonShortName(reason string) string {
     switch reason {
     case ReportReasonSpam:
         return "spam"
+1 -1
automod/rules/all.go
···
     RecordRules: []automod.RecordRuleFunc{
         InteractionChurnRule,
     },
-    RecordDeleteRules: []automod.RecordDeleteRuleFunc{
+    RecordDeleteRules: []automod.RecordRuleFunc{
         DeleteInteractionRule,
     },
     IdentityRules: []automod.IdentityRuleFunc{
+4 -4
automod/rules/gtube.go
···
 // https://en.wikipedia.org/wiki/GTUBE
 var gtubeString = "XJS*C4JDBQADN1.NSBN3*2IDNEN*GTUBE-STANDARD-ANTI-UBE-TEST-EMAIL*C.34X"
 
-func GtubePostRule(evt *automod.RecordEvent, post *appbsky.FeedPost) error {
+func GtubePostRule(c *automod.RecordContext, post *appbsky.FeedPost) error {
     if strings.Contains(post.Text, gtubeString) {
-        evt.AddRecordLabel("spam")
+        c.AddRecordLabel("spam")
     }
     return nil
 }
 
-func GtubeProfileRule(evt *automod.RecordEvent, profile *appbsky.ActorProfile) error {
+func GtubeProfileRule(c *automod.RecordContext, profile *appbsky.ActorProfile) error {
     if profile.Description != nil && strings.Contains(*profile.Description, gtubeString) {
-        evt.AddRecordLabel("spam")
+        c.AddRecordLabel("spam")
     }
     return nil
 }
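The rule shape after this refactor is "function of a context plus a parsed record, which records effects through methods on the context". The following is a minimal standalone sketch of that shape. `recordContext` and `feedPost` are hypothetical stand-ins for `automod.RecordContext` and `appbsky.FeedPost`, and the `labels` slice is only an assumption about how effects might be accumulated. The real engine types carry much more state.

```go
package main

import (
	"fmt"
	"strings"
)

// recordContext is a hypothetical stand-in for automod.RecordContext.
// It only accumulates labels so the rule's effect can be inspected.
type recordContext struct {
	labels []string
}

// AddRecordLabel mimics the thin wrapper method that rules call.
func (c *recordContext) AddRecordLabel(val string) {
	c.labels = append(c.labels, val)
}

// feedPost is a hypothetical stand-in for appbsky.FeedPost.
type feedPost struct {
	Text string
}

const gtubeString = "XJS*C4JDBQADN1.NSBN3*2IDNEN*GTUBE-STANDARD-ANTI-UBE-TEST-EMAIL*C.34X"

// gtubePostRule follows the same logic as GtubePostRule in the diff.
func gtubePostRule(c *recordContext, post *feedPost) error {
	if strings.Contains(post.Text, gtubeString) {
		c.AddRecordLabel("spam")
	}
	return nil
}

func main() {
	c := &recordContext{}
	post := &feedPost{Text: "test message " + gtubeString}
	if err := gtubePostRule(c, post); err != nil {
		fmt.Println("rule error:", err)
		return
	}
	fmt.Println(c.labels) // prints [spam]
}
```

Because the rule only touches the context through methods, the engine stays free to decide later how (or whether) those effects are persisted.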
+6 -6
automod/rules/hashtags.go
··· 6 6 ) 7 7 8 8 // looks for specific hashtags from known lists 9 - func BadHashtagsPostRule(evt *automod.RecordEvent, post *appbsky.FeedPost) error { 9 + func BadHashtagsPostRule(c *automod.RecordContext, post *appbsky.FeedPost) error { 10 10 for _, tag := range ExtractHashtags(post) { 11 11 tag = NormalizeHashtag(tag) 12 - if evt.InSet("bad-hashtags", tag) { 13 - evt.AddRecordFlag("bad-hashtag") 12 + if c.InSet("bad-hashtags", tag) { 13 + c.AddRecordFlag("bad-hashtag") 14 14 break 15 15 } 16 16 } ··· 18 18 } 19 19 20 20 // if a post is "almost all" hashtags, it might be a form of search spam 21 - func TooManyHashtagsPostRule(evt *automod.RecordEvent, post *appbsky.FeedPost) error { 21 + func TooManyHashtagsPostRule(c *automod.RecordContext, post *appbsky.FeedPost) error { 22 22 tags := ExtractHashtags(post) 23 23 tagChars := 0 24 24 for _, tag := range tags { ··· 27 27 tagTextRatio := float64(tagChars) / float64(len(post.Text)) 28 28 // if there is an image, allow some more tags 29 29 if len(tags) > 4 && tagTextRatio > 0.6 && post.Embed.EmbedImages == nil { 30 - evt.AddRecordFlag("many-hashtags") 30 + c.AddRecordFlag("many-hashtags") 31 31 } else if len(tags) > 7 && tagTextRatio > 0.8 { 32 - evt.AddRecordFlag("many-hashtags") 32 + c.AddRecordFlag("many-hashtags") 33 33 } 34 34 return nil 35 35 }
+22 -9
automod/rules/hashtags_test.go
··· 1 1 package rules 2 2 3 3 import ( 4 + "context" 4 5 "testing" 5 6 6 7 appbsky "github.com/bluesky-social/indigo/api/bsky" 7 8 "github.com/bluesky-social/indigo/atproto/identity" 8 9 "github.com/bluesky-social/indigo/atproto/syntax" 9 10 "github.com/bluesky-social/indigo/automod" 11 + "github.com/bluesky-social/indigo/automod/engine" 10 12 11 13 "github.com/stretchr/testify/assert" 12 14 ) 13 15 14 16 func TestBadHashtagPostRule(t *testing.T) { 15 17 assert := assert.New(t) 18 + ctx := context.Background() 16 19 17 - engine := automod.EngineTestFixture() 20 + eng := engine.EngineTestFixture() 18 21 am1 := automod.AccountMeta{ 19 22 Identity: &identity.Identity{ 20 23 DID: syntax.DID("did:plc:abc111"), 21 24 Handle: syntax.Handle("handle.example.com"), 22 25 }, 23 26 } 24 - path := "app.bsky.feed.post/abc123" 25 - cid1 := "cid123" 27 + cid1 := syntax.CID("cid123") 26 28 p1 := appbsky.FeedPost{ 27 29 Text: "some post blah", 28 30 } 29 - evt1 := engine.NewRecordEvent(am1, path, cid1, &p1) 30 - assert.NoError(BadHashtagsPostRule(&evt1, &p1)) 31 - assert.Empty(evt1.RecordFlags) 31 + op := engine.RecordOp{ 32 + Action: engine.CreateOp, 33 + DID: am1.Identity.DID, 34 + Collection: syntax.NSID("app.bsky.feed.post"), 35 + RecordKey: syntax.RecordKey("abc123"), 36 + CID: &cid1, 37 + Value: p1, 38 + } 39 + c1 := engine.NewRecordContext(ctx, &eng, am1, op) 40 + assert.NoError(BadHashtagsPostRule(&c1, &p1)) 41 + eff1 := engine.ExtractEffects(&c1.BaseContext) 42 + assert.Empty(eff1.RecordFlags) 32 43 33 44 p2 := appbsky.FeedPost{ 34 45 Text: "some post blah", 35 46 Tags: []string{"one", "slur"}, 36 47 } 37 - evt2 := engine.NewRecordEvent(am1, path, cid1, &p2) 38 - assert.NoError(BadHashtagsPostRule(&evt2, &p2)) 39 - assert.NotEmpty(evt2.RecordFlags) 48 + op.Value = p2 49 + c2 := engine.NewRecordContext(ctx, &eng, am1, op) 50 + assert.NoError(BadHashtagsPostRule(&c2, &p2)) 51 + eff2 := engine.ExtractEffects(&c2.BaseContext) 52 + assert.NotEmpty(eff2.RecordFlags) 40 53 }
+2 -2
automod/rules/helpers.go
···
 }
 
 // checks if the post event is a reply post for which the author is replying to themselves, or author is the root author (OP)
-func IsSelfThread(evt *automod.RecordEvent, post *appbsky.FeedPost) bool {
+func IsSelfThread(c *automod.RecordContext, post *appbsky.FeedPost) bool {
     if post.Reply == nil {
         return false
     }
-    did := evt.Account.Identity.DID.String()
+    did := c.Account.Identity.DID.String()
     parentURI, err := syntax.ParseATURI(post.Reply.Parent.Uri)
     if err != nil {
         return false
+14 -14
automod/rules/identity.go
··· 10 10 ) 11 11 12 12 // triggers on first identity event for an account (DID) 13 - func NewAccountRule(evt *automod.IdentityEvent) error { 13 + func NewAccountRule(c *automod.AccountContext) error { 14 14 // need access to IndexedAt for this rule 15 - if evt.Account.Private == nil || evt.Account.Identity == nil { 15 + if c.Account.Private == nil || c.Account.Identity == nil { 16 16 return nil 17 17 } 18 18 19 - did := evt.Account.Identity.DID.String() 20 - age := time.Since(evt.Account.Private.IndexedAt) 19 + did := c.Account.Identity.DID.String() 20 + age := time.Since(c.Account.Private.IndexedAt) 21 21 if age > 2*time.Hour { 22 22 return nil 23 23 } 24 - exists := evt.GetCount("acct/exists", did, countstore.PeriodTotal) 24 + exists := c.GetCount("acct/exists", did, countstore.PeriodTotal) 25 25 if exists == 0 { 26 - evt.Logger.Info("new account") 27 - evt.Increment("acct/exists", did) 26 + c.Logger.Info("new account") 27 + c.Increment("acct/exists", did) 28 28 29 - pdsURL, err := url.Parse(evt.Account.Identity.PDSEndpoint()) 29 + pdsURL, err := url.Parse(c.Account.Identity.PDSEndpoint()) 30 30 if err != nil { 31 - evt.Logger.Warn("invalid PDS URL", "err", err, "endpoint", evt.Account.Identity.PDSEndpoint()) 31 + c.Logger.Warn("invalid PDS URL", "err", err, "endpoint", c.Account.Identity.PDSEndpoint()) 32 32 return nil 33 33 } 34 34 pdsHost := strings.ToLower(pdsURL.Host) 35 - existingAccounts := evt.GetCount("host/newacct", pdsHost, countstore.PeriodTotal) 36 - evt.Increment("host/newacct", pdsHost) 35 + existingAccounts := c.GetCount("host/newacct", pdsHost, countstore.PeriodTotal) 36 + c.Increment("host/newacct", pdsHost) 37 37 38 38 // new PDS host 39 39 if existingAccounts == 0 { 40 - evt.Logger.Info("new PDS instance", "host", pdsHost) 41 - evt.Increment("host", "new") 42 - evt.AddAccountFlag("host-first-account") 40 + c.Logger.Info("new PDS instance", "host", pdsHost) 41 + c.Increment("host", "new") 42 + c.AddAccountFlag("host-first-account") 43 43 } 44 
44 } 45 45 return nil
+20 -20
automod/rules/interaction.go
··· 10 10 var interactionDailyThreshold = 800 11 11 12 12 // looks for accounts which do frequent interaction churn, such as follow-unfollow. 13 - func InteractionChurnRule(evt *automod.RecordEvent) error { 14 - did := evt.Account.Identity.DID.String() 15 - switch evt.Collection { 13 + func InteractionChurnRule(c *automod.RecordContext) error { 14 + did := c.Account.Identity.DID.String() 15 + switch c.RecordOp.Collection { 16 16 case "app.bsky.feed.like": 17 - evt.Increment("like", did) 18 - created := evt.GetCount("like", did, countstore.PeriodDay) 19 - deleted := evt.GetCount("unlike", did, countstore.PeriodDay) 17 + c.Increment("like", did) 18 + created := c.GetCount("like", did, countstore.PeriodDay) 19 + deleted := c.GetCount("unlike", did, countstore.PeriodDay) 20 20 ratio := float64(deleted) / float64(created) 21 21 if created > interactionDailyThreshold && deleted > interactionDailyThreshold && ratio > 0.5 { 22 - evt.Logger.Info("high-like-churn", "created-today", created, "deleted-today", deleted) 23 - evt.AddAccountFlag("high-like-churn") 24 - evt.ReportAccount(automod.ReportReasonSpam, fmt.Sprintf("interaction churn: %d likes, %d unlikes today (so far)", created, deleted)) 22 + c.Logger.Info("high-like-churn", "created-today", created, "deleted-today", deleted) 23 + c.AddAccountFlag("high-like-churn") 24 + c.ReportAccount(automod.ReportReasonSpam, fmt.Sprintf("interaction churn: %d likes, %d unlikes today (so far)", created, deleted)) 25 25 } 26 26 case "app.bsky.graph.follow": 27 - evt.Increment("follow", did) 28 - created := evt.GetCount("follow", did, countstore.PeriodDay) 29 - deleted := evt.GetCount("unfollow", did, countstore.PeriodDay) 27 + c.Increment("follow", did) 28 + created := c.GetCount("follow", did, countstore.PeriodDay) 29 + deleted := c.GetCount("unfollow", did, countstore.PeriodDay) 30 30 ratio := float64(deleted) / float64(created) 31 31 if created > interactionDailyThreshold && deleted > interactionDailyThreshold && ratio > 0.5 { 32 
- evt.Logger.Info("high-follow-churn", "created-today", created, "deleted-today", deleted) 33 - evt.AddAccountFlag("high-follow-churn") 34 - evt.ReportAccount(automod.ReportReasonSpam, fmt.Sprintf("interaction churn: %d follows, %d unfollows today (so far)", created, deleted)) 32 + c.Logger.Info("high-follow-churn", "created-today", created, "deleted-today", deleted) 33 + c.AddAccountFlag("high-follow-churn") 34 + c.ReportAccount(automod.ReportReasonSpam, fmt.Sprintf("interaction churn: %d follows, %d unfollows today (so far)", created, deleted)) 35 35 } 36 36 } 37 37 return nil 38 38 } 39 39 40 - func DeleteInteractionRule(evt *automod.RecordDeleteEvent) error { 41 - did := evt.Account.Identity.DID.String() 42 - switch evt.Collection { 40 + func DeleteInteractionRule(c *automod.RecordContext) error { 41 + did := c.Account.Identity.DID.String() 42 + switch c.RecordOp.Collection { 43 43 case "app.bsky.feed.like": 44 - evt.Increment("unlike", did) 44 + c.Increment("unlike", did) 45 45 case "app.bsky.graph.follow": 46 - evt.Increment("unfollow", did) 46 + c.Increment("unfollow", did) 47 47 } 48 48 return nil 49 49 }
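The churn check in `InteractionChurnRule` boils down to three conditions on two counters. The sketch below isolates that predicate. `isHighChurn` is a hypothetical helper name, and the threshold is passed in rather than read from `interactionDailyThreshold`. It checks the thresholds before dividing, which gives the same result as the rule because both counts must exceed the threshold for the ratio to matter.

```go
package main

import "fmt"

// isHighChurn reports whether an account created and deleted more than
// `threshold` interactions in the period, with deletions exceeding half of
// creations. Hypothetical helper; the logic follows the rule in the diff.
func isHighChurn(created, deleted, threshold int) bool {
	if created <= threshold || deleted <= threshold {
		return false
	}
	ratio := float64(deleted) / float64(created)
	return ratio > 0.5
}

func main() {
	fmt.Println(isHighChurn(1000, 900, 800)) // prints true
	fmt.Println(isHighChurn(1000, 100, 800)) // prints false
	fmt.Println(isHighChurn(2000, 900, 800)) // prints false (ratio 0.45)
}
```

In the rule itself the ratio is computed before the threshold checks; since the operands are floats, a zero `created` count would yield Inf or NaN rather than a panic, and the threshold conditions would still reject it.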
+12 -12
automod/rules/keyword.go
··· 7 7 "github.com/bluesky-social/indigo/automod" 8 8 ) 9 9 10 - func KeywordPostRule(evt *automod.RecordEvent, post *appbsky.FeedPost) error { 10 + func KeywordPostRule(c *automod.RecordContext, post *appbsky.FeedPost) error { 11 11 for _, tok := range ExtractTextTokensPost(post) { 12 - if evt.InSet("bad-words", tok) { 13 - evt.AddRecordFlag("bad-word") 14 - evt.ReportRecord(automod.ReportReasonRude, fmt.Sprintf("bad-word: %s", tok)) 12 + if c.InSet("bad-words", tok) { 13 + c.AddRecordFlag("bad-word") 14 + c.ReportRecord(automod.ReportReasonRude, fmt.Sprintf("bad-word: %s", tok)) 15 15 break 16 16 } 17 17 } 18 18 return nil 19 19 } 20 20 21 - func KeywordProfileRule(evt *automod.RecordEvent, profile *appbsky.ActorProfile) error { 21 + func KeywordProfileRule(c *automod.RecordContext, profile *appbsky.ActorProfile) error { 22 22 for _, tok := range ExtractTextTokensProfile(profile) { 23 - if evt.InSet("bad-words", tok) { 24 - evt.AddRecordFlag("bad-word") 25 - evt.ReportRecord(automod.ReportReasonRude, fmt.Sprintf("bad-word: %s", tok)) 23 + if c.InSet("bad-words", tok) { 24 + c.AddRecordFlag("bad-word") 25 + c.ReportRecord(automod.ReportReasonRude, fmt.Sprintf("bad-word: %s", tok)) 26 26 break 27 27 } 28 28 } 29 29 return nil 30 30 } 31 31 32 - func ReplySingleKeywordPostRule(evt *automod.RecordEvent, post *appbsky.FeedPost) error { 33 - if post.Reply != nil && !IsSelfThread(evt, post) { 32 + func ReplySingleKeywordPostRule(c *automod.RecordContext, post *appbsky.FeedPost) error { 33 + if post.Reply != nil && !IsSelfThread(c, post) { 34 34 tokens := ExtractTextTokensPost(post) 35 - if len(tokens) == 1 && evt.InSet("bad-words", tokens[0]) { 36 - evt.AddRecordFlag("reply-single-bad-word") 35 + if len(tokens) == 1 && c.InSet("bad-words", tokens[0]) { 36 + c.AddRecordFlag("reply-single-bad-word") 37 37 } 38 38 } 39 39 return nil
+5 -5
automod/rules/mentions.go
··· 11 11 var mentionHourlyThreshold = 40 12 12 13 13 // DistinctMentionsRule looks for accounts which mention an unusually large number of distinct accounts per period. 14 - func DistinctMentionsRule(evt *automod.RecordEvent, post *appbsky.FeedPost) error { 15 - did := evt.Account.Identity.DID.String() 14 + func DistinctMentionsRule(c *automod.RecordContext, post *appbsky.FeedPost) error { 15 + did := c.Account.Identity.DID.String() 16 16 17 17 // Increment counters for all new mentions in this post. 18 18 var newMentions bool ··· 22 22 if mention == nil { 23 23 continue 24 24 } 25 - evt.IncrementDistinct("mentions", did, mention.Did) 25 + c.IncrementDistinct("mentions", did, mention.Did) 26 26 newMentions = true 27 27 } 28 28 } ··· 31 31 if !newMentions { 32 32 return nil 33 33 } 34 - if mentionHourlyThreshold <= evt.GetCountDistinct("mentions", did, countstore.PeriodHour) { 35 - evt.AddAccountFlag("high-distinct-mentions") 34 + if mentionHourlyThreshold <= c.GetCountDistinct("mentions", did, countstore.PeriodHour) { 35 + c.AddAccountFlag("high-distinct-mentions") 36 36 } 37 37 38 38 return nil
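`DistinctMentionsRule` depends on a "count distinct values per bucket" primitive. As a rough mental model only, the sketch below implements that with nested maps in memory. `distinctCounter` and its methods are hypothetical, time periods are ignored, and nothing here is meant to describe how the real count store is implemented.

```go
package main

import "fmt"

// distinctCounter is a hypothetical in-memory model of a distinct-value
// counter: for each (name, bucket) pair it remembers the set of values seen.
type distinctCounter struct {
	seen map[string]map[string]struct{}
}

func newDistinctCounter() *distinctCounter {
	return &distinctCounter{seen: make(map[string]map[string]struct{})}
}

// IncrementDistinct records val for the given counter name and bucket.
func (d *distinctCounter) IncrementDistinct(name, bucket, val string) {
	key := name + "/" + bucket
	if d.seen[key] == nil {
		d.seen[key] = make(map[string]struct{})
	}
	d.seen[key][val] = struct{}{}
}

// GetCountDistinct returns how many distinct values were recorded.
func (d *distinctCounter) GetCountDistinct(name, bucket string) int {
	return len(d.seen[name+"/"+bucket])
}

func main() {
	c := newDistinctCounter()
	author := "did:plc:abc111"
	// Mentioning the same account twice only counts once.
	for _, mentioned := range []string{"did:plc:x", "did:plc:y", "did:plc:x"} {
		c.IncrementDistinct("mentions", author, mentioned)
	}
	fmt.Println(c.GetCountDistinct("mentions", author)) // prints 2
}
```

With this model, the rule's final check is simply `threshold <= GetCountDistinct(...)` for the author's bucket.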
+15 -15
automod/rules/misleading.go
··· 78 78 return false 79 79 } 80 80 81 - func MisleadingURLPostRule(evt *automod.RecordEvent, post *appbsky.FeedPost) error { 81 + func MisleadingURLPostRule(c *automod.RecordContext, post *appbsky.FeedPost) error { 82 82 // TODO: make this an InSet() config? 83 - if evt.Account.Identity.Handle == "nowbreezing.ntw.app" { 83 + if c.Account.Identity.Handle == "nowbreezing.ntw.app" { 84 84 return nil 85 85 } 86 86 facets, err := ExtractFacets(post) 87 87 if err != nil { 88 - evt.Logger.Warn("invalid facets", "err", err) 88 + c.Logger.Warn("invalid facets", "err", err) 89 89 // TODO: or some other "this record is corrupt" indicator? 90 - //evt.AddRecordFlag("broken-post") 90 + //c.AddRecordFlag("broken-post") 91 91 return nil 92 92 } 93 93 for _, facet := range facets { 94 94 if facet.URL != nil { 95 - if isMisleadingURLFacet(facet, evt.Logger) { 96 - evt.AddRecordFlag("misleading-link") 95 + if isMisleadingURLFacet(facet, c.Logger) { 96 + c.AddRecordFlag("misleading-link") 97 97 } 98 98 } 99 99 } 100 100 return nil 101 101 } 102 102 103 - func MisleadingMentionPostRule(evt *automod.RecordEvent, post *appbsky.FeedPost) error { 103 + func MisleadingMentionPostRule(c *automod.RecordContext, post *appbsky.FeedPost) error { 104 104 // TODO: do we really need to route context around? probably 105 105 ctx := context.TODO() 106 106 facets, err := ExtractFacets(post) 107 107 if err != nil { 108 - evt.Logger.Warn("invalid facets", "err", err) 108 + c.Logger.Warn("invalid facets", "err", err) 109 109 // TODO: or some other "this record is corrupt" indicator? 
110 - //evt.AddRecordFlag("broken-post") 110 + //c.AddRecordFlag("broken-post") 111 111 return nil 112 112 } 113 113 for _, facet := range facets { ··· 118 118 } 119 119 handle, err := syntax.ParseHandle(strings.ToLower(txt)) 120 120 if err != nil { 121 - evt.Logger.Warn("mention was not a valid handle", "text", txt) 121 + c.Logger.Warn("mention was not a valid handle", "text", txt) 122 122 continue 123 123 } 124 124 125 - mentioned, err := evt.Engine.Directory.LookupHandle(ctx, handle) 125 + mentioned, err := c.Directory().LookupHandle(ctx, handle) 126 126 if err != nil { 127 - evt.Logger.Warn("could not resolve handle", "handle", handle) 128 - evt.AddRecordFlag("broken-mention") 127 + c.Logger.Warn("could not resolve handle", "handle", handle) 128 + c.AddRecordFlag("broken-mention") 129 129 break 130 130 } 131 131 132 132 // TODO: check if mentioned DID was recently updated? might be a caching issue 133 133 if mentioned.DID.String() != *facet.DID { 134 - evt.Logger.Warn("misleading mention", "text", txt, "did", facet.DID) 135 - evt.AddRecordFlag("misleading-mention") 134 + c.Logger.Warn("misleading mention", "text", txt, "did", facet.DID) 135 + c.AddRecordFlag("misleading-mention") 136 136 continue 137 137 } 138 138 }
+32 -12
automod/rules/misleading_test.go
··· 1 1 package rules 2 2 3 3 import ( 4 + "context" 4 5 "log/slog" 5 6 "testing" 6 7 ··· 8 9 "github.com/bluesky-social/indigo/atproto/identity" 9 10 "github.com/bluesky-social/indigo/atproto/syntax" 10 11 "github.com/bluesky-social/indigo/automod" 12 + "github.com/bluesky-social/indigo/automod/engine" 11 13 12 14 "github.com/stretchr/testify/assert" 13 15 ) 14 16 15 17 func TestMisleadingURLPostRule(t *testing.T) { 16 18 assert := assert.New(t) 19 + ctx := context.Background() 17 20 18 - engine := automod.EngineTestFixture() 21 + eng := engine.EngineTestFixture() 19 22 am1 := automod.AccountMeta{ 20 23 Identity: &identity.Identity{ 21 24 DID: syntax.DID("did:plc:abc111"), 22 25 Handle: syntax.Handle("handle.example.com"), 23 26 }, 24 27 } 25 - path := "app.bsky.feed.post/abc123" 26 - cid1 := "cid123" 28 + cid1 := syntax.CID("cid123") 27 29 p1 := appbsky.FeedPost{ 28 30 Text: "https://safe.com/ is very reputable", 29 31 Facets: []*appbsky.RichtextFacet{ ··· 42 44 }, 43 45 }, 44 46 } 45 - evt1 := engine.NewRecordEvent(am1, path, cid1, &p1) 46 - assert.NoError(MisleadingURLPostRule(&evt1, &p1)) 47 - assert.NotEmpty(evt1.RecordFlags) 47 + op := engine.RecordOp{ 48 + Action: engine.CreateOp, 49 + DID: am1.Identity.DID, 50 + Collection: syntax.NSID("app.bsky.feed.post"), 51 + RecordKey: syntax.RecordKey("abc123"), 52 + CID: &cid1, 53 + Value: p1, 54 + } 55 + c1 := engine.NewRecordContext(ctx, &eng, am1, op) 56 + assert.NoError(MisleadingURLPostRule(&c1, &p1)) 57 + eff1 := engine.ExtractEffects(&c1.BaseContext) 58 + assert.NotEmpty(eff1.RecordFlags) 48 59 } 49 60 50 61 func TestMisleadingMentionPostRule(t *testing.T) { 51 62 assert := assert.New(t) 63 + ctx := context.Background() 52 64 53 - engine := automod.EngineTestFixture() 65 + eng := engine.EngineTestFixture() 54 66 am1 := automod.AccountMeta{ 55 67 Identity: &identity.Identity{ 56 68 DID: syntax.DID("did:plc:abc111"), 57 69 Handle: syntax.Handle("handle.example.com"), 58 70 }, 59 71 } 60 - path := 
"app.bsky.feed.post/abc123" 61 - cid1 := "cid123" 72 + cid1 := syntax.CID("cid123") 62 73 p1 := appbsky.FeedPost{ 63 74 Text: "@handle.example.com is a friend", 64 75 Facets: []*appbsky.RichtextFacet{ ··· 77 88 }, 78 89 }, 79 90 } 80 - evt1 := engine.NewRecordEvent(am1, path, cid1, &p1) 81 - assert.NoError(MisleadingMentionPostRule(&evt1, &p1)) 82 - assert.NotEmpty(evt1.RecordFlags) 91 + op := engine.RecordOp{ 92 + Action: engine.CreateOp, 93 + DID: am1.Identity.DID, 94 + Collection: syntax.NSID("app.bsky.feed.post"), 95 + RecordKey: syntax.RecordKey("abc123"), 96 + CID: &cid1, 97 + Value: p1, 98 + } 99 + c1 := engine.NewRecordContext(ctx, &eng, am1, op) 100 + assert.NoError(MisleadingMentionPostRule(&c1, &p1)) 101 + eff1 := engine.ExtractEffects(&c1.BaseContext) 102 + assert.NotEmpty(eff1.RecordFlags) 83 103 } 84 104 85 105 func pstr(raw string) *string {
+4 -4
automod/rules/private.go
···
 )
 
 // dummy rule. this leaks PII (account email) in logs and should never be used in real life
-func AccountPrivateDemoPostRule(evt *automod.RecordEvent, post *appbsky.FeedPost) error {
-    if evt.Account.Private != nil {
-        if strings.HasSuffix(evt.Account.Private.Email, "@blueskyweb.xyz") {
-            evt.Logger.Info("hello dev!", "email", evt.Account.Private.Email)
+func AccountPrivateDemoPostRule(c *automod.RecordContext, post *appbsky.FeedPost) error {
+    if c.Account.Private != nil {
+        if strings.HasSuffix(c.Account.Private.Email, "@blueskyweb.xyz") {
+            c.Logger.Info("hello dev!", "email", c.Account.Private.Email)
         }
     }
     return nil
+3 -3
automod/rules/profile.go
···
 )
 
 // this is a dummy rule to demonstrate accessing account metadata (eg, profile) from within post handler
-func AccountDemoPostRule(evt *automod.RecordEvent, post *appbsky.FeedPost) error {
-    if evt.Account.Profile.Description != nil && len(post.Text) > 5 && *evt.Account.Profile.Description == post.Text {
-        evt.AddRecordFlag("own-profile-description")
+func AccountDemoPostRule(c *automod.RecordContext, post *appbsky.FeedPost) error {
+    if c.Account.Profile.Description != nil && len(post.Text) > 5 && *c.Account.Profile.Description == post.Text {
+        c.AddRecordFlag("own-profile-description")
     }
     return nil
 }
+11 -11
automod/rules/promo.go
··· 13 13 // looks for new accounts, with a commercial or donation link in profile, which directly reply to several accounts 14 14 // 15 15 // this rule depends on ReplyCountPostRule() to set counts 16 - func AggressivePromotionRule(evt *automod.RecordEvent, post *appbsky.FeedPost) error { 17 - if evt.Account.Private == nil || evt.Account.Identity == nil { 16 + func AggressivePromotionRule(c *automod.RecordContext, post *appbsky.FeedPost) error { 17 + if c.Account.Private == nil || c.Account.Identity == nil { 18 18 return nil 19 19 } 20 20 // TODO: helper for account age 21 - age := time.Since(evt.Account.Private.IndexedAt) 21 + age := time.Since(c.Account.Private.IndexedAt) 22 22 if age > 7*24*time.Hour { 23 23 return nil 24 24 } 25 - if post.Reply == nil || IsSelfThread(evt, post) { 25 + if post.Reply == nil || IsSelfThread(c, post) { 26 26 return nil 27 27 } 28 28 29 29 allURLs := ExtractTextURLs(post.Text) 30 - if evt.Account.Profile.Description != nil { 31 - profileURLs := ExtractTextURLs(*evt.Account.Profile.Description) 30 + if c.Account.Profile.Description != nil { 31 + profileURLs := ExtractTextURLs(*c.Account.Profile.Description) 32 32 allURLs = append(allURLs, profileURLs...) 
33 33 } 34 34 hasPromo := false ··· 38 38 } 39 39 u, err := url.Parse(s) 40 40 if err != nil { 41 - evt.Logger.Warn("failed to parse URL", "url", s) 41 + c.Logger.Warn("failed to parse URL", "url", s) 42 42 continue 43 43 } 44 44 host := strings.TrimPrefix(strings.ToLower(u.Host), "www.") 45 - if evt.InSet("promo-domain", host) { 45 + if c.InSet("promo-domain", host) { 46 46 hasPromo = true 47 47 break 48 48 } ··· 51 51 return nil 52 52 } 53 53 54 - did := evt.Account.Identity.DID.String() 55 - uniqueReplies := evt.GetCountDistinct("reply-to", did, countstore.PeriodDay) 54 + did := c.Account.Identity.DID.String() 55 + uniqueReplies := c.GetCountDistinct("reply-to", did, countstore.PeriodDay) 56 56 if uniqueReplies >= 5 { 57 - evt.AddAccountFlag("promo-multi-reply") 57 + c.AddAccountFlag("promo-multi-reply") 58 58 } 59 59 60 60 return nil
+16 -16
automod/rules/replies.go
···
 )

 // does not count "self-replies" (direct to self, or in own post thread)
-func ReplyCountPostRule(evt *automod.RecordEvent, post *appbsky.FeedPost) error {
-	if post.Reply == nil || IsSelfThread(evt, post) {
+func ReplyCountPostRule(c *automod.RecordContext, post *appbsky.FeedPost) error {
+	if post.Reply == nil || IsSelfThread(c, post) {
 		return nil
 	}

-	did := evt.Account.Identity.DID.String()
-	if evt.GetCount("reply", did, countstore.PeriodDay) > 3 {
+	did := c.Account.Identity.DID.String()
+	if c.GetCount("reply", did, countstore.PeriodDay) > 3 {
 		// TODO: disabled, too noisy for prod
-		//evt.AddAccountFlag("frequent-replier")
+		//c.AddAccountFlag("frequent-replier")
 	}
-	evt.Increment("reply", did)
+	c.Increment("reply", did)

 	parentURI, err := syntax.ParseATURI(post.Reply.Parent.Uri)
 	if err != nil {
-		evt.Logger.Warn("failed to parse reply AT-URI", "uri", post.Reply.Parent.Uri)
+		c.Logger.Warn("failed to parse reply AT-URI", "uri", post.Reply.Parent.Uri)
 		return nil
 	}
-	evt.IncrementDistinct("reply-to", did, parentURI.Authority().String())
+	c.IncrementDistinct("reply-to", did, parentURI.Authority().String())
 	return nil
 }

···
 // Looks for accounts posting the exact same text multiple times. Does not currently count the number of distinct accounts replied to, just counts replies at all.
 //
 // There can be legitimate situations that trigger this rule, so in most situations should be a "report" not "label" action.
-func IdenticalReplyPostRule(evt *automod.RecordEvent, post *appbsky.FeedPost) error {
-	if post.Reply == nil || IsSelfThread(evt, post) {
+func IdenticalReplyPostRule(c *automod.RecordContext, post *appbsky.FeedPost) error {
+	if post.Reply == nil || IsSelfThread(c, post) {
 		return nil
 	}

 	// increment first. use a specific period (IncrementPeriod()) to reduce the number of counters (one per unique post text)
 	period := countstore.PeriodDay
-	bucket := evt.Account.Identity.DID.String() + "/" + HashOfString(post.Text)
-	evt.IncrementPeriod("reply-text", bucket, period)
+	bucket := c.Account.Identity.DID.String() + "/" + HashOfString(post.Text)
+	c.IncrementPeriod("reply-text", bucket, period)

 	// don't action short replies, or accounts more than two weeks old
 	if utf8.RuneCountInString(post.Text) <= 10 {
 		return nil
 	}
-	if evt.Account.Private != nil {
-		age := time.Since(evt.Account.Private.IndexedAt)
+	if c.Account.Private != nil {
+		age := time.Since(c.Account.Private.IndexedAt)
 		if age > 2*7*24*time.Hour {
 			return nil
 		}
 	}

-	if evt.GetCount("reply-text", bucket, period) >= identicalReplyLimit {
-		evt.AddAccountFlag("multi-identical-reply")
+	if c.GetCount("reply-text", bucket, period) >= identicalReplyLimit {
+		c.AddAccountFlag("multi-identical-reply")
 	}

 	return nil
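The "DID plus hash of text" bucket used by `IdenticalReplyPostRule` can be illustrated without the engine. This is a rough sketch under several assumptions: `hashOfString` is a hypothetical stand-in (the real `HashOfString` is not shown in this diff), a plain map replaces the count store, the limit is passed in as a parameter because the value of `identicalReplyLimit` is not visible here, and the daily period and account-age check are left out.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"unicode/utf8"
)

// hashOfString is a hypothetical stand-in for the repo's HashOfString:
// a truncated SHA-256 hex digest, used only to keep bucket keys short.
func hashOfString(s string) string {
	sum := sha256.Sum256([]byte(s))
	return hex.EncodeToString(sum[:8])
}

// bucketFor builds one counter key per (account, exact reply text) pair.
func bucketFor(did, text string) string {
	return did + "/" + hashOfString(text)
}

// shouldFlag increments the bucket first, then decides whether to flag.
// Replies of 10 runes or fewer are counted but never flagged.
func shouldFlag(counts map[string]int, did, text string, limit int) bool {
	bucket := bucketFor(did, text)
	counts[bucket]++
	if utf8.RuneCountInString(text) <= 10 {
		return false
	}
	return counts[bucket] >= limit
}

func main() {
	counts := make(map[string]int)
	for i := 1; i <= 3; i++ {
		flagged := shouldFlag(counts, "did:plc:abc111", "check out my new project today", 3)
		fmt.Println(i, flagged)
	}
}
```

Keying on the hash rather than the raw text keeps counter names at a fixed length regardless of how long the reply is.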
+8 -6
automod/rules/replies_test.go
···
 	"testing"

 	"github.com/bluesky-social/indigo/automod"
+	"github.com/bluesky-social/indigo/automod/capture"
+	"github.com/bluesky-social/indigo/automod/engine"

 	"github.com/stretchr/testify/assert"
 )
···
 	assert := assert.New(t)
 	ctx := context.Background()

-	engine := automod.EngineTestFixture()
-	engine.Rules = automod.RuleSet{
+	eng := engine.EngineTestFixture()
+	eng.Rules = automod.RuleSet{
 		PostRules: []automod.PostRuleFunc{
 			IdenticalReplyPostRule,
 		},
 	}

-	capture := automod.MustLoadCapture("testdata/capture_hackerdarkweb.json")
-	did := capture.AccountMeta.Identity.DID.String()
-	assert.NoError(automod.ProcessCaptureRules(&engine, capture))
-	f, err := engine.Flags.Get(ctx, did)
+	cap := capture.MustLoadCapture("testdata/capture_hackerdarkweb.json")
+	did := cap.AccountMeta.Identity.DID.String()
+	assert.NoError(capture.ProcessCaptureRules(&eng, cap))
+	f, err := eng.Flags.Get(ctx, did)
 	assert.NoError(err)
 	assert.Equal([]string{"multi-identical-reply"}, f)
 }
-86
automod/ruleset.go
···
-package automod
-
-import (
-	"fmt"
-
-	appbsky "github.com/bluesky-social/indigo/api/bsky"
-)
-
-type RuleSet struct {
-	PostRules         []PostRuleFunc
-	ProfileRules      []ProfileRuleFunc
-	RecordRules       []RecordRuleFunc
-	RecordDeleteRules []RecordDeleteRuleFunc
-	IdentityRules     []IdentityRuleFunc
-}
-
-func (r *RuleSet) CallRecordRules(evt *RecordEvent) error {
-	// first the generic rules
-	for _, f := range r.RecordRules {
-		err := f(evt)
-		if err != nil {
-			return err
-		}
-		if evt.Err != nil {
-			return evt.Err
-		}
-	}
-	// then any record-type-specific rules
-	switch evt.Collection {
-	case "app.bsky.feed.post":
-		post, ok := evt.Record.(*appbsky.FeedPost)
-		if !ok {
-			return fmt.Errorf("mismatch between collection (%s) and type", evt.Collection)
-		}
-		for _, f := range r.PostRules {
-			err := f(evt, post)
-			if err != nil {
-				return err
-			}
-			if evt.Err != nil {
-				return evt.Err
-			}
-		}
-	case "app.bsky.actor.profile":
-		profile, ok := evt.Record.(*appbsky.ActorProfile)
-		if !ok {
-			return fmt.Errorf("mismatch between collection (%s) and type", evt.Collection)
-		}
-		for _, f := range r.ProfileRules {
-			err := f(evt, profile)
-			if err != nil {
-				return err
-			}
-			if evt.Err != nil {
-				return evt.Err
-			}
-		}
-	}
-	return nil
-}
-
-func (r *RuleSet) CallRecordDeleteRules(evt *RecordDeleteEvent) error {
-	for _, f := range r.RecordDeleteRules {
-		err := f(evt)
-		if err != nil {
-			return err
-		}
-		if evt.Err != nil {
-			return evt.Err
-		}
-	}
-	return nil
-}
-
-func (r *RuleSet) CallIdentityRules(evt *IdentityEvent) error {
-	for _, f := range r.IdentityRules {
-		err := f(evt)
-		if err != nil {
-			return err
-		}
-		if evt.Err != nil {
-			return evt.Err
-		}
-	}
-	return nil
-}
+1 -1
automod/slack.go → automod/engine/slack.go
···
-package automod
+package engine

 import (
 	"bytes"
automod/testdata/capture_atprotocom.json → automod/capture/testdata/capture_atprotocom.json
-146
automod/testing.go
···
-package automod
-
-import (
-	"context"
-	"encoding/json"
-	"io"
-	"log/slog"
-	"os"
-	"time"
-
-	appbsky "github.com/bluesky-social/indigo/api/bsky"
-	"github.com/bluesky-social/indigo/atproto/identity"
-	"github.com/bluesky-social/indigo/atproto/syntax"
-	"github.com/bluesky-social/indigo/automod/cachestore"
-	"github.com/bluesky-social/indigo/automod/countstore"
-	"github.com/bluesky-social/indigo/automod/flagstore"
-	"github.com/bluesky-social/indigo/automod/setstore"
-)
-
-func simpleRule(evt *RecordEvent, post *appbsky.FeedPost) error {
-	for _, tag := range post.Tags {
-		if evt.InSet("bad-hashtags", tag) {
-			evt.AddRecordLabel("bad-hashtag")
-			break
-		}
-	}
-	for _, facet := range post.Facets {
-		for _, feat := range facet.Features {
-			if feat.RichtextFacet_Tag != nil {
-				tag := feat.RichtextFacet_Tag.Tag
-				if evt.InSet("bad-hashtags", tag) {
-					evt.AddRecordLabel("bad-hashtag")
-					break
-				}
-			}
-		}
-	}
-	return nil
-}
-
-func EngineTestFixture() Engine {
-	rules := RuleSet{
-		PostRules: []PostRuleFunc{
-			simpleRule,
-		},
-	}
-	cache := cachestore.NewMemCacheStore(10, time.Hour)
-	flags := flagstore.NewMemFlagStore()
-	sets := setstore.NewMemSetStore()
-	sets.Sets["bad-hashtags"] = make(map[string]bool)
-	sets.Sets["bad-hashtags"]["slur"] = true
-	dir := identity.NewMockDirectory()
-	id1 := identity.Identity{
-		DID:    syntax.DID("did:plc:abc111"),
-		Handle: syntax.Handle("handle.example.com"),
-	}
-	dir.Insert(id1)
-	engine := Engine{
-		Logger:    slog.Default(),
-		Directory: &dir,
-		Counters:  countstore.NewMemCountStore(),
-		Sets:      sets,
-		Flags:     flags,
-		Cache:     cache,
-		Rules:     rules,
-	}
-	return engine
-}
-
-func MustLoadCapture(capPath string) AccountCapture {
-	f, err := os.Open(capPath)
-	if err != nil {
-		panic(err)
-	}
-	defer func() { _ = f.Close() }()
-
-	raw, err := io.ReadAll(f)
-	if err != nil {
-		panic(err)
-	}
-
-	var capture AccountCapture
-	if err := json.Unmarshal(raw, &capture); err != nil {
-		panic(err)
-	}
-	return capture
-}
-
-// Test helper which processes all the records from a capture. Intentionally exported, for use in other packages.
-//
-// This method replaces any pre-existing directory on the engine with a mock directory.
-func ProcessCaptureRules(e *Engine, capture AccountCapture) error {
-	ctx := context.Background()
-
-	dir := identity.NewMockDirectory()
-	dir.Insert(*capture.AccountMeta.Identity)
-	e.Directory = &dir
-
-	// initial identity rules
-	idevt := IdentityEvent{
-		RepoEvent{
-			Engine:  e,
-			Logger:  e.Logger.With("did", capture.AccountMeta.Identity.DID),
-			Account: capture.AccountMeta,
-		},
-	}
-	if err := e.Rules.CallIdentityRules(&idevt); err != nil {
-		return err
-	}
-	if idevt.Err != nil {
-		return idevt.Err
-	}
-	idevt.CanonicalLogLine()
-	if err := idevt.PersistActions(ctx); err != nil {
-		return err
-	}
-	if err := idevt.PersistCounters(ctx); err != nil {
-		return err
-	}
-
-	// all the post rules
-	for _, pr := range capture.PostRecords {
-		aturi, err := syntax.ParseATURI(pr.Uri)
-		if err != nil {
-			return err
-		}
-		path := aturi.Collection().String() + "/" + aturi.RecordKey().String()
-		evt := e.NewRecordEvent(capture.AccountMeta, path, pr.Cid, pr.Value.Val)
-		e.Logger.Debug("processing record", "did", aturi.Authority(), "path", path)
-		if err := e.Rules.CallRecordRules(&evt); err != nil {
-			return err
-		}
-		if evt.Err != nil {
-			return evt.Err
-		}
-		evt.CanonicalLogLine()
-		// NOTE: not purging account meta when profile is updated
-		if err := evt.PersistActions(ctx); err != nil {
-			return err
-		}
-		if err := evt.PersistCounters(ctx); err != nil {
-			return err
-		}
-	}
-	return nil
-}
+2 -2
automod/util/strings.go → automod/engine/util.go
···
-package util
+package engine

-func DedupeStrings(in []string) []string {
+func dedupeStrings(in []string) []string {
 	var out []string
 	seen := make(map[string]bool)
 	for _, v := range in {
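The hunk above only shows the first lines of `dedupeStrings`. For readers unfamiliar with the pattern, here is a self-contained sketch of an order-preserving dedupe that starts the same way; the loop body and return are an assumption about how such a helper is typically completed, not the file's actual contents.

```go
package main

import "fmt"

// dedupeStrings returns the input with duplicates removed, keeping the
// first occurrence of each value. The loop body below is an assumed
// completion; only the signature and first three statements are in the diff.
func dedupeStrings(in []string) []string {
	var out []string
	seen := make(map[string]bool)
	for _, v := range in {
		if !seen[v] {
			out = append(out, v)
			seen[v] = true
		}
	}
	return out
}

func main() {
	fmt.Println(dedupeStrings([]string{"spam", "bot", "spam"}))
}
```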
+41 -3
cmd/hepa/consumer.go
···
 	"fmt"
 	"net/http"
 	"net/url"
+	"strings"

 	comatproto "github.com/bluesky-social/indigo/api/atproto"
 	"github.com/bluesky-social/indigo/atproto/syntax"
+	"github.com/bluesky-social/indigo/automod"
 	"github.com/bluesky-social/indigo/events/schedulers/autoscaling"
 	lexutil "github.com/bluesky-social/indigo/lex/util"

···
 	)
 }

+// TODO: move this to a "ParsePath" helper in syntax package?
+func splitRepoPath(path string) (syntax.NSID, syntax.RecordKey, error) {
+	parts := strings.SplitN(path, "/", 3)
+	if len(parts) != 2 {
+		return "", "", fmt.Errorf("invalid record path: %s", path)
+	}
+	collection, err := syntax.ParseNSID(parts[0])
+	if err != nil {
+		return "", "", err
+	}
+	rkey, err := syntax.ParseRecordKey(parts[1])
+	if err != nil {
+		return "", "", err
+	}
+	return collection, rkey, nil
+}
+
 // NOTE: for now, this function basically never errors, just logs and returns nil. Should think through error processing better.
 func (s *Server) HandleRepoCommit(ctx context.Context, evt *comatproto.SyncSubscribeRepos_Commit) error {

···

 	for _, op := range evt.Ops {
 		logger = logger.With("eventKind", op.Action, "path", op.Path)
+		collection, rkey, err := splitRepoPath(op.Path)
+		if err != nil {
+			logger.Error("invalid path in repo op")
+			return nil
+		}

 		ek := repomgr.EventKind(op.Action)
 		switch ek {
···
 				logger.Error("mismatch between commit op CID and record block", "recordCID", rc, "opCID", op.Cid)
 				break
 			}
-
-			err = s.engine.ProcessRecord(ctx, did, op.Path, op.Cid.String(), rec)
+			recCID := syntax.CID(op.Cid.String())
+			err = s.engine.ProcessRecordOp(ctx, automod.RecordOp{
+				Action:     automod.CreateOp,
+				DID:        did,
+				Collection: collection,
+				RecordKey:  rkey,
+				CID:        &recCID,
+				Value:      rec,
+			})
 			if err != nil {
 				logger.Error("engine failed to process record", "err", err)
 				continue
 			}
 		case repomgr.EvtKindDeleteRecord:
-			err = s.engine.ProcessRecordDelete(ctx, did, op.Path)
+			err = s.engine.ProcessRecordOp(ctx, automod.RecordOp{
+				Action:     automod.DeleteOp,
+				DID:        did,
+				Collection: collection,
+				RecordKey:  rkey,
+				CID:        nil,
+				Value:      nil,
+			})
 			if err != nil {
 				logger.Error("engine failed to process record", "err", err)
 				continue
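The path-splitting step in `splitRepoPath` can be exercised on its own. The sketch below is a simplified, hypothetical variant (`splitPath`) that returns plain strings and only rejects empty segments; the real helper additionally validates the parts with `syntax.ParseNSID` and `syntax.ParseRecordKey`, which are not reproduced here.

```go
package main

import (
	"fmt"
	"strings"
)

// splitPath is a hypothetical, string-only variant of splitRepoPath.
// SplitN with n=3 means a path with an extra "/" yields three parts and
// is rejected, instead of the surplus being folded into the record key.
func splitPath(path string) (collection, rkey string, err error) {
	parts := strings.SplitN(path, "/", 3)
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return "", "", fmt.Errorf("invalid record path: %s", path)
	}
	return parts[0], parts[1], nil
}

func main() {
	for _, p := range []string{"app.bsky.feed.post/3kabc", "app.bsky.feed.post", "a/b/c"} {
		collection, rkey, err := splitPath(p)
		fmt.Println(collection, rkey, err)
	}
}
```

Splitting once, before the `switch`, lets both the create and delete branches build their `RecordOp` from the same validated pair.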
+5 -4
cmd/hepa/main.go
···

 	"github.com/bluesky-social/indigo/atproto/identity"
 	"github.com/bluesky-social/indigo/atproto/syntax"
+	"github.com/bluesky-social/indigo/automod/capture"
 	"github.com/bluesky-social/indigo/automod/directory"

 	"github.com/carlmjohnson/versioninfo"
···
 			return err
 		}

-		return srv.engine.FetchAndProcessRecord(ctx, aturi)
+		return capture.FetchAndProcessRecord(ctx, srv.engine, aturi)
 	},
 }

···
 			return err
 		}

-		return srv.engine.FetchAndProcessRecent(ctx, *atid, cctx.Int("limit"))
+		return capture.FetchAndProcessRecent(ctx, srv.engine, *atid, cctx.Int("limit"))
 	},
 }

···
 			return err
 		}

-		capture, err := srv.engine.CaptureRecent(ctx, *atid, cctx.Int("limit"))
+		cap, err := capture.CaptureRecent(ctx, srv.engine, *atid, cctx.Int("limit"))
 		if err != nil {
 			return err
 		}

-		outJSON, err := json.MarshalIndent(capture, "", " ")
+		outJSON, err := json.MarshalIndent(cap, "", " ")
 		if err != nil {
 			return err
 		}