Mirror of https://github.com/roostorg/coop
# Coop Architecture

This document provides an overview of Coop's system architecture for developers and operators.

### Overview

Coop is built as a monorepo with a React frontend, a Node.js backend, and a multi-database architecture designed for high-throughput content moderation at scale. Coop:

* Lets operations and policy teams manage settings, such as which queue receives reports or the number of strikes per enforcement, without requiring engineers to change backend code
* Supports both automation and a manual review process
* Provides an intuitive UI with role-based access control
* Includes an embedded media player for images and video
* Includes built-in, best-practice wellness features
* Uses a webhook-based architecture to link effects with events
* Logs an audit trail of actions taken, metadata about each action (including when it happened and who performed it), and the corresponding policy
* Provides a dev/staging environment for manual testing and automated integration tests

### Technology Stack

| Layer | Technologies |
| :---- | :---- |
| **Frontend** | React, TypeScript, Ant Design, TailwindCSS, Apollo Client |
| **Backend** | Node.js, Express, Apollo Server, TypeScript |
| **Databases** | PostgreSQL, ScyllaDB (5.2), ClickHouse, Redis |
| **Messaging** | Kafka (optional), BullMQ |
| **ORM** | Sequelize, Kysely |
| **Auth** | Passport.js, express-session, SAML (SSO) |
| **Observability** | OpenTelemetry |

## **Directory Structure**

```
coop/
├── client/                      # React frontend
│   └── src/
│       ├── webpages/            # Page components
│       ├── graphql/             # GraphQL queries/mutations
│       ├── components/          # Shared UI components
│       └── utils/               # Utility functions
│
├── server/                      # Node.js backend
│   ├── bin/                     # CLI scripts
│   ├── graphql/                 # GraphQL schema and resolvers
│   ├── iocContainer/            # Dependency injection setup
│   ├── models/                  # Sequelize ORM models
│   ├── routes/                  # REST API routes
│   ├── rule_engine/             # Rule evaluation logic
│   ├── services/                # Business logic services including NCMEC
│   └── workers_jobs/            # Background processing
│
├── .devops/
│   └── migrator/                # Database migrations
│       └── src/scripts/
│           ├── api-server-pg/   # PostgreSQL
│           ├── clickhouse/      # ClickHouse
│           └── scylla/          # Scylla
│
└── docs/                        # Documentation
```

# Coop Core Components

## API

Coop accepts both synchronous and asynchronous input.

* Synchronous input is handled via REST APIs and supports item submission, action execution, reporting workflows, policy retrieval, and related operations.
* Asynchronous input is handled via Kafka-based event streaming using the `ITEM_SUBMISSION_EVENT` topic.

All API requests require an organization API key passed via the `x-api-key` header.

### Content Submission

* **File**: `/server/routes/content/ContentRoutes.ts`
* **Route**: `POST /api/v1/content/`
* **Header**: `x-api-key: <org-api-key>`

Accepts any item (e.g., content, user, thread), but only one item per request. By default, requests are processed asynchronously. To force synchronous mode, set `sync: true`.

**Example request body (JSON):**

```json
{
  "contentId": "unique-id-123",
  "contentType": "Comment",
  "content": {
    "text": "Hello world",
    "authorId": "user-456",
    "createdAt": "2024-01-01T00:00:00Z"
  },
  "userId": "user-456",
  "sync": false
}
```
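
As a rough illustration of how a client might assemble this call, the sketch below builds (but does not send) a Content Submission request. The helper name `buildContentRequest`, the `ContentSubmission` and `PreparedRequest` types, and the `https://coop.example.com` base URL are hypothetical and not part of Coop; only the route, the `x-api-key` header, and the body fields come from the section above.

```typescript
// Hypothetical client-side types; field names mirror the example body above.
interface ContentSubmission {
  contentId: string;
  contentType: string;
  content: Record<string, unknown>;
  userId: string;
  sync?: boolean; // omit or set false for the default asynchronous processing
}

interface PreparedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds the request without sending it, so any HTTP client can be used.
function buildContentRequest(
  baseUrl: string, // assumption: wherever your Coop server is reachable
  apiKey: string,
  submission: ContentSubmission,
): PreparedRequest {
  return {
    // Strip trailing slashes so the path is not doubled up.
    url: baseUrl.replace(/\/+$/, "") + "/api/v1/content/",
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "x-api-key": apiKey,
    },
    body: JSON.stringify(submission),
  };
}

const request = buildContentRequest("https://coop.example.com", "<org-api-key>", {
  contentId: "unique-id-123",
  contentType: "Comment",
  content: { text: "Hello world", authorId: "user-456" },
  userId: "user-456",
  sync: true, // force synchronous processing for this request
});
```

The `url`, `method`, `headers`, and `body` fields map directly onto what most HTTP clients expect, so the same object can be reused across retries.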

### Item Submission

* **File**: `/server/routes/items/ItemRoutes.ts`
* **Route**: `POST /api/v1/items/async/`
* **Header**: `x-api-key: <org-api-key>`

Accepts one or more arbitrary items (users, threads, etc.). All processing is asynchronous.

**Example request body (JSON):**

```json
{
  "items": [
    {
      "id": "unique-item-id-123",
      "data": {
        "fieldName1": "value1",
        "fieldName2": 123
      },
      "typeId": "your-item-type-id",
      "typeVersion": "optional-version-string",
      "typeSchemaVariant": "original"
    }
  ]
}
```

### Action Execution

* **File**: `/server/routes/action/ActionRoutes.ts`
* **Route**: `POST /api/v1/actions`
* **Header**: `x-api-key: <org-api-key>`

**Example request body (JSON):**

```json
{
  "actionId": "action-id-to-execute",
  "itemId": "target-item-id",
  "itemTypeId": "item-type-id",
  "policyIds": ["policy-id-1", "policy-id-2"],
  "reportedItems": [
    {
      "id": "reported-item-id",
      "typeId": "reported-item-type-id"
    }
  ],
  "actorId": "user-id-who-triggered-action"
}
```

### Reporting

* **File**: `/server/routes/reporting/ReportingRoutes.ts`
* **Route**: `POST /api/v1/report`
* **Header**: `x-api-key: <org-api-key>`

Used to submit reports from users or systems, including contextual items and thread history. The payload supports:

* Reporter identity
* Reported item
* Thread context
* Policy reason(s)
* Additional contextual items

**Example request body (JSON):**

```json
{
  "reporter": {
    "kind": "user",
    "typeId": "reporter-user-type-id",
    "id": "reporter-user-id"
  },
  "reportedAt": "2024-01-15T10:30:00.000Z",
  "reportedForReason": {
    "policyId": "violated-policy-id",
    "reason": "Free-text reason from reporter",
    "csam": false
  },
  "reportedItem": {
    "id": "reported-item-id",
    "data": { "fieldName": "value" },
    "typeId": "item-type-id"
  },
  "reportedItemThread": [
    {
      "id": "thread-message-1",
      "data": { "content": "message content" },
      "typeId": "message-type-id"
    }
  ],
  "reportedItemsInThread": [
    { "id": "specific-reported-message", "typeId": "message-type-id" }
  ],
  "additionalItems": [
    { "id": "additional-context-item", "data": {}, "typeId": "item-type-id" }
  ]
}
```

### Appeal

* **File**: `/server/routes/reporting/ReportingRoutes.ts:105-154`
* **Route**: `POST /api/v1/report/appeal`
* **Header**: `x-api-key: <org-api-key>`

Appeals allow users to contest actions taken against items. Appeals include the original action, violated policies, appeal reason, and optional additional context.

**Example request body (JSON):**

```json
{
  "appealId": "customer-internal-appeal-id",
  "appealedBy": {
    "typeId": "appealer-user-type-id",
    "id": "appealer-user-id"
  },
  "appealedAt": "2024-01-15T12:00:00.000Z",
  "actionedItem": {
    "id": "item-that-was-actioned",
    "data": { "fieldName": "value" },
    "typeId": "item-type-id"
  },
  "actionsTaken": ["action-id-1", "action-id-2"],
  "appealReason": "User's explanation for why they are appealing",
  "violatingPolicies": [
    { "id": "policy-id-1" },
    { "id": "policy-id-2" }
  ],
  "additionalItems": [
    { "id": "additional-context-item", "data": {}, "typeId": "item-type-id" }
  ]
}
```

### Supporting API Endpoints

* **Policies**: `GET /api/v1/policies/`
* **User Scores**: `GET /api/v1/user_scores`
* **GDPR Deletion**: `POST /api/v1/gdpr/delete`

### Errors

All API errors use a consistent JSON structure:

```json
{
  "errors": [
    {
      "status": 400,
      "type": ["/errors/invalid-user-input"],
      "title": "Short error description",
      "detail": "Detailed explanation (optional)",
      "pointer": "/path/to/problematic/field (optional)",
      "requestId": "correlation-id (optional)"
    }
  ]
}
```
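
A client can model this envelope with a small set of types and a runtime check. The sketch below is one possible approach; the names `CoopApiError`, `CoopErrorResponse`, `isCoopErrorResponse`, and `summarizeErrors` are illustrative and do not come from the Coop codebase. Which fields are optional is taken from the "(optional)" annotations in the example above.

```typescript
// Hypothetical client-side model of the error envelope shown above.
interface CoopApiError {
  status: number;
  type: string[];
  title: string;
  detail?: string;
  pointer?: string;
  requestId?: string;
}

interface CoopErrorResponse {
  errors: CoopApiError[];
}

// Runtime guard: checks the required fields before trusting a parsed body.
function isCoopErrorResponse(value: unknown): value is CoopErrorResponse {
  if (typeof value !== "object" || value === null) return false;
  const errors = (value as { errors?: unknown }).errors;
  if (!Array.isArray(errors)) return false;
  return errors.every((entry: unknown) => {
    if (typeof entry !== "object" || entry === null) return false;
    const candidate = entry as Record<string, unknown>;
    const type = candidate["type"];
    return (
      typeof candidate["status"] === "number" &&
      typeof candidate["title"] === "string" &&
      Array.isArray(type) &&
      type.every((t: unknown) => typeof t === "string")
    );
  });
}

// Produces one human-readable line per error, e.g. for logging.
function summarizeErrors(response: CoopErrorResponse): string[] {
  return response.errors.map((e) => {
    const where = e.pointer ? ` at ${e.pointer}` : "";
    return `[${e.status}] ${e.title}${where}`;
  });
}

const parsedBody: unknown = JSON.parse(
  '{"errors":[{"status":400,"type":["/errors/invalid-user-input"],' +
    '"title":"Short error description","pointer":"/content/text"}]}',
);
const summary = isCoopErrorResponse(parsedBody) ? summarizeErrors(parsedBody) : [];
```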

## Rules Engine

When an item is submitted, Coop retrieves all [rules](RULES.md) associated with the item's type. Each rule is evaluated by recursively processing its `conditionSet`: extracting values from the item, optionally passing them through signals, and comparing the results using the configured comparators.

Key characteristics:

* Conditions are evaluated in ascending cost order
* Short-circuiting is applied based on conjunction type (AND / OR / XOR)
* Expensive signals are skipped when earlier conditions fail
* Actions are deduplicated before execution

For rules in actionable environments (e.g., `LIVE`, `MANUAL`), actions are published via the `ActionPublisher`, which handles:

* Customer webhooks
* MRT enqueueing
* NCMEC routing

**Location**: `/server/rule_engine`

**Rule structure:** `/server/models/rules/RuleModel.ts`

```typescript
// Shape of a rule as modeled in RuleModel.ts
interface Rule {
  id: string;
  name: string;
  status: RuleStatus;
  ruleType: RuleType;
  conditionSet: ConditionSet; // evaluated recursively by the rules engine
  orgId: string;
  tags: string[];
  maxDailyActions: number; // per-rule daily action limit
}
```
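
The cost-ordered, short-circuiting evaluation described in this section can be sketched roughly as follows. This is a simplified stand-in, not the code in `/server/rule_engine`: the `LeafCondition` and `ConditionNode` types, the `costOf` helper, the choice to cost a nested set as the sum of its children, and the reading of XOR as "exactly one condition is true" are all assumptions made for illustration.

```typescript
type Conjunction = "AND" | "OR" | "XOR";

// Hypothetical leaf: `evaluate` stands in for value extraction + signal + comparator.
interface LeafCondition {
  kind: "leaf";
  name: string;
  cost: number;
  evaluate: () => boolean;
}

interface ConditionNode {
  kind: "set";
  conjunction: Conjunction;
  conditions: Array<LeafCondition | ConditionNode>;
}

type Condition = LeafCondition | ConditionNode;

// Assumption: a nested set costs as much as all of its children combined.
function costOf(c: Condition): number {
  return c.kind === "leaf"
    ? c.cost
    : c.conditions.reduce((sum, child) => sum + costOf(child), 0);
}

function evaluateCondition(c: Condition): boolean {
  if (c.kind === "leaf") return c.evaluate();
  // Cheapest conditions first, so expensive ones can be skipped.
  const ordered = [...c.conditions].sort((a, b) => costOf(a) - costOf(b));
  let trueCount = 0;
  for (const child of ordered) {
    const result = evaluateCondition(child);
    if (result) trueCount += 1;
    if (c.conjunction === "AND" && !result) return false; // one failure decides AND
    if (c.conjunction === "OR" && result) return true; // one success decides OR
    if (c.conjunction === "XOR" && trueCount > 1) return false; // second success decides XOR
  }
  if (c.conjunction === "AND") return true;
  if (c.conjunction === "OR") return false;
  return trueCount === 1;
}

// Usage: the cheap keyword check fails, so the expensive classifier never runs.
const calls: string[] = [];
const exampleSet: ConditionNode = {
  kind: "set",
  conjunction: "AND",
  conditions: [
    { kind: "leaf", name: "ml-classifier", cost: 100, evaluate: () => { calls.push("ml-classifier"); return true; } },
    { kind: "leaf", name: "keyword-match", cost: 1, evaluate: () => { calls.push("keyword-match"); return false; } },
  ],
};
const matched = evaluateCondition(exampleSet);
```

After this runs, `calls` contains only `"keyword-match"`, which is the behaviour the "expensive signals are skipped" bullet describes.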

## Manual Review Tool (MRT)

The Manual Review Tool (MRT) is a BullMQ-backed queue system used for human review. Items enter MRT via rule actions or user reports. Each job is enriched with context (user scores, related items) and routed to a named queue by routing rules configured in the UI. Moderators claim tasks via exclusive locks, so only one moderator can hold a given task at a time, and submit decisions (that is, take actions), which trigger downstream callbacks or reporting workflows (e.g., NCMEC).

### Queue Management

#### Queue Operations

**File**: `/server/services/manualReviewToolService/modules/QueueOperations.ts`

Jobs can be enqueued from:

* Rules engine execution
* User reports
* Post-action workflows
* MRT internal jobs

**Users can:**

* Dequeue jobs with exclusive locks
* Submit decisions
* Trigger post-decision webhooks or NCMEC reporting

**Supported decision types:**

* `IGNORE`
* `CUSTOM_ACTION`
* `SUBMIT_NCMEC_REPORT`
* `ACCEPT_APPEAL`
* `REJECT_APPEAL`
* `TRANSFORM_JOB_AND_RECREATE_IN_QUEUE`
* `AUTOMATIC_CLOSE`

**Manual Enqueue:**

```typescript
// Input shape for enqueueing a job into MRT
{
  orgId: string;
  correlationId: RuleExecutionCorrelationId | ActionExecutionCorrelationId;
  createdAt: Date;
  enqueueSource: 'REPORT' | 'RULE_EXECUTION' | 'POST_ACTIONS' | 'MRT_JOB';
  enqueueSourceInfo: ReportEnqueueSourceInfo | RuleExecutionEnqueueSourceInfo | ...;
  payload: ManualReviewJobPayloadInput;
  policyIds: string[];
}
```

**Entry from Rules Engine** (`ActionPublisher.ts`):

```typescript
// Excerpt: the rules engine enqueues the item when a rule's action is ENQUEUE_TO_MRT
case ActionType.ENQUEUE_TO_MRT:
  await this.manualReviewToolService.enqueue({
    orgId,
    payload: { kind: 'DEFAULT', item, reportHistory: [], ... },
    enqueueSource: 'RULE_EXECUTION',
    enqueueSourceInfo: { kind: 'RULE_EXECUTION', rules: rules.map(x => x.id) },
    correlationId,
    policyIds: policies.map(it => it.id),
  });
```

**Dequeue with lock:**

```typescript
// Returns the next job plus a lock token, or null when nothing is available
async dequeueNextJob(opts: {
  orgId: string;
  queueId: string;
  userId: string;
}): Promise<{ job: ManualReviewJob; lockToken: string } | null>
```

**Submit Decisions:**

```typescript
async submitDecision(opts: SubmitDecisionInput): Promise<SubmitDecisionResponse>
```
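
To make the claim-then-decide flow concrete, here is a minimal in-memory sketch of exclusive locking. It does not use BullMQ and is not how `QueueOperations.ts` is implemented; the `InMemoryReviewQueue` class, its synchronous methods, and the counter-based lock tokens are hypothetical simplifications. A real system would use unguessable tokens and lock expiry.

```typescript
type DecisionType =
  | "IGNORE"
  | "CUSTOM_ACTION"
  | "SUBMIT_NCMEC_REPORT"
  | "ACCEPT_APPEAL"
  | "REJECT_APPEAL"
  | "TRANSFORM_JOB_AND_RECREATE_IN_QUEUE"
  | "AUTOMATIC_CLOSE";

interface ReviewJob {
  id: string;
  payload: unknown;
}

interface Claim {
  job: ReviewJob;
  lockToken: string;
}

// Hypothetical stand-in for a single MRT queue.
class InMemoryReviewQueue {
  private pending: ReviewJob[] = [];
  private locks = new Map<string, { job: ReviewJob; userId: string }>();
  private nextToken = 1;

  enqueue(job: ReviewJob): void {
    this.pending.push(job);
  }

  // Removing the job from `pending` is what makes the claim exclusive:
  // no second caller can receive the same job.
  dequeueNextJob(userId: string): Claim | null {
    const job = this.pending.shift();
    if (job === undefined) return null;
    const lockToken = `lock-${this.nextToken++}`; // illustration only, not secure
    this.locks.set(lockToken, { job, userId });
    return { job, lockToken };
  }

  // A decision is accepted only from the user who holds the lock.
  submitDecision(lockToken: string, userId: string, decision: DecisionType): boolean {
    const lock = this.locks.get(lockToken);
    if (lock === undefined || lock.userId !== userId) return false;
    this.locks.delete(lockToken);
    console.log(`job ${lock.job.id} decided as ${decision} by ${userId}`);
    return true;
  }
}

const reviewQueue = new InMemoryReviewQueue();
reviewQueue.enqueue({ id: "job-1", payload: { itemId: "item-1" } });
reviewQueue.enqueue({ id: "job-2", payload: { itemId: "item-2" } });
const aliceClaim = reviewQueue.dequeueNextJob("alice");
const bobClaim = reviewQueue.dequeueNextJob("bob");
```

Here `aliceClaim` and `bobClaim` hold different jobs, and a decision submitted with Alice's token by anyone else is rejected.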

## Actions

Actions are created when a rule matches or a moderator submits a decision. Coop determines *when* an action should occur; the customer determines *what* happens as a result (label, warn, ban, remove content, etc.). The actual action is taken by the customer after being triggered through Coop.

Action types:

* `CUSTOMER_DEFINED_ACTION`: POST webhook to customer infrastructure
* `ENQUEUE_TO_MRT`: Add item to the manual review queue
* `ENQUEUE_TO_NCMEC`: Route to NCMEC reporting queue

**Webhook structure:**

```json
{
  "item": { "id": "...", "typeId": "..." },
  "policies": [{ "id": "...", "name": "...", "penalty": "..." }],
  "rules": [{ "id": "...", "name": "..." }],
  "action": { "id": "..." },
  "custom": {},
  "actorEmail": "moderator@example.com"
}
```

Failed webhook deliveries are retried five times with exponential backoff.
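
As an illustration of what five retries with exponential backoff can look like, the sketch below computes a delay schedule and simulates a delivery loop without actually sleeping. The 1-second base delay, the doubling factor, and the function names are assumptions; the source only states the retry count and the backoff style.

```typescript
// Delay before retry number `retryIndex` (0-based), doubling each time by default.
function retryDelayMs(retryIndex: number, baseDelayMs: number, factor = 2): number {
  return baseDelayMs * factor ** retryIndex;
}

function backoffScheduleMs(retries: number, baseDelayMs: number, factor = 2): number[] {
  return Array.from({ length: retries }, (_, i) => retryDelayMs(i, baseDelayMs, factor));
}

interface DeliveryResult {
  delivered: boolean;
  attempts: number; // initial attempt plus any retries
  waitedMs: number[]; // delays that would have been slept between attempts
}

// Simulation only: `send` reports success or failure for a given attempt index.
function simulateDelivery(
  send: (attempt: number) => boolean,
  retries = 5,
  baseDelayMs = 1000, // assumed base delay
): DeliveryResult {
  const waitedMs: number[] = [];
  // Attempt 0 is the initial delivery; attempts 1..retries are the retries.
  for (let attempt = 0; attempt <= retries; attempt++) {
    if (send(attempt)) return { delivered: true, attempts: attempt + 1, waitedMs };
    if (attempt < retries) waitedMs.push(retryDelayMs(attempt, baseDelayMs));
  }
  return { delivered: false, attempts: retries + 1, waitedMs };
}

const schedule = backoffScheduleMs(5, 1000); // [1000, 2000, 4000, 8000, 16000]
const recovered = simulateDelivery((attempt) => attempt === 2); // succeeds on the third attempt
```

Under these assumptions a webhook that never succeeds is attempted six times in total (one initial delivery plus five retries) over roughly 31 seconds of waiting.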

## Storage

Coop uses a multi-database storage system:

* **PostgreSQL** stores configuration, rules, users, sessions, and MRT decisions with ACID guarantees.
* **Redis (via BullMQ)** powers MRT job queues, caching, and aggregation counters with very low latency.
* **ScyllaDB (5.2)** stores item submission history for high-throughput writes, with materialized views for varied access patterns.
* **ClickHouse** serves as the analytics warehouse for rule executions, actions, and user statistics.

### PostgreSQL

ACID-compliant storage for config, auth, rules, and operational data. Schemas include:

* *public*: orgs, users, actions, policies, item_types, banks, api_keys
* *jobs*: Scheduled job tracking
* *manual_review_tool*: Manual review queues, decisions, routing rules, comments
* *ncmec_reporting*: Child safety NCMEC reports
* *reporting_rules*: User / content reporting rules
* *signal_service*: Signal configuration
* *user_management_service*: User management
* *users_statistics_service*: User statistics

### Redis

Used as a low-latency hot cache for:

* **MRT**: BullMQ job queues
* **Caching**: Sets, Sorted Sets, Lua scripts
* **Distributed counters**

### ScyllaDB

Used for high-throughput item history (the Investigations tool and associated users/items). It serves as time-series item submission storage with multiple access patterns.

Tables and views:

* **item_submission_by_thread**: Primary table
* **item_submission_by_item_id**: Lookup by item ID
* **item_submission_by_thread_and_time**: Thread and time range
* **item_submission_by_creator**: Lookup by creator

### ClickHouse

Serves as the OLAP storage for analytics, aggregations, and audit trails.

Databases and key tables:

* **Analytics:** RULE_EXECUTIONS, ACTION_EXECUTIONS, CONTENT_API_REQUESTS, ITEM_MODEL_SCORES_LOG
* **Action executions:** ACTION_STATISTICS_SERVICE: BY_ACTION, BY_RULE, BY_POLICY, ACTIONED_SUBMISSION_COUNTS
  * MANUAL_REVIEW_TOOL: ROUTING_RULE_EXECUTIONS
* **Reporting and appeal stats:** REPORTING_SERVICE: REPORTS, APPEALS, REPORTING_RULE_EXECUTIONS
* **User-level metrics:** USER_STATISTICS_SERVICE: LIFETIME_ACTION_STATS, SUBMISSION_STATS, USER_SCORES

## Signals

Signals are scoring or evaluation functions used by rules. They range from simple text matching to third-party ML services.

The rules engine calls signals when evaluating conditions that need a score. Signals run in ascending cost order, so cheap checks such as text matching run early. If an early condition fails, the more expensive signals are skipped. Results are memoized and cached for 30 seconds for reuse. Signals extend a shared base class and define metadata, cost, and execution logic.

**File**: `/server/services/signalsService`

**Signals Base Class:**

**File**: `/server/services/signalsService/signals/SignalBase.ts`

```typescript
// Every signal declares its metadata, its relative cost, and how it runs
abstract class SignalBase<Input, OutputType, MatchingValue, Type> {
  abstract get id(): SignalId;
  abstract get displayName(): string;
  abstract get description(): string;
  abstract get eligibleInputs(): readonly Input[];
  abstract get outputType(): OutputType;
  abstract get supportedLanguages(): readonly Language[] | 'ALL';
  abstract get integration(): Integration | null;
  abstract getCost(): number; // used to order condition evaluation
  abstract run(input: SignalInput): Promise<SignalResult | SignalErrorResult>;
}
```
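
The 30-second memoization mentioned in this section could be approximated with a small time-to-live cache. The sketch below is not the cache used by `signalsService`; the `TtlMemo` class, its cache-key format, and the injectable clock are assumptions for illustration, and it is synchronous for brevity even though real signals return promises.

```typescript
// Hypothetical TTL memo: caches a computed value per key until it expires.
class TtlMemo<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  // `now` is injectable so the expiry logic can be exercised without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  getOrCompute(key: string, compute: () => V): V {
    const currentTime = this.now();
    const hit = this.entries.get(key);
    if (hit !== undefined && hit.expiresAt > currentTime) return hit.value;
    const value = compute();
    this.entries.set(key, { value, expiresAt: currentTime + this.ttlMs });
    return value;
  }
}

// Usage with a fake clock: the "signal" runs twice across three lookups.
let fakeNow = 0;
let signalRuns = 0;
const memo = new TtlMemo<number>(30_000, () => fakeNow);
const runSignal = (): number => {
  signalRuns += 1; // stands in for an expensive signal call
  return 0.92;
};

memo.getOrCompute("toxicity:item-1", runSignal); // computed
fakeNow = 10_000;
memo.getOrCompute("toxicity:item-1", runSignal); // served from cache
fakeNow = 31_000;
memo.getOrCompute("toxicity:item-1", runSignal); // expired, computed again
```

Keying the cache on both the signal and the input (here a made-up `signal:item` string) keeps results for different items from colliding.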

# Services Required

* PostgreSQL
* Redis
* Kafka
  * Schema registry
  * ZooKeeper
* ClickHouse
* ScyllaDB
* Metrics
  * Jaeger
  * OpenTelemetry

# Configuration

Server configuration lives in `/server/.env.example`

* Database: PostgreSQL
* Analytics warehouse: ClickHouse
* Redis: Redis
* Scylla: ScyllaDB

Rules

* Configured in the frontend via the GraphQL/dashboard UI
* Rate limiting via `maxDailyActions` for each rule
* Rule status: `LIVE`, `DRAFT`, `BACKGROUND`, `EXPIRED`
* Signals: Configured in the rules frontend

User roles

* `ADMIN`: Full access
* `RULES_MANAGER`: Can modify live rules
* `ANALYST`: View insights
* `MODERATOR_MANAGER`: Manages MRT queues
* `MODERATOR`: Reviews assigned queues
* `CHILD_SAFETY_MODERATOR`: Access to NCMEC data
* `EXTERNAL_MODERATOR`: View-only MRT access

Permissions

* `MANAGE_ORG`: `ADMIN`
* `MUTATE_LIVE_RULES`: `ADMIN`, `RULES_MANAGER`
* `VIEW_MRT`: All moderator roles
* `EDIT_MRT_QUEUES`: `ADMIN`, `MODERATOR_MANAGER`
* `VIEW_CHILD_SAFETY_DATA`: `ADMIN`, `MODERATOR_MANAGER`, `CHILD_SAFETY_MODERATOR`
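
The role and permission lists above can be read as a lookup table. The sketch below encodes them that way; it is not Coop's actual permission check, and the `PERMISSION_ROLES` table and `hasPermission` helper are hypothetical. In particular, interpreting "all moderator roles" as every role with `MODERATOR` in its name, plus `ADMIN` because it has full access, is an assumption.

```typescript
type Role =
  | "ADMIN"
  | "RULES_MANAGER"
  | "ANALYST"
  | "MODERATOR_MANAGER"
  | "MODERATOR"
  | "CHILD_SAFETY_MODERATOR"
  | "EXTERNAL_MODERATOR";

type Permission =
  | "MANAGE_ORG"
  | "MUTATE_LIVE_RULES"
  | "VIEW_MRT"
  | "EDIT_MRT_QUEUES"
  | "VIEW_CHILD_SAFETY_DATA";

const PERMISSION_ROLES: Record<Permission, readonly Role[]> = {
  MANAGE_ORG: ["ADMIN"],
  MUTATE_LIVE_RULES: ["ADMIN", "RULES_MANAGER"],
  // Assumption: "all moderator roles" means the four *MODERATOR* roles, plus ADMIN.
  VIEW_MRT: [
    "ADMIN",
    "MODERATOR_MANAGER",
    "MODERATOR",
    "CHILD_SAFETY_MODERATOR",
    "EXTERNAL_MODERATOR",
  ],
  EDIT_MRT_QUEUES: ["ADMIN", "MODERATOR_MANAGER"],
  VIEW_CHILD_SAFETY_DATA: ["ADMIN", "MODERATOR_MANAGER", "CHILD_SAFETY_MODERATOR"],
};

function hasPermission(role: Role, permission: Permission): boolean {
  return PERMISSION_ROLES[permission].includes(role);
}

const analystCanEditRules = hasPermission("ANALYST", "MUTATE_LIVE_RULES"); // false
const managerCanEditQueues = hasPermission("MODERATOR_MANAGER", "EDIT_MRT_QUEUES"); // true
```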

# Action Rules vs Routing Rules

Coop supports two sets of [rules](RULES.md). Each has separate code paths, storage tables, and UI surfaces.

1. [Automated Action rules](RULES.md#automated-action-rules): All rules act in parallel on all events to determine automated actions and MRT decisioning.
2. [Routing rules](RULES.md#routing-rules): Rules are evaluated in order, and the first routing rule that succeeds routes the MRT-bound event into the appropriate queue to await review.
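
The difference from action rules is that routing stops at the first rule that succeeds. A minimal sketch of that first-match behaviour follows; it is not the logic in `JobRouting.ts`, and the `RoutableJob` and `RoutingRule` shapes, the `routeJob` function, and the fallback to a default queue when nothing matches are all assumptions.

```typescript
// Hypothetical shapes for illustration only.
interface RoutableJob {
  itemTypeId: string;
  policyIds: string[];
}

interface RoutingRule {
  id: string;
  queueId: string;
  matches: (job: RoutableJob) => boolean;
}

interface RoutingOutcome {
  queueId: string;
  matchedRuleId: string | null; // null when the default queue was used
}

// Rules are tried in their configured order; the first match wins.
function routeJob(
  job: RoutableJob,
  rules: readonly RoutingRule[],
  defaultQueueId: string, // assumption: a fallback queue exists
): RoutingOutcome {
  for (const rule of rules) {
    if (rule.matches(job)) return { queueId: rule.queueId, matchedRuleId: rule.id };
  }
  return { queueId: defaultQueueId, matchedRuleId: null };
}

const routingRules: RoutingRule[] = [
  { id: "rule-spam", queueId: "spam-queue", matches: (j) => j.policyIds.includes("spam") },
  { id: "rule-comments", queueId: "comments-queue", matches: (j) => j.itemTypeId === "comment" },
];

// Both rules would match this job, but only the first one routes it.
const outcome = routeJob({ itemTypeId: "comment", policyIds: ["spam"] }, routingRules, "default-queue");
```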

## Rules Engine Rules

Code: `/server/models/rules/RuleModel.ts`

UI: `/client/src/webpages/dashboard/rules/`

Storage tables:

* public.rules
* public.rules_and_actions
* public.rules_and_item_types
* public.rules_and_policies
* public.rules_history

## Routing Rules

Code: `/server/services/manualReviewToolService/modules/JobRouting.ts`

UI: `/client/src/webpages/dashboard/mrt/queue_routing/`

Storage tables:

* manual_review_tool.routing_rules
* manual_review_tool.routing_rules_to_item_types
* manual_review_tool.routing_rules_history
* manual_review_tool.appeal_routing_rules
* manual_review_tool.appeal_routing_rules_to_item_types

## Authentication

Coop supports three authentication methods: API key authentication for programmatic access, session-based authentication for the dashboard UI, and SAML/SSO for enterprise single sign-on.

### API Key Authentication

API keys authenticate programmatic requests to REST endpoints. All API requests require the `x-api-key` header.

1. Middleware extracts the `x-api-key` header
2. Key is validated via SHA-256 hash lookup in the database
3. If valid, `orgId` is set on the request for downstream handlers
4. Returns 401 Unauthorized if invalid or missing

* Keys are 32-byte random values, SHA-256 hashed before storage
* Each key is scoped to a single team, which is useful when different teams in the same organization have data that should not mix
* Last-used timestamp is tracked for auditing
* Keys can be rotated (creates a new key, deactivates the old one)

Files:
* Middleware: `/server/utils/apiKeyMiddleware.ts`
* Service: `/server/services/apiKeyService/apiKeyService.ts`
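
The hash-then-lookup flow described above can be sketched with Node's built-in `crypto` module. This is not the code in `apiKeyMiddleware.ts` or `apiKeyService.ts`: the in-memory `ApiKeyStore`, the hex encoding of keys and hashes, and the `authenticate` function are hypothetical, and a real deployment would look the hash up in the database.

```typescript
import { createHash, randomBytes } from "node:crypto";

function hashApiKey(key: string): string {
  return createHash("sha256").update(key).digest("hex");
}

// Hypothetical in-memory stand-in for the database table of key hashes.
class ApiKeyStore {
  private orgByHash = new Map<string, string>();

  // Generates a 32-byte random key; only its SHA-256 hash is stored.
  issue(orgId: string): string {
    const key = randomBytes(32).toString("hex"); // assumption: hex-encoded
    this.orgByHash.set(hashApiKey(key), orgId);
    return key; // the plaintext key is returned once and never persisted
  }

  lookup(key: string): string | null {
    return this.orgByHash.get(hashApiKey(key)) ?? null;
  }
}

type AuthResult = { status: 200; orgId: string } | { status: 401 };

// Mirrors the four middleware steps: extract, hash and look up, attach orgId, or 401.
function authenticate(
  headers: Record<string, string | undefined>,
  store: ApiKeyStore,
): AuthResult {
  const key = headers["x-api-key"];
  if (key === undefined || key === "") return { status: 401 };
  const orgId = store.lookup(key);
  return orgId === null ? { status: 401 } : { status: 200, orgId };
}

const keyStore = new ApiKeyStore();
const issuedKey = keyStore.issue("org-123");
const accepted = authenticate({ "x-api-key": issuedKey }, keyStore);
const rejected = authenticate({ "x-api-key": "not-a-real-key" }, keyStore);
```

Because only the hash is stored, a leaked copy of the store does not reveal usable keys, which is the point of hashing before storage.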

### Session-Based Authentication

Session authentication is used for dashboard UI access via GraphQL.

1. User submits credentials via the GraphQL login mutation
2. Passport's GraphQLLocalStrategy validates the email and password
3. Password is verified via bcrypt comparison
4. On success, the user is serialized to the session via `passport.serializeUser()`
5. Session is stored in PostgreSQL via connect-pg-simple

Session configuration:

* Store: PostgreSQL-backed
* Cookie: Secure flag in production, 30-day expiry
* Session secret: `process.env.SESSION_SECRET`

Files:
* `/server/api.ts`

### SAML/SSO Authentication

Enterprise SSO uses SAML with per-organization configuration.

1. User navigates to `/saml/login/{orgId}`
2. Passport's MultiSamlStrategy retrieves the org-specific SAML settings
3. User is redirected to the configured SAML provider
4. Provider authenticates the user and posts an assertion to the callback URL
5. User email is extracted from the SAML assertion
6. User record is looked up and a session is created

Configuration (per org, in the `org_settings` table):

* `saml_enabled`: Boolean flag
* `sso_url`: SAML entry point URL
* `cert`: Certificate for validation

Files:
* `/server/api.ts` (lines 142-227)
* `/server/services/SSOService/SSOService.ts`