docs/ARCHITECTURE.md at 3f8c425df5edf1886efb0d7dd4eccb3c63279ce2

Mirror of https://github.com/roostorg/coop github.com/roostorg/coop
Juan S. Mrad Initial open source release (v0) 2mo ago
# Coop Architecture

This document provides an overview of Coop's system architecture for developers and operators.

### Overview

Coop is built as a monorepo with a React frontend, a Node.js backend, and a multi-database architecture designed for high-throughput content moderation at scale. Coop:

* Lets operations and policy teams manage settings, such as which queue receives reports or the number of strikes per enforcement, without requiring engineers to change backend code
* Supports both automation and a manual review process
* Provides an intuitive UI with role-based access control
* Includes an embedded media player for images and video
* Builds in best-practice wellness features
* Uses a webhook-based architecture to link effects with events
* Logs an audit trail of actions taken, metadata about each action (including when it happened and who performed it), and the corresponding policy
* Provides a dev/staging environment for manual testing and automated integration tests

### Technology Stack

| Layer | Technologies |
| :---- | :---- |
| **Frontend** | React, TypeScript, Ant Design, TailwindCSS, Apollo Client |
| **Backend** | Node.js, Express, Apollo Server, TypeScript |
| **Databases** | PostgreSQL, ScyllaDB (5.2), ClickHouse, Redis |
| **Messaging** | Kafka (optional), BullMQ |
| **ORM** | Sequelize, Kysely |
| **Auth** | Passport.js, express-session, SAML (SSO) |
| **Observability** | OpenTelemetry |

## **Directory Structure**

```
coop/
├── client/                    # React frontend
│   └── src/
│       ├── webpages/         # Page components
│       ├── graphql/          # GraphQL queries/mutations
│       ├── components/       # Shared UI components
│       └── utils/            # Utility functions
│
├── server/                    # Node.js backend
│   ├── bin/                  # CLI scripts
│   ├── graphql/              # GraphQL schema and resolvers
│   ├── iocContainer/         # Dependency injection setup
│   ├── models/               # Sequelize ORM models
│   ├── routes/               # REST API routes
│   ├── rule_engine/          # Rule evaluation logic
│   ├── services/             # Business logic services including NCMEC
│   └── workers_jobs/         # Background processing
│
├── .devops/
│   └── migrator/             # Database migrations
│       └── src/scripts/
│           ├── api-server-pg/  # PostgreSQL
│           ├── clickhouse/     # ClickHouse
│           └── scylla/         # Scylla
│
└── docs/                      # Documentation
```

# Coop Core Components

## API

Coop accepts both synchronous and asynchronous input.

* Synchronous input is handled via REST APIs and supports item submission, action execution, reporting workflows, policy retrieval, and related operations.
* Asynchronous input is handled via Kafka-based event streaming using the `ITEM_SUBMISSION_EVENT` topic.

All API requests require an organization API key passed via the `x-api-key` header.

### Content Submission

* **File**: `/server/routes/content/ContentRoutes.ts`
* **Route**: `POST /api/v1/content/`
* **Header**: `x-api-key: <org-api-key>`

Accepts any item (e.g., content, user, thread), but only a single item per request. By default, requests are processed asynchronously. To force synchronous mode, set `sync: true` in the request body.

**Example request body (JSON):**

```json
{
  "contentId": "unique-id-123",
  "contentType": "Comment",
  "content": {
    "text": "Hello world",
    "authorId": "user-456",
    "createdAt": "2024-01-01T00:00:00Z"
  },
  "userId": "user-456",
  "sync": false
}
```
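
As a rough illustration of how a client might assemble this request, the sketch below builds (but does not send) a Content Submission call. The helper name `buildContentRequest`, the `PreparedRequest` shape, and the base URL are assumptions made for the example and are not part of Coop; only the route, the `x-api-key` header, and the body fields come from the section above.

```typescript
// Body fields mirror the example request above.
interface ContentSubmission {
  contentId: string;
  contentType: string;
  content: Record<string, unknown>;
  userId: string;
  sync?: boolean; // omit or set false for the default asynchronous mode
}

// Hypothetical container for a request that is ready to be sent.
interface PreparedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: assembles the URL, headers, and JSON body.
function buildContentRequest(
  baseUrl: string, // assumption: wherever your Coop server is reachable
  apiKey: string,
  submission: ContentSubmission,
): PreparedRequest {
  return {
    // Strip trailing slashes so the path is not doubled up.
    url: `${baseUrl.replace(/\/+$/, "")}/api/v1/content/`,
    method: "POST",
    headers: {
      "content-type": "application/json",
      "x-api-key": apiKey,
    },
    body: JSON.stringify(submission),
  };
}

const contentRequest = buildContentRequest("http://localhost:8080/", "my-org-api-key", {
  contentId: "unique-id-123",
  contentType: "Comment",
  content: { text: "Hello world", authorId: "user-456" },
  userId: "user-456",
  sync: true, // force synchronous processing for this one call
});

// With Node 18+ or a browser, this could then be sent with:
//   await fetch(contentRequest.url, contentRequest);
```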

### Item Submission

* **File**: `/server/routes/items/ItemRoutes.ts`
* **Route**: `POST /api/v1/items/async/`
* **Header**: `x-api-key: <org-api-key>`

Accepts one or more arbitrary items (users, threads, etc.). All processing is asynchronous.

**Example request body (JSON):**

```json
{
  "items": [
    {
      "id": "unique-item-id-123",
      "data": {
        "fieldName1": "value1",
        "fieldName2": 123
      },
      "typeId": "your-item-type-id",
      "typeVersion": "optional-version-string",
      "typeSchemaVariant": "original"
    }
  ]
}
```

### Action Execution

* **File**: `/server/routes/action/ActionRoutes.ts`
* **Route**: `POST /api/v1/actions`
* **Header**: `x-api-key: <org-api-key>`

**Example request body (JSON):**

```json
{
  "actionId": "action-id-to-execute",
  "itemId": "target-item-id",
  "itemTypeId": "item-type-id",
  "policyIds": ["policy-id-1", "policy-id-2"],
  "reportedItems": [
    {
      "id": "reported-item-id",
      "typeId": "reported-item-type-id"
    }
  ],
  "actorId": "user-id-who-triggered-action"
}
```

### Reporting

* **File**: `/server/routes/reporting/ReportingRoutes.ts`
* **Route**: `POST /api/v1/report`
* **Header**: `x-api-key: <org-api-key>`

Used to submit reports from users or systems, including contextual items and thread history. The payload supports:

* Reporter identity
* Reported item
* Thread context
* Policy reason(s)
* Additional contextual items

**Example request body (JSON):**

```json
{
  "reporter": {
    "kind": "user",
    "typeId": "reporter-user-type-id",
    "id": "reporter-user-id"
  },
  "reportedAt": "2024-01-15T10:30:00.000Z",
  "reportedForReason": {
    "policyId": "violated-policy-id",
    "reason": "Free-text reason from reporter",
    "csam": false
  },
  "reportedItem": {
    "id": "reported-item-id",
    "data": { "fieldName": "value" },
    "typeId": "item-type-id"
  },
  "reportedItemThread": [
    {
      "id": "thread-message-1",
      "data": { "content": "message content" },
      "typeId": "message-type-id"
    }
  ],
  "reportedItemsInThread": [
    { "id": "specific-reported-message", "typeId": "message-type-id" }
  ],
  "additionalItems": [
    { "id": "additional-context-item", "data": {}, "typeId": "item-type-id" }
  ]
}
```

### Appeal

* **File**: `/server/routes/reporting/ReportingRoutes.ts:105-154`
* **Route**: `POST /api/v1/report/appeal`
* **Header**: `x-api-key: <org-api-key>`

Appeals allow users to contest actions taken against items. Appeals include the original action, violated policies, appeal reason, and optional additional context.

**Example request body (JSON):**

```json
{
  "appealId": "customer-internal-appeal-id",
  "appealedBy": {
    "typeId": "appealer-user-type-id",
    "id": "appealer-user-id"
  },
  "appealedAt": "2024-01-15T12:00:00.000Z",
  "actionedItem": {
    "id": "item-that-was-actioned",
    "data": { "fieldName": "value" },
    "typeId": "item-type-id"
  },
  "actionsTaken": ["action-id-1", "action-id-2"],
  "appealReason": "User's explanation for why they are appealing",
  "violatingPolicies": [
    { "id": "policy-id-1" },
    { "id": "policy-id-2" }
  ],
  "additionalItems": [
    { "id": "additional-context-item", "data": {}, "typeId": "item-type-id" }
  ]
}
```

### Supporting API Endpoints

* **Policies**: `GET /api/v1/policies/`
* **User Scores**: `GET /api/v1/user_scores`
* **GDPR Deletion**: `POST /api/v1/gdpr/delete`

### Errors

All API errors use a consistent JSON structure:

```json
{
  "errors": [
    {
      "status": 400,
      "type": ["/errors/invalid-user-input"],
      "title": "Short error description",
      "detail": "Detailed explanation (optional)",
      "pointer": "/path/to/problematic/field (optional)",
      "requestId": "correlation-id (optional)"
    }
  ]
}
```
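
A client could model this envelope with a small type and a runtime guard. The sketch below is one way to do that; the names `CoopApiError`, `CoopErrorResponse`, `isCoopErrorResponse`, and `summarizeErrors` are hypothetical and exist only for this example, while the field names come from the structure above.

```typescript
// Field names follow the documented error structure; the type names are hypothetical.
interface CoopApiError {
  status: number;
  type: string[];
  title: string;
  detail?: string;
  pointer?: string;
  requestId?: string;
}

interface CoopErrorResponse {
  errors: CoopApiError[];
}

// Runtime check that an unknown JSON value has the required error fields.
function isCoopErrorResponse(value: unknown): value is CoopErrorResponse {
  if (typeof value !== "object" || value === null) return false;
  const errors = (value as { errors?: unknown }).errors;
  if (!Array.isArray(errors)) return false;
  return errors.every((entry) => {
    if (typeof entry !== "object" || entry === null) return false;
    const err = entry as Record<string, unknown>;
    return (
      typeof err.status === "number" &&
      Array.isArray(err.type) &&
      typeof err.title === "string"
    );
  });
}

// Turns each error into a one-line message, including the pointer when present.
function summarizeErrors(response: CoopErrorResponse): string[] {
  return response.errors.map((e) =>
    e.pointer ? `${e.status} ${e.title} (at ${e.pointer})` : `${e.status} ${e.title}`,
  );
}

const parsedError: unknown = JSON.parse(
  '{"errors":[{"status":400,"type":["/errors/invalid-user-input"],"title":"Bad field","pointer":"/items/0/id"}]}',
);
const errorSummaries = isCoopErrorResponse(parsedError) ? summarizeErrors(parsedError) : [];
```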

## Rules Engine

When an item is submitted, Coop retrieves all [rules](RULES.md) associated with the item’s type. Each rule is evaluated by recursively processing its `conditionSet`, extracting values from the item, optionally passing them through signals, and comparing results using configured comparators.

Key characteristics:

* Conditions are evaluated in ascending cost order
* Short-circuiting is applied based on conjunction type (AND / OR / XOR)
* Expensive signals are skipped when earlier conditions fail
* Actions are deduplicated before execution
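
The cost ordering and short-circuiting described above can be sketched as follows. This is a simplified illustration and not the code in `/server/rule_engine`: it handles a flat list of leaf conditions with AND/OR only (no nesting, no XOR), and the names `LeafCondition`, `ConditionSetSketch`, and `evaluateConditionSet` are hypothetical.

```typescript
type Conjunction = "AND" | "OR";

// Hypothetical leaf condition: a relative cost plus a thunk that yields the outcome.
interface LeafCondition {
  cost: number;
  evaluate: () => boolean;
}

interface ConditionSetSketch {
  conjunction: Conjunction;
  conditions: LeafCondition[];
}

// Evaluates cheapest conditions first and stops as soon as the outcome is decided.
function evaluateConditionSet(set: ConditionSetSketch): { passed: boolean; evaluated: number } {
  const ordered = [...set.conditions].sort((a, b) => a.cost - b.cost);
  let evaluated = 0;
  for (const condition of ordered) {
    evaluated += 1;
    const result = condition.evaluate();
    // AND fails on the first false; OR passes on the first true.
    if (set.conjunction === "AND" && !result) return { passed: false, evaluated };
    if (set.conjunction === "OR" && result) return { passed: true, evaluated };
  }
  // No short circuit happened: AND saw only trues, OR saw only falses.
  return { passed: set.conjunction === "AND", evaluated };
}

const signalCalls = { expensive: 0 };
const andOutcome = evaluateConditionSet({
  conjunction: "AND",
  conditions: [
    // Stand-in for a costly third-party signal.
    { cost: 100, evaluate: () => { signalCalls.expensive += 1; return true; } },
    // Stand-in for a cheap text match that fails.
    { cost: 1, evaluate: () => false },
  ],
});
// The cheap condition fails first, so the expensive one is never invoked.
```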

For rules in actionable environments (e.g., `LIVE`, `MANUAL`), actions are published via the `ActionPublisher`, which handles:

* Customer webhooks
* MRT enqueueing
* NCMEC routing

**Location**: `/server/rule_engine`

**Rule structure:** `/server/models/rules/RuleModel.ts`

```typescript
// Simplified shape of a rule; see RuleModel.ts for the full definition.
interface Rule {
  id: string;
  name: string;
  status: RuleStatus;
  ruleType: RuleType;
  conditionSet: ConditionSet;
  orgId: string;
  tags: string[];
  maxDailyActions: number;
}
```

## Manual Review Tool (MRT)

The Manual Review Tool (MRT) is a BullMQ-backed queue system used for human review. Items enter MRT via rule actions or user reports. Each job is enriched with context (user scores, related items) and routed to a named queue by routing rules configured in the UI. Moderators claim tasks via exclusive locks (so only one moderator can hold a given task at a time) and submit decisions (that is, take actions), which trigger downstream callbacks or reporting workflows (e.g., NCMEC).

### Queue Management

#### Queue Operations

**File**: `/server/services/manualReviewToolService/modules/QueueOperations.ts`

Jobs can be enqueued from:

* Rules engine execution
* User reports
* Post-action workflows
* MRT internal jobs

**Users can:**

* Dequeue jobs with exclusive locks
* Submit decisions
* Trigger post-decision webhooks or NCMEC reporting

**Supported decision types:**

* `IGNORE`
* `CUSTOM_ACTION`
* `SUBMIT_NCMEC_REPORT`
* `ACCEPT_APPEAL`
* `REJECT_APPEAL`
* `TRANSFORM_JOB_AND_RECREATE_IN_QUEUE`
* `AUTOMATIC_CLOSE`

**Manual Enqueue:**

```typescript
{
  orgId: string;
  correlationId: RuleExecutionCorrelationId | ActionExecutionCorrelationId;
  createdAt: Date;
  enqueueSource: 'REPORT' | 'RULE_EXECUTION' | 'POST_ACTIONS' | 'MRT_JOB';
  enqueueSourceInfo: ReportEnqueueSourceInfo | RuleExecutionEnqueueSourceInfo | ...;
  payload: ManualReviewJobPayloadInput;
  policyIds: string[];
}
```

**Entry from Rules Engine** (`ActionPublisher.ts`):

```typescript
case ActionType.ENQUEUE_TO_MRT:
  await this.manualReviewToolService.enqueue({
    orgId,
    payload: { kind: 'DEFAULT', item, reportHistory: [], ... },
    enqueueSource: 'RULE_EXECUTION',
    enqueueSourceInfo: { kind: 'RULE_EXECUTION', rules: rules.map(x => x.id) },
    correlationId,
    policyIds: policies.map(it => it.id),
  });
```

**Dequeue with lock:**

```typescript
async dequeueNextJob(opts: {
  orgId: string;
  queueId: string;
  userId: string;
}): Promise<{ job: ManualReviewJob; lockToken: string } | null>
```

**Submit Decisions:**

```typescript
async submitDecision(opts: SubmitDecisionInput): Promise<SubmitDecisionResponse>
```
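
To make the exclusive-lock idea concrete, here is a minimal in-memory sketch of the claim-then-decide flow. It is only an illustration: Coop's queues are backed by BullMQ and Redis, and this toy class does not reproduce how BullMQ implements locking. The class `InMemoryReviewQueue`, the `SketchJob` shape, and the token format are all assumptions made for the example.

```typescript
// Hypothetical job record: a job is locked once lockedBy is set.
interface SketchJob {
  id: string;
  lockedBy?: string;
  lockToken?: string;
}

class InMemoryReviewQueue {
  private readonly jobs: SketchJob[] = [];
  private nextToken = 1;

  enqueue(id: string): void {
    this.jobs.push({ id });
  }

  // Hands out the oldest unlocked job and locks it for the caller.
  dequeueNextJob(userId: string): { job: SketchJob; lockToken: string } | null {
    const job = this.jobs.find((candidate) => candidate.lockedBy === undefined);
    if (!job) return null;
    const lockToken = `lock-${this.nextToken++}`;
    job.lockedBy = userId;
    job.lockToken = lockToken;
    return { job, lockToken };
  }

  // A decision is accepted only from the holder of the matching lock token.
  submitDecision(jobId: string, lockToken: string): boolean {
    const index = this.jobs.findIndex((j) => j.id === jobId && j.lockToken === lockToken);
    if (index === -1) return false;
    this.jobs.splice(index, 1); // the job leaves the queue once decided
    return true;
  }
}

const reviewQueue = new InMemoryReviewQueue();
reviewQueue.enqueue("job-a");
const aliceClaim = reviewQueue.dequeueNextJob("alice"); // alice now holds job-a
const bobClaim = reviewQueue.dequeueNextJob("bob"); // nothing left to claim
```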

## Actions

Actions are created when a rule matches or a moderator submits a decision. Coop determines *when* an action should occur; the customer determines *what* happens as a result (label, warn, ban, remove content, and so on). The customer carries out the actual action once Coop triggers it.

Action types:

* `CUSTOMER_DEFINED_ACTION`: POST webhook to customer infrastructure
* `ENQUEUE_TO_MRT`: Add item to the manual review queue
* `ENQUEUE_TO_NCMEC`: Route to NCMEC reporting queue

**Webhook structure:**

```json
{
  "item": { "id": "...", "typeId": "..." },
  "policies": [{ "id": "...", "name": "...", "penalty": "..." }],
  "rules": [{ "id": "...", "name": "..." }],
  "action": { "id": "..." },
  "custom": {},
  "actorEmail": "moderator@example.com"
}
```

Failed webhook deliveries are retried up to five times with exponential backoff.
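
A retry loop of this kind might look like the sketch below. The base delay of one second and the doubling factor are assumptions, since the section above states only the retry count and that the backoff is exponential; `backoffDelaysMs` and `deliverWithRetry` are hypothetical names, and the `send` callback stands in for the actual webhook POST.

```typescript
// Delay before each retry: baseMs, baseMs * factor, baseMs * factor^2, ...
function backoffDelaysMs(retries: number, baseMs: number, factor = 2): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < retries; attempt += 1) {
    delays.push(baseMs * factor ** attempt);
  }
  return delays;
}

// Tries once, then retries after each delay until send() reports success.
async function deliverWithRetry(
  send: () => Promise<boolean>,
  delaysMs: number[],
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<boolean> {
  if (await send()) return true; // initial delivery attempt
  for (const delay of delaysMs) {
    await sleep(delay);
    if (await send()) return true;
  }
  return false; // every attempt failed
}

// Five retries starting at one second (assumed values).
const retrySchedule = backoffDelaysMs(5, 1000);
```

Passing `sleep` as a parameter keeps the loop testable without real waiting.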

## Storage

Coop uses a multi-database storage system:

* **PostgreSQL** stores configuration, rules, users, sessions, and MRT decisions with ACID guarantees.
* **Redis (via BullMQ)** powers MRT job queues, caching, and aggregation counters with very low latency.
* **ScyllaDB (5.2)** stores item submission history for high-throughput writes, with materialized views for varied access patterns.
* **ClickHouse** serves as the analytics warehouse for rule executions, actions, and user statistics.

### PostgreSQL

ACID-compliant storage for config, auth, rules, and operational data, organized into these schemas:

* *public*: orgs, users, actions, policies, item_types, banks, api_keys
* *jobs*: Scheduled job tracking
* *manual_review_tool*: Manual review queues, decisions, routing rules, comments
* *ncmec_reporting*: Child safety NCMEC reports
* *reporting_rules*: User / content reporting rules
* *signal_service*: Signal configuration
* *user_management_service*: User management
* *users_statistics_service*: User statistics

### Redis

Used as a low-latency hot cache for:

* **MRT**: BullMQ job queues
* **Caching**: Sets, Sorted Sets, Lua scripts
* **Distributed counters**

### ScyllaDB

Used for high-throughput item history (Investigations tool and associated users/items). It serves as time-series item submission storage with multiple access patterns.

Tables/Views:

* **item_submission_by_thread**: Primary table
* **item_submission_by_item_id**: Lookup by item ID
* **item_submission_by_thread_and_time**: Thread and time range
* **item_submission_by_creator**: Lookup by creator

### ClickHouse

Serves as the OLAP storage for analytics, aggregations, and audit trails.

Databases and key tables:

* **Analytics**: RULE_EXECUTIONS, ACTION_EXECUTIONS, CONTENT_API_REQUESTS, ITEM_MODEL_SCORES_LOG
* **Action executions**: ACTION_STATISTICS_SERVICE: BY_ACTION, BY_RULE, BY_POLICY, ACTIONED_SUBMISSION_COUNTS
  * MANUAL_REVIEW_TOOL: ROUTING_RULE_EXECUTIONS
* **Reporting and appeal stats**: REPORTING_SERVICE: REPORTS, APPEALS, REPORTING_RULE_EXECUTIONS
* **User-level metrics**: USER_STATISTICS_SERVICE: LIFETIME_ACTION_STATS, SUBMISSION_STATS, USER_SCORES

## Signals

Signals are scoring or evaluation functions used by rules. They range from simple text matching to third-party ML services.

The rules engine calls signals when evaluating conditions that need a score. Signals run in cost order (e.g., text matching runs early). If an early condition fails, the more expensive signals are skipped. Results are memoized and cached for 30 seconds for reuse. Signals extend a shared base class and define metadata, cost, and execution logic.
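
The 30-second result reuse could be pictured as a small time-to-live memo, sketched below. This is an assumption about the general technique and not a description of Coop's cache: the class `TtlMemo`, the string cache key, and the injectable clock are hypothetical.

```typescript
// Caches computed values per key and recomputes them once the TTL has elapsed.
class TtlMemo<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // The clock is injectable so that expiry can be exercised without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  getOrCompute(key: string, compute: () => V): V {
    const currentTime = this.now();
    const hit = this.entries.get(key);
    if (hit !== undefined && hit.expiresAt > currentTime) return hit.value; // still fresh
    const value = compute();
    this.entries.set(key, { value, expiresAt: currentTime + this.ttlMs });
    return value;
  }
}

// Example: cache a signal score for 30 seconds, keyed by signal id plus input.
const signalScoreCache = new TtlMemo<number>(30_000);
const cachedScore = signalScoreCache.getOrCompute("text-match:hello world", () => 0.12);
```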

File: `/server/services/signalsService`

**Signals Base Class:**

File: `/server/services/signalsService/signals/SignalBase.ts`

```typescript
abstract class SignalBase<Input, OutputType, MatchingValue, Type> {
  abstract get id(): SignalId;
  abstract get displayName(): string;
  abstract get description(): string;
  abstract get eligibleInputs(): readonly Input[];
  abstract get outputType(): OutputType;
  abstract get supportedLanguages(): readonly Language[] | 'ALL';
  abstract get integration(): Integration | null;
  abstract getCost(): number;
  abstract run(input: SignalInput): Promise<SignalResult | SignalErrorResult>;
}
```

# Services Required

* PostgreSQL
* Redis
* Kafka
  * Schema registry
  * Zookeeper
* ClickHouse
* ScyllaDB
* Metrics
  * Jaeger
  * OpenTelemetry

# Configuration

Server configuration lives in `/server/.env.example`:

* Database: PostgreSQL
* Analytics, Warehouse: ClickHouse
* Redis: Redis
* Scylla: Scylla

Rules

* Configured in frontend via GraphQL/dashboard UI
* Rate limiting via `maxDailyActions` for each rule
* Rule status: `LIVE`, `DRAFT`, `BACKGROUND`, `EXPIRED`
* Signals: Configured in the rules front-end

User roles

* ADMIN: Full access
* RULES_MANAGER: Can modify live rules
* ANALYST: View insights
* MODERATOR_MANAGER: Manages MRT queues
* MODERATOR: Reviews assigned queues
* CHILD_SAFETY_MODERATOR: Access to NCMEC data
* EXTERNAL_MODERATOR: View-only MRT access

Permissions

* MANAGE_ORG: ADMIN
* MUTATE_LIVE_RULES: ADMIN, RULES_MANAGER
* VIEW_MRT: All moderator roles
* EDIT_MRT_QUEUES: ADMIN, MODERATOR_MANAGER
* VIEW_CHILD_SAFETY_DATA: ADMIN, MODERATOR_MANAGER, CHILD_SAFETY_MODERATOR
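
The permission list above can be read as a lookup table from permission to allowed roles, as in this sketch. The table `PERMISSION_ROLES` and the helper `hasPermission` are hypothetical names, and the expansion of "all moderator roles" into the four `*MODERATOR*` roles plus `ADMIN` is an assumption, since the list does not spell it out.

```typescript
type Role =
  | "ADMIN"
  | "RULES_MANAGER"
  | "ANALYST"
  | "MODERATOR_MANAGER"
  | "MODERATOR"
  | "CHILD_SAFETY_MODERATOR"
  | "EXTERNAL_MODERATOR";

type Permission =
  | "MANAGE_ORG"
  | "MUTATE_LIVE_RULES"
  | "VIEW_MRT"
  | "EDIT_MRT_QUEUES"
  | "VIEW_CHILD_SAFETY_DATA";

// Hypothetical lookup table mirroring the permission list above.
const PERMISSION_ROLES: Record<Permission, readonly Role[]> = {
  MANAGE_ORG: ["ADMIN"],
  MUTATE_LIVE_RULES: ["ADMIN", "RULES_MANAGER"],
  // Assumption: "all moderator roles" means these four roles, plus ADMIN (full access).
  VIEW_MRT: [
    "ADMIN",
    "MODERATOR_MANAGER",
    "MODERATOR",
    "CHILD_SAFETY_MODERATOR",
    "EXTERNAL_MODERATOR",
  ],
  EDIT_MRT_QUEUES: ["ADMIN", "MODERATOR_MANAGER"],
  VIEW_CHILD_SAFETY_DATA: ["ADMIN", "MODERATOR_MANAGER", "CHILD_SAFETY_MODERATOR"],
};

// True when the given role appears in the allow-list for the permission.
function hasPermission(role: Role, permission: Permission): boolean {
  return PERMISSION_ROLES[permission].includes(role);
}

const managerCanEditQueues = hasPermission("MODERATOR_MANAGER", "EDIT_MRT_QUEUES");
const moderatorCanEditQueues = hasPermission("MODERATOR", "EDIT_MRT_QUEUES");
```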

# Action Rules vs Routing Rules

Coop supports two sets of [rules](RULES.md). Each has separate code paths, storage tables, and UI surfaces.

1. [Automated Action rules](RULES.md#automated-action-rules): All rules act in parallel on all events to determine automated actions and MRT decisioning.
2. [Routing rules](RULES.md#routing-rules): Rules are evaluated in order, and the first routing rule that succeeds routes the MRT-bound event into the appropriate queue to await review.
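
First-match routing can be sketched in a few lines. The names `RoutingRuleSketch` and `routeJob` are hypothetical, and the fallback to a default queue when no rule matches is an assumption added so the function always returns a queue; this is not the code in `JobRouting.ts`.

```typescript
// Hypothetical routing rule: a target queue plus a predicate over the job.
interface RoutingRuleSketch<J> {
  queueId: string;
  matches: (job: J) => boolean;
}

// Walks the rules in order and returns the queue of the first rule that matches.
function routeJob<J>(
  job: J,
  orderedRules: RoutingRuleSketch<J>[],
  defaultQueueId: string, // assumption: used when no rule matches
): string {
  for (const rule of orderedRules) {
    if (rule.matches(job)) return rule.queueId;
  }
  return defaultQueueId;
}

interface DemoJob {
  itemTypeId: string;
  userScore: number;
}

const demoRoutingRules: RoutingRuleSketch<DemoJob>[] = [
  { queueId: "low-score-users", matches: (job) => job.userScore < 2 },
  { queueId: "comments", matches: (job) => job.itemTypeId === "comment" },
];

// Both rules match this job, but the earlier rule wins.
const routedQueue = routeJob({ itemTypeId: "comment", userScore: 1 }, demoRoutingRules, "default");
```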

## Rules Engine Rules

Code: `/server/models/rules/RuleModel.ts`

UI: `/client/src/webpages/dashboard/rules/`

Storage tables:

* public.rules
* public.rules_and_actions
* public.rules_and_item_types
* public.rules_and_policies
* public.rules_history

## Routing Rules

Code: `/server/services/manualReviewToolService/modules/JobRouting.ts`

UI: `/client/src/webpages/dashboard/mrt/queue_routing/`

Storage tables:

* manual_review_tool.routing_rules
* manual_review_tool.routing_rules_to_item_types
* manual_review_tool.routing_rules_history
* manual_review_tool.appeal_routing_rules
* manual_review_tool.appeal_routing_rules_to_item_types

## Authentication

Coop supports three authentication methods: API key authentication for programmatic access, session-based authentication for the dashboard UI, and SAML/SSO for enterprise single sign-on.

### API Key Authentication

API keys authenticate programmatic requests to REST endpoints. All API requests require the `x-api-key` header.

1. Middleware extracts the `x-api-key` header
2. Key is validated via SHA-256 hash lookup in the database
3. If valid, `orgId` is set on the request for downstream handlers
4. If the key is invalid or missing, the middleware returns 401 Unauthorized

Key properties:

* Keys are 32-byte random values, SHA-256 hashed before storage
* Each key is scoped to a single team (useful, e.g., when different teams in the same organization have data that should not mix)
* Last-used timestamp is tracked for auditing
* Keys can be rotated (creates a new key, deactivates the old one)

Files:

* Middleware: `/server/utils/apiKeyMiddleware.ts`
* Service: `/server/services/apiKeyService/apiKeyService.ts`
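
The hash-then-lookup flow can be sketched with Node's built-in `crypto` module. This is an illustration under stated assumptions and not the contents of `apiKeyMiddleware.ts`: the function names, the hex encoding of keys and hashes, and the in-memory `Map` standing in for the database lookup are all hypothetical.

```typescript
import { createHash, randomBytes } from "node:crypto";

// SHA-256 of the raw key; the hex encoding is an assumption.
function hashApiKey(rawKey: string): string {
  return createHash("sha256").update(rawKey).digest("hex");
}

// 32 random bytes, shown to the caller once; only the hash would be stored.
function generateApiKey(): { rawKey: string; keyHash: string } {
  const rawKey = randomBytes(32).toString("hex");
  return { rawKey, keyHash: hashApiKey(rawKey) };
}

type AuthResult = { status: 200; orgId: string } | { status: 401 };

// Mirrors the four steps above: read the header, hash it, look it up, accept or reject.
function authenticate(
  headerValue: string | undefined,
  orgIdByKeyHash: Map<string, string>, // stand-in for the database lookup
): AuthResult {
  if (!headerValue) return { status: 401 };
  const orgId = orgIdByKeyHash.get(hashApiKey(headerValue));
  return orgId === undefined ? { status: 401 } : { status: 200, orgId };
}

const issuedKey = generateApiKey();
const storedKeys = new Map<string, string>([[issuedKey.keyHash, "org-1"]]);
const acceptedAuth = authenticate(issuedKey.rawKey, storedKeys);
const rejectedAuth = authenticate("not-a-real-key", storedKeys);
```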

### Session-Based Authentication

Session authentication is used for dashboard UI access via GraphQL.

1. User submits credentials via the GraphQL login mutation
2. Passport's `GraphQLLocalStrategy` validates the email and password
3. Password is verified via bcrypt comparison
4. On success, the user is serialized to the session via `passport.serializeUser()`
5. Session is stored in PostgreSQL via `connect-pg-simple`

Session configuration:

* Store: PostgreSQL-backed
* Cookie: Secure flag in production, 30-day expiry
* Session secret: `process.env.SESSION_SECRET`

Files: `/server/api.ts`

### SAML/SSO Authentication

Enterprise SSO uses SAML with per-organization configuration.

1. User navigates to `/saml/login/{orgId}`
2. Passport's `MultiSamlStrategy` retrieves org-specific SAML settings
3. User is redirected to the configured SAML provider
4. Provider authenticates the user and posts an assertion to the callback URL
5. User email is extracted from the SAML assertion
6. User record is looked up and a session is created

Configuration (per org in the `org_settings` table):

* `saml_enabled`: Boolean flag
* `sso_url`: SAML entry point URL
* `cert`: Certificate for validation

Files:

* `/server/api.ts` (lines 142-227)
* `/server/services/SSOService/SSOService.ts`