AT Protocol PLC Directory Counter

Count and analyze AT Protocol accounts across every PDS (Personal Data Server) host listed in the PLC directory.

Used to generate the stats at sifa.id/stats. See the methodology for a full explanation of what is and isn't counted.

What It Does

  • Crawls the entire PLC directory (~87M DID registrations as of April 2026)
  • Counts accounts per PDS provider
  • Tests reachability by sampling handles and resolving them via the AT Protocol identity system
  • Exports CSV time-series data for daily and all-time views
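
To illustrate the reachability test, here is a minimal sketch of resolving a handle over HTTPS. The AT Protocol handle specification defines two resolution methods: a DNS TXT record at `_atproto.<handle>` and an HTTPS request to `/.well-known/atproto-did`. Only the second is shown. The names `resolveHandleHttps`, `reachabilityPercent`, and `FetchLike` are hypothetical and not taken from this repository's scripts, which may resolve handles differently.

```typescript
// Minimal fetch shape so the sketch can run without network access in tests.
type FetchLike = (url: string) => Promise<{ ok: boolean; text(): Promise<string> }>;

// Resolve a handle via the HTTPS well-known method.
// Returns the DID on success, or null if the handle is unreachable or malformed.
async function resolveHandleHttps(handle: string, fetchFn: FetchLike): Promise<string | null> {
  try {
    const res = await fetchFn(`https://${handle}/.well-known/atproto-did`);
    if (!res.ok) return null;
    const did = (await res.text()).trim();
    return did.startsWith("did:") ? did : null;
  } catch {
    return null; // network, DNS, or TLS failure counts as unreachable
  }
}

// Share of sampled handles that resolved, as a percentage (0 for an empty sample).
async function reachabilityPercent(handles: string[], fetchFn: FetchLike): Promise<number> {
  if (handles.length === 0) return 0;
  let resolved = 0;
  for (const handle of handles) {
    if ((await resolveHandleHttps(handle, fetchFn)) !== null) resolved++;
  }
  return (100 * resolved) / handles.length;
}
```

On Node.js 18+, the native fetch can be passed in as `(url) => fetch(url)`.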

Scripts

| Script | Purpose | Runtime |
| --- | --- | --- |
| count-plc-dids.ts | Full crawl of plc.directory; builds all CSVs | 2–3 hours |
| recheck-reachability.ts | Reachability scan using plc-count-results.json (standalone workflow) | ~7 hours |
| check-reachability.ts | Reachability scan reading/writing the NAS CSV format (production workflow) | ~7 hours |

Quick Start

Prerequisites

  • Node.js 18+ (needs native fetch)
  • tsx (TypeScript runner): npm install -g tsx

Run the Counter

# Full crawl (2-3 hours)
npx tsx count-plc-dids.ts

# Resume if interrupted
npx tsx count-plc-dids.ts --after 2025-09-02
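
A resume flag such as `--after` maps naturally onto the `after` query parameter of the PLC directory's `/export` endpoint, which returns operations created after a given timestamp. The sketch below shows one way such a flag could be parsed and turned into a request URL. The helpers `parseAfterFlag` and `exportUrl` are hypothetical, and the real script may process the flag differently (for example, by normalizing a bare date to a full timestamp).

```typescript
// Read the value that follows "--after" in an argument list, if present.
function parseAfterFlag(argv: string[]): string | undefined {
  const i = argv.indexOf("--after");
  return i >= 0 && i + 1 < argv.length ? argv[i + 1] : undefined;
}

// Build the export URL for one page. `after` is the resume cursor;
// `count` is the page size (1000 operations per page in this README).
function exportUrl(after?: string, count = 1000): string {
  const base = `https://plc.directory/export?count=${count}`;
  return after ? `${base}&after=${encodeURIComponent(after)}` : base;
}
```

In a script, the argument list would typically be `process.argv.slice(2)`.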

Run Reachability Check (Optional, ~7 hours)

# Standalone workflow (reads plc-count-results.json)
npx tsx recheck-reachability.ts

# NAS/production workflow (reads/writes plc-count-by-pds.csv directly)
npx tsx check-reachability.ts

Output Files

| File | Description |
| --- | --- |
| plc-count-results.json | Summary stats, PDS breakdown |
| plc-count-by-pds.csv | Per-PDS totals + reachability % |
| plc-count-by-pds-daily.csv | Daily time-series by PDS |
| data/plc-reachability-through.txt | Date of last reachability scan (NAS workflow) |
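
As an illustration of what a per-PDS CSV with totals and a reachability percentage might look like, here is a small serialization sketch. The column names (`pds`, `total`, `reachable_pct`), the `PdsRow` shape, and the sort order are assumptions made for this example. The actual layout of plc-count-by-pds.csv is not documented here and may differ.

```typescript
// Hypothetical per-host record: account total plus reachability sample results.
interface PdsRow {
  pds: string;
  total: number;
  sampled: number;   // handles tested on this host
  reachable: number; // handles that resolved
}

// Quote a CSV field only when it contains a comma, double quote, or newline.
function csvField(value: string): string {
  return /[",\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

// Serialize rows, largest host first; the percentage is left empty for unsampled hosts.
function toCsv(rows: PdsRow[]): string {
  const header = "pds,total,reachable_pct";
  const lines = rows
    .slice()
    .sort((a, b) => b.total - a.total)
    .map((r) => {
      const pct = r.sampled > 0 ? ((100 * r.reachable) / r.sampled).toFixed(1) : "";
      return [csvField(r.pds), String(r.total), pct].join(",");
    });
  return [header].concat(lines).join("\n") + "\n";
}
```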

Key Findings (April 2026)

| Metric | Count |
| --- | --- |
| Total did:plc | ~87M |
| Unique PDS hosts | ~16K |
| Bluesky (bsky.social + *.bsky.network) | ~61M |
| Self-hosted / other | ~23M (dominated by ~22M pds.trump.com bot accounts) |

Note: our total is higher than Bluesky's reported active user count (~33–36M) because we count every identity ever registered in the PLC directory, including bots, spam, and deleted/deactivated accounts. See the methodology for details.

How It Works

  1. Crawl: Pages through plc.directory/export (1000 ops/page)
  2. Filter: Counts genesis operations (prev = null), each of which registers a new DID
  3. Track: Extracts PDS endpoint from each operation
  4. Sample: Tests 3–50 handles per PDS (scaled to host size) for reachability
  5. Export: Writes CSVs with daily breakdowns
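
The filter and track steps above can be sketched as follows, assuming each export page is JSON Lines in which every entry carries a `did`, a `createdAt` timestamp, and an `operation` object. The function names are hypothetical. For brevity the sketch attributes each DID to the PDS named in its genesis operation and ignores later migrations, which the real script may handle. The `sampleSize` rule is purely an illustrative guess at "3–50 handles, scaled to host size", since the actual scaling formula is not stated here.

```typescript
// Fields of an export entry that this sketch relies on.
interface ExportEntry {
  did: string;
  createdAt: string; // ISO 8601 timestamp
  operation: {
    type: string; // "plc_operation", legacy "create", or "plc_tombstone"
    prev: string | null;
    services?: { atproto_pds?: { endpoint?: string } };
    service?: string; // legacy "create" operations store the PDS URL here
  };
}

// Extract the PDS hostname from an operation, or null if none is declared.
function pdsHost(op: ExportEntry["operation"]): string | null {
  const endpoint = op.services?.atproto_pds?.endpoint ?? op.service;
  if (!endpoint) return null;
  try {
    return new URL(endpoint).hostname;
  } catch {
    return null; // endpoint is not a valid URL
  }
}

// Count genesis operations (prev === null) per PDS host for one page of JSON Lines.
function countGenesis(jsonl: string, counts: Map<string, number> = new Map()): Map<string, number> {
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue;
    const entry = JSON.parse(line) as ExportEntry;
    if (entry.operation.prev !== null) continue; // an update, not a new DID
    const host = pdsHost(entry.operation) ?? "(none)";
    counts.set(host, (counts.get(host) ?? 0) + 1);
  }
  return counts;
}

// Hypothetical scaling rule: sample more handles on larger hosts, clamped to 3..50.
function sampleSize(accounts: number): number {
  const scaled = Math.ceil(Math.log10(Math.max(accounts, 1)) * 10);
  return Math.min(50, Math.max(3, scaled));
}
```

Passing the same `counts` map to successive calls accumulates totals across pages.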

Rate Limiting

  • 75ms delay between pages
  • Backs off for 10 s on HTTP 429 (Too Many Requests) responses
  • Reachability checks: 50ms between handle resolves
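
The pacing described above can be sketched as a fetch wrapper that waits a fixed time after a 429 and pauses between pages. This is a minimal sketch under stated assumptions, not the script's actual code: the names `fetchPage`, `fetchAll`, `PageFetch`, and `Sleep` are hypothetical, and the retry cap of 5 is an assumption because no limit is stated here. The fetch and sleep functions are injected so the logic can be exercised without real delays or network access.

```typescript
type PageFetch = (url: string) => Promise<{ status: number; text(): Promise<string> }>;
type Sleep = (ms: number) => Promise<void>;

// Default sleep based on setTimeout.
const realSleep: Sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Fetch one page, waiting `backoffMs` and retrying whenever the server answers 429.
async function fetchPage(
  url: string,
  fetchFn: PageFetch,
  sleep: Sleep = realSleep,
  backoffMs = 10_000,
  maxRetries = 5, // assumed cap; not stated in this README
): Promise<string> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const res = await fetchFn(url);
    if (res.status === 429) {
      await sleep(backoffMs);
      continue;
    }
    if (res.status !== 200) throw new Error(`HTTP ${res.status} for ${url}`);
    return res.text();
  }
  throw new Error(`Still rate limited after ${maxRetries} retries: ${url}`);
}

// Fetch several pages in order with a fixed pause after each request.
async function fetchAll(
  urls: string[],
  fetchFn: PageFetch,
  sleep: Sleep = realSleep,
  pageDelayMs = 75,
): Promise<string[]> {
  const pages: string[] = [];
  for (const url of urls) {
    pages.push(await fetchPage(url, fetchFn, sleep));
    await sleep(pageDelayMs);
  }
  return pages;
}
```

Keeping the delays as parameters makes it straightforward to use 50 ms for handle resolves and 75 ms for directory pages with the same helper.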

License

MIT