A Product Hunt Clone for AtProto with an emphasis on on-proto community

Initial commit: README and Jetstream URL-harvester scratch

README describes the product model: nominations (with keytrace-verified
canonical listings), three side-by-side metrics (engagement, buzz, votes),
and orthogonal citizenship badges issued via badge.blue.

main.go is a scratch that subscribes to Jetstream for one hour of
backfilled data and extracts unique URLs from records — the seed of the
buzz/engagement measurement pipeline.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Chris Pardy 7d53c247

+202
+2
.gitignore
+/harvester
+*.log
+75
README.md
+# harvester.blue
+
+A Product Hunt for the at-proto ecosystem — but instead of *hunting* the next launch, we *harvest* what's actually growing on the network.
+
+harvester.blue is a directory and ranking site for at-proto apps. Apps don't earn their place by marketing alone; they earn it by being used, being voted for, and being built well.
+
+## Nomination
+
+Apps enter the directory by being **nominated** by a harvester (a logged-in user). The nominator declares the measurable aspects of the app — most importantly:
+
+- The **record schemas (NSIDs) the app owns** — what the firehose pipeline should count as evidence of use.
+- The app's **canonical URLs / domains** — what counts as a mention.
+- Anything else needed to measure the app fairly.
+
+Nominator-declared metadata is the input to all the automatic measurement that follows.
+
+### Verified nominations
+
+Anyone can nominate any app. But if the nominator is verified via keytrace as authorized on behalf of the app's domain, their nomination is treated as **canonical** — it's the version of the listing the site stands behind. Unverified nominations remain visible (and useful for surfacing apps whose makers aren't on harvester.blue yet), but a verified nomination overrides them.
+
+### Recommendation to app builders: link your identity
+
+To make your app's listing canonical and to score higher across the board, line up your identity end-to-end:
+
+- Set up a DID for your service.
+- Host the app under a domain you control.
+- Publish your lexicons under that same name.
+
+When your domain, DID, and NSID authority all match, everything links: verification is straightforward, citizenship badges line up (service records, correctly-published lexicons), and the metrics counted against the listing are unambiguously yours. The more these are linked, the higher the listing counts.
+
+## The three metrics
+
+Each app shows three metrics side-by-side. They are deliberately **not** collapsed into a single "harvester score" — each tells a different story, and the user picks the sort order based on the question they're asking.
+
+### 1. Engagement — are people using it?
+
+Count of distinct DIDs creating, updating, or deleting records under the schemas the app owns. This is signal that the app is alive on-network, not just that it has a landing page.
+
+### 2. Buzz — are people talking about it?
+
+Count of distinct DIDs whose creates/updates contain URLs pointing at the app's domains.
+
+### 3. Votes — do humans think it's cool?
+
+Logged-in users (via Bluesky OAuth) vote for apps they like. Votes are stored as records in the voter's own repo under a harvester.blue lexicon — portable, verifiable, and expensive to fake without real DIDs behind them.
+
+## Citizenship badges
+
+Separately from the three metrics, apps display **citizenship badges** issued as cryptographic attestations via [badge.blue](https://badge.blue). Badges reflect "built right" on at-proto:
+
+- **Lexicons correctly published.**
+- **Service records added to the DID.**
+- **Minimal OAuth scope** — not `transition:generic`.
+- **Composes with other at-proto services** — uses lexicons or services from projects like Tangled (code), and others in the broader at-proto ecosystem.
+- **Brings decentralizing infra.** The principle is that good citizens contribute infrastructure that makes the network more distributed, rather than running entirely on existing shared infra. Self-hosting a PDS is the canonical example, but running a feed generator, labeler, relay, or appview also counts.
+
+Badges are displayed per app. Because they're attestations on badge.blue, they're verifiable independently of harvester.blue itself.
+
+## Why metrics + badges, not one score
+
+Engagement alone rewards popular-but-poorly-built apps. Votes alone reward marketing. Citizenship alone rewards purism without traction. Showing the three metrics side-by-side, with citizenship as an orthogonal layer of attested badges, lets a viewer ask the question that fits them — *who is being used, who is being talked about, who is loved, who is doing it right* — without the site picking the answer for them.
+
+## Status
+
+Early. The current code (`main.go`) is a Go scratch that subscribes to Jetstream and extracts unique URLs from records — the seed of the activity-measurement pipeline.
+
+## Open questions
+
+These are unresolved and tracked here so they don't get lost:
+
+- **Mention counting:** any URL on the app's domain, or only canonical URLs? Apps with user-generated subpaths will otherwise dominate.
+- **Measurement window:** trailing 7d, 30d, all-time? Probably surface both — "growing" vs "established."
+- **Vote budget:** 1 per product ever (PH-style), 1 per day, or a small per-season budget (e.g. 5/week)?
+- **"Composes with other services":** what counts — links out, reads from, writes to, reuses lexicons? And which services qualify — Tangled, Frontpage, Smoke Signal, Whitewind, Linkat, Bookhive, others?
+- **"Decentralizing infra" badge:** how is this verified? Self-hosted PDS is detectable from the DID doc; feed generators and labelers are discoverable via service records; what about appviews or relays?
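The README's engagement and buzz definitions could be sketched roughly as follows. This is not code from the commit: the `Event` and `App` types, the `Metrics` and `hostMatches` helpers, and the sample DIDs and NSIDs are all hypothetical. It also assumes that any URL on a declared domain or one of its subdomains counts as a mention, which the README lists as an open question.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// Event is a hypothetical, already-decoded view of one commit event.
type Event struct {
	DID        string   // repo that made the commit
	Collection string   // NSID of the record's collection
	Operation  string   // "create", "update", or "delete"
	URLs       []string // URLs found in the record body, if any
}

// App is a hypothetical listing holding nominator-declared metadata.
type App struct {
	NSIDs   map[string]bool // record schemas the app owns
	Domains []string        // declared domains, lowercase
}

// hostMatches reports whether rawURL's host equals domain or is a subdomain of it.
func hostMatches(rawURL, domain string) bool {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false
	}
	host := strings.ToLower(u.Hostname())
	return host == domain || strings.HasSuffix(host, "."+domain)
}

// Metrics counts distinct DIDs for engagement (any operation on an owned
// schema) and for buzz (creates/updates containing a URL on an app domain).
func Metrics(app App, events []Event) (engagement, buzz int) {
	engaged := make(map[string]struct{})
	buzzing := make(map[string]struct{})
	for _, e := range events {
		if app.NSIDs[e.Collection] {
			engaged[e.DID] = struct{}{}
		}
		if e.Operation != "create" && e.Operation != "update" {
			continue
		}
		for _, raw := range e.URLs {
			for _, d := range app.Domains {
				if hostMatches(raw, d) {
					buzzing[e.DID] = struct{}{}
				}
			}
		}
	}
	return len(engaged), len(buzzing)
}

func main() {
	app := App{
		NSIDs:   map[string]bool{"app.example.post": true},
		Domains: []string{"example.app"},
	}
	events := []Event{
		{DID: "did:plc:alice", Collection: "app.example.post", Operation: "create"},
		{DID: "did:plc:alice", Collection: "app.example.post", Operation: "delete"},
		{DID: "did:plc:bob", Collection: "app.bsky.feed.post", Operation: "create",
			URLs: []string{"https://example.app/launch"}},
		{DID: "did:plc:carol", Collection: "app.bsky.feed.post", Operation: "create",
			URLs: []string{"https://notexample.app/"}},
	}
	engagement, buzz := Metrics(app, events)
	fmt.Printf("engagement=%d buzz=%d\n", engagement, buzz)
}
```

Keeping a set of DIDs per metric, instead of a running counter, is what makes the counts "distinct": the two events from the same repo contribute one DID to engagement, and a lookalike domain such as `notexample.app` does not count toward buzz.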
+5
go.mod
+module harvester
+
+go 1.25.6
+
+require github.com/coder/websocket v1.8.14 // indirect
+2
go.sum
+github.com/coder/websocket v1.8.14 h1:9L0p0iKiNOibykf283eHkKUHHrpG7f65OE3BhhO7v9g=
+github.com/coder/websocket v1.8.14/go.mod h1:NX3SzP+inril6yawo5CQXx8+fk145lPDC6pumgx0mVg=
+118
main.go
+package main
+
+import (
+	"context"
+	"encoding/json"
+	"fmt"
+	"log"
+	"net/url"
+	"os"
+	"os/signal"
+	"syscall"
+	"time"
+
+	"github.com/coder/websocket"
+)
+
+type event struct {
+	Kind   string `json:"kind"`
+	TimeUs int64  `json:"time_us"`
+	Commit *struct {
+		Operation string          `json:"operation"`
+		Record    json.RawMessage `json:"record"`
+	} `json:"commit"`
+}
+
+func main() {
+	ctx, cancel := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
+	defer cancel()
+
+	connectTime := time.Now()
+	startUs := connectTime.Add(-1 * time.Hour).UnixMicro()
+	endUs := connectTime.UnixMicro()
+	wsURL := fmt.Sprintf("wss://jetstream2.us-east.bsky.network/subscribe?cursor=%d", startUs)
+
+	conn, _, err := websocket.Dial(ctx, wsURL, nil)
+	if err != nil {
+		log.Fatalf("dial: %v", err)
+	}
+	defer conn.Close(websocket.StatusNormalClosure, "")
+	conn.SetReadLimit(4 << 20)
+
+	urls := make(map[string]struct{})
+
+	lastReport := time.Now()
+	report := func() {
+		fmt.Printf("\runique_urls=%d", len(urls))
+	}
+
+	for {
+		_, data, err := conn.Read(ctx)
+		if err != nil {
+			fmt.Println()
+			log.Printf("read: %v", err)
+			break
+		}
+
+		var e event
+		if err := json.Unmarshal(data, &e); err != nil {
+			continue
+		}
+
+		if e.Kind == "commit" && e.Commit != nil && len(e.Commit.Record) > 0 {
+			var body any
+			if err := json.Unmarshal(e.Commit.Record, &body); err == nil {
+				walk(body, urls)
+			}
+		}
+
+		if e.TimeUs >= endUs {
+			fmt.Println()
+			log.Printf("reached 1h of data (cursor caught up to connect time)")
+			break
+		}
+
+		if time.Since(lastReport) >= time.Second {
+			report()
+			lastReport = time.Now()
+		}
+	}
+
+	report()
+	fmt.Println()
+	fmt.Printf("DONE: %d unique URLs\n", len(urls))
+}
+
+func walk(v any, set map[string]struct{}) {
+	switch x := v.(type) {
+	case map[string]any:
+		for _, val := range x {
+			walk(val, set)
+		}
+	case []any:
+		for _, val := range x {
+			walk(val, set)
+		}
+	case string:
+		if isURL(x) {
+			set[x] = struct{}{}
+		}
+	}
+}
+
+func isURL(s string) bool {
+	if len(s) < 8 || len(s) > 2048 {
+		return false
+	}
+	u, err := url.Parse(s)
+	if err != nil {
+		return false
+	}
+	if u.Scheme != "http" && u.Scheme != "https" {
+		return false
+	}
+	if u.Host == "" {
+		return false
+	}
+	return true
+}
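To see what the `isURL` filter in `main.go` lets through, a small standalone check like the one below may help. The `isURL` body is copied from the diff above; the sample strings and the surrounding `main` are illustrative assumptions, not part of the commit.

```go
package main

import (
	"fmt"
	"net/url"
)

// isURL mirrors the filter in main.go: length bounds, http/https scheme,
// and a non-empty host.
func isURL(s string) bool {
	if len(s) < 8 || len(s) > 2048 {
		return false
	}
	u, err := url.Parse(s)
	if err != nil {
		return false
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return false
	}
	if u.Host == "" {
		return false
	}
	return true
}

func main() {
	// Illustrative inputs: the kinds of strings a record walk might encounter.
	samples := []string{
		"https://example.app/launch",            // accepted
		"http://a.b",                            // accepted: short but valid
		"ftp://example.app/file",                // rejected: scheme is not http(s)
		"https:///no-host",                      // rejected: empty host
		"not a url",                             // rejected: no scheme
		"at://did:plc:abc/app.bsky.feed.post/1", // rejected: AT URI, not http(s)
		"short",                                 // rejected: under 8 bytes
	}
	for _, s := range samples {
		fmt.Printf("%-42q %v\n", s, isURL(s))
	}
}
```

Because the check is per-string and exact, two URLs that differ only in a trailing slash or query string would be stored as separate entries in the `urls` set; any normalization before counting would be a later design decision.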