# API reference

base URL: `https://leaflet-search-backend.fly.dev`

## endpoints

### search

```
GET /search?q=<query>&tag=<tag>&platform=<platform>&since=<date>&author=<did|handle>&mode=<mode>&format=<format>&limit=<n>&offset=<n>
```

full-text search across documents and publications.

**parameters:**
| param | type | required | description |
|-------|------|----------|-------------|
| `q` | string | no* | search query (matched against titles and content) |
| `tag` | string | no | filter by tag (documents only) |
| `platform` | string | no | filter by platform: `leaflet`, `pckt`, `offprint`, `greengale`, `whitewind`, `other` |
| `since` | string | no | ISO date; return only documents created after this date |
| `author` | string | no | filter by author: DID (`did:plc:xyz`) or handle (`nate.bsky.social`). handles are resolved server-side via AT Protocol. |
| `mode` | string | no | `keyword` (default), `semantic`, or `hybrid`. semantic uses voyage-4-lite embeddings + turbopuffer ANN. hybrid merges keyword + semantic via reciprocal rank fusion. |
| `format` | string | no | `v2` wraps response in `{"results": [...], "total": N, "hasMore": bool}` |
| `limit` | int | no | max results to return (default 20) |
| `offset` | int | no | pagination offset |

*at least one of `q`, `tag`, or `author` is required
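
the parameter rules above can be sketched as a small URL builder. this is a minimal sketch, not an official client: `build_search_url` and its `fmt` argument are hypothetical names, and it only assembles the URL without sending a request.

```python
from urllib.parse import urlencode

BASE_URL = "https://leaflet-search-backend.fly.dev"


def build_search_url(q=None, tag=None, platform=None, since=None,
                     author=None, mode=None, fmt=None, limit=None, offset=None):
    """Build a /search URL (hypothetical helper, not part of the API)."""
    # The docs require at least one of q, tag, or author.
    if not (q or tag or author):
        raise ValueError("at least one of q, tag, or author is required")
    params = {
        "q": q, "tag": tag, "platform": platform, "since": since,
        "author": author, "mode": mode, "format": fmt,
        "limit": limit, "offset": offset,
    }
    # Drop unset parameters so the server applies its own defaults.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}/search?{query}"
```

for example, `build_search_url(q="rust async", mode="hybrid", limit=5)` yields `https://leaflet-search-backend.fly.dev/search?q=rust+async&mode=hybrid&limit=5`.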

**filter behavior by mode:**
- **keyword**: respects all filters (`tag`, `platform`, `since`, `author`)
- **semantic**: respects `platform` and `author`. ignores `tag` and `since`.
- **hybrid**: keyword half respects all filters, semantic half respects `platform` and `author`. results merged via RRF.
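
because `tag` and `since` are silently dropped by semantic search, a client may want to warn before sending them. a minimal sketch, assuming the filter rules listed above; `FULLY_RESPECTED` and `ignored_filters` are hypothetical names:

```python
# Filters honored by every part of each mode, per the list above.
# For hybrid, only the filters that BOTH halves respect are listed.
FULLY_RESPECTED = {
    "keyword": {"tag", "platform", "since", "author"},
    "semantic": {"platform", "author"},
    "hybrid": {"platform", "author"},
}


def ignored_filters(mode, filters):
    """Return the sorted filter names that the mode ignores, fully or in part."""
    return sorted(set(filters) - FULLY_RESPECTED[mode])
```

for `hybrid`, a returned name means the filter applies to the keyword half only, so semantic hits outside the filter can still appear in the merged results.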

**response:**
```json
[
  {
    "type": "article|looseleaf|publication",
    "uri": "at://did:plc:.../collection/rkey",
    "did": "did:plc:...",
    "title": "document title",
    "snippet": "...matched text...",
    "createdAt": "2025-01-15T...",
    "rkey": "abc123",
    "basePath": "gyst.leaflet.pub",
    "platform": "leaflet",
    "path": "/001",
    "coverImage": "",
    "handle": "@user.bsky.social"
  }
]
```

with `format=v2`:
```json
{
  "results": [ /* same as above */ ],
  "total": 89,
  "hasMore": false
}
```
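
a client that must accept both shapes can normalize them first. a minimal sketch with hypothetical helper names; it assumes `offset` counts results rather than pages, which the reference does not state:

```python
def unwrap_results(payload):
    """Return (results, total, has_more) for either response shape."""
    if isinstance(payload, list):
        # Default format: a bare array with no paging metadata.
        return payload, None, None
    return payload["results"], payload["total"], payload["hasMore"]


def next_offset(payload, offset):
    """Offset for the next page, or None when there is nothing more to fetch."""
    results, _total, has_more = unwrap_results(payload)
    if not has_more:
        return None
    # Assumption: offset is a count of results already returned.
    return offset + len(results)
```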

in hybrid mode, each result also carries `source` and `score` fields:
```json
{
  "source": "keyword+semantic",
  "score": 0.85
}
```
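
for readers unfamiliar with reciprocal rank fusion, here is a generic sketch of the technique, not the server's implementation. the constant `k=60` is a common default and an assumption here, as are the single-source labels `keyword` and `semantic`; scores from this sketch are not on the same scale as the example above.

```python
def rrf_merge(keyword_uris, semantic_uris, k=60):
    """Merge two ranked URI lists with reciprocal rank fusion."""
    scores = {}
    sources = {}
    for name, ranked in (("keyword", keyword_uris), ("semantic", semantic_uris)):
        for rank, uri in enumerate(ranked, start=1):
            # Each list contributes 1 / (k + rank) for every URI it contains.
            scores[uri] = scores.get(uri, 0.0) + 1.0 / (k + rank)
            sources.setdefault(uri, []).append(name)
    # Highest fused score first; ties keep first-seen order.
    merged = sorted(scores, key=lambda uri: scores[uri], reverse=True)
    return [
        {"uri": uri, "source": "+".join(sources[uri]), "score": scores[uri]}
        for uri in merged
    ]
```

a document that appears in both lists accumulates two contributions, so it tends to outrank documents found by only one half.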

**result types:**
- `article`: document in a publication
- `looseleaf`: standalone document (no publication)
- `publication`: the publication itself (only returned for text queries, not tag/platform filters)

**ranking:** BM25 text relevance combined with recency. text relevance is primary; recent docs are boosted (~1 point per 30 days).

### similar

```
GET /similar?uri=<at-uri>
```

find semantically similar documents using vector similarity (voyage-4-lite embeddings + turbopuffer ANN).

**parameters:**
| param | type | required | description |
|-------|------|----------|-------------|
| `uri` | string | yes | AT-URI of source document |

**response:** same format as search (array of results)
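
an AT-URI contains `:` and `/`, so it should be percent-encoded when placed in the query string. a minimal sketch; `build_similar_url` is a hypothetical helper and the `at://` prefix check is a client-side convenience, not documented server behavior:

```python
from urllib.parse import urlencode

BASE_URL = "https://leaflet-search-backend.fly.dev"


def build_similar_url(uri):
    """Build a /similar URL with the AT-URI percent-encoded."""
    if not uri.startswith("at://"):
        raise ValueError("expected an AT-URI starting with at://")
    return f"{BASE_URL}/similar?{urlencode({'uri': uri})}"
```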

### tags

```
GET /tags
```

list all tags with document counts, sorted by popularity.

**response:**
```json
[
  {"tag": "programming", "count": 42},
  {"tag": "rust", "count": 15}
]
```

### popular

```
GET /popular
```

popular search queries.

**response:**
```json
[
  {"query": "rust async", "count": 12},
  {"query": "leaflet", "count": 8}
]
```

### stats

```
GET /stats
```

index statistics and request timing.

**response:**
```json
{
  "documents": 11445,
  "publications": 2603,
  "embeddings": 10900,
  "searches": 5000,
  "errors": 5,
  "cache_hits": 1200,
  "cache_misses": 800,
  "timing": {
    "search_keyword": {"count": 1000, "avg_ms": 25, "p50_ms": 20, "p95_ms": 50, "p99_ms": 80, "max_ms": 150},
    "search_semantic": {"count": 100, "avg_ms": 350, "p50_ms": 340, ...},
    "search_hybrid": {"count": 50, "avg_ms": 380, ...},
    "similar": {"count": 200, "avg_ms": 150, ...},
    "tags": {"count": 500, "avg_ms": 5, ...},
    "popular": {"count": 300, "avg_ms": 3, ...}
  }
}
```
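
two numbers a monitoring script might derive from this payload are the cache hit rate and the slowest endpoint. a minimal sketch with hypothetical helper names, assuming the field names shown above:

```python
def cache_hit_rate(stats):
    """Fraction of cache lookups that hit, or 0.0 when there were none."""
    hits, misses = stats["cache_hits"], stats["cache_misses"]
    total = hits + misses
    return hits / total if total else 0.0


def slowest_endpoint(stats):
    """Name of the timing entry with the highest average latency."""
    timing = stats["timing"]
    return max(timing, key=lambda name: timing[name]["avg_ms"])
```

with the sample values above (1200 hits, 800 misses), the hit rate is 0.6 and the slowest entry by `avg_ms` is `search_hybrid`.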

### activity

```
GET /activity
```

hourly activity counts (last 24 hours).

**response:**
```json
[12, 8, 5, 3, 2, 1, 0, 0, 1, 5, 15, 25, 30, 28, 22, 18, 20, 25, 30, 35, 28, 20, 15, 10]
```
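
the array can be reduced to a total and a peak bucket. a minimal sketch; `summarize_activity` is a hypothetical helper, and because the reference does not say whether index 0 is the oldest or the newest hour, the sketch reports the peak as an index rather than a clock time:

```python
def summarize_activity(counts):
    """Summarize a 24-element hourly activity array."""
    if len(counts) != 24:
        raise ValueError("expected 24 hourly buckets")
    # Index of the busiest bucket (first one wins on ties).
    peak = max(range(24), key=counts.__getitem__)
    return {"total": sum(counts), "peak_index": peak, "peak_count": counts[peak]}
```

for the sample response above this returns a total of 358 with the peak of 35 at index 19.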

### dashboard

```
GET /api/dashboard
```

rich dashboard data for the analytics UI. includes platform counts (there is no separate `/platforms` endpoint).

**response:**
```json
{
  "startedAt": 1705000000,
  "searches": 5000,
  "publications": 2603,
  "documents": 11445,
  "platforms": [{"platform": "leaflet", "count": 5399}],
  "tags": [{"tag": "programming", "count": 42}],
  "timeline": [{"date": "2025-01-15", "count": 25}],
  "topPubs": [{"name": "gyst", "basePath": "gyst.leaflet.pub", "count": 150}],
  "timing": {...}
}
```

### health

```
GET /health
```

**response:**
```json
{"status": "ok"}
```

## building URLs

a document's web URL is built from its `basePath` and a platform-specific pattern:

| platform | URL pattern | example |
|----------|-------------|---------|
| leaflet | `https://{basePath}/{rkey}` | `https://gyst.leaflet.pub/3ldasifz7bs2l` |
| pckt | `https://{basePath}{path}` | `https://devlog.pckt.blog/some-slug` |
| offprint | `https://{basePath}{path}` | `https://dalisay.offprint.app/a/3me5ucj7vxf23-title-slug` |
| greengale | `https://{basePath}{path}` | `https://3fz.greengale.app/001` |
| whitewind | `https://whtwnd.com/{did}/{rkey}` | `https://whtwnd.com/did:plc:.../3abc123` |
| publications | `https://{basePath}` | `https://gyst.leaflet.pub` |
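
the table above maps directly onto a lookup over a search result. a minimal sketch; `document_url` is a hypothetical helper, and it returns `None` for the `other` platform because no pattern is documented for it:

```python
def document_url(result):
    """Build a web URL for a search result, following the table above."""
    if result["type"] == "publication":
        return f"https://{result['basePath']}"
    platform = result["platform"]
    if platform == "leaflet":
        return f"https://{result['basePath']}/{result['rkey']}"
    if platform in ("pckt", "offprint", "greengale"):
        # path already starts with a slash in the examples above.
        return f"https://{result['basePath']}{result['path']}"
    if platform == "whitewind":
        return f"https://whtwnd.com/{result['did']}/{result['rkey']}"
    # No documented pattern (e.g. platform "other").
    return None
```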