···6464- **AT Protocol network** — source of all Tangled content
6565- **Tap** — filtered event delivery from the AT Protocol firehose (deployed on Railway)
6666- **Turso/libSQL** — relational storage, Tantivy-backed FTS, and native vector search
6767-- **Embedding provider** — generates vectors for semantic search
6868-- **Railway** — deployment platform for Twister services and Tap
6767+- **Ollama** — local embedding model server (nomic-embed-text or EmbeddingGemma); deployed as a Railway sidecar service
6868+- **Railway** — deployment platform for Twister services, Tap, and Ollama
69697070## 7. Architecture Summary
7171···106106| -------------- | -------------------------------------------- | -------------------------- |
107107| `api` | HTTP search, graph summary, and document API | Railway service (public) |
108108| `indexer` | Tap consumer, normalizer, DB writer | Railway service (internal) |
109109-| `embed-worker` | Async embedding generation | Optional Railway service |
109109+| `embed-worker` | Async embedding generation via Ollama | Optional Railway service |
110110+| `ollama` | Local embedding model server | Railway service (internal) |
110111| `tap` | ATProto sync | Railway (already deployed) |
111112112113## 9. Repository Structure
···141142```
142143143144## 11. Technology Choices
145145+146146+### Embedding: Ollama (self-hosted)
147147+148148+Embeddings are generated locally via Ollama rather than an external API service. This eliminates per-token costs, external service dependencies, and data egress concerns.
149149+150150+**Recommended models (in order of preference):**
151151+152152+| Model | Parameters | Dimensions | Model Size | Notes |
153153+|-------|-----------|------------|----------------|-------|
154154+| nomic-embed-text-v1.5 | 137M | 768 (Matryoshka: 64–768) | ~262 MB (F16) | 8192 context, battle-tested, Railway template exists |
155155+| EmbeddingGemma | 308M | 768 | <200 MB (quantized) | Best-in-class MTEB for size, released Sept 2025 |
156156+| all-minilm | 23M | 384 | ~46 MB | Budget option, lower quality |
157157+158158+**Go integration:** Use the official Ollama Go client (`github.com/ollama/ollama/api`) with the `Embed()` method. The embed-worker calls Ollama over Railway's internal network (`ollama.railway.internal:11434`).
159159+160160+**Railway deployment:** Ollama runs as a separate Railway service (~1–2 GB RAM, 1–2 vCPU, ~$10–30/mo). The nomic-embed Railway template provides a proven starting point. No cold starts on always-on services; model loads in 2–10 seconds on first request after deploy.
144161145162### Language: Go
146163
+32-1
docs/api/specs/03-data-model.md
···9393) WITH (weights='title=3.0,repo_name=2.5,author_handle=2.0,summary=1.5,tags_json=1.2,body=1.0');
9494```
95959696+### FTS Maintenance
9797+9898+Turso's Tantivy-backed FTS uses `NoMergePolicy` — segment count grows with writes and is never automatically compacted. This increases query fan-out over time.
9999+100100+**Required maintenance:** Run `OPTIMIZE INDEX idx_documents_fts;` periodically (e.g., daily cron or after bulk backfill). This merges segments and reclaims space.
101101+102102+**Known limitations:**
103103+- No read-your-writes within a transaction — FTS queries see a pre-commit snapshot
104104+- No snippet function (use `fts_highlight()` for highlighting)
105105+- FTS is experimental in Turso; requires the `fts` feature flag
106106+96107## 4. Embeddings Table
9710898109```sql
···108119);
109120```
110121111111-The vector dimension (768) is configurable by model. Changing models requires a new column or table migration.
122122+The vector dimension (768) matches nomic-embed-text-v1.5 and EmbeddingGemma defaults. Changing models may require a new column or table migration if the dimension changes.
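Because the column dimension is fixed, it helps to reject mismatched vectors before they reach the database. The sketch below assumes vectors are bound to `vector32(?)` as bracketed text such as `[0.1,0.2]`; `vectorLiteral` is a hypothetical helper, not part of any library.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// embeddingDim matches the 768-dim column described above.
const embeddingDim = 768

// vectorLiteral (hypothetical helper) checks the dimension and renders the
// vector as bracketed, comma-separated text for binding to vector32(?).
func vectorLiteral(v []float32, dim int) (string, error) {
	if len(v) != dim {
		return "", fmt.Errorf("embedding has %d dims, column expects %d", len(v), dim)
	}
	parts := make([]string, len(v))
	for i, x := range v {
		parts[i] = strconv.FormatFloat(float64(x), 'g', -1, 32)
	}
	return "[" + strings.Join(parts, ",") + "]", nil
}

func main() {
	// A tiny 3-dim vector keeps the output readable.
	lit, err := vectorLiteral([]float32{0.5, -1, 0.25}, 3)
	fmt.Println(lit, err)

	// A 384-dim vector (e.g. from all-minilm) is rejected for a 768-dim column.
	_, err = vectorLiteral(make([]float32, 384), embeddingDim)
	fmt.Println(err)
}
```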
123123+124124+### Vector Index Tuning
125125+126126+The DiskANN index accepts tuning parameters at creation time:
127127+128128+```sql
129129+CREATE INDEX idx_embeddings_vec ON document_embeddings(
130130+ libsql_vector_idx(embedding, 'metric=cosine', 'max_neighbors=50', 'search_l=200')
131131+);
132132+```
133133+134134+| Parameter | Default | Description |
135135+|-----------|---------|-------------|
136136+| `max_neighbors` | 3*sqrt(D) | Graph connectivity; higher = better recall, more storage |
137137+| `search_l` | 200 | Neighbors visited during search; higher = better recall, slower |
138138+| `insert_l` | 70 | Neighbors visited during insert |
139139+| `alpha` | 1.2 | Graph sparsity factor |
140140+| `compress_neighbors` | — | Quantize neighbor vectors for storage savings |
141141+142142+Start with defaults and tune after measuring recall on representative queries.
112143113144## 5. Sync State Table
114145
+1
docs/api/specs/04-data-pipeline.md
···324324- Jobs are retried with exponential backoff up to a max attempt count
325325- After max attempts, the job enters `dead` state
326326- The embed-worker exposes failed job count as a metric
327327+- If Ollama is unreachable (sidecar down), all pending jobs pause until connectivity is restored
327328328329### DB Failures
329330
+7-1
docs/api/specs/05-search.md
···6161fts_highlight(d.body, '<mark>', '</mark>', ?) AS body_snippet
6262```
63636464+### FTS Operational Notes
6565+6666+- **Segment merging:** Turso FTS uses Tantivy's `NoMergePolicy`. Run `OPTIMIZE INDEX idx_documents_fts;` after bulk writes (backfill) and periodically in production to keep query performance stable.
6767+- **Read-your-writes:** FTS queries within the same transaction see a pre-commit snapshot. If a document is written and immediately searched in the same transaction, FTS will not find it. The indexer and API are separate processes, so this is not a concern in normal operation.
6868+- **Feature flag:** Turso FTS requires the `fts` feature flag to be enabled on the database.
6969+6470## 3. Semantic Search
65716672### Query Flow
67736868-1. Convert user query text to embedding via the configured provider
7474+1. Convert user query text to embedding via Ollama (self-hosted)
69752. Query `vector_top_k` for nearest neighbors
70763. Join back to `documents` to get metadata
71774. Filter out deleted/hidden documents
+15-18
docs/api/specs/06-operations.md
···4141| `SEARCH_MAX_LIMIT` | `100` | Maximum results per page |
4242| `SEARCH_DEFAULT_MODE` | `keyword` | Default search mode |
43434444-### Embedding
4444+### Embedding (Ollama — self-hosted)
45454646-| Variable | Default | Description |
4747-| ---------------------- | ------- | ---------------------------------------------------- |
4848-| `EMBEDDING_PROVIDER` | — | Provider name (e.g., `openai`, `ollama`, `voyageai`) |
4949-| `EMBEDDING_MODEL` | — | Model name (e.g., `text-embedding-3-small`) |
5050-| `EMBEDDING_API_KEY` | — | Provider API key |
5151-| `EMBEDDING_API_URL` | — | Provider base URL (for self-hosted) |
5252-| `EMBEDDING_DIM` | `768` | Vector dimensionality |
5353-| `EMBEDDING_BATCH_SIZE` | `32` | Batch size for embed-worker |
4646+| Variable | Default | Description |
4747+| ---------------------- | ------------------------------------------ | ---------------------------------------------- |
4848+| `OLLAMA_URL` | `http://ollama.railway.internal:11434` | Ollama server URL |
4949+| `EMBEDDING_MODEL` | `nomic-embed-text` | Ollama model name |
5050+| `EMBEDDING_DIM` | `768` | Vector dimensionality (must match model) |
5151+| `EMBEDDING_BATCH_SIZE` | `32` | Documents per embedding batch |
54525553### Hybrid Search
5654···8785SEARCH_DEFAULT_LIMIT=20
8886SEARCH_MAX_LIMIT=100
89879090-# Embedding (Phase 2)
9191-# EMBEDDING_PROVIDER=openai
9292-# EMBEDDING_MODEL=text-embedding-3-small
9393-# EMBEDDING_API_KEY=sk-...
8888+# Embedding — Ollama (Phase 2)
8989+# OLLAMA_URL=http://ollama.railway.internal:11434
9090+# EMBEDDING_MODEL=nomic-embed-text
9491# EMBEDDING_DIM=768
95929693# Server
···285282| ------------------- | --------------------------------- |
286283| `TURSO_AUTH_TOKEN` | Turso database authentication |
287284| `TAP_AUTH_PASSWORD` | Tap admin API authentication |
288288-| `EMBEDDING_API_KEY` | Embedding provider authentication |
285285+| `OLLAMA_URL` | Ollama sidecar connection (no secret if internal networking) |

289286| `ADMIN_AUTH_TOKEN` | Admin endpoint authentication |
290287291288### Admin Endpoints
···331328| api | `twister api` | `GET /healthz` | yes |
332329| indexer | `twister indexer` | `GET :9090/health` | no |
333330| embed-worker | `twister embed-worker` | `GET :9091/health` | no |
331331+| ollama | (Railway template) | `GET /api/tags` | no |
334332335333All services share the same Docker image. Railway uses the start command to select the subcommand.
336334···356354TAP_AUTH_PASSWORD=...
357355INDEXED_COLLECTIONS=sh.tangled.repo,sh.tangled.repo.issue,sh.tangled.repo.pull,sh.tangled.string,sh.tangled.actor.profile
358356359359-# Embed-worker (Phase 2)
360360-# EMBEDDING_PROVIDER=openai
361361-# EMBEDDING_MODEL=text-embedding-3-small
362362-# EMBEDDING_API_KEY=sk-...
357357+# Embed-worker + Ollama (Phase 2)
358358+# OLLAMA_URL=http://ollama.railway.internal:11434
359359+# EMBEDDING_MODEL=nomic-embed-text
363360```
364361365362Railway supports referencing other services' variables with `${{service.VAR}}` syntax, which is useful for linking the indexer to Tap's domain.
+166
docs/api/specs/09-search-site.md
···11+---
22+title: "Spec 09 — Search Site"
33+updated: 2026-03-23
44+---
55+66+A minimal static site that serves as both the public Twister API documentation and a live search showcase. Dark mode only, with Alpine.js loaded from a CDN and no build step.
77+88+## 1. Purpose
99+1010+- Give developers a browsable reference for the Twister search API
1111+- Give anyone a way to try search against live indexed Tangled content
1212+- Provide a shareable public URL before the mobile app ships
1313+1414+## 2. Scope
1515+1616+In scope:
1717+1818+- Static HTML/CSS/JS (Alpine.js, no bundler)
1919+- API reference pages hand-written from the spec docs
2020+- Live search input wired to `GET /search`
2121+- Result rendering with type-aware cards (repo, issue, PR, profile, string)
2222+- Filter controls for collection, type, author, language, state
2323+- Pagination
2424+- Responsive layout (mobile-friendly, single breakpoint)
2525+2626+Out of scope:
2727+2828+- Auth, OAuth, or any write operations
2929+- Semantic or hybrid mode toggle (keyword only for MVP)
3030+- Server-side rendering or static-site generator
3131+- Analytics or telemetry
3232+3333+## 3. Pages
3434+3535+| Route | Content |
3636+| ----------------- | --------------------------------------------------------------------------- |
3737+| `/` | Search input + results (the homepage is the search page) |
3838+| `/docs` | API overview: base URL, auth (none for public), rate limits, response shape |
3939+| `/docs/search` | `GET /search` — parameters, filters, response contract, examples |
4040+| `/docs/documents` | `GET /documents/{id}` — request/response, examples |
4141+| `/docs/health` | `GET /healthz`, `GET /readyz` — purpose and expected responses |
4242+4343+## 4. Search Page Behavior
4545+1. Text input with a submit button. No debounced search-as-you-type for MVP.
4646+2. On submit, fetch `GET /search?q={query}&limit=20` as a same-origin relative path (plus any active filters).
4747+3. Render results as a vertical list of cards.
4848+4. Each card shows: `record_type` badge, `title`, `body_snippet` (with `<mark>` highlights preserved), `author_handle`, `repo_name` (when present), `updated_at` relative time.
4949+5. Clicking a result opens the canonical Tangled URL (`https://tangled.org/{handle}/{repo}` for repos, etc.) in a new tab.
5050+6. "Load more" button appends the next page (`offset += limit`).
5151+7. Empty state: "No results" message.
5252+8. Error state: inline message if the API is unreachable.
5353+9. Filter bar above results: dropdowns/inputs for `type`, `language`, `author`. Filters are query params so URLs are shareable.
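The request URL and the "Load more" step can be illustrated as follows. The real site would build this in `search.js`; the sketch uses Go only to stay consistent with the rest of the codebase, and `SearchParams`, `searchURL`, and `nextPage` are hypothetical names. The query parameter names (`q`, `limit`, `offset`, `type`, `language`, `author`) follow the behavior listed above.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// SearchParams holds the query text, active filters, and paging state.
type SearchParams struct {
	Query    string
	Type     string
	Language string
	Author   string
	Limit    int
	Offset   int
}

// searchURL builds a relative, shareable URL; empty filters are omitted.
func searchURL(p SearchParams) string {
	q := url.Values{}
	q.Set("q", p.Query)
	q.Set("limit", strconv.Itoa(p.Limit))
	if p.Offset > 0 {
		q.Set("offset", strconv.Itoa(p.Offset))
	}
	if p.Type != "" {
		q.Set("type", p.Type)
	}
	if p.Language != "" {
		q.Set("language", p.Language)
	}
	if p.Author != "" {
		q.Set("author", p.Author)
	}
	return "/search?" + q.Encode() // Encode sorts keys alphabetically
}

// nextPage advances the offset by one page, as "Load more" does.
func nextPage(p SearchParams) SearchParams {
	p.Offset += p.Limit
	return p
}

func main() {
	p := SearchParams{Query: "atproto sync", Type: "repo", Limit: 20}
	fmt.Println(searchURL(p))
	fmt.Println(searchURL(nextPage(p)))
}
```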
5454+5555+## 5. API Docs Pages
5656+5757+Hand-written HTML mirroring the contracts in spec 05 (search) and spec 08 (app integration). Each page includes:
5858+5959+- Endpoint signature (method, path)
6060+- Parameter table (name, type, default, description)
6161+- Example request (curl)
6262+- Example response (JSON block with syntax highlighting via `<pre><code>`)
6363+6464+No generated docs tooling. The pages are static and updated manually when the API changes.
6565+6666+## 6. Styling
6767+6868+Minimal CSS, no utility framework.
6969+7070+### Tokens
7171+7272+```css
7373+:root {
7474+ --bg: #0e0e0e;
7575+ --surface: #1a1a1a;
7676+ --border: #2a2a2a;
7777+ --text: #e0e0e0;
7878+ --text-dim: #888;
7979+ --accent: #7aa2f7;
8080+ --mark-bg: #7aa2f733;
8181+ --mono: "Google Sans Mono", monospace;
8282+ --sans: "Google Sans", sans-serif;
8383+ --radius: 6px;
8484+}
8585+```
8686+8787+### Rules
8989+- Dark theme only; no light-mode variant.
9090+- `Google Sans` for body text. `Google Sans Mono` for code, JSON, and badges.
9191+- Fonts loaded via Google Fonts `<link>`. System fallbacks: `sans-serif`, `monospace`.
9292+- Max content width: `720px`, centered.
9393+- Cards: `var(--surface)` background, `var(--border)` border, `var(--radius)` corners.
9494+- `<mark>` tags in snippets styled with `var(--mark-bg)` background and `var(--accent)` text.
9595+- Code blocks: `var(--surface)` background, horizontal scroll, no wrapping.
9696+- Links: `var(--accent)`, no underline, underline on hover.
9797+- Inputs and buttons: `var(--surface)` background, `var(--border)` border, `var(--text)` text.
9898+- One breakpoint at `640px` for mobile: full-width cards, stacked filter bar.
9999+100100+## 7. Package Design
101101+102102+The site lives in `internal/view/` as a self-contained Go package. It owns the templates, static assets, and HTTP handlers. The `api` package mounts `view.Handler()` into its router — nothing else leaks out.
103103+104104+### Exports
105105+106106+The package exposes a single constructor:
107107+108108+```go
109109+// Handler returns an http.Handler that serves the site pages and static assets.
110110+func Handler() http.Handler
111111+```
112112+113113+The `api` package calls `view.Handler()` and mounts it as a fallback after API routes.
114114+115115+### Package Structure
116116+117117+```text
118118+internal/view/
119119+ view.go # Handler(), route setup, embed directives
120120+ templates/
121121+ layout.html # Shared shell (head, nav, footer)
122122+ index.html # Search page
123123+ docs/
124124+ index.html # API overview
125125+ search.html # GET /search docs
126126+ documents.html # GET /documents/{id} docs
127127+ health.html # Health endpoints docs
128128+ static/
129129+ style.css # All styles, single file
130130+ search.js # Search fetch, render, pagination, filters
131131+```
132132+133133+### Embedding
134134+135135+`view.go` uses `//go:embed` to bundle `templates/` and `static/`. Templates are parsed once at init. Static assets are served under `/static/` via `http.FileServer`.
136136+137137+### Routing
138138+139139+`view.Handler()` returns a mux that handles:
140140+141141+| Pattern | Handler |
142142+| --- | --- |
143143+| `GET /` | Render `index.html` |
144144+| `GET /docs` | Render `docs/index.html` |
145145+| `GET /docs/search` | Render `docs/search.html` |
146146+| `GET /docs/documents` | Render `docs/documents.html` |
147147+| `GET /docs/health` | Render `docs/health.html` |
148148+| `GET /static/*` | Serve embedded CSS/JS files |
149149+150150+## 8. Configuration
151151+152152+Because the site is served from the same origin as the API, search requests use relative paths (`/search?q=...`). No `API_BASE` config is needed; the browser's origin is the API.
153153+154154+## 9. Local Development
155155+156156+Run `twister api` locally. The site is served at `http://localhost:8080/` alongside the API endpoints. No separate dev server or file server required.
157157+158158+The API docs pages render without any indexed data. The search page needs a running indexer and populated database to return results.
159159+160160+## 10. Constraints
161161+162162+- No runtime dependencies besides Alpine.js, loaded via CDN.
163163+- Total site weight target: under 50 KB, excluding fonts and the Alpine.js CDN script.
164164+- Works in modern browsers (last 2 versions of Chrome, Firefox, Safari).
165165+- All fetch calls include error handling for network failures and non-200 responses.
166166+- No CORS concerns — the site and API share an origin.
+11-10
docs/api/specs/README.md
···10101111## Specifications
12121313-| # | Document | Description |
1414-|---|----------|-------------|
1515-| 1 | [Architecture](01-architecture.md) | Purpose, goals, design principles, system context, tech choices |
1616-| 2 | [Tangled Lexicons](02-tangled-lexicons.md) | `sh.tangled.*` record schemas and fields |
1717-| 3 | [Data Model](03-data-model.md) | Database schema, search documents, sync state |
1818-| 4 | [Data Pipeline](04-data-pipeline.md) | Tap integration, normalization, failure handling |
1919-| 5 | [Search](05-search.md) | Search modes, API contract, scoring, filtering |
2020-| 6 | [Operations](06-operations.md) | Configuration, observability, security, deployment |
2121-| 7 | [Graph Backfill](07-graph-backfill.md) | Seed-based user discovery and content backfill |
2222-| 8 | [App Integration](08-app-integration.md) | Mobile-facing contracts for search and graph summaries |
1313+| # | Document | Description |
1414+| --- | ------------------------------------------ | --------------------------------------------------------------- |
1515+| 1 | [Architecture](01-architecture.md) | Purpose, goals, design principles, system context, tech choices |
1616+| 2 | [Tangled Lexicons](02-tangled-lexicons.md) | `sh.tangled.*` record schemas and fields |
1717+| 3 | [Data Model](03-data-model.md) | Database schema, search documents, sync state |
1818+| 4 | [Data Pipeline](04-data-pipeline.md) | Tap integration, normalization, failure handling |
1919+| 5 | [Search](05-search.md) | Search modes, API contract, scoring, filtering |
2020+| 6 | [Operations](06-operations.md) | Configuration, observability, security, deployment |
2121+| 7 | [Graph Backfill](07-graph-backfill.md) | Seed-based user discovery and content backfill |
2222+| 8 | [App Integration](08-app-integration.md) | Mobile-facing contracts for search and graph summaries |
2323+| 9 | [Search Site](09-search-site.md) | Static site for API docs and live search |
+1
docs/api/tasks/README.md
···3838- Restart does not lose sync position
3939- Reindex exists for repair
4040- Graph backfill populates initial content from seed users
4141+- A static search site with API docs is publicly accessible
+59-1
docs/api/tasks/phase-1-mvp.md
···1616- Restart does not lose sync position
1717- Reindex exists for repair
1818- Graph backfill populates initial content from seed users
1919+- A static search site with API docs is publicly accessible
19202021## M0 — Repository Bootstrap ✅
2122···261262262263A user can search Tangled content reliably with keyword search.
263264265265+## M5a — Search Site
266266+267267+refs: [specs/09-search-site.md](../specs/09-search-site.md)
268268+269269+### Goal
270270+271271+Ship a static site that doubles as public API documentation and a live search demo. Alpine.js via CDN for reactivity, no build step.
272272+273273+### Deliverables
274274+275275+- `internal/view/` package exporting `Handler() http.Handler`
276276+- Embedded templates (`templates/`) and static assets (`static/`) via `//go:embed`
277277+- Search page (`/`) wired to `GET /search` with result cards, filters, and pagination
278278+- API docs pages (`/docs/*`) covering search, documents, and health endpoints
279279+- Dark-mode-only styling with Google Sans fonts and minimal CSS tokens
280280+281281+### Tasks
282282+283283+- [ ] Create `internal/view/` package with `view.go`, `templates/`, and `static/` directories
284284+- [ ] Implement `Handler()` that returns an `http.Handler` with routes for all pages and `/static/*`
285285+- [ ] Embed templates and static assets via `//go:embed`; parse templates once at init
286286+- [ ] Use a shared `layout.html` template for the shell (head, nav, footer)
287287+- [ ] Mount `view.Handler()` in the `api` package router as a fallback after API routes
288288+- [ ] Build search page:
289289+ - Text input + submit
290290+ - Fetch `GET /search` with relative path (same origin)
291291+ - Render result cards with type badge, title, snippet (preserve `<mark>`), author, repo, relative time
292292+ - "Load more" pagination via offset
293293+ - Filter bar: type, language, author (reflected in URL query params)
294294+ - Empty and error states
295295+- [ ] Build API docs pages:
296296+ - `/docs` — overview (base URL, response shape, no auth)
297297+ - `/docs/search` — `GET /search` params, filters, example curl, example response
298298+ - `/docs/documents` — `GET /documents/{id}` request/response
299299+ - `/docs/health` — `GET /healthz`, `GET /readyz`
300300+- [ ] Implement `style.css` with design tokens (`--bg`, `--surface`, `--border`, `--accent`, etc.)
301301+- [ ] Load Google Sans and Google Sans Mono via Google Fonts `<link>`
302302+- [ ] Result card links open canonical Tangled URLs in new tab
303303+- [ ] Verify total site weight under 50 KB (excluding fonts and Alpine CDN)
304304+305305+### Verification
306306+307307+- [ ] `twister api` serves the search page at `http://localhost:8080/`
308308+- [ ] API endpoints (`/search`, `/healthz`, etc.) still work alongside the site
309309+- [ ] Searching a known repo name shows it in results
310310+- [ ] Filter by type restricts results to that type
311311+- [ ] "Load more" appends next page of results
312312+- [ ] API docs pages render correct endpoint signatures, parameter tables, and example JSON
313313+- [ ] Site works on mobile viewport (stacked layout at 640px)
314314+- [ ] Site works with API unavailable (error state shown, no crash)
315315+- [ ] All pages share consistent styling and navigation
316316+317317+### Exit Criteria
318318+319319+A user can search Tangled content and read API docs from a public URL without installing anything.
320320+264321## M6 — Railway Deployment
265322266323refs: [specs/06-operations.md](../specs/06-operations.md)
···336393 2. For each document, re-run normalization from stored fields (or re-fetch if source available)
337394 3. Update FTS-relevant fields
338395 4. Upsert back to store
339339- 5. Log progress (N/total, errors)
396396+ 5. Run `OPTIMIZE INDEX idx_documents_fts` after bulk reindex to merge Tantivy segments
397397+ 6. Log progress (N/total, errors)
340398- [ ] Implement `POST /admin/reindex` endpoint (behind `ENABLE_ADMIN_ENDPOINTS` + `ADMIN_AUTH_TOKEN`)
341399- [ ] Add error summary output on completion
342400- [ ] Exit non-zero on unrecoverable failures
+70-15
docs/api/tasks/phase-2-semantic.md
···11---
22title: "Phase 2 — Semantic Search"
33-updated: 2026-03-22
33+updated: 2026-03-23
44---
5566# Phase 2 — Semantic Search
7788-Add embedding generation and vector-based retrieval on top of the keyword baseline.
88+Add embedding generation and vector-based retrieval on top of the keyword baseline, using self-hosted Ollama for embeddings instead of external API services.
991010-## M8 — Embedding Pipeline
1010+## M8 — Ollama Sidecar and Embedding Pipeline
11111212-refs: [specs/03-data-model.md](../specs/03-data-model.md), [specs/05-search.md](../specs/05-search.md)
1212+refs: [specs/01-architecture.md](../specs/01-architecture.md), [specs/03-data-model.md](../specs/03-data-model.md), [specs/05-search.md](../specs/05-search.md)
13131414### Goal
15151616-Add asynchronous embedding generation without blocking ingestion.
1616+Deploy Ollama as a Railway sidecar and add asynchronous embedding generation without blocking ingestion.
17171818### Deliverables
19192020+- Ollama Railway service running nomic-embed-text-v1.5 (or EmbeddingGemma)
2021- `embedding_jobs` table operational (schema from M1)
2122- `embed-worker` subcommand
2222-- Embedding provider abstraction (OpenAI, Voyage, Ollama)
2323+- Ollama-backed embedding provider (with interface for future alternatives)
2324- Retry and dead-letter behavior
2425- `twister reembed` command
25262627### Tasks
27282929+- [ ] Deploy Ollama on Railway:
3030+ - Use the nomic-embed Railway template as a starting point
3131+ - Configure as internal service (no public URL)
3232+ - Pre-pull `nomic-embed-text` model on startup
3333+ - Health check: `GET /api/tags` on port 11434
3434+ - Resource budget: 1–2 GB RAM, 1–2 vCPU
2835- [ ] Define embedding provider interface:
29363037 ```go
···3542 }
3643 ```
37443838-- [ ] Implement OpenAI provider (or preferred provider)
4545+- [ ] Implement Ollama provider using the official Go client:
4646+4747+ ```go
4848+ import "github.com/ollama/ollama/api"
4949+5050+ // OllamaProvider calls Ollama's /api/embed endpoint
5151+ // over Railway internal networking (ollama.railway.internal:11434)
5252+ type OllamaProvider struct {
5353+ client *api.Client
5454+ model string // "nomic-embed-text"
5555+ dim int // 768
5656+ }
5757+ ```
5858+5959+ - Configure via `OLLAMA_URL` env var (default: `http://ollama.railway.internal:11434`)
6060+ - Support batch embedding (Ollama accepts multiple inputs per request)
6161+ - Timeout per request (default: 30s)
6262+ - Connection health check on startup
3963- [ ] Implement embedding input text composition (see spec 04-data-pipeline.md, section 5):
4064 `title\nrepo_name\nauthor_handle\ntags\nsummary\nbody`
4165- [ ] Add job enqueueing: on document upsert, insert `embedding_jobs` row with `status=pending`
4266- [ ] Implement `embed-worker` loop:
4343- 1. Poll for `pending` jobs (batch by `EMBEDDING_BATCH_SIZE`)
6767+ 1. Poll for `pending` jobs (batch by `EMBEDDING_BATCH_SIZE`, default: 32)
4468 2. Compose input text per document
4545- 3. Call embedding provider
6969+ 3. Call Ollama provider
4670 4. Store vectors in `document_embeddings` with `vector32(?)`
4771 5. Mark job `completed`
4872 6. On failure: increment `attempts`, set `last_error`, backoff
4973 7. After max attempts: mark `dead`
5050-- [ ] Create DiskANN vector index: `CREATE INDEX idx_embeddings_vec ON document_embeddings(libsql_vector_idx(embedding, 'metric=cosine'))`
7474+- [ ] Create DiskANN vector index (see spec 03 for tuning params):
7575+ ```sql
7676+ CREATE INDEX idx_embeddings_vec ON document_embeddings(
7777+ libsql_vector_idx(embedding, 'metric=cosine')
7878+ );
7979+ ```
5180- [ ] Implement `reembed` command (re-generate all embeddings, useful for model migration)
5281- [ ] Skip deleted documents in embedding pipeline
5382- [ ] Add health check endpoint for embed-worker (port 9091)
8383+- [ ] Add Ollama connectivity check to embed-worker readiness probe
8484+8585+### Model Selection Notes
8686+8787+**nomic-embed-text-v1.5** is the default recommendation:
8888+- 137M parameters, 768-dimension vectors
8989+- Matryoshka support (can truncate to 64/128/256/512 dims for storage tradeoff)
9090+- 8192 token context window
9191+- ~262 MB at F16 precision, ~500 MB RAM at runtime
9292+- Battle-tested with llama.cpp/Ollama, Railway template exists
9393+9494+**EmbeddingGemma** is the quality alternative:
9595+- 308M parameters, 768-dimension vectors
9696+- Best MTEB scores for models under 500M parameters
9797+- <200 MB quantized, similar RAM footprint
9898+- Released Sept 2025, less deployment track record
9999+100100+**all-minilm** is the budget fallback:
101101+- 23M parameters, 384-dimension vectors (requires schema change)
102102+- ~46 MB model, minimal resources
103103+- Suitable for testing or cost-constrained environments
5410455105### Verification
56106107107+- [ ] Ollama service starts on Railway and responds to health checks
57108- [ ] Creating a new searchable document enqueues an embedding job
58109- [ ] Worker processes the job and stores a vector in `document_embeddings`
59110- [ ] Failed embedding calls retry with bounded attempts
6060-- [ ] Keyword search still works when embed-worker is down
111111+- [ ] Keyword search still works when embed-worker or Ollama is down
61112- [ ] `reembed` regenerates embeddings for all eligible documents
113113+- [ ] Ollama connectivity failure is surfaced in embed-worker health check
6211463115### Exit Criteria
641166565-Embeddings are produced asynchronously and stored durably.
117117+Embeddings are produced asynchronously via self-hosted Ollama and stored durably in Turso.
6611867119## M9 — Semantic Search
68120···75127### Deliverables
7612877129- `GET /search/semantic` endpoint
7878-- Query-time embedding (convert query text → vector)
130130+- Query-time embedding (convert query text → vector via Ollama)
79131- Vector similarity search via `vector_top_k`
80132- Response parity with keyword search
8113382134### Tasks
831358484-- [ ] Implement query embedding: call embedding provider with user's query text
136136+- [ ] Implement query embedding: call Ollama provider with user's query text
137137+- [ ] Cache query embeddings for identical queries within a short TTL (optional, reduces Ollama load)
85138- [ ] Implement semantic search repository:
8613987140 ```sql
···98151- [ ] Add timeout and cost controls (limit vector search to reasonable K)
99152- [ ] Wire `/search/semantic` handler
100153- [ ] Return `matched_by: ["semantic"]` in results
154154+- [ ] Graceful degradation: if Ollama is unreachable, return 503 for semantic search while keyword search remains available
101155102156### Verification
103157···106160- [ ] Semantic search returns the same JSON schema as keyword search
107161- [ ] Latency is acceptable under small test load
108162- [ ] Filters work correctly with semantic results
163163+- [ ] Semantic search degrades gracefully when Ollama is down
109164110165### Exit Criteria
111166112112-The API supports true semantic search over Tangled documents.
167167+The API supports true semantic search over Tangled documents, powered entirely by self-hosted infrastructure.