cxs Agent Guide#
项目定位#
cxs 是一个面向本机 Codex session 日志的渐进式检索 CLI,不是 GUI app,也不是实时同步守护进程。
当前接受的产品边界:
- 命令面固定为:
status、sync、find、read-range、read-page、list、stats - 主工作流固定为:
status -> ensure selector coverage -> find/list -> read-range/read-page sync是唯一会修改索引的命令;其余命令只读 SQLite- 默认接受手动增量同步,不做 watcher / daemon / realtime sync
- 这个仓库可以作为其他 sidecar / GUI 的 retrieval engine,但本仓库自身不以 GUI 为目标
当前实现真相#
- 检索主链是
message/session recall -> session heuristic rerank -> progressive read status只返回执行上下文、source inventory、index 状态与 coverage 状态;status --selector只读地报告目标 selector 的 coverage/freshness 和 recommendedAction;它可以扫描 raw session metadata,但不写 index、不回答内容问题- 内容回答只能来自 cxs index;source inventory 只能用于构造 selector 和判断可能的同步范围,不能作为内容真相源
sync --selector是建立 coverage 的唯一入口;只读命令不得隐式触发 sync- 查找前不要求无条件 sync;只有目标 selector coverage 缺失或 stale 时才同步。fresh
all(root)coverage 可以覆盖同 root 下更窄 selector find默认按 relevance 排序;“最新/最近 + 关键词”应使用find <query> --sort ended,必要时--exclude-session排除当前会话/self-hit- 候选召回来自
messages_fts与sessions_fts(title + summary_text + compact_text + reasoning_summary_text);极少数零 token CJK query 在 message 侧回退到 LIKE summary_text、compact_text、reasoning_summary_text已持久化,也会通过sessions_fts参与 session-level recall- session-level FTS 使用显式字段权重:title 8.0、compact 4.0、summary 3.0、reasoning summary 1.2
classifyQueryProfile()仍存在,但当前评分没有按broad/exact做显式分权- parser 只把
event_msg里的 user / assistant 写入messages;type=compacted与response_item.reasoning.summary只进入 session-level 索引字段,不形成可回读 message projection
不要把下面这些说成已完成:
- 真正独立的 stage-2 / resource-level reranker
- richer projection / range cache / event-level replay
- duplicate family collapse / diversity control
- 强约束的 gold set / rubric / error taxonomy
代码地图#
- cli.ts: CLI 命令面
- indexer.ts: sync 与索引更新
- parser.ts: Codex JSONL 解析与
summary_text生成 - db.ts: SQLite facade;具体 schema / store / coverage 模块在
src/db/ - query.ts: 查询 facade;find / read / list / stats / search / snippet 模块在
src/query/ - status.ts: status 输出编排
- selector.ts: selector 解析与覆盖蕴含规则
- source-inventory.ts: raw sessions metadata inventory
- types.ts: CLI JSON contract 与核心类型
- ranking.ts: session heuristic rerank
- eval/: manual eval、batch compare
文档规则#
- 优先维护当前态文档,不保留“看起来像现状、其实只是目标态”的 research 长文
- 如果文档已经腐化,优先删除或重写,不要叠补丁式修辞
docs/里的文档要服务后续 agent 直接接手,而不是保留调研过程痕迹- 任何涉及“当前已实现什么”的文档,都必须先对齐代码和测试
Skill Package 边界#
本仓库不维护项目级 .agents/skills。cxs 的 skill 是给用户安装后操作 CLI 的发行物,不是维护本仓库时给 Codex agent 自动加载的项目 workflow。
本仓库维护两条 skill 通道:
skill-packages/cxs: 发布版 skill 源码,必须匹配将要发布的cxsCLI 行为skill-packages/cxsd: 本机开发版 skill,必须使用cxsd/CXSD_BIN,用于 dogfood 当前 checkout
发布版 cxs#
cxs 永远代表线上安装版。不要把本地 dirty tree rsync 到全局 cxs skill。
对外推荐安装方式,也是本机更新全局线上 skill 的方式:
npx skills add catoncat/cxs --full-depth --skill cxs -g -a codex -y
注意这个 skill 不会自动安装 cxs CLI 本体。默认约定:
- 优先使用
CXS_BIN - 未设置时回退到
PATH里的cxs
开发版 cxsd#
cxsd 永远代表本地 checkout。它用于验证未发布代码和未发布 skill,不用于验证 npm/npx 线上版本。
本机约定:
- dev bin:
/Users/envvar/.local/bin/cxsd - dev bin 指向:
/Users/envvar/work/repos/cxs/src/cli.ts - global dev skill:
/Users/envvar/.agents/skills/cxsd -> /Users/envvar/work/repos/cxs/skill-packages/cxsd - Claude exposure:
/Users/envvar/.claude/skills/cxsd -> /Users/envvar/.agents/skills/cxsd
维护规则:
- 改 CLI 行为时,先更新
skill-packages/cxs,再同步调整skill-packages/cxsd skill-packages/cxsd只能把入口从cxs改成cxsd,不能发明另一套产品语义skill-packages/cxsd内不要出现CXS_BIN、${CXS_BIN:-cxs}或指向发布版cxs的示例- 全局
cxs通过npx skills add更新;全局cxsd通过 symlink 跟随本地 repo - 若
cxs与cxsd行为不一致,先判断是“线上尚未发布”还是“dev skill 漂移”,不要直接覆盖任一通道
Dogfood eval 边界#
Dogfood golden 是开发者本机的真实历史检索验收集,不是普通用户功能。
- 通用 runner / schema 可以维护在
eval/,例如npm run eval:dogfood -- <goldens.local.jsonl> - 私有 golden 默认放在 ignored 路径:
data/cxs-dogfood/goldens.local.jsonl - 添加 / promote dogfood golden 只能在用户显式触发 dev-only skill 时进行
- 本机 dev-only skill 路径:
~/.agents/skills/cxs-dogfood - 不要把 dogfood capture 流程或私有 golden 放进
skill-packages/cxs/skill-packages/cxsd - 普通代码实现任务可以运行已有 dogfood gate,但不能自行新增 golden,也不能自行把
candidatepromote 为hard
默认验证#
涉及实现或文档真相变更时,至少做与改动直接相关的验证:
npm run check- 必要时补一条 CLI 烟测,例如
cxsd status --json或cxsd find "<query>" --json - 涉及 skill 通道时,验证
npx skills ls -g --json、readlink /Users/envvar/.agents/skills/cxsd和cxsd --help
没有验证证据,不要声称“已对齐”“已完成”“文档正确”。
当前近端优先级#
- 先把 eval 从弱提示升级成更可信的 acceptance gate
- 继续观察 session-level 字段召回是否引入排序噪音,并补 eval 覆盖
- 更重的 reranker / projection / diversity 控制放后面
具体 roadmap 见 docs/ROADMAP.md。
Mainline#
Mainline is a git-native intent memory layer for AI-assisted engineering. It gives coding agents the historical why before they inspect the current what.
This project uses Mainline to record the intent behind every AI-driven change and to surface conflicts between intents before they reach a PR review. The agent is expected to both read team intents (for context) and write its own intent (for the work it's doing). Both halves matter — intents capture why changes were made, which is information the diff alone cannot give you.
v0.3 invariant: every commit on
mainis in exactly one of three states —covered(sealed intent claims it),skipped(Mainline-Skip:trailer or matched config pattern), oruncovered(neither). Runmainline statusto see the rollup;mainline gapsto see uncovered commits with rescue suggestions.
At the start of a task#
mainline status --json
If there is no active_intent, start one (use the user's goal verbatim
when possible — it becomes the headline in mainline log):
mainline start "<short description of the user's goal>" --json
Intent-first workflow (the load-bearing rule)#
Before making any non-trivial code change, retrieve relevant intent context before searching the codebase directly.
The default agent order is:
mainline status— overall state, identity, sync staleness, suggestions.mainline context --current --json— historical intents relevant to your current branch + active draft + diff vs main.- If the task names files:
mainline context --files <path>... --json. - If the task is semantic:
mainline context --query "<task summary>" --json. - Read the returned intents'
summary,decisions,risks,anti_patterns, andfingerprint. - Only then grep / read code to verify against the current implementation.
- Edit.
- When sealing, reference relevant prior intent IDs in your
decisions, and record any new
anti_patternsfuture agents must avoid in this area.
Do not lead with grep, rg, or broad file reads for non-trivial
changes unless Mainline is unavailable or the task is purely mechanical.
mainline context does NOT replace code inspection. It provides the
historical why before the agent inspects the current what.
Reading the retrieval output — every returned intent carries a
status field that tells you how to use it RIGHT NOW (distinct from
its lifecycle status):
| status | how to read it |
|---|---|
current |
the current effective decision; verify against current code, then apply |
superseded |
replaced — read superseded_by instead and use this only for context |
abandoned |
this approach was tried and abandoned; do not repeat without understanding why |
stale |
files have churned or the intent is old; verify decisions still hold |
Each intent also carries:
risks— soft warnings to weigh.anti_patterns— hard constraints. Each one carries awhat, awhy, and aseverity. Do not violate them. The retrieval API never truncatesanti_patterns, so if you see one, it is in scope.guidance— a single-line reminder derived fromstatus.
Pre-edit checklist for agents#
Before editing code, answer:
- Did I run
mainline status? - Did I run
mainline context --current --json? - If files are involved, did I run
mainline context --files ... --json? - Did I read the relevant prior decisions and risks?
- Did I verify those intents against the current code?
- Am I about to repeat an abandoned or superseded approach?
Task priority — when intent-first matters most#
| Always mainline-first | Mainline-first preferred | Direct code OK |
|---|---|---|
| architecture changes / refactors | bug fixes | typo / formatting fixes |
| migrations / deletions | new feature additions | one-line obvious syntax fixes |
| auth / billing / data-model / permissions | API behaviour changes | mechanical rename, scoped |
| test-strategy changes | config / CI / release tweaks | user explicitly asks to view ONE file |
| any cross-file change | ||
| user asks "why is this here?" | ||
| user asks "can we delete this?" | ||
| user asks "did we try this before?" |
Read team intents for context (do this aggressively)#
Before working on anything non-trivial, scan recent intents for prior
work in the area you're about to touch. Each intent's summary
(what / why / decisions / risks / followups) plus fingerprint
(subsystems, files_touched, tags) is strictly richer than the diff —
it tells you why the code looks the way it does, which decisions
were considered and rejected, and what the author flagged as a risk
or follow-up.
mainline log --json --limit 30
Filter by goal/title keywords matching the user's task. For each relevant hit, pull the full record:
mainline show <intent_id> --json # decisions / risks / fingerprint
mainline trace <intent_id> --json # turn timeline (when each turn
# was added, how long it took)
show answers what the intent decided. trace answers how it
unfolded over time — useful when you're trying to understand why a
PR looks the way it does, or whether the agent got stuck and looped.
Before designing a change, also see what is currently in flight so your work does not collide with someone else's proposed intent:
mainline list-proposals --json
mainline context --json is a quick agent-consumption snapshot of
the same data (current actor, active intent, recent merged) — useful
for orientation but does not replace the targeted log/show calls.
Use this aggressively. The cost is one or two CLI calls; the payoff is correct architectural decisions and not duplicating someone's just-finished work.
Turns and intent history#
Turns are a lightweight thinking scaffold used to prepare a good seal. They are not expected to be a real-time activity log.
It is normal for several turns to be recorded together near seal
time, especially when an agent summarizes its work before sealing.
mainline trace will surface this honestly via the
append_turns_recorded_together flag — that is informational, not a
warning.
Use:
mainline show <intent_id> --json
to inspect the structured conclusion of an intent: summary, decisions, risks, and fingerprint.
Use:
mainline trace <intent_id> --json
to inspect how an intent unfolded over time: start, append, seal, abandon, or supersede events.
show answers: "What did this intent decide?"
trace answers: "How did this intent unfold?"
While working#
Record turns at points that will help you write a good seal — when a meaningful subtask completes, when you pivot, when a discovery changes the plan. Many short turns or a few long turns are both fine; what matters is that the seal author (you, later) has the material to compose a faithful summary:
mainline append "<what changed and why>" --json
Turns are append-only. Don't try to amend or delete them — describe the next state in a new turn.
When the task is complete#
-
Commit your code changes the normal way:
git add <files> && git commit -m "<message>" -
Ask Mainline to prepare a seal package:
mainline seal --prepare --json > .ml-cache/seal.jsonThe package includes a
seal_result_starterfield — a partially- filledSealResultwith the deterministic bits (intent_id, fingerprint.files_touched, fingerprint.subsystems) pre-populated. Patch in the agent-judgment fields rather than typing the JSON from scratch.Why
.ml-cache/? Init writes that directory to.gitignore, so the temporary seal file stays out of git and does not trip the v0.3 worktree-clean snapshot contract on submit. -
Generate a
SealResultJSON matching the schema returned by--prepare. Populate the fingerprint generously — primary subsystem, synonyms, parent concepts, related technologies — so phase-1 conflict detection has signal:"tags": ["auth", "authentication", "security", "jwt", "session"]When the work establishes constraints future agents must respect, record them as
anti_patterns(NOT asrisks). Each entry MUST carry bothwhatandwhy; emptywhyis rejected at seal time."anti_patterns": [ { "what": "Removing legacy session middleware on /oauth path", "why": "OAuth callback handler still requires session state", "severity": "high" } ]Use
risksfor soft warnings the reviewer should weigh; useanti_patternsfor hard constraints the next agent must not violate. Anti-patterns are surfaced uncapped inmainline context, so future agents will always see them. -
Submit it:
mainline seal --submit --json < .ml-cache/seal.jsonSubmit auto-syncs with the team and runs phase-1 conflict detection against every other proposed/merged intent. If the JSON response carries a
conflictsarray, surface those conflicts to the user verbatim before continuing. Do not silently move on. -
(Optional but encouraged) Quality-check the seal:
mainline lint <intent_id> --jsonlintruns deterministic checks against the sealed payload — empty / boilerplatewhat, missing decisions, decision without rationale, missing risks/anti_patterns, broken supersedes refs. Errors mean the seal will be hard for future retrieval to use; warnings are advisory. Lint is not wired into submit, so a bad seal still goes through — but a low-quality seal pollutes futuremainline contextresults, which is the whole loop this workflow exists to keep healthy.
When the user asks you to phase-2 check an intent#
Phase 1 is automatic; phase 2 is invoked deliberately when phase 1
flags an overlap ([check:~] in mainline log) and the user wants a
real semantic judgment.
mainline check --prepare --intent <id> --json
Read each judgment_task in the package, judge whether it is a real
semantic conflict, and submit a CheckJudgmentResult:
mainline check --submit --json < judgment.json
The verdict surfaces in mainline log's [check:X] column.
Optional: agent hooks (opt-in context provider)#
If mainline hooks install <agent> has been run for your agent
runtime (Cursor today; Codex / Claude Code reserved), the hook layer
runs two mechanical operations at session start and injects a
status snapshot into your system context — nothing more:
- At
sessionStartthe hook runsmainline sync(refreshes the team view) andmainline status(active intent, proposed count, synced head). It feeds that snapshot back to you as system-prompt context along with a pointer to this document. You no longer need to runmainline statusas the very first call of a session — it has already run. - At every other lifecycle event (turn start, turn end, subagent
end, session end) the hook is a no-op for your reasoning. It
fires webhook notifications for external observers (CI dashboards,
pager integrations) and exits. It does NOT call
mainline start,mainline append,mainline seal --prepare, or any other command that requires deciding what counts as a goal / a meaningful change / a fingerprint — those are LLM judgments and you remain the only party qualified to make them.
Concretely: every step described above (start when there is real work, append after each meaningful logical change, commit, seal --prepare, fill SealResult, seal --submit, surface conflicts) you do yourself, hooks installed or not. The hook layer is a context provider, not a workflow driver.
Run mainline hooks status to confirm whether hooks are wired and
whether auto_sync_on_session_start is on (the only mechanical
toggle). Disable it with mainline hooks disable if your network
makes the session-start sync painful — you can still drive the rest
of the workflow by hand.
Hub: human reader surface (you don't run this; you suggest it)#
mainline hub export <dir> and mainline hub open build a static
HTML site over the local synced intent view. It is for humans, not
agents — agents use context / show / trace / gaps.
You should suggest the hub when the user asks one of:
- "What's the history of
<file>?" → hub's per-file page lists every intent that touched it. - "Who's been working on what lately?" → hub's index shows the recent intents table; the actor pages give per-author rollups.
- "Are there any conflicts or risky changes I should review?" → hub's risks page and graph (supersessions, conflicts_with, shares_file edges) put the answer one click away.
Concretely:
mainline hub open # build + open in the default browser
mainline hub export ./hub-snapshot # write a portable copy elsewhere
Both commands default the output to <os-temp>/mainline-hub/<repo>
so the static site never enters git.
Hub is read-only and rebuildable from the synced view; it never modifies repo files outside the user-chosen output directory.
What you do NOT need to run#
mainline sync— runs automatically insideseal --submitand whenever a fresh-data command (check,pin) needs it.mainline pin— runs automatically after every sync; the strategy cascade (tree_hash → commit_hash → goal_text) catches GitHub squash-merges with near-100 % reliability.mainline merge— humans merge via the GitHub PR UI; the nextmainline syncauto-pins the squash commit.
Do not run unless the user explicitly asks#
mainline pin <intent> <commit> # manual fallback
mainline merge --intent <id> # non-PR pipeline only
mainline init --rewire # repo setup repair
mainline doctor --setup --fix # repo setup repair
Encountering an uncovered commit (v0.3 rescue)#
If mainline status or mainline gaps flags an uncovered commit (one
that landed on main with no intent), pick the best path you still
can — ordered by reversibility, cheapest first:
-
Unpushed — undo and redo via the proper flow:
git reset --soft HEAD^ # un-commit, keep changes mainline start "<goal>" <continue normal flow> -
Pushed — backfill an intent that retroactively claims the commit:
mainline start "<why this commit was made>" --commits <sha> mainline append "<turn-by-turn description, post-hoc>" mainline seal --prepare > .ml-cache/seal.json <fill .ml-cache/seal.json> mainline seal --submit < .ml-cache/seal.jsonThe seal flow auto-pins the new intent to the listed commit on next
mainline sync. -
Routine (chore / format / version bump) — mark as deliberately skipped:
git commit --amend # add `Mainline-Skip: <reason>` trailerOr add a pattern in
.mainline/config.tomlunder[mainline.skip]so future similar commits classify automatically:[mainline.skip] patterns = ["^chore: format", "^bump:"] -
Already distributed, regrettably — accept uncovered. The mainline log is a record of reality, not aspiration.
Seal snapshot contract (v0.3)#
mainline seal --prepare snapshots the worktree state (HEAD, branch,
clean/dirty/untracked) and persists it. mainline seal --submit
validates the live repo against that snapshot — HEAD drift, branch
drift, or dirty worktree all fail by default with a typed error. The
escape hatch is the explicit CLI flag --allow-dirty; even then, the
sealed event permanently records worktree_status so reviewers see
the audit trail.
Always commit your code BEFORE mainline seal --prepare. Untracked
files (planning docs, scratch notes) do not enter sealed evidence.