# Planner + Pushdown Engine for `is-tree`

This document proposes a lightweight query planner for `is-tree` so we can push filtering, sorting, and column selection earlier in the pipeline, especially for `--scan` and picker workflows.

Related docs and code:

- [`/doc/pick-iter.md`](/doc/pick-iter.md)
- [`/README.md`](/README.md)
- [`/src/main.rs`](/src/main.rs)
- [`/src/plugin.rs`](/src/plugin.rs)

## Why now

We already landed one targeted optimization: short-circuiting `--all --format directory` in [`/src/main.rs`](/src/main.rs).

That win validates the direction, but it is still a special case. We need a general mechanism that can answer:

- Which columns are actually needed?
- Which filters can run before expensive plugin work?
- Which sorts can run early enough to keep `--scan` responsive?

In short: turn the CLI request into an execution plan, then push expensive work as late as possible.

## Goals

- Preserve current CLI behavior by default.
- Make fast paths automatic when query shape allows.
- Keep `--scan` interactive by prioritizing early-sort keys.
- Reuse existing plugin registry architecture instead of bypassing it.
- Allow incremental rollout without rewriting the whole runtime.

## Non-goals

- No SQL parser or user-facing query DSL.
- No distributed execution.
- No breaking changes to existing output formats.

## Core idea

Treat each invocation as a query:

- **Projection**: requested output columns
- **Filters**: row predicates
- **Sort**: ordered keys
- **Mode**: full vs scan
- **Input**: explicit paths or discovered roots

Then compile to a physical plan where each column and predicate is annotated by when it becomes available and how expensive it is.

## Architecture

```mermaid
flowchart LR
    ParseCli[Parse CLI Args] --> LogicalQuery[Build LogicalQuery]
    LogicalQuery --> PlanRules[Apply Pushdown Rules]
    PlanRules --> PhysicalPlan[Build PhysicalPlan]
    PhysicalPlan --> EnumerateStage[Enumerate Candidate Paths]
    EnumerateStage --> EarlyProbeStage[Run Early Probes]
    EarlyProbeStage --> EarlyFilterSort[Apply Early Filters and Sorts]
    EarlyFilterSort --> LateProbeStage[Run Late Plugin Probes If Required]
    LateProbeStage --> FinalFilterSort[Apply Remaining Filters and Sorts]
    FinalFilterSort --> RenderStage[Render Text or JSON]
```

## Data model draft

These are implementation-level structs we can add near runtime planning code.

```rust
/// Whether the run computes everything (`Full`) or stays on the fast scan path (`Scan`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ExecMode {
    Full,
    Scan,
}

/// The CLI request expressed as a query, before any pushdown rules are applied.
// NOTE: `FilterExpr` (the row-predicate type) is not defined in this draft.
#[derive(Debug, Clone, PartialEq, Eq)]
struct LogicalQuery {
    mode: ExecMode,
    roots: Vec<std::path::PathBuf>,
    projection: Vec<String>,
    filters: Vec<FilterExpr>,
    sort_keys: Vec<SortKey>,
    emit_json: bool,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct SortKey {
    column: String,
    desc: bool,
}

/// Pipeline stage at which a column's value first becomes available.
/// The derived `Ord` follows declaration order, so earlier stages compare as smaller.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum AvailabilityStage {
    Enumerate,
    EarlyProbe,
    LateProbe,
    Finalize,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CostClass {
    Free,
    Cheap,
    Expensive,
}

/// Planner metadata for one output column.
#[derive(Debug, Clone, PartialEq, Eq)]
struct ColumnPlanMeta {
    key: &'static str,
    stage: AvailabilityStage,
    cost: CostClass,
    stable_in_scan: bool,
}
```

### Practical column classification (initial)

| Column | Stage | Cost | Notes |
|---|---|---|---|
| `directory` | `Enumerate` | `Free` | Known from input path list |
| `status` | `EarlyProbe` | `Cheap` | `detect_repo(path)` is local fs checks |
| `workparent` | `LateProbe` | `Cheap` | Path parsing + repo metadata checks |
| `change-date` | `EarlyProbe` | `Cheap` | local metadata mtime |
| `commit-date` | `LateProbe` | `Expensive` | subprocess/git history lookup |
| `ahead` | `LateProbe` | `Expensive` | jj/git remote-related logic |

This table is the planner contract. It can start hard-coded and later move to plugin metadata.
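
As a rough illustration of the hard-coded starting point, the table could be written as a static array with a lookup helper. This is a minimal sketch, not the planned implementation: `COLUMN_TABLE`, `row`, and `column_meta` are hypothetical names, and the `stable_in_scan` values are placeholders because the table above does not specify them.

```rust
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum AvailabilityStage {
    Enumerate,
    EarlyProbe,
    LateProbe,
    Finalize,
}

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CostClass {
    Free,
    Cheap,
    Expensive,
}

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq, Eq)]
struct ColumnPlanMeta {
    key: &'static str,
    stage: AvailabilityStage,
    cost: CostClass,
    stable_in_scan: bool,
}

// Hypothetical helper that keeps the table rows compact.
const fn row(
    key: &'static str,
    stage: AvailabilityStage,
    cost: CostClass,
    stable_in_scan: bool,
) -> ColumnPlanMeta {
    ColumnPlanMeta { key, stage, cost, stable_in_scan }
}

// Mirrors the classification table. The `stable_in_scan` flags are
// placeholder assumptions (true for columns available by `EarlyProbe`).
static COLUMN_TABLE: [ColumnPlanMeta; 6] = [
    row("directory", AvailabilityStage::Enumerate, CostClass::Free, true),
    row("status", AvailabilityStage::EarlyProbe, CostClass::Cheap, true),
    row("workparent", AvailabilityStage::LateProbe, CostClass::Cheap, false),
    row("change-date", AvailabilityStage::EarlyProbe, CostClass::Cheap, true),
    row("commit-date", AvailabilityStage::LateProbe, CostClass::Expensive, false),
    row("ahead", AvailabilityStage::LateProbe, CostClass::Expensive, false),
];

/// Look up planner metadata for a column key; `None` for unknown columns.
fn column_meta(key: &str) -> Option<&'static ColumnPlanMeta> {
    COLUMN_TABLE.iter().find(|m| m.key == key)
}
```

Returning `Option` leaves the handling of unknown columns (for example, columns contributed by plugins) to the caller.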

## Pushdown rules

### 1) Projection pushdown

Only compute columns that are needed by:

- output projection
- filter predicates
- sort keys

If requested columns are only `directory`, skip repo probe and plugins entirely.
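
One way to picture this rule: take the union of the columns referenced by the projection, filters, and sort keys, then find the deepest stage any of them needs. The sketch below assumes a hypothetical `stage_of` lookup that mirrors the classification table; `required_columns` and `deepest_stage` are illustrative names, not existing functions.

```rust
use std::collections::BTreeSet;

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum AvailabilityStage {
    Enumerate,
    EarlyProbe,
    LateProbe,
    Finalize,
}

// Hypothetical lookup mirroring the initial classification table.
fn stage_of(column: &str) -> Option<AvailabilityStage> {
    match column {
        "directory" => Some(AvailabilityStage::Enumerate),
        "status" | "change-date" => Some(AvailabilityStage::EarlyProbe),
        "workparent" | "commit-date" | "ahead" => Some(AvailabilityStage::LateProbe),
        _ => None,
    }
}

/// Union of every column referenced by the projection, filters, and sort keys.
fn required_columns<'a>(
    projection: &[&'a str],
    filter_cols: &[&'a str],
    sort_cols: &[&'a str],
) -> BTreeSet<&'a str> {
    projection
        .iter()
        .chain(filter_cols)
        .chain(sort_cols)
        .copied()
        .collect()
}

/// Deepest stage the plan must reach; `None` if any column is unknown.
fn deepest_stage(columns: &BTreeSet<&str>) -> Option<AvailabilityStage> {
    let mut deepest = AvailabilityStage::Enumerate;
    for col in columns {
        deepest = deepest.max(stage_of(col)?);
    }
    Some(deepest)
}
```

With only `directory` requested, `deepest_stage` stays at `Enumerate`, which corresponds to the enumerate-only plan with no probes or plugins.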

### 2) Filter pushdown

Apply predicates at earliest available stage.

Examples:

- `status == jj` can run at `EarlyProbe`.
- `ahead > 0` must wait for `LateProbe`.
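
The two examples above can be sketched as grouping predicates by the stage of the column they read. This is an assumption-heavy sketch: the draft does not define `FilterExpr`, so the `Filter` struct here is a hypothetical stand-in with one column per predicate, and `stage_of` and `assign_filters` are illustrative names.

```rust
use std::collections::BTreeMap;

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum AvailabilityStage {
    Enumerate,
    EarlyProbe,
    LateProbe,
    Finalize,
}

// Hypothetical stand-in for the draft's undefined `FilterExpr`:
// one predicate that reads a single column.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Filter {
    column: String,
    expr: String,
}

// Hypothetical lookup mirroring the initial classification table.
fn stage_of(column: &str) -> Option<AvailabilityStage> {
    match column {
        "directory" => Some(AvailabilityStage::Enumerate),
        "status" | "change-date" => Some(AvailabilityStage::EarlyProbe),
        "workparent" | "commit-date" | "ahead" => Some(AvailabilityStage::LateProbe),
        _ => None,
    }
}

/// Group each predicate under the earliest stage where its column exists.
fn assign_filters(filters: Vec<Filter>) -> BTreeMap<AvailabilityStage, Vec<Filter>> {
    let mut by_stage = BTreeMap::new();
    for filter in filters {
        // Assumption: predicates on unknown columns wait until the last stage.
        let stage = stage_of(&filter.column).unwrap_or(AvailabilityStage::Finalize);
        by_stage.entry(stage).or_insert_with(Vec::new).push(filter);
    }
    by_stage
}
```

A `BTreeMap` keyed by stage iterates in stage order, so a runtime could walk it once and apply each group right after the matching probe.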

### 3) Sort pushdown

Sort as early as possible, but only when sort keys are available.

- `--sort directory+` sorts during enumeration.
- `--sort change-date-` sorts after `EarlyProbe`.
- `--sort ahead-` requires `LateProbe`.

If early and late keys are mixed, the planner splits the sort into staged ordering:

- Early stable sort on early keys
- Final sort after late keys are available
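
A minimal sketch of the split step, assuming a hypothetical `stage_of` lookup that mirrors the classification table and an illustrative `split_sort_keys` helper. It only separates keys by availability; how the final sort combines the two groups depends on key precedence and is not covered here.

```rust
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum AvailabilityStage {
    Enumerate,
    EarlyProbe,
    LateProbe,
    Finalize,
}

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq, Eq)]
struct SortKey {
    column: String,
    desc: bool,
}

// Hypothetical lookup mirroring the initial classification table.
fn stage_of(column: &str) -> Option<AvailabilityStage> {
    match column {
        "directory" => Some(AvailabilityStage::Enumerate),
        "status" | "change-date" => Some(AvailabilityStage::EarlyProbe),
        "workparent" | "commit-date" | "ahead" => Some(AvailabilityStage::LateProbe),
        _ => None,
    }
}

/// Split sort keys into (early, late), preserving their relative order.
/// Assumption: unknown columns are treated as late.
fn split_sort_keys(keys: &[SortKey]) -> (Vec<SortKey>, Vec<SortKey>) {
    keys.iter().cloned().partition(|key| {
        stage_of(&key.column).map_or(false, |stage| stage <= AvailabilityStage::EarlyProbe)
    })
}
```

If the late group comes back empty, the whole sort can finish after `EarlyProbe` and no final sort is needed.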

### 4) Mode-aware gating

`--scan` should avoid `LateProbe` by default.

Planner behavior in scan mode:

- If the query needs only `Enumerate`/`EarlyProbe` columns, stay scan-fast.
- If the query requests late columns or late sort keys, apply the configured policy:
  - `upgrade`: automatically switch to full plan
  - `defer`: keep scan-fast behavior and warn that late requirements are skipped
  - `error`: fail with clear message

Default recommendation: `upgrade` for correctness unless user opts into strict fast mode.
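
The policy choice could be expressed as one function from the requested mode, the plan's needs, and the policy to an effective mode. This is a sketch under assumptions: `ScanPolicy` and `resolve_mode` are hypothetical names, and the message strings are placeholders.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ExecMode {
    Full,
    Scan,
}

// Hypothetical policy type for the three behaviors listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ScanPolicy {
    Upgrade,
    Defer,
    Error,
}

/// Decide the effective mode. `Ok` carries the mode plus an optional
/// warning; `Err` carries the failure message for the `error` policy.
fn resolve_mode(
    mode: ExecMode,
    needs_late: bool,
    policy: ScanPolicy,
) -> Result<(ExecMode, Option<String>), String> {
    match (mode, needs_late, policy) {
        // Full mode, or a scan that needs nothing late: run as requested.
        (ExecMode::Full, _, _) | (ExecMode::Scan, false, _) => Ok((mode, None)),
        (ExecMode::Scan, true, ScanPolicy::Upgrade) => Ok((ExecMode::Full, None)),
        (ExecMode::Scan, true, ScanPolicy::Defer) => Ok((
            ExecMode::Scan,
            Some("late columns or sort keys are skipped in scan mode".to_string()),
        )),
        (ExecMode::Scan, true, ScanPolicy::Error) => {
            Err("query needs late columns, which scan mode does not compute".to_string())
        }
    }
}
```

Keeping the decision in a pure function makes the scan policy behaviors straightforward to cover with unit tests.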

## Execution examples

### Case A: `--all --format directory`

Plan:

1. Enumerate candidate subdirectories
2. Render path list

No probe, no plugin execution.

### Case B: `--scan --format "{status} {directory}" --sort directory+`

Plan:

1. Enumerate
2. Early probe for `status`
3. Early sort by `directory`
4. Stream render

No late stage required.

### Case C: `--scan --sort ahead- --format directory`

Planner detects `ahead` as late/expensive.

- With `upgrade`: switch to full mode and compute ahead before final sort.
- With `defer`: run scan-only path ordering and warn that `ahead` sort is not applied.

## Integration with picker pipeline

This planner directly supports the high-value pipeline from [`/doc/pick-iter.md`](/doc/pick-iter.md):

```bash
is-tree --scan --sort change-date- --format directory | fuzzel --dmenu --multi | is-tree --stdin --format all
```

Key benefits:

- Fast candidate emission (`Enumerate` + `EarlyProbe` only)
- Useful prioritization (`change-date` pushdown)
- Expensive columns deferred until user has narrowed selection

## Implementation plan

### Phase 1: planner metadata and rule engine

- Add `LogicalQuery`, `PhysicalPlan`, and column metadata table.
- Build planner from existing CLI args (`format`, `sort`, `filter`, `json`, `all`).
- Keep old runtime path as fallback.

Acceptance:

- Planner returns deterministic stage assignment for projection/filter/sort keys.
- `--all --format directory` is represented as enumerate-only plan.

### Phase 2: staged execution runtime

- Introduce execution stages in `run()` path:
  - enumerate
  - early probe/filter/sort
  - optional late probe/filter/sort
  - render
- Route current short-circuit through planner instead of bespoke branch.

Acceptance:

- Existing directory-only optimization remains fast and behaviorally identical.
- Query results remain equivalent to current behavior for full-mode queries.

### Phase 3: scan policy + diagnostics

- Add scan late-key policy (`upgrade`, `defer`, `error`).
- Emit explicit diagnostics when requested sort/filter cannot run in scan-fast stage.

Acceptance:

- Users can predictably control correctness vs speed in scan mode.
- Help text documents scan policy behavior.

### Phase 4: plugin metadata integration

- Extend plugin column declarations with planning hints (`stage`, `cost`).
- Remove hard-coded planner map once plugin hints are complete.

Acceptance:

- Planner decisions come from plugin metadata rather than ad-hoc key matching.
- New plugins can participate in pushdown automatically.
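
To make the Phase 4 idea concrete, here is one possible shape for plugin-supplied hints. Everything in it is hypothetical: `PlanHint`, `ColumnDecl`, `CONSERVATIVE_HINT`, and `build_hint_map` are illustrative and do not reflect the current declarations in `/src/plugin.rs`. The sketch assumes that a column without a hint is planned as late and expensive, so it is never pushed down by accident.

```rust
use std::collections::HashMap;

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum AvailabilityStage {
    Enumerate,
    EarlyProbe,
    LateProbe,
    Finalize,
}

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CostClass {
    Free,
    Cheap,
    Expensive,
}

// Hypothetical planning hint attached to a plugin column declaration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct PlanHint {
    stage: AvailabilityStage,
    cost: CostClass,
}

// Hypothetical column declaration; `hint` is optional during migration.
struct ColumnDecl {
    key: &'static str,
    hint: Option<PlanHint>,
}

// Assumed default for columns whose plugin has not declared a hint yet.
const CONSERVATIVE_HINT: PlanHint = PlanHint {
    stage: AvailabilityStage::LateProbe,
    cost: CostClass::Expensive,
};

/// Build the planner's lookup table from plugin declarations.
fn build_hint_map(decls: &[ColumnDecl]) -> HashMap<&'static str, PlanHint> {
    decls
        .iter()
        .map(|decl| (decl.key, decl.hint.unwrap_or(CONSERVATIVE_HINT)))
        .collect()
}
```

An optional hint would let plugins migrate one at a time while the hard-coded map is still in place.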

## Testing strategy

- Unit tests for planner rule decisions:
  - projection-only query
  - mixed early/late sort keys
  - scan policy behaviors
- Integration tests for runtime equivalence:
  - full mode unchanged output
  - scan mode staged behavior
- Performance checks:
  - compare current vs planned execution on large directory sets

## Ticket alignment

- `is-tree-scan-priority`: provides the mechanism to prioritize and stream candidates.
- `is-tree-fuzzel-pipeline`: provides the UX workflow that consumes staged scan output.
- `is-tree-per-file-stats` and `is-tree-staleness-views`: benefit from selecting expensive drill-down only after narrowing candidates.

## Decision summary

We should evolve `is-tree` from ad-hoc fast paths into a small planner-driven runtime:

- classify column availability/cost
- push projection/filter/sort as early as possible
- keep `--scan` responsive while preserving correctness controls

This gives us a reusable optimization model, not just one-off special cases.