grex
A nested meta-repo manager. Track many git repos as a single graph, sync them in parallel, and drive every operation from a shell, CI, or an LLM agent speaking MCP.
grex is what you reach for when one git repo is no longer enough — when
you have a tree of related repos (a workspace, a fleet of services, a set
of dotfiles + plugins + tools) and you want one declarative source of
truth that says which repos belong, where they live on disk, and how
they're kept in sync.
It is not a dev-environment installer, not a package manager, not
mise / asdf. It manages repos, not language toolchains.
In 30 seconds
cargo install grex-cli # binary is `grex`
grex init # creates grex.jsonl in cwd
grex add https://github.com/you/svc-a # registers + clones a sub-repo
grex add https://github.com/you/svc-b
grex sync # parallel pull/clone for all
grex status --json # machine-readable state
grex.jsonl (intent) and grex.lock.jsonl (resolved state) are the only
files you commit to your meta-repo. Everything else grex does — clone,
pull, run actions, talk MCP — is reproducible from those two files.
What you get
- One CLI, twelve frozen verbs. init add rm ls status sync update doctor serve import run exec. Universal --json --plain --dry-run --parallel <N> --filter <EXPR> on every verb. See the CLI reference.
- Pack contract. Any git repo with a .grex/pack.yaml is a pack. Three built-in pack-types ship; the plugin API lets you add more without forking. Read the pack spec.
- Reproducible manifest. Newline-delimited JSON, schema-versioned per row. See manifest.
- MCP server built-in. grex serve speaks native MCP 2025-06-18 over stdio: every non-serve verb becomes a tool call, no custom dialect. See MCP reference.
- Parallel scheduler with a Lean4 invariant proof. Bounded semaphore + per-pack .grex-lock + fd-lock manifest guard; "no double-lock" is mechanised. See concurrency.
- Migration from REPOS.json meta-repos via grex import --from-repos-json. See migration.
Read next
- New here? Start with Goals then Architecture.
- Writing a pack? Read the Pack spec and Pack template.
- Driving grex from an agent? Jump to MCP and the CLI JSON output reference.
- Curious how it's built? See the engineering handbook and the roadmap.
API reference (rustdoc): grex-core · grex-mcp.
Heads up: the published crate is grex-cli; the installed binary is grex. If pemistahl's unrelated grex (regex-from-test-cases) is already on your PATH, pass --force to cargo install grex-cli or rename the other binary first.
goals
Philosophy, competitive positioning, and scope for grex v1.
Philosophy (7 principles)
- Git repo is a universal container for machine-configurable state. Configs, tools, env declarations, symlink trees, install manifests all ride on git: free versioning, distribution, diffing, authorship.
- Pack = git repo + .grex/ directory. The .grex/ dir is the contract grex understands. Pack content outside .grex/ is opaque to grex. A pack is just a git repo that opts into the protocol.
- Every pack is a meta-pack. Uniform model. Packs can nest child packs. Leaf packs just have zero children. No special-casing in code.
- Repo sync is a universal op, orthogonal to pack-type. Every pack gets grex sync (git fetch/pull + recurse into children) for free. Install / update / teardown are per-pack-type.
- Extensibility is vital. grex cannot precompile every install or config logic. The action vocabulary and pack-types are plugin interfaces. v1 ships a small built-in set compiled in; v2 opens external plugin loading.
- Future-proof core, pragmatic content. Stable schemas + trait APIs at v1. Action vocabulary stays small (YAGNI) but grows via plugin contributions over time.
- Agent-native. Embedded MCP stdio JSON-RPC server exposing all CLI verbs 1:1. Not a subprocess wrapper; handlers call the same library entrypoints the CLI dispatcher calls.
Cross-cutting: blazingly fast via Rust + tokio. All built-in actions are native Rust (no shell fork). A shell escape hatch exists (exec action, scripted pack-type) but is the last resort, not the default.
Competitive positioning
| Axis | codyaverett/metarepo | grex |
|---|---|---|
| Domain | git repos only | any resource via pack protocol |
| Concurrency | sequential | tokio parallel, bounded semaphore |
| State | intent-only | intent + lockfile (separate files) |
| Atomic writes | no | yes (temp + rename always) |
| MCP | subprocess wrapper | embedded in-process server |
| Lean4 proof | no | 1 scheduler invariant v1 |
| Nesting | via sub-meta | uniform (every pack = meta) |
| Extension | code changes only | trait-based plugin registry |
| Cross-plat | yes | yes + explicit Win/Linux/Mac CI matrix |
v1 shippable scope
Core (always compiled)
- Manifest (JSONL, intent events)
- Lockfile (JSONL, resolved SHA + state, separate file)
- Scheduler (tokio + bounded semaphore + per-pack .grex-lock + fd-lock global)
- Sync engine (git clone/pull, recurse into children)
- Gitignore automation (managed block markers)
- MCP server (stdio JSON-RPC 2.0, methods = CLI verbs 1:1)
- Pack discovery (.grex/pack.yaml parse)
- Action executor + in-process action plugin registry
- Pack-type executor + in-process pack-type plugin registry
- Atomic file writes (temp + rename always)
- Lean4 invariant proof (no double-lock on same resource path)
Built-in pack-types (3)
- meta: nests children, no own actions.
- declarative: runs Tier 1 actions from pack.yaml.
- scripted: escape hatch; runs .grex/hooks/{setup,sync,teardown}.{sh,ps1}.
Built-in actions (7 Tier 1, grounded in real E:\repos scripts)
- symlink: create/update symlink w/ backup, idempotent, cross-platform.
- env: set env var (user / machine / session scope).
- mkdir: idempotent dir creation (parents).
- rmdir: remove dir, optional backup.
- require: prereq / idempotency gate (path-exists, cmd-available, reg-key, os, psversion, symlink-ok).
- when: platform / conditional gate wrapping nested actions.
- exec: shell escape (array-form cmd, no shell-parse by default).
CLI verbs (12, frozen contract)
init add rm ls status sync update doctor serve import run exec
Stable public APIs (breaking changes forbidden post-v1 without major bump)
- .grex/pack.yaml schema (with schema_version: "1").
- grex.jsonl manifest schema.
- grex.lock.jsonl lockfile schema.
- ActionPlugin Rust trait.
- PackTypePlugin Rust trait.
- Fetcher Rust trait.
- CLI verb surface.
- MCP method surface (= CLI verbs 1:1).
v2 backlog (NOT v1)
- External plugin loading (dylib via libloading or WASM via wasmtime / extism).
- Retro-futurist ratatui TUI dashboard.
- Additional pack-types (software-list, env-bundle, dotfiles) via plugin.
- Additional actions (pkg-install, url-download, archive-extract, file-append, patch, json-merge, template, path-add, shell-rc-inject) via plugin.
- Extra Lean4 proofs (idempotency, commutativity, crash-safety of manifest fold).
- SQLite optional backend for very large workspaces.
- Self-update (grex upgrade).
- Pack registry (grex.dev).
- Embedded scripting (Lua / Rhai): middle ground between declarative YAML and shell escape.
Non-goals (permanent)
- Monorepo conversion.
- Git submodule full replacement.
- Cross-VCS support (hg, svn, fossil, perforce).
- Language-specific build orchestration.
- Generic CI runner.
Grounded reality — action-vocab rationale
Scanned real-world E:\repos scripts: 3 PowerShell scripts, 945 LOC total. Pattern frequencies:
| Pattern | Count | v1 Action |
|---|---|---|
| symlink-create | 8 | symlink |
| idempotency-check | 9 | require |
| env-set | 7 | env |
| exec-cmd (chain scripts) | 5 | exec |
| dir-create | 2 | mkdir |
| platform-gate | 2 | when |
| dir-remove (backup pattern) | 1 | rmdir |
| package installs | 0 | deferred v2 plugin |
| JSON merges | 0 | deferred v2 plugin |
| archive extracts | 0 | deferred v2 plugin |
The 7-primitive Tier 1 vocab is grounded, not speculated. Everything else is deferred to v2 plugin contributions.
architecture
Crate layout, trait surfaces, and data-flow for grex v1.
Workspace
Single crate grex (lib + bin). Sub-crates avoided in v1 to keep the plugin trait crate vendored in the same compilation unit. v2 may split grex-plugin-api into its own crate for ABI stability.
grex/
├── Cargo.toml
├── rust-toolchain.toml
├── src/
│ ├── main.rs # thin bin entrypoint
│ ├── lib.rs # public surface re-exports
│ ├── cli/
│ │ ├── mod.rs # clap::Command composition
│ │ ├── init.rs # grex init
│ │ ├── add.rs # grex add
│ │ ├── rm.rs # grex rm
│ │ ├── ls.rs # grex ls
│ │ ├── status.rs # grex status
│ │ ├── sync.rs # grex sync
│ │ ├── update.rs # grex update
│ │ ├── doctor.rs # grex doctor
│ │ ├── serve.rs # grex serve --mcp
│ │ ├── import.rs # grex import
│ │ ├── run.rs # grex run <action>
│ │ ├── exec.rs # grex exec <cmd>
│ │ └── output.rs # all print! / table / color
│ ├── manifest/
│ │ ├── mod.rs
│ │ ├── event.rs # intent events
│ │ ├── state.rs # folded pack state
│ │ ├── fold.rs # event stream → HashMap<Id, State>
│ │ ├── lock.rs # grex.lock.jsonl
│ │ ├── io.rs # atomic temp+rename, fd-lock
│ │ └── compact.rs
│ ├── pack/
│ │ ├── mod.rs # Pack struct, tree walk
│ │ ├── schema.rs # pack.yaml schema v1
│ │ └── discovery.rs # load/resolve children
│ ├── plugin/
│ │ ├── mod.rs # registries, trait re-exports, v1 co-located builtins
│ │ ├── action.rs # ActionPlugin trait
│ │ ├── packtype.rs # PackTypePlugin trait
│ │ └── fetcher.rs # Fetcher trait (git backend)
│ ├── log.rs # ActionLogger trait (plugin diagnostics)
│ ├── env.rs # EnvResolver trait ($VAR expansion surface)
│ ├── lockfile/
│ │ └── hash.rs # compute_actions_hash (sha256 over canonical actions+sha)
│ ├── actions/ # 7 built-in action plugins
│ │ ├── symlink.rs
│ │ ├── env.rs
│ │ ├── mkdir.rs
│ │ ├── rmdir.rs
│ │ ├── require.rs
│ │ ├── when.rs
│ │ └── exec.rs
│ ├── packtypes/ # 3 built-in pack-type plugins
│ │ ├── meta.rs
│ │ ├── declarative.rs
│ │ └── scripted.rs
│ ├── fetchers/
│ │ └── git.rs # gix or git2 behind Fetcher trait
│ ├── gitignore/
│ │ └── mod.rs # managed-block read/write
│ ├── mcp/
│ │ ├── mod.rs # stdio JSON-RPC 2.0 loop
│ │ ├── methods.rs # verb → method dispatch
│ │ └── schema.rs
│ └── concurrency/
│ ├── mod.rs # tokio runtime bootstrap
│ ├── scheduler.rs # semaphore + per-pack lock
│ └── packlock.rs # <path>/.grex-lock
├── tests/
│ ├── integration_add.rs
│ ├── integration_rm.rs
│ ├── sync_recursive.rs
│ ├── sync_parallel.rs
│ ├── gitignore_preserves_user_lines.rs
│ ├── crash_recovery.rs
│ ├── mcp_stdio.rs
│ ├── import_legacy.rs
│ ├── doctor_drift.rs
│ ├── pack_types_end_to_end.rs
│ └── property_manifest.rs
├── proof/
│ ├── lakefile.lean
│ └── Grex/
│ └── Scheduler.lean
└── .github/workflows/
├── ci.yml
├── lean.yml
└── release.yml
Core trait sketches
Full contracts in plugin-api.md. Condensed here:
use async_trait::async_trait;
use serde_json::Value;
use std::path::Path;

pub enum Os { Windows, Linux, Macos }

// v1: PackCtx is realized as ExecCtx in code (2026-04-20).
pub struct ExecCtx<'a> {
    pub vars: &'a VarEnv,      // implements EnvResolver
    pub pack_root: &'a Path,
    pub workspace: &'a Path,
    pub platform: Os,          // type-safe; decision 2026-04-20
    // deferred to M5: pack_id, dry_run, logger: &dyn ActionLogger
}

// v1 shipped shape (2026-04-20; aligned with shipped trait in M4-B review fix).
// Sync fn, typed &Action (not &Value), returns ExecStep. Async + &Value form is
// the v2-facing target reserved for external plugin loading (dylib/WASM).
pub trait ActionPlugin: Send + Sync {
    fn name(&self) -> &str;
    fn execute(&self, action: &Action, ctx: &ExecCtx<'_>) -> Result<ExecStep, ExecError>;
}

#[async_trait]
pub trait PackTypePlugin: Send + Sync {
    fn name(&self) -> &str;
    async fn install(&self, ctx: &ExecCtx<'_>, pack: &Pack) -> anyhow::Result<()>;
    async fn update(&self, ctx: &ExecCtx<'_>, pack: &Pack) -> anyhow::Result<()>;
    async fn teardown(&self, ctx: &ExecCtx<'_>, pack: &Pack) -> anyhow::Result<()>;
    async fn sync(&self, ctx: &ExecCtx<'_>, pack: &Pack) -> anyhow::Result<()>;
}

pub struct FetchReport {
    pub sha: Option<String>,
    pub branch: Option<String>,
}

#[async_trait]
pub trait Fetcher: Send + Sync {
    fn scheme(&self) -> &str; // "git"
    async fn clone(&self, url: &str, dst: &Path) -> anyhow::Result<FetchReport>;
    async fn pull(&self, dst: &Path) -> anyhow::Result<FetchReport>;
}
Verb → module map
| CLI verb | Entry module | Primary collaborators |
|---|---|---|
| init | cli::init | manifest::io, gitignore, concurrency |
| add | cli::add | manifest, pack::discovery, plugin::packtype, fetchers::git, gitignore |
| rm | cli::rm | manifest (tombstone), plugin::packtype::teardown, gitignore |
| ls | cli::ls | manifest::fold, manifest::lock |
| status | cli::status | manifest, per-pack-type status dispatch |
| sync | cli::sync | fetchers::git, concurrency::scheduler, recursion |
| update | cli::update | sync + pack-type.install if lockfile delta |
| doctor | cli::doctor | manifest integrity, gitignore diff, schema validate |
| serve | cli::serve | mcp::* |
| import | cli::import | legacy REPOS.json ingest → manifest::event::Add |
| run | cli::run | plugin::action, cli::output |
| exec | cli::exec | tokio::process, concurrency::scheduler |
Data flow (ASCII)
┌──────────────┐
argv ──►│ clap parse │
└──────┬───────┘
│ verb + args
▼
┌──────────────┐ ┌────────────────────┐
│ dispatcher │────►│ manifest::load │
└──────┬───────┘ │ fold events │
│ └────────┬───────────┘
│ │ HashMap<PackId, State>
▼ │
┌──────────────┐ │
│ pack::walk │◄─────────────┘
│ (load .grex/ │
│ pack.yaml, │
│ recurse │
│ children) │
└──────┬───────┘
│ PackTree
▼
┌──────────────┐
│ concurrency │ tokio runtime
│ scheduler │ semaphore(N)
└──────┬───────┘ per-pack .grex-lock
│
┌───────────┼───────────┐
▼ ▼ ▼
fetcher packtype action
(git plugin plugin
pull) dispatch exec
│
▼
┌──────────────┐
│ manifest:: │ atomic temp+rename
│ append │ fd-lock RW
└──────┬───────┘
│
▼
┌──────────────┐
│ lockfile │ resolved state
│ update │
└──────┬───────┘
│
▼
┌──────────────┐
│ gitignore │ managed-block sync
│ sync │
└──────┬───────┘
│
▼
┌──────────────┐
│ cli::output │ pretty | plain | json
└──────────────┘
pack::walk traverses two distinct edges in the pack graph:
- children edge: ownership. The walker clones missing children, recurses into them, and applies their lifecycle transitively.
- depends_on edge: verification only. The walker checks each named/URL'd prerequisite resolves to a present, satisfied pack in the workspace; it does NOT clone or recurse. Unresolved depends_on entries are a hard error at plan phase, before the scheduler dispatches any action. See pack-spec.md §children vs depends_on.
Runtime invariants
- I1 (Lean4 v1 proof): scheduler never holds two concurrent locks on the same pack path.
- I2: every manifest append is preceded by acquiring the global fd-lock.
- I3: .gitignore managed-block sync is idempotent: running it twice is a no-op on disk.
- I4: compaction output is fold-equivalent to its input.
- I5: pack tree walk terminates (cycle detection).
See concurrency.md for I1's Lean4 formalization.
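Invariant I1 can be pictured with a small in-process lock table. This is only a sketch under stated assumptions: `LockTable` and `PackGuard` are hypothetical names, and the real scheduler guards packs with on-disk `.grex-lock` files rather than an in-memory set. The point is the shape of the rule: a second acquisition of the same pack path is refused until the first guard is dropped.

```rust
use std::collections::HashSet;
use std::path::{Path, PathBuf};
use std::sync::{Arc, Mutex};

/// Hypothetical in-memory stand-in for the per-pack lock; not grex's real type.
#[derive(Clone, Default)]
pub struct LockTable {
    held: Arc<Mutex<HashSet<PathBuf>>>,
}

/// Releases the pack path when dropped.
pub struct PackGuard {
    table: LockTable,
    path: PathBuf,
}

impl LockTable {
    /// Returns a guard if the path was free, or None if it is already locked.
    pub fn try_lock(&self, path: &Path) -> Option<PackGuard> {
        let mut held = self.held.lock().unwrap();
        // HashSet::insert returns false when the value was already present.
        if held.insert(path.to_path_buf()) {
            Some(PackGuard { table: self.clone(), path: path.to_path_buf() })
        } else {
            None
        }
    }
}

impl Drop for PackGuard {
    fn drop(&mut self) {
        self.table.held.lock().unwrap().remove(&self.path);
    }
}
```

Holding the guard for the duration of a pack's work and relying on `Drop` for release keeps the "never two locks on one path" property local to a single type.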
pack-spec
The .grex/ contract directory and pack.yaml schema v1. Normative.
Pack definition
A pack is a git repository containing a .grex/ directory at its root. grex reads and acts on the contract inside .grex/; everything else in the repo is opaque.
some-pack/ # git repo root
├── .grex/ # contract dir (required)
│ ├── pack.yaml # required: pack manifest
│ ├── targets/ # optional: platform overrides
│ │ ├── windows.yaml
│ │ ├── linux.yaml
│ │ └── macos.yaml
│ ├── files/ # optional: payload files (configs, themes)
│ ├── hooks/ # optional: scripted-type escape hatch
│ │ ├── setup.sh / .ps1
│ │ ├── sync.sh / .ps1
│ │ └── teardown.sh / .ps1
│ └── .state/ # gitignored: runtime state cache
└── ... # opaque to grex
pack.yaml schema v1
Top-level fields
| Field | Type | Required | Notes |
|---|---|---|---|
| schema_version | string | yes | Must be "1". Future reader rejects unknown. |
| name | string | yes | Unique within the parent workspace. Slug-like. |
| type | string | yes | One of meta, declarative, scripted. |
| version | string | no | Pack's own semver; not enforced by grex v1. |
| depends_on | list[string\|url] | no | External prerequisites. Tool verifies presence; does NOT clone or walk. See below. |
| children | list[child-ref] | no | Owned sub-packs. Tool clones, walks, and syncs transitively. See below. |
| actions | list[action] | no | Ordered action list. Meaningful for type: declarative (and declarative children of meta). |
| teardown | list[action] | no | Optional explicit teardown. If omitted, default = reverse of actions. |
children vs depends_on — ownership split
The two edge types in the pack graph are distinct and tools must not conflate them:
- children: owned sub-packs. grex clones them into the workspace, walks into each on sync, and applies their lifecycle transitively. Children appear in the pack tree output (grex ls). Removing a parent tears down its children.
- depends_on: external prerequisites. grex verifies the named/URL'd packs are already present and satisfied in the workspace, but does NOT clone, walk, or modify them. They do not appear under the dependent pack in the pack tree. Failure to resolve a depends_on entry is a hard error at plan phase (before any action runs).
Every pack graph therefore has two edge kinds: a children edge (ownership / walk) and a depends_on edge (verification only). Cycle detection runs over both independently.
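A minimal sketch of cycle detection over one edge kind, assuming the graph has already been flattened into an adjacency map keyed by pack name. `Graph` and `find_cycle` are hypothetical names for illustration; running the check "over both independently" would mean calling it once with the children edges and once with the depends_on edges.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Adjacency map for ONE edge kind (children or depends_on). Hypothetical shape.
pub type Graph = BTreeMap<String, Vec<String>>;

/// Depth-first search; returns the name of a pack that sits on a cycle, if any.
pub fn find_cycle(graph: &Graph) -> Option<String> {
    fn visit<'a>(
        node: &'a str,
        graph: &'a Graph,
        on_stack: &mut BTreeSet<&'a str>,
        done: &mut BTreeSet<&'a str>,
    ) -> Option<String> {
        if done.contains(node) {
            return None; // fully explored earlier, known cycle-free
        }
        if !on_stack.insert(node) {
            return Some(node.to_string()); // reached a node already on the current path
        }
        if let Some(next) = graph.get(node) {
            for child in next {
                if let Some(hit) = visit(child, graph, on_stack, done) {
                    return Some(hit);
                }
            }
        }
        on_stack.remove(node);
        done.insert(node);
        None
    }

    let mut on_stack = BTreeSet::new();
    let mut done = BTreeSet::new();
    graph
        .keys()
        .find_map(|start| visit(start, graph, &mut on_stack, &mut done))
}
```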
children child-ref shape
children:
- url: git@github.com:user/warp-themes
path: themes # optional; default = last URL segment
ref: v1.2.0 # optional; branch, tag, or SHA. Default: remote HEAD.
Children resolve as flat siblings of the parent pack root: a parent at ~/code/.grex/pack.yaml with a child path: themes materialises that child at ~/code/themes/.grex/pack.yaml. The bare-name rule on path is enforced at plan phase since v1.1.0 — see Validation rules for the regex and rejection shape.
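The flat-sibling rule can be sketched as a pure path computation. The helper names below are hypothetical, and splitting the URL on both `/` and `:` (to cope with scp-style remotes) is an assumption of this sketch; the spec only says the default is the last URL segment, and nothing here strips a trailing `.git`.

```rust
use std::path::{Path, PathBuf};

/// Default for `path` when a child-ref omits it: the last URL segment.
/// Splitting on ':' as well as '/' is an assumption for scp-style URLs.
pub fn default_child_path(url: &str) -> &str {
    let trimmed = url.trim_end_matches('/');
    trimmed
        .rsplit(|c| c == '/' || c == ':')
        .next()
        .unwrap_or(trimmed)
}

/// Children materialise as siblings directly under the parent pack root.
pub fn child_root(parent_root: &Path, url: &str, path: Option<&str>) -> PathBuf {
    parent_root.join(path.unwrap_or_else(|| default_child_path(url)))
}
```

With a parent root of `~/code` and `path: themes`, `child_root` yields `~/code/themes`, matching the layout described above.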
actions list
Each entry is a YAML object with exactly one known action key (symlink, env, mkdir, rmdir, require, when, exec) or a plugin-registered name. The value under the key is the action's arg-object, per that action's schema (see actions.md).
Targets / platform overrides
Files under .grex/targets/{windows,linux,macos}.yaml are merged over the base pack.yaml on the matching OS. Merge rules:
- Top-level scalars (name, type, version): override replaces.
- Lists (actions, children, depends_on): appended (base first, then override), unless the override sets actions_replace: true at top level.
- The override file follows the same schema as pack.yaml (minus schema_version; inherited).
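The merge rules above can be sketched as follows. `PackDoc` is a hypothetical, heavily reduced stand-in for a parsed pack.yaml (actions are plain strings here), and this sketch assumes `actions_replace` affects only the `actions` list; the spec text does not say whether it also applies to `children` and `depends_on`.

```rust
/// Hypothetical reduced view of pack.yaml or a targets/<os>.yaml override.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct PackDoc {
    pub name: Option<String>,
    pub version: Option<String>,
    pub actions: Vec<String>,
    pub children: Vec<String>,
    pub depends_on: Vec<String>,
    pub actions_replace: bool,
}

/// Base entries first, then override entries.
fn concat(base: &[String], over: &[String]) -> Vec<String> {
    base.iter().chain(over).cloned().collect()
}

pub fn merge(base: &PackDoc, over: &PackDoc) -> PackDoc {
    PackDoc {
        // Scalars: the override replaces when it sets a value.
        name: over.name.clone().or_else(|| base.name.clone()),
        version: over.version.clone().or_else(|| base.version.clone()),
        // Assumption: actions_replace only swaps out the actions list.
        actions: if over.actions_replace {
            over.actions.clone()
        } else {
            concat(&base.actions, &over.actions)
        },
        children: concat(&base.children, &over.children),
        depends_on: concat(&base.depends_on, &over.depends_on),
        actions_replace: false,
    }
}
```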
Alternative to separate files: inline when: gates in actions (platform dispatch via the when action — see below).
files/ payload convention
Arbitrary files shipped inside the pack. Actions (e.g. symlink) reference them via paths relative to the pack root: files/config.yaml, files/themes/default.toml. grex resolves these against the pack's workdir at runtime.
.state/ runtime cache
Gitignored. Holds per-pack runtime cache (lock markers, resolved deps, per-platform resolution memo). grex doctor --compact may prune this.
The 3 built-in pack-types
meta
Nests children only. Has no own actions. Lifecycle:
- install = clone all children, recursively dispatch their pack-type's install.
- sync = git pull self, then recurse into children's sync.
- update = sync + dispatch children's update if lockfile SHA changed.
- teardown = recurse children teardown, then remove self dir (if owned).
schema_version: "1"
name: dev-env
type: meta
children:
- url: git@github.com:user/warp-cfg
path: warp-cfg
- url: git@github.com:user/fonts-pack
path: fonts
declarative
Runs actions list from pack.yaml in order. All actions are idempotent (or gated by require). May also have children.
- install = run actions top-to-bottom under the current OS.
- sync = git pull self, then recurse into children. actions re-run only if lockfile SHA changed (covered by update).
- update = sync + re-run actions if lockfile delta.
- teardown = run teardown: list if present; else reverse-order rollback of actions.
schema_version: "1"
name: warp-cfg
type: declarative
version: "0.2.0"
actions:
- require:
any_of:
- cmd_available: git
- os: windows
on_fail: error
- when:
os: windows
actions:
- mkdir: { path: "$HOME/.warp" }
- symlink:
src: files/config.yaml
dst: "$HOME/.warp/config.yaml"
backup: true
normalize: true
- env:
name: WARP_HOME
value: "$HOME/.warp"
scope: user
- when:
os: macos
actions:
- symlink:
src: files/config.yaml
dst: "$HOME/Library/Application Support/warp/config.yaml"
teardown:
- rmdir: { path: "$HOME/.warp", backup: true }
scripted
Escape hatch. Runs .grex/hooks/{setup,sync,teardown}.{sh,ps1} on the matching OS. grex picks .ps1 on Windows, .sh on Linux/macOS. If the expected hook is absent for the current OS, the lifecycle phase no-ops.
- install = run hooks/setup.{sh,ps1} with cwd = pack workdir.
- sync = git pull self, then run hooks/sync.{sh,ps1} if present.
- update = sync + rerun setup if lockfile delta (no-op if no setup hook).
- teardown = run hooks/teardown.{sh,ps1} if present.
Hooks receive env vars: GREX_PACK_NAME, GREX_PACK_PATH, GREX_PACK_OS, GREX_DRY_RUN.
Exit code non-zero = failure (propagates).
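Hook selection for a scripted pack might look like the sketch below. `hook_name` and `plan_hook` are hypothetical helpers, and `shipped` stands in for a directory listing of `.grex/hooks/`; only the rules themselves (.ps1 on Windows, .sh elsewhere, absent hook means the phase no-ops) come from the text above.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Os {
    Windows,
    Linux,
    Macos,
}

/// File name grex would look for under .grex/hooks/ for a lifecycle phase.
pub fn hook_name(phase: &str, os: Os) -> String {
    let ext = match os {
        Os::Windows => "ps1",
        Os::Linux | Os::Macos => "sh",
    };
    format!("{phase}.{ext}")
}

/// Some(file) when the hook ships for this OS; None means the phase is a no-op.
pub fn plan_hook(phase: &str, os: Os, shipped: &[&str]) -> Option<String> {
    let name = hook_name(phase, os);
    let present = shipped.contains(&name.as_str());
    present.then_some(name)
}
```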
schema_version: "1"
name: legacy-vim
type: scripted
# hooks/ directory ships setup.sh, setup.ps1, teardown.sh, teardown.ps1
Plain-git children (v1.1.1+)
A child path declared in a parent pack's children: list does not have
to carry its own .grex/pack.yaml. When the walker resolves a child to a
directory that contains .git/ but no .grex/pack.yaml, grex synthesizes
an in-memory scripted-no-hooks pack manifest for it. No file is written
to disk.
Synthetic packs are leaves by construction. They declare empty
children: [], empty actions: [], and empty teardown: [], so the
walker recurses no further past them. Sync against a synthetic pack runs
git pull only — no setup, update, or teardown hooks fire (there are
none to fire).
This makes the bootstrap pattern (REPOS.json-style flat-sibling layouts:
a parent meta-pack whose children are existing plain git repos that the
user did not author specifically for grex) walk end-to-end on
grex sync without per-child .grex/pack.yaml authoring ceremony.
Surfacing
- Lockfile: synthetic pack entries set
synthetic: true(defaultfalseand#[serde(default)], so v1.1.0 lockfiles parse forward). grex ls: synthetic entries are prefixed with~in tree mode and gain"synthetic": truein--jsonmode.grex doctor: synthetic packs reportOK (synthetic)instead of raising a missing-manifest error. JSON output gains"synthetic": trueon the per-pack diagnostic.
Failure mode
If a declared child path resolves to a directory that has neither
.grex/pack.yaml nor .git/, the walker still raises
TreeError::ManifestNotFound. Synthesis only fires when .git/ is present
and .grex/pack.yaml is absent; a path pointing at "nothing" is genuinely
an error.
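The three outcomes can be written as a small decision table. `ChildKind`, `classify`, and `classify_dir` are hypothetical names for this sketch; only `TreeError::ManifestNotFound` is taken from the text, and the real error type presumably carries more context.

```rust
use std::path::Path;

#[derive(Debug, PartialEq)]
pub enum ChildKind {
    /// The child ships its own .grex/pack.yaml.
    Declared,
    /// Plain git repo: an in-memory scripted-no-hooks manifest is synthesized.
    Synthetic,
}

#[derive(Debug, PartialEq)]
pub enum TreeError {
    ManifestNotFound,
}

pub fn classify(has_pack_yaml: bool, has_git_dir: bool) -> Result<ChildKind, TreeError> {
    match (has_pack_yaml, has_git_dir) {
        (true, _) => Ok(ChildKind::Declared),
        (false, true) => Ok(ChildKind::Synthetic),
        (false, false) => Err(TreeError::ManifestNotFound),
    }
}

/// Filesystem probe feeding the decision above.
pub fn classify_dir(dir: &Path) -> Result<ChildKind, TreeError> {
    classify(
        dir.join(".grex").join("pack.yaml").is_file(),
        dir.join(".git").exists(),
    )
}
```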
Example
Workspace layout:
~/code/ # parent (meta) pack root, declares children
├── .grex/pack.yaml # type: meta, children: [algo-leet, neetcode]
├── algo-leet/ # child #1 — plain git repo, no .grex/
│ └── .git/
└── neetcode/ # child #2 — plain git repo, no .grex/
└── .git/
grex sync ~/code walks algo-leet and neetcode as synthesized
scripted-no-hooks packs, runs git pull in each, and exits 0. The
lockfile records both with synthetic: true; grex ls shows them
with the ~ prefix.
Validation rules
- schema_version must be exactly "1".
- type must be one of the 3 built-ins (or a registered plugin name when the plugin is loaded).
- type in .grex/pack.yaml is the authoritative source of truth. Runtime manifest / lockfile entries record type as an observed snapshot only. On disagreement (manifest type ≠ pack.yaml type), pack.yaml wins and the manifest is corrected on the next sync. See manifest.md.
- name regex: ^[a-z][a-z0-9-]*$ (letter-led; digits allowed in later positions).
- children[].path must be bare name: same regex as name. Rejected: path separators (/, \), ., .., the empty string "", anything starting with a digit or capital letter, or a leading /. The empty-string rejection matters because it would otherwise resolve children at the parent's own pack root and silently overwrite it.
- Unknown top-level keys rejected unless prefixed with x- (user annotations).
- Unknown action keys rejected unless the plugin is registered.
- Empty lists are VALID: actions: [], children: [], depends_on: [], teardown: [] all parse cleanly. Empty actions in a declarative pack is a no-op install. Empty children in a meta pack is a no-op sync. Do not reject empty lists.
- Duplicate symlink.dst within the same pack is a validation error, caught at plan phase (before execution). Two or more symlink actions resolving to the same absolute dst path abort the plan with ActionArgsInvalid. Cross-pack duplicates are handled by conflict detection at the workspace level (separate concern; see concurrency.md).
- YAML anchors (&name) and aliases (*name) are REJECTED during parse. Rationale: prevents billion-laughs / alias-bomb DoS. Implementation: parser config disables alias resolution, or the loader detects and errors before expansion.
grex doctor runs these checks on every registered pack.
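As an illustration of the bare-name rule shared by name and children[].path, the regex ^[a-z][a-z0-9-]*$ can be checked by hand without a regex dependency. `is_valid_name` is a hypothetical helper, not a function from the grex codebase.

```rust
/// Hand-rolled equivalent of ^[a-z][a-z0-9-]*$.
pub fn is_valid_name(s: &str) -> bool {
    let mut chars = s.chars();
    // First character must be a lowercase ASCII letter; this also rejects "".
    match chars.next() {
        Some(c) if c.is_ascii_lowercase() => {}
        _ => return false,
    }
    // Remaining characters: lowercase letters, digits, or '-'.
    chars.all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '-')
}
```

Every rejection case listed for children[].path (separators, `.`, `..`, the empty string, a leading digit or capital) falls out of the same predicate.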
Opacity rule
grex reads only .grex/. It never inspects or touches content outside it. Pack authors may store anything adjacent — scripts, assets, source — and grex stays agnostic.
Relationship to the workspace manifest
A workspace (the directory where you run grex init) is itself a git repo. It has its own grex.jsonl + grex.lock.jsonl tracking which packs are registered. A workspace does not need its own .grex/pack.yaml unless it is also meant to be published as a pack.
manifest
grex.jsonl (intent log) and grex.lock.jsonl (resolved state). Both live at the workspace root. Both are newline-delimited JSON (LF on all platforms — writer normalizes).
Two-file split
| File | Purpose | Written by |
|---|---|---|
| grex.jsonl | Append-only intent log. User actions: register a pack, remove a pack, update a ref. | add, rm, update verbs. |
| grex.lock.jsonl | Append-only resolved state. Actual SHA + install state after each successful sync/install. | sync, update verbs. |
Split rationale: intent is portable across machines; lockfile pins the actual state on this machine. Commit intent to git; lockfile may be committed too (for reproducible bootstrap) or gitignored (for per-machine pinning).
grex.jsonl event schemas
Common envelope (all events):
{"op":"<verb>","ts":"<rfc3339>","id":"<pack-id>","schema_version":"1"}
add
{"op":"add","ts":"2026-04-19T10:00:00Z","id":"warp-cfg","schema_version":"1","url":"git@github.com:user/warp-cfg","path":"warp-cfg","type":"declarative","ref":"main"}
rm
{"op":"rm","ts":"2026-04-19T11:00:00Z","id":"warp-cfg","schema_version":"1"}
update
{"op":"update","ts":"2026-04-19T12:00:00Z","id":"warp-cfg","schema_version":"1","ref":"v0.2.0"}
sync (optional intent marker)
{"op":"sync","ts":"2026-04-19T13:00:00Z","id":"warp-cfg","schema_version":"1"}
Action event brackets — action_started / action_completed / action_halted
The sync path writes three bracketing events around each action it applies. These sit alongside (do not replace) the sync intent marker; readers built against v1.0 continue to parse cleanly — unknown op values are ignored per the forward-compat rule.
{"op":"action_started","ts":"2026-04-20T10:00:00Z","id":"warp-cfg","schema_version":"1","action":"symlink","idx":0}
{"op":"action_completed","ts":"2026-04-20T10:00:00Z","id":"warp-cfg","schema_version":"1","action":"symlink","idx":0,"changed":true}
{"op":"action_halted","ts":"2026-04-20T10:00:01Z","id":"warp-cfg","schema_version":"1","action":"exec","idx":1,"reason":"ExecNonZero","stderr":"<truncated to 2 KiB>"}
Semantics:
- action_started is written under the manifest lock before the action runs.
- action_completed is written under the manifest lock after the action returns Ok.
- action_halted is written when the action returns Err, carrying a compact failure reason plus (for exec) a stderr tail capped at 2 KiB (see actions.md §exec).
- An action_started with no matching action_completed / action_halted indicates a crash mid-action. The startup recovery scan (see concurrency.md §Recovery scan) reports these; cleanup is grex doctor territory (M4+).
ManifestLock is acquired per-action (not per-sync), so a long sync with many actions interleaves lock acquire/release rather than holding the global lock end-to-end.
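A recovery scan over the bracketing events could be approximated as below. This is a simplified sketch: `Bracket`, `ActionEvent`, and `dangling_starts` are hypothetical names, and events are keyed by (id, idx) only, so a crash that is later followed by a successful re-run of the same action would be masked. A real scan would likely also consider timestamps or sync boundaries.

```rust
use std::collections::BTreeSet;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Bracket {
    Started,
    Completed,
    Halted,
}

/// Reduced view of an action_* event: pack id, action index, bracket kind.
#[derive(Debug, Clone, PartialEq)]
pub struct ActionEvent {
    pub id: String,
    pub idx: usize,
    pub kind: Bracket,
}

/// Returns (pack id, action idx) pairs whose action_started was never closed.
pub fn dangling_starts(events: &[ActionEvent]) -> Vec<(String, usize)> {
    let mut open = BTreeSet::new();
    for ev in events {
        let key = (ev.id.clone(), ev.idx);
        match ev.kind {
            Bracket::Started => {
                open.insert(key);
            }
            Bracket::Completed | Bracket::Halted => {
                open.remove(&key);
            }
        }
    }
    open.into_iter().collect()
}
```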
Fold algorithm (pseudocode):
state = {}
for line in read_jsonl(grex.jsonl):
    match line.op:
        "add":    state[id] = Pack::from(line)
        "update": state[id].patch(line)
        "rm":     state.remove(id)
        "sync":   no-op (intent marker)
        other:    no-op (action_* brackets and unknown ops are ignored)
return state
O(N) in event count. Deterministic regardless of compaction history.
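For readers who prefer concrete types, here is a minimal Rust rendering of the fold. The `Event` and `PackState` shapes are hypothetical and reduced to a couple of fields; JSON parsing is left out, and skipping an update for an unknown id is an assumption of this sketch rather than documented behaviour.

```rust
use std::collections::BTreeMap;

/// One parsed line of grex.jsonl, reduced to what the fold needs.
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    Add { id: String, url: String, git_ref: String },
    Update { id: String, git_ref: String },
    Rm { id: String },
    /// sync markers, action_* brackets, and any op this reader does not know.
    Other,
}

#[derive(Debug, Clone, PartialEq)]
pub struct PackState {
    pub url: String,
    pub git_ref: String,
}

/// Replays the log in order; one pass, so O(N) in event count.
pub fn fold(events: &[Event]) -> BTreeMap<String, PackState> {
    let mut state = BTreeMap::new();
    for ev in events {
        match ev {
            Event::Add { id, url, git_ref } => {
                let pack = PackState { url: url.clone(), git_ref: git_ref.clone() };
                state.insert(id.clone(), pack);
            }
            Event::Update { id, git_ref } => {
                // Assumption: an update for an id that is not live is skipped.
                if let Some(pack) = state.get_mut(id) {
                    pack.git_ref = git_ref.clone();
                }
            }
            Event::Rm { id } => {
                state.remove(id);
            }
            Event::Other => {}
        }
    }
    state
}
```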
grex.lock.jsonl resolved-state schema
{"id":"warp-cfg","sha":"abc123...","branch":"main","installed_at":"2026-04-19T13:05:00Z","actions_hash":"sha256:deadbeef..."}
Fields:
| Field | Required | Description |
|---|---|---|
| id | yes | Pack id; matches manifest id. |
| sha | yes | Git commit SHA of the pack workdir after sync. Stored as the empty string when the pack is not a git working tree (e.g. a local-only root pack) OR when the HEAD probe failed. actions_hash is computed with the same commit_sha value, so empty-SHA records are internally consistent; if a future sync successfully probes a non-empty SHA, the hash differs and the skip-on-hash short-circuit correctly re-executes the pack. Probe failures are surfaced as a grex::walker tracing::warn! line so operators see the signal without the sync aborting. Lockfile-write failures at end-of-sync are intentionally non-fatal (recorded as a report.event_log_warnings entry); the successful pack actions are not rolled back. |
| branch | no | Branch tracked; null if detached. |
| installed_at | yes | RFC3339 timestamp of last successful install/sync. |
| actions_hash | yes | SHA-256 content fingerprint of the pack's installable surface. Scope varies by pack type (see below). Used to detect whether update needs to re-run install logic. |
actions_hash scope by pack type (name retained; semantics explicitly broadened):
- declarative: hash of normalized actions array + files/ tree.
- meta: hash of the serialized children array + each child's resolved SHA (from the child's lockfile entry). Captures the fact that a meta pack's installable surface is the set of owned children at pinned revisions.
- scripted: hash of normalized actions array (if any) + files/ tree + SHA-256 of each hook file in .grex/hooks/ (sorted by filename, then concatenated). Any hook edit re-triggers update.
Rationale for keeping the name actions_hash: the field's purpose — "has the installable content changed since last sync?" — is unchanged; only its per-type inputs differ. Renaming would force a lockfile schema bump for no semantic gain.
Fold for lockfile: last-line-wins per id.
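Last-line-wins is a one-pass map insert. The sketch below uses a hypothetical, reduced `LockEntry` and adds a `needs_reinstall` helper to show how the folded view and actions_hash might drive update; treating a missing lockfile entry as "needs install" is an assumption of this sketch.

```rust
use std::collections::BTreeMap;

/// Reduced lockfile row; the real entry also carries branch and installed_at.
#[derive(Debug, Clone, PartialEq)]
pub struct LockEntry {
    pub id: String,
    pub sha: String,
    pub actions_hash: String,
}

/// Later lines overwrite earlier ones for the same id.
pub fn fold_lock(lines: &[LockEntry]) -> BTreeMap<String, LockEntry> {
    let mut latest = BTreeMap::new();
    for entry in lines {
        latest.insert(entry.id.clone(), entry.clone());
    }
    latest
}

/// True when the freshly computed hash differs from the recorded one.
/// Assumption: a pack with no lockfile entry always needs install.
pub fn needs_reinstall(
    latest: &BTreeMap<String, LockEntry>,
    id: &str,
    current_hash: &str,
) -> bool {
    match latest.get(id) {
        Some(entry) => entry.actions_hash != current_hash,
        None => true,
    }
}
```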
type field authority
The type recorded on add events and in lockfile entries is an observed snapshot of what the pack reported at that moment. The authoritative source of truth is .grex/pack.yaml's type field (see pack-spec.md §Validation rules). If the manifest type disagrees with pack.yaml on a subsequent sync, pack.yaml wins and the manifest is corrected by emitting a fresh add/update event reflecting the true type. Readers MUST NOT treat manifest type as normative when pack.yaml is available.
Atomic append
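Single-line append issues one write for the whole record (payload plus trailing newline), then flushes file data to disk with sync_data:
use std::{fs::OpenOptions, io::Write};

let mut f = OpenOptions::new().append(true).open("grex.jsonl")?;
// Build the full record first so line and newline go out in a single write call.
let record = format!("{line}\n");
f.write_all(record.as_bytes())?;
// Flush file contents to stable storage before releasing the lock.
f.sync_data()?;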
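Held under fd-lock, which is what serializes concurrent writers. In append mode each write lands at the current end of file. Note that the POSIX atomicity guarantee for writes ≤ PIPE_BUF formally covers pipes and FIFOs, not regular files, so the enforced event size cap of 2 KiB is a conservative bound that keeps each record to one small write, not a guarantee on its own.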
Compaction (temp + rename)
Periodic or on grex doctor --compact:
- Acquire global fd-lock (exclusive).
- Fold events → state map.
- Emit minimal equivalent event set to grex.jsonl.tmp (one add per live id, tombstoned ids dropped entirely).
- fs::rename(grex.jsonl.tmp, grex.jsonl): atomic on POSIX and Windows NTFS (MoveFileEx with MOVEFILE_REPLACE_EXISTING).
- Release fd-lock.
Invariant: fold(pre-compaction) == fold(post-compaction).
Lockfile compaction mirrors intent-log compaction: last-line-wins per id → one line per id → atomic rename.
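The compaction invariant is easy to state in code. The sketch below reuses a reduced, hypothetical event shape (id plus ref only) and shows compaction as "fold, then emit one add per live id"; file I/O, locking, and the temp + rename step are deliberately left out.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    Add { id: String, git_ref: String },
    Update { id: String, git_ref: String },
    Rm { id: String },
}

/// Folded view: pack id mapped to its current ref.
pub fn fold(events: &[Event]) -> BTreeMap<String, String> {
    let mut state = BTreeMap::new();
    for ev in events {
        match ev {
            Event::Add { id, git_ref } => {
                state.insert(id.clone(), git_ref.clone());
            }
            Event::Update { id, git_ref } => {
                if let Some(current) = state.get_mut(id) {
                    *current = git_ref.clone();
                }
            }
            Event::Rm { id } => {
                state.remove(id);
            }
        }
    }
    state
}

/// Minimal equivalent log: one add per live id, tombstoned ids dropped.
pub fn compact(events: &[Event]) -> Vec<Event> {
    fold(events)
        .into_iter()
        .map(|(id, git_ref)| Event::Add { id, git_ref })
        .collect()
}
```

A property test in the spirit of tests/property_manifest.rs could generate random event sequences and assert `fold(&log) == fold(&compact(&log))`.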
Locking
Global RW lock via fd-lock:
use std::fs::OpenOptions;

let file = OpenOptions::new().read(true).write(true).open("grex.jsonl")?;
let mut lock = fd_lock::RwLock::new(file);
let _guard = lock.write()?; // exclusive for append/compact
- Mutators (add, rm, update, sync write-phase, doctor --compact) take exclusive write lock.
- Readers (ls, status, sync read-phase) take shared read lock.
Crash recovery (torn-line detection)
On every read:
- Parse line-by-line.
- If the final line fails JSON parse AND file does not end in \n, treat as torn write.
- Truncate file to length of last valid line.
- Emit tracing warning; continue.
Test: tests/crash_recovery.rs spawns a child, SIGKILL / TerminateProcess mid-append, asserts parent recovers.
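The torn-line rule above can be sketched as a pure function over the file's bytes. This is a minimal illustration, not the shipped reader: `TailScan`, `scan_tail`, and the `is_valid_event` hook (standing in for the real JSON parse) are hypothetical names.

```rust
/// Result of scanning a JSONL byte buffer for a torn final line.
#[derive(Debug, PartialEq)]
pub enum TailScan {
    /// No torn tail: the file is empty, ends in '\n', or its last line parses.
    Clean,
    /// The final line is torn; keep only the first `keep_len` bytes.
    Torn { keep_len: usize },
}

/// `is_valid_event` is a hypothetical stand-in for the real JSON parse.
pub fn scan_tail(bytes: &[u8], is_valid_event: impl Fn(&[u8]) -> bool) -> TailScan {
    // A file that is empty or ends in '\n' has no partial final line.
    if bytes.is_empty() || bytes.ends_with(b"\n") {
        return TailScan::Clean;
    }
    // The final line starts one byte after the last newline, or at 0.
    let start = bytes.iter().rposition(|&b| b == b'\n').map_or(0, |i| i + 1);
    if is_valid_event(&bytes[start..]) {
        // Parses but lacks the trailing newline: not torn under the rule.
        TailScan::Clean
    } else {
        // Truncating to `start` keeps everything through the last valid line.
        TailScan::Torn { keep_len: start }
    }
}
```

A caller would truncate the file to `keep_len` and emit the tracing warning; both side effects are left out so the rule itself stays testable.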
Schema versioning
Every event has schema_version: "1". Breaking changes bump. Reader rejects unknown versions with actionable error pointing to grex upgrade-schema (post-v1 migration command).
Lockfile entries carry an implicit schema version tied to the workspace config. Separate bump cadence from intent-log schema.
Migration from legacy REPOS.json
grex import --from-repos-json <path> reads flat [{"url":"...","path":"..."},...] → emits one add event per entry with type defaulted to meta (or user-specified via --default-type). Idempotent: re-running detects existing ids by path and no-ops.
Example sequence
{"op":"add","ts":"2026-04-19T10:00:00Z","id":"warp-cfg","schema_version":"1","url":"git@github.com:me/warp-cfg","path":"warp-cfg","type":"declarative","ref":"main"}
{"op":"add","ts":"2026-04-19T10:01:00Z","id":"fonts","schema_version":"1","url":"git@github.com:me/fonts","path":"fonts","type":"meta","ref":"main"}
{"op":"update","ts":"2026-04-19T11:00:00Z","id":"warp-cfg","schema_version":"1","ref":"v0.2.0"}
{"op":"rm","ts":"2026-04-19T12:00:00Z","id":"fonts","schema_version":"1"}
Corresponding lock after first successful sync:
{"id":"warp-cfg","sha":"abc123def","branch":"main","installed_at":"2026-04-19T10:00:05Z","actions_hash":"sha256:..."}
{"id":"fonts","sha":"fff111","branch":"main","installed_at":"2026-04-19T10:01:05Z","actions_hash":"sha256:..."}
Fold of intent log → live set = {warp-cfg} (fonts tombstoned). Subsequent sync rewrites lockfile entry for warp-cfg and drops the fonts line on compaction.
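The fold over the example sequence can be sketched as follows. The `Event` enum is a simplified, hypothetical model that keeps only `id` and `ref`; ignoring an update for an id that is not live is an assumption of this sketch, not documented behaviour.

```rust
use std::collections::BTreeMap;

/// Simplified intent-log event (hypothetical; real events carry more fields).
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    Add { id: String, git_ref: String },
    Update { id: String, git_ref: String },
    Rm { id: String },
}

/// Fold events in file order into the live set: id -> current ref.
pub fn fold(events: &[Event]) -> BTreeMap<String, String> {
    let mut live = BTreeMap::new();
    for ev in events {
        match ev {
            Event::Add { id, git_ref } => {
                live.insert(id.clone(), git_ref.clone());
            }
            Event::Update { id, git_ref } => {
                // Assumption: an update only touches ids that are currently live.
                if let Some(slot) = live.get_mut(id) {
                    *slot = git_ref.clone();
                }
            }
            Event::Rm { id } => {
                // Tombstone: the id leaves the live set.
                live.remove(id);
            }
        }
    }
    live
}

/// Minimal equivalent event set: one `Add` per live id, tombstones dropped.
pub fn compact(events: &[Event]) -> Vec<Event> {
    fold(events)
        .into_iter()
        .map(|(id, git_ref)| Event::Add { id, git_ref })
        .collect()
}
```

With these definitions the compaction invariant reads `fold(&compact(&log)) == fold(&log)`.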
walker
How grex sync traverses your nested meta-pack tree under v1.2.0+ — phase by phase, with the rules that decide what to clone, what to recurse into, and what to refuse.
Canonical source: .omne/cfg/walker.md (SSOT, separate grex-inst repo). This page is the user-facing projection; the SSOT is normative for behaviour.
What is a meta pack?
A pack is any directory carrying <dir>/.grex/pack.yaml. There are two flavours:
- meta pack: pack.yaml lists children:. Owns its own lockfile at <meta>/.grex/grex.lock.jsonl. Recursion enters here.
- leaf pack: pack.yaml has no children:. Holds actions, no lockfile.
The directory where you run grex is the cwd-meta — the entry point for the recursion. There is no longer a single global "workspace root" anchor (retired in v1.2.0); every recursion frame computes destinations against ITS own meta dir.
Three changes vs. v1.1.x
- Parent-relative resolution. dest = current_meta.join(child.path). Each frame uses its own meta dir as the join anchor.
- Distributed lockfile. Each meta has its own <meta>/.grex/grex.lock.jsonl listing ONLY direct children. Sub-metas are autonomous: a parent has zero knowledge of grandchildren. See lockfile.
- Cargo-style parallel. Direct siblings sync in parallel; sub-meta recursion fires in parallel across siblings. Bounded by concurrency primitives.
The walker is manifest-graph-driven, not filesystem-driven. It only ever visits paths declared by some live manifest's children: list. Undeclared directories on disk — even those carrying their own .git/ — are NOT auto-discovered, NOT auto-registered. v1.1.1's sync-time auto-synthesis is retired; see §5-way classifier.
The three phases
sync(cwd_meta) runs three phases per recursion frame. Each frame is autonomous: load my own pack.yaml, sync only my direct children, then recurse.
Phase 1 — sync direct children (parallel)
For each child in manifest.children, in parallel:
- Compute dest = canonical(cwd_meta.join(child.path)). Pre-canonicalization rejects relative segments that would resolve outside cwd_meta.
- Re-verify no path segment is a symlink crossing the parent boundary (see §Symlink hardening and toctou).
- mkdir -p dest.parent() (idempotent; concurrent siblings sharing an ancestor like tools/ race safely).
- Apply the 5-way classifier (next section).
- Upsert a LockEntry into <meta>/.grex/grex.lock.jsonl, keyed by canonical meta-relative POSIX path.
After all children settle, if any landed on the "untracked git" branch the walker returns Err(UntrackedGitRepos(list)) with the complete list — no partial completion. Phase 2 and Phase 3 do not run for this frame.
5-way classifier (Phase 1)
The walker examines dest and routes to exactly one of five branches (evaluated top-down, mutually exclusive):
| # | Pre-condition at dest | Action |
|---|---|---|
| 1 | Does not exist | git clone child.url dest --branch child.ref |
| 2 | Exists AND is an empty directory | Treat as branch 1 — retry the clone (recovers a failed mid-clone that left an empty dest). |
| 3 | dest/.git exists AND dest/.grex/pack.yaml does NOT | Push onto the untracked list. NO synthesis under v1.2.0+; user must run grex add <url> <path>. |
| 4 | dest exists, is non-empty, AND lacks .git/ | Return Err(DestOccupied(dest, content_summary)). Foreign content; refuses to clone-over. |
| 5 | dest/.git AND dest/.grex/pack.yaml BOTH present (registered pack) | git fetch + checkout child.ref. Skip-on-hash if actions_hash and SHA unchanged. |
Branches 1, 2, and 5 are the only ones that mutate dest. Branch 2 explicitly recovers a failed-mid-clone state, so a second sync always reaches branch 5 (idempotent). Branch 4 is a hard error — a typo or stale checkout that the walker refuses to silently destroy.
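The table's top-down evaluation can be written as a small pure function. This is a sketch under stated assumptions: `DestProbe`, `Branch`, and `classify` are hypothetical names, and the caller is assumed to have already gathered the four observations from disk.

```rust
/// What the walker observed at `dest` (hypothetical probe struct).
#[derive(Debug, Clone, Copy)]
pub struct DestProbe {
    pub exists: bool,
    pub is_empty_dir: bool,
    pub has_git: bool,
    pub has_pack_yaml: bool,
}

/// One variant per row of the classifier table.
#[derive(Debug, PartialEq)]
pub enum Branch {
    Clone,            // 1: dest does not exist
    RetryClone,       // 2: empty directory left by a failed clone
    Untracked,        // 3: .git present, .grex/pack.yaml absent
    DestOccupied,     // 4: non-empty, no .git (foreign content)
    FetchAndCheckout, // 5: registered pack
}

/// Evaluate the branches top-down; exactly one applies.
pub fn classify(p: DestProbe) -> Branch {
    if !p.exists {
        return Branch::Clone;
    }
    if p.is_empty_dir {
        return Branch::RetryClone;
    }
    if p.has_git && !p.has_pack_yaml {
        return Branch::Untracked;
    }
    if !p.has_git {
        return Branch::DestOccupied;
    }
    Branch::FetchAndCheckout
}
```

Keeping the classifier free of filesystem calls makes the mutual exclusivity of the five branches easy to check in a unit test.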
Phase 2 — prune children removed from manifest
Read the lockfile. For each entry whose path is NOT in the current manifest's children: paths:
- If dest/.git does not exist → drop the lockfile entry, no rm -rf (idempotent: already gone).
- Prune-safety check (default-deny; bypass only with --force-prune):
  - HEAD SHA must match entry.sha.
  - Working tree must be clean (git status --porcelain --ignored empty; covers tracked edits AND ignored content like target/ or node_modules/).
  - No in-progress git op (rebase, merge, cherry-pick, revert, bisect; see force-prune §In-progress probe).
  - Recursive consent walk. If dest contains its own non-empty .grex/grex.lock.jsonl, recursively check every grandchild for the same three conditions. Any dirty/in-progress grandchild → refuse the prune unless --force-prune-recursive.
- rm -rf dest (delegated to platform-native helper).
- Delete the lockfile entry (atomic rewrite).
Cleanup is CLI-invocation-driven, not eager. Removing a child from pack.yaml triggers prune on the next grex sync / update, not on edit. See force-prune for the full safety contract and audit log.
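The default-deny path of Phase 2 (no override flags) can be sketched as a decision function. All names here (`PruneProbe`, `PruneDecision`, `decide_prune`) are hypothetical, the probe values are assumed to be gathered by the caller, and the `--force-prune` family is deliberately not modelled.

```rust
/// Observations about one orphaned dest (hypothetical probe struct).
#[derive(Debug, Clone, Copy)]
pub struct PruneProbe {
    pub has_git: bool,
    pub head_matches_lock: bool,
    pub tree_clean: bool,
    pub op_in_progress: bool,
    /// Result of the recursive consent walk over grandchildren.
    pub grandchildren_ok: bool,
}

#[derive(Debug, PartialEq)]
pub enum PruneDecision {
    /// dest/.git is already gone: drop the lockfile entry, no rm -rf.
    DropEntryOnly,
    /// A safety check failed; nothing is deleted.
    Refuse(&'static str),
    /// All checks passed: rm -rf dest, then delete the lockfile entry.
    RemoveAndDrop,
}

/// Default-deny evaluation, in the order the checks are listed above.
pub fn decide_prune(p: PruneProbe) -> PruneDecision {
    if !p.has_git {
        return PruneDecision::DropEntryOnly;
    }
    if !p.head_matches_lock {
        return PruneDecision::Refuse("HEAD SHA differs from lockfile entry");
    }
    if !p.tree_clean {
        return PruneDecision::Refuse("working tree has tracked edits or ignored content");
    }
    if p.op_in_progress {
        return PruneDecision::Refuse("git operation in progress");
    }
    if !p.grandchildren_ok {
        return PruneDecision::Refuse("dirty or in-progress grandchild");
    }
    PruneDecision::RemoveAndDrop
}
```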
Phase 3 — recurse into child metas (parallel, autonomous)
For each child, in parallel:
- Compute child_dest = cwd_meta.join(child.path).
- If child_dest/.grex/pack.yaml exists, parse it.
- If the parsed manifest has non-empty children:, recursively call sync(child_dest).
Each recursion is a fresh autonomous frame: it loads its own manifest, walks its own lockfile, syncs its own direct children. Sibling sub-meta syncs run in parallel; the per-pack .grex-lock (see concurrency §Per-pack PackLock) prevents two ops on the same pack path even across recursion frames.
Recursive consent (--with-children)
Phase 2 prune semantics deliberately cascade safety checks down the sub-meta tree. A meta whose declared child has its OWN sub-children (grandchildren) cannot be silently pruned if any grandchild is dirty or has an in-progress git op.
Three flag levels graduate the override:
| Flag | Effect |
|---|---|
| (none — default) | Default-deny. Refuse on any SHA mismatch, dirty tracked file, dirty ignored file, in-progress op, or dirty grandchild. |
| --force-prune | Bypass clean-tree assertions at the named dest. Still respects in-progress ops and still refuses if any grandchild is dirty. |
| --force-prune-with-ignored | Allow ignored content (e.g. target/, node_modules/) to be destroyed without warning at the named dest. |
| --force-prune-recursive | Cascades the bypass to grandchildren. Required to prune past a dirty grandchild. See force-prune §Blast radius. |
grex remove --force <path> is the per-path equivalent of --force-prune: it bypasses checks 2 and 3 at the named dest only. It does NOT cascade past one level.
Validator rules — child.path
Applied at every recursion depth, identical rules:
| Rule | Behaviour |
|---|---|
| Forward slash / | Allowed (multi-segment paths). Each segment must match ^[a-z][a-z0-9-]*$. |
| Backslash \ | Normalised to / at parse-time on all platforms. |
| .. segment (any position) | Rejected. |
| Absolute path | Rejected. |
| Symlink crossing parent boundary | Rejected post-canonicalization. |
| Empty path | Rejected. |
| Duplicate path across two children: entries | Rejected at parse-time as DuplicateChildPath(path). |
| : in any segment | Rejected (NTFS Alternate Data Streams). |
| $ in any segment | Rejected (variable expansion / Windows special). |
| ~digit pattern (progra~1) | Rejected (Windows 8.3 short-name aliasing). |
| NUL byte / control chars \x01-\x1F, \x7F | Rejected. |
| Drive-letter prefix (C:, D:) | Rejected. |
Path segments are NFC-normalised at parse-time before deduplication. Two manifests declaring caf\u00E9/foo (NFC) and cafe\u0301/foo (NFD) collide post-normalisation.
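The purely lexical rules from the table can be sketched with the standard library alone. `PathError` and `validate_child_path` are hypothetical names; NFC normalisation, duplicate detection, and the symlink check need extra state or crates and are assumed to run elsewhere.

```rust
#[derive(Debug, PartialEq)]
pub enum PathError {
    Empty,
    Absolute,
    DriveLetter,
    ParentSegment,
    BadSegment(String),
}

/// Validate a declared child path and return its '/'-normalised form.
pub fn validate_child_path(raw: &str) -> Result<String, PathError> {
    // Backslashes are normalised to '/' before any other check.
    let path = raw.replace('\\', "/");
    if path.is_empty() {
        return Err(PathError::Empty);
    }
    if path.starts_with('/') {
        return Err(PathError::Absolute);
    }
    let bytes = path.as_bytes();
    if bytes.len() >= 2 && bytes[0].is_ascii_alphabetic() && bytes[1] == b':' {
        return Err(PathError::DriveLetter);
    }
    for seg in path.split('/') {
        if seg == ".." {
            return Err(PathError::ParentSegment);
        }
        if !segment_ok(seg) {
            return Err(PathError::BadSegment(seg.to_string()));
        }
    }
    Ok(path)
}

/// Hand-rolled equivalent of ^[a-z][a-z0-9-]*$ (no regex crate needed).
fn segment_ok(seg: &str) -> bool {
    let mut chars = seg.chars();
    matches!(chars.next(), Some('a'..='z'))
        && chars.all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '-')
}
```

Because the segment pattern admits only lowercase ASCII letters, digits, and hyphens, the `:`, `$`, `~digit`, and control-character rows fall out of the same check in this sketch.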
Untracked git policy (5-way branch 3)
v1.1.1's sync-time auto-synthesis (silently registering a plain .git/ discovered at a declared dest) is RETIRED. Under v1.2.0+ the walker NEVER synthesises a manifest from a plain .git/. A declared dest with .git/ but no .grex/pack.yaml is an error, never silently registered.
Contract:
- The walker collects ALL untracked git repos across one sync invocation.
- After Phase 1 completes for a frame, if any untracked were collected, the frame returns Err(UntrackedGitRepos(list)) with the COMPLETE list of offenders.
- Phase 2 (prune) and Phase 3 (recurse) do NOT run for that frame.
User remediation: explicitly register each path with grex add <url> <path>. The walker has no opinion on which url is correct — that is operator-supplied by design.
The error message cites every untracked dir's absolute path so you can fix all in one batch rather than iteratively.
Symlink hardening
dest_has_git_repo(dest) refuses symlinked destinations outright via std::fs::symlink_metadata. Closes the symlink-redirection attack: a parent declaring path: code against a meta where <meta>/code -> $HOME cannot trick the walker into operating on $HOME/.git.
Reparse-point and gitfile policy. Maintainer-locked: REJECT ALL Windows junctions and non-symlink reparse points. v1.2.0+ rejects on Windows: IO_REPARSE_TAG_MOUNT_POINT (junctions, mklink /J), all reparse points except proper symlinks, and gitfile .git (regular file containing gitdir: ...). POSIX symlinks accepted with the boundary check; Windows proper symlinks accepted with the same check (they have a proper security model since Win10). Junctions and gitfile .git are unconditionally rejected — no flag, no override.
For the dirfd-binding TOCTOU mitigation that closes the path-swap window between canonicalize and clone, see toctou.
Cycle detection
Each recursion pushes pack_identity_for_child(child) (url:<url>@<ref>) onto an in-progress stack; a repeat returns TreeError::CycleDetected. Identity for the cwd-meta itself is path-keyed; for children it is URL+ref so the same repo at two distinct refs is distinct.
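A minimal sketch of the in-progress stack, assuming the identity string has exactly the url:<url>@<ref> shape quoted above. `InProgress`, `enter`, `leave`, and the `CycleDetected` struct are hypothetical stand-ins for the real types.

```rust
/// Error carrying the identity that was seen twice on the current path.
#[derive(Debug, PartialEq)]
pub struct CycleDetected(pub String);

/// Stack of pack identities for the recursion currently in flight.
#[derive(Default)]
pub struct InProgress {
    stack: Vec<String>,
}

impl InProgress {
    /// Child identity: URL plus ref, so one repo at two refs stays distinct.
    pub fn identity(url: &str, git_ref: &str) -> String {
        format!("url:{url}@{git_ref}")
    }

    /// Push on entering a frame; a repeat on the stack is a cycle.
    pub fn enter(&mut self, identity: String) -> Result<(), CycleDetected> {
        if self.stack.contains(&identity) {
            return Err(CycleDetected(identity));
        }
        self.stack.push(identity);
        Ok(())
    }

    /// Pop when the frame returns.
    pub fn leave(&mut self) {
        self.stack.pop();
    }
}
```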
Lockfile keying
Lockfile entries within a meta are keyed by the canonical relative POSIX path of the child within that meta — single segment for direct children, but the writer always normalises through the path-keyed code path. v1.1.x bare-name keys remain valid as the degenerate single-segment case; readers fall back to bare-name lookup for legacy entries. See lockfile §Path keying and v1.1.1→v1.2.0 read-fallback for the full migration story.
Cross-references
- Distributed lockfile schema, three readers, v1.1.1→v1.2.0 migration: lockfile
- Bounded semaphore + per-pack lock + Lean4 invariant: concurrency
- Force-prune semantics, audit log, blast radius: force-prune
- BoundedDir TOCTOU primitive (cap-std + Linux openat2): toctou
- Manifest event log + crash recovery: manifest
- Pack layout + .grex/ contract: pack-spec
lockfile
grex.lock.jsonl — the resolved-state snapshot that pins each pack's last-synced commit, ref, install timestamp, and actions_hash. Companion to but distinct from the events.jsonl intent/audit log (see manifest).
Canonical source: .omne/cfg/lockfile.md (SSOT, separate grex-inst repo). This page is the user-facing projection.
Concept: pack.yaml = INTENT, lockfile = STATE
The two artifacts answer different questions:
| Artifact | Question answered | Authored by |
|---|---|---|
| pack.yaml (+ events.jsonl) | "Which children at which paths at which refs do I want?" | User / pack author |
| grex.lock.jsonl | "Which commits am I currently at, with which actions applied?" | sync / update |
Same intent-vs-state separation as Cargo (Cargo.toml / Cargo.lock), npm (package.json / package-lock.json), Bundler, Poetry. The lockfile lives next to its meta's manifest — see §File location.
Why both are needed
- Idempotency (skip-on-hash). sync re-resolves the ref → SHA, recomputes actions_hash, compares to the recorded entry, short-circuits if both match. Without a lockfile, every sync would re-execute every action.
- Drift triangulation (3-leg). doctor compares declared (manifest) vs recorded (lockfile) vs present (disk). A 2-leg model cannot distinguish "user edited pack.yaml since last sync" from "someone hand-edited the working tree".
- Concurrent-sync safety. Lockfile-write happens under the manifest fd-lock; sync reads it once at plan phase and writes once at the end.
File location
<meta>/.grex/grex.lock.jsonl
Distributed under v1.2.0+: EACH meta owns its own <meta>/.grex/grex.lock.jsonl, tracking ONLY that meta's direct children. There is no global workspace lockfile — each recursion frame in walker §Three phases reads and writes its OWN lockfile.
Both the lockfile (.grex/grex.lock.jsonl) and the event log (.grex/events.jsonl) live in the manifest folder .grex/. Their names are deliberately distinct (no shared grex.*.jsonl prefix) to prevent the lockfile-vs-event-log conflation that caused historical SSOT errors.
Three "lock" artifacts — disambiguation
The codebase has three artifacts whose names contain "lock". The rule of thumb: if it ends in .jsonl it carries state; if it does not, it is a mutex.
| Artifact | Path (v1.2.0+) | Purpose | Format |
|---|---|---|---|
| THE lockfile | <meta>/.grex/grex.lock.jsonl | Resolved-state snapshot (commit + actions_hash per pack) | JSONL |
| Event log | <meta>/.grex/events.jsonl | Append-only history of add/rm/update/sync events | JSONL |
| Manifest fd-lock | <meta>/.grex.lock | OS-level file mutex serialising lockfile + event-log writes | Empty file |
Other file mutexes (<meta>/.grex.sync.lock per-meta-sync, <dest>.grex-backend.lock per-repo, <pack>/.grex-lock per-pack) are documented in concurrency §Five cooperating mechanisms; none of them carry state — they exist solely for mutual exclusion.
LockEntry schema
{"id":"warp-cfg","sha":"abc123...","branch":"main","installed_at":"2026-04-19T13:05:00Z","actions_hash":"sha256:deadbeef...","path":"warp-cfg"}
| Field | Since | Notes |
|---|---|---|
| id | v1.0 | Pack name: slug; matches Event::Add.id. |
| sha | v1.0 | Resolved commit SHA; empty string if pack is non-git or HEAD probe failed. |
| branch | v1.0 | Tracked branch; null if detached. |
| installed_at | v1.0 | RFC3339 timestamp of last successful install/sync. |
| actions_hash | v1.0 | SHA-256 over installable surface (scope per pack-type; see manifest §actions_hash scope). |
| schema_version | v1.0 | Bumped on breaking lockfile schema change. |
| synthetic | v1.1.1 | true for plain-git children synthesized by the walker (semantically dead under v1.2.0+; see below). |
| path | v1.2.0 | Option<String> #[serde(default)], parent-meta-relative POSIX, normalised at write-time. Lookup-map key. |
The path field is the lookup-map key under v1.2.0's nested-children layout. See walker §Lockfile keying.
Three readers
| Reader | What it does with the lockfile |
|---|---|
| sync | Skip-on-hash. Re-resolve commit, recompute actions_hash, compare against the prior LockEntry. Match → skip; mismatch → re-execute. |
| doctor | Drift triangulation. Joins three legs: declared (manifest fold), recorded (lockfile entry), present (disk readdir / git probe). Each pair-mismatch is a distinct drift class. |
| ls | State render. Per-pack synced / unsynced status. ls --long reads SHA + installed_at + actions_hash directly without folding the event log. |
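The skip-on-hash comparison performed by sync can be sketched as below. `Recorded`, `Plan`, and `plan_pack` are hypothetical names; the real LockEntry carries more fields, and treating a missing prior entry as "execute" is an assumption of this sketch.

```rust
/// The two lockfile fields that skip-on-hash compares.
#[derive(Debug, Clone, PartialEq)]
pub struct Recorded {
    pub sha: String,
    pub actions_hash: String,
}

#[derive(Debug, PartialEq)]
pub enum Plan {
    Skip,
    Execute,
}

/// Skip only when both the resolved SHA and the recomputed hash match.
pub fn plan_pack(prior: Option<&Recorded>, resolved_sha: &str, actions_hash: &str) -> Plan {
    match prior {
        Some(e) if e.sha == resolved_sha && e.actions_hash == actions_hash => Plan::Skip,
        // Assumption: no prior entry means the pack has never been installed.
        _ => Plan::Execute,
    }
}
```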
Path keying and v1.1.1 → v1.2.0 read-fallback
Through v1.1.1, lockfile entries were keyed by bare pack id (manifest name:), and the flat-sibling rule guaranteed id was unique within the single global workspace lockfile. Under v1.2.0's nested child paths, two declared children at distinct paths (e.g. tools/foo and vendor/foo) MAY share the same name: — a bare-id key would collide.
v1.2.0 decision: path-keyed, per-meta lockfile. Each meta owns its own lockfile tracking ONLY its direct children. Within that lockfile the in-memory index keys entries by meta-relative pack path (canonical relative POSIX, normalised at write-time, NFC).
Read-time fallback for v1.1.1 lockfiles
When a v1.2.0 binary reads a v1.1.1 lockfile entry where path: None (deserialized via #[serde(default)]), the path is derived as Some(entry.id.clone()). This is sound because v1.1.1 enforced bare-name-only paths (the validator rejected /), so id == path for all v1.1.1 entries. The walker proceeds without rewriting the file. After the next successful sync the entry is rewritten with path: Some(...), and subsequent reads bypass the fallback.
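The read-time fallback amounts to one expression. The sketch below uses a trimmed, hypothetical `LockEntry` with only `id` and `path`, and a hypothetical `index` helper that builds the path-keyed map.

```rust
use std::collections::HashMap;

/// Trimmed lockfile entry (hypothetical; only the fields the fallback needs).
#[derive(Debug, Clone, PartialEq)]
pub struct LockEntry {
    pub id: String,
    /// Absent in v1.1.1 lockfiles; present after a v1.2.0 write.
    pub path: Option<String>,
}

impl LockEntry {
    /// Lookup key: the recorded path, or the id when the path is absent.
    pub fn lookup_key(&self) -> &str {
        self.path.as_deref().unwrap_or(self.id.as_str())
    }
}

/// Build the in-memory index keyed by meta-relative path.
pub fn index(entries: Vec<LockEntry>) -> HashMap<String, LockEntry> {
    entries
        .into_iter()
        .map(|e| (e.lookup_key().to_owned(), e))
        .collect()
}
```

Two entries that share an id but live at tools/foo and vendor/foo get distinct keys, while a legacy entry without a path is still found under its id.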
This means v1.2.0 reads v1.1.1 lockfiles cleanly with no manual migration step required. The library function grex_core::lockfile::migrate_v1_1_1 and the planned grex migrate-lockfile CLI subcommand (v1.2.1) exist for users who want to eagerly upgrade the lockfile bytes to v1.2.0 schema (e.g. before committing to git). Both are opt-in.
Migration path summary
| Scenario | Behaviour |
|---|---|
| v1.2.0 binary reads v1.1.1 lockfile (no edit) | Read-fallback: path derived as id. Sync proceeds. |
| v1.2.0 binary writes after a successful sync | All entries written with path: Some(...). Subsequent reads use the on-disk path directly. |
| User wants to eagerly upgrade lockfile bytes | grex migrate-lockfile [--dry-run] [--workspace <path>] (v1.2.1) — atomic temp+rename, idempotent. |
| User downgrades v1.2.0 → v1.1.x | v1.1.x reader ignores path: field (forward-compat — unknown fields skipped); id-keyed lookup still works. |
LockEntry.synthetic deprecation
The synthetic: bool field on LockEntry (introduced v1.1.1 to mark plain-git children synthesised by the walker) is semantically dead under v1.2.0+. No v1.2.0 code path sets synthetic: true. The field is retained on the struct for backward-compat reads — v1.1.x lockfiles continue to deserialize cleanly.
A future schema bump (post-v1.2.0) MAY drop the field. Until then, a successful v1.2.0 sync against a workspace with synthetic entries either (a) finds the path now registered via grex add (entry rewritten with synthetic: false), or (b) reports UntrackedGitRepos and refuses to proceed. Either way, no v1.2.0+ sync writes a fresh synthetic entry.
Lifecycle
- First sync. Walker reads pack.yaml graph → clones each child → runs install actions → writes one LockEntry per direct child of the cwd-meta into <cwd-meta>/.grex/grex.lock.jsonl. Sub-metas write their own lockfiles in their own .grex/ dirs.
- Re-sync (no edits). Walker re-resolves refs → for each pack, recomputes actions_hash and compares to the recorded entry. Match → skip; lockfile entry carried forward unchanged.
- Re-sync after pack.yaml edit. User changes a child's ref or an actions block → next sync's actions_hash differs → pack re-executes → LockEntry rewritten with new sha / actions_hash / installed_at.
- Child removed from pack.yaml children:. Next sync's walker Phase 2 reconciles the lockfile against the manifest, deletes the orphan dest (subject to prune-safety; see force-prune), and removes the lockfile entry.
- grex doctor. Reads lockfile + intent-log fold + disk state → flags drift across the three legs.
- Lockfile-write failure at end-of-sync. Intentionally non-fatal. Successful pack actions are not rolled back; the failure is recorded as a report.event_log_warnings entry.
Crash recovery
Lockfile writes use write-then-rename atomicity (write to <lockfile>.tmp, fsync, rename over the original). A crash mid-write leaves either the old or the new file fully intact — never a torn JSONL. The manifest fd-lock (see concurrency) serialises all writes, so concurrent torn writes are also impossible.
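A minimal standard-library sketch of the write-then-rename sequence. `write_atomically` is a hypothetical helper, not the grex implementation, and it assumes the caller already holds the manifest fd-lock.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Write `contents` to `<target>.tmp`, fsync, then rename over `target`.
pub fn write_atomically(target: &Path, contents: &[u8]) -> io::Result<()> {
    // Sibling temp file, so the rename stays on one filesystem.
    let mut tmp_name = target.as_os_str().to_owned();
    tmp_name.push(".tmp");
    let tmp = PathBuf::from(tmp_name);

    let mut f = File::create(&tmp)?;
    f.write_all(contents)?;
    // Make the new bytes durable before they become visible under `target`.
    f.sync_all()?;
    drop(f);

    // Readers see either the old file or the new one, never a mix.
    fs::rename(&tmp, target)
}
```

A stricter variant would also fsync the parent directory after the rename; that step is omitted here because the text above does not mention it.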
On read, parse-failure of any line surfaces as LockfileCorrupt(path, line_no, parse_error):
- Severity Warning under grex doctor (which can repair the file by replaying from the event log + a clean re-sync).
- Severity Error under grex sync (refuses to plan against a corrupt lockfile).
User remediation path: grex doctor → repair → re-sync.
Cross-references
- Walker keying decision + parent-relative model: walker
- File mutexes (sync-lock, backend-lock, pack-lock, manifest-lock) + Lean4 I1: concurrency
- Schema field table + intent-log split + crash recovery: manifest
- Force-prune audit log + safety contract: force-prune
concurrency
Tokio runtime, bounded semaphore, per-pack file lock, per-meta manifest lock. One Lean4-verified invariant.
Canonical source: .omne/cfg/concurrency.md (SSOT, separate grex-inst repo). This page is the user-facing projection.
Runtime
    #[tokio::main(flavor = "multi_thread", worker_threads = ...)]
    async fn main() -> anyhow::Result<()> {
        ...
    }
Worker threads default = num_cpus::get(), overridable via --parallel N or GREX_PARALLEL env. The same --parallel N cap is honoured by the rayon scheduler that drives sibling sync within one meta — see walker §Phase 1 and walker §Phase 3.
Five cooperating mechanisms
- Per-meta sync lock: <meta>/.grex.sync.lock (fd-lock, non-blocking, fail-fast). Held for the full grex sync lifetime of THAT meta's frame. Two concurrent grex sync invocations against the same meta are a hard error, not a queue. v1.x → v1.2.0: through v1.x this was a single <workspace>/.grex.sync.lock at the workspace root (one global lock per workspace). Under v1.2.0+ each meta owns its own fd-lock under its own dir; cross-meta locks are independent (distinct metas never serialize against each other), the walker's recursion acquires + releases one lock per meta frame, and cargo-style parallel sub-meta sync is N concurrent fd-locks across the meta tree (one per meta currently being processed). Locking is per-meta, never global.
- Per-repo backend lock: <dest>.grex-backend.lock (fd-lock, sibling file NOT inside <dest> so it survives <dest> wipe). Held across clone + fetch + materialise_tree for one repo path.
- Bounded semaphore: caps in-flight pack ops across the process.
- Per-pack .grex-lock: prevents two ops on the same pack path across processes and tasks.
- Per-meta manifest RW lock (fd-lock): serialises that meta's lockfile + event-log writes. v1.x → v1.2.0: under v1.2.0+ each meta has its own manifest fd-lock at <meta>/.grex.lock; the lock is per-meta, not global, so distinct metas may mutate their own lockfile + event log in parallel.
Lock acquisition order (fixed, deadlock-free): per-meta-sync → semaphore → pack-lock → repo-backend → manifest-lock. Never reversed.
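One way to make the fixed order checkable is to give each lock a rank and assert that acquisitions only ever move inward. `LockRank` and `respects_order` are hypothetical names for this sketch; it relies on derived `Ord` following variant declaration order.

```rust
/// Lock ranks, declared outermost first so derived `Ord` matches the order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum LockRank {
    MetaSync,
    Semaphore,
    PackLock,
    RepoBackend,
    ManifestLock,
}

/// True when every acquisition is strictly further in than the previous one.
pub fn respects_order(acquired: &[LockRank]) -> bool {
    acquired.windows(2).all(|w| w[0] < w[1])
}
```

A task may skip ranks (for example semaphore → pack-lock → manifest-lock), but it may never go back out and in again while holding an inner lock.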
TOCTOU closure
The sync pipeline revalidates the per-meta dirty-check twice:
- Before attempting to acquire the per-meta sync lock (fast reject).
- After acquiring the per-meta sync lock AND immediately before calling materialise_tree (authoritative: any drift between steps 1 and 2 surfaces here).
Rationale: a concurrent non-sync writer (e.g. the user editing a file) could dirty the tree between our initial check and the moment we begin applying actions. The second check closes the window.
The path-swap TOCTOU (attacker swapping a directory for a symlink between canonicalize(dest) and the actual filesystem write) is closed by the BoundedDir dirfd-binding primitive — see toctou.
Recovery scan
At sync startup, before acquiring the per-meta lock, grex runs an informational recovery scan that:
- Lists stale .grex.sync.lock / <dest>.grex-backend.lock whose owning PID is gone.
- Lists incomplete event brackets in the manifest (action_started with no matching action_completed / action_halted).
The scan only logs — it never mutates. Auto-cleanup is grex doctor territory.
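The bracket half of the scan can be sketched as a read-only pass over the event list. The `Bracket` enum and the string key that pairs a start with its end are assumptions of this sketch; the text above does not say which field correlates the events.

```rust
use std::collections::BTreeSet;

/// Bracket events, each tagged with a correlation key (hypothetical field).
pub enum Bracket<'a> {
    Started(&'a str),
    Completed(&'a str),
    Halted(&'a str),
}

/// Keys that have an `action_started` but no matching completion or halt.
pub fn incomplete_brackets<'a>(events: &[Bracket<'a>]) -> Vec<&'a str> {
    let mut open = BTreeSet::new();
    for ev in events {
        match ev {
            Bracket::Started(k) => {
                open.insert(*k);
            }
            // Either terminal event closes the bracket.
            Bracket::Completed(k) | Bracket::Halted(k) => {
                open.remove(*k);
            }
        }
    }
    // Report only; nothing is mutated, matching the scan's contract.
    open.into_iter().collect()
}
```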
Bounded semaphore
    use std::future::Future;
    use std::sync::Arc;
    use tokio::sync::Semaphore;

    pub struct Scheduler {
        permits: Arc<Semaphore>,
    }

    impl Scheduler {
        pub fn new(parallel: usize) -> Self {
            Self { permits: Arc::new(Semaphore::new(parallel)) }
        }

        pub async fn run<F, T>(&self, pack_path: &std::path::Path, fut: F) -> anyhow::Result<T>
        where
            F: Future<Output = anyhow::Result<T>> + Send,
            T: Send,
        {
            // Bound process-wide parallelism first.
            let _permit = self.permits.clone().acquire_owned().await?;
            // Then take the per-pack lock.
            // v1.2.4+: legacy `PackLock::acquire` is a deprecated shim.
            let _plock = PackLock::open(pack_path)?.acquire_async().await?;
            fut.await
        }
    }
The semaphore caps process-wide in-flight pack ops; the per-pack lock prevents double-execution of the same pack path across recursion frames or invocations. Sibling parallelism inside one meta and sub-meta parallelism across metas both run under the same semaphore cap.
Per-pack PackLock
File: <pack_workdir>/.grex-lock. Held exclusively via fd_lock::RwLock::write. Non-blocking try-first; on contention the task yields and retries with backoff.
API note (v1.2.4+): the canonical async entry point is PackLock::acquire_async (and PackLock::acquire_cancellable for the cancellable variant). The original PackLock::acquire signature shown in the sketch below is deprecated and retained only as a thin shim for backward compatibility; new call sites should use acquire_async.
    pub struct PackLock {
        _guard: fd_lock::RwLockWriteGuard<'static, std::fs::File>,
    }

    impl PackLock {
        // Deprecated since v1.2.4; prefer `acquire_async` (shown for prose continuity only).
        pub async fn acquire(pack_path: &std::path::Path) -> anyhow::Result<Self> {
            let lock_path = pack_path.join(".grex-lock");
            let file = std::fs::OpenOptions::new()
                .create(true).read(true).write(true)
                .open(&lock_path)?;
            let lock = fd_lock::RwLock::new(file);
            // retry loop: try_write() → on WouldBlock sleep + retry
            // ...
        }
    }
Released on Drop. The file is NOT deleted on release (avoids a TOCTOU race). grex doctor prunes stale .grex-lock files whose owning PID is gone.
Per-meta manifest RW lock
Any events.jsonl or grex.lock.jsonl mutation takes exclusive fd_lock::RwLock::write on the meta-local <meta>/.grex.lock. Readers take shared read. See manifest. The three-way disambiguation between this fd-lock file (.grex.lock), the lockfile (.grex/grex.lock.jsonl), and the event log (.grex/events.jsonl) lives in lockfile §Three "lock" artifacts.
Because the lock is per-meta under v1.2.0+, distinct metas can mutate their own lockfile + event log in parallel. There is no global serialisation point at the manifest layer — the only cross-meta serialisation is the bounded process-wide semaphore on in-flight pack ops.
Scheduler pseudocode
schedule(packs, op):
futures = []
for pack in packs:
fut = async {
_sem_permit = semaphore.acquire() # bound parallelism
      _pack_lock = PackLock::acquire_async(pack.path) # per-pack exclusive (v1.2.4+; legacy `acquire` is a deprecated shim)
result = op.run_on(pack)
_manifest_lock = pack.meta.manifest.write_lock() # innermost (per-meta)
manifest.append(event_from(result))
drop(_manifest_lock) # release innermost first
result
}
futures.push(fut)
return join_all(futures)
Key property: locks acquired outer-to-inner, released inner-to-outer. Manifest lock is the briefest; semaphore the longest.
Lean4 invariant I1 (no_double_lock)
Invariant I1: for any two concurrent tasks t1, t2 scheduled by Scheduler, if t1.pack_path == t2.pack_path, then their lock-holding windows do NOT overlap in time.
I1 = "Invariant 1", the first concurrency-series invariant. Distinct from walker I1 (boundary preservation) and architecture I1 (the same scheduler theorem re-cited from the architecture doc). See the invariant series cross-reference table in the SSOT.
Informal: PackLock::acquire_async (canonical entry point since v1.2.4; the legacy PackLock::acquire is a deprecated shim) is exclusive per path; the later arrival awaits the earlier's drop.
File: proof/Grex/Scheduler.lean.
Sketch:
namespace Grex.Scheduler

structure Task where
  path : String
  started : Nat -- logical clock
  ended : Nat
  deriving Repr

-- `abbrev` (not `def`) so that List's `∈` instance resolves on Schedule.
abbrev Schedule := List Task

def overlaps (a b : Task) : Prop :=
  a.started < b.ended ∧ b.started < a.ended

-- PackLock is modeled as FIFO queue per path:
-- acquire(p) returns only after all prior holders for p have released.
axiom pack_lock_exclusive
    (s : Schedule) (a b : Task) :
    a ∈ s → b ∈ s → a.path = b.path → a ≠ b → ¬ overlaps a b

-- I1: scheduler never holds two concurrent locks on the same pack path.
theorem no_double_lock
    (s : Schedule) (a b : Task)
    (ha : a ∈ s) (hb : b ∈ s) (hpath : a.path = b.path) (hne : a ≠ b) :
    ¬ overlaps a b :=
  pack_lock_exclusive s a b ha hb hpath hne

end Grex.Scheduler
CI job (.github/workflows/lean.yml):
- uses: leanprover/lean-action@v1
- run: cd proof && lake build
Zero sorry; zero unresolved axiom outside the stated model-bridging ones.
Walker I8 reduction
Walker invariant I8 (parallel sync of disjoint sub-trees commutes — see walker §Three changes vs v1.1.x) reduces to concurrency I1 for its mutual-exclusion lemma. The shipped axiom sync_disjoint_commutes in proof/Grex/Walker.lean covers the disjoint-pack work commutativity that the rayon scheduler relies on; no new theorem is required for the v1.2.1 rayon sibling-sync swap.
Operational tuning
- --parallel default = num_cpus::get(). Typical 4-16.
- Git fetch is IO-bound → higher parallelism helps until network saturates.
- Shell-out actions (exec) may be internally multi-threaded; consider a per-type cap in v1.x.
Telemetry
Each scheduled task emits a tracing span: pack_path, op, duration_ms, result. grex doctor can read the last-N spans from an on-disk journal for retrospective diagnosis.
Cross-references
- Walker phases + parallel sibling/sub-meta scheduling: walker
- Distributed lockfile + per-meta manifest fd-lock disambiguation: lockfile
- Manifest event-log atomic append + crash recovery: manifest
- TOCTOU BoundedDir (cap-std + Linux openat2): toctou
- Force-prune audit log (writes through this manifest fd-lock): force-prune
force-prune
Default-deny safety contract for walker Phase 2 — and the --force-prune family of flags that override it. With audit log, blast-radius analysis, and a forward reference to the v1.2.1 --quarantine snapshot.
Canonical source: .omne/cfg/walker.md §Cleanup semantics (SSOT, separate grex-inst repo). A dedicated .omne/cfg/force-prune.md will land in the SSOT repo separately. This page is the user-facing projection.
When does prune fire?
Cleanup is CLI-invocation-driven, not eager. Removing a child from pack.yaml children: triggers prune on the next grex sync / update invocation, not on edit. Phase 2 reconciles each meta's lockfile against its current manifest and rm -rfs any orphans — subject to the safety contract below.
| Property | Behaviour |
|---|---|
| Trigger | Manifest edit removes child, then user runs any grex command that touches the meta. |
| Scope | rm -rf <meta>/<child.path> AND drop the lockfile entry. |
| Eagerness | CLI-invocation-driven (NOT filesystem-watcher-eager). |
| Idempotency | Re-running with the child still removed: lockfile already lacks entry, rm -rf is a no-op. |
| Cross-meta | Each meta cleans its OWN orphans only. |
| Safety | Default-deny on dirty / SHA-mismatched / in-progress dest; bypass via --force-prune. |
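The reconciliation in Phase 2 can be pictured as a set difference between lockfile entries and the children still declared in the manifest. The sketch below is illustrative only: `orphaned_paths` and its string-slice inputs are hypothetical names, not grex APIs, and every candidate it returns would still have to pass the safety contract before anything is deleted.

```rust
use std::collections::BTreeSet;

/// Paths recorded in the lockfile that are no longer declared under
/// `children:` in the manifest. Hypothetical helper, not a grex API.
pub fn orphaned_paths(lock_paths: &[&str], manifest_children: &[&str]) -> Vec<String> {
    let declared: BTreeSet<&str> = manifest_children.iter().copied().collect();
    let locked: BTreeSet<&str> = lock_paths.iter().copied().collect();
    // BTreeSet keeps the result sorted, so output order is deterministic.
    locked
        .difference(&declared)
        .map(|p| (*p).to_string())
        .collect()
}
```

Re-running the function after the orphan's lockfile entry has been dropped yields an empty list, which mirrors the idempotency row in the table.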
Safety contract
Phase 2 must NOT silently destroy modified or shared content. Before rm -rf, the walker verifies the dest still matches the state recorded in the lockfile.
Adversary scenario
User cp -rs a child folder into a sibling meta and re-registers it there, then removes the original entry from the source meta's pack.yaml. Without verification, the source meta's Phase 2 destroys the now-shared folder while the sibling meta still believes it owns it.
The five checks
Default behaviour (override only with the --force-prune family below):
1. Missing .git/ at dest. Treated as already-gone: drop the lockfile entry, no rm -rf. Idempotent.
2. HEAD SHA mismatch (git rev-parse HEAD ≠ LockEntry.sha). Abort with Err(DirtyDestRefuseToPrune(path, lockfile_sha, dest_sha)). Either the user has rebased or fetched without resyncing, or the dest was swapped for foreign content.
3. Dirty working tree (git status --porcelain --ignored non-empty). Abort with Err(DirtyDestRefuseToPrune(...)). The user has uncommitted edits OR gitignored content (build artefacts, deps caches, e.g. target/, node_modules/) that prune would silently destroy.
4. Sub-meta consent walk. If dest contains <dest>/.grex/grex.lock.jsonl with non-empty entries, recursively check every grandchild for the same conditions. Any dirty/in-progress grandchild → refuse the prune unless --force-prune-recursive. grex remove --force <path> does NOT cascade past one level.
5. In-progress git op probe. Refuse if any of these exist at <dest>/.git/:
   - rebase-merge/, rebase-apply/ (in-progress rebase)
   - MERGE_HEAD, CHERRY_PICK_HEAD, REVERT_HEAD (in-progress merge / cherry-pick / revert)
   - BISECT_LOG, sequencer/ (in-progress bisect / sequencer)

   Even if HEAD SHA matches lockfile and the working tree is clean, an in-progress git op blocks prune. No flag bypasses this except --force-prune-recursive combined with explicit per-path --force-prune.
6. Match: clean tree, SHA equal to lockfile, no in-progress op, no dirty grandchild. rm -rf proceeds.
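The checks above can be read as a pure decision function over facts gathered from the dest. What follows is a sketch under stated assumptions, not the walker's implementation: `DestFacts`, `PruneFlags`, `PruneDecision` and `classify` are hypothetical names, the refusal strings are invented labels, and --force-prune-with-ignored is deliberately left out (ignored content is folded into the single `dirty` flag).

```rust
/// Outcome of the prune-safety pass for one dest (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum PruneDecision {
    /// Check 1: no `.git/` at dest, so only the lockfile entry is dropped.
    DropEntryOnly,
    /// A check failed and no flag bypassed it; the label names the check.
    Refuse(&'static str),
    /// Every check passed or was explicitly bypassed: deletion may proceed.
    Delete,
}

#[derive(Debug, Clone, Copy, Default)]
pub struct PruneFlags {
    pub force_prune: bool,
    pub force_prune_recursive: bool,
}

/// Facts a caller would collect from disk before deciding (assumed shape).
#[derive(Debug, Clone, Default)]
pub struct DestFacts {
    pub has_git_dir: bool,
    pub head_sha: String,
    pub lock_sha: String,
    pub dirty: bool,
    pub dirty_grandchild: bool,
    pub git_op_in_progress: bool,
}

pub fn classify(dest: &DestFacts, flags: PruneFlags) -> PruneDecision {
    // Check 1: already gone.
    if !dest.has_git_dir {
        return PruneDecision::DropEntryOnly;
    }
    // Checks 2 and 3: bypassed by --force-prune.
    if dest.head_sha != dest.lock_sha && !flags.force_prune {
        return PruneDecision::Refuse("head-sha-mismatch");
    }
    if dest.dirty && !flags.force_prune {
        return PruneDecision::Refuse("dirty-working-tree");
    }
    // Check 4: only --force-prune-recursive cascades past a dirty grandchild.
    if dest.dirty_grandchild && !flags.force_prune_recursive {
        return PruneDecision::Refuse("dirty-grandchild");
    }
    // Check 5: needs both flags together to bypass.
    if dest.git_op_in_progress && !(flags.force_prune && flags.force_prune_recursive) {
        return PruneDecision::Refuse("git-op-in-progress");
    }
    PruneDecision::Delete
}
```

Keeping the decision separate from the filesystem probing makes the default-deny behaviour easy to test: with `PruneFlags::default()` every non-clean input maps to `Refuse`.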
The --force-prune flag family
| Flag | Effect |
|---|---|
| --force-prune | Bypass clean-tree assertions (checks 2 and 3) at the named dest. Still respects in-progress ops (check 5) and still refuses if any grandchild is dirty (check 4). |
| --force-prune-with-ignored | Allow ignored content (target/, node_modules/) to be destroyed without warning at the named dest. Useful when the only "dirty" content is a build cache. |
| --force-prune-recursive | Cascades the bypass to grandchildren. Required to prune past a dirty grandchild. |
grex remove --force <path> is the per-path equivalent of --force-prune: it bypasses checks 2 and 3 at the named dest only, never cascades.
The flag family is opt-in by design: a typo in pack.yaml should surface as a refusal, not as data loss.
Loss profile
Loss of ignored content (build artefacts, deps caches) is recoverable but expensive (re-compile / re-fetch). Loss of tracked dirty edits is unrecoverable. Loss of an in-progress rebase is unrecoverable from --force-prune-recursive's vantage point even though the underlying commits are still in .git/objects/ — the working state and rebase script are gone.
Audit log
Every --force-prune, --force-prune-with-ignored, or --force-prune-recursive invocation appends an entry to <meta>/.grex/events.jsonl BEFORE the rm -rf fires:
{"op":"force-prune","ts":"2026-04-30T10:00:00Z","id":"<pack-id>","schema_version":"1","path":"<meta-relative path>","lockfile_sha":"<sha>","dest_sha":"<sha>","dirty_files":<n>,"ignored_size":<bytes>}
The audit entry is fsync'd before the deletion proceeds. A crash mid-prune leaves a recoverable trail of what was about to be destroyed:
- The fsync barrier guarantees the audit line hits stable storage before any unlink syscall fires.
- On recovery, grex doctor can read the orphan entry and report "force-prune was about to delete <path>; the dest is gone — no recovery possible without git or filesystem-level undelete".
- The audit lives in the same per-meta event log used by add / rm / update; see manifest §events.jsonl event schemas for the common envelope and atomic-append guarantees.
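The ordering described above (append, fsync, then delete) can be sketched with the standard library alone. This is a simplified illustration, not grex's code: `audited_remove` is a hypothetical helper, the audit line is passed in pre-serialised, and the per-meta fd-lock that serialises appends is omitted.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Append one audit line, force it to stable storage, and only then delete.
/// Hypothetical helper; locking and JSON serialisation are out of scope here.
pub fn audited_remove(events_log: &Path, audit_line: &str, dest: &Path) -> io::Result<()> {
    let mut log = OpenOptions::new().create(true).append(true).open(events_log)?;

    // Write the line and its terminator in one call so a reader never sees
    // the record without its trailing newline from this writer.
    let mut record = String::with_capacity(audit_line.len() + 1);
    record.push_str(audit_line);
    record.push('\n');
    log.write_all(record.as_bytes())?;

    // fsync barrier: if this fails, the `?` returns before any unlink happens.
    log.sync_all()?;

    fs::remove_dir_all(dest)
}
```

If the process dies between `sync_all` and `remove_dir_all`, the log already names the path that was about to be destroyed, which is the trail the recovery bullet relies on.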
Blast radius
The blast radius of a --force-prune invocation is bounded as follows.
| Flag | Within scope (deletable) | Out of scope (untouched) |
|---|---|---|
| --force-prune | The named dest's tracked dirty edits at the top level | Any grandchild with its own dirty edits or in-progress op (check 4 still refuses) |
| --force-prune-with-ignored | All of --force-prune plus ignored content (target/, node_modules/, etc.) at the named dest | Any grandchild's ignored content (check 4 still applies) |
| --force-prune-recursive | The full sub-tree, including grandchildren's tracked dirty edits and ignored content | Sibling metas (cleanup is per-meta; see walker §Phase 2) |
The walker NEVER deletes outside the cwd-meta's own tree. Sibling metas and parent metas are unreachable from any --force-prune invocation.
Recovery
Once rm -rf has fired, there is no in-band recovery path under v1.2.0. Options:
- Restore from backup.
- Use a filesystem-level undelete tool (extundelete, ntfsundelete, etc.); this typically only succeeds on recent deletes against quiet filesystems.
- If the deleted dest was a git working tree, git/objects/ may still be available in a parent's .git/modules/ subtree (only if grex's clone used submodule semantics, which is rare).
v1.2.1 --quarantine flag (PLANNED, TBD)
The v1.2.1 release plans an opt-in --quarantine flag on --force-prune and --force-prune-with-ignored that snapshots the entire dest sub-tree to <meta>/.grex/trash/<ISO8601>/<basename>/ BEFORE the rm -rf fires. Failure of the snapshot aborts the prune (no delete). The Lean4 theorem Grex.Walker.quarantine_snapshot_precedes_delete is the gate that lets the Rust implementation land — proof-first per the SSOT rule.
The conceptual feature name is "quarantine"; the on-disk folder is named trash/. Per-meta scope (each meta has its own .grex/trash/ bucket). Not present in v1.2.0; see the v1.2.1 spec for the LOCKED layout decisions and acceptance criteria.
Until --quarantine lands, --force-prune is irreversible. The audit log is the only forensic trail.
Cross-references
- Walker Phase 2 algorithm + 5-way classifier context: walker
- Lockfile entry + path keying (the lookup map for prune candidates): lockfile
- Per-meta manifest fd-lock + audit-append serialisation: concurrency
- Audit-log envelope + crash-recovery torn-line detection: manifest
- BoundedDir TOCTOU primitive (the rm -rf itself is dirfd-bound): toctou
toctou
The BoundedDir primitive — how grex closes the path-swap TOCTOU window between canonicalize(dest) and the actual filesystem write. Hybrid cap-std (uniform) plus Linux openat2(RESOLVE_BENEATH) (internal acceleration).
Canonical source: forthcoming .omne/cfg/toctou.md (SSOT, separate grex-inst repo). For now this page derives from .omne/cfg/walker.md §Symlink hardening, .omne/cfg/rust-design-decisions.md §6, .omne/proof/impl-axiom-bridge.md §3 (sync_local_writes), and crates/grex-core/src/fs/boundary.rs (the implementing module).
What is TOCTOU?
TOCTOU = Time-Of-Check / Time-Of-Use: a race-condition class where a program checks a property of a path (e.g. "this resolves to <meta>/code/.git") and then operates on the same path later, while an attacker swaps the path's target in between.
The classic walker race window without BoundedDir:
parent.canonicalize() → resolve(child) → fs::create_dir_all(dest) → clone(dest)
▲ ▲ ▲
└─ race window ──┴─ swap dest for symlink ──┘
An attacker who can write inside the workspace mid-flight could redirect the clone write to an arbitrary location — for example, replace <meta>/code/ with a symlink to $HOME/.ssh/, then watch grex happily clone-over the user's keys.
Path-string-based filesystem APIs (std::fs::create_dir_all(path), std::fs::write(path, ...)) re-walk the path on every call. Each walk is a fresh check; nothing in the standard library ties consecutive calls to the SAME inode the prior call resolved.
Why TOCTOU matters for grex sync
The walker mutates the filesystem in three places — each is a TOCTOU surface if not bound to a kernel-vouched handle:
- Phase 1, branch 1 (clone). git_clone(child.url, dest, child.ref) writes to dest. A path-swap attack between the validator's canonicalisation pass and the clone could redirect the write outside <meta>'s subtree.
- Phase 1, branch 2 (recover empty dest). Same exposure as branch 1: the recovery clone re-walks dest's path.
- Phase 2 (rm -rf). Deleting the dest after a default-deny prune-safety pass. A swap between the safety check and the unlink could redirect rm -rf to a location outside <meta> (catastrophic; see force-prune §Blast radius for the boundary contract that depends on this NOT happening).
See the 5-way classifier in walker §Phase 1 for context — branches 1, 2, and 5 are the only branches that mutate dest; branch 5 (git fetch) operates against the already-bound dest and is not a fresh path-walk.
BoundedDir — the primitive
BoundedDir is a thin wrapper around a kernel-confirmed directory handle (a "dirfd") obtained for a path provably contained beneath a parent directory. Once constructed, the handle is bound to the inode the kernel resolved at construction time — a subsequent attacker swap of the parent path for a symlink cannot redirect operations performed through the handle.
BoundedDir::open(parent, child_relative)? → handle bound to inode
Downstream operations either go through the dirfd (write confined) or compare against BoundedDir::path (the canonicalised, post-resolve path) — both of which the kernel has already vouched for.
The module lives at crates/grex-core/src/fs/boundary.rs. Visibility is pub(crate) — cap-std types do not leak into the public API surface, so a future implementation swap (e.g. raw openat2 plumbing) does not bump the grex-core SemVer.
Hybrid strategy: cap-std uniform, openat2 internal
Per design decision §6 in .omne/cfg/rust-design-decisions.md:
| Platform | What BoundedDir actually does |
|---|---|
| Linux ≥ 5.6 | cap-std internally uses openat2(RESOLVE_BENEATH) — single syscall, kernel-enforced bound |
| Linux < 5.6 | cap-std falls back to O_NOFOLLOW-by-component verification |
| macOS / Windows | cap-std uses platform-equivalent capability handles |
The BoundedDir API is uniform across all platforms — callers do not branch. The openat2 acceleration on modern Linux is an internal detail.
Why uniform cap-std rather than per-OS hand-rolling
Trade: the marginal performance of a single-syscall openat2 versus three concrete benefits:
- No unsafe in grex-core. The crate carries #![deny(unsafe_code)]. Hand-rolled openat2 plumbing requires unsafe for the libc syscall surface.
- One code path to test across OSes. Per-OS branches multiply the test matrix and the audit surface.
- No kernel-version branching at runtime. cap-std handles the Linux ≥ 5.6 / < 5.6 split internally; grex-core never observes it.
Dirfd-binding model
What "bound to an inode" actually means:
1. BoundedDir::open(parent, child_relative) opens parent as a cap_std::fs::Dir. The kernel resolves parent to an inode and gives back a file descriptor referring to that specific inode.
2. The constructor then resolves child_relative through that dirfd: the kernel walks segment-by-segment, verifying each step does not escape the parent capability (no .., no symlink-to-outside, no absolute redirect).
3. The returned BoundedDir carries the verified child dirfd. All future operations route through this fd, not through the original path string.
4. If an attacker swaps parent/child_relative for a symlink AFTER step 2, subsequent reads/writes through the BoundedDir still hit the original inode the kernel resolved at step 2; the attacker's swap is invisible to the bound handle.
Dropping the BoundedDir releases the dirfd; the inode is then eligible for unlinking by other processes (this is fine — by then the walker is done with that dest).
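The "bound to an inode" property can be observed with the standard library on Unix, without cap-std. The sketch below is only a demonstration of handle-versus-path resolution, not the BoundedDir implementation: `handle_survives_swap` is a hypothetical function, it uses a plain `std::fs::File` opened on a directory, and it does not confine writes the way a capability handle does.

```rust
use std::fs::{self, File};
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

/// Opens `<root>/code` as a directory handle, swaps the path for a different
/// directory, and returns three inode numbers:
/// (inode at open time, inode via the held handle, inode via the path).
/// Unix-only; hypothetical demo helper.
pub fn handle_survives_swap(root: &Path) -> io::Result<(u64, u64, u64)> {
    let dest = root.join("code");
    fs::create_dir_all(&dest)?;

    // The handle refers to the inode the kernel resolved right now.
    let handle = File::open(&dest)?;
    let bound_ino = handle.metadata()?.ino();

    // Simulate the swap: move the original away, put a new directory in place.
    fs::rename(&dest, root.join("code.moved"))?;
    fs::create_dir(&dest)?;

    // fstat through the handle still reports the original inode, while
    // re-walking the path string reaches the replacement directory.
    let via_handle = handle.metadata()?.ino();
    let via_path = fs::metadata(&dest)?.ino();
    Ok((bound_ino, via_handle, via_path))
}
```

The first two numbers agree and the third differs, because the renamed original and the replacement coexist on the same filesystem and therefore cannot share an inode.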
Path-swap attack closure
Concretely, the attack the walker now defeats:
T+0 user runs: grex sync .
T+1 walker validates: canonicalize(<meta>/code/) → /home/u/proj/code (real dir)
T+2 walker calls: BoundedDir::open(<meta>, "code") → fd 17 bound to inode 99
T+3 ATTACKER (running concurrently): rm <meta>/code; ln -s $HOME/.ssh <meta>/code
T+4 walker calls: git_clone(url, BoundedDir::path(&fd17)) → writes to inode 99
(the original /home/u/proj/code, NOT $HOME/.ssh)
Without BoundedDir, step T+4's git_clone would re-walk the path string <meta>/code/ and follow the attacker's symlink to $HOME/.ssh. With BoundedDir, the write is bound to the kernel-vouched inode from T+2.
The walker also rejects symlinks at canonicalisation gate (see walker §Symlink hardening) — but that gate runs at validation time, BEFORE the bind. BoundedDir is what closes the window between validation and write.
Lean4 axiom: sync_local_writes
The Lean4 mechanisation models this as a bridge axiom in proof/Grex/Bridge.lean:
axiom sync_local_writes
(parent : Path) (w : World) (q : Path) :
¬ descends q parent →
(sync parent w).tracked q = w.tracked q ∧
(sync parent w).lock q = w.lock q ∧
(sync parent w).hasGit q = w.hasGit q
"Bridge 3.
sync writes only inside its argument subtree. Every other path's tracked, lock, and hasGit are unchanged. This is the model-level statement of v1.2.0's parent-relative discipline."
The Rust contract that discharges this axiom is precisely the BoundedDir capability handle. Without it, a malicious symlink inside the subtree could cause sync to clobber w.hasGit q for a q that does not descend from parent, falsifying the axiom.
The validate_children_paths gate (rejects .. and absolute segments) is necessary but NOT sufficient on its own; the capability-handle invariant is what closes the symlink-traversal escape window. Any change to the Rust impl that swaps cap-std for raw std::fs MUST re-prove this axiom (or bridge it via an explicit "no-symlink-escape" lemma) — .omne/proof/impl-axiom-bridge.md §3 documents this re-review trigger.
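For intuition, the lexical half of that contract might look like the following. This is a sketch assuming the gate works on path components; `is_confined_relative` is a hypothetical name and not the real validate_children_paths. As the paragraph above stresses, a check like this says nothing about symlinks on disk, which is why the capability handle is still needed.

```rust
use std::path::{Component, Path};

/// Purely lexical check: accepts only relative paths made of normal segments.
/// Rejects `..`, a leading root, and Windows prefixes. Hypothetical helper.
pub fn is_confined_relative(child: &Path) -> bool {
    let mut saw_segment = false;
    for component in child.components() {
        match component {
            Component::Normal(_) => saw_segment = true,
            // A leading `./` is harmless and does not move upward.
            Component::CurDir => {}
            // ParentDir, RootDir and Prefix can all escape the parent.
            _ => return false,
        }
    }
    // An empty path (or a bare `.`) names no child at all.
    saw_segment
}
```

A path such as `code` passes this gate even if `code` is later replaced by a symlink pointing outside the parent, which is exactly the gap the bound handle closes.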
What BoundedDir does NOT cover
- O_NOFOLLOW does not protect against TOCTOU on the parent itself. BoundedDir::open(parent, child) opens parent as the capability root; if the attacker swaps the parent's path to a different inode BEFORE BoundedDir::open is called, the bind happens against the wrong inode. The walker mitigates this by binding the cwd-meta's parent dirfd at sync-startup, before any per-child resolution fires; from then on every recursion's parent is itself a fd, not a path string.
- Concurrent writers within the bound dirfd. If a non-grex process holds a write descriptor under the bound dirfd, the walker's writes can race with theirs at the inode level. The per-pack .grex-lock (see concurrency §Per-pack PackLock) closes the in-grex case; cross-tool coordination is out of scope for BoundedDir.
- Filesystem-layer attacks. A malicious filesystem (FUSE, network mount with adversarial server) can violate the kernel's inode-stability guarantee. BoundedDir assumes a non-adversarial filesystem layer.
Cross-references
- Walker phases that perform the bound writes: walker
- rm -rf blast-radius contract that depends on bound deletes: force-prune
- Per-pack .grex-lock mutual exclusion (orthogonal to TOCTOU): concurrency
- Lockfile atomic rewrite (also dirfd-bound under v1.2.0+): lockfile
cli — v1 frozen verb contract
12 verbs. Freeze is additive-only post-v1: adding verbs, flags, or JSON-output fields is allowed; removing or renaming is a v2 change.
Universal flags
| Flag | Effect |
|---|---|
| --json | emit machine-readable JSON to stdout, suppress ANSI |
| --plain | ANSI off, no Unicode, CI/agent-friendly |
| --dry-run | compute plan, print it, do NOT mutate disk or manifest |
| --parallel N | bound scheduler semaphore to N permits (default: num_cpus) |
| --filter <expr> | restrict verb to matching packs (name glob, type, depth) |
| --manifest <path> | override default ./grex.jsonl |
| -v, -vv, -vvv | tracing verbosity |
Output mode precedence: --json > --plain > TTY auto-detect pretty default.
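The precedence rule can be written as a small resolver. A minimal sketch: `OutputMode` and `resolve_output_mode` are hypothetical names, and falling back to plain output when stdout is not a TTY is an assumption about what "TTY auto-detect" means here.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OutputMode {
    Json,
    Plain,
    Pretty,
}

/// --json wins over --plain, which wins over TTY auto-detection.
/// Assumption: a non-TTY stdout falls back to plain output.
pub fn resolve_output_mode(json: bool, plain: bool, stdout_is_tty: bool) -> OutputMode {
    if json {
        OutputMode::Json
    } else if plain || !stdout_is_tty {
        OutputMode::Plain
    } else {
        OutputMode::Pretty
    }
}
```

Passing both flags yields `Json`, so a wrapper script that always adds --plain does not break a caller that asks for --json.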
Exit codes
| Code | Meaning |
|---|---|
| 0 | success |
| 1 | generic error |
| 2 | CLI usage error |
| 3 | manifest integrity failure |
| 4 | pack op failed (fetch/install/sync/teardown) |
| 5 | lock contention / concurrency |
| 6 | MCP protocol error |
| 7 | doctor found drift |
| 8 | plugin / unknown action or pack-type |
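A caller that wraps grex may want the table above as a typed value rather than bare integers. The enum below is a consumer-side sketch: `GrexExit` and its variant names are made up for illustration, and only the numeric values come from the table.

```rust
/// Consumer-side mirror of the exit-code table (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum GrexExit {
    Success = 0,
    Generic = 1,
    Usage = 2,
    ManifestIntegrity = 3,
    PackOp = 4,
    LockContention = 5,
    McpProtocol = 6,
    DoctorDrift = 7,
    Plugin = 8,
}

impl GrexExit {
    /// Maps a raw process exit code to a known variant, if any.
    pub fn from_code(code: i32) -> Option<Self> {
        Some(match code {
            0 => Self::Success,
            1 => Self::Generic,
            2 => Self::Usage,
            3 => Self::ManifestIntegrity,
            4 => Self::PackOp,
            5 => Self::LockContention,
            6 => Self::McpProtocol,
            7 => Self::DoctorDrift,
            8 => Self::Plugin,
            _ => return None,
        })
    }

    pub fn code(self) -> u8 {
        self as u8
    }
}
```

Returning `None` for unknown codes keeps the wrapper forward-compatible if a later release adds a code.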
The 12 verbs
grex init
Initialize a grex workspace (creates grex.jsonl, configures hooks, writes .gitignore managed-block markers if missing).
- Args: none.
- Flags: --hooks-path <dir> (default .grex/hooks), --no-clone (skip fetch of pre-existing entries).
- Example: grex init --parallel 4
- JSON: {"workspace":"<cwd>","created":["grex.jsonl","grex.lock.jsonl"],"hooks":"<path>","cloned":[]}
grex add <url> [path]
Register a pack, clone it, run its install.
- Args: <url> required; [path] optional bare-name, inferred from URL basename.
- Flags: --type <meta|declarative|scripted> (auto-detected from pack.yaml), --ref <branch|tag|sha>, --no-install (clone only).
- Exit: 2 if path not bare; 4 if fetch or install fails.
- Example: grex add git@github.com:user/warp-cfg.git warp-cfg
- JSON: {"id":"warp-cfg","type":"declarative","path":"warp-cfg","sha":"abc123","installed":true}
grex rm <path>
Run teardown, remove pack dir, tombstone in manifest, update .gitignore.
- Args: <path> required.
- Flags: --keep-files (tombstone only), --skip-teardown (do not run teardown actions/hooks).
- JSON: {"id":"...","removed":true,"files_deleted":true,"teardown":"ok"}
grex ls
List registered packs (post-fold).
- Flags: --type <...>, --long (include SHA + install time + actions_hash), --tree (nested view).
- JSON: [{"id":"...","type":"...","path":"...","ref":"...","sha":"...","installed_at":"..."}]
grex status
Drift report: manifest vs lockfile vs on-disk.
- Flags: --stale-after <duration>, --fail-on-drift.
- JSON: [{"id":"...","on_disk":true,"sha_match":true,"actions_hash_match":true,"drift":"clean|dirty|missing|untracked|stale"}]
- Exit: 7 if any drift with --fail-on-drift.
grex sync [--recursive]
Git fetch/pull every pack; recurse into children. Install actions are not re-run here (see update).
- Flags: --recursive (default true), --only <id>, --fail-fast.
- M4-D flags (additive, freeze-preserving):
  - --ref <REF>: override every pack's declared ref for this sync invocation (branch, tag, or commit SHA). Applied by the walker at each child clone / checkout; the root pack itself is not re-checked-out (operator manages root via grex add / manual git). Empty and whitespace-only values rejected at parse time.
  - --only <GLOB>: restrict sync to packs whose workspace-relative pack path, normalized to forward-slash form (/), matches the glob. Cross-platform consistent: a, b/c, vendor/* evaluate identically on Windows and POSIX. The root pack (whose path lies outside the workspace) falls back to its absolute forward-slash path. Bare pack names do not match unless the name coincides with the workspace-relative path. Repeat the flag to OR-combine multiple patterns. Non-matching packs are skipped entirely (no action execution); their prior lockfile entry is carried forward so a subsequent unfiltered sync still short-circuits on hash. Invalid globs exit 2 (CLI usage error). Caveat: --only does NOT expand to include a pack's depends_on / child dependencies; operator must include them explicitly if dependency-filtered runs are required. Empty and whitespace-only values rejected at parse time.
  - --force: re-execute every pack even when its actions_hash is unchanged from the prior lockfile. Bypasses the M4-B skip-on-hash short-circuit. Caveat: non-idempotent actions (exec without guard, mkdir with mode drift, etc.) may produce duplicate / compounding side effects when --force replays after a mid-run halt; operator responsibility to ensure action idempotency before using --force on a partially-applied workspace.
- JSON: [{"id":"...","result":"ok|err","sha_before":"...","sha_after":"...","message":""}]
- Exit: 4 on op failure without --keep-going.
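The --only matching rule (normalise to forward slashes, then OR-combine patterns) can be illustrated with a toy matcher. Everything here is an assumption made for the example: `to_forward_slash`, `matches_only` and `wildcard` are hypothetical names, and the stand-in wildcard only understands `*` as "any run of characters within one path segment", which may differ from the glob dialect grex actually uses.

```rust
/// Normalises Windows separators so patterns evaluate the same everywhere.
pub fn to_forward_slash(path: &str) -> String {
    path.replace('\\', "/")
}

/// Stand-in matcher (assumption): `*` matches zero or more bytes other than
/// `/`; every other byte must match literally.
fn wildcard(pattern: &[u8], text: &[u8]) -> bool {
    match pattern.split_first() {
        None => text.is_empty(),
        Some((&b'*', rest)) => {
            let mut i = 0;
            loop {
                if wildcard(rest, &text[i..]) {
                    return true;
                }
                // Stop at end of input or at a segment boundary.
                if i == text.len() || text[i] == b'/' {
                    return false;
                }
                i += 1;
            }
        }
        Some((expected, rest)) => match text.split_first() {
            Some((actual, tail)) if actual == expected => wildcard(rest, tail),
            _ => false,
        },
    }
}

/// True when the workspace-relative pack path matches at least one pattern.
pub fn matches_only(patterns: &[&str], pack_path: &str) -> bool {
    let normalised = to_forward_slash(pack_path);
    patterns
        .iter()
        .any(|p| wildcard(p.as_bytes(), normalised.as_bytes()))
}
```

With this stand-in, `vendor\lib` on Windows and `vendor/lib` on POSIX both match `vendor/*`, which is the cross-platform consistency the flag description asks for. An empty pattern list matches nothing in this sketch; the "flag absent means no filtering" case is not modelled.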
grex update [pack]
Sync + re-run install actions for packs whose lockfile SHA or actions_hash changed.
- Args: [pack] optional; defaults to all.
- Flags: --force (re-run install regardless of lock), --only <id>.
- JSON: [{"id":"...","synced":true,"reinstalled":true,"reason":"sha-changed|hash-changed|forced|none"}]
grex doctor
Integrity + drift + lint.
- Checks: manifest schema, gitignore managed-block in sync, on-disk pack drift, .grex/pack.yaml schema validity, stale .grex-lock files, orphan entries.
- Flags: --compact (run manifest compaction), --fix (auto-fix fixable issues).
- Exit: 7 on drift, 3 on manifest integrity failure, 0 clean.
grex serve --mcp
Launch embedded MCP stdio JSON-RPC 2.0 server.
- Flags: --mcp (required; reserved for --http in v2).
- Exit: 6 on protocol error.
- Details: mcp.md.
grex import
Bring external state into the manifest.
- Flags:
  - --from-repos-json <path>: ingest legacy flat REPOS.json array.
  - --scan: walk workspace one level deep, register untracked .git dirs.
  - --default-type <...>: pack-type assumed for entries without pack.yaml (default: meta).
- JSON: {"imported":[...],"skipped":[...],"errors":[...]}
grex run <action> [--filter <expr>]
Invoke a registered action by name across matched packs. Primarily for testing/diagnostic use; production installs go through pack-type lifecycle.
- Args: <action> required; matches registered plugin name.
- Flags: --filter <expr>, --parallel N.
- JSON: [{"pack":"...","action":"...","changed":true,"message":""}]
grex exec <cmd> [-- args...] [--filter <expr>]
Run an arbitrary command inside each matched pack's workdir.
- Args: <cmd> required.
- Flags: --filter, --parallel N, --shell (opt-in shell parsing; off by default).
- Example: grex exec git status
- JSON: [{"pack":"...","stdout":"","stderr":"","exit":0}]
Verb interactions
- sync only fetches; update = sync + install re-run on lockfile delta.
- run operates on actions directly, bypassing pack-type lifecycle; useful for debugging a single action.
- exec is never filtered through the action plugin registry; it runs arbitrary commands.
- serve --mcp does not block other verbs; it exposes them over JSON-RPC.
Freeze semantics
A v1.x release may:
- Add a new verb.
- Add a new flag to an existing verb.
- Add a new field to any --json output.
- Add a new action name, pack-type name, or MCP method (all additive).
A v1.x release may NOT:
- Remove or rename a verb.
- Change the meaning of an existing flag.
- Change the type of an existing JSON output field.
- Remove an action name or pack-type name.
grex doctor validates pack.yaml against the frozen schema version.
CLI --json output
Every non-transport verb honours the global --json flag. When present,
the verb writes a single JSON document to stdout and suppresses the
default human-readable output. The serve verb is excluded — it owns
stdio for JSON-RPC framing, so --json is not applicable there.
This chapter is the v1 JSON contract. Field names are stable across
PATCH / MINOR releases; new fields may be added (readers must ignore
unknown keys per the manifest wire invariant). Breaking changes require
a MAJOR bump and a deprecation cycle — see semver.md.
Two envelope families
Every --json payload belongs to exactly one of two families. Callers
distinguish them by the presence or absence of a top-level status key:
| discriminant | envelope family | stability |
|---|---|---|
| "status": "unimplemented" | stub envelope | stable shape while the verb remains unimplemented (see below) |
| no status key | verb-specific shape | stable shape per the verb's section below |
A verb transitioning from unimplemented to wired is a schema addition,
not a replacement: the stub envelope is dropped and a verb-specific shape
takes its place. Consumers MUST branch on the presence of status:
// Pseudocode — pick the right parser per verb
if (payload.status === "unimplemented") {
// Stub verb. Treat as "no semantic data yet" and proceed.
} else {
// Verb-specific shape documented below.
}
The two families never co-exist in the same payload. status is reserved
for the stub envelope; no verb-specific shape will ever gain a top-level
status field.
Stub envelope (unimplemented verbs)
init, rm, status, update, run, exec are still
M1 stubs. --json emits:
{"status": "unimplemented", "verb": "init"}
Fields:
- status: always the literal string "unimplemented".
- verb: the verb name as typed on the command line.
The stub envelope is a contract for consumers to detect unfinished verbs
without parsing the (absent) verb-specific body. When the verb is wired,
the stub envelope is removed; the verb now emits its verb-specific
shape. Exit codes are unchanged (stubs exit 0).
add
Wired. Emits an add registration report:
{
"dry_run": false,
"id": "pack-a",
"url": "https://example.com/pack-a.git",
"path": "pack-a",
"type": "scripted",
"appended": true
}
Fields:
- dry_run: bool; mirrors the global --dry-run flag.
- id: pack id written to the manifest; currently equal to path.
- url: source URL as provided.
- path: workspace-relative pack path, explicit or inferred from URL.
- type: classified pack kind (scripted for git-like URLs, declarative otherwise).
- appended: bool; false only when dry_run is true.
The MCP add tool emits a byte-identical body.
ls
Wired in v1.1.1. Walks the workspace from a root pack.yaml (or the
current directory when no pack_root is given) without cloning,
fetching, or executing anything, and emits a structured tree envelope:
{
"workspace": "/abs/path/to/workspace",
"tree": [
{
"id": 0,
"name": "rootp",
"path": "/abs/path/to/workspace",
"type": "meta",
"synthetic": false,
"children": [
{
"id": 1,
"name": "alpha",
"path": "/abs/path/to/workspace/alpha",
"type": "scripted",
"synthetic": true,
"children": []
}
]
}
]
}
Fields:
- workspace: absolute path to the resolved workspace (the directory holding the root pack's .grex/, or the pack root itself for the flat-sibling layout).
- tree[]: root-level nodes. Currently always one entry; the array shape is reserved so future surfaces walking from a workspace dir with multiple sibling packs can extend without a schema break.
- Per node: id (stable in-walk depth-first counter, root = 0), name, path (absolute), type (one of meta, declarative, scripted), synthetic (bool; see below), children[].
synthetic: true indicates a plain-git child whose pack manifest was
synthesised in-memory by the walker (the destination directory carries
.git/ but no .grex/pack.yaml). Synthetic nodes always carry
type: "scripted" per the v1.1.1 design. See
pack-spec.md §"Plain-git children" for
the full contract.
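The id field is described as a depth-first counter with the root at 0. One way to picture that numbering is a pre-order walk, sketched below. This assumes pre-order assignment (parent before children, children in listed order), which is consistent with the example payload but not confirmed beyond it; `Node`, `node` and `assign_ids` are hypothetical names and carry only a subset of the real fields.

```rust
/// Cut-down tree node for illustration (hypothetical type).
#[derive(Debug, Default)]
pub struct Node {
    pub id: usize,
    pub name: String,
    pub children: Vec<Node>,
}

/// Convenience constructor; ids are filled in later by `assign_ids`.
pub fn node(name: &str, children: Vec<Node>) -> Node {
    Node { id: 0, name: name.to_string(), children }
}

/// Pre-order numbering: a node takes the next counter value, then its
/// children are numbered left to right.
pub fn assign_ids(current: &mut Node, next: &mut usize) {
    current.id = *next;
    *next += 1;
    for child in &mut current.children {
        assign_ids(child, next);
    }
}
```

Because the counter is threaded through the whole walk, ids are unique within one walk and identical across runs as long as the tree shape and child order do not change.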
Error envelope
{"verb": "ls", "error": {"kind": "tree", "message": "..."}}
kind values: tree (root manifest could not be loaded), usage
(invalid pack_root argument). The verb exits 2 on error and 0
on success.
The MCP ls tool emits a byte-identical successful body. The MCP
surface does NOT accept a pack_root parameter (workspace-confinement
invariant); the walk always starts from the server's pinned workspace.
sync and teardown
These verbs drive the M3 Stage B pipeline. --json emits a
SyncReport-shaped document:
{
"verb": "sync",
"dry_run": false,
"steps": [
{"pack": "a", "action": "file-write", "idx": 0, "result": "performed_change", "details": null},
{"pack": "b", "action": "shell-run", "idx": 1, "result": "skipped",
"details": {"pack_path": "b", "actions_hash": "sha256:..."}}
],
"halted": null,
"event_log_warnings": [],
"summary": {"total_steps": 2}
}
result values: performed_change, would_perform_change,
already_satisfied, noop, skipped, other.
Missing <pack_root> → usage error (exit 2)
sync / teardown without a <pack_root> positional emit a
verb-specific error envelope and exit 2 (the frozen usage-error exit
code from cli.md):
{
"verb": "sync",
"error": {"kind": "usage", "message": "`<pack_root>` is required (directory with `.grex/pack.yaml` or the YAML file)"}
}
This is NOT a stub envelope — no status key. The usage-error branch is
distinct from the unimplemented-verb branch so callers can distinguish
"tell the user to fix their invocation" (exit 2) from "this verb has no
implementation yet" (exit 0).
Error envelope (other failure paths)
Validation / tree / exec / halted paths share the same envelope shape:
{
"verb": "sync",
"error": {"kind": "validation", "message": "…"}
}
kind values: validation, tree, exec, usage, other. The
halted sub-case emits a dedicated shape:
{"verb": "sync", "halted": {"pack": "a", "action": "shell-run",
"idx": 0, "error": "…", "recovery_hint": "…"}}
doctor
Wired. Emits a DoctorReport:
{
"exit_code": 0,
"worst_severity": "ok",
"findings": [
{"check": "manifest-schema", "severity": "ok",
"pack": null, "detail": "", "auto_fixable": false, "synthetic": false},
{"check": "synthetic-pack", "severity": "ok",
"pack": "algo-leet", "detail": "OK (synthetic)",
"auto_fixable": false, "synthetic": true}
]
}
Fields:
- exit_code: number; the severity-roll-up exit code the CLI also returns from the process.
- worst_severity: string; one of ok / warning / error. Matches the highest severity in findings.
- findings[]: array of per-check finding objects.
severity values: ok, warning, error.
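The roll-up from findings to worst_severity is a maximum over an ordered enum. A small sketch, with hypothetical names (`Severity`, `worst_severity`) and one labelled assumption: an empty findings list rolls up to ok. The mapping from severity to exit_code is not shown because it is not spelled out here.

```rust
/// Declaration order gives the ordering: Ok < Warning < Error.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Severity {
    Ok,
    Warning,
    Error,
}

impl Severity {
    /// The lowercase string used in the JSON payload.
    pub fn as_str(self) -> &'static str {
        match self {
            Severity::Ok => "ok",
            Severity::Warning => "warning",
            Severity::Error => "error",
        }
    }
}

/// Highest severity among the findings. Assumption: no findings means ok.
pub fn worst_severity(findings: &[Severity]) -> Severity {
    findings.iter().copied().max().unwrap_or(Severity::Ok)
}
```

Deriving `Ord` from the declaration order keeps the roll-up to a single `max()` call, so adding a severity level later only requires placing it at the right position in the enum.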
v1.1.1+ adds synthetic: true to findings for synthetic plain-git
children (skipped schema validation; gitignore + drift checks still
run). The flag mirrors the synthetic marker on the matching LsTree
node and on the lockfile entry, so consumers correlating doctor
findings with grex ls output see the same plain-git provenance on
both surfaces.
The MCP doctor tool emits a byte-identical body. The MCP surface does
NOT accept --fix (read-only inspection only) or --workspace
(workspace-confinement invariant). CLI-only users retain grex doctor --fix for interactive gitignore healing.
import
Wired. Emits an ImportPlan:
{
"dry_run": true,
"imported": [
{"path": "pack-a", "url": "https://…", "kind": "declarative",
"would_dispatch": true}
],
"skipped": [{"path": "pack-b", "reason": "path_collision"}],
"failed": []
}
Fields:
- dry_run: bool; mirrors whichever of --dry-run / global --dry-run was in effect.
- imported[]: entries that will be (or were) added to the manifest.
- skipped[]: entries excluded; reason is one of path_collision, duplicate_in_input.
- failed[]: entries that errored during ingest; each carries a human-readable error string.
No summary wrapper — callers derive counts from the three arrays. The
MCP import tool emits a byte-identical body. The MCP surface does NOT
accept a workspace parameter (workspace-confinement invariant); the
fromReposJson path is resolved relative to the server's workspace and
rejected if it canonicalises outside it.
Exit codes
--json does not alter exit codes — callers MUST use the process exit
code as the source of truth for success / failure, not the presence of
an error key. The JSON payload is diagnostic detail, not the wire
signal.
actions
7 Tier 1 action primitives. Grounded in observed real-world script patterns (see goals.md grounded-reality table). Each is a native Rust built-in registered as an ActionPlugin at compile time.
Action invocation shape
In pack.yaml:
actions:
- <action-name>:
<arg>: <value>
...
Or for actions that take a bare argument object:
actions:
- mkdir: { path: "$HOME/.warp" }
grex parses each entry, looks up the action by key in the registry, and dispatches to its ActionPlugin::execute.
Variable expansion
Action args support env-var interpolation: $HOME, $USER, $APPDATA, $LOCALAPPDATA, ${NAME}. Expansion is done by grex in the PackCtx::env resolver — native-per-platform:
- POSIX: standard $VAR, case-sensitive.
- Windows: $VAR works, plus %VAR% for legacy paths. $HOME maps to %USERPROFILE% (fallback applied on VarEnv::from_os / from_map only, NOT on an explicit insert). Lookup is ASCII-case-insensitive via a secondary lowercase-keyed index; $UserProfile and $USERPROFILE resolve to the same value.
Escape syntax
- POSIX form: a literal $ is written as $$. $${HOME} expands to the literal string ${HOME} (no expansion).
- Windows form: a literal % is written as %%. %%USERNAME%% expands to the literal string %USERNAME%.
- Backslash escapes (\$, \%) are not supported.
- env:
name: GREX_DOC_EXAMPLE
value: "literal $${HOME} and %%USERNAME%%" # → literal ${HOME} and %USERNAME%
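The POSIX-form rules ($VAR, ${NAME}, and $$ as the escape) fit in a short scanner. The sketch below is an illustration, not the PackCtx::env resolver: `expand_posix` and `lookup` are hypothetical names, the Windows %VAR% form and case-insensitive lookup are not handled, and two behaviours are assumptions (an undefined variable is an error, and a lone `$` not followed by a name is kept literally).

```rust
use std::collections::HashMap;

fn lookup<'a>(env: &'a HashMap<String, String>, name: &str) -> Result<&'a str, String> {
    env.get(name)
        .map(String::as_str)
        .ok_or_else(|| format!("undefined variable: {}", name))
}

/// Expands `$VAR` and `${NAME}`; `$$` produces a literal `$`.
pub fn expand_posix(input: &str, env: &HashMap<String, String>) -> Result<String, String> {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();
    while let Some(c) = chars.next() {
        if c != '$' {
            out.push(c);
            continue;
        }
        match chars.peek().copied() {
            // `$$` is the escape for a literal dollar sign.
            Some('$') => {
                chars.next();
                out.push('$');
            }
            // `${NAME}`: read up to the closing brace.
            Some('{') => {
                chars.next();
                let mut name = String::new();
                let mut closed = false;
                for n in chars.by_ref() {
                    if n == '}' {
                        closed = true;
                        break;
                    }
                    name.push(n);
                }
                if !closed {
                    return Err(format!("unterminated ${{{}", name));
                }
                out.push_str(lookup(env, &name)?);
            }
            // `$NAME`: letters, digits and underscores, not starting with a digit.
            Some(first) if first == '_' || first.is_ascii_alphabetic() => {
                let mut name = String::new();
                while let Some(&n) = chars.peek() {
                    if n == '_' || n.is_ascii_alphanumeric() {
                        name.push(n);
                        chars.next();
                    } else {
                        break;
                    }
                }
                out.push_str(lookup(env, &name)?);
            }
            // Assumption: a lone `$` is passed through unchanged.
            _ => out.push('$'),
        }
    }
    Ok(out)
}
```

Because `$$` is consumed as a pair before the brace check runs, `$${HOME}` leaves `{HOME}` to be copied as ordinary text, which is how the escape rule yields the literal `${HOME}`.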
The 7 primitives
1. symlink
Create or update a symlink, with optional backup of any existing dst.
- symlink:
    src: files/config.yaml # relative to pack workdir
    dst: "$HOME/.warp/config.yaml"
    backup: true # default false; renames existing dst to <dst>.grex-bak.<ts>
    normalize: true # default true; absolute-normalizes both paths
    kind: auto # auto | file | directory; Windows needs explicit for dir symlinks
| Field | Type | Default | Notes |
|---|---|---|---|
| src | path | required | Resolved relative to pack workdir. |
| dst | path | required | May contain env vars. |
| backup | bool | false | Renames existing dst before creating symlink. |
| normalize | bool | true | Canonicalizes both sides. |
| kind | enum | auto | auto infers from src; directory forced on Windows for dir links. |
Cross-platform: uses std::os::unix::fs::symlink on POSIX, std::os::windows::fs::{symlink_file, symlink_dir} on Windows. Requires Developer Mode or the SeCreateSymbolicLinkPrivilege privilege on Windows; gating the action behind a require with the symlink_ok predicate is recommended.
kind: auto with missing src: if kind is auto and src does not exist at execute time, grex errors with SymlinkAutoKindUnresolvable rather than defaulting to file. A dangling file-symlink where a directory was required is worse than a loud failure.
Idempotency: if dst is already a symlink pointing at src, no-op (changed: false).
Rollback: removes the symlink; if a backup was made, restores it.
Backup + create atomicity: when backup: true is set and dst exists, grex renames dst → <dst>.grex.bak then creates the symlink. If the rename succeeds but the create fails, grex renames the backup back to dst (best-effort restore). If the restore rename also fails, grex surfaces SymlinkCreateAfterBackupFailed — the user is told exactly what is on disk (backup at <dst>.grex.bak, no symlink at dst) so manual recovery is unambiguous.
Errors: src missing, dst parent missing, privilege denied, SymlinkAutoKindUnresolvable (see above), SymlinkCreateAfterBackupFailed (see above).
Duplicate dst within a pack: two or more symlink actions in the same pack whose resolved dst paths are equal trigger a plan-phase validation error (ActionArgsInvalid), raised before any action executes. On case-insensitive filesystems (Windows, macOS default APFS) the comparison is ASCII-case-folded, so C:\Users\a\x and c:\users\a\X are detected as duplicates. Cross-pack collisions on the same dst are handled separately by workspace-level conflict detection. See pack-spec.md §Validation rules.
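A minimal sketch of the duplicate-dst check described above, assuming the dst values have already been expanded and normalized. The function name first_duplicate_dst and the case_insensitive_fs flag are hypothetical; how grex decides that a filesystem is case-insensitive is not covered here.

```rust
use std::collections::HashMap;

/// Return the indices of the first two actions whose resolved `dst` collide.
/// Hypothetical helper: `dsts` holds already-resolved destination paths.
pub fn first_duplicate_dst(dsts: &[&str], case_insensitive_fs: bool) -> Option<(usize, usize)> {
    let mut seen: HashMap<String, usize> = HashMap::new();
    for (idx, dst) in dsts.iter().enumerate() {
        // ASCII-only case folding, as in the rule above; non-ASCII is left untouched.
        let key = if case_insensitive_fs {
            dst.to_ascii_lowercase()
        } else {
            (*dst).to_string()
        };
        if let Some(&first) = seen.get(&key) {
            return Some((first, idx));
        }
        seen.insert(key, idx);
    }
    None
}
```

Running the check over the whole action list before anything executes is what lets the error surface in the plan phase rather than halfway through an install.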
2. env
Set an environment variable.
- env:
    name: WARP_HOME
    value: "$HOME/.warp"
    scope: user # user | machine | session
| Field | Type | Default | Notes |
|---|---|---|---|
| name | string | required | Variable name. |
| value | string | required | Expanded before setting. |
| scope | enum | user | user persists to shell rc / registry HKCU; machine → HKLM / /etc/environment (requires admin); session → current process only. |
Platform:
- Windows: user writes HKCU\Environment + broadcasts WM_SETTINGCHANGE. machine writes HKLM\System\CurrentControlSet\Control\Session Manager\Environment.
- POSIX: user appends managed-block to ~/.bashrc / ~/.zshrc / ~/.config/fish/config.fish. machine writes /etc/environment.
- session uses std::env::set_var (doesn't persist).
Idempotency: re-read current value; no-op if already set.
Rollback: restores previous value if captured; else unsets.
3. mkdir
Create a directory, including parents.
- mkdir: { path: "$HOME/.warp" }
| Field | Type | Default | Notes |
|---|---|---|---|
| path | path | required | Expanded. |
| mode | string | "755" (POSIX) | Ignored on Windows. |
Idempotency: no-op if already a directory.
Errors: path exists as non-directory.
Rollback: if grex created it, remove it (only if empty).
4. rmdir
Remove a directory, optionally with backup.
- rmdir:
    path: "$HOME/.warp"
    backup: true # default false; renames to <path>.grex-bak.<ts>
    force: false # default false; if false, refuses non-empty unless backup
| Field | Type | Default | Notes |
|---|---|---|---|
| path | path | required | Expanded. |
| backup | bool | false | Renames rather than deleting. |
| force | bool | false | Allow recursive delete of non-empty. |
Idempotency: no-op if already absent.
Rollback: restores backup if one was made; else creates empty dir (best-effort).
5. require
Prerequisite / idempotency gate. Evaluates predicates; on failure, aborts or skips per on_fail.
- require:
    all_of: # or any_of / none_of
      - cmd_available: git
      - os: windows
      - psversion: ">=5.1"
    on_fail: error # error | skip | warn
Predicates:
| Predicate | Arg | Meaning |
|---|---|---|
| path_exists | path | Filesystem path present. |
| cmd_available | name | name in PATH. |
| reg_key | hive\path!name | Registry value present (Windows only; off-platform a leaf evaluation yields PredicateNotSupported). Forward-slash separators (HKCU/Software/X) are accepted and normalized to \. ACL-denied or transient registry I/O surfaces as PredicateProbeFailed rather than collapsing to false. |
| os | windows \| linux \| macos | Current OS matches. |
| psversion | version-spec | PowerShell version constraint (Windows only; off-platform a leaf evaluation yields PredicateNotSupported). Probe is bounded by a 5 s timeout, prefers the absolute %SystemRoot%\System32\WindowsPowerShell\v1.0\powershell.exe path to resist PATH-hijack, compares the full (major, minor) tuple, and surfaces non-zero exit / timeout / unexpected I/O as PredicateProbeFailed. powershell.exe genuinely missing degrades to false (matches the reg_key NotFound shape). |
| symlink_ok | - | Privilege / dev-mode present to create symlinks. |
Combiners: all_of (AND), any_of (OR), none_of (NOT). Nest freely. Inside these combiners (and inside when's all_of / any_of / none_of lists) a leg that yields PredicateNotSupported is treated as false so other legs still get a chance — this preserves the cross-platform rescue pattern any_of: [{reg_key: ...}, {path_exists: /etc/foo}]. The top-level combiner attached to a require stays strict: a single unsupported leaf under require still bubbles the typed error.
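One way to read the tolerance rule is sketched below. The types Pred and PredError and both functions are hypothetical, and leaf outcomes are supplied up front instead of being probed. The sketch assumes the reading given in the error taxonomy: a leg evaluated inside any combiner has PredicateNotSupported downgraded to false, a leaf evaluated on its own stays strict, and PredicateProbeFailed is never rescued. Evaluating legs in order and stopping at the first probe failure is also an assumption.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PredError {
    NotSupported,
    ProbeFailed,
}

pub enum Pred {
    /// A leaf whose probe outcome is already known (stand-in for a real probe).
    Leaf(Result<bool, PredError>),
    AllOf(Vec<Pred>),
    AnyOf(Vec<Pred>),
    NoneOf(Vec<Pred>),
}

/// Strict evaluation: a bare unsupported leaf bubbles its typed error.
pub fn eval(pred: &Pred) -> Result<bool, PredError> {
    match pred {
        Pred::Leaf(outcome) => *outcome,
        Pred::AllOf(legs) => Ok(eval_legs(legs)?.iter().all(|&b| b)),
        Pred::AnyOf(legs) => Ok(eval_legs(legs)?.iter().any(|&b| b)),
        Pred::NoneOf(legs) => Ok(!eval_legs(legs)?.iter().any(|&b| b)),
    }
}

/// Tolerant evaluation of combiner legs: NotSupported becomes `false`,
/// while ProbeFailed still aborts the whole evaluation.
fn eval_legs(legs: &[Pred]) -> Result<Vec<bool>, PredError> {
    legs.iter()
        .map(|leg| match eval(leg) {
            Err(PredError::NotSupported) => Ok(false),
            other => other,
        })
        .collect()
}
```

Under this reading the cross-platform rescue pattern works because the unsupported reg_key leg simply contributes false to any_of, leaving the path_exists leg to decide the result.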
on_fail:
- error → abort pack install with non-zero exit.
- skip → remaining actions in this pack skipped, lifecycle reports "skipped".
- warn → log warning, continue.
Observed frequency: 9 uses in the scanned scripts. Highest-leverage primitive.
6. when
Platform / conditional gate wrapping nested actions. Sugar over require for common platform dispatch.
- when:
    os: windows # or: any_of / all_of / none_of
    actions:
      - mkdir: { path: "$HOME/.warp" }
      - symlink: { src: files/config.yaml, dst: "$HOME/.warp/config.yaml" }
| Field | Type | Default | Notes |
|---|---|---|---|
| os | string | - | Shorthand for require { os: ... }. |
| all_of/any_of/none_of | list | - | Full predicate combiner support. |
| actions | list | required | Nested actions; run only if condition holds. |
On condition false: all nested actions are skipped (not failures). No rollback needed — nothing ran.
Combiner precedence: when os and any of all_of/any_of/none_of appear together, they compose conjunctively (AND). os: is shorthand equivalent to an os: predicate inside an implicit all_of; the explicit combiners are appended to that same all_of. Mixed example:
- when:
    os: windows
    all_of:
      - cmd_available: pwsh
      - psversion: ">=7.0"
    actions:
      - exec: { cmd: ["pwsh", "-NoProfile", "-File", "files/setup.ps1"] }
Both the os: windows shorthand and every predicate under all_of must hold for the nested actions to run.
7. exec
Shell escape. Runs a command. Array form by default (no shell interpretation). Opt into shell parsing explicitly.
- exec:
    cmd: ["rclone", "copy", "gdrive:backup", "$HOME/backup"]
    cwd: "$HOME" # default: pack workdir
    env: # extra env vars for this invocation
      RCLONE_CONFIG: "$HOME/.config/rclone/rclone.conf"
    shell: false # default false; true = parse via sh -c / cmd /c
    on_fail: error # error | warn | ignore
| Field | Type | Default | Notes |
|---|---|---|---|
| cmd | list[string] | required (when shell=false) | argv array. |
| cmd_shell | string | required (when shell=true) | Single string passed to shell. |
| cwd | path | pack workdir | Where to run. |
| env | map | {} | Extra env vars. |
| shell | bool | false | Enable shell interpretation. |
| on_fail | enum | error | Error propagation. |
Rule: exec is the last-resort primitive. If you find yourself writing a second exec in the same pack, consider promoting the logic to a purpose-built action (built-in or plugin).
No idempotency guarantee. grex does not know whether the command you ran is repeatable. Pair with require to gate it.
Rollback: none (grex cannot know how to undo arbitrary commands). Pack authors wanting true rollback must pair with a teardown action.
stderr capture on failure: when exec returns a non-zero status (and on_fail: error), grex records the failure as ExecNonZero and attaches a truncated copy of the command's stderr — capped at 2 KiB — to the manifest action_halted event. The cap bounds manifest event size to stay below the fd-lock append atomicity ceiling (see manifest.md §Atomic append). Full stderr is printed to the terminal regardless; only the manifest copy is truncated.
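The manifest-side cap can be sketched as a byte-bounded slice. The constant and function names are hypothetical, and backing off to a UTF-8 character boundary is an assumption of this sketch; the text above only states the 2 KiB limit.

```rust
/// Manifest-side cap from the text above: 2 KiB.
pub const STDERR_CAP_BYTES: usize = 2 * 1024;

/// Return at most `cap` bytes of `stderr`, never splitting a UTF-8 character.
/// Hypothetical helper; the boundary handling is an assumption.
pub fn truncate_for_manifest(stderr: &str, cap: usize) -> &str {
    if stderr.len() <= cap {
        return stderr;
    }
    let mut end = cap;
    // Index 0 is always a boundary, so this loop terminates.
    while !stderr.is_char_boundary(end) {
        end -= 1;
    }
    &stderr[..end]
}
```

Only the copy stored in the manifest event would pass through such a function; the terminal output stays complete, as stated above.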
Observed-pattern → primitive mapping
From the E:\repos scan (3 PowerShell scripts, 945 LOC):
| Observed pattern | Count | v1 primitive | Notes |
|---|---|---|---|
| New-Item -ItemType SymbolicLink / ln -s | 8 | symlink | Direct mapping. |
| if (Test-Path …) { … } idempotency guards | 9 | require | path_exists or cmd_available predicate. |
| [Environment]::SetEnvironmentVariable(…, 'User') | 7 | env (scope: user) | Direct. |
| & ./install.ps1 chain scripts | 5 | exec | Temporary; plugin should replace long-term. |
| New-Item -ItemType Directory -Force | 2 | mkdir | Direct. |
| if ($IsWindows) { … } platform gate | 2 | when | Direct. |
| Rename-Item backup then Remove-Item -Recurse | 1 | rmdir (backup: true) | Direct. |
Not observed: package installs (winget, choco), JSON merges, archive extracts, template rendering. Those are real patterns but do not appear in this sample. Deferred to v2 plugin contributions.
Action plugin registration
Built-ins register via the canonical register_builtins(&mut Registry) free function called from Registry::bootstrap() (decision 2026-04-20). inventory::submit! auto-registration is feature-gated behind plugin-inventory (default off) and lands in Stage M4-E. User-facing YAML keys resolve through the registry name-to-plugin map.
Full trait definition, registration details, and v2 external-loading path: plugin-api.md.
Error taxonomy
| Error | Cause | Recovery |
|---|---|---|
| ActionArgsInvalid | Malformed YAML for action. | Fix pack.yaml. |
| ActionPreconditionFailed | require predicate false with on_fail: error. | Fix environment or pack. |
| ActionExecutionFailed | Runtime error during action. | Pack-type rollback invoked. |
| ActionUnknown | Action key not registered. | Plugin missing. Exit 8. |
| PredicateNotSupported | Predicate (reg_key / psversion) is platform-specific and the current platform cannot answer it. Inside all_of / any_of / none_of combiners this is tolerated as false; at the top-level require it is fatal. | Wrap with when: { os: windows } or use any_of with a cross-platform fallback leg. |
| PredicateProbeFailed | The probe ran on the correct platform but itself broke: non-zero powershell.exe exit, 5 s timeout, ACL-denied registry read, or other OS I/O that is not a plain NOT_FOUND. Always fatal. | Investigate the probe error (AV hook, WinRM stall, ACL). Not rescued by combiner tolerance; a broken probe is not a rescue-eligible condition. |
All actions return Result<ExecStep, ExecError> to the pack-type driver (v1 shape, 2026-04-20; see plugin-api.md); the driver aggregates failures and triggers rollback per pack-type policy.
plugin-api
Stable trait contracts for v1 extension points. Post-v1 these are semver-protected: breaking changes require a major bump of grex itself.
Three traits
- ActionPlugin: implements one action name (e.g. symlink, env).
- PackTypePlugin: implements one pack-type (meta, declarative, scripted).
- Fetcher: implements one URL scheme (git in v1).
All three are Send + Sync + 'static async trait objects via async_trait.
Uniform &str across plugin traits (2026-04-20) — enables String-backed plugins in v2 (WASM/dylib); builtins return literals which coerce to 'static-lifetime &str for zero alloc.
ActionPlugin
use async_trait::async_trait;
use serde_json::Value;

#[async_trait]
pub trait ActionPlugin: Send + Sync {
    /// Stable action name, matches the YAML key.
    fn name(&self) -> &str;

    /// Execute the action. Args are the raw YAML sub-tree under the action key.
    async fn execute(
        &self,
        ctx: &ExecCtx<'_>,
        args: &Value,
    ) -> Result<ExecStep, ExecError>;
}
M4-B shipped shape (2026-04-20): the snippet above is the v2-facing target (WASM/dylib plugins consume raw &Value). The in-process v1 trait landed sync and takes the typed &Action instead of &Value:
pub trait ActionPlugin: Send + Sync {
    fn name(&self) -> &str;
    fn execute(&self, action: &Action, ctx: &ExecCtx<'_>) -> Result<ExecStep, ExecError>;
}
Rationale: the wet-run executor, planner, and scheduler are all synchronous today; the parse step has already validated shape + invariants so taking the typed &Action is zero-cost at the boundary. The async + &Value form is reserved for external plugin loading (M5+ / v2) where the trait crosses a dylib/WASM ABI boundary. Both shapes return ExecStep — that is stable across v1 and v2.
Return type (v1): ExecStep carries the per-action result envelope — action_name, result (ok/skipped/failed with diagnostics), kind, and related fields. ActionOutcome is superseded by ExecStep in v1 — richer shape carries diagnostics. Original ActionOutcome { changed, message } design retired 2026-04-20.
Rollback is not on the trait surface (decision 2026-04-20, matches openspec/feat-grex/spec.md §1). Rollback semantics remain where the M3 executor kept them (per-action inverse logic in the executor), not in an ExecStep field. A dedicated rollback protocol is deferred to M5+ when pack-type drivers may require it.
ExecCtx (v1 realization of PackCtx)
PackCtx as originally drafted is v1-realized as ExecCtx<'a> in code. Fields present: vars (implements EnvResolver), pack_root, workspace, platform (typed as Os enum). Fields deferred to M5: pack_id, dry_run, explicit logger: &dyn ActionLogger wiring. The ActionLogger and EnvResolver traits are defined in grex-core::{log, env} and available for plugins to use directly; ExecCtx field wiring deferred.
pub struct ExecCtx<'a> {
    pub vars: &'a VarEnv,               // implements EnvResolver
    pub pack_root: &'a std::path::Path,
    pub workspace: &'a std::path::Path,
    pub platform: Os,                   // Windows | Linux | Macos
    // deferred to M5: pack_id, dry_run, logger: &dyn ActionLogger
}
PackTypePlugin
Updated 2026-04-20: M5-1 trait signature aligned with shipped M4 code patterns. The trait mirrors M4 ActionPlugin exactly (same ExecCtx<'_> context, same Result<ExecStep, ExecError> return envelope), so pack-type and action plugins share one result pipeline. The earlier anyhow::Result<()> + bare Pack draft is retired.
pub trait PackTypePlugin: Send + Sync {
    fn name(&self) -> &str;

    async fn install(
        &self,
        ctx: &ExecCtx<'_>,
        pack: &PackManifest,
    ) -> Result<ExecStep, ExecError>;

    async fn update(
        &self,
        ctx: &ExecCtx<'_>,
        pack: &PackManifest,
    ) -> Result<ExecStep, ExecError>;

    async fn teardown(
        &self,
        ctx: &ExecCtx<'_>,
        pack: &PackManifest,
    ) -> Result<ExecStep, ExecError>;

    async fn sync(
        &self,
        ctx: &ExecCtx<'_>,
        pack: &PackManifest,
    ) -> Result<ExecStep, ExecError>;
}
Ground-truth references (M4 shipped, 2026-04-20):
- M4 ActionPlugin trait: crates/grex-core/src/plugin/mod.rs:49-62 (the pattern PackTypePlugin reuses).
- ExecCtx<'a>: crates/grex-core/src/execute/ctx.rs:96-146 (reused verbatim).
- PackManifest: crates/grex-core/src/pack/mod.rs:171-197 (canonical name, not Pack).
- ExecStep / ExecError: crates/grex-core/src/plugin/mod.rs (same envelope as the ActionPlugin return).
Async form: uses 2024-edition native async-in-trait; fall back to #[async_trait] only if a toolchain blocker surfaces at M5-1 implementation time.
PackManifest
Parsed .grex/pack.yaml. Ground-truth struct from crates/grex-core/src/pack/mod.rs:171-197:
pub struct PackManifest {
    pub schema_version: SchemaVersion,  // literal "1"
    pub name: String,
    pub r#type: PackType,               // enum: Meta | Declarative | Scripted | plugin-name
    pub version: Option<String>,
    pub depends_on: Vec<String>,
    pub children: Vec<ChildRef>,
    pub actions: Vec<Action>,
    pub teardown: Option<Vec<Action>>,  // already parsed; R-M5-09 just reads it
    pub extensions: BTreeMap<String, serde_yaml::Value>,
}
Dispatch at M5 executor boundary: registry.get(pack.r#type.as_str()). The r#type: PackType enum stays in the parsed form; the string view is only consumed at registry lookup.
Lifecycle semantics (required contract)
| Method | Required behavior |
|---|---|
| install | Idempotent. Running twice must be equivalent to running once. |
| update | Run only when lockfile sha or actions_hash changed (grex core decides; plugin just does the work when called). |
| teardown | Must attempt to reverse install. May be partial. |
| sync | May recurse into children. May no-op for leaf types. |
Fetcher
#[async_trait]
pub trait Fetcher: Send + Sync {
    /// URL scheme this fetcher handles: "git".
    fn scheme(&self) -> &str;

    async fn clone(
        &self,
        url: &str,
        ref_spec: Option<&str>,
        dst: &std::path::Path,
    ) -> anyhow::Result<FetchReport>;

    async fn pull(
        &self,
        dst: &std::path::Path,
    ) -> anyhow::Result<FetchReport>;

    async fn current_sha(
        &self,
        dst: &std::path::Path,
    ) -> anyhow::Result<String>;
}

pub struct FetchReport {
    pub sha: Option<String>,
    pub branch: Option<String>,
    pub bytes: Option<u64>,
}
v1 ships one implementation (fetchers::git, either gix or git2). v2 may ship rclone, s3, oci, http behind the same trait.
Registry struct
Canonical v1 registry holding the action plugins. Packtypes + fetchers retain their existing maps on Registry; the signature below covers the action surface added in M4:
pub struct Registry {
    actions: HashMap<String, Box<dyn ActionPlugin>>,
    // packtypes, fetchers: see existing fields
}

impl Registry {
    pub fn new() -> Self;
    pub fn register<P: ActionPlugin + 'static>(&mut self, plugin: P);
    pub fn get(&self, name: &str) -> Option<&dyn ActionPlugin>;
    pub fn bootstrap() -> Self; // calls register_builtins internally
}
bootstrap() is the canonical entrypoint: it constructs an empty Registry and hands it to register_builtins for the 7 Tier 1 actions. Executor dispatch goes through Registry::get(name) (an unknown name yields UnknownAction) — the dispatch swap from direct Action enum match to Registry::get lands in M4-B (moved 2026-04-20 from M4-A; see milestone.md Stage order note and openspec/feat-grex/spec.md §4). In M4-A the Registry is shipped as a parallel surface and covered by plugin-layer unit tests while FsExecutor / PlanExecutor keep the existing enum-match dispatch.
register_builtins free function
pub fn register_builtins(reg: &mut Registry);
Populates reg with all 7 Tier 1 plugins (symlink, env, mkdir, rmdir, require, when, exec). This is the canonical registration path in v1 — inventory::submit! auto-registration is optional (see feature flag below).
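To make the bootstrap path concrete, here is a reduced, self-contained sketch of the registry surface described above. It is not the grex-core code: the trait is cut down to name() only (ExecCtx, Action and ExecStep are omitted), and just two stand-in built-ins are registered instead of the 7 Tier 1 actions.

```rust
use std::collections::HashMap;

/// Cut-down stand-in for the plugin trait: only the name is modelled here.
pub trait ActionPlugin: Send + Sync {
    fn name(&self) -> &str;
}

pub struct Mkdir;
impl ActionPlugin for Mkdir {
    fn name(&self) -> &str {
        "mkdir"
    }
}

pub struct Symlink;
impl ActionPlugin for Symlink {
    fn name(&self) -> &str {
        "symlink"
    }
}

pub struct Registry {
    actions: HashMap<String, Box<dyn ActionPlugin>>,
}

impl Registry {
    pub fn new() -> Self {
        Self { actions: HashMap::new() }
    }

    /// Key each plugin by the name it reports, i.e. the YAML action key.
    pub fn register<P: ActionPlugin + 'static>(&mut self, plugin: P) {
        self.actions.insert(plugin.name().to_string(), Box::new(plugin));
    }

    /// `None` stands in for the unknown-action error path.
    pub fn get(&self, name: &str) -> Option<&dyn ActionPlugin> {
        self.actions.get(name).map(|b| b.as_ref())
    }

    /// Canonical entrypoint: empty registry, then explicit registration.
    pub fn bootstrap() -> Self {
        let mut reg = Self::new();
        register_builtins(&mut reg);
        reg
    }
}

/// Explicit registration, no linker-based collection involved.
pub fn register_builtins(reg: &mut Registry) {
    reg.register(Symlink);
    reg.register(Mkdir);
}
```

The explicit function keeps the set of built-ins visible in one place, which is the trade-off the text makes against inventory-based auto-registration.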
Builtins crate location (2026-04-20): v1 builtins live in grex-core::plugin (co-located for simplicity). grex-plugins-builtin is reserved for v2 third-party-facing extensions. Physical move deferred to M5+.
Idempotency
ExecResult::Skipped { pack_path: PathBuf, actions_hash: String } is emitted when the lockfile-stored actions_hash for a pack equals the recomputed hash at sync time. Hash scope is canonical JSON of the pack's actions: list plus the resolved commit sha (consistent with the "lockfile actions_hash field name kept" open-question note; variant reserved in PR #14). On a Skipped emission the executor performs no work for that pack and writes no new per-action events for it.
Hash algorithm (2026-04-20): actions_hash = sha256(b"grex-actions-v1\0" || canonical_json(actions) || b"\0" || commit_sha), lowercase hex. Computed per-pack; stored in LockEntry.actions_hash; compared at sync start; match emits ExecResult::Skipped and short-circuits the pack. Implemented in grex-core::lockfile::hash::compute_actions_hash.
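The byte layout fed to sha256 can be sketched without a hashing dependency. The helper names below are hypothetical (the real entrypoint named above is compute_actions_hash), and the digest step itself is left to whichever SHA-256 implementation the caller uses; only the preimage assembly and the lowercase-hex rendering are shown.

```rust
/// Domain-separation tag, including its trailing NUL byte.
pub const DOMAIN_TAG: &[u8] = b"grex-actions-v1\0";

/// Build the bytes that get hashed: tag || canonical_json || 0x00 || commit_sha.
/// Hypothetical helper; `canonical_json` is assumed to be canonicalized already.
pub fn actions_hash_preimage(canonical_json: &str, commit_sha: &str) -> Vec<u8> {
    let mut buf = Vec::with_capacity(
        DOMAIN_TAG.len() + canonical_json.len() + 1 + commit_sha.len(),
    );
    buf.extend_from_slice(DOMAIN_TAG);
    buf.extend_from_slice(canonical_json.as_bytes());
    buf.push(0); // separator between the JSON and the sha
    buf.extend_from_slice(commit_sha.as_bytes());
    buf
}

/// Render digest bytes as lowercase hex, the stored form of actions_hash.
pub fn to_lower_hex(digest: &[u8]) -> String {
    digest.iter().map(|b| format!("{b:02x}")).collect()
}
```

The NUL separators keep the two variable-length fields from running into each other, so a change in where the JSON ends and the sha begins always changes the preimage.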
Feature flag plugin-inventory
Default: off in v1. When on, built-in action modules use inventory::submit! to auto-register and Registry::bootstrap() walks inventory::iter::<BuiltinAction>(). When off, register_builtins is the only path. Keeping inventory optional means grex-core carries no hard dependency on it; linker-based collection is a deployment concern per-consumer.
Registration (v1 in-process)
Canonical path (decision 2026-04-20): explicit register_builtins(reg: &mut Registry). Registry::bootstrap() constructs an empty Registry and hands it to register_builtins, which registers all 7 Tier 1 actions + 3 pack-types + the git fetcher. No inventory dependency is pulled into grex-core on the default path.
fn register_builtins(reg: &mut Registry) {
    reg.register_action(Box::new(actions::Symlink));
    reg.register_action(Box::new(actions::Env));
    // ... remaining 5 Tier 1 actions
    reg.register_pack_type(Box::new(packtypes::Meta));
    reg.register_pack_type(Box::new(packtypes::Declarative));
    reg.register_pack_type(Box::new(packtypes::Scripted));
    reg.register_fetcher(Box::new(fetchers::Git));
}
Alternative: inventory::submit! (feature-gated, M4-E)
Opt-in compile-time auto-registration via the inventory crate, gated behind the plugin-inventory cargo feature (default off; see "Feature flag plugin-inventory" above). Lands in Stage M4-E as a discovery hook; not on the critical path for v1 and not required by any other stage.
pub struct BuiltinAction(pub fn() -> Box<dyn ActionPlugin>);
inventory::collect!(BuiltinAction);

pub struct BuiltinPackType(pub fn() -> Box<dyn PackTypePlugin>);
inventory::collect!(BuiltinPackType);

pub struct BuiltinFetcher(pub fn() -> Box<dyn Fetcher>);
inventory::collect!(BuiltinFetcher);
Each built-in module would then call inventory::submit! at file scope:
// src/actions/symlink.rs
pub struct Symlink;

#[async_trait::async_trait]
impl ActionPlugin for Symlink { /* ... */ }

inventory::submit! {
    crate::plugin::BuiltinAction(|| Box::new(Symlink))
}
When the feature is on, Registry::bootstrap() walks inventory::iter::<BuiltinAction>() (and the pack-type / fetcher collectors) instead of calling register_builtins directly. When the feature is off (default), register_builtins is the only path.
Adding a new built-in plugin in v1
The flow for a v1 contributor wanting to add, say, a pkg-install action:
- Create src/actions/pkg_install.rs implementing ActionPlugin.
- pub mod pkg_install; in src/actions/mod.rs.
- Add an explicit register call in register_builtins (or an inventory::submit! block when the plugin-inventory feature is on).
- Integration test under tests/actions_pkg_install.rs.
- Docs entry in actions.md.
No changes to trait crate; no ABI concerns. Core grex recompile required, but plugin author writes no glue code beyond the trait impl.
v2 external plugin loading
Deferred. Two candidate routes:
Option A: dylib via libloading + abi_stable
- Host loads libgrex_plugin_foo.{so,dylib,dll}.
- Plugin crate uses abi_stable for FFI-safe trait objects.
- Pros: native speed, same language.
- Cons: ABI versioning is strict; every trait tweak risks SIGSEGV on version skew.
Option B: WASM via wasmtime / extism
- Host loads foo.wasm.
- Plugin compiled to wasm32-wasi.
- Pros: sandboxed, cross-platform binary, forward-compatible ABI.
- Cons: syscall surface must be bridged; filesystem access needs capability grants.
Decision in v2 alpha. ABI contract versioning strategy:
- grex-plugin-api crate (extracted in v2) carries its own semver.
- Plugin manifest declares grex_plugin_api = "1.x".
- Host refuses load on major mismatch, warns on minor mismatch.
- Candidate extension: ABI hash baked into plugin binary, checked at load.
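The load policy in the list above (refuse on major mismatch, warn on minor mismatch) is small enough to sketch directly. Since external loading is deferred to v2, everything here is hypothetical: the LoadDecision type, the function name, and the representation of a version as a (major, minor) pair.

```rust
#[derive(Debug, PartialEq)]
pub enum LoadDecision {
    Load,
    LoadWithWarning,
    Refuse,
}

/// Compare the host's plugin-API version with the one a plugin declares.
/// Versions are (major, minor) pairs; patch levels are ignored in this sketch.
pub fn check_plugin_api(host: (u32, u32), plugin: (u32, u32)) -> LoadDecision {
    if host.0 != plugin.0 {
        LoadDecision::Refuse
    } else if host.1 != plugin.1 {
        LoadDecision::LoadWithWarning
    } else {
        LoadDecision::Load
    }
}
```

An ABI hash check, listed above as a candidate extension, would sit in front of this comparison and refuse outright on any mismatch.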
Stability guarantees (v1)
Post-v1.0.0 the following are frozen until a v2.0.0:
- ActionPlugin method signatures.
- PackTypePlugin method signatures.
- Fetcher method signatures.
- ExecCtx field names & types (fields may be added; none removed or retyped).
- ExecStep, FetchReport struct layouts (additive).
- PackManifest struct (additive).
- Registration mechanism.
Breaking changes require a grex major bump; v2 re-extracts the plugin traits into a separately-versioned crate so host and plugin can move independently.
mcp — embedded MCP server
grex serve launches an embedded stdio server speaking MCP 2025-06-18 natively. Every CLI verb except serve is exposed as an MCP tool invoked via tools/call. No custom JSON-RPC dialect, no grex.* methods, no batching.
Goal
Agent-native control surface. MCP tool handlers call the same library entrypoints the CLI dispatcher calls — no subprocess wrapper. Single-process observability, shared tokio runtime, manifest cache persists across requests, scheduler + pack-lock primitives shared verbatim.
Transport
- Wire: stdio, newline-delimited JSON per MCP 2025-06-18 (one JSON-RPC message per line, LF-terminated). rmcp transport-io default framer.
- Encoding: UTF-8.
- Protocol version: 2025-06-18, returned from initialize, asserted by clients and mcp-protocol-validator.
- Batching: NOT supported. MCP 2025-06-18 rejects JSON-RPC batch arrays. Server MUST return -32600 Invalid Request if [req, req, …] arrives.
- Stdout discipline: stdout is reserved exclusively for the JSON-RPC wire. Tracing, logs, and diagnostics go to stderr only. Any accidental stdout write is a server bug.
Protocol lifecycle
Only MCP-standard methods are accepted.
| Method / notification | Direction | Purpose |
|---|---|---|
| initialize (req) | client → server | Capability negotiation, protocol-version agreement. |
| notifications/initialized | client → server | Client ready to send requests. |
| tools/list (req) | client → server | Return the 11 tools with JSON-Schema. |
| tools/call (req) | client → server | Invoke a tool by name. |
| notifications/cancelled | client → server | Cancel an in-flight tools/call by requestId. |
| notifications/progress | server → client | Optional per-operation progress (deferred). |
| shutdown (req) | client → server | Drain in-flight tasks then exit. |
Handshake
→ {"jsonrpc":"2.0","id":1,"method":"initialize",
"params":{"protocolVersion":"2025-06-18",
"clientInfo":{"name":"claude-code","version":"x"},
"capabilities":{}}}
← {"jsonrpc":"2.0","id":1,
"result":{"protocolVersion":"2025-06-18",
"serverInfo":{"name":"grex","version":"<workspace-version>"},
"capabilities":{"tools":{"listChanged":false}}}}
→ {"jsonrpc":"2.0","method":"notifications/initialized"}
tools/call example
→ {"jsonrpc":"2.0","id":42,"method":"tools/call",
"params":{"name":"sync","arguments":{"recursive":true,"parallel":8}}}
← {"jsonrpc":"2.0","id":42,
"result":{"content":[{"type":"text","text":"<json-result>"}],"isError":false}}
Tool catalog (11 tools)
Frozen CLI verb set: init, add, rm, ls, status, sync, update, doctor, serve, import, run, exec (12 verbs).
Exposed as MCP tools: 11. serve is the server itself → not a tool. teardown is a plugin lifecycle hook of rm, not a user-invokable verb → not a tool. The constant VERBS_11_EXPOSED_AS_TOOLS is defined in grex-mcp and drives every len() assertion.
| Tool name | Description (for tools/list) | readOnlyHint | destructiveHint |
|---|---|---|---|
| init | Initialise a grex workspace. | false | false |
| add | Register and clone a pack. | false | false |
| rm | Unregister a pack (runs teardown unless --skip-teardown). | false | true |
| ls | List registered packs. | true | false |
| status | Report drift + installed state. | true | false |
| sync | Sync all packs recursively. | false | false |
| update | Update one or more packs (re-resolve refs, reinstall). | false | false |
| doctor | Check manifest + gitignore + on-disk drift. | true | false |
| import | Import packs from a REPOS.json meta-repo index. | false | false |
| run | Run a declared action across matching packs. | false | true |
| exec | Execute a command across matching packs. | false | true |
Param and result shapes mirror the --json output of each CLI verb field-for-field. Every *Params struct derives JsonSchema; rmcp auto-publishes schemas in tools/list.
exec --shell is removed from the MCP surface. Arbitrary shell interpolation is a dangerous capability for an agent. The flag remains on the CLI but is absent from the exec tool's param schema. Reintroduction requires an explicit per-session capability opt-in (deferred).
Cancellation
MCP-standard notifications/cancelled with requestId. No custom grex.cancel method.
→ {"jsonrpc":"2.0","method":"notifications/cancelled",
"params":{"requestId":42,"reason":"user aborted"}}
Server signals the matching request's tokio_util::sync::CancellationToken. Every tool handler propagates the token through:
- Scheduler::acquire_cancellable(&CancellationToken): tokio::select! between semaphore.acquire_owned() and cancel.cancelled().
- PackLock::acquire_cancellable(path, &CancellationToken): same pattern; breaks the backoff loop on cancel.
- Inner action / pack-type dispatch loop: checks cancel.is_cancelled() between steps.
Cancelled request returns -32800 request cancelled (MCP-standard reserved code).
Progress
notifications/progress is optional and deferred. v1 tool calls return only a final CallToolResult. Progress wiring from sync / update / run / exec handlers (tracing span → progress bridge) lands in a later milestone.
Error codes
Standard JSON-RPC 2.0 codes + MCP-standard -32800 + grex-reserved -32001..-32005 for pack-op failures.
| Code | Source | Meaning |
|---|---|---|
| -32600 | JSON-RPC | Invalid Request (malformed envelope; batch array) |
| -32601 | JSON-RPC | Method / tool not found |
| -32602 | JSON-RPC | Invalid params (deserialization failure; disallowed flag) |
| -32603 | JSON-RPC | Internal error (catch-all) |
| -32800 | MCP | Request cancelled |
| -32001 | grex | Manifest integrity failure |
| -32002 | grex | Pack op failed or initialization-state error (see note) |
| -32003 | grex | Lock contention |
| -32004 | grex | Drift detected |
| -32005 | grex | Unknown action / pack-type (plugin missing) |
Dual use of -32002: same code surfaces (a) a user-level pack-op failure returned inside a completed tools/call, and (b) an initialization-state protocol error ("not initialized" / "already initialized") returned from the envelope. Disambiguation is by data.kind: "pack_op" vs "init_state". Splitting into two codes is a future item.
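The grex-reserved codes and the data.kind disambiguation can be sketched as a plain enum mapping. The enum GrexRpcError and its variant names are hypothetical and do not reflect the grex-mcp types; only the numeric codes and the two kind strings come from the table and note above.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum GrexRpcError {
    ManifestIntegrity,
    PackOpFailed,
    InitState,
    LockContention,
    DriftDetected,
    UnknownPlugin,
    Cancelled,
}

impl GrexRpcError {
    /// JSON-RPC error code, as listed in the table above.
    pub fn code(self) -> i32 {
        match self {
            GrexRpcError::ManifestIntegrity => -32001,
            // Two meanings share one code; see `data_kind` for the split.
            GrexRpcError::PackOpFailed | GrexRpcError::InitState => -32002,
            GrexRpcError::LockContention => -32003,
            GrexRpcError::DriftDetected => -32004,
            GrexRpcError::UnknownPlugin => -32005,
            GrexRpcError::Cancelled => -32800,
        }
    }

    /// Value of `data.kind`, present only for the dual-use -32002 code.
    pub fn data_kind(self) -> Option<&'static str> {
        match self {
            GrexRpcError::PackOpFailed => Some("pack_op"),
            GrexRpcError::InitState => Some("init_state"),
            _ => None,
        }
    }
}
```

A client that branches on the code alone cannot tell the two -32002 cases apart, so it would need to read data.kind before deciding whether to retry the handshake or report a pack failure.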
Agent-safety annotations
Every tool in tools/list declares both annotations.readOnlyHint and annotations.destructiveHint. See the catalog table above.
- Read-only tools (ls, status, doctor) are safe for unattended agent use.
- Destructive tools (rm, run, exec) carry destructiveHint: true so policy layers (claude-code, IDE clients) can prompt the user or gate them behind approval.
- The annotations are advisory hints, not enforcement; enforcement is the client's responsibility.
Session model
One grex serve process = one MCP client session. Concurrent multi-client sessions over a single server are a future milestone. Rationale:
- stdio transport is inherently single-peer.
- Manifest cache, scheduler permit pool, and pack-lock table are scoped to the process — a second client would need explicit session partitioning.
- Agent-harness pattern (Claude Code, Cursor, etc.) spawns one server per workspace anyway.
Concurrency integration
MCP tool handlers share one Arc<Scheduler> for the server lifetime — concurrent tools/call invocations respect --parallel exactly like local CLI invocations. Manifest cache is reused across requests. ExecCtx is built fresh per call, borrowing the shared scheduler + registry handles.
5-tier lock ordering invariant (M6). Tool handlers MUST acquire concurrency primitives in the fixed order documented in .omne/cfg/concurrency.md:
- workspace-sync lock
- scheduler semaphore permit
- pack-lock (per pack)
- backend (git) lock
- manifest lock
No handler may invert this order. Enforced at runtime by acquisition helpers and statically by M6's Lean4 proof (feat-m6-3).
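The runtime side of the invariant can be pictured as a small tracker that refuses out-of-order requests. The sketch below is an assumption about how such a helper could look, not the actual grex acquisition helper: OrderGuard is a hypothetical name, and it simplifies by allowing each tier at most once per handler.

```rust
/// The five tiers, declared in acquisition order so the derived `Ord`
/// matches the documented ordering.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Tier {
    WorkspaceSync,
    SchedulerPermit,
    PackLock,
    BackendGit,
    ManifestLock,
}

/// Hypothetical per-handler tracker: remembers the highest tier acquired
/// so far and rejects any request at or below it.
#[derive(Default)]
struct OrderGuard {
    highest: Option<Tier>,
}

impl OrderGuard {
    fn acquire(&mut self, tier: Tier) -> Result<(), String> {
        match self.highest {
            Some(held) if tier <= held => Err(format!(
                "lock-order inversion: requested {tier:?} while holding {held:?}"
            )),
            _ => {
                self.highest = Some(tier);
                Ok(())
            }
        }
    }
}
```

Skipping a tier is allowed in this model; only going backwards is rejected.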
Launch
grex serve — no --mcp flag; the command is the MCP server. Flags:
- --manifest <path>: override manifest path (captured at launch; clients cannot override mid-session).
- Inherits global --parallel N from the grex CLI root.
Security posture:
- stdio only. No network listener.
- Filesystem ops confined to the workspace root.
- Session inherits process file permissions; no privilege escalation.
Implementation stack
- Server framework: rmcp = "1.5" (official Rust MCP SDK). Provides transport framing, initialize negotiation, tools/list schema publication, and notifications/cancelled plumbing out of the box.
- Schema generation: schemars. Every tool's *Params struct derives JsonSchema.
- Cancellation: tokio_util::sync::CancellationToken threaded through Scheduler and PackLock.
- Crate layout: crates/grex-mcp/ (server + tool handlers) + crates/grex/src/cli/verbs/serve.rs (thin launch shim).
Testing:
- crates/grex-mcp/src/**: inline #[cfg(test)] unit tests (routing, schema gen, error mapping).
- crates/grex-mcp/tests/**: integration tests via tokio::io::duplex.
- .github/workflows/ci.yml: mcp-validator job runs mcp-protocol-validator against a release build of grex serve.
Out-of-scope / future
- Multi-client sessions over a single server process.
- notifications/progress emission from long-running tool handlers.
- exec --shell re-exposure via per-session capability opt-in.
- Splitting -32002 into distinct pack-op vs init-state codes.
- Remote transports (HTTP/SSE); stdio is the only v1 transport.
Pack template
grex ships a reference pack at examples/pack-template/ in the main repo.
At v1.0.0 release time, the in-tree copy is mirrored to a standalone
repo (git@github.com:egoisth777/grex-pack-template.git) so users can
install it via grex add <URL>; until then, use the in-tree form below.
Trying the template
From a checkout of the main grex repo:
grex init
# Local (in-tree) form — works today, no external repo required:
grex add --from-path examples/pack-template
grex sync
grex doctor
Once grex v1.0.0+ is published, you'll also be able to install via the standalone mirror:
# Available at v1.0.0+ release; until then use the --from-path form above.
grex add git@github.com:egoisth777/grex-pack-template.git
Expected behaviour: the pack creates $HOME/.grex-pack-template/ and a
symlink inside it pointing at the pack's files/hello.txt. Re-running
grex sync is a no-op — every action is idempotent.
To undo: grex teardown grex-pack-template (or grex rm grex-pack-template
to also remove it from the workspace manifest). The directory is backed up
under <path>.grex-bak.<ts> before removal.
Walkthrough of the manifest
The template is type: declarative — the simplest of grex's three pack
types. Its pack.yaml is structured as:
- require: gate the pack. If git is unavailable and the OS is not Windows, the install aborts before any filesystem action runs.
- mkdir + symlink: a single pair of actions, portable across linux / macos / windows via $HOME. grex-core's var-expansion synthesises $HOME from %USERPROFILE% on Windows (see crates/grex-core/src/vars/mod.rs), so no per-OS when: fan-out is required.
- teardown: a single rmdir that reverses the install. Without an explicit teardown list, grex would default to reverse-order rollback of actions, which works but is less readable.
Every action is chosen for idempotency on repeat syncs: require is
read-only, mkdir no-ops when the path exists, symlink no-ops when dst
already points at src.
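The "no-op when already satisfied" pattern can be sketched for mkdir as below. This is an illustration of the idea using std::fs only, not grex's action implementation; the Outcome enum and ensure_dir name are hypothetical.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Lets a caller tell "changed something" from "already in place".
#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    Changed,
    NoOp,
}

/// Idempotent mkdir: create the directory (and parents) only when missing.
fn ensure_dir(path: &Path) -> io::Result<Outcome> {
    if path.is_dir() {
        return Ok(Outcome::NoOp);
    }
    fs::create_dir_all(path)?;
    Ok(Outcome::Changed)
}
```

Calling ensure_dir twice on the same path reports Changed and then NoOp, which is the behaviour a repeat sync relies on.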
Structure of the in-tree copy
examples/pack-template/
├── .grex/
│ └── pack.yaml # manifest (schema_version "1", type declarative)
├── files/
│ └── hello.txt # payload referenced by the symlink action
├── README.md # user-facing docs (Install / Structure / Customisation / Testing / Licence)
└── .gitignore # M6 managed-block: .grex/.state/
The template is type: declarative, so it has no .grex/hooks/ directory.
Hooks fire only for type: scripted packs.
Customising the template for your own pack
- Fork the tree into a new git repo.
- Rename name: in pack.yaml (regex ^[a-z][a-z0-9-]*$).
- Replace the actions with your own; see the actions reference for the 7 built-in primitives.
- If you need arbitrary shell steps that don't fit the declarative primitives, switch the manifest to type: scripted and add a .grex/hooks/ directory with setup.{sh,ps1} / sync.{sh,ps1} / teardown.{sh,ps1} scripts. Hooks receive GREX_PACK_NAME, GREX_PACK_PATH, GREX_PACK_OS, and GREX_DRY_RUN as env vars.
- Update the teardown: list to reverse your actions.
- Publish and install with grex add <your-url>.
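To check a candidate name against the regex ^[a-z][a-z0-9-]*$ before publishing, a std-only validator is enough. The function name is hypothetical; this is a sketch of the rule, not the validator grex ships.

```rust
/// Check a pack name against `^[a-z][a-z0-9-]*$` without a regex crate.
fn is_valid_pack_name(name: &str) -> bool {
    let mut chars = name.chars();
    // First character must be a lowercase ASCII letter.
    match chars.next() {
        Some(first) if first.is_ascii_lowercase() => {}
        _ => return false,
    }
    // Remaining characters: lowercase letters, digits, or hyphen.
    chars.all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '-')
}
```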
CI validation
The in-tree copy is the canonical source and is exercised in CI by
crates/grex/tests/pack_template_smoke.rs. The smoke test:
- Parses examples/pack-template/.grex/pack.yaml via grex_core::pack::parse and asserts the top-level shape (name / type / schema_version / first action is a require gate).
- Asserts the payload files the README promises (.grex/pack.yaml, files/hello.txt, README.md, .gitignore) are present on disk.
- Copies the template into a tempdir and runs grex_core::sync::run against it end-to-end, then re-runs sync to verify the second pass is an all-no-op.
If any check fails in CI, the template is broken — fix the in-tree copy before the next release, since the external mirror is regenerated from it (see the appendix below).
Relationship to other M8 stages
- M8-1 (cargo-dist): the installer scripts referenced in the template's README live on the main grex releases page, not on the template repo.
- M8-2 (crates.io): the template has no crates.io presence — it is a git-installable reference pack, not a Rust crate.
- M8-3 (mdBook): this chapter is the authoritative doc for the template's ownership contract.
- M8-5 (CHANGELOG): every release that changes the template must note it in the main grex CHANGELOG entry, plus re-mirror per the appendix.
Appendix: publishing the external mirror (release-time procedure)
Run these steps once per major or minor grex release (v1.0.0, v1.1.0, v2.0.0, ...):
1. Create the empty GitHub repo. On github.com: new repo egoisth777/grex-pack-template, public, MIT OR Apache-2.0 licence, empty (no README / .gitignore / licence auto-init; we push our own).
2. Mirror the in-tree tree into a fresh git history. From the grex repo root (replace v1.0.0 with the actual release tag):
   cp -r examples/pack-template /tmp/grex-pack-template
   cd /tmp/grex-pack-template
   git init -b main
   git add -A
   git commit -m "feat: initial template from grex v1.0.0"
   git remote add origin git@github.com:egoisth777/grex-pack-template.git
   git push -u origin main
   On Windows, substitute $env:TEMP for /tmp and use PowerShell-native Copy-Item -Recurse.
3. Tag the external repo to match the grex release.
   git tag -a v1.0.0 -m "grex v1.0.0"
   git push origin v1.0.0
4. Verify end-to-end. From a fresh workspace:
   grex init
   grex add git@github.com:egoisth777/grex-pack-template.git
   grex sync
   grex doctor
   Expected: all four commands exit 0; grex doctor reports the pack as OK.
5. Record the first-commit SHA in the main grex repo's CHANGELOG.md under the release entry, for traceability.
Ownership & CODEOWNERS
- In-tree copy (examples/pack-template/) is governed by the main grex CODEOWNERS: same reviewers as the rest of the workspace.
- External repo (grex-pack-template) has its own CODEOWNERS file, independent of main grex. Day-to-day PRs on the external repo (typo fixes, user-reported issues) land directly; breaking changes to the template shape MUST land in the in-tree copy first, ship with the next grex release, and then be force-pushed over the external repo as a new commit history per step 2 above.
- Never hand-edit the external repo and the in-tree copy independently. The in-tree copy is canonical; the external repo is regenerated.
migration — from REPOS.json + .scripts/ to grex
Users on the legacy Python .scripts/ meta-repo migrate by running grex import --from-repos-json ./REPOS.json. Both systems can coexist during transition.
Legacy source system
repo/
├── .scripts/
│ ├── init.py add.py rm.py sync.py track.py test.py
│ ├── lib/
│ └── hooks/
├── REPOS.json # [{url, path}, ...]
└── .gitignore # hand-curated, sub-repo dirs appended
REPOS.json shape:
[
{"url": "https://github.com/grex-org/grex-tui.git", "path": "grex-tui"},
{"url": "https://github.com/grex-org/grex-core.git", "path": "grex-core"}
]
Legacy native shell scripts (.ps1/.sh) have no role in grex: Rust std::fs and the built-in actions replace them.
Import command
grex import --from-repos-json ./REPOS.json
Behavior:
- Read + parse REPOS.json. Validate url and path (bare name) on every entry.
- For each entry not already in grex.jsonl (by path), emit an add event with type: meta (or --default-type <...>).
- For each entry already present with matching URL, skip.
- For each entry with same path but different url, abort unless --force (then emit update).
- Optionally --migrate-gitignore: rewrite .gitignore to use the managed-block format, preserving pre-existing lines outside the managed region.
grex import --from-repos-json ./REPOS.json --migrate-gitignore
Idempotent: re-running is a no-op once imported.
Disk-scan variant
grex import --scan
Walks workspace root one level deep, detects directories with .git/ not yet in grex.jsonl. For each, reads git config --get remote.origin.url, emits an add event. Skips entries without a remote.
Combinable with --from-repos-json: both sources processed, deduplicated by path.
Pack type for imported entries
Legacy REPOS.json carries no pack type info. Default assumption:
- If the imported dir contains a .grex/pack.yaml, use its declared type.
- Else use --default-type (flag), which defaults to meta (safe: meta packs have no actions, so no surprise side effects on first install).
User can later convert to declarative or scripted by adding a .grex/pack.yaml in the imported pack's own repo.
From v1.1.1+, plain-git children (no .grex/pack.yaml) walk via synthetic-manifest fallback — grex import --from-repos-json followed by grex sync works end-to-end on the bootstrap pattern (REPOS.json + flat-sibling git repos). See pack-spec.md §"Plain-git children" for the synthesis rule.
Coexistence during transition
Both systems can run against the same workspace if:
- .scripts/ remains in place unmodified.
- grex.jsonl is added alongside REPOS.json.
- .gitignore is in managed-block format and lists every path from BOTH sources.
grex doctor in coexistence mode:
- Warns (non-fatal) if REPOS.json has entries missing from grex.jsonl.
- Warns if .scripts/ is still present while grex.jsonl exists.
- Suggests running grex import --from-repos-json or retiring .scripts/.
Disambiguation rules
Same path in both sources:
| REPOS.json | grex.jsonl | Action |
|---|---|---|
| present | absent | add event emitted |
| present (url A) | present (url A) | no-op |
| present (url A) | present (url B) | error, abort without --force |
| absent | present | no-op |
| present | tombstoned | skip, log info |
--force resolves URL conflicts by emitting update from the REPOS.json value.
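The table and the --force rule reduce to a single decision function per path. The sketch below models that decision with hypothetical types (InManifest, ImportAction); it is not the grex import code, and the case where a path is absent from both sources is an assumption (treated as a no-op).

```rust
/// How a `path` currently appears in grex.jsonl (hypothetical model).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum InManifest<'a> {
    Absent,
    Present { url: &'a str },
    Tombstoned,
}

#[derive(Debug, PartialEq, Eq)]
enum ImportAction {
    EmitAdd,
    EmitUpdate,
    NoOp,
    SkipTombstoned,
    AbortUrlConflict,
}

/// `repos_json_url` is `None` when REPOS.json has no entry for the path.
fn decide(repos_json_url: Option<&str>, manifest: InManifest<'_>, force: bool) -> ImportAction {
    match (repos_json_url, manifest) {
        // Nothing to import for this path.
        (None, _) => ImportAction::NoOp,
        (Some(_), InManifest::Absent) => ImportAction::EmitAdd,
        (Some(_), InManifest::Tombstoned) => ImportAction::SkipTombstoned,
        (Some(a), InManifest::Present { url: b }) if a == b => ImportAction::NoOp,
        // URL conflict: --force takes the REPOS.json value.
        (Some(_), InManifest::Present { .. }) if force => ImportAction::EmitUpdate,
        (Some(_), InManifest::Present { .. }) => ImportAction::AbortUrlConflict,
    }
}
```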
Path rule transition
Legacy REPOS.json required bare path (no separators). v1 grex preserves this. Nested paths (e.g. packs/foo) are deferred to v1.x; will require path-normalization + collision detection.
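A bare-path check for v1 can be sketched as below. The function name is hypothetical, and rejecting ".", "..", and any colon (to exclude drive letters) are assumptions layered on top of the "no separators" rule stated above.

```rust
/// True when `path` is a single bare component: non-empty, not `.` or `..`,
/// and free of `/`, `\`, and `:`.
fn is_bare_path(path: &str) -> bool {
    !path.is_empty()
        && path != "."
        && path != ".."
        && !path.contains(|c: char| c == '/' || c == '\\' || c == ':')
}
```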
Retirement of .scripts/
Post-migration:
- Verify grex ls matches expected pack list.
- Run grex sync --parallel 8.
- Delete .scripts/ via git rm -r .scripts/.
- Delete REPOS.json.
- Run git config core.hooksPath .grex/hooks (grex installs these on init).
- Commit.
grex doctor after retirement should exit 0 on clean workspace.
Rollback
Nothing in grex import mutates .scripts/ or REPOS.json. Rollback = delete grex.jsonl + grex.lock.jsonl + revert .gitignore (if --migrate-gitignore used). No data loss path.
engineering
Cargo workspace setup, feature flags, CI matrix, release pipeline, versioning policy.
Cargo workspace
Single crate grex (lib + bin). No sub-crates in v1. grex-plugin-api splits out in v2 for ABI-stable plugin authoring.
Cargo.toml (root):
[workspace]
members = ["grex"]
resolver = "2"
[workspace.package]
edition = "2021"            # edition 2024 needs rust-version >= 1.85; the pin here is 1.82
rust-version = "1.82"
license = "Apache-2.0 OR MIT"
repository = "https://github.com/grex-org/grex"
grex/Cargo.toml:
[package]
name = "grex" # fallback: "grex-cli" if crates.io taken
version = "0.1.0"
description = "Cross-platform dev-environment orchestrator. Pack-based, agent-native, Rust-fast."
readme = "README.md"
keywords = ["dev-env", "pack", "meta-repo", "mcp", "cli"]
categories = ["command-line-utilities", "development-tools"]
[[bin]]
name = "grex"
path = "src/main.rs"
[features]
default = ["git-backend-gix"]
git-backend-gix = ["dep:gix"]
git-backend-git2 = ["dep:git2"]
simd-json = ["dep:simd-json"]
tui = ["dep:ratatui", "dep:crossterm"] # v2
sqlite = ["dep:rusqlite"] # v2
lean4 = [] # marker; CI-only proof job
[dependencies]
tokio = { version = "1", features = ["full"] }
clap = { version = "4", features = ["derive"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
serde_yaml = "0.9"
anyhow = "1"
thiserror = "2"
tracing = "0.1"
tracing-subscriber = "0.3"
comfy-table = "7"
owo-colors = "4"
fd-lock = "4"
async-trait = "0.1"
num_cpus = "1"
inventory = "0.3"
gix = { version = "0.66", optional = true }
git2 = { version = "0.19", optional = true }
simd-json = { version = "0.14", optional = true }
ratatui = { version = "0.28", optional = true }
crossterm = { version = "0.28", optional = true }
rusqlite = { version = "0.32", optional = true, features = ["bundled"] }
[dev-dependencies]
proptest = "1"
tempfile = "3"
assert_cmd = "2"
predicates = "3"
criterion = "0.5"
Versions pinned at scaffold (M1); refreshed at release (M8).
Build
| Command | Purpose |
|---|---|
| cargo build | dev |
| cargo build --release | optimized |
| cargo build --all-features | exercise optional features |
| LTO: [profile.release] lto = "thin", codegen-units = 1 | release speed |
Test
| Command | Scope |
|---|---|
| cargo test | unit + integration, default features |
| cargo test --all-features --workspace | full matrix |
| cargo test -p grex --test crash_recovery | single integration file |
| cargo bench | criterion (M2 onward) |
Lint
cargo fmt --all -- --check
cargo clippy --all-targets --all-features -- -D warnings
typos
cargo deny check
Details in linter.md.
CI matrix (.github/workflows/ci.yml)
name: ci
on: [push, pull_request]
jobs:
test:
strategy:
fail-fast: false
matrix:
os: [ubuntu-latest, macos-latest, windows-latest]
toolchain: [stable, beta]
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@master
with:
toolchain: ${{ matrix.toolchain }}
components: rustfmt, clippy
- uses: Swatinem/rust-cache@v2
- run: cargo fmt --all -- --check
- run: cargo clippy --all-targets --all-features -- -D warnings
- run: cargo test --all-features --workspace
lean:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: leanprover/lean-action@v1
- run: cd proof && lake build
deny:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: EmbarkStudios/cargo-deny-action@v1
Release pipeline
Tool: cargo-dist for cross-compiled release binaries.
Targets:
- x86_64-unknown-linux-gnu
- aarch64-unknown-linux-gnu
- x86_64-apple-darwin
- aarch64-apple-darwin
- x86_64-pc-windows-msvc
- aarch64-pc-windows-msvc
Flow:
- Bump version in Cargo.toml.
- Update CHANGELOG.md.
- git tag vX.Y.Z + push tag.
- release.yml (cargo-dist-generated) builds artifacts + creates GitHub Release.
- cargo publish -p grex to crates.io.
- Verify cargo install grex clean install on all three OSes (smoke).
Versioning policy
- Crate semver MAJOR.MINOR.PATCH. Post-v1:
  - PATCH: bug fix, no API change.
  - MINOR: additive (new verb, flag, action, pack-type, MCP method).
  - MAJOR: any removal, rename, or semantic change of the 8 stable APIs.
- Manifest schema (grex.jsonl schema_version field): versioned independently. Breaking bump → reader rejects with actionable error pointing to grex upgrade-schema.
- Lockfile schema: versioned independently (separate cadence from intent log).
- pack.yaml schema_version: independent. v1 packs must remain readable by any v1.x.
- MCP method catalog: tied to CLI verb surface; additions emit notifications/methods_changed.
Toolchain pin
rust-toolchain.toml:
[toolchain]
channel = "1.82"
components = ["rustfmt", "clippy"]
External tooling required
- git CLI (fallback).
- lake + lean (CI-only, for proof job).
- cargo-dist (release-pipeline only).
- typos + cargo-deny (CI).
Security
- cargo deny check enforces license + advisory gates.
- #![forbid(unsafe_code)] at crate root. Note that forbid cannot be relaxed by a nested #[allow(unsafe_code)]; the narrow per-module exceptions (fd-lock integration, Windows symlink APIs) would need the root lint at #![deny(unsafe_code)] instead, or must live in a separate crate.
- Supply-chain: consider cargo vet in v1.x once stable.
- No shell invocation outside the actions::exec module.
Observability
- tracing throughout.
- tracing-subscriber wired at binary entry; CLI -v / -vv / -vvv controls filter.
- Structured fields: pack_path, action, op, duration_ms, result.
- v1.x may add on-disk JSON log sink for grex doctor retrospection.
License
Decision locked at M7. Current preference: dual MIT OR Apache-2.0 (Rust-community convention). Alternative single-license choice acceptable if legal reviewer prefers.
test-plan
Pyramid from unit through CI cross-platform + Lean4 proof compilation + pack-protocol contract tests.
Pyramid
┌──────────────────────────────┐
│ Cross-plat CI matrix │ few, slow
├──────────────────────────────┤
│ Pack-protocol contract │ fixture packs end-to-end
├──────────────────────────────┤
│ MCP roundtrip │ JSON-RPC scripted
├──────────────────────────────┤
│ Crash injection │ SIGKILL / TerminateProcess
├──────────────────────────────┤
│ Integration │ real git, temp dirs
├──────────────────────────────┤
│ Property (proptest) │ manifest CRUD algebra
├──────────────────────────────┤
│ Unit │ fast, exhaustive, in-process
└──────────────────────────────┘
Unit tests
In-module #[cfg(test)]. Fast, no IO except via tempfile.
Coverage targets:
- manifest::event: every event variant, schema bump rejection, malformed line behavior.
- manifest::fold: ordering, tombstone precedence, update idempotence.
- manifest::lock: last-write-wins per id.
- pack::schema: full pack.yaml schema validation, rejects + accepts.
- gitignore: managed-block insert, update, preserve-user-lines, idempotent-sync.
- cli::output: JSON / plain / pretty modes against golden strings.
- concurrency::scheduler: semaphore acquisition order with mocked PackLock.
- actions::*: each of 7 primitives has targeted unit tests (args parsing, dry-run, idempotency check).
- packtypes::*: each lifecycle method dispatches correctly.
- fetchers::git: URL parsing, ref-spec resolution.
Integration tests (tests/)
Each spins a temp dir via tempfile::TempDir, invokes compiled binary via assert_cmd or library entrypoints directly.
| File | Scenario |
|---|---|
| integration_add.rs | grex add against local bare-repo fixture → event appended, dir cloned, .gitignore updated, pack.yaml auto-detected |
| integration_rm.rs | add → rm → manifest tombstoned, dir gone, teardown ran |
| sync_recursive.rs | meta-pack with nested children syncs 3 levels deep |
| sync_parallel.rs | 8 local fixture packs, grex sync --parallel 4, all succeed, wall time sub-linear |
| gitignore_preserves_user_lines.rs | pre-populated .gitignore with user content outside managed block → round-trip preserves byte-for-byte |
| crash_recovery.rs | spawn child, SIGKILL (Win: TerminateProcess) mid-append, grex ls recovers via torn-line detection |
| mcp_stdio.rs | spawn grex serve (no --mcp flag), scripted JSON-RPC session, assert responses |
| import_legacy.rs | seed REPOS.json + .gitignore, run grex import --from-repos-json, verify manifest + gitignore coexistence |
| doctor_drift.rs | corrupt manifest / delete workdir, grex doctor --fix restores invariants |
| pack_types_end_to_end.rs | one fixture of each of 3 pack-types: install + sync + teardown full round-trip on all OSes |
| bench_manifest.rs | 10k events fold < 1s, 100k events < 10s (criterion; non-blocking) |
Git fixtures: bare .git local repos under tests/fixtures/, served via file:// URLs. No network in CI tests.
Property tests (proptest)
tests/property_manifest.rs:
- Generate arbitrary sequences of add / rm / update / sync events.
- Invariants under fold:
  - Tombstoned id never in state map.
  - Compaction idempotent: compact(compact(m)) == compact(m).
  - Fold-equivalence: fold(m) == fold(compact(m)).
  - Update last-writer-wins per id.
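A toy model makes the fold and compaction invariants concrete. Everything below is a simplified assumption rather than the real manifest module: the Event shape is hypothetical, sync events are omitted, state is just id → url, and a later add re-creates a removed id, which may differ from grex's tombstone semantics.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq, Eq)]
enum Event {
    Add { id: String, url: String },
    Update { id: String, url: String },
    Rm { id: String },
}

type State = BTreeMap<String, String>;

/// Replay the log in order; later events win.
fn fold(events: &[Event]) -> State {
    let mut state = State::new();
    for ev in events {
        match ev {
            Event::Add { id, url } => {
                state.insert(id.clone(), url.clone());
            }
            Event::Update { id, url } => {
                // An update only touches a live id; it never resurrects one.
                if let Some(slot) = state.get_mut(id) {
                    *slot = url.clone();
                }
            }
            Event::Rm { id } => {
                state.remove(id);
            }
        }
    }
    state
}

/// Rewrite the log as one `Add` per live id, in sorted id order.
fn compact(events: &[Event]) -> Vec<Event> {
    fold(events)
        .into_iter()
        .map(|(id, url)| Event::Add { id, url })
        .collect()
}
```

A property test would generate random logs and assert the equalities listed above against these two functions.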
tests/property_gitignore.rs:
- Random pre-existing .gitignore + random sequences of add/rm.
- Invariants:
  - User lines outside managed block unchanged byte-for-byte.
  - Two consecutive syncs produce identical output.
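The two gitignore invariants can be illustrated with a minimal managed-block rewriter. The marker strings and function name below are hypothetical (the real grex markers may differ), and the sketch always places the block at the end of the file and appends a newline to user content that lacks one, both of which are simplifications.

```rust
// Hypothetical marker strings; the real grex markers may differ.
const BEGIN: &str = "# >>> grex managed >>>";
const END: &str = "# <<< grex managed <<<";

/// Rebuild the managed block from `paths`, copying user lines through untouched.
fn sync_managed_block(existing: &str, paths: &[&str]) -> String {
    let mut out = String::new();
    let mut in_block = false;
    // split_inclusive keeps each line's original terminator.
    for line in existing.split_inclusive('\n') {
        let bare = line.trim_end_matches(|c: char| c == '\n' || c == '\r');
        if !in_block && bare == BEGIN {
            in_block = true;
        } else if in_block && bare == END {
            in_block = false;
        } else if !in_block {
            out.push_str(line);
        }
    }
    if !out.is_empty() && !out.ends_with('\n') {
        out.push('\n');
    }
    // Sorted, de-duplicated entries make the output deterministic.
    let mut sorted = paths.to_vec();
    sorted.sort_unstable();
    sorted.dedup();
    out.push_str(BEGIN);
    out.push('\n');
    for path in sorted {
        out.push_str(path);
        out.push_str("/\n");
    }
    out.push_str(END);
    out.push('\n');
    out
}
```

Feeding the output back in with the same paths returns the same string, which is the "two consecutive syncs" invariant.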
tests/property_actions.rs:
- Each action primitive: running twice in sequence is equivalent to running once (idempotency).
- rollback(execute(x)) == starting state (for actions that support rollback).
Crash injection
tests/crash_recovery.rs:
- Spawn helper binary (crash-helper, built alongside the test) that appends to grex.jsonl then panics mid-write (partial bytes, no newline, exits).
- Parent opens the manifest, runs fold, expects success + one truncated-tail warning in tracing output.
Windows variant uses TerminateProcess via raw handle (#[cfg(windows)]).
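Torn-line detection, as exercised by this test, can be sketched as "everything after the last newline is a torn tail". The helper below is an assumption about the approach, not the grex reader; the function name is hypothetical and it works on raw bytes so that a write cut mid-character is still handled.

```rust
/// Split a JSONL buffer into complete rows plus an optional torn tail
/// (trailing bytes with no newline terminator).
fn split_torn_tail(buf: &[u8]) -> (Vec<&[u8]>, Option<&[u8]>) {
    let (complete, tail) = match buf.iter().rposition(|&b| b == b'\n') {
        Some(last_nl) => (&buf[..=last_nl], &buf[last_nl + 1..]),
        None => (&buf[..0], buf),
    };
    let rows: Vec<&[u8]> = complete
        .split(|&b| b == b'\n')
        .filter(|row| !row.is_empty())
        .collect();
    let torn = if tail.is_empty() { None } else { Some(tail) };
    (rows, torn)
}
```

A reader built on this would fold the complete rows and log one warning when the tail is present.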
MCP roundtrip
tests/mcp_stdio.rs:
- assert_cmd spawns grex serve --manifest <tempdir>/grex.jsonl (serve takes no --mcp flag).
- Pipe JSON-RPC frames to stdin, read stdout.
- Sequence: initialize → grex.add → grex.ls → grex.sync → grex.status → grex.rm → grex.ls.
- Assert each response matches expected JSON shape (via serde_json::Value equality + predicates).
- Assert clean shutdown on stdin close.
Cross-plat CI matrix
All integration + property + crash + MCP tests run on:
- ubuntu-latest
- macos-latest
- windows-latest
Fixtures avoid platform-specific paths — all tests use tempfile + PathBuf.
Lean4 proof verification
.github/workflows/lean.yml:
- uses: leanprover/lean-action@v1
- run: cd proof && lake build
Job succeeds only if proof/Grex/Scheduler.lean compiles to .olean with zero sorry. Any unresolved axiom outside the single pack_lock_exclusive model-bridge axiom (resolved to theorem by M5-exit) fails CI.
Lean type-checking is the guarantee; CI does not attempt to verify proof content beyond compilation.
Pack-protocol contract tests
tests/pack_types_end_to_end.rs + fixture pack repos under tests/fixtures/packs/:
- meta-basic/: meta pack with 2 nested declarative children.
- declarative-basic/: declarative pack exercising all 7 action types.
- scripted-basic/: scripted pack with setup.sh + setup.ps1 + teardown.{sh,ps1}.
Contract assertions:
- Install + sync + teardown round-trip leaves the workspace in the pre-install state.
- Install followed by install (no changes) = idempotent.
- Teardown followed by teardown = idempotent (second is no-op).
- Lockfile entry matches expected sha + actions_hash after install.
Fixtures double as living documentation — they're the canonical "what does a v1 pack look like" examples.
Coverage
cargo-llvm-cov weekly on main. Target: 80% line coverage on manifest, pack, plugin, actions, packtypes, gitignore, concurrency. CLI + MCP exercised via integration tests, not measured for line coverage.
Smoke test (pre-release, manual)
Before tagging a release:
- cargo install --path grex
- cd <tempdir> && grex init && grex add git@github.com:grex-org/grex-inst.git && grex ls --long && grex sync && grex doctor.
- grex serve → send initialize manually, verify response.
- grex doctor → exit 0.
- Repeat on macOS and Windows.
linter
Rules enforced on every PR. CI fails on any violation.
Standard Rust tooling
| Tool | Command | Gate |
|---|---|---|
| rustfmt | cargo fmt --all -- --check | fail on any diff |
| clippy | cargo clippy --all-targets --all-features -- -D warnings | fail on any warning |
| typos | typos | fail on misspellings |
| cargo-deny | cargo deny check | license + advisory + source gates |
Clippy configuration
clippy.toml:
avoid-breaking-exported-api = false
msrv = "1.82"
Lint levels in src/lib.rs:
#![forbid(unsafe_code)]
#![deny(
    clippy::unwrap_used,
    clippy::expect_used,
    clippy::panic,
    clippy::dbg_macro,
    clippy::print_stdout,
    clippy::print_stderr,
    clippy::todo,
    clippy::unimplemented,
)]
#![warn(
    clippy::pedantic,
    clippy::nursery,
    missing_docs,
)]
Tests and benches relax via #![allow(clippy::unwrap_used, clippy::expect_used)] at crate root for test binaries.
Custom rules
Output centralization
- No println! / eprintln! / print! / eprint! outside src/cli/output.rs.
- Enforced by clippy::print_stdout + clippy::print_stderr = deny.
- All output goes through the formatter which honors --json / --plain / TTY detection.
Error handling discipline
- Library modules (src/manifest, src/pack, src/plugin, src/actions, src/packtypes, src/fetchers, src/concurrency): use thiserror typed errors. anyhow banned here.
- Binary modules (src/cli, src/main.rs, src/mcp): may use anyhow.
- No unwrap() / expect() in production paths. Startup-only paths may expect() with a human-meaningful message if the invariant is unrecoverable (e.g. inventory registry empty = developer bug).
No direct shell-spawning outside actions/exec
- tokio::process::Command and std::process::Command allowed ONLY in src/actions/exec.rs, src/packtypes/scripted.rs, and src/fetchers/git.rs (for CLI fallback).
- Any other file invoking Command fails lint.
- Enforced by CI grep rule:
if grep -rn 'process::Command' src/ --include='*.rs' \
| grep -vE '^src/(actions/exec|packtypes/scripted|fetchers/git)\.rs'; then
echo "shell invocation outside allowed modules"; exit 1
fi
Path rules (ported from legacy .scripts/test.py)
- No hardcoded absolute paths in source, config, or embedded strings.
  - Banned: C:\, D:\, E:\, /home/, /Users/, /mnt/, /opt/.
  - CI grep:
    if grep -rn -E '([A-Z]:\\|/home/|/Users/|/mnt/|/opt/)' src/ --include='*.rs'; then
      echo "hardcoded path detected"; exit 1
    fi
- No ~ in source strings. Home expansion lives in a PackCtx::env helper using dirs::home_dir().
- No string concatenation with path separators. Use std::path::PathBuf + push() / join(). Clippy's path_buf_push_overwrite helps.
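The join-based style the last rule asks for looks like this in practice. The helper name is hypothetical; the .grex/.state layout is borrowed from the pack template's .gitignore entry purely as an example path.

```rust
use std::path::{Path, PathBuf};

/// Build `<root>/<pack>/.grex/.state` with `join`, never with string
/// concatenation, so the platform's separator is used automatically.
fn pack_state_dir(workspace_root: &Path, pack: &str) -> PathBuf {
    workspace_root.join(pack).join(".grex").join(".state")
}
```

Because each component is joined separately, no literal "/" or "\\" appears in the source.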
Manifest rules (runtime + lint)
- pack.yaml children[].path MUST be bare name. Enforced at parse by pack::schema::validate() and at doctor-time by grex doctor.
- grex.jsonl event path field likewise bare. No drive letters anywhere in manifest.
Plugin trait discipline
- Every module under src/actions/ MUST contain exactly one impl ActionPlugin.
- Every module under src/packtypes/ MUST contain exactly one impl PackTypePlugin.
- Every module under src/fetchers/ MUST contain exactly one impl Fetcher.
- Enforced by code review + presence of inventory::submit! block.
Shim rules — N/A
Legacy .scripts/ had Python-specific shim rules (no shutil.rmtree, no subprocess.run(shell=True), etc.). Rust has no direct analogue:
- std::fs::remove_dir_all is cross-platform, so no native-script indirection is needed.
- Shell invocation is already gated by the "no shell-spawning outside allowed modules" rule above.
- Symlinks use std::os::{unix,windows}::fs directly.
CI job
.github/workflows/lint.yml (or a job in ci.yml):
lint:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
with: { components: rustfmt, clippy }
- run: cargo fmt --all -- --check
- run: cargo clippy --all-targets --all-features -- -D warnings
- uses: crate-ci/typos@master
- uses: EmbarkStudios/cargo-deny-action@v1
- name: hardcoded paths
run: |
! grep -rn -E '([A-Z]:\\|/home/|/Users/|/mnt/|/opt/)' src/ --include='*.rs'
- name: shell invocation scope
run: |
! grep -rn 'process::Command' src/ --include='*.rs' \
| grep -vE '^src/(actions/exec|packtypes/scripted|fetchers/git)\.rs'
Pre-commit hook
.grex/hooks/pre-commit:
#!/usr/bin/env bash
set -e
cargo fmt --all -- --check
cargo clippy --all-targets --all-features -- -D warnings
Activated by grex init via git config core.hooksPath .grex/hooks.
Man pages
grex ships a full set of Unix man pages — one root page plus one per CLI
verb. They are a passive projection of the clap::Command tree defined
in crates/grex/src/cli/args.rs;
never edit the .1 files by hand.
What ships
14 files under man/ at the repo root:
| Page | Covers |
|---|---|
| grex.1 | Top-level binary + global flags (--json, --plain, --dry-run, --filter) |
| grex-init.1 | grex init |
| grex-add.1 | grex add <url> [path] |
| grex-rm.1 | grex rm <path> |
| grex-ls.1 | grex ls |
| grex-status.1 | grex status |
| grex-sync.1 | grex sync (parallel + --only + --ref) |
| grex-update.1 | grex update [pack] |
| grex-doctor.1 | grex doctor --fix --lint-config |
| grex-serve.1 | grex serve (MCP stdio) |
| grex-import.1 | grex import --from-repos-json |
| grex-run.1 | grex run <action> |
| grex-exec.1 | grex exec <cmd> … |
| grex-teardown.1 | grex teardown |
Generating
The generator lives in crates/xtask/ and is invoked via the
cargo xtask alias configured in .cargo/config.toml:
cargo xtask gen-man # write to <workspace>/man/
cargo xtask gen-man --out-dir /tmp/m # write elsewhere
Internally the binary calls clap_mangen::Man::new(cmd).render(&mut buf) once
for the root Command and once per subcommand. The subcommand name is
prefixed with grex- so the .TH header reads grex-sync(1) instead of
sync(1).
CI drift check
CI runs a man-drift job on every PR (see
.github/workflows/ci.yml):
- cargo run -p xtask -- gen-man
- git diff --exit-code -- man/ (fails if the generated output differs from the committed files).
If you touch crates/grex/src/cli/args.rs (add a verb, rename a flag,
edit a /// help doc comment) you must re-run cargo xtask gen-man
and commit the regenerated .1 files or CI will reject the PR.
Release artifact inclusion
man/ is listed in [workspace.metadata.dist].include in the root
Cargo.toml,
so every cargo-dist-built release tarball ships the full man-page set
alongside README.md, CHANGELOG.md, and the licenses.
Installing
See the README "Man pages" section
for the one-line install -Dm644 incantation. The shell / PowerShell
installer one-liners do not install pages into the system man path —
manual copy is required for now.
roadmap
Content scope by release. Timeline is ordering + dependencies, not dates.
v1 — Pack-based orchestrator, stable core
Ships all 7 philosophy principles (see goals.md).
Core always compiled
- Manifest (JSONL intent log).
- Lockfile (JSONL resolved state).
- Scheduler (tokio + bounded semaphore + per-pack .grex-lock + fd-lock).
- Sync engine (git clone/pull, recursion).
- Gitignore automation (managed-block markers).
- MCP stdio JSON-RPC server.
- Pack discovery (.grex/pack.yaml).
- Action plugin registry + 7 built-in actions.
- Pack-type plugin registry + 3 built-in pack-types.
- Atomic file writes (temp + rename).
- Lean4 proof Grex.Scheduler.no_double_lock.
Frozen public APIs
- .grex/pack.yaml schema (v1).
- grex.jsonl event schema.
- grex.lock.jsonl schema.
- ActionPlugin trait.
- PackTypePlugin trait.
- Fetcher trait.
- CLI verb surface (12 verbs).
- MCP method surface (1:1 with CLI).
Explicitly NOT in v1
- External plugin loading.
- TUI.
- Non-git fetchers.
- Additional pack-types / actions beyond the built-ins.
- Pack registry.
- Self-update.
Exit criteria: all success criteria in the feature spec PASS in CI matrix; crates.io publish successful; reference pack repo installs cleanly.
v2 — Extensibility & aesthetics
Opens third-party extension; adds TUI + non-git fetchers.
External plugin loading
Two candidate routes evaluated in v2 alpha:
- Dylib (libloading + abi_stable): native speed, strict ABI versioning.
- WASM (wasmtime / extism): sandboxed, forward-compatible, syscall bridging required.
Decision in v2 alpha; both may ship (host selects by file extension).
Retro-futurist TUI
ratatui-based dashboard, feature-flagged --features tui. Live pack tree, per-pack sync stream, lock inspector, CRT glyph aesthetic. Falls back to plain ANSI when --plain or non-TTY.
Additional pack-types (via plugin)
- software-list: iterates package installs (winget/brew/apt).
- env-bundle: manages a coherent group of env vars + PATH entries.
- dotfiles: dotfile-manager style (iterate + symlink).
Additional actions (via plugin)
pkg-install, url-download, archive-extract, file-append, patch, json-merge, template, path-add, shell-rc-inject.
Additional Lean4 proofs
- I2: manifest append serialization under fd-lock.
- I3: .gitignore managed-block idempotence.
- I4: compaction fold-equivalence.
- Commutativity of disjoint-path events.
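As an informal illustration of what I3 (managed-block idempotence) asserts, the sketch below rewrites a managed block in a .gitignore-style file. It is plain Rust rather than Lean, the marker strings are hypothetical, and it makes no claim about how grex implements the block.

```rust
/// Marker lines are hypothetical; grex's real markers may differ.
const BEGIN: &str = "# >>> grex managed >>>";
const END: &str = "# <<< grex managed <<<";

/// Replace (or append) the managed block in `existing` with `entries`,
/// leaving every line outside the markers untouched.
pub fn upsert_managed_block(existing: &str, entries: &[&str]) -> String {
    let mut out: Vec<String> = Vec::new();
    let mut inside = false;
    for line in existing.lines() {
        if line == BEGIN {
            inside = true;
        } else if line == END {
            inside = false;
        } else if !inside {
            // User-owned line: keep it verbatim.
            out.push(line.to_string());
        }
    }
    // Re-emit the block at the end in one canonical form.
    out.push(BEGIN.to_string());
    out.extend(entries.iter().map(|e| e.to_string()));
    out.push(END.to_string());
    let mut text = out.join("\n");
    text.push('\n');
    text
}
```

Idempotence here means that applying the function to its own output with the same entries returns the same text, so repeated syncs do not grow or reorder the file.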
SQLite optional backend
Feature flag sqlite. Same Manifest API. For users with >100k events.
Self-update
grex upgrade pulls latest release from GitHub.
Embedded scripting
Lua or Rhai in-process scripting — middle ground between declarative YAML and full shell escape. Candidate for a pack-type plugin in v2.
Non-git fetchers
rclone, s3, oci, http — all implement the Fetcher trait. grex add accepts --scheme <rclone|s3|...> or auto-detects from URL.
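One possible shape for the "explicit --scheme or auto-detect from URL" rule is sketched below. The enum, the function, and the detection rules are all assumptions for illustration. In particular, treating bare https:// URLs as git (so that existing grex add invocations keep their meaning) is a guess, not a documented decision.

```rust
/// Fetch schemes named in the roadmap; the enum itself is hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Scheme {
    Git,
    Rclone,
    S3,
    Oci,
    Http,
}

/// An explicit `--scheme` value wins; otherwise guess from the URL prefix.
/// Returns `None` for an unknown explicit scheme or an unrecognised URL.
pub fn detect_scheme(url: &str, explicit: Option<&str>) -> Option<Scheme> {
    if let Some(name) = explicit {
        return match name {
            "git" => Some(Scheme::Git),
            "rclone" => Some(Scheme::Rclone),
            "s3" => Some(Scheme::S3),
            "oci" => Some(Scheme::Oci),
            "http" => Some(Scheme::Http),
            _ => None,
        };
    }
    if url.starts_with("s3://") {
        Some(Scheme::S3)
    } else if url.starts_with("oci://") {
        Some(Scheme::Oci)
    } else if url.starts_with("rclone://") {
        Some(Scheme::Rclone)
    } else if url.starts_with("https://")
        || url.starts_with("http://")
        || url.starts_with("ssh://")
        || url.starts_with("git@")
    {
        // Assumption: plain web/ssh URLs stay git, matching v1 behaviour.
        Some(Scheme::Git)
    } else {
        None
    }
}
```

Under this sketch an HTTP archive would need `--scheme http` explicitly, which keeps auto-detection from changing the meaning of URLs that v1 already accepts.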
v3+ — Scale & federation
Exploratory. No commitments yet.
- Pack registry (grex.dev): hosted index of discoverable packs.
- Rules engine: .rules.yaml per pack, enforced on add/sync (modeled after metarepo's rules plugin).
- Org-level federation: multiple top-level workspaces referencing each other.
- Interactive HTTP dashboard: grex serve --http with web UI.
- Distributed locking: optional consul/etcd for multi-host deployments.
- p2p fetchers: IPFS, BitTorrent.
- Supply-chain signing: pack signatures; registry-enforced integrity.
Non-roadmap (never)
- Cross-VCS support (hg, svn, fossil, perforce).
- Monorepo conversion tooling.
- Git replacement.
- Generic CI runner.
- Full .gitmodules semantic replacement.
Dependency ordering (cross-release)
v1 (frozen APIs)
└─► v2 external plugin loading
└─► v2 additional pack-types + actions (as plugins)
└─► v2 non-git fetchers (via Fetcher trait impls)
└─► v2 TUI (independent)
└─► v2 SQLite backend (independent)
└─► v3 pack registry (needs plugin signing story)
m3-review-findings
Master finding list from the M3-close review series, plus mapping to the fix PRs that landed on main.
- Date: 2026-04-20
- Baseline: main at d160c7c (M3 Stage B close).
- Final state: main at 7ce186e (5 fix PRs merged).
- Test count: 316 → 344.
Methodology
Eight parallel reviews were run:
- Codex adversarial passes (4) — semver hygiene, data-integrity, concurrency, cross-platform. Prompted to find breakage, not to polish.
- Analytical subagent passes (4) — docs / rustdoc coverage, perf / allocations, recovery / crash-resume, security audit.
7 of 8 returned usable synthesis. The security audit stalled at synthesis twice (codex truncated mid-report on both retries). Security retry is filed under open carry-forwards rather than being treated as a clean pass — do not assume the review was completed.
Each review produced a file:line-cited report; the master list below is the synthesized severity grouping the reviewer saw at close.
Master finding list
Severity legend: CRITICAL (correctness / data loss) · HIGH (wrong result under realistic input) · MEDIUM (bad UX / minor correctness) · LOW (cosmetic / edge) · NIT (style).
CRITICAL
| # | Finding | Evidence |
|---|---|---|
| C1 | Concurrent grex sync against the same workspace could interleave manifest appends — no workspace-level lock existed | crates/grex-core/src/sync/mod.rs (pre-#16); concurrency review report |
| C2 | Manifest could record a successful Sync for an action that panicked mid-side-effect — readers had no way to detect partial apply | crates/grex-core/src/sync/emit.rs (pre-#15) |
| C3 | Symlink backup path: after rename(dst → .grex.bak) succeeded, a failed symlink() left the user with no original file and no new symlink | crates/grex-core/src/execute/fs/symlink.rs (pre-#18) |
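The failure mode in C3 and the rename-back remedy can be sketched generically. The code below is an illustration under assumptions: the creation step is passed in as a closure so the example stays platform-neutral, and the error type and the "<dst>.grex.bak" naming are hypothetical rather than grex's actual definitions.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical error type for this sketch.
#[derive(Debug)]
pub enum ReplaceError {
    /// Moving the original aside failed; nothing was changed.
    BackupFailed(io::Error),
    /// Creation failed, but the original was restored from the backup.
    CreateFailed(io::Error),
    /// Creation failed and so did the rollback; the original lives at `backup`.
    RollbackFailed { create: io::Error, rollback: io::Error, backup: PathBuf },
}

/// Assumed backup naming: "<dst>.grex.bak" next to the original.
fn backup_path(dst: &Path) -> PathBuf {
    let mut name = dst.as_os_str().to_os_string();
    name.push(".grex.bak");
    PathBuf::from(name)
}

/// Move `dst` aside, run `create`, and rename the backup back if `create` fails.
pub fn replace_with_backup<F>(dst: &Path, create: F) -> Result<(), ReplaceError>
where
    F: FnOnce(&Path) -> io::Result<()>,
{
    let backup = backup_path(dst);
    fs::rename(dst, &backup).map_err(ReplaceError::BackupFailed)?;
    match create(dst) {
        Ok(()) => Ok(()),
        Err(create_err) => match fs::rename(&backup, dst) {
            Ok(()) => Err(ReplaceError::CreateFailed(create_err)),
            Err(rollback_err) => Err(ReplaceError::RollbackFailed {
                create: create_err,
                rollback: rollback_err,
                backup,
            }),
        },
    }
}
```

The distinct RollbackFailed case matters: it is the only path on which the user's file is not where they left it, so it reports the backup location for manual recovery.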
HIGH
| # | Finding | Evidence |
|---|---|---|
| H1 | VarEnv was case-sensitive on Windows → $USERPROFILE vs $UserProfile resolved differently | crates/grex-core/src/vars/env.rs (pre-#17) |
| H2 | DupSymlinkValidator compared dst paths byte-for-byte → duplicates that differ only in case passed validation on case-insensitive FSes | crates/grex-core/src/pack/validate/dup_symlink.rs (pre-#17) |
| H3 | kind: auto silently defaulted to file when src was missing, creating a dangling file-symlink where directory was required | crates/grex-core/src/execute/fs/symlink.rs (pre-#17) |
| H4 | Concurrent sync on the same clone dest could race the bare fetch vs the checkout | crates/grex-core/src/git/backend/gix.rs (pre-#16) |
| H5 | No public enum or arg struct carried #[non_exhaustive] → adding a variant in M4 would be a SemVer major | crates/grex-core/src/** (pre-#14) |
| H6 | ExecNonZero carried the full stderr → event size could exceed fd-lock append atomicity ceiling | crates/grex-core/src/execute/fs/exec.rs (pre-#18) |
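For H1, a case-insensitive lookup can be layered over a case-preserving map by keeping a second index keyed on the ASCII-lowercased name. The sketch below is a hypothetical stand-in for VarEnv, written only to show the two-map idea; the field and method names are assumptions.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for grex's `VarEnv`.
#[derive(Default)]
pub struct VarEnv {
    /// Values keyed by the name exactly as first inserted.
    inner: HashMap<String, String>,
    /// ASCII-lowercased name -> original-case key in `inner`.
    lookup_index: HashMap<String, String>,
    /// True on platforms where env names ignore case (e.g. Windows).
    case_insensitive: bool,
}

impl VarEnv {
    pub fn new(case_insensitive: bool) -> Self {
        Self { case_insensitive, ..Default::default() }
    }

    pub fn insert(&mut self, name: &str, value: &str) {
        if self.case_insensitive {
            // Reuse the first-seen spelling so each folded name has one entry.
            let key = self
                .lookup_index
                .entry(name.to_ascii_lowercase())
                .or_insert_with(|| name.to_string())
                .clone();
            self.inner.insert(key, value.to_string());
        } else {
            self.inner.insert(name.to_string(), value.to_string());
        }
    }

    pub fn get(&self, name: &str) -> Option<&str> {
        if self.case_insensitive {
            let key = self.lookup_index.get(&name.to_ascii_lowercase())?;
            self.inner.get(key).map(String::as_str)
        } else {
            self.inner.get(name).map(String::as_str)
        }
    }
}
```

With the index enabled, $USERPROFILE and $UserProfile resolve to the same entry, while a case-sensitive environment keeps its exact-match behaviour.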
MEDIUM
| # | Finding | Evidence |
|---|---|---|
| M1 | Action name was &'static str → plugin-provided names (heap-allocated) could not register | crates/grex-core/src/pack/action.rs (pre-#14) |
| M2 | No pre-run scan for stale locks / orphaned .grex.bak files → surfaced only on next hit | recovery review report |
| M3 | Dirty-check ran before lock acquire → TOCTOU window between check and materialise_tree | crates/grex-core/src/sync/mod.rs (pre-#16) |
| M4 | HOME → USERPROFILE fallback also fired in insert → user-explicit HOME insert was silently retargeted | crates/grex-core/src/vars/env.rs (pre-#17) |
| M5 | No ExecResult::Skipped variant → M4 idempotency skip would force a non-additive enum change | crates/grex-core/src/execute/result.rs (pre-#14) |
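M1 comes down to a type choice: &'static str only admits names baked into the binary, while Cow<'static, str> accepts both static literals and heap-allocated strings. A minimal sketch follows, with a hypothetical registry type that stands in for whatever holds action names in grex.

```rust
use std::borrow::Cow;

/// Hypothetical registry; it only stores names to keep the example short.
#[derive(Default)]
pub struct ActionRegistry {
    names: Vec<Cow<'static, str>>,
}

impl ActionRegistry {
    /// Accepts `&'static str` (borrowed, no allocation) or `String` (owned).
    pub fn register(&mut self, name: impl Into<Cow<'static, str>>) {
        self.names.push(name.into());
    }

    pub fn contains(&self, name: &str) -> bool {
        // `&**n` derefs `&Cow<str>` down to `&str` for the comparison.
        self.names.iter().any(|n| &**n == name)
    }
}
```

Built-in actions keep passing string literals at zero cost, and a plugin can pass a name it computed at runtime without leaking memory to obtain a 'static reference.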
LOW
| # | Finding | Evidence |
|---|---|---|
| L1 | Unicode NFC/NFD path equality not handled (macOS) | cross-platform review |
| L2 | Windows MAX_PATH: no \\?\ prefix for long paths | cross-platform review |
| L3 | POSIX mode on Windows mkdir silently ignored — no warning | cross-platform review |
| L4 | README status line claims "M1" — stale vs actual M3-complete | docs review |
| L5 | CONTRIBUTING.md missing | docs review |
| L6 | PR template missing | docs review |
| L7 | ~39% rustdoc gap concentrated in grex CLI crate | docs review |
| L8 | Only 1 file has rustdoc code examples | docs review |
| L9 | Arc<PackManifest> would eliminate multiple per-action clones | perf review |
| L10 | Batched manifest appends under single lock acquire | perf review |
| L11 | Predicate cache on ExecCtx — repeated cmd_available probes | perf review |
| L12 | Cow<str> hot path in vars::expand | perf review |
| L13 | gix shallow-clone option exposed via SyncOptions | perf review |
NIT
| # | Finding |
|---|---|
| N1 | Inconsistent tracing span names across sync path |
| N2 | Several test names begin with test_ (clippy items_after_statements style) |
Mapping: finding → PR → resolution
Fix PRs on main:
- A = PR #14 — semver hygiene
- B = PR #15 — data integrity (event brackets + halt context)
- C = PR #16 — concurrency (workspace + repo fd-locks, TOCTOU closure)
- D = PR #17 — cross-platform (VarEnv, case-folding, kind:auto)
- E = PR #18 — recovery (backup rollback, recovery scan, stderr cap)
| # | Finding (short) | PR | Resolution |
|---|---|---|---|
| C1 | workspace-concurrent sync | C (#16) | resolved — <workspace>/.grex.sync.lock fail-fast |
| C2 | partial-apply undetectable | B (#15) | resolved — ActionStarted/Completed/Halted + SyncError::Halted(Box<HaltedContext>) |
| C3 | backup-then-create atomicity | E (#18) | resolved — rename-back on create failure; SymlinkCreateAfterBackupFailed if rollback fails |
| H1 | Win case-sensitive VarEnv | D (#17) | resolved — two-map (inner + ASCII-lowercase lookup_index) |
| H2 | DupSymlink case-sensitive | D (#17) | resolved — ASCII case-fold on Windows/macOS |
| H3 | kind: auto silent default | D (#17) | resolved — ExecError::SymlinkAutoKindUnresolvable |
| H4 | repo-concurrent race | C (#16) | resolved — <dest>.grex-backend.lock sibling file |
| H5 | missing #[non_exhaustive] | A (#14) | resolved — applied workspace-wide (list in PR description) |
| H6 | unbounded stderr in events | E (#18) | resolved — 2 KB truncation cap |
| M1 | plugin name heap-alloc | A (#14) | resolved — Cow<'static, str> |
| M2 | no startup recovery scan | E (#18) | resolved — informational scan (auto-cleanup deferred to grex doctor M4+) |
| M3 | dirty-check TOCTOU | C (#16) | resolved — revalidated after lock + immediately before materialise_tree |
| M4 | HOME→USERPROFILE in insert | D (#17) | resolved — fallback only in from_os / from_map |
| M5 | no Skipped variant | A (#14) | reserved — variant added, emission deferred to M4 lockfile idempotency |
| L1 | NFC/NFD equality | — | deferred (carry-forward) |
| L2 | MAX_PATH \\?\ | — | deferred (carry-forward) |
| L3 | POSIX mode on Win warn | — | deferred (carry-forward) |
| L4 | README stale | — | deferred (docs carry-forward) |
| L5 | CONTRIBUTING.md | — | deferred (docs carry-forward) |
| L6 | PR template | — | deferred (docs carry-forward) |
| L7 | rustdoc gap | — | deferred (docs carry-forward) |
| L8 | no rustdoc examples | — | deferred (docs carry-forward) |
| L9–L13 | perf items | — | deferred (perf carry-forward; not on M4 critical path) |
| N1–N2 | nits | — | punted (no ticket) |
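The H6 resolution (a 2 KB cap on stderr carried in events) has one subtlety worth showing: a byte cap must not split a multi-byte UTF-8 character. The sketch below keeps the head of the output and backs off to the nearest character boundary; whether grex keeps the head or the tail is not stated above, and the names here are hypothetical.

```rust
/// Cap from the resolution table (2 KB); the constant name is hypothetical.
pub const STDERR_CAP_BYTES: usize = 2 * 1024;

/// Return at most `cap` bytes from the start of `stderr`,
/// shortened further if needed so the cut lands on a char boundary.
pub fn truncate_stderr(stderr: &str, cap: usize) -> &str {
    if stderr.len() <= cap {
        return stderr;
    }
    let mut end = cap;
    // Index 0 is always a boundary, so this loop terminates.
    while !stderr.is_char_boundary(end) {
        end -= 1;
    }
    &stderr[..end]
}
```

Bounding the field keeps each event row small, which is the property the finding says the fd-lock append path depends on.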
Deferred findings (remain open)
Grouped for triage when M4 planning starts:
Security
- Security review retry — codex synthesis stalled twice. Re-run with a smaller scope or a different synthesizer before claiming a clean security pass.
Docs
- README status line (M1 → M3).
- Add CONTRIBUTING.md.
- Add PR template.
- Close the 39% rustdoc gap (primary offender: grex CLI crate).
- Add rustdoc code examples to at least the public grex-core surface.
Perf
- Arc<PackManifest> to eliminate clones across the sync pipeline.
- Batched manifest appends under a single fd-lock acquire.
- Predicate cache on ExecCtx (repeated cmd_available etc.).
- Cow<str> on the vars::expand hot path.
- Expose gix shallow-clone option via SyncOptions.
Platform edges (LOW)
- Unicode NFC/NFD path equality (macOS).
- Windows \\?\ long-path prefix for MAX_PATH.
- POSIX-only mode field on mkdir should warn on Windows.
Cross-refs
- progress.md: "Decisions locked during M3 review series" mirrors the decisions captured in the PR descriptions.
- .omne/cfg/concurrency.md: updated to document workspace + repo fd-lock contract.
- .omne/cfg/manifest.md: updated to document ActionStarted / ActionCompleted / ActionHalted event brackets.
- .omne/cfg/actions.md: updated to document symlink backup-rollback, kind: auto missing-src error, and exec stderr truncation.
Release process
How to cut a grex release. Covers the GitHub Release (binaries via
cargo-dist) and the crates.io publish steps. Rollback procedure at the end.
Audience: maintainers. Users should install per README.md §Install.
Prerequisites
- Push access to main and tag-push rights on egoisth777/grex.
- A crates.io API token with publish rights on grex-cli, grex-core, grex-mcp, grex-plugins-builtin (cargo login on your workstation).
- cargo-dist installed locally at the pinned version matching [workspace.metadata.dist].cargo-dist-version (currently 0.31.0); only required if you want to re-run dist plan before tagging.
- Clean git status; working tree must match the exact commit you are releasing. No un-committed changes.
1. Prepare the CHANGELOG
In CHANGELOG.md:
- Rename the [Unreleased - 1.0.0] heading to [1.0.0] - YYYY-MM-DD using today's UTC date.
- Open a new empty [Unreleased] section above it with empty Added / Changed / Fixed / Removed subsections.
- Ensure every Added bullet references the PR that introduced it.
- Commit: git commit -am "chore(release): prepare v1.0.0".
- Push to main via the normal PR flow. Do NOT tag yet.
2. Tag and push
Once the chore(release): prepare v1.0.0 commit is on main:
git switch main
git pull --ff-only
git tag -a v1.0.0 -m "grex v1.0.0"
git push origin v1.0.0
The tag push triggers .github/workflows/release.yml:
- plan: validates dist-manifest.json against the 5 targets.
- build-local-artifacts × 5: builds grex for each target, signs artefacts via GitHub's native attestations (actions/attest-build-provenance).
- build-global-artifacts: produces the installer.sh + installer.ps1 scripts and SHA-256 sums.
- host + announce: creates the GitHub Release and uploads all artefacts (.tar.xz / .zip / *.sha256 / installers / source.tar.gz).
The GitHub Release body is auto-extracted from the [1.0.0] section of
CHANGELOG.md (cargo-dist convention).
3. Publish to crates.io (manual)
cargo-dist does NOT publish to crates.io. Do this manually from a
checkout of the tagged commit, in strict topological order. Prefer
--wait-for-publish (cargo 1.66+) over a hand-timed sleep — it polls
the index and only exits once the crate is actually resolvable:
git switch --detach v1.0.0
cargo publish --wait-for-publish --timeout 300 -p grex-core
cargo publish --wait-for-publish --timeout 300 -p grex-plugins-builtin
cargo publish --wait-for-publish --timeout 300 -p grex-mcp
cargo publish --wait-for-publish --timeout 300 -p grex-cli
Order rationale: grex-plugins-builtin and grex-mcp both depend on
grex-core; grex-cli depends on all three. See
openspec/changes/feat-m8-release/crates-io-names.md
§2 for the dep graph.
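The "strict topological order" requirement can be checked mechanically. The sketch below runs a Kahn-style pass over the dependency edges stated in the order rationale; the function names are hypothetical, and it assumes every dependency also appears as a key in the map.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Dependency edges as described in the order rationale above.
pub fn grex_workspace() -> BTreeMap<&'static str, Vec<&'static str>> {
    BTreeMap::from([
        ("grex-core", vec![]),
        ("grex-plugins-builtin", vec!["grex-core"]),
        ("grex-mcp", vec!["grex-core"]),
        ("grex-cli", vec!["grex-core", "grex-plugins-builtin", "grex-mcp"]),
    ])
}

/// Return a publish order in which every crate follows its dependencies,
/// or `None` if the graph has a cycle (or an unknown dependency).
pub fn publish_order(deps: &BTreeMap<&str, Vec<&str>>) -> Option<Vec<String>> {
    let mut remaining: BTreeMap<&str, BTreeSet<&str>> = deps
        .iter()
        .map(|(name, ds)| (*name, ds.iter().copied().collect()))
        .collect();
    let mut order = Vec::new();
    while !remaining.is_empty() {
        // A crate is ready once none of its dependencies are still unpublished.
        let ready: Vec<&str> = remaining
            .iter()
            .filter(|(_, ds)| ds.is_empty())
            .map(|(name, _)| *name)
            .collect();
        if ready.is_empty() {
            return None;
        }
        for name in &ready {
            remaining.remove(name);
            order.push(name.to_string());
        }
        for ds in remaining.values_mut() {
            for name in &ready {
                ds.remove(name);
            }
        }
    }
    Some(order)
}
```

For this graph the pass always puts grex-core first and grex-cli last; grex-mcp and grex-plugins-builtin are interchangeable in the middle because neither depends on the other.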
Smoke test post-publish:
cargo install grex-cli --locked
grex --version # must print 1.0.0
4. Installer smoke tests
From a fresh shell session:
# Linux / macOS
curl -LsSf https://github.com/egoisth777/grex/releases/latest/download/grex-cli-installer.sh | sh
grex --version
# Windows
powershell -c "irm https://github.com/egoisth777/grex/releases/latest/download/grex-cli-installer.ps1 | iex"
grex --version
Verified install (recommended for security-sensitive environments)
Every artefact is signed via GitHub's native build provenance
(actions/attest-build-provenance). Users can verify the binary matches
the commit + workflow that produced it before trusting it:
# Download + verify attestation (requires gh CLI >= 2.49)
gh release download v1.0.0 --repo egoisth777/grex --pattern '*.tar.xz'
gh attestation verify grex-cli-x86_64-unknown-linux-gnu.tar.xz --repo egoisth777/grex
tar xf grex-cli-x86_64-unknown-linux-gnu.tar.xz
sudo mv grex-cli*/grex /usr/local/bin/
grex --version
The curl | sh / irm | iex one-liners above are a convenience path
and do NOT verify attestations.
Supported platforms
Pre-built binaries ship for these five triples (see
[workspace.metadata.dist].targets in root Cargo.toml):
| Triple | Runner |
|---|---|
| x86_64-unknown-linux-gnu | ubuntu-22.04 |
| aarch64-unknown-linux-gnu | ubuntu-22.04-arm |
| x86_64-apple-darwin | macos-13 |
| aarch64-apple-darwin | macos-14 |
| x86_64-pc-windows-msvc | windows-2022 |
Everything else (32-bit, musl, FreeBSD, aarch64-windows, etc.) falls back to building from source:
cargo install grex-cli --locked
Rollback
Yank a bad crates.io release
cargo yank hides the version from the resolver without deleting it.
Yank in reverse-dependency order (bin first, so dependents cannot keep
pulling it in):
cargo yank --version 1.0.0 grex-cli
cargo yank --version 1.0.0 grex-mcp
cargo yank --version 1.0.0 grex-plugins-builtin
cargo yank --version 1.0.0 grex-core
Yanking is reversible: cargo yank --version 1.0.0 --undo <crate> if
you decide to keep the release after all.
Mark the GitHub Release as pre-release
gh release edit v1.0.0 --prerelease
This hides it from the "latest" installer URL without deleting the artefacts. Users on the installer one-liner will stop picking up the bad release automatically.
Ship a fix
Cut a fresh patch release (v1.0.1) with the fix — do not re-tag
v1.0.0. Re-tagging breaks provenance and every cached copy of the
installer script.
On the limits of rollback
- cargo yank is not cargo delete. The crate file stays on crates.io forever; yanking only excludes it from new resolves. Code that pinned = 1.0.0 in a lockfile keeps compiling. There is no delete API.
- Sigstore attestations are immutable. A released artefact whose build provenance is on the Sigstore transparency log cannot be revoked: gh attestation verify will keep returning OK even after you mark the release pre-release.
- Compromised-binary rollback MUST use a patch bump (v1.0.1) that supersedes the bad version. Yank v1.0.0, mark its GitHub Release pre-release, and push v1.0.1 through the same release pipeline. Do not attempt to re-tag or delete artefacts.
Pinning updates
To update the pinned cargo-dist version:
- Bump [workspace.metadata.dist].cargo-dist-version in root Cargo.toml.
- Run cargo install cargo-dist --locked --version <new> locally.
- Run dist generate to regenerate .github/workflows/release.yml.
- Commit both files together. CI's release-plan job verifies the manifest still parses.
SemVer policy for grex
grex follows Semantic Versioning 2.0.0. This document pins down what "breaking", "additive", and "fix" mean concretely for grex, because the public surface spans four distinct contracts that users and agents depend on:
- Manifest schema: grex.jsonl + grex.lock.jsonl row shapes, keyed on the per-row schema_version field.
- CLI surface: verb names, flag names, exit codes, and the --json / --plain stdout formats.
- MCP tool surface: JSON-RPC tool names, input/output JSON schemas, and tool annotations exposed by grex serve --mcp.
- pack.yaml schema: pack-type plugin names, action names, and action field shapes consumed by the pack parser.
A release is MAJOR, MINOR, or PATCH based on the worst change across all four surfaces. A MAJOR change on any one surface forces a MAJOR release, even if the other three are additive-only.
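The "worst change across all four surfaces" rule is a maximum over an ordered set. A small sketch, with hypothetical type names, makes the ordering explicit:

```rust
/// Variant order matters: the derived `Ord` gives Patch < Minor < Major.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Bump {
    Patch,
    Minor,
    Major,
}

/// The classification of a release's changes on each public surface.
pub struct SurfaceChanges {
    pub manifest: Bump,
    pub cli: Bump,
    pub mcp: Bump,
    pub pack_yaml: Bump,
}

impl SurfaceChanges {
    /// The release bump is the most severe bump on any single surface.
    pub fn release_bump(&self) -> Bump {
        self.manifest.max(self.cli).max(self.mcp).max(self.pack_yaml)
    }
}
```

So a release that is additive on three surfaces but renames one MCP tool is classified MAJOR as a whole.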
The short version
| Bump | What it means |
|---|---|
| MAJOR | Existing workspaces / agents / packs may stop working after upgrade; migration may be needed. |
| MINOR | Everything that worked before still works; new capabilities are available to opt into. |
| PATCH | Behaviour identical from the user's perspective; bugs fixed, perf improved, docs clarified. |
The manifest-wire invariant
The single load-bearing invariant across all four surfaces is the JSONL wire
format of grex.jsonl and grex.lock.jsonl:
- Every row carries a schema_version integer field (since M2 / PR #2).
- Writers never emit rows at a schema version older than the one their binary understands. A newer grex writes newer rows; an older grex never downgrades.
- Readers treat unknown future fields on a known-version row as skip-don't-error: extra keys are ignored, not rejected.
- Bumping schema_version past the max a reader supports is a MAJOR event for that row kind; readers older than the new major will refuse the row with a structured error and instruct the user to upgrade.
This is the one rule that survives any SemVer ambiguity below: if you cannot
round-trip a manifest through an older compatible grex without silent data
loss, the change is MAJOR.
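The reader side of the invariant can be sketched as a two-step decision: refuse rows newer than the reader supports, otherwise keep the known fields and ignore the rest. The code below works on an already-parsed row to stay dependency-free; the types, the constant, and the idea of passing known keys as a slice are all assumptions for illustration.

```rust
use std::collections::BTreeMap;

/// Highest row schema this hypothetical reader understands.
pub const MAX_SUPPORTED_SCHEMA: u32 = 1;

#[derive(Debug, PartialEq, Eq)]
pub enum ReadOutcome {
    /// Row accepted; unknown fields were dropped, not rejected.
    Accepted { known: BTreeMap<String, String> },
    /// Structured refusal telling the user to upgrade.
    UpgradeRequired { found: u32, max_supported: u32 },
}

/// Apply the skip-don't-error rule to one parsed row.
pub fn read_row(
    schema_version: u32,
    fields: &BTreeMap<String, String>,
    known_keys: &[&str],
) -> ReadOutcome {
    if schema_version > MAX_SUPPORTED_SCHEMA {
        return ReadOutcome::UpgradeRequired {
            found: schema_version,
            max_supported: MAX_SUPPORTED_SCHEMA,
        };
    }
    // Keep only the fields this reader knows; extra keys are ignored.
    let known: BTreeMap<String, String> = fields
        .iter()
        .filter(|(key, _)| known_keys.contains(&key.as_str()))
        .map(|(key, value)| (key.clone(), value.clone()))
        .collect();
    ReadOutcome::Accepted { known }
}
```

A reader that drops unknown fields like this must not write the row back in place, since that would lose the newer data and break the round-trip test stated above.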
Per-surface rules
1. Manifest schema (grex.jsonl / grex.lock.jsonl)
| Change | Bump |
|---|---|
| Remove a row kind (e.g. drop RegisterPack rows) | MAJOR |
| Rename a required field on an existing row (e.g. url → repo_url) | MAJOR |
| Change the type of an existing field (e.g. parallel: int → parallel: str) | MAJOR |
| Tighten a constraint (e.g. a previously free-form string becomes enum-only) | MAJOR |
| Bump schema_version past what older readers support | MAJOR |
| Add a new row kind that older readers skip cleanly (unknown kind = skip) | MINOR |
| Add a new optional field to an existing row (readers ignore unknown fields) | MINOR |
| Widen a constraint (e.g. enum gains a new variant — older readers skip row) | MINOR |
| Fix a writer bug that emitted malformed rows (readers already tolerant) | PATCH |
| Improve compaction perf; rewrite internals without format change | PATCH |
2. CLI surface (verbs, flags, exit codes, stdout format)
| Change | Bump |
|---|---|
| Rename or remove a verb (grex add → grex register) | MAJOR |
| Change a verb's positional-argument shape (<url> [path] → <path> <url>) | MAJOR |
| Remap an existing exit code's meaning (e.g. 2 previously = parse error, now 2 = lock contention) | MAJOR |
| Change the shape of --json stdout for an existing verb | MAJOR |
| Remove a flag (even a short alias) that was stable in the previous MINOR | MAJOR |
| Add a new verb | MINOR |
| Add a new flag with a safe default that preserves prior behaviour | MINOR |
| Add a new field to an existing --json payload (consumers ignore unknowns) | MINOR |
| Improve an error message; reword --help text; fix tab alignment | PATCH |
| Fix a buggy exit code that never returned its documented value | PATCH |
Caveat on exit-code fixes: a PATCH-class exit-code correction is still visible
to scripts that pinned against the buggy value. The CHANGELOG entry must
call it out under Fixed in bold so operators notice before upgrading.
3. MCP tool surface (grex serve --mcp)
| Change | Bump |
|---|---|
| Rename or remove a tool (pack_add → register_pack) | MAJOR |
| Remove a required field from a tool's input schema | MAJOR |
| Add a new required field to a tool's input schema | MAJOR |
| Change a tool's output schema field type | MAJOR |
| Change or remove a tool annotation an existing client depends on | MAJOR |
| Add a new tool | MINOR |
| Add an optional input field with a safe default | MINOR |
| Add a new output field (clients ignore unknowns per MCP spec) | MINOR |
| Add a new annotation | MINOR |
| Fix a handler bug where the tool returned success on partial failure | PATCH |
| Improve tool description strings; tighten input-validation error messages | PATCH |
The MCP conformance suite (PR #28) pins the 2025-06-18 MCP spec revision. Bumping to a later MCP spec revision is itself MAJOR if the newer spec has breaking changes the grex surface propagates; otherwise MINOR.
4. pack.yaml schema (pack-type + action plugins)
| Change | Bump |
|---|---|
| Rename a built-in pack-type (declarative → static) | MAJOR |
| Rename a built-in action (file-write → write-file) | MAJOR |
| Remove a field from an existing action's input shape | MAJOR |
| Change an action's default behaviour for an existing field | MAJOR |
| Remove a built-in pack-type or action | MAJOR |
| Add a new built-in pack-type | MINOR |
| Add a new built-in action | MINOR |
| Add a new optional field to an existing action | MINOR |
| Loosen a validation rule (previously rejected input now accepted with warning) | MINOR |
| Fix a parser bug; improve error locations; clarify validation messages | PATCH |
| Improve action-execution perf; refactor executor internals | PATCH |
External plugin ABI stability is deferred to the v2 plugin spec; v1.0.0 has no external plugin surface.
Deprecation policy
- When grex needs to remove a verb, flag, tool, annotation, pack-type, action, or manifest field, it first deprecates the surface in a MINOR release.
- A deprecated surface continues to work for at least one full MINOR cycle before removal in the next MAJOR.
- grex doctor surfaces deprecation warnings in its output when a workspace's manifest or pack tree uses a deprecated surface. doctor-clean before a MAJOR upgrade means no deprecated usage left.
- MCP clients receive deprecation notices via the tool-annotation mechanism; a deprecated tool's annotation gets a deprecated: true marker and a human-readable message pointing at its replacement.
- Deprecation entries go under ### Deprecated in CHANGELOG.md on the MINOR release that introduces them and under ### Removed in the MAJOR release that retires them, with a back-reference to the deprecation entry.
What is not covered by SemVer
grex's SemVer contract covers the four public surfaces above. The following are explicitly out-of-scope:
- Internal module layout (grex-core internals, private items). Reshuffled without bumping; consumers should not depend on private crate APIs.
- Log / trace / stderr formatting (not --json and not --plain). Free to evolve at any point.
- Build artefact names and installer script URLs: these follow the cargo-dist release pipeline's conventions, not grex SemVer.
- docs/ content, design notes, and milestone.md. Documentation is maintained for correctness but not versioned.
- CI matrix composition (adding or dropping platforms from the build matrix). Platform-support drops will be called out in the CHANGELOG but follow their own platform-support policy.
- Minimum supported Rust version (MSRV): MSRV bumps are MINOR and are called out in the CHANGELOG.
See also
- CHANGELOG.md: per-release entries with categorised changes.
- .omne/cfg/manifest.md: normative manifest / lockfile schema.
- .omne/cfg/cli.md: v1 frozen verb contract.
- .omne/cfg/mcp.md: MCP server surface.
- .omne/cfg/pack-spec.md: pack.yaml schema and built-in pack-types.
MCP protocol conformance (CI gate)
Shipped in feat-m7-3. See
openspec/changes/feat-m7-3-mcp-ci-conformance/proposal.md.
Purpose
mcp-conformance is the L6 external-oracle gate: an independent
implementation (mcp-validator, Janix-ai) drives grex serve and asserts
wire-protocol conformance at MCP protocol version 2025-06-18. Pairs with
the in-process L2-L5 harness from feat-m7-2 — the harness checks our own
rmcp-typed client, this job checks a third-party one.
Pin
| Field | Value |
|---|---|
| Upstream | github.com/Janix-ai/mcp-validator |
| CI source | github.com/egoisth777/mcp-validator (org-controlled mirror, same SHA) |
| Tag | v0.3.1 |
| Commit SHA | d766d3ee94076b13d0b73253e5221bbc76b9edb2 |
| Released | 2025-07-08T13:55:45Z |
| Install path | actions/checkout the pinned SHA from the mirror into .mcp-validator/, then pip install -r .mcp-validator/requirements.txt, then run python -m mcp_testing.stdio.cli with PYTHONPATH=.mcp-validator. |
| PyPI status | mcp-validator==0.3.1 is NOT published on PyPI (only 0.1.1 is). |
| pip install git+URL | NOT supported at this SHA. The upstream repo at tag v0.3.1 ships neither setup.py nor pyproject.toml, so pip refuses with does not appear to be a Python project. Clone-and-run is the only supported path until upstream adds a packaging file. |
| Protocol | 2025-06-18 (matches .omne/cfg/mcp.md SSOT) |
| Pin verified | 2026-04-22 via gh api repos/Janix-ai/mcp-validator/git/refs/tags/v0.3.1 |
Bump policy
Any bump MUST update tag AND SHA together. Re-run:
gh api repos/Janix-ai/mcp-validator/releases/latest --jq '.tag_name,.published_at'
gh api repos/Janix-ai/mcp-validator/git/refs/tags/v<NEW> --jq '.object.sha'
Drift between the two is a merge blocker.
Supply-chain hardening
The CI job checks out from egoisth777/mcp-validator (an org-controlled
mirror / fork of Janix-ai/mcp-validator) rather than upstream directly.
Rationale: actions/checkout of an external repo that we then pip install
hands that external maintainer arbitrary code execution under the CI token
if upstream is compromised, rewritten, or replaced. Mirroring the pin into
a repo we control closes that window — the SHA is byte-identical to
upstream, but the host cannot be tampered with by third parties.
Mirror refresh procedure (run once per validator bump):
# 1. Confirm new upstream tag + SHA.
gh api repos/Janix-ai/mcp-validator/releases/latest --jq '.tag_name,.published_at'
NEW_SHA=$(gh api repos/Janix-ai/mcp-validator/git/refs/tags/v<NEW> --jq '.object.sha')
# 2. Sync mirror's default branch with upstream (one-time, if not already
# tracking). The fork created via `gh repo fork Janix-ai/mcp-validator`
# already has all history; subsequent refreshes via:
gh api -X POST repos/egoisth777/mcp-validator/merge-upstream \
-f branch=main
# 3. Verify the new SHA is reachable from the mirror.
gh api repos/egoisth777/mcp-validator/commits/$NEW_SHA --jq '.sha'
# 4. Update `ref:` in `.github/workflows/ci.yml` mcp-conformance job.
CLI invocation
There is no mcp-validator console entry point at this SHA. The canonical
invocation (matches upstream ref_gh_actions/stdio-validation.yml at tag
v0.3.1) is a python -m call against the checked-out module:
PYTHONPATH=/abs/path/to/mcp-validator-checkout \
python -m mcp_testing.stdio.cli \
"$GITHUB_WORKSPACE/target/release/grex serve" \
--protocol-version 2025-06-18 \
--output-dir reports \
--report-format json
Verified --help output at tag v0.3.1:
usage: cli.py [-h] [--args ARGS [ARGS ...]] [--debug]
[--protocol-version {2024-11-05,2025-03-26,2025-06-18}]
[--output-dir OUTPUT_DIR] [--report-format {text,json,html}]
server_command
Notes:
- Server command is positional, NOT --server-command (the earlier spec draft had this wrong; corrected here and in ci.yml).
- No --timeout flag exists at this SHA. Upstream's own ref_gh_actions/stdio-validation.yml template lists --timeout 30 but the template drifted from the code; the CLI rejects it with unrecognized arguments: --timeout 30. Omitted.
- --protocol-version is passed to the validator, NOT to grex serve (which does not accept that flag).
- PYTHONPATH is required because the upstream repo at this SHA is not pip-installable (no setup.py / pyproject.toml).
- The job uploads reports/ as a workflow artefact (mcp-conformance-reports, 14-day retention) regardless of pass/fail so failed runs are debuggable.
Local repro
From repo root on any supported OS (Python 3.12):
cargo build --release -p grex
# Use the org mirror (same SHA) so local repro matches CI's supply chain.
git clone https://github.com/egoisth777/mcp-validator .mcp-validator
git -C .mcp-validator checkout d766d3ee94076b13d0b73253e5221bbc76b9edb2
python -m pip install --upgrade pip
pip install -r .mcp-validator/requirements.txt
mkdir -p reports
PYTHONPATH="$(pwd)/.mcp-validator" \
python -m mcp_testing.stdio.cli \
"$(pwd)/target/release/grex serve" \
--protocol-version 2025-06-18 \
--output-dir reports \
--report-format json
Exit code 0 = conformant. Non-zero = protocol drift (inspect
reports/*.json for the failing test cases).
Deliberate-regression smoke
To confirm the gate actually gates (not just green-by-accident):
- On a throwaway branch matching the CI trigger glob (feat/**), break the MCP surface. Notes on what actually trips validator v0.3.1's stdio suite: renaming a tool or downgrading protocolVersion both PASS (validator accepts any initialize and does not assert tool inventory). The reliable break is making grex serve exit non-zero before accepting any stdio frames, e.g. replace the serve::run body with anyhow::bail!("smoke").
- Push and wait for CI: the MCP protocol conformance (2025-06-18) job MUST exit red.
- Delete the throwaway branch locally and on origin; do NOT merge.
Proof of red run
| Date | Branch | Head SHA | Run URL | Result |
|---|---|---|---|---|
| 2026-04-22 | feat/m7-3-smoke-regression-proof (since deleted) | 357b3b77dbbfabe7a5fec1f20f70a906cabfcd38 | https://github.com/egoisth777/grex/actions/runs/24808713997/job/72608705067 | mcp-conformance = failure |
Revert: sub-branch deleted from both local and origin after capture;
PR #28's feat/m7-3-mcp-ci-conformance HEAD never contained the smoke
commit. Proof commit (357b3b7) is no longer reachable via any branch
— only this URL preserves the evidence.
Bypass procedure
Adversarial case: the validator itself regresses and blocks all PRs. To remove the gate temporarily:
- Maintainer: GitHub → Settings → Branches → main branch protection → "Required status checks" → untick MCP protocol conformance (2025-06-18).
- Save. The gate is now advisory.
- File a tracking issue linking the validator upstream regression.
- Once the upstream pin is fixed (or reverted), re-tick the required check and close the issue.
The pin is explicit, so a validator regression is always reproducible locally via the install command above — fixes are single-line PRs that bump the tag + SHA together.
Upstream disappearance
If Janix-ai/mcp-validator is deleted, renamed, or has its history
rewritten, CI continues to work unchanged because the job reads from the
org mirror egoisth777/mcp-validator, which retains the pinned SHA
independently. Remediation in that scenario:
- Confirm the mirror still holds the pinned SHA: gh api repos/egoisth777/mcp-validator/commits/d766d3ee94076b13d0b73253e5221bbc76b9edb2 --jq '.sha'
- File a tracking issue noting upstream loss so future bumps either (a) find a replacement validator or (b) vendor the validator source under .mcp-validator-vendored/ in-repo and drop the external checkout step.
- No CI changes required in the meantime: the mirror IS the durable source of truth for the pinned build.
Pointing ref: back at upstream is only appropriate if upstream is
restored AND has been re-audited.
CI job layout
See .github/workflows/ci.yml job mcp-conformance. Notes:
- No needs: dependency. The job owns its own cargo build --release -p grex step and runs in parallel with the debug build matrix. Adding needs: [build] would stall on the 3-OS matrix (~5 min p95) with no artefact payoff. Budget: ~3.5 min cold / ~1.5 min warm.
- Release cache key is distinct (key: release) so the release target profile does not thrash the debug cache used by build.
- Python 3.12 matches upstream's template.