Glossary

Catalog

Catalogs are a system for associating human-readable names with WareIDs.

See the Catalog Design for extended description.

Fileset

A set of files and directories, including standard POSIX metadata. Nothing special :)

When “packed”, a fileset becomes a Ware.

We use this term primarily to separate what a Fileset is from any details of its packing and transport. A good analogy is tarballs and torrents: each is a different way of packing and distributing content, yet both can contain an identical Fileset.

Filesets can be packed and unpacked by the Rio tool.

Formula

A complete definition of a computation. Pins all input filesets by WareID. Formulas are evaluated in a sandbox. The result of evaluating a Formula is a RunRecord.

See the Layer 1 Schema docs for extended description, and the Repeatr tool for how to execute them.

Formula Action

A script to run inside the container when evaluating a Formula.

See the Layer 1 Schema docs for extended description, and the Repeatr tool for how to execute them.

Formula Inputs

A group of WareIDs and where to mount them in a directory tree when setting up the environment to evaluate a Formula.

Together with the Formula Action, the Inputs provide a Hermetic description of a computation.

Formula Output

A list of paths to save when the Formula’s Action is complete. Files under an output path will be exported as a Ware, and when the run is complete, the resulting WareIDs are reported in the RunRecord.
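As a rough illustration of how Inputs, Action, and Output fit together, here is a minimal sketch. The field names, the shape of the action, and the `unpinned_inputs` helper are all assumptions made for this example; the Layer 1 Schema docs remain the authority on the real format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Formula:
    # Hypothetical field names; not taken from the Layer 1 Schema.
    inputs: dict[str, str]   # mount path -> WareID to unpack there
    action: list[str]        # command to run inside the container
    outputs: list[str]       # paths to pack into Wares afterwards


def is_pinned(ware_id: str) -> bool:
    """True if the string has the "{packtype}:{hash}" shape."""
    packtype, sep, digest = ware_id.partition(":")
    return bool(packtype and sep and digest)


def unpinned_inputs(formula: Formula) -> list[str]:
    """Return the mount paths whose input is not a WareID."""
    return [path for path, ware_id in formula.inputs.items()
            if not is_pinned(ware_id)]


formula = Formula(
    inputs={"/": "tar:6q7G4hWr283FpTa5Lf8heVqw9t97b5VoMU6AGszuBYAz9EzQdeHVFAou7c4W9vFcQ6"},
    action=["/bin/sh", "-c", "echo hello > /out/greeting"],  # illustrative only
    outputs=["/out"],
)
print(unpinned_inputs(formula))  # []
```

The check only looks at the shape of each input string; it does not verify that the Ware exists or that the hash is valid for its Packtype.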

Lineage

A lineage is the collected, ordered set of releases for some module.

A lineage is composed of references to Release Records.

Lineages exist at Layer 2 of the API, with the other Catalog concepts, and are evaluated by the Reach tool.

Module

A graph (a DAG) describing a series of proto-Formulas. Instead of WareIDs, a Module specifies human-readable names, which are either scoped locally to the Module or use Catalogs to connect to a bigger picture. These names are later resolved into WareIDs.

Each Step in the Module will be resolved to one Formula (using some name resolution process), which can then be evaluated, and then its outputs can be used as inputs in other steps.
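Because the outputs of one Step feed the inputs of another, Steps have to be evaluated in dependency order. The sketch below shows one way to derive such an order using the standard library's `graphlib` (Python 3.9+). The step names and the dictionary layout are hypothetical, and this is not a description of how the Reach tool is implemented.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical Module: step name -> steps whose outputs it consumes.
steps = {
    "compile": set(),
    "test": {"compile"},
    "package": {"compile", "test"},
}


def evaluation_order(steps: dict[str, set[str]]) -> list[str]:
    """Return step names so that every step follows its dependencies."""
    return list(TopologicalSorter(steps).static_order())


print(evaluation_order(steps))  # ['compile', 'test', 'package']

# A cycle means the graph is not a DAG, so no valid order exists.
try:
    evaluation_order({"a": {"b"}, "b": {"a"}})
except CycleError:
    print("not a DAG")
```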

Modules exist at Layer 2 of the API, and are evaluated by the Reach tool.

Module Import

TODO // oh boy. we need pages and pages on this. modules aren’t actually discussed in the design chapters at all yet.

Module Export

TODO

Module Step

TODO

Packtype

Packtype refers to the format of a packed Ware. It’s the first segment of a WareID.

ReleaseRecord

TODO

RunRecord

The result of evaluating a Formula. Includes the exitcode, a ‘results’ map full of WareIDs corresponding to the Formula Outputs, and also metadata like timestamp.

Note that while Formulas are designed to converge, RunRecords are the opposite: inclusion of nondeterministic fields like the timestamp is intentionally used to make RunRecords not deduplicate. This is useful because we may want to explicitly record the fact that a Formula evaluation produced the same results at several different times.
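The contrast between converging Formulas and non-deduplicating RunRecords can be sketched with content hashing. Everything here is an assumption made for illustration: the field names, the use of SHA-256 over sorted JSON as an identifier, and the placeholder action and timestamps. It is not a description of how Repeatr identifies or stores records.

```python
import hashlib
import json


def content_id(obj) -> str:
    """Hash a canonical JSON encoding, so equal content gets an equal ID."""
    encoded = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


def make_run_record(formula: dict, results: dict, exitcode: int, timestamp: int) -> dict:
    # Hypothetical record layout for this sketch.
    return {
        "formulaID": content_id(formula),
        "results": results,
        "exitcode": exitcode,
        "time": timestamp,
    }


formula = {
    "inputs": {"/": "tar:6q7G4hWr283FpTa5Lf8heVqw9t97b5VoMU6AGszuBYAz9EzQdeHVFAou7c4W9vFcQ6"},
    "action": ["/bin/true"],  # placeholder action
    "outputs": ["/out"],
}
# Placeholder WareID standing in for whatever the run produced.
results = {"/out": "git:c0d6989e1d1031a2760b8b0bfe163d75eae05e07"}

first = make_run_record(formula, results, 0, timestamp=1500000000)
second = make_run_record(formula, results, 0, timestamp=1500003600)

same_formula = first["formulaID"] == second["formulaID"]
same_results = first["results"] == second["results"]
distinct_records = content_id(first) != content_id(second)
print(same_formula, same_results, distinct_records)  # True True True
```

Both records point at the same Formula and report the same results, yet they remain two distinct records because the timestamp differs.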

RunRecords exist at Layer 1 of the API, and are produced by Repeatr when it evaluates a Formula.

Ware

The “packed” form of a Fileset. Tarballs, zips, git commits; many formats are defined. Wares are immutable and can be referred to by WareID.

Wares can be packed and unpacked by the Rio tool.

WareID

A short string identifying a Ware, composed of a Packtype and a hash. Holding a WareID gives you an immutable reference to a Ware, which you can unpack into a Fileset.

WareID tuples look like “{packtype}:{hash}”, delimited by a colon – for example, “tar:6q7G4hWr283FpTa5Lf8heVqw9t97b5VoMU6AGszuBYAz9EzQdeHVFAou7c4W9vFcQ6”, or “git:c0d6989e1d1031a2760b8b0bfe163d75eae05e07”.
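A minimal sketch of splitting a WareID string into its two segments follows. The `WareID` class, the `digest` field name (for the hash segment), and `parse_ware_id` are hypothetical names for this example and are not part of the Rio tool. The sketch assumes the packtype itself never contains a colon, so it splits at the first one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WareID:
    packtype: str  # first segment, e.g. "tar" or "git"
    digest: str    # second segment: the hash

    def __str__(self) -> str:
        return f"{self.packtype}:{self.digest}"


def parse_ware_id(text: str) -> WareID:
    """Split a "{packtype}:{hash}" string at the first colon."""
    packtype, sep, digest = text.partition(":")
    if not sep or not packtype or not digest:
        raise ValueError(f"not a WareID: {text!r}")
    return WareID(packtype, digest)


wid = parse_ware_id("git:c0d6989e1d1031a2760b8b0bfe163d75eae05e07")
print(wid.packtype)  # git
print(wid.digest)    # c0d6989e1d1031a2760b8b0bfe163d75eae05e07
```

The parser checks only the overall shape; whether the hash is well-formed for a given Packtype is left out of this sketch.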

WareIDs exist at Layer 0 of the API, and are the inputs and outputs of the Rio tool.