When Magic Pipes runs user code, the following bindings are available in the environment:

* [http://www.schemers.org/Documents/Standards/R5RS/|R5RS Scheme]
* [http://wiki.call-cc.org/man/4/Extensions%20to%20the%20standard|Core Chicken extensions]
* [http://api.call-cc.org/doc/data-structures|The Chicken data-structures unit]
* [http://api.call-cc.org/doc/srfi-1|SRFI-1] (list utilities)
* [http://api.call-cc.org/doc/srfi-13|SRFI-13] (string utilities)
* [http://api.call-cc.org/doc/srfi-69|SRFI-69] (hash tables)
* [http://api.call-cc.org/doc/alist-lib|alist-lib]
* Any bindings added to the environment through use of the standard -u, -d or -i command-line arguments.
* Useful Magic Pipes runtime tools, described below

The runtime tools can also be loaded in external code by importing the magic-pipes-runtime module.

`(mplog format args...)`

This uses Chicken's [http://api.call-cc.org/doc/extras/printf|printf] formatting system to output strings to the standard error port. It's convenient to use this in Magic Pipes code to display progress, debugging, and informational reports to the user, without disrupting pipeline output to standard output.

(mplog "Current total: ~A" current-total)

`(alist-project fields alist)`

This returns an alist composed of only the fields in the supplied alist that are mentioned in fields, which is a list of either alist keys or pairs of the form (key . value). In the latter case, if the key is not present in alist then an element of the form (key . value) is inserted into the output alist; any value thus supplied works as a default value. Keys specified in fields without a default value do not appear in the output alist unless they appeared in alist. The output alist has keys listed in the order they are present in fields. As well as slimming an alist down, this procedure is useful for putting an alist into the right order for a later step that's fussy about that.
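As a rough illustration of those rules, here is a minimal sketch in plain Scheme. alist-project-sketch is a hypothetical stand-in written for this example, not the Magic Pipes implementation, and it assumes keys are compared with equal?.

```scheme
;; Hypothetical sketch of the behaviour described above; NOT the library code.
;; Assumption: keys are compared with equal? (via assoc).
(define (alist-project-sketch fields alist)
  (let loop ((fields fields) (acc '()))
    (if (null? fields)
        (reverse acc)
        (let* ((field (car fields))
               ;; a field is either a bare key or a (key . default) pair
               (key (if (pair? field) (car field) field))
               (found (assoc key alist)))
          (cond
           ;; key present in the input: keep the input's entry
           (found (loop (cdr fields) (cons found acc)))
           ;; key absent but a default was given: use the default entry
           ((pair? field) (loop (cdr fields) (cons field acc)))
           ;; key absent and no default: leave it out
           (else (loop (cdr fields) acc)))))))

(alist-project-sketch '(a (b . 0) c d) '((c . 3) (z . 9) (a . 1)))
;; => ((a . 1) (b . 0) (c . 3))
```

The result follows the order of fields rather than the order of the input, z is dropped because it is not listed, b picks up its default, and d is omitted because it has neither a value nor a default.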

`(alist-projector fields)`

Returns a procedure accepting an alist and returning an alist which, when called, effectively calls (alist-project fields alist). This form is easier to use as an argument to mpmap than alist-project, as you can say:

mpmap "(alist-projector '(a b c))"

Instead of:

mpmap "(lambda (x) (alist-project '(a b c) x))"

`(alist-modify transformers alist)`

Returns an alist with the same keys as alist in the same order, except that any keys in alist which are also keys of the alist transformers are associated with the result of applying the value of that key in transformers to the value of that key in alist. This is useful for processing the output of tools like mpre or mpcsv-read | mptable2alist that produce only strings, if you use things like string->number as transformer values.
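The transformation can be pictured with a small sketch in plain Scheme. alist-modify-sketch is a hypothetical name used only here; it is not the Magic Pipes implementation, and comparing keys with equal? is an assumption.

```scheme
;; Hypothetical sketch of the behaviour described above; NOT the library code.
;; Assumption: keys are compared with equal? (via assoc).
(define (alist-modify-sketch transformers alist)
  (map (lambda (entry)
         (let ((transformer (assoc (car entry) transformers)))
           (if transformer
               ;; apply the transformer procedure to this entry's value
               (cons (car entry) ((cdr transformer) (cdr entry)))
               ;; no transformer for this key: keep the entry unchanged
               entry)))
       alist))

(alist-modify-sketch `((size . ,string->number))
                     '((name . "notes.txt") (size . "1024")))
;; => ((name . "notes.txt") (size . 1024))
```

Note the quasiquote and unquote in the transformers alist: a plain quoted list would contain the symbol string->number rather than the procedure itself, and a symbol cannot be applied to a value.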

`(alist-modifier transformers)`

Returns a procedure accepting an alist and returning an alist which, when called, effectively calls (alist-modify transformers alist). This form is easier to use as an argument to mpmap than alist-modify, as you can say:

mpmap '(alist-modifier `((size . ,string->number)))'

Instead of:

mpmap '(lambda (x) (alist-modify `((size . ,string->number)) x))'

`(mplookup type filename [dupmode: {all|one}] [reverse: boolean])`

This opens a persistent key:value lookup table. Several file types are supported, as described below.

mplookup returns a suite of values, each of which is a procedure; in order, they are the lookup procedure, the update procedure, the deletion procedure, the fold procedure, and the close procedure. If you don't call the close procedure, not only may you leak resources, but updates and deletions you have performed may not be correctly written to the file.

The lookup procedure accepts a key, looks it up in the lookup table, and returns the corresponding value, or #f if there is none. An optional second argument can be provided, which is used as the default value instead of #f. However, if dupmode was set to all (the default is one) when the lookup table was opened, then the lookup procedure instead returns a list of matching values; this list will be empty if there are none, and can contain more than one value if the lookup table contains duplicates.

The update procedure accepts a key and value, and binds that key solely to that value in the lookup table. Any previous bindings of that key to values in the lookup table are deleted. There is currently no way to bind a key to more than one value through this interface (but I might extend it in future).

The delete procedure accepts a key, and removes any values associated with that key in the lookup table.

The fold procedure accepts a procedure of three arguments (key, value and accumulator), and an initial accumulator. It calls the procedure for every key:value binding in the lookup table (which, if dupmode is all, might be several times for a single key), threading an accumulator value through.

Finally, the close procedure writes any pending changes to the file, and releases any held resources.

If the optional reverse argument is true, then the lookup table is inverted.
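Because five procedures come back as multiple values, the caller has to receive them with something like call-with-values. The sketch below shows that calling pattern against make-lookup-stub, a hypothetical in-memory stand-in written for this example. It only mirrors the return shape and the dupmode one behaviour described above; it does not touch any file and is not how Magic Pipes stores data.

```scheme
;; Hypothetical in-memory stand-in with the same return shape as mplookup:
;; lookup, update, delete, fold and close procedures, in that order.
(define (make-lookup-stub)
  (let ((table '()))
    (define (delete! key)
      (let loop ((rest table) (kept '()))
        (cond ((null? rest) (set! table (reverse kept)))
              ((equal? (caar rest) key) (loop (cdr rest) kept))
              (else (loop (cdr rest) (cons (car rest) kept))))))
    (values
     ;; lookup: key plus optional default (#f when omitted)
     (lambda (key . default)
       (let ((hit (assoc key table)))
         (cond (hit (cdr hit))
               ((pair? default) (car default))
               (else #f))))
     ;; update: bind key solely to value, replacing earlier bindings
     (lambda (key value)
       (delete! key)
       (set! table (cons (cons key value) table)))
     delete!
     ;; fold: calls (proc key value accumulator) for every binding
     (lambda (proc seed)
       (let loop ((rest table) (acc seed))
         (if (null? rest)
             acc
             (loop (cdr rest) (proc (caar rest) (cdar rest) acc)))))
     ;; close: nothing to flush for an in-memory table
     (lambda () #t))))

(call-with-values make-lookup-stub
  (lambda (lookup update! delete! fold-table close)
    (update! "alice" "alice@example.org")
    (update! "bob" "bob@example.org")
    (delete! "bob")
    (display (lookup "alice")) (newline)          ; prints alice@example.org
    (display (lookup "bob" "nobody")) (newline)   ; prints nobody
    (display (fold-table (lambda (key value count) (+ count 1)) 0))
    (newline)                                     ; prints 1
    (close)))
```

With the real runtime, the producer passed to call-with-values would instead be a thunk that calls mplookup with a type and filename; the consumer keeps the same five parameters, and calling close at the end is what makes sure updates reach the file.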

Lookup table type `sqlite`

This lookup table type uses an SQLite database containing s-expressions, with a unique index on the key column and an index on the value column. As such, it can only represent a single value for each key. The database is created transparently if it does not already exist (the suggested extension is .sqlite). As lookup of keys and values is done by their exact textual representation, it is not recommended that the SQLite database be modified directly, as a different encoding of the same s-expression value may produce erroneous results.

Lookup table type `aliases`

This lookup table type uses a plain text file of the sort traditionally used to specify email aliases. On each line, any hash (#) symbol and the rest of the line thereafter is ignored; from what remains, entries of the form key:value (with any whitespace before or after the key or value being ignored) are interpreted as the bindings of the lookup table, with the key and the value both being taken as strings without any parsing. Lines not matching that structure are ignored silently.

Lookup table type `alist`

This lookup table type uses a plain text file containing zero or more alists, written as sexprs. An alist is a list whose elements are pairs mapping keys to values, like so:

((key . value) (message . "Hello World") (complex-structure . (1 2 (3 4 5 6))))

If there are multiple alists in the same file, they are all logically concatenated. Multiple occurrences of the same key, be they in the same alist or not, are handled as per the lookup table's dupmode setting.

Lookup table type `sexprs`

This lookup table type is very similar to alist, except without the "outer list"; the file is read as a sequence of sexprs, each of which is a single (key . value) pair. The advantage over alist is that the resulting file is easier to process one entry at a time, without ending up reading the entire alist into memory in one go, when read or written directly rather than via mplookup.
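To show what processing one entry at a time looks like outside of mplookup, here is a minimal sketch that reads (key . value) pairs from a port with the standard read procedure. fold-sexprs-port is a hypothetical helper named for this example, and the input comes from a string port (open-input-string, available in Chicken) so that the example is self-contained; a real file would be opened with open-input-file instead.

```scheme
;; Hypothetical helper: fold over (key . value) sexprs read one at a time,
;; so only a single entry is held in memory at once.
(define (fold-sexprs-port proc seed port)
  (let loop ((acc seed))
    (let ((entry (read port)))
      (if (eof-object? entry)
          acc
          (loop (proc (car entry) (cdr entry) acc))))))

;; Sample data in the sexprs layout: one pair per datum, no outer list.
(define sample
  (open-input-string
   "(key . value)\n(message . \"Hello World\")\n(complex-structure . (1 2 (3 4 5 6)))\n"))

;; Collect the keys (most recently read first).
(fold-sexprs-port (lambda (key value keys) (cons key keys)) '() sample)
;; => (complex-structure message key)
```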

Dirent tools

mpls reads directory entries into a structured object called a "dirent"; a number of utility procedures are provided to manipulate them.

`(->dirent path-or-dirent)`

If the argument is a string, creates a dirent object representing that path. An error is signalled if the path does not exist. If the argument is already a dirent, returns it as-is. Otherwise, an error is signalled.

`(dirent? object)`

Returns a true value if the supplied object is a dirent, or #f otherwise.

Accessors

dirent-path - the full path
dirent-directory - just the directory path
dirent-filename - just the filename
dirent-inode-number
dirent-mode
dirent-number-of-links
dirent-uid
dirent-gid
dirent-size
dirent-access-time
dirent-change-time
dirent-modification-time
dirent-parent-device-id
dirent-device-id
dirent-block-size
dirent-number-of-blocks
dirent-link-target
dirent-type
dirent-regular-file?
dirent-directory?
dirent-fifo?
dirent-socket?
dirent-symbolic-link?
dirent-character-device?
dirent-block-device?

The dirent accessors return various attributes of the directory entry.

Older and newer

`(dirent-older? path-or-dirent path-or-dirent-or-age-in-seconds [accessor])`

`(dirent-newer? path-or-dirent path-or-dirent-or-age-in-seconds [accessor])`

If given two paths-or-dirents, returns true if and only if the first one is older (or newer, respectively) than the second one. The timestamp used to compute the "age" is the result of dirent-modification-time unless accessor is specified, in which case it can be any other accessor that converts a dirent into a POSIX timestamp; dirent-access-time or dirent-change-time are obvious choices, but it could be anything.

If given a path-or-dirent as the first argument and a number as the second, it instead returns true if and only if the dirent's modification time (or some other timestamp, if accessor is overridden) is older (or newer) than age-in-seconds ago (measured from the current timestamp).

Nicer ways to specify ages in seconds

Rather than having to work out how many seconds a week is, you can use these convenience procedures:

`(minutes number)`

`(hours number)`

`(days number)`

`(weeks number)`

Returns the number of seconds in the specified number of minutes, hours, days, or weeks, respectively. For example, to find files older than ten days in or below the current directory:

mpls -R | mpfilter "(cut dirent-older? <> (days 10))" | \
  mpmap "(alist-projector '(path access-time))" | \
  mpmap '(alist-modifier `((access-time . ,seconds->string)))' | \
  mpalist2table -H path access-time | mpcsv-write
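The arithmetic behind these helpers, and behind an "older than N seconds ago" test, can be sketched in plain Scheme. The names ending in -sketch and older-than-age? are hypothetical and exist only for this example; the strict less-than comparison and the explicit now argument are assumptions made to keep the sketch deterministic, not a statement of how dirent-older? is implemented.

```scheme
;; Hypothetical equivalents of the unit helpers: each returns seconds.
(define (minutes-sketch n) (* n 60))
(define (hours-sketch n) (* (minutes-sketch n) 60))
(define (days-sketch n) (* (hours-sketch n) 24))
(define (weeks-sketch n) (* (days-sketch n) 7))

;; Assumed reading of "older than age-in-seconds ago": the timestamp lies
;; before the cutoff (now minus age). now is passed in explicitly here.
(define (older-than-age? timestamp age-in-seconds now)
  (< timestamp (- now age-in-seconds)))

(days-sketch 10)   ; => 864000
(weeks-sketch 1)   ; => 604800

;; A file stamped at 1000000, checked at 2000000, is older than ten days.
(older-than-age? 1000000 (days-sketch 10) 2000000)   ; => #t
;; A file stamped one hour before the check is not.
(older-than-age? (- 2000000 (hours-sketch 1)) (days-sketch 10) 2000000)   ; => #f
```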

Pathname patterns

`(dirent-match? regexp path-or-dirent [full-path?])`

Returns true if and only if the filename of path-or-dirent matches the regular expression regexp (which may be a POSIX-style string regexp or an SRE). If full-path? is specified and true, the regular expression is matched against the entire path rather than just the filename part.

`(dirent-matcher regexp [full-path?])`

Returns a procedure from path-or-dirent to boolean, that effectively calls (dirent-match? regexp path-or-dirent full-path?). However, compilation of the regular expression is only done once, so this form is preferable from a performance perspective.

mpls -R | mpfilter "(dirent-matcher '(: (* any) (+ numeric) (* any)) #t)" | mpmap dirent-path

mpls -R | mpfilter "(dirent-matcher \".*tags/1\.0.*\" #t)" | mpmap dirent-path

`(dirent-glob? pattern path-or-dirent [full-path?])`

As dirent-match?, except that it uses a simple glob pattern instead of a full regular expression. Note that, when using full-path?, the * pattern in globs does NOT match / - so a pattern like *foo* will match foo but not stuff/foo or foo/stuff.

`(dirent-globber pattern [full-path?])`

Returns a procedure from path-or-dirent to boolean, that effectively calls (dirent-glob? pattern path-or-dirent full-path?). However, compilation of the glob pattern is only done once, so this form is preferable from a performance perspective.

mpls -R | mpfilter "(dirent-globber \"*.scm\")" | mpmap dirent-path
