\stylesheet{mistie.css}
\title{Mistie}
\obeylines{
\urlh{http://www.cs.rice.edu/~dorai}{Dorai Sitaram}
\urlh{mistie.tar.gz}{Download mistie.tar.gz!}
}
\small{\flushright{
Knowne vnto these, and to my selfe disguisde:
Ile say as they say, and perseuer so:
And in this mist at all aduentures go.
(\i{The Comedy of Errors}, II.ii)
}}

Mistie is a programmable filter. Its primary aim is
to let the user define a document's markup using
Scheme.

By itself, Mistie does not require any style of
markup or format of either its input or its output.
It simply copies its standard input to standard
output as is. E.g.,
\p{
mistie.scm < input.doc > output.doc
}
produces an \p{output.doc} that is indistinguishable
from \p{input.doc}.

\p{mistie.scm} can be given a file's name as
argument, in which case it reads that file instead
of standard input. Thus, the above command is
equivalent to
\p{
mistie.scm input.doc > output.doc
}

To make Mistie do something more interesting than
copying input verbatim to output, the user must
supply a \i{format file}. A format file is a Scheme
file that describes the markup of the input document
in terms of the desired output format. Format files
are specified with the \p{-f} option. E.g.,
\p{
mistie.scm -f myformat.mistie < input.doc
}
produces a formatted version of \p{input.doc}, the
formatting being dictated by the format file
\p{myformat.mistie}. The formatted version may
either go to standard output or to some file
depending on \p{myformat.mistie}. We will use the
\p{.mistie} extension for Scheme files used as format
files, but this is just a convention.

In general, a format file will use the Mistie
infrastructure to define a particular markup,
deciding both what the input document should look
like and what kind of output to emit. Format authors
are \i{not} limited to a specialized sublanguage --
they can use full Scheme, including all the
nonstandard features of the particular Scheme dialect
they have at their disposal.

Writing a format file requires some Scheme
programming skill. If you're already a Scheme
programmer, you are all set. If not, you can rely on
format files written by people whose taste you trust.
If it helps, Mistie is somewhat like TeX in its mode
of operation (though not in its domain), with the
``macro'' language being Scheme. The analogy is not
perfect though: There are no predefined primitives
(everything must be supplied via a format file), and
the output style is CFD (completely format dependent)
rather than some DVI (device independent). (Hope
that wasn't too mistie-rious.)

The distribution includes several sample format
files.

Format files may be combined in the call to
\p{mistie.scm}, e.g.,
\p{
mistie.scm -f plain.mistie -f footnote.mistie file.doc > file.html
mistie.scm -f plain.mistie -f multipage.mistie file.doc
}
Alternatively, a new combination format file can be
written that loads other format files. E.g., the
following format file \p{basic.mistie} combines
within itself the effects of \p{plain.mistie},
\p{scmhilit.mistie}, and \p{multipage.mistie}:
\q{
; File: basic.mistie
(load-relative "plain.mistie") ;or use `load' with full pathnames
(load-relative "scmhilit.mistie")
(load-relative "multipage.mistie")
}
It is invoked in the usual manner:
\p{
mistie.scm -f basic.mistie file.doc
}
Note that the format file \p{multipage.mistie}
creates a set of \p{.html} files whose names are
based on the name of the input document. Therefore,
when using this format file, whether explicitly or
implicitly, redirection of standard input or standard
output is inappropriate.

The name Mistie stands for Markup In Scheme That Is
Extensible. Possible pronunciations are
\i{miss-tea} and \i{miss-tie}.
\pagebreak
\section{Writing Mistie formats}
A typical intent of a format file is to cause certain
characters in the input document to trigger
non-trivial changes in the output document.
E.g., if the output is to be HTML, we'd like the
characters \p{<}, \p{>}, \p{&}, and \p{"} in the
input to come out as \p{&lt;}, \p{&gt;}, \p{&amp;},
and \p{&quot;}, respectively. The Mistie procedure
\q{mistie-def-char} can be used for this:
\q{
(mistie-def-char #\<
  (lambda () (display "&lt;")))
(mistie-def-char #\>
  (lambda () (display "&gt;")))
(mistie-def-char #\&
  (lambda () (display "&amp;")))
(mistie-def-char #\"
  (lambda () (display "&quot;")))
}
\q{mistie-def-char} takes two arguments: The first
is the character that is defined, and the second is
the procedure associated with it. Here, the
procedure writes the HTML-encoded version of the
character.

Suppose we want a contiguous sequence of blank lines
to come out as the paragraph separator, \p{<p>}. We
could \q{mistie-def-char} the newline character as
follows:
\q{
(mistie-def-char #\newline
  (lambda ()
    (newline)
    (let* ((s (h-read-whitespace))
           (n (h-number-of-newlines s)))
      (if (> n 0)
          (begin (display "<p>")
                 (newline) (newline))
          (display s)))))
}
This will cause newline to read up all the following
whitespace, and then check to see how many further
newlines it picked up. If there was at least one, it
outputs the paragraph separator, viz.,
\p{<p>} followed by two newlines (added for human
readability). Otherwise, it merely prints the
picked up whitespace as is. The help procedures
\q{h-read-whitespace} and \q{h-number-of-newlines}
are ordinary Scheme procedures:
\q{
(define h-read-whitespace
  (lambda ()
    (let loop ((r '()))
      (let ((c (peek-char)))
        (if (or (eof-object? c) (not (char-whitespace? c)))
            (list->string (reverse r))
            (loop (cons (read-char) r)))))))
(define h-number-of-newlines
  (lambda (ws)
    (let ((n (string-length ws)))
      (let loop ((i 0) (k 0))
        (if (>= i n) k
            (loop (+ i 1)
                  (if (char=? (string-ref ws i) #\newline)
                      (+ k 1) k)))))))
}
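The decision made by the newline handler can be tried
at a Scheme prompt without Mistie. The following is
only a sketch: \q{h-paragraph-break?} is a hypothetical
name introduced here for illustration, not a Mistie
procedure, and it simply mirrors the test
\q{(> n 0)} used above.
\q{
;; Hypothetical helper, not part of Mistie: true if the
;; whitespace string contains at least one newline.
(define h-paragraph-break?
  (lambda (ws)
    (let loop ((cs (string->list ws)))
      (cond ((null? cs) #f)
            ((char=? (car cs) #\newline) #t)
            (else (loop (cdr cs)))))))

;; Blank line after the triggering newline: one more
;; newline was picked up, so a paragraph break results.
(h-paragraph-break? (string #\newline))        ; => #t
;; Only indentation follows the triggering newline:
;; the whitespace would be printed as is.
(h-paragraph-break? (string #\space #\space))  ; => #f
}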
\subsection{Control sequences}
The Mistie procedure \q{mistie-def-ctl-seq} defines
\i{control sequences}. A control sequence is a
sequence of letters (alphabetic characters), and
is invoked in the input document by prefixing the
sequence with an \i{escape character}. (The case of
the letters is insignificant.) \q{mistie-def-ctl-seq}
associates a procedure with a control sequence --
when the control sequence occurs in the input
document, it causes the procedure to be applied.
The following defines the control sequence \q{br},
which emits the HTML tag \p{<br>}:
\q{
(mistie-def-ctl-seq 'br
  (lambda ()
    (display "<br>")))
}
Before a control sequence can be used, we must fix the
escape character. The following sets it to backslash:
\q{
(set! mistie-escape-char #\\)
}
We can now invoke the \q{br} control sequence as \p{\br}.
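Other simple tags can be handled the same way. As a
sketch only (the control sequence \q{hr} is an
illustrative assumption, not something Mistie or its
sample formats are known to define), a horizontal
rule might be set up like this:
\q{
;; Hypothetical control sequence, for illustration only
(mistie-def-ctl-seq 'hr
  (lambda ()
    (display "<hr>")
    (newline)))
}
With backslash as the escape character, it would be
invoked as \p{\hr}.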
\subsection{Frames}
However, we can do better and get automatic line
breaks with a more powerful control sequence. Let's
say text between \p{\obeylines} and \p{\endobeylines}
should have automatic line breaks. We define
the control sequences \q{obeylines} and
\q{endobeylines} as follows:
\q{
(mistie-def-ctl-seq 'obeylines
  (lambda ()
    (mistie-push-frame)
    (mistie-def-char #\newline
      (lambda ()
        (display "<br>")
        (newline)))
    (mistie-def-ctl-seq 'endobeylines
      (lambda ()
        (mistie-pop-frame)))))
}
The \q{obeylines} control sequence first pushes a new
frame on to the Mistie environment, using the Mistie
procedure \q{mistie-push-frame}. What this means is
that any definitions (whether \q{mistie-def-char} or
\q{mistie-def-ctl-seq}) will shadow existing
definitions. The Mistie procedure
\q{mistie-pop-frame} exits the frame, causing
the older definitions to take effect again.
In this case, we create a shadowing
\q{mistie-def-char} for newline, so that it will emit
\p{<br>} instead of performing its default action
(which, as we described above, was to look for
paragraph separation). We also define a control
sequence \q{endobeylines} which will pop the frame
pushed by \q{obeylines}. With this definition in
place, any text sandwiched between \p{\obeylines} and
\p{\endobeylines} (assuming \p{\} is the escape
character) will be output with a \p{<br>} at the end
of each of its lines.
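The same push/pop pattern should serve for other
temporary redefinitions. The following is a minimal
sketch, assuming hypothetical control sequences
\q{asis} and \q{endasis} (names invented here); it
uses only the Mistie procedures already introduced.
Inside the frame, newlines are copied through
unchanged instead of being examined for paragraph
separation:
\q{
;; Hypothetical control sequences, for illustration only
(mistie-def-ctl-seq 'asis
  (lambda ()
    (mistie-push-frame)
    ;; shadow the paragraph-detecting newline definition
    (mistie-def-char #\newline
      (lambda ()
        (newline)))
    ;; defined inside the frame, so it disappears
    ;; again once the frame is popped
    (mistie-def-ctl-seq 'endasis
      (lambda ()
        (mistie-pop-frame)))))
}
Because \q{endasis} is defined after the frame is
pushed, it is itself part of the frame and is
discarded along with the shadowing newline definition.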
\subsection{Calling Scheme from within the document}
We can define a control sequence \q{eval} that
will allow the input document to explicitly evaluate
Scheme expressions, without having to put them all
in a format file.
\q{
(mistie-def-ctl-seq 'eval
  (lambda ()
    (eval (read))))
}
This will cause \p{\eval} followed by a Scheme
expression to evaluate that Scheme expression.
E.g.,
\p{
\eval (display (+ 21 21))
}
will cause \p{42} to be printed at the point where the
\p{\eval} statement is placed. Of course, once you
have arbitrary access to Scheme within your document,
the amount of kooky intertextual stuff you can do is
limited only by your imagination. A mundane use for
\p{\eval} is to reset the escape character at
arbitrary locations in the document, should the
existing character be needed (temporarily or
permanently) for something else.
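For instance, assuming the \q{eval} definition above
is in effect and backslash is the current escape
character, a document might switch the escape
character as follows (the choice of \p{@} is purely
illustrative):
\p{
\eval (set! mistie-escape-char #\@)
}
From that point on, control sequences would be
written with the new character, e.g., \p{@br}, and
backslash would be treated as an ordinary character.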
\pagebreak
\input formats.doc
\pagebreak
\section{References}
\eval
(mistie-def-ctl-seq 'aa
  (lambda ()
    (display "&aring;")
    (if (eqv? (peek-char) #\space) (read-char))))
\bibitem{mzscheme} Matthew Flatt.
\urlh{http://www.cs.rice.edu/CS/PLT/packages/mzscheme}{MzScheme}.
\bibitem{tex} Donald E. Knuth.
\urlh{http://cseng.awl.com/bookdetail.qry?ISBN=0-201-13448-9&ptype=0}{The
TeXbook}. Addison-Wesley, 1993.
\bibitem{css} H\aa kon Wium Lie and Bert Bos.
\urlh{http://www.awl.com/css}{Cascading Style Sheets}.
Addison Wesley Longman, 1997.