.
The formatter treats such code by formatting it in a monospace font, as in this
note. Because a chunk is only code if its first line is a named reference or
if it begins with a left parenthesis or semicolon, it is possible to include
lengthy blocks of display code in a literate program (or temporarily
comment out a block of code) by writing the block with double brackets on
the first and last lines, like this:
[[
(multiple lines
(of)
(display code))
]]
which will be displayed as:
| (multiple lines
| (of)
| (display code))
SchemeWeb weaves its output into HTML. Text chunks are displayed in Arial,
code (whether in chunks or in display) is displayed in Courier, and display
chunks are offset to the right of a vertical line to show they are not part of
the tangled output. Other formatting commands may be passed to the browser by
embedding HTML tags into text chunks.
This note is itself a SchemeWeb literate Scheme source file, and contains the
complete source code of the SchemeWeb literate programming system.
[[Tangle]], [[weave]], and [[lload]]
SchemeWeb provides three user-callable procedures.
The [[tangle]] procedure [[(tangle input output)]] takes up to two arguments and
returns a string; it may also write to a file or port as a side effect. Both
arguments are optional; if [[output]] is given, [[input]] must also be given.
Each argument may be either a port or a string containing a filename. If
an argument is given as a string, the corresponding port is opened; if the
filename extension is omitted from the input filename, it defaults to [[.lss]],
and if the filename extension is omitted from the output filename, it defaults
to [[.ss]]. If no input is specified, input is taken from the current input
port. Output is written to a file or port only if the output argument is
specified; regardless, the tangled output is returned as the value of the
function.
The [[weave]] procedure [[(weave input)]] takes a single argument which may be
either a port or a string containing a filename. If the argument is given as a
string, the corresponding port is opened; if the filename extension is omitted
from the input filename, it defaults to [[.lss]]. [[Weave]] writes output to a
file whose name is constructed using the same basename as the input file and
extension [[.html]], and returns no value.
The [[lload]] procedure [[(lload filename)]] takes a single string argument and
returns nothing; it loads the Scheme source code in a SchemeWeb literate Scheme
source file into a running Scheme top-level as a side-effect. If the filename
extension is omitted from the input filename, it defaults to [[.lss]].
[[Lload]] calls [[tangle]] to extract the Scheme source code from the [[.lss]]
file.
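As a brief usage sketch of the three procedures described above, assuming
SchemeWeb has already been loaded into the running Scheme system and that a
literate source file named [[hello.lss]] exists (the filename is hypothetical,
chosen only for illustration):
[[
(tangle "hello")          ; read hello.lss, return the tangled code as a string
(tangle "hello" "hello")  ; likewise, and also write the code to hello.ss
(weave "hello")           ; write the woven document to hello.html
(lload "hello")           ; tangle hello.lss and load the resulting code
]]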
The [[weave]] procedure provided by SchemeWeb offers considerably fewer features
than the [[weave]] function of other literate programming systems such as CWEB
and [[noweb]]. As described in his Usenet posting cited in the references,
the author believes that most working code is never published outside the text
editor where it is written, so fancy weaving is seldom needed. Other [[weave]]s
provide such features as code pretty-printing, indexing of variable and function
names, and cross-referencing of chunk definitions and usages, but when the code
is viewed from within the programmer's text editor, a simple search replaces all
the indexing and cross-referencing features, and pretty-printing only causes
confusion by making the on-screen and printed versions of the program look
different. The author believes that the point of diminishing returns
is quickly reached when adding features to [[weave]], and it is unlikely any of
those other features will ever be added to SchemeWeb. In fact, the author
disclaims any responsibility for [[weave]] on the grounds that he never uses it.
The SchemeWeb implementation
SchemeWeb provides the three procedures [[tangle]], [[weave]], and [[lload]].
It also uses many local procedures, but these are defined within the three
exported procedures to avoid namespace pollution. The three procedures are
described below.
[[Tangle]]
[[Tangle]] takes zero, one, or two arguments.
; TANGLE [INPUT [OUTPUT]] -- extract code, return as string, optionally write output
(define (tangle . args)
  <>
  <>
  <>)
The main code block of [[tangle]] sets up input and output ports as specified
by the arguments, builds a dictionary of the chunks in the input, then calls
[[tangl]] to write the tangled source code.
<>=
(let ((i (open-input (if (pair? args) (car args) (current-input-port)) ".lss"))
      (o (open-output-string)))
  (let ((dict (build i)))
    (if (pair? dict)
        (begin
          (tangl "" dict "" o)
          (close-input-port i)
          (let ((s (get-output-string o)))
            (close-output-port o)
            (if (and (pair? args) (pair? (cdr args)))
                (let ((o (open-output (cadr args) ".ss")))
                  (display s o)
                  (close-output-port o)))
            s)))))
Tangling works in two phases. First, the input is read, and code chunks, both
named and unnamed, are stored in a dictionary; that work is done by the call to
[[build]]. Then, a recursive process performs depth-first search through the
code-chunk call-graph, writing code as it proceeds, starting at a chunk named
[[""]] (the unnamed chunk); that work is done by the call to [[tangl]]. The
dictionary is implemented as an association list; lookup in an a-list takes
time linear in the number of chunks, but unless the input is very large, that
has little effect on the actual running time of the program.
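The dictionary can be pictured as follows. This is a minimal sketch, not part
of SchemeWeb: the name [[sample-dict]] and the chunk contents are invented for
illustration, and only the standard [[assoc]] procedure is used.
[[
; each entry pairs a chunk name with the list of its code lines
(define sample-dict
  (list (cons "" '("(define (f x)" "  <<body>>)"))
        (cons "body" '("(display x)" "(newline)"))))
; lookup scans the list from the front until a name matches
(define body-lines (cdr (assoc "body" sample-dict)))
(define missing (assoc "no such chunk" sample-dict))
]]
Here [[body-lines]] is the list holding the two lines of the [[body]] chunk,
and [[missing]] is [[#f]] because [[assoc]] found no entry with that name.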
The recursive tangling phase
[[Tangl]] is the recursive function that performs depth-first search through
the code-chunk call-graph, writing output as it goes. The four arguments to
[[tangl]] are the name of the code-chunk to be tangled, the a-list that stores
the code-chunk dictionary, the [[indent]] that precedes each line, and the
port on which output is written. The [[indent]] is interesting; [[tangle]]
works hard to make its output look nice. [[Tangl]] first looks up the code
associated with the named chunk, then loops through each code line, parsing it
to look for calls to additional code-chunks. If it finds a chunk-call,
[[tangl]] calls itself recursively to expand the code of the called chunk,
being careful to set the [[indent]].
<>=
; tangl name dict indent output -- write tangled output
(define (tangl name dict indent output)
  (let loop ((lines (cdr (assoc name dict))))
    (call-with-values
      (lambda () (parse-line (car lines)))
      (lambda (prefix call-name suffix)
        (display prefix output)
        (if (and (not (string=? "" call-name))
                 (assoc call-name dict))
            (tangl call-name
                   dict
                   (make-string (+ (string-length indent)
                                   (string-length prefix))
                                #\space)
                   output))
        (cond ((not (string=? "" suffix)) (loop (cons suffix (cdr lines))))
              ((pair? (cdr lines))
                (newline output)
                (display indent output)
                (loop (cdr lines))))))))
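The indentation passed to the recursive call can be examined in isolation.
The sketch below is an illustration only; [[child-indent]] is a hypothetical
name, not a SchemeWeb procedure, and it simply repeats the [[make-string]]
expression from [[tangl]].
[[
; width of the current indent plus the text that precedes the chunk-call
(define (child-indent indent prefix)
  (make-string (+ (string-length indent) (string-length prefix)) #\space))
(define nested-indent (child-indent "  " "(let (("))
]]
The current indent is two characters wide and the prefix is seven, so
[[nested-indent]] is a string of nine spaces; each later line of the called
chunk therefore lines up under the point where the chunk-call appeared.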
[[Tangl]] calls [[parse-line]] to extract chunk-call references from code lines.
[[Parse-line]] returns three values: the [[prefix]] that precedes a chunk-call
reference, the [[call-name]] of a chunk-call reference, and the [[suffix]] that
follows a chunk-call reference. If the code line has no chunk-call reference,
the [[call-name]] and [[suffix]] will be empty strings.
<>=
; parse-line line -- extract prefix, call-name, suffix from line
(define (parse-line line)
  (let* ((start (string-index line "<<" 0))
         (end (if start (string-index line ">>" (+ start 2)) #f)))
    (if (and start end)
        (values (substring line 0 start)
                (substring line (+ start 2) end)
                (substring line (+ end 2) (string-length line)))
        (values line "" ""))))
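To see the three-way split on a concrete line, here is a self-contained sketch
in portable Scheme. Both names are hypothetical: [[find-substring]] stands in
for the [[string-index]] call used above, and [[split-call]] mirrors
[[parse-line]] but returns a list instead of multiple values so the result is
easy to inspect.
[[
; find-substring str pat start -- index of pat in str at or after start, or #f
(define (find-substring str pat start)
  (let ((n (string-length str)) (m (string-length pat)))
    (let loop ((i start))
      (cond ((> (+ i m) n) #f)
            ((string=? (substring str i (+ i m)) pat) i)
            (else (loop (+ i 1)))))))
; split-call line -- list of prefix, call-name, suffix
(define (split-call line)
  (let* ((start (find-substring line "<<" 0))
         (end (and start (find-substring line ">>" (+ start 2)))))
    (if (and start end)
        (list (substring line 0 start)
              (substring line (+ start 2) end)
              (substring line (+ end 2) (string-length line)))
        (list line "" ""))))
(define with-call (split-call "(let ((x <<initial value>>))"))
(define without-call (split-call "(newline)"))
]]
In this sketch [[with-call]] holds the prefix [["(let ((x "]], the call-name
[["initial value"]], and the suffix [["))"]], while [[without-call]] holds the
whole line followed by two empty strings.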
The dictionary building phase
The [[build]] function takes an input port and returns a dictionary. [[Build]]
loops over each chunk on the input port until it finds a [[null?]] chunk
indicating the end of the input. Code chunks, whether named or unnamed, are
added to the dictionary; text chunks are ignored. [[Build]] calls [[get-name]]
to extract the chunk name from the first line of named chunks.
<>=
; build port -- build dictionary from port
(define (build input)
  (let loop ((par (read-par input)) (dict '()))
    (cond ((null? par) dict)
          ((unnamed? par) (loop (read-par input) (add-dict "" dict par)))
          ((named? par) (loop (read-par input)
                              (add-dict (get-name par) dict (cdr par))))
          (else (loop (read-par input) dict)))))
<>=
; get-name par -- extract name from first line of par
(define (get-name par)
  (let* ((line (car par))
         (start (string-index line "<<" 0))
         (end (string-index line ">>=" (+ start 2))))
    (substring line (+ start 2) end)))
The [[add-dict]] function adds a chunk to the dictionary. It loops through the
dictionary a-list, looking for a chunk with the appropriate name; if none
exists, it creates a new entry in the a-list. [[Add-dict]] calls [[dedent]] to
remove leading space from the chunk, so that SchemeWeb authors can indent code
chunks to make them more easily visible, if desired.
<>=
; add-dict name dict lines -- append lines to name in dict, or create new name
(define (add-dict name dict lines)
  (if (null? dict)
      (cons (cons name (dedent lines)) dict)
      (let loop ((item (car dict)) (unscanned (cdr dict)) (scanned '()))
        (cond ((string=? (car item) name)
                (cons (append item (dedent lines))
                      (append unscanned (reverse scanned))))
              ((null? unscanned)
                (cons (cons name (dedent lines)) (cons item (reverse scanned))))
              (else (loop (car unscanned) (cdr unscanned) (cons item scanned)))))))
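The effect of adding several chunks with the same name can be shown with a
simplified sketch. [[Add-lines]] is a hypothetical stand-in, not the
[[add-dict]] defined above: it omits the call to [[dedent]] and keeps entries
in their original order, but it appends to an existing entry or creates a new
one in the same spirit.
[[
; add-lines name lines dict -- append lines to name's entry, or add an entry
(define (add-lines name lines dict)
  (cond ((null? dict) (list (cons name lines)))
        ((string=? (car (car dict)) name)
          (cons (cons name (append (cdr (car dict)) lines)) (cdr dict)))
        (else (cons (car dict) (add-lines name lines (cdr dict))))))
(define grown
  (add-lines "a" '("(second)")
    (add-lines "b" '("(other)")
      (add-lines "a" '("(first)") '()))))
]]
After the three additions, [[grown]] has two entries: chunk [[a]] holds both
of its lines in the order they were added, and chunk [[b]] holds one line.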
The [[dedent]] function takes a list of strings and removes leading whitespace
they have in common. [[Dedent]] looks at the first character of each string in
the list, and if all strings begin with the same whitespace character, that
character is removed from each string; [[dedent]] continues to call itself
recursively until no common leading whitespace remains. A chunk consisting of
a single line has all of its leading whitespace removed.
<>=
; dedent ls -- remove common indentation from a list of strings
(define (dedent ls)
  (define (all xs)
    (or (null? xs)
        (and (car xs)
             (all (cdr xs)))))
  (cond ((null? ls) ls)
        ((null? (cdr ls)) (list (string-trim-left (car ls))))
        ((and (< 1 (length ls))
              (all (map (lambda (s) (positive? (string-length s))) ls))
              (char-whitespace? (string-ref (car ls) 0))
              (apply char=? (map (lambda (s) (string-ref s 0)) ls)))
          (dedent (map (lambda (s) (substring s 1 (string-length s))) ls)))
        (else ls)))
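A worked example may make the effect clearer. The sketch below is written in
portable Scheme so that it runs without [[string-trim-left]]; the names
[[all-start-with?]] and [[strip-common]] are hypothetical and are not part of
SchemeWeb.
[[
; all-start-with? c ls -- true if every string in ls is non-empty and begins with c
(define (all-start-with? c ls)
  (or (null? ls)
      (and (positive? (string-length (car ls)))
           (char=? c (string-ref (car ls) 0))
           (all-start-with? c (cdr ls)))))
; strip-common ls -- drop one shared whitespace character at a time
(define (strip-common ls)
  (if (and (pair? ls)
           (positive? (string-length (car ls)))
           (char-whitespace? (string-ref (car ls) 0))
           (all-start-with? (string-ref (car ls) 0) ls))
      (strip-common (map (lambda (s) (substring s 1 (string-length s))) ls))
      ls))
(define stripped (strip-common '("    (define x" "      1)")))
]]
The two input lines share four leading spaces, so [[stripped]] is the list
[["(define x"]] and [["  1)"]]: the common indentation is gone, and the extra
two spaces on the second line, which are not shared, are kept.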
[[Weave]]
Weave takes a single argument, the name of the [[.lss]] SchemeWeb literate
Scheme source file. The mainline code of [[weave]] opens and closes needed
files and calls [[weeve]] to do the actual work of creating output.
; WEAVE INPUT
(define (weave input)
  <>
  <>
  (let ((i (open-input input ".lss"))
        (o (open-output (base-name input ".lss") ".html")))
    (weeve (base-name input ".lss") i o)
    (close-input-port i)
    (close-output-port o)))
[[Weeve]] takes the base-name of the file being woven (which it writes to the
title of the [[.html]] output file), the input port, and the output port. It
writes a header and footer. In between, it loops over each chunk, classifies
it, and calls the appropriate function to write it to the output.
<>=
; weeve name input output -- write woven output
(define (weeve name input output)
  (display "" output) (newline output)
  (if (not (string=? "" name))
      (begin
        (display "" output) (display name output)
        (display "" output) (newline output)))
  (display "" output) (newline output)
  (let loop ((par (read-par input)))
    (if (pair? par)
        (begin (cond ((named? par) (write-named par output))
                     ((unnamed? par) (write-unnamed par output))
                     ((code? par) (write-code par output))
                     ((pair? par) (write-text par output)))
               (loop (read-par input)))))
(display "