" output) (display name output) (display "

>)

The main code block of [[tangle]] sets up input and output ports as specified by the arguments, builds a dictionary of the chunks in the input, then calls [[tangl]] to write the tangled source code.

<>=
(let ((i (open-input (if (pair? args) (car args) (current-input-port)) ".lss"))
      (o (open-output-string)))
  (let ((dict (build i)))
    (if (pair? dict)
        (begin
          (tangl "" dict "" o)
          (close-input-port i)
          (let ((s (get-output-string o)))
            (close-output-port o)
            (if (and (pair? args) (pair? (cdr args)))
                (let ((o (open-output (cadr args) ".ss")))
                  (display s o)
                  (close-output-port o)))
            s)))))

Tangling works in two phases. First, the input is read, and code chunks, both named and unnamed, are stored in a dictionary; that work is done by the call to [[build]]. Then, a recursive process performs depth-first search through the code-chunk call-graph, writing code as it proceeds, starting at a chunk named [[""]] (the unnamed chunk); that work is done by the call to [[tangl]]. The dictionary is implemented as an association-list; lookup in an a-list takes time linear in the number of chunks, but unless the input is very large, that will have little effect on the actual running time of the program.
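As a minimal sketch of the dictionary described above, assume each entry is a list whose head is the chunk name and whose tail holds the lines of the chunk. The chunk name [[helper]], the line contents, and the [[chunk-lines]] function are made up for illustration and are not part of SchemeWeb.

```scheme
; A hypothetical two-chunk dictionary: the unnamed chunk "" and a chunk
; named "helper". Each entry has the shape (name line ...).
(define dict
  '(("" "(define (main)" "  <<helper>>)")
    ("helper" "(display 42)")))

; chunk-lines is a hypothetical helper. assoc scans the a-list from the
; front, so lookup time grows linearly with the number of chunks.
(define (chunk-lines name)
  (let ((entry (assoc name dict)))
    (if entry (cdr entry) '())))

(display (chunk-lines "helper")) (newline)
(display (chunk-lines "missing")) (newline)
```

A hash table would give faster lookup, but for the handful of chunks in a typical source file the a-list keeps the code within standard Scheme.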

The recursive tangling phase

[[Tangl]] is the recursive function that performs depth-first search through the code-chunk call-graph, writing output as it goes. The four arguments to [[tangl]] are the name of the code-chunk to be tangled, the a-list that stores the code-chunk dictionary, the [[indent]] that precedes each line, and the port on which output is written. The [[indent]] is interesting; [[tangle]] works hard to make its output look nice. [[Tangl]] first looks up the code associated with the named chunk, then loops through each code line, parsing it to look for calls to additional code-chunks. If it finds a chunk-call, [[tangl]] calls itself recursively to expand the code of the called chunk, being careful to set the [[indent]].

<>=
; tangl name dict indent output -- write tangled output
(define (tangl name dict indent output)
  (let loop ((lines (cdr (assoc name dict))))
    (call-with-values
      (lambda () (parse-line (car lines)))
      (lambda (prefix call-name suffix)
        (display prefix output)
        (if (and (not (string=? "" call-name)) (assoc call-name dict))
            (tangl call-name dict
                   (make-string (+ (string-length indent) (string-length prefix)) #\space)
                   output))
        (cond ((not (string=? "" suffix)) (loop (cons suffix (cdr lines))))
              ((pair? (cdr lines))
                (newline output) (display indent output) (loop (cdr lines))))))))

[[Tangl]] calls [[parse-line]] to extract chunk-call references from code lines. [[Parse-line]] returns three values: the [[prefix]] that precedes a chunk-call reference, the [[call-name]] of a chunk-call reference, and the [[suffix]] that follows a chunk-call reference. If the code line has no chunk-call reference, the [[prefix]] is the entire line, and the [[call-name]] and [[suffix]] will be empty strings.

<>=
; parse-line line -- extract prefix, call-name, suffix from line
(define (parse-line line)
  (let* ((start (string-index line "<<" 0))
         (end (if start (string-index line ">>" (+ start 2)) #f)))
    (if (and start end)
        (values (substring line 0 start)
                (substring line (+ start 2) end)
                (substring line (+ end 2) (string-length line)))
        (values line "" ""))))
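The recursive expansion can be seen on a tiny input. The sketch below restates [[string-index]], [[parse-line]], and [[tangl]] from the text so that it is self-contained; the dictionary contents and the [[tangle-to-string]] helper are hypothetical, and [[open-output-string]] and [[get-output-string]] are assumed to be available, as they are in the main code block of [[tangle]].

```scheme
; Definitions restated from the text so the example runs on its own.
(define (string-index search target start)
  (let ((search-len (string-length search))
        (target-len (string-length target)))
    (let loop ((k start))
      (cond ((< search-len (+ k target-len)) #f)
            ((string=? (substring search k (+ k target-len)) target) k)
            (else (loop (+ k 1)))))))

(define (parse-line line)
  (let* ((start (string-index line "<<" 0))
         (end (if start (string-index line ">>" (+ start 2)) #f)))
    (if (and start end)
        (values (substring line 0 start)
                (substring line (+ start 2) end)
                (substring line (+ end 2) (string-length line)))
        (values line "" ""))))

(define (tangl name dict indent output)
  (let loop ((lines (cdr (assoc name dict))))
    (call-with-values
      (lambda () (parse-line (car lines)))
      (lambda (prefix call-name suffix)
        (display prefix output)
        (if (and (not (string=? "" call-name)) (assoc call-name dict))
            (tangl call-name dict
                   (make-string (+ (string-length indent) (string-length prefix)) #\space)
                   output))
        (cond ((not (string=? "" suffix)) (loop (cons suffix (cdr lines))))
              ((pair? (cdr lines))
                (newline output) (display indent output) (loop (cdr lines))))))))

; Hypothetical dictionary: the unnamed chunk calls a chunk named "body".
(define dict
  '(("" "(define (f x)" "  <<body>>)")
    ("body" "(display x)" "(* x x)")))

; tangle-to-string is a hypothetical helper that collects the output.
(define (tangle-to-string dict)
  (let ((o (open-output-string)))
    (tangl "" dict "" o)
    (get-output-string o)))

(display (tangle-to-string dict)) (newline)
; Prints:
; (define (f x)
;   (display x)
;   (* x x))
```

The second line of the [[body]] chunk is indented by the length of the prefix that preceded the chunk-call, which is how the expanded code stays aligned under the point where it was called.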

The dictionary building phase

The [[build]] function takes an input port and returns a dictionary. [[Build]] loops over each chunk on the input port until it finds a [[null?]] chunk indicating the end of the input. Code chunks, whether named or unnamed, are added to the dictionary; text chunks are ignored. [[Build]] calls [[get-name]] to extract the chunk name from the first line of named chunks.

<>=
; build port -- build dictionary from port
(define (build input)
  (let loop ((par (read-par input)) (dict '()))
    (cond ((null? par) dict)
          ((unnamed? par) (loop (read-par input) (add-dict "" dict par)))
          ((named? par) (loop (read-par input) (add-dict (get-name par) dict (cdr par))))
          (else (loop (read-par input) dict)))))

<>=
; get-name par -- extract name from first line of par
(define (get-name par)
  (let* ((line (car par))
         (start (string-index line "<<" 0))
         (end (string-index line ">>=" (+ start 2))))
    (substring line (+ start 2) end)))

The [[add-dict]] function adds a chunk to the dictionary. It loops through the dictionary a-list, looking for a chunk with the appropriate name; if one exists, the new lines are appended to it, and if none exists, a new entry is created in the a-list. [[Add-dict]] calls [[dedent]] to remove leading space from the chunk, so that SchemeWeb authors can indent code chunks to make them more easily visible, if desired.

<>=
; add-dict name dict lines -- append lines to name in dict, or create new name
(define (add-dict name dict lines)
  (if (null? dict)
      (cons (cons name (dedent lines)) dict)
      (let loop ((item (car dict)) (unscanned (cdr dict)) (scanned '()))
        (cond ((string=? (car item) name)
                (cons (append item (dedent lines))
                      (append unscanned (reverse scanned))))
              ((null? unscanned)
                (cons (cons name (dedent lines)) (cons item (reverse scanned))))
              (else (loop (car unscanned) (cdr unscanned) (cons item scanned)))))))

The [[dedent]] function takes a list of strings and removes the leading whitespace they have in common. [[Dedent]] looks at the first character of each string in the list, and if all strings begin with the same whitespace character, that character is removed from each string; [[dedent]] continues to call itself recursively until all common leading whitespace has been removed. A chunk consisting of a single line simply has all of its leading whitespace removed.

<>=
; dedent ls -- remove common indentation from a list of strings
(define (dedent ls)
  (define (all xs) (or (null? xs) (and (car xs) (all (cdr xs)))))
  (cond ((null? ls) ls)
        ((null? (cdr ls)) (list (string-trim-left (car ls))))
        ((and (< 1 (length ls))
              (all (map (lambda (s) (positive? (string-length s))) ls))
              (char-whitespace? (string-ref (car ls) 0))
              (apply char=? (map (lambda (s) (string-ref s 0)) ls)))
          (dedent (map (lambda (s) (substring s 1 (string-length s))) ls)))
        (else ls)))
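The behavior of [[dedent]] and [[add-dict]] is easiest to see on small inputs. The sketch below restates the definitions from the text so that it is self-contained; the sample lines and the names [[d1]], [[d2]], and [[d3]] are made up for illustration.

```scheme
; Definitions restated from the text so the example runs on its own.
(define (string-trim-left s)
  (cond ((string=? "" s) s)
        ((char-whitespace? (string-ref s 0))
          (string-trim-left (substring s 1 (string-length s))))
        (else s)))

(define (dedent ls)
  (define (all xs) (or (null? xs) (and (car xs) (all (cdr xs)))))
  (cond ((null? ls) ls)
        ((null? (cdr ls)) (list (string-trim-left (car ls))))
        ((and (< 1 (length ls))
              (all (map (lambda (s) (positive? (string-length s))) ls))
              (char-whitespace? (string-ref (car ls) 0))
              (apply char=? (map (lambda (s) (string-ref s 0)) ls)))
          (dedent (map (lambda (s) (substring s 1 (string-length s))) ls)))
        (else ls)))

(define (add-dict name dict lines)
  (if (null? dict)
      (cons (cons name (dedent lines)) dict)
      (let loop ((item (car dict)) (unscanned (cdr dict)) (scanned '()))
        (cond ((string=? (car item) name)
                (cons (append item (dedent lines))
                      (append unscanned (reverse scanned))))
              ((null? unscanned)
                (cons (cons name (dedent lines)) (cons item (reverse scanned))))
              (else (loop (car unscanned) (cdr unscanned) (cons item scanned)))))))

; Only the two spaces shared by both lines are removed.
(write (dedent '("  (define x" "    1)"))) (newline)
; writes ("(define x" "  1)")

; Hypothetical sequence of chunks arriving in input order.
(define d1 (add-dict "" '() '("  (a)" "  (b)")))   ; new unnamed chunk
(define d2 (add-dict "" d1 '("(c)")))              ; appended to the same chunk
(define d3 (add-dict "helper" d2 '("(h)")))        ; new named chunk
(write d3) (newline)
; writes (("helper" "(h)") ("" "(a)" "(b)" "(c)"))
```

Because chunks with the same name are appended in input order, a long chunk can be split across several paragraphs of the source file.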

[[Weave]]

[[Weave]] takes a single argument, the name of the [[.lss]] SchemeWeb literate Scheme source file. The mainline code of [[weave]] opens and closes needed files and calls [[weeve]] to do the actual work of creating output.

; WEAVE INPUT
(define (weave input)
  <>
  <>
  (let ((i (open-input input ".lss"))
        (o (open-output (base-name input ".lss") ".html")))
    (weeve (base-name input ".lss") i o)
    (close-input-port i)
    (close-output-port o)))

[[Weeve]] takes the base-name of the file being woven (which it writes to the title of the [[.html]] output file), the input port, and the output port. It writes a header and footer. In between, it loops over each chunk, classifies it, and calls the appropriate function to write it to the output.

<>=
; weeve name input output
(define (weeve name input output)
  (display "" output) (newline output)
  (if (not (string=? "" name))
      (begin (display "" output) (display name output)
             (display "" output) (newline output)))
  (display "" output) (newline output)
  (let loop ((par (read-par input)))
    (if (pair? par)
        (begin
          (cond ((named? par) (write-named par output))
                ((unnamed? par) (write-unnamed par output))
                ((code? par) (write-code par output))
                ((pair? par) (write-text par output)))
          (loop (read-par input)))))
  (display "" output) (newline output))

Functions that write chunks

There are four functions that write chunks, one for each type of chunk. The three that write code change the font face to courier, but are careful to set the font face to arial before they leave, because that is the default font used to display text. These procedures do all the setting of fonts and writing of newlines, but call other functions to actually write code and text.

<>=
; write-named par output
(define (write-named par output)
  (display "

" output)
  (call-with-values
    (lambda () (parse-line (car par)))
    (lambda (prefix call-name suffix)
      (display "«" output)
      (display-text call-name output)
      (display "»" output)
      (display "≡" output)))
  (display "" output) (newline output)
  (for-each (lambda (s) (display-code s output) (newline output)) (cdr par))
  (display "

" output) (newline output))

<>=
; write-unnamed par output
(define (write-unnamed par output)
  (display "

" output) (newline output)
  (for-each (lambda (s) (display-code s output) (newline output)) par)
  (display "

" output) (newline output))

<>=
; write-code par output
(define (write-code par output)
  (display "

" output) (newline output)
  (let loop ((par (cdr par)))
    (display "| " output) (display-code (car par) output) (newline output)
    (if (and (pair? (cdr par)) (pair? (cddr par))) (loop (cdr par))))
  (display "

" output) (newline output))

<>=
; write-text par output
(define (write-text par output)
  (display "

" output) (newline output)
  (for-each (lambda (s) (display-text s output) (newline output)) par)
  (display "

" output) (newline output))

Functions that write code and text

These functions actually write code and text. They are called in various contexts by the four functions that write chunks, and are careful never to write a newline, because some callers need them to write part of a line rather than an entire line.

<>=
; display-code line output
(define (display-code line output)
  (let loop ((line line))
    (call-with-values
      (lambda () (parse-line line))
      (lambda (prefix call-name suffix)
        (display (quote-html prefix) output)
        (if (not (string=? "" call-name))
            (begin (display "«" output)
                   (display-text call-name output)
                   (display "»" output)))
        (if (not (string=? "" suffix)) (loop suffix))))))

<>=
; display-text line output
(define (display-text line output)
  (let loop ((line line))
    (let* ((start (string-index line "[[" 0))
           (end (if start (string-index line "]]" (+ start 2)) #f)))
      (if (and start end)
          (begin (display (substring line 0 start) output)
                 (display "" output)
                 (display (quote-html (substring line (+ start 2) end)) output)
                 (display "" output)
                 (loop (substring line (+ end 2) (string-length line))))
          (display line output)))))

These functions call a utility function [[quote-html]] that changes literal less-than, greater-than and ampersand symbols to their html equivalents. Note that the ampersand replacement must come first, because the replacement text of the other symbols itself contains an ampersand.

<>=
; quote-html -- quote special html characters <, >, and &
(define (quote-html str)
  (let* ((s1 (string-replace str "&" "&amp;"))
         (s2 (string-replace s1 "<" "&lt;"))
         (s3 (string-replace s2 ">" "&gt;")))
    s3))
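The reason for replacing the ampersand first can be checked directly. The sketch below restates [[string-replace]] from the text and assumes that [[quote-html]] produces the standard entities [[&amp;]], [[&lt;]] and [[&gt;]]; the sample strings are made up for illustration.

```scheme
; string-replace restated from the text so the example runs on its own.
(define (string-replace target search replace)
  (let ((search-len (string-length search))
        (target-len (string-length target)))
    (let loop ((k 0))
      (cond ((< target-len (+ k search-len)) target)
            ((string=? (substring target k (+ k search-len)) search)
              (string-append
                (substring target 0 k)
                replace
                (string-replace (substring target (+ k search-len) target-len)
                                search replace)))
            (else (loop (+ k 1)))))))

; Assumed form of quote-html: ampersand first, then the angle brackets.
(define (quote-html str)
  (let* ((s1 (string-replace str "&" "&amp;"))
         (s2 (string-replace s1 "<" "&lt;"))
         (s3 (string-replace s2 ">" "&gt;")))
    s3))

(display (quote-html "a < b && c > d")) (newline)
; displays a &lt; b &amp;&amp; c &gt; d

; Replacing in the wrong order quotes the ampersand of an entity
; that was just produced.
(display (string-replace (string-replace "<" "<" "&lt;") "&" "&amp;")) (newline)
; displays &amp;lt;
```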

Utility functions, and functions common to [[tangle]] and [[weave]]

There are several utility functions required by [[tangle]] and [[weave]]. They are described below.

Chunk-classification functions

Chunks may be of four types: named, unnamed, display-code, and text. The three functions below identify the first three types; anything else is a text chunk.

<>=
; named? par -- #t if named paragraph, #f otherwise
(define (named? par)
  (if (null? par) #f
      (let* ((line (string-trim (car par)))
             (start (string-index line "<<" 0))
             (end (if start (string-index line ">>=" (+ start 2)) #f)))
        (and start end (zero? start) (= end (- (string-length line) 3))))))

<>=
; unnamed? par -- #t if unnamed paragraph, #f otherwise
(define (unnamed? par)
  (if (null? par) #f
      (let ((line (string-trim (car par))))
        (and (not (string=? "" line))
             (or (char=? #\; (string-ref line 0))
                 (char=? #\( (string-ref line 0)))))))

<>=
; code? par -- #t if code paragraph (for display), #f otherwise
(define (code? par)
  (and (string=? "[[" (car par)) (string=? "]]" (car (reverse par)))))
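To illustrate the classification, the sketch below restates the predicates and the string functions they depend on, then applies them in the same order that [[weeve]] does. The [[classify]] function and the sample paragraphs are hypothetical and are not part of SchemeWeb.

```scheme
; Definitions restated from the text so the example runs on its own.
(define (string-trim-left s)
  (cond ((string=? "" s) s)
        ((char-whitespace? (string-ref s 0))
          (string-trim-left (substring s 1 (string-length s))))
        (else s)))

(define (string-trim-right s)
  (cond ((string=? "" s) s)
        ((char-whitespace? (string-ref s (- (string-length s) 1)))
          (string-trim-right (substring s 0 (- (string-length s) 1))))
        (else s)))

(define (string-trim s) (string-trim-left (string-trim-right s)))

(define (string-index search target start)
  (let ((search-len (string-length search))
        (target-len (string-length target)))
    (let loop ((k start))
      (cond ((< search-len (+ k target-len)) #f)
            ((string=? (substring search k (+ k target-len)) target) k)
            (else (loop (+ k 1)))))))

(define (named? par)
  (if (null? par) #f
      (let* ((line (string-trim (car par)))
             (start (string-index line "<<" 0))
             (end (if start (string-index line ">>=" (+ start 2)) #f)))
        (and start end (zero? start) (= end (- (string-length line) 3))))))

(define (unnamed? par)
  (if (null? par) #f
      (let ((line (string-trim (car par))))
        (and (not (string=? "" line))
             (or (char=? #\; (string-ref line 0))
                 (char=? #\( (string-ref line 0)))))))

(define (code? par)
  (and (string=? "[[" (car par)) (string=? "]]" (car (reverse par)))))

; classify is a hypothetical helper that names the type of a paragraph.
(define (classify par)
  (cond ((named? par) 'named)
        ((unnamed? par) 'unnamed)
        ((code? par) 'code)
        (else 'text)))

(display (classify '("<<helper>>=" "(display 42)"))) (newline)  ; named
(display (classify '("  ; a comment" "(define x 1)"))) (newline) ; unnamed
(display (classify '("[[" "(f x)" "]]"))) (newline)              ; code
(display (classify '("Some prose."))) (newline)                  ; text
```

Note that an unnamed chunk is recognized only by its first line beginning with a semicolon or a left parenthesis, so a text paragraph that opens with a parenthesis would be treated as code.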

Port-handling functions

These functions open input ports and output ports. The parameter may be either a port or a string. If it is a string, the appropriate port is opened, perhaps using a default extension. If it is a port, it is returned unchanged. These functions use several non-R5RS extensions ([[file-exists?]], [[delete-file]], and [[error]]) which exist in most Scheme systems.

<>=
; open-input port-or-file . ext -- open input file or port, optional extension
(define (open-input port-or-file . ext)
  (cond ((input-port? port-or-file) port-or-file)
        ((not (string? port-or-file)) (error "error opening file"))
        ((file-exists? port-or-file) (open-input-file port-or-file))
        ((and (pair? ext) (file-exists? (string-append port-or-file (car ext))))
          (open-input-file (string-append port-or-file (car ext))))
        (else (error (string-append "can't open " port-or-file)))))

<>=
; open-output port-or-file . ext -- open output file or port, optional extension
(define (open-output port-or-file . ext)
  (cond ((output-port? port-or-file) port-or-file)
        ((not (string? port-or-file)) (error "error opening file"))
        ((file-exists? port-or-file)
          (delete-file port-or-file)
          (open-output-file port-or-file))
        ((null? ext) (open-output-file port-or-file))
        ((file-exists? (string-append port-or-file (car ext)))
          (delete-file (string-append port-or-file (car ext)))
          (open-output-file (string-append port-or-file (car ext))))
        (else (open-output-file (string-append port-or-file (car ext))))))

The [[base-name]] function strips an extension from a filename. A file name shorter than the suffix cannot end with it, so it is returned unchanged.

<>=
; base-name file-name suffix -- delete suffix from file-name if it matches
(define (base-name file-name suffix)
  (let ((len-file (string-length file-name))
        (len-suffix (string-length suffix)))
    (if (and (<= len-suffix len-file)
             (string=? (substring file-name (- len-file len-suffix) len-file) suffix))
        (substring file-name 0 (- len-file len-suffix))
        file-name)))
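A few calls show what [[base-name]] returns. The definition is restated here with a length guard, which is an addition in this sketch, so that a file name shorter than the suffix is returned unchanged rather than passed to [[substring]] with a negative index; the file names are made up for illustration.

```scheme
; base-name with a length guard added in this sketch.
(define (base-name file-name suffix)
  (let ((len-file (string-length file-name))
        (len-suffix (string-length suffix)))
    (if (and (<= len-suffix len-file)
             (string=? (substring file-name (- len-file len-suffix) len-file) suffix))
        (substring file-name 0 (- len-file len-suffix))
        file-name)))

(display (base-name "schemeweb.lss" ".lss")) (newline) ; schemeweb
(display (base-name "schemeweb" ".lss")) (newline)     ; schemeweb
(display (base-name "a" ".lss")) (newline)             ; a
```

This matters to [[weave]], which calls [[base-name]] on whatever name the user supplies, with or without the [[.lss]] extension.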

String functions

The following functions remove leading and trailing whitespace from strings.

<>=
; string-trim-left s -- remove whitespace from left end of string s
(define (string-trim-left s)
  (cond ((string=? "" s) s)
        ((char-whitespace? (string-ref s 0))
          (string-trim-left (substring s 1 (string-length s))))
        (else s)))

<>=
; string-trim-right s -- remove whitespace from right end of string s
(define (string-trim-right s)
  (cond ((string=? "" s) s)
        ((char-whitespace? (string-ref s (- (string-length s) 1)))
          (string-trim-right (substring s 0 (- (string-length s) 1))))
        (else s)))

<>=
; string-trim s -- remove whitespace from both ends of string s
(define (string-trim s)
  (string-trim-left (string-trim-right s)))

Function [[string-index]] finds the first occurrence of a target string in a search string, starting at a specified position within the search string. It returns the index within the search string where the target string begins, or [[#f]] if the target string does not appear in the designated portion of the search string. The algorithm is simple brute-force search.

<>=
; string-index search target start -- first appearance at or after start or #f
(define (string-index search target start)
  (let ((search-len (string-length search))
        (target-len (string-length target)))
    (let loop ((k start))
      (cond ((< search-len (+ k target-len)) #f)
            ((string=? (substring search k (+ k target-len)) target) k)
            (else (loop (+ k 1)))))))

Function [[string-replace]] examines a target string, replacing all occurrences of a search string with a replacement string; it returns a newly-allocated string containing all the replacements. If the search string is not found within the target string, no replacements are made, and the original target string is returned. Note that the names are used the other way around from [[string-index]]: here [[target]] is the string being examined and [[search]] is the string being looked for.

<>=
; string-replace target search replace
(define (string-replace target search replace)
  (let ((search-len (string-length search))
        (target-len (string-length target)))
    (let loop ((k 0))
      (cond ((< target-len (+ k search-len)) target)
            ((string=? (substring target k (+ k search-len)) search)
              (string-append
                (substring target 0 k)
                replace
                (string-replace (substring target (+ k search-len) target-len)
                                search replace)))
            (else (loop (+ k 1)))))))
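Because the argument order of [[string-index]] is easy to get backwards, a few calls may help. The sketch restates [[string-index]] and the trimming functions from the text; the sample strings are made up for illustration.

```scheme
; Definitions restated from the text so the example runs on its own.
(define (string-trim-left s)
  (cond ((string=? "" s) s)
        ((char-whitespace? (string-ref s 0))
          (string-trim-left (substring s 1 (string-length s))))
        (else s)))

(define (string-trim-right s)
  (cond ((string=? "" s) s)
        ((char-whitespace? (string-ref s (- (string-length s) 1)))
          (string-trim-right (substring s 0 (- (string-length s) 1))))
        (else s)))

(define (string-trim s) (string-trim-left (string-trim-right s)))

(define (string-index search target start)
  (let ((search-len (string-length search))
        (target-len (string-length target)))
    (let loop ((k start))
      (cond ((< search-len (+ k target-len)) #f)
            ((string=? (substring search k (+ k target-len)) target) k)
            (else (loop (+ k 1)))))))

; The string being searched comes first, the string to find second.
(display (string-index "hello world" "o" 0)) (newline) ; 4
(display (string-index "hello world" "o" 5)) (newline) ; 7
(display (string-index "hello" "z" 0)) (newline)       ; #f

(write (string-trim "  hi  ")) (newline)               ; "hi"
```

Some Scheme systems provide their own [[string-index]] and [[string-trim]] with different argument conventions; the definitions above replace them at top level.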

Functions that read ports

SchemeWeb is concerned with chunks that consist of lines. The following two functions read lines and chunks (called paragraphs) from an input port. Note that [[read-line]] is careful to handle lines terminated by a carriage return, a line feed, or both, so it can be used on a variety of operating systems.

<>=
; read-line [port] -- read next line from port, return line or eof-object
(define (read-line . port)
  (define (eat c p)
    (if (and (not (eof-object? (peek-char p)))
             (char=? (peek-char p) c))
        (read-char p)))
  (let ((p (if (null? port) (current-input-port) (car port))))
    (let loop ((c (read-char p)) (line '()))
      (cond ((eof-object? c) (if (null? line) c (list->string (reverse line))))
            ((char=? #\newline c) (eat #\return p) (list->string (reverse line)))
            ((char=? #\return c) (eat #\newline p) (list->string (reverse line)))
            (else (loop (read-char p) (cons c line)))))))

[[Read-par]] calls [[read-line]] to return a chunk as a list of lines. At the end of the input it returns the empty list, which is the [[null?]] chunk that [[build]] and [[weeve]] test for.

<>=
; read-par p -- next maximal set of non-blank lines from p, or '() at end of input
(define (read-par p)
  (define (get-non-blank-line p)
    (let blank ((s (read-line p)))
      (if (and (not (eof-object? s)) (string=? "" s))
          (blank (read-line p))
          s)))
  (let par ((s (get-non-blank-line p)) (ls '()))
    (if (or (eof-object? s) (string=? "" s))
        (reverse ls)
        (par (read-line p) (cons s ls)))))
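Reading from a string port shows how the input is split into paragraphs. The sketch restates [[read-line]] and [[read-par]] from the text; the sample input is made up, and [[open-input-string]] is assumed to be available (it is not part of R5RS, but [[lload]] relies on it as well).

```scheme
; Definitions restated from the text so the example runs on its own.
(define (read-line . port)
  (define (eat c p)
    (if (and (not (eof-object? (peek-char p)))
             (char=? (peek-char p) c))
        (read-char p)))
  (let ((p (if (null? port) (current-input-port) (car port))))
    (let loop ((c (read-char p)) (line '()))
      (cond ((eof-object? c) (if (null? line) c (list->string (reverse line))))
            ((char=? #\newline c) (eat #\return p) (list->string (reverse line)))
            ((char=? #\return c) (eat #\newline p) (list->string (reverse line)))
            (else (loop (read-char p) (cons c line)))))))

(define (read-par p)
  (define (get-non-blank-line p)
    (let blank ((s (read-line p)))
      (if (and (not (eof-object? s)) (string=? "" s))
          (blank (read-line p))
          s)))
  (let par ((s (get-non-blank-line p)) (ls '()))
    (if (or (eof-object? s) (string=? "" s))
        (reverse ls)
        (par (read-line p) (cons s ls)))))

; Hypothetical input: a two-line text paragraph, a blank line, and a
; one-line code paragraph.
(define nl (string #\newline))
(define p
  (open-input-string
    (string-append "Some text" nl "more text" nl nl "(define x 1)" nl)))

(define par1 (read-par p))
(define par2 (read-par p))
(define par3 (read-par p))
(write par1) (newline) ; ("Some text" "more text")
(write par2) (newline) ; ("(define x 1)")
(write par3) (newline) ; ()
```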

[[Lload]]

[[Lload]] calls [[tangle]] to extract the Scheme code from a SchemeWeb literate Scheme source file, then executes a read-eval loop to load the tangled code into the currently-running top-level environment:

; LLOAD FILE-NAME
(define (lload file-name)
  (let* ((ss (tangle file-name))
         (i (open-input-string ss)))
    (let loop ((obj (read i)))
      (if (eof-object? obj)
          (close-input-port i)
          (begin (eval obj (interaction-environment))
                 (loop (read i)))))))
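The read-eval loop can be tried without a source file by reading from a string. In the sketch below, [[load-from-string]] is a hypothetical function that follows the same loop as [[lload]] but takes the code directly; it assumes that [[open-input-string]] is available and that [[eval]] and [[interaction-environment]] behave as described in R5RS.

```scheme
; load-from-string is a hypothetical helper, not part of SchemeWeb.
; It reads one datum at a time and evaluates each in the top-level
; environment, stopping at end of input.
(define (load-from-string str)
  (let ((i (open-input-string str)))
    (let loop ((obj (read i)))
      (if (eof-object? obj)
          (close-input-port i)
          (begin (eval obj (interaction-environment))
                 (loop (read i)))))))

(load-from-string "(define answer 41) (set! answer (+ answer 1))")
(display (eval 'answer (interaction-environment))) (newline)
```

Evaluating one datum at a time means that a definition is in effect before the next expression is read, just as it would be when the tangled file is loaded with [[load]].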

Obtaining and installing SchemeWeb

SchemeWeb is available from [[pbewig.googlepages.com]]. The SchemeWeb literate Scheme source file is available as [[schemeweb.lss]]. The woven version of SchemeWeb is available as [[schemeweb.html]]. The tangled Scheme source code is available as [[schemeweb.ss]]. To install SchemeWeb, obtain [[schemeweb.ss]], copy it to a convenient directory, and say [[(load "schemeweb.ss")]]; this is most usefully done in a standard prelude loaded each time the Scheme system is started.

References

Philip L. Bewig, The Essence of Literate Programming, [[comp.programming.literate]], May 27, 1996. Describes the chunking mechanism as essential, and suggests that plain ascii text, not fancy formatting, is sufficient for literate programmers who view their work from inside a text editor rather than in print.

Donald E. Knuth, The WEB System of Structured Documentation, Computer Science Department Report STAN-CS-83-980, Stanford University, 1983. The original WEB program, in Pascal.

Donald E. Knuth, Computers and Typesetting, Volume B: TeX: The Program, Addison-Wesley Professional, 1986 (ISBN 0201124273). The program for which literate programming was invented.

Donald E. Knuth, Literate Programming, Center for the Study of Language and Information, 1992 (ISBN 0937073814). Reprints many of the early papers by Knuth and others that define and describe literate programming.

Donald E. Knuth and Silvio Levy, The CWEB System of Structured Documentation, Addison-Wesley, 1993 (ISBN 0201575698). Provides user documentation and complete source code for the CWEB literate programming system.

Norman Ramsey, Noweb – A Simple, Extensible Tool for Literate Programming, [[www.eecs.harvard.edu/~nr/noweb/]]. Noweb is a programming-language- and formatter-independent alternative to Knuth's Web and CWEB systems.
