This chapter describes Chez Scheme's generic port facility, operations on ports, and various Chez Scheme extensions to the standard set of input/output operations. See Chapter 7 of The Scheme Programming Language, Third Edition or the Revised^5 Report on Scheme for a description of standard input/output operations. Definitions of a few sample generic ports are given in Section 9.12.
Chez Scheme's "generic port" facility allows the programmer to add new types of ports with arbitrary input/output semantics. It may be used, for example, to define any of the built-in Common Lisp [24] stream types, i.e., synonym streams, broadcast streams, concatenated streams, two-way streams, echo streams, and string streams. It may also be used to define more exotic ports, such as ports that represent windows on a bit-mapped display or ports that represent processes connected to the current process via pipes or sockets.
Each port has an associated port handler. A port handler is a procedure that accepts messages in an object-oriented style. Each message corresponds to one of the low-level Scheme operations on ports, such as read-char and close-input-port (but not read, which is defined in terms of the lower-level operations). Most of these operations simply call the handler immediately with the corresponding message.
Standard messages adhere to the following conventions: the message name is the first argument to the handler. It is always a symbol, and it is always the name of a primitive Scheme operation on ports. The additional arguments are the same as the arguments to the primitive procedure and occur in the same order. (The port argument to some of the primitive procedures is optional; in the case of the messages passed to a handler, the port argument is always supplied.) The following messages are defined for built-in ports:
block-read port string count
block-write port string count
char-ready? port
clear-input-port port
clear-output-port port
close-port port
file-position port
file-position port position
file-length port
flush-output-port port
peek-char port
port-name port
read-char port
unread-char char port
write-char char port
Additional messages may be accepted by user-defined ports.
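The message conventions above can be sketched with a small handler. This is a minimal illustration rather than one of the sample ports of Section 9.12: the names recorded-chars, recording-handler, and rp are hypothetical, and the port is created with make-output-port (described later in this chapter) using an empty output buffer so that every write-char call reaches the handler.

```scheme
;; Hypothetical example: an unbuffered generic output port whose
;; handler records each character written to it.
(define recorded-chars '())

(define recording-handler
  (lambda (msg . args)
    (case msg
      ;; write-char message: the arguments are char and port
      [(write-char)
       (set! recorded-chars (cons (car args) recorded-chars))]
      ;; block-write message: the arguments are port, string, and count
      [(block-write)
       (let ([s (cadr args)] [n (caddr args)])
         (do ([i 0 (+ i 1)])
             ((= i n))
           (set! recorded-chars (cons (string-ref s i) recorded-chars))))]
      ;; nothing is buffered, so there is nothing to flush or clear
      [(flush-output-port clear-output-port) (void)]
      [(close-port) (mark-port-closed! (car args))]
      [(port-name) "recorder"]
      [else (error 'recording-handler "operation ~s not handled" msg)])))

;; The empty string disables output buffering for this port.
(define rp (make-output-port recording-handler ""))

(write-char #\a rp)
(write-char #\b rp)
(list->string (reverse recorded-chars))  ; "ab"
```

The handler dispatches on the message name with case and takes the remaining arguments in the same order as the corresponding primitive, which is why the port is the second argument for write-char but the first for block-write.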
Chez Scheme input and output is normally buffered for efficiency. To support buffering, each input port contains an input buffer and each output port contains an output buffer. Bidirectional ports, ports that are both input ports and output ports, contain both input and output buffers. Input is not buffered if the input buffer is the empty string, and output is not buffered if the output buffer is the empty string. In the case of unbuffered input and output, calls to read-char, write-char, and similar messages cause the handler to be invoked immediately with the corresponding message. For buffered input and output, calls to these procedures cause the buffer to be updated, and the handler is not called under normal circumstances until the buffer becomes empty (for input) or full (for output). Handlers for buffered ports must not count on the buffer being empty or full when read-char, write-char, and similar messages are received, however, due to the possibility that (a) the handler is invoked through some other mechanism, or (b) the call to the handler is interrupted.
In the presence of keyboard, timer, and other interrupts, it is possible for a call to a port handler to be interrupted or for the handler itself to be interrupted. If the port is accessible outside of the interrupted code, there is a possibility that the interrupt handler will cause input or output to be performed on the port. This is one reason, as stated above, that port handlers must not count on the input buffer being empty or output buffer being full when a read-char, write-char, or similar message is received. In addition, port handlers may need to manipulate the buffers only within critical sections (using critical-section).
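Generic ports are created via one of the port construction procedures make-input-port, make-output-port, and make-input/output-port defined later in this chapter. Ports have seven accessible fields:
handler, accessed with port-handler
output-buffer, accessed with port-output-buffer
output-size, accessed with port-output-size
output-index, accessed with port-output-index
input-buffer, accessed with port-input-buffer
input-size, accessed with port-input-size
input-index, accessed with port-input-index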
The output-size and output-index fields are valid only for output ports, and the input-size and input-index fields are valid only for input ports. The output and input size and index fields may be updated as well using the corresponding "set-field!" procedure.
A port's output size determines how much of the port's output buffer is actually available for writing by write-char. The output size is often the same as the string length of the port's output buffer, but it can be set to less (but no less than zero) at the discretion of the programmer. The output index determines to which position in the port's buffer the next character will be written. The output index should be between 0 and the output size, inclusive. If no output has occurred since the buffer was last flushed, the output index should be 0. If the index is less than the size, write-char stores its character argument into the specified character position within the buffer and increments the index. If the index is equal to the size, write-char leaves the fields of the port unchanged and invokes the handler.
A port's input size determines how much of the port's input buffer is actually available for reading by read-char. A port's input size and input index are constrained in the same manner as output size and index, i.e., the input size must be between 0 and the string length of the input buffer (inclusive), and the input index must be between 0 and the input size (inclusive). Often, the input size is less than the length of the input buffer because there are fewer characters available to read than would fit in the buffer. The input index determines from which position in the input buffer the next character will be read. If the index is less than the size, read-char extracts the character in this position, increments the index, and returns the character. If the index is equal to the size, read-char leaves the fields of the port unchanged and invokes the handler.
The operation of peek-char is similar to that of read-char, except that it does not increment the input index. unread-char decrements the input index if it is greater than 0, otherwise it invokes the handler. char-ready? returns #t if the input index is less than the input size, otherwise it invokes the handler.
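The effect of read-char, peek-char, and unread-char on the input index can be observed directly. The sketch below assumes a hypothetical generic input port named preloaded whose buffer initially holds "abc"; its handler is reached only after the buffer is exhausted and simply reports end of file.

```scheme
;; Hypothetical example: a generic input port whose buffer is preloaded
;; with "abc".  The handler is invoked only once the input index
;; reaches the input size.
(define preloaded
  (make-input-port
    (lambda (msg . args)
      (case msg
        [(read-char peek-char) (eof-object)]
        [(char-ready?) #t]
        [(close-port) (mark-port-closed! (car args))]
        [else (error 'preloaded "operation ~s not handled" msg)]))
    (string-copy "abc")))

(port-input-size preloaded)    ; 3, the length of the buffer
(port-input-index preloaded)   ; 0
(read-char preloaded)          ; #\a, and the index is incremented
(port-input-index preloaded)   ; 1
(peek-char preloaded)          ; #\b, and the index is unchanged
(port-input-index preloaded)   ; 1
(unread-char #\a preloaded)    ; the index is greater than 0, so it is decremented
(port-input-index preloaded)   ; 0
```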
Although the fields shown and discussed above are logically present in a port, actual implementation details may differ. The current Chez Scheme implementation uses a different representation that allows read-char, write-char, and similar operations to be open-coded with minimal overhead. The access and assignment operators perform the conversion between the actual representation and the one shown above.
Port handlers receiving a message must return a value appropriate for the corresponding operation. For example, a handler receiving a read-char message must return a character or eof object (if it returns). For operations that return unspecified values, such as close-port, the handler is not required to return any particular value.
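The buffering and interrupt considerations discussed above can be illustrated with a buffered output port. The following is a minimal sketch under stated assumptions, not the implementation of any built-in port: make-forwarding-port and drain! are hypothetical names, the port simply forwards its output to another output port op, and the handler never assumes the buffer is full when a message arrives.

```scheme
;; Hypothetical example: a buffered generic output port that forwards
;; its output to another output port op.
(define make-forwarding-port
  (lambda (op buffer-size)
    ;; Send any buffered characters to op and reset the output index.
    ;; The buffer is manipulated only inside a critical section.
    (define drain!
      (lambda (p)
        (critical-section
          (block-write op (port-output-buffer p) (port-output-index p))
          (set-port-output-index! p 0))))
    (define handler
      (lambda (msg . args)
        (case msg
          [(write-char)                 ; arguments: char port
           ;; the buffer need not be full here; drain whatever it holds
           (drain! (cadr args))
           (write-char (car args) op)]
          [(block-write)                ; arguments: port string count
           (drain! (car args))
           (block-write op (cadr args) (caddr args))]
          [(flush-output-port)
           (drain! (car args))
           (flush-output-port op)]
          [(clear-output-port)
           (critical-section (set-port-output-index! (car args) 0))]
          [(close-port)
           (let ([p (car args)])
             (drain! p)
             (set-port-output-size! p 0)
             (mark-port-closed! p))]
          [(port-name) "forwarder"]
          [else (error 'forwarding-port "operation ~s not handled" msg)])))
    (make-output-port handler (make-string buffer-size))))

;; Usage: forward to a string port through a four-character buffer.
(let* ([sp (open-output-string)]
       [fp (make-forwarding-port sp 4)])
  (display "hello, world" fp)
  (flush-output-port fp)
  (get-output-string sp))
```

Draining before forwarding keeps characters in order whether the handler is reached because the buffer filled or through some other path, such as an explicit flush-output-port or block-write.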
The procedures used to create, access, and alter ports directly are described in this section. Also described are several nonstandard operations on ports.
Unless otherwise specified, procedures requiring either input ports or output ports as arguments accept input/output ports as well, i.e., an input/output port is both an input port and an output port.
procedure: (make-input-port handler input-buffer)
procedure: (make-output-port handler output-buffer)
procedure: (make-input/output-port handler input-buffer output-buffer)
returns: a new port
handler must be a procedure or nonnegative fixnum (see below), and input-buffer and output-buffer must be strings. Each procedure creates a generic port. The handler associated with the port is handler, the input buffer is input-buffer, and the output buffer is output-buffer. For make-input-port, the output buffer is undefined, and for make-output-port, the input buffer is undefined.
The input size of an input or input/output port is initialized to the string length of the input buffer, and the input index is set to 0. The output size and index of an output or input/output port are initialized similarly.
The length of an input or output buffer may be zero, in which case buffering is effectively disabled.
If handler is a nonnegative fixnum, it is assumed to be a file descriptor obtained by some operating-system specific means, and the handler established for the new port sends output to or draws input from the file descriptor. The buffers in this case must not be empty, and for output or input/output ports, the output size must be explicitly set to something less than the length of the output buffer, since the established handler assumes it has one additional buffer location beyond the output size. The code below demonstrates the use of a fixnum handler to create an output port for the Unix standard-error file descriptor, which by convention is file descriptor 2:
(define stderr
  (let ((buffer-size 1))
    (let ((p (make-output-port 2 (make-string buffer-size))))
      (set-port-output-size! p (- buffer-size 1))
      p)))
A buffer size of one in this case effectively disables buffering.
procedure: (port-handler port)
returns: a procedure
For generic ports, port-handler returns the handler passed to one of the generic port creation procedures described above. For ports created by open-input-file and similar procedures, port-handler returns an internal handler that may be invoked in the same manner as any other handler.
procedure: (port-input-buffer input-port)
procedure: (port-input-size input-port)
procedure: (port-input-index input-port)
returns: see below
These procedures return the input buffer, size, or index of input-port.
procedure: (set-port-input-size! input-port n)
procedure: (set-port-input-index! input-port n)
returns: unspecified
The procedure set-port-input-index! sets the input index field of input-port to n, which must be a nonnegative integer less than or equal to the port's input size.
The procedure set-port-input-size! sets the input size field of input-port to n, which must be a nonnegative integer less than or equal to the string length of the port's input buffer.
The procedure set-port-input-size! also sets the input index to 0.
procedure: (port-output-buffer output-port)
procedure: (port-output-size output-port)
procedure: (port-output-index output-port)
returns: see below
These procedures return the output buffer, size, or index of output-port.
procedure: (set-port-output-size! output-port n)
procedure: (set-port-output-index! output-port n)
returns: unspecified
The procedure set-port-output-index! sets the output index field of output-port to n, which must be a nonnegative integer less than or equal to the port's output size.
The procedure set-port-output-size! sets the output size field of output-port to n, which must be a nonnegative integer less than or equal to the string length of the port's output buffer.
The procedure set-port-output-size! also sets the output index to 0.
procedure: (mark-port-closed! port)
returns: unspecified
This procedure directly marks the port closed so that no further input or output operations are allowed on it. It is typically used by handlers upon receipt of a close-port message.
procedure: (set-port-bol! output-port obj)
returns: unspecified
When obj is #f, the port's beginning-of-line (BOL) flag is cleared; otherwise, the port's BOL flag is set.
The BOL flag is consulted by fresh-line (page 183) to determine if it needs to emit a newline. This flag is maintained automatically for file output ports, string output ports, and transcript ports. The flag is set for newly created file and string output ports, except for file output ports created with the append option, for which the flag is reset. The BOL flag is clear for newly created generic ports and never set automatically, but may be set explicitly using set-port-bol!. The port is always flushed immediately before the flag is consulted, so it need not be maintained on a per-character basis for buffered ports.
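Because the flag is maintained automatically for string output ports, its interaction with fresh-line can be sketched without a generic port. The example below relies only on procedures described in this chapter plus fresh-line, and collects the output with with-output-to-string.

```scheme
(with-output-to-string
  (lambda ()
    (fresh-line)        ; flag is set on a new string port: nothing emitted
    (display "abc")
    (fresh-line)        ; not at the beginning of a line: newline emitted
    (display "def")
    (newline)
    (fresh-line)))      ; already at the beginning of a line: nothing emitted
;; the resulting string is "abc\ndef\n"
```

For a generic port, a handler would need to call set-port-bol! itself to obtain the same behavior, since the flag is never set automatically for such ports.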
procedure: (port? obj)
returns: #t if obj is a port, #f otherwise
(port? "hi there") ⇒ #f
(port? (current-input-port)) ⇒ #t
(port? (current-output-port)) ⇒ #t
procedure: (close-port port)
returns: unspecified
The procedure close-port is equivalent to close-input-port on an input or input/output port, and to close-output-port on an output or input/output port.
Chez Scheme closes ports automatically after they become inaccessible to the program or when the Scheme program exits, but it is best to close ports explicitly whenever possible.
procedure: (port-closed? port)
returns: #t if port is closed, #f otherwise
(let ([p (open-output-string)])
  (port-closed? p)) ⇒ #f
(let ([p (open-output-string)])
  (close-port p)
  (port-closed? p)) ⇒ #t
procedure: (port-name port)
returns: the name associated with port
The name may be a string or #f (denoting no name). For file ports, the name is typically a string naming the file.
(let ([p (open-input-file "myfile.ss")])
  (port-name p)) ⇒ "myfile.ss"
(let ([p (open-output-string)])
  (port-name p)) ⇒ "string"
procedure: (file-length port)
returns: the length of the file to which port refers
The length is measured in bytes from the start of the file. If the file length cannot be determined, 0 is returned.
procedure: (file-position port)
procedure: (file-position port n)
returns: see below
When the second argument is omitted, file-position returns the position of the port in the file to which the port refers, measured in bytes from the start of the file. For input ports, this is the position from which the next character will be read, and for output ports, this is the position to which the next character will be written. If the file position cannot be determined, the most negative fixnum is returned.
When passed two arguments, file-position sets the file position of port to n. Normally, the maximum allowable value for the second argument to file-position is the value returned by file-length, although some operating systems automatically extend output files when the value of the second argument exceeds this amount.
For compressed files opened with the compressed flag, file-position returns the position in the uncompressed stream of data. Changing the position of a compressed input file opened with the compressed flag generally requires rewinding and rereading the file and might thus be slow. The position of a compressed output file opened with the compressed flag can be moved forward only; this is accomplished by writing a (compressed) sequence of zeros.
procedure: (clear-input-port)
procedure: (clear-input-port input-port)
returns: unspecified
If input-port is not supplied, it defaults to the current input port. This procedure discards any characters in the buffer associated with input-port. This may be necessary, for example, to clear any type-ahead from the keyboard in preparation for an urgent query.
procedure: (clear-output-port)
procedure: (clear-output-port output-port)
returns: unspecified
If output-port is not supplied, it defaults to the current output port. This procedure discards any characters in the buffer associated with output-port. This may be necessary, for example, to clear any pending output on an interactive port in preparation for an urgent message.
procedure: (flush-output-port)
procedure: (flush-output-port output-port)
returns: unspecified
If output-port is not supplied, it defaults to the current output port. This procedure forces any characters in the buffer associated with output-port to be printed immediately. The console output port is automatically flushed after a newline and before input from the console input port; all ports are automatically flushed when they are closed. flush-output-port may be necessary, however, to force a message without a newline to be sent to the console output port or to force output to appear on a file without delay.
String ports allow the creation and manipulation of strings via port operations. The procedure open-input-string converts a string into an input port, allowing the characters in the string to be read in sequence via input operations such as read-char or read. The procedure open-output-string allows new strings to be built up with output operations such as write-char and write.
While string ports could be defined as generic ports, they are instead supported as primitives by the implementation.
procedure: (open-input-string string)
returns: a new string input port
A string input port is similar to a file input port, except that characters and objects drawn from the port come from string rather than from a file.
A string port is at "end of file" when the port reaches the end of the string. It is not necessary to close a string port, although it is okay to do so.
(let ([p (open-input-string "hi mom!")])
  (let ([x (read p)])
    (list x (read p)))) ⇒ (hi mom!)
procedure: (with-input-from-string string thunk)
returns: the value returned by thunk
with-input-from-string parameterizes the current input port to be the result of opening string for input during the application of thunk.
(with-input-from-string "(cons 3 4)"
  (lambda ()
    (eval (read)))) ⇒ (3 . 4)
procedure: (open-output-string)
returns: a new string output port
A string output port is similar to a file output port, except that characters and objects written to the port are placed in a string (which grows as needed) rather than to a file.
The string built by writing to a string output port may be obtained with get-output-string. See the example given for get-output-string below. It is not necessary to close a string port, although it is okay to do so.
procedure: (get-output-string string-output-port)
returns: the string associated with string-output-port
As a side effect, get-output-string resets string-output-port so that subsequent output to string-output-port is placed into a fresh string.
(let ([p (open-output-string)])
  (write 'hi p)
  (write-char #\space p)
  (write 'mom! p)
  (get-output-string p)) ⇒ "hi mom!"
An implementation of format (Section 9.8) might be written using string-output ports to produce string output.
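As a small step in that direction, the sketch below uses a string output port to obtain the printed representation of an object, and then shows the reset behavior of get-output-string. The name write-to-string is hypothetical and is not part of Chez Scheme.

```scheme
;; Hypothetical helper: the printed representation of obj, as a string.
(define write-to-string
  (lambda (obj)
    (let ([p (open-output-string)])
      (write obj p)
      (get-output-string p))))

(write-to-string '(a "b" #\c))   ; "(a \"b\" #\\c)"

;; get-output-string resets the port, so the two strings are independent.
(let ([p (open-output-string)])
  (write 'first p)
  (let ([a (get-output-string p)])
    (write 'second p)
    (list a (get-output-string p))))   ; ("first" "second")
```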
procedure: (with-output-to-string thunk)
returns: a string containing the output
with-output-to-string parameterizes the current output port to a new string output port during the application of thunk. If thunk returns, the string associated with the new string output port is returned, as with get-output-string.
(with-output-to-string
  (lambda ()
    (display "Once upon a time ...")
    (newline))) ⇒ "Once upon a time ...\n"
procedure: (eof-object)
returns: the eof object
(eof-object? (eof-object)) ⇒ #t
parameter: console-input-port
console-input-port is a parameter that determines the input port used by the waiter and interactive debugger. When called with no arguments, it returns the console input port. When called with an input port argument, it changes the value of the console input port.
parameter: current-input-port
current-input-port is a parameter that determines the default port argument for most input procedures, including read-char, peek-char, and read. When called with no arguments, current-input-port returns the current input port. When called with an input port argument, it changes the value of the current input port. The standard Scheme version of current-input-port accepts only zero arguments, i.e., it cannot be used to change the current input port.
procedure: (open-input-file filename)
procedure: (open-input-file filename options)
returns: a new input port
filename must be a string. open-input-file opens an input port for the file named by filename. An error is signaled if the file does not exist or cannot be opened for input.
options, if present, is a symbolic option name or option list. Possible symbolic option names are compressed, uncompressed, buffered, and unbuffered. An option list is a list containing zero or more symbolic option names.
The mutually exclusive compressed and uncompressed options determine whether the input file should be decompressed if it is compressed. (See open-output-file.) The default is uncompressed, so the uncompressed option is useful only as documentation.
The mutually exclusive buffered and unbuffered options determine whether input is buffered. When input is buffered, it is read in large blocks and buffered internally for efficiency to reduce the number of operating system requests. When the unbuffered option is specified, input is unbuffered, but not fully, since one character of buffering is required to support peek-char and unread-char. Input is buffered by default, so the buffered option is useful only as documentation.
For example, the call
(open-input-file "frob" '(compressed))
opens the file frob with decompression enabled.
The standard Scheme version of open-input-file does not support the optional options argument.
procedure: (call-with-input-file filename proc)
procedure: (call-with-input-file filename proc options)
returns: the result of invoking proc
filename must be a string. proc must be a procedure of one argument.
call-with-input-file creates a new input port for the file named by filename and passes this port to proc. An error is signaled if the file does not exist or cannot be opened for input. If proc returns, call-with-input-file closes the input port and returns the value returned by proc.
call-with-input-file does not automatically close the input port if a continuation created outside of proc is invoked, since it is possible that another continuation created inside of proc will be invoked at a later time, returning control to proc. If proc does not return, an implementation is free to close the input port only if it can prove that the input port is no longer accessible. As shown in Section 5.5 of The Scheme Programming Language, Third Edition, dynamic-wind may be used to ensure that the port is closed if a continuation created outside of proc is invoked.
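The dynamic-wind approach mentioned above might look like the following sketch. The name call-with-input-file/close is hypothetical; the procedure closes the port whenever control leaves proc, whether by a normal return or by invoking a continuation created outside of proc.

```scheme
;; Hypothetical variant of call-with-input-file that always closes
;; the port when control leaves proc.
(define call-with-input-file/close
  (lambda (filename proc)
    (let ([p (open-input-file filename)])
      (dynamic-wind
        (lambda () #f)                      ; nothing to do on entry
        (lambda () (proc p))                ; run proc with the open port
        (lambda () (close-input-port p)))))) ; close on any exit
```

The trade-off is the one described above: if a continuation created inside of proc is later invoked to return control to proc, the port will already be closed.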
See open-input-file above for a description of the optional options argument.
The standard Scheme version of call-with-input-file does not support the optional options argument.
procedure: (with-input-from-file filename thunk)
procedure: (with-input-from-file filename thunk options)
returns: the value returned by thunk
filename must be a string.
with-input-from-file temporarily changes the current input port to be the result of opening the file named by filename for input during the application of thunk. If thunk returns, the port is closed and the current input port is restored to its old value.
The behavior of with-input-from-file is unspecified if a continuation created outside of thunk is invoked before thunk returns. An implementation may close the port and restore the current input port to its old value, but it is not required to do so.
See open-input-file above for a description of the optional options argument.
The standard Scheme version of with-input-from-file does not support the optional options argument.
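Since with-input-from-file rebinds the current input port, procedures such as read can be called without a port argument inside thunk. The sketch below uses a hypothetical helper, read-all-from-file, that collects every datum in a file into a list.

```scheme
;; Hypothetical helper: read every datum in the named file.
(define read-all-from-file
  (lambda (filename)
    (with-input-from-file filename
      (lambda ()
        (let loop ([acc '()])
          (let ([x (read)])     ; reads from the rebound current input port
            (if (eof-object? x)
                (reverse acc)
                (loop (cons x acc)))))))))
```

For a file containing the text 1 (2 3) foo, a call such as (read-all-from-file "data.ss"), where data.ss is a placeholder file name, would return the list (1 (2 3) foo).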
procedure: (unread-char char)
procedure: (unread-char char input-port)
returns: unspecified
If input-port is not supplied, it defaults to the current input port. unread-char "unreads" the last character read from input-port. char may or may not be ignored, depending upon the implementation. In any case, it is an error for char not to be the last character read from the port. It is also an error to call unread-char twice on the same port without an intervening call to read-char.
unread-char is provided for applications requiring one character of lookahead and may be used in place of, or even in combination with, peek-char. One character of lookahead is required in the procedure read-word, which is defined below in terms of unread-char. read-word returns the next word from an input port as a string, where a word is defined to be a sequence of alphabetic characters. Since it does not know until it reads one character too many that it has read the entire word, read-word uses unread-char to return the character to the input port.
(define read-word
  (lambda (p)
    (list->string
      (let f ([c (read-char p)])
        (cond
          [(eof-object? c) '()]
          [(char-alphabetic? c)
           (cons c (f (read-char p)))]
          [else
           (unread-char c p)
           '()])))))
In the alternate version below, peek-char is used instead of unread-char.
(define read-word
  (lambda (p)
    (list->string
      (let f ([c (peek-char p)])
        (cond
          [(eof-object? c) '()]
          [(char-alphabetic? c)
           (read-char p)
           (cons c (f (peek-char p)))]
          [else '()])))))
The advantage of unread-char in this situation is that only one call to unread-char per word is required, whereas one call to peek-char is required for each character in the word plus the first character beyond. In many cases, unread-char does not enjoy this advantage, and peek-char should be used instead.
procedure: (block-read input-port string count)
returns: see below
count must be a nonnegative fixnum less than or equal to the length of string.
If input-port is at end-of-file, an eof object is returned. Otherwise, string is filled with as many characters as are available for reading from input-port up to count, and the number of characters placed in the string is returned.
If input-port is buffered and the buffer is nonempty, the buffered input or a portion thereof is returned; otherwise block-read bypasses the buffer entirely.
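Because block-read may deliver fewer characters than requested, callers that need a specific number of characters typically call it in a loop. The following is a minimal sketch of that pattern; read-up-to is a hypothetical helper, and it accumulates the pieces in a string output port.

```scheme
;; Hypothetical helper: read up to n characters from ip, calling
;; block-read repeatedly because a single call may deliver fewer
;; characters than requested.
(define read-up-to
  (lambda (ip n)
    (let ([out (open-output-string)]
          [buf (make-string n)])
      (let loop ([remaining n])
        (when (> remaining 0)
          (let ([k (block-read ip buf remaining)])
            ;; stop at end of file, or if no progress was made
            (unless (or (eof-object? k) (= k 0))
              (display (substring buf 0 k) out)
              (loop (- remaining k))))))
      (get-output-string out))))

(read-up-to (open-input-string "hello world") 5)   ; "hello"
(read-up-to (open-input-string "hi") 5)            ; "hi"
```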
procedure: (read-token)
procedure: (read-token input-port)
returns: see below
Parsing of a Scheme datum is conceptually performed in two steps. First, the characters that form the datum are grouped into tokens, such as symbols, numbers, left parentheses, and double quotes. During this first step, whitespace and comments are discarded. Second, these tokens are grouped into data.
read performs both of these steps and creates an internal representation of each datum it parses. read-token may be used to perform the first step only, one token at a time. read-token is intended to be used by editors and program formatters that must be able to parse a program or datum without actually reading it.
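If input-port is not supplied, it defaults to the current input port. One token is read from the input port and returned as four values:
type: a symbol describing the type of token read,
value: the token value,
start: the position of the first character of the token, relative to the starting position of the input port, and
end: the first position beyond the token, relative to the starting position of the input port.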
When the token type fully specifies the token, read-token returns #f for the value. The token types are listed below with the corresponding value in parentheses.
The set of token types is likely to change in future releases of the system; check the release notes for details on such changes.
The input port is left pointing to the first character position beyond the token, i.e., end characters from the starting position.
> (read-token)(
lparen
#f
0
1
> (read-token) abc
atomic
abc
1
4
> (read-token (open-input-string ""))
eof
#!eof
0
0
> (define s (open-input-string "#7=#7#"))
> (read-token s)
mark
7
0
3
> (read-token s)
insert
7
3
6
The information read-token returns is not always sufficient for reconstituting the exact sequence of characters that make up a token. For example, 1.0 and 1e0 both return type atomic with value 1.0. The exact sequence of characters may be obtained only by repositioning the port and reading a block of characters of the appropriate length, using the relative positions given by start and end.
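Since read-token returns multiple values, a program typically receives them with call-with-values. The sketch below assumes a hypothetical helper, token-list, that collects the four values for each token in a string and stops when a token of type eof is returned.

```scheme
;; Hypothetical helper: list the (type value start end) of every token
;; in str, stopping at the eof token.
(define token-list
  (lambda (str)
    (let ([p (open-input-string str)])
      (let loop ([acc '()])
        (call-with-values
          (lambda () (read-token p))
          (lambda (type value start end)
            (if (eq? type 'eof)
                (reverse acc)
                (loop (cons (list type value start end) acc)))))))))

(token-list "(a 1)")
```

Consistent with the transcript above, the first entry for this input describes a token of type lparen with value #f; the remaining entries depend on the set of token types supported by the release in use.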
parameter: console-output-port
console-output-port is a parameter that determines the output port used by the waiter and interactive debugger. When called with no arguments, it returns the console output port. When called with an output port argument, it changes the value of the console output port.
parameter: current-output-port
current-output-port is a parameter that determines the default port argument for most output procedures, including write-char, newline, write, display, and pretty-print. When called with no arguments, current-output-port returns the current output port. When called with an output port argument, it changes the value of the current output port. The standard Scheme version of current-output-port accepts only zero arguments, i.e., it cannot be used to change the current output port.
procedure: (open-output-file filename)
procedure: (open-output-file filename options)
returns: a new output port
filename must be a string. open-output-file opens an output port for the file named by filename.
options, if present, is a symbolic option name or option list. Possible symbolic option names are error, truncate, replace, append, compressed, uncompressed, buffered, unbuffered, exclusive, and nonexclusive. An option list is a list containing zero or more symbolic option names and possibly the two-element option mode mode.
The mutually exclusive error, truncate, replace, and append options are used to direct what happens when the file to be opened already exists.
The mutually exclusive compressed and uncompressed options determine whether the output file is to be compressed. Compression is performed with the use of the zlib compression library developed by Jean-loup Gailly and Mark Adler. It is therefore compatible with the gzip program, which means that gzip may be used to uncompress files produced by Chez Scheme and vice versa. Files are uncompressed by default, so the uncompressed option is useful only as documentation.
The mutually exclusive buffered and unbuffered options determine whether output is buffered. Unbuffered output is sent immediately to the file, whereas buffered output is not written until the port's output buffer is filled or the port is flushed (via flush-output-port) or closed (via close-output-port or by the storage management system when the port becomes inaccessible). Output is buffered by default for efficiency, so the buffered option is useful only as documentation.
The mutually exclusive exclusive and nonexclusive options determine whether access to the file is "exclusive." When the exclusive option is specified, the file is locked until the port is closed to prevent access by other processes. On some systems the lock is advisory, i.e., it inhibits access by other processes only if they also attempt to open exclusively. Nonexclusive access is the default, so the nonexclusive option is useful only as documentation.
The mode option determines the permission bits on Unix systems when the file is created by the operation, subject to the process umask. The subsequent element in the options list must be an exact integer specifying the permissions in the manner of the Unix open function. The mode option is ignored under Windows.
For example, the call
(open-output-file "frob" '(compressed truncate mode #o644))
opens the file frob with compression enabled. If frob already exists, it is truncated. On Unix-based systems, if frob does not already exist, the permission bits on the newly created file are set to the logical and of #o644 and the complement of the process's umask.
The standard Scheme version of open-output-file does not support the optional options argument.
procedure: (call-with-output-file filename proc)
procedure: (call-with-output-file filename proc options)
returns: the result of invoking proc
filename must be a string. proc must be a procedure of one argument.
call-with-output-file creates a new output port for the file named by filename and passes this port to proc. An error is signaled if the file cannot be opened for output. If proc returns, call-with-output-file closes the output port and returns the value returned by proc.
call-with-output-file does not automatically close the output port if a continuation created outside of proc is invoked, since it is possible that another continuation created inside of proc will be invoked at a later time, returning control to proc. If proc does not return, an implementation is free to close the output port only if it can prove that the output port is no longer accessible. As shown in Section 5.5 of The Scheme Programming Language, Third Edition, dynamic-wind may be used to ensure that the port is closed if a continuation created outside of proc is invoked.
See open-output-file above for a description of the optional options argument.
The standard Scheme version of call-with-output-file does not support the optional options argument.
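A typical use of call-with-output-file with an option is sketched below. The helper write-data-to-file is hypothetical; it writes each element of a list on its own line and passes the replace option so that an existing file does not cause an error.

```scheme
;; Hypothetical helper: write each datum in data to the named file,
;; one per line, replacing the file if it already exists.
(define write-data-to-file
  (lambda (filename data)
    (call-with-output-file filename
      (lambda (p)
        (for-each
          (lambda (x) (write x p) (newline p))
          data))
      'replace)))
```

Because proc returns normally here, the port is closed, and hence flushed, before write-data-to-file returns.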
procedure: (with-output-to-file filename thunk)
procedure: (with-output-to-file filename thunk options)
returns: the value returned by thunk
filename must be a string. with-output-to-file temporarily rebinds the current output port to be the result of opening the file named by filename for output during the application of thunk. If thunk returns, the port is closed and the current output port is restored to its old value.
The behavior of with-output-to-file is unspecified if a continuation created outside of thunk is invoked before thunk returns. An implementation may close the port and restore the current output port to its old value, but it may not.
See open-output-file above for a description of the optional options argument.
The standard Scheme version of with-output-to-file does not support the optional options argument.
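As a brief illustration, the following sketch redirects ordinary output operations to a file for the duration of a thunk. The file name is hypothetical, and the truncate option is passed as a single symbolic option name, as described for open-output-file.

```scheme
;; "report.out" is a hypothetical file name.
(with-output-to-file "report.out"
  (lambda ()
    ;; display, write, and newline use the rebound current output port
    (display "total ")
    (write (+ 1 2))
    (newline))
  'truncate)
```

After the call returns, the current output port is restored to its old value, and subsequent output no longer goes to the file.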
procedure: (display-string string)
procedure: (display-string string output-port)
returns: unspecified
display-string writes the characters contained within string to output-port or to the current output port if output-port is not specified. The enclosing string quotes are not printed, and special characters within the string are not escaped. display-string is a more efficient alternative to display for displaying strings.
procedure: (block-write output-port string count)
returns: unspecified
count must be a nonnegative fixnum less than or equal to the length of string.
block-write writes the first count characters of string to output-port. If output-port is buffered and the buffer is nonempty, the buffer is flushed before the contents of string are written. In any case, the contents of string are written immediately, without passing through the buffer.
procedure: (truncate-file output-port)
procedure: (truncate-file output-port pos)
returns: unspecified
pos must be an exact nonnegative integer. It defaults to 0.
truncate-file truncates the file associated with output-port to pos and repositions the port to that position. On some operating systems, pos may be beyond the current contents of the file, in which case the file is extended.
procedure: (fresh-line)
procedure: (fresh-line output-port)
returns: unspecified
If output-port is not supplied, it defaults to the current output port.
This procedure behaves like newline, i.e., sends a newline character to output-port, unless it can determine that the port is already positioned at the start of a line. It does this by flushing the port and consulting the "beginning-of-line" (BOL) flag associated with the port. (See page 171.)
procedure: (open-input-output-file filename)
procedure: (open-input-output-file filename options)
returns: a new input-output port
filename must be a string. open-input-output-file opens an input-output port for the file named by filename.
The port may be used to read from or write to the named file. The file is created if it does not already exist.
options, if present, is a symbolic option name or option list. Possible symbolic option names are buffered, unbuffered, exclusive, and nonexclusive. An option list is a list containing zero or more symbolic option names and possibly the two-element option mode mode. See the description of open-output-file for an explanation of these options.
Input/output files are usually closed using close-port but may also be closed with either close-input-port or close-output-port.
The pretty printer is a version of the write procedure that produces more human-readable output via introduced whitespace, i.e., line breaks and indentation. The pretty printer is the default printer used by the read-eval-print loop (waiter) to print the output(s) of each evaluated form. The pretty printer may also be invoked explicitly by calling the procedure pretty-print.
The pretty printer's operation can be controlled in three ways: via the pretty-format procedure described later in this section, which allows the programmer to specify how specific forms are to be printed; via various pretty-printer controls, also described later in this section; and via the generic input/output controls described in Section 9.9.
procedure: (pretty-print obj)
procedure: (pretty-print obj output-port)
returns: unspecified
If output-port is not supplied, it defaults to the current output port.
pretty-print is similar to write except that it uses any number of spaces and newlines in order to print obj in a style that is pleasing to look at and which shows the nesting level via indentation. For example,
(pretty-print '(define factorial (lambda (n) (let fact ((i n) (a 1))
  (if (= i 0) a (fact (- i 1) (* a i)))))))
might produce
(define factorial
  (lambda (n)
    (let fact ([i n] [a 1])
      (if (= i 0) a (fact (- i 1) (* a i))))))
procedure: (pretty-file ifn ofn)
returns: unspecified
ifn and ofn must be strings. pretty-file reads each object in turn from the file named by ifn and pretty prints the object to the file named by ofn. Comments present in the input are discarded by the reader and so do not appear in the output file. If the file named by ofn already exists, it is replaced.
procedure: (pretty-format sym)
returns: see below
procedure: (pretty-format sym fmt)
returns: unspecified
By default, the pretty printer uses a generic algorithm for printing each form. This procedure is used to override this default and guide the pretty printer's treatment of specific forms. The symbol sym names a syntactic form or procedure. With just one argument, pretty-format returns the current format associated with sym, or #f if no format is associated with sym.
In the two-argument case, the format fmt is associated with sym for future invocations of the pretty printer. fmt must be in the formatting language described below.
fmt       -> (quote symbol)
           | var
           | symbol
           | (read-macro string symbol)
           | (meta)
           | (bracket . fmt-tail)
           | (alt fmt fmt*)
           | fmt-tail
fmt-tail  -> ()
           | (tab fmt ...)
           | (fmt tab ...)
           | (tab fmt . fmt-tail)
           | (fmt ...)
           | (fmt . fmt-tail)
           | (fill tab fmt ...)
tab       -> int
           | #f
Some of the format forms are used for matching when there are multiple alternatives, while others are used for matching and to control indentation or printing. A description of each fmt is given below.
Indentation of list-structured forms is determined via the fmt-tail specifier used in the last two cases above. A description of each fmt-tail is given below.
A tab determines the amount by which a list subform is indented. If tab is a nonnegative exact integer int, the subform is indented int spaces in from the character position just after the opening parenthesis or bracket of the parent form. If tab is #f, the standard indentation is used. The standard indentation can be determined or changed via the parameter pretty-standard-indent, which is described later in this section.
In cases where a format is given that doesn't quite match, the pretty printer tries to use the given format as far as it can. For example, if a format matches a list-structured form with a specific number of subforms, but more or fewer subforms are given, the pretty printer will discard or replicate subform formats as necessary.
Here is an example showing how the formatting of let might be specified.
(pretty-format 'let
  '(alt (let ([bracket var x] 0 ...) #f e #f e ...)
        (let var ([bracket var x] 0 ...) #f e #f e ...)))
Since let comes in two forms, named and unnamed, two alternatives are specified. In either case, the bracket fmt is used to enclose the bindings in square brackets, with all bindings after the first appearing just below the first (and just after the enclosing opening parenthesis), if they don't all fit on one line. Each body form is indented by the standard indentation.
parameter: pretty-line-length
parameter: pretty-one-line-limit
The value of each of these parameters must be a positive fixnum.
The parameters pretty-line-length and pretty-one-line-limit control the output produced by pretty-print. pretty-line-length determines after which character position (starting from the first) on a line the pretty printer attempts to cut off output. This is a soft limit only; if necessary, the pretty printer will go beyond pretty-line-length.
pretty-one-line-limit is similar to pretty-line-length, except that it is relative to the first nonblank position on each line of output. It is also a soft limit.
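The effect of pretty-line-length can be observed by printing the same form at two widths. In the sketch below, pretty-string is a hypothetical helper that captures pretty-printed output in a string; the exact line breaks depend on the pretty printer and are not shown.

```scheme
;; pretty-string is a hypothetical helper, not part of Chez Scheme.
;; It pretty prints obj to a string with the given soft line length.
(define pretty-string
  (lambda (obj width)
    (let ([p (open-output-string)])
      (parameterize ([pretty-line-length width])
        (pretty-print obj p))
      (get-output-string p))))

(define sample-form
  '(define factorial
     (lambda (n)
       (let fact ([i n] [a 1])
         (if (= i 0) a (fact (- i 1) (* a i)))))))

(display (pretty-string sample-form 75))   ; wider lines
(display (pretty-string sample-form 30))   ; narrower lines
```

Because the limit is soft, the narrower setting is a request rather than a guarantee; individual lines may still exceed it.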
parameter: pretty-initial-indent
The value of this parameter must be a nonnegative fixnum.
The parameter pretty-initial-indent is used to tell pretty-print where on an output line it has been called. If pretty-initial-indent is zero (the default), pretty-print assumes that the first line of output it produces will start at the beginning of the line. If set to a nonzero value n, pretty-print assumes that the first line will appear at character position n and will adjust its printing of subsequent lines.
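One plausible use is printing an object after a label on the same line. The following sketch assumes a hypothetical helper named print-labeled, which tells pretty-print how many characters have already been written on the line.

```scheme
;; print-labeled is a hypothetical helper for illustration.
(define print-labeled
  (lambda (label obj)
    (display label)
    ;; the first line of output starts just after the label
    (parameterize ([pretty-initial-indent (string-length label)])
      (pretty-print obj))))

(print-labeled "result: "
  '(let ([x 1] [y 2]) (list x y (+ x y))))
```

If the object needs more than one line, the continuation lines are positioned relative to the column at which the first line began rather than relative to column zero.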
parameter: pretty-standard-indent
The value of this parameter must be a nonnegative fixnum.
This determines the amount by which pretty-print indents subexpressions of most forms, such as let expressions, from the form's keyword or first subexpression.
parameter: pretty-maximum-lines
The parameter pretty-maximum-lines controls how many lines pretty-print prints when it is called. If set to #f (the default), no limit is imposed; if set to a nonnegative fixnum n, at most n lines are printed.
procedure: (format format-string obj ...)
procedure: (format #f format-string obj ...)
procedure: (format #t format-string obj ...)
procedure: (format output-port format-string obj ...)
returns: see below
When the first argument to format is a string or #f (first and second forms above), format constructs an output string from format-string and the objects obj .... Characters are copied from format-string to the output string from left to right, until format-string is exhausted. The format string may contain one or more format directives, which are multi-character sequences prefixed by a tilde ( ~ ). Each directive is replaced by some other text, often involving one or more of the obj ... arguments, as determined by the semantics of the directive.
When the first argument is #t, output is sent to the current output port instead, as with printf. When the first argument is a port, output is sent to that port, as with fprintf. printf and fprintf are described later in this section.
Chez Scheme's implementation of format supports all of the Common Lisp [24] format directives except for those specific to the Common Lisp pretty printer. Please consult a Common Lisp reference or the Common Lisp Hyperspec for complete documentation. A few of the most useful directives are described below.
Absent any format directives, format simply displays its string argument.
(format "hi there") => "hi there"
The ~s directive is replaced by the printed representation of the next obj, which may be any object, in machine-readable format, as with write.
(format "hi ~s" 'mom) => "hi mom"
(format "hi ~s" "mom") => "hi \"mom\""
(format "hi ~s~s" 'mom #\!) => "hi mom#\\!"
The general form of a ~s directive is actually ~mincol,colinc,minpad,padchars, and the s can be preceded by an at sign ( @ ) modifier. These additional parameters are used to control padding in the output, with at least minpad copies of padchar plus an integer multiple of colinc copies of padchar to make the total width, including the written object, mincol characters wide. The padding is placed on the left if the @ modifier is present, otherwise on the right. mincol and minpad default to 0, colinc defaults to 1, and padchar defaults to space. If specified, padchar is prefixed by a single quote mark.
(format "~10s" 'hello) => "hello     "
(format "~10@s" 'hello) => "     hello"
(format "~10,,,'*@s" 'hello) => "*****hello"
The ~a directive is similar, but prints the object as with display.
(format "hi ~s~s" "mom" #\!) => "hi \"mom\"#\\!"
(format "hi ~a~a" "mom" #\!) => "hi mom!"
A tilde may be inserted into the output with ~~, and a newline may be inserted with ~% (or embedded in the string with \n).
(format "~~line one,~%line two.~~") => "~line one,\nline two.~"
(format "~~line one,\nline two.~~") => "~line one,\nline two.~"
Real numbers may be printed in floating-point notation with ~f.
(format "~f" 3.14159) => "3.14159"
Exact numbers may be printed as well as inexact numbers in this manner; they are simply converted to inexact first as if with exact->inexact.
(format "~f" 1/3) => "0.3333333333333333"
The general form is actually ~w,d,k,overflowchar,padcharf. If specified, w determines the overall width of the output, and d the number of digits to the right of the decimal point. padchar, which defaults to space, is the pad character used if padding is needed. Padding is always inserted on the left. The number is scaled by 10^k when printed; k defaults to zero. The entire w-character field is filled with copies of overflowchar if overflowchar is specified and the number cannot be printed in w characters. If an @ modifier is present, a plus sign is printed before the number for nonnegative inputs; otherwise, a sign is printed only if the number is negative.
(format "~,3f" 3.14159) => "3.142"
(format "~10f" 3.14159) => "   3.14159"
(format "~10,,,'#f" 1e20) => "##########"
Real numbers may also be printed with ~e for scientific notation or with ~g, which uses either floating-point or scientific notation based on the size of the input.
(format "~e" 1e23) => "1.0e+23"
(format "~g" 1e23) => "1.0e+23"
A real number may also be printed with ~$, which uses monetary notation defaulting to two digits to the right of the decimal point.
(format "$~$" (* 39.95 1.06)) => "$42.35"
(format "~$USD" 1/3) => "0.33USD"
Words can be pluralized automatically using ~p.
(format "~s bear~:p in ~s den~:p" 10 1) => "10 bears in 1 den"
Numbers may be printed out in words or roman numerals using variations on ~r.
(format "~r" 2599) => "two thousand five hundred ninety-nine"
(format "~:r" 99) => "ninety-ninth"
(format "~@r" 2599) => "MMDXCIX"
Case conversions can be performed by bracketing a portion of the format string with the ~@( and ~) directives.
(format "~@(~r~)" 2599) => "Two thousand five hundred ninety-nine"
(format "~@:(~a~)" "Ouch!") => "OUCH!"
Some of the directives shown above have more options and parameters, and there are other directives as well, including directives for conditionals, iteration, indirection, and justification. Again, please consult a Common Lisp reference for complete documentation.
An implementation of a greatly simplified version of format appears in Section 9.6 of The Scheme Programming Language, Third Edition.
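As a small taste of the iteration and conditional directives mentioned above, the following sketch relies on the Common Lisp semantics of ~{ ... ~}, ~^, and ~[ ... ~]. The helper names are hypothetical.

```scheme
;; comma-list and describe-count are hypothetical helpers.

;; ~{ ... ~} iterates over a list argument; ~^ exits the iteration
;; when no elements remain, so no trailing separator is produced.
(define comma-list
  (lambda (ls)
    (format "~{~a~^, ~}" ls)))

;; ~[ ... ~] selects a clause by integer index; ~:; introduces the
;; default clause used when the index is out of range.
(define describe-count
  (lambda (n)
    (format "~[none~;one~;two~:;many~]" n)))

(comma-list '(1 2 3))    ; => "1, 2, 3"
(describe-count 1)       ; => "one"
(describe-count 7)       ; => "many"
```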
procedure: (printf format-string obj ...)
procedure: (fprintf output-port format-string obj ...)
returns: unspecified
These procedures are simple wrappers for format. printf prints the formatted output to the current output port, as with a first argument of #t to format, and fprintf prints the formatted output to the output-port, as when the first argument to format is a port.
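The relationship among the three procedures can be checked with a string port. This is a minimal sketch; the format string and arguments are arbitrary.

```scheme
;; Send formatted output to a string port with fprintf, then compare
;; it with the string that format builds from the same arguments.
(define fprintf-result
  (let ([p (open-output-string)])
    (fprintf p "~a has ~s item~:p" "cart" 3)
    (get-output-string p)))

(define format-result
  (format "~a has ~s item~:p" "cart" 3))

(string=? fprintf-result format-result)   ; => #t

;; These two calls write the same text to the current output port.
(printf "~a has ~s item~:p~%" "cart" 3)
(format #t "~a has ~s item~:p~%" "cart" 3)
```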
The I/O control operations described in this section are used to control how the reader reads and printer writes, displays, or pretty-prints characters, symbols, gensyms, numbers, vectors, long or deeply nested lists or vectors, and graph-structured objects.
procedure: (char-name obj)
returns: see below
procedure: (char-name name char)
returns: unspecified
char-name is used to associate names (symbols) with characters or to retrieve the most recently associated name or character for a given character or name. A name can map to only one character, but more than one name can map to the same character. The name most recently associated with a character determines how that character prints, and each name associated with a character may be used after the #\ character prefix to name that character on input.
In the one-argument form, obj must be a symbol or character. If it is a symbol and a character is associated with the symbol, char-name returns that character. If it is a symbol and no character is associated with the symbol, char-name returns #f. Similarly, if obj is a character, char-name returns the most recently associated symbol for the character or #f if no name is associated with the character. For example, with the default set of character names:
(char-name #\space) => space
(char-name 'space) => #\space
(char-name 'nochar) => #f
(char-name #\a) => #f
When passed two arguments, name is added to the set of names associated with char, and any other association for name is dropped. char may be #f, in which case any other association for name is dropped and no new association is formed. In either case, any other names associated with char remain associated with char.
The following interactive session demonstrates the use of char-name to establish and remove associations between characters and names, including the association of more than one name with a character.
> (char-name 'etx)
#f
> (char-name 'etx #\003)
> (char-name 'etx)
#\etx
> (char-name #\003)
etx
> #\etx
#\etx
> (eq? #\etx #\003)
#t
> (char-name 'etx #\space)
> (char-name #\003)
#f
> (char-name 'etx)
#\etx
> #\space
#\etx
> (char-name 'etx #f)
> #\etx
Error in read: invalid character name #\etx.
> #\space
#\space
The case-sensitive parameter determines whether or not the reader and printer are case-sensitive with respect to symbol names. When set to false (the default, as required by the Scheme standard) the case of alphabetic characters within symbol names is insignificant. When set to true, case is significant.
> (case-sensitive #f)
> 'ABC
abc
> (eq? 'abc 'ABC)
#t
> (case-sensitive #t)
> (eq? 'abc 'ABC)
#f
> 'ABC
ABC
When print-graph is set to a nonfalse value, write and pretty-print locate and print objects with shared structure, including cycles, in a notation that may be read subsequently with read. This notation employs the syntax "#n=obj," where n is a nonnegative integer and obj is the printed representation of an object, to label the first occurrence of obj in the output. The syntax "#n#" is used to refer to the object labeled by n thereafter in the output. print-graph is set to #f by default.
If graph printing is not enabled, the settings of print-length and print-level are insufficient to force finite output, and write or pretty-print detects a cycle in an object it is given to print, a warning is issued and the object is printed as if print-graph were enabled.
Since objects printed through the ~s option in the format control strings of format, printf, and fprintf are printed as with write, the printing of such objects is also affected by print-graph.
(parameterize ([print-graph #t])
  (let ([x (list 'a 'b)])
    (format "~s" (list x x)))) => "(#0=(a b) #0#)"
(parameterize ([print-graph #t])
  (let ([x (list 'a 'b)])
    (set-car! x x)
    (set-cdr! x x)
    (format "~s" x))) => "#0=(#0# . #0#)"
The graph syntax is understood by the procedure read, allowing graph structures to be printed and read consistently.
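A round trip through the printer and reader illustrates this. In the sketch below, graph-round-trip is a hypothetical helper that writes an object with print-graph enabled and reads it back from the resulting string.

```scheme
;; graph-round-trip is a hypothetical helper for illustration.
(define graph-round-trip
  (lambda (x)
    (let ([s (parameterize ([print-graph #t])
               (format "~s" x))])
      (read (open-input-string s)))))

(let* ([x (list 'a 'b)]
       [y (graph-round-trip (list x x))])
  ;; the two elements of the copy are the same object, as in the original
  (eq? (car y) (cadr y)))   ; => #t
```

Without print-graph, the same list would print as ((a b) (a b)), and the reader would construct two distinct but equal sublists.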
parameter: print-level
parameter: print-length
These parameters can be used to limit the extent to which nested or multiple-element structures are printed. When called without arguments, print-level returns the current print level and print-length returns the current print length. When called with one argument, which must be a nonnegative fixnum or #f, print-level sets the current print level and print-length sets the current print length to the argument.
When print-level is set to a nonnegative integer n, the procedures write and pretty-print traverse only n levels deep into nested structures. If a structure being printed exceeds n levels of nesting, the substructure beyond that point is replaced in the output by an ellipsis ( ... ). print-level is set to #f by default, which places no limit on the number of levels printed.
When print-length is set to a nonnegative integer n, the procedures write and pretty-print print only n elements of a list or vector, replacing the remainder of the list or vector with an ellipsis ( ... ). print-length is set to #f by default, which places no limit on the number of elements printed.
Since objects printed through the ~s option in the format control strings of format, printf, and fprintf are printed as with write, the printing of such objects is also affected by print-level and print-length.
The parameters print-level and print-length are useful for controlling the volume of output in contexts where only a small portion of the output is needed to identify the object being printed. They are also useful in situations where circular structures may be printed (see also print-graph).
(format "~s" '((((a) b) c) d e f g)) => "((((a) b) c) d e f g)"
(parameterize ([print-level 2])
  (format "~s" '((((a) b) c) d e f g))) => "(((...) c) d e f g)"
(parameterize ([print-length 3])
  (format "~s" '((((a) b) c) d e f g))) => "((((a) b) c) d e ...)"
(parameterize ([print-level 2]
               [print-length 3])
  (format "~s" '((((a) b) c) d e f g))) => "(((...) c) d e ...)"
The print-radix parameter determines the radix in which numbers are printed by write, pretty-print, and display. Its value should be an integer between 2 and 36, inclusive. Its default value is 10.
When the value of print-radix is not 10, write and pretty-print print a radix prefix before the number (#b for radix 2, #o for radix 8, #x for radix 16, and #nr for any other radix n).
Since objects printed through the ~s and ~a options in the format control strings of format, printf, and fprintf are printed as with write and display, the printing of such objects is also affected by print-radix.
(format "~s" 11242957) => "11242957"
(parameterize ([print-radix 16])
  (format "~s" 11242957)) => "#xAB8DCD"
(parameterize ([print-radix 16])
  (format "~a" 11242957)) => "AB8DCD"
When print-gensym is set to #t (the default) or any other true value except for pretty, gensyms are printed with an extended symbol syntax that includes both the pretty name and the unique name of the gensym: #{pretty-name unique-name}. When set to pretty, the pretty name only is shown, with the prefix #:. When set to #f, the pretty name only is shown, with no prefix.
Since objects printed through the ~s option in the format control strings of format, printf, error, etc., are printed as with write, the printing of such objects is also affected by print-gensym.
When printing an object that may contain more than one occurrence of a gensym and print-gensym is set to pretty, it is useful to set print-graph to #t so that multiple occurrences of the same gensym are marked as identical in the output.
(let ([g (gensym)])
  (format "~s" g)) => "#{g0 bdids2xl6v49vgwe-a}"
(let ([g (gensym)])
  (parameterize ([print-gensym 'pretty])
    (format "~s" g))) => "#:g1"
(let ([g (gensym)])
  (parameterize ([print-gensym #f])
    (format "~s" g))) => "g2"
(let ([g (gensym)])
  (parameterize ([print-graph #t] [print-gensym 'pretty])
    (format "~s" (list g g)))) => "(#0=#:g3 #0#)"
When print-brackets is set to a true value, the pretty printer (see pretty-print) uses square brackets rather than parentheses around certain subexpressions of common control structures, e.g., around let bindings and cond clauses. print-brackets is set to #t by default.
(let ([p (open-output-string)])
  (pretty-print '(let ([x 3]) x) p)
  (get-output-string p)) => "(let ([x 3]) x)\n"
(parameterize ([print-brackets #f])
  (let ([p (open-output-string)])
    (pretty-print '(let ([x 3]) x) p)
    (get-output-string p))) => "(let ((x 3)) x)\n"
parameter: print-vector-length
When print-vector-length is set to a true value, write and pretty-print include the length for all vectors between the "#" and open parenthesis and all fxvectors between the "#vfx" and open parenthesis. This parameter is set to #t by default.
When print-vector-length is set to a true value, write and pretty-print also suppress duplicated trailing elements in the vector or fxvector to reduce the amount of output. This form is also recognized by the reader.
Since objects printed through the ~s option in the format control strings of format, printf, and fprintf are printed as with write, the printing of such objects is also affected by the setting of print-vector-length.
(format "~s" (vector 'a 'b 'c)) => "#3(a b c)"
(format "~s" (vector 'a 'b 'c 'c 'c)) => "#5(a b c)"
(format "~s" (fxvector 1 2 3 4 4 4)) => "#vfx6(1 2 3 4)"
(parameterize ([print-vector-length #f])
  (format "~s" (vector 'a 'b 'c 'c 'c))) => "#(a b c c c)"
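Since the reader recognizes the length-prefixed form, the abbreviated output can be read back to recover the full vector. A minimal sketch, using a string port as the input source:

```scheme
;; Read a vector written with a length prefix and suppressed
;; trailing duplicates; the reader fills in the missing elements
;; by repeating the last one given.
(define v (read (open-input-string "#5(a b c)")))

(vector-length v)    ; => 5
(vector-ref v 4)     ; => c
```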
The procedures write and pretty-print print objects in a human readable format. For objects with external datum representations (see Chapter 15.5), the output produced by write and pretty-print is also machine-readable with read. Objects with external datum representations include pairs, symbols, vectors, strings, numbers, characters, booleans, and records but not procedures and ports.
An alternative fast loading, or fasl, format may be used for objects with external datum representations. The fasl format is not human readable, but it is machine readable and both more compact and more quickly processed by read. This format is always used for compiled code generated by compile-file, but it may also be used for data that needs to be written and read quickly, such as small databases encoded with Scheme data structures.
Objects are printed in fasl format with fasl-write. Objects written in fasl format and objects written in the standard human-readable format may be included within the same file; the fasl format for any object begins with the prefix #@, allowing the reader to recognize fasl format objects in the input. Since the reader recognizes and handles objects written in fasl format, no special procedures are needed for reading objects written in fasl format.
Fasl reading is supported from input ports created by open-input-string, open-input-file, or open-input-output-file only, i.e., not from arbitrary generic ports.
procedure: (fasl-write obj)
procedure: (fasl-write obj output-port)
returns: unspecified
If output-port is not supplied, it defaults to the current output port. fasl-write writes the fasl prefix #@ to the output port followed by the fasl representation for obj. An error is signaled if obj or any portion of obj has no external representation as a datum.
> (define op (open-output-string))
> (fasl-write '(a b c) op)
> (write '(1 2 3) op)
> (define ip (open-input-string (get-output-string op)))
> (read ip)
(a b c)
> (read ip)
(1 2 3)
procedure: (fasl-file ifn ofn)
returns: unspecified
ifn and ofn must be strings. fasl-file may be used to convert a file in human-readable format or mixed human-readable and fasl formats into an equivalent file written in fasl format. fasl-file reads each object in turn from the file named by ifn and writes the fasl format for the object onto the file named by ofn. If the file named by ofn already exists, it is replaced.
This section describes operations on files, directories, and pathnames.
parameter: current-directory
parameter: cd
When invoked without arguments, current-directory returns a string representing the current working directory. Otherwise, the current working directory is changed to the directory specified by the argument, which must be a string representing a valid directory pathname.
cd is bound to the same parameter.
procedure: (directory-list pathname)
returns: a list of file names
pathname must be a string. The return value is a list of strings representing the names of files found in the directory named by pathname.
procedure: (file-exists? pathname)
procedure: (file-exists? pathname follow?)
returns: #t if the file named by pathname exists, #f otherwise
pathname must be a string. If the optional follow? argument is true (the default), file-exists? follows symbolic links; otherwise it does not. Thus, file-exists? will return #f when handed the pathname of a broken symbolic link unless follow? is provided and is #f.
procedure: (file-regular? pathname)
procedure: (file-regular? pathname follow?)
returns: #t if the file named by pathname is a regular file, #f otherwise
pathname must be a string. If the optional follow? argument is true (the default), file-regular? follows symbolic links; otherwise it does not.
procedure: (file-directory? pathname)
procedure: (file-directory? pathname follow?)
returns: #t if the file named by pathname is a directory, #f otherwise
pathname must be a string. If the optional follow? argument is true (the default), file-directory? follows symbolic links; otherwise it does not.
procedure: (file-symbolic-link? pathname)
returns: #t if the file named by pathname is a symbolic link, #f otherwise
pathname must be a string. file-symbolic-link? never follows symbolic links in making its determination.
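The predicates above can be combined to classify a pathname. The following is a sketch; classify-path is a hypothetical helper, and the symbols it returns are arbitrary labels chosen for illustration.

```scheme
;; classify-path is a hypothetical helper for illustration.
(define classify-path
  (lambda (pathname)
    (cond
      ;; test for a link first, since the other predicates
      ;; follow symbolic links by default
      [(file-symbolic-link? pathname) 'symbolic-link]
      [(file-directory? pathname) 'directory]
      [(file-regular? pathname) 'regular]
      [(file-exists? pathname) 'other]
      [else 'missing])))

(classify-path ".")   ; => directory
```

Checking file-symbolic-link? first means that a link to a directory is reported as a link rather than as a directory; reversing the order would report the kind of the link's target instead.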
procedure: (mkdir pathname)
procedure: (mkdir pathname mode)
returns: unspecified
pathname must be a string, and mode must be a fixnum.
mkdir creates a directory with the name given by pathname. All pathname path components leading up to the last must already exist. If the optional mode argument is present, it overrides the default permissions for the new directory. Under Windows, the mode argument is ignored.
procedure: (delete-file pathname)
procedure: (delete-file pathname error?)
returns: see below
pathname must be a string. delete-file removes the file named by pathname. If the optional error? argument is false (the default), delete-file returns a boolean value: #t if the operation is successful and #f if it is not. Otherwise, delete-file returns an unspecified value if the operation is successful and signals an error if it is not.
procedure: (delete-directory pathname)
procedure: (delete-directory pathname error?)
returns: see below
pathname must be a string. delete-directory removes the directory named by pathname. If the optional error? argument is false (the default), delete-directory returns a boolean value: #t if the operation is successful and #f if it is not. Otherwise, delete-directory returns an unspecified value if the operation is successful and signals an error if it is not.
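The boolean results of the default (non-error) forms make simple cleanup code straightforward. The sketch below uses hypothetical directory and file names and assumes that neither exists beforehand.

```scheme
;; scratch-dir names a hypothetical directory that is assumed
;; not to exist before this code runs.
(define scratch-dir "scratch-dir-example")

(mkdir scratch-dir)
(define made? (file-directory? scratch-dir))     ; #t once created

;; With error? omitted, failure is reported by a #f return value
;; rather than by an error.
(define removed? (delete-directory scratch-dir))
(define missing-deleted? (delete-file "no-such-file-example"))
```

Passing a true error? argument instead would cause the last call to signal an error, which may be preferable when a missing file indicates a bug.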
procedure: (chmod pathname mode)
returns: unspecified
pathname must be a string, and mode must be a fixnum.
chmod sets the permissions on the file named by pathname to mode. Under Windows, chmod does nothing.
procedure: (directory-separator? char)
returns: #t if char is a directory separator, #f otherwise
The character #\/ is a directory separator on all current machine types, and #\\ is a directory separator under Windows.
procedure: (directory-separator)
returns: the preferred directory separator
The preferred directory separator is #\\ for Windows and #\/ for other systems.
procedure: (path-extension pathname)
procedure: (path-root pathname)
procedure: (path-last pathname)
procedure: (path-parent pathname)
returns: the specified component of pathname
pathname must be a string. The return value is also a (possibly empty) string. The path extension component is the portion of pathname that follows the last dot (period) in the path name. The path root component is the portion of pathname that does not include the extension, if any, or the dot that precedes it. The path last component is the portion of pathname that appears after the last directory separator, if any. The path parent component is the portion of pathname that does not include the path last component, if any, or the directory separator that precedes it.
(define pn "../a/b.c/d.e.f")
(path-extension pn) => "f"
(path-root pn) => "../a/b.c/d.e"
(path-last pn) => "d.e.f"
(path-parent pn) => "../a/b.c"
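These components can be recombined to derive related pathnames. In the following sketch, replace-extension is a hypothetical helper built from path-root; it assumes that the pathname has an extension to replace.

```scheme
;; replace-extension is a hypothetical helper for illustration.
;; It swaps the extension of pathname for ext (given without a dot).
(define replace-extension
  (lambda (pathname ext)
    (string-append (path-root pathname) "." ext)))

(replace-extension "../a/b.c/d.e.f" "txt")   ; => "../a/b.c/d.e.txt"

;; The parent and last components, joined by a separator,
;; give back a pathname for the same file.
(define example-path "../a/b.c/d.e.f")
(string-append (path-parent example-path)
               (string (directory-separator))
               (path-last example-path))
```

On systems where the preferred separator is #\/, the last expression reproduces the original string exactly.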
This section presents the definitions for three types of generic ports: two-way ports, transcript ports, and process ports.
Two-way ports. The first example defines make-two-way-port, which constructs an input/output port from a given pair of input and output ports. For example:
(define ip (open-input-string "this is the input"))
(define op (open-output-string))
(define p (make-two-way-port ip op))
The port returned by make-two-way-port is both an input and an output port:
(port? p) => #t
(input-port? p) => #t
(output-port? p) => #t
Items read from a two-way port come from the constituent input port, and items written to a two-way port go to the constituent output port:
(read p) => this
(write 'hello p)
(get-output-string op) => "hello"
The definition of make-two-way-port is straightforward. To keep the example simple, no local buffering is performed, although it would be more efficient to do so.
(define make-two-way-port
  (lambda (ip op)
    (define handler
      (lambda (msg . args)
        (record-case (cons msg args)
          [block-read (p s n) (block-read ip s n)]
          [block-write (p s n) (block-write op s n)]
          [char-ready? (p) (char-ready? ip)]
          [clear-input-port (p) (clear-input-port ip)]
          [clear-output-port (p) (clear-output-port op)]
          [close-port (p) (mark-port-closed! p)]
          [flush-output-port (p) (flush-output-port op)]
          [file-position (p . pos) (apply file-position ip pos)]
          [file-length (p) (file-length ip)]
          [peek-char (p) (peek-char ip)]
          [port-name (p) "two-way"]
          [read-char (p) (read-char ip)]
          [unread-char (c p) (unread-char c ip)]
          [write-char (c p) (write-char c op)]
          [else (error 'two-way-port
                       "operation ~s not handled"
                       msg)])))
    (make-input/output-port handler "" "")))
Most of the messages are passed directly to one of the constituent ports. The exceptions are close-port, which is handled directly by marking the port closed, and port-name, which is also handled directly. file-position and file-length are rather arbitrarily passed off to the input port.
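Because a handler is an ordinary procedure that receives the message name followed by the arguments of the operation, a handler can be wrapped like any other procedure. The following sketch is not one of the original examples; it wraps an arbitrary handler so that each message name is recorded before the message is passed along. The names make-tracing-handler, toy-handler, and the extra trace-log message are all hypothetical and are introduced only for illustration.

```scheme
; Wrap a handler so that every message it receives is remembered.
; trace-log is a hypothetical extra message answered by the wrapper
; itself; all other messages are forwarded unchanged.
(define make-tracing-handler
  (lambda (handler)
    (let ([seen '()])                     ; most recent message first
      (lambda (msg . args)
        (if (eq? msg 'trace-log)
            (reverse seen)
            (begin
              (set! seen (cons msg seen))
              (apply handler msg args)))))))

; A stand-in handler used only to exercise the wrapper.  A real
; handler would be one such as the handler in make-two-way-port.
(define toy-handler
  (lambda (msg . args)
    (case msg
      [(port-name) "toy"]
      [(file-length) 0]
      [else (error 'toy-handler "operation ~s not handled" msg)])))

(define traced (make-tracing-handler toy-handler))
(traced 'port-name #f)    ; "toy"
(traced 'file-length #f)  ; 0
(traced 'trace-log)       ; (port-name file-length)
```

The stand-in handler ignores its port argument, so #f is passed in its place here; a handler installed in a real port always receives the port.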
Transcript ports. The next example defines make-transcript-port, which constructs an input/output port from three ports: an input port ip and two output ports, op and tp. Input read from a transcript port comes from ip, and output written to a transcript port goes to op. In this manner, transcript ports are similar to two-way ports. Unlike two-way ports, however, both the input read from ip and the output written to op are also written to tp, so that tp reflects both input from ip and output to op.
Transcript ports may be used to define the Scheme procedures transcript-on and transcript-off, or the Chez Scheme procedure transcript-cafe. For example, here is a definition of transcript-cafe:
(define transcript-cafe
  (lambda (pathname)
    (let ([tp (open-output-file pathname 'replace)])
      (let ([p (make-transcript-port
                 (console-input-port)
                 (console-output-port)
                 tp)])
        ; set both console and current ports so that
        ; the waiter and read/write will be in sync
        (parameterize ([console-input-port p]
                       [console-output-port p]
                       [current-input-port p]
                       [current-output-port p])
          (let-values ([vals (new-cafe)])
            (close-port p)
            (close-port tp)
            (apply values vals)))))))
The implementation of transcript ports is significantly more complex than the implementation of two-way ports defined above, primarily because it buffers input and output locally. Local buffering is needed to allow the transcript file to reflect accurately the actual input and output performed in the presence of unread-char, clear-output-port, and clear-input-port. Here is the code:
(define make-transcript-port
  (lambda (ip op tp)
    (define (handler msg . args)
      (record-case (cons msg args)
        [block-read (p str cnt)
         (critical-section
           (let ([b (port-input-buffer p)]
                 [i (port-input-index p)]
                 [s (port-input-size p)])
             (if (< i s)
                 (let ([cnt (fxmin cnt (fx- s i))])
                   (do ([i i (fx+ i 1)]
                        [j 0 (fx+ j 1)])
                       ((fx= j cnt)
                        (set-port-input-index! p i)
                        cnt)
                     (string-set! str j (string-ref b i))))
                 (let ([cnt (block-read ip str cnt)])
                   (unless (eof-object? cnt)
                     (block-write tp str cnt))
                   cnt))))]
        [char-ready? (p)
         (or (< (port-input-index p) (port-input-size p))
             (char-ready? ip))]
        [clear-input-port (p)
         ; set size to zero rather than index to size
         ; in order to invalidate unread-char
         (set-port-input-size! p 0)]
        [clear-output-port (p)
         (set-port-output-index! p 0)]
        [close-port (p)
         (flush-output-port p)
         (set-port-output-size! p 0)
         (set-port-input-size! p 0)
         (mark-port-closed! p)]
        [file-position (p . pos)
         (if (null? pos)
             (most-negative-fixnum)
             (error 'transcript-port "cannot reposition"))]
        [flush-output-port (p)
         (critical-section
           (let ([b (port-output-buffer p)]
                 [i (port-output-index p)])
             (unless (fx= i 0)
               (block-write op b i)
               (block-write tp b i)
               (set-port-output-index! p 0)
               (set-port-bol! p (char=? (string-ref b (fx- i 1)) #\newline))))
           (flush-output-port op)
           (flush-output-port tp))]
        [peek-char (p)
         (critical-section
           (let ([b (port-input-buffer p)]
                 [i (port-input-index p)]
                 [s (port-input-size p)])
             (if (fx< i s)
                 (string-ref b i)
                 (begin
                   (flush-output-port p)
                   (let ([s (block-read ip b)])
                     (if (eof-object? s)
                         s
                         (begin
                           (block-write tp b s)
                           (set-port-input-size! p s)
                           (string-ref b 0))))))))]
        [port-name (p) "transcript"]
        [constituent-ports (p) (values ip op tp)]
        [read-char (p)
         (critical-section
           (let ([c (peek-char p)])
             (unless (eof-object? c)
               (set-port-input-index! p
                 (fx+ (port-input-index p) 1)))
             c))]
        [unread-char (c p)
         (critical-section
           (let ([b (port-input-buffer p)]
                 [i (port-input-index p)]
                 [s (port-input-size p)])
             (when (fx= i 0)
               (error 'unread-char
                      "tried to unread too far on ~s"
                      p))
             (set-port-input-index! p (fx- i 1))
             ; following could be skipped; it's supposed
             ; to be the same character anyway
             (string-set! b (fx- i 1) c)))]
        [write-char (c p)
         (critical-section
           (let ([b (port-output-buffer p)]
                 [i (port-output-index p)]
                 [s (port-output-size p)])
             (string-set! b i c)
             ; could check here to be sure that we really
             ; need to flush; we may end up here even if
             ; the buffer isn't full
             (block-write op b (fx+ i 1))
             (block-write tp b (fx+ i 1))
             (set-port-output-index! p 0)
             (set-port-bol! p (char=? c #\newline))))]
        [block-write (p str cnt)
         (critical-section
           ; flush buffered data
           (let ([b (port-output-buffer p)]
                 [i (port-output-index p)])
             (unless (fx= i 0)
               (block-write op b i)
               (block-write tp b i)
               (set-port-output-index! p 0)
               (set-port-bol! p (char=? (string-ref b (fx- i 1)) #\newline))))
           ; write new data
           (unless (fx= cnt 0)
             (block-write op str cnt)
             (block-write tp str cnt)
             (set-port-bol! p (char=? (string-ref str (fx- cnt 1)) #\newline))))]
        [else (error 'transcript-port
                     "operation ~s not handled"
                     msg)]))
    (let ([ib (make-string 1024)] [ob (make-string 1024)])
      (let ([p (make-input/output-port handler ib ob)])
        (set-port-input-size! p 0)
        (set-port-output-size! p (fx- (string-length ob) 1))
        p))))
The input and output buffers are given the same length, but this is not necessary. They could have different lengths, or one could be buffered locally and the other not buffered locally. Local buffering could be disabled effectively by providing zero-length buffers.
After we create the port, the input size is set to zero since there is not yet any data to be read. The port output size is set to one less than the length of the buffer. This is done so that write-char always has one character position left over into which to write its character argument. Although this is not necessary, it does simplify the code somewhat while allowing the buffer to be flushed as soon as the last character is available.
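The effect of reserving one buffer position can be seen in a toy model that involves no real port. The sketch below assumes that characters are stored directly into the buffer while the index is less than the size and that the write-char message is sent only once the index has reached the size; every name in it (make-toy-writer and the flushed message) is hypothetical.

```scheme
; A toy model of the output-buffer discipline; nothing here touches
; a real port.  The buffer has buffer-length positions, but size is
; one less, so one position is always held in reserve.
(define make-toy-writer
  (lambda (buffer-length)
    (let ([buf (make-string buffer-length)]
          [size (- buffer-length 1)]
          [index 0]
          [chunks '()])                   ; flushed strings, newest first
      (lambda (msg . args)
        (case msg
          [(write-char)
           (let ([c (car args)])
             ; index never exceeds size, so this store is always in range
             (string-set! buf index c)
             (if (< index size)
                 ; fast path: room remains below size, nothing is flushed
                 (set! index (+ index 1))
                 ; "handler" path: c landed in the reserved position, so
                 ; the whole buffer, including c, is flushed at once
                 (begin
                   (set! chunks (cons (substring buf 0 (+ index 1)) chunks))
                   (set! index 0))))]
          [(flushed) (reverse chunks)]
          [else (error 'toy-writer "operation ~s not handled" msg)])))))

(define w (make-toy-writer 4))
(for-each (lambda (c) (w 'write-char c)) (string->list "abcdefghij"))
(w 'flushed) ; ("abcd" "efgh"), with "ij" still buffered
```

With a four-character buffer, each flush carries exactly four characters: three stored on the fast path plus the one that triggered the flush.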
Block reads and writes are performed on the constituent ports for efficiency and (in the case of writes) to ensure that the operations are performed immediately.
The call to flush-output-port in the handling of peek-char, and therefore of read-char, which obtains its character via peek-char, ensures that all output written to op appears before input is read from ip. Since block-read is typically used to support higher-level operations that are performing their own buffering, or for direct input and output in support of I/O-intensive applications, the flush call has been omitted from that part of the handler.
Critical sections are used whenever the handler manipulates one of the buffers, to protect against untimely interrupts that could lead to reentry into the handler. The critical sections are unnecessary if no such reentry is possible, i.e., if only one "thread" of the computation can have access to the port.
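The same pattern applies to any state that must be read and updated as a unit. The following minimal sketch uses critical-section to guard a shared list in the way the handler guards its buffers; the names pending, enqueue!, and drain! are hypothetical.

```scheme
; A list shared by two procedures; an interrupt arriving between the
; read and the reset in drain! could otherwise lose or duplicate items.
(define pending '())

(define enqueue!
  (lambda (x)
    (critical-section
      (set! pending (cons x pending)))))

; Take everything queued so far and reset the list as a single step
; with respect to interrupts.
(define drain!
  (lambda ()
    (critical-section
      (let ([items (reverse pending)])
        (set! pending '())
        items))))

(enqueue! 'a)
(enqueue! 'b)
(drain!) ; (a b)
```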
Process ports. The final example demonstrates how to incorporate the socket interface defined in Section 4.8 into a generic port that allows transparent communication with subprocesses via normal Scheme input/output operations.
A process port is created with open-process, which accepts a shell command as a string. open-process sets up a socket, forks a child process, sets up two-way communication via the socket, and invokes the command in a subprocess.
The sample session below demonstrates the use of open-process, running and communicating with another Scheme process started with the "-q" switch to suppress the greeting and prompts.
> (define p (open-process "exec scheme -q"))
> (define s (make-string 1000 #\nul))
> (pretty-print '(+ 3 4) p)
> (read p)
7
> (pretty-print '(define (f x) (if (= x 0) 1 (* x (f (- x 1))))) p)
> (pretty-print '(f 10) p)
> (read p)
3628800
> (pretty-print '(exit) p)
> (read p)
#!eof
> (close-port p)
Since process ports, like transcript ports, are two-way, the implementation is somewhat similar. The main difference is that a transcript port reads from and writes to its subordinate ports, whereas a process port reads from and writes to a socket. When a process port is opened, the socket is created and subprocess invoked, and when the port is closed, the socket is closed and the subprocess is terminated.
(define open-process
  (lambda (command)
    (define handler
      (lambda (pid socket)
        (define (flush-output who p)
          (let ([i (port-output-index p)])
            (when (fx> i 0)
              (check who (c-write socket (port-output-buffer p) i))
              (set-port-output-index! p 0))))
        (lambda (msg . args)
          (record-case (cons msg args)
            [block-read (p str cnt)
             (critical-section
               (let ([b (port-input-buffer p)]
                     [i (port-input-index p)]
                     [s (port-input-size p)])
                 (if (< i s)
                     (let ([cnt (fxmin cnt (fx- s i))])
                       (do ([i i (fx+ i 1)]
                            [j 0 (fx+ j 1)])
                           ((fx= j cnt)
                            (set-port-input-index! p i)
                            cnt)
                         (string-set! str j (string-ref b i))))
                     (begin
                       (flush-output 'block-read p)
                       (let ([n (check 'block-read
                                  (c-read socket str cnt))])
                         (if (fx= n 0)
                             #!eof
                             n))))))]
            [char-ready? (p)
             (or (< (port-input-index p) (port-input-size p))
                 (bytes-ready? socket))]
            [clear-input-port (p)
             ; set size to zero rather than index to size
             ; in order to invalidate unread-char
             (set-port-input-size! p 0)]
            [clear-output-port (p) (set-port-output-index! p 0)]
            [close-port (p)
             (critical-section
               (flush-output 'close-port p)
               (set-port-output-size! p 0)
               (set-port-input-size! p 0)
               (mark-port-closed! p)
               (terminate-process pid))]
            [file-length (p) 0]
            [file-position (p . pos)
             (if (null? pos)
                 (most-negative-fixnum)
                 (error 'process-port "cannot reposition"))]
            [flush-output-port (p)
             (critical-section
               (flush-output 'flush-output-port p))]
            [peek-char (p)
             (critical-section
               (let ([b (port-input-buffer p)]
                     [i (port-input-index p)]
                     [s (port-input-size p)])
                 (if (fx< i s)
                     (string-ref b i)
                     (begin
                       (flush-output 'peek-char p)
                       (let ([s (check 'peek-char
                                  (c-read socket b (string-length b)))])
                         (if (fx= s 0)
                             #!eof
                             (begin (set-port-input-size! p s)
                                    (string-ref b 0))))))))]
            [port-name (p) "process"]
            [read-char (p)
             (critical-section
               (let ([b (port-input-buffer p)]
                     [i (port-input-index p)]
                     [s (port-input-size p)])
                 (if (fx< i s)
                     (begin
                       (set-port-input-index! p (fx+ i 1))
                       (string-ref b i))
                     (begin
                       (flush-output 'read-char p)
                       (let ([s (check 'read-char
                                  (c-read socket b (string-length b)))])
                         (if (fx= s 0)
                             #!eof
                             (begin (set-port-input-size! p s)
                                    (set-port-input-index! p 1)
                                    (string-ref b 0))))))))]
            [unread-char (c p)
             (critical-section
               (let ([b (port-input-buffer p)]
                     [i (port-input-index p)]
                     [s (port-input-size p)])
                 (when (fx= i 0)
                   (error 'unread-char
                          "tried to unread too far on ~s"
                          p))
                 (set-port-input-index! p (fx- i 1))
                 ; following could be skipped; supposed to be
                 ; same character
                 (string-set! b (fx- i 1) c)))]
            [write-char (c p)
             (critical-section
               (let ([b (port-output-buffer p)]
                     [i (port-output-index p)]
                     [s (port-output-size p)])
                 (string-set! b i c)
                 (check 'write-char (c-write socket b (fx+ i 1)))
                 (set-port-output-index! p 0)))]
            [block-write (p str cnt)
             (critical-section
               ; flush buffered data
               (flush-output 'block-write p)
               ; write new data
               (check 'block-write (c-write socket str cnt)))]
            [else
             (error 'process-port "operation ~s not handled" msg)]))))
    (let* ([server-socket-name (tmpnam 0)]
           [server-socket (setup-server-socket server-socket-name)])
      (dofork
        (lambda () ; child
          (check 'close (close server-socket))
          (let ([sock (setup-client-socket server-socket-name)])
            (dodup 0 sock)
            (dodup 1 sock))
          (check 'execl (execl4 "/bin/sh" "/bin/sh" "-c" command))
          (error 'open-process "subprocess exec failed"))
        (lambda (pid) ; parent
          (let ([sock (accept-socket server-socket)])
            (check 'close (close server-socket))
            (let ([ib (make-string 1024)] [ob (make-string 1024)])
              (let ([p (make-input/output-port
                         (handler pid sock)
                         ib ob)])
                (set-port-input-size! p 0)
                (set-port-output-size! p (fx- (string-length ob) 1))
                p))))))))
R. Kent Dybvig /
Copyright © 2005 R. Kent Dybvig
Revised July 2007 for Chez Scheme Version 7.4
Cadence Research Systems / www.scheme.com
Cover illustration © 1998 Jean-Pierre Hébert
ISBN: 0-9667139-1-5