The Portable I/O API

The Basis library provides a portable I/O API built on top of the operating system facilities. The source for most of the API can be found in the boot/IO directory of the compiler. The OS-dependent part of the implementation can be found in the boot/Unix directory for Posix-based Unix systems.

Figure 3-2 shows the major interfaces of the I/O API. (The notation is based on UML.) SML signatures are pure interfaces. They are extended by refinement, which adds new features, and by specialisation, which makes abstract types concrete.

Figure 3-2. The Major Signatures of the Portable I/O API

The lowest level interface is PRIM_IO. It abstracts the basic operations of reading and writing over some I/O channel. The OS_PRIM_IO interface extends this with some functions for associating a channel with a file via some sort of OS-dependent file descriptor or handle.

The next level up is the STREAM_IO interface. It wraps buffering operations around the I/O channels and calls them streams. It is abstract over any implementation of I/O channels and any type of data element. (The PRIM_IO interface will be used later to provide an implementation for STREAM_IO.)

Input streams are handled in a lazy functional manner. This means that a stream is read from only upon demand (as you would expect) and that each read returns the data together with a new stream value positioned just after that data. So you can read from the same stream value multiple times and it always returns the same data from the same position. Output streams are imperative. Each write appends new data to the output stream.
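The functional behaviour of input streams can be seen in a small sketch. It assumes the TextIO.openString and TextIO.getInstream functions of the Basis library to obtain a functional stream over an in-memory string. The helper name first is only for illustration.

```sml
(* A sketch only: read from a functional stream value more than once. *)
val s0 = TextIO.getInstream (TextIO.openString "abc")

(* Hypothetical helper: return the next character and the advanced stream. *)
fun first s =
    case TextIO.StreamIO.input1 s of
      SOME (c, s') => (c, s')
    | NONE         => raise Fail "unexpected end of stream"

val (c1, s1) = first s0     (* read from the original stream value *)
val (c2, _)  = first s0     (* read from it again: same position, same data *)
val (c3, _)  = first s1     (* read from the advanced stream value *)
```

Here c1 and c2 are both the first character of the string, while c3 is the second, since only s1 represents the position after the first read.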

The TEXT_STREAM_IO interface specialises STREAM_IO for characters and extends it with a function to read a line of text and to write from substrings (see the Substring structure).

The IMPERATIVE_IO interface wraps around STREAM_IO and provides an interface using imperative streams. This means that the position of a stream is a hidden state variable which is updated after each read operation. The IMPERATIVE_IO interface is then specialised into binary and text I/O. The BIN_IO interface fixes the data type to be bytes. The TEXT_IO interface fixes it to be characters with an understanding of text conventions such as line splitting. It also knows about the Unix stdin, stdout and stderr text streams. The underlying STREAM_IO interface is made visible in TEXT_IO for when you want to use a functional I/O style.
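As a minimal sketch of the imperative style, the following reads from a TextIO stream over an in-memory string (created with TextIO.openString, chosen here only so that no file is needed). Each call to TextIO.input1 advances the hidden position.

```sml
(* A sketch only: the stream position is hidden state inside strm. *)
val strm = TextIO.openString "ab"

val a = TextIO.input1 strm      (* the first character *)
val b = TextIO.input1 strm      (* the second character *)
val e = TextIO.input1 strm      (* NONE, the stream is exhausted *)
```

The same expression, TextIO.input1 strm, yields a different result each time it is evaluated.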

The implementation hierarchy is shown in Figure 3-3. The PrimIO structure mainly defines some types and utility routines which represent buffered I/O. These are specialised by having the I/O types bound to either bytes (Word8.word) or characters. This results in the structures BinPrimIO and TextPrimIO respectively. The PosixBinPrimIO structure adds an implementation of binary I/O using the I/O functions in Posix.IO. (See the section called Posix.IO.) Then the PosixTextPrimIO structure casts the binary I/O to text I/O and this results in the TextIO structure.

Figure 3-3. The Major Structures Implementing the Portable I/O API

For binary I/O there is a matching BinIO structure that reads and writes streams of bytes.

There is an IO structure that defines some common exceptions and types. The IO.Io exception is the main error reporting mechanism for the portable I/O API.
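The following sketch shows the shape of an IO.Io handler. The function name tryOpen is hypothetical. It tries to open a file and turns any IO.Io exception into a message built from the three fields of the exception.

```sml
(* A sketch only: report an I/O failure as a string. *)
fun tryOpen path =
    (TextIO.closeIn (TextIO.openIn path); "ok")
    handle IO.Io {name, function, cause} =>
        concat ["IO Error: ", name, " in ", function,
                ", ", exnMessage cause]

(* Assumes this path does not exist on the machine running the sketch. *)
val msg = tryOpen "/nonexistent/dir/file"
```

The name field identifies the file, function names the library function that failed, and cause is the underlying exception, often an OS.SysErr.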

Here is a simple example that counts characters, words and lines in a file. It does the job of the Unix wc command but without the command line options. Here is the main function.

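(* Assumed helper, not shown in this section: write a message to stderr. *)
fun toErr msg = TextIO.output(TextIO.stdErr, msg)

fun main(arg0, argv) =
let
in
    case argv of
      (* No arguments: count stdin, with no file name to report. *)
      [] => count TextIO.stdIn ""

      (* Use the first argument and ignore the rest. *)
    | (file::_) =>
        let
            val strm = TextIO.openIn file
        in
            (* Close the file before reraising any exception from count. *)
            (count strm file) handle x =>
                (TextIO.closeIn strm; raise x);
            TextIO.closeIn strm
        end;

    OS.Process.success
end
handle
  (* I/O errors, including a failure to open the file. *)
  IO.Io {name, function, cause} =>
    (
        toErr(concat["IO Error: ", name,
                     ", ", exnMessage cause, "\n"]);
        OS.Process.failure
    )

  (* Anything else. *)
| x => (toErr(concat["Uncaught exception: ", exnMessage x,"\n"]);
        OS.Process.failure)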

If there are no command line arguments then I read from stdin. If there are some then I take the first one and ignore the rest. Any exception from the count function for a file is caught so that I can close the file before reraising it. This is not strictly necessary since the file will get closed anyway when the program exits but I included it as an example of catching and reraising an exception. An I/O exception from anywhere else, such as a failure to open the file, will be caught by the outermost handlers at the bottom.
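The close-and-reraise pattern can be packaged as a small helper. This is only a sketch. The name withInFile is hypothetical and is not part of the Basis library or of the program in this section.

```sml
(* A sketch only: run f on an open file and always close the file,
   whether f returns normally or raises an exception. *)
fun withInFile path f =
    let
        val strm   = TextIO.openIn path
        val result = f strm
                     handle x => (TextIO.closeIn strm; raise x)
    in
        TextIO.closeIn strm;
        result
    end
```

For example, withInFile file TextIO.inputAll would return the whole contents of the file as a string and leave the stream closed afterwards.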

Here is the count function. It is just a simple loop which terminates when the inputLine function returns an empty string. An empty line does not terminate the loop since it will have a new-line character in it. The inputLine function also appends a new-line to an unterminated last line in a file, so the program will count an extra character in this case. Words are counted by splitting the line into tokens at white space and counting how many we get.

fun count strm file =
let
    fun read (nchars, nwords, nlines) =
    (
        (* inputLine ensures the line ends with a \n.
           It returns an empty string only at eof.
        *)
        case TextIO.inputLine strm of
          "" => (nchars, nwords, nlines)

        | line =>
            let
                val words = String.tokens Char.isSpace line
            in
                read (nchars + size line,
                      nwords + length words,
                      nlines + 1)
            end
    )

    val (nchars, nwords, nlines) = read (0, 0, 0)
in
    print(concat[Int.toString nlines, " ",
                 Int.toString nwords, " ",
                 Int.toString nchars, " ",
                 file, "\n"])
end
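The counting loop can be tried without a file by reading from an in-memory string. The sketch below makes two assumptions that differ from the listing above. It uses TextIO.openString as the source, and it is written for the revised Basis library, where TextIO.inputLine returns a string option (NONE at end of file) rather than an empty string. The name countString is hypothetical.

```sml
(* A sketch only: count (chars, words, lines) in a string.
   Assumes the revised Basis, where inputLine returns string option. *)
fun countString text =
    let
        val strm = TextIO.openString text

        fun read (nchars, nwords, nlines) =
            case TextIO.inputLine strm of
              NONE => (nchars, nwords, nlines)

            | SOME line =>
                read (nchars + size line,
                      nwords + length (String.tokens Char.isSpace line),
                      nlines + 1)
    in
        read (0, 0, 0)
    end

(* The empty second line still counts as a line. The unterminated last
   line gets a new-line appended, so 15 characters are counted for a
   14 character string. *)
val result = countString "one two\n\nthree"     (* (15, 3, 3) *)
```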