Prev Up Next
Go backward to Adaptors
Go up to Top
Go forward to Formal Specification of STL Container and Iterator Concepts

Literate Generic Programming

A key to the successful application of generic library components is thorough and accurate documentation for library users. A significant advance in this area is the Silicon Graphics Standard Template Library Programmer's Guide, organized in terms of concepts and models. Another type of documentation needed is design and implementation level descriptions of algorithm and data structures. The audience for these is considerably smaller than for user-level documentation, but many new issues arise in designing and coding generic algorithms and data structures that deserve thorough discussion. Indeed much of the accepted wisdom of algorithm and data structure design may need to be re-examined when issues of genericity are considered. For describing generic algorithms we have begun experimenting with variations of Knuth's "literate programming" style. Knuth (1984) introduced this term for a style of program development in which code and its documentation are developed simultaneously, in a single source file. It is called "literate" because Knuth's goal was to promote the practice of describing programs in a highly readable style, including full use of typesetting facilities (TeX in particular). The TeX documentation and the program code are extracted from the single source file by a tool called Web. Knuth's TeX: The Program, Addison-Wesley, 1986, is a major example of literate programming--perhaps the largest program ever published in a readable form.

Since Knuth's original paper, several variations on literate programming have been developed. The original Web tool required coding in Pascal. Many variations on Web have been developed, including Cweb (Levy 1987), for coding in C or C++ (Knuth describes Cweb as his favorite programming language); Noweb (Ramsey 1992), which is programming-language independent: one can code in any language; and Nuweb (Briggs 1989), which is also programming-language independent, and has a very simple command set. We are currently using Nuweb, slightly modified. We started with a version of the Nuweb tool previously modified by Ramsdell and Mengel and made additional small changes in terminology in the LaTeX file the tool produces: "part" is used in place of "scrap" and "definition" in place of "macro" The new version does not differ from previous versions in the way it produces code files from Nuweb source files. The modified version of Nuweb is available, itself as a Nuweb source file. The archive file also includes the C files that Nuweb would generate from its source file, so one can get up and running with Nuweb with only a C compiler.

One of the papers in the generic algorithms section, A Fast Generic Sequence Matching Algorithm, is an example of the literate programming style. From the source file we use Nuweb to generate both the paper's LaTeX source file and all of the program code described in the paper. For expository purposes, we use Ada 95 code (in a limited way that makes it comparable to most pseudo-code languages); production code is in C++. The appendices of the paper include and document all of the code used in testing and measuring performance of the various algorithms described in the body of the paper. The Nuweb source file for the paper, as well as all of the code generated by applying Nuweb to it, is available. (The operation counting components used in some of the performance measurement programs are not included, but are separately available, as are the data files for the English text search tests and DNA search tests.)


 

Prev Up Next