Software history – Dusty Decks: Preserving historic software

Xerox PARC IFS archive

In 2014, the Computer History Museum released the Xerox Alto file server archive, constituting about 15,000 files from the Xerox Alto personal computer including the Alto operating system; BCPL, Mesa, and (portions of the) Smalltalk programming environments; applications such as Bravo, Draw, and the Laurel email client; fonts and printing software (PARC had the first laser printers); and server software (including the IFS file server and the Grapevine distributed mail and name server). I told the story behind that archive here.

Today CHM released the Xerox PARC Interim File System (IFS) archive:

The archive contains nearly 150,000 unique files—around four gigabytes of information—and covers an astonishing landscape: programming languages; graphics; printing and typography; mathematics; networking; databases; file systems; electronic mail; servers; voice; artificial intelligence; hardware design; integrated circuit design tools and simulators; and additions to the Alto archive.

A blog post by David Brock introduces the archive. Access to the archive itself is available here.

I began working on this project in 2018 under an NDA with PARC: reading the old media prepared years earlier by Al Kossow, updating the conversion software I’d written for the earlier Alto project, and winnowing down a list of 300,000 files to the 150,000 files that I submitted to PARC management for approval. David Brock’s post ends with an Acknowledgments section noting all the people at CHM and PARC who contributed.

CAL Timesharing System: Before computers were personal

In 2023 computers are all around us: our phones, tablets, laptops, and desktops, and lurking inside our television sets, appliances, automobiles, to say nothing of our workplaces and the internet. It wasn’t always that way: I was born in 1949, just as the first stored-program digital computers were going into operation. Those computers were big, filling a room, and difficult to use. Initially a user would sign up for a block of time to test and run a program that had been written and punched into paper tape or 80-column cards.cThe fact that an expensive computer sat idle while the user was thinking or mounting tapes seemed wasteful, so people designed batch operating systems that would run programs one after the other, with a trained operator mounting tapes just before they were needed. The users submitted their card decks and waited in their offices until their programs had run and the listings had been printed. While this was more efficient, there was a demand for computers that operated in “real time”, interacting with people and other equipment. MIT’s Whirlwind, TX-0, and TX-2 and Wes Clark’s LINC are examples.

The ability to interact directly with a computer via a terminal (especially when a display was available) was compelling, and computers were becoming much faster, which led to the idea of timesharing: making the computer divide its attention among a set of users, each with a terminal. Ideally the computer would have enough memory and speed so each user would get good service. Early timesharing projects included CTSS at MIT, DTSS at Dartmouth, and Project Genie at Berkeley. By 1966, Berkeley (that is, the University of California at Berkeley) decided to replace its IBM batch system with a larger computer that would provide interactive (time-shared) service as well as batch computing. None of the large commercial computers came with a timesharing system, so Berkeley decided they would build their own. The story of that project—from conception, through funding, design, implementation, (brief) usage, to termination—is told here:

Paul McJones and Dave Redell. History of the CAL Timesharing System. IEEE Annals of the History of Computing, Vol. 45, No. 3 (July-September 2023). IEEE Xplore (Open access)

How did I come to write that paper? In the winter of 1968-1969 I was invited to join the timesharing project. At that time I had about 2 years of programming experience gained in classes and on-the-job experience during high school and college (Berkeley). That wasn’t much, but it included one good-sized project—a Snobol4 implementation with Charles Simonyi—so the team welcomed me to the project. For the next three years I helped build the CAL Timesharing System, performed some maintenance on the Snobol4 system, and finished my bachelor’s degree. In December 1971, CAL TSS development was canceled, and I graduated and moved on to the CRMS APL project elsewhere on campus.

Those three years were hectic but immensely enjoyable. The team was small, with under a dozen people, housed first in an old apartment on Channing Way and then in the brand-new Evans Hall. Lifelong friendships were formed. People often worked into the night, when the computer was available, and then trooped over to a nearby hamburger joint for a late meal. Exciting things were going on around us. There were protests, the Vietnam War, and the first moon landings. Rock music seemed fresh and exciting. I had met my future wife in 1968, and we were married in 1970.

As CAL TSS came to an end, we all agreed the experience could never be equalled. But we didn’t realize people in the future would be interested in studying our system, so we weren’t careful about preserving the magnetic tapes. However many of us kept manuals, design documents, and listings, plus a few tapes. In 1980 and again in 1991 we had reunions and I offered to store everything until it became clear what to do for the long run. Around 2003 I started scanning the materials and organizing a web site. In 2022 the Computer History Museum agreed to accept the physical artifacts, and this year they agreed to host the web site:

https://caltss.computerhistory.org/

Jack Schwartz and SETL

In April 2020, just as the Covid pandemic began, Annie Liu, a professor at Stony Brook University, emailed me to chat about programming language history. She suggested that Python, with its antecedents SETL, ABC, and C, would be a good topic for historical study and preservation. I mentioned that I’d considered SETL as an interesting topic back in the early 2000s, but unfortunately had not acted. After a few more rounds of email with her, I began looking around the web and Annie introduced me to several SETL people. Starting with these people, a few other personal contacts, and some persistence, I was soon in touch with much of the small but friendly SETL community, who very generously pored through their files and donated a wide variety of materials. The result is an historical archive of materials on the SETL programming language, including source code, documentation, and an extensive set of design notes that is available at the Software Preservation Group web site:

http://www.softwarepreservation.org/projects/SETL/

In addition, the digital artifacts and some of the physical artifacts are now part of the Computer History Museum’s permanent collection.

The SETL programming language was designed by Jack Schwartz at the Courant Institute of Mathematical Sciences at New York University. Schwartz was an accomplished mathematician who became interested in computer science during the 1960s. While working with John Cocke to learn and document a variety of compiler optimization algorithms, he got the idea of a high-level programming language able to describe such complex algorithms and data structures. [1] It occurred to him that set theory could be the basis for such a language since it was rich enough to serve as a foundation for all of mathematics. As his colleagues Martin Davis and Edward Schonberg described it in their biographical memoir to him: [2]

The central feature of the language is the use of sets and mappings over arbitrary domains, as well as the use of universally and existentially quantified expressions to describe predicates and iterations over composite structures. This set-theoretic core is embedded in a conventional imperative language with familiar control structures, subprograms, recursion, and global state in order to make the language widely accessible. Conservative for its time, it did not include higher-order functions. The final version of the language incorporated a backtracking mechanism (with success and fail primitives) as well as database operations. The popular scripting and general purpose programming language Python is understood to be a descendent of SETL, and its lineage is apparent in Python’s popularization of the use of mappings over arbitrary domains.

Schwartz viewed SETL first as a specification language allowing complex algorithms and data structures to be written down, conveyed to other humans, and executed as a part of algorithm development or even as a component of a complete prototype system. Actual production use would typically require reprogramming in terms of data structures closer to the machine such as arrays and lists. Schwartz believed that SETL programs could be optimized “by a combination of automatic and programmer-assisted procedures.” [3, page 70] He wrote several memos about his ideas for SETL [4, 5], and began assembling a project team — mostly graduate students. A series of design notes and memos called the SETL Newsletter was launched. [6] Malcolm Harrison, another NYU professor, had designed an extensible LISP-like language called BALM; in the first SETL Newsletter he sketched a simple prototype of SETL as a BALM extension. [7]

Over the following years the SETL Newsletters chronicled a long and confusing series of SETL implementations implemented with various versions of BALM and also with LITTLE, a low-level systems programming language.

BALMSETL (1971-1972) consisted of a runtime library of procedures corresponding to the various SETL operations, and a modification of BALM which replaced the standard BALM syntactic forms with calls to the appropriate procedures in the library. This runtime library used a hash-based representation of sets (earlier prototypes had used lists).
SETLB (spring 1972) consisted of a preprocessor (written in Fortran) that translated a simplified subset of SETL to BALMSETL. BALM was converted from producing interpretative code for a generalized BALM machine to producing CDC 6600 machine code.
SETLB.2 (1973?) was based upon a version of the BALM interpreter written in LITTLE, plus the SETL Run Time Library. It offered a limited capability for variation of the semantics of subroutine and function invocation by the SETLB programmer.
SETLA (1974?)’s input language was closer to SETL, but it still used the BALMSETL-based runtime library and BALM-based name scoping.
SETLC (1975?) consisted of a lexical scanner and syntactic analyzer (written in LITTLE), tree-walking routines (written in BALM) that built BALM parse trees), a translator that emitted LITTLE from the parse trees (written in BALM), and the LITTLE compiler. The generated LITTLE code used the SETL Run Time Library.
SETL/LITTLE (1977-1978?) consisted of a SETL-to-LITTLE translator, a runtime library, and a LITTLE-to-CDC 6600 machine code compiler (all written in LITTLE).

The final system (the only one for which source code is available) was ported to the IBM System/370, Amdahl UTS, DECsystem-10, and DEC VAX. There was also a sophisticated optimizer, itself written in SETL, which however was too large and slow to use in production. Work stopped around the end of 1984 as Schwartz’s focus moved to other fields such as parallel computing and robotics and many of the graduate students received their degrees. A follow-on SETL2 project produced more SETL Newsletters but no system.

Other SETL implementations

Starting in the mid 1970s, SETL-influenced languages were implemented at other institutions including Akademgorodok in Novosibirsk, and then at NYU itself. After a 30-year gestation period, GNU SETL was released in 2022. See https://www.softwarepreservation.org/projects/SETL/#Dialects for more.

Aftermath

Many reports and theses were written and papers were published. Perhaps the most well-known result was the NYUAda project, which was an “executable specification” for Ada that was the first validated Ada implementation. The project members went on to found AdaCore and GNAT Ada compiler.

Acknowledgements

In addition to Annie Liu, many people helped me on this project; see https://www.softwarepreservation.org/projects/SETL/#Acknowledgments .

References

[1] John Cocke and Jacob T. Schwartz. Programming Languages and Their Compilers. Preliminary Notes. 1968-1969; second revised version, Apri1 1970. Courant Institute of Mathematical Sciences, New York University. PDF at software preservation.org

[2] Martin Davis and Edmond Schonberg. Jacob Theodore Schwartz 1930-2009: A Biographical Memoir. National Academy of Science, 2011. PDF at nasonline.org

[3] Jacob T. Schwartz. On Programming: An Interim Report on the SETL Project. Installment 1: Generalities; Installment 2: The SETL Language, and Examples of Its Use. Computer Science Department, Courant Institute of Mathematical Sciences, New York University, 1973; revised June 1975. PDF at softwarepreservation.org

[4] Jacob T. Schwartz. Set theory as a language for program specification and programming. Courant Institute of Mathematical Sciences, September 1970, 97 pages.

[5] Jacob T. Schwartz. Abstract algorithms and a set theoretic language for their expression. Computer Science Department, Courant Institute of Mathematical Sciences, New York University. Preliminary draft, first part. 1970-1971, 16+289 pages. PDF at softwarepreservation.org

[6] SETL Newsletter. #1-#217, November 1970 – November 1981; #220-#233, April 1987 – November 1988. Online at software preservatio n.org

[7] M. C. Harrison. BALM-SETL: A simple implementation of SETL. SETL Newsletter #1, 5 November 1970. PDF at softwarepreservation.org

Remembering Maarten van Emden

Maarten van Emden died on January 4, 2023, at the age of 85.[1] He was a pioneer of logic programming, a field he explored for much of his career. I was not in his field, and only got to know him starting in 2010, so this is a personal, but not professional, remembrance of a very dear friend.

His life

Maarten was born in Velp, the Netherlands, but his family soon moved to the Dutch East Indies, where his botanist father was working on improving tea plants. In 1942 the Japanese invaded. Maarten’s father escaped to join the resistance, but Maarten, his younger sister, and his mother were sent to a detention camp. As the war came to a close, his father was able to rescue and reunite the family. Over the next few years they returned to the Netherlands, with a brief return to the newly-formed Indonesia, followed by boarding school in Australia for Maarten. They were finally reunited in the Netherlands in 1954, where Maarten began his final year of high school. After graduating in 1955, he went to national flight school (Rijksluchtvaartschool). He did a year of military service, including flight training, and then joined KLM Royal Dutch Airlines. But KLM was adopting DC-8 jets for transatlantic service, whose speed, capacity, and ease of operation led to the need for fewer pilots. Maarten took advantage of a company program to enroll part-time in an engineering curriculum at the University of Delph. Later he was laid off by KLM and finished a master’s degree as a full-time student. He then enrolled in the PhD program administered by the University of Amsterdam with research at the Mathematisch Centrum (now CWI), and also made several visits to the University of Edinburgh. His 1971 dissertation was An Analysis of Complexity and his advisor was Adriaan van Wijngaarden. Maarten was awarded a post-doctoral fellowship by IBM, which he spent at the Thomas J. Watson Research Center in Yorktown Heights, NY during the 1971-1972 academic year, before returning to Edinburgh for a research position under Donald Michie in the Department of Machine Intelligence. In 1975 he accepted a professorship at the University of Waterloo, and in 1987 he moved to the University of Victoria.

Programming

Maarten was one of 15 individuals recognized as Founders of Logic Programming by the Association for Logic Programming.[2] His work began with an early collaboration with Bob Kowalski[3] and continued throughout his career with collaborations and individual projects to explore many aspects of the field. Underlying his interest in logic programming was a fascination with programming and programming languages of all sorts.[4] His first language was Algol 60, which he taught himself using McCracken’s new book[5] when his university suddenly switched from Marchant calculators to a Telefunken TR-4 computer for the numerical methods course.[6] Moving on to the MC he was surrounded by ALGOL experts (his advisor van Wijngaarden was a member of the ALGOL 60 Committee and the instigator of the infamous ALGOL 68). Maarten was originally attracted to Edinburgh after hearing about the POP-2 timesharing system of Burstall and Popplestone; it was only later that he realized he’d initially used POP-2 as if it was ALGOL rather than a rich functional programming language. During his post-doc at IBM he learned APL and Lisp. Fred Blair was implementing a statically-scoped Lisp for the SCRATCHPAD computer algebra group.[7] And William Burge, who had worked with Burstall and Landin, was spreading the gospel of functional programming.[8] Ensconced in Edinburgh in 1972, he became an early convert to Kowalski’s logic programming, which he noted could be traced back as early as Cordell Green’s paper at the 4th Machine Intelligence workshop.[9] But Maarten’s first impression of Preliminary Prolog was not positive — the frequent control annotations seemed to detract from the logic. Nevertheless, he and Kowalksi began writing short programs to explore the ideas. And when David Warren returned from a visit to Marseille with a box of cards containing Final Prolog as well as his short but powerful WARPLAN program, things changed. The language no longer needed the control annotations, and Warren quickly ported its Fortran-coded lowest layer to the local DEC-10. WARPLAN served as a tutorial for all sorts of programs in the new language. Maarten was surprised that his friend Alan Robinson, the inventor of resolution logic, wouldn’t give up Lisp for logic programming.[10] At Waterloo, he advised Grant Roberts, who built Waterloo Prolog for the IBM System /370, and another series of students who built several Prologs for Unix. At Victoria, he wrote a first-year textbook for science and engineering students based on C:

It is indeed true that object-oriented programming represents a great advance. It is also true that polymorphism in object-oriented programming does away with many if-statements and switch statements; that iterators replace or simplify many loops. But experience has shown that introducing objects first does not lead to a first course that produces better programmers; on the contrary. It is as much necessary as in the old days to make sure that students master variables, functions, branches, loops, arrays, and structures.
[11], page xi

In the acknowledgements of the book, he wrote:

I had the good fortune to grow up in three distinctive programming cultures: the Mathematical Centre in Amsterdam, the Lisp group in the IBM T.J. Watson Research Center, and the Department of Machine Intelligence in the University of Edinburgh. Though all of these entities have ceased to exist, I trust I am not the only surviving beneficiary.

If this book is better than others, it is due to my choice of those who were, often without knowing it, my teachers: H. Abelson, J. Bentley, W. Burge, R. Burstall, M. Cheng, A. Colmerauer, T. Dekker, E. Dijkstra, D. Gries, C. Hoare, D. Hoffman, N. Horspool, B. Kernighan, D. Knuth, R. O’Keefe, P. Plauger, R. Popplestone, F. Roberts, G. Sussman, A. van Wijngaarden, N. Wirth.
[11], page xi

Getting to know Maarten

As different as we were, Maarten and I had a few things in common: fathers who piloted B-24 bombers in WWII, a charismatic mutual friend named Jim Gray, attendance at the 1973 NATO Summer School on Structured Programming, books named Elements of Programming, and a fascination with the early development of programming languages. Jim Gray had been an informal mentor for me at UC Berkeley as I worked on CAL Snobol and Cal TSS. Then he left Berkeley for IBM Research in Yorktown, and made friends with Maarten. Jim soon decided he couldn’t tolerate life on the east coast, but before leaving he encouraged Maarten and his wife Jos to drive across the country and visit him in California, where he would show them around. They took him up on the offer, and during a brief stay in fall 1972 at Jim’s home in Berkeley I met Maarten, but didn’t make much of an impression on him (although he later told me Jim had mentioned the “great programmers on Cal TSS”). The next summer both Maarten and I attended the NATO Summer School on Structured Programming at Marktoberdorf, but neither of us remembered encountering the other. Maarten mentioned the summer school in his remembrance of Dijkstra.[12]

In 1974 I caught up with Jim Gray again, joining IBM Research in San Jose (before Almaden). The next summer Maarten visited Jim, although I didn’t learn of it until much later:

“After I returned to Europe Jim and I kept writing letters. In the summer of 1975 I was in a workshop in Santa Cruz and Jim came up in a beautiful old Porsche. I was at the height of my logic programming infatuation. Jim was rather dismissive of it. Nothing of what he told me about System R turned me on; the relationship died with that meeting. How I wish I could talk to him now about the mathematics of RDBs, which I started working on recently.”
[Maarten van Emden, personal communication, September 3, 2010]

Maarten left three technical reports with Jim, who passed them along to me.[13] [14] [15] I looked at them, and then put them aside for the next 35 years. In the fall of 2010 I had retired and was spending more time on software history projects. I’d been following Maarten’s blog; a recent pair of articles about the Fifth Generation Computer System project and the languages Prolog and Lisp[16] [17] prompted me to contact him about a project I was contemplating: an historical archive of implementations of Prolog.[18] That began a friendship carried out mostly through some 2000 emails and almost 400 weekly video calls, plus one in-person visit when Maarten visited the Bay Area in early 2011. I will always remember his charming manners, gentle humor, wide-ranging interests, and intriguing stories.

Acknowledgements

Thanks to Maarten’s daughter Eva van Emden for information about his life.

For more of his writing, see:

His web site: http://maarten.vanemden.com/
His blog A Programmer’s Place: https://vanemden.wordpress.com

Update 1 January 2024

Shortly after I wrote my post, Maarten’s colleagues wrote this article for the Association of Logic Programming: In Memoriam: Maarten van Emden.

Maarten’s web site at UVic has moved to http://maarten.vanemden.com/

References

[1] Eva van Emden. Maarten van Emden Obituary. The Times Colonist, January 10, 2023. https://www.legacy.com/ca/obituaries/timescolonist/name/maarten-van-emden-obituary?pid=203621344
[2] ALP Awards. Association for Logic Programming. https://www.cs.nmsu.edu/ALP/the-association-for-logic-programming/alp-awards/
[3] M. H. van Emden and R. A. Kowalski. The Semantics of Predicate Logic as a Programming Language. Journal of the ACM, Vol. 23, No. 4, 1976, pp. 733-742. https://www.doc.ic.ac.uk/~rak/papers/kowalski-van_emden.pdf
[4] Maarten van Emden. The Early Days of Logic Programming: A Personal Perspective. Association for Logic Programming Newsletter, August 2006. https://dtai.cs.kuleuven.be/projects/ALP/newsletter/aug06/nav/articles/article4/article.html
[5] Daniel McCracken. A Guide to ALGOL Programming. John Wiley and Sons, 1962. https://archive.org/details/mccracken1962guide
[6] Maarten van Emden. On Finding a Discarded Copy of “A guide to Algol Programming.” 1993 email to Frank Ruskey. https://web.archive.org/web/20050906222954/http://csr.uvic.ca:80/~vanemden/other/guideAlgol.html
[7] J. H. Griesmer and R. D. Jenks. SCRATCHPAD/1: An interactive facility for symbolic mathematics. In Proceedings of the second ACM symposium on Symbolic and algebraic manipulation (SYMSAC ’71). Association for Computing Machinery, New York, NY, USA, 42–58. https://doi.org/10.1145/800204.806266
[8] William Burge. Recursive Programming Techniques. Addison-Wesley 1975. https://archive.org/details/recursiveprogram0000burg
[9] Cordell Green. Theorem-Proving by Resolution as a Basis for Question-Answering Systems. Machine Intelligence 4, Bernard Meltzer and Donald Michie, editors, Edinburgh University Press, Edinburgh, Scotland, 1969, pages 183–205. https://www.kestrel.edu/people/green/publications/theorem-proving.pdf
[10] Maarten van Emden. Interview with Alan Robinson, inventor of resolution logic. June 8, 2010. https://vanemden.wordpress.com/2010/06/08/interview-with-alan-robinson-inventor-of-resolution-logic/
[11] M. H. van Emden. Elements of Programming. Andromeda Research Associates, Ltd. Third edition, 2009, page ix. https://abazu.vanemden.com/4EOP/EOP.html
[12] Maarten van Emden. I remember Edsger Dijkstra (1930 – 2002). August 2008. https://vanemden.wordpress.com/2008/05/06/i-remember-edsger-dijkstra-1930-2002/
[13] Robert Kowalski. Predicate Logic as Programming Language. Department of Computational Logic, University of Edinburgh, Memo No. 70, November 1973. https://www.doc.ic.ac.uk/~rak/papers/IFIP74.pdf
[14] M. H. van Emden and R. A. Kowalski. The semantics of predicate logic as a programming language. School of Artificial Intelligence, University of Edinburgh, MIP-R-103, February 1974. https://www.doc.ic.ac.uk/~rak/papers/kowalski-van_emden.pdf
[15] M. H. van Emden. First-order predicate logic as a high-level program language. School of Artificial Intelligence, University of Edinburgh, MIP-R-106, May 1974. http://webhome.cs.uvic.ca/~vanemden/Publications/FOPLasHLPL.pdf
[16] Maarten van Emden. Who Killed Prolog? A Programmer’s Place blog, August 21, 2010. https://vanemden.wordpress.com/2010/08/21/who-killed-prolog/
[17] Maarten van Emden. The Fatal Choice. A Programmer’s Place blog, August 31, 2010. https://vanemden.wordpress.com/2010/08/31/the-fatal-choice/
[18] Paul McJones, editor. Prolog and Logic Programming Historical Sources Archive. https://www.softwarepreservation.org/projects/prolog/

Robert L. Patrick on eMuseums

Bob Patrick is a friend of mine who entered the computer field in 1951, and whose hands-on experience running programs on an IBM 701 led him to conceive of the architecture for the General Motors/North American Monitor for the IBM 704 computer. (Bob described this work in a 1987 National Computer Conference paper. Other aspects of his extensive career are discussed in his recent Annals of the History of Computing paper and in a 2006 Oral History.)

For a number of years Bob has been involved in volunteer activities at the Computer History Museum, and recently he organized his thoughts on how museums can use the web to present technology, in the form of this article: “Museums in the Computer Age: meeting the challenge of technology“. Bob invites comments on the article via email at bobpatrick@mac.com.