Thoth, A Portable Real-time Operating System

Transcription

OperationsSystemsR.S. GainesEditorThoth, a PortableReal-Time OperatingSystemDavid R. Cheriton, Michael A. Malcolm,Lawrence S. Melen, and Gary R. SagerUniversity of WaterlooThoth is a real-time operating system which isdesigned to be portable over a large set of machines. Itis currently running on two minicomputers with quitedifferent architectures. Both the system and applicationprograms which use it are written in a high-levellanguage. Because the system is implemented by thesame software on different hardware, it has the sameinterface to user programs. Hence, applicationprograms which use Thoth are highly portable. Thothencourages structuring programs as networks ofcommunicating processes by providing efficientinterprocess communication primitives.Key Words and Phrases: portability, real time,operating systems, minicomputerCR Categories: 3.80, 4.30, 4.35I. IntroductionThis paper describes a portable real-time operatingsystem called Thoth which has been developed at thePermission to copy without fee all or part of this material isgranted provided that the copies are not made or distributed for directcommercial advantage, the ACM copyright notice and the title of thepublication and its date appear, and notice is given that copying is bypermission of the Association for Computing Machinery. To copyotherwise, or to republish, requires a fee and/or specific permission.Authors' present addresses: D. Cheriton, Department of ComputerScience, University of British Columbia, Vancouver, B.C., CanadaV6T IW5; M.A. Malcolm and G.R. Sager, Department of ComputerScience, University of Waterloo, Waterloo, Ontario, Canada, N2L3GI; L.S. Melen, Transport Canada, Air Traffic Services, Place deVille, Ottawa, Canada, KIA 0N8.A version of this paper was presented at the Sixth Symposium onOperating Systems Principles, West Lafayette, Indiana, November16-18, 1977.The remaining papers appear in Operating Systems.Review (ACMSIGOPS Newsletter), Vol. 11, No. 5 (Special Issue). This special issueis available prepaid from ACM, P.O. Box 12105, Church Street Station,New York, NY 10249; ACM or SIGOPS members 9.00, all others 12.00. t 979 ACM 0001-0782/79/0200-0105 00.75105University of Waterloo as part of a research study intothe feasibility of portable operating systems. Thoth supports multiple processes, dynamic memory allocation,device-independent input/output, a file system, multipleterminals, and swapping. It is currently running on twominicomputers with quite different architectures (TexasInstruments 990 and Data General NOVA).This research is motivated by the difficulties encountered when moving application programs from one system to another; these difficulties arise when interfacingwith the hardware and system software of the targetmachine. The problems encountered interfacing withnew system software are generally more difficult than those of interfacing with new hardware because of thewide variety of abstract machines presented by the compilers, assemblers, loaders, file systems and operatingsystems of the various target machines. We have takenthe approach of developing portable system software andporting it to "bare" hardware. The same system softwareis used on different hardware, thus the same abstractmachine is available to application programs. Thus mostapplication programs which use Thoth are portable ifnot machine independent.Most previous work on software portability has focused on problems of porting programs over differentoperating systems as well as different hardware. To ourknowledge, this is the first time an entire system hasbeen designed for portability. Our experience indicatesthat this approach is practical both in the cost of portingthe system and its time and space performance.An earlier experiment in operating system portabilityhas been reported by Cox [4]. More recently, the UNIXoperating system [13] has been moved from a PDP-I 1/45 to an INTERDATA7/32; this port was done independently by Miller [ll] at the University of Wollongong,and Johnson and Ritchie [7] at Bell Telephone Laboratories.The design of Thoth strives for more than portability.A second design goal is to provide a system in whichprograms may be structured using many small concurrent processes. We have aimed for efficient interprocesscommunication and inexpensive processes to make thisstructuring technique attractive.A third design goal is that the system meet thedemands of real-time applications. To help meet thisgoal, the system guarantees that the worst-case time forresponse to certain external events (interrupt requests) isbounded by a small machine-dependent constant.A fourth design goal is that the system be adaptableto a variety of real-time applications. A range of systemconfigurations is possible: A stand-alone application program can use a version of the kernel which supportsdynamic memory allocation and interprocess communication. Larger configurations support process destruction, a device-independent input-output system, a treestructured file system, multiple terminals, and swapping.Thoth as described in this paper has evolved throughseveral versions. It was originally developed using aCommunicationsofthe ACMFebruary 1979Volume 22 -mber 2

compiler and other support software running on a Honeywell 6050. The first version was running on a DataGeneral NOVA in May 1976. It was ported to a TexasInstruments 990 in August 1976 using the Honeywell forsoftware development. Since that time, new versionshave been developed for either the TI or the NOVA, thenported to the other machine when complete. Largerconfigurations of Thoth can be used to develop software;in particular, all system maintenance and developmentare now done using a multiterminal Thoth system.2. The Thoth MachineThoth implements an abstract machine referred to asthe Thoth machine. The Thoth machine is implementedas a base language and a set of system functions implemented in this language.The base language, described by Braga [l], modelsthe hardware facilities available on a large number ofmachines. It was designed to conceal hardware idiosyncrasies while avoiding being a barrier between the programmer and the hardware. This is a stack-orientedlanguage derived from B [6], which is a descendant ofBCPL [12]. As in BCPL, programs in our base languageare written as a set of functions and data modules whichhave global scope. Variables local to a particular function, including its parameters, are dynamically allocatedon a stack so that functions can be reentrant.The language includes statements for disabling andenabling hardware interrupts to provide indivisible executions of sections of code. There is also a twit statementwhich provides a way of inserting assembly languagedirectly into the code. This makes it possible to insertspecial I / O instructions, and makes the presence of suchmachine-dependent code obvious. The name of the statement was chosen to suggest its low-level nature, and todiscourage its indiscriminant use. These statements areused almost exclusively for implementing primitive functions in the operating system, and are seldom necessaryor desirable in user programs.Under Thoth, a function is invoked as either a subroutine or as a separate process. When invoked as asubroutine, the function uses the caller's stack. Wheninvoked as a process, a new stack is allocated separatefrom that of the invoking process.A program comprises a tree of processes which interface with the Thoth machine via system function calls.This tree of "user processes" is actually a subtree of thetree of system processes.With small configurations, the system functions areall linked with the user program into an executable coreimage. We have used the convention of beginning allglobal system names with a "." so the user can avoidinadvertently replacing a system function or data moduleby starting identifiers with some other character.The remainder of this section describes the systemfunctions.106Memory allocation. A contiguous vector of memoryis allocated by the function callvec .Alloc vec(size)This returns a pointer to a vector of size 1 words, whichcan be indexed as vet[0] through vec[size]. The allocationis done by means o f a next-fit algorithm based on theboundary-tag method of Knuth [8]. The vector is returned to the free list by.Free(vec)Process creation. A process is created with a specifiedstack size byid -- .Create(funct, stack size)where funct is a pointer to the function to be invoked asa process. A unique nonzero process id is returned whichis used in future references to the new process. Thecreated process becomes a direct descendant of its creatorin the tree o f processes. The process is created in theembryonic state and cannot execute until it is readied:.Ready(id, argument list).Ready passes the (optional) arguments to the new process and makes it eligible for execution.An optional third argument can be passed to.Create to specify the priority o f the new process; thedefault priority level is 0.C P U allocation. Processes are allocated the CPU asfollows. A process eligible for execution is said to beready. A process which is not ready is said to be blocked.O f the highest priority ready processes, the process readyfor the longest time is allocated the CPU, and is said tobe active.The active process relinquishes the CPU either byblocking or by being preempted when a higher priorityprocess becomes ready. The latter is caused either by ahardware interrupt or by an action of the active process.The active process may block by attempting to communicate with another process or by waiting for an interruptto occur. Certain system functions, st/ch as input andoutput primitives, use interprocess communication andmay block the active process.It follows that on a single processor machine, aprocess executes indivisibly with respect to processes ofthe same or lower priority until it blocks. This relativeindivisibility is a useful property of Thoth processes.The priority of a process remains fixed throughoutits lifetime. The highest user priority is 0; lower prioritiesare greater than 0. The root of a subtree of user processesnormally has priority 0. Thoth system processes havehigher priorities than user processes with the exceptionof a default process which has lower priority than alluser processes and is always ready.Interprocess communication. There are four primitives for passing messages between processes. All messages are 8 words in length.Communicationsofthe A C MFebruary 1979Volume 22Number 2

A process sends a message to another process byid .Send(msg, id)The contents of the 8-word msg vector is sent to theprocess specified by id. The sending process blocks untilthe receiving process has received the message and sentback an 8-word reply with .Reply. The reply messageoverwrites the original msg vector.If the receiving process does not exist, .Send returns0 and the msg vector remains unchanged. Normally.Send returns the id of the process which sent the reply.The receiving process usesid .Receive(msg)orid .Receive(msg, id)The receiving process blocks, if necessary, to receive an8-word message in its msg vector. When the optional idparameter is present, the message must come from thespecified process. When the id parameter is not present,the first process sending to the receiving process willsatisfy the receive. The id of the sending process isreturned for later use in .Reply:.Reply(msg, id)The 8-word reply containing in the msg vector is sent tothe specified process awaiting a reply from the receivingprocess. The sending process is readied upon receivingthe reply, and the replying process does not block.An attempt to receive from a nonexistent processresults in an undefined message and a 0 being returnedby .Receive. A .Reply to a nonexistent process is a nulloperation.Instead of replying to a sender, the receiving processcan forward the message, possibly changing its contents,to another process:.Forward(msg, from id, to id)The process specified by from id must be blocked awaiting a reply from the forwarding process. The effect of.Forward is the same as if the from id process hadperformed a .Send to the process specified by to id ofthe 8-word message in the forwarding process's msgvector. The forwarding process does not block.The interprocess communication primitives can beused for synchronizing and eliminate the need for primitives like semaphores. These primitives have changed anumber of times as the system has evolved. Their semantics and efficiency have considerable impact on thesystem and seem worthy of further study.Interrupts. Interrupts are handled by system processes. These processes use the function call.Await interrupt(device id)to block until an interrupt occurs for the specified device.Such an interrupt can only occur when no processes ofthe same or higher priority are active. Hence an interrupt107causes its associated process to become both ready andactive.Process destruction. Any process can destroy a process (possibly itself) by.Destroy(id)The destroyed process ceases to exist in the sense that itsid becomes invalid, it can no longer execute, and its stackand all memory it has allocated are returned to the freelist. When a process is destroyed, all of its descendantsare also destroyed.For any process blocked doing a .Send to or .Receivefrom a process which is destroyed, the result is the sameas if the .Send or .Receive had been executed for anonexistent process. In particular, when a process isdestroyed, all blocked processes attempting the send toor receive from the deceased process become ready.Teams. Each process belongs to a team which is a setof processes sharing a common address space and acommon free list of memory resources. Processes on thesame team can share data. Processes on different teamscannot share data, but they can communicate via theinterprocess communication primitives. There are twotypes of teams: resident and transient. Processes on resident teams remain in memory and are higher prioritythan those on transient teams. Transient teams may beswapped to secondary storage when primary memorybecomes scarce. Transient teams can be created dynamically while resident teams are created only when thesystem is initialized. The priority, and hence the relativeindivisibility property, of a process on a transient teamapplies only with respect to other processes on the sameteam.Teams allow the use of physical memories larger thanthe logical address space on machines with memorymanagement hardware, and greater concurrency viaswapping on machines with a secondary storage devicesuitable for swapping.Clock. Machines with a source of periodic interruptscan be configured to support a clock abstraction. Thecurrent date and time of day is maintained.The clock can be used by a process to block for aperiod of time. This is done by either.Sleep(time and date)or.Delay(seconds)A process invoking .Sleep will block until the currenttime and date becomes equal to or later than that specified by the time and date vector. A process invoking.Delay will block until the specified number of secondshas elapsed. Using an optional second argument to .Delay, a process can sleep for as little as one clock interrupt(which is machine dependent). In both cases, the processthen becomes ready.Communicationsofthe ACMFebruary 1979Volume 22Number 2

The current date and time of day can be obtained bytransferred by .Put have actually been output to the file.Get time(time and date) Flush( )It can be set byflushes the buffers of the file selected for output.For devices and files on devices which allow directaccess to bytes, Set time(time and date)Input/output. The Thoth input/output system provides a reasonably uniform interface with peripheraldevices and files so they can be used interchangeably bymost programs Each device is assigned a unique name.For example, terminals are named " tty0", " tty l", .the disks are named " disk0", . . . . etc. The randomaccess memory can be treated as an input/output device,called-" mem". This allows the use of input/outputediting functions on strings of characters stored in memory.A file or device is accessed byfcb .Open(pathname, mode)where pathname is a string specifying either the name ofa device or the pathname of a file (which will be definedlater) and mode is a string specifying the mode of access(read, write, append or read/write). Append mode isequivalent to write mode on devices and is definedfurther for files in the next section. In the remainder ofthis section, we will use the word file to mean "file ordevice."The function .Open returns a pointer to afile controlblock which contains a description of the accessed fileand any necessary buffer(s); this pointer serves as anidentifier for the accessed file. The process is said to ownthe fcb and no other process can use it, although otherprocesses can still access the file using separate fcbs.Each open file has a current byte position. When afile is opened (except for append mode), this currentbyte position is initialized to 0, the beginning of the file.An fcb can be used to transfer data to or from filesafter it is selected. An fcb is selected for input by.Seek(fcb, where, how)changes the current byte position. The "how" parameterspecifies the interpretation of the "where" parameter The three possibilities are: absolute byte which sets thecurrent byte position to the where-th byte in the file;relative byte which increments the current byte positionby where, which may be negative; absolute block whichsets the current byte position to the first byte in thewhere-th block of the file. Absolute block seeking appliesonly to devices and files on devices for which the conceptof a block is appropriate.Close(fcb)flushes output (if necessary), removes access to the fileand releases memory used for the fcb. When a processis destroyed, all of its accessed files are automaticallyclosed.File system. Thoth can be configured to support afile system on target machines with one or more directaccess secondary storage devices. The file system is structured as a tree in which each node is a file. In addition,each node may have substructure consisting of one ormore descendant nodes.Each node has a name consisting of up to 32 characters and all the immediate descendants of a node haveunique names. The root node of the tree has the uniquename *. A node is specified by a pathname which is asequence of names separated by the / character. Apathname describes a path through the tree. For exampie, the pathname * refers to the root node, and thepathname Select input(fcb)*/src/fsys/seekAn fcb is selected for output by:refers to the node named seek which is an immediatedescendant of fsys, which is an immediate descendant ofsrc, and src is immediately under the root.The nodes which are direct descendants, or sons, ofa given node, their father, are ordered The file systemprovides functions which return the pathname of thefather, the next brother or the first son of the nodespecified by a given pathname. This enables a programto traverse a subtree of the file system.Each process has an associated current node which isinherited from its parent; it may be changed by.Select output(fcb)These two functions verify the fcb ownership and accessmode.Data is transferred one byte at a time from a fileselected for input bydata .Get ( ).Get returns the current byte right-adjusted and zeropadded on the left. Similarly, data is transferred to a fileselected for output by.Put(data).Put writes the rightmost byte of the data word to thecurrent byte in the file. Both .Put and .Get have theeffect of incrementing the current byte position.These data transfers are implemented with devicedependent buffering schemes To guarantee that all bytes108 Set current node(pathname)The concept of current node allows abbreviated references to files in the subtree rooted at the current node.A pathname starting with " / " describes a path startingat the current node. The current node is referred to by@ . For example, if the current node is */src/fsys thenCommunicationsofthe ACMFebruary 1979Volume 22Number 2

the file */src/fsys/seek can also be referred to as either.Seek mark(fcb)/seekor when a file is opened for append access.or.At mark(fcb)@/seekreturns 1 if the current byte position is equal to the mark,and 0 otherwise. The concept of mark generalizes that of"end of file."Two functions are used to modify the file system treestructure. A new node is created by Make node(pathname)The space occupied by a file is reclaimed and, if it is aleaf, the node is deleted by Remove node(pathname)A file system structure can be grafted as a subtree tothe file system by.Graft(pathname, d e v i c e n a m e )The root node of the file system structure on the devicespecified by device name can subsequently be referredto as pathname. Nodes in the grafted file system structurecan be referred to as described above using pathname asthe name of the root node.Ungraft(pathname)ungrafts the file system structure rooted at pathnamemaking the root of the previously grafted substructureinaccessible using pathname. Grafts allow the use ofmultiple and removable secondary storage devices.The contents of a file is regarded as an infinitesequence of bytes. Initially all of the bytes are null. Thefunction .Put, described above, is used to modify individual bytes in a file when it is open with write or appendaccess. A file is physically represented by one or moreblocks, where the size of a block depends on what isconvenient or efficient for the particular device. All bytesafter the last physical block are null by definition. Filesopen with write or append access are automaticallygrown to contain the data written. A file open with writeaccess can be explicitly changed to a specified size by.Change file(fcb, size in blocks)after which it will only become smaller by another callto .Change file or by removing the file, although the filewill still be grown as required to contain data written.The current byte position is determined byposition .Where(fcb)The file mark is a byte position in the file which isremembered over accesses to the file. The mark is initialized to 0 when the file is created. The mark of the fileselected for output is set to the current byte position by.Mark( )When a file open with only write or append access isclosed, the mark is set to the current byte position. Thecurrent byte position is set to the mark by109Environment enquiry, Environment enquiry is thefacility for a program to access parameters describing thetarget machine. The availability of these parameters maymake an otherwise machine dependent program machineindependent. Some parameters, such as the number ofbits per word, the number of bits per byte, etc., areknown at compile time while others, especially thosedescribing the system configuration, are only availableat execution time.Parameters known at compile time are provided inthe form of predefined manifest constants. A manifestconstant is a base language identifier defined to representa string of text. Every occurrence of a manifest constantis replaced by its definition via textual substitution in thesource code during compilation.Parameters known only at execution time are madeavailable as global variables, or through system functioncalls 3. Portability of ThothIn this section the notion of portable system softwareis made more precise. We then characterize the set ofmachines over which Thoth is considered to be portableand discuss the work entailed in porting the system.With certain programs, such as compilers, assemblers, loaders, etc., one can distinguish the host machineon which the program executes from the target machinefor which the output of the program is intended. We saythat a program is portable over a set of machines if itcosts significantly less to modify it for each machine thanto implement and maintain separately. This cost shouldinclude the cost of running the program throughout itslifetime; hence, a program that is easy to convert for anew machine but grossly inefficient may not be considered portable. If moving portable software to new hostmachines requires no modification, it is said to be machine independent. Portable software is said to be machineinvariant if it requires no modification for new targetmachines. Software that is not machine invariant is saidto be machine specific.Portable software has advantages in development,maintenance and adaptability. It is usually less expensiveto develop one program for several machines than tocustomize an entirely separate program for each machine. Moreover, one can justify better design and documentation because of wider applicability. Also, it iseasier to maintain one well-designed program than several programs.Communicationsofthe ACMFebruary 1979Volume 22Number 2

Operating system portability is aimed at a problemmost prevalent with minicomputers. The decision toacquire a new minicomputer must be largely predicatedon the software available for the machine unless substantial/time and resources are allocated to softwaredevelopment. Moreover a major task for the manufacturer of a new machine is to develop software for it.Porting Thoth to a new machine provides a substantialbody of system software plus a growing collection ofmachine-independent application programs. This reduces the cost of providing software for a new machineand in the case of owning several different minicomputers, greatly reduces the software maintenance cost whenall the machines are running Thoth.Three main portability problems were addressed during the development of Thoth. The first problem was todesign an abstraction of a minicomputer that could beefficiently realized on a large number of machines. Thedual of this problem is that of choosing the domain oftarget machines so that a reasonable (and efficient)abstraction is possible. The second problem was to represent the abstraction in such a form as to minimize theeffort required to implement it on target machines. Thethird problem was to design and implement softwaretools to automate as much of the implementation aspossible.Implementability over a specific domain of machineswas a maj or consideration during the design of the ThothMachine abstraction described in Section 2. Some desirable ideas were not incorporated due to the apparentdifficulty o f implementing them on certain machines.Similarly, some possible target machines were rejecteddue to their lack of hardware to efficiently implementabstractions thought to be fundamental to any reasonable system.A high-level language has been used to representmost of the system so that most of the translation intomachine code can be done by a compiler. The languagehas been designed to encourage the use of machineindependent constructs; however machine-specific codecan, and sometimes must, be used. To document machinedependencies, the software is divided into componentswhich are stored in separate files, each of which containseither a single function, a set of related global datamodules, or a set of related manifest definitions. Eachfile is classified as either machine-invariant or machinespecific; this classification is implied by the subtree ofthe file system in which the file is stored. The machinespecific components which need to be modified duringa port are thus isolated from machine-invariant code,and easy to find.The tree structure of the Thoth file system is used tostructure the source files to reflect functionality as wellas machine dependency. The subtree containing allsource files is rooted at */src; immediately below thisnode are subtrees containing major functional components such as kernel, input/output, file system, etc. Eachof these subtrees contains machine-invariant source filesimmediately below its root plus a further subtree ofmachine-specific code for each target machine type.Thus, for example, there is a subtree rooted at */src/kernel/nova containing the kernel source files which arespecific to NOVA computers.The tree structure of the file system has provedinvaluable for managing the over 2000 files of rapidlychanging source code. The management of these files isa nontrivial job that will become more difficult as Thothis ported to new machines.Some components o f the system which are nearlymachine-invariant are rendered machine-invariant byreplacing machine dependencies with manifests. Thenonly the manifest definitions occur in a machine-specificsource file. For example, the process which is activatedby each real-time clock interrupt executes the followingfunction:110Communicationsofthe ACM.Chronographer(timer){extra .Time vec, .Wake time vec, .Time mods;auto i;repeat{START CLOCK;.Await interrupt(RTC);STOP CLOCK; .Time vec[5];for(i 5; .Time vec[i] .Time mods[i];){.Time vec[i] -- 0; .Time vec[--i];)if(i DAYS && .Time vec[i] 365 &&YEARS[.Time vec] & 3) \ non-leap year test{DAYS[.Time vec] -- 0; YEARS[.Time vec];)if(.Compare vec(.Wake time vec, .Time vec, 5) ! 1&& STATE[timer] RECEIVE BLOCKED){disable;MESSAGE[timer] WAKE UP;BLOCKED ON[ im, er] 0;.Add ready(timer);enable;}})The manifests S T A R T C L O C K and S T O P C L O C Kmust be defined for each different target machine. Themachine-specific manifest definitions for the NOVA are:#START CLOCK twit(.NIO.].IS., .RTC.);#STOPCLOCK twit(.NIO.l.IC., .RTC.);For the TI 990 the definitions are:#STARTCLOCK twit(.CKON.)#STOPCLOCK twit(.CKOF.)Thoth is also made more portable by the readabilityof its source code. The most accurate documentation o fa program is usually its source; internal and externaldocumentation is often out of date. For this reason, ourFebruary 1979Volume 22Number 2

language design has been strongly influenced by aesthetic considerations which are often dismissed as "syntactic sugar." Since the readability of source code depends heavily on the style of coding, members of theproject have spent considerable time discussing

Thoth, a Portable Real-Time Operating System David R. Cheriton, Michael A. Malcolm, Lawrence S. Melen, and Gary R. Sager University of Waterloo Thoth is a real-time operating system which is designed to be portable over a large set of machines