Project White: Architectural Overview Document ============================================== John Quigley Version 00.02, {localdate} {localtime} Project White is a research and development project aiming to design and implement a cutting-edge commercial programming language for distributed and highly dynamic mobile networks. WARNING: This document is entirely a work in progress. Proper attribution, thought organization and further details are forthcoming. Thanks for your patience, and please excuse us for any confusion. Introduction ------------ The White Programming Language will closely adhere to the vision of Ambient Intelligence and the Ambient-oriented paradigm as set forth by the European Commision's _Information Society Technologies Advisory Group_ and the Department of Computer Science at Vrije University in Brussels, Belgium. The language will build upon these fundamental concepts, integrating ideas synthesized by ourselves, as well as those at the vanguard of today's language reasearch. The language will provide a robust and powerful system for implementing massively distributed entities. In this document, I present the various aspects of system design as well as the primary areas of innovation in the implementation of this language. Those innovations stand to include: *Realtime communication* :- It is understood that a highly network-aware system that relies heavily upon messaging in a potentially dynamic or hostile environment requires communication facilities that explicitly expect and incorporate potential network failures at the very heart of its basic computational steps. I propose a set of constructs to supply this, namely a lightweight messaging by contract step at the point of communication initiation. Also, since all messaging is asyncronous, it becomes necessary to support futures for synchronization. White will innovate by providing time-aware futures, that promise to yield a computational value after some contractually-agreed upon time limit, or trow an exception should that limit have been exceeded. *Access-control and security* :- Existing implementations fail miserably when contrasted with today's requirements for secure systems and communication. CORBA's unencrypted traffic is subject to eavesdropping and man-in-the-middle attacks, and it requires a port to be opened in the firewall for each service. This conflicts with the reality of most sane network security policies. White will employ security conscientious design at all levels from the very start. All socket communication will default to tranmission across a secure transport such as OpenSSL. Furthermore, I intend to fully allow for a X.509 certificate-based infrastructure, to ensure trust among White peers. Lastly, a robust set of access control mechanisms will be put in place to guard against unwarranted communication amongst nodes. *Non-blocking communication primitives* :- As there is no shared state between any objects in the White system, a robust communication standard and protocol must be designed. Protocol efficiency is an incredibly important aspect, as there exists the very real potential for heavy messaging in large mesh networks simply to keep all nodes in sync. This overhead requires us to avoid design flaws such as those in CORBA's interoperability protocol, which makes it nearly impossible to build a high-performance event distribution service. The on-the-wire encoding of CORBA contains a large amount of redundancy, but the protocol does not support compression. This leads to poor performance over wide-area networks. *Reified resource and acquaintance management* :- I want to avoid the concomitant complexity that arises from naming facilities like CORBA's interoperable object references (IORs). IORs are opaque entities whose contents are supposed to remain hidden from developers. This is unfortunate for many reasons, not least of which because opaque references essentially force the use of a naming service because clients cannot create object references without the help of an external service. I propose the use of three reified objects accessible to the network subsystem and transparent to the programmer that present resource mappings. Guiding Principles ------------------ As mentioned in the introduction, the goals in my implementation of White are: *Openness* :- No part of the system is hidden. All stages are visible, accessible to and modifiable by, the user. The transformations between adjacent stages are evident. *Simplicity* :- I seek the most simplex language that achieves the stated requirements, and the most compliant C code that implements this language. This necessitates a simple syntax with a small number of language constructs. *Efficiency* :- I seek fast compilation and fast execution of White programs. This implies a fast, smart, one-pass compiler and a fast virtual machine. Additionally, flexible and consise protocol messaging, compiled down to ASN.1 distinguished encoding rules for on-the-wire communication is implemented. *Portability* :- I require White to run on as many platforms as possible. I want to be able to compile the White core unmodified everywhere, and to run White programs unmodified on every platform that has a suitable Lua interpreter. This implies a clean, strict ANSI C implementation with special attention to portability issues, such as avoiding dark corners of C and its librarys, and ensuring that it compiles cleanly as C++. I seek warning-free compilations. *Practical* :- I seek to reign in the sea of research that is at the vanguard of the distributed programming field by identifying the most viable paradigms. I will encorporate these elements into a single language runtime, with an eyes towards practicaliy, style and elegance. Design and Implementation ------------------------- White will use a parser produced by yacc, which is a very valuable tool while the language's suntax is unstable. The White compiler will use no intermediate representation. It will emit instructions for the virtual machine "on the fly" as it parses a program. Nevertheless, it will perform some optimizations. For instance, it may delay the generation of code for base expressions like variables and constants. When it parses such expressions, it generates no code; instead, it will a simple structure to represent them. Therefore, it is very easy to check whether an operand for a given instruction is a constant or a local variable and use those values directly in the instruction, thereby avoiding unnecessary and costly moves. To be portable across many different C compilers and platforms, White must avoid many tricks commonly used by interpreters, such as direct threaded code. Instead, well use a standard while-switch dispatch loop. The Object Model ---------------- Prototype-Based ~~~~~~~~~~~~~~~ As a consequence of parameter passing in the context of remote messages, objects are copied back and forth between remote hosts. Since an object in a class-based programming language cannot exist without its class, this copying of objects implies that classes have to be copied as well. However, a class is by definition an entity that is conceptually shared by all its instances. From a conceptual point of view there is only one single version of the class on the network, containing the shared class variables and method implementations. Hence, copying classes over the network causes state consistency problems because objects residing on different machines can independently update a class variable of "their" copy of the class. By definition, classes impose a sharing relation upon all their instances. This relation is established at object creation time and remains implicit throughout the lifetime of all its instances. However, because of independent class updates performed by autonomous disconnected devices, two instances of the same class can unexpectedly exhibit different behaviour. A simple solution consists of getting rid of classes and the sharing relation they impose on objects altogether. This is the paradigm defined by prototype-based languages like Self. In these languages objects are conceptually idiosyncratic, such that the above problems do not arise. Sharing relations between different prototypes can still be established (such as traits), but the point is that these have to be explicitly encoded by the programmer at all times. Resources and Acquaintances --------------------------- Object discovery is perhaps one of the most challenging aspects of a decentralized system. Any node in the context of a White system is capable of cloning _N_ active objects, all of which must both notify the network of their presence and broadcast their public functionality as made available in the _Provided Mailbox_. Discovery ~~~~~~~~~ TODO Acquaintance Management ~~~~~~~~~~~~~~~~~~~~~~~ TODO Communication ------------- Non-blocking Primitives ~~~~~~~~~~~~~~~~~~~~~~~ The fact that every hardware device is an autonomous computational entity (inducing natural concurrency), combined with the fact that connections are volatile, implies the necessity for non-blocking communication primitives. Blocking communication is a source of deadlocks, in our case, distributed deadlocks. Such deadlocks in local networks are not considered to be that harmful, since the cause of the deadlock can be easily debugged with contemporary remote debugging environments. However, in a mobile network environment, not all parties are necessarily available for communication, making the resolution of deadlocks extremely hard. Another, more important consideration when designing a concurrency model for a language that is to run on mobile networks, is that the communication mechanism should minimize the duration resources are locked. This is very important, because the extremely high latency of communication (over volatile connections) in mobile networks would diminish the availability of resources. Indeed, having blocking communication primitives would imply a program or device to block upon encountering unstable connections or temporary unavailability of another device. One of the foundamental tenents in ambient-oriented programming tells us that communication must be non-blocking, to avoid the issues previously discussed. An ambient-oriented concurrency model is a concurrency model without blocking communication primitives. Quite often, the issue of non-blocking communication is confused with asynchronous message sending. Asynchronous message sending implies that the send operation is nonblocking, but tells us nothing about the, potentially implicit, receive operation. Realtime Facilities ~~~~~~~~~~~~~~~~~~~ *Characteristics* :: - server cannot statically decide timing constraints on the server - _rejected_ or _aborted_ notifications from server do not always arrive *Requirements* :: - _best effort / least suffering_ - impossible to assume global clock - communication time cannot be bounded - cannot make assumptions about time permitted for computation on server *Elements* :: - time-constrained futures - object response timing threshold contracts (_inter-messaging by contract_) Message Mailboxes ~~~~~~~~~~~~~~~~~ TODO Communication Traces ~~~~~~~~~~~~~~~~~~~~ TODO Threads and Coroutines ---------------------- White plans to implement asymmetric coroutines (also called semi-symmetric coroutines or semi-coroutines). These coroutines will be supported by three functions from the White standard library: create, resume, and yield. The create function receives a "main" function and creates a new coroutine with that function. It returns a value of type thread that represents that coroutine (like all values in White, coroutines will be first-class values). The resume function (re)starts the execution of a given coroutine, calling its main function. The yield function suspends the execution of the running coroutine and returns the control to the call that resumed that coroutine. Conceptually, each coroutine has its own stack. Coroutines in White should be stackful, in the sense that one can suspend a coroutine from inside any number of nested calls. The interpreter simply puts aside the entire stack for later use and continues running on another stack. A program can restart any suspended coroutine at will. The garbage collector collects stacks whose coroutines are no longer accessible. The combination of stackfulness and first-class status makes coroutines, as implemented in White, equivalent to one-shot continuations. They allow the programmer to implement several advanced control mechanisms, such as cooperative multithreading, generators, symmetric coroutines, backtracking, and so on. Virtual Machine --------------- White will run programs by first compiling them into instructions ("opcodes") for a virtual machine and then executing those instructions. For each function that White compiles, it will create a prototype, which contains an array with the opcodes for the function and an array of White values with all constants (literal strings and numerals) used by the function. White will use a register-based virtual machine. This register-based machine also uses a stack, for allocating activation records, wherein the registers live. When White enters a function, it preallocates from the stack an activation record large enough to hold all the function registers. All local variables are allocated in registers. Consequently, access to local variables should be particularly efficient. Register-based code avoids several "push" and "pop" instructions that stack-based code needs to move values around the stack. The register architecture both avoids excessive copying of values, and reduces the total number of instructions per function. There are two problems typically associated with register-based machines: code size and decoding overhead. An instruction in a register machine needs to specify its operands, and so it is typically larger than a corresponding instruction in a stack machine. On the other hand, register machines generate less opcodes than stack machines, so the total code size is not much larger. Most instructions in a stack machine have implicit operands. The corresponding instructions in a register machine must decode their operands from the instruction. Such decoding adds overhead to the interpreter. There are several factors that ameliorate this overhead. First, stack machines spend some time manipulating implicit operands (e.g., to increment or decrement the stack top). Secondly, in a register machine all operands are inside the instruction and the instruction is a machine word, so the operand decoding involves only cheap operations, such as logical operations. Moreover, instructions in stack machines frequently need multi-byte operands. On a register machine, because the operands are inside the instruction, the interpreter does not have to fetch them independently. There are 35 proposed instructions in White's virtual machine. Most instructions were chosen to correspond directly to language constructs: arithmetic, table creation and indexing, function and method calls, setting variables and getting values. There is also a set of conventional jump instruction to implement control structures. Below is the complete set, together with a brief summary of what each instruction does, using the following notation: - R(X) means the Xth register - K(X) means the Xth constant - RK(X) means either R(X) or K(X-k) - G[X] means the field X in the table of globals - U[X] means the Xth upvalue .Figure N: Tentative instructions in White's proposed virtual machine. ........................................................................... MOVE A B R(A) := R(B) LOADK A Bx R(A) := K(Bx) LOADBOOL A B C R(A) := (Bool)B; if (C) PC++ LOADNIL A B R(A) := ... := R(B) := nil GETUPVAL A B R(A) := U[B] GETGLOBAL A Bx R(A) := G[K(Bx)] GETTABLE A B C R(A) := R(B)[RK(C)] SETGLOBAL A Bx G[K(Bx)] := R(A) SETUPVAL A B U[B] := R(A) SETTABLE A B C R(A)[RK(B)] := RK(C) NEWTABLE A B C R(A) := {} (size = B,C) SELF A B C R(A+1) := R(B); R(A) := R(B)[RK(C)] ADD A B C R(A) := RK(B) + RK(C) SUB A B C R(A) := RK(B) - RK(C) MUL A B C R(A) := RK(B) * RK(C) DIV A B C R(A) := RK(B) / RK(C) POW A B C R(A) := RK(B) ^ RK(C) UNM A B R(A) := -R(B) NOT A B R(A) := not R(B) CONCAT A B C R(A) := R(B) .. ... .. R(C) JMP sBx PC += sBx EQ A B C if ((RK(B) == RK(C)) ~= A) then PC++ LT A B C if ((RK(B) < RK(C)) ~= A) then PC++ LE A B C if ((RK(B) <= RK(C)) ~= A) then PC++ TEST A B C if (R(B) <=> C) then R(A) := R(B) else PC++ CALL A B C R(A), ... ,R(A+C-2) := R(A)(R(A+1), ... ,R(A+B-1)) TAILCALL A B C return R(A)(R(A+1), ... ,R(A+B-1)) RETURN A B return R(A), ... ,R(A+B-2) (see note) FORLOOP A sBx R(A)+=R(A+2); if R(A)