The Death of the Stack

The death of the stack is the gateway to asynchronous computing and high-concurrency applications, and will thus ultimately enable greater complexity in software.

Currently all major programming languages are stack-based: C/C++, Java, .NET, Smalltalk, Perl, PHP, Python, etc. The importance of the stack concept has influenced the design of the underlying CPUs, which generally have special segment registers and corresponding opcodes to facilitate stack manipulation.

The use of a stack in a procedural language like C makes sense in terms of the basic metaphor: call a function and, on returning, continue where you left off. However, it begins to show its age in any OO language, and doubly so in the development of multithreaded or distributed applications.

In OO languages an object cleanly encapsulates its own data and functionality, yet the actual execution is still delegated to the underlying machine, be that a physical CPU or a virtual machine. Thus the metaphor in OO does not extend to the encapsulation of execution.

Viewed differently, the stack is inherently bound to the current thread of control, yet the thread of control can cross over from one object to the next as new functions are called. This breaks the illusion that objects are autonomous entities. Instead there is always the need for creating an object and then, independently, a thread of execution.

This makes it cumbersome to implement asynchronous execution. Synchronous calls are obviously modeled well by a stack, where the return is handled by popping off the current frame. Yet an asynchronous call requires explicitly using an API to create a new thread and either initialising the thread with a closure (a runnable, delegate, anonymous object, etc.), or triggering a method via a message queue (or trivial blocking latch). To get information out of a thread we need to explicitly create listeners or additional message queues.
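A minimal Python sketch of the boilerplate described above (all names are illustrative): an explicit service thread plus two message queues, just to make a single asynchronous call and recover its result.

```python
import queue
import threading

def worker(tasks: queue.Queue, results: queue.Queue) -> None:
    # Service thread: pull one task, run it, push the answer back.
    func, arg = tasks.get()
    results.put(func(arg))

# Just to make one asynchronous call and get a value back, the
# caller must wire up a thread and two queues by hand.
tasks: queue.Queue = queue.Queue()
results: queue.Queue = queue.Queue()

t = threading.Thread(target=worker, args=(tasks, results))
t.start()

tasks.put((lambda n: n * n, 7))  # the "asynchronous call"
answer = results.get()           # block until the callee replies
t.join()
print(answer)  # 49
```

None of this wiring expresses the caller's actual intent; it is pure mechanism that the language forces into application code.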

Thus, while languages facilitate thread synchronisation (e.g. Java monitors and the synchronized keyword), none of the mainstream languages actually encapsulates a suitable model for asynchronous calls.

To escape this shortfall we need to build languages that are stackless. Stated out of context this seems like a bit of a jump in the line of reasoning, so let's ask the question: what does this mean, and why will it support the changes needed?

Languages like Self speak of activation records: when a method is invoked, an activation record is created. This is essentially an instance of a small object that holds the local variables and method arguments, together with information about the execution of the method's instructions. Using this paradigm, the record could actually be allocated on the heap along with all other short-lived object instances and recycled via garbage collection.

If these activation records are chained together in a linked list, and the activity in the current one is suspended when a new activation record is created, then clearly we are modeling the same synchronous call path that the stack models when the equivalent information is stored in a stack frame.
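To make this concrete, here is a Python sketch (the record layout and field names are illustrative) in which heap-allocated activation records, chained through a caller link, replay the same synchronous call path that a stack would — here computing a factorial without ever deepening the native call stack.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ActivationRecord:
    """A heap-allocated stand-in for a stack frame: an argument,
    a local, a resume point, and a link to the suspended caller."""
    n: int                      # method argument
    pc: int = 0                 # which "instruction" runs next
    result: int = 0             # local holding the answer
    caller: Optional["ActivationRecord"] = None

def run_factorial(n: int) -> int:
    # Creating a record suspends the current one, just as a push would.
    current = ActivationRecord(n=n)
    while True:
        if current.pc == 0:     # entry: test the base case
            if current.n <= 1:
                current.result = 1
                current.pc = 1  # ready to "return"
            else:
                # suspend this record and chain a new one onto it
                current = ActivationRecord(n=current.n - 1, caller=current)
        else:                   # pc == 1: "return" into the caller
            if current.caller is None:
                return current.result
            callee, current = current, current.caller
            current.result = current.n * callee.result
            current.pc = 1

print(run_factorial(5))  # 120
```

The driver loop walks the linked records exactly as a CPU walks stack frames; the difference is that the records are ordinary garbage-collected objects.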

However, if the underlying (virtual) machine serviced all activation records simultaneously as they were created, then every activity would execute concurrently. On current CPUs and operating systems this could be implemented by having a number of underlying service threads that carry out the work.
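One way such service threads might look, sketched in Python with plain callables standing in for activation records (the queue-and-pool arrangement is an assumption, not a prescribed design):

```python
import queue
import threading

# A pool of service threads drains a shared queue of "activation
# records" (here, plain callables), so every record that is created
# starts executing concurrently with its creator.
records: queue.Queue = queue.Queue()
results: list = []
lock = threading.Lock()

def service() -> None:
    while True:
        record = records.get()
        if record is None:      # poison pill: shut this thread down
            return
        value = record()
        with lock:
            results.append(value)

pool = [threading.Thread(target=service) for _ in range(4)]
for t in pool:
    t.start()

for n in range(8):              # "creating" eight records
    records.put(lambda n=n: n * n)
for _ in pool:                  # one poison pill per service thread
    records.put(None)
for t in pool:
    t.join()

print(sorted(results))  # [0, 1, 4, 9, 16, 25, 36, 49]
```

Note that completion order is nondeterministic — which is precisely the oversimplification the next paragraph addresses.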

Unfortunately, this is clearly an oversimplification. For one, a given object could be carrying out two different actions simultaneously (since a calling object could simply invoke it twice in succession). This would generally lead to undesirable behaviour unless the object performed explicit internal locking to synchronise manipulation of its internal state. Instead, I would suggest that a more sensible default is that any given object can perform exactly one task at a time. If the object wishes to act as if it were performing multiple concurrent operations internally, it could simply delegate tasks to a number of inner objects.
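A sketch of this "one task at a time" default, assuming a mailbox drained by a single service thread (the class and method names are hypothetical):

```python
import queue
import threading

class SerialObject:
    """An object that performs exactly one task at a time: every
    invocation is queued in a mailbox and drained by one service
    thread, so tasks never overlap and no internal locking is needed."""
    def __init__(self) -> None:
        self._mailbox: queue.Queue = queue.Queue()
        self._log: list = []
        self._thread = threading.Thread(target=self._serve, daemon=True)
        self._thread.start()

    def _serve(self) -> None:
        while True:
            task = self._mailbox.get()
            if task is None:            # shutdown sentinel
                return
            task()                      # runs alone; state is safe

    def invoke(self, name: str) -> None:
        # Two back-to-back invocations are serialised, never interleaved.
        self._mailbox.put(lambda: self._log.append(name))

    def stop(self) -> list:
        self._mailbox.put(None)
        self._thread.join()
        return self._log

obj = SerialObject()
obj.invoke("first")
obj.invoke("second")
history = obj.stop()
print(history)  # ['first', 'second']
```

An object wanting internal concurrency would hold several such inner objects and delegate tasks among their mailboxes.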

Furthermore, there is still a need for “returning” data. However, this could be achieved by utilising a callback mechanism, which would now be safe since there would be no growing stack in the case of mutual recursion (A calls B and B calls A).
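The safety of callback-based returns under mutual recursion can be illustrated with a run queue standing in for the stack (a sketch; all names are illustrative). A and B call each other a hundred thousand times — far deeper than any native stack — yet the stack never grows, because "calling" merely schedules a task.

```python
from collections import deque

pending: deque = deque()

def call(func, *args) -> None:
    # "Calling" just schedules a task; the native stack never grows.
    pending.append((func, args))

def a(n: int, on_done) -> None:
    if n == 0:
        call(on_done, 0)          # "return" via the callback
    else:
        call(b, n - 1, on_done)   # A calls B...

def b(n: int, on_done) -> None:
    call(a, n, on_done)           # ...and B calls straight back into A

result = None
def finished(value: int) -> None:
    global result
    result = value

call(a, 100_000, finished)        # far deeper than any native stack
while pending:                    # the driver loop replaces the stack
    func, args = pending.popleft()
    func(*args)
print(result)  # 0
```

Each driver iteration runs one task and schedules at most one more, so the native stack depth stays constant throughout.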

Finally, there is still the need for serialising certain tasks into a clear sequence. However, this could be easily achieved by using a simple state machine to manage instructions. That is, the language construct implicitly creates lightweight objects that act as state machines that, upon receiving a completion callback from one instruction, trigger the invocation of the next instruction. Instructions would generally be akin to the Smalltalk metaphor of message passing.
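Such an implicit state machine might be sketched as follows (a hand-written version of what the language construct would generate; the names are hypothetical):

```python
class Sequencer:
    """A lightweight state machine: on each completion callback it
    triggers the next instruction, serialising an otherwise
    asynchronous sequence of message sends."""
    def __init__(self, instructions) -> None:
        self._instructions = list(instructions)
        self._step = 0

    def start(self) -> None:
        self._advance()

    def _advance(self) -> None:
        if self._step < len(self._instructions):
            instruction = self._instructions[self._step]
            self._step += 1
            # each instruction receives the completion callback that
            # drives the machine into its next state
            instruction(self._advance)

trace: list = []

def make_instruction(name: str):
    def instruction(done) -> None:
        trace.append(name)
        done()  # signal completion; the sequencer fires the next step
    return instruction

Sequencer([make_instruction(s) for s in ("open", "write", "close")]).start()
print(trace)  # ['open', 'write', 'close']
```

Here each instruction happens to complete synchronously, but the callback protocol is the same if an instruction instead completes later from another service thread.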

Thus, in the end, by removing stacks we open up the path towards truly asynchronous, high-concurrency languages that will be able to take advantage of the increasing number of cores available in modern CPUs.

Written by Stewart Gebbie