Synchronising Models with Reality

with tags models life cycle - Saturday, August 19, 2017

When taking a model driven approach to software development, one of the problems that arises is that of how to synchronise the model with the reified implementation.

Why is this a problem?

Put more bluntly:

How do we know that the final software accurately implements the model?

To provide a bit more context: software modelling is not new (e.g. UML, MDA, EMF and other TLA approaches), but here we are focusing on building models with conventional software languages and tools. These models are then built as standard executable software.

As such, a process model, in this form, would simulate a sufficient set of business activities that need to be catered for by the system being built. However, being a model it appropriately excludes aspects such as:

horizontal scaling of the system
redundancy and fail over
accumulation of telemetry
access control restrictions
network communications
…

And so the list goes on. That is, it should be clear that while the model is indeed vanilla executable software, it simply avoids many of the elements needed for being production ready. This makes it easier to rapidly develop the model without being sidetracked by other technicalities.

But, as the real system is being developed the tendency will be for it to diverge from the model. This is problematic for a couple of reasons:

ensuring that design decisions that have been baked into the model have been honoured in the production system
ensuring future utility of the model as a representative view of the production system

For the first aspect, the problem is that, by necessity, a software developer building a production ready version of the system will need to narrow their focus. They need to deal with localised issues and pay attention to fine grained details. As such, they may not be seeing the bigger picture. Therefore, it may seem reasonable, at implementation time, to sidestep or re-engineer some aspect of the design rather than following the model. The effects of this may only show up much later as integration problems, race conditions surfacing in higher level processes, or unnecessary rigidity in the system architecture that hampers future development.

The second point relates to one of the key values of models in the first place. Models not only provide a means of testing ideas or a means of communicating designs, but they also provide a fundamental means by which to manage complexity over time.

A model, by design, factors out many aspects of the system and focuses on a particular cross-cut of the real system. This means that the model presents a less complex rendition of the complete system. This, in turn, makes it easier to reason about – given our limited human processing power.

Now, as most of us in software development know, systems are rarely built in one go and then left in a static state thereafter. Rather, systems evolve over time to facilitate incremental releases and to adapt to new requirements. As such, we need to be able to review various aspects of the system after it has reached production. But by then the system is a lot more complex, since it needs to cater for the full range of dimensions needed for production. Yet, it would be valuable to be able to step back again and use the model as a way of reasoning about one particular dimension (say, the business processes).

However, if the model is no longer representative of the real world system then we cannot trust it: deciding where the model is and is not accurate becomes an immediate burden, in addition to dealing with the actual problem at hand. Often this is tackled by simply ignoring any previous models and instead reverse engineering a model from the reified system. However, this is generally an expensive exercise.

Ultimately, both of these requirements depend on being able to show how accurately the reified system implements the model, and where the differences are.

How can we solve this?

We could tackle this from a rather theoretical stance by first trying to define exactly what an abstraction is in the context of software engineering, and then arguing that the production system and the model should, at some level, exhibit the same abstractions. Therefore, if we analyse the source code of the production system we should be able, in some theoretical sense, to derive an abstraction that matches the level of abstraction represented by the source code of the model. Then we could compare these two abstractions.

While this approach may be laudable and could provide a good theoretical underpinning for system modelling, it would require the refinement of a lot of theoretical machinery, not to mention then needing to build practical tooling.

Alternatively, we could attempt to impose the model as the single source of truth and then attempt to use code generation or similar to produce code templates, as has been done with many of the MDE or MDA tools. But history would seem to suggest that this is probably too heavy handed to be successful within the industry. More specifically, and somewhat ironically from an architectural standpoint, these approaches are probably too tightly coupled to the implementation, which makes them difficult to maintain and makes it difficult to introduce alternative, overlapping model based views of the system.

So, rather than going down either of these rabbit holes, let’s take a more pragmatic approach that enables the model to remain decoupled from the production system.

For this we start by considering systems as enactments of processes. That is, in some sense, the system performs a sequence of actions that can be recorded as an event trace.

Now, within the context of an event trace it becomes much simpler to think about how we might compare two different representations of, what we expect to be, equivalent systems (modulo all the stuff that is different ;) ).

More concretely, we execute the model and record a literal trace of the sequence of events simulated by the model. We then let the real system run an equivalent use case to that of the model and similarly record the sequence of events from the real system.
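
As a minimal sketch of what recording such a trace could look like, assume each event is just an actor plus an action, appended in the order it occurs. The names `Event`, `Trace` and `run_model` are hypothetical and only illustrate the idea; the events are the model-side ones from the example further down. A production system would presumably emit the same kind of records from its own instrumentation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Event:
    actor: str   # software element performing the action, e.g. "frontend"
    action: str  # what the element did, e.g. "receives click"


@dataclass
class Trace:
    """An ordered, literal record of the events produced by one run."""
    events: List[Event] = field(default_factory=list)

    def record(self, actor: str, action: str) -> None:
        self.events.append(Event(actor, action))


def run_model(trace: Trace) -> None:
    # Hypothetical model of a single use case. Being a model, it also
    # simulates the user interacting with the system.
    trace.record("user", "clicks 'send'")
    trace.record("frontend", "receives click")
    trace.record("frontend", "delegates to backend")
    trace.record("backend", "computes updated state")
    trace.record("backend", "records updated state")
    trace.record("backend", "replies to frontend")
    trace.record("user", "consumes frontend result")


model_trace = Trace()
run_model(model_trace)
```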

With these two traces, depending on how the trace events are generated, we could easily expect there to be differences. So, now we explicitly filter out aspects from each side that are not relevant for the comparison e.g.

the model might need to simulate users interacting with the system
the production system might need to implement complex state replication

But, for the business process, what we really need to show is that the data flows between key software elements are the same in both the model and the production system.

For example:

        Model                               System
        ─────                               ──────
    |   user clicks 'send'               ≠  •
        frontend receives click          ≡  frontend receives click
    t   frontend delegates to backend    ≡  frontend delegates to backend
    i   •                                ≠  backend routes message to replica
    m   backend computes updated state   ≡  backend computes updated state
    e   backend records updated state    ≡  backend records updated state
        •                                ≠  backend orchestrates state replication
    ↓   backend replies to frontend      ≡  backend replies to frontend
        frontend result consumed by user ≠  •
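
The comparison above can be sketched with Python's standard `difflib.SequenceMatcher`, treating each trace as a list of event strings. This is only one possible way to align two traces, not a prescribed method. The helper names `align` and `without` are hypothetical, and filtering on the substrings "user" and "replica" is an assumption made purely for this example.

```python
from difflib import SequenceMatcher

MODEL_TRACE = [
    "user clicks 'send'",
    "frontend receives click",
    "frontend delegates to backend",
    "backend computes updated state",
    "backend records updated state",
    "backend replies to frontend",
    "frontend result consumed by user",
]

SYSTEM_TRACE = [
    "frontend receives click",
    "frontend delegates to backend",
    "backend routes message to replica",
    "backend computes updated state",
    "backend records updated state",
    "backend orchestrates state replication",
    "backend replies to frontend",
]


def align(model, system):
    """Pair up matching events; an unmatched event is paired with None."""
    rows = []
    matcher = SequenceMatcher(a=model, b=system, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            rows.extend(zip(model[i1:i2], system[j1:j2]))
        else:
            rows.extend((m, None) for m in model[i1:i2])
            rows.extend((None, s) for s in system[j1:j2])
    return rows


def without(trace, is_irrelevant):
    """Drop events that are explicitly out of scope for the comparison."""
    return [event for event in trace if not is_irrelevant(event)]


# Print the side-by-side view: matched rows use "≡", unmatched rows "≠".
for m, s in align(MODEL_TRACE, SYSTEM_TRACE):
    print(f"{m or '•':<34} {'≡' if m == s else '≠'}  {s or '•'}")

# After filtering simulated users out of the model trace and replication
# out of the system trace, the remaining business process should match.
model_core = without(MODEL_TRACE, lambda e: "user" in e)
system_core = without(SYSTEM_TRACE, lambda e: "replica" in e)
```

With the filters applied, `model_core == system_core` holds for these two traces, which is the property we want to verify automatically.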

One additional complexity that should be mentioned and may need to be catered for is that of concurrency within the system, be that the model or the production system. This concurrency may result in different ordering of events. Without going into the detail here, this can be addressed by considering ‘happens-after’ relationships between subprocesses.
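
To give a rough idea of what that could look like, the sketch below checks a trace against a set of 'happens-after' constraints instead of comparing exact event order. The function name `respects`, the event names `A1`, `A2`, `B1`, `B2`, and the assumption that each event occurs at most once per trace are all hypothetical simplifications.

```python
def respects(trace, happens_after):
    """Return True if every (earlier, later) constraint holds in the trace.

    Assumes each event name appears at most once in the trace.
    """
    position = {event: index for index, event in enumerate(trace)}
    return all(
        earlier in position
        and later in position
        and position[earlier] < position[later]
        for earlier, later in happens_after
    )


# Two concurrent subprocesses, A and B. Within each subprocess the order is
# fixed, but the subprocesses may interleave freely with each other.
CONSTRAINTS = {("A1", "A2"), ("B1", "B2")}

model_trace = ["A1", "A2", "B1", "B2"]    # one valid interleaving
system_trace = ["B1", "A1", "B2", "A2"]   # a different valid interleaving
broken_trace = ["A2", "A1", "B1", "B2"]   # A2 occurs before A1

model_ok = respects(model_trace, CONSTRAINTS)
system_ok = respects(system_trace, CONSTRAINTS)
broken_ok = respects(broken_trace, CONSTRAINTS)
```

Here the model and system traces differ as literal sequences, yet both satisfy the same constraints, while the broken trace does not.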

What can we do with this?

If we automate the mechanisms needed for generating and comparing these traces then we are in a good position to manage the following types of scenarios during the life cycle of the software:

architectural models can be used in automated testing as part of continuous integration pipelines.
complex refactoring changes of the production system can still be easily evaluated for process correctness by excluding other production ready aspects.
system changes that have been modelled can be used to generate a diff between different processes generated by different versions of the model so as to evaluate the impact.
system changes that have been modelled can be used to generate a diff between the model and the production system so as to evaluate the scope and complexity of work required to implement the changes.
models can be built using readily available standard software development tools.
architectural models can be refactored to meet other design goals while being verified to produce the same behavioural processing.
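
As an illustration of the "diff between versions of the model" scenario in the list above, the following sketch uses the standard `difflib.unified_diff` on two traces. The second trace, including its extra "backend validates message" event, is invented purely for the example, and `trace_diff` and `changed_events` are hypothetical helper names.

```python
from difflib import unified_diff

MODEL_V1 = [
    "frontend receives click",
    "frontend delegates to backend",
    "backend computes updated state",
    "backend records updated state",
    "backend replies to frontend",
]

# Hypothetical next version of the model: one additional validation step.
MODEL_V2 = [
    "frontend receives click",
    "frontend delegates to backend",
    "backend validates message",
    "backend computes updated state",
    "backend records updated state",
    "backend replies to frontend",
]


def trace_diff(old, new, old_name="model-v1", new_name="model-v2"):
    """Return a unified diff of two traces as a list of lines."""
    return list(
        unified_diff(old, new, fromfile=old_name, tofile=new_name, lineterm="")
    )


def changed_events(diff_lines):
    """Keep only added or removed events, skipping the file header lines."""
    return [
        line
        for line in diff_lines
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


impact = changed_events(trace_diff(MODEL_V1, MODEL_V2))
```

The same helper could be pointed at a model trace and a filtered production trace to estimate the scope of work needed to implement a modelled change.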

That is, we are ultimately in a much better position to explicitly manage change in the system. We are better able to detect aberrant changes, we are better able to assess the impact of necessary changes and we are better able to communicate the required changes.

This leads to higher quality software systems that can be developed more effectively.