
End-to-end simulation hello world!

I’ve talked to many people about how to maximise the utility of a simulator for business decision-making, rather than focussing on the fidelity of reproducing real phenomena. This generally means delivering a custom simulator project lean, in thin, vertical, end-to-end slices. This approach maximises putting learning into action and minimises risk carried forward.

For practitioners, and to sharpen my own thinking, I have provided a concrete example in code: a Hello, World! for custom simulator development. The simulated phenomena will be simple, but we’ll drive all the way through a very thin slice to supporting a business decision.

Introduction to simulation

To understand possible futures, we can simulate many things: physical processes, digital systems, management processes, crowd behaviours, combinations of these things, and more.

When people intend to make changes in systems under simulation, they wish to understand the performance of the system in terms of the design of actions they may execute or features that may exist in the real world. The simulator’s effectiveness is determined by how much it accelerates positive change in the real world.

Even as we come to understand systems through simulation, people frequently continue to interact with these systems in novel and unpredictable ways – sometimes influenced by the system design and typically impacting the measured performance – so it remains desirable to perform simulated and real world tests in some combination, sequentially or in parallel.

For both of these reasons – because simulation is used to shorten the lead time to interventions, and because those interventions may evolve based on both simulated and exogenous factors – it is desirable to build simulation capability in an iterative manner, in thin, vertical slices, in order to respond rapidly to feedback.

This post describes a thin vertical slice for measuring the performance of a design in a simple case, and hence supporting a business decision that can be represented by the choice of design, and the outcomes of which can be represented by performance measures.

This post is a walk-through of this simulation_hello_world notebook and, as such, is aimed at a technical audience, but one that cares about aligning responsive technology delivery and simulation-augmented decision making for a wider stakeholder group.

If you’re just here to have fun observing how simulated systems behave, this post is not for you, because we choose a very simple, very boring system as our example. There are many frameworks for the algorithmic simulation of real phenomena in various domains, but our example here is so simple, we don’t use a framework.

The business decision

How much should we deposit fortnightly into a bank account to best maintain a target balance?

Custom simulator structure

To preserve agility over multiple thin slices through separation of concerns, we’ll consider the following structure for a custom simulator:

  1. Study of the real world
  2. Core algorithms
  3. Translation of performance and design
  4. Experimentation
  5. Translation of an environment

We’ll look at each stage and then put them all together and consider next steps. As a software engineering exercise, the ongoing iterative development of a simulator requires consideration of Continuous Delivery, and in particular shares techniques with Continuous Delivery for Machine Learning (CD4ML). I’ll expand on this in a future post.

Study of the real world

We study how a bank account works and aim to keep the core algorithms really simple – as befits a Hello, World! example. In our study, we recognise the essential features of: a balance, deposits and withdrawals. The balance is increased by deposits and reduced by withdrawals, as shown in the following Python code.

class Account:

  def __init__(self):
    self.balance = 0

  def deposit(self, amount):
    self.balance = self.balance + amount

  def withdraw(self, amount):
    self.balance = self.balance - amount

Core algorithms

Core algorithms reproduce real world phenomena in some domain, be it physical processes, behaviour of agents, integrated systems, etc. We can simulate how our simple model of a bank account will behave when we perform some set of transactions on it. In the realm of simulation, these are possible rather than actual transactions.

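def simulate_transaction(account, kind, amount):
  # 'd' is a deposit, 'w' is a withdrawal; anything else is an error
  if kind == 'd':
    account.deposit(amount)
  elif kind == 'w':
    account.withdraw(amount)
  else:
    raise ValueError("kind must be 'd' or 'w'")

def simulate_balance(transactions):
  account = Account()
  balances = [account.balance]
  for t in transactions:
    simulate_transaction(account, t[0], t[1])
    # record the balance after every transaction, not just the final state
    balances.append(account.balance)
  return balances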

When we simulate the bank account, note that we are interested in capturing a fine-grained view of its behaviour and how it evolves (the sequence of balances), rather than just the final state. The final state can be extracted in the translation layer if required.

tx = [('d', 10), ('d', 20), ('w', 5)]
simulate_balance(tx)
[0, 10, 30, 25]

Visualisation is critical in simulator development – it helps to communicate function, understand results, validate and tune implementation and diagnose errors, at every stage of development and operation of a simulator.

This could be considered a type of discrete event simulation, but as above, we’re more concerned with a thin slice through the whole structure than the nature of the core algorithms.

Translation of performance and design

The core algorithms are intended to provide a fine-grained, objective reproduction of real observable phenomena. The translation layer allows people to interact with a simulated system at a human scale. This allows people to specify high-level performance measures used to evaluate the system designs, which are also specified at a high level, while machines simulate all the rich detail of a virtual world using the core algorithms.


We decide for our Hello, World! example that this is a transactional account, and as such we care about keeping the balance close to a target balance, so that sufficient funds will be available, but we don’t leave too much money in a low interest account. We therefore measure performance as the average (mean) absolute difference between the balance at each transaction, and the target balance. Every transaction that leaves us over or under the target amount is penalised by the difference and this penalty is averaged over transactions. A smaller measure means better performance.

This may be imperfect, but there are often trade-offs in how we measure performance. Discussing these trade-offs with stakeholders can be a fruitful source of further insight.

def translate_performance_TargetBalance(balances, target):
  return sum([abs(b - target) for b in balances]) / len(balances)
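
As a quick check of the arithmetic, here is a minimal sketch that applies the measure to the sequence of balances from the earlier example. The target of 20 is an arbitrary value chosen for illustration; it is not a value used in the notebook.

```python
def translate_performance_TargetBalance(balances, target):
  # Mean absolute difference between each balance and the target.
  return sum(abs(b - target) for b in balances) / len(balances)

balances = [0, 10, 30, 25]  # sequence produced by the earlier example
target = 20                 # illustrative target (an assumption)

# Penalties are |0-20|, |10-20|, |30-20|, |25-20| = 20, 10, 10, 5
performance = translate_performance_TargetBalance(balances, target)
print(performance)  # 11.25
```

The four penalties sum to 45, and averaging over the four balances gives 11.25.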


While we can make any arbitrary number of transactions and decide about each individual transaction, we consider for simplicity that we set up a fortnightly deposit schedule, and make one decision about the amount of that fortnightly deposit.

As the performance translation distills complexity into a single figure, so the design translation causes the inverse, with complexity blooming from a single parameter. We translate the single design parameter into a list of every individual transaction, suitable for simulation by core algorithms.

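ANNUAL_FORTNIGHTS = 26  # number of fortnightly deposits in the simulated year

def translate_design_FortnightlyDeposit(design_parameter):
  return [('d', design_parameter)] * ANNUAL_FORTNIGHTS

translate_design_FortnightlyDeposit(10)[:5]  # first five transactions only
[('d', 10), ('d', 10), ('d', 10), ('d', 10), ('d', 10)]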

Performance = f(Design)

Now we can connect human-relevant design to human-relevant performance measure via the sequence:

design -> design translation -> core algorithm simulation -> performance translation -> performance measure

If we set our target balance at 100, we see how performance becomes a function of design by chaining translations and simulation.

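def performance_of_design(design_translator, design_parameters):
  # design -> design translation -> simulation -> performance translation
  return translate_performance_Target100(
    simulate_balance(design_translator(design_parameters)))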

This example above (and from the notebook) specialises the target balance performance translator to Target100 and makes the design translator configurable.

Remembering that we’re interested in the mean absolute delta between the actual balance and the target, here’s what it might look like to evaluate and visualise a design:

evaluating account balance target 100
with FortnightlyDeposit [9]
the mean abs delta is 61.89
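
The figure above can be reproduced with a minimal, self-contained sketch of the whole chain. It assumes 26 fortnights in the simulated period (a value consistent with the 61.89 result), uses a condensed running-balance loop in place of the Account class, and defines translate_performance_Target100 as a fixed-target version of the performance translator. These definitions are illustrative and may differ from the notebook.

```python
ANNUAL_FORTNIGHTS = 26  # assumed; consistent with the 61.89 figure

def translate_design_FortnightlyDeposit(deposit):
  # One design parameter expands into a full schedule of deposits.
  return [('d', deposit)] * ANNUAL_FORTNIGHTS

def simulate_balance(transactions):
  # Condensed stand-in for the Account-based core algorithm.
  balance, balances = 0, [0]
  for kind, amount in transactions:
    balance += amount if kind == 'd' else -amount
    balances.append(balance)
  return balances

def translate_performance_Target100(balances):
  # Mean absolute difference from a target balance of 100.
  return sum(abs(b - 100) for b in balances) / len(balances)

def performance_of_design(design_translator, design_parameters):
  return translate_performance_Target100(
    simulate_balance(design_translator(design_parameters)))

result = performance_of_design(translate_design_FortnightlyDeposit, 9)
print(round(result, 2))  # 61.89
```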


Experimentation

Experimentation involves exploring and optimising simulated results through the simple interface of performance = f(design). This means choosing a fortnightly deposit amount and seeing how close we get to our target balance over the period. Note that while performance and design are both single values (scalars) in our Hello, World! example, in general they would both consist of multiple, even hundreds of, parameters.

Plotting this performance function over a range of designs shows visually that the optimum (minimum) design is the point at the bottom of the curve. We can find the optimal design automatically using (in this instance) scipy.optimize.minimize on the function performance = f(design).
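
To illustrate the search for the bottom of the curve without extra dependencies, the sketch below swaps in a plain grid search for scipy.optimize.minimize. It assumes 26 fortnights, a target of 100, and a deposits-only schedule, so the balance after k deposits is simply k times the deposit. The grid range and step are arbitrary choices.

```python
ANNUAL_FORTNIGHTS = 26  # assumed number of fortnightly deposits

def performance(deposit, target=100):
  # With deposits only, the balance after k deposits is deposit * k.
  balances = [deposit * k for k in range(ANNUAL_FORTNIGHTS + 1)]
  return sum(abs(b - target) for b in balances) / len(balances)

# Candidate designs: deposits from 0.00 to 20.00 in steps of 0.01.
candidates = [i / 100 for i in range(2001)]

# Grid search: evaluate every candidate and keep the best performer.
best_deposit = min(candidates, key=performance)
best_performance = performance(best_deposit)
print(best_deposit, round(best_performance, 2))
```

A grid search scales poorly as the number of design parameters grows, which is one reason to prefer a proper optimiser beyond a toy example like this one.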

We can explore designs in design space, as above, and we can also visualise how an optimal design impacts the simulated behaviour of the system, as below.

Note that in the optimal design, the fortnightly deposit amount is ~5 for a mean abs delta of ~42 (the optimal performance) rather than a deposit of 9 (an arbitrary design) for a mean abs delta of ~62 (the associated performance) in the example above.

Translation of an environment

The example so far assumes we have complete control of the inputs to the system. This is rarely the case in the real world. This section introduces environmental factors that we incorporate into the simulation, but consider out of our control.

Just like design parameters, environmental factors identified by humans need to be translated for simulation by core algorithms. In this case, we have a random fortnightly expense that is represented as a withdrawal from the account.

import numpy as np

def translate_environment_FortnightlyRandomWithdrawal(seed=42, high=5):
  # a fixed seed makes the random withdrawals repeatable between runs
  rng = np.random.RandomState(seed)
  random_withdrawals = rng.randint(0, high=high, size=ANNUAL_FORTNIGHTS)
  return list(zip(['w'] * ANNUAL_FORTNIGHTS, random_withdrawals))

[('w', 3), ('w', 4), ('w', 2), ('w', 4), ('w', 4)]

If we were to simulate this environmental factor without any deposits being made, it would look like below.

Now we translate design decisions and environmental factors together into the inputs for the simulation core algorithms. In this case we interleave or “zip” the events together.

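def translate_FortnightlyDepositAndRandomWithdrawal(design_parameters):
  # pair each deposit with a withdrawal, then flatten the pairs in order
  interleaved = zip(translate_design_FortnightlyDeposit(design_parameters),
                    translate_environment_FortnightlyRandomWithdrawal())
  return [val for pair in interleaved for val in pair]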

[('d', 9), ('w', 3), ('d', 9), ('w', 4), ('d', 9), ('w', 2)]

Putting it all together

We’ll now introduce an alternative design and optimise that design by considering performance when the simulation includes environmental factors.

Alternative design

After visualising the result of our first design against our performance measurement, an alternative design suggests itself. Instead of having only a fixed fortnightly deposit, we could make an initial large deposit, followed by smaller fortnightly deposits. Our design now has two parameters.

def translate_design_InitialAndFortnightlyDeposit(design_pars):
  return [('d', design_pars[0])] + [('d', design_pars[1])] * ANNUAL_FORTNIGHTS

design_2 = [90, 1]

In the absence of environmental factors, our system would evolve as below.
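
For a rough sense of the numbers, here is a minimal sketch that simulates design_2 without environmental factors and scores it against a target of 100. It assumes 26 fortnights and uses a condensed running-balance loop in place of the Account class.

```python
ANNUAL_FORTNIGHTS = 26  # assumed number of fortnightly deposits

def translate_design_InitialAndFortnightlyDeposit(design_pars):
  return [('d', design_pars[0])] + [('d', design_pars[1])] * ANNUAL_FORTNIGHTS

def simulate_balance(transactions):
  # Condensed stand-in for the Account-based core algorithm.
  balance, balances = 0, [0]
  for kind, amount in transactions:
    balance += amount if kind == 'd' else -amount
    balances.append(balance)
  return balances

design_2 = [90, 1]
balances = simulate_balance(
  translate_design_InitialAndFortnightlyDeposit(design_2))
performance = sum(abs(b - 100) for b in balances) / len(balances)

print(balances[:4], balances[-1])  # [0, 90, 91, 92] 116
print(round(performance, 2))       # 10.39
```

The large initial deposit puts the balance near the target almost immediately, which is why this design scores so much better than a fixed fortnightly deposit alone.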

However, we can also incorporate the environmental factors by interleaving the environmental transactions as we did above.


Our performance function now incorporates two design dimensions. We can run experiments on any combination of the two design parameters to see how they perform. With environmental factors now considered, we can visualise performance and optimise the design in two dimensions.

Instead of the low point on a curve, the optimal design is now the low point in a bowl shape, the contours of which are shown on the plot above.

This represents our optimal business decision based on what we’ve captured about the system so far: an initial deposit of about $105 and a fortnightly deposit of $2.40. If we simulate the evolution of the system under the optimal design, we see a result like below, and can see visually why it matches our intent to minimise the deviation from the target balance at each time.
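
A two-parameter search of this kind can be sketched with the standard library alone. The sketch below makes several assumptions: it uses Python's random module rather than NumPy, so the withdrawals (and therefore the optimum) will differ from the values above; it places the initial deposit first and then alternates deposit and withdrawal; and it uses a coarse grid in place of a numerical optimiser.

```python
import random

ANNUAL_FORTNIGHTS = 26  # assumed number of fortnights

def environment_withdrawals(seed=42, high=5):
  # Stand-in for the NumPy-based translator; a different generator,
  # so the values differ from those shown in the post.
  rng = random.Random(seed)
  return [rng.randrange(0, high) for _ in range(ANNUAL_FORTNIGHTS)]

def performance(design, withdrawals, target=100):
  initial, fortnightly = design
  balance = initial
  balances = [0, balance]      # opening balance, then initial deposit
  for w in withdrawals:
    balance += fortnightly     # fortnightly deposit
    balances.append(balance)
    balance -= w               # interleaved random withdrawal
    balances.append(balance)
  return sum(abs(b - target) for b in balances) / len(balances)

withdrawals = environment_withdrawals()

# Coarse grid: initial deposit 80..130, fortnightly deposit 0.0..5.0.
grid = [(i, f / 10) for i in range(80, 131) for f in range(0, 51)]
best = min(grid, key=lambda d: performance(d, withdrawals))
print(best, round(performance(best, withdrawals), 2))
```

Because the grid includes the arbitrary design (90, 1), the best design found can be no worse than it under the same environment.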

The next thin slice

We could increase fidelity by modelling transactions to the second, but it may not substantially change our business decision. We could go on adding design parameters, environmental factors, and alternative performance measures. Some would only require changes at the translation layer, some would require well-understood changes to the core algorithms, and others would require us to return to our study of the world to understand how to proceed.

We only add these things when required to support the next business decision. We deliver another thin slice through whatever layers are required to support this decision by capturing relevant design parameters and performance measures at the required level of fidelity. We make it robust and preserve future flexibility with continuous delivery. And in this way we can most responsively support successive business decisions with a thinly sliced custom simulator.

Weighting objectives and constraints

We used a single, simple measure of performance in this example, but in general there will be many measures of performance, which need to be weighted against each other, and often come into conflict. We may use more sophisticated solvers as the complexity of design parameters and performance measures increases.

Instead of looking for a single right answer, we can also raise these questions with stakeholders to understand where the simulation is most useful in guiding decision-making, and where we need to rely on other or hybrid methods, or seek to improve the simulation with another development iteration.

Experimentation in noisy environments

The environment we created is noisy; it has non-deterministic values in general, though we use a fixed random seed above. To properly evaluate a design in a noisy environment, we would need to run multiple trials.
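
One way to run multiple trials is to evaluate the same design under several random seeds and summarise the results. The sketch below is illustrative only: it assumes 26 fortnights, uses Python's random module rather than NumPy, alternates deposits and withdrawals after an initial deposit, and picks 20 trials arbitrarily.

```python
import random
import statistics

ANNUAL_FORTNIGHTS = 26  # assumed number of fortnights

def trial_performance(design, seed, target=100, high=5):
  # One trial: the same design against one random draw of withdrawals.
  rng = random.Random(seed)
  initial, fortnightly = design
  balance = initial
  balances = [0, balance]
  for _ in range(ANNUAL_FORTNIGHTS):
    balance += fortnightly             # fortnightly deposit
    balances.append(balance)
    balance -= rng.randrange(0, high)  # random withdrawal
    balances.append(balance)
  return sum(abs(b - target) for b in balances) / len(balances)

design = (105, 2.4)  # the optimum reported in the post
trials = [trial_performance(design, seed) for seed in range(20)]
mean_performance = statistics.mean(trials)
spread = statistics.stdev(trials)
print(round(mean_performance, 2), round(spread, 2))
```

Comparing designs on their mean performance, with the spread as a guide to how far a single trial can mislead, gives a fairer picture than a single seeded run.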

Final thoughts

Simulation scenarios become arbitrarily complex. We first validate the behaviour in the simplest point scenarios that ideally have analytic solutions, as above. We can then calibrate against more complex real world scenarios, but when we are sufficiently confident the core algorithms are correct, we accept that the simulation produces the most definitive result for the most complex or novel scenarios.

Most of the work is then in translating human judgements about design choices and relevant performance measures into a form suitable for exploration and optimisation. This is where rich conversations with stakeholders are key to draw out their understanding and expectations of the scenarios you are simulating, and how the simulation is helping drive their decision-making. Focussing on this engagement ensures you’ll continue to build the right thing, and not just build it right.