Privacy puzzles

I contributed a database reconstruction attack demonstration to the companion repository to the excellent book Practical Data Privacy by my colleague Katharine Jarmul.

My interest was piqued by my colleague Mitchell Lisle sharing the paper Understanding Database Reconstruction Attacks on Public Data from the US Census Bureau authors Simson Garfinkel, John M. Abowd, and Christian Martindale. Mitchell and I collaborated on a pair of solutions using mathematical optimisation/satisfaction techniques. Check out Mitchell’s solution using the Z3 library. I used OR-Tools instead.

The notebook demonstrates that individual rows of a database may be reconstructed, even if only summary statistics are shared, by considering the constraints that the statistics place on possible values of the data. Constraints include mean and median for all numerical values globally and for various cohorts of records determined by class values.

Note that the intent is of this notebook is not to compromise any private data, but to raise awareness of the potential for privacy breaches due to reconstruction attacks!

Maths Whimsy with Python

At PyCon AU 2023 in Adelaide I delivered a talk titled Maths Whimsy with Python. It was a great chance to review a range of projects small and large I’ve already shared here. Check out the slides and video.

In three years of the maths whimsy repo, I’ve covered a lot of ground, and got hands-on with many Python libraries for data science. I was particularly pleased when the idea to visualise multi-stage classification pipelines as Sankey diagrams was enthusiastically picked up by one client. I’m not sure I’ll find an application for which numpig chunking of large arrays is the best solution, but I remain quite fond of it. And I could spend all day watching the skeumorphic odometer visualisation created for the slackometer.

The talk dived into the square and weird planet gravity whimsy where it all started, the Lockdown Wheelie Project, the Sankey diagrams, and touched on Semantle and Asteroid Escape solvers, then reviewed the typical project pattern, and explored how all the mini projects collectively reinforced one another.

On reflection the lessons for me were:

  • Be driven by interest
  • Be opportunistic with energy
  • Take small steps in problem & solution complexity 
  • Small steps help deliver outputs
  • Learning will cascade
  • And real world applications will become apparent

Afterwards, someone asked me about the times it hadn’t worked. I should have covered this too! The primary time I wasn’t satisfied that I had something to share was when I was experimenting with reinforcement learning to solve mazes with disconnected walls. I bit off more than I could chew, and while I went end-to-end with Deep Q-Learning, and then backfilled some bits of the training pipeline, I didn’t reach a satisfactory conclusion within the time I had available. I took this lesson forward to re-frame the problem into smaller, achievable chunks whenever it looked like I may have dived too deep on subsequent whimsies, and it has served me well since.

Thanks to Ned for the picture!

Electrifying the world with AI Augmented decision-making

I wrote an article about optimising the design of EV charging networks. It’s a story of work done by a team at Thoughtworks, demonstrating the potential of AI augmented decision-making (including some cool optimisation techniques), in this rapidly evolving but durably important space.

We were able to thread together these many [business problem, AI techniques, data sourcing, technology] concerns in such a short time because of our relentless focus — not on the code or the mathematics — but on generating value: namely making it easier for humans to make decisions according to business goals.

A coding saga with Bard

Though but a footnote in the epic of coding with AI, I though it worth musing on my recent experience with Bard. Bard currently uses the LaMDA model, which is capable of generating code, but not optimised for it. The story might be different with Codey as protagonist (or is that antagonist?)

I didn’t produce any software I could use in this session, but I did learn that Bard was at its most useful when prompted with open questions or very specific concepts. I also learned that generated code may be sabotaged by plausible but incorrect domain logic and conceptual fault lines that occur between contexts.

Also, given that contributes 0.0000005% of tokens to the C4 dataset which makes up some of Bard’s training data, I thought I should see some return!

Act 1

In which we meet a domain.

Take 1

I started naively with a reasonably complex ask: how do I calculate the swept path of a vehicle in python?

A drawing of an overhead view of a car following a curved path

I’ve done swept path analysis before, but in manufacturing, and I certainly don’t have a Python solution handy. Even if I only got the broad outlines of a solution, an answer or parts of an answer to this question would save a bunch of time.

This was far too wide an ask. I got code that looks like it’s for a curve that a vehicle might follow, but that’s just one part of the solution, and the more I looked at it, the more issues I found.

# The vehicle's position
x = np.array([0, 0])
y = np.array([0, 0])

# The vehicle's orientation
theta = 0

# The vehicle's velocity
v = 1

# The vehicle's wheelbase
w = 2

# Calculate the vehicle's position at each time step
for t in np.arange(0, 10, 0.1):
    x += v * np.cos(theta) * t
    y += v * np.sin(theta) * t
    theta += v / w * t

First, vehicle is a word that identifies a whole domain of assumptions, including how steering works. But this domain logic is weird; assuming theta represents direction of travel, that’s not how steering works, in this universe at least. Export to Colab feature is nice, so let’s see how the code runs. Spoiler: it doesn’t.

As a user of Python, I could diagnose the runtime error and fix it (the initial values of x and y could be defined with dtype=float), but on closer inspection, the += operator overwrites rather than appends, and, wait a minute, x and y are treated both as vectors and scalars. As the domain logic is still bizarre, is there any purpose in fixing these conceptual faults in the code?

Take 2

A secondary prompt to provide more guidance: I would like to consider the area swept out by the corners of the vehicle, can you update the solution?

This only resulted in more trigonometric prestidigitation; more domain dislogic. I use the term dislogic a bit like the term truthiness. There is some related logic to the code, but it is “apart” from the logic we need.

Take 3

I tried much more explicit guidance by elaborating on domain concepts in the third prompt. This lead to more detailed but still fundamentally flawed and unusable solution. I decided this path wouldn’t lead to a happy ending.

Morals of Act 1

Don’t assume too much domain expertise. Bard has learned on generic public examples. Use your domain expertise to break the problem into smaller chunks.

Also, don’t expect conceptual consistency throughout. LLMs like Bard, as next-token predictors, don’t necessarily ensure conceptual consistency in their output.

Act 2

In which I choose a path to follow.

Take 1

I decided to focus on one part of the solution; getting the curve right. I reset Bard’s context.

I want a python function to create a curve between two points. The function arguments should be the start and end points and the tangent to the curve at the start and end points

Nice linear interpolation, shame about the tangents (which, while present as arguments, were totally ignored in the function body).

A straight line chart

And the above could only be generated after fixing more errors preventing the code from running. The affordances of tuples and numpy.ndarray were confused, and the coordinates weren’t passed correctly to the plot method. The syntax was fine, but the code was riven with conceptual fault lines between contexts – what looked OK in one or other context in isolation caused problems when the contexts were brought together. The bugs were fairly obvious in this case, but in general could be subtle and difficult to detect.

Still, after minor adjustments, it’s a curve that meets some of the requirements. This is more useful than what we got in Act 1.

Take 2

I augmented the initial prompt.

The curve tangent should match the tangents of the start and end point supplied as arguments. Please define points and tangents as numpy arrays

Now we have two nice linear interpolations, but the tangent is not the tangent of the curve! (It’s more like we have linearly interpolated the position and orientation of the vehicle.)

A series of line segments. The start point of each line in the series follows a straight line. The end point direction from the start point rotates through 90 degrees from the first to the last line in the series

This code, presumably due to specifying numpy, worked straight out of the box.

Take 3

Maybe I can establish the link between the tangent and the curve to which it is tangent?

In the create curve function, the tangent should be calculated as unit vector in the direction from one point to the next on the curve. This calculated tangent should equal the tangent argument at both the start and end of the curve.

Nope, that was too naively optimistic. I was forgetting the Morals of Act 1.

Take 4

Let’s recall the Morals of Act 1, and try to link to concepts with better public support, such as Bézier curves.

The create curve function should be updated to return a bezier curve rather than a linear interpolation between start and end point

The result looked Bézier-ish, but the formula wasn’t transcribed faithfully, with the calculated points quickly racing away to infinity (well outside the convex hull of the control points).

# Calculate the point at the current index.
point = (1 - i / num_points)**3 * start_point + 3 * (1 - i / num_points)**2 * i / num_points * start_tangent + 3 * (1 - i / num_points) * (i / num_points)**2 * end_tangent + i**3 * end_point

However, I think we’re getting closer.

Take 5

Time to reset Bard’s context again and go all in on Bézier.

I would like a python function to create a bezier curve


Please update the solution to plot the curve with matplotlib

And with a few manual tweaks that are becoming familiar…

Image of a diff between two code versions
Diff of Bard generated code (left) and my edits (right) to get the desired result

… we finally get a curve with the desired properties from Take 1!

A bezier curve with 4 control points

However, this is curve is not necessarily going to represent the path of a vehicle, which has a minimum radius of curvature (the vehicle’s turning circle) that a Bézier curve can’t guarantee. Based on the experience above, it’s unclear to me how I would incorporate the curvature requirement with further prompts.

Taking stock: after much prompting, I have a method to create a class of curve I can’t directly use in the solution. (And the code for which I ended up looking up anyway.)

Moral of Act 2

Don’t forget the Morals of Act 1.

In addition, being able to directly reference the key concepts in a generic domain, rather than trying to explain the logic, may be key to getting closer to a correct solution in a few shots.

If you’re not familiar enough with the key concepts to reference them directly, you might use a more open prompt initially, such as: How do you create a smooth curve? This might point you in the right direction…

There are many ways to create a smooth curve. Here are a few examples:

* Use a Bezier curve. A Bezier curve is a parametric curve that is defined by a set of control points. The curve is smooth because it is a polynomial function of the control points.

* ... [goes on to reference Splines, Fourier series and B-Splines] ...

Act 3

Because we need 3 acts. In which I back up and broaden the domain.

Take 1

Considering whether the word vehicle in the initial prompt had given Bard the wrong steer, I tried the more generic prompt: how do I calculate the swept path of an object in python?

This gave helpful introductory commentary on breaking the problem down, and a nearly usable solution.

# Define the circle's geometry.
points = np.array(...)

# Define the circle's motion.
path = np.array(...)

# Calculate the swept path.
swept_path = np.zeros((len(points), len(path)))
for i in range(len(points)):
    for j in range(len(path)):
        swept_path[i][j] = points[i] + path[j]

But one that still needed an expert review to ensure values ended up in the all the right places.

An image showing the diff between two versions of code
Diff of Bard generated code (left) and my edits (right) to get the desired result

Below we can see different coloured “circles” drawn at each position in the path.

5 overlapping squares (representing circles with very low resolution)

This is pretty trivial though – it’s just organised vector addition – did I need AI for that?

Moral of Act 3

Keeping it simple increases the chance of success, but you should balance this against whether a simple solution provides sufficient value.

Concluding the saga, for now

I tried to use Bard to deliver large chunks of a complex solution, rather than as a smarter autocomplete for finer details, or as an aid to understanding existing or proposed solutions. In the time I spent prompting Bard, I would have got further writing code directly. However, I have a lot of scope to improve my prompting.

With expertise in the domain and the code, I was able to diagnose and correct the issues in Bard’s solutions, but I suspect that someone who lacked one or both of those areas of expertise couldn’t recover quickly. In some respects, developing software is about recovering quickly from errors – we can’t avoid making mistakes, but we set up feedback loops to detect them quickly, and over time we become vigilant to more of the types of mistakes we are likely to make. Does an AI coding assistant like Bard help us recover quickly from mistakes? I didn’t actually ask Bard to help much in this session, so that question needs further work to resolve, possibly taking the angle of AI-aided test-first development.

What I did learn that Bard was at its most useful when prompted with open questions or very specific concepts with public data support. I also learned that generated code is likely to be sabotaged by domain dislogic and conceptual fault lines between contexts.

Over time, we’ll figure out how to make AI a better protagonist and antagonist in our coding stories; for me, this was an interesting way to introduce a new character.

Smarter Semantle Solvers

A little smarter, anyway. I didn’t expect to pick this up again, but when I occasionally run the first generation solvers online, I’m often equal parts amused and frustrated by rare words thrown up that delay the solution – from amethystine to zigging.

Animation of online semantle solution
An example solution with fewer than typical rare words guessed

The solvers used the first idea that worked; can we make some tweaks to make them smarter? The code is now migrated to its own new repo after outgrowing its old home.

Measuring smarts

I measure solver performance by running multiple trials of a solver configuration against the simulator for a variety of target words. This gives a picture of how often the solver typically succeeds within a certain number of guesses.

Chart showing cumulative distribution function curves for two solver configurations


It turns out that the vocabulary to date based on english_words_set is a poor match for the most frequently used English words, according to unigram frequency data.

So we might expect that simply replacing the solver vocabulary would improve performance, and we also get word ranking from unigram_freq.

Semantic models

We’ll continue with Universal Sentence Encoder (USE) to ensure search strategies are robust to different semantic models.


To improve the gradient solver I tried making another random guess every so often to avoid long stretches exploring local minima. But it didn’t make things better, and probably made them worse!

In response, I made each guess the most common local word to the extrapolated semantic location, rather than just the nearest word. Still no better, and trying both “improvements” together was significantly worse!

Ah well, experiments only fail if we fail to learn from them!

Vocabulary again

I think the noise inherent in a different semantic model, plus the existing random extrapolation distance, overwhelms the changes I tried. In better news, we see a major improvement from using unigram freq vocabulary, reducing the mean from 280 (with many searches capped at 500) to 198, approximately a 30% improvement.

Smarter still?

Here we see that the data-centric (vocabulary) improvement had a far bigger impact than any model-centric (search algorithm) improvement that I had the patience to try (though I left a bunch of further todos). Maybe just guessing randomly from the top n words will be better again! ????

At least I’ve made a substantial dent in reducing those all-too-common guesses at rare words.

I did it my way – hand-rolled navigation with open spatial data

Sure commercial maps app directions are great, but have you ever found the customisation options limited? What if you want to use bike paths and back streets when cycling, or avoid winding roads that might make backseat passengers car-sick on a road trip?

Example of cycling route options that are direct (B) and pleasant (A)

The paved route

OpenStreetMap and OpenRouteService do provide this type of functionality, and could be considered for use as-is or with further customisation. PostGIS and pgRouting provide capabilities if you bring your own data. Many dedicated apps support groups with particular mobility needs.

My way, off the highway

In researching these capabilities however, and because I’m a fan of maps and I wanted to understand the whole data transformation journey, I decided to hand-roll my own navigation solution using pyshp, numpy, scipy, and networkx, all visualised with matplotlib. The end result is far from polished, but it can ingest 1.1M road geometries for the Australian state of Victoria, and generate a topological graph for routing within minutes, then use that map to to generate turn-by-turn directions in real time.

See the source code and the brief write-up below if you’re interested.


The solution uses data from the Vicmap Transport data set, which provides road centrelines for highways, streets, tracks, paths, etc, for all of Victoria and some bordering regions. The spatial features are augmented with 71 attributes useful for routing, including road names, permitted directions of travel, height limits, etc. I used a GDA2020 datum grid projection shapefile export. Pyshp provides a list of geometries and attributes via shapeRecords.


Vicmap Transport road centrelines are collections of polylines. The endpoints of these polylines (aka road segments) helpfully coincide where we might infer an continuous stretch of road or intersection. This allows us to describe how roads are connected with a graph.

Map of Bairnsdale township showing spatial location of roads overlaid with their network connectivity
Road network modeled as a graph in Bairnsdale, Victoria (data: Vicmap Transport)

Each endpoint will map to a node in the graph. The node may be unique to one road segment, or shared between multiple road segments if it’s at a junction. We find the coincident endpoints for a shared node with the method query pairs. The road segments with coincident endpoints then define the edges of the graph at this node. A directed graph can be defined using the direction code attribute of each segment (forward, reverse, or both directions).


Two routes through Bairnsdale township overlaid on the network connectivity map

With a graph representation of the road network, we can find a route between any two nodes (if connected) with standard algorithms. The examples above and below uses Dijkstra’s algorithm to find shortest path based on edge weights that reflect our routing preferences. The orange route is “fewest hops” (count of road segments) and the green route is “shortest distance”. Geometric length of road segments is calculated in a post-processing pass over the ingested polyline data, and assigned as a weight to each edge.

Two routes through the wider Bairnsdale area overlaid on the network connectivity map
Routing options in the area surrounding Bairnsdale

Optimisation and scaling

My first spikes were hideously inefficient, but once the method was established, there was a lot of room for improvement. I addressed three major performance bottlenecks as I scaled from processing 50k road segments in 50 minutes, to 1.1M road segments in 30 seconds. These figures represent a 2017 Macbook Pro, or free Colab instance, being roughly similar.

Processing stage and end-to-end times50k segments1.1M segments
Coincident segment endpoints
For loop accumulating unique ids with distance test50 mins
numpy array calculation argwhere distance < e and numpig6 mins
scipy.spatial.KDTree.query_pairs2 mins
Shared node mapping
List comprehension elementwise mapBroke Colab
numpy materialised mapping< 2 mins
Directed edges (previously undirected)
For loop accumulating correctly directed edges case-wise and discarding topological duplicates> 12 hrs
numpy vectorisation of loop conditional logic30 sec
Stages of optimisation to support scaling

An additional ~30s is required to post-process geometric length of segments as an attribute per edge, and I imagine similar for other derived edge attributes. For instance, we might get finer grained average cycling speed per segment, or traffic risk factors, etc.

For calculating routes, (i.e., at inference time) it takes about 1-4s to find a shortest path, depending on the length of the route (using pure Python networkx). We can now find routes of over 950km length, from Mildura in the state’s north-west to Mallacoota in the east.

Two long routes overlaid on sampled road data from all of Victoria and surrounds
Two routes from Mildura to Mallacoota

More latitude for navigation

We would like to be able to find the start and end nodes of a route from latitude and longitude. However, as the nodes in our routing graph are located by VicGrid grid coordinates (eastings and northings), we first need to “wrap” this planar grid around the roughly spherical Earth. While geometrically complex (see below), it’s easy to do this transformation back and forth between coordinate systems with pyproj

A grid with axis-aligned arrow pairs at each point showing how space is squashed or stretched in 2D when transforming from eastings, northing to latitude, longitude

Turn-by-turn navigation directions

With the start and end nodes located from latitude and longitude, a route can be calculated as above. Then, turn-by-turn directions can then be derived by considering the geometry of the road network at each intermediate node on the route, and what instructions might be required by users, for instance:

  • Determine the compass direction of travel along a road segment to initiate travel in the correct direction,
  • Calculate the angle between entry and exit directions at intersections to provide turn directions as left, right, straight, etc,
  • Use the geometric length of road segments to provide distance guidance,
  • Consolidate to a minimum set of directions by identifying where explicit guidance is not required (e.g., continuing straight on the same road), and
  • Render the instructions into an easily (?) human-consumable form, with natural language descriptions of appropriate precision.
['travel south west on Holly Court',
 'continue for 190m on Holly Court',
 'turn right into Marigold Crescent',
 'continue for 360m on Marigold Crescent',
 'go straight into Gowanbrae Drive',
 'continue for 420m on Gowanbrae Drive',
 'turn left into Gowanbrae Drive',
 'continue for 150m on Gowanbrae Drive',
 'turn left into Lanark Way',
 'continue for 170m on Lanark Way']

This constitutes a rather neat first approximation to to commercial turn-by-turn directions, but I suspect it suffers in many edge cases, like roundabouts and slip lanes.

Next steps

With a drive-by look at the key elements, the road ahead to future “my way” hand-rolled navigation is clearer. An essential next step would be an interactive map interface. However, making this prototype roadworthy also likely needs more data wrangling under the hood (e.g., for cycling-specific data), a review of where to leverage existing open services, and polishing edge cases to a mirror finish.

End-to-end simulation hello world!

I’ve talked to many people about how to maximise the utility of a simulator for business decision-making, rather than focussing on the fidelity of reproducing real phenomena. This generally means delivering a custom simulator project lean, in thin, vertical, end-to-end slices. This approach maximises putting learning into action and minimises risk carried forward.

For practitioners, and to sharpen my own thinking, I have provided a concrete example in code; a Hello, World! for custom simulator development. The simulated phenomena will be simple, but we’ll drive all the way through a very thin slice to supporting a business decision.

Introduction to simulation

To understand possible futures, we can simulate many things; physical processes, digital systems, management processes, crowd behaviours, combinations of these things, and more.

When people intend to make changes in systems under simulation, they wish to understand the performance of the system in terms of the design of actions they may execute or features that may exist in the real world. The simulator’s effectiveness is determined by how much it accelerates positive change in the real world.

Even as we come to understand systems through simulation, it is also frequently the case that people continue to interact with these systems in novel and unpredictable ways – sometimes influenced by the system design and typically impacting the measured performance – so it remains desirable to perform simulated and real world tests may in some combination, sequentially or in parallel.

For both of these reasons – because simulation is used to shorten the lead time to interventions, and because those interventions may evolve based on both simulated and exogenous factors – it is desirable to build simulation capability in an iterative manner, in thin, vertical slices, in order to respond rapidly to feedback.

This post describes a thin vertical slice for measuring the performance of a design in a simple case, and hence supporting a business decision that can be represented by the choice of design, and the outcomes of which can be represented by performance measures.

This post is a walk-through of this simulation_hello_world notebook and, as such, is aimed at a technical audience, but one that cares about aligning responsive technology delivery and simulation-augmented decision making for a wider stakeholder group.

If you’re just here to have fun observing how simulated systems behave, this post is not for you, because we choose a very simple, very boring system as our example. There are many frameworks for the algorithmic simulation of real phenomena in various domains, but our example here is so simple, we don’t use a framework.

The business decision

How much should we deposit fortnightly into a bank account to best maintain a target balance?

Custom simulator structure

To preserve agility over multiple thin slices through separation of concerns, we’ll consider the following structure for a custom simulator:

  1. Study of the real world
  2. Core algorithms
  3. Translation of performance and design
  4. Experimentation
  5. Translation of an environment

We’ll look at each stage and then put them all together and consider next steps. As a software engineering exercise, the ongoing iterative development of a simulator requires consideration of Continuous Delivery, and in particular shares techniques with Continuous Delivery for Machine Learning (CD4ML). I’ll expand on this in a future post.

Study of the real world

We study how a bank account works and aim to keep the core algorithms really simple – as befits a Hello, World! example. In our study, we recognise the essential features of: a balance, deposits and withdrawals. The balance is increased by deposits and reduced by withdrawals, as shown in the following Python code.

class Account:

  def __init__(self):
    self.balance = 0

  def deposit(self, amount):
    self.balance = self.balance + amount

  def withdraw(self, amount):
    self.balance = self.balance - amount

Core algorithms

Core algorithms reproduce real world phenomena, in some domain, be it physical processes, behaviour of agents, integrated systems, etc. We can simulate how our simple model of bank account will behave when we perform some set of transactions on it. In the realm of simulation, these are possible rather than actual transactions.

def simulate_transaction(account, kind, amount):
  if kind == 'd':
  elif kind == 'w':
    raise ValueError("kind must be 'd' or 'w'")

def simulate_balance(transactions):
  account = Account()
  balances = [account.balance]
  for t in transactions:
    simulate_transaction(account, t[0], t[1])
  return balances

When we simulate the bank account, note that we are interested in capturing a fine-grained view of its behaviour and how it evolves (the sequence of balances), rather than just the final state. The final state can be extracted in the translation layer if required.

tx = [('d', 10), ('d', 20), ('w', 5)]
[0, 10, 30, 25]

Visualisation is critical in simulator development – it helps to communicate function, understand results, validate and tune implementation and diagnose errors, at every stage of development and operation of a simulator.

This could be considered a type of discrete event simulation, but as above, we’re more concerned with a thin slice through the whole structure than the nature of the core algorithms.

Translation of performance and design

The core algorithms are intended to provide a fine-grained, objective reproduction of real observable phenomena. The translation layer allows people to interact with a simulated system at a human scale. This allows people to specify high-level performance measures used to evaluate the system designs, which are also specified at a high level, while machines simulate all the rich detail of a virtual world using the core algorithms.


We decide for our Hello, World! example that this is a transactional account, and as such we care about keeping the balance close to a target balance, so that sufficient funds will be available, but we don’t leave too much money in a low interest account. We therefore measure performance as the average (mean) absolute difference between the balance at each transaction, and the target balance. Every transaction that leaves us over or under the target amount is penalised by the difference and this penalty is averaged over transactions. A smaller measure means better performance.

This may be imperfect, but there are often trade-offs in how we measure performance. Discussing these trade-offs with stakeholders can be a fruitful source of further insight.

def translate_performance_TargetBalance(balances, target):
  return sum([abs(b - target) for b in balances]) / len(balances)


While we can make any arbitrary number of transactions and decide about each individual transaction, we consider for simplicity that set up a fortnightly deposit schedule, and make one decision about the amount of that fortnightly deposit.

As the performance translation distills complexity into a single figure, so the design translation causes the inverse, with complexity blooming from a single parameter We translate the single design parameter into a list of every individual transaction, suitable for simulation by core algorithms.

def translate_design_FortnightlyDeposit(design_parameter):
  return [('d', design_parameter)] * ANNUAL_FORTNIGHTS

[('d', 10), ('d', 10), ('d', 10), ('d', 10), ('d', 10)]

Performance = f(Design)

Now we can connect human-relevant design to human-relevant performance measure via the sequence:

design -> design translation -> core algorithm simulation -> performance translation -> performance measure

If we set our target balance at 100, we see how performance becomes a function of design by chaining translations and simulation.

def performance_of_design(design_translator, design_parameters):
  return translate_performance_Target100(

This example above (and from the notebook) specialises the target balance performance translator to Target100 and makes the design translator configurable.

Remebering that we’re interested in the mean absolute delta between the actual balance and the target, here’s what it might look like to evaluate and visualise a design:

evaluating account balance target 100
with FortnightlyDeposit [9]
the mean abs delta is 61.89


Experimentation involves exploring and optimising simulated results through the simple interface of performance = f(design). This means choosing a monthly deposit amount and seeing how close we get to our target balance over the period. Note that while performance and design are both single values (scalars) in our Hello, World! example, in general they would both consist of multiple, even hundreds of, parameters.

This performance function shows visually the optimum (minimum) design is the point at the bottom of the curve. We can find the optimal design automatically using (in this instance) scipy.optimize.minimize on the function performance = f(design).

We can explore designs in design space, as above, and we can also visualise how an optimal design impacts the simulated behaviour of the system, as below.

Note that in the optimal design, the fortnightly deposit amount is ~5 for a mean abs delta of ~42 (the optimal performance) rather than a deposit of 9 (an arbitrary design) for a mean abs delta of ~62 (the associated performance) in the example above.

Translation of an environment

The example so far assumes we have complete control of the inputs to the system. This is rarely the case in the real world. This section introduces environmental factors that we incorporate into the simulation, but consider out of our control.

Just like design parameters, environmental factors identified by humans need to be translated for simulation by core algorithms. In this case, we have a random fortnightly expense that is represented as a withdrawal from the account.

def translate_environment_FortnightlyRandomWithdrawal(seed=42, high=5):
  rng = np.random.RandomState(seed)
  random_withdrawals = rng.randint(0, high=high, size=ANNUAL_FORTNIGHTS)
  return list(zip(['w'] * ANNUAL_FORTNIGHTS, random_withdrawals))

[('w', 3), ('w', 4), ('w', 2), ('w', 4), ('w', 4)]

If we were to simulate this environmental factor with without any deposits being made, it would look like below.

Now we translate design decisions and environmental factors together into the inputs for the simulation core algorithms. In this case we interleave or “zip” the events together

def translate_FortnightlyDepositAndRandomWithdrawal(design_parameters):
  interleaved = zip(translate_design_FortnightlyDeposit(design_parameters),
  return [val for pair in interleaved for val in pair]

[('d', 9), ('w', 3), ('d', 9), ('w', 4), ('d', 9), ('w', 2)]

Putting it all together

We’ll now introduce an alternative design and optimise that design by considering performance when the simulation includes environmental factors.

Alternative design

After visualising the result of our first design against our performance measurement, an alternative design suggests itself. Instead of having only a fixed fortnightly deposit, we could make an initial large deposit, followed by smaller fortnightly deposits. Our design now has two parameters.

def translate_design_InitialAndFortnightlyDeposit(design_pars):
  return [('d', design_pars[0])] + [('d', design_pars[1])] * ANNUAL_FORTNIGHTS

design_2 = [90, 1]

In the absence of environmental factors, our system would evolve as below.

However, we can also incorporate the environmental factors by interleaving the environmental transactions as we did above.


Our performance function now incorporates two design dimensions. We can run experiments on any combination of the two design parameters to see how they perform. With consideration of environmental factors now, we can visualise performance and optimisation the design in two dimensions.

Instead of the low point on a curve, the optimal design is now the low point in a bowl shape, the contours of which are shown on the plot above.

This represents our optimal business decision based on what we’ve captured about the system so far, an initial deposit of about $105 and an fortnightly deposit of $2.40. If we simulate the evolution of the system under the optimal design, we see a result like below, and can see visually why it matches our intent to minimise the deviation from the target balance at each time..

The next thin slice

We could increase fidelity by modelling transactions to the second, but it may not substantially change our business decision. We could go on adding design parameters, environmental factors, and alternative performance measures. Some would only require changes at the translation layer, some would require well-understood changes to the core algorithms, and others would require us to return to our study of the world to understand how to proceed.

We only add these things when required to support the next business decision. We deliver another thin slice through whatever layers are required to support this decision by capturing relevant design parameters and performance measures at the required level of fidelity. We make it robust and preserve future flexibility with continuous delivery. And in this way we can most responsively support successive business decisions with a thinly sliced custom simulator.

Weighting objectives and constraints

We used a single, simple measure of performance in this example, but in general there will be many measures of performance, which need to be weighted against each other, and often come into conflict. We may use more sophisticated solvers as the complexity of design parameters and performance measures increases.

Instead of looking for a single right answer, we can also raise these questions with stakeholders to understand where the simulation is most useful in guiding decision-making, and where we need to rely on other or hyrbid methods, or seek to improve the simulation with another development iteration.

Experimentation in noisy environments

The environment we created is noisy; it has non-deterministic values in general, though we use a fixed random seed above. To properly evaluate a design in a noisy environment, we would need to run multiple trials.

Final thoughts

Simulation scenarios become arbitrarily complex. We first validate the behaviour in the simplest point scenarios that ideally have analytic solutions, as above. We can then calibrate against more complex real world scenarios, but when we are sufficiently confident the core algorithms are correct, we accept that the simulation produces the most definitive result for the most complex or novel scenarios.

Most of the work is then in translating human judgements about design choices and relevant performance measures into a form suitable for exploration and optimisation. This is where rich conversations with stakeholders are key to draw out their understanding and expectations of the scenarios you are simulating, and how the simulation is helping drive their decision-making. Focussing on this engagement ensures you’ll continue to build the right thing, and not just build it right.

Summertime, and the puzzling is made easier

In between beach trips and bike rides, I whiled away more than a few summer hours on puzzles I found in the various AirBNBs we rented. Returning home, I rediscovered a sliding tile puzzle with a twist called Asteroid Escape, where embedded asteroids prevent certain tiles sliding past each other in certain configurations.

A photo of the Asteroid Escape sliding tile puzzle on a desk

With 60 challenges from Starter to Wizard level, I was musing about generating the catalogue of challenges automatically. I’ve enjoyed automating puzzle solutions since brute-forcing Instant Insanity with BASIC in my teens (the code is long lost). While Starter puzzles may only require 6 moves, Wizard may require as many as 109 moves. I decided I would particularly enjoy avoiding the gentle frustration of manually solving so many variants of Asteroid Escape if I were to use an automated solver.

I built a solver in Python with numpy for working with grids of numbers. If automating puzzles is your thing too, check out the source code and read on for a summary of the approach. To execute the proposed solution, I printed out the moves and crossed them off with a pencil as I made them. There’s a lot that could be more elegant or efficient, but it gets the job done!

Solving the hardest Wizard level Asteroid Escape puzzle in the minimum 109 steps

Solver evolution

Define a simple sliding tile board. Elements of a 2D array hold the indexes of the tile pieces at each location (numbered 1..8 for a 3×3 board). The tiles can be reordered through a sequence of moves of the “hole”, that is the space with no tile (numbered 0). The available moves depend on the hole location – four options from the centre but only two from the corners.

Illustration of sliding tile puzzle with tile pieces in a 3x3 grid labelled with numbers 1 to 8, and the hole or gap labelled with 0

Solve the board. Find sequences of moves of the hole that reconfigure the board into a target arrangement of tiles. This I treated as a queue-based breadth-first graph search, with arrangements of pieces being the nodes of a graph, and moves being the edges. However the specific constraints of Asteroid Escape aren’t captured at this point; all tiles are interchangeable in their movements.

A sequence of sliding tile puzzle grids showing how the hole is moved around to reach a target arrangement

Model the blockers that prevent certain tile arrangements. I quickly realised I would miss edge cases if I tried to list permitted arrangements of tiles, and instead defined the “blockers” on each piece with reference to a 4×4 grid. The grid had one cell margin around the outside of the piece to model projecting outer blockers, and 4 positions inside the piece that could carry blockers. A grid cell would be assigned 1 if there was a blocker or 0 otherwise. For a proposed board arrangement, the piece grids could be offset to their location and superposed onto a larger 8×8 board grid by adding piece cell values. A resultant value greater than 1 would mean that two or more blockers were interfering, so the board arrangement was not actually possible.

Grids modelling blocking protrusions of individual pieces, illustrating how interference occurs when blockers from separate tiles (values of 1) occupy the same space (value of 2)

Solve the board while respecting blocked arrangements. The breath-first search identifies the neighbours – reachable in one move – from each arrangement of pieces. Any neighbours that would result in interfering blockers are now excluded from the search. The target state also only requires that the spacecraft piece should be in the exit position. Both of these constraints were captured as functions passed to the solver. This worked well for a Starter puzzle I tested, but failed on a Wizard puzzle, because it failed to capture piece interference that only occurs when moving pieces.

Model the blockers that prevent certain tile moves. In particular, the nose of the spacecraft doesn’t interfere with the protruding corner blocker on one asteroid tile in static arrangements, but there are scenarios where the tiles can’t be moved past on another. To model this, I increased the piece grid to 5×5 (now 9 interior positions) to capture the previously ignored spacecraft nose, and when assessing a move, I shifted the tile by one grid cell at a time so that each move assessed interference at start and end, and two intermediate positions.

A sequence of 4 grids illustrating interference between blockers on separate pieces as one piece slides past the other, though there is no interference at start or end position

Solve the board while respecting blocked arrangements and blocked moves. Update the neighbour search to exclude moves to neighbours that would result in blocker interference during the move. Update the target state to avoid any interference in sliding the spacecraft off the board.

Of course, the solver didn’t go off without a hitch when I tried it in the real world. I failed to specify a clear exit was required at first, and failed to specify it properly the second time, but third time was a charm (as above).

Asteroid Escape solver bloopers


From a cursory inspection, the Wizard level 60 solution in the notebook and animated above looks the same as the solution provided with the puzzle (so I didn’t really need a solver at all!) However, this solver can also find multiple other (longer) solutions to each challenge. It could also be used to generate and solve more variants at each level of difficulty, extending the catalogue of puzzles. Improvements can wait, however, as it’s time to enjoy the bike or the beach once more.

Throwback Thursday

The metaverse is a topic currently, though the concept has a long history. Twenty years ago, in the dotcom era, I was exploring this space, as I was recently reminded. Feeling nostalgic, I dug these projects out of the NAS archives. Tech has moved on, but there’s enduring relevance in what I learned.

VO2max (1999)

Virtual Online Orienteering, to the max! Conceived as similar to Catching Features, but a game that was online by default, as you could play in the browser in a virtual world authored in VRML97.

A screenshot of a virtual orienteering game showing a first person view of terrain, a start banner and a compass, with a list of checkpoints underneath.

The UI consisted of two browser windows: a first person view and motion controls (using the now defunct Cosmo Player), and a map in a second window. 

A screenshot of a virtual orienteering game showing a first person view of terrain, a boulder with a checkpoint next to it, and a compass, with a list of checkpoints underneath.

The draggable compass needle, the checkpoints and the course logic (must visit checkpoints in order), and a widget that visualised your completed route as an electric blue string hovering 3 feet above the ground were all modular VRML Protos.

A screenshot of a virtual orienteering game showing a first person view of terrain, the time to complete the course, and a blue line that shows the route taken to complete the course.

The map and terrain for the only level ever created were generated together with a custom C++ application. I was pretty pleased this all worked, and it demonstrated some concepts for…

4DUniverse (2000)

4DUniverse was a broad concept for virtual online worlds for socialising, shopping, gaming, etc, similar at the time to ActiveWorlds (which still exists today!), but again accessible through the browser (assuming you had a VRML plugin).

I thought I’d have great screengrabs to illustrate this part of the story, but I was surprised how few I’d captured, that they were very low resolution, and that they were in archaic formats. The source artefacts from this post – WRL, HTML, JS, JAVA, etc – have lost no fidelity, but would only meet modern standards and interpreters to various degrees. Maybe I will modernise them someday and generate new images to do justice to the splendour I held in my mind!

A variety of screengrabs of virtual worlds from the 4DUniverse. Three different worlds with different themes are shown. Vaguely Greek open hotel lobby, Gothic cathedral in orbit, etc

We authored a number of worlds, connected by teleports, with the tools we had to hand, being text editors, spreadsheets, and custom scripts. While a lot of fun, we came to the conclusion that doing the things we envisaged in the 4DUniverse wasn’t any more compelling than doing them in the 2D interfaces of the time. VRML eventually went away, and probably because no one else was able to make a compelling case for its use. At least I crafted a neat animated GIF logo rendered with POV-Ray.

4 capital D's with a metallic finish joined in a flower shape that is rotating

Less multiverse, and more quantum realm, I also generated VRML content at nanoscale with…

NanoCAD (2001)

NanoCAD was a neat little (pun intended) CAD application for designing molecules, which I extended with a richer editing UI, supporting the design of much more complex hypothesised molecular mechanisms.

NanoCAD UI showing 3D view of construction of a large buckytube from graphene sheets. Application controls and help text are shown in the lower panel.

The Java app allowed users to place atoms in 3D and connect them with covalent bonds. Then an energy solver would attempt to find a stable configuration for the molecules (using classical rather than quantum methods). With expressive selection, duplication and transformation mechanics, it was possible to create benzene rings, stitch them into graphene sheets, and roll them up into “enormous” buckytubes, or other complex carbon creations.

NanoCAD UI showing detailed 3D view of construction of a large buckytube from graphene sheets. Application controls and help text are shown in the lower panel. Further desktop background including MS Word, Windows Explorer and WinAmp evoke the 2001 era

I also created cables housed inside sheaths, gears – built with benzene ring teeth attached to buckytubes – and other micro devices. If 4DUniverse was inspired by Snow Crash, NanoCAD was inspired by The Diamond Age. Nanocad could run in a browser as an applet and the molecules could also be exported as WRL files for display in other viewers.

3D visualisation in the space-filling style of a molecular gear made from benzene ring teeth attached to the outside of a buckytube

Comparing contemporary professional projects

It’s nice to contrast the impermanence of these personal projects with the durability of my contemporary professional work with ANCA Machines. At the time, I was documenting the maths and code of Cimulator3D and also developing the maths and bidirectional UI design for iFlute, both used in the design and manufacturing of machine tools via grinding processes. Both products are still on the market in similar form today, more than two decades later.

I wonder how I’ll view this post in another twenty years?

Second Semantle Solver

In the post Sketching Semantle Solvers, I introduced two methods for solving Semantle word puzzles, but I only wrote up one. The second solver here is based the idea that the target word should appear in the intersection between the cohorts of possible targets generated by each guess.

Finding the semantle target through overlapping cohorts. Shows two intersecting rings of candidate words based on cosine similarity.

To recap, the first post:

  • introduced the sibling strategies side-by-side,
  • discussed designing for sympathetic sequences, so the solver can play along with humans, with somewhat explainable guesses, and
  • shared the source code and visualisations for the gradient descent solver.

Solution source

This post shares the source for the intersecting cohorts solver, including notebook, similarity model and solver class.

The solver is tested against the simple simulator for semantle scores from last time. Note that the word2vec model data for the simulator (and agent) is available at this word2vec download location.

Stylised visualisation of the search for a target word with intersecting  cohorts. Shows distributions of belief strength at each guess and strength and rank of target word

The solver has the following major features:

  1. A vocabulary, containing all the words that can be guessed,
  2. A semantic model, from which the agent can calculate the similarity of word pairs,
  3. The ability to generate cohorts of words from the vocabulary that are similar (in Semantle score) to a provided word (a guess), and
  4. An evolving strength of belief that each word in the vocabulary is the target.

In each step towards guessing the target, the solver does the following:

  1. Choose a word for the guess. The current choice is the word with the strongest likelihood of being the target, but it could equally be any other word from the solver’s vocabulary (which might help triangulate better), or it could be provided by a human player with their own suspicions.
  2. Score the guess. The Semantle simulator scores the guess.
  3. Generate a cohort. The guess and the score are used to generate a new cohort of words that would share the same score with the guess.
  4. Merge the cohort into the agent’s belief model. The score is added to the current belief strength for each word in the cohort, providing a proxy for likelihood for each word. The guess is also masked from further consideration.

Show of strength

The chart below shows how the belief strength (estimated likelihood) of the target word gradually approaches the maximum belief strength of any word, as the target (which remains unknown until the end) appears in more and more cohorts.

Intersecting cohorts solver. Line chart showing the belief strength of the target word at each guess in relation to the maximum belief strength of remaining words.

We can also visualise the belief strength across the whole vocabulary at each guess, and the path the target word takes in relation to these distributions, in terms of its absolute score and its rank relative to other words.

Chart showing the cohort solver belief strength across the whole vocabulary at each guess, and the path the target word takes in relation to these distributions, in terms of its absolute score and its rank relative to other words

Superior solution?

The cohort solver can be (de)tuned to almost any level of performance by adjusting the parameters precision and recall, which determine the tightness of the similarity band and completeness of results from the generated cohorts. The gradient descent solver has potential for tuning parameters, but I didn’t explore this much. To compare the two, we’d therefore need to consider configurations of each solver. For now, I’m pleased that the two distinct sketches solve to my satisfaction!