The notebook demonstrates that individual rows of a database may be reconstructed, even if only summary statistics are shared, by considering the constraints that the statistics place on possible values of the data. Constraints include mean and median for all numerical values globally and for various cohorts of records determined by class values.
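As a minimal sketch of the idea (my construction, not the notebook's code): suppose a cohort is known to contain three integer values between 0 and 10, and only its mean (4) and median (3) are published. Brute-force enumeration of all value combinations consistent with those two statistics already narrows the cohort to a handful of candidates:

```python
from itertools import combinations_with_replacement

# Candidate cohorts: non-decreasing triples of integers in [0, 10]
# consistent with mean == 4 (i.e. sum == 12) and median == 3.
candidates = [
    c for c in combinations_with_replacement(range(11), 3)
    if sum(c) == 12 and c[1] == 3
]
print(candidates)  # [(0, 3, 9), (1, 3, 8), (2, 3, 7), (3, 3, 6)]
```

Each additional published statistic (a cohort minimum, another cohort's mean) removes more candidates; with enough summary statistics the individual rows can be pinned down exactly.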
Note that the intent of this notebook is not to compromise any private data, but to raise awareness of the potential for privacy breaches due to reconstruction attacks!
In three years of the maths whimsy repo, I’ve covered a lot of ground, and got hands-on with many Python libraries for data science. I was particularly pleased when the idea to visualise multi-stage classification pipelines as Sankey diagrams was enthusiastically picked up by one client. I’m not sure I’ll find an application for which numpig chunking of large arrays is the best solution, but I remain quite fond of it. And I could spend all day watching the skeuomorphic odometer visualisation created for the slackometer.
Afterwards, someone asked me about the times it hadn’t worked. I should have covered this too! The primary time I wasn’t satisfied that I had something to share was when I was experimenting with reinforcement learning to solve mazes with disconnected walls. I bit off more than I could chew, and while I went end-to-end with Deep Q-Learning, and then backfilled some bits of the training pipeline, I didn’t reach a satisfactory conclusion within the time I had available. I took this lesson forward to re-frame the problem into smaller, achievable chunks whenever it looked like I may have dived too deep on subsequent whimsies, and it has served me well since.
I wrote an article about optimising the design of EV charging networks. It’s a story of work done by a team at Thoughtworks, demonstrating the potential of AI augmented decision-making (including some cool optimisation techniques), in this rapidly evolving but durably important space.
We were able to thread together these many [business problem, AI techniques, data sourcing, technology] concerns in such a short time because of our relentless focus — not on the code or the mathematics — but on generating value: namely making it easier for humans to make decisions according to business goals.
Though but a footnote in the epic of coding with AI, I thought it worth musing on my recent experience with Bard. Bard currently uses the LaMDA model, which is capable of generating code, but not optimised for it. The story might be different with Codey as protagonist (or is that antagonist?).
I didn’t produce any software I could use in this session, but I did learn that Bard was at its most useful when prompted with open questions or very specific concepts. I also learned that generated code may be sabotaged by plausible but incorrect domain logic and conceptual fault lines that occur between contexts.
Also, given that safetydave.net contributes 0.0000005% of tokens to the C4 dataset which makes up some of Bard’s training data, I thought I should see some return!
In which we meet a domain.
I started naively with a reasonably complex ask: how do I calculate the swept path of a vehicle in python?
I’ve done swept path analysis before, but in manufacturing, and I certainly don’t have a Python solution handy. Even if I only got the broad outlines of a solution, an answer or parts of an answer to this question would save a bunch of time.
This was far too wide an ask. I got code that looks like it’s for a curve that a vehicle might follow, but that’s just one part of the solution, and the more I looked at it, the more issues I found.
# The vehicle's position
x = np.array([0, 0])
y = np.array([0, 0])
# The vehicle's orientation
theta = 0
# The vehicle's velocity
v = 1
# The vehicle's wheelbase
w = 2

# Calculate the vehicle's position at each time step
for t in np.arange(0, 10, 0.1):
    x += v * np.cos(theta) * t
    y += v * np.sin(theta) * t
    theta += v / w * t
First, vehicle is a word that identifies a whole domain of assumptions, including how steering works. But this domain logic is weird; assuming theta represents direction of travel, that’s not how steering works, in this universe at least. The Export to Colab feature is nice, so let’s see how the code runs. Spoiler: it doesn’t.
As a user of Python, I could diagnose the runtime error and fix it (the initial values of x and y could be defined with dtype=float), but on closer inspection, the += operator overwrites rather than appends, and, wait a minute, x and y are treated both as vectors and scalars. As the domain logic is still bizarre, is there any purpose in fixing these conceptual faults in the code?
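For reference, here is a minimal kinematic bicycle model that does run and does steer. This is my own sketch, not Bard's output or a fix of it; the constant steering angle, wheelbase, and time step are arbitrary illustrative values:

```python
import numpy as np

dt = 0.1                        # time step (s)
v, wheelbase = 1.0, 2.0         # speed (m/s) and wheelbase (m)
steer = np.deg2rad(10)          # constant steering angle (rad)
x, y, theta = 0.0, 0.0, 0.0     # position and heading
xs, ys = [x], [y]
for _ in np.arange(0.0, 10.0, dt):
    x += v * np.cos(theta) * dt  # integrate by dt, not by elapsed t
    y += v * np.sin(theta) * dt
    # Heading changes via the steering angle and wheelbase -- the
    # conceptual piece missing from the generated code.
    theta += (v / wheelbase) * np.tan(steer) * dt
    xs.append(x)
    ys.append(y)
```

With a non-zero steering angle the trajectory traces an arc rather than marching along a straight line, which is the behaviour the generated code could never produce.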
A secondary prompt to provide more guidance: I would like to consider the area swept out by the corners of the vehicle, can you update the solution?
This only resulted in more trigonometric prestidigitation; more domain dislogic. I use the term dislogic a bit like the term truthiness. There is some related logic to the code, but it is “apart” from the logic we need.
I tried much more explicit guidance by elaborating on domain concepts in the third prompt. This led to a more detailed but still fundamentally flawed and unusable solution. I decided this path wouldn’t lead to a happy ending.
Morals of Act 1
Don’t assume too much domain expertise. Bard has learned on generic public examples. Use your domain expertise to break the problem into smaller chunks.
Also, don’t expect conceptual consistency throughout. LLMs like Bard, as next-token predictors, don’t necessarily ensure conceptual consistency in their output.
In which I choose a path to follow.
I decided to focus on one part of the solution: getting the curve right. I reset Bard’s context.
I want a python function to create a curve between two points. The function arguments should be the start and end points and the tangent to the curve at the start and end points
Nice linear interpolation, shame about the tangents (which, while present as arguments, were totally ignored in the function body).
And the above could only be generated after fixing more errors preventing the code from running. The affordances of tuples and numpy.ndarray were confused, and the coordinates weren’t passed correctly to the plot method. The syntax was fine, but the code was riven with conceptual fault lines between contexts – what looked OK in one or other context in isolation caused problems when the contexts were brought together. The bugs were fairly obvious in this case, but in general could be subtle and difficult to detect.
Still, after minor adjustments, it’s a curve that meets some of the requirements. This is more useful than what we got in Act 1.
I augmented the initial prompt.
The curve tangent should match the tangents of the start and end point supplied as arguments. Please define points and tangents as numpy arrays
Now we have two nice linear interpolations, but the tangent is not the tangent of the curve! (It’s more like we have linearly interpolated the position and orientation of the vehicle.)
This code, presumably due to specifying numpy, worked straight out of the box.
Maybe I can establish the link between the tangent and the curve to which it is tangent?
In the create curve function, the tangent should be calculated as unit vector in the direction from one point to the next on the curve. This calculated tangent should equal the tangent argument at both the start and end of the curve.
Nope, that was too naively optimistic. I was forgetting the Morals of Act 1.
Let’s recall the Morals of Act 1, and try to link to concepts with better public support, such as Bézier curves.
The create curve function should be updated to return a bezier curve rather than a linear interpolation between start and end point
The result looked Bézier-ish, but the formula wasn’t transcribed faithfully, with the calculated points quickly racing away to infinity (well outside the convex hull of the control points).
# Calculate the point at the current index.
point = (1 - i / num_points)**3 * start_point + 3 * (1 - i / num_points)**2 * i / num_points * start_tangent + 3 * (1 - i / num_points) * (i / num_points)**2 * end_tangent + i**3 * end_point
However, I think we’re getting closer.
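For comparison, and as my own sketch rather than anything Bard produced: a faithfully transcribed cubic Bézier uses four control points (not tangents) and normalises the parameter consistently in every term, which keeps each point inside the convex hull of the control points. The control point values below are illustrative:

```python
import numpy as np

def bezier_point(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve at parameter t in [0, 1]."""
    return ((1 - t) ** 3 * p0
            + 3 * (1 - t) ** 2 * t * p1
            + 3 * (1 - t) * t ** 2 * p2
            + t ** 3 * p3)

# Illustrative control points: endpoints on the x-axis, inner points above.
p0, p3 = np.array([0.0, 0.0]), np.array([3.0, 0.0])
p1, p2 = np.array([1.0, 2.0]), np.array([2.0, 2.0])
curve = np.array([bezier_point(p0, p1, p2, p3, t)
                  for t in np.linspace(0.0, 1.0, 50)])
```

Note every basis term is a function of the same normalised `t`; mixing `i / num_points` and raw `i` in the same formula, as the generated code did, is exactly what sends points racing off to infinity.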
Time to reset Bard’s context again and go all in on Bézier.
I would like a python function to create a bezier curve
Please update the solution to plot the curve with matplotlib
And with a few manual tweaks that are becoming familiar…
… we finally get a curve with the desired properties from Take 1!
However, this curve is not necessarily going to represent the path of a vehicle, which has a minimum radius of curvature (the vehicle’s turning circle) that a Bézier curve can’t guarantee. Based on the experience above, it’s unclear to me how I would incorporate the curvature requirement with further prompts.
Taking stock: after much prompting, I have a method to create a class of curve I can’t directly use in the solution. (And I ended up looking up the code for it anyway.)
Moral of Act 2
Don’t forget the Morals of Act 1.
In addition, being able to directly reference the key concepts in a generic domain, rather than trying to explain the logic, may be key to getting closer to a correct solution in a few shots.
If you’re not familiar enough with the key concepts to reference them directly, you might use a more open prompt initially, such as: How do you create a smooth curve? This might point you in the right direction…
There are many ways to create a smooth curve. Here are a few examples:
* Use a Bezier curve. A Bezier curve is a parametric curve that is defined by a set of control points. The curve is smooth because it is a polynomial function of the control points.
* ... [goes on to reference Splines, Fourier series and B-Splines] ...
Because we need 3 acts. In which I back up and broaden the domain.
Considering whether the word vehicle in the initial prompt had given Bard the wrong steer, I tried the more generic prompt: how do I calculate the swept path of an object in python?
This gave helpful introductory commentary on breaking the problem down, and a nearly usable solution.
# Define the circle's geometry.
points = np.array(...)

# Define the circle's motion.
path = np.array(...)

# Calculate the swept path.
swept_path = np.zeros((len(points), len(path)))
for i in range(len(points)):
    for j in range(len(path)):
        swept_path[i][j] = points[i] + path[j]
But one that still needed an expert review to ensure values ended up in all the right places.
Below we can see different coloured “circles” drawn at each position in the path.
This is pretty trivial though – it’s just organised vector addition – did I need AI for that?
Moral of Act 3
Keeping it simple increases the chance of success, but you should balance this against whether a simple solution provides sufficient value.
Concluding the saga, for now
I tried to use Bard to deliver large chunks of a complex solution, rather than as a smarter autocomplete for finer details, or as an aid to understanding existing or proposed solutions. In the time I spent prompting Bard, I would have got further writing code directly. However, I have a lot of scope to improve my prompting.
With expertise in the domain and the code, I was able to diagnose and correct the issues in Bard’s solutions, but I suspect that someone who lacked one or both of those areas of expertise couldn’t recover quickly. In some respects, developing software is about recovering quickly from errors – we can’t avoid making mistakes, but we set up feedback loops to detect them quickly, and over time we become vigilant to more of the types of mistakes we are likely to make. Does an AI coding assistant like Bard help us recover quickly from mistakes? I didn’t actually ask Bard to help much in this session, so that question needs further work to resolve, possibly taking the angle of AI-aided test-first development.
What I did learn was that Bard was at its most useful when prompted with open questions or very specific concepts with public data support. I also learned that generated code is likely to be sabotaged by domain dislogic and conceptual fault lines between contexts.
Over time, we’ll figure out how to make AI a better protagonist and antagonist in our coding stories; for me, this was an interesting way to introduce a new character.
A little smarter, anyway. I didn’t expect to pick this up again, but when I occasionally run the first generation solvers online, I’m often equal parts amused and frustrated by rare words thrown up that delay the solution – from amethystine to zigging.
The solvers used the first idea that worked; can we make some tweaks to make them smarter? The code is now migrated to its own new repo after outgrowing its old home.
I measure solver performance by running multiple trials of a solver configuration against the simulator for a variety of target words. This gives a picture of how often the solver typically succeeds within a certain number of guesses.
It turns out that the vocabulary to date based on english_words_set is a poor match for the most frequently used English words, according to unigram frequency data.
So we might expect that simply replacing the solver vocabulary would improve performance, and we also get word ranking from unigram_freq.
To improve the gradient solver I tried making another random guess every so often to avoid long stretches exploring local minima. But it didn’t make things better, and probably made them worse!
In response, I made each guess the most common local word to the extrapolated semantic location, rather than just the nearest word. Still no better, and trying both “improvements” together was significantly worse!
Ah well, experiments only fail if we fail to learn from them!
I think the noise inherent in a different semantic model, plus the existing random extrapolation distance, overwhelms the changes I tried. In better news, we see a major improvement from using unigram freq vocabulary, reducing the mean from 280 (with many searches capped at 500) to 198, approximately a 30% improvement.
Here we see that the data-centric (vocabulary) improvement had a far bigger impact than any model-centric (search algorithm) improvement that I had the patience to try (though I left a bunch of further todos). Maybe just guessing randomly from the top n words will be better again!
At least I’ve made a substantial dent in reducing those all-too-common guesses at rare words.
Sure commercial maps app directions are great, but have you ever found the customisation options limited? What if you want to use bike paths and back streets when cycling, or avoid winding roads that might make backseat passengers car-sick on a road trip?
The paved route
OpenStreetMap and OpenRouteService do provide this type of functionality, and could be considered for use as-is or with further customisation. PostGIS and pgRouting provide capabilities if you bring your own data. Many dedicated apps support groups with particular mobility needs.
My way, off the highway
In researching these capabilities however, and because I’m a fan of maps and I wanted to understand the whole data transformation journey, I decided to hand-roll my own navigation solution using pyshp, numpy, scipy, and networkx, all visualised with matplotlib. The end result is far from polished, but it can ingest 1.1M road geometries for the Australian state of Victoria, generate a topological graph for routing within minutes, then use that map to generate turn-by-turn directions in real time.
See the source code and the brief write-up below if you’re interested.
The solution uses data from the Vicmap Transport data set, which provides road centrelines for highways, streets, tracks, paths, etc, for all of Victoria and some bordering regions. The spatial features are augmented with 71 attributes useful for routing, including road names, permitted directions of travel, height limits, etc. I used a GDA2020 datum grid projection shapefile export. Pyshp provides a list of geometries and attributes via shapeRecords.
Vicmap Transport road centrelines are collections of polylines. The endpoints of these polylines (aka road segments) helpfully coincide where we might infer a continuous stretch of road or intersection. This allows us to describe how roads are connected with a graph.
Each endpoint will map to a node in the graph. The node may be unique to one road segment, or shared between multiple road segments if it’s at a junction. We find the coincident endpoints for a shared node with scipy’s query_pairs method. The road segments with coincident endpoints then define the edges of the graph at this node. A directed graph can be defined using the direction code attribute of each segment (forward, reverse, or both directions).
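A sketch of that node-merging step, assuming scipy's cKDTree.query_pairs for finding coincident endpoints and a simple union-find to assign shared node ids (the tolerance and the toy segments are illustrative, not the project's data):

```python
import numpy as np
import networkx as nx
from scipy.spatial import cKDTree

# Two toy road segments sharing an endpoint at (1, 0).
segments = [((0.0, 0.0), (1.0, 0.0)), ((1.0, 0.0), (2.0, 0.0))]
endpoints = np.array([p for seg in segments for p in seg])

# Find pairs of endpoints closer than a small tolerance.
pairs = cKDTree(endpoints).query_pairs(r=1e-6)

# Union-find: merge coincident endpoints into one shared node id.
parent = list(range(len(endpoints)))
def find(i):
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i
for i, j in pairs:
    parent[find(i)] = find(j)

# Each segment becomes an edge between its (merged) endpoint nodes.
graph = nx.Graph()
for k in range(len(segments)):
    graph.add_edge(find(2 * k), find(2 * k + 1))
```

The two segments collapse to three graph nodes joined by two edges, with the junction at (1, 0) represented by a single shared node.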
With a graph representation of the road network, we can find a route between any two nodes (if connected) with standard algorithms. The examples above and below use Dijkstra’s algorithm to find the shortest path based on edge weights that reflect our routing preferences. The orange route is “fewest hops” (count of road segments) and the green route is “shortest distance”. Geometric length of road segments is calculated in a post-processing pass over the ingested polyline data, and assigned as a weight to each edge.
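With weighted edges in place, routing reduces to a library call; a minimal networkx sketch on a toy graph (node names and weights are illustrative):

```python
import networkx as nx

g = nx.DiGraph()
g.add_edge('a', 'b', length=1.0)
g.add_edge('b', 'c', length=1.0)
g.add_edge('a', 'c', length=5.0)  # direct, but geometrically longer

# "Shortest distance": Dijkstra weighted by geometric length.
route = nx.shortest_path(g, 'a', 'c', weight='length')
# "Fewest hops": ignore edge weights entirely.
hops = nx.shortest_path(g, 'a', 'c')
```

The two preference settings disagree even on this tiny graph: weighting by length detours via `b`, while hop-counting takes the direct edge.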
Optimisation and scaling
My first spikes were hideously inefficient, but once the method was established, there was a lot of room for improvement. I addressed three major performance bottlenecks as I scaled from processing 50k road segments in 50 minutes, to 1.1M road segments in 30 seconds. These figures are from a 2017 MacBook Pro; a free Colab instance is roughly similar.
Processing stage and end-to-end times:
* Coincident segment endpoints: for loop accumulating unique ids with distance test
* For loop accumulating correctly directed edges case-wise and discarding topological duplicates (> 12 hrs)
* numpy vectorisation of loop conditional logic
Stages of optimisation to support scaling
An additional ~30s is required to post-process geometric length of segments as an attribute per edge, and I imagine similar for other derived edge attributes. For instance, we might get finer grained average cycling speed per segment, or traffic risk factors, etc.
For calculating routes (i.e., at inference time), it takes about 1-4s to find a shortest path, depending on the length of the route (using pure Python networkx). We can now find routes of over 950km in length, from Mildura in the state’s north-west to Mallacoota in the east.
More latitude for navigation
We would like to be able to find the start and end nodes of a route from latitude and longitude. However, as the nodes in our routing graph are located by VicGrid grid coordinates (eastings and northings), we first need to “wrap” this planar grid around the roughly spherical Earth. While geometrically complex (see below), it’s easy to do this transformation back and forth between coordinate systems with pyproj.
Turn-by-turn navigation directions
With the start and end nodes located from latitude and longitude, a route can be calculated as above. Turn-by-turn directions can then be derived by considering the geometry of the road network at each intermediate node on the route, and what instructions might be required by users, for instance:
Determine the compass direction of travel along a road segment to initiate travel in the correct direction,
Calculate the angle between entry and exit directions at intersections to provide turn directions as left, right, straight, etc,
Use the geometric length of road segments to provide distance guidance,
Consolidate to a minimum set of directions by identifying where explicit guidance is not required (e.g., continuing straight on the same road), and
Render the instructions into an easily (?) human-consumable form, with natural language descriptions of appropriate precision.
['travel south west on Holly Court',
'continue for 190m on Holly Court',
'turn right into Marigold Crescent',
'continue for 360m on Marigold Crescent',
'go straight into Gowanbrae Drive',
'continue for 420m on Gowanbrae Drive',
'turn left into Gowanbrae Drive',
'continue for 150m on Gowanbrae Drive',
'turn left into Lanark Way',
'continue for 170m on Lanark Way']
This constitutes a rather neat first approximation to commercial turn-by-turn directions, but I suspect it suffers in many edge cases, like roundabouts and slip lanes.
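The turn-classification step above can be sketched as a comparison of compass bearings at each node. This is my own sketch; the 30° "straight" threshold is a guess, not the project's value:

```python
def turn_instruction(entry_bearing, exit_bearing, threshold=30.0):
    """Classify a turn from bearings in degrees clockwise from north."""
    # Signed smallest angle from entry to exit, normalised to (-180, 180].
    delta = (exit_bearing - entry_bearing + 180.0) % 360.0 - 180.0
    if abs(delta) < threshold:
        return 'straight'
    return 'right' if delta > 0 else 'left'

example = [turn_instruction(0, 85), turn_instruction(85, 90),
           turn_instruction(90, 0)]
# e.g. ['right', 'straight', 'left']
```

The modulo normalisation matters: without it, a turn across north (say from bearing 350° to 10°) would look like a 340° spin rather than going straight.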
With a drive-by look at the key elements, the road ahead to future “my way” hand-rolled navigation is clearer. An essential next step would be an interactive map interface. However, making this prototype roadworthy also likely needs more data wrangling under the hood (e.g., for cycling-specific data), a review of where to leverage existing open services, and polishing edge cases to a mirror finish.
I’ve talked to many people about how to maximise the utility of a simulator for business decision-making, rather than focussing on the fidelity of reproducing real phenomena. This generally means delivering a custom simulator project lean, in thin, vertical, end-to-end slices. This approach maximises putting learning into action and minimises risk carried forward.
For practitioners, and to sharpen my own thinking, I have provided a concrete example in code; a Hello, World! for custom simulator development. The simulated phenomena will be simple, but we’ll drive all the way through a very thin slice to supporting a business decision.
Introduction to simulation
To understand possible futures, we can simulate many things; physical processes, digital systems, management processes, crowd behaviours, combinations of these things, and more.
When people intend to make changes in systems under simulation, they wish to understand the performance of the system in terms of the design of actions they may execute or features that may exist in the real world. The simulator’s effectiveness is determined by how much it accelerates positive change in the real world.
Even as we come to understand systems through simulation, it is also frequently the case that people continue to interact with these systems in novel and unpredictable ways – sometimes influenced by the system design and typically impacting the measured performance – so it remains desirable to perform simulated and real-world tests in some combination, sequentially or in parallel.
For both of these reasons – because simulation is used to shorten the lead time to interventions, and because those interventions may evolve based on both simulated and exogenous factors – it is desirable to build simulation capability in an iterative manner, in thin, vertical slices, in order to respond rapidly to feedback.
This post describes a thin vertical slice for measuring the performance of a design in a simple case, and hence supporting a business decision that can be represented by the choice of design, and the outcomes of which can be represented by performance measures.
This post is a walk-through of this simulation_hello_world notebook and, as such, is aimed at a technical audience, but one that cares about aligning responsive technology delivery and simulation-augmented decision making for a wider stakeholder group.
If you’re just here to have fun observing how simulated systems behave, this post is not for you, because we choose a very simple, very boring system as our example. There are many frameworks for the algorithmic simulation of real phenomena in various domains, but our example here is so simple, we don’t use a framework.
The business decision
How much should we deposit fortnightly into a bank account to best maintain a target balance?
Custom simulator structure
To preserve agility over multiple thin slices through separation of concerns, we’ll consider the following structure for a custom simulator:
Study of the real world
Translation of performance and design
Translation of an environment
We’ll look at each stage and then put them all together and consider next steps. As a software engineering exercise, the ongoing iterative development of a simulator requires consideration of Continuous Delivery, and in particular shares techniques with Continuous Delivery for Machine Learning (CD4ML). I’ll expand on this in a future post.
Study of the real world
We study how a bank account works and aim to keep the core algorithms really simple – as befits a Hello, World! example. In our study, we recognise the essential features of: a balance, deposits and withdrawals. The balance is increased by deposits and reduced by withdrawals, as shown in the following Python code.
Core algorithms reproduce real world phenomena, in some domain, be it physical processes, behaviour of agents, integrated systems, etc. We can simulate how our simple model of a bank account will behave when we perform some set of transactions on it. In the realm of simulation, these are possible rather than actual transactions.
class Account:
    def __init__(self):
        self.balance = 0

def simulate_transaction(account, kind, amount):
    if kind == 'd':
        account.balance += amount
    elif kind == 'w':
        account.balance -= amount
    else:
        raise ValueError("kind must be 'd' or 'w'")

account = Account()
balances = [account.balance]
for t in transactions:  # e.g. [('d', 10), ('w', 5)]
    simulate_transaction(account, *t)
    balances.append(account.balance)
When we simulate the bank account, note that we are interested in capturing a fine-grained view of its behaviour and how it evolves (the sequence of balances), rather than just the final state. The final state can be extracted in the translation layer if required.
Visualisation is critical in simulator development – it helps to communicate function, understand results, validate and tune implementation and diagnose errors, at every stage of development and operation of a simulator.
This could be considered a type of discrete event simulation, but as above, we’re more concerned with a thin slice through the whole structure than the nature of the core algorithms.
Translation of performance and design
The core algorithms are intended to provide a fine-grained, objective reproduction of real observable phenomena. The translation layer allows people to interact with a simulated system at a human scale. This allows people to specify high-level performance measures used to evaluate the system designs, which are also specified at a high level, while machines simulate all the rich detail of a virtual world using the core algorithms.
We decide for our Hello, World! example that this is a transactional account, and as such we care about keeping the balance close to a target balance, so that sufficient funds will be available, but we don’t leave too much money in a low interest account. We therefore measure performance as the average (mean) absolute difference between the balance at each transaction, and the target balance. Every transaction that leaves us over or under the target amount is penalised by the difference and this penalty is averaged over transactions. A smaller measure means better performance.
This may be imperfect, but there are often trade-offs in how we measure performance. Discussing these trade-offs with stakeholders can be a fruitful source of further insight.
def translate_performance_TargetBalance(balances, target):
    return sum([abs(b - target) for b in balances]) / len(balances)
While we could make any arbitrary number of transactions and decide about each one individually, for simplicity we set up a fortnightly deposit schedule, and make one decision about the amount of that fortnightly deposit.
As the performance translation distills complexity into a single figure, so the design translation does the inverse, with complexity blooming from a single parameter. We translate the single design parameter into a list of every individual transaction, suitable for simulation by the core algorithms.
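The design translation can be as simple as expanding one parameter into a repeated event list. A sketch, reusing the translator name that appears later in the post; the 26-fortnight schedule and the ('d', amount) event shape are my assumptions:

```python
def translate_design_FortnightlyDeposit(deposit_amount, fortnights=26):
    """Expand a single deposit amount into a schedule of deposit events."""
    return [('d', deposit_amount)] * fortnights

events = translate_design_FortnightlyDeposit(9)
```

One scalar in, a year's worth of individual transactions out: the "complexity blooming" described above.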
This example above (and from the notebook) specialises the target balance performance translator to Target100 and makes the design translator configurable.
Remembering that we’re interested in the mean absolute delta between the actual balance and the target, here’s what it might look like to evaluate and visualise a design:
evaluating account balance target 100
with FortnightlyDeposit 
the mean abs delta is 61.89
Experimentation involves exploring and optimising simulated results through the simple interface of performance = f(design). This means choosing a fortnightly deposit amount and seeing how close we get to our target balance over the period. Note that while performance and design are both single values (scalars) in our Hello, World! example, in general they would both consist of multiple, even hundreds of, parameters.
This performance function shows visually the optimum (minimum) design is the point at the bottom of the curve. We can find the optimal design automatically using (in this instance) scipy.optimize.minimize on the function performance = f(design).
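A sketch of that optimisation, assuming a deposit-only account over 26 fortnights with a target balance of 100. This is my reconstruction under those assumptions (using minimize_scalar for the one-parameter case), not the notebook's exact code:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def performance(deposit, target=100.0, fortnights=26):
    # Balance after each fortnightly deposit, with no withdrawals.
    balances = np.cumsum(np.full(fortnights, deposit))
    # Mean absolute deviation from the target balance.
    return np.mean(np.abs(balances - target))

result = minimize_scalar(performance, bounds=(0.0, 50.0), method='bounded')
```

Under these assumptions the optimiser lands on a fortnightly deposit of roughly 5, consistent with the optimal design figures quoted below.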
We can explore designs in design space, as above, and we can also visualise how an optimal design impacts the simulated behaviour of the system, as below.
Note that in the optimal design, the fortnightly deposit amount is ~5 for a mean abs delta of ~42 (the optimal performance) rather than a deposit of 9 (an arbitrary design) for a mean abs delta of ~62 (the associated performance) in the example above.
Translation of an environment
The example so far assumes we have complete control of the inputs to the system. This is rarely the case in the real world. This section introduces environmental factors that we incorporate into the simulation, but consider out of our control.
Just like design parameters, environmental factors identified by humans need to be translated for simulation by core algorithms. In this case, we have a random fortnightly expense that is represented as a withdrawal from the account.
If we were to simulate this environmental factor without any deposits being made, it would look like below.
Now we translate design decisions and environmental factors together into the inputs for the simulation core algorithms. In this case we interleave or “zip” the events together:
interleaved = zip(translate_design_FortnightlyDeposit(design_parameters),
                  environment_events)  # deposit events zipped with environmental withdrawals
return [val for pair in interleaved for val in pair]
We’ll now introduce an alternative design and optimise that design by considering performance when the simulation includes environmental factors.
After visualising the result of our first design against our performance measurement, an alternative design suggests itself. Instead of having only a fixed fortnightly deposit, we could make an initial large deposit, followed by smaller fortnightly deposits. Our design now has two parameters.
In the absence of environmental factors, our system would evolve as below.
However, we can also incorporate the environmental factors by interleaving the environmental transactions as we did above.
Our performance function now incorporates two design dimensions. We can run experiments on any combination of the two design parameters to see how they perform. With consideration of environmental factors, we can now visualise performance and optimise the design in two dimensions.
Instead of the low point on a curve, the optimal design is now the low point in a bowl shape, the contours of which are shown on the plot above.
This represents our optimal business decision based on what we’ve captured about the system so far: an initial deposit of about $105 and a fortnightly deposit of $2.40. If we simulate the evolution of the system under the optimal design, we see a result like below, and can see visually why it matches our intent to minimise the deviation from the target balance at each time.
The next thin slice
We could increase fidelity by modelling transactions to the second, but it may not substantially change our business decision. We could go on adding design parameters, environmental factors, and alternative performance measures. Some would only require changes at the translation layer, some would require well-understood changes to the core algorithms, and others would require us to return to our study of the world to understand how to proceed.
We only add these things when required to support the next business decision. We deliver another thin slice through whatever layers are required to support this decision by capturing relevant design parameters and performance measures at the required level of fidelity. We make it robust and preserve future flexibility with continuous delivery. And in this way we can most responsively support successive business decisions with a thinly sliced custom simulator.
Weighting objectives and constraints
We used a single, simple measure of performance in this example, but in general there will be many measures of performance, which often conflict and need to be weighted against one another. We may use more sophisticated solvers as the complexity of design parameters and performance measures increases.
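As a minimal sketch, assuming illustrative measure names, a weighted sum is the simplest way to combine conflicting measures into a single score:

```python
def combined_performance(measures, weights):
    """Combine multiple (possibly conflicting) performance measures
    into a single score by weighting them against one another."""
    return sum(weights[name] * value for name, value in measures.items())

# Illustrative measures: fees are weighted twice as heavily as deviation
score = combined_performance(
    measures={"deviation_from_target": 12.5, "fees_paid": 3.0},
    weights={"deviation_from_target": 1.0, "fees_paid": 2.0},
)
# 12.5 * 1.0 + 3.0 * 2.0 = 18.5
```

The weights themselves are a business judgement, which is exactly the kind of question to raise with stakeholders.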
Instead of looking for a single right answer, we can also raise these questions with stakeholders to understand where the simulation is most useful in guiding decision-making, and where we need to rely on other or hybrid methods, or seek to improve the simulation with another development iteration.
Experimentation in noisy environments
The environment we created is noisy; it has non-deterministic values in general, though we use a fixed random seed above. To properly evaluate a design in a noisy environment, we would need to run multiple trials.
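A minimal sketch of multi-trial evaluation, with a toy noisy performance function standing in for the simulation:

```python
import random
import statistics

def noisy_performance(design, rng):
    """Toy stand-in for one simulation run: the true performance is
    `design`, observed with additive Gaussian noise."""
    return design + rng.gauss(0, 1)

def evaluate_design(design, trials=30):
    """Evaluate a design in a noisy environment by averaging over
    repeated trials, one random seed per trial."""
    scores = [noisy_performance(design, random.Random(seed))
              for seed in range(trials)]
    return statistics.mean(scores), statistics.stdev(scores)

mean, spread = evaluate_design(5.0)
```

Reporting the spread alongside the mean also tells us how much the environment, rather than the design, is driving the result.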
Simulation scenarios can become arbitrarily complex. We first validate the behaviour in the simplest point scenarios, which ideally have analytic solutions, as above. We can then calibrate against more complex real-world scenarios; once we are sufficiently confident the core algorithms are correct, we accept that the simulation produces the most definitive result for the most complex or novel scenarios.
Most of the work is then in translating human judgements about design choices and relevant performance measures into a form suitable for exploration and optimisation. This is where rich conversations with stakeholders are key to draw out their understanding and expectations of the scenarios you are simulating, and how the simulation is helping drive their decision-making. Focussing on this engagement ensures you’ll continue to build the right thing, and not just build it right.
In between beach trips and bike rides, I whiled away more than a few summer hours on puzzles I found in the various Airbnbs we rented. Returning home, I rediscovered a sliding tile puzzle with a twist called Asteroid Escape, where embedded asteroids prevent certain tiles sliding past each other in certain configurations.
With 60 challenges from Starter to Wizard level, I was musing about generating the catalogue of challenges automatically. I’ve enjoyed automating puzzle solutions since brute-forcing Instant Insanity with BASIC in my teens (the code is long lost). While Starter puzzles may only require 6 moves, Wizard may require as many as 109 moves. I decided I would particularly enjoy avoiding the gentle frustration of manually solving so many variants of Asteroid Escape if I were to use an automated solver.
I built a solver in Python with numpy for working with grids of numbers. If automating puzzles is your thing too, check out the source code and read on for a summary of the approach. To execute the proposed solution, I printed out the moves and crossed them off with a pencil as I made them. There’s a lot that could be more elegant or efficient, but it gets the job done!
Define a simple sliding tile board. Elements of a 2D array hold the indexes of the tile pieces at each location (numbered 1..8 for a 3×3 board). The tiles can be reordered through a sequence of moves of the “hole”, that is the space with no tile (numbered 0). The available moves depend on the hole location – four options from the centre but only two from the corners.
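A minimal sketch of this representation with numpy (not the repo's exact code):

```python
import numpy as np

# 3×3 board: tiles numbered 1..8, with 0 marking the hole
board = np.array([[1, 2, 3],
                  [4, 0, 5],
                  [6, 7, 8]])

def available_moves(board):
    """Return the board positions the hole can move to: the orthogonal
    neighbours of the hole's current location that lie on the board."""
    (r,), (c,) = np.where(board == 0)
    candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    rows, cols = board.shape
    return [(i, j) for i, j in candidates if 0 <= i < rows and 0 <= j < cols]
```

From the centre this yields four options; from a corner, only two.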
Solve the board. Find sequences of moves of the hole that reconfigure the board into a target arrangement of tiles. I treated this as a queue-based breadth-first graph search, with arrangements of pieces being the nodes of a graph, and moves being the edges. However, the specific constraints of Asteroid Escape aren’t captured at this point; all tiles are interchangeable in their movements.
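The search itself can be sketched as follows; `is_target` and `neighbours` are passed in as functions, which is what later allows the Asteroid Escape constraints to be injected:

```python
from collections import deque

def solve(start, is_target, neighbours):
    """Breadth-first search: arrangements are nodes, hole moves are
    edges. Returns the shortest sequence of arrangements from `start`
    to a target arrangement, or None if none is reachable."""
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if is_target(state):
            return path
        for nxt in neighbours(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [nxt]))
    return None
```

Arrangements need to be hashable (e.g. tuples rather than arrays) so that `seen` can deduplicate them.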
Model the blockers that prevent certain tile arrangements. I quickly realised I would miss edge cases if I tried to list permitted arrangements of tiles, and instead defined the “blockers” on each piece with reference to a 4×4 grid. The grid had one cell margin around the outside of the piece to model projecting outer blockers, and 4 positions inside the piece that could carry blockers. A grid cell would be assigned 1 if there was a blocker or 0 otherwise. For a proposed board arrangement, the piece grids could be offset to their location and superposed onto a larger 8×8 board grid by adding piece cell values. A resultant value greater than 1 would mean that two or more blockers were interfering, so the board arrangement was not actually possible.
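A minimal sketch of the superposition check, with illustrative piece grids rather than the real pieces:

```python
import numpy as np

def arrangement_possible(pieces, board_shape=(8, 8)):
    """pieces: list of (grid, (row, col)) pairs, where grid is a 0/1
    numpy array of blocker cells and (row, col) is its offset on the
    board. Superpose all grids by addition; any cell summing above 1
    means two blockers interfere, so the arrangement is impossible."""
    board = np.zeros(board_shape, dtype=int)
    for grid, (r, c) in pieces:
        h, w = grid.shape
        board[r:r + h, c:c + w] += grid
    return int(board.max()) <= 1
```

The one-cell margin around each real piece grid is what lets projecting outer blockers overlap a neighbouring piece's cells in this sum.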
Solve the board while respecting blocked arrangements. The breadth-first search identifies the neighbours – reachable in one move – from each arrangement of pieces. Any neighbours that would result in interfering blockers are now excluded from the search. The target state also only requires that the spacecraft piece be in the exit position. Both of these constraints were captured as functions passed to the solver. This worked well for a Starter puzzle I tested, but failed on a Wizard puzzle, because it didn’t capture piece interference that only occurs while pieces are moving.
Model the blockers that prevent certain tile moves. In particular, the nose of the spacecraft doesn’t interfere with the protruding corner blocker on one asteroid tile in static arrangements, but there are scenarios where the tiles can’t be moved past one another. To model this, I increased the piece grid to 5×5 (now 9 interior positions) to capture the previously ignored spacecraft nose, and when assessing a move, I shifted the tile by one grid cell at a time so that each move assessed interference at the start and end positions and at two intermediate positions.
Solve the board while respecting blocked arrangements and blocked moves. Update the neighbour search to exclude moves to neighbours that would result in blocker interference during the move. Update the target state to avoid any interference in sliding the spacecraft off the board.
Of course, the solver didn’t go off without a hitch when I tried it in the real world. I failed to specify a clear exit was required at first, and failed to specify it properly the second time, but third time was a charm (as above).
From a cursory inspection, the Wizard level 60 solution in the notebook and animated above looks the same as the solution provided with the puzzle (so I didn’t really need a solver at all!) However, this solver can also find multiple other (longer) solutions to each challenge. It could also be used to generate and solve more variants at each level of difficulty, extending the catalogue of puzzles. Improvements can wait, however, as it’s time to enjoy the bike or the beach once more.
The metaverse is a hot topic at the moment, though the concept has a long history. Twenty years ago, in the dotcom era, I was exploring this space, as I was recently reminded. Feeling nostalgic, I dug these projects out of the NAS archives. Tech has moved on, but there’s enduring relevance in what I learned.
Virtual Online Orienteering, to the max! Conceived as similar to Catching Features, but online by default: you could play in the browser, in a virtual world authored in VRML97.
The UI consisted of two browser windows: a first person view and motion controls (using the now defunct Cosmo Player), and a map in a second window.
The draggable compass needle, the checkpoints and the course logic (must visit checkpoints in order), and a widget that visualised your completed route as an electric blue string hovering 3 feet above the ground were all modular VRML Protos.
The map and terrain for the only level ever created were generated together with a custom C++ application. I was pretty pleased this all worked, and it demonstrated some concepts for…
4DUniverse was a broad concept for virtual online worlds for socialising, shopping, gaming, etc, similar at the time to ActiveWorlds (which still exists today!), but again accessible through the browser (assuming you had a VRML plugin).
I thought I’d have great screengrabs to illustrate this part of the story, but I was surprised how few I’d captured, that they were very low resolution, and that they were in archaic formats. The source artefacts from this post – WRL, HTML, JS, Java, etc – have lost no fidelity, but are supported by modern standards and interpreters only to varying degrees. Maybe I will modernise them someday and generate new images to do justice to the splendour I held in my mind!
We authored a number of worlds, connected by teleports, with the tools we had to hand, being text editors, spreadsheets, and custom scripts. While a lot of fun, we came to the conclusion that doing the things we envisaged in the 4DUniverse wasn’t any more compelling than doing them in the 2D interfaces of the time. VRML eventually went away, probably because no one else could make a compelling case for its use either. At least I crafted a neat animated GIF logo rendered with POV-Ray.
Less multiverse, and more quantum realm, I also generated VRML content at nanoscale with…
NanoCAD was a neat little (pun intended) CAD application for designing molecules, which I extended with a richer editing UI, supporting the design of much more complex hypothesised molecular mechanisms.
The Java app allowed users to place atoms in 3D and connect them with covalent bonds. Then an energy solver would attempt to find a stable configuration for the molecules (using classical rather than quantum methods). With expressive selection, duplication and transformation mechanics, it was possible to create benzene rings, stitch them into graphene sheets, and roll them up into “enormous” buckytubes, or other complex carbon creations.
I also created cables housed inside sheaths, gears – built with benzene ring teeth attached to buckytubes – and other micro devices. If 4DUniverse was inspired by Snow Crash, NanoCAD was inspired by The Diamond Age. Nanocad could run in a browser as an applet and the molecules could also be exported as WRL files for display in other viewers.
Comparing contemporary professional projects
It’s nice to contrast the impermanence of these personal projects with the durability of my contemporary professional work with ANCA Machines. At the time, I was documenting the maths and code of Cimulator3D and also developing the maths and bidirectional UI design for iFlute, both used in the design and manufacturing of machine tools via grinding processes. Both products are still on the market in similar form today, more than two decades later.
I wonder how I’ll view this post in another twenty years?
In the post Sketching Semantle Solvers, I introduced two methods for solving Semantle word puzzles, but I only wrote up one. The second solver here is based on the idea that the target word should appear in the intersection of the cohorts of possible targets generated by each guess. The solver requires:
A vocabulary, containing all the words that can be guessed,
A semantic model, from which the agent can calculate the similarity of word pairs,
The ability to generate cohorts of words from the vocabulary that are similar (in Semantle score) to a provided word (a guess), and
An evolving strength of belief that each word in the vocabulary is the target.
In each step towards guessing the target, the solver does the following:
Choose a word for the guess. The current choice is the word with the strongest likelihood of being the target, but it could equally be any other word from the solver’s vocabulary (which might help triangulate better), or it could be provided by a human player with their own suspicions.
Score the guess. The Semantle simulator scores the guess.
Generate a cohort. The guess and the score are used to generate a new cohort of words that would share the same score with the guess.
Merge the cohort into the agent’s belief model. The score is added to the current belief strength for each word in the cohort, providing a proxy for likelihood for each word. The guess is also masked from further consideration.
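Those four steps can be sketched as a single update on a belief dictionary; the scoring and cohort functions are stand-ins for the Semantle simulator and the semantic model:

```python
def cohort_solver_step(belief, score_guess, generate_cohort):
    """One solver step. `belief` maps each vocabulary word to its
    current belief strength; guessed words are masked with -inf."""
    # 1. Choose the word with the strongest belief as the next guess
    guess = max(belief, key=belief.get)
    # 2. Score the guess (in play, the Semantle simulator does this)
    score = score_guess(guess)
    # 3. Generate the cohort of words that would share that score
    cohort = generate_cohort(guess, score)
    # 4. Merge: add the score to each cohort word's belief strength
    #    (a proxy for likelihood), then mask the guess
    for word in cohort:
        belief[word] += score   # masked words stay at -inf
    belief[guess] = float("-inf")
    return guess, score
```

Repeating this step, the target accumulates strength with every cohort it appears in, until it becomes the strongest belief and is guessed.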
Show of strength
The chart below shows how the belief strength (estimated likelihood) of the target word gradually approaches the maximum belief strength of any word, as the target (which remains unknown until the end) appears in more and more cohorts.
We can also visualise the belief strength across the whole vocabulary at each guess, and the path the target word takes in relation to these distributions, in terms of its absolute score and its rank relative to other words.
The cohort solver can be (de)tuned to almost any level of performance by adjusting the parameters precision and recall, which determine the tightness of the similarity band and completeness of results from the generated cohorts. The gradient descent solver has potential for tuning parameters, but I didn’t explore this much. To compare the two, we’d therefore need to consider configurations of each solver. For now, I’m pleased that the two distinct sketches solve to my satisfaction!
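A sketch of how precision and recall might parameterise cohort generation (the vocabulary and similarity function here are toys): precision sets the width of the similarity band around the observed score, and recall the fraction of the band that is kept:

```python
import random

def generate_cohort(vocabulary, similarity, guess, score,
                    precision=0.05, recall=1.0, seed=0):
    """Return words whose similarity to the guess lies within
    ±precision of the observed score, keeping only a `recall`
    fraction of them (sampled without replacement)."""
    band = [word for word in vocabulary
            if abs(similarity(word, guess) - score) <= precision]
    if not band:
        return []
    keep = max(1, round(recall * len(band)))
    return random.Random(seed).sample(band, keep)
```

Widening the band or lowering the recall weakens the evidence each guess contributes, which is what lets the solver be detuned to an arbitrary level of performance.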