Rebooting AI Review

I was excited to read Rebooting AI (website), hoping to find inspiration and tools for doing things better. Here is the book in one great quote:

For now, we are in a kind of interregnum: narrow but networked intelligences with autonomy, but too little genuine intelligence to be able to reason about the consequences of that power.

There is a lot to like. Marcus & Davis clearly map out the history and landscape of AI challenges, plus plausible elements of future solutions. They provide useful tools for thinking about problems with partial solutions to intelligence, such as the “fundamental overattribution error” and the “illusory progress gap”. They show how current ML solutions based on big data can be opaque and brittle. They demonstrate how key attributes of human intelligence instead allow the development of rich cognitive models – of how language and the real world work – and how solutions incorporating such models would address current shortcomings, enabling AI to tackle open-ended tasks. This is great material for a general reader.

Where I felt the book fell short was that it didn’t build many bridges between our current “narrow but networked intelligences” and the authors’ posited future-state capabilities. The future state reads like Artificial General Intelligence (AGI) by another name, fleshed out by scenarios that are short on implementation detail. Though sometimes mundane from our current perspective, Arthur C. Clarke might describe them as “indistinguishable from magic”, and hence Rodney Brooks would say they are “no longer falsifiable”. We know there’s a massive chasm between current ML solutions and AGI, but I didn’t find much in Rebooting AI to close or bridge it.

Some of these future capabilities are illustrated with domain-specific modelling techniques – like formal logic – that would be familiar to many computer science students. But I found this a little incongruous, because these techniques have also failed to deliver on promises of realising intelligence, and have done no more to squash the “long tail of edge cases” than other narrow intelligences. Given the diverse facets of intelligence, maybe the paradigm of “narrow but networked intelligences” is the best way to achieve or approximate intelligence, or maybe it’s ultimately illusory progress; these illustrations didn’t help me resolve that.

There is undeniable value in the current generation of ML solutions. How do we build on these? A detailed analysis of key avenues for short- to medium-term progress was lacking. For instance, starting with current ML solutions, the authors could have explored:

  • various designs of hybrid human-machine decision-making systems that augment human abilities while remaining resilient to new scenarios that stump machines;
  • transfer learning, few-shot learning and sophisticated representation learning such as transformers, which have the potential to increase the representational and reasoning power of solutions (see the sketch after this list);
  • the role of ecosystem design and governance, including ongoing monitoring and data curation to correct issues (for instance bias testing, CD4ML, etc.).
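To make the transfer learning avenue concrete, here is a minimal sketch (my illustration, not the authors’): fine-tuning only a small task-specific head on top of a frozen pre-trained backbone, so a narrow task benefits from representations learned elsewhere. The three-class task and dummy batch are placeholder assumptions.

```python
# Minimal transfer-learning sketch: freeze a pre-trained backbone and
# train only a small task-specific head. Task and data are placeholders.
import torch
import torch.nn as nn
from torchvision import models

# Pre-trained backbone; its weights encode general visual "knowledge".
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in backbone.parameters():
    param.requires_grad = False  # freeze: transfer, don't re-learn

# Replace the final layer with a task-specific head (assume 3 classes).
backbone.fc = nn.Linear(backbone.fc.in_features, 3)

# Only the new head's parameters are optimised.
optimizer = torch.optim.Adam(backbone.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# One illustrative training step on a dummy batch (stand-in for real data).
x, y = torch.randn(8, 3, 224, 224), torch.randint(0, 3, (8,))
optimizer.zero_grad()
loss = loss_fn(backbone(x), y)
loss.backward()
optimizer.step()
```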

Instead, ML was stereotyped as fully automated, tabula rasa, and end-to-end.

Finally, to know things are getting better, we need the right baseline and measures. While the language examples clearly demonstrated superficial artificial understanding, and self-driving vehicles still have a way to go, some issues raised were not assessed against incumbent human capabilities on narrow tasks in a like-for-like comparison, but rather against the posited capabilities of a future AGI system. I would agree that humans can individually reflect and introspect to recognise their mistakes, but it is still the case that, in operational scenarios, humans make mistakes just as artificial systems do. These operational mistakes are moderated by the wider ecosystem in which humans operate, in the same way that machine predictions are moderated by a wider human-machine ecosystem. In some instances, I felt the core issue was structurally unsound or concentrated decision-making without proper governance, rather than whether or not mistakes were made, and this confounded the analysis. I would have liked to see these factors teased out so that comparisons could be made in a way that helps measure progress.

Marcus & Davis do lay out a helpful framework for building trust in AI systems, including stress testing, understanding the costs of failures, building in modularity and maintainability, etc. This is good guidance, but it would be really helpful to see more detail or case studies under these headlines, with the specificity of other works like Weapons of Math Destruction and Made by Humans.

So, maybe I was hoping for “Refactoring AI” rather than “Rebooting AI”. The book clearly describes the problems with the current state, and the desirable characteristics of the future state. On balance, the technical arguments may indeed be sounder than my concerns. If you’re curious, I would encourage you to read it and draw your own conclusions. Ultimately, however, I’m disappointed because I didn’t leave inspired and equipped with new insight and new tools for improving AI today, tomorrow, and the day after.

No Smooth Path to Good Design

The path to good design is bumpy, as we will demonstrate with four teapots. (Yes, teapots. Teapots are a staple of computer science and philosophy.)

The path to good design matters, because if you are trying to build a design capability, the journey will be smoother if you understand that the path is bumpy.

Leaders who appreciate the bumpy path can facilitate far greater value creation and support a more engaged group of workers.

What is design?

Design is an activity, but also a result: the specification for a product (or service), which determines how it is made or delivered.

Performance is a measure of how a product actually functions, for a given task in a given context. Performance in the broadest sense includes emotional responses, static and dynamic physical characteristics, service characteristics, etc. For simplicity, let’s measure performance in monetary terms, e.g. lifetime economic value.

Design is important as an activity and a result, because it is the prime determinant of performance that is within your control.

The smooth path

Teapot by Norman [1]
Consider the distinctive teapot from the cover of Don Norman’s The Design of Everyday Things, where the handle – instead of opposing – is aligned with the spout.

We know a thing or two about teapots, so we assume this design has very poor performance!

However, we also assume that a traditional design with handle opposed to the spout produces the best performance.

We can plot our smooth model of how performance varies as a function of the angle between spout and handle.

Performance of teapot design variants

And it’s pretty clear how to find the best design. The more opposing the handle and spout, the better the performance, the more value created, and hence the better the design.

The first bump in the path

Yokode kyusu [2]
However, this model is broken. We can’t interpolate smoothly (linearly) between design points, as demonstrated by the Japanese yokode kyusu, which features a handle at right angles to the spout, designed to extract every last drop of tea.

With this new insight, and a further assumption that handles in between the points we’ve plotted (e.g. at 45 degrees) are much worse due to the awkward twisting motion when pouring, we can draw a new model, which is already much less smooth.

Teapot performance with new information

What’s interesting about this landscape is that most design variants perform pretty poorly, and you must be close to a good design to find it. If you didn’t have the insight into teapot performance that we have assumed – if you had only tested performance at the awkward angles, and you had assumed smooth behaviour in between – you would likely miss the best designs and leave significant value on the table. (Note that the vertical scale of this diagram would need to be greatly exaggerated to show the true size of the value creation opportunities.)
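For the curious, here is a toy sketch of the two models above in Python – all performance numbers are my assumptions, purely for illustration:

```python
# Toy sketch of the smooth vs bumpy teapot performance models.
# All performance values are assumed, purely for illustration.
import numpy as np
import matplotlib.pyplot as plt

angles = np.array([0, 45, 90, 135, 180])        # handle-to-spout angle (degrees)
smooth = np.array([0.1, 0.3, 0.5, 0.7, 0.9])    # naive linear assumption
bumpy = np.array([0.1, 0.05, 0.8, 0.05, 0.9])   # yokode kyusu peak at 90 degrees

plt.plot(angles, smooth, "--", label="smooth (assumed) model")
plt.plot(angles, bumpy, "o-", label="bumpy model (with new information)")
plt.xlabel("angle between handle and spout (degrees)")
plt.ylabel("performance")
plt.legend()
plt.show()
```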

Value created by exploration

So, this is the first lesson of the bumpy path to good design. We need to explore the performance of multiple design variants, and understand that small changes in design can have enormous impacts on performance, to be confident we are approaching our potential to create value.

Teapot with handle on top [3]
So far, we have only explored the impact of one design variable, but for any product there are effectively infinitely many design variables (if we can just conceive of them). For instance, the handle of a teapot could also be on top, and we could further consider its shape, material, fixtures, etc. Then we could move beyond the handle to the design of the rest of the teapot!

Now consider the design and delivery of digital products and services. Constraints do exist, but infinite design variants still exist within those constraints. Further, like the rolled up dimensions of string theory, there are extra dimensions of design that are easy to miss, but once discovered can be expanded and explored to create ever more value.

The first lesson

How do leaders get this wrong? By failing to encourage the exploration of a sufficient number of design variants, and by failing to encourage the exploration of minor changes that have outsize impact.

As a leader, you must be prepared to carve out time and space, embrace uncertainty and ambiguity, and bring creativity, compassion and patience to the exploration process. As important as this is to creating value, it is also key to maintaining the engagement of teams involved in or interacting with design.

I’m often told that exploration feels inefficient. Or, rather, felt inefficient. The distinction is important. Hindsight bias distorts the reality that, before starting an exploration of a sufficiently bumpy landscape, we simply cannot know what we will find. So how do we measure the efficiency of exploration? Certainly not by how quickly we arrive at a design, or by how many designs are discarded. Should we even measure the efficiency of exploration? That is a better question. We should focus on net value creation, and do enough exploration to mitigate the risk that we are leaving significant value on the table.
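As a toy illustration (the landscape below is invented): when a bumpy landscape has a narrow peak, a handful of samples will almost certainly miss it, while sufficient exploration makes finding it likely. The point is the shape of the returns, not the numbers.

```python
# Toy illustration: each extra design explored raises the best value
# found, with diminishing returns. The landscape is purely invented.
import random

random.seed(1)

def performance(design: float) -> float:
    """Assumed bumpy landscape: mostly poor, with one rare sharp peak."""
    return 10.0 if abs(design - 0.5) < 0.01 else random.uniform(0.0, 1.0)

for n in (1, 10, 100, 1000):
    best = max(performance(random.random()) for _ in range(n))
    print(f"designs explored: {n:5d}  best value found: {best:.2f}")
```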

This design sensibility, however, may not be apparent to the whole team. Designers will be frustrated if managed to a smooth path, while others who perceive the challenge to be simple may become frustrated when the bumpiness is allowed to surface. The team’s various activities may have different cadences that sometimes align, and sometimes don’t. This can create friction and dissatisfaction in teams. Some functional conflict is healthy in this regard, but as a leader, you must support and enable a team to focus on what it takes to create value.

The second bump in the path

I have used the word “assume” liberally and deliberately above. I have assumed a large number of things about the tasks that users of the teapots are seeking to achieve, and the broader contexts of use. I have further assumed that my readers share a traditional western notion of teapots and their use. I have done this to keep the explanation of the first bump simple – I hope.

But “assume” is at the root of the second bump. During product development, we can’t assume performance; we must test designs with users engaged in a task in a context. We may take shortcuts by prototyping, simulating, etc., but we must test as objectively as possible to get a meaningful prediction of a product’s performance and its potential to create value.

In a bumpy design landscape, poor predictions of actual performance carry significant opportunity cost.

Value created by testing

(Note also that during the development of a typical digital product/service, we are typically iteratively discovering the task and the context in parallel.)

We assumed, with our teapots above, that a spout aligned with the handle would lead to poor performance, but we didn’t test it (with a minor tweak in a hidden dimension). If we had tested this traditional oriental design (as UX designer Mike Eng did), we would have discovered that, for the task of serving oneself in a solitary context, the aligned handle actually produces superior performance.

Aligned handle teapot [4]
I was surprised to find this teapot design existed when I stumbled upon the post above. I suspect this teapot design has a specific name or an interesting story behind it, but I haven’t been able to track it down. However, it serves as an excellent demonstration that the best design paths are bumpy.
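To make the opportunity cost of untested assumptions concrete, here is a toy comparison in Python (all numbers invented): choosing a design by assumed performance forgoes the value the tested best design would have delivered.

```python
# Toy sketch: choosing a design by assumed vs tested performance.
# All performance numbers are invented for illustration.
predicted = {"opposed": 0.9, "right-angle": 0.4, "aligned": 0.1}  # our assumptions
tested = {"opposed": 0.9, "right-angle": 0.8, "aligned": 0.95}    # measured with users

pick_by_assumption = max(predicted, key=predicted.get)
pick_by_testing = max(tested, key=tested.get)

opportunity_cost = tested[pick_by_testing] - tested[pick_by_assumption]
print(f"assumed best: {pick_by_assumption}, tested best: {pick_by_testing}")
print(f"opportunity cost of not testing: {opportunity_cost:.2f}")
```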

The second lesson

The second lesson is that assumptions about performance, task and context hide the inherent bumpiness in design. As a leader, you must recognise and challenge assumptions, encourage the testing of designs under the correct conditions, and appreciate that our understanding of task and context may evolve with testing.

There are many resources that discuss lightweight and effective approaches to UX research and testing; you could do worse than to start here.

Conclusion

We have discussed two major value creation activities in design:

  • Exploration and consequent discovery of performant designs
  • Testing and consequent selection of more performant designs

But these activities are overlooked or de-prioritised with a smooth mindset. While there is uncertainty, ambiguity and friction along the path, and sometimes progress is difficult to discern, as a leader, you must embrace the bumps because – if you are in the business of creating value – there is no smooth path to good design.

Image credits
  1. http://www.amazon.co.uk/Design-Everyday-Things-Donald-Norman/dp/0262640376
  2. http://commons.wikimedia.org/wiki/File:JapaneseTeapot.jpg
  3. http://www.wilkinpottery.com/product/teapot-top-handle/
  4. Ebay listing from seller http://www.ebay.com/usr/mitch8670

Jetty to Jetty app

I released an app 🙂 – for iOS and Android.

It’s a self-guided audio tour of historic sites in Broome, Western Australia, including beautiful stories told by locals. Nyamba Buru Yawuru developed the concept, curated the media, engaged local stakeholders, and were product owners for the app.

Jetty to Jetty screenshots

This work was exciting for its value to the Broome and Yawuru community, but also because it was an opportunity to innovate under the constraint of building the simplest thing possible. The simplest thing possible was in stark contrast to the technical whizbangery (though lean delivery) of my previous app project – Fireballs in the Sky.

I had fun working on the interaction and visual design challenges under the constraints, and I think the key successes were:

  • Simplifying the presentation of real-world and in-app navigation as a hand-rolled map (drawn in Inkscape) that shows all the sites and scrolls in a single direction.
  • Hiding everything unnecessary during playback of stories, to allow the user to focus on the place and the story.
  • Consistent playback control behaviour across sites and the main map.
  • Not succumbing to the temptation to add geo-location, background audio, or anything else that could have added to the complexity!

My colleague Nathan Jones laid the technical foundations – Phonegap/Cordova wrapping a static site built with Middleman, using CoffeeScript, knockout.js, HAML, Sass, and an HTML5/Cordova plugin for media. He later extended and open-sourced this framework (as Jila) for the Yawuru Ngan-ga language app. Most of the development work by Nathan and me was done in early 2014.

While intended to be used in Broome (and yet another reason to visit Broome), the app and its beautiful stories can be enjoyed anywhere.

Leave Product Development to the Dummies

This is the talk I gave at Agile Australia 2013 about the role of simulation in product development. Check out a PDF of the slides with brief notes.

Description

"Dummies" talk at Agile Australia

Stop testing on humans! Auto manufacturers have greatly reduced the harm once caused by inadvertently crash-testing production cars with real people. Now, simulation ensures every new car endures thousands of virtual crashes before even a dummy sets foot inside. Can we do the same for software product delivery?

Simulation can deliver faster feedback than real-world trials, for less cost. Simulation supports agility, improves quality and shortens development cycles. Designers and manufacturers of physical products found this out a long time ago. By contrast, in Agile software development, we aim to ship small increments of real software to real people and use their feedback to guide product development. But what if that’s not possible? (And can we still benefit from simulation even when it is?)

The goal of trials remains the same: get a good product to market as quickly as possible (or pivot or kill a bad product as quickly as possible). However, if you have to wait for access to human subjects or real software, or if it’s too costly to scale to the breadth and depth of real-world trials required to optimise design and minimise risk, consider simulation.

Learn why simulation was chosen for the design of call centre services (and compare this with crash testing cars), how a simulator was developed, and what benefits the approach brought. You’ll leave equipped to decide whether simulation is appropriate for your next innovation project, and with some resources to get you started.
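To give a flavour of what such a simulator can look like, here is a minimal sketch (not the talk’s actual simulator; all parameters are assumptions): a single-agent call queue with random arrivals and handle times, estimating average caller wait without any real-world trial.

```python
# Minimal call-centre simulation sketch (illustrative only).
# One agent, random call arrivals and handle times; parameters assumed.
import random

random.seed(42)

ARRIVAL_RATE = 1 / 60   # calls per second (one per minute on average)
SERVICE_RATE = 1 / 50   # completions per second (50 s mean handle time)
N_CALLS = 10_000

clock, agent_free_at, waits = 0.0, 0.0, []
for _ in range(N_CALLS):
    clock += random.expovariate(ARRIVAL_RATE)   # next call arrives
    start = max(clock, agent_free_at)           # queue if the agent is busy
    waits.append(start - clock)
    agent_free_at = start + random.expovariate(SERVICE_RATE)

print(f"mean wait: {sum(waits) / len(waits):.0f} s over {N_CALLS} simulated calls")
```

Even a sketch like this lets you compare staffing or routing variants in seconds, long before trialling anything with real callers.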

Discover:

  • How and when to use simulation to improve agility
  • The anatomy of a simulator
  • A lean, risk-based approach to developing and validating a simulator
  • Techniques for effectively visualising and communicating simulations
  • Implementing simulated designs in the real world