LEGO and Software – Variety and Specialisation

Since my first post on LEGO as a Metaphor for Software Reuse, I have done some more homework on existing analyses of LEGO® products, to understand what I could myself reuse and what gaps I could fill with further data analysis.

I’ve found three fascinating analyses that I share below. However, I should note that these analyses weren’t performed considering LEGO products as a metaphor or benchmark for software reuse. So I’ve continued to ask myself: what lessons can we take away for better management of software delivery? For this post, the key takeaways are market and product drivers of variety, specialisation and complexity, rather than strategies for reuse as such. I’m hoping to share more insight on reuse in LEGO in future posts, in the context of these market and product drivers.

I also discovered the Rebrickable downloads and API, which I plan to use for any further analysis – I do hope I need to play with more data!

Reuse Concepts

I started all this thinking about software reuse, which is not an aim in itself, but a consideration and sometimes an outcome in efficiently satisfying software product objectives. As we think about reuse and consider existing analyses, I found it helpful to define a few related concepts:

  • Variety – the number of different forms or properties an entity under consideration might take. We might talk about variety of themes, sets, parts, and colours, etc.
  • Specialisation – of parts in particular, where parts serve only limited purposes.
  • Complexity – the combinations or interactions of entities, generally increasing with increasing variety and specialisation.
  • Sharing – of parts between sets in particular, where parts appear in multiple sets. We might infer specialisation from limited sharing.
  • Reuse – sharing, with further consideration of time, as some reuse scenarios may be identified when a part is introduced, some may emerge over time, and some opportunities for future reuse may not be realised.

Considering these concepts, the first two analyses focus mainly on understanding variety and specialisation, while the third dives deeper into sharing and reuse.
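To make the sharing concept concrete, here is a minimal sketch that counts how many sets each part appears in and flags parts found in only one set as candidates for "specialised". The inventories, part names and function names are all hypothetical, made up for illustration rather than taken from any of the analyses discussed here.

```python
from collections import Counter

# Hypothetical inventories (set name -> part identifiers), not real LEGO data.
inventories = {
    "castle": {"brick_2x4", "plate_1x2", "minifig_head_knight"},
    "spaceship": {"brick_2x4", "plate_1x2", "cockpit_canopy"},
    "house": {"brick_2x4", "roof_tile"},
}


def sets_per_part(inventories):
    """Count the number of sets in which each part appears (sharing)."""
    return Counter(part for parts in inventories.values() for part in parts)


def specialised_parts(inventories):
    """Parts that appear in exactly one set, a rough proxy for specialisation."""
    counts = sets_per_part(inventories)
    return sorted(part for part, n in counts.items() if n == 1)
```

Reuse would need a time dimension on top of this, for example the year each set was released, which this sketch leaves out.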

Increase in Variety and Specialisation

The Colorful Lego

Visualisation of colours in use in LEGO sets over time
Analysis of LEGO colours in use over time. Source: The Colorful Lego Project

Great visualisations and analysis in this report and public dashboard from Edra Stafaj, Hyerim Hwang and Yiren Wang, driven primarily by the evolving colours found in LEGO sets over time, and considering colour as a proxy for complexity. Some of the key findings:

  • The variety of colours has increased dramatically over time, with many recently introduced colours already discontinued.
  • The increase in variety of colours is connected with growth of new themes. Since 2010, there has been a marked increase in co-branded sets (“cooperative” theme, eg, Star Wars) and new in-house branded sets (“LEGO commercial” theme, eg, Ninjago) as a proportion of all sets.
  • Specialised pieces (as modelled by Minifig Heads – also noted as the differentiating part between themes) make up the bulk of new pieces, compared to new generic pieces (as modelled by Bricks & Plates).

Colour is an interesting dimension to consider, as it may be argued to be an aesthetic, rather than mechanical, consideration for reuse. However, as noted in the diversification of themes, creating and satisfying a wider array of customer segments is connected to the increasing variety of colour.

So I see variety and complexity increasing, and more specialisation over time. The discontinuation of colours suggests reuse may be reducing over time, even while generic bricks & plates persist.

67 Years of Lego Sets

Visualisation of the LEGO, in the LEGO, for the LEGO people. Source 67 Years of Lego Sets

An engaging summary from Joel Carron of the evolution of LEGO sets over the years, including Python notebook code, and complete with a final visualisation made of LEGO bricks! Some highlights:

  • The number of parts in a set has in general increased over time.
  • The smaller sets have remained a similar size over time, but the bigger sets keep getting bigger.
  • As above, colours are diversifying, with minor colours accounting for more pieces, and themes developing distinct colour palettes.
  • Parts and sets can be mapped in a graph or network showing the degree to which parts are shared between sets in different themes. This shows some themes share a lot of parts with other themes, while some themes have a greater proportion of unique parts. Generally, smaller themes (with fewer total parts) share more than larger themes (with more total parts).

So here we add to variety and specialisation with learning about sharing too, but without the chronological view that would help us understand more about reuse – were sets with high degrees of part sharing developed concurrently or sequentially?

Reduction in Sharing and Reuse

LEGO products have become more complex

A comprehensive paper, with dataset and R scripts, analysing increasing complexity in LEGO products, with a range of other interesting-looking references to follow up on, though it acknowledges that scientific investigations on the development of LEGO products remain scarce.

This needs a thorough review in its own post, with further analysis and commentary on the implications for software reuse and management. That will be the third post of this trilogy in N parts.

Lessons for Software Reuse

If we are considering LEGO products as a metaphor and benchmark for software reuse, we should consider the following.

Varied market needs drive variety and specialisation of products, which in turn can be expected to drive variety and specialisation of software components. Reuse of components here may be counter-productive from a product-market fit perspective (alone, without further technical considerations). However, endless customisation is also problematic and a well-designed product portfolio will allow efficient service of the market.

Premium products may also be more complex, with more specialised components. Simpler products, with lesser performance requirements, may share more components. The introduction of more premium products over time may be a major driver of increased variety and specialisation.

These market and product drivers provide context for reuse of software components.

LEGO® is a trademark of the LEGO Group of companies which does not sponsor, authorize or endorse this site.

LEGO as a Metaphor for Software Reuse – Does the Data Stack Up?

LEGO® products are often cited as a metaphor for software reuse; individual parts being composable in myriad ways.

I think this is a bit simplistic and may miss the point for software, but let’s assume we should aim to make software in components that are as reusable as LEGO parts. With that assumption, what level of reuse do we actually observe in LEGO sets over time?

I’d love to know if anyone else has done this analysis publicly, but a quick search didn’t reveal much. So, here’s a first pass from me. I discuss further analysis below.

The data comes from the awesome catalogue at Bricklink. I start with the basic catalogue view by year.

More New Parts Every Year

My first observation is that there are an ever-increasing number of new parts released every year.

Chart of Lego parts over time, showing 10x increase in parts from late 1980s to early 2020s

This trend in new parts has been exponential until just the last few years. The result is that there are currently 10 times the number of parts as when I was a kid – that’s why the chart is on a log scale! These new parts are reusable too.

Therefore, the existence of reusable parts doesn’t preclude the creation of new reusable parts. Nor should we expect the creation of new parts to reduce over time, even if existing parts are reusable. If LEGO products are our benchmark for software reuse, we should expect to be introducing new components all the time.

More New Parts in New Products

My second observation is that the new parts (by count) in new products have generally increased year by year.

The increase has been most pronounced from about 2000 to 2017, rising from about two to over five new parts per new set on average. The increase is observed because – even though new sets over time are increasing (below) – new parts are increasing faster (as above).

Chart showing new Lego sets over time

Therefore, the existence of an ever-increasing number of reusable parts doesn’t preclude an increase in the count of new parts in new sets. If LEGO products are our benchmark for software reuse, we should expect to continue introducing new components in new products, possibly at an increasing rate.

This is only part of the picture, though. I have been careful to specify count of new parts in new sets only, as that data is easy to come by (from the basic Bricklink catalogue view by year). The average count of new parts in new sets is simply the number of new parts each year divided by the number of new sets each year.
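The calculation is a simple per-year division, sketched below. The yearly counts here are hypothetical placeholders chosen for illustration, not figures from the Bricklink catalogue, and the function name is my own.

```python
# Hypothetical yearly counts, for illustration only (not Bricklink figures).
new_parts_by_year = {2000: 400, 2010: 900, 2017: 2000}
new_sets_by_year = {2000: 200, 2010: 300, 2017: 400}


def new_parts_per_new_set(new_parts, new_sets):
    """Average new parts per new set: new parts divided by new sets, per year."""
    return {
        year: new_parts[year] / new_sets[year]
        for year in sorted(new_parts)
        if new_sets.get(year)  # skip years with no (or zero) new sets
    }


averages = new_parts_per_new_set(new_parts_by_year, new_sets_by_year)
```

Note that this average says nothing about how the new parts are distributed across sets; a few sets could account for most of the new parts in a year.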

We also need to understand how existing parts are reused in new sets. For instance, is the total part count increasing in new products and are new parts therefore a smaller component of new products? Which existing parts see the most reuse? We might also want to consider the varied nature and function of new and existing parts in new products, and the interaction between products. These deeper analyses could be the focus of future posts.

Provisional Conclusions

As a LEGO product aficionado, I’ve observed an explosion of new parts and sets since I was a kid, but I was surprised by the size of the increase revealed by the data.

There is more work to do on this data, and LEGO products may be a flawed metaphor for software engineering. We may make stronger claims about software reuse with other lines of reasoning and evidence. However, if LEGO products are to be used as a benchmark, the data dispels some misconceptions, and suggests that for functional software product development:

  • Reusable components don’t eliminate the need for new components
  • The introduction of new components may increase over time, even in the presence of reusable components
  • New products may contain more new components than existing products, even in the presence of reusable components


More Sankey for Less Confusion?

Confusion Matrixes are essential for evaluating classifiers, but for some who are new to them, they can cause, well, confusion.

Sankey Diagrams are an alternative way of representing matrix data, and I’ve found some people – who are new to matrix data, like business domain experts who are not experienced data scientists – find them easier to understand. Also, some machine learning researchers find Sankey diagrams useful for analysing data and classifiers.

So, I have posted simple code for visualising classifier evaluation or comparisons as Sankey diagrams. Maybe it will be useful for others, as well as fun for me.

The code combines large portions of Plotly Sankey Diagrams with essence of scikit-learn confusion matrix and lashings of list comprehension code golf.

The scenarios supported are:

  1. Evaluating a binary classifier against ground truth or as champion-challenger,
  2. Evaluating a multi-class classifier against ground truth or as champion-challenger,
  3. Comparing multiple stages of a decision process, or multiple versions of a binary classifier, for instance over time, or hyper-parameter sweeps, and
  4. Comparing multiple versions of a multi-class classifier.
Example confusion matrixes as Sankey diagrams

See the code on Github.
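As a rough illustration of the idea (this is not the posted code, and the function and variable names are my own), a confusion matrix can be flattened into the node labels and source/target/value link lists that a Sankey diagram is built from. The sketch uses only the standard library; the resulting lists are in the shape that Plotly's Sankey trace accepts for its node labels and links.

```python
from collections import Counter


def sankey_links(y_true, y_pred, labels):
    """Turn actual/predicted label pairs into Sankey nodes and links.

    Nodes 0..k-1 are the actual classes and nodes k..2k-1 the predicted
    classes; each non-empty confusion matrix cell becomes one link.
    """
    index = {label: i for i, label in enumerate(labels)}
    counts = Counter(zip(y_true, y_pred))  # the confusion matrix as a dict
    node_labels = [f"actual {l}" for l in labels] + [f"predicted {l}" for l in labels]
    source, target, value = [], [], []
    for actual in labels:
        for predicted in labels:
            n = counts[(actual, predicted)]
            if n:  # omit empty cells so no zero-width links are drawn
                source.append(index[actual])
                target.append(len(labels) + index[predicted])
                value.append(n)
    return node_labels, source, target, value


# Made-up example data.
nodes, source, target, value = sankey_links(
    ["cat", "cat", "dog", "dog", "dog"],
    ["cat", "dog", "dog", "dog", "cat"],
    labels=["cat", "dog"],
)
```

For a champion-challenger comparison, the same function would apply with the champion's predictions in place of the ground truth.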

Maths Whimsy

Time to make a home for those occasional mathematical coding curios. I’ve kicked off with an analysis, using various Numpy approaches, of the gravity field around a square (or cubic) planet, inspired by a project my children were working on.

If you’ve ever wondered, this is what gravity looks like on the surface of a square planet (20 length units long, arbitrary gravitational units) …

… even though the surface would appear visually flat, it would only feel level in the centre of the face. Near a corner, you would feel like you were standing on a 45 degree slope, and because the surface would be visually flat, it would look like you could slide off the far end of it – weird and cool.
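The effect can be reproduced with a crude brute-force sum, sketched here in plain Python rather than the Numpy approaches used in the analysis itself. The sketch assumes a cubic planet of side 20 split into a coarse grid of equal point masses, inverse-square attraction in arbitrary units, and measures the apparent slope along the centre line of the top face; the function names and grid resolution are my own choices.

```python
import math


def gravity_at(px, py, pz, side=20.0, cells=10):
    """Sum inverse-square pulls from a cube split into cells**3 point masses.

    The cube is centred on the origin; the result is in arbitrary units.
    """
    h = side / cells
    centres = [-side / 2 + (n + 0.5) * h for n in range(cells)]
    gx = gy = gz = 0.0
    for x in centres:
        for y in centres:
            for z in centres:
                dx, dy, dz = x - px, y - py, z - pz
                r2 = dx * dx + dy * dy + dz * dz
                inv_r3 = 1.0 / (r2 * math.sqrt(r2))
                gx += dx * inv_r3
                gy += dy * inv_r3
                gz += dz * inv_r3
    return gx, gy, gz


def apparent_slope_deg(px, side=20.0):
    """Angle between gravity and 'straight down' at (px, 0) on the top face."""
    gx, gy, gz = gravity_at(px, 0.0, side / 2, side)
    return math.degrees(math.atan2(math.hypot(gx, gy), -gz))
```

By symmetry, `apparent_slope_deg(0.0)` is zero at the centre of the face, while `apparent_slope_deg(10.0)` at the edge comes out at 45 degrees, matching the description above.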

I imagine I’ll add to this over time. The bulk of learning to code for me through high school involved mathematical simulations of all kinds: motion of planets under gravity, double pendulums, Mandelbrot sets, L-systems, 3D projections, etc, etc. All that BASIC (and some C) code is lost now, but I’ll keep my eye out for more interesting problems and compile them here.

See also ThoughtWorks “Shokunin” coding problems of a mathematical nature that have piqued my interest over time:

Breadth first search and simulated annealing solvers for a task allocation problem

Scaling Change

Once upon a time, scaling production may have been enough to be competitive. Now, the most competitive organisations scale change to continually improve customer experience. How can we use what we’ve learned scaling production to scale change?

Metaphors for scaling

I recently presented a talk titled “Scaling Change”. In the talk I explore the connections between scaling production, sustaining software development, and scaling change, using metaphors, maths and management heuristics. The same model of change applies from organisational, marketing, design and technology perspectives.  How can factories, home loans and nightclubs help us to think about and manage change at scale?

Read on with the spoiler post if you’d rather get right to the heart of the talk.

Scaling Change Spoiler

When software engineers think about scaling, they think in terms of the order of complexity, or “Big-O”, of a process or system. Whereas production is O(N) and can be scaled by shifting variable costs to fixed, I contend that change is O(N²) due to the interaction of each new change with all previous changes. We could visualise this as a triangular matrix heat map of the interaction cost of each pair of changes (where darker shading is higher cost).

Change interaction heatmap

The thing about change being O(N²) is that the old production management heuristics of shifting variable cost to fixed no longer work, because the dominant mode is interaction cost. Instead we use the following management heuristics:

Socialise

Socialising change

We take a variable cost hit for each change to help it play more nicely with every other change. This reduces the cost coefficient but not the number of interactions (N²).

Screen

Screening change

We only take in the most valuable changes. Screening half our changes (N/2) reduces change interactions by three quarters (N²/4).

Seclude

Secluding change

We arrange changes into separate spaces and prevent interaction between spaces. Using n spaces reduces the interactions to N²/n.

Surrender

Surrendering change

Like screening, but at the other end. We actively manage out changes to reduce interactions. Surrendering half our changes (N/2) reduces change interactions by three quarters (N²/4).
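The arithmetic behind these heuristics can be checked with a small sketch. It assumes every pair of changes interacts exactly once (the triangular matrix above), so N changes give N(N-1)/2 interactions, which grows as N². The function names and the even split across spaces are my own simplifications.

```python
def interactions(n_changes):
    """Pairwise interactions among n changes: n(n-1)/2, which is O(N^2)."""
    return n_changes * (n_changes - 1) // 2


def screened(n_changes, keep_fraction=0.5):
    """Interactions left after keeping only a fraction of the changes.

    The same arithmetic applies to surrendering existing changes.
    """
    return interactions(int(n_changes * keep_fraction))


def secluded(n_changes, spaces):
    """Interactions when changes are split as evenly as possible into
    isolated spaces, so that only changes in the same space interact."""
    per_space, remainder = divmod(n_changes, spaces)
    sizes = [per_space + 1] * remainder + [per_space] * (spaces - remainder)
    return sum(interactions(size) for size in sizes)


baseline = interactions(1000)
screened_ratio = screened(1000) / baseline
secluded_ratio = secluded(1000, spaces=4) / baseline
```

For 1000 changes, both ratios come out just under one quarter: screening half the changes leaves about N²/4 of the interactions, and secluding into four spaces leaves about N²/n with n = 4.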

Scenarios

Where do we see these approaches being used? Just some examples:

  • Start-ups screen or surrender changes and hence are more agile than incumbents because they have less history of change.
  • Product managers screen changes in design and seclude changes across a portfolio, for example the separate apps of Facebook/ Messenger/ Instagram/ Hyperlapse/ Layout/ Boomerang/ etc
  • To manage technical debt, good developers socialise via refactoring, better seclude through architecture, and the best surrender
  • In hiring, candidates are screened and socialised through rigorous recruitment and training processes
  • Brand architectures also seclude changes – Unilever’s Dove can campaign for real beauty while Axe/Lynx offends Dove’s targets (and many others).

See Also

Seeing Stars – Bespoke AR for Mobiles

I presented on the development of the awesome Fireballs in the Sky app (iOS and Android) at YOW! West with some great app developers. See the PDF. (NB. there were a lot of transitions)

Abstract

We’ll explore the development of the Fireballs in the Sky app, designed for citizen scientists to record sightings of meteorites (“fireballs”) in the night sky. We’ll introduce the maths for AR on a mobile device, using the various sensors, and we’ll throw in some celestial mechanics for good measure.

We’ll discuss the prototyping approach in Processing. We’ll describe the iOS implementation, including: libraries, performance tuning, and testing. We’ll then do the same for the Android implementation. Or maybe the other way around…

Augmented/Virtual Reality with Horizontal Coordinates in iOS and Android

Augmented reality star maps

So, you want your mobile or tablet to know where in the world you’re pointing it for a virtual reality or augmented reality application?

To draw 3D geometry on the screen in OpenGL, you can use the rotation matrixes returned by the respective APIs (iOS/Android). The APIs will also give you roll, pitch and yaw angles for the device.

What’s not easy to do through the APIs is to get three angles that tell you in general where the device is pointing – that is, the direction in which the rear camera is pointing. You might want this information to capture the location of something in the real world, or to draw a virtual or augmented view of a world on the screen of the phone. The Fireballs in the Sky app (iOS, Android) does both, allowing you to capture the start and end point of a “fireball” (meteor/ite) by pointing your phone at the sky, while drawing a HUD and stars on the phone screen during the capture process, so you’re confident you’ve got the right part of the sky.

Azimuth and elevation

Roll, pitch and yaw tell you how the device sees itself – they are rotations around lines that go through the device (device axes). But in this case we want to know how the device sees the world – we need rotations around lines fixed in the real world (world axes). To know where the device is pointing, we actually want azimuth, elevation and tilt, as shown.

Azimuth and elevation together are commonly known as a horizontal coordinate system.

Tilt angle

The azimuth, elevation pair of angles gives you enough information to define a direction, and hence capture objects in the real world (assuming the distance to the object does not need to be specified). However, if you want to draw something on the screen of your device, you need to know whether the device is held in landscape orientation, portrait orientation, or somewhere in-between; thus a third angle – tilt – is required.

Azimuth is defined as the compass angle of the direction the device is pointing. Elevation is the angle above horizontal of the direction the device is pointing. Tilt is the angle the device is rotated around the direction in which it is pointing (the direction defined by azimuth and elevation angles).

We can get azimuth, elevation and tilt with the following approach:

  1. Define a world reference frame
  2. Obtain the device’s rotation matrix with respect to this frame
  3. Calculate the azimuth, elevation and tilt angles from the rotation matrix

It will really help to be familiar with the mathematical concept of a vector (three numbers defining a point or direction in 3D space), and be able to convert between radians and degrees, from here on in. Sample code may be published in future.

Define a World Reference Frame

World reference frame

We’re somewhere in the world, defined by latitude, longitude and altitude. We’ll define a reference frame with its origin at this point. For convenience, we’d like Z to point straight up into the sky, and X to point to true north. Therefore, Y points west (for a right-handed frame), as shown here. We define unit vectors i, j, k in the principal directions (or axes) X, Y, Z, and we’ll use them later.

\[ \newcommand{\vect}[1]{\mathbf{#1}}
\vect{i} = \left[1,0,0\right], \vect{j} = \left[0,1,0\right], \vect{k} = \left[0,0,1\right]\]

Obtain Device Rotation Matrix

Device rotation with respect to world frame

What we want eventually is a rotation matrix that is made up of the components of the device axes a, b, c (also unit vectors) with reference to the world frame we defined. This matrix will allow us to convert a direction in the device frame into a direction in the world frame, and vice versa. This gives us all the information we need to derive azimuth, elevation and tilt angles.

We’ll describe the device axes as:

  • a is “screen right”, the direction from the centre to the right of the screen with the device in portrait
  • b is “screen top”, the direction from the centre to the top of the screen with the device in portrait
  • c is “screen normal”, the direction straight out of the screen (at right angles to the screen, towards the viewer’s eye)

We can write each device axis as a vector sum of the components in each of the principal world frame directions, or we can use the shorthand of a list of numbers:

\[\vect{a} = a_i\vect{i}+a_j\vect{j}+a_k\vect{k} = \left[a_i,a_j,a_k\right]\]

The rotation matrix then has the form:

\[\mathbf{A} = \left[\begin{array}{ccc}
a_i & b_i & c_i \\
a_j & b_j & c_j \\
a_k & b_k & c_k \end{array}\right]\]

To get a matrix of this form in iOS, just use reference CMAttitudeReferenceFrameXTrueNorthZVertical and get the rotation matrix. However, the returned matrix will be the transpose of the matrix above, so you will need to transpose the result of the API call.

In Android, you will need to correct for magnetic declination and a default frame that uses Y as magnetic north, and therefore X as east. Both corrections are rotations about the Z axis. The matrix will similarly be transposed.

Calculate View Angles

Device elevation angle

We can calculate the view angles with some vector maths. The easiest angle is elevation, so let’s start there. We find the angle that the screen normal (c) makes with the vertical (k) using the dot product cosine relationship.

\[-\vect{c} \cdot \vect{k} = \cos\left(\frac{\pi}{2}-e\right)\]
\[e = \frac{\pi}{2} - \arccos\left(-\vect{c} \cdot \vect{k}\right)\]

Elevation is in the range [-90, 90]. Note also from the definitions above that such dot products can be extracted directly from the rotation matrix, as we can write:

\[\vect{c} \cdot \vect{k} = c_k \]

Device azimuth angle

Next, we calculate azimuth, for which we need the horizontal projection (cH) of the screen normal (c). We use Pythagoras’ theorem to calculate cH:

\[1 = c_H^2 + c_k^2\]
\[c_H = \sqrt{1 - c_k^2}\]

We then define a vector cP in the direction of c, such that the horizontal projection of this vector is always equal to 1, so we can use this horizontal projection to calculate angles with the horizontal vectors i & j.

\[\vect{c}_P = \frac{\vect{c}}{c_H}\]

Horizontal projection of device screen normal

We then calculate the angle the horizontal projection of the screen normal (cP) makes with the north axis (i). We get the magnitude of this angle from this dot product with i, and we get the direction (E or W of north) from the dot product with the west axis (j).

\[\cos{\alpha} = -\vect{c}_P \cdot \vect{i} = \frac{-\vect{c} \cdot \vect{i}}{c_H}\]
\[\alpha' = \arccos\left(-\frac{c_i}{c_H}\right)\]
\[\newcommand{\sgn}{\text{sgn}}
\alpha = \sgn\left({c_j}\right) \times \alpha'\]

Note that because we’ve only used screen normal direction up until now, we don’t care how the phone is tilted between portrait and landscape.

Device tilt angle

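Last, we calculate tilt. For this calculation we need to scale the screen right vector a so that its vertical component (its projection onto k) spans the full range from -1 to 1 as the device is tilted. A unit vector in the plane of the screen has a vertical component of magnitude at most cH, so, as above, we divide a by cH to get aP.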

\[\vect{a}_P = \frac{\vect{a}}{c_H}\]

We take the angle between aP and the world frame vertical axis k.

\[\cos{\tau} = -\vect{a}_P \cdot \vect{k} = \frac{-\vect{a} \cdot \vect{k}}{c_H}\]
\[\tau' = \arccos\left(-\frac{a_k}{c_H}\right)\]
\[\tau = \sgn\left({b_k}\right) \times \tau'\]

Note that as the elevation gets closer to +/-90, both the azimuth value and the tilt value will become less accurate because the horizontal projection of the screen normal approaches zero, and the vertical projection of the screen right direction approaches zero. How to handle elevation +/-90 is left as an exercise to the reader.

Sample Code

Sample code may be available in future. However, these calculations have been verified in iOS and Android.
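In the meantime, here is a minimal Python sketch of the view angle formulas above. It is not the app's code. It assumes the rotation matrix is already in the form A shown earlier (columns a, b, c in the north, west, up world frame, after any transposition and frame corrections), takes sgn(0) as +1, and simply refuses to compute azimuth and tilt at elevation +/-90. The function names are my own.

```python
import math


def _clamp(x):
    """Keep rounding errors from pushing arccos arguments outside [-1, 1]."""
    return max(-1.0, min(1.0, x))


def view_angles(A):
    """Return (azimuth, elevation, tilt) in degrees from rotation matrix A.

    A is a 3x3 nested list whose columns are the device axes a, b, c
    expressed in the world frame (rows are the i, j, k components).
    """
    a_k, b_k = A[2][0], A[2][1]
    c_i, c_j, c_k = A[0][2], A[1][2], A[2][2]

    elevation = math.pi / 2 - math.acos(_clamp(-c_k))

    c_h = math.sqrt(max(0.0, 1.0 - c_k * c_k))  # horizontal projection of c
    if c_h < 1e-9:
        raise ValueError("azimuth and tilt are undefined at elevation +/-90")

    # copysign treats a zero component as positive (assumed sgn(0) = +1).
    azimuth = math.copysign(1.0, c_j) * math.acos(_clamp(-c_i / c_h))
    tilt = math.copysign(1.0, b_k) * math.acos(_clamp(-a_k / c_h))
    return tuple(math.degrees(angle) for angle in (azimuth, elevation, tilt))


# Device upright in portrait, rear camera pointing north and level:
# a points east (-j), b points up (k), c points south (-i).
north_level = [[0, 0, -1],
               [-1, 0, 0],
               [0, 1, 0]]

# Device in portrait, rear camera pointing east and 45 degrees above horizontal.
s = math.sqrt(0.5)
east_up_45 = [[-1, 0, 0],
              [0, s, s],
              [0, s, -s]]
```

With the formulas as written, an upright portrait device has a horizontal screen right vector, so the tilt evaluates to +90 degrees in both examples.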

Leave Product Development to the Dummies

This is the talk I gave at Agile Australia 2013 about the role of simulation in product development. Check out a PDF of the slides with brief notes.

Description

"Dummies" talk at Agile Australia

Stop testing on humans! Auto manufacturers have greatly reduced the harm once caused by inadvertently crash-testing production cars with real people. Now, simulation ensures every new car endures thousands of virtual crashes before even a dummy sets foot inside. Can we do the same for software product delivery?

Simulation can deliver faster feedback than real-world trials, for less cost. Simulation supports agility, improves quality and shortens development cycles. Designers and manufacturers of physical products found this out a long time ago. By contrast, in Agile software development, we aim to ship small increments of real software to real people and use their feedback to guide product development. But what if that’s not possible? (And can we still benefit from simulation even when it is?)

The goal of trials remains the same: get a good product to market as quickly as possible (or pivot or kill a bad product as quickly as possible). However, if you have to wait for access to human subjects or real software, or if it’s too costly to scale to the breadth and depth of real-world trials required to optimise design and minimise risk, consider simulation.

Learn why simulation was chosen for the design of call centre services (and compare this with crash testing cars), how a simulator was developed, and what benefits the approach brought. You’ll leave equipped to decide whether simulation is appropriate for your next innovation project, and with some resources to get you started.

Discover:

  • How and when to use simulation to improve agility
  • The anatomy of a simulator
  • A lean, risk-based approach to developing and validating a simulator
  • Techniques for effectively visualising and communicating simulations
  • Implementing simulated designs in the real world