A bowl of famous Hokkaido soup curry

GenAI stone soup

GenAI (typically as an LLM) is pretty amazing, and you can use it to help with tasks or rapidly build all kinds of things that previously weren’t feasible.

Things that work some of the time.

The soup

But do you find yourself reworking large chunks of generated content, or facing major hurdles in getting a prototype to production?

In the case of taking your LLM prototype to production, you need it to work most, if not all, of the time. You soon realise you must:

  • Ensure that any data used is properly prepared, correct and current with appropriate controls
  • Consistently evaluate responses and manage regressions
  • Add some guardrails to prevent unexpected or adversarial inputs
  • Limit the outputs to a set of safe, useful and desirable choices
  • Re-imagine the UI to better suit the more constrained interaction
  • Improve latency and cost of the back-end to run at scale
  • And so on
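To make the "limit the outputs" point concrete, here is a minimal sketch (all names and the action set are illustrative, not from any particular library): instead of trusting free-form generation, map the model's raw text onto a fixed set of safe choices, with a fallback when nothing matches.

```python
# Illustrative guardrail: constrain raw LLM output to a known-safe action set.
ALLOWED_ACTIONS = {"refund", "escalate", "close_ticket"}  # hypothetical choices
FALLBACK = "escalate"  # safe default for unexpected or adversarial output

def constrain_output(raw_llm_text: str) -> str:
    """Return the first allowed action mentioned in the model output,
    or a safe fallback if the output is unexpected."""
    text = raw_llm_text.lower()
    for action in sorted(ALLOWED_ACTIONS):  # sorted for deterministic order
        if action in text:
            return action
    return FALLBACK
```

A real system would likely use structured output or schema validation rather than substring matching, but the shape is the same: the LLM proposes, deterministic code disposes.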

And all of a sudden the LLM is surrounded by a lot of supporting infrastructure, and may not be doing much itself. In a RAG solution it’s a mix, but data preparation and evaluation remain crucial. In other scenarios you might reduce the feature to a setup wizard, a semantic recommender feeding search, or a classifier matching content or triggering defined workflows, and so on, in which the LLM may play a small part, or no part at all.
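The "classifier triggering defined workflows" scenario might look like the following sketch (the routes and handler names are invented for illustration): a cheap classifier, here just keywords, though in practice perhaps an embedding model, routes each query to a deterministic handler, and the LLM only sees whatever nothing else can handle.

```python
# Illustrative routing layer: deterministic workflows first, LLM as fallback.
def handle_billing(query: str) -> str:
    return f"billing workflow: {query}"

def handle_shipping(query: str) -> str:
    return f"shipping workflow: {query}"

def handle_with_llm(query: str) -> str:
    # Placeholder for an actual LLM call.
    return f"LLM fallback: {query}"

ROUTES = [
    (("invoice", "charge", "refund"), handle_billing),
    (("delivery", "tracking", "shipping"), handle_shipping),
]

def route(query: str) -> str:
    q = query.lower()
    for keywords, handler in ROUTES:
        if any(k in q for k in keywords):
            return handler(query)
    return handle_with_llm(query)
```

In this setup the LLM is one component among several, and for many queries it is never invoked at all, which is exactly the point.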

(NB. This is one potential scenario. LLMs, though fundamentally unreliable, do have many genuine and transformative applications. Over time, we will also get better at avoiding and dealing with reliability failures.)

The stone

So we have a paradox: the technology that inspired the feature, or helped you get started on a task, doesn’t play a major role in implementing the feature or completing the task.

That doesn’t mean the LLM isn’t useful; it’s just that we should understand its utility more precisely.

The parable

In the parable of Stone Soup, a hungry traveller arrives at a village with nothing but a stone in their knapsack (the details vary by telling). Undeterred, they go to the first house proclaiming they have an amazing recipe for stone soup, if only they could borrow an onion. The villager obliges. At the next house the traveller does the same, but asks for carrots, then potatoes, and so on. Eventually they have enough for a hearty soup that feeds the whole village – all made from just a stone!

The lesson

The LLM is useful in the way the stone in the parable of Stone Soup is useful, as a catalyst for innovation.

The primary moral of Stone Soup is also relevant: each person contributes what they can to create something great for everyone. In this respect, complex software solutions are built by teams bringing together many simple parts. Also durably valuable is the discipline you might bring to curating your organisation’s unique data. With better data management and governance, you might get beyond opportunity soup to the main course (and on to nuts).

So don’t despair if your “GenAI” feature contains no LLM, it still played a useful roll! [sic]

Actual footage of LLM engineering from 1962 – couldn’t resist! (courtesy Horizon Book of Science)

Postres

Since writing this post (after talking about it for 12 months!), I was tickled to learn that two prominent AI researchers use this metaphor too. The first is Alison Gopnik, in regard to the LLM training process, on Berkeley Simons Institute News and the Santa Fe Institute Complexity podcast. The second is Subbarao Kambhampati, in regard to augmenting LLMs for reasoning tasks, on LinkedIn.

