Category: Data

  • Solving EV charger anxiety

    Many EV adventures are accessible using the charging network in Victoria, but faulty chargers can still induce charger anxiety on road trips. Planning apps, EV drivers’ constant companions, may not fully solve this when the reported status of chargers is unreliable and faults are prevalent. As a driver, I want resilient plans that already…

  • Hopsworks and multidisciplinary ML

    I recently had a brief but fun chat with Hopsworks about the multidisciplinary nature of building machine learning products, as part of their 5-minute podcast series hosted by Rik Van Bruggen. See the transcript and video at 5-minute-interview-with-david-colls-nextdata. Rik and I talked about how David, Ada and I address this multidisciplinary perspective in our book…

  • EMLT Q&A

    A fun Q&A with Thoughtworks on the drivers, key messages and writing process for Effective Machine Learning Teams (EMLT) with my fellow authors Ada and David. It’s neat to be featured alongside the many other great books from Thoughtworks authors. Find the book, trial and purchase options at O’Reilly, and find yourself a nice…

  • Dealing with data inventory

    Data held by businesses is often described as an asset. This can be misleading or even incorrect. In any case, data managed inappropriately leaves value on the table, inflates cost, reduces responsiveness, and creates risk. Some data held by businesses would better be described as inventory. It might one day be a true asset, but…

  • Effective Machine Learning Teams

    I’m very excited to be writing a book with my colleagues David Tan and Ada Leung. The topic and title Effective Machine Learning Teams was born from our combined work on team technical and delivery practices, and wider organisational patterns, applied to developing machine learning applications. The book has two landing pages where you can…

  • 7 wastes of data production – when pipelines become sewers

    I recently had the chance to present an updated version of my 7 wastes of data production talk at DataEngBytes Melbourne 2023. I think the talk was stronger this time around and I really appreciated all the great feedback from the audience. Check out the video below and the slides. Thanks to Peter Hanssens and…

  • Privacy puzzles

    I contributed a database reconstruction attack demonstration to the companion repository for the excellent book Practical Data Privacy by my colleague Katharine Jarmul. My interest was piqued by my colleague Mitchell Lisle sharing the paper Understanding Database Reconstruction Attacks on Public Data by US Census Bureau authors Simson Garfinkel, John M. Abowd, and Christian…

  • Perspectives edition #27

    I was thrilled to contribute to Thoughtworks Perspectives edition #27: Power squared: How human capabilities will supercharge AI’s business impact. There are a lot of great quotes from my colleagues Barton Friedland and Ossi Syd in the article, and here’s one from me: The ability to build or consume solutions isn’t necessarily going to be…

  • Electrifying the world with AI Augmented decision-making

    I wrote an article about optimising the design of EV charging networks. It’s a story of work done by a team at Thoughtworks, demonstrating the potential of AI augmented decision-making (including some cool optimisation techniques), in this rapidly evolving but durably important space. We were able to thread together these many [business problem, AI techniques,…

  • Humour me – DRY vs WRY

    Don’t Repeat Yourself (DRY) is a tenet of software engineering, but – humour me – let’s consider some reasons Why to Repeat Yourself (WRY). LEGO reuse lessons: In 2021, I wrote a series of posts analysing LEGO® data about parts appearing in sets to understand what it might tell us about reuse of software components…