More Sankey for Less Confusion?

Confusion Matrixes are essential for evaluating classifiers, but for some who are new to them, they can cause, well, confusion.

Sankey Diagrams are an alternative way of representing matrix data, and I’ve found some people – who are new to matrix data, like business domain experts who are not experienced data scientists – find them easier to understand. Also, some machine learning researchers find Sankey diagrams useful for analysing data and classifiers.

So, I have posted simple code for visualising classifier evaluation or comparisons as Sankey diagrams. Maybe it will be useful for others, as well as fun for me.

The code combines large portions of Plotly Sankey Diagrams with essence of scikit-learn confusion matrix and a lashings of list comprehension code golf.

The scenarios supported are:

  1. Evaluating a binary classifier against ground truth or as champion-challenger,
  2. Evaluating a multi-class classifier against ground truth or as champion-challenger,
  3. Comparing multiple versions of a binary classifier, for instance over time, or hyper-parameter sweeps, and
  4. Comparing multiple versions of a multi-class classifier.
Example confusion matrixes as Sankey diagrams

See the code on Github.

Leave a Reply

Your email address will not be published. Required fields are marked *