I contributed a database reconstruction attack demonstration to the book Practical Data Privacy by my colleague Katharine Jarmul. While we might think anonymous summary data is safe to share, this attack demonstrates it’s possible to dramatically reduce the search space for re-identification, in this case from half a trillion quadrillion possibilities to just one!
My interest was piqued when my colleague Mitchell Lisle shared the paper Understanding Database Reconstruction Attacks on Public Data by US Census Bureau authors Simson Garfinkel, John M. Abowd, and Christian Martindale. Mitchell and I collaborated on a pair of solutions using mathematical optimisation/satisfaction techniques. Check out Mitchell’s solution in the companion repository using the Z3 library. I used OR-Tools.

The notebook demonstrates that individual rows of a database may be reconstructed, even if only summary statistics are shared. It does so by combining two kinds of constraint: those inherent in the data itself (for example, ages fall within a limited range), and those that the published statistical measures place on the possible values of the input data. The statistical measures in this case are the mean and median, both globally and for cohorts.
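To make the idea concrete, here is a minimal sketch of how published statistics shrink the set of candidate databases. It is not the notebook's code: it uses plain brute-force enumeration rather than OR-Tools or Z3, and every value in it (three people, a maximum age of 125, a "minors" cohort defined as under 18, and the published means and median) is made up for illustration. The function name `candidates` and its parameters are likewise hypothetical.

```python
from itertools import combinations_with_replacement

MAX_AGE = 125  # assumed upper bound on a plausible age (illustrative)


def candidates(count, mean, median, minors_count=None, minors_mean=None):
    """Return every sorted tuple of ages consistent with the published stats.

    Assumes `count` is odd and all means are integers, to keep the
    arithmetic exact. The cohort statistics are optional; if
    `minors_count` is positive, `minors_mean` must also be given.
    """
    found = []
    # Sorted tuples only, so reorderings of the same ages count once.
    for ages in combinations_with_replacement(range(MAX_AGE + 1), count):
        if sum(ages) != mean * count:  # global mean
            continue
        if ages[count // 2] != median:  # global median (odd count)
            continue
        if minors_count is not None:
            minors = [a for a in ages if a < 18]
            if len(minors) != minors_count:  # cohort size
                continue
            if minors and sum(minors) != minors_mean * len(minors):  # cohort mean
                continue
        found.append(ages)
    return found


# Global statistics alone leave 31 candidates out of 341,376 sorted triples.
print(len(candidates(3, mean=44, median=30)))  # 31

# Adding one cohort statistic leaves a single candidate.
print(candidates(3, mean=44, median=30, minors_count=1, minors_mean=12))
# [(12, 30, 90)]
```

Brute force only works at this toy scale. With more people and more attributes, the search space grows far too quickly to enumerate, which is where constraint solvers such as OR-Tools or Z3 come in.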
Note that the intent of this notebook is not to compromise any private data, but to raise awareness of the potential for privacy breaches due to re-identification attacks!