Data anonymization is an old topic, but with rising legal and ethical concerns about privacy, database administrators must keep track of sensitive data and hide it from certain users in different contexts: unit testing platforms, open data access, audits, development, etc.
This presentation is an overview of various anonymization techniques: substitution, randomization, variance, shuffling, encryption, partial scrambling, and more, with a special focus on "dynamic masking". These methods can now be implemented easily with "PostgreSQL Anonymizer", a new extension that hides or replaces personal information using only SQL statements.
This project is a prototype designed to show that data masking is a key feature for Postgres. The long-term goal is to introduce a new, specific DDL syntax for anonymization ("MASKED WITH ...") and let users define masks on certain columns, just like they would declare CHECK constraints.
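To make the techniques named above more concrete, here is a minimal sketch in plain Python of four of them: substitution, partial scrambling, variance, and shuffling. It is an illustration only and does not use the extension's API. Every function name, parameter, and sample value below is a hypothetical chosen for this example.

```python
import random


def substitute(value, replacement="CONFIDENTIAL"):
    """Substitution: replace the real value with a fixed placeholder."""
    return replacement


def partial_scramble(text, visible=2, mask_char="*"):
    """Partial scrambling: keep the first and last `visible` characters."""
    if len(text) <= 2 * visible:
        # Too short to keep anything readable: mask everything.
        return mask_char * len(text)
    hidden = len(text) - 2 * visible
    return text[:visible] + mask_char * hidden + text[-visible:]


def add_variance(value, ratio, rng):
    """Variance: shift a number by a random factor within +/- ratio."""
    return value * (1 + rng.uniform(-ratio, ratio))


def shuffle_column(rows, column, rng):
    """Shuffling: permute one column's values across all rows."""
    values = [row[column] for row in rows]
    rng.shuffle(values)
    return [{**row, column: v} for row, v in zip(rows, values)]


# Hypothetical sample data, invented for this sketch.
people = [
    {"name": "Alice", "phone": "0612345678", "salary": 52000},
    {"name": "Bob", "phone": "0698765432", "salary": 48000},
    {"name": "Carol", "phone": "0611223344", "salary": 61000},
]

rng = random.Random(42)  # seeded so the sketch is reproducible

masked = shuffle_column(people, "salary", rng)
masked = [
    {
        "name": substitute(row["name"]),
        "phone": partial_scramble(row["phone"]),
        "salary": round(add_variance(row["salary"], 0.10, rng)),
    }
    for row in masked
]
```

Each function takes the real value and returns a masked one, so the same idea can be applied per column. Shuffling keeps the overall distribution of a column intact while breaking the link between a value and its row, and variance keeps numbers plausible without revealing the exact figure.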
The following slides have been made available for this session: