Too Long; Didn't Read
An important requirement for data privacy and protection is finding and cataloging the tables and columns in a data warehouse that contain PII or PHI. Open source data catalogs such as [Datahub] and [Amundsen] make it possible to catalog the information stored in data warehouses. This post describes two strategies for scanning and detecting PII and introduces [PIICatcher], an open source application that scans data warehouses for PII. PII includes SSNs, email addresses, phone numbers, login IDs, social media posts, digital images, geolocation data, and more.
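To make the idea of scanning for PII concrete, here is a minimal sketch in Python that checks sampled column values against regular expressions for three of the PII types listed above. It is an illustration under stated assumptions, not PIICatcher's implementation: the patterns, the `detect_pii_types` function, and the `threshold` parameter are all hypothetical, and the SSN and phone formats assumed are US-style.

```python
import re

# Hypothetical, deliberately simple patterns for illustration only.
# Real-world detection needs broader coverage and validation.
PII_PATTERNS = {
    "email": re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$"),
    "ssn": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
    "phone": re.compile(r"^\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}$"),
}


def detect_pii_types(values, threshold=0.5):
    """Return the PII types matched by at least `threshold` of the non-empty values."""
    # Normalize to stripped strings and drop NULLs and blanks.
    samples = [str(v).strip() for v in values if v is not None and str(v).strip()]
    if not samples:
        return set()

    found = set()
    for name, pattern in PII_PATTERNS.items():
        hits = sum(1 for s in samples if pattern.match(s))
        # Flag the column only if enough sampled values look like this PII type.
        if hits / len(samples) >= threshold:
            found.add(name)
    return found
```

In practice, `values` would be a sample of rows pulled from one column of a warehouse table. Requiring a fraction of the sample to match, rather than a single value, is one way to reduce false positives from stray matches.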