Data breaches are the number 1 threat today, and every company should be doing something to protect its most sensitive data. That sounds nice on paper, but in practice it's a pain in the ass. Setting up security policies for protecting data is not big trouble; I think this is the easiest part of the problem. The big problem remains how to know where data is stored, and what kind of data you have. There are different levels of risk, depending on the grade of sensitivity; having customer emails is not the same as having full credit card info.

You can't protect what you don't know

It would be as if the bank tried to protect the money, but didn't know where it's stored. If I don't know where my most sensitive data is, I can't protect it; no matter how hard I try, I'll fail.

Goal

I'll show how to use Macie to scan any database, including on-premises ones, to discover sensitive data in tables. I'll do it by taking snapshots or full backups from databases and launching Macie scanning jobs automatically. When scanning is done, I'll get a full summary of what kind of sensitive data is stored.

How does Macie work?

The service is designed to scan S3 buckets and to identify sensitive data in the objects stored there. Currently, it doesn't have any integration with databases. Macie has predefined patterns to find these kinds of data: credentials, financial information, personal health information, and personally identifiable information.

If I'd like, I can extend the discovery scope using my own pattern matching. I just need to create my custom regexes and provide them to Macie. In this way, I can define whatever pattern I consider sensitive for my company, like internal employee IDs.

Solution

It seems complicated, but it's much simpler than it looks. The whole process is designed to take snapshots of databases, store them in S3, and trigger Macie jobs that will look for sensitive data.
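To make the first two steps (taking the snapshot and storing it in S3) more concrete, here is a minimal sketch in Python with boto3. It is an illustration, not the final code: the `macie-` snapshot prefix, the helper names and the environment variables (`DB_INSTANCE`, `EXPORT_BUCKET`, `EXPORT_ROLE_ARN`, `EXPORT_KMS_KEY`) are assumptions made up for the example. The boto3 calls it relies on are `create_db_snapshot`, `describe_db_snapshots` and `start_export_task`.

```python
import datetime
import os

SNAPSHOT_PREFIX = "macie-"  # hypothetical naming convention for scan snapshots


def snapshot_id(db_instance, day):
    """Build a dated snapshot identifier for one database instance."""
    return f"{SNAPSHOT_PREFIX}{db_instance}-{day.isoformat()}"


def ready_snapshots(snapshots):
    """Keep only our own snapshots that RDS reports as finished."""
    return [
        s for s in snapshots
        if s["DBSnapshotIdentifier"].startswith(SNAPSHOT_PREFIX)
        and s.get("Status") == "available"
    ]


def export_request(snapshot, bucket, role_arn, kms_key_id):
    """Build the keyword arguments for rds.start_export_task."""
    return {
        "ExportTaskIdentifier": f"export-{snapshot['DBSnapshotIdentifier']}",
        "SourceArn": snapshot["DBSnapshotArn"],
        "S3BucketName": bucket,
        "IamRoleArn": role_arn,
        "KmsKeyId": kms_key_id,
    }


def start_backup_handler(event, context):
    """Time-based trigger: only starts the snapshot, it does not wait for it."""
    import boto3  # imported lazily so the helpers above work without AWS access

    db_instance = os.environ["DB_INSTANCE"]
    boto3.client("rds").create_db_snapshot(
        DBSnapshotIdentifier=snapshot_id(db_instance, datetime.date.today()),
        DBInstanceIdentifier=db_instance,
    )


def export_snapshot_handler(event, context):
    """Time-based trigger: exports every finished snapshot to the scan bucket."""
    import boto3

    rds = boto3.client("rds")
    found = rds.describe_db_snapshots(
        DBInstanceIdentifier=os.environ["DB_INSTANCE"], SnapshotType="manual"
    )["DBSnapshots"]
    for snapshot in ready_snapshots(found):
        try:
            rds.start_export_task(**export_request(
                snapshot,
                os.environ["EXPORT_BUCKET"],
                os.environ["EXPORT_ROLE_ARN"],
                os.environ["EXPORT_KMS_KEY"],
            ))
        except rds.exceptions.ExportTaskAlreadyExistsFault:
            pass  # assumed to mean an earlier run already exported this snapshot
```

Keeping the request building in small pure functions means that part can be checked locally, without touching AWS; only the two handlers need real credentials.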
The scans are performed once a week, and the findings are then sent to Security Hub, where I can manage them. Finally, after the job has completed, the previously created snapshots are deleted.

I've created multiple Lambdas to automate the whole process. Snapshot creation and exporting are usually manual tasks, and so is Macie job creation; AWS Lambda helped me to automate all of that.

This example uses an RDS database, but the same process will work with any database backup in SQL language (and other formats). The advantage with RDS snapshots is that, when they are exported to S3, Amazon automatically writes them in the compressed Parquet format, and Macie is able to read it.

The first Lambda function (Start Backup Function) has a time-based trigger. I've configured it to run once a week. It's in charge of initiating snapshot creation for RDS. The logic is pretty simple: the only task of this Lambda is to start snapshot creation.

The second Lambda (export snapshot S3) is also time-based, but I've configured it to run every 15 minutes. Its job consists of checking whether the snapshot has completed, and then exporting it to the S3 bucket. RDS snapshots take a while to finish; for this reason, the function runs at short intervals, just to check if there are snapshots pending export.

The last Lambda function (Create Macie Job) is triggered by every object that is inserted in the bucket dedicated to Macie scans. The logic of this function is to configure, create and start the Macie job, so that it performs the scan over the bucket.

Macie is configured to send every data finding to Security Hub automatically. There I can manage the findings together with other security events and aggregate them all, to get a better view of the security landscape.

For snapshot deletion, I've configured a lifecycle rule for the S3 bucket that deletes objects after 1 day.

That's not all folks

Would you like to see a PoC? I'm still finishing the code, because so far I've done all the stuff manually.
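In the meantime, here is a rough sketch in Python with boto3 of what the function that creates the Macie job could look like. Take it as an illustration under assumptions, not as the finished code: the helper names and the job naming scheme are made up for the example. It uses `create_classification_job` from the `macie2` client to start a one-time job over the whole bucket, and reads the bucket name from the S3 notification that triggers the function.

```python
import uuid


def objects_from_event(event):
    """Extract (bucket, key) pairs from an S3 notification event."""
    return [
        (r["s3"]["bucket"]["name"], r["s3"]["object"]["key"])
        for r in event.get("Records", [])
    ]


def macie_job_request(bucket, account_id, trigger_key):
    """Build the keyword arguments for macie2.create_classification_job."""
    token = str(uuid.uuid4())
    return {
        "clientToken": token,  # idempotency token for this request
        "jobType": "ONE_TIME",
        "name": f"snapshot-scan-{bucket}-{token[:8]}",  # hypothetical naming scheme
        "description": f"Triggered by s3://{bucket}/{trigger_key}"[:200],
        "s3JobDefinition": {
            "bucketDefinitions": [{"accountId": account_id, "buckets": [bucket]}]
        },
    }


def create_macie_job_handler(event, context):
    """S3 trigger: creates one classification job per notified object."""
    import boto3  # imported lazily so the helpers above work without AWS access

    macie = boto3.client("macie2")
    # The account ID is the fifth field of the function's own ARN.
    account_id = context.invoked_function_arn.split(":")[4]
    for bucket, key in objects_from_event(event):
        macie.create_classification_job(**macie_job_request(bucket, account_id, key))
```

One thing to keep in mind: because the trigger fires for every object and a snapshot export writes many files, this sketch would create one job per file. A real version would probably need some way to start only one job per export.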
Follow me on my channel; I'll publish the PoC and updates on new posts there!

I hope that you've learned something new with my post, and if that's the case, I encourage you to become a member of my fantastic Telegram channel about Cloud and Security.