After each large outage, we created a document describing the nature and first occurrence of an outage, our course of action, and outcomes. Post-mortems were finalized with a follow-up section indicating plans and Jira tickets to prevent similar events in the future. I really like this approach and advocate this practice to avoid recurring issues.