Disaster Recovery Considerations for Self-Managed Immuta Deployments
FAQ: How can we best prepare for a DR event as a self-managed Immuta customer?
Question

As a self-managed Immuta customer, I have SLAs covering downtime and recovery during a disaster recovery (DR) event. What is the best way to prepare for a DR event and maintain our SLAs?

Answer

Assuming you are using one of the major cloud providers for your Kubernetes (K8s) infrastructure, such as Azure (AKS), Google (GKE), or Amazon (EKS), preparing for a DR event is fairly straightforward. First, start by identifying your data loss allowance window (can you lose 1 day's worth of data, 1 week, or 1 hour?), your time to recovery (how soon do you need to have Immuta back up, functional, and accessible to your end users?), and the scope of your recovery (do you only need to protect against a node outage, or do you need to be prepared for a full region outage?).

Once you have identified these core elements, you can take advantage of Immuta's backup and restoration process as well as the redundancies that the cloud providers offer in their object storage options.
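As a rough illustration of how the three planning inputs interact, the sketch below records them and checks whether a given backup interval fits inside the data loss allowance. The `DrRequirements` class and `backup_interval_ok` function are hypothetical names used only for this example and are not part of Immuta. The reasoning assumes the worst case, where an outage occurs just before the next backup would have run.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class DrRequirements:
    """Hypothetical container for the three DR planning inputs."""
    data_loss_allowance: timedelta  # maximum amount of recent data you can lose
    time_to_recovery: timedelta     # how quickly Immuta must be usable again
    scope: str                      # e.g. "node", "az", or "region"


def backup_interval_ok(req: DrRequirements, backup_interval: timedelta) -> bool:
    # Worst case: the outage hits just before the next backup runs,
    # so up to one full interval of changes is missing from the last backup.
    return backup_interval <= req.data_loss_allowance


# Example values only: a 1-hour data loss allowance with a region-level scope.
req = DrRequirements(
    data_loss_allowance=timedelta(hours=1),
    time_to_recovery=timedelta(hours=4),
    scope="region",
)

print(backup_interval_ok(req, timedelta(hours=24)))    # False
print(backup_interval_ok(req, timedelta(minutes=30)))  # True
```

In this example, a 24-hour backup schedule would not satisfy a 1-hour data loss allowance, so the schedule would need to be tightened.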
Enable backups and adjust your Immuta backup schedule to match your data loss allowance, keeping in mind that Immuta backups run every 24 hours by default
https://documentation.immuta.com/install-immuta/installation/kubernetes/helm-chart/options/#backup
Make sure your backups are saved to a multi-AZ or multi-Region object storage location, such as Azure Blob Storage with geo-redundant or geo-zone-redundant replication, so that your Immuta backup files remain accessible if you lose an AZ or Region
Make sure that any necessary Kubernetes secrets, such as your TLS secret and your image pull secret, are available in your backup region
These can be stored in a .yaml file for quick deployment to your new cluster/namespace
In the event of an outage, redeploy Immuta to a new AZ or Region using the "restore from backup" feature of Immuta
Create a new node pool in the backup region (or have one on standby) as appropriate
Modify your Immuta Helm values file to enable restore on deployment
Ensure the backup file location details are still correct in your Helm values file
If needed, deploy a new external PostgreSQL database (or have one on standby) and adjust the Helm values file to reflect the new connection information
If you do not want to use an external PostgreSQL database, you can simply adjust the Helm values file to reflect this
If you are not using an external PostgreSQL database, make sure persistence is turned on for the database. We highly recommend SSD-backed PVC storage, especially for large deployments
As appropriate, deploy your load balancer and update your DNS records so that your Immuta external hostname points to the new deployment's IP address
Finally, make sure you run at least one DR exercise to confirm that your process works and that your instructions have no gaps
I cannot stress enough how important this is! Creating a DR plan without testing it is worse than not having a DR plan at all.
If you have issues during your DR exercise, open an Immuta support case and we can help you work through things and ensure that you are set up for success!
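The step above about keeping Kubernetes secrets ready for the backup region can be sketched as follows. This is a minimal example, not an Immuta-provided tool: it builds a TLS secret and an image pull secret using the standard Kubernetes secret types `kubernetes.io/tls` and `kubernetes.io/dockerconfigjson`, and prints them as a single JSON document, which `kubectl apply -f` accepts in the same way as YAML. All names, the namespace, the registry host, and the credentials are placeholders chosen for illustration.

```python
import base64
import json


def b64(data: bytes) -> str:
    """Kubernetes stores secret values base64-encoded under `data`."""
    return base64.b64encode(data).decode("ascii")


def tls_secret(name: str, namespace: str, cert_pem: bytes, key_pem: bytes) -> dict:
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "kubernetes.io/tls",
        "data": {"tls.crt": b64(cert_pem), "tls.key": b64(key_pem)},
    }


def image_pull_secret(name: str, namespace: str, registry: str,
                      username: str, password: str) -> dict:
    # The Docker config format carries "user:password" base64-encoded as `auth`.
    docker_config = {
        "auths": {
            registry: {
                "username": username,
                "password": password,
                "auth": b64(f"{username}:{password}".encode()),
            }
        }
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "kubernetes.io/dockerconfigjson",
        "data": {".dockerconfigjson": b64(json.dumps(docker_config).encode())},
    }


# Placeholder values: substitute your own certificate, key, and registry login.
manifest = {
    "apiVersion": "v1",
    "kind": "List",
    "items": [
        tls_secret("immuta-tls", "immuta", b"<certificate PEM>", b"<private key PEM>"),
        image_pull_secret("immuta-registry", "immuta", "registry.example.com",
                          "<username>", "<password>"),
    ],
}

print(json.dumps(manifest, indent=2))
```

Redirecting the output to a file gives you a manifest you can apply to the new cluster during recovery. Because that file contains credentials and a private key, it should be stored with the same care as the secrets themselves.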