Disaster Recovery scenarios with self-managed Deployments (IHC)

This article is a general guide on what to do and what to expect in the unfortunate event that your Immuta instance is destroyed or unrecoverable. It addresses the specific questions listed below.

Goals

  • How do I get Immuta back up and running on AWS, Azure and GCP on Helm chart 4.12.1 and lower?

  • How does a recovery situation differ if I am using an external metadata database?

  • What can I do if my external database is corrupted or down?

  • What if I need to upgrade my Kubernetes version?

  • What are my options if I need help?

Guidance

How to get Immuta back up and running on AWS, Azure and GCP?

Every major cloud provider supports a simple process for bringing Immuta back up or recovering it when required. On each of them the process depends on one thing: a backup of the Immuta database and the Immuta Query Engine. Regardless of cloud provider, with that backup in place Immuta is only a few minor values changes away from being restored. Linked here is the documentation that walks you through a restore using a backup stored in S3, AZBlob or GCP Storage. The main requirements are as follows:

  • Immuta Helm chart version 4.12.1 or lower

  • A backup that is located in either S3, AZBlob or GCP Storage

How does a recovery situation differ if I am using an external metadata database?

The recovery process with an external metadata database is very different and gives you more flexibility in how you restore, if a restore is needed at all. With an external database, Immuta no longer relies on a database and query engine backup file in some form of blob storage. It relies entirely on the cloud provider's snapshot system that comes with GCP SQL, AWS RDS or Azure Flex Server. A restore or recovery procedure is only required when the cloud database instance is either unreachable or deleted.

For AWS RDS:

  • Navigate to the AWS RDS page > Select "Snapshots" on the left hand side > Select the snapshot you wish to use > Click on "Actions" in the top right > Restore snapshot

    • This will bring you to a page that looks like the RDS creation panel but it will have some information that is brought over from the snapshot and will create a new RDS instance with the snapshot selected

    • The only change required on the Immuta side is to update the externalDatabase.host field in your Immuta-values.yaml file and perform a helm upgrade to apply the host name of the restored database.

  • You can also navigate to the RDS instance itself > Select "Actions" in the top right > Restore to point in time

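    • This restores the database to the point in time you select. As with a snapshot restore, RDS creates a new instance, so the externalDatabase.host field must be updated in the same way.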
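For teams that would rather script the RDS restore than click through the console, the two console paths above have AWS CLI counterparts. The following is a minimal Python sketch that only builds the command lines. It assumes the AWS CLI is installed and configured, and every instance and snapshot identifier shown is a hypothetical placeholder.

```python
import shlex
from typing import List, Optional


def restore_from_snapshot_cmd(new_instance_id: str, snapshot_id: str) -> List[str]:
    """CLI counterpart of: Snapshots > Actions > Restore snapshot."""
    return [
        "aws", "rds", "restore-db-instance-from-db-snapshot",
        "--db-instance-identifier", new_instance_id,  # name for the NEW instance
        "--db-snapshot-identifier", snapshot_id,
    ]


def restore_to_point_in_time_cmd(
    source_instance_id: str,
    target_instance_id: str,
    restore_time: Optional[str] = None,
) -> List[str]:
    """CLI counterpart of: instance > Actions > Restore to point in time.

    restore_time is a UTC timestamp such as "2024-01-31T08:00:00Z".
    When it is omitted, the latest restorable time is requested instead.
    """
    cmd = [
        "aws", "rds", "restore-db-instance-to-point-in-time",
        "--source-db-instance-identifier", source_instance_id,
        "--target-db-instance-identifier", target_instance_id,
    ]
    if restore_time is None:
        cmd.append("--use-latest-restorable-time")
    else:
        cmd += ["--restore-time", restore_time]
    return cmd


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    print(shlex.join(restore_from_snapshot_cmd(
        "immuta-metadata-restored", "immuta-metadata-snap")))
    print(shlex.join(restore_to_point_in_time_cmd(
        "immuta-metadata", "immuta-metadata-pitr")))
```

Both commands create a new RDS instance with its own endpoint, which is why the host name in the Immuta values file has to change afterwards.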

For Azure Flex Server:

  • Navigate to your AZ Flex Server page > In the left-hand panel select "Backup and restore" > On the snapshot you wish to recover from, select "Fast Restore"

    • This will bring you to a page that looks like the AZ FS creation panel but it will have some information that is brought over from the snapshot and will create a new AZ FS instance with the snapshot selected.

  • The only change required on the Immuta side is to update the externalDatabase.host field in your Immuta-values.yaml file and perform a helm upgrade to apply the host name of the restored database.
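The values change described for both AWS RDS and AZ Flex Server comes down to one field and one helm upgrade. Below is a minimal Python sketch of that step, under stated assumptions: the release name, chart reference and namespace are hypothetical, and the host is passed with --set instead of being edited into the file. Because --set takes precedence over --values for that one run, Immuta-values.yaml should still be updated so a later upgrade does not revert the host.

```python
import re
import shlex
from typing import List

# Conservative host name check: letters, digits, dots and hyphens only.
_HOSTNAME = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9.-]*[A-Za-z0-9])?$")


def helm_upgrade_with_new_host(
    release: str,
    chart: str,
    values_file: str,
    new_host: str,
    namespace: str,
) -> List[str]:
    """Build a helm upgrade command that points Immuta at the restored database."""
    if not _HOSTNAME.match(new_host):
        raise ValueError(f"not a plain host name: {new_host!r}")
    return [
        "helm", "upgrade", release, chart,
        "--namespace", namespace,
        "--values", values_file,
        # Field name taken from the guide; the value is the restored endpoint.
        "--set", f"externalDatabase.host={new_host}",
    ]


if __name__ == "__main__":
    # All names below are hypothetical placeholders.
    cmd = helm_upgrade_with_new_host(
        release="immuta",
        chart="immuta/immuta",
        values_file="Immuta-values.yaml",
        new_host="immuta-metadata-restored.example.com",
        namespace="immuta",
    )
    print(shlex.join(cmd))
```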

For GCP SQL:

  • Select the GCP SQL instance you wish to restore > On the left panel select "Backups" > Choose the backup you wish to use and select "Restore" on the right-hand side > Follow the prompts to restore the instance from that backup
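The GCP SQL console steps above can also be driven from the gcloud CLI. The following Python sketch only assembles the commands and is an illustration, not Immuta's documented procedure. It assumes gcloud is installed and authenticated, and the instance names and backup ID are hypothetical.

```python
import shlex
from typing import List, Optional


def list_backups_cmd(instance: str) -> List[str]:
    """List the backups available for an instance, to find a backup ID."""
    return ["gcloud", "sql", "backups", "list", f"--instance={instance}"]


def restore_backup_cmd(
    backup_id: str,
    target_instance: str,
    source_instance: Optional[str] = None,
) -> List[str]:
    """Restore a backup onto target_instance.

    source_instance is only needed when the backup was taken from a
    different instance than the one being restored to. The restore
    replaces the data currently on the target instance.
    """
    cmd = [
        "gcloud", "sql", "backups", "restore", backup_id,
        f"--restore-instance={target_instance}",
    ]
    if source_instance is not None:
        cmd.append(f"--backup-instance={source_instance}")
    return cmd


if __name__ == "__main__":
    # Hypothetical instance names and backup ID.
    print(shlex.join(list_backups_cmd("immuta-metadata")))
    print(shlex.join(restore_backup_cmd("1700000000000", "immuta-metadata")))
    print(shlex.join(restore_backup_cmd(
        "1700000000000", "immuta-metadata-new", "immuta-metadata")))
```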

What can I do if my external database service is completely down?

If the service is completely unreachable because of a cloud provider outage or another issue on the cloud provider's side, you will need to reach out to that cloud provider's support team. If the region hosting your remote database is down and you want to restore, AWS RDS and AZ Flex Server let you change the region the snapshot is deployed to during the restore process. For GCP SQL, you will need to create an entirely new instance with the same passwords that were previously set, restore the snapshot to that instance, and then update the values file to point at the new instance in the new region.

What if I need to upgrade my Kubernetes version?

When upgrading the k8s cluster that contains Immuta, there are a few considerations to address before upgrading. It is recommended to take an ad hoc backup beforehand so that an up-to-date recovery option is available if required.

If using the built-in database pod (non-externalized metadata database):

  1. Take a backup of Immuta as documented above

  2. Uninstall Immuta

    1. This will remove the pods but PVCs that contain the data will remain

  3. Upgrade Kubernetes

  4. Reinstall Immuta

    1. If issues are encountered, either open a support ticket for help or refer to the steps outlined at the top of this guide to restore your instance.

If using an external database + Query Engine Rehydration (GCP SQL, AWS RDS, AZ Flex Server):

  1. Uninstall the Immuta application

  2. Upgrade Kubernetes

  3. Reinstall Immuta via Helm
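The two upgrade sequences above differ only in the backup and PVC steps. The sketch below lays both out as an ordered checklist in Python, as a rough illustration. The release name, chart reference and namespace are hypothetical, the PVC listing is an added sanity check that is not part of the original steps, and the Kubernetes upgrade itself is left as a manual, provider-specific step.

```python
from typing import List


def upgrade_plan(
    release: str,
    chart: str,
    values_file: str,
    namespace: str,
    external_database: bool,
) -> List[str]:
    """Return the ordered steps for upgrading Kubernetes underneath Immuta."""
    steps: List[str] = []
    if not external_database:
        # The built-in database pod holds the metadata, so back it up first.
        steps.append("MANUAL: take an ad hoc Immuta backup")
    steps.append(f"helm uninstall {release} --namespace {namespace}")
    if not external_database:
        # Sanity check (an addition): the PVCs holding the data should remain.
        steps.append(f"kubectl get pvc --namespace {namespace}")
    steps.append("MANUAL: upgrade Kubernetes (provider-specific)")
    steps.append(
        f"helm install {release} {chart} "
        f"--namespace {namespace} --values {values_file}"
    )
    return steps


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for number, step in enumerate(
        upgrade_plan("immuta", "immuta/immuta", "Immuta-values.yaml",
                     "immuta", external_database=False),
        start=1,
    ):
        print(f"{number}. {step}")
```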

What are my options if I need help?

If the above steps or scenarios do not work and your Immuta instance fails to come up before or after a restore attempt, please create a support case at support.immuta.com immediately. A support engineer will be happy to help get your instance back up and running as quickly as possible.
