Storage vMotion of an Oracle RAC Cluster with minimal downtime

September 12, 2018

Over the last couple of weeks, I was asked to migrate an Oracle RAC environment from one ESX environment to an ESX environment running on Nutanix.

This procedure is for Development, QA, and Pre-Production RAC clusters that are not constrained by 100% uptime business SLAs. For such clusters, administrators can take the simpler route of storage vMotioning Oracle RAC with minimal downtime in a very short time.

Use these high-level steps for this approach. (RDMs are not covered in this blog post.)

Assumption: we have a two-node Oracle RAC cluster, with VMs named VM1 and VM2.

  1. Shut down all the RAC VMs (VM1 and VM2)
  2. Make a copy of the .vmx file for every RAC VM
    • append the suffix ‘ORIG’ to each vmx file name (e.g. cp VM2.vmx VM2.vmx.ORIG)
  3. Run the following command against one RAC VM’s .vmx file and save the output to a file; the shared-disk entries are identical on every node, so one VM is enough (see the shell sketch after this list)
    • grep -i multi-writer VM1.vmx > multi-writer.out
    • the ‘multi-writer.out’ file will contain the entries holding the multi-writer setting, e.g. scsiX:Y.sharing = “multi-writer”
  4. Save the .vmx.ORIG files and multi-writer.out to a backup location
  5. On VM2, remove ALL the shared vmdks using the Web Client
    • Remove the disks from the VM only [DO NOT DELETE THE VMDKs from storage]
  6. Repeat Step 5 for all RAC VMs except VM1
  7. Perform a storage vMotion of all VMs except VM1 to the new storage array
    • All VMs except VM1, with their non-shared vmdks, are now on the new storage array
  8. VM1 now has two kinds of vmdks
    • non-shared vmdks (i.e. OS, Oracle binaries, etc.)
    • shared vmdks with the multi-writer setting
  9. On VM1, remove the multi-writer setting from all shared disks using the Web Client
  10. Perform storage vMotion of VM1 to the new storage array
  11. At this point, all of VM1’s vmdks are on the new storage
    • original non-shared vmdks
    • common vmdks without the multi-writer setting (make sure these disks are Thick Provision Eager Zeroed; the multi-writer flag requires eager-zeroed thick disks, and a conversion sketch follows this list)
  12. On VM1, add the multi-writer setting back for all common vmdks using the Web Client
  13. VM1 now has
    • non-shared vmdks
    • shared vmdks with the multi-writer setting
  14. On VM2, add ALL the shared vmdks back with the multi-writer setting, using the ‘Existing Hard Disk’ option in the Web Client
  15. Repeat Step 14 for all RAC VMs except VM1
  16. At this point, every VM’s vmdks are on the new storage
    • original non-shared vmdks
    • common vmdks with the multi-writer setting
  17. Power on the RAC VMs
    • Select ‘I copied it’ and click OK when prompted
  18. The RAC cluster has now been successfully migrated to the new cluster (a verification sketch appears at the end of this post)
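
For reference, Steps 2 through 4 can be run from the ESXi shell in one pass. This is a minimal sketch; the datastore path and VM names are placeholders for your environment:

    # Back up the .vmx file of every RAC VM before touching the shared disks (Step 2)
    cd /vmfs/volumes/old_datastore        # hypothetical datastore name
    cp VM1/VM1.vmx VM1/VM1.vmx.ORIG
    cp VM2/VM2.vmx VM2/VM2.vmx.ORIG

    # Capture the multi-writer entries so they can be restored later (Steps 12 and 14)
    grep -i multi-writer VM1/VM1.vmx > multi-writer.out

    # Typical contents of multi-writer.out:
    #   scsi1:0.sharing = "multi-writer"
    #   scsi1:1.sharing = "multi-writer"

    # Copy the backups somewhere off the datastore (Step 4)
    mkdir -p /tmp/rac-backup
    cp VM1/VM1.vmx.ORIG VM2/VM2.vmx.ORIG multi-writer.out /tmp/rac-backup/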

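The multi-writer flag requires the shared vmdks to be Thick Provision Eager Zeroed. If a shared vmdk lands on the new datastore as thin or lazy-zeroed, it can be converted from the ESXi shell while the VM is powered off; a sketch with a hypothetical disk path:

    # Inflate a thin disk to eager-zeroed thick
    vmkfstools -j /vmfs/volumes/new_datastore/VM1/VM1_shared_1.vmdk

    # Or eagerly zero an existing lazy-zeroed (zeroedthick) disk
    vmkfstools -k /vmfs/volumes/new_datastore/VM1/VM1_shared_1.vmdk
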
Here is an alternative process that creates clones of the VMs, pushing the data to the new datastores.

  1. Shut down all the RAC VMs (VM1 and VM2)
  2. Make sure the new datastores have been set up and presented to the old Oracle RAC cluster.
  3. Clone each VM, selecting a datastore that is connected to the new cluster as the destination (a command-line sketch follows this list).
  4. On VM2, remove ALL the shared vmdks using the Web Client
    • Remove the disks from the VM only [DO NOT DELETE THE VMDKs from storage]
  5. Repeat Step 4 for all RAC VMs except VM1
  6. On VM1, remove the multi-writer setting for all shared disks
  7. Register the VMs on the new cluster
  8. At this point, all of VM1’s vmdks are on the new storage
    • original non-shared vmdks
    • common vmdks without the multi-writer setting (make sure these disks are Thick Provision Eager Zeroed, as the multi-writer flag requires it)
  9. On VM1, add the multi-writer setting back for all common vmdks
  10. VM1 now has
    • non-shared vmdks
    • shared vmdks with the multi-writer setting
  11. On VM2, add ALL the shared vmdks with the multi-writer setting using the ‘Existing Hard Disk’ option
  12. Repeat Step 11 for all RAC VMs except VM1
  13. At this point, every VM’s vmdks are on the new storage
    • original non-shared vmdks
    • common vmdks with the multi-writer setting
  14. Power on the RAC VMs
  15. The RAC cluster has now been successfully migrated to the new cluster
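
For Step 3, the shared disks can also be cloned from the ESXi shell instead of the Web Client, which makes it easy to keep the copies eager-zeroed thick. A minimal sketch with hypothetical datastore and file names:

    # Clone a shared disk to the new datastore as eager-zeroed thick (Step 3)
    vmkfstools -i /vmfs/volumes/old_datastore/VM1/VM1_shared_1.vmdk \
               /vmfs/volumes/new_datastore/VM1/VM1_shared_1.vmdk \
               -d eagerzeroedthick

    # Register the cloned VM on the new cluster (Step 7)
    vim-cmd solo/registervm /vmfs/volumes/new_datastore/VM1/VM1.vmx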

Both of these options are viable. The second one has the advantage that the original copies remain available as a fallback if there is an issue with the transfer between datastores.

Even though this process was done for a Nutanix engagement, the same process can be used in any environment.
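
Once the VMs are powered back on, it is worth confirming that Oracle Clusterware and the database instances came back cleanly before handing the cluster over. A quick check from any node, assuming GRID_HOME points at the Grid Infrastructure install and ORCL is a hypothetical database name:

    # Verify clusterware health across all nodes
    $GRID_HOME/bin/crsctl check cluster -all

    # Confirm the RAC database instances are running on both nodes
    $GRID_HOME/bin/srvctl status database -d ORCL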