Thursday, May 27, 2010

Testing a DR with DataGuard

I recently performed a DR test using a DataGuard database, but because it was a test the users wanted to leave the Primary (PROD) database active. I explained that this wasn't a DR test in that case, and that the process and commands we would use for the test would not be relevant for a real DR. They accepted this, and I decided that I'd set up my own DataGuard environment so I could test a 'real' DR scenario. In the meantime, these are the steps I performed to allow the standby database to be opened for the users to be able to connect while leaving the primary production database available for normal business.

This DataGuard environment is using a DataGuard Broker.

For brevity, I'll just include the steps, and not all of the commands.

Set the Oracle environmment and connect to the DataGuard Broker on the primary server:
# dgmgrl
DGMGRL> connect /
Connected

DGMGRL> show configuration

The last line of this command should show this:

Current status for "DGPRI_DATAGUARD":
SUCCESS

Disable the configuration (disable configuration;).

Once this is done, alert log messages will appear regarding loss of connectivity to the standby, this is expected and nothing to worry about:

ARC1: Attempting destination LOG_ARCHIVE_DEST_1 network reconnect (3113)
ARC1: Destination LOG_ARCHIVE_DEST_1 network reconnect abandoned
PING[ARC1]: Error 3113 when pinging standby (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=10.10.10.10)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=DGSBY_XPT.SYSDOMAIN.LOCAL)(INSTANCE_NAME=DGSBY)(SERVER=dedicated))).

I then checked that the Primary database was OK and that users were still logged in.

On the standby database, you need to issue a command to stop the log apply:

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH FORCE;

To allow the database to open in READ WRITE mode, issue this command:

SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;

You then need to stop and restart the standby database.

I checked the primary database to make sure that this command hadn't done something mad like stop and restart the database in standby mode (always best to check), it was fine.

That was basically it - disable the configuration, issue a couple of commands on the standby and then stop and restart the standby. This allowed the testers to pretend the standby was now the Production database for DR purposes, but also allowed the real production database to remain active.

After the test was complete I rebuilt the standby (basically a hot backup and restore of the primary, then running the relevant commands and enabling the broker configuration).

My next test will be to stop my primary test database untidily and start my standby, pretending it is a real loss of Production. I'll update data, then switch it back. Hopefully it all hangs together and this is what would happen in a 'real' DR.