Wednesday, October 23, 2013

OEM12C Alert Log Filtering

I'm setting up OEM12C and have had some "fun" with the Generic Alert Log Error filtering.

When you turn this on, you start to get lots of emails about errors that you don't need or want to see, or that you already know about and don't want to be mailed about every couple of hours.

This post explains how to use the filtering, because I didn't find the online documentation very helpful.

Log in to OEM and click on "Enterprise / Monitoring / Monitoring Templates".

Select the relevant template from the list and click on "Edit".

Click on the "Metric Thresholds" tab, then on the icon of the 3 pencils on the "Generic Alert Log Error" line to edit it.

This is the default filter string:

.*ORA-0*(54|1142|1146)\D.*

But I also want to filter out "ORA-00060: Deadlock detected" alerts.

This is actually quite straightforward; this is the new string that does it:

.*ORA-0*(54|1142|1146|0060)\D.*

So put that in the field and click on "Continue" and then "OK" to save it.
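
As a quick sanity check of that change, here is a small Python sketch that runs the default and the new filter string against a deadlock message. It assumes Python's re module treats these particular patterns the same way OEM does, and the sample line is made up for illustration, so treat it as a rough check rather than proof.

```python
import re

DEFAULT_FILTER = r".*ORA-0*(54|1142|1146)\D.*"
NEW_FILTER = r".*ORA-0*(54|1142|1146|0060)\D.*"

# Hypothetical alert log line, for illustration only.
SAMPLE = "ORA-00060: Deadlock detected"

def is_filtered(pattern, line):
    """Return True if the filter expression matches somewhere in the line."""
    return re.search(pattern, line) is not None

print(is_filtered(DEFAULT_FILTER, SAMPLE))  # False: the default filter lets it through
print(is_filtered(NEW_FILTER, SAMPLE))      # True: the new filter suppresses it
```

Because the pattern already has "0*" in front of the group, adding just "60" instead of "0060" would match the same message.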

I also wanted to filter these:

"ORA-00600: internal error code, arguments: [KSSRMP1], [], [], [], [], [], [], []~ORA-03203: concurrent update activity makes space analysis impossible"

The example given on the page explains how to do this, but it doesn't tell you that you need to add the "|" separator between the expressions. So this is the string needed to add that to the filter:

.*ORA-0*(54|1142|1146|0060)\D.*|.*ORA-00600:.*\[KSSRMP1[^\]]*\].*
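
To show why the "|" matters, here is a rough Python sketch that builds the filter from its two halves, once with the separator and once without. Again this assumes Python's re module behaves like OEM for these patterns. The deadlock line is made up; the two ORA-00600 lines are the ones used in the test commands in this post.

```python
import re

FIRST = r".*ORA-0*(54|1142|1146|0060)\D.*"
SECOND = r".*ORA-00600:.*\[KSSRMP1[^\]]*\].*"

WITH_SEPARATOR = FIRST + "|" + SECOND   # the filter string shown above
WITHOUT_SEPARATOR = FIRST + SECOND      # what you get if the "|" is left out

DEADLOCK = "ORA-00060: Deadlock detected"
KSSRMP1 = ("ORA-00600: internal error code, arguments: [KSSRMP1], [], [], [], "
           "[], [], [], []~ORA-03203: concurrent update activity makes space "
           "analysis impossible")
OTHER_600 = "ORA-00600: internal error code, arguments: [SOMETHING]"

def is_filtered(line, pattern):
    """Return True if the filter expression matches somewhere in the line."""
    return re.search(pattern, line) is not None

for line in (DEADLOCK, KSSRMP1, OTHER_600):
    print(is_filtered(line, WITH_SEPARATOR),
          is_filtered(line, WITHOUT_SEPARATOR),
          line[:45])
```

With the separator, the first two lines are filtered and the third is not. Without it, the two halves become one long expression that needs both errors in the same line, so none of the three lines match.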

Something else that it doesn't mention: when a change is made to a monitoring template, it doesn't take effect until the monitoring is synchronised.
This can be forced: go to "Enterprise / Monitoring / Template Collections" and click on the "Associations" tab.
Click on the link next to the "Name" and, on the page that comes up, click on "Start Synchronization". It will take a few minutes.

You can test any filters using the alerttest.pl script:

perl alerttest.pl ".*ORA-0*(54|1142|1146)\D.*|.*ORA-00600:.*\[KSSRMP1[^\]]*\].*" "ORA-[0-9]+[^0-9]" "ORA-0*(600?|7445)[^0-9]" "ORA-00600: internal error code, arguments: [KSSRMP1], [], [], [], [], [], [], []~ORA-03203: concurrent update activity makes space analysis impossible"
The error would not be collected by the Alert Log Metric as it would be filtered

perl alerttest.pl ".*ORA-0*(54|1142|1146)\D.*|.*ORA-00600:.*\[KSSRMP1[^\]]*\].*" "ORA-[0-9]+[^0-9]" "ORA-0*(600?|7445)[^0-9]" "ORA-00600: internal error code, arguments: [SOMETHING]"
This error would raise a Critical Alert

This is the alerttest.pl script:

use strict;
use warnings;

# Arguments: filter expression, warning threshold, critical threshold, alert log line.
my ($filter_expression, $warning_threshold, $critical_threshold, $error) = @ARGV;

# Test for an empty expression first: an empty pattern in Perl does not mean "match nothing".
if ($filter_expression ne "" && $error =~ /$filter_expression/) {
    print "The error would not be collected by the Alert Log Metric as it would be filtered\n";
} elsif ($critical_threshold ne "" && $error =~ /$critical_threshold/) {
    print "This error would raise a Critical Alert\n";
} elsif ($warning_threshold ne "" && $error =~ /$warning_threshold/) {
    print "This error would raise a Warning Alert\n";
} else {
    print "This line will not raise an alert\n";
}
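
If you want to check a whole batch of lines in one go, the same logic is easy to port. This is only a sketch in Python of what the script above does (filter first, then critical, then warning). The function name and the return values are my own, and it assumes Python's re module agrees with Perl for these expressions.

```python
import re

def classify(filter_expr, warning, critical, line):
    """Same order of checks as alerttest.pl: filter, critical, warning."""
    if filter_expr and re.search(filter_expr, line):
        return "filtered"
    if critical and re.search(critical, line):
        return "critical"
    if warning and re.search(warning, line):
        return "warning"
    return "none"

# Expressions taken from the test commands in this post.
FILTER = r".*ORA-0*(54|1142|1146)\D.*|.*ORA-00600:.*\[KSSRMP1[^\]]*\].*"
WARNING = r"ORA-[0-9]+[^0-9]"
CRITICAL = r"ORA-0*(600?|7445)[^0-9]"

# The first line is from this post; the other two are made-up examples.
LINES = [
    "ORA-00600: internal error code, arguments: [SOMETHING]",
    "ORA-01555: snapshot too old",
    "Thread 1 advanced to log sequence 42",
]

for line in LINES:
    print(classify(FILTER, WARNING, CRITICAL, line), "-", line)
```

An empty expression is simply skipped, which is how I read the intent of the empty-string checks in the Perl version.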

More information on alert log monitoring can be found in the MOS note:

Database Alert log monitoring in 12c explained (Doc ID 1538482.1)

Tuesday, September 3, 2013

Configuring Data Guard with a Broker with a non-default port

My current site has a number of databases that are using Data Guard, but have not been set up with the broker, and I've been asked to set them up to use it.

Simple, I thought.

Ha.

The actual setup is quite straightforward if the databases use port 1521. I followed this Oracle Support document: Step By Step How to Recreate Dataguard Broker Configuration (Doc ID 808783.1).

However, when I ran 'show configuration', 'show database verbose "ORCLP"' and 'show database verbose "ORCLS"', I got the following errors:

Warning: ORA-16607: one or more databases have failed

Error: ORA-16778: redo transport error for one or more databases

Error: ORA-12541: TNS:no listener

After poking around for a while I decided to see if the configuration could be created with OEM (V12C), so I logged in there and went to the "Availability / DataGuard Administration" screen.

When going through the process, it popped up a message to the effect "you're not using the default port, you need to set up the local_listener parameter".

So I ran these commands on the databases:

Primary:
alter system set local_listener='(ADDRESS=(PROTOCOL=tcp)(HOST=primary)(PORT=1528))';

Standby:
alter system set local_listener='(ADDRESS=(PROTOCOL=tcp)(HOST=standby)(PORT=1528))';
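
Since ORA-12541 means nothing answered at the address that was tried, it can be worth confirming that something is actually listening on the non-default port on each host. Here is a minimal Python sketch for that. The function name is my own, and a successful TCP connection only shows that a process is listening there, not that it is an Oracle listener or that the database has registered with it.

```python
import socket

def listener_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable host names.
        return False
```

For the setup in this post you would call listener_reachable("primary", 1528) and listener_reachable("standby", 1528) from each server and expect True both times.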

I also had to set the LocalListenerAddress property for both databases, from DGMGRL on the primary (each database gets the address of its own host):

DGMGRL> disable configuration

edit database "ORCLP" set property LocalListenerAddress='(ADDRESS=(PROTOCOL=tcp)(HOST=primary)(PORT=1528))';

edit database "ORCLS" set property LocalListenerAddress='(ADDRESS=(PROTOCOL=tcp)(HOST=standby)(PORT=1528))';

enable configuration

After a few minutes it all fell into place. Check the alert log of the standby and you can see the logs being applied; "show configuration" and "show database" in DGMGRL should also show "SUCCESS":

DGMGRL> show configuration

Configuration
  Name:                ORCLDGB
  Enabled:             YES
  Protection Mode:     MaxPerformance
  Fast-Start Failover: DISABLED
  Databases:
    ORCLP - Primary database
    ORCLS - Physical standby database

Current status for "ORCLDGB":
SUCCESS