Wednesday, October 23, 2013

OEM12C Alert Log Filtering

I'm setting up OEM12C and have had some "fun" with the Generic Alert Log Error filtering.

When you turn this on you'll start to get lots of emails about errors that you don't need or want to see, or that you already know about and don't want to be mailed about every couple of hours.

This post explains how to use the filtering, because I didn't find the online documentation very helpful.

Log in to OEM and click on "Enterprise / Monitoring / Monitoring Templates":

Select the relevant template from the list and click on "Edit".

Click on the "Metric Thresholds" tab, then on the icon of the 3 pencils on the "Generic Alert Log Error" line to edit it.

This is the default filter string:

.*ORA-0*(54|1142|1146)\D.*

But I also want to filter out "ORA-00060 Deadlock detected" alerts.

This is quite straightforward. The new string adds 0060 to the list of error codes:

.*ORA-0*(54|1142|1146|0060)\D.*

So put that in the field and click on "Continue" and then "OK" to save it.
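
Before saving, it can be worth checking what the new string does and doesn't match. The following is only a rough sketch: the sample messages are shortened, illustrative lines rather than real alert log output, and is_filtered is a helper name made up for this example.

```perl
use strict;
use warnings;

# Filter string from above, with 0060 added for the deadlock alerts.
my $filter = qr/.*ORA-0*(54|1142|1146|0060)\D.*/;

# Hypothetical helper: returns 1 if the filter matches the line, else 0.
sub is_filtered {
    my ($line) = @_;
    return $line =~ $filter ? 1 : 0;
}

# Illustrative sample messages (not copied from a real alert log).
my @samples = (
    'ORA-00060: Deadlock detected',
    'ORA-00054: resource busy',
    'ORA-00600: internal error code, arguments: [SOMETHING]',
    'ORA-01555: snapshot too old',
);

for my $line (@samples) {
    my $verdict = is_filtered($line) ? 'filtered' : 'kept';
    print "$verdict: $line\n";
}
```

The first two samples come out as "filtered" and the last two as "kept". The trailing \D matters here: the error number has to be followed by a non-digit, which is what stops the 0060 entry from also swallowing ORA-00600.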

I also wanted to filter these:

"ORA-00600: internal error code, arguments: [KSSRMP1], [], [], [], [], [], [], []~ORA-03203: concurrent update activity makes space analysis impossible"

The example given on the page explains how to do this, but it doesn't tell you that you need to add the "|" separator between expressions. So this is the string needed to add that to the filter:

.*ORA-0*(54|1142|1146|0060)\D.*|.*ORA-00600:.*\[KSSRMP1[^\]]*\].*
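
If the filter keeps growing, it may be easier to keep each expression on its own and join them with "|" when you need the full string. This is just a sketch of that idea, not something OEM provides; the @branches and $filter names are my own.

```perl
use strict;
use warnings;

# Each entry is one complete filter expression. Single quotes keep the
# backslashes exactly as typed.
my @branches = (
    '.*ORA-0*(54|1142|1146|0060)\D.*',      # error codes to ignore
    '.*ORA-00600:.*\[KSSRMP1[^\]]*\].*',    # ORA-00600 with an argument starting with KSSRMP1
);

# Join with "|" so a message is filtered when ANY expression matches.
my $filter = join '|', @branches;
print "$filter\n";

# The message from above that should be filtered.
my $msg = 'ORA-00600: internal error code, arguments: [KSSRMP1], [], [], [], [], [], [], []~ORA-03203: concurrent update activity makes space analysis impossible';
print "filtered\n" if $msg =~ /$filter/;
```

The first line printed is the string to paste into the filter field, and it is the same as the one shown above.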

Something else that it doesn't mention: when a change is made to the monitoring templates, it won't come into effect until the monitoring is synchronised.
This can be forced - go to "Enterprise / Monitoring / Template Collections" and click on the "Associations" tab.
Click on the link next to the "Name" and, on the page that comes up, click on "Start Synchronization". It will take a few minutes.

You can test any filter using the alerttest.pl script. The arguments are the filter expression, the warning threshold, the critical threshold and the error text, in that order:

perl alerttest.pl ".*ORA-0*(54|1142|1146)\D.*|.*ORA-00600:.*\[KSSRMP1[^\]]*\].*" "ORA-[0-9]+[^0-9]" "ORA-0*(600?|7445)[^0-9]" "ORA-00600: internal error code, arguments: [KSSRMP1], [], [], [], [], [], [], []~ORA-03203: concurrent update activity makes space analysis impossible"
The error would not be collected by the Alert Log Metric as it would be filtered

perl alerttest.pl ".*ORA-0*(54|1142|1146)\D.*|.*ORA-00600:.*\[KSSRMP1[^\]]*\].*" "ORA-[0-9]+[^0-9]" "ORA-0*(600?|7445)[^0-9]" "ORA-00600: internal error code, arguments: [SOMETHING]"
This error would raise a Critical Alert

This is the alerttest.pl script:

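use strict;
use warnings;

# Arguments: filter expression, warning threshold, critical threshold, error text.
my ($filter_expression, $warning_threshold, $critical_threshold, $error) = @ARGV;

# Each expression is checked for being non-empty before it is used as a pattern.
# The filter is tested first, then the critical threshold, then the warning threshold.
if ($filter_expression ne "" && $error =~ /$filter_expression/) {
    print "The error would not be collected by the Alert Log Metric as it would be filtered\n";
} elsif ($critical_threshold ne "" && $error =~ /$critical_threshold/) {
    print "This error would raise a Critical Alert\n";
} elsif ($warning_threshold ne "" && $error =~ /$warning_threshold/) {
    print "This error would raise a Warning Alert\n";
} else {
    print "This line will not raise an alert\n";
}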

More information on alert log monitoring can be found in the MOS note:

Database Alert log monitoring in 12c explained (Doc ID 1538482.1)