Tuesday, September 22, 2015

Creating a Database Monitoring Dashboard in APEX

This was copied and pasted from MS Word, so apologies for the formatting.

This is where to get all the files : http://1drv.ms/1OtWys4

The Dashboard

A lot of our clients are small, run Standard Edition or have only a few databases, so they don't install Cloud Control. This means virtually all of the monitoring is performed using cron jobs, scripts and emails.
While this is usually good enough, I wanted to create a dashboard that showed the status of the databases. It evolved as I thought of things I'd like to see, and of what a client might like to see.
So what started as a page to show whether the database was up turned into this:

[screenshot of the finished dashboard]

Because this evolved as I went along rather than being designed, there are probably some decisions I made that may not be the best – so you may want to read through this and re-think it yourself.


Assumptions:

You have Oracle installed.
You know how to manage and administer Oracle.
You have APEX (at least V4) installed and know how to manage it (ie you can create users and workspaces, and have a bit of development knowledge).
You know shell scripting.

How it works

There is a repository database which houses all the dashboard tables and the APEX application.
There are Unix shell scripts (they can be Windows batch jobs if that's your platform), scheduled by cron (or Windows Task Scheduler), which connect to each database using a tnsnames.ora file and run SQL to collect the data. The data is written to flat files on the host where the monitoring database sits, and SQL Loader jobs then load the data into the dashboard tables.

If, for some reason, you cannot connect to the databases using tnsnames, then an alternative would be to run the scripts locally and ftp or scp the flat files to the host that is running the repository database, but that is more complex and you have less control.

One thing to note is that this is not a live display – that would require more complexity and would place load on the databases. This is not an agent-based monitor, and it's not Cloud Control. It's designed to give an overview of the databases so that the DBA can log in and take a closer look if necessary, and to store some historical information.

The scripts are scheduled to run at different times depending on how fresh you want the data – the most frequent run every 20 minutes to collect information on locks, the alert log, session details and so on. You can schedule them as often as you think necessary, but be cautious about the load you might cause. Although the scripts are fairly benign, remember that you are probably running them against a production database, so be careful – and of course, make sure you follow the site's change control procedures.
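As a sketch, the schedule in the oracle user's crontab might look something like this (the wrapper names follow the get_hourly.sh / get_daily.sh pattern used later in this post; get_20min.sh is an assumed name for the 20-minute wrapper, and the times are just examples):

```
# Assumed crontab entries for the monitoring host - adjust paths and names
*/20 * * * * /home/oracle/mga/akh/dashboard/script/get_20min.sh  > /dev/null 2>&1
0 * * * *    /home/oracle/mga/akh/dashboard/script/get_hourly.sh > /dev/null 2>&1
30 6 * * *   /home/oracle/mga/akh/dashboard/script/get_daily.sh  > /dev/null 2>&1
```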

The data I am collecting

This is what I gather; you may decide you want other information, or that you don't need some of these, but I think it's a good base.

Account Status - shows if any application accounts are locked.
Alert Log - (11gR2 and above) shows ORA- errors.
Database Waits - charts database wait events.
File I/O - charts file read and write history.
DDL Changes - shows application objects that have changed.
File Latency - charts I/O latency (not yet shown on the dashboard).
Host CPU - charts host CPU usage (not yet shown on the dashboard).
Host Details - shows OS version, database version and hostname.
Host Memory - charts host memory usage (not yet shown on the dashboard).
Host Space - charts host space history (not yet shown on the dashboard).
Invalid Objects - shows invalid application objects.
Locks - shows blocking locks.
Long Operations - shows any long-running operations (LONGOPS).
Backups - shows RMAN backup status and information.
Sessions - charts and lists database sessions.
Stale Stats - lists objects that have stale statistics.
Tablespace Space - charts tablespace space with history.
Transactions - charts transactions by hour.
Uptime - shows when the database was started.

Some of these have data in the tables but I haven't got around to creating a page in APEX. By setting them up and collecting the data, I'll have the history available when I eventually get around to it.

The tables

I'll upload the DDL for all the objects, but this is a sample:

DASHBOARD_ACCOUNT_STATUS

CREATE TABLE MONITOR.DASHBOARD_ACCOUNT_STATUS
   (   DB_NAME VARCHAR2(20 BYTE),
       USERNAME VARCHAR2(50 BYTE),
       ACCOUNT_STATUS VARCHAR2(20 BYTE),
       SAMPLE_DATE DATE DEFAULT sysdate NOT NULL ENABLE
   ) SEGMENT CREATION IMMEDIATE
  PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255
 NOCOMPRESS LOGGING
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  TABLESPACE MONITOR_DAT ;

So, the first thing to do is create a tablespace, and a schema. Because this is in a database I've set up specifically, I give the schema DBA access so I don't have to worry about those pesky permissions. If you need to create this in an existing database you need to sort out the permissions yourself, but the schema shouldn't need DBA.

The common columns in all the tables are the DB_NAME and the SAMPLE_DATE columns – these allow you to drill down in the APEX application. The other columns are determined by the sort of information you're collecting.

These are the tables I have; you may not want them all, or you may want to create your own.

DASHBOARD_ACCOUNT_STATUS
DASHBOARD_ALERT_LOG
DASHBOARD_ALERT_LOG_HISTORY
DASHBOARD_DATABASE_WAITS
DASHBOARD_DB_FILE_IO
DASHBOARD_DB_FILE_IO_HISTORY
DASHBOARD_DDL_CHANGES
DASHBOARD_FILE_LATENCY
DASHBOARD_HOST_CPU
DASHBOARD_HOST_CPU_HISTORY
DASHBOARD_HOST_DETAILS
DASHBOARD_HOST_INFORMATION
DASHBOARD_HOST_MEM
DASHBOARD_HOST_SPACE
DASHBOARD_INVALID_OBJECTS
DASHBOARD_LOCK_DETAILS
DASHBOARD_LOCK_HISTORY
DASHBOARD_LONGOPS
DASHBOARD_RMAN_BACKUPS
DASHBOARD_SESSIONS_HISTORY
DASHBOARD_SESSIONS_HWM
DASHBOARD_SESSION_DETAILS
DASHBOARD_STALE_STATS
DASHBOARD_TBS_SPACE
DASHBOARD_TRANSACTIONS
DASHBOARD_TRANSACTIONS_HISTORY
DASHBOARD_UPTIME

I created a couple of history tables, the idea being to move old data out of the current ones, but I haven't got round to it yet.

How the data is collected

The data is collected by shell scripts that connect to each database in the tnsnames.ora, run SQL statements and spool the results to flat files; the flat files are then loaded into the database using SQL Loader.
A cron job runs a wrapper script at each frequency required. The wrapper contains calls to the data-gathering scripts. This one runs every hour:

#!/bin/sh
#
# Get data each hour
#

export TNS_ADMIN=/home/oracle/mga/akh
export ORACLE_HOME=/opt/oracle/product/11.2.0.3/db1
export ORACLE_SID=MONITOR
export PATH=$ORACLE_HOME/bin:$PATH

for i in AWATST EAREPTST AWS1DEV AWAUAT AREGUAT AREGSUP AWASUP AWAPRD AREGPRD
do
 /home/oracle/mga/akh/dashboard/script/get_io.sh $i
 /home/oracle/mga/akh/dashboard/script/get_trans.sh $i
 /home/oracle/mga/akh/dashboard/script/get_backup.sh $i
 /home/oracle/mga/akh/dashboard/script/get_waits.sh $i
 /home/oracle/mga/akh/dashboard/script/get_uptime.sh $i
 /home/oracle/mga/akh/dashboard/script/get_accounts.sh $i
done

Note that the database names are hard-coded here and are used as input into the scripts. You could create a parameter file, or use a sed / awk process to read the tnsnames.ora instead.
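As a sketch of the awk approach: this assumes each alias starts a line in the form "ALIAS =", so adjust the pattern to your own tnsnames.ora layout. The demo entries below are only there so the sketch runs stand-alone; point TNSNAMES at the real file.

```shell
#!/bin/sh
# Sketch: derive the database list from tnsnames.ora instead of hard-coding it.
# Assumes each alias is defined at the start of a line as "ALIAS =".
TNSNAMES=${TNSNAMES:-/tmp/tnsnames_demo.ora}

# Demo entries so the sketch is self-contained - remove for real use
cat > "$TNSNAMES" <<'TNS'
AWATST =
  (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = host1)(PORT = 1521)))
AREGPRD =
  (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = host2)(PORT = 1521)))
TNS

# Pull out every alias defined at the start of a line
DB_LIST=$(awk '/^[A-Za-z][A-Za-z0-9_]*[ \t]*=/ {sub(/[ \t]*=.*/,""); print}' "$TNSNAMES")

for i in $DB_LIST
do
  echo "$i"    # here you would call the get_*.sh scripts with $i
done
```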

This is one of the scripts called by the cron job – note that I use the SYSTEM user to connect. If you can't do that, you will need to create a user on the target database(s) with the correct permissions.

#!/bin/bash
# Script to get account status
# Andy Horne 25/07/2015
#
export ORACLE_HOME=/opt/oracle/product/11.2.0.3/db1
export ORACLE_SID=$1
export PATH=$ORACLE_HOME/bin:$PATH

sqlplus -S system/xxxxxx@${ORACLE_SID} <<EOF
set pages 0 feed off verify off
spool /home/oracle/mga/akh/dashboard/script/output/${ORACLE_SID}_accounts.out
SELECT '$ORACLE_SID',
username, account_status
from dba_users where username in ('DFC_RO_USER',
'AWRIS2',
'ARCHREPO',
'DFC_USER',
'DSTFIX',
'DWH_COMMON',
'DWH_ETL_USER',
'DWH_INTEGRATION',
'DWH_MAPPING',
'DWH_RO_USER',
'DWH_STAGING',
'DWH_TEMPORAL',
'EAREPTST',
'REGISTRY',
'WADI',
'WADI2',
'WDM')
order by 1 asc;
EOF

#
# Load the data
/home/oracle/mga/akh/dashboard/script/load_accounts.sh ${ORACLE_SID}

So in the cron job script (get_hourly.sh), the variable 'i' is assigned each database name in turn (eg AREGPRD). This variable is fed into the ORACLE_SID variable in the called script (get_accounts.sh), so get_accounts.sh runs once per database, connects to it, and spools the output to a file prefixed with the database name (eg AREGPRD_accounts.out).
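Stripped of the Oracle calls, the pattern is just a driver loop feeding a worker that writes one file per database. A minimal, self-contained sketch (the directory and file contents are demo assumptions, not the real scripts):

```shell
#!/bin/sh
# Minimal sketch of the driver/worker pattern described above.
OUTDIR=${OUTDIR:-/tmp/dashboard_demo}
mkdir -p "$OUTDIR"

# worker plays the role of get_accounts.sh: it takes the SID as $1
# and writes a per-database output file
worker() {
  SID=$1
  echo "$SID DEMO_USER OPEN" > "$OUTDIR/${SID}_accounts.out"
}

# the driver plays the role of get_hourly.sh
for i in AREGPRD AWAPRD
do
  worker "$i"
done
```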

This file gets read in by the sql loader job which looks like this:

#!/bin/sh
# Load the data into the database
export ORACLE_HOME=/opt/oracle/product/11.2.0.3/db1
export ORACLE_SID=MONITOR
export PATH=$ORACLE_HOME/bin:$PATH
export DBNAME=$1

#
# Load the data
#
sqlldr monitor/pwd_goes_here control=/home/oracle/mga/akh/dashboard/script/load_accounts.ctl data=/home/oracle/mga/akh/dashboard/script/output/${DBNAME}_accounts.out

Note that sqlldr does not expand shell variables inside a control file, so the input file is passed on the command line via the DATA parameter, which overrides the INFILE clause in the control file.

This is the control file for that job:
--
-- SQL Loader control file to load account status data into the monitor.dashboard_account_status table
--
-- Andy Horne 25/07/2014
--
LOAD DATA
infile '/home/oracle/mga/akh/dashboard/script/output/${DBNAME}_accounts.out'
append
into table DASHBOARD_ACCOUNT_STATUS
trailing nullcols
(db_name terminated by whitespace,
username terminated by whitespace,
account_status terminated by whitespace)

And this is what the data file looks like:

AREGPRD REGISTRY                       OPEN
AREGPRD DWH_ETL_USER                   OPEN
AREGPRD DWH_RO_USER                    OPEN
AREGPRD DFC_RO_USER                    OPEN
AREGPRD DFC_USER                       OPEN

The sample_date column exists in all the tables, but it defaults to sysdate so doesn't need to be loaded.
This is what the data in the table looks like:

AREGPRD       DFC_USER      OPEN   21/SEP/15
AREGPRD       DFC_RO_USER   OPEN   21/SEP/15
AREGPRD       DWH_RO_USER   OPEN   21/SEP/15
AREGPRD       DWH_ETL_USER  OPEN   21/SEP/15
AREGPRD       REGISTRY      OPEN   21/SEP/15
AWAPRD        WDM           OPEN   21/SEP/15
AWAPRD        DWH_COMMON    OPEN   21/SEP/15
AWAPRD        DWH_TEMPORAL  OPEN   21/SEP/15

And this is what the dashboard region looks like:

[screenshot of the Account Status dashboard region]

I'll upload all the scripts, control files and sample data.

The APEX Application

I found it best to get the scripts working and loading data before turning to APEX – it's a lot easier to design the APEX application if you have decent data.
So, assuming you have the data, you can start to create the application.
I'll upload my application, but you can design yours from scratch if you like, depending on what you want on the front page.

In my case, I have this:

Date and time on hosts (my current site uses UTC on the hosts, which throws out a lot of the time stats, so I have this to be able to see what is happening when).

Database information – dbname, host name, startup time. Note that I hard-code the host and DB names because they're only needed once – I type the information into the table manually.

Locked Accounts

Database Locks

Database Session Count

Alert Log Errors

DDL Changes

Invalid Objects Count

RMAN Backup Status

Long Operations

Tablespaces over 50% full

Most of these have links or buttons which drill-down to other pages. It looks like this:

[screenshot of the dashboard front page]

Let's get started

So assuming you have a database that you can use as a repository, we can start.
Set the environment to your database (. oraenv).

Run the cr_monitor.sql to create the tablespace, user and tables – edit the path to the directory as required:

SQL> @cr_monitor.sql
SQL> CREATE TABLESPACE MONITOR_DAT DATAFILE
  2    '/data/oracle/db07/oradata/MONITOR/monitor_dat01.dbf' SIZE 524288000
  3    LOGGING ONLINE PERMANENT BLOCKSIZE 8192
  4    EXTENT MANAGEMENT LOCAL AUTOALLOCATE DEFAULT
  5   NOCOMPRESS  SEGMENT SPACE MANAGEMENT AUTO;

Tablespace created.

SQL>
SQL> CREATE USER MONITOR IDENTIFIED BY CHANGE_PASSWORD
  2   DEFAULT TABLESPACE MONITOR_DAT TEMPORARY TABLESPACE TEMP
  3    QUOTA UNLIMITED ON MONITOR_DAT;

User created.

SQL>
SQL> GRANT DBA TO MONITOR;

Grant succeeded.

SQL>
SQL> CREATE TABLE MONITOR.DASHBOARD_DB_FILE_IO
  2     (    DB_NAME VARCHAR2(50 BYTE),
  3          METRIC VARCHAR2(20 BYTE),
  4          METRIC_VALUE NUMBER,
  5          SAMPLE_DATE DATE DEFAULT sysdate NOT NULL ENABLE
  6     ) PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255
  7   NOCOMPRESS LOGGING
  8    STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  9    PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
 10    BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
 11    TABLESPACE MONITOR_DAT ;

Table created.

SQL>
SQL> CREATE TABLE MONITOR.DASHBOARD_DB_FILE_IO_HISTORY
  2     (    DB_NAME VARCHAR2(50 BYTE),
  3          METRIC VARCHAR2(20 BYTE),
  4          METRIC_VALUE NUMBER,
  5          SAMPLE_DATE DATE DEFAULT sysdate NOT NULL ENABLE
  6     ) PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255
  7   NOCOMPRESS LOGGING
  8    STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  9    PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
 10    BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
 11    TABLESPACE MONITOR_DAT ;

Table created.

...
(and so on for the remaining tables)
...
SQL> exit

So we now have tables, but no data.
The shell scripts assume this is the path to the output, so amend accordingly:

/home/oracle/mga/akh/dashboard/script/output

Copy all the shell scripts into a directory, and edit the 'get_all_sql.sh' script and uncomment all the commented scripts:

#!/bin/sh
#
# Get all data
#

export TNS_ADMIN=/home/oracle/mga/akh
export ORACLE_HOME=/opt/oracle/product/11.2.0.3/db1
export ORACLE_SID=MONITOR
export PATH=$ORACLE_HOME/bin:$PATH

for i in AWATST EAREPTST AWS1DEV AWAUAT AREGUAT AREGSUP AWASUP AWAPRD AREGPRD
do
/home/oracle/mga/akh/dashboard/script/get_backup.sh $i
/home/oracle/mga/akh/dashboard/script/get_cpu.sh $i
/home/oracle/mga/akh/dashboard/script/get_io.sh $i
/home/oracle/mga/akh/dashboard/script/get_latency.sh $i
/home/oracle/mga/akh/dashboard/script/get_locks.sh $i
/home/oracle/mga/akh/dashboard/script/get_sessions.sh $i
/home/oracle/mga/akh/dashboard/script/get_tablespace.sh $i
/home/oracle/mga/akh/dashboard/script/get_trans.sh $i
/home/oracle/mga/akh/dashboard/script/get_session_details.sh $i
/home/oracle/mga/akh/dashboard/script/get_longops.sh $i
/home/oracle/mga/akh/dashboard/script/get_uptime.sh $i
/home/oracle/mga/akh/dashboard/script/get_stale_stats.sh $i
/home/oracle/mga/akh/dashboard/script/get_alert_log.sh $i
/home/oracle/mga/akh/dashboard/script/get_invalid.sh $i
/home/oracle/mga/akh/dashboard/script/get_accounts.sh $i
/home/oracle/mga/akh/dashboard/script/get_sessions_hwm.sh $i
/home/oracle/mga/akh/dashboard/script/get_waits.sh $i
/home/oracle/mga/akh/dashboard/script/get_schema_diffs.sh $i
/home/oracle/mga/akh/dashboard/script/get_ddl_changes.sh $i
done

Note that you will obviously need to edit this to reflect the database names and the paths. You also need to change the TNS_ADMIN line or remove it, depending on whether you use the existing tnsnames.ora file or create your own.
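Rather than editing every script by hand, one option is a quick sed pass over the copied scripts. A sketch, using example OLD/NEW paths and a throwaway demo file so it runs stand-alone (point SCRIPT_DIR at your real script directory and drop the demo line):

```shell
#!/bin/sh
# Sketch: bulk-edit the hard-coded base path in the scripts in one pass.
OLD=/home/oracle/mga/akh/dashboard/script
NEW=/u01/app/oracle/dashboard/script
SCRIPT_DIR=${SCRIPT_DIR:-/tmp/dashboard_path_demo}

# Demo file so the sketch is self-contained - remove for real use
mkdir -p "$SCRIPT_DIR"
printf '%s/get_io.sh $i\n' "$OLD" > "$SCRIPT_DIR/get_hourly.sh"

# Rewrite OLD to NEW in every shell script in the directory
for f in "$SCRIPT_DIR"/*.sh
do
  sed "s|$OLD|$NEW|g" "$f" > "$f.tmp" && mv "$f.tmp" "$f"
done
```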

You will also need to change all the 'get_' shell scripts to reflect the system password or change the user to one that has permission to view the data dictionary tables on the target databases. I didn't say this was easy.

You may want to limit the first run to one database to make sure all the scripts are OK, or only uncomment one 'get_' script at a time.

So, assuming all these run through OK, you should have files in the output directory and data in the tables:

-rw-r--r-- 1 oracle oinstall  5474 Sep 21 04:59 AWAPRD_session_details.out
-rw-r--r-- 1 oracle oinstall    81 Sep 21 05:00 AWAPRD_sessions_hwm.out
-rw-r--r-- 1 oracle oinstall   162 Sep 21 04:59 AWAPRD_sessions.out
-rw-r--r-- 1 oracle oinstall  5957 Sep 21 05:00 AWAPRD_stale_stats.out
-rw-r--r-- 1 oracle oinstall  1215 Sep 21 04:59 AWAPRD_tbsspace.out
-rw-r--r-- 1 oracle oinstall    81 Sep 21 04:59 AWAPRD_trans.out

AREGPRD       DFC_RO_USER   OPEN   21/SEP/15
AREGPRD       DWH_RO_USER   OPEN   21/SEP/15
AREGPRD       DWH_ETL_USER  OPEN   21/SEP/15
AREGPRD       REGISTRY      OPEN   21/SEP/15
AREGPRD       DFC_USER      OPEN   21/SEP/15
AWAPRD        DWH_ETL_USER  OPEN   21/SEP/15
AWAPRD        DWH_RO_USER   OPEN   21/SEP/15
AWAPRD        DWH_TEMPORAL  OPEN   21/SEP/15
AWAPRD        WDM           OPEN   21/SEP/15
AWAPRD        DWH_COMMON    OPEN   21/SEP/15
AWAPRD        DWH_STAGING   OPEN   21/SEP/15
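Before looking at the tables, it's worth a quick check that every database actually produced a non-empty output file. A sketch (OUTDIR and the demo files are assumptions – point OUTDIR at the real output directory and drop the demo lines):

```shell
#!/bin/sh
# Sketch: sanity-check that each database produced non-empty output files.
OUTDIR=${OUTDIR:-/tmp/dashboard_check_demo}

# Demo files so the sketch is self-contained - remove for real use
mkdir -p "$OUTDIR"
echo "AREGPRD DFC_USER OPEN" > "$OUTDIR/AREGPRD_accounts.out"
: > "$OUTDIR/AWAPRD_accounts.out"   # deliberately empty, to show a failure

STATUS=""
for db in AREGPRD AWAPRD
do
  f="$OUTDIR/${db}_accounts.out"
  if [ -s "$f" ]; then
    STATUS="$STATUS $db:OK"
  else
    STATUS="$STATUS $db:EMPTY"
  fi
done
echo "$STATUS"
```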

Now you can import the APEX application. Sign in to APEX as the ADMIN user, and create a workspace and user (called monitor).

Download the dashboard.sql file and import it into APEX.

Run the app and sign in as the monitor user, and hopefully you should see something like this:
[screenshot of the imported application]

Note that the 'Database Information' region shows 'no data found'. This is because that table is not loaded by the scripts but entered manually, since it's static. Use SQL Developer or SQLPlus to enter the information, and then refresh the page.

So that's it really. Hopefully this is enough to get you started, and you can tailor it as you see fit.

This is where to get all the files : http://1drv.ms/1OtWys4

You'll have to wait a few days for some of the charts to populate properly since some are loaded overnight.

Adding databases should be easy – just add them to the tnsnames.ora and the list in the 'get_hourly.sh', 'get_daily.sh' etc scripts:

for i in AWATST EAREPTST AWS1DEV AWAUAT NEW_DB_NAME


You're also not restricted to Oracle – if you can get data into a flat file from other databases then you can load it using SQL Loader.

Although the scripts are for unix, you should be able to adapt them for Windows.

