I'm not an expert tuner, but know the basics and where to look. Yesterday someone came to me with a problem - 2 databases, same data, same query. One was taking 2 seconds (that's good), one was taking 15 seconds (that's bad).
I checked the obvious things first and compared the results between the 2 servers:
Hardware
Database Parameters
Amount of data
Explain Plan.
Did a "top"
Ran awrrpt
The hardware on the bad server (memory and CPU) is actually a lot better on than on the good server:
Bad = 48GB of RAM, and 24 x 2.93GHz CPUs
Good = 12GB of RAM, and 16 x 2.93GHz CPUs
(just goes to show that throwing hardware at a problem isn't always the solution).
The database parameters on the bad server were beefed up to take advantage of the hardware.
The "top" command didn't show any runaway processes, or anything untoward.
The explain plan was the same for both servers, and showed the query using an index.
I created a 5 GB file on both - we've had issues with disk I/O on some of these hosts before. The bad server created the file a lot faster, so I ruled out the earlier issues we experienced.
I ran
"analyze index owner.index_name validate structure;"
then a
"select * from index_stats;"
on both servers.
The result (sorry about the formatting):
Bad:
HEIGHT BLOCKS NAME LF_ROWS LF_BLKS LF_ROWS_LEN LF_BLK_LEN BR_ROWS BR_BLKS BR_ROWS_LEN BR_BLK_LEN DEL_LF_ROWS DEL_LF_ROWS_LEN DISTINCT_KEYS MOST_REPEATED_KEY BTREE_SPACE USED_SPACE PCT_USED ROWS_PER_KEY BLKS_GETS_PER_ACCESS PRE_ROWS PRE_ROWS_LEN OPT_CMPR_COUNT OPT_CMPR_PCTSAVE
3 19272 DBQ_PROCESS_TIME 190128 18768 6004458 5524 18767 103 601529 8028 184802 5765005 190128 1 104501316 6605987 7 1 4 0 0 2 32
Good:
HEIGHT BLOCKS NAME LF_ROWS LF_BLKS LF_ROWS_LEN LF_BLK_LEN BR_ROWS BR_BLKS BR_ROWS_LEN BR_BLK_LEN DEL_LF_ROWS DEL_LF_ROWS_LEN DISTINCT_KEYS MOST_REPEATED_KEY BTREE_SPACE USED_SPACE PCT_USED ROWS_PER_KEY BLKS_GETS_PER_ACCESS PRE_ROWS PRE_ROWS_LEN OPT_CMPR_COUNT OPT_CMPR_PCTSAVE
3 17208 DBQ_PROCESS_TIME 9426 16048 401044 6052 16047 87 544198 8028 9392 399684 9426 1 97820932 945242 1 1 4
The figures from the bad server are a lot higher.
I rebuilt the index on the bad server:
alter index owner.index_name rebuild online parallel 3;
Didn't take long, then re-ran the query - it then ran in 0.01 seconds.
I rebuilt the index the same way on the good server, same result.
I was later told that the application owner did a purge of the table the night before and removed about 10M rows.
If I'd known that before it would have saved me a bit of investigation.