CDC on a database was stopped and restarted twice a day by an
automated process. Apparently this was because it would send false-positive SMS alerts otherwise.
There must be a long-running transaction, because every time
it restarted, it started to mine the same logfile (seq# 252207). This went on
for a couple of days, until eventually this log was deleted by a clean-up job.
Then the CDC failed with this:
State       
: WAITING FOR DICTIONARY REDO: FILE
/apps/oracle/admin/PROD/PROD_arch/PROD_1_708192618_0000252207.arc
Luckily this file was still on ASM disk, so I copied it back
down to the file system, and it started to mine it.
However, for some reason it wouldn't move on, it just sat
there, even though the other files (252208, 252209) were also there on disk.
In the meantime, the older logs were getting deleted by the clean up job. I identified
the logs that were missing from disk (252210 – 252243) copied what I could from
ASM, and also had to restore some from tape.
The destination file system didn't have enough space, so I
had to borrow some and moved them to a staging file
system (I'll call it /cdc_arch_staging)
I then registered them with the database and voila – the
mining kicked in.
I still don’t know why it just stopped after I’d copied
252207 – 252210.
Here is a summary of the commands I used:
To copy files from ASM:
for i in $(asmcmd ls
+ARCH01/PROD/ARCHIVELOG/2014_11_04); do
  asmcmd cp
+ARCH01/PROD/ARCHIVELOG/2014_11_04/$i /cdc_arch_staging/PROD/arch
done
Note that this copied them to the file system with names
like “thread_1_seq_252369.6617.862722301”, and I had to rename them to
“PROD_1_708192618_0000252369.arc” with a combination of awk and manually
editing.
I had to restore archived logs that had been deleted from ASM by the clean up job:
connect target /
connect catalog
rman/xxx@RMAN;
run {
set archivelog
destination to '/cdc_arch_staging/PROD/arch';
  allocate
channel T1 type SBT_TAPE;
  allocate
channel T2 type SBT_TAPE;
  allocate
channel T3 type SBT_TAPE;
  allocate
channel T4 type SBT_TAPE;
restore (archivelog
from logseq=252208 until logseq=252423);
}
Registering files with CDC:
alter
database register or replace logical logfile '/cdc_arch_staging/PROD/arch/PROD_1_708192618_0000252491.arc'
for 'CDC$C_CS_IEE';
alter
database register or replace logical logfile
'/cdc_arch_staging/PROD/arch/PROD_1_708192618_0000252492.arc' for
'CDC$C_CS_IEE';
alter
database register or replace logical logfile '/cdc_arch_staging/PROD/arch/PROD_1_708192618_0000252493.arc'
for 'CDC$C_CS_IEE';
I created a script with awk to do that, using an 'ls' to list the logs to a file and then wrapping the command around it.
Once I’d run that, I checked the alert log and it was mining
away.
