Wednesday, November 21, 2012

One Oracle AQ Deadlock

A few days ago, an AQ based Oracle application (Oracle 11.2.0.3.0) threw a deadlock after more than 1,000,000 times of unobstructed running since a few years.

The incident is frustrating since the reliability of AQ based implementation is disputed, and resorting to Oracle appears being a long way to go.

This demonstrates again that correctness proof of computer programs is hardly achievable, let alone concurrent programs.

The deadlock is caused by the following simplified pseudo code:

 dbms_aqadm.start_queue;
 enqueue some message;
 dbms_job.submit to launch a dequeue job in background;  -- can also be dbms_scheduler
 wait till queue is empty;
 dbms_aqadm.stop_queue;


which is started in main session and concurrently forks background job session.

The trace dump shows that main session is waiting for 'library cache pin' in the step of dbms_aqadm.stop_queue; whereas job session is waiting for 'enq: TM - contention' with Current SQL Statement:

 select ... from queue_table ... for update skip locked;

Although reproducing this deadlock is not evident, we can still try to make some plausible experiment and some sort of post-mortem analysis.

At first we open 3 Sqlplus sessions:

  observer session: observer_sid
 main session:     main_sid
 job session:      job_sid


before and after each step, we run the following 3 statements to check the delta of statistics:

 select sid, type, id1, id2, lmode, request,ctime, block
   from v$lock where type in ('TM', 'TQ');
 select eq_type, eq_name, req_reason, total_req#, total_wait#,succ_req#,failed_req#, cum_wait_time
   from v$enqueue_statistics where eq_type in ('TQ', 'TM');
 select locked_total, pinned_total, locks, pins, lock_mode, pin_mode, executions
   from v$db_object_cache v where name = 'KSUN_QUEUE';


Inspired by Perturbation Theory, we "perturbe" time series by a table lock so that we can microscope the Oracle internal locking mechanisms.

Here is the time sequence and actions of first experiment on the observer session and main session as well as the noticed statistics change:

 T1 observer_sid --- start_queue
 T2 observer_sid --- lock queue table in exclusive mode
 T3 main_sid     --- stop_queue
 T4 observer_sid --- check stats
    main_sid got 1 TQ Lock in Mode 6, requests 1 TM in Mode 6
    TM DML: TOTAL_REQ# increased 1, but SUCC_REQ# not changed
    TQ Queue table enqueue DDL contention: TOTAL_REQ# increased 1, SUCC_REQ# increased 1
    LOCKED_TOTAL increased 1, PINNED_TOTAL increased 1
 T5 observer_sid --- rollback to unlock table
    TM DML: TOTAL_REQ# increased 6, SUCC_REQ# increased 7
    LOCKED_TOTAL increased 1, PINNED_TOTAL increased 1


Here is the second experiment on the observer session and job session:

 T6 observer_sid --- start_queue
 T7 observer_sid --- lock queue table in exclusive mode
 T8 job_sid --- dequeue with dequeue_options.wait being 3 seconds
 T9 observer_sid --- check stats
    job_sid requests 1 TM in Mode 3
    TM DML: TOTAL_REQ# increased 1, but SUCC_REQ# not changed
    LOCKED_TOTAL increased 1, PINNED_TOTAL increased 1
 T10 observer_sid --- rollback to unlock table
    TM DML: TOTAL_REQ# increased 2, but SUCC_REQ# increased 3
    LOCKED_TOTAL increased 1, PINNED_TOTAL increased 2


According the above statistics change, we can conclude:

 (1). "stop_queue" got 'TQ Queue table enqueue DDL contention' before getting TM DML,
         and again 'library cache pin' / 'library cache lock' after getting TM DM.

 (2). "dequeue" got 'library cache pin'/'library cache lock' before getting TM DML,
         and again 'library cache pin'/'library cache lock' after getting TM DM.

that probably explains:

(a). when main session "stop_queue", it first holds 'TQ Queue table enqueue DDL contention' (and hence TM) lock
      and then is waiting for 'library cache pin';

(b). whereas job session when "dequeue", it first holds 'library cache pin'/'library cache lock'
      and then is waiting for 'enq: TM - contention' with Current SQL Statement:

         select ... from queue_table ... for update skip locked;

Because 'library cache pin' and 'enq: TM - contention' in main session and job session are acquired and requested in reverse order, it signals a deadlock.

Oracle has 4 different req_reasons for TQ:

 TQ DDL-INI contention
 TQ INI contention
 TQ DDL contention
 TQ TM contention


what the first experiment shows is that "stop_queue" requires "DDL contention", which seems incurring 1 TQ in Mode 6 firstly and subsequently 1 TM in Mode 6 on queue table.

Probably one fix is to ensure no more Job sessions running before stepping into stop_queue.

Be careful that dbms_job.broken, dbms_job.remove, dbms_scheduler.disable, dbms_scheduler.stop_job, dbms_scheduler.drop_job seem not terminating job session immediately.

The dump file contains some more details about main session:

           time in wait: 1.309131 sec
    timeout after: 14 min 58 sec

          
and job session:
       
            time in wait: 1.309007 sec
     timeout after: never
  
          
Usually Oracle picks the longer waiting session to resolve the deadlock. In this case, it is the main session which raises an error, and job session continues its processing after receiving the lock.
   
The above text confirms again the "library cache pin" 15 minutes timeout mentioned in Blog:
Oracle 11gR2 single session "library cache pin"

15 minutes timeout can also be justified by following code.

In the first session, execute:

 create or replace procedure proc as
 begin
   execute immediate 'alter procedure proc compile';
 end;
 /

 exec proc;
 /


In the second session, run this query a few times:

 select in_wait_secs, time_remaining_secs, (in_wait_secs + time_remaining_secs) total_secs
   from v$wait_chains;

two evidences can be observed:

(I).  total_secs displays constantly 899 seconds (exit when elapsed > 899, that is, one second less than 15 minutes).

(II). both in_wait_secs and time_remaining_secs are updated each 3 seconds (deadlock detection interval).

after 899 seconds, the first session throws:

 ORA-04021: timeout occurred while waiting to lock object

In fact, 15 minutes comes from default setting of "_kgl_time_to_wait_for_locks", which controls time to wait for locks and pins before timing out (time in minute, modification requires instance restart).

By the way, following statement requests a TM Lock in Mode 3 even the table is empty.

 select ... from queue_table ... for update skip locked;