Wednesday, April 3, 2024

A Way to Identify the 'gc block lost' Wait Event in Oracle Database

One of the most annoying Oracle wait events is 'gc block lost'. In this post, I am sharing my approach to resolving this issue in a production environment.

First, use the netstat command to get a picture of the system's network activity:

% netstat --tcp --numeric  

Active Internet connections (w/o servers)  

Proto Recv-Q Send-Q Local Address           Foreign Address         State       

tcp        0      0 192.168.128.152:993     192.168.128.120:3853   ESTABLISHED

tcp        0      0 192.168.128.152:143     192.168.128.194:3076   ESTABLISHED

tcp        0      0 192.168.128.152:45771   192.168.128.34:389      TIME_WAIT

tcp        0      0 192.168.128.152:110     192.168.33.123:3521     TIME_WAIT

tcp        0      0 192.168.128.152:25      192.168.231.27:44221    TIME_WAIT

tcp        0    256 192.168.128.152:22      192.168.128.78:47258   ESTABLISHED
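As a rough illustration of how this output can be summarized, the sketch below counts connections per TCP state and lists rows with a non-empty queue. The function names are hypothetical, and the sample text is simply the output shown above; on a live system the text would come from running netstat yourself.

```python
from collections import Counter

# Sample copied from the netstat --tcp --numeric output above.
TCP_SAMPLE = """\
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 192.168.128.152:993     192.168.128.120:3853   ESTABLISHED
tcp        0      0 192.168.128.152:143     192.168.128.194:3076   ESTABLISHED
tcp        0      0 192.168.128.152:45771   192.168.128.34:389      TIME_WAIT
tcp        0      0 192.168.128.152:110     192.168.33.123:3521     TIME_WAIT
tcp        0      0 192.168.128.152:25      192.168.231.27:44221    TIME_WAIT
tcp        0    256 192.168.128.152:22      192.168.128.78:47258   ESTABLISHED
"""


def tcp_rows(netstat_output):
    """Yield the six columns of every TCP data row, skipping the headers."""
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows: Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State.
        if len(fields) == 6 and fields[0].startswith("tcp"):
            yield fields


def count_tcp_states(netstat_output):
    """Return a Counter mapping each TCP state to its number of connections."""
    return Counter(fields[5] for fields in tcp_rows(netstat_output))


def queued_connections(netstat_output):
    """Return (local, foreign, recv_q, send_q) for rows with data still queued."""
    result = []
    for _, recv_q, send_q, local, foreign, _ in tcp_rows(netstat_output):
        if int(recv_q) > 0 or int(send_q) > 0:
            result.append((local, foreign, int(recv_q), int(send_q)))
    return result


if __name__ == "__main__":
    print(dict(count_tcp_states(TCP_SAMPLE)))
    print(queued_connections(TCP_SAMPLE))
```

A persistently non-zero Recv-Q or Send-Q on a connection is one hint that the peer or the network is not keeping up, which is why those rows are singled out here.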

If you want to see what (TCP) ports your machine is listening on, use netstat --tcp --listening.

Another useful flag to add to this is --programs which indicates which process is listening on the specified port.

The following example shows a machine listening on ports 80 (www), 443 (https), 22 (ssh), and 25 (smtp):


Code Listing 2: netstat --tcp --listening --programs


# sudo netstat --tcp --listening --programs

Active Internet connections (only servers)

Proto Recv-Q Send-Q Local Address   Foreign Address   State     PID/Program name

tcp        0      0 *:www           *:*               LISTEN    28826/apache2

tcp        0      0 *:ssh           *:*               LISTEN    26604/sshd

tcp        0      0 *:smtp          *:*               LISTEN    6836/

tcp        0      0 *:https         *:*               LISTEN    28826/apache2

Note: Using --all displays both connections and listening ports.


The next example uses netstat --route to display the routing table. For most people, this will show one IP and the gateway address, but if you have more than one interface or have multiple IPs assigned to an interface, this command can help troubleshoot network routing problems.


Code Listing 3: netstat --route


% netstat --route

Kernel IP routing table

Destination     Gateway         Genmask         Flags Metric Ref    Use Iface

192.168.1.0     0.0.0.0         255.255.255.0   U     0      0        0 eth0

0.0.0.0         192.168.1.1     0.0.0.0         UG    1      0        0 eth0
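When several interfaces are involved, it can help to pull the default gateway and its interface out of the table programmatically. The following is a minimal sketch that parses the output shown above; the function names are hypothetical, and it assumes the eight-column layout of this listing.

```python
# Sample copied from the netstat --route output above.
ROUTE_SAMPLE = """\
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.1.0     0.0.0.0         255.255.255.0   U     0      0        0 eth0
0.0.0.0         192.168.1.1     0.0.0.0         UG    1      0        0 eth0
"""


def parse_routes(route_output):
    """Return one dict per route row, skipping the title and header lines."""
    routes = []
    for line in route_output.splitlines():
        fields = line.split()
        # Data rows have eight columns; the header row starts with "Destination".
        if len(fields) != 8 or fields[0] == "Destination":
            continue
        routes.append({
            "destination": fields[0],
            "gateway": fields[1],
            "genmask": fields[2],
            "flags": fields[3],
            "iface": fields[7],
        })
    return routes


def default_gateways(route_output):
    """Return (gateway, interface) pairs for default routes.

    A default route has destination 0.0.0.0 (shown as "default" when
    netstat resolves names) and the G flag, meaning it uses a gateway.
    """
    return [
        (route["gateway"], route["iface"])
        for route in parse_routes(route_output)
        if route["destination"] in ("0.0.0.0", "default") and "G" in route["flags"]
    ]


if __name__ == "__main__":
    print(default_gateways(ROUTE_SAMPLE))
```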

The last example of netstat uses the --statistics flag to display networking statistics. Using this flag by itself displays all IP, TCP, UDP, and ICMP connection statistics.

To limit the output, add a protocol flag such as --raw. For example purposes, only the output from --raw is displayed here.

Combined with the uptime command, this can be used to get an overview of how much traffic your machine is handling on a daily basis.


Code Listing 4: netstat --statistics --raw


% netstat --statistics --raw

Ip:

    620516640 total packets received

    0 forwarded

    0 incoming packets discarded

    615716262 incoming packets delivered

    699594782 requests sent out

    5 fragments dropped after timeout

    3463529 reassemblies required

    636730 packets reassembled ok

    5 packet reassembles failed

    310797 fragments created

// ICMP statistics truncated

Note: For verbosity, the long names for the various flags were given. Most can be abbreviated to avoid excessive typing (e.g. netstat -tn, netstat -tlp, netstat -r, and netstat -sw).
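To make the uptime idea concrete, here is a small sketch that reads the Ip: counters from the statistics output above and turns the total into a per-day rate. The function names are hypothetical. The reassembly counters are pulled out because fragment reassembly failures are often mentioned as one possible OS-level contributor to lost blocks; treat that link as an assumption to verify on your own interconnect, not as something these numbers prove.

```python
import re

# Sample copied from the netstat --statistics --raw output above.
IP_STATS_SAMPLE = """\
Ip:
    620516640 total packets received
    0 forwarded
    0 incoming packets discarded
    615716262 incoming packets delivered
    699594782 requests sent out
    5 fragments dropped after timeout
    3463529 reassemblies required
    636730 packets reassembled ok
    5 packet reassembles failed
    310797 fragments created
"""


def parse_ip_counters(stats_output):
    """Map each counter description to its integer value.

    Only indented lines of the form '<number> <description>' are used,
    so section titles such as 'Ip:' are ignored.
    """
    counters = {}
    for line in stats_output.splitlines():
        match = re.match(r"\s+(\d+)\s+(.+?)\s*$", line)
        if match:
            counters[match.group(2)] = int(match.group(1))
    return counters


def packets_per_day(total_packets, uptime_seconds):
    """Average packets per day over the given uptime."""
    return total_packets * 86400 / uptime_seconds


def read_uptime_seconds(path="/proc/uptime"):
    """Read the uptime in seconds (Linux only; first field of /proc/uptime)."""
    with open(path) as handle:
        return float(handle.read().split()[0])


if __name__ == "__main__":
    counters = parse_ip_counters(IP_STATS_SAMPLE)
    print("reassembles failed:", counters["packet reassembles failed"])
    # Hypothetical uptime of 10 days, used only to illustrate the arithmetic.
    print("packets/day:", packets_per_day(counters["total packets received"], 10 * 86400))
```

Because these counters are cumulative since boot, a single snapshot says little; comparing two snapshots taken some minutes apart shows whether the failure counters are still growing.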


Now check the AWR report:



Top 10 Foreground Events by Total Wait Time

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

                                           Total Wait       Wait   % DB Wait

Event                                Waits Time (sec)    Avg(ms)   time Class

------------------------------ ----------- ---------- ---------- ------ --------

DB CPU                                         3691.6              70.3

gc cr block lost                       947      511.7     540.31    9.7 Cluster

library cache lock                  30,871      422.1      13.67    8.0 Concurre

db file sequential read            252,506      189.6       0.75    3.6 User I/O

gc buffer busy acquire               7,745      183.7      23.72    3.5 Cluster

gc cr block busy                    90,856      141.4       1.56    2.7 Cluster

gc cr multi block request            1,768       71.4      40.40    1.4 Cluster

name-service call wait                 398       31.7      79.68     .6 Other

gc cr block 2-way                  165,758       30.5       0.18     .6 Cluster

log file sync                       25,866       20.5       0.79     .4 Commit
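If you review many AWR reports, a quick way to see how much of the wait time belongs to the Cluster class is to parse this section of the text report. The sketch below does that for the table above. The regular expression and function names are my own assumptions about this particular layout, not an Oracle-provided format, and the DB CPU row is skipped on purpose because it has no wait count or class.

```python
import re

# Sample copied from the AWR "Top 10 Foreground Events" section above.
AWR_SAMPLE = """\
                                           Total Wait       Wait   % DB Wait
Event                                Waits Time (sec)    Avg(ms)   time Class
------------------------------ ----------- ---------- ---------- ------ --------
DB CPU                                         3691.6              70.3
gc cr block lost                       947      511.7     540.31    9.7 Cluster
library cache lock                  30,871      422.1      13.67    8.0 Concurre
db file sequential read            252,506      189.6       0.75    3.6 User I/O
gc buffer busy acquire               7,745      183.7      23.72    3.5 Cluster
gc cr block busy                    90,856      141.4       1.56    2.7 Cluster
gc cr multi block request            1,768       71.4      40.40    1.4 Cluster
name-service call wait                 398       31.7      79.68     .6 Other
gc cr block 2-way                  165,758       30.5       0.18     .6 Cluster
log file sync                       25,866       20.5       0.79     .4 Commit
"""

# Event name, then waits, total seconds, average ms, % DB time, wait class.
ROW = re.compile(
    r"^(?P<event>\S.*?)\s{2,}(?P<waits>[\d,]+)\s+(?P<total_s>[\d.]+)"
    r"\s+(?P<avg_ms>[\d.]+)\s+(?P<pct>[\d.]+)\s+(?P<wait_class>\S.*?)\s*$"
)


def parse_awr_events(report_text):
    """Return one dict per wait-event row; rows without all columns are skipped."""
    rows = []
    for line in report_text.splitlines():
        match = ROW.match(line)
        if match:
            rows.append({
                "event": match.group("event"),
                "waits": int(match.group("waits").replace(",", "")),
                "total_s": float(match.group("total_s")),
                "avg_ms": float(match.group("avg_ms")),
                "pct": float(match.group("pct")),
                "wait_class": match.group("wait_class"),
            })
    return rows


def class_summary(rows, wait_class):
    """Return (total wait seconds, % DB time) summed over one wait class."""
    selected = [row for row in rows if row["wait_class"] == wait_class]
    return (round(sum(r["total_s"] for r in selected), 1),
            round(sum(r["pct"] for r in selected), 1))


if __name__ == "__main__":
    print(class_summary(parse_awr_events(AWR_SAMPLE), "Cluster"))
```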


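Together, the OS-level and AWR-level information should tell you whether this event is actually the problem. In the report above, 'gc cr block lost' accounts for 9.7% of DB time with an average wait of about 540 ms, so it is worth investigating. From there, you can look at the sessions contributing to it.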