December 10, 2015

Exadata Storage Server Diagnostic Data Collection

Whenever an issue occurs on an Exadata Storage Server, such as Smart Scan problems, hardware failures, or SQL statements getting quarantined, it is best to collect the information below from the environment immediately. If an SR has been raised with Oracle Support, upload it to the SR.

It's always best to upload this information to the SR even if the SR engineer hasn't asked for it yet. Trust me, they will ask for it sooner or later, so why waste time?

For one Exadata Storage Server, it takes 10-20 minutes on average to collect this information. If the same problem is observed on multiple storage servers, it is usually fine to start by collecting from only one or two of them and uploading that to the SR.
 1. Contents of the $CELLTRACE directory  
 (root)# cd $CELLTRACE/../..  
 (root)# zip -r /tmp/celltrace_<cellname>.zip .  
 2. Incident package associated with the error  
 (root)# cellcli -e list alerthistory detail  
 Note: Look for the alertAction: field in the output; it explains how to generate the incident package.  
-- Collection example  
 (root)# cd /opt/oracle/cell/log  
 (root)# adrci  
 adrci> set home diag/asm/cell/SYS_121213_151021  
 adrci> ips pack incident 18 in /tmp  
 Note (1): adrci displays a message that includes the name of the generated zip file.  
 Note (2): Download this zip file to send to Oracle Support or for further analysis.  
 Note (3): Finally, remove the generated zip file from the storage server.  
 3. Output of the following cellcli commands  
 (root)# cellcli -e list quarantine detail  
 (root)# cellcli -e list alerthistory detail  
 4. iLOM Snapshot  
 5. Sundiag: (root)# /opt/oracle.SupportTools/sundiag.sh  
 Note: sundiag.sh collects the data and then displays a message that includes the name of the generated zip file.  
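
If you collect these outputs often, the cellcli commands from step 3 can be wrapped in a small script so nothing gets missed. The following is only a rough sketch, not an Oracle-supplied tool: the variable names CELLCLI and OUT_DIR, the function names, and the output file names are my own assumptions for illustration. The only cellcli commands it runs are the two listed in step 3.

```shell
#!/bin/bash
# Sketch: save the step 3 cellcli outputs as text files in one directory.
# CELLCLI and OUT_DIR are hypothetical names used only in this example.
CELLCLI="${CELLCLI:-cellcli}"                    # command used to query the cell
OUT_DIR="${OUT_DIR:-/tmp/cell_diag_$(uname -n)}" # where the output files go

# Run "cellcli -e list <object> detail" and save the output to a file.
collect_cellcli() {
    local object="$1"
    "$CELLCLI" -e list "$object" detail > "$OUT_DIR/${object}_detail.txt" 2>&1
}

# Collect every object from step 3; warn on failure but keep going.
collect_all() {
    mkdir -p "$OUT_DIR" || return 1
    local object
    for object in quarantine alerthistory; do
        collect_cellcli "$object" || echo "warning: list $object failed" >&2
    done
    echo "Output saved under $OUT_DIR"
}
```

To use it, source the file as root on the storage server and run collect_all, then upload the files from the output directory to the SR. As with the incident package, remove the files from the storage server once you have downloaded them.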
