(Oracle) Beware the /var/tmp/.oracle Hidden Directory!

steloflute 2013. 9. 4. 23:30

http://www.pythian.com/blog/beware-the-vartmp-oracle-hidden-directory/

 

 

Beware the /var/tmp/.oracle Hidden Directory!

Jul 2, 2010 / By Don Seiler

A few months ago, we had a test instance complaining that it couldn’t write to ASM. This was an 11.1.0.7 single (non-RAC) instance on Oracle Enterprise Linux 5, using ASM for the storage. We first saw these errors in the alert log:

ORA-15032: not all alterations performed
ORA-29702: error occurred in Cluster Group Service operation
ORA-29702: error occurred in Cluster Group Service operation
ERROR: error ORA-15032 caught in ASM I/O path

Uh-oh, that doesn’t look good. So I logged into the ASM instance and tried to see whether the disks were OK:

SQL> select path, mount_status from v$asm_disk;
select path, mount_status from v$asm_disk
                               *
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-29702: error occurred in Cluster Group Service operation
ORA-29702: error occurred in Cluster Group Service operation


I couldn’t even query that. As Ted would say, “strange things are afoot at the Circle K.” To be safe, I thought I’d try to shut down the DBMS instance, which also failed until I resorted to an abort:

SQL> shutdown immediate
ORA-00204: error in reading (block 1, # blocks 1) of control file
ORA-00202: control file: '+FOOTEST_DATA/footest1_footest_db/control01.ctl'
ORA-15081: failed to submit an I/O operation to a disk
SQL> shutdown abort
ORACLE instance shut down.

We decided to restart the whole DBMS/ASM/CSS stack, but CSS wouldn’t stop either:

-bash-3.2# /etc/init.d/init.cssd stop
Stopping Cluster Synchronization Services.
Unable to communicate with the Cluster Synchronization Services daemon.
Shutdown has begun. The daemons should exit soon.

We ended up rebooting the server altogether, after which everything came up nicely. We filed an SR with Oracle Support, who directed us to Note 391790.1 (Unable To Connect To Cluster Manager Ora-29701). This note lists the cause, quite simply, as:

The hidden directory ‘/var/tmp/.oracle’ was removed while instances & the CRS stack were up and running. Typically this directory contains a number of “special” socket files that are used by local clients to connect via the IPC protocol (sqlnet) to various Oracle processes including the TNS listener, the CSS, CRS & EVM daemons or even database or ASM instances. These files are created when the “listening” process starts.

The solution is to restart CRS or reboot the machine. Our /var/tmp/.oracle directory looked like this:

[oracle@footest ~]$ ls -la /var/tmp/.oracle
total 12
drwxrwxrwt 2 root   root 4096 May  8 15:03 .
drwxrwxrwt 3 root   root 4096 May 10 07:02 ..
srwxrwxrwx 1 oracle dba     0 May  8 15:03 s#18854.1
srwxrwxrwx 1 oracle dba     0 May  8 15:03 s#18854.2
srwxrwxrwx 1 oracle dba     0 May  8 15:03 sEXTPROC
srwxrwxrwx 1 oracle dba     0 May  8 14:44 sfootestDBG_CSSD
srwxrwxrwx 1 oracle dba     0 May  8 14:44 sOCSSD_LL_footest_
srwxrwxrwx 1 oracle dba     0 May  8 14:44 sOCSSD_LL_footest_localhost
srwxrwxrwx 1 oracle dba     0 May  8 14:44 sOracle_CSS_LclLstnr_localhost_0
srwxrwxrwx 1 oracle dba     0 May  8 15:03 sPNPKEY
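
A check along these lines could catch a missing directory before the alert log does. This is a minimal shell sketch, not something taken from the Oracle note: the function names count_sockets and check_oracle_sockets are made up for illustration, and it assumes a find that supports -maxdepth (GNU find does). It only reports whether the directory exists and still holds socket files.

```shell
#!/bin/sh
# Sketch only: function names are illustrative, not part of any Oracle tooling.

# Print the number of socket files directly inside the given directory.
count_sockets() {
    dir=$1
    find "$dir" -maxdepth 1 -type s 2>/dev/null | wc -l | tr -d ' '
}

# Return 0 if the directory exists and holds at least one socket file,
# 1 otherwise. Defaults to /var/tmp/.oracle when no argument is given.
check_oracle_sockets() {
    dir=${1:-/var/tmp/.oracle}
    if [ ! -d "$dir" ]; then
        echo "MISSING: $dir does not exist"
        return 1
    fi
    n=$(count_sockets "$dir")
    if [ "$n" -eq 0 ]; then
        echo "EMPTY: $dir contains no socket files"
        return 1
    fi
    echo "OK: $dir contains $n socket file(s)"
    return 0
}
```

Called as `check_oracle_sockets /var/tmp/.oracle` from cron or a monitoring agent, a non-zero exit status would mean the directory is gone or no longer holds any sockets.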

I did some sandbox testing and found that only the oracle and root OS users could delete that directory, and that deleting it reproduced the error every time.

However, I really was dumbstruck that Oracle would keep so critical a directory in /var/tmp! I politely noted this to Oracle Support, who justified the location with a few solid reasons:

  1. It has always been in this location (and still is in 11gR2).
  2. /var/tmp/.oracle is a hidden directory, so it probably won’t be noticed by any miscreants looking to cause trouble.

OK, I was being sarcastic; these reasons are awful. The only safeguard they offered was “make sure no one deletes it.” We scoured the server for cron jobs that would automatically clean out /var/tmp but didn’t find any, nor any bash history suggesting malice. The only thing we could think of was that this test server ran in a VM (Citrix Xen), although one would hope that this doesn’t happen at all, regardless. We never found an explanation, but now we know not to delete /var/tmp/.oracle while the instances are running (even though we never did before).
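
The cron search described above can be sketched roughly as follows. This is an illustrative helper, not the commands used in the original investigation: the function name find_cron_mentions and the cron locations in the usage line are assumptions that vary by distribution. A fixed-string search like this only finds jobs that name the path literally, so a clean result is not proof that nothing cleans the directory.

```shell
#!/bin/sh
# Sketch only: find_cron_mentions is a hypothetical helper name.

# Print every file under the given files/directories that mentions the
# literal string passed as the first argument. Missing paths are skipped.
find_cron_mentions() {
    needle=$1
    shift
    for d in "$@"; do
        [ -e "$d" ] || continue
        # -r recurse, -l print matching file names only, -F fixed string
        grep -rlF -- "$needle" "$d" 2>/dev/null || true
    done
    return 0
}
```

A possible invocation, run as root so that per-user crontabs are readable: `find_cron_mentions /var/tmp /etc/crontab /etc/cron.d /etc/cron.daily /etc/cron.weekly /var/spool/cron`.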

Surachart Opun has also blogged on this topic.

 

 

http://surachartopun.com/2009/01/vartmporacle-hidden-directory.html

 

Tuesday, January 13, 2009

/var/tmp/.oracle hidden directory



When your database runs on a Unix RAC cluster or a standalone ASM install, you will find the hidden directory '/var/tmp/.oracle'. This directory contains a number of "special" socket files that are used by local clients to connect via the IPC protocol (sqlnet) to various Oracle processes including the TNS listener, the CSS, CRS & EVM daemons or even database or ASM instances.

$ cd /var/tmp/.oracle/
$ ls -la

srwxrwxrwx 1 oracle oinstall 0 Jan 13 12:45 s#27901.1
srwxrwxrwx 1 oracle oinstall 0 Jan 13 12:45 s#27901.2
srwxrwxrwx 1 oracle oinstall 0 Jan 13 12:44 sAdb01_crs_evm
srwxrwxrwx 1 root root 0 Jan 13 12:44 sdb01DBG_CRSD
srwxrwxrwx 1 oracle oinstall 0 Jan 13 12:44 sdb01DBG_CSSD
srwxrwxrwx 1 oracle oinstall 0 Jan 13 12:44 sdb01DBG_EVMD
srwxrwxrwx 1 oracle oinstall 0 Jan 13 12:44 sCdb01_crs_evm
srwxrwxrwx 1 root root 0 Jan 13 12:45 sCRSD_UI_SOCKET
srwxrwxrwx 1 oracle oinstall 0 Jan 13 12:45 sEXTPROC
srwxrwxrwx 1 oracle oinstall 0 Jan 13 12:44 sOCSSD_LL_db01_crs
srwxrwxrwx 1 oracle oinstall 0 Jan 13 12:44 sOracle_CSS_LclLstnr_crs_1
srwxrwxrwx 1 root root 0 Jan 13 12:44 sora_crsqs
srwxrwxrwx 1 oracle oinstall 0 Jan 13 12:45 sora_racg_db_db01
srwxrwxrwx 1 root root 0 Jan 13 12:44 sprocr_local_conn_0_PROC
srwxrwxrwx 1 oracle oinstall 0 Jan 13 12:44 sSYSTEM.evm.acceptor.authth

If this hidden directory is removed while instances and the CRS stack are up and running, errors like the following appear.

In the CRSD log:
[ COMMCRS][1084229984]clsc_connect: (0xb7e210) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_db01_crs))
[ CSSCLNT][2541583904]clsssInitNative: connect failed, rc 9

If you run the "crs_stat" command, you find this error:
[ COMMCRS][2541583904]clsc_connect: (0x6f5d40) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=ora_crsqs))

In the CSS log:
TRACE: clsc_post: (0x707c70) code 4, NS err (12603, 12532), transport (502, 22, 0)
TRACE: clscsendx: (0x7dc4c0) Physical connection (0x7dc640) not active
WARNING: clssnmsendmsg: send failed, node 2, type 3, rc 3

In the ASM alert log:
ORA-29702: error occurred in Cluster Group Service operation
LMON: terminating instance due to error 29702

Once the /var/tmp/.oracle hidden directory has been deleted, the special socket files in it must be re-created.
Because these files are created when each listening process starts, restarting the affected processes (instance, listener, CRS) may help.
In a RAC environment this requires shutting down and restarting the entire CRS stack.

10gR1:
# /etc/init.d/init.crs stop
# /etc/init.d/init.crs start

10gR2/11g:
# $ORA_CRS_HOME/bin/crsctl stop crs
# $ORA_CRS_HOME/bin/crsctl start crs

If the above fails to stop the CRS stack, a system reboot will be unavoidable.

What if someone wants to remove the "/var/tmp" directory and replace it with a symbolic link to "/tmp"? There are two options:
A. Stop the cluster, remove "/var/tmp" and create the symbolic link, then start the cluster again.

B. Move the "/var/tmp/.oracle" directory to "/tmp" first, then rename "/var/tmp" out of the way and create the symbolic link:
# mv /var/tmp/.oracle /tmp/ && mv /var/tmp /var/tmp-old && ln -s /tmp /var/tmp
Option A is the better choice.

If files are deleted from temporary directories by a cron job (or by any other means), the directory '/var/tmp/.oracle' (on some platforms '/tmp/.oracle') should be excluded from such jobs.
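
One way to honor that exclusion in a home-grown cleanup job is to prune the directory in find. The following is a minimal sketch, assuming a cleanup based on file age. The function name list_stale_tmp_files and the 30-day threshold in the usage line are hypothetical, and the function only lists candidates rather than deleting them.

```shell
#!/bin/sh
# Sketch only: list_stale_tmp_files is a hypothetical helper name.

# Print regular files under directory $1 that were last modified more than
# $2 days ago, never descending into the .oracle subdirectory.
list_stale_tmp_files() {
    dir=${1%/}   # drop a trailing slash so the -path pattern matches
    days=$2
    find "$dir" -path "$dir/.oracle" -prune -o -type f -mtime +"$days" -print
}
```

For example, `list_stale_tmp_files /var/tmp 30` prints the files a 30-day cleanup would touch. Reviewing that list before wiring it to any deletion step keeps the socket files out of harm's way.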

 

 

 

 
