2004年12月30日

rman 的备份文件,只能在 server 端生成,不管 rman 在什么客户端;(比如 server linux, 而 rman 在 windows 下启动,最终的
backup piece 还是在 linux 端生成)

backup database 的时候,假如不设置 channel 的 limit ,一开始就会创建一个大文件,因此,假如你空间不够,他就会报错,比如

RMAN> run { allocate channel c3 type disk; backup database ;}

RMAN-03022: compiling command: allocate
RMAN-03023: executing command: allocate
RMAN-08030: allocated channel: c3
RMAN-08500: channel c3: sid=23 devtype=DISK

RMAN-03022: compiling command: backup
RMAN-03023: executing command: backup
RMAN-08008: channel c3: starting full datafile backupset
RMAN-08502: set_count=19 set_stamp=483979175 creation_time=22-JAN-03
RMAN-08010: channel c3: specifying datafile(s) in backupset
RMAN-08522: input datafile fno=00005 name=/db/oracle/oradata/netbar/users01.dbf
RMAN-08522: input datafile fno=00003 name=/db/oracle/oradata/netbar/rbs01.dbf
RMAN-08522: input datafile fno=00004 name=/db/oracle/oradata/netbar/temp01.dbf
RMAN-08522: input datafile fno=00006 name=/db/oracle/oradata/netbar/indx01.dbf
RMAN-08522: input datafile fno=00001 name=/db/oracle/oradata/netbar/system01.dbf
RMAN-08011: including current controlfile in backupset
RMAN-08522: input datafile fno=00002 name=/db/oracle/oradata/netbar/tools01.dbf
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03007: retryable error occurred during execution of command: backup
RMAN-07004: unhandled exception during command execution on channel c3
RMAN-10035: exception raised in RPC: ORA-19504: failed to create file “0jedhrt7_1_1″
ORA-27044: unable to write the header block of file
Linux Error: 22: Invalid argument
Additional information: 2
RMAN-10031: ORA-19624 occurred during call to DBMS_BACKUP_RESTORE.BACKUPPIECECREATE

因此,必须设置 channel 的 limit 啦
RMAN>run { allocate channel c3 type disk;set limit channel c3 kbytes 1800000;backup database format ‘/home/ora/netbars_%p.rman’;}
就完全工作正常。

另外,假如没有指明 format ,那么 文件位置在 $ORACLE_HOME/dbs 下面

情况描述
客户报告数据库故障,新来的系统管理员误操作。删掉了一些文件。具体情况是:删掉了所有重要数据文件、所有控制文件。数据库原来是归档模式,用rman备份数据,而rman 使用控制文件。幸运的是,最后一次 rman full 备份是包括了控制文件在内。系统没有设定自动备份控制文件。现在状况是数据库无法启动。
不用说,客户的备份方案不够完善,但是这时候再去说这些话责备用户有事后诸葛亮之嫌,”用户是上帝,不要去得罪他”。还有,客户有Full备份(虽然不是自动备份控制文件,这样无法用常规的恢复步骤来进行恢复)。这对我们来说是个绝对的好消息。

下面我们通过一次模拟操作来演示这个问题的解决办法。

背景知识
在Oracle 816 以后的版本中,Oracle提供了一个包:DBMS_BACKUP_RESTORE.DBMS_BACKUP_RESTORE 包是由dbmsbkrs.sql 和 prvtbkrs.plb 这两个脚本创建的.catproc.sql 脚本运行后会调用这两个包.所以是每个数据库都有的这个包是Oracle服务器和操作系统之间IO操作的接口.由恢复管理器直接调用。而且据说这两个脚本的功能是内建到Oracle的一些库文件中的.

由此可见,我们可以在数据库 nomount 情况下调用这些package ,来达到我们的恢复目的。在dbmsbkrs.sql 和prvtbkrs.plb 这两个脚本中有详细的说明文档,出于篇幅问题,就不一一加以翻译了,但在下面会直接引用一些原文说明。

关键的内容有:

FUNCTION  deviceAllocate(
       type IN varchar2 default NULL
      ,name IN varchar2 default NULL
      ,ident IN varchar2 default NULL
      ,noio IN boolean default FALSE
      ,params IN varchar2 default NULL )
RETURN varchar2;

– Describe the device to be used for sequential I/O. For device types where
– only one process at a time can use a device, this call allocates a device
– for exclusive use by this session. The device remains allocated until
– deviceDeallocate is called or session termination. The device can be used
– both for creating and restoring backups.

– Specifying a device allocates a context that exists until the session
– terminates or deviceDeallocate is called. Only one device can be specified
– at a time for a particular session. Thus deviceDeallocate must be called
– before a different device can be specified. This is not a limitation since
– a session can only read or write one backup at a time.

– The other major effect of allocating a device is to specify the name space
– for the backup handles (file names). The handle for a sequential file does
– not necessarily define the type of device used to write the file. Thus it
– is necessary to specify the device type in order to interpret the file
– handle. The NULL device type is defined for all systems. It is the file
– system supplied by the operating system. The sequential file handles are
– thus normal file names.

– A device can be specified either by name or by type.
– If the type is specified but not the name, the system picks an
– available device of that type.
– If the name is specified but not the type, the type is determined
– from the device.
– If neither the type or the name is given, the backups are files in
– the operating system file system.

– Note that some types of devices, optical disks for example, can be shared
– by many processes, and thus do not really require allocation of the device
– itself. However we do need to allocate the context for accessing the
– device, and we do need to know the device type for proper interpretation
– of the file handle. Thus it is always necessary to make the device
– allocation call before making most other calls in this package.

– Input parameters:
– type
– If specified, this gives the type of device to use for sequential
– I/O. The allowed types are port specific. For example a port may
– support the type “TAPE” which is implemented via the Oracle tape
– API. If no type is specified, it may be implied by specifying a
– particular device name to allocate. The type should be allowed to
– default to NULL if operating system files are to be used.

– name
– If specified, this names a particular piece of hardware to use for
– accessing sequential files. If not specified, any available
– device of the correct type will be allocated. If the device cannot
– be shared, it is allocated to this session for exclusive use.
– The name should be allowed to default to NULL if operating system
– files are to be used.

– ident
– This is the users identifier that he uses to name this device. It
– is only used to report the status of this session via
– dbms_application_info. This value will be placed in the CLIENT_INFO
– column of the V$SESSION table, in the row corresponding to the
– session in which the device was allocated. This value can also
– be queried with the dbms_application_info.read_client_info procedure.

– noio
– If TRUE, the device will not be used for doing any I/O. This allows
– the specification of a device type for deleting sequential files
– without actually allocating a piece of hardware. An allocation for
– noio can also be used for issuing device commands. Note that some
– commands may actually require a physical device and thus will get
– an error if the allocate was done with noio set to TRUE.

– params
– This string is simply passed to the device allocate OSD. It is
– completely port and device specific.

– Returns:
– It returns a valid device type. This is the type that should be
– allocated to access the same sequential files at a later date. Note
– that this might not be exactly the same value as the input string.
– The allocate OSD may do some translation of the type passed in. The
– return value is NULL when using operating system files.

PROCEDURE restoreControlfileTo(cfname IN varchar2);

– This copies the controlfile from the backup set to an operating system
– file. If the database is mounted, the name must NOT match any of the
– current controlfiles.

– Input parameters:
– cfname
– Name of file to create or overwrite with the controlfile from the
– backup set.

PROCEDURE restoreDataFileTo( dfnumber IN binary_integer
,toname IN varchar2 default NULL);

– restoreDataFileTo creates the output file from a complete backup in the
– backup set.

如果您有兴趣可以去阅读一下这两个文件的注释说明.

解决过程
首先,用控制文件作数据库系统的全备份:

C:WUTemp>rman target /

Recovery Manager: Release 9.2.0.1.0 – Production.

Copyright (c) 1995, 2002, Oracle Corporation. All rights reserved.

connected to target database: DEMO (DBID=3272375326)

RMAN> run {
2> allocate channel C1 type disk;
3> backup full tag ‘FullBackup’ format ‘d:\KDE\%d_%u_%s_%p.dbf’ database include current controlfile;
4> sql ‘ alter system archive log current’;
5> release channel C1;
6> }

using target database controlfile instead of recovery catalog
allocated channel: C1
channel C1: sid=15 devtype=DISK

Starting backup at 18-JUL-04
channel C1: starting full datafile backupset
channel C1: specifying datafile(s) in backupset
including current SPFILE in backupset
including current controlfile in backupset
input datafile fno=00001 name=D:\ORACLE\ORADATA\DEMO\SYSTEM01.DBF
input datafile fno=00002 name=D:\ORACLE\ORADATA\DEMO\UNDOTBS01.DBF
input datafile fno=00004 name=D:\ORACLE\ORADATA\DEMO\EXAMPLE01.DBF
input datafile fno=00009 name=D:\ORACLE\ORADATA\DEMO\XDB01.DBF
input datafile fno=00005 name=D:\ORACLE\ORADATA\DEMO\INDX01.DBF
input datafile fno=00008 name=D:\ORACLE\ORADATA\DEMO\USERS01.DBF
input datafile fno=00003 name=D:\ORACLE\ORADATA\DEMO\DRSYS01.DBF
input datafile fno=00006 name=D:\ORACLE\ORADATA\DEMO\ODM01.DBF
input datafile fno=00007 name=D:\ORACLE\ORADATA\DEMO\TOOLS01.DBF
channel C1: starting piece 1 at 18-JUL-04
channel C1: finished piece 1 at 18-JUL-04
piece handle=D:\KDE\DEMO_01FR79OT_1_1.DBF comment=NONE
channel C1: backup set complete, elapsed time: 00:01:17
Finished backup at 18-JUL-04

sql statement: alter system archive log current

released channel: C1

如上所示,我们做了一次数据库的Full备份.备份片中包括控制文件.注意上面输出内容的黑体部分.我们在后面的恢复操作中会用到.

模拟错误,关掉实例,删掉所有的控制文件和所有的.DBF文件。然后starup会看到如下的出错信息:

SQL> startup
ORACLE instance started.

Total System Global Area 152115804 bytes
Fixed Size 453212 bytes
Variable Size 100663296 bytes
Database Buffers 50331648 bytes
Redo Buffers 667648 bytes
ORA-00205: error in identifying controlfile, check alert log for more info

查看alert Log,应该是系统找不到控制文件.现在情形和客户问题一致.不过在继续讲述之前,我们还需要介绍一点背景知识.

我们首先尝试恢复控制文件:
SQL>startup force nomount;

SQL> DECLARE
2 devtype varchar2(256);
3 done boolean;
4 BEGIN
5 devtype:=sys.dbms_backup_restore.deviceAllocate(type=>”,ident=>’T1′);
6 sys.dbms_backup_restore.restoreSetDatafile;
7 sys.dbms_backup_restore.restoreControlfileTo(cfname=>’d:\oracle\Control01.ctl’);
8 sys.dbms_backup_restore.restoreBackupPiece(done=>done,handle=>’D:\KDE\DEMO_01FR79OT_1_1.DBF’, params=>null);
9 sys.dbms_backup_restore.deviceDeallocate;
10 END;
11 /

PL/SQL procedure successfully completed.

OK,控制文件恢复完成.对以上内容的解释:
第五行 分配一个device channel,因为使用的操作系统文件,所以这里为空,如果是从磁带上恢复要用    “sbt_tape”;
第六行 指明开始restore ;
第七行 指出待恢复文件目标存储位置;
第八行 从哪个备份片中恢复;
第九行 释放设备通道.
不妨对以上操作的结果验证一下:

SQL> host dir d:\oracle
Volume in drive D is DATA
Volume Serial Number is DC79-57F8
Directory of d:\oracle

07/18/2004 09:08 PM

.
07/18/2004 09:08 PM ..
06/08/2004 03:21 PM admin
07/18/2004 09:08 PM 1,871,872 CONTROL01.CTL
07/16/2004 11:27 AM ORA92
07/18/2004 09:02 PM oradata

这样,我们成功的restore了控制文件 .如果控制文件在Full备份之后单独做的,接下来关掉实例,拷贝控制文件到具体位置,然后rman 执行restore database;即可。 可是,我们这里的情况有些不同. 视丢失文件的情况而定,继续进行如下的恢复操作:
SQL> DECLARE
2 devtype varchar2(256);
3 done boolean;
4 BEGIN
5 devtype:=sys.dbms_backup_restore.deviceAllocate (type=>”,ident=>’t1′);
6 sys.dbms_backup_restore.restoreSetDatafile;
7 sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>01,toname=>’
d:\oracle\oradata\demo\SYSTEM01.DBF’);
8 sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>02,toname=>’
d:\oracle\oradata\demo\UNDOTBS01.DBF’);
9 sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>03,toname=>’
d:\oracle\oradata\demo\DRSYS01.DBF’);
10 sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>04,toname=>’
d:\oracle\oradata\demo\EXAMPLE01.DBF’);
11 sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>05,toname=>’
d:\oracle\oradata\demo\INDX01.DBF’);
12 sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>06,toname=>’
d:\oracle\oradata\demo\ODM01.DBF’);
13 sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>07,toname=>’
d:\oracle\oradata\demo\TOOLS01.DBF’);
14 sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>08,toname=>’
d:\oracle\oradata\demo\USERS01.DBF’);
15 sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>09,toname=>’
d:\oracle\oradata\demo\XDB01.DBF’);
16 sys.dbms_backup_restore.restoreBackupPiece(done=>done,handle=>’
D:\KDE\DEMO_01FR79OT_1_1.DBF’, params=>null);
17 sys.dbms_backup_restore.deviceDeallocate;
18 END;
19 /

PL/SQL procedure successfully completed.

–我们的情形是所有的数据文件都丢失了,那就如法炮制 ……….. –文件对应编号来自前面全备份时候的屏幕输出内容.所以,在备份的时候保留操作Log是个很好的习惯.
SQL> startup force mount;
ORACLE instance started.

Total System Global Area 152115804 bytes
Fixed Size 453212 bytes
Variable Size 100663296 bytes
Database Buffers 50331648 bytes
Redo Buffers 667648 bytes
Database mounted.
SQL> Recover database using backup controlfile until cancel ;
ORA-00279: change 243854 generated at 07/18/2004 20:57:03 needed for thread 1
ORA-00289: suggestion : D:\KDE\ARC00002.001
ORA-00280: change 243854 for thread 1 is in sequence #2
Specify log: {=suggested | filename | AUTO | CANCEL}

D:\KDE\ARC00002.001
ORA-00279: change 244089 generated at 07/18/2004 20:58:18 needed for thread 1
ORA-00289: suggestion : D:\KDE\ARC00003.001
ORA-00280: change 244089 for thread 1 is in sequence #3
ORA-00278: log file ‘D:\KDE\ARC00002.001′ no longer needed for this recovery

Specify log: {=suggested | filename | AUTO | CANCEL}
cancel
Media recovery cancelled.

SQL> alter database open resetlogs;

Database altered.

最后,不得不resetlogs .

然后,打扫战场,马上进行数据库的全备份。如果您是DBA的话,应该进一步制定并完善备份计划.亡羊补牢,为时未晚。

总结一下
控制文件在备份中意义重大,建议每次对其单独备份,如果数据库版本允许的话,应该设置为控制文件自动备 份。同时应该尽可能地增大CONTROL_FILE_RECORD_KEEP_TIME这个初始化参数的值。以便备份信息能更长时间的保留
应该制定比较完善的备份计划,否则备份计划一旦出现缺口,将可能给系统带来灾难.记住, "可能出错的地方一定会出错".
熟悉RMAN内部备份机制,对DBMS_BACKUP_RESTORE的用法有一定的掌握在关键时侯很有帮助.
备份脚本应该对Log重定向并保存.以便在出错的查找有用信息.

参考信息

RMAN Recovery Without Recovery Catalog or Controlfiles by Bonnie Bizzaro
dbmsbkrs.sql 和 prvtbkrs.plb 文件说明注释(可在你的系统 $ORACLE_HOME/rdbms/admin/中找到.)

相关链接
本文的更多讨论,请看这里 – http://www.itpub.net/244345.html
本文的Blog讨论,请看这里 – http://blog.csdn.net/fenng/archive/2004/07/19/44945.aspx

情况描述
客户报告数据库故障,新来的系统管理员误操作。删掉了一些文件。具体情况是:删掉了所有重要数据文件、所有控制文件。数据库原来是归档模式,用rman备份数据,而rman 使用控制文件。幸运的是,最后一次 rman full 备份是包括了控制文件在内。系统没有设定自动备份控制文件。现在状况是数据库无法启动。
不用说,客户的备份方案不够完善,但是这时候再去说这些话责备用户有事后诸葛亮之嫌,”用户是上帝,不要去得罪他”。还有,客户有Full备份(虽然不是自动备份控制文件,这样无法用常规的恢复步骤来进行恢复)。这对我们来说是个绝对的好消息。

下面我们通过一次模拟操作来演示这个问题的解决办法。

背景知识
在Oracle 816 以后的版本中,Oracle提供了一个包:DBMS_BACKUP_RESTORE.DBMS_BACKUP_RESTORE 包是由dbmsbkrs.sql 和 prvtbkrs.plb 这两个脚本创建的.catproc.sql 脚本运行后会调用这两个包.所以是每个数据库都有的这个包是Oracle服务器和操作系统之间IO操作的接口.由恢复管理器直接调用。而且据说这两个脚本的功能是内建到Oracle的一些库文件中的.

由此可见,我们可以在数据库 nomount 情况下调用这些package ,来达到我们的恢复目的。在dbmsbkrs.sql 和prvtbkrs.plb 这两个脚本中有详细的说明文档,出于篇幅问题,就不一一加以翻译了,但在下面会直接引用一些原文说明。

关键的内容有:

FUNCTION  deviceAllocate(
       type IN varchar2 default NULL
      ,name IN varchar2 default NULL
      ,ident IN varchar2 default NULL
      ,noio IN boolean default FALSE
      ,params IN varchar2 default NULL )
RETURN varchar2;

– Describe the device to be used for sequential I/O. For device types where
– only one process at a time can use a device, this call allocates a device
– for exclusive use by this session. The device remains allocated until
– deviceDeallocate is called or session termination. The device can be used
– both for creating and restoring backups.

– Specifying a device allocates a context that exists until the session
– terminates or deviceDeallocate is called. Only one device can be specified
– at a time for a particular session. Thus deviceDeallocate must be called
– before a different device can be specified. This is not a limitation since
– a session can only read or write one backup at a time.

– The other major effect of allocating a device is to specify the name space
– for the backup handles (file names). The handle for a sequential file does
– not necessarily define the type of device used to write the file. Thus it
– is necessary to specify the device type in order to interpret the file
– handle. The NULL device type is defined for all systems. It is the file
– system supplied by the operating system. The sequential file handles are
– thus normal file names.

– A device can be specified either by name or by type.
– If the type is specified but not the name, the system picks an
– available device of that type.
– If the name is specified but not the type, the type is determined
– from the device.
– If neither the type or the name is given, the backups are files in
– the operating system file system.

– Note that some types of devices, optical disks for example, can be shared
– by many processes, and thus do not really require allocation of the device
– itself. However we do need to allocate the context for accessing the
– device, and we do need to know the device type for proper interpretation
– of the file handle. Thus it is always necessary to make the device
– allocation call before making most other calls in this package.

– Input parameters:
– type
– If specified, this gives the type of device to use for sequential
– I/O. The allowed types are port specific. For example a port may
– support the type “TAPE” which is implemented via the Oracle tape
– API. If no type is specified, it may be implied by specifying a
– particular device name to allocate. The type should be allowed to
– default to NULL if operating system files are to be used.

– name
– If specified, this names a particular piece of hardware to use for
– accessing sequential files. If not specified, any available
– device of the correct type will be allocated. If the device cannot
– be shared, it is allocated to this session for exclusive use.
– The name should be allowed to default to NULL if operating system
– files are to be used.

– ident
– This is the users identifier that he uses to name this device. It
– is only used to report the status of this session via
– dbms_application_info. This value will be placed in the CLIENT_INFO
– column of the V$SESSION table, in the row corresponding to the
– session in which the device was allocated. This value can also
– be queried with the dbms_application_info.read_client_info procedure.

– noio
– If TRUE, the device will not be used for doing any I/O. This allows
– the specification of a device type for deleting sequential files
– without actually allocating a piece of hardware. An allocation for
– noio can also be used for issuing device commands. Note that some
– commands may actually require a physical device and thus will get
– an error if the allocate was done with noio set to TRUE.

– params
– This string is simply passed to the device allocate OSD. It is
– completely port and device specific.

– Returns:
– It returns a valid device type. This is the type that should be
– allocated to access the same sequential files at a later date. Note
– that this might not be exactly the same value as the input string.
– The allocate OSD may do some translation of the type passed in. The
– return value is NULL when using operating system files.

PROCEDURE restoreControlfileTo(cfname IN varchar2);

– This copies the controlfile from the backup set to an operating system
– file. If the database is mounted, the name must NOT match any of the
– current controlfiles.

– Input parameters:
– cfname
– Name of file to create or overwrite with the controlfile from the
– backup set.

PROCEDURE restoreDataFileTo( dfnumber IN binary_integer
,toname IN varchar2 default NULL);

– restoreDataFileTo creates the output file from a complete backup in the
– backup set.

如果您有兴趣可以去阅读一下这两个文件的注释说明.

解决过程
首先,用控制文件作数据库系统的全备份:

C:WUTemp>rman target /

Recovery Manager: Release 9.2.0.1.0 – Production.

Copyright (c) 1995, 2002, Oracle Corporation. All rights reserved.

connected to target database: DEMO (DBID=3272375326)

RMAN> run {
2> allocate channel C1 type disk;
3> backup full tag ‘FullBackup’ format ‘d:\KDE\%d_%u_%s_%p.dbf’ database include current controlfile;
4> sql ‘ alter system archive log current’;
5> release channel C1;
6> }

using target database controlfile instead of recovery catalog
allocated channel: C1
channel C1: sid=15 devtype=DISK

Starting backup at 18-JUL-04
channel C1: starting full datafile backupset
channel C1: specifying datafile(s) in backupset
including current SPFILE in backupset
including current controlfile in backupset
input datafile fno=00001 name=D:\ORACLE\ORADATA\DEMO\SYSTEM01.DBF
input datafile fno=00002 name=D:\ORACLE\ORADATA\DEMO\UNDOTBS01.DBF
input datafile fno=00004 name=D:\ORACLE\ORADATA\DEMO\EXAMPLE01.DBF
input datafile fno=00009 name=D:\ORACLE\ORADATA\DEMO\XDB01.DBF
input datafile fno=00005 name=D:\ORACLE\ORADATA\DEMO\INDX01.DBF
input datafile fno=00008 name=D:\ORACLE\ORADATA\DEMO\USERS01.DBF
input datafile fno=00003 name=D:\ORACLE\ORADATA\DEMO\DRSYS01.DBF
input datafile fno=00006 name=D:\ORACLE\ORADATA\DEMO\ODM01.DBF
input datafile fno=00007 name=D:\ORACLE\ORADATA\DEMO\TOOLS01.DBF
channel C1: starting piece 1 at 18-JUL-04
channel C1: finished piece 1 at 18-JUL-04
piece handle=D:\KDE\DEMO_01FR79OT_1_1.DBF comment=NONE
channel C1: backup set complete, elapsed time: 00:01:17
Finished backup at 18-JUL-04

sql statement: alter system archive log current

released channel: C1

如上所示,我们做了一次数据库的Full备份.备份片中包括控制文件.注意上面输出内容的黑体部分.我们在后面的恢复操作中会用到.

模拟错误,关掉实例,删掉所有的控制文件和所有的.DBF文件。然后starup会看到如下的出错信息:

SQL> startup
ORACLE instance started.

Total System Global Area 152115804 bytes
Fixed Size 453212 bytes
Variable Size 100663296 bytes
Database Buffers 50331648 bytes
Redo Buffers 667648 bytes
ORA-00205: error in identifying controlfile, check alert log for more info

查看alert Log,应该是系统找不到控制文件.现在情形和客户问题一致.不过在继续讲述之前,我们还需要介绍一点背景知识.

我们首先尝试恢复控制文件:
SQL>startup force nomount;

SQL> DECLARE
2 devtype varchar2(256);
3 done boolean;
4 BEGIN
5 devtype:=sys.dbms_backup_restore.deviceAllocate(type=>”,ident=>’T1′);
6 sys.dbms_backup_restore.restoreSetDatafile;
7 sys.dbms_backup_restore.restoreControlfileTo(cfname=>’d:\oracle\Control01.ctl’);
8 sys.dbms_backup_restore.restoreBackupPiece(done=>done,handle=>’D:\KDE\DEMO_01FR79OT_1_1.DBF’, params=>null);
9 sys.dbms_backup_restore.deviceDeallocate;
10 END;
11 /

PL/SQL procedure successfully completed.

OK,控制文件恢复完成.对以上内容的解释:
第五行 分配一个device channel,因为使用的操作系统文件,所以这里为空,如果是从磁带上恢复要用    “sbt_tape”;
第六行 指明开始restore ;
第七行 指出待恢复文件目标存储位置;
第八行 从哪个备份片中恢复;
第九行 释放设备通道.
不妨对以上操作的结果验证一下:

SQL> host dir d:\oracle
Volume in drive D is DATA
Volume Serial Number is DC79-57F8
Directory of d:\oracle

07/18/2004 09:08 PM

.
07/18/2004 09:08 PM ..
06/08/2004 03:21 PM admin
07/18/2004 09:08 PM 1,871,872 CONTROL01.CTL
07/16/2004 11:27 AM ORA92
07/18/2004 09:02 PM oradata

这样,我们成功的restore了控制文件 .如果控制文件在Full备份之后单独做的,接下来关掉实例,拷贝控制文件到具体位置,然后rman 执行restore database;即可。 可是,我们这里的情况有些不同. 视丢失文件的情况而定,继续进行如下的恢复操作:
SQL> DECLARE
2 devtype varchar2(256);
3 done boolean;
4 BEGIN
5 devtype:=sys.dbms_backup_restore.deviceAllocate (type=>”,ident=>’t1′);
6 sys.dbms_backup_restore.restoreSetDatafile;
7 sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>01,toname=>’
d:\oracle\oradata\demo\SYSTEM01.DBF’);
8 sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>02,toname=>’
d:\oracle\oradata\demo\UNDOTBS01.DBF’);
9 sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>03,toname=>’
d:\oracle\oradata\demo\DRSYS01.DBF’);
10 sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>04,toname=>’
d:\oracle\oradata\demo\EXAMPLE01.DBF’);
11 sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>05,toname=>’
d:\oracle\oradata\demo\INDX01.DBF’);
12 sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>06,toname=>’
d:\oracle\oradata\demo\ODM01.DBF’);
13 sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>07,toname=>’
d:\oracle\oradata\demo\TOOLS01.DBF’);
14 sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>08,toname=>’
d:\oracle\oradata\demo\USERS01.DBF’);
15 sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>09,toname=>’
d:\oracle\oradata\demo\XDB01.DBF’);
16 sys.dbms_backup_restore.restoreBackupPiece(done=>done,handle=>’
D:\KDE\DEMO_01FR79OT_1_1.DBF’, params=>null);
17 sys.dbms_backup_restore.deviceDeallocate;
18 END;
19 /

PL/SQL procedure successfully completed.

–我们的情形是所有的数据文件都丢失了,那就如法炮制 ……….. –文件对应编号来自前面全备份时候的屏幕输出内容.所以,在备份的时候保留操作Log是个很好的习惯.
SQL> startup force mount;
ORACLE instance started.

Total System Global Area 152115804 bytes
Fixed Size 453212 bytes
Variable Size 100663296 bytes
Database Buffers 50331648 bytes
Redo Buffers 667648 bytes
Database mounted.
SQL> Recover database using backup controlfile until cancel ;
ORA-00279: change 243854 generated at 07/18/2004 20:57:03 needed for thread 1
ORA-00289: suggestion : D:\KDE\ARC00002.001
ORA-00280: change 243854 for thread 1 is in sequence #2
Specify log: {=suggested | filename | AUTO | CANCEL}

D:\KDE\ARC00002.001
ORA-00279: change 244089 generated at 07/18/2004 20:58:18 needed for thread 1
ORA-00289: suggestion : D:\KDE\ARC00003.001
ORA-00280: change 244089 for thread 1 is in sequence #3
ORA-00278: log file ‘D:\KDE\ARC00002.001′ no longer needed for this recovery

Specify log: {=suggested | filename | AUTO | CANCEL}
cancel
Media recovery cancelled.

SQL> alter database open resetlogs;

Database altered.

最后,不得不resetlogs .

然后,打扫战场,马上进行数据库的全备份。如果您是DBA的话,应该进一步制定并完善备份计划.亡羊补牢,为时未晚。

总结一下
控制文件在备份中意义重大,建议每次对其单独备份,如果数据库版本允许的话,应该设置为控制文件自动备 份。同时应该尽可能地增大CONTROL_FILE_RECORD_KEEP_TIME这个初始化参数的值。以便备份信息能更长时间的保留
应该制定比较完善的备份计划,否则备份计划一旦出现缺口,将可能给系统带来灾难.记住, "可能出错的地方一定会出错".
熟悉RMAN内部备份机制,对DBMS_BACKUP_RESTORE的用法有一定的掌握在关键时侯很有帮助.
备份脚本应该对Log重定向并保存.以便在出错的查找有用信息.

参考信息

RMAN Recovery Without Recovery Catalog or Controlfiles by Bonnie Bizzaro
dbmsbkrs.sql 和 prvtbkrs.plb 文件说明注释(可在你的系统 $ORACLE_HOME/rdbms/admin/中找到.)

相关链接
本文的更多讨论,请看这里 – http://www.itpub.net/244345.html
本文的Blog讨论,请看这里 – http://blog.csdn.net/fenng/archive/2004/07/19/44945.aspx

2004年12月14日

1 删除undotbs01.dbf
2 连接到数据库
SQL> connect sys/oracle as sysdba
Connected.
SQL> startup force
ORACLE instance started.

Total System Global Area 336662768 bytes
Fixed Size 450800 bytes
Variable Size 117440512 bytes
Database Buffers 218103808 bytes
Redo Buffers 667648 bytes
Database mounted.
ORA-01157: cannot identify/lock data file 2 – see DBWR trace file
ORA-01110: data file 2: ‘/home/oracle/oradata/esal/undotbs01.dbf’
3 查看rollback_segments
SQL> show parameter rollback

NAME TYPE VALUE
———————————— ———– ——————————
fast_start_parallel_rollback string LOW
max_rollback_segments integer 37
rollback_segments string
transactions_per_rollback_segment integer 5
4 修改初始化参数
###########################################
# System Managed Undo and Rollback Segments
###########################################
undo_management=MANUAL
undo_retention=10800
undo_tablespace=UNDOTBS01
rollback_segments=’SYSTEM’
5 启动数据库
SQL> connect sys/oracle as sysdba
Connected.
SQL> startup force
ORACLE instance started.

Total System Global Area 336662768 bytes
Fixed Size 450800 bytes
Variable Size 117440512 bytes
Database Buffers 218103808 bytes
Redo Buffers 667648 bytes
Database mounted.
ORA-01157: cannot identify/lock data file 2 – see DBWR trace file
ORA-01110: data file 2: ‘/home/oracle/oradata/esal/undotbs01.dbf’


SQL> alter database /home/oracle/oradata/esal/undotbs01.dbf’ offline;
alter database /home/oracle/oradata/esal/undotbs01.dbf’ offline
*
ERROR at line 1:
ORA-02231: missing or invalid option to ALTER DATABASE


SQL> alter database datafile ‘/home/oracle/oradata/esal/undotbs01.dbf’ offline drop;

Database altered.

SQL> alter database open;

Database altered.




SQL> create spfile from pfile;

File created.

SQL> create undo tablespace undotbs1 datafile ‘/home/oracle/oradata/esal/undotbs01.dbf’ size 200M autoextend on;
create undo tablespace undotbs1 datafile ‘/home/oracle/oradata/esal/undotbs01.dbf’ size 200M autoextend on
*
ERROR at line 1:
ORA-01543: tablespace ‘UNDOTBS1′ already exists

SQL> drop tablespace undotbs1;

Tablespace dropped.

SQL> create undo tablespace undotbs1 datafile ‘/home/oracle/oradata/esal/undotbs01.dbf’ size 200M autoextend on;

Tablespace created.

SQL> show parameter undo

NAME TYPE VALUE
———————————— ———– ——————————
undo_management string MANUAL
undo_retention integer 10800
undo_suppress_errors boolean FALSE
undo_tablespace string UNDOTBS01
SQL> alter system set undo_management=auto scope=both;
alter system set undo_management=auto scope=both
*
ERROR at line 1:
ORA-02095: specified initialization parameter cannot be modified


SQL> alter system set undo_management=auto scope=spfile;
alter system set undo_management=auto scope=spfile
*
ERROR at line 1:
ORA-32001: write to SPFILE requested but no SPFILE specified at startup

###########################################
# System Managed Undo and Rollback Segments
###########################################
undo_management=AUTO
undo_retention=10800
undo_tablespace=UNDOTBS1
#rollback_segments=’SYSTEM’
“initxzh.ora” 99L, 2989C written
[oracle@WWW2 dbs]$ sqlplus /nolog

SQL*Plus: Release 9.2.0.1.0 – Production on Mon Jul 5 13:17:17 2004

Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.

SQL> connect sys/oracle@xzh as sysdba
Connected.
SQL> startup force
ORACLE instance started.

Total System Global Area 336662768 bytes
Fixed Size 450800 bytes
Variable Size 117440512 bytes
Database Buffers 218103808 bytes
Redo Buffers 667648 bytes
Database mounted.
Database opened.


感觉楼上的说的让人迷惑,经过试验把步骤总结如下,还请楼上指正:
1。如果发现回退表空间损坏,则先修改pfile文件INITSID.ORA为:
###########################################
# System Managed Undo and Rollback Segments
###########################################
undo_management=MANUAL
undo_retention=10800
undo_tablespace=UNDOTBS01
rollback_segments=’SYSTEM’
2。然后用pfile启动数据库startup mount pfile=”;
alter database datafile ‘/home/oracle/oradata/esal/undotbs01.dbf’ offline drop;
alter database open;
drop tablespace undotbs1 including contents;
create undo tablespace undotbs1 datafile ‘/home/oracle/oradata/esal/undotbs01.dbf’ size 200M autoextend on;
3。修改pfile文件INITSID.ORA为:
###########################################
# System Managed Undo and Rollback Segments
###########################################
undo_management=AUTO
undo_retention=10800
undo_tablespace=UNDOTBS01
#rollback_segments=’SYSTEM’
4。关闭数据库shutdown immediate
5。启动数据库startup

create or replace trigger tri_xxx_delete
after delete on xxxx
begin
insert into xxx_bak select * from xxx where xxx.id = 困惑ld.id;
end;


更改insert into xxx_bak select * from xxx where xxx.id = : old.id;




Create or replace trigger biud_employee_copy
Before insert or update or delete
On employees_copy
Declare
L_action employees_log.action%type;
Begin
if inserting then
l_action:=’Insert’;
elsif updating then
l_action:=’Update’;
elsif deleting then
l_action:=’Delete’;
else
raise_application_error(-20001,’You should never ever get this error.’);

Insert into employees_log(
Who,action,when)
Values( user, l_action,sysdate);
End;
/








不过有个问题,不是很方便:
SQL> create table test
2 (id number ,
3 user_name varchar2(16)
4 );

Table created.

SQL> create table test_bak as select * from test;

Table created.


SQL> create or replace trigger catch_sql
BEFORE DELETE ON test
FOR EACH ROW
begin
insert into test_bak select * from test where id=:old.id;
end;
/ 2 3 4 5 6 7

Trigger created.

SQL> insert into test values(1,’liubin’);

1 row created.

SQL> insert into test values(2,’test’);

1 row created.

SQL> select * from test;

ID USER_NAME
———- —————-
1 liubin
2 test

SQL> select * from test_bak;

no rows selected

SQL> select * from test_bak;

no rows selected

SQL> delete from test;
delete from test
*
ERROR at line 1:
ORA-04091: table APP.TEST is mutating, trigger/function may not see it
ORA-06512: at “APP.CATCH_SQL”, line 2
ORA-04088: error during execution of trigger ‘APP.CATCH_SQL’


SQL> delete from test where id=1;
delete from test where id=1
*
ERROR at line 1:
ORA-04091: table APP.TEST is mutating, trigger/function may not see it
ORA-06512: at “APP.CATCH_SQL”, line 2
ORA-04088: error during execution of trigger ‘APP.CATCH_SQL’


SQL> create or replace trigger catch_sql
BEFORE DELETE ON test
FOR EACH ROW
declare
v_id number;
v_user_name varchar2(16);
begin
v_id:=:old.id;
v_user_name:=:old.user_name;
insert into test_bak values(v_id,v_user_name);
end;
/ 2 3 4 5 6 7 8 9 10 11 12

Trigger created.


SQL> delete from test;

2 rows deleted.

SQL> select * from test_bak;

ID USER_NAME
———- —————-
1 liubin
2 test

SQL> select * from test;

no rows selected

SQL>
也就是说,在触发器里面不能用select那个记录了,我只好用笨方法把每个值取出来再插进去。
http://download-west.oracle.com/doc…dg13trg.htm#786





不明白,为什么还要加个中间变量,直接使用困惑ld. 的值不就可以了吗?




在触发器中不能访问被触发表,只能用:new和困惑ld访问当前记录的值。


2004年12月07日

分两部分:
a.如何配置ISA服务器,使得用户可以使用OWA来访问Microsoft Exchange 邮箱。

b.如何在OWA站点上使用SSL.


如何在ISA服务器后面发布OWA
1.配置ISA服务器接受外网卡的web请求:
   a. 启动 ISA Server 管理控制台:单击开始,指向程序,指向 Microsoft ISA Server,然后单击 ISA Management(ISA 管理)。

  b.展开左窗格中的”Servers(服务器)”文件夹,然后右键选择您的服务器属性。

   c.点击”Incoming Web Requests“页,再单击”Configure listeners  individually per IP address”.

     d.单击”Add”, 然后选择ISA Server和它的的外网卡ip地址。这一步确定了服务器响应外部http请求的ip地址和端口。

   e.单击”OK”完成,再单击”OK”回到ISA管理界面。

2. 创建”destination set”将Web客户端指向OWA站点的对应目录:

  a. 启动 ISA Server 管理控制台:单击开始,指向程序,指向 Microsoft ISA Server, 然后单击 ISA Management(ISA 管理)。

  b. 展开左窗格中的”Servers(服务器)”文件夹,再展开Policy Elements,右键选择 Destination Set文件夹,单击New, 然后单击Set. 您可以将该destination set命名为”OWA”.

  c.在Destination 一栏,输入外网用户访问OWA所使用的URL。

    注意:不要在URL里面包括”http://”;或者”https://”;部分。

d. 在Path一栏, 输入”/exchange*” (没有引号),然后单击OK.

 e.重复步骤d,分别为Exchweb和Public文件创建相应的路径:”/exchweb*”和”/public*”.

3. 用刚才创建的policy element 配置web publishing rule:
   a.展开Publishing,右键单击 Web Publishing Rules,选择New->Rule.
   b.在name一栏输入”OWA Access Rule” ,再单击Next.
  c.选择” specified destination set”,选择刚刚创建的OWA set,再单击Next.
   d. Apply the rule to Any Request, 再单击Next.
   e. 单击 “Redirect the request to this internal Web Server”,输入运行OWA的服务器的名称或者ip地址.
   f. 单击选择”Send the original header to publishing server instead of the actual one” ,再选择Next.
   g.单击finish.
   h. 在ISA 管理控制台里展开Monitoring图标,然后单击Services.
   i. 重新启动Web proxy和Firewall 服务:右键点击相应的服务,单击Stop,然后在菜单上启动。

如果OWA应用了SSL,必须在ISA服务器创建一条Server Publishing rule(HTTPS协议),选择内部OWA服务器的ip地址和ISA服务器外网卡地址(OWA服务器必须将ISA服务器的内网卡设为默认网关)。



如何用IIS5.0 和Certificate Server 2.0 建立SSL.
1. 首先,Web服务器必须做按照一下步骤证书请求:
   a.启动Internet Service Manager (ISM):开始->程序->管理工具->Internet Service Manager .
   b.右键点击需要建立SSL的站点,单击属性。
   c.单击目录安全页,再单击’服务器证书’,进入”Web服务器证书向导”.
   d.单击下一步开始向导,选择”创建一个新证书”。
   e.单击下一步,选择’现在准备请求,但稍候发送’。
   f.单击下一步,输入您的证书名称,你可以用web站点的名称。现在选择
位长,位长越长,证书加密越强。如果你的用户可能来自有加密限制的国家,就需要选择”服务器网关加密证书”.
   g.单击下一步,输入组织和组织单位。
  h. 单击下一步,输入公用名称,公用名称必须符合注册在DNS服务器上的 域名。例如如果URL是https://www.mydomain.com/securedir,那么公用  名称必须是 www.mydomain.com.如果客户使用IP地址访问该网站,如192.168.1.11,这儿的公用名称一定要使用”192.168.1.11″.
   i.单击下一步,输入您的国家、省、市。
   j.单击下一步,选择路径和您保存请求的文件名。
   k.再单击下一步两次,单击完成关闭向导。

2. 用Certificate Server处理您的请求,可以按照下列步骤来做:
  a.  在浏览器里面输入http://CAServerName/certsrv, 选择’Request a certificate‘. 注意:不要用”localhost“作为服务器名。
   b.单击Next并选择Advanced request.
   c.单击Next并选择Submit a certificate request using a base64 encoded PKCS #10 file or a renewal request using a base64 encoded PKCS #7 file.
   d.单击下一步,用Notepad打开您在Web证书向导中创建的请求文件。将整个文本文件(包括BEGIN和END两行),粘贴到Base64 Encoded Certificate   Request文本框。  
   e. 单击Submit。
   f. 关闭浏览器,在证书服务器上,打开Certification Authority 控制台。
   g.展开服务器名,选择Pending Requests folder.右键点击您提交的证书,选择 All Tasks,再选择Issue.
   h.右键点击服务器名您现在可以关闭Certification Authority 控制台。
   h. 打开一个新的浏览器窗口,输入步骤a里面的URL,选择Check on a      pending certificate.
  i. 单击下一步,选择您刚刚建立request.
   j. 单击下一步,选择DER encoded, 然后点击Download CA certificate的连接.保存证书文件到您的web服务器的本地硬盘,关闭浏览器。

3. 在IIS里完成证书安装,并启动SSL:

  a. 打开Internet Information Services MMC,右键点击需要应用SSL的web站点。
   b. 单击目录安全性,然后选择Server Certificate.
   c. 单击下一步,选择Process the pending request and install the certificate.
   d. 单击下一步,输入您上一部分保存的证书文件所在的路径和文件名。
   e. 单击下一步两次,点击完成推出向导。
   f. 单击web站点页,确认SSL端口一栏的端口是否正确。默认为443。
   g. 点击确认关闭web站点属性窗口。

现在我们已经建立了自己的一个SSL站点,但是这个站点不在Windows系列操作系统内建的信任之列,所以需要自己建立对该机构的信任,否则每次客户端访问该SSL站点,都会弹出该站点不被信任的窗口。

假如在试验中自己建立了一个根证书颁发机构MyCA2002,这个机构不在内建的受信任证书颁发机构之列,你可以按照下列步骤建立信任:
1.导出MyCA2002证书颁发机构的证书.
1)在CA服务器上单击”开始”菜单->”程序”->”管理工具”->”证书颁发机构”MMC管理单元,或者直接运行”%SystemRoot%\System32\certsrv.msc /s”.

(2)右击”证书颁发机构(本地)”下的”MyCA2002″节点,选择”属性”菜单项,打开”MyCA2002属性”对话框.

(3)在”MyCA2002属性”对话框”常规”页面单击”查看证书”,在跳出的”证书”对话框中切换到”详细信息”页面,单击下面的”复制到文件…”按钮,启动证书导出向导,将证书导出为”DER编码二进制X.509(.CER)”格式文件,假设文件名是”myca2002.cer”,文件大小约1k左右,复制到一个终端用户可以取到的位置,如软盘,网站(可以提供客户端下载)等.

2.在客户机建立对MyCA2002证书颁发机构的信任
(1)在客户机单击”开始”菜单->”设置”->”控制面板”->”Internet选项”,打开”Internet属性”对话框.
(2)切换到”内容”页面,单击下面的”证书…”按钮,打开”证书”对话框.
(3)切换到”受信任的根证书颁发机构”页面,单击下面的”导入…”按钮,启动”证书导入向导”,使用系统缺省的选
项将myca2002.cer证书导入.

(4)MyCA2002证书颁发机构成为受信任的根证书颁发机构.
如果OWA服务装在ISA服务器上,您需要禁用Socket Pooling. Socket Pooling 在IIS5.0里默认被启用,它使得IIS侦听所有ip地址的TCP 80 端口。关于如何禁用Socket Pooling,以及关于建立SSL和OWA的更详细的信息,可以

   Q238131 How to Disable Socket Pooling

   Q290625 HOWTO: Configure SSL in a Windows 2000 IIS 5.0 Test Environment
  Q290113 How to Publish Outlook Web Access Behind ISA Server
  Q267596 XWEB: How to Change OWA Passwords Through IIS

2004年12月01日

REM getsql.sql
REM author eygle
REM 在windows上,已知进程ID,得到当前正在执行的语句
REM 在windows上,进程ID为16进制,需要转换,在UNIX直接为10进制
SELECT   /*+ ORDERED */
         sql_text
    FROM v$sqltext a
   WHERE (a.hash_value, a.address) IN (
            SELECT DECODE (sql_hash_value,
                           0, prev_hash_value,
                           sql_hash_value
                          ),
                   DECODE (sql_hash_value, 0, prev_sql_addr, sql_address)
              FROM v$session b
             WHERE b.paddr = (SELECT addr
                                FROM v$process c
                               WHERE c.spid = TO_NUMBER (‘&pid’, ‘2348′)))
ORDER BY piece ASC
/




























   
   

Oracle诊断案例—-如何捕获问题SQL解决过度CPU消耗问题


–使用vmstat,top等辅助解决Oracle数据库性能问题


Last Updated: Sunday, 2004-10-24 0:37 Eygle
    

 


问题描述:
开发人员报告系统运行缓慢,影响用户访问.



1.登陆数据库主机


使用vmstat检查,发现CPU资源已经耗尽,大量任务位于运行队列:






 

bash-2.03$ vmstat 3
procs memory page disk faults cpu
r b w swap free re mf pi po fr de sr s6 s9 s1 sd in sy cs us sy id
0 0 0 5504232 1464112 0 0 0 0 0 0 0 0 1 1 0 4294967196 0 0 -84 -5 -145
131 0 0 5368072 1518360 56 691 0 2 2 0 0 0 1 0 0 3011 7918 2795 97 3 0
131 0 0 5377328 1522464 81 719 0 2 2 0 0 0 1 0 0 2766 8019 2577 96 4 0
130 0 0 5382400 1524776 67 682 0 0 0 0 0 0 0 0 0 3570 8534 3316 97 3 0
134 0 0 5373616 1520512 127 1078 0 2 2 0 0 0 1 0 0 3838 9584 3623 96 4 0
136 0 0 5369392 1518496 107 924 0 5 5 0 0 0 0 0 0 2920 8573 2639 97 3 0
132 0 0 5364912 1516224 63 578 0 0 0 0 0 0 0 0 0 3358 7944 3119 97 3 0
129 0 0 5358648 1511712 189 1236 0 0 0 0 0 0 0 0 0 3366 10365 3135 95 5 0
129 0 0 5354528 1511304 120 1194 0 0 0 0 0 0 0 4 0 3235 8864 2911 96 4 0
128 0 0 5346848 1507704 99 823 0 0 0 0 0 0 0 3 0 3189 9048 3074 96 4 0
125 0 0 5341248 1504704 80 843 0 2 2 0 0 0 6 1 0 3563 9514 3314 95 5 0
133 0 0 5332744 1501112 79 798 0 0 0 0 0 0 0 1 0 3218 8805 2902 97 3 0
129 0 0 5325384 1497368 107 643 0 2 2 0 0 0 1 4 0 3184 8297 2879 96 4 0
126 0 0 5363144 1514320 81 753 0 0 0 0 0 0 0 0 0 2533 7409 2164 97 3 0
136 0 0 5355624 1510512 169 566 786 0 0 0 0 0 0 1 0 3002 8600 2810 96 4 0
130 1 0 5351448 1502936 267 580 1821 0 0 0 0 0 0 0 0 3126 7812 2900 96 4 0
129 0 0 5347256 1499568 155 913 2 2 2 0 0 0 0 1 0 2225 8076 1941 98 2 0
116 0 0 5338192 1495400 177 1162 0 0 0 0 0 0 0 1 0 1947 7781 1639 97 3 0


2.使用Top命令


观察进程CPU耗用,发现没有明显过高CPU使用的进程
$ top







 

last pid: 28313;  load averages: 99.90, 117.54, 125.71            23:28:38
296 processes: 186 sleeping, 99 running, 2 zombie, 9 on cpu
CPU states: 0.0% idle, 96.5% user, 3.5% kernel, 0.0% iowait, 0.0% swap
Memory: 4096M real, 1404M free, 2185M swap in use, 5114M swap free

PID USERNAME THR PRI NICE SIZE RES STATE TIME CPU COMMAND
27082 oracle8i 1 33 0 1328M 1309M run 0:17 1.29% oracle
26719 oracle8i 1 55 0 1327M 1306M sleep 0:29 1.11% oracle
28103 oracle8i 1 35 0 1327M 1304M run 0:06 1.10% oracle
28161 oracle8i 1 25 0 1327M 1305M run 0:04 1.10% oracle
26199 oracle8i 1 45 0 1328M 1309M run 0:42 1.10% oracle
26892 oracle8i 1 33 0 1328M 1310M run 0:24 1.09% oracle
27805 oracle8i 1 45 0 1327M 1306M cpu/1 0:10 1.04% oracle
23800 oracle8i 1 23 0 1327M 1306M run 1:28 1.03% oracle
25197 oracle8i 1 34 0 1328M 1309M run 0:57 1.03% oracle
21593 oracle8i 1 33 0 1327M 1306M run 2:12 1.01% oracle
27616 oracle8i 1 45 0 1329M 1311M run 0:14 1.01% oracle
27821 oracle8i 1 43 0 1327M 1306M run 0:10 1.00% oracle
26517 oracle8i 1 33 0 1328M 1309M run 0:33 0.97% oracle
25785 oracle8i 1 44 0 1328M 1309M run 0:46 0.96% oracle
26241 oracle8i 1 45 0 1327M 1306M run 0:42 0.96% oracle


3.检查进程数量







 

bash-2.03$ ps -ef|grep ora|wc -l
258
bash-2.03$ ps -ef|grep ora|wc -l
275
bash-2.03$ ps -ef|grep ora|wc -l
274
bash-2.03$ ps -ef|grep ora|wc -l
278
bash-2.03$ ps -ef|grep ora|wc -l
277
bash-2.03$ ps -ef|grep ora|wc -l
366

发现系统存在大量Oracle进程,大约在300左右,而正常情况下Oracle连接数应该在100左右.


4.检查数据库


查询v$session_wait获取各进程等待事件


 







 

 
SQL> select sid,event,p1,p1text from v$session_wait;

SID EVENT P1 P1TEXT
———- —————————— ———- —————————————————————-
124 latch free 1.6144E+10 address
1 pmon timer 300 duration
2 rdbms ipc message 300 timeout
3 rdbms ipc message 300 timeout
11 rdbms ipc message 30000 timeout
6 rdbms ipc message 180000 timeout
4 rdbms ipc message 300 timeout
134 rdbms ipc message 6000 timeout
147 rdbms ipc message 6000 timeout
275 rdbms ipc message 17995 timeout
274 rdbms ipc message 6000 timeout

SID EVENT P1 P1TEXT
———- —————————— ———- —————————————————————-
118 rdbms ipc message 6000 timeout
7 buffer busy waits 17 file#
56 buffer busy waits 17 file#
161 buffer busy waits 17 file#
195 buffer busy waits 17 file#
311 buffer busy waits 17 file#
314 buffer busy waits 17 file#
205 buffer busy waits 17 file#
269 buffer busy waits 17 file#
200 buffer busy waits 17 file#
164 buffer busy waits 17 file#

SID EVENT P1 P1TEXT
———- —————————— ———- —————————————————————-
140 buffer busy waits 17 file#
66 buffer busy waits 17 file#
10 db file sequential read 17 file#
18 db file sequential read 17 file#
54 db file sequential read 17 file#
49 db file sequential read 17 file#
48 db file sequential read 17 file#
46 db file sequential read 17 file#
45 db file sequential read 17 file#
35 db file sequential read 17 file#
30 db file sequential read 17 file#

SID EVENT P1 P1TEXT
———- —————————— ———- —————————————————————-
29 db file sequential read 17 file#
22 db file sequential read 17 file#
178 db file sequential read 17 file#
175 db file sequential read 17 file#
171 db file sequential read 17 file#
123 db file sequential read 17 file#
121 db file sequential read 17 file#
120 db file sequential read 17 file#
117 db file sequential read 17 file#
114 db file sequential read 17 file#
113 db file sequential read 17 file#

SID EVENT P1 P1TEXT
———- —————————— ———- —————————————————————-
111 db file sequential read 17 file#
107 db file sequential read 17 file#
80 db file sequential read 17 file#
222 db file sequential read 17 file#
218 db file sequential read 17 file#
216 db file sequential read 17 file#
213 db file sequential read 17 file#
199 db file sequential read 17 file#
198 db file sequential read 17 file#
194 db file sequential read 17 file#
192 db file sequential read 17 file#

SID EVENT P1 P1TEXT
———- —————————— ———- —————————————————————-
188 db file sequential read 17 file#
249 db file sequential read 17 file#
242 db file sequential read 17 file#
239 db file sequential read 17 file#
236 db file sequential read 17 file#
235 db file sequential read 17 file#
234 db file sequential read 17 file#
233 db file sequential read 17 file#
230 db file sequential read 17 file#
227 db file sequential read 17 file#
336 db file sequential read 17 file#

SID EVENT P1 P1TEXT
———- —————————— ———- —————————————————————-
333 db file sequential read 17 file#
331 db file sequential read 17 file#
329 db file sequential read 17 file#
327 db file sequential read 17 file#
325 db file sequential read 17 file#
324 db file sequential read 17 file#
320 db file sequential read 17 file#
318 db file sequential read 17 file#
317 db file sequential read 17 file#
316 db file sequential read 17 file#
313 db file sequential read 17 file#

SID EVENT P1 P1TEXT
———- —————————— ———- —————————————————————-
305 db file sequential read 17 file#
303 db file sequential read 17 file#
301 db file sequential read 17 file#
293 db file sequential read 17 file#
290 db file sequential read 17 file#
288 db file sequential read 17 file#
287 db file sequential read 17 file#
273 db file sequential read 17 file#
271 db file sequential read 17 file#
257 db file sequential read 17 file#
256 db file sequential read 17 file#

SID EVENT P1 P1TEXT
———- —————————— ———- —————————————————————-
254 db file sequential read 17 file#
252 db file sequential read 17 file#
159 db file sequential read 17 file#
153 db file sequential read 17 file#
146 db file sequential read 17 file#
142 db file sequential read 17 file#
135 db file sequential read 17 file#
133 db file sequential read 17 file#
132 db file sequential read 17 file#
126 db file sequential read 17 file#
79 db file sequential read 17 file#

SID EVENT P1 P1TEXT
———- —————————— ———- —————————————————————-
77 db file sequential read 17 file#
72 db file sequential read 17 file#
70 db file sequential read 17 file#
69 db file sequential read 17 file#
67 db file sequential read 17 file#
63 db file sequential read 17 file#
55 db file sequential read 17 file#
102 db file sequential read 17 file#
96 db file sequential read 17 file#
95 db file sequential read 17 file#
91 db file sequential read 17 file#

SID EVENT P1 P1TEXT
———- —————————— ———- —————————————————————-
81 db file sequential read 17 file#
15 db file sequential read 17 file#
19 db file scattered read 17 file#
50 db file scattered read 17 file#
285 db file scattered read 17 file#
279 db file scattered read 17 file#
255 db file scattered read 17 file#
243 db file scattered read 17 file#
196 db file scattered read 17 file#
187 db file scattered read 17 file#
170 db file scattered read 17 file#

SID EVENT P1 P1TEXT
———- —————————— ———- —————————————————————-
162 db file scattered read 17 file#
138 db file scattered read 17 file#
110 db file scattered read 17 file#
108 db file scattered read 17 file#
92 db file scattered read 17 file#
330 db file scattered read 17 file#
310 db file scattered read 17 file#
302 db file scattered read 17 file#
299 db file scattered read 17 file#
89 db file scattered read 17 file#
5 smon timer 300 sleep time

SID EVENT P1 P1TEXT
———- —————————— ———- —————————————————————-
20 SQL*Net message to client 1952673792 driver id
103 SQL*Net message to client 1650815232 driver id
….
148 SQL*Net more data from client 1952673792 driver id
291 SQL*Net more data from client 1952673792 driver id

244 rows selected.


发现存在大量db file scattered read及db file sequential read等待.


5.捕获相关SQL


这里用到了我的以下脚本getsqlbysid.sql:







 

SELECT   sql_text
FROM v$sqltext a
WHERE a.hash_value = (SELECT sql_hash_value
FROM v$session b
WHERE b.SID = ‘&sid’)
ORDER BY piece ASC
/

 







 

 
SQL> @getsql
Enter value for sid: 18
old 5: where b.sid=’&sid’
new 5: where b.sid=’18′

SQL_TEXT
—————————————————————-
select i.vc2title,i.numinfoguid from hs_info i where i.intenab
ledflag = 1 and i.intpublishstate = 1 and i.datpublishdate <=
sysdate and i.numcatalogguid = 2047 order by i.datpublishdate d
esc, i.numorder desc

SQL> /
Enter value for sid: 54
old 5: where b.sid=’&sid’
new 5: where b.sid=’54′

SQL_TEXT
—————————————————————-
select i.vc2title,i.numinfoguid from hs_info i where i.intenab
ledflag = 1 and i.intpublishstate = 1 and i.datpublishdate <=
sysdate and i.numcatalogguid = 33 order by i.datpublishdate des
c, i.numorder desc

SQL> /
Enter value for sid: 49
old 5: where b.sid=’&sid’
new 5: where b.sid=’49′

SQL_TEXT
—————————————————————-
select i.vc2title,i.numinfoguid from hs_info i where i.intenab
ledflag = 1 and i.intpublishstate = 1 and i.datpublishdate <=
sysdate and i.numcatalogguid = 26 order by i.datpublishdate des
c, i.numorder desc


对几个全表扫描进程跟踪以后,得到以上SQL语句.
以上语句如果良好编码应该使用绑定变量.但是现在这个不是我们关心的.


使用该应用用户连接,检查其执行计划:







 

SQL> set autotrace trace explain
SQL> select i.vc2title,i.numinfoguid
2 from hs_info i where i.intenabledflag = 1
3 and i.intpublishstate = 1 and i.datpublishdate <=sysdate
4 and i.numcatalogguid = 3475
5 order by i.datpublishdate desc, i.numorder desc ;

Execution Plan
———————————————————-
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=228 Card=1 Bytes=106)
1 0 SORT (ORDER BY) (Cost=228 Card=1 Bytes=106)
2 1 TABLE ACCESS (FULL) OF ‘HS_INFO’ (Cost=218 Card=1 Bytes=106)

SQL> select count(*) from hs_info;

COUNT(*)
———-
227404


该表这里有22万记录,全表扫描已经不再适合.


检查该表,存在以下索引:






SQL> select index_name,index_type  from user_indexes where table_name=’HS_INFO’;

INDEX_NAME INDEX_TYPE
—————————— —————————
HSIDX_INFO1 FUNCTION-BASED NORMAL
HSIDX_INFO_SEARCHKEY DOMAIN
PK_HS_INFO NORMAL

检查索引键值:

SQL> select index_name,column_name from user_ind_columns where table_name =’HS_INFO’;

INDEX_NAME COLUMN_NAME
—————————— ——————–
HSIDX_INFO1 NUMORDER
HSIDX_INFO1 SYS_NC00024$
HSIDX_INFO_SEARCHKEY VC2INDEXWORDS
PK_HS_INFO NUMINFOGUID

SQL> desc hs_info
Name Null? Type
—————————————————————– ——– —————————
NUMINFOGUID NOT NULL NUMBER(15)
NUMCATALOGGUID NOT NULL NUMBER(15)
INTTEXTTYPE NOT NULL NUMBER(38)
VC2TITLE NOT NULL VARCHAR2(60)
VC2AUTHOR VARCHAR2(100)
NUMPREVINFOGUID NUMBER(15)
NUMNEXTINFOGUID NUMBER(15)
NUMORDER NOT NULL NUMBER(15)
DATPUBLISHDATE NOT NULL DATE
INTPUBLISHSTATE NOT NULL NUMBER(38)
VC2PUBLISHERID VARCHAR2(30)
VC2INDEXWORDS VARCHAR2(200)
VC2WAPPREVPATH VARCHAR2(200)
VC2WEBPREVPATH VARCHAR2(200)
VC2WAP2PREVPATH VARCHAR2(200)
NUMVISITED NOT NULL NUMBER(15)
INTENABLEDFLAG NOT NULL NUMBER(38)
DATCREATETIME NOT NULL DATE
DATMODIFYTIME NOT NULL DATE
VC2NOTES VARCHAR2(1000)
INTINFOTYPE NOT NULL NUMBER(38)
VC2PRIZEFLAG VARCHAR2(1)
VC2DESC VARCHAR2(1000)


 


6.决定创建新的索引以消除全表扫描







 

 
SQL> create index hs_info_NUMCATALOGGUID on hs_info(NUMCATALOGGUID);

Index created.

SQL> set autotrace trace explain
SQL> select i.vc2title,i.numinfoguid
2 from hs_info i where i.intenabledflag = 1
3 and i.intpublishstate = 1 and i.datpublishdate <=sysdate
4 and i.numcatalogguid = 3475
5 order by i.datpublishdate desc, i.numorder desc ;

Execution Plan
———————————————————-
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=12 Card=1 Bytes=106)
1 0 SORT (ORDER BY) (Cost=12 Card=1 Bytes=106)
2 1 TABLE ACCESS (BY INDEX ROWID) OF ‘HS_INFO’ (Cost=2 Card=1 Bytes=106)
3 2 INDEX (RANGE SCAN) OF ‘HS_INFO_NUMCATALOGGUID’ (NON-UNIQUE) (Cost=1 Card=1)


7.观察系统状况


原大量等待消失






 
SQL> select sid,event,p1,p1text from v$session_wait where event not like ‘SQL%’;

SID EVENT P1 P1TEXT
———- —————————— ———- —————————————————————-
1 pmon timer 300 duration
2 rdbms ipc message 300 timeout
3 rdbms ipc message 300 timeout
6 rdbms ipc message 180000 timeout
59 rdbms ipc message 6000 timeout
118 rdbms ipc message 6000 timeout
275 rdbms ipc message 30000 timeout
147 rdbms ipc message 6000 timeout
62 rdbms ipc message 6000 timeout
11 rdbms ipc message 30000 timeout
4 rdbms ipc message 300 timeout

SID EVENT P1 P1TEXT
———- —————————— ———- —————————————————————-
305 db file sequential read 17 file#
356 db file sequential read 17 file#
19 db file scattered read 17 file#
5 smon timer 300 sleep time

15 rows selected.


 


持续观察的CPU使用情况


 






 
bash-2.03$ vmstat 3
procs memory page disk faults cpu
r b w swap free re mf pi po fr de sr s6 s9 s1 sd in sy cs us sy id
20 0 0 5421792 1503488 38 434 136 0 0 0 0 0 0 2 0 2931 7795 2622 91 9 0
23 1 0 5416080 1500632 95 734 56 0 0 0 0 0 0 0 0 2949 8057 2598 89 11 0
26 0 0 5412016 1498480 210 1170 21 5 5 0 0 0 2 1 0 3301 9647 3116 90 10 0
25 0 0 5394912 1490160 242 1606 56 0 0 0 0 0 0 1 0 3133 9318 2850 89 11 0
40 0 0 5390200 1488112 162 1393 66 0 0 0 0 0 0 0 0 2848 9080 2502 90 10 0
40 0 0 5377120 1481792 136 1180 120 2 2 0 0 0 1 1 0 2846 9099 2593 92 8 0
36 0 0 5363216 1475168 134 1169 53 0 0 0 0 0 3 2 0 2871 7989 2621 88 12 0
39 0 0 5348936 1469160 157 1448 210 0 0 0 0 0 0 0 0 3660 10062 3480 88 12 0
35 0 0 5344552 1466472 7 15 56 0 0 0 0 0 0 0 0 2885 7663 2635 92 8 0
34 0 0 5343016 1465416 44 386 77 0 0 0 0 0 0 0 0 3197 8486 2902 92 8 0
31 0 0 5331568 1459696 178 1491 122 0 0 0 0 0 0 3 0 3237 9461 3005 89 11 0
31 0 0 5317792 1453008 76 719 80 0 0 0 0 0 0 0 0 3292 8736 3025 93 7 0
31 2 0 5311144 1449552 235 1263 69 2 2 0 0 0 1 0 0 3473 9535 3357 88 12 0
25 0 0 5300240 1443920 108 757 18 2 2 0 0 0 1 1 0 2377 7876 2274 95 5 0
19 0 0 5295904 1441840 50 377 0 0 0 0 0 0 0 1 0 1915 6598 1599 98 1 0
—-以上为创建索引之前部分
—-以下为创建索引之后部分,CPU使用率恢复正常
procs memory page disk faults cpu
r b w swap free re mf pi po fr de sr s6 s9 s1 sd in sy cs us sy id
40 1 0 5290040 1439208 315 3894 8 2 2 0 0 0 1 6 0 3631 13414 5206 61 9 30
0 1 0 5237192 1414744 731 6749 45 0 0 0 0 0 2 7 0 3264 13558 4941 52 14 34
0 0 0 5163632 1380608 747 6585 10 0 0 0 0 0 0 1 0 2617 12291 3901 46 12 41
1 0 0 5090224 1348152 712 6079 29 0 0 0 0 0 0 6 0 2825 12416 4178 50 12 39
1 0 0 5023672 1317296 714 6183 24 0 0 0 0 0 0 5 0 3166 12424 4745 47 13 40
0 0 0 4955872 1287136 737 6258 16 0 0 0 0 0 0 3 0 2890 11777 4432 44 12 44
1 0 0 4887888 1256464 809 6234 8 2 2 0 0 0 0 2 0 2809 12066 4247 45 12 43
0 0 0 4828912 1228200 312 2364 13 5 5 0 0 0 2 1 0 2410 6816 3492 38 6 57
0 0 0 4856816 1240168 8 138 0 0 0 0 0 0 1 0 0 2314 4026 3232 34 4 62
0 0 0 4874176 1247712 0 86 0 0 0 0 0 0 0 0 0 2298 3930 3324 35 2 63
2 0 0 4926088 1270824 34 560 0 0 0 0 0 0 0 0 0 2192 4694 2612 29 16 55
0 0 0 5427320 1512952 53 694 0 0 0 0 0 0 3 2 0 2443 5085 3340 33 12 55
0 0 0 5509120 1553136 0 37 0 0 0 0 0 0 0 0 0 2309 3908 3321 35 1 64
0 0 0 5562048 1577000 16 234 0 0 0 0 0 0 0 0 0 2507 5187 3433 35 8 57
0 0 0 5665672 1623848 252 1896 8 2 2 0 0 0 1 0 0 2091 6548 2939 34 5 61
0 0 0 5654752 1618208 5 173 16 0 0 0 0 0 0 0 0 2226 4218 3051 35 4 60
0 0 0 5727024 1651120 28 254 0 0 0 0 0 0 0 0 0 2126 4224 2982 38 2 60
0 0 0 5723184 1648880 93 562 8 2 2 0 0 0 1 1 0 2371 5140 3432 38 3 59
0 0 0 5730744 1652512 7 177 26 2 2 0 0 0 1 0 0 2465 4442 3575 36 3 61

 


 


至此,此问题得以解决.


 


本文作者:
eygle,Oracle技术关注者,来自中国最大的Oracle技术论坛itpub.
www.eygle.com是作者的个人站点.你可通过Guoqiang.Gai@gmail.com来联系作者.欢迎技术探讨交流以及链接交换.



原文出处:


http://www.eygle.com/case/How.To.Capture.Problem.SQL.htm



如欲转载,请注明作者与出处.并请保留本文的连接.


回首页


 

很多时候我们的服务器可能会经历CPU消耗100%的性能问题.
排除系统的异常,这类问题通常都是因为系统中存在性能低下甚至存在错误的SQL语句, 消耗了大量的CPU所致.


本文通过一个案例就如何捕获这样的SQL给出一个通用的方法.


问题描述:系统CPU高度消耗,系统运行缓慢
OS:Sun Solaris8
Oracle:Oracle9203



1.首先通过Top命令查看






$ top

load averages: 1.61, 1.28, 1.25 HSWAPJSDB 10:50:44
172 processes: 160 sleeping, 1 running, 3 zombie, 6 stopped, 2 on cpu
CPU states: % idle, % user, % kernel, % iowait, % swap
Memory: 4.0G real, 1.4G free, 1.9G swap in use, 8.9G swap free

PID USERNAME THR PR NCE SIZE RES STATE TIME FLTS CPU COMMAND
20521 oracle 1 40 0 1.8G 1.7G run 6:37 0 47.77% oracle
20845 oracle 1 40 0 1.8G 1.7G cpu02 0:41 0 40.98% oracle
20847 oracle 1 58 0 1.8G 1.7G sleep 0:00 0 0.84% oracle
20780 oracle 1 48 0 1.8G 1.7G sleep 0:02 0 0.83% oracle
15828 oracle 1 58 0 1.8G 1.7G sleep 0:58 0 0.53% oracle
20867 root 1 58 0 4384K 2560K sleep 0:00 0 0.29% sshd2
20493 oracle 1 58 0 1.8G 1.7G sleep 0:03 0 0.29% oracle
20887 oracle 1 48 0 1.8G 1.7G sleep 0:00 0 0.13% oracle
20851 oracle 1 58 0 1.8G 1.7G sleep 0:00 0 0.10% oracle
20483 oracle 1 48 0 1.8G 1.7G sleep 0:00 0 0.09% oracle
20875 oracle 1 45 0 1064K 896K sleep 0:00 0 0.07% sh
20794 oracle 1 58 0 1.8G 1.7G sleep 0:00 0 0.06% oracle
20842 jiankong 1 52 2 1224K 896K sleep 0:00 0 0.05% sadc
20888 oracle 1 55 0 1712K 1272K cpu00 0:00 0 0.05% top
19954 oracle 1 58 0 1.8G 1.7G sleep 84:25 0 0.04% oracle


我们发现在进城列表里,存在两个高CPU耗用的Oracle进城,分别消耗了47.77%和40.98%的CPU资源.


 


2.找到存在问题的进程信息







 

 
$ ps -ef|grep 20521
oracle 20909 20875 0 10:50:53 pts/10 0:00 grep 20521
oracle 20521 1 47 10:43:59 ? 6:45 oraclejshs (LOCAL=NO)
$ ps -ef|grep 20845
oracle 20845 1 44 10:50:00 ? 0:55 oraclejshs (LOCAL=NO)
oracle 20918 20875 0 10:50:59 pts/10 0:00 grep 20845


确认这是两个远程连接的用户进程.


 


3.熟悉一下我的getsql.sql脚本







 

 
SELECT /*+ ORDERED */
sql_text
FROM v$sqltext a
WHERE (a.hash_value, a.address) IN (
SELECT DECODE (sql_hash_value,
0, prev_hash_value,
sql_hash_value
),
DECODE (sql_hash_value, 0, prev_sql_addr, sql_address)
FROM v$session b
WHERE b.paddr = (SELECT addr
FROM v$process c
WHERE c.spid = ‘&pid’))
ORDER BY piece ASC
/

注意这里我们涉及了3个视图,并应用其关联进行数据获取.
首先需要输入一个pid,这个pid即process id,也就是在Top或ps中我们看到的PID.
通过pid和v$process.spid相关联我们可以获得Process的相关信息
进而通过v$process.addr和v$session.paddr相关联,我们就可以获得和session相关的所有信息.
再结合v$sqltext,我们即可获得当前session正在执行的SQL语句.


通过v$process视图,我们得以把操作系统和数据库关联了起来.


 


4.连接数据库,找到问题sql及进程


通过Top中我们观察到的PID,进而应用我的getsql脚本,我们得到以下结果输出.







 

 
$ sqlplus “/ as sysdba”

SQL*Plus: Release 9.2.0.3.0 – Production on Mon Dec 29 10:52:14 2003

Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.

Connected to:
Oracle9i Enterprise Edition Release 9.2.0.3.0 – 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.3.0 – Production

SQL> @getsql
Enter value for spid: 20521
old 10: where c.spid = ‘&pid’
new 10: where c.spid = ‘20521′

SQL_TEXT
—————————————————————-
select * from (select VC2URL,VC2PVDID,VC2MOBILE,VC2ENCRYPTFLAG,S
ERVICEID,VC2SUB_TYPE,CISORDER,NUMGUID,VC2KEY1, VC2NEEDDISORDER,V
C2PACKFLAG,datopertime from hsv_2cpsync where datopertime<=sysda
te and numguid>70000000000308 order by NUMGUid) where rownum<=20


那么这段代码就是当前正在疯狂消耗CPU的罪魁祸首.
接下来需要进行的工作就是找出这段代码的问题,看是否可以通过优化提高其效率,减少资源消耗.


 


5.进一步的我们可以通过dbms_system包跟踪该进程







 


SQL> @getsid
Enter value for spid: 20521
old 3: select addr from v$process where spid = &spid)
new 3: select addr from v$process where spid = 20521)


SID SERIAL# USERNAME MACHINE
—————————————————————-
45 38991 HSUSER_V51 hswapjsptl1.hurray.com.cn


SQL> exec dbms_system.set_sql_trace_in_session(45,38991,true);


PL/SQL procedure successfully completed.


SQL> !


 


这部分内容可以参考:
http://www.eygle.com/case/sql_trace_1.htm


对于Windows上的类似问题,可以参考:
http://www.eygle.com/faq/Use.Nt.tools.manage.Oracle.htm


 


6.一点说明


很多时候,高CPU消耗都是由于问题SQL导致的,所以找到这些SQL通常也就找到了问题所在,通过优化调整
通常就可以解决问题。


但是有时候你可能会发现,这些最消耗CPU的进程是后台进程,这一般是由于异常、BUG或者恢复后的异常
导致的,需要具体问题具体分析了.


 


 


 


本文作者:
eygle,Oracle技术关注者,来自中国最大的Oracle技术论坛itpub.
www.eygle.com是作者的个人站点.你可通过Guoqiang.Gai@gmail.com来联系作者.欢迎技术探讨交流以及链接交换.



原文出处:


http://www.eygle.com/case/how.to.getsql.which.cost.most.CPU.htm