大师这一次是真的错了
Posted: February 13, 2012 | Author: Cui Hua | Filed under: Oracle | 6 Comments »Jonathan Lewis在他的新书《Oracle Core Essential Internals for DBAs and Developers》的第116页这样描述到:
SQL> select name, level#
2 from v$latch
3 where name in (‘cache buffers lru chain’,'cache buffers chains’)
4 /
NAME LEVEL#
—————————— ———-
cache buffers lru chain 2
cache buffers chains 1
2 rows selected.
The cache buffers chains latch has a lower level than the cache buffers lru chain latch, so we can’t request the cache buffers lru chain latch in willing-to-wait mode if we’re already holding the cache buffers chains latch. Think about what this means: we’re holding the cache buffers chains latch (which I will call the hash latch for the rest of this subsection) because we’ve just searched the hash chain for a buffer and discovered that, for whatever reason, we need to add another buffer to the chain. So we have to acquire the cache buffers lru chain latch (which I will call the lru latch for the rest of this subsection) to move a buffer from the REPL_AUX list to the midpoint of the REPL_MAIN list; but we can’t request it in willing-to-wait mode because we’re already holding a lower-level latch.
大师这里提到了说在持有cache buffers chains latch的情况下Oracle不可能再去持有cache buffers lru chain latch,因为cache buffers lru chain的latch level比cache buffers chains的latch level要大。
事实上是:大师这里刚好搞反了,这是latch level最基本的常识啊,我希望大师在这里用了不少篇幅解释的观点只是他的笔误。
我们来证明一下。我只需证明Oracle在持有cache buffers chains latch,并且没有释放的情况下会去持有cache buffers lru chain latch即可证明Jonathan在这个点上犯了常识性的错误。
先来普及一下在Oracle里如何手工持有一个latch——要么用oradebug poke,要么用oradebug call:
Connected to Oracle Database
Connected as SYS
SQL> select count(*) from v$latch_children where name = ‘cache buffers lru chain’;
COUNT(*)
———-
32
SQL> select count(*) from v$latch_children where name = ‘cache buffers chains’;
COUNT(*)
———-
4096
SQL> select name,level# from v$latch where name in (‘cache buffers lru chain’,'cache buffers chains’);
NAME LEVEL#
—————————————————————- ———-
cache buffers lru chain 2
cache buffers chains 1
SQL> select * from (select addr,to_number(addr,’XXXXXXXXXXXX’),gets,misses from v$latch_children where name = ‘cache buffers lru chain’ order by addr desc) where rownum<5;
ADDR TO_NUMBER(ADDR,’XXXXXXXXXXXX’) GETS MISSES
——– —————————— ———- ———-
3497DE
3497DD94 882367892 10 0
3497D974 882366836 0 0
3497D8EC 882366700 10 0
现在我先看一下latch地址为3497DE
SQL> oradebug peek 0x3497DE
[3497DE
00000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ...
我开始调用kslgetl去持有上述latch:
SQL> oradebug call kslgetl 882368028 1
Function returned 1
持有完毕后再去看一下上述latch,可以看到我用红色标注的地方已经被Oracle改了:
SQL> oradebug peek 0x3497DE
[3497DE
00000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ...
现在我释放上述latch:
SQL> oradebug call kslfre 882368028
Function returned F0E8358
释放完毕后再去看一下上述latch,可以看到我用红色标注的前4个byte又被Oracle改回来了:
SQL> oradebug peek 0x3497DE
[3497DE
00000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ...
因为我刚才手工持有、释放了一次cache buffers lru chain latch,所以现在这个latch的willing to wait模式的gets的值从0变成了1:
SQL> select * from (select addr,to_number(addr,'XXXXXXXXXXXX'),gets,misses from v$latch_children where name = 'cache buffers lru chain' order by addr desc) where rownum<5;
ADDR TO_NUMBER(ADDR,'XXXXXXXXXXXX') GETS MISSES
-------- ------------------------------ ---------- ----------
3497DE
3497DD94 882367892 10 0
3497D974 882366836 0 0
3497D8EC 882366700 10 0
准备工作已经做好,现在我们开始反证的过程。
我们反证的思路是:
我同时开3个session,session 1中先持有所有的cache buffers lru chain latch,session 2在session 1持有完所有的cache buffers lru chain latch后去执行update操作,这个时候session 2一定会hang住,同时我们在session 3中去做session 2的process dump,从session 3的dump文件中即可看到session 2中的latch持有与等待的情况。
先分别到session1、session 2、session 3中做一下准备工作:
Session 1:
SQL> select 'oradebug poke 0x' || addr || ' 4 0x00000001' from v$latch_children where name = 'cache buffers lru chain';
'ORADEBUGPOKE0X'||ADDR||'40X00
-------------------------------------
oradebug poke 0x3497DE
oradebug poke 0x3497DD94 4 0x00000001
oradebug poke 0x3497D974 4 0x00000001
oradebug poke 0x3497D8EC 4 0x00000001
oradebug poke 0x
oradebug poke 0x
oradebug poke 0x34970BFC 4 0x00000001
oradebug poke 0x34970B74 4 0x00000001
oradebug poke 0x
oradebug poke 0x
oradebug poke 0x34963E84 4 0x00000001
oradebug poke 0x34963DFC 4 0x00000001
oradebug poke 0x349575B4 4 0x00000001
oradebug poke 0x
oradebug poke 0x
oradebug poke 0x34957084 4 0x00000001
oradebug poke 0x
oradebug poke 0x
oradebug poke 0x
oradebug poke 0x
oradebug poke 0x3493DAC4 4 0x00000001
oradebug poke 0x3493DA
oradebug poke 0x3493D
oradebug poke 0x3493D594 4 0x00000001
oradebug poke 0x34930D
oradebug poke 0x34930CC4 4 0x00000001
oradebug poke 0x
oradebug poke 0x
oradebug poke 0x34923FD4 4 0x00000001
oradebug poke 0x
oradebug poke 0x34923B
oradebug poke 0x34923AA4 4 0x00000001
32 rows selected
Session 2:
SQL> select sid from v$mystat where rownum=1;
SID
----------
138
SQL> select spid from v$process where addr in (select paddr from v$session where
sid=138);
SPID
------------------------
5356
Session 3:
SQL> oradebug setospid 5356
Oracle pid: 25, Windows thread id: 5356, image: ORACLE.EXE (SHAD)
准备工作已经做完,现在我们先去session 1中去持有所有的cache buffers lru chain latch:
Session 1:
SQL> oradebug poke 0x3497DE
BEFORE: [3497DE
AFTER: [3497DE
SQL> oradebug poke 0x3497DD94 4 0x00000001
BEFORE: [3497DD94, 3497DD98) = 00000000
AFTER: [3497DD94, 3497DD98) = 00000001
SQL> oradebug poke 0x3497D974 4 0x00000001
BEFORE: [3497D974, 3497D978) = 00000000
AFTER: [3497D974, 3497D978) = 00000001
SQL> oradebug poke 0x3497D8EC 4 0x00000001
BEFORE: [3497D8EC, 3497D
AFTER: [3497D8EC, 3497D
SQL> oradebug poke 0x
BEFORE: [
AFTER: [
SQL> oradebug poke 0x
BEFORE: [
AFTER: [
SQL> oradebug poke 0x34970BFC 4 0x00000001
BEFORE: [34970BFC,
AFTER: [34970BFC,
SQL> oradebug poke 0x34970B74 4 0x00000001
BEFORE: [34970B74, 34970B78) = 00000000
AFTER: [34970B74, 34970B78) = 00000001
SQL> oradebug poke 0x
BEFORE: [
AFTER: [
SQL> oradebug poke 0x
BEFORE: [
AFTER: [
SQL> oradebug poke 0x34963E84 4 0x00000001
BEFORE: [34963E84, 34963E88) = 00000000
AFTER: [34963E84, 34963E88) = 00000001
SQL> oradebug poke 0x34963DFC 4 0x00000001
BEFORE: [34963DFC, 34963E00) = 00000000
AFTER: [34963DFC, 34963E00) = 00000001
SQL> oradebug poke 0x349575B4 4 0x00000001
BEFORE: [349575B4, 349575B8) = 00000000
AFTER: [349575B4, 349575B8) = 00000001
SQL> oradebug poke 0x
BEFORE: [
AFTER: [
SQL> oradebug poke 0x
BEFORE: [
AFTER: [
SQL> oradebug poke 0x34957084 4 0x00000001
BEFORE: [34957084, 34957088) = 00000000
AFTER: [34957084, 34957088) = 00000001
SQL> oradebug poke 0x
BEFORE: [
AFTER: [
SQL> oradebug poke 0x
BEFORE: [
AFTER: [
SQL> oradebug poke 0x
BEFORE: [
AFTER: [
SQL> oradebug poke 0x
BEFORE: [
AFTER: [
SQL> oradebug poke 0x3493DAC4 4 0x00000001
BEFORE: [3493DAC4, 3493DAC8) = 00000000
AFTER: [3493DAC4, 3493DAC8) = 00000001
SQL> oradebug poke 0x3493DA
BEFORE: [3493DA
AFTER: [3493DA
SQL> oradebug poke 0x3493D
BEFORE: [3493D
AFTER: [3493D
SQL> oradebug poke 0x3493D594 4 0x00000001
BEFORE: [3493D594, 3493D598) = 00000000
AFTER: [3493D594, 3493D598) = 00000001
SQL> oradebug poke 0x34930D
BEFORE: [34930D
AFTER: [34930D
SQL> oradebug poke 0x34930CC4 4 0x00000001
BEFORE: [34930CC4, 34930CC8) = 00000000
AFTER: [34930CC4, 34930CC8) = 00000001
SQL> oradebug poke 0x
BEFORE: [
AFTER: [
SQL> oradebug poke 0x
BEFORE: [
AFTER: [
SQL> oradebug poke 0x34923FD4 4 0x00000001
BEFORE: [34923FD4, 34923FD8) = 00000000
AFTER: [34923FD4, 34923FD8) = 00000001
SQL> oradebug poke 0x
BEFORE: [
AFTER: [
SQL> oradebug poke 0x34923B
BEFORE: [34923B
AFTER: [34923B
SQL> oradebug poke 0x34923AA4 4 0x00000001
BEFORE: [34923AA4, 34923AA8) = 00000000
AFTER: [34923AA4, 34923AA8) = 00000001
接着我们到session 2去随便执行一条update语句,这个一定会hang住的:
Session 2:
SQL> update scott.emp set ename='CUIHUA2' where empno=7369;
……这里hang住了
接着我们到session 3中去对session 2做一下process dump:
Session 3:
SQL> oradebug dump processstate 10
已处理的语句
SQL> oradebug tracefile_name
c:\app\cuihua\diag\rdbms\cuihua112\cuihua112\trace\cuihua112_ora_5356.trc
最后我们看一下上述trace文件:
Trace file c:\app\cuihua\diag\rdbms\cuihua112\cuihua112\trace\cuihua112_ora_5356.trc
Oracle Database
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Windows XP Version V5.1 Service Pack 2
CPU : 2 - type 586
Process Affinity : 0x0x00000000
Memory (Avail/Total): Ph:
Instance name: cuihua112
Redo thread mounted by this instance: 1
Oracle process number: 25
Windows thread id: 5356, image: ORACLE.EXE (SHAD)
*** 2012-02-13 13:41:39.859
*** SESSION ID:(138.8) 2012-02-13 13:41:39.859
*** CLIENT ID:() 2012-02-13 13:41:39.859
*** SERVICE NAME:(SYS$USERS) 2012-02-13 13:41:39.859
*** MODULE NAME:(sqlplus.exe) 2012-02-13 13:41:39.859
*** ACTION NAME:() 2012-02-13 13:41:39.859
Received ORADEBUG command (#1) 'dump processstate 10' from process 'Windows thread id: 2664, image: <none>'
===================================================
-------------
Process global information:
process: 0x35483BE4, call: 0x357D
----------------------------------------
SO: 0x35483BE4, type: 2, owner: 0x00000000, flag: INIT/-/-/0x00 if: 0x
proc=0x35483BE4, name=process, file=ksu.h LINE:11459, pg=0
(process) Oracle pid:25, ser:4, calls cur/top: 0x357D
flags : (0x0) -
flags2: (0x0), flags3: (0x0)
intr error: 0, call error: 0, sess error: 0, txn error 0
intr queue: empty
ksudlp FALSE at location: 0
(post info) last post received: 0 0 0
last post received-location: No post
last process to post me: none
last post sent: 0 0 0
last post sent-location: No post
last process posted by me: none
(latch info) wait_event=0 bits=2
Location from where call was made: kcb2.h LINE:3601 ID:kcbzgws:
waiting for 3493da
Location from where latch is held: kcbb.h LINE:2063 ID:kcbbic2:
Context saved from call: 0
state=busy [holder orapid=1] wlstate=free [value=0]
waiters [orapid (seconds since: put on list, posted, alive check)]:
31 (34, 1329111699, 34)
25 (27, 1329111699, 27)
waiter count=2
gotten 465 times wait, failed first 0 sleeps 0
gotten 13848 times nowait, failed: 4
possible holder pid = 1 ospid=
on wait list for 3493da
holding (efd=23)
Location from where latch is held: kcb2.h LINE:3034 ID:kcbgtcr: kslbegin shared:
Context saved from call: 4265585
state=busy(exclusive) [value=0x20000019, holder orapid=25] wlstate=free [value=0]
Process Group: DEFAULT, pseudo proc: 0x354E2458
O/S info: user: SYSTEM, term: Y8908CUIHUA, ospid: 5356
OSD pid info: Windows thread id: 5356, image: ORACLE.EXE (SHAD)
从trace文件里红色标注的部分我们可以看到:
waiting for 3493da
holding (efd=23)
即Oracle是在持有了cache buffers chains latch的情况下(注意这里是holding,意味着Oracle并没有释放,是正在持有)去持有cache buffers lru chain latch。
所以大师这次是真的错了。
关于timestamp with local time zone
Posted: February 1, 2012 | Author: Cui Hua | Filed under: Oracle | Leave a comment »有朋友问我:Oracle数据迁移时源库和目标库的时区不一样,这对于迁移数据而言是否有影响?
这个是没有影响的。
Oracle里跟时区有关的数据类型有两种,分别是timestamp with time zone和timestamp with local time zone,Oracle分别用13个byte和11个byte来存储他们。
这里面timestamp with time zone没什么好说的,因为里面的时区是用2个byte来固定的存储时区的偏移量,所以源库和目标库的时区即使不一样,对原始数据也没有影响。
timestamp with local time zone跟timestamp with time zone最大的区别就是timestamp with local time zone不会具体存储时区的偏移量,而是会根据dbtimezone和用户插入数据时指定的时区(通常就等于sessiontimezone)之间的差值来调整所存储的时间:
我们来看一个实例:
先来看数据库1:
数据库1:
SQL> select dbtimezone from dual;
DBTIMEZONE
———-
+08:00
SQL> select systimestamp from dual;
SYSTIMESTAMP
——————————————————————————–
01-2月 -12 04.15.54.224786 下午 +08:00
SQL> select sessiontimezone from dual;
SESSIONTIMEZONE
—————————————————————————
+08:00
现在这个库的systimestamp中包含的时区和dbtimezone相同,所以当我插入systimestamp的时候,Oracle 不会对timestamp with local time zone做时间上的调整:
SQL> create table t2(col1 timestamp with local time zone);
Table created
SQL> insert into t2 values(systimestamp);
1 row inserted
SQL> commit;
Commit complete
SQL> select col1,dump(col1) from t2;
COL1 DUMP(COL1)
————————————————– —————————————————————
01-2月 -12 04.18.58.839122 下午 Typ=231 Len=11: 120,112,2,1,17,19,59,50,3,252,80
从结果里我们看到,17减1后是16,也就是下午4点,所以这里是没有发生时间的调整的,原因我上面已经说过了。
我们修改一下sessiontimezone后再插入一条记录,我把sessiontimezone修改为洛杉矶所在的时区:
SQL> ALTER SESSION SET TIME_ZONE = ‘-08:00′;
Session altered
SQL> select sessiontimezone from dual;
SESSIONTIMEZONE
—————————————————————————
-08:00
注意,这个时候不能再插入systimestamp了,systimestamp跟sessiontimezone和dbtimezone没有关系,它只取决于database server端环境变量TZ的设置:
SQL> select systimestamp,current_timestamp from dual;
SYSTIMESTAMP CURRENT_TIMESTAMP
————————————————————– ————————————————-
01-2月 -12 04.37.19.615042 下午 +08:00 01-2月 -12 12.37.19.615109 上午 -08:00
SQL> select * from v$timezone_names where tzname like ‘America/Los%’;
TZNAME TZABBREV
—————————————————————- —————————————————————-
America/Los_Angeles LMT
America/Los_Angeles PST
America/Los_Angeles PDT
America/Los_Angeles PWT
America/Los_Angeles PPT
SQL> insert into t2 values(to_timestamp_tz(’2012-02-01 16:38:19 America/Los_Angeles’,'YYYY-MM-DD HH24:MI:SS TZR’));
1 row inserted
SQL> commit;
Commit complete
SQL> select col1,dump(col1) from t2;
COL1 DUMP(COL1)
————————————————– ——————————————————————————–
01-2月 -12 12.18.58.839122 上午 Typ=231 Len=11: 120,112,2,1,17,19,59,50,3,252,80
01-2月 -12 04.38.19.000000 下午 Typ=231 Len=7: 120,112,2,2,9,39,20
从结果里我们可以看到,之前插入的那条记录显示结果变成了2月1日上午零点18分,这是正常的,因为timestamp with local time zone的显示结果会随着sessiontimezone的改变而改变,洛杉矶滞后于北京16个小时,所以我们在16点18分插入的那条记录的显示结果变成了上午零点18分。
现在我们来解释刚刚插入的第二条记录的显示结果:
插入的时候我们指定了是以洛杉矶当地时间16点38分19秒,而现在的sessiontimezone就是洛杉矶所在的时区,所以第二条记录的显示结果就是16点38分19秒。
现在第二条记录Oracle实际上的存储结果是120,112,2,2,9,39,20,这里因为dbtimezone是+08:00,插入的时候我们指定了是以洛杉矶当地时间(相当于sessiontimezone是-08:00),所以Oracle要对timestamp with local time zone实际存储结果加上sessiontimezone和dbtimezone的差值,也就是16个小时。16点38分19秒加上16是32点38分19秒,也就是第二天早上8点38分19秒,所以这里Oracle实际的存储结果变成了2,2,9,39,20。
我们导出上述表:
oracle:/nbsdata01/oradata/testdb>exp \’sys/oracle@nbsdev as sysdba\’ file=t2.dmp tables=t2
Export: Release 11.2.0.1.0 – Production on Wed Feb 1 17:11:25 2012
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 – 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Export done in UTF8 character set and AL16UTF16 NCHAR character set
server uses AL32UTF8 character set (possible charset conversion)
About to export specified tables via Conventional Path …
. . exporting table T2 2 rows exported
Export terminated successfully without warnings.
现在我们切换到目标数据库2,这个库里dbtimezone和数据库1的dbtimezone是不一样的,这里我用dbtimezone的差异来模拟时区的差异:
数据库2:
SQL> select dbtimezone from dual;
DBTIMEZONE
——————
+00:00
SQL> select sessiontimezone from dual;
SESSIONTIMEZONE
——————————————————————————–
+08:00
我们尝试导入刚才在数据库1里导出的表t2:
oracle:/nbsdata01/oradata/testdb>imp \’sys/oracle@testdb as sysdba\’ file=t2.dmp tables=t2 ignore=true
Import: Release 11.2.0.1.0 – Production on Wed Feb 1 17:16:50 2012
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 – 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Export file created by EXPORT:V11.02.00 via conventional path
import done in UTF8 character set and AL16UTF16 NCHAR character set
import server uses ZHS16GBK character set (possible charset conversion)
. importing SYS’s objects into SYS
. importing SYS’s objects into SYS
. . importing table “T2″ 2 rows imported
Import terminated successfully without warnings.
现在我们来看一下导入的结果:
SQL> select col1,dump(col1) from t2;
COL1
—————————————————————————
DUMP(COL1)
——————————————————————————–
01-FEB-12 04.18.58.839122 PM Typ=231 Len=11: 120,112,2,1,9,19,59,50,3,252,80
02-FEB-12 08.38.19.000000 AM Typ=231 Len=7: 120,112,2,2,1,39,20
现在我们再次回到数据库1并将sessiontimezone设成和数据库2一样:
数据库1:
SQL> ALTER SESSION SET TIME_ZONE = ‘+08:00′;
Session altered
SQL> select col1,dump(col1) from t2;
COL1 DUMP(COL1)
————————————————– ——————————————————————————–
01-2月 -12 04.18.58.839122 下午 Typ=231 Len=11: 120,112,2,1,17,19,59,50,3,252,80
02-2月 -12 08.38.19.000000 上午 Typ=231 Len=7: 120,112,2,2,9,39,20
可以看到两条记录的显示时间是一致的,所以对于timestamp with local time zone而言,数据迁移时不同的时区是没有关系的。
这里的本质原因是Oracle在做数据迁移时,对timestamp with local time zone类型的数据Oracle会根据源库和目标库不同的dbtimezone来对timestamp with local time zone类型的数据做转换。这里数据库1的dbtimezone是+08:00,数据库2的dbtimezone是+00:00,所以Oracle对原来的第一条记录的小时17减了8变成了9,第二条记录的小时9减了8变成1。

Recent Comments