最近遇到一個 case,在關閉數據庫的時候報 ORA-01210: data file header is media corrupt
sys@ORCL 09:11:53> shutdown immediate
ORA-01122: database file 5 failed verification check
ORA-01110: data file 5: '/oradata/ORCL/tbs01.dbf'
ORA-01210: data file header is media corrupt
通過查看 alter 日志發現,報錯的數據文件在兩個數據庫實例中都有創建tbs成功的語句中,類似如下:
Completed: create tablespace tbs DataFile '/oradata/ORCL/tbs01.dbf' size 100M autoextend on next 1000M maxsize 32767M;
可以看到,都沒有使用 reuse 關鍵字。以下是在我本地環境模擬。
數據庫的版本是 19.5,由于目前19.5ru需要密碼才能下載,我在本地19.3 和19.17 環境分別模擬如下:
–19.3
在 hhhh 實例上創建 hhtbs01 表空間,數據文件是 /u01/hhdata/HHHH/datafile/hhtbs01.dbf
create tablespace hhtbs01 DataFile '/u01/hhdata/HHHH/datafile/hhtbs01.dbf' size 100M autoextend on next 1000M maxsize 32767M;
hhhh:sys@(301_HHHH)>
hhhh:sys@(301_HHHH)> alter system switch logfile;
System altered.
hhhh:sys@(301_HHHH)> create table h (i int) tablespace hhtbs01;
Table created.
hhhh:sys@(301_HHHH)> insert into h values(100);
1 row created.
hhhh:sys@(301_HHHH)> c/100/101
1* insert into h values(101)
hhhh:sys@(301_HHHH)> /
1 row created.
hhhh:sys@(301_HHHH)> commit;
Commit complete.
hhhh:sys@(301_HHHH)> insert into h select * from h;
2 rows created.
hhhh:sys@(301_HHHH)> /
4 rows created.
hhhh:sys@(301_HHHH)> /
8 rows created.
hhhh:sys@(301_HHHH)> /
16 rows created.
hhhh:sys@(301_HHHH)>
并插入一些數據。
ORADB01 instance 也創建 hhtbs01 表空間,數據文件是 /u01/hhdata/HHHH/datafile/hhtbs01.dbf
ORADB01:sys@(209_CDB$ROOT)> create tablespace hhtbs01 DataFile '/u01/hhdata/HHHH/datafile/hhtbs01.dbf' size 100M autoextend on next 1000M maxsize 32767M;
create tablespace hhtbs01 DataFile '/u01/hhdata/HHHH/datafile/hhtbs01.dbf' size 100M autoextend on next 1000M maxsize 32767M
*
ERROR at line 1:
ORA-01119: error in creating database file '/u01/hhdata/HHHH/datafile/hhtbs01.dbf'
ORA-27038: created file already exists
Additional information: 1
很顯然由于沒有reuse 關鍵字,無法創建成功的。但是根據 case 的alert 日志,很明顯兩個實例在不同的時間點,都執行創建表空間成功了,且都是同一個數據文件。
create tablespace hhtbs01 DataFile '/u01/hhdata/HHHH/datafile/hhtbs01.dbf' size 100M autoextend on next 1000M maxsize 32767M
Completed: create tablespace hhtbs01 DataFile '/u01/hhdata/HHHH/datafile/hhtbs01.dbf' size 100M autoextend on next 1000M maxsize 32767M
這里推測應該是手工將之前的文件進行了rm,不然第二個實例是無法創建成功的,即使通過 touch 一個空的數據文件,create tbs 不使用reuse關鍵字也是無法創建成功的。
oggscdg:sys@(963_CDB$ROOT)> !touch /u01/hhdata/HHHH/datafile/htouch.dbf
oggscdg:sys@(963_CDB$ROOT)> !ls -l /u01/hhdata/HHHH/datafile/htouch.dbf
-rw-r--r--. 1 oracle oinstall 0 Dec 21 09:35 /u01/hhdata/HHHH/datafile/htouch.dbf
oggscdg:sys@(963_CDB$ROOT)> create tablespace htouch DataFile '/u01/hhdata/HHHH/datafile/htouch.dbf' size 100M autoextend on next 1000M maxsize 32767M;
ORADB01:sys@(963_CDB$ROOT)> create tablespace htouch DataFile '/u01/hhdata/HHHH/datafile/htouch.dbf' size 100M autoextend on next 1000M maxsize 32767M;
create tablespace htouch DataFile '/u01/hhdata/HHHH/datafile/htouch.dbf' size 100M autoextend on next 1000M maxsize 32767M
*
ERROR at line 1:
ORA-01119: error in creating database file '/u01/hhdata/HHHH/datafile/htouch.dbf'
ORA-27038: created file already exists
Additional information: 1
文件已經存在的情況下,只有使用 reuse 關鍵字才能創建成功
ORADB01:sys@(963_CDB$ROOT)> create tablespace htouch DataFile '/u01/hhdata/HHHH/datafile/htouch.dbf' size 100M reuse autoextend on next 1000M maxsize 32767M;
Tablespace created.
同時查看 alert 日志里清晰的記錄 有reuse 關鍵字
Completed: create tablespace htouch DataFile '/u01/hhdata/HHHH/datafile/htouch.dbf' size 100M reuse autoextend on next 1000M maxsize 32767M
這里通過 rm 掉hhhh 實例的數據文件之后,在 ORADB01 實例上,創建成功。
ORADB01:sys@(209_CDB$ROOT)> ! ls -l /u01/hhdata/HHHH/datafile/hhtbs01.dbf
-rw-r-----. 1 oracle oinstall 104865792 Dec 21 08:27 /u01/hhdata/HHHH/datafile/hhtbs01.dbf
ORADB01:sys@(209_CDB$ROOT)> ! rm /u01/hhdata/HHHH/datafile/hhtbs01.dbf
ORADB01:sys@(209_CDB$ROOT)> ! ls -l /u01/hhdata/HHHH/datafile/hhtbs01.dbf
ls: cannot access /u01/hhdata/HHHH/datafile/hhtbs01.dbf: No such file or directory
ORADB01:sys@(209_CDB$ROOT)> create tablespace hhtbs01 DataFile '/u01/hhdata/HHHH/datafile/hhtbs01.dbf' size 100M autoextend on next 1000M maxsize 32767M;
Tablespace created.
ORADB01:sys@(209_CDB$ROOT)>
問題出現了,我的版本是 19.3 ,在過了大概 1分鐘左右,還沒進行 shutdown 測試呢,hhhh 實例 crash 了。
2022-12-21T08:27:03.855326+08:00
create tablespace hhtbs01 DataFile '/u01/hhdata/HHHH/datafile/hhtbs01.dbf' size 100M autoextend on next 1000M maxsize 32767M
Completed: create tablespace hhtbs01 DataFile '/u01/hhdata/HHHH/datafile/hhtbs01.dbf' size 100M autoextend on next 1000M maxsize 32767M
2022-12-21T08:27:54.223384+08:00
Thread 1 advanced to log sequence 13 (LGWR switch)
Current log# 1 seq# 13 mem# 0: /u01/hhdata/HHHH/onlinelog/redo01.log
2022-12-21T08:27:54.227725+08:00
ARC3 (PID:11362): Archived Log entry 8 added for T-1.S-12 ID 0xd4895537 LAD:1
2022-12-21T08:32:08.183283+08:00
Read of datafile '/u01/hhdata/HHHH/datafile/hhtbs01.dbf' (fno 10) header failed with ORA-01210
Hex dump of (file 10, block 1) in trace file /u01/app/oracle/diag/rdbms/hhhh/hhhh/trace/hhhh_ckpt_10899.trc
Corrupt block relative dba: 0x02800001 (file 10, block 1)
Bad header found during datafile header read
Data in bad block:
type: 11 format: 2 rdba: 0x04c00001
last change scn: 0x0000.0000.00000000 seq: 0x1 flg: 0x04
spare3: 0x0
consistency value in tail: 0x00000b01
check value in block header: 0xd9c0
computed block checksum: 0x0
Rereading datafile 10 header failed with ORA-01210
2022-12-21T08:32:08.265924+08:00
Errors in file /u01/app/oracle/diag/rdbms/hhhh/hhhh/trace/hhhh_ckpt_10899.trc:
ORA-63999: data file suffered media failure
ORA-01122: database file 10 failed verification check
ORA-01110: data file 10: '/u01/hhdata/HHHH/datafile/hhtbs01.dbf'
ORA-01210: data file header is media corrupt
2022-12-21T08:32:08.266337+08:00
Errors in file /u01/app/oracle/diag/rdbms/hhhh/hhhh/trace/hhhh_ckpt_10899.trc:
ORA-63999: data file suffered media failure
ORA-01122: database file 10 failed verification check
ORA-01110: data file 10: '/u01/hhdata/HHHH/datafile/hhtbs01.dbf'
ORA-01210: data file header is media corrupt
2022-12-21T08:32:08.299887+08:00
Errors in file /u01/app/oracle/diag/rdbms/hhhh/hhhh/trace/hhhh_mz00_4206.trc:
ORA-01110: data file 2: '/u01/hhdata/HHHH/datafile/htest01.dbf'
ORA-01210: data file header is media corrupt
Errors in file /u01/app/oracle/diag/rdbms/hhhh/hhhh/trace/hhhh_ckpt_10899.trc (incident=37033):
ORA-63999 [] [] [] [] [] [] [] [] [] [] [] []
Incident details in: /u01/app/oracle/diag/rdbms/hhhh/hhhh/incident/incdir_37033/hhhh_ckpt_10899_i37033.trc
2022-12-21T08:32:08.479859+08:00
Errors in file /u01/app/oracle/diag/rdbms/hhhh/hhhh/trace/hhhh_mz00_4206.trc:
ORA-01110: data file 5: '/u01/hhdata/HHHH/datafile/tbs01.dbf'
ORA-01210: data file header is media corrupt
2022-12-21T08:32:08.626372+08:00
Errors in file /u01/app/oracle/diag/rdbms/hhhh/hhhh/trace/hhhh_mz00_4206.trc:
ORA-01110: data file 8: '/u01/hhdata/HHHH/datafile/oth01.dbf'
2022-12-21T08:32:08.783624+08:00
Errors in file /u01/app/oracle/diag/rdbms/hhhh/hhhh/trace/hhhh_mz00_4206.trc:
ORA-01110: data file 9: '/u01/hhdata/HHHH/datafile/coth01.dbf'
ORA-01210: data file header is media corrupt
2022-12-21T08:32:08.927298+08:00
Errors in file /u01/app/oracle/diag/rdbms/hhhh/hhhh/trace/hhhh_mz00_4206.trc:
ORA-01110: data file 10: '/u01/hhdata/HHHH/datafile/hhtbs01.dbf'
ORA-01210: data file header is media corrupt
2022-12-21T08:32:09.252599+08:00
USER (ospid: ): terminating the instance due to ORA error
2022-12-21T08:32:09.350800+08:00
System state dump requested by (instance=1, osid=10899 (CKPT)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/hhhh/hhhh/trace/hhhh_diag_10880.trc
2022-12-21T08:32:10.954050+08:00
Dumping diagnostic data in directory=[cdmp_20221221083209], requested by (instance=1, osid=10899 (CKPT)), summary=[abnormal instance termination].
2022-12-21T08:32:12.176920+08:00
Instance terminated by USER, pid = 10899
這個庫我測試了多次,alert 日志有多個報 ORA-01210: data file header is media corrupt 的信息,可以忽略。但是 case 遇到情況是:庫可以正常運行,關閉庫的時候無法正常關閉提示 :ORA-01210: data file header is media corrupt。
由于找不到 19.5 的 ru 包,索性就升級到最新的19.17 進行模擬測試。
19.17 進行以上相同步驟的測試,創建之后原庫正常,可以讀寫,過了幾分鐘也沒有crash,查看alert日志也都正常,于是我就進行一些插入操作。在插入了 2621440 條數據之后,同樣報了 ORA-01122和 ORA-01110但是沒有報ORA-01210,這次報的是 ORA-01203: wrong incarnation of this file - wrong creation SCN ,庫的狀態仍然是 open,這個表空間上的表仍然可以讀。
sys@ORCL 20-DEC-22> create table a( i int) tablespace tbs;
Table created.
Elapsed: 00:00:00.38
sys@ORCL 20-DEC-22> insert into a values(1);
1 row created.
Elapsed: 00:00:00.01
sys@ORCL 20-DEC-22> c/1/2
1* insert into a values(2)
sys@ORCL 20-DEC-22> /
1 row created.
Elapsed: 00:00:00.00
sys@ORCL 20-DEC-22> c/2/3
1* insert into a values(3)
sys@ORCL 20-DEC-22> /
1 row created.
Elapsed: 00:00:00.00
sys@ORCL 20-DEC-22> commit;
Commit complete.
Elapsed: 00:00:00.00
sys@ORCL 20-DEC-22> alter system switch logfile;
sys@ORCL 07:19:44> /
/
2621440 rows created.
Elapsed: 00:00:00.54
sys@ORCL 07:19:45>
/
insert into a select * from a
*
ERROR at line 1:
ORA-01122: database file 5 failed verification check
ORA-01110: data file 5: '/oradata/ORCL/tbs01.dbf'
ORA-01203: wrong incarnation of this file - wrong creation SCN
關閉數據庫測試,出現如下錯誤信息:
sys@ORCL 09:59:39> shutdown immediate
ORA-01122: database file 5 failed verification check
ORA-01110: data file 5: '/oradata/ORCL/tbs01.dbf'
ORA-01203: wrong incarnation of this file - wrong creation SCN
sys@ORCL 09:59:44>
這里基本可以肯定是通過 rm 掉數據文件,然后再另一個數據庫實例進行了表空間的重建。這里可以總結下,19.3 的時候,數據庫直接crash,都無法提供服務了。19.5 會報 ORA-01210: data file header is media corrupt ,數據庫仍然是open狀態能夠提供服務(case 環境)。到了19.17 之后,報錯更詳細了:ORA-01203: wrong incarnation of this file - wrong creation SCN 而不僅僅是報文件頭被損壞。
當然通過rm 之后,數據庫還沒關閉的情況下,句柄是仍然存在的,通過 lsof 可以查看,可以看到 dbf文件仍然存在,理論可以恢復的,下次文章再測試下如何恢復。
[oracle@dsmart:/home/oracle]$ lsof |grep delete|grep dbf
ora_dbw0_ 10767 oracle 263uW REG 253,1 104865792 917524 /oradata/ORCL/tbs01.dbf (deleted)
ora_ckpt_ 10775 oracle 261u REG 253,1 104865792 917524 /oradata/ORCL/tbs01.dbf (deleted)
[oracle@dsmart:/home/oracle]$




