I. Preface
Data Pump is a tool we use all the time, but without understanding how it works and how it differs from the classic exp/imp utilities, errors can be hard to troubleshoot.
At its core, a Data Pump job is a server-side stored-procedure job executed by DW worker processes. Several workers can run concurrently, and job state can be queried from the dba_datapump_jobs view. Closing the terminal or pressing Ctrl+C does not stop a running job; to manage a job you must attach to it.
Data Pump covers logical backups, data migration, transportable tablespaces, and more, making it one of a DBA's most frequently used tools.
Below is a simple walkthrough on a 12c database.
II. Data Export (expdp)
- An export runs as a stored-procedure job inside the database, so it needs a directory object that defines where the dump files, log files, and so on are written.
- When the export is launched, the database creates the job automatically; the job creates a master table to record its metadata, and its state is also recorded in the dba_datapump_jobs view.
- The source of an export is the database and the target is the dump file: the chosen database objects are written into physical dump files. Note the connection model: an export has no relationship to any target database.
- The exporting user needs the exp_full_database role and read/write privileges on the directory object.
1. Check the tablespaces involved
Tables:
set line 300 pages 100
col tablespace_name for a40
col owner for a30
select distinct tablespace_name,owner from dba_tables
where owner in (select username from dba_users where account_status='OPEN' and username not in ('SYS','SYSTEM','DBSNMP'))
order by 2;
Indexes:
set line 300 pages 100
col tablespace_name for a40
col owner for a30
select distinct tablespace_name,owner from dba_indexes
where owner in (select username from dba_users where account_status='OPEN' and username not in ('SYS','SYSTEM','DBSNMP'))
order by 2;
Note: plan which schemas need to be exported and find the tablespaces their data lives in; this makes planning the later import easier.
2. Check the logical size of the database
select sum(bytes)/1024/1024/1024 from dba_segments;
Note: it also helps to identify the large tables beforehand (say, tens of millions of rows or more). That way the progress shown during the export is meaningful, and the big tables can be pulled out and exported separately if needed.
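A sketch of that check: list the largest segments up front so the big tables are known before the export starts (run in SQL*Plus as a privileged user).

```sql
-- Sketch: the 20 largest segments in the database, largest first.
set line 300 pages 100
col owner for a30
col segment_name for a40
select * from (
  select owner, segment_name, segment_type,
         round(bytes/1024/1024/1024, 2) as size_gb
    from dba_segments
   order by bytes desc
) where rownum <= 20;
```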
3. Create the directory object
Prepare the directory:
mkdir /backup/dumpfile
chown oracle:oinstall /backup/dumpfile
Create it:
create directory mydir as '/backup/dumpfile';
Note: here the directory is created as SYS by default; an ordinary user needs to be granted privileges on it.
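A sketch of those grants for an ordinary user (here the application user `wl` from this article; run as SYS):

```sql
-- Role needed to export objects outside the user's own schema.
grant exp_full_database to wl;
-- Read/write access on the directory object created above.
grant read, write on directory mydir to wl;
```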
4. Export the data
Create a parameter file:
vi /home/oracle/wl_full.par
USERID=system/*******
DIRECTORY=mydir
DUMPFILE=wl20220216_%U.dmp
LOGFILE=wl20220216_exp.log
JOB_NAME=wl_export_full
LOGTIME=all
SCHEMAS=wl
EXCLUDE=statistics
COMPRESSION=all
PARALLEL=8
CLUSTER=no
Note: you can also pass the parameters directly on the command line; it's a matter of preference. Data Pump has a great many parameters, which is what makes it so flexible and powerful; run expdp -help to see them all. The ones used here:
- USERID: the user credentials; an ordinary user needs the appropriate grants
- DIRECTORY: the directory object
- DUMPFILE: the dump file name(s); %U expands to a two-digit sequence number
- LOGFILE: the export log file name
- JOB_NAME: a custom job name, which makes the job easier to manage
- LOGTIME: all records a timestamp for every step
- SCHEMAS: the schema(s) to export, i.e. which application user's data to dump
- EXCLUDE: content to leave out of the export; statistics here, but it can be other object types
- COMPRESSION: the compression level; all compresses both data and metadata (the highest level), and can be combined with COMPRESSION_ALGORITHM for an even better ratio
- PARALLEL: the degree of parallelism, i.e. the number of DW worker processes
- CLUSTER: a RAC-specific parameter that controls whether the job may run on every node
Run the job:
expdp PARFILE=/home/oracle/wl_full.par
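For reference, a sketch of the same job written with inline parameters instead of a parameter file (credentials masked as elsewhere in this article; requires an Oracle environment):

```shell
# Sketch: inline-parameter equivalent of the parfile above.
expdp system/******* DIRECTORY=mydir DUMPFILE=wl20220216_%U.dmp \
  LOGFILE=wl20220216_exp.log JOB_NAME=wl_export_full LOGTIME=all \
  SCHEMAS=wl EXCLUDE=statistics COMPRESSION=all PARALLEL=8 CLUSTER=no
```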
5. Check the job log and status
Watch the export log
This is the log file defined by the LOGFILE parameter:
tail -1000f wl20220216_exp.log
Export: Release 12.2.0.1.0 - Production on Thu Oct 15 11:54:07 2020
Copyright (c) 1982, 2017, Oracle and/or its affiliates. All rights reserved.
;;;
Connected to: Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production
15-OCT-20 11:54:19.635: ;;; **************************************************************************
15-OCT-20 11:54:19.640: ;;; Parfile values:
15-OCT-20 11:54:19.645: ;;; parfile: parallel=8
15-OCT-20 11:54:19.650: ;;; parfile: cluster=N
15-OCT-20 11:54:19.655: ;;; parfile: schemas=wl
15-OCT-20 11:54:19.661: ;;; parfile: logtime=ALL
15-OCT-20 11:54:19.666: ;;; parfile: logfile=wl20220216_exp.log
15-OCT-20 11:54:19.671: ;;; parfile: dumpfile=wl20220216_%U.dmp
15-OCT-20 11:54:19.754: ;;; parfile: directory=mydir
15-OCT-20 11:54:19.760: ;;; **************************************************************************
15-OCT-20 11:54:20.427: FLASHBACK automatically enabled to preserve database integrity.
15-OCT-20 11:54:21.601: Starting "SYS"."SYS_EXPORT_SCHEMA_01": "/******** AS SYSDBA" parfile=/home/oracle/wl_full.par
15-OCT-20 11:54:24.230: Processing object type SCHEMA_EXPORT/SYSTEM_GRANT
15-OCT-20 11:54:24.273: Processing object type SCHEMA_EXPORT/ROLE_GRANT
15-OCT-20 11:54:24.274: Processing object type SCHEMA_EXPORT/DEFAULT_ROLE
15-OCT-20 11:54:24.275: Processing object type SCHEMA_EXPORT/USER
15-OCT-20 11:54:25.968: Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA
15-OCT-20 11:54:26.146: Processing object type SCHEMA_EXPORT/DB_LINK
15-OCT-20 11:54:26.773: Processing object type SCHEMA_EXPORT/SYNONYM/SYNONYM
15-OCT-20 11:54:26.900: Processing object type SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/FUNCTIONAL_INDEX/INDEX_STATISTICS
15-OCT-20 11:54:26.904: Processing object type SCHEMA_EXPORT/PACKAGE/PACKAGE_BODY
15-OCT-20 11:54:27.075: Processing object type SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS
15-OCT-20 11:54:27.222: Processing object type SCHEMA_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
15-OCT-20 11:54:27.621: Processing object type SCHEMA_EXPORT/SEQUENCE/SEQUENCE
15-OCT-20 11:54:28.311: Processing object type SCHEMA_EXPORT/TABLE/TABLE_DATA
15-OCT-20 11:54:29.010: Processing object type SCHEMA_EXPORT/TABLE/PROCACT_INSTANCE
15-OCT-20 11:54:29.999: Processing object type SCHEMA_EXPORT/TABLE/GRANT/OWNER_GRANT/OBJECT_GRANT
15-OCT-20 11:54:31.714: Processing object type SCHEMA_EXPORT/PACKAGE/PACKAGE_SPEC
15-OCT-20 11:54:31.830: Processing object type SCHEMA_EXPORT/FUNCTION/FUNCTION
15-OCT-20 11:54:31.969: Processing object type SCHEMA_EXPORT/PROCEDURE/PROCEDURE
15-OCT-20 11:54:32.145: Processing object type SCHEMA_EXPORT/PACKAGE/COMPILE_PACKAGE/PACKAGE_SPEC/ALTER_PACKAGE_SPEC
15-OCT-20 11:54:32.157: Processing object type SCHEMA_EXPORT/FUNCTION/ALTER_FUNCTION
15-OCT-20 11:54:32.163: Processing object type SCHEMA_EXPORT/PROCEDURE/ALTER_PROCEDURE
15-OCT-20 11:54:32.582: Processing object type SCHEMA_EXPORT/TABLE/COMMENT
15-OCT-20 11:54:32.630: Processing object type SCHEMA_EXPORT/TABLE/TABLE
15-OCT-20 11:54:32.972: Processing object type SCHEMA_EXPORT/TABLE/INDEX/FUNCTIONAL_INDEX/INDEX
15-OCT-20 11:54:34.086: Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/REF_CONSTRAINT
15-OCT-20 11:54:34.612: Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/CONSTRAINT
15-OCT-20 11:54:34.678: Processing object type SCHEMA_EXPORT/VIEW/VIEW
15-OCT-20 11:54:34.783: Processing object type SCHEMA_EXPORT/TABLE/INDEX/INDEX
15-OCT-20 11:54:47.347: Processing object type SCHEMA_EXPORT/STATISTICS/MARKER
15-OCT-20 11:58:02.392: . . exported "WL"."T_TEST_RECORD" 18.99 GB 66499480 rows
15-OCT-20 11:59:22.653: . . exported "WL"."T_TEST" 30.47 GB 70834724 rows
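In a long log, the completed-table lines can be summarized with a quick script. A sketch (the two sample lines are embedded so the snippet runs standalone; against a real log, feed the log file in instead):

```shell
# Sketch: pull table name, size, and row count from Data Pump
# ". . exported" log lines. Sample lines embedded for illustration.
log='15-OCT-20 11:58:02.392: . . exported "WL"."T_TEST_RECORD"  18.99 GB 66499480 rows
15-OCT-20 11:59:22.653: . . exported "WL"."T_TEST"  30.47 GB 70834724 rows'

# Fields after whitespace-splitting: $6=table, $7/$8=size, $9=rows.
printf '%s\n' "$log" |
  awk '/\. \. exported/ {print $6, $7 $8, $9, $10}'
```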
Watch the job status
To actually manage a job, you must attach to it.
List the jobs:
set line 300 pages 100
col owner_name for a20
col job_name for a30
col state for a20
select owner_name,job_name,state from dba_datapump_jobs;
OWNER_NAME JOB_NAME STATE
-------------------- ------------------------------ --------------------
SYSTEM SJJ_EXPORT_FULL EXECUTING
Attach to the job:
expdp system attach=sjj_export_full
Check the job status:
status
Export> status
Job: SJJ_EXPORT_FULL
Operation: EXPORT
Mode: FULL
State: STOP PENDING
Bytes Processed: 32,384,054,664
Percent Done: 99
Current Parallelism: 8
Job Error Count: 0
Job heartbeat: 6
Dump File: /backup/dumpfile/full20220216_%u.dmp
Dump File: /backup/dumpfile/full20220216_01.dmp
bytes written: 15,032,143,872
Dump File: /backup/dumpfile/full20220216_02.dmp
bytes written: 3,542,888,448
Dump File: /backup/dumpfile/full20220216_03.dmp
bytes written: 3,009,998,848
Dump File: /backup/dumpfile/full20220216_04.dmp
bytes written: 2,373,156,864
Dump File: /backup/dumpfile/full20220216_05.dmp
bytes written: 3,188,301,824
Dump File: /backup/dumpfile/full20220216_06.dmp
bytes written: 948,051,968
Dump File: /backup/dumpfile/full20220216_07.dmp
bytes written: 37,437,628,416
Dump File: /backup/dumpfile/full20220216_08.dmp
bytes written: 2,978,820,096
Worker 1 Status:
Instance ID: 1
Instance name: wldb1
Host name: wldb1
Object start time: Wednesday, 16 February, 2022 20:35:08
Object status at: Wednesday, 16 February, 2022 22:03:31
Process Name: DW00
State: WORK WAITING
Worker 2 Status:
Instance ID: 1
Instance name: wldb1
Host name: wldb1
Access method: direct_path
Object start time: Wednesday, 16 February, 2022 20:33:35
Object status at: Wednesday, 16 February, 2022 20:40:42
Process Name: DW01
State: WORK WAITING
Worker 3 Status:
Instance ID: 1
Instance name: wldb1
Host name: wldb1
Access method: direct_path
Object start time: Wednesday, 16 February, 2022 20:33:35
Object status at: Wednesday, 16 February, 2022 21:16:26
Process Name: DW02
State: WORK WAITING
Worker 4 Status:
Instance ID: 1
Instance name: wldb1
Host name: wldb1
Access method: direct_path
Object start time: Wednesday, 16 February, 2022 20:33:38
Object status at: Wednesday, 16 February, 2022 20:40:37
Process Name: DW03
State: WORK WAITING
Worker 5 Status:
Instance ID: 1
Instance name: wldb1
Host name: wldb1
Object start time: Wednesday, 16 February, 2022 20:36:11
Object status at: Wednesday, 16 February, 2022 20:38:13
Process Name: DW04
State: EXECUTING
Object Schema: WL
Object Name: T_TEST
Object Type: DATABASE_EXPORT/SCHEMA/TABLE/TABLE_DATA
Completed Objects: 4
Total Objects: 939
Completed Rows: 430,816
Worker Parallelism: 1
Worker 6 Status:
Instance ID: 1
Instance name: wldb1
Host name: wldb1
Access method: external_table
Object start time: Wednesday, 16 February, 2022 20:33:35
Object status at: Wednesday, 16 February, 2022 20:41:10
Process Name: DW05
State: WORK WAITING
Worker 7 Status:
Instance ID: 1
Instance name: wldb1
Host name: wldb1
Access method: direct_path
Object start time: Wednesday, 16 February, 2022 20:33:35
Object status at: Wednesday, 16 February, 2022 20:40:31
Process Name: DW06
State: WORK WAITING
Worker 8 Status:
Instance ID: 1
Instance name: wldb1
Host name: wldb1
Access method: direct_path
Object start time: Wednesday, 16 February, 2022 20:34:00
Object status at: Wednesday, 16 February, 2022 20:40:31
Process Name: DW07
State: WORK WAITING
Export>
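The "bytes written" figures in the status output can be totaled to estimate overall dump size. A sketch (a few sample lines are embedded so the snippet runs standalone):

```shell
# Sketch: sum the "bytes written" lines from an attached session's
# status output and report the total in GB. Sample lines embedded.
status='bytes written:       15,032,143,872
bytes written:        3,542,888,448
bytes written:        3,009,998,848'

# Field $3 is the byte count; strip the thousands separators, then sum.
printf '%s\n' "$status" |
  awk '/bytes written/ {gsub(/,/, "", $3); total += $3}
       END {printf "total GB: %.2f\n", total/1024/1024/1024}'
```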
Note: if the export log goes quiet for a long time, check the job status from time to time and watch whether the objects being processed change; this tells you whether the job is still healthy.
Other attach-mode commands:
STOP_JOB: stops the job; it can be restarted later, and its row remains in dba_datapump_jobs
START_JOB: restarts a stopped job
KILL_JOB: forcibly terminates the job and removes its row from dba_datapump_jobs
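A sketch of a typical management session using these commands (entered at the Export> prompt after attaching; the job name is the one defined in the parameter file):

```
expdp system attach=wl_export_full
Export> status        -- show progress, workers, dump files
Export> stop_job      -- pause the job (answer "yes" when prompted)
expdp system attach=wl_export_full
Export> start_job     -- resume the stopped job later
Export> kill_job      -- or abandon it entirely; this also drops the master table
```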
III. Data Import (impdp)
The import works much like the export:
- An import is also a stored-procedure job inside the database, so it needs a directory object defining where the dump files, log files, and so on live.
- When the import is launched, the database creates the job automatically; the job creates a master table to record its metadata, and its state is also recorded in the dba_datapump_jobs view.
- The source of an import is the dump file and the target is the database: the chosen objects in the dump files are written into the target database. Note the connection model: an import has no relationship to the source database.
- The importing user needs the imp_full_database role and read/write privileges on the directory object.
1. Create the user and tablespace
create tablespace TEST datafile size 31G autoextend on;
create user TEST identified by "TEST" default tablespace TEST quota unlimited on TEST;
Note: the key point is to set up the tablespaces; the user itself does not have to be pre-created.
If the target tablespaces differ, use REMAP_TABLESPACE to map them.
To import into a different user, use REMAP_SCHEMA.
Tables and their data can be remapped as well.
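As a sketch, the remapping parameters might be combined in a parameter file like this (the schema, tablespace, and table names here are hypothetical):

```
REMAP_SCHEMA=wl:test
REMAP_TABLESPACE=users:TEST
REMAP_TABLE=wl.t_test:t_test_copy
```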
2. Create the directory object
Prepare the directory:
mkdir /backup/dumpfile
chown oracle:oinstall /backup/dumpfile
Create it:
create directory mydir as '/backup/dumpfile';
3. Import the data
Create a parameter file:
vi /home/oracle/imp_full.par
USERID=system/*******
DIRECTORY=mydir
DUMPFILE=wl20220216_%U.dmp
LOGFILE=wl20220216_imp.log
JOB_NAME=wl_import_full
LOGTIME=ALL
CLUSTER=NO
PARALLEL=8
TRANSFORM=DISABLE_ARCHIVE_LOGGING:Y
REMAP_TABLESPACE=users:CISMT_DATA
TABLE_EXISTS_ACTION=REPLACE
Note: you can also pass the parameters directly on the command line; it's a matter of preference. Run impdp -help to see all parameters. The ones used here:
- USERID: the user credentials for the import
- DIRECTORY: the directory object
- DUMPFILE: the dump file(s) to read
- LOGFILE: the import log file name
- JOB_NAME: a custom job name, which makes the job easier to manage
- LOGTIME: all records a timestamp for every step
- PARALLEL: the degree of parallelism, i.e. the number of DW worker processes; it should match the number of dump files
- CLUSTER: a RAC-specific parameter that controls whether the job may run on every node
- TRANSFORM: a transformation parameter; DISABLE_ARCHIVE_LOGGING:Y skips archive logging during the load, a 12c new feature
- REMAP_TABLESPACE: remaps tablespaces, in source:target form
- TABLE_EXISTS_ACTION: what to do when a table already exists; REPLACE drops and re-creates it
Run the job:
impdp PARFILE=/home/oracle/imp_full.par
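After the import finishes, a quick sanity check helps. A sketch (the schema name WL is the one used in this article): compare object counts against the source and recompile anything left invalid.

```sql
-- Object counts per type in the imported schema; compare with the source.
select object_type, count(*) from dba_objects
 where owner = 'WL' group by object_type order by 1;

-- Recompile only the objects left invalid by the import.
exec dbms_utility.compile_schema(schema => 'WL', compile_all => FALSE);
```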
4. Check the job log and status
Watch the import log
This is the log file defined by the LOGFILE parameter:
tail -1000f wl20220216_imp.log
;;;
Import: Release 12.2.0.1.0 - Production on Thu Oct 15 14:59:51 2020
Copyright (c) 1982, 2017, Oracle and/or its affiliates. All rights reserved.
;;;
Connected to: Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production
15-OCT-20 14:59:59.462: ;;; **************************************************************************
15-OCT-20 14:59:59.467: ;;; Parfile values:
15-OCT-20 14:59:59.472: ;;; parfile: table_exists_action=REPLACE
15-OCT-20 14:59:59.477: ;;; parfile: remap_tablespace=SAPME:test
15-OCT-20 14:59:59.488: ;;; parfile: transform=DISABLE_ARCHIVE_LOGGING:Y
15-OCT-20 14:59:59.493: ;;; parfile: parallel=8
15-OCT-20 14:59:59.498: ;;; parfile: cluster=N
15-OCT-20 14:59:59.503: ;;; parfile: logtime=ALL
15-OCT-20 14:59:59.508: ;;; parfile: logfile=wl20220216_imp.log
15-OCT-20 14:59:59.513: ;;; parfile: dumpfile=wl20220216_%U.dmp
15-OCT-20 14:59:59.518: ;;; parfile: directory=mydir
15-OCT-20 14:59:59.523: ;;; **************************************************************************
15-OCT-20 15:00:01.940: Master table "SYS"."SYS_IMPORT_FULL_03" successfully loaded/unloaded
15-OCT-20 15:00:03.878: Starting "SYS"."SYS_IMPORT_FULL_03": "/******** AS SYSDBA" parfile=/home/oracle/imp_full.par
15-OCT-20 15:00:03.970: Processing object type SCHEMA_EXPORT/USER
15-OCT-20 15:00:05.109: ORA-31684: Object type USER:"WIPTEST" already exists
15-OCT-20 15:00:05.286: Processing object type SCHEMA_EXPORT/SYSTEM_GRANT
15-OCT-20 15:00:06.522: Processing object type SCHEMA_EXPORT/ROLE_GRANT
15-OCT-20 15:00:07.518: Processing object type SCHEMA_EXPORT/DEFAULT_ROLE
15-OCT-20 15:00:08.364: Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA
15-OCT-20 15:00:08.710: Processing object type SCHEMA_EXPORT/DB_LINK
15-OCT-20 15:00:09.081: Processing object type SCHEMA_EXPORT/SYNONYM/SYNONYM
15-OCT-20 15:00:11.453: Processing object type SCHEMA_EXPORT/SEQUENCE/SEQUENCE
15-OCT-20 15:00:12.108: Processing object type SCHEMA_EXPORT/TABLE/PROCACT_INSTANCE
15-OCT-20 15:00:14.614: Processing object type SCHEMA_EXPORT/TABLE/TABLE
15-OCT-20 15:00:25.856: Processing object type SCHEMA_EXPORT/TABLE/TABLE_DATA
15-OCT-20 15:21:09.917: . . imported "WL"."T_TEST" 18.99 GB 66499480 rows
15-OCT-20 15:26:00.295: . . imported "WL"."TEST01" 30.47 GB 70834724 rows
Watch the job status
To actually manage a job, you must attach to it.
List the jobs:
set line 300 pages 100
col owner_name for a20
col job_name for a30
col state for a20
select owner_name,job_name,state from dba_datapump_jobs;
OWNER_NAME JOB_NAME STATE
-------------------- ------------------------------ --------------------
SYSTEM SJJ_IMPORT_FULL EXECUTING
Attach to the job:
impdp system attach=sjj_import_full
Check the job status:
status
Note: the status output is much like the export's, and the operations work the same way.
Going further: to drill into job progress in more detail, analyze the job's database sessions, their wait events, and the execution progress of their SQL statements.
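A sketch of that deeper analysis using standard dynamic views: map the sessions belonging to Data Pump jobs, then check long-running operation progress.

```sql
-- Sessions attached to Data Pump jobs, with their current wait events.
select s.sid, s.serial#, s.event, d.job_name
  from v$session s
  join dba_datapump_sessions d on s.saddr = d.saddr;

-- Progress of long-running operations (e.g. the scans feeding the job).
select sid, opname, sofar, totalwork,
       round(sofar/totalwork*100, 1) as pct_done
  from v$session_longops
 where totalwork > 0 and sofar <> totalwork;
```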