數(shù)據(jù)庫版本:PG12.1 openGauss/MOGDB 2.1.0
最近看到了許多關(guān)于PG等待事件的文章,對等待事件這部分也有了很大的興趣。
等待事件是一個累計的統(tǒng)計信息,表明一個server process要繼續(xù)完成作業(yè),必須等待一個時間的結(jié)束;因為系統(tǒng)資源有限,那么完成某些工作,所需資源就要輪流使用,那么在這個過程當(dāng)中,就會產(chǎn)生等待資源的情況。數(shù)據(jù)庫會用不同類型的定義,來描述這個事情,稱之為等待事件。
openGauss/MOGDB數(shù)據(jù)庫是基于PG研發(fā)的,PG是從9.6版本加入了等待事件特性,可以通過查詢pg_stat_activity中的wait_event_type和wait_event了解到每個sql進(jìn)程在當(dāng)前更詳細(xì)的執(zhí)行狀態(tài) 。openGauss/MOGDB在PG的基礎(chǔ)上有許多優(yōu)化及改動,把一些等待事件重新定義,等待事件在保留了部分原等待事件的基礎(chǔ)上也增加了一部分。
在分析問題的時候,等待事件對于我們還是較為重要的,我們可以根據(jù)等待事件,初步定位問題,并結(jié)合相關(guān)測試進(jìn)行驗證,看到了熟悉的等待事件,我們甚至能大概猜出問題所在。相對于ORACLE來說,PG以及openGauss/MOGDB的等待事件種類和數(shù)量較少。在等待事件這方面可能還有極大優(yōu)化的空間,如果能把等待事件的細(xì)粒程度增加,應(yīng)該會幫助我們更好的了解數(shù)據(jù)庫狀態(tài),解決數(shù)據(jù)庫問題。
一般來說產(chǎn)生等待事件的幾種情況:
1.請求的資源忙,需要資源釋放
2.會話處于空閑狀態(tài),等待任務(wù)
3.會話被阻塞,需要等待阻塞解除
以下內(nèi)容對比可能根據(jù)PG版本有變化,如果有誤,歡迎幫我指正交流。
一、ORACLE與PG與openGauss/MOGDB等待事件種類
在ORACLE 11G里,共有13類等待事件,包含了1367個等待事件,如下所示:
SYS@orcl11g> select distinct wait_class from v$event_name order by 1;
WAIT_CLASS
--------------------------------------
Administrative --管理類
Application --應(yīng)用類
Cluster --集群類
Commit --提交類
Concurrency --并發(fā)
Configuration --配置
Idle --空閑
Network --網(wǎng)絡(luò)
Other --其他
Queueing --隊列
Scheduler --任務(wù)調(diào)度
System I/O --系統(tǒng)I/O
User I/O --用戶I/O
13 rows selected.
SYS@orcl11g>select count(name) from v$event_name;
COUNT(NAME)
-------------------
1367
而PG數(shù)據(jù)庫里,有著9類等待事件
/* ----------
* Wait Classes
* ----------
*/
#define PG_WAIT_LWLOCK 0x01000000U /* 等待LWLock */
#define PG_WAIT_LOCK 0x03000000U /* 等待Lock */
#define PG_WAIT_BUFFER_PIN 0x04000000U /* 等待訪問數(shù)據(jù)緩沖區(qū) */
#define PG_WAIT_ACTIVITY 0x05000000U /* 服務(wù)器進(jìn)程處于空閑狀態(tài) */
#define PG_WAIT_CLIENT 0x06000000U /* 等待應(yīng)用客戶端程序在套接字中進(jìn)行操作 */
#define PG_WAIT_EXTENSION 0x07000000U /* 等待擴(kuò)展模塊中的操作 */
#define PG_WAIT_IPC 0x08000000U /* 等待進(jìn)程間通信 */
#define PG_WAIT_TIMEOUT 0x09000000U /* 等待達(dá)到超時時間 */
#define PG_WAIT_IO 0x0A000000U /* 等待IO操作完成 */
openGauss/MOGDB里,有著5類等待事件
/* ----------
* Wait Event Classes
* ----------
*/
#define WAIT_EVENT_END 0x00000000U /* 等待事件結(jié)束*/
#define PG_WAIT_LWLOCK 0x01000000U /* 等待LWLock */
#define PG_WAIT_LOCK 0x03000000U /* 等待Lock */
#define PG_WAIT_IO 0x0A000000U /* 等待IO操作完成 */
#define PG_WAIT_SQL 0x0B000000U /* 等待SQL的類型 */
二、PG和openGauss/MOGDB等待事件對比
1.WAIT_EVENT_END
在類型定義里,WAIT_EVENT_END更像是一種聲明等待事件結(jié)束的狀態(tài),可以看到代碼使用部分,在調(diào)用pgstat_report_waitevent 函數(shù)報告某個等待事件之后,進(jìn)行相關(guān)處理,最后再調(diào)用一次pgstat_report_waitevent 函數(shù)報告WAIT_EVENT_END,類似于聲明操作結(jié)束,等待結(jié)束。


pgstat_report_waitevent 函數(shù)部分代碼如下,這個函數(shù)會從服務(wù)器進(jìn)程需要等待的地方調(diào)用,會報告等待事件信息,等待信息被存儲作為4字節(jié)。

2.PG_WAIT_LOCK
Lock類的等待事件表示backend后臺進(jìn)程等待重量級的鎖,通常是指 relation、tuple、page、transactionid 等子類型鎖 。
在PG里共有10種,
/*
* LOCKTAG is the key information needed to look up a LOCK item in the
* lock hashtable. A LOCKTAG value uniquely identifies a lockable object.
*
* The LockTagType enum defines the different kinds of objects we can lock.
* We can handle up to 256 different LockTagTypes.
*/
typedef enum LockTagType
{
LOCKTAG_RELATION, /* whole relation */
LOCKTAG_RELATION_EXTEND, /* the right to extend a relation */
LOCKTAG_PAGE, /* one page of a relation */
LOCKTAG_TUPLE, /* one physical tuple */
LOCKTAG_TRANSACTION, /* transaction (for waiting for xact done) */
LOCKTAG_VIRTUALTRANSACTION, /* virtual transaction (ditto) */
LOCKTAG_SPECULATIVE_TOKEN, /* speculative insertion Xid and token */
LOCKTAG_OBJECT, /* non-relation database object */
LOCKTAG_USERLOCK, /* reserved for old contrib/userlock code */
LOCKTAG_ADVISORY /* advisory user locks */
} LockTagType;
而openGauss/MOGDB的LOCK類等待事件增加了LOCKTAG_PARTITION、LOCKTAG_PARTITION_SEQUENCE、LOCKTAG_CSTORE_FREESPACE、LOCKTAG_RELFILENODE、LOCKTAG_SUBTRANSACTION,分別是分區(qū)、分區(qū)序列、cstore的空閑空間、relfilenode以及子事務(wù)的等待。
/*
* LOCKTAG is the key information needed to look up a LOCK item in the
* lock hashtable. A LOCKTAG value uniquely identifies a lockable object.
*
* The LockTagType enum defines the different kinds of objects we can lock.
* We can handle up to 256 different LockTagTypes.
*/
typedef enum LockTagType {
LOCKTAG_RELATION, /* whole relation */
/* ID info for a relation is DB OID + REL OID; DB OID = 0 if shared */
LOCKTAG_RELATION_EXTEND, /* the right to extend a relation */
/* same ID info as RELATION */
LOCKTAG_PARTITION, /*partition*/
LOCKTAG_PARTITION_SEQUENCE, /*partition sequence*/
LOCKTAG_PAGE, /* one page of a relation */
/* ID info for a page is RELATION info + BlockNumber */
LOCKTAG_TUPLE, /* one physical tuple */
/* ID info for a tuple is PAGE info + OffsetNumber */
LOCKTAG_TRANSACTION, /* transaction (for waiting for xact done) */
/* ID info for a transaction is its TransactionId */
LOCKTAG_VIRTUALTRANSACTION, /* virtual transaction (ditto) */
/* ID info for a virtual transaction is its VirtualTransactionId */
LOCKTAG_OBJECT, /* non-relation database object */
/* ID info for an object is DB OID + CLASS OID + OBJECT OID + SUBID */
LOCKTAG_CSTORE_FREESPACE, /* cstore free space */
/*
* Note: object ID has same representation as in pg_depend and
* pg_description, but notice that we are constraining SUBID to 16 bits.
* Also, we use DB OID = 0 for shared objects such as tablespaces.
*/
LOCKTAG_USERLOCK, /* reserved for old contrib/userlock code */
LOCKTAG_ADVISORY, /* advisory user locks */
/* same ID info as spcoid, dboid, reloid */
LOCKTAG_RELFILENODE, /* relfilenode */
LOCKTAG_SUBTRANSACTION, /* subtransaction (for waiting for subxact done) */
/* ID info for a transaction is its TransactionId + SubTransactionId */
LOCK_EVENT_NUM
} LockTagType;
3.PG_WAIT_IO
如下的IO類部分為openGauss/MOGDB和PG12.1對比所不具有的,可以看到PG12.1比openGauss/MOGDB多了邏輯復(fù)制查詢重寫,Reorder?Buffer的讀寫等待、時間線歷史文件的同步、WAL同步、WAL BOOTSTRAP的同步、寫等待等等。
WAIT_EVENT_DSM_FILL_ZERO_WRITE, WAIT_EVENT_LOCK_FILE_RECHECKDATADIR_READ, WAIT_EVENT_LOGICAL_REWRITE_CHECKPOINT_SYNC, WAIT_EVENT_LOGICAL_REWRITE_MAPPING_SYNC, WAIT_EVENT_LOGICAL_REWRITE_MAPPING_WRITE, WAIT_EVENT_LOGICAL_REWRITE_SYNC, WAIT_EVENT_LOGICAL_REWRITE_TRUNCATE, WAIT_EVENT_LOGICAL_REWRITE_WRITE, WAIT_EVENT_REORDER_BUFFER_READ, WAIT_EVENT_REORDER_BUFFER_WRITE, WAIT_EVENT_REORDER_LOGICAL_MAPPING_READ, WAIT_EVENT_TIMELINE_HISTORY_FILE_SYNC, WAIT_EVENT_TIMELINE_HISTORY_FILE_WRITE, WAIT_EVENT_TIMELINE_HISTORY_READ, WAIT_EVENT_TIMELINE_HISTORY_SYNC, WAIT_EVENT_TIMELINE_HISTORY_WRITE, WAIT_EVENT_WALSENDER_TIMELINE_HISTORY_READ, WAIT_EVENT_WAL_BOOTSTRAP_SYNC, WAIT_EVENT_WAL_BOOTSTRAP_WRITE, WAIT_EVENT_WAL_SYNC,
而如下部分為PG12.1不具有而openGauss/MOGDB獨有的,多了undo文件相關(guān),doublerwite文件讀寫等等、
WAIT_EVENT_BUF_HASH_SEARCH, WAIT_EVENT_BUF_STRATEGY_GET, WAIT_EVENT_UNDO_FILE_EXTEND, WAIT_EVENT_UNDO_FILE_PREFETCH, WAIT_EVENT_UNDO_FILE_READ, WAIT_EVENT_UNDO_FILE_WRITE, WAIT_EVENT_UNDO_FILE_FLUSH, WAIT_EVENT_UNDO_FILE_SYNC, WAIT_EVENT_WAL_BUFFER_ACCESS, WAIT_EVENT_WAL_BUFFER_FULL, WAIT_EVENT_DW_READ, WAIT_EVENT_DW_WRITE, WAIT_EVENT_DW_SINGLE_POS, WAIT_EVENT_DW_SINGLE_WRITE, WAIT_EVENT_PREDO_PROCESS_PENDING, WAIT_EVENT_PREDO_APPLY, WAIT_EVENT_DISABLE_CONNECT_FILE_READ, WAIT_EVENT_DISABLE_CONNECT_FILE_SYNC, WAIT_EVENT_DISABLE_CONNECT_FILE_WRITE, WAIT_EVENT_MPFL_INIT, WAIT_EVENT_MPFL_READ, WAIT_EVENT_MPFL_WRITE, WAIT_EVENT_OBS_LIST, WAIT_EVENT_OBS_READ, WAIT_EVENT_OBS_WRITE, WAIT_EVENT_LOGCTRL_SLEEP, WAIT_EVENT_COMPRESS_ADDRESS_FILE_FLUSH, WAIT_EVENT_COMPRESS_ADDRESS_FILE_SYNC,
4.PG_WAIT_LWLOCK
LWLock的等待事件主要包含兩種:LWLockNamed 和LWLockTranche ,前者表示backend后臺進(jìn)程等待某種特定的輕量級鎖 ,后者表示表示backend后臺進(jìn)程等待一組相關(guān)輕量級鎖。
這一部分的等待事件較多,就不一一列舉了,但是可以看到,openGauss/MOGDB和PG12.1的LWLockNamed 類等待事件只有少部分一致,這個可能與openGauss/MOGDB基于PG 9.2.4版本研發(fā)有關(guān),等待事件重新定義了。而LWLockTranche 這部分還是有一部分是一致的,但明顯openGauss/MOGDB補充的等待事件數(shù)量也更加多。

5.PG_WAIT_SQL
這一類的等待事件是openGauss/MOGDB分的一類關(guān)于SQL的,可以看到等待的SQL類型。
/* ----------
* Wait Events - SQL
*
* Using this to indicate the type of SQL DML event.
* ----------
*/
typedef enum WaitEventSQL {
WAIT_EVENT_SQL_SELECT = PG_WAIT_SQL,
WAIT_EVENT_SQL_UPDATE,
WAIT_EVENT_SQL_INSERT,
WAIT_EVENT_SQL_DELETE,
WAIT_EVENT_SQL_MERGEINTO,
WAIT_EVENT_SQL_DDL,
WAIT_EVENT_SQL_DML,
WAIT_EVENT_SQL_DCL,
WAIT_EVENT_SQL_TCL
} WaitEventSQL;
如下是相應(yīng)的pgstat_report_wait_count 函數(shù)使用的關(guān)于這個等待事件的部分,可以看到它主要是使用pg atomic函數(shù)根據(jù)wait_event_info為用戶添加sql計數(shù)。
/* Using pg atomic function to add count for corresponsible WaitEventSQL */
if (classId == PG_WAIT_SQL) {
WaitEventSQL w = (WaitEventSQL)wait_event_info;
switch (w) {
case WAIT_EVENT_SQL_SELECT: {
UPDATE_SQL_COUNT(WaitCountStatusCell->WaitCountArray[dataid].wc_cnt.wc_sql_select,
WaitCountStatusCell->WaitCountArray[dataid].wc_cnt.selectElapse);
} break;
case WAIT_EVENT_SQL_UPDATE: {
UPDATE_SQL_COUNT(WaitCountStatusCell->WaitCountArray[dataid].wc_cnt.wc_sql_update,
WaitCountStatusCell->WaitCountArray[dataid].wc_cnt.updateElapse);
} break;
case WAIT_EVENT_SQL_INSERT: {
UPDATE_SQL_COUNT(WaitCountStatusCell->WaitCountArray[dataid].wc_cnt.wc_sql_insert,
WaitCountStatusCell->WaitCountArray[dataid].wc_cnt.insertElapse);
} break;
case WAIT_EVENT_SQL_DELETE: {
UPDATE_SQL_COUNT(WaitCountStatusCell->WaitCountArray[dataid].wc_cnt.wc_sql_delete,
WaitCountStatusCell->WaitCountArray[dataid].wc_cnt.deleteElapse);
} break;
case WAIT_EVENT_SQL_MERGEINTO:
pg_atomic_fetch_add_u64(&(WaitCountStatusCell->WaitCountArray[dataid].wc_cnt.wc_sql_mergeinto), 1);
break;
case WAIT_EVENT_SQL_DDL:
pg_atomic_fetch_add_u64(&(WaitCountStatusCell->WaitCountArray[dataid].wc_cnt.wc_sql_ddl), 1);
break;
case WAIT_EVENT_SQL_DML:
pg_atomic_fetch_add_u64(&(WaitCountStatusCell->WaitCountArray[dataid].wc_cnt.wc_sql_dml), 1);
break;
case WAIT_EVENT_SQL_DCL:
pg_atomic_fetch_add_u64(&(WaitCountStatusCell->WaitCountArray[dataid].wc_cnt.wc_sql_dcl), 1);
break;
case WAIT_EVENT_SQL_TCL:
pg_atomic_fetch_add_u64(&(WaitCountStatusCell->WaitCountArray[dataid].wc_cnt.wc_sql_tcl), 1);
break;
default:
break;
}
}
LWLockRelease(WaitCountHashLock);
}




