這是有關(guān)視圖合并的第二篇,有關(guān)視圖合并的基礎(chǔ),簡單select-project-join視圖和出現(xiàn)在外連接中的視圖是如何合并的,以及為什么其中一些視圖不能被合并。請(qǐng)看視圖合并一.在本篇中,我們將涵蓋復(fù)雜視圖合并。
復(fù)雜視圖合并
我們使用術(shù)語“復(fù)雜視圖合并”來描述對(duì)group by和distinct視圖的合并。就像簡單視圖合并,其允許優(yōu)化器提供額外的連接次序和訪問方法。此外,對(duì)group-by/distinct操作的評(píng)估,可以延后到連接之后才評(píng)估。延后group-by的評(píng)估會(huì)讓性能更好或更差,這取決于數(shù)據(jù)的特征。如果連接可以過濾,延后group-by 到連接后,會(huì)導(dǎo)致需執(zhí)行g(shù)roup-by操作的數(shù)據(jù)集的減少;另一方面,先group-by 可以減少后續(xù)連接處理的數(shù)據(jù)量,或者連接會(huì)使參與group-by的數(shù)據(jù)突增。對(duì)于DISTINCT操作也是一樣的。由于合并這樣的視圖并不總是更好的,我們是否選擇使用這一轉(zhuǎn)換,使用基于成本的方式。兩個(gè)選擇–用或不用視圖合并–分別由優(yōu)化器計(jì)算成本,只有當(dāng)這樣做的成本更低時(shí),我們才會(huì)選擇合并視圖。
考慮以下的group by 視圖和使用了它的查詢:
create view cust_prod_totals_v as
select sum(s.quantity_sold) total, s.cust_id, s.prod_id
from sales s
group by s.cust_id, s.prod_id;
select c.cust_id, c.cust_first_name, c.cust_last_name, c.cust_email
from customers c, products p, cust_prod_totals_v
where c.country_id = 'US'
and c.cust_id = cust_prod_totals_v.cust_id
and cust_prod_totals_v.total > 100
and cust_prod_totals_v.prod_id = p.prod_id
and p.prod_name = 'T3 Faux Fur-Trimmed Sweater';
該查詢查找來自US的,購買了至少100個(gè)特定商品的所有客戶。其視圖是滿足進(jìn)行復(fù)雜視圖合并的條件的。合并后,該查詢看起來是這樣的:
select c.cust_id, cust_first_name, cust_last_name, cust_email
from customers c, products p, sales s
where c.country_id = 'US'
and c.cust_id = s.cust_id
and s.prod_id = p.prod_id
and p.prod_name = 'T3 Faux Fur-Trimmed Sweater'
group by s.cust_id, s.prod_id, p.rowid, c.rowid,
c.cust_email, c.cust_last_name, c.cust_first_name, c.cust_id
having sum(s.quantity_sold) > 100;
轉(zhuǎn)換后的查詢成本低于未轉(zhuǎn)換查詢,因此,優(yōu)化器選擇合并該視圖。為什么轉(zhuǎn)換后的查詢成本更低呢?在未轉(zhuǎn)換的查詢中,group by 操作施加到視圖中SALES表的全部。在轉(zhuǎn)換后的查詢中,到products和customers(尤其是products表)的連接,過濾掉了SALES表中大部分的數(shù)據(jù),所以group by操作的成本更低。而連接的成本更高了,這是由于SALES表沒有被減少,但成本并沒有高太多,因?yàn)槌跏疾樵冎械膅roup by 并不能減少太多的數(shù)據(jù)。如果這些數(shù)據(jù)特征發(fā)生了變化,合并視圖也許成本就不低了。因此是需要基于成本來確定。如下是最終的執(zhí)行計(jì)劃:
--------------------------------------------------------
| Id | Operation | Name | Cost (%CPU)|
--------------------------------------------------------
| 0 | SELECT STATEMENT | | 2101 (18)|
|* 1 | FILTER | | |
| 2 | HASH GROUP BY | | 2101 (18)|
|* 3 | HASH JOIN | | 2099 (18)|
|* 4 | HASH JOIN | | 1801 (19)|
|* 5 | TABLE ACCESS FULL| PRODUCTS | 96 (5)|
| 6 | TABLE ACCESS FULL| SALES | 1620 (15)|
|* 7 | TABLE ACCESS FULL | CUSTOMERS | 296 (11)|
--------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(SUM("QUANTITY_SOLD")>100)
3 - access("C"."CUST_ID"="CUST_ID")
4 - access("PROD_ID"="P"."PROD_ID")
5 - filter("P"."PROD_NAME"='T3 Faux Fur-Trimmed Sweater')
7 - filter("C"."COUNTRY_ID"='US')
在上面的計(jì)劃中并沒有視圖,這是視圖合并后所期望的表現(xiàn)。然而,某些情況下,視圖合并后仍然在執(zhí)行計(jì)劃中出現(xiàn)視圖,其視圖名類似like VW_NWVW_1。稍后,我們會(huì)討論其原因,但首先讓我們看一個(gè)例子。這也是給我們一個(gè)機(jī)會(huì)觀察distinct視圖合并的例子。考慮一下這個(gè)查找位于US,并購買了特定產(chǎn)品的用戶的查詢:
select c.cust_id, c.cust_first_name, c.cust_last_name, c.cust_email
from customers c, products p,
(select distinct s.cust_id, s.prod_id
from sales s) cust_prod_v
where c.country_id = 'US'
and c.cust_id = cust_prod_v.cust_id
and cust_prod_v.prod_id = p.prod_id
and p.prod_name = 'T3 Faux Fur-Trimmed Sweater';
該視圖可以被合并,不過要基于成本,由于DISTINCT會(huì)讓數(shù)據(jù)更少,從而可以讓連接的成本更低。然而,在本例中,合并視圖是成本更低的,因此,我們得到了這樣一個(gè)等價(jià)的查詢:
select nwvw.cust_id, nwvw.cust_first_name, nwvw.cust_last_name, nwvw.cust_email
from (select distinct c.rowid, p.rowid, s.prod_id, s.cust_id, c.cust_id,
c.cust_first_name, c.cust_last_name, c.cujst_email
from customers c, products p, sales s
where c.country_id = 'US'
and c.cust_id = s.cust_id
and s.prod_id = p.prod_id
and p.prod_name = 'T3 Faux Fur-Trimmed Sweater') nwvw;
其執(zhí)行計(jì)劃如下:
-------------------------------------------
| Id | Operation | Name |
-------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | VIEW | VM_NWVW_1 |
| 2 | HASH UNIQUE | |
|* 3 | HASH JOIN | |
|* 4 | HASH JOIN | |
|* 5 | TABLE ACCESS FULL| PRODUCTS |
| 6 | TABLE ACCESS FULL| SALES |
|* 7 | TABLE ACCESS FULL | CUSTOMERS |
-------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("C"."CUST_ID"="S"."CUST_ID")
4 - access("S"."PROD_ID"="P"."PROD_ID")
5 - filter("P"."PROD_NAME"='T3 Faux Fur-Trimmed Sweater')
7 - filter("C"."COUNTRY_ID"='US')
那么,為什么我們?cè)诤喜⒘艘晥D后,還有視圖呢?新的視圖我們稱之為“投影視圖”(projection view)。當(dāng)我們合并視圖時(shí),將DISTINCT移到了外層查詢塊。但是,當(dāng)我們移動(dòng)DISTINCT時(shí),為了維持語義上與原始查詢等價(jià),我們必須添加若干列。所以,我們將它們裝到一個(gè)新視圖中,以便我們可以找出只在外層查詢塊的select列表中的那些列。但我們?nèi)匀猾@得了從視圖合并中得到的好處–所有的表位于同一個(gè)查詢塊,在最終的連接順序上,優(yōu)化器可以按其期望的隨意排列,distinct操作被延后到了所有連接完成之后。這些投影視圖出現(xiàn)在distinct視圖被合并的查詢中,或者一個(gè)group by視圖被合并到含有g(shù)roup by,having,并(或)有聚合的外層查詢塊中。在后者的情況中,投影視圖包含有外層查詢塊中的group by,having和聚合。
至此,最大的謎團(tuán)已經(jīng)被解開,讓我們看一下group by 或 distinct視圖不能被合并的原因。除了成本,有這樣一些原因,包括:
- 外層查詢中的表無rowid或唯一鍵列
- 視圖出現(xiàn)在層次查詢塊中
- 視圖包含grouping sets, rollup, pivot
- 視圖或外層查詢塊包含有分頁子句
總結(jié)
視圖合并通過允許額外的連接次序,訪問方法 以及可以被采用的其它轉(zhuǎn)換改善了執(zhí)行計(jì)劃。在視圖合并會(huì)導(dǎo)致一個(gè)更好的執(zhí)行計(jì)劃的情況下,Oracle自動(dòng)合并視圖;在其它情況下,其取決于成本。一個(gè)視圖不能被合并有多種原因,包括成本或有效性的限制。請(qǐng)注意, 基于成本或啟發(fā)式轉(zhuǎn)換而被拒絕的視圖合并,可以被提示所覆蓋;但基于有效性而被拒絕的視圖合并則不可以。
原文鏈接:https://blogs.oracle.com/optimizer/post/optimizer-transformations-view-merging-part-2
Optimizer Transformations: View Merging part 2
January 1, 2020 | 5 minute read
Maria Colgan
Distinguished Product Manager
This is the second of two posts on view merging. See the
first post for the basics of view merging, how it works for simple select-project-join views and views appearing in outer joins, and why one of these views might not be merged. In this post we’ll cover complex view merging.
Complex View Merging
We use the term “complex view merging” to describe merging of group by and distinct views. Like simple view merging, this allows the optimizer to consider additional join orders and access paths. In addition, the evaluation of the group-by/distinct operation can be delayed until after the joins have been evaluated. Delayed evaluation of group-by can make performance better or worse depending on the characteristics of the data. Delaying a group-by until after joins can result in a reduction in the data set on which the group-by operation is to be performed, if joins are filtering; on the other hand, early group-by can reduce the amount of data to be processed by subsequent joins or the joins could explode the amount of data to undergo group-by. The same is true for distinct operations. Because it is not always better to merge such a view, we choose whether to use this transformation in a cost-based manner. The two options - with and without view merging - are each costed by the optimizer, and we choose to merge the view only if it is cheaper to do so.
Consider the following group by view and query which refers to it:
create view cust_prod_totals_v as
select sum(s.quantity_sold) total, s.cust_id, s.prod_id
from sales s
group by s.cust_id, s.prod_id;
select c.cust_id, c.cust_first_name, c.cust_last_name, c.cust_email
from customers c, products p, cust_prod_totals_v
where c.country_id = 'US'
and c.cust_id = cust_prod_totals_v.cust_id
and cust_prod_totals_v.total > 100
and cust_prod_totals_v.prod_id = p.prod_id
and p.prod_name = 'T3 Faux Fur-Trimmed Sweater';
This query finds all of the customers from the US who have bought at least 100 of a particular item. The view is eligible for complex view merging. After merging, the query looks like this:
select c.cust_id, cust_first_name, cust_last_name, cust_email
from customers c, products p, sales s
where c.country_id = 'US'
and c.cust_id = s.cust_id
and s.prod_id = p.prod_id
and p.prod_name = 'T3 Faux Fur-Trimmed Sweater'
group by s.cust_id, s.prod_id, p.rowid, c.rowid,
c.cust_email, c.cust_last_name, c.cust_first_name, c.cust_id
having sum(s.quantity_sold) > 100;
The transformed query is cheaper than the untransformed query, so the optimizer chooses to merge the view. Why is the transformed query cheaper? In the untransformed query, the group by operator applies to the entire sales table in the view. In the transformed query, the joins to products and customers (especially products) filter out a large portion of the rows from the sales table, so the group by operation is much cheaper. The join is more expensive because the sales table has not been reduced, but it is not that much more expensive, since the group-by operation does not reduce the data size that much in the original query. If any of these characteristics were to change, it may no longer be cheaper to merge the view. Hence the need for a cost-based decision. The final plan is as follows:
--------------------------------------------------------
| Id | Operation | Name | Cost (%CPU)|
--------------------------------------------------------
| 0 | SELECT STATEMENT | | 2101 (18)|
|* 1 | FILTER | | |
| 2 | HASH GROUP BY | | 2101 (18)|
|* 3 | HASH JOIN | | 2099 (18)|
|* 4 | HASH JOIN | | 1801 (19)|
|* 5 | TABLE ACCESS FULL| PRODUCTS | 96 (5)|
| 6 | TABLE ACCESS FULL| SALES | 1620 (15)|
|* 7 | TABLE ACCESS FULL | CUSTOMERS | 296 (11)|
--------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(SUM("QUANTITY_SOLD")>100)
3 - access("C"."CUST_ID"="CUST_ID")
4 - access("PROD_ID"="P"."PROD_ID")
5 - filter("P"."PROD_NAME"='T3 Faux Fur-Trimmed Sweater')
7 - filter("C"."COUNTRY_ID"='US')
There is no view in the plan above, which is what one would expect after the view has been merged. However, there are some cases where a view will still appear in the plan even after view merging, with a name like VW_NWVW_1. We’ll discuss the reasons why in a moment, but first let’s look at an example. This also gives us a chance to look at an example of distinct view merging. Consider this query to find customers in the US that bought a particular product:
select c.cust_id, c.cust_first_name, c.cust_last_name, c.cust_email
from customers c, products p,
(select distinct s.cust_id, s.prod_id
from sales s) cust_prod_v
where c.country_id = 'US'
and c.cust_id = cust_prod_v.cust_id
and cust_prod_v.prod_id = p.prod_id
and p.prod_name = 'T3 Faux Fur-Trimmed Sweater';
The view can be merged, though it is based on cost, since the reduction in data due to distinct may make the join cheaper. In this case, however, it is cheaper to merge the view, so we get this equivalent query:
select nwvw.cust_id, nwvw.cust_first_name, nwvw.cust_last_name, nwvw.cust_email
from (select distinct c.rowid, p.rowid, s.prod_id, s.cust_id, c.cust_id,
c.cust_first_name, c.cust_last_name, c.cujst_email
from customers c, products p, sales s
where c.country_id = 'US'
and c.cust_id = s.cust_id
and s.prod_id = p.prod_id
and p.prod_name = 'T3 Faux Fur-Trimmed Sweater') nwvw;
and this plan:
-------------------------------------------
| Id | Operation | Name |
-------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | VIEW | VM_NWVW_1 |
| 2 | HASH UNIQUE | |
|* 3 | HASH JOIN | |
|* 4 | HASH JOIN | |
|* 5 | TABLE ACCESS FULL| PRODUCTS |
| 6 | TABLE ACCESS FULL| SALES |
|* 7 | TABLE ACCESS FULL | CUSTOMERS |
-------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("C"."CUST_ID"="S"."CUST_ID")
4 - access("S"."PROD_ID"="P"."PROD_ID")
5 - filter("P"."PROD_NAME"='T3 Faux Fur-Trimmed Sweater')
7 - filter("C"."COUNTRY_ID"='US')
So why do we still have a view after we’ve supposedly merged the view? The new view is what we call a “projection view”. When we merge the view, we move the distinct to the outer query block. But when we move the distinct, we have to add several additional columns, in order to maintain semantic equivalence with the original query. So we put all of that into a new view, so we can select out just the columns we want in the outer query block’s select list. But we still get all of the benefits we promised from merging the view – all of the tables are in one query block and the optimizer is free to permute them as it desires in the final join order, and the distinct operation has been delayed until after all of the joins are completed. These projection views appear in queries where a distinct view has been merged, or a group by view is merged into an outer query block which also contains group by, having, and/or aggregates. In the latter case, the projection view contains the group by, having, and aggregates from the original outer query block.
Now that this great mystery has been revealed, let’s look at some of the reasons a group by or distinct view might not be merged. Aside from cost, there are several other reasons, including:
- The outer query tables do not have a rowid or unique column
- View appears in a connect by query block
- View contains grouping sets, rollup, pivot
- View or outer query block contains spreadsheet clause
Summary
View merging can improve plans by allowing additional join orders, access methods, and other transformations to be considered. In cases where view merging should always lead to a better plan, Oracle automatically merges a view; in other cases, this is determined based on cost. There are many reasons why a view may not be merged, including cost or validity restrictions. Note that view merging that is rejected on the basis of cost or heuristics can be overridden with hints; but view merging that is rejected based on validity may not.




