Correct plan of general & segmentGeneral path with volatile functions. #10418
Conversation
GPDB dev mailing list thread: https://groups.google.com/a/greenplum.org/forum/?hl=en#!topic/gpdb-dev/354vF_0EPIM
ERROR: function cannot execute on a QE slice because it accesses relation "qp_funcs_in_subquery.bar" (seg0 slice1 127.0.1.1:7002 pid=27185)
CONTEXT: SQL statement "SELECT d FROM bar WHERE c <> $1"
PL/pgSQL function func1_read_setint_sql_vol(integer) line 5 at FOR over SELECT rows
ERROR: query plan with multiple segworker groups is not supported
Why is this test case impacted? I dug into it, and here is the root cause:
The plan before this PR did not treat the function scan as SingleQE, while this PR treats it as SingleQE because of the volatile projection.
============================================================================
this patch's plan
============================================================================
regression=# explain (costs off, verbose) SELECT * FROM foo, (SELECT func1_mod_setint_vol(func2_sql_int_vol(5))) r order by 1,2,3;
QUERY PLAN
------------------------------------------------------------
Sort
Output: foo.a, foo.b, (func1_mod_setint_vol($0))
Sort Key: foo.a, foo.b, (func1_mod_setint_vol($0))
-> Nested Loop
Output: foo.a, foo.b, (func1_mod_setint_vol($0))
-> Result
Output: func1_mod_setint_vol($0)
InitPlan 1 (returns $0) (slice1)
-> Result
Output: func2_sql_int_vol(5)
-> Materialize
Output: foo.a, foo.b
-> Gather Motion 3:1 (slice2; segments: 3)
Output: foo.a, foo.b
-> Seq Scan on public.foo
Output: foo.a, foo.b
Optimizer: Postgres query optimizer
(17 rows)
regression=# SELECT * FROM foo, (SELECT func1_mod_setint_vol(func2_sql_int_vol(5))) r order by 1,2,3;
ERROR: query plan with multiple segworker groups is not supported
HINT: likely caused by a function that reads or modifies data in a distributed table
CONTEXT: SQL statement "UPDATE bar SET d = d+1 WHERE c > $1"
PL/pgSQL function func1_mod_setint_vol(integer) line 5 at SQL statement
============================================================================
Previous plan
============================================================================
regression=# explain (costs off, verbose) SELECT * FROM foo, (SELECT func1_mod_setint_vol(func2_sql_int_vol(5))) r order by 1,2,3;
QUERY PLAN
----------------------------------------------------------------
Gather Motion 3:1 (slice1; segments: 3)
Output: foo.a, foo.b, (func1_mod_setint_vol($0))
Merge Key: foo.a, foo.b, (func1_mod_setint_vol($0))
-> Sort
Output: foo.a, foo.b, (func1_mod_setint_vol($0))
Sort Key: foo.a, foo.b, (func1_mod_setint_vol($0))
-> Nested Loop
Output: foo.a, foo.b, (func1_mod_setint_vol($0))
-> Result
Output: func1_mod_setint_vol($0)
InitPlan 1 (returns $0) (slice2)
-> Result
Output: func2_sql_int_vol(5)
-> Materialize
Output: foo.a, foo.b
-> Seq Scan on qp_funcs_in_subquery.foo
Output: foo.a, foo.b
Optimizer: Postgres query optimizer
(18 rows)
regression=# SELECT * FROM foo, (SELECT func1_mod_setint_vol(func2_sql_int_vol(5))) r order by 1,2,3;
ERROR: function cannot execute on a QE slice because it issues a non-SELECT statement (seg0 slice1 10.142.0.40:7002 pid=28175)
CONTEXT: SQL statement "UPDATE bar SET d = d+1 WHERE c > $1"
PL/pgSQL function func1_mod_setint_vol(integer) line 5 at SQL statement
contain_volatile_functions() or contain_mutable_functions()?
STABLE indicates that the function cannot modify the database, and that within a single table scan it will consistently return the same result for the same argument values, but that its result could change across SQL statements.
This doesn't imply that a stable function produces the same results across different segments.
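For example (a hypothetical function, not one from this PR's tests), a SQL function can legitimately be STABLE yet still be segment-dependent:

create function seg_port() returns text as
$$ select current_setting('port') $$
language sql stable;

current_setting() is STABLE, but every segment listens on its own port, so evaluating seg_port() on different segments yields different values even though nothing here is volatile.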
Is it possible to mutate some of the paths under add_path, using a new function with a switch case on the path type?
Then we could avoid adding code to several create-path functions.
@JunfengYang That's why I do not think it is suitable to patch add_path.
@gfphoenix78 If we change the volatile check to a mutable check in that function, the following case breaks:

create temp table table_a(id integer);
insert into table_a values (42);
create temp view view_a as select * from table_a;
select (select (a.*)::text) from view_a a;

The last statement contains a SubPlan in the Project, and the SubPlan's locus is set as General. To skip this for now, I chose volatile in this PR.

A word on this issue: the root cause, I think, is that the locus of the SubPlan should not be General; it should be set to the query's jointree's locus.
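For reference, the SubPlan placement can be inspected with the same EXPLAIN form used elsewhere in this thread:

explain (costs off, verbose)
select (select (a.*)::text) from view_a a;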
I think it implies the same results across segments.
-> Materialize
-> Broadcast Motion 1:3 (slice2; segments: 1)
-> Function Scan on generate_series a
Filter: ((a)::double precision > random())
I am starting to review this PR with the test cases.
For the push-down case, the filter is pushed down; if we remove the push-down optimization, the plan can be converted to
Gather Motion 3:1 (slice1; segments: 3)
Filter: ((a)::double precision > random())
-> Nested Loop
-> Seq Scan on t_hashdist
-> Materialize
-> Broadcast Motion 1:3 (slice2; segments: 1)
-> Function Scan on generate_series a
with more transformation, the plan can be converted to
Gather Motion 3:1 (slice1; segments: 3)
Filter: ((a)::double precision > random())
-> Nested Loop
-> Seq Scan on t_hashdist
-> Function Scan on generate_series a
This plan is the same as the plan the current master branch produces. My idea is: generate_series() has General locus and it indeed produces the same data set, so there is no problem in this case.
@pengzhout Thanks for your review. Here is the link to the diff for the general case:
https://www.diffnow.com/report/etulf
Diff for the replicated table case:
IIUC, you propose another fix here: do not push down filters that contain volatile functions onto a general-locus table?
Maybe that can work for some cases, but with the framework in this patch, the case is already fixed.
No, I mean that if the push-down is correct here, the modified plan is equivalent to the original plan; they can be transformed into each other by adding or removing the push-down optimization.
The transformation you mentioned may be invalid; the MPP plans look very different to me.
The inner path should always produce the same data set everywhere; that is why it is called General.
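To make the concern concrete, a query of roughly this shape (the bounds are illustrative; the exact regress case lives in the PR's tests) produces the plans discussed above:

select * from t_hashdist, generate_series(1, 10) a
where a > random();

If the volatile filter is evaluated below the Broadcast Motion, each segment filters its own copy of the general-locus function scan with an independent random() draw, so the inner side is no longer the same data set on every segment.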
Upstream doesn't treat all volatile functions as parallel-unsafe, strange.
@pengzhout Thanks for the comments. I did not know the field proparallel before.
I think this is not strange. The reason a function is labeled parallel-unsafe is about side effects:
Functions should be labeled parallel unsafe if they modify any database state,
or if they make changes to the transaction such as using sub-transactions,
or if they access sequences or attempt to make persistent changes to settings
(e.g. setval)
But a volatile function does not always cause side effects. It may simply be like random():
It can return different results on successive calls with the same arguments.
For such a case, it is quite OK to push it into parallel scans (Postgres does not seem to have replicated tables or General locus).
This is just the logic in GPDB now: other than replicated tables and General locus, we do not consider fixing volatile functions.
I have a conjecture for the motivation:
- volatile functions that are also parallel-safe must not modify the database
- volatile functions that are parallel-unsafe must have tried to modify the database or transaction state
It is another story for Greenplum to consider the proparallel field during planning when we want to enable parallel scan.
Even for parallel-safe functions, we still need this PR to handle replicated tables and General locus.
@pengzhout Also, the parallel-safe property should be considered in other places of planning in GPDB besides parallel scan.
Have you checked 'PARALLEL RESTRICTED'? random() is one of those and can't run in parallel mode in most cases. Upstream has very complex logic to determine whether a Rel can run in parallel or not (see set_rel_consider_parallel() or is_parallel_safe()); we can depend on the rel->consider_parallel flag in the future to decide whether to transform 'General/SegmentGeneral' or 'Hashed/Strewn' to 'SingleQE'. We need a common framework in the future, and a volatile property check is not enough for that. Actually, upstream doesn't even do a volatile property check when determining the parallel mode in most cases; proparallel is what is considered.
Upstream has a similar idea with general and replicated tables, e.g. a partial nested loop join:
gpadmin=# set enable_hashjoin to off;
SET
gpadmin=# explain select count(*) from t3, t4 where t3.c1 like '%sss' and timeofday() = t4.c1 and t3.c1 = t4.c1;
;
QUERY PLAN
-----------------------------------------------------------------------------------------
Finalize Aggregate (cost=585414.30..585414.31 rows=1 width=8)
-> Gather (cost=585414.09..585414.30 rows=2 width=8)
Workers Planned: 2
-> Partial Aggregate (cost=584414.09..584414.10 rows=1 width=8)
-> Nested Loop (cost=0.00..573997.42 rows=4166667 width=0)
Join Filter: (t3.c1 = t4.c1)
-> Parallel Seq Scan on t3 (cost=0.00..175540.42 rows=1 width=36)
Filter: (c1 ~~ '%sss'::text)
-> Seq Scan on t4 (cost=0.00..273457.00 rows=10000000 width=36)
Filter: (timeofday() = c1)
(10 rows)
Seq Scan on t4 acts as some kind of replicated table here. Also, you can see timeofday() is volatile, but it is also a parallel-safe function, so upstream can run it in parallel. As Tom said, a volatile function in baserestrictinfo is unprincipled, so I think both the parallel and the non-parallel plan are correct.
Here is the key point I want to make, if I understand correctly:
the Locus is a property of a Path, while parallel-safe is a property of a Rel (RelOptInfo), and a Rel may have several Paths.
The baserestrictinfo and reltarget are also properties of the Rel; we cannot force those two properties onto a Path.
From this point of view, the volatile function didn't break the definition of General; the path is still General. Because of a volatile or parallel-unsafe function in baserestrictinfo or reltarget, the Rel might lose the parallel-safe property, so it needs to adjust or remove those of its paths that are parallel. I think that should be the motivation to turn General into SingleQE.
Thanks @pengzhout for your cases and comments.
Sharing some of my thoughts here:
- random() is 'r' (restricted) in the proparallel field, but I think it is OK for non-broadcast-like locus (see the catalog query below).
- For the case you wrote, the Seq Scan on t4 should also take volatile into account; I will update the thread on pgsql-hackers to ask.
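For reference, both properties can be read straight from the catalog; the values shown are from a recent PostgreSQL/GPDB catalog and may differ across versions:

select proname, provolatile, proparallel
from pg_proc
where proname in ('random', 'timeofday', 'setval');
-- random    | v | r   (volatile, parallel restricted)
-- timeofday | v | s   (volatile, yet parallel safe)
-- setval    | v | u   (volatile and parallel unsafe)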
@pengzhout Do you have any more advice on this? I have made the PR pipeline green and created another story to track the pull-up-all-sublinks issue. Would you please have a look at it again? Thanks.
I will approve this PR because we cannot use upstream's parallelism at this stage, and it resolves the Limit and replicated-table insert problems that cause wrong results. As for the other optimizations, I think most of them turn an unprincipled result into another unprincipled result; that does no harm because very few people will care about it.
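A minimal sketch of the replicated-table problem referred to above (hypothetical table, not the PR's regress case):

create table rep_t (a int, b float8) distributed replicated;
-- Each segment stores a full copy of a replicated table. If a volatile
-- expression in the INSERT or UPDATE were evaluated independently on
-- every segment, the copies could diverge, i.e. wrong results:
insert into rep_t select i, random() from generate_series(1, 3) i;
update rep_t set b = random() where a = 1;

With this commit such statements are either planned on a single QE or, for UPDATE/DELETE on a replicated table with volatile functions, rejected.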
@kainwen I have faced the same problem as was fixed in this PR and can suggest adding a fallback from ORCA to the Postgres planner. You can cherry-pick arenadata@0411e3b if you like.
Thanks @darthunix. The ORCA team has tracked this issue and I think they will fix it soon.
Pushed into the master branch. Thanks so much for everyone's comments; I learned something new.
With this PR, this test case from the PostgreSQL v10 'tsrf' regression test started to fail:
The problem is that the Limit depends on the outer query parameter, and there's now a Motion above the Limit. This particular case would work if we put the Limit above the Motion, although we probably cannot handle the general case. For example, this produces incorrect results instead of erroring out:
I'm not sure what exactly we should do about this, but clearly there are a number of problems here; not all of them are new.
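The actual failing tsrf case is not reproduced here; a hypothetical query of the same general shape, where the Limit references an outer-query parameter, looks like this:

select *
from limits_tbl t,
     lateral (select i from generate_series(1, 100) i limit t.n) s;

Here limits_tbl(n int) is a made-up table. With a Motion now placed above the Limit, the outer parameter t.n would have to be passed into the slice below the Motion, which is the problem described above.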
I'm curious why the new ProjectionPath.force mechanism is needed. I tried removing it, and got an assertion failure from this in make_motion:
But when I simply commented out that assertion, the regression tests passed, except for a few plan changes where the superfluous Result nodes are gone.
@hlinnaka Thanks for your comments.
I added it because I got the assertion failure and found that the Result node was removed; I did not try commenting the assertion out.
I will look into this case to see what we can do with it.
* correct plan for CTAS with volatile functions. Follow-up to #10418. CTAS has an additional code path that adds a replicated locus on top of the current node, but that code path lacks the volatile-function check and will produce an incorrect plan. The incorrect plan executes the volatile function more times than expected. * port all CTAS with volatile functions cases from 6X to master. Signed-off-by: Sasasu <i@sasa.su>
General and segmentGeneral locus imply that if the corresponding slice
is executed on many different segments, it should provide the same result
data set. Thus, in some cases, General and segmentGeneral can be
treated like broadcast.

But what if a segmentGeneral or general locus path contains volatile
functions? Volatile functions, by definition, do not guarantee the same
results across invocations. So in such cases the path loses that property and
cannot be treated as *general. Previously, the Greenplum planner
did not handle these cases correctly. Limit over a general or segmentGeneral
path has the same issue.

The fix idea of this commit is: when we find the pattern (a general or
segmentGeneral locus path contains volatile functions), we create a
motion path above it to turn its locus into singleQE and then create a
projection path. The core job then becomes choosing the places to
check:

1. For a single base rel, we only need to check its restriction; this is at
the bottom of the planner, in the function set_rel_pathlist.
2. When creating a join path, if the join locus is general or segmentGeneral,
check its joinqual to see if it contains volatile functions.
3. When handling a subquery, we invoke the set_subquery_pathlist function;
at the end of this function, check the targetlist and havingQual.
4. When creating a limit path, the same check-and-change algorithm is used.
5. Correctly handle make_subplan.

OrderBy and Group clauses are included in the targetlist and handled
by Step 3 above.

Also, this commit fixes DML on replicated tables. Update & Delete statements on
a replicated table are special: they have to be dispatched to each
segment to execute. So if they contain volatile functions in their targetList
or where clause, we should reject such statements:

1. For the targetList, we check it in the function create_motion_path_for_upddel.
2. The where clause is handled in the query planner: if we
find the pattern and want to fix it, do another check for whether we are updating
or deleting a replicated table, and if so, reject the statement.

This PR fixes Github Issue: #10419

Co-authored-by: Hao Wu <hawu@pivotal.io>