Skip to content

Disallow order by within ordered-set aggregate functions argument lists#20421

Open
cj-zhukov wants to merge 3 commits intoapache:mainfrom
cj-zhukov:cj-zhukov/disallow-order-by-within-ordered-set-agr-fn-arg-lists
Open

Disallow order by within ordered-set aggregate functions argument lists#20421
cj-zhukov wants to merge 3 commits intoapache:mainfrom
cj-zhukov:cj-zhukov/disallow-order-by-within-ordered-set-agr-fn-arg-lists

Conversation

@cj-zhukov
Copy link
Copy Markdown
Contributor

@cj-zhukov cj-zhukov commented Feb 18, 2026

Which issue does this PR close?

Rationale for this change

This PR fixes incorrect handling of ordered-set aggregate syntax.

What changes are included in this PR?

  • Updated datafusion/sql/src/expr/function.rs to return a planning error when ORDER BY is used inline
  • Added test coverage in datafusion/sqllogictest/test_files/aggregate.slt to ensure these invalid forms now return a syntax/planning error

Are these changes tested?

Yes, these changes are tested.

Are there any user-facing changes?

Yes - this is a public API change.
This query:

select quantile_cont(col0, 0.75 order by col0)
from values (1, 3), (2, 2), (3, 1) t(col0, col1);

now returns an error: ORDER BY must be specified using WITHIN GROUP

@github-actions github-actions bot added sql SQL Planner sqllogictest SQL Logic Tests (.slt) labels Feb 18, 2026
@cj-zhukov
Copy link
Copy Markdown
Contributor Author

High-level overview

This PR fixes incorrect handling of ordered-set aggregate syntax.

Specifically:

  • Updated datafusion/sql/src/expr/function.rs to return a planning error when ORDER BY is used inline
  • Added test coverage in datafusion/sqllogictest/test_files/aggregate.slt to ensure these invalid forms now return a syntax/planning error

@cj-zhukov
Copy link
Copy Markdown
Contributor Author

@Jefffrey Since you originally opened this issue, could you please take a look at the proposed refactor?

@Jefffrey Jefffrey changed the title Disallow order by within ordered-set aggregate functions argument lis… Disallow order by within ordered-set aggregate functions argument lists Feb 19, 2026
Copy link
Copy Markdown
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @cj-zhukov and @Jefffrey

The only thing I worry about this PR is that it will cause some queries that used to work to stop. However, on more review it seems like what it does is just improve the error message.

Can someone confirm that this is the case?

select quantile_cont(col0, 0.75 order by col0)
from values (1, 3), (2, 2), (3, 1) t(col0, col1);

statement error ORDER BY must be specified using WITHIN GROUP
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I double checked that these queries fail on main too:

> select quantile_cont(0.75 order by col0)
from values (1, 3), (2, 2), (3, 1) t(col0, col1);
Error during planning: Function 'percentile_cont' expects 2 arguments but received 1 No function matches the given name and argument types 'percentile_cont(Float64)'. You might need to add explicit type casts.
	Candidate functions:
	percentile_cont(expr: Coercion(TypeSignatureClass::Float, implicit_coercion=ImplicitCoercion([Numeric], default_type=Float64), percentile: Coercion(TypeSignatureClass::Native(LogicalType(Native(Float64), Float64)), implicit_coercion=ImplicitCoercion([Numeric], default_type=Float64))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The query above currently succeeds on main

select quantile_cont(col0, 0.75 order by col0)
from values (1, 3), (2, 2), (3, 1) t(col0, col1);

I was thinking this could be a breaking change, but I wonder how often these (ordered set aggregates) are currently used especially with an order by in the argument list like this; if we want to be fully explicit we can add an entry to the upgrade guide

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I double checked and I agree with you

andrewlamb@Andrews-MacBook-Pro-3:~/Software/influxdb_pro/ent$ datafusion-cli
DataFusion CLI v52.1.0
> select quantile_cont(col0, 0.75 order by col0)
from values (1, 3), (2, 2), (3, 1) t(col0, col1);
+-------------------------------------+
| quantile_cont(t.col0,Float64(0.75)) |
+-------------------------------------+
| 2.5                                 |
+-------------------------------------+
1 row(s) fetched.
Elapsed 0.002 seconds.

In general, rather than disallow this syntax, is there some way we can support both syntaxes? I also have no idea how often they are used but if it is currently supported and produces the correct result, then I suspect at least some users would treat them starting to error as a regression 🤔

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can confirm that this PR does introduce a breaking change. This query works on main:

select quantile_cont(col0, 0.75 order by col0)
from values (1, 3), (2, 2), (3, 1) t(col0, col1);

but now it returns error: ORDER BY must be specified using WITHIN GROUP

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for raising this suggestion - I started working on supporting both syntaxes instead of returning an error.

After digging into it more deeply, it turns out the change is more involved than I initially expected. It’s starting to drift away from the scope of the original refactor.

Given that, I’m wondering if it would make sense to keep this PR focused and open a separate PR dedicated specifically to supporting both syntaxes. I’m happy to continue working on that and share progress once I have a clean and well-tested solution.

Would that approach make sense?

@alamb
Copy link
Copy Markdown
Contributor

alamb commented Mar 2, 2026

Given that, I’m wondering if it would make sense to keep this PR focused and open a separate PR dedicated specifically to supporting both syntaxes.

Probably

I am just worried about disallowing syntax that was previously allowed as I worry it will cause downstream issues (we have hit stuff like this in the past at InfluxData)

From a customers perspective they are like "my query use to work but no longer does"

Unless we have some sort of more compelling story (like "it was getting the wrong answer and now you get an error which is better") it is hard to justify such changes

@cj-zhukov
Copy link
Copy Markdown
Contributor Author

Given that, I’m wondering if it would make sense to keep this PR focused and open a separate PR dedicated specifically to supporting both syntaxes.

Probably

I am just worried about disallowing syntax that was previously allowed as I worry it will cause downstream issues (we have hit stuff like this in the past at InfluxData)

From a customers perspective they are like "my query use to work but no longer does"

Unless we have some sort of more compelling story (like "it was getting the wrong answer and now you get an error which is better") it is hard to justify such changes

Thanks for clarifying - that makes sense.
I agree that preserving previously supported syntax is important for downstream users, especially to avoid unexpected regressions.

I’ll update this PR to continue supporting both syntaxes rather than introducing a breaking change. I’ll follow up once I have a clean solution in place.

@Jefffrey
Copy link
Copy Markdown
Contributor

Jefffrey commented Mar 3, 2026

Unless we have some sort of more compelling story (like "it was getting the wrong answer and now you get an error which is better") it is hard to justify such changes

I believe this might be the case here? As far as I'm aware, the order by syntax here essentially does nothing, so if there is a query like:

select quantile_cont(col0, 0.75 order by col0 desc)
from values (1, 3), (2, 2), (3, 1) t(col0, col1);

Then it is would actually be computed as ascending (the default since within group isn't specified). Fixing to support order by is also a good move, though that might be more subtle as any such queries would now return a different result 🤔 (i.e. harder to make users aware of this new behaviour)

@cj-zhukov
Copy link
Copy Markdown
Contributor Author

This PR now supports both syntaxes:

select quantile_cont(col0, 0.75 order by col0) from values (1, 3), (2, 2), (3, 1) t(col0, col1);
select quantile_cont(0.75 order by col0) from values (1, 3), (2, 2), (3, 1) t(col0, col1);

What do you think about supporting both syntaxes? Happy to adjust if there is a preference to keep only one form.

Copy link
Copy Markdown
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am still trying to get my head around what this PR does / doesn't do

Here is a query that (still) doesn't work on this PR. Is that expected? Or do you expect that this one will work?

  -- Unsupported on both main and on this branch
  SELECT approx_percentile_cont_with_weight(1, 0.95 ORDER BY x)
  FROM (VALUES (1), (2), (3)) AS t(x);

Ok((exprs, names))
}

#[expect(dead_code)]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this means the code is not used -- shouldn't we just remove it rather than making #[expect(dead)]?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I may be missing part of the intended scope, so I’d appreciate some clarification.

It looks like we may need broader test coverage here - not only for quantile_cont, but also for other aggregate functions and similar usage patterns, to better define the expected behavior.

Originally, this issue was about rejecting invalid ORDER BY usage inside aggregate argument lists earlier. My initial fix enforced that, but it introduced a breaking change, so I updated the implementation to support both syntaxes for quantile_cont. This works for the cases I’ve added tests for.

Regarding your example:

SELECT approx_percentile_cont_with_weight(1, 0.95 ORDER BY x)
FROM (VALUES (1), (2), (3)) AS t(x);

I didn’t consider this kind of usage across other aggregate functions when implementing the change, so I’m not fully sure what the expected behavior should be here.

Would you expect inline ORDER BY to be supported only for quantile_cont (and similar ordered-set aggregates), or should this be extended more generally to other aggregate functions as well?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you expect inline ORDER BY to be supported only for quantile_cont (and similar ordered-set aggregates), or should this be extended more generally to other aggregate functions as well?

I am not sure -- I basically expect that datafusion would follow other implementations rather than creating our own syntax. In the case where previous versions of DataFusion supported syntax that is not supported by other systems, I would expect that syntax to keep working, though we don't have to extend it to other functions that didn't previously support it

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, that helps clarify the direction.

I’ll keep the inline ORDER BY syntax only for cases where it was already supported. I’ll also add tests to ensure we preserve existing behavior while rejecting unsupported cases more clearly.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you -- I am sorry for the delay / back and forth on this -- but we have been bitten downstream in the past when stuff that used to work stopped working

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

sql SQL Planner sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants