Describe the bug
When the SQL unparser encounters a SubqueryAlias node whose direct child is an Aggregate (without an intermediate Projection), it flattens the subquery into a simple table alias, losing the aggregate entirely.
Root cause
The SubqueryAlias handler in select_to_sql_recursively (datafusion/sql/src/unparser/plan.rs) calls
subquery_alias_inner_query_and_columns (which only unwraps Projection children) and unparse_table_scan_pushdown (which only
handles TableScan/SubqueryAlias/Projection). When both return nothing useful for an Aggregate child, the code recurses directly
into the Aggregate, merging its GROUP BY into the outer SELECT instead of emitting a derived subquery.
Additional context
This also affects other plan types that build their own SELECT clauses (Window, Sort, Limit, Union) when directly wrapped by
SubqueryAlias.
To Reproduce
Manually construct a logical plan with a join where the right side is SubqueryAlias > Aggregate:
SELECT j1.j1_string FROM j1 JOIN (SELECT max(j2_id) AS max_id FROM j2) AS b ON j1.j1_id = b.max_id
This bug doesn't manifest when the SQL is parsed from a string, as the parser inserts a Projection between the SubqueryAlias and Aggregate, so the roundtrip works. The bug only manifests with manually constructed plans where SubqueryAlias directly wraps Aggregate.
let right_scan = table_scan(Some("j2"), &j2_schema, None)?.build()?;
let right_agg = LogicalPlanBuilder::from(right_scan)
.aggregate(vec![] as Vec<Expr>, vec![max(col("j2.j2_id")).alias("max_id")])?
.build()?;
let right_subquery = subquery_alias(right_agg, "b")?;
let left_scan = table_scan(Some("j1"), &j1_schema, None)?.build()?;
let plan = LogicalPlanBuilder::from(left_scan)
.join(right_subquery, JoinType::Inner,
(vec![Column::from_qualified_name("j1.j1_id")],
vec![Column::from_qualified_name("b.max_id")]),
None)?
.project(vec![col("j1.j1_string")])?
.build()?;
let sql = Unparser::default().plan_to_sql(&plan)?.to_string();
Current behavior
SELECT j1.j1_string FROM j1 INNER JOIN j2 AS b ON j1.j1_id = b.max_id
The aggregate subquery is completely dropped — (SELECT max(j2_id) AS max_id FROM j2) AS b becomes just j2 AS b.
Expected behavior
SELECT j1.j1_string FROM j1 INNER JOIN (SELECT max(j2.j2_id) AS max_id FROM j2) AS b ON j1.j1_id = b.max_id
Additional context
No response
Describe the bug
When the SQL unparser encounters a SubqueryAlias node whose direct child is an Aggregate (without an intermediate Projection), it flattens the subquery into a simple table alias, losing the aggregate entirely.
Root cause
The SubqueryAlias handler in select_to_sql_recursively (datafusion/sql/src/unparser/plan.rs) calls
subquery_alias_inner_query_and_columns (which only unwraps Projection children) and unparse_table_scan_pushdown (which only
handles TableScan/SubqueryAlias/Projection). When both return nothing useful for an Aggregate child, the code recurses directly
into the Aggregate, merging its GROUP BY into the outer SELECT instead of emitting a derived subquery.
Additional context
This also affects other plan types that build their own SELECT clauses (Window, Sort, Limit, Union) when directly wrapped by
SubqueryAlias.
To Reproduce
Manually construct a logical plan with a join where the right side is SubqueryAlias > Aggregate:
This bug doesn't manifest when the SQL is parsed from a string, as the parser inserts a Projection between the SubqueryAlias and Aggregate, so the roundtrip works. The bug only manifests with manually constructed plans where SubqueryAlias directly wraps Aggregate.
Current behavior
The aggregate subquery is completely dropped — (SELECT max(j2_id) AS max_id FROM j2) AS b becomes just j2 AS b.
Expected behavior
Additional context
No response