feat(optimizer): annotate types for ORDER BY alias references #7281
doripo wants to merge 2 commits into tobymao:main
When ORDER BY references a projection alias (e.g., SELECT x+1 AS y ... ORDER BY y), the column's type was left as UNKNOWN. qualify_columns intentionally preserves these alias refs (they're valid SQL in all dialects), so the single-pass annotator has no table-qualified column to resolve against.

Add a post-pass (_fixup_order_by_aliases) in annotate_scope that runs after projections are fully typed. It builds an alias-to-type map, fixes matching bare columns in ORDER BY, and re-derives parent types on compound expressions (e.g., ORDER BY y + 1) via _reannotate_subtree. This approach avoids modifying _annotate_expression, the core annotation loop.

_reannotate_subtree clears non-leaf types (preserving Column/Literal ground truth), prunes at Subquery boundaries, and re-invokes _annotate_expression sequentially.
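To make the post-pass idea concrete, here is a minimal, self-contained sketch. It does not use sqlglot: the Node class, the toy "add" typing rule, and the helper names (walk, derive, reannotate, fixup_order_by_aliases) are all hypothetical stand-ins for illustration, and Subquery pruning is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List

UNKNOWN = "UNKNOWN"


@dataclass
class Node:
    """Toy expression node (hypothetical, not a sqlglot class)."""
    kind: str                 # "column", "literal", or "add"
    name: str = ""            # column name, used by columns only
    table: str = ""           # table qualifier; empty for a bare alias ref
    type: str = UNKNOWN
    children: List["Node"] = field(default_factory=list)


def walk(node: Node) -> Iterator[Node]:
    """Yield the node and all of its descendants."""
    yield node
    for child in node.children:
        yield from walk(child)


def derive(node: Node) -> None:
    """Toy rule: an 'add' node is INT only if every child is INT."""
    if node.kind == "add":
        node.type = "INT" if all(c.type == "INT" for c in node.children) else UNKNOWN


def reannotate(node: Node) -> None:
    """Re-derive non-leaf types bottom-up; leaf types are left untouched."""
    for child in node.children:
        reannotate(child)
    derive(node)


def fixup_order_by_aliases(projections: Dict[str, Node], order_by: List[Node]) -> None:
    """Post-pass: copy projection types onto bare alias refs, then fix parents."""
    alias_types = {alias: expr.type for alias, expr in projections.items()}
    for key in order_by:
        changed = False
        for node in walk(key):
            if node.kind == "column" and not node.table and node.name in alias_types:
                node.type = alias_types[node.name]
                changed = True
        if changed:
            reannotate(key)


# SELECT x + 1 AS y ... ORDER BY y + 1
proj = Node("add", children=[Node("column", name="x", table="t", type="INT"),
                             Node("literal", type="INT")])
reannotate(proj)                       # projection is fully typed first

order_key = Node("add", children=[Node("column", name="y"),
                                  Node("literal", type="INT")])
reannotate(order_key)                  # first pass: bare y has no type yet
before = order_key.type

fixup_order_by_aliases({"y": proj}, [order_key])
after = order_key.type
```

In this toy model the ORDER BY key is walked once in the main pass and again in the fix-up, which is the extra cost a second pass carries.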
Happy to adjust the approach if you'd prefer this handled differently. A few notes on design choices:
sqlglot/optimizer/annotate_types.py
Outdated
    # Iterate through all the expressions of the current scope in post-order, and annotate
    self._annotate_expression(scope.expression, scope)
    self._fixup_order_by_aliases(scope)
Why are we introducing a separate pass here? This method seems very costly. Is it possible to solve this more simply by resolving ORDER BY alias references to their projection expression types during the existing column annotation path (lines 390-425)?
Thanks for the quick review!
Switched to an in-loop approach in the new commit — _resolve_order_by_alias forces the projection's annotation via a recursive call when needed. Updated the PR description accordingly.
Replace the post-pass (_fixup_order_by_aliases + _reannotate_subtree) with _resolve_order_by_alias, called from the column annotation path in _annotate_expression. When a bare column in ORDER BY matches a projection alias, it forces the projection's annotation via a recursive call if needed, then copies the type.

This resolves alias types during the existing annotation pass instead of walking the ORDER BY subtree twice after the fact.

Signed-off-by: Dori Polotsky <doripo@riverpool.ai>
When ORDER BY references a projection alias (e.g., SELECT x+1 AS y ... ORDER BY y), annotate_types leaves the column typed as UNKNOWN.

This happens because qualify_columns intentionally preserves ORDER BY alias refs (they're valid SQL in all dialects), so the annotator has no table-qualified column to resolve against.

This PR adds _resolve_order_by_alias, called from the column annotation path in _annotate_expression. When a bare column in ORDER BY matches a projection alias, it forces the projection's annotation via a recursive call if needed, then copies the type. Since qualify_columns has already added table qualifiers to projection columns, the recursive call resolves them via the normal path and does not re-enter the alias-ref code.

Test coverage includes basic alias resolution, shadowing/collisions, sort modifiers, compound expressions, set operations, window functions, subquery-as-projection, type coercion, and regression guards.
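The in-loop resolution described above can be sketched with a toy annotator. This is not the PR's code and does not use sqlglot: Node, ToyAnnotator, the schema dictionary, and the "add" typing rule are hypothetical and exist only to show the control flow (resolve the alias during the normal post-order visit, forcing the projection's annotation first if it has no type yet).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

UNKNOWN = "UNKNOWN"


@dataclass
class Node:
    """Toy expression node (hypothetical, not a sqlglot class)."""
    kind: str                 # "column", "literal", or "add"
    name: str = ""
    table: str = ""           # empty for a bare alias ref
    type: str = UNKNOWN
    children: List["Node"] = field(default_factory=list)


class ToyAnnotator:
    def __init__(self, schema: Dict[Tuple[str, str], str], projections: Dict[str, Node]):
        self.schema = schema            # {(table, column): type}
        self.projections = projections  # {alias: projection expression}

    def annotate(self, node: Node, in_order_by: bool = False) -> str:
        """Single post-order pass: children first, then the node itself."""
        for child in node.children:
            self.annotate(child, in_order_by)
        if node.kind == "column":
            if node.table:
                # Normal path: table-qualified columns resolve against the schema.
                node.type = self.schema.get((node.table, node.name), UNKNOWN)
            elif in_order_by:
                # Alias-ref path: only bare columns inside ORDER BY reach this.
                node.type = self.resolve_order_by_alias(node)
        elif node.kind == "add":
            # Toy rule: INT only if every child is INT.
            node.type = "INT" if all(c.type == "INT" for c in node.children) else UNKNOWN
        return node.type

    def resolve_order_by_alias(self, column: Node) -> str:
        projection = self.projections.get(column.name)
        if projection is None:
            return UNKNOWN
        if projection.type == UNKNOWN:
            # Force the projection's annotation. Its columns are table-qualified,
            # so this recursive call takes the normal path, not the alias path.
            self.annotate(projection)
        return projection.type


# SELECT x + 1 AS y ... ORDER BY y + 1, with the projection not yet annotated
proj = Node("add", children=[Node("column", name="x", table="t"),
                             Node("literal", type="INT")])
order_key = Node("add", children=[Node("column", name="y"),
                                  Node("literal", type="INT")])

annotator = ToyAnnotator(schema={("t", "x"): "INT"}, projections={"y": proj})
result = annotator.annotate(order_key, in_order_by=True)
```

Here the parent of the alias ref (y + 1) gets its type in the same visit, because its child is already typed by the time the post-order walk reaches it, so no second walk over the ORDER BY subtree is needed.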