Add CAST expression support to parser #3207

Open
nssalian wants to merge 1 commit into apache:main from nssalian:cast-expression-support

@nssalian
Contributor

Closes #198

Picks up where #209 left off with @Fokko's feedback addressed. Thanks @jayceslesar for the original work.

Rationale for this change

Adds CAST(column AS type) parsing that maps to Iceberg transforms (date → DayTransform, year → YearTransform, month → MonthTransform, hour → HourTransform). Also implements BoundTransform.ref() and eval(), the abstract methods required by BoundTerm that BoundTransform was missing. Introduces UnboundTransform with bind() that validates via can_transform().
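For readers skimming the thread, here is a minimal, self-contained sketch of the type-to-transform mapping described above. It is not the PR's implementation: the PR extends the existing pyparsing grammar, whereas this sketch swaps in a plain regex, and `CAST_TARGETS`, `CastTerm`, and `parse_cast` are hypothetical names that stand in for the real transform classes.

```python
import re
from dataclasses import dataclass

# Hypothetical lookup from CAST target type to transform name,
# following the mapping listed in the PR description.
CAST_TARGETS = {
    "date": "DayTransform",
    "year": "YearTransform",
    "month": "MonthTransform",
    "hour": "HourTransform",
}


@dataclass(frozen=True)
class CastTerm:
    """Stand-in for an unbound transform term: a column plus a transform name."""

    column: str
    transform: str


# Case-insensitive match for CAST(<column> AS <type>); dots allow nested fields.
_CAST_RE = re.compile(
    r"^\s*CAST\s*\(\s*([\w.]+)\s+AS\s+(\w+)\s*\)\s*$", re.IGNORECASE
)


def parse_cast(text: str) -> CastTerm:
    """Parse a single CAST expression and map its target type to a transform."""
    match = _CAST_RE.match(text)
    if match is None:
        raise ValueError(f"Not a CAST expression: {text!r}")
    column, target = match.group(1), match.group(2).lower()
    if target not in CAST_TARGETS:
        raise ValueError(f"Unsupported CAST target type: {target}")
    return CastTerm(column, CAST_TARGETS[target])
```

Under these assumptions, `parse_cast("cast(created_at as DATE)")` yields a term for `created_at` with the day transform, and an unlisted target such as `uuid` raises `ValueError`.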

Are these changes tested?

Yes. 11 new tests covering all transform types, case insensitivity, nested fields, unsupported types, comparison operators, and boolean composition.

Are there any user-facing changes?

No.

@nssalian nssalian marked this pull request as ready for review March 30, 2026 15:46
@nssalian
Contributor Author

@Fokko @geruh @kevinjqliu PTAL

cast_left_ref = cast_term + comparison_op + literal


@cast_left_ref.set_parse_action
Contributor

It looks like we've already copied this logic multiple times. Should we consolidate it?

@left_ref.set_parse_action
def _(result: ParseResults) -> BooleanExpression:
    if result.op == "<":
        return LessThan(result.column, result.literal)
    elif result.op == "<=":
        return LessThanOrEqual(result.column, result.literal)
    elif result.op == ">":
        return GreaterThan(result.column, result.literal)
    elif result.op == ">=":
        return GreaterThanOrEqual(result.column, result.literal)
    if result.op in ("=", "=="):
        return EqualTo(result.column, result.literal)
    if result.op in ("!=", "<>"):
        return NotEqualTo(result.column, result.literal)
    raise ValueError(f"Unsupported operation type: {result.op}")

@right_ref.set_parse_action
def _(result: ParseResults) -> BooleanExpression:
    if result.op == "<":
        return GreaterThan(result.column, result.literal)
    elif result.op == "<=":
        return GreaterThanOrEqual(result.column, result.literal)
    elif result.op == ">":
        return LessThan(result.column, result.literal)
    elif result.op == ">=":
        return LessThanOrEqual(result.column, result.literal)
    elif result.op in ("=", "=="):
        return EqualTo(result.column, result.literal)
    elif result.op in ("!=", "<>"):
        return NotEqualTo(result.column, result.literal)
    raise ValueError(f"Unsupported operation type: {result.op}")
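One possible shape for the consolidation, sketched as a table-driven helper that both the left-hand and right-hand parse actions (and the new CAST one) could call. This is an illustration only: `Comparison`, `OP_NAMES`, `FLIPPED`, and `build_comparison` are hypothetical names, and the stand-in dataclass replaces the real `LessThan`, `GreaterThan`, and related predicate classes so the sketch runs on its own.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Comparison:
    """Stand-in for the predicate classes (LessThan, EqualTo, ...)."""

    name: str
    term: Any
    literal: Any


# Operator -> predicate name when the term is on the left-hand side.
OP_NAMES = {
    "<": "LessThan",
    "<=": "LessThanOrEqual",
    ">": "GreaterThan",
    ">=": "GreaterThanOrEqual",
    "=": "EqualTo",
    "==": "EqualTo",
    "!=": "NotEqualTo",
    "<>": "NotEqualTo",
}

# When the literal is on the left (5 < x), inequalities flip direction;
# equality and inequality operators are symmetric and stay as they are.
FLIPPED = {"<": ">", "<=": ">=", ">": "<", ">=": "<="}


def build_comparison(
    op: str, term: Any, literal: Any, literal_on_left: bool = False
) -> Comparison:
    """Build the predicate for `term op literal`, or `literal op term` if flipped."""
    effective_op = FLIPPED.get(op, op) if literal_on_left else op
    if effective_op not in OP_NAMES:
        raise ValueError(f"Unsupported operation type: {op}")
    return Comparison(OP_NAMES[effective_op], term, literal)
```

With a helper of this kind, each parse action would reduce to a single call that passes the operator, the term, and a flag saying which side the literal is on.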

Comment on lines +285 to +288
expected = EqualTo(
    UnboundTransform(Reference("created_at"), DayTransform()),
    StringLiteral("2024-01-01"),
)
Contributor

For testing, I think it would be good to have an integration test as well. The StringLiteral should be converted to a DateLiteral when bound to the transform.
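To make the expected conversion concrete: Iceberg stores date values as days since 1970-01-01, so an integration test along these lines would presumably check that the bound literal carries that day count rather than the original string. The helper below is a hypothetical, standard-library-only sketch of the arithmetic; it does not call the library's own conversion code, and how the PR performs the conversion during bind() is not shown in this thread.

```python
from datetime import date

# Iceberg date values are counted in days from the Unix epoch.
EPOCH = date(1970, 1, 1)


def string_to_date_days(value: str) -> int:
    """Convert an ISO-8601 date string to days since 1970-01-01."""
    return (date.fromisoformat(value) - EPOCH).days


# The literal from the test above, expressed as a day count.
days = string_to_date_days("2024-01-01")  # 19723
```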


Development

Successfully merging this pull request may close these issues.

Add CAST to parser.py
