
feat(snowflake)!: provide transpilation support for HASH_AGG #7284

Open
fivetran-ashashankar wants to merge 2 commits into main from RD-1069223-transpile_HASH_AGG


@fivetran-ashashankar
Collaborator

No description provided.

@fivetran-ashashankar changed the title from "Rd 1069223 transpile HASH_AGG" to "feat(snowflake)!: provide transpilation support for HASH_AGG" on Mar 13, 2026
@github-actions
Contributor

SQLGlot Integration Test Results

Comparing:

  • this branch (sqlglot:RD-1069223-transpile_HASH_AGG, sqlglot version: RD-1069223-transpile_HASH_AGG)
  • baseline (main, sqlglot version: 29.0.2.dev29)

⚠️ Limited to dialects: duckdb

By Dialect

| dialect | main | sqlglot:RD-1069223-transpile_HASH_AGG | transitions | links |
| --- | --- | --- | --- | --- |
| duckdb -> duckdb | 4003/4003 passed (100.0%) | 4003/4003 passed (100.0%) | No change | full result / delta |

Overall

main: 4003 total, 4003 passed (pass rate: 100.0%), sqlglot version: 29.0.2.dev29

sqlglot:RD-1069223-transpile_HASH_AGG: 4003 total, 4003 passed (pass rate: 100.0%), sqlglot version: RD-1069223-transpile_HASH_AGG

Transitions:
No change

bit_xor = exp.BitwiseXorAgg(this=hash_func)

# Wrap with COALESCE(..., 0)
result = exp.Coalesce(this=bit_xor, expressions=[exp.Literal.number(0)])

memory D SELECT HASH(NULL);
┌──────────────────────┐
│      hash(NULL)      │
│        uint64        │
├──────────────────────┤
│ 13787848793156543929 │
└──────────────────────┘

HASH computes hash even for NULL. There is no need to apply COALESCE.
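The comment above concerns NULL values inside rows. A separate case is an input with zero rows, where the aggregate has nothing to combine. The sketch below is a toy Python model of that case. It assumes the XOR aggregate yields NULL for empty input, as SQL aggregates commonly do; it is not DuckDB's implementation, `bit_xor_agg` and `coalesce` are hypothetical names, and the actual behaviour should be checked in DuckDB before deciding whether the wrapper can be dropped.

```python
from functools import reduce
from operator import xor
from typing import Iterable, Optional


def bit_xor_agg(hashes: Iterable[int]) -> Optional[int]:
    """Toy XOR aggregate: returns None (standing in for SQL NULL) on empty input.

    Assumption: the real aggregate behaves this way for zero rows.
    """
    values = list(hashes)
    if not values:
        return None
    return reduce(xor, values)


def coalesce(value: Optional[int], default: int) -> int:
    """Toy single-fallback COALESCE."""
    return default if value is None else value


empty_result = bit_xor_agg([])                          # no rows to aggregate
wrapped_empty = coalesce(empty_result, 0)               # fallback applies
non_empty = coalesce(bit_xor_agg([0b1010, 0b0110]), 0)  # fallback is a no-op
```

Under this assumption the wrapper only changes the result for empty input; for any non-empty input it passes the XOR value through unchanged.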

Comment on lines +4094 to +4097
# HASH_AGG(col) -> COALESCE(BIT_XOR(HASH(col)), 0)
# HASH_AGG(col1, col2) -> COALESCE(BIT_XOR(HASH((col1, col2))), 0)
# HASH_AGG(DISTINCT col) -> COALESCE(BIT_XOR(DISTINCT HASH(col)), 0)
# HASH_AGG(*) -> COALESCE(BIT_XOR(HASH(*)), 0)
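The column-based mappings listed in these comments can be sketched as a small string builder. This is only an illustration of the rewrite rule as written in the comments, not the generator code in the PR: `hash_agg_to_duckdb` is a hypothetical helper, it works on plain column names rather than sqlglot expressions, and it leaves out the `HASH_AGG(*)` case.

```python
from typing import Sequence


def hash_agg_to_duckdb(columns: Sequence[str], distinct: bool = False) -> str:
    """Hypothetical helper: render the rewrite from the comments as SQL text."""
    if not columns:
        raise ValueError("at least one column is required")
    # A single column is hashed directly; several columns are hashed as one tuple.
    if len(columns) == 1:
        hash_arg = columns[0]
    else:
        hash_arg = "(" + ", ".join(columns) + ")"
    prefix = "DISTINCT " if distinct else ""
    return f"COALESCE(BIT_XOR({prefix}HASH({hash_arg})), 0)"


single = hash_agg_to_duckdb(["col"])
multi = hash_agg_to_duckdb(["col1", "col2"])
dedup = hash_agg_to_duckdb(["col"], distinct=True)
```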

What happens if we use this as a window function?

HASH_AGG window

self.validate_all(
    "SELECT HASH_AGG(DISTINCT col1)",
    write={
        "duckdb": "SELECT COALESCE(BIT_XOR(DISTINCT HASH(col1)), 0)",

This is not the correct approach here. We apply the DISTINCT on the HASH result, right? And not on the column.
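DISTINCT matters for an XOR-based aggregate because XOR cancels pairs of equal values, so duplicate rows change the result. The toy Python model below illustrates this. `toy_hash` is a hypothetical stand-in built on SHA-256, not DuckDB's HASH, and the sketch does not settle which DISTINCT placement matches Snowflake's semantics.

```python
import hashlib
from functools import reduce
from operator import xor
from typing import Iterable


def toy_hash(value: object) -> int:
    """Stand-in 64-bit hash derived from SHA-256; not DuckDB's HASH."""
    digest = hashlib.sha256(repr(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def xor_all(values: Iterable[int]) -> int:
    """XOR-fold a collection of integers, starting from 0."""
    return reduce(xor, values, 0)


rows = ["a", "a", "b"]
hashes = [toy_hash(v) for v in rows]

# Without DISTINCT the two equal hashes for "a" cancel each other out.
without_distinct = xor_all(hashes)
# With DISTINCT each hash value contributes exactly once.
with_distinct = xor_all(set(hashes))
```

In this model, deduplicating the hash values and deduplicating the column values give the same set unless two different values collide under the hash.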

Comment on lines +4117 to +4126
if isinstance(cols[0], exp.Star):
    # HASH(*) is not supported in DuckDB - HASH() requires explicit columns
    # Generate a warning and use a placeholder
    self.unsupported(
        "HASH_AGG(*) cannot be fully transpiled to DuckDB. "
        "DuckDB's HASH() function requires explicit columns, not *. "
        "Please rewrite as HASH_AGG(col1, col2, ...) with specific columns."
    )
    # Fall back to using Star - will cause runtime error but preserves intent
    hash_arg = exp.Star()

This isn't complete; you can do:

memory D with t as (select 1 as a, 2 as b) select hash(unpack(columns(t.*))) from t;
┌──────────────────────────┐
│ hash(a := t.a, b := t.b) │
│          uint64          │
├──────────────────────────┤
│   6530802887144669425    │
│    (6.53 quintillion)    │
└──────────────────────────┘ 


3 participants