Bug: flatten_join_alias_var_optimizer unconditional pfree causes use-after-free, triggering ORCA fallback via T_List type confusion #1618

@yjhjstz

Summary

flatten_join_alias_var_optimizer in src/backend/optimizer/util/clauses.c unconditionally called pfree(havingQual) even when flatten_join_alias_vars returned the same pointer (i.e., nothing was changed). This caused a use-after-free that led to non-deterministic ORCA fallback to the Postgres planner for correlated subqueries with GROUP BY () HAVING <outer_ref>.

Root Cause

In the original code:

Node *havingQual = queryNew->havingQual;
if (NULL != havingQual)
{
    queryNew->havingQual = flatten_join_alias_vars(queryNew, havingQual);
    pfree(havingQual);   // ← always freed, even when pointer unchanged
}

When flatten_join_alias_vars returns the same pointer (e.g., havingQual is an outer-reference Var with varlevelsup=1 — nothing to flatten), the code frees the live node and leaves queryNew->havingQual pointing to freed memory.

Observed Mechanism (Debug Instrumentation)

For the query:

SELECT v.c, (SELECT count(*) FROM gstest2 GROUP BY () HAVING v.c)
FROM (VALUES (false),(true)) v(c) ORDER BY v.c;

The inner subquery's havingQual is v.c (a T_Var, nodeTag=150). Debug output:

DEBUG flatten_join_alias_var_optimizer: pfree havingQual=0x55fc9d054080 (same=1) nodeTag_before=150
DEBUG after pfree:  havingQual=0x55fc9d054080 nodeTag_after=2139062143   ← freed (0x7F7F7F7F)
DEBUG copyQuery:    havingQual=0x55fc9d054080 nodeTag=596                ← memory REUSED as T_List!

Step-by-step:

  1. pfree(v.c Var) at address 0x55fc9d054080 — returned to palloc free pool
  2. EliminateDistinctClause calls gpdb::CopyObject(query) → copyObjectImpl(T_Query)
  3. While copying earlier fields (targetList, groupingSets…), palloc reuses 0x55fc9d054080 for a new T_List node (nodeTag=596)
  4. COPY_NODE_FIELD(havingQual) calls copyObjectImpl(0x55fc9d054080) — now sees T_List instead of T_Var
  5. pqueryEliminateDistinct->havingQual = copy of a random T_List
  6. ORCA's query translator receives a T_List as the HAVING expression (expects a scalar boolean)
  7. ORCA finds a RangeTblEntry for gstest2 inside that list and throws:
    GPORCA does not support the following feature:
    ({RTE :alias <> :eref {ALIAS :aliasname gstest2 ...} :rtekind 0 ...})
    
  8. This is a non-ExmaGPDB GPOS exception → caught in CGPOptimizer::PlannedStmtFromQueryInternal → ORCA falls back to Postgres planner
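
Steps 1 to 4 can be reproduced in miniature without touching real freed memory. The sketch below is only an illustration, not PostgreSQL's allocator: ToyNode, toy_alloc, toy_free and the fixed slot array are hypothetical stand-ins, the tag values 150 and 596 are taken from the debug output above, and the 0x7F fill mimics the pattern visible in that output. Because the slots live in a static array, reading through the stale pointer is well defined here, which makes the type confusion observable.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Tag values copied from the debug output above (T_Var=150, T_List=596). */
enum { TOY_T_VAR = 150, TOY_T_LIST = 596 };

/* Hypothetical node: like a real Node, it begins with its type tag. */
typedef struct ToyNode
{
    uint32_t tag;
    uint32_t payload;
} ToyNode;

/* Hypothetical pool: a few static slots plus a LIFO free stack. */
#define TOY_SLOTS 4
static ToyNode toy_slots[TOY_SLOTS];
static int     toy_free_stack[TOY_SLOTS];
static int     toy_free_top = 0;
static int     toy_next_fresh = 0;

/* Hand out the most recently freed slot first, else a fresh one. */
static ToyNode *toy_alloc(uint32_t tag)
{
    ToyNode *n;

    if (toy_free_top > 0)
        n = &toy_slots[toy_free_stack[--toy_free_top]];
    else if (toy_next_fresh < TOY_SLOTS)
        n = &toy_slots[toy_next_fresh++];
    else
        return NULL;            /* pool exhausted */

    n->tag = tag;
    n->payload = 0;
    return n;
}

/* Fill the slot with 0x7F bytes and make it available for reuse. */
static void toy_free(ToyNode *n)
{
    memset(n, 0x7F, sizeof *n);
    toy_free_stack[toy_free_top++] = (int) (n - toy_slots);
}

/*
 * Replays steps 1-4: free the node while a second pointer still refers
 * to it, then allocate a node of another type. Stores the tag seen right
 * after the free in *tag_after_free and returns the tag seen after reuse.
 */
static uint32_t toy_stale_tag_after_reuse(uint32_t *tag_after_free)
{
    ToyNode *having = toy_alloc(TOY_T_VAR);  /* havingQual = v.c */
    ToyNode *stale = having;                 /* queryNew->havingQual */
    ToyNode *list;

    toy_free(having);                        /* step 1: unconditional free */
    *tag_after_free = stale->tag;            /* 0x7F7F7F7F */

    list = toy_alloc(TOY_T_LIST);            /* step 3: slot is reused */
    assert(list == stale);                   /* same address, new type */

    return stale->tag;                       /* step 4: reads as a list */
}
```

Calling toy_stale_tag_after_reuse once yields 2139062143 (0x7F7F7F7F) for the tag read right after the free and 596 for the tag read after reuse, matching the nodeTag_after and copyQuery lines of the debug output. In the real backend the reuse depends on allocation order and chunk size, which is consistent with the non-deterministic behavior reported in the summary.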

Why the Bug Went Unnoticed

The Postgres planner fallback produced the correct result (f | NULL), so no regression test ever failed. The memory corruption was silently masked.

The bug was exposed when fixing the same function's list_free guards on targetList/returningList (adding pointer-equality checks before freeing). After that fix, ORCA no longer fell back for this query — but ORCA's decorrelation logic for GROUP BY () HAVING <outer_ref> was incorrect, producing wrong results (f | 0 instead of f | NULL).

Fix

Guard pfree with a pointer-equality check (same pattern already applied to targetList, returningList, scatterClause, limitOffset, limitCount):

Node *havingQual = queryNew->havingQual;
if (NULL != havingQual)
{
    queryNew->havingQual = flatten_join_alias_vars(queryNew, havingQual);
    if (havingQual != queryNew->havingQual)   // ← only free if mutated
        pfree(havingQual);
}
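
The guard can be exercised in isolation with a small stand-in for the flattening call. This is a sketch under assumptions, not the code in clauses.c: ToyExpr, toy_flatten, toy_flatten_and_release and toy_release_count are hypothetical names, malloc/free stand in for palloc/pfree, and the rule "return the input unchanged when levelsup > 0" is a simplification of the outer-reference case described under Root Cause.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical expression node; levelsup > 0 marks an outer reference. */
typedef struct ToyExpr
{
    int tag;
    int levelsup;
} ToyExpr;

/* Counts how many old nodes the guarded helper has released. */
static int toy_release_count = 0;

/*
 * Stand-in for flatten_join_alias_vars (assumed behavior): return the
 * input pointer when nothing needs flattening, otherwise a fresh copy.
 */
static ToyExpr *toy_flatten(ToyExpr *expr)
{
    ToyExpr *copy;

    if (expr->levelsup > 0)
        return expr;            /* nothing to flatten: same pointer back */

    copy = malloc(sizeof *copy);
    if (copy == NULL)
        abort();
    *copy = *expr;
    return copy;
}

/*
 * Flatten, then release the old node only when a different node came
 * back. The pointer comparison happens before the free, so no freed
 * pointer is ever inspected.
 */
static ToyExpr *toy_flatten_and_release(ToyExpr *old_expr)
{
    ToyExpr *new_expr = toy_flatten(old_expr);

    if (new_expr != old_expr)
    {
        free(old_expr);
        toy_release_count++;
    }
    return new_expr;
}
```

With an outer-reference node (levelsup = 1) the helper returns the original pointer and releases nothing; with a local node (levelsup = 0) it returns a copy and releases the original exactly once. Wrapping the compare-then-free in one helper is a design choice for the sketch; the fix itself keeps the check inline, matching the surrounding guards.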

Related ORCA Fix

The ORCA decorrelation bug exposed by this fix (incorrect COALESCE(count(*), 0) applied to GROUP BY () HAVING <outer_ref>) was separately fixed in src/backend/gporca/libgpopt/src/xforms/CSubqueryHandler.cpp by detecting the correlated-HAVING pattern in SSubqueryDesc::Psd() and forcing m_fCorrelatedExecution = true to route through the SubPlan (correlated execution) path instead of the incorrect Left Outer Join + COALESCE decorrelation path.
