Expression index should be leveraged only when the predicate matches the expression of interest #17479

@nsivabalan

Description

Bug Description

What happened:
Say we create an expression index as follows:

create index idx_datestr on tableName using column_stats(ts) options(expr='from_unixtime', format='yyyy-MM-dd HH:mm')

The expectation is that the expression index is used when a query supplies a predicate such as "where from_unixtime(ts, 'yyyy-MM-dd') = '1970-01-01'".
Example query:

select id, name from tableName where from_unixtime(ts, 'yyyy-MM-dd') = '1970-01-01'

However, if the query's predicate references the data column directly (ts, without the from_unixtime expression), we should not look up the expression index for pruning. Today such queries still look up the expression index and hit an exception caused by a casting issue. We swallow that failure silently and move on to the next index.

What you expected:
We should look up the expression index only when the predicate's expression matches the index expression. Otherwise, we should fall back to the other available indices.
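
To make the expected behavior concrete, here is a minimal sketch of the matching rule in Python. It is not Hudi code: `ExprIndex`, `Predicate`, `can_use_expression_index`, `choose_index`, and the `"other_index"` fallback label are all hypothetical names used for illustration. The sketch matches on column and function name only; whether options such as `format` must also match is left open here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ExprIndex:
    """Hypothetical description of an expression index."""
    name: str    # e.g. "idx_datestr"
    column: str  # source column the expression is applied to
    expr: str    # function name, e.g. "from_unixtime"


@dataclass(frozen=True)
class Predicate:
    """Hypothetical description of one query predicate."""
    column: str
    expr: Optional[str] = None  # None means the raw data column is used


def can_use_expression_index(index: ExprIndex, pred: Predicate) -> bool:
    # Eligible only if the predicate wraps the same column in the same function.
    return (
        pred.expr is not None
        and pred.column == index.column
        and pred.expr == index.expr
    )


def choose_index(pred: Predicate, indexes: List[ExprIndex],
                 fallback: str = "other_index") -> str:
    # Pick the first matching expression index; otherwise skip expression
    # indexes entirely instead of trying one and swallowing the failure.
    for index in indexes:
        if can_use_expression_index(index, pred):
            return index.name
    return fallback


idx = ExprIndex(name="idx_datestr", column="ts", expr="from_unixtime")
with_expr = choose_index(Predicate(column="ts", expr="from_unixtime"), [idx])
raw_column = choose_index(Predicate(column="ts"), [idx])
```

Under these assumptions, the predicate on from_unixtime(ts, ...) selects idx_datestr, while a predicate directly on ts never touches the expression index and goes straight to the fallback.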

Steps to reproduce:
1.
2.
3.

Environment

Hudi version:
Query engine: (Spark/Flink/Trino etc)
Relevant configs:

Logs and Stack Trace

No response

Metadata

Assignees

No one assigned

    Labels

    type:bug (Bug reports and fixes)
