I have a query somewhat as below
sel distinct ref.val,
when ref.val = 'A' then stg.col1
when ref.val = 'B' then stg.col2
join database.ref_table ref
on ref.val in ('A','B')
As my stg_table has more than 100 million records , how can I improve the performance of such case statements.
I have given example of 2 values ('A','B') but it will go as far as 'J' around for around 10 values each will fetch different columns from STG_TABLE.
Also more join conditions and columns will get added in it as query will develop.
Please suggest what kind of index i can create to improve performance in such case.
For performance you must avoid the cross join + distinct operations.
One good method would be an UNPIVOT-like query :
select 'A', col1 from stg_table union all select 'B', col2 from stg_table
You can generate this query against your ref_table, with something like :
select 'select ''' || val || ''', ' || case val when 'A' then 'col1' when 'B' then 'col2' end || ' from stg_table ' || case row_number() over(order by val desc) when 1 then ';' else ' union all ' || chr(10) end as req from ref_table where val in ('A', 'B');