COALESCE giving NULL on derived columns

Database
Teradata Employee

Re: COALESCE giving NULL on derived columns

Hi Frank,

 

I have used CURRENT ROW, but in my case i have to look in following next two rows, so it has to be 1 following and 1 following.

Frank, i know that for the last row, it will look for following two rows and it will return NULL as there will be no more rows and first_value will return NULL.

 

Frank, i still did not get the underlying working of COALESCE as you mentioned below in these lines.

The 1st part of the CASE (generated in lower code for the COALESCE), executed before the FIRST_VALUE is resolved.

The 2nd part is evaluated after the FIRST_VALUE is resolved, and as such can return NULL.

 

I mean how can the underlying case of COALESCE can work without getting the scalar expression value (first_value) which is basically the input for the COALESCE function. Secondly, if as u say that the first_value was not resolved by that time, then why it did not used my non-NULL default timestamp value.

Junior Contributor

Re: COALESCE giving NULL on derived columns

Regardless of FIRST_VALUE being deteministic or nondeterministic, the COALESCE as-is can't return NULL. 

 

When the query is ecexuted a 2nd time the result might change when the ORDER BY is not unique (and then the result is nondeterministic), but not within the same Select.

Teradata Employee

Re: COALESCE giving NULL on derived columns

Hi dnoeth,

 

I manipulated dates in frmTime coulms, so that i can get unique rows for order by clause and now it has worked perfectly. COALESCE did not gave me NULL, and i got a result which i had expected for that data.

 

dnoeth, can u please elaborate more what do you mean in below mentioned

"When the query is ecexuted a 2nd time the result might change when the ORDER BY is not unique (and then the result is nondeterministic), but not within the same Select."

 

Highlighted
Teradata Employee

Re: COALESCE giving NULL on derived columns

Hi Frank & dnoeth

 

Guys, can you people help me understand that which function is non-deterministic. Is it first_value() or coalesce. I mean, which is the one which has really created all this confusion and fuss for me. 

Junior Contributor

Re: COALESCE giving NULL on derived columns

 

SELECT calendar_date, day_of_month,
   Row_Number ()
   Over (ORDER BY month_of_calendar) AS rn,
   Sum (day_of_month)
   Over (ORDER BY month_of_calendar
         ROWS Unbounded Preceding) AS cum_sum,
   Coalesce(First_Value (calendar_date)
            Over (ORDER BY month_of_calendar
                  ROWS BETWEEN 1 Preceding AND 1 Preceding), DATE '9999-12-31') AS lag_dt
FROM sys_calendar.CALENDAR
WHERE month_of_calendar = 1418;

calendar_date day_of_month    rn  cum_sum         lag_dt
2018-02-15              15     1       15     9999-12-31
2018-02-12              12     2       27     2018-02-15
2018-02-10              10     3       37     2018-02-12
2018-02-27              27     4       64     2018-02-10
2018-02-02               2     5       66     2018-02-27
2018-02-19              19     6       85     2018-02-02
2018-02-07               7     7       92     2018-02-19
2018-02-05               5     8       97     2018-02-07
2018-02-22              22     9      119     2018-02-05
2018-02-06               6    10      125     2018-02-22
2018-02-21              21    11      146     2018-02-06
2018-02-03               3    12      149     2018-02-21
2018-02-01               1    13      150     2018-02-03
2018-02-20              20    14      170     2018-02-01
2018-02-18              18    15      188     2018-02-20
2018-02-16              16    16      204     2018-02-18
2018-02-11              11    17      215     2018-02-16
2018-02-09               9    18      224     2018-02-11
2018-02-28              28    19      252     2018-02-09
2018-02-26              26    20      278     2018-02-28
2018-02-24              24    21      302     2018-02-26
2018-02-08               8    22      310     2018-02-24
2018-02-25              25    23      335     2018-02-08
2018-02-13              13    24      348     2018-02-25
2018-02-04               4    25      352     2018-02-13
2018-02-14              14    26      366     2018-02-04
2018-02-17              17    27      383     2018-02-14
2018-02-23              23    28      406     2018-02-17

 

Run this several times and you'll get totally different result sets.

Cumulative & Moving aggregates and ROW_NUMBER need a unique sort order to return the same result repeatedly.

 

But the first row should always show a lag_dt 9999-12-31 and not a NULL.

When you cut&paste the lag_dt calculation multiple times, i.e. return it multiple times within a Select, all columns will always show the same date.

Explain shows a single STATS-step, calculation all columns simultaneously.

 

So FIRST_VALUE is deterministic within the same Select, but might be nondeterministic when it's not uniquely sorted and the Select is run repeatedly.

 

Junior Supporter

Re: COALESCE giving NULL on derived columns

Mobeen,

 

Deterministic means at the outcome of the function is always the same, on the same data.

If a function is nondeterministic, repeated executions may provide different results.

 

I agree with dnoeth that COALESCE should work regardless of nondeterministic, but the manual says different.

You should open an incident with GSC and hopefully this can get fixed, as it is confusing.

One would expect that the same result is used in both places of the CASE, not that the analytical function is executed twice.

 

You can try pushing the function into a subselect as below, but the parser may flatten the code and result could be the same.

Only way I know to get around that is to force resolution of inner select with a GROUP BY.

 

select 'active' as status, frmTime, toTime,
coalesce(nxtTime1, timestamp '9999-12-31 23:59:59' ) as nxtTime
FROM (SEL case
when status = 'expired' then else cast(fromDT as timestamp(0)) - interval '365' day
else cast(frmDTas timestamp(0))
end as frmTime,
case
when status = 'expired' then frmTime
else cast(toDT as timestamp(0)) + interval '1' day
end as toTime,
first_value(frmTime) over (partition by pNo , Obj_Id order by toTime asc, frmTime asc
rows between 1 following and 1 following) as nxtTime1
from CCD.clsTblDetail
where pNo = 1122
) A;

 

Teradata Frank, Certified Master
Junior Contributor

Re: COALESCE giving NULL on derived columns

SELECT x, Count(*)
FROM
 (
   SELECT Coalesce(CASE Random(1,4)
                     WHEN 1 THEN 1
                     WHEN 2 THEN 2
                     WHEN 3 THEN 3
                   END, 4) AS x
   FROM sys_calendar.CALENDAR
 ) AS dt
GROUP BY 1

x   Count(*)
1      18281
2      18327
3      18540
4      18266

This is using a nondeterministic RANDOM and was actually causing problems in a very old release (V2R?), but this was fixed (obviously not in the manual). 

Not a single NULL returned when calculated 73414 times :-)

Teradata Employee

Re: COALESCE giving NULL on derived columns

Hi Frank

 

Frank, yes i have raised this incident as it was becoming very confusing. Currently, I will go for someother workarounds just like you suggested in your query.

Teradata Employee

Re: COALESCE giving NULL on derived columns

Hi dnoeth,

 

I have executed your query on my system and it also did not give any NULL value. Meanwhile, i have raised an incident regarding this issue with Teradata Support. What uptill now, i have identified in my case is that, it will not work untill and unless i have unique values in my order by columns.