Insert into select with a count distinct and coalesce results in random answers.

Database
Highlighted
New Member

Insert into select with a count distinct and coalesce results in random answers.

I have the data as 

product_info       |    sales
----------------------------------------------------
size:52,weight:15.42,style:U,color:Red |  12

 

Creating a test table

CREATE MULTISET TABLE as_adventure.test (key_c1 VARCHAR(4096), color_c2 VARCHAR(4096), customer_count_c3 BIGINT);

This is the select statement

SELECT
   COALESCE(REGEXP_SUBSTR(factinternetsales_t10.product_info, '(^|,)color:\K([^,]+)'), CASE WHEN factinternetsales_t10.product_info IS NULL THEN factinternetsales_t10.product_info ELSE 'null' END) (CASESPECIFIC)  AS key_c1,
   COALESCE(REGEXP_SUBSTR(factinternetsales_t10.product_info, '(^|,)color:\K([^,]+)'), CASE WHEN factinternetsales_t10.product_info IS NULL THEN factinternetsales_t10.product_info ELSE 'null' END) (CASESPECIFIC)  AS color_c2,
   CAST(COUNT(DISTINCT factinternetsales_t10.customerkey) AS BIGINT) AS customer_count_c3
FROM
   as_adventure.factinternetsales AS factinternetsales_t10
GROUP BY
   1,
   2;

Output

White	White	559
null	null	13454
Blue	Blue	3577
Red	Red	4500
Yellow	Yellow	4134
Black	Black	7790
Silver	Silver	3053
Multi	Multi	3488

There are 8 rows returned here

 

Now when I try to insert this as a select

INSERT INTO as_adventure.test  SELECT
   COALESCE(REGEXP_SUBSTR(factinternetsales_t10.product_info, '(^|,)color:\K([^,]+)'), CASE WHEN factinternetsales_t10.product_info IS NULL THEN factinternetsales_t10.product_info ELSE 'null' END) (CASESPECIFIC)  AS key_c1,
   COALESCE(REGEXP_SUBSTR(factinternetsales_t10.product_info, '(^|,)color:\K([^,]+)'), CASE WHEN factinternetsales_t10.product_info IS NULL THEN factinternetsales_t10.product_info ELSE 'null' END) (CASESPECIFIC)  AS color_c2,
   CAST(COUNT(DISTINCT factinternetsales_t10.customerkey) AS BIGINT) AS customer_count_c3
FROM
   as_adventure.factinternetsales AS factinternetsales_t10
GROUP BY
   1,
   2;

That results in 127 rows

 

Example subset returned

White	White	559
null	null	13454
Blue	Blue	3577
Red	Red	4500
White	White	559
null	null	13454
Blue	Blue	3577
Red	Red	4500
Yellow	Yellow	4134
Black	Black	7790
......

 

I am not sure what the order of operations are but this seems wrong. Is there something I need to keep in mind for count(distinct) queries.

The above issue doesn't occur if I call sum instead of count distinct.

If I remove the coalesce, the query crashes.

1 REPLY 1
Teradata Employee

Re: Insert into select with a count distinct and coalesce results in random answers.

Hi gauravshetti,

 

That's very weird indeed.

Maybe it's a bug, you should open an incident with support.

 

As a workaround, can you try using NVP function instead of REGEXP_SUBSTR ?

select nvp('size:52,weight:15.42,style:U,color:Red', 'color', ',', ':') AS color;