UDFCONCAT

Database
Enthusiast

UDFCONCAT

I'm looking into usage of function: tdstats.udfconcat. Is there any way to control the order of the values that this function concatenates?

If I have a table: TABLE1 that looks like this:

Column_A               Column_B

55555                     123

55555                     124

55555                     125

55555                     126

55555                     127

77777                     130

77777                     131

77777                     132



and I would like to convert the values in the second column into a comma separated "ordered" list that should look like this:



Column_A                             Column_X

55555                               123,124,125,126,127

77777                               130,131,132

On using udfconcat I get this instead(values in column_x are not in order):

Column_A                             Column_X

55555                               124,123,126,125,127

77777                               132,130,131

How do I make udfconcat generate an ordered list?

if it is not possible to make udfconcat generate the concatenated values in a particular order, what is the most effecient way to rearrange the values in column_x after I apply udfconcat? (can't do recursion because the input table has about 500 million records)

I tried the aternative: xmlagg function (that did the same thing and generated the values in order too) but it is much much slower than udfconcat, consumes far more cpu/io/spool resources when compared to udfconcat (probably because udfconcat is an aggregate udf?)

-Suhail

5 REPLIES
Senior Apprentice

Re: UDFCONCAT

Hi Suhail,

There's no way to rearrange the data returned by udfconcat efficiently, you might write your own UDF (or do an Enhancement Request to add a LISTAGG/GROUP_CONCAT function).

tdstats.udfconcat is undocumented (used by AutoStats to get the column/fieldid list) and the order is probably based on the internal storage.

You might try to apply an order using ROW_NUMBER before aggregation, seems to work for my 4-AMP VM:

SELECT TABLENAME, tdstats.udfconcat(ColumnName)
FROM
(
SELECT TABLENAME, ColumnName,
ROW_NUMBER() OVER (PARTITION BY TABLENAME ORDER BY ColumnName) AS rn
FROM dbc.ColumnsV
WHERE DatabaseName = 'sys_calendar'
) AS dt
GROUP BY 1;

Of course there's no guarantee for correct sorting, it might work today and the next patch/version breaks it.

Similar when you create a NOPI table, the final spool is simply materialized as-is: 

CREATE VOLATILE TABLE vt AS
(
SELECT TABLENAME, ColumnName,
ROW_NUMBER() OVER (PARTITION BY TABLENAME ORDER BY ColumnName) AS rn
FROM dbc.ColumnsV
WHERE DatabaseName = 'sys_calendar'
) WITH DATA
NO PRIMARY INDEX
ON COMMIT PRESERVE ROWS;

SELECT TABLENAME, tdstats.udfconcat(ColumnName)
FROM VT
GROUP BY 1;
Enthusiast

Re: UDFCONCAT

Brilliant!! Thank you sir! 

One additional question regarding this comment:

"Of course there's no guarantee for correct sorting, it might work today and the next patch/version breaks it."

If I do not do row_number and generate the concatenated output in any random order, what is an effecient way to rearrange the comma separated values in the string alphabetically?

-Suhail

Enthusiast

Re: UDFCONCAT

The reason I ask for an effecient solution is because the input table has about 500 million records.

Senior Apprentice

Re: UDFCONCAT

I just can repeat: There's no way to rearrange the data returned by udfconcat efficiently.

Your options are:

  • If the number of rows per value is known and limited, old style MAX(CASE) is still very efficient
  • XMLAGG, way less efficient than udfconcat
  • Implement your own concat-UDF
  • If this was an export you could write a simple OUTMOD to concat the sorted data
Enthusiast

Re: UDFCONCAT

Thank you Dieter.

-Suhail