need to convert rows to columns on a large data set

Enthusiast

need to convert rows to columns on a large data set

I need to convert rows to columns from a table, as shown below. Please help.

Input table

ID  Name  Distance
 1  A          1.5
 1  B          2.1
 1  C          3.2

...and so on. I have 100 rows for ID 1, and similarly for IDs 2, 3, and so on.

These rows can also grow over time, maybe by a few rows per year.

I want this data transposed as below.

Expected Output:

ID  _LabelA_      _LabelB_      A    B    C    ... and so on until all the rows become columns
 1  CLOSEST_DIST  CLOSEST_DIST  1.5  2.1  3.2

Here _LabelA_ and _LabelB_ are new columns hardcoded to a fixed value.

Teradata Employee

Re: need to convert rows to columns on a large data set

Hi prathaprj,

 

Check for PIVOT on docs.teradata.com.

If you're on 16.10+, you can use the PIVOT keyword; if not, you can use the "legacy" syntax with max(case when ...).

 

Whichever method you use, you have to build the column list using dynamic SQL, either with a stored procedure or a two-query BTEQ script.

Enthusiast

Re: need to convert rows to columns on a large data set

Thanks Walder,

Could you please post the legacy syntax that would work on the dataset I mentioned above?

Enthusiast

Re: need to convert rows to columns on a large data set

The Teradata documentation has the example below, but when I wrote my code the same way I got the error "SELECT Failed, 3707: Syntax error, expected something like ',' between the word PIVOT and '('".

 

Please let me know if the query below actually works.

 

This example specifies the PivotColumn argument and, with the PivotKeys argument, the values from the pivot column to use as pivot keys. Because PivotKeys does not include 'dewpoint', the function ignores rows that include 'dewpoint'.

Input

The input table, pivot_input, contains temperature, pressure, and dewpoint data for three cities, in sparse format.

pivot_input

sn  city         week  attribute  value
 1  Asheville       1  temp       32
 1  Asheville       1  pressure   1020.8
 1  Asheville       1  dewpoint   27.6F
 2  Asheville       2  temp       32
 2  Asheville       2  pressure   1021.3
 2  Asheville       2  dewpoint   27.4F
 3  Asheville       3  temp       34
 3  Asheville       3  pressure   1021.7
 3  Asheville       3  dewpoint   28.2F
 4  Nashville       1  temp       42
 4  Nashville       1  pressure   1021
 4  Nashville       1  dewpoint   29.4F
 5  Nashville       2  temp       44
 5  Nashville       2  pressure   1019.8
 5  Nashville       2  dewpoint   29.2F
 6  Brownsville     2  temp       47
 6  Brownsville     2  pressure   1019
 6  Brownsville     2  dewpoint   28.9F
 7  Brownsville     3  temp       46
 7  Brownsville     3  pressure   1019.2
 7  Brownsville     3  dewpoint   28.9F
SQL Call
SELECT * FROM Pivoting (
  ON pivot_input PARTITION BY sn, city, week ORDER BY week, attribute
  USING
  PartitionColumns ('sn', 'city', 'week')
  PivotKeys ('temp', 'pressure') 
  PivotColumn ('attribute') 
  TargetColumns ('value1')
) AS dt ORDER BY 1,2,3;
The ORDER BY clause is required. If omitted, the output table column content is nondeterministic (for more information, see Nondeterministic Results).
Output

The function outputs the input column contents in dense format in the output columns value_temp and value_pressure, which contain the temperature and pressure, respectively. Because these values are numeric, the function casts them to VARCHAR.

sn  city         week  value_pressure  value_temp
 1  Asheville       1          1020.8          32
 2  Asheville       2          1021.3          32
 3  Asheville       3          1021.7          34
 4  Nashville       1          1021            42
 5  Nashville       2          1019.8          44
 6  Brownsville     2          1019            47
 7  Brownsville     3          1019.2          46
Teradata Employee

Re: need to convert rows to columns on a large data set

From that syntax, you were looking at the Vantage Machine Learning Engine documentation, not the Vantage NewSQL Engine one.

 

Data

create multiset volatile table mvt_data, no log
( ID            byteint
, Name          char(1)
, Distance      decimal(2,1)
)
primary index (ID)
on commit preserve rows;

insert into mvt_data values (1, 'A', 1.5);
insert into mvt_data values (1, 'B', 2.1);
insert into mvt_data values (1, 'C', 3.2);

 

16.10+ PIVOT query

select *
  from mvt_data
 pivot (max(Distance) as dist
        for Name in ( 'A' as A
                    , 'B' as B
                    , 'C' as C
                    )
       ) as pvt;

ID  A_dist  B_dist  C_dist
--  ------  ------  ------
1      1.5     2.1     3.2

 

Pre-PIVOT query

  select ID
       , max(case Name when 'A' then Distance end) as A_dist
       , max(case Name when 'B' then Distance end) as B_dist
       , max(case Name when 'C' then Distance end) as C_dist
    from mvt_data
group by ID;

-- Same output

As you can see, you have to know the values you want to pivot.

 

Hence the dynamic SQL: do a select distinct on Name, then build the query around the result.
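
For example, a rough BTEQ sketch of that generate-then-run idea (untested; it assumes the Name values are valid column-name material, and it must run in the same session that created the volatile table):

.SET TITLEDASHES OFF
.SET WIDTH 1000
.EXPORT REPORT FILE = gen_pivot.sql

/* First step: generate the query text - the fixed parts
   plus one max(case ...) line per distinct Name */
SELECT 'select ID' (TITLE '');

SELECT DISTINCT ', max(case Name when ''' || Name || ''' then Distance end) as '
       || Name || '_dist' (TITLE '')
  FROM mvt_data;

SELECT ' from mvt_data group by ID;' (TITLE '');

.EXPORT RESET

/* Second step: run the generated text (blank heading lines in the file are harmless) */
.RUN FILE = gen_pivot.sql

A stored procedure building the same text is the other classic option.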

 

Enthusiast

Re: need to convert rows to columns on a large data set

Thanks Walder, I was able to achieve the output using PIVOT. But for the IN-list I want to get the values dynamically using a SELECT query, as in the documentation example below, and I am not able to get it working. I am writing the logic the same way, except that the IN-list is prepared from the same table as the FROM-clause table. Could you please help me with the syntax?

Example: Subquery in PIVOT IN-List

This is an example of having a subquery in PIVOT IN-list.

Table s1 is defined as:
CREATE TABLE s1(yr INTEGER, mon VARCHAR (5), sales INTEGER);
CREATE TABLE s2(yr INTEGER, mon VARCHAR (5), sales INTEGER);
The table contains:
SELECT * FROM s1;
yr      mon      sales
-----   ---      -----
2001    Jan       100
2003    Jan       300
2002    Jan       150
2001    Feb       110
2003    Feb       310
2002    Feb       200
2001    Mar       120
2002    Mar       250

SELECT * FROM s2;
 yr     mon      sales
-----  -----    -------
2001    Jan       100
2002    Mar       250
2003    Feb       310
The table as a source to a PIVOT query having a subquery in PIVOT IN-list:
SELECT * FROM s1 PIVOT (SUM (sales) FOR mon in (SELECT mon FROM s2)) dt;
The output pivoted table:
*** Query completed. 3 rows found. 4 columns returned.
 *** Total elapsed time was 1 second.

         yr        'Feb'        'Jan'        'Mar'
-----------  -----------  -----------  -----------
       2001          110          100          120
       2003          310          300           ?
       2002          200          150          250

The query that I am using currently is as below:
SELECT * FROM WORK.DISTANCE_TO_STG PIVOT (sum (CLOSEST_DIST) FOR MSTR_BRAND2 in ('21C_MUSEUM_HOTEL','ALILA','ALL_SEASONS') ) DTS

I want something like the query below, but it is not working. Please help.

SELECT * FROM WORK.DISTANCE_TO_STG PIVOT (sum (CLOSEST_DIST) FOR MSTR_BRAND2 in (select distinct MSTR_BRAND2 FROM WORK.DISTANCE_TO_STG PIVOT ) ) DTS

Teradata Employee

Re: need to convert rows to columns on a large data set

You appear to have copied the keyword PIVOT into the subquery. Other than that, it seems like it should work.
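
In other words, something like this (untested, and it still requires a release that supports the subquery IN-list):

SELECT *
  FROM WORK.DISTANCE_TO_STG
 PIVOT (SUM(CLOSEST_DIST)
        FOR MSTR_BRAND2 IN (SELECT DISTINCT MSTR_BRAND2 FROM WORK.DISTANCE_TO_STG)
       ) DTS;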

Enthusiast

Re: need to convert rows to columns on a large data set

Hi Fred,

Thanks for your response.

The syntax that I am using is what is available in the Teradata documentation. Please let me know whether this is valid syntax. It would be helpful if you could provide me with a working syntax.

 

Regards,

Prathap

 

Teradata Employee

Re: need to convert rows to columns on a large data set

You should check the version of your database with 

select * from dbc.dbcInfoV;

The subquery IN-list is valid in version 16.20+, and it is limited to sixteen distinct values, which is very short; also note that you can't choose the output ordering.

 

For now, I suggest you not use this feature, as it's still limited.

 

As I said, you have to run a query (against dbc.ColumnsV) to build another query with all your pivoting columns.

You can build your query using the legacy syntax or the PIVOT syntax; they perform the same behind the scenes.
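
For example, a generator query along these lines (just a sketch, untested; it builds the literal IN-list from the distinct MSTR_BRAND2 values and assumes they don't contain characters that XMLAGG would XML-escape, such as & or <):

SELECT 'SELECT * FROM WORK.DISTANCE_TO_STG PIVOT (SUM(CLOSEST_DIST) FOR MSTR_BRAND2 IN ('
       /* one quoted value per distinct brand, comma-separated, trailing comma trimmed */
       || TRIM(TRAILING ',' FROM
            (XMLAGG('''' || TRIM(MSTR_BRAND2) || ''',' ORDER BY MSTR_BRAND2)
             (VARCHAR(8000))))
       || ')) DTS;' AS pivot_sql
  FROM (SELECT DISTINCT MSTR_BRAND2 FROM WORK.DISTANCE_TO_STG) AS d;

Copy the generated text and run it as the second step, or wrap both steps in a stored procedure or a BTEQ script as discussed above.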