Select with XMLTABLE is returning cartesian product

Database
The Teradata Database channel includes discussions around advanced Teradata features such as high-performance parallel database technology, the optimizer, mixed workload management solutions, and other related technologies.
Visitor

Select with XMLTABLE is returning cartesian product

I’m trying to incorporate some XML with typical column data and am running into a problem.  As a test environment I have loaded some sample data from
a previous example:
 
CREATE TABLE T_XMLDOCS
(
    ID     VARCHAR(30),
    XMLDOC XML
);
 
2 XML records to load in seperate rows.
 
<?xml version="1.0" encoding="UTF-8"?>
<Customers>
  <Customer C_CUSTKEY="11001" C_NAME="LCDs R Us">
    <C_ADDRESS>10001 Cary Parkway</C_ADDRESS>
    <C_NATIONKEY>21</C_NATIONKEY>
    <C_PHONE>31-389-986-4741</C_PHONE>
    <C_ACCTBAL>7140.81</C_ACCTBAL>
    <C_MKTSEGMENT>Electronics</C_MKTSEGMENT>
    <C_COMMENT>Manufacturer of Liquid Crystal Displays</C_COMMENT>
  </Customer>
</Customers>
 
<?xml version="1.0" encoding="UTF-8"?>
<Customers>
  <Customer C_CUSTKEY="1009" C_NAME="Bank of America">
    <C_ADDRESS>cWONXs2Vx30bkgYoCkx7LrJH E</C_ADDRESS>
    <C_NATIONKEY>12</C_NATIONKEY>
    <C_PHONE>22-132-906-1117</C_PHONE>
    <C_ACCTBAL>594.50</C_ACCTBAL>
    <C_MKTSEGMENT>FINANCE</C_MKTSEGMENT>
    <C_COMMENT>carefully unusual ideas sleep quickly after the even foxes. quickly regular deposits are carefully accor</C_COMMENT>
  </Customer>
</Customers>
 
I have two rows loaded; each row has only one of the XML entries in it.
100,XMLDOC0001.xml
200,XMLDOC0002.xml

When I run either of the two queries below, I receive a Cartesian product, which I don’t want
SELECT X.*
       FROM (SELECT ID, XMLDOC FROM T_XMLDOCS) AS C,
       XMLTABLE(
              '/customers/customer'
              PASSING C.XMLDOC
                     COLUMNS
                           "CustName" VARCHAR(20) PATH '@C_NAME',
                           "CustNum" VARCHAR(20) PATH '@C_CUSTKEY',
                           "PhoneNumber" VARCHAR(20) PATH 'C_PHONE',
                           "MarketSegment" VARCHAR(20) PATH 'C_MKTSEGMENT',
                           "AcctBal" DECIMAL(10,2) PATH 'C_ACCTBAL'
       ) AS X ("CustName", "CustKey", "Phone #", "MarketSegment", "AcctBal")
       ORDER BY("CustKey");
 
It returns:
Bank of America,1009,22-132-906-1117,FINANCE,594.50
Bank of America,1009,22-132-906-1117,FINANCE,594.50
LCDs R Us,11001,31-389-986-4741,Electronics,7140.81
LCDs R Us,11001,31-389-986-4741,Electronics,7140.81
 
SELECT ID, X."CustName", X."CustNum", X."PhoneNumber", X."MarketSegment", X."AcctBal"
       FROM T_XMLDOCS,
       XMLTABLE(
              '/customers/customer'
              PASSING T_XMLDOCS.XMLDOC
              COLUMNS
                     "CustName" VARCHAR(20) PATH '@C_NAME',
                     "CustNum" VARCHAR(20) PATH '@C_CUSTKEY',
                     "PhoneNumber" VARCHAR(20) PATH 'C_PHONE',
                     "MarketSegment" VARCHAR(20) PATH 'C_MKTSEGMENT',
                     "AcctBal" DECIMAL(10,2) PATH 'C_ACCTBAL'
              ) X
       ORDER BY(ID);
 
It returns:
100,Bank of America,1009,22-132-906-1117,FINANCE,594.50
100,LCDs R Us,11001,31-389-986-4741,Electronics,7140.81
200,Bank of America,1009,22-132-906-1117,FINANCE,594.50
200,LCDs R Us,11001,31-389-986-4741,Electronics,7140.81
 
How can I fix this so that I am only returning unique rows?
 
Thanks,
 
Ed
4 REPLIES 4
Teradata Employee

Re: Select with XMLTABLE is returning cartesian product

Hi.

 

You are getting a CARTESIAN PRODUCT because you are doing a CARTESIAN PRODUCT. You have to relate your ID's with some identifier in the XML (CustKey?).

If you insert the rows with ID=C_CUSTKEY:

 

10001,XMLDOC0001.xml
1009,XMLDOC0002.xml

 

you can filter then:

 

SELECT X.*
       FROM (SELECT ID, XMLDOC FROM T_XMLDOCS) AS C,
       XMLTABLE(
              '/Customers/Customer'
              PASSING C.XMLDOC
                     COLUMNS
                           "CustName" VARCHAR(20) PATH '@C_NAME',
                           "CustNum" VARCHAR(20) PATH '@C_CUSTKEY',
                           "PhoneNumber" VARCHAR(20) PATH 'C_PHONE',
                           "MarketSegment" VARCHAR(20) PATH 'C_MKTSEGMENT',
                           "AcctBal" DECIMAL(10,2) PATH 'C_ACCTBAL'
       ) AS X ("CustName", "CustKey", "Phone #", "MarketSegment", "AcctBal")

       WHERE ID = CustKey    -- this will avoid cartesian join results
       ORDER BY("CustKey");

 

 

and you'll not get duplicates.

 

Other (bad) solution is using a costly DISTINCT ...

 

HTH.

 

Cheers.

 

Carlos.

 

Teradata Employee

Re: Select with XMLTABLE is returning cartesian product

This one seems to "almost" work.

Data

create multiset volatile table mvt_XMLDOCS, no log
( ID     integer
, XMLDOC xml
)
unique primary index (ID)
on commit preserve rows;

insert into mvt_XMLDOCS values (100, CREATEXML('<?xml version="1.0" encoding="UTF-8" ?>
<Customers>
  <Customer C_CUSTKEY="11001" C_NAME="LCDs R Us">
    <C_ADDRESS>10001 Cary Parkway</C_ADDRESS>
    <C_NATIONKEY>21</C_NATIONKEY>
    <C_PHONE>31-389-986-4741</C_PHONE>
    <C_ACCTBAL>7140.81</C_ACCTBAL>
    <C_MKTSEGMENT>Electronics</C_MKTSEGMENT>
    <C_COMMENT>Manufacturer of Liquid Crystal Displays</C_COMMENT>
  </Customer>
</Customers>'));

insert into mvt_XMLDOCS values (200, CREATEXML('<?xml version="1.0" encoding="UTF-8"?>
<Customers>
  <Customer C_CUSTKEY="1009" C_NAME="Bank of America">
    <C_ADDRESS>cWONXs2Vx30bkgYoCkx7LrJH E</C_ADDRESS>
    <C_NATIONKEY>12</C_NATIONKEY>
    <C_PHONE>22-132-906-1117</C_PHONE>
    <C_ACCTBAL>594.50</C_ACCTBAL>
    <C_MKTSEGMENT>FINANCE</C_MKTSEGMENT>
    <C_COMMENT>carefully unusual ideas sleep quickly after the even foxes. quickly regular deposits are carefully accor</C_COMMENT>
  </Customer>
</Customers>'));

Query

  select CustName, CustNum, PhoneNumber, MarketSegment, AcctBal
    from XMLTable('/Customers/Customer'
                  passing mvt_XMLDOCS.XMLDOC 
                  columns CustName      varchar(20)   path '@C_NAME'
                        , CustNum       varchar(20)   path '@C_CUSTKEY'
                        , PhoneNumber   varchar(20)   path 'C_PHONE'
                        , MarketSegment varchar(20)   path 'C_MKTSEGMENT'
                        , AcctBal       decimal(10,2) path 'C_ACCTBAL'
                 ) as X
order by CustNum;

 CustName        CustNum PhoneNumber     MarketSegment AcctBal 
 --------------- ------- --------------- ------------- ------- 
 Bank of America 1009    22-132-906-1117 FINANCE        594.50
 LCDs R Us       11001   31-389-986-4741 Electronics   7140.81

But I loose the table ID.

New Member

Re: Select with XMLTABLE is returning cartesian product

I was having the exact same kind of issue, and with the help of your 'Almost' solution, I've got it working. 

 

You can pass variables into XMLTable function as part of the PASSING clause. This is mentioned in the TD manuals, but it does not provide any examples of how to apply the variable. 

 

As part of the columns clause you can apply the variable by prefixing with $. 

 

  select ID, CustName, CustNum, PhoneNumber, MarketSegment, AcctBal
    from XMLTable('/Customers/Customer'
                  passing mvt_XMLDOCS.XMLDOC , mvt_XMLDOCS.ID as ID
                  columns 
ID VARCHAR(20) PATH '$ID'
,
CustName varchar(20) path '@C_NAME'
,
CustNum varchar(20) path '@C_CUSTKEY'
,
PhoneNumber varchar(20) path 'C_PHONE'
,
MarketSegment varchar(20) path 'C_MKTSEGMENT'
,
AcctBal decimal(10,2) path 'C_ACCTBAL' ) as X order by CustNum;

 

	ID	CustName	CustNum	PhoneNumber	MarketSegment	AcctBal
1	100	LCDs R Us	11001	31-389-986-4741	Electronics	7,140.81
2	200	Bank of America	1009	22-132-906-1117	FINANCE 	594.50
Teradata Employee

Re: Select with XMLTABLE is returning cartesian product

Great finding !

On a side note, it works on a 16.20 but I got an error on a 15.10 - even if the explain plan works.

 

Bonus, you can pass the ID column as an integer :

  select ID, CustName, CustNum, PhoneNumber, MarketSegment, AcctBal
    from XMLTable('/Customers/Customer'
                  passing mvt_XMLDOCS.XMLDOC
                        , mvt_XMLDOCS.ID as ID
                  columns ID            integer       path '$ID' 
                        , CustName      varchar(20)   path '@C_NAME'  
                        , CustNum       varchar(20)   path '@C_CUSTKEY'     
                        , PhoneNumber   varchar(20)   path 'C_PHONE'    
                        , MarketSegment varchar(20)   path 'C_MKTSEGMENT'              
                        , AcctBal       decimal(10,2) path 'C_ACCTBAL'
                 ) as X
order by ID asc;