case when from TSQL

Database

Re: case when from TSQL

Hi Dave,

 

I understand the query would work if only it had aggregate functions in the sub-query. But in this case there is no aggregate function used and as I understand the inner query will give multiple values which needs to be compared with PROC1-10.

 

Thanks,
Rohan Sawant

ROHAÑ

Re: case when from TSQL

Hi Laura,

 

I think with a very little modification my solution should work for your case. Did you try out the solution?

 

Thanks,
Rohan Sawant

ROHAÑ

Re: case when from TSQL

Hi Rohan,

 

Note that the sub-query is correlated, it is joining column PROC1 in the main table (t1 in my example code) to the CODE column in the sub-query table (USR_RACSTER.CHL_2017).

 

To use your sample data values:

  • Assume that for a row in table 't1' column PROC1 has a value of 20.
  • Assume that the rows in the table USR_RACSTER.CHL_2017 [where "VALUE_SET_NAME IN('Chlamydia Tests')"] contain CODE values of 10,20,30,5.
  • The correlated sub-query will only allow the main table row to join to the sub-query table rows where the value in CODE column matches the value in PROC1 - it will return a value of 20. This matches PROC1 (#kind of' a double check) but will provide the correct logic.

We still need the MAX function to ensure that the sub-query only returns a single row. if there were multiple rows in the sub-query table that contained the same value as PROC1 this would fail if we didn't have the MAX function.

 

Another way to look at this may be to think about the original logic. Taking a snippet of Laura's original code:

 

WHEN PROC1 IN (SELECT CODE FROM USR_RACSTER.CHL_2017 WHERE VALUE_SET_NAME IN('Chlamydia Tests') AND CODE_SYSTEM IN ('CPT','HCPCS'))

Just looking at the sub-query, this could return a list of values.

 

However, only one of those values can match to the value in PROC1 (obviously).

 

So what is actually being asked is

"if the value in PROC1 is in one or more rows in the "USR_RACSTER.CHL_2017" table rows then "WHERE VALUE_SET_NAME IN('Chlamydia Tests')" then this is a match.

 

What I've done is code a join in the sub-query which forces that match to happen.

 

As I said above, we still need the MAX function because we need to ensure that the sub-query returns a single value (one column, one row).

 

Cheers,

Dave

 

 

Ward Analytics Ltd - information in motion
www: http://www.ward-analytics.com

Re: case when from TSQL

Hi Dave,

 

I completely agree with your solution if the query is a correlated sub-query. I think there is a confusion here. Laura's first post doesnt show "AND code = a.Proc1" while later again in the later post we could see this join. So if later post by Laura is correct then Dave's solution should work. We need to figure out which is the correct query. The one at the first post or the later one. My solution is totally based on the first post. Hey Laura, can you confirm on this?

 

Thanks,

Rohan Sawant

ROHAÑ

Re: case when from TSQL

Hi Rohan,

 

I added the "code = t1.proc1" so that:

- I could make it a correlated sub-query, so that

- I could use this TD feature , so that

- I can convert Laura's original SQL to something valid in TD which will give the correct results.

 

You are correct that this depends on the original functionality, that is (obviously) what we need to replicate. I believe that my solution using the correlated sub-query approach does that.

 

I'm onsite wth a customer today but I'll see if I can create some sample data and then post that along with the results of my processing.

 

Cheers,

Dave

Ward Analytics Ltd - information in motion
www: http://www.ward-analytics.com

Re: case when from TSQL

Hi,

 

The following (I think) demonstrates that the user of correlated scalar sub-queries (CSSQ) works for this process.

 

This example simply looks at columns PROC1 and PROC2. If it gives the correct answer then just extend it for columns PROC3 - PROCn.

 

-- Create a couple of tables to mimic the relevant parts of the originals
-- Column 'seqnbr' is simply an identifier for each row in the main table so that we can validate results
DROP TABLE vt1;
CREATE SET VOLATILE TABLE vt1
(seqnbr INTEGER NOT NULL
,proc1 INTEGER 
,proc2 INTEGER 
,code_system CHAR(10) 
)
UNIQUE PRIMARY INDEX(seqnbr)
ON COMMIT PRESERVE ROWS;

CREATE SET VOLATILE TABLE CHL_2017
(VALUE_SET_NAME CHAR(30) 
,code INTEGER 
)ON COMMIT PRESERVE ROWS;

Now load some simple data - but based on the original post I think it is acceptable.

 

 

DELETE FROM vt1;
INSERT INTO vt1 VALUES(1,10,100,'CPT');
INSERT INTO vt1 VALUES(2,5,101,'CPT');
INSERT INTO vt1 VALUES(3,5,200,'CPT');

DELETE FROM chl_2017;
INSERT INTO chl_2017 VALUES('Chlamydia Tests',10);
INSERT INTO chl_2017 VALUES('Chlamydia Tests',20);
INSERT INTO chl_2017 VALUES('Chlamydia Tests',100);
INSERT INTO chl_2017 VALUES('Chlamydia Tests',200);

Now run the following query that uses the CSSQ feature and the result using the above data:

 

 

SEL t1.*
  ,(CASE 
    WHEN proc1 = (SELECT MAX(code) FROM chl_2017 WHERE code = t1.proc1 AND VALUE_SET_NAME IN('Chlamydia Tests'))
		  AND CODE_SYSTEM IN ('CPT','HCPCS') THEN '1'
    WHEN proc2 = (SELECT MAX(code) FROM chl_2017 WHERE code = t1.proc2 AND VALUE_SET_NAME IN('Chlamydia Tests'))
		  AND CODE_SYSTEM IN ('CPT','HCPCS') THEN '1'
   ELSE '0'
   END) AS ChlamScreen
FROM vt1 AS t1
ORDER BY 1;

Using my sample data this gives:

seqnbr    proc1    proc2    code_system    ChlamScreen
  1          10      100    CPT            1
  2           5      101    CPT            0
  3           5      200    CPT            1

Laura, given my sample data, is this the result that you want?

 

Cheers,

Dave

 

Ward Analytics Ltd - information in motion
www: http://www.ward-analytics.com

Re: case when from TSQL

Rohan, You are correct. The sub query will return multiple rows. I am trying to search my PROC1-10 columns against a list of procedure codes from my look-up table (table in the sub query.) I'm playing around with all your suggestions and appreciate your time. At this point I am leaning toward re-writing all my TSQL jobs that use this logic. There certainly doesn't seem to be a simple conversion to Teradata SQL. Thanks again!
Highlighted

Re: case when from TSQL

Hi Laura,

 

Having a look at multiple possible solutions is a good idea.

 

I understand that "The sub query will return multiple rows", but for any single row in the table containing column PROC1, that column can only contain a single value (assuming it isn't an ARRAY data type), so you only need one value returned from the sub-query.

 

Over to you.

 

Cheers,

Dave

Ward Analytics Ltd - information in motion
www: http://www.ward-analytics.com
N/A

Re: case when from TSQL

Dave's solution is a exact rewrite of the logic, but when you look at Explain it will be horrible. 

 

Btw, the logic in SQL Server is way too complicated, this should work, too:

CASE 
   WHEN
( SELECT CODE FROM USR_RACSTER.CHL_2017 WHERE VALUE_SET_NAME IN('Chlamydia Tests')
AND CODE_SYSTEM IN ('CPT','HCPCS')
) IN (Proc1, Proc2, Proc3, ...) THEN '1' ELSE'0' END

The bad news is, due to unknown reasons Teradata doesn't allow column names or calculations for IN :-(

 

IMHO a single Left Join based on rewriting IN using OR provides the same result and should have good performance:

 

SELECT DISTINCT -- distinct is needed because there might be multiple matches
t1.*, CASE WHEN chl.code IS NOT NULL THEN '1' ELSE '0' END FROM vt1 AS t1 LEFT JOIN chl_2017 AS chl ON (chl.code = t1.proc1 OR chl.code = t1.proc2 OR
...) AND chl.VALUE_SET_NAME IN ('Chlamydia Tests') AND chl.CODE_SYSTEM IN ('CPT','HCPCS')