Help with SQL

Database
Enthusiast

Help with SQL

Hello Experts,

Hope you are doing well.

I need some help determining the logic for the below scenario. The sample data is as follows

record    col1        dt             col3  col4      col5

===============================

 1        3202     09/27/2012      2     abc       ?

 2        3202     09/27/2012      4     abc       ?

 3        3202     09/27/2012      5     abc       ?

 4        3202     10/03/2012      5     abc       ?

 5        3202     10/13/2012      2     abc       ?

 6        3202     10/13/2012      3     abc       ?

 7        3202     10/13/2012      4     abc       ?

 8        3202     10/13/2012      5     abc       ?

 9        3202     10/14/2012      5     abc       ?

 10       3202     10/15/2012      4     abc      ?

 11       3202     10/16/2012      5     abc      ?

The desired output is as below. The logic to update col5 is as follows

(1) Only the first record (col5) for a combination of col1,dt,col3,col4 is updated and the remaining records for the same combination will always have NULL.

(2) Since there is no previous record for col4 for any day prior to 9/27, col5( record#1 ) is set to NULL.  Record #'s 2 and 3 will always be null.

(3) Looking at records 3 and 4, since the date difference is 6 days and the diff between col3 on those records ( 5 and 5) are same (or lower), it will be 6-1=5. If the record 3 had a col3 value of 4, it would have been 6. Again records 5,6,7,8 will be NULL based on rule 1.

(4) Looking at records 8 and 9, the dt difference is 1, but since col3 is same, the col5 value will be 0. Had col3 on record 8 were a 4 or lower, col5 on record 9 would have been set to 1

(5) Looking at records 9 and 10, the date difference is 1, but since col3 on record 10 is lower than the col3 value on 9, it will have a 0.

(6)  Looking at records 10 and 11, the date difference is 1, but since col3 on record 11 is greater than the col3 value on 10, it will have a the date differece which is 1.

record#   col1        dt               col3  col4      col5

===============================

1            3202     09/27/2012      2     abc       ?

2            3202     09/27/2012      4     abc       ?

3            3202     09/27/2012      5     abc       ?

4            3202     10/03/2012      5     abc       5

5            3202     10/13/2012      2     abc       9

6            3202     10/13/2012      3     abc       ?

7            3202     10/13/2012      4     abc       ?

8            3202     10/13/2012      5     abc       ?

9            3202     10/14/2012      5     abc       0

10          3202     10/15/2012      4     abc       0

11          3202     10/16/2012      5     abc       1

I am sorry, this is confusing, I wish I could explain it better. Any and all help on this would be greatly appreciated.

Thanks much.

6 REPLIES
WAQ
Enthusiast

Re: Help with SQL

Its always good to provide the DDL's and DML's in order to get the resolution quickly.

Enthusiast

Re: Help with SQL

Hello,

Good Morning!!

Thank you very much for looking into this.

Here is the DDL/DML . Kindly let me know if you need any further info.

CREATE volatile TABLE test  ,FALLBACK ,

      NO BEFORE JOURNAL,

      NO AFTER JOURNAL,

      CHECKSUM = DEFAULT,

      DEFAULT MERGEBLOCKRATIO

      (

       COL1 BIGINT NOT NULL,

       DT DATE FORMAT 'YYYY-MM-DD' NOT NULL,

       COL3 SMALLINT NOT NULL,

       COL4 VARCHAR(84) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL,

       COL5 INTEGER

       )

 PRIMARY INDEX ( COL3 )

 on commit preserve rows;

insert into test values(3202, date '2012-10-03',5,'abc',);

insert into test values(3202, date '2012-10-13',3,'abc',);

insert into test values(3202, date '2012-10-13',2,'abc',);

insert into test values(3202, date '2012-09-27',2,'abc',);

insert into test values(3202, date '2012-09-27',4,'abc',);

insert into test values(3202, date '2012-10-13',4,'abc',);

insert into test values(3202, date '2012-10-15',4,'abc',);

insert into test values(3202, date '2012-09-27',5,'abc',);

insert into test values(3202, date '2012-10-13',5,'abc',);

insert into test values(3202, date '2012-10-14',5,'abc',);

insert into test values(3202, date '2012-10-16',5,'abc',);

sel * from test

order by 1,2,3;

Senior Supporter

Re: Help with SQL

I guess rule 1 is not correct

as

sel COL1,dt,col3,col4 from vt_test
group by 1,3,4,2
having count(*) > 1;

gives null rows.

-> all rows are first occurance...

So try to explain it in different words...

Enthusiast

Re: Help with SQL

Hi Ulrich,

For a combination of col1,dt,col4 the record with col3 value "2" is the first (which is record# 1) for dt 9/27. For dt 10/13, record # 5 is the first.

In other words, these are the records that should be updated.

sel * from test

qualify row_number() over (partition by col1,dt,col4 order by col3)=1;

Please help.

Junior Contributor

Re: Help with SQL

Based on your narrative:

SELECT
test.*,
CASE
WHEN ROW_NUMBER() OVER (PARTITION BY col1,dt,col4 ORDER BY col3) = 1
THEN (dt-MIN(dt) OVER (PARTITION BY col1 ORDER BY dt,col4,col3 ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING))
- CASE
WHEN col3-MIN(col3)
OVER (PARTITION BY col1
ORDER BY dt,col4,col3
ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) <= 0
THEN 1
ELSE 0
END
END
FROM test

Dieter

Enthusiast

Re: Help with SQL

Thank you very much Dieter. This is exactly what I've wanted. I am running out of spool when when run on a table with a couple million records. Can you think of something that can give me a better performance? I will try it on my side as well.

Very much appreciate your help.