Calculation of prescription drug adherence

Database
Enthusiast

Calculation of prescription drug adherence

I am intested in help replicating this process in Teradata: http://www2.sas.com/proceedings/forum2007/043-2007.pdf

I would like to calculate the number of calendar days "covered" by a series of prescriptions for a patient. In the below example, patient A has 3 prescriptions, and I would like to calculate the number of calendar days that are covered (ie, drug supply available) from the date of that person's first prescription (dated 2005-02-17) until 179 days after this (ie. 2005-08-15). This person has 65 days covered, for a proportion of 0.36111 (ie, 65/180).

For patient B, who has 5 prescriptions, I also would like to find the number of days that are covered from their first prescription (2005-04-21) until 179 days later (2005-10-17). However, for this person, prescription #4 overlaps prescription #3, and prescription #5 overlaps prescription #3 and #4. So, for this person, any "extra" drug supply which results from more than one prescription being active on a given day should be added the number of days covered (ie., extra days could be added to extend their prescription #5 past the original 2005-09-01)

CREATE MULTISET VOLATILE TABLE Pdc, 
NO FALLBACK ,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT,
DEFAULT MERGEBLOCKRATIO
(
Person CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC,
Claim INTEGER,
Start_Date DATE FORMAT 'YY/MM/DD',
DaysSupply INTEGER,
End_Date DATE FORMAT 'YY/MM/DD'
)
ON COMMIT PRESERVE ROWS;
INSERT INTO Pdc('A',1,'2005-02-17',30,'2005-03-18');
INSERT INTO Pdc('A',2,'2005-06-13',30,'2005-07-12');
INSERT INTO Pdc('A',3,'2005-08-11',30,'2005-09-09');
INSERT INTO Pdc('B',1,'2005-04-21',30,'2005-05-20');
INSERT INTO Pdc('B',2,'2005-06-03',30,'2005-07-02');
INSERT INTO Pdc('B',3,'2005-07-07',30,'2005-08-05');
INSERT INTO Pdc('B',4,'2005-07-30',30,'2005-08-28');
INSERT INTO Pdc('B',5,'2005-08-03',30,'2005-09-01');

SEL * FROM PDC
ORDER BY Person, Claim;

....I am using Teradata v14.1

Thank you

7 REPLIES
Senior Apprentice

Re: Calculation of prescription drug adherence

Hi Brent,

that's a nice puzzle :-)

First I thought that it would be possible to solve with OLAP-functions, but I didn't find a solution

.

As it's based on date ranges and the number of rows per Person is probably low, a recursive query should perform well, too.

I used Periods to simplify the calculations, thus I had to adjust your End_Date to match the period logic (the end date of a period is excluded, e.g. period(jan 1, feb 1) covers january). To return your start/end logic instead of a period you can use BEGIN(pd) and LAST(pd).

WITH RECURSIVE cte AS
(
SELECT
Person
,Claim
,Start_Date
,CAST((INTERVAL(pd P_INTERSECT checked_pd) DAY) AS INT) AS DaysSupply
,End_Date
,PERIOD(Start_Date, Start_Date + 180) AS checked_pd
,PERIOD(Start_Date,End_Date+1) AS pd
FROM pdc
WHERE Claim = 1

UNION ALL

SELECT
cte.Person
,pdc.Claim
,pdc.Start_Date
,CAST((INTERVAL(pd) DAY) AS INT) -- days within range
,pdc.End_Date
,cte.checked_pd
-- days within the requested range of 180 days
,cte.checked_pd
P_INTERSECT
(PERIOD(pdc.Start_Date,pdc.End_Date+1)
-- shift overlapping days when the current start is before the previous end
+ (INTERVAL '1' DAY * GREATEST((END(cte.pd)- pdc.Start_Date),0))) AS pd_in_range
FROM pdc JOIN cte
ON pdc.Person = cte.Person
AND pdc.Claim = cte.Claim + 1
-- stop when outside of the 180 day range
WHERE pd_in_range IS NOT NULL
)
SELECT *
FROM cte
ORDER BY Person, Claim

-- or to get the percentage
SELECT Person, 100.00*SUM(DaysSupply)/180
FROM cte
GROUP BY Person

Caution: The Claim numbers must be sequential.

How is the base data calculated?

If it's a SELECT (with a ROW_NUMBER for Claim) you should materialize it in a (Volatile) Table, otherwise the recursion might perform really slow. The best PI for this table should be

PRIMARY INDEX (Person)
PARTITION BY claim
Enthusiast

Re: Calculation of prescription drug adherence

Thank you for the help, Dieter.  I am a bit new to Teradata. I seem to get errors with the SQL you have above (and also with what I have below). The error seems to be in the "WITH RECURSIVE ...." part     ....Thanks again

Encountered "WITH RECURSIVE cte AS. Was expecting one of: "alter" ... "call" ... "create" ... "drop" ... "dump" ... "logging" ... "replace" ... "replication" ... "restart" ... "restore" ... "dynamic" ... "explain" ... "lock" ... "locking" ... "with" ... "with" "recursive" "(" ... "(" ...  "validtime" ... "current" ... "transactiontime" ... "nontemporal" ... "as" ... "select" ... "sel" ... "retrieve" ... "ret" ... "cd" ... "modify" ... "give" ... "rename" ... "cm" ... "delete" ... "del" ... "release" ... "revalidate" ... "rollback" ... "rollforward" ... "checkpoint" ... "insert" ...  "ins" ... "update" ... "upd" ... "merge" ... "set" ... "commit" ... "abort" ... "begin" ... "bt" ... "end" ... "et" ... "ct" ... "grant" ... "cv" ... "revoke" ... "help" ... "show" ... "mload" ... "execute" ... "exec" ... "comment" ... "collect" ... "database" ... "echo" ... "ss" ...  "diagnostic" ... "initiate" ...

DROP TABLE Pdc;

CREATE MULTISET VOLATILE TABLE Pdc,
NO FALLBACK ,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT,
DEFAULT MERGEBLOCKRATIO
(
Person CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC,
Claim INTEGER,
Start_Date DATE FORMAT 'YY/MM/DD',
DaysSupply INTEGER,
End_Date DATE FORMAT 'YY/MM/DD'
)
ON COMMIT PRESERVE ROWS;
INSERT INTO Pdc('A',1,'2005-02-17',30,'2005-03-18');
INSERT INTO Pdc('A',2,'2005-06-13',30,'2005-07-12');
INSERT INTO Pdc('A',3,'2005-08-11',30,'2005-09-09');
INSERT INTO Pdc('B',1,'2005-04-21',30,'2005-05-20');
INSERT INTO Pdc('B',2,'2005-06-03',30,'2005-07-02');
INSERT INTO Pdc('B',3,'2005-07-07',30,'2005-08-05');
INSERT INTO Pdc('B',4,'2005-07-30',30,'2005-08-28');
INSERT INTO Pdc('B',5,'2005-08-06',30,'2005-09-04');
INSERT INTO Pdc('C',1,'2005-04-21',30,'2005-05-20');
INSERT INTO Pdc('C',2,'2005-06-03',30,'2005-07-02');
INSERT INTO Pdc('C',3,'2005-07-07',30,'2005-08-05');
INSERT INTO Pdc('C',4,'2005-07-30',30,'2005-08-28');
INSERT INTO Pdc('C',5,'2005-08-03',30,'2005-09-01');

WITH RECURSIVE cte AS
(
SELECT
Person
,Claim
,Start_Date
,CAST((INTERVAL(pd P_INTERSECT checked_pd) DAY) AS INT) AS DaysSupply
,End_Date
,PERIOD(Start_Date, Start_Date + 180) AS checked_pd
,PERIOD(Start_Date,End_Date+1) AS pd
FROM pdc
WHERE Claim = 1

UNION ALL

SELECT
cte.Person
,pdc.Claim
,pdc.Start_Date
,CAST((INTERVAL(pd) DAY) AS INT) -- days within range
,pdc.End_Date
,cte.checked_pd
-- days within the requested range of 180 days
,cte.checked_pd
P_INTERSECT
(PERIOD(pdc.Start_Date,pdc.End_Date+1)
-- shift overlapping days when the current start is before the previous end
+ (INTERVAL '1' DAY * GREATEST((END(cte.pd)- pdc.Start_Date),0))) AS pd_in_range
FROM pdc JOIN cte
ON pdc.Person = cte.Person
AND pdc.Claim = cte.Claim + 1
-- stop when outside of the 180 day range
WHERE pd_in_range IS NOT NULL
)
SELECT *
FROM cte
ORDER BY Person, Claim

-- or to get the percentage
SELECT Person, 100.00*SUM(DaysSupply)/180
FROM cte
GROUP BY Person
Senior Apprentice

Re: Calculation of prescription drug adherence

Hi Brent, 

you're running a Teradata version where you have to list all columns of a CTE (afaik before TD15), sorry for that:

WITH RECURSIVE cte 
( Person -- must list all columns
,Claim
,Start_Date
,DaysSupply
,End_Date
,checked_pd
,pd
)
AS
(
SELECT
Person
,Claim
...
Enthusiast

Re: Calculation of prescription drug adherence

Dieter,

Thank you again for your rapid response and help! In the below SQL, it seems that for person A, a result of 50% of days covered is returned, but really it should only be 36.1%, as this person's "window close" (180 days after first prescription) is 2005-08-15, so 25 days worth of activity from this person's 3rd claim shouldn't "count", but in this version, it appears to be as 90/180 = 50%. Basicially, any days after a person's 180d window (defined as 180 days after first claim) don't count, whether those days were there because the claim naturally covered those days, or because some shifting caused days to be added post this 180d window.

And also, is there any way to see more about what lines 48-65 are doing? When I add more columns to the WITH RECURSIVE column list in line 33 (such as pd_in_range), and comment out lines 72-75 and uncomment lines 67-69, I get errors (more columns defined than exist).

DROP TABLE Pdc;

CREATE MULTISET VOLATILE TABLE Pdc,
NO FALLBACK ,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT,
DEFAULT MERGEBLOCKRATIO
(
Person CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC,
Claim INTEGER,
Start_Date DATE FORMAT 'YY/MM/DD',
DaysSupply INTEGER,
End_Date DATE FORMAT 'YY/MM/DD'
)
ON COMMIT PRESERVE ROWS;
INSERT INTO Pdc('A',1,'2005-02-17',30,'2005-03-18');
INSERT INTO Pdc('A',2,'2005-06-13',30,'2005-07-12');
INSERT INTO Pdc('A',3,'2005-08-11',30,'2005-09-09');

INSERT INTO Pdc('B',1,'2005-04-21',30,'2005-05-20');
INSERT INTO Pdc('B',2,'2005-06-03',30,'2005-07-02');
INSERT INTO Pdc('B',3,'2005-07-07',30,'2005-08-05');
INSERT INTO Pdc('B',4,'2005-07-30',30,'2005-08-28');

INSERT INTO Pdc('C',1,'2005-04-21',30,'2005-05-20');
INSERT INTO Pdc('C',2,'2005-06-03',30,'2005-07-02');
INSERT INTO Pdc('C',3,'2005-07-07',30,'2005-08-05');
INSERT INTO Pdc('C',4,'2005-07-30',30,'2005-08-28');
INSERT INTO Pdc('C',5,'2005-08-03',30,'2005-09-01');

WITH RECURSIVE cte (Person, Claim, Start_Date, DaysSupply, End_Date, checked_pd, pd ) AS
(
SELECT
Person
,Claim
,Start_Date
,CAST((INTERVAL(pd P_INTERSECT checked_pd) DAY) AS INT) AS DaysSupply
,End_Date
,PERIOD(Start_Date, Start_Date + 180) AS checked_pd
,PERIOD(Start_Date,End_Date+1) AS pd
FROM pdc
WHERE Claim = 1

UNION ALL

SELECT
cte.Person
,pdc.Claim
,pdc.Start_Date
,CAST((INTERVAL(pd) DAY) AS INT) -- days within range
,pdc.End_Date
,cte.checked_pd
-- days within the requested range of 180 days
,cte.checked_pd
P_INTERSECT
(PERIOD(pdc.Start_Date,pdc.End_Date+1)
-- shift overlapping days when the current start is before the previous end
+ (INTERVAL '1' DAY * GREATEST((END(cte.pd)- pdc.Start_Date),0))) AS pd_in_range
FROM pdc JOIN cte
ON pdc.Person = cte.Person
AND pdc.Claim = cte.Claim + 1
-- stop when outside of the 180 day range
WHERE pd_in_range IS NOT NULL
)
--SELECT *
--FROM cte
--ORDER BY Person, Claim;

-- or to get the percentage
SELECT Person, 100.00*SUM(DaysSupply)/180
FROM cte
GROUP BY Person
ORDER BY Person;

Again, Many thanks. 

Senior Apprentice

Re: Calculation of prescription drug adherence

Argh, I simplified the query and didn't double check.

Change this line

,CAST((INTERVAL(pd) DAY) AS INT) -- days within range 

to

,CAST((INTERVAL(pd_in_range) DAY) AS INT) -- days within range 

When you add more columns to the cte list you also have to add them to the Selects, all three must have the same number. But you don't have to add pd_in_range, this is already included, it's pd (within the UNION SELECT each column might have a different name, but the name used when you SELECT FROM cte is the one defined in the WITH list.


Enthusiast

Re: Calculation of prescription drug adherence

Hi Dieter,

Really appreciate all your help on this one. Now I have a little variation on this puzzle.

I'd also be interested in calculating the number of unique days covered over a time frame for a series of prescriptions. In the below example, the person has three prescriptions, but the second one almost completely overlaps the first, and any days after December 31, 2015 that are covered in the last prescription do not count at all. 

Between the date of this person's first prescription June 16, 2015 and a defined cut off date of December 31, 2015, I believe that there are 127 unique calendar days that are 'covered' by a drug claim. In other words, very similar to the setup you have made above, but in this version, there is no advancing the number of days is allowed when drug claims overlap.

In your statements above, I imagine this is the line that would need to be modified, but I have to admit I am still working on upacking what it does....I think I am getting close, though....

cte.checked_pd P_INTERSECT(PERIOD(pdc.Start_Date,pdc.End_Date+1) + (INTERVAL '1' DAY * GREATEST((END(cte.pd)- pdc.Start_Date),0))) AS pd_in_range

New example (where the goal is 127 unique covered days)

CREATE MULTISET VOLATILE TABLE Pdc,
NO FALLBACK ,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT,
DEFAULT MERGEBLOCKRATIO
(
Person CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC,
Claim INTEGER,
Start_Date DATE FORMAT 'YY/MM/DD',
DaysSupply INTEGER,
End_Date DATE FORMAT 'YY/MM/DD'
)
ON COMMIT PRESERVE ROWS;
INSERT INTO Pdc('A',1,'2015-06-16',90,'2015-09-13');
INSERT INTO Pdc('A',2,'2015-06-19',90,'2015-09-16');
INSERT INTO Pdc('A',3,'2015-11-28',90,'2016-02-25');

Many thanks, Brent

Senior Apprentice

Re: Calculation of prescription drug adherence

Hi Brent,

you're welcome, I'm currently writing a book on advanced SQL, always looking for real-world examples :)

in fact, this is much easier because there's no shifting.

No recursion needed, there are some little known functions for aggregating periods, but the syntax is strange and documentation in the manuals is marginal:

WITH cte(Person, pd) AS
(
SELECT
Person
,PERIOD(Start_Date, End_Date + 1) -- need to add one day due to period logic
P_INTERSECT -- return the period within the requested range
PERIOD(MIN(Start_Date) OVER (PARTITION BY Person), DATE '2015-12-31' + 1) AS pd
FROM pdc
QUALIFY pd IS NOT NULL -- only rows within the requested range
)
SELECT
Person
,SUM(CAST((INTERVAL(pd) DAY(4)) AS INT)) AS covered_days
,SUM(pd_cnt) -- number of rows within the specified range
FROM
TABLE
( TD_NORMALIZE_OVERLAP_MEET(NEW VARIANT_TYPE(cte.Person), cte.pd)
RETURNS (Person CHAR(1) -- change this to match your actual datatype exactly
,pd PERIOD(DATE)
,pd_cnt INTEGER) -- optional, numer of rows combined
HASH BY Person -- must be the column(s) used in NEW VARIANT_TYPE
LOCAL ORDER BY Person, pd -- must be the column(s) used in NEW VARIANT_TYPE plus the period
) AS dt
GROUP BY 1