Datastage

Tools & Utilities
Enthusiast

Datastage

what is full refresh and incremental load in datastage?
what happens to the overwritten/old record?
3 REPLIES

Re: Datastage

what is full refresh and incremental load in datastage?

Full Load is the entire data dump load taking place the very first time.
Tto synchronize the target data with source data,

there are further 2 techniques:-
Refresh load - Where the existing data is truncated and reloaded completely.
Incremental - Where delta or difference between target and source data is dumped at regular intervals. Timsetamp for previous delta load has to be maintained.

what happens to the overwritten/old record?

The Old record will be maintained only in Incremental load using one of this

1.Flag

2.Versioning

3.Timestamp(ie)

The best way is Timestamp where u have to maintain to parameters start date(last run date) and end date(current run date).
Enthusiast

Re: Datastage

thank u very much, Vasumathy.
can u elaborate more on the techniques- Flag and Versioning?
also, what is meant by the following terms & what happens to these fields in compare job in update & delete stages?
1. Source Extraction start/end date
2. Publication start/end date

Re: Datastage

Hi,

Slowly Changing Dimensions (SCD) are dimensions that have data that slowly changes.Dealing with these issues involves SCD management methodologies referred to as Type 0, 1, 2, 3, and 4.
The Type 1 methodology overwrites old data with new data, and therefore does not track historical data at all. This is most appropriate when correcting certain types of data errors, such as the spelling of a name.
The Type 2 method tracks historical data by creating multiple records in the dimensional tables with separate keys.
The Type 3 method track changes using separate columns. Where the original table structure in Type 1 and Type 2 was very similar, Type 3 will add additional columns to the tables:
The Type 4 method is usually just referred to as using "history tables", where one table keeps the current data, and an additional table is used to keep a record of some or all changes.

Adding new column, versioning flag and addding start date and End date comes under SCD TYPE 2.

Adding new column

Add new column CURRENT to the record and set it to 1(default).If the record gets updated, change the CURRENT column value of old record to 0 and then insert the updated (new) record with CURRENT column value as 1.

versioning flag

Add new column VERSION to the record and set it to v1.If the record gets updated, then insert the updated (new) record with VERSION column value as V2 and so on.

start date and End date

Add two new column START DATE & END DATE to the record and set date of insertion to START DATE and leave END DATE as blank.If the record gets updated, then set date of updation to END DATE of old record and then insert the updated (new) record with START DATE as date of updation and leave END DATE as blank.