How to gain performance in manipulating and querying huge tables?
Hi everybody,
I am working with some very big tables (about 50 billion records). The table contains all transactions captured at any point in time (a timestamp stores the transaction time). It has several columns, including a varchar(250) that holds a free-text comment (which embeds some codes). I have used a partitioned primary index (PPI) based on RANGE_N over the date extracted from this timestamp, and have defined some secondary indexes (SI) as well as join indexes (JI) on this big table. However, there is a request to do the following:
1- Adding a new column to this big table ==> How long does it take to add a new column and fill in the data? Will all the PPI, SIs and JIs be affected and need to be updated?
2- Making a query to produce a pivot report (fields from other rows become columns of a specific row, depending on the join condition) over a period of time, using self-joins on this big table (up to 3 self-joins) ==> Which techniques could help such queries run over a huge number of rows?
3- Making queries that search for records containing specific codes (somewhere in the middle of the free-text field) within a period of time ==> Does any technique (index, join index, etc.) exist that could help a LIKE '%xyz%' query on such a column?
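For context, the setup described above might look something like this (a minimal sketch; the table and column names are hypothetical, and the date range is just an example):

```sql
-- Hypothetical layout of the transaction table described above
CREATE TABLE txn_history (
    txn_id      DECIMAL(18,0) NOT NULL,
    txn_ts      TIMESTAMP(0)  NOT NULL,   -- transaction capture time
    acct_id     INTEGER       NOT NULL,
    comment_txt VARCHAR(250)              -- free text, embeds some codes
)
PRIMARY INDEX (txn_id)
-- Partitioned primary index on the date part of the timestamp
PARTITION BY RANGE_N(
    CAST(txn_ts AS DATE)
    BETWEEN DATE '2000-01-01' AND DATE '2030-12-31'
    EACH INTERVAL '1' DAY
);
```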
Re: How to gain performance in manipulating and querying huge tables?
God, I hate to see a posting go unanswered (especially mine!), but I think that to answer fully, people may need a little more information.
What is your Teradata node configuration? How big is the table in GB, etc.? 50 billion rows could make for a very large table if the rows are wide.
But for question 1: why not create a second table in the format you need and backfill it over time with the old data? Then, when you are nearly up to date, take a small outage to copy the remaining data into place and rename the table.
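That approach might be sketched as follows (all names and date ranges are hypothetical; each backfill INSERT is limited to one date range so it runs in manageable, partition-friendly chunks while the old table stays online):

```sql
-- 1. Create the new table with the extra column already in place
CREATE TABLE txn_history_new AS txn_history WITH NO DATA;
ALTER TABLE txn_history_new ADD new_col INTEGER;

-- 2. Backfill in date-range chunks while the old table stays queryable
INSERT INTO txn_history_new
SELECT t.*, CAST(NULL AS INTEGER) AS new_col  -- or a derived value
FROM   txn_history t
WHERE  CAST(t.txn_ts AS DATE)
       BETWEEN DATE '2010-01-01' AND DATE '2010-01-31';
-- ...repeat for each remaining date range...

-- 3. During the short outage: copy the tail, then swap the names
RENAME TABLE txn_history     TO txn_history_old;
RENAME TABLE txn_history_new TO txn_history;
```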
Question 2: I don't understand complex SQL myself, so hopefully someone else can help answer that.
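For what it's worth, a pivot like the one in question 2 can sometimes be written with conditional aggregation (CASE inside MAX) instead of multiple self-joins, so the big table is read only once. A sketch, assuming a hypothetical attr_name/attr_value pair in each row; the real columns and join conditions would differ:

```sql
-- One pass over the table instead of up to three self-joins:
-- each CASE picks out the rows that would otherwise come from
-- one arm of the self-join.
SELECT txn_id,
       MAX(CASE WHEN attr_name = 'A' THEN attr_value END) AS col_a,
       MAX(CASE WHEN attr_name = 'B' THEN attr_value END) AS col_b,
       MAX(CASE WHEN attr_name = 'C' THEN attr_value END) AS col_c
FROM   txn_history
WHERE  CAST(txn_ts AS DATE)
       BETWEEN DATE '2010-01-01' AND DATE '2010-01-31'
GROUP BY txn_id;
```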
3: that sort of query means full-table scans; Teradata is not the right tool for it. It cannot do partial index matching, and in any case a wildcard at the start of a string would scupper most other databases as well. Without knowing what sort of data is in the column, I can't suggest an alternative, I'm afraid.
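One mitigation, since the original question says the searches are bounded by a time period: a predicate on the partitioning date lets partition elimination restrict the scan to the relevant date partitions, even though the LIKE itself still reads every row inside them. A sketch with the same hypothetical names as above:

```sql
-- Partition elimination shrinks the scan to one week of partitions;
-- the leading-wildcard LIKE still reads every row within them.
SELECT *
FROM   txn_history
WHERE  CAST(txn_ts AS DATE)
       BETWEEN DATE '2010-01-01' AND DATE '2010-01-07'
AND    comment_txt LIKE '%xyz%';
```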