If you have come across scenarios like these, please explain the outcome.
What happens if a node fails in a clique, with and without a hot standby node? Will TD restart in both cases, and how will performance be affected?
What if a node fails in a system without a clique? Will TD restart and the vprocs migrate to other nodes, or will the system be offline until the node is brought back online?
Thanks in advance.
When a node that is not currently a spare node fails, Teradata will restart no matter what the configuration.
If a hot spare is available, then the AMPs from the failed node will move to the hot spare automatically and the system will return to operation. There will be no performance impact.
If a hot spare is not available, then the AMPs from the failed node will be distributed among the remaining nodes in the clique. The system will return to operation with all data available but with a performance impact equal to the percentage of the clique the failed node represented (e.g. 33% for a 3-node clique). This performance impact will be felt in full if the system is operating at full capacity, but will be felt significantly less if the system is operating below full capacity.
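To put numbers on the no-hot-spare case, here is a rough sketch of the arithmetic. It assumes every node in the clique contributes equally and that demand scales linearly with capacity; both are simplifying assumptions, and the function names are made up for illustration rather than taken from any Teradata tool.

```python
def clique_capacity_loss(nodes_in_clique: int) -> float:
    """Fraction of clique capacity lost when one node fails and there is
    no hot spare (assumes all nodes contribute equally)."""
    if nodes_in_clique < 2:
        raise ValueError("a clique needs at least 2 nodes to absorb a failure")
    return 1.0 / nodes_in_clique


def unmet_demand(nodes_in_clique: int, utilization: float) -> float:
    """Fraction of full capacity that can no longer be served after the
    failure, given utilization before the failure (0.0 to 1.0).
    Assumes a simple linear model of demand versus capacity."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0.0 and 1.0")
    remaining = 1.0 - clique_capacity_loss(nodes_in_clique)
    # Only demand above the remaining capacity is actually felt.
    return max(0.0, utilization - remaining)


if __name__ == "__main__":
    print(f"3-node clique loses {clique_capacity_loss(3):.0%} of its capacity")
    print(f"felt at 100% utilization: {unmet_demand(3, 1.0):.0%}")
    print(f"felt at 50% utilization:  {unmet_demand(3, 0.5):.0%}")
```

Under this model a 3-node clique running at half capacity absorbs the failure without any visible slowdown, while the same clique running flat out feels the whole loss.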
A Teradata system always has cliques unless it is a single node system. If the only node in a single node system fails, the system is down until the node is replaced.
Thanks ToddAWalter :-) But I have some doubts.
In the TD manual (Introduction to TD14) it says that,
Configuring hot standby nodes eliminates:
• Restarts that are required to bring a failed node back into service.
According to this point, TD won't restart if it has a hot standby node? Is that correct, or have I misunderstood something?
Also, will TD undergo multiple restarts if a node fails, i.e. once when the node fails and again when that node comes back online?
Thanks again for your time :-)
Good question - I did not comment on the return to service part...
If there is a standby node, then when the failed node is repaired it is returned to the configuration automatically as the new standby node. No further restart is required.
If there is no standby node, then when the failed node is repaired, another restart must be scheduled to return to the full configuration with the AMPs migrated back to the repaired node.
Each of the restarts mentioned is an automatic process except the return-to-service restart, which must be initiated at your convenience. All of these restarts are 3-5 minute events.
Thanks for the prompt response :-)
So here's my understanding, please correct me if my understanding is wrong...
Node failure - TD will restart once for sure, no matter what the configuration is.
Node failure - If there is no hot standby node, TD has to restart once as mentioned above; in addition, we have to schedule a restart so that the components migrate back to the original node. 2 restarts and some performance degradation.
Node failure - If a hot standby node is configured, then TD has to restart once as mentioned above, and no further restart is required. Once the failed node is back online, it becomes the new hot standby node. 1 restart and no performance degradation.
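The summary above can be written down as a small lookup. This only encodes what has been said in this thread (one automatic restart in every case, plus one scheduled restart when there is no hot standby); the class and function names are hypothetical and do not correspond to any Teradata API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureOutcome:
    automatic_restarts: int      # restarts triggered by the failure itself
    scheduled_restarts: int      # restarts the DBA must schedule after repair
    degraded_until_repair: bool  # True if AMPs are squeezed onto fewer nodes

    @property
    def total_restarts(self) -> int:
        return self.automatic_restarts + self.scheduled_restarts


def node_failure_outcome(hot_standby_available: bool) -> FailureOutcome:
    """Outcome of a single (non-spare) node failure in a clique, as
    described in this thread."""
    if hot_standby_available:
        # AMPs move to the standby; the repaired node becomes the new standby.
        return FailureOutcome(1, 0, False)
    # AMPs are spread over the surviving nodes until a scheduled restart
    # moves them back to the repaired node.
    return FailureOutcome(1, 1, True)


if __name__ == "__main__":
    for standby in (True, False):
        outcome = node_failure_outcome(standby)
        print(f"hot standby={standby}: {outcome.total_restarts} restart(s), "
              f"degraded={outcome.degraded_until_repair}")
```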
On our production server, one of the databases shut down and restarted automatically.
I want to know the times of the shutdown and restart, and the reasons for them.
Could anyone please tell me how to find this out?
The tpatrace command can be used to see information about restarts.
To see information on the last 5 restarts, log in to any node and enter tpatrace 5.
You can also query the DBC.SW_Event_Log table to see the restart timing and other useful details.