Failure restarting Teradata Database Developer instance on AWS

Hello. I have a Teradata Database Developer instance running on AWS, using the local HDD storage option. Setting it up proved easy enough. However, I want to test that I can stop the instance and bring it back up safely (I understand that shutting it down will cause data loss; this is fine). After all, if the instance isn't needed for a while, stopping it saves money.

 

To do this, I shut down the database cleanly. Then I stop the server from the EC2 console and start it back up again.

 

The database doesn't come back up automatically because the AMPs can't start (with the disks not being available). As per an earlier message on this board, I stop the database again, and use "tdc-init --force_config" to recreate the database correctly. Unfortunately this eventually results in an error such as the one below:

 

# tdc-init --force_config
-Initialize Database: ... [stopped on errors] 13:42/30:00 EET
[/] [##########################################################################################......................] 87%

 

Looking at the tdc-init.log file, I see the following lines at the end of the file:

 

2017-11-03 14:14:46,784 tdc_core.base.system_tasks [INFO] DIPALL
2017-11-03 14:14:46,807 tdc_core.base.system_tasks [INFO] Executing DIPALL at Fri Nov 3 14:14:44 2017
2017-11-03 14:14:47,104 tdc_core.base.system_tasks [INFO] Please wait...
2017-11-03 14:20:01,740 tdc_core.base.executors [WARNING] Command unset HISTFILE && /usr/pde/bin/cnsrun -utility dip -commands '{dbc} {DIPALL} {y} {DIPACC} {n}' -output return 2
2017-11-03 14:20:01,755 tdc_core.base.system_tasks [INFO] cnsrun: fatal error CNS Connection Lost.
2017-11-03 14:20:01,824 tdc_core.base.system_tasks [INFO] ***tdc_done***
2017-11-03 14:20:23,872 tdc_core.base.executors [WARNING] Command unset HISTFILE && /usr/pde/bin/cnsrun -utility dip -commands '{dbc} {DIPALL} {y} {DIPACC} {n}' -output return 2
2017-11-03 14:20:23,886 tdc_core.base.system_tasks [INFO] cnsrun: fatal error CNS Connection Lost.
2017-11-03 14:20:23,967 tdc_core.base.system_tasks [INFO] ***tdc_done***
2017-11-03 14:20:34,004 tdc_core.base.executors [WARNING] Command unset HISTFILE && /usr/pde/bin/cnsrun -utility dip -commands '{dbc} {DIPALL} {y} {DIPACC} {n}' -output return 2
2017-11-03 14:20:34,203 tdc_core.base.system_tasks [INFO] cnsrun: fatal error CNS Connection Lost.
2017-11-03 14:20:34,230 tdc_core.base.system_tasks [INFO] ***tdc_done***
2017-11-03 14:20:36,307 tdc_core.base.executors [WARNING] Command unset HISTFILE && /usr/pde/bin/cnsrun -utility dip -commands '{dbc} {DIPALL} {y} {DIPACC} {n}' -output return 2
2017-11-03 14:20:36,496 tdc_core.base.system_tasks [INFO] cnsrun: fatal error CNS Connection Lost.
2017-11-03 14:20:36,523 tdc_core.base.system_tasks [INFO] ***tdc_done***
2017-11-03 14:20:38,584 tdc_core.base.executors [WARNING] Command unset HISTFILE && /usr/pde/bin/cnsrun -utility dip -commands '{dbc} {DIPALL} {y} {DIPACC} {n}' -output return 2
2017-11-03 14:20:38,753 tdc_core.base.system_tasks [INFO] cnsrun: fatal error CNS Connection Lost.
2017-11-03 14:20:38,783 tdc_core.base.system_tasks [INFO] ***tdc_done***
2017-11-03 14:20:40,820 tdc_core.base.system [ERROR] Execution stopped.Task Error:[Initialize Database]cnsrun failed.
2017-11-03 14:20:40,849 tdc_core.base.system [ERROR] Configure Database Stopped at 2017-11-03 14:20:40. Error: Task Error:[Initialize Database] cnsrun failed.

 

I've looked at the cnsrun documentation in the manual but it doesn't really help diagnose the issue. Looks like a connectivity issue but it's not clear what it's trying to connect to (I can't think what it would try to connect to either).

 

Any ideas?

 

Richard

4 REPLIES
Teradata Employee

Re: Failure restarting Teradata Database Developer instance on AWS

If you plan on stopping and starting the image, you need to launch one of the instance types that use EBS (Elastic Block Store) so that the database data is preserved when the node shuts down. The nodes that use EBS storage are the ones that start with the letter "m" (such as m4.10xlarge, m4.16xlarge). When you re-run tdc-init, it performs what is called a sysinit, which wipes out all data in your database and sets it up like a new system. I believe you can type just tdc-init without any arguments and see if that brings your system back up. But again, I want to make sure you are clear that running this command will wipe all of the data out of your database, as if it were a new system without any data.

Re: Failure restarting Teradata Database Developer instance on AWS

Hi Arnie, thanks for taking the time to respond.

 

I'm running this solution at the lowest possible cost, and as a result I'm using a d2.xlarge instance that runs without EBS storage. I'm not worried about losing any of my data; this is a pure development solution and I've got a script to rebuild the environment if required. The only outstanding issue is that I can't restart the service after I've shut it down as described above. Obviously one answer is to terminate the instance and start a new one, but from a purely technical point of view I'd like to be able to restart this one.

 

Typing tdc-init without any arguments doesn't resolve the issue either.

Teradata Employee

Re: Failure restarting Teradata Database Developer instance on AWS

If I were going to attempt this, before shutting down the database I would go into the ctl utility's "debug" screen and set the "Start DBS" flag off. Since you know the database won't be able to start until it is re-initialized, why even attempt it? The tdc-init process isn't really designed to clean up a crashed instance.

 

To answer your earlier question, cnsrun is trying to connect to the virtual "database console", which is not available because the database has not started. You would need to look earlier in the log to see why not.
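For anyone following along, here is a minimal sketch of one way to "look earlier in the log". It assumes the whole of tdc-init.log uses the same line layout as the excerpt quoted in the original post (timestamp, logger name, level in square brackets, message). The function name and the keyword list are made up for illustration; they are not part of any Teradata tool. Note that in the excerpt the "fatal error" text is logged at INFO level, so the sketch matches on keywords as well as on level.

```python
import re

# Line layout assumed from the excerpt in the original post, for example:
# "2017-11-03 14:20:01,740 tdc_core.base.executors [WARNING] Command ..."
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<logger>\S+) \[(?P<level>[A-Z]+)\] (?P<msg>.*)$"
)


def suspicious_lines(lines, levels=("WARNING", "ERROR"),
                     keywords=("fatal error", "failed")):
    """Return (line_number, timestamp, level, message) for lines worth reading.

    A line is kept if its level is in `levels` or its message contains one of
    `keywords` (case-insensitive). Lines that do not match the assumed layout
    are skipped.
    """
    hits = []
    for number, raw in enumerate(lines, start=1):
        match = LINE_RE.match(raw.rstrip("\n"))
        if not match:
            continue
        message = match.group("msg")
        by_level = match.group("level") in levels
        by_keyword = any(word in message.lower() for word in keywords)
        if by_level or by_keyword:
            hits.append((number, match.group("ts"), match.group("level"), message))
    return hits
```

To use it, open the log file (its location depends on the installation and is not given in this thread), pass the file object to `suspicious_lines`, and print the returned tuples in order. The first hit is the place to start reading.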

 

 

Re: Failure restarting Teradata Database Developer instance on AWS

Hi Fred,

 

Thanks for your advice on this issue. I ran the ctl utility and set the Start DBS parameter to Off. Unfortunately, after shutting down the database and then the instance, bringing the instance back up, and starting the database, I get the same error.

 

With further testing, though, I've found that about one time in ten the tdc-init process completes successfully. This allows me to take a successful log and a failed log and compare the two, and the comparison shows that the first indication of any issue is precisely at the point in the log that I've pasted above. There are no failures or differences between the two logs until this point.
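A comparison like this can be sketched in a few lines of Python. This is only an illustration of the idea, not the exact procedure used here: it assumes both logs share the line layout of the excerpt above, strips the leading timestamp from each line, and masks dates embedded in messages (such as "Executing DIPALL at Fri Nov 3 14:14:44 2017") so that two runs started at different times still line up. The function names are hypothetical, and other run-specific text may need masking as well.

```python
import re
from itertools import zip_longest

# Leading "YYYY-MM-DD HH:MM:SS,mmm " prefix, as in the excerpt above.
TIMESTAMP_RE = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ")
# Dates inside messages, for example "Fri Nov 3 14:14:44 2017".
CTIME_RE = re.compile(r"[A-Z][a-z]{2} [A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4}")


def normalize(line):
    """Drop the leading timestamp and mask embedded dates."""
    line = TIMESTAMP_RE.sub("", line.rstrip("\n"))
    return CTIME_RE.sub("<date>", line)


def first_divergence(good_lines, bad_lines):
    """Return (line_number, good_line, bad_line) at the first difference.

    Returns None if the two logs are identical after normalization. If one
    log is shorter, the missing side is reported as None.
    """
    pairs = zip_longest(good_lines, bad_lines)
    for number, (good, bad) in enumerate(pairs, start=1):
        good_norm = None if good is None else normalize(good)
        bad_norm = None if bad is None else normalize(bad)
        if good_norm != bad_norm:
            return number, good_norm, bad_norm
    return None
```

Feeding it the lines of a successful log and a failed log returns the first line number at which the runs differ, which can then be checked against the failure point in the excerpt.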

 

This also makes me suspect that the issue is a timeout. Is there a timeout parameter at that point in the process that I could set differently, to give the cnsrun process time to complete successfully? I've tried running the failed cnsrun step directly from the command line, and it does take a while to come back (but it always eventually comes back).
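To put a number on "a while", the manual run can be timed. The sketch below is a generic Python wrapper that runs a command with a caller-chosen timeout and reports how long it took. It does not change any timeout inside tdc-init (no such parameter has been identified in this thread). The helper name `run_timed` is made up, and the argument list in the comment is copied from the WARNING line in the log excerpt.

```python
import subprocess
import time


def run_timed(argv, timeout_seconds):
    """Run argv and return (return_code, elapsed_seconds, output).

    return_code is None if the command was still running after
    timeout_seconds; in that case subprocess.run kills the child process.
    """
    started = time.monotonic()
    try:
        completed = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired as exc:
        output = exc.output or ""
        if isinstance(output, bytes):
            # Partial output can arrive as bytes even when text=True is set.
            output = output.decode(errors="replace")
        return None, time.monotonic() - started, output
    return completed.returncode, time.monotonic() - started, completed.stdout


# Argument list as it appears in the log excerpt (not executed here):
# ["/usr/pde/bin/cnsrun", "-utility", "dip",
#  "-commands", "{dbc} {DIPALL} {y} {DIPACC} {n}", "-output"]
```

Comparing the elapsed time of a manual run with the gap between "Please wait..." and the first WARNING in a failed log would show whether the step is simply slower than whatever limit tdc-init applies.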