JSON import to TD using TPT

Enthusiast


I'm looking for guidance on importing a .json file with varied column types (including nested JSON types, varchar, and int) into a TD DB using TPT. The user guide doesn't seem to have any direct examples; does anyone have tips or example scripts?

10 REPLIES
Teradata Employee

Re: JSON import to TD using TPT

Hi szaghik,

 

You have to load the file as a DEFERRED LOB in TPT, using the SQL Inserter operator, treating it like a CLOB.

 

Once it's in the database, you can shred your JSON into multiple tables using the JSON_SHRED_BATCH procedure.

I advise you to do some discovery on your document first.
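
For reference, here's a rough sketch of what I mean — the job, staging table, and column names are made up and the sizes are only illustrative, so treat it as a starting point rather than something tested against your environment. With a deferred LOB, the delimited input file carries the NAME of the file holding each JSON document, and the SQL Inserter reads that file at insert time:

DEFINE JOB load_json_as_clob
(
    DEFINE SCHEMA json_clob_schema
    (
        doc_id   VARCHAR(20),
        json_doc CLOB(2000000) AS DEFERRED BY NAME   /* input record holds the LOB file name */
    );

    DEFINE OPERATOR insert_json()
    TYPE INSERTER
    SCHEMA json_clob_schema
    ATTRIBUTES
    (
        VARCHAR Tdpid        = @ServerId,
        VARCHAR UserName     = @UserName,
        VARCHAR UserPassword = @UserPassword
    );

    DEFINE OPERATOR read_list()
    TYPE DATACONNECTOR PRODUCER
    SCHEMA json_clob_schema
    ATTRIBUTES
    (
        VARCHAR FileName      = @filename,
        VARCHAR DirectoryPath = @path,
        VARCHAR Format        = 'Delimited'
    );

    APPLY ('INSERT INTO DB.JSON_STAGE (doc_id, json_doc)
            VALUES (:doc_id, :json_doc);')
    TO OPERATOR (insert_json())
    SELECT * FROM OPERATOR (read_list());
);

From there you can run JSON_SHRED_BATCH (or JSON_TABLE) against the staging table to populate your relational targets.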

Enthusiast

Re: JSON import to TD using TPT

Waldar,

 

Thank you for the guidance. I'm now getting an error that I believe is related to my schema not matching my input. This is my .tpt script:

 

DEFINE JOB LOAD
(
    DEFINE SCHEMA test_schema
    (
        col1  VARCHAR(2000),
        col2  VARCHAR(2000),
        col3  VARCHAR(2000),
        col4  VARCHAR(2000),
        col5  VARCHAR(2000),
        col6  VARCHAR(2000),
        col7  VARCHAR(2000),
        col8  VARCHAR(2000),
        col9  VARCHAR(2000),
        col10 VARCHAR(2000),
        col11 JSON(64000),
        col12 JSON(64000),
        col13 JSON(64000),
        col14 VARCHAR(2000)
    );

    DEFINE OPERATOR write_to_td()
    TYPE INSERTER
    SCHEMA test_schema
    ATTRIBUTES
    (
        VARCHAR Tdpid        = @ServerId,
        VARCHAR UserName     = @UserName,
        VARCHAR UserPassword = @UserPassword,
        VARCHAR Format       = @Format,
        VARCHAR LogonMech    = 'LDAP'
    );

    DEFINE OPERATOR read_file()
    TYPE DATACONNECTOR PRODUCER
    SCHEMA test_schema
    ATTRIBUTES
    (
        VARCHAR FileName      = @filename,
        VARCHAR DirectoryPath = @path,
        VARCHAR Format        = 'Delimited'
    );

    APPLY ('INSERT INTO DB.TBL (col1, col2, col3, col4, col5, col6, col7,
                                col8, col9, col10, col11, col12, col13, col14)
            VALUES (:col1, :col2, :col3, :col4, :col5, :col6, :col7,
                    :col8, :col9, :col10, :col11, :col12, :col13, :col14);')
    TO OPERATOR (write_to_td())
    SELECT * FROM OPERATOR (read_file());
);

 

And I'm getting this error message:

 

Teradata Parallel Transporter SQL Inserter Operator Version 16.20.00.09
write_to_td: private log not specified
read_file[1]: Instance 1 directing private log report to ''.
read_file[1]: DataConnector Producer operator Instances: 1
read_file[1]: ECI operator ID: 'read_file-6172'
read_file[1]: Operator instance 1 processing file 'filepath'.
write_to_td: connecting sessions
read_file[1]: TPT19134 !ERROR! Fatal data error processing file 'filepath'. Delimited Data Parsing error: Too few columns in row 1.
write_to_td: Total Rows Sent To RDBMS: 0
read_file[1]: TPT19015 TPT Exit code set to 12.
write_to_td: Total Rows Applied: 0
read_file[1]: Total files processed: 0.
write_to_td: disconnecting sessions
write_to_td: Performance metrics:
write_to_td: MB/sec in Load phase: 0
write_to_td: Elapsed time from start to Load phase: < 1 second
write_to_td: Elapsed time in Load phase: 2 second(s)
write_to_td: Elapsed time from Load phase to end: < 1 second
write_to_td: Total processor time used = '0.02633 Second(s)'
write_to_td: Start : Thu Aug 8 19:49:02 2019
write_to_td: End : Thu Aug 8 19:49:04 2019
Job step MAIN_STEP terminated (status 12)
Job jobname terminated (status 12)
Job start: Thu Aug 8 19:49:02 2019
Job end: Thu Aug 8 19:49:04 2019

 

Any thoughts? I'm assuming that since the delimiter between my columns is the same as the delimiter used within the JSON columns, my file reader is splitting the JSON values into additional columns not declared in the schema.
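
For reference, I know the reader's delimiter can be set explicitly with the DataConnector TextDelimiter attribute — something like the sketch below (the value is purely illustrative) — though that only helps if the chosen delimiter never appears inside the JSON text:

DEFINE OPERATOR read_file()
TYPE DATACONNECTOR PRODUCER
SCHEMA test_schema
ATTRIBUTES
(
    VARCHAR FileName      = @filename,
    VARCHAR DirectoryPath = @path,
    VARCHAR Format        = 'Delimited',
    VARCHAR TextDelimiter = 'TAB'   /* any character or string that never appears in the data */
);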

Teradata Employee

Re: JSON import to TD using TPT

Well, it will depend on your source; your schema seems complicated, but it may be required.

Can you provide some rows from your source file (around 5-10 rows)?

 

I thought you were only loading a JSON file.

Enthusiast

Re: JSON import to TD using TPT

@Waldar ,

 

Good question. The difficulty is that there are multiple JSON-type columns mixed into the .json file. Here's some sample data that I've obfuscated, obviously (there's a query sketch after the sample rows). All of the strings are literal strings without commas in the text. The values shown as null in these rows can be null or strings, again without commas. Thoughts?

 

{"col1": "string", "col2": "string", "col3": "string", "col4": null, "col5": null, "col6": null, "col7": "string", "col8": "string", "col9": "string", "col10": "string", "JSON_col1": {"JSON_col1_subcol1": "string", "JSON_col1_subcol2": "string", "JSON_col1_subcol3": "string"}, "JSON_col2": {"JSON_col2_subcol1": "string", "JSON_col2_subcol2": "string", "JSON_col2_subcol3": null}, "JSON_col3": {"JSON_col3_subcol1": {"JSON_col3_subcol1_sub1": "string", "JSON_col3_subcol1_sub2": "string"}}, "col11": null}

{"col1": "string", "col2": "string", "col3": "string", "col4": null, "col5": null, "col6": null, "col7": "string", "col8": "string", "col9": "string", "col10": "string", "JSON_col1": {"JSON_col1_subcol1": "string", "JSON_col1_subcol2": "string", "JSON_col1_subcol3": "string"}, "JSON_col2": {"JSON_col2_subcol1": "string", "JSON_col2_subcol2": "string", "JSON_col2_subcol3": null}, "JSON_col3": {"JSON_col3_subcol1": {"JSON_col3_subcol1_sub1": "string", "JSON_col3_subcol1_sub2": "string"}}, "col11": null}

{"col1": "string", "col2": "string", "col3": "string", "col4": null, "col5": null, "col6": null, "col7": "string", "col8": "string", "col9": "string", "col10": "string", "JSON_col1": {"JSON_col1_subcol1": "string", "JSON_col1_subcol2": "string", "JSON_col1_subcol3": "string"}, "JSON_col2": {"JSON_col2_subcol1": "string", "JSON_col2_subcol2": "string", "JSON_col2_subcol3": null}, "JSON_col3": {"JSON_col3_subcol1": {"JSON_col3_subcol1_sub1": "string", "JSON_col3_subcol1_sub2": "string"}}, "col11": null}

{"col1": "string", "col2": "string", "col3": "string", "col4": null, "col5": null, "col6": null, "col7": null, "col8": "string", "col9": "string", "col10": "string", "JSON_col1": {"JSON_col1_subcol1": "string", "JSON_col1_subcol2": "string", "JSON_col1_subcol3": "string"}, "JSON_col2": {"JSON_col2_subcol1": "string", "JSON_col2_subcol2": "string", "JSON_col2_subcol3": null}, "JSON_col3": {"JSON_col3_subcol1": {"JSON_col3_subcol1_sub1": "string", "JSON_col3_subcol1_sub2": "string"}}, "col11": null}

{"col1": "string", "col2": "string", "col3": "string", "col4": null, "col5": null, "col6": null, "col7": "string", "col8": "string", "col9": "string", "col10": "string", "JSON_col1": {"JSON_col1_subcol1": "string", "JSON_col1_subcol2": "string", "JSON_col1_subcol3": "string"}, "JSON_col2": {"JSON_col2_subcol1": "string", "JSON_col2_subcol2": "string", "JSON_col2_subcol3": null}, "JSON_col3": {"JSON_col3_subcol1": {"JSON_col3_subcol1_sub1": "string", "JSON_col3_subcol1_sub2": "string"}}, "col11": null}

Teradata Employee

Re: JSON import to TD using TPT

OK, is this one file or multiple files?

Enthusiast

Re: JSON import to TD using TPT

@Waldar ,

 

This is all one file, but there are multiple files I'll be loading into one table.

Teradata Employee

Re: JSON import to TD using TPT

If all your JSON rows are this small, load each of them as one big VARCHAR into a single JSON column.

You may even be able to use FastLoad.
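
Something along these lines — a sketch only, with made-up names. The DataConnector 'Text' format reads each record as a single column, so the commas inside the JSON don't matter, and the VARCHAR should be cast to JSON on insert (assuming your rows stay under the VARCHAR limit):

DEFINE JOB load_json_lines
(
    DEFINE SCHEMA one_col_schema
    (
        json_line VARCHAR(64000)
    );

    DEFINE OPERATOR insert_line()
    TYPE INSERTER
    SCHEMA one_col_schema
    ATTRIBUTES
    (
        VARCHAR Tdpid        = @ServerId,
        VARCHAR UserName     = @UserName,
        VARCHAR UserPassword = @UserPassword
    );

    DEFINE OPERATOR read_lines()
    TYPE DATACONNECTOR PRODUCER
    SCHEMA one_col_schema
    ATTRIBUTES
    (
        VARCHAR FileName      = @filename,
        VARCHAR DirectoryPath = @path,
        VARCHAR Format        = 'Text'   /* the whole record becomes the single VARCHAR column */
    );

    APPLY ('INSERT INTO DB.JSON_STAGE (json_doc) VALUES (:json_line);')
    TO OPERATOR (insert_line())
    SELECT * FROM OPERATOR (read_lines());
);

For volume you could try swapping the Inserter for the LOAD (FastLoad) operator, as mentioned, since a plain VARCHAR column doesn't need the deferred-LOB path.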

Teradata Employee

Re: JSON import to TD using TPT

And afterwards you'll have to nest multiple JSON_TABLE calls, and/or go for the procedure.

Check a similar post here:

JSON Shredding nested array
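
To give you a rough idea of the JSON_TABLE side in the meantime — the staging table, id column, and output names below are made up, it assumes one output row per document (ROWEXPR('$')), and you'd extend the COLEXPR list to all the paths you need:

SELECT jt.*
FROM JSON_TABLE
(
    ON (SELECT doc_id, json_doc FROM DB.JSON_STAGE)
    USING ROWEXPR('$')
          COLEXPR('[ {"jsonpath" : "$.col1",                        "type" : "VARCHAR(2000)"},
                     {"jsonpath" : "$.JSON_col1.JSON_col1_subcol1", "type" : "VARCHAR(2000)"},
                     {"jsonpath" : "$.JSON_col2.JSON_col2_subcol1", "type" : "VARCHAR(2000)"},
                     {"jsonpath" : "$.JSON_col3.JSON_col3_subcol1.JSON_col3_subcol1_sub1", "type" : "VARCHAR(2000)"} ]')
) AS jt(doc_id, col1, j1_sub1, j2_sub1, j3_deep);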

 

 

Enthusiast

Re: JSON import to TD using TPT

@Waldar ,

 

Thanks again for the guidance. However, the link you shared to that similar post seems to be dead. Can you reshare it?