Loading data into aster database (Aster express)

Aster
Enthusiast

Loading data into aster database (Aster express)

Hi,

I'm using ncluster_loader for loading data into Aster express (6.0). But if i want load data from realational

database like MySQL then how should i do it, is there any connector for that?

also if i've csv file containing data in 100 columns how should i load this into aster database, i mean do i've to create

table of 100 columns or is there any other way where table can be created on the fly ?

Please help me in this

Thank you

Pradi

2 REPLIES
Enthusiast

Re: Loading data into aster database (Aster express)

Pradi-

(I apologize if this is a double-post.  It looks like the site "ate" my previous response.  I'll try to re-create it from memory.  Hopefully there won't be two similar but different versions out there.)

To answer your first question, there is an SQL-MR function called "AnyDB2Aster" that allows you to load directly from (almost) any database that has a jdbc driver.  I am not sure if it comes with the free Aster "Express" distribution, though.  Log on via ACT and type "\dF+" or run query "select * from nc_all_sqlmr_funcs;" and see if you have it.  If not, you'll either have to find a copy or write your own.

To answer your second question, there's no built-in way to automatically create tables on-the-fly while loading.  However, there IS an option to the ncluster_loader command that allows you to run a series of SQL statements (one per line) saved in a file.  The "--begin-script" option takes a file name as an argument, and the ncluster_loader runs the commands in the file.

You could:

1. Write a shell/perl/python/whatever script that parses your file and figures out how many columns it has, then compiles a "create table" command and saves it to a temp file.  The create table command would have one column definition for every column in your file.

2. Run the ncluster_loader command with the "--begin-script" option pointing to the temp file you just created.

Here's an example that assumes a comma-separated file with the first line as a header with column names:

-----------------------------snip-----------------------------

#!/bin/sh

# load-on-the-fly.sh     - Load to Aster, creating a temp table on the fly

#   usage: load-on-the-fly.sh filename

#

#

#Some set-up:

ASTERHOSTNAME=192.168.100.100

ASTERUSERNAME=db_superuser

ASTERPASSWORD=db_superuser

ASTERDATABASENAME=beehive

#And the real fun begins:

TMPFILE=`mktemp`.sql   #Create a temporary file for the --begin-script

TMPTABLE="public.temp_in_$$"

SQLCMD="create table ${TMPTABLE}("

FIRSTLINE=1 #Keep track of whether or not we need a leading comma

for i in `head -1 $1 | sed 's/,/ /g'`; do

  if [ ${FIRSTLINE} == 1 ]; then

    COLNAME="$i"

    FIRSTLINE=0

  else

    COLNAME=",$i"

  fi

  SQLCMD="$SQLCMD ${COLNAME} character varying"

done

SQLCMD="${SQLCMD}) distribute by replication;"

echo "${SQLCMD}" >> ${TMPFILE}

ncluster_loader -h ${ASTERHOSTNAME} -U ${ASTERUSERNAME} -w ${ASTERPASSWORD} -d ${ASTERDATABASENAME} -c --skip-rows 1 --begin-script ${TMPFILE} ${TMPTABLE} $1

rm -f ${TMPFILE}

-----------------------------snip-----------------------------

I have a nice little data file with some randomly generated person records in it:

-----------------------------snip-----------------------------

linux-qvsn:~ # cat mydata.text

id,first_name,last_name,email,country,ip_address

1,Teresa,Black,tblack@realfire.mil,Panama,240.247.198.212

2,Carolyn,Murray,cmurray@edgeclub.name,Egypt,238.238.245.221

3,Gerald,Reid,greid@skynoodle.edu,Myanmar,219.88.221.60

4,Lisa,Gardner,lgardner@fivechat.net,Argentina,19.206.230.207

5,Christopher,Snyder,csnyder@dazzlesphere.name,Faroe Islands,144.178.219.174

6,Raymond,Hayes,rhayes@muxo.com,Comoros,189.218.108.217

7,Elizabeth,Foster,efoster@skalith.info,Gambia,212.65.242.62

8,Wanda,Weaver,wweaver@buzzbean.gov,Greenland,29.238.186.129

9,Ryan,Black,rblack@flashpoint.name,Malta,100.48.78.166

10,Matthew,Rose,mrose@dynabox.biz,Slovakia,32.221.60.46

-----------------------------snip-----------------------------

Running my "load-on-the-fly.sh" program looks like this:

-----------------------------snip-----------------------------

linux-qvsn:~ # ./load-on-the-fly.sh mydata.text

Loading tuples using node '192.168.100.100'.

10 tuples were successfully loaded into table 'public.temp_in_19401'.

-----------------------------snip-----------------------------

And checking the database itself:

-----------------------------snip-----------------------------

linux-qvsn:~ # act -d beehive -U db_superuser -w db_superuser -A -c "select * from public.temp_in_19401;"

id|first_name|last_name|email|country|ip_address

1|Teresa|Black|tblack@realfire.mil|Panama|240.247.198.212

2|Carolyn|Murray|cmurray@edgeclub.name|Egypt|238.238.245.221

3|Gerald|Reid|greid@skynoodle.edu|Myanmar|219.88.221.60

4|Lisa|Gardner|lgardner@fivechat.net|Argentina|19.206.230.207

5|Christopher|Snyder|csnyder@dazzlesphere.name|Faroe Islands|144.178.219.174

6|Raymond|Hayes|rhayes@muxo.com|Comoros|189.218.108.217

7|Elizabeth|Foster|efoster@skalith.info|Gambia|212.65.242.62

8|Wanda|Weaver|wweaver@buzzbean.gov|Greenland|29.238.186.129

9|Ryan|Black|rblack@flashpoint.name|Malta|100.48.78.166

10|Matthew|Rose|mrose@dynabox.biz|Slovakia|32.221.60.46

(10 rows)

-----------------------------snip-----------------------------

I hope that helps!

[IMPORTANT: Code is provided as a courtesy, in an "as-is" basis w/o any promise that it will run properly.  Exercise extreme care, and run at your own risk.  No warranty is given or implied, and author shall not be held liable for any adverse consequences.]

Enthusiast

Re: Loading data into aster database (Aster express)

Thanks a lot mcooper :)