Stream API for R

I found a tutorial on www.asterdata.com on how to use R with Aster Data. The main body of the mapper looks like this:

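# Note: stdin is assumed to be a connection opened earlier in the script,
# e.g. stdin <- file("stdin", open="r"); DELIMITER and score_function
# are likewise defined outside this excerpt.
while (1)
{
 # Read one input line: a stock id (character) and an opening price (numeric)
 input_list = scan(stdin, what=list(stock_id=" ", open_price=0), nlines=1, quiet=TRUE)
 id <- input_list[["stock_id"]]
 open_price <- input_list[["open_price"]]
 # An empty read means end of input
 if (length(id) == 0)
  break
 input <- open_price
 score = score_function(input)
 # Output original tuple with attached score
 result = c(id, score)
 write(result, stdout(), sep=DELIMITER, ncolumns=length(result))
}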

In this example the R code scans and processes the input (stdin) line by line and calculates a score for each line. Now suppose I want to score 5 loan products and I have a sample of size n for each product, so 5 x n lines altogether. If n = 200, I can set nlines = 200; if the input is ordered by product, each iteration of the loop then processes exactly one product. Normally, though, the data is not that symmetric, and n differs from product to product.

How should I write the SQL-MR query so that the stream function is called 5 times, once for each product?

Thanks,

Attila

Re: Stream API for R

SELECT score
    FROM STREAM (
        ON (
            SELECT
                -- loan_product must be part of the input so that
                -- PARTITION BY can reference it
                loan_product, stock_id, open_price
            FROM
                myschema.mytable
        )
        PARTITION BY loan_product
        SCRIPT ('RSCRIPT my_rscript.R "arg1" "arg2" ')
        OUTPUTS ('score numeric(5,2)')
);
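
On the R side, the script then has to handle a variable number of rows per product instead of a fixed nlines. The following is only a minimal sketch, not the tutorial's code. It assumes the script receives tab-delimited lines with the columns loan_product, stock_id, open_price in that order. The function name score_by_product and the body of score_function are hypothetical placeholders. Whether the script is started once per partition may depend on the Aster setup, so the sketch does not rely on that: it groups the rows by loan_product itself and scores each group as a whole.

```r
# Hypothetical stand-in for the tutorial's score_function:
# scores each price relative to the mean price of its own product group.
score_function <- function(prices) {
  round(prices / mean(prices), 2)
}

# Parse delimited input lines (assumed columns: loan_product, stock_id,
# open_price), group the rows by product, score each group as a whole,
# and return one output line "stock_id<delimiter>score" per input row.
score_by_product <- function(lines, delimiter = "\t") {
  if (length(lines) == 0) return(character(0))
  rows <- read.table(text = lines, sep = delimiter, header = FALSE,
                     col.names = c("loan_product", "stock_id", "open_price"),
                     colClasses = c("character", "character", "numeric"),
                     quote = "", comment.char = "", stringsAsFactors = FALSE)
  # split() yields one data frame per product, whatever its size n
  groups <- split(rows, rows$loan_product)
  out <- lapply(groups, function(g) {
    paste(g$stock_id, sprintf("%.2f", score_function(g$open_price)),
          sep = delimiter)
  })
  unlist(out, use.names = FALSE)
}
```

In the script itself this could be called as writeLines(score_by_product(readLines(file("stdin")))). Because the sketch writes stock_id and score, the OUTPUTS clause would presumably need to declare both columns rather than score alone.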