Tweet_Streamer streams live Twitter events directly into a Teradata Aster database. This SQL-MR function downloads tweets according to a specified number of tweets, keywords, and language. The idea is to stage the Twitter feed, which arrives in JSON format, in a payload table. Because the Twitter data model may change at any time to meet business needs, it is best to capture the complete JSON data and parse it later with another SQL-MR function, JsonDataExtractor(), a configurable utility that can extract specific nodes and values from a JSON structure.
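To illustrate what extracting specific nodes from a stored payload means, here is a minimal Python sketch. It is not the JsonDataExtractor() implementation, whose arguments are not described here; the helper name `extract_nodes`, the dotted-path notation, and the sample payload are all assumptions made for illustration.

```python
import json


def extract_nodes(payload, paths):
    """Return {path: value} for each dotted path; missing nodes map to None.

    Hypothetical helper: it only mimics the idea of pulling selected
    nodes out of a complete JSON document that was stored unparsed.
    """
    doc = json.loads(payload)
    result = {}
    for path in paths:
        node = doc
        for key in path.split("."):
            if isinstance(node, dict) and key in node:
                node = node[key]
            else:
                # The node is absent in this payload; record None and move on.
                node = None
                break
        result[path] = node
    return result


# Made-up payload for illustration; real tweets carry many more fields.
sample = '{"id": 1, "text": "hello", "lang": "en", "user": {"screen_name": "alice"}}'
fields = extract_nodes(sample, ["text", "user.screen_name", "place.country"])
```

Because missing nodes resolve to None rather than raising an error, the same list of paths keeps working if fields are added to or dropped from the JSON over time, which is the reason for storing the full payload first.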
The SQL-MR function is written to use an existing table that has an auto-generated id column and a payload column of varchar type.
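The shape of such a staging table can be sketched as follows. This is only an illustration that uses Python's built-in SQLite module as a stand-in for Aster; the table name `tweet_payload`, the `stage` helper, and the sample rows are assumptions, and the actual Aster DDL will differ.

```python
import sqlite3

# In-memory SQLite database standing in for the Aster target (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tweet_payload ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"  # auto-generated id
    " payload TEXT NOT NULL)"                 # raw JSON, stored unparsed
)


def stage(connection, raw_tweets):
    """Insert each raw JSON string unchanged; the database assigns the id."""
    connection.executemany(
        "INSERT INTO tweet_payload (payload) VALUES (?)",
        [(tweet,) for tweet in raw_tweets],
    )
    connection.commit()


# Made-up payloads for illustration only.
raw = ['{"id": 10, "text": "first"}', '{"id": 11, "text": "second"}']
stage(conn, raw)
rows = conn.execute("SELECT id, payload FROM tweet_payload ORDER BY id").fetchall()
```

Only the payload is supplied on insert, so the loader never needs to know the structure of a tweet; parsing is deferred to a later step.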
The main benefit of this custom function is direct access to live Twitter feeds in Aster, without relying on external applications such as Flume or HDFS. Execution can also be monitored via AMC. The function can also work in conjunction with an existing SQL-MR function, load_tweets(), which fetches historical tweets for a limited number of attributes.