Flotsam and Jetsam – on EDW Data Growth

Teradata Employee

I’m going to catch up on some of my reading from over the summer. The common theme in this entry is the unabated explosion of data, even when it seems that some industries are scaling back.

First, we have a couple of articles from The Economist, which IMHO is the world news magazine for nerds :-). Both come from last June’s Technology Quarterly: "Sensors and Sensitivity" and "The Connected Car".

“Sensors and Sensitivity” addresses the expansion of sensors on mobile phones and the petabytes of data that I anticipate will result. Beyond what the phone already knows about your communication activities (text, web, voice, media), GPS adds location: “It also knows where you have been, how you get to work, where you like to go for lunch, what time you got home, and where you like to go at the weekend.” In addition, after some hardware upgrades, “Sensors inside phones, or attached to them, could gather information about temperature, humidity, noise level and so on.” One can imagine practical uses of this data in civil and network engineering, physical location analysis (all industries), disease control, sociology, etc. Load up the data in an EDW and away you go!

“The Connected Car” addresses the expanded data collected from automobiles, noting that modern cars can have up to 200 sensors. What first comes to mind is GPS data, with the resulting traffic flow analysis and sat-nav opportunities for traffic control. A local story even covered a fight against a radar-based traffic ticket that disagreed with the GPS data by 20 mph. But how about constant streaming of sensor data to ground stations, much like we have today for airliners? Can an alert be generated to a local AAA repair service for a flat tire on a customer’s car that has stopped moving? And what if we expand this to any usable device? Intelligent refrigerators that automagically reorder goods based on usage have been around for a number of years. As for analysis, load up the data in an EDW and away you go (again)!
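As a rough illustration of the flat-tire question above, here is a minimal Python sketch of such an alert rule. Everything in it is an assumption made for illustration: the `Reading` record, its field names, and the 20 psi threshold are hypothetical and do not come from the article or from any real telematics feed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Reading:
    """One hypothetical sensor snapshot streamed from a car."""
    car_id: str
    speed_mph: float
    tire_psi: List[float]  # one pressure value per tire


def needs_roadside_alert(reading: Reading, min_psi: float = 20.0) -> bool:
    """Flag a car that has stopped moving and has at least one tire below min_psi.

    The threshold is an illustrative assumption, not a manufacturer value.
    """
    stopped = reading.speed_mph == 0
    flat_tire = min(reading.tire_psi) < min_psi
    return stopped and flat_tire


stopped_with_flat = Reading("car-1", 0.0, [32.0, 31.5, 8.0, 32.0])
moving_normally = Reading("car-2", 55.0, [32.0, 32.0, 32.0, 32.0])

print(needs_roadside_alert(stopped_with_flat))
print(needs_roadside_alert(moving_normally))
```

In practice a rule like this would run against the streamed readings, while the full history lands in the EDW for the kind of analysis described above.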

Next we have a NY Times article, "Mining the Web for Feelings, Not Facts." Social network analysis is becoming pretty standard, with many network providers, such as Twitter, offering standard APIs to access the data. The Times article deals with the new field of sentiment analysis, which tries to quantify and score positive and negative reactions to events, products, companies, etc., from the social network data. For now, this looks to be mostly a service offered by the firms that built the first scoring algorithms. But, as we’ve seen in the past, as these methods become more widespread, companies are going to want to bring the data and analysis in-house. Load up the data…

Finally, a Wall Street Journal article, "Retailers Cut Back on Variety, Once the Spice of Marketing" (subscription required), deals with the trend over the last year to cut back product selection in retail stores. This might mean less data for the future EDW. In some ways, customer-based market basket analysis has given retailers more confidence in which products do and do not drive profit in their stores. A store will not drop that stinky cheese, with its spoilage costs and limited sales, once it correctly identifies that the customers who periodically buy it are overwhelmingly profitable. On cutting back selection, however, the article noted that Wal-Mart found “Research showed shoppers spent an average of 22 minutes in a Wal-Mart, but suggested that the wide product variety was curtailing the number of items they put in their shopping baskets, says John Fleming, chief merchandise officer.”
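The stinky cheese point can be made concrete with a small Python sketch of customer-based market basket analysis. The data, customer names, and function names below are all invented for illustration. The idea is simply to compare a product’s own profit with the total profit of the customers who buy it.

```python
from collections import defaultdict

# Hypothetical line items: (customer, product, profit on that line in dollars).
LINE_ITEMS = [
    ("alice", "stinky cheese", -2.0),
    ("alice", "wine", 12.0),
    ("alice", "crackers", 3.0),
    ("bob", "milk", 0.5),
    ("bob", "bread", 0.5),
    ("carol", "stinky cheese", -1.0),
    ("carol", "olives", 4.0),
    ("carol", "wine", 10.0),
]


def profit_by_customer(line_items):
    """Total profit per customer across everything they bought."""
    totals = defaultdict(float)
    for customer, _product, profit in line_items:
        totals[customer] += profit
    return dict(totals)


def product_view(line_items, product):
    """Compare a product's own profit with the profit of the customers who buy it."""
    own_profit = sum(p for _c, prod, p in line_items if prod == product)
    buyers = {c for c, prod, _p in line_items if prod == product}
    totals = profit_by_customer(line_items)
    buyer_profit = sum(v for c, v in totals.items() if c in buyers)
    other_profit = sum(v for c, v in totals.items() if c not in buyers)
    return {
        "own_profit": own_profit,
        "buyer_profit": buyer_profit,
        "other_profit": other_profit,
    }


print(product_view(LINE_ITEMS, "stinky cheese"))
```

In this made-up data the cheese loses money on its own, yet the customers who buy it account for nearly all of the store’s profit, which is the kind of signal that keeps a product on the shelf.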

The article offers a lot of anecdotal evidence, but I’m not so sure. My experience follows the Long Tail thinking that online retailers can have unlimited selection and can deliver their goods very cheaply at the speed of a delivery van. I personally order goods online that are no longer available at my local retailer, most of the time at a lower cost. Maybe I’m not a profitable customer for my local brick-and-mortar store. However, I wouldn’t count on less data for the EDW any time soon.

‘Til next time.

1 Comment
John,

Very interesting post. I especially liked the cited articles in The Economist and your comments on them! I passed on some of this info, referencing your blog at http://twitter.com/richardwinter

Richard