Event Stream Data: What is it and Why Should I Care?
“Events, dear boy, events.”
—British Prime Minister Harold Macmillan, when asked by a reporter what would influence his government’s decisions the most.
For a long time, most business folks have thought of data in a very structured, rigid manner. This is especially true in large, mature organizations. Data was almost always stored in neat and orderly tables in relational databases. For reporting purposes, data was often sent via an ETL batch job to a data warehouse or data mart. In short, things tended to be very controlled and predictable.
To be sure, a great deal of key enterprise information still falls into this model. As I describe in Too Big to Ignore: The Business Case for Big Data, the advent of Big Data and its components (the Internet of Things [IoT], sensor-driven data, social media, etc.) has not obviated the importance of clean, accurate information in ERP and CRM applications. Make no mistake, though: these days events drive a great deal of critical information.
Data in Motion
Employees today can analyze massive amounts of real-time information and make better business decisions.
Perhaps it is best to define event-stream processing (ESP) against “normal” data processing. In the case of the latter, an organization would never run a “final” payroll or produce a P&L in the middle of Black Friday. What’s more, based on my experience in retail, I know first-hand that stores prepare for months for their busiest time of the year—hence the term holiday hire.
Contrast that with the suddenness with which things trend on Twitter. Although sites such as Groupon and Living Social are no longer Silicon Valley’s darlings, flash sales and daily deals can still quickly go viral. To understand and respond effectively and quickly to these events, organizations need to do more than capture the data, load it into a reporting data warehouse, and run SQL statements.
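To make the contrast with batch reporting concrete, here is a minimal sketch in Python of handling events as they arrive rather than loading them into a warehouse first. It keeps a sliding window of recent events and flags a spike the moment the count in that window reaches a threshold. The class and function names, the 60-second window, the threshold of three, and the sample timestamps are all my own assumptions for illustration; they do not come from any particular ESP product. The sketch also assumes events arrive in time order.

```python
from collections import deque


class SlidingWindowCounter:
    """Counts events seen within the last `window_seconds` seconds."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.timestamps = deque()

    def add(self, timestamp):
        """Record one event and return the count inside the current window."""
        self.timestamps.append(timestamp)
        # Evict events that have aged out of the window.
        cutoff = timestamp - self.window_seconds
        while self.timestamps and self.timestamps[0] <= cutoff:
            self.timestamps.popleft()
        return len(self.timestamps)


def detect_spikes(event_times, window_seconds=60, threshold=3):
    """Yield (timestamp, count) each time the windowed count hits the threshold."""
    counter = SlidingWindowCounter(window_seconds)
    for t in event_times:  # assumed to arrive in time order
        count = counter.add(t)
        if count >= threshold:
            yield t, count


# Hypothetical arrival times (in seconds) of mentions of a flash sale.
mentions = [0, 50, 200, 210, 215, 220, 400]
alerts = list(detect_spikes(mentions, window_seconds=60, threshold=3))
print(alerts)  # [(215, 3), (220, 4)]
```

The point is that each alert fires while the burst is still under way, without waiting for a nightly load. A real deployment would read from a live feed rather than a list, but the windowing idea is the same.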
Application programming interfaces (APIs) can do many amazing things in conjunction with ESP. For instance, Twitter allows developers to create exciting applications via several different types of streaming APIs. And the company is hardly alone in this regard. Many Wall Street firms rely upon ESP to conduct high-frequency trading (HFT). To provide location-based services, telecom providers also use ESP. (Of course, none of this happens without powerful data centers and networks, but that’s fodder for another post.)
Needless to say, relational databases simply weren’t conceived to handle information with this variety, volume, and velocity. Not surprisingly, I’m aware of very few organizations that attempt to handle ESP via traditional ETL. Most rely upon some type of cloud computing, whether it’s a public cloud, private cloud, or hybrid cloud.
Aside from the ability of newer technologies to handle ESP, cost is a major factor. Fortunately, Kryder’s Law is alive and well. Today, data storage has never been less expensive—and that will be true for the foreseeable future. Beyond cost, there’s the convenience factor. Imagine being on the floor of a retail organization and having to run to your desktop to see what’s going on. That’s so 1998. Thanks to mobile devices, employees can analyze massive amounts of real-time information and make better business decisions.