Today’s information society abounds in information flows: computer-based human collaboration, software agent interactions, electronic business transactions, and an explosion of data on the Internet. Understanding what is happening in these environments is becoming increasingly difficult. We need better ways to make sense of this wealth of data, to improve the quality and availability of information, and to ensure effective responses. Traditional storage and data analysis technologies are not adapted to this exponential growth in data volumes and event rates.
In addition, the value of information may decay rapidly over time. For instance, events that could help anticipate a production outage have no value once the outage has happened. Data must be processed as soon as events occur, within latency constraints. We need to move away from traditional client-server (query-response) interaction models towards a more asynchronous, event-oriented, loosely coupled push model, in which applications can make decisions based on event data.
Complex Event Processing (CEP) is a set of technologies that allows exploring temporal, causal, and semantic relationships among events to make sense of them in a timely fashion.
This article is the first of a series exploring these technologies, their capabilities, and their possible applications.
Use cases that may benefit from CEP are varied, but we can identify some patterns in them, showing the decisive contribution of this technology.
Event processing can be used in plants to detect anomalies or to determine whether significant changes require production to be re-planned. Plant floor systems push events from numerous sensors to a centralized control system, which explores event patterns and emits new, aggregated, richer events to support decisions.
Patterns: Active diagnostics of problems, Real-time operational decision
RFID tags, mobile phones, and Wi-Fi enabled devices feed information about their spatial location into server-side systems. Applications include tracking goods in the supply chain, or pushing information to customers based on the location of their mobile phone.
Patterns: Information dissemination, Observation systems
The heterogeneity of information sources and event rates imposes an event processing approach on modern financial IT systems, in which quasi real-time market analytics can hardly be implemented with conventional client-server architectures.
Patterns: Information dissemination, Real-time operational decision
Near real-time data coming from telecom subsystems could be analyzed together with business data from IT systems, or with historical data. With the use of predictive models, fraud detection can be improved.
Patterns: Predictive processing, Real-time operational decision
Clickstream analysis helps optimize the user experience on commercial web sites, adapt advertising, or drive page layout. This requires low-latency decisions, with immediate pattern recognition.
Patterns: Real-time operational decision
The utility sector requires an efficient infrastructure for managing electric grids and other utilities. This requires immediate response to variations in consumption, using events coming from numerous data sources, aggregated along the grid.
Patterns: Real-time operational decision, Active diagnostics, Information dissemination
“It’s all about time!”
The word “complex” in CEP refers mainly to the complexity of state management over time while processing the events. Typical examples are:
Most CEP implementations also provide advanced pattern detection, for example based on a non-deterministic finite state automaton. This is similar to a regular expression search over a flow of events, except that time is taken into account in the search.
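To make the idea of time-aware pattern detection more concrete, here is a minimal Python sketch of one very simple pattern: "an event of one kind followed by an event of another kind within a given delay". It is not taken from any CEP product. The `Event` class, the `followed_by` function, and the event kinds "A", "B", and "C" are hypothetical names chosen for illustration, and the sketch assumes events arrive in timestamp order.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class Event:
    kind: str         # hypothetical event type, e.g. "A"
    timestamp: float  # event time in seconds


def followed_by(events: Iterable[Event], first: str, second: str,
                within: float) -> Iterator[Tuple[Event, Event]]:
    """Yield (a, b) pairs where `b` (kind `second`) occurs at most
    `within` seconds after `a` (kind `first`).

    Assumes events arrive in timestamp order.
    """
    pending: List[Event] = []  # partial matches: `first` events still waiting
    for event in events:
        # Time drives the state: drop partial matches that are too old.
        pending = [a for a in pending
                   if event.timestamp - a.timestamp <= within]
        if event.kind == second:
            for a in pending:
                yield (a, event)
            pending = []  # partial matches are consumed by this `second` event
        if event.kind == first:
            pending.append(event)


stream = [Event("A", 0.0), Event("C", 1.0), Event("B", 2.0),
          Event("A", 10.0), Event("B", 20.0)]
matches = list(followed_by(stream, "A", "B", within=5.0))
# Only the first A/B pair matches; the second B arrives 10 s after its A.
```

The `pending` list plays the role of the automaton's active states. A real engine would add features such as partitioning by key, negation, and longer sequences.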
Another key influence of time is timeliness. Timeliness is the ability to handle events and produce output within a constrained time. It can be seen as end-to-end latency, and can reach the millisecond scale with CEP, or below (see [perf 1, 3]). CEP tools also make it possible to trade off guaranteed response time against correctness of output (e.g. whether or not to wait for late or out-of-order events).
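The trade-off between latency and correctness can be illustrated with a small reordering buffer. The Python sketch below is an illustration under stated assumptions, not the mechanism of any particular product. The `reorder` function, the `max_delay` parameter, and the `late` list are hypothetical names. Events are held back for at most `max_delay` seconds of event time so that stragglers can be put back in order, and anything arriving later than that is set aside rather than emitted.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

TimedEvent = Tuple[float, str]  # (event timestamp in seconds, payload)


def reorder(events: Iterable[TimedEvent], max_delay: float,
            late: List[TimedEvent]) -> Iterator[TimedEvent]:
    """Re-emit events in timestamp order, waiting at most `max_delay`
    seconds (measured in event time) for out-of-order arrivals.

    Events that arrive after their slot was released are appended to
    `late` instead of being emitted.
    """
    buffer: List[TimedEvent] = []  # min-heap ordered by timestamp
    watermark = float("-inf")      # everything at or below this is released
    for event in events:
        ts = event[0]
        if ts <= watermark:
            late.append(event)     # too late: latency wins over completeness
            continue
        heapq.heappush(buffer, event)
        # Anything older than (newest timestamp - max_delay) is now final.
        watermark = max(watermark, ts - max_delay)
        while buffer and buffer[0][0] <= watermark:
            yield heapq.heappop(buffer)
    # End of stream: flush the remaining events, still in order.
    while buffer:
        yield heapq.heappop(buffer)


arrivals = [(1.0, "a"), (3.0, "c"), (2.0, "b"), (6.0, "d"), (2.5, "x")]
late_events: List[TimedEvent] = []
ordered = list(reorder(arrivals, max_delay=2.0, late=late_events))
```

A larger `max_delay` lets more out-of-order events be placed correctly, at the cost of holding every output back for longer. A smaller one lowers latency but sends more events to the `late` list.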
Then come event volumes and rates: the throughput of CEP tools can exceed 10,000 and even reach 100,000 events/s [perf 1, 2].
Other complexity factors can also motivate a move towards CEP technologies:
This table sums up the areas in which event processing fits particularly well (from [Chandy et al 2011]):
Event rates | Application complexity (time, state, context) | Timeliness |
High | High | High |
High | High | Low |
High | Low | High |
Low | High | High |
In other cases, more traditional messaging systems and/or transactional systems may be better suited than CEP.
CEP Market - March 2011
Vendors take different approaches and follow different paradigms in their event processing products. We can identify the following paradigms [Helmer et al 2011]:
Paradigm | Possible applications |
Event stream oriented and query based: this can be seen as a continuous query running on an infinite flow of data | Well suited for aggregation of event data, with SQL-like join logic (between events within the flow or with an external DB) |
ECA (event/condition/action) rule based: this approach has its roots in the active database paradigm (e.g. database triggers) | Well suited to scenarios where business users should be able to define event patterns by composing simple rules |
Inference rule based, with similarities to what can be seen in BRMS | Well suited when actions have to be taken when certain states are reached, or in a business activity monitoring context with real-time decisions |
Time-state machine based | Well suited to monitoring situations with a well-defined, finite state space |
One can see strong similarities with more traditional technologies, ranging from BRMS to versatile messaging systems (JMS) and EAI.
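As an illustration of the first paradigm in the table (a continuous query over an unbounded flow of data), here is a minimal Python sketch of a sliding-window average. It does not use the query language of any CEP product. The `sliding_average` function and the `(timestamp, value)` reading format are assumptions made for the example, and readings are assumed to arrive in timestamp order with a strictly positive window.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, Tuple

Reading = Tuple[float, float]  # (timestamp in seconds, value)


def sliding_average(readings: Iterable[Reading],
                    window: float) -> Iterator[Tuple[float, float]]:
    """For each incoming reading, yield (timestamp, average of the values
    received during the last `window` seconds).

    Assumes timestamp order and window > 0.
    """
    recent: Deque[Reading] = deque()  # readings currently inside the window
    total = 0.0                       # running sum of the values in `recent`
    for ts, value in readings:
        recent.append((ts, value))
        total += value
        # Evict readings that have slid out of the window (ts - window, ts].
        while recent[0][0] <= ts - window:
            _, old_value = recent.popleft()
            total -= old_value
        yield ts, total / len(recent)


readings = [(0.0, 10.0), (1.0, 20.0), (2.0, 30.0), (5.0, 40.0)]
averages = list(sliding_average(readings, window=3.0))
```

Unlike a classic database query, which runs once over stored data, this generator never finishes on its own: it produces a fresh result each time an event arrives, and only keeps the state needed for the current window.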
Besides its ability to handle very high rates of incoming events, CEP above all brings a coherent set of specific features.
Here is a list that may help you refine your needs around CEP:
Functional capabilities:
Non-functional capabilities:
CEP is more an approach than a single technology: there are several types of implementations available, and an even greater number of products on the market. Given the increasing importance of real-time information processing, choosing the best solution for your needs is not an easy task. For this purpose, the next articles in this series will explore several CEP products in detail and present their key features.
[Chandy et al 2011] The event processing manifesto, 2011. Authors: Mani K. Chandy, Opher Etzion, Rainer von Ammon
[Grabs et al 2009] Introducing Microsoft StreamInsight, 2009. Authors: Torsten Grabs, Roman Schindlauer, Ramkumar Krishnan, Jonathan Goldstein
[Helmer et al 2011] Reasoning in Event-Based Distributed Systems, 2011. Authors: Sven Helmer, Alexandra Poulovassilis, Fatos Xhafa
[perf 1] Sybase Aleri performance http://m.sybase.com/files/Data_Sheets/SybaseAleri_CEPPlatform_PerfTesting_ds.pdf
[perf 2] Esper performance http://esper.codehaus.org/esper/performance/performance.html
[perf 3] StreamBase performance at QCon 2011 http://qconlondon.com/dl/qcon-london-2011/slides/RichardTibbetts_ComplexEventProcessingDSLForHighFrequencyTrading.pdf