Real-time streaming data sources and the Internet of Things have brought Complex Event Processing into the spotlight. The ability to collect data from devices using sensors, improvements in data carrier services, and the growth of secure transfer to a centralized location have given a kick-start to analyzing data patterns across combinations of devices.
Let us start by defining what an event is. An event is said to occur when something happens that needs to be known in order to infer something or take some action. Event processing is a way to track information about events by processing data streams and drawing a circumstantial conclusion from them. It is associated with events from a single source. For example, the temperature of a room rising above 45 °C is an event that prompts me to lower the temperature setting of my air conditioner. We can think of event processing as collecting data from this temperature sensor.
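To make the single-source case concrete, here is a minimal sketch in Python of the room temperature example. The `Reading` type, the `detect_high_temperature` function, and the sample values are illustrative assumptions, not part of any CEP tool; only the 45 °C threshold comes from the example above.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Reading:
    """One measurement from a single temperature sensor (hypothetical type)."""
    sensor_id: str
    timestamp: float      # seconds since the start of the stream
    temperature_c: float  # degrees Celsius


THRESHOLD_C = 45.0  # threshold taken from the room example in the text


def detect_high_temperature(
    readings: Iterable[Reading], threshold_c: float = THRESHOLD_C
) -> Iterator[Reading]:
    """Yield every reading whose temperature exceeds the threshold."""
    for reading in readings:
        if reading.temperature_c > threshold_c:
            yield reading


# Made-up sample stream from one sensor.
stream = [
    Reading("room-1", 0.0, 41.5),
    Reading("room-1", 60.0, 46.2),
    Reading("room-1", 120.0, 44.9),
]

events = list(detect_high_temperature(stream))
for event in events:
    print(f"{event.sensor_id}: {event.temperature_c} C at t={event.timestamp}s")
```

Because the function is a generator, it handles readings one at a time, which is roughly how a stream would be consumed in practice.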
Extending our example, when we start monitoring multiple things such as temperature, pressure, and humidity using different sensors and want to derive a conclusion from all of them analyzed together, this is known as Complex Event Processing (CEP). So, we can define CEP as event processing that combines data from multiple sources to understand events or patterns in circumstances too complex to infer by analyzing each source separately. Typically, complex events are aggregations or derivations of simple events. A CEP engine usually acts as a data fusion engine that identifies meaningful patterns, relationships, and data abstractions among seemingly unrelated events and triggers actions. However, CEP does not only mean combining multiple data sets for analysis. It can be used for a variety of tasks depending on the problem context. Outlined below is a comprehensive list of design patterns for complex event processing, adapted from Srinath Perera’s research.
|Extracting useful information from the data streams by filtering events and by splitting or combining attributes
|Alerts and Thresholds
|Detecting certain events based on predefined conditions and alerting the users. For example, alerting a user when the temperature of the furnace crosses 450 °C.
|Time Window Operations
|Carrying out certain operations for each window in a time period. For example, calculating the average temperature every 5 minutes.
|Joining Event Streams
|Combining data from multiple sensors to infer an event. This is also known as sensor fusion. For example, combining data from two sensors in airplanes to detect the proximity of the planes and any possible collisions
|Establishing a correlation between different streams, which can reveal missing events or erroneous data. For example, detecting when a product has not been shipped for delivery after it has been packed in the warehouse
|Historical Data Interaction
|Used to interact with historical data stored in databases. For example, looking up the sensor parameters for the sensor ID that arrives in the data stream
|Detecting Temporal Event Sequence Patterns
|Used to recreate a process from a sequence of events. For example, recreating the flow of each product's order until it is delivered to the customer. This is useful for mining business processes to detect any deviations from the defined workflow
|Tracking certain objects over space and time to detect certain conditions. For example, tracking ships to make sure that they adhere to routes and geofences.
|Detecting patterns in time series data. For example, a rise or fall in the temperature of a furnace
|Serves to deal with Big Data’s volume and velocity simultaneously within a single architecture. You can read more about this in the article Lambda Architecture for Big Data
|Detecting and Switching to Detailed Analysis
|Same as the Lambda architecture, but it doesn’t analyze all the data in real time. Instead, when an anomaly is detected, it pulls out the historical records and analyzes that particular case. For example, in credit card fraud detection, we can use basic rules to detect possible fraud, then pull out all transactions made on that credit card for a detailed analysis
|Using a Machine Learning Model
|Train a machine learning model and then use it within the real-time pipeline to make decisions. For example, you can build a model in R, export it to PMML, and use it within your real-time pipeline.
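Three of the patterns above (time window operations, joining event streams, and detecting missing events) can be sketched in plain Python. This is only an illustration under simplifying assumptions: events arrive as in-memory lists of tuples, timestamps are in seconds, and all function names and sample values are hypothetical. A real CEP engine would process unbounded streams incrementally.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Sample = Tuple[float, float]          # (timestamp_seconds, value)
OrderEvent = Tuple[float, str, str]   # (timestamp_seconds, order_id, "packed" | "shipped")


def tumbling_window_average(samples: List[Sample],
                            window_seconds: float = 300.0) -> Dict[int, float]:
    """Time window operation: average value per fixed, non-overlapping window."""
    buckets: Dict[int, List[float]] = defaultdict(list)
    for ts, value in samples:
        buckets[int(ts // window_seconds)].append(value)
    return {w: sum(vals) / len(vals) for w, vals in sorted(buckets.items())}


def join_by_window(left: List[Sample], right: List[Sample],
                   window_seconds: float = 60.0) -> List[Tuple[int, float, float]]:
    """Joining event streams: pair values from two streams that share a window."""
    right_by_window: Dict[int, List[float]] = defaultdict(list)
    for ts, value in right:
        right_by_window[int(ts // window_seconds)].append(value)
    joined = []
    for ts, value in left:
        window = int(ts // window_seconds)
        for other in right_by_window.get(window, []):
            joined.append((window, value, other))
    return joined


def find_unshipped(events: List[OrderEvent], timeout_seconds: float,
                   now: float) -> List[str]:
    """Missing events: orders packed more than timeout_seconds ago, never shipped."""
    packed_at: Dict[str, float] = {}
    shipped = set()
    for ts, order_id, kind in events:
        if kind == "packed":
            packed_at[order_id] = ts
        elif kind == "shipped":
            shipped.add(order_id)
    return sorted(order_id for order_id, ts in packed_at.items()
                  if order_id not in shipped and now - ts > timeout_seconds)


# Made-up sample data for illustration only.
furnace_temps = [(0, 440.0), (120, 450.0), (299, 460.0), (300, 470.0), (400, 480.0)]
averages = tumbling_window_average(furnace_temps)            # 5-minute windows

temperature = [(10, 450.0), (70, 455.0)]
pressure = [(20, 1.2), (130, 1.4)]
fused = join_by_window(temperature, pressure)                # 1-minute windows

orders = [(0, "A", "packed"), (10, "B", "packed"), (50, "A", "shipped")]
late = find_unshipped(orders, timeout_seconds=3600, now=4000)

print(averages, fused, late)
```

Bucketing by `timestamp // window_seconds` gives tumbling (non-overlapping) windows. Sliding windows or event-time watermarks, which stream processing tools typically provide, would need more bookkeeping than this sketch shows.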
There are many CEP tools, both open source and commercial, that can be used to implement these design patterns. Some of them are listed below for you to assess and adopt according to your requirements.
In the next article, we will demonstrate some of the CEP design patterns using Apache Spark.