Streaming Analytics (or Fast Data processing) is becoming an increasingly popular subject in financial services, marketing, the internet of things and healthcare. Organizations want to respond in real time to events such as clickstreams, transactions, logs and sensor data. A typical streaming analytics solution follows a ‘pipes and filters’ pattern that consists of three main steps: detecting patterns in raw event data (Complex Event Processing), evaluating the outcomes with the aid of business rules and machine learning algorithms, and deciding on the next action. At the core of this architecture is the execution of predictive models that operate on enormous volumes of never-ending data streams.
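To make the ‘pipes and filters’ pattern concrete, here is a minimal Python sketch of the three steps composed as generator-based filters over a (conceptually unbounded) stream; the Event fields and the toy scoring rule are invented for illustration, standing in for a real CEP engine, rules engine or model:

```python
from dataclasses import dataclass
from typing import Iterable, Iterator

@dataclass
class Event:
    user_id: str
    kind: str      # e.g. "click", "transaction"
    amount: float

def detect_patterns(events: Iterable[Event]) -> Iterator[Event]:
    """Filter 1 (Complex Event Processing): keep only events of interest."""
    for event in events:
        if event.kind == "transaction":
            yield event

def evaluate(events: Iterable[Event]) -> Iterator[tuple[Event, float]]:
    """Filter 2: score each event; a stand-in for business rules or an ML model."""
    for event in events:
        score = 1.0 if event.amount > 1000 else 0.1  # toy rule
        yield event, score

def decide(scored: Iterable[tuple[Event, float]]) -> Iterator[str]:
    """Filter 3: turn a score into the next action."""
    for event, score in scored:
        yield f"ALERT {event.user_id}" if score > 0.5 else f"OK {event.user_id}"

# Pipes: compose the filters; in production the list would be an unbounded stream.
stream = [Event("u1", "click", 0.0), Event("u2", "transaction", 2500.0)]
for action in decide(evaluate(detect_patterns(stream))):
    print(action)
```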
But with opportunity comes complexity. When you switch from batch to streaming, time-related aspects of the data suddenly become important. Do you want to preserve the order of events, and guarantee that each event is processed exactly once? In this talk, I will present an architecture for streaming analytics solutions that covers many use cases, such as actionable insights in retail, fraud detection in finance, log parsing, traffic analysis, factory data, the IoT, and others. I will walk through a few architectural challenges that arise when dealing with streaming data, such as latency, event time versus server time, and exactly-once processing. Finally, I will discuss some technology options as possible implementations of the architecture.
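To illustrate the event-time versus server-time distinction, here is a small, self-contained Python sketch (the timestamps and the 5-second window size are invented): assigning a late-arriving event to a tumbling window by the time it occurred places it correctly, while grouping by the time the server received it does not.

```python
from collections import defaultdict

# Each event carries its own timestamp (event time), which can differ
# from the time the server receives it (processing / server time).
events = [
    # (event_time_seconds, arrival_time_seconds, value)
    (10, 11, "a"),
    (12, 13, "b"),
    (11, 20, "c"),  # late arrival: occurred at t=11, received at t=20
]

WINDOW = 5  # tumbling window of 5 seconds

def window_start(ts: int) -> int:
    """Map a timestamp to the start of its tumbling window."""
    return ts // WINDOW * WINDOW

by_event_time = defaultdict(list)
by_server_time = defaultdict(list)
for event_ts, arrival_ts, value in events:
    by_event_time[window_start(event_ts)].append(value)
    by_server_time[window_start(arrival_ts)].append(value)

print(dict(by_event_time))   # {10: ['a', 'b', 'c']} -- late event 'c' lands where it belongs
print(dict(by_server_time))  # {10: ['a', 'b'], 20: ['c']} -- 'c' is misplaced
```

Real stream processors handle the harder part this sketch leaves out: deciding how long to wait for stragglers before closing a window, typically via watermarks.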