Learning Apache Apex
上QQ阅读APP看书,第一时间看更新

Performance

Even with big data scale out architectures on commodity hardware, efficiency matters. Better efficiency of the platform lowers cost. If the architecture can handle a given workload with a fraction of the hardware, it will result in reduced Total Cost of Ownership (TCO). Apex provides several advanced mechanisms to optimize efficiency, such as stream locality and parallel partitioning, which will be covered in Chapter 4Scalability, Low Latency, and Performance.

Apex is capable of very low latency processing (< 10 ms), and is well suited for use cases such as the real-time threat detection as discussed earlier. Apex can be used to deliver latency processing Service Level Agreement (SLA) in conjunction with speculative execution (processing the same event multiple times in parallel to prevent delay) due to a unique feature: the ability to recover a path or subset of operators without resetting the entire DAG.

Only a fraction of real-time use cases may have such low latency and SLA requirements. However, it is generally desirable to avoid unnecessary trade-offs. If a platform can deliver high throughput (millions of events per second) with low latency and everything else is equal, why not choose such a platform over one that forces a throughput/latency trade-off? Various benchmarking studies have shown Apex to be highly performant in providing high throughput while maintaining very low latency.