Concurrent Patterns and Best Practices
上QQ阅读APP看书,第一时间看更新

Fault tolerance

Another common approach is to build in intentional redundancy to provide fault tolerance; for example, big data processing systems, such as Cassandra, Kafka, and ZooKeeper, can't afford to go down completelyThe following diagram shows how concurrently replicating the input stream protects against any one slave node going down. This pattern is commonly used by Apache Kafka, Apache Cassandra, and many other systems:

The right side of the diagram shows redundant machines on which a data stream is replicated 

In case any one node goes down (hardware failure), other redundant nodes take its place, thus ensuring that the system as a whole is never down.