Eventual consistency
- Wikipedia
In the previous section, you would have clearly got the notion behind the guarantees, namely consistency versus availability.
Its expected that data in a distributed system will become eventually consistent but always available. For a Data Lake, the availability guarantee is more important than consistency, but it purely depends on the use case. We cannot show/use bad data to the end user as this would have a huge impact on the company's brand. So, use cases need to be carefully validated you take this as a generic principle for your Data Lake.
In the context of Data Lakes (Lambda Layer), if the speed layer becomes corrupted (data loss or data corruption), eventually the batch and serving layer would correct these mistakes and have the queries served fine through the serving layer. The speed layer in the pattern just keeps the data for a period of time so that the batch and serving layers can process the data from scratch and ensure consistency; once done, the speed layer views get discarded.