Durable Execution

I'm surprised I hadn't heard of it before, but I recently came across the concept of “durable execution”. I’ll let a formal definition elaborate:

Durable Execution is the practice of making code execution persistent, so that services recover automatically from crashes and restore the results of already completed operations and code blocks without re-executing them.¹

Think of a background job that not only gets retries and scheduling for free, but also resumes a failed run from the point where it failed instead of re-running it in its entirety. The idea is to allow for simpler code, run by an engine designed to ensure it completes. Failure handling and scheduling are abstracted away. There are further benefits, such as detailed logging. It’s great to see implementations of these ideas formalized, working, and gaining popularity.
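To make the "resume from where it failed" part concrete, here is a minimal sketch in Python. It is not how any particular engine works; `DurableRun`, `step`, the JSON journal file, and the two example steps are all hypothetical names I made up for illustration. The only idea it shows is that each completed step's result is saved, so a retry skips steps that already finished.

```python
import json
import os
import tempfile


class DurableRun:
    """Hypothetical helper: records each completed step's result in a JSON journal."""

    def __init__(self, journal_path):
        self.journal_path = journal_path
        self.journal = {}
        # On restart, reload the results of steps that completed earlier.
        if os.path.exists(journal_path):
            with open(journal_path) as f:
                self.journal = json.load(f)

    def step(self, name, fn):
        # A step that already completed is not re-executed; its saved result is returned.
        if name in self.journal:
            return self.journal[name]
        result = fn()
        self.journal[name] = result
        # Persist before moving on, so a later failure does not lose this result.
        with open(self.journal_path, "w") as f:
            json.dump(self.journal, f)
        return result


# Counters so we can see how often each step really ran.
calls = {"charge": 0, "email": 0}


def charge_card():
    calls["charge"] += 1
    return "charge-ok"


def send_email():
    calls["email"] += 1
    if calls["email"] == 1:
        raise RuntimeError("simulated crash")  # fails on the first attempt only
    return "email-sent"


def workflow(journal_path):
    run = DurableRun(journal_path)
    charge = run.step("charge_card", charge_card)
    email = run.step("send_email", send_email)
    return [charge, email]


def run_with_retries(journal_path, attempts=3):
    # Stand-in for the engine: re-run the whole workflow function after a failure.
    for _ in range(attempts):
        try:
            return workflow(journal_path)
        except RuntimeError:
            continue
    raise RuntimeError("workflow did not complete")


journal_path = os.path.join(tempfile.mkdtemp(), "journal.json")
final = run_with_retries(journal_path)
```

The workflow function runs twice here, but `charge_card` only executes once: the second attempt reads its result from the journal and picks up at `send_email`. A real engine would presumably use a database rather than a local file, and would handle scheduling, timeouts, and concurrency, none of which this sketch attempts.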

Some links

Durable Execution: This Changes Everything, with Tom Wheeler
ChronoForge: A robust framework for building durable, distributed workflows in Ruby on Rails applications

¹ What is Durable Execution or Workflows-as-Code?