The Benefits of Idempotence

This is supposed to be an IT philosophy blog, but I’ll argue that Idempotence should be a coding philosophy.

Loosely, something that is idempotent (from the Latin idem, “the same”, and potent, “having power or effect”) has an effect only the first time it is executed; every subsequent execution leaves exactly the same result.

Some say that dividing or multiplying by 1 is an idempotent operation.  In a coding sense I disagree somewhat.  Strictly speaking it qualifies, but only trivially: dividing or multiplying by 1 NEVER changes the result, not even the first time.  So multiplying by zero is a more useful example of idempotence.  It changes the result the first time it is executed (for any non-zero input), but thereafter has no further effect however many times it is executed, because after the first time you are multiplying zero by zero and the result of course remains zero.
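The distinction can be checked mechanically.  The sketch below is only an illustration (the function names and the sample values are my own assumptions, not from any library): an operation f is idempotent if applying it twice gives the same result as applying it once.

```python
def times_zero(x):
    return x * 0


def times_one(x):
    return x * 1


def add_one(x):
    return x + 1


def is_idempotent_on(f, samples):
    """True if applying f twice equals applying it once for every sample."""
    return all(f(f(x)) == f(x) for x in samples)


def ever_changes(f, samples):
    """True if f alters at least one of the samples on its first application."""
    return any(f(x) != x for x in samples)


samples = [-3, 0, 1, 7, 42]  # arbitrary illustrative inputs
```

On these samples both times_zero and times_one pass the idempotence check and add_one fails it, but only times_zero also "has an effect" the first time, which is the property that matters for the rest of this post.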

In software this translates into writing code that knows when it has already been “effective” and that, if executed a second time, has no further effect.

An example of this is in batch processing.  A batch program written in an object-oriented way could simply call the same routine once for each item in the batch.  If this routine (let’s call it the batch-item-processor) is written according to idempotent principles, then calling it twice for the same batch-item has no harmful effect, i.e. that item will not be processed a second time.

If you have ever had to troubleshoot batch faults you will immediately recognise that this has many benefits.  As the troubleshooter, you do not have to worry about which batch entries have already been processed and which ones have not.  Once you have found the faulty batch-item and either corrected or eliminated it, you simply resubmit the whole batch.  The batch-item-processor will not allow any batch-items to be processed twice.
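Here is a minimal sketch of that resubmit-the-whole-batch workflow.  Everything in it is hypothetical: the class name, the in-memory set standing in for the record of processed items, and the convention that an amount of None marks a faulty item.  A real system would keep that record somewhere durable.

```python
class BatchItemProcessor:
    """Processes each batch item at most once."""

    def __init__(self):
        self.processed_ids = set()  # assumption: in-memory record of completed items
        self.ledger = []            # stands in for the "effect" of processing

    def process(self, item_id, amount):
        if item_id in self.processed_ids:
            return False            # already effective: do nothing
        if amount is None:          # assumption: None marks a faulty item
            raise ValueError(f"faulty batch item {item_id}")
        self.ledger.append((item_id, amount))
        self.processed_ids.add(item_id)
        return True


def run_batch(processor, batch):
    """Submit every item; return how many were actually processed this run."""
    return sum(processor.process(item_id, amount) for item_id, amount in batch)


processor = BatchItemProcessor()
batch = [("A", 10), ("B", None), ("C", 30)]

try:
    run_batch(processor, batch)      # stops at the faulty item "B"
except ValueError:
    pass

batch[1] = ("B", 20)                 # correct the faulty item
newly_processed = run_batch(processor, batch)  # resubmit the WHOLE batch
```

On the second run item "A" is skipped and only "B" and "C" are processed, so the ledger ends up with each item exactly once.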

You may be tempted to code an override switch so that you can force a second-potency of the batch-item-processor for a specific transaction.  However, it would be more effective to reverse the effects that the “idempotency-detector” in the batch-item-processor uses to detect whether a transaction has been processed previously.  This rolls back to the state before that transaction was processed, so a new call to the batch-item-processor correctly processes it for the first time “again”.

When writing an idempotent batch processor it would therefore be prudent to write a reversal processor at the same time to reverse the effects of the transaction as discussed above.  In fact I would write this in a “-1”-potent way 🙂   This means it will not try to reverse transactions that have not actually been performed yet.  So if you want to do a batch reversal you can simply submit the whole batch to the reversal processor and only transactions that have actually been processed will be reversed.  It’s the same principle in reverse.
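As a rough sketch of a processor and its reversal side by side (the Ledger class, its method names and the in-memory dictionary are all assumptions made for illustration), the same record that prevents double processing also tells the reversal routine what there is to undo:

```python
class Ledger:
    """Idempotent processing plus a matching idempotent reversal."""

    def __init__(self):
        self.balance = 0
        self.applied = {}  # item_id -> amount; doubles as the idempotency-detector

    def process(self, item_id, amount):
        if item_id in self.applied:
            return False                 # already processed: no further effect
        self.balance += amount
        self.applied[item_id] = amount
        return True

    def reverse(self, item_id):
        if item_id not in self.applied:
            return False                 # never processed: nothing to reverse
        self.balance -= self.applied.pop(item_id)
        return True


ledger = Ledger()
batch = [("A", 10), ("B", 20), ("C", 30)]

ledger.process("A", 10)
ledger.process("B", 20)                  # "C" was never processed

# Submit the whole batch to the reversal routine.
reversed_ids = [item_id for item_id, _ in batch if ledger.reverse(item_id)]
```

Only "A" and "B" are reversed, and because reversing also clears the detector's record, processing "A" afterwards succeeds as if for the first time.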

The key here is that you give sufficient intelligence to the batch-item-processor and reversal routine to know when it is safe for them to execute.  This is a higher principle than idempotence.  Appropriate-potence: a code-object knows when it can/should process and will not process until those conditions are met.  For example, if you call a step in a process which requires that previous steps have already completed, it will simply respond that it cannot run yet because these upstream steps have not yet completed.  Depending on the process, you may then put it into a polling loop that checks for that state to occur.  Every subsequent call simply responds that the step is already waiting to run, and the new instance terminates itself while the original continues to wait.
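A minimal sketch of such a step follows, assuming a hypothetical Step class with made-up status strings.  The polling loop is left out to keep it short; the step just reports whether it ran, had already run, or cannot run yet.

```python
class Step:
    """A step that refuses to run until all of its upstream steps are done."""

    def __init__(self, name, upstream=()):
        self.name = name
        self.upstream = list(upstream)
        self.done = False

    def can_run(self):
        return all(step.done for step in self.upstream)

    def run(self):
        if self.done:
            return "already ran"         # idempotent: no second effect
        if not self.can_run():
            return "cannot run yet"      # upstream steps have not completed
        self.done = True                 # the real work would happen here
        return "ran"


extract = Step("extract")
load = Step("load", upstream=[extract])

first_attempt = load.run()    # "cannot run yet": extract has not completed
extract.run()
second_attempt = load.run()   # "ran"
third_attempt = load.run()    # "already ran"
```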

NOW you can start building an intelligent network of objects that together form an optimised just-in-time no-lag process in which every step performs at the moment it needs to.  As soon as a required state is reached the dependent process will begin its tasks.  This means that to initiate a process you simply call the LAST step in the process, and as each step checks the state of the steps before it, all the way up the stack, the whole process will have been initiated.  If you create a monitor agent for the state required by the FIRST step in the process, an event can be generated that triggers this LAST step, and hence the entire end-to-end process is automated.
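One possible reading of “call the LAST step” is a synchronous pull, sketched below.  This is a simplification under my own assumptions: the Step class and its ensure method are hypothetical, the steps run in a single thread rather than waiting on events, and the dependencies are assumed to contain no cycles.

```python
class Step:
    """Each step knows only its own direct upstream dependencies."""

    def __init__(self, name, upstream=()):
        self.name = name
        self.upstream = list(upstream)
        self.done = False

    def ensure(self, log):
        """Run this step once, first pulling any upstream steps that are not done."""
        if self.done:
            return                       # idempotent: nothing more to do
        for step in self.upstream:
            step.ensure(log)             # assumption: no dependency cycles
        log.append(self.name)            # the real work would happen here
        self.done = True


extract = Step("extract")
validate = Step("validate", upstream=[extract])
transform = Step("transform", upstream=[extract])
load = Step("load", upstream=[validate, transform])

log = []
load.ensure(log)   # calling only the LAST step initiates the whole chain
```

After the call, log holds extract, validate, transform, load in that order.  The extract step runs only once even though two steps depend on it, and the execution order was never written down anywhere: it falls out of each step's own dependency list.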

Another side effect of giving each step the intelligence to know when it can run is that you manage dependencies at each node.  You do not have to create a complex dependency matrix; you simply worry about the dependencies of each object and let the network of objects work out the matrix at runtime.

I’m sure there are many examples of this having been done before.  Leave a comment if you have tried this, or implemented a cool example of this approach – or of course if you simply want to make a comment.
