Saturday 1 October 2016

Sensible micro-services

Micro Servicing Properly

One of the biggest mistakes you can make is getting the granularity wrong for your micro-service. Where most developers go wrong is thinking only about the code, and not about the operational resourcing required. By this I mean technical support, ensuring up-time, scaling easily, reconfiguring without recompiling etc. Think like you're the guy who has to wake up at 3:30 am on a Sunday morning because a database is being thrashed or something has gone offline. When you start thinking like that guy, you realise smaller can (but not always) be better.



Is it too big?


Take the following simple workflow where the main objective is some data analytics with the result stored. The steps might be:
  1. Locate the objects that need to be analysed.
  2. Load the data for each object (let's assume this is external data, even if by external we mean a database, a flat file or some other such source).
  3. Analyse the data for each object.
  4. Store the result of each object's analysis.
To many developers this might seem like a simple case of:
  1. Execute a DB query to get the objects
  2. Loop over every item (maybe in parallel, using PLINQ or async/await in C#)
    1. Load the object's data
    2. Sum or average the values loaded in 2.1
    3. Store the result into the database
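As a rough illustration, the single-method version of those steps might look something like the sketch below. It is written in Python rather than C#, uses an in-memory SQLite database, and the `objects` and `results` tables and the `load_external_data` helper are all hypothetical names invented for the example.

```python
import sqlite3
from statistics import mean


def load_external_data(object_id):
    """Hypothetical stand-in for step 2.1: a call to a remote system."""
    return [object_id * 1.0, object_id * 2.0, object_id * 3.0]


def run_report(conn):
    """Locate, load, analyse and store, all inside one method."""
    # Step 1: execute a DB query to get the objects.
    object_ids = [row[0] for row in conn.execute("SELECT id FROM objects")]
    for object_id in object_ids:
        values = load_external_data(object_id)  # step 2.1
        result = mean(values)                   # step 2.2
        # Step 2.3: if this write fails, the loaded data and the
        # computed result are lost along with it.
        conn.execute(
            "INSERT INTO results (object_id, average) VALUES (?, ?)",
            (object_id, result),
        )
    conn.commit()


def demo():
    """Build a throwaway database, run the report, return the stored rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE objects (id INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE results (object_id INTEGER, average REAL)")
    conn.executemany("INSERT INTO objects (id) VALUES (?)", [(1,), (2,)])
    run_report(conn)
    return conn.execute(
        "SELECT object_id, average FROM results ORDER BY object_id"
    ).fetchall()
```

Everything the report needs, including the remote call and the database write, has to be healthy at the same moment for `run_report` to succeed.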
This seems perfectly sensible for a report that is as simple as loading data, averaging the values and saving the result, especially when the next report has a slightly different requirement and you have 15 of these reports to write. But think like that guy who's been woken up at 3:30 am by a customer complaining that the reports aren't working.

Imagine in this case it's a (relatively) simple issue of one report killing your database by locking tables and trying to write too much data. On its own that would only slow down how long the data takes to be written, except you wrote each of these reports as a single method of about 10 lines. It's not hard. It's only 10 lines of code per report. But now all your requests are failing to save, and worse still, the data sourced in step 2 comes from a remote system, like an accounting, CRM or ERP system. If the system isn't brought back up now, all the data analysis will be missing for today.

How to think better


If this had been built at a smaller, more fine-grained level, say with each step as a different service, then you start getting into some really cool possibilities.

Take our poor support person looking at a thrashed database at 3:30 am. If each step were a separate micro-service with something between each service, like a service/message bus, then the resulting data from step 3 would have been saved into a queue. When the database had recovered, all those saved messages could be processed and there would have been no loss of data. The micro-service running step 3 could also have gone offline without losing anything, as the data loaded in step 2 would also have been saved into a queue. You've now added fault tolerance to two steps in your reporting system. The poor support person could go back to bed knowing the reports would still be accurately saved at some point.
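To make the queue idea concrete, here is a minimal sketch in Python. It assumes an in-memory `deque` as a stand-in for the message bus, which is not durable the way a real one would be, and `FlakyStore`, `analyse` and `store` are hypothetical names made up for the illustration.

```python
from collections import deque
from statistics import mean


class FlakyStore:
    """Hypothetical stand-in for the results database used by step 4."""

    def __init__(self):
        self.online = True
        self.rows = {}

    def save(self, object_id, result):
        if not self.online:
            raise ConnectionError("database unavailable")
        self.rows[object_id] = result


def analyse(loaded_queue, result_queue):
    """Step 3: consume loaded data and publish results. No database access."""
    while loaded_queue:
        object_id, values = loaded_queue.popleft()
        result_queue.append((object_id, mean(values)))


def store(result_queue, db):
    """Step 4: drain the result queue, leaving messages queued on failure."""
    saved = 0
    while result_queue:
        object_id, result = result_queue[0]  # peek without removing
        try:
            db.save(object_id, result)
        except ConnectionError:
            break  # the message stays queued for a later retry
        result_queue.popleft()  # remove only after a successful write
        saved += 1
    return saved


def demo():
    """Run steps 3 and 4 through a database outage and a recovery."""
    loaded_queue = deque([(1, [1.0, 3.0]), (2, [10.0, 20.0])])
    result_queue = deque()
    db = FlakyStore()

    db.online = False                     # the 3:30 am outage
    analyse(loaded_queue, result_queue)   # step 3 carries on regardless
    saved_during_outage = store(result_queue, db)
    pending = len(result_queue)

    db.online = True                      # database recovers
    saved_after_recovery = store(result_queue, db)
    return saved_during_outage, pending, saved_after_recovery, db.rows
```

Because step 4 only removes a message after the write succeeds, an outage leaves the results waiting in the queue, and they are saved once the database is back.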

How to think global

With micro-servicing at this smaller, more finely grained level, you can start using resources effectively. Step 1 and step 2 might not require much CPU resourcing, as they spend a lot of time waiting on databases and external data, but they need fast network access. Step 3, however, might require a lot of CPU resourcing as it processes and analyses the data as quickly as possible. Step 4 again only requires network access, but this time it's internal network access to the database.
With this thinking you can now start moving some of the micro-services around. Take the micro-services running step 2: they can be moved to other parts of the world, closer to the external service they are getting data from, making them faster and cheaper. The micro-services running step 3 could be moved to somewhere with a low cost per minute of CPU time. And step 4 might actually be on premises.

When you start thinking beyond just those 10 lines of code, the world opens up and expands greatly.
