SlapOs process watchdog architecture

SlapOs process watchdog architecture explained

Architecture

Every SlapOs machine runs a "watchdog" process that monitors other processes and calls "bang" on the master (either a real slapos.org master or a local slapproxy).

The purpose of bang is to inform the master that a process died, so that the master can do whatever is needed for the next call to slapos node instance to actually start it again.

Bang is called whenever:

it is requested explicitly (for example by a promise, or by a service itself)
a process watched by the watchdog enters a state other than the one it is supposed to be in
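
The two triggers above can be sketched as a small decision function. This is only an illustrative model and not the actual SlapOs watchdog code: the names ProcessStatus and processes_to_bang, the state strings and the example process names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ProcessStatus:
    """Hypothetical record for one watched process (not a SlapOs type)."""
    name: str
    expected_state: str  # state the process is supposed to be in
    actual_state: str    # state the watchdog currently observes


def processes_to_bang(statuses, explicit_requests=()):
    """Return the names for which bang should be called.

    A name is reported when bang was requested explicitly (for example by
    a promise) or when the observed state differs from the expected one.
    """
    names = list(explicit_requests)
    for status in statuses:
        mismatch = status.actual_state != status.expected_state
        if mismatch and status.name not in names:
            names.append(status.name)
    return names


# Made-up example data: one healthy process, one that exited unexpectedly.
statuses = [
    ProcessStatus("apache", "RUNNING", "RUNNING"),
    ProcessStatus("mariadb", "RUNNING", "EXITED"),
]
print(processes_to_bang(statuses, explicit_requests=["cron"]))
```

In this sketch a healthy process never triggers bang on its own; only an explicit request or a state mismatch does.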

In theory, buildout runs all the time, repeatedly, and is supposed to have zero execution time (theoretical model). In practice that would take 100% of the CPU, so we have to find ways to call it less often.

buildout is called:

every X (this can be configured at the profile level)
if promises are not all satisfied
if requested services are not available
as the result of bang
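
The four conditions above combine with a logical "or". A minimal sketch, assuming hypothetical parameter names (none of them come from the SlapOs code base), could look like this:

```python
def should_run_buildout(seconds_since_last_run, period_seconds,
                        promises_ok, services_available, bang_received):
    """Return True if any of the four conditions listed above holds.

    All parameter names are illustrative assumptions; period_seconds
    stands for the configurable period X.
    """
    return (
        seconds_since_last_run >= period_seconds  # "every X"
        or not promises_ok                        # a promise is not satisfied
        or not services_available                 # a requested service is missing
        or bang_received                          # bang was called
    )


# Nothing happened and the period has not elapsed yet: no run needed.
print(should_run_buildout(10, 3600, True, True, False))
# A bang arrived: run buildout even though the period has not elapsed.
print(should_run_buildout(10, 3600, True, True, True))
```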

buildout is actually called by slapgrid. Slapgrid itself is called every Y (in theory Y = 0, but in reality Y is 1 minute). So, slapgrid is called:

at least every minute
right after a slapgrid call if something happened in the previous call (e.g. request of a new service, failing promise), with an increasing delay to reduce CPU load
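
The scheduling rule above can be illustrated with a small delay function. The text only says that the delay increases; the doubling policy, the 1 second starting value and the names below are assumptions made for this sketch, not the actual slapgrid behaviour.

```python
MAX_DELAY = 60.0     # slapgrid runs at least every minute
INITIAL_DELAY = 1.0  # assumed first retry delay, not stated in the text


def next_delay(consecutive_events):
    """Seconds to wait before the next slapgrid call.

    consecutive_events is the number of calls in a row, ending with the
    one that just finished, in which something happened (new service
    request, failing promise). Zero means the last call was quiet.
    """
    if consecutive_events <= 0:
        return MAX_DELAY
    # Assumed policy: double the delay each time, capped at one minute.
    return min(INITIAL_DELAY * 2 ** (consecutive_events - 1), MAX_DELAY)


print([next_delay(n) for n in range(8)])
```

With these assumed values the delay grows from 1 second up to the one minute ceiling, so a persistently failing promise ends up costing no more than the regular schedule.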

Shortcut optimisations to consider

Currently, bang has to go through the master. In the future, a shortcut that does not go through the master could be considered, but I (JP) am not yet sure that it is a good idea. It is probably simpler and cleaner to run slapproxy locally if one needs full autonomy.

For more information, please contact Jean-Paul, CEO of Nexedi (+33 629 02 44 25).