Description
Currently clusterf relies on clusterf-docker to mark Docker containers as down and remove them from etcd as a crude form of backend health checking, which should be roughly sufficient to handle crashing server processes.
Using clusterf in any kind of serious production environment would require support for healthchecks. The ideal approach would be for each clusterf-ipvs node to perform its own healthchecks of its backends, and to mark unhealthy backends with an IPVS weight of zero. However, writing our own healthchecks would be a bit of a pain. Perhaps there is some existing golang package that we could use?
An alternative would be to use a more dedicated service discovery solution with built-in health checks, such as consul. However, in the face of network partitions, the backend health seen by consul is not necessarily the same as that seen by the IPVS nodes. Sadly, we can't really do any inline healthchecks on active backend connections, as the kernel IPVS does not export any stats on failed connections, e.g. TCP resets or timeouts.