From 2c1a2ed2258e6139a12aac7f7daddafd6b1a84bb Mon Sep 17 00:00:00 2001
From: Dobroslaw Zybort
Date: Mon, 23 Jul 2018 14:03:34 +0200
Subject: [PATCH] Make health checks more frequent in Docker

The 5 min interval was copied from the official docs. Real-world
examples of health checks mostly use an interval of around 5 seconds.

By default Docker marks a service as unhealthy after 3 consecutive
failed checks. With the previous timing a service would be taken out
of the pool (by e.g. Docker Swarm) only after 15 min, and for all that
time it would return errors for any communication with it. Now it will
be removed from the pool of running services after 15 seconds.

Regarding the timeout, most examples use something shorter. For all the
services we use, if anything takes longer than 2 seconds to respond
then something is wrong with the service. Monasca is not a web service
but a back-end service that should have high throughput.

Change-Id: I4486c4974de38dea33739fdc470f38fd99d428fa
---
 docker/Dockerfile | 2 +-
 docker/README.rst | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/docker/Dockerfile b/docker/Dockerfile
index fa001452..e729be14 100644
--- a/docker/Dockerfile
+++ b/docker/Dockerfile
@@ -129,7 +129,7 @@ ONBUILD RUN \
       -o \( -type f -a \( -name '*.pyc' -o -name '*.pyo' \) \) \
     \) -exec rm -rf '{}' +
 
-ONBUILD HEALTHCHECK --interval=5m --timeout=3s \
+ONBUILD HEALTHCHECK --interval=5s --timeout=2s \
   CMD python3 health_check.py || exit 1
 
 ENTRYPOINT ["/sbin/tini", "-s", "--"]
diff --git a/docker/README.rst b/docker/README.rst
index f2faae7c..59ca8362 100644
--- a/docker/README.rst
+++ b/docker/README.rst
@@ -33,7 +33,8 @@ start.sh
 
 health_check.py
   This file will be used for checking the status of the application running in
-  the container. It will be useful for container orchestration like Kubernetes
+  the container. It should be used to inform Docker that service is operating
+  and healthy. It will be useful for container orchestration like Kubernetes
   or Docker Swarm to properly handle services that are still running but
   stopped being responsive. Avoid using `curl` directly and instead, use
   `health_check.py` written with specific service in mind. It will provide more