Use less aggressive node startup timeouts (#230)
* Use less aggressive node startup timeouts
* Tweaks to try and prevent runaway remediations
parent 44d77503d7
commit b9eadc2650
@@ -290,10 +290,10 @@ controlPlane:
 enabled: true
 # The spec for the health check
 spec:
-  # By default, unhealthy control plane nodes are always remediated
-  maxUnhealthy: 100%
-  # If a node takes longer than 10 mins to startup, remediate it
-  nodeStartupTimeout: 10m0s
+  # By default, don't remediate control plane nodes when more than one is unhealthy
+  maxUnhealthy: 1
+  # If a node takes longer than 30 mins to startup, remediate it
+  nodeStartupTimeout: 30m0s
   # By default, consider a control plane node that has not been Ready
   # for more than 5 mins unhealthy
   unhealthyConditions:
@@ -387,10 +387,11 @@ nodeGroupDefaults:
 enabled: true
 # The spec for the health check
 spec:
-  # By default, unhealthy worker nodes are always remediated
-  maxUnhealthy: 100%
-  # If a node takes longer than 10 mins to startup, remediate it
-  nodeStartupTimeout: 10m0s
+  # By default, remediate unhealthy workers as long as they are less than 40% of
+  # the total number of workers in the node group
+  maxUnhealthy: 40%
+  # If a node takes longer than 30 mins to startup, remediate it
+  nodeStartupTimeout: 30m0s
   # By default, consider a worker node that has not been Ready for
   # more than 5 mins unhealthy
   unhealthyConditions:
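The effect of the new maxUnhealthy values can be sketched in Python. This is a minimal illustration, not the controller's code. It assumes the health check behaves like a Cluster API MachineHealthCheck, where remediation is skipped once the number of unhealthy machines exceeds maxUnhealthy, and it assumes that percentages are rounded down to a whole machine count. The function names are hypothetical.

```python
def max_unhealthy_count(max_unhealthy, total):
    """Resolve maxUnhealthy (an int or a string such as "40%") to a machine count."""
    if isinstance(max_unhealthy, str) and max_unhealthy.endswith("%"):
        percent = int(max_unhealthy[:-1])
        # Assumption: a percentage is scaled by the group size and rounded down.
        return (percent * total) // 100
    return int(max_unhealthy)


def remediation_allowed(max_unhealthy, total, unhealthy):
    """Return True if remediation may proceed for the current unhealthy count."""
    return unhealthy <= max_unhealthy_count(max_unhealthy, total)


# Control plane with 3 nodes, 2 of them unhealthy:
old_control_plane = remediation_allowed("100%", 3, 2)  # old default
new_control_plane = remediation_allowed(1, 3, 2)       # new default

# Worker node group with 5 nodes:
two_of_five = remediation_allowed("40%", 5, 2)
three_of_five = remediation_allowed("40%", 5, 3)
```

Under these assumptions, the old default remediates the control plane in the example while the new default holds back, and a five-node worker group is remediated with two unhealthy nodes but not with three. That is the "runaway remediation" guard the commit message refers to.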