
This update affects only locked nodes.

If a remote node fails early config in a way that prevents IPSec over
management from being established, and no cluster interface is
configured or provisioned, then Node Locked commands sent by mtcAgent
over the management and cluster networks are not received by mtcClient.

This leads to a perpetual watchdog reset loop. The pmon process fails
to reach the configured state and, without the .node_locked file
present, the watchdog treats the node as unlocked. A quorum failure
then triggers a crashdump reset, and the cycle repeats indefinitely.

The mtcAgent detects this and attempts corrective action by resending
the Node Locked command over the same failing networks, which also
fails.

This update adds a fallback: the Node Locked command is also sent over
the pxeboot network.

Testing also revealed that mtcClient socket recovery stops at the first
socket failure rather than trying to recover them all. This update
improves socket recovery by attempting all sockets in order. The
pxeboot socket is tried first, followed by the management and cluster
sockets.

Test Plan:

PASS: Verify mtcClient socket init and failure recovery handling.
PASS: Verify the mtcAgent sends the Node Locked command on the pxeboot
      network when it sees a node locked state mismatch.
PASS: Verify a locked node with failing management and cluster
      networking gets the Node Locked command serviced and the node
      locked file produced as expected on the remote node.

      This event is noted by the following host specific mtcAgent log.

      "hostname mtcAlive reporting unlocked while locked ; correcting"

Note: Before this update the above 'correcting' log appears every
      5 seconds. With this update the log appears only once and the
      remote node does not go into a perpetual crashdump loop.

Note: The host watchdog will not force a quorum failure crashdump if
      the /var/run/.node_locked file is present.
Closes-Bug: 2103863
Change-Id: I020c7ebe1e83254c52219546ec938f6cf3284c2e
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
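
The socket recovery change described in the message (attempt every socket in order instead of stopping at the first failure) could be sketched roughly as follows. This is only an illustration and not the actual mtcClient code: the SocketSpec type, the recover_sockets function and the setup callbacks are hypothetical names introduced here.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical descriptor for one client socket. The real mtcClient
// data structures are not shown in the commit message.
struct SocketSpec {
    std::string name;             // e.g. "pxeboot", "management", "cluster"
    std::function<bool()> setup;  // returns true if the socket was (re)opened
};

// Attempt recovery of every socket in the order given and return the
// names of those that still failed. A failure does not end the loop,
// so a broken management socket cannot block the sockets after it.
std::vector<std::string> recover_sockets(const std::vector<SocketSpec>& sockets) {
    std::vector<std::string> failed;
    for (const auto& sock : sockets) {
        if (!sock.setup()) {
            failed.push_back(sock.name);  // record the failure and keep going
        }
    }
    return failed;
}
```

A caller would list the pxeboot socket first, then management and cluster, and could retry later if the returned list is not empty.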