Eric MacDonald d863aea172 Increase mtce host offline threshold to handle slow host shutdown
Mtce polls the remote host for mtcAlive messages over
42 x 100 ms intervals during unlock and host failure handling.
The absence of mtcAlive throughout this (~5 sec) period
indicates that the node is offline.

However, in the rare case where shutdown is slow, 5 seconds
is not long enough. Cases have been seen where a wait of 7 or
8 seconds is required to properly declare the host offline.

To avoid the rare transient 200.004 host alarm over an
unlock operation, this update increases the mtce host
offline window from 5 to 10 seconds (approx) by modifying
the mtce configuration file offline threshold from 42 to 90.

Test Plan:

PASS: Verify the unchallenged failed-to-offline period is ~10 secs
PASS: Verify the algorithm restarts if an mtcAlive is received
      at any time during the polls/queries (challenge) window.
PASS: Verify challenge handling leads to a longer but
      successful offline declaration.
PASS: Verify above handling for both unlock and spontaneous
      failure handling cases.
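
The offline counting described above can be sketched roughly as follows.
This is an illustration only, not the mtce implementation: the function
name polls_until_offline and the vector-of-bools input are hypothetical,
and each element is assumed to stand for one 100 ms poll (true means an
mtcAlive was seen in that interval).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not taken from the mtce source.
// polls[i] is true if an mtcAlive message was seen during poll i.
// Returns the 1-based poll number at which the host would be declared
// offline, or 0 if the threshold of consecutive misses is never reached.
std::size_t polls_until_offline(const std::vector<bool>& polls,
                                std::size_t threshold)
{
    std::size_t misses = 0;
    for (std::size_t i = 0; i < polls.size(); ++i) {
        if (polls[i]) {
            // Challenge: an mtcAlive arrived, so the count starts over.
            misses = 0;
        } else if (++misses >= threshold) {
            // Enough consecutive silent polls: declare offline.
            return i + 1;
        }
    }
    return 0;
}
```

With a threshold of 90 and no mtcAlive at all, the sketch declares the
host offline on poll 90, i.e. after about 9 seconds of 100 ms polls.
A single mtcAlive on poll 51 pushes the declaration out to poll 141,
which matches the "longer but successful" challenge case.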

Closes-Bug: 2024249
Change-Id: Ice41ed611b4ba71d9cf8edbfe98da4b65dcd05cf
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2023-06-16 18:14:08 +00:00

metal

StarlingX Bare Metal Management

Description
StarlingX Bare Metal and Node Management, Hardware Maintenance
Languages
C++ 83%
Shell 10.2%
Python 3.3%
C 2.5%
Makefile 1%