
The current sensor list is shared across all hosts. On large systems,
this can lead to list corruption when host sensor read threads output
data concurrently.

This update moves sensor_list to be thread local, so each thread gets
its own unique instance.

Although thread_local variables are not on the stack, their memory,
known as TLS (Thread-Local Storage), is tied to the thread’s resources.
In many cases it is drawn from the same per-thread region as the stack:
the TLS area is often allocated adjacent to or within the thread’s
stack mapping. A large thread_local variable increases the TLS
requirement, and if it exceeds the reserved space or overlaps with the
stack, thread creation may fail with 'Resource temporarily
unavailable'. To accommodate this, the per-thread stack size was
increased.

The sensor_list allocates for up to 512 sensors per host, which is
excessive. This update reduces the max sensors per host to 256, cutting
the list size from 327 KB to 163 KB per thread. Even with this
reduction, the thread stack size needed to be increased from 128 KB to
512 KB.

The Mtce Thread utility was updated to support custom stack sizes.
This allows mtcAgent to remain at 128 KB while hwmond threads can
specify a larger size.

This update also adds a debug feature to create dated sensor reading
files for each host.

While testing, it was found that output files were created with
inconsistent permissions. This update fixes the file mode to 0644.
Test Plan:

Verified in 2+2+50 node system

PASS: Verify large system install and sensor monitoring
PASS: Verify large system sensor monitoring over DOR and Swact
PASS: Verify the sensor_sample list storage is unique per thread
PASS: Verify sensor read file permissions
PASS: Verify dated debug sensor read files
PASS: Verify added debug options are disabled by default
PASS: Verify 24 hour provision/monitor/deprovision soak
PASS: Verify sensor monitoring following host delete and readd
PASS: Verify sensor model is deleted completely with host delete
PASS: Verify sensor model is recreated over host readd

Regression:

PASS: Verify sensor monitoring and alarm management
PASS: Verify hardware monitor process restart handling
PASS: Verify no coredumps
PASS: Verify logging for all test cases

Closes-Bug: 2102671
Change-Id: I9263ec2242e03d46e9dc768af965fed7e1ac9175
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
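
The mechanism described in the commit message can be illustrated with a minimal sketch. This is not the Mtce Thread utility code: the names sensor_sample, reader, reader_result, start_reader and HWMON_STACK_SIZE are hypothetical, and the 640-byte entry size is an assumption chosen so that 256 entries match the quoted 163 KB list. It only shows a large thread_local list combined with an explicit per-thread stack size set through the POSIX pthread_attr_setstacksize call.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins; not the real hwmond types or constants.
constexpr std::size_t MAX_SENSORS      = 256;
constexpr std::size_t HWMON_STACK_SIZE = 0x80000;   // 512 KiB

struct sensor_sample { char data[640]; };           // entry size is an assumption

// Each thread that touches sensor_list gets its own zero-initialized copy.
static thread_local sensor_sample sensor_list[MAX_SENSORS];

struct reader_result
{
    char           tag;        // in:  byte the thread writes into its list
    char           initial;    // out: first byte seen before writing
    std::uintptr_t list_addr;  // out: address of this thread's list
};

static void *reader(void *arg)
{
    reader_result *res = static_cast<reader_result *>(arg);
    res->initial = sensor_list[0].data[0];          // expected 0 in a fresh copy
    sensor_list[0].data[0] = res->tag;              // touch first entry
    sensor_list[MAX_SENSORS - 1].data[0] = res->tag; // and last entry
    res->list_addr = reinterpret_cast<std::uintptr_t>(&sensor_list[0]);
    return nullptr;
}

// Start a reader thread with an explicit stack size.
// Returns 0 on success, otherwise the pthread error number.
static int start_reader(pthread_t *tid, std::size_t stack_size,
                        reader_result *res)
{
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_attr_setstacksize(&attr, stack_size);
    if (rc == 0)
        rc = pthread_create(tid, &attr, reader, res);
    pthread_attr_destroy(&attr);
    return rc;
}
```

A caller would fill in reader_result.tag, call start_reader() with the desired stack size, and join the thread before reading the results. Passing the stack size as a parameter mirrors the idea of letting one process keep a small stack while another requests a larger one.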
31 lines
564 B
C
#ifndef __INCLUDE_MTCTHREAD_HH__
#define __INCLUDE_MTCTHREAD_HH__

/*
 * Copyright (c) 2013-2017 Wind River Systems, Inc.
 *
 * SPDX-License-Identifier: Apache-2.0
 *
 */

/**
 * @file
 * Wind River CGTS Platform Node Maintenance "Thread Header"
 * Header and Maintenance API
 */

typedef struct
{
    string bm_ip ;
    string bm_un ;
    string bm_pw ;
    string bm_cmd ;
} thread_extra_info_type ;

#define MTCAGENT_STACK_SIZE (0x20000) // 128 kBytes

void * mtcThread_bmc ( void * );
void * mtcThread_bmc_test ( void * arg );

#endif // __INCLUDE_MTCTHREAD_HH__
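
The sizes quoted in the commit message can be checked against the 0x20000 stack size defined in this header with some compile-time arithmetic. The sketch below is illustrative only: ENTRY_BYTES is inferred from 327 KB / 512 sensors and is an assumption, the constant and function names are hypothetical, and the fits() check is a rough necessary condition for the case where TLS shares the thread's stack mapping, not a statement of how any particular libc lays out memory.

```cpp
#include <cassert>
#include <cstddef>

// Stack sizes quoted in the commit message (names are hypothetical).
constexpr std::size_t SMALL_STACK = 0x20000;   // 131,072 bytes, 128 KiB
constexpr std::size_t LARGE_STACK = 0x80000;   // 524,288 bytes, 512 KiB

// Assumption: per-sensor entry size inferred from 327 KB / 512 sensors.
constexpr std::size_t ENTRY_BYTES = 640;

// Total bytes needed for a list of max_sensors entries.
constexpr std::size_t list_bytes(std::size_t max_sensors)
{
    return max_sensors * ENTRY_BYTES;
}

// Rough check: the list must at least be smaller than the stack mapping
// if the TLS block is carved out of that mapping.
constexpr bool fits(std::size_t list, std::size_t stack)
{
    return list < stack;
}

static_assert(list_bytes(512) == 327680, "512 sensors: about 327 KB");
static_assert(list_bytes(256) == 163840, "256 sensors: about 163 KB");
static_assert(!fits(list_bytes(256), SMALL_STACK),
              "even the reduced list exceeds a 128 KiB stack");
static_assert(fits(list_bytes(256), LARGE_STACK),
              "the reduced list fits within a 512 KiB stack");
```

Under these assumptions, halving the sensor limit alone is not enough for a 128 KiB stack, which is consistent with the commit raising the stack size to 512 KB for the threads that carry the list.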