931 Commits

Author SHA1 Message Date
Robert Church
3eef0bd7ee Add hwsettle support to pxeboot-update.sh and kickstarts
Add support for adding the hwsettle boot line parameter based on the
value provided to the installer. This will institute an init delay to
allow multipath constituent devices to become available prior to the
start of coalescing.

Test Plan:
PASS - AIO-SX: HPE multipath install/bootstrap/unlock
PASS - AIO-SX: Qemu virtual multipath install/bootstrap/unlock
PASS - AIO-DX: Qemu virtual multipath install/bootstrap/unlock
PASS - AIO-DX+: Qemu virtual multipath install/bootstrap/unlock
PASS - 2+2 (controller storage): Qemu virtual multipath install/
bootstrap/unlock
PASS - 2+2+2 (dedicated storage): Qemu virtual multipath install/
bootstrap/unlock
PASS - Add OSD ceph storage configuration (AIO-SX)
PASS - Expand CGTS volume group using extra disk (Partition) (AIO-SX)
PASS - Expand CGTS volume group using extra disk (disk) (AIO-SX)
PASS - Add nova local volume group using extra disk (AIO-SX)
PASS - App pod that allocates and writes into a PVC (AIO-SX)
PASS - Local disk Commands (Disk API) - AIO-SX/DX
- host-disk-list
- host-disk-show
- host-disk-partition-list
- host-disk-partition-show
- host-pv-list
- host-pv-show
- host-stor-list
- host-stor-show
- host-lvg-list
- host-lvg-show
- host-pv-add
PASS - Create nova-local volume group
PASS - Local disk Commands on AIO-DX after swact

Regression:
PASS - AIO-SX: Non-multipath install/bootstrap/unlock (NVME)
PASS - AIO-DX: Non-multipath install/bootstrap/unlock (SSD)
PASS - 2+2: Non-multipath install/bootstrap/unlock (SSD)
PASS - 2+2+2 : Non-multipath install/bootstrap/unlock (SSD and HD)
PASS - Distributed cloud: Non-multipath install/bootstrap/unlock

Change-Id: I38586cd98d0635a16490e7b987617b8d7ec5e20e
Depends-On: https://review.opendev.org/c/starlingx/tools/+/860590
Story: 2010046
Task: 47268
Signed-off-by: Robert Church <robert.church@windriver.com>
2023-02-15 15:56:31 +00:00
Zuul
17c2912bd6 Merge "Refactor kickstarts to integrate multipath support" 2023-02-15 15:39:28 +00:00
Kyale, Eliud
502662a8a7 Cleanup mtcAgent error logging during startup
- reduced log level in http util to warning
- use inservice test handler to ensure state change notification
  is sent to vim
- reduce retry count from 3 to 1 for add_handler state_change
  vim notification

Test plan:
PASS - AIO-SX: ansible controller startup (race condition)
PASS - AIO-DX: ansible controller startup
PASS - AIO-DX: SWACT
PASS - AIO-DX: power off restart
PASS - AIO-DX: full ISO install
PASS - AIO-DX: Lock Host
PASS - AIO-DX: Unlock Host
PASS - AIO-DX: Fail Host ( by rebooting unlocked-enabled standby controller)

Story: 2010533
Task: 47338

Signed-off-by: Kyale, Eliud <Eliud.Kyale@windriver.com>
Change-Id: I7576e2642d33c69a4b355be863bd7183fbb81f45
2023-02-14 14:18:02 -05:00
Adriano Oliveira
a446585145 Refactor kickstarts to integrate multipath support
Refactor kickstart.cfg and miniboot.cfg device management to support
multipath disks. This includes:
- Improving function names for clarity
- Improving function docs (params, returns, examples)
- Add get_part_prefix() to provide a common function used to dynamically
  build the partition device names
- Add discovery of multipath disks as an install media option if no
  instdev is provided.
- Add support for by-id/wwn-* multipath persistent device names. This is
  in addition to by-path/* HDD/SSD/NVMe persistent device names which
  enables consistent disk usage, across reboots, irrespective of kernel
  device node enumeration inconsistencies.
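
For illustration only, a partition-prefix helper along the lines of
get_part_prefix() could look like the sketch below. This is not the code
from the kickstarts: the matching rules, and in particular the "-part"
separator for /dev/mapper/mpath* names, are assumptions.

```shell
#!/bin/bash
# Hypothetical sketch: print the separator that goes between a disk
# device name and a partition number. Not the actual kickstart code.
get_part_prefix() {
    local dev="$1"
    case "$dev" in
        /dev/disk/by-id/*|/dev/disk/by-path/*)
            # udev persistent names append "-partN"
            echo "-part" ;;
        /dev/mapper/mpath*)
            # Assumption: multipath maps named mpathX also use "-partN"
            echo "-part" ;;
        *[0-9])
            # Kernel names ending in a digit (nvme0n1, mmcblk0) use "pN"
            echo "p" ;;
        *)
            # Plain names such as /dev/sda append the number directly
            echo "" ;;
    esac
}
```

A caller would build a partition name with something like
"${dev}$(get_part_prefix "${dev}")1".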

Test Plan:
PASS - AIO-SX: HPE multipath install/bootstrap/unlock
PASS - AIO-SX: Qemu virtual multipath install/bootstrap/unlock
PASS - AIO-DX: Qemu virtual multipath install/bootstrap/unlock
PASS - AIO-DX+: Qemu virtual multipath install/bootstrap/unlock
PASS - 2+2 (controller storage): Qemu virtual multipath install/
       bootstrap/unlock
PASS - 2+2+2 (dedicated storage): Qemu virtual multipath install/
       bootstrap/unlock
PASS - Add OSD ceph storage configuration (AIO-SX)
PASS - Expand cgts volume group using extra disk (Partition) (AIO-SX)
PASS - Expand cgts volume group using extra disk (disk) (AIO-SX)
PASS - Add nova local volume group using extra disk (AIO-SX)
PASS - App pod that allocates and writes into a PVC (AIO-SX)
PASS - Local disk commands (Disk API) - AIO-SX/DX
- host-disk-list
- host-disk-show
- host-disk-partition-list
- host-disk-partition-show
- host-pv-list
- host-pv-show
- host-stor-list
- host-stor-show
- host-lvg-list
- host-lvg-show
- host-pv-add
PASS - Create nova-local volume group
PASS - Local disk commands on AIO-DX after swact

Regression:
PASS - AIO-SX: Non-multipath install/bootstrap/unlock (NVME)
PASS - AIO-DX: Non-multipath install/bootstrap/unlock (SSD)
PASS - 2+2: Non-multipath install/bootstrap/unlock (SSD)
PASS - 2+2+2 : Non-multipath install/bootstrap/unlock (SSD and HD)
PASS - Distributed cloud: Non-multipath install/bootstrap/unlock

Change-Id: I8b7ab349d9991810d4faad9c3f7e3be625d6ed5c
Depends-On: https://review.opendev.org/c/starlingx/tools/+/860590
Story: 2010046
Task: 46567
Co-Authored-By: Matheus Guilhermino <matheus.machadoguilhermino@windriver.com>
Co-Authored-By: Robert Church <robert.church@windriver.com>
Signed-off-by: Adriano Oliveira <adriano.oliveira@windriver.com>
Signed-off-by: Robert Church <robert.church@windriver.com>
2023-02-14 15:09:10 -03:00
Christopher Souza
56ab793bc5 Change hostwd emergency log to write to /dev/kmsg
The hostwd emergency logs were written to /dev/console.
This change adds the prefix "hoswd:" to the log message
and writes it to /dev/kmsg instead.

Test Plan:

Pass: AIO-SX and AIO DX full deployment.
Pass: kill pmond and wait for the emergency log to be written.
Pass: check if the emergency log was written to /dev/kmsg.
Pass: Verify logging for quorum report missing failure.
Pass: Verify logging for quorum process failure.
Pass: Verify emergency log crash dump logging to mesg and
      console logging for each of the 2 cases above with
      stressng overloading the server (CPU, FS and Memory);
      stress-ng --vm-bytes 4000000000 --vm-keep -m 30 -i 30 -c 30

Story: 2010533
Task: 47216

Co-authored-by: Eric MacDonald <eric.macdonald@windriver.com>
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
Co-authored-by: Christopher Souza <Christopher.DeOliveiraSouza@windriver.com>
Signed-off-by: Christopher Souza <Christopher.DeOliveiraSouza@windriver.com>
Change-Id: I0da82f964dd096840259c4d0ed4e5f558debdf22
2023-02-01 23:41:14 +00:00
Zuul
acbd301a1c Merge "Update pxe boot directory in kickstart for 21.12" vr/stx.8.0 __v.stx.test2 2023-01-25 19:44:32 +00:00
Zuul
424fdee3dc Merge "Fix 21.12 feed directory for 22.12 upgrade" 2023-01-20 16:06:44 +00:00
Junfeng (Shawn) Li
d3a7f90c0b Fix 21.12 feed directory for 22.12 upgrade
Details: The 21.12 feed directory is in /www/pages/feed/rel-21.12/.
The import.sh script in 22.12 iso is looking for the feed directory
in /var/www/pages/feed/rel-21.12/.

This commit makes sure import.sh looks in the right
feed directory for each CentOS release.

Test Plan:

PASS: ran the upgrade from 21.12 and feed directory is set up
PASS: ran the upgrade from 22.06 and feed directory is set up

Task: 46918
Story: 2009303
Signed-off-by: Junfeng (Shawn) Li <junfeng.li@windriver.com>
Change-Id: I30ea6403c336daa618c9b650ba94cfa1f94533f8
2023-01-18 09:20:18 -05:00
Shrikumar Sharma
ea1b8629e6 Fix for detection of existing file system in the prestage process
During prestage with a prestage iso, the existing filesystem must
not be overwritten if an installation with an install_guid exists,
when the force_install parameter is not specified.

However, when logical volumes are used, the check for a valid
installation does not succeed, resulting in the installer
overwriting the existing installation.

This commit fixes this issue by inspecting the volume for an
installation. This commit also ensures that if an invalid storage
device is specified for root device, then a failure is reported
and the system breaks into a bash shell.

Test Plan:

PASS: Verify that the installer does not overwrite an existing
installation with an install_guid.

PASS: Verify that the installer reports an error and breaks into
a bash shell if an invalid storage device is specified for root
device.

Closes-Bug: 2002999

Change-Id: I1d4ef10ce741b98455c65467367448e05f37fd64
Signed-off-by: Shrikumar Sharma <shrikumar.sharma@windriver.com>
2023-01-17 17:14:22 +00:00
Zuul
68452f367b Merge "Avoid logging in fork_sysreq_reboot failsafe thread" 2023-01-10 16:53:51 +00:00
Eric MacDonald
67c4f1b148 Avoid logging in fork_sysreq_reboot failsafe thread
Continuing to log in the fork_sysreq_reboot failsafe thread
is seen to cause mtcAgent and mtcClient log file corruption
with binary data.

As an avoidance measure this update changes the offending
information logs to normally disabled debug logs.

Test Plan:

PASS: Verify build, install and provision system with debian iso
      - AIO SX (hw), Standard 2+1 (vbox)
PASS: Verify mtcAgent and mtcClient log files do not get
      binary data (corruption) injected over a self reboot.
PASS: Verify lock and unlock of AIO SX host
PASS: Verify lock and unlock of system node from active controller
PASS: Verify host reboot command
PASS: Verify critical process failure reboot handling

Closes-Bug: 2001719
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
Change-Id: Ib49ee427d2a6363ce21ec7488b1f739986828219
2023-01-10 11:38:12 -05:00
Eric MacDonald
a3cba57a1f Adapt Host Watchdog to use kdump-tools
The Debian package for kdump changed from kdump to kdump-tools

Test Plan:

PASS: Verify build and install AIO DX system
PASS: Verify host watchdog detects kdump as active in debian

Closes-Bug: 2001692
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
Change-Id: Ie1ac29d3d29f3d9c843789cdedf85081fe790616
2023-01-04 12:57:19 -05:00
Zuul
85ea002112 Merge "Remove console=ttyS0,115200 from system node install menus" 2023-01-04 17:11:40 +00:00
Al Bailey
5f85f2066a Update tox.ini to work with tox 4
This change will allow this repo to pass zuul now
that this has merged:
https://review.opendev.org/c/zuul/zuul-jobs/+/866943

Tox 4 deprecated whitelist_externals.
Replace whitelist_externals with allowlist_externals

Partial-Bug: #2000399

Signed-off-by: Al Bailey <al.bailey@windriver.com>
Change-Id: Ib2aea53615a378ce47d2a2b23ec2e1946c312eed
2022-12-26 23:26:54 +00:00
Eric MacDonald
3fee973f9e Remove console=ttyS0,115200 from system node install menus
This update removes the 'console=ttyS0,115200' grub command
line argument from the debian system node install menus.

Then allow system inventory to add a customer specified
console setting at node provisioning time by way of
xxxAPPEND_OPTIONSxxx variable replacement.

Test Plan:

PASS: Verify all system node install grub menus get 'ttySx*'
      value from system inventory. (dm config)
PASS: Verify all system node install grub menus default
      to "ttyS0,115200" if missing for a node's provisioning
      in system inventory. (vbox config)

Partial Bug: 2000093
Depends-On: https://review.opendev.org/c/starlingx/config/+/868353

Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
Change-Id: I866f29233d2f2e637725a98b445ac6d24333ea30
2022-12-21 11:07:45 -05:00
Zuul
2d51929684 Merge "Fix bug in recent worker_reserved.conf handling" 2022-12-15 20:13:25 +00:00
emacdona
97ccd3d962 Fix bug in recent worker_reserved.conf handling
The new TEMPLATE_FILE and TARGET_FILE are out of
scope for worker installs.

Test Plan:

PASS: Verify worker only install

Closes-Bug: 1999561
Signed-off-by: emacdona <eric.macdonald@windriver.com>
Change-Id: Id6ce9e773a208637a32d355e6a0bfb3745437eaa
2022-12-15 12:15:48 -05:00
Kyle MacLeod
35a2f1c296 Validate prestaged ostree_repo via checksum
For installs using prestage data (prestaging or prestage ISO),
an md5 directory-based checksum is now included at the same
directory level as ostree_repo (via related commits).

This commit adds a validation check for any prestaged
/opt/platform-backup/ostree_repo.

The validation check consists of the following:
- If a checksum file exists, use it for validation
- Otherwise, print a warning and fall back to using ostree fsck
    - The ostree fsck command takes much longer to complete.

If the validation check fails, the prestage data is removed, and the
remote install falls back to doing a fresh ostree pull from the system
controller.
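
As a rough sketch of the flow described above (use the checksum file if
present, otherwise warn and fall back to ostree fsck), assuming a
directory checksum computed as one md5 over the sorted per-file md5
list. The function names and the checksum format are hypothetical, not
taken from the kickstart.

```shell
#!/bin/bash
# Hypothetical sketch of the validate-then-fall-back flow.

# dir_md5 DIR: one md5 over the sorted per-file md5 list of DIR.
dir_md5() {
    (cd "$1" && find . -type f -print0 | LC_ALL=C sort -z | xargs -0 -r md5sum) \
        | md5sum | cut -d' ' -f1
}

# validate_repo REPO_DIR CHECKSUM_FILE
# Returns 0 when the repo validates, non-zero otherwise.
validate_repo() {
    local repo="$1" sum_file="$2"
    if [ -f "$sum_file" ]; then
        # Fast path: compare against the stored checksum
        [ "$(dir_md5 "$repo")" = "$(cut -d' ' -f1 "$sum_file")" ]
    else
        # Slow path: no checksum file, let ostree verify the repo
        echo "WARNING: no checksum file, falling back to ostree fsck" >&2
        ostree fsck --repo="$repo"
    fi
}
```

On a non-zero return the caller decides what to do; per the text above,
a remote install removes the prestage data and pulls from the system
controller.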

If the validation fails for a local prestage ISO install,
the installation will fail during boot. This is unlikely;
it would only happen if the USB is somehow corrupt.

Test Plan

PASS: Remote installs
- Boot subcloud using prestage ISO. Perform remote install.
  Verify the checksum is validated as part of a successful install
  and bootstrap.
- Boot subcloud using prestage ISO. Manually corrupt the
  /opt/platform-backup/ostree_repo. Perform remote install.
  Verify the following:
  1) the checksum validation fails,
  2) the corrupt /opt/platform-backup/ostree_repo directory is removed
  3) the installation continues via remote ostree pull.

PASS: Local Install
- Boot subcloud using prestage ISO. Perform local install.
  Verify the checksum is validated as part of a successful install
  and bootstrap.

PASS: Pre-corrupted ISO
- Boot subcloud using a prestage ISO with a pre-corrupted ostree_repo
  Verify the boot fails due to the checksum validation failure.

Depends-On: https://review.opendev.org/c/starlingx/utilities/+/867179
Depends-On: https://review.opendev.org/c/starlingx/ansible-playbooks/+/867178
Closes-Bug: 1999306

Signed-off-by: Kyle MacLeod <kyle.macleod@windriver.com>
Change-Id: I1fb69b76de4b7fa5bc49cb4b182297b3bb94ba78
2022-12-15 11:13:04 -05:00
emacdona
af7defe48f Add error checking to worker_reserved.conf handling
This update adds error checking and error handling to
worker_reserved.conf update handling.

Test Plan:

PASS: Verify kickstart logging around worker_reserved
      update for each of the install cases below.
PASS: Verify subcloud install
PASS: Verify All In One controller install
PASS: Verify worker only install
PASS: Verify standard Controller install

Closes-Bug: 1999561
Signed-off-by: emacdona <eric.macdonald@windriver.com>
Change-Id: I7ccdd9cc02908fcb0fe0a403c2b2141bd44b692a
2022-12-14 12:01:47 +00:00
Zuul
f96df1e18b Merge "Remove minimal PV support on AIO/workers" 2022-12-09 23:17:01 +00:00
Charles Short
52e7b9d979 debian: Manage /etc/platform/worker_resource.conf
Install the /etc/platform/worker_resource.conf based
on personality type. The worker_resource.conf should
be installed on AIO/worker and worker types, but not
on controller-only or storage types.

Test Plan
PASSED Build worker-utils package
PASSED Build ISO
PASSED Start a controller and check to make
       sure that the /etc/platform/worker_reserved.conf
       is not present.
PASSED Start a worker and check to make
       sure that the /etc/platform/worker_reserved.conf
       is present.
PASSED Install AIO and unlock host.
PASSED Install Standard installation and unlock host.

Story: 2009968
Task: 46980

Depends-On: https://review.opendev.org/c/starlingx/utilities/+/866496

Signed-off-by: Charles Short <charles.short@windriver.com>
Change-Id: I32f0a841e55bb2d45b005407f99ed6430b60bf48
2022-12-07 13:05:46 -05:00
Zuul
e1ecb6a005 Merge "Lock root account" 2022-12-07 05:09:50 +00:00
Shrikumar Sharma
b29e8c7345 Preserve persistent backup when invalid persistent_size provided
Miniboot wipes the backup-partition when the persistent size
is set to a value less than the existing size. The expectation is
that the install should fail and the contents of platform-backup
should be preserved.

This fix solves the issue by failing the installation during the
ks-early phase, where the provided persistent size value in the
kernel commandline can be read, and no disk operations have been
performed.
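
A minimal sketch of such an early check, assuming the value is passed as
persistent_size=<MiB> on the kernel command line. The helper names and
the unit are assumptions, and only the "smaller than existing" case is
covered here.

```shell
#!/bin/bash
# Hypothetical sketch: read persistent_size from a kernel command line
# string and reject values below the current partition size.

# get_cmdline_value KEY CMDLINE: print the value of KEY=... if present.
get_cmdline_value() {
    local key="$1" cmdline="$2" arg
    for arg in $cmdline; do
        case "$arg" in
            "$key"=*) echo "${arg#*=}"; return 0 ;;
        esac
    done
    return 1
}

# check_persistent_size CMDLINE CURRENT_MIB
# Returns 1 when the requested size is smaller than the current size.
check_persistent_size() {
    local cmdline="$1" current_mib="$2" requested
    # Nothing requested: nothing to validate
    requested=$(get_cmdline_value persistent_size "$cmdline") || return 0
    if [ "$requested" -lt "$current_mib" ]; then
        echo "ERROR: persistent_size ${requested} < existing ${current_mib}" >&2
        return 1
    fi
}
```

In an installer this would be called with "$(cat /proc/cmdline)" before
any partitioning step, so a failure leaves the disk untouched.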

Test Plan:
PASS: Verify that installation with valid parameters passes.

PASS: Verify that reinstall fails if persistent_size less
      than the current persistent_size is provided.

PASS: Verify that contents of /opt/platform-backup are preserved
      when persistent_size less than the current size is
      provided.

PASS: Verify that reinstall fails if persistent_size greater
      than the size of the rootfs device is provided.

PASS: Verify that the contents of /opt/platform-backup are
      preserved when persistent-size greater than size of rootfs
      device is provided.

Closes-Bug: 1998932

Signed-off-by: Shrikumar Sharma <shrikumar.sharma@windriver.com>
Change-Id: I51351cb14cdcfa63b4b5839d935589d997b5403a
2022-12-06 17:57:21 +00:00
Eric MacDonald
dcc78cfdb9 Lock root account
This update stops setting the root password and locks the root account

Test Plan:

PASS: Verify root account can't be logged into with 'root' as password.
PASS: Verify can set root password with 'sudo passwd root'

Story: 2009968
Task: 46997
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
Change-Id: I5ae53c2e457ffba3cdaea7bb45ff82bb60945083
2022-12-06 12:53:13 +00:00
Zuul
07640af5df Merge "Perform remote ostree pull during local-based ostree install" 2022-12-05 00:46:58 +00:00
Robert Church
6132aa7317 Remove minimal PV support on AIO/workers
To support long running patch-able systems that don't require a
reinstall, the entire root disk will be allocated to the cgts-vg volume
group as part of installation.

This update simply removes the use of MINIMUM_PLATFORM_PV_SIZE and
ensures that the 'platform_pv' uses all available space.

NOTE: A followup commit will be provided to clean up the large, small,
      tiny disk references and provide an accurate log checking for and
      displaying minimal disk size based on default logical volume
      sizes.

Test Plan:
PASS - Install AIO-SX, bootstrap, unlock
PASS - Install 2+2+2, bootstrap, unlock

Change-Id: I3a50f2305b781de1cf9b80c5aed62b03bebc4790
Story: 2010444
Task: 46981
Signed-off-by: Robert Church <robert.church@windriver.com>
2022-12-03 12:36:45 -06:00
Kyle MacLeod
f15661bcf6 Allow e2fsck exit codes of 0,1
From the e2fsck man pages, the exit codes of 0, 1, 2 should
not be treated as failures.  We should never see exit code 2 though,
since it only occurs when e2fsck is run against a mounted filesystem.

The solution is to extend the check to only fail on exit code > 1.
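
The check can be sketched as follows. The wrapper name and the e2fsck
options shown are illustrative assumptions, not the kickstart code.

```shell
#!/bin/bash
# Sketch: treat e2fsck exit codes 0 (no errors) and 1 (errors corrected)
# as success; anything greater than 1 is a failure.
fsck_rc_ok() {
    [ "$1" -le 1 ]
}

# Hypothetical wrapper; the device is supplied by the caller.
run_e2fsck() {
    local dev="$1" rc=0
    e2fsck -p -f "$dev" || rc=$?
    if ! fsck_rc_ok "$rc"; then
        echo "ERROR: e2fsck on ${dev} failed with rc=${rc}" >&2
        return "$rc"
    fi
}
```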

Test Plan
PASS: Verify e2fsck exit code handling during subcloud install
      with resized partition

Closes-Bug: 1998611

Change-Id: Ie22fd77e3d2e2d631ba467b818bdc77c77f0d8b8
Signed-off-by: Kyle MacLeod <kyle.macleod@windriver.com>
2022-12-02 11:18:32 -05:00
Kyle MacLeod
d634510319 Perform remote ostree pull during local-based ostree install
For Redfish-based subcloud installs which use a local ostree repo as the
basis of their install, we perform a secondary ostree pull from the
system controller. This will pull any ostree commits which have been
applied (via patch) since the local ostree repo was created.

Note that this also requires syncing of the patch metadata via
/opt/patching. This step is done via the install ansible playbook.

Test Plan:

PASS:
- Simulate a local ostree-based install during a sushy subcloud add in
  libvirt. Verify that a remote ostree pull retrieves any patch commits
  from the system controller. This is done by manipulating ostree repo
  contents on the system controller during the miniboot.cfg kickstart.

PASS:
- Prestage a subcloud from a non-patched system controller.
  Prestaged ostree_repo is stored on platform-backup partition.
  Patch the system controller. Add the prestaged subcloud.
  Verify that the subcloud boots and that the patched ostree commit
  is transferred to the subcloud during the miniboot.cfg kickstart.

Partial-Bug: 1998256

Signed-off-by: Kyle MacLeod <kyle.macleod@windriver.com>
Change-Id: Iee08b40dc2b930dacbbf4df08b0f727eb945d4ba
2022-11-30 21:24:01 -05:00
Junfeng (Shawn) Li
63912dc0b0 Update pxe boot directory in kickstart for 21.12
Details: This change allows the kickstart to look for the
right pxe boot directory in 21.12 for the 22.12 upgrade

Test Plan:
PASS: 22.12 Debian is installed on both controllers

Task: 46968
Story: 2009303

Signed-off-by: Junfeng (Shawn) Li <junfeng.li@windriver.com>
Change-Id: I3401a25d41dd3af2c63fdb90c83316a35a9733d0
2022-11-30 16:19:06 -05:00
Robert Church
1796ed8740 Update wipedisk for LVM based rootfs
Now that the root filesystem is based on an LVM logical volume, discover
the root disk by searching for the boot partition.

Changes include:
 - remove detection of rootfs_part/rootfs and adjust rootfs related
   references with boot_disk.
 - run bashate on the script and resolve indentation and syntax related
   errors. Leave long-line errors alone for improved readability.

Test Plan:
PASS - run 'wipedisk', answer prompts, and ensure all partitions are
       cleaned up except for the platform backup partition
PASS - run 'wipedisk --include-backup', answer prompts, and ensure all
       partitions are cleaned up
PASS - run 'wipedisk --include-backup --force' and ensure all partitions
       are cleaned up

Change-Id: I036ce745353b6a26bc2615ffc6e3b8955b4dd1ec
Closes-Bug: #1998204
Signed-off-by: Robert Church <robert.church@windriver.com>
2022-11-29 05:04:38 -06:00
Robert Church
b0066dcd27 Remove all volume groups by UUID
In cases when wipedisk isn't run or isn't working correctly,
pre-existing volume groups, physical volumes, and logical volumes will
be present on the root disk. Depending on the sizes and layout of the
previous install along with partial or aborted cleanup activities, this
may lead to [unknown] PVs with duplicate volume group names.

Adjust the cleanup logic to:
- Discover existing volume groups by UUID so that duplicate volume
  groups (i.e. two occurrences of cgts-vg) can be handled individually.
- Ignore [unknown] physical volumes in a volume group as they cannot be
  removed. Cleaning up existing physical volumes across all volume
  groups will resolve any [unknown] physical volumes.

In addition, unify if/then for/do syntax in the %pre-part hook

Test Plan:
PASS - create a scenario with multiple partitions along with a
       nova-local and cgts-vg volume group that result in an [unknown]
       physical volume and a duplicate cgts-vg. Do not wipe the disks
       and install an ISO with the above changes. Observe proper cleanup
       and install.
PASS - Perform consecutive installs without wipedisk and observe proper
       cleanup and install

Change-Id: Idf845cf00ca3c009d72dedef0805a77d94fa3d97
Partial-Bug: #1998204
Signed-off-by: Robert Church <robert.church@windriver.com>
2022-11-29 05:04:06 -06:00
Robert Church
651bd76566 Ensure magic strings that are visible for libblkid are erased
In the case when the root disk partition table is wiped but individual
partitions are not wiped correctly, this will leave previous physical
volume metadata intact on the disk.

When a new LVM partition is created and assigned as a newly created
physical volume the old LVM metadata on the disk partition will prevent
the cgts-vg volume group from being created.

This update will wipe all the magic strings present in the new physical
volume partition established by the kickstart by executing 'wipefs -a'
prior to creating the cgts-vg.

Test Plan:
PASS - Successfully install an ISO with this change on a system that did
       not clean up the LVM metadata from a previous install. Log in to
       the installed system and confirm that the cgts-vg is properly
       configured.

Change-Id: I63f4235a27cb40a4283f0f4c34f63564a4f18cdd
Partial-Bug: #1998204
Signed-off-by: Robert Church <robert.church@windriver.com>
2022-11-29 04:40:42 -06:00
Al Bailey
d4aaeb5836 Debian: Fix ostree remote for patching on workers
The sw_version was uninitialized for workers.
This led to a 404 error when doing ostree pull
during a patch installation on worker nodes.

The problem was introduced by
https://review.opendev.org/c/starlingx/metal/+/864930

Test Plan:
  Build / Install /Deploy Duplex env with a worker
  Successfully apply a patch on the worker

Closes-Bug: 1997130
Signed-off-by: Al Bailey <al.bailey@windriver.com>
Change-Id: If40466b0ac9ffe0ce1ae068e948682eafa3703e5
2022-11-24 19:04:13 +00:00
Zuul
d614eda8fc Merge "Fix failure to set instdev parameter when we use ks-setup.cfg" 2022-11-24 01:25:40 +00:00
Shrikumar Sharma
15e971b59e Fix failure to set instdev parameter when we use ks-setup.cfg
When we create the prestage iso with an external script,
ks-setup.cfg, we may not provide the rootfs_device or
boot_device parameter. This is a valid scenario where these
parameters are defined in ks-setup.cfg. An installation failure
is observed in this case.

The cause of the failure is that the prestage code is handled in
a pre-part hook. This commit moves it to a ks-early hook.

In this commit, a provision for the execution of a custom script
named ks-addon.cfg is also made. This script is a bash script
that must execute in the last post hook.

Test Plan:
PASS: Verify that the installation succeeds when the rootfs
      and boot device parameters are only specified via
      the ks-setup.cfg.

PASS: Verify that the external script, ks-addon.cfg, is executed
      after the install and configurations are done.

PASS: Verify that the logs from the execution of ks-addon.cfg
      are present in kickstart.log.

Closes-bug: 1997305

Signed-off-by: Shrikumar Sharma <shrikumar.sharma@windriver.com>
Change-Id: Ica1735aef3ab457cf0609ebee6aac45671e97987
2022-11-23 23:06:05 +00:00
Zuul
f80c698031 Merge "Make var and root filesystems LVM based" 2022-11-22 19:05:10 +00:00
Robert Church
c5c6f5353a Make var and root filesystems LVM based
Move the /var and /root partition based filesystems into the cgts-vg so
that they can be resized as required at runtime in the future.

This change includes:
- Update pxeboot network personality files to add installer command line
  parameters inst_ostree_root and inst_ostree_var to allow specifying the
  root and var devices to be created and populated by the installer.
- Update the StarlingX grub.cfg file to add a new single option booting
  that drops the rollback boot option (not working) and adds grub
  options ostree_root, rd.lvm.lv, and ostree_var to enable mounting the
  root and var filesystems at boot time.
- Update the kickstart/miniboot config files to:
  - Remove support for lat/lat-disk partition size variables and
    refactor the hooks to use specific PART_SZ_* and LV_SZ_* variables.
  - Increase /boot partition size to 2GB from 500M to provide some
    additional space for future patching scenarios that may require
    staging multiple ostree deployments prior to reboot and cleanup.
  - Create logical volumes for root and var set to the current 20GB
    values.
  - Adjust the minimum physical volume size used on AIO and worker
    personalities to include the new root and var logical volumes.
  - Adjust normal install disk thresholds to 219GB for AIOs and 120GB
    for workers.
  - Fix mkfs hook to ensure that the aio vs. std sizes are correctly
    reflected on hook execution.

Test Plan:
- PASS: BIOS AIO-SX
- PASS: UEFI AIO-SX
- PASS: BIOS 2+2+2
- SKIP: secure boot, not ready for Stx8.0
- PASS: AIO-SX upgrade
- PASS: AIO-DX upgrade
- PASS: DC subcloud install (virtual test)

Change-Id: I5f77266336b53d178eaae0e6fbb556bbea6400e8
Depends-On: https://review.opendev.org/c/starlingx/integ/+/865076
Story: 2010444
Task: 46881
Signed-off-by: Robert Church <robert.church@windriver.com>
2022-11-22 13:05:13 +02:00
Zuul
87a645b8ea Merge "Remove normal/rollback toggle code from stx grub menu" 2022-11-21 20:28:31 +00:00
Zuul
fde03b15d7 Merge "Debian: metal: update debian_iso_image.inc" 2022-11-21 19:49:54 +00:00
Eric MacDonald
04e9723dbb Remove normal/rollback toggle code from stx grub menu
Modify the stx grub template file to remove the
normal / rollback image switching/toggle algorithm.
Also remove the temporary sed based method in the
kickstart code.

Effectively, this moved the previous change introduced by

  https://review.opendev.org/c/starlingx/metal/+/861461

... to a grub.cfg 'code block remove' rather
than 'on the fly sed modification' by the kickstart.

Test Plan:

PASS: Verify build and install
PASS: Verify on target code removed from /boot/efi/EFI/BOOT/grub.cfg
PASS: Verify normal image is selected after 10 back to back reboots

Story: 2009968
Task: 46886
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
Change-Id: Id8799dff6eef7ef8aa6f66180d6ed971c005618d
2022-11-20 23:19:53 +00:00
Zuul
d24c008374 Merge "Create new pxeboot feed refresh script and service" 2022-11-20 18:48:23 +00:00
Zuul
10a4bece22 Merge "Debian: Fix ostree remote pulls for IPv6 workers" 2022-11-20 16:00:40 +00:00
Eric MacDonald
b5d22ef3e7 Create new pxeboot feed refresh script and service
This update introduces a new script that can be called
by patching to refresh the kernel, initrd and other
system node install feed staged files in support of
kernel patching.

This update also introduces and enables a new service file
that triggers the creation of the pxeboot feeds or refreshes
the pxeboot feeds if what they contain does not match the
content in /boot.
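
The "refresh only on mismatch" idea can be sketched as below. This is
an illustration only: it uses cmp and cp instead of the rsync mentioned
in the test plan, and the function name is hypothetical.

```shell
#!/bin/bash
# Hypothetical sketch: copy a staged file only when the destination is
# missing or its content differs from the source.
refresh_file() {
    local src="$1" dst="$2"
    if [ -f "$dst" ] && cmp -s "$src" "$dst"; then
        # Already up to date, skip the copy
        return 0
    fi
    mkdir -p "$(dirname "$dst")"
    cp -f "$src" "$dst"
}
```

Called once per kernel, initrd, or other feed file, this creates a
missing feed and leaves a matching one untouched.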

Both new script and service files are added to the
pxe-network-installer package so they get installed
into the filesystem properly.

Lastly, there are 2 kickstart changes implemented.
 1. The kickstart code that copied the kickstart files from
      /var/www/pages/feed/rel-xx.xx/
      to
      /var/www/pages/feed/rel-xx.xx/kickstart
    is removed in favor of the pxe-network-installer package
    doing that automatically.
 2. The kickstart is modified to remove the previous pxeboot
    feed fetch and creation function.
    One exception to this is the efi.img file; its fetch remains.
    Note the efi image is currently not included in the /boot dir.

Test Plan:

PASS: Verify Debian build and AIO DX install (cd and pxe installs)
PASS: Verify Debian Standard 2+1 DX system install
PASS: In above cases verify end-to-end handling of the following
      test case staging.
PASS: Verify pxeboot feed staging on subcloud controller-0 install
PASS: Verify pxeboot feed file positioning in
      - /var/pxeboot/rel-xx.xx (kernel and initrd images)
      - /var/www/pages/feed/rel-xx.xx/pxeboot (kernel/initrd images)
      - /var/www/pages/feed/rel-xx.xx/pxeboot/EFI/BOOT (other files)
      - /var/pxeboot and /var/www/pages/feed/rel-xx.xx (efi.img)
PASS: Verify rsync bypass for the above cases when the files match
      - complete and partial cases
PASS: Verify staging when the stage dirs are missing
      - complete and partial cases
PASS: Verify staging when stage files mismatch
      - complete and partial cases
PASS: Verify service enable on controllers for AIO and STD configs
PASS: Verify kickstart file position change
PASS: Verify shellcheck static analysis
PASS: Verify pxeboot_feed.sh script error handling
PASS: Verify pxeboot_feed.sh script logging

Story: 2009968
Task: 46789
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
Change-Id: Ic98b2686c417103749cb777adb28ac73ac1d397c
2022-11-20 15:36:23 +00:00
Al Bailey
32feb4a89a Debian: Fix ostree remote pulls for IPv6 workers
The entry for the ostree remote in computes is using
pxecontroller. This is an ipv4 address, and therefore
will not be accessible on an unlocked IPv6 Worker
(or storage).

The fix is to use 'controller' instead of 'pxecontroller'
That address exists in both ipv4 and ipv6.

Test Plan:
 Debian: Successfully apply a patch to a worker node

Closes-Bug: 1997130

Signed-off-by: Al Bailey <al.bailey@windriver.com>
Change-Id: Idbc1e2728582ab3cd5c73761790cdd9fbc6d951a
2022-11-19 01:19:16 +00:00
Shrikumar Sharma
d19af3331d Copy efi.img to /var/pxeboot on subcloud for multinode support
To create a duplex subcloud with multiple nodes with different
personalities, pxeboot must be supported by controller-0 of the
subcloud.

Here, pxeboot is enabled on controller-0 by copying efi.img
from the mounted miniboot iso image to /var/pxeboot. This will
allow the installation of controller-1 and the computes of the
subcloud via pxeboot.
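
The copy step described above can be sketched as follows. This is only an
illustration: the function name copy_efi_image is hypothetical, and the
assumption that efi.img sits at the top level of the mounted miniboot ISO is
not confirmed by the commit message.

```python
import shutil
from pathlib import Path


def copy_efi_image(iso_mount: Path,
                   pxeboot_dir: Path = Path("/var/pxeboot")) -> Path:
    """Copy efi.img from a mounted ISO into the pxeboot directory.

    Hypothetical helper; assumes efi.img is at the top of the ISO mount.
    """
    src = iso_mount / "efi.img"
    if not src.is_file():
        raise FileNotFoundError(f"efi.img not found under {iso_mount}")
    pxeboot_dir.mkdir(parents=True, exist_ok=True)
    # copy2 preserves timestamps and returns the destination path.
    return Path(shutil.copy2(src, pxeboot_dir / "efi.img"))
```

Failing loudly when the image is absent makes a missing or wrongly mounted
ISO visible at staging time rather than at the first pxeboot attempt.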

Test Plan:
PASSED: Verify that all nodes in the subcloud install, come online
      and are unlocked, enabled and available by the end of the
      installation process.

PASSED: Verify that multinode install completes successfully with
      prestaged ostree_repo.

Depends-On: https://review.opendev.org/c/starlingx/metal/+/862619/
Story: 2010118
Task: 46754

Signed-off-by: Shrikumar Sharma <shrikumar.sharma@windriver.com>
Change-Id: I0a6789b5a86f89da5e86581ab7b3eed950361ce7
2022-11-17 17:24:13 +00:00
Yue Tao
18c7435e39 Debian: metal: update debian_iso_image.inc
Move the packages of "metal" from stx-std.lst to debian_iso_image.inc

Test Plan:

Pass: build-pkgs -c -a
Pass: build-image
Pass: boot

Story: 2008862
Task: 46844

Signed-off-by: Yue Tao <yue.tao@windriver.com>
Change-Id: Ib284ae6f1762b0f3ca2fea242b49c1b75846286d
2022-11-16 12:06:51 +08:00
Zuul
1132443626 Merge "Deprecated sysinv-fpga-agent service cleanup" 2022-11-14 14:59:03 +00:00
Davi Frossard
9df5d206df Deprecated sysinv-fpga-agent service cleanup
Removing setup for pmon files. The service sysinv-fpga-agent
doesn't exist anymore. So this change is only a cleanup.

Test plan (AIO-SX):
PASS: Build, boot, bootstrap and unlock.

Story: 2010087
Task: 45628

Depends-on: https://review.opendev.org/c/starlingx/integ/+/864133
Signed-off-by: Davi Frossard <dbarrosf@windriver.com>
Change-Id: I0e56483f49be3a64bcb8047934df5bbb13fe1490
2022-11-11 18:40:38 +00:00
Shrikumar Sharma
67a31d1c6b Revert "Enable Multinode Subcloud in Distributed Cloud"
This reverts commit d09313ff0b527efdcfd2c03bdfb950eb1432be10.

While the code here itself is functionally correct and tested,
a download in the code is dependent on a location on the
active System Controller that is overridden by a drbd2 mount
on /var/www/pages/iso.

This drbd2 mount masks the pxeboot related files which were
placed there during System Controller installation.

Reverting this change until the drbd mount issue on
/var/www/pages/iso on the active System Controller is resolved.

Signed-off-by: Shrikumar Sharma <shrikumar.sharma@windriver.com>
Change-Id: Ie91fde9a09f693d133fa484782a7df28ffd29faf
2022-11-11 18:33:27 +00:00
Eric MacDonald
0e7024f9a7 Grub file modifications for Debian signed UEFI installs
Initial delivery of UEFI system node installs did not
use the signed boot loader. As a result, Secure Boot
of system nodes was not supported. This update changes
that by swapping in the signed bootx64.efi boot loader
in a puppet update; see Depends-On.

This update modifies the pxe-network-installer
and kickstart to support a robust UEFI system node
install that supports Secure Boot.

The first change creates and uses an stx template
file from the LAT grub file. This is done to avoid ongoing
and difficult-to-implement LAT grub file hack changes
from the kickstart.

This new grub.cfg.stx file is packaged in the
pxe-network-installer.

The kickstarts are modified to replace the LAT grub.cfg
file with the new stx template file grub.cfg.stx. As far
as this update goes, this template file is a null change
from the LAT grub file and represents what the LAT grub
file looked like at the time the template was created.

Moving forward, further changes to the system node
install grub file will be made to this new grub.cfg.stx
template file.

The second change is to modify the existing stx unprovisioned
default pxe-grub.cfg files to look for the new MAC-based
config file with the '.cfg' extension.

The system node install mac-based grub files are dynamically
created with no signature file. To work around that, this
update exports the LAT environment variable 'skip_check_cfg'
which instructs LAT to 'skip' the grub menu signature 'check'
for these dynamically created grub files.

An additional change is made to handle timer reload on menu
refresh if the new node remains unprovisioned after timeout.

Test Plan:

PASS: Verify the default LAT file is renamed and the new
      template file positioned in its place.
PASS: Verify Debian pxe-network-installer package update
PASS: Verify Debian AIO DX UEFI Install
PASS: Verify CentOS kickstarts do not require the kickstart change

PASS: Verify build and UEFI install
      - Debian
      - CentOS
PASS: Verify unprovisioned grub menu reload handling with
      recurring timeout until node is provisioned.

Regression:

PASS: Verify host-delete and host-update install and unlock
PASS: Verify host-reinstall and host-unlock
PASS: Verify lock/unlock controller-1 and controller-0
PASS: Verify lock/delete/reinstall/unlock controller-1
PASS: Verify swact to controller-1
PASS: Verify lock/delete/reinstall/unlock controller-0

Depends-On: https://review.opendev.org/c/starlingx/stx-puppet/+/863776

Story: 2009968
Task: 46701
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
Change-Id: Id073842ac1b29acf54c999022a9e37d4c2366031
2022-11-10 23:12:53 +00:00