117 Commits

Author SHA1 Message Date
Michel Thebeau
d921df347e chart version auto-increment scheme
Refer to the example for auto-increment presented by Bob Church:
https://review.opendev.org/c/starlingx/platform-armada-app/+/904464

Implement these specifics for vault-helm:

 - Use StarlingX debian git revcount packaging mechanisms to derive the
   semver BUILD version for upstream helm charts which maintains the
   upstream chart version and adds a versioned BUILD extension.

     <valid semver> ::= <version core> "+" <build>

   Chart version (MAJOR.MINOR.PATCH+STX.REV) is passed to 'helm package'
   command to force the version, where REV == 'git revcount'

 - Update the rules to automatically update the chart versions in the
   fluxCD helmrelease.yaml files.
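
The version composition described above can be sketched as follows. This is
only an illustration, not the packaging rules themselves: the function name
chart_build_version and the revcount value 12 are hypothetical, and the
upstream version is assumed to be a plain MAJOR.MINOR.PATCH core.

```python
import re

# Assumption: the upstream chart version is a plain MAJOR.MINOR.PATCH core.
_CORE = re.compile(r"^\d+\.\d+\.\d+$")


def chart_build_version(upstream: str, revcount: int, label: str = "STX") -> str:
    """Return '<upstream>+<label>.<revcount>', a semver string with a build part."""
    if not _CORE.match(upstream):
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {upstream!r}")
    if revcount < 0:
        raise ValueError("revcount must be non-negative")
    # The upstream core is kept untouched; only the build extension is added.
    return f"{upstream}+{label}.{revcount}"


# Hypothetical revcount of 12 on top of an upstream 0.25.0 chart.
print(chart_build_version("0.25.0", 12))  # prints 0.25.0+STX.12
```

The resulting string is the kind of value that would be passed to
'helm package' to force the chart version.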

Test Plan:
PASS  file byte level comparison of package before/after
PASS  AIO-SX vault sanity
PASS  application-update

Story: 2010929
Task: 49399

Change-Id: Id40547c1001ab8fa2d7c83abbcc5c9d44185ee2f
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2024-02-20 21:37:09 +00:00
Tae Park
fae21895d7 Update helm charts for new vm docker image
Updating helm charts with the latest vault manager docker image tag.

Test Plan:
PASS Vault sanity
PASS Check for installation of correct image

Story: 2010930
Task: 49526

Depends-On: https://review.opendev.org/c/starlingx/root/+/908336

Change-Id: I54d33f7c9c8d58df7424c51f9bf366d8746264f0
Signed-off-by: Tae Park <tae.park@windriver.com>
2024-02-08 13:18:06 -05:00
Tae Park
4f504c064c Fix KUBE_LATEST_VERSION to installed version
After updating the kubectl versions, we noticed that there is a mismatch
between the version in the KUBE_LATEST_VERSION variable and the actual
installed version. This change fixes the mismatch, and makes
KUBE_LATEST_VERSION point to the correct version.

Test Plan:
PASS Manual build of the image
PASS verify in docker history for correct version

Story: 2010930
Task: 49526

Change-Id: I5055fd204527c49cc47478d62d01e1afa18d3556
Signed-off-by: Tae Park <tae.park@windriver.com>
2024-02-07 15:13:29 -05:00
Zuul
969e0626b2 Merge "Add minimum Kubernetes version supported" 2024-02-06 19:45:27 +00:00
Tae Park
a6d4436e6e Updating supported kubectl version list
Updating the list of installed kubectl versions within the vault manager
docker image, to support the correct list of versions for the master
branch.

Test Plan:
PASS Manual build of the image
PASS vault sanity with the new image
PASS test rekey for each version of kubectl

Story: 2010930
Task: 49526

Change-Id: I7d2103ec62e587c4cc8a6725ab5f2e53f4e9e93d
Signed-off-by: Tae Park <tae.park@windriver.com>
2024-02-06 13:37:31 -05:00
Igor Soares
0313845b34 Add minimum Kubernetes version supported
Add the supported minimum Kubernetes version into the application
metadata file.

The minimum Kubernetes version is set to 1.24.4 and should be changed
accordingly for future application updates.

The "supported_k8s_version:minimum" field is optional but it will become
mandatory in the near future.

This also contains a fix to properly trigger the Tox metadata checks.

Test Plan
PASS: build-pkgs && build-image
PASS: Apply application

Story: 2010929
Task: 49507

Change-Id: I6d698b94cf7008f574d4170e3bd1a8d494d5e619
Signed-off-by: Igor Soares <Igor.PiresSoares@windriver.com>
2024-02-06 15:14:34 -03:00
Tae Park
6fccda0818 Add configuration for pod termination wait time
Adding new configuration options for pod termination wait sequence. The
options set the number of times the new vault-manager pod will check
that the old vault-manager pod is still running, and the number of
seconds to wait between each check.
The total default wait time is now 60s.
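
A minimal sketch of the wait sequence, assuming a simple poll loop. The
function and parameter names are hypothetical, and the split of the 60s
default into 6 checks of 10 seconds is an assumption chosen only to match
the stated total; the actual helm values are not shown here.

```python
import time
from typing import Callable


def wait_for_old_pod(
    old_pod_running: Callable[[], bool],
    checks: int,
    interval_s: float,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll up to `checks` times; return True once the old pod is gone.

    The maximum total wait is checks * interval_s seconds.
    """
    for _ in range(checks):
        if not old_pod_running():
            return True
        sleep(interval_s)
    # The old pod was still running at every check.
    return False
```

With the assumed values, wait_for_old_pod(probe, checks=6, interval_s=10)
waits at most 60 seconds before giving up.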

Test Plan:
PASS vault builds successfully with the changes
PASS vault sanity on AIO-SX
PASS Test the new helm values

Story: 2010930
Task: 49476

Change-Id: Ie0d4c1fffccf59618cb10bc1e201468f5ffceed0
Signed-off-by: Tae Park <tae.park@windriver.com>
2024-01-31 09:40:29 -05:00
Tae Park
7c22500b16 Include kubectl v1.21 and v1.22 to supported list
Adding kubectl version 1.21.14 and 1.22.17 to the list of supported
kubectl versions in the vault manager container image. This is for
supporting platform upgrade from stx.6.0 to stx.8.0.

Test Plan:
PASS Manual build of the image
PASS vault sanity with the new image

Story: 2010930
Task: 49423

Change-Id: Ie10abc6473790cf44b9d69c4d706338d5063aa5b
Signed-off-by: Tae Park <tae.park@windriver.com>
vf/kernel-6.6
2024-01-18 15:06:52 +00:00
Zuul
9e4244e492 Merge "update vault helm chart to 0.25.0" 2024-01-11 18:01:20 +00:00
Sabyasachi Nayak
f61e33f6e1 update vault helm chart to 0.25.0
Replace references of 0.24.1 with 0.25.0.  Refresh the patches for
vault-manager and agent image reference. Update the image tags to match
the new vault chart.

The vault helm chart uses vault server version 1.14.0. The latest
version of the vault server in the 1.14.x series is 1.14.8. Most of the
changes between the vault v1.14.0 and v1.14.8 tags were verified to be
backports or cherry-picks of commits, i.e. bug fixes, so vault server
version 1.14.8 is used.

Test plan:
 PASSED AIO-sx and Standard 2+2
 PASSED vault aware and un-aware applications
 PASSED HA tests
 PASSED test image pulls from private registry with external network
      restriction

story: 2010393
Task: 49391

Change-Id: I6bd022fed79ead6e1dc224e323a179d1dcd3ab0f
Signed-off-by: Sabyasachi Nayak <sabyasachi.nayak@windriver.com>
2024-01-10 17:47:38 +00:00
Igor Soares
e4504dd0e1 Application versioning based on build release
This change will automatically adjust versioning of the application
tarball and python plugins to reflect the same version reported by
SW_VERSION in /etc/build.info.

Test plan:
PASS: build-pkgs -a & build-image
PASS: Confirm that the tarball version matches the platform version
PASS: Apply application

Story: 2010929
Task: 49354

Change-Id: Ib7afcce8b43db358ed7fa6b9bf83c4d3abd8db64
Signed-off-by: Igor Soares <Igor.PiresSoares@windriver.com>
2023-12-29 12:45:46 -03:00
Zuul
bde0b6c4da Merge "Update app Zuul Check Jobs." 2023-12-20 16:19:49 +00:00
Tae Park
857fedecc6 Issue a Warning for Vault-Manager PVC Storage
This commit adds an additional check for PVC storage for vault-manager
after PVC-to-k8s conversion. If the storage is found then it will log a
warning during start-up of vault manager.

Test Plan:
PASS bashate
PASS AIO-SX vault sanity
PASS New code issues logs only when the PVC storage persists after
     conversion

Story: 2010930
Task: 49293

Change-Id: I2d669b06927b9d396ce5d6e582983ab78a3cc5fc
Signed-off-by: Tae Park <tae.park@windriver.com>
2023-12-18 16:53:33 -05:00
Reed, Joshua
fd1d13a008 Update app Zuul Check Jobs.
Modify code to conform to flake8 and pylint.

Jobs are now flake8, pylint, py39 and metadata.

Test Plan
PASS - All zuul jobs pass as expected.

Story: 2010929
Task: 49283

Change-Id: I3e3f5191a2dac94e35b75bccdd563dc108f187bf
Signed-off-by: Reed, Joshua <Joshua.Reed@windriver.com>
2023-12-18 15:34:44 -06:00
Michel Thebeau
494edafaa9 Remove hardcoded vault and sva-vault
The vault namespace and full-name are in variables and should not have
been hardcoded.

Test Plan:
PASS  bashate of rendered init.sh
PASS  vault sanity
PASS  all affected code paths

Story: 2010930
Task: 49232

Change-Id: I1c4765b907ce8ce4200e98575922467edb34e9fd
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2023-12-15 15:40:47 +00:00
Michel Thebeau
1aa869135b Fix removal of rekey milestone secrets
When vault-manager is killed during finalizeRekey the k8s secrets may
not be deleted.  Especially: the kubectl command deleting multiple
secrets may be interrupted.

It is unclear in what order kubectl/k8s would delete the secrets when
they are specified in a single command - i.e., it is observed to be a
different order than what was specified.  Use one kubectl command for
each milestone secret.

Use cluster-rekey-audit as the final milestone.  Fix needsRekey to allow
the procedure to resume as long as cluster-rekey-audit persists.
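
The one-command-per-secret approach might look like the sketch below. It
only builds the kubectl argument lists and does not run them. The helper
name and the milestone names other than cluster-rekey-audit are
hypothetical, and deleting cluster-rekey-audit last is an interpretation of
"final milestone".

```python
from typing import List, Sequence

FINAL_MILESTONE = "cluster-rekey-audit"


def delete_commands(namespace: str, milestones: Sequence[str]) -> List[List[str]]:
    """Build one 'kubectl delete secret' command per milestone secret.

    The final milestone is moved to the end so that it persists until every
    other secret is gone, which lets an interrupted procedure resume.
    """
    ordered = [name for name in milestones if name != FINAL_MILESTONE]
    if FINAL_MILESTONE in milestones:
        ordered.append(FINAL_MILESTONE)
    return [
        ["kubectl", "delete", "secret", "-n", namespace, name]
        for name in ordered
    ]


# Hypothetical milestone names, apart from cluster-rekey-audit.
for cmd in delete_commands("vault", ["cluster-rekey-audit", "milestone-a", "milestone-b"]):
    print(" ".join(cmd))
```

Because each secret has its own command, the deletion order is decided by
the caller rather than by kubectl.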

Also adjust some comments and remove some chatty logs.

Test Plan:
PASS  bashate of rendered init.sh
PASS  vault sanity, including rekey
PASS  application-update
PASS  kubectl delete vault-manager pod tests
PASS  kill -9 vault-manager tests

Story: 2010930
Task: 49174

Change-Id: I2e5e15b4f89f9f9495381d33064c631cde6da193
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2023-12-15 15:40:37 +00:00
Tae Park
65b38b925d Prevent multiple vault-manager pods from acting
This commit adds a new check in the main loop of vault manager
for multiple instances of vault manager. Only one vault manager is
needed, so extra instances will be put to sleep or
terminated until only one is left.

Story: 2010930
Task: 49199

Test Plan:
PASS Bashate
PASS Vault sanity test

Change-Id: I0fd881aa4078528ba3f804087db87069dae58f7e
Signed-off-by: Tae Park <tae.park@windriver.com>
2023-12-13 19:58:07 +00:00
Michel Thebeau
be0e85ec77 stability fixes for vault-manager rekey
Continue/complete the rekey procedure when vault-manager is interrupted
(kill -9). Fixes include:
  - Refactor logic of rekeyRecover function
  - additionally handle specific failure scenarios to permit the rekey
    procedure to continue
  - correct return codes of procedure functions to fall through to the
    recovery procedure
  - re-sort the tests of needsShuffle
  - misc adjustment of logs and comments

The additional handling of failure scenarios includes:
  - partial deletion of cluster-rekey secrets after copying to
    cluster-key
  - restart rekey on failure during authentication

Test Plan:
PASS  vault sanity, ha sanity
PASS  IPv4 and IPv6
PASS  system application-update, and platform application update
PASS  rekey operation without interruption
PASS  bashate the rendered init.sh

Stability testing includes kubectl deleting pods and kill -9 processes
during rekey operation at intervals spread across the procedure, with
slight random time added to each interval

PASS  delete a standby vault server pod
PASS  delete the active vault server pod
PASS  delete the vault-manager pod
PASS  delete the vault-manager pod and a random vault server pod
PASS  delete the vault-manager pod and the active pod
PASS  delete the vault-manager pod and a standby pod
PASS  kill -9 vault-manager process
PASS  kill -9 active vault server process
PASS  kill -9 standby vault server process
PASS  kill -9 random selection of vault and vault-manager processes

Story: 2010930
Task: 49174

Change-Id: I508e93a36de9ca8b4c8fa1da7941fe49936de159
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2023-12-07 13:30:32 +00:00
Michel Thebeau
615d6e4657 use the vault-manager image stx.9.0-v1.28.4
This new image adds uuidgen and multiple versions of kubectl, which
vault-manager now supports.

Test Plan:
PASS  sanity test of vault application
PASS  watch vault-manager log over kubernetes upgrade

Depends-On: Ib0a105306cecb38379f9d28a70e83ed156681f08
Depends-On: I03e37af31514c3fa3b95e0560a6d6f83879ec9de

Story: 2010930
Task: 49177

Change-Id: I7f578ac7e8d2aab98fb1e104f336fd750d7d7933
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2023-12-06 21:23:55 +00:00
Michel Thebeau
733ca0e9a6 Add multiple version support of kubectl
Allow vault-manager to pick the version of kubectl that matches the
currently running server.  Add a helm override option to pick a
particular version available within the image.
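
A rough sketch of the selection logic, assuming kubectl is matched to the
server on MAJOR.MINOR. The function names are hypothetical, and raising an
error when nothing matches is an assumption; the real script may fall back
differently.

```python
from typing import Optional, Sequence, Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (MAJOR, MINOR) from strings such as 'v1.24.4' or '1.24.4'."""
    parts = version.lstrip("v").split(".")
    return int(parts[0]), int(parts[1])


def pick_kubectl(
    server_version: str,
    available: Sequence[str],
    override: Optional[str] = None,
) -> str:
    """Pick the kubectl version to use from those shipped in the image."""
    if override is not None:
        # An explicit override must still name a version present in the image.
        if override not in available:
            raise ValueError(f"override {override!r} is not in the image")
        return override
    wanted = major_minor(server_version)
    for candidate in available:
        if major_minor(candidate) == wanted:
            return candidate
    raise LookupError(f"no kubectl matching server {server_version!r}")


print(pick_kubectl("v1.22.3", ["1.21.14", "1.22.17", "1.24.4"]))  # prints 1.22.17
```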

Refresh the helm chart patches on top of this change.

Test Plan:
PASS  Unit test the code
PASS  helm chart override
PASS  sanity of vault application
PASS  watch vault manager log during kubernetes upgrade

Story: 2010930
Task: 49177

Change-Id: I2459d0376efb6b7e47a25f59ee82ca74b277361f
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2023-12-04 20:46:29 +00:00
Michel Thebeau
65e8589183 add vault rekey option during upgrade
Allow the vault to be rekeyed after conversion from PVC storage to k8s
storage of the shard secrets.

Update the vault-manager patch to include rekey enable/disable and
timing parameters in helm values.yaml. Refresh the other patches
(include git long log descriptions in those patch files omitting
description).

Test Plan:
PASS  vault sanity, ha sanity
PASS  IPv4 and IPv6
PASS  system application-update, and platform application update
PASS  rekey operation without interruption
PASS  helm chart options
PASS  bashate the rendered init.sh

Stability testing includes kubectl deleting pods and kill -9 processes
during rekey operation at intervals spread across the procedure, with
slight random time added to each interval

PASS  delete a standby vault server pod
PASS  delete the active vault server pod
PASS  delete the vault-manager pod
PASS  delete the vault-manager pod and a random vault server pod
PASS  delete the vault-manager pod and the active pod
PASS  delete the vault-manager pod and a standby pod
TBD  kill -9 vault-manager process
TBD  kill -9 active vault server process
TBD  kill -9 standby vault server process
TBD  kill -9 random selection of vault and vault-manager processes

Story: 2010930
Task: 48850

Change-Id: I87911819c27caaf30be69b3c969a20ed97be42cb
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2023-12-04 19:21:11 +00:00
Michel Thebeau
dfcfa46061 improve error handling in vaultInitialized
A rare condition can result in vault servers not responding to this
early initialization status check.  The omission has no effect after
vault is initialized, but fails the application if it happens before
vault is initialized.

Test Plan:
PASS  Unit test the changes
PASS  vault sanity

Story: 2010930
Task: 49168

Change-Id: I6b5270f89ccea27f6c10edc6e1bc250b248f4054
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2023-12-04 19:21:07 +00:00
Zuul
2f047e86f3 Merge "update vault-manager docker image" 2023-12-04 18:11:31 +00:00
Michel Thebeau
8c6d86ea3b improve error handling of unsealVault
Add generic and specific error handling for unsealVault function.
Changes include:

  Recognize unseal success from the API response
  Recognize and stop unseal procedure if the response indicates
    authentication failure
  Always 'reset' unseal in progress, if any
  Recognize if the requested server is already unsealed
  Handle return code from vaultAPI function
  Remove key_error check as it is printed as DEBUG by vaultAPI
  Refactor reused variables to be less specific

Test Plan:
PASS  unit test the function
PASS  vault sanity including HA test

Story: 2010930
Task: 49167

Change-Id: If55589d207bbb374a6137922f62e2d494278e72c
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2023-11-30 21:17:34 +00:00
Michel Thebeau
394f20c28a update vault-manager docker image
Add uuidgen and add multiple versions of kubectl.

Test Plan:
PASS  Manual build of the image
PASS  Verify uuidgen and kubectl version in the running container
PASS  vault sanity with the new image

Story: 2010930
Task: 48849

Change-Id: Ib0a105306cecb38379f9d28a70e83ed156681f08
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2023-11-30 17:21:04 +00:00
Michel Thebeau
8669743ae2 add vault-manager pause debugging option
A debug feature to allow vault manager function to be paused.  Use case
may include setting up specific conditions for test.

Include a helm override for initial pause condition, which may be
difficult to reach as a pod starts.

Test Plan:
PASS  vault sanity
PASS  unit test the pause_on_trap code, helm override
PASS  misc usage of the option

Story: 2010930
Task: 49048

Change-Id: Icd69a79685427268d7d59b3fbe655b9b93e8ece8
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2023-11-13 19:04:30 +00:00
Michel Thebeau
f2d02300a9 add interactive mode for init.sh
Allow the init.sh script to be sourced by an author to permit
development and test activity.

Test Plan:
PASS vault sanity test
PASS enter vault-manager pod and source init.sh
PASS bashate on the rendered script

Story: 2010930
Task: 49047

Change-Id: I899dcf6df793ee69b51b63a8b214320282d091fa
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2023-11-13 19:02:36 +00:00
Michel Thebeau
c91580ebd2 add generic function for vault REST API calls
Replace curl REST API calls with a generic function. This prepares for
adding more functionality to vault manager; more REST API calls.

Main feature includes error/debug logging of the responses.

Also includes:
Define variables for server targets
Refactor ubiquitous global 'row' variable, convert to parameter
Explicitly declare the curl's default connect-timeout (120s)

Test plan:
PASS vault sanity
PASS vault HA test
PASS all code paths with REST API calls
PASS misc examples GET, POST, DELETE
PASS unit test the new function
PASS bashate of the rendered document

Story: 2010930
Task: 49042

Change-Id: Ic329f075ba1c0480f5d507f9768f76fa86fc2094
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2023-11-13 19:02:25 +00:00
Michel Thebeau
464f9d0e76 Conversion of storage during application update
Add lifecycle code to read secrets from PVC mounted to running
vault-manager, and vault-manager code for conversion of storage from PVC
to k8s secrets.

The lifecycle code is added because the previous version of
vault-manager does not respond to SIGTERM from kubernetes for
termination.  And yet the pod will be terminating when the new
vault-manager pod runs.  Reading the PVC data in lifecycle code before
helm updates the charts simplifies the process when vault-manager is
running during application-update.

The new vault-manager also handles the case where the application is not
running at the time the application is updated, such as if the
application is removed, deleted, uploaded and applied.

In general the procedure for conversion of the storage from PVC to k8s
secrets is:
 - read the data from PVC
 - store the data in k8s secrets
 - validate the data
 - confirm the stored data is the same as what was in PVC
 - delete the original data only when the copy is confirmed
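
The copy, validate, then delete order above can be illustrated with plain
dictionaries standing in for the PVC and the k8s secrets. This is a sketch
of the ordering only; the function name and the non-empty check used as
"validation" are assumptions.

```python
from typing import Dict


def convert_storage(pvc: Dict[str, str], secrets: Dict[str, str]) -> bool:
    """Move shard data from `pvc` to `secrets`; delete from `pvc` only on success."""
    # Read the data from PVC.
    data = dict(pvc)
    # Store the data in k8s secrets.
    secrets.update(data)
    # Validate the data and confirm the stored copy matches the original.
    for name, value in data.items():
        if not value or secrets.get(name) != value:
            return False  # leave the original PVC data untouched
    # Delete the original data only when the copy is confirmed.
    pvc.clear()
    return True
```

If any check fails, the function returns before the original data is
removed, so the conversion can be retried.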

The solution employs a 'mount-helper', an incarnation of init.sh,
that mounts the PVC resource so that vault-manager can read it.  The
mount-helper mounts the PVC resource and waits to be terminated.

Test plan:
PASS  vault sanity
PASS  vault sanity via application-update
PASS  vault sanity update via application remove, delete, upload, apply
      (update testing requires version bump similar to change 881754)
PASS  unit test of the code
PASS  bashate, flake8, bandit
PASS  tox

Story: 2010930
Task: 48846

Change-Id: Iace37dad256b50f8d2ea6741bca070b97ec7d2d2
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2023-11-02 15:12:47 +00:00
Michel Thebeau
cd165b8f5c fix flake8 and bandit complaints
Fixes based on local run of flake8 and bandit. Still passes pylint and
py39 unit tests.

Omit enabling the zuul jobs for flake8 and bandit, but fix the
complaints.  The bug https://bugs.launchpad.net/starlingx/+bug/2042457
will track getting those enabled.

Test plan: (with change 899277)
PASS  vault sanity
PASS  vault sanity via application-update
PASS  vault sanity update via application remove, delete, upload, apply
PASS  unit test of the code
PASS  bashate, flake8, bandit
PASS  tox

Story: 2010930
Task: 49026

Change-Id: Iab8c156be7bd5d32d420a500b7abf4f2ea2a2ac6
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2023-11-02 15:12:44 +00:00
Tae Park
ef1b8f663b Split key shards in json
Splitting key shard file gained from vault initialization into separate
files. Each key shard (plus root token) is now stored separately.

Test Plan:
PASS vault sanity
PASS bashate

Story: 2010930
Task: 48847

Change-Id: I8a007e505ea7ee9764301e494f4801a25cb194ce
Signed-off-by: Tae Park <tae.park@windriver.com>
2023-10-31 14:02:10 -04:00
Michel Thebeau
87bf94a0c5 fix print of new log level
When changing the log level, the logged new level was reported as the
log parameter rather than the configured log level.

Refactor the case statement as its own function so it can be used in
several places.  Use the conversion function to print the correct
configured log level.

Also print the user friendly text when detecting invalid log level.

Misc other changes including comments, line lengths, local declarations.

Test Plan:
PASS  Unit test log functions
PASS  changing configured log level
PASS  bashate on the rendered init.sh

Story: 2010930
Task: 48842

Change-Id: I6c96f2e5193d722bb9e4cd32eb66c2cd2f65a503
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2023-10-26 14:03:13 +00:00
Michel Thebeau
3684074d88 exit_on_trap: adjust default and debug behaviours
It is observed that the exit_on_trap is not working under normal
operation - vault-manager takes 30 seconds to exit when the application
is removed.  The default behaviour is to exit when the trap file exists.
Not exiting when the trap file exists only occurs when the trap file has
content.  This latter behaviour is a debugging option.

Add the conditional for empty trap file, and always exit if the trap
file is empty.

For the debugging feature it is helpful for the procedure to remember
the trap number set in the trap file.  Use DEBUGGING_TRAP global
variable to remember the debugging trap requested, and exit whenever
that exit_on_call trap is run.
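
The decision described above can be summarized in a small sketch. The
function name and arguments are hypothetical, the real logic lives in the
bash init.sh, and the sketch leaves out the DEBUGGING_TRAP memory: it only
shows the empty-file versus file-with-content distinction.

```python
from typing import Optional


def should_exit(trap: int, trap_file_content: Optional[str]) -> bool:
    """Decide whether vault-manager exits at exit point `trap`.

    `trap_file_content` is None when the trap file does not exist.
    """
    if trap_file_content is None:
        return False  # no trap file: keep running
    requested = trap_file_content.strip()
    if requested == "":
        return True  # default behaviour: an empty trap file always exits
    # Debugging option: exit only at the requested exit point.
    return requested == str(trap)
```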

Also:
Refactor the parameter variable as 'trap' for readability. Adjust all
the logs for exit_on_trap to permit search for "exit_on_trap".  And log
at INFO level when exiting vault-manager.

Test Plan:
PASS  unit test exit_on_trap
PASS  default behavior; vault-manager responds promptly to termination
PASS  debug feature exits on matching trap (remembers debug trap)
PASS  bashate on the rendered init.sh

Story: 2010930
Task: 48843

Change-Id: Id67a89e063daa18ba7627553ac2a19ca673ff00b
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2023-10-25 19:58:13 +00:00
Tae Park
11c11f1e0c Support storing key shards in k8s secrets
Replaces current implementation of storing key shards from PVC to k8s
secret. Includes additional improvements to existing vm code, such as
added error checking.

Test Plan:
PASS vault sanity test
PASS bashate

Story: 2010930
Task: 48845

Change-Id: Ie0e5fe9749fa871d73d7b52600a8905abcb31887
Signed-off-by: Tae Park <tae.park@windriver.com>
2023-10-23 11:11:46 -04:00
Tae Park
9b9e0e8f62 Move Template Values to Top
Organizing the helm template values so that each call is unique. This
should allow for more efficient management of helm values within
init.sh. Also did some additional code cleanup.

TEST PLAN:
PASS vault sanity test
PASS bashate

Story: 2010930
Task: 48844

Change-Id: Ia8e7820b9c86e307991f9affda7035bb89dfcc57
Signed-off-by: Tae Park <tae.park@windriver.com>
2023-10-11 16:46:13 -04:00
Tae Park
7267389382 Add manual exit points for catching SIGTERM
Adds new catch points for SIGTERM sent to the vault manager. Allows the
vault manager to exit gracefully. This includes debugging capability to
exit on a specific point.

Test Plan:
PASS vault sanity test
PASS bashate
PASS vault manager pod exits and restarts with no errors when an
     exit_on_trap file is created with the specified exit point number.
     The file must be located in the vault manager work directory.

Story: 2010930
Task: 48843

Change-Id: I3dadccfca554b448d729d37132c8af17324368f1
Signed-off-by: Tae Park <tae.park@windriver.com>
2023-10-06 21:17:00 +00:00
Tae Park
a56dc6dfbb Add log levels to vault manager
Adding new log levels in vault manager to help diagnose logs. There are
five levels added (DEBUG, INFO, WARNING, ERROR, FATAL). Existing logs
are assigned to one of the above levels, and some of the echo commands
are removed since the log levels will now fulfill their role.
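
As an illustration of how five ordered levels can filter output, here is a
minimal sketch. The level names come from the description above; the
numeric ordering, the function names and the output format are assumptions.

```python
from typing import Optional

# Ordered from least to most severe.
LEVELS = ("DEBUG", "INFO", "WARNING", "ERROR", "FATAL")


def level_number(name: str) -> int:
    """Convert a level name to its position in LEVELS."""
    try:
        return LEVELS.index(name.upper())
    except ValueError:
        raise ValueError(f"invalid log level: {name!r}") from None


def format_log(level: str, message: str, configured: str = "INFO") -> Optional[str]:
    """Return the log line, or None when `level` is below the configured level."""
    if level_number(level) < level_number(configured):
        return None
    return f"{level.upper()} {message}"


print(format_log("WARNING", "vault server not responding"))
```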

Test Plans:
PASS vault sanity test
PASS observe new log levels within vault manager log
PASS assign a new default log level and observe new log reflecting it
PASS bashate

Story: 2010930
Task: 48842

Change-Id: I03679ade6e1a6dcc51d13e76264f6c05d132f7c7
Signed-off-by: Tae Park <tae.park@windriver.com>
2023-10-04 16:48:43 -04:00
Tae Park
f7a37e6ad9 Removing default injector anti-affinity rules
Adding a null override over the default anti-affinity rules for vault
injectors. The default rule only allows one vault injector pod at a
time. This is a problem because helm-override and application apply will
try to schedule a new pod before completely removing the old pod.
This change lets a new vault agent injector pod be scheduled without
issue.

TEST PLAN:
 - Test for AIO-SX
 - Update helm-override so that vault-injector has a different image tag than default
 - apply the new helm-override
 - There should be no FailedScheduling error in the vault pods
 - Sanity test for both AIO-SX and AIO-DX + 1 worker

Closes-bug: 2030901

Change-Id: I9814f502558ab1cbecad48cf37341639c964258f
Signed-off-by: Tae Park <tae.park@windriver.com>
2023-08-15 18:23:35 +00:00
Tae Park
896008fb73 vault-manager wait for one server only when initialized
Modifying the vault-manager initialization logic so that it waits
for the number of active pods to equal the replica value only
if the raft is not yet initialized.

TEST PLAN:
 - In a 2 controller, 1 worker setup,
 - Upload and apply vault
 - Lock the host that vault-manager is running on
 - Vault manager should restart
 - Within the logs, there should not be a repetition of " Waiting for sva-vault statefulset running pods..."
 - Vault Sanity test in AIO-SX
 - Bashate of rendered init.sh

Closes-bug: 2029375

Signed-off-by: Tae Park <tae.park@windriver.com>
Change-Id: I41990b87395a5d5364ef91c048f740d0f0675d6b
vf/antelope
2023-08-08 15:31:29 -04:00
Tae Park
ad3f7808ea Disable Vault Web UI
Changes the setting under vault overrides to disable Vault web UI.

Test Plan:
PASS Port 8200 is unreachable from vault kubernetes pod
  -  kubectl port-forward --address=10.10.31.2 -n vault pod/sva-vault-0 23443:8200
PASS vault kubernetes pod settings show ui = false
  -  kubectl get configmaps -n vault sva-vault-config -o yaml

Story: 2010393
Task: 48381
Change-Id: Ib7915f3071c663b1375e80f04104f1f4fb872a1e
Signed-off-by: Tae Park <tae.park@windriver.com>
2023-07-13 17:28:28 +00:00
Alan Bandeira
08305a2286 Add label core affinity labels to each vault pod
This commit adds support for core affinity labels for
vault. The label 'app.starlingx.io/component' tells
k8s whether to run the application pods on 'platform'
or 'application' cores.

The default value for the 'app.starlingx.io/component' label
is 'platform', but the label accepts the values
'application' and 'platform'. The override has to be
performed when vault is in the uploaded state, after
application remove or before the first apply. This
behavior is required to ensure that no vault pod is
restarted in an improper manner.

Test plan:

PASS: In a AIO-SX system upload and apply the vault app. When apply
      is finished, run "kubectl -n vault describe po sva | grep
      platform" and the output should be three instances of
      "app.starlingx.io/component=platform", indicating that the
      default configuration is applied for each pod.

PASS: In a AIO-SX, where the vault app is in the applied state, run
      "system application-remove vault" and override
      'app.starlingx.io/component' label with 'application' value by
      helm api. After the override, apply vault and verify
      'app.starlingx.io/component' label is 'application' on the
      pods describe output, similar to the previous test.

PASS: In a AIO-SX, where the vault app is in the applied state, run
      "system application-remove vault" and override
      'app.starlingx.io/component' label with any value other
      than 'platform' or 'application' and after the apply check if
      the default value of 'platform' was used for the pod labels.

PASS: In a Standard configuration with one worker node, upload and
      apply the vault app. When apply is finished, run 'kubectl -n
      vault describe po sva | grep -b3 "app.starlingx.io/component"'
      and check the output for the 'app.starlingx.io/component'
      label is the default value of 'platform' for each pod, with
      every vault server pod having the label.

PASS: In a Standard configuration with one worker node, remove vault
      and override 'app.starlingx.io/component' label with any value,
      valid or not, and after the override, apply vault. With vault
      in the applied state, verify the replica count override is kept
      and check the pods in a similar way to the previous test to
      validate that the HA configuration is maintained. The number
      of pod replicas should reflect the configuration.

Story: 2010612
Task: 48252

Change-Id: If729ab8bb8fecddf54824f5aa59326960b66942a
Signed-off-by: Alan Bandeira <Alan.PortelaBandeira@windriver.com>
2023-06-20 16:45:27 -03:00
Michel Thebeau
e2dd8a281f adjust server readiness probe
It is observed that vault pods consistently show readiness probe warning
when applying the application or when a pod is recovering.  The probe
runs "vault status" which returns failure when the vault is sealed.  The
probe failure is not impactful, but since there is a certain delay
before unseal is completed, adjust initialDelaySeconds to 25 to account
for the time required to unseal vault pods.  This commit should usually
omit readiness probe warning for a single recovering vault pod.

During testing it is observed that:
  Setting initialDelaySeconds to 15: a recovering pod shows readiness
probe warning.
  Setting initialDelaySeconds to 18, a recovering pod omits readiness
probe warning.
  On application-apply, the first pod to be unsealed _may_ show readiness
probe warning when initialDelaySeconds is 25.  Other pods will be
unsealed serially and will show readiness probe warning.

Test Plan:
  PASS  Standard controller storage 2+2
  PASS  HA tests, log inspection
  PASS  Inspection of kubectl describe of pods with various values for
        initialDelaySeconds

Story: 2010393
Task: 48237

Depends-On: https://review.opendev.org/c/starlingx/vault-armada-app/+/884553

Change-Id: I9ea6cca2b591c40bfe70737c0fb390b18b69f796
Signed-off-by: Michel Thebeau <michel.thebeau@windriver.com>
2023-06-15 09:34:00 -04:00
Michel Thebeau
a49584d4f9 remove vault-manager unseal delay
Options for vault-manager were introduced to delay unsealing of
recovering vault server pods until the active vault server pod would
start sending heartbeats to the recovering pod.  The behavior of vault
server that prompted the change to vault-manager is no longer
observed with vault server version 1.13.1.

Remove the unsealWaitIntervals so that vault manager will unseal the
recovering server immediately.

Test Plan:
  PASS  HA tests, review pods logs, election status
  PASS  active server remains active when a pod recovers
  PASS  no evidence of election attempts in vault server logs
  PASS  tested also with statusCheckRate=.1 to minimize delay
        (default 5s gives a random-ish delay of 0-5 seconds)

Story: 2010393
Task: 48236

Depends-On: https://review.opendev.org/c/starlingx/vault-armada-app/+/884553

Change-Id: Ifd73970658d6ef7a0e0ca5844b2db81d94bdde9f
Signed-off-by: Michel Thebeau <michel.thebeau@windriver.com>
2023-06-15 09:33:27 -04:00
Michel Thebeau
82478326fe update vault helm chart to 0.24.1
Replace references of 0.19.0 with 0.24.1.  Refresh the patches for
vault-manager and agent image reference. Update the image tags to match
new vault chart.

Test plan:
 PASS AIO-sx and Standard 2+2
 PASS vault aware and un-aware applications
 PASS HA tests
 PASS test image pulls from private registry with external network
      restriction

 Story: 2010393
 Task: 48109

Change-Id: Ib6b4d0a6f7d3a54676563c59f60d93d129c81c1c
Signed-off-by: Michel Thebeau <michel.thebeau@windriver.com>
2023-06-14 15:05:32 -04:00
Michel Thebeau
363529d1fc delete old chart build files
These have not been needed for a while and do not impact the build of
this application.

The Makefile remains as the necessary component of the build.

Test plan:
  PASS  compare chart files before/after to ensure no changes
  PASS  compare all of stx-vault-helm package before/after

Story: 2010393
Task: 47164

Change-Id: I97025ceee2875a6fc588d72436b55e7f5ac59062
Signed-off-by: Michel Thebeau <michel.thebeau@windriver.com>
2023-06-07 12:45:45 +00:00
Michel Thebeau
27f4742d8d remove extra vault-manager patch
This patch was part of the CentOS build, which was removed with
commit 20167fc54f7d3111762c52a4e8a4fe1e7c8ead5a

Whereas the patch was copied for debian here:
commit d96e143a34392324457f92019947d9af91ef803e

Test Plan:
PASS: debian build unaffected (there is no CentOS build)

Story: 2010393
Task: 47232

Change-Id: If90017b58f6220bca82e554e2fb50bd655d240ec
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2023-05-26 13:19:35 +00:00
Michel Thebeau
198f4e5164 set images to pull from configured registries
Add yaml to the fluxcd manifest which is compatible with the platform's
image pull and service parameter registry override handling.  The
platform will pull the image and populate registry.local, and the vault
injector agent will pull from registry.local.

Story: 2010393
Task: 47927

Test Plan:
PASS: sysinv.log shows that agentImage image is pulled when vault
      server image is hardcoded differently
PASS: agent image pulls when public network is blocked
PASS: agent image pulls when it is different than vault server image
PASS: vault app test, including vault un-aware application

Change-Id: Idd1215744bb31881127a6be23cf570166c79fad8
Signed-off-by: Michel Thebeau <michel.thebeau@windriver.com>
2023-05-05 18:56:31 -04:00
Michel Thebeau
1dab2dbf8f update vault-manager image tag
The new image has updated packages for CVE fixes, no other changes.

Test Plan:
PASS - apply vault application (inspect vault-manager pod)

Story: 2010710
Task: 47905

Change-Id: I83848d12baf0558edc0a2e4cd9a964f781edec56
Signed-off-by: Michel Thebeau <michel.thebeau@windriver.com>
2023-05-03 16:53:59 -04:00
Davlet Panech
809632ce2c Fix github mirroring for this repo
Updating the rsa ssh host key based on:
https://github.blog/2023-03-23-we-updated-our-rsa-ssh-host-key/

Note: In the future, StarlingX should have a zuul job and
secret setup for all repos so we do not need to do this
for every repo.

Needed to rename the secret, because zuul fails if like-named
secrets have different values in different branches of the same
repo.

Partial-Bug: #2015246
Change-Id: Ie65c51aabfa4b303b89634eb9e5c566669f5f5d9
Signed-off-by: Davlet Panech <davlet.panech@windriver.com>
2023-04-28 12:38:53 -04:00
Michel Thebeau
82d5d9abdc update vault-manager statefulset
Update the statefulset to prompt update strategy.  The config map is
updated in previous commits, for which we want vault-manager to restart.

Test Plan:
PASS - sw-patch upload/apply/install/remove
PASS - manual revert to original tarball (system application-update)

Story: 2010393
Task: 47731

Change-Id: Ib52d019170763d066c730d679067b91ed4d59bb5
Signed-off-by: Michel Thebeau <Michel.Thebeau@windriver.com>
2023-03-31 10:31:46 -04:00