233 Commits

Author SHA1 Message Date
Zuul
e5585cdaeb Merge "Adapt _validate_qos_rules_nbdb to flat and vlan networks" 2025-04-08 11:13:02 +00:00
Eduardo Olivares
45ff7d4f93 Adapt _validate_qos_rules_nbdb to flat and vlan networks
With recent neutron changes in how BW limits are applied to vlan
and flat networks, the validation of the QoS configuration on the OVN
NB DB needs to be adapted.

Change-Id: I34db362f161085b1c45bd14eb9eabbd1ddafd070
2025-04-07 10:35:40 +02:00
Eduardo Olivares
d43e56466a Adapt BW limit tests to traffic shaping
With recent neutron changes, traffic shaping is applied when network
type is either vlan or flat when BW limit is configured.

BW limit tests have been adapted to the new algorithm to avoid dropping
iperf control packets, which makes tests fail. Those packets could be
dropped because shaper buffers drop packets when the bitrate exceeds the
configured BW limit. In order to avoid this:
- Injected bitrate, which exceeds the configured BW limit, is limited
  to 1.5 * bwlimit
- iperf test duration is limited to 6 seconds (the shorter the iperf
  tests, the less likely shaper buffers get full)

Change-Id: Icc1e1eb333a8844d7ddcf7a17ccc48886b3889e1
2025-04-04 13:34:48 +02:00
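
The two limits described in d43e56466a can be illustrated with a minimal sketch. The function name, its parameters and the Kbit/s unit are assumptions made for illustration and are not taken from the plugin; only the 1.5 factor and the 6 second duration come from the commit message. The `-c`, `-b` and `-t` flags are standard iperf3 client options.

```python
def iperf3_client_args(server_ip, bw_limit_kbps, factor=1.5, duration=6):
    """Build iperf3 client arguments for a bandwidth-limit test.

    Hypothetical helper: the injected bitrate is capped at
    ``factor * bw_limit_kbps`` and the run is kept short so that the
    shaper buffers are less likely to fill up.
    """
    bitrate_kbps = int(bw_limit_kbps * factor)
    return [
        "iperf3",
        "-c", server_ip,            # client mode, connect to the server
        "-b", f"{bitrate_kbps}K",   # target bitrate in Kbit/s
        "-t", str(duration),        # test duration in seconds
    ]
```

With a 1000 Kbit/s limit, this sketch would ask iperf3 to inject 1500 Kbit/s for 6 seconds.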
Maor Blaustein
2ad8456264 Apply rollout wait to remaining tests
This change applies rollout wait introduced in patch [1] to any other
test that can benefit from such extra stability.

[1] 945106: Wait for config rollout to complete in test_api_server
https://review.opendev.org/c/x/whitebox-neutron-tempest-plugin/+/945106

Change-Id: I2aaa0170f702c278be1d193dca30450c46f1303f
2025-03-30 16:44:10 +03:00
780cbfafc8 [CI] Extend nova disk allocation ratio
With multi-threaded runs we see random issues with nova scheduling
as the disk gets fully utilized.

The issue is seen when jobs run on specific nodepool providers like
raxflex, which has a small disk (~53 GB). Extend the default disk allocation
ratio in jobs to avoid such failures.

Change-Id: Idaac65821cd289b3652f826cfda96f5d581671b0
2025-03-24 10:36:02 +05:30
0bbc254c72 Wait for config rollout to complete in test_api_server
test_api_server was not waiting for rollout to finish
and that was causing issues in follow up tests randomly.
This patch adds an option to wait for rollout to finish
when setting service config.
It can be used in other tests wherever needed.

Resolves: https://issues.redhat.com/browse/OSPRH-14166
Change-Id: Iabafe97934ed2b5ed40f1c529d6f2ffddeac2797
2025-03-20 18:59:22 +05:30
Zuul
2b7e019fa5 Merge "Add nested snat test" 2025-03-13 01:30:51 +00:00
Maor Blaustein
0bbfd62bb4 Wait in both log start track/retrieve
To correctly detect duplicates in tested log entries, the test needs to
wait both before and after logs are retrieved or tracking starts.

Looking at the related code in `test_only_accepted_traffic_logged`, there are 2
very close code lines [1] that use the `ping_ip_address` method (which uses `ping -c1`),
which may generate ICMP traffic that looks duplicated: 4 log entries, i.e. request/reply twice.

It is still suspicious that the 4 entries share the same second in their timestamps,
but to ensure this isn't a test timing fault, let's merge this fix.

[1] https://opendev.org/x/whitebox-neutron-tempest-plugin/src/commit/d08fcd29e9/whitebox_neutron_tempest_plugin/tests/scenario/test_security_group_logging.py#L559-L564

Related-Issue: #OSPRH-14560
Change-Id: Iaa18702edb2c553599657c66b7cb49120a48486e
2025-03-12 14:53:59 +02:00
Maor Blaustein
a01def7a72 Limit logging in security group logging tests
Log only the first and last entries, up to a given limit, to avoid
over-logging: a lot of traffic is generated to test the OVN security group
logging feature and its limits, and this caused the log file size to increase substantially.

Security group logging tests now differentiate between entries
fetched for test assertions (in full) and for the test log (in part).

This handles the issue of tempest failing, even if tests pass successfully,
because the tempest log file is too large.

Related-Bug: #2102022
Change-Id: I4e49f537f9779e503050bd50fd4b52e921f3dbe5
2025-03-12 12:12:22 +02:00
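
The idea in a01def7a72 of logging only the start and end entries can be sketched as follows. The helper name and the placeholder text are hypothetical; the plugin's actual implementation may differ. The full list would still be used for assertions, and only the shortened list would be written to the test log.

```python
def entries_for_log(entries, limit):
    """Return at most `limit` first and `limit` last entries for logging.

    Hypothetical helper: entries in the middle are replaced by a single
    placeholder line so the test log stays small.
    """
    if len(entries) <= 2 * limit:
        return list(entries)
    omitted = len(entries) - 2 * limit
    head = entries[:limit]
    tail = entries[len(entries) - limit:]  # also safe when limit == 0
    return head + [f"... {omitted} entries omitted ..."] + tail
```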
Renjing Xiao
e401b6162b Add nested snat test
This test checks connectivity when ovn_router_indirect_snat is enabled.

This functionality was added for ML2/OVN in https://review.opendev.org/c/openstack/neutron/+/926495

Depends-On: https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/940906

Resolves-Bug: OSPRH-13329

Change-Id: I2b31d1c123286bcbd69f932bda069aa3af9beb18
2025-03-10 11:10:59 +00:00
Zuul
d08fcd29e9 Merge "Fetch only SGL log entries, drop count assertion" 2025-03-07 01:14:56 +00:00
Zuul
d0a6e8e7f0 Merge "Test verifies BZ#2214566/OSPRH-13533 doesn't regress" 2025-03-06 18:46:14 +00:00
Maor Blaustein
6271c2585a Fetch only SGL log entries, drop count assertion
After debugging with Elvira, it seems extra log entries may match neither
of the patterns used before, `acl_log` and `pinctrl0`, leaving extra logs
not fetched, so the line count does not equal the difference between the
log IDs parsed from the test start/end entries.

There is no need to assert the exact number of all log entries
generated since we only care about SGL logs and their count,
and assert counts against pre-configured rate/burst limits.

Also changed to filter `acl_log` using a class variable for various uses
across SGL tests.
This change fetches only SGL feature entries
and adds only SGL entries to the tempest log, not only upon failure.

Change-Id: I423f83df065058abd036496113dfd8e0157cb5f1
2025-03-05 23:13:11 +02:00
Maor Blaustein
c7c5a02eb2 Timeout before getting logs
Elvira noticed more occurrences where logs aren't fully written and
are only partially fetched, so this fix adds a generic wait time before any log
fetching.
Removed the existing timeout to avoid waiting twice.

Change-Id: I54a304c22108f79fa425bdb1b057536e00c072d2
2025-03-05 17:53:42 +02:00
Maor Blaustein
f00eb5e9d7 Test verifies BZ#2214566/OSPRH-13533 doesn't regress
Validate "openstack port list --long" output has correct related security group.
Test class may easily verify OSPRH-14118 (LP#2098980) when fixed.

Also adds `force_bash` argument to `validate_command` method, adjusted
where bash wrap was previously used.

Bump to include tested openstackclient fix on master branch:
Depends-On: https://review.opendev.org/c/openstack/releases/+/942104
Depends-On: https://review.opendev.org/c/openstack/requirements/+/942491

Skip test on RDO whitebox job until openstackclient is newer than the antelope
version (which has no more releases):
https://github.com/openstack-k8s-operators/ci-framework/pull/2775

Change-Id: Id5e554a8f0079b31706574cece3b364e6a5e8e64
2025-03-05 14:15:49 +02:00
Rodolfo Alonso Hernandez
8c3bb22f49 Physical network max-bw QoS is applied on the localnet port
The test ``QosTestExternalNetwork.test_dscp_bwlimit_external_network``
checks the QoS max-bw rules, using "iperf3" between two VMs. Since
[1][2], the QoS min and max BW rules for LSPs in an LS that has an associated
physical network (and a localnet port) are defined in the LSP.options
dictionary. The QoS values are set in the localnet port interface using
"tc" commands.

If "iperf3" runs between VMs on the same compute node, this
traffic won't cross the localnet port and the QoS rules won't be
applied.

The method ``_test_both_bwlimit_dscp`` checks the bandwidth between the
"sender" and "receiver" VMs, which are explicitly created on two
different compute nodes.

This method also validates the DSCP mark, using the aforementioned
VMs. The code below the method call is redundant and has been removed.

[1] https://review.opendev.org/c/openstack/neutron/+/934418
[2] 87514ac042

Closes-Bug: #2099755
Change-Id: I68e6730b19a81f2e9ca6c951e8920e650d7488c2
2025-02-26 06:21:26 +00:00
Elvira García
98b5177a87 SGL test doesn't retrieve all pings logged
In test_only_accepted_traffic_logged, after using the
check_remote_connectivity function, a small lapse of time needs
to be left in order for the last ping to get to the host. This
sleep is already present in other validations like
"check_east_west_icmp_flow".

Furthermore, using ovn_pinctrl0 will get all the ovn logs, as
opposed to using acl_logs in the grep to the log file that happens
in _get_logs_and_counts.

Change-Id: I337c964a9f39b97d4d66600321a7a9c468336e9d
2025-02-18 09:40:45 +01:00
Zuul
78d6530f0e Merge "Fix SGL test multicompute and cleanup" 2025-02-16 12:30:22 +00:00
Elvira García
de9fd26d70 Fix SGL test multicompute and cleanup
There are environments, like OpenStack on OpenShift (or CRC), where we
might see two nodes in an environment but one is not acting as a
compute node, so no VMs can be spawned there. This change checks
specifically for compute nodes and not just nodes in a deployment.

Furthermore, the ALL network log object used to remain in between test
runs because it did not have a cleanup. This is now properly handled.

Change-Id: I1debc663bb19260d6a7bdd7ea642eaa51994a4ca
Signed-off-by: Elvira García <egarciar@redhat.com>
2025-02-11 16:28:10 +01:00
Rodolfo Alonso Hernandez
3fcaeea3f0 [eventlet-removal] Remove "logger" mechanism from ML2/OVN CI jobs
The "logger" mechanism is a testing class that is still calling
monkey_patch. This mechanism driver is neither relevant nor necessary
for the ML2/OVN CI jobs.

Change-Id: Ie8d7c3e0807bf5007289ca47c58c78056d511cc0
2025-02-11 10:14:16 +00:00
Slawek Kaplonski
37a4f7b828 Wait a bit longer for tcpdump to store captures to the file
This patch adds a 5-second sleep in the ``check_north_south_icmp_flow`` method in the
``BaseTempestWhiteboxTestCase`` before stopping tcpdump captures. This is
to make sure that whatever was captured on the VMs is actually stored
in the file so that it can be read later during the test.

This change is similar to what was already done in [1] for the method
``test_east_west_icmp_flow`` in the same class.

[1] https://review.opendev.org/c/x/whitebox-neutron-tempest-plugin/+/925323

Closes-Bug: #OSPRH-11312
Change-Id: Ic51c46f864a9c34e24a3d969f95a0d721f3eec77
2025-01-23 15:31:20 +00:00
Eduardo Olivares
46e40755d4 Add config dict dataplane_podified_services
Method reset_node_service uses "grep" to find the name of the services
that tests restart with systemctl. We have identified some scenarios
where this is not valid. Due to that, this patch adds a configurable
dictionary with the name of those services. If a service is not
included in this dictionary, the grep method is used.

Closes-Bug: #2095404

Change-Id: I131207361e093deb02f71d98a69602f4368a6413
2025-01-21 11:57:23 +01:00
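
The lookup order described in 46e40755d4 (configured dictionary first, grep as a fallback) can be sketched as below. All names here are hypothetical and chosen only for illustration; `grep_lookup` stands in for whatever grep-based discovery the plugin performs.

```python
def resolve_service_name(service_alias, configured_services, grep_lookup):
    """Prefer the configured service name; fall back to the grep lookup.

    Hypothetical helper: `configured_services` plays the role of the
    configurable dictionary, `grep_lookup` is a callable that finds the
    service name when no entry is configured.
    """
    if service_alias in configured_services:
        return configured_services[service_alias]
    return grep_lookup(service_alias)
```

A service that is listed in the dictionary never triggers the fallback, so environments where grep gives a wrong answer can be fixed through configuration alone.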
Eduardo Olivares
c644e48d61 Skip test_dscp_bwlimit_external_network with only one compute
After recent changes in neutron, BW limit rules are applied in
a different way and testing a BW limit applied to a provider network only
makes sense between VM instances running on different compute nodes.

Related-bug: #2095167
Change-Id: I6a8a26e250bc116689658c846d4c4d35e1948488
2025-01-17 15:56:20 +01:00
Zuul
da9f583c1e Merge "Switch jobs to run on ubuntu noble" 2025-01-02 08:58:03 +00:00
204b394315 Switch jobs to run on ubuntu noble
The default testing distro for the epoxy cycle is ubuntu noble [1].
Noble also includes the latest OVN LTS version, i.e. 24.03, so
the build-from-source vars are removed.
Also use the crudini package instead of the pip module, as the latest
pip doesn't allow modules to be installed system-wide.

[1] https://governance.openstack.org/tc/reference/runtimes/2025.1.html

Change-Id: I10a46b9f9aa1708b840353892e95069c0ffa5d06
2024-12-27 15:09:18 +05:30
Eduardo Olivares
901b1cd6b6 Ignore acl_log lines printed before a test SG logging starts
Instead of taking the last N lines printed that include "acl_log", print
only those "acl_log" lines with ID greater than the value obtained before
the test started.

Change-Id: I31d74e72678c21dac0850daea761fca1f9f736ac
2024-12-24 11:15:59 +01:00
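
A minimal sketch of the filtering described in 901b1cd6b6 follows. It assumes the usual OVS-style log layout `timestamp|sequence|module|level|message` and treats the sequence field as the ID mentioned in the commit message; the function name and regular expression are illustrative, not the plugin's code.

```python
import re

# Assumed layout: timestamp|sequence|module|level|message
_ACL_LOG_LINE = re.compile(r"^[^|]*\|(\d+)\|acl_log")


def acl_log_lines_after(lines, last_id_before_test):
    """Return the acl_log lines whose sequence ID exceeds the given value."""
    selected = []
    for line in lines:
        match = _ACL_LOG_LINE.match(line)
        if match and int(match.group(1)) > last_id_before_test:
            selected.append(line)
    return selected
```

Compared with taking the last N matching lines, this does not depend on guessing how many entries the test produced.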
Eduardo Olivares
a575b79451 Use run_on_master_controller to run command when possible
This method logs the command executed and its output, so it could help
to troubleshoot some issues.

Change-Id: If1de40baad8d7d956724b4d666e66169b23708d3
2024-12-20 07:44:59 +01:00
Zuul
3429e220a5 Merge "Log packets captured by tcpdump on the nodes in case of test failure" 2024-12-10 13:24:12 +00:00
Slawek Kaplonski
81d9dc6831 Log packets captured by tcpdump on the nodes in case of test failure
In the tests which use tcpdump to check whether traffic is going
through the right node(s), there was only a check of whether something was captured
or not. In case of failure we didn't know what was captured and what
caused the issue.
This patch adds logging of the packets captured on all of the nodes
if the assertion in the test fails. Hopefully that will help
debugging issues like the one in the related bug.

Related-bug: #OSPRH-11312
Change-Id: I1025ae0c9dbb50d187b2827a8a7c4de864e35875
2024-12-09 13:11:52 +00:00
Eduardo Olivares
8175ddd879 Fix log message when reading tcpdump file fails
An exception was raised:
TypeError: not all arguments converted during string formatting

Using LOG.exception already prints the caught exception.

Change-Id: Ie65d47889485c48c6eb314e740bd2f52464ee0ef
2024-12-09 09:02:09 +01:00
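
The point made in 8175ddd879 can be shown with a small sketch. The original faulty call is not quoted in the commit message; a common cause of this TypeError is passing more arguments than the format string has placeholders. The example uses the standard library `logging` module to stay self-contained (an assumption, the plugin's own logger setup is not shown here), and the function name is hypothetical.

```python
import logging

LOG = logging.getLogger(__name__)


def read_capture(path):
    """Return the file contents, or None if the file cannot be read."""
    try:
        with open(path) as capture:
            return capture.read()
    except OSError:
        # One placeholder, one argument. LOG.exception appends the
        # traceback itself, so the exception is not passed as an argument.
        LOG.exception("Failed to read tcpdump capture %s", path)
        return None
```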
Rodolfo Alonso Hernandez
41fa012c57 Add the periodic CI queue
The jobs added are:
* whitebox-neutron-tempest-plugin-ovn
* whitebox-neutron-tempest-plugin-ovn-single-thread

Change-Id: Ibbd441285f3d094babde2ad7b83e4f780df5ea2f
2024-12-06 12:34:27 +00:00
ccamposr
77bddb9256 Bump hacking
Fixes some new PEP8 errors that appear with jobs running on new ubuntu
version, and temporarily filters out the larger I202 error ("Additional
 newline in a group of imports").

This patch updates the hacking and flake8-import-order versions.

Copied from:
https://review.opendev.org/c/openstack/ovn-octavia-provider/+/936855

Change-Id: Ice4513eedc4fd6f054c19d1854eff00aeb5c35a1
2024-12-05 11:33:40 +01:00
Eduardo Olivares
435766502a Fix pep8 errors raised with python3.12 and update advanced guest image
Two recent changes in CI infra have affected
whitebox_neutron_tempest_plugin jobs:
1) After updating pep8 jobs to ubuntu-noble/python3.12, an error is
   reported and this patch fixes it.
2) Tests using advanced images had issues booting rocky 9.3 VM instances,
   so this patch updates the image to rocky 9.5.
   As a consequence of this change, the virt-customize command has been
   updated to:
   * install tcpdump
   * enable dhcp for NetworkManager
   And some tests have been adapted to rocky 9.5 characteristics.

Change-Id: I489ecaf1765570e52b1f2d2676f13a0edc5f6fc4
2024-12-05 11:07:45 +01:00
Eduardo Olivares
3feaab6c99 Fix exclude_hosts mechanism in _create_server method
Patch [1] broke the way tests specify that a VM has to be spawned on
a host different from a provided list of exclude_hosts because the list
of hosts includes shortnames instead of FQDN after [1] was merged.
This patch fixes this problem.

Besides that, the method `get_shortname_for_server` is renamed to
`get_host_shortname_for_server` because what it really returns is the
hypervisor/host name.

[1] https://review.opendev.org/c/932348

Change-Id: I584565b42bfb691c224330ab96664ac9fec150c1
2024-10-24 14:47:11 +02:00
Eduardo Olivares
15692a9fa8 Add short_name to the nodes information
Some functions included in BaseDisruptiveTempestTestCase class execute
commands on the hypervisor server.
They did not work properly with latest ci-fmw changes.
This patch adapts those commands and now they work well, by using the
short hostname from the nodes instead of the FQDNs.

Related-bug: OSPRH-8595
Change-Id: I02c5cd4f9bcaf69011f3b59d853931a2ce5a714b
2024-10-21 07:55:46 +02:00
Rodolfo Alonso Hernandez
8b160a4c6e `set_service_setting` accept multiple config changes
The method ``BaseTempestWhiteboxTestCase.set_service_setting`` accepts
multiple configuration parameters in one call. If the service
configuration changes, the service needs to be restarted only once.

Closes-Bug: #OSPRH-10514
Change-Id: I206f3ce73e9ab36d4fe1ba072735a0bd382c5621
2024-10-09 14:48:57 +00:00
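
The "restart only once" behaviour from 8b160a4c6e can be sketched as applying all changes first and reporting a single restart decision. This is an illustration under assumptions: the function name and the plain dictionary standing in for a service configuration are hypothetical.

```python
def apply_settings(current, changes):
    """Apply all changes to `current`; return True if a restart is needed.

    Hypothetical helper: the caller restarts the service at most once,
    and only when at least one value actually changed.
    """
    changed = False
    for key, value in changes.items():
        if current.get(key) != value:
            current[key] = value
            changed = True
    return changed
```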
Miro Tomaska
6c381d8e20 Fix passing section argument into the set_service_setting
`_set_rate_limiting_config` was attempting to pass the `section` value
by position but its position was off and the `service` argument
was getting passed instead. This caused bad formatting in the
crudini command and the subsequent .conf file. This can be seen in the
tempest log [1]

To prevent this from happening in the future, I am passing arguments
by their names.
I also put the metadata_rate_limiting string literal into a constant
to keep the code DRY.

[1] https://paste.opendev.org/show/btVjyzKuNaiI8UC1NC9M/

Related: https://issues.redhat.com/browse/OSPRH-9569

Change-Id: Ic73def44d03fb28b5975c1d96a471e0463d01740
2024-10-08 10:11:59 +01:00
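
The fix in 6c381d8e20 relies on passing arguments by name. A sketch of how a signature can enforce this with keyword-only parameters follows; the function, its parameters and the example values are hypothetical and do not reproduce the plugin's `set_service_setting`. The `crudini --set file section param value` form is the standard crudini syntax.

```python
def build_crudini_set(*, service, config_file, section, param, value):
    """Return (service, command); every argument must be passed by name."""
    command = ["crudini", "--set", config_file, section, param, str(value)]
    return service, command
```

With the bare `*` in the signature, a positional call raises `TypeError` immediately instead of silently shifting `service` into the `section` slot.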
Rodolfo Alonso Hernandez
a46b9ca6a3 Do not expect a configuration change in `OvnFdbAgingTest`
The configuration parameters ``localnet_learn_fdb`` and
``fdb_age_threshold`` can be set by other tests (or other job
executions, depending on the CI). In that case there is no need
to wait for the pod replacement (in podified environments).

Related-Bug: #OSPRH-10452
Related-Bug: #OSPRH-893
Change-Id: I76d16c40c11b7255051a1154d9a7050d7add27f6
2024-09-30 13:31:41 +00:00
Rodolfo Alonso Hernandez
b259c20115 Make `set_service_setting` resilient to a no config change
Added a new parameter to ``set_service_setting``: ``cfg_change``. This
flag is True by default. In a podified environment, if the flag is
disabled and the config set does not modify the current one, the
method won't expect the pod to be replaced.

Closes-Bug: #OSPRH-10452
Change-Id: I1361ba4bcd9e755547394e823b134d615a2240b2
2024-09-30 13:27:36 +00:00
Roman Safronov
ea8a27a475 Fix fallback in dvr ingress test
The fallback mechanism in the dvr ingress test was introduced
in [1] in order to allow the test to work properly on
environments where the host that is used as the proxy host
does not have routing to the neutron external network.
As it turned out, the ping check in this fallback mechanism
works inconsistently on some environments when the
cirros image is used as the default image.
This patch replaces the problematic ping check with the
check_connectivity function, which does multiple retries
before raising an exception.

[1] https://review.opendev.org/c/x/whitebox-neutron-tempest-plugin/+/925475

Change-Id: I82a5095d31a7b9e96821a0985e1400814eda02b2
2024-09-26 15:17:21 +03:00
Roman Safronov
182582a0d5 Remove Tripleo from plugin description lines
This patch replaces mentions of Tripleo in the plugin description
lines with openstack-k8s-operators.

Change-Id: Ic0fc4b619cbf0be5b3da5a94c6fec1e4daeac119
2024-09-18 12:48:24 +03:00
Zuul
8b4555492a Merge "Make use of the utility method `get_neutron_api_service_name`" 2024-09-16 17:39:48 +00:00
Zuul
f0c681a9b7 Merge "Make use of the utility method `get_ml2_conf_file`" 2024-09-16 17:39:47 +00:00
Zuul
de66162e24 Merge "Do not "grep" the output of an empty list" 2024-09-16 17:39:46 +00:00
Zuul
cf1a5de2fe Merge "Fix dvr vip failover tests" 2024-09-16 13:03:52 +00:00
Zuul
a7c3c9454f Merge "Test Neutron API restart time" 2024-09-16 12:46:33 +00:00
Roman Safronov
79ddfc94b0 Fix dvr vip failover tests
On podified environments we can't use local IP for establishing
TCP connection via virtual IP (VIP) since tempest is running on
a node that can be a gateway node and logic of the test will be
affected. Therefore a separate VM is spawned on the external
network and it is used as a proxy host for establishing TCP
connection to the FIP.
Also introduced a function ensure_external_network_is_shared()
and simplified some tests from other files that had the same
code.

Change-Id: I099ec34299debcd43b8ce485656e3ea6d7a95f51
2024-09-16 00:14:56 +03:00
Zuul
28d64df58a Merge "Capture traffic only in podified nodes running ovs pods" 2024-09-15 11:00:23 +00:00
Rodolfo Alonso Hernandez
d8b9b0635f Test Neutron API restart time
The new test checks that restarting the Neutron API
doesn't take more than one minute.

Closes-Bug: #OSPRH-2460
Change-Id: I467f6c2124b8e3d6eb76bea399fd9cc5bd553b5c
2024-09-14 16:42:26 +00:00
Rodolfo Alonso Hernandez
8233036bcd Make use of the utility method `get_neutron_api_service_name`
Change-Id: Ibfd04fef4c9182614256bec52c349c28f5cd4e49
2024-09-14 16:42:09 +00:00