
:title: Nodepool

.. _nodepool:

Nodepool
########

Nodepool is a service used by the OpenStack CI team to deploy and manage a pool
of devstack images on a cloud server for use in OpenStack project testing.

At a Glance
===========

:Hosts:
  * nl01.opendev.org
  * nl02.opendev.org
  * nl03.opendev.org
  * nl04.opendev.org
  * nl06.opendev.org
  * nb05.opendev.org
  * nb06.opendev.org
  * nb07.opendev.org
  * zk04.opendev.org
  * zk05.opendev.org
  * zk06.opendev.org
:Puppet:
  * https://opendev.org/opendev/puppet-openstackci/src/branch/master/manifests/nodepool_builder.pp
:Configuration:
  * :config:`nodepool/nodepool.yaml`
  * :config:`nodepool/scripts/`
  * :config:`nodepool/elements/`
:Projects:
  * https://opendev.org/zuul/nodepool
:Bugs:
  * https://storyboard.openstack.org/#!/project/668
:Resources:
  * `Nodepool Reference Manual <http://docs.openstack.org/infra/nodepool>`_
  * `ZooKeeper Programmer's Guide <https://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html>`_
  * `ZooKeeper Administrator's Guide <https://zookeeper.apache.org/doc/trunk/zookeeperAdmin.html>`_
  * `zk_shell <https://pypi.org/project/zk_shell/>`_

Overview
========

Once per day, for every image type (and provider) configured by
nodepool, a new image with cached data is built for use by devstack.
Nodepool spins up new instances and tears down old ones as tests are
queued up and completed, always maintaining a consistent number of
available instances for tests up to the set limits of the CI
infrastructure.

ZooKeeper
=========

Nodepool stores image metadata in ZooKeeper.  We have a three-node
ZooKeeper cluster running on zk04.opendev.org through zk06.opendev.org.

The Nodepool CLI should be sufficient to examine and alter any of the
information stored in ZooKeeper.  However, in case advanced debugging
is needed, use of zk-shell (``pip install zk_shell`` into a virtualenv
and run ``zk-shell``) is recommended as an easy way to inspect and/or
change data in ZooKeeper.

Bad Images
==========

Since nodepool takes a while to build images, and generally only does
it once per day, occasionally the images it produces may have
significant behavior changes from the previous versions.  For
instance, a provider's base image or operating system package may
update, or some of the scripts or system configuration that we apply
to the images may change.  If this occurs, it is easy to revert to the
last good image.

Nodepool periodically deletes old images; however, it never deletes
the current or next most recent image in the ``ready`` state for any
image-provider combination.  So if you find that the
``ubuntu-precise`` image is problematic, you can run::

  $ sudo nodepool dib-image-list

  +---------------------------+----------------+---------+-----------+----------+-------------+
  | ID                        | Image          | Builder | Formats   | State    | Age         |
  +---------------------------+----------------+---------+-----------+----------+-------------+
  | ubuntu-precise-0000000001 | ubuntu-precise | nb01    | qcow2,vhd | ready    | 02:00:57:33 |
  | ubuntu-precise-0000000002 | ubuntu-precise | nb01    | qcow2,vhd | ready    | 01:00:57:33 |
  +---------------------------+----------------+---------+-----------+----------+-------------+

Image ubuntu-precise-0000000001 is the previous image and
ubuntu-precise-0000000002 is the current image (they are both marked
as ``ready``, and the current image is simply the image with the
shortest age).

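The selection rule described here (the current image is the ``ready``
build with the shortest age) can be sketched as a small shell helper.
This is only an illustration and not part of Nodepool: the function name
``newest_ready_build`` is hypothetical, and the sketch assumes the
six-column table layout and a ``DD:HH:MM:SS`` age format as in the sample
``nodepool dib-image-list`` output.

```shell
# Hypothetical helper: read `nodepool dib-image-list` table output on stdin
# and print the ID of the ready build of image "$1" with the shortest age.
# Assumes columns ID | Image | Builder | Formats | State | Age and an age
# formatted as DD:HH:MM:SS.
newest_ready_build() {
    awk -F'|' -v image="$1" '
        # Remove the padding spaces around a table cell.
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        # Data rows contain "|" separators; border rows ("+---+") do not.
        NF >= 7 {
            id = trim($2); name = trim($3); state = trim($6); age = trim($7)
            if (name != image || state != "ready") next
            if (split(age, t, ":") != 4) next
            # Convert DD:HH:MM:SS to seconds so ages compare numerically.
            secs = ((t[1] * 24 + t[2]) * 60 + t[3]) * 60 + t[4]
            if (best == "" || secs < best_secs) { best = id; best_secs = secs }
        }
        END { if (best != "") print best }
    '
}
```

It could be used as ``sudo nodepool dib-image-list | newest_ready_build
ubuntu-precise``. If the column layout of the real output differs, the
field numbers in the sketch would need adjusting.
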
Nodepool aggressively attempts to build and upload missing images, so
if the problem with the image will not be solved with an immediate
rebuild, image builds must first be disabled for that image.  To do
so, add ``pause: True`` to the ``diskimage`` section for
``ubuntu-precise`` in nodepool.yaml.

Then delete the problematic image with::

  $ sudo nodepool dib-image-delete ubuntu-precise-0000000002

All uploads corresponding to that image build will be deleted from providers
before the image DIB files are deleted.  The previous image will become the
current image and nodepool will use it when creating new nodes.  When nodepool
next creates an image, it will still retain build #1 since it will still be
considered the next-most-recent image.

vhd-util
========

Creating images for Rackspace requires a patched version of vhd-util
to convert the images into the appropriate VHD format.  See
`opendev/infra-vhd-util-deb
<https://opendev.org/opendev/infra-vhd-util-deb>`__ for details of
this custom package.  This is installed on a production host via a PPA
built and published by jobs in this repository.