diff --git a/doc/source/specs/index.rst b/doc/source/specs/index.rst
index 2600e6b01b..8345d722cd 100644
--- a/doc/source/specs/index.rst
+++ b/doc/source/specs/index.rst
@@ -14,3 +14,4 @@ Contents:
    nginx-sidecar.rst
    support-linux-bridge-on-neutron.rst
    fluentbit-fluentd-architecture.rst
+   osh-1.0-requirements.rst
diff --git a/doc/source/specs/osh-1.0-requirements.rst b/doc/source/specs/osh-1.0-requirements.rst
new file mode 100644
index 0000000000..174d87bd48
--- /dev/null
+++ b/doc/source/specs/osh-1.0-requirements.rst
@@ -0,0 +1,266 @@
+..
+ This work is licensed under a Creative Commons Attribution 3.0 Unported
+ License.
+
+ http://creativecommons.org/licenses/by/3.0/legalcode
+
+..
+
+===============================
+OpenStack-Helm 1.0 Requirements
+===============================
+
+Topic:
+osh-1.0-requirements_
+
+.. _osh-1.0-requirements: https://review.openstack.org/#/q/topic:bp/osh-1.0-requirements
+
+Problem Description
+===================
+
+OpenStack-Helm has undergone rapid development and maturation over its
+lifetime, and is nearing the point of real-world readiness. This spec
+details the functionality that must be implemented in OpenStack-Helm for it to
+be considered ready for a 1.0 release, as well as for general use.
+
+Use case
+---------
+This spec describes a point-in-time readiness for OpenStack-Helm 1.0,
+after which it will be for historical reference only.
+
+Proposed Change
+===============
+
+The proposed requirements for a 1.0 release are as follows:
+
+Gating
+------
+A foundational requirement of 1.0 readiness is the presence of robust gating
+that will ensure functionality, backward compatibility, and upgradeability.
+This will allow development to continue and for support for new versions of
+OpenStack to be added post-1.0.
+The following gating requirements must be met:
+
+**Helm test for all charts**
+
+Helm test is the building block for all gating.
+Each chart must integrate a
+helm-test script which validates proper functionality. This is already a
+merge criterion for new charts, but a handful of older charts still need
+helm test functionality to be added. No additional charts will be merged
+prior to 1.0 unless they meet this requirement (and others in this document).
+
+**Resiliency across reboots**
+
+All services should survive node reboots, and their functionality must be
+validated by a gate following a reboot.
+
+**Upgrades**
+
+Gating must prove that upgrades from each supported OpenStack version to the
+next operate flawlessly, using the default image set (LOCI). Specifically,
+each OpenStack chart should be upgraded from one release to the next, and
+each infrastructure service from one minor version to the next. Both the
+container image and configuration must be modified as part of this upgrade.
+At minimum, the Newton to Ocata upgrade must be validated for the 1.0 release.
+
+Code Completion and Refactoring
+-------------------------------
+A number of in-progress and planned development efforts must be completed
+prior to 1.0, to ensure a stable OpenStack-Helm interface thereafter.
+
+**Charts in the appropriate project**
+
+All charts should migrate to their appropriate home project as follows:
+
+- OpenStack-Helm for OpenStack services
+- OpenStack-Helm-Infra for supporting services
+- OpenStack-Helm-Addons for ancillary services
+
+In particular, these charts must move to OpenStack-Helm-Infra:
+
+- ceph
+- etcd
+- ingress
+- ldap
+- libvirt
+- mariadb
+- memcached
+- mongodb
+- openvswitch
+- postgresql
+- rabbitmq
+
+**Combined helm-toolkit**
+
+Currently both OpenStack-Helm and OpenStack-Helm-Infra have their own parallel
+versions of the Helm-Toolkit library chart. They must be combined into a
+single chart in OpenStack-Helm-Infra prior to 1.0.
+
+**Standardization of manifests**
+
+Work is underway to refactor common manifest patterns into reusable snippets
+in Helm-Toolkit.
+The following manifests have yet to be combined:
+
+- Database drop Job
+- Prometheus exporters
+- API Deployments
+- Worker Deployments
+- StatefulSets
+- CronJobs
+- Etc ConfigMaps
+- Bin ConfigMaps
+
+**Standardization of values**
+
+OpenStack-Helm has developed a number of conventions around the format and
+ordering of charts' `values.yaml` file, in support of both reusable Helm-Toolkit
+functions and ease of developer ramp-up. For 1.0 readiness, OpenStack-Helm must
+cement these conventions within a spec, as well as the ordering of `values.yaml`
+keys. These conventions must then be gated to guarantee conformity.
+The spec in progress can be found here [1]_.
+
+**Inclusion of all core services**
+
+Charts for all core OpenStack services must be present to achieve 1.0
+releasability. The only core service outstanding at this time is Swift.
+
+**Split Ceph chart**
+
+The monolithic Ceph chart does not allow for following Ceph upgrade best
+practices, namely upgrading Mons, OSDs, and client services in that order.
+The Ceph chart must therefore be split into at least three charts (one
+for each of the above upgrade phases) prior to 1.0 to ensure smooth
+in-place upgradeability.
+
+**Values-driven config files**
+
+In order to maximize flexibility for operators, and to help facilitate
+upgrades to newer versions of containerized software without editing
+the chart itself, all configuration files will be specified dynamically
+based on `values.yaml` and overrides. In most cases the config files
+will be generated based on the YAML values tree itself, and in some
+cases the config file content will be specified in `values.yaml` as a
+string literal.
+
+Documentation
+-------------
+Comprehensive documentation is key to the ability of real-world operators to
+benefit from OpenStack-Helm, and so it is a requirement for 1.0 releasability.
+The following outstanding items must be completed from a documentation
+perspective:
+
+**Document version requirements**
+
+Version requirements for the following must be documented and maintained:
+
+- Kubernetes
+- Helm
+- Operating system
+- External charts (Calico)
+
+**Document Kubernetes requirements**
+
+OpenStack-Helm supports a "bring your own Kubernetes" paradigm. Any
+particular k8s configuration or feature requirements must be
+documented.
+
+- Hosts must use KubeDNS / CoreDNS for resolution
+- Kubernetes must enable mount propagation (until it is enabled by default)
+- Helm must be installed
+
+Examples of how to set up the above under KubeADM and KubeSpray-based clusters
+must be documented as well.
+
+**OpenStack-Helm release process**
+
+The OpenStack-Helm release process will be somewhat orthogonal to the
+OpenStack release process, and the differences and relationship between the
+two must be documented in a spec. This will help folks quickly understand why
+OpenStack-Helm is a Release-Independent project from an OpenStack perspective.
+
+**Release notes**
+
+Release notes for the 1.0 release must be prepared, following OpenStack
+best practices. The criteria for future changes that should be included
+in release notes in an ongoing fashion must be defined / documented as well.
+
+- `values.yaml` changes
+- New charts
+- Any other changes to the external interface of OpenStack-Helm
+
+**LMA Operations Guide**
+
+A basic Logging, Monitoring, and Alerting-oriented operations guide must be in
+place, illustrating for operators (and developers) how to set up and use an
+example LMA setup for OpenStack and supporting services. It will include
+instructions on how to perform basic configuration and how to access and use
+the user interfaces at a high level. It will also link out to more detailed
+documentation for the LMA tooling itself.
+
+Process and Tooling
+-------------------
+To facilitate effective collaboration and communication across the
+OpenStack-Helm community team, work items for the enhancements above will be
+captured in Storyboard. Therefore, migration from Launchpad to Storyboard
+must be accomplished prior to the 1.0 release. Going forward, Storyboard
+will be leveraged as a tool to collaboratively define and communicate the
+OpenStack-Helm roadmap.
+
+Security Impact
+---------------
+No impact
+
+Performance Impact
+------------------
+No impact
+
+Alternatives
+------------
+This spec lays out the criteria for a stable and reliable 1.0 release, which
+can serve as the basis for real-world use as well as ongoing development.
+The alternative approaches would be to either iterate indefinitely without
+defining a 1.0 release, which would fail to signal to operators the point at
+which the platform is ready for real-world use; or, to define a 1.0 release
+that fails to deliver key features that real-world operators need.
+
+Implementation
+==============
+
+This spec describes a wide variety of self-contained work efforts, which will
+be implemented individually by the whole OpenStack-Helm team.
+
+Assignee(s)
+-----------
+
+Primary assignee:
+
+- mattmceuen (Matt McEuen) for coordination
+- powerds (DaeSeong Kim) for the
+  `values.yaml` ordering spec [1]_
+- portdirect (Pete Birley) for the
+  release management spec [2]_
+- randeep.jalli (Randeep Jalli) and
+  renmak (Renis Makadia) for splitting
+  up the Ceph chart
+- rwellum (Rich Wellum) for coordination
+  of Storyboard adoption
+- Additional assignees TBD
+
+Work Items
+----------
+
+See above for the list of work items.
+
+Testing
+=======
+See above for gating requirements.
+
+Documentation Impact
+====================
+See above for documentation requirements.
+
+References
+==========
+
+.. [1] https://review.openstack.org/#/c/552485/
+.. [2] TODO - release management spec