Add a Theory of Auto-Scaling document
As discussed at the Denver PTG, create a document to describe some concepts for a user to understand when using auto-scaling.

Original draft at https://wiki.openstack.org/wiki/Auto-scaling_SIG/Theory_of_Auto-Scaling

Change-Id: Id3fb78adca4ca0032de4ca15b7c9c821bdfe756f
parent 9b1840cd65
commit bf9a5840b2
@ -16,6 +16,7 @@ Contributions to this documentation are warmly encouraged; please see

   use-cases
   specs
   theory-of-auto-scaling


Indices and tables

@ -0,0 +1,34 @@
# Diagram of OpenStack Auto-Scaling Concepts
# PlantUML diagram - https://en.wikipedia.org/wiki/PlantUML
@startuml

cloud Cloud\n {
    rectangle host as "Host" {
    }
    rectangle host2 as "Host" {
        agent VM
        agent VM2 as "VM"
        agent Container
        agent Container2 as "Container"
    }
}

agent MS as "Monitoring Service"

agent DS as "Decision Services\n(Clustering,\nOptimization,\nRoot Cause)"
agent Heat as "Orchestration \nEngine"

host -down-> MS
VM -down-> MS
Container -down-> MS : "Metric \nSamples"

MS -down-> DS : "Alarms"
MS -down-> Heat : "Alarms"

DS -right-> Heat : "Scaling Commands"

Heat -up-> host : "Orchestration"
Heat -up-> VM2 : "Orchestration"
Heat -up-> Container2 : "Orchestration"

@enduml
1
doc/source/media/OpenStack-Auto-Scaling.svg
Normal file
File diff suppressed because one or more lines are too long
After Width: | Height: | Size: 176 KiB |
153
doc/source/theory-of-auto-scaling.rst
Normal file
@ -0,0 +1,153 @@
======================
Theory of Auto-Scaling
======================

.. contents::
   :depth: 2
   :local:

General Description
===================

In OpenStack, "Auto-Scaling" refers to the ability of a Cloud to automatically
detect conditions related to load in the Cloud and to react appropriately without
an Operator's intervention.

This generally refers to Compute workloads in the Cloud, but the SIG also
discusses use cases around scaling for the Control Plane or other resources.

Auto-Scaling includes both scale-up and scale-down actions, which aim to allocate
resources appropriately and efficiently and to avoid problems. The goal is the best
Customer experience and return on investment.

Auto-Scaling uses many of the same services and technologies as `Self Healing`_.

Where Self Healing is focused on healing a Cloud when a failure occurs, Auto-Scaling
is concerned with avoiding issues by allocating more resources when a need is detected
or predicted, and with conserving resources by deactivating them when loads are low.


Mission
-------
This SIG aims to improve the experience of developing, operating, and using auto-scaling and its related features
(such as metering, cluster scheduling, and life cycle management), and to coordinate efforts across projects and
communities (such as Kubernetes cluster auto-scaling on OpenStack). The SIG also provides a central place for tests,
documentation, and even common libraries for auto-scaling features.

The SIG is expected to focus more on auto-scaling user workloads; however, work on auto-scaling infrastructure is
also welcome, especially considering that user workloads in an undercloud are actually infrastructure in the
corresponding overcloud.

Background
----------
OpenStack provides multiple methods to auto-scale your cluster (for example, a Heat AutoScalingGroup or a
Senlin Cluster). However, without general coordination across projects, it may not be easy for users and ops
to achieve auto-scaling on OpenStack. Developers tend to be focused on individual projects rather than cross-project
integration. Most of the components required by auto-scaling already exist within OpenStack, but we need to provide
a simpler way for users and ops to adopt auto-scaling, and to allow developers to coordinate their work instead of
implementing the same things all over again.


Conceptual Diagram
==================

.. image:: ./media/OpenStack-Auto-Scaling.svg
   :alt: Architecture Component Diagram showing Scaling Units, Monitoring
         and Alarming, Decision Services, and Orchestration Engine


Components of Auto-Scaling
==========================

OpenStack offers a rich set of services to build, manage, orchestrate, and
provision a cloud. This gives administrators some choices in how to best serve
their customers' needs.

* Scaling Units - There are a number of components that can be controlled
  with Auto-Scaling.

  * Compute Host
  * VM running on a Compute Host
  * Container running on a Compute Host
  * Network Attached Storage
  * Virtual Network Functions

* Monitoring Service - Monitoring works by either using an agent installed on
  the Scaling Unit, or using a polling method to retrieve metrics.

  * `Monasca`_
  * Ceilometer from the `Telemetry`_ project
  * Prometheus

* Alarming Service

  * `Monasca`_ has a built-in alarm thresholding service and notification service
  * Aodh from the `Telemetry`_ project

* Decision Services - There are a number of services in OpenStack that can
  interpret metrics and alarms based on configured logic and produce commands
  for Orchestration Engines.

  * `Congress`_
  * `Heat`_ can contain logic for auto-scaling decisions in HOT templates
  * `Mistral`_
  * `Vitrage`_
  * `Watcher`_

* Orchestration Engines

  * `Heat`_
  * `Senlin`_ is a clustering engine for OpenStack, and can orchestrate auto-scaling
  * `Tacker`_
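
The way these components hand work to one another can be sketched in a few lines of Python.
This is a toy model under stated assumptions, not the API of any project listed above: the
names ``Alarm``, ``evaluate_alarm``, ``decide``, and ``orchestrate``, the ``cpu_util`` metric
name, and the 80%/20% thresholds are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional, Sequence


@dataclass(frozen=True)
class Alarm:
    """A threshold alarm raised by a (hypothetical) alarming service."""
    name: str      # "high" or "low"
    metric: str    # name of the metric that crossed the threshold
    value: float   # averaged metric value that triggered the alarm


def evaluate_alarm(metric: str, samples: Sequence[float],
                   high: float = 80.0, low: float = 20.0) -> Optional[Alarm]:
    """Alarming service: compare the mean of the samples to two thresholds."""
    if not samples:
        return None  # no data, no alarm
    avg = mean(samples)
    if avg > high:
        return Alarm("high", metric, avg)
    if avg < low:
        return Alarm("low", metric, avg)
    return None


def decide(alarm: Optional[Alarm]) -> int:
    """Decision service: map an alarm to a change in instance count."""
    if alarm is None:
        return 0
    return 1 if alarm.name == "high" else -1


def orchestrate(current_instances: int, delta: int) -> int:
    """Orchestration engine stand-in: apply the scaling command."""
    return current_instances + delta


# Metric samples as a monitoring service might collect them (CPU %).
cpu_samples = [85.0, 90.0, 95.0]
alarm = evaluate_alarm("cpu_util", cpu_samples)
instances = orchestrate(2, decide(alarm))
print(alarm, instances)
```

Keeping the alarm evaluation, the decision, and the orchestration step in separate functions
mirrors the separation of services in the list above; in a real deployment each step would
be handled by one of those services rather than by application code.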

Considerations and Guidelines
=============================

* Monitoring takes resources; plan accordingly.

  * The more metrics monitored, the more bandwidth and CPU are needed to collect and process them.
    The longer the retention period for metrics, the more storage is needed.

* Avoid scaling too quickly or too often.

  * This can be done by specifying appropriate cooldown periods.
  * Another technique is to average the scaling metric over a longer time period to avoid reacting to sudden
    fluctuations.

* Don't expect instantaneous scaling: cooldown periods and metric averaging (see above) both delay the reaction.

* Define thresholds to be predictive of scale needs, not reactive to a bad state.

* Be aware of where the logic for scaling is (alarm thresholds, decision services).

* Define appropriate scaling limits in terms of minimum and maximum instances.

  * A minimum number of instances prevents all the instances from being removed.
  * A maximum number of instances safeguards against provisioning so many resources that
    other workloads are adversely affected.

* Applications must be horizontally scalable in order to auto-scale the underlying instances.

  * Applications must be stateless or be able to drain existing stateful connections so that the underlying
    instances can be removed during a scale down.
  * Incoming requests must be dynamically load balanced among the instances running the application.
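
Three of the guidelines above (cooldown periods, averaging the metric over a window, and
minimum/maximum instance limits) can be combined in one small sketch. The ``ScalingPolicy``
class, its parameter names, and the default values are hypothetical; in practice these
settings would normally be expressed in the configuration of the alarming and orchestration
services rather than in Python.

```python
from collections import deque
from statistics import mean


class ScalingPolicy:
    """Toy policy: smoothed metric, cooldown, and min/max instance limits."""

    def __init__(self, min_instances=1, max_instances=5, cooldown=300.0,
                 window=3, high=80.0, low=20.0):
        self.min_instances = min_instances
        self.max_instances = max_instances
        self.cooldown = cooldown              # seconds between scaling actions
        self.samples = deque(maxlen=window)   # sliding window of recent samples
        self.high = high                      # scale up above this average
        self.low = low                        # scale down below this average
        self.last_action = None               # time of the last scaling action

    def desired(self, current, sample, now):
        """Return the instance count to run after seeing one more sample."""
        self.samples.append(sample)
        # Wait until the window is full so a single spike cannot trigger scaling.
        if len(self.samples) < self.samples.maxlen:
            return current
        # Respect the cooldown: do nothing shortly after a previous action.
        if self.last_action is not None and now - self.last_action < self.cooldown:
            return current
        avg = mean(self.samples)
        if avg > self.high:
            target = current + 1
        elif avg < self.low:
            target = current - 1
        else:
            return current
        # Clamp to the configured limits.
        target = max(self.min_instances, min(self.max_instances, target))
        if target != current:
            self.last_action = now
        return target


# Sustained high load: (time in seconds, CPU %) pairs fed to the policy.
policy = ScalingPolicy(min_instances=1, max_instances=3, cooldown=300.0, window=3)
count = 1
history = []
for t, cpu in [(0, 95.0), (60, 95.0), (120, 95.0), (180, 95.0),
               (480, 95.0), (780, 95.0), (1080, 95.0)]:
    count = policy.desired(count, cpu, now=t)
    history.append(count)
print(history)
```

In this run the policy waits for a full window before its first scale-up, skips the sample
that arrives during the cooldown, and stops growing once ``max_instances`` is reached, even
though the load stays high.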

Anecdotes and Stories
---------------------

There are many experiences which can be captured and shared around auto-scaling.
Please also refer to the Use Cases.


.. _Congress: https://wiki.openstack.org/wiki/Congress
.. _Heat: https://wiki.openstack.org/wiki/Heat
.. _Mistral: https://docs.openstack.org/mistral/latest/
.. _Monasca: https://wiki.openstack.org/wiki/Monasca
.. _Self Healing: https://docs.openstack.org/self-healing-sig/latest/
.. _Senlin: https://docs.openstack.org/senlin/latest/scenarios/autoscaling_heat.html
.. _Telemetry: https://wiki.openstack.org/wiki/Telemetry
.. _Tacker: https://wiki.openstack.org/wiki/Tacker
.. _Vitrage: https://wiki.openstack.org/wiki/Vitrage
.. _Watcher: https://wiki.openstack.org/wiki/Watcher