[doc] Update proposal document

Update proposal document with description of problem space
Draft rules engine description in openstack_integration document
Oleg Gelbukh 2013-10-17 09:42:40 +00:00
parent fa8777fd3e
commit c71f189faf
5 changed files with 54 additions and 12 deletions

Binary file not shown.


@ -0,0 +1,7 @@
+@startuml
+(*) -right-> [<i>OpenStack Services</i>\nNova, Keystone, Neutron,\nGlance, Heat, Swift] "Deployment"
+"Deployment" -right-> [<i>OpenStack Deployment</i>\nFuel, TripleO, Devstack] "Operation\nMaintenance"
+"Operation\nMaintenance" -right-> [<i>DRAGONS?</i>\nTuskar, <b>Rubick</b>] (*)
+@enduml

@ -11,21 +11,30 @@ Project Name
 Overview
 --------
+The typical OpenStack cloud life cycle consists of 2 phases:
+- initial deployment and
+- operation maintenance
 OpenStack cloud operators usually rely on deployment tools to configure all the
-platform components correctly and efficiently upfront. However, after initial
-deployment platform configurations and operational conditions start to change.
-These changes could break consistency and integration of cloud platform and its
-components, and ultimately cause cloud service failures of different kinds.
+platform components correctly and efficiently in *initial deployment* phase.
+Multiple OpenStack projects cover that area: TripleO/Tuskar, Fuel and Devstack,
+to name a few.
+However, once you installed and kicked off the cloud, platform configurations
+and operational conditions begin to change. These changes could break
+consistency and integration of cloud platform components. Keeping cloud up and
+running is the essence of *operation maintenance* phase.
 Cloud operator must quickly and efficiently identify and respond to the root
 cause of such failures. To do so, he must check if his OpenStack configuration
 is sane and consistent. These checks could be thought of as rules of diagnostic
 production system.
-Currently OpenStack ecosystem does not provide tools which specifically help to
-diagnose platform configuration. We propose a project which will help operators
-to diagnose their OpenStack platform and reduce response time to known and
-unknown failures.
+Currently OpenStack ecosystem lacks projects aimed to increase reliability and
+resilience of the cloud. With this proposal we want to introduce a project which
+will help operators to diagnose their OpenStack platform, reduce response time
+to known and unknown failures and effectively support the desired SLA.
 Mission
 -------

@ -1,5 +1,5 @@
-VALIDATOR INTEGRATION WITH OPENSTACK
-====================================
+DIAGNOSTICS INTEGRATION WITH OPENSTACK
+======================================
 --------
 Overview
@ -50,8 +50,29 @@ and inconsistencies.
 This engine will provide hints and best practices to increase reliability and
 operational resilience of the cloud.
-Rules engine
-------------
+#FIXME: move this part to document rules_engine.rst
+Rules-based approach to diagnostics
+-----------------------------------
+The consistent configuration across all components is essential to OpenStack
+cloud operation. If something is wrong with configuration, you as an operator
+will know this immediately either from monitoring or clients complaining. But
+diagnosing the exact problem is always a challenge, given the number of
+components and configuration options per component.
+You could think about troubleshooting OpenStack as going through some scenarios
+which can be expressed as sets of rules. Your configuration must comply with all those
+rules to be operational. On the other hand, if you know rules which your
+configuration breaks, you can identify incorrect parameters reliably and easily.
+That is how production rules or diagnostic systems work.
+Example production rule for OpenStack system could be::
+    if (condition_parameter) is (value) then (check_parameter_1) must be (value) and
+    (check_parameter_2) must be (value)
 ------------------
 Integration Points
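
Reviewer note: the production rule form quoted in this hunk ("if (parameter) is (value) then (check_parameter) must be (value)") could be evaluated roughly as in the sketch below. This is not code from the commit. The `Rule` class, the `check_config` function, and every parameter name and value are hypothetical placeholders chosen for illustration, not real OpenStack options.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass(frozen=True)
class Rule:
    """One production rule: if `condition` holds, all `requirements` must hold."""
    name: str
    condition: Tuple[str, Any]                  # (parameter, value) that triggers the rule
    requirements: Tuple[Tuple[str, Any], ...]   # (parameter, value) pairs that must match


def check_config(config: Dict[str, Any], rules: List[Rule]) -> List[str]:
    """Return one message per requirement broken by `config`."""
    violations = []
    for rule in rules:
        cond_key, cond_value = rule.condition
        if config.get(cond_key) != cond_value:
            continue  # the rule does not apply to this configuration
        for key, expected in rule.requirements:
            actual = config.get(key)
            if actual != expected:
                violations.append(
                    f"{rule.name}: {key} must be {expected!r}, found {actual!r}"
                )
    return violations


# Hypothetical rule and configuration, flattened to "section.option" keys.
RULES = [
    Rule(
        name="identity-backend",
        condition=("compute.auth_backend", "identity"),
        requirements=(
            ("identity.enabled", True),
            ("compute.identity_url", "http://identity.example:5000"),
        ),
    ),
]

CONFIG = {
    "compute.auth_backend": "identity",
    "identity.enabled": True,
    "compute.identity_url": "",
}

for message in check_config(CONFIG, RULES):
    print(message)
# prints: identity-backend: compute.identity_url must be 'http://identity.example:5000', found ''
```

Returning the list of broken requirements, rather than a single pass/fail flag, matches the point made in the added text: knowing which rules a configuration breaks is what identifies the incorrect parameters.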

@ -0,0 +1,5 @@
+PRODUCTION RULES ENGINE
+=======================
+This document describes rules engine used for inspection and diagnostics of
+OpenStack configuration.