Alexandr Nevenchannyy f84ec2ce07 Add reliability test results
This commit adds part of the reliability testing results.
The scope of this commit is testing the Nova API under
several factors.

Change-Id: Id3cb644ccf4bd315846399e6ac40a446297787f3
2016-07-04 15:43:57 +03:00

.. _reliability_testing:

=============================
OpenStack reliability testing
=============================

:status: draft
:version: 0

:Abstract:

  This document describes an abstract methodology for OpenStack cluster
  high-availability testing and analysis. OpenStack data plane testing
  is currently out of scope, but will be described in the future.

:Conventions:

  .. include:: plan_conventions.rst

Test Plan
=========
Test Environment
----------------
This section should contain all information about the deployed OpenStack
environment, including an archive of the contents of the ``/etc`` directory
from all nodes.
Preparation
^^^^^^^^^^^
This section should contain all steps needed to reproduce the deployment of
the OpenStack environment and the client node. For example, if the testing
environment is deployed with DevStack, this section should contain all
DevStack configuration files, the DevStack version, and all deployment steps.
Environment description
^^^^^^^^^^^^^^^^^^^^^^^
This section should contain all cluster hardware information, including
processor model and frequency, memory size, storage type and capacity,
network interfaces, and any other relevant details.
A separate client node must be used to drive the tests.
Hardware
~~~~~~~~
This section should contain the full hardware specification of each node.

.. table:: Description of server hardware

   +--------+----------------+-------+-------+
   |SERVER  |name            |       |       |
   |        +----------------+-------+-------+
   |        |role            |       |       |
   |        +----------------+-------+-------+
   |        |vendor,model    |       |       |
   |        +----------------+-------+-------+
   |        |operating_system|       |       |
   +--------+----------------+-------+-------+
   |CPU     |vendor,model    |       |       |
   |        +----------------+-------+-------+
   |        |processor_count |       |       |
   |        +----------------+-------+-------+
   |        |core_count      |       |       |
   |        +----------------+-------+-------+
   |        |frequency_MHz   |       |       |
   +--------+----------------+-------+-------+
   |RAM     |vendor,model    |       |       |
   |        +----------------+-------+-------+
   |        |amount_MB       |       |       |
   +--------+----------------+-------+-------+
   |NETWORK |interface_name  |       |       |
   |        +----------------+-------+-------+
   |        |vendor,model    |       |       |
   |        +----------------+-------+-------+
   |        |bandwidth       |       |       |
   +--------+----------------+-------+-------+
   |STORAGE |dev_name        |       |       |
   |        +----------------+-------+-------+
   |        |vendor,model    |       |       |
   |        +----------------+-------+-------+
   |        |SSD/HDD         |       |       |
   |        +----------------+-------+-------+
   |        |size            |       |       |
   +--------+----------------+-------+-------+

Networking
~~~~~~~~~~
This section should contain a full description of the network equipment used
in the OpenStack cluster. A network topology diagram and the network hardware
configuration files should be included in this section.
Factors description
-------------------
Describe here the factors used during the test runs.
Examples are:

- **reboot-random-controller:** consists of a node-crash fault injection on a
  random OpenStack controller node.
- **reboot-random-rabbitmq:** consists of a node-crash fault injection on the
  master RabbitMQ messaging node.
- **sigstop-random-nova-api:** consists of a service-hang fault injection on a
  random nova-api service.
- **sigkill-random-mysql:** consists of a service-crash fault injection on a
  random MySQL node.
- **network-partition-random-mysql:** consists of a network-partition fault
  injection on a random MySQL node.

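
The example factors above can also be kept as data, so that each test run records exactly which fault was injected where. The following is only a minimal sketch: the ``Factor`` class, the ``FACTORS`` list and the ``pick_target`` helper are hypothetical names introduced for illustration and are not part of Rally or of any fault-injection tool.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Factor:
    """One fault-injection factor (hypothetical structure)."""
    name: str    # factor identifier as used in this test plan
    fault: str   # fault class: node-crash, service-hang, ...
    target: str  # what the fault is injected into


# The example factors listed in this section, copied as data.
FACTORS = [
    Factor("reboot-random-controller", "node-crash",
           "random OpenStack controller node"),
    Factor("reboot-random-rabbitmq", "node-crash",
           "master RabbitMQ messaging node"),
    Factor("sigstop-random-nova-api", "service-hang",
           "random nova-api service"),
    Factor("sigkill-random-mysql", "service-crash",
           "random MySQL node"),
    Factor("network-partition-random-mysql", "network-partition",
           "random MySQL node"),
]


def pick_target(candidates, rng=random):
    """Pick one candidate host at random (hypothetical helper)."""
    if not candidates:
        raise ValueError("no candidate hosts for fault injection")
    return rng.choice(candidates)
```

Passing a seeded ``random.Random`` instance as ``rng`` makes the choice of target repeatable between test cycles, which may help when comparing results.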
Test Case 1: NovaServers.boot_and_delete_server
-----------------------------------------------
Description
^^^^^^^^^^^
This Rally scenario boots and deletes virtual instances through the OpenStack
Nova API while fault factors are injected.
Service-level agreement
^^^^^^^^^^^^^^^^^^^^^^^
In this section, specify SLA values. For example:

=================== ========
Parameter           Value
=================== ========
MTTR (sec)          <=240
Failure rate (%)    <=95
Auto-healing        Yes
=================== ========

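
As an illustration of how the example SLA values could be applied to one test cycle, here is a minimal sketch. The ``Sla`` class and the ``meets_sla`` function are hypothetical names, and the rule that a cycle without auto-healing fails the SLA is an assumption drawn from the ``Auto-healing  Yes`` row.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sla:
    """SLA thresholds; defaults mirror the example table above."""
    max_mttr_sec: float = 240.0
    max_failure_rate_pct: float = 95.0
    require_auto_healing: bool = True


def meets_sla(mttr_sec, failure_rate_pct, auto_healed, sla=Sla()):
    """Return True if one test cycle satisfies every SLA threshold."""
    if sla.require_auto_healing and not auto_healed:
        return False
    return (mttr_sec <= sla.max_mttr_sec
            and failure_rate_pct <= sla.max_failure_rate_pct)
```

An infinite MTTR (no automatic recovery observed) compares greater than any threshold, so such a cycle fails the check without special handling.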
Parameters
^^^^^^^^^^
In this section, specify load parameters during the test. For example:

=================== ========
Parameter           Value
=================== ========
Runner              constant
Concurrency         X
Times               Y
Injection-iteration Z
Testing-cycles      N
=================== ========

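
The ``Runner``, ``Concurrency`` and ``Times`` rows correspond to the ``runner`` section of a Rally task. The sketch below builds such a task as a Python dictionary and prints it as JSON. It assumes the usual Rally task layout (scenario name mapped to a list of entries with ``args`` and ``runner``); the ``build_task`` function, the flavor and image names, and the numeric values are placeholders for illustration. ``Injection-iteration`` and ``Testing-cycles`` are not included, because how they are passed depends on the fault-injection tooling in use.

```python
import json


def build_task(concurrency, times, flavor_name, image_name):
    """Build a Rally-style task dict for NovaServers.boot_and_delete_server.

    Hypothetical helper: the layout follows the common Rally task format,
    but it should be checked against the Rally version actually deployed.
    """
    return {
        "NovaServers.boot_and_delete_server": [
            {
                "args": {
                    "flavor": {"name": flavor_name},
                    "image": {"name": image_name},
                },
                "runner": {
                    "type": "constant",          # Runner
                    "concurrency": concurrency,  # Concurrency (X)
                    "times": times,              # Times (Y)
                },
            }
        ]
    }


if __name__ == "__main__":
    # Placeholder values; replace with the real X and Y of the test run.
    print(json.dumps(build_task(4, 100, "m1.tiny", "cirros"), indent=2))
```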
List of reliability metrics
^^^^^^^^^^^^^^^^^^^^^^^^^^^

======== ============== ================= =================================================
Priority Value          Measurement Units Description
======== ============== ================= =================================================
1        SLA            Boolean           Service-level agreement result
2        Auto-healing   Boolean           Cluster auto-heals after fault injection
3        Failure rate   Percent           Test iteration failure ratio
4        MTTR (auto)    Seconds           Automatic mean time to repair
5        MTTR (manual)  Seconds           Manual mean time to repair, if auto MTTR is Inf.
======== ============== ================= =================================================

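
The failure rate and the automatic time to repair can be derived from per-iteration results. The sketch below is one possible reading, not a definition taken from this plan: it assumes each iteration is a dictionary with a ``finished_at`` timestamp in seconds and an ``error`` flag, and it measures the time to repair as the interval between the fault injection and the end of the last failed iteration. If the final iteration still fails, the result is infinite, which corresponds to the ``Inf.`` case mentioned in the table.

```python
import math


def failure_rate(iterations):
    """Percentage of iterations that ended with an error."""
    if not iterations:
        raise ValueError("no iterations recorded")
    failed = sum(1 for it in iterations if it["error"])
    return 100.0 * failed / len(iterations)


def time_to_repair(iterations, injected_at):
    """Seconds from fault injection until the last failed iteration ended.

    Returns 0.0 if nothing failed after the injection, and math.inf if
    the final iteration still failed (no automatic recovery observed).
    """
    after = sorted(
        (it for it in iterations if it["finished_at"] >= injected_at),
        key=lambda it: it["finished_at"],
    )
    failed = [it for it in after if it["error"]]
    if not failed:
        return 0.0
    if after[-1]["error"]:
        return math.inf
    return failed[-1]["finished_at"] - injected_at


# Illustrative data only: fault injected at t=15 s, two failed iterations.
sample = [
    {"finished_at": 10.0, "error": False},
    {"finished_at": 20.0, "error": True},
    {"finished_at": 30.0, "error": True},
    {"finished_at": 40.0, "error": False},
]
print(failure_rate(sample))          # 50.0
print(time_to_repair(sample, 15.0))  # 15.0
```

Averaging ``time_to_repair`` over several cycles of the same factor then gives a mean time to repair for that factor.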
Results
^^^^^^^
reboot-random-controller
~~~~~~~~~~~~~~~~~~~~~~~~
.. table:: **Full description of cyclic execution results**

   +--------------------+----------------+---------------------+------------------+-----------------------------+
   | Cycles             | MTTR (sec)     | Failure rate (%)    | Auto-healing     | Performance degradation     |
   +--------------------+----------------+---------------------+------------------+-----------------------------+
   | 1                  | X              | Y                   | Yes              | Yes                         |
   +--------------------+----------------+---------------------+------------------+-----------------------------+
   | 2                  | X              | Y                   | Yes              | Yes                         |
   +--------------------+----------------+---------------------+------------------+-----------------------------+
   | 3                  | X              | Y                   | No               | Yes                         |
   +--------------------+----------------+---------------------+------------------+-----------------------------+
   | 4                  | X              | Y                   | Yes              | Yes                         |
   +--------------------+----------------+---------------------+------------------+-----------------------------+
   | 5                  | X              | Y                   | Yes              | Yes                         |
   +--------------------+----------------+---------------------+------------------+-----------------------------+

Place here a link to the Rally report file with the results of testing this
factor.

.. table:: **Testing results summary**

   +--------------------+------------+------------------+
   | Value              | MTTR       | Failure rate     |
   +--------------------+------------+------------------+
   | Min                | X          | Y                |
   +--------------------+------------+------------------+
   | Max                | X          | Y                |
   +--------------------+------------+------------------+
   | SLA                | X          | Y                |
   +--------------------+------------+------------------+

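
The summary rows can be produced from the per-cycle table. The sketch below assumes each cycle is a dictionary with ``mttr_sec`` and ``failure_rate_pct`` keys. The ``summarize`` function is a hypothetical helper, and the ``SLA`` row is read here as the configured thresholds, which is an interpretation rather than something this plan states.

```python
def summarize(cycles, sla_mttr_sec, sla_failure_rate_pct):
    """Return the Min, Max and SLA rows as (MTTR, failure rate) pairs."""
    if not cycles:
        raise ValueError("no test cycles recorded")
    mttrs = [c["mttr_sec"] for c in cycles]
    rates = [c["failure_rate_pct"] for c in cycles]
    return {
        "Min": (min(mttrs), min(rates)),
        "Max": (max(mttrs), max(rates)),
        # Assumption: the SLA row repeats the thresholds for comparison.
        "SLA": (sla_mttr_sec, sla_failure_rate_pct),
    }
```

A cycle without automatic recovery can be recorded with ``float("inf")`` as its MTTR. It then surfaces in the ``Max`` row and makes the SLA violation visible at a glance.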
Detailed results description
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In this section, give a detailed description of the test results,
including the impact of the factor.
reboot-random-rabbitmq
~~~~~~~~~~~~~~~~~~~~~~
.. table:: **Full description of cyclic execution results**

   +--------------------+----------------+---------------------+------------------+-----------------------------+
   | Cycles             | MTTR (sec)     | Failure rate (%)    | Auto-healing     | Performance degradation     |
   +--------------------+----------------+---------------------+------------------+-----------------------------+
   | 1                  | X              | Y                   | Yes              | Yes                         |
   +--------------------+----------------+---------------------+------------------+-----------------------------+
   | 2                  | X              | Y                   | Yes              | Yes                         |
   +--------------------+----------------+---------------------+------------------+-----------------------------+
   | 3                  | X              | Y                   | No               | Yes                         |
   +--------------------+----------------+---------------------+------------------+-----------------------------+
   | 4                  | X              | Y                   | Yes              | Yes                         |
   +--------------------+----------------+---------------------+------------------+-----------------------------+
   | 5                  | X              | Y                   | Yes              | Yes                         |
   +--------------------+----------------+---------------------+------------------+-----------------------------+

Place here a link to the Rally report file with the results of testing this
factor.

.. table:: **Testing results summary**

   +--------------------+------------+------------------+
   | Value              | MTTR       | Failure rate     |
   +--------------------+------------+------------------+
   | Min                | X          | Y                |
   +--------------------+------------+------------------+
   | Max                | X          | Y                |
   +--------------------+------------+------------------+
   | SLA                | X          | Y                |
   +--------------------+------------+------------------+

Detailed results description
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In this section, give a detailed description of the test results,
including the impact of the factor.
Test Case 2: GlanceImages.create_and_delete_image
-------------------------------------------------
Description
^^^^^^^^^^^
This Rally scenario creates and deletes images through the OpenStack
Glance API while fault factors are injected.
Service-level agreement
^^^^^^^^^^^^^^^^^^^^^^^
In this section, specify SLA values. For example:

=================== ========
Parameter           Value
=================== ========
MTTR (sec)          <=120
Failure rate (%)    <=95
Auto-healing        Yes
=================== ========

Parameters
^^^^^^^^^^
In this section, specify load parameters during the test. For example:

=================== ========
Parameter           Value
=================== ========
Runner              constant
Concurrency         X
Times               Y
Injection-iteration Z
Testing-cycles      N
=================== ========

List of reliability metrics
^^^^^^^^^^^^^^^^^^^^^^^^^^^

======== ============== ================= =================================================
Priority Value          Measurement Units Description
======== ============== ================= =================================================
1        SLA            Boolean           Service-level agreement result
2        Auto-healing   Boolean           Cluster auto-heals after fault injection
3        Failure rate   Percent           Test iteration failure ratio
4        MTTR (auto)    Seconds           Automatic mean time to repair
5        MTTR (manual)  Seconds           Manual mean time to repair, if auto MTTR is Inf.
======== ============== ================= =================================================

Results
^^^^^^^
reboot-random-controller
~~~~~~~~~~~~~~~~~~~~~~~~
.. table:: **Full description of cyclic execution results**

   +--------------------+----------------+---------------------+------------------+-----------------------------+
   | Cycles             | MTTR (sec)     | Failure rate (%)    | Auto-healing     | Performance degradation     |
   +--------------------+----------------+---------------------+------------------+-----------------------------+
   | 1                  | X              | Y                   | Yes              | Yes                         |
   +--------------------+----------------+---------------------+------------------+-----------------------------+
   | 2                  | X              | Y                   | Yes              | Yes                         |
   +--------------------+----------------+---------------------+------------------+-----------------------------+
   | 3                  | X              | Y                   | No               | Yes                         |
   +--------------------+----------------+---------------------+------------------+-----------------------------+
   | 4                  | X              | Y                   | Yes              | Yes                         |
   +--------------------+----------------+---------------------+------------------+-----------------------------+
   | 5                  | X              | Y                   | Yes              | Yes                         |
   +--------------------+----------------+---------------------+------------------+-----------------------------+

Place here a link to the Rally report file with the results of testing this
factor.

.. table:: **Testing results summary**

   +--------------------+------------+------------------+
   | Value              | MTTR       | Failure rate     |
   +--------------------+------------+------------------+
   | Min                | X          | Y                |
   +--------------------+------------+------------------+
   | Max                | X          | Y                |
   +--------------------+------------+------------------+
   | SLA                | X          | Y                |
   +--------------------+------------+------------------+

Detailed results description
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In this section, give a detailed description of the test results,
including the impact of the factor.
reboot-random-rabbitmq
~~~~~~~~~~~~~~~~~~~~~~
.. table:: **Full description of cyclic execution results**

   +--------------------+----------------+---------------------+------------------+-----------------------------+
   | Cycles             | MTTR (sec)     | Failure rate (%)    | Auto-healing     | Performance degradation     |
   +--------------------+----------------+---------------------+------------------+-----------------------------+
   | 1                  | X              | Y                   | Yes              | Yes                         |
   +--------------------+----------------+---------------------+------------------+-----------------------------+
   | 2                  | X              | Y                   | Yes              | Yes                         |
   +--------------------+----------------+---------------------+------------------+-----------------------------+
   | 3                  | X              | Y                   | No               | Yes                         |
   +--------------------+----------------+---------------------+------------------+-----------------------------+
   | 4                  | X              | Y                   | Yes              | Yes                         |
   +--------------------+----------------+---------------------+------------------+-----------------------------+
   | 5                  | X              | Y                   | Yes              | Yes                         |
   +--------------------+----------------+---------------------+------------------+-----------------------------+

Place here a link to the Rally report file with the results of testing this
factor.

.. table:: **Testing results summary**

   +--------------------+------------+------------------+
   | Value              | MTTR       | Failure rate     |
   +--------------------+------------+------------------+
   | Min                | X          | Y                |
   +--------------------+------------+------------------+
   | Max                | X          | Y                |
   +--------------------+------------+------------------+
   | SLA                | X          | Y                |
   +--------------------+------------+------------------+

Detailed results description
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In this section, give a detailed description of the test results,
including the impact of the factor.