Fenix rolling upgrade
Use case definition for Fenix project

Change-Id: I45e3b4982be4357479628f4414328915bf92a62d
Signed-off-by: Tomi Juvonen <tomi.juvonen@nokia.com>
parent bf603c75cc
commit 124e51c285
@@ -10,3 +10,4 @@ a starting point.
 
 use-cases/nic-failure-affects-instance-and-app.rst
 use-cases/heat-mistral-aodh.rst
+use-cases/fenix-rolling-upgrade.rst
107
use-cases/fenix-rolling-upgrade.rst
Normal file
@@ -0,0 +1,107 @@
==============================================
Infrastructure rolling maintenance and upgrade
==============================================

Telcos have for years performed maintenance and upgrades in a rolling
fashion. It is now time to achieve the same in OpenStack. A rolling upgrade
minimizes downtime for the infrastructure as well as for the applications
running on top of it.


Problem description
===================

- Infrastructure maintenance and upgrade need to be possible in a rolling
  fashion to minimize downtime for services and applications.

- Maintenance and upgrade need to be manageable without adding more resources
  to the system, even when all compute capacity is in use.

- It needs to be possible to know which hosts and instances have been
  maintained and which have not.

- Generic messaging needs to be defined between the infrastructure and the
  application manager (VNFM).

- It has to be possible to ask the application manager to scale down during
  non-busy hours to free capacity for rolling maintenance and upgrade.

- The application manager needs to know when a planned maintenance session
  is over, so it can scale back to full capacity.

- The application manager needs to be aware of planned host maintenance, so
  that the application (VNF) is safely running somewhere else while the host
  is down for maintenance.

- Different infrastructure services need to be aware of a host being down
  for maintenance. This can be important for disabling automatic
  self-healing actions or billing. Generic messaging needs to be defined for
  this.

- The application manager needs to know when its instances are about to move
  to an upgraded host, so it can also perform its own upgrade to take new
  capabilities into use.

- The rolling maintenance framework needs to be pluggable to handle different
  maintenance and upgrade workflows and actions for hosts. This is also
  important for supporting different payloads and clouds.

- The infrastructure admin needs to be able to have rolling maintenance done
  with one click.

- The infrastructure admin needs to be able to follow the rolling maintenance
  status through an API and notifications.

- It must be possible for each maintenance session to define the software
  packages and plug-ins needed to run the maintenance workflow properly.


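The requirements above can be pictured with a minimal sketch. It is not Fenix
code: the ``Host`` class, the ``rolling_maintenance`` function and every event
name are hypothetical and only illustrate how hosts could be emptied and
maintained one at a time, using nothing but capacity that is already free,
while generic messages keep the application manager and other services
informed.

```python
"""Hypothetical sketch of a rolling maintenance session (not Fenix code)."""
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    capacity: int                                  # instance slots on the host
    instances: list = field(default_factory=list)
    maintained: bool = False

    @property
    def free(self) -> int:
        return self.capacity - len(self.instances)


def rolling_maintenance(hosts, notify):
    """Maintain hosts one by one without adding new hardware.

    notify(event, **details) stands in for the generic messaging towards
    the application manager and other infrastructure services. The event
    names are made up for this example.
    """
    notify("MAINTENANCE_STARTED", hosts=[h.name for h in hosts])
    for host in hosts:
        others = [h for h in hosts if h is not host]
        if sum(h.free for h in others) < len(host.instances):
            # No room elsewhere: the application manager would first have
            # to scale down to free capacity.
            raise RuntimeError(f"no free capacity to empty {host.name}")
        notify("PLANNED_MAINTENANCE", host=host.name,
               instances=list(host.instances))
        for instance in list(host.instances):
            candidates = [h for h in others if h.free > 0]
            # Prefer hosts that are already maintained, then the emptiest.
            target = max(candidates, key=lambda h: (h.maintained, h.free))
            host.instances.remove(instance)
            target.instances.append(instance)
            notify("INSTANCE_MOVED", instance=instance, host=target.name)
        host.maintained = True     # the real maintenance action would run here
        notify("HOST_MAINTAINED", host=host.name)
    notify("MAINTENANCE_COMPLETE")


# Usage: three hosts with two slots each. The application has already been
# scaled down to four instances, so one host's worth of capacity is free.
events = []


def record(event, **details):
    events.append((event, details))


hosts = [Host("compute-1", 2, ["a", "b"]),
         Host("compute-2", 2, ["c", "d"]),
         Host("compute-3", 2)]
rolling_maintenance(hosts, record)
```

With all six slots occupied, the same call raises ``RuntimeError``, which is
the situation where the application manager would be asked to scale down
before the session can proceed.
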
OpenStack projects used
=======================

All of the problems above are being addressed by the new `Fenix
<https://wiki.openstack.org/wiki/Fenix>`_ project, which manages
rolling maintenance and upgrade. Its internals are described in the
project's own documentation and blueprints. Proof-of-concept code is
already being tested in the OPNFV Doctor CI with a sample
implementation. The `Doctor maintenance design document`__ describes
the initial interaction needed. In addition, the OpenStack Vancouver
summit presentation `"How to gain VNF zero downtime during
Infrastructure Maintenance and Upgrade"`__ shows the way towards
implementing Fenix.

__ https://wiki.opnfv.org/download/attachments/5046291/Planned%20Maintenance%20Design%20Guideline.pdf?version=1&modificationDate=1527183603000&api=v2
__ https://www.openstack.org/videos/vancouver-2018/how-to-gain-vnf-zero-down-time-during-infrastructure-maintenance-and-upgrade

Because Fenix can interact with the application manager, there is a
blueprint to support that interaction in Tacker__. This would make it
possible to build a complex test case for the Fenix workflow that uses
only OpenStack components.

__ https://blueprints.launchpad.net/tacker/+spec/vnf-rolling-upgrade

To disable self-healing during maintenance, the Fenix host maintenance
notification could be supported by Vitrage and Masakari.

As workflows can differ, there have already been some discussions with the
Airship and Blazar projects. Blazar should create a blueprint that makes it
possible to change application-specific reservations to support rolling
maintenance. Airship could later look into implementing its own maintenance
and upgrade process using Fenix.

Upgrade checks for different projects are `a community goal for
Stein`__. This is one step towards automated rolling upgrades.

__ https://storyboard.openstack.org/#!/story/2003657


Future work
===========

`Fenix blueprints`__ indicate what is yet to be done for the basic
Fenix engine. Once that work is ready, the focus can move to creating a
sample workflow plug-in for the rolling upgrade, sample upgrade action
plug-ins and a framework for testing them. Ideally, the use case for the
framework would be an OpenStack and application (VNF) upgrade. This can
then serve as an example for implementing one's own workflow and other
plug-ins for a specific real-world use case.

__ https://storyboard.openstack.org/#!/worklist/482

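One way to picture the relationship between a workflow plug-in and its action
plug-ins is the sketch below. It makes no claim about the actual Fenix plug-in
interface: ``ActionPlugin``, ``Workflow``, ``register_workflow`` and the sample
classes are all hypothetical names chosen for this example.

```python
"""Hypothetical plug-in layout for maintenance workflows (not Fenix code)."""
from abc import ABC, abstractmethod


class ActionPlugin(ABC):
    """One maintenance or upgrade step executed on a single host."""

    @abstractmethod
    def run(self, host: str) -> str:
        """Perform the step and return a short log line."""


class Workflow(ABC):
    """Decides in which order hosts and actions are processed."""

    def __init__(self, actions):
        self.actions = list(actions)

    @abstractmethod
    def run(self, hosts):
        """Process the given hosts and return the collected log lines."""


WORKFLOWS = {}   # registry: workflow name -> workflow class


def register_workflow(name):
    """Class decorator that makes a workflow selectable by name."""
    def decorator(cls):
        WORKFLOWS[name] = cls
        return cls
    return decorator


class UpgradePackages(ActionPlugin):
    def run(self, host):
        return f"{host}: packages upgraded"


class Reboot(ActionPlugin):
    def run(self, host):
        return f"{host}: rebooted"


@register_workflow("rolling-upgrade")
class RollingUpgrade(Workflow):
    def run(self, hosts):
        log = []
        for host in hosts:                 # one host at a time
            for action in self.actions:    # actions in the configured order
                log.append(action.run(host))
        return log


# A maintenance session picks a workflow by name and supplies its actions.
workflow = WORKFLOWS["rolling-upgrade"]([UpgradePackages(), Reboot()])
log = workflow.run(["compute-1", "compute-2"])
```

Keeping the workflow and the per-host actions as separate plug-in types lets a
cloud replace either one independently, which matches the goal of supporting
different payloads and clouds.
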