diff --git a/doc/ha-guide/source/compute-node-ha-api.rst b/doc/ha-guide/source/compute-node-ha-api.rst index 7e60cceb7d..df20eff4a0 100644 --- a/doc/ha-guide/source/compute-node-ha-api.rst +++ b/doc/ha-guide/source/compute-node-ha-api.rst @@ -1,12 +1,9 @@ - -============================================ -Configure high availability on compute nodes -============================================ +============================================== +Configuring high availability on compute nodes +============================================== The `Newton Installation Tutorials and Guides `_ -gives instructions for installing multiple compute nodes. -To make them highly available, -you must configure the environment -to include multiple instances of the API -and other services. +provide instructions for installing multiple compute nodes. +To make the compute nodes highly available, you must configure the +environment to include multiple instances of the API and other services. diff --git a/doc/ha-guide/source/compute-node-ha.rst b/doc/ha-guide/source/compute-node-ha.rst index c5a217c044..a504225ab2 100644 --- a/doc/ha-guide/source/compute-node-ha.rst +++ b/doc/ha-guide/source/compute-node-ha.rst @@ -1,4 +1,3 @@ - ================================================== Configuring the compute node for high availability ================================================== diff --git a/doc/ha-guide/source/controller-ha-haproxy.rst b/doc/ha-guide/source/controller-ha-haproxy.rst index 14f750be90..513206d3da 100644 --- a/doc/ha-guide/source/controller-ha-haproxy.rst +++ b/doc/ha-guide/source/controller-ha-haproxy.rst @@ -8,228 +8,222 @@ under very high loads while needing persistence or Layer 7 processing. It realistically supports tens of thousands of connections with recent hardware. -Each instance of HAProxy configures its front end to accept connections -only from the virtual IP (VIP) address and to terminate them as a list -of all instances of the corresponding service under load balancing, -such as any OpenStack API service. +Each instance of HAProxy configures its front end to accept connections only +to the virtual IP (VIP) address. The HAProxy back end (termination +point) is a list of all the IP addresses of instances for load balancing. -This makes the instances of HAProxy act independently and fail over -transparently together with the network endpoints (VIP addresses) -failover and, therefore, shares the same SLA. +.. note:: -You can alternatively use a commercial load balancer, which is a hardware -or software. A hardware load balancer generally has good performance. + Ensure your HAProxy installation is not a single point of failure. + It is advisable to have multiple HAProxy instances running. + + You can also ensure availability by other means, such as Keepalived + or Pacemaker. + +Alternatively, you can use a commercial load balancer, which is hardware +or software. We recommend a hardware load balancer as it generally has +good performance. For detailed instructions about installing HAProxy on your nodes, -see its `official documentation `_. +see the HAProxy `official documentation `_. -.. note:: - - HAProxy should not be a single point of failure. - It is advisable to have multiple HAProxy instances running, - where the number of these instances is a small odd number like 3 or 5. - You need to ensure its availability by other means, - such as Keepalived or Pacemaker. - -The common practice is to locate an HAProxy instance on each OpenStack -controller in the environment.
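+After you create the :file:`/etc/haproxy/haproxy.cfg` configuration file
+described below, you can ask HAProxy to validate it before restarting the
+service. The following is a minimal sketch, assuming the configuration is
+kept in the default location:
+
+.. code-block:: console
+
+   # haproxy -c -f /etc/haproxy/haproxy.cfg
+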
- -Once configured (see example file below), add HAProxy to the cluster -and ensure the VIPs can only run on machines where HAProxy is active: - -``pcs`` - -.. code-block:: console - - $ pcs resource create lb-haproxy systemd:haproxy --clone - $ pcs constraint order start vip then lb-haproxy-clone kind=Optional - $ pcs constraint colocation add lb-haproxy-clone with vip - -``crmsh`` - -.. code-block:: console - - $ crm cib new conf-haproxy - $ crm configure primitive haproxy lsb:haproxy op monitor interval="1s" - $ crm configure clone haproxy-clone haproxy - $ crm configure colocation vip-with-haproxy inf: vip haproxy-clone - $ crm configure order haproxy-after-vip mandatory: vip haproxy-clone - -Example Config File +Configuring HAProxy ~~~~~~~~~~~~~~~~~~~ -Here is an example ``/etc/haproxy/haproxy.cfg`` configuration file. -You need a copy of it on each controller node. +#. Restart the HAProxy service. -.. note:: +#. Locate your HAProxy instance on each OpenStack controller in your + environment. The following is an example ``/etc/haproxy/haproxy.cfg`` + configuration file. Configure your instance using the following + configuration file, you will need a copy of it on each + controller node. - To implement any changes made to this you must restart the HAProxy service -.. code-block:: none + .. code-block:: none - global - chroot /var/lib/haproxy - daemon - group haproxy - maxconn 4000 - pidfile /var/run/haproxy.pid - user haproxy + global + chroot /var/lib/haproxy + daemon + group haproxy + maxconn 4000 + pidfile /var/run/haproxy.pid + user haproxy - defaults - log global - maxconn 4000 - option redispatch - retries 3 - timeout http-request 10s - timeout queue 1m - timeout connect 10s - timeout client 1m - timeout server 1m - timeout check 10s + defaults + log global + maxconn 4000 + option redispatch + retries 3 + timeout http-request 10s + timeout queue 1m + timeout connect 10s + timeout client 1m + timeout server 1m + timeout check 10s - listen dashboard_cluster - bind :443 - balance source - option tcpka - option httpchk - option tcplog - server controller1 10.0.0.12:443 check inter 2000 rise 2 fall 5 - server controller2 10.0.0.13:443 check inter 2000 rise 2 fall 5 - server controller3 10.0.0.14:443 check inter 2000 rise 2 fall 5 + listen dashboard_cluster + bind :443 + balance source + option tcpka + option httpchk + option tcplog + server controller1 10.0.0.12:443 check inter 2000 rise 2 fall 5 + server controller2 10.0.0.13:443 check inter 2000 rise 2 fall 5 + server controller3 10.0.0.14:443 check inter 2000 rise 2 fall 5 - listen galera_cluster - bind :3306 - balance source - option mysql-check - server controller1 10.0.0.12:3306 check port 9200 inter 2000 rise 2 fall 5 - server controller2 10.0.0.13:3306 backup check port 9200 inter 2000 rise 2 fall 5 - server controller3 10.0.0.14:3306 backup check port 9200 inter 2000 rise 2 fall 5 + listen galera_cluster + bind :3306 + balance source + option mysql-check + server controller1 10.0.0.12:3306 check port 9200 inter 2000 rise 2 fall 5 + server controller2 10.0.0.13:3306 backup check port 9200 inter 2000 rise 2 fall 5 + server controller3 10.0.0.14:3306 backup check port 9200 inter 2000 rise 2 fall 5 - listen glance_api_cluster - bind :9292 - balance source - option tcpka - option httpchk - option tcplog - server controller1 10.0.0.12:9292 check inter 2000 rise 2 fall 5 - server controller2 10.0.0.13:9292 check inter 2000 rise 2 fall 5 - server controller3 10.0.0.14:9292 check inter 2000 rise 2 fall 5 + listen glance_api_cluster + 
bind :9292 + balance source + option tcpka + option httpchk + option tcplog + server controller1 10.0.0.12:9292 check inter 2000 rise 2 fall 5 + server controller2 10.0.0.13:9292 check inter 2000 rise 2 fall 5 + server controller3 10.0.0.14:9292 check inter 2000 rise 2 fall 5 - listen glance_registry_cluster - bind :9191 - balance source - option tcpka - option tcplog - server controller1 10.0.0.12:9191 check inter 2000 rise 2 fall 5 - server controller2 10.0.0.13:9191 check inter 2000 rise 2 fall 5 - server controller3 10.0.0.14:9191 check inter 2000 rise 2 fall 5 + listen glance_registry_cluster + bind :9191 + balance source + option tcpka + option tcplog + server controller1 10.0.0.12:9191 check inter 2000 rise 2 fall 5 + server controller2 10.0.0.13:9191 check inter 2000 rise 2 fall 5 + server controller3 10.0.0.14:9191 check inter 2000 rise 2 fall 5 - listen keystone_admin_cluster - bind :35357 - balance source - option tcpka - option httpchk - option tcplog - server controller1 10.0.0.12:35357 check inter 2000 rise 2 fall 5 - server controller2 10.0.0.13:35357 check inter 2000 rise 2 fall 5 - server controller3 10.0.0.14:35357 check inter 2000 rise 2 fall 5 + listen keystone_admin_cluster + bind :35357 + balance source + option tcpka + option httpchk + option tcplog + server controller1 10.0.0.12:35357 check inter 2000 rise 2 fall 5 + server controller2 10.0.0.13:35357 check inter 2000 rise 2 fall 5 + server controller3 10.0.0.14:35357 check inter 2000 rise 2 fall 5 - listen keystone_public_internal_cluster - bind :5000 - balance source - option tcpka - option httpchk - option tcplog - server controller1 10.0.0.12:5000 check inter 2000 rise 2 fall 5 - server controller2 10.0.0.13:5000 check inter 2000 rise 2 fall 5 - server controller3 10.0.0.14:5000 check inter 2000 rise 2 fall 5 + listen keystone_public_internal_cluster + bind :5000 + balance source + option tcpka + option httpchk + option tcplog + server controller1 10.0.0.12:5000 check inter 2000 rise 2 fall 5 + server controller2 10.0.0.13:5000 check inter 2000 rise 2 fall 5 + server controller3 10.0.0.14:5000 check inter 2000 rise 2 fall 5 - listen nova_ec2_api_cluster - bind :8773 - balance source - option tcpka - option tcplog - server controller1 10.0.0.12:8773 check inter 2000 rise 2 fall 5 - server controller2 10.0.0.13:8773 check inter 2000 rise 2 fall 5 - server controller3 10.0.0.14:8773 check inter 2000 rise 2 fall 5 + listen nova_ec2_api_cluster + bind :8773 + balance source + option tcpka + option tcplog + server controller1 10.0.0.12:8773 check inter 2000 rise 2 fall 5 + server controller2 10.0.0.13:8773 check inter 2000 rise 2 fall 5 + server controller3 10.0.0.14:8773 check inter 2000 rise 2 fall 5 - listen nova_compute_api_cluster - bind :8774 - balance source - option tcpka - option httpchk - option tcplog - server controller1 10.0.0.12:8774 check inter 2000 rise 2 fall 5 - server controller2 10.0.0.13:8774 check inter 2000 rise 2 fall 5 - server controller3 10.0.0.14:8774 check inter 2000 rise 2 fall 5 + listen nova_compute_api_cluster + bind :8774 + balance source + option tcpka + option httpchk + option tcplog + server controller1 10.0.0.12:8774 check inter 2000 rise 2 fall 5 + server controller2 10.0.0.13:8774 check inter 2000 rise 2 fall 5 + server controller3 10.0.0.14:8774 check inter 2000 rise 2 fall 5 - listen nova_metadata_api_cluster - bind :8775 - balance source - option tcpka - option tcplog - server controller1 10.0.0.12:8775 check inter 2000 rise 2 fall 5 - server controller2 10.0.0.13:8775 check
inter 2000 rise 2 fall 5 - server controller3 10.0.0.14:8775 check inter 2000 rise 2 fall 5 + listen nova_metadata_api_cluster + bind :8775 + balance source + option tcpka + option tcplog + server controller1 10.0.0.12:8775 check inter 2000 rise 2 fall 5 + server controller2 10.0.0.13:8775 check inter 2000 rise 2 fall 5 + server controller3 10.0.0.14:8775 check inter 2000 rise 2 fall 5 - listen cinder_api_cluster - bind :8776 - balance source - option tcpka - option httpchk - option tcplog - server controller1 10.0.0.12:8776 check inter 2000 rise 2 fall 5 - server controller2 10.0.0.13:8776 check inter 2000 rise 2 fall 5 - server controller3 10.0.0.14:8776 check inter 2000 rise 2 fall 5 + listen cinder_api_cluster + bind :8776 + balance source + option tcpka + option httpchk + option tcplog + server controller1 10.0.0.12:8776 check inter 2000 rise 2 fall 5 + server controller2 10.0.0.13:8776 check inter 2000 rise 2 fall 5 + server controller3 10.0.0.14:8776 check inter 2000 rise 2 fall 5 - listen ceilometer_api_cluster - bind :8777 - balance source - option tcpka - option tcplog - server controller1 10.0.0.12:8777 check inter 2000 rise 2 fall 5 - server controller2 10.0.0.13:8777 check inter 2000 rise 2 fall 5 - server controller3 10.0.0.14:8777 check inter 2000 rise 2 fall 5 + listen ceilometer_api_cluster + bind :8777 + balance source + option tcpka + option tcplog + server controller1 10.0.0.12:8777 check inter 2000 rise 2 fall 5 + server controller2 10.0.0.13:8777 check inter 2000 rise 2 fall 5 + server controller3 10.0.0.14:8777 check inter 2000 rise 2 fall 5 - listen nova_vncproxy_cluster - bind :6080 - balance source - option tcpka - option tcplog - server controller1 10.0.0.12:6080 check inter 2000 rise 2 fall 5 - server controller2 10.0.0.13:6080 check inter 2000 rise 2 fall 5 - server controller3 10.0.0.14:6080 check inter 2000 rise 2 fall 5 + listen nova_vncproxy_cluster + bind :6080 + balance source + option tcpka + option tcplog + server controller1 10.0.0.12:6080 check inter 2000 rise 2 fall 5 + server controller2 10.0.0.13:6080 check inter 2000 rise 2 fall 5 + server controller3 10.0.0.14:6080 check inter 2000 rise 2 fall 5 - listen neutron_api_cluster - bind :9696 - balance source - option tcpka - option httpchk - option tcplog - server controller1 10.0.0.12:9696 check inter 2000 rise 2 fall 5 - server controller2 10.0.0.13:9696 check inter 2000 rise 2 fall 5 - server controller3 10.0.0.14:9696 check inter 2000 rise 2 fall 5 + listen neutron_api_cluster + bind :9696 + balance source + option tcpka + option httpchk + option tcplog + server controller1 10.0.0.12:9696 check inter 2000 rise 2 fall 5 + server controller2 10.0.0.13:9696 check inter 2000 rise 2 fall 5 + server controller3 10.0.0.14:9696 check inter 2000 rise 2 fall 5 - listen swift_proxy_cluster - bind :8080 - balance source - option tcplog - option tcpka - server controller1 10.0.0.12:8080 check inter 2000 rise 2 fall 5 - server controller2 10.0.0.13:8080 check inter 2000 rise 2 fall 5 - server controller3 10.0.0.14:8080 check inter 2000 rise 2 fall 5 + listen swift_proxy_cluster + bind :8080 + balance source + option tcplog + option tcpka + server controller1 10.0.0.12:8080 check inter 2000 rise 2 fall 5 + server controller2 10.0.0.13:8080 check inter 2000 rise 2 fall 5 + server controller3 10.0.0.14:8080 check inter 2000 rise 2 fall 5 -.. note:: + .. note:: - The Galera cluster configuration directive ``backup`` indicates - that two of the three controllers are standby nodes. 
- This ensures that only one node services write requests - because OpenStack support for multi-node writes is not yet production-ready. + The Galera cluster configuration directive ``backup`` indicates + that two of the three controllers are standby nodes. + This ensures that only one node services write requests + because OpenStack support for multi-node writes is not yet production-ready. -.. note:: + .. note:: - The Telemetry API service configuration does not have the ``option httpchk`` - directive as it cannot process this check properly. - TODO: explain why the Telemetry API is so special + The Telemetry API service configuration does not have the ``option httpchk`` + directive as it cannot process this check properly. -[TODO: we need more commentary about the contents and format of this file] +.. TODO: explain why the Telemetry API is so special + +#. Add HAProxy to the cluster and ensure the VIPs can only run on machines + where HAProxy is active: + + ``pcs`` + + .. code-block:: console + + $ pcs resource create lb-haproxy systemd:haproxy --clone + $ pcs constraint order start vip then lb-haproxy-clone kind=Optional + $ pcs constraint colocation add lb-haproxy-clone with vip + + ``crmsh`` + + .. code-block:: console + + $ crm cib new conf-haproxy + $ crm configure primitive haproxy lsb:haproxy op monitor interval="1s" + $ crm configure clone haproxy-clone haproxy + $ crm configure colocation vip-with-haproxy inf: vip haproxy-clone + $ crm configure order haproxy-after-vip mandatory: vip haproxy-clone diff --git a/doc/ha-guide/source/controller-ha-identity.rst b/doc/ha-guide/source/controller-ha-identity.rst index c6994d287f..10ce8cec6e 100644 --- a/doc/ha-guide/source/controller-ha-identity.rst +++ b/doc/ha-guide/source/controller-ha-identity.rst @@ -2,13 +2,8 @@ Highly available Identity API ============================= -You should be familiar with -`OpenStack Identity service -`_ -before proceeding, which is used by many services. - Making the OpenStack Identity service highly available -in active / passive mode involves: +in active and passive mode involves: - :ref:`identity-pacemaker` - :ref:`identity-config-identity` @@ -16,17 +11,28 @@ in active / passive mode involves: .. _identity-pacemaker: +Prerequisites +~~~~~~~~~~~~~ + +Before beginning, ensure you have read the +`OpenStack Identity service getting started documentation +`_ +before proceeding. + Add OpenStack Identity resource to Pacemaker ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +The following section(s) detail how to add the OpenStack Identity +resource to Pacemaker on SUSE and Red Hat. + SUSE ----- SUSE Enterprise Linux and SUSE-based distributions, such as openSUSE, use a set of OCF agents for controlling OpenStack services. -#. You must first download the OpenStack Identity resource to Pacemaker - by running the following commands: +#. Run the following commands to download the OpenStack Identity resource + to Pacemaker: .. code-block:: console @@ -36,40 +42,49 @@ use a set of OCF agents for controlling OpenStack services. # wget https://git.openstack.org/cgit/openstack/openstack-resource-agents/plain/ocf/keystone # chmod a+rx * -#. You can now add the Pacemaker configuration - for the OpenStack Identity resource - by running the :command:`crm configure` command - to connect to the Pacemaker cluster. - Add the following cluster resources: +#. Add the Pacemaker configuration for the OpenStack Identity resource + by running the following command to connect to the Pacemaker cluster: - :: + .. 
code-block:: console + + # crm configure + +#. Add the following cluster resources: + + .. code-block:: console clone p_keystone ocf:openstack:keystone \ params config="/etc/keystone/keystone.conf" os_password="secretsecret" os_username="admin" os_tenant_name="admin" os_auth_url="http://10.0.0.11:5000/v2.0/" \ op monitor interval="30s" timeout="30s" - This configuration creates ``p_keystone``, - a resource for managing the OpenStack Identity service. + .. note:: - :command:`crm configure` supports batch input - so you may copy and paste the above lines - into your live Pacemaker configuration, - and then make changes as required. - For example, you may enter edit ``p_ip_keystone`` - from the :command:`crm configure` menu - and edit the resource to match your preferred virtual IP address. + This configuration creates ``p_keystone``, + a resource for managing the OpenStack Identity service. -#. After you add these resources, - commit your configuration changes by entering :command:`commit` - from the :command:`crm configure` menu. - Pacemaker then starts the OpenStack Identity service - and its dependent resources on all of your nodes. +#. Commit your configuration changes from the :command:`crm configure` menu + with the following command: + + .. code-block:: console + + # commit + +The :command:`crm configure` supports batch input. You may have to copy and +paste the above lines into your live Pacemaker configuration, and then make +changes as required. + +For example, you may enter ``edit p_ip_keystone`` from the +:command:`crm configure` menu and edit the resource to match your preferred +virtual IP address. + +Pacemaker now starts the OpenStack Identity service and its dependent +resources on all of your nodes. Red Hat -------- For Red Hat Enterprise Linux and Red Hat-based Linux distributions, -the process is simpler as they use the standard Systemd unit files. +the following process uses Systemd unit files. .. code-block:: console @@ -116,29 +131,24 @@ Configure OpenStack Identity service Configure OpenStack services to use the highly available OpenStack Identity ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Your OpenStack services must now point -their OpenStack Identity configuration -to the highly available virtual cluster IP address -rather than point to the physical IP address -of an OpenStack Identity server as you would do -in a non-HA environment. +Your OpenStack services now point their OpenStack Identity configuration +to the highly available virtual cluster IP address. -#. For OpenStack Compute, for example, - if your OpenStack Identity service IP address is 10.0.0.11, - use the following configuration in your :file:`api-paste.ini` file: +#. For OpenStack Compute, (if your OpenStack Identity service IP address + is 10.0.0.11) use the following configuration in the :file:`api-paste.ini` + file: .. code-block:: ini auth_host = 10.0.0.11 -#. You also need to create the OpenStack Identity Endpoint - with this IP address. +#. Create the OpenStack Identity Endpoint with this IP address. .. note:: If you are using both private and public IP addresses, - you should create two virtual IP addresses - and define your endpoint like this: + create two virtual IP addresses and define the endpoint. For + example: .. code-block:: console @@ -150,12 +160,9 @@ in a non-HA environment. $service-type internal http://10.0.0.11:5000/v2.0 -#. If you are using the horizon dashboard, - edit the :file:`local_settings.py` file - to include the following: +#. 
If you are using the horizon Dashboard, edit the :file:`local_settings.py` + file to include the following: .. code-block:: ini OPENSTACK_HOST = 10.0.0.11 - - diff --git a/doc/ha-guide/source/controller-ha-memcached.rst b/doc/ha-guide/source/controller-ha-memcached.rst index 4592ea12eb..b5cebcc223 100644 --- a/doc/ha-guide/source/controller-ha-memcached.rst +++ b/doc/ha-guide/source/controller-ha-memcached.rst @@ -1,6 +1,6 @@ -=================== +========= Memcached -=================== +========= Memcached is a general-purpose distributed memory caching system. It is used to speed up dynamic database-driven websites by caching data @@ -10,12 +10,12 @@ source must be read. Memcached is a memory cache demon that can be used by most OpenStack services to store ephemeral data, such as tokens. -Access to memcached is not handled by HAproxy because replicated -access is currently only in an experimental state. Instead OpenStack +Access to Memcached is not handled by HAProxy because replicated +access is currently in an experimental state. Instead, OpenStack services must be supplied with the full list of hosts running -memcached. +Memcached. The Memcached client implements hashing to balance objects among the -instances. Failure of an instance only impacts a percentage of the +instances. Failure of an instance impacts only a percentage of the objects and the client automatically removes it from the list of -instances. The SLA is several minutes. +instances. The SLA is several minutes. diff --git a/doc/ha-guide/source/controller-ha-pacemaker.rst b/doc/ha-guide/source/controller-ha-pacemaker.rst index ef253b9b10..f4c61dc803 100644 --- a/doc/ha-guide/source/controller-ha-pacemaker.rst +++ b/doc/ha-guide/source/controller-ha-pacemaker.rst @@ -2,23 +2,24 @@ Pacemaker cluster stack ======================= -`Pacemaker `_ cluster stack is the state-of-the-art +`Pacemaker `_ cluster stack is a state-of-the-art high availability and load balancing stack for the Linux platform. -Pacemaker is useful to make OpenStack infrastructure highly available. -Also, it is storage and application-agnostic, and in no way -specific to OpenStack. +Pacemaker is used to make OpenStack infrastructure highly available. + +.. note:: + + It is storage and application-agnostic, and in no way specific to OpenStack. Pacemaker relies on the `Corosync `_ messaging layer -for reliable cluster communications. -Corosync implements the Totem single-ring ordering and membership protocol. -It also provides UDP and InfiniBand based messaging, -quorum, and cluster membership to Pacemaker. +for reliable cluster communications. Corosync implements the Totem single-ring +ordering and membership protocol. It also provides UDP and InfiniBand based +messaging, quorum, and cluster membership to Pacemaker. -Pacemaker does not inherently (need or want to) understand the -applications it manages. Instead, it relies on resource agents (RAs), -scripts that encapsulate the knowledge of how to start, stop, and -check the health of each application managed by the cluster. +Pacemaker does not inherently understand the applications it manages. +Instead, it relies on resource agents (RAs) that are scripts that encapsulate +the knowledge of how to start, stop, and check the health of each application +managed by the cluster. 
These agents must conform to one of the `OCF `_, @@ -44,57 +45,61 @@ The steps to implement the Pacemaker cluster stack are: Install packages ~~~~~~~~~~~~~~~~ -On any host that is meant to be part of a Pacemaker cluster, -you must first establish cluster communications -through the Corosync messaging layer. -This involves installing the following packages -(and their dependencies, which your package manager -usually installs automatically): +On any host that is meant to be part of a Pacemaker cluster, establish cluster +communications through the Corosync messaging layer. +This involves installing the following packages (and their dependencies, which +your package manager usually installs automatically): -- pacemaker +- `pacemaker` -- pcs (CentOS or RHEL) or crmsh +- `pcs` (CentOS or RHEL) or crmsh -- corosync +- `corosync` -- fence-agents (CentOS or RHEL) or cluster-glue +- `fence-agents` (CentOS or RHEL) or cluster-glue -- resource-agents +- `resource-agents` -- libqb0 +- `libqb0` .. _pacemaker-corosync-setup: -Set up the cluster with `pcs` -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Set up the cluster with pcs +~~~~~~~~~~~~~~~~~~~~~~~~~~~ -#. Make sure pcs is running and configured to start at boot time: +#. Make sure `pcs` is running and configured to start at boot time: .. code-block:: console $ systemctl enable pcsd $ systemctl start pcsd -#. Set a password for hacluster user **on each host**. - - Since the cluster is a single administrative domain, it is generally - accepted to use the same password on all nodes. +#. Set a password for hacluster user on each host: .. code-block:: console $ echo my-secret-password-no-dont-use-this-one \ | passwd --stdin hacluster -#. Use that password to authenticate to the nodes which will - make up the cluster. The :option:`-p` option is used to give - the password on command line and makes it easier to script. + .. note:: + + Since the cluster is a single administrative domain, it is + acceptable to use the same password on all nodes. + +#. Use that password to authenticate to the nodes that will + make up the cluster: .. code-block:: console $ pcs cluster auth controller1 controller2 controller3 \ -u hacluster -p my-secret-password-no-dont-use-this-one --force -#. Create the cluster, giving it a name, and start it: + .. note:: + + The :option:`-p` option is used to give the password on command + line and makes it easier to script. + +#. Create and name the cluster, and then start it: .. code-block:: console @@ -115,12 +120,12 @@ After installing the Corosync package, you must create the :file:`/etc/corosync/corosync.conf` configuration file. .. note:: - For Ubuntu, you should also enable the Corosync service - in the ``/etc/default/corosync`` configuration file. -Corosync can be configured to work -with either multicast or unicast IP addresses -or to use the votequorum library. + For Ubuntu, you should also enable the Corosync service in the + ``/etc/default/corosync`` configuration file. + +Corosync can be configured to work with either multicast or unicast IP +addresses or to use the votequorum library. - :ref:`corosync-multicast` - :ref:`corosync-unicast` @@ -132,11 +137,10 @@ Set up Corosync with multicast ------------------------------ Most distributions ship an example configuration file -(:file:`corosync.conf.example`) -as part of the documentation bundled with the Corosync package. -An example Corosync configuration file is shown below: +(:file:`corosync.conf.example`) as part of the documentation bundled with +the Corosync package. 
An example Corosync configuration file is shown below: -**Example Corosync configuration file for multicast (corosync.conf)** +**Example Corosync configuration file for multicast (``corosync.conf``)** .. code-block:: ini @@ -215,26 +219,26 @@ Note the following: When this timeout expires, the token is declared lost, and after ``token_retransmits_before_loss_const lost`` tokens, the non-responding processor (cluster node) is declared dead. - In other words, ``token × token_retransmits_before_loss_const`` + ``token × token_retransmits_before_loss_const`` is the maximum time a node is allowed to not respond to cluster messages before being considered dead. The default for token is 1000 milliseconds (1 second), with 4 allowed retransmits. These defaults are intended to minimize failover times, - but can cause frequent "false alarms" and unintended failovers + but can cause frequent false alarms and unintended failovers in case of short network interruptions. The values used here are safer, albeit with slightly extended failover times. - With ``secauth`` enabled, - Corosync nodes mutually authenticate using a 128-byte shared secret - stored in the :file:`/etc/corosync/authkey` file, - which may be generated with the :command:`corosync-keygen` utility. - When using ``secauth``, cluster communications are also encrypted. + Corosync nodes mutually authenticate using a 128-byte shared secret + stored in the :file:`/etc/corosync/authkey` file. + This can be generated with the :command:`corosync-keygen` utility. + Cluster communications are encrypted when using ``secauth``. -- In Corosync configurations using redundant networking - (with more than one interface), - you must select a Redundant Ring Protocol (RRP) mode other than none. - ``active`` is the recommended RRP mode. +- In Corosync configurations using redundant networking + (with more than one interface), you must select a Redundant + Ring Protocol (RRP) mode other than none. We recommend ``active`` as + the RRP mode. Note the following about the recommended interface configuration: @@ -245,61 +249,57 @@ Note the following: The example uses two network addresses of /24 IPv4 subnets. - Multicast groups (``mcastaddr``) must not be reused - across cluster boundaries. - In other words, no two distinct clusters + across cluster boundaries. No two distinct clusters should ever use the same multicast group. Be sure to select multicast addresses compliant with `RFC 2365, "Administratively Scoped IP Multicast" `_. - - For firewall configurations, - note that Corosync communicates over UDP only, - and uses ``mcastport`` (for receives) - and ``mcastport - 1`` (for sends). + - For firewall configurations, Corosync communicates over UDP only, + and uses ``mcastport`` (for receives) and ``mcastport - 1`` (for sends). -- The service declaration for the pacemaker service +- The service declaration for the Pacemaker service may be placed in the :file:`corosync.conf` file directly or in its own separate file, :file:`/etc/corosync/service.d/pacemaker`. .. note:: - If you are using Corosync version 2 on Ubuntu 14.04, - remove or comment out lines under the service stanza, - which enables Pacemaker to start up. Another potential - problem is the boot and shutdown order of Corosync and - Pacemaker. To force Pacemaker to start after Corosync and - stop before Corosync, fix the start and kill symlinks manually: + If you are using Corosync version 2 on Ubuntu 14.04, + remove or comment out lines under the service stanza.
+ These stanzas enable Pacemaker to start up. Another potential + problem is the boot and shutdown order of Corosync and + Pacemaker. To force Pacemaker to start after Corosync and + stop before Corosync, fix the start and kill symlinks manually: - .. code-block:: console + .. code-block:: console - # update-rc.d pacemaker start 20 2 3 4 5 . stop 00 0 1 6 . + # update-rc.d pacemaker start 20 2 3 4 5 . stop 00 0 1 6 . - The Pacemaker service also requires an additional - configuration file ``/etc/corosync/uidgid.d/pacemaker`` - to be created with the following content: + The Pacemaker service also requires an additional + configuration file ``/etc/corosync/uidgid.d/pacemaker`` + to be created with the following content: - .. code-block:: ini + .. code-block:: ini - uidgid { - uid: hacluster - gid: haclient - } + uidgid { + uid: hacluster + gid: haclient + } -- Once created, the :file:`corosync.conf` file +- Once created, synchronize the :file:`corosync.conf` file (and the :file:`authkey` file if the secauth option is enabled) - must be synchronized across all cluster nodes. + across all cluster nodes. .. _corosync-unicast: Set up Corosync with unicast ---------------------------- -For environments that do not support multicast, -Corosync should be configured for unicast. -An example fragment of the :file:`corosync.conf` file -for unicastis shown below: +For environments that do not support multicast, Corosync should be configured +for unicast. An example fragment of the :file:`corosync.conf` file +for unicast is shown below: -**Corosync configuration file fragment for unicast (corosync.conf)** +**Corosync configuration file fragment for unicast (``corosync.conf``)** .. code-block:: ini @@ -341,45 +341,38 @@ for unicastis shown below: Note the following: -- If the ``broadcast`` parameter is set to yes, - the broadcast address is used for communication. - If this option is set, the ``mcastaddr`` parameter should not be set. +- If the ``broadcast`` parameter is set to ``yes``, the broadcast address is + used for communication. If this option is set, the ``mcastaddr`` parameter + should not be set. -- The ``transport`` directive controls the transport mechanism used. - To avoid the use of multicast entirely, - specify the ``udpu`` unicast transport parameter. - This requires specifying the list of members - in the ``nodelist`` directive; - this could potentially make up the membership before deployment. - The default is ``udp``. - The transport type can also be set to ``udpu`` or ``iba``. +- The ``transport`` directive controls the transport mechanism. + To avoid the use of multicast entirely, specify the ``udpu`` unicast + transport parameter. This requires specifying the list of members in the + ``nodelist`` directive. This potentially makes up the membership before + deployment. The default is ``udp``. The transport type can also be set to + ``udpu`` or ``iba``. -- Within the ``nodelist`` directive, - it is possible to specify specific information - about the nodes in the cluster. - The directive can contain only the node sub-directive, - which specifies every node that should be a member of the membership, - and where non-default options are needed. - Every node must have at least the ``ring0_addr`` field filled. +- Within the ``nodelist`` directive, it is possible to specify specific + information about the nodes in the cluster.
The directive can contain only + the node sub-directive, which specifies every node that should be a member + of the membership, and where non-default options are needed. Every node must + have at least the ``ring0_addr`` field filled. .. note:: - For UDPU, every node that should be a member - of the membership must be specified. + For UDPU, every node that should be a member of the membership must be specified. Possible options are: - ``ring{X}_addr`` specifies the IP address of one of the nodes. - {X} is the ring number. + ``{X}`` is the ring number. - - ``nodeid`` is optional - when using IPv4 and required when using IPv6. - This is a 32-bit value specifying the node identifier - delivered to the cluster membership service. - If this is not specified with IPv4, - the node id is determined from the 32-bit IP address - of the system to which the system is bound with ring identifier of 0. - The node identifier value of zero is reserved and should not be used. + - ``nodeid`` is optional when using IPv4 and required when using IPv6. + This is a 32-bit value specifying the node identifier delivered to the + cluster membership service. If this is not specified with IPv4, + the node ID is determined from the 32-bit IP address of the system to which + the system is bound with ring identifier of 0. The node identifier value of + zero is reserved and should not be used. .. _corosync-votequorum: @@ -387,15 +380,14 @@ Note the following: Set up Corosync with votequorum library --------------------------------------- -The votequorum library is part of the corosync project. -It provides an interface to the vote-based quorum service -and it must be explicitly enabled in the Corosync configuration file. -The main role of votequorum library is to avoid split-brain situations, -but it also provides a mechanism to: +The votequorum library is part of the Corosync project. It provides an +interface to the vote-based quorum service and it must be explicitly enabled +in the Corosync configuration file. The main role of votequorum library is to +avoid split-brain situations, but it also provides a mechanism to: - Query the quorum status -- Get a list of nodes known to the quorum service +- List the nodes known to the quorum service - Receive notifications of quorum state changes @@ -403,15 +395,13 @@ but it also provides a mechanism to: - Change the number of expected votes for a cluster to be quorate -- Connect an additional quorum device - to allow small clusters remain quorate during node outages +- Connect an additional quorum device to allow small clusters remain quorate + during node outages -The votequorum library has been created to replace and eliminate -qdisk, the disk-based quorum daemon for CMAN, -from advanced cluster configurations. +The votequorum library has been created to replace and eliminate ``qdisk``, the +disk-based quorum daemon for CMAN, from advanced cluster configurations. -A sample votequorum service configuration -in the :file:`corosync.conf` file is: +A sample votequorum service configuration in the :file:`corosync.conf` file is: .. code-block:: ini @@ -425,42 +415,33 @@ in the :file:`corosync.conf` file is: Note the following: -- Specifying ``corosync_votequorum`` enables the votequorum library; - this is the only required option. +- Specifying ``corosync_votequorum`` enables the votequorum library. + This is the only required option. - The cluster is fully operational with ``expected_votes`` set to 7 nodes - (each node has 1 vote), quorum: 4. 
- If a list of nodes is specified as ``nodelist``, - the ``expected_votes`` value is ignored. + (each node has 1 vote), quorum: 4. If a list of nodes is specified as + ``nodelist``, the ``expected_votes`` value is ignored. -- Setting ``wait_for_all`` to 1 means that, - When starting up a cluster (all nodes down), - the cluster quorum is held until all nodes are online - and have joined the cluster for the first time. - This parameter is new in Corosync 2.0. +- When you start up a cluster (all nodes down) and set ``wait_for_all`` to 1, + the cluster quorum is held until all nodes are online and have joined the + cluster for the first time. This parameter is new in Corosync 2.0. -- Setting ``last_man_standing`` to 1 enables - the Last Man Standing (LMS) feature; - by default, it is disabled (set to 0). - If a cluster is on the quorum edge - (``expected_votes:`` set to 7; ``online nodes:`` set to 4) - for longer than the time specified - for the ``last_man_standing_window`` parameter, - the cluster can recalculate quorum and continue operating - even if the next node will be lost. - This logic is repeated until the number of online nodes - in the cluster reaches 2. - In order to allow the cluster to step down from 2 members to only 1, - the ``auto_tie_breaker`` parameter needs to be set; - this is not recommended for production environments. +- Setting ``last_man_standing`` to 1 enables the Last Man Standing (LMS) + feature. By default, it is disabled (set to 0). + If a cluster is on the quorum edge (``expected_votes:`` set to 7; + ``online nodes:`` set to 4) for longer than the time specified + for the ``last_man_standing_window`` parameter, the cluster can recalculate + quorum and continue operating even if the next node will be lost. + This logic is repeated until the number of online nodes in the cluster + reaches 2. In order to allow the cluster to step down from 2 members to only + 1, the ``auto_tie_breaker`` parameter needs to be set. + We do not recommend this for production environments. - ``last_man_standing_window`` specifies the time, in milliseconds, required to recalculate quorum after one or more hosts - have been lost from the cluster. - To do the new quorum recalculation, + have been lost from the cluster. To perform a new quorum recalculation, the cluster must have quorum for at least the interval - specified for ``last_man_standing_window``; - the default is 10000ms. + specified for ``last_man_standing_window``. The default is 10000ms. .. _pacemaker-corosync-start: Start Corosync -------------- -``Corosync`` is started as a regular system service. -Depending on your distribution, it may ship with an LSB init script, -an upstart job, or a systemd unit file. -Either way, the service is usually named ``corosync``: +Corosync is started as a regular system service. Depending on your +distribution, it may ship with an LSB init script, an upstart job, or +a Systemd unit file. -- To start ``corosync`` with the LSB init script: +- Start ``corosync`` with the LSB init script: .. code-block:: console # /etc/init.d/corosync start -- Alternatively: + Alternatively: .. code-block:: console # service corosync start -- To start ``corosync`` with upstart: +- Start ``corosync`` with upstart: .. code-block:: console # start corosync -- To start ``corosync`` with systemd unit file: +- Start ``corosync`` with systemd unit file: ..
code-block:: console @@ -514,8 +494,8 @@ to get a summary of the health of the communication rings: id = 10.0.42.100 status = ring 1 active with no faults -Use the :command:`corosync-objctl` utility -to dump the Corosync cluster member list: +Use the :command:`corosync-objctl` utility to dump the Corosync cluster +member list: .. code-block:: console @@ -527,11 +507,8 @@ to dump the Corosync cluster member list: runtime.totem.pg.mrp.srp.983895584.join_count=1 runtime.totem.pg.mrp.srp.983895584.status=joined -You should see a ``status=joined`` entry -for each of your constituent cluster nodes. - -[TODO: Should the main example now use corosync-cmapctl and have the note -give the command for Corosync version 1?] +You should see a ``status=joined`` entry for each of your constituent +cluster nodes. .. note:: @@ -543,38 +520,38 @@ give the command for Corosync version 1?] Start Pacemaker --------------- -After the ``corosync`` service have been started -and you have verified that the cluster is communicating properly, -you can start :command:`pacemakerd`, the Pacemaker master control process. -Choose one from the following four ways to start it: +After the ``corosync`` service has been started and you have verified that the +cluster is communicating properly, you can start :command:`pacemakerd`, the +Pacemaker master control process. Choose one from the following four ways to +start it: -- To start ``pacemaker`` with the LSB init script: +#. Start ``pacemaker`` with the LSB init script: .. code-block:: console # /etc/init.d/pacemaker start -- Alternatively: + Alternatively: .. code-block:: console # service pacemaker start -- To start ``pacemaker`` with upstart: +#. Start ``pacemaker`` with upstart: .. code-block:: console # start pacemaker -- To start ``pacemaker`` with the systemd unit file: +#. Start ``pacemaker`` with the systemd unit file: .. code-block:: console # systemctl start pacemaker -After the ``pacemaker`` service have started, -Pacemaker creates a default empty cluster configuration with no resources. -Use the :command:`crm_mon` utility to observe the status of ``pacemaker``: +After the ``pacemaker`` service has started, Pacemaker creates a default empty +cluster configuration with no resources. Use the :command:`crm_mon` utility to +observe the status of ``pacemaker``: .. code-block:: console @@ -596,30 +573,29 @@ Use the :command:`crm_mon` utility to observe the status of ``pacemaker``: Set basic cluster properties ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -After you set up your Pacemaker cluster, -you should set a few basic cluster properties: +After you set up your Pacemaker cluster, set a few basic cluster properties: -``crmsh`` +- ``crmsh`` -.. code-block:: console + .. code-block:: console - $ crm configure property pe-warn-series-max="1000" \ - pe-input-series-max="1000" \ - pe-error-series-max="1000" \ - cluster-recheck-interval="5min" + $ crm configure property pe-warn-series-max="1000" \ + pe-input-series-max="1000" \ + pe-error-series-max="1000" \ + cluster-recheck-interval="5min" -``pcs`` +- ``pcs`` -.. code-block:: console + ..
code-block:: console - $ pcs property set pe-warn-series-max=1000 \ - pe-input-series-max=1000 \ - pe-error-series-max=1000 \ - cluster-recheck-interval=5min + $ pcs property set pe-warn-series-max=1000 \ + pe-input-series-max=1000 \ + pe-error-series-max=1000 \ + cluster-recheck-interval=5min Note the following: -- Setting the ``pe-warn-series-max``, ``pe-input-series-max`` +- Setting the ``pe-warn-series-max``, ``pe-input-series-max``, and ``pe-error-series-max`` parameters to 1000 instructs Pacemaker to keep a longer history of the inputs processed and errors and warnings generated by its Policy Engine. @@ -631,4 +607,4 @@ Note the following: It is usually prudent to reduce this to a shorter interval, such as 5 or 3 minutes. -After you make these changes, you may commit the updated configuration. +After you make these changes, commit the updated configuration. diff --git a/doc/ha-guide/source/controller-ha-telemetry.rst b/doc/ha-guide/source/controller-ha-telemetry.rst index 9fa959d6cb..3f7afafa51 100644 --- a/doc/ha-guide/source/controller-ha-telemetry.rst +++ b/doc/ha-guide/source/controller-ha-telemetry.rst @@ -2,76 +2,80 @@ Highly available Telemetry API ============================== -`Telemetry service -`__ -provides data collection service and alarming service. +The `Telemetry service +`_ +provides a data collection service and an alarming service. Telemetry central agent ~~~~~~~~~~~~~~~~~~~~~~~ The Telemetry central agent can be configured to partition its polling -workload between multiple agents, enabling high availability. +workload between multiple agents. This enables high availability (HA). -Both the central and the compute agent can run in an HA deployment, -which means that multiple instances of these services can run in +Both the central and the compute agent can run in an HA deployment. +This means that multiple instances of these services can run in parallel with workload partitioning among these running instances. -The `Tooz `__ library provides +The `Tooz `_ library provides the coordination within the groups of service instances. It provides an API above several back ends that can be used for building distributed applications. Tooz supports -`various drivers `__ +`various drivers `_ including the following back end solutions: -* `Zookeeper `__. +* `Zookeeper `_: Recommended solution by the Tooz project. -* `Redis `__. +* `Redis `_: Recommended solution by the Tooz project. -* `Memcached `__. +* `Memcached `_: Recommended for testing. You must configure a supported Tooz driver for the HA deployment of the Telemetry services. -For information about the required configuration options that have -to be set in the :file:`ceilometer.conf` configuration file for both -the central and compute agents, see the `coordination section -`__ +For information about the required configuration options +to set in the :file:`ceilometer.conf`, see the `coordination section +`_ in the OpenStack Configuration Reference. -.. note:: Without the ``backend_url`` option being set only one - instance of both the central and compute agent service is able to run - and function correctly. +.. note:: + + Only one instance for the central and compute agent service(s) is able + to run and function correctly if the ``backend_url`` option is not set. The availability check of the instances is provided by heartbeat messages. When the connection with an instance is lost, the workload will be -reassigned within the remained instances in the next polling cycle. 
+reassigned within the remaining instances in the next polling cycle. -.. note:: Memcached uses a timeout value, which should always be set to +.. note:: + + Memcached uses a timeout value, which should always be set to a value that is higher than the heartbeat value set for Telemetry. For backward compatibility and supporting existing deployments, the central -agent configuration also supports using different configuration files for -groups of service instances of this type that are running in parallel. +agent configuration supports using different configuration files. This is for +groups of service instances that are running in parallel. For enabling this configuration, set a value for the ``partitioning_group_prefix`` option in the -`polling section `__ +`polling section `_ in the OpenStack Configuration Reference. -.. warning:: For each sub-group of the central agent pool with the same - ``partitioning_group_prefix`` a disjoint subset of meters must be polled -- - otherwise samples may be missing or duplicated. The list of meters to poll +.. warning:: + + For each sub-group of the central agent pool with the same + ``partitioning_group_prefix``, a disjoint subset of meters must be polled + to avoid samples being missing or duplicated. The list of meters to poll can be set in the :file:`/etc/ceilometer/pipeline.yaml` configuration file. For more information about pipelines see the `Data collection and processing - `__ + `_ section. To enable the compute agent to run multiple instances simultaneously with -workload partitioning, the workload_partitioning option has to be set to -``True`` under the `compute section `__ +workload partitioning, the ``workload_partitioning`` option must be set to +``True`` under the `compute section `_ in the :file:`ceilometer.conf` configuration file. diff --git a/doc/ha-guide/source/controller-ha-vip.rst b/doc/ha-guide/source/controller-ha-vip.rst index b46adc811d..058c12cccc 100644 --- a/doc/ha-guide/source/controller-ha-vip.rst +++ b/doc/ha-guide/source/controller-ha-vip.rst @@ -1,13 +1,12 @@ - ================= Configure the VIP ================= -You must select and assign a virtual IP address (VIP) -that can freely float between cluster nodes. +You must select and assign a virtual IP address (VIP) that can freely float +between cluster nodes. -This configuration creates ``vip``, -a virtual IP address for use by the API node (``10.0.0.11``): +This configuration creates ``vip``, a virtual IP address for use by the +API node (``10.0.0.11``). For ``crmsh``: diff --git a/doc/ha-guide/source/controller-ha.rst b/doc/ha-guide/source/controller-ha.rst index 41e68668d8..364577925d 100644 --- a/doc/ha-guide/source/controller-ha.rst +++ b/doc/ha-guide/source/controller-ha.rst @@ -2,8 +2,8 @@ Configuring the controller for high availability ================================================ -The cloud controller runs on the management network -and must talk to all other services. +The cloud controller runs on the management network and must talk to +all other services. .. toctree:: :maxdepth: 2 diff --git a/doc/ha-guide/source/environment-hardware.rst b/doc/ha-guide/source/environment-hardware.rst index b2d2e50470..39a4448a3c 100644 --- a/doc/ha-guide/source/environment-hardware.rst +++ b/doc/ha-guide/source/environment-hardware.rst @@ -2,29 +2,26 @@ Hardware considerations for high availability ============================================= -.. 
TODO: Provide a minimal architecture example for HA, expanded on that - given in the *Environment* section of - http://docs.openstack.org/project-install-guide/newton (depending - on the distribution) for easy comparison. +When you use high availability, consider the hardware requirements needed +for your application. Hardware setup ~~~~~~~~~~~~~~ -The standard hardware requirements: +The following are the standard hardware requirements: -- Provider networks. See the *Overview -> Networking Option 1: Provider +- Provider networks: See the *Overview -> Networking Option 1: Provider networks* section of the `Install Tutorials and Guides `_ depending on your distribution. -- Self-service networks. See the *Overview -> Networking Option 2: +- Self-service networks: See the *Overview -> Networking Option 2: Self-service networks* section of the `Install Tutorials and Guides `_ depending on your distribution. -However, OpenStack does not require a significant amount of resources -and the following minimum requirements should support -a proof-of-concept high availability environment -with core services and several instances: +OpenStack does not require a significant amount of resources and the following +minimum requirements should support a proof-of-concept high availability +environment with core services and several instances: +-------------------+------------------+----------+-----------+------+ | Node type | Processor Cores | Memory | Storage | NIC | @@ -39,26 +36,23 @@ nodes is 2 milliseconds. Although the cluster software can be tuned to operate at higher latencies, some vendors insist on this value before agreeing to support the installation. -The `ping` command can be used to find the latency between two -servers. +You can use the `ping` command to find the latency between two servers. Virtualized hardware ~~~~~~~~~~~~~~~~~~~~ -For demonstrations and studying, -you can set up a test environment on virtual machines (VMs). -This has the following benefits: +For demonstrations and studying, you can set up a test environment on virtual +machines (VMs). This has the following benefits: - One physical server can support multiple nodes, each of which supports almost any number of network interfaces. -- Ability to take periodic "snap shots" throughout the installation process - and "roll back" to a working configuration in the event of a problem. +- You can take periodic snap shots throughout the installation process + and roll back to a working configuration in the event of a problem. -However, running an OpenStack environment on VMs -degrades the performance of your instances, -particularly if your hypervisor and/or processor lacks support -for hardware acceleration of nested VMs. +However, running an OpenStack environment on VMs degrades the performance of +your instances, particularly if your hypervisor or processor lacks +support for hardware acceleration of nested VMs. .. note:: diff --git a/doc/ha-guide/source/environment-memcached.rst b/doc/ha-guide/source/environment-memcached.rst index e14ee8b7e4..c3f5c9304c 100644 --- a/doc/ha-guide/source/environment-memcached.rst +++ b/doc/ha-guide/source/environment-memcached.rst @@ -1,40 +1,32 @@ -================= -Install memcached -================= +==================== +Installing Memcached +==================== -[TODO: Verify that Oslo supports hash synchronization; -if so, this should not take more than load balancing.] - -[TODO: This hands off to two different docs for install information. 
-We should choose one or explain the specific purpose of each.] - -Most OpenStack services can use memcached -to store ephemeral data such as tokens. -Although memcached does not support -typical forms of redundancy such as clustering, -OpenStack services can use almost any number of instances +Most OpenStack services can use Memcached to store ephemeral data such as +tokens. Although Memcached does not support typical forms of redundancy such +as clustering, OpenStack services can use almost any number of instances by configuring multiple hostnames or IP addresses. -The memcached client implements hashing -to balance objects among the instances. -Failure of an instance only impacts a percentage of the objects + +The Memcached client implements hashing to balance objects among the instances. +Failure of an instance only impacts a percentage of the objects, and the client automatically removes it from the list of instances. -To install and configure memcached, read the -`official documentation `_. +To install and configure Memcached, read the +`official documentation `_. Memory caching is managed by `oslo.cache -`_ -so the way to use multiple memcached servers is the same for all projects. - -Example configuration with three hosts: +`_. +This ensures consistency across all projects when using multiple Memcached +servers. The following is an example configuration with three hosts: .. code-block:: ini - memcached_servers = controller1:11211,controller2:11211,controller3:11211 + memcached_servers = controller1:11211,controller2:11211,controller3:11211 -By default, ``controller1`` handles the caching service. -If the host goes down, ``controller2`` or ``controller3`` does the job. -For more information about memcached installation, see the +By default, ``controller1`` handles the caching service. If the host goes down, +``controller2`` or ``controller3`` will take over the service. + +For more information about Memcached installation, see the *Environment -> Memcached* section in the `Installation Tutorials and Guides `_ depending on your distribution. diff --git a/doc/ha-guide/source/environment-operatingsystem.rst b/doc/ha-guide/source/environment-operatingsystem.rst index aef9f11e67..e94079e6e6 100644 --- a/doc/ha-guide/source/environment-operatingsystem.rst +++ b/doc/ha-guide/source/environment-operatingsystem.rst @@ -1,6 +1,6 @@ -===================================== -Install operating system on each node -===================================== +=============================== +Installing the operating system +=============================== The first step in setting up your highly available OpenStack cluster is to install the operating system on each node. diff --git a/doc/ha-guide/source/ha-community.rst b/doc/ha-guide/source/ha-community.rst index 055128a2a6..ce8dec0811 100644 --- a/doc/ha-guide/source/ha-community.rst +++ b/doc/ha-guide/source/ha-community.rst @@ -3,7 +3,7 @@ HA community ============ Weekly IRC meetings -------------------- +~~~~~~~~~~~~~~~~~~~ The OpenStack HA community holds `weekly IRC meetings `_ to discuss encouraged to attend. The `logs of all previous meetings `_ are available to read.
Contacting the community ------------------------- +~~~~~~~~~~~~~~~~~~~~~~~~ You can contact the HA community directly in `the #openstack-ha channel on Freenode IRC `_, or by diff --git a/doc/ha-guide/source/index.rst b/doc/ha-guide/source/index.rst index e589ef17b5..70a9950709 100644 --- a/doc/ha-guide/source/index.rst +++ b/doc/ha-guide/source/index.rst @@ -1,20 +1,23 @@ -================================= +================================= OpenStack High Availability Guide ================================= Abstract ~~~~~~~~ -This guide describes how to install and configure -OpenStack for high availability. -It supplements the Installation Tutorials and Guides +This guide describes how to install and configure OpenStack for high +availability. It supplements the Installation Tutorials and Guides and assumes that you are familiar with the material in those guides. This guide documents OpenStack Newton, Mitaka, and Liberty releases. -.. warning:: This guide is a work-in-progress and changing rapidly - while we continue to test and enhance the guidance. Please note - where there are open "to do" items and help where you are able. +.. warning:: + + This guide is a work-in-progress and changing rapidly + while we continue to test and enhance the guidance. There are + open `TODO` items throughout and available on the OpenStack manuals + `bug list `_. + Please help where you are able. Contents ~~~~~~~~ diff --git a/doc/ha-guide/source/instance-ha.rst b/doc/ha-guide/source/instance-ha.rst index 548d114cb7..b40bb50c5e 100644 --- a/doc/ha-guide/source/instance-ha.rst +++ b/doc/ha-guide/source/instance-ha.rst @@ -4,28 +4,28 @@ Configure high availability of instances As of September 2016, the OpenStack High Availability community is designing and developing an official and unified way to provide high -availability for instances. That is, we are developing automatic +availability for instances. We are developing automatic recovery from failures of hardware or hypervisor-related software on -the compute node, or other failures which could prevent instances from -functioning correctly - issues with a cinder volume I/O path, for example. +the compute node, or other failures that could prevent instances from +functioning correctly, such as, issues with a cinder volume I/O path. More details are available in the `user story `_ co-authored by OpenStack's HA community and `Product Working Group -`_ (PWG), who have -identified this feature as missing functionality in OpenStack which +`_ (PWG), where this feature is +identified as missing functionality in OpenStack, which should be addressed with high priority. Existing solutions ------------------- +~~~~~~~~~~~~~~~~~~ The architectural challenges of instance HA and several currently existing solutions were presented in `a talk at the Austin summit `_, -for which `slides are also available -`_. +for which `slides are also available `_. -The code for three of these solutions can be found online: +The code for three of these solutions can be found online at the following +links: * `a mistral-based auto-recovery workflow `_, by Intel @@ -35,7 +35,7 @@ The code for three of these solutions can be found online: as used by Red Hat and SUSE Current upstream work ---------------------- +~~~~~~~~~~~~~~~~~~~~~ Work is in progress on a unified approach, which combines the best aspects of existing upstream solutions. 
More details are available on diff --git a/doc/ha-guide/source/intro-ha-arch-pacemaker.rst b/doc/ha-guide/source/intro-ha-arch-pacemaker.rst index 5b724da3cf..88f7528421 100644 --- a/doc/ha-guide/source/intro-ha-arch-pacemaker.rst +++ b/doc/ha-guide/source/intro-ha-arch-pacemaker.rst @@ -2,24 +2,24 @@ The Pacemaker architecture ========================== -What is a cluster manager -~~~~~~~~~~~~~~~~~~~~~~~~~ +What is a cluster manager? +~~~~~~~~~~~~~~~~~~~~~~~~~~ At its core, a cluster is a distributed finite state machine capable of co-ordinating the startup and recovery of inter-related services across a set of machines. -Even a distributed and/or replicated application that is able to -survive failures on one or more machines can benefit from a -cluster manager: +Even a distributed or replicated application that is able to survive failures +on one or more machines can benefit from a cluster manager because a cluster +manager has the following capabilities: #. Awareness of other applications in the stack While SYS-V init replacements like systemd can provide deterministic recovery of a complex stack of services, the recovery is limited to one machine and lacks the context of what - is happening on other machines - context that is crucial to - determine the difference between a local failure, clean startup + is happening on other machines. This context is crucial to + determine the difference between a local failure, and clean startup and recovery after a total site failure. #. Awareness of instances on other machines @@ -27,17 +27,17 @@ cluster manager: Services like RabbitMQ and Galera have complicated boot-up sequences that require co-ordination, and often serialization, of startup operations across all machines in the cluster. This is - especially true after site-wide failure or shutdown where we must + especially true after a site-wide failure or shutdown where you must first determine the last machine to be active. #. A shared implementation and calculation of `quorum - `_. + `_ It is very important that all members of the system share the same view of who their peers are and whether or not they are in the majority. Failure to do this leads very quickly to an internal `split-brain `_ - state - where different parts of the system are pulling in + state. This is where different parts of the system are pulling in different and incompatible directions. #. Data integrity through fencing (a non-responsive process does not @@ -46,7 +46,7 @@ cluster manager: A single application does not have sufficient context to know the difference between failure of a machine and failure of the application on a machine. The usual practice is to assume the - machine is dead and carry on, however this is highly risky - a + machine is dead and continue working, however this is highly risky. A rogue process or machine could still be responding to requests and generally causing havoc. The safer approach is to make use of remotely accessible power switches and/or network switches and SAN @@ -59,46 +59,46 @@ cluster manager: required volume of requests. A cluster can automatically recover failed instances to prevent additional load induced failures. -For this reason, the use of a cluster manager like `Pacemaker -`_ is highly recommended. +For these reasons, we highly recommend the use of a cluster manager like +`Pacemaker `_. Deployment flavors ~~~~~~~~~~~~~~~~~~ It is possible to deploy three different flavors of the Pacemaker -architecture. 
The two extremes are **Collapsed** (where every -component runs on every node) and **Segregated** (where every +architecture. The two extremes are ``Collapsed`` (where every +component runs on every node) and ``Segregated`` (where every component runs in its own 3+ node cluster). -Regardless of which flavor you choose, it is recommended that the -clusters contain at least three nodes so that we can take advantage of +Regardless of which flavor you choose, we recommend that +clusters contain at least three nodes so that you can take advantage of `quorum `_. Quorum becomes important when a failure causes the cluster to split in -two or more partitions. In this situation, you want the majority to -ensure the minority are truly dead (through fencing) and continue to -host resources. For a two-node cluster, no side has the majority and +two or more partitions. In this situation, you want the majority members of +the system to ensure the minority are truly dead (through fencing) and continue +to host resources. For a two-node cluster, no side has the majority and you can end up in a situation where both sides fence each other, or -both sides are running the same services - leading to data corruption. +both sides are running the same services. This can lead to data corruption. -Clusters with an even number of hosts suffer from similar issues - a +Clusters with an even number of hosts suffer from similar issues. A single network failure could easily cause a N:N split where neither side retains a majority. For this reason, we recommend an odd number of cluster members when scaling up. You can have up to 16 cluster members (this is currently limited by the ability of corosync to scale higher). In extreme cases, 32 and -even up to 64 nodes could be possible, however, this is not well tested. +even up to 64 nodes could be possible. However, this is not well tested. Collapsed --------- -In this configuration, there is a single cluster of 3 or more +In a collapsed configuration, there is a single cluster of 3 or more nodes on which every component is running. This scenario has the advantage of requiring far fewer, if more powerful, machines. Additionally, being part of a single cluster -allows us to accurately model the ordering dependencies between +allows you to accurately model the ordering dependencies between components. This scenario can be visualized as below. @@ -136,12 +136,11 @@ It is also possible to follow a segregated approach for one or more components that are expected to be a bottleneck and use a collapsed approach for the remainder. - Proxy server ~~~~~~~~~~~~ Almost all services in this stack benefit from being proxied. -Using a proxy server provides: +Using a proxy server provides the following capabilities: #. Load distribution @@ -152,8 +151,8 @@ Using a proxy server provides: #. API isolation - By sending all API access through the proxy, we can clearly - identify service interdependencies. We can also move them to + By sending all API access through the proxy, you can clearly + identify service interdependencies. You can also move them to locations other than ``localhost`` to increase capacity if the need arises. @@ -169,7 +168,7 @@ Using a proxy server provides: The proxy can be configured as a secondary mechanism for detecting service failures. It can even be configured to look for nodes in - a degraded state (such as being 'too far' behind in the + a degraded state (such as being too far behind in the replication) and take them out of circulation. 
The following components are currently unable to benefit from the use
@@ -179,20 +178,13 @@ of a proxy server:

 * RabbitMQ
 * Memcached
 * MongoDB

-However, the reasons vary and are discussed under each component's
-heading.
-
-We recommend HAProxy as the load balancer, however, there are many
-alternatives in the marketplace.
-
-We use a check interval of 1 second, however, the timeouts vary by service.
+We recommend HAProxy as the load balancer, however, there are many alternative
+load balancing solutions in the marketplace.

 Generally, we use round-robin to distribute load amongst instances of
-active/active services, however, Galera uses the ``stick-table`` options
-to ensure that incoming connections to the virtual IP (VIP) should be
-directed to only one of the available back ends.
-
-In Galera's case, although it can run active/active, this helps avoid
-lock contention and prevent deadlocks. It is used in combination with
-the ``httpchk`` option that ensures only nodes that are in sync with its
+active/active services. However, Galera uses the ``stick-table`` options
+to ensure that incoming connections to the virtual IP (VIP) are directed to only one
+of the available back ends. This helps avoid lock contention and prevent
+deadlocks, although Galera can run active/active. Used in combination with
+the ``httpchk`` option, this ensures only nodes that are in sync with their
 peers are allowed to handle requests.
diff --git a/doc/ha-guide/source/intro-ha-concepts.rst b/doc/ha-guide/source/intro-ha-concepts.rst
index ebca7a1310..5e30b2b7c8 100644
--- a/doc/ha-guide/source/intro-ha-concepts.rst
+++ b/doc/ha-guide/source/intro-ha-concepts.rst
@@ -2,20 +2,18 @@
 High availability concepts
 ==========================

-High availability systems seek to minimize two things:
+High availability systems seek to minimize the following issues:

-**System downtime**
-  Occurs when a user-facing service is unavailable
-  beyond a specified maximum amount of time.
+#. System downtime: Occurs when a user-facing service is unavailable
+   beyond a specified maximum amount of time.

-**Data loss**
-  Accidental deletion or destruction of data.
+#. Data loss: Accidental deletion or destruction of data.

 Most high availability systems guarantee protection against system downtime
 and data loss only in the event of a single failure.
 However, they are also expected to protect against cascading failures,
 where a single failure deteriorates into a series of consequential failures.

-Many service providers guarantee :term:`Service Level Agreement (SLA)`
+Many service providers guarantee a :term:`Service Level Agreement (SLA)`
 including uptime percentage of computing service, which is calculated based
 on the available time and system downtime excluding planned outage time.
@@ -65,19 +63,16 @@ guarantee 99.99% availability for individual guest instances.
 This document discusses some common methods of implementing highly available
 systems, with an emphasis on the core OpenStack services and other open source
 services that are closely aligned with OpenStack.
-These methods are by no means the only ways to do it;
-you may supplement these services with commercial hardware and software
-that provides additional features and functionality.
-You also need to address high availability concerns
-for any applications software that you run on your OpenStack environment.
-The important thing is to make sure that your services are redundant
-and available; how you achieve that is up to you.
-Stateless vs.
stateful services -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +You will need to address high availability concerns for any applications +software that you run on your OpenStack environment. The important thing is +to make sure that your services are redundant and available. +How you achieve that is up to you. -Preventing single points of failure can depend on whether or not a -service is stateless. +Stateless versus stateful services +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The following are the definitions of stateless and stateful services: Stateless service A service that provides a response after your request @@ -86,13 +81,13 @@ Stateless service you need to provide redundant instances and load balance them. OpenStack services that are stateless include ``nova-api``, ``nova-conductor``, ``glance-api``, ``keystone-api``, - ``neutron-api`` and ``nova-scheduler``. + ``neutron-api``, and ``nova-scheduler``. Stateful service A service where subsequent requests to the service depend on the results of the first request. Stateful services are more difficult to manage because a single - action typically involves more than one request, so simply providing + action typically involves more than one request. Providing additional instances and load balancing does not solve the problem. For example, if the horizon user interface reset itself every time you went to a new page, it would not be very useful. @@ -101,10 +96,11 @@ Stateful service Making stateful services highly available can depend on whether you choose an active/passive or active/active configuration. -Active/Passive vs. Active/Active -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Active/passive versus active/active +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Stateful services may be configured as active/passive or active/active: +Stateful services can be configured as active/passive or active/active, +which are defined as follows: :term:`active/passive configuration` Maintains a redundant instance @@ -148,7 +144,7 @@ in order for the cluster to remain functional. When one node fails and failover transfers control to other nodes, the system must ensure that data and processes remain sane. To determine this, the contents of the remaining nodes are compared -and, if there are discrepancies, a "majority rules" algorithm is implemented. +and, if there are discrepancies, a majority rules algorithm is implemented. For this reason, each cluster in a high availability environment should have an odd number of nodes and the quorum is defined as more than a half @@ -157,7 +153,7 @@ If multiple nodes fail so that the cluster size falls below the quorum value, the cluster itself fails. For example, in a seven-node cluster, the quorum should be set to -floor(7/2) + 1 == 4. If quorum is four and four nodes fail simultaneously, +``floor(7/2) + 1 == 4``. If quorum is four and four nodes fail simultaneously, the cluster itself would fail, whereas it would continue to function, if no more than three nodes fail. If split to partitions of three and four nodes respectively, the quorum of four nodes would continue to operate the majority @@ -169,25 +165,23 @@ example. .. note:: - Note that setting the quorum to a value less than floor(n/2) + 1 is not - recommended and would likely cause a split-brain in a face of network - partitions. + We do not recommend setting the quorum to a value less than ``floor(n/2) + 1`` + as it would likely cause a split-brain in a face of network partitions. 
-Then, for the given example when four nodes fail simultaneously, -the cluster would continue to function as well. But if split to partitions of -three and four nodes respectively, the quorum of three would have made both -sides to attempt to fence the other and host resources. And without fencing -enabled, it would go straight to running two copies of each resource. +When four nodes fail simultaneously, the cluster would continue to function as +well. But if split to partitions of three and four nodes respectively, the +quorum of three would have made both sides to attempt to fence the other and +host resources. Without fencing enabled, it would go straight to running +two copies of each resource. -This is why setting the quorum to a value less than floor(n/2) + 1 is -dangerous. However it may be required for some specific cases, like a +This is why setting the quorum to a value less than ``floor(n/2) + 1`` is +dangerous. However it may be required for some specific cases, such as a temporary measure at a point it is known with 100% certainty that the other nodes are down. When configuring an OpenStack environment for study or demonstration purposes, -it is possible to turn off the quorum checking; -this is discussed later in this guide. -Production systems should always run with quorum enabled. +it is possible to turn off the quorum checking. Production systems should +always run with quorum enabled. Single-controller high availability mode @@ -203,11 +197,12 @@ but is not appropriate for a production environment. It is possible to add controllers to such an environment to convert it into a truly highly available environment. - High availability is not for every user. It presents some challenges. High availability may be too complex for databases or systems with large amounts of data. Replication can slow large systems down. Different setups have different prerequisites. Read the guidelines for each setup. -High availability is turned off as the default in OpenStack setups. +.. important:: + + High availability is turned off as the default in OpenStack setups. diff --git a/doc/ha-guide/source/intro-ha-controller.rst b/doc/ha-guide/source/intro-ha-controller.rst index 2c6968b3a1..f9f95bb33b 100644 --- a/doc/ha-guide/source/intro-ha-controller.rst +++ b/doc/ha-guide/source/intro-ha-controller.rst @@ -3,17 +3,17 @@ Overview of highly available controllers ======================================== OpenStack is a set of multiple services exposed to the end users -as HTTP(s) APIs. Additionally, for own internal usage OpenStack -requires SQL database server and AMQP broker. The physical servers, -where all the components are running are often called controllers. -This modular OpenStack architecture allows to duplicate all the +as HTTP(s) APIs. Additionally, for your own internal usage, OpenStack +requires an SQL database server and AMQP broker. The physical servers, +where all the components are running, are called controllers. +This modular OpenStack architecture allows you to duplicate all the components and run them on different controllers. By making all the components redundant it is possible to make OpenStack highly available. In general we can divide all the OpenStack components into three categories: -- OpenStack APIs, these are HTTP(s) stateless services written in python, +- OpenStack APIs: These are HTTP(s) stateless services written in python, easy to duplicate and mostly easy to load balance. 
- SQL relational database server provides stateful type consumed by other @@ -42,17 +42,16 @@ Networking for high availability. Common deployment architectures ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -There are primarily two recommended architectures for making OpenStack -highly available. - -Both use a cluster manager such as Pacemaker or Veritas to -orchestrate the actions of the various services across a set of -machines. Since we are focused on FOSS, we will refer to these as -Pacemaker architectures. +We recommend two primary architectures for making OpenStack highly available. The architectures differ in the sets of services managed by the cluster. +Both use a cluster manager, such as Pacemaker or Veritas, to +orchestrate the actions of the various services across a set of +machines. Because we are focused on FOSS, we refer to these as +Pacemaker architectures. + Traditionally, Pacemaker has been positioned as an all-encompassing solution. However, as OpenStack services have matured, they are increasingly able to run in an active/active configuration and @@ -61,7 +60,7 @@ depend. With this in mind, some vendors are restricting Pacemaker's use to services that must operate in an active/passive mode (such as -cinder-volume), those with multiple states (for example, Galera) and +``cinder-volume``), those with multiple states (for example, Galera), and those with complex bootstrapping procedures (such as RabbitMQ). The majority of services, needing no real orchestration, are handled diff --git a/doc/ha-guide/source/intro-ha-other.rst b/doc/ha-guide/source/intro-ha-other.rst index e623ab3879..020776c5b6 100644 --- a/doc/ha-guide/source/intro-ha-other.rst +++ b/doc/ha-guide/source/intro-ha-other.rst @@ -1,4 +1,3 @@ - ====================================== High availability for other components ====================================== diff --git a/doc/ha-guide/source/intro-ha.rst b/doc/ha-guide/source/intro-ha.rst index dc4a5bdd92..e8bd101398 100644 --- a/doc/ha-guide/source/intro-ha.rst +++ b/doc/ha-guide/source/intro-ha.rst @@ -1,9 +1,7 @@ - =========================================== Introduction to OpenStack high availability =========================================== - .. toctree:: :maxdepth: 2 diff --git a/doc/ha-guide/source/networking-ha-dhcp.rst b/doc/ha-guide/source/networking-ha-dhcp.rst index fd306b0ba9..e28e7740ea 100644 --- a/doc/ha-guide/source/networking-ha-dhcp.rst +++ b/doc/ha-guide/source/networking-ha-dhcp.rst @@ -2,12 +2,10 @@ Run Networking DHCP agent ========================= -The OpenStack Networking service has a scheduler -that lets you run multiple agents across nodes; -the DHCP agent can be natively highly available. -To configure the number of DHCP agents per network, -modify the ``dhcp_agents_per_network`` parameter -in the :file:`/etc/neutron/neutron.conf` file. -By default this is set to 1. -To achieve high availability, -assign more than one DHCP agent per network. +The OpenStack Networking (neutron) service has a scheduler that lets you run +multiple agents across nodes. The DHCP agent can be natively highly available. + +To configure the number of DHCP agents per network, modify the +``dhcp_agents_per_network`` parameter in the :file:`/etc/neutron/neutron.conf` +file. By default this is set to 1. To achieve high availability, assign more +than one DHCP agent per network. 
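For example, a minimal sketch of this setting in the :file:`/etc/neutron/neutron.conf`
file (the value ``2`` here is illustrative; choose a value no larger than the
number of nodes that run DHCP agents in your environment):

.. code-block:: ini

   [DEFAULT]
   # Schedule two DHCP agents for every network so that DHCP service
   # survives the failure of a single agent or network node.
   dhcp_agents_per_network = 2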
diff --git a/doc/ha-guide/source/networking-ha-l3.rst b/doc/ha-guide/source/networking-ha-l3.rst index 495efbda96..dfc83fc405 100644 --- a/doc/ha-guide/source/networking-ha-l3.rst +++ b/doc/ha-guide/source/networking-ha-l3.rst @@ -2,12 +2,12 @@ Run Networking L3 agent ======================= -The neutron L3 agent is scalable, due to the scheduler that supports -Virtual Router Redundancy Protocol (VRRP) -to distribute virtual routers across multiple nodes. -To enable high availability for configured routers, -edit the :file:`/etc/neutron/neutron.conf` file -to set the following values: +The Networking (neutron) service L3 agent is scalable, due to the scheduler +that supports Virtual Router Redundancy Protocol (VRRP) to distribute virtual +routers across multiple nodes. + +To enable high availability for configured routers, edit the +:file:`/etc/neutron/neutron.conf` file to set the following values: .. list-table:: /etc/neutron/neutron.conf parameters for high availability :widths: 15 10 30 diff --git a/doc/ha-guide/source/networking-ha-lbaas.rst b/doc/ha-guide/source/networking-ha-lbaas.rst index 311429adf0..1d9cb6de01 100644 --- a/doc/ha-guide/source/networking-ha-lbaas.rst +++ b/doc/ha-guide/source/networking-ha-lbaas.rst @@ -2,12 +2,10 @@ Run Networking LBaaS agent ========================== -Currently, no native feature is provided -to make the LBaaS agent highly available -using the default plug-in HAProxy. -A common way to make HAProxy highly available -is to use the VRRP (Virtual Router Redundancy Protocol). -Unfortunately, this is not yet implemented -in the LBaaS HAProxy plug-in. +Currently, no native feature is provided to make the LBaaS agent highly +available using the default plug-in HAProxy. A common way to make HAProxy +highly available is to use the VRRP (Virtual Router Redundancy Protocol). + +Unfortunately, this is not yet implemented in the LBaaS HAProxy plug-in. [TODO: update this section.] diff --git a/doc/ha-guide/source/networking-ha-metadata.rst b/doc/ha-guide/source/networking-ha-metadata.rst index 78a696f31b..107a5bffc1 100644 --- a/doc/ha-guide/source/networking-ha-metadata.rst +++ b/doc/ha-guide/source/networking-ha-metadata.rst @@ -2,11 +2,9 @@ Run Networking metadata agent ============================= -No native feature is available -to make this service highly available. -At this time, the Active/Passive solution exists -to run the neutron metadata agent -in failover mode with Pacemaker. +Currently, no native feature is available to make this service highly +available. At this time, the active/passive solution exists to run the +neutron metadata agent in failover mode with Pacemaker. [TODO: Update this information. Can this service now be made HA in active/active mode diff --git a/doc/ha-guide/source/networking-ha.rst b/doc/ha-guide/source/networking-ha.rst index d8853ffeec..47e5d9622f 100644 --- a/doc/ha-guide/source/networking-ha.rst +++ b/doc/ha-guide/source/networking-ha.rst @@ -2,10 +2,10 @@ Networking services for high availability ========================================= -Configure networking on each node. See basic information +Configure networking on each node. See the basic information about configuring networking in the *Networking service* section of the -`Install Tutorials and Guides `_ +`Install Tutorials and Guides `_, depending on your distribution. 
Notes from planning outline: diff --git a/doc/ha-guide/source/shared-database-configure.rst b/doc/ha-guide/source/shared-database-configure.rst index 4057aa9f80..96ced6712d 100644 --- a/doc/ha-guide/source/shared-database-configure.rst +++ b/doc/ha-guide/source/shared-database-configure.rst @@ -12,26 +12,27 @@ Certain services running on the underlying operating system of your OpenStack database may block Galera Cluster from normal operation or prevent ``mysqld`` from achieving network connectivity with the cluster. - Firewall --------- -Galera Cluster requires that you open four ports to network traffic: +Galera Cluster requires that you open the following ports to network traffic: - On ``3306``, Galera Cluster uses TCP for database client connections and State Snapshot Transfers methods that require the client, (that is, ``mysqldump``). -- On ``4567`` Galera Cluster uses TCP for replication traffic. Multicast +- On ``4567``, Galera Cluster uses TCP for replication traffic. Multicast replication uses both TCP and UDP on this port. -- On ``4568`` Galera Cluster uses TCP for Incremental State Transfers. -- On ``4444`` Galera Cluster uses TCP for all other State Snapshot Transfer +- On ``4568``, Galera Cluster uses TCP for Incremental State Transfers. +- On ``4444``, Galera Cluster uses TCP for all other State Snapshot Transfer methods. -.. seealso:: For more information on firewalls, see `Firewalls and default ports - `_ in the Configuration Reference. +.. seealso:: -This can be achieved through the use of either the ``iptables`` -command such as: + For more information on firewalls, see `Firewalls and default ports + `_ + in the Configuration Reference. + +This can be achieved using the :command:`iptables` command: .. code-block:: console @@ -39,15 +40,14 @@ command such as: --protocol tcp --match tcp --dport ${PORT} \ --source ${NODE-IP-ADDRESS} --jump ACCEPT -Make sure to save the changes once you are done, this will vary +Make sure to save the changes once you are done. This will vary depending on your distribution: -- `Ubuntu `_ -- `Fedora `_ +- For `Ubuntu `_ +- For `Fedora `_ -Alternatively you may be able to make modifications using the -``firewall-cmd`` utility for FirewallD that is available on many Linux -distributions: +Alternatively, make modifications using the ``firewall-cmd`` utility for +FirewallD that is available on many Linux distributions: .. code-block:: console @@ -60,11 +60,11 @@ SELinux Security-Enhanced Linux is a kernel module for improving security on Linux operating systems. It is commonly enabled and configured by default on Red Hat-based distributions. In the context of Galera Cluster, systems with -SELinux may block the database service, keep it from starting or prevent it +SELinux may block the database service, keep it from starting, or prevent it from establishing network connections with the cluster. To configure SELinux to permit Galera Cluster to operate, you may need -to use the ``semanage`` utility to open the ports it uses, for +to use the ``semanage`` utility to open the ports it uses. For example: .. code-block:: console @@ -79,14 +79,16 @@ relaxed about database access and actions: # semanage permissive -a mysqld_t -.. note:: Bear in mind, leaving SELinux in permissive mode is not a good - security practice. Over the longer term, you need to develop a - security policy for Galera Cluster and then switch SELinux back - into enforcing mode. +.. 
note:: - For more information on configuring SELinux to work with - Galera Cluster, see the `Documentation - `_ + Bear in mind, leaving SELinux in permissive mode is not a good + security practice. Over the longer term, you need to develop a + security policy for Galera Cluster and then switch SELinux back + into enforcing mode. + + For more information on configuring SELinux to work with + Galera Cluster, see the `SELinux Documentation + `_ AppArmor --------- @@ -111,7 +113,7 @@ following steps on each cluster node: # service apparmor restart - For servers that use ``systemd``, instead run this command: + For servers that use ``systemd``, run the following command: .. code-block:: console @@ -119,7 +121,6 @@ following steps on each cluster node: AppArmor now permits Galera Cluster to operate. - Database configuration ~~~~~~~~~~~~~~~~~~~~~~~ @@ -152,21 +153,20 @@ additions. wsrep_sst_method=rsync - Configuring mysqld ------------------- While all of the configuration parameters available to the standard MySQL, -MariaDB or Percona XtraDB database server are available in Galera Cluster, +MariaDB, or Percona XtraDB database servers are available in Galera Cluster, there are some that you must define an outset to avoid conflict or unexpected behavior. -- Ensure that the database server is not bound only to to the localhost, - ``127.0.0.1``. Also, do not bind it to ``0.0.0.0``. It makes ``mySQL`` - bind to all IP addresses on the machine including the virtual IP address, - which will cause ``HAProxy`` not to start. Instead, bind it to the - management IP address of the controller node to enable access by other - nodes through the management network: +- Ensure that the database server is not bound only to the localhost: + ``127.0.0.1``. Also, do not bind it to ``0.0.0.0``. Binding to the localhost + or ``0.0.0.0`` makes ``mySQL`` bind to all IP addresses on the machine, + including the virtual IP address causing ``HAProxy`` not to start. Instead, + bind to the management IP address of the controller node to enable access by + other nodes through the management network: .. code-block:: ini @@ -194,7 +194,7 @@ parameters that you must define to avoid conflicts. default_storage_engine=InnoDB - Ensure that the InnoDB locking mode for generating auto-increment values - is set to ``2``, which is the interleaved locking mode. + is set to ``2``, which is the interleaved locking mode: .. code-block:: ini @@ -211,8 +211,8 @@ parameters that you must define to avoid conflicts. innodb_flush_log_at_trx_commit=0 - Bear in mind, while setting this parameter to ``1`` or ``2`` can improve - performance, it introduces certain dangers. Operating system failures can + Setting this parameter to ``1`` or ``2`` can improve + performance, but it introduces certain dangers. Operating system failures can erase the last second of transactions. While you can recover this data from another node, if the cluster goes down at the same time (in the event of a data center power outage), you lose this data permanently. @@ -230,19 +230,19 @@ Configuring wsrep replication ------------------------------ Galera Cluster configuration parameters all have the ``wsrep_`` prefix. -There are five that you must define for each cluster node in your +You must define the following parameters for each cluster node in your OpenStack database. -- **wsrep Provider** The Galera Replication Plugin serves as the wsrep - Provider for Galera Cluster. It is installed on your system as the - ``libgalera_smm.so`` file. 
You must define the path to this file in - your ``my.cnf``. +- **wsrep Provider**: The Galera Replication Plugin serves as the ``wsrep`` + provider for Galera Cluster. It is installed on your system as the + ``libgalera_smm.so`` file. Define the path to this file in + your ``my.cnf``: .. code-block:: ini wsrep_provider="/usr/lib/libgalera_smm.so" -- **Cluster Name** Define an arbitrary name for your cluster. +- **Cluster Name**: Define an arbitrary name for your cluster. .. code-block:: ini @@ -251,7 +251,7 @@ OpenStack database. You must use the same name on every cluster node. The connection fails when this value does not match. -- **Cluster Address** List the IP addresses for each cluster node. +- **Cluster Address**: List the IP addresses for each cluster node. .. code-block:: ini @@ -260,21 +260,18 @@ OpenStack database. Replace the IP addresses given here with comma-separated list of each OpenStack database in your cluster. -- **Node Name** Define the logical name of the cluster node. +- **Node Name**: Define the logical name of the cluster node. .. code-block:: ini wsrep_node_name="Galera1" -- **Node Address** Define the IP address of the cluster node. +- **Node Address**: Define the IP address of the cluster node. .. code-block:: ini wsrep_node_address="192.168.1.1" - - - Additional parameters ^^^^^^^^^^^^^^^^^^^^^^ @@ -299,6 +296,6 @@ For a complete list of the available parameters, run the | wsrep_sync_wait | 0 | +------------------------------+-------+ -For the documentation of these parameters, wsrep Provider option and status -variables available in Galera Cluster, see `Reference +For documentation about these parameters, ``wsrep`` provider option, and status +variables available in Galera Cluster, see the Galera cluster `Reference `_. diff --git a/doc/ha-guide/source/shared-database-manage.rst b/doc/ha-guide/source/shared-database-manage.rst index e4f0464bd1..a7134d92cd 100644 --- a/doc/ha-guide/source/shared-database-manage.rst +++ b/doc/ha-guide/source/shared-database-manage.rst @@ -2,35 +2,31 @@ Management ========== -When you finish the installation and configuration process on each -cluster node in your OpenStack database, you can initialize Galera Cluster. +When you finish installing and configuring the OpenStack database, +you can initialize the Galera Cluster. -Before you attempt this, verify that you have the following ready: +Prerequisites +~~~~~~~~~~~~~ -- Database hosts with Galera Cluster installed. You need a - minimum of three hosts; -- No firewalls between the hosts; -- SELinux and AppArmor set to permit access to ``mysqld``; +- Database hosts with Galera Cluster installed +- A minimum of three hosts +- No firewalls between the hosts +- SELinux and AppArmor set to permit access to ``mysqld`` - The correct path to ``libgalera_smm.so`` given to the - ``wsrep_provider`` parameter. + ``wsrep_provider`` parameter Initializing the cluster ~~~~~~~~~~~~~~~~~~~~~~~~~ -In Galera Cluster, the Primary Component is the cluster of database +In the Galera Cluster, the Primary Component is the cluster of database servers that replicate into each other. In the event that a cluster node loses connectivity with the Primary Component, it defaults into a non-operational state, to avoid creating or serving inconsistent data. -By default, cluster nodes do not start as part of a Primary -Component. Instead they assume that one exists somewhere and -attempts to establish a connection with it. 
To create a Primary -Component, you must start one cluster node using the -``--wsrep-new-cluster`` option. You can do this using any cluster -node, it is not important which you choose. In the Primary -Component, replication and state transfers bring all databases to -the same state. +By default, cluster nodes do not start as part of a Primary Component. +In the Primary Component, replication and state transfers bring all databases +to the same state. To start the cluster, complete the following steps: @@ -41,7 +37,7 @@ To start the cluster, complete the following steps: # service mysql start --wsrep-new-cluster - For servers that use ``systemd``, instead run this command: + For servers that use ``systemd``, run the following command: .. code-block:: console @@ -68,15 +64,15 @@ To start the cluster, complete the following steps: # service mysql start - For servers that use ``systemd``, instead run this command: + For servers that use ``systemd``, run the following command: .. code-block:: console # systemctl start mariadb #. When you have all cluster nodes started, log into the database - client on one of them and check the ``wsrep_cluster_size`` - status variable again. + client of any cluster node and check the ``wsrep_cluster_size`` + status variable again: .. code-block:: mysql @@ -89,32 +85,33 @@ To start the cluster, complete the following steps: +--------------------+-------+ When each cluster node starts, it checks the IP addresses given to -the ``wsrep_cluster_address`` parameter and attempts to establish +the ``wsrep_cluster_address`` parameter. It then attempts to establish network connectivity with a database server running there. Once it establishes a connection, it attempts to join the Primary Component, requesting a state transfer as needed to bring itself into sync with the cluster. -In the event that you need to restart any cluster node, you can do -so. When the database server comes back it, it establishes -connectivity with the Primary Component and updates itself to any -changes it may have missed while down. +.. note:: + In the event that you need to restart any cluster node, you can do + so. When the database server comes back it, it establishes + connectivity with the Primary Component and updates itself to any + changes it may have missed while down. Restarting the cluster ----------------------- Individual cluster nodes can stop and be restarted without issue. -When a database loses its connection or restarts, Galera Cluster +When a database loses its connection or restarts, the Galera Cluster brings it back into sync once it reestablishes connection with the Primary Component. In the event that you need to restart the entire cluster, identify the most advanced cluster node and initialize the Primary Component on that node. To find the most advanced cluster node, you need to check the -sequence numbers, or seqnos, on the last committed transaction for +sequence numbers, or the ``seqnos``, on the last committed transaction for each. You can find this by viewing ``grastate.dat`` file in -database directory, +database directory: .. code-block:: console @@ -139,26 +136,24 @@ Alternatively, if the database server is running, use the +----------------------+--------+ This value increments with each transaction, so the most advanced -node has the highest sequence number, and therefore is the most up to date. - +node has the highest sequence number and therefore is the most up to date. 
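For example, a quick way to compare the sequence numbers across the cluster is
to read ``grastate.dat`` on every node. This is a minimal sketch that assumes
the default data directory ``/var/lib/mysql`` and the ``controller1`` to
``controller3`` host names used elsewhere in this guide; the numbers shown are
illustrative output only:

.. code-block:: console

   # for node in controller1 controller2 controller3; do
   >     echo -n "$node: "; ssh $node grep seqno /var/lib/mysql/grastate.dat
   > done
   controller1: seqno:   620
   controller2: seqno:   622
   controller3: seqno:   622

In this illustrative output, ``controller2`` and ``controller3`` hold the
highest sequence number, so either of them is a suitable node on which to
initialize the Primary Component when restarting the whole cluster.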
Configuration tips ~~~~~~~~~~~~~~~~~~~ - Deployment strategies ---------------------- Galera can be configured using one of the following strategies: -- Each instance has its own IP address; +- Each instance has its own IP address: OpenStack services are configured with the list of these IP addresses so they can select one of the addresses from those available. -- Galera runs behind HAProxy. +- Galera runs behind HAProxy: HAProxy load balances incoming requests and exposes just one IP address for all the clients. @@ -166,32 +161,25 @@ strategies: Galera synchronous replication guarantees a zero slave lag. The failover procedure completes once HAProxy detects that the active back end has gone down and switches to the backup one, which is - then marked as 'UP'. If no back ends are up (in other words, the - Galera cluster is not ready to accept connections), the failover - procedure finishes only when the Galera cluster has been + then marked as ``UP``. If no back ends are ``UP``, the failover + procedure finishes only when the Galera Cluster has been successfully reassembled. The SLA is normally no more than 5 minutes. - Use MySQL/Galera in active/passive mode to avoid deadlocks on ``SELECT ... FOR UPDATE`` type queries (used, for example, by nova - and neutron). This issue is discussed more in the following: + and neutron). This issue is discussed in the following: - `IMPORTANT: MySQL Galera does *not* support SELECT ... FOR UPDATE `_ - `Understanding reservations, concurrency, and locking in Nova `_ -Of these options, the second one is highly recommended. Although Galera -supports active/active configurations, we recommend active/passive -(enforced by the load balancer) in order to avoid lock contention. - - - Configuring HAProxy -------------------- -If you use HAProxy for load-balancing client access to Galera -Cluster as described in the :doc:`controller-ha-haproxy`, you can +If you use HAProxy as a load-balancing client to provide access to the +Galera Cluster, as described in the :doc:`controller-ha-haproxy`, you can use the ``clustercheck`` utility to improve health checks. #. Create a configuration file for ``clustercheck`` at @@ -205,7 +193,7 @@ use the ``clustercheck`` utility to improve health checks. MYSQL_PORT="3306" #. Log in to the database client and grant the ``clustercheck`` user - ``PROCESS`` privileges. + ``PROCESS`` privileges: .. code-block:: mysql @@ -248,12 +236,10 @@ use the ``clustercheck`` utility to improve health checks. # service xinetd enable # service xinetd start - For servers that use ``systemd``, instead run these commands: + For servers that use ``systemd``, run the following commands: .. code-block:: console # systemctl daemon-reload # systemctl enable xinetd # systemctl start xinetd - - diff --git a/doc/ha-guide/source/shared-database.rst b/doc/ha-guide/source/shared-database.rst index a74fb5a421..af48344651 100644 --- a/doc/ha-guide/source/shared-database.rst +++ b/doc/ha-guide/source/shared-database.rst @@ -13,19 +13,18 @@ You can achieve high availability for the OpenStack database in many different ways, depending on the type of database that you want to use. There are three implementations of Galera Cluster available to you: -- `Galera Cluster for MySQL `_ The MySQL - reference implementation from Codership, Oy; -- `MariaDB Galera Cluster `_ The MariaDB +- `Galera Cluster for MySQL `_: The MySQL + reference implementation from Codership, Oy. 
+- `MariaDB Galera Cluster `_: The MariaDB implementation of Galera Cluster, which is commonly supported in - environments based on Red Hat distributions; -- `Percona XtraDB Cluster `_ The XtraDB + environments based on Red Hat distributions. +- `Percona XtraDB Cluster `_: The XtraDB implementation of Galera Cluster from Percona. In addition to Galera Cluster, you can also achieve high availability through other database options, such as PostgreSQL, which has its own replication system. - .. toctree:: :maxdepth: 2 diff --git a/doc/ha-guide/source/shared-messaging.rst b/doc/ha-guide/source/shared-messaging.rst index 9badad2825..8906cd571b 100644 --- a/doc/ha-guide/source/shared-messaging.rst +++ b/doc/ha-guide/source/shared-messaging.rst @@ -9,8 +9,7 @@ execution of jobs entered into the system. The most popular AMQP implementation used in OpenStack installations is RabbitMQ. -RabbitMQ nodes fail over both on the application and the -infrastructure layers. +RabbitMQ nodes fail over on the application and the infrastructure layers. The application layer is controlled by the ``oslo.messaging`` configuration options for multiple AMQP hosts. If the AMQP node fails, @@ -21,7 +20,7 @@ constitutes its SLA. On the infrastructure layer, the SLA is the time for which RabbitMQ cluster reassembles. Several cases are possible. The Mnesia keeper node is the master of the corresponding Pacemaker resource for -RabbitMQ; when it fails, the result is a full AMQP cluster downtime +RabbitMQ. When it fails, the result is a full AMQP cluster downtime interval. Normally, its SLA is no more than several minutes. Failure of another node that is a slave of the corresponding Pacemaker resource for RabbitMQ results in no AMQP cluster downtime at all. @@ -32,43 +31,18 @@ Making the RabbitMQ service highly available involves the following steps: - :ref:`Configure RabbitMQ for HA queues` -- :ref:`Configure OpenStack services to use Rabbit HA queues +- :ref:`Configure OpenStack services to use RabbitMQ HA queues ` .. note:: - Access to RabbitMQ is not normally handled by HAproxy. Instead, + Access to RabbitMQ is not normally handled by HAProxy. Instead, consumers must be supplied with the full list of hosts running RabbitMQ with ``rabbit_hosts`` and turn on the ``rabbit_ha_queues`` - option. - - Jon Eck found the `core issue - `_ - and went into some detail regarding the `history and solution - `_ - on his blog. - - In summary though: - - The source address for the connection from HAProxy back to the - client is the VIP address. However the VIP address is no longer - present on the host. This means that the network (IP) layer - deems the packet unroutable, and informs the transport (TCP) - layer. TCP, however, is a reliable transport. It knows how to - handle transient errors and will retry. And so it does. - - In this case that is a problem though, because: - - TCP generally holds on to hope for a long time. A ballpark - estimate is somewhere on the order of tens of minutes (30 - minutes is commonly referenced). During this time it will keep - probing and trying to deliver the data. - - It is important to note that HAProxy has no idea that any of this is - happening. As far as its process is concerned, it called - ``write()`` with the data and the kernel returned success. The - resolution is already understood and just needs to make its way - through a review. + option. For more information, read the `core issue + `_. + For more detail, read the `history and solution + `_. .. 
_rabbitmq-install:

@@ -93,17 +67,16 @@ you are using:

    * - SLES 12
      - :command:`# zypper addrepo -f obs://Cloud:OpenStack:Kilo/SLE_12 Kilo`

        [Verify fingerprint of imported GPG key; see below]
+       [Verify the fingerprint of the imported GPG key. See below.]

        :command:`# zypper install rabbitmq-server`
-
 .. note::

    For SLES 12, the packages are signed by GPG key 893A90DAD85F9316.
    You should verify the fingerprint of the imported GPG key before using it.

-   ::
+   .. code-block:: none

       Key ID: 893A90DAD85F9316
       Key Name: Cloud:OpenStack OBS Project
@@ -111,8 +84,8 @@ you are using:
       Key Created: Tue Oct 8 13:34:21 2013
       Key Expires: Thu Dec 17 13:34:21 2015

-For more information,
-see the official installation manual for the distribution:
+For more information, see the official installation manual for the
+distribution:

- `Debian and Ubuntu `_
- `RPM based `_
@@ -123,53 +96,45 @@ see the official installation manual for the distribution:

 Configure RabbitMQ for HA queues
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-[TODO: This section should begin with a brief mention
-about what HA queues are and why they are valuable, etc]
+.. [TODO: This section should begin with a brief mention
+.. about what HA queues are and why they are valuable, etc]

-We are building a cluster of RabbitMQ nodes to construct a RabbitMQ broker,
-which is a logical grouping of several Erlang nodes.
+.. [TODO: replace "currently" with specific release names]
+
+.. [TODO: Does this list need to be updated? Perhaps we need a table
+.. that shows each component and the earliest release that allows it
+.. to work with HA queues.]

 The following components/services can work with HA queues:

-[TODO: replace "currently" with specific release names]
-
-[TODO: Does this list need to be updated? Perhaps we need a table
-that shows each component and the earliest release that allows it
-to work with HA queues.]
-
- OpenStack Compute
- OpenStack Block Storage
- OpenStack Networking
- Telemetry

-We have to consider that, while exchanges and bindings
-survive the loss of individual nodes,
-queues and their messages do not
-because a queue and its contents are located on one node.
-If we lose this node, we also lose the queue.
+Consider that, while exchanges and bindings survive the loss of individual
+nodes, queues and their messages do not because a queue and its contents
+are located on one node. If we lose this node, we also lose the queue.

-Mirrored queues in RabbitMQ improve
-the availability of service since it is resilient to failures.
+Mirrored queues in RabbitMQ improve the availability of service since
+it is resilient to failures.

-Production servers should run (at least) three RabbitMQ servers;
-for testing and demonstration purposes,
-it is possible to run only two servers.
-In this section, we configure two nodes,
-called ``rabbit1`` and ``rabbit2``.
-To build a broker, we need to ensure
-that all nodes have the same Erlang cookie file.
+Production servers should run (at least) three RabbitMQ servers. For testing
+and demonstration purposes, it is possible to run only two servers.
+In this section, we configure two nodes, called ``rabbit1`` and ``rabbit2``.
+To build a broker, ensure that all nodes have the same Erlang cookie file.

-[TODO: Should the example instead use a minimum of three nodes?]
+.. [TODO: Should the example instead use a minimum of three nodes?]

-#. To do so, stop RabbitMQ everywhere and copy the cookie
-   from the first node to each of the other node(s):
+#.
Stop RabbitMQ and copy the cookie from the first node to each of the + other node(s): .. code-block:: console # scp /var/lib/rabbitmq/.erlang.cookie root@NODE:/var/lib/rabbitmq/.erlang.cookie #. On each target node, verify the correct owner, - group, and permissions of the file :file:`erlang.cookie`. + group, and permissions of the file :file:`erlang.cookie`: .. code-block:: console @@ -177,9 +142,7 @@ that all nodes have the same Erlang cookie file. # chmod 400 /var/lib/rabbitmq/.erlang.cookie #. Start the message queue service on all nodes and configure it to start - when the system boots. - - On Ubuntu, it is configured by default. + when the system boots. On Ubuntu, it is configured by default. On CentOS, RHEL, openSUSE, and SLES: @@ -216,7 +179,7 @@ that all nodes have the same Erlang cookie file. The default node type is a disc node. In this guide, nodes join the cluster as RAM nodes. -#. To verify the cluster status: +#. Verify the cluster status: .. code-block:: console @@ -225,8 +188,8 @@ that all nodes have the same Erlang cookie file. [{nodes,[{disc,[rabbit@rabbit1]},{ram,[rabbit@NODE]}]}, \ {running_nodes,[rabbit@NODE,rabbit@rabbit1]}] - If the cluster is working, - you can create usernames and passwords for the queues. + If the cluster is working, you can create usernames and passwords + for the queues. #. To ensure that all queues except those with auto-generated names are mirrored across all running nodes, @@ -255,53 +218,50 @@ More information is available in the RabbitMQ documentation: Configure OpenStack services to use Rabbit HA queues ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -We have to configure the OpenStack components -to use at least two RabbitMQ nodes. +Configure the OpenStack components to use at least two RabbitMQ nodes. -Do this configuration on all services using RabbitMQ: +Use these steps to configurate all services using RabbitMQ: -#. RabbitMQ HA cluster host:port pairs: +#. RabbitMQ HA cluster ``host:port`` pairs: - :: + .. code-block:: console rabbit_hosts=rabbit1:5672,rabbit2:5672,rabbit3:5672 -#. How frequently to retry connecting with RabbitMQ: - [TODO: document the unit of measure here? Seconds?] +#. Retry connecting with RabbitMQ: - :: + .. code-block:: console rabbit_retry_interval=1 #. How long to back-off for between retries when connecting to RabbitMQ: - [TODO: document the unit of measure here? Seconds?] - :: + .. code-block:: console rabbit_retry_backoff=2 #. Maximum retries with trying to connect to RabbitMQ (infinite by default): - :: + .. code-block:: console rabbit_max_retries=0 #. Use durable queues in RabbitMQ: - :: + .. code-block:: console rabbit_durable_queues=true -#. Use HA queues in RabbitMQ (x-ha-policy: all): +#. Use HA queues in RabbitMQ (``x-ha-policy: all``): - :: + .. code-block:: console rabbit_ha_queues=true .. note:: If you change the configuration from an old set-up - that did not use HA queues, you should restart the service: + that did not use HA queues, restart the service: .. code-block:: console diff --git a/doc/ha-guide/source/storage-ha-backend.rst b/doc/ha-guide/source/storage-ha-backend.rst index 429f007a17..47ce74bdef 100644 --- a/doc/ha-guide/source/storage-ha-backend.rst +++ b/doc/ha-guide/source/storage-ha-backend.rst @@ -5,29 +5,20 @@ Storage back end ================ -Most of this guide concerns the control plane of high availability: -ensuring that services continue to run even if a component fails. 
-Ensuring that data is not lost
-is the data plane component of high availability;
-this is discussed here.
-
 An OpenStack environment includes multiple data pools for the VMs:

-- Ephemeral storage is allocated for an instance
-  and is deleted when the instance is deleted.
-  The Compute service manages ephemeral storage.
-  By default, Compute stores ephemeral drives as files
-  on local disks on the Compute node
-  but Ceph RBD can instead be used
-  as the storage back end for ephemeral storage.
+- Ephemeral storage is allocated for an instance and is deleted when the
+  instance is deleted. The Compute service manages ephemeral storage and,
+  by default, Compute stores ephemeral drives as files on local disks on the
+  Compute node. As an alternative, you can use Ceph RBD as the storage back
+  end for ephemeral storage.

-- Persistent storage exists outside all instances.
-  Two types of persistent storage are provided:
+- Persistent storage exists outside all instances. Two types of persistent
+  storage are provided:

-  - Block Storage service (cinder)
-    can use LVM or Ceph RBD as the storage back end.
-  - Image service (glance)
-    can use the Object Storage service (swift)
+  - The Block Storage service (cinder) that can use LVM or Ceph RBD as the
+    storage back end.
+  - The Image service (glance) that can use the Object Storage service (swift)
     or Ceph RBD as the storage back end.

 For more information about configuring storage back ends for
@@ -35,45 +26,37 @@ the different storage options, see `Manage volumes
 `_ in the OpenStack Administrator Guide.

-This section discusses ways to protect against
-data loss in your OpenStack environment.
+This section discusses ways to protect against data loss in your OpenStack
+environment.

 RAID drives
 -----------

-Configuring RAID on the hard drives that implement storage
-protects your data against a hard drive failure.
-If, however, the node itself fails, data may be lost.
+Configuring RAID on the hard drives that implement storage protects your data
+against a hard drive failure. If the node itself fails, data may be lost.
 In particular, all volumes stored on an LVM node can be lost.

 Ceph
 ----

-`Ceph RBD `_
-is an innately high availability storage back end.
-It creates a storage cluster with multiple nodes
-that communicate with each other
-to replicate and redistribute data dynamically.
-A Ceph RBD storage cluster provides
-a single shared set of storage nodes
-that can handle all classes of persistent and ephemeral data
--- glance, cinder, and nova --
-that are required for OpenStack instances.
+`Ceph RBD `_ is an innately high availability storage back
+end. It creates a storage cluster with multiple nodes that communicate with
+each other to replicate and redistribute data dynamically.
+A Ceph RBD storage cluster provides a single shared set of storage nodes that
+can handle all classes of persistent and ephemeral data (glance, cinder, and
+nova) that are required for OpenStack instances.

-Ceph RBD provides object replication capabilities
-by storing Block Storage volumes as Ceph RBD objects;
-Ceph RBD ensures that each replica of an object
-is stored on a different node.
-This means that your volumes are protected against
-hard drive and node failures
-or even the failure of the data center itself.
+Ceph RBD provides object replication capabilities by storing Block Storage
+volumes as Ceph RBD objects. Ceph RBD ensures that each replica of an object
+is stored on a different node.
+against hard drive and node failures, or even the failure of the data center
+itself.

-When Ceph RBD is used for ephemeral volumes
-as well as block and image storage, it supports
-`live migration
+When Ceph RBD is used for ephemeral volumes as well as block and image storage,
+it supports `live migration
 `_
-of VMs with ephemeral drives;
-LVM only supports live migration of volume-backed VMs.
+of VMs with ephemeral drives. LVM only supports live migration of
+volume-backed VMs.

 Remote backup facilities
 ------------------------
diff --git a/doc/ha-guide/source/storage-ha-block.rst b/doc/ha-guide/source/storage-ha-block.rst
index 1d8f0f45cd..52566dfe89 100644
--- a/doc/ha-guide/source/storage-ha-block.rst
+++ b/doc/ha-guide/source/storage-ha-block.rst
@@ -2,7 +2,7 @@
 Highly available Block Storage API
 ==================================

-Cinder provides 'block storage as a service' suitable for performance
+Cinder provides Block-Storage-as-a-Service suitable for performance
 sensitive scenarios such as databases, expandable file systems, or
 providing a server with access to raw block level storage.

@@ -10,7 +10,7 @@
 Persistent block storage can survive instance termination and can also
 be moved across instances like any external storage device.
 Cinder also has volume snapshots capability for backing up the volumes.

-Making this Block Storage API service highly available in
+Making the Block Storage API service highly available in
 active/passive mode involves:

@@ -18,60 +18,22 @@ active/passive mode involves:
 - :ref:`ha-blockstorage-services`

 In theory, you can run the Block Storage service as active/active.
-However, because of sufficient concerns, it is recommended running
+However, because of known race conditions, we recommend running
 the volume component as active/passive only.

-Jon Bernard writes:
-
-::
-
-   Requests are first seen by Cinder in the API service, and we have a
-   fundamental problem there - a standard test-and-set race condition
-   exists for many operations where the volume status is first checked
-   for an expected status and then (in a different operation) updated to
-   a pending status. The pending status indicates to other incoming
-   requests that the volume is undergoing a current operation, however it
-   is possible for two simultaneous requests to race here, which
-   undefined results.
-
-   Later, the manager/driver will receive the message and carry out the
-   operation. At this stage there is a question of the synchronization
-   techniques employed by the drivers and what guarantees they make.
-
-   If cinder-volume processes exist as different process, then the
-   'synchronized' decorator from the lockutils package will not be
-   sufficient. In this case the programmer can pass an argument to
-   synchronized() 'external=True'. If external is enabled, then the
-   locking will take place on a file located on the filesystem. By
-   default, this file is placed in Cinder's 'state directory' in
-   /var/lib/cinder so won't be visible to cinder-volume instances running
-   on different machines.
-
-   However, the location for file locking is configurable. So an
-   operator could configure the state directory to reside on shared
-   storage. If the shared storage in use implements unix file locking
-   semantics, then this could provide the requisite synchronization
-   needed for an active/active HA configuration.
-
-   The remaining issue is that not all drivers use the synchronization
-   methods, and even fewer of those use the external file locks. A
-   sub-concern would be whether they use them correctly.
-
 You can read more about these concerns on the `Red Hat Bugzilla
 `_
 and there is a `psuedo roadmap
 `_
 for addressing them upstream.

-
 .. _ha-blockstorage-pacemaker:

 Add Block Storage API resource to Pacemaker
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-On RHEL-based systems, you should create resources for cinder's
-systemd agents and create constraints to enforce startup/shutdown
-ordering:
+On RHEL-based systems, create resources for cinder's systemd agents and create
+constraints to enforce startup/shutdown ordering:

 .. code-block:: console

@@ -115,29 +77,25 @@ and add the following cluster resources:
     keystone_get_token_url="http://10.0.0.11:5000/v2.0/tokens" \
     op monitor interval="30s" timeout="30s"

-This configuration creates ``p_cinder-api``,
-a resource for managing the Block Storage API service.
+This configuration creates ``p_cinder-api``, a resource for managing the
+Block Storage API service.

-The command :command:`crm configure` supports batch input,
-so you may copy and paste the lines above
-into your live pacemaker configuration and then make changes as required.
-For example, you may enter ``edit p_ip_cinder-api``
-from the :command:`crm configure` menu
-and edit the resource to match your preferred virtual IP address.
+The command :command:`crm configure` supports batch input, so you can copy and
+paste the lines above into your live Pacemaker configuration and then make
+changes as required. For example, you may enter ``edit p_ip_cinder-api`` from
+the :command:`crm configure` menu and edit the resource to match your preferred
+virtual IP address.

-Once completed, commit your configuration changes
-by entering :command:`commit` from the :command:`crm configure` menu.
-Pacemaker then starts the Block Storage API service
-and its dependent resources on one of your nodes.
+Once completed, commit your configuration changes by entering :command:`commit`
+from the :command:`crm configure` menu. Pacemaker then starts the Block Storage
+API service and its dependent resources on one of your nodes.

 .. _ha-blockstorage-configure:

 Configure Block Storage API service
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-Edit the ``/etc/cinder/cinder.conf`` file:
-
-On a RHEL-based system, it should look something like:
+Edit the ``/etc/cinder/cinder.conf`` file. For example, on a RHEL-based system:

 .. code-block:: ini
    :linenos:

@@ -211,19 +169,17 @@ database.

 .. _ha-blockstorage-services:

-Configure OpenStack services to use highly available Block Storage API
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Configure OpenStack services to use the highly available Block Storage API
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-Your OpenStack services must now point their
-Block Storage API configuration to the highly available,
-virtual cluster IP address
-rather than a Block Storage API server’s physical IP address
-as you would for a non-HA environment.
+Your OpenStack services must now point their Block Storage API configuration
+to the highly available, virtual cluster IP address rather than a Block Storage
+API server’s physical IP address as you would for a non-HA environment.

-You must create the Block Storage API endpoint with this IP.
+Create the Block Storage API endpoint with this IP.
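+
+For example, a minimal sketch of registering the endpoint against a single
+virtual cluster IP. This assumes the 10.0.0.11 VIP used elsewhere in this
+guide, a ``RegionOne`` region, and the Identity API v3 ``openstack`` client
+syntax from the installation guides; adjust these values for your environment:
+
+.. code-block:: console
+
+   $ openstack endpoint create --region RegionOne \
+     volumev2 public http://10.0.0.11:8776/v2/%\(project_id\)s
+   $ openstack endpoint create --region RegionOne \
+     volumev2 internal http://10.0.0.11:8776/v2/%\(project_id\)s
+   $ openstack endpoint create --region RegionOne \
+     volumev2 admin http://10.0.0.11:8776/v2/%\(project_id\)s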

-If you are using both private and public IP addresses,
-you should create two virtual IPs and define your endpoint like this:
+If you are using both private and public IP addresses, create two virtual IPs
+and define your endpoint. For example:

 .. code-block:: console

diff --git a/doc/ha-guide/source/storage-ha-file-systems.rst b/doc/ha-guide/source/storage-ha-file-systems.rst
index 819a6ed6bb..38c61dad4b 100644
--- a/doc/ha-guide/source/storage-ha-file-systems.rst
+++ b/doc/ha-guide/source/storage-ha-file-systems.rst
@@ -14,41 +14,56 @@ in active/passive mode involves:
 Add Shared File Systems API resource to Pacemaker
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-You must first download the resource agent to your system:
+#. Download the resource agent to your system:

-.. code-block:: console
+   .. code-block:: console

-   # cd /usr/lib/ocf/resource.d/openstack
-   # wget https://git.openstack.org/cgit/openstack/openstack-resource-agents/plain/ocf/manila-api
-   # chmod a+rx *
+      # cd /usr/lib/ocf/resource.d/openstack
+      # wget https://git.openstack.org/cgit/openstack/openstack-resource-agents/plain/ocf/manila-api
+      # chmod a+rx *

-You can now add the Pacemaker configuration for the Shared File Systems
-API resource. Connect to the Pacemaker cluster with the
-:command:`crm configure` command and add the following cluster resources:
+#. Add the Pacemaker configuration for the Shared File Systems
+   API resource. Connect to the Pacemaker cluster with the following
+   command:

-.. code-block:: ini
+   .. code-block:: console

-   primitive p_manila-api ocf:openstack:manila-api \
-     params config="/etc/manila/manila.conf" \
-     os_password="secretsecret" \
-     os_username="admin" \
-     os_tenant_name="admin" \
-     keystone_get_token_url="http://10.0.0.11:5000/v2.0/tokens" \
-     op monitor interval="30s" timeout="30s"
+      # crm configure

-This configuration creates ``p_manila-api``, a resource for managing the
-Shared File Systems API service.
+   .. note::

-The :command:`crm configure` supports batch input, so you may copy and paste
-the lines above into your live Pacemaker configuration and then make changes
-as required. For example, you may enter ``edit p_ip_manila-api`` from the
-:command:`crm configure` menu and edit the resource to match your preferred
-virtual IP address.
+      The :command:`crm configure` command supports batch input. Copy and paste
+      the lines in the next step into your live Pacemaker configuration and then
+      make changes as required.

-Once completed, commit your configuration changes by entering :command:`commit`
-from the :command:`crm configure` menu. Pacemaker then starts the
-Shared File Systems API service and its dependent resources on one of your
-nodes.
+      For example, you may enter ``edit p_ip_manila-api`` from the
+      :command:`crm configure` menu and edit the resource to match your preferred
+      virtual IP address.
+
+#. Add the following cluster resources:
+
+   .. code-block:: ini
+
+      primitive p_manila-api ocf:openstack:manila-api \
+        params config="/etc/manila/manila.conf" \
+        os_password="secretsecret" \
+        os_username="admin" \
+        os_tenant_name="admin" \
+        keystone_get_token_url="http://10.0.0.11:5000/v2.0/tokens" \
+        op monitor interval="30s" timeout="30s"
+
+   This configuration creates ``p_manila-api``, a resource for managing the
+   Shared File Systems API service.
+
+#. Commit your configuration changes by entering the following command
+   from the :command:`crm configure` menu:
+
+   .. code-block:: console
+
+      # commit
+
+Pacemaker now starts the Shared File Systems API service and its
+dependent resources on one of your nodes.

 .. _ha-sharedfilesystems-configure:

diff --git a/doc/ha-guide/source/storage-ha-image.rst b/doc/ha-guide/source/storage-ha-image.rst
index 33e6da2d28..06147fe463 100644
--- a/doc/ha-guide/source/storage-ha-image.rst
+++ b/doc/ha-guide/source/storage-ha-image.rst
@@ -2,19 +2,21 @@
 Highly available Image API
 ==========================

-The OpenStack Image service offers a service for discovering,
-registering, and retrieving virtual machine images.
-To make the OpenStack Image API service highly available
-in active / passive mode, you must:
+The OpenStack Image service offers a service for discovering, registering, and
+retrieving virtual machine images. To make the OpenStack Image API service
+highly available in active/passive mode, you must:

 - :ref:`glance-api-pacemaker`
 - :ref:`glance-api-configure`
 - :ref:`glance-services`

-This section assumes that you are familiar with the
+Prerequisites
+~~~~~~~~~~~~~
+
+Before beginning, ensure that you are familiar with the
 documentation for installing the OpenStack Image API service.
 See the *Image service* section in the
-`Installation Tutorials and Guides `_
+`Installation Tutorials and Guides `_,
 depending on your distribution.

 .. _glance-api-pacemaker:

@@ -22,44 +24,54 @@ depending on your distribution.
 Add OpenStack Image API resource to Pacemaker
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-You must first download the resource agent to your system:
+#. Download the resource agent to your system:

-.. code-block:: console
+   .. code-block:: console

-   # cd /usr/lib/ocf/resource.d/openstack
-   # wget https://git.openstack.org/cgit/openstack/openstack-resource-agents/plain/ocf/glance-api
-   # chmod a+rx *
+      # cd /usr/lib/ocf/resource.d/openstack
+      # wget https://git.openstack.org/cgit/openstack/openstack-resource-agents/plain/ocf/glance-api
+      # chmod a+rx *

-You can now add the Pacemaker configuration
-for the OpenStack Image API resource.
-Use the :command:`crm configure` command
-to connect to the Pacemaker cluster
-and add the following cluster resources:
+#. Add the Pacemaker configuration for the OpenStack Image API resource.
+   Use the following command to connect to the Pacemaker cluster:

-::
+   .. code-block:: console

-   primitive p_glance-api ocf:openstack:glance-api \
-     params config="/etc/glance/glance-api.conf" \
-     os_password="secretsecret" \
-     os_username="admin" os_tenant_name="admin" \
-     os_auth_url="http://10.0.0.11:5000/v2.0/" \
-     op monitor interval="30s" timeout="30s"
+      # crm configure

-This configuration creates ``p_glance-api``,
-a resource for managing the OpenStack Image API service.
+   .. note::

-The :command:`crm configure` command supports batch input,
-so you may copy and paste the above into your live Pacemaker configuration
-and then make changes as required.
-For example, you may enter edit ``p_ip_glance-api``
-from the :command:`crm configure` menu
-and edit the resource to match your preferred virtual IP address.
+      The :command:`crm configure` command supports batch input. Copy and paste
+      the lines in the next step into your live Pacemaker configuration and
+      then make changes as required.

-After completing these steps,
-commit your configuration changes by entering :command:`commit`
-from the :command:`crm configure` menu.
-Pacemaker then starts the OpenStack Image API service
-and its dependent resources on one of your nodes.
+      For example, you may enter ``edit p_ip_glance-api`` from the
+      :command:`crm configure` menu and edit the resource to match your
+      preferred virtual IP address.
+
+#. Add the following cluster resources:
+
+   .. code-block:: console
+
+      primitive p_glance-api ocf:openstack:glance-api \
+        params config="/etc/glance/glance-api.conf" \
+        os_password="secretsecret" \
+        os_username="admin" os_tenant_name="admin" \
+        os_auth_url="http://10.0.0.11:5000/v2.0/" \
+        op monitor interval="30s" timeout="30s"
+
+   This configuration creates ``p_glance-api``, a resource for managing the
+   OpenStack Image API service.
+
+#. Commit your configuration changes by entering the following command from
+   the :command:`crm configure` menu:
+
+   .. code-block:: console
+
+      # commit
+
+Pacemaker then starts the OpenStack Image API service and its dependent
+resources on one of your nodes.

 .. _glance-api-configure:

@@ -67,7 +79,7 @@
 Configure OpenStack Image service API
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 Edit the :file:`/etc/glance/glance-api.conf` file
-to configure the OpenStack image service:
+to configure the OpenStack Image service:

 .. code-block:: ini

@@ -93,20 +105,17 @@ to configure the OpenStack image service:

 .. _glance-services:

-Configure OpenStack services to use highly available OpenStack Image API
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Configure OpenStack services to use the highly available OpenStack Image API
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-Your OpenStack services must now point
-their OpenStack Image API configuration to the highly available,
-virtual cluster IP address
-instead of pointing to the physical IP address
-of an OpenStack Image API server
-as you would in a non-HA cluster.
+Your OpenStack services must now point their OpenStack Image API configuration
+to the highly available, virtual cluster IP address instead of pointing to the
+physical IP address of an OpenStack Image API server as you would in a non-HA
+cluster.

-For OpenStack Compute, for example,
-if your OpenStack Image API service IP address is 10.0.0.11
-(as in the configuration explained here),
-you would use the following configuration in your :file:`nova.conf` file:
+For example, if your OpenStack Image API service IP address is 10.0.0.11
+(as in the configuration explained here), you would use the following
+configuration in your :file:`nova.conf` file:

 .. code-block:: ini

@@ -117,9 +126,8 @@ you would use the following configuration in your :file:`nova.conf` file:

 You must also create the OpenStack Image API endpoint with this IP address.

-If you are using both private and public IP addresses,
-you should create two virtual IP addresses
-and define your endpoint like this:
+If you are using both private and public IP addresses, create two virtual IP
+addresses and define your endpoint. For example:

 .. code-block:: console