Adding 'running slowly' troubleshooting section
Added in Steve Deaton's content about troubleshooting a slow cloud.
Also, address the broken link.

Change-Id: Iadf7d2df62e9d4d77e0c36cb33467af3546bb2cb
Closes-Bug: #1251088
Co-Authored-By: Steven Deaton <sdeaton2@gmail.com>
parent b08db66706
commit a15d78f652
@@ -899,7 +899,7 @@ inner join nova.instances on cinder.volumes.instance_uuid=nova.instances.uuid
 xlink:href="https://github.com/opscode/openstack-chef-repo">OpenStack Chef recipes</link>.
 Other newer configuration tools include <link
 xlink:href="https://juju.ubuntu.com/">Juju</link>, <link
-xlink:href="http://www.ansible.com/home">Ansible</link>, and <link
+xlink:href="https://www.ansible.com/">Ansible</link>, and <link
 xlink:href="http://www.saltstack.com/">Salt</link>; and more mature
 configuration management tools include <link
 xlink:href="http://cfengine.com/">CFEngine</link> and <link
@@ -1330,6 +1330,127 @@ sql_connection = mysql+pymysql://cinder:password@cloud.example.com/cinder
 
 <?hard-pagebreak ?>
 
+<section xml:id="runningslow">
+<?dbhtml stop-chunking?>
+
+<title>What to do when things are running slowly</title>
+
+<para>
+When you are getting slow responses from various services, it can be
+hard to know where to start looking. The first thing to check is the
+extent of the slowness: is it specific to a single service, or varied
+among different services? If your problem is isolated to a specific
+service, it can temporarily be fixed by restarting the service, but that
+is often only a fix for the symptom and not the actual problem.
+</para>
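One rough way to judge the extent of the slowness is to compare recent response times for each service against what is normal for your cloud. The following is a minimal Python sketch under stated assumptions, not part of the guide's tooling: the function names, the input dictionaries, and the factor of 3 are all hypothetical, and how the timings are collected is left open.

```python
from statistics import median


def slow_services(latencies, baseline, factor=3.0):
    """Return the services whose median latency exceeds factor * baseline.

    latencies: dict of service name to a list of response times in seconds.
    baseline:  dict of service name to its normal response time in seconds.
    factor:    hypothetical threshold; tune it for your environment.
    """
    slow = []
    for name, samples in latencies.items():
        # Services with no samples are skipped rather than guessed at.
        if samples and median(samples) > factor * baseline[name]:
            slow.append(name)
    return sorted(slow)


def slowness_scope(latencies, baseline, factor=3.0):
    """Classify slowness as 'none', 'isolated' (one service) or 'widespread'."""
    slow = slow_services(latencies, baseline, factor)
    if not slow:
        return "none", slow
    if len(slow) == 1:
        return "isolated", slow
    return "widespread", slow
```

An "isolated" result points toward that service's own checks, while a "widespread" result may suggest a shared dependency such as Identity, the AMQP broker, or the SQL back end.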
+
+<para>
+This is a collection of ideas from experienced operators on common
+things to look at that may be the cause of slowness. It is not, however,
+designed to be an exhaustive list.
+</para>
+
+<section xml:id="runningslow_keystone">
+<?dbhtml stop-chunking?>
+<title>OpenStack Identity service</title>
+<para>
+If OpenStack Identity is responding slowly, it could be due to the
+token table getting large. This can be fixed by running the
+<command>keystone-manage token_flush</command> command.
+</para>
+<para>
+Additionally, for Identity-related issues, try the tips in
+<xref linkend="runningslow_sql" />.
+</para>
+</section>
+
+<section xml:id="runningslow_glance">
+<?dbhtml stop-chunking?>
+<title>OpenStack Image service</title>
+<para>
+OpenStack Image service can be slowed down by problems with the
+Identity service, but the Image service itself can be slowed down if
+connectivity to the back-end storage in use is slow or otherwise
+problematic. For example, your back-end NFS server might have gone
+down.
+</para>
+</section>
+
+<section xml:id="runningslow_cinder">
+<?dbhtml stop-chunking?>
+<title>OpenStack Block Storage service</title>
+<para>
+OpenStack Block Storage service is similar to the Image service, so
+start by checking Identity-related services and the back-end storage.
+Additionally, both the Block Storage and Image services rely on AMQP
+and SQL functionality, so consider these when debugging.
+</para>
+</section>
+
+<section xml:id="runningslow_nova">
+<?dbhtml stop-chunking?>
+<title>OpenStack Compute service</title>
+<para>
+Services related to OpenStack Compute are normally fairly fast and
+rely on a couple of back-end services: Identity (for authentication and
+authorization), and AMQP for interoperability. Any slowness in Compute
+services is normally caused by one of these. Also, as with all other
+services, SQL is used extensively.
+</para>
+</section>
+
+<section xml:id="runningslow_neutron">
+<?dbhtml stop-chunking?>
+<title>OpenStack Networking service</title>
+<para>
+Slowness in the OpenStack Networking service can be caused by services
+that it relies upon, but it can also be related to either physical or
+virtual networking. For example: network namespaces that do not exist
+or are not tied to interfaces correctly; DHCP daemons that have hung
+or are not running; a cable being physically disconnected; a switch
+not being configured correctly. When debugging Networking service
+problems, begin by verifying all physical networking functionality
+(switch configuration, physical cabling, etc.). After the physical
+networking is verified, check to be sure all of the Networking
+services are running (neutron-server, neutron-dhcp-agent, etc.), then
+check on AMQP and SQL back ends.
+</para>
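The debugging order described for the Networking service (physical networking first, then the Networking services, then the AMQP and SQL back ends) can be sketched as an ordered checklist that stops at the first failure. This is only an illustration: the helper name is hypothetical and the lambdas are placeholders standing in for whatever probes suit your site.

```python
def first_failing_check(checks):
    """Run (name, check) pairs in order and return the first failing name.

    Each check is a zero-argument callable returning True when healthy.
    Returns None when every check passes.
    """
    for name, check in checks:
        if not check():
            return name
    return None


# Placeholder checks: replace each lambda with a real probe for your site.
checks = [
    ("physical networking", lambda: True),
    ("Networking services", lambda: False),
    ("AMQP and SQL back ends", lambda: True),
]

print(first_failing_check(checks))  # prints "Networking services"
```

Stopping at the first failure keeps attention on the lowest layer that is broken before looking at anything that depends on it.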
+</section>
+
+<section xml:id="runningslow_amqp">
+<?dbhtml stop-chunking?>
+<title>AMQP broker</title>
+<para>
+Regardless of which AMQP broker you use, such as RabbitMQ, there are
+common issues that not only slow down operations but can also cause
+real problems. Sometimes messages queued for services stay on the
+queues and are not consumed. This can be due to dead or stagnant
+services and can commonly be cleared up by restarting either the
+AMQP-related services or the OpenStack service in question.
+</para>
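Assuming RabbitMQ is the broker, one way to spot messages that are not being consumed is to scan the output of `rabbitmqctl list_queues name messages consumers` for queues that hold messages but have no consumers. The Python sketch below makes several assumptions: the function name is hypothetical, the output is taken to be tab-separated with optional informational or header lines (which can vary by RabbitMQ version), and the sample text and queue names are illustrative rather than captured from a real broker.

```python
# Illustrative sample only; real output depends on the RabbitMQ version.
SAMPLE_OUTPUT = (
    "Listing queues for vhost / ...\n"
    "name\tmessages\tconsumers\n"
    "compute.host1\t42\t0\n"
    "scheduler\t0\t1\n"
    "cinder-volume\t7\t2\n"
)


def stuck_queues(list_queues_output, min_messages=1):
    """Return (queue, messages) pairs for queues with a backlog and no consumers."""
    stuck = []
    for line in list_queues_output.splitlines():
        fields = line.split("\t")
        if len(fields) != 3:
            continue  # informational line, not a data row
        name, messages, consumers = fields
        if not (messages.isdigit() and consumers.isdigit()):
            continue  # column header row
        if int(messages) >= min_messages and int(consumers) == 0:
            stuck.append((name, int(messages)))
    return stuck
```

A queue flagged this way is only a hint that the consuming service may be dead or stagnant; a backlog on a queue that still has consumers can also indicate trouble but is not covered by this heuristic.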
+</section>
+
+<section xml:id="runningslow_sql">
+<?dbhtml stop-chunking?>
+<title>SQL back end</title>
+<para>
+Whether you use SQLite or an RDBMS (such as MySQL), SQL
+interoperability is essential to a functioning OpenStack environment.
+A large or fragmented SQLite file can cause slowness when using files
+as a back end. A locked or long-running query can cause delays for
+most RDBMS services. In this case, do not kill the query immediately;
+first determine whether something is hung or whether the query is
+simply taking a long time to run and needs to
+finish on its own. The administration of an RDBMS is outside the scope
+of this document, but it should be noted that a properly functioning
+RDBMS is essential to most OpenStack services.
+</para>
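For MySQL, long-running queries can be listed for inspection before anyone decides whether to kill them. The sketch below assumes rows shaped like the output of `SHOW FULL PROCESSLIST` (with its `Id`, `Command`, `Time`, and `Info` columns) have already been fetched into dictionaries by some client; the function name and the 60-second threshold are hypothetical.

```python
def long_running_queries(processlist, min_seconds=60):
    """Return active rows running for at least min_seconds, longest first.

    processlist: iterable of dicts keyed by SHOW PROCESSLIST column names.
    Rows whose Command is 'Sleep' are idle connections and are ignored.
    """
    hits = []
    for row in processlist:
        if row["Command"] == "Sleep":
            continue
        if row["Time"] >= min_seconds:
            hits.append(row)
    return sorted(hits, key=lambda r: r["Time"], reverse=True)
```

The function only reports candidates. Whether a listed query is hung or simply needs to finish on its own remains a judgment call for the operator.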
+</section>
+
+</section>
+
+<?hard-pagebreak ?>
+
 <section xml:id="uninstalling">
 <?dbhtml stop-chunking?>
 