O'Reilly edit: Logging and Monitoring

This commit addresses the technical comments from O'Reilly for the logging and monitoring chapter. A short introduction to Nagios was added as a sidebar, and the explanation of how to calculate a user's quota usage was clarified.

Change-Id: I9e638d0dc3861a7a8cc73b1ef975c7ef0c7c2f0c
2014-02-18 20:00:42 +01:00
commit 513fa978d3
parent 132257a520
1 changed files with 30 additions and 6 deletions
--- a/doc/openstack-ops/ch_ops_log_monitor.xml
+++ b/doc/openstack-ops/ch_ops_log_monitor.xml
@@ -426,6 +426,18 @@ notification_driver=nova.openstack.common.notifier.rabbit_notifier</programlisti
            The latter involves monitoring resource usage over time in
            order to make informed decisions about potential
            bottlenecks and upgrades.</para>
+        <sidebar>
+            <title>Nagios</title>
+            <para>Nagios is an open source monitoring service. It
+                can execute arbitrary commands to check the status
+                of server and network services, run arbitrary
+                commands remotely on servers, and let servers push
+                notifications back in the form of passive
+                monitoring. Nagios has been around since 1999.
+                Although newer monitoring services are available,
+                Nagios is a tried-and-true systems administration
+                staple.</para>
+        </sidebar>
        <section xml:id="process_monitoring">
            <title>Process Monitoring</title>
            <para>A basic type of alert monitoring is to simply check
@@ -557,12 +569,14 @@ root 24121 0.0 0.0 11688 912 pts/5 S+ 13:07 0:00 grep nova-api</programlisting>
 | 628df59f091142399e0689a2696f5baa | gigabytes    | 12     |
 | 628df59f091142399e0689a2696f5baa | images       | 1      |
 +----------------------------------+--------------+--------+</programlisting>
-            <para>By combining the resources used with the tenant's
-                quota, you can figure out a usage percentage. For
-                example, if this tenant is using 1 Floating IP out of
-                10, then they are using 10% of their Floating IP
-                quota. You can take this procedure and turn it into a
-                formatted report:</para>
+            <para>By comparing a tenant's hard limit with their
+                current resource usage, you can calculate their usage
+                percentage. For example, if this tenant is using 1
+                Floating IP out of 10, then they are using 10% of
+                their Floating IP quota. Rather than doing the
+                calculation manually, you can use SQL or the scripting
+                language of your choice to create a formatted
+                report:</para>
            <programlisting><?db-font-size 65%?>
 +----------------------------------+------------+-------------+---------------+
 | some_tenant                                                                 |
@@ -730,4 +744,14 @@ glance image-create --name='cirros image' --is-public=true --container-format=ba
                (https://collectd.org/wiki/index.php/Data_source)</para>
        </section>
    </section>
+    <section xml:id="ops-log-monitor-summary">
+        <title>Summary</title>
+        <para>For stable operations, you want to detect failure promptly and
+        determine causes efficiently. With a distributed system, it's even
+        more important to track the right items to meet a service level target.
+        Knowing where the logs are located, in the file system or through the
+        API, gives you an advantage. This chapter also discussed how to read,
+        interpret, and manipulate information from OpenStack services so you
+        can monitor effectively.</para>
+    </section>
 </chapter>
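
Note on the quota paragraph in the second hunk: the usage-percentage calculation it describes could be sketched in Python roughly as follows. This is only an illustration, not the script the chapter uses. The function names, the `floating_ips` and `gigabytes` resource keys, the sample numbers other than the 1-of-10 Floating IP case, and the choice to report "n/a" for a non-positive limit are all assumptions made for the example.

```python
def usage_percent(used, hard_limit):
    """Return usage as a percentage of the hard limit.

    Assumption for this sketch: a missing or non-positive limit is
    treated as "no meaningful percentage" and yields None.
    """
    if hard_limit is None or hard_limit <= 0:
        return None
    return 100.0 * used / hard_limit


def format_report(tenant, limits, usage):
    """Build a plain-text report with one row per resource.

    `limits` and `usage` are hypothetical dicts mapping a resource
    name to its hard limit and to the amount currently in use.
    """
    row = "{:<14}{:>6}{:>7}{:>9}"
    lines = [tenant, row.format("Resource", "Used", "Limit", "Percent")]
    for resource in sorted(limits):
        limit = limits[resource]
        used = usage.get(resource, 0)  # no usage recorded counts as 0
        pct = usage_percent(used, limit)
        pct_text = "n/a" if pct is None else "{:.0f}%".format(pct)
        lines.append(row.format(resource, used, limit, pct_text))
    return "\n".join(lines)


if __name__ == "__main__":
    # Sample data: the floating IP figures match the example in the
    # paragraph; the gigabytes limit is made up for illustration.
    limits = {"floating_ips": 10, "gigabytes": 1000}
    usage = {"floating_ips": 1, "gigabytes": 12}
    print(format_report("some_tenant", limits, usage))
```

In practice the two dicts would be filled from the quota and usage queries shown earlier in the chapter; how those results are fetched is left out here on purpose.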