operations-guide/doc/openstack-ops/ch_ops_backup_recovery.xml
Joe Topjian 1e81add8bc O'Reilly Edit: Backup and Recovery
This commit addresses the comment regarding HA clarification.

Change-Id: Icff52a848858d941f745b30849acb68bafc46cf2
2014-02-20 08:54:30 -06:00


<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chapter [
<!-- Some useful entities borrowed from HTML -->
<!ENTITY ndash "&#x2013;">
<!ENTITY mdash "&#x2014;">
<!ENTITY hellip "&#x2026;">
<!ENTITY plusmn "&#xB1;">
]>
<chapter xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0"
xml:id="backup_and_recovery">
<?dbhtml stop-chunking?>
<title>Backup and Recovery</title>
<para>Standard backup best practices apply when creating your
OpenStack backup policy. For example, how often to back up your
data is closely related to how quickly you need to recover
from data loss.</para>
<note>
<para>If you cannot have any data loss at all, you should also
focus on a highly available deployment. The
<citetitle><link
xlink:href="http://docs.openstack.org/high-availability-guide/content/"
>OpenStack High Availability Guide</link></citetitle> offers
suggestions for eliminating single points of
failure that could cause system downtime. While it
is not a completely prescriptive document, it
offers methods and techniques for avoiding
downtime and data loss.</para></note>
<para>Other backup considerations include:</para>
<itemizedlist>
<listitem>
<para>How many backups to keep?</para>
</listitem>
<listitem>
<para>Should backups be kept off-site?</para>
</listitem>
<listitem>
<para>How often should backups be tested?</para>
</listitem>
</itemizedlist>
<para>Just as important as a backup policy is a recovery policy
(or at least recovery testing).</para>
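<para>As one small piece of recovery testing, the following sketch
checks that a gzip-compressed dump file exists, is not empty, and
passes the gzip integrity test. The function name
<code>verify_dump</code> is hypothetical and not part of OpenStack,
and a passing check only shows that the archive is readable, not
that the data restores correctly.</para>
<programlisting language="bash"><?db-font-size 65%?>#!/bin/bash
# Hypothetical helper: sanity-check a gzip-compressed backup file.
verify_dump() {
    local dump_file="$1"
    # Reject missing or empty files before asking gzip to test them.
    if [ ! -s "$dump_file" ]; then
        echo "missing or empty: $dump_file"
        return 1
    fi
    # gzip -t reads the whole archive and fails if it is corrupt.
    if gzip -t -q "$dump_file"; then
        echo "archive readable: $dump_file"
        return 0
    fi
    echo "archive corrupt: $dump_file"
    return 1
}

# Example call (the path is an assumption):
# verify_dump /var/lib/backups/mysql/mysql-controller-20140220.sql.gz</programlisting>
<para>A full test still means importing the dump into a scratch
MySQL server and starting the services against it.</para>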
<section xml:id="what_to_backup">
<title>What to Back Up</title>
<para>While OpenStack is composed of many components and
moving parts, backing up the critical data is quite
simple.</para>
<para>This chapter describes only how to back up configuration
files and databases that the various OpenStack components
need to run. This chapter does not describe how to back up
objects inside Object Storage or data contained inside
Block Storage. Generally these areas are left for the user
to back up on their own.</para>
</section>
<section xml:id="database_backups">
<title>Database Backups</title>
<para>The example OpenStack architecture designates the Cloud
Controller as the MySQL server. This MySQL server hosts
the databases for Nova, Glance, Cinder, and Keystone. With
all of these databases in one place, it's very easy to
create a database backup:</para>
<programlisting language="bash"><?db-font-size 75%?><prompt>#</prompt> mysqldump --opt --all-databases &gt; openstack.sql</programlisting>
<para>If you only want to back up a single database, you can
instead run:</para>
<programlisting language="bash"><?db-font-size 75%?><prompt>#</prompt> mysqldump --opt nova &gt; nova.sql</programlisting>
<para>where <code>nova</code> is the database you want to back
up.</para>
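<para>To keep one dump file per service, the single-database command
can be wrapped in a loop. The following is a minimal sketch: the
function name <code>dump_databases</code> is hypothetical, the
database names other than <code>nova</code> are assumptions based on
the services listed above, and it assumes
<literal>mysqldump</literal> can authenticate without
prompting.</para>
<programlisting language="bash"><?db-font-size 65%?>#!/bin/bash
# Sketch: dump each named database to its own file in out_dir.
# Assumes mysqldump can log in without a prompt (for example through
# root's ~/.my.cnf); that setup is not covered in this chapter.
dump_databases() {
    local out_dir="$1"
    shift
    local db
    for db in "$@"; do
        # --result-file writes the dump directly to the named file.
        mysqldump --opt --result-file="${out_dir}/${db}.sql" "$db" || return 1
    done
}

# Example call; glance, cinder, and keystone are assumed database names.
# dump_databases /var/lib/backups/mysql nova glance cinder keystone</programlisting>
<para>Separate files make it possible to restore one service
without touching the others.</para>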
<para>You can easily automate this process by creating a cron
job that runs the following script once per day:</para>
<programlisting language="bash"><?db-font-size 65%?>#!/bin/bash
backup_dir="/var/lib/backups/mysql"
filename="${backup_dir}/mysql-$(hostname)-$(date +%Y%m%d).sql.gz"
# Dump all MySQL databases and compress the output
/usr/bin/mysqldump --opt --all-databases | gzip &gt; "$filename"
# Delete backups older than 7 days
find "$backup_dir" -ctime +7 -type f -delete</programlisting>
<para>This script dumps all MySQL databases and deletes
any backups older than 7 days.</para>
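<para>A cron entry along the following lines could run the script
once per day. This is a sketch in the <filename>/etc/cron.d</filename>
file format: the script path
<filename>/usr/local/bin/openstack-mysql-backup.sh</filename> and the
02:00 start time are assumptions, not values from this guide.</para>
<programlisting><?db-font-size 65%?># minute hour day-of-month month day-of-week user command
0 2 * * * root /usr/local/bin/openstack-mysql-backup.sh</programlisting>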
</section>
<section xml:id="file_system_backups">
<title>File System Backups</title>
<para>This section discusses which files and directories should be backed up regularly, organized by service.</para>
<section xml:id="compute">
<title>Compute</title>
<para>The <code>/etc/nova</code> directory on both the
cloud controller and compute nodes should be regularly
backed up.</para>
<para>
<code>/var/log/nova</code> does not need to be backed up if
you have all logs going to a central area. It is
highly recommended to use a central logging server or
to back up the log directory.</para>
<para>
<code>/var/lib/nova</code> is another important
directory to back up. The exception to this is the
<code>/var/lib/nova/instances</code> subdirectory
on compute nodes. This subdirectory contains the KVM
images of running instances. You would only want to
back up this directory if you need to maintain backup
copies of all instances. Under most circumstances, you
do not need to do this, but this can vary from cloud
to cloud and your service levels. Also be aware that
making a backup of a live KVM instance can cause that
instance to not boot properly if it is ever restored
from a backup.</para>
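<para>The rules above can be sketched as a single
<literal>tar</literal> command that archives the configuration and
state directories while skipping the instances subdirectory. The
function name <code>backup_nova_files</code> and the archive path in
the example call are hypothetical. The root argument is
<filename>/</filename> on a real node.</para>
<programlisting language="bash"><?db-font-size 65%?>#!/bin/bash
# Sketch: archive etc/nova and var/lib/nova below a given root,
# leaving out var/lib/nova/instances.
backup_nova_files() {
    local root="$1"      # "/" on a real node, a scratch tree when testing
    local archive="$2"   # destination .tar.gz file
    # -C changes into root so that member names are relative paths;
    # --exclude is given before the directory names it applies to.
    tar -czf "$archive" -C "$root" \
        --exclude="var/lib/nova/instances" \
        etc/nova var/lib/nova
}

# Example call (the archive path is an assumption):
# backup_nova_files / /var/lib/backups/nova-files.tar.gz</programlisting>
<para>Listing the archive with <literal>tar -tzf</literal> is a
quick way to confirm that the instances subdirectory was left
out.</para>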
</section>
<section xml:id="image_catalog_delivery">
<title>Image Catalog and Delivery</title>
<para>
<code>/etc/glance</code> and
<code>/var/log/glance</code> follow the same rules
as their nova counterparts.</para>
<para>
<code>/var/lib/glance</code> should also be backed up.
Take special notice of
<code>/var/lib/glance/images</code>. If you are
using a file-based back end for Glance,
<code>/var/lib/glance/images</code> is where the
images are stored, and care should be taken.</para>
<para>There are two ways to protect this
directory. The first is to make sure this directory is
stored on a RAID array. If a disk fails, the directory
remains available. The second way is to use a tool such as
rsync to replicate the images to another
server:</para>
<programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt> rsync -az --progress /var/lib/glance/images/ backup-server:/var/lib/glance/images/</programlisting>
</section>
<section xml:id="identity">
<title>Identity</title>
<para>
<code>/etc/keystone</code> and
<code>/var/log/keystone</code> follow the same
rules as other components.</para>
<para>
<code>/var/lib/keystone</code>, while it should not
contain any data in use, can also be backed up
just in case.</para>
</section>
<section xml:id="ops_block_storage">
<title>Block Storage</title>
<para>
<code>/etc/cinder</code> and
<code>/var/log/cinder</code> follow the same rules
as other components.</para>
<para>
<code>/var/lib/cinder</code> should also be backed
up.</para>
</section>
<section xml:id="ops_object_storage">
<title>Object Storage</title>
<para>
<code>/etc/swift</code> is very important to have
backed up. This directory contains the Swift
configuration files as well as the ring files and ring
<glossterm>builder file</glossterm>s, which, if
lost, render the data on your cluster inaccessible. A
best practice is to copy the builder files to all
storage nodes along with the ring files, so that multiple
backup copies are spread throughout your storage
cluster.</para>
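<para>Copying the ring and builder files to every storage node can
be sketched with <literal>rsync</literal>. The function name
<code>distribute_swift_rings</code> and the host names in the
example call are hypothetical. The sketch assumes the files use the
usual <filename>.builder</filename> and
<filename>.ring.gz</filename> suffixes and that each destination is
a valid rsync target.</para>
<programlisting language="bash"><?db-font-size 65%?>#!/bin/bash
# Sketch: copy Swift builder and ring files to each destination.
distribute_swift_rings() {
    local source_dir="$1"   # normally /etc/swift
    shift
    local dest
    for dest in "$@"; do
        # Each destination is an rsync target such as storage01:/etc/swift/
        rsync -a "$source_dir"/*.builder "$source_dir"/*.ring.gz "$dest" || return 1
    done
}

# Example call; storage01 and storage02 are assumed host names.
# distribute_swift_rings /etc/swift storage01:/etc/swift/ storage02:/etc/swift/</programlisting>
<para>The function stops at the first failed copy, so a non-zero
exit status means at least one node does not have a current
copy.</para>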
</section>
</section>
<section xml:id="recovering_backups">
<title>Recovering Backups</title>
<para>Recovering backups is a fairly simple process. To begin,
first ensure that the service you are recovering is not
running. For example, to do a full recovery of nova on the
cloud controller, first stop all <code>nova</code>
services:</para>
<programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt> stop nova-api
<prompt>#</prompt> stop nova-cert
<prompt>#</prompt> stop nova-consoleauth
<prompt>#</prompt> stop nova-novncproxy
<prompt>#</prompt> stop nova-objectstore
<prompt>#</prompt> stop nova-scheduler</programlisting>
<para>Now you can import a previously backed up
database:</para>
<programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt> mysql nova &lt; nova.sql</programlisting>
<para>You can also restore the backed-up nova directories:</para>
<programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt> mv /etc/nova{,.orig}
<prompt>#</prompt> cp -a /path/to/backup/nova /etc/</programlisting>
<para>Once the files are restored, start everything back
up:</para>
<programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt> start mysql
# for i in nova-api nova-cert nova-consoleauth nova-novncproxy nova-objectstore nova-scheduler
&gt; do
&gt; start $i
&gt; done
</programlisting>
<para>Other services follow the same process, with their
respective directories and databases.</para>
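<para>For the directory part of that process, the
<literal>mv</literal> and <literal>cp -a</literal> steps shown
above can be generalized into a small helper. This is a sketch
only: the function name <code>restore_config_dir</code> is
hypothetical, and it assumes the service has already been
stopped.</para>
<programlisting language="bash"><?db-font-size 65%?>#!/bin/bash
# Sketch: set the current directory aside as TARGET.orig, then copy
# the backed-up directory into its place.
restore_config_dir() {
    local backup_copy="$1"   # for example /path/to/backup/glance
    local target="$2"        # for example /etc/glance
    if [ ! -d "$backup_copy" ]; then
        echo "no backup directory at $backup_copy"
        return 1
    fi
    # Refuse to overwrite an earlier .orig copy.
    if [ -e "$target.orig" ]; then
        echo "$target.orig already exists"
        return 1
    fi
    if [ -e "$target" ]; then
        mv "$target" "$target.orig" || return 1
    fi
    # cp -a preserves ownership, permissions, and timestamps.
    cp -a "$backup_copy" "$target"
}

# Example call, using the placeholder backup path from above:
# restore_config_dir /path/to/backup/glance /etc/glance</programlisting>
<para>Keeping the <filename>.orig</filename> copy makes it possible
to roll back if the restored files turn out to be wrong.</para>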
</section>
<section xml:id="ops-backup-recovery-summary">
<title>Summary</title>
<para>Backup and subsequent recovery is one of the first tasks system
administrators learn. However, each system has different
items that need attention. By taking care of your database, image
service, and appropriate file system locations, you can be assured
that you can handle any event requiring recovery.</para>
</section>
</chapter>