
Follow conventions for capitalization. Change-Id: I101efe345710363fa661840227886dc028efb090
290 lines
11 KiB
XML
290 lines
11 KiB
XML
<?xml version="1.0" encoding="UTF-8"?>
|
|
<chapter version="5.0" xml:id="backup_and_recovery"
|
|
xmlns="http://docbook.org/ns/docbook"
|
|
xmlns:xlink="http://www.w3.org/1999/xlink"
|
|
xmlns:xi="http://www.w3.org/2001/XInclude"
|
|
xmlns:ns5="http://www.w3.org/1999/xhtml"
|
|
xmlns:ns4="http://www.w3.org/1998/Math/MathML"
|
|
xmlns:ns3="http://www.w3.org/2000/svg"
|
|
xmlns:ns="http://docbook.org/ns/docbook">
|
|
<?dbhtml stop-chunking?>
|
|
|
|
<title>Backup and Recovery</title>
|
|
|
|
<para>Standard backup best practices apply when creating your OpenStack
|
|
backup policy. For example, how often to back up your data is closely
|
|
related to how quickly you need to recover from data loss.<indexterm
|
|
class="singular">
|
|
<primary>backup/recovery</primary>
|
|
|
|
<secondary>considerations</secondary>
|
|
</indexterm></para>
|
|
|
|
<note>
|
|
<para>If you cannot have any data loss at all, you should also focus on a
|
|
highly available deployment. The <emphasis><link
|
|
xlink:href="http://docs.openstack.org/high-availability-guide/content/">OpenStack High Availability
|
|
Guide</link></emphasis> offers suggestions for elimination of a single
|
|
point of failure that could cause system downtime. While it is not a
|
|
completely prescriptive document, it offers methods and techniques for
|
|
avoiding downtime and data loss.<indexterm class="singular">
|
|
<primary>data</primary>
|
|
|
|
<secondary>preventing loss of</secondary>
|
|
</indexterm></para>
|
|
</note>
|
|
|
|
<para>Other backup considerations include:</para>
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>How many backups to keep?</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>Should backups be kept off-site?</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>How often should backups be tested?</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
|
|
<para>Just as important as a backup policy is a recovery policy (or at least
|
|
recovery testing).</para>
|
|
|
|
<section xml:id="what_to_backup">
|
|
<title>What to Back Up</title>
|
|
|
|
<para>While OpenStack is composed of many components and moving parts,
|
|
backing up the critical data is quite simple.<indexterm class="singular">
|
|
<primary>backup/recovery</primary>
|
|
|
|
<secondary>items included</secondary>
|
|
</indexterm></para>
|
|
|
|
<para>This chapter describes only how to back up configuration files and
|
|
databases that the various OpenStack components need to run. This chapter
|
|
does not describe how to back up objects inside Object Storage or data
|
|
contained inside Block Storage. Generally these areas are left for users
|
|
to back up on their own.</para>
|
|
</section>
|
|
|
|
<section xml:id="database_backups">
|
|
<title>Database Backups</title>
|
|
|
|
<para>The example OpenStack architecture designates the cloud controller
|
|
as the MySQL server. This MySQL server hosts the databases for nova,
|
|
glance, cinder, and keystone. With all of these databases in one place,
|
|
it's very easy to create a database backup:<indexterm class="singular">
|
|
<primary>databases</primary>
|
|
|
|
<secondary>backup/recovery of</secondary>
|
|
</indexterm><indexterm class="singular">
|
|
<primary>backup/recovery</primary>
|
|
|
|
<secondary>databases</secondary>
|
|
</indexterm></para>
|
|
|
|
<programlisting language="bash"><?db-font-size 75%?><prompt>#</prompt> mysqldump --opt --all-databases > openstack.sql</programlisting>
|
|
|
|
<para>If you only want to backup a single database, you can instead
|
|
run:</para>
|
|
|
|
<programlisting language="bash"><?db-font-size 75%?><prompt>#</prompt> mysqldump --opt nova > nova.sql</programlisting>
|
|
|
|
<para>where <code>nova</code> is the database you want to back up.</para>
|
|
|
|
<para>You can easily automate this process by creating a cron job that
|
|
runs the following script once per day:</para>
|
|
|
|
<programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt>!/bin/bash
|
|
backup_dir="/var/lib/backups/mysql"
|
|
filename="${backup_dir}/mysql-`hostname`-`eval date +%Y%m%d`.sql.gz"
|
|
# Dump the entire MySQL database
|
|
/usr/bin/mysqldump --opt --all-databases | gzip > $filename
|
|
# Delete backups older than 7 days
|
|
find $backup_dir -ctime +7 -type f -delete</programlisting>
|
|
|
|
<para>This script dumps the entire MySQL database and deletes any backups
|
|
older than seven days.</para>
|
|
</section>
|
|
|
|
<section xml:id="file_system_backups">
|
|
<title>File System Backups</title>
|
|
|
|
<para>This section discusses which files and directories should be backed
|
|
up regularly, organized by service.<indexterm class="singular">
|
|
<primary>file systems</primary>
|
|
|
|
<secondary>backup/recovery of</secondary>
|
|
</indexterm><indexterm class="singular">
|
|
<primary>backup/recovery</primary>
|
|
|
|
<secondary>file systems</secondary>
|
|
</indexterm></para>
|
|
|
|
<section xml:id="compute">
|
|
<title>Compute</title>
|
|
|
|
<para>The <filename>/etc/nova</filename> directory on both the cloud
|
|
controller and compute nodes should be regularly backed up.<indexterm
|
|
class="singular">
|
|
<primary>cloud controllers</primary>
|
|
|
|
<secondary>file system backups and</secondary>
|
|
</indexterm><indexterm class="singular">
|
|
<primary>compute nodes</primary>
|
|
|
|
<secondary>backup/recovery of</secondary>
|
|
</indexterm></para>
|
|
|
|
<para><code>/var/log/nova</code> does not need to be backed up if you
|
|
have all logs going to a central area. It is highly recommended to use a
|
|
central logging server or back up the log directory.</para>
|
|
|
|
<para><code>/var/lib/nova</code> is another important directory to back
|
|
up. The exception to this is the <code>/var/lib/nova/instances</code>
|
|
subdirectory on compute nodes. This subdirectory contains the KVM images
|
|
of running instances. You would want to back up this directory only if
|
|
you need to maintain backup copies of all instances. Under most
|
|
circumstances, you do not need to do this, but this can vary from cloud
|
|
to cloud and your service levels. Also be aware that making a backup of
|
|
a live KVM instance can cause that instance to not boot properly if it
|
|
is ever restored from a backup.</para>
|
|
</section>
|
|
|
|
<section xml:id="image_catalog_delivery">
|
|
<title>Image Catalog and Delivery</title>
|
|
|
|
<para><code>/etc/glance</code> and <code>/var/log/glance</code> follow
|
|
the same rules as their nova counterparts.<indexterm class="singular">
|
|
<primary>Image service</primary>
|
|
|
|
<secondary>backup/recovery of</secondary>
|
|
</indexterm></para>
|
|
|
|
<para><code>/var/lib/glance</code> should also be backed up. Take
|
|
special notice of <code>/var/lib/glance/images</code>. If you are using
|
|
a file-based backend of glance, <code>/var/lib/glance/images</code> is
|
|
where the images are stored and care should be taken.</para>
|
|
|
|
<para>There are two ways to ensure stability with this directory. The
|
|
first is to make sure this directory is run on a RAID array. If a disk
|
|
fails, the directory is available. The second way is to use a tool such
|
|
as rsync to replicate the images to another server:</para>
|
|
|
|
<programlisting># rsync -az --progress /var/lib/glance/images \
|
|
backup-server:/var/lib/glance/images/</programlisting>
|
|
</section>
|
|
|
|
<section xml:id="identity">
|
|
<title>Identity</title>
|
|
|
|
<para><code>/etc/keystone</code> and <code>/var/log/keystone</code>
|
|
follow the same rules as other components.<indexterm class="singular">
|
|
<primary>Identity Service</primary>
|
|
|
|
<secondary>backup/recovery</secondary>
|
|
</indexterm></para>
|
|
|
|
<para><code>/var/lib/keystone</code>, although it should not contain any
|
|
data being used, can also be backed up just in case.</para>
|
|
</section>
|
|
|
|
<section xml:id="ops_block_storage">
|
|
<title>Block Storage</title>
|
|
|
|
<para><code>/etc/cinder</code> and <code>/var/log/cinder</code> follow
|
|
the same rules as other components.<indexterm class="singular">
|
|
<primary>Block Storage</primary>
|
|
</indexterm><indexterm class="singular">
|
|
<primary>storage</primary>
|
|
|
|
<secondary>block storage</secondary>
|
|
</indexterm></para>
|
|
|
|
<para><code>/var/lib/cinder</code> should also be backed up.</para>
|
|
</section>
|
|
|
|
<section xml:id="ops_object_storage">
|
|
<title>Object Storage</title>
|
|
|
|
<para><code>/etc/swift</code> is very important to have backed up. This
|
|
directory contains the swift configuration files as well as the ring
|
|
files and ring <glossterm>builder file</glossterm>s, which if lost,
|
|
render the data on your cluster inaccessible. A best practice is to copy
|
|
the builder files to all storage nodes along with the ring files.
|
|
Multiple backup copies are spread throughout your storage
|
|
cluster.<indexterm class="singular">
|
|
<primary>builder files</primary>
|
|
</indexterm><indexterm class="singular">
|
|
<primary>rings</primary>
|
|
|
|
<secondary>ring builders</secondary>
|
|
</indexterm><indexterm class="singular">
|
|
<primary>Object Storage</primary>
|
|
|
|
<secondary>backup/recovery of</secondary>
|
|
</indexterm></para>
|
|
</section>
|
|
</section>
|
|
|
|
<section xml:id="recovering_backups">
|
|
<title>Recovering Backups</title>
|
|
|
|
<para>Recovering backups is a fairly simple process. To begin, first
|
|
ensure that the service you are recovering is not running. For example, to
|
|
do a full recovery of <literal>nova</literal> on the cloud controller,
|
|
first stop all <code>nova</code> services:<indexterm class="singular">
|
|
<primary>recovery</primary>
|
|
|
|
<seealso>backup/recovery</seealso>
|
|
</indexterm><indexterm class="singular">
|
|
<primary>backup/recovery</primary>
|
|
|
|
<secondary>recovering backups</secondary>
|
|
</indexterm></para>
|
|
|
|
<?hard-pagebreak ?>
|
|
|
|
<programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt> stop nova-api
|
|
# stop nova-cert
|
|
# stop nova-consoleauth
|
|
# stop nova-novncproxy
|
|
# stop nova-objectstore
|
|
# stop nova-scheduler</programlisting>
|
|
|
|
<para>Now you can import a previously backed-up database:</para>
|
|
|
|
<programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt> mysql nova < nova.sql</programlisting>
|
|
|
|
<para>You can also restore backed-up nova directories:</para>
|
|
|
|
<programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt> mv /etc/nova{,.orig}
|
|
# cp -a /path/to/backup/nova /etc/</programlisting>
|
|
|
|
<para>Once the files are restored, start everything back up:</para>
|
|
|
|
<programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt> start mysql
|
|
# for i in nova-api nova-cert nova-consoleauth nova-novncproxy
|
|
nova-objectstore nova-scheduler
|
|
> do
|
|
> start $i
|
|
> done</programlisting>
|
|
|
|
<para>Other services follow the same process, with their respective
|
|
directories and <phrase role="keep-together">databases</phrase>.</para>
|
|
</section>
|
|
|
|
<section xml:id="ops-backup-recovery-summary">
|
|
<title>Summary</title>
|
|
|
|
<para>Backup and subsequent recovery is one of the first tasks system
|
|
administrators learn. However, each system has different items that need
|
|
attention. By taking care of your database, image service, and appropriate
|
|
file system locations, you can be assured that you can handle any event
|
|
requiring recovery.</para>
|
|
</section>
|
|
</chapter>
|