operations-guide/doc/openstack-ops/ch_ops_backup_recovery.xml

<?xml version="1.0" encoding="UTF-8"?>
<chapter version="5.0" xml:id="backup_and_recovery"
         xmlns="http://docbook.org/ns/docbook"
         xmlns:xlink="http://www.w3.org/1999/xlink"
         xmlns:xi="http://www.w3.org/2001/XInclude"
         xmlns:ns5="http://www.w3.org/1999/xhtml"
         xmlns:ns4="http://www.w3.org/1998/Math/MathML"
         xmlns:ns3="http://www.w3.org/2000/svg"
         xmlns:ns="http://docbook.org/ns/docbook">
  <?dbhtml stop-chunking?>

  <title>Backup and Recovery</title>

  <para>Standard backup best practices apply when creating your OpenStack
  backup policy. For example, how often to back up your data is closely
  related to how quickly you need to recover from data loss.<indexterm
      class="singular">
      <primary>backup/recovery</primary>

      <secondary>considerations</secondary>
    </indexterm></para>

  <note>
    <para>If you cannot have any data loss at all, you should also focus on a
    highly available deployment. The <emphasis><link
    xlink:href="http://docs.openstack.org/high-availability-guide/content/">OpenStack High Availability
    Guide</link></emphasis> offers suggestions for elimination of a single
    point of failure that could cause system downtime. While it is not a
    completely prescriptive document, it offers methods and techniques for
    avoiding downtime and data loss.<indexterm class="singular">
        <primary>data</primary>

        <secondary>preventing loss of</secondary>
      </indexterm></para>
  </note>

  <para>Other backup considerations include:</para>

  <itemizedlist>
    <listitem>
      <para>How many backups to keep?</para>
    </listitem>

    <listitem>
      <para>Should backups be kept off-site?</para>
    </listitem>

    <listitem>
      <para>How often should backups be tested?</para>
    </listitem>
  </itemizedlist>

  <para>Just as important as a backup policy is a recovery policy (or at least
  recovery testing).</para>

  <section xml:id="what_to_backup">
    <title>What to Back Up</title>

    <para>While OpenStack is composed of many components and moving parts,
    backing up the critical data is quite simple.<indexterm class="singular">
        <primary>backup/recovery</primary>

        <secondary>items included</secondary>
      </indexterm></para>

    <para>This chapter describes only how to back up configuration files and
    databases that the various OpenStack components need to run. This chapter
    does not describe how to back up objects inside Object Storage or data
    contained inside Block Storage. Generally these areas are left for users
    to back up on their own.</para>
  </section>

  <section xml:id="database_backups">
    <title>Database Backups</title>

    <para>The example OpenStack architecture designates the cloud controller
    as the MySQL server. This MySQL server hosts the databases for nova,
    glance, cinder, and keystone. With all of these databases in one place,
    it's very easy to create a database backup:<indexterm class="singular">
        <primary>databases</primary>

        <secondary>backup/recovery of</secondary>
      </indexterm><indexterm class="singular">
        <primary>backup/recovery</primary>

        <secondary>databases</secondary>
      </indexterm></para>

    <programlisting language="bash"><?db-font-size 75%?><prompt>#</prompt> mysqldump --opt --all-databases &gt; openstack.sql</programlisting>

    <para>If you only want to backup a single database, you can instead
    run:</para>

    <programlisting language="bash"><?db-font-size 75%?><prompt>#</prompt> mysqldump --opt nova &gt; nova.sql</programlisting>

    <para>where <code>nova</code> is the database you want to back up.</para>

    <para>You can easily automate this process by creating a cron job that
    runs the following script once per day:</para>

    <programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt>!/bin/bash
backup_dir="/var/lib/backups/mysql"
filename="${backup_dir}/mysql-`hostname`-`eval date +%Y%m%d`.sql.gz"
# Dump the entire MySQL database
/usr/bin/mysqldump --opt --all-databases | gzip &gt; $filename
# Delete backups older than 7 days
find $backup_dir -ctime +7 -type f -delete</programlisting>

    <para>This script dumps the entire MySQL database and deletes any backups
    older than seven days.</para>
  </section>

  <section xml:id="file_system_backups">
    <title>File System Backups</title>

    <para>This section discusses which files and directories should be backed
    up regularly, organized by service.<indexterm class="singular">
        <primary>file systems</primary>

        <secondary>backup/recovery of</secondary>
      </indexterm><indexterm class="singular">
        <primary>backup/recovery</primary>

        <secondary>file systems</secondary>
      </indexterm></para>

    <section xml:id="compute">
      <title>Compute</title>

      <para>The <filename>/etc/nova</filename> directory on both the cloud
      controller and compute nodes should be regularly backed up.<indexterm
          class="singular">
          <primary>cloud controllers</primary>

          <secondary>file system backups and</secondary>
        </indexterm><indexterm class="singular">
          <primary>compute nodes</primary>

          <secondary>backup/recovery of</secondary>
        </indexterm></para>

      <para><code>/var/log/nova</code> does not need to be backed up if you
      have all logs going to a central area. It is highly recommended to use a
      central logging server or back up the log directory.</para>

      <para><code>/var/lib/nova</code> is another important directory to back
      up. The exception to this is the <code>/var/lib/nova/instances</code>
      subdirectory on compute nodes. This subdirectory contains the KVM images
      of running instances. You would want to back up this directory only if
      you need to maintain backup copies of all instances. Under most
      circumstances, you do not need to do this, but this can vary from cloud
      to cloud and your service levels. Also be aware that making a backup of
      a live KVM instance can cause that instance to not boot properly if it
      is ever restored from a backup.</para>
    </section>

    <section xml:id="image_catalog_delivery">
      <title>Image Catalog and Delivery</title>

      <para><code>/etc/glance</code> and <code>/var/log/glance</code> follow
      the same rules as their nova counterparts.<indexterm class="singular">
          <primary>Image service</primary>

          <secondary>backup/recovery of</secondary>
        </indexterm></para>

      <para><code>/var/lib/glance</code> should also be backed up. Take
      special notice of <code>/var/lib/glance/images</code>. If you are using
      a file-based backend of glance, <code>/var/lib/glance/images</code> is
      where the images are stored and care should be taken.</para>

      <para>There are two ways to ensure stability with this directory. The
      first is to make sure this directory is run on a RAID array. If a disk
      fails, the directory is available. The second way is to use a tool such
      as rsync to replicate the images to another server:</para>

      <programlisting># rsync -az --progress /var/lib/glance/images \
backup-server:/var/lib/glance/images/</programlisting>
    </section>

    <section xml:id="identity">
      <title>Identity</title>

      <para><code>/etc/keystone</code> and <code>/var/log/keystone</code>
      follow the same rules as other components.<indexterm class="singular">
          <primary>Identity Service</primary>

          <secondary>backup/recovery</secondary>
        </indexterm></para>

      <para><code>/var/lib/keystone</code>, although it should not contain any
      data being used, can also be backed up just in case.</para>
    </section>

    <section xml:id="ops_block_storage">
      <title>Block Storage</title>

      <para><code>/etc/cinder</code> and <code>/var/log/cinder</code> follow
      the same rules as other components.<indexterm class="singular">
          <primary>Block Storage</primary>
        </indexterm><indexterm class="singular">
          <primary>storage</primary>

          <secondary>block storage</secondary>
        </indexterm></para>

      <para><code>/var/lib/cinder</code> should also be backed up.</para>
    </section>

    <section xml:id="ops_object_storage">
      <title>Object Storage</title>

      <para><code>/etc/swift</code> is very important to have backed up. This
      directory contains the swift configuration files as well as the ring
      files and ring <glossterm>builder file</glossterm>s, which if lost,
      render the data on your cluster inaccessible. A best practice is to copy
      the builder files to all storage nodes along with the ring files.
      Multiple backup copies are spread throughout your storage
      cluster.<indexterm class="singular">
          <primary>builder files</primary>
        </indexterm><indexterm class="singular">
          <primary>rings</primary>

          <secondary>ring builders</secondary>
        </indexterm><indexterm class="singular">
          <primary>Object Storage</primary>

          <secondary>backup/recovery of</secondary>
        </indexterm></para>
    </section>
  </section>

  <section xml:id="recovering_backups">
    <title>Recovering Backups</title>

    <para>Recovering backups is a fairly simple process. To begin, first
    ensure that the service you are recovering is not running. For example, to
    do a full recovery of <literal>nova</literal> on the cloud controller,
    first stop all <code>nova</code> services:<indexterm class="singular">
        <primary>recovery</primary>

        <seealso>backup/recovery</seealso>
      </indexterm><indexterm class="singular">
        <primary>backup/recovery</primary>

        <secondary>recovering backups</secondary>
      </indexterm></para>

    <?hard-pagebreak ?>

    <programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt> stop nova-api
# stop nova-cert
# stop nova-consoleauth
# stop nova-novncproxy
# stop nova-objectstore
# stop nova-scheduler</programlisting>

    <para>Now you can import a previously backed-up database:</para>

    <programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt> mysql nova &lt; nova.sql</programlisting>

    <para>You can also restore backed-up nova directories:</para>

    <programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt> mv /etc/nova{,.orig}
# cp -a /path/to/backup/nova /etc/</programlisting>

    <para>Once the files are restored, start everything back up:</para>

    <programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt> start mysql
# for i in nova-api nova-cert nova-consoleauth nova-novncproxy
nova-objectstore nova-scheduler
&gt; do
&gt; start $i
&gt; done</programlisting>

    <para>Other services follow the same process, with their respective
    directories and <phrase role="keep-together">databases</phrase>.</para>
  </section>

  <section xml:id="ops-backup-recovery-summary">
    <title>Summary</title>

    <para>Backup and subsequent recovery is one of the first tasks system
    administrators learn. However, each system has different items that need
    attention. By taking care of your database, image service, and appropriate
    file system locations, you can be assured that you can handle any event
    requiring recovery.</para>
  </section>
</chapter>