O'Reilly Edit: Changes to the Storage Chapter

This patch addresses the comments made during the latest round of edits
from O'Reilly.

Change-Id: I7deaceacd319775c9960377074728538aa0b0314
Joe Topjian 2014-02-15 14:05:46 +01:00 committed by Anne Gentle
parent afe62516ca
commit 96b265a87c

@ -34,24 +34,112 @@ format="SVG" scale="60"/>
<para>Today, OpenStack clouds explicitly support two types of <para>Today, OpenStack clouds explicitly support two types of
persistent storage: <emphasis>object storage</emphasis> persistent storage: <emphasis>object storage</emphasis>
and <emphasis>block storage</emphasis>.</para></section> and <emphasis>block storage</emphasis>.</para></section>
<section xml:id="persistent_storage">
<title>Persistent Storage</title>
<para>Persistent storage means that the storage resource outlives any
other resource and is always available, regardless of the state of a
running instance.</para>
<section xml:id="object_storage"> <section xml:id="object_storage">
<title>Object Storage</title> <title>Object Storage</title>
<para>With object storage, users access binary objects <para>With object storage, users access binary objects
through a REST API. You may be familiar with Amazon through a REST API. You may be familiar with Amazon
S3, which is a well-known example of an object storage S3, which is a well-known example of an object storage
system. If your intended users need to archive or system. Object storage is implemented in OpenStack by
manage large datasets, you want to provide them with the OpenStack Object Storage (swift) project. If your
object storage. In addition, OpenStack can store your intended users need to archive or manage large
virtual machine (VM) images inside of an object datasets, you want to provide them with object
storage system, as an alternative to storing the storage. In addition, OpenStack can store your virtual
images on a file system.</para> machine (VM) images inside of an object storage
system, as an alternative to storing the images on a
file system.</para>
<para>OpenStack Object Storage provides a highly scalable,
highly available storage solution by relaxing some of the
constraints of traditional file systems. In designing and
procuring for such a cluster, it is important to
understand some key concepts about its operation.
Essentially, this type of storage is built on the idea
that all storage hardware fails, at every level, at some
point. Infrequently encountered failures that would
hamstring other storage systems, such as issues taking
down RAID cards or entire servers, are handled gracefully
with OpenStack Object Storage.</para>
<para>A good document describing the Object Storage
architecture is found within <link
xlink:title="OpenStack wiki"
xlink:href="http://docs.openstack.org/developer/swift/overview_architecture.html"
>the developer documentation</link>
(http://docs.openstack.org/developer/swift/overview_architecture.html).
Read this first. Once you have understood the
architecture, you should know what a proxy server does and
how zones work. However, some important points are often
missed at first glance.</para>
<para>When designing your cluster, you must consider
durability and availability. Understand that the
predominant source of these is the spread and placement of
your data, rather than the reliability of the hardware.
Consider the default value of the number of replicas,
which is 3. This means that before an object is marked as
having been written, at least two copies exist. If a
single server fails to write, the third copy may or may
not yet exist when the write operation initially returns.
Increasing this number increases the robustness of your
data, but reduces the amount of storage you have
available. Next, look at the placement of your servers.
Consider spreading them widely throughout your data
centre's network and power failure zones. Is a zone a
rack, a server, or a disk?</para>
<para>Object Storage's network patterns might seem unfamiliar
at first. Consider these main traffic flows: <itemizedlist>
<listitem>
<para>Among <glossterm>object</glossterm>,
<glossterm>container</glossterm>, and
<glossterm>account
server</glossterm>s</para>
</listitem>
<listitem>
<para>Between those servers and the proxies</para>
</listitem>
<listitem>
<para>Between the proxies and your users</para>
</listitem>
</itemizedlist></para>
<para>Object Storage is very 'chatty' among servers hosting
data: even a small cluster does megabytes per second of
traffic, which is predominantly "Do you have the
object?"/"Yes, I have the object!" Of course, if the
answer to the aforementioned question is negative or the
request times out, replication of the object begins.</para>
<para>Consider the scenario where an entire server fails, and
24 TB of data needs to be transferred "immediately" to
remain at three copies - this can put significant load on
the network.</para>
<para>Another often-forgotten fact is that when a new file is
being uploaded, the proxy server must write out as many
streams as there are replicas, multiplying the
network traffic. For a 3-replica cluster, 10 Gbps in means
30 Gbps out. Combining this with the high
bandwidth demands of replication described earlier is what results in the
recommendation that your private network have
significantly higher bandwidth than your public network needs.
Oh, and OpenStack Object Storage communicates internally
with unencrypted, unauthenticated rsync for performance
&mdash; you do want the private network to be
private.</para>
<para>The remaining point on bandwidth is the public-facing
portion. The swift-proxy service is stateless, which means
that you can easily add more proxies and use HTTP load-balancing
methods to share bandwidth and availability between
them.</para>
<para>More proxies mean more bandwidth, if your storage can
keep up.</para>
</section> </section>
<section xml:id="block_storage"> <section xml:id="block_storage">
<title>Block Storage</title> <title>Block Storage</title>
<para>Block storage (sometimes referred to as volume <para>Block storage (sometimes referred to as volume
storage) exposes a block device to the user. Users storage) provides users with access to block storage
interact with block storage by attaching volumes to devices. Users interact with block storage by
their running VM instances.</para> attaching volumes to their running VM
instances.</para>
<para>These volumes are persistent: they can be detached <para>These volumes are persistent: they can be detached
from one instance and re-attached to another, and the from one instance and re-attached to another, and the
data remains intact. Block storage is implemented in data remains intact. Block storage is implemented in
@ -76,6 +164,7 @@ format="SVG" scale="60"/>
utilizes QEMU's file-based virtual machines stored in utilizes QEMU's file-based virtual machines stored in
<code>/var/lib/nova/instances</code>.</para> <code>/var/lib/nova/instances</code>.</para>
</section> </section>
</section>
<section xml:id="storage_concepts"> <section xml:id="storage_concepts">
<title>OpenStack Storage Concepts</title> <title>OpenStack Storage Concepts</title>
<table xml:id="openstack_storage" rules="all"> <table xml:id="openstack_storage" rules="all">
@ -149,7 +238,8 @@ format="SVG" scale="60"/>
</tbody> </tbody>
</table> </table>
<section xml:id="file_level_storage"> <section xml:id="file_level_storage">
<title>File-level Storage</title> <!-- FIXME: change to an aside -->
<title>File-level Storage (for Live Migration)</title>
<para>With file-level storage, users access stored data <para>With file-level storage, users access stored data
using the operating system's file system interface. using the operating system's file system interface.
Most users, if they have used a network storage Most users, if they have used a network storage
@ -169,15 +259,16 @@ format="SVG" scale="60"/>
<?hard-pagebreak?> <?hard-pagebreak?>
<section xml:id="storage_backends"> <section xml:id="storage_backends">
<title>Choosing Storage Back-ends</title> <title>Choosing Storage Back-ends</title>
<para>Users will indicate different needs for their cloud use cases. <para>Users will indicate different needs for their cloud use
Some may need fast access to many objects that do not change often, cases. Some may need fast access to many objects that do
or they want to set a Time To Live (TTL) value on a file. Others may only not change often, or they want to set a Time To Live (TTL)
access storage that is mounted with the file system itself, but want value on a file. Others may only access storage that is
it to be replicated instantly when starting a new instance. For mounted with the file system itself, but want it to be
other systems, ephemeral storage that is released when a VM attached replicated instantly when starting a new instance. For
to it is shut down. When you select <glossterm>storage other systems, ephemeral storage that is released when a
back-end</glossterm>s, ask the following VM attached to it is shut down. When you select
questions on behalf of your users:</para> <glossterm>storage back-end</glossterm>s, ask the
following questions on behalf of your users:</para>
<itemizedlist role="compact"> <itemizedlist role="compact">
<listitem> <listitem>
<para>Do my users need block storage?</para> <para>Do my users need block storage?</para>
@ -263,12 +354,6 @@ format="SVG" scale="60"/>
<td><para>&CHECK;</para></td> <td><para>&CHECK;</para></td>
<td><para> </para></td> <td><para> </para></td>
</tr> </tr>
<tr>
<td><para>Sheepdog</para></td>
<td><para> </para></td>
<td><para>experimental</para></td>
<td><para> </para></td>
</tr>
</tbody> </tbody>
</table> </table>
<para>* This list of open-source file-level shared storage <para>* This list of open-source file-level shared storage
@ -315,10 +400,11 @@ format="SVG" scale="60"/>
</itemizedlist> </itemizedlist>
<section xml:id="commodity_storage_backends"> <section xml:id="commodity_storage_backends">
<title>Commodity Storage Back-end Technologies</title> <title>Commodity Storage Back-end Technologies</title>
<para>This section provides a high-level overview of the differences <para>This section provides a high-level overview of the
among the different commodity storage back-end technologies. differences among the different commodity storage
Depending on your cloud user's needs, you can implement one or back-end technologies. Depending on your cloud user's
many of these technologies in different combinations.</para> needs, you can implement one or many of these
technologies in different combinations.</para>
<itemizedlist role="compact"> <itemizedlist role="compact">
<listitem> <listitem>
<para><emphasis role="bold">OpenStack Object <para><emphasis role="bold">OpenStack Object
@ -394,17 +480,18 @@ format="SVG" scale="60"/>
version 3.3, you can use Gluster to version 3.3, you can use Gluster to
consolidate your object storage and file consolidate your object storage and file
storage into one unified file and object storage into one unified file and object
storage solution, which is called Gluster UFO. storage solution, which is called Gluster For
Gluster UFO uses a customizes version of Swift OpenStack (GFO). GFO uses a customized version
that uses Gluster as the back-end.</para> of Swift that enables Gluster to be used as
<para>The main advantage of using Gluster UFO over the back-end storage.</para>
regular Swift is if you also want to support a <para>The main advantage of using GFO over regular
Swift is if you also want to support a
distributed file system, either to support distributed file system, either to support
shared storage live migration or to provide it shared storage live migration or to provide it
as a separate service to your end-users. If as a separate service to your end-users. If
you wish to manage your object and file you wish to manage your object and file
storage within a single system, you should storage within a single system, you should
consider Gluster UFO.</para> consider GFO.</para>
</listitem> </listitem>
<listitem> <listitem>
<para><emphasis role="bold">LVM</emphasis>. The <para><emphasis role="bold">LVM</emphasis>. The
@ -459,107 +546,16 @@ format="SVG" scale="60"/>
that your experience is primarily with that your experience is primarily with
Linux-based systems.</para> Linux-based systems.</para>
</listitem> </listitem>
<listitem>
<para><emphasis role="bold">Sheepdog</emphasis>. A
recent project that aims to provide block
storage for KVM-based instances, with support
for replication across hosts. We don't
recommend Sheepdog for a production cloud,
because its authors at NTT Labs consider
Sheepdog as an experimental technology.</para>
</listitem>
</itemizedlist> </itemizedlist>
</section> </section>
</section> </section>
<?hard-pagebreak?>
<section xml:id="openstack_object_storage">
<title>Notes on OpenStack Object Storage</title>
<para>OpenStack Object Storage provides a highly scalable,
highly available storage solution by relaxing some of the
constraints of traditional file systems. In designing and
procuring for such a cluster, it is important to
understand some key concepts about its operation.
Essentially, this type of storage is built on the idea
that all storage hardware fails, at every level, at some
point. Infrequently encountered failures that would
hamstring other storage systems, such as issues taking
down RAID cards, or entire servers are handled gracefully
with OpenStack Object Storage.</para>
<para>A good document describing the Object Storage
architecture is found within <link
xlink:title="OpenStack wiki"
xlink:href="http://docs.openstack.org/developer/swift/overview_architecture.html"
>the developer documentation</link>
(http://docs.openstack.org/developer/swift/overview_architecture.html)
- read this first. Once you have understood the
architecture, you should know what a proxy server does and
how zones work. However, some important points are often missed at
first glance.</para>
<para>When designing your cluster, you must consider
durability and availability. Understand that the
predominant source of these is the spread and placement of
your data, rather than the reliability of the hardware.
Consider the default value of the number of replicas,
which is 3. This means that before an object is
marked as having being written at least two copies exists
- in case a single server fails to write, the third copy
may or may not yet exist when the write operation
initially returns. Altering this number increases the
robustness of your data, but reduces the amount of storage
you have available. Next look at the placement of your
servers. Consider spreading them widely throughout your
data centre's network and power failure zones. Is a zone a
rack, a server or a disk?</para>
<para>Object Storage's network patterns might seem unfamiliar
at first. Consider these main traffic flows: <itemizedlist>
<listitem>
<para>Among <glossterm>object</glossterm>,
<glossterm>container</glossterm>, and
<glossterm>account
server</glossterm>s</para>
</listitem>
<listitem>
<para>Between those servers and the proxies</para>
</listitem>
<listitem>
<para>Between the proxies and your users</para>
</listitem>
</itemizedlist></para>
<para>Object Storage is very 'chatty' among servers hosting
data - even a small cluster does megabytes/second of
traffic, which is predominantly "Do you have the
object?"/"Yes I have the object!." Of course, if the
answer to the aforementioned question is negative or times
out, replication of the object begins.</para>
<para>Consider the scenario where an entire server fails, and
24 TB of data needs to be transferred "immediately" to
remain at three copies - this can put significant load on
the network.</para>
<para>Another oft forgotten fact is that when a new file is
being uploaded, the proxy server must write out as many
streams as there are replicas - giving a multiple of
network traffic. For a 3-replica cluster, 10Gbps in means
30Gbps out. Combining this with the previous high
bandwidth demands of replication is what results in the
recommendation that your private network is of
significantly higher bandwidth than your public need be.
Oh, and OpenStack Object Storage communicates internally
with unencrypted, unauthenticated rsync for performance &mdash;
you do want the private network to be private.</para>
<para>The remaining point on bandwidth is the public facing
portion. The swift-proxy service is stateless, which means that you
can easily add more and use http load-balancing methods to
share bandwidth and availability between them.</para>
<para>More proxies means more bandwidth, if your storage can
keep up.</para>
</section>
<section xml:id="storagedecisions_conclusion"> <section xml:id="storagedecisions_conclusion">
<title>Conclusion</title> <title>Conclusion</title>
<para>Hopefully you now have some considerations in mind and questions <para>Hopefully you now have some considerations in mind and
to ask your future cloud users about their storage use cases. As you questions to ask your future cloud users about their
can see, your storage decisions will also influence your network design storage use cases. As you can see, your storage decisions
for performance and security needs. Continue with us to make more will also influence your network design for performance
informed decisions about your OpenStack cloud design.</para> and security needs. Continue with us to make more informed
decisions about your OpenStack cloud design.</para>
</section> </section>
</chapter> </chapter>
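
Reviewer note: the Object Storage paragraphs moved in this patch make three back-of-the-envelope claims. Extra replicas cost usable capacity, a proxy writes one stream per replica (10 Gbps in means 30 Gbps out for 3 replicas), and losing a 24 TB server triggers a large transfer. The sketch below only illustrates that arithmetic. The function names are hypothetical and not part of Swift. It assumes decimal units (1 TB = 10^12 bytes) and a single network link, and it ignores protocol overhead and competing traffic.

```python
# Rough capacity and bandwidth arithmetic for a replicated object store.
# All names here are illustrative assumptions, not Swift APIs.


def usable_capacity_tb(raw_tb: float, replicas: int = 3) -> float:
    """Usable capacity when every object is stored `replicas` times."""
    return raw_tb / replicas


def proxy_egress_gbps(ingress_gbps: float, replicas: int = 3) -> float:
    """Traffic a proxy sends to storage nodes: one stream per replica."""
    return ingress_gbps * replicas


def rereplication_hours(data_tb: float, link_gbps: float) -> float:
    """Lower bound on the time to copy `data_tb` over one `link_gbps` link."""
    bits = data_tb * 1e12 * 8          # decimal terabytes to bits
    seconds = bits / (link_gbps * 1e9)  # link speed in bits per second
    return seconds / 3600


if __name__ == "__main__":
    print("usable TB from 300 TB raw:", usable_capacity_tb(300))
    print("proxy egress for 10 Gbps in:", proxy_egress_gbps(10))
    print("hours to re-replicate 24 TB at 10 Gbps:", rereplication_hours(24, 10))
```

Under these assumptions, re-replicating 24 TB takes more than five hours even on a dedicated 10 Gbps link. That is consistent with the text's point that the private network should be sized well above the public one.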