O'Reilly Edit: Changes to the Storage Chapter
This patch addresses the comments made during the latest round of edits from O'Reilly.

Change-Id: I7deaceacd319775c9960377074728538aa0b0314
parent afe62516ca
commit 96b265a87c
@ -34,24 +34,112 @@ format="SVG" scale="60"/>
|
|||||||
<para>Today, OpenStack clouds explicitly support two types of
|
<para>Today, OpenStack clouds explicitly support two types of
|
||||||
persistent storage: <emphasis>object storage</emphasis>
|
persistent storage: <emphasis>object storage</emphasis>
|
||||||
and <emphasis>block storage</emphasis>.</para></section>
|
and <emphasis>block storage</emphasis>.</para></section>
|
||||||
|
<section xml:id="persistent_storage">
|
||||||
|
<title>Persistent Storage</title>
|
||||||
|
<para>Persistent storage means that the storage resource outlives any
|
||||||
|
other resource and is always available, regardless of the state of a
|
||||||
|
running instance.</para>
|
||||||
<section xml:id="object_storage">
|
<section xml:id="object_storage">
|
||||||
<title>Object Storage</title>
|
<title>Object Storage</title>
|
||||||
<para>With object storage, users access binary objects
|
<para>With object storage, users access binary objects
|
||||||
through a REST API. You may be familiar with Amazon
|
through a REST API. You may be familiar with Amazon
|
||||||
S3, which is a well-known example of an object storage
|
S3, which is a well-known example of an object storage
|
||||||
system. If your intended users need to archive or
|
system. Object storage is implemented in OpenStack by
|
||||||
manage large datasets, you want to provide them with
|
the OpenStack Object Storage (swift) project. If your
|
||||||
object storage. In addition, OpenStack can store your
|
intended users need to archive or manage large
|
||||||
virtual machine (VM) images inside of an object
|
datasets, you want to provide them with object
|
||||||
storage system, as an alternative to storing the
|
storage. In addition, OpenStack can store your virtual
|
||||||
images on a file system.</para>
|
machine (VM) images inside of an object storage
|
||||||
|
system, as an alternative to storing the images on a
|
||||||
|
file system.</para>
|
||||||
|
<para>OpenStack Object Storage provides a highly scalable,
|
||||||
|
highly available storage solution by relaxing some of the
|
||||||
|
constraints of traditional file systems. In designing and
|
||||||
|
procuring for such a cluster, it is important to
|
||||||
|
understand some key concepts about its operation.
|
||||||
|
Essentially, this type of storage is built on the idea
|
||||||
|
that all storage hardware fails, at every level, at some
|
||||||
|
point. Infrequently encountered failures that would
|
||||||
|
hamstring other storage systems, such as issues taking
|
||||||
|
down RAID cards or entire servers, are handled gracefully
|
||||||
|
with OpenStack Object Storage.</para>
|
||||||
|
<para>A good document describing the Object Storage
|
||||||
|
architecture is found within <link
|
||||||
|
xlink:title="OpenStack wiki"
|
||||||
|
xlink:href="http://docs.openstack.org/developer/swift/overview_architecture.html"
|
||||||
|
>the developer documentation</link>
|
||||||
|
(http://docs.openstack.org/developer/swift/overview_architecture.html)
|
||||||
|
- read this first. Once you have understood the
|
||||||
|
architecture, you should know what a proxy server does and
|
||||||
|
how zones work. However, some important points are often
|
||||||
|
missed at first glance.</para>
|
||||||
|
<para>When designing your cluster, you must consider
|
||||||
|
durability and availability. Understand that the
|
||||||
|
predominant source of these is the spread and placement of
|
||||||
|
your data, rather than the reliability of the hardware.
|
||||||
|
Consider the default value of the number of replicas,
|
||||||
|
which is 3. This means that before an object is marked as
|
||||||
|
having been written, at least two copies exist - in case
|
||||||
|
a single server fails to write, the third copy may or may
|
||||||
|
not yet exist when the write operation initially returns.
|
||||||
|
Raising this number increases the robustness of your
|
||||||
|
data, but reduces the amount of storage you have
|
||||||
|
available. Next, look at the placement of your servers.
|
||||||
|
Consider spreading them widely throughout your data
|
||||||
|
centre's network and power failure zones. Is a zone a
|
||||||
|
rack, a server or a disk?</para>
|
||||||
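A minimal sketch of the replica arithmetic in the paragraph above. It assumes a simple-majority write quorum, which matches the "at least two copies" statement for three replicas, and it ignores file system overhead. The function names and the 120 TB raw capacity are illustrative assumptions, not part of OpenStack Object Storage.

```python
def write_quorum(replicas: int) -> int:
    """Copies that must exist before a write is reported as successful.

    Assumption: simple majority of the replica count.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return replicas // 2 + 1


def usable_capacity_tb(raw_tb: float, replicas: int) -> float:
    """Usable capacity when every object is stored `replicas` times."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return raw_tb / replicas


if __name__ == "__main__":
    raw_tb = 120.0  # hypothetical raw disk capacity of the cluster
    for replicas in (3, 4, 5):
        print(replicas, write_quorum(replicas), usable_capacity_tb(raw_tb, replicas))
```

With the default of three replicas, a write needs two copies before it returns, and one third of the raw capacity is usable.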
|
<para>Object Storage's network patterns might seem unfamiliar
|
||||||
|
at first. Consider these main traffic flows: <itemizedlist>
|
||||||
|
<listitem>
|
||||||
|
<para>Among <glossterm>object</glossterm>,
|
||||||
|
<glossterm>container</glossterm>, and
|
||||||
|
<glossterm>account
|
||||||
|
server</glossterm>s</para>
|
||||||
|
</listitem>
|
||||||
|
<listitem>
|
||||||
|
<para>Between those servers and the proxies</para>
|
||||||
|
</listitem>
|
||||||
|
<listitem>
|
||||||
|
<para>Between the proxies and your users</para>
|
||||||
|
</listitem>
|
||||||
|
</itemizedlist></para>
|
||||||
|
<para>Object Storage is very 'chatty' among servers hosting
|
||||||
|
data - even a small cluster does megabytes/second of
|
||||||
|
traffic, which is predominantly "Do you have the
|
||||||
|
object?"/"Yes, I have the object!" Of course, if the
|
||||||
|
answer to the aforementioned question is negative or times
|
||||||
|
out, replication of the object begins.</para>
|
||||||
|
<para>Consider the scenario where an entire server fails, and
|
||||||
|
24 TB of data needs to be transferred "immediately" to
|
||||||
|
remain at three copies - this can put significant load on
|
||||||
|
the network.</para>
|
||||||
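To put a rough number on the server-failure scenario above, the sketch below estimates how long it takes to copy a given amount of data over a link. The link speeds and the `utilisation` parameter are assumptions for illustration. Real replication is spread across many servers and competes with client traffic, so treat the result as an order-of-magnitude figure only.

```python
def transfer_hours(data_tb: float, link_gbps: float, utilisation: float = 1.0) -> float:
    """Hours needed to move `data_tb` terabytes over a `link_gbps` link.

    Uses decimal units (1 TB = 1e12 bytes, 1 Gbps = 1e9 bits per second).
    `utilisation` is the assumed fraction of the link available to replication.
    """
    if link_gbps <= 0 or not 0 < utilisation <= 1:
        raise ValueError("link_gbps must be positive and utilisation in (0, 1]")
    bits = data_tb * 1e12 * 8
    seconds = bits / (link_gbps * 1e9 * utilisation)
    return seconds / 3600


if __name__ == "__main__":
    for gbps in (1, 10):
        print(f"24 TB over {gbps} Gbps: {transfer_hours(24, gbps):.1f} hours")
```

Under these assumptions, 24 TB takes a little over five hours on a fully available 10 Gbps link and more than two days on 1 Gbps.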
|
<para>Another oft-forgotten fact is that when a new file is
|
||||||
|
being uploaded, the proxy server must write out as many
|
||||||
|
streams as there are replicas - giving a multiple of
|
||||||
|
network traffic. For a 3-replica cluster, 10 Gbps in means
|
||||||
|
30 Gbps out. Combining this with the previous high
|
||||||
|
bandwidth demands of replication is what results in the
|
||||||
|
recommendation that your private network have
|
||||||
|
significantly higher bandwidth than your public network.
|
||||||
|
Oh, and OpenStack Object Storage communicates internally
|
||||||
|
with unencrypted, unauthenticated rsync for performance
|
||||||
|
— you do want the private network to be
|
||||||
|
private.</para>
|
||||||
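The write fan-out described above is a single multiplication. The sketch below restates it so the figure can be recomputed for other replica counts. The function name is a hypothetical helper, not an OpenStack API.

```python
def proxy_egress_gbps(ingress_gbps: float, replicas: int = 3) -> float:
    """Private-network traffic a proxy generates for a given upload rate.

    Each uploaded byte is streamed once per replica, so the outbound rate
    is the inbound rate multiplied by the replica count.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return ingress_gbps * replicas


if __name__ == "__main__":
    print(proxy_egress_gbps(10, replicas=3))
```

This covers upload traffic only. Replication traffic after a failure comes on top of it, which is why the private network needs the extra headroom.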
|
<para>The remaining point on bandwidth is the public facing
|
||||||
|
portion. The swift-proxy service is stateless, which means
|
||||||
|
that you can easily add more and use HTTP load-balancing
|
||||||
|
methods to share bandwidth and availability between
|
||||||
|
them.</para>
|
||||||
|
<para>More proxies means more bandwidth, if your storage can
|
||||||
|
keep up.</para>
|
||||||
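One way to read "more proxies means more bandwidth, if your storage can keep up" is as a minimum of two limits. The sketch below is a simplified model under that assumption. The per-proxy and storage figures are made up for illustration, and real clusters have further bottlenecks such as the load balancer itself.

```python
def cluster_public_gbps(proxies: int, per_proxy_gbps: float, storage_gbps: float) -> float:
    """Public-facing bandwidth, limited by the proxies or by the storage tier.

    Assumes the load balancer spreads requests evenly across stateless proxies.
    """
    if proxies < 0:
        raise ValueError("proxies must not be negative")
    return min(proxies * per_proxy_gbps, storage_gbps)


if __name__ == "__main__":
    # Hypothetical figures: 10 Gbps per proxy, 50 Gbps from the storage tier.
    for n in (2, 4, 8):
        print(n, cluster_public_gbps(n, 10.0, 50.0))
```

In this model, adding proxies beyond the fifth gives no further gain, because the storage tier has become the limit.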
</section>
|
</section>
|
||||||
<section xml:id="block_storage">
|
<section xml:id="block_storage">
|
||||||
<title>Block Storage</title>
|
<title>Block Storage</title>
|
||||||
<para>Block storage (sometimes referred to as volume
|
<para>Block storage (sometimes referred to as volume
|
||||||
storage) exposes a block device to the user. Users
|
storage) provides users with access to block storage
|
||||||
interact with block storage by attaching volumes to
|
devices. Users interact with block storage by
|
||||||
their running VM instances.</para>
|
attaching volumes to their running VM
|
||||||
|
instances.</para>
|
||||||
<para>These volumes are persistent: they can be detached
|
<para>These volumes are persistent: they can be detached
|
||||||
from one instance and re-attached to another, and the
|
from one instance and re-attached to another, and the
|
||||||
data remains intact. Block storage is implemented in
|
data remains intact. Block storage is implemented in
|
||||||
@ -76,6 +164,7 @@ format="SVG" scale="60"/>
|
|||||||
utilizes QEMU's file-based virtual machines stored in
|
utilizes QEMU's file-based virtual machines stored in
|
||||||
<code>/var/lib/nova/instances</code>.</para>
|
<code>/var/lib/nova/instances</code>.</para>
|
||||||
</section>
|
</section>
|
||||||
|
</section>
|
||||||
<section xml:id="storage_concepts">
|
<section xml:id="storage_concepts">
|
||||||
<title>OpenStack Storage Concepts</title>
|
<title>OpenStack Storage Concepts</title>
|
||||||
<table xml:id="openstack_storage" rules="all">
|
<table xml:id="openstack_storage" rules="all">
|
||||||
@ -149,7 +238,8 @@ format="SVG" scale="60"/>
|
|||||||
</tbody>
|
</tbody>
|
||||||
</table>
|
</table>
|
||||||
<section xml:id="file_level_storage">
|
<section xml:id="file_level_storage">
|
||||||
<title>File-level Storage</title>
|
<!-- FIXME: change to an aside -->
|
||||||
|
<title>File-level Storage (for Live Migration)</title>
|
||||||
<para>With file-level storage, users access stored data
|
<para>With file-level storage, users access stored data
|
||||||
using the operating system's file system interface.
|
using the operating system's file system interface.
|
||||||
Most users, if they have used a network storage
|
Most users, if they have used a network storage
|
||||||
@ -169,15 +259,16 @@ format="SVG" scale="60"/>
|
|||||||
<?hard-pagebreak?>
|
<?hard-pagebreak?>
|
||||||
<section xml:id="storage_backends">
|
<section xml:id="storage_backends">
|
||||||
<title>Choosing Storage Back-ends</title>
|
<title>Choosing Storage Back-ends</title>
|
||||||
<para>Users will indicate different needs for their cloud use cases.
|
<para>Users will indicate different needs for their cloud use
|
||||||
Some may need fast access to many objects that do not change often,
|
cases. Some may need fast access to many objects that do
|
||||||
or they want to set a Time To Live (TTL) value on a file. Others may only
|
not change often, or they want to set a Time To Live (TTL)
|
||||||
access storage that is mounted with the file system itself, but want
|
value on a file. Others may only access storage that is
|
||||||
it to be replicated instantly when starting a new instance. For
|
mounted with the file system itself, but want it to be
|
||||||
other systems, ephemeral storage that is released when a VM attached
|
replicated instantly when starting a new instance. For
|
||||||
to it is shut down. When you select <glossterm>storage
|
other systems, ephemeral storage, which is released when the
|
||||||
to it is shut down. When you select <glossterm>storage
|
VM attached to it is shut down, is preferable. When you select
|
||||||
questions on behalf of your users:</para>
|
<glossterm>storage back-end</glossterm>s, ask the
|
||||||
|
following questions on behalf of your users:</para>
|
||||||
<itemizedlist role="compact">
|
<itemizedlist role="compact">
|
||||||
<listitem>
|
<listitem>
|
||||||
<para>Do my users need block storage?</para>
|
<para>Do my users need block storage?</para>
|
||||||
@ -263,12 +354,6 @@ format="SVG" scale="60"/>
|
|||||||
<td><para>&CHECK;</para></td>
|
<td><para>&CHECK;</para></td>
|
||||||
<td><para> </para></td>
|
<td><para> </para></td>
|
||||||
</tr>
|
</tr>
|
||||||
<tr>
|
|
||||||
<td><para>Sheepdog</para></td>
|
|
||||||
<td><para> </para></td>
|
|
||||||
<td><para>experimental</para></td>
|
|
||||||
<td><para> </para></td>
|
|
||||||
</tr>
|
|
||||||
</tbody>
|
</tbody>
|
||||||
</table>
|
</table>
|
||||||
<para>* This list of open-source file-level shared storage
|
<para>* This list of open-source file-level shared storage
|
||||||
@ -315,10 +400,11 @@ format="SVG" scale="60"/>
|
|||||||
</itemizedlist>
|
</itemizedlist>
|
||||||
<section xml:id="commodity_storage_backends">
|
<section xml:id="commodity_storage_backends">
|
||||||
<title>Commodity Storage Back-end Technologies</title>
|
<title>Commodity Storage Back-end Technologies</title>
|
||||||
<para>This section provides a high-level overview of the differences
|
<para>This section provides a high-level overview of the
|
||||||
among the different commodity storage back-end technologies.
|
differences among the different commodity storage
|
||||||
Depending on your cloud user's needs, you can implement one or
|
back-end technologies. Depending on your cloud users'
|
||||||
many of these technologies in different combinations.</para>
|
needs, you can implement one or many of these
|
||||||
|
technologies in different combinations.</para>
|
||||||
<itemizedlist role="compact">
|
<itemizedlist role="compact">
|
||||||
<listitem>
|
<listitem>
|
||||||
<para><emphasis role="bold">OpenStack Object
|
<para><emphasis role="bold">OpenStack Object
|
||||||
@ -394,17 +480,18 @@ format="SVG" scale="60"/>
|
|||||||
version 3.3, you can use Gluster to
|
version 3.3, you can use Gluster to
|
||||||
consolidate your object storage and file
|
consolidate your object storage and file
|
||||||
storage into one unified file and object
|
storage into one unified file and object
|
||||||
storage solution, which is called Gluster UFO.
|
storage solution, which is called Gluster For
|
||||||
Gluster UFO uses a customizes version of Swift
|
OpenStack (GFO). GFO uses a customized version
|
||||||
that uses Gluster as the back-end.</para>
|
of Swift that enables Gluster to be used as
|
||||||
<para>The main advantage of using Gluster UFO over
|
the back-end storage.</para>
|
||||||
regular Swift is if you also want to support a
|
<para>The main advantage of using GFO over regular
|
||||||
|
Swift is if you also want to support a
|
||||||
distributed file system, either to support
|
distributed file system, either to support
|
||||||
shared storage live migration or to provide it
|
shared storage live migration or to provide it
|
||||||
as a separate service to your end-users. If
|
as a separate service to your end-users. If
|
||||||
you wish to manage your object and file
|
you wish to manage your object and file
|
||||||
storage within a single system, you should
|
storage within a single system, you should
|
||||||
consider Gluster UFO.</para>
|
consider GFO.</para>
|
||||||
</listitem>
|
</listitem>
|
||||||
<listitem>
|
<listitem>
|
||||||
<para><emphasis role="bold">LVM</emphasis>. The
|
<para><emphasis role="bold">LVM</emphasis>. The
|
||||||
@ -459,107 +546,16 @@ format="SVG" scale="60"/>
|
|||||||
that your experience is primarily with
|
that your experience is primarily with
|
||||||
Linux-based systems.</para>
|
Linux-based systems.</para>
|
||||||
</listitem>
|
</listitem>
|
||||||
<listitem>
|
|
||||||
<para><emphasis role="bold">Sheepdog</emphasis>. A
|
|
||||||
recent project that aims to provide block
|
|
||||||
storage for KVM-based instances, with support
|
|
||||||
for replication across hosts. We don't
|
|
||||||
recommend Sheepdog for a production cloud,
|
|
||||||
because its authors at NTT Labs consider
|
|
||||||
Sheepdog as an experimental technology.</para>
|
|
||||||
</listitem>
|
|
||||||
</itemizedlist>
|
</itemizedlist>
|
||||||
</section>
|
</section>
|
||||||
</section>
|
</section>
|
||||||
<?hard-pagebreak?>
|
|
||||||
<section xml:id="openstack_object_storage">
|
|
||||||
<title>Notes on OpenStack Object Storage</title>
|
|
||||||
<para>OpenStack Object Storage provides a highly scalable,
|
|
||||||
highly available storage solution by relaxing some of the
|
|
||||||
constraints of traditional file systems. In designing and
|
|
||||||
procuring for such a cluster, it is important to
|
|
||||||
understand some key concepts about its operation.
|
|
||||||
Essentially, this type of storage is built on the idea
|
|
||||||
that all storage hardware fails, at every level, at some
|
|
||||||
point. Infrequently encountered failures that would
|
|
||||||
hamstring other storage systems, such as issues taking
|
|
||||||
down RAID cards, or entire servers are handled gracefully
|
|
||||||
with OpenStack Object Storage.</para>
|
|
||||||
<para>A good document describing the Object Storage
|
|
||||||
architecture is found within <link
|
|
||||||
xlink:title="OpenStack wiki"
|
|
||||||
xlink:href="http://docs.openstack.org/developer/swift/overview_architecture.html"
|
|
||||||
>the developer documentation</link>
|
|
||||||
(http://docs.openstack.org/developer/swift/overview_architecture.html)
|
|
||||||
- read this first. Once you have understood the
|
|
||||||
architecture, you should know what a proxy server does and
|
|
||||||
how zones work. However, some important points are often missed at
|
|
||||||
first glance.</para>
|
|
||||||
<para>When designing your cluster, you must consider
|
|
||||||
durability and availability. Understand that the
|
|
||||||
predominant source of these is the spread and placement of
|
|
||||||
your data, rather than the reliability of the hardware.
|
|
||||||
Consider the default value of the number of replicas,
|
|
||||||
which is 3. This means that before an object is
|
|
||||||
marked as having being written at least two copies exists
|
|
||||||
- in case a single server fails to write, the third copy
|
|
||||||
may or may not yet exist when the write operation
|
|
||||||
initially returns. Altering this number increases the
|
|
||||||
robustness of your data, but reduces the amount of storage
|
|
||||||
you have available. Next look at the placement of your
|
|
||||||
servers. Consider spreading them widely throughout your
|
|
||||||
data centre's network and power failure zones. Is a zone a
|
|
||||||
rack, a server or a disk?</para>
|
|
||||||
<para>Object Storage's network patterns might seem unfamiliar
|
|
||||||
at first. Consider these main traffic flows: <itemizedlist>
|
|
||||||
<listitem>
|
|
||||||
<para>Among <glossterm>object</glossterm>,
|
|
||||||
<glossterm>container</glossterm>, and
|
|
||||||
<glossterm>account
|
|
||||||
server</glossterm>s</para>
|
|
||||||
</listitem>
|
|
||||||
<listitem>
|
|
||||||
<para>Between those servers and the proxies</para>
|
|
||||||
</listitem>
|
|
||||||
<listitem>
|
|
||||||
<para>Between the proxies and your users</para>
|
|
||||||
|
|
||||||
</listitem>
|
|
||||||
</itemizedlist></para>
|
|
||||||
<para>Object Storage is very 'chatty' among servers hosting
|
|
||||||
data - even a small cluster does megabytes/second of
|
|
||||||
traffic, which is predominantly "Do you have the
|
|
||||||
object?"/"Yes I have the object!." Of course, if the
|
|
||||||
answer to the aforementioned question is negative or times
|
|
||||||
out, replication of the object begins.</para>
|
|
||||||
<para>Consider the scenario where an entire server fails, and
|
|
||||||
24 TB of data needs to be transferred "immediately" to
|
|
||||||
remain at three copies - this can put significant load on
|
|
||||||
the network.</para>
|
|
||||||
<para>Another oft forgotten fact is that when a new file is
|
|
||||||
being uploaded, the proxy server must write out as many
|
|
||||||
streams as there are replicas - giving a multiple of
|
|
||||||
network traffic. For a 3-replica cluster, 10Gbps in means
|
|
||||||
30Gbps out. Combining this with the previous high
|
|
||||||
bandwidth demands of replication is what results in the
|
|
||||||
recommendation that your private network is of
|
|
||||||
significantly higher bandwidth than your public need be.
|
|
||||||
Oh, and OpenStack Object Storage communicates internally
|
|
||||||
with unencrypted, unauthenticated rsync for performance —
|
|
||||||
you do want the private network to be private.</para>
|
|
||||||
<para>The remaining point on bandwidth is the public facing
|
|
||||||
portion. The swift-proxy service is stateless, which means that you
|
|
||||||
can easily add more and use http load-balancing methods to
|
|
||||||
share bandwidth and availability between them.</para>
|
|
||||||
<para>More proxies means more bandwidth, if your storage can
|
|
||||||
keep up.</para>
|
|
||||||
</section>
|
|
||||||
<section xml:id="storagedecisions_conclusion">
|
<section xml:id="storagedecisions_conclusion">
|
||||||
<title>Conclusion</title>
|
<title>Conclusion</title>
|
||||||
<para>Hopefully you now have some considerations in mind and questions
|
<para>Hopefully you now have some considerations in mind and
|
||||||
to ask your future cloud users about their storage use cases. As you
|
questions to ask your future cloud users about their
|
||||||
can see, your storage decisions will also influence your network design
|
storage use cases. As you can see, your storage decisions
|
||||||
for performance and security needs. Continue with us to make more
|
will also influence your network design for performance
|
||||||
informed decisions about your OpenStack cloud design.</para>
|
and security needs. Continue with us to make more informed
|
||||||
|
decisions about your OpenStack cloud design.</para>
|
||||||
</section>
|
</section>
|
||||||
</chapter>
|
</chapter>
|
||||||
|