Adds O'Reilly content with edits since May 5, 2014

- Also includes these changes:
  - GlusterFS supports Block Storage
  - Fix options for nova flavor-access-list
  - Fixes bug 1316040 - reference a generic SQL database
  - "once instance" -> "one instance"
  - Recover sheepdog doc

Change-Id: I81f57c189bb5138c89f0208404f31c4453ec0aac
Anne Gentle 2014-05-27 17:00:29 -05:00
parent 97d17812e9
commit 1b7058a15c
84 changed files with 23894 additions and 16909 deletions


@@ -13,12 +13,12 @@
xml:id="app_crypt" label="B">
<title>Tales From the Cryp^H^H^H^H Cloud</title>
<para>Herein lies a selection of tales from OpenStack cloud operators.
Read and learn from their wisdom.</para>
<para>Herein lies a selection of tales from OpenStack cloud operators. Read,
and learn from their wisdom.</para>
<section xml:id="double_vlan">
<title>Double VLAN</title>
<para>I was on-site in Kelowna, British Columbia, Canada,
<para>I was on-site in Kelowna, British Columbia, Canada
setting up a new OpenStack cloud. The deployment was fully
automated: Cobbler deployed the OS on the bare metal,
bootstrapped it, and Puppet took over from there. I had
@@ -28,56 +28,55 @@
from my hotel. In the background, I was fooling around on
the new cloud. I launched an instance and logged in.
Everything looked fine. Out of boredom, I ran
<command>ps aux</command>, and
<command>ps aux</command> and
all of the sudden the instance locked up.</para>
<para>Thinking it was just a one-off issue, I terminated the
instance and launched a new one. By then, the conference
call ended, and I was off to the data center.</para>
call ended and I was off to the data center.</para>
<para>At the data center, I was finishing up some tasks and
remembered the lock-up. I logged in to the new instance and
remembered the lock-up. I logged into the new instance and
ran <command>ps aux</command> again. It worked. Phew. I decided
to run it one
more time. It locked up. WTF.</para>
<para>After reproducing the problem several times, I came to
the unfortunate conclusion that this cloud did indeed have
a problem. Even worse, my time was up in Kelowna, and I had
to return to Calgary.</para>
a problem. Even worse, my time was up in Kelowna and I had
to return back to Calgary.</para>
<para>Where do you even begin troubleshooting something like
this? An instance just randomly locks when a command is
issued. Is it the image? Nope&mdash;it happens on all images.
Is it the compute node? Nope&mdash;all nodes. Is the instance
issued. Is it the image? Nope—it happens on all images.
Is it the compute node? Nope—all nodes. Is the instance
locked up? No! New SSH connections work just fine!</para>
<para>We reached out for help. A networking engineer suggested
it was an MTU issue. Great! MTU! Something to go on!
What's MTU, and why would it cause a problem?</para>
<para>MTU is <emphasis role="italic">maximum transmission
unit</emphasis>. It specifies the
What's MTU and why would it cause a problem?</para>
<para>MTU is maximum transmission unit. It specifies the
maximum number of bytes that the interface accepts for
each packet. If two interfaces have two different MTUs,
bytes might get chopped off and weird things happen&mdash;such
as random session lockups.</para>
bytes might get chopped off and weird things happen --
such as random session lockups.</para>
<note>
<para>Not all packets have a size of 1,500. Running the <command>ls</command>
command over SSH might create only a single packet,
less than 1,500 bytes. However, running a command with
heavy output, such as <command>ps aux</command>,
requires several packets of 1,500 bytes.</para>
<para>Not all packets have a size of 1500. Running the <command>ls</command>
command over SSH might only create a single packets
less than 1500 bytes. However, running a command with
heavy output, such as <command>ps aux</command>
requires several packets of 1500 bytes.</para>
</note>
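For readers who want to check this on a live system, the interface MTU and the path MTU can be probed from a shell; the interface and host names below are placeholders, assuming a Linux host with iproute2 and iputils:

    $ ip link show eth0 | grep -o 'mtu [0-9]*'   # report the interface's configured MTU
    $ ping -M do -s 1472 -c 3 example-host       # 1472-byte payload + 28 bytes of headers = 1500; failures suggest a smaller path MTU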
<para>OK, so where is the MTU issue coming from? Why haven't
we seen this in any other deployment? What's new in this
situation? Well, new data center, new uplink, new
switches, new model of switches, new servers, first time
using this model of servers… so, basically, everything was
using this model of servers… so, basically everything was
new. Wonderful. We toyed around with raising the MTU at
various areas: the switches, the NICs on the compute
nodes, the virtual NICs in the instances; we even had the
nodes, the virtual NICs in the instances, we even had the
data center raise the MTU for our uplink interface. Some
changes worked, some didn't. This line of troubleshooting
didn't feel right, though. We shouldn't have to be
changing the MTU in these areas.</para>
<para>As a last resort, our network admin (Alvaro) and I
<para>As a last resort, our network admin (Alvaro) and myself
sat down with four terminal windows, a pencil, and a piece
of paper. In one window, we ran <command>ping</command>. In the second
of paper. In one window, we ran ping. In the second
window, we ran <command>tcpdump</command> on the cloud
controller. In the third, <command>tcpdump</command> on
the compute node. And the forth had <command>tcpdump</command>
@@ -99,11 +98,11 @@
the outside hits the cloud controller, it should not be
configured with a VLAN. We verified this as true. When the
packet went from the cloud controller to the compute node,
it should have a VLAN only if it was destined for an
it should only have a VLAN if it was destined for an
instance. This was still true. When the ping reply was
sent from the instance, it should be in a VLAN. True. When
it came back to the cloud controller and on its way out to
the public Internet, it should no longer have a VLAN.
the public internet, it should no longer have a VLAN.
False. Uh oh. It looked as though the VLAN part of the
packet was not being removed.</para>
<para>That made no sense.</para>
@@ -117,7 +116,7 @@
<para>"Hey Alvaro, can you run a VLAN on top of a
VLAN?"</para>
<para>"If you did, you'd add an extra 4 bytes to the
packet."</para>
packet&hellip;"</para>
<para>Then it all made sense…
<screen><userinput><prompt>$</prompt> grep vlan_interface /etc/nova/nova.conf</userinput>
<computeroutput>vlan_interface=vlan20</computeroutput></screen>
@@ -125,26 +124,26 @@
<para>In <filename>nova.conf</filename>, <code>vlan_interface</code>
specifies what interface OpenStack should attach all VLANs
to. The correct setting should have been:
<code>vlan_interface=bond0</code>.</para>
<para>This would be the server's bonded NIC.</para>
<para>The vlan20 setting is the VLAN that the data center gave us for
outgoing public Internet access. It's a correct VLAN and
<programlisting>vlan_interface=bond0</programlisting>.</para>
<para>As this would be the server's bonded NIC.</para>
<para>vlan20 is the VLAN that the data center gave us for
outgoing public internet access. It's a correct VLAN and
is also attached to bond0.</para>
<para>By mistake, I configured OpenStack to attach all tenant
VLANs to vlan20 instead of bond0, thereby stacking one VLAN
on top of another. This then added an extra 4 bytes to
each packet, which caused a packet of 1,504 bytes to be sent
out, which would cause problems when it arrived at an
interface that accepted 1,500!</para>
VLANs to vlan20 instead of bond0 thereby stacking one VLAN
on top of another which then added an extra 4 bytes to
each packet which cause a packet of 1504 bytes to be sent
out which would cause problems when it arrived at an
interface that only accepted 1500!</para>
<para>As soon as this setting was fixed, everything
worked.</para>
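One way to confirm double tagging like this, assuming tcpdump is available on the node and bond0 is the physical interface, is to print link-level headers and inspect the 802.1Q tags on outgoing frames:

    # tcpdump -e -n -i bond0 vlan    # -e prints the Ethernet header, including any VLAN tag(s)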
</section>
<section xml:id="issue">
<title>The Issue</title>
<title>"The Issue"</title>
<para>At the end of August 2012, a post-secondary school in
Alberta, Canada, migrated its infrastructure to an
Alberta, Canada migrated its infrastructure to an
OpenStack cloud. As luck would have it, within the first
day or two of it running, one of its servers just
day or two of it running, one of their servers just
disappeared from the network. Blip. Gone.</para>
<para>After restarting the instance, everything was back up
and running. We reviewed the logs and saw that at some
@@ -170,7 +169,7 @@
and sends a response.</para>
</listitem>
<listitem>
<para>Instance "ignores" the response and resends the
<para>Instance "ignores" the response and re-sends the
renewal request.</para>
</listitem>
<listitem>
@@ -192,44 +191,44 @@
</listitem>
</orderedlist>
<para>With this information in hand, we were sure that the
problem had to do with DHCP. We thought that, for some
reason, the instance wasn't getting a new IP address, and
problem had to do with DHCP. We thought that for some
reason, the instance wasn't getting a new IP address and
with no IP, it shut itself off from the network.</para>
<para>A quick Google search turned up this: <link
xlink:href="https://lists.launchpad.net/openstack/msg11696.html"
>DHCP lease errors in VLAN mode</link>
(https://lists.launchpad.net/openstack/msg11696.html),
(https://lists.launchpad.net/openstack/msg11696.html)
which further supported our DHCP theory.</para>
<para>An initial idea was to just increase the lease time. If
the instance renewed only once every week, the chances of
the instance only renewed once every week, the chances of
this problem happening would be tremendously smaller than
every minute. This didn't solve the problem, though. It
was just covering the problem up.</para>
<para>We decided to have <command>tcpdump</command> run on this
instance and see whether we could catch it in action again.
instance and see if we could catch it in action again.
Sure enough, we did.</para>
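For a capture focused on the lease traffic itself, a filter on the DHCP ports keeps the noise down; the interface name is a placeholder:

    # tcpdump -n -i eth0 port 67 or port 68    # DHCP server (67) and client (68) traffic only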
<para>The <command>tcpdump</command> looked very, very weird. In
short, it looked as though network communication stopped
before the instance tried to renew its IP. Since there is
so much DHCP chatter from a one-minute lease, it's very
so much DHCP chatter from a one minute lease, it's very
hard to confirm it, but even with only milliseconds
difference between packets, if one packet arrives first,
it arrives first, and if that packet reported network
it arrived first, and if that packet reported network
issues, then it had to have happened before DHCP.</para>
<para>Additionally, the instance in question was responsible
for a very, very large backup job each night. While "the
<para>Additionally, this instance in question was responsible
for a very, very large backup job each night. While "The
Issue" (as we were now calling it) didn't happen exactly
when the backup happened, it was close enough (a few
hours) that we couldn't ignore it.</para>
<para>More days go by and we catch the Issue in action more
and more. We find that dhclient is not running after the
<para>Further days go by and we catch The Issue in action more
and more. We find that dhclient is not running after The
Issue happens. Now we're back to thinking it's a DHCP
issue. Running <command>/etc/init.d/networking restart</command>
issue. Running <filename>/etc/init.d/networking</filename> restart
brings everything back up and running.</para>
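A minimal check for that state on an Ubuntu 12.04-era guest like the one described might look like:

    $ ps aux | grep '[d]hclient'         # no output means the DHCP client has died
    # /etc/init.d/networking restart     # the recovery step mentioned above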
<para>Ever have one of those days where all of the sudden you
get the Google results you were looking for? Well, that's
what happened here. I was looking for information on
dhclient and why it dies when it can't renew its lease, and
dhclient and why it dies when it can't renew its lease and
all of the sudden I found a bunch of OpenStack and dnsmasq
discussions that were identical to the problem we were
seeing!</para>
@@ -255,15 +254,15 @@
<para>It was funny to read the report. It was full of people
who had some strange network problem but didn't quite
explain it in the same way.</para>
<para>So it was a QEMU/KVM bug.</para>
<para>At the same time I found the bug report, a co-worker
was able to successfully reproduce the Issue! How? He used
<para>So it was a qemu/kvm bug.</para>
<para>At the same time of finding the bug report, a co-worker
was able to successfully reproduce The Issue! How? He used
<command>iperf</command> to spew a ton of bandwidth at an instance. Within 30
minutes, the instance just disappeared from the
network.</para>
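A rough sketch of that reproduction, assuming iperf (version 2) is installed in the guest and on a second machine; the address and duration are placeholders:

    instance$ iperf -s                        # listen inside the guest
    other$ iperf -c 10.0.0.5 -t 1800 -P 4     # push sustained parallel traffic at it for 30 minutes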
<para>Armed with a patched QEMU and a way to reproduce, we set
out to see if we had finally solved the Issue. After 48
straight hours of hammering the instance with bandwidth,
<para>Armed with a patched qemu and a way to reproduce, we set
out to see if we've finally solved The Issue. After 48
hours straight of hammering the instance with bandwidth,
we were confident. The rest is history. You can search the
bug report for "joe" to find my comments and actual
tests.</para>
@@ -277,37 +276,34 @@
xlink:href="http://www.canarie.ca/en/dair-program/about"
>DAIR project</link>
(http://www.canarie.ca/en/dair-program/about). A few days
into production, a compute node locked up. Upon rebooting
into production, a compute node locks up. Upon rebooting
the node, I checked to see what instances were hosted on
that node so I could boot them on behalf of the customer.
Luckily, only one instance.</para>
<para>The <command>nova reboot</command> command wasn't working, so
I used <command>virsh</command>, but it immediately came back
with an error saying it was unable to find the backing
disk. In this case, the backing disk is the glance image
disk. In this case, the backing disk is the Glance image
that is copied to
<filename>/var/lib/nova/instances/_base</filename> when the
image is used for the first time. Why couldn't it find it?
I checked the directory, and sure enough it was
I checked the directory and sure enough it was
gone.</para>
<para>I reviewed the <code>nova</code> database and saw the
instance's entry in the <code>nova.instances</code> table.
The image that the instance was using matched what
<command>virsh</command>
The image that the instance was using matched what virsh
was reporting, so no inconsistency there.</para>
<para>I checked glance and noticed that this image was a
<para>I checked Glance and noticed that this image was a
snapshot that the user created. At least that was good
news&mdash;this user would have been the only user
news—this user would have been the only user
affected.</para>
<para>Finally, I checked StackTach and reviewed the user's
events. They had created and deleted several snapshots&mdash;most
likely experimenting. Although the timestamps
didn't match up, my conclusion was that they launched
their instance and then deleted the snapshot and it was
somehow removed from
<filename>/var/lib/nova/instances/_base</filename>. None of
that made sense, but it was the best I could come up
with.</para>
<para>Finally, I checked StackTach and reviewed the user's events. They
had created and deleted several snapshots&mdash;most likely
experimenting. Although the timestamps didn't match up, my
conclusion was that they launched their instance and then deleted
the snapshot and it was somehow removed from
<filename>/var/lib/nova/instances/_base</filename>. None of that
made sense, but it was the best I could come up with.</para>
<para>It turns out the reason that this compute node locked up
was a hardware issue. We removed it from the DAIR cloud
and called Dell to have it serviced. Dell arrived and
@@ -315,7 +311,7 @@
different compute node was bumped and rebooted.
Great.</para>
<para>When this node fully booted, I ran through the same
scenario of seeing what instances were running so that I could
scenario of seeing what instances were running so I could
turn them back on. There were a total of four. Three
booted and one gave an error. It was the same error as
before: unable to find the backing disk. Seriously,
@@ -341,18 +337,18 @@
</para>
<para>Ah-hah! So OpenStack was deleting it. But why?</para>
<para>A feature was introduced in Essex to periodically check
and see whether there were any <code>_base</code> files not in use.
and see if there were any <code>_base</code> files not in use.
If there
were, nova would delete them. This idea sounds innocent
were, Nova would delete them. This idea sounds innocent
enough and has some good qualities to it. But how did this
feature end up turned on? It was disabled by default in
Essex. As it should be. It was <link
xlink:href="https://bugs.launchpad.net/nova/+bug/1029674"
>decided to enable it in Folsom</link>
>decided to be turned on in Folsom</link>
(https://bugs.launchpad.net/nova/+bug/1029674). I cannot
emphasize enough that:</para>
<para>
<emphasis>Actions that delete things should not be
<emphasis>Actions which delete things should not be
enabled by default.</emphasis>
</para>
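Operators who want to opt out of that periodic cleanup can check the relevant option, named remove_unused_base_images in the Folsom/Grizzly-era nova, in nova.conf:

    $ grep remove_unused_base_images /etc/nova/nova.conf
    remove_unused_base_images = False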
<para>Disk space is cheap these days. Data recovery is
@@ -376,14 +372,14 @@
have the opportunity to use "Valentine's Day Massacre"
again in a title.</para>
<para>This past Valentine's Day, I received an alert that a
compute node was no longer available in the
cloud&mdash;meaning,</para>
compute node was no longer available in the cloud
— meaning,
<screen><prompt>$</prompt><userinput>nova-manage service list</userinput></screen>
<para>showed this particular node with a status of
showed this particular node with a status of
<code>XXX</code>.</para>
<para>I logged in to the cloud controller and was able to both
<command>ping</command> and SSH into the problematic compute node, which
seemed very odd. Usually when I receive this type of alert,
<para>I logged into the cloud controller and was able to both
<command>ping</command> and SSH into the problematic compute node which
seemed very odd. Usually if I receive this type of alert,
the compute node has totally locked up and would be
inaccessible.</para>
<para>After a few minutes of troubleshooting, I saw the
@@ -415,14 +411,14 @@
connected to a separate switch, I thought that the chance
of a switch port dying on each switch at the same time was
quite improbable. I concluded that the 10gb dual port NIC
had died and needed to be replaced. I created a ticket for the
had died and needed replaced. I created a ticket for the
hardware support department at the data center where the
node was hosted. I felt lucky that this was a new node and
no one else was hosted on it yet.</para>
<para>An hour later I received the same alert, but for another
compute node. Crap. OK, now there's definitely a problem
going on. Just as with the original node, I was able to log
in by SSH. The bond0 NIC was DOWN, but the 1gb NIC was
going on. Just like the original node, I was able to log
in by SSH. The bond0 NIC was DOWN but the 1gb NIC was
active.</para>
<para>And the best part: the same user had just tried creating
a CentOS instance. What?</para>
@@ -439,12 +435,12 @@ Feb 15 01:40:18 SW-1 Ebra: %LINEPROTO-5-UPDOWN: Line protocol on Interface Port-
Feb 15 01:40:19 SW-1 Stp: %SPANTREE-6-INTERFACE_DEL: Interface Port-Channel35 has been removed from instance MST0
Feb 15 01:40:19 SW-1 Ebra: %LINEPROTO-5-UPDOWN: Line protocol on Interface Ethernet35 (Server35), changed state to down</computeroutput></screen>
</para>
<para>He reenabled the switch ports, and the two compute nodes
<para>He re-enabled the switch ports and the two compute nodes
immediately came back to life.</para>
<para>Unfortunately, this story has an open ending... we're
still looking into why the CentOS image was sending out
spanning tree packets. Further, we're researching a proper
way for how to mitigate this from happening. It's a bigger
way on how to mitigate this from happening. It's a bigger
issue than one might think. While it's extremely important
for switches to prevent spanning tree loops, it's very
problematic to have an entire compute node be cut from the
@@ -452,45 +448,45 @@ Feb 15 01:40:19 SW-1 Ebra: %LINEPROTO-5-UPDOWN: Line protocol on Interface Ether
100 instances and one of them sends a spanning tree
packet, that instance has effectively DDOS'd the other 99
instances.</para>
<para>This is an ongoing and hot topic in networking
circles&mdash;especially with the rise of virtualization and
virtual switches.</para>
<para>This is an ongoing and hot topic in networking circles
— especially with the raise of virtualization and virtual
switches.</para>
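One way to confirm whether a particular instance is emitting spanning tree BPDUs, assuming access to its tap device on the compute node (vnet0 here is a placeholder name), is to filter on the reserved bridge-group MAC address:

    # tcpdump -n -e -i vnet0 'ether dst 01:80:c2:00:00:00'    # STP BPDUs are sent to this multicast address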
</section>
<section xml:id="rabbithole">
<title>Down the Rabbit Hole</title>
<para>Users being able to retrieve console logs from running
instances is a boon for support&mdash;many times they can
instances is a boon for support—many times they can
figure out what's going on inside their instance and fix
what's going on without bothering you. Unfortunately,
sometimes overzealous logging of failures can cause
problems of its own.</para>
<para>A report came in: VMs were launching slowly, or not at
all. Cue the standard checks&mdash;nothing on the Nagios, but
there was a spike in network toward the current master of
all. Cue the standard checks — nothing on the nagios, but
there was a spike in network towards the current master of
our RabbitMQ cluster. Investigation started, but soon the
other parts of the queue cluster were leaking memory like
a sieve. Then the alert came in: the master rabbit server
a sieve. Then the alert came in the master rabbit server
went down. Connections failed over to the slave.</para>
<para>At that time, our control services were hosted by
another team. We didn't have much debugging information
another team and we didn't have much debugging information
to determine what was going on with the master, and
couldn't reboot it. That team noted that the master failed without
alert, but they managed to reboot it. After an hour, the
cluster had returned to its normal state, and we went home
couldn't reboot it. That team noted that it failed without
alert, but managed to reboot it. After an hour, the
cluster had returned to its normal state and we went home
for the day.</para>
<para>Continuing the diagnosis the next morning was kick-started
by another identical failure. We quickly got the
message queue running again and tried to work out why
<para>Continuing the diagnosis the next morning was kick
started by another identical failure. We quickly got the
message queue running again, and tried to work out why
Rabbit was suffering from so much network traffic.
Enabling debug logging on
<systemitem class="service">nova-api</systemitem> quickly brought
understanding. A <command>tail -f
/var/log/nova/nova-api.log</command> was scrolling by
faster than we'd ever seen before. CTRL+C on that, and we
faster than we'd ever seen before. CTRL+C on that and we
could plainly see the contents of a system log spewing
failures over and over again&mdash;a system log from one of
failures over and over again - a system log from one of
our users' instances.</para>
<para>After finding the instance ID, we headed over to
<para>After finding the instance ID we headed over to
<filename>/var/lib/nova/instances</filename> to find the
<filename>console.log</filename>:
<screen><computeroutput>
@@ -499,11 +495,11 @@ adm@cc12:/var/lib/nova/instances/instance-00000e05# wc -l console.log
adm@cc12:/var/lib/nova/instances/instance-00000e05# ls -sh console.log
5.5G console.log</computeroutput></screen></para>
<para>Sure enough, the user had been periodically refreshing
the console log page on the dashboard and the 5 GB file was
the console log page on the dashboard and the 5G file was
traversing the rabbit cluster to get to the
dashboard.</para>
<para>We called the user and asked her to stop for a while, and
she was happy to abandon the horribly broken VM. After
<para>We called them and asked them to stop for a while, and
they were happy to abandon the horribly broken VM. After
that, we started monitoring the size of console
logs.</para>
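A simple form of that monitoring, assuming the default instance path, is a periodic size check such as:

    # find /var/lib/nova/instances -name console.log -size +50M -exec ls -lh {} \;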
<para>To this day, <link
@@ -518,10 +514,10 @@ adm@cc12:/var/lib/nova/instances/instance-00000e05# ls -sh console.log
<para>Felix Lee of Academia Sinica Grid Computing Centre in Taiwan
contributed this story.</para>
<para>I just upgraded OpenStack from Grizzly to Havana 2013.2-2 using
the RDO repository and everything was running pretty
well—except the EC2 API.</para>
the RDO repository and everything was running pretty well
-- except the EC2 API.</para>
<para>I noticed that the API would suffer from a heavy load and
respond slowly to particular EC2 requests, such as
respond slowly to particular EC2 requests such as
<literal>RunInstances</literal>.</para>
<para>Output from <filename>/var/log/nova/nova-api.log</filename> on
Havana:</para>
@@ -532,7 +528,7 @@ adm@cc12:/var/lib/nova/instances/instance-00000e05# ls -sh console.log
/services/Cloud?AWSAccessKeyId=[something]&amp;Action=RunInstances&amp;ClientToken=[something]&amp;ImageId=ami-00000001&amp;InstanceInitiatedShutdownBehavior=terminate...
HTTP/1.1" status: 200 len: 1109 time: 138.5970151
</computeroutput></screen>
<para>This request took more than two minutes to process, but it executed
<para>This request took over two minutes to process, but executed
quickly on another co-existing Grizzly deployment using the same
hardware and system configuration.</para>
<para>Output from <filename>/var/log/nova/nova-api.log</filename> on
@@ -547,17 +543,17 @@ HTTP/1.1" status: 200 len: 931 time: 3.9426181
<para>While monitoring system resources, I noticed
a significant increase in memory consumption while the EC2 API
processed this request. I thought it wasn't handling memory
properly—possibly not releasing memory. If the API received
properly -- possibly not releasing memory. If the API received
several of these requests, memory consumption quickly grew until
the system ran out of RAM and began using swap. Each node has
48 GB of RAM, and the "nova-api" process would consume all of it
48 GB of RAM and the "nova-api" process would consume all of it
within minutes. Once this happened, the entire system would become
unusably slow until I restarted the
<systemitem class="service">nova-api</systemitem> service.</para>
<para>So, I found myself wondering what changed in the EC2 API on
Havana that might cause this to happen. Was it a bug or normal
Havana that might cause this to happen. Was it a bug or a normal
behavior that I now need to work around?</para>
<para>After digging into the nova code, I noticed two areas in
<para>After digging into the Nova code, I noticed two areas in
<filename>api/ec2/cloud.py</filename> potentially impacting my
system:</para>
<programlisting language="python">
@@ -569,13 +565,13 @@ HTTP/1.1" status: 200 len: 931 time: 3.9426181
context, search_filts=[{'key': ['EC2_client_token']},
{'value': [client_token]}])
</programlisting>
<para>Since my database contained many records&mdash;over 1 million
<para>Since my database contained many records -- over 1 million
metadata records and over 300,000 instance records in "deleted"
or "errored" states&mdash;each search took ages. I decided to clean
or "errored" states -- each search took ages. I decided to clean
up the database by first archiving a copy for backup and then
performing some deletions using the MySQL client. For example, I
ran the following SQL command to remove rows of instances deleted
for more than a year:</para>
for over a year:</para>
<screen><prompt>mysql></prompt> <userinput>delete from nova.instances where deleted=1 and terminated_at &lt; (NOW() - INTERVAL 1 YEAR);</userinput></screen>
<para>Performance increased greatly after deleting the old records and
my new deployment continues to behave well.</para>
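The archiving step mentioned above can be as simple as a dump taken before any deletions run; the filename is arbitrary:

    $ mysqldump --single-transaction nova > nova-backup-$(date +%F).sql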

File diff suppressed because it is too large


@@ -1,262 +1,305 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE appendix [
<!-- Some useful entities borrowed from HTML -->
<!ENTITY ndash "&#x2013;">
<!ENTITY mdash "&#x2014;">
<!ENTITY hellip "&#x2026;">
<!ENTITY plusmn "&#xB1;">
]>
<appendix label="A" version="5.0" xml:id="use-cases"
xmlns="http://docbook.org/ns/docbook"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:ns5="http://www.w3.org/1999/xhtml"
xmlns:ns4="http://www.w3.org/1998/Math/MathML"
xmlns:ns3="http://www.w3.org/2000/svg"
xmlns:ns="http://docbook.org/ns/docbook">
<title>Use Cases</title>
<appendix xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0"
xml:id="use-cases" label="A">
<title>Use Cases</title>
<para>This appendix contains a small selection of use cases from
the community, with more technical detail than usual. Further
examples can be found on the <link
xlink:title="OpenStack User Stories Website"
xlink:href="https://www.openstack.org/user-stories/"
>OpenStack website</link>
(https://www.openstack.org/user-stories/)</para>
<section xml:id="nectar">
<title>NeCTAR</title>
<para>Who uses it: researchers from the Australian publicly
funded research sector. Use is across a wide variety of
disciplines, with the purpose of instances ranging from
running simple web servers to using hundreds of cores for
high throughput computing.</para>
<section xml:id="nectar_deploy">
<title>Deployment</title>
<para>Using OpenStack Compute cells, the NeCTAR Research Cloud
spans eight sites with approximately 4,000 cores per
site.</para>
<para>Each site runs a different configuration, as
resource <glossterm>cell</glossterm>s in an OpenStack
Compute cells setup. Some sites span multiple data
centers, some use off compute node storage with a
shared file system, and some use on compute node
storage with a nonshared file system. Each site
deploys the Image Service with an Object Storage
back end. A central Identity Service, dashboard, and
Compute API service is used. A login to the dashboard
triggers a SAML login with Shibboleth, which creates an
<glossterm>account</glossterm> in the Identity
Service with an SQL back end.</para>
<para>Compute nodes have 24 to 48 cores, with at least 4
GB of RAM per core and approximately 40 GB of
ephemeral storage per core.</para>
<para>All sites are based on Ubuntu 12.04 with KVM as the
hypervisor. The OpenStack version in use is typically
the current stable version, with 5 to 10 percent back-ported
code from trunk and modifications. Migration to Ubuntu 14.04
is planned as part of the Havana to Icehouse upgrade.</para>
</section>
<section xml:id="nectar_resources">
<title>Resources</title>
<itemizedlist>
<listitem>
<para>
<link
xlink:href="https://www.openstack.org/user-stories/nectar/"
>OpenStack.org Case Study</link>
(https://www.openstack.org/user-stories/nectar/)</para>
</listitem>
<listitem>
<para>
<link xlink:href="https://github.com/NeCTAR-RC/"
>NeCTAR-RC GitHub</link>
(https://github.com/NeCTAR-RC/)</para>
</listitem>
<listitem>
<para>
<link xlink:href="https://www.nectar.org.au/"
>NeCTAR website</link>
(https://www.nectar.org.au/)</para>
</listitem>
</itemizedlist>
</section>
</section>
<section xml:id="mit_csail">
<title>MIT CSAIL</title>
<para>Who uses it: researchers from the MIT Computer Science
and Artificial Intelligence Lab.</para>
<section xml:id="mit_csail_deploy">
<title>Deployment</title>
<para>The CSAIL cloud is currently 64 physical nodes with
a total of 768 physical cores and 3,456 GB of RAM.
Persistent data storage is largely outside the cloud on
NFS, with cloud resources focused on compute
resources. There are more than 130 users in more than 40
projects, typically running 2,000&ndash;2,500 vCPUs in 300
to 400 instances.</para>
<para>We initially deployed on Ubuntu 12.04 with the Essex
release of OpenStack using FlatDHCP multi-host
networking.</para>
<para>The software stack is still Ubuntu 12.04 LTS, but now
with OpenStack Havana from the Ubuntu Cloud Archive. KVM
is the hypervisor, deployed using <link
xlink:href="http://fai-project.org">FAI</link>
(http://fai-project.org/) and Puppet for configuration
management. The FAI and Puppet combination is used
lab-wide, not only for OpenStack. There is a single cloud
controller node, which also acts as network controller,
with the remainder of the server hardware dedicated to compute
nodes.</para>
<para>Host aggregates and instance-type extra specs are
used to provide two different resource allocation
ratios. The default resource allocation ratios we use are
4:1 CPU and 1.5:1 RAM. Compute-intensive workloads use
instance types that require non-oversubscribed hosts where
cpu_ratio and ram_ratio are both set to 1.0. Since we
have hyperthreading enabled on our compute nodes, this
provides one vCPU per CPU thread, or two vCPUs per
physical core.</para>
<para>With our upgrade to Grizzly in August 2013, we moved
to OpenStack Networking Service, neutron (quantum at the
time). Compute nodes have two gigabit network interfaces
and a separate management card for IPMI management. One
network interface is used for node-to-node
communications. The other is used as a trunk port for
OpenStack managed VLANs. The controller node uses two
bonded 10g network interfaces for its public IP
communications. Big pipes are used here because images are
served over this port, and it is also used to connect to
iSCSI storage, back ending the image storage and
database. The controller node also has a gigabit interface
that is used in trunk mode for OpenStack managed VLAN
traffic. This port handles traffic to the dhcp-agent and
metadata-proxy.</para>
<para>We approximate the older nova-networking multi-host
HA setup by using "provider vlan networks" that connect
instances directly to existing publicly addressable
networks and use existing physical routers as their
default gateway. This means that if our network
controller goes down, running instances still have
their network available, and no single Linux host becomes a
traffic bottleneck. We are able to do this because we have
a sufficient supply of IPv4 addresses to cover all of our
instances and thus don't need NAT and don't use floating
IP addresses. We provide a single generic public network to
all projects and additional existing VLANs on a project-by-project
basis as needed. Individual projects are also
allowed to create their own private GRE based
networks.</para>
</section>
<section xml:id="CSAIL_resources">
<title>Resources</title>
<itemizedlist>
<listitem>
<para>
<link xlink:href="http://www.csail.mit.edu"
>CSAIL Homepage</link>
(http://www.csail.mit.edu)</para>
</listitem>
</itemizedlist>
</section>
<para>This appendix contains a small selection of use cases from the
community, with more technical detail than usual. Further examples can be
found on the <link xlink:href="http://opsgui.de/1eLAdUw"
xlink:title="OpenStack User Stories Website">OpenStack
website</link>.</para>
<section xml:id="nectar">
<title>NeCTAR</title>
<para>Who uses it: researchers from the Australian publicly funded
research sector. Use is across a wide variety of disciplines, with the
purpose of instances ranging from running simple web servers to using
hundreds of cores for high-throughput computing.<indexterm
class="singular">
<primary>NeCTAR Research Cloud</primary>
</indexterm><indexterm class="singular">
<primary>use cases</primary>
<secondary>NeCTAR</secondary>
</indexterm><indexterm class="singular">
<primary>OpenStack community</primary>
<secondary>use cases</secondary>
<tertiary>NeCTAR</tertiary>
</indexterm></para>
<section xml:id="nectar_deploy">
<title>Deployment</title>
<para>Using OpenStack Compute cells, the NeCTAR Research Cloud spans
eight sites with approximately 4,000 cores per site.</para>
<para>Each site runs a different configuration, as resource
<glossterm>cell</glossterm>s in an OpenStack Compute cells setup. Some
sites span multiple data centers, some use off compute node storage with
a shared file system, and some use on compute node storage with a
nonshared file system. Each site deploys the Image Service with an
Object Storage backend. A central Identity Service, dashboard, and
Compute API service are used. A login to the dashboard triggers a SAML
login with Shibboleth, which creates an <glossterm>account</glossterm>
in the Identity Service with an SQL backend.</para>
<para>Compute nodes have 24 to 48 cores, with at least 4 GB of RAM per
core and approximately 40 GB of ephemeral storage per core.</para>
<para>All sites are based on Ubuntu 12.04, with KVM as the hypervisor.
The OpenStack version in use is typically the current stable version,
with 5 to 10 percent back-ported code from trunk and modifications.
Migration to Ubuntu 14.04 is planned as part of the Havana to Icehouse
upgrade.<indexterm class="singular">
<primary>Icehouse</primary>
<secondary>migration to Ubuntu</secondary>
</indexterm></para>
</section>
<section xml:id="dair">
<title>DAIR</title>
<para>Who uses it: DAIR is an integrated virtual environment
that leverages the CANARIE network to develop and test new
information communication technology (ICT) and other
digital technologies. It combines such digital
infrastructure as advanced networking and cloud computing
and storage to create an environment for developing and testing
innovative ICT applications, protocols, and services;
performing at-scale experimentation for deployment; and
facilitating a faster time to market.</para>
<section xml:id="dair_deploy">
<title>Deployment</title>
<para>DAIR is hosted at two different data centers across
Canada: one in Alberta and the other in Quebec. It
consists of a cloud controller at each location,
although, one is designated the "master" controller
which is in charge of central authentication and
quotas. This is done through custom scripts and light
modifications to OpenStack. DAIR is currently running
Grizzly.</para>
<para>For Object Storage, each region has a swift
environment.</para>
<para>A NetApp appliance is used in each region for both
block storage and instance storage. There are future
plans to move the instances off the NetApp
appliance and onto a distributed file system such as
<glossterm>Ceph</glossterm> or GlusterFS.</para>
<para>VlanManager is used extensively for network
management. All servers have two bonded 10GbE NICs that
are connected to two redundant switches. DAIR is set
up to use single-node networking where the cloud
controller is the gateway for all instances on all
compute nodes. Internal OpenStack traffic (for
example, storage traffic) does not go through the
cloud controller.</para>
</section>
<section xml:id="dair_resources">
<title>Resources</title>
<itemizedlist>
<listitem>
<para>
<link xlink:href="http://www.canarie.ca/en/dair-program/about"
>DAIR homepage</link>
(http://www.canarie.ca/en/dair-program/about)</para>
</listitem>
</itemizedlist>
</section>
<section xml:id="nectar_resources">
<title>Resources</title>
<itemizedlist>
<listitem>
<para><link xlink:href="http://opsgui.de/NPFCiF">OpenStack.org case
study</link></para>
</listitem>
<listitem>
<para><link xlink:href="http://opsgui.de/1eLAhnd">NeCTAR-RC
GitHub</link></para>
</listitem>
<listitem>
<para><link xlink:href="http://opsgui.de/NPFEHm">NeCTAR
website</link></para>
</listitem>
</itemizedlist>
</section>
<section xml:id="cern">
<title>CERN</title>
<para>Who uses it: researchers at CERN (European Organization
for Nuclear Research) conducting high-energy physics
research.</para>
<section xml:id="cern_deploy">
<title>Deployment</title>
<para>The environment is largely based on Scientific Linux
6, which is Red Hat compatible. We use KVM as our
primary hypervisor, although tests are ongoing with
Hyper-V on Windows Server 2008.</para>
<para>We use the Puppet Labs OpenStack modules to
configure Compute, Image Service, Identity, and
dashboard. Puppet is used widely for instance
configuration, and Foreman is used as a GUI for reporting and
instance provisioning.</para>
<para>Users and groups are managed through Active
Directory and imported into the Identity Service using
LDAP. CLIs are available for nova and Euca2ools to do
this.</para>
<para>There are three clouds currently running at CERN, totaling
about 3,400 compute nodes, with approximately 60,000 cores.
The CERN IT cloud aims to expand to 300,000 cores by 2015.
</para>
<!--FIXME - update numbers and release information for 2014 -->
</section>
<section xml:id="cern_resources">
<title>Resources</title>
<itemizedlist>
<listitem>
<para>
<link
xlink:href="http://openstack-in-production.blogspot.com/2013/09/a-tale-of-3-openstack-clouds-50000.html"
>OpenStack in Production: A tale of 3 OpenStack Clouds</link>
(http://openstack-in-production.blogspot.com/2013/09/a-tale-of-3-openstack-clouds-50000.html)
</para>
</listitem>
<listitem>
<para>
<link
xlink:href="http://cern.ch/go/N8wp">Review
of CERN Data Centre
Infrastructure</link> (http://cern.ch/go/N8wp)</para>
</listitem>
<listitem>
<para>
<link
xlink:href="http://information-technology.web.cern.ch/book/cern-private-cloud-user-guide"
>CERN Cloud Infrastructure User
Guide</link> (http://information-technology.web.cern.ch/book/cern-private-cloud-user-guide)</para>
</listitem>
</itemizedlist>
</section>
</section>
<section xml:id="mit_csail">
<title>MIT CSAIL</title>
<para>Who uses it: researchers from the MIT Computer Science and
Artificial Intelligence Lab.<indexterm class="singular">
<primary>CSAIL (Computer Science and Artificial Intelligence
Lab)</primary>
</indexterm><indexterm class="singular">
<primary>MIT CSAIL (Computer Science and Artificial Intelligence
Lab)</primary>
</indexterm><indexterm class="singular">
<primary>use cases</primary>
<secondary>MIT CSAIL</secondary>
</indexterm><indexterm class="singular">
<primary>OpenStack community</primary>
<secondary>use cases</secondary>
<tertiary>MIT CSAIL</tertiary>
</indexterm></para>
<section xml:id="mit_csail_deploy">
<title>Deployment</title>
<para>The CSAIL cloud is currently 64 physical nodes with a total of 768
physical cores and 3,456 GB of RAM. Persistent data storage is largely
outside the cloud on NFS, with cloud resources focused on compute
resources. There are more than 130 users in more than 40 projects,
typically running 2,000–2,500 vCPUs in 300 to 400 instances.</para>
<para>We initially deployed on Ubuntu 12.04 with the Essex release of
OpenStack using FlatDHCP multi-host networking.</para>
<para>The software stack is still Ubuntu 12.04 LTS, but now with
OpenStack Havana from the Ubuntu Cloud Archive. KVM is the hypervisor,
deployed using <link xlink:href="http://opsgui.de/1eLAhUr">FAI</link>
and Puppet for configuration management. The FAI and Puppet combination
is used lab-wide, not only for OpenStack. There is a single cloud
controller node, which also acts as network controller, with the
remainder of the server hardware dedicated to compute nodes.</para>
<para>Host aggregates and instance-type extra specs are used to provide
two different resource allocation ratios. The default resource
allocation ratios we use are 4:1 CPU and 1.5:1 RAM. Compute-intensive
workloads use instance types that require non-oversubscribed hosts where
<literal>cpu_ratio</literal> and <literal>ram_ratio</literal> are both
set to 1.0. Since we have hyperthreading enabled on our compute nodes,
this provides one vCPU per CPU thread, or two vCPUs per physical
core.</para>
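A generic sketch of that mechanism with the nova CLI of the period; the aggregate, host, flavor, and metadata key below are illustrative rather than CSAIL's actual names:

    $ nova aggregate-create dedicated-hosts
    $ nova aggregate-add-host dedicated-hosts compute-07
    $ nova aggregate-set-metadata dedicated-hosts oversubscribed=false
    $ nova flavor-key m1.dedicated set aggregate_instance_extra_specs:oversubscribed=false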
<para>With our upgrade to Grizzly in August 2013, we moved to OpenStack
Networking Service, neutron (quantum at the time). Compute nodes have
two-gigabit network interfaces and a separate management card for IPMI
management. One network interface is used for node-to-node
communications. The other is used as a trunk port for OpenStack managed
VLANs. The controller node uses two bonded 10g network interfaces for
its public IP communications. Big pipes are used here because images are
served over this port, and it is also used to connect to iSCSI storage,
backending the image storage and database. The controller node also has
a gigabit interface that is used in trunk mode for OpenStack managed
VLAN traffic. This port handles traffic to the dhcp-agent and
metadata-proxy.</para>
<para>We approximate the older <literal>nova-network</literal>
multi-host HA setup by using "provider vlan networks" that connect
instances directly to existing publicly addressable networks and use
existing physical routers as their default gateway. This means that if
our network controller goes down, running instances still have their
network available, and no single Linux host becomes a traffic
bottleneck. We are able to do this because we have a sufficient supply
of IPv4 addresses to cover all of our instances and thus don't need NAT
and don't use floating IP addresses. We provide a single generic public
network to all projects and additional existing VLANs on a
project-by-project basis as needed. Individual projects are also allowed
to create their own private GRE based networks.</para>
</section>
</appendix>
<section xml:id="CSAIL_resources">
<title>Resources</title>
<itemizedlist>
<listitem>
<para><link xlink:href="http://opsgui.de/NPFFez">CSAIL
homepage</link></para>
</listitem>
</itemizedlist>
</section>
</section>
<section xml:id="dair">
<title>DAIR</title>
<para>Who uses it: DAIR is an integrated virtual environment that
leverages the CANARIE network to develop and test new information
communication technology (ICT) and other digital technologies. It combines
such digital infrastructure as advanced networking and cloud computing and
storage to create an environment for developing and testing innovative ICT
applications, protocols, and services; performing at-scale experimentation
for deployment; and facilitating a faster time to market.<indexterm
class="singular">
<primary>DAIR</primary>
</indexterm><indexterm class="singular">
<primary>use cases</primary>
<secondary>DAIR</secondary>
</indexterm><indexterm class="singular">
<primary>OpenStack community</primary>
<secondary>use cases</secondary>
<tertiary>DAIR</tertiary>
</indexterm></para>
<section xml:id="dair_deploy">
<title>Deployment</title>
<para>DAIR is hosted at two different data centers across Canada: one in
Alberta and the other in Quebec. It consists of a cloud controller at
each location, although, one is designated the "master" controller that
is in charge of central authentication and quotas. This is done through
custom scripts and light modifications to OpenStack. DAIR is currently
running Grizzly.</para>
<para>For Object Storage, each region has a swift environment.</para>
<para>A NetApp appliance is used in each region for both block storage
and instance storage. There are future plans to move the instances off
the NetApp appliance and onto a distributed file system such as
<glossterm>Ceph</glossterm> or GlusterFS.</para>
<para>VlanManager is used extensively for network management. All
servers have two bonded 10GbE NICs that are connected to two redundant
switches. DAIR is set up to use single-node networking where the cloud
controller is the gateway for all instances on all compute nodes.
Internal OpenStack traffic (for example, storage traffic) does not go
through the cloud controller.</para>
</section>
<section xml:id="dair_resources">
<title>Resources</title>
<itemizedlist>
<listitem>
<para><link xlink:href="http://opsgui.de/NPFgIP">DAIR
homepage</link></para>
</listitem>
</itemizedlist>
</section>
</section>
<section xml:id="cern">
<title>CERN</title>
<para>Who uses it: researchers at CERN (European Organization for Nuclear
Research) conducting high-energy physics research.<indexterm
class="singular">
<primary>CERN (European Organization for Nuclear Research)</primary>
</indexterm><indexterm class="singular">
<primary>use cases</primary>
<secondary>CERN</secondary>
</indexterm><indexterm class="singular">
<primary>OpenStack community</primary>
<secondary>use cases</secondary>
<tertiary>CERN</tertiary>
</indexterm></para>
<section xml:id="cern_deploy">
<title>Deployment</title>
<para>The environment is largely based on Scientific Linux 6, which is
Red Hat compatible. We use KVM as our primary hypervisor, although tests
are ongoing with Hyper-V on Windows Server 2008.</para>
<para>We use the Puppet Labs OpenStack modules to configure Compute,
Image Service, Identity, and dashboard. Puppet is used widely for
instance configuration, and Foreman is used as a GUI for reporting and
instance provisioning.</para>
<para>Users and groups are managed through Active Directory and imported
into the Identity Service using LDAP.&#160;CLIs are available for nova
and Euca2ools to do this.</para>
<para>There are three clouds currently running at CERN, totaling about
3,400 compute nodes, with approximately 60,000 cores. The CERN IT cloud
aims to expand to 300,000 cores by 2015.</para>
<!--FIXME - update numbers and release information for 2014 -->
</section>
<section xml:id="cern_resources">
<title>Resources</title>
<itemizedlist>
<listitem>
<para><link xlink:href="http://opsgui.de/NPFGiu">“OpenStack in
Production: A tale of 3 OpenStack Clouds”</link></para>
</listitem>
<listitem>
<para><link xlink:href="http://opsgui.de/1eLAkPR">“Review of CERN
Data Centre Infrastructure”</link></para>
</listitem>
<listitem>
<para><link xlink:href="http://opsgui.de/NPFGPD">“CERN Cloud
Infrastructure User Guide”</link></para>
</listitem>
</itemizedlist>
</section>
</section>
</appendix>


@@ -47,7 +47,6 @@
</info>
<!-- front matter -->
<xi:include href="acknowledgements.xml"/>
<xi:include href="ch_ops_dochistory.xml"/>
<xi:include href="preface_ops.xml"/>
<!-- parts: architecture and operations -->
<xi:include href="part_architecture.xml"/>
@@ -60,4 +59,5 @@
<xi:include href="ch_ops_resources.xml"/>
<!-- glossary -->
<xi:include href="glossary-terms.xml"/>
<index/>
</book>

Binary files added (contents not shown): doc/openstack-ops/callouts/1 through 10, each as a new executable .pdf and .png file (PNG sizes 329–361 B).

File diff suppressed because it is too large

File diff suppressed because it is too large


@@ -1,45 +1,51 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chapter [
<!-- Some useful entities borrowed from HTML -->
<!ENTITY ndash "&#x2013;">
<!ENTITY mdash "&#x2014;">
<!ENTITY hellip "&#x2026;">
<!ENTITY plusmn "&#xB1;">
]>
<chapter xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0"
xml:id="example_architecture">
<?dbhtml stop-chunking?>
<title>Example Architectures</title>
<para>To understand the possibilities OpenStack offers it's best to start
with basic architectures that are tried-and-true and have been tested
in production environments. We offer two such examples with basic pivots on
the base operating system (Ubuntu and Red Hat Enterprise Linux) and the
networking architectures. There are other differences between these two
examples, but you should find the considerations made for the choices in each
as well as a rationale for why it worked well in a given environment.</para>
<para>Because OpenStack is highly configurable, with many different
back-ends and network configuration options, it is difficult to write
documentation that covers all possible OpenStack deployments. Therefore,
this guide defines example architectures to simplify the task of
documenting, as well as to provide the scope for this guide. Both of the
offered architecture examples are currently running in production and
serving users.</para>
<tip><para>As always, refer to the Glossary if you are unclear about any of the
terminology mentioned in these architectures.</para></tip>
<xi:include href="section_arch_example-nova.xml"/>
<xi:include href="section_arch_example-neutron.xml"/>
<section xml:id="example_archs_conclusion">
<title>Parting Thoughts on Architectures</title>
<para>With so many considerations and options available our hope is to
provide a few clearly-marked and tested paths for your OpenStack
exploration. If you're looking for additional ideas, check out the
<link linkend="use-cases"
>Use Cases</link> appendix, the <link xlink:href="http://docs.openstack.org/"
>OpenStack Installation Guides</link>, or the <link
xlink:href="http://openstack.org/user-stories/"
>OpenStack User Stories page</link>.</para>
</section>
</chapter>
<chapter version="5.0" xml:id="example_architecture"
xmlns="http://docbook.org/ns/docbook"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:ns5="http://www.w3.org/1998/Math/MathML"
xmlns:ns4="http://www.w3.org/1999/xhtml"
xmlns:ns3="http://www.w3.org/2000/svg"
xmlns:ns="http://docbook.org/ns/docbook">
<?dbhtml stop-chunking?>
<title>Example Architectures</title>
<para>To understand the possibilities OpenStack offers, it's best to start
with basic architectures that are tried-and-true and have been tested in
production environments. We offer two such examples with basic pivots on the
base operating system (Ubuntu and Red Hat Enterprise Linux) and the
networking architectures. There are other differences between these two
examples, but you should find the considerations made for the choices in
each as well as a rationale for why it worked well in a given
environment.</para>
<para>Because OpenStack is highly configurable, with many different backends
and network configuration options, it is difficult to write documentation
that covers all possible OpenStack deployments. Therefore, this guide
defines example architectures to simplify the task of documenting, as well
as to provide the scope for this guide. Both of the offered architecture
examples are currently running in production and serving users.</para>
<tip>
<para>As always, refer to the <xref linkend="openstack_glossary" /> if you
are unclear about any of the terminology mentioned in these
architectures.</para>
</tip>
<xi:include href="section_arch_example-nova.xml" />
<xi:include href="section_arch_example-neutron.xml" />
<section xml:id="example_archs_conclusion">
<title>Parting Thoughts on Architectures</title>
<para>With so many considerations and options available, our hope is to
provide a few clearly-marked and tested paths for your OpenStack
exploration. If you're looking for additional ideas, check out <xref
linkend="use-cases" />, the <link
xlink:href="http://opsgui.de/NPFTC8">OpenStack Installation Guides</link>,
or the <link xlink:href="http://opsgui.de/1eLAAhX">OpenStack User Stories
page</link>.</para>
</section>
</chapter>


@@ -1,391 +1,536 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chapter [
<!-- Some useful entities borrowed from HTML -->
<!ENTITY ndash "&#x2013;">
<!ENTITY mdash "&#x2014;">
<!ENTITY hellip "&#x2026;">
<!ENTITY plusmn "&#xB1;">
]>
<chapter xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="network_design">
<?dbhtml stop-chunking?>
<title>Network Design</title>
<para>OpenStack provides a rich networking environment, and this
chapter details the requirements and options to deliberate
when designing your cloud.</para>
<warning><para>If this is the first time you are deploying a cloud
infrastructure in your organization, after reading this
section, your first conversations should be with your
networking team. Network usage in a running cloud is vastly
different from traditional network deployments and has the
potential to be disruptive at both a connectivity and a policy
level.</para></warning>
<para>For example, you must plan the number of IP addresses that
you need for both your guest instances as well as management
infrastructure. Additionally, you must research and discuss
cloud network connectivity through proxy servers and
firewalls.</para>
<para>In this chapter, we'll give some examples of network implementations to
consider and provide information about some of the network layouts
that OpenStack uses. Finally, we have some brief notes on the networking
services that are essential for stable operation.</para>
<section xml:id="mgmt_network">
<title>Management Network</title>
<para>A <glossterm>management network</glossterm> (a separate network for use by your
cloud operators), typically consisting of a separate
switch and separate NICs (network interface cards), is a
recommended option. This
segregation prevents system administration and the monitoring
of system access from being disrupted by traffic generated by
guests.</para>
<para>Consider creating other private networks for
communication between internal components of OpenStack,
such as the message queue and OpenStack Compute. Using a
virtual local area network (VLAN)
works well for these scenarios because it provides
a method for creating multiple virtual networks on a
physical network.</para>
</section>
<section xml:id="public_addressing">
<title>Public Addressing Options</title>
<para>There are two main types of IP addresses for guest
virtual machines: fixed IPs and floating IPs. Fixed IPs
are assigned to instances on boot, whereas floating IP
addresses can change their association between instances
by action of the user. Both types of IP addresses can
be either public or private, depending on your use
case.</para>
<para>Fixed IP addresses are required, whereas it is possible
to run OpenStack without floating IPs. One of the most
common use cases for floating IPs is to provide public IP
addresses to a private cloud, where there are a limited
number of IP addresses available. Another is for a public
cloud user to have a "static" IP address that can be
reassigned when an instance is upgraded or moved.</para>
<para>Fixed IP addresses can be private for private clouds, or
public for public clouds. When an instance terminates, its
fixed IP is lost. It is worth noting that newer users of
cloud computing may find their ephemeral nature
frustrating.</para>
</section>
<section xml:id="ip_address_planning">
<title>IP Address Planning</title>
<para>An OpenStack installation can potentially have many
subnets (ranges of IP addresses), and different types of
services in each. An IP
address plan can assist with a shared understanding of
network partition purposes and scalability. Control
services can have public and private IP addresses, and as
noted above there are a couple of options for an instance's
public addresses.</para>
<para>An IP address plan might be broken down into the
following sections:</para>
<informaltable rules="all">
<tbody>
<tr>
<td><para><emphasis role="bold">subnet router</emphasis></para></td>
<td><para>Packets leaving the subnet go via this
address, which could be a dedicated router
or a nova-network service.</para></td>
</tr>
<tr>
<td><para><emphasis role="bold">control services public
interfaces</emphasis></para></td>
<td><para>Public access to
<code>swift-proxy</code>,
<code>nova-api</code>,
<code>glance-api</code>, and horizon
come to these addresses, which could be on
one side of a load balancer or pointing
at individual machines.</para></td>
</tr>
<tr>
<td><para><emphasis role="bold">Object Storage cluster internal
communications</emphasis></para></td>
<td><para>Traffic among object/account/container
servers and between these and the proxy
server's internal interface uses this
private network.</para></td>
</tr>
<tr>
<td><para><emphasis role="bold">compute and storage
communications</emphasis></para></td>
<td><para>If ephemeral or block storage is
external to the compute node, this network
is used.</para></td>
</tr>
<tr>
<td><para><emphasis role="bold">out-of-band remote
management</emphasis></para></td>
<td><para>If a dedicated remote access controller
chip is included in servers, often these
are on a separate network.</para></td>
</tr>
<tr>
<td><para><emphasis role="bold">in-band remote management</emphasis></para></td>
<td><para>Often, an extra (such as 1 GB)
interface on compute or storage nodes is
used for system administrators or
monitoring tools to access the host
instead of going through the public
interface.</para></td>
</tr>
<tr>
<td><para><emphasis role="bold">spare space for future
growth</emphasis></para></td>
<td><para>Adding more public-facing control
services or guest instance IPs should
always be part of your plan.</para></td>
</tr>
</tbody>
</informaltable>
<para>For example, take a deployment that has both OpenStack
Compute and Object Storage, with private ranges
172.22.42.0/24 and 172.22.87.0/26 available. One way to
segregate the space might be as follows:</para>
<programlisting><?db-font-size 55%?>172.22.42.0/24:
<chapter version="5.0" xml:id="network_design"
xmlns="http://docbook.org/ns/docbook"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:ns5="http://www.w3.org/2000/svg"
xmlns:ns4="http://www.w3.org/1998/Math/MathML"
xmlns:ns3="http://www.w3.org/1999/xhtml"
xmlns:ns="http://docbook.org/ns/docbook">
<?dbhtml stop-chunking?>
<title>Network Design</title>
<para>OpenStack provides a rich networking environment, and this chapter
details the requirements and options to deliberate when designing your
cloud.<indexterm class="singular">
<primary>network design</primary>
<secondary>first steps</secondary>
</indexterm><indexterm class="singular">
<primary>design considerations</primary>
<secondary>network design</secondary>
</indexterm></para>
<warning>
<para>If this is the first time you are deploying a cloud infrastructure
in your organization, after reading this section, your first conversations
should be with your networking team. Network usage in a running cloud is
vastly different from traditional network deployments and has the
potential to be disruptive at both a connectivity and a policy
level.<indexterm class="singular">
<primary>cloud computing</primary>
<secondary>vs. traditional deployments</secondary>
</indexterm></para>
</warning>
<para>For example, you must plan the number of IP addresses that you need
for both your guest instances as well as management infrastructure.
Additionally, you must research and discuss cloud network connectivity
through proxy servers and firewalls.</para>
<para>In this chapter, we'll give some examples of network implementations
to consider and provide information about some of the network layouts that
OpenStack uses. Finally, we have some brief notes on the networking services
that are essential for stable operation.</para>
<section xml:id="mgmt_network">
<title>Management Network</title>
<para>A <glossterm>management network</glossterm> (a separate network for
use by your cloud operators) typically consists of a separate switch and
separate NICs (network interface cards), and is a recommended option. This
segregation prevents system administration and the monitoring of system
access from being disrupted by traffic generated by guests.<indexterm
class="singular">
<primary>NICs (network interface cards)</primary>
</indexterm><indexterm class="singular">
<primary>management network</primary>
</indexterm><indexterm class="singular">
<primary>network design</primary>
<secondary>management network</secondary>
</indexterm></para>
<para>Consider creating other private networks for communication between
internal components of OpenStack, such as the message queue and OpenStack
Compute. Using a virtual local area network (VLAN) works well for these
scenarios because it provides a method for creating multiple virtual
networks on a physical network.</para>
</section>
<section xml:id="public_addressing">
<title>Public Addressing Options</title>
<para>There are two main types of IP addresses for guest virtual machines:
fixed IPs and floating IPs. Fixed IPs are assigned to instances on boot,
whereas floating IP addresses can change their association between
instances by action of the user. Both types of IP addresses can be either
public or private, depending on your use case.<indexterm class="singular">
<primary>IP addresses</primary>
<secondary>public addressing options</secondary>
</indexterm><indexterm class="singular">
<primary>network design</primary>
<secondary>public addressing options</secondary>
</indexterm></para>
<para>Fixed IP addresses are required, whereas it is possible to run
OpenStack without floating IPs. One of the most common use cases for
floating IPs is to provide public IP addresses to a private cloud, where
there are a limited number of IP addresses available. Another is for a
public cloud user to have a "static" IP address that can be reassigned
when an instance is upgraded or moved.<indexterm class="singular">
<primary>IP addresses</primary>
<secondary>static</secondary>
</indexterm><indexterm class="singular">
<primary>static IP addresses</primary>
</indexterm></para>
<para>Fixed IP addresses can be private for private clouds, or public for
public clouds. When an instance terminates, its fixed IP is lost. It is
worth noting that newer users of cloud computing may find their ephemeral
nature frustrating.<indexterm class="singular">
<primary>IP addresses</primary>
<secondary>fixed</secondary>
</indexterm><indexterm class="singular">
<primary>fixed IP addresses</primary>
</indexterm></para>
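    <para>As a minimal sketch (the exact client syntax varies by release, and
    the pool name <code>public</code> and the addresses shown are examples
    only), allocating a floating IP address and associating it with an
    instance using the legacy nova client might look like this:</para>
    <programlisting>$ nova floating-ip-create public
$ nova add-floating-ip INSTANCE_NAME 203.0.113.10
$ nova remove-floating-ip INSTANCE_NAME 203.0.113.10</programlisting>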
</section>
<section xml:id="ip_address_planning">
<title>IP Address Planning</title>
<para>An OpenStack installation can potentially have many subnets (ranges
of IP addresses) and different types of services in each. An IP address
plan can assist with a shared understanding of network partition purposes
and scalability. Control services can have public and private IP
addresses, and as noted above, there are a couple of options for an
instance's public addresses.<indexterm class="singular">
<primary>IP addresses</primary>
<secondary>address planning</secondary>
</indexterm><indexterm class="singular">
<primary>network design</primary>
<secondary>IP address planning</secondary>
</indexterm></para>
<para>An IP address plan might be broken down into the following
sections:<indexterm class="singular">
<primary>IP addresses</primary>
<secondary>sections of</secondary>
</indexterm></para>
<variablelist>
<varlistentry>
<term>Subnet router</term>
<listitem>
<para>Packets leaving the subnet go via this address, which could be
a dedicated router or a <literal>nova-network</literal>
service.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Control services public interfaces</term>
<listitem>
<para>Public access to <code>swift-proxy</code>,
<code>nova-api</code>, <code>glance-api</code>, and horizon come to
these addresses, which could be on one side of a load balancer or
pointing at individual machines.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Object Storage cluster internal communications</term>
<listitem>
<para>Traffic among object/account/container servers and between
these and the proxy server's internal interface uses this private
network.<indexterm class="singular">
<primary>containers</primary>
<secondary>container servers</secondary>
</indexterm><indexterm class="singular">
<primary>objects</primary>
<secondary>object servers</secondary>
</indexterm><indexterm class="singular">
<primary>account server</primary>
</indexterm></para>
</listitem>
</varlistentry>
<varlistentry>
<term>Compute and storage communications</term>
<listitem>
<para>If ephemeral or block storage is external to the compute node,
this network is used.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Out-of-band remote management</term>
<listitem>
<para>If a dedicated remote access controller chip is included in
servers, often these are on a separate network.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>In-band remote management</term>
<listitem>
<para>Often, an extra (such as 1 GB) interface on compute or storage
nodes is used for system administrators or monitoring tools to
access the host instead of going through the public
interface.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Spare space for future growth</term>
<listitem>
<para>Adding more public-facing control services or guest instance
IPs should always be part of your plan.</para>
</listitem>
</varlistentry>
</variablelist>
<para>For example, take a deployment that has both OpenStack Compute and
Object Storage, with private ranges 172.22.42.0/24 and 172.22.87.0/26
available. One way to segregate the space might be as follows:</para>
<programlisting><?db-font-size 55%?>172.22.42.0/24:
172.22.42.1 - 172.22.42.3 - subnet routers
172.22.42.4 - 172.22.42.20 - spare for networks
172.22.42.21 - 172.22.42.104 - Compute node remote access controllers (inc spare)
172.22.42.105 - 172.22.42.188 - Compute node management interfaces (inc spare)
172.22.42.189 - 172.22.42.208 - Swift proxy remote access controllers (inc spare)
172.22.42.209 - 172.22.42.228 - Swift proxy management interfaces (inc spare)
172.22.42.229 - 172.22.42.252 - Swift storage servers remote access controllers (inc spare)
172.22.42.253 - 172.22.42.254 - spare
172.22.87.0/26:
172.22.87.1 - 172.22.87.3 - subnet routers
172.22.87.4 - 172.22.87.24 - Swift proxy server internal interfaces (inc spare)
172.22.87.25 - 172.22.87.63 - Swift object server internal interfaces (inc spare)</programlisting>
<para>A similar approach can be taken with public IP
addresses, taking note that large, flat ranges are
preferred for use with guest instance IPs. Take into
account that for some OpenStack networking options, a
public IP address in the range of a guest instance public
IP address is assigned to the nova-compute host.</para>
<para>A similar approach can be taken with public IP addresses, taking
note that large, flat ranges are preferred for use with guest instance
IPs. Take into account that for some OpenStack networking options, a
public IP address in the range of a guest instance public IP address is
assigned to the <literal>nova-compute</literal> host.</para>
</section>
<section xml:id="network_topology">
<title>Network Topology</title>
<para>OpenStack Compute with <literal>nova-network</literal> provides
predefined network deployment models, each with its own strengths and
weaknesses. The selection of a network manager changes your network
topology, so the choice should be made carefully. You also have a choice
between the tried-and-true legacy <literal>nova-network</literal> settings
or the <phrase role="keep-together">neutron</phrase> project for OpenStack
Networking. Both offer networking for launched instances with different
implementations and requirements.<indexterm class="singular">
<primary>networks</primary>
<secondary>deployment options</secondary>
</indexterm><indexterm class="singular">
<primary>networks</primary>
<secondary>network managers</secondary>
</indexterm><indexterm class="singular">
<primary>network design</primary>
<secondary>network topology</secondary>
<tertiary>deployment options</tertiary>
</indexterm></para>
<para>For OpenStack Networking with the neutron project, typical
configurations are documented with the idea that any setup you can
configure with real hardware you can re-create with a software-defined
equivalent. Each tenant can contain typical network elements such as
routers, and services such as DHCP.</para>
<para><xref linkend="network_deployment_options" /> discusses the
networking deployment options for both legacy
<literal>nova-network</literal> options and an equivalent neutron
configuration.<indexterm class="singular">
<primary>provisioning/deployment</primary>
<secondary>network deployment options</secondary>
</indexterm></para>
<table rules="all" width="729" xml:id="network_deployment_options">
<caption>Networking deployment options</caption>
<col width="17%" />
<col width="22%" />
<col width="23%" />
<col width="39%" />
<thead>
<tr valign="top">
<th>Network deployment model</th>
<th>Strengths</th>
<th>Weaknesses</th>
<th>Neutron equivalent</th>
</tr>
</thead>
<tbody>
<tr valign="top">
<td><para>Flat</para></td>
<td><para>Extremely simple topology.</para> <para>No DHCP
overhead.</para></td>
<td><para>Requires file injection into the instance to configure
network interfaces.</para></td>
<td>Configure a single bridge as the integration bridge (br-int) and
connect it to a physical network interface with the Modular Layer 2
(ML2) plug-in, which uses Open vSwitch by default.</td>
</tr>
<tr valign="top">
<td><para>FlatDHCP</para></td>
<td><para>Relatively simple to deploy.</para> <para>Standard
networking.</para> <para>Works with all guest operating
systems.</para></td>
<td><para>Requires its own DHCP broadcast domain.</para></td>
<td>Configure DHCP agents and routing agents. Network Address
Translation (NAT) performed outside of compute nodes, typically on
one or more network nodes.</td>
</tr>
<tr valign="top">
<td><para>VlanManager</para></td>
<td><para>Each tenant is isolated to its own VLANs.</para></td>
<td><para>More complex to set up.</para> <para>Requires its own DHCP
broadcast domain.</para> <para>Requires many VLANs to be trunked
onto a single port.</para> <para>Standard VLAN number
limitation.</para> <para>Switches must support 802.1q VLAN
tagging.</para></td>
<td><para>Isolated tenant networks implement some form of isolation
of layer 2 traffic between distinct networks. VLAN tagging is a key
concept, where traffic is “tagged” with an ordinal identifier for
the VLAN. Isolated network implementations may or may not include
additional services like DHCP, NAT, and routing.</para></td>
</tr>
<tr valign="top">
<td><para>FlatDHCP&#160;Multi-host with high availability
(HA)</para></td>
<td><para>Networking failure is isolated to the VMs running on the
affected hypervisor.</para> <para>DHCP traffic can be isolated
within an individual host.</para> <para>Network traffic is
distributed to the compute nodes.</para></td>
<td><para>More complex to set up.</para> <para>Compute nodes
typically need IP addresses accessible by external networks.</para>
<para>Options must be carefully configured for live migration to
work with networking services.</para></td>
<td><para>Configure neutron with multiple DHCP and layer-3 agents.
Network nodes are not able to failover to each other, so the
controller runs networking services, such as DHCP. Compute nodes run
the ML2 plug-in with support for agents such as Open vSwitch or
Linux Bridge.</para></td>
</tr>
</tbody>
</table>
<para>Both <literal>nova-network</literal> and neutron services provide
similar capabilities, such as VLAN between VMs. You also can provide
multiple NICs on VMs with either service. Further discussion
follows.</para>
<section xml:id="vlans">
<title>VLAN Configuration Within OpenStack VMs</title>
<para>VLAN configuration can be as simple or as complicated as desired.
The use of VLANs has the benefit of allowing each project its own subnet
and broadcast segregation from other projects. To allow OpenStack to
efficiently use VLANs, you must allocate a VLAN range (one for each
project) and turn each compute node switch port into a trunk
port.<indexterm class="singular">
<primary>networks</primary>
<secondary>VLAN</secondary>
</indexterm><indexterm class="singular">
<primary>VLAN network</primary>
</indexterm><indexterm class="singular">
<primary>network design</primary>
<secondary>network topology</secondary>
<tertiary>VLAN with OpenStack VMs</tertiary>
</indexterm></para>
<para>For example, if you estimate that your cloud must support a
maximum of 100 projects, pick a free VLAN range that your network
infrastructure is currently not using (such as VLAN 200–299). You must
configure OpenStack with this range and also configure your switch ports
to allow VLAN traffic from that range.</para>
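    <para>For illustration only (the interface name and range here are
    placeholders, and the option names apply to the legacy
    <literal>nova-network</literal> VlanManager), the matching
    <code>nova.conf</code> settings might resemble:</para>
    <programlisting># /etc/nova/nova.conf (excerpt, illustrative)
network_manager=nova.network.manager.VlanManager
vlan_interface=eth1
vlan_start=200</programlisting>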
</section>
<?hard-pagebreak?>
<section xml:id="network_topology">
<title>Network Topology</title>
<para>OpenStack Compute with nova-network provides pre-defined network
deployment models, each with its own strengths and weaknesses. The
selection of a network manager changes your network topology, so the
choice should be made carefully. You also have a choice between the
tried-and-true legacy nova-network settings or the neutron project
for OpenStack Networking. Both offer networking for launched
instances with different implementations and requirements.</para>
<para>For OpenStack Networking with the neutron project, typical
configurations are documented with the idea that any setup you can
configure with real hardware you can re-create with a
software-defined equivalent. Each tenant can contain typical network
elements such as routers and services such as DHCP.</para>
<para><xref linkend="network_deployment_options"/> discusses the
networking deployment options for both
legacy nova-network options and an equivalent neutron configuration:
</para>
<table rules="all" width="729" xml:id="network_deployment_options">
<caption>Networking Deployment Options</caption>
<col width="17%"/>
<col width="22%"/>
<col width="23%"/>
<col width="39%"/>
<thead>
<tr valign="top">
<th>Network Deployment Model</th>
<th>Strengths</th>
<th>Weaknesses</th>
<th>Neutron Equivalent</th>
</tr>
</thead>
<tbody>
<tr valign="top">
<td>
<para>Flat</para>
</td>
<td>
<para>Extremely simple topology.</para>
<para>No DHCP overhead.</para>
</td>
<td>
<para>Requires file injection into the instance to
configure network interfaces.</para>
</td>
<td>Configure a single bridge as the integration bridge
(br-int) and connect it to a physical network
interface with the Modular Layer 2 (ML2) plug-in,
which uses Open vSwitch by default.</td>
</tr>
<tr valign="top">
<td>
<para>FlatDHCP</para>
</td>
<td>
<para>Relatively simple to deploy.</para>
<para>Standard networking.</para>
<para>Works with all guest operating systems.</para>
</td>
<td>
<para>Requires its own DHCP broadcast domain.</para>
</td>
<td>Configure DHCP agents and routing agents. Network
Address Translation (NAT) performed outside of
compute nodes, typically on one or more network nodes.</td>
</tr>
<tr valign="top">
<td>
<para>VlanManager</para>
</td>
<td>
<para>Each tenant is isolated to its own VLANs.</para>
</td>
<td>
<para>More complex to set up.</para>
<para>Requires its own DHCP broadcast domain.</para>
<para>Requires many VLANs to be trunked onto a single port.</para>
<para>Standard VLAN number limitation.</para>
<para>Switches must support 802.1q VLAN tagging.</para>
</td>
<td>
<para>Isolated tenant networks implement some form of
isolation of layer 2 traffic between distinct networks.
VLAN tagging is a key concept, where traffic is
“tagged” with an ordinal identifier for the VLAN.
Isolated network implementations may or may not
include additional services like DHCP, NAT, and
routing.</para>
</td>
</tr>
<tr valign="top">
<td>
<para>FlatDHCP Multi-host with high availability (HA)</para>
</td>
<td>
<para>Networking failure is isolated to the VMs running
on the affected hypervisor.</para>
<para>DHCP traffic can be isolated within an individual
host.</para>
<para>Network traffic is distributed to the compute
nodes.</para>
</td>
<td>
<para>More complex to set up.</para>
<para>Compute nodes typically need IP addresses
accessible by external networks.</para>
<para>Options must be carefully configured for live
migration to work with networking services.</para>
</td>
<td>
<para>Configure neutron with multiple DHCP and layer 3
agents. Network nodes are not able to failover to
each other, so the controller runs networking
services such as DHCP. Compute nodes run the
ML2 plug-in with support for agents such as
Open vSwitch or Linux Bridge.</para>
</td>
</tr>
</tbody>
</table>
<para>Both nova-network and neutron services provide similar capabilities,
such as VLAN between VMs. You also can provide multiple NICs
on VMs with either service. Further discussion
follows.</para>
<section xml:id="vlans">
<title>VLAN Configuration within OpenStack VMs</title>
<para>VLAN configuration can be as simple or as
complicated as desired. The use of VLANs has the
benefit of allowing each project its own subnet and
broadcast segregation from other projects. To allow
OpenStack to efficiently use VLANs, you must allocate
a VLAN range (one for each project) and turn each
compute node switch port into a trunk port.</para>
<para>For example, if you estimate that your cloud must
support a maximum of 100 projects, pick a free VLAN range
that your network infrastructure is currently not
using (such as VLAN 200&ndash;299). You must configure
OpenStack with this range and also configure your
switch ports to allow VLAN traffic from that
range.</para>
</section>
<?hard-pagebreak?>
<section xml:id="multi_nic">
<title>Multi-NIC Provisioning</title>
<para>OpenStack Compute has the ability to assign multiple
NICs to instances on a per-project basis. This is
generally an advanced feature and not an everyday
request. This can easily be done on a per-request
basis, though. However, be aware that a second NIC
uses up an entire subnet or VLAN. This decrements your
total number of supported projects by one.</para>
</section>
<section xml:id="multi_host_single_host_networks">
<title>Multi-Host and Single-Host Networking</title>
<para>The nova-network service has the ability to operate
in a multi-host or single-host mode. Multi-host is
when each compute node runs a copy of nova-network and
the instances on that compute node use the compute
node as a gateway to the Internet. The compute nodes
also host the floating IPs and security groups for
instances on that node. Single-host is when a central
server&mdash;for example, the cloud controller&mdash;runs the
<code>nova-network</code> service. All compute
nodes forward traffic from the instances to the cloud
controller. The cloud controller then forwards traffic
to the Internet. The cloud controller hosts the
floating IPs and security groups for all instances on
all compute nodes in the cloud.</para>
<para>There are benefits to both modes. Single-host has
the downside of a single point of failure. If the
cloud controller is not available, instances cannot
communicate on the network. This is not true with
multi-host, but multi-host requires that each compute
node has a public IP address to communicate on the
Internet. If you are not able to obtain a significant
block of public IP addresses, multi-host might not be
an option.</para>
</section>
<section xml:id="multi_nic">
<title>Multi-NIC Provisioning</title>
<para>OpenStack Compute has the ability to assign multiple NICs to
instances on a per-project basis. This is generally an advanced feature
and not an everyday request. This can easily be done on a per-request
basis, though. However, be aware that a second NIC uses up an entire
subnet or VLAN. This decrements your total number of supported projects
by one.<indexterm class="singular">
<primary>MultiNic</primary>
</indexterm><indexterm class="singular">
<primary>network design</primary>
<secondary>network topology</secondary>
<tertiary>multi-NIC provisioning</tertiary>
</indexterm></para>
</section>
<section xml:id="services_for_networking">
<title>Services for Networking</title>
<para>OpenStack, like any network application, has a number of
standard considerations to apply, such as NTP and
DNS.</para>
<section xml:id="ntp">
<title>NTP</title>
<para>Time synchronization is a critical element to ensure
continued operation of OpenStack components. Correct
time is necessary to avoid errors in instance
scheduling, replication of objects in the object
store, and even matching log timestamps for
debugging.</para>
<para>All servers running OpenStack components should be
able to access an appropriate NTP server. You may
decide to set up one locally or use the public pools
available from the <link xlink:href="http://www.pool.ntp.org">
Network Time Protocol project</link>
(http://www.pool.ntp.org/).</para>
</section>
<section xml:id="dns">
<title>DNS</title>
<para>OpenStack does not currently provide DNS
services, aside from the dnsmasq daemon, which
resides on <code>nova-network</code> hosts. You
could consider providing a dynamic DNS service to
allow instances to update a DNS entry with new IP
addresses. You can also consider making a generic
forward and reverse DNS mapping for instances' IP
addresses, such as
vm-203-0-113-123.example.com.</para>
</section>
<section xml:id="multi_host_single_host_networks">
<title>Multi-Host and Single-Host Networking</title>
<para>The <literal>nova-network</literal> service has the ability to
operate in a multi-host or single-host mode. Multi-host is when each
compute node runs a copy of <literal>nova-network</literal> and the
instances on that compute node use the compute node as a gateway to the
Internet. The compute nodes also host the floating IPs and security
groups for instances on that node. Single-host is when a central
server—for example, the cloud controller—runs the
<code>nova-network</code> service. All compute nodes forward traffic
from the instances to the cloud controller. The cloud controller then
forwards traffic to the Internet. The cloud controller hosts the
floating IPs and security groups for all instances on all compute nodes
in the cloud.<indexterm class="singular">
<primary>single-host networking</primary>
</indexterm><indexterm class="singular">
<primary>networks</primary>
<secondary>multi-host</secondary>
</indexterm><indexterm class="singular">
<primary>multi-host networking</primary>
</indexterm><indexterm class="singular">
<primary>network design</primary>
<secondary>network topology</secondary>
<tertiary>multi- vs. single-host networking</tertiary>
</indexterm></para>
    <para>There are benefits to both modes. Single-host has the downside of
a single point of failure. If the cloud controller is not available,
instances cannot communicate on the network. This is not true with
multi-host, but multi-host requires that each compute node has a public
IP address to communicate on the Internet. If you are not able to obtain
a significant block of public IP addresses, multi-host might not be an
option.</para>
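    <para>As a rough sketch (the interface names are placeholders), enabling
    multi-host FlatDHCP networking in each compute node's
    <code>nova.conf</code> might look like:</para>
    <programlisting># /etc/nova/nova.conf (excerpt, illustrative)
network_manager=nova.network.manager.FlatDHCPManager
multi_host=True
flat_interface=eth1
public_interface=eth0</programlisting>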
</section>
<section xml:id="ops-network-conclusion">
<title>Conclusion</title>
<para>Armed with your IP address layout and numbers and knowledge
about the topologies and services you can use, it's now time to prepare
the network for your installation. Be sure to also check out
the <citetitle>OpenStack Security Guide</citetitle> for tips on
securing your network. We wish you a good
relationship with your networking team!</para>
</section>
<section xml:id="services_for_networking">
<title>Services for Networking</title>
<para>OpenStack, like any network application, has a number of standard
considerations to apply, such as NTP and DNS.<indexterm class="singular">
<primary>network design</primary>
<secondary>services for networking</secondary>
</indexterm></para>
<section xml:id="ntp">
<title>NTP</title>
<para>Time synchronization is a critical element to ensure continued
operation of OpenStack components. Correct time is necessary to avoid
errors in instance scheduling, replication of objects in the object
store, and even matching log timestamps for debugging.<indexterm
class="singular">
<primary>networks</primary>
<secondary>Network Time Protocol (NTP)</secondary>
</indexterm></para>
<para>All servers running OpenStack components should be able to access
an appropriate NTP server. You may decide to set up one locally or use
the public pools available from the <link
xlink:href="http://opsgui.de/NPFRua"> Network Time Protocol
project</link>.</para>
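      <para>A minimal <code>ntp.conf</code> that simply follows the public
      pool (adjust or replace these servers with your local ones as
      appropriate) might look like:</para>
      <programlisting># /etc/ntp.conf (excerpt, illustrative)
server 0.pool.ntp.org iburst
server 1.pool.ntp.org iburst
server 2.pool.ntp.org iburst
driftfile /var/lib/ntp/ntp.drift</programlisting>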
</section>
<section xml:id="dns">
<title>DNS</title>
<para>OpenStack does not currently provide DNS services, aside from the
dnsmasq daemon, which resides on <code>nova-network</code> hosts. You
could consider providing a dynamic DNS service to allow instances to
update a DNS entry with new IP addresses. You can also consider making a
generic forward and reverse DNS mapping for instances' IP addresses,
such as vm-203-0-113-123.example.com.<indexterm class="singular">
<primary>DNS (Domain Name Server, Service or System)</primary>
<secondary>DNS service choices</secondary>
</indexterm></para>
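      <para>If you do publish such a generic mapping, one possible sketch
      (assuming BIND-style zone files; the zone names and address range below
      are examples only) uses the <code>$GENERATE</code> directive in the
      forward and reverse zones:</para>
      <programlisting>; example.com zone (illustrative)
$GENERATE 1-254 vm-203-0-113-$ A 203.0.113.$

; 113.0.203.in-addr.arpa zone (illustrative)
$GENERATE 1-254 $ PTR vm-203-0-113-$.example.com.</programlisting>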
</section>
</section>
<section xml:id="ops-network-conclusion">
<title>Conclusion</title>
<para>Armed with your IP address layout and numbers and knowledge about
the topologies and services you can use, it's now time to prepare the
network for your installation. Be sure to also check out the <link
xlink:href="http://opsgui.de/NPG4NW"
xlink:title="OpenStack Security Guide"><emphasis>OpenStack Security
Guide</emphasis></link> for tips on securing your network. We wish you a
good relationship with your networking team!</para>
</section>
</chapter>


@ -1,305 +1,375 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chapter [
<!-- Some useful entities borrowed from HTML -->
<!ENTITY ndash "&#x2013;">
<!ENTITY mdash "&#x2014;">
<!ENTITY hellip "&#x2026;">
<!ENTITY plusmn "&#xB1;">
]>
<chapter xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0"
xml:id="section_arch_provision">
<?dbhtml stop-chunking?>
<title>Provisioning and Deployment</title>
<para>A critical part of a cloud's scalability is the amount of effort that
it takes to run your cloud. To minimize the operational cost of running
your cloud, set up and use an automated deployment and configuration
infrastructure with a configuration management system such as Puppet or
Chef. Combined, these systems greatly reduce manual effort and the
chance for operator error.</para>
<para>This infrastructure includes systems to automatically install the
operating system's initial configuration and later coordinate the
configuration of all services automatically and centrally, which reduces
both manual effort and the chance for error. Examples include Ansible,
Chef, Puppet, and Salt. You can even use OpenStack to deploy OpenStack,
fondly named TripleO, for OpenStack On OpenStack.</para>
<section xml:id="automated_deploy">
<title>Automated Deployment</title>
<para>An automated deployment system installs and configures operating
systems on new servers, without intervention, after the
absolute minimum amount of manual work, including physical
racking, MAC-to-IP assignment, power configuration, and so
on. Typically, solutions rely on wrappers around PXE boot
and TFTP servers for the basic operating system install,
and then hand off to an automated configuration management
system.</para>
<para>Both Ubuntu and Red Hat Linux include mechanisms for configuring
the operating system, including preseed and kickstart, that you can
use after a network boot. Typically, these are used to bootstrap an
automated configuration system. Alternatively, you can use an
image-based approach for deploying the operating system, such as
systemimager. You can use both approaches with a virtualized
infrastructure, such as when you run VMs to separate your control
services and physical infrastructure.</para>
<para>When you create a deployment plan, focus on a few vital areas
because they are very hard to modify post deployment. The next two
sections talk about configurations for:</para>
<para>
<itemizedlist>
<listitem>
<para>Disk partitioning and disk array setup for scalability</para>
</listitem>
<listitem>
<para>Networking configuration just for PXE booting</para>
</listitem>
</itemizedlist>
</para>
<?hard-pagebreak?>
<section xml:id="disk_partition_raid">
<title>Disk Partitioning and RAID</title>
<para>At the very base of any operating system are the hard drives on which the
operating system (OS) is installed.</para>
<para>You must complete the following configurations on the server's hard drives:</para>
<itemizedlist role="compact">
<listitem>
<para>Partitioning, which provides greater flexibility for layout of operating
system and swap space, as described below.</para>
</listitem>
<listitem>
<para>Adding to a RAID array (RAID stands for redundant array of independent
disks), based on the number of disks you have available, so that you can add
capacity as your cloud grows. Some options are described in more detail
below.</para>
</listitem>
</itemizedlist>
<para>The simplest option to get started is to use one hard drive with two
partitions:</para>
<itemizedlist role="compact">
<listitem>
<para>File system to store files and directories,
where all the data lives, including the root
partition that starts and runs the system.</para>
</listitem>
<listitem>
<para>Swap space to free up memory for processes,
as an independent area of the physical disk used
only for swapping and nothing else.</para>
</listitem>
</itemizedlist>
<para>RAID is not used in this simplistic one-drive setup
because generally for production clouds you want to ensure
that if one disk fails another can take its
place. Instead, for production, use more than one disk. The
number of disks determines what types of RAID arrays to
build.</para>
<?hard-pagebreak?>
<para>We recommend that you choose one of the following multiple disk options:</para>
<itemizedlist role="compact">
<listitem>
<para><emphasis role="bold">Option 1:</emphasis>
Partition all drives in the same way in a
horizontal fashion, as shown in <xref
linkend="disk_partition_figure"/>.</para>
<figure xml:id="disk_partition_figure">
<title>Partition setup of drives</title>
<mediaobject>
<imageobject>
<imagedata width="5in" fileref="figures/os_disk_partition.png"/>
</imageobject>
</mediaobject>
</figure>
<para>With this option, you can assign different partitions to different RAID
arrays. You can allocate partition 1 of disk one and two to the
<code>/boot</code> partition mirror. You can make partition 2 of all
disks the root partition mirror. You can use partition 3 of all disks for a
<code>cinder-volumes</code> LVM partition running on a RAID 10
array.</para>
<para>While you might end up with unused
partitions, such as partition 1 in disk three and
four of this example, this option allows for
maximum utilization of disk space. I/O
performance might be an issue as a result of all disks
being used for all tasks.</para>
</listitem>
<listitem>
<para><emphasis role="bold">Option 2:</emphasis> Add all raw disks to one large
RAID array, either hardware or software based. You can partition this large
array with the boot, root, swap, and LVM areas. This option is simple to
implement and uses all partitions. However, disk I/O might suffer.</para>
</listitem>
<listitem>
<para><emphasis role="bold">Option 3:</emphasis>
Dedicate entire disks to certain partitions. For
example, you could allocate disk one and two
entirely to the boot, root, and swap partitions
under a RAID 1 mirror. Then, allocate disk three
and four entirely to the LVM partition, also under
a RAID 1 mirror. Disk I/O should be better because
I/O is focused on dedicated tasks. However, the
LVM partition is much smaller.</para>
</listitem>
</itemizedlist>
<tip><para>You may find that you can automate the
partitioning itself. For example, MIT uses Fully
Automatic Installation (FAI) (<link
xlink:href="http://fai-project.org/"
>fai-project.org/</link>) to do the initial PXE-based
partition and then install using a combination of
min/max and percentage-based
partitioning.</para></tip>
<para>As with most architecture choices, the right answer
depends on your environment. If you are using existing
hardware, you know the disk density of your servers and
can determine some decisions based on the options
above. If you are going through a procurement process,
your users' requirements also help you determine hardware
purchases. Here are some examples from a private cloud
providing web developers custom environments at
AT&amp;T. This example is from a specific deployment, so
your existing hardware or procurement opportunity may vary
from this. AT&amp;T uses three types of hardware in its
deployment:</para>
<itemizedlist>
<listitem>
<para>Hardware for controller nodes, used for all
stateless OpenStack API services. About
32&ndash;64 GB memory, small attached disk, one
processor, varied number of cores, such as
6&ndash;12.</para>
</listitem>
<listitem>
<para>Hardware for compute nodes. Typically 256 or
144 GB memory, two processors, 24 cores. 4&ndash;6
TB direct attached storage, typically in a RAID 5
configuration.</para>
</listitem>
<listitem>
<para>Hardware for storage nodes. Typically for
these the disk space is optimized for the lowest
cost per GB of storage while maintaining
rack-space efficiency.</para>
</listitem>
</itemizedlist>
<para>Again, the right answer
depends on your environment. You have to make your
decision based on the trade-offs between space utilization,
simplicity, and I/O performance.</para>
</section>
<section xml:id="network_config">
<title>Network Configuration</title>
<para>Network configuration is a very large topic that spans
multiple areas of this book. For now, make sure that your
servers can PXE boot and successfully communicate with the
deployment server.</para>
<para>For example, you usually cannot configure NICs for VLANs when
PXE booting. Additionally, you usually cannot PXE boot with
bonded NICs. If you run into this scenario, consider using a
simple 1 GB switch in a private network on which only your cloud
communicates.</para>
</section>
<chapter version="5.0" xml:id="section_arch_provision"
xmlns="http://docbook.org/ns/docbook"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:ns5="http://www.w3.org/1999/xhtml"
xmlns:ns4="http://www.w3.org/2000/svg"
xmlns:ns3="http://www.w3.org/1998/Math/MathML"
xmlns:ns="http://docbook.org/ns/docbook">
<?dbhtml stop-chunking?>
<title>Provisioning and Deployment</title>
<para>A critical part of a cloud's scalability is the amount of effort that
it takes to run your cloud. To minimize the operational cost of running your
cloud, set up and use an automated deployment and configuration
infrastructure with a configuration management system, such as Puppet or
Chef. Combined, these systems greatly reduce manual effort and the chance
for operator error.<indexterm class="singular">
<primary>cloud computing</primary>
<secondary>minimizing costs of</secondary>
</indexterm></para>
<para>This infrastructure includes systems to automatically install the
operating system's initial configuration and later coordinate the
configuration of all services automatically and centrally, which reduces
both manual effort and the chance for error. Examples include Ansible, Chef,
Puppet, and Salt. You can even use OpenStack to deploy OpenStack, fondly
named TripleO, for OpenStack On OpenStack.<indexterm class="singular">
<primary>Puppet</primary>
</indexterm><indexterm class="singular">
<primary>Chef</primary>
</indexterm></para>
<section xml:id="automated_deploy">
<title>Automated Deployment</title>
<para>An automated deployment system installs and configures operating
systems on new servers, without intervention, after the absolute minimum
amount of manual work, including physical racking, MAC-to-IP assignment,
and power configuration. Typically, solutions rely on wrappers around PXE
boot and TFTP servers for the basic operating system install and then hand
off to an automated configuration management system.<indexterm
class="singular">
<primary>deployment</primary>
<see>provisioning/deployment</see>
</indexterm><indexterm class="singular">
<primary>provisioning/deployment</primary>
<secondary>automated deployment</secondary>
</indexterm></para>
<para>Both Ubuntu and Red Hat Linux include mechanisms for configuring the
operating system, including preseed and kickstart, that you can use after
a network boot. Typically, these are used to bootstrap an automated
configuration system. Alternatively, you can use an image-based approach
for deploying the operating system, such as systemimager. You can use both
approaches with a virtualized infrastructure, such as when you run VMs to
separate your control services and physical infrastructure.</para>
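    <para>For illustration only (the mirror URL, disk layout, and package
    choice are placeholders), a trimmed kickstart file for a Red Hat-based
    node might install a base system and then hand off to a
    configuration-management agent from its <code>%post</code>
    section:</para>
    <programlisting># ks.cfg (trimmed, illustrative)
install
url --url=http://192.0.2.10/centos/os/x86_64
clearpart --all --initlabel
autopart
rootpw --iscrypted CHANGE_ME
%packages
@core
%end
%post
yum -y install puppet
%end</programlisting>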
<para>When you create a deployment plan, focus on a few vital areas
because they are very hard to modify post deployment. The next two
sections talk about configurations for:</para>
<para><itemizedlist>
<listitem>
          <para>Disk partitioning and disk array setup for scalability</para>
</listitem>
<listitem>
<para>Networking configuration just for PXE booting</para>
</listitem>
</itemizedlist></para>
<section xml:id="disk_partition_raid">
<title>Disk Partitioning and RAID</title>
<para>At the very base of any operating system are the hard drives on
which the operating system (OS) is installed.<indexterm class="singular">
<primary>RAID (redundant array of independent disks)</primary>
</indexterm><indexterm class="singular">
<primary>partitions</primary>
<secondary>disk partitioning</secondary>
</indexterm><indexterm class="singular">
<primary>disk partitioning</primary>
</indexterm></para>
<para>You must complete the following configurations on the server's
hard drives:</para>
<itemizedlist role="compact">
<listitem>
<para>Partitioning, which provides greater flexibility for layout of
operating system and swap space, as described below.</para>
</listitem>
<listitem>
<para>Adding to a RAID array (RAID stands for redundant array of
independent disks), based on the number of disks you have available,
so that you can add capacity as your cloud grows. Some options are
described in more detail below.</para>
</listitem>
</itemizedlist>
<para>The simplest option to get started is to use one hard drive with
two partitions:</para>
<itemizedlist role="compact">
<listitem>
<para>File system to store files and directories, where all the data
lives, including the root partition that starts and runs the
system</para>
</listitem>
<listitem>
<para>Swap space to free up memory for processes, as an independent
area of the physical disk used only for swapping and nothing
else</para>
</listitem>
</itemizedlist>
<para>RAID is not used in this simplistic one-drive setup because
generally for production clouds, you want to ensure that if one disk
fails, another can take its place. Instead, for production, use more
than one disk. The number of disks determines what types of RAID arrays
to build.</para>
<para>We recommend that you choose one of the following multiple disk
options:</para>
<variablelist>
<varlistentry>
<term>Option 1</term>
<listitem>
<para>Partition all drives in the same way in a horizontal
fashion, as shown in <xref
linkend="disk_partition_figure" />.</para>
<para>With this option, you can assign different partitions to
different RAID arrays. You can allocate partition 1 of disk one
and two to the <code>/boot</code> partition mirror. You can make
partition 2 of all disks the root partition mirror. You can use
partition 3 of all disks for a <code>cinder-volumes</code> LVM
partition running on a RAID 10 array.</para>
<figure xml:id="disk_partition_figure">
<title>Partition setup of drives</title>
<mediaobject>
<imageobject>
<imagedata fileref="figures/osog_0201.png"></imagedata>
</imageobject>
</mediaobject>
</figure>
<para>While you might end up with unused partitions, such as
partition 1 in disk three and four of this example, this option
allows for maximum utilization of disk space. I/O performance
might be an issue as a result of all disks being used for all
tasks.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Option 2</term>
<listitem>
<para>Add all raw disks to one large RAID array, either hardware
or software based. You can partition this large array with the
boot, root, swap, and LVM areas. This option is simple to
implement and uses all partitions. However, disk I/O might
suffer.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Option 3</term>
<listitem>
<para>Dedicate entire disks to certain partitions. For example,
you could allocate disk one and two entirely to the boot, root,
and swap partitions under a RAID 1 mirror. Then, allocate disk
three and four entirely to the LVM partition, also under a RAID 1
mirror. Disk I/O should be better because I/O is focused on
dedicated tasks. However, the LVM partition is much
smaller.</para>
</listitem>
</varlistentry>
</variablelist>
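      <para>As a brief command sketch of Option 3 above (the device names are
      examples only, and a hardware RAID controller would replace the
      <code>mdadm</code> steps), the mirrors and the LVM volume group for
      <code>cinder-volumes</code> might be created like this:</para>
      <programlisting># illustrative only; adjust device names to your hardware
# disks one and two: RAID 1 for boot, root, and swap
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda /dev/sdb
# disks three and four: RAID 1 for the LVM partition
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdc /dev/sdd
pvcreate /dev/md1
vgcreate cinder-volumes /dev/md1</programlisting>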
<tip>
<para>You may find that you can automate the partitioning itself. For
example, MIT uses <link xlink:href="http://fai-project.org/">Fully
Automatic Installation (FAI)</link> to do the initial PXE-based
partition and then install using a combination of min/max and
percentage-based partitioning.<indexterm class="singular">
<primary>Fully Automatic Installation (FAI)</primary>
</indexterm></para>
</tip>
<para>As with most architecture choices, the right answer depends on
your environment. If you are using existing hardware, you know the disk
density of your servers and can determine some decisions based on the
options above. If you are going through a procurement process, your
users' requirements also help you determine hardware purchases. Here are
some examples from a private cloud providing web developers custom
environments at AT&amp;T. This example is from a specific deployment, so
your existing hardware or procurement opportunity may vary from this.
AT&amp;T uses three types of hardware in its deployment:</para>
<itemizedlist>
<listitem>
<para>Hardware for controller nodes, used for all stateless
OpenStack API services. About 32–64 GB memory, small attached disk,
one processor, varied number of cores, such as 6–12.</para>
</listitem>
<listitem>
<para>Hardware for compute nodes. Typically 256 or 144 GB memory,
two processors, 24 cores. 4–6 TB direct attached storage, typically
in a RAID 5 configuration.</para>
</listitem>
<listitem>
<para>Hardware for storage nodes. Typically for these, the disk
space is optimized for the lowest cost per GB of storage while
maintaining rack-space efficiency.</para>
</listitem>
</itemizedlist>
<para>Again, the right answer depends on your environment. You have to
make your decision based on the trade-offs between space utilization,
simplicity, and I/O performance.</para>
</section>
<section xml:id="auto_config">
<title>Automated Configuration</title>
<para>The purpose of automatic configuration management is to
establish and maintain the consistency of a system without
using human intervention. You want to maintain consistency in
your deployments so that you can have the same cloud every
time, repeatably. Proper use of automatic
configuration-management tools ensures that components of the
cloud systems are in particular states, in addition to
simplifying deployment and configuration change
propagation.</para>
<para>These tools also make it possible to test and roll back
changes, as they are fully repeatable. Conveniently, a
large body of work has been done by the OpenStack
community in this space. Puppet&mdash;a configuration
management tool&mdash;even provides official modules for
OpenStack in an OpenStack infrastructure system known as
Stackforge at <link
xlink:href="https://github.com/stackforge/puppet-openstack"
>https://github.com/stackforge/puppet-openstack</link>. Chef
configuration management is provided within <link
xlink:href="https://github.com/stackforge/openstack-chef-repo"
>https://github.com/stackforge/openstack-chef-repo</link>.
Additional configuration-management
systems include Juju, Ansible, and Salt. Also,
PackStack is a command-line utility for Red Hat Enterprise
Linux and derivatives that uses Puppet modules to support
rapid deployment of OpenStack on existing servers over an
SSH connection.</para>
<para>An integral part of a configuration-management system is
the items that it controls. You should carefully consider all
of the items that you want, or do not want, to be
automatically managed. For example, you may not want to
automatically format hard drives with user data.</para>
<section xml:id="network_config">
<title>Network Configuration</title>
<para>Network configuration is a very large topic that spans multiple
areas of this book. For now, make sure that your servers can PXE boot
and successfully communicate with the deployment server.<indexterm
class="singular">
<primary>networks</primary>
<secondary>configuration of</secondary>
</indexterm></para>
<para>For example, you usually cannot configure NICs for VLANs when PXE
booting. Additionally, you usually cannot PXE boot with bonded NICs. If
you run into this scenario, consider using a simple 1 GB switch in a
private network on which only your cloud communicates.</para>
</section>
<?hard-pagebreak?>
<section xml:id="remote_mgmt">
<title>Remote Management</title>
<para>In our experience, most operators don't sit right next to the
servers running the cloud, and many don't necessarily enjoy visiting
the data center. OpenStack should be entirely remotely configurable,
but sometimes not everything goes according to plan.</para>
<para>In this instance, having out-of-band access to nodes running
OpenStack components is a boon. The IPMI protocol is the de facto
standard here, and acquiring hardware that supports it is highly
recommended to achieve that lights-out data center aim.</para>
<para>In addition, consider remote power control as well. While IPMI
usually controls the server's power state, having remote access to
the PDU that the server is plugged into can really be useful for
situations when everything seems wedged.</para>
</section>
<section xml:id="provision-deploy-summary">
<title>Parting Thoughts for Provisioning and Deploying OpenStack</title>
<para>You can save time by understanding the use cases for the cloud you
want to create.
Use cases for OpenStack are varied. Some include object
storage only, others require preconfigured compute
resources to speed development-environment setup, and
others need fast provisioning of compute resources that
are already secured per tenant with private networks.
Your users may have need for highly redundant servers to make sure
their legacy applications continue to run. Perhaps a goal would be
to architect these legacy applications so that they run on
multiple instances in a cloudy, fault-tolerant way, but not make it
a goal to add to those clusters over time. Your users may indicate
that they need scaling considerations because of heavy Windows
server use.</para>
<para>You can save resources by looking at the best fit for the hardware
you have in place already. You might have some high-density storage
hardware available. You could format and repurpose those servers for
OpenStack Object Storage. All of these considerations and input
from users help you build your use case and your deployment
plan.</para>
<tip><para>For further research about OpenStack deployment, investigate the
supported and documented pre-configured, pre-packaged installers for
OpenStack from companies such as <link
xlink:href="http://www.ubuntu.com/cloud/tools/openstack"
>Canonical</link>, <link
xlink:href="http://www.cisco.com/web/solutions/openstack/"
>Cisco</link>, <link xlink:href="http://www.cloudscaling.com/"
>Cloudscaling</link>, <link
xlink:href="http://www-03.ibm.com/software/products/en/smartcloud-orchestrator/"
>IBM</link>, <link xlink:href="http://www.metacloud.com/"
>Metacloud</link>, <link xlink:href="http://www.mirantis.com/"
>Mirantis</link>, <link xlink:href="http://www.pistoncloud.com/"
>Piston</link>, <link
xlink:href="http://www.rackspace.com/cloud/private/"
>Rackspace</link>, <link
xlink:href="http://www.redhat.com/openstack/">Red Hat</link>,
<link xlink:href="http://www.suse.com/cloud">SUSE</link>, and
<link xlink:href="http://www.swiftstack.com/"
>SwiftStack</link>.</para></tip>
</section>
<section xml:id="provision_conclusion">
<title>Conclusion</title>
<para>The decisions you make with respect to provisioning and
deployment will affect your day-to-day, week-to-week, and
month-to-month maintenance of the cloud. Your configuration
management will be able to evolve over time. However, more thought
and design need to be done for upfront choices about
deployment, disk partitioning, and network
configuration.</para>
</section>
</chapter>
</section>
<section xml:id="auto_config">
<title>Automated Configuration</title>
<para>The purpose of automatic configuration management is to establish
and maintain the consistency of a system without using human intervention.
You want to maintain consistency in your deployments so that you can have
the same cloud every time, repeatably. Proper use of automatic
configuration-management tools ensures that components of the cloud
systems are in particular states, in addition to simplifying deployment
and configuration change propagation.<indexterm class="singular">
<primary>automated configuration</primary>
</indexterm><indexterm class="singular">
<primary>provisioning/deployment</primary>
<secondary>automated configuration</secondary>
</indexterm></para>
<para>These tools also make it possible to test and roll back changes, as
they are fully repeatable. Conveniently, a large body of work has been
done by the OpenStack community in this space. Puppet, a configuration
management tool, even provides official modules for OpenStack in an
OpenStack infrastructure system known as <link
xlink:href="http://opsgui.de/NPFUpL">Stackforge</link>. Chef configuration
management is provided within <link role="orm:hideurl:ital"
xlink:href="https://github.com/stackforge/openstack-chef-repo"></link>.
Additional configuration management systems include Juju, Ansible, and
Salt. Also, PackStack is a command-line utility for Red Hat Enterprise
Linux and derivatives that uses Puppet modules to support rapid deployment
of OpenStack on existing servers over an SSH connection.<indexterm
class="singular">
<primary>Stackforge</primary>
</indexterm></para>
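  <para>For example, on a Red Hat Enterprise Linux derivative with the RDO
  repository enabled (an assumption for this sketch; the package name and
  answer-file path are illustrative), a minimal PackStack run over SSH to
  existing servers might look like:</para>
  <programlisting language="bash"># yum install -y openstack-packstack
# packstack --gen-answer-file=answers.txt
# (edit answers.txt to describe your existing servers, then apply it)
# packstack --answer-file=answers.txt</programlisting>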
<para>An integral part of a configuration-management system is the items
that it controls. You should carefully consider all of the items that you
want, or do not want, to be automatically managed. For example, you may
not want to automatically format hard drives with user data.</para>
</section>
<section xml:id="remote_mgmt">
<title>Remote Management</title>
<para>In our experience, most operators don't sit right next to the
servers running the cloud, and many don't necessarily enjoy visiting the
data center. OpenStack should be entirely remotely configurable, but
sometimes not everything goes according to plan.<indexterm
class="singular">
<primary>provisioning/deployment</primary>
<secondary>remote management</secondary>
</indexterm></para>
  <para>In this instance, having out-of-band access to nodes running
OpenStack components is a boon. The IPMI protocol is the de facto standard
here, and acquiring hardware that supports it is highly recommended to
achieve that lights-out data center aim.</para>
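  <para>For instance, checking on and power cycling a wedged node over IPMI
  with <command>ipmitool</command> might look like the following sketch; the
  BMC address and credentials are placeholders:</para>
  <programlisting language="bash"># ipmitool -I lanplus -H 192.168.1.101 -U admin -P secret chassis power status
# ipmitool -I lanplus -H 192.168.1.101 -U admin -P secret chassis power cycle</programlisting>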
  <para>Also consider remote power control. While IPMI usually controls the
  server's power state, having remote access to the PDU that the server is
  plugged into can be very useful for situations when everything seems
  wedged.</para>
</section>
<section xml:id="provision-deploy-summary">
<title>Parting Thoughts for Provisioning and Deploying OpenStack</title>
  <para>You can save time by understanding the use cases for the cloud you
  want to create. Use cases for OpenStack are varied. Some include object
  storage only; others require preconfigured compute resources to speed
  development-environment setup; and others need fast provisioning of
  compute resources that are already secured per tenant with private
  networks. Your users may need highly redundant servers to make sure their
  legacy applications continue to run. Perhaps a goal would be to architect
  these legacy applications so that they run on multiple instances in a
  cloudy, fault-tolerant way, but not to grow those clusters over time. Your
  users may indicate that they need scaling considerations because of heavy
  Windows server use.<indexterm
class="singular">
<primary>provisioning/deployment</primary>
<secondary>tips for</secondary>
</indexterm></para>
<para>You can save resources by looking at the best fit for the hardware
you have in place already. You might have some high-density storage
hardware available. You could format and repurpose those servers for
OpenStack Object Storage. All of these considerations and input from users
help you build your use case and your deployment plan.</para>
<tip>
<para>For further research about OpenStack deployment, investigate the
supported and documented preconfigured, prepackaged installers for
OpenStack from companies such as <link
xlink:href="http://opsgui.de/NPFSy7">Canonical</link>, <link
xlink:href="http://opsgui.de/1gwRmlS">Cisco</link>, <link
xlink:href="http://opsgui.de/1eLAFSL">Cloudscaling</link>, <link
xlink:href="http://opsgui.de/NPFYG3">IBM</link>, <link
xlink:href="http://opsgui.de/1eLAGWE">Metacloud</link>, <link
xlink:href="http://opsgui.de/NPFWOy">Mirantis</link>, <link
xlink:href="http://opsgui.de/1eLAHKd">Piston</link>, <link
xlink:href="http://opsgui.de/1gwRm58">Rackspace</link>, <link
xlink:href="http://opsgui.de/NPFXlq">Red Hat</link>, <link
xlink:href="http://opsgui.de/1eLALK5">SUSE</link>, and <link
xlink:href="http://opsgui.de/NPG0hb">SwiftStack</link>.</para>
</tip>
</section>
<section xml:id="provision_conclusion">
<title>Conclusion</title>
<para>The decisions you make with respect to provisioning and deployment
will affect your day-to-day, week-to-week, and month-to-month maintenance
of the cloud. Your configuration management will be able to evolve over
  time. However, upfront choices about deployment, disk partitioning, and
  network configuration require more thought and design.</para>
</section>
</chapter>


@ -1,183 +1,286 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chapter [
<!-- Some useful entities borrowed from HTML -->
<!ENTITY ndash "&#x2013;">
<!ENTITY mdash "&#x2014;">
<!ENTITY hellip "&#x2026;">
<!ENTITY plusmn "&#xB1;">
]>
<chapter xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0"
xml:id="advanced_configuration">
<?dbhtml stop-chunking?>
<title>Advanced Configuration</title>
<para>OpenStack is intended to work well across a variety of installation
flavors, from very small private clouds to large public clouds.
To achieve this, the developers add configuration options to
their code that allow the behaviour of the various components to be
tweaked depending on your needs. Unfortunately, it is not possible to
cover all possible deployments with the default configuration values.</para>
<chapter version="5.0" xml:id="advanced_configuration"
xmlns="http://docbook.org/ns/docbook"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:ns5="http://www.w3.org/1999/xhtml"
xmlns:ns4="http://www.w3.org/2000/svg"
xmlns:ns3="http://www.w3.org/1998/Math/MathML"
xmlns:ns="http://docbook.org/ns/docbook">
<?dbhtml stop-chunking?>
<para>At the time of writing, OpenStack has more than 1,500 configuration options.
You can see them documented at
<link xlink:href="http://docs.openstack.org/trunk/config-reference/content/config_overview.html">the OpenStack configuration reference guide</link>. This chapter cannot hope to
document all of these, but we do try to introduce the important
concepts so that you know where to go digging for more information.</para>
<title>Advanced Configuration</title>
<section xml:id="driver_differences">
<para>OpenStack is intended to work well across a variety of installation
flavors, from very small private clouds to large public clouds. To achieve
this, the developers add configuration options to their code that allow the
behavior of the various components to be tweaked depending on your needs.
Unfortunately, it is not possible to cover all possible deployments with the
default configuration values.<indexterm class="singular">
<primary>advanced configuration</primary>
<see>configuration options</see>
</indexterm><indexterm class="singular">
<primary>configuration options</primary>
<secondary>wide availability of</secondary>
</indexterm></para>
<para>At the time of writing, OpenStack has more than 1,500 configuration
options. You can see them documented at <link
xlink:href="http://opsgui.de/1eLATt4">the OpenStack configuration reference
guide</link>. This chapter cannot hope to document all of these, but we do
try to introduce the important concepts so that you know where to go digging
for more information.</para>
<section xml:id="driver_differences">
<title>Differences Between Various Drivers</title>
<para>Many OpenStack projects implement a driver layer, and each of these
drivers will implement their own configuration options. For example
in OpenStack Compute (nova), there are various hypervisor drivers
implemented&mdash;libvirt, xenserver, hyper-v, and vmware, for example.
Not all of these hypervisor drivers have the same features, and each
has different tuning requirements.</para>
<note><para>The currently implemented hypervisors are listed on
<link xlink:href="http://docs.openstack.org/trunk/config-reference/content/section_compute-hypervisors.html">the OpenStack documentation website</link>. You can see a matrix of the
various features in OpenStack Compute
(nova) hypervisor drivers on the OpenStack wiki at
<link xlink:href="https://wiki.openstack.org/wiki/HypervisorSupportMatrix">the Hypervisor support matrix page</link>.</para></note>
<para>Many OpenStack projects implement a driver layer, and each of these
  drivers will implement its own configuration options. For example,
  OpenStack Compute (nova) implements a number of hypervisor drivers:
  libvirt, xenserver, hyper-v, and vmware, among others. Not all
of these hypervisor drivers have the same features, and each has different
tuning requirements.<indexterm class="singular">
<primary>hypervisors</primary>
<secondary>differences between</secondary>
</indexterm><indexterm class="singular">
<primary>drivers</primary>
<secondary>differences between</secondary>
</indexterm></para>
<note>
<para>The currently implemented hypervisors are listed on <link
xlink:href="http://opsgui.de/1eLAwP2">the OpenStack documentation
website</link>. You can see a matrix of the various features in
OpenStack Compute (nova) hypervisor drivers on the OpenStack wiki at
<link xlink:href="http://opsgui.de/NPFQ9w">the Hypervisor support matrix
page</link>.</para>
</note>
<para>The point we are trying to make here is that just because an option
exists doesn't mean that option is relevant to your driver choices.
Normally, the documentation notes which drivers the configuration
applies to.</para>
</section>
exists doesn't mean that option is relevant to your driver choices.
Normally, the documentation notes which drivers the configuration applies
to.</para>
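    <para>For example, the hypervisor driver itself is selected through
    configuration options in <filename>nova.conf</filename>, and
    driver-specific options only take effect for that driver. A sketch for a
    KVM-based libvirt deployment might look like the following (values shown
    are illustrative for the Havana era; check the configuration reference
    for your release):</para>
    <programlisting>[DEFAULT]
compute_driver=libvirt.LibvirtDriver
libvirt_type=kvm</programlisting>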
</section>
<section xml:id="periodic_tasks">
<section xml:id="periodic_tasks">
<title>Implementing Periodic Tasks</title>
<para>Another common concept across various OpenStack projects is that of
periodic tasks. Periodic tasks are much like cron jobs on traditional
Unix systems, but they are run inside an OpenStack process. For
example, when OpenStack Compute (nova) needs to work out what images
it can remove from its local cache, it runs a periodic task to do this.</para>
periodic tasks. Periodic tasks are much like cron jobs on traditional Unix
systems, but they are run inside an OpenStack process. For example, when
OpenStack Compute (nova) needs to work out what images it can remove from
its local cache, it runs a periodic task to do this.<indexterm
class="singular">
<primary>periodic tasks</primary>
</indexterm><indexterm class="singular">
<primary>configuration options</primary>
<secondary>periodic task implementation</secondary>
</indexterm></para>
<para>Periodic tasks are important to understand because of limitations in
the threading model that OpenStack uses. OpenStack uses cooperative
threading in python, which means that if something long and
complicated is running, it will block other tasks inside that process
from running unless it voluntarily yields execution to another
cooperative thread.</para>
the threading model that OpenStack uses. OpenStack uses cooperative
threading in Python, which means that if something long and complicated is
running, it will block other tasks inside that process from running unless
it voluntarily yields execution to another cooperative thread.<indexterm
class="singular">
<primary>cooperative threading</primary>
</indexterm></para>
<para>A tangible example of this is the nova-compute process. In order to
manage the image cache with libvirt, nova-compute has a periodic
process that scans the contents of the image cache. Part of this
scan is calculating a checksum for each of the images and making sure
that checksum matches what nova-compute expects it to be. However,
images can be very large, and these checksums can take a long time to
generate. At one point, before it was reported as a bug and fixed,
nova-compute would block on this task and stop responding to RPC
requests. This was visible to users as failure of operations such
as spawning or deleting instances.</para>
<para>A tangible example of this is the <literal>nova-compute</literal>
process. In order to manage the image cache with libvirt,
<literal>nova-compute</literal> has a periodic process that scans the
contents of the image cache. Part of this scan is calculating a checksum
for each of the images and making sure that checksum matches what
<literal>nova-compute</literal> expects it to be. However, images can be
very large, and these checksums can take a long time to generate. At one
point, before it was reported as a bug and fixed,
<literal>nova-compute</literal> would block on this task and stop
responding to RPC requests. This was visible to users as failure of
operations such as spawning or deleting instances.</para>
  <para>The takeaway from this is that if you observe an OpenStack process that
appears to "stop" for a while and then continue to process normally,
you should check that periodic tasks aren't the problem. One way to
do this is to disable the periodic tasks by setting their interval to
zero. Additionally, you can configure how often these periodic tasks
run&mdash;in some cases it might make sense to run them at a different
frequency from the default.</para>
appears to "stop" for a while and then continue to process normally, you
should check that periodic tasks aren't the problem. One way to do this is
to disable the periodic tasks by setting their interval to zero.
Additionally, you can configure how often these periodic tasks run—in some
cases, it might make sense to run them at a different frequency from the
default.</para>
<para>The frequency is defined separately for each periodic task.
Therefore, to disable every periodic task in OpenStack Compute
(nova), you would need to set a number of configuration
options to zero. The current list of configuration options
you would need to set to zero are:</para>
Therefore, to disable every periodic task in OpenStack Compute (nova), you
would need to set a number of configuration options to zero. The current
list of configuration options you would need to set to zero are:</para>
<itemizedlist>
<listitem><para>bandwidth_poll_interval</para></listitem>
<listitem><para>sync_power_state_interval</para></listitem>
<listitem><para>heal_instance_info_cache_interval</para></listitem>
<listitem><para>host_state_interval</para></listitem>
<listitem><para>image_cache_manager_interval</para></listitem>
<listitem><para>reclaim_instance_interval</para></listitem>
<listitem><para>volume_usage_poll_interval</para></listitem>
<listitem><para>shelved_poll_interval</para></listitem>
<listitem><para>shelved_offload_time</para></listitem>
<listitem><para>instance_delete_interval</para></listitem>
</itemizedlist>
<itemizedlist>
<listitem>
<para><literal>bandwidth_poll_interval</literal></para>
</listitem>
<listitem>
<para><literal>sync_power_state_interval</literal></para>
</listitem>
<listitem>
<para><literal>heal_instance_info_cache_interval</literal></para>
</listitem>
<listitem>
<para><literal>host_state_interval</literal></para>
</listitem>
<listitem>
<para><literal>image_cache_manager_interval</literal></para>
</listitem>
<listitem>
<para><literal>reclaim_instance_interval</literal></para>
</listitem>
<listitem>
<para><literal>volume_usage_poll_interval</literal></para>
</listitem>
<listitem>
<para><literal>shelved_poll_interval</literal></para>
</listitem>
<listitem>
<para><literal>shelved_offload_time</literal></para>
</listitem>
<listitem>
<para><literal>instance_delete_interval</literal></para>
</listitem>
</itemizedlist>
<para>To set a configuration option to zero, include a line such as
<literal>image_cache_manager_interval=0</literal> in your
<filename>nova.conf</filename> file.</para>
<literal>image_cache_manager_interval=0</literal> in your
<filename>nova.conf</filename> file.</para>
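    <para>For example, an excerpt of <filename>nova.conf</filename> that
    disables two of these tasks might look like the following sketch; option
    names can move between releases, so verify them against your release's
    configuration reference:</para>
    <programlisting>[DEFAULT]
image_cache_manager_interval=0
reclaim_instance_interval=0</programlisting>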
<para>This list will change between releases, so please refer to
your configuration guide for up to date information.</para>
<para>This list will change between releases, so please refer to your
configuration guide for up-to-date information.</para>
</section>
</section>
<section xml:id="specific-advanced-config-topics">
<section xml:id="specific-advanced-config-topics">
<title>Specific Configuration Topics</title>
<para>This section covers specific examples of configuration options you
might consider tuning. It is by no means an exhaustive list.</para>
<section xml:id="adv-config-security"><title>Security Configuration
for Compute, Networking, and Storage</title>
<para>The <citetitle><link
xlink:href="http://docs.openstack.org/sec/">OpenStack Security
Guide</link></citetitle> provides a deep dive into securing an
OpenStack cloud, including SSL/TLS, key management, PKI and certificate
management, data transport and privacy concerns, and
compliance.</para></section>
<section xml:id="adv-config-ha"><title>High Availability</title>
<para>The <citetitle><link
xlink:href="http://docs.openstack.org/high-availability-guide/content/">OpenStack
High Availability
Guide</link></citetitle> offers suggestions for
elimination of a single point of failure that could cause
system downtime. While it is not a completely prescriptive
document, it offers methods and techniques for avoiding downtime
and data loss.</para></section>
<section xml:id="adv-config-ipv6">
<title>Enabling IPv6 Support</title>
<para>The Havana release with OpenStack Networking
(neutron) does not offer complete support of
IPv6. Better support is planned for the
Icehouse release. You can follow along the
progress being made by watching the neutron
IPv6 Subteam at work (<link
xlink:href="https://wiki.openstack.org/wiki/Meetings/Neutron-IPv6-Subteam"
>https://wiki.openstack.org/wiki/Meetings/Neutron-IPv6-Subteam</link>).
</para>
<para>By modifying your configuration setup, you can set up
IPv6 when using nova-network for networking, and a tested
setup is documented for FlatDHCP and a multi-host
configuration. The key is to make nova-network think a
radvd command ran successfully. The entire
configuration is detailed in a Cybera blog post, <link
xlink:href="http://www.cybera.ca/news-and-events/tech-radar/an-ipv6-enabled-cloud/"
>An IPv6 enabled cloud</link>.</para>
</section>
<section xml:id="specific-advanced-config-period-tasks">
<title>Periodic Task Frequency for Compute</title>
<para>Before the Grizzly release, the frequency of periodic tasks
was specified in seconds between runs. This meant that if the
periodic task took 30 minutes to run and the frequency was
set to hourly, then the periodic task actually ran every 90
minutes, because the task would wait an hour after running
before running again. This changed in Grizzly, and we now
time the frequency of periodic tasks from the start of the
work the task does. So, our 30 minute periodic task will run
every hour, with a 30 minute wait between the end of the
first run and the start of the next.</para>
</section>
<section xml:id="adv-config-geography">
<title>Geographical Considerations for Object Storage</title>
<para>Enhanced support for global clustering of object storage
servers continues to be added since the Grizzly (1.8.0)
release, when regions were introduced. You would
implement these global clusters to ensure replication
across geographic areas in case of a natural disaster
and also to ensure that users can write or access their
objects more quickly based on the closest data center.
You configure a default region with one zone for each
cluster, but be sure your network (WAN) can handle the
additional request and response load between zones as
you add more zones and build a ring that handles more
zones. Refer to Geographically Distributed Clusters
(<link xlink:href="http://docs.openstack.org/developer/swift/admin_guide.html#geographically-distributed-clusters">http://docs.openstack.org/developer/swift/admin_guide.html#geographically-distributed-clusters</link>)
in the documentation for additional information.</para>
</section>
might consider tuning. It is by no means an exhaustive list.</para>
<section xml:id="adv-config-security">
<title>Security Configuration for Compute, Networking, and
Storage</title>
<para>The <emphasis><link xlink:href="http://opsgui.de/NPG4NW">OpenStack
Security Guide</link></emphasis> provides a deep dive into securing an
OpenStack cloud, including SSL/TLS, key management, PKI and certificate
management, data transport and privacy concerns, and
compliance.<indexterm class="singular">
<primary>security issues</primary>
<secondary>configuration options</secondary>
</indexterm><indexterm class="singular">
<primary>configuration options</primary>
<secondary>security</secondary>
</indexterm></para>
</section>
<section xml:id="adv-config-ha">
<title>High Availability</title>
<para>The <emphasis><link
xlink:href="http://opsgui.de/1eLAYwS">OpenStack High Availability
Guide</link></emphasis> offers suggestions for elimination of a single
point of failure that could cause system downtime. While it is not a
completely prescriptive document, it offers methods and techniques for
avoiding downtime and data loss.<indexterm class="singular">
<primary>high availability</primary>
</indexterm><indexterm class="singular">
<primary>configuration options</primary>
<secondary>high availability</secondary>
</indexterm></para>
</section>
<section xml:id="adv-config-ipv6">
<title>Enabling IPv6 Support</title>
      <para>In the Havana release, OpenStack Networking (neutron) does not
      offer complete support for IPv6. Better support is planned for the
      Icehouse release. You can follow the progress being made by
watching the <link xlink:href="http://opsgui.de/NPG5kQ">neutron IPv6
Subteam at work</link>.<indexterm class="singular">
<primary>Icehouse</primary>
<secondary>IPv6 support</secondary>
</indexterm><indexterm class="singular">
<primary>IPv6, enabling support for</primary>
</indexterm><indexterm class="singular">
<primary>configuration options</primary>
<secondary>IPv6 support</secondary>
</indexterm></para>
<para>By modifying your configuration setup, you can set up IPv6 when
using <literal>nova-network</literal> for networking, and a tested setup
is documented for FlatDHCP and a multi-host configuration. The key is to
make <literal>nova-network</literal> think a <literal>radvd</literal>
command ran successfully. The entire configuration is detailed in a
Cybera blog post, <link xlink:href="http://opsgui.de/1eLB0F2">“An IPv6
enabled cloud”</link>.</para>
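      <para>As a very rough sketch of that trick only (the Cybera post has
      the complete, tested configuration), you could place a dummy
      <literal>radvd</literal> wrapper that always reports success ahead of
      the real binary in the path used by
      <literal>nova-network</literal>:</para>
      <programlisting language="bash"># printf '#!/bin/sh\nexit 0\n' &gt; /usr/local/bin/radvd
# chmod +x /usr/local/bin/radvd</programlisting>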
</section>
<section xml:id="specific-advanced-config-period-tasks">
<title>Periodic Task Frequency for Compute</title>
<para>Before the Grizzly release, the frequency of periodic tasks was
specified in seconds between runs. This meant that if the periodic task
took 30 minutes to run and the frequency was set to hourly, then the
periodic task actually ran every 90 minutes, because the task would wait
an hour after running before running again. This changed in Grizzly, and
we now time the frequency of periodic tasks from the start of the work
the task does. So, our 30 minute periodic task will run every hour, with
a 30 minute wait between the end of the first run and the start of the
next.<indexterm class="singular">
<primary>configuration options</primary>
<secondary>periodic task frequency</secondary>
</indexterm></para>
</section>
<section xml:id="adv-config-geography">
<title>Geographical Considerations for Object Storage</title>
<para>Enhanced support for global clustering of object storage servers
    has continued to be added since the Grizzly (1.8.0) release, when regions
were introduced. You would implement these global clusters to ensure
replication across geographic areas in case of a natural disaster and
also to ensure that users can write or access their objects more quickly
based on the closest data center. You configure a default region with
one zone for each cluster, but be sure your network (WAN) can handle the
additional request and response load between zones as you add more zones
and build a ring that handles more zones. Refer to <link
xlink:href="http://opsgui.de/NPG6FJ">Geographically Distributed
Clusters</link> in the documentation for additional
information.<indexterm class="singular">
<primary>Object Storage</primary>
<secondary>geographical considerations</secondary>
</indexterm><indexterm class="singular">
<primary>storage</primary>
<secondary>geographical considerations</secondary>
</indexterm><indexterm class="singular">
<primary>configuration options</primary>
<secondary>geographical storage considerations</secondary>
</indexterm></para>
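      <para>For illustration, when you build a ring that spans more than one
      region, each device is added with a region and zone prefix, roughly as
      follows (IP addresses, device names, and weights are placeholders):</para>
      <programlisting language="bash"># swift-ring-builder object.builder add r1z1-10.1.0.10:6000/sda 100
# swift-ring-builder object.builder add r2z1-10.2.0.10:6000/sda 100
# swift-ring-builder object.builder rebalance</programlisting>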
</section>
</section>
</chapter>


@ -1,209 +1,289 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chapter [
<!-- Some useful entities borrowed from HTML -->
<!ENTITY ndash "&#x2013;">
<!ENTITY mdash "&#x2014;">
<!ENTITY hellip "&#x2026;">
<!ENTITY plusmn "&#xB1;">
]>
<chapter xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0"
xml:id="backup_and_recovery">
<?dbhtml stop-chunking?>
<title>Backup and Recovery</title>
<para>Standard backup best practices apply when creating your
OpenStack back up policy. For example, how often to backup your
data is closely related to how quickly you need to recover
from data loss.</para>
<note>
<para>If you cannot have any data loss at all, you should also
focus on a highly available deployment. The
<citetitle><link
xlink:href="http://docs.openstack.org/high-availability-guide/content/"
>OpenStack High Availability Guide</link></citetitle> offers
suggestions for elimination of a single point of
failure that could cause system downtime. While it
is not a completely prescriptive document, it
offers methods and techniques for avoiding
downtime and data loss.
</para></note>
<para>Other backup considerations include:</para>
<itemizedlist>
<listitem>
<para>How many backups to keep?</para>
</listitem>
<listitem>
<para>Should backups be kept off-site?</para>
</listitem>
<listitem>
<para>How often should backups be tested?</para>
</listitem>
</itemizedlist>
<para>Just as important as a backup policy is a recovery policy
(or at least recovery testing).</para>
<chapter version="5.0" xml:id="backup_and_recovery"
xmlns="http://docbook.org/ns/docbook"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:ns5="http://www.w3.org/1999/xhtml"
xmlns:ns4="http://www.w3.org/1998/Math/MathML"
xmlns:ns3="http://www.w3.org/2000/svg"
xmlns:ns="http://docbook.org/ns/docbook">
<?dbhtml stop-chunking?>
<title>Backup and Recovery</title>
<section xml:id="what_to_backup">
<title>What to Backup</title>
<para>While OpenStack is composed of many components and
moving parts, backing up the critical data is quite
simple.</para>
<para>This chapter describes only how to back up configuration
files and databases that the various OpenStack components
need to run. This chapter does not describe how to back up
objects inside Object Storage or data contained inside
Block Storage. Generally these areas are left for users
to back up on their own.</para>
</section>
<section xml:id="database_backups">
<title>Database Backups</title>
<para>The example OpenStack architecture designates the cloud
controller as the MySQL server. This MySQL server hosts
the databases for nova, glance, cinder, and keystone. With
all of these databases in one place, it's very easy to
create a database backup:</para>
<para>Standard backup best practices apply when creating your OpenStack
backup policy. For example, how often to back up your data is closely
related to how quickly you need to recover from data loss.<indexterm
class="singular">
<primary>backup/recovery</primary>
<programlisting language="bash"><?db-font-size 75%?><prompt>#</prompt> mysqldump --opt --all-databases &gt;
openstack.sql</programlisting>
<secondary>considerations</secondary>
</indexterm></para>
<para>If you only want to backup a single database, you can
instead run:</para>
<programlisting language="bash"><?db-font-size 75%?><prompt>#</prompt> mysqldump --opt nova &gt; nova.sql</programlisting>
<para>where <code>nova</code> is the database you want to back
up.</para>
<para>You can easily automate this process by creating a cron
job that runs the following script once per day:</para>
<programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt>!/bin/bash
<note>
<para>If you cannot have any data loss at all, you should also focus on a
highly available deployment. The <emphasis><link
xlink:href="http://opsgui.de/1eLAYwS">OpenStack High Availability
Guide</link></emphasis> offers suggestions for elimination of a single
point of failure that could cause system downtime. While it is not a
completely prescriptive document, it offers methods and techniques for
avoiding downtime and data loss.<indexterm class="singular">
<primary>data</primary>
<secondary>preventing loss of</secondary>
</indexterm></para>
</note>
<para>Other backup considerations include:</para>
<itemizedlist>
<listitem>
<para>How many backups to keep?</para>
</listitem>
<listitem>
<para>Should backups be kept off-site?</para>
</listitem>
<listitem>
<para>How often should backups be tested?</para>
</listitem>
</itemizedlist>
<para>Just as important as a backup policy is a recovery policy (or at least
recovery testing).</para>
<section xml:id="what_to_backup">
<title>What to Back Up</title>
<para>While OpenStack is composed of many components and moving parts,
backing up the critical data is quite simple.<indexterm class="singular">
<primary>backup/recovery</primary>
<secondary>items included</secondary>
</indexterm></para>
<para>This chapter describes only how to back up configuration files and
databases that the various OpenStack components need to run. This chapter
does not describe how to back up objects inside Object Storage or data
contained inside Block Storage. Generally these areas are left for users
to back up on their own.</para>
</section>
<section xml:id="database_backups">
<title>Database Backups</title>
<para>The example OpenStack architecture designates the cloud controller
as the MySQL server. This MySQL server hosts the databases for nova,
glance, cinder, and keystone. With all of these databases in one place,
it's very easy to create a database backup:<indexterm class="singular">
<primary>databases</primary>
<secondary>backup/recovery of</secondary>
</indexterm><indexterm class="singular">
<primary>backup/recovery</primary>
<secondary>databases</secondary>
</indexterm></para>
<programlisting language="bash"><?db-font-size 75%?><prompt>#</prompt> mysqldump --opt --all-databases &gt; openstack.sql</programlisting>
    <para>If you only want to back up a single database, you can instead
run:</para>
<programlisting language="bash"><?db-font-size 75%?><prompt>#</prompt> mysqldump --opt nova &gt; nova.sql</programlisting>
<para>where <code>nova</code> is the database you want to back up.</para>
<para>You can easily automate this process by creating a cron job that
runs the following script once per day:</para>
<programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt>!/bin/bash
backup_dir="/var/lib/backups/mysql"
filename="${backup_dir}/mysql-`hostname`-`eval date +%Y%m%d`.sql.gz"
# Dump the entire MySQL database
/usr/bin/mysqldump --opt --all-databases | gzip &gt; $filename
# Delete backups older than 7 days
find $backup_dir -ctime +7 -type f -delete</programlisting>
<para>This script dumps the entire MySQL database and deletes
any backups older than seven days.</para>
<para>This script dumps the entire MySQL database and deletes any backups
older than seven days.</para>
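    <para>If you save that script as, for example,
    <filename>/usr/local/bin/mysql-backup.sh</filename> (a hypothetical
    path), the corresponding crontab entry on the cloud controller could
    be:</para>
    <programlisting># Run the nightly MySQL dump at 01:00
0 1 * * * /usr/local/bin/mysql-backup.sh</programlisting>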
</section>
<section xml:id="file_system_backups">
<title>File System Backups</title>
<para>This section discusses which files and directories should be backed
up regularly, organized by service.<indexterm class="singular">
<primary>file systems</primary>
<secondary>backup/recovery of</secondary>
</indexterm><indexterm class="singular">
<primary>backup/recovery</primary>
<secondary>file systems</secondary>
</indexterm></para>
<section xml:id="compute">
<title>Compute</title>
<para>The <filename>/etc/nova</filename> directory on both the cloud
controller and compute nodes should be regularly backed up.<indexterm
class="singular">
<primary>cloud controllers</primary>
<secondary>file system backups and</secondary>
</indexterm><indexterm class="singular">
<primary>compute nodes</primary>
<secondary>backup/recovery of</secondary>
</indexterm></para>
<para><code>/var/log/nova</code> does not need to be backed up if you
have all logs going to a central area. It is highly recommended to use a
central logging server or back up the log directory.</para>
<para><code>/var/lib/nova</code> is another important directory to back
up. The exception to this is the <code>/var/lib/nova/instances</code>
subdirectory on compute nodes. This subdirectory contains the KVM images
of running instances. You would want to back up this directory only if
you need to maintain backup copies of all instances. Under most
circumstances, you do not need to do this, but this can vary from cloud
to cloud and your service levels. Also be aware that making a backup of
a live KVM instance can cause that instance to not boot properly if it
is ever restored from a backup.</para>
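      <para>As a simple illustration, a nightly archive of the configuration
      directory could be created as follows; the destination directory is an
      example that mirrors the database backup location used earlier:</para>
      <programlisting language="bash"># tar czf /var/lib/backups/nova-etc-`hostname`-`date +%Y%m%d`.tar.gz /etc/nova</programlisting>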
</section>
<section xml:id="file_system_backups">
<title>File System Backups</title>
<para>This section discusses which files and directories should be backed up regularly, organized by service.</para>
<section xml:id="compute">
<title>Compute</title>
<para>The <filename>/etc/nova</filename> directory on both the
cloud controller and compute nodes should be regularly
backed up.</para>
<para>
<code>/var/log/nova</code> does not need to be backed up if
you have all logs going to a central area. It is
highly recommended to use a central logging server or
back up the log directory.</para>
<para>
<code>/var/lib/nova</code> is another important
directory to back up. The exception to this is the
<code>/var/lib/nova/instances</code> subdirectory
on compute nodes. This subdirectory contains the KVM
images of running instances. You would want to
back up this directory only if you need to maintain backup
copies of all instances. Under most circumstances, you
do not need to do this, but this can vary from cloud
to cloud and your service levels. Also be aware that
making a backup of a live KVM instance can cause that
instance to not boot properly if it is ever restored
from a backup.</para>
</section>
<section xml:id="image_catalog_delivery">
<title>Image Catalog and Delivery</title>
<para>
<code>/etc/glance</code> and
<code>/var/log/glance</code> follow the same rules
as their nova counterparts.</para>
<para>
<code>/var/lib/glance</code> should also be backed up.
Take special notice of
<code>/var/lib/glance/images</code>. If you are
using a file-based backend of glance,
<code>/var/lib/glance/images</code> is where the
images are stored and care should be taken.</para>
<para>There are two ways to ensure stability with this
directory. The first is to make sure this directory is
run on a RAID array. If a disk fails, the directory is
available. The second way is to use a tool such as
rsync to replicate the images to another
server:</para>
<para># rsync -az --progress /var/lib/glance/images
backup-server:/var/lib/glance/images/</para>
</section>
<section xml:id="identity">
<title>Identity</title>
<para>
<code>/etc/keystone</code> and
<code>/var/log/keystone</code> follow the same
rules as other components.</para>
<para>
<code>/var/lib/keystone</code>, although it should not
contain any data being used, can also be backed up
just in case.</para>
</section>
<section xml:id="ops_block_storage">
<title>Block Storage</title>
<para>
<code>/etc/cinder</code> and
<code>/var/log/cinder</code> follow the same rules
as other components.</para>
<para>
<code>/var/lib/cinder</code> should also be backed
up.</para>
</section>
<section xml:id="ops_object_storage">
<title>Object Storage</title>
<para>
<code>/etc/swift</code> is very important to have
backed up. This directory contains the swift
configuration files as well as the ring files and ring
<glossterm>builder file</glossterm>s, which if
lost render the data on your cluster inaccessible. A
best practice is to copy the builder files to all
storage nodes along with the ring files. Multiple
backup copies are spread throughout your storage
cluster.</para>
</section>
<section xml:id="image_catalog_delivery">
<title>Image Catalog and Delivery</title>
<para><code>/etc/glance</code> and <code>/var/log/glance</code> follow
the same rules as their nova counterparts.<indexterm class="singular">
<primary>Image Service</primary>
<secondary>backup/recovery of</secondary>
</indexterm></para>
<para><code>/var/lib/glance</code> should also be backed up. Take
special notice of <code>/var/lib/glance/images</code>. If you are using
a file-based backend of glance, <code>/var/lib/glance/images</code> is
where the images are stored and care should be taken.</para>
      <para>There are two ways to ensure stability with this directory. The
      first is to make sure this directory resides on a RAID array. If a disk
      fails, the directory remains available. The second is to use a tool such
      as rsync to replicate the images to another server:</para>
<programlisting># rsync -az --progress /var/lib/glance/images \
backup-server:/var/lib/glance/images/</programlisting>
</section>
<section xml:id="recovering_backups">
<title>Recovering Backups</title>
<para>Recovering backups is a fairly simple process. To begin,
first ensure that the service you are recovering is not
running. For example, to do a full recovery of nova on the
cloud controller, first stop all <code>nova</code>
services:</para>
<programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt> stop nova-api
# stop nova-cert
# stop nova-consoleauth
# stop nova-novncproxy
# stop nova-objectstore
# stop nova-scheduler</programlisting>
<para>Now you can import a previously backed-up
database:</para>
<programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt> mysql nova &lt; nova.sql</programlisting>
<para>You can also restore backed-up nova directories:</para>
<programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt> mv /etc/nova{,.orig}
# cp -a /path/to/backup/nova /etc/</programlisting>
<para>Once the files are restored, start everything back
up:</para>
<programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt> start mysql
# for i in nova-api nova-cert nova-consoleauth nova-novncproxy nova-objectstore nova-scheduler
&gt; do
&gt; start $i
&gt; done
</programlisting>
<para>Other services follow the same process, with their
respective directories and databases.</para>
<section xml:id="identity">
<title>Identity</title>
<para><code>/etc/keystone</code> and <code>/var/log/keystone</code>
follow the same rules as other components.<indexterm class="singular">
<primary>Identity Service</primary>
<secondary>backup/recovery</secondary>
</indexterm></para>
<para><code>/var/lib/keystone</code>, although it should not contain any
data being used, can also be backed up just in case.</para>
</section>
<section xml:id="ops-backup-recovery-summary">
<title>Summary</title>
<para>Backup and subsequent recovery is one of the first tasks system
administrators learn. However, each system has different
items that need attention. By taking care of your database, image
service, and appropriate file system locations, you can be assured
you can handle any event requiring recovery.</para>
</section>
<section xml:id="ops_block_storage">
<title>Block Storage</title>
<para><code>/etc/cinder</code> and <code>/var/log/cinder</code> follow
the same rules as other components.<indexterm class="singular">
<primary>Block Storage</primary>
</indexterm><indexterm class="singular">
<primary>storage</primary>
<secondary>block storage</secondary>
</indexterm></para>
<para><code>/var/lib/cinder</code> should also be backed up.</para>
</section>
<section xml:id="ops_object_storage">
<title>Object Storage</title>
<para><code>/etc/swift</code> is very important to have backed up. This
directory contains the swift configuration files as well as the ring
    files and ring <glossterm>builder file</glossterm>s, which, if lost,
    render the data on your cluster inaccessible. A best practice is to copy
    the builder files to all storage nodes along with the ring files, so that
    multiple backup copies are spread throughout your storage
    cluster.<indexterm class="singular">
<primary>builder files</primary>
</indexterm><indexterm class="singular">
<primary>rings</primary>
<secondary>ring builders</secondary>
</indexterm><indexterm class="singular">
<primary>Object Storage</primary>
<secondary>backup/recovery of</secondary>
</indexterm></para>
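    <para>A minimal way to spread those copies, assuming hypothetical storage
    node names and that the files live in <code>/etc/swift</code>, is
    something like:</para>
    <programlisting language="bash"># for node in swift01 swift02 swift03
&gt; do
&gt;   rsync -az /etc/swift/*.builder /etc/swift/*.ring.gz ${node}:/etc/swift/
&gt; done</programlisting>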
</section>
</section>
<section xml:id="recovering_backups">
<title>Recovering Backups</title>
<para>Recovering backups is a fairly simple process. To begin, first
ensure that the service you are recovering is not running. For example, to
do a full recovery of <literal>nova</literal> on the cloud controller,
first stop all <code>nova</code> services:<indexterm class="singular">
<primary>recovery</primary>
<seealso>backup/recovery</seealso>
</indexterm><indexterm class="singular">
<primary>backup/recovery</primary>
<secondary>recovering backups</secondary>
</indexterm></para>
<?hard-pagebreak ?>
<programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt> stop nova-api
# stop nova-cert
# stop nova-consoleauth
# stop nova-novncproxy
# stop nova-objectstore
# stop nova-scheduler</programlisting>
<para>Now you can import a previously backed-up database:</para>
<programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt> mysql nova &lt; nova.sql</programlisting>
<para>You can also restore backed-up nova directories:</para>
<programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt> mv /etc/nova{,.orig}
# cp -a /path/to/backup/nova /etc/</programlisting>
<para>Once the files are restored, start everything back up:</para>
<programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt> start mysql
# for i in nova-api nova-cert nova-consoleauth nova-novncproxy \
&gt; nova-objectstore nova-scheduler
&gt; do
&gt; start $i
&gt; done</programlisting>
<para>Other services follow the same process, with their respective
directories and <phrase role="keep-together">databases</phrase>.</para>
</section>
<section xml:id="ops-backup-recovery-summary">
<title>Summary</title>
<para>Backup and subsequent recovery is one of the first tasks system
administrators learn. However, each system has different items that need
attention. By taking care of your database, image service, and appropriate
file system locations, you can be assured that you can handle any event
requiring recovery.</para>
</section>
</chapter>


@ -1,22 +0,0 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE colophon [
<!-- Some useful entities borrowed from HTML -->
<!ENTITY ndash "&#x2013;">
<!ENTITY mdash "&#x2014;">
<!ENTITY hellip "&#x2026;">
<!ENTITY plusmn "&#xB1;">
]>
<colophon xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0"
xml:id="ch_doc_history">
<?dbhtml stop-chunking?>
<title>Document Change History</title>
<?dbhtml stop-chunking?>
<para>This version of the document replaces and obsoletes all
previous versions. The following table describes the most
recent changes:</para>
<?rax revhistory?>
<!-- Table generated in output from revision element in the book element -->
</colophon>


@ -1,113 +1,139 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE appendix [
<!-- Some useful entities borrowed from HTML -->
<!ENTITY ndash "&#x2013;">
<!ENTITY mdash "&#x2014;">
<!ENTITY hellip "&#x2026;">
<!ENTITY plusmn "&#xB1;">
]>
<appendix xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
label="D"
xml:id="recommended-reading">
<?dbhtml stop-chunking?>
<title>Resources</title>
<para>
<emphasis role="bold">OpenStack</emphasis>
</para>
<para>
<link
xlink:href="http://docs.openstack.org/trunk/config-reference/content/"
>OpenStack Configuration Reference</link>
(http://docs.openstack.org/trunk/config-reference/content/section_compute-hypervisors.html)</para>
<para>
<link
xlink:href="http://docs.openstack.org/havana/install-guide/install/apt-debian/content/"
>OpenStack Install Guide for Debian 7.0</link>
(http://docs.openstack.org/havana/install-guide/install/apt-debian/content/)</para>
<para>
<link
xlink:href="http://docs.openstack.org/havana/install-guide/install/yum/content/"
>OpenStack Install Guide for Red Hat Enterprise Linux,
CentOS, and Fedora</link>
(http://docs.openstack.org/havana/install-guide/install/yum/content/)</para>
<para>
<link
xlink:href="http://docs.openstack.org/havana/install-guide/install/zypper/content/"
>OpenStack Install Guide for openSUSE, SUSE Linux Enterprise Server</link>
(http://docs.openstack.org/havana/install-guide/install/zypper/content/)</para>
<para>
<link
xlink:href="http://docs.openstack.org/havana/install-guide/install/apt/content/"
>OpenStack Install Guide for Ubuntu 12.04 (LTS) Server</link>
(http://docs.openstack.org/havana/install-guide/install/apt/content/)</para>
<para>
<link
xlink:href="http://docs.openstack.org/admin-guide-cloud/content/"
>OpenStack Cloud Administrator Guide</link>
(http://docs.openstack.org/admin-guide-cloud/content/)</para>
<para>
<link
xlink:href="http://docs.openstack.org/security-guide/content/"
>OpenStack Security Guide</link>
(http://docs.openstack.org/security-guide/content/)</para>
<para>
<link
xlink:href="http://www.packtpub.com/openstack-cloud-computing-cookbook-second-edition/book"
>OpenStack Cloud Computing Cookbook</link>
(http://www.packtpub.com/openstack-cloud-computing-cookbook-second-edition/book)</para>
<para>
<emphasis role="bold">Cloud (General)</emphasis>
</para>
<para>
<link
xlink:href="http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf"
>NIST Cloud Computing Definition</link>
(http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf)</para>
<para>
<emphasis role="bold">Python</emphasis>
</para>
<para>
<link xlink:href="http://www.diveintopython.net">Dive Into
Python</link> (http://www.diveintopython.net)</para>
<para>
<emphasis role="bold">Networking</emphasis>
</para>
<para>
<link
xlink:href="http://www.pearsonhighered.com/educator/product/TCPIP-Illustrated-Volume-1-The-Protocols/9780321336316.page"
>TCP/IP Illustrated</link>
(http://www.pearsonhighered.com/educator/product/TCPIP-Illustrated-Volume-1-The-Protocols/9780321336316.page)</para>
<para>
<link xlink:href="http://nostarch.com/tcpip.htm">The TCP/IP
Guide</link> (http://nostarch.com/tcpip.htm)</para>
<para>
<link xlink:href="http://danielmiessler.com/study/tcpdump/">A
tcpdump Tutorial and Primer</link>
(http://danielmiessler.com/study/tcpdump/)</para>
<para>
<emphasis role="bold">Systems Administration</emphasis>
</para>
<para>
<link xlink:href="http://www.admin.com/">UNIX and Linux
Systems Administration Handbook</link>
(http://www.admin.com/)</para>
<para>
<emphasis role="bold">Virtualization </emphasis>
</para>
<para>
<link xlink:href="http://nostarch.com/xen.htm">The Book of
Xen</link> (http://nostarch.com/xen.htm)</para>
<para>
<emphasis role="bold"> Configuration Management</emphasis>
</para>
<para>
<link xlink:href="http://docs.puppetlabs.com/">Puppet Labs
Documentation</link> (http://docs.puppetlabs.com/)</para>
<para>
<link xlink:href="http://www.apress.com/9781430230571">Pro
Puppet</link> (http://www.apress.com/9781430230571)
</para>
</appendix>
<appendix label="D" version="5.0" xml:id="recommended-reading"
xmlns="http://docbook.org/ns/docbook"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:ns5="http://www.w3.org/1998/Math/MathML"
xmlns:ns4="http://www.w3.org/2000/svg"
xmlns:ns3="http://www.w3.org/1999/xhtml"
xmlns:ns="http://docbook.org/ns/docbook">
<?dbhtml stop-chunking?>
<title>Resources</title>
<section xml:id="openstack-resources">
<title>OpenStack</title>
<itemizedlist>
<listitem>
<para><link xlink:href="http://opsgui.de/NPGtjs">OpenStack
Configuration Reference</link></para>
</listitem>
<listitem>
<para><link xlink:href="http://opsgui.de/1eLBGtX">OpenStack Install
Guide for Debian 7.0</link></para>
</listitem>
<listitem>
<para><link xlink:href="http://opsgui.de/NPGvrs">OpenStack Install
Guide for Red Hat Enterprise Linux, CentOS, and Fedora</link></para>
</listitem>
<listitem>
<para><link xlink:href="http://opsgui.de/1eLBI50">OpenStack Install
Guide for openSUSE, SUSE Linux Enterprise Server</link></para>
</listitem>
<listitem>
<para><link xlink:href="http://opsgui.de/NPGunp">OpenStack Install
Guide for Ubuntu 12.04 (LTS) Server</link></para>
</listitem>
<listitem>
<para><link xlink:href="http://opsgui.de/1eLBL0N">OpenStack Cloud
Administrator Guide</link></para>
</listitem>
<listitem>
<para><link xlink:href="http://opsgui.de/NPGwvz"><emphasis>OpenStack
Cloud Computing Cookbook</emphasis> (Packt Publishing) </link></para>
</listitem>
</itemizedlist>
</section>
<section xml:id="cloud-general-resources">
<title>Cloud (General)</title>
<itemizedlist>
<listitem>
<para><link xlink:href="http://opsgui.de/1eLBLOv">“The NIST Definition
of Cloud Computing”</link></para>
</listitem>
</itemizedlist>
</section>
<section xml:id="python-resources">
<title>Python</title>
<itemizedlist>
<listitem>
<para><link xlink:href="http://opsgui.de/NPGxQd"><emphasis>Dive Into
Python</emphasis> (Apress)</link></para>
</listitem>
</itemizedlist>
</section>
<?hard-pagebreak ?>
<section xml:id="networking-resources">
<title>Networking</title>
<itemizedlist>
<listitem>
<para><link xlink:href="http://opsgui.de/1eLBNWl"><emphasis>TCP/IP
Illustrated, Volume 1: The Protocols, 2/E</emphasis>
(Pearson)</link></para>
</listitem>
<listitem>
<para><link xlink:href="http://opsgui.de/NPGzYr"><emphasis>The TCP/IP
Guide</emphasis> (No Starch Press)</link></para>
</listitem>
<listitem>
<para><link xlink:href="http://opsgui.de/1eLBOJS">“A
<code>tcpdump</code> Tutorial and Primer”</link></para>
</listitem>
</itemizedlist>
</section>
<section xml:id="system-admin-resources">
<title>Systems Administration</title>
<itemizedlist>
<listitem>
<para><link xlink:href="http://opsgui.de/NPGyDR"><emphasis>UNIX and
Linux Systems Administration Handbook</emphasis> (Prentice
Hall)</link></para>
</listitem>
</itemizedlist>
</section>
<section xml:id="virtualization-resources">
<title>Virtualization</title>
<itemizedlist>
<listitem>
<para><link xlink:href="http://opsgui.de/1eLBQSb"><emphasis>The Book
of Xen</emphasis> (No Starch Press)</link></para>
</listitem>
</itemizedlist>
</section>
<section xml:id="config-management-resources">
<title>Configuration Management</title>
<itemizedlist>
<listitem>
<para><link xlink:href="http://opsgui.de/NPGzrj">Puppet Labs
Documentation</link></para>
</listitem>
<listitem>
<para><link xlink:href="http://opsgui.de/1eLBRFD"><emphasis>Pro
Puppet</emphasis> (Apress)</link></para>
</listitem>
</itemizedlist>
</section>
</appendix>


@ -1,468 +1,528 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chapter [
<!-- Some useful entities borrowed from HTML -->
<!ENTITY ndash "&#x2013;">
<!ENTITY mdash "&#x2014;">
<!ENTITY hellip "&#x2026;">
<!ENTITY plusmn "&#xB1;">
]>
<chapter xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0"
xml:id="upstream_openstack">
<?dbhtml stop-chunking?>
<title>Upstream OpenStack</title>
<para>OpenStack is founded on a thriving community that is a
source of help and welcomes your contributions. This chapter
details some of the ways you can interact with the others
involved.</para>
<section xml:id="get_help">
<title>Getting Help</title>
<para>There are several avenues available for seeking
assistance. The quickest way is to help the community
help you. Search the Q&amp;A sites, mailing list archives,
and bug lists for issues similar to yours. If you can't
find anything, follow the directions for reporting bugs
or use one of the channels for support, which are listed below.</para>
        <para>Your first port of call should be the official OpenStack
            documentation, found at http://docs.openstack.org.</para>
<para>You can get questions answered on the ask.openstack.org site.</para>
<para>
<link
xlink:href="https://wiki.openstack.org/wiki/Mailing_Lists"
>Mailing
Lists</link> (https://wiki.openstack.org/wiki/Mailing_Lists)
are also a great place to get help. The wiki page has more
information about the various lists. As an operator, the
main lists you should be aware of are:</para>
<itemizedlist>
<listitem>
<para>
<link
xlink:href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack"
>General
list</link>: <code>openstack@lists.openstack.org</code>.
The scope of this list is the current state of OpenStack.
This is a very high-traffic mailing list, with many, many
emails per day.</para>
</listitem>
<listitem>
<para>
<link
xlink:href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators"
>Operators
                    list</link>: <code>openstack-operators@lists.openstack.org</code>.
This list is intended for discussion among
existing OpenStack cloud operators, such as
yourself. Currently, this list is relatively low
traffic, on the order of one email a day.</para>
</listitem>
<listitem>
<para>
<link
xlink:href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev"
>Development
list</link>: <code>openstack-dev@lists.openstack.org</code>.
The scope of this list is the future state of OpenStack.
This is a high-traffic mailing list, with multiple emails
per day.</para>
</listitem>
</itemizedlist>
<para>We recommend that you subscribe to the general list and the
operator list, although you must set up filters to manage the volume
for the general list. You'll also find links to the mailing list
archives on the mailing list wiki page where you can search through
the discussions.</para>
<para>
<link
xlink:href="https://wiki.openstack.org/wiki/IRC"
>Multiple IRC
channels</link> (https://wiki.openstack.org/wiki/IRC) are
available for general questions and developer discussions.
The general discussion channel is <code>#openstack</code>
on <code>irc.freenode.net</code>.</para>
</section>
<section xml:id="report_bugs">
<title>Reporting Bugs</title>
<para>As an operator, you are in a very good position to report
unexpected behavior with your cloud. Since OpenStack is flexible,
you may be the only individual to report a particular issue. Every
issue is important to fix, so it is essential to learn how to easily
submit a bug report.</para>
<para>All OpenStack projects use <link
xlink:href="http://launchpad.net/"
>Launchpad</link> for bug tracking. You'll need to create
an account on Launchpad before you can submit a bug
report.</para>
<para>Once you have a Launchpad account, reporting a bug is as simple as
identifying the project or projects that are causing the issue.
Sometimes this is more difficult than expected, but those working on
            bug triage are happy to help relocate issues if they are not in
the right place initially.</para>
<itemizedlist>
<listitem>
<para>Report a bug in <link
xlink:href="https://bugs.launchpad.net/nova/+filebug"
>nova</link> (https://bugs.launchpad.net/nova/+filebug)</para>
</listitem>
<listitem>
<para>Report a bug in <link
xlink:href="https://bugs.launchpad.net/python-novaclient/+filebug"
>python-novaclient</link> (https://bugs.launchpad.net/python-novaclient/+filebug)</para>
</listitem>
<listitem>
<para>Report a bug in <link
xlink:href="https://bugs.launchpad.net/swift/+filebug"
>swift</link> (https://bugs.launchpad.net/swift/+filebug)</para>
</listitem>
<listitem>
<para>Report a bug in <link
xlink:href="https://bugs.launchpad.net/python-swiftclient/+filebug"
>python-swiftclient</link>
(https://bugs.launchpad.net/python-swiftclient/+filebug)</para>
</listitem>
<listitem>
<para>Report a bug in <link
xlink:href="https://bugs.launchpad.net/glance/+filebug"
>glance</link> (https://bugs.launchpad.net/glance/+filebug)</para>
</listitem>
<listitem>
<para>Report a bug in <link
xlink:href="https://bugs.launchpad.net/python-glanceclient/+filebug"
>python-glanceclient</link>
(https://bugs.launchpad.net/python-glanceclient/+filebug)</para>
</listitem>
<listitem>
<para>Report a bug in <link
xlink:href="https://bugs.launchpad.net/keystone/+filebug"
>keystone</link> (https://bugs.launchpad.net/keystone/+filebug)</para>
</listitem>
<listitem>
<para>Report a bug in <link
xlink:href="https://bugs.launchpad.net/python-keystoneclient/+filebug"
>python-keystoneclient</link>
(https://bugs.launchpad.net/python-keystoneclient/+filebug)</para>
</listitem>
<listitem>
<para>Report a bug in <link
xlink:href="https://bugs.launchpad.net/neutron/+filebug"
>neutron</link> (https://bugs.launchpad.net/neutron/+filebug)</para>
</listitem>
<listitem>
<para>Report a bug in <link
xlink:href="https://bugs.launchpad.net/python-neutronclient/+filebug"
>python-neutronclient</link>
(https://bugs.launchpad.net/python-neutronclient/+filebug)</para>
</listitem>
<listitem>
<para>Report a bug in <link
xlink:href="https://bugs.launchpad.net/cinder/+filebug"
>cinder</link> (https://bugs.launchpad.net/cinder/+filebug)</para>
</listitem>
<listitem>
<para>Report a bug in <link
xlink:href="https://bugs.launchpad.net/python-cinderclient/+filebug"
>python-cinderclient</link>
(https://bugs.launchpad.net/python-cinderclient/+filebug)</para>
</listitem>
<listitem>
<para>Report a bug in <link
xlink:href="https://bugs.launchpad.net/horizon/+filebug"
>horizon</link> (https://bugs.launchpad.net/horizon/+filebug)</para>
</listitem>
<listitem>
<para>Report a bug with the <link
xlink:href="http://bugs.launchpad.net/openstack-manuals/+filebug"
>documentation</link> (http://bugs.launchpad.net/openstack-manuals/+filebug)</para>
</listitem>
<listitem>
<para>Report a bug with the <link
xlink:href="http://bugs.launchpad.net/openstack-api-site/+filebug"
>API
documentation</link> (http://bugs.launchpad.net/openstack-api-site/+filebug)</para>
</listitem>
</itemizedlist>
<para>To write a good bug report, the following process is essential.
First, search for the bug to make sure there is no bug already filed
for the same issue. If you find one, be sure to click on "This bug
affects X people. Does this bug affect you?" If you can't find the
issue, then enter the details of your report. It should at least
include:</para>
<itemizedlist>
<listitem>
<para>The release, or milestone, or commit ID
corresponding to the software that you are
running.</para>
</listitem>
<listitem>
<para>The operating system and version where you've
identified the bug.</para>
</listitem>
<listitem>
<para>Steps to reproduce the bug, including what went
wrong.</para>
</listitem>
<listitem>
<para>Description of the expected results instead of
what you saw.</para>
</listitem>
<listitem>
<para>Portions of your log files so that you include only
relevant excerpts.</para>
</listitem>
</itemizedlist>
<para>When you do this, the bug is created with:</para>
<itemizedlist>
<listitem>
<para>Status: <emphasis>New</emphasis>
</para>
</listitem>
</itemizedlist>
<para>In the bug comments, you can contribute instructions on how to fix
a given bug, and set it to <emphasis>Triaged</emphasis>. Or you can
directly fix it: assign the bug to yourself, set it to <emphasis>In
progress</emphasis>, branch the code, implement the fix, and
                    propose your change for merging. But let's not get ahead of
                    ourselves; there are bug triaging tasks as well.</para>
<section xml:id="confirm_priority">
<title>Confirming and Prioritizing</title>
<para>This stage is about checking that a bug is real and
assessing its impact. Some of these steps require bug
supervisor rights (usually limited to core teams). If
the bug lacks information to properly reproduce or
assess the importance of the bug, the bug is set
to:</para>
<itemizedlist>
<listitem>
<para>Status: <emphasis>Incomplete</emphasis>
</para>
</listitem>
</itemizedlist>
<para>Once you have reproduced the issue (or are 100 percent
confident that this is indeed a valid bug) and have permissions
to do so, set:</para>
<itemizedlist>
<listitem>
<para>Status: <emphasis>Confirmed</emphasis>
</para>
</listitem>
</itemizedlist>
<para>Core developers also prioritize the bug, based
on its impact:</para>
<itemizedlist>
<listitem>
<para>Importance: &lt;Bug impact&gt;</para>
</listitem>
</itemizedlist>
<para>The bug impacts are categorized as follows:</para>
<orderedlist>
<listitem>
<para>
<emphasis>Critical</emphasis> if the bug prevents a key
feature from working properly (regression) for all users
(or without a simple workaround) or results in data
loss.</para>
</listitem>
<listitem>
<para>
<emphasis>High</emphasis> if the bug prevents a key
feature from working properly for some users (or with a
workaround).</para>
</listitem>
<listitem>
<para>
<emphasis>Medium</emphasis> if the bug prevents a
secondary feature from working properly.</para>
</listitem>
<listitem>
<para>
<emphasis>Low</emphasis> if the bug is mostly
cosmetic.</para>
</listitem>
<listitem>
<para>
<emphasis>Wishlist</emphasis> if the bug is not really a
bug but rather a welcome change in behavior.</para>
</listitem>
</orderedlist>
<para>If the bug contains the solution, or a patch, set the bug
status to <emphasis>Triaged</emphasis>.</para>
</section>
<section xml:id="bug_fixing">
<title>Bug Fixing</title>
<para>At this stage, a developer works on a fix. During that time,
to avoid duplicating the work, the developer should set:</para>
<itemizedlist>
<listitem>
<para>Status: <emphasis>In progress</emphasis>
</para>
</listitem>
<listitem>
<para>Assignee: &lt;yourself&gt;</para>
</listitem>
</itemizedlist>
<para>When the fix is ready, the developer proposes a change and
gets the change reviewed.</para>
</section>
<section xml:id="after_change_is_accepted">
<title>After the Change Is Accepted</title>
<para>After the change is reviewed, accepted, and lands in master, it automatically moves
to:</para>
<itemizedlist>
<listitem>
<para>Status: <emphasis>Fix committed</emphasis>
</para>
</listitem>
</itemizedlist>
<para>When the fix makes it into a milestone or release
branch, it automatically moves to:</para>
<itemizedlist>
<listitem>
<para>Milestone: Milestone the bug was fixed
in</para>
</listitem>
<listitem>
<para>Status: <emphasis>Fix released</emphasis>
</para>
</listitem>
</itemizedlist>
</section>
</section>
<section xml:id="openstack_community">
<title>Join the OpenStack Community</title>
<para>Since you've made it this far in the book, you should consider
becoming an official individual member of the community and <link
xlink:href="https://www.openstack.org/join/">join the OpenStack
Foundation</link> (https://www.openstack.org/join/). The
OpenStack Foundation is an independent body providing shared
resources to help achieve the OpenStack mission by protecting,
empowering, and promoting OpenStack software and the community
around it, including users, developers, and the entire ecosystem. We
all share the responsibility to make this community the best it can
possibly be, and signing up to be a member is the first step to
participating. Like the software, individual membership within the
OpenStack Foundation is free and accessible to anyone.</para>
</section>
<section xml:id="contribute_to_docs">
<title>How to Contribute to the Documentation</title>
<para>OpenStack documentation efforts encompass operator and
administrator docs, API docs, and user docs.</para>
<para>The genesis of this book was an in-person event, but now
that the book is in your hands we want you to contribute
to it. OpenStack documentation follows the coding
principles of iterative work, with bug logging,
investigating, and fixing.</para>
        <para>Just like the code, the <link
            xlink:href="http://docs.openstack.org"
            >docs.openstack.org</link> site is updated constantly
            using the Gerrit review system, with source stored in
            GitHub in the <link
            xlink:href="http://github.com/openstack/openstack-manuals/"
            >openstack-manuals</link> (http://github.com/openstack/openstack-manuals/)
            repository and the <link
            xlink:href="http://github.com/openstack/api-site/"
            >api-site</link> (http://github.com/openstack/api-site/)
            repository, in DocBook format.</para>
        <para>To review the documentation before it's published, go to
            the OpenStack Gerrit server at <link
            xlink:href="http://review.openstack.org"
            >review.openstack.org</link> and search for <link
            xlink:href="https://review.openstack.org/#/q/status:open+project:openstack/openstack-manuals,n,z"
            >project:openstack/openstack-manuals</link> or <link
            xlink:href="https://review.openstack.org/#/q/status:open+project:openstack/api-site,n,z"
            >project:openstack/api-site</link>.</para>
        <para>See the <link
            xlink:href="https://wiki.openstack.org/wiki/How_To_Contribute"
            >How To
            Contribute</link> (https://wiki.openstack.org/wiki/How_To_Contribute)
            page on the wiki for more information on the steps you
            need to take to submit your first documentation
            review or change.</para>
    </section>
<chapter version="5.0" xml:id="upstream_openstack"
         xmlns="http://docbook.org/ns/docbook"
         xmlns:xlink="http://www.w3.org/1999/xlink"
         xmlns:xi="http://www.w3.org/2001/XInclude"
         xmlns:ns5="http://www.w3.org/1999/xhtml"
         xmlns:ns4="http://www.w3.org/2000/svg"
         xmlns:ns3="http://www.w3.org/1998/Math/MathML"
         xmlns:ns="http://docbook.org/ns/docbook">
  <?dbhtml stop-chunking?>
  <title>Upstream OpenStack</title>
  <para>OpenStack is founded on a thriving community that is a source of help
  and welcomes your contributions. This chapter details some of the ways you
  can interact with the others involved.</para>
  <section xml:id="get_help">
    <title>Getting Help</title>
<section xml:id="security_info">
<title>Security Information</title>
<para>As a community, we take security very seriously and follow a
specific process for reporting potential issues. We vigilantly
pursue fixes and regularly eliminate exposures. You can report
security issues you discover through this specific process. The
OpenStack Vulnerability Management Team is a very small group of
experts in vulnerability management drawn from the OpenStack
community. The team's job is facilitating the reporting of
vulnerabilities, coordinating security fixes and handling
progressive disclosure of the vulnerability information.
Specifically, the team is responsible for the following
functions:</para>
<itemizedlist>
<listitem>
<para>Vulnerability management: All vulnerabilities discovered
by community members (or users) can be reported to the
Team.</para>
</listitem>
<listitem>
<para>Vulnerability tracking: The Team will curate a set of
vulnerability related issues in the issue tracker. Some of
these issues are private to the Team and the affected
product leads, but once remediation is in place, all
vulnerabilities are public.</para>
</listitem>
<listitem>
<para>Responsible disclosure: As part of our commitment to work
with the security community, the team ensures that proper
credit is given to security researchers who responsibly
report issues in OpenStack.</para>
</listitem>
</itemizedlist>
<para>We provide two ways to report issues to the OpenStack
Vulnerability Management Team, depending on how sensitive the issue
is:</para>
<itemizedlist>
<listitem>
<para>Open a bug in Launchpad and mark it as a "security bug."
This makes the bug private and accessible to only the
Vulnerability Management Team.</para>
</listitem>
<listitem>
<para>If the issue is extremely sensitive, send an encrypted
email to one of the team's members. Find their GPG keys at
<link
xlink:href="http://www.openstack.org/projects/openstack-security/"
>OpenStack Security</link>
(http://www.openstack.org/projects/openstack-security/).</para>
</listitem>
</itemizedlist>
<para>You can find the full list of security-oriented teams you can join
at <link xlink:href="https://wiki.openstack.org/wiki/SecurityTeams"
>Security
            Teams</link> (https://wiki.openstack.org/wiki/SecurityTeams). The
vulnerability management process is fully documented at <link
xlink:href="https://wiki.openstack.org/wiki/VulnerabilityManagement"
>Vulnerability
Management</link> (https://wiki.openstack.org/wiki/VulnerabilityManagement).</para>
</section>
<section xml:id="additional_info">
<title>Finding Additional Information</title>
<para>In addition to this book, there are many other sources of
information about OpenStack. The <link
xlink:href="http://www.openstack.org">OpenStack
website</link> (http://www.openstack.org) is a good starting point,
with <link xlink:href="http://docs.openstack.org">OpenStack
Docs</link> (http://docs.openstack.org) and <link
xlink:href="http://api.openstack.org">OpenStack API
Docs</link> (http://api.openstack.org) providing technical
documentation about OpenStack. The <link
xlink:href="https://wiki.openstack.org">OpenStack wiki</link>
contains a lot of general information that cuts across the OpenStack
projects, including a list of <link
xlink:href="https://wiki.openstack.org/wiki/OperationsTools"
>recommended tools</link>
            (https://wiki.openstack.org/wiki/OperationsTools). Finally, there
are a number of blogs aggregated at <link
xlink:href="http://planet.openstack.org">Planet
OpenStack</link> (http://planet.openstack.org).</para>
<para>There are several avenues available for seeking assistance. The
quickest way is to help the community help you. Search the Q&amp;A sites,
mailing list archives, and bug lists for issues similar to yours. If you
can't find anything, follow the directions for reporting bugs or use one
of the channels for support, which are listed below.<indexterm
class="singular">
<primary>mailing lists</primary>
</indexterm><indexterm class="singular">
<primary>OpenStack</primary>
<secondary>documentation</secondary>
</indexterm><indexterm class="singular">
<primary>help, resources for</primary>
</indexterm><indexterm class="singular">
<primary>troubleshooting</primary>
<secondary>getting help</secondary>
</indexterm><indexterm class="singular">
<primary>OpenStack community</primary>
<secondary>getting help from</secondary>
</indexterm></para>
<para>Your first port of call should be the official OpenStack
documentation, found on <link
xlink:href="http://docs.openstack.org"></link>. You can get questions
answered on <link xlink:href="http://ask.openstack.org"></link>.</para>
<para><link xlink:href="http://opsgui.de/NPGELC">Mailing lists</link> are
also a great place to get help. The wiki page has more information about
the various lists. As an operator, the main lists you should be aware of
are:</para>
<variablelist>
<varlistentry>
<term><link xlink:href="http://opsgui.de/1eLBZoy">General
list</link></term>
<listitem>
<para><emphasis>openstack@lists.openstack.org</emphasis>. The scope
of this list is the current state of OpenStack. This is a very
high-traffic mailing list, with many, many emails per day.</para>
</listitem>
</varlistentry>
<varlistentry>
<term><link xlink:href="http://opsgui.de/NPGF2c">Operators
list</link></term>
<listitem>
        <para><emphasis>openstack-operators@lists.openstack.org</emphasis>.
This list is intended for discussion among existing OpenStack cloud
operators, such as yourself. Currently, this list is relatively low
traffic, on the order of one email a day.</para>
</listitem>
</varlistentry>
<varlistentry>
<term><link xlink:href="http://opsgui.de/1eLC2Rk">Development
list</link></term>
<listitem>
<para><emphasis>openstack-dev@lists.openstack.org</emphasis>. The
scope of this list is the future state of OpenStack. This is a
high-traffic mailing list, with multiple emails per day.</para>
</listitem>
</varlistentry>
</variablelist>
<para>We recommend that you subscribe to the general list and the operator
list, although you must set up filters to manage the volume for the
general list. You'll also find links to the mailing list archives on the
mailing list wiki page, where you can search through the
discussions.</para>
<para><link xlink:href="http://opsgui.de/NPGIuU">Multiple IRC
channels</link> are available for general questions and developer
discussions. The general discussion channel is #openstack on
<emphasis>irc.freenode.net</emphasis>.</para>
</section>
<section xml:id="report_bugs">
<title>Reporting Bugs</title>
<para>As an operator, you are in a very good position to report unexpected
behavior with your cloud. Since OpenStack is flexible, you may be the only
individual to report a particular issue. Every issue is important to fix,
so it is essential to learn how to easily submit a bug report.<indexterm
class="singular">
<primary>maintenance/debugging</primary>
<secondary>reporting bugs</secondary>
</indexterm><indexterm class="singular">
<primary>bugs, reporting</primary>
</indexterm><indexterm class="singular">
<primary>OpenStack community</primary>
<secondary>reporting bugs</secondary>
</indexterm></para>
<para>All OpenStack projects use <link
xlink:href="http://opsgui.de/1eLC2ku">Launchpad</link>&#160;for bug
tracking. You'll need to create an account on Launchpad before you can
submit a bug report.</para>
<para>Once you have a Launchpad account, reporting a bug is as simple as
identifying the project or projects that are causing the issue. Sometimes
this is more difficult than expected, but those working on the bug triage
are happy to help relocate issues if they are not in the right place
initially:</para>
<itemizedlist>
<listitem>
<para>Report a bug in <link
xlink:href="http://opsgui.de/NPGLa0">nova</link>.</para>
</listitem>
<listitem>
<para>Report a bug in <link
xlink:href="http://opsgui.de/1eLC3Vv">python-novaclient</link>.</para>
</listitem>
<listitem>
<para>Report a bug in <link
xlink:href="http://opsgui.de/NPGMea">swift</link>.</para>
</listitem>
<listitem>
<para>Report a bug in <link
xlink:href="http://opsgui.de/1eLC4Zu">python-swiftclient</link>.</para>
</listitem>
<listitem>
<para>Report a bug in <link
xlink:href="http://opsgui.de/NPGOmf">glance</link>.</para>
</listitem>
<listitem>
<para>Report a bug in <link
xlink:href="http://opsgui.de/1eLC8bQ">python-glanceclient</link>.</para>
</listitem>
<listitem>
<para>Report a bug in <link
xlink:href="http://opsgui.de/NPGRhX">keystone</link>.</para>
</listitem>
<listitem>
<para>Report a bug in <link
xlink:href="http://opsgui.de/1eLC8Z6">python-keystoneclient</link>.</para>
</listitem>
<listitem>
<para>Report a bug in <link
xlink:href="http://opsgui.de/NPGSm2">neutron</link>.</para>
</listitem>
<listitem>
<para>Report a bug in <link
xlink:href="http://opsgui.de/1eLC9ME">python-neutronclient</link>.</para>
</listitem>
<listitem>
<para>Report a bug in <link
xlink:href="http://opsgui.de/NPGTGy">cinder</link>.</para>
</listitem>
<listitem>
<para>Report a bug in <link
xlink:href="http://opsgui.de/1eLCcs7">python-cinderclient</link>.</para>
</listitem>
<listitem>
<para>Report a bug in <link
xlink:href="http://opsgui.de/NPGUdz">horizon</link>.</para>
</listitem>
<listitem>
<para>Report a bug with the <link
xlink:href="http://opsgui.de/1eLCcZ8">documentation</link>.</para>
</listitem>
<listitem>
<para>Report a bug with the <link
xlink:href="http://opsgui.de/NPGUKx">API documentation</link>.</para>
</listitem>
</itemizedlist>
<para>To write a good bug report, the following process is essential.
First, search for the bug to make sure there is no bug already filed for
the same issue. If you find one, be sure to click on "This bug affects X
people. Does this bug affect you?" If you can't find the issue, then enter
the details of your report. It should at least include:</para>
<itemizedlist>
<listitem>
<para>The release, or milestone, or commit ID corresponding to the
software that you are running</para>
</listitem>
<listitem>
<para>The operating system and version where you've identified the
bug</para>
</listitem>
<listitem>
<para>Steps to reproduce the bug, including what went wrong</para>
</listitem>
<listitem>
<para>Description of the expected results instead of what you
saw</para>
</listitem>
<listitem>
<para>Portions of your log files so that you include only relevant
excerpts</para>
</listitem>
</itemizedlist>
<para>When you do this, the bug is created with:</para>
<itemizedlist>
<listitem>
<para>Status: <emphasis>New</emphasis></para>
</listitem>
</itemizedlist>
<para>In the bug comments, you can contribute instructions on how to fix a
given bug, and set it to <emphasis>Triaged</emphasis>. Or you can directly
fix it: assign the bug to yourself, set it to <emphasis>In
progress</emphasis>, branch the code, implement the fix, and propose your
change for merging. But let's not get ahead of ourselves; there are bug
triaging tasks as well.</para>
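  <para>If you prefer to script the duplicate search rather than use the
  Launchpad web interface, the following is a minimal, optional sketch using
  the <literal>launchpadlib</literal> Python library (not part of the standard
  workflow described here); the project name and search text are
  placeholders, and the library must be installed separately:</para>
  <programlisting language="python">from launchpadlib.launchpad import Launchpad

# Log in anonymously; searching public bugs does not require credentials.
launchpad = Launchpad.login_anonymously('bug-triage-sketch', 'production')

# Look for existing nova bugs that resemble the issue before filing a new one.
nova = launchpad.projects['nova']
tasks = nova.searchTasks(search_text='live migration fails with shared storage',
                         status=['New', 'Confirmed', 'Triaged', 'In Progress'])
for task in tasks[:10]:
    print(task.title)</programlisting>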
<section xml:id="confirm_priority">
<title>Confirming and Prioritizing</title>
<para>This stage is about checking that a bug is real and assessing its
impact. Some of these steps require bug supervisor rights (usually
limited to core teams). If the bug lacks information to properly
reproduce or assess the importance of the bug, the bug is set to:</para>
<itemizedlist>
<listitem>
<para>Status: <emphasis>Incomplete</emphasis></para>
</listitem>
</itemizedlist>
<para>Once you have reproduced the issue (or are 100 percent confident
that this is indeed a valid bug) and have permissions to do so,
set:</para>
<itemizedlist>
<listitem>
<para>Status: <emphasis>Confirmed</emphasis></para>
</listitem>
</itemizedlist>
<para>Core developers also prioritize the bug, based on its
impact:</para>
<itemizedlist>
<listitem>
<para>Importance: &lt;Bug impact&gt;</para>
</listitem>
</itemizedlist>
<para>The bug impacts are categorized as follows:</para>
<?hard-pagebreak ?>
<orderedlist>
<listitem>
<para><emphasis>Critical</emphasis> if the bug prevents a key
feature from working properly (regression) for all users (or without
a simple workaround) or results in data loss</para>
</listitem>
<listitem>
<para><emphasis>High</emphasis> if the bug prevents a key feature
from working properly for some users (or with a workaround)</para>
</listitem>
<listitem>
<para><emphasis>Medium</emphasis> if the bug prevents a secondary
feature from working properly</para>
</listitem>
<listitem>
<para><emphasis>Low</emphasis> if the bug is mostly cosmetic</para>
</listitem>
<listitem>
<para><emphasis>Wishlist</emphasis> if the bug is not really a bug
but rather a welcome change in behavior</para>
</listitem>
</orderedlist>
<para>If the bug contains the solution, or a patch, set the bug status
to <emphasis>Triaged</emphasis>.</para>
</section>
<section xml:id="bug_fixing">
<title>Bug Fixing</title>
<para>At this stage, a developer works on a fix. During that time, to
avoid duplicating the work, the developer should set:</para>
<itemizedlist>
<listitem>
<para>Status: <emphasis>In Progress</emphasis></para>
</listitem>
<listitem>
<para>Assignee: &lt;yourself&gt;</para>
</listitem>
</itemizedlist>
<para>When the fix is ready, the developer proposes a change and gets
the change reviewed.</para>
</section>
<section xml:id="after_change_is_accepted">
<title>After the Change Is Accepted</title>
<para>After the change is reviewed, accepted, and lands in master, it
automatically moves to:</para>
<itemizedlist>
<listitem>
<para>Status: <emphasis>Fix Committed</emphasis></para>
</listitem>
</itemizedlist>
<para>When the fix makes it into a milestone or release branch, it
automatically moves to:</para>
<itemizedlist>
<listitem>
<para>Milestone: Milestone the bug was fixed in</para>
</listitem>
<listitem>
<para>Status:&#160;<emphasis>Fix Released</emphasis></para>
</listitem>
</itemizedlist>
</section>
</section>
<section xml:id="openstack_community">
<title>Join the OpenStack Community</title>
<para>Since you've made it this far in the book, you should consider
becoming an official individual member of the community and <link
xlink:href="http://opsgui.de/1eLCejs">join the OpenStack
Foundation</link>.&#160;The OpenStack Foundation is an independent body
providing shared resources to help achieve the OpenStack mission by
protecting, empowering, and promoting OpenStack software and the community
around it, including users, developers, and the entire ecosystem. We all
share the responsibility to make this community the best it can possibly
be, and signing up to be a member is the first step to participating. Like
the software, individual membership within the OpenStack Foundation is
free and accessible to anyone.<indexterm class="singular">
<primary>OpenStack community</primary>
<secondary>joining</secondary>
</indexterm></para>
</section>
<section xml:id="contribute_to_docs">
<title>How to Contribute to the Documentation</title>
<para>OpenStack documentation efforts encompass operator and administrator
docs, API docs, and user docs.<indexterm class="singular">
<primary>OpenStack community</primary>
<secondary>contributing to</secondary>
</indexterm></para>
<para>The genesis of this book was an in-person event, but now that the
book is in your hands, we want you to contribute to it. OpenStack
documentation follows the coding principles of iterative work, with bug
logging, investigating, and fixing.</para>
<para>Just like the code, <link
xlink:href="http://docs.openstack.org"></link> is updated constantly using
the Gerrit review system, with source stored in GitHub in the <link
xlink:href="http://opsgui.de/1eLCf75">openstack-manuals repository</link>
and the <link xlink:href="http://opsgui.de/NPGYda">api-site
repository</link>, in DocBook format.</para>
<para>To review the documentation before it's published, go to the
OpenStack Gerrit server at&#160;<link
xlink:href="http://review.openstack.org"></link> and search for <link
xlink:href="http://opsgui.de/NPGXpV">project:openstack/openstack-manuals</link>
or <link
xlink:href="http://opsgui.de/1eLClM1">project:openstack/api-site</link>.</para>
<para>See the <link xlink:href="http://opsgui.de/NPG68B">How To Contribute
page on the wiki</link> for more information on the steps you need to take
to submit your first documentation review or change.</para>
</section>
<section xml:id="security_info">
<title>Security Information</title>
<para>As a community, we take security very seriously and follow a
specific process for reporting potential issues. We vigilantly pursue
fixes and regularly eliminate exposures. You can report security issues
you discover through this specific process. The OpenStack Vulnerability
Management Team is a very small group of experts in vulnerability
management drawn from the OpenStack community. The team's job is
facilitating the reporting of vulnerabilities, coordinating security fixes
and handling progressive disclosure of the vulnerability information.
Specifically, the team is responsible for the following
functions:<indexterm class="singular">
<primary>vulnerability tracking/management</primary>
</indexterm><indexterm class="singular">
<primary>security issues</primary>
<secondary>reporting/fixing vulnerabilities</secondary>
</indexterm><indexterm class="singular">
<primary>OpenStack community</primary>
<secondary>security information</secondary>
</indexterm></para>
<variablelist>
<varlistentry>
<term>Vulnerability management</term>
<listitem>
<para>All vulnerabilities discovered by community members (or users)
can be reported to the team.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Vulnerability tracking</term>
<listitem>
<para>The team will curate a set of vulnerability related issues in
the issue tracker. Some of these issues are private to the team and
the affected product leads, but once remediation is in place, all
vulnerabilities are public.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Responsible disclosure</term>
<listitem>
<para>As part of our commitment to work with the security community,
the team ensures that proper credit is given to security researchers
who responsibly report issues in OpenStack.</para>
</listitem>
</varlistentry>
</variablelist>
<para>We provide two ways to report issues to the OpenStack Vulnerability
Management Team, depending on how sensitive the issue is:</para>
<itemizedlist>
<listitem>
<para>Open a bug in Launchpad and mark it as a "security bug." This
makes the bug private and accessible to only the Vulnerability
Management Team.</para>
</listitem>
<listitem>
      <para>If the issue is extremely sensitive, send an encrypted email to
      one of the team's members; one way to prepare such an encrypted report
      is sketched after this list. Find their GPG keys at <link
      xlink:href="http://opsgui.de/1eLCkaQ">OpenStack
      Security</link>.</para>
</listitem>
</itemizedlist>
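  <para>As a rough illustration of the encrypted-email path, the sketch below
  uses the <literal>python-gnupg</literal> library to encrypt a report to a
  team member's key; the file names and the key fingerprint are placeholders,
  and you still need to send the resulting file through your own mail
  client:</para>
  <programlisting language="python">import gnupg

# Import the team member's public key (downloaded from the security page).
gpg = gnupg.GPG(gnupghome='/home/operator/.gnupg')
with open('vmt-member-key.asc') as key_file:
    gpg.import_keys(key_file.read())

# Encrypt the report to that key; the fingerprint below is a placeholder.
with open('vulnerability-report.txt') as report_file:
    encrypted = gpg.encrypt(report_file.read(), 'PLACEHOLDER-FINGERPRINT')

with open('vulnerability-report.txt.asc', 'w') as out_file:
    out_file.write(str(encrypted))</programlisting>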
<para>You can find the full list of security-oriented teams you can join
at <link xlink:href="http://opsgui.de/NPGZxO">Security Teams</link>. The
vulnerability management process is fully documented at <link
xlink:href="http://opsgui.de/1eLCkYk">Vulnerability
Management</link>.</para>
</section>
<section xml:id="additional_info">
<title>Finding Additional Information</title>
<para>In addition to this book, there are many other sources of
information about OpenStack. The&#160;<link
xlink:href="http://opsgui.de/NPGZOt">OpenStack website</link> is a good
starting point, with&#160;<link
xlink:href="http://opsgui.de/NPFTC8">OpenStack
Docs</link>&#160;and&#160;<link
xlink:href="http://opsgui.de/1eLAlDq">OpenStack API Docs</link> providing
technical documentation about OpenStack. The <link
xlink:href="http://opsgui.de/1eLCrDo">OpenStack wiki</link> contains a lot
of general information that cuts across the OpenStack projects, including
a list of <link xlink:href="http://opsgui.de/NPH3hd">recommended
tools</link>. Finally, there are a number of blogs aggregated
at&#160;<link xlink:href="http://opsgui.de/1eLCsXY">Planet
OpenStack</link>.<indexterm class="singular">
<primary>OpenStack community</primary>
<secondary>additional information</secondary>
</indexterm></para>
</section>
</chapter>

View File

@ -1,64 +1,102 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE part [
<!-- Some useful entities borrowed from HTML -->
<!ENTITY ndash "&#x2013;">
<!ENTITY mdash "&#x2014;">
<!ENTITY hellip "&#x2026;">
<!ENTITY plusmn "&#xB1;">
]>
<part xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0"
xml:id="architecture">
<title>Architecture</title>
<partintro>
<para>
Designing an OpenStack cloud is a great achievement. It requires a
robust understanding of the requirements and needs of the cloud's
users to determine the best possible configuration to meet them.
OpenStack provides a great deal of flexibility to achieve your
needs, and this part of the book aims to shine light on many of the
decisions you need to make during the process.
</para>
<para>To design, deploy, and configure OpenStack, administrators
must understand the logical architecture. A diagram can help
you envision all the integrated services within
OpenStack and how they interact with each other.</para>
<para>OpenStack modules are one of the following types:</para>
<itemizedlist role="compact">
<part version="5.0" xml:id="architecture"
xmlns="http://docbook.org/ns/docbook"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:ns5="http://www.w3.org/2000/svg"
xmlns:ns4="http://www.w3.org/1998/Math/MathML"
xmlns:ns3="http://www.w3.org/1999/xhtml"
xmlns:ns="http://docbook.org/ns/docbook">
<title>Architecture</title>
<partintro>
<para>Designing an OpenStack cloud is a great achievement. It requires a
robust understanding of the requirements and needs of the cloud's users to
determine the best possible configuration to meet them. OpenStack provides
a great deal of flexibility to achieve your needs, and this part of the
book aims to shine light on many of the decisions you need to make during
the process.</para>
<para>To design, deploy, and configure OpenStack, administrators must
understand the logical architecture. A diagram can help you envision all
the integrated services within OpenStack and how they interact with each
other.<indexterm class="singular">
<primary>modules, types of</primary>
</indexterm><indexterm class="singular">
<primary>OpenStack</primary>
<secondary>module types in</secondary>
</indexterm></para>
<para>OpenStack modules are one of the following types:</para>
<variablelist>
<varlistentry>
<term>Daemon</term>
<listitem>
<para>Daemon. Runs as a background process. On Linux platforms, a daemon is
usually installed as a service.</para>
<para>Runs as a background process. On Linux platforms, a daemon is
usually installed as a service.<indexterm class="singular">
<primary>daemons</primary>
<secondary>basics of</secondary>
</indexterm></para>
</listitem>
</varlistentry>
<varlistentry>
<term>Script</term>
<listitem>
<para>Script. Installs a virtual environment and runs tests.</para>
<para>Installs a virtual environment and runs tests.<indexterm
class="singular">
<primary>script modules</primary>
</indexterm></para>
</listitem>
</varlistentry>
<varlistentry>
<term>Command-line interface (CLI)</term>
<listitem>
<para>Command-line interface (CLI). Enables users to submit API
calls to OpenStack services through commands.</para>
<para>Enables users to submit API calls to OpenStack services
through commands.<indexterm class="singular">
<primary>Command-line interface (CLI)</primary>
</indexterm></para>
</listitem>
</itemizedlist>
<para>As shown, end users can interact through the dashboard,
CLIs, and APIs. All services authenticate through a common
Identity Service and individual services interact with each
other through public APIs, except where privileged
administrator commands are necessary. The diagram shows the
most common, but not the only logical architecture for an
OpenStack cloud.</para>
<figure>
<title>OpenStack Havana Logical Architecture</title>
<mediaobject>
</varlistentry>
</variablelist>
<para>As shown, end users can interact through the dashboard, CLIs, and
APIs. All services authenticate through a common Identity Service, and
individual services interact with each other through public APIs, except
where privileged administrator commands are necessary. <xref
linkend="openstack-havana-diagram" /> shows the most common, but not the
only logical architecture for an OpenStack cloud.</para>
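    <para>To make the "everything authenticates through a common Identity
    Service" flow concrete, here is a minimal sketch using the Havana-era
    Python client libraries; the endpoint and credentials are
    placeholders:</para>
    <programlisting language="python">from keystoneclient.v2_0 import client as keystone_client
from novaclient.v1_1 import client as nova_client

# Authenticate against the Identity Service and obtain a token.
keystone = keystone_client.Client(username='demo', password='secret',
                                  tenant_name='demo',
                                  auth_url='http://controller:5000/v2.0/')
print(keystone.auth_token)

# The Compute client performs the same authentication behind the scenes,
# then talks to the nova API endpoint published in the service catalog.
nova = nova_client.Client('demo', 'secret', 'demo',
                          'http://controller:5000/v2.0/')
for server in nova.servers.list():
    print(server.name)</programlisting>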
<figure xml:id="openstack-havana-diagram">
<title>OpenStack Havana Logical Architecture (<link
xlink:href="http://opsgui.de/1kYnyy1"></link>)</title>
<mediaobject>
<imageobject>
<imagedata width="6in"
fileref="figures/osog_0001online.png"/>
<imagedata fileref="figures/osog_0001.png"></imagedata>
</imageobject>
</mediaobject></figure>
</partintro>
<xi:include href="ch_arch_examples.xml"/>
<xi:include href="ch_arch_provision.xml"/>
<xi:include href="ch_arch_cloud_controller.xml"/>
<xi:include href="ch_arch_compute_nodes.xml"/>
<xi:include href="ch_arch_scaling.xml"/>
<xi:include href="ch_arch_storage.xml"/>
<xi:include href="ch_arch_network_design.xml"/>
</part>
</mediaobject>
</figure>
</partintro>
<xi:include href="ch_arch_examples.xml" />
<xi:include href="ch_arch_provision.xml" />
<xi:include href="ch_arch_cloud_controller.xml" />
<xi:include href="ch_arch_compute_nodes.xml" />
<xi:include href="ch_arch_scaling.xml" />
<xi:include href="ch_arch_storage.xml" />
<xi:include href="ch_arch_network_design.xml" />
</part>

View File

@ -1,56 +1,62 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chapter [
<!-- Some useful entities borrowed from HTML -->
<!ENTITY ndash "&#x2013;">
<!ENTITY mdash "&#x2014;">
<!ENTITY hellip "&#x2026;">
<!ENTITY plusmn "&#xB1;">
<part version="5.0" xml:id="operations" xmlns="http://docbook.org/ns/docbook"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:ns5="http://www.w3.org/1999/xhtml"
xmlns:ns4="http://www.w3.org/1998/Math/MathML"
xmlns:ns3="http://www.w3.org/2000/svg"
xmlns:ns="http://docbook.org/ns/docbook">
<title>Operations</title>
<partintro xml:id="ops-partintro">
<para>Congratulations! By now, you should have a solid design for your
cloud. We now recommend that you turn to the OpenStack Installation Guide
(<link xlink:href="http://opsgui.de/1eLCvD8"></link> for Ubuntu, for
example), which contains a step-by-step guide on how to manually install
the OpenStack packages and dependencies on your cloud.</para>
]>
<part xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0"
xml:id="operations">
<title>Operations</title>
<partintro xml:id="ops-partintro">
<para>Congratulations! By now, you should have a solid design for
your cloud. We now recommend that you turn to the <link
xlink:title="OpenStack Install and Deploy Manual for Ubuntu"
xlink:href="http://docs.openstack.org/havana/install-guide/install/apt/"
>OpenStack Install and Deploy Manual for Ubuntu</link>
(http://docs.openstack.org/havana/install-guide/install/apt/),
which contains a step-by-step guide on how to manually install
the OpenStack packages and dependencies on your cloud.</para>
<para>While it is important for an operator to be familiar with
the steps involved in deploying OpenStack, we also strongly
encourage you to evaluate configuration-management tools such
as <glossterm>Puppet</glossterm> or
<glossterm>Chef</glossterm> that can help automate this
deployment process.</para>
<para>In the remainder of this guide, we assume that you have
successfully deployed an OpenStack cloud and are able to
perform basic operations such as adding images, booting
instances, and attaching volumes.</para>
<para>As your focus turns to stable operations, we recommend that
you do skim the remainder of this book to get a sense
of the content. Some of this content is useful to read in
advance so that you can put best practices into effect to
simplify your life in the long run. Other content is more
useful as a reference that you might turn to when an unexpected
event occurs, such as a power failure, or to troubleshoot a
particular problem.</para>
</partintro>
<xi:include href="ch_ops_lay_of_land.xml"/>
<xi:include href="ch_ops_projects_users.xml"/>
<xi:include href="ch_ops_user_facing.xml"/>
<xi:include href="ch_ops_maintenance.xml"/>
<xi:include href="ch_ops_network_troubleshooting.xml"/>
<xi:include href="ch_ops_log_monitor.xml"/>
<xi:include href="ch_ops_backup_recovery.xml"/>
<xi:include href="ch_ops_customize.xml"/>
<xi:include href="ch_ops_upstream.xml"/>
<xi:include href="ch_ops_advanced_configuration.xml"/>
<xi:include href="ch_ops_upgrades.xml"/>
<para>While it is important for an operator to be familiar with the steps
involved in deploying OpenStack, we also strongly encourage you to
evaluate configuration-management tools, such as
<glossterm>Puppet</glossterm> or <glossterm>Chef</glossterm>, which can
help automate this deployment process.<indexterm class="singular">
<primary>Chef</primary>
</indexterm><indexterm class="singular">
<primary>Puppet</primary>
</indexterm></para>
</part>
<para>In the remainder of this guide, we assume that you have successfully
deployed an OpenStack cloud and are able to perform basic operations such
as adding images, booting instances, and attaching volumes.</para>
<para>As your focus turns to stable operations, we recommend that you do
skim the remainder of this book to get a sense of the content. Some of
this content is useful to read in advance so that you can put best
practices into effect to simplify your life in the long run. Other content
is more useful as a reference that you might turn to when an unexpected
event occurs (such as a power failure), or to troubleshoot a particular
problem.</para>
</partintro>
<xi:include href="ch_ops_lay_of_land.xml" />
<xi:include href="ch_ops_projects_users.xml" />
<xi:include href="ch_ops_user_facing.xml" />
<xi:include href="ch_ops_maintenance.xml" />
<xi:include href="ch_ops_network_troubleshooting.xml" />
<xi:include href="ch_ops_log_monitor.xml" />
<xi:include href="ch_ops_backup_recovery.xml" />
<xi:include href="ch_ops_customize.xml" />
<xi:include href="ch_ops_upstream.xml" />
<xi:include href="ch_ops_advanced_configuration.xml" />
<xi:include href="ch_ops_upgrades.xml" />
</part>


View File

@ -1,334 +1,484 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE section [
<!-- Some useful entities borrowed from HTML -->
<!ENTITY ndash "&#x2013;">
<!ENTITY mdash "&#x2014;">
<!ENTITY hellip "&#x2026;">
<!ENTITY plusmn "&#xB1;">
<section version="5.0" xml:id="example_architecture-nova"
xmlns="http://docbook.org/ns/docbook"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:ns5="http://www.w3.org/2000/svg"
xmlns:ns4="http://www.w3.org/1998/Math/MathML"
xmlns:ns3="http://www.w3.org/1999/xhtml"
xmlns:ns="http://docbook.org/ns/docbook">
<?dbhtml stop-chunking?>
<title>Example Architecture—Legacy Networking (nova)</title>
<para>This particular example architecture has been upgraded from Grizzly to
Havana and tested in production environments where many public IP addresses
are available for assignment to multiple instances. You can find a second
example architecture that uses OpenStack Networking (neutron) after this
section. Each example offers high availability, meaning that if a particular
node goes down, another node with the same configuration can take over the
tasks so that service continues to be available.<indexterm class="singular">
<primary>Havana</primary>
</indexterm><indexterm class="singular">
<primary>Grizzly</primary>
</indexterm></para>
<section xml:id="overview">
<title>Overview</title>
<para>The simplest architecture you can build upon for Compute has a
single cloud controller and multiple compute nodes. The simplest
architecture for Object Storage has five nodes: one for identifying users
and proxying requests to the API, then four for storage itself to provide
enough replication for eventual consistency. This example architecture
does not dictate a particular number of nodes, but shows the thinking
<phrase role="keep-together">and considerations</phrase> that went into
choosing this architecture including the features <phrase
role="keep-together">offered</phrase>.<indexterm class="singular">
<primary>CentOS</primary>
</indexterm><indexterm class="singular">
<primary>RDO (Red Hat Distributed OpenStack)</primary>
</indexterm><indexterm class="singular">
<primary>Ubuntu</primary>
</indexterm><indexterm class="singular">
<primary>legacy networking (nova)</primary>
<secondary>component overview</secondary>
</indexterm><indexterm class="singular">
<primary>example architectures</primary>
<see>legacy networking; OpenStack networking</see>
</indexterm><indexterm class="singular">
<primary>Object Storage</primary>
<secondary>simplest architecture for</secondary>
</indexterm><indexterm class="singular">
<primary>Compute</primary>
<secondary>simplest architecture for</secondary>
</indexterm></para>
]>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="example_architecture-nova">
<?dbhtml stop-chunking?>
<title>Example Architecture - Legacy Networking (nova)</title>
<para>This particular example architecture has been upgraded from Grizzly to
Havana and tested in production environments where many public IP addresses
are available for assignment to multiple instances. You can find a second
example architecture that uses OpenStack Networking (neutron) after this
section. Each example offers high availability, meaning that if a particular
node goes down, another node with the same configuration can take over the
tasks so that service continues to be available.</para>
<section xml:id="overview">
<title>Overview</title>
<para>The simplest architecture you can build upon for Compute has a
single cloud controller and multiple compute nodes. The simplest
architecture for Object Storage has five nodes: one for identifying
users and proxying requests to the API, then four for storage itself
to provide enough replication for eventual consistency. This example
architecture does not dictate a particular number of nodes, but shows
the thinking and considerations that went into choosing this
architecture including the features offered.</para>
<section xml:id="overview_components-nova">
<title>Components</title>
<informaltable rules="all">
<col width="40%"/>
<col width="60%"/>
<tbody>
<tr>
<td><para>OpenStack release</para></td>
<td><para>Havana</para></td>
</tr>
<tr>
<td><para>Host operating system</para></td>
<td><para>Ubuntu 12.04 LTS or Red Hat Enterprise Linux 6.5
including derivatives such as CentOS and Scientific Linux</para></td>
</tr>
<tr>
<td><para>OpenStack package repository</para></td>
<td><para><link
xlink:href="https://wiki.ubuntu.com/ServerTeam/CloudArchive"
>Ubuntu Cloud Archive</link>
(https://wiki.ubuntu.com/ServerTeam/CloudArchive)
or <link xlink:href="http://openstack.redhat.com/Frequently_Asked_Questions">RDO</link>
(http://openstack.redhat.com/Frequently_Asked_Questions)
*</para></td>
</tr>
<tr>
<td><para>Hypervisor</para></td>
<td><para>KVM</para></td>
</tr>
<tr>
<td><para>Database</para></td>
<td><para>MySQL*</para></td>
</tr>
<tr>
<td><para>Message queue</para></td>
<td><para>RabbitMQ for Ubuntu, Qpid for Red Hat Enterprise
Linux and derivatives</para></td>
</tr>
<tr>
<td><para>Networking service</para></td>
<td><para>nova-network</para></td>
</tr>
<tr>
<td><para>Network manager</para></td>
<td><para>FlatDHCP</para></td>
</tr>
<tr>
<td><para>Single nova-network or
multi-host?</para></td>
<td><para>multi-host*</para></td>
</tr>
<tr>
<td><para>Image Service (glance)
back-end</para></td>
<td><para>file</para></td>
</tr>
<tr>
<td><para>Identity Service (keystone)
driver</para></td>
<td><para>SQL</para></td>
</tr>
<tr>
<td><para>Block Storage Service (cinder)
back-end</para></td>
<td><para>LVM/iSCSI</para></td>
</tr>
<tr>
<td><para>Live Migration back-end</para></td>
<td><para>shared storage using NFS *</para></td>
</tr>
<tr>
<td><para>Object storage</para></td>
<td><para>OpenStack Object Storage
(swift)</para></td>
</tr>
</tbody>
</informaltable>
<para>An asterisk (*) indicates when the example architecture
deviates from the settings of a default
installation. We'll offer explanations for those deviations next.</para>
<?hard-pagebreak?>
<note>
<para>The following features of OpenStack are supported by
the example architecture documented in this guide, but
are optional:<itemizedlist role="compact">
<listitem>
<para><glossterm>dashboard</glossterm>: You probably want to offer a dashboard, but your users may be more interested in API access only.</para>
</listitem>
<listitem>
<para><glossterm>block
storage</glossterm>: You don't have to offer users block storage if their use case only needs ephemeral storage on compute nodes, for example.</para>
</listitem>
<listitem>
<para><glossterm>floating IP
address</glossterm>es: Floating IP addresses are public IP addresses that you allocate from a pre-defined pool to assign to virtual machines at launch. Floating IP address ensure that the public IP address is available whenever an instance is booted. Not every organization can offer thousands of public floating IP addresses for thousands of instances, so this feature is considered optional.</para>
</listitem>
<listitem>
<para><glossterm>live
migration</glossterm>: If you need to move running virtual machine instances from one host to another with little or no service interruption you would enable live migration, but it is considered optional.</para>
</listitem>
<listitem>
<para><glossterm>object
storage</glossterm>: You may choose to store machine
images on a file system rather than in object
storage if you do not have the extra hardware for
the required replication and redundancy that
OpenStack Object Storage offers.</para>
</listitem>
</itemizedlist></para>
</note>
<title>Components</title>
<informaltable rules="all">
<col width="40%" />
<col width="60%" />
<thead>
<tr>
<th>Component</th>
<th>Details</th>
</tr>
</thead>
<tbody>
<tr>
<td><para>OpenStack release</para></td>
<td><para>Havana</para></td>
</tr>
<tr>
<td><para>Host operating system</para></td>
<td><para>Ubuntu 12.04 LTS or Red Hat Enterprise Linux 6.5,
including derivatives such as CentOS and Scientific
Linux</para></td>
</tr>
<tr>
<td><para>OpenStack package repository</para></td>
<td><para><link xlink:href="http://opsgui.de/NPHp7s">Ubuntu Cloud
Archive</link> or <link
xlink:href="http://opsgui.de/1eLCZcm">RDO</link>*</para></td>
</tr>
<tr>
<td><para>Hypervisor</para></td>
<td><para>KVM</para></td>
</tr>
<tr>
<td><para>Database</para></td>
<td><para>MySQL*</para></td>
</tr>
<tr>
<td><para>Message queue</para></td>
<td><para>RabbitMQ for Ubuntu; Qpid for Red Hat Enterprise Linux
and derivatives</para></td>
</tr>
<tr>
<td><para>Networking service</para></td>
<td><para><literal>nova-network</literal></para></td>
</tr>
<tr>
<td><para>Network manager</para></td>
<td><para>FlatDHCP</para></td>
</tr>
<tr>
<td><para>Single <literal>nova-network</literal> or
multi-host?</para></td>
<td><para>multi-host*</para></td>
</tr>
<tr>
<td><para>Image Service (glance) backend</para></td>
<td><para>file</para></td>
</tr>
<tr>
<td><para>Identity Service (keystone) driver</para></td>
<td><para>SQL</para></td>
</tr>
<tr>
<td><para>Block Storage Service (cinder) backend</para></td>
<td><para>LVM/iSCSI</para></td>
</tr>
<tr>
<td><para>Live Migration backend</para></td>
<td><para>Shared storage using NFS*</para></td>
</tr>
<tr>
<td><para>Object storage</para></td>
<td><para>OpenStack Object Storage (swift)</para></td>
</tr>
</tbody>
</informaltable>
<para>An asterisk (*) indicates when the example architecture deviates
from the settings of a default installation. We'll offer explanations
for those deviations next.<indexterm class="singular">
<primary>objects</primary>
<secondary>object storage</secondary>
</indexterm><indexterm class="singular">
<primary>storage</primary>
<secondary>object storage</secondary>
</indexterm><indexterm class="singular">
<primary>migration</primary>
</indexterm><indexterm class="singular">
<primary>live migration</primary>
</indexterm><indexterm class="singular">
<primary>IP addresses</primary>
<secondary>floating</secondary>
</indexterm><indexterm class="singular">
<primary>floating IP address</primary>
</indexterm><indexterm class="singular">
<primary>storage</primary>
<secondary>block storage</secondary>
</indexterm><indexterm class="singular">
<primary>block storage</primary>
</indexterm><indexterm class="singular">
<primary>dashboard</primary>
</indexterm><indexterm class="singular">
<primary>legacy networking (nova)</primary>
<secondary>features supported by</secondary>
</indexterm></para>
<note>
<para>The following features of OpenStack are supported by the example
architecture documented in this guide, but are optional:<itemizedlist
role="compact">
<listitem>
<para><glossterm>Dashboard</glossterm>: You probably want to
offer a dashboard, but your users may be more interested in API
access only.</para>
</listitem>
<listitem>
<para><glossterm>Block storage</glossterm>: You don't have to
offer users block storage if their use case only needs ephemeral
storage on compute nodes, for example.</para>
</listitem>
<listitem>
            <para><glossterm>Floating IP address</glossterm>: Floating IP
            addresses are public IP addresses that you allocate from a
            predefined pool to assign to virtual machines at launch.
            Floating IP addresses ensure that the public IP address is
            available whenever an instance is booted; a brief sketch of
            allocating and attaching one follows this note. Not every
            organization can offer thousands of public floating IP addresses
            for thousands of instances, so this feature is considered
            optional.</para>
</listitem>
<listitem>
<para><glossterm>Live migration</glossterm>: If you need to move
running virtual machine instances from one host to another with
little or no service interruption, you would enable live
migration, but it is considered optional.</para>
</listitem>
<listitem>
<para><glossterm>Object storage</glossterm>: You may choose to
store machine images on a file system rather than in object
storage if you do not have the extra hardware for the required
replication and redundancy that OpenStack Object Storage
offers.</para>
</listitem>
</itemizedlist></para>
</note>
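      <para>As a brief sketch of what the floating IP workflow looks like to a
      user, the following uses the Havana-era
      <literal>python-novaclient</literal> library; the credentials, pool
      name, and instance name are placeholders:</para>
      <programlisting language="python">from novaclient.v1_1 import client

# Placeholder credentials and endpoint.
nova = client.Client('demo', 'secret', 'demo',
                     'http://controller:5000/v2.0/')

# Allocate an address from the predefined pool and attach it to an instance.
server = nova.servers.find(name='my-instance')
floating_ip = nova.floating_ips.create(pool='public')
server.add_floating_ip(floating_ip.ip)</programlisting>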
</section>
<section xml:id="rationale">
<title>Rationale</title>
<para>This example architecture has been selected based on the
current default feature set of OpenStack
<glossterm>Havana</glossterm>, with an emphasis on
stability. We believe that many
clouds that currently run OpenStack in production have
made similar choices.</para>
<para>You must first choose the operating system that runs on
all of the physical nodes. While OpenStack is supported on
several distributions of Linux, we used <emphasis
role="bold">Ubuntu 12.04 LTS (Long Term
Support)</emphasis>, which is used by the majority of
the development community, has feature completeness
compared with other distributions, and has clear future
support plans.</para>
<para>We recommend that you do not use the default Ubuntu
OpenStack install packages and instead use the <link
xlink:href="https://wiki.ubuntu.com/ServerTeam/CloudArchive"
>Ubuntu Cloud Archive</link>
(https://wiki.ubuntu.com/ServerTeam/CloudArchive). The Cloud Archive
is a package repository supported by Canonical that allows you to
upgrade to future OpenStack releases while remaining on Ubuntu
12.04.</para>
<para><emphasis role="bold">KVM</emphasis> as a
<glossterm>hypervisor</glossterm> complements the choice of
Ubuntu - being a matched pair in terms of support, and also because
of the significant degree of attention it garners from the OpenStack
development community (including the authors, who mostly use KVM).
It is also feature complete, free from licensing charges and
restrictions.</para>
<para><emphasis role="bold">MySQL</emphasis> follows a similar trend.
Despite its recent change of ownership, this database is the most
tested for use with OpenStack and is heavily documented. We deviate
from the default database, <emphasis>SQLite</emphasis>, because
SQLite is not an appropriate database for production usage.</para>
        <para>The choice of <emphasis role="bold">RabbitMQ</emphasis> over
            other AMQP-compatible options that are gaining support in OpenStack,
            such as ZeroMQ and Qpid, is due to its ease of use and
            significant testing in production. It is also the only option that
supports features such as Compute cells. We recommend clustering
with RabbitMQ, as it is an integral component of the system, and
fairly simple to implement due to its inbuilt nature.</para>
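      <para>In practice, each OpenStack service simply points at the broker in
          its configuration file. The following <filename>nova.conf</filename>
          excerpt is a sketch that assumes a clustered pair of brokers with
          hypothetical host names and a placeholder password:</para>
      <programlisting># /etc/nova/nova.conf (excerpt)
rpc_backend = nova.openstack.common.rpc.impl_kombu
rabbit_hosts = rabbit1:5672,rabbit2:5672
rabbit_userid = nova
rabbit_password = RABBIT_PASS
rabbit_ha_queues = True</programlisting>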
      <para>As discussed in previous chapters, there are several options for
          networking in OpenStack Compute. We recommend <emphasis role="bold"
          >FlatDHCP</emphasis> with <emphasis role="bold"
          >Multi-Host</emphasis> networking mode for high availability,
          running one <code>nova-network</code> daemon per OpenStack Compute
          host. This provides a robust mechanism for ensuring that network
          interruptions are isolated to individual compute hosts, and allows
          for the direct use of hardware network gateways.</para>
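      <para>A sketch of the corresponding <filename>nova.conf</filename>
          settings on each compute host follows; the interface names are
          assumptions and depend on your hardware and cabling:</para>
      <programlisting># /etc/nova/nova.conf (excerpt)
network_manager = nova.network.manager.FlatDHCPManager
multi_host = True
send_arp_for_ha = True
flat_interface = eth1
public_interface = eth0</programlisting>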
<para><emphasis role="bold">Live Migration</emphasis> is supported by
way of shared storage, with <emphasis role="bold">NFS</emphasis> as
the distributed file system.</para>
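      <para>The usual arrangement is to export one directory from the NFS
          server and mount it at the instances path on every compute node,
          for example with an <filename>/etc/fstab</filename> entry like the
          following, where the server name and export path are
          placeholders:</para>
      <programlisting>nfs-server:/srv/nova-instances  /var/lib/nova/instances  nfs4  defaults  0  0</programlisting>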
      <para>Because many small-scale deployments find it too costly to run
          Object Storage solely for the storage of virtual machine images, we
          opted for the file back-end in the OpenStack Image Service (Glance).
          If your cloud will include Object Storage, you can easily add it as
          a back-end.</para>
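      <para>With the file back-end, image storage comes down to two settings
          in <filename>glance-api.conf</filename>; the values shown here are
          the defaults:</para>
      <programlisting># /etc/glance/glance-api.conf (excerpt)
default_store = file
filesystem_store_datadir = /var/lib/glance/images/</programlisting>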
      <para>We chose the <emphasis role="bold">SQL back-end for Identity
          Service (keystone)</emphasis> over others, such as LDAP. This
          back-end is simple to install and is robust. The authors acknowledge
          that many installations want to bind with existing directory
          services, and advise a careful review of the <link
          xlink:title="LDAP config options"
          xlink:href="http://docs.openstack.org/havana/config-reference/content/ch_configuring-openstack-identity.html#configuring-keystone-for-ldap-backend"
          >array of options available</link>
          (http://docs.openstack.org/havana/config-reference/content/ch_configuring-openstack-identity.html#configuring-keystone-for-ldap-backend).</para>
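      <para>The SQL back-end needs little more than a database connection
          string in <filename>keystone.conf</filename>. This sketch assumes
          the MySQL server runs on the cloud controller and uses placeholder
          credentials; depending on your release, the section may be named
          <literal>[sql]</literal> or <literal>[database]</literal>:</para>
      <programlisting>[sql]
connection = mysql://keystone:KEYSTONE_DBPASS@cloud-controller/keystone</programlisting>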
<para>Block Storage (cinder) is installed natively on
external storage nodes and uses the <emphasis role="bold">LVM/iSCSI
plugin</emphasis>. Most Block Storage Service plugins are tied
          to particular vendor products and implementations, limiting their use
to consumers of those hardware platforms, but LVM/iSCSI is robust
and stable on commodity hardware.</para>
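      <para>On the storage nodes, the LVM/iSCSI plugin amounts to a volume
          group plus a few lines in <filename>cinder.conf</filename>; this
          sketch assumes a volume group named
          <literal>cinder-volumes</literal> and the default tgt iSCSI
          helper:</para>
      <programlisting># /etc/cinder/cinder.conf (excerpt)
volume_driver = cinder.volume.drivers.lvm.LVMISCSIDriver
volume_group = cinder-volumes
iscsi_helper = tgtadm</programlisting>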
<para>While the cloud can be run without the <emphasis role="bold"
>OpenStack Dashboard</emphasis>, we consider it to be
indispensable, not just for user interaction with the cloud, but
also as a tool for operators. Additionally, the dashboard's use of
Django makes it a flexible framework for extension.</para>
</section>
<section xml:id="neutron">
<title>Why Not Use the OpenStack Network Service
(neutron)?</title>
        <para>This example architecture does not use the OpenStack
            Network Service (neutron) because it does not yet support
            multi-host networking, and our organizations (university,
            government) have access to a large range of
            publicly accessible IPv4 addresses.</para>
</section>
<section xml:id="multi-host-networking">
<title>Why Use Multi-host Networking?</title>
<para>In a default OpenStack deployment, there is a single
<code>nova-network</code> service that runs within the cloud
(usually on the cloud controller) that provides services such as
network address translation (NAT), DHCP, and DNS to the guest
            instances. If the single node that runs the
            <code>nova-network</code> service goes down, you cannot
            access your instances, and the instances cannot access the
            Internet. The single node that runs the
            <code>nova-network</code> service can become a bottleneck if
            excessive network traffic comes in and goes out of the
            cloud.</para>
        <tip><para>
            <link
            xlink:href="http://docs.openstack.org/havana/install-guide/install/apt/content/nova-network.html"
            >Multi-host</link>
            (http://docs.openstack.org/havana/install-guide/install/apt/content/nova-network.html)
            is a high-availability option for the network
            configuration, where the <code>nova-network</code> service runs
            on every compute node instead of on only a single
            node.</para></tip>
</section>
<section xml:id="neutron">
<title>Why not use the OpenStack Network Service (neutron)?</title>
<para>This example architecture does not use the OpenStack Network
Service (neutron), because it does not yet support multi-host networking
and our organizations (university, government) have access to a large
range of publicly-accessible IPv4 addresses.<indexterm class="singular">
<primary>legacy networking (nova)</primary>
<secondary>vs. OpenStack Network Service (neutron)</secondary>
</indexterm></para>
</section>
<section xml:id="detailed_desc">
<title>Detailed Description</title>
<para>The reference architecture consists of multiple compute
nodes, a cloud controller, an external NFS storage server
            for instance storage, and an OpenStack Block Storage server
for <glossterm>volume</glossterm> storage. A network time
service (Network Time Protocol, NTP) synchronizes time on
all the nodes. FlatDHCPManager in multi-host mode is used
for the networking. A logical diagram for this example
architecture shows which services are running on each node:</para>
<informalfigure>
<mediaobject>
<imageobject>
<imagedata width="4in"
fileref="figures/os-ref-arch.png"/>
</imageobject>
</mediaobject>
</informalfigure>
        <para>The cloud controller runs the dashboard, the API
            services, the database (MySQL), a message queue server
            (RabbitMQ), the scheduler for choosing compute resources
            (<code>nova-scheduler</code>), Identity services (keystone,
            <code>nova-consoleauth</code>), Image services
            (<code>glance-api</code>,
            <code>glance-registry</code>), services for console access
            of guests, and Block Storage services, including the
            scheduler for storage resources (<code>cinder-api</code>
            and <code>cinder-scheduler</code>).</para>
<para>Compute nodes are where the computing resources are
held, and in our example architecture they run the
hypervisor (KVM), libvirt (the driver for the hypervisor,
which enables live migration from node to node),
<code>nova-compute</code>,
<code>nova-api-metadata</code> (generally only used
when running in multi-host mode, it retrieves
instance-specific metadata), <code>nova-vncproxy</code>,
and <code>nova-network</code>.
</para>
        <para>The network consists of two switches, one for the management or
            private traffic, and one that covers public access, including
            floating IPs. To support this, the cloud controller and the compute
            nodes have two network cards. The OpenStack Block Storage and NFS
            storage servers only need to access the private network and
            therefore only need one network card, but multiple cards run in a
            bonded configuration are recommended if possible. Floating IP
            access is direct to the Internet, whereas flat IP access goes
            through a NAT. To envision the network traffic, use this
            diagram:</para>
<informalfigure>
<mediaobject>
<imageobject>
<imagedata width="4in"
fileref="figures/os_physical_network.png"/>
</imageobject>
</mediaobject>
</informalfigure>
<section xml:id="multi-host-networking">
<title>Why use multi-host networking?</title>
<para>In a default OpenStack deployment, there is a single
<code>nova-network</code> service that runs within the cloud (usually on
the cloud controller) that provides services such as network address
translation (NAT), DHCP, and DNS to the guest instances. If the single
node that runs the <code>nova-network</code> service goes down, you
cannot access your instances, and the instances cannot access the
Internet. The single node that runs the <literal>nova-network</literal>
service can become a bottleneck if excessive network traffic comes in
and goes out of the cloud.<indexterm class="singular">
<primary>networks</primary>
<secondary>multi-host</secondary>
</indexterm><indexterm class="singular">
<primary>multi-host networking</primary>
</indexterm><indexterm class="singular">
<primary>legacy networking (nova)</primary>
<secondary>benefits of multi-host networking</secondary>
</indexterm></para>
<tip>
<para><link xlink:href="http://opsgui.de/NPHqbu">Multi-host</link> is
a high-availability option for the network configuration, where the
<literal>nova-network</literal> service is run on every compute node
instead of running on only a single node.</para>
</tip>
</section>
<?hard-pagebreak?>
<section xml:id="optional_extensions">
<title>Optional Extensions</title>
<para>You can extend this reference architecture as
follows:</para>
<itemizedlist role="compact">
<listitem>
<para>Add additional cloud controllers (see <xref
linkend="maintenance"/>).</para>
</listitem>
<listitem>
                <para>Add an OpenStack Storage service (see the Object Storage
                    chapter in the <citetitle>OpenStack Installation
                    Guide</citetitle> for your distribution).</para>
</listitem>
<listitem>
<para>Add additional OpenStack Block Storage hosts
(see <xref linkend="maintenance"/>).</para>
</listitem>
</itemizedlist>
</section>
</section>
</section>
<section xml:id="detailed_desc">
<title>Detailed Description</title>
<para>The reference architecture consists of multiple compute nodes, a
cloud controller, an external NFS storage server for instance storage, and
an OpenStack Block Storage server for <glossterm>volume</glossterm>
storage.<indexterm class="singular">
<primary>legacy networking (nova)</primary>
<secondary>detailed description</secondary>
</indexterm> A network time service (Network Time Protocol, or NTP)
synchronizes time on all the nodes. FlatDHCPManager in multi-host mode is
used for the networking. A logical diagram for this example architecture
shows which services are running on each node:</para>
<informalfigure>
<mediaobject>
<imageobject>
<imagedata fileref="figures/osog_01in01.png"></imagedata>
</imageobject>
</mediaobject>
</informalfigure>
<para>The cloud controller runs the dashboard, the API services, the
database (MySQL), a message queue server (RabbitMQ), the scheduler for
choosing compute resources (<literal>nova-scheduler</literal>), Identity
services (keystone, <code>nova-consoleauth</code>), Image services
(<code>glance-api</code>, <code>glance-registry</code>), services for
console access of guests, and Block Storage services, including the
scheduler for storage resources (<code>cinder-api</code> and
<code>cinder-scheduler</code>).<indexterm class="singular">
<primary>cloud controllers</primary>
<secondary>duties of</secondary>
</indexterm></para>
<para>Compute nodes are where the computing resources are held, and in our
example architecture, they run the hypervisor (KVM), libvirt (the driver
for the hypervisor, which enables live migration from node to node),
<code>nova-compute</code>, <code>nova-api-metadata</code> (generally only
used when running in multi-host mode, it retrieves instance-specific
metadata), <code>nova-vncproxy</code>, and
<code>nova-network</code>.</para>
<para>The network consists of two switches, one for the management or
private traffic, and one that covers public access, including floating
IPs. To support this, the cloud controller and the compute nodes have two
network cards. The OpenStack Block Storage and NFS storage servers only
need to access the private network and therefore only need one network
card, but multiple cards run in a bonded configuration are recommended if
possible. Floating IP access is direct to the Internet, whereas Flat IP
access goes through a NAT. To envision the network traffic, use this
diagram:</para>
<informalfigure>
<mediaobject>
<imageobject>
<imagedata fileref="figures/osog_01in02.png"></imagedata>
</imageobject>
</mediaobject>
</informalfigure>
</section>
<section xml:id="optional_extensions">
<title>Optional Extensions</title>
<para>You can extend this reference architecture as<indexterm
class="singular">
<primary>legacy networking (nova)</primary>
<secondary>optional extensions</secondary>
</indexterm> follows:</para>
<itemizedlist role="compact">
<listitem>
<para>Add additional cloud controllers (see <xref
linkend="maintenance" />).</para>
</listitem>
<listitem>
<para>Add an OpenStack Storage service (see the Object Storage chapter
in the <emphasis>OpenStack Installation Guide</emphasis> for your
distribution).</para>
</listitem>
<listitem>
<para>Add additional OpenStack Block Storage hosts (see <xref
linkend="maintenance" />).</para>
</listitem>
</itemizedlist>
</section>
</section>
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="section_conventions">
<!-- This file is copied from
openstack-manuals/doc/common/section_conventions.xml.
Do not edit this file, edit the openstack-manuals one and copy it
here.-->
<?dbhtml stop-chunking?>
<title>Conventions</title>
<para>
The OpenStack documentation uses several typesetting conventions:
</para>
<simplesect xml:id="conventions-admonitions">
<title>Admonitions</title>
<para>
Admonitions take three forms:
</para>
<note>
<para>
This is a note. The information in a note is usually in the form
of a handy tip or reminder.
</para>
</note>
<important>
<para>
This is important. The information in an important admonition is
something you must be aware of before moving on.
</para>
</important>
<warning>
<para>
This is a warning. The information in warnings is critical.
Warnings provide additional information about risk of data loss or
security issues.
</para>
</warning>
</simplesect>
<simplesect xml:id="conventions-prompts">
<title>Command prompts</title>
<para>
Commands prefixed with the <literal>#</literal> prompt are to be
executed by the <literal>root</literal> user. These examples can
also be executed using the <command>sudo</command> command, if
available.
</para>
<para>
Commands prefixed with the <literal>$</literal> prompt can be
executed by any user, including <literal>root</literal>.
</para>
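    <para>
      For example, the following hypothetical pair of commands shows both
      prompts; the package name is only a placeholder.
    </para>
    <screen><prompt>#</prompt> <userinput>apt-get install example-package</userinput>
<prompt>$</prompt> <userinput>dpkg -l | grep example-package</userinput></screen>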
</simplesect>
</section>