operations-guide/doc/openstack-ops/ch_arch_provision.xml
Anne Gentle 2ab7778d36 Address O'Reilly edits for Chapter 2, Provisioning
Change-Id: Ie2dcf7c166f062d53c585f34dbecf5f7e97387a8
2014-02-17 10:42:43 -06:00

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chapter [
<!-- Some useful entities borrowed from HTML -->
<!ENTITY ndash "&#x2013;">
<!ENTITY mdash "&#x2014;">
<!ENTITY hellip "&#x2026;">
<!ENTITY plusmn "&#xB1;">
]>
<chapter xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0"
xml:id="section_arch_provision">
<?dbhtml stop-chunking?>
<title>Provisioning and Deployment</title>
<para>A critical part of a cloud's scalability is the amount of effort that
it takes to run your cloud. To minimize the operational cost of running
your cloud, set up and use an automated deployment and configuration
infrastructure with a configuration management system like Puppet or
Chef. Combined, these systems greatly reduce manual effort and the chance
of operator error.</para>
<para>This infrastructure includes systems that automatically install the
operating system with its initial configuration and later coordinate the
configuration of all services centrally. Examples include Ansible, Chef,
Puppet, Salt, and even using OpenStack to deploy OpenStack, fondly named
Triple-O for OpenStack On OpenStack.</para>
<section xml:id="automated_deploy">
<title>Automated Deployment</title>
<para>An automated deployment system installs and configures operating
systems on new servers, without intervention, after the absolute
minimum amount of manual work, including physical racking, MAC-to-IP
assignment, power configuration, and so on. Typically, solutions rely
on wrappers around PXE boot and TFTP servers for the basic operating
system install and then hand off to an automated configuration
management system.</para>
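<para>As a rough illustration of the MAC-to-IP bookkeeping that precedes a PXE
install, the following Python sketch derives the per-host boot configuration
file name from a MAC address. It assumes the boot loader is PXELINUX, which
looks in <code>pxelinux.cfg/</code> for a file named <code>01-</code> followed
by the MAC address in lowercase hex with dash separators. The inventory, the
addresses, and the function names are hypothetical and do not describe any
particular deployment.</para>
<programlisting language="python"><![CDATA[
import re


def pxelinux_config_name(mac: str) -> str:
    """Return the per-host config file name that PXELINUX looks up.

    The name is the ARP hardware type (01 for Ethernet) followed by the
    MAC address in lowercase hex, with dashes as separators.
    """
    digits = re.sub(r"[^0-9A-Fa-f]", "", mac).lower()
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    octets = [digits[i:i + 2] for i in range(0, 12, 2)]
    return "01-" + "-".join(octets)


# Hypothetical inventory: MAC address -> IP address reserved for the node.
INVENTORY = {
    "52:54:00:AB:CD:01": "10.0.0.11",
    "52:54:00:AB:CD:02": "10.0.0.12",
}


def boot_plan(inventory):
    """Map each node's IP address to its PXELINUX config path."""
    return {
        ip: "pxelinux.cfg/" + pxelinux_config_name(mac)
        for mac, ip in inventory.items()
    }
]]></programlisting>
<para>Keeping the MAC-to-IP mapping in a single inventory like this means the
DHCP reservations and the per-host boot files can both be generated from the
same data rather than maintained by hand.</para>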
<para>Both Ubuntu and Red Hat Enterprise Linux include mechanisms for
configuring the operating system, preseed and kickstart respectively, that
you can use after a network boot. Typically, these are used to bootstrap an
automated configuration system. Alternatively, you can use an
image-based approach for deploying the operating system, such as
SystemImager. You can use both approaches with a virtualized
infrastructure, such as when you run VMs to separate your control
services and physical infrastructure.</para>
<para>When you create a deployment plan, focus on a few vital areas
because they are very hard to modify post-deployment. The next two
sections talk about configurations for:</para>
<para>
<itemizedlist>
<listitem>
<para>Disk partitioning and disk array setup for scalability</para>
</listitem>
<listitem>
<para>Networking configuration just for PXE booting</para>
</listitem>
</itemizedlist>
</para>
<?hard-pagebreak?>
<section xml:id="disk_partition_raid">
<title>Disk Partitioning and RAID</title>
<para>At the very base of any operating system are the hard drives on which the
operating system (OS) is installed.</para>
<para>You must complete the following configurations on the server's hard drives:</para>
<itemizedlist role="compact">
<listitem>
<para>Partitioning, which provides greater flexibility for layout of operating
system and swap space, as described below.</para>
</listitem>
<listitem>
<para>Adding to a RAID array (RAID stands for Redundant Array of Independent
Disks), based on the number of disks you have available, so that you can add
capacity as your cloud grows. Some options are described in more detail
below.</para>
</listitem>
</itemizedlist>
<para>The simplest option to get started is to use one hard drive with two
partitions:</para>
<itemizedlist role="compact">
<listitem>
<para>A file system to store files and directories, where all the data lives,
including the root partition that starts and runs the system.</para>
</listitem>
<listitem>
<para>Swap space to free up memory for processes, as an independent area of the
physical disk used only for swapping and nothing else.</para>
</listitem>
</itemizedlist>
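<para>To make the one-drive layout concrete, the following Python sketch
renders the partitioning section of a kickstart file with one swap partition
and one root file system that grows to fill the rest of the disk. It is only
an illustration: the disk name, the swap size, and the function name are
assumptions, and a preseed-based install would need a different syntax.</para>
<programlisting language="python"><![CDATA[
def kickstart_partitioning(disk: str, swap_mib: int, fstype: str = "ext4") -> str:
    """Render kickstart partitioning lines for a single-disk layout.

    The layout has two partitions: swap of a fixed size, and a root file
    system that grows to use the remaining space on the disk.
    """
    if swap_mib <= 0:
        raise ValueError("swap size must be a positive number of MiB")
    lines = [
        # Wipe existing partitions on the chosen disk and write a new label.
        f"clearpart --all --initlabel --drives={disk}",
        # Dedicated swap area, used only for swapping.
        f"part swap --size={swap_mib} --ondisk={disk}",
        # Root file system: minimum size 1 MiB, grown to fill the disk.
        f"part / --fstype={fstype} --size=1 --grow --ondisk={disk}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical values: first SCSI/SATA disk and 4 GiB of swap.
    print(kickstart_partitioning("sda", 4096))
]]></programlisting>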
<para>RAID is not used in this simplistic one-drive setup. For production clouds,
however, you generally want to ensure that if one disk fails, another can take its
place, so use more than one disk. The number of disks determines what types of RAID
arrays you can build.</para>
<?hard-pagebreak?>
<para>We recommend that you choose one of the following multiple disk options:</para>
<itemizedlist role="compact">
<listitem>
<para><emphasis role="bold">Option 1:</emphasis> Partition all drives in the
same way in a horizontal fashion, as shown in the following diagram: <informalfigure>
<mediaobject>
<imageobject>
<imagedata width="5in" fileref="figures/os_disk_partition.png"/>
</imageobject>
</mediaobject>
</informalfigure></para>
<para>With this option, you can assign different partitions to different RAID
arrays. You can allocate partition 1 of disks one and two to the
<code>/boot</code> partition mirror. You can make partition 2 of all
disks the root partition mirror. You can use partition 3 of all disks for a
<code>cinder-volumes</code> LVM partition running on a RAID 10
array.</para>
<para>While you might end up with unused partitions, such as partition 1 on disks
three and four in this example, this option allows for maximum utilization of disk
space. I/O performance might be an issue because all disks are used for all
tasks.</para>
</listitem>
<listitem>
<para><emphasis role="bold">Option 2:</emphasis> Add all raw disks to one large
RAID array, either hardware or software based. You can partition this large
array with the boot, root, swap, and LVM areas. This option is simple to
implement and uses all partitions. However, disk I/O might suffer.</para>
</listitem>
<listitem>
<para><emphasis role="bold">Option 3:</emphasis> Dedicate entire disks to
certain partitions. For example, you could allocate disks one and two
entirely to the boot, root, and swap partitions under a RAID 1 mirror. Then,
allocate disks three and four entirely to the LVM partition, also under a RAID 1
mirror. Disk I/O should be better because I/O is focused on dedicated tasks.
However, the LVM partition is much smaller than in the other options.</para>
</listitem>
</itemizedlist>
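<para>The space trade-off between Option 1 and Option 3 can be checked with a
little arithmetic. The Python sketch below uses the usual capacity rules
(RAID 1 yields the size of one member, RAID 10 yields half of the members,
and RAID 5 yields all but one member) and compares the resulting LVM space.
The four 1000 GB disks and the partition sizes are hypothetical figures
chosen only for illustration.</para>
<programlisting language="python"><![CDATA[
def raid_usable_gb(level, members, member_gb):
    """Usable capacity of an array built from equally sized members."""
    if level == "raid1":
        if members < 2:
            raise ValueError("RAID 1 needs at least two members")
        return member_gb                    # every member holds a full copy
    if level == "raid5":
        if members < 3:
            raise ValueError("RAID 5 needs at least three members")
        return (members - 1) * member_gb    # one member's worth of parity
    if level == "raid10":
        if members < 4 or members % 2:
            raise ValueError("RAID 10 needs an even number of members, 4 or more")
        return (members // 2) * member_gb   # striped set of mirrored pairs
    raise ValueError(f"unsupported RAID level: {level}")


# Hypothetical hardware and partition sizes, in GB.
DISKS = 4
DISK_GB = 1000
BOOT_GB = 1
ROOT_GB = 50


def option1_layout():
    """Option 1: identical partitions on every disk, one array per partition."""
    lvm_member_gb = DISK_GB - BOOT_GB - ROOT_GB
    return {
        "boot": raid_usable_gb("raid1", 2, BOOT_GB),        # disks one and two
        "root": raid_usable_gb("raid1", DISKS, ROOT_GB),    # mirror on all disks
        "lvm": raid_usable_gb("raid10", DISKS, lvm_member_gb),
        "unused": (DISKS - 2) * BOOT_GB,                    # partition 1, disks 3 and 4
    }


def option3_layout():
    """Option 3: two whole-disk RAID 1 mirrors, one for the system, one for LVM."""
    return {
        "system": raid_usable_gb("raid1", 2, DISK_GB),
        "lvm": raid_usable_gb("raid1", 2, DISK_GB),
    }
]]></programlisting>
<para>With these assumed sizes, Option 1 leaves 1898 GB for the LVM array and
Option 3 leaves 1000 GB, which illustrates why Option 3 trades capacity for
more predictable I/O.</para>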
<tip><para>You may find that you can automate the partitioning
itself. For example, MIT uses Fully Automatic
Installation (FAI) (<link
xlink:href="http://fai-project.org/"
>fai-project.org/</link>) to do the initial PXE-based
partitioning and installation, using a combination of min/max and
percentage-based partition sizes.</para></tip>
<para>As with most architecture choices, the right answer depends on your environment.
If you are using existing hardware, you know the disk density of your servers and
can make some decisions based on the options above. If you are going through a
procurement process, your users' requirements also help you determine hardware
purchases. Here is an example from a private cloud at AT&amp;T that provides web
developers with custom environments. Because it comes from a specific deployment,
your existing hardware or procurement opportunities may differ, but this deployment
uses three types of hardware:</para>
<itemizedlist>
<listitem>
<para>Hardware for controller nodes, used for all stateless OpenStack API
services. About 32-64 GB memory, small attached disk, 1 processor, varied
number of cores, such as 6-12.</para>
</listitem>
<listitem>
<para>Hardware for compute nodes. Typically 256 or 144 GB memory, 2 processors,
24 cores, and 4-6 TB of direct-attached storage, typically in a RAID 5
configuration.</para>
</listitem>
<listitem>
<para>Hardware for storage nodes. Typically for these the disk space is
optimized for the lowest cost per GB of storage while maintaining rack space
efficiency.</para>
</listitem>
</itemizedlist>
<para>Whichever hardware and disk layout you choose, base your
decision on the trade-offs among space utilization,
simplicity, and I/O performance.</para>
</section>
<section xml:id="network_config">
<title>Network Configuration</title>
<para>Network configuration is a very large topic that spans
multiple areas of this book. For now, make sure that your
servers can PXE boot and successfully communicate with the
deployment server.</para>
<para>For example, you usually cannot configure NICs for VLANs when
PXE booting. Additionally, you usually cannot PXE boot with
bonded NICs. If you run into this scenario, consider using a
simple 1 Gb switch in a private network on which only your cloud
communicates.</para>
</section>
</section>
<section xml:id="auto_config">
<title>Automated Configuration</title>
<para>The purpose of automatic configuration management is to establish and maintain the
consistency of a system with no human intervention. You want to maintain consistency in
your deployments so that you get the same cloud every time. Proper use of
automatic configuration management tools ensures that components of the cloud systems
are in particular states, and it also simplifies deployment and the propagation of
configuration changes.</para>
<para>These tools also make it possible to test and roll back changes, as they are fully
repeatable. Conveniently, a large body of work has been done by the OpenStack community
in this space. Puppet, a configuration management tool, even provides official modules
for OpenStack in an OpenStack infrastructure system known as Stackforge at <link
xlink:href="https://github.com/stackforge/puppet-openstack"
>https://github.com/stackforge/puppet-openstack</link>. Chef configuration
management is provided within <link
xlink:href="https://github.com/rcbops/chef-cookbooks"
>https://github.com/rcbops/chef-cookbooks</link>, which is also expected to move to
Stackforge soon. Additional configuration management systems include Juju, Ansible,
and Salt. Also, PackStack is a command-line utility for Red Hat Enterprise Linux and
derivatives that uses Puppet modules to support rapid deployment of OpenStack on
existing servers over an SSH connection.</para>
<para>An integral part of a configuration management system is the set of items that it
controls. You should carefully consider all of the items that you want, or do not want,
to be automatically managed. For example, you may not want to automatically format hard
drives that contain user data.</para>
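<para>The core idea behind these tools, converging a system toward a declared
state while leaving explicitly unmanaged items alone, can be sketched in a few
lines of Python. This is a simplified model and not how Puppet, Chef, or any
other tool is implemented; the setting names and the <code>converge</code>
function are hypothetical.</para>
<programlisting language="python"><![CDATA[
def converge(current, desired, unmanaged=frozenset()):
    """Bring `current` settings in line with `desired` settings.

    Returns a tuple (new_state, changes). Keys listed in `unmanaged` are
    never touched, even if they appear in `desired`. Running the function
    again on its own output produces no further changes (idempotence).
    """
    new_state = dict(current)
    changes = []
    for key, value in sorted(desired.items()):
        if key in unmanaged:
            continue                      # deliberately left to the operator
        old = new_state.get(key)
        if old != value:
            changes.append((key, old, value))
            new_state[key] = value
    return new_state, changes


if __name__ == "__main__":
    # Hypothetical node state and desired configuration.
    node = {"ntp_server": "10.0.0.1", "data_disk": "in-use"}
    wanted = {"ntp_server": "10.0.0.2", "nova_debug": "false",
              "data_disk": "formatted"}

    node, changed = converge(node, wanted, unmanaged={"data_disk"})
    print(changed)    # first run reports the two managed settings it changed
    node, changed = converge(node, wanted, unmanaged={"data_disk"})
    print(changed)    # second run reports nothing: the state already matches
]]></programlisting>
<para>Because each run only reports and applies differences, the same
configuration can be applied repeatedly to every node, and the list of
unmanaged items documents what the tool is not allowed to change.</para>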
</section>
<?hard-pagebreak?>
<section xml:id="remote_mgmt">
<title>Remote Management</title>
<para>In our experience, most operators don't sit right next to the
servers running the cloud, and many don't necessarily enjoy visiting
the data center. OpenStack should be entirely remotely configurable,
but sometimes not everything goes according to plan.</para>
<para>In such cases, having out-of-band access to nodes running
OpenStack components is a boon. The IPMI protocol is the de facto
standard here, and acquiring hardware that supports it is highly
recommended to achieve that lights-out data center aim.</para>
<para>Also consider remote power control. While IPMI
usually controls the server's power state, having remote access to
the PDU that the server is plugged into can be really useful for
situations when everything seems wedged.</para>
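<para>As one possible way to script out-of-band power control, the Python
sketch below builds an <code>ipmitool</code> command line for the chassis power
subcommands. It assumes that <code>ipmitool</code> is installed, that the
management controller is reachable over the <code>lanplus</code> interface, and
that the password is passed through the <code>IPMI_PASSWORD</code> environment
variable (the <code>-E</code> option) rather than on the command line. The host
address, user name, and function names are hypothetical.</para>
<programlisting language="python"><![CDATA[
import os
import subprocess

# Chassis power subcommands accepted by this sketch.
POWER_ACTIONS = {"status", "on", "off", "cycle", "reset", "soft"}


def ipmi_power_command(host, user, action):
    """Build the ipmitool argument list for a chassis power action."""
    if action not in POWER_ACTIONS:
        raise ValueError(f"unsupported power action: {action!r}")
    return [
        "ipmitool",
        "-I", "lanplus",      # IPMI v2.0 over LAN
        "-H", host,           # address of the management controller
        "-U", user,
        "-E",                 # read the password from IPMI_PASSWORD
        "chassis", "power", action,
    ]


def run_ipmi_power(host, user, password, action, timeout=30):
    """Run the power action and return ipmitool's output as text."""
    env = dict(os.environ, IPMI_PASSWORD=password)
    result = subprocess.run(
        ipmi_power_command(host, user, action),
        env=env,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,           # raise CalledProcessError on failure
    )
    return result.stdout.strip()
]]></programlisting>
<para>Restricting the accepted actions to a fixed set keeps a typo in an
automation script from being passed straight through to the management
controller.</para>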
</section>
<section xml:id="provision-deploy-summary">
<title>Parting Thoughts for Provisioning and Deploying OpenStack</title>
<para>You can save time by understanding the use cases for the cloud you
want to create. Use cases for OpenStack are varied. Some include
object storage only; others include preconfigured compute resources to
speed development environment setup, or fast provisioning of compute
resources that are already secured per tenant with private networks.
Your users may need highly redundant servers to make sure
their legacy applications continue to run. Perhaps a goal would be
to architect these legacy applications so that they run on
multiple instances in a cloudy, fault-tolerant way, without making it
a goal to add to those clusters over time. Your users may indicate
that they need scaling considerations due to heavy Windows server
use.</para>
<para>You can save resources by looking at the best fit for the hardware
you have in place already. You might have some high-density storage
hardware available. You could format and repurpose those servers for
OpenStack Object Storage. All of these considerations and the input
from users help you build your use case and your deployment
plan.</para>
<tip><para>For further research about OpenStack deployment, investigate the
supported and documented pre-configured, pre-packaged installers for
OpenStack from companies like <link
xlink:href="http://www.ubuntu.com/cloud/tools/openstack"
>Canonical</link>, <link
xlink:href="http://www.cisco.com/web/solutions/openstack/"
>Cisco</link>, <link xlink:href="http://www.cloudscaling.com/"
>Cloudscaling</link>, <link
xlink:href="http://www-03.ibm.com/software/products/en/smartcloud-orchestrator/"
>IBM</link>, <link xlink:href="http://www.metacloud.com/"
>Metacloud</link>, <link xlink:href="http://www.mirantis.com/"
>Mirantis</link>, <link xlink:href="http://www.pistoncloud.com/"
>Piston</link>, <link
xlink:href="http://www.rackspace.com/cloud/private/"
>Rackspace</link>, <link
xlink:href="http://www.redhat.com/openstack/">Red Hat</link>,
<link xlink:href="http://www.suse.com/cloud">SUSE</link>, and
<link xlink:href="http://www.swiftstack.com/"
>SwiftStack</link>.</para></tip>
</section>
<section xml:id="provision_conclusion">
<title>Conclusion</title>
<para>The decisions you make with respect to provisioning and
deployment will affect your day-to-day, week-to-week, and
month-to-month maintenance of the cloud. Your configuration
management will be able to evolve over time. However, the upfront
choices you make about deployment, disk partitioning, and network
configuration need more thought and design, because they are hard
to modify after deployment.</para>
</section>
</chapter>