<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chapter [
<!-- Some useful entities borrowed from HTML -->
<!ENTITY ndash "–">
<!ENTITY mdash "—">
<!ENTITY hellip "…">
<!ENTITY plusmn "±">
]>
<chapter xmlns="http://docbook.org/ns/docbook"
    xmlns:xi="http://www.w3.org/2001/XInclude"
    xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0"
    xml:id="compute_nodes">
    <?dbhtml stop-chunking?>
    <title>Compute Nodes</title>
    <para>In this chapter, we will discuss some of the choices you'll
        need to consider when building out your compute nodes.
        Compute nodes form the resource core of the OpenStack Compute
        cloud, providing the processing, memory, network, and storage
        resources to run instances.</para>
    <section xml:id="cpu_choice">
        <title>CPU Choice</title>
        <para>The type of CPU in your compute node is a very important
            choice. First, ensure that the CPU supports virtualization
            by way of <emphasis>VT-x</emphasis> for Intel chips and
            <emphasis>AMD-V</emphasis> for AMD chips.</para>
        <tip><para>Consult the vendor documentation to check for
            virtualization support. For Intel, read <link
            xlink:title="Intel VT-x"
            xlink:href="http://www.intel.com/support/processors/sb/cs-030729.htm">Does
            my processor support Intel® Virtualization Technology?</link>
            (http://www.intel.com/support/processors/sb/cs-030729.htm).
            For AMD, read <link xlink:title="AMD-v"
            xlink:href="http://sites.amd.com/us/business/it-solutions/virtualization/Pages/client-side-virtualization.aspx">AMD
            Virtualization</link>
            (http://sites.amd.com/us/business/it-solutions/virtualization/Pages/client-side-virtualization.aspx).
            Note that your CPU may support virtualization but it may be
            disabled. Consult your BIOS documentation for how to enable
            CPU features.</para>
        </tip>
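        <para>On Linux, you can also check for these capabilities
            directly by looking for the <literal>vmx</literal> (Intel)
            or <literal>svm</literal> (AMD) flags in
            <filename>/proc/cpuinfo</filename>. The following is a
            minimal sketch, not part of any OpenStack tool, that
            reports whether the CPU advertises hardware
            virtualization:</para>
        <programlisting language="python"><![CDATA[
# Sketch: report whether the CPU advertises hardware virtualization.
# Reads /proc/cpuinfo (Linux only); vmx = Intel VT-x, svm = AMD-V.
with open("/proc/cpuinfo") as f:
    flags = set()
    for line in f:
        if line.startswith("flags"):
            flags.update(line.split(":", 1)[1].split())

if "vmx" in flags:
    print("Intel VT-x supported")
elif "svm" in flags:
    print("AMD-V supported")
else:
    print("No hardware virtualization flags found "
          "(it may be disabled in the BIOS)")
]]></programlisting>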
        <para>The number of cores that the CPU has also affects the
            decision. It's common for current CPUs to have up to 12
            cores. Additionally, if an Intel CPU supports
            hyper-threading, those 12 cores are presented as 24 logical
            cores. If you purchase a server that supports multiple
            CPUs, the number of cores is further multiplied.</para>
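        <para>The arithmetic is simply multiplicative. As an
            illustration, the short calculation below, using assumed
            example values rather than a recommendation, shows how
            socket count, cores per socket, and hyper-threading
            combine:</para>
        <programlisting language="python"><![CDATA[
# Example values only: a dual-socket server with 12-core CPUs and
# hyper-threading enabled.
sockets = 2
cores_per_socket = 12
threads_per_core = 2  # 1 if hyper-threading is disabled

logical_cores = sockets * cores_per_socket * threads_per_core
print(logical_cores)  # 48 logical cores visible to the hypervisor
]]></programlisting>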
        <para>Hyper-threading is Intel's proprietary simultaneous
            multithreading implementation, used to improve
            parallelization on its CPUs. You might consider enabling
            hyper-threading to improve the performance of multithreaded
            applications.</para>
        <para>Whether you should enable hyper-threading on your CPUs
            depends upon your use case. For example, disabling
            hyper-threading can be beneficial in intense computing
            environments. We recommend performance testing with your
            local workload with hyper-threading both on and off to
            determine what is more appropriate in your case.</para>
    </section>
    <?hard-pagebreak?>
    <section xml:id="hypervisor_choice">
        <title>Hypervisor Choice</title>
        <para>OpenStack Compute supports many hypervisors to various
            degrees, including:
            <itemizedlist role="compact">
                <listitem><para><link xlink:title="reference manual" xlink:href="http://www.linux-kvm.org/">KVM</link> (http://www.linux-kvm.org/)</para></listitem>
                <listitem><para><link xlink:title="reference manual" xlink:href="http://lxc.sourceforge.net/">LXC</link> (http://lxc.sourceforge.net/)</para></listitem>
                <listitem><para><link xlink:title="reference manual" xlink:href="http://wiki.qemu.org/">QEMU</link> (http://wiki.qemu.org/)</para></listitem>
                <listitem><para><link xlink:title="reference manual" xlink:href="https://www.vmware.com/support/vsphere-hypervisor">VMware ESX/ESXi</link> (https://www.vmware.com/support/vsphere-hypervisor)</para></listitem>
                <listitem><para><link xlink:title="reference manual" xlink:href="http://www.xen.org/">Xen</link> (http://www.xen.org/)</para></listitem>
                <listitem><para><link xlink:title="reference manual" xlink:href="http://technet.microsoft.com/en-us/library/hh831531.aspx">Hyper-V</link> (http://technet.microsoft.com/en-us/library/hh831531.aspx)</para></listitem>
                <listitem><para><link xlink:title="reference manual" xlink:href="http://www.docker.io/">Docker</link> (http://www.docker.io/)</para></listitem>
            </itemizedlist>
        </para>
        <para>Probably the most important factor in your choice of
            hypervisor is your current usage or experience. Aside from
            that, there are practical concerns to do with feature
            parity, documentation, and the level of community
            experience.</para>
        <para>For example, KVM is the most widely adopted hypervisor
            in the OpenStack community. Besides KVM, more deployments
            run Xen, LXC, VMware, and Hyper-V than the others listed.
            However, each of these lacks some feature support, or the
            documentation on how to use it with OpenStack is out of
            date.</para>
        <para>The best information available to support your choice is
            found on the <link xlink:title="reference manual"
            xlink:href="https://wiki.openstack.org/wiki/HypervisorSupportMatrix"
            >Hypervisor Support Matrix</link>
            (https://wiki.openstack.org/wiki/HypervisorSupportMatrix),
            and in the <link xlink:title="configuration reference"
            xlink:href="http://docs.openstack.org/trunk/config-reference/content/section_compute-hypervisors.html"
            >configuration reference</link>
            (http://docs.openstack.org/trunk/config-reference/content/section_compute-hypervisors.html).</para>
        <note>
            <para>It is also possible to run multiple hypervisors in a
                single deployment using Host Aggregates or Cells.
                However, an individual compute node can run only a
                single hypervisor at a time.</para>
        </note>
    </section>

    <section xml:id="instance_storage">
        <title>Instance Storage Solutions</title>
        <para>As part of the procurement for a compute cluster, you
            must specify some storage for the disk on which the
            instantiated instance runs. There are three main
            approaches to providing this temporary-style storage, and
            it is important to understand the implications of the
            choice.</para>
        <para>They are:</para>
        <itemizedlist role="compact">
            <listitem>
                <para>Off compute node storage – shared file system</para>
            </listitem>
            <listitem>
                <para>On compute node storage – shared file system</para>
            </listitem>
            <listitem>
                <para>On compute node storage – non-shared file system</para>
            </listitem>
        </itemizedlist>
        <para>In general, the questions you should be asking when
            selecting the storage are as follows:</para>
        <itemizedlist role="compact">
            <listitem>
                <para>What is the platter count you can achieve?</para>
            </listitem>
            <listitem>
                <para>Do more spindles result in better I/O despite
                    network access?</para>
            </listitem>
            <listitem>
                <para>Which one results in the best cost-performance
                    scenario you're aiming for?</para>
            </listitem>
            <listitem>
                <para>How do you manage the storage operationally?</para>
            </listitem>
        </itemizedlist>
        <para>Many operators use separate compute and storage hosts.
            Compute services and storage services have different
            requirements: compute hosts typically require more CPU and
            RAM than storage hosts. Therefore, for a fixed budget, it
            makes sense to have different configurations for your
            compute nodes and your storage nodes. Compute nodes will
            be invested in CPU and RAM, and storage nodes will be
            invested in block storage.</para>
        <para>However, if you are more restricted in the number of
            physical hosts you have available for creating your cloud
            and you want to be able to dedicate as many of your hosts
            as possible to running instances, it makes sense to run
            compute and storage on the same machines.</para>
        <para>We'll discuss the three main approaches to instance
            storage in the next few sections.</para>
        <section xml:id="off_compute_node_storage">
            <title>Off Compute Node Storage – Shared File System</title>
            <para>In this option, the disks storing the running
                instances are hosted in servers outside of the compute
                nodes.</para>
            <para>If you use separate compute and storage hosts, you
                can treat your compute hosts as "stateless". As long
                as you don't have any instances currently running on a
                compute host, you can take it offline or wipe it
                completely without having any effect on the rest of
                your cloud. This simplifies maintenance for the
                compute hosts.</para>
            <para>There are several advantages to this approach:</para>
            <itemizedlist role="compact">
                <listitem>
                    <para>If a compute node fails, instances are
                        usually easily recoverable.</para>
                </listitem>
                <listitem>
                    <para>Running a dedicated storage system can be
                        operationally simpler.</para>
                </listitem>
                <listitem>
                    <para>You can scale to any number of
                        spindles.</para>
                </listitem>
                <listitem>
                    <para>It may be possible to share the external
                        storage for other purposes.</para>
                </listitem>
            </itemizedlist>
            <?hard-pagebreak?>
            <para>The main downsides to this approach are:</para>
            <itemizedlist role="compact">
                <listitem>
                    <para>Depending on design, heavy I/O usage from
                        some instances can affect unrelated
                        instances.</para>
                </listitem>
                <listitem>
                    <para>Use of the network can decrease
                        performance.</para>
                </listitem>
            </itemizedlist>
        </section>
        <section xml:id="on_compute_node_storage">
            <title>On Compute Node Storage – Shared File System</title>
            <para>In this option, each compute node is specified with
                a significant number of disks, but a distributed file
                system ties the disks from each compute node into a
                single mount.</para>
            <para>The main advantage of this option is that it scales
                to external storage when you require additional
                storage.</para>
            <para>However, this option has several downsides:</para>
            <itemizedlist role="compact">
                <listitem>
                    <para>Running a distributed file system can cause
                        you to lose data locality compared with
                        non-shared storage.</para>
                </listitem>
                <listitem>
                    <para>Recovery of instances is complicated by
                        depending on multiple hosts.</para>
                </listitem>
                <listitem>
                    <para>The chassis size of the compute node can
                        limit the number of spindles able to be used
                        in a compute node.</para>
                </listitem>
                <listitem>
                    <para>Use of the network can decrease
                        performance.</para>
                </listitem>
            </itemizedlist>
        </section>

        <section xml:id="on_compute_node_storage_nonshared">
            <title>On Compute Node Storage – Non-shared File System</title>
            <para>In this option, each compute node is specified with
                enough disks to store the instances it hosts.</para>
            <para>There are two main reasons why this is a good
                idea:</para>
            <itemizedlist role="compact">
                <listitem>
                    <para>Heavy I/O usage on one compute node does not
                        affect instances on other compute
                        nodes.</para>
                </listitem>
                <listitem>
                    <para>Direct I/O access can increase
                        performance.</para>
                </listitem>
            </itemizedlist>
            <?hard-pagebreak?>
            <para>This has several downsides:</para>
            <itemizedlist role="compact">
                <listitem>
                    <para>If a compute node fails, the instances
                        running on that node are lost.</para>
                </listitem>
                <listitem>
                    <para>The chassis size of the compute node can
                        limit the number of spindles able to be used
                        in a compute node.</para>
                </listitem>
                <listitem>
                    <para>Migrations of instances from one node to
                        another are more complicated and rely on
                        features that may not continue to be
                        developed.</para>
                </listitem>
                <listitem>
                    <para>If additional storage is required, this
                        option does not scale.</para>
                </listitem>
            </itemizedlist>
            <para>Running a shared file system on a storage system
                apart from the compute nodes is ideal for clouds where
                reliability and scalability are the most important
                factors. Running a shared file system on the compute
                nodes themselves may be best in a scenario where you
                have to deploy to pre-existing servers over whose
                specifications you have little or no control. Running
                a non-shared file system on the compute nodes
                themselves is a good option for clouds with high I/O
                requirements and low concern for reliability.</para>
        </section>
        <section xml:id="live_migration">
            <title>Issues with Live Migration</title>
            <para>We consider live migration an integral part of the
                operations of the cloud. This feature provides the
                ability to seamlessly move instances from one physical
                host to another, a necessity for performing upgrades
                that require reboots of the compute hosts, but it
                works well only with shared storage.</para>
            <para>Live migration can also be done with non-shared
                storage, using a feature known as <emphasis>KVM live
                block migration</emphasis>. While an earlier
                implementation of block-based migration in KVM and
                QEMU was considered unreliable, there is a newer, more
                reliable implementation of block-based live migration
                as of QEMU 1.4 and libvirt 1.0.2 that is also
                compatible with OpenStack. However, none of the
                authors of this guide have first-hand experience using
                live block migration.</para>
        </section>
        <section xml:id="file_systems">
            <title>Choice of File System</title>
            <para>If you want to support shared storage live
                migration, you'll need to configure a distributed file
                system.</para>
            <para>Possible options include:</para>
            <itemizedlist role="compact">
                <listitem>
                    <para>NFS (default for Linux)</para>
                </listitem>
                <listitem>
                    <para>GlusterFS</para>
                </listitem>
                <listitem>
                    <para>MooseFS</para>
                </listitem>
                <listitem>
                    <para>Lustre</para>
                </listitem>
            </itemizedlist>
            <para>We've seen deployments with all of these, and we
                recommend that you choose the one you are most
                familiar with operating. If you are unfamiliar with
                any of these, choose NFS, as it is the easiest to set
                up and there is extensive community knowledge about
                it.</para>
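            <para>Once the file system is configured, a quick sanity
                check is to confirm which mount actually backs the
                instances directory on each compute host. The
                following sketch assumes the default Nova state path
                of <filename>/var/lib/nova/instances</filename>;
                adjust the path if your deployment stores instances
                elsewhere:</para>
            <programlisting language="python"><![CDATA[
# Sketch: report which device and file system type back the Nova
# instances directory, by scanning /proc/mounts (Linux). Assumes the
# default state path; mount points containing spaces are not handled.
INSTANCES_PATH = "/var/lib/nova/instances"

def covers(mount_point, path):
    """True if path lives under this mount point."""
    return (mount_point == "/" or path == mount_point
            or path.startswith(mount_point + "/"))

best = None
with open("/proc/mounts") as f:
    for line in f:
        device, mount_point, fs_type = line.split()[:3]
        # Keep the most specific (longest) matching mount point.
        if covers(mount_point, INSTANCES_PATH):
            if best is None or len(mount_point) > len(best[1]):
                best = (device, mount_point, fs_type)

if best:
    print("%s is on %s (%s), mounted at %s"
          % (INSTANCES_PATH, best[0], best[2], best[1]))
]]></programlisting>
            <para>On a host using NFS, for example, you would expect
                the reported file system type to be
                <literal>nfs</literal> rather than a local file system
                such as <literal>ext4</literal>.</para>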
        </section>
    </section>

    <section xml:id="overcommit">
        <title>Overcommitting</title>
        <para>OpenStack allows you to overcommit CPU and RAM on
            compute nodes. This lets you increase the number of
            instances you can have running on your cloud, at the cost
            of reducing the performance of the instances. OpenStack
            Compute uses the following ratios by default:</para>
        <itemizedlist role="compact">
            <listitem>
                <para>CPU allocation ratio: 16:1</para>
            </listitem>
            <listitem>
                <para>RAM allocation ratio: 1.5:1</para>
            </listitem>
        </itemizedlist>
        <para>The default CPU allocation ratio of 16:1 means that the
            scheduler allocates up to 16 virtual cores per physical
            core. For example, if a physical node has 12 cores, the
            scheduler sees 192 available virtual cores. With typical
            flavors of 4 virtual cores per instance, this ratio would
            provide 48 instances on a physical node.</para>
        <para>The formula for the number of virtual instances on a
            compute node is <emphasis>(OR*PC)/VC</emphasis>,
            where:</para>
        <variablelist>
            <varlistentry>
                <term><emphasis>OR</emphasis></term>
                <listitem>
                    <para>CPU overcommit ratio (virtual cores per
                        physical core).</para>
                </listitem>
            </varlistentry>
            <varlistentry>
                <term><emphasis>PC</emphasis></term>
                <listitem>
                    <para>Number of physical cores.</para>
                </listitem>
            </varlistentry>
            <varlistentry>
                <term><emphasis>VC</emphasis></term>
                <listitem>
                    <para>Number of virtual cores per instance.</para>
                </listitem>
            </varlistentry>
        </variablelist>
        <para>Similarly, the default RAM allocation ratio of 1.5:1
            means that the scheduler allocates instances to a physical
            node as long as the total amount of RAM associated with
            the instances is less than 1.5 times the amount of RAM
            available on the physical node.</para>
        <para>For example, if a physical node has 48 GB of RAM, the
            scheduler allocates instances to that node until the sum
            of the RAM associated with the instances reaches 72 GB
            (such as nine instances, in the case where each instance
            has 8 GB of RAM).</para>
        <para>You must select the appropriate CPU and RAM allocation
            ratio for your particular use case.</para>
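        <para>As a worked illustration of both ratios, the following
            sketch applies the formula above to this section's example
            numbers. The hardware and flavor values are examples only,
            not recommendations:</para>
        <programlisting language="python"><![CDATA[
# Sketch: estimate the instance capacity of one compute node under
# the default Nova overcommit ratios. All hardware and flavor
# numbers below are examples.
cpu_allocation_ratio = 16.0   # default CPU overcommit (OR)
ram_allocation_ratio = 1.5    # default RAM overcommit

physical_cores = 12           # PC
node_ram_gb = 48

flavor_vcpus = 4              # VC
flavor_ram_gb = 8

# (OR * PC) / VC from the formula above, plus the RAM equivalent.
cpu_limited = int(cpu_allocation_ratio * physical_cores / flavor_vcpus)
ram_limited = int(ram_allocation_ratio * node_ram_gb / flavor_ram_gb)

# The scheduler stops at whichever resource runs out first.
print("CPU allows %d instances" % cpu_limited)   # 48
print("RAM allows %d instances" % ram_limited)   # 9
print("Node capacity: %d instances" % min(cpu_limited, ram_limited))
]]></programlisting>
        <para>Note how, with these example numbers, RAM is the binding
            constraint: the node runs out of overcommitted memory long
            before it runs out of overcommitted CPU.</para>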
    </section>
    <section xml:id="logging">
        <title>Logging</title>
        <para>Logging is detailed more fully in <xref
            linkend="logging"/>. However, it is an important design
            consideration to take into account before commencing
            operations of your cloud.</para>
        <para>OpenStack produces a great deal of useful logging
            information; however, for it to be useful for operations
            purposes, you should consider having a central logging
            server to send logs to, and a log parsing/analysis system
            (such as logstash).</para>
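        <para>To make the idea concrete, the short sketch below
            forwards a service's log records to a central syslog host
            using Python's standard library; a tool such as logstash
            can then parse them. The host name
            <literal>logs.example.com</literal> is a placeholder for
            your own central logging server:</para>
        <programlisting language="python"><![CDATA[
# Sketch: forward log records to a central syslog server. The server
# name below is a placeholder; substitute your own log host.
import logging
import logging.handlers

logger = logging.getLogger("compute-node")
logger.setLevel(logging.INFO)

# UDP syslog to the central log host (port 514 is the syslog default).
handler = logging.handlers.SysLogHandler(
    address=("logs.example.com", 514))
handler.setFormatter(
    logging.Formatter("%(name)s: %(levelname)s %(message)s"))
logger.addHandler(handler)

logger.info("compute node logging configured")
]]></programlisting>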
    </section>
    <section xml:id="networking">
        <title>Networking</title>
        <para>Networking in OpenStack is a complex, multi-faceted
            challenge. See <xref linkend="network_design"/>.</para>
    </section>
    <section xml:id="conclusion">
        <title>Conclusion</title>
        <para>Compute nodes are the workhorses of your cloud and the
            place where your users' applications will run. They are
            likely to be affected by your decisions on what to deploy
            and how you deploy it. Their requirements should be
            reflected in the choices you make.</para>
    </section>
</chapter>