<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chapter [
<!-- Some useful entities borrowed from HTML -->
<!ENTITY ndash "&#x2013;">
<!ENTITY mdash "&#x2014;">
<!ENTITY hellip "&#x2026;">
<!ENTITY plusmn "&#xB1;">
]>
<chapter xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0"
xml:id="cloud_controller_design">
<?dbhtml stop-chunking?>
<title>Designing for Cloud Controllers and Cloud Management</title>
<para>OpenStack is designed to be massively horizontally scalable, which
allows all services to be distributed widely. However, to simplify this
guide, we discuss services of a more central nature using the concept
of a <emphasis>cloud controller</emphasis>. A cloud controller is just
a conceptual simplification; in the real world, you will likely design
an architecture for your cloud controller that enables high
availability, so that cloud controller tasks are spread across more
than a single node. The cloud controller provides the central
management system for OpenStack deployments. Typically, the cloud
controller manages authentication and sends messages to all other
systems through a message queue.</para>
<para>For more details about an overall example architecture, see <xref
linkend="example_architecture"/>.</para>
<para>For many deployments, the cloud controller is a single node.
However, to achieve high availability, there are design considerations
to explore further in this chapter. Conceptually, the cloud controller
hosts these necessary management services for the cloud:</para>
<itemizedlist>
<listitem>
<para>Databases (track current information about users and
instances, for example, in a database, with typically one
database instance managed per service)</para>
</listitem>
<listitem>
<para>Message queue services (all Advanced Message Queuing
Protocol, or AMQP, messages for services are sent and
received through the queue broker)</para>
</listitem>
<listitem>
<para>Conductor services (proxy requests to a
database)</para>
</listitem>
<listitem>
<para>Authentication and authorization for identity management
(indicates which users can perform which actions on certain
cloud resources; quota management, however, is spread out
among services)</para>
</listitem>
<listitem>
<para>Image management services (store and serve images with
metadata on each, for launching in the cloud)</para>
</listitem>
<listitem>
<para>Scheduling services (indicate which resources to use
first; for example, spreading out where instances are
launched based on an algorithm)</para>
</listitem>
<listitem>
<para>User dashboard (provides a web-based front-end for users
to consume OpenStack cloud services)</para>
</listitem>
<listitem>
<para>API endpoints (offer each service's REST API access,
where the API endpoint catalog is managed by the Identity
service)</para>
</listitem>
</itemizedlist>
<para>For our example, the cloud controller has a collection of
<code>nova-*</code> components that represent the global state of
the cloud, talks to services such as authentication, maintains
information about the cloud in a database, communicates with all
compute nodes and storage <glossterm>worker</glossterm>s through a
queue, and provides API access. Each service running on a designated
cloud controller may be broken out into separate nodes for
scalability or availability.</para>
<para>As another example, you could use pairs of servers for a
collective cloud controller: one active, one standby, as redundant
nodes providing a given set of related services, such as:</para>
<itemizedlist>
<listitem>
<para>Front-end web for API requests, the scheduler for choosing
which compute node to boot an instance on, identity
services, and the dashboard</para>
</listitem>
<listitem>
<para>Database and message queue server (such as MySQL,
RabbitMQ)</para>
</listitem>
<listitem>
<para>Image Service for the image management</para>
</listitem>
</itemizedlist>
<para>Now that you have seen the myriad designs for controlling your
cloud, read on for the further considerations that can help with your
design decisions.</para>
<section xml:id="hardware_consid">
<title>Hardware Considerations</title>
<para>A cloud controller's hardware can be the same as a
compute node, though you may want to further specify based
on the size and type of cloud that you run.</para>
<para>It is also possible to use virtual machines for all or
some of the services that the cloud controller manages,
such as message queuing. In this guide, we assume that
all services are running directly on the cloud
controller.</para>
<para>To size the server correctly, and determine whether to
virtualize any part of it, you should estimate:</para>
<itemizedlist role="compact">
<listitem>
<para>The number of instances that you expect to
run</para>
</listitem>
<listitem>
<para>The number of compute nodes that you have</para>
</listitem>
<listitem>
<para>The number of users who will access the compute
or storage services</para>
</listitem>
<listitem>
<para>Whether users interact with your cloud through
the REST API or the dashboard</para>
</listitem>
<listitem>
<para>Whether users authenticate against an external
system (such as LDAP or <glossterm>Active
Directory</glossterm>)</para>
</listitem>
<listitem>
<para>How long you expect single instances to
run</para>
</listitem>
</itemizedlist>
<para>The following table contains common considerations to review when
sizing hardware for the cloud controller design:</para>
<informaltable rules="all">
<col width="25%"/>
<col width="75%"/>
<thead>
<tr>
<th>Consideration</th>
<th>Ramification</th>
</tr>
</thead>
<tbody>
<tr>
<td xmlns:db="http://docbook.org/ns/docbook"
><para>How many instances will run at
once?</para></td>
<td xmlns:db="http://docbook.org/ns/docbook"
><para>Size your database server
accordingly, and scale out beyond one
cloud controller if many instances will
report status at the same time and if
scheduling where a new instance starts
requires computing power.</para></td>
</tr>
<tr>
<td xmlns:db="http://docbook.org/ns/docbook"
><para>How many compute nodes will run at
once?</para></td>
<td xmlns:db="http://docbook.org/ns/docbook"
><para>Ensure that your messaging queue
handles requests successfully and size
accordingly.</para></td>
</tr>
<tr>
<td xmlns:db="http://docbook.org/ns/docbook"
><para>How many users will access the
API?</para></td>
<td xmlns:db="http://docbook.org/ns/docbook"
><para>If many users will make multiple
requests, make sure that the CPU load for
the cloud controller can handle it.</para></td>
</tr>
<tr>
<td xmlns:db="http://docbook.org/ns/docbook"
><para>How many users will access the
<glossterm>dashboard</glossterm>?</para></td>
<td xmlns:db="http://docbook.org/ns/docbook"
><para>The dashboard makes many requests,
even more than API access does, so add
even more CPU if your dashboard is the
main interface for your users.</para></td>
</tr>
<tr>
<td xmlns:db="http://docbook.org/ns/docbook"
><para>How many <code>nova-api</code>
services do you run at once for your
cloud?</para></td>
<td xmlns:db="http://docbook.org/ns/docbook"
><para>You need to size the controller
with a core per service.</para></td>
</tr>
<tr>
<td xmlns:db="http://docbook.org/ns/docbook"
><para>How long does a single instance
run?</para></td>
<td xmlns:db="http://docbook.org/ns/docbook"
><para>Starting instances and deleting
instances is demanding on the compute node
but also demanding on the controller node
because of all the API queries and
scheduling needs.</para></td>
</tr>
<tr>
<td xmlns:db="http://docbook.org/ns/docbook"
><para>Does your authentication system
also verify externally?</para></td>
<td xmlns:db="http://docbook.org/ns/docbook"
><para>Ensure that network connectivity
between the cloud controller and the
external authentication system is good,
and that the cloud controller has the CPU
power to keep up with requests.</para></td>
</tr>
</tbody>
</informaltable>
</section>
<?hard-pagebreak?>
<section xml:id="separate_services">
<title>Separation of Services</title>
<para>While our example contains all central services in a
single location, it is possible and indeed often a good
idea to separate services onto different physical servers.
The following is a list of deployment scenarios we've
seen, and their justifications.</para>
<informaltable rules="all">
<col width="25%"/>
<col width="75%"/>
<tbody>
<tr>
<td xmlns:db="http://docbook.org/ns/docbook"
><para>Run <code>glance-*</code> servers
on the <code>swift-proxy</code>
server</para></td>
<td xmlns:db="http://docbook.org/ns/docbook"
><para>This deployment felt the spare I/O
on the Object Storage proxy server was
sufficient, and the Image Delivery portion
of Glance benefited from being on physical
hardware and having good connectivity to
the Object Storage back-end it was
using.</para></td>
</tr>
<tr>
<td xmlns:db="http://docbook.org/ns/docbook"
><para>Run a central dedicated database
server</para></td>
<td xmlns:db="http://docbook.org/ns/docbook"
><para>This deployment used a central
dedicated server to provide the databases
for all services. This simplified
operations by isolating database server
updates and allowed for the simple
creation of slave database servers for
failover.</para></td>
</tr>
<tr>
<td xmlns:db="http://docbook.org/ns/docbook"
><para>Run one VM per service</para></td>
<td xmlns:db="http://docbook.org/ns/docbook"
><para>This deployment ran central
services on a set of servers running KVM.
A dedicated VM was created for each
service (nova-scheduler, rabbitmq,
database, and so on). This assisted the
deployment with scaling because the
operators could tune the resources given
to each virtual machine based on the load
it received (something that was not well
understood during installation).</para></td>
</tr>
<tr>
<td xmlns:db="http://docbook.org/ns/docbook"
><para>Use an external load
balancer</para></td>
<td xmlns:db="http://docbook.org/ns/docbook"
><para>This deployment had an expensive
hardware load balancer in their
organization. They ran multiple
<code>nova-api</code> and
<code>swift-proxy</code> servers on
different physical servers and used the
load balancer to switch between
them.</para></td>
</tr>
</tbody>
</informaltable>
<para>One choice that always comes up is whether to virtualize
or not. Some services, such as nova-compute, swift-proxy,
and swift-object servers, should not be virtualized.
However, control servers can often be happily virtualized;
the performance penalty can usually be offset by simply
running more of the service.</para>
</section>
<section xml:id="database">
<title>Database</title>
<para>OpenStack Compute uses a SQL database to store and
retrieve stateful information. MySQL is the most popular
database choice in the OpenStack community.</para>
<para>Loss of the database leads to errors. As a result, we
recommend that you cluster your database to make it
failure tolerant. Configuring and maintaining a database
cluster is done outside of OpenStack and is determined by
the database software you choose to use in your cloud
environment. MySQL/Galera is a popular option for
MySQL-based databases.</para>
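<para>As a concrete illustration, each service is pointed at its
database through a connection string in that service's configuration
file. The following is a minimal sketch for Compute, in which the
host name and password are placeholder values; depending on your
release, the option may instead appear as <code>connection</code>
in a <code>[database]</code> section:</para>
<programlisting># /etc/nova/nova.conf
# "cloud.example.com" and NOVA_DBPASS are placeholder values
sql_connection = mysql://nova:NOVA_DBPASS@cloud.example.com/nova</programlisting>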
</section>
<section xml:id="message_queue">
<title>Message Queue</title>
<para>Most OpenStack services communicate with each other using the
message queue. For example, Compute communicates with block storage
services and networking services through the message queue. Also,
you can optionally enable notifications for any service. RabbitMQ,
Qpid, and 0mq are all popular choices for a message queue service.
In general, if the message queue fails or becomes inaccessible, the
cluster grinds to a halt and ends up in a "read only" state, with
information stuck at the point where the last message was sent.
Accordingly, we recommend that you cluster the message queue. Be
aware that clustered message queues can be a pain point for many
OpenStack deployments. While RabbitMQ has native clustering support,
there have been reports of issues when running it at a large scale.
Other queuing solutions are available, such as 0mq and Qpid. 0mq
does not offer stateful queues. Qpid is the messaging system of
choice for Red Hat and its derivatives, but it does not have native
clustering capabilities and requires a supplemental service, such as
Pacemaker or Corosync. For your message queue, you need to determine
what level of data loss you are comfortable with and whether to use
an OpenStack project's ability to retry across multiple message
queue hosts in the event of a failure, such as Compute's ability to
do so.</para>
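<para>As a sketch of that retry ability, Compute can be given a list
of RabbitMQ brokers so that it reconnects to another broker if the
current one fails. The host names below are placeholders, and option
names can vary between releases:</para>
<programlisting># /etc/nova/nova.conf
# Placeholder addresses; list every node in your RabbitMQ cluster
rabbit_hosts = rabbit1.example.com:5672,rabbit2.example.com:5672
# Seconds to wait between connection retries
rabbit_retry_interval = 1
# 0 means retry forever rather than giving up
rabbit_max_retries = 0</programlisting>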
</section>
<section xml:id="conductor">
<title>Conductor Services</title>
<para>In previous versions of OpenStack, all nova-compute
services required direct access to the database hosted on
the cloud controller. This was problematic for two
reasons: security and performance. With regard to
security, if a compute node is compromised, the
attacker inherently has access to the database. With
regard to performance, nova-compute calls to the database
are single-threaded and blocking. This creates a
performance bottleneck because database requests are
fulfilled serially rather than in parallel.</para>
<para>The conductor service resolves both of these issues by
acting as a proxy for the nova-compute service. Now,
instead of nova-compute directly accessing the database,
it contacts the nova-conductor service, and
nova-conductor accesses the database on nova-compute's
behalf. Because nova-compute no longer has direct access
to the database, the security issue is resolved.
Additionally, nova-conductor is a non-blocking service, so
requests from all compute nodes are fulfilled in
parallel.</para>
<note>
<para>If you are using nova-network and multi-host
networking in your cloud environment, nova-compute
still requires direct access to the database.</para>
</note>
<para>The nova-conductor service is horizontally scalable. To
make nova-conductor highly available and fault tolerant,
just launch more instances of the
<code>nova-conductor</code> process, either on the
same server or across multiple servers.</para>
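<para>Enabling the conductor pattern is a configuration change on
the compute nodes rather than on the cloud controller. A brief
sketch, using the option as it appears in the releases this guide
covers:</para>
<programlisting># /etc/nova/nova.conf on each compute node
[conductor]
# false (the default) sends database operations through
# nova-conductor over the message queue; true bypasses the
# conductor and connects to the database directly
use_local = false</programlisting>
<para>On the controller side, simply start additional
<code>nova-conductor</code> processes; because they all consume from
the same message queue, no load balancer is needed in front of
them.</para>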
</section>
<section xml:id="api">
<title>Application Programming Interface (API)</title>
<para>All public access, whether direct, through a command
line client, or through the web-based dashboard, uses the
API service. Find the API reference at <link
xlink:href="http://api.openstack.org/"
>http://api.openstack.org/</link>.</para>
<para>You must choose whether you want to support the Amazon
EC2 compatibility APIs, or just the OpenStack APIs. One
issue you might encounter when running both APIs is an
inconsistent experience when referring to images and
instances.</para>
<para>For example, the EC2 API refers to instances using IDs
that contain hexadecimal whereas the OpenStack API uses
names and digits. Similarly, the EC2 API tends to rely on
DNS aliases for contacting virtual machines, as opposed to
OpenStack which typically lists IP addresses.</para>
<para>If OpenStack is not set up correctly, it is
easy to create scenarios in which users are unable to
contact their instances because they have only an incorrect
DNS alias. Despite this, EC2 compatibility can assist users
migrating to your cloud.</para>
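<para>Which APIs <code>nova-api</code> exposes is controlled in its
configuration. The following sketch disables the EC2 compatibility
API while keeping the native OpenStack and metadata APIs; the
default value includes all three:</para>
<programlisting># /etc/nova/nova.conf
# Default is "ec2,osapi_compute,metadata"; remove "ec2" if you
# choose to offer only the OpenStack APIs
enabled_apis = osapi_compute,metadata</programlisting>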
<para>Like databases and message queues, having more than one
<glossterm>API server</glossterm> is a good thing.
Traditional HTTP load balancing techniques can be used to
achieve a highly available <code>nova-api</code>
service.</para>
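<para>As one illustration of such load balancing, the following is a
minimal HAProxy sketch that spreads <code>nova-api</code> traffic
across two back-end servers. All addresses are placeholders for your
own environment:</para>
<programlisting># /etc/haproxy/haproxy.cfg (fragment)
listen nova-api
    bind 192.168.1.100:8774
    balance roundrobin
    # Placeholder back ends; "check" enables health checking
    server api1 192.168.1.11:8774 check
    server api2 192.168.1.12:8774 check</programlisting>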
</section>
<section xml:id="extensions">
<title>Extensions</title>
<para>The <link xlink:title="API Specifications"
xlink:href="http://docs.openstack.org/api/api-specs.html"
>API Specifications</link>
(http://docs.openstack.org/api/api-specs.html) define the
core actions, capabilities, and media-types of the
OpenStack API. A client can always depend on the
availability of this core API and implementers are always
required to support it in its entirety. Requiring strict
adherence to the core API allows clients to rely upon a
minimal level of functionality when interacting with
multiple implementations of the same API.</para>
<para>The OpenStack Compute API is extensible. An extension
adds capabilities to an API beyond those defined in the
core. The introduction of new features, MIME types,
actions, states, headers, parameters, and resources can
all be accomplished by means of extensions to the core
API. This allows the introduction of new features in the
API without requiring a version change and allows the
introduction of vendor-specific niche
functionality.</para>
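<para>You can discover which extensions a deployment offers by
querying the Compute API itself. An illustrative request against a
hypothetical endpoint, assuming a token and tenant ID have already
been obtained:</para>
<programlisting># List the extensions supported by a Compute API endpoint
# (the host name, token, and tenant ID are placeholders)
curl -s -H "X-Auth-Token: $OS_TOKEN" \
    http://api.example.com:8774/v2/$TENANT_ID/extensions</programlisting>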
</section>
<?hard-pagebreak?>
<section xml:id="scheduling">
<title>Scheduling</title>
<para>The scheduling services are responsible for determining
the compute or storage node where a virtual machine or
block storage volume should be created. The scheduling
services receive creation requests for these resources
from the message queue and then begin the process of
determining the appropriate node where the resource should
reside. This process is done by applying a series of
user-configurable filters against the available collection
of nodes.</para>
<para>There are currently two schedulers: nova-scheduler for
virtual machines and cinder-scheduler for block storage
volumes. Both schedulers are able to scale horizontally,
so for high availability purposes, or for very large or
high-schedule frequency installations, you should consider
running multiple instances of each scheduler. The
schedulers all listen to the shared message queue, so no
special load balancing is required.</para>
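<para>The filters that the compute scheduler applies are set in its
configuration. A brief sketch, shown with a few of the standard
filters; the exact default list depends on your release:</para>
<programlisting># /etc/nova/nova.conf
# Filters applied, in order, to the list of candidate compute nodes
scheduler_default_filters = AvailabilityZoneFilter,RamFilter,ComputeFilter</programlisting>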
</section>
<section xml:id="images">
<title>Images</title>
<para>The OpenStack Image Catalog and Delivery service
consists of two parts: <code>glance-api</code> and
<code>glance-registry</code>. The former is
responsible for the delivery of images; the compute
node uses it to download images from the back-end. The
latter maintains the metadata information associated with
virtual machine images and requires a database.</para>
<para>The <code>glance-api</code> part is an abstraction layer
that allows a choice of back-end. Currently, it
supports:</para>
<itemizedlist role="compact">
<listitem>
<para>OpenStack Object Storage. Allows you to store
images as objects.</para>
</listitem>
<listitem>
<para>File system. Uses any traditional file system to
store the images as files.</para>
</listitem>
<listitem>
<para>S3. Allows you to fetch images from Amazon S3.</para>
</listitem>
<listitem>
<para>HTTP. Allows you to fetch images from a web
server. You cannot write images by using this
mode.</para>
</listitem>
</itemizedlist>
<para>If you have an OpenStack Object Storage service, we recommend
using this as a scalable place to store your images. You can also
use a file system with sufficient performance or, if you do not
need the ability to upload new images through OpenStack, Amazon
S3.</para>
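<para>The back-end is selected in
<filename>glance-api.conf</filename>. The following sketch points
the Image Service at an Object Storage back-end; the authentication
details are placeholders:</para>
<programlisting># /etc/glance/glance-api.conf
default_store = swift
# Placeholder credentials for the Object Storage account
swift_store_auth_address = http://cloud.example.com:5000/v2.0/
swift_store_user = service:glance
swift_store_key = GLANCE_SWIFT_KEY
swift_store_container = glance</programlisting>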
</section>
<section xml:id="dashboard">
<title>Dashboard</title>
<para>The OpenStack Dashboard (Horizon) provides a web-based
user interface to the various OpenStack components. The
dashboard includes an end-user area for users to manage
their virtual infrastructure as well as an admin area for
cloud operators to manage the OpenStack environment as a
whole.</para>
<para>The dashboard is implemented as a Python web
application that normally runs in <glossterm>Apache</glossterm>
<code>httpd</code>. Therefore, you may treat it the same
as any other web application, provided it can reach the
API servers (including their admin endpoints) over the
network.</para>
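<para>The dashboard's main deployment-specific setting is the
location of the Identity service. A minimal sketch of
<filename>local_settings.py</filename>; the path varies by
distribution, and the address is a placeholder:</para>
<programlisting># local_settings.py
OPENSTACK_HOST = "192.168.1.100"  # placeholder controller address
OPENSTACK_KEYSTONE_URL = "http://%s:5000/v2.0" % OPENSTACK_HOST</programlisting>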
</section>
<section xml:id="authentication">
<title>Authentication and Authorization</title>
<para>The concepts supporting OpenStack's authentication and
authorization are derived from well-understood and widely
used systems of a similar nature. Users have credentials
they can use to authenticate, and they can be a member of
one or more groups (known interchangeably as projects or
tenants).</para>
<para>For example, a cloud administrator might be able to list
all instances in the cloud, whereas a user can only see
those in their current group. Resource quotas, such as
the number of cores that can be used, disk space, and so
on, are associated with a project.</para>
<para>The OpenStack Identity Service (Keystone) provides
authentication decisions and user attribute information,
which the other OpenStack services then use to perform
authorization. Policy is set in the
<filename>policy.json</filename> file. For
information on how to configure these, see <xref
linkend="projects_users"/>.</para>
<para>The Identity Service supports different plugins for
authentication decisions and identity storage. Examples of
these plugins include:</para>
<itemizedlist role="compact">
<listitem>
<para>In-memory Key-Value Store (a simplified internal
storage structure)</para>
</listitem>
<listitem>
<para>SQL database (such as MySQL or
PostgreSQL)</para>
</listitem>
<listitem>
<para>PAM (Pluggable Authentication Module)</para>
</listitem>
<listitem>
<para>LDAP (such as OpenLDAP or Microsoft's Active
Directory)</para>
</listitem>
</itemizedlist>
<para>Many deployments use the SQL database; however, LDAP is
also a popular choice for those with existing
authentication infrastructure that needs to be
integrated.</para>
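<para>The identity back-end is chosen in
<filename>keystone.conf</filename>. The following sketch switches
identity storage from SQL to LDAP; the server and suffix are
placeholders, and exact driver paths vary between releases:</para>
<programlisting># /etc/keystone/keystone.conf
[identity]
driver = keystone.identity.backends.ldap.Identity

[ldap]
# Placeholder LDAP server and suffix for your organization
url = ldap://ldap.example.com
suffix = dc=example,dc=com</programlisting>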
</section>
<?hard-pagebreak?>
<section xml:id="network_consid">
<title>Network Considerations</title>
<para>Because the cloud controller handles so many different
services, it must be able to handle the amount of traffic
that hits it. For example, if you choose to host the
OpenStack Image Service on the cloud controller, the
cloud controller should be able to support the transfer of
the images at an acceptable speed.</para>
<para>As another example, if you choose to use single-host
networking where the cloud controller is the network
gateway for all instances, then the cloud controller must
support the total amount of traffic that travels between
your cloud and the public Internet.</para>
<para>We recommend that you use a fast NIC, such as 10 GbE. You
can also choose to use two 10 GbE NICs and bond them
together. While you might not be able to get a full bonded
20 Gbps of throughput, different transmission streams use
different NICs. For example, if the cloud controller
transfers two images, each image uses a different NIC and
gets a full 10 Gbps of bandwidth.</para>
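<para>NIC bonding is configured in the operating system rather than
in OpenStack. A brief sketch using Debian/Ubuntu-style
<filename>/etc/network/interfaces</filename> syntax, with
placeholder addresses; other distributions configure bonding
differently:</para>
<programlisting># /etc/network/interfaces (fragment; requires the ifenslave package)
auto bond0
iface bond0 inet static
    address 192.168.1.100
    netmask 255.255.255.0
    bond-slaves eth0 eth1
    # 802.3ad (LACP) requires matching configuration on the switch
    bond-mode 802.3ad
    bond-miimon 100</programlisting>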
</section>
</chapter>