diff --git a/doc/src/docbkx/openstack-compute-admin/computeadmin.xml b/doc/src/docbkx/openstack-compute-admin/computeadmin.xml index b744b792fd..53a431e89d 100644 --- a/doc/src/docbkx/openstack-compute-admin/computeadmin.xml +++ b/doc/src/docbkx/openstack-compute-admin/computeadmin.xml @@ -532,8 +532,8 @@ euca-register mybucket/windowsserver.img.manifest.xml
Managing Volumes Nova-volume is the service that allows you to give extra block level storage to your - OpenStack Compute instances. You may recognize this as a similar offering that Amazon - EC2 offers, Elastic Block Storage (EBS). However, nova-volume is not the same + OpenStack Compute instances. You may recognize this as a similar offering from Amazon + EC2 known as Elastic Block Storage (EBS). However, nova-volume is not the same implementation that EC2 uses today. Nova-volume is an iSCSI solution that employs the use of Logical Volume Manager (LVM) for Linux. Note that a volume may only be attached to one instance at a time. This is not a ‘shared storage’ solution like a SAN of NFS on @@ -564,7 +564,7 @@ euca-register mybucket/windowsserver.img.manifest.xml The volume is attached to an instance via $euca-attach-volume; which creates a - unique iSCSI IQN that will be exposed to the compute node. + unique iSCSI IQN that will be exposed to the compute node The compute node which run the concerned instance has now an active ISCSI @@ -580,9 +580,9 @@ euca-register mybucket/windowsserver.img.manifest.xml additional compute nodes running nova-compute. The walkthrough uses a custom partitioning scheme that carves out 60GB of space and labels it as LVM. The network is a /28 .80-.95, and FlatManger is the NetworkManager setting for OpenStack Compute (Nova). - Please note that the network mode doesn't interfere at all the way nova-volume works, - but it is essential for nova-volumes to work that the mode you are currently using is - set up. Please refer to Networking for more details. + Please note that the network mode doesn't interfere at all with the way nova-volume + works, but networking must be set up for nova-volumes to work. Please refer to Networking for more details. To set up Compute to use volumes, ensure that nova-volume is installed along with lvm2. The guide will be split in four parts : @@ -1172,11 +1172,10 @@ tcp: [9] 172.16.40.244:3260,1 iqn.2010-10.org.openstack:volume-00000014 filesystem) there could be two causes : - You didn't allocate enought size for the snapshot + You didn't allocate enough size for the snapshot - kapartx had been unable to disover the partition table. - + kpartx had been unable to discover the partition table. Try to allocate more space to the snapshot and see if it works. @@ -1192,7 +1191,7 @@ tcp: [9] 172.16.40.244:3260,1 iqn.2010-10.org.openstack:volume-00000014 This command will create a tar.gz file containing the datas, and datas only, so you ensure you don't - waste space by backupping empty sectors ! + waste space by backing up empty sectors ! @@ -1203,8 +1202,8 @@ tcp: [9] 172.16.40.244:3260,1 iqn.2010-10.org.openstack:volume-00000014 checksum is a unique identifier for a file. When you transfer that same file over the network ; you can run another checksum calculation. Having different checksums means the file - is corrupted,so it is an interesting way to make sure your file is has - not been corrupted during it's transfer. + is corrupted, so it is an interesting way to make sure your file has + not been corrupted during its transfer. Let's checksum our file, and save the result to a file : $sha1sum volume-00000001.tar.gz > volume-00000001.checksumBe aware the sha1sum should be used carefully @@ -1236,11 +1235,11 @@ tcp: [9] 172.16.40.244:3260,1 iqn.2010-10.org.openstack:volume-00000014 6- Automate your backups - You will mainly have more and more volumes on you nova-volumes' server.
It might + You will likely have more and more volumes on your nova-volumes server. It might + be interesting then to automate things a bit. This script here will assist you on this task. The script does the operations we - just did earlier, but also provides mail report and backup prunning (based on the " + just did earlier, but also provides a mail report and backup pruning (based on the " backups_retention_days " setting). It is meant to be launched from the server which runs the nova-volumes component. Here is how a mail report looks like : @@ -1340,12 +1339,262 @@ HostC p2 5 10240 150 Migration of i-00000001 initiated. Check its progress using euca-describe-instances. ]]> Make sure instances are migrated successfully with euca-describe-instances. - If instances are still running on HostB, check logfiles( src/dest nova-compute + If instances are still running on HostB, check the log files (src/dest nova-compute and nova-scheduler) +
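+ Before automating, it can help to see the manual backup sequence from the sections above end to end. The following is a minimal sketch only, assuming the volume group is named nova-volumes and the volume to back up is volume-00000001 ; the snapshot size, mount point and backup directory are illustrative, and the exact /dev/mapper name depends on the kpartx output.
+ #!/bin/bash
+ # 1- Snapshot the volume (the snapshot only needs to hold the changes made during the backup)
+ lvcreate --size 10G --snapshot --name volume-00000001-snap /dev/nova-volumes/volume-00000001
+ # 2- Expose the partition(s) contained in the snapshot
+ kpartx -av /dev/nova-volumes/volume-00000001-snap
+ # 3- Mount the first partition (check the kpartx output for the exact mapper name)
+ mount /dev/mapper/nova--volumes-volume--00000001--snap1 /mnt/volume-backup
+ # 4- Archive the data, and only the data, so empty sectors are not backed up
+ tar -czf /backups/volume-00000001.tar.gz -C /mnt/volume-backup .
+ # 5- Checksum the archive so corruption can be detected after a transfer
+ sha1sum /backups/volume-00000001.tar.gz > /backups/volume-00000001.checksum
+ # 6- Clean up : unmount, remove the partition mappings and drop the snapshot
+ umount /mnt/volume-backup
+ kpartx -dv /dev/nova-volumes/volume-00000001-snap
+ lvremove -f /dev/nova-volumes/volume-00000001-snap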
+
+ Nova Disaster Recovery Process + Sometimes, things just don't go right. An incident is never planned, by + definition. + In this section, we will see how to manage your cloud after a disaster, and how to + easily back up the persistent storage volumes, which is another approach when you face a + disaster. Even apart from the disaster scenario, backups are mandatory. While the Diablo + release includes the snapshot functions, both the backup procedure and the utility + do apply to the Cactus release. + For reference, you can find a DRP definition here : http://en.wikipedia.org/wiki/Disaster_Recovery_Plan. + + A - The Disaster Recovery Process presentation + A disaster could happen to several components of your architecture : a disk crash, + a network loss, a power cut... In our scenario, we suppose the following setup : + + A cloud controller (nova-api, nova-objectstore, nova-volumes, + nova-network) + + + A compute node (nova-compute) + + + A Storage Area Network used by nova-volumes (aka SAN) + + Our disaster will be the worst one : a power loss. That power loss + applies to the three components. Let's see what runs and how + it runs before the crash : + + From the SAN to the cloud controller, we have an active iSCSI session + (used for the "nova-volumes" LVM's VG). + + + From the cloud controller to the compute node we also have active + iSCSI sessions (managed by nova-volumes). + + + For every volume an iSCSI session is made (so 14 EBS volumes equal 14 + sessions). + + + From the cloud controller to the compute node, we also have iptables/ + ebtables rules which allow access from the cloud controller to the + running instance. + + + And finally, from the cloud controller to the compute node ; saved + into the database, the current state of the instances (in that case + "running" ), and their volume attachments (mountpoint, volume id, volume + status, etc.) + + Now, our power loss occurs and everything restarts (the hardware + parts), and here is the situation : + + + + From the SAN to the cloud controller, the iSCSI session no longer exists. + + + From the cloud controller to the compute node, the iSCSI sessions no + longer exist. + + + From the cloud controller to the compute node, the iptables/ebtables + rules are recreated, since, at boot, nova-network reapplies the configurations. + + + + From the cloud controller, instances turn into a shutdown state + (because they are no longer running). + + + In the database, the data was not updated at all, since nova could not + have anticipated the crash. + + Before going further, and in order to prevent the admin from making + fatal mistakes, note that the instances won't be lost, since + no "destroy" or "terminate" command has been invoked, so the files for the instances + remain on the compute node. + The plan is to perform the following tasks, in that exact order ; any extra step would be dangerous at this stage + : + + + + We need to get the current relation from a volume to its instance, since we + will recreate the attachment (a scripted sketch follows this list). + + + We need to update the database in order to clean the stale state. + (After that, we won't be able to perform the first step). + + + We need to restart the instances (so they go from a "shutdown" to a + "running" state). + + + After the restart, we can reattach the volumes to their respective + instances. + + + That last step, which is not mandatory, consists of SSHing into the + instances in order to reboot them.
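+ Since the database cleanup in the next section destroys the information needed by the first step, it is worth persisting the volume/instance/mountpoint relations to a file right away. The following is only a sketch : it reuses the exact same filter as the euca-describe-volumes command shown in the next section, and /tmp/volumes.tmp is an illustrative path that has to match the $volumes_tmp_file variable used by the reattachment loop later on.
+ #!/bin/bash
+ # Save the VOLUME / INSTANCE / MOUNTPOINT triplets before cleaning the database.
+ volumes_tmp_file=/tmp/volumes.tmp
+ euca-describe-volumes | awk '{print $2,"\t",$8,"\t,"$9}' | grep -v "None" | sed "s/\,//g; s/)//g; s/\[.*\]//g; s/\\\\\//g" > $volumes_tmp_file
+ # Keep a copy somewhere safe as well ; this file is the only record of the attachments.
+ cat $volumes_tmp_file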
+ + + + + + B - The Disaster Recovery Process itself + + + + + Instance to Volume relation + + We need to get the current relation from a volume to its instance, + since we will recreate the attachment : + This relation can be obtained by running "euca-describe-volumes" : + euca-describe-volumes | $AWK '{print $2,"\t",$8,"\t,"$9}' | $GREP -v "None" | $SED "s/\,//g; s/)//g; s/\[.*\]//g; s/\\\\\//g" + That would output a three-column table : VOLUME + INSTANCE MOUNTPOINT + + + + + Database Update + + Second, we need to update the database in order to clean the stale + state. Now that we have saved, for every volume, the attachment we need to + restore, it's time to clean the database ; here are the queries that need + to be run : + + mysql> use nova; + mysql> update volumes set mountpoint=NULL; + mysql> update volumes set status="available" where status <>"error_deleting"; + mysql> update volumes set attach_status="detached"; + mysql> update volumes set instance_id=0; + + Now, by running euca-describe-volumes again, all volumes should + be available. + + + + Instances Restart + + We need to restart the instances, so + they will really run again. This can be done via a simple + euca-reboot-instances $instance + + At that stage, depending on your image, some instances would totally + reboot (thus become reachable), while others would stop on the + "plymouth" stage. + DO NOT reboot the ones + which are stopped at that stage a second time (see below, the + fourth step). In fact, it depends on whether or not you added an + "/etc/fstab" entry for that volume. Images built with the + cloud-init package (more info on + help.ubuntu.com) will remain in a pending state, while others + will skip the missing volume and start. But remember that the idea of + that stage is only to ask nova to reboot every instance, so the stored + state is preserved. + + + + + Volume Attachment + + After the restart, we can reattach the volumes to their respective + instances. Now that nova has restored the right status, it is time to + perform the attachments via an euca-attach-volume + + Here is a simple snippet that uses the file we created : + + #!/bin/bash + + # Read the volume/instance/mountpoint triplets saved earlier and reattach each volume. + while read line; do + volume=`echo $line | $CUT -f 1 -d " "` + instance=`echo $line | $CUT -f 2 -d " "` + mount_point=`echo $line | $CUT -f 3 -d " "` + echo "ATTACHING VOLUME FOR INSTANCE - $instance" + euca-attach-volume -i $instance -d $mount_point $volume + sleep 2 + done < $volumes_tmp_file + + At that stage, instances which were pending on the boot sequence + (plymouth) will automatically + continue their boot, and restart normally, while the ones which booted + will see the volume. + + + + SSH into instances + + If some services depend on the volume, or if a volume has an entry + in fstab, it could be good to simply restart the instance. This + restart needs to be made from the instance itself, not via nova. So, we + SSH into the instance and perform a reboot : + shutdown -r now + + + Voila! You have successfully recovered your cloud. + Here are some suggestions : + + + Use the errors=remount-ro parameter in your fstab file ; + that would help prevent data corruption. + The system would lock any write to the disk if it detects an I/O + error. This option should be added on the nova-volume server (the one + which performs the iSCSI connection to the SAN), but also in the + instances' fstab file. + + + Do not add the entry for the SAN's disks to the nova-volumes + server's fstab file.
+ Some systems would hang on that step, which means you could lose + access to your cloud-controller. In order to re-run the session + manually, you would run : + iscsiadm -m discovery -t st -p $SAN_IP $ iscsiadm -m node --target-name $IQN -p $SAN_IP -l + Then perform the mount. + + + For your instances, if you have the whole "/home/" directory on the + disk, then, instead of emptying the /home directory and mapping the disk on + it, leave a user's directory with, at least, their bash files ; but, more + importantly, the "authorized_keys" file. + That would allow you to connect to the instance, even without the + volume attached. (If you allow only connections via public keys.) + + + + + + C - Scripted DRP + You can find here a bash script which performs these five steps : + The "test mode" allows you to perform that whole sequence for only one + instance. + In order to reproduce the power loss, simply connect to the compute node which + runs that same instance, and close the iSCSI session (do + not detach the volume via "euca-detach-volume", but manually close the + iSCSI session). + Let's say this is iSCSI session number 15 for that instance : + iscsiadm -m session -u -r 15 Do not forget the -r flag ; otherwise, you would close ALL + sessions ! +
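+ As an illustration of the "test mode", here is how the power loss can be simulated for a single instance on its compute node. The session number and the volume name are taken from the examples above and will differ on your setup.
+ # On the compute node, list the open iSCSI sessions and spot the one backing the volume
+ iscsiadm -m session | grep volume-00000014
+ # e.g. : tcp: [15] 172.16.40.244:3260,1 iqn.2010-10.org.openstack:volume-00000014
+ # Log out of that single session ; the -r flag restricts the logout to session 15
+ iscsiadm -m session -u -r 15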
Reference for Flags in nova.conf @@ -1378,7 +1627,29 @@ Migration of i-00000001 initiated. Check its progress using euca-describe-instan default: 'http://127.0.0.1:8000' IP address plus port value; Location of the ajax console proxy and port - + + --allowed_roles + default: 'cloudadmin,itsec,sysadmin,netadmin,developer' + Comma separated list; List of allowed roles for a project (or tenant). + + + --auth_driver + default: 'nova.auth.dbdriver.DbDriver' + + String value; Name of the driver for authentication + + + nova.auth.dbdriver.DbDriver - Default setting, uses + credentials stored in zip file, one per project. + + + nova.auth.ldapdriver.FakeLdapDriver - create a replacement for + this driver supporting other backends by creating another class + that exposes the same public methods. + + + + --auth_token_ttl default: '3600' @@ -1401,6 +1672,12 @@ Migration of i-00000001 initiated. Check its progress using euca-describe-instan Password key; The secret access key that pairs with the AWS ID for connecting to AWS if necessary + + --ca_file + default: 'cacert.pem' + File name; File name of root CA + + --cnt_vpn_clients default: '0' @@ -1409,16 +1686,40 @@ Migration of i-00000001 initiated. Check its progress using euca-describe-instan --compute_manager default: 'nova.compute.manager.ComputeManager' - String value; Manager for Compute which handles remote procedure calls - relating to creating instances + String value; Manager for Compute which handles remote procedure calls relating to creating instances --create_unique_mac_address_attempts - default: 'nova.compute.manager.ComputeManager' - String value; Manager for Compute which handles remote procedure calls - relating to creating instances + default: '5' + Integer value; Number of attempts to create a unique MAC + address - + + --credential_cert_file + default: 'cert.pem' + Filename; Filename of certificate in credentials zip + + + --credential_key_file + default: 'pk.pem' + Filename; Filename of private key in credentials zip + + + --credential_rc_file + default: '%src' + File name; Filename of rc in credentials zip, %src will be replaced + by name of the region (nova by default). + + + --credential_vpn_file + default: 'nova-vpn.conf' + File name; Filename of the VPN client configuration file in the credentials zip + + + --crl_file + default: 'crl.pem' + File name; File name of Certificate Revocation List + --compute_topic default: 'compute' @@ -1514,6 +1815,12 @@ Migration of i-00000001 initiated. Check its progress using euca-describe-instan Deprecated - HTTP URL; Location to interface nova-api. Example: http://184.106.239.134:8773/services/Cloud + + --global_roles + default: 'cloudadmin,itsec' + Comma separated list; Roles that apply to all projects (or tenants) + + --flat_injected default: 'false' @@ -1641,6 +1948,11 @@ Migration of i-00000001 initiated. Check its progress using euca-describe-instan default: 'instance-%08x' Template string to be used to generate instance names. + + --keys_path + default: '$state_path/keys' + Directory; Where Nova keeps the keys + --libvirt_type default: kvm @@ -1651,6 +1963,21 @@ Migration of i-00000001 initiated. Check its progress using euca-describe-instan default: none Directory path: Writeable path to store lock files. + + --lockout_attempts + default: 5 + Integer value: Allows this number of failed EC2 authorizations before lockout. + + + --lockout_minutes + default: 15 + Integer value: Number of minutes to lockout if triggered. + + + --lockout_window + default: 15 + Integer value: Number of minutes for lockout window.
+ --logfile default: none @@ -1932,6 +2259,11 @@ Migration of i-00000001 initiated. Check its progress using euca-describe-instan default: '/usr/lib/pymodules/python2.6/nova/../' Top-level directory for maintaining Nova's state + + --superuser_roles + default: 'cloudadmin' + Comma separated list; Roles that ignore authorization checking completely + + --use_deprecated_auth default: 'false' Set to 1 or true to turn on; Determines whether to use the deprecated nova auth system or Keystone as the auth system @@ -1962,8 +2294,13 @@ Migration of i-00000001 initiated. Check its progress using euca-describe-instan AMI (Amazon Machine Image) for cloudpipe VPN server - --vpn_key_suffix + --vpn_client_template default: '/root/nova/nova/nova/cloudpipe/client.ovpn.template' + String value; Template for creating users vpn file. + + + --vpn_key_suffix + default: '-vpn' This is the interface that VlanManager uses to bind bridges and VLANs to. diff --git a/doc/src/docbkx/openstack-compute-admin/interfaces.xml b/doc/src/docbkx/openstack-compute-admin/interfaces.xml index 0440af5f39..32117f198a 100644 --- a/doc/src/docbkx/openstack-compute-admin/interfaces.xml +++ b/doc/src/docbkx/openstack-compute-admin/interfaces.xml @@ -63,7 +63,7 @@ cd src Next, get the openstack-dashboard project, which provides all the look and feel for the OpenStack Dashboard. -git clone https://github.com/4P/openstack-dashboard +git clone https://github.com/4P/horizon openstack-dashboard You should now have a directory called openstack-dashboard, which contains the OpenStack Dashboard application.
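+ Coming back to the flags documented in the reference above, here is what a nova.conf excerpt using the role and lockout related flags could look like. The values shown are simply the documented defaults, so such lines are only needed when you want to change them ; the file location (commonly /etc/nova/nova.conf) depends on your packaging.
+ --allowed_roles=cloudadmin,itsec,sysadmin,netadmin,developer
+ --global_roles=cloudadmin,itsec
+ --superuser_roles=cloudadmin
+ --lockout_attempts=5
+ --lockout_minutes=15
+ --lockout_window=15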