OpenStack Third-Party CI
These instructions provide a Third Party Testing solution using the same tools and scripts used by the OpenStack Infrastructure 'Jenkins' CI system.
If you are setting up a similar system for use outside of OpenStack, many of these steps are still valid, while others can be skipped. These will be mentioned within each step.
If you are creating a third-party CI system for use within OpenStack, you'll need to familiarize yourself with the contents of the third party manual, and in particular you'll need to [create a service account](http://docs.openstack.org/infra/system-config/third_party.html#creating-a-service-account).
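Once your service account exists, a quick way to confirm it can reach Gerrit is to listen to the event stream directly (a minimal check; replace <username> with your service account name):
ssh -p 29418 <username>@review.openstack.org gerrit stream-events
If the connection succeeds, you will see a stream of JSON events as changes are updated on the server.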
Overview
This CI solution uses a few open-source tools:
- Jenkins - an open-source continuous integration server.
- Zuul - a project gating system.
- Nodepool - a node management system for testing.
- Jenkins Job Builder - a tool to manage Jenkins job definitions.
- os-loganalyze - a tool to facilitate browsing, sharing, and filtering log files by log level.
The following steps will help you integrate and deploy the first 4 tools on a single node. An initial system with 8GB RAM, 4 CPUs, and an 80GB HD running Ubuntu 14.04 should be sufficient.
A second node will be used to store the log files and create a public log server to host the static log files generated by jenkins jobs. This log server node is an Apache server serving log files stored on disk or on a Swift service. It is hosted on a separate node because it usually needs to be publicly accessible to share job results, whereas the rest of the CI system can be located behind a firewall or within a VPN. At the end of a Jenkins job, publishers will scp log files from the jenkins slave to the log server node or upload them to the Swift service.
The system requires two external resources:
- A source for Nodepool nodes. This is a service that implements the OpenStack Nova API to provide virtual machines or bare metal nodes. Nodepool will use this service to manage a pool of Jenkins slaves that will run the actual CI jobs. You can use a public or private OpenStack cloud, or even run your own devstack to get started.
- A Gerrit server (for OpenStack users, this is provided to you at review.openstack.org). Zuul will listen to the Gerrit event stream to decide which jobs to run when it receives a desired event. Zuul will also post a comment with the job results to this Gerrit server, along with a link to the related log files.
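Before going further, it is worth confirming that you can reach the Nova API of the cloud you plan to use for Nodepool nodes. A minimal check, assuming you have downloaded an openrc file with your cloud credentials (the file name is an assumption):
source openrc   # your cloud credentials file; name is an assumption
nova list       # should return a (possibly empty) server list without errors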
These instructions are for a 'masterless' puppet setup, which is the simplest version to set up for those not familiar with puppet.
Install and Configure Puppet
On each node, you will need to install and configure puppet. These scripts assume a dedicated 'clean' node built with a base Ubuntu 14.04 server image.
Install Puppet
Puppet is a tool to automate the installation of servers by defining the desired end state. You can quickly install puppet along with basic tools (such as pip and git) using this script:
sudo su -
wget https://git.openstack.org/cgit/openstack-infra/system-config/plain/install_puppet.sh
bash install_puppet.sh
exit
Install Puppet Modules
You can get the latest version of the puppet modules needed using this script.
sudo su -
git clone https://git.openstack.org/openstack-infra/system-config
cd system-config
./install_modules.sh
exit
This script will install all the puppet modules used by upstream to /etc/puppet/modules. In many cases, these are git cloned, and running the install_modules.sh script again will update them to the latest version. This script uses modules.env as its configuration input.
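You can confirm the modules were installed by listing the directory:
ls /etc/puppet/modules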
Configure Masterless Puppet
The instructions in this section apply to both the single-node CI server node as well as the log server node.
It is useful to save the history, so set up a git repo as root user:
sudo su -
cd /etc/puppet
git init
echo "modules/" >> .gitignore
git add .
git config --global user.email "you@example.com"
git config --global user.name "Your Name"
git commit -m "initial files"
exit
You will be configuring 3 puppet files. The first is site.pp, which is the top-level entry point for puppet to start managing the node. The second is hiera.yaml, which configures Puppet Hiera to store local configurations and secrets such as passwords and private keys. Finally, some yaml files store the actual configurations and secrets.
Set up these 3 files by starting with the samples provided. For each node, select the corresponding single_node_ci* or log_server* files.
Configure Puppet hiera.yaml so that puppet knows where to look for the common.yaml file you'll create in the next step:
sudo su -
cp /etc/puppet/modules/openstackci/contrib/hiera.yaml /etc/puppet
exit
If setting up the single node CI node:
sudo su -
cp /etc/puppet/modules/openstackci/contrib/single_node_ci_site.pp /etc/puppet/manifests/site.pp
cp /etc/puppet/modules/openstackci/contrib/single_node_ci_data.yaml /etc/puppet/environments/common.yaml
exit
If setting up the log server node:
sudo su -
cp /etc/puppet/modules/openstackci/contrib/log_server_site.pp /etc/puppet/manifests/site.pp
cp /etc/puppet/modules/openstackci/contrib/log_server_data.yaml /etc/puppet/environments/common.yaml
exit
Modify /etc/puppet/environments/common.yaml as needed, using the parameter documentation described in single_node_ci.pp or logserver.pp. These are the top-level puppet classes used in site.pp.
One parameter, project_config_repo, must be set in /etc/puppet/environments/common.yaml. Configure it with the URL of the 'project-config' repository, which you will create in the step Create an Initial 'project-config' Repository below.
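A minimal sketch of what this might look like (the URL and key value are placeholders; the keys shown are ones referenced elsewhere in this guide):
# /etc/puppet/environments/common.yaml (excerpt; values are placeholders)
project_config_repo: 'https://github.com/example-org/project-config.git'
jenkins_ssh_public_key: 'ssh-rsa AAAA...your-key... jenkins@ci'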
Once completed, you should commit these 3 files to the /etc/puppet git repo. Your git workflow may vary a bit, but here is an example:
sudo su -
cd /etc/puppet
git checkout -b setup
git add environments/common.yaml
# repeat for other modified files
git commit -a -m 'initial setup'
exit
Set up the log server
Set up the log server node first, as it is simpler to configure. In addition, its FQDN (or IP address) is needed to set up the CI server node.
While setting up jenkins_ssh_public_key in common.yaml, it is important that the same ssh key pair be used when setting up the CI server node in the next step. This is the ssh key that Jenkins will use to scp files.
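If you do not already have a key pair for this purpose, one way to generate it (a sketch; the file path is an assumption):
sudo su -
ssh-keygen -t rsa -b 4096 -N '' -f /root/jenkins_key   # hypothetical path
cat /root/jenkins_key.pub   # paste this into jenkins_ssh_public_key in common.yaml
exit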
At this point you are ready to invoke Puppet for the first time. Puppet needs to be run as root.
sudo puppet apply --verbose /etc/puppet/manifests/site.pp
You can simulate a jenkins file upload using:
scp -i $JENKINS_SSH_PRIVATE_KEY_FILE -o StrictHostKeyChecking=no $your-log-file jenkins@<fqdn_or_ip>:/srv/static/logs/
You should now be able to see the file you uploaded at
http://<fqdn_or_ip>/$your-log-file
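You can also verify from another machine that the file is served over HTTP:
curl -I http://<fqdn_or_ip>/$your-log-file   # expect an HTTP 200 response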
Set up the CI server
Follow the steps above to install and configure puppet on the CI server node.
Create an Initial 'project-config' Repository
Setting up a CI system consists of two major operational aspects. The first is system configuration, which focuses on the installation and deployment of the services, including any ssh keys, credentials, databases, etc., and ensures all system components are able to interact. This portion is performed by a System Administrator.
The second is project configuration, which includes the configuration files that the services use to perform the desired project-specific operations.
The instructions provided here are mainly focused on the system configuration aspect. However, system configuration requires an initial set of project configurations in order to work. These project configurations are provided via a git URL to a project-config repository. Before moving on, create an initial project-config repository. You can start with this project-config-example by following the instructions provided in its README.md. While tailored for OpenStack users, the instructions will also help non-OpenStack users get started with this repository. After your system is deployed, you can make further changes to the project-config repository to continuously tailor it to your needs.
Add 'jenkins' to your host name
Add 'jenkins' to your /etc/hosts file so that Apache (which will be installed by the puppet scripts) is happy. This is needed because the scripts will install multiple services on a single node. For example:
head -n 1 /etc/hosts
127.0.0.1 localhost jenkins
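If 'jenkins' is not already present, one way to append it to the first line of /etc/hosts non-interactively (a sketch):
sudo sed -i '1 s/$/ jenkins/' /etc/hosts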
Run masterless Puppet
At this point you are ready to invoke Puppet for the first time. Puppet needs to be run as root.
sudo puppet apply --verbose /etc/puppet/manifests/site.pp
Puppet will install nodepool, jenkins, zuul, Jenkins Job Builder, etc.
Your project-config repository will be cloned to /etc/project-config, and the puppet scripts will use the configuration files located in this folder. Do not update these files directly. Instead, update them in a clone on a dev host, merge the changes to master, and push them to the same git remote location. Puppet will always pull down the latest version of master from the git remote and use that to update services.
If you get the following error, manually run the failed jenkins-jobs update command, as root, with the arguments specified in the error message. This is caused by a bug in the puppet scripts where Jenkins is not yet running when Jenkins Job Builder attempts to load the Jenkins jobs.
Notice: /Stage[main]/Jenkins::Job_builder/Exec[jenkins_jobs_update]/returns: jenkins.JenkinsException: Error in request: [Errno 111] Connection refused
Notice: /Stage[main]/Jenkins::Job_builder/Exec[jenkins_jobs_update]/returns: INFO:jenkins_jobs.builder:Cache saved
Error: /Stage[main]/Jenkins::Job_builder/Exec[jenkins_jobs_update]: Failed to call refresh: jenkins-jobs update --delete-old /etc/jenkins_jobs/config returned 1 instead of one of [0]
Error: /Stage[main]/Jenkins::Job_builder/Exec[jenkins_jobs_update]: jenkins-jobs update --delete-old /etc/jenkins_jobs/config returned 1 instead of one of [0]
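In this case, the command to rerun (taken directly from the error message above) is:
sudo jenkins-jobs update --delete-old /etc/jenkins_jobs/config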
Restart apache if necessary
There are some known issues with Puppet automation. If you get the following error:
AH00526: Syntax error on line 21 of /etc/apache2/sites-enabled/50-<fqdn/ip>.conf:
Invalid command 'RewriteEngine', perhaps misspelled or defined by a module not included in the server configuration
A simple restart works around the issue:
sudo service apache2 restart
Start zuul
We'll start zuul first:
sudo service zuul start
sudo service zuul-merger start
You should see 2 zuul-server processes and 1 zuul-merger process:
ps -ef | grep zuul
zuul 5722 1 2 18:13 ? 00:00:00 /usr/bin/python /usr/local/bin/zuul-server
zuul 5725 5722 0 18:13 ? 00:00:00 /usr/bin/python /usr/local/bin/zuul-server
zuul 5741 1 2 18:13 ? 00:00:00 /usr/bin/python /usr/local/bin/zuul-merger
You can view the log files for any errors:
view /var/log/zuul/zuul.log
Most zuul files are located in either of the following directories. They should not need to be modified directly, but are useful to help identify root causes:
/var/lib/zuul
/etc/zuul
Start nodepool
The first time you start nodepool, it's recommended to build the image manually to aid in debugging any issues. To do that, first start the nodepool-builder service:
sudo service nodepool-builder start
The nodepool-builder service is responsible for receiving image build requests and calling Disk Image Builder to carry out the image creation. You can see its logs by typing:
view /var/log/nodepool/nodepool-builder.log
Next, switch to the nodepool user to trigger the image build manually:
sudo su - nodepool
# Ensure the NODEPOOL_SSH_KEY variable is in the environment
# Otherwise nodepool won't be able to ssh into nodes based
# on the image built manually using these instructions
source /etc/default/nodepool
# In the command below <image-name> references one of the
# images defined in your project-config/nodepool/nodepool.yaml
# file as the 'name' field in the section 'diskimages'.
nodepool image-build <image-name>
You can follow the image creation process by watching the image creation log:
tail -f /var/log/nodepool/image/image.log
If you run into issues building the image, the documentation provided here can help you debug.
After you have successfully built an image, manually upload it to the provider to ensure provider authentication and image uploading work:
nodepool image-upload all <image-name>
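Once the upload finishes, you can check that nodepool knows about the image (this assumes the nodepool CLI in your release provides the image-list subcommand):
nodepool image-list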
Once successful, you can start nodepool. (Note that if you don't yet have an image, this is one of the first actions nodepool will do when it starts, before creating any nodes):
sudo service nodepool start
You should see at least one process running. In particular:
ps -ef | grep nodepool
nodepool 5786 1 28 18:14 ? 00:00:01 /usr/bin/python /usr/local/bin/nodepoold -c /etc/nodepool/nodepool.yaml -l /etc/nodepool/logging.conf
After building and uploading the images to the providers, nodepool will start to build nodes on those providers based on the image and will register those nodes as jenkins slaves.
If that does not happen, the nodepool log files will help identify the causes.
view /var/log/nodepool/nodepool.log
view /var/log/nodepool/debug.log
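You can also inspect the state of the nodes nodepool is managing (again assuming the nodepool CLI provides the list subcommand):
sudo su - nodepool
nodepool list
exit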
Most nodepool configuration files are located in either of the following directories. They should never be modified directly, as puppet will overwrite any changes, but they are useful to help identify root causes:
/etc/nodepool
/home/nodepool/.config/openstack/clouds.yaml
Set up Jenkins
First, restart Jenkins so that the plugins will be fully installed:
sudo service jenkins restart
Then open the Jenkins UI to finish manual configuration steps.
Enable Gearman, which is the Jenkins plugin zuul uses to queue jobs:
http://<host fqdn/ip>:8080/
Manage Jenkins --> Configure System
For "Gearman Server Port" use port number 4730
Under "Gearman Plugin Config" Check the box "Enable Gearman"
Click "Test Connection" It should return success if zuul is running.
The zuul process is running a gearman server on port 4730. To check the status of gearman, telnet to 127.0.0.1 port 4730 on your zuul node and issue the command status to get status information about the jobs registered in gearman:
echo 'status' | nc 127.0.0.1 4730 -w 1
The output of the status command contains tab-separated columns with the following information:
1. Name: The name of the job.
2. Number in queue: The total number of jobs in the queue, including the currently running ones (next column).
3. Number of jobs running: The total number of jobs currently running.
4. Number of capable workers: A maximum possible count of workers that can run this job. This number being zero is one reason zuul reports "NOT Registered".
build:noop-check-communication 1 0 1
build:dsvm-tempest-full 2 1 1
Enable ZMQ Event Publisher, which is how nodepool is notified of Jenkins slave status events:
http://<host fqdn/ip>:8080/
Manage Jenkins --> Configure System
Under "ZMQ Event Publisher"
Check the box "Enable on all Jobs"
Securing Jenkins (optional)
By default, Jenkins is installed with security disabled. While this may be fine for development environments where external access to the Jenkins UI is restricted, you are strongly encouraged to enable it. You can skip this step and do it at a later time if you wish:
Create a jenkins credential:
http://<host fqdn/ip>:8080/
Manage Jenkins --> Add Credentials --> SSH Username with private key
Username 'jenkins'
Private key --> From a file on Jenkins master
"/var/lib/jenkins/.ssh/id_rsa"
--> Save
Save the credential uuid in your hiera data:
sudo su jenkins
cat /var/lib/jenkins/credentials.xml | grep "<id>"
Copy the id to the 'jenkins_credentials_id' value in /etc/puppet/environments/common.yaml
Enable basic Jenkins security:
http://<host fqdn/ip>:8080/
Manage Jenkins --> Configure Global Security
Check "Enable Security"
Under "Security Realm"
Select "Jenkins' own user database"
Uncheck allow users to sign up
Under "Authorization" select "logged-in users can do anything"
Create a user 'jenkins'
Choose a password.
check 'Sign up'
Save the password to the 'jenkins_password' value in /etc/puppet/environments/common.yaml
Get the new 'jenkins' user API token:
http://<host fqdn/ip>:8080/
Manage Jenkins --> People --> Select user 'jenkins' --> configure --> Show API Token
Save this token to the 'jenkins_api_key' value in /etc/puppet/environments/common.yaml
Reconfigure your system to use the Jenkins security settings stored in /etc/puppet/environments/common.yaml:
sudo puppet apply --verbose /etc/puppet/manifests/site.pp
Configuring Jenkins Plugins (recommended)
single-use slave:
This plugin will mark nodes as offline when a job completes on them. This plugin is intended to be used with external tools like Nodepool, which has the ability to spin up slaves on demand and then reap them when Jenkins has run a job on them. This plugin is needed because there is a race condition between when the job completes and when the external tool is able to reap the node. Labels can be taken from the project-config/nodepool/nodepool.yaml file under section "labels".
http://<host fqdn/ip>:8080/
Manage Jenkins --> Configure System
Under "Single Use Slaves"
Add comma separated labels (one way to find them is shown below)
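One way to pull the label names out of your project config (a sketch; -A 10 is an arbitrary amount of context, adjust to your file):
grep -A 10 '^labels:' /etc/project-config/nodepool/nodepool.yaml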
Updating your masterless puppet hosts
Any time you check-in changes to your project-config
repo, make changes to the hiera data
(/etc/puppet/environments/common.yaml
), or update the
puppet files (in /etc/puppet/modules, either manually or via the
install_modules.sh
script), run the same puppet command to
update the host.
sudo puppet apply --verbose /etc/puppet/manifests/site.pp
If you need to change the git url of your project-config, or any other git urls in your common.yaml, delete the respective repository (e.g. /etc/project-config), and puppet will reclone it from the new location when the above puppet apply command is reinvoked.
Note that it is safe, and expected, to rerun the above puppet apply command. Puppet will update the configuration of the host as described in the puppet classes. This means that if you delete or modify any files managed by puppet, rerunning the puppet apply command will restore those settings back to the specified state (and remove your local changes, for better or worse). You could even run the puppet apply command as a cron job to enable continuous deployment in your CI system.
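For example, a root crontab entry like the following (the schedule and log path are examples, not upstream recommendations):
# run puppet apply every 15 minutes and keep a log of each run
*/15 * * * * puppet apply --verbose /etc/puppet/manifests/site.pp >> /var/log/puppet_apply.log 2>&1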