Simulating DevStack jobs with Vagrant

There are times when you would like to troubleshoot locally why your change is failing a dsvm job at the gate.
Or maybe you want to write a new dsvm job definition and need an environment as close to the gate as possible to try things out.

The DevStack-gate project has instructions for simulating DevStack gate tests, which can be found here:

https://git.openstack.org/cgit/openstack-infra/devstack-gate/tree/README.rst

These instructions are easy to follow, but it would be nice to have some automation around them, so I wrote a Vagrantfile:

ZUUL_URL = ENV['ZUUL_URL'] || 'https://git.openstack.org'
ZUUL_PROJECT = ENV['ZUUL_PROJECT'] || 'openstack/nova'
ZUUL_BRANCH = ENV['ZUUL_BRANCH'] || 'master'
ZUUL_REF = ENV['ZUUL_REF'] || 'HEAD'

DEVSTACK_JOB = ENV['DEVSTACK_JOB'] || <<JOB
export PYTHONUNBUFFERED=true
export DEVSTACK_GATE_TEMPEST=1
export DEVSTACK_GATE_TEMPEST_FULL=1
cp devstack-gate/devstack-vm-gate-wrap.sh ./safe-devstack-vm-gate-wrap.sh
./safe-devstack-vm-gate-wrap.sh
JOB

$script = <<SCRIPT
apt-get update
apt-get -y install git python python-dev python-pip
pip install tox
ssh-keygen -N "" -t rsa -f /root/.ssh/id_rsa
KEY_CONTENTS=$(cat /root/.ssh/id_rsa.pub | awk '{print $2}' )
git clone https://git.openstack.org/openstack-infra/system-config /opt/system-config
/opt/system-config/install_puppet.sh
/opt/system-config/install_modules.sh
puppet apply --modulepath=/opt/system-config/modules:/etc/puppet/modules -e 'class { openstack_project::single_use_slave: install_users => false,   enable_unbound => true, ssh_key => \"$KEY_CONTENTS\" }'
echo "jenkins ALL=(ALL) NOPASSWD:ALL" > /etc/sudoers.d/jenkins
export WORKSPACE=/home/jenkins/workspace/testing
mkdir -p "$WORKSPACE"
cd $WORKSPACE
git clone --depth 1 https://git.openstack.org/openstack-infra/devstack-gate

export ZUUL_URL=#{ZUUL_URL}
export ZUUL_PROJECT=#{ZUUL_PROJECT}
export ZUUL_BRANCH=#{ZUUL_BRANCH}
export ZUUL_REF=#{ZUUL_REF}

exec 0<&-
#{DEVSTACK_JOB}
SCRIPT

Vagrant.configure(2) do |config|
  config.vm.box = "ubuntu/trusty64"
  config.vm.provider "virtualbox" do |vb|
    # Customize the amount of memory on the VM:
    vb.memory = "6144"
  end
  config.vm.provision "shell", inline: $script
end

This Vagrantfile has several variables that can be defined by the user:

ZUUL_URL -> The URL to clone the ref from
ZUUL_PROJECT -> The project to test
ZUUL_BRANCH -> The branch to test
ZUUL_REF -> The ref to test
DEVSTACK_JOB -> The dsvm job

The variables can either be modified in the Vagrantfile or defined as environment variables in the shell.

Even better, let’s walk through an example:
I wrote the patch https://review.openstack.org/#/c/302770/ , which adds Designate zones support to the shade project.
This patch could use some functional/integration tests, but the shade project jobs do not set up Designate.
Ideally, we would like to take one of the existing dsvm jobs for shade and just add the bits that make the resulting DevStack “designate-enabled”.
So we clone the project-config project and copy one of the shade dsvm job definitions:

export PYTHONUNBUFFERED=true
export DEVSTACK_GATE_NEUTRON=0
export DEVSTACK_GATE_HEAT=1
export PROJECTS="openstack-infra/shade $PROJECTS"
if [ "$BRANCH_OVERRIDE" != "default" ] ; then
    export OVERRIDE_ZUUL_BRANCH=$BRANCH_OVERRIDE
fi

function post_test_hook {
    $BASE/new/shade/shade/tests/functional/hooks/post_test_hook.sh
}
export -f post_test_hook

cp devstack-gate/devstack-vm-gate-wrap.sh ./safe-devstack-vm-gate-wrap.sh
./safe-devstack-vm-gate-wrap.sh

Looking at the Designate dsvm jobs in project-config, it seems that Designate is installed in DevStack via a plugin, so adding the following lines to the shade dsvm job does the trick:

export PYTHONUNBUFFERED=true
export DEVSTACK_GATE_NEUTRON=0
export DEVSTACK_GATE_HEAT=1
export DEVSTACK_LOCAL_CONFIG="enable_plugin designate git://git.openstack.org/openstack/designate"
export PROJECTS="openstack/designate $PROJECTS"
export PROJECTS="openstack/designate-dashboard $PROJECTS"
export PROJECTS="openstack/designate-tempest-plugin $PROJECTS"
export PROJECTS="openstack-infra/shade $PROJECTS"
if [ "$BRANCH_OVERRIDE" != "default" ] ; then
    export OVERRIDE_ZUUL_BRANCH=$BRANCH_OVERRIDE
fi

function post_test_hook {
    $BASE/new/shade/shade/tests/functional/hooks/post_test_hook.sh
}
export -f post_test_hook

cp devstack-gate/devstack-vm-gate-wrap.sh ./safe-devstack-vm-gate-wrap.sh
./safe-devstack-vm-gate-wrap.sh

We just inserted lines 4 to 7, which enable the Designate DevStack plugin and add the needed Designate projects to the PROJECTS variable so that they are checked out by the DevStack-gate scripts.

We are almost there!
We just need to go to the change in Gerrit, click on ‘Download’ and note the ref to use for the ZUUL_REF variable.
This is how you would run the whole thing:

ricky@ricky-Surface-Pro-3:~/devel/vagrants/devstack-designate$ export ZUUL_PROJECT=openstack-infra/shade
ricky@ricky-Surface-Pro-3:~/devel/vagrants/devstack-designate$ export ZUUL_BRANCH=master
ricky@ricky-Surface-Pro-3:~/devel/vagrants/devstack-designate$ export ZUUL_REF=refs/changes/70/302770/1
ricky@ricky-Surface-Pro-3:~/devel/vagrants/devstack-designate$ read -d '' DEVSTACK_JOB <<"EOF"
export PYTHONUNBUFFERED=true
export DEVSTACK_GATE_NEUTRON=0
export DEVSTACK_GATE_HEAT=1
export DEVSTACK_LOCAL_CONFIG="enable_plugin designate git://git.openstack.org/openstack/designate"
export PROJECTS="openstack/designate $PROJECTS"
export PROJECTS="openstack/designate-dashboard $PROJECTS"
export PROJECTS="openstack/designate-tempest-plugin $PROJECTS"
export PROJECTS="openstack-infra/shade $PROJECTS"
if [ "$BRANCH_OVERRIDE" != "default" ] ; then
    export OVERRIDE_ZUUL_BRANCH=$BRANCH_OVERRIDE
fi

function post_test_hook {
    $BASE/new/shade/shade/tests/functional/hooks/post_test_hook.sh
}
export -f post_test_hook

cp devstack-gate/devstack-vm-gate-wrap.sh ./safe-devstack-vm-gate-wrap.sh
./safe-devstack-vm-gate-wrap.sh
EOF
ricky@ricky-Surface-Pro-3:~/devel/vagrants/devstack-designate$ export DEVSTACK_JOB=$DEVSTACK_JOB
ricky@ricky-Surface-Pro-3:~/devel/vagrants/devstack-designate$ echo $DEVSTACK_JOB

We kick off the whole thing with vagrant up and voilà: we get a Vagrant VM with DevStack running the job defined for the project/change we want to test :-) .
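
Once provisioning finishes, the basic workflow is just a couple of Vagrant commands (a minimal sketch; the workspace path below is the WORKSPACE exported in the provisioning script above, everything else about the job output depends on the job you ran):

vagrant up                                 # create the VM and run the dsvm job during provisioning
vagrant ssh                                # log into the VM to inspect the results
sudo ls /home/jenkins/workspace/testing    # the job workspace created by the provisioning script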

Happy hacking!

Deploying multiple OpenStack clouds with Ansible in a data-driven fashion

Ansible comes with a great set of companion OpenStack modules:

http://docs.ansible.com/ansible/list_of_cloud_modules.html#openstack

As you can see, you can create a wide variety of common OpenStack resources (servers, networks, volumes, etc.), which you can then use as building blocks to create a “cloud” with its applications and services (monitoring, DBs, app servers, or whatever).

Let’s say we have access to an OpenStack cloud named yaycloud, and we have already stored the connection details for this cloud in our OSCC clouds.yaml.
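
If you have not written that file yet, the yaycloud entry in clouds.yaml might look roughly like this (a minimal sketch; the auth_url, credentials and project name are placeholders, and os-client-config also accepts other file locations):

mkdir -p ~/.config/openstack
cat > ~/.config/openstack/clouds.yaml <<'EOF'
clouds:
  yaycloud:
    auth:
      auth_url: https://yaycloud.example.com:5000/v2.0
      username: yayuser
      password: changeme
      project_name: yayproject
EOF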

With our shiny new cloud, we would like to upload a bootstrap key so we can later SSH into the servers, upload a locally available Ubuntu Trusty image, and create a server to host a Nagios service.

This server would be tied to the Neutron network that the cloud admins created for us, named ‘test-net’.

The Ansible playbook could look something like this:


---
- hosts: localhost
  connection: local
  gather_facts: false
  tasks:
  - os_keypair:
      cloud: yaycloud
      name: bootstrap-key
      public_key_file: /home/ubuntu/.ssh/id_rsa.pub
  - os_image:
      cloud: yaycloud
      name: ubuntu-trusty
      filename: /home/ubuntu/trusty-server-cloudimg-amd64-disk1.img
  - os_server:
      cloud: yaycloud
      name: nagios
      image: ubuntu-trusty
      key_name: bootstrap-key
      flavor: m1.small
      network: test-net
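
Assuming the playbook above is saved as, say, yaycloud.yml (the file name is arbitrary), running it only needs Ansible plus the shade library installed on the machine you run it from:

pip install ansible shade                  # the Ansible OpenStack modules are backed by shade
ansible-playbook -i 'localhost,' yaycloud.yml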

This works well when you only need to create a few OpenStack resources in the same cloud.

However, it quickly becomes unwieldy as your setup grows, because you need to repeat the os_<resource> tasks over and over for each individual resource.
Also, if you plan to have resources in other clouds, you may end up with a different playbook per cloud to avoid a single cluttered playbook that creates resources in all the different clouds you use.
That in turn means repeating the same Ansible code with just different configuration across several playbooks.

Ideally, it would be great if we could:

  1. Decouple the configuration from the code (our specific cloud resource configuration data from the Ansible statements)
  2. Define the resources of our cloud(s) in a declarative way and have Ansible deploy them
  3. Be able to define as many resources as we want, on as many clouds as we like
  4. Be able to re-use the resources we express in different clouds, without repeating ourselves

To that end, I’ve been working for the past few weeks on an Ansible role that meets the above requirements.
I present to you the Ansible OpenStack Cloud Launcher!

With this role, you just have to follow these steps to deploy your cloud resources:

  1. Create a YAML file that defines your clouds and cloud resources (e.g. resources.yml)
  2. Create a super-simple playbook that calls the AOCL (Ansible OpenStack Cloud Launcher) role (e.g. test_aocl.yml):

    ---
    - hosts: localhost
      connection: local
      gather_facts: false
      roles:
        - { role: ansible-openstack-cloud-launcher }

  3. Run ansible-playbook -i 'localhost,' test_aocl.yml -e "@resources.yml"
  4. Profit!

As seeing is believing, this is better explained with an example :-).

Imagine we work in a startup called aoclcompany.
In this company there are a few teams that use OpenStack to do their work (it’s the obvious choice!):
the OPS, QA and RnD teams.

We are fortunate enough to have access to two different cloud providers, awesomecloud and yaycloud.
In the former we have cloud admin access, which allows us to create domains, projects and users.
In the latter, the admins of yaycloud created a regular non-admin account for each of the teams.

In this initial deployment phase, we are given the task of creating the following layout:

  1. In the cloud where we have admin access, create a specific domain for each of the teams. Each domain will contain a single project and a user (which would eventually become admins, but we won’t do that at this stage), an Ubuntu Trusty image and some specific flavors that will be available to the users, different from the public cloud provider flavors
  2. In the cloud where we have regular user access, each of the accounts (OPS/QA/RnD) will have its own network/subnet/router, as these are not created for us by default. The provider does not have Trusty available, so each account will have its own Ubuntu Trusty image
  3. Both clouds (awesomecloud and yaycloud) will have a common bootstrap key
  4. The OPS account in yaycloud will have a machine to host a Nagios, with HTTP/HTTPS open
  5. The QA account in yaycloud will have a machine to host a Jenkins, with port 8080 open
  6. The RnD account in yaycloud will have a machine to host a Docker registry; in this case we don’t care about ports and everything should be open

Now that we have the requirements, we model them in YAML in our resources.yml file:

profiles:
  - name: admin-clouds
    domains:
      - name: ops
        description: Ops team domain
      - name: qa
        description: QA team domain
      - name: rnd
        description: R&D team domain
    projects:
      - name: ops
        domain: ops
        description: Ops team project
      - name: qa
        domain: qa
        description: QA team project
      - name: rnd
        domain: rnd
        description: RnD team project
    users:
      - name: opsadmin
        password: changeme
        email: opsadmin@aoclcompany.aocl
        domain: ops
        default_project: ops
      - name: qaadmin
        password: changeme
        email: qaadmin@aoclcompany.aocl
        domain: qa
        default_project: qa
      - name: rndadmin
        password: changeme
        email: rndadmin@aoclcompany.aocl
        domain: rnd
        default_project: rnd
    flavors:
      - name: aoclcompany.xlarge
        ram: 128
        vcpus: 1
        disk: 0
      - name: aoclcompany.large 
        ram: 64
        vcpus: 1
        disk: 0
    images:
      - name: ubuntu-trusty
        filename: /home/ubuntu/trusty-server-cloudimg-amd64-disk1.img
  - name: ops
    networks:
      - name: ops-net
    subnets:
      - name: ops-subnet
        network_name: ops-net
        cidr: 192.168.0.0/24
        dns_nameservers:
          - 8.8.8.8
    routers:
      - name: ops-router
        network: public
        interfaces: ops-subnet
    security_groups:
      - name: webserver
        description: Allow HTTP/HTTPS traffic
    images:
      - name: ubuntu-trusty
        filename: /home/ubuntu/trusty-server-cloudimg-amd64-disk1.img
    security_groups_rules:
      - security_group: webserver
        protocol: tcp
        port_range_min: 80
        port_range_max: 80
        remote_ip_prefix: 0.0.0.0/0
      - security_group: webserver
        protocol: tcp
        port_range_min: 443
        port_range_max: 443
        remote_ip_prefix: 0.0.0.0/0
    servers:
      - name: nagios
        image: ubuntu-trusty
        key_name: bootstrap-key
        flavor: m1.small
        security_groups: webserver
        network: ops-net
  - name: qa
    networks:
      - name: qa-net
    subnets:
      - name: qa-subnet
        network_name: qa-net
        cidr: 192.168.1.0/24
        dns_nameservers:
          - 8.8.8.8
    routers:
      - name: qa-router
        network: public
        interfaces: qa-subnet
    security_groups:
      - name: webserver
        description: Allow HTTP/HTTPS traffic
      - name: altwebserver
        description: Allow 8080 traffic
    security_groups_rules:
      - security_group: webserver
        protocol: tcp
        port_range_min: 80
        port_range_max: 80
        remote_ip_prefix: 0.0.0.0/0
      - security_group: webserver
        protocol: tcp
        port_range_min: 443
        port_range_max: 443
        remote_ip_prefix: 0.0.0.0/0
      - security_group: altwebserver
        protocol: tcp
        port_range_min: 8080
        port_range_max: 8080
        remote_ip_prefix: 0.0.0.0/0
    servers:
      - name: jenkins
        image: cirros-0.3.4-x86_64-uec
        key_name: bootstrap-key
        flavor: m1.tiny
        security_groups: altwebserver
        network: qa-net
  - name: rnd
    networks:
      - name: rnd-net
    subnets:
      - name: rnd-subnet
        network_name: rnd-net
        cidr: 192.168.2.0/24
        dns_nameservers:
          - 8.8.8.8
    routers:
      - name: rnd-router
        network: public
        interfaces: rnd-subnet
    security_groups:
      - name: openwide
        description: Allow all traffic
    security_groups_rules:
      - security_group: openwide
        protocol: tcp
        remote_ip_prefix: 0.0.0.0/0
    servers:
      - name: docker-registry
        image: cirros-0.3.4-x86_64-uec
        key_name: bootstrap-key
        flavor: m1.tiny
        security_groups: openwide
        network: rnd-net
  - name: bootstrap-keypair
    keypairs:
      - name: bootstrap-key
        public_key_file: /home/ubuntu/.ssh/id_rsa.pub
clouds:
  - name: awesomecloud
    profiles:
      - admin-clouds
      - bootstrap-keypair
  - name: yaycloud-ops
    oscc_cloud: yaycloud-opsuser
    profiles:
      - bootstrap-keypair
      - ops
  - name: yaycloud-qa
    oscc_cloud: yaycloud-qauser
    profiles:
      - bootstrap-keypair
      - qa
  - name: yaycloud-rnd
    oscc_cloud: yaycloud-rnduser
    profiles:
      - bootstrap-keypair
      - rnd

With that in place, and as noted earlier, we just run the following command to deploy our cloud resources:

ansible-playbook -i 'localhost,' test_aocl.yml -e "@resources.yml"

The resources.yml “DSL” is very simple:
it contains a profiles list and a clouds list.
The profiles list has an item for each profile you define, where a profile is a collection of OpenStack resources.
These can be servers, networks, domains, etc. (note that we use plurals here).
Each resource type takes the same attributes as the corresponding Ansible os_<resource> module, so you can use them directly if you are already familiar with the modules, or refer to the Ansible OpenStack modules documentation.
An attribute is only required when the Ansible module requires it; optional attributes fall back to default(omit).

The clouds list contains an item for each of the clouds you have previously defined in your OSCC clouds.yaml.
On each of these items you can either re-use previously defined profiles by name or define per-cloud specific resources.
For example:

clouds:
  - name: awesomecloud
    profiles:
      - admin-clouds
      - bootstrap-keypair
  - name: nonprofilescloud
    domains:
      - name: IT
        description: IT team domain

In the above, the first cloud re-uses the resources defined in the admin-clouds and bootstrap-keypair profiles, whereas the second cloud defines a specific per-cloud domain called IT.
Obviously, you can mix and match per-cloud and profile resources (Ansible will create the per-cloud resources first).

This is very powerful, because:

  1. We can define our cloud layouts in a human-readable format like YAML
  2. We can re-use resource definitions by using profiles
  3. We can also have per-cloud specific resources
  4. We can have persistent infrastructure by keeping the resources.yml in a git repo.
    We just need a cron job on the Ansible control machine that pulls the resources.yml repo and runs the AOCL role with it as an argument (see the sketch after this list)
  5. We can put the resources.yml git repo under code review in Gerrit to review infrastructure changes (because, you know, you don’t want to put state: absent on the Gerrit server resource and merge it straight away :D)
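
For point 4, the cron entry can be as simple as the following sketch (the schedule, user, paths and log file are all illustrative, and it assumes the repository with resources.yml has already been cloned to /opt/cloud-layout):

# /etc/cron.d/aocl -- re-apply the cloud layout every 30 minutes
*/30 * * * * ansible cd /opt/cloud-layout && git pull --quiet && ansible-playbook -i 'localhost,' test_aocl.yml -e "@resources.yml" >> /var/log/aocl.log 2>&1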

Happy hacking!

Random startup failures on Gerrit instance in cloud

Lately, I’ve been playing around with testing the various main components of OpenStack Infra, namely the puppet manifests.

I ran into an interesting problem last week where starting Gerrit would work on the first try, but then fail on subsequent attempts.

The interesting thing is that if I increased the timeout value of the Gerrit upstart init script to some ludicrously high value (900 seconds), it would eventually start at some point.

I thought it could be due to upstream using a forked Gerrit version, but the git diff showed the differences were minimal.

As I was running this Gerrit test on an HP Cloud instance, I tried running it instead in a Vagrant VM on my rusty but still working home server.

It turned out Gerrit would start and stop immediately without any problems there, so the problem clearly had something to do with running it on a cloud instance.

I shared the problem with my colleagues and one of them said ‘hey, this could be something about entropy’.

Suddenly something clicked in my mind and I remembered that upstream’s Nodepool images have the haveged package baked in, so I did an apt-get install haveged and voila, Gerrit would start and stop without ANY problems.
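
If you hit the same symptoms, a quick way to confirm the diagnosis (a minimal sketch; treating a few hundred as "low" is only a rule of thumb):

cat /proc/sys/kernel/random/entropy_avail   # a value in the low hundreds means the pool is starved
sudo apt-get install -y haveged             # haveged keeps the kernel entropy pool topped up
cat /proc/sys/kernel/random/entropy_avail   # should now report a much healthier number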

P.S. Thanks to my colleague Nicola Heald for putting me on the right track with this problem; I spent a whole morning doing all sorts of testing and didn’t think about entropy!

Create a Jenkins user from the command-line with Jenkins CLI and Groovy

I have not been able to find examples of how to create users in Jenkins through the API.

Fortunately, the Jenkins CLI allows you to run arbitrary Groovy scripts against the Jenkins instance, so it was just a matter of finding the right API call to achieve that.

These are the two commands needed:


wget http://localhost:8080/jnlpJars/jenkins-cli.jar

echo 'hpsr=new hudson.security.HudsonPrivateSecurityRealm(false); hpsr.createAccount("dummyuser", "dummypassword")' | java -jar jenkins-cli.jar -s http://localhost:8080 groovy =
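
To double-check that the account was created, you can list the users Jenkins knows about with another Groovy one-liner (a small sketch; hudson.model.User.getAll() simply returns every user Jenkins is aware of):

echo 'hudson.model.User.getAll().each { println(it.getId()) }' | java -jar jenkins-cli.jar -s http://localhost:8080 groovy =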

Introducing you my ESX whitebox

For debugging issues in my daily job, I need a box with a lot of memory.

And when I say a lot, I mean it.

One of the products I support in my job needs a minimum of 8 GB just to barely work.

Creating a VM with such a massive amount of memory on a laptop kills the underlying OS, so I had to figure something out.

A few months ago, I started to buy components in order to build an ESX whitebox server that could solve my problems.

I carefully selected components that were as silent as possible, as the server would be placed in my rack cabinet, which is right next to me (I’ll probably blog about my rack cabinet in future posts).

It took me several months to get all the server components I needed, but I couldn’t be happier with the result 😀.

These are the specs:

CPU

2x Intel Xeon E5506 (http://ark.intel.com/products/37096/Intel-Xeon-Processor-E5506-(4M-Cache-2_13-GHz-4_80-GTs-Intel-QPI))

Motherboard

Asus Z8NA-D6 (http://www.asus.com/Server_Workstation/Server_Motherboards/Z8NAD6/)

Memory

6x Kingston KVR1333D3D4R9S/8GHB (http://www.kingston.com/dataSheets/KVR1333D3D4R9S_8GHB.pdf)

Hard disks

OCZ Vertex 3 SSD (http://www.ocztechnology.com/ocz-vertex-3-sata-iii-2-5-ssd.html)

2x Seagate 3TB 7200RPM (http://www.seagate.com/internal-hard-drives/desktop-hard-drives/barracuda/?sku=ST3000DM001) , configured in RAID1.

RAID Controller

Asus PIKE 2108 (http://www.asus.com/Server_Workstation/Accessories/PIKE_2108/)

Fans

Noctua NF-S12B FLX (http://www.noctua.at/main.php?show=productview&products_id=25&lng=en)

2x Noctua NH-U9DX 1366  (http://www.noctua.at/main.php?show=productview&products_id=27&lng=en)

PSU

Antec HCP-1200 (http://www.antec.com/productPSU.php?id=2468&fid=343)

Case

Rackmatic CK32 (http://www.planetronic.es/rack-caja-atx-rack-f540-2×525-8×35-p-5990.html)