Treeherder Documentation
Release prototype
Mozilla
October 27, 2014
Contents
1 Installation
1.1 Cloning the Repo
1.2 Setting up Vagrant
1.3 Setting up Treeherder
1.4 Viewing the local server
1.5 Running the ingestion tasks
1.6 Building changes to the log parsers
2 Loading buildbot data
3 Deployment
3.1 Securing the connection
3.2 Serving the UI build from the distribution directory
4 Integrating the ui
5 Services architecture
5.1 Gunicorn
5.2 Gevent-socketio
5.3 Celery task worker
5.4 Celerybeat task scheduler
5.5 Celerymon task monitor
6 Common tasks
6.1 Apply a change in the code
6.2 Add a new repository
6.3 Restarting varnish
7 Troubleshooting
7.1 Using supervisord for development
7.2 Where are my log files?
8 Indices and tables
CHAPTER 1
Installation
1.1 Cloning the Repo
• Clone the treeherder-service repo from GitHub.
1.2 Setting up Vagrant
• Install VirtualBox and Vagrant if they are not already present.
• Either follow the Integrating the ui steps, or comment out this line in the Vagrantfile:
config.vm.synced_folder "../treeherder-ui", "/home/vagrant/treeherder-ui", type: "nfs"
• Open a shell, cd into the root of the project you just cloned, and type:
>vagrant up
• Go grab a tea or coffee; it will take a few minutes to set up the environment.
• Once the virtual machine is set up, you can log into it with:
>vagrant ssh
1.3 Setting up Treeherder
• A Python virtual environment is activated on login, so all that is left to do is cd into the project directory:
(venv)vagrant@precise32:~$ cd treeherder-service
• You can run the py.test suite with
(venv)vagrant@precise32:~/treeherder-service$ ./runtests.sh
• Initialize the master database
(venv)vagrant@precise32:~/treeherder-service$ python manage.py init_master_db
• Populate the database with repository data sources
(venv)vagrant@precise32:~/treeherder-service$ python manage.py init_datasources
• Export oauth credentials for all data source projects
(venv)vagrant@precise32:~/treeherder-service$ python manage.py export_project_credentials
• Add an entry to your host machine’s /etc/hosts so that you can point your browser to local.treeherder.mozilla.org to reach it:
192.168.33.10 local.treeherder.mozilla.org
1.4 Viewing the local server
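Before starting the server, it can help to confirm that the hosts entry is actually in place. The following is a minimal sketch, not part of treeherder: the helper name has_hosts_entry is hypothetical, and it only inspects the text you pass in.

```python
def has_hosts_entry(hosts_text, ip, hostname):
    """Return True if a non-comment line in hosts_text maps ip to hostname."""
    for line in hosts_text.splitlines():
        # Drop anything after a '#' comment marker, then split on whitespace.
        fields = line.split("#", 1)[0].split()
        # A hosts line is "<ip> <name> [<alias> ...]".
        if len(fields) >= 2 and fields[0] == ip and hostname in fields[1:]:
            return True
    return False
```

To check the real file, read /etc/hosts on the host machine and pass its contents together with "192.168.33.10" and "local.treeherder.mozilla.org".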
• Start a gunicorn instance listening on port 8000
(venv)vagrant@precise32:~/treeherder-service$ ./bin/run_gunicorn
All requests sent to local.treeherder.mozilla.org will be proxied to it by varnish/apache.
• For development you can use the Django runserver instead of gunicorn:
(venv)vagrant@precise32:~/treeherder-service$ python manage.py runserver
This is more convenient because it automatically reloads every time there’s a change in the code.
1.5 Running the ingestion tasks
• Start one or more celery workers to process async tasks:
(venv)vagrant@precise32:~/treeherder-service$ celery -A treeherder worker -B
The “-B” option tells the celery worker to start a beat service so that periodic tasks can be executed. You only need one worker with the beat service enabled; running multiple beat services results in periodic tasks being executed multiple times.
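The reason for enabling the beat service on only one worker can be illustrated with a toy model. This is a sketch in plain Python, not celery code: it assumes that every beat service queues every periodic task once per tick, with no coordination between services. The function name simulate_beats is hypothetical.

```python
from collections import Counter


def simulate_beats(num_beat_services, task_names, ticks):
    """Count how many times each periodic task gets queued.

    Toy model: each beat service independently queues every task
    once per tick, so the counts scale with the number of services.
    """
    queued = Counter()
    for _ in range(ticks):
        for _ in range(num_beat_services):
            for name in task_names:
                queued[name] += 1
    return queued
```

With one beat service a task is queued once per tick; with two services it is queued twice per tick, which is the duplication described above.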
1.6 Building changes to the log parsers
• The log parser shipped with treeherder makes use of Cython. If you change something in the treeherder/log_parser folder, remember to rebuild the C extensions with:
(venv)vagrant@precise32:~/treeherder-service$ python setup.py build_ext --inplace
CHAPTER 2
Loading buildbot data
In order to start ingesting data, you need to start a celery worker with the ‘-B’ option. This lets the worker run the scheduled tasks that are loaded into the database by the init_master_db command. Here is a brief description of what each periodic task does:
fetch-push-logs Retrieves and stores the latest pushes (a.k.a. resultsets) from the available repositories. This must be running before you can start ingesting job data. No pushes, no jobs.
fetch-buildapi-pending Retrieves and stores buildbot pending jobs using the RelEng buildapi service.
fetch-buildapi-running Same as fetch-buildapi-pending, but for running jobs.
fetch-buildapi-build4h Same as fetch-buildapi-pending, but it collects all the jobs completed in the last 4 hours.
process-objects Processes job objects from the objectstore into the jobs store. Once a job is processed, it becomes available in the restful interface for consumption. See the data flow diagram for more info.
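The ordering constraint between fetch-push-logs and process-objects ("no pushes, no jobs") can be pictured with a toy in-memory model. This is only an illustration under assumptions: the function, the "revision" key and the idea that unmatched objects simply stay pending are all hypothetical and do not describe treeherder's actual storage layer.

```python
def process_objects(objectstore, known_revisions):
    """Split queued job objects into processed jobs and pending objects.

    Toy model: a job object is moved to the jobs store only if the push
    (identified here by a hypothetical "revision" key) has been ingested.
    """
    jobs, pending = [], []
    for obj in objectstore:
        if obj["revision"] in known_revisions:
            jobs.append(obj)      # push is known: job becomes available
        else:
            pending.append(obj)   # no push yet: nothing to attach the job to
    return jobs, pending
```

In this model, running process_objects with an empty set of known revisions leaves every object pending, which mirrors why the push-fetching task has to run first.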
The following data flow diagram shows how treeherder uses these tasks.
CHAPTER 3
Deployment
The easiest way to deploy all the treeherder services on a server is to let puppet do it. Once puppet is installed on
your machine, clone the treeherder repo on the target machine and create a puppet manifest like this inside the puppet
directory:
import "classes/*.pp"
$APP_URL = "your.webapp.url"
$APP_USER = "your_app_user"
$APP_GROUP = "your_app_group"
$PROJ_DIR = "/home/${APP_USER}/treeherder-service"
$VENV_DIR = "/home/${APP_USER}/venv"
# You can make these less generic if you like, but these are box-specific
# so it's not required.
$DB_NAME = "db_name"
$DB_USER = "db_user"
$DB_PASS = "db_pass"
$DB_HOST = "localhost"
$DB_PORT = "3306"
$DJANGO_SECRET_KEY = "your-django-secret"
$RABBITMQ_USER = "your_rabbitmq_user"
$RABBITMQ_PASSWORD = "your_rabbitmq_pass"
$RABBITMQ_VHOST = "your_rabbitmq_vhost"
$RABBITMQ_HOST = "your_rabbitmq_host"
$RABBITMQ_PORT = "your_rabbitmq_port"
Exec {
    path => "/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin",
}
file {"/etc/profile.d/treeherder.sh":
    content => "
export TREEHERDER_DATABASE_NAME='${DB_NAME}'
export TREEHERDER_DATABASE_USER='${DB_USER}'
export TREEHERDER_DATABASE_PASSWORD='${DB_PASS}'
export TREEHERDER_DATABASE_HOST='${DB_HOST}'
export TREEHERDER_DATABASE_PORT='${DB_PORT}'
export TREEHERDER_DEBUG='1'
export TREEHERDER_DJANGO_SECRET_KEY='${DJANGO_SECRET_KEY}'
export TREEHERDER_MEMCACHED='127.0.0.1:11211'
export TREEHERDER_RABBITMQ_USER='${RABBITMQ_USER}'
export TREEHERDER_RABBITMQ_PASSWORD='${RABBITMQ_PASSWORD}'
export TREEHERDER_RABBITMQ_VHOST='${RABBITMQ_VHOST}'
export TREEHERDER_RABBITMQ_HOST='${RABBITMQ_HOST}'
export TREEHERDER_RABBITMQ_PORT='${RABBITMQ_PORT}'
"
}
class deployment {
    class {
        init: before => Class["mysql"];
        mysql: before => Class["python"];
        python: before => Class["apache"];
        apache: before => Class["varnish"];
        varnish: before => Class["treeherder"];
        treeherder: before => Class["rabbitmq"];
        rabbitmq:;
    }
}
include deployment
As you can see, it’s very similar to the file used to start up the vagrant environment. You can run this file with the following command:
(venv)vagrant@precise32:~/treeherder-service$ sudo puppet apply puppet/your_manifest_file.pp
Once puppet has finished, the only thing left to do is to start all the treeherder services (gunicorn, socketio, celery, etc.). The easiest way to do this is via supervisord. A supervisord configuration file is included in the repo under deployment/supervisord/treeherder.conf.
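The manifest above writes a set of TREEHERDER_* variables to /etc/profile.d/treeherder.sh. As an illustration of how such variables could be consumed from Python, here is a minimal sketch. It is an assumption about usage, not treeherder's actual settings module: the function names are hypothetical, and the fallback values for host and port are simply the ones shown in the manifest.

```python
import os

PREFIX = "TREEHERDER_"


def read_treeherder_env(environ=None):
    """Collect TREEHERDER_* variables into a dict keyed without the prefix."""
    environ = os.environ if environ is None else environ
    return {
        key[len(PREFIX):]: value
        for key, value in environ.items()
        if key.startswith(PREFIX)
    }


def database_settings(environ=None):
    """Build a Django-style database dict from the exported variables."""
    env = read_treeherder_env(environ)
    return {
        "NAME": env["DATABASE_NAME"],          # required: raises KeyError if unset
        "USER": env["DATABASE_USER"],
        "PASSWORD": env["DATABASE_PASSWORD"],
        "HOST": env.get("DATABASE_HOST", "localhost"),  # assumed default
        "PORT": env.get("DATABASE_PORT", "3306"),       # assumed default
    }
```

Passing an explicit dict instead of relying on os.environ makes it easy to check a configuration before any service is started.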
3.1 Securing the connection
To put everything under an SSL connection you may want to use an SSL wrapper like stunnel. Here is a basic example of a stunnel configuration file:
cert = /path-to-my-pem-file/credentials.pem
[https]
accept = 443
connect = 80
3.2 Serving the UI build from the distribution directory
To serve the UI from the treeherder-ui/dist directory, in the treeherder-ui directory run:
(venv)vagrant@precise32:~/treeherder-ui$ grunt build
This builds the UI by concatenating and minifying the JS and CSS, and moves all required assets to a directory called dist in the repository root of treeherder-ui. Then, in treeherder-service/Vagrantfile, uncomment this line:
puppet.manifest_file = "production.pp"
The production.pp manifest sets the web application directory to the dist directory.
CHAPTER 4
Integrating the ui
If you want to develop both the ui and the service side by side it may be convenient to load the ui in the vagrant
environment.
• Make sure the treeherder-ui repo is cloned in the same parent folder as treeherder-service (and with the directory
name ‘treeherder-ui’).
• If you previously commented out the treeherder-ui line in the Vagrantfile as part of the Installation instructions,
undo that now.
• If you have an existing Vagrant environment set up, you will need to reload it using:
>vagrant reload
You should now be able to access the ui on http://local.treeherder.mozilla.org/ui/
CHAPTER 5
Services architecture
Running treeherder at full speed requires a number of services to be started. For an overview of all the services, see the diagram below.
All the services marked with a yellow background are Python scripts that can be found in the bin directory. In a typical deployment they are monitored by something like supervisord. A description of each service follows.
5.1 Gunicorn
A WSGI server in charge of serving the restful API and the Django admin. All requests to this server are proxied through varnish and apache.
5.2 Gevent-socketio
A gevent-based implementation of a socket.io server that can send soft-realtime updates to the clients. It only serves socketio-related requests, typically namespaced with /socket.io. While running, it consumes messages from rabbitmq using an “events.#” routing key. As soon as a new event is detected, it is sent down to the consumers who subscribed to it. To separate the socket.io connections from the standard http ones we use varnish with the following configuration:
sub vcl_recv {
    if (req.url ~ "socket.io/[0-9]") {
        set req.backend = socketio;
        if (req.http.upgrade ~ "(?i)websocket") {
            return (pipe);
        }
    }
    else {
        set req.backend = apache;
    }
    return (pass);
}
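To make the routing decision in the varnish rule above easier to follow, the same logic can be written as a small Python function. This is a sketch for illustration only: the function name route and the tuple it returns are hypothetical, and it mirrors just the two regular expressions and the pipe/pass outcomes shown in the configuration.

```python
import re

# Same patterns as in the VCL; "~" there is an unanchored regex match.
SOCKETIO_URL = re.compile(r"socket.io/[0-9]")
WEBSOCKET_UPGRADE = re.compile(r"websocket", re.IGNORECASE)


def route(url, upgrade_header=""):
    """Return (backend, action) for a request, following the rule above."""
    if SOCKETIO_URL.search(url):
        # socket.io traffic goes to the socketio backend; websocket
        # upgrades are piped, everything else is passed.
        if WEBSOCKET_UPGRADE.search(upgrade_header or ""):
            return ("socketio", "pipe")
        return ("socketio", "pass")
    # All other requests go to apache.
    return ("apache", "pass")
```

For example, a request to /socket.io/1/ with an Upgrade header of "WebSocket" is piped to the socketio backend, while a plain API request is passed to apache.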
5.3 Celery task worker
This service executes asynchronous tasks that can be triggered by the celerybeat task scheduler or by another worker. In a typical treeherder deployment you will have two different pools of workers:
• a gevent based pool, generally good for I/O bound tasks
• a pre-fork based pool, generally good for CPU bound tasks
In the bin directory of treeherder-service there’s a script to run both types of pool.
5.4 Celerybeat task scheduler
A scheduler process in charge of running periodic tasks.
5.5 Celerymon task monitor
This process provides an interface to the status of the workers and the running tasks. It can be used to provide this information to monitoring tools like Munin.
CHAPTER 6
Common tasks
This is a list of maintenance tasks you may have to execute on a treeherder-service deployment.
6.1 Apply a change in the code
If you changed something in the log parser, you need to do a compilation step:
> python setup.py build_ext --inplace
In order to make the various services aware of a change in the code you need to restart supervisor:
> sudo /etc/init.d/supervisord restart
6.2 Add a new repository
To add a new repository, the following steps are needed:
• Append a new datasource to the datasource fixtures file located at treeherder/model/fixtures/repository.json
• Load the file you edited with the loaddata command:
> python manage.py loaddata repository
• Create a new datasource for the given repository:
> python manage.py init_datasources
• Generate a new oauth credentials file:
> python manage.py export_project_credentials
• Restart all the services through supervisord:
> sudo /etc/init.d/supervisord restart
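Once the fixtures file has been edited by hand, the remaining steps above are a fixed sequence of commands, so they can be scripted. The following is a minimal sketch under assumptions: the helper names are hypothetical, and it assumes it is run from the treeherder-service directory with the virtual environment active.

```python
import subprocess
import sys


def add_repository_commands(python=sys.executable):
    """Return the commands from the steps above, in order, as argv lists."""
    manage = [python, "manage.py"]
    return [
        manage + ["loaddata", "repository"],
        manage + ["init_datasources"],
        manage + ["export_project_credentials"],
        ["sudo", "/etc/init.d/supervisord", "restart"],
    ]


def run_all(commands, cwd="."):
    """Run each command in turn, stopping at the first failure."""
    for argv in commands:
        subprocess.check_call(argv, cwd=cwd)
```

Calling run_all(add_repository_commands()) stops at the first command that exits with a non-zero status, so the services are not restarted if a management command fails.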
6.3 Restarting varnish
You may want to restart varnish after a change in the ui. To do so, type:
> sudo /etc/init.d/varnish restart
CHAPTER 7
Troubleshooting
7.1 Using supervisord for development
On an ubuntu machine you can install supervisord with
>sudo apt-get install supervisor
To start supervisord with an arbitrary configuration, you can type:
>supervisord -c my_config_file.conf
You can find a supervisord config file inside the deployment/supervisord folder. That config file contains a section for each service that you may want to run. Feel free to comment out one or more of those sections if you don’t need that specific service. For example, if you just want to access the restful API or the admin, comment out all those sections except the one related to gunicorn. You can stop supervisord (and all the processes it manages) with ctrl+c. Please note that you may need to kill the celery worker manually when it’s under heavy load.
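Instead of commenting sections out by hand, a trimmed copy of the config can be generated. The sketch below is one possible approach, not something shipped with treeherder: it relies on supervisord program sections being named [program:NAME], the function name keep_programs is hypothetical, and comments in the original file are not preserved in the output.

```python
import configparser
import io


def keep_programs(config_text, wanted):
    """Return config_text with only the [program:NAME] sections in wanted.

    Non-program sections (such as [supervisord]) are kept unchanged.
    """
    # RawConfigParser does no interpolation, so values containing
    # "%(...)s" expansions are left untouched.
    parser = configparser.RawConfigParser()
    parser.read_string(config_text)
    for section in parser.sections():
        if section.startswith("program:"):
            name = section.split(":", 1)[1]
            if name not in wanted:
                parser.remove_section(section)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()
```

The result can be written to a new file and passed to supervisord with the -c option shown above, leaving the original config untouched.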
7.2 Where are my log files?
You can find the log files for the various services under:
• /var/log/celery
• /var/log/gunicorn
• /var/log/socketio
You may also want to inspect the main treeherder log file, ~/treeherder-service/treeherder.log.
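When a log file is large, it is often enough to look at its last few lines. The following is a small, generic sketch for doing that from Python; the helper name tail is hypothetical and nothing in it is specific to treeherder.

```python
from collections import deque


def tail(path, lines=20):
    """Return the last `lines` lines of a text file as a list of strings."""
    # errors="replace" avoids failing on stray non-UTF-8 bytes in a log.
    with open(path, errors="replace") as handle:
        # A bounded deque keeps only the most recent lines in memory.
        return list(deque(handle, maxlen=lines))
```

It can be pointed at treeherder.log or at a file inside one of the directories listed above.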
CHAPTER 8
Indices and tables
• genindex
• modindex
• search