documentation:
Annotations provide a way to mark points on the graph with rich events. When you hover over an annotation you can get event description and event tags. The text field can include links to other systems with more detail.
As all our deployments are performed through Ansible, the idea is to have Ansible create an annotation in Grafana every time a playbook is executed.
We started by looking at how we could push events to Prometheus through the Pushgateway and display them as annotations in Grafana, but as soon as we opened the Pushgateway README we found a message that seemed addressed to us:
The Pushgateway is not an event store. While you can use Prometheus as a data source for Grafana annotations, tracking something like release events has to happen with some event-logging framework.
With our idea stopped short, we put the topic aside since it wasn’t on the project’s critical path, but we kept it in mind as a “nice to have” feature.
After some time, I found myself browsing the Grafana documentation and discovered that Grafana 4.6+ comes with an HTTP API that stores annotations natively, meaning there is no need to store anything in Prometheus.
As soon as we got the news, we came back to the topic and started thinking about how to implement the solution.
How can we interact with this brand new API from Ansible? We identified 3 possibilities:
Solution 1: Ansible URI module in the playbooks
Since we are dealing with an HTTP API, the first solution that came to mind was the Ansible URI module, which would let us send the appropriate HTTP request to the Grafana endpoint.
But it also means that we would have to update all our playbooks and find a way to avoid breaking their idempotency. It may look trivial, but it can result in several extra lines of YAML.
Solution 2: Dedicated Ansible module
We thought about writing an Ansible module, mostly to handle the idempotency problem inside the module and keep the YAML short.
Solution 3: Dedicated Ansible callback
An Ansible callback would let us send the annotations without even modifying the playbooks, because it is code that Ansible always executes, no matter what you do in the playbooks.
However, the code is executed only at specific stages of the playbook run.
Since our objective was to make all deployments (whatever the playbook) visible in the Grafana dashboards, a callback seemed to be the right option, as everything becomes traceable as soon as the callback is enabled.
All our playbooks are also idempotent, and we did not want to lose that or spend too much time preserving it while adding the Grafana annotations through a module.
I won’t explain how the Ansible callback works, or how we implemented it, but the code is available on GitHub (it was submitted to the Ansible repository and is awaiting review as I write these lines).
You may have noticed that there are 2 kinds of annotations available in Grafana:
Regular annotation
A simple bar with information available on hover. The annotation has only one time indication.
In our use case it fits the notification of a playbook start or failure.
Region annotation
It represents a period on the graph. The annotation has two time indications: one for the event start time, the other for the event end time.
The event information is available on hover anywhere between the start and end times.
In our use case it fits the representation of the playbook execution period (and duration).
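To make the difference between the two kinds concrete, here is a minimal sketch of the JSON bodies one could send to Grafana’s annotation endpoint (POST /api/annotations). The helper names are hypothetical, and the field names (time, timeEnd, isRegion, tags, text) are assumed from the Grafana HTTP API documentation of that period, with timestamps expressed in epoch milliseconds. It is an illustration, not the code of our callback.

```python
import json


def regular_annotation(text, tags, time_ms):
    """Body for a regular annotation: a single timestamp (epoch milliseconds)."""
    return {"time": time_ms, "tags": list(tags), "text": text}


def region_annotation(text, tags, start_ms, end_ms):
    """Body for a region annotation: a start and an end timestamp."""
    if end_ms < start_ms:
        raise ValueError("end_ms must not be earlier than start_ms")
    return {
        "time": start_ms,
        "timeEnd": end_ms,
        # Assumption: Grafana versions of that era expected this flag for
        # regions; later versions may rely on timeEnd alone.
        "isRegion": True,
        "tags": list(tags),
        "text": text,
    }


if __name__ == "__main__":
    # Example values only: a playbook start and a one-minute run.
    print(json.dumps(regular_annotation("Started playbook test.yml", ["ansible"], 1507037197339)))
    print(json.dumps(region_annotation("Playbook test.yml", ["ansible"], 1507037197339, 1507037257339)))
```

The only structural difference is the second timestamp, which is why a region cannot be described before the event it covers has ended.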
From the Ansible documentation:
You can activate a custom callback by either dropping it into a callback_plugins directory adjacent to your play, inside a role, or by putting it in one of the callback directory sources configured in ansible.cfg.
Plugins are loaded in alphanumeric order. For example, a plugin implemented in a file named 1_first.py would run before a plugin file named 2_second.py.
Most callbacks shipped with Ansible are disabled by default and need to be whitelisted in your ansible.cfg file in order to function. For example:
#callback_whitelist = timer, mail, profile_roles
In our case, the Grafana callback is not shipped with Ansible, so we will have to create the “callback_plugins” directory near our playbooks:
$ cd <your_playbook_dir>
$ mkdir callback_plugins
Then copy the callback source into the directory:
$ cd callback_plugins
$ wget https://raw.githubusercontent.com/rrey/ansible-callback-grafana-annotations/master/callback_plugins/grafana_annotations.py
Finally, enable the plugin in ansible.cfg by adding it to the “callback_whitelist” variable:
[...]
callback_whitelist = grafana_annotations
Note: You may have to create ansible.cfg yourself, as it is not created by default when you install Ansible.
Different locations are supported; see the Ansible documentation.
Once the callback is enabled, you’ll have to set some environment variables to give the callback the information it requires. Here are the parameters you can define through the environment:
With the environment variables set, you can run your playbooks without changing anything in the Ansible command call.
Let’s do a quick demo!
First, let’s start a Grafana instance with Docker:
$ docker pull grafana/grafana
$ docker run -d --name=grafana -p 3000:3000 grafana/grafana
Grafana is now reachable at http://127.0.0.1:3000; the default credentials are admin/admin.
Once logged in, create a Dashboard with a Graph panel and save it.
For this demo, we don’t need to create a datasource and have data available. When you create a Graph panel, Grafana displays a graph with random data. If you refresh the dashboard you will see that the data completely changes. Again, it is not important for the demo as we simply want to see the annotation displayed at the proper time.
Go to the dashboard settings and open the “annotations” sub-menu:
By default, a dashboard has a “built-in” annotation query that only displays the annotations and alerts of the dashboard. (You can scope an annotation to a dashboard by specifying its id.)
Looking closer at this built-in query definition, we can see that the configuration specifies that the query filters by “Dashboard”.
The built-in query for Annotations & Alerts
In our case, we have one instance of Grafana and Prometheus per environment, so there is no risk of publishing annotations that are totally unrelated to the environment.
So the callback will not publish the annotations on a specific dashboard; they will be global.
Since the built-in query only displays the annotations explicitly scoped to the dashboard, we need to add a new query that will display our global annotations.
Click the “New” button under the built-in query and configure the following query:
We define a query performed against Grafana’s native store that filters the annotations by tags. It means that our query will look for annotations tagged with the value “ansible”, which is one of the tags defined by our callback for all annotations.
The callback also defines a tag with the playbook name and a 3rd tag among the following possibilities:
For the demo, selecting the “ansible” tag lets us display all the annotations with only one query.
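The same tag filter can also be expressed against the HTTP API, which is handy for checking from a script that the annotations exist. The sketch below only builds the URL for GET /api/annotations; the helper name is hypothetical, and the tags, from and to query parameters are assumed from the Grafana HTTP API documentation. An actual request would also need to be authenticated.

```python
from urllib.parse import urlencode


def annotations_query_url(base_url, tags, from_ms=None, to_ms=None):
    """Build a GET /api/annotations URL filtered by tags and an optional time range."""
    # One "tags" parameter per tag; per the Grafana documentation (assumption),
    # repeating the parameter means an annotation must carry all of the tags.
    params = [("tags", tag) for tag in tags]
    if from_ms is not None:
        params.append(("from", from_ms))
    if to_ms is not None:
        params.append(("to", to_ms))
    return base_url.rstrip("/") + "/api/annotations?" + urlencode(params)


if __name__ == "__main__":
    # All annotations published with the "ansible" tag.
    print(annotations_query_url("http://127.0.0.1:3000", ["ansible"]))
    # Narrowed down to one playbook by adding the playbook-name tag.
    print(annotations_query_url("http://127.0.0.1:3000", ["ansible", "test.yml"]))
```

Filtering on “ansible” alone keeps the query broad; adding the playbook name narrows it to a single playbook.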
Now that you have the dashboard with a Graph panel inside it, let’s create an API token for Ansible.
You can do it from the Grafana UI, but you can also use the API as I did:
$ curl -XPOST 127.0.0.1:3000/api/auth/keys --user "admin:admin" --data '{"name": "ansible-callback", "role": "Editor"}' -H "Content-Type: application/json"
{"name":"ansible-callback","key":"eyJrIjoiZ2RNc2NPWXZmNE5IZmxjb1hHOGJTNk5YSjJqWXdmbVYiLCJuIjoiYW5zaWJsZS1jYWxsYmFjayIsImlkIjoxfQ=="}
The token value is returned in the server’s response. Write it down somewhere: you cannot retrieve it a second time, and you will have to recreate it if you lose it.
The configuration parameters are provided to the callback through environment variables. Export the following variables:
$ export GRAFANA_SERVER=127.0.0.1
$ export GRAFANA_PORT=3000
$ export GRAFANA_API_TOKEN=eyJrIjoiZ2RNc2NPWXZmNE5IZmxjb1hHOGJTNk5YSjJqWXdmbVYiLCJuIjoiYW5zaWJsZS1jYWxsYmFjayIsImlkIjoxfQ==
Note: be sure to replace the token value with your own.
Since Grafana is reachable through HTTP in the container, we can leave GRAFANA_SECURE at its default value. That is why I am not exporting it.
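To illustrate how such variables can be turned into a request target, here is a minimal sketch that reads them and derives the annotation URL and the HTTP headers. The function name, the fallback values and the convention that GRAFANA_SECURE=1 means HTTPS are assumptions made for this example; the real callback may interpret the variables differently. The “Authorization: Bearer” header is the usual way to present a Grafana API token.

```python
import os


def grafana_settings(environ=None):
    """Derive the annotation URL and HTTP headers from environment variables."""
    env = os.environ if environ is None else environ
    token = env.get("GRAFANA_API_TOKEN")
    if not token:
        raise ValueError("GRAFANA_API_TOKEN is not set")
    # Assumed fallbacks, chosen to match the demo setup.
    server = env.get("GRAFANA_SERVER", "127.0.0.1")
    port = env.get("GRAFANA_PORT", "3000")
    # Assumption: "1" enables HTTPS, anything else (or unset) means plain HTTP.
    scheme = "https" if env.get("GRAFANA_SECURE", "0") == "1" else "http"
    return {
        "url": "%s://%s:%s/api/annotations" % (scheme, server, port),
        "headers": {
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
    }


if __name__ == "__main__":
    # A placeholder token is passed explicitly so the example runs anywhere.
    settings = grafana_settings({"GRAFANA_API_TOKEN": "<your_token>"})
    print(settings["url"])
```

Passing a dictionary instead of reading os.environ directly keeps the function easy to try out without touching your shell environment.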
Now let’s write a dummy playbook that will not do much:
test.yml:
- hosts: localhost
  connection: local
  tasks:
    - debug:
        msg: “Hello world”
We just want to trigger the callback, so what the playbook does is not really important here.
Now that everything is ready, let’s run the playbook:
$ ansible-playbook test.yml
 [WARNING]: Host file not found: /etc/ansible/hosts
 [WARNING]: provided hosts list is empty, only localhost is available

PLAY [localhost] ************************************************************

TASK [Gathering Facts] ******************************************************
ok: [localhost]

TASK [debug] ****************************************************************
ok: [localhost] => {
    "msg": "“Hello world”"
}

PLAY RECAP ******************************************************************
localhost : ok=2 changed=0 unreachable=0 failed=0
Check your dashboard and enjoy the fancy annotations!
This annotation notifies viewers that a playbook execution has started. Since a region annotation needs a start time and an end time, it is not possible to create the region before the end of the playbook.
This simple annotation still puts something on the Grafana dashboard, so that anyone watching it can see that something is going on before the playbook has ended.
This annotation shows the playbook run period as a region annotation.
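The timing logic described above can be sketched as follows: remember the start time, publish a simple annotation right away, then publish the region once the end time is known. The class name, the method names and the annotation texts are hypothetical, and the “send” argument stands for whatever function actually posts the body to Grafana. Only the “ansible” and playbook-name tags come from our callback; its third tag is left out here.

```python
import time


class PlaybookAnnotator:
    """Publishes a start annotation, then a region covering the whole run."""

    def __init__(self, playbook, send, clock=time.time):
        self.playbook = playbook
        self.send = send      # callable receiving the annotation body (a dict)
        self.clock = clock    # injectable clock returning seconds since the epoch
        self.start_ms = None

    def _now_ms(self):
        # Annotation timestamps are assumed to be epoch milliseconds.
        return int(self.clock() * 1000)

    def _tags(self):
        return ["ansible", self.playbook]

    def on_start(self):
        """Single-point annotation: the end time is not known yet."""
        self.start_ms = self._now_ms()
        self.send({
            "time": self.start_ms,
            "tags": self._tags(),
            "text": "Started playbook %s" % self.playbook,
        })

    def on_end(self):
        """Region annotation: only possible once the playbook has finished."""
        if self.start_ms is None:
            raise RuntimeError("on_start() must be called before on_end()")
        self.send({
            "time": self.start_ms,
            "timeEnd": self._now_ms(),
            "isRegion": True,  # assumed field name, as in older Grafana versions
            "tags": self._tags(),
            "text": "Playbook %s finished" % self.playbook,
        })


if __name__ == "__main__":
    sent = []  # collect the bodies instead of posting them
    annotator = PlaybookAnnotator("test.yml", sent.append)
    annotator.on_start()
    annotator.on_end()
    for body in sent:
        print(body)
```

In a real callback, the two methods would be triggered at the playbook start and at the end of the run, which are among the stages where Ansible executes callback code.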
Now if you see huge peaks or drops in a panel, you immediately see whether they are related to a deployment, instead of investigating possible root causes only to realize later that a deployment was performed.
Now we just have to configure our Jenkins (or any other CI/CD tool) pipelines/jobs to define the environment variables before executing the playbooks, and all the deployments will be nicely displayed in the Grafana dashboards.
That’s it. If you have questions, comments or remarks, don’t hesitate to comment!