GrafanaCon EU 2018, Amsterdam
By Tobias Kühne
On the 1st and 2nd of March 2018, the first GrafanaCon on European territory took place in the Compagnietheater in Amsterdam, Netherlands. I had the pleasure of being among the attendees and would like to share some of the highlights and ideas presented at the conference.
What is Grafana?
For those not yet familiar with Grafana: it is a highly customizable and user-friendly open-source software for visualizing time-series data. It supports many data sources (e.g. InfluxDB, Prometheus, Elasticsearch, Graphite, AWS CloudWatch, but also some RDBMS like PostgreSQL, MySQL and Microsoft SQL Server).
One of Grafana's standout features is the ability to combine metrics from different data sources in a single graph, which makes it very easy to correlate events.
Alerting and notifications are also supported, through various channels like Slack, email, OpsGenie, etc.
GrafanaCon in numbers
As we all like numbers, here are some about this conference:
- 2 days of talks
- 3rd GrafanaCon (2015 & 2016 in New York)
- 32 speakers
- 350+ attendees
- $20,000 donation to the Electronic Frontier Foundation (by selling 20 “Angel tickets”)
- >140,000 active Grafana installations
Release of Grafana 5
After his opening talk, Torkel Ödegaard – the creator of the project – announced the release of the (at the time) newest version 5.0 of Grafana on the very day of the conference.
Grafana 5 offers a lot of new features, mostly related to an even more user-friendly, flexible user interface and a more granular permission model. A new layout engine makes it easier to design dashboards by dragging & dropping or resizing panels. Folders can now be created to help organize and group dashboards. By setting permissions on folders or individual dashboards, it's easier to manage access to Grafana resources.
For me personally, the most interesting feature in version 5+ is the option to provision data sources and dashboards from source code. In previous versions we created an initial data source by calling the API endpoints while setting up Grafana. It worked, but it always felt a bit ugly. Now it's as simple as deploying YAML files via <your favourite configuration management tool>.
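As a minimal sketch of what such a provisioning file looks like (the data source details here are illustrative assumptions, not from the talk):

```yaml
# Dropped into Grafana's default provisioning directory, e.g.
# /etc/grafana/provisioning/datasources/prometheus.yaml
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus.example.internal:9090  # hypothetical endpoint
    isDefault: true
```

Grafana reads these files at startup, so the data source exists from the first boot without any API calls.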
Dashboards are still fully backward compatible. In his presentation, Torkel imported a dashboard that had been created with the version 1 series of Grafana into a version 5 installation, and it looked exactly the same. It just doesn't work the other way around.
Meanwhile, Grafana 5.2 is due to arrive soon.
Trends in the Grafana universe
One trend visible at GrafanaCon 2018 is that more and more time-series database systems (TSDBs) are being developed and integrated as supported Grafana data sources.
IRONdb is designed for scalability, performance and data retention while being compatible with the Graphite data format.
Timescale is based on the popular RDBMS PostgreSQL and is therefore natively compatible with SQL. The developers did, however, introduce many changes in the underlying engine (e.g. native sharding) to increase performance for reading and writing time-series data.
Even though new TSDBs are on the horizon, Prometheus seems to be emerging as the de-facto standard for storing time-series data. One contributing factor might be that Prometheus integrates very well with modern, containerized service infrastructure.
Another trend: it's no longer only system metrics and performance data that are being collected and graphed. Data that isn't related to IT topics at all is being visualized with Grafana.
One example that was presented is the Hiveeyes project, which aims to monitor beehives. By measuring the hive's weight, a dashboard can show when the bees are actually leaving and returning to the hive.
Erwin de Keijzer presented a solution for monitoring the electricity consumption in his apartment, based on a Raspberry Pi Zero W, Prometheus and Grafana. By reading data from a smart meter, he's now able to see whether his washing machine is finished just by glancing at a Grafana dashboard.
The RED method
In his talk, Tom Wilkie contrasted two monitoring methodologies. The established USE method describes each resource via three metrics:
- Utilization: Percentage of time that a resource was busy
- Saturation: Amount of work that a resource has (e.g. queue lengths)
- Errors: Count of error events
For physical components like CPU, memory and disk, the USE method defines an appropriate approach to monitoring.
In today's "cloudy" world of HTTP microservices and "serverless" components, where developers rarely look at server metrics, a new, modern way of monitoring the health of these services is required. Hence Tom explained the RED method:
- Rate: Number of requests per second
- Errors: Number of failing requests
- Duration: The amount of time that requests take
Implementing this method across all services gives a consistent view of how the whole architecture is behaving. The RED method is also an excellent indicator of how happy the users are, while the USE method is more about how "happy" the servers are.
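To make the three RED signals concrete, here is a minimal sketch that derives them from a list of request records; the sample data, the 60-second window and all names are illustrative assumptions, not from the talk:

```python
# Each record is (duration in seconds, success flag), observed over a
# hypothetical 60-second window.
requests = [
    (0.120, True),
    (0.340, False),   # a failing request
    (0.095, True),
    (0.210, True),
]
WINDOW_SECONDS = 60

# Rate: requests per second over the window.
rate = len(requests) / WINDOW_SECONDS

# Errors: number of failing requests.
errors = sum(1 for _, ok in requests if not ok)

# Duration: a latency distribution; the median is shown here.
durations = sorted(d for d, _ in requests)
p50 = durations[len(durations) // 2]

print(f"rate={rate:.3f} req/s, errors={errors}, p50={p50 * 1000:.0f} ms")
```

In practice these numbers would of course come from an instrumentation library and be scraped by a TSDB rather than computed by hand, but the three quantities stay the same.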
In my personal opinion there is still much potential for using Grafana within Delivery Hero.
Although it is already in place in some of our entities and departments, it could be made even more transparent and accessible. The permission control features of Grafana 5+ make it suitable for delegating more control to the users. By correlating business data from different data sources, dashboards of higher value could be created.
By implementing the aforementioned RED method for our global (micro-)services, a certain level of comparability across them could be reached.
Last but not least, I'm looking forward to the next GrafanaCon, which will hopefully take place in a warmer location. Even though Amsterdam is an absolutely beautiful city, it was freezing cold at the beginning of March, and the windchill made it even worse. But the organizers were well prepared for this situation and handed out Grafana-branded scarves, which made it possible to identify attendees all around the city even a day or two after the conference had ended. 😉