Grafana Ipsec

PUBLISHED ON 30/11/2017 — DEVELOPMENT

A duct-tape example of sticking a few hacks together to display strongSwan IPsec statistics in Grafana.

downloaded.png

Parser

Custom Python and shell scripts collect the data (bytes in, bytes out, packets in, packets out, the number of clients and their names) from the output of the ipsec statusall command:

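#!/bin/bash

# Capture the status once so both pipelines see the same snapshot
STATUS=$(/usr/sbin/ipsec statusall)
# Client names, one per ESTABLISHED connection
echo "$STATUS" | grep ESTABLISHED | sed 's/\[/ /g' | sed 's/\]/ /g' | tr -s " " | cut -d" " -f12
# Counters, one line per connection: bytes_i pkt_i bytes_o pkt_o
echo "$STATUS" | grep -e "bytes_i" | cut -d":" -f2 | sed 's/[^0-9 ]//g' | tr -s " " | cut -d" " -f3,4,6,7

The Python script, /opt/ipsec-stats/ipsec-telegraf.py, runs the shell script above (saved as /opt/ipsec-stats/stats.sh) and prints one line of InfluxDB line protocol per client:

#!/usr/bin/env python3

import subprocess

# Run the shell script and capture its output
output = subprocess.getoutput('/opt/ipsec-stats/stats.sh')

# The first half of the lines are client names, the second half their counters
A = output.splitlines()
clients = A[:len(A) // 2]
stats = A[len(A) // 2:]

seznam = list(zip(clients, stats))

# Print one record per client
for client, counters in seznam:
    bytes_i, pkt_i, bytes_o, pkt_o = counters.split()[:4]
    print("ipsec,client=" + client + " bytes_i=" + bytes_i + ",bytes_o=" + bytes_o
          + ",clients=" + str(len(seznam)) + ",pkt_i=" + pkt_i + ",pkt_o=" + pkt_o)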
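For reference, each printed line is InfluxDB line protocol: the measurement name and tags, a space, then comma-separated fields. The sketch below rebuilds one such line from a client name and a counters line. The function names `escape_tag` and `to_line` and the sample values are made up for illustration, and the tag escaping is an addition that the script above does not do.

```python
def escape_tag(value):
    """Escape the characters that line protocol treats specially in tag values."""
    for ch in (",", "=", " "):
        value = value.replace(ch, "\\" + ch)
    return value


def to_line(client, counters, n_clients):
    """Build one line-protocol record from a client name and its counters line.

    counters is expected in the order the shell script prints it:
    bytes_i pkt_i bytes_o pkt_o
    """
    bytes_i, pkt_i, bytes_o, pkt_o = counters.split()[:4]
    return (
        "ipsec,client={} bytes_i={},bytes_o={},clients={},pkt_i={},pkt_o={}"
        .format(escape_tag(client), bytes_i, bytes_o, n_clients, pkt_i, pkt_o)
    )


# Hypothetical sample values, not real output
print(to_line("laptop", "1200 10 3400 12", 2))
```

Escaping matters only if a client name contains a space, comma, or equals sign; without it such a name would break the record.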

Collecting data

Telegraf collects the data and pushes it to InfluxDB.

First, change the group of the charon socket to telegraf so that Telegraf has permission to read the status. I first tried allowing the user ‘telegraf’ to run the ipsec command through sudo, but that left a mess of sudo alerts in the logs. Then I had a cron job that ran ipsec statusall every minute as root and dumped the output to a text file, which was then parsed. That was a bad idea: the data was stale and refreshed only once per minute. So I settled for this hack. I believe you will need to repeat it every time you restart the ipsec service:

chgrp telegraf /var/run/charon.ctl
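Since the group change may have to be reapplied, a quick check can tell whether the socket still has the right group. A minimal sketch, assuming the socket path and group name from the command above; the function name `has_group` is made up for illustration.

```python
import grp
import os


def has_group(path, group):
    """Return True if the file at path belongs to the named group."""
    gid = os.stat(path).st_gid
    try:
        name = grp.getgrgid(gid).gr_name
    except KeyError:
        # The gid has no entry in the group database
        return False
    return name == group


if __name__ == "__main__":
    path = "/var/run/charon.ctl"
    if os.path.exists(path):
        print("ok" if has_group(path, "telegraf") else "run chgrp again")
    else:
        print("socket not found; is ipsec running?")
```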

Call the Python script from an inputs.exec block in /etc/telegraf/telegraf.conf, so Telegraf starts doing its magic.

# Read metrics from one or more commands that can output to stdout
[[inputs.exec]]
  commands = [
    "/opt/ipsec-stats/ipsec-telegraf.py"
  ]
  timeout = "5s"
  name_suffix = "_info"
  data_format = "influx"

Restart the telegraf service:

service telegraf restart

Visualization

Set up the query in Grafana; mine looks like this. Use a derivative, because the data is in fact a counter that only accumulates, and the interesting value is its rate of change.

uploaded.png

PS: Yeah, I have no idea why the data goes to negative values, got to check that out.
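One possible explanation, offered as a guess rather than something verified on this setup: the byte and packet counters start again from zero when a tunnel reconnects or is rekeyed, so a plain derivative sees a large negative step. InfluxQL has non_negative_derivative() for this case. The sketch below shows the same idea in Python on made-up samples; the function name `rates` is hypothetical.

```python
def rates(samples):
    """Per-second rates from (timestamp, counter) samples, skipping resets.

    A sample whose counter is lower than the previous one is treated as a
    counter reset and produces no rate, instead of a negative value.
    """
    out = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if t1 <= t0 or v1 < v0:
            continue
        out.append((t1, (v1 - v0) / (t1 - t0)))
    return out


# Made-up samples: the counter drops back to a small value at t=30
samples = [(0, 1000), (10, 2000), (20, 3500), (30, 200), (40, 700)]
print(rates(samples))
```

The sample at t=30 yields no point at all, so the graph shows a short gap where it would otherwise dip below zero.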