Tuesday 5 December 2017

Benchmarking an Elasticsearch and Logstash pipeline

Elasticsearch and Logstash pipelines can be simple or elaborate. Whatever the setup, it is worth benchmarking the pipeline end to end from time to time. One way to do this is with a marker document (a log line carrying a known, unique message): record when the marker is introduced into the pipeline and when it finally becomes available to query. The difference between the two timestamps is the end-to-end latency of the pipeline.
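As a rough illustration of the first half of that idea, the sketch below generates a marker and appends it to a log file, returning the time of the write. It is written for Python 3 and rests on assumptions that are not part of the original setup: the function names `make_marker` and `inject_marker` are hypothetical, and it assumes Logstash is tailing the file at the path you pass in. The marker is kept to letters and digits so that it is likely to survive tokenization as a single term.

```python
import time
import uuid
from datetime import datetime


def make_marker(prefix="marker"):
    """Return a unique, purely alphanumeric marker token (hypothetical helper)."""
    return prefix + uuid.uuid4().hex


def inject_marker(log_path, marker):
    """Append the marker as one log line and return the epoch time of the write.

    Assumes the pipeline picks up new lines appended to log_path.
    """
    injected_at = time.time()
    with open(log_path, "a") as log_file:
        log_file.write("{} {}\n".format(datetime.now().isoformat(), marker))
    return injected_at
```

The returned timestamp can be compared with the time at which the marker is first found by a search, which gives the latency in seconds.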

import urllib2
import json
from datetime import datetime
import sys
import time

# Check the arguments first so that a usage error does not print a start time.
if len(sys.argv) < 2:
    print "URL not specified.\nUsage: watch.py <search_url>"
    sys.exit(1)

print "Started at " + str(datetime.now())

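# Poll the search URL every two seconds until the marker document is indexed.
while True:
    resp = urllib2.urlopen(sys.argv[1]).read()
    # hits.total is the number of documents matching the marker query.
    count = json.loads(resp)["hits"]["total"]
    if count > 0:
        print "Found at " + str(datetime.now())
        break
    time.sleep(2)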


The script above can be used as follows, started at the moment the marker is introduced into the pipeline:

python <script_file> "http://<host_name>:<port_number>/_search?q=message:<markerMessage>"
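For newer environments, the same polling idea might look like the following sketch. It is written for Python 3 (using `urllib.request` rather than `urllib2`) and is not the original script: the function names, the timeout, and the injectable `fetch` parameter are all assumptions made for illustration. It also allows for the fact that Elasticsearch 7 and later report `hits.total` as an object with a `value` field rather than as a plain number.

```python
import json
import time
from urllib.parse import quote
from urllib.request import urlopen


def build_search_url(host, port, marker):
    """Build the URI search URL for the marker, percent-encoding the marker."""
    return "http://{}:{}/_search?q=message:{}".format(
        host, port, quote(marker, safe=""))


def extract_total(body):
    """Return the hit count from a search response body (a JSON string)."""
    total = json.loads(body)["hits"]["total"]
    # Elasticsearch 7+ returns {"value": N, "relation": ...}; older versions return N.
    if isinstance(total, dict):
        return total["value"]
    return total


def fetch_body(url):
    """Fetch a URL and return the response body as text."""
    return urlopen(url).read().decode("utf-8")


def wait_for_marker(url, interval=2.0, timeout=300.0, fetch=fetch_body):
    """Poll url until the marker is found.

    Returns the elapsed seconds, or None if the timeout passes first.
    """
    started = time.monotonic()
    while True:
        if extract_total(fetch(url)) > 0:
            return time.monotonic() - started
        if time.monotonic() - started >= timeout:
            return None
        time.sleep(interval)
```

Calling `wait_for_marker(build_search_url(host, port, marker))` right after the marker is written would give an approximate end-to-end latency, accurate to within the polling interval. The timeout is there so that a marker that never arrives does not leave the script running indefinitely.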