RabbitMQ vs Redis as Message Brokers

2013-10-26 » Message Brokers, Python, Python External Library

I have been looking into job queues for one of my personal projects. This excellent post by Muriel Salvan, A quick message queue benchmark: ActiveMQ, RabbitMQ, HornetQ, QPID, Apollo, gives a good comparison of popular message brokers. The consensus is on RabbitMQ, which is well established, but one upcoming option not covered there is Redis. With its recent support for PubSub, it is shaping up as a strong contender.

Advantages of RabbitMQ

  • Highly customizable routing
  • Persistent queues

Advantages of Redis

  • High speed due to its in-memory datastore
  • Can double up as both a key-value datastore and a job queue

Since I’m working in Python, I decided to go with Celery. I tested both RabbitMQ and Redis by adding 100,000 messages to the queue and using a worker to process the queued messages. The test was run three times and averaged. The Celery worker doesn’t seem to have a burst mode, i.e. the worker does not exit once all the messages in the queue are processed, so I had to use the next best approximation: the timestamps in the log messages.

tasks.py has the task definition and the message broker to use.

from celery import Celery

# Use the RabbitMQ (AMQP) broker; swap in the commented line to use Redis instead.
celery = Celery('tasks', broker='amqp://guest@localhost//')
#celery = Celery('tasks', broker='redis://localhost:6379/0')

@celery.task
def newtask(somestr, dt, value):
    # A no-op task: we only want to measure broker throughput, not task work.
    pass

test.py does the actual adding of the tasks to the queue and times it.

from tasks import newtask
from datetime import datetime
import time

dt = datetime.utcnow()
st_time = time.time()
# Enqueue 100,000 no-op tasks and report how long the enqueueing took.
for i in xrange(100000):
    newtask.delay('shortstring', dt, 67.8)
print time.time() - st_time

The Celery worker retrieves the messages off the queue; it is started by running the command

time celery -A tasks worker --loglevel=info  -f tasks.log --concurrency 1

--concurrency indicates how many simultaneous worker processes to run, and -f indicates the logfile to use. We can infer the time taken for the run from the timestamp of the log line for the last processed message. Next, we need to estimate the time the worker spends on its INFO-level logging and deduct it from the total: the following script replays the worker's log lines through a similarly configured logger and times it.

import logging
import sys
import time
logger = logging.getLogger('MainProcess')
hdlr = logging.FileHandler('/tmp/myapp.log')
formatter = logging.Formatter('%(asctime)s %(levelname)s %(message)s')
hdlr.setFormatter(formatter)
logger.addHandler(hdlr) 
logger.setLevel(logging.INFO)

def main():
    # Replay each logfile's messages through the logger above and time it,
    # to estimate the worker's logging overhead.
    for inputf in sys.argv[1:]:
        loglines = file(inputf).readlines()
        # Strip the original "[timestamp: level/process]" prefix, keeping the message.
        loglines = [line.split(']', 1)[1].strip() for line in loglines]
        st_time = time.time()
        for line in loglines:
            logger.info(line)
        print inputf, time.time() - st_time

if __name__ == "__main__":
    main()
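To get the elapsed time of a run from the worker log itself, the first and last timestamps can be parsed and subtracted. Here is a minimal sketch (the log lines are made-up examples, assuming Celery's default asctime format with a comma before the milliseconds):

```python
from datetime import datetime

# Hypothetical first and last lines from a worker log, in Celery's default
# "[YYYY-MM-DD HH:MM:SS,mmm: LEVEL/Process] message" format.
first = "[2013-10-26 10:15:02,123: INFO/MainProcess] Received task: tasks.newtask"
last = "[2013-10-26 10:17:32,456: INFO/MainProcess] Task tasks.newtask succeeded"

def log_timestamp(line):
    # The timestamp is a fixed-width 23-character field right after '['.
    return datetime.strptime(line[1:24], "%Y-%m-%d %H:%M:%S,%f")

elapsed = (log_timestamp(last) - log_timestamp(first)).total_seconds()
print(elapsed)  # 150.333
```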

Here is the tabulation of the results; each trial consists of 100,000 messages. It is apparent that RabbitMQ takes about 76% of Redis' time to add a message and about 86% of the time to process one. Since the message processing capacities are so close, the decision comes down to features: if you want sophisticated routing capabilities, go with RabbitMQ; if you need an in-memory key-value store as well, go with Redis.

Activity (times in seconds)                    Trial 1    Trial 2    Trial 3    Average    Per Message
RabbitMQ - Adding Message to Queue              56.96      54.18      57.13      56.09     0.0005609
Redis - Adding Message to Queue                 68.81      76.52      76.95      74.09     0.0007409
RabbitMQ - Processing Messages off the Queue   122.406    132.55     195.885    150.28     0.0015028
Redis - Processing Messages off the Queue      157.59     177.774    186.332    173.90     0.0017390
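As a quick sanity check on the table, the averages and the quoted ratios can be recomputed from the per-trial numbers:

```python
# Per-trial timings (seconds) copied from the table above.
rabbit_add = [56.96, 54.18, 57.13]
redis_add = [68.81, 76.52, 76.95]
rabbit_proc = [122.406, 132.55, 195.885]
redis_proc = [157.59, 177.774, 186.332]

def avg(xs):
    return sum(xs) / len(xs)

# RabbitMQ takes roughly 76% of Redis' time to add a message...
print(round(avg(rabbit_add) / avg(redis_add), 2))    # 0.76
# ...and roughly 86% of the time to process one.
print(round(avg(rabbit_proc) / avg(redis_proc), 2))  # 0.86
```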