12.5.2011
Scalable Performance: using ZeroMQ in your next application
If you are like me, you will try to abstract communication between threads or sub-processes within your application so that it works the same way whether it runs on a single CPU core or a thousand CPU cores. One of the easiest ways to achieve this is through a queue system such as RabbitMQ, ActiveMQ, or Gearman, but I suggest using ZeroMQ.
Despite having “MQ” in the name, ZeroMQ is not your typical queue system. Instead, it’s a socket library that carries messages over TCP, IPC, inproc, and multicast, allowing you to transmit messages in the same manner between threads, processes, or machines on a network.
IPC is usually used for communication between processes running on the same computer and uses UNIX domain sockets in the background. Unfortunately it’s only available on UNIX and GNU/Linux platforms at the moment.
Inproc passes messages directly in memory between threads, so no I/O threads are involved. The downside of the inproc transport is that both sockets must be created from the same context instance.
Multicast (PGM) can only be used with PUB/SUB sockets and is used to reliably deliver the same message to multiple clients over an IP network. PGM is rate-limited by default, which means that there is a potential performance penalty if it is used over a loopback interface.
The TCP transport provides generic unicast messaging over an IP network. Used over a loopback interface, it works in the same manner as IPC.
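To make the "same code, different transport" point concrete, here is a minimal sketch using the pyzmq bindings (my choice for illustration; any binding would do). The helper names `echo_once` and `round_trip` and the `echo-` endpoint prefix are made up for this example. The only thing that depends on the transport is the endpoint string; the send and receive calls are identical.

```python
import threading
import uuid

import zmq


def echo_once(server):
    """Answer exactly one request with the same bytes, upper-cased."""
    server.send(server.recv().upper())


def round_trip(transport, payload):
    """Send payload to an echo thread over 'inproc' or 'tcp'; return the reply."""
    ctx = zmq.Context.instance()  # inproc peers must share one context
    server = ctx.socket(zmq.REP)
    client = ctx.socket(zmq.REQ)
    for sock in (server, client):
        sock.setsockopt(zmq.RCVTIMEO, 2000)  # fail instead of hanging forever

    # Only the endpoint string depends on the transport.
    if transport == "tcp":
        port = server.bind_to_random_port("tcp://127.0.0.1")
        endpoint = "tcp://127.0.0.1:%d" % port
    else:
        # Hypothetical endpoint name, made unique so repeated calls do not clash.
        endpoint = "inproc://echo-" + uuid.uuid4().hex
        server.bind(endpoint)

    # The server socket is handed over to the worker thread and is not
    # touched by this thread again until the worker has finished.
    worker = threading.Thread(target=echo_once, args=(server,))
    worker.start()

    client.connect(endpoint)
    client.send(payload)
    reply = client.recv()

    worker.join()
    client.close()
    server.close()
    return reply
```

Calling `round_trip("inproc", b"ping")` and `round_trip("tcp", b"ping")` should both return `b"PING"`; swapping in an `ipc://` endpoint on a UNIX system would follow the same shape.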
ZeroMQ sockets will not work with regular, non-ZeroMQ sockets. This means that you cannot build an HTTP server (for example) on top of ZeroMQ and expect it to work with browsers that do not support ZeroMQ (I don’t know of any browsers that support ZeroMQ, although I strongly believe that they should).
ZeroMQ provides multiple messaging patterns: request/reply, publish/subscribe, pipeline, and exclusive pair. Each pattern provides its own routing strategy and directionality.
Request/reply provides bidirectional communication with load-balanced outgoing message routing and last-peer incoming message routing. In this pattern a ZMQ_REQ socket (client) sends a message to one of the ZMQ_REP sockets (service/server) that it’s connected to and receives a response from it. If a client socket is connected to multiple service sockets, the requests it originates are distributed among them in a load-balanced (round-robin) fashion. A ZMQ_REP socket keeps track of which client each request came from and routes the reply back to it, but it hands the application only the message that the client sent. A service that needs to see the client socket’s unique identifier as a separate part of a multi-part message (array/tuple) has to use a ZMQ_XREP socket (named ZMQ_ROUTER in newer releases) instead.
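The load balancing of requests can be seen in a small single-threaded sketch, again assuming pyzmq. The function name `demo_load_balancing`, the service names `A` and `B`, and the endpoint prefix are invented for the example. One ZMQ_REQ socket connects to two ZMQ_REP sockets over inproc, and a poller finds out which service received each request.

```python
import uuid

import zmq


def demo_load_balancing(n_requests=4):
    """Return the name of the service that answered each request, in order."""
    ctx = zmq.Context.instance()
    client = ctx.socket(zmq.REQ)
    client.setsockopt(zmq.RCVTIMEO, 2000)  # fail instead of hanging forever
    poller = zmq.Poller()
    services = {}  # REP socket -> service name

    run_id = uuid.uuid4().hex  # keeps the hypothetical endpoint names unique
    for name in (b"A", b"B"):
        rep = ctx.socket(zmq.REP)
        endpoint = "inproc://service-%s-%s" % (name.decode(), run_id)
        rep.bind(endpoint)
        client.connect(endpoint)  # one client, connected to both services
        poller.register(rep, zmq.POLLIN)
        services[rep] = name

    answered_by = []
    for i in range(n_requests):
        client.send(b"request %d" % i)
        # Find the service that was handed this request and let it reply.
        for rep, _event in poller.poll(1000):
            request = rep.recv()
            rep.send_multipart([services[rep], request])
        answered_by.append(client.recv_multipart()[0])

    client.close()
    for rep in services:
        rep.close()
    return answered_by
```

With both services connected before the first request is sent, the returned names should alternate between the two services. Over TCP the connections are established asynchronously, so the first few requests may not be spread as evenly.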
Publish/subscribe is used for unidirectional communication with fan-out outgoing message routing. Similar to ZMQ_REP sockets, ZMQ_PUB sockets act as a service and ZMQ_SUB sockets act as clients. A service socket sends multi-part messages consisting of a topic and a message body. Client sockets subscribe to the specific topics that they want to receive messages about; a subscription is matched as a prefix of the message, so an empty subscription receives everything.
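A minimal topic-filtering sketch, assuming pyzmq, looks like the following. The function name `demo_pubsub`, the topics, and the endpoint prefix are made up. The short sleep is a crude way of giving the subscription time to reach the publisher; real code would need a proper synchronisation step, because a publisher silently drops messages that nobody has subscribed to yet.

```python
import time
import uuid

import zmq


def demo_pubsub():
    """Publish two topics and return what a 'weather' subscriber receives."""
    ctx = zmq.Context.instance()
    endpoint = "inproc://news-" + uuid.uuid4().hex  # hypothetical endpoint name

    pub = ctx.socket(zmq.PUB)
    pub.bind(endpoint)

    sub = ctx.socket(zmq.SUB)
    sub.setsockopt(zmq.RCVTIMEO, 500)  # stop waiting once nothing is left
    sub.connect(endpoint)
    sub.setsockopt(zmq.SUBSCRIBE, b"weather")  # prefix match on the first part

    time.sleep(0.1)  # crude: let the subscription settle before publishing

    # Each message is [topic, body]; only matching topics are delivered.
    pub.send_multipart([b"sports", b"home team wins"])
    pub.send_multipart([b"weather", b"sunny"])

    received = []
    try:
        while True:
            received.append(sub.recv_multipart())
    except zmq.Again:
        pass  # receive timed out: nothing more to read

    sub.close()
    pub.close()
    return received
```

Here `demo_pubsub()` returns `[[b"weather", b"sunny"]]`: the sports message is never delivered to this subscriber.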
Pipeline provides reliable unidirectional communication with load-balanced outgoing message routing. ZMQ_PUSH sockets are used on the service side and ZMQ_PULL sockets are used on the client (node) side. This pattern is used to reliably distribute messages from a service to multiple nodes that are connected to it. If there are no connected nodes, the service will block on send until at least one node connects to it. Unlike the publish/subscribe pattern, a message is delivered to only one node instead of to all nodes connected to the service.
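The difference from publish/subscribe shows up in a short sketch, again assuming pyzmq and using invented names (`demo_pipeline`, the `tasks-` endpoint prefix). One ZMQ_PUSH socket feeds two ZMQ_PULL sockets, and each task ends up at exactly one of them.

```python
import uuid

import zmq


def demo_pipeline(n_tasks=6):
    """Push n_tasks messages to two nodes; return the tasks each node received."""
    ctx = zmq.Context.instance()
    endpoint = "inproc://tasks-" + uuid.uuid4().hex  # hypothetical endpoint name

    push = ctx.socket(zmq.PUSH)
    push.setsockopt(zmq.SNDTIMEO, 1000)  # do not block forever without nodes
    push.bind(endpoint)

    nodes = []
    for _ in range(2):
        pull = ctx.socket(zmq.PULL)
        pull.setsockopt(zmq.RCVTIMEO, 200)  # stop waiting once nothing is left
        pull.connect(endpoint)
        nodes.append(pull)

    # Each task is handed to exactly one of the connected nodes.
    for i in range(n_tasks):
        push.send(b"task %d" % i)

    per_node = []
    for pull in nodes:
        tasks = []
        try:
            while True:
                tasks.append(pull.recv())
        except zmq.Again:
            pass  # receive timed out: this node has drained its share
        per_node.append(tasks)
        pull.close()

    push.close()
    return per_node
```

Every task appears in exactly one of the two returned lists. With both nodes connected before the first send, the tasks should be split between them rather than duplicated.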
Exclusive pair provides bidirectional communication between two ZMQ_PAIR sockets. ZMQ_PAIR sockets are designed for inter-thread communication across the inproc transport and do not implement functionality such as auto-reconnection. ZMQ_PAIR sockets are considered experimental and may have other missing or broken aspects.
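As a sketch of the inter-thread use case, assuming pyzmq, a parent thread and a worker thread can talk over a pair of ZMQ_PAIR sockets on the inproc transport. The names `pair_worker` and `run_pair_demo` and the endpoint prefix are made up for the example; the worker simply reverses the bytes it is given.

```python
import threading
import uuid

import zmq


def pair_worker(ctx, endpoint):
    """Worker thread: receive one job, send back the reversed bytes."""
    pipe = ctx.socket(zmq.PAIR)
    pipe.setsockopt(zmq.RCVTIMEO, 2000)  # fail instead of hanging forever
    pipe.connect(endpoint)
    job = pipe.recv()
    pipe.send(job[::-1])
    pipe.close()


def run_pair_demo(payload=b"hello"):
    """Send payload to a worker thread over inproc and return its answer."""
    ctx = zmq.Context.instance()  # both ends must share this context
    endpoint = "inproc://pair-" + uuid.uuid4().hex  # hypothetical endpoint name

    parent = ctx.socket(zmq.PAIR)
    parent.setsockopt(zmq.RCVTIMEO, 2000)
    parent.bind(endpoint)  # bind before the worker tries to connect

    worker = threading.Thread(target=pair_worker, args=(ctx, endpoint))
    worker.start()

    parent.send(payload)  # waits until the worker has connected
    result = parent.recv()

    worker.join()
    parent.close()
    return result
```

Calling `run_pair_demo(b"hello")` returns `b"olleh"`. Each socket is created and used by a single thread; only the context is shared between them.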