I don’t usually just re-post stuff, but this article really hit the spot for me.
I like message queues – they are a wonderfully useful abstraction. Want to send a message from one component to another in a distributed system? Don’t let the components talk to each other directly – decoupling your components makes them more robust, and makes it far easier to swap in other service implementations at a later date.
But I am not here to discuss message queues in general. I’m here to discuss everyone’s (well, most people’s) favorite pub/sub system – Apache Kafka. Kafka is just there. Anyone doing big data, streaming analytics, or anything else that requires reading in lots of data from different sources and pumping it into a processing system is either using Kafka or evaluating it. It’s that successful.
But it’s not a message queue. It was not designed to be one, and it makes no pretense of competing with the likes of RabbitMQ and ActiveMQ in that role.
But it’s there. If you already have a Kafka cluster all set up and running nicely, maybe even running across data centers (Kafka was built for this!), and you don’t need spectacular performance from your message queue, can you somehow use Kafka and spare yourself the setup and maintenance of another clustered system?
I was wondering how to go about this when I found this post on SoftwareMill’s feed. It’s worth a read: