Confluent’s Jun Rao on turning Kafka into a thriving venture

Image courtesy of Conny Schneider.

There is a massive difference between developing a new technology and establishing a business from scratch, and Jun Rao, co-creator of Kafka and, subsequently, co-founder of Confluent, certainly knows where that line is, having crossed it himself.

In this second of an exclusive, two-part interview, Rao addresses some of the burning industry questions pertaining to Kafka, and recounts how he, Jay Kreps, and Neha Narkhede went from innovators to business owners.

According to some solutions architects, Kafka is not quite as scalable as it's made out to be, especially in the IoT space. Is that something that you agree with? Or is that just a marketing tactic they use?

It is true. That concern probably comes from where solutions architects get started: IoT and edge use cases. With Kafka, we primarily started with the back-end system, which serves as a central place to integrate all the data collected from the edge. That's how Kafka was initially designed. However, strictly speaking, nothing prevents Kafka from being used at the edge, especially with some of the work we are doing to make Kafka deployment much easier.
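To make that pattern concrete, here is a minimal sketch of an edge device publishing readings to a central back-end cluster, using Kafka's standard Java producer client. The topic name, broker address, and payload are illustrative assumptions, not details from the interview.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class EdgeTelemetryProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Address of the central back-end cluster (hypothetical).
        props.put("bootstrap.servers", "kafka.backend.example.com:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Key by device ID so all readings from one device land in the
            // same partition and stay ordered. Topic name is illustrative.
            ProducerRecord<String, String> record =
                new ProducerRecord<>("edge-telemetry", "device-42", "{\"temp\": 21.5}");
            producer.send(record);
        }
    }
}
```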

For example, we are actively working on reducing, and ultimately eliminating, the reliance on ZooKeeper and bringing the control plane within Kafka itself. With that model, you get a much leaner deployment for a smaller-scale version of Kafka, one that is more easily deployable at the edge.
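As a rough illustration of what that looks like in practice, here is a minimal single-node configuration for Kafka's ZooKeeper-free (KRaft) mode, where one process acts as both broker and controller. The specific values are illustrative assumptions, not recommendations.

```properties
# Minimal single-node KRaft configuration (illustrative values).
# One process plays both roles, so no external ZooKeeper ensemble is needed.
process.roles=broker,controller
node.id=1

# The controller quorum consists of just this one node.
controller.quorum.voters=1@localhost:9093

# Separate listeners for client traffic and controller traffic.
listeners=PLAINTEXT://:9092,CONTROLLER://:9093
controller.listener.names=CONTROLLER
listener.security.protocol.map=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT

log.dirs=/tmp/kraft-combined-logs
```

In current Kafka releases the log directory also has to be formatted with a cluster ID (via the kafka-storage.sh tool) before first start; the point of the sketch is simply that the entire control plane now lives in the same process, which is what makes small edge deployments feasible.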

Is it true that you still need in-house Kafka experts, even when you deploy Confluent? Does that remain the case? Let’s say someone wants to deploy Kafka, and they use Confluent to do that. Do they still need a lot of in-house expertise?

We believe that the easiest way to adopt many of these newer infrastructure technologies, especially distributed systems, is through a fully managed service. For the majority of companies, managing a distributed system is not their core business. Instead, they want to use cutting-edge technology to power their applications. Thus, it’s probably not in their best interest to handle the infrastructure themselves. This is where the cloud presents an opportunity, as vendors like Confluent can offer fully managed services and technologies like Kafka in the cloud. This means our users and customers don’t have to handle the management themselves; they simply need to be users.

By taking this approach, we're helping drive the adoption of these technologies. Using technology as a service is much easier than self-management. This is our area of focus, and we continuously invest in evolving and innovating in this space. We aim to provide the most up-to-date capabilities that people want, so this is the best use of resources in the specific area we concentrate on. We believe we can provide the biggest value not only through software, but also in terms of operational support. This allows users to adopt the technology and focus their resources on their core business.

Talking a little bit about the business side, you went from LinkedIn straight to co-founding Confluent in September 2014. What was the transition like, from developing the technology to actually running a business? How did you manage to do that?

In general, starting a company is not easy. But in our case, it was somewhat smoother because of our prior work on Kafka for several years, which had also been open-sourced. Through open source, we observed a significant demand for this technology, not only from tech companies like LinkedIn, but across various industries.

Jun Rao, co-founder of Confluent. Image courtesy of Confluent.

When we started the company, we already knew a few things. Firstly, we knew that there was a pretty strong potential market for Kafka, as evidenced by its usage at LinkedIn and the response to the open-source release. Secondly, we had confidence in the strength of our technology to cater to this market. These two aspects are critical components that most start-ups lack at the outset.

Typically, start-ups begin with an idea but are uncertain whether there is a market for it. The next step involves building the technology or product to validate the idea. After that, if you’re lucky, there is a waiting period of a couple of years before finding the right market fit. For us, the risk was somewhat mitigated due to the insights gained from open-sourcing, which provided us with a glimpse of the product-market fit.

Over time, we realised that the best way to build a business around open-source technology is through a cloud offering. For many companies, especially those adopting newer distributed technologies, the software itself is only one aspect. It can be sold, and it may even compete with commercial alternatives that require payment. However, it is only part of the overall cost of ownership; a significant portion lies in the operational side. As a vendor, we believe we can deliver significant value in this area, given our deep understanding of the system's internal workings. We consider this operational support a stronger value proposition for our users than simply providing additional add-on software while leaving them to manage it themselves.

So basically, you’ve gathered the best Kafka experts in the world?

Due to our early entry into this space, we had a strong understanding of the technology and its landscape. This allowed us to attract a strong core technology team by highlighting the potential opportunities in this field and fostering ongoing innovation. That's where our advantage lies. While we started with a small core team, over time we were able to attract other technologists and learn from them.

Looking towards the future, AI is now a part of everything. It seems like every company is releasing some sort of ChatGPT-like or private chat service. Data streaming and event streaming are going to be even more important moving forward. How do you see Kafka and Confluent evolving in the future?

The era where a company’s data is confined to a single location for all interesting operations has come to an end. We are witnessing the emergence of many different ways to leverage data, and these approaches can evolve over time. Today, some of this can be seen in different data analytics platforms or data warehouses. However, machine learning and artificial intelligence technologies represent the next wave of leveraging data.

In the case of Kafka, we offer a platform that facilitates the sharing of integrated data on a broad scale. Once the data is integrated, it can be used by an increasing number of applications over time, including AI applications and new machine learning technologies. The greater the diversity of usage and the broader the range of ways this data can be leveraged, the more value Kafka brings.
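A simple sketch of that fan-out, assuming Kafka's standard Java consumer client: each downstream application reads the same integrated topic under its own consumer group, so a new consumer (an ML feature pipeline, say) can be added without disturbing existing ones. The topic, group, and broker names are illustrative.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class FeaturePipelineConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        // Each application uses its own group ID, giving it an independent
        // cursor over the same shared stream of integrated events.
        props.put("group.id", "ml-feature-pipeline");
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("edge-telemetry"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    // Hand each event to the (hypothetical) downstream pipeline.
                    System.out.printf("key=%s value=%s%n", record.key(), record.value());
                }
            }
        }
    }
}
```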