RSA CTO on enterprise security foundations & handling Shadow IT

RSA CTO Zulfikar Ramzan talks about why so much of machine learning is marketing-speak or attempting to fit a square peg into a round hole, and how algorithms are in fact the least important part of machine learning. Finally, he offers some advice for CISOs and enterprises grappling with problems of Shadow IT.

Image courtesy of RSA.

Continued from Part I.

How do you see the security space?

Security has to start off on a foundation of visibility — you cannot protect what you cannot see. You cannot manage what you cannot see. Visibility has to be both broad and deep, covering all your key assets. Visibility on its own, though, is necessary but not sufficient. You have to be able to derive meaningful insights from that visibility — so you have to have analytics on top of that visibility. And ultimately, you take a set of actions against those insights.

If you want to develop secure infrastructure or think about security in any way, shape, or form, you have to kind of follow those three steps: visibility, insights, and action. 

Start off, first of all, with risk, which is the most important thing to consider. Quantification models can help you understand what your critical risks and assets are — because that’s what you have to focus on the most. You have a limited budget: spend it in the right way. To be able to understand and measure risk, you have to have visibility. The way to achieve that is by having the capability to monitor your digital assets.

Analytics built on top of visibility leads to meaningful insights. And that’s where we’ve applied, at RSA, approaches like machine learning — we’ve been doing it for 15-plus years in enterprise production environments. It has gotten a lot of buzz, but it’s not new. 

Has machine learning taken off recently because compute power has gotten to a point where it can be meaningfully deployed?

A few things have happened. Computing power has improved, which has been very helpful. Also, we live in a world where data has a significant amount of currency. For machine learning to be effective, you need good data to work with. Historically, data was not as abundant as it is today. On top of that, there have been improvements on the algorithmic side. The stuff from deep learning, for example, has been great — it has led to major advances in machine learning.

Interestingly enough, a lot of the stuff in deep learning is geared towards specific kinds of problems like image classification or speech recognition. It’s not necessarily good for a lot of other problems. 

But the rest of machine learning has benefited from the marketing hype. I always get upset when I see startups — especially in the security space — whose marketing’s all about deep learning and all these fancy things. And I think, “You don’t realize that you’re taking a tool that’s not meant for this problem and trying to make it fit, just to make some good marketing slides?”

They’re trying to fit a square peg in a round hole?

I think they largely are. In many cases, it’s a lack of understanding of how to make machine learning work correctly. What’s happened is that when anything becomes popular, everybody jumps onto it. But the subset of people who actually understand that technology deeply, who understand the mathematics behind it, who were there when these papers were first written and who understand the nuances — they look at this space very differently, right? But the problem is, those people are very few and far between. For every 100 startups in deep learning, maybe one or two will actually have the right bench of talent to understand what’s really happening and how to do it correctly.

I can’t tell you how many times I’ve seen somebody try to use a code library and completely mess it up — it’s just mind-boggling. My perspective is that the algorithm is actually the least important part of machine learning.

The most important part is data. If you don’t have good data, you cannot derive meaningful inferences — just like you cannot make good wine from bad grapes, or good coffee from bad beans, or whatever the right culinary analogy is. 

The second thing is to identify the critical features of that data that are relevant to the domain of the problem you’re trying to solve. Let’s say I’m trying to build an algorithm to determine whether a website URL is good or bad. Now, if you ask anybody who sits in a security operations center what they would do, the first thing would be to look up the domain and find out when it was registered. Because if the domain was registered seven years ago, it’s probably not a malicious site. If it was registered yesterday, my antennas are going to go up. Now, no machine learning algorithm is going to realize on its own that it has to do a whois lookup on the domain. It’s not something that’s inherently part of the data; it requires some domain expertise to say that we need to look at these kinds of things.
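As a rough illustration of that point, here is a minimal sketch of how that one domain-expertise feature might be computed, assuming the third-party python-whois package. The helper name and details are purely illustrative, not anything RSA ships.

```python
# Illustrative sketch only: computing "how old is this domain?" as a feature.
# Assumes the third-party python-whois package (pip install python-whois);
# registrars return creation dates in slightly different shapes.
from datetime import datetime, timezone
from typing import Optional

import whois


def domain_age_days(domain: str) -> Optional[float]:
    """Return the domain's age in days, or None if the lookup fails."""
    try:
        record = whois.whois(domain)
    except Exception:
        return None
    created = record.creation_date
    if isinstance(created, list):  # some registrars return several dates
        created = min(created)
    if created is None:
        return None
    if created.tzinfo is None:
        created = created.replace(tzinfo=timezone.utc)
    return (datetime.now(timezone.utc) - created).total_seconds() / 86400


# A domain registered seven years ago looks very different from one
# registered yesterday -- exactly the signal an analyst checks first.
print(domain_age_days("example.com"))
```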

The next critical facet after your data is identifying the salient characteristics — being very broad and picking a whole bunch of things that could be relevant. Then you apply a machine learning algorithm that can tell you how to combine those characteristics in potentially meaningful ways to drive some kind of assessment about that particular URL. If you take an expert who understands the field, they can always come up with a better one-off algorithm than any machine-learning approach can. But what makes machine learning powerful is that you can repeat it and scale it — as new data comes in, the models update automatically. Over time, it will start to beat out the human, mainly because of efficiency gains.
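To make that combining step concrete, here is a minimal sketch assuming scikit-learn. The feature values and labels are invented for illustration and are not RSA’s model; the point is only that the algorithm, not a human, learns how to weigh the hand-picked characteristics.

```python
# Illustrative sketch only: letting a model learn how to combine hand-picked
# URL features. Assumes scikit-learn; the data below is invented.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Each row: [domain_age_days, url_length, digits_in_url]
X = np.array([
    [2555, 20,  0],   # long-registered domain, short URL -> benign
    [   1, 95, 14],   # registered yesterday, long URL    -> malicious
    [3650, 35,  2],   # benign
    [   3, 80,  9],   # malicious
])
y = np.array([0, 1, 0, 1])  # 0 = benign, 1 = malicious

model = LogisticRegression(max_iter=1000).fit(X, y)

# Score a new URL's features: the model decides how to weigh domain age
# against the other characteristics.
print(model.predict_proba([[5, 60, 7]])[0, 1])

# When fresh labeled data arrives, refit on the larger set -- the repeatable,
# scalable step that eventually outpaces a one-off hand-tuned rule.
```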

It’s data first, domain expertise/features second, and algorithm third. Most people invert the order.

I’ve seen engineers who start tweaking parameters, and when I ask them, “Do you know what you’re tweaking?” they respond, “No, I just changed this from a 2 to a 3 and now my code works better.”

But they have no idea what they’re actually doing. In many cases, it works better in that one instance, but worse in the field. When you want to apply machine learning, you have to understand those fundamental primitives so you can use it correctly in the real world.

A lot of enterprises have shadow IT problems. As a technology manager, how do you enforce technological discipline within a company?

Ultimately, you have to realize that your employees are trying to get their work done effectively. They may not understand how to do that correctly or safely. At Elastica, where I was CTO, this is exactly the problem we were looking at. What we discovered is that you cannot force people to work a certain way. If somebody has to get their job done, they’re going to take a credit card and buy software — in fact, we had a customer who did this. They had an SVP who paid for licenses for a file-sharing application on his corporate credit card. IT had no idea this was going on, and he was able to expense licenses for a few thousand people every month. Amazing, right? The thing is — the intentions were good!

As a CEO or as a manager, your goal cannot be to enforce certain behavior. You have to understand the business problem they’re trying to solve and help them find a way to solve that problem.

The biggest impediment we’ve had in cybersecurity is that a lot of the old-school folks still have the binary mentality of “You should not do this, you should do this.” They don’t succeed, because in today’s world everything is digital and everything is cloud-based.

It’s very easy for someone to leverage any application they want in order to get their job done. The successful CISOs today understand that, and they find ways to enable the business to solve its problems while putting the right compensating controls in place, without being too restrictive.