Democratising analytics: All in a day’s work (Part I)


SAS, a United States-based analytics software firm, has been around since 1976. What began as a project at North Carolina State University is now a $3-billion+ giant, and today the largest privately-held software company in the world. Since the early days of data mining, the company has seen several trends in analytics come and go and, in some cases, resurface.

SAS has firmly committed to the cloud in recent years; earlier this year, SAS partnered with Microsoft to offer its Viya artificial intelligence, analytics, and data management platform on Microsoft’s Azure cloud platform. This partnership resulted in the use of SAS’s platform and Microsoft Azure to predict and manage flood events in the town of Cary, North Carolina, where the SAS headquarters are located.

Gavin Day, Senior Vice President of Technology at SAS, has been with the company for two decades, entering the company through the acquisition of Dataflux, a data management company. He has previously run SAS’s sales and consulting arms, and today he oversees technology development in the SAS R&D division, and is responsible for defining and communicating its technology vision, product management and product strategy. “I spend the majority of my time with customers, with the media, and with our analyst community, thinking and talking about where we are headed, and why we are going there,” he explains. We sat down virtually with Day to talk about his journey through analytics, what’s happening in the SAS R&D labs, and what’s next for analytics on the horizon.

You have been with SAS since well before there was cloud computing, and before AI and ML became a mainstream part of analytics. How has the journey been, and what was it like to see these developments in the field of analytics?

I think one very positive result from all of that is that it has really sped up the adoption of data and analytics from a customer-user perspective. Analytics now is not only for the highly educated or the highly skilled, or the people that have been doing analytics and statistics for years and decades. We have lowered the barrier of adoption now. So, if you are a data scientist, you have an entry point into analytics. If you are a business user and you want to use a drag-and-drop interface, we have that as well. Or if you are an open-source programmer, and you want to interact with SAS or another vendor, you also have an entry point. So I think lowering the barrier of entry has been really good.

Back to the question, it absolutely has been a journey. Analytics had been education- and academic-based for a very long time, and then we moved into mainframe servers, we had the Hadoop big data craze that we all had to figure out, and now the journey to the cloud has encompassed everyone. When I talk to customers or members of the media or the analyst community, I think we are going to continue to see this.

Two years ago, there was a big rush into the cloud, and COVID-19 has absolutely accelerated that. But now, customers are understanding that there is a significant expense to this, and there is a significant amount of planning that needs to go into it.

So, almost every large enterprise I am talking to is talking to me about having some type of hybrid cloud model, where they are utilising some workloads in the cloud, and they also have on-premise workloads that they are dealing with. That has led us to spend a lot of time making sure that our analytics are portable. This means that when a customer comes forward and says that they want to run on Azure, or they want to run on GCP, or they want to run on AWS, or any other cloud in the future, we will be able to support them. I do not want to get into a big debate on where they want to run it. I just need to say yes to their request.

The second part is that they want to run with R, with Python, with SAS, with whatever language they want. My answer to that has to be yes as well. That is where the last 18 months to two years have taken us, and certainly where we are headed moving forward.

What are some of the things that you are working on in your labs right now that excite you the most?

Our innovation with the cloud providers has definitely been one, specifically our work with Microsoft that we started in November last year, both from a foundational integration perspective and also now moving into the next level of integration with Power BI and Dynamics. The journey from SAS 9 to SAS Viya has also been just amazing.

As we moved from the monolithic software applications that everyone had to a fully containerised CI/CD model that allows us to do builds with weekly and monthly drops, we were looking forward to getting that done twice a month, and then every week as we move forward.

That was not only a huge change in culture and experience for our customers, but we also had to completely change and retool research and development.

We have a couple of thousand people, and we looked at all of our modern coding practices, such as how we have been writing and how we have been testing.

For the last 12 months, a lot of the work has been on embedding our AI and machine learning capabilities into everything that we do, and we are going to see a lot of it this year as well. I am never going to go, “Here is your AI toolkit, go and build something with it”. Instead, we are going to take AI and put it into everything that we do. Data management is a great example for this. Data quality and data integration are very user-intensive, but we can lower the barrier to entry with our AI tools and make it more accessible. That is extremely exciting to me because we have that lens across all of research and development.

Why do you do analytics? Well, you do it to make a decision, and we have customers that are using our decisioning platform for hundreds, thousands, and millions of decisions. When I think about that moving forward, there is the baseline decisioning platform that we have, and then we layer on risk decisioning, fraud decisioning, customer decisioning, lending decisioning, and so on. That then becomes a very powerful underpinning.

Lastly, not forgetting our heritage, are the investments that we continue to make in analytics. It is our bread and butter, and where we spend a significant amount of our research and development effort.

Even after a challenging year in 2020, we took 27% of our revenues and invested them back into R&D.

We want to make sure we balance the ‘R’ and the ‘D’ in R&D. We want to do plenty of development design, but we have to balance those internally.

From a product release point-of-view, how do you verticalise your analytics solutions for particular industry use cases?

First off, we want to make sure that the vertical is a core competency of SAS. It is very easy to try to be all things to all people and all customers, and so we want to make sure that we are focused on where we know we can be successful. Then, we listen to our customers. At the beginning of 2019, we changed some of our product management and product strategy practices.

Everything that we do that makes its way into engineering has a customer attached to it, has potential revenue attached to it, and has use cases attached to it.

This helps us know that everything we are doing is for our customer.

Then, we look at our core fundamental analytics and data management technologies. These two are generic in nature. This means that out of the gate, I am not going to make them just for banking, or just for any particular industry. We float these up, and our solutions consume them, whether it is risk / fraud, customer intelligence, Internet of Things (IoT), health, life sciences, or retail. They will take those and they will use the platform that comes out of R&D to make specific solutions, whether it is for money laundering, liquidity risk solutions, or IoT sensors. It is a balance, but the important part for us is that everything underneath is absolutely the same foundation and platform with Viya. That gives us the ability to innovate more quickly, but also control things such as migration and upgrades in a seamless way for customers. 

In 2020, if we look at where we had a lot of use cases and adoption, banking and government were two big industries for us from a growth perspective. Our fraud and risk technologies also continued to be very well-received. There is a balance there, and we want to make sure that when we talk about a solution, there has to be value-added IP in there. It is not just technology, but also the people on our side who have come from industry, or have done this before, that we can ask to go and work with our customers.

To you, what makes a successful data science team, and what makes a successful data scientist?

From a personality trait perspective, the first thing is creativity. They must want to seek out the hard problems and go to solve them. That is a great trait, and it is something that all of us at SAS wholeheartedly believe in. 

A high-functioning data science organisation must also have collaboration. Some of our customers have five data scientists, some have ten, some have hundreds, and there has to be a way for them to collaborate. This collaboration is not just about the technology that is used, but also the work that is being done. This includes making sure that an R programmer, a Python programmer, and a SAS programmer can all collaborate within the same platform to move models from the lab into production. When it is only one or the other, then I think organisations are setting themselves up to be unsuccessful.

Next, it is about having a robust way to get those models from your data scientist community into production and operationalise them at scale.

We see a lot of customers who have a lot of great work being done by data scientists, but they do not have a repeatable and scalable way of getting it over the wall or moving it at scale.

This is one area that comes up a lot when I talk to customers.

One of the other challenges we hear is that there is so much technology out there, and everybody wants to go and grab the latest things. As a technologist, I love doing that too, where you go and grab it, download it, mess with it. But we have to continue to balance that, which is why I think having an API-driven approach as we go forward is really important. This is so that we can ingest it and integrate it with things from a cloud perspective. Everyone talks about R, Python, Go, and I think these languages are great and we will continue to work with them because our customers and the market want us to. But it is the next language that we do not know about that gives me a little bit more pause, because there is another one coming, and we are going to have to go and integrate with it. That is a design principle for us as we go forward, because our customers are going to demand it.

Python, for example, has seen a resurgence in its use for data science, and what was old is new again. We see these things go down, and then all of a sudden go back up. This is something I talk about with the teams that our customers have. I want SAS to be a company where customers bring us analytical problems to solve. And you will notice that I did not say SAS analytical problems. I am not interested in whether you are using R, Python, or SAS. I want someone to bring all of those in, and we will work with them together in a collaborative way, and then use SAS to get that over the hump into production. That is the kind of view that I have.

It is sometimes difficult to get a clear ROI on analytics investment from the very beginning, until you can get the data and show the exact savings to be had. What advice would you give to customers who are trying to convince their finance teams or CFOs to make the investment?

Someone who is just embarking on an analytics journey has to pick a use case that is reasonable, manageable, and measurable. Oftentimes, customers want to boil the entire ocean, and they have this huge problem that they want to go and solve. Yes, we should worry about that, but let us first get something we can put our arms around, that we can measure and then put a proof of value around.

The other part of the conversation is time. The six-month proof of concept is gone and no longer acceptable within the industry. When we talk to stakeholders and customers, they will tell us what they want to accomplish in two weeks, and what they want to accomplish in four weeks.

If we cannot start to measure success in eight weeks or less, then we are picking the wrong use case.

For us, that is something that we are laser-focused on.

One of the ways that we can help customers is by bringing people and our consulting organisation to bear, because they have the expertise from having done this before. They have designed these programmes before, so we get them to work with our customers. There are two benefits there. First, there is a technology benefit because of the size and breadth of SAS. Second, when SAS employees show up at the customer’s site, they bring the knowledge and some of our company culture, and I would like to think that that rubs off and makes the company a little bit of a better place. That, I think, is the 30,000-foot view.

One important question you raised was how to convince the CFO. Too often, I see big analytical projects that companies want to kickstart that do not have a C-level executive sponsor. So, the first step internally is to go and find that sponsor, and make sure that they understand and know why we are going to solve this problem, and that there is a measurable benefit to the company.

How do you think SAS, as well as the analytics industry as a whole, is going to take shape over the next 5-10 years?

I think we are going to see humans being taken more out of the loop when we think about decisioning. Going forward, the analytical landscape is going to mature so much that we are going to automate so much more from a decisioning and analytics perspective. This means that our knowledge workers can then go and focus on what they are good at — being creative and solving problems. That, to me, is exciting.

We are also going to see huge increases in actual direct investment in areas of innovation such as privacy by design, security by design, and ethics by design. As we start trusting computers and AI models to make decisions for us, we need to know why, and make sure there is some explainability there on what led to that decision happening. Quantum computing is one of those things that for me, quite frankly, is still a little bit out there. As a research area it excites me, with just the sheer size and power of compute, and how it will completely change the way and speed at which we can solve problems. 

There is also edge computing in IoT. There is a lot to talk about for IoT, and I think as an industry, we are just scratching the surface.

But when we think about running true analytics at scale on the edge, I do not see enough organisations either thinking far enough ahead in that way, or actually being able to execute in that way.

I think cloud providers are also going to really drive innovation across the industry, and I view that as a good thing. When you look at Google trying to continue to capitalise on this market, they are going to pour billions in investment there.

Lastly, although it is not necessarily a technology topic, we are going to continue to invest heavily in education, because I think we have an obligation to teach the next generation of critical thinkers, statistical thinkers, and people who are coming up and understanding the power of data analytics. That is a place outside of pure technology innovation that SAS is going to continue to invest in, and we have an obligation there, in my opinion.

You mentioned that we are barely scratching the surface of analytics at the edge. In your opinion, what is setting the limit for us? Is it a compute limit, network problem, or lack of use cases?

I do not think it is a network or compute problem. I am dealing with a large manufacturer right now here in the United States, and they are deploying analytics on the edge with sensors in every machine they have in every plant in the US. They are a huge company, and that is a size and scale challenge. When you are looking at managing that many models, sensors, and devices all at once, it is a whole different scale. They are deploying tens of thousands of models in real-time. That is a different problem from just pushing a handful of sensors out at the edge. For me, we are solving for scale there, and we are solving for the operational rigour which someone like a manufacturer would expect. Those have been some of the limiting factors that we have seen.