Edge computing: The nirvana for data glut

Smart home devices such as smart switches, smart thermostats and smart speakers with voice assistants are fast going mainstream. Revenue for the smart home market in Asia is projected to reach US$35,756m in 2020, according to Statista, and this is expected to grow at an annual rate of close to 16 percent over the next four years.

The popularity of smart home devices is no surprise, as they are becoming more intuitive by the day. Increasingly, these devices can gather large volumes of data and use it to learn and perform tasks, from controlling home entertainment systems to directing robots that perform household chores.

At the same time, in today's sophisticated smart home environments, it is impossible for systems to collect all the information into one central repository, analyze it, and then push recommendations back to the devices.

Nonetheless, we expect the technology to evolve in 2020 and beyond, so that devices can execute the compute function themselves and systems no longer have to do it centrally. By computing at the edge (a.k.a. edge computing), these devices will improve functional efficiency by learning to adjust in real time rather than being slowed down by the transfer of information to and from a central system.

Devices Becoming Smarter

It was only a few years ago that traditional devices became smarter and new smart devices started emerging. Now, devices with smart voice assistants like Alexa, Google Assistant and Siri are common in homes across Singapore. They are also becoming even more intelligent, with the ability to adjust on the fly.

For instance, devices like Google’s Nest are equipped with machine learning capabilities. This means they can learn users’ habits, for example from weeks of temperature adjustments on the thermostat, and work out users’ preferences.
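To make the idea of learning from adjustments concrete, here is a minimal sketch in Python. It is not how Nest or any other product works; it simply assumes that a thermostat logs each manual adjustment as an (hour of day, temperature) pair and averages them per hour to propose a schedule. The function name learn_setpoints and the sample history are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def learn_setpoints(adjustments):
    """Average the user's manual adjustments for each hour of the day.

    adjustments: iterable of (hour_of_day, temperature_c) pairs.
    Returns a dict mapping hour -> suggested setpoint in Celsius.
    """
    by_hour = defaultdict(list)
    for hour, temperature_c in adjustments:
        by_hour[hour].append(temperature_c)
    return {hour: round(mean(temps), 1) for hour, temps in by_hour.items()}


# Hypothetical history: warmer in the morning, cooler at night.
history = [(7, 21.0), (7, 22.0), (22, 18.0), (22, 19.0), (7, 21.5)]
schedule = learn_setpoints(history)
print(schedule)
```

A real device would weigh recent adjustments more heavily and account for occupancy and weather, but the principle is the same: the preference is derived from the data the device itself collects.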

Centralized Analysis Becoming Harder

Traditionally, these systems are designed to analyze data and derive intelligence from it centrally. Data is extracted from operational systems, transformed into the appropriate format, and loaded into the data warehouses, the workhorse of business intelligence. These data warehouses serve as central repositories where data is turned into insights.

However, data warehouses are losing their standing as the single source of truth, for various reasons. For one, data warehouses can only store structured data, whereas the bulk of data these days is unstructured.

Another reason is the volume of data; it has become so vast that it is not economically feasible to store all the data in a single data warehouse.

Companies have tried to adopt alternatives like Hadoop, which can store unstructured data, as central repositories. However, it is still not possible to collect all the information generated across multiple devices residing in various locations into one central repository. It is also not feasible to analyze the information for intelligence, and then make smart recommendations back to the devices for optimal performance.

Edge Computing as the Solution

What is currently missing is technology that executes the compute function closer to, or on, the devices themselves. Edge computing architectures allow devices to send the data they capture or generate to edge nodes, located near the devices, where analysis and computation are managed. Devices can gain the intelligence to meet their users’ needs much faster, since they only need to communicate with their edge nodes.

However, such independence does not mean that the devices and edge nodes function autonomously. The edge nodes are still connected to central systems and transmit the information that is needed for the central systems to analyze across multiple devices.

In other words, there is a duality of computation in which some analysis happens at the edges to the extent needed for local operation. At the same time, data is also transmitted to central analytical systems to perform more holistic analysis.
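This duality can be sketched in a few lines of Python. The example below is illustrative only: the EdgeNode class, the central_analysis function, the temperature threshold and the room names are all assumptions made for the sketch, not part of any particular product. Each edge node decides locally what to do with a reading, and forwards only a compact summary for the holistic, cross-device analysis.

```python
from statistics import mean


class EdgeNode:
    """Hypothetical edge node: acts locally, summarizes for the center."""

    def __init__(self, node_id, threshold_c=26.0):
        self.node_id = node_id
        self.threshold_c = threshold_c
        self.readings = []

    def ingest(self, temperature_c):
        """Store a reading and decide locally, with no central round trip."""
        self.readings.append(temperature_c)
        return "cool" if temperature_c > self.threshold_c else "idle"

    def summary(self):
        """Compact aggregate sent to the central system instead of raw data."""
        return {
            "node": self.node_id,
            "count": len(self.readings),
            "mean_c": round(mean(self.readings), 2),
            "max_c": max(self.readings),
        }


def central_analysis(summaries):
    """Holistic analysis across nodes, using only the summaries."""
    total = sum(s["count"] for s in summaries)
    weighted = sum(s["mean_c"] * s["count"] for s in summaries) / total
    hottest = max(summaries, key=lambda s: s["max_c"])
    return {
        "readings": total,
        "overall_mean_c": round(weighted, 2),
        "hottest_node": hottest["node"],
    }


living_room = EdgeNode("living-room")
bedroom = EdgeNode("bedroom")
actions = [living_room.ingest(t) for t in (24.0, 27.0, 25.0)]
for t in (20.0, 22.0):
    bedroom.ingest(t)

report = central_analysis([living_room.summary(), bedroom.summary()])
print(actions)
print(report)
```

The local decision in ingest never waits on the network, while the central system still sees enough to compare rooms or homes against one another.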

Filtering data at the source and transmitting only the required information to a central system is not new. In data virtualization, a method of data integration, we perform this selective data processing and delivery in real time without replicating the data itself.

When data from smart devices comes in, a data virtualization instance that sits at the edge node, close to these devices, integrates the disparate data and extracts just the results. The results are then sent to another data virtualization instance that sits in the central location for analysis. In other words, a network of virtualization instances, some at the edge nodes, connected to a central data virtualization instance forms the multi-location architecture that completes the edge computing framework.
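The multi-location arrangement can be illustrated with a toy model in Python. This is a sketch under stated assumptions, not the API of any data virtualization product: the class names EdgeVirtualizationInstance and CentralVirtualizationInstance, the row format and the sample sources are all hypothetical. The point it shows is that each edge instance reads its sources on demand, keeps no copy of the raw rows, and passes only an aggregated result to the central instance.

```python
class EdgeVirtualizationInstance:
    """Hypothetical edge instance: queries live sources, returns results only."""

    def __init__(self, name, sources):
        self.name = name
        # Maps a source name to a zero-argument callable that returns rows.
        # The rows are read on demand and never stored by this instance.
        self.sources = sources

    def query(self, metric):
        values = []
        for fetch in self.sources.values():
            for row in fetch():
                if row.get("metric") == metric:
                    # Sources may report values as strings or numbers.
                    values.append(float(row["value"]))
        if not values:
            return None
        return {
            "edge": self.name,
            "metric": metric,
            "count": len(values),
            "total": sum(values),
        }


class CentralVirtualizationInstance:
    """Hypothetical central instance: combines results from edge instances."""

    def __init__(self, edges):
        self.edges = edges

    def query(self, metric):
        results = [r for r in (e.query(metric) for e in self.edges) if r]
        count = sum(r["count"] for r in results)
        total = sum(r["total"] for r in results)
        return {
            "metric": metric,
            "count": count,
            "average": total / count if count else None,
        }


home_a = EdgeVirtualizationInstance("home-a", {
    "thermostat": lambda: [
        {"metric": "temp_c", "value": "21.5"},
        {"metric": "temp_c", "value": "22.5"},
    ],
    "speaker": lambda: [
        {"metric": "temp_c", "value": 20.0},
        {"metric": "volume", "value": 7},
    ],
})
home_b = EdgeVirtualizationInstance("home-b", {
    "sensor": lambda: [{"metric": "temp_c", "value": 24.0}],
})

central = CentralVirtualizationInstance([home_a, home_b])
result = central.query("temp_c")
print(result)
```

Only two small dictionaries cross from the edge to the center here, regardless of how many rows each home's devices produce.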

Benefits of Edge Computing

Two aspects of edge computing technology have evolved much faster than the others in recent years: compute and storage. For instance, the mobile phones we have today have compute power and memory that far exceed those of the desktop computers of 30 years ago.

However, what is holding the technology back is limited bandwidth for data transmission, as it can still take minutes or even hours for data to move from one location to another. This becomes even more challenging when devices are carried physically further away from cloud and central systems. Edge computing allows devices to compute, adjust and learn in real time rather than being slowed down by the transfer of information to and from a central system.
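A back-of-the-envelope calculation shows why transmission dominates. The figures below are illustrative assumptions, not measurements: a 20 Mbps uplink, 10 GB of raw device data, and a 1 MB summary produced at the edge. The sketch also ignores latency and protocol overhead, which would only widen the gap.

```python
def transfer_seconds(size_bytes, bandwidth_bits_per_second):
    """Idealized transfer time: payload size in bits divided by link speed."""
    return size_bytes * 8 / bandwidth_bits_per_second


UPLINK_BPS = 20 * 10**6           # assumed 20 Mbps uplink
RAW_DATA_BYTES = 10 * 10**9       # assumed 10 GB of raw device data
SUMMARY_BYTES = 1 * 10**6         # assumed 1 MB edge summary

raw_seconds = transfer_seconds(RAW_DATA_BYTES, UPLINK_BPS)
summary_seconds = transfer_seconds(SUMMARY_BYTES, UPLINK_BPS)

print(f"raw data: {raw_seconds / 60:.1f} minutes")
print(f"edge summary: {summary_seconds:.1f} seconds")
```

Under these assumptions the raw data takes over an hour to move, while the edge summary takes under a second.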

Data volume has exploded with the proliferation of connected and smart devices, and this has reduced the efficacy of centralized computation and analysis. Edge computing solves this problem by making smart devices even smarter: they can process their own data to meet their own needs, and transmit only the data that is required for centralized computation.