Swarm learning and its applications


Artificial intelligence (AI) has become a major source of innovation, disruption, and competitive advantage in today’s business environment. According to research and advisory firm Gartner Inc., AI augmentation will create $2.9 trillion of business value and 6.2 billion hours of worker productivity in 2021.

One of the innovations brought about by AI is swarm learning (or SL), a machine learning approach that can help detect patients with severe illnesses such as leukaemia, tuberculosis, and COVID-19. It can also enable collaboration models in intelligent edge, autonomous vehicles, and cross-enterprise collaboration. To find out more about SL, the technology behind it, and how it works, Frontier Enterprise recently talked with Dr Eng Lim Goh, Senior Vice President and Chief Technology Officer for Artificial Intelligence at Hewlett Packard Enterprise (HPE).

In addition to inventing SL applications, Dr Goh is the principal investigator of the International Space Station experiment to operate an autonomous supercomputer for long-duration space travel. He also oversees the application of AI to Formula One racing and works on applying the technology behind Libratus (the champion poker AI) to enterprise decision-making, among other endeavours.

You’ve been with HPE since 2016. What have been the highlights of your time there, and what are the most significant changes you’ve seen since then, specifically when it comes to AI?

AI has made significant advances in the past decade, impacting both our everyday lives and business transformation. The number of use cases involving AI is increasing fast, and organisations are now empowered to take the technology and drive the change that customers are expecting.

Many are achieving this by devoting resources to move from the proof-of-concept phase to delivering sustained production at scale. For example, AI is taking an increasingly prominent role in three use cases: inventing new products, improving current production, and further automating back-office operations such as accounting and human resources.

What sort of technologies (e.g. hardware and software) are necessary to perform SL? How can it integrate sensitive medical data and share it privately without violating privacy laws?

SL is a decentralised machine-learning approach that unites edge computing with blockchain-based peer-to-peer networking and coordination. It maintains confidentiality without the need for a central coordinator, thereby going beyond federated learning.

SL software is offered as Docker containers. It can run on any infrastructure that supports the Docker environment, from edge computing to supercomputers. The infrastructure required depends on the complexity of the models. For example, simple models can run on regular x86 processors, whereas complex computer vision models would need graphics processing units.

SL offers a way to ensure private raw data never has to leave the location in which it is collected. Only insights from that data are shared between the participating nodes, and the machine learning method is applied locally at each node or data source. Moreover, to provide for equitable participation, there is no central custodian to collect all insights or learnings from the nodes. A blockchain is used instead.
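The idea described above (local training at each node, with only learnings exchanged and no central custodian) can be illustrated with a small simulation. This is a minimal sketch, not HPE's implementation: it assumes the shared "insights" are model parameters, that they are merged by a sample-weighted average, and that a simple rotation picks which node performs the merge each round, standing in for the blockchain-based coordination. All function names and the synthetic data are hypothetical.

```python
import random


def local_train(weights, data, lr=0.05, epochs=20):
    """Gradient descent for y = w*x + b on one node's private data."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def merge(params, sizes):
    """Average the nodes' parameters, weighted by local sample count (assumed merge rule)."""
    total = sum(sizes)
    w = sum(p[0] * s for p, s in zip(params, sizes)) / total
    b = sum(p[1] * s for p, s in zip(params, sizes)) / total
    return (w, b)


def swarm_round(node_data, weights, round_no):
    """One round: every node trains locally, then one node merges the parameters."""
    # Raw data stays inside local_train; only the resulting parameters are exchanged.
    local_params = [local_train(weights, data) for data in node_data]
    # Assumption: the merging node rotates each round instead of being a fixed server.
    leader = round_no % len(node_data)
    merged = merge(local_params, [len(data) for data in node_data])
    return merged, leader


def make_private_data(n, seed):
    """Synthetic stand-in for one site's private records (true model: y = 3x + 1)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1, 1)
        data.append((x, 3.0 * x + 1.0 + rng.gauss(0, 0.05)))
    return data


node_data = [make_private_data(40, seed) for seed in (1, 2, 3)]
weights = (0.0, 0.0)
leaders = []
for round_no in range(30):
    weights, leader = swarm_round(node_data, weights, round_no)
    leaders.append(leader)

print("learned parameters:", weights)
print("merging node per round:", leaders[:6])
```

In this sketch the three nodes jointly recover a model close to the one that generated their data, even though no node ever sees another node's records.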

Blockchain – which is an element of SL – is known to take some time to process data. How does SL handle this limitation?

SL has the goal of facilitating the integration of any medical data from any data owner worldwide without violating privacy laws. The approach involves leveraging a group of technologies, including edge computing and blockchain, to process data while removing the need for centralised coordination, thus preserving the confidentiality of the underlying information and allowing for collaboration that wasn’t previously possible.

Public blockchains use compute-intensive consensus algorithms, like Proof of Work, to provide transactional guarantees across untrusted participants. For example, with a public blockchain, the consensus of truth is established by applying huge computing resources in a competition to solve a puzzle. Private, permissioned blockchain implementations do not need such compute-intensive consensus algorithms, as the participants are validated through identity.

SL uses a private, permissioned blockchain that does not require such consensus and therefore avoids the overhead of compute-intensive consensus algorithms.
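The difference in cost between the two styles can be sketched in a few lines. The example below is illustrative only and says nothing about the blockchain SL actually uses: the Proof of Work puzzle is a simplified hash search, and the permissioned check is reduced to a hypothetical allowlist of node names, where a real system would verify certificates or signatures.

```python
import hashlib


def proof_of_work(block_data, difficulty=4):
    """Search for a nonce whose SHA-256 digest starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1  # each failed attempt is wasted compute; cost grows ~16x per extra digit


# Hypothetical set of participants admitted to the private network.
PERMITTED_NODES = {"hospital-a", "hospital-b", "hospital-c"}


def permissioned_accept(node_id, block_data):
    """Accept a block from a known participant with a single hash and no puzzle."""
    if node_id not in PERMITTED_NODES:
        return None  # unknown identity: reject outright
    return hashlib.sha256(block_data.encode()).hexdigest()


nonce, digest = proof_of_work("model-update-1", difficulty=4)
print("proof of work needed", nonce + 1, "hash attempts")
print("permissioned accept:", permissioned_accept("hospital-a", "model-update-1") is not None)
```

The Proof of Work path typically needs tens of thousands of hash attempts even at this toy difficulty, while the identity-based path needs one membership check and one hash.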

How different is SL compared to the approach used by the Gaia-X project in developing a decentralised data-sharing system in the European Union?

Gaia-X is aimed at providing federated data infrastructure and was also designed with improved efficiency and security in mind.

The SL framework does not share any private raw data. Instead, it enables decentralised training by sharing learnings and insights. The mechanism for developing federated networks via identities could be the same for both approaches.

SL could be a part of delivering on the goals of Gaia-X data federation. In particular, it could still enable the sharing of learnings and insights across the federation when the raw data needed for training is restricted or confidential and cannot be shared.

What are some of the most exciting developments in HPE’s labs, specifically in AI?

HPE is focused on making AI solutions that are data-driven, production-oriented, and cloud-enabled. The trend we are witnessing is an acceleration of the move from AI experimentation to operationalisation, with the goal of receiving insights on demand at any scale. Mindsets are also shifting towards running AI algorithms and workloads at the edge, beyond the data centre, as doing so becomes more feasible. Sourcing real-time data remains a hurdle to overcome, and AI experts see the need for improved infrastructure at the edge in the future to serve business needs.

We are engaging in the important discussion of how to provide clear ethical guidance for AI in order to address two critical needs: demonstrate with transparency how current technology can be applied with confidence, and illuminate where current technologies fall short – in order to identify where we need to innovate.

Earlier this year, Hewlett Packard Labs and the HPE Chief Compliance and Privacy Offices collaborated to define HPE’s AI ethics and principles. As we gain experience applying these principles in practice, we’re uncovering gaps where conventional AI falls short of our goal, revealing issues of bias, explainability, trust, and robustness.

Our Labs AI Research team is currently focused on model synthesis and analysis, on the data foundation underpinning ethical AI, and on hardware acceleration to enable explainable, robust AI to be operated efficiently and sustainably.