Failure to launch: behind the scenes of NAIA’s New Year’s Day fiasco

Image courtesy of Anete Lūsiņa on Unsplash.

In the Philippines, people often usher in the new year with loud noises and grand fireworks displays.

However, on January 1, 2023, the silence on the runway of the Ninoy Aquino International Airport was deafening.

As domestic and international flights to and from NAIA were either cancelled or rerouted, nothing made sense to stranded passengers other than the reality that their holiday plans were quickly going up in smoke.

Ground zero

Around 300 flights were affected and approximately 65,000 passengers were inconvenienced by NAIA’s shutdown on New Year’s Day.

Airline personnel distribute food packs to stranded passengers at NAIA on New Year’s Day. Image courtesy of DoTr.

The country’s Department of Transportation (DoTr) released a statement later in the day, addressing the incident.

“At around 9:49 AM local time, the Air Traffic Management Center (ATMC), which serves as the facility for controlling and overseeing all inbound and outbound flights and overflights within the Philippine airspace, went down due to power outage, resulting to loss of communication, radio, radar, and internet,” Transport Secretary Jaime Bautista explained.

Jaime Bautista, Philippine Transport Secretary. Image courtesy of DoTr.

According to Bautista, the system issue caused the disruption of flights in NAIA, as well as in other airports in the country.

“The primary cause identified was a problem with the power supply and the degraded uninterrupted power supply which had no link to the commercial power and had to be connected to the latter manually. The secondary problem was the power surge due to the power outage which affected the equipment,” he continued.

By 4:00 PM, the ATMC had resumed partial operations at limited capacity, and it returned to normal operations by 5:50 PM while equipment restoration was still ongoing, the transport chief said.

Despite this, airport officials admitted that it might take around 72 hours before NAIA’s operations returned to normal.

Cabin pressure

With mounting pressure from lawmakers and the public, a top-level meeting followed by an airport inspection took place two days later, on January 3, attended by various government agencies.

On January 4, the DoTr announced that the immediate courses of action agreed upon during the meeting were to upgrade the existing facilities and replace the affected equipment of the Civil Aviation Authority of the Philippines (CAAP), an attached agency of the DoTr that is in charge of NAIA and several other airports in the country.

CAAP stated that the incident did not appear to be a cyberattack since “affected electrical equipment cannot be manipulated from outside (the) CAAP compound.”

On January 12, eleven days after the NAIA outage, airport officials were summoned before the Senate Committee on Public Services “to shed light on what actually caused the fiasco, and exact accountability from agencies and officers involved.”

Among the findings during the Senate probe was the absence of CCTV cameras in the area housing NAIA’s Communications, Navigation and Surveillance Systems for Air Traffic Management (CNS/ATM), the equipment that broke down on New Year’s Day.

Senator Grace Poe, Chair, Senate Committee on Public Services, Philippines. Image courtesy of Sen. Grace Poe.

Moreover, it was revealed that CAAP had been doing in-house maintenance work on airport equipment, after a contract with a third-party service provider ended in 2020.

In light of this, Senator Grace Poe, Chair of the Senate Committee on Public Services, called on CAAP to engage a third-party service provider for maintenance work to avoid a repeat of the New Year’s Day incident.

“Knowing what and why it happened and seeking accountability is to our best interest. But at the end of the day, our goal is to make sure that this will not happen again— not only by upgrading the system or replacing the equipment, but also making sure that the institutions running these are empowered and capacitated,” she said.

In response, CAAP announced several recommendations to improve the CNS/ATM:

  • Procurement of a multi-mode fallback system.
  • Construction of an independent backup CNS/ATM.
  • Engagement of a third-party contractor to oversee the operation and maintenance of the system.

Mayday

Two months after the incident, and after a series of inquiries, the Senate finally released its findings on March 7.

According to Senator Grace Poe, there was indeed a power outage at NAIA on January 1, and it was caused by malfunctioning on-site equipment.

The investigation found that:

  • The uninterruptible power supply detected a fault in the power transfer switch and disconnected from the power source as a safety measure to prevent further damage.
  • A loose connection with the circuit breaker’s neutral wire released excess voltage that damaged the CNS/ATM, leading to a prolonged airport shutdown.
  • The automatic voltage regulator had not been functioning since August 2022, or four months before the incident.

“The malfunctioning of these three pieces of equipment was worsened by several underlying issues that all aligned on New Year’s Day, and ultimately led to a system failure,” Senator Poe said.

The issues presented were:

  • Lack of engineering standards and guidelines for the maintenance and troubleshooting of equipment.
  • Absence of a system evaluation for the entire airport.
  • No proper personnel training and lack of electrical engineers on site.
  • No functioning system backup or redundancy.

Senator Poe added that while sabotage and cyberattack had been ruled out for the New Year’s Day incident, a more conclusive finding would be reached after the data logs sent to Turkey had been thoroughly analysed.

Moreover, Senator Poe reintroduced the proposed “Philippine Transportation Safety Board Act” before the Senate, which would create an independent agency called the Philippine Transportation Safety Board (PTSB).

The PTSB would be tasked to “conduct independent, thorough, and truthful investigations, and provide corresponding and critical recommendations” on all transport matters.

“The PTSB can save lives. It also aims to resolve issues in our transport systems and avoid further shutdowns,” Senator Poe said.

Ready for takeoff

In May 2022, a study by American luggage app Bounce identified NAIA as the world’s worst airport for business class travellers. Later that year, travel website Hawaiianislands.com named the airport the third most stressful in Asia.

Phil Scanlon, Senior Vice President, Global Solution Engineering, Solace. Image courtesy of Solace.

However, with certain adjustments, the airport can take flight once again, according to cybersecurity experts.

Phil Scanlon, Senior Vice President, Global Solution Engineering at Solace, said that the incident demonstrated the need for always-on systems to operate airspace in real time.

“A solid availability strategy, supported by event-driven architecture (EDA) infrastructure such as decoupled systems and the adoption of cloud technology as part of the availability strategy, will help airports operate in a hybrid model. In this instance, the airport’s backup power was insufficient to run all the systems operating within the airport,” he explained.

“The breakdown of the central air traffic control system, radar equipment, and communications system, as a result of the power outage, could have been managed differently if the airport had adopted cloud technology where possible,” the executive added.

Chris Cruz, Chief Information Officer, Public Sector, Tanium. Image courtesy of Tanium.

If NAIA proceeds with a systems upgrade, the airport operator should use the lessons learned from the incident as a starting point for its digital transformation, advised Chris Cruz, Chief Information Officer, Public Sector, Tanium.

“Not having a secondary system in place to manage a failover is a hardware issue in terms of not being able to switch to a backup system with auxiliary power. An airport this size should have a backup generator system capable of handling basic technology functions in the event of a power outage,” he said.

Cruz outlined a likely scenario for NAIA’s systems upgrade: a typical technology upgrade involves loading the new system onto a separate test server, either on-premises or in the cloud. This allows the application to be “smoke tested” for software bugs and checked against all requirements.

“Once it is tested and validated, typically you do a cutover from the old system to the new system with minimal disruption to services,” Cruz explained.
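The test-then-cutover sequence Cruz describes can be illustrated with a short Python sketch. It is not drawn from any NAIA or CAAP system: the `Router` class, the check names, and the `cutover` function are all hypothetical and stand in for whatever tooling an operator would actually use. The only idea it shows is that traffic moves to the new system when every smoke check passes, and stays on the old system otherwise.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Router:
    """Hypothetical stand-in for whatever decides which system serves traffic."""
    active: str


def smoke_test(checks: Dict[str, Callable[[], bool]]) -> Dict[str, bool]:
    """Run each named check once; a check that raises counts as a failure."""
    results: Dict[str, bool] = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False
    return results


def cutover(router: Router, candidate: str,
            checks: Dict[str, Callable[[], bool]]) -> bool:
    """Switch to the candidate system only if at least one check ran and all passed."""
    results = smoke_test(checks)
    if results and all(results.values()):
        router.active = candidate
        return True
    return False  # the old system keeps serving traffic


# Illustrative checks only; real ones would probe the test server.
router = Router(active="old-system")
passed = cutover(router, "new-system", {
    "starts_up": lambda: True,
    "meets_requirements": lambda: True,
})
```

In this sketch a single failed or crashing check leaves `router.active` unchanged, which is the "minimal disruption" property the cutover step is meant to protect.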

Meanwhile, Praveen Kumar, Vice President for APAC, Rocket Software, listed several pointers that could guide the modernisation of NAIA’s systems.

Praveen Kumar, Vice President for APAC, Rocket Software. Image courtesy of Rocket Software.

“First, have a look at how you can interface modern technologies with the infrastructure that you already have. One common mistake companies make too often is to blindly follow technology trends, jumping from one platform to another without much planning, and not ensuring a seamless transition between systems, before eventually ending up struggling to achieve results and revenue,” he said.

Kumar also highlighted the importance of using visual tools to understand how employees interact with data and the entire application landscape.

Further, by identifying customer and employee needs and working backwards, teams can better determine where modernisation is needed to achieve desired outcomes, he noted.

With revenge travel currently in full swing, airports such as NAIA can only expect a surge in passengers, especially during upcoming long weekend holidays such as Labour Day in May.

Solace’s Phil Scanlon suggested that EDA can help prevent a repeat of the NAIA New Year’s Day fiasco.

“With customer experience and operational efficiency being consistently highlighted as key areas for improvement in the aviation industry, the integration of EDA to provide resilient real-time communication would undoubtedly emerge as the standard in the upcoming years. An event mesh interconnecting aviation systems, devices, and people, regardless of where they are deployed, would become vital in safely disseminating information efficiently to the right place and at the right time,” he said.

“Given that a singular miscommunication in the aviation industry could potentially cascade into a host of serious complications, EDA plays a significant role in mitigating future disruptions across major transportation institutions,” Scanlon concluded.
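For readers unfamiliar with the term, the decoupling at the heart of EDA can be shown in a few lines of Python. This is a minimal in-process sketch, not Solace's product or an event mesh, and every name in it (`EventBus`, the `power.status` topic, the handlers) is an assumption made for illustration. Publishers send events to a topic without knowing who is listening, and one failing subscriber does not stop delivery to the others.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Tuple

Handler = Callable[[dict], None]


class EventBus:
    """Toy publish/subscribe bus; a real event broker would run as separate infrastructure."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)
        self.failures: List[Tuple[str, str]] = []

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> int:
        """Deliver the event to every subscriber; return how many handled it."""
        delivered = 0
        for handler in self._subscribers.get(topic, []):
            try:
                handler(event)
                delivered += 1
            except Exception as exc:
                # Record the failure and keep going so other subscribers still get the event.
                self.failures.append((topic, repr(exc)))
        return delivered


def broken_display(event: dict) -> None:
    """Hypothetical subscriber that is itself out of service."""
    raise RuntimeError("display offline")


bus = EventBus()
alerts: List[dict] = []
bus.subscribe("power.status", alerts.append)   # e.g. an operations dashboard
bus.subscribe("power.status", broken_display)  # a subscriber that fails
delivered = bus.publish("power.status", {"source": "ups", "state": "fault"})
```

Here the fault event still reaches the working subscriber even though the other one raises an error, which is the kind of isolation between systems that the decoupled approach is meant to provide.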