In August 2023, the UK’s National Air Traffic Services (Nats) faced a significant incident due to a system failure triggered by a wrongly formatted flight plan. This rare event led to a final report from the Independent Review Panel for the UK Civil Aviation Authority, which recommends 34 crucial changes.
The report examines how to better handle similar situations in the future. During this incident, the primary system didn’t actually fail; it went into maintenance mode as it was designed to do, preventing potentially unsafe data from reaching air traffic controllers. Unfortunately, the backup system reacted in the same way and also went into maintenance mode. As a result, Nats couldn’t automatically process flight plans, and manual intervention was necessary.
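The failure pattern described above, where a primary and a backup both stop safely on the same input, can be illustrated with a minimal Python sketch. It is not a representation of the actual Nats software: the class and function names, the `UNSAFE_PLAN` marker and the manual queue are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical marker for a plan the processors cannot interpret safely.
UNSAFE_PLAN = "BAD-PLAN"


class UnsafeDataError(Exception):
    """Raised when a processor cannot safely interpret a flight plan."""


@dataclass
class FlightPlanProcessor:
    name: str
    in_maintenance: bool = False

    def process(self, plan: str) -> str:
        if self.in_maintenance:
            raise RuntimeError(f"{self.name} is in maintenance mode")
        if plan == UNSAFE_PLAN:
            # Fail safe: stop rather than pass questionable data downstream.
            self.in_maintenance = True
            raise UnsafeDataError(f"{self.name} rejected {plan}")
        return f"{self.name} accepted {plan}"


def submit(plan, primary, backup, manual_queue):
    """Try the primary, then the backup; fall back to manual handling."""
    for system in (primary, backup):
        try:
            return system.process(plan)
        except (UnsafeDataError, RuntimeError):
            continue
    # Both systems are unavailable, so a person has to process the plan.
    manual_queue.append(plan)
    return "queued for manual processing"
```

Because both processors apply the same logic to the same input, the backup offers no protection against this kind of fault: once the hypothetical unsafe plan has been submitted, both instances are in maintenance mode and every later plan lands in the manual queue until someone intervenes.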
To prevent such failures, the report urges Nats to reassess its command structure and technology, evaluating whether current practices yield the best outcomes or whether alternatives could enhance safety. The authors advocate considering different command models and suggest having a single incident manager to streamline operations during crises.
The report also highlights the need to clarify air traffic control documents so that engineers and non-specialists can better understand system complexity. It urges a comprehensive review of critical systems to ensure their documentation is clear enough for staff to operate them safely under unexpected conditions.
Although escalation procedures were in place, the investigation revealed that reaching out to the supplier earlier could have sped up the resolution. The report recommends refining the escalation process and setting clear guidelines on when to seek supplier support. It also calls for a centralized document detailing supplier contacts and support levels, accessible to all team members involved in incident response.
A smaller but still significant point is the challenge of keeping system architecture maps up to date. The authors suggest evaluating new technologies or model-based engineering processes to quickly create accurate system schematics during incidents. Additionally, they recommend that the technical services director assess current documentation to support these new approaches. The goal is to identify the causes of faults early by understanding how the wider system is connected.