By Tony Stentz, Vice President, Systems, Safety Engineering, and Validation
Starting with a thought experiment
You are driving along on the highway when your vehicle’s check engine light comes on. Like most drivers, you figure it can wait and continue on your way. Then the light starts blinking, signaling a critical problem that requires immediate attention. If you keep ignoring the light, you could put yourself or other road users in danger. As an alert and responsible driver, you quickly pull over to a safe spot on the side of the road.
Modern vehicles are equipped with electronic control systems and onboard diagnostics to signal to the human driving the vehicle that something has gone awry. So when a problem does happen, human drivers can make smart, informed decisions about whether or not they feel it’s safe to continue driving. But what happens when the engine of an autonomous vehicle experiences a problem?
Unlike traditional vehicles, Aurora Driver-powered vehicles will not be able to rely on human intervention. Instead, the Aurora Driver must be able to detect, diagnose, and respond to anomalous conditions that interfere with operations and pose a safety risk. To that end, we recently announced the early release of the Aurora Driver’s Fault Management System, or FMS.
What is a fault?
A fault occurs when some part of the Aurora Driver’s system or the vehicle it’s controlling becomes impaired. A fault could include anything from an obstructed camera to a blown tire or an anomaly in the code of an autonomy sub-system.
A system is fault-tolerant if, in the event of a fault, it remains capable of safe operation—though perhaps with reduced capabilities. For the Aurora Driver, this means that the vehicle is still able to safely stop or pull over when something goes wrong.
Fault tolerance is built into the software, hardware, and embedded systems that make up the Aurora Driver. All of the safety-critical functions (including those embedded in perception, forecasting, motion planning, localization, control, steering, braking, communication, and power supply and distribution) continue to work even when something breaks.
What is fault management?
To operate safely in autonomy, the Aurora Driver will be equipped with a Fault Management System that actively detects and mitigates faults. The Aurora Driver’s FMS and fault tolerance support the second principle of our Safety Case Framework: Fail-Safe. Under this principle, we consider the Aurora Driver to be acceptably safe if, even when parts of the autonomy system fail, it continues to behave in a way that does not endanger passengers or other road users.
Detection: Aurora’s FMS is designed to actively monitor the health of the vehicle, including the self-driving software, sensors, and onboard computer. Each component of the Aurora Driver constantly reports diagnostic health checks to the other components, ensuring that all systems meet the conditions required for safe autonomous operation.
Diagnosis: When a fault is detected, the FMS will evaluate its severity and determine the impact it will have on the Aurora Driver’s ability to drive safely. If it is not safe to continue normal operations, the FMS will produce a mitigation strategy.
Response: The result of the diagnosis will trigger one of a number of mitigation strategies, such as continuing to drive but at a reduced speed or pulling over to the shoulder. The FMS will consider the state of the entire system to decide on the safest fault response. The Aurora Driver’s motion planner will then execute that strategy according to the vehicle’s environment.
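To make the three stages concrete, here is a minimal Python sketch of a detect, diagnose, and respond loop. It is an illustration only, not Aurora's implementation: the names HealthReport, Severity, Response, and fault_management_step are hypothetical, as are the component names and the rule that the worst active fault determines the response.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical severity scale; higher values are worse."""
    NONE = 0
    DEGRADED = 1
    CRITICAL = 2


class Response(IntEnum):
    """Hypothetical mitigation strategies, loosely following the text."""
    CONTINUE = 0
    REDUCE_SPEED = 1
    PULL_OVER = 2


@dataclass(frozen=True)
class HealthReport:
    component: str      # illustrative name, e.g. "camera_front"
    severity: Severity  # Severity.NONE means the component is healthy


# Assumed policy: each severity level maps to exactly one response.
RESPONSE_FOR = {
    Severity.NONE: Response.CONTINUE,
    Severity.DEGRADED: Response.REDUCE_SPEED,
    Severity.CRITICAL: Response.PULL_OVER,
}


def detect(reports):
    """Detection: keep only the reports that signal a fault."""
    return [r for r in reports if r.severity > Severity.NONE]


def diagnose(faults):
    """Diagnosis: treat the worst active fault as the system severity."""
    return max((f.severity for f in faults), default=Severity.NONE)


def respond(severity):
    """Response: choose the mitigation strategy for that severity."""
    return RESPONSE_FOR[severity]


def fault_management_step(reports):
    """Run one detect -> diagnose -> respond cycle over all reports."""
    return respond(diagnose(detect(reports)))


reports = [
    HealthReport("camera_front", Severity.DEGRADED),  # e.g. obstructed lens
    HealthReport("lidar_roof", Severity.NONE),
]
print(fault_management_step(reports).name)  # REDUCE_SPEED
```

Taking the maximum severity across all reports is one simple way to consider the state of the entire system. In this sketch the output is only a strategy label; choosing where and how to slow down or stop would be left to the motion planner, as described above.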
Safely testing fault management
Like all of the Aurora Driver’s software, the FMS was developed and extensively tested in our Virtual Testing Suite before being validated on closed tracks and then on public roads. This process allows us to quickly and responsibly implement new capabilities.