Creating Driving Tests for Self-Driving Cars


Volvo-backed Zenuity wants to prove that autonomous vehicles can drive more safely than humans

Illustration: Jude Buffum

At a test track east of Gothenburg, Sweden, people are ushered into autonomous vehicles for a test drive. But there’s a twist: The vehicles aren’t actually autonomous—there’s a hidden driver in the back—and the people are participating in an experiment to discover how they’ll behave when the car is chauffeuring them around.

At Zenuity—a joint venture between Volvo and Autoliv, a Swedish auto-safety company—this test is just one of many ways we make sure not just that autonomous vehicles work but that they can drive more safely than humans ever could. If self-driving cars are ever going to hit the road, they’ll need to know the rules and how to follow them safely, regardless of how much they might depend on the human behind the wheel.

Even now your car doesn’t need you as much as it once did. Advanced computer vision, radar technology, and computational platforms already intervene to avoid accidents, turning cars into guardian angels for their drivers. Vehicles will continue taking over more driving tasks until they’re capable of driving themselves. This will be the biggest transportation revolution since cars replaced horse-drawn carriages.

But it’s one thing to build a self-driving vehicle that works, and quite another to prove that it’s safe. Traffic can be as unpredictable as the weather, and being able to respond to both means navigating countless scenarios. To fully test all those scenarios by simply driving around would take not years but centuries. Therefore, we have to find other ways to assure safety—things like computer simulations and mathematical modeling. We’re combining real traffic tests with extensive augmented-reality simulations and test cases on one of the world’s most advanced test tracks to truly understand how to make self-driving cars safe.

It’s easy for a self-driving vehicle to cruise down a straightaway in the middle of a sunny day. But what about what we call corner cases—scenarios in which several unlikely factors occur together? A road littered with fallen branches during a thunderstorm poses different challenges to a vehicle than an elk crossing the road while the sun is setting.

Manufacturers will likely be held liable for vehicles that react incorrectly, and so they want to know how the vehicle will respond. For us, the biggest question is “How do we know the vehicle is safe?”

But before that, we must first ask what it means for a self-driving vehicle to be safe. Safe doesn’t mean perfect; perfect information about the environment will never be available. Instead, it must mean the self-driving vehicle can handle the problems it’s designed to handle, like obeying speed limits, yielding to a car merging into its lane, or observing right-of-way at a stop sign. And it must also recognize when it is at risk of exceeding its design specifications. For example, the vehicle shouldn’t attempt to drive after being placed in the middle of the forest.

In 9 out of 10 accidents resulting in fatalities or major injuries, mistakes by the driver are a contributing factor, according to multiple U.S. and U.K. sources. Because of this, the quick answer to what is “safe enough” is usually “better than a human driver.” But the devil is in the details. It’s too easy a challenge to surpass the drunken driver or even the statistically average driver. The median driver, you might argue, is not very good.

We propose that self-driving cars be held neither to a standard so strict that it delays the introduction of a life-saving technology nor to one so lenient that it treats the initial customers as guinea pigs. Instead, the first self-driving vehicles should be demonstrably safer than a vehicle driven by the median human driver. We believe that if every component can be demonstrated to work better than a human and if the complex algorithms that govern each component can interact together to drive the vehicle, it’s reasonable to conclude that the car is a better driver than the human.

This means designing the vehicle’s systems to handle any situation within its scope and to discount the rest. A parachutist could conceivably land directly in front of the vehicle, but that event is so unlikely that safety tests need not consider it.

Any potential for unsafe behavior—software bugs, hardware failures, sensor limitations, unexpected weather conditions—must be shown to be very rare. Our rule of thumb is that any one of these problems should occur less than once in a billion hours of operation. At that low failure rate, all of these potential causes considered together still produce a vehicle that is not only safe but can be made available without taking too long to test.
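The arithmetic behind that rule of thumb can be sketched in a few lines. This is only an illustration: it assumes the failure causes are independent and rare, so that their rates add, and the list of four causes is taken from the examples above rather than from any real failure budget.

```python
# Illustrative only: the list of causes and the assumption that each one sits
# exactly at the rule-of-thumb rate are hypothetical.
RATE_PER_CAUSE = 1e-9  # failures per hour of operation (the rule of thumb)


def combined_failure_rate(rates_per_hour):
    """Rate at which any one of several independent, rare causes occurs.

    For independent Poisson-like failure processes the rates add.
    """
    return sum(rates_per_hour)


def mean_hours_between_failures(rate_per_hour):
    """Average operating time between failures at a constant rate."""
    return 1.0 / rate_per_hour


causes = ["software bug", "hardware failure",
          "sensor limitation", "unexpected weather"]
total = combined_failure_rate([RATE_PER_CAUSE] * len(causes))
print(f"combined rate: {total:.1e} failures per hour")
print(f"mean hours between failures: {mean_hours_between_failures(total):.1e}")
```

Under these assumptions, four causes at one failure per billion hours each still leave the vehicle with hundreds of millions of hours between failures on average.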

However, we shouldn’t expect the car to solve a problem that even the very best human driver could not solve. One such problem, often trotted out by ethics professors and the media, is the trolley dilemma. In this hypothetical scenario, someone must choose between doing nothing and allowing a runaway trolley to kill several people on one track, or actively throwing a switch, allowing the trolley to kill one person on another track. The parallels to self-driving vehicles are clear. What if a vehicle winds up in a situation where it must choose between killing several pedestrians or swerving into a barricade and killing its own driver?

But this question is a red herring. Because the risk of ending up in any fatal accident is low, the risk of ending up in a situation that forces a choice between two fatal outcomes is even lower. All we can demand of self-driving cars is that they avoid such impossible choices to begin with.

Once we have what appears to be a safe automated vehicle, we are obliged to prove it safe. One way is through a brute-force method. Here, the vehicle is tested in real traffic until we can say, with statistical significance, that it’s safe—and that would take hundreds of millions or even billions of hours on the road. The first automobiles were tested in this fashion, even if the people who drove those cars were unwitting participants in the experiment.
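A rough calculation shows why brute force takes so long. The sketch below is not our test procedure; it assumes failures arrive as a Poisson process at a constant rate and that the test fleet experiences no failures at all. Under those assumptions, the chance of seeing zero failures in T hours at rate r is exp(-r·T), so the hours needed for a given confidence follow directly. The function name and the fleet size are hypothetical.

```python
import math


def failure_free_hours_needed(max_rate_per_hour, confidence=0.95):
    """Failure-free hours needed to claim the true rate is below max_rate_per_hour.

    Assumes a constant-rate Poisson failure model and zero observed failures:
    exp(-rate * hours) <= 1 - confidence, solved for hours.
    """
    if max_rate_per_hour <= 0.0:
        raise ValueError("max_rate_per_hour must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return -math.log(1.0 - confidence) / max_rate_per_hour


hours = failure_free_hours_needed(1e-9)  # one failure per billion hours
fleet_size = 100                         # hypothetical fleet driving around the clock
years = hours / (fleet_size * 24 * 365)
print(f"{hours:.2e} failure-free hours, or about {years:.0f} years for {fleet_size} cars")
```

At 95 percent confidence this comes to roughly three billion failure-free hours, which is why even a large fleet cannot finish the job by driving alone.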

It’s also possible to use a divide-and-conquer approach. Rather than ask a complex question, such as whether the vehicle’s sensors can spot a deer crossing the road during a blizzard, it’s easier to ask simpler questions, such as whether the car can tell when its sensors are blocked by snow, or when hardware has failed due to cold temperatures. To that you could add the question, “If yes, can the vehicle adjust its decision making accordingly?” We can then tackle each of these smaller questions with whichever verification method is best suited to answer it, whether that’s a computer simulation, a quick spin on a test track, or putting the car in a real traffic situation.

In reality, any practical approach will fall somewhere between brute-force testing and the divide-and-conquer mode. Because technology develops in fast iterations, it would be wise to emphasize divide and conquer. For example, whenever the hardware or software is changed, any data collected by a brute-force approach may no longer be valid. Divide and conquer directs us to focus on retesting the safety of only the systems that were updated, while avoiding a time-consuming regathering of data we already have.

To tackle divide-and-conquer situations, we divide a self-driving vehicle’s system into four components—the human-machine interface, perception, decision making, and vehicle control. The human-machine interface has to do with the way the vehicle and its user interact. Perception is how the vehicle’s sensors create a view of its surroundings, decision making plans how the vehicle should respond to that view, and vehicle control is the plan’s physical execution. Each component has its own corner cases and methods to verify that it will be acceptably safe.

Consider the user interface in a car that drives itself most of the time but still requires occasional human intervention. Even then, a user may impair the vehicle’s performance by trying to take control of the vehicle unexpectedly. To find out how people react, we simulate autonomy by hiding a professional driver in the back of the vehicle, thus giving the impression that the vehicle is driving itself, most of the time. We call it the “Wizard of Oz” vehicle, because like the famous wizard in the movie, we use misdirection to draw attention away from the “man behind the curtain,” as the wizard describes himself.

It’s not an elaborate ruse—we hide the actual driver behind some plywood. Perhaps surprisingly, the test subjects rarely question the backseat enclosure. When one of them does ask, we explain it away as a space to hold computers or other equipment necessary for the prototype. Satisfied, our test subjects can sit back and experience a “true” self-driving car. In fact, they become so comfortable they often get bored or even fall asleep—revealing that we can’t always rely on drivers to react quickly if cars need them to take over in a tricky situation.

In time, as human supervision of the vehicle decreases, the bigger challenge becomes showing that the vehicle can safely drive—unsupervised—in any situation it may encounter. That means putting trust in the vehicle’s sensors after performing standalone sensor tests.

We can test cameras, for example, in real driving conditions, but it’s just as easy to point the camera at a screen displaying an augmented image of a real road. It’s tough to test whether a camera could detect a moose on the road in the real world—getting the moose to cooperate would be tricky—but we can test how well the cameras see the moose by augmenting actual road footage with an image of a moose at whatever distance and angle we need. In contrast, when we test the radar, we place the radar in a room where we can partially or entirely cover the radar with water or snow and test whether it recognizes the blockage.

Of course, we also test how these sensors work together in the actual vehicle. We can operate the vehicle in a self-driving mode to observe how well it gathers and uses the sensor data. We’re primarily concerned with understanding how well the vehicle’s sensors function, so we also collect sensor data during manual driving sessions to see how well the vehicle detected its surroundings. These driving tests also give us an opportunity to determine how well the vehicle detects its own limitations—for example, radar works well in fog but lidar and cameras do not. The vehicle must recognize those limitations and adjust accordingly.

Then comes perception. In general, the best way to test how well sensors perform this job is by placing real vehicles in real traffic conditions, in both good and bad weather. For less common corner cases, such as detecting debris on the road, the real traffic experiences are supplemented with test-track scenarios.

We conduct our scenario tests at the AstaZero Proving Ground, a large test track partially owned by the Research Institutes of Sweden. The test track extends through wooded areas and a section simulating a town. AstaZero can be configured to tackle virtually any traffic conditions we would be interested in testing.

We supplement such traffic testing with increasingly sophisticated virtual simulations. At the same time, augmented imagery could assist in testing some corner cases. Using augmented reality, for example, we can test how well a vehicle obeys U.S. traffic signs using a road in Sweden; all we have to do is superimpose those signs over the Swedish ones. This approach can be even more helpful in testing dangerous corner cases like driving near pedestrians at high speed.

Decision making must be evaluated separately from perception. That is, rather than test whether the vehicle sees the world properly, we judge how well it works with the uncertain and incomplete picture that the perception system provides it. For example, a blind corner could be hiding a pedestrian, who’s liable to step into the road just as the vehicle approaches. The decision-making system must perceive that obstructed sight line and take that limitation into consideration by planning to slow down when it approaches the corner. Like any human driver, the vehicle must know its own limitations.

We’re also building and continuously expanding an immense database of scenarios to test the decision-making system against any scenario we choose. Consider a self-driving vehicle in a lane adjacent to a large 18-wheeler traveling slightly behind it. Ideally, the vehicle will recognize that it is driving in the truck’s blind spot and adjust its position to decrease the chance of an accident. Our database of scenarios can expose the vehicle’s decision-making process to any variety of scenarios involving lane changes, merges, or other traffic conditions with different speeds, positions, and distances between vehicles to observe what decisions the vehicle makes. Currently, our vehicles can handle lane changes quite well, though we still require them to keep a greater distance from the vehicles around them than a human driver would need.
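In spirit, such a scenario sweep looks like the following sketch. Everything in it is a stand-in: the `Scenario` fields, the toy policy, and the 1.5-second time-gap requirement are assumptions chosen for illustration, not our decision-making software or our acceptance criteria.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Scenario:
    ego_speed_mps: float    # speed of the self-driving vehicle
    other_speed_mps: float  # speed of the vehicle in the target lane
    gap_m: float            # distance between the two vehicles


def generate_scenarios(ego_speeds, other_speeds, gaps):
    """Enumerate every combination of the given speeds and gaps."""
    return [Scenario(e, o, g) for e, o, g in product(ego_speeds, other_speeds, gaps)]


def toy_policy(s):
    """Stand-in decision maker: change lanes only with a 2-second time gap."""
    return s.gap_m >= 2.0 * max(s.ego_speed_mps, s.other_speed_mps)


def unsafe_decisions(policy, scenarios, min_time_gap_s=1.5):
    """Scenarios where the policy changes lanes with less than the required gap."""
    return [
        s for s in scenarios
        if policy(s)
        and s.gap_m < min_time_gap_s * max(s.ego_speed_mps, s.other_speed_mps)
    ]


scenarios = generate_scenarios([20, 30], [20, 25, 30], [20, 40, 60, 80])
print(len(scenarios), "scenarios;",
      len(unsafe_decisions(toy_policy, scenarios)), "unsafe decisions")
```

Swapping in a different policy, or widening the parameter ranges, reruns the same checks without changing the harness.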

Last but not least, the car must be able to execute its driving plan. We need to know how the vehicle’s normal capacity to steer and to brake might be limited by conditions on the road, such as ice. We can rely, in part, on computer simulations, thanks to accurate models of vehicle components. But we can’t model everything; to test how the car handles a tire blowout or an unexpected pothole, we have to put it on the test track.

We’ve spent quite some time on the test track tackling the diverse challenges of on-road objects. Even small things, such as a rock or an exhaust pipe that’s suddenly fallen off another car, can damage a vehicle if it’s going fast enough. For all of these objects, we need to be sure the vehicle can detect them from far enough away to brake or change lanes in time. Our testing is ongoing; while we’ve gotten good results for some obstructions, others pose specific challenges. Tires in the road, for example, aren’t picked up by lidar or radar very well. For the vehicle to react to them in time, we still need to improve its ability to detect them with cameras.
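How far away is "far enough" follows from basic kinematics: the distance covered while the system reacts, plus the braking distance to a standstill. The sketch below assumes constant deceleration, and the speed, reaction time, and deceleration values are illustrative rather than measured.

```python
def required_detection_range_m(speed_mps, reaction_time_s, decel_mps2):
    """Minimum distance at which an object must be detected to stop before it.

    Reaction distance (speed * time) plus braking distance (v^2 / 2a),
    assuming constant deceleration.
    """
    if decel_mps2 <= 0.0:
        raise ValueError("decel_mps2 must be positive")
    return speed_mps * reaction_time_s + speed_mps ** 2 / (2.0 * decel_mps2)


# Illustrative values: 30 m/s (108 km/h), 0.5 s from detection to braking.
dry = required_detection_range_m(30.0, 0.5, 6.0)  # firm braking on dry asphalt
icy = required_detection_range_m(30.0, 0.5, 2.0)  # reduced grip on ice
print(f"dry road: {dry:.0f} m, icy road: {icy:.0f} m")
```

With these numbers the required range grows from 90 meters on a dry road to 240 meters on ice, which shows why road conditions and detection range have to be considered together.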

A divide-and-conquer approach still has to show how the system components work together. Verifying the complete system means running countless simulations of various combinations of traffic situations and introduced failures.

Of course, other companies are tackling this problem in different ways. In our experience, older, bigger companies are relying on brute-force methods. However, we see value in being modular and swapping out hardware as needed. We also want to keep the driver involved and learn how to build an autonomous vehicle that is safe for users.

We can use these methods to test the redundancy of a fully self-driving vehicle. Suppose the vehicle’s brakes fail. The vehicle must carry out a complex sequence of actions: It has to detect the problem, plan maneuvers to react with a minimum amount of risk, and carry out that plan using a secondary system. Similar redundancies must be included for components like sensors, control units, software, communication systems, and the mechanical systems, and they must all be tested thoroughly.

Even after all that, we need to verify that the complete system in the real vehicle in real traffic operates safely. Car manufacturers and self-driving vehicle startups will deploy test fleets and drivers for this final step.

Self-driving vehicles are on the way, and they will make roads safer and more efficient. But don’t expect to see them on streets soon. It will still take a while to verify every corner case. Once a vehicle has been verified for a specific scenario, like downtown New York City in July, it will take additional verification to ensure it can handle the same area under winter conditions, as well as in downtown Shanghai, on country roads, and so forth.

By focusing on the acceptable rather than obsessing over perfection, and by breaking down the colossal number of safety verifications into manageable tasks, we can make self-driving vehicles a reality for many customers around the world.

This article appears in the March 2018 print issue as “Driving Tests for Self-Driving Cars.”

About the Authors

Jonas Nilsson is the technical lead for functional safety in autonomous drive at Zenuity, in Göteborg, Sweden. Erik Coelingh is technology advisor at Zenuity.
