Safety Assessment Of Autonomous Vehicles With Naturalistic And .


Safety Assessment of Autonomous Vehicles with Naturalistic and Adversarial Driving Environment
Henry Liu and Shuo Feng
Department of Civil and Environmental Engineering
University of Michigan Transportation Research Institute
University of Michigan, Ann Arbor
June 9th, 2021
CCAT Research Review

Purposes of AV Testing
- Behavior competency and safety performance evaluation
- Scenario identification & training
[Figure: long tail of events. Probability versus events, from normal events to rare events, with the axis in accidents per mile (around 10^-6).]

How to test an AI-based driving system?
- There is neither a consensus nor a standard procedure for testing and evaluating AVs.
- The AI-based agent, which is usually a black box to external users, limits the use of traditional logic-based software verification and validation techniques.
- The prevailing state-of-the-art approach to AV testing uses the agent-environment framework, through a combination of software simulation, closed-track testing, and on-road testing.

Existing testing methods
- Simulation: low fidelity
- Test track: lack of traffic
- On-road: expensive in time and cost

Testing a human driver vs. testing an AV
- Human driver: vision test, written test, and on-road test in a naturalistic driving environment
- Automated driving system: simulation / test track / on-road, in a (simulated) naturalistic driving environment

How to model naturalistic driving environment (NDE)?
- The decision variables of a naturalistic driving environment can be represented as
$$X = \begin{bmatrix} X_{1,1} & \cdots & X_{1,T} \\ \vdots & \ddots & \vdots \\ X_{N,1} & \cdots & X_{N,T} \end{bmatrix}, \quad X \in \mathbb{X},$$
where $X_{i,k}$ denotes the variables (position and speed) of the $i$-th vehicle at the $k$-th time step.
- The simulated NDE should follow the same distribution as real-world traffic, that is, $X \sim P(X)$.
- Challenge: the probability distribution over the joint state space of all background vehicles in the simulated NDE needs to be consistent with that of the real-world driving environment. This is particularly challenging because of the high dimensionality.

High dimensionality of NDE
- Static: weather, speed limit, lane number
- Dynamic: maneuvers (cut-in, free driving, following, lane change, overtaking, leave lane, cut-through, slow traffic, stop & go)
- Parameters: background vehicle (BV) initial position and speed; BV cut-in position and speed; BV cut-in angle
- Enumeration of all scenarios is impossible.
- Different scenarios differ in importance.
- A small number of critical scenarios make the major contribution.

What does AV testing mean?
- To evaluate an AV's safety performance, the probability of an event of interest (e.g., an accident) is usually measured by testing the AV in the naturalistic driving environment (NDE), either in simulation or in the real world.
- Basic procedure: test the AV in the NDE $n$ times and observe $m$ events; the estimated rate is then $m/n$:
$$P(A) = \sum_{X \in \mathbb{X}} P(A \mid X)\, P(X) \approx \frac{1}{n} \sum_{j=1}^{n} P(A \mid X_j) \approx \frac{m}{n}, \quad X_j \sim P(X),$$
where $X$ denotes the decision variables of the NDE, $A$ the event of interest (e.g., an accident), $m$ the number of events that occurred during the tests, and $n$ the number of tests.
- However, because $A$ is a rare event, a very large $n$ is required to obtain an accurate estimate.
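The basic procedure above can be sketched as a crude Monte Carlo loop. This is only an illustrative toy: `toy_test` and its event probability `p_event` are hypothetical stand-ins for one simulated AV test, not part of the presented method.

```python
import random


def crude_monte_carlo(run_test, n, seed=0):
    """Estimate P(A) as m/n from n independent tests sampled from the NDE."""
    rng = random.Random(seed)
    m = sum(1 for _ in range(n) if run_test(rng))  # m = number of observed events
    return m / n, m


def toy_test(rng, p_event=1e-3):
    # Hypothetical stand-in for one AV test: True if the event A occurs.
    return rng.random() < p_event


n_tests = 200_000
rate, m_events = crude_monte_carlo(toy_test, n_tests)
print(f"m = {m_events}, n = {n_tests}, estimated rate = {rate:.2e}")
```

Even in this toy, the event probability is far larger than a real accident rate; the rarer the event, the more tests are needed before `m` is large enough for a stable estimate.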

Severe Inefficiency [Kalra and Paddock, 2016]
- To obtain an accurate estimate, the required number of accidents is
$$m = \left( \frac{z_{1-\alpha/2}}{\beta} \right)^{2},$$
where $1-\alpha$ is the confidence level, $z_{1-\alpha/2}$ is the constant determined by $\alpha$, and $\beta$ is the relative estimation error.
- For example, with a 95% confidence level ($\alpha = 0.05$) and $\beta = 20\%$, we have $z_{1-\alpha/2} = 1.96$ and
$$m = \left( \frac{1.96}{0.2} \right)^{2} \approx 96.$$
- Because the accident rate of human drivers is very small (e.g., 1.09 fatalities per 100 million miles), about 8.8 billion miles are required to encounter that many accidents, which would take hundreds of years even under aggressive testing assumptions.
Reference: Kalra, N. and S. M. Paddock. Driving to safety: How many miles of driving would it take to demonstrate autonomous vehicle reliability? Transportation Research Part A: Policy and Practice, 2016, 94, pp. 182-193.
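The arithmetic on this slide can be checked with a few lines of Python. The function name `required_events` is an illustrative choice; the inputs (95% confidence, 20% relative error, 1.09 fatalities per 100 million miles) are the ones quoted above.

```python
from statistics import NormalDist


def required_events(confidence=0.95, beta=0.2):
    """Number of events m = (z_{1-alpha/2} / beta)^2 for a given relative error."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for 95%
    return (z / beta) ** 2


m_required = required_events()             # about 96 events
fatality_rate = 1.09 / 100e6               # fatalities per mile
miles_required = m_required / fatality_rate

print(f"required events: {m_required:.1f}")
print(f"required miles:  {miles_required:.2e}")
```

The result is roughly 96 events and 8.8 billion miles, matching the figures on the slide.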

Major Challenges
- Variables that define the environment are high dimensional, which can cause the "curse of dimensionality". In addition, the underlying distribution needs to be consistent with that of the real world.
- Rareness of events: the rareness of events can lead to intolerable inefficiency in testing.
[Figure: events along a distance axis from 0 to 1,000,000 miles.]

AV safety testing framework
Solutions and objectives:
- Naturalistic Driving Environment (NDE): the underlying distribution is consistent with that of the real world.
- Naturalistic and Adversarial Driving Environment (NADE): accelerate the testing through adversarial scenario generation.
- Augmented Reality (AR) Testing Platform: improve testing fidelity using simulated vehicles as background traffic.

Distributional Inconsistency Issue in Existing Simulation
[Figure: probability density functions of car-following BV acceleration and range r, showing a large deviation in range.]
- Most existing models are developed for traffic flow simulation, not for safety evaluation.

A Data-driven NDE Modeling Framework
1. Data-driven step: data-driven model built from large-scale naturalistic driving data at UMTRI.
2. Optimization step: optimized data-driven model with small inconsistency.
[Figure: probability density function of velocity.]
- The goal of NDE is to reproduce the real-world traffic environment.

Rare event estimation with importance sampling
- NDE sampling:
$$P(A) \approx \frac{1}{n} \sum_{j=1}^{n} P(A \mid X_j) \approx \frac{m}{n}, \quad X_j \sim P(X).$$
- Importance sampling: construct a new sampling distribution $q(X)$:
$$P(A) = \sum_{X \in \mathbb{X}} P(A \mid X)\, P(X) = \sum_{X \in \mathbb{X}} \frac{P(A \mid X)\, P(X)}{q(X)}\, q(X) \approx \frac{1}{n} \sum_{j=1}^{n} \frac{P(A \mid X_j)\, P(X_j)}{q(X_j)}, \quad X_j \sim q(X).$$
- Construct an importance function $q(X)$ so that:
1. The estimate of $P(A)$ is unbiased.
2. The number of required tests (or testing miles) is smaller than for public road tests.
- The construction of the importance function leads to the development of the naturalistic and adversarial driving environment (NADE).
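A minimal sketch of the importance sampling estimator, on a toy problem that is not from the talk: the "environment" is a single variable X ~ N(0, 1), the event A is X > 4, and the importance function q is assumed to be N(4, 1). All of these choices are illustrative.

```python
import math
import random
from statistics import NormalDist


def importance_sampling_estimate(n, shift=4.0, threshold=4.0, seed=0):
    """Estimate P(X > threshold) for X ~ N(0, 1) by sampling X_j ~ q = N(shift, 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(shift, 1.0)  # X_j ~ q(X)
        if x > threshold:          # P(A | X_j) is 0 or 1 in this toy
            # Likelihood ratio P(x) / q(x) for N(0, 1) against N(shift, 1).
            total += math.exp(-shift * x + 0.5 * shift * shift)
    return total / n


exact = 1.0 - NormalDist().cdf(4.0)  # about 3.2e-5
estimate = importance_sampling_estimate(20_000)
print(f"exact = {exact:.3e}, importance sampling estimate = {estimate:.3e}")
```

With crude sampling from N(0, 1), 20,000 samples would typically contain no event at all; weighting samples from q by P/q keeps the estimator unbiased while making events frequent.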

Naturalistic and Adversarial Driving Environment
- NADE theory: if a small subset of variables is critical to the rare events, apply importance sampling to that subset and crude Monte Carlo to the remaining variables.
- NADE implementation: the key idea is to train the background vehicles in the NDE to learn when to execute which adversarial maneuver, while ensuring unbiasedness and improving efficiency. The selection of background vehicles and their maneuvers is based on maneuver criticality.
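The idea of applying importance sampling to only the critical variables can be illustrated with a toy that is not the NADE implementation. Here one hypothetical binary variable (whether an adversarial maneuver happens, with naturalistic probability `p_c`) is sampled from an assumed importance distribution `q_c`, while the remaining variables are sampled from their original distribution and carry no weight.

```python
import random


def partial_importance_sampling(n, p_c=1e-3, q_c=0.5, n_rest=5, seed=0):
    """Toy estimator: importance sampling on one critical variable only."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        critical = rng.random() < q_c                  # critical variable ~ q
        rest = [rng.random() for _ in range(n_rest)]   # remaining variables ~ P
        # Only the critical variable contributes to the likelihood ratio.
        weight = p_c / q_c if critical else (1.0 - p_c) / (1.0 - q_c)
        # Hypothetical event: maneuver happens and the other variables are unfavorable.
        event = critical and sum(rest) / n_rest > 0.5
        if event:
            total += weight
    return total / n


true_rate = 1e-3 * 0.5  # P(critical) * P(mean of uniforms > 0.5)
estimated_rate = partial_importance_sampling(20_000)
print(f"true = {true_rate:.2e}, estimate = {estimated_rate:.2e}")
```

Because the weight involves only the critical variable, its variance does not grow with the number of remaining variables, which is the motivation for restricting importance sampling to a small subset.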

Maneuver Criticality
- Which maneuvers of background vehicles are more important?
  - Maneuvers that are challenging for an AV (adversarial)
  - Maneuvers that are common in the real world (naturalistic)
- Maneuver criticality definition:
$$V(u_i \mid s) \overset{\text{def}}{=} P(S_i \mid s, u_i)\, P(u_i \mid s),$$
where $P(S_i \mid s, u_i)$ is the maneuver challenge and $P(u_i \mid s)$ is the exposure frequency, obtained from naturalistic driving data. $S_i$ is the event of interest (e.g., an accident) caused by the $i$-th background vehicle in a surrogate model of AVs.

Surrogate Model
- What is a surrogate model?
[Figure: common features shared with the surrogate model.]
- Why do we need a surrogate model?
  - The exact AV model may be unavailable.
  - Domain knowledge and prior knowledge of testing can be leveraged.
- How to construct a surrogate model?
  - Simplified (high-level) AV models
  - Data-driven methods, e.g., human driving models

RL-based Maneuver Challenge Calculation
- We propose a reinforcement learning (RL)-based maneuver challenge calculation method.
- Markov decision process formulation: state $s = (v_1, r_1, v_2)$; action $u$.
- Maneuver challenge: define $Q(s, u) = P(S \mid s, u)$.
- $A$: accident event.
[Figure: example scenario, learning process, and learning results.]
[Feng et al., Nature Communications, 2021]

Identification of Principal Other Vehicle
- The criticality of each background vehicle is defined as
$$C_i(s) = \sum_{u_i} V(u_i \mid s) = \sum_{u_i} P(S_i \mid s, u_i)\, P(u_i \mid s),$$
where $P(S_i \mid s, u_i)$ is the maneuver challenge and $P(u_i \mid s)$ is the exposure frequency from naturalistic driving data.
- The principal other vehicle (POV) can be identified by
$$c = \arg\max_i C_i(s), \quad \text{if } C_c(s) > \bar{C},$$
where $\bar{C}$ is a pre-determined threshold (e.g., 0).
- A moment is a critical moment if there is a POV.
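The criticality sum and the POV selection rule can be sketched as follows. The vehicle names, maneuver names, and probability values are hypothetical, and the strict comparison against the threshold is an assumption about the garbled condition on the slide.

```python
def criticality(challenge, exposure):
    """C_i(s): sum over maneuvers u of P(S_i | s, u) * P(u | s)."""
    return sum(challenge[u] * exposure[u] for u in exposure)


def principal_other_vehicle(vehicles, threshold=0.0):
    """Return (POV id or None, per-vehicle criticality scores)."""
    scores = {vid: criticality(ch, ex) for vid, (ch, ex) in vehicles.items()}
    best = max(scores, key=scores.get)
    # Assumption: a POV exists only if its criticality exceeds the threshold.
    pov = best if scores[best] > threshold else None
    return pov, scores


# Hypothetical values: (maneuver challenge, exposure frequency) per background vehicle.
vehicles = {
    "bv1": ({"keep": 0.0, "cut_in": 0.2}, {"keep": 0.99, "cut_in": 0.01}),
    "bv2": ({"keep": 0.0, "brake": 0.0}, {"keep": 0.95, "brake": 0.05}),
}
pov, scores = principal_other_vehicle(vehicles)
print(pov, scores)
```

If every vehicle has zero criticality, the function returns `None` for the POV, which corresponds to a moment that is not critical.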

Intelligent Driving Intelligence
[Figure: conceptual generation framework of the naturalistic and adversarial driving environment (NADE); labels include adversarial environment, naturalistic and adversarial environment, and efficiency.]
[Feng et al., Nature Communications, 2021]

From Discrete Testing Scenario to Continuous Driving Environment
- Discrete testing scenarios can be generated for behavior competency evaluation.
- Continuous driving environments can be generated for safety performance evaluation.
Cut-in scenario:
- Only two vehicles
- Simple maneuvers
- A few seconds
Highway driving environment:
- Hundreds of vehicles
- Various maneuvers: lane changing, car-following, overtaking, etc.
- Hours of time duration

Cut-in Case Study
- Exposure frequency: locations of 414,770 lane-changing maneuvers in the Safety Pilot database.
- Surrogate model construction: Modified Intelligent Driving Model, with range and range rate $X = (R, \dot{R})$.
- Maneuver challenge.
- The generated library (5.38% of all scenarios): critical scenarios (naturalistic and adversarial).
[Feng et al., IEEE Trans. Intell. Transp. Syst., 2020a; 2020b; 2020c]

Cut-in Case Study
[Figure: estimation results, with panels labeled NDE.]
- We can obtain an estimate of the same accuracy about $9.87 \times 10^4$ times faster than public road testing (about 31 versus $3 \times 10^6$ tests).
[Feng et al., IEEE Trans. Intell. Transp. Syst., 2020a; 2020b; 2020c]

Case Study: Highway Driving Environment
- A three-lane highway driving environment is simulated.
- For each simulation test, an AV drives a constant distance (400 m).
- AV model: trained by deep reinforcement learning techniques considering both efficiency and safety.
- Surrogate models:
  - IDM car-following model
  - MOBIL lane-changing model
  - Calibrated by naturalistic driving data
  - The MOBIL model is modified into a stochastic model: the utility of the model is used to calculate the probability of lane-changing maneuvers.
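As a rough sketch of the two surrogate-model ingredients named above: the first function is the standard IDM acceleration formula with illustrative default parameters (not the calibrated values from the talk). The second maps a lane-changing utility to a probability; the slide does not say which mapping is used, so the logistic form and the `scale` parameter here are purely assumptions.

```python
import math


def idm_acceleration(v, gap, dv, v0=30.0, T=1.5, s0=2.0, a_max=1.0, b=1.5, delta=4):
    """Standard IDM acceleration (m/s^2).

    v: own speed (m/s), gap: bumper-to-bumper distance (m),
    dv: approach rate v - v_lead (m/s). Defaults are illustrative only.
    """
    s_star = s0 + max(0.0, v * T + v * dv / (2.0 * math.sqrt(a_max * b)))
    return a_max * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)


def lane_change_probability(utility, scale=1.0):
    """ASSUMPTION: logistic map from a MOBIL-style utility to a probability."""
    return 1.0 / (1.0 + math.exp(-utility / scale))


print(idm_acceleration(v=20.0, gap=10.0, dv=5.0))  # closing in on a slower leader
print(lane_change_probability(0.3))
```

Any monotone mapping from utility to probability would serve the same illustrative purpose; the deterministic MOBIL rule corresponds to a hard threshold on the same utility.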

Naturalistic Driving Databases
Safety Pilot Model Deployment (SPMD) Database:
- Over 2,800 personal vehicles, truck fleets, and transit buses
- About 140 vehicles equipped with Mobileye and DAS
- About 35 million miles or 1.2 million hours of driving
Integrated Vehicle-Based Safety System (IVBSS) Database:
- 16 vehicles, each with a prototype crash warning system
- 7 radars, 5 video streams, GPS, 500 other signals at 10 to 50 Hz
- 108 adult drivers
[Figure: Data Acquisition System (DAS) with radars; warnings include Curve Speed Warning (CSW), Lateral Drift Warning (LDW), and Forward Crash Warning (FCW).]

Data Processing
[Figure: (a) object detection of vehicles and lane markings; (b) maneuver identification (lane changing, car following) from the starting point to crossing the lane; (c) data categorization of vehicle maneuvers for the subject vehicle and other vehicles: free driving, cut-in, car following, and lane change with no, one, or two adjacent vehicles, described by speeds v1, v2, v3 and ranges r1, r2.]

Data Processing
- Exposure frequency calculation for each category:
[Figure: exposure frequency for free driving, cut-in, car following, and lane change with no, one, or two adjacent vehicles.]
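In its simplest form, an exposure frequency per category is a normalized count. The sketch below assumes the maneuvers have already been labeled by category; the label strings and counts are made up for illustration and are not taken from the databases.

```python
from collections import Counter


def exposure_frequency(labels):
    """Relative frequency of each maneuver category in a list of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


# Hypothetical labeled maneuvers, one label per observed maneuver.
labels = ["free_driving"] * 6 + ["car_following"] * 3 + ["cut_in"] * 1
frequencies = exposure_frequency(labels)
print(frequencies)
```

In practice the frequency would also be conditioned on the state (speeds and ranges), which turns the single counter into one histogram per category.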

Generation of NDE
[Feng et al., Nature Communications, 2021]

Generation of NADE
[Feng et al., Nature Communications, 2021]

Sparse adversarial adjustments
[Figure: distributions of bumper-to-bumper distance (m) for AV-II in NDE and in NADE.]
- NADE generates distributions very similar to those of NDE (naturalistic), but with many more dangerous scenarios that have small distances and small time-to-collision (TTC) (adversarial).
- We investigate the adjustment frequency of BVs' maneuvers in NADE: only about 1.5% of the maneuvers in the environment are adjusted, which is very sparse and thus keeps the environment naturalistic.

More valuable events
- We compare the events encountered by the AVs in NDE and NADE.
[Figure: event distributions for AV-I and AV-II in NDE and in NADE; panels cover BV cut-in, BV hard brake, AV lane change, lane conflict, and evasive lane change.]

Unbiased accident rate estimation
- We investigate the accuracy and efficiency of the tests.
[Figure: accident rate and relative half-width versus number of tests, for AV-II in NDE and in NADE.]
- For the AV-II model, NADE requires only $2.31 \times 10^4$ tests, while NDE requires $1.42 \times 10^8$ tests. Our method accelerates the evaluation by about 6,000 times and saves about 35 million driving miles.
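The reported speedup and mileage savings can be reproduced from the two test counts, assuming each test covers the 400 m stated for the highway case study. The variable names are illustrative.

```python
METERS_PER_MILE = 1609.344

tests_nde = 1.42e8      # tests required in NDE (from the slide)
tests_nade = 2.31e4     # tests required in NADE (from the slide)
test_length_m = 400.0   # assumption: 400 m per test, as in the highway case study

speedup = tests_nde / tests_nade
miles_saved = (tests_nde - tests_nade) * test_length_m / METERS_PER_MILE

print(f"speedup: about {speedup:,.0f}x")
print(f"miles saved: about {miles_saved / 1e6:.1f} million")
```

Under that assumption the ratio is a little over 6,000 and the saved distance is about 35 million miles, consistent with the numbers quoted above.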

Unbiased accident types
- We further investigate the unbiasedness of accident types.
[Figure: accident rates by accident type (Types 1 to 5) for AV-II in NDE and in NADE; bar labels read 1.07e-07, 1.01e-07, 6e-08, 6e-08, 0.0, 0.0, 0.0, 0.0, 0.0, and 3e-11.]

Adversarial Examples

Augmented Reality Testing Platform
[Figure: overall architecture of the AR and NADE system; AR at Mcity.]
[Feng et al., Accident Analysis & Prevention, 2020]

Cut-in Case Study: Field Test Execution

Augmented perception
- Augment the real-world videos with the virtual vehicles generated by NADE.
[Figure: real-world videos from the Ford AV dataset and the augmented videos produced by our solution.]

Counterfactual simulation/track testing of field-collected long tail events
[Figure: one long tail event in the real-world data (rare pedestrian behavior); our solution generates thousands of variants of long tail events.]

Summary of Our Solution to AV Testing
[Figure: simulation and test track, combined through NADE and AR.]
1. Naturalistic and Adversarial Environment (NADE): improving the fidelity of simulation through naturalistic modeling and accelerating the testing through adversarial scenario generation.
2. Augmented Reality (AR): using simulated vehicles as background traffic.

Related Publications
[1] Feng et al., 2020. Testing Scenario Library Generation for Connected and Automated Vehicles, Part I: Methodology. IEEE Transactions on Intelligent Transportation Systems. DOI: 10.1109/TITS.2020.2972211.
[2] Feng et al., 2020. Testing Scenario Library Generation for Connected and Automated Vehicles, Part II: Case Studies. IEEE Transactions on Intelligent Transportation Systems. DOI: 10.1109/TITS.2020.2988309.
[3] Feng S., Feng Y., Sun H., Zhang Y., and Liu H.X., 2020. Testing Scenario Library Generation for Connected and Automated Vehicles: An Adaptive Framework. IEEE Transactions on Intelligent Transportation Systems. DOI: 10.1109/TITS.2020.3023668.
[4] Feng S., Feng Y., Yan X., Shen S., Xu S., and Liu H.X., 2020. Safety assessment of highly automated driving systems in test tracks: A new framework. Accident Analysis and Prevention, 144. https://doi.org/10.1016/j.aap.2020.105664.
[5] Feng et al., 2021. Intelligent Driving Intelligence Test for Autonomous Vehicles with Naturalistic and Adversarial Environment. Nature Communications. DOI: 10.1038/s41467-021-21007-8.

Thank you!
