
Workshop
Progress and Challenges in Building Trustworthy Embodied AI
Chen Tang · Karen Leung · Leilani Gilpin · Jiachen Li · Changliu Liu

Fri Dec 02 05:50 AM -- 03:00 PM (PST) @ Room 281 - 282

The recent advances in deep learning and artificial intelligence have equipped autonomous agents with increasing intelligence, enabling human-level performance in challenging tasks. In particular, agents with advanced intelligence have shown great potential in interacting and collaborating with humans (e.g., self-driving cars, industrial robot co-workers, smart homes, and domestic robots). However, the opaque nature of deep learning models makes it difficult to decipher the agents' decision-making processes, preventing stakeholders from readily trusting autonomous agents, especially in safety-critical tasks requiring physical human interaction. In this workshop, we bring together experts with diverse and interdisciplinary backgrounds to build a roadmap for developing and deploying trustworthy interactive autonomous systems at scale. Specifically, we aim to address the following questions: 1) What properties are required for building trust between humans and interactive autonomous systems? How can we assess and ensure these properties without compromising the expressiveness of the models and the performance of the overall systems? 2) How can we develop and deploy trustworthy autonomous agents under an efficient and trusted workflow? How should we transition from development to deployment? 3) How do we define standard metrics to quantify trustworthiness, from regulatory, theoretical, and experimental perspectives? How do we know that these trustworthiness metrics can scale to the broader population? 4) What are the most pressing aspects and open questions for the development of trustworthy autonomous agents that interact with humans? Which research areas are prime for academic research, and which are better suited for industry?

Fri 5:50 a.m. - 6:00 a.m.  Opening Remarks (Introduction)

Fri 6:00 a.m. - 6:25 a.m.  Trustworthy Robots for Human-Robot Interaction (Keynote Talk) · Harold Soh

Fri 6:25 a.m. - 6:30 a.m.  Q & A

Fri 6:30 a.m. - 6:55 a.m.  Towards Safe Model-based Reinforcement Learning (Keynote Talk) · Felix Berkenkamp

Fri 6:55 a.m. - 7:00 a.m.  Q & A

Fri 7:00 a.m. - 7:25 a.m.  Scenario Generation via Quality Diversity for Trustworthy AI (Keynote Talk) · Stefanos Nikolaidis

Fri 7:25 a.m. - 7:30 a.m.  Q & A

Fri 7:30 a.m. - 7:36 a.m.  Take 5: Interpretable Image Classification with a Handful of Features (Spotlight)
Deep neural networks use thousands of mostly incomprehensible features to identify a single class, a decision no human can follow. We propose an interpretable, sparse, and low-dimensional final decision layer in a deep neural network, with measurable aspects of interpretability, and demonstrate it on fine-grained image classification. We argue that a human can only understand the decision of a machine learning model if the input features are interpretable and only very few of them are used for a single decision. For that, the final layer has to be sparse and, to make interpreting the features feasible, low-dimensional. We call a model with a Sparse Low-Dimensional Decision an "SLDD-Model". We show that an SLDD-Model is easier to interpret locally and globally than a dense high-dimensional decision layer while maintaining competitive accuracy. Additionally, we propose a loss function that improves a model's feature diversity and accuracy. Our interpretable SLDD-Model uses only 5 of just 50 features per class, while maintaining 97% to 100% of the accuracy on four common benchmark datasets, compared to the baseline model with 2048 features.
Thomas Norrenbrock · Marco Rudolph · Bodo Rosenhahn
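The sparse, low-dimensional decision layer that the Take 5 abstract describes can be illustrated with a toy sketch: keep only the five largest-magnitude weights per class in a linear head, so each class score depends on a handful of inspectable features. This is a minimal NumPy illustration of the idea, not the authors' SLDD-Model implementation; the shapes and the magnitude-based pruning rule are assumptions.

```python
import numpy as np

def sparse_decision_head(weights: np.ndarray, k: int = 5) -> np.ndarray:
    """Zero out all but the k largest-magnitude weights in each class row,
    mimicking a sparse, low-dimensional final decision layer."""
    pruned = np.zeros_like(weights)
    for c, row in enumerate(weights):
        keep = np.argsort(np.abs(row))[-k:]  # indices of the k strongest features
        pruned[c, keep] = row[keep]
    return pruned

rng = np.random.default_rng(0)
W = rng.normal(size=(10, 50))          # dense head: 10 classes x 50 features
W_sparse = sparse_decision_head(W)     # each class now uses only 5 features

x = rng.normal(size=50)                # low-dimensional features of one input
scores = W_sparse @ x                  # each class score is a 5-term sum
```

With 5 of 50 features per class, a prediction can be explained by listing the few contributing features and their weights, which is the interpretability argument the abstract makes.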
Fri 7:36 a.m. - 7:42 a.m.  Characterising the Robustness of Reinforcement Learning for Continuous Control using Disturbance Injection (Spotlight)
In this study, we leverage the deliberate and systematic fault-injection capabilities of an open-source benchmark suite to perform a series of experiments on state-of-the-art deep and robust reinforcement learning algorithms. We aim to benchmark robustness in the context of continuous action spaces, which is crucial for deployment in robot control. We find that robustness is more prominent for action disturbances than for disturbances to observations and dynamics. We also observe that state-of-the-art approaches that are not explicitly designed to improve robustness perform at a level comparable to that achieved by those that are. Our study and results are intended to provide insight into the current state of safe and robust reinforcement learning and a foundation for the advancement of the field, in particular for deployment in robotic systems. NOTE: We plan to submit a subset of our results in a shorter 4-page version of this paper to the NeurIPS 2022 Workshop on Distribution Shifts (DistShift). DistShift does NOT have proceedings and will be held on a different date (Dec. 3) than TEA.
Catherine Glossop · Jacopo Panerati · Amrit Krishnan · Zhaocong Yuan · Angela Schoellig

Fri 7:42 a.m. - 7:48 a.m.  MAFEA: Multimodal Attribution Framework for Embodied AI (Spotlight)
Understanding multimodal perception for embodied AI is an open question because such inputs may contain highly complementary as well as redundant information for the task. A relevant direction for multimodal policies is understanding the global trends of each modality at the fusion layer. To this end, we disentangle the attributions for visual, language, and previous-action inputs across different policies trained on the ALFRED dataset. Attribution analysis can be utilized to rank and group the failure scenarios, investigate modeling and dataset biases, and critically analyze multimodal EAI policies for robustness and user trust before deployment. We present MAFEA, a framework to compute global attributions per modality of any differentiable policy. In addition, we show how attributions enable lower-level behavior analysis in EAI policies through two example case studies on language and visual attributions.
Vidhi Jain · Jayant Sravan Tamarapalli · Sahiti Yerramilli · Yonatan Bisk

Fri 7:48 a.m. - 7:54 a.m.  Sim-to-Lab-to-Real: Safe Reinforcement Learning with Shielding and Generalization Guarantees (Spotlight)
Safety is a critical component of autonomous systems and remains a challenge for learning-based policies to be utilized in the real world. In this paper, we propose Sim-to-Lab-to-Real to bridge the reality gap with a probabilistically guaranteed safety-aware policy distribution. To improve safety, we apply a dual-policy setup where a performance policy is trained using the cumulative task reward and a backup (safety) policy is trained by solving the safety Bellman equation based on Hamilton-Jacobi reachability analysis. In Sim-to-Lab transfer, we apply a supervisory control scheme to shield unsafe actions during exploration; in Lab-to-Real transfer, we leverage the Probably Approximately Correct (PAC)-Bayes framework to provide lower bounds on the expected performance and safety of policies in unseen environments. We empirically study the proposed framework for ego-vision navigation in two types of indoor environments, including a photo-realistic one. We also demonstrate strong generalization performance through hardware experiments in real indoor spaces with a quadrupedal robot.
Kai-Chieh Hsu · Allen Z. Ren · Duy Nguyen · Anirudha Majumdar · Jaime Fisac
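The supervisory control scheme that the Sim-to-Lab-to-Real abstract describes can be sketched as a least-restrictive shield: query a safety critic on the performance policy's proposed action and fall back to the backup policy when the critic flags danger. The following is a hypothetical toy in plain Python; the policies, the critic, the sign convention (positive value means "predicted safe"), and the threshold are placeholder assumptions, not the paper's learned models.

```python
def shielded_action(state, task_policy, safety_policy, safety_value, threshold=0.0):
    """Supervisory control: keep the task policy's action unless the safety
    critic predicts it would bring the system too close to the unsafe set."""
    action = task_policy(state)
    if safety_value(state, action) > threshold:  # predicted safe: keep task action
        return action
    return safety_policy(state)                  # shield: override with backup policy

# Toy 1-D example: stay inside |x| < 1.
perf = lambda x: 0.5                       # task policy always pushes right
backup = lambda x: -0.5 if x > 0 else 0.5  # safety policy pushes back toward 0
value = lambda x, a: 1.0 - abs(x + a)      # margin to the boundary after acting

assert shielded_action(0.0, perf, backup, value) == 0.5   # safe: task action kept
assert shielded_action(0.9, perf, backup, value) == -0.5  # unsafe: shield engages
```

The shield only intervenes when needed, which is what lets exploration remain mostly unconstrained while unsafe actions are filtered out.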
Fri 7:54 a.m. - 8:00 a.m.  Addressing Mistake Severity in Neural Networks with Semantic Knowledge (Spotlight)
Robustness in deep neural networks and machine learning algorithms in general is an open research challenge. In particular, it is difficult to ensure algorithmic performance is maintained on out-of-distribution inputs or anomalous instances that cannot be anticipated at training time. Embodied agents will be deployed in these conditions and are likely to make incorrect predictions. An agent will be viewed as untrustworthy unless it can maintain its performance in dynamic environments. Most robust training techniques aim to improve model accuracy on perturbed inputs; as an alternate form of robustness, we aim to reduce the severity of mistakes made by neural networks in challenging conditions. We leverage current adversarial training methods to generate targeted adversarial attacks during the training process in order to increase the semantic similarity between a model's predictions and the true labels of misclassified instances. Results demonstrate that our approach performs better with respect to mistake severity compared to standard and adversarially trained models. We also find an intriguing role that non-robust features play with regard to semantic similarity.
Victoria Helus · Nathan Vaska · Natalie Abreu

Fri 8:00 a.m. - 9:00 a.m.  Coffee Break & Poster Session 1 (Break)

Fri 9:00 a.m. - 9:25 a.m.  Failure Identification for Semi- and Unstructured Robot Environments (Keynote Talk) · Katherine Driggs-Campbell

Fri 9:25 a.m. - 9:30 a.m.  Q & A

Fri 9:30 a.m. - 9:55 a.m.  Progress and Challenges in Learning Control Certificates in Large-scale Autonomy (Keynote Talk) · Chuchu Fan

Fri 9:55 a.m. - 10:00 a.m.  Q & A

Fri 10:00 a.m. - 10:25 a.m.  Explainable Interactive Learning for Human-Robot Teaming (Keynote Talk) · Matthew Gombolay

Fri 10:25 a.m. - 10:30 a.m.  Q & A

Fri 10:30 a.m. - 11:20 a.m.  Lunch Break (Break)

Fri 11:20 a.m. - 11:25 a.m.  Paper Award Ceremony (Award Ceremony)

Fri 11:25 a.m. - 11:45 a.m.  To Explain or Not to Explain: A Study on the Necessity of Explanations for Autonomous Vehicles (Awarded Paper Presentation)
Explainable AI, in the context of autonomous systems like self-driving cars, has drawn broad interest from researchers. Recent studies have found that providing explanations for autonomous vehicles' actions has many benefits (e.g., increased trust and acceptance), but put little emphasis on when an explanation is needed and how the content of an explanation changes with driving context. In this work, we investigate in which scenarios people need explanations and how the critical degree of explanation shifts with situations and driver types. Through a user experiment, we ask participants to evaluate how necessary an explanation is and measure the impact on their trust in self-driving cars in different contexts. Moreover, we present a self-driving explanation dataset with first-person explanations and associated measures of necessity for 1103 video clips, augmenting the Berkeley Deep Drive Attention dataset. Our research reveals that driver types and driving scenarios dictate whether an explanation is necessary. In particular, people tend to agree on the necessity for near-crash events but hold different opinions on ordinary or anomalous driving situations.
Yuan Shen · Shanduojiao Jiang · Yanlin Chen · Katherine Driggs-Campbell

Fri 11:40 a.m. - 11:45 a.m.  Q & A

Fri 11:45 a.m. - 12:05 p.m.  Post-Hoc Attribute-Based Explanations for Recommender Systems (Awarded Paper Presentation)
Recommender systems are ubiquitous in most of our interactions in the current digital world. Whether shopping for clothes, scrolling YouTube for exciting videos, or searching for restaurants in a new city, the recommender systems at the back end power these services. Most large-scale recommender systems are huge models trained on extensive datasets and are black boxes to both their developers and end users. Prior research has shown that providing recommendations along with their reasons enhances the trust, scrutability, and persuasiveness of recommender systems. Recent literature in explainability has been inundated with works proposing several algorithms to this end. Most of these works provide item-style explanations, i.e., "We recommend item A because you bought item B." We propose a novel approach to generate more fine-grained explanations based on the user's preference over the attributes of the recommended items. We perform experiments using real-world datasets and demonstrate the efficacy of our technique in capturing users' preferences and using them to explain recommendations. We also propose ten new evaluation metrics and compare our approach to six baseline methods. We have also submitted this paper to the Trustworthy and Socially Responsible Machine Learning workshop at NeurIPS. That workshop is on a different day and does not have proceedings.
Sahil Verma · Anurag Beniwal · Narayanan Sadagopan · Arjun Seshadri

Fri 12:00 p.m. - 12:05 p.m.  Q & A

Fri 12:05 p.m. - 12:30 p.m.  Providing Intelligible Explanations in Autonomous Driving (Keynote Talk) · Daniel Omeiza

Fri 12:30 p.m. - 12:35 p.m.  Q & A

Fri 12:35 p.m. - 12:41 p.m.  Dynamic Efficient Adversarial Training Guided by Gradient Magnitude (Spotlight)
Adversarial training is an effective but time-consuming way to train robust deep neural networks that can withstand strong adversarial attacks. As a response to its inefficiency, we propose Dynamic Efficient Adversarial Training (DEAT), which gradually increases the number of adversarial iterations during training. We demonstrate that the gradient's magnitude correlates with the curvature of the trained model's loss landscape, which allows it to reflect the effect of adversarial training. Therefore, based on the magnitude of the gradient, we propose a general acceleration strategy, M+ acceleration, which enables an automatic and highly effective method of adjusting the training procedure. M+ acceleration is computationally efficient and easy to implement. It is suited for DEAT and compatible with the majority of existing adversarial training techniques. Extensive experiments have been done on the CIFAR-10 and ImageNet datasets with various training environments. The results show that the proposed M+ acceleration significantly improves the training efficiency of existing adversarial training methods while maintaining or even enhancing their robustness performance. This demonstrates that the strategy is highly adaptive and offers a valuable solution for automatic adversarial training.
Fu Wang · Yanghao Zhang · Wenjie Ruan · Yanbin Zheng

Fri 12:41 p.m. - 12:47 p.m.  A Theory of Learning with Competing Objectives and User Feedback (Spotlight)
Large-scale deployed learning systems are often evaluated along multiple objectives or criteria. But how can we learn or optimize such complex systems, with potentially conflicting or even incompatible objectives? How can we improve the system when user feedback becomes available, feedback possibly alerting to issues not previously optimized for by the system? We present a new theoretical model for learning and optimizing such complex systems. Rather than committing to a static or pre-defined tradeoff for the multiple objectives, our model is guided by the feedback received, which is used to update its internal state. Our model supports multiple objectives that can be of very general form and takes into account their potential incompatibilities. We consider both a stochastic and an adversarial setting. In the stochastic setting, we show that our framework can be naturally cast as a Markov decision process with stochastic losses, for which we give efficient vanishing-regret algorithmic solutions.
In the adversarial setting, we design efficient algorithms with competitive ratio guarantees. We also report the results of experiments with our stochastic algorithms validating their effectiveness.
Pranjal Awasthi · Corinna Cortes · Yishay Mansour · Mehryar Mohri

Fri 12:47 p.m. - 12:53 p.m.  A Framework for Generating Dangerous Scenes for Testing Robustness (Spotlight)
Benchmark datasets for autonomous driving, such as KITTI, nuScenes, Argoverse, or Waymo, are realistic but designed to be faultless. These datasets do not contain errors, difficult driving maneuvers, or other corner cases. We propose a framework for perturbing autonomous vehicle datasets, the DANGER framework, which generates edge-case images on top of current autonomous driving datasets. The inputs to DANGER are photorealistic datasets from real driving scenarios. We present the DANGER algorithm for vehicle position manipulation and the interface to the renderer module, and we present the generation of five scenario-level dangerous primitives applied to the Virtual KITTI and Virtual KITTI 2 datasets. Our experiments show that DANGER can be used as a framework for expanding current datasets to cover generated yet realistic and anomalous corner cases.
Shengjie Xu · Lan Mi · Leilani Gilpin

Fri 12:53 p.m. - 12:59 p.m.  What Makes a Good Explanation?: A Unified View of Properties of Interpretable ML (Spotlight)
Interpretability provides a means for humans to verify aspects of machine learning (ML) models. Different tasks require explanations with different properties. However, there is presently a lack of standardization in assessing properties of explanations: different papers use the same term to mean different quantities, and different terms to mean the same quantity. This lack of standardization prevents us from rigorously comparing explanation systems. In this work, we survey explanation properties defined in the current interpretable ML literature, synthesize properties based on what they measure, and describe the trade-offs between different formulations of these properties. We provide a unifying framework for comparing properties of interpretable ML.
Varshini Subhash · Zixi Chen · Marton Havasi · Weiwei Pan · Finale Doshi-Velez

Fri 1:00 p.m. - 2:00 p.m.  Coffee Break & Poster Session (Break)

Fri 2:00 p.m. - 2:55 p.m.  Panel Discussion (Panel) · Chuchu Fan · Stefanos Nikolaidis · Katherine Driggs-Campbell · Matthew Gombolay · Daniel Omeiza

Fri 2:55 p.m. - 3:00 p.m.  Closing Remarks (Closing)

Posters (abstracts appear with the corresponding talks above):
- Post-Hoc Attribute-Based Explanations for Recommender Systems (Poster)
- Characterising the Robustness of Reinforcement Learning for Continuous Control using Disturbance Injection (Poster)
- What Makes a Good Explanation?: A Unified View of Properties of Interpretable ML (Poster)
- Dynamic Efficient Adversarial Training Guided by Gradient Magnitude (Poster)
- A Theory of Learning with Competing Objectives and User Feedback (Poster)
- Sim-to-Lab-to-Real: Safe Reinforcement Learning with Shielding and Generalization Guarantees (Poster)
- Addressing Mistake Severity in Neural Networks with Semantic Knowledge (Poster)
- A Framework for Generating Dangerous Scenes for Testing Robustness (Poster)
- MAFEA: Multimodal Attribution Framework for Embodied AI (Poster)
- Take 5: Interpretable Image Classification with a Handful of Features (Poster)
- To Explain or Not to Explain: A Study on the Necessity of Explanations for Autonomous Vehicles (Poster)