Timezone: »
Deep Reinforcement Learning (DRL) has demonstrated great potentials in solving sequential decision making problems in many applications. Despite its promising performance, practical gaps exist when deploying DRL in real-world scenarios. One main barrier is the over-fitting issue that leads to poor generalizability of the policy learned by DRL.
In particular, for offline DRL with observational data, model selection is a challenging task as there is no ground truth available for performance demonstration, in contrast with the online setting with simulated environments. In this work, we propose a pessimistic model selection (PMS) approach for offline DRL with a theoretical guarantee, which features a tuning-free framework for finding the best policy among a set of candidate models. Two refined approaches are also proposed to address the potential bias of DRL model in identifying the optimal policy. Numerical studies demonstrated the superior performance of our approach over existing methods.
Author Information
Chao-Han Huck Yang (Georgia Institute of Technology)

I just obtained my Ph.D. from Georgia Tech working on Speech Recognition, Language Model Adaptation, and Robust Reinforcement Learning.
Yifan Cui (National University of Singapore)
Pin-Yu Chen (IBM Research AI)
More from the Same Authors
-
2020 : Paper 10: Certified Interpretability Robustness for Class Activation Mapping »
Alex Gu · Tsui-Wei Weng · Pin-Yu Chen · Sijia Liu · Luca Daniel -
2021 : Certified Robustness for Free in Differentially Private Federated Learning »
Chulin Xie · Yunhui Long · Pin-Yu Chen · Krishnaram Kenthapadi · Bo Li -
2021 : MAML is a Noisy Contrastive Learner »
Chia-Hsiang Kao · Wei-Chen Chiu · Pin-Yu Chen -
2021 : QTN-VQC: An End-to-End Learning Framework for Quantum Neural Networks »
Jun Qi · Chao-Han Huck Yang · Pin-Yu Chen -
2022 : An Empirical Evaluation of Zeroth-Order Optimization Methods on AI-driven Molecule Optimization »
Elvin Lo · Pin-Yu Chen -
2022 : Improving Vertical Federated Learning by Efficient Communication with ADMM »
Chulin Xie · Pin-Yu Chen · Ce Zhang · Bo Li -
2022 : Visual Prompting for Adversarial Robustness »
Aochuan Chen · Peter Lorenz · Yuguang Yao · Pin-Yu Chen · Sijia Liu -
2022 : NeuralFuse: Improving the Accuracy of Access-Limited Neural Network Inference in Low-Voltage Regimes »
Hao-Lun Sun · Lei Hsiung · Nandhini Chandramoorthy · Pin-Yu Chen · Tsung-Yi Ho -
2022 : Visual Prompting for Adversarial Robustness »
Aochuan Chen · Peter Lorenz · Yuguang Yao · Pin-Yu Chen · Sijia Liu -
2022 : Do Domain Generalization Methods Generalize Well? »
Akshay Mehra · Bhavya Kailkhura · Pin-Yu Chen · Jihun Hamm -
2022 : On the Adversarial Robustness of Vision Transformers »
Rulin Shao · Zhouxing Shi · Jinfeng Yi · Pin-Yu Chen · Cho-Jui Hsieh -
2023 : AutoVP: An Automated Visual Prompting Framework and Benchmark »
Hsi-Ai Tsao · Lei Hsiung · Pin-Yu Chen · Sijia Liu · Tsung-Yi Ho -
2023 : AutoVP: An Automated Visual Prompting Framework and Benchmark »
Hsi-Ai Tsao · Lei Hsiung · Pin-Yu Chen · Sijia Liu · Tsung-Yi Ho -
2023 : Parameter Efficient Finetuning for Reducing Activation Density in Transformers »
Bharat Runwal · Tejaswini Pedapati · Pin-Yu Chen -
2023 : Transformers as Multi-Task Feature Selectors: Generalization Analysis of In-Context Learning »
Hongkang Li · Meng Wang · Songtao Lu · Hui Wan · Xiaodong Cui · Pin-Yu Chen -
2023 : What Improves the Generalization of Graph Transformer? A Theoretical Dive into Self-attention and Positional Encoding »
Hongkang Li · Meng Wang · Tengfei Ma · Sijia Liu · ZAIXI ZHANG · Pin-Yu Chen -
2023 : How to remove backdoors in diffusion models? »
Shengwei An · Sheng-Yen Chou · Kaiyuan Zhang · Qiuling Xu · Guanhong Tao · Guangyu Shen · Siyuan Cheng · Shiqing Ma · Pin-Yu Chen · Tsung-Yi Ho · Xiangyu Zhang -
2023 : VillanDiffusion: A Unified Backdoor Attack Framework for Diffusion Models »
Sheng-Yen Chou · Pin-Yu Chen · Tsung-Yi Ho -
2023 Poster: VillanDiffusion: A Unified Backdoor Attack Framework for Diffusion Models »
Sheng-Yen Chou · Pin-Yu Chen · Tsung-Yi Ho -
2023 Poster: RADAR: Robust AI-Text Detection via Adversarial Learning »
Xiaomeng Hu · Pin-Yu Chen · Tsung-Yi Ho -
2023 Poster: HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models »
CHEN CHEN · Yuchen Hu · Chao-Han Huck Yang · Sabato Marco Siniscalchi · Pin-Yu Chen · Eng-Siong Chng -
2023 Poster: Uncovering and Quantifying Social Biases in Code Generation »
Yan Liu · Xiaokang Chen · Yan Gao · Zhe Su · Fengji Zhang · Daoguang Zan · Jian-Guang Lou · Pin-Yu Chen · Tsung-Yi Ho -
2023 Poster: On the Convergence and Sample Complexity Analysis of Deep Q-Networks with $\epsilon$-Greedy Exploration »
Shuai Zhang · Hongkang Li · Meng Wang · Miao Liu · Pin-Yu Chen · Songtao Lu · Sijia Liu · Keerthiram Murugesan · Subhajit Chaudhury -
2022 : Panel »
Pin-Yu Chen · Alex Gittens · Bo Li · Celia Cintas · Hilde Kuehne · Payel Das -
2022 : Q & A »
Sayak Paul · Sijia Liu · Pin-Yu Chen -
2022 : Deep dive on foundation models for computer vision »
Pin-Yu Chen -
2022 Tutorial: Foundational Robustness of Foundation Models »
Pin-Yu Chen · Sijia Liu · Sayak Paul -
2022 : Basics in foundation model and robustness »
Pin-Yu Chen · Sijia Liu -
2022 : SynBench: Task-Agnostic Benchmarking of Pretrained Representations using Synthetic Data »
Ching-Yun Ko · Pin-Yu Chen · Jeet Mohapatra · Payel Das · Luca Daniel -
2022 Poster: Make an Omelette with Breaking Eggs: Zero-Shot Learning for Novel Attribute Synthesis »
Yu-Hsuan Li · Tzu-Yin Chao · Ching-Chun Huang · Pin-Yu Chen · Wei-Chen Chiu -
2021 Workshop: New Frontiers in Federated Learning: Privacy, Fairness, Robustness, Personalization and Data Ownership »
Nghia Hoang · Lam Nguyen · Pin-Yu Chen · Tsui-Wei Weng · Sara Magliacane · Bryan Kian Hsiang Low · Anoop Deoras -
2021 Poster: Predicting Deep Neural Network Generalization with Perturbation Response Curves »
Yair Schiff · Brian Quanz · Payel Das · Pin-Yu Chen -
2021 Poster: Mean-based Best Arm Identification in Stochastic Bandits under Reward Contamination »
Arpan Mukherjee · Ali Tajer · Pin-Yu Chen · Payel Das -
2021 Poster: Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Sparse Neural Networks »
Shuai Zhang · Meng Wang · Sijia Liu · Pin-Yu Chen · Jinjun Xiong -
2021 Poster: CAFE: Catastrophic Data Leakage in Vertical Federated Learning »
Xiao Jin · Pin-Yu Chen · Chia-Yi Hsu · Chia-Mu Yu · Tianyi Chen -
2021 Poster: Adversarial Attack Generation Empowered by Min-Max Optimization »
Jingkang Wang · Tianyun Zhang · Sijia Liu · Pin-Yu Chen · Jiacen Xu · Makan Fardad · Bo Li -
2021 : Live Q&A session: MAML is a Noisy Contrastive Learner »
Chia-Hsiang Kao · Wei-Chen Chiu · Pin-Yu Chen -
2021 : Contributed Talk (Oral): MAML is a Noisy Contrastive Learner »
Chia-Hsiang Kao · Wei-Chen Chiu · Pin-Yu Chen -
2021 : SenSE: A Toolkit for Semantic Change Exploration via Word Embedding Alignment »
MaurĂcio Gruppi · Sibel Adali · Pin-Yu Chen -
2021 Poster: When does Contrastive Learning Preserve Adversarial Robustness from Pretraining to Finetuning? »
Lijie Fan · Sijia Liu · Pin-Yu Chen · Gaoyuan Zhang · Chuang Gan -
2021 Poster: Formalizing Generalization and Adversarial Robustness of Neural Networks to Weight Perturbations »
Yu-Lin Tsai · Chia-Yi Hsu · Chia-Mu Yu · Pin-Yu Chen -
2021 Poster: Understanding the Limits of Unsupervised Domain Adaptation via Data Poisoning »
Akshay Mehra · Bhavya Kailkhura · Pin-Yu Chen · Jihun Hamm -
2020 Poster: ScaleCom: Scalable Sparsified Gradient Compression for Communication-Efficient Distributed Training »
Chia-Yu Chen · Jiamin Ni · Songtao Lu · Xiaodong Cui · Pin-Yu Chen · Xiao Sun · Naigang Wang · Swagath Venkataramani · Vijayalakshmi (Viji) Srinivasan · Wei Zhang · Kailash Gopalakrishnan -
2020 Poster: Higher-Order Certification For Randomized Smoothing »
Jeet Mohapatra · Ching-Yun Ko · Tsui-Wei Weng · Pin-Yu Chen · Sijia Liu · Luca Daniel -
2020 Poster: Optimizing Mode Connectivity via Neuron Alignment »
Norman J Tatro · Pin-Yu Chen · Payel Das · Igor Melnyk · Prasanna Sattigeri · Rongjie Lai -
2020 Spotlight: Higher-Order Certification For Randomized Smoothing »
Jeet Mohapatra · Ching-Yun Ko · Tsui-Wei Weng · Pin-Yu Chen · Sijia Liu · Luca Daniel -
2019 : Poster Session »
Ahana Ghosh · Javad Shafiee · Akhilan Boopathy · Alex Tamkin · Theodoros Vasiloudis · Vedant Nanda · Ali Baheri · Paul Fieguth · Andrew Bennett · Guanya Shi · Hao Liu · Arushi Jain · Jacob Tyo · Benjie Wang · Boxiao Chen · Carroll Wainwright · Chandramouli Shama Sastry · Chao Tang · Daniel S. Brown · David Inouye · David Venuto · Dhruv Ramani · Dimitrios Diochnos · Divyam Madaan · Dmitrii Krashenikov · Joel Oren · Doyup Lee · Eleanor Quint · elmira amirloo · Matteo Pirotta · Gavin Hartnett · Geoffroy Dubourg-Felonneau · Gokul Swamy · Pin-Yu Chen · Ilija Bogunovic · Jason Carter · Javier Garcia-Barcos · Jeet Mohapatra · Jesse Zhang · Jian Qian · John Martin · Oliver Richter · Federico Zaiter · Wentao Weng · Karthik Abinav Sankararaman · Kyriakos Polymenakos · Lan Hoang · mahdieh abbasi · Marco Gallieri · Mathieu Seurin · Matteo Papini · Matteo Turchetta · Matthew Sotoudeh · Mehrdad Hosseinzadeh · Nathan Fulton · Masatoshi Uehara · Niranjani Prasad · Oana-Maria Camburu · Patrik Kolaric · Philipp Renz · Prateek Jaiswal · Reazul Hasan Russel · Riashat Islam · Rishabh Agarwal · Alexander Aldrick · Sachin Vernekar · Sahin Lale · Sai Kiran Narayanaswami · Samuel Daulton · Sanjam Garg · Sebastian East · Shun Zhang · Soheil Dsidbari · Justin Goodwin · Victoria Krakovna · Wenhao Luo · Wesley Chung · Yuanyuan Shi · Yuh-Shyang Wang · Hongwei Jin · Ziping Xu -
2018 Poster: Zeroth-Order Stochastic Variance Reduction for Nonconvex Optimization »
Sijia Liu · Bhavya Kailkhura · Pin-Yu Chen · Paishun Ting · Shiyu Chang · Lisa Amini -
2018 Poster: Efficient Neural Network Robustness Certification with General Activation Functions »
Huan Zhang · Tsui-Wei Weng · Pin-Yu Chen · Cho-Jui Hsieh · Luca Daniel -
2018 Poster: Explanations based on the Missing: Towards Contrastive Explanations with Pertinent Negatives »
Amit Dhurandhar · Pin-Yu Chen · Ronny Luss · Chun-Chen Tu · Paishun Ting · Karthikeyan Shanmugam · Payel Das