Data valuation, that is, the valuation of each datum's contribution, has seen growing interest in machine learning due to its demonstrated efficacy on tasks such as noisy label detection. In particular, several approximations of the Shapley value have been proposed because of its desirable axiomatic properties. In these methods, the value function is usually defined as the predictive accuracy over the entire development set. However, this definition limits the ability to differentiate between training instances that are helpful and those that are harmful to their own classes. Intuitively, instances that harm their own classes may be noisy or mislabeled and should receive a lower valuation than helpful instances. In this work, we propose CS-Shapley, a Shapley value with a new value function that discriminates between training instances’ in-class and out-of-class contributions. Our theoretical analysis shows that the proposed value function is (essentially) the unique function that satisfies two desirable properties for evaluating data values in classification. Further, our experiments on two benchmark evaluation tasks (data removal and noisy label detection) and four classifiers demonstrate the effectiveness of CS-Shapley over existing methods. Lastly, we evaluate the “transferability” of data values estimated from one classifier to others, and our results suggest that Shapley-based data valuation transfers across different models.
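For readers unfamiliar with the baseline setup the abstract refers to, the following is a minimal sketch of permutation-sampling Shapley data valuation with development-set accuracy as the value function. It is not CS-Shapley and not the authors' implementation: the 1-nearest-neighbor classifier, the one-dimensional toy data, and all function names (`nearest_neighbor_accuracy`, `monte_carlo_shapley`) are assumptions made purely for illustration.

```python
import random


def nearest_neighbor_accuracy(train, dev):
    """Hypothetical value function: dev-set accuracy of a 1-NN classifier.

    `train` and `dev` are lists of (feature, label) pairs with scalar features.
    An empty training set is assigned a value of 0.0 (an assumption).
    """
    if not train:
        return 0.0
    correct = 0
    for x, y in dev:
        # Predict the label of the closest training point.
        _, predicted = min(train, key=lambda point: abs(point[0] - x))
        correct += int(predicted == y)
    return correct / len(dev)


def monte_carlo_shapley(train, dev, value_fn, num_permutations=200, seed=0):
    """Estimate each training point's Shapley value by permutation sampling.

    For every sampled ordering, a point is credited with the change in
    `value_fn` caused by adding it to the points that precede it.
    """
    rng = random.Random(seed)
    n = len(train)
    totals = [0.0] * n
    for _ in range(num_permutations):
        order = list(range(n))
        rng.shuffle(order)
        subset = []
        previous = value_fn(subset, dev)
        for index in order:
            subset.append(train[index])
            current = value_fn(subset, dev)
            totals[index] += current - previous
            previous = current
    return [total / num_permutations for total in totals]


# Toy data (assumed): class 0 near 0.0, class 1 near 1.0.
# The last training point sits in class-0 territory but carries label 1.
train = [(0.0, 0), (0.2, 0), (1.0, 1), (1.2, 1), (0.1, 1)]
dev = [(0.02, 0), (0.22, 0), (1.1, 1), (0.9, 1)]

values = monte_carlo_shapley(train, dev, nearest_neighbor_accuracy)
for point, value in zip(train, values):
    print(point, round(value, 3))
```

Because the marginal contributions within each sampled permutation telescope, the estimated values sum to the value of the full training set minus the value of the empty set, which mirrors the efficiency axiom of the Shapley value. A class-wise variant in the spirit of the abstract would replace `nearest_neighbor_accuracy` with a value function that scores in-class and out-of-class development instances separately; its exact form is not given here.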
Author Information
Stephanie Schoch (University of Virginia)
Haifeng Xu (University of Chicago)
Yangfeng Ji (University of Virginia)
More from the Same Authors
- 2021: Adversarial Training for Improving Model Robustness? Look at Both Prediction and Interpretation
  Hanjie Chen · Yangfeng Ji
- 2022: Explaining Predictive Uncertainty by Looking Back at Model Explanations
  Hanjie Chen · Wanyu Du · Yangfeng Ji
- 2022: Information-Theoretic Evaluation of Free-Text Rationales with Conditional $\mathcal{V}$-Information
  Hanjie Chen · Faeze Brahman · Xiang Ren · Yangfeng Ji · Yejin Choi · Swabha Swayamdipta
- 2023 Poster: Rethinking Incentives in Recommender Systems: Are Monotone Rewards Always Beneficial?
  Fan Yao · Chuanhao Li · Karthik Abinav Sankararaman · Yiming Liao · Yan Zhu · Qifan Wang · Hongning Wang · Haifeng Xu
- 2023 Poster: Follow-ups Also Matter: Improving Contextual Bandits via Post-serving Contexts
  Chaoqi Wang · Ziyu Ye · Zhe Feng · Ashwinkumar Badanidiyuru Varadaraja · Haifeng Xu
- 2023 Poster: Incentivized Communication for Federated Bandits
  Zhepei Wei · Chuanhao Li · Haifeng Xu · Hongning Wang
- 2022 Poster: Incrementality Bidding via Reinforcement Learning under Mixed and Delayed Rewards
  Ashwinkumar Badanidiyuru Varadaraja · Zhe Feng · Tianxi Li · Haifeng Xu
- 2022 Poster: Inverse Game Theory for Stackelberg Games: the Blessing of Bounded Rationality
  Jibang Wu · Weiran Shen · Fei Fang · Haifeng Xu