The purpose of this tutorial is to explore the interplay between sensing and control, to highlight the "information knot" that ties them, and to design inference and learning algorithms to compute "representations" from data that are optimal, by design, for decision and control tasks. We will focus on visual sensing, but the analysis developed extends to other modalities.
We will first review various notions of information proposed in different fields, from economic theory to perception psychology, and adapt them to decision and control tasks, as opposed to transmission and storage of data. We will see that for complex sensing phenomena, such as vision, nuisance factors play an important role, especially those that are not "invertible", such as occlusions of line-of-sight and quantization-scale. Handling of the nuisances brings forward a notion of "representation," whose complexity measures the amount of "actionable information" contained in the data. We will discuss how to build representations that are optimal by design, in the sense of retaining all and only the statistics that matter to the task. For "invertible" nuisances, such representations can be made lossless (not in the classical sense of distortion, but in the sense of optimal performance in a decision or control task). In some cases, these representations are supported on a thin set, which can help elucidate the "signal-to-symbol barrier" problem, and relate to a topology-based notion of "sparsity". However, non-invertible nuisances spoil the picture, requiring the introduction of a notion of "stability" of the representation with respect to non-invertible nuisances. This is not the classical notion of (bounded-input, bounded-output) stability from control theory, but instead relates to "structural stability" from catastrophe theory. The design of maximally stable statistics brings forward a notion of "proper sampling" of the data. However, this is not the traditional notion of proper sampling from Nyquist, but one related to persistent topology. Once an optimal representation is constructed, a bound on the risk or control functional can be derived, analogous to distortion in communications. The "currency" that trades off this error (the equivalent of the bit-rate in communication) is not the amount of data, but instead the "control authority" over the sensing process.
Thus, sensing and control are intimately tied: Actionable information drives the control process, and control of the sensing process is what allows computing a representation.
We will present case studies in which formulating visual decision problems (e.g. detection, localization, recognition, categorization) in the context of vision-based control leads to improved performance and reduced computational burden. They include established low-level vision tools (e.g. tracking, local invariant descriptors), robotic exploration, and action and activity recognition. We will describe some of these in detail and distribute source code at the workshop, together with course notes.
Author Information
Stefano Soatto (UCLA)
Stefano Soatto received his Ph.D. in Control and Dynamical Systems from the California Institute of Technology in 1996; he joined UCLA in 2000 after being Assistant and then Associate Professor of Electrical Engineering and Biomedical Engineering at Washington University, and Research Associate in Applied Sciences at Harvard University. Between 1995 and 1998 he was also Ricercatore in the Department of Mathematics and Computer Science at the University of Udine, Italy. He received his D.Ing. degree (highest honors) from the University of Padova, Italy, in 1992. His general research interests are in Computer Vision and Nonlinear Estimation and Control Theory. In particular, he is interested in ways for computers to use sensory information to interact with humans and the environment. Dr. Soatto is the recipient of the David Marr Prize for work on Euclidean reconstruction and reprojection up to subgroups. He also received the Siemens Prize with the Outstanding Paper Award from the IEEE Computer Society for his work on optimal structure from motion. He received the National Science Foundation Career Award and the Okawa Foundation Grant. He is a Member of the Editorial Board of the International Journal of Computer Vision (IJCV) and Foundations and Trends in Computer Graphics and Vision. He is the founder and director of the UCLA Vision Lab; more information is available at http://vision.ucla.edu
More from the Same Authors

2021 Spotlight: Uniform Sampling over Episode Difficulty »
Sébastien Arnold · Guneet Dhillon · Avinash Ravichandran · Stefano Soatto 
2021 Spotlight: Long Short-Term Transformer for Online Action Detection »
Mingze Xu · Yuanjun Xiong · Hao Chen · Xinyu Li · Wei Xia · Zhuowen Tu · Stefano Soatto 
2022 Poster: On Leave-One-Out Conditional Mutual Information For Generalization »
Mohamad Rida Rammal · Alessandro Achille · Aditya Golatkar · Suhas Diggavi · Stefano Soatto 
2022 : Evaluating Worst Case Adversarial Weather Perturbations Robustness »
Yihan Wang · Yunhao Ba · Howard Zhang · Huan Zhang · Achuta Kadambi · Stefano Soatto · Alex Wong · Cho-Jui Hsieh
2023 Poster: Gacs-Korner Common Information Variational Autoencoder »
Michael Kleinman · Alessandro Achille · Stefano Soatto · Jonathan Kao 
2022 Poster: Semi-supervised Vision Transformers at Scale »
Zhaowei Cai · Avinash Ravichandran · Paolo Favaro · Manchen Wang · Davide Modolo · Rahul Bhotika · Zhuowen Tu · Stefano Soatto 
2021 Poster: Uniform Sampling over Episode Difficulty »
Sébastien Arnold · Guneet Dhillon · Avinash Ravichandran · Stefano Soatto 
2021 Poster: Long Short-Term Transformer for Online Action Detection »
Mingze Xu · Yuanjun Xiong · Hao Chen · Xinyu Li · Wei Xia · Zhuowen Tu · Stefano Soatto 
2018 : Plenary Talk 3 »
Stefano Soatto 
2010 Poster: Occlusion Detection and Motion Estimation with Convex Optimization »
Alper Ayvaci · Michalis Raptis · Stefano Soatto 
2006 Poster: Detecting Humans via Their Pose »
Alessandro Bissacco · Ming-Hsuan Yang · Stefano Soatto