Timezone: »
Machine Learning does magical things when it starts interacting with and changing the world, yet most algorithms are not designed to do this. Systematically gathering the right data is the first order problem for learning with interaction. One simplistic example of this is ad recommendation where a high recommendation implies high placement which implies high click-through which implies high recommendation.... creating a self-fulfilling prophecy. This talk is about how to systematically avoid these problems by effectively (re)using randomization to engage in controlled exploration for learning algorithms. With these techniques, we can exponentially reduce the amount of exploration required, test many policies offline, and repurpose our existing learning algorithms to directly solve for optimal policies.
Author Information
John Langford (Microsoft Research)
John Langford is a machine learning research scientist, a field which he says "is shifting from an academic discipline to an industrial tool". He is the author of the weblog hunch.net and the principal developer of Vowpal Wabbit. John works at Microsoft Research New York, of which he was one of the founding members, and was previously affiliated with Yahoo! Research, Toyota Technological Institute, and IBM's Watson Research Center. He studied Physics and Computer Science at the California Institute of Technology, earning a double bachelor's degree in 1997, and received his Ph.D. in Computer Science from Carnegie Mellon University in 2002. He was the program co-chair for the 2012 International Conference on Machine Learning.
More from the Same Authors
-
2022 : Agent-Controller Representations: Principled Offline RL with Rich Exogenous Information »
Riashat Islam · Manan Tomar · Alex Lamb · Hongyu Zang · Yonathan Efroni · Dipendra Misra · Aniket Didolkar · Xin Li · Harm Van Seijen · Remi Tachet des Combes · John Langford -
2022 : Towards Data-Driven Offline Simulations for Online Reinforcement Learning »
Shengpu Tang · Felipe Vieira Frujeri · Dipendra Misra · Alex Lamb · John Langford · Paul Mineiro · Sebastian Kochman -
2022 Poster: Interaction-Grounded Learning with Action-Inclusive Feedback »
Tengyang Xie · Akanksha Saran · Dylan J Foster · Lekan Molu · Ida Momennejad · Nan Jiang · Paul Mineiro · John Langford -
2016 : A Contextual Research Program »
John Langford -
2014 Poster: Scalable Non-linear Learning with Adaptive Polynomial Expansions »
Alekh Agarwal · Alina Beygelzimer · Daniel Hsu · John Langford · Matus J Telgarsky -
2011 Workshop: Relations between machine learning problems - an approach to unify the field »
Robert Williamson · John Langford · Ulrike von Luxburg · Mark Reid · Jennifer Wortman Vaughan -
2010 Workshop: Learning on Cores, Clusters, and Clouds »
Alekh Agarwal · Lawrence Cayton · Ofer Dekel · John Duchi · John Langford -
2009 Poster: Multi-Label Prediction via Compressed Sensing »
Daniel Hsu · Sham M Kakade · John Langford · Tong Zhang -
2009 Oral: Multi-Label Prediction via Compressed Sensing »
Daniel Hsu · Sham M Kakade · John Langford · Tong Zhang -
2008 Poster: Sparse Online Learning via Truncated Gradient »
John Langford · Lihong Li · Tong Zhang -
2008 Spotlight: Sparse Online Learning via Truncated Gradient »
John Langford · Lihong Li · Tong Zhang -
2008 Poster: Predictive Indexing for Fast Search »
Sharad Goel · John Langford · Alexander L Strehl -
2007 Workshop: Principles of Learning Problem Design »
John Langford · Alina Beygelzimer -
2007 Poster: The Epoch-Greedy Algorithm for Multi-armed Bandits with Side Information »
John Langford · Tong Zhang