Real World RL with Vowpal Wabbit: Beyond Contextual Bandits
John Langford · Marek Wydmuch · Maryam Majzoubi · Adith Swaminathan · Dylan Foster · Paul Mineiro

Sun Dec 06 10:00 AM -- 12:40 PM (PST)

In recent years, breakthroughs in sample-efficient RL algorithms such as contextual bandits have enabled new solutions to personalization and optimization scenarios. Unbiased off-policy evaluation lets data scientists estimate, from logged data at real-world volumes, how a new policy would have performed, giving them confidence in putting machine learning into production. Vowpal Wabbit (https://vowpalwabbit.org) is an open source machine learning toolkit and research platform, used extensively across the industry, that provides fast, scalable machine learning.

Dive beyond Contextual Bandits in the Real World:
* Build Extreme Multilabel Classifiers with the Probabilistic Label Tree learner.
* Solve multi-slot scenarios with Conditional Contextual Bandits and Slates, and optimize systems with Continuous Action-Space CB.
* Learn about advanced off-policy evaluation and introspection options with new estimators and visualizations.
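
For readers new to off-policy evaluation, the sketch below illustrates one standard unbiased estimator, inverse propensity scoring (IPS), in plain Python. It is a minimal illustration and not Vowpal Wabbit's API: the record layout, the function name `ips_estimate`, the toy logs, and the two example policies are all assumptions made for this example. It assumes the logging policy recorded the probability with which it chose each action, and that the policy being evaluated is deterministic.

```python
from typing import Callable, Sequence, Tuple

# One logged interaction (hypothetical layout): context, action taken,
# reward observed, and the probability the logging policy gave that action.
LoggedRecord = Tuple[str, int, float, float]


def ips_estimate(
    logs: Sequence[LoggedRecord],
    target_policy: Callable[[str], int],
) -> float:
    """Estimate the average reward of a deterministic target policy
    from logged data using inverse propensity scoring."""
    if not logs:
        raise ValueError("logs must not be empty")
    total = 0.0
    for context, action, reward, prob in logs:
        if prob <= 0.0:
            raise ValueError("logged probability must be positive")
        # Only records where the target policy agrees with the logged
        # action contribute; each is reweighted by 1 / probability.
        if target_policy(context) == action:
            total += reward / prob
    return total / len(logs)


# Toy logs collected by a uniform-random logging policy over two actions.
logs = [
    ("sports", 0, 1.0, 0.5),
    ("sports", 1, 0.0, 0.5),
    ("news", 0, 0.0, 0.5),
    ("news", 1, 1.0, 0.5),
]


def always_zero(context: str) -> int:
    """Candidate policy that ignores the context."""
    return 0


def matched(context: str) -> int:
    """Candidate policy that picks a different action per context."""
    return 0 if context == "sports" else 1


print(ips_estimate(logs, always_zero))  # 0.5
print(ips_estimate(logs, matched))      # 1.0
```

On these toy logs the context-aware policy scores higher than the constant one, and neither policy had to be deployed to find that out.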

Author Information

John Langford (Microsoft Research New York)
Marek Wydmuch (Poznan University of Technology)
Maryam Majzoubi (NYU)
Adith Swaminathan (Microsoft Research)
Dylan Foster (MIT)
Paul Mineiro (Microsoft)
