Timezone: »
We present an end-to-end framework for solving the Vehicle Routing Problem (VRP) using reinforcement learning. In this approach, we train a single policy model that finds near-optimal solutions for a broad range of problem instances of similar size, only by observing the reward signals and following feasibility rules. We consider a parameterized stochastic policy, and by applying a policy gradient algorithm to optimize its parameters, the trained model produces the solution as a sequence of consecutive actions in real time, without the need to re-train for every new problem instance. On capacitated VRP, our approach outperforms classical heuristics and Google's OR-Tools on medium-sized instances in solution quality with comparable computation time (after training). We demonstrate how our approach can handle problems with split delivery and explore the effect of such deliveries on the solution quality. Our proposed framework can be applied to other variants of the VRP such as the stochastic VRP, and has the potential to be applied more generally to combinatorial optimization problems
Author Information
MohammadReza Nazari (Lehigh University)
Afshin Oroojlooy (SAS Institute)
Lawrence Snyder (Lehigh University)
Martin Takac (Lehigh University)
More from the Same Authors
-
2021 : Random-reshuffled SARAH does not need a full gradient computations »
Aleksandr Beznosikov · Martin Takac -
2022 : Effects of momentum scaling for SGD »
Dmitry A. Pasechnyuk · Alexander Gasnikov · Martin Takac -
2022 : Using quadratic equations for overparametrized models »
Shuang Li · William Swartworth · Martin Takac · Deanna Needell · Robert Gower -
2022 : FLECS-CGD: A Federated Learning Second-Order Framework via Compression and Sketching with Compressed Gradient Differences »
Artem Agafonov · Brahim Erraji · Martin Takac -
2022 : Cubic Regularized Quasi-Newton Methods »
Dmitry Kamzolov · Klea Ziu · Artem Agafonov · Martin Takac -
2022 : PSPS: Preconditioned Stochastic Polyak Step-size method for badly scaled data »
Farshed Abdukhakimov · Chulu Xiang · Dmitry Kamzolov · Robert Gower · Martin Takac -
2022 Workshop: Order up! The Benefits of Higher-Order Optimization in Machine Learning »
Albert Berahas · Jelena Diakonikolas · Jarad Forristal · Brandon Reese · Martin Takac · Yan Xu -
2022 Poster: A Damped Newton Method Achieves Global $\mathcal O \left(\frac{1}{k^2}\right)$ and Local Quadratic Convergence Rate »
Slavomír Hanzely · Dmitry Kamzolov · Dmitry Pasechnyuk · Alexander Gasnikov · Peter Richtarik · Martin Takac -
2021 Workshop: OPT 2021: Optimization for Machine Learning »
Courtney Paquette · Quanquan Gu · Oliver Hinder · Katya Scheinberg · Sebastian Stich · Martin Takac -
2020 : Closing remarks »
Quanquan Gu · Courtney Paquette · Mark Schmidt · Sebastian Stich · Martin Takac -
2020 : Live Q&A with Suvrit Sra (Zoom) »
Martin Takac -
2020 : Intro to Invited Speaker 5 »
Martin Takac -
2020 : Contributed talks in Session 2 (Zoom) »
Martin Takac · Samuel Horváth · Guan-Horng Liu · Nicolas Loizou · Sharan Vaswani -
2020 : Live Q&A with Donald Goldfarb (Zoom) »
Martin Takac -
2020 : Live Q&A with Andreas Krause (Zoom) »
Martin Takac -
2020 : Welcome remarks to Session 2 »
Martin Takac -
2020 Workshop: OPT2020: Optimization for Machine Learning »
Courtney Paquette · Mark Schmidt · Sebastian Stich · Quanquan Gu · Martin Takac -
2020 : Welcome event (gather.town) »
Quanquan Gu · Courtney Paquette · Mark Schmidt · Sebastian Stich · Martin Takac -
2020 Poster: AttendLight: Universal Attention-Based Reinforcement Learning Model for Traffic Signal Control »
Afshin Oroojlooy · Mohammadreza Nazari · Davood Hajinezhad · Jorge Silva -
2019 : Poster Session »
Eduard Gorbunov · Alexandre d'Aspremont · Lingxiao Wang · Liwei Wang · Boris Ginsburg · Alessio Quaglino · Camille Castera · Saurabh Adya · Diego Granziol · Rudrajit Das · Raghu Bollapragada · Fabian Pedregosa · Martin Takac · Majid Jahani · Sai Praneeth Karimireddy · Hilal Asi · Balint Daroczy · Leonard Adolphs · Aditya Rawal · Nicolas Brandt · Minhan Li · Giuseppe Ughi · Orlando Romero · Ivan Skorokhodov · Damien Scieur · Kiwook Bae · Konstantin Mishchenko · Rohan Anil · Vatsal Sharan · Aditya Balu · Chao Chen · Zhewei Yao · Tolga Ergen · Paul Grigas · Chris Junchi Li · Jimmy Ba · Stephen J Roberts · Sharan Vaswani · Armin Eftekhari · Chhavi Sharma -
2016 Poster: A Multi-Batch L-BFGS Method for Machine Learning »
Albert Berahas · Jorge Nocedal · Martin Takac -
2014 Poster: Communication-Efficient Distributed Dual Coordinate Ascent »
Martin Jaggi · Virginia Smith · Martin Takac · Jonathan Terhorst · Sanjay Krishnan · Thomas Hofmann · Michael Jordan