Recently, deep reinforcement learning (DRL) frameworks have shown potential for solving NP-hard routing problems such as the traveling salesman problem (TSP) without problem-specific expert knowledge. Although DRL can be applied to complex problems, DRL frameworks still struggle to compete with state-of-the-art heuristics, leaving a substantial performance gap. This paper proposes a novel hierarchical problem-solving strategy, termed learning collaborative policies (LCP), which can effectively find near-optimum solutions using two iterative DRL policies: the seeder and the reviser. The seeder generates candidate solutions (seeds) that are as diverse as possible, dedicating itself to exploring the full combinatorial action space (i.e., the sequence of assignment actions). To this end, the seeder's policy is trained with a simple yet effective entropy-regularized reward that encourages it to find diverse solutions. The reviser, in turn, modifies each candidate solution generated by the seeder: it partitions the full trajectory into sub-tours and simultaneously revises each sub-tour to minimize its traveling distance. The reviser is thus trained to improve candidate-solution quality while focusing on a reduced solution space, which is beneficial for exploitation. Extensive experiments demonstrate that the proposed two-policy collaboration scheme improves over single-policy DRL frameworks on various NP-hard routing problems, including the TSP, the prize collecting TSP (PCTSP), and the capacitated vehicle routing problem (CVRP).
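To make the division of labour concrete, below is a minimal, self-contained sketch of the seeder-reviser loop on a toy Euclidean TSP instance. It is not the authors' implementation: `seeder_sample` and `revise` are hypothetical stand-ins, with uniform random sampling in place of the learned seeder (trained with a reward of roughly the form r(tour) = -length(tour) + alpha * H(policy), where the exact entropy term and coefficient are assumptions here) and a 2-opt-style segment reversal in place of the learned reviser.

```python
import numpy as np

# Toy sketch of the seeder-reviser collaboration described above.
# All names here are illustrative assumptions; in the paper both roles
# are learned neural policies, and the seeder is trained with an
# entropy-regularized reward to keep its candidate set diverse.

rng = np.random.default_rng(0)
coords = rng.random((20, 2))  # a random 20-city Euclidean TSP instance

def tour_length(tour):
    """Total cyclic length of a tour (a permutation of city indices)."""
    pts = coords[list(tour)]
    return float(np.linalg.norm(pts - np.roll(pts, -1, axis=0), axis=1).sum())

def seeder_sample(num_seeds):
    """Stand-in for the seeder policy. Uniform random permutations are the
    maximum-entropy sampler, mimicking the diversity the entropy bonus
    encourages; the real seeder is a trained sequence model."""
    return [list(rng.permutation(len(coords))) for _ in range(num_seeds)]

def revise(tour, seg_len=5):
    """Stand-in for the reviser policy: partition the tour into sub-tours of
    seg_len cities and reverse each one if that shortens the full tour
    (a 2-opt-style local move in place of the learned reviser)."""
    tour = list(tour)
    best = tour_length(tour)
    for s in range(0, len(tour), seg_len):
        cand = tour[:s] + tour[s:s + seg_len][::-1] + tour[s + seg_len:]
        cand_len = tour_length(cand)
        if cand_len < best:
            tour, best = cand, cand_len
    return tour

# Collaboration: the seeder explores, the reviser exploits each seed.
seeds = seeder_sample(num_seeds=32)
revised = [revise(t) for t in seeds]
best = min(revised, key=tour_length)
print(f"best seed:    {min(tour_length(t) for t in seeds):.3f}")
print(f"best revised: {tour_length(best):.3f}")
```

Swapping the random sampler for a learned, entropy-regularized policy and the 2-opt pass for a trained sub-tour solver recovers the division of labour the abstract describes; the sketch only illustrates why diverse seeds plus local revision can outperform either component alone.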
Author Information
Minsu Kim (Korea Advanced Institute of Science and Technology)
Jinkyoo Park (KAIST)
Joungho Kim (KAIST)
More from the Same Authors
- 2022 : Scale-conditioned Adaptation for Large Scale Combinatorial Optimization »
  Minsu Kim · Jiwoo Son · Hyeonah Kim · Jinkyoo Park
- 2022 : Collaborative symmetricity exploitation for offline learning of hardware design solver »
  Haeyeon Kim · Minsu Kim · Joungho Kim · Jinkyoo Park
- 2022 : Neural Coarsening Process for Multi-level Graph Combinatorial Optimization »
  Hyeonah Kim · Minsu Kim · Changhyun Kwon · Jinkyoo Park
- 2022 Poster: Sym-NCO: Leveraging Symmetricity for Neural Combinatorial Optimization »
  Minsu Kim · Junyoung Park · Jinkyoo Park
- 2022 Poster: Learning NP-Hard Multi-Agent Assignment Planning using GNN: Inference on a Random Graph and Provable Auction-Fitted Q-learning »
  Hyunwook Kang · Taehwan Kwon · Jinkyoo Park · James R. Morrison
- 2022 Poster: Transform Once: Efficient Operator Learning in Frequency Domain »
  Michael Poli · Stefano Massaroli · Federico Berto · Jinkyoo Park · Tri Dao · Christopher Ré · Stefano Ermon
- 2021 : Neural Solvers for Fast and Accurate Numerical Optimal Control »
  Federico Berto · Stefano Massaroli · Michael Poli · Jinkyoo Park
- 2021 : TorchDyn: Implicit Models and Neural Numerical Methods in PyTorch »
  Michael Poli · Stefano Massaroli · Atsushi Yamashita · Hajime Asama · Jinkyoo Park · Stefano Ermon
- 2021 Poster: Differentiable Multiple Shooting Layers »
  Stefano Massaroli · Michael Poli · Sho Sonoda · Taiji Suzuki · Jinkyoo Park · Atsushi Yamashita · Hajime Asama
- 2021 Poster: Neural Hybrid Automata: Learning Dynamics With Multiple Modes and Stochastic Transitions »
  Michael Poli · Stefano Massaroli · Luca Scimeca · Sanghyuk Chun · Seong Joon Oh · Atsushi Yamashita · Hajime Asama · Jinkyoo Park · Animesh Garg
- 2020 Poster: Dissecting Neural ODEs »
  Stefano Massaroli · Michael Poli · Jinkyoo Park · Atsushi Yamashita · Hajime Asama
- 2020 Poster: Hypersolvers: Toward Fast Continuous-Depth Models »
  Michael Poli · Stefano Massaroli · Atsushi Yamashita · Hajime Asama · Jinkyoo Park
- 2020 Oral: Dissecting Neural ODEs »
  Stefano Massaroli · Michael Poli · Jinkyoo Park · Atsushi Yamashita · Hajime Asama