Natural language is one of the most intuitive ways to express human intent. However, translating instructions and commands into robotic motion generation and deployment in the real world is far from an easy task. The challenge of combining a robot's inherent low-level geometric and kinodynamic constraints with a human's high-level semantic instructions has traditionally been addressed with task-specific solutions that generalize poorly across hardware platforms, often relying on static sets of target actions and commands. This work instead proposes a flexible language-based framework that allows a user to modify generic robotic trajectories. Our method leverages pre-trained language models (BERT and CLIP) to encode the user's intent and target objects directly from free-form text input and scene images, fuses geometrical features generated by a transformer encoder network, and finally outputs trajectories using a transformer decoder, without the need for priors related to the task or robot information. We significantly extend the previous work presented in Bucker et al. (2022) by expanding the trajectory parametrization space to 3D and velocity, as opposed to just XY movements. In addition, we now train the model to use actual images of the objects in the scene for context (as opposed to textual descriptions), and we evaluate the system in a diverse set of scenarios beyond manipulation, such as aerial and legged robots. Our simulated and real-life experiments demonstrate that our transformer model can successfully follow human intent, modifying the shape and speed of trajectories within multiple environments.
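To make the expanded trajectory parametrization concrete, the following is a minimal Python sketch of what a trajectory with 3D position and velocity might look like, compared with an XY-only path. It is an illustration only, not the authors' implementation: the names `Waypoint`, `lift_xy`, and `scale_speed` are hypothetical, and the hand-written speed edit merely stands in for the kind of modification the learned transformer model would produce from a text command.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Waypoint:
    """One trajectory sample: 3D position plus a scalar speed.

    The (x, y, z, speed) layout is an assumption made for illustration.
    """
    x: float
    y: float
    z: float
    speed: float


def lift_xy(points: List[Tuple[float, float]],
            z: float = 0.0,
            speed: float = 1.0) -> List[Waypoint]:
    """Embed an XY-only path into the (x, y, z, speed) space.

    Every waypoint receives the same constant height and speed.
    """
    return [Waypoint(x, y, z, speed) for x, y in points]


def scale_speed(traj: List[Waypoint], factor: float) -> List[Waypoint]:
    """Hand-written stand-in for a speed edit such as "go slower".

    Positions are left untouched; only the speed channel is scaled.
    This is not the learned model, just an example of an edit that an
    XY-only parametrization could not express.
    """
    return [Waypoint(w.x, w.y, w.z, w.speed * factor) for w in traj]
```

For instance, `scale_speed(lift_xy([(0.0, 0.0), (1.0, 2.0)]), 0.5)` keeps both positions and halves the speed at each waypoint.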
Author Information
A Bucker (Universidade de Sao Paulo)
Luis Figueredo (Technische Universität München)
Sami Haddadin
Ashish Kapoor (Microsoft)
Shuang Ma (Microsoft)
Sai Vemprala (Microsoft)
Rogerio Bonatti (Microsoft)
More from the Same Authors
-
2020 : Paper 64: Modeling Affect-based Intrinsic Rewards for Exploration and Learning »
Daniel McDuff · Ashish Kapoor -
2021 Spotlight: Representation Learning for Event-based Visuomotor Policies »
Sai Vemprala · Sami Mian · Ashish Kapoor -
2022 : PACT: Perception-Action Causal Transformer for Autoregressive Robotics Pretraining »
Rogerio Bonatti · Sai Vemprala · Shuang Ma · Felipe Vieira Frujeri · Shuhang Chen · Ashish Kapoor -
2022 : SMART: Self-supervised Multi-task pretrAining with contRol Transformers »
Yanchao Sun · Shuang Ma · Ratnesh Madaan · Rogerio Bonatti · Furong Huang · Ashish Kapoor -
2022 Poster: Learning Modular Simulations for Homogeneous Systems »
Jayesh Gupta · Sai Vemprala · Ashish Kapoor -
2022 Poster: 3DB: A Framework for Debugging Computer Vision Models »
Guillaume Leclerc · Hadi Salman · Andrew Ilyas · Sai Vemprala · Logan Engstrom · Vibhav Vineet · Kai Xiao · Pengchuan Zhang · Shibani Santurkar · Greg Yang · Ashish Kapoor · Aleksander Madry -
2021 Poster: Contrastive Learning of Global and Local Video Representations »
Shuang Ma · Zhaoyang Zeng · Daniel McDuff · Yale Song -
2021 Poster: Representation Learning for Event-based Visuomotor Policies »
Sai Vemprala · Sami Mian · Ashish Kapoor -
2021 Poster: Unadversarial Examples: Designing Objects for Robust Vision »
Hadi Salman · Andrew Ilyas · Logan Engstrom · Sai Vemprala · Aleksander Madry · Ashish Kapoor -
2020 Poster: Do Adversarially Robust ImageNet Models Transfer Better? »
Hadi Salman · Andrew Ilyas · Logan Engstrom · Ashish Kapoor · Aleksander Madry -
2020 Oral: Do Adversarially Robust ImageNet Models Transfer Better? »
Hadi Salman · Andrew Ilyas · Logan Engstrom · Ashish Kapoor · Aleksander Madry -
2020 Poster: Denoised Smoothing: A Provable Defense for Pretrained Classifiers »
Hadi Salman · Mingjie Sun · Greg Yang · Ashish Kapoor · J. Zico Kolter -
2020 Poster: Multi-Robot Collision Avoidance under Uncertainty with Probabilistic Safety Barrier Certificates »
Wenhao Luo · Wen Sun · Ashish Kapoor -
2020 Spotlight: Multi-Robot Collision Avoidance under Uncertainty with Probabilistic Safety Barrier Certificates »
Wenhao Luo · Wen Sun · Ashish Kapoor -
2019 : The Game of Drones Competition »
Charbel Toumieh · Sai Vemprala · Sangyun Shin · Rahul Kumar · Andrey Ivanov · Hyunchul Shim · Jose Martinez-Carranza · Nicholas Gyde · Ashish Kapoor · Keiko Nagami · Tim Taubner · Ratnesh Madaan · Antony Gillette · Paul Stubbs -
2019 : Lunch + Poster Session »
Frederik Gerzer · Bill Yang Cai · Pieter-Jan Hoedt · Kelly Kochanski · Soo Kyung Kim · Yunsung Lee · Sunghyun Park · Sharon Zhou · Martin Gauch · Jonathan Wilson · Joyjit Chatterjee · Shamindra Shrotriya · Dimitri Papadimitriou · Christian Schön · Valentina Zantedeschi · Gabriella Baasch · Willem Waegeman · Gautier Cosne · Dara Farrell · Brendan Lucier · Letif Mones · Caleb Robinson · Tafara Chitsiga · Victor Kristof · Hari Prasanna Das · Yimeng Min · Alexandra Puchko · Alexandra Luccioni · Kyle Story · Jason Hickey · Yue Hu · Björn Lütjens · Zhecheng Wang · Renzhi Jing · Genevieve Flaspohler · Jingfan Wang · Saumya Sinha · Qinghu Tang · Armi Tiihonen · Ruben Glatt · Muge Komurcu · Jan Drgona · Juan Gomez-Romero · Ashish Kapoor · Dylan J Fitzpatrick · Alireza Rezvanifar · Adrian Albert · Olya (Olga) Irzak · Kara Lamb · Ankur Mahesh · Kiwan Maeng · Frederik Kratzert · Sorelle Friedler · Niccolo Dalmasso · Alex Robson · Lindiwe Malobola · Lucas Maystre · Yu-wen Lin · Surya Karthik Mukkavili · Brian Hutchinson · Alexandre Lacoste · Yanbing Wang · Zhengcheng Wang · Yinda Zhang · Victoria Preston · Jacob Pettit · Draguna Vrabie · Miguel Molina-Solana · Tonio Buonassisi · Andrew Annex · Tunai P Marques · Catalin Voss · Johannes Rausch · Max Evans -
2019 Poster: Characterizing Bias in Classifiers using Generative Models »
Daniel McDuff · Shuang Ma · Yale Song · Ashish Kapoor -
2019 Poster: Bias Correction of Learned Generative Models using Likelihood-Free Importance Weighting »
Aditya Grover · Jiaming Song · Ashish Kapoor · Kenneth Tran · Alekh Agarwal · Eric Horvitz · Stefano Ermon -
2016 Poster: Quantum Perceptron Models »
Ashish Kapoor · Nathan Wiebe · Krysta Svore -
2015 : Machine Learning as Rotations (Quantum Deep Learning) »
Ashish Kapoor -
2012 Poster: Multilabel Classification using Bayesian Compressed Sensing »
Ashish Kapoor · Raajay Viswanathan · Prateek Jain -
2009 Workshop: Analysis and Design of Algorithms for Interactive Machine Learning »
Sumit Basu · Ashish Kapoor -
2009 Poster: Breaking Boundaries Between Induction Time and Diagnosis Time Active Information Acquisition »
Ashish Kapoor · Eric Horvitz