Accepted Papers


    Improvisation through Physical Understanding: Using Novel Objects As Tools with Visual Foresight Machine learning has enabled robots to perform complex tasks in narrowly-scoped settings, and to perform simple tasks with high generalization. However, learning a model that can both perform complex tasks and generalize to previously unseen objects and goals remains a significant challenge. We study this challenge in the context of "improvisational" tool use: a robot is presented with novel objects and a user-specified goal (e.g., sweep some clutter into the dustpan), and must figure out, using only raw image observations, how to accomplish the goal using the available objects as tools. We approach this problem by training a model with both a visual and physical understanding of multi-object interactions, and develop a sampling-based optimizer that can leverage these interactions to accomplish tasks. We do so by combining diverse demonstration data with self-supervised interaction data, aiming to leverage the interaction data to build generalizable models and the demonstration data to guide the model-based RL planner to solve complex tasks. Our experiments show that our approach can solve a variety of complex tool use tasks from raw pixel inputs, outperforming both imitation learning and self-supervised learning individually. Furthermore, we show that the robot can perceive and use novel objects as tools, including objects that are not conventional tools, while also choosing dynamically to use or not use tools depending on whether or not they are required.
    [Full Paper] [Video]
    Xie, Annie; Ebert, Frederik; Levine, Sergey; Finn, Chelsea

    Efficient Algorithms for Optimal Perimeter Guarding We investigate the problem of optimally assigning a swarm of robots (or other types of autonomous agents) to guard the perimeters of closed 2D regions, where the perimeter of each region to be guarded may contain multiple disjoint line segments. Each robot is responsible for guarding a subset of a perimeter and any point on a perimeter must be guarded by some robot. In allocating the robotic swarm, the main objective is to minimize the maximum 1D distance to be covered by any robot along the boundary of the regions. For this optimization problem, which we call optimal perimeter guarding (OPG), a thorough structural analysis is performed, which is then exploited to develop fast exact algorithms that run in guaranteed low polynomial time. In addition to formal analysis and proofs, experimental evaluations and simulations are performed that further validate the correctness and effectiveness of our algorithmic results.
    [Full Paper] [Video]
    Feng, Si Wei; Han, Shuai D.; Yu, Jingjin

    A Polynomial-time Solution for Robust Registration with Extreme Outlier Rates We propose a robust approach for the registration of two sets of 3D points in the presence of a large amount of outliers. Our first contribution is to reformulate the registration problem using a Truncated Least Squares (TLS) cost that makes the estimation insensitive to a large fraction of spurious point-to-point correspondences. The second contribution is a general framework to decouple rotation, translation, and scale estimation, which allows solving in cascade for the three transformations. Since each subproblem (scale, rotation, and translation estimation) is still non-convex and combinatorial in nature, our third contribution is to show that (i) TLS scale and (component-wise) translation estimation can be solved exactly and in polynomial time via an adaptive voting scheme, (ii) TLS rotation estimation can be relaxed to a semidefinite program and the relaxation is tight in practice, even in the presence of an extreme amount of outliers. We validate the proposed algorithm, named TEASER (Truncated least squares Estimation And SEmidefinite Relaxation), in standard registration benchmarks showing that the algorithm outperforms RANSAC and robust local optimization techniques, and favorably compares with Branch-and-Bound methods, while being a polynomial-time algorithm. TEASER can tolerate up to 99% outliers and returns highly-accurate solutions.
    [Full Paper] [Video]
    Yang, Heng; Carlone, Luca
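
    As a concrete illustration of the adaptive voting idea described in the TEASER abstract above, the following Python sketch estimates a single translation component under a truncated least squares cost. It is a simplified stand-in, not the authors' implementation; the function name `tls_translation_1d`, the measurements, and the noise bounds are illustrative assumptions.

```python
import numpy as np

def tls_translation_1d(t_meas, beta):
    """Component-wise TLS translation estimate via adaptive voting.

    Each measurement t_meas[i] with noise bound beta[i] defines an interval
    [t_meas[i] - beta[i], t_meas[i] + beta[i]]. The point covered by the most
    intervals identifies the largest consensus set; the TLS estimate is then
    the least-squares (mean) estimate over that consensus set.
    """
    t_meas = np.asarray(t_meas, dtype=float)
    beta = np.asarray(beta, dtype=float)

    # Sweep over interval endpoints, counting how many intervals are open.
    events = sorted([(t - b, +1) for t, b in zip(t_meas, beta)] +
                    [(t + b, -1) for t, b in zip(t_meas, beta)])
    best_count, count, best_point = 0, 0, t_meas[0]
    for point, delta in events:
        count += delta
        if count > best_count:
            best_count, best_point = count, point

    # Inliers are the measurements whose interval covers the best point.
    inliers = np.abs(t_meas - best_point) <= beta
    return t_meas[inliers].mean(), inliers

# Example: 70% of the "correspondences" are gross outliers.
rng = np.random.default_rng(0)
inlier = 2.0 + 0.01 * rng.standard_normal(30)
outlier = rng.uniform(-10, 10, size=70)
t_hat, mask = tls_translation_1d(np.concatenate([inlier, outlier]),
                                 beta=0.05 * np.ones(100))
print(f"estimated translation: {t_hat:.3f}")
```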

    Learning to Throw Arbitrary Objects with Residual Physics We investigate whether a robot arm can learn to pick and throw arbitrary objects into selected boxes quickly and accurately. Throwing has the potential to increase the physical reachability and picking speed of a robot arm. However, precisely throwing arbitrary objects in unstructured settings presents many challenges: from acquiring reliable pre-throw conditions (e.g. initial pose of object in manipulator) to handling varying object-centric properties (e.g. mass distribution, friction, shape) and dynamics (e.g. aerodynamics). In this work, we propose an end-to-end formulation that jointly learns to infer control parameters for grasping and throwing motion primitives from visual observations (images of arbitrary objects in a bin) through trial and error. Within this formulation, we investigate the synergies between grasping and throwing (i.e., learning grasps that enable more accurate throws) and between simulation and deep learning (i.e., using deep networks to predict residuals on top of control parameters predicted by a physics simulator). The resulting system, TossingBot, is able to grasp and throw arbitrary objects into boxes located outside its maximum reach range at 500+ mean picks per hour (600+ grasps per hour with 84% throwing accuracy); and generalizes to new objects and landing locations. Videos are available at http://tossingbot.cs.princeton.edu
    [Full Paper] [Video]
    Zeng, Andy; Song, Shuran; Lee, Johnny; Rodriguez, Alberto; Funkhouser, Thomas A.
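
    The "residual physics" idea in the TossingBot abstract above can be illustrated with a small sketch: an analytic ballistic model supplies a nominal release speed for a target distance, and a learned residual corrects what the model misses. The fixed 45-degree release angle, drag-free projectile model, and the `residual_net` stub below are assumptions for illustration, not the paper's network.

```python
import numpy as np

GRAVITY = 9.81  # m/s^2

def ballistic_release_speed(horizontal_dist, release_angle=np.pi / 4):
    """Nominal release speed from projectile motion (flat ground, no drag)."""
    return np.sqrt(GRAVITY * horizontal_dist / np.sin(2 * release_angle))

def residual_net(visual_features):
    """Placeholder for a learned residual on the release speed.

    In the paper this is a deep network conditioned on visual input; here it
    is stubbed out to zero for illustration.
    """
    return 0.0

def throw_velocity(horizontal_dist, visual_features):
    # The learned residual corrects what the simplified physics model misses
    # (mass distribution, friction, aerodynamics, grasp pose, ...).
    return ballistic_release_speed(horizontal_dist) + residual_net(visual_features)

print(f"release speed for a 1.5 m throw: {throw_velocity(1.5, None):.2f} m/s")
```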

    OIL: Observational Imitation Learning Recent work has explored the problem of autonomous navigation by imitating a teacher and learning an end-to-end policy, which directly predicts controls from raw images. However, these approaches tend to be sensitive to mistakes by the teacher and do not scale well to other environments or vehicles. To this end, we propose Observational Imitation Learning (OIL), a novel imitation learning variant that supports online training and automatic selection of optimal behavior by observing multiple imperfect teachers. We apply our proposed methodology to the challenging problems of autonomous driving and UAV racing. For both tasks, we utilize the Sim4CV simulator that enables the generation of large amounts of synthetic training data and also allows for online learning and evaluation. We train a perception network to predict waypoints from raw image data and use OIL to train another network to predict controls from these waypoints. Extensive experiments demonstrate that our trained network outperforms its teachers, conventional imitation learning (IL) and reinforcement learning (RL) baselines and even humans in simulation.
    [Full Paper] [Video]
    Li, Guohao; Mueller, Matthias; Casser, Vincent Michael; Smith, Neil; Michels, Dominik; Ghanem, Bernard

    DESPOT-α: Online POMDP Planning with Large State and Observation Spaces State-of-the-art sampling-based online POMDP solvers compute near-optimal policies for POMDPs with very large state spaces. However, when faced with large observation spaces, they may become overly optimistic and compute sub-optimal policies, because of particle divergence. This paper presents a new online POMDP solver DESPOT-α, which builds upon the widely used DESPOT solver. DESPOT-α improves the practical performance of online planning for POMDPs with large observation as well as state spaces. Like DESPOT, DESPOT-α uses the particle belief approximation and searches a determinized sparse belief tree. To tackle large observation spaces, DESPOT-α shares sub-policies among many observations during online policy computation. The value function of a sub-policy is a linear function of the belief, commonly known as α-vector. We introduce a particle approximation of the α-vector to improve the efficiency of online policy search. We further speed up DESPOT-α using CPU and GPU parallelization ideas introduced in HyP-DESPOT. Experimental results show that DESPOT-α/HyP-DESPOT-α outperform DESPOT/HyP-DESPOT on POMDPs with large observation spaces, including a complex simulation task involving an autonomous vehicle driving among many pedestrians.
    [Full Paper]
    Garg, Neha Priyadarshini; Hsu, David; Lee, Wee Sun
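
    The particle approximation of the α-vector mentioned in the DESPOT-α abstract above amounts to evaluating a sub-policy's value on a weighted particle belief. A minimal sketch, in which the particle set, weights, and per-state value function are toy assumptions:

```python
import numpy as np

def alpha_vector_value(particle_states, particle_weights, alpha):
    """Value of a sub-policy on a particle belief.

    alpha maps a state to the value of executing the sub-policy from that
    state; the belief value is the weighted average over particles, mirroring
    the linear (alpha-vector) dependence of the value on the belief.
    """
    weights = np.asarray(particle_weights, dtype=float)
    weights = weights / weights.sum()
    return float(np.dot(weights, [alpha(s) for s in particle_states]))

# Toy example: three particles, value = negative distance to a goal at 2.0.
states = [0.0, 1.0, 4.0]
weights = [0.5, 0.3, 0.2]
print(alpha_vector_value(states, weights, alpha=lambda s: -abs(s - 2.0)))
```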

    Collective Formation and Cooperative Function of a Magnetic Microrobotic Swarm Untethered magnetically actuated microrobots can access distant, enclosed and small spaces, such as inside microfluidic channels and the human body, making them appealing for minimally invasive tasks. Despite the simplicity of individual magnetic microrobots, a collective of these microrobots that can work closely and cooperatively would significantly enhance their capabilities. However, a challenge of realizing such collective magnetic microrobots is to coordinate their formations and motions with under-actuated control signals. Here, we report a method that allows collective magnetic microrobots to work closely and cooperatively by controlling their two-dimensional (2D) formations and collective motions in a programmable and reconfigurable manner. The actively designed formation and intrinsic adjustable compliance within the group allow bio-inspired collective behaviors, such as navigating through cluttered environments, as well as reconfigurable cooperative manipulation. These collective magnetic microrobots could thus enable potential applications in programmable self-assembly, modular robotics, swarm systems, and biomedicine.
    [Full Paper]
    Dong, Xiaoguang; Sitti, Metin

    Highly Parallelized Data-Driven MPC for Minimal Intervention Shared Control We present a shared control paradigm that improves a user's ability to operate complex, dynamic systems in potentially dangerous environments without a priori knowledge of the user's objective. In this paradigm, the role of the autonomous partner is to improve the general safety of the system without constraining the user's ability to achieve unspecified behaviors. Our approach relies on a data-driven, model-based representation of the joint human-machine system to evaluate, in parallel, a significant number of potential inputs that the user may wish to provide. These samples are used to (1) predict the safety of the system over a receding horizon, and (2) minimize the influence of the autonomous partner. The resulting shared control algorithm maximizes the authority allocated to the human partner to improve their sense of agency, while improving safety. We evaluate the efficacy of our shared control algorithm with a human subjects study (n=20) conducted in two simulated environments: a balance bot and a race car. During the experiment, users are free to operate each system however they would like (i.e., there is no specified task) and are only asked to try to avoid unsafe regions of the state space. Using modern computational resources (i.e., GPUs) our approach is able to consider more than 10,000 potential trajectories at each time step in a control loop running at 100Hz for the balance bot and 60Hz for the race car. The results of the study show that our shared control paradigm improves system safety without knowledge of the user's goal, while maintaining high levels of user satisfaction and low levels of frustration. Our code is available online at https://github.com/asbroad/mpmi_shared_control.
    [Full Paper] [Video]
    Broad, Alexander; Murphey, Todd; Argall, Brenna
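
    A minimal, single-threaded sketch of the minimal-intervention idea in the shared-control abstract above: sample many candidate control sequences, keep only those whose predicted rollouts stay safe, and execute the safe candidate whose first action is closest to the user's input. The double-integrator dynamics, safety set, and sampling distribution are toy assumptions, not the authors' GPU implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def dynamics(x, u, dt=0.02):
    """Toy double integrator: state = [position, velocity], input = acceleration."""
    pos, vel = x
    return np.array([pos + vel * dt, vel + u * dt])

def rollout_is_safe(x0, controls, bound=1.0):
    """Predict over the horizon and check the position stays inside |p| <= bound."""
    x = np.array(x0, dtype=float)
    for u in controls:
        x = dynamics(x, u)
        if abs(x[0]) > bound:
            return False
    return True

def minimal_intervention_control(x0, user_input, horizon=20, num_samples=1000):
    # Sample candidate control sequences centered on the user's current input.
    candidates = user_input + rng.normal(scale=2.0, size=(num_samples, horizon))
    safe = [c for c in candidates if rollout_is_safe(x0, c)]
    if not safe:
        return 0.0  # no safe candidate found: fall back to a neutral command
    # Minimal intervention: among safe candidates, stay closest to the user's input.
    best = min(safe, key=lambda c: abs(c[0] - user_input))
    return float(best[0])

print(minimal_intervention_control(x0=[0.9, 0.5], user_input=3.0))
```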

    Autonomous Tool Construction Using Part Shape and Attachment Prediction This work explores the problem of robot tool construction: creating tools from parts available in the environment. We advance the state-of-the-art in robotic tool construction by introducing an approach that enables the robot to construct a wider range of tools with greater computational efficiency. Specifically, given an action that the robot wishes to accomplish and a set of building parts available to the robot, our approach reasons about the shape of the parts and potential ways of attaching them, generating a ranking of part combinations that the robot then uses to construct and test the target tool. We validate our approach on the construction of five tools using a physical 7-DOF robot arm.
    [Full Paper] [Video]
    Nair, Lakshmi Velayudhan; Srikanth, Nithin Shrivatsav; Erickson, Zackory; Chernova, Sonia

    A Behavioral Approach to Visual Navigation with Graph Localization Networks Inspired by research in psychology, we introduce a behavioral approach for visual navigation using topological maps. Our goal is to enable a robot to navigate from one location to another, relying only on its visual observations and the topological map of the environment. To this end, we propose using graph neural networks for localizing the agent in the map, and decompose the action space into primitive behaviors implemented as convolutional or recurrent neural networks. Using the Gibson simulator and the Stanford 2D-3D-S dataset, we verify that our approach outperforms relevant baselines and is able to navigate in both seen and unseen indoor environments.
    [Full Paper] [Video]
    Chen, Kevin; de Vicente, Juan Pablo; Sepulveda, Gabriel; Xia, Fei; Soto, Alvaro; Vázquez, Marynel; Savarese, Silvio

    Learning to Walk Via Deep Reinforcement Learning Deep reinforcement learning (deep RL) holds the promise of automating the acquisition of complex controllers that can map sensory inputs directly to low-level actions. In the domain of robotic locomotion, deep RL could enable learning locomotion skills with minimal engineering and without an explicit model of the robot dynamics. Unfortunately, applying deep RL to real-world robotic tasks is exceptionally difficult, primarily due to poor sample complexity and sensitivity to hyperparameters. While hyperparameters can be easily tuned in simulated domains, tuning may be prohibitively expensive on physical systems, such as legged robots, that can be damaged through extensive trial-and-error learning. In this paper, we propose a sample-efficient deep RL algorithm based on maximum entropy RL that requires minimal per-task tuning and only a modest number of trials to learn neural network policies. We apply this method to learning walking gaits on a real-world Minitaur robot. Our method can acquire a stable gait from scratch directly in the real world in about two hours, without relying on any model or simulation, and the resulting policy is robust to moderate variations in the environment. We further show that our algorithm achieves state-of-the-art performance on simulated benchmarks with a single set of hyperparameters. Videos of training and the learned policy can be found on the project website.
    [Full Paper]
    Haarnoja, Tuomas; Ha, Sehoon; Zhou, Aurick; Tan, Jie; Tucker, George; Levine, Sergey

    A Differentiable Augmented Lagrangian Method for Bilevel Nonlinear Optimization Many problems in modern robotics can be addressed by modeling them as bilevel optimization problems. In this work, we leverage augmented Lagrangian methods and recent advances in automatic differentiation to develop a general-purpose nonlinear optimization solver that is well suited to bilevel optimization. We then demonstrate the validity and scalability of our algorithm with two representative robotic problems, namely robust control and parameter estimation for a system involving contact. We stress the general nature of the algorithm and its potential relevance to many other problems in robotics.
    [Full Paper]
    Landry, Benoit; Manchester, Zachary; Pavone, Marco
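
    The augmented Lagrangian machinery referenced in the abstract above follows a standard outer loop, which the sketch below illustrates on a toy equality-constrained problem (minimize f subject to c(x) = 0). The toy objective, the constraint, and the use of scipy's BFGS inner solver are assumptions for illustration; the paper additionally differentiates through this procedure for bilevel problems.

```python
import numpy as np
from scipy.optimize import minimize

def augmented_lagrangian(f, c, x0, rho=10.0, iters=10):
    """Solve  min f(x)  s.t.  c(x) = 0  with an augmented Lagrangian outer loop."""
    x = np.asarray(x0, dtype=float)
    lam = np.zeros(np.atleast_1d(c(x)).shape)
    for _ in range(iters):
        # Inner problem: unconstrained minimization of the augmented Lagrangian.
        def L(z):
            cz = np.atleast_1d(c(z))
            return f(z) + lam @ cz + 0.5 * rho * cz @ cz
        x = minimize(L, x, method="BFGS").x
        # Dual update (first-order multiplier estimate), then tighten the penalty.
        lam = lam + rho * np.atleast_1d(c(x))
        rho *= 2.0
    return x, lam

# Toy problem: minimize x^2 + y^2 subject to x + y - 1 = 0  (solution: x = y = 0.5).
f = lambda x: x[0] ** 2 + x[1] ** 2
c = lambda x: np.array([x[0] + x[1] - 1.0])
x_opt, lam_opt = augmented_lagrangian(f, c, x0=[0.0, 0.0])
print(x_opt)  # approximately [0.5, 0.5]
```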

    A Magnetically-Actuated Untethered Jellyfish-Inspired Soft Milliswimmer Untethered small-scale soft robots can potentially be used in healthcare and biomedical applications. They can access small spaces and reshape their bodies in a programmable manner to adapt to unstructured environments and have diverse dynamic behaviors. However, the functionalities of current miniature soft robots are limited, restricting their applications in medical procedures. Taking advantage of the shape-programmability of magnetic soft composite materials, we propose an untethered soft millirobot (jellyfishbot) that can swim like a jellyfish by time- and trajectory-asymmetric up and down beating of its lappets. Its swimming speed and direction can be controlled by tuning the magnitude, frequency, and direction of the external oscillating magnetic field. We demonstrate that such a jellyfishbot can perform several tasks that could be useful for medical applications, such as delivering drugs, clogging a narrow tube or vessel, and patching a target area under ultrasound imaging-based guidance. The millirobot presented in this paper could be used inside organs that are completely filled with fluid, such as the bladder or an inflated stomach.
    [Full Paper] [Video]
    Ren, Ziyu; Wang, Tianlu; Hu, Wenqi; Sitti, Metin

    Value Iteration Networks on Multiple Levels of Abstraction Learning-based methods are a promising way to plan robot motion without the extensive search required by many non-learning approaches. Value Iteration Networks (VINs) have recently received much interest since, in contrast to standard CNN-based architectures, they learn goal-directed behaviors which generalize well to unseen domains. However, VINs are restricted to small and low-dimensional domains, limiting their applicability to real-world planning problems. To address this issue, we propose to extend VINs to representations with multiple levels of abstraction. While the vicinity of the robot is represented in sufficient detail, the representation gets spatially coarser with increasing distance from the robot. The information loss caused by the decreasing resolution is compensated by increasing the number of features representing a cell. We show that our approach is capable of solving significantly larger 2D grid world planning tasks than the original VIN implementation. In contrast to a multiresolution coarse-to-fine VIN implementation which does not employ additional descriptive features, our approach is capable of solving challenging environments, which demonstrates that the proposed method learns to encode useful information in the additional features. As an application for solving real-world planning tasks, we successfully employ our method to plan omnidirectional driving for a search-and-rescue robot in cluttered terrain.
    [Full Paper]
    Schleich, Daniel; Klamt, Tobias; Behnke, Sven
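
    The planning computation that a Value Iteration Network embeds is ordinary value iteration on a grid; the multi-level variant in the abstract above runs it on representations of decreasing resolution. A minimal single-resolution sketch, where the grid size, rewards, and discount factor are toy assumptions:

```python
import numpy as np

def value_iteration(reward, obstacles, gamma=0.95, iters=100):
    """Value iteration on a 4-connected 2D grid world with deterministic moves."""
    value = np.zeros_like(reward)
    for _ in range(iters):
        # Best neighbor value for each cell (up/down/left/right), padded at borders.
        padded = np.pad(value, 1, constant_values=-np.inf)
        neighbors = np.stack([padded[:-2, 1:-1], padded[2:, 1:-1],
                              padded[1:-1, :-2], padded[1:-1, 2:]])
        value = reward + gamma * neighbors.max(axis=0)
        value[obstacles] = -np.inf  # untraversable cells
    return value

# Toy 6x6 grid: goal reward in one corner, a wall of obstacles in the middle.
reward = np.full((6, 6), -0.1)
reward[5, 5] = 1.0
obstacles = np.zeros((6, 6), dtype=bool)
obstacles[2, 1:5] = True
print(np.round(value_iteration(reward, obstacles), 2))
```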

    From Explanation to Synthesis: Compositional Program Induction for Learning from Demonstration Hybrid systems are a compact and natural mechanism with which to address problems in robotics. This work introduces an approach to learn hybrid systems from demonstrations, with an emphasis on extracting models that are explicitly verifiable and easily interpreted by robot operators. We fit a sequence of controllers using sequential importance sampling under a generative switching proportional controller task model. Here, we parameterise controllers using a proportional gain and a visually verifiable joint angle goal. Inference under this model is challenging, but we address this by introducing an attribution prior extracted from a neural end-to-end visuomotor control model. Given the sequence of controllers comprising a task, we simplify the trace using grammar parsing strategies, taking advantage of the sequence compositionality, before grounding the controllers by training perception networks to predict goals given images. Using this approach, we are successfully able to induce a program for a visuomotor reaching task involving loops and conditionals from a single demonstration and a neural end-to-end model. In addition, we are able to discover the program used for a tower building task. We argue that computer program-like control systems are more interpretable than alternative end-to-end learning approaches, and that hybrid systems inherently allow for better generalisation across task configurations.
    [Full Paper]
    Burke, Michael; Penkov, Svetlin Valentinov; Ramamoorthy, Subramanian

    Segment2Regress: Monocular 3D Vehicle Localization in Two Stages High-quality depth information is required to perform 3D vehicle detection; consequently, there exists a large performance gap between camera- and LiDAR-based approaches. In this paper, our monocular camera-based 3D vehicle localization method alleviates the dependency on high-quality depth maps by taking advantage of the commonly accepted assumption that the observed vehicles lie on the road surface. We propose a two-stage approach that consists of a segment network and a regression network, called Segment2Regress. For a given single RGB image and a prior 2D object detection bounding box, the two stages are as follows: 1) The segment network activates the pixels under the vehicle (modeled as four line segments and a quadrilateral representing the area beneath the vehicle projected on the image coordinate). These segments are trained to lie on the road plane such that our network does not require full depth estimation. Instead, the depth is directly approximated from the known ground plane parameters. 2) The regression network takes the segments fused with the plane depth to predict the 3D location of a car at the ground level. To stabilize the regression, we introduce a coupling loss that enforces structural constraints. The efficiency, accuracy, and robustness of the proposed technique are highlighted through a series of experiments and ablation assessments. These tests are conducted on the KITTI bird's eye view dataset where Segment2Regress demonstrates state-of-the-art performance.
    [Full Paper]
    Choe, Jaesung; Joo, Kyungdon; Rameau, Francois; Shim, Gyu Min; Kweon, In So

    On the Merits of Joint Space and Orientation Representations in Learning the Forward Kinematics in SE(3) This paper investigates the influence of different joint space and orientation representations on the approximation of the forward kinematics. We consider all degrees of freedom in three dimensional space SE(3) and in the robot's joint space Q. In order to approximate the forward kinematics, different shallow artificial neural networks with ReLU (rectified linear unit) activation functions are designed. The number of weights and biases in each network is normalized. The results show that quaternion/vector-pairs outperform other SE(3) representations with respect to the approximation capabilities, which is demonstrated with two robot types: a Stanford Arm and a concentric tube continuum robot. For the latter, experimental measurements from a robot prototype are used as well. On measured data, using quaternion/vector-pairs makes the approximation seven times more accurate with respect to translation and three times more accurate with respect to rotation. By utilizing a four-parameter orientation representation, the position tip error is less than 0.8% with respect to the robot length on measured data, showing higher accuracy compared to state-of-the-art modeling (1.5%) for concentric tube continuum robots. Other three-parameter representations of SO(3) cannot achieve this, for instance any sets of Euler angles (in the best case 3.5% with respect to the robot length).
    [Full Paper]
    Grassmann, Reinhard M.; Burgner-Kahrs, Jessica

    LeTS-Drive: Driving among a Crowd by Learning from Tree Search Autonomous driving in a crowded environment, e.g., a busy traffic intersection, is an unsolved challenge for robotics. The robot vehicle must contend with a dynamic and partially observable environment, noisy sensors, and many agents. A principled approach is to formalize it as a Partially Observable Markov Decision Process (POMDP) and solve it through online belief-tree search. To handle a large crowd and achieve real-time performance in this very challenging setting, we propose LeTS-Drive, which integrates online POMDP planning and deep learning. It consists of two phases. In the offline phase, we learn a policy and the corresponding value function by imitating the belief tree search. In the online phase, the learned policy and value function guide the belief tree search. LeTS-Drive leverages the robustness of planning and the runtime efficiency of learning to enhance the performance of both. Experimental results in simulation show that LeTS-Drive outperforms either planning or imitation learning alone and develops sophisticated driving skills.
    [Full Paper]
    Cai, Panpan; Luo, Yuanfu; Saxena, Aseem; Hsu, David; Lee, Wee Sun

    An Omnidirectional Aerial Manipulation Platform for Contact-Based Inspection This paper presents an omnidirectional aerial manipulation platform for robust and responsive interaction with unstructured environments, toward the goal of contact-based inspection. The fully actuated tilt-rotor aerial system is equipped with a rigidly mounted end-effector, and is able to exert a 6 degree of freedom force and torque, decoupling the system's translational and rotational dynamics, and enabling precise interaction with the environment while maintaining stability. An impedance controller with selective apparent inertia is formulated to permit compliance in certain degrees of freedom while achieving precise trajectory tracking and disturbance rejection in others. Experiments demonstrate disturbance rejection, push-and-slide interaction, and on-board state estimation with depth servoing to interact with local surfaces. The system is also validated as a tool for contact-based non-destructive testing of concrete infrastructure.
    [Full Paper] [Video]
    Bodie, Karen; Brunner, Maximilian; Pantic, Michael; Walser, Stefan; Pfändler, Patrick; Angst, Ueli; Siegwart, Roland; Nieto, Juan

    Unsupervised Visuomotor Control through Distributional Planning Networks While reinforcement learning (RL) has the potential to enable robots to autonomously acquire a wide range of skills, in practice, RL usually requires manual, per-task engineering of reward functions, especially in real world settings where aspects of the environment needed to compute progress are not directly accessible. To enable robots to autonomously learn skills, we instead consider the problem of reinforcement learning without access to rewards. We aim to learn an unsupervised embedding space under which the robot can measure progress towards a goal for itself. Our approach explicitly optimizes for a metric space under which action sequences that reach a particular state are optimal when the goal is the final state reached. This enables learning effective and control-centric representations that lead to more autonomous reinforcement learning algorithms. Our experiments on three simulated environments and two real-world manipulation problems show that our method can learn effective goal metrics from unlabeled interaction, and use the learned goal metrics for autonomous reinforcement learning.
    [Full Paper] [Video]
    Yu, Tianhe; Shevchuk, Gleb; Sadigh, Dorsa; Finn, Chelsea

    A Dynamical System Approach to Motion and Force Generation in Contact Tasks Many tasks require the robot to come into contact with surfaces, be it to take support, to polish or to grasp an object. It is crucial that the robot controls forces both upon making contact and while in contact. While many solutions exist to control contact, none offer the required robustness to adapt to real-world uncertainties, such as sudden displacement of the object before and after contact is made. Adapting to such disturbances requires re-planning both the trajectory and the force on the fly. Dynamical systems (DS) offer a framework for instant re-planning of trajectories. They are, however, limited to the control of motions. We extend this framework here to enable generating contact forces and trajectories through DS. The framework also allows modulating the impedance so as to provide rigidity for maintaining contact and compliance to ensure safe interaction with humans. We validate the approach in single- and dual-arm settings using KUKA LWR 4+ robotic arms. We show that the approach allows 1) to make smooth contact while applying large forces, 2) to maintain a desired contact force when scanning non-linear surfaces, even when the surface is moved, and 3) to smoothly grasp and lift an object into the air, and to re-balance forces on the fly to maintain the grasp even when subjected to strong external disturbances.
    [Full Paper] [Video]
    Amanhoud, Walid; Khoramshahi, Mahdi; Billard, Aude

    Modeling and Analysis of Non-Unique Behaviors in Multiple Frictional Impacts Many fundamental challenges in robotics, based in manipulation or locomotion, require making and breaking contact with the environment. To represent the complexity of frictional contact events, impulsive impact models are especially popular, as they often lead to mathematically and computationally tractable approaches. However, when two or more impacts occur simultaneously, the precise sequencing of impact forces is generally unknown, leading to the potential for multiple possible outcomes. This simultaneity is far from pathological, and occurs in many common robotics applications. In this work, we propose an approach for resolving simultaneous frictional impacts, represented as a differential inclusion. Solutions to our model, an extension to multiple contacts of Routh's method, naturally capture the set of potential post-impact velocities. We prove that solutions to the presented model must terminate. This is, to the best of our knowledge, the first such guarantee for set-valued outcomes to simultaneous frictional impacts.
    [Full Paper]
    Halm, Mathew; Posa, Michael

    Learning Reward Functions by Integrating Human Demonstrations and Preferences Our goal is to accurately and efficiently learn reward functions for autonomous robots. Current approaches to this problem include inverse reinforcement learning (IRL), which uses expert demonstrations, and preference-based learning, which iteratively queries the user for her preferences between trajectories. In robotics however, IRL often struggles because it is difficult to get high-quality demonstrations; conversely, preference-based learning is very inefficient since it attempts to learn a continuous, high-dimensional function from binary feedback. We propose a new framework for reward learning, DemPref, that uses both demonstrations and preference queries to learn a reward function. Specifically, we (1) use the demonstrations to learn a coarse prior over the space of reward functions, to reduce the effective size of the space from which queries are generated; and (2) use the demonstrations to ground the (active) query generation process, to improve the quality of the generated queries. Our method alleviates the efficiency issues faced by standard preference-based learning methods and does not exclusively depend on (possibly low-quality) demonstrations. In numerical experiments, we find that DemPref is significantly more efficient than a standard active preference-based learning method. In a user study, we compare our method to a standard IRL method; we find that users rated the robot trained with DemPref as being more successful at learning their desired behavior, and preferred to use the DemPref system (over IRL) to train the robot.
    [Full Paper]
    Palan, Malayandi; Shevchuk, Gleb; Landolfi, Nicholas Charles; Sadigh, Dorsa

    Monte-Carlo Policy Synthesis in POMDPs with Quantitative and Qualitative Objectives Autonomous robots operating in uncertain environments often face the problem of planning under a mix of formal, qualitative requirements, for example the assertion that the robot reaches a goal location safely, and optimality criteria, for example that the path to the goal is as short or energy-efficient as possible. Such problems can be modeled as Partially Observable Markov Decision Processes (POMDPs) with quantitative and qualitative objectives. In this paper, we present a new policy synthesis algorithm, called Policy Synthesis with Statistical Model Checking (PO-SMC), for such POMDPs. While previous policy synthesis approaches for this setting use symbolic tools (for example satisfiability solvers) to meet the qualitative requirements, our approach is based on Monte Carlo sampling and uses Statistical Model Checking to ensure that the qualitative requirements are satisfied with high confidence. An appeal of statistical model checking is that it can handle rich temporal requirements such as safe-reachability, which combines the safety and reachability requirements into a single qualitative requirement, while being far more scalable than symbolic methods. While our use of sampling introduces approximations that symbolic approaches do not require, we present theoretical results showing that the error due to this approximation is bounded. Our experimental results demonstrate that PO-SMC consistently performs orders of magnitude faster than existing symbolic methods for policy synthesis under qualitative and quantitative requirements.
    [Full Paper]
    Redwan Newaz, Abdullah Al; Chaudhuri, Swarat; Kavraki, Lydia
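
    The statistical-model-checking step described above boils down to estimating, from Monte Carlo rollouts of a candidate policy, the probability that a qualitative requirement (e.g., safe-reachability) holds, together with a confidence bound. A minimal sketch using a one-sided Hoeffding bound; the stubbed rollout simulator, the threshold, and the rollout count are illustrative assumptions.

```python
import math
import random

def satisfies_requirement(policy, seed):
    """Roll out the policy once and report whether the run was safe and reached
    the goal. Stubbed out here with a biased coin for illustration."""
    random.seed(seed)
    return random.random() < 0.97

def check_policy(policy, threshold=0.9, num_rollouts=2000, delta=0.01):
    """Accept the policy if, with confidence 1 - delta, the satisfaction
    probability exceeds the threshold (one-sided Hoeffding bound)."""
    successes = sum(satisfies_requirement(policy, seed=i) for i in range(num_rollouts))
    p_hat = successes / num_rollouts
    epsilon = math.sqrt(math.log(1.0 / delta) / (2.0 * num_rollouts))
    return p_hat - epsilon >= threshold, p_hat, epsilon

print(check_policy(policy=None))
```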

    A 2-Approximation Algorithm for the Online Tethered Coverage Problem We consider the problem of covering a planar environment, possibly containing unknown obstacles, using a robot of square size D x D attached to a fixed point S by a cable of finite length L. The environment is discretized into 4-connected grid cells with resolution proportional to the robot size. Starting at S, the task of the robot is to visit every cell in the environment that is not occupied by obstacles and return to S with the cable fully retracted. Our goal is to minimize the total distance traveled by the robot to fully cover the unknown environment while avoiding tangling of the cable. In this paper, we present a novel online algorithm to solve this problem that achieves 2-approximation for the total distance traveled by the robot compared to the minimum distance that needs to be traveled. Our algorithm significantly improves the 2L/D-approximation achieved by the best previously known online algorithm designed for this problem. The approximation bound is also validated using rigorous simulated experiments.
    [Full Paper]
    Sharma, Gokarna; Poudel, Pavan; Dutta, Ayan; Zeinali, Vala; Talaei Khoei, Tala; Kim, Jong-Hoon

    Harnessing Reinforcement Learning for Neural Motion Planning Motion planning is an essential component in most of today's robotic applications. In this work, we consider the learning setting, where a set of solved motion planning problems is used to improve the efficiency of motion planning on different, yet similar problems. This setting is important in applications with rapidly changing environments, such as e-commerce. We investigate a general deep learning based approach, where a neural network is trained to map an image of the domain, the current robot state, and a goal robot state to the next robot state in the plan. We focus on the learning algorithm, and compare supervised learning methods with reinforcement learning (RL) algorithms. We first establish that supervised learning approaches are inferior in their accuracy due to insufficient data on the boundary of the obstacles, an issue that RL methods mitigate by actively exploring the domain. We then propose a modification of the popular DDPG RL algorithm that is tailored to motion planning domains, by exploiting the known model in the problem and the set of solved plans in the data. We show that our algorithm, dubbed DDPG-MP, significantly improves the accuracy of the learned motion planning policy. Finally, we show that given enough training data, our method can plan significantly faster on novel domains than off-the-shelf sampling based motion planners. Results of our experiments are shown at https://youtu.be/wHQ4Y4mBRb8.
    [Full Paper] [Video]
    Jurgenson, Tom; Tamar, Aviv

    Simultaneously Learning Vision and Feature-Based Control Policies for Real-World Ball-In-A-Cup We present a method for fast training of vision based control policies on real robots. The key idea behind our method is to perform multi-task Reinforcement Learning with auxiliary tasks that differ not only in the reward to be optimized but also in the state-space in which they operate. In particular, we allow auxiliary task policies to utilize task features that are available only at training-time. This allows for fast learning of auxiliary policies, which subsequently generate good data for training the main, vision-based control policies. This method can be seen as an extension of the Scheduled Auxiliary Control (SAC-X) framework. We demonstrate the efficacy of our method by using both a simulated and a real-world ball-in-a-cup game controlled by a robot arm. In simulation, our approach leads to significant learning speed-ups when compared to standard SAC-X. On the real robot we show that the task can be learned from scratch, i.e., with no transfer from simulation and no imitation learning. Videos of our learned policies running on the real robot can be found at https://sites.google.com/view/rss-2019-sawyer-bic/.
    [Full Paper]
    Schwab, Devin; Springenberg, Jost Tobias; Fernandes Martins, Murilo; Neunert, Michael; Lampe, Thomas; Abdolmaleki, Abbas; Hertweck, Tim; Hafner, Roland; Nori, Francesco; Riedmiller, Martin

    Automated Shapeshifting for Function Recovery in Damaged Robots A robot's mechanical parts routinely wear out from normal functioning and can be lost to injury. For autonomous robots operating in isolated or hostile environments, repair from a human operator is often not possible. Thus, much work has sought to automate damage recovery in robots. However, every case reported in the literature to date has accepted the damaged mechanical structure as fixed, and focused on learning new ways to control it. Here we show for the first time a robot that automatically recovers from unexpected damage by deforming its resting mechanical structure without changing its control policy. We found that, especially in the case of "deep insult", such as removal of all four of the robot's legs, the damaged machine evolves shape changes that not only recover the original level of function (locomotion) as before, but can in fact surpass the original level of performance (speed). This suggests that shape change, instead of control readaptation, may be a better method to recover function after damage in some cases.
    [Full Paper]
    Kriegman, Sam; Walker, Stephanie; Shah, Dylan S.; Kramer-Bottiglio, Rebecca; Bongard, Josh

    BayesSim: Adaptive Domain Randomization Via Probabilistic Inference for Robotics Simulators We introduce BayesSim, a framework for robotics simulations allowing a full Bayesian treatment for the parameters of the simulator. As simulators become more sophisticated and able to represent the dynamics more accurately, fundamental problems in robotics such as motion planning and perception can be solved in simulation and solutions transferred to the physical robot. However, even the most complex simulator might still not be able to represent reality in all its details either due to inaccurate parametrization or simplistic assumptions in the dynamic models. BayesSim provides a principled framework to reason about the uncertainty of simulation parameters. Given a black-box simulator (or generative model) that outputs trajectories of state and action pairs from unknown simulation parameters, followed by trajectories obtained with a physical robot, we develop a likelihood-free inference method that computes the posterior distribution of simulation parameters. This posterior can then be used in problems where Sim2Real is critical, for example in policy search. We compare the performance of BayesSim in obtaining accurate posteriors in a number of classical control and robotics problems, and show that the posterior computed from BayesSim can be used for domain randomization, outperforming alternative methods that randomize based on uniform priors.
    [Full Paper] [Video]
    Ramos, Fabio; Possas, Rafael; Fox, Dieter
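
    BayesSim itself fits a mixture-density posterior, but the likelihood-free idea in the abstract above can be illustrated with the simplest such method, rejection ABC: sample simulator parameters from the prior, simulate, and keep the parameters whose simulated trajectories resemble the real ones. Everything below (the toy friction simulator, the summary statistic, and the acceptance tolerance) is an assumption for illustration, not the paper's inference method.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulator(friction, num_steps=50):
    """Toy black-box simulator: a sliding block decelerating under friction."""
    v, traj = 2.0, []
    for _ in range(num_steps):
        v = max(v - friction * 0.1, 0.0)
        traj.append(v)
    return np.array(traj)

def summary(traj):
    """Summary statistic of a trajectory (here: mean velocity)."""
    return traj.mean()

# "Real" data generated with an unknown friction coefficient of 0.4 plus noise.
real_summary = summary(simulator(0.4) + 0.01 * rng.standard_normal(50))

# Rejection ABC: keep prior samples whose simulations match the real summary.
prior_samples = rng.uniform(0.0, 1.0, size=5000)
accepted = [f for f in prior_samples
            if abs(summary(simulator(f)) - real_summary) < 0.02]

posterior = np.array(accepted)
print(f"posterior over friction: mean={posterior.mean():.3f}, "
      f"std={posterior.std():.3f}, n={len(posterior)}")
```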

    Robot Adaptation to Unstructured Terrains by Joint Representation and Apprenticeship Learning When a mobile robot is deployed in a field environment, e.g., during a disaster response application, the capability of adapting its navigational behaviors to unstructured terrains is essential for effective and safe robot navigation. In this paper, we introduce a novel joint terrain representation and apprenticeship learning approach to implement robot adaptation to unstructured terrains. Different from conventional learning-based adaptation techniques, our approach provides a unified problem formulation that integrates representation and apprenticeship learning under a unified regularized optimization framework, instead of treating them as separate and independent procedures. Our approach also has the capability to automatically identify discriminative feature modalities, which can improve the robustness of robot adaptation. In addition, we implement a new optimization algorithm to solve the formulated problem, which provides a theoretical guarantee to converge to the global optimal solution. In the experiments, we extensively evaluate the proposed approach in real-world scenarios, in which a mobile robot navigates on familiar and unfamiliar unstructured terrains. Experimental results have shown that the proposed approach is able to transfer human expertise to robots with small errors, achieve superior performance compared with previous and baseline methods, and provide intuitive insights on the importance of terrain feature modalities.
    [Full Paper]
    Zhang, Hao; Siva, Sriram; Wigness, Maggie; Rogers III, John G.

    ChauffeurNet: Learning to Drive by Imitating the Best and Synthesizing the Worst Our goal is to train a policy for autonomous driving via imitation learning that is robust enough to drive a real vehicle. We find that standard behavior cloning is insufficient for handling complex driving scenarios, even when we leverage a perception system for preprocessing the input and a controller for executing the output on the car: 30 million examples are still not enough. We propose exposing the learner to synthesized data in the form of perturbations to the expert's driving, which creates interesting situations such as collisions and/or going off the road. Rather than purely imitating all data, we augment the imitation loss with additional losses that penalize undesirable events and encourage progress -- the perturbations then provide an important signal for these losses and lead to robustness of the learned model. We show that the ChauffeurNet model can handle complex situations in simulation, and present ablation experiments that emphasize the importance of each of our proposed changes and show that the model is responding to the appropriate causal factors. Finally, we demonstrate the model driving a real car at our test facility.
    [Full Paper]
    Bansal, Mayank; Krizhevsky, Alex; Ogale, Abhijit

    Impact-Friendly Robust Control Design with Task-Space Quadratic Optimization Almost all known robots fear impacts. Unlike humans, robots keep guarded motions to near zero-velocity prior to establishing contacts with their surroundings. This significantly slows down robotic tasks involving physical interaction. Two main ingredients are necessary to remedy this limitation: impact-friendly hardware design, and impact-friendly controllers. Our work focuses on the controller aspect. Task-space controllers formulated as quadratic programming (QP) are widely used in robotics to generate modular and reactive motion for a large range of task specifications under various constraints. We explicitly introduce a discrete impact dynamics model into QP-based controllers to generate robot motions that are robust to impact-induced state jumps in the joint velocities and joint torques. Our simulations validate that our proposed impact-friendly QP controller is robust to contacts, whether they are expected or not. Therefore, we can exploit it for establishing contacts with high velocities, and explicitly generate task-purpose impulsive forces.
    [Full Paper] [Video]
    Wang, Yuquan; Kheddar, Abderrahmane

    An Online Learning Approach to Model Predictive Control Model predictive control (MPC) is a powerful technique for solving dynamic control tasks. In this paper, we show that there exists a close connection between MPC and online learning, an abstract theoretical framework for analyzing online decision making in the optimization literature. This new perspective provides a foundation for leveraging powerful online learning algorithms to design MPC algorithms. Specifically, we propose a new algorithm based on dynamic mirror descent (DMD), an online learning algorithm that is designed for non-stationary setups. Our algorithm, Dynamic Mirror Descent Model Predictive Control (DMD-MPC), represents a general family of MPC algorithms that includes many existing techniques as special instances. DMD-MPC also provides a fresh perspective on previous heuristics used in MPC and suggests a principled way to design new MPC algorithms. In the experimental section of this paper, we demonstrate the flexibility of DMD-MPC, presenting a set of new MPC algorithms on a simple simulated cartpole and a simulated and real-world aggressive driving task. A video of the real-world experiment can be found at https://youtu.be/vZST3v0_S9w.
    [Full Paper]
    Wagener, Nolan; Cheng, Ching-an; Sacks, Jacob; Boots, Byron
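
    With a Euclidean mirror map, the dynamic mirror descent update behind DMD-MPC reduces to a gradient step on the control sequence followed by a shift to the next time step. The sketch below illustrates that loop on a toy scalar linear system with a quadratic cost and a finite-difference gradient; these choices are assumptions for illustration and not the paper's algorithm in full generality.

```python
import numpy as np

A, B = 1.0, 0.1          # scalar linear dynamics: x' = A x + B u
Q, R = 1.0, 0.01         # quadratic stage costs on state and control
HORIZON, STEP = 15, 0.5  # planning horizon and mirror-descent step size

def rollout_cost(x0, controls):
    x, cost = x0, 0.0
    for u in controls:
        x = A * x + B * u
        cost += Q * x ** 2 + R * u ** 2
    return cost

def grad(x0, controls, eps=1e-4):
    """Finite-difference gradient of the rollout cost w.r.t. the control sequence."""
    g = np.zeros_like(controls)
    for k in range(len(controls)):
        bumped = controls.copy()
        bumped[k] += eps
        g[k] = (rollout_cost(x0, bumped) - rollout_cost(x0, controls)) / eps
    return g

x, controls = 5.0, np.zeros(HORIZON)
for t in range(40):
    # Dynamic mirror descent with a Euclidean mirror map: one gradient step
    # on the current control sequence...
    controls = controls - STEP * grad(x, controls)
    # ...execute the first control on the "real" system...
    x = A * x + B * controls[0]
    # ...then shift the sequence forward to warm-start the next round.
    controls = np.append(controls[1:], 0.0)
print(f"state after 40 steps: {x:.3f}")
```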

    Real-Time Reactive Trip Avoidance for Powered Transfemoral Prostheses This paper presents a real-time reactive controller for a powered prosthesis that addresses the problem of trip avoidance. The control estimates the pose of the leg during swing with an extended Kalman filter, predicts future hip angles and hip heights using sparse Gaussian Processes, and reactively plans updated ankle and knee trajectories with a fast quadratic program solver to avoid trips. In preliminary experiments with an able-bodied user who purposefully lowered the hip to elicit trips on each swing, the proposed control reduced the rate of tripping by 68% when compared to a swing control that follows standard minimum-jerk trajectories. In addition, the proposed control also reduced the severity of toe-scuffing. To the best of our knowledge, this controller is the first to incorporate visual feedback in the real-time planning and control of a lower limb prosthesis during gait. The results demonstrate the potential of reactive and environment-aware controls to improve amputee gait robustness and encourage future development of leg prosthesis controls that can react in real-time to the environment and user state.
    [Full Paper] [Video]
    Thatte, Nitish; Srinivasan, Nandagopal; Geyer, Hartmut

    Robot Packing with Known Items and Nondeterministic Arrival Order This paper formulates two variants of packing problems in which the set of items is known but the arrival order is unknown. The goal is to certify that the items can be packed in a given container, and/or to optimize the size or cost of a container so that the items are guaranteed to be packable, regardless of arrival order. The nondeterministically ordered packing (NDOP) variant asks to generate a certificate that a packing plan exists for every ordering of items. Quasi-online packing (QOP) asks to generate a partially-observable packing policy that chooses the item location as each subsequent item is revealed. Theoretical analysis demonstrates that even the simple subproblem of verifying feasibility of a packing policy is NP-complete. Despite this worst-case complexity, practical solvers for both NDOP and QOP are developed, and experiments demonstrate their application to packing irregular 3D shapes with manipulator loading constraints.
    [Full Paper]
    Wang, Fan; Hauser, Kris

    Equivalence of the Projected Forward Dynamics and the Dynamically Consistent Inverse Solution The analysis, design, and motion planning of robotic systems often rely on their forward and inverse dynamic models. When executing a task involving interaction with the environment, both the task and the environment impose constraints on the robot’s motion. For modeling such systems, we need to incorporate these constraints in the robot’s dynamic model. In this paper, we define the class of Task-based Constraints (TbC) to prove that the forward dynamic models of a constrained system obtained through the Projection-based Dynamics (PbD), and the Operational Space Formulation (OSF) are equivalent. In order to establish such equivalence, we first generalize the OSF to a rank deficient Jacobian. This generalization allows us to numerically handle redundant constraints and singular configurations, without having to use different controllers in the vicinity of such configurations. We then reformulate the PbD constraint inertia matrix, generalizing all its previous distinct algebraic variations. We also analyse the condition number of different constraint inertia matrices, which affects the numerical stability of their inversion. Furthermore, we show that we can recover the operational space control with constraints from a multiple Task-based Constraint abstraction.
    [Full Paper]
    Moura, Joao; Ivan, Vladimir; Erden, Mustafa Suphi; Vijayakumar, Sethu

    Robust Singular Smoothers for Tracking Using Low-Fidelity Data Tracking underwater autonomous platforms is often challenged by noisy, biased, and discretized input data. Classic filters and smoothers based on standard assumptions of Gaussian white noise break down when presented with any of these challenges. Robust models (such as the Huber loss) and constraints (e.g. maximum velocity) are used to attenuate these issues. Here, we consider robust smoothing with singular covariance, which covers bias and correlated noise, as well as many specific model types, such as those used in navigation. In particular, we show how to combine singular covariance models with robust losses and state-space constraints in a unified framework that can handle very low-fidelity data. A noisy, biased, and discretized navigation dataset from a submerged, low-cost inertial measurement unit (IMU) package, with ultra short baseline (USBL) data for ground truth, provides an opportunity to stress-test the proposed framework, with promising results. We show how using the robust elements improves our ability to analyze the data, and present batch processing results for 10 minutes of data using three different frequencies of available USBL position fixes (gaps of 30 seconds, 1 minute, and 2 minutes). The results suggest that the framework can be extended to real-time tracking using robust windowed estimation.
    [Full Paper]
    Jonker, Jonathan; Aravkin, Aleksandr; Burke, James; Pillonetto, Gianluigi; Webster, Sarah E.
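
    The robust losses mentioned in the abstract above can be made concrete with the Huber loss and its classic use inside an iteratively reweighted least-squares (IRLS) smoother: quadratic near zero, linear in the tails, so gross outliers are down-weighted. The 1D smoothing problem below, with a quadratic smoothness prior and synthetic spikes, is a toy stand-in for the navigation models in the paper.

```python
import numpy as np

def huber_weight(residual, delta=0.5):
    """IRLS weight for the Huber loss: 1 in the quadratic region, delta/|r| in the tails."""
    r = np.abs(residual)
    return np.where(r <= delta, 1.0, delta / np.maximum(r, 1e-12))

def robust_smoother(y, smoothness=5.0, delta=0.5, iters=20):
    """Smooth a 1D track under a Huber data loss and a quadratic smoothness prior,
    solved by iteratively reweighted least squares (IRLS)."""
    n = len(y)
    D = np.diff(np.eye(n), axis=0)     # first-difference operator
    x = y.copy()
    for _ in range(iters):
        w = huber_weight(y - x, delta)  # outliers receive small weights
        A = np.diag(w) + smoothness * D.T @ D
        x = np.linalg.solve(A, w * y)
    return x

# Toy track: smooth signal plus Gaussian noise, corrupted by a few gross outliers.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 200)
y = np.sin(2 * np.pi * t) + 0.05 * rng.standard_normal(200)
y[[30, 90, 150]] += 5.0                # spikes from bad fixes
x_hat = robust_smoother(y)
print(f"max error vs. clean signal: {np.max(np.abs(x_hat - np.sin(2 * np.pi * t))):.3f}")
```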

    Teleoperator Imitation with Continuous-Time Safety Learning to effectively imitate human teleoperators, with generalization to unseen and dynamic environments, is a promising path to greater autonomy enabling robots to steadily acquire complex skills from supervision. We propose a new motion learning technique rooted in contraction theory and sum-of-squares programming for estimating a control law in the form of a polynomial vector field from a given set of demonstrations. Notably, this vector field is provably optimal for the problem of minimizing imitation loss while providing continuous-time guarantees on the induced imitation behavior. Our method generalizes to new initial and goal poses of the robot and can adapt in real-time to dynamic obstacles during execution, with convergence to teleoperator behavior within a well-defined safety tube. We present an application of our framework for pick-and-place tasks in the presence of moving obstacles on a 7-DOF KUKA IIWA arm. The method compares favorably to other learning-from-demonstration approaches on benchmark handwriting imitation tasks.
    [Full Paper] [Video]
    El Khadir, Bachir; Varley, Jacob; Sindhwani, Vikas

    Differentiable Algorithm Networks for Composable Robot Learning This paper introduces the Differentiable Algorithm Network (DAN), a composable architecture for robot learning systems. A DAN is composed of neural network modules, each encoding a differentiable robot algorithm and an associated model; and it is trained end-to-end from data. DAN combines the strengths of model-driven modular system design and data-driven end-to-end learning. The algorithms and models act as structural assumptions to reduce the data requirements for learning; end-to-end learning allows the modules to adapt to one another and compensate for imperfect models and algorithms, in order to achieve the best overall system performance. We illustrate the DAN methodology through a case study on a simulated robot system, which learns to navigate in complex 3-D environments with only local visual observations and an image of a partially correct 2-D floor map.
    [Full Paper] [Video]
    Karkus, Peter; Ma, Xiao; Hsu, David; Kaelbling, Leslie; Lee, Wee Sun; Lozano-Perez, Tomas

    Online Incremental Learning of the Terrain Traversal Cost in Autonomous Exploration In this paper, we address motion efficiency in autonomous robot exploration with multi-legged walking robots that can traverse rough terrains at the cost of lower efficiency and greater body vibration. We propose a robotic system for online and incremental learning of the terrain traversal cost that is immediately utilized to reason about the next navigational goals while building a spatial model of the robot's surroundings. The traversal cost experienced by the robot is characterized by incrementally constructed Gaussian Processes using a Bayesian Committee Machine. During the exploration, the robot builds the spatial terrain model, marks untraversable areas, and leverages the Gaussian Process predictive variance to decide whether to improve the spatial model or decrease the uncertainty of the terrain traversal cost. The feasibility of the proposed approach has been experimentally verified in a fully autonomous deployment with a hexapod walking robot.
    [Full Paper] [Video]
    Pragr, Milos; Cizek, Petr; Bayer, Jan; Faigl, Jan
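
    The Bayesian Committee Machine referenced in the abstract above fuses the predictions of several Gaussian Process experts by combining their predictive precisions while subtracting the redundant copies of the prior. A minimal sketch of the standard fusion rule (assuming a zero-mean prior); the expert predictions and prior variance below are illustrative numbers, not values from the paper.

```python
import numpy as np

def bcm_fuse(means, variances, prior_variance):
    """Bayesian Committee Machine fusion of M Gaussian expert predictions.

    Assuming a zero-mean prior:
      combined precision = sum of expert precisions - (M - 1) * prior precision
      combined mean      = combined variance * sum(expert precision * expert mean)
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    M = len(means)
    precision = np.sum(1.0 / variances) - (M - 1) / prior_variance
    variance = 1.0 / precision
    mean = variance * np.sum(means / variances)
    return mean, variance

# Three GP experts, each trained on a different batch of traversal-cost samples.
mean, var = bcm_fuse(means=[1.2, 1.0, 1.1], variances=[0.10, 0.20, 0.15],
                     prior_variance=1.0)
print(f"fused traversal cost: {mean:.3f} +/- {np.sqrt(var):.3f}")
```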

    Predicting Human Interpretations of Affect and Valence in a Social Robot As the adoption of robots becomes widespread across more industries and domains, those robots will be placed in new contexts where they will interact with people who do not understand how they work. The consequences of such a disparity can already be seen in how people assign anthropomorphic characteristics to those robots, despite what the robot designers may have intended. In this paper, we seek to understand how people interpret a social robot's performance of an emotion, what we term 'affective display,' and the positive or negative valence of that affect. To this end, we tasked annotators with observing the Anki Cozmo robot perform its over 900 pre-scripted behaviors and labeling those behaviors with 16 possible affective display labels (e.g., interest, boredom, disgust, etc.). In our first experiment, we trained a neural network to predict annotated labels given multimodal information about the robot's movement, face, and audio. The results suggest that pairing affects to predict the valence between them is more informative, which we confirmed in a second experiment. Both experiments show that certain modalities are more useful for predicting displays of affect and valence. For our final experiment, we generated novel robot behaviors and tasked human raters with assigning scores to valence pairs instead of applying labels, then compared our model's predictions of valence between the affective pairs to the human ratings. We conclude that robot designers and researchers cannot assume that people will perceive affect or valence as designed, and make several suggestions for directions of future work.
    [Full Paper]
    Kennington, Casey; McNeill, David

    Proximity Queries for Absolutely Continuous Parametric Curves In motion planning problems for autonomous robots, such as self-driving cars, the robot must ensure that its planned path is not in close proximity to obstacles in the environment. However, the problem of evaluating the proximity is generally non-convex and serves as a significant computational bottleneck for motion planning algorithms. In this paper, we present methods for a general class of absolutely continuous parametric curves to compute: (i) the minimum separating distance, (ii) tolerance verification, and (iii) collision detection. Our methods efficiently compute bounds on obstacle proximity by bounding the curve in a convex region. This bound is based on an upper bound on the curve arc length that can be expressed in closed form for a useful class of parametric curves including curves with trigonometric or polynomial bases. We demonstrate the computational efficiency and accuracy of our approach through numerical simulations of several proximity problems.
    [Full Paper] [Video]
    Lakshmanan, Arun; Patterson, Andrew; Cichella, Venanzio; Hovakimyan, Naira
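
    The abstract above mentions a closed-form arc-length upper bound used to enclose a curve in a convex region for conservative proximity checks. The sketch below illustrates one simple version of that idea for a polynomial curve on [0, 1]; the specific bound and the ball-shaped enclosure are assumptions for illustration, not the paper's exact construction.

        # Minimal sketch (assumed bound, not the paper's exact construction).
        import numpy as np

        def arc_length_upper_bound(coeffs):
            """coeffs[k] is the vector coefficient of t**k; bound the integral of ||p'(t)||."""
            # ||p'(t)|| <= sum_k k * ||c_k|| for t in [0, 1], so the same value
            # also bounds the arc length of p over [0, 1].
            return sum(k * np.linalg.norm(c) for k, c in enumerate(coeffs))

        def min_distance_lower_bound(coeffs, obstacle_point):
            """Conservative lower bound on the distance from the curve to a point."""
            p0 = np.asarray(coeffs[0], dtype=float)    # p(0)
            radius = arc_length_upper_bound(coeffs)    # the curve stays inside this ball
            return max(0.0, np.linalg.norm(obstacle_point - p0) - radius)

        # 2D cubic: p(t) = c0 + c1*t + c2*t**2 + c3*t**3
        coeffs = [np.array([0.0, 0.0]), np.array([1.0, 0.5]),
                  np.array([0.2, -0.3]), np.array([0.05, 0.1])]
        print(min_distance_lower_bound(coeffs, np.array([3.0, 2.0])))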

    A Modular Optimization Framework for Localization and Mapping This work approaches the challenge of dividing the problem of Simultaneous Localization and Mapping (SLAM) into its smallest possible constituents, in such a way that the reusability and interchangeability of each module is maximized. In particular, most components in the proposed system should not be aware of details such as whether the map comprises a single global map or a set of local submaps, whether the state vector is defined in SE(2) or SE(3), with or without velocity, etc. Any number of heterogeneous sensors can be used together and their information fused seamlessly into a consistent localization solution. The resulting system would be useful for researchers, easing the development of reproducible research and enabling the quick adoption of state-of-the-art algorithms into product prototypes. Our implementation has been tested with different sensors against the KITTI, EuRoC, and KAIST datasets. In this paper we focus on an introduction to the framework and on experimental results for 3D LiDAR odometry and mapping. LiDAR SLAM on the KITTI datasets achieves typical translation errors between 0.5% and 2% for most urban sequences, while processing the data at 1.5x the real-time rate with a reduced memory requirement thanks to our framework's capability to dynamically swap out from memory the parts of the map that are not immediately required, transparently loading them again when needed. The framework will be released as open-source at https://github.com/MOLAorg/mola
    [Full Paper] [Video]
    Blanco-Claraco, Jose Luis

    Continuous Direct Sparse Visual Odometry from RGB-D Images This paper reports on a novel formulation and evaluation of visual odometry from RGB-D images. Assuming a static scene, the developed theoretical framework generalizes the widely used direct energy formulation (photometric error minimization) technique for obtaining a rigid body transformation that aligns two overlapping RGB-D images to a continuous formulation. The continuity is achieved through functional treatment of the problem and representing the process models over RGB-D images in a reproducing kernel Hilbert space; consequently, the registration is not limited to the specific image resolution and the framework is fully analytical with a closed-form derivation of the gradient. We solve the problem by maximizing the inner product between two functions defined over RGB-D images, while the continuous action of the rigid body motion Lie group is captured through the integration of the flow in the corresponding Lie algebra. Energy-based approaches have been extremely successful and the developed framework in this paper shares many of their desirable properties such as the parallel structure on both CPUs and GPUs, sparsity, semi-dense tracking, avoiding explicit data association, which is computationally expensive, and possible extensions to the simultaneous localization and mapping frameworks. The evaluations on experimental data and comparison with the equivalent energy-based formulation of the problem confirm the effectiveness of the proposed technique, especially when the environment lacks structure and texture.
    [Full Paper]
    Ghaffari Jadidi, Maani; Clark, William; Bloch, Anthony; Eustice, Ryan; Grizzle, J. W.

    Cross-Modal Learning Filters for RGB-Neuromorphic Wormhole Learning Robots that need to act in an uncertain, populated, and varied world need heterogeneous sensors to be able to perceive and act robustly. For example, self-driving cars currently on the road are equipped with dozens of sensors of several types (lidar, radar, sonar, cameras, ...). All of this existing and emerging complexity opens up many interesting questions regarding how to deal with multi-modal perception and learning. The recently developed technique of "wormhole learning" shows that even temporary access to a different sensor with complementary invariance characteristics can be used to enlarge the operating domain of an existing object detector without the use of additional training data. For example, an RGB object detector trained with daytime data can be updated to function at night time by using a "wormhole" jump through a different modality that is more illumination invariant, such as an IR camera. It turns out that having an additional sensor improves performance, even if you subsequently lose it. In this work we extend wormhole learning to cope with sensors that are radically different, such as RGB cameras and event-based neuromorphic sensors. Their profound differences imply that we need a more careful selection of which samples to transfer, through the definition of "cross-modal learning filters". We venture into relatively unexplored territory of multi-modal observability that is not usually considered in machine learning. We show that wormhole learning increases performance even though the intermediate neuromorphic modality is on average much worse at the task. These results suggest that multi-modal learning for perception is still an early field and there might be many opportunities to improve the perception performance by accessing a rich set of heterogeneous sensors (even if some are not actually deployed on the robot).
    [Full Paper]
    Zanardi, Alessandro; Aumiller, Andreas Jianhao; Zilly, Julian; Censi, Andrea; Frazzoli, Emilio

    DensePhysNet: Learning Dense Physical Object Representations Via Multi-Step Dynamic Interactions We study the problem of learning physical object representations for robot manipulation. Understanding object physics is critical for successful object manipulation, but also challenging because physical object properties can rarely be inferred from the object’s static appearance. In this paper, we propose DensePhysNet, a system that actively executes a sequence of dynamic interactions (e.g., sliding and colliding), and uses a deep predictive model over its visual observations to learn dense, pixel-wise representations that reflect the physical properties of observed objects. Our experiments in both simulation and real settings demonstrate that the learned representations carry rich physical information, and can directly be used to decode physical object properties such as friction and mass. The use of dense representation enables DensePhysNet to generalize well to novel scenes with more objects than in training. With knowledge of object physics, the learned representation also leads to more accurate and efficient manipulation in downstream tasks than the state-of-the-art.
    [Full Paper]
    Xu, Zhenjia; Wu, Jiajun; Zeng, Andy; Tenenbaum, Joshua; Song, Shuran

    Probabilistic Multimodal Modeling for Human-Robot Interaction Tasks Human-robot interaction benefits greatly from multimodal sensor inputs as they enable increased robustness and generalization accuracy. Despite this observation, few HRI methods are capable of efficiently performing inference for multimodal systems. In this work, we introduce a reformulation of Interaction Primitives which allows for learning from demonstration of interaction tasks, while also gracefully handling nonlinearities inherent to multimodal inference in such scenarios. We also empirically show that our method results in more accurate, more robust, and faster inference than standard Interaction Primitives and other common methods in challenging HRI scenarios.
    [Full Paper] [Video]
    Campbell, Joseph; Stepputtis, Simon; Ben Amor, Heni

    Game Theoretic Planning for Self-Driving Cars in Competitive Scenarios We propose a nonlinear receding horizon game-theoretic planner for autonomous cars in competitive scenarios with other cars. The online planner is specifically formulated for a two car autonomous racing scenario in which each car tries to advance along a given track as far as possible with respect to the other car. The algorithm extends previous work on game-theoretic planning for single integrator agents to be suitable for autonomous cars in the following ways: (i) representing the trajectory as a piecewise polynomial, (ii) incorporating bicycle kinematics into the trajectory, and (iii) enforcing constraints on path curvature and acceleration. The game-theoretic planner iteratively plans a trajectory for the ego vehicle, then for the other vehicle, until convergence. Crucially, the trajectory optimization includes a sensitivity term that allows the ego vehicle to reason about how much the other vehicle will yield to the ego vehicle to avoid collision. The resulting trajectories for the ego vehicle exhibit rich racing strategies such as blocking, faking, and opportunistic overtaking. The game-theoretic planner is shown to significantly outperform a Model Predictive Control baseline planner in high-fidelity numerical simulations, in experiments with two scale-model autonomous cars, and in experiments with a full-scale autonomous car racing against a simulated vehicle.
    [Full Paper] [Video]
    Wang, Mingyu; Wang, Zijian; Talbot, John; Gerdes, J. Christian; Schwager, Mac

    PoseRBPF: A Rao-Blackwellized Particle Filter for 6D Object Pose Estimation Tracking 6D poses of objects from videos provides rich information to a robot in performing different tasks such as manipulation and navigation. In this work, we formulate the 6D object pose tracking problem in the Rao-Blackwellized particle filtering framework, where the 3D rotation and the 3D translation of the object are decoupled in the estimation process. This factorization allows our approach, called PoseRBPF, to efficiently estimate an object's 3D translation along with the full distribution over the 3D rotation. This is achieved by discretizing the rotation space in a fine-grained manner, and training an auto-encoder network to construct a codebook of feature embeddings for the discretized rotations. As a result, PoseRBPF can track objects with arbitrary symmetries while still maintaining adequate posterior distributions. Our approach achieves state-of-the-art results on two 6D pose estimation benchmarks.
    [Full Paper] [Video]
    Deng, Xinke; Mousavian, Arsalan; Xiang, Yu; Xia, Fei; Bretl, Timothy; Fox, Dieter

    Leveraging Experience in Lazy Search Lazy graph search algorithms are efficient at solving motion planning problems where edge evaluation is the computational bottleneck. These algorithms work by lazily computing the shortest potentially feasible path, evaluating edges along that path, and repeating until a feasible path is found. The order in which edges are selected is critical to minimizing the total number of edge evaluations: a good edge selector chooses edges that are not only likely to be invalid, but also eliminates future paths from consideration. We wish to learn such a selector by leveraging prior experience. We formulate this problem as a Markov Decision Process (MDP) on the state of the search problem. While solving this large MDP is generally intractable, we show that we can compute oracular selectors that can solve the MDP during training. With access to such oracles, we use imitation learning to find effective policies. If new search problems are sufficiently similar to problems solved during training, the learned policy will choose a good edge evaluation ordering and solve the motion planning problem quickly. We evaluate our algorithms on a wide range of 2D and 7D problems and show that the learned selector outperforms commonly used baseline heuristics.
    [Full Paper]
    Bhardwaj, Mohak; Choudhury, Sanjiban; Boots, Byron; Srinivasa, Siddhartha
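
    The abstract above builds on the generic lazy-search loop: compute the shortest potentially feasible path under optimistic edge costs, then let a selector pick which unevaluated edge on it to check. The minimal sketch below shows that loop with a pluggable selector; the trivial "check the first unevaluated edge" selector is only a placeholder for the learned policy, and the networkx-based bookkeeping is an assumption for illustration.

        # Minimal sketch of the lazy-search loop; the selector is a placeholder
        # for the learned policy described in the abstract.
        import networkx as nx

        def lazy_shortest_path(graph, start, goal, is_edge_valid, select_edge):
            while True:
                try:
                    path = nx.shortest_path(graph, start, goal, weight="weight")
                except nx.NetworkXNoPath:
                    return None
                unevaluated = [(u, v) for u, v in zip(path, path[1:])
                               if not graph.edges[u, v].get("evaluated", False)]
                if not unevaluated:
                    return path                     # every edge on the path is valid
                u, v = select_edge(unevaluated)     # the learned selector goes here
                graph.edges[u, v]["evaluated"] = True
                if not is_edge_valid(u, v):
                    graph.remove_edge(u, v)         # eliminates all paths through this edge

        def select_first(unevaluated_edges):        # placeholder selector
            return unevaluated_edges[0]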

    Towards Provably Not-At-Fault Control of Autonomous Robots in Arbitrary Dynamic Environments As autonomous robots increasingly become part of daily life, they will often encounter dynamic environments while only having limited information about their surroundings. Unfortunately, due to the possible presence of malicious dynamic actors, it is infeasible to develop an algorithm that can guarantee collision-free operation. Instead, one can attempt to design a control technique that guarantees the robot is not-at-fault in any collision. In the literature, making such guarantees in real time has been restricted to static environments or specific dynamic models. To ensure not-at-fault behavior, a robot must first correctly sense and predict the world around it within some sufficiently large sensor horizon (the prediction problem), then correctly control relative to the predictions (the control problem). This paper addresses the control problem by proposing Reachability-based Trajectory Design for Dynamic environments (RTD-D), which guarantees that a robot with an arbitrary nonlinear dynamic model correctly responds to predictions in arbitrary dynamic environments. RTD-D first computes a Forward Reachable Set (FRS) offline of the robot tracking parameterized desired trajectories that include fail-safe maneuvers. Then, for online receding-horizon planning, RTD-D provides a way to discretize predictions of an arbitrary dynamic environment to enable real-time collision checking. The FRS is used to map these discretized predictions to provably not-at-fault trajectory plans. One such trajectory is chosen at each iteration, or the robot executes the fail-safe maneuver from its previous trajectory, which is guaranteed to be not at fault. RTD-D is shown to produce not-at-fault behavior over thousands of simulations and several real-world hardware demonstrations on two robots: a differential-drive Segway, and a small electric vehicle (EV).
    [Full Paper]
    Vaskov, Sean; Kousik, Shreyas; Larson, Hannah; Bu, Fan; Ward, James Robert; Worrall, Stewart; Johnson-Roberson, Matthew; Vasudevan, Ram

    Trajectory Optimization for Cable-Driven Soft Robot Locomotion Compliance is a defining characteristic of biological systems. Understanding how to exploit soft materials as effectively as living creatures do is consequently a fundamental challenge that is key to recreating the complex array of motor skills displayed in nature. As an important step towards this grand challenge, we propose a model-based trajectory optimization method for dynamic, cable-driven soft robot locomotion. To derive this trajectory optimization formulation, we begin by modeling soft robots using the Finite Element Method. Through a numerically robust implicit time integration scheme, forward dynamics simulations are used to predict the motion of the robot over arbitrarily long time horizons. Leveraging sensitivity analysis, we show how to efficiently compute analytic derivatives that encode the way in which entire motion trajectories change with respect to parameters that control cable contractions. This information is then used in a forward shooting method to automatically generate optimal locomotion trajectories starting from high-level goals such as the target walking speed or direction. We demonstrate the efficacy of our method by generating and analyzing locomotion gaits for multiple soft robots. Our results include both simulation and fabricated prototypes.
    [Full Paper] [Video]
    Bern, James; Banzet, Pol; Poranne, Roi; Coros, Stelian

    Direct Drive Hands: Force-Motion Transparency in Gripper Design The Direct Drive Hand (DDHand) project is exploring an alternative design philosophy for grippers. The conventional approach is to prioritize clamping force, leading to high gear ratios, slow motion, and poor transmission of force/motion signals. Instead, the DDHand prioritizes transparency: we view the gripper as a signal transmission channel, and seek high-bandwidth, high fidelity transmission of force and motion signals in both directions. The resulting design has no gears and no springs, occupying a new quadrant in the servo gripper design space. This paper presents the direct drive gripper design philosophy, compares the performance of different design choices, describes our current design and implementation, and demonstrates a fly-by “smack and snatch” grasping motion to show the gripper’s ability to safely detect and respond quickly to variations in the task environment.
    [Full Paper] [Video]
    Bhatia, Ankit; Johnson, Aaron; Mason, Matthew T.

    Local Koopman Operators for Data-Driven Control of Robotic Systems This paper presents a data-driven methodology for linear embedding of nonlinear systems. Utilizing structural knowledge of general nonlinear dynamics, the authors exploit the Koopman operator to develop a systematic, data-driven approach for constructing a linear representation in terms of higher order derivatives of the underlying nonlinear dynamics. With the linear representation, the nonlinear system is then controlled with an LQR feedback policy, the gains of which need to be calculated only once. As a result, the approach enables fast control synthesis. We demonstrate the efficacy of the approach with simulations and experimental results on the modeling and control of a tail-actuated robotic fish and show that the proposed policy is comparable to backstepping control. To the best of our knowledge, this is the first experimental validation of Koopman-based LQR control.
    [Full Paper] [Video]
    Mamakoukas, Giorgos; Castano, Maria; Tan, Xiaobo; Murphey, Todd
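
    As a hedged illustration of the general Koopman-plus-LQR recipe the abstract refers to, the sketch below fits a linear model in a lifted space by least squares and computes a single LQR gain. The monomial lifting and all names are assumptions for illustration; the paper constructs its lifted states from higher-order derivatives of the underlying dynamics.

        # Minimal sketch: least-squares fit of z' = A z + B u in a lifted space,
        # followed by a one-time LQR gain computation (basis is an assumption).
        import numpy as np
        from scipy.linalg import solve_discrete_are

        def lift(x):
            # state plus a few monomial observables (assumed basis)
            return np.concatenate([x, [x[0] * x[1], x[0] ** 2, x[1] ** 2]])

        def fit_lifted_linear_model(X, U, X_next):
            """X, X_next: (N, nx) state snapshots; U: (N, nu) inputs."""
            Z = np.array([lift(x) for x in X])                  # (N, nz)
            Z_next = np.array([lift(x) for x in X_next])
            ZU = np.hstack([Z, U])                              # (N, nz + nu)
            AB, *_ = np.linalg.lstsq(ZU, Z_next, rcond=None)    # least-squares fit
            nz = Z.shape[1]
            return AB.T[:, :nz], AB.T[:, nz:]                   # A (nz, nz), B (nz, nu)

        def lqr_gain(A, B, Q, R):
            P = solve_discrete_are(A, B, Q, R)
            return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)    # u = -K @ lift(x)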

    DIViS: Domain Invariant Visual Servoing for Collision-Free Goal Reaching Robots should understand both semantics and physics to be functional in the real world. While robot platforms provide the means for interacting with the physical world, they cannot autonomously acquire object-level semantics without human assistance. In this paper, we investigate how to minimize human effort and intervention to teach robots to perform real-world tasks that incorporate semantics. We study this question in the context of visual servoing of mobile robots and propose DIViS, a Domain Invariant policy learning approach for collision-free Visual Servoing. DIViS incorporates high-level semantics from previously collected static human-labeled datasets and learns collision-free servoing entirely in simulation, without any real robot data. Nevertheless, DIViS can be deployed directly on a real robot and is capable of servoing to user-specified object categories while avoiding collisions in the real world. DIViS is not constrained to be queried with the final view of the goal; rather, it can robustly servo to image goals taken from the initial robot view under heavy occlusion without impairing its ability to maintain a collision-free path. We show the generalization capability of DIViS on real mobile robots in more than 90 real-world test scenarios with various unseen object goals in unstructured environments. DIViS is compared to prior approaches via real-world experiments and rigorous tests in simulation. For supplementary videos, see: https://fsadeghi.github.io/DIViS
    [Full Paper] [Video]
    Sadeghi, Fereshteh

    Risk Contours Map for Risk Bounded Motion Planning under Perception Uncertainties In this paper, we introduce the "risk contours map," which contains the risk information of different regions in uncertain environments. Risk is defined as the probability of collision of robots with obstacles in the presence of probabilistic uncertainties in the location, size, and geometry of obstacles. We use risk contours to obtain safe paths for robots with guaranteed bounded risk. We formulate the problem of obtaining risk contours as a chance constrained optimization. We leverage the theory of moments and nonnegative polynomials to provide a convex optimization in the form of a sum-of-squares optimization. The provided approach deals with nonconvex obstacles and both bounded and unbounded probabilistic uncertainties. We demonstrate the performance of the provided approach by solving risk-bounded motion planning problems.
    [Full Paper]
    M. Jasour, Ashkan; Williams, Brian
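
    For concreteness, a risk contour of the kind described above can be written as a chance constraint over the uncertain obstacle parameters. The notation below is an assumed sketch (x a robot position, omega the uncertain obstacle parameters, X_obs(omega) the resulting obstacle set, Delta the risk bound), not the paper's exact formulation.

        \mathcal{C}_\Delta = \Big\{ x \in \mathcal{X} \;:\;
            \mathrm{Prob}_{\omega}\big( x \in \mathcal{X}_{\mathrm{obs}}(\omega) \big) \le \Delta \Big\}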

    Toward Asymptotically-Optimal Inspection Planning Via Efficient Near-Optimal Graph Search Inspection planning, the task of planning motions that allow a robot to inspect a set of points of interest, has applications in domains such as industrial, field, and medical robotics. Inspection planning can be computationally challenging, as the search space over motion plans that inspect the points of interest grows exponentially with the number of inspected points. We propose a novel method, Incremental Random Inspection-roadmap Search (IRIS), that computes inspection plans whose length and set of inspected points asymptotically converge to those of an optimal inspection plan. IRIS incrementally densifies a motion planning roadmap using sampling-based algorithms, and performs efficient near-optimal graph search over the resulting roadmap as it is generated. We demonstrate IRIS’ efficacy on a simulated planar 5DOF manipulator inspection task and on a medical endoscopic inspection task for a continuum parallel surgical robot in anatomy segmented from patient CT data. We show that IRIS computes higher-quality inspection paths orders of magnitude faster than a prior state-of-the-art method.
    [Full Paper] [Video]
    Fu, Mengyu; Kuntz, Alan; Salzman, Oren; Alterovitz, Ron

    High-Throughput Computation of Shannon Mutual Information on Chip Exploration problems are fundamental to robotics, arising in various domains, ranging from search and rescue to space exploration. Many effective exploration algorithms rely on the computation of mutual information between the current map and potential future measurements in order to make planning decisions. Unfortunately, computing mutual information metrics is computationally challenging. In fact, a large fraction of the current literature focuses on approximation techniques to devise computationally-efficient algorithms. In this paper, we propose a novel computing hardware architecture to efficiently compute Shannon mutual information. The proposed architecture consists of multiple mutual information computation cores, each evaluating the mutual information between a single sensor beam and the occupancy grid map. The key challenge is to ensure that each core is supplied with data when requested, so that all cores are maximally utilized. Our key contribution consists of a novel memory architecture and data delivery method that ensures effective utilization of all mutual information computation cores. This architecture was optimized for 16 mutual information computation cores, and was implemented on an FPGA. We show that it computes the mutual information metric for an entire map of 20m × 20m at 0.1m resolution in near real time, at 2 frames per second, which is approximately two orders of magnitude faster, while consuming an order of magnitude less power, when compared to an equivalent implementation on a Xeon CPU.
    [Full Paper]
    Li, Peter Zhi Xuan; Zhang, Zhengdong; Karaman, Sertac; Sze, Vivienne
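
    The quantity each hardware core evaluates is, per the abstract, the Shannon mutual information between the occupancy map and a single beam measurement. Under the standard assumption of independent occupancy-grid cells, a hedged sketch of that quantity is given below (symbols assumed: M the map, Z the beam measurement, m_i and p_i the occupancy variable and probability of cell i along the beam); this is illustrative notation, not the paper's derivation.

        I(M; Z) \;=\; H(M) - \mathbb{E}_{Z}\!\big[ H(M \mid Z) \big]
                \;=\; \sum_{i \in \text{beam}} \Big( H(m_i) - \mathbb{E}_{Z}\big[ H(m_i \mid Z) \big] \Big),
        \qquad H(m_i) = -p_i \log p_i - (1 - p_i)\log(1 - p_i)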

    Planning with State Abstractions for Non-Markovian Task Specifications Oftentimes, we specify tasks for a robot using temporal language that can also span different levels of abstraction. The example command “go to the kitchen before going to the second floor” contains spatial abstraction, given that “floor” consists of individual rooms that can also be referred to in isolation (“kitchen”, for example). There is also a temporal ordering of events, defined by the word “before”. Previous works have used Linear Temporal Logic (LTL) to interpret temporal language (such as “before”), and Abstract Markov Decision Processes (AMDPs) to interpret hierarchical abstractions (such as “kitchen” and “second floor”), separately. To handle both types of commands at once, we introduce the Abstract Product Markov Decision Process (AP-MDP), a novel approach capable of representing non-Markovian reward functions at different levels of abstraction. The AP-MDP framework translates LTL into its corresponding automata, creates a product Markov Decision Process (MDP) of the LTL specification and the environment MDP, and decomposes the problem into subproblems to enable efficient planning with abstractions. AP-MDP performs faster than a non-hierarchical method of solving LTL problems in over 95% of tasks, and this number only increases as the size of the environment domain increases. We also present a neural sequence-to-sequence model trained to translate language commands into LTL expressions, and a new corpus of non-Markovian language commands spanning different levels of abstraction. We test our framework with the collected language commands on a drone, demonstrating that our approach enables a robot to efficiently solve temporal commands at different levels of abstraction.
    [Full Paper] [Video]
    Oh, Yoonseon; Patel, Roma; Nguyen, Thao; Huang, Baichuan; Pavlick, Ellie; Tellex, Stefanie

    Modeling and Control of Soft Robots Using the Koopman Operator and Model Predictive Control Controlling soft robots with precision is a challenge due in large part to the difficulty of constructing models that are amenable to model-based control design techniques. Koopman operator theory offers a way to construct explicit linear dynamical models of soft robots and to control them using established model-based linear control methods. This method is data-driven, yet unlike other data-driven models such as neural networks, it yields an explicit control-oriented linear model rather than just a "black-box" input-output mapping. This work describes this Koopman-based system identification method and its application to model predictive controller design. A model and an MPC controller of a pneumatic soft robot arm are constructed via the method, and their performance is evaluated over several trajectory-following tasks in the real world. On all of the tasks, the Koopman-based MPC controller outperforms a benchmark MPC controller based on a linear state-space model of the same system.
    [Full Paper] [Video]
    Bruder, Daniel; Gillespie, Brent; Remy, C. David; Vasudevan, Ram

    Real-Time Information-Theoretic Exploration with Gaussian Mixture Model Maps This paper develops an exploration framework that leverages Gaussian mixture models (GMMs) for high-fidelity perceptual modeling and exploits the compactness of the distributions for information sharing in communications-constrained applications. State-of-the-art, high-resolution perceptual modeling techniques do not always consider the implications of transferring the model across limited bandwidth communications channels, which is critical for real time information sharing. To bridge this gap in the state of the art, this paper presents a system that compactly represents sensor observations as GMMs and maintains a local occupancy grid map for a sampling-based motion planner that maximizes an information-theoretic objective function. The method is extensively evaluated in long duration simulations on an embedded PC and deployed to an aerial robot equipped with a 3D LiDAR. The result is significant memory efficiency as compared to state-of-the-art techniques.
    [Full Paper]
    Tabib, Wennie; Goel, Kshitij; Yao, John; Dabhi, Mosam; Boirum, Curtis; Michael, Nathan
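
    To make the compactness claim above concrete, the toy sketch below fits a GMM to a stand-in point cloud with scikit-learn and compares the parameter count against the raw points. The component count, library, and grid-scoring call are assumptions for illustration, not the authors' pipeline.

        # Toy sketch (library, component count, and scoring call are assumptions).
        import numpy as np
        from sklearn.mixture import GaussianMixture

        points = np.random.rand(50_000, 3)                  # stand-in for one LiDAR scan
        gmm = GaussianMixture(n_components=64, covariance_type="full").fit(points)

        raw_floats = points.size                            # 150,000 floats
        gmm_floats = 64 * (1 + 3 + 9)                       # weight + mean + full covariance
        print(f"compression ratio ~ {raw_floats / gmm_floats:.0f}x")

        # Cells of a local occupancy grid can then be scored with the model density:
        log_density = gmm.score_samples(np.array([[0.5, 0.5, 0.5]]))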

    Asymptotically Optimal Planning for Non-Myopic Multi-Robot Information Gathering This paper proposes a novel highly scalable sampling-based planning algorithm for multi-robot active information acquisition tasks in complex environments. Active information gathering scenarios include target localization and tracking, active SLAM, surveillance, environmental monitoring and others. The objective is to compute control policies for sensing robots which minimize the accumulated uncertainty of a dynamic hidden state over an a priori unknown horizon. To address this problem, we propose a new sampling-based algorithm that simultaneously explores both the robot motion space and the reachable information space. Unlike relevant sampling-based approaches, we show that the proposed algorithm is probabilistically complete, asymptotically optimal and is supported by convergence rate bounds. Moreover, we demonstrate that by introducing bias in the sampling process towards informative areas, the proposed method can quickly compute sensor policies that achieve desired levels of uncertainty in large-scale estimation tasks that may involve large sensor teams, workspaces, and dimensions of the hidden state. We provide extensive simulation results that corroborate the theoretical analysis and show that the proposed algorithm can address large-scale estimation tasks which were previously infeasible.
    [Full Paper]
    Kantaros, Yiannis; Schlotfeldt, Brent; Atanasov, Nikolay; Pappas, George J.

    Network Offloading Policies for Cloud Robotics: A Learning-Based Approach Today's robotic systems are increasingly turning to computationally expensive models such as deep neural networks (DNNs) for tasks like localization, perception, planning, and object detection. However, resource-constrained robots, like low-power drones, often have insufficient on-board compute resources or power reserves to scalably run the most accurate, state-of-the-art neural network compute models. Cloud robotics allows mobile robots the benefit of offloading compute to centralized servers if they are uncertain locally or want to run more accurate, compute-intensive models. However, cloud robotics comes with a key, often understated cost: communicating with the cloud over congested wireless networks may result in latency or loss of data. In fact, sending high data-rate video or LIDAR from multiple robots over congested networks can lead to prohibitive delay for real-time applications, which we measure experimentally. In this paper, we formulate a novel Robot Offloading Problem - how and when should robots offload sensing tasks, especially if they are uncertain, to improve accuracy while minimizing the cost of cloud communication? We formulate offloading as a sequential decision making problem for robots, and propose a solution using deep reinforcement learning. In both simulations and hardware experiments using state-of-the-art vision DNNs, our offloading strategy improves vision task performance by 1.3-2.3x over benchmark offloading strategies, allowing robots the potential to significantly transcend their on-board sensing accuracy at a limited cost of cloud communication.
    [Full Paper]
    Chinchali, Sandeep; Sharma, Apoorva; Harrison, James; Elhafsi, Amine; Kang, Daniel; Pergament, Evgenya; Cidon, Eyal; Katti, Sachin; Pavone, Marco

    Learning to Plan with Logical Automata This paper introduces the Logic-based Value Iteration Network (LVIN) framework, which combines imitation learning and logical automata to enable agents to learn complex behaviors from demonstrations. We address two problems with learning from expert knowledge: (1) how to generalize learned policies for a task to larger classes of tasks, and (2) how to account for erroneous demonstrations. Our LVIN model solves finite gridworld environments by instantiating a recurrent, convolutional neural network as a value iteration procedure over a learned Markov Decision Process (MDP) that factors into two MDPs: a small finite state automaton (FSA) corresponding to logical rules, and a larger MDP corresponding to motions in the environment. The parameters of LVIN (value function, reward map, FSA transitions, large MDP transitions) are approximately learned from expert trajectories. Since the model represents the learned rules as an FSA, the model is interpretable; since the FSA is integrated into planning, the behavior of the agent can be manipulated by modifying the FSA transitions. We demonstrate these abilities in several domains of interest, including a lunchbox-packing manipulation task and a driving domain.
    [Full Paper]
    Araki, Brandon; Vodrahalli, Kiran; Leech, Thomas; Vasile, Cristian Ioan; Donahue, Mark; Rus, Daniela

    Inverting Learned Dynamics Models for Aggressive Multirotor Control We present a control strategy that applies inverse dynamics to a learned acceleration error model for accurate multirotor control input generation. This allows us to retain accurate trajectory and control input generation despite the presence of exogenous disturbances and modeling errors. Although accurate control input generation is traditionally possible when combined with parameter learning-based techniques, we propose a method that can do so while solving the relatively easier non-parametric model learning problem. We show that our technique is able to compensate for a larger class of model disturbances than traditional techniques can and we show reduced tracking error while following trajectories demanding accelerations of more than 7 m/s^2 in multirotor simulation and hardware experiments.
    [Full Paper]
    Spitzer, Alexander; Michael, Nathan
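
    A minimal sketch of the control idea summarized above: subtract a learned, non-parametric prediction of the acceleration error from the desired acceleration before generating control inputs. The regressor choice and feature layout below are assumptions for illustration, not the paper's model.

        # Minimal sketch (regressor and feature layout are assumptions).
        import numpy as np
        from sklearn.neighbors import KNeighborsRegressor

        # features: e.g. current velocity and previous commanded acceleration;
        # targets: observed acceleration error (measured minus nominal-model prediction)
        error_model = KNeighborsRegressor(n_neighbors=5)    # non-parametric model
        error_model.fit(np.random.rand(200, 6), np.random.rand(200, 3))

        def corrected_acceleration_command(a_desired, v, a_previous_cmd):
            features = np.concatenate([v, a_previous_cmd]).reshape(1, -1)
            a_error_pred = error_model.predict(features)[0]
            return a_desired - a_error_pred                 # invert the learned error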

    Scalable and Congestion-Aware Routing for Autonomous Mobility-On-Demand Via Frank-Wolfe Optimization We consider the problem of vehicle routing for Autonomous Mobility-on-Demand (AMoD) systems, wherein a fleet of self-driving vehicles provides on-demand mobility in a given environment. Specifically, the task is to compute routes for the vehicles (both customer-carrying and empty-traveling) so that travel demand is fulfilled and operational cost is minimized. The routing process must account for congestion effects affecting travel times, as modeled via a volume-delay function (VDF). Route planning with VDF constraints is notoriously challenging, as such constraints compound the combinatorial complexity of the routing optimization process. Thus, current solutions for AMoD routing resort to relaxations of the congestion constraints, thereby trading optimality for computational efficiency. In this paper, we present the first computationally-efficient approach for AMoD routing where VDF constraints are explicitly accounted for. We demonstrate that our approach is faster by at least one order of magnitude with respect to the state of the art, while providing higher-quality solutions. From a methodological standpoint, the key technical insight is to establish a mathematical reduction of the AMoD routing problem to the classical traffic assignment problem (a related vehicle-routing problem where empty traveling vehicles are not present). Such a reduction allows us to extend powerful algorithmic tools for traffic assignment, which combine the classic Frank-Wolfe algorithm with modern techniques for pathfinding, to the AMoD routing problem. We provide strong theoretical guarantees for our approach in terms of near-optimality of the returned solution.
    [Full Paper]
    Solovey, Kiril; Salazar, Mauro; Pavone, Marco
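
    The classical Frank-Wolfe scheme for congestion-aware flow assignment that the abstract builds on alternates between an all-or-nothing shortest-path assignment under linearized costs and a convex-combination step. The toy sketch below uses a BPR-style volume-delay function and networkx shortest paths; the network, demands, and step-size rule are illustrative assumptions, and the paper's AMoD-specific reduction and pathfinding machinery are omitted.

        # Toy sketch: vanilla Frank-Wolfe traffic assignment with a BPR-style
        # volume-delay function (network, demands, and step size are assumptions).
        import networkx as nx

        def bpr_travel_time(flow, free_time, capacity, alpha=0.15, beta=4.0):
            return free_time * (1.0 + alpha * (flow / capacity) ** beta)

        def frank_wolfe(graph, demands, iterations=50):
            """graph: nx.DiGraph with 'free_time' and 'capacity' on every edge;
            demands: {(origin, destination): trips}."""
            flow = {e: 0.0 for e in graph.edges}
            for k in range(1, iterations + 1):
                # 1) linearize: set edge costs to the current travel times
                for u, v in graph.edges:
                    d = graph.edges[u, v]
                    d["cost"] = bpr_travel_time(flow[(u, v)], d["free_time"], d["capacity"])
                # 2) all-or-nothing assignment on shortest paths (linearized optimum)
                target = {e: 0.0 for e in graph.edges}
                for (s, t), demand in demands.items():
                    path = nx.shortest_path(graph, s, t, weight="cost")
                    for e in zip(path, path[1:]):
                        target[e] += demand
                # 3) convex-combination step toward the linearized optimum
                gamma = 2.0 / (k + 2.0)
                flow = {e: (1 - gamma) * flow[e] + gamma * target[e] for e in graph.edges}
            return flow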

    A Hierarchical Geometric Framework to Design Locomotive Gaits for Highly Articulated Robots Motion planning for mobile robots with many degrees-of-freedom (DoF) is challenging due to their high-dimensional configuration spaces. To manage this curse of dimensionality, this paper proposes a new hierarchical framework that decomposes the system into sub-systems (based on shared capabilities of DoFs), for which we can design and coordinate motions. Instead of constructing a high-dimensional configuration space, we establish a hierarchy of two-dimensional spaces on which we can visually design gaits using geometric mechanics tools. We then coordinate motions among the two-dimensional spaces in a pairwise fashion to obtain desired robot locomotion. Further geometric analysis of the two-dimensional spaces allows us to visualize the contribution of each sub-system to the locomotion, as well as the contribution of the coordination among the sub-systems. We demonstrate our approach by designing gaits for quadrupedal robots with different morphologies, and experimentally validate our findings on a robot with a long actuated back and intermediate-sized legs.
    [Full Paper]
    Zhong, Baxi; Ozkan-Aydin, Yasemin; Sartoretti, Guillaume; Rieser, Jennifer; Gong, Chaohui; Xing, Haosen; Goldman, Daniel; Choset, Howie

    Remote Telemanipulation with Adapting Viewpoints in Visually Complex Environments In this paper, we introduce a novel method to support remote telemanipulation tasks in complex environments by providing operators with an enhanced view of the task environment. Our method features a novel viewpoint adjustment algorithm designed to automatically mitigate occlusions caused by workspace geometry, supports visual exploration to provide operators with situation awareness in the remote environment, and mediates context-specific visual challenges by making viewpoint adjustments based on sparse input from the user. Our method builds on the dynamic camera telemanipulation viewing paradigm, where a user controls a manipulation robot, and a camera-in-hand robot alongside the manipulation robot servos to provide a sufficient view of the remote environment. We discuss the real-time motion optimization formulation used to arbitrate the various objectives in our shared-control-based method, particularly highlighting how our occlusion avoidance and viewpoint adaptation approaches fit within this framework. We present results from an empirical evaluation of our proposed occlusion avoidance approach as well as a user study that compares our telemanipulation shared-control method against alternative telemanipulation approaches. We discuss the implications of our work for future shared-control research and robotics applications.
    [Full Paper] [Video]
    Rakita, Daniel; Mutlu, Bilge; Gleicher, Michael

    Reachable Space Characterization of Markov Decision Processes with Time Variability We propose a solution to a time-varying variant of Markov Decision Processes which can be used to address the decision-theoretic planning problems for autonomous systems operating in unstructured outdoor environments. We explore the time variability property of the planning stochasticity and investigate the state reachability in order to design an efficient method that can well trade-off the solution optimality and time complexity. The reachability space is constructed by analyzing the means and variances of future states' reaching time. We validate our algorithm through extensive simulations using ocean data, and the results show that our method performs well in terms of both solution quality and computing time.
    [Full Paper]
    Xu, Junhong; Yin, Kai; Liu, Lantao

    Learning Deep Stochastic Optimal Control Policies Using Forward-Backward SDEs In this paper we propose a new methodology for decision-making under uncertainty using recent advancements in the areas of nonlinear stochastic optimal control theory, applied mathematics, and machine learning. Grounded on the fundamental relation between certain nonlinear partial differential equations and forward-backward stochastic differential equations, we develop a control framework that is scalable and applicable to general classes of stochastic systems and decision-making problem formulations in robotics and autonomy. The proposed deep neural network architectures for stochastic control consist of recurrent and fully connected layers. The performance and scalability of the aforementioned algorithm are investigated in three non-linear systems in simulation with and without control constraints. We conclude with a discussion on future directions and their implications to robotics.
    [Full Paper]
    Wang, Ziyi; Pereira, Marcus; Theodorou, Evangelos

    Conditional Neural Movement Primitives Conditional Neural Movement Primitives (CNMPs) is a learning-from-demonstration framework that is designed as a robotic movement learning and generation system built on top of a recent deep neural architecture, namely Conditional Neural Processes (CNPs). Based on CNPs, CNMPs extract the prior knowledge directly from the training data by sampling observations from it, and use it to predict a conditional distribution over any other target points. CNMPs specifically learn complex temporal multi-modal sensorimotor relations in connection with external parameters and goals; produce movement trajectories in joint or task space; and execute these trajectories through a high-level feedback control loop. Conditioned with an external goal that is encoded in the sensorimotor space of the robot, the CNMP generates the sensorimotor trajectory that is expected to be observed during the successful execution of the task, and the corresponding motor commands are executed. In order to detect and react to unexpected events during action execution, the CNMP is further conditioned with the actual sensor readings in each time-step. Through simulations and real robot experiments, we showed that CNMPs can learn the non-linear relations between low-dimensional parameter spaces and complex movement trajectories from few demonstrations; and they can also model the associations between high-dimensional sensorimotor spaces and complex motions using a large number of demonstrations. The experiments further showed that even when the task parameters were not explicitly provided to the system, the robot could learn their influence by associating the learned sensorimotor representations with the movement trajectories. The robot, for example, learned the influence of object weights and shapes by exploiting its sensorimotor space that includes proprioception and force measurements, and was able to change the movement trajectory on the fly when one of these factors was changed through external intervention.
    [Full Paper] [Video]
    Seker, Muhammet Yunus; Imre, Mert; Piater, Justus; Ugur, Emre

    Pareto Monte Carlo Tree Search for Multi-Objective Informative Planning In many environmental monitoring scenarios, the sampling robot needs to simultaneously explore the environment and exploit features of interest with limited time. We present an anytime multi-objective informative planning method called Pareto Monte Carlo Tree Search which allows the robot to handle potentially competing objectives such as exploration versus exploitation. The method produces optimized decision solutions for the robot based on its knowledge (estimation) of the environment state, leading to better adaptation to environmental dynamics. We provide an algorithmic analysis of the critical tree-node selection step and show that the number of times sub-optimal nodes are chosen is logarithmically bounded and that the search result converges to the optimal choices at a polynomial rate.
    [Full Paper] [Video]
    Chen, Weizhe; Liu, Lantao
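
    A hedged sketch of the tree-node selection step discussed above: compute a vector-valued UCB per child and restrict selection to the children whose UCB vectors are Pareto non-dominated. The data structures and exploration constant are assumptions for illustration, not the authors' algorithm.

        # Minimal sketch (node bookkeeping and exploration constant are assumptions);
        # assumes every child has been visited at least once.
        import math
        import random

        def ucb_vector(child, parent_visits, c=1.0):
            bonus = c * math.sqrt(math.log(parent_visits) / child["visits"])
            return [r / child["visits"] + bonus for r in child["reward_sums"]]

        def dominates(a, b):
            return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

        def select_child(children, parent_visits):
            ucbs = [ucb_vector(ch, parent_visits) for ch in children]
            front = [ch for ch, u in zip(children, ucbs)
                     if not any(dominates(v, u) for v in ucbs)]
            return random.choice(front)             # any Pareto-optimal child is eligible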

    End-To-End Robotic Reinforcement Learning without Reward Engineering The combination of deep neural network models and reinforcement learning algorithms can make it possible to learn policies for robotic behaviors that directly read in raw sensory inputs, such as camera images, effectively subsuming both estimation and control into one model. However, real-world applications of reinforcement learning must specify the goal of the task by means of a manually programmed reward function, which in practice requires either designing the very same perception pipeline that end-to-end reinforcement learning promises to avoid, or else instrumenting the environment with additional sensors to determine if the task has been performed successfully. In this paper, we propose an approach for removing the need for manual engineering of reward specifications by enabling a robot to learn from a modest number of examples of successful outcomes, followed by actively solicited queries, where the robot shows the user a state and asks for a label to determine whether that state represents successful completion of the task. While requesting labels for every single state would amount to asking the user to manually provide the reward signal, our method requires labels for only a tiny fraction of the states seen during training, making it an efficient and practical approach for learning skills without manually engineered rewards. We evaluate our method on real-world robotic manipulation tasks where the observations consist of images viewed by the robot's camera. In our experiments, our method effectively learns to arrange objects, place books, and drape cloth, directly from images and without any manually specified reward functions, and with only 1-4 hours of interaction with the real world.
    [Full Paper]
    Singh, Avi; Yang, Larry; Finn, Chelsea; Levine, Sergey

    Learning Robotic Manipulation through Visual Planning and Acting Planning for robotic manipulation requires reasoning about the changes a robot can effect on objects. When such interactions can be modelled analytically, as in domains with rigid objects, efficient planning algorithms exist. However, in both domestic and industrial domains, the objects of interest can be soft, or deformable, and hard to model analytically. For such cases, we posit that a data-driven modelling approach is more suitable. In recent years, progress in deep generative models has produced methods that learn to `imagine' plausible images from data. Building on the recent Causal InfoGAN generative model, in this work we learn to imagine goal-directed object manipulation directly from raw image data of self-supervised interaction of the robot with the object. After learning, given a goal observation of the system, our model can generate an imagined plan -- a sequence of images that transition the object into the desired goal. To execute the plan, we use it as a reference trajectory to track with a visual servoing controller, which we also learn from the data as an inverse dynamics model. In a simulated manipulation task, we show that separating the problem into visual planning and visual tracking control is more sample efficient and more interpretable than alternative data-driven approaches. We further demonstrate our approach on learning to imagine and execute in 3 environments, the final of which is deformable rope manipulation on a PR2 robot.
    [Full Paper] [Video]
    Wang, Angelina; Kurutach, Thanard; Tamar, Aviv; Abbeel, Pieter

    Influencing Leading and Following in Human-Robot Teams Recent efforts in human-robot interaction have been focused on modeling and interacting with single human agents. However, when modeling teams of humans, current models are not able to capture underlying emergent dynamics that define group behavior, such as leading and following. We introduce a mathematical framework that enables robots to influence human teams by modeling emergent leading and following behaviors. We tackle the task in two steps. First, we develop a scalable representation of latent leading-following structures by combining model-based methods and data-driven techniques. Second, we optimize for a robot policy that leverages this representation to influence a human team toward a desired outcome. We demonstrate our approach on three tasks where a robot optimizes for changing a leader-follower relationship, distracting a team, and leading a team towards an optimal goal. Our evaluations show that our representation is scalable with different human team sizes, generalizable across different tasks, and can be used to design meaningful robot policies.
    [Full Paper]
    Kwon, Minae; Li, Mengxi; Bucquet, Alexandre; Sadigh, Dorsa

    Reduced Order vs. Discretized Lumped System Models with Absolute and Relative States for Continuum Manipulators A reliable, accurate, and yet simple dynamic model is important to analyze, design and control continuum manipulators. Such models should be fast, as simple as possible and user-friendly to be widely accepted by the ever-growing robotics research community. In this study, we introduce two new modeling methods for continuum manipulators: a general reduced-order model (ROM) and a discretized model with absolute states and Euler-Bernoulli beam segments (EBA). Additionally, a new formulation is presented for a recently introduced discretized model based on Euler-Bernoulli beam segments and relative states (EBR). The models are validated in comparison to experimental results for the dynamics of a STIFF-FLOP continuum appendage. Our comparison shows higher simulation accuracy (8-14% normalized error) and numerical robustness of the ROM model for a system with a small number of states, and computational efficiency of the EBA model with near real-time performance that makes it suitable for large systems. The challenges with designing control and observation scenarios are briefly discussed at the end.
    [Full Paper]
    Sadati, Seyedmohammadhadi; Shiva, Ali; Naghibi, Seyedeh Elnaz; Rucker, Caleb; Renson, Ludovic; Bergeles, Christos; Althoefer, Kaspar; Nanayakkara, Thrishantha; Hauser, Helmut; Walker, Ian

    Commonsense Reasoning and Knowledge Acquisition to Guide Deep Learning on Robots Algorithms based on deep network models are being used for many pattern recognition and decision-making tasks in robotics and AI. Training these models requires a large labeled dataset and considerable computational resources, which are not readily available in many domains. Also, it is difficult to understand the internal representations and reasoning mechanisms of these models. The architecture described in this paper attempts to address these limitations by drawing inspiration from research in cognitive systems. It uses non-monotonic logical reasoning with incomplete commonsense domain knowledge, and inductive learning of previously unknown constraints on the domain's states, to guide the construction of deep network models based on a small number of relevant training examples. As a motivating example, we consider a robot reasoning about the stability and partial occlusion of configurations of objects in simulated images. Experimental results indicate that in comparison with an architecture based just on deep networks, our architecture improves reliability, and reduces the sample complexity and time complexity of training deep networks.
    [Full Paper]
    Mota, Tiago; Sridharan, Mohan

    Trajectory Optimization on Manifolds: A Theoretically-Guaranteed Embedded Sequential Convex Programming Approach Sequential Convex Programming (SCP) has recently gained popularity as a tool for trajectory optimization due to its sound theoretical properties and practical performance. Yet, most SCP-based methods for trajectory optimization are restricted to Euclidean settings, which precludes their application to problem instances where one must reason about manifold-type constraints (that is, constraints, such as loop closure, which restrict the motion of a system to a subset of the ambient space). The aim of this paper is to fill this gap by extending SCP-based trajectory optimization methods to a manifold setting. The key insight is to leverage geometric embeddings to lift a manifold-constrained trajectory optimization problem into an equivalent problem defined over a space enjoying a Euclidean structure. This insight allows one to extend existing SCP methods to a manifold setting in a fairly natural way. In particular, we present a SCP algorithm for manifold problems with refined theoretical guarantees that resemble those derived for the Euclidean setting, and demonstrate its practical performance via numerical experiments.
    [Full Paper] [Video]
    Bonalli, Riccardo; Cauligi, Abhishek; Bylard, Andrew; Lew, Thomas; Pavone, Marco

    Motion Planning, Design Optimization and Fabrication of Ferromagnetic Swimmers Small-scale robots have the potential to impact many areas of medicine and manufacturing including targeted drug delivery, telemetry and micromanipulation. This paper develops an algorithmic framework for regulating external magnetic fields to induce motion in millimeter-scale robots in a viscous liquid, to simulate the physics of swimming at the micrometer scale. Our approach for planning motions for these swimmers is based on tools from geometric mechanics that provide a novel means to design periodic changes in the physical shape of a robot that propels it in a desired direction. Using these tools, we are able to derive new motion primitives for generating locomotion in these swimmers. We use these primitives for optimizing swimming efficiency as a function of its internal magnetization and describe a principled approach to encode the ‘best’ magnetization distributions in the swimmers. We validate this procedure experimentally and conclude by implementing these newly computed motion primitives on several magnetic swimmer prototypes that include two-link and three-link swimmers.
    [Full Paper] [Video]
    Grover, Jaskaran Singh; Vedova, Daniel; Jain, Nalini; Travers, Matthew; Choset, Howie

    Bayesian Estimator for Partial Trajectory Alignment The problem of temporal alignment of time series is common across many fields of study. Within the domain of robotics, human motion trajectories are one type of time series that is often utilized for recognition and prediction of human intent. In these applications, online temporal alignment of partial trajectories to a full representative trajectory is of particular interest, as it is desirable to make accurate intent prediction decisions early in a motion in order to enable proactive robot behavior. This is a particularly difficult problem, however, due to the potential for overlapping trajectory regions and temporary stops, both of which can degrade the performance of existing alignment techniques. Furthermore, it is desirable to not only provide the most likely alignment but also characterize the uncertainty around it, which current methods are unable to accomplish. To address these difficulties and drawbacks, we present BEST-PTA, a framework that combines optimization, supervised learning, and unsupervised learning components in order to build a Bayesian model that outputs distributions over likely correspondence points based on observed partial trajectory data. Through an evaluation incorporating multiple datasets, we show that BEST-PTA outperforms previous alignment techniques; furthermore, we demonstrate that this improvement can significantly boost human motion prediction performance and discuss the implications of these results on improving the quality of human-robot interaction.
    [Full Paper]
    Lasota, Przemyslaw A.; Shah, Julie A.

    Semi-Autonomous Robot Teleoperation with Obstacle Avoidance Via Model Predictive Control This paper proposes a model predictive control (MPC) approach for semi-autonomous teleoperation of robot manipulators: the focus is on automatically avoiding singular configurations of the robot and obstacles in the robot workspace with the whole robot frame, exploiting predictions of the operator’s motion. The hand pose of the human operator provides the reference for the end effector, and the robot motion is continuously replanned in real time, satisfying several constraints (including, in addition to those mentioned above, limits on joint accelerations, velocities, and positions). An experimental case study is described regarding the design and testing of the proposed framework on a UR5 manipulator: the discussion of the experimental results confirms the suitability of the proposed method for semi-autonomous robot teleoperation, both in terms of performance (tracking capability and constraint satisfaction) and computational complexity (the MPC control law is calculated well within the sampling interval).
    [Full Paper]
    Rubagotti, Matteo; Taunyazov, Tasbolat; Omarali, Bukeikhan; Shintemirov, Almas

    VIMO: Simultaneous Visual Inertial Model-Based Odometry and Force Estimation In recent years, many approaches to Visual Inertial Odometry (VIO) have become available. However, they neither exploit the robot's dynamics and known actuation inputs, nor differentiate between desired motion due to actuation and unwanted perturbation due to external force. For many robotic applications, it is often essential to sense the external force acting on the system due to, for example, interactions, contacts, and disturbances. Adding a motion constraint to an estimator leads to a discrepancy between the model-predicted motion and the actual motion. Our approach exploits this discrepancy and resolves it by simultaneously estimating the motion and the external force. We propose a relative motion constraint combining the robot's dynamics and the external force in a preintegrated residual, resulting in a tightly-coupled, sliding-window estimator exploiting all correlations among all variables. We implement our Visual Inertial Model-based Odometry (VIMO) system into a state-of-the-art VIO approach and evaluate it against the original pipeline without motion constraints on both simulated and real-world data. The results show that our approach increases the accuracy of the estimator by up to 29% compared to the original VIO, and provides external force estimates at no extra computational cost. To the best of our knowledge, this is the first approach exploiting model dynamics by jointly estimating motion and external force. Our implementation will be made available open-source.
    [Full Paper]
    Nisar, Barza; Foehn, Philipp; Falanga, Davide; Scaramuzza, Davide

    Approximate Bayesian Inference in Spatial Environments Model-based approaches bear great promise for decision making of agents interacting with the physical world. In the context of spatial environments, different types of problems such as localisation, mapping, navigation or autonomous exploration are typically addressed with specialised methods, often relying on detailed knowledge of the system at hand. We express these tasks as probabilistic inference and planning under the umbrella of deep sequential generative models. Using the frameworks of variational inference and neural networks, our method inherits favourable properties such as flexibility, scalability and the ability to learn from data. The method performs comparably to specialised state-of-the-art methodology in two distinct simulated environments.
    [Full Paper] [Video]
    Mirchev, Atanas; Kayalibay, Baris; Bayer, Justin; Soelch, Maximilian; van der Smagt, Patrick

    Systematic Handling of Heterogeneous Geometric Primitives in Graph-SLAM Optimization In this paper, we propose a pose-landmark graph optimization back-end that supports maps consisting of points, lines or planes. Our back-end allows representing both homogeneous (point-point, line-line, plane-plane) and heterogeneous measurements (point-on-line, point-on-plane, line-on-plane). Rather than treating all cases independently, we use a unified formulation that leads to both a compact derivation and a concise implementation. The additional geometric information, deriving from the use of higher-dimension primitives and constraints, yields increased robustness and widens the convergence basin of our method. We evaluate the proposed formulation both on synthetic and raw data. Finally, we made available an open-source implementation to replicate the experiments.
    [Full Paper]
    Aloise, Irvin; Della Corte, Bartolomeo; Nardi, Federico; Grisetti, Giorgio
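
    As one concrete instance of the heterogeneous measurements listed above, the sketch below evaluates a point-on-plane residual: the signed distance of a point, mapped through a robot pose, from a plane in Hessian normal form. The parameterizations are assumptions for illustration, not the back-end's actual error functions.

        # Minimal sketch (parameterizations are assumptions for illustration).
        import numpy as np

        def point_on_plane_error(T_world_robot, p_robot, plane_n, plane_d):
            """Zero when the point, mapped into the world frame, lies on the plane
            n . p + d = 0 (n a unit normal, d the offset)."""
            p_world = T_world_robot[:3, :3] @ p_robot + T_world_robot[:3, 3]
            return float(plane_n @ p_world + plane_d)    # scalar signed distance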

    Idiothetic Verticality Estimation through Head Stabilization Strategy The knowledge of the gravitational vertical is fundamental for the autonomous control of humanoids and other free-moving robotic systems such as rovers and drones. This article deals with the hypothesis that the so-called `head stabilization strategy' observed in humans and animals facilitates the estimation of the true vertical from inertial sensing only. This problem is difficult because inertial measurements respond to a combination of gravity and fictitious forces that are hard to disentangle. From simulations and experiments, we found that the angular stabilization of a platform bearing inertial sensors enables the application of the separation principle. This principle, which permits one to design estimators and controllers independently from each other, typically applies to linear systems, but rarely to nonlinear systems. We found empirically that, given inertial measurements, the angular regulation of a platform results in a system that is stable and robust and which provides true vertical estimates as a byproduct of the feedback. We conclude that angularly stabilized inertial measurement platforms could liberate robots from ground-based measurements for postural control, locomotion, and other functions, leading to a true idiothetic sensing modality, that is, not based on any external reference but the gravity field.
    [Full Paper]
    Farkhatdinov, Ildar; Michalska, Hannah; Berthoz, Alain; Hayward, Vincent