Constrained Motion Planning Networks X - Ahmed H. Qureshi, Jiangeng Dong, Asfiya Baig and Michael C. Yip - arXiv
Constrained Motion Planning Networks X

Ahmed H. Qureshi, Jiangeng Dong, Asfiya Baig and Michael C. Yip

arXiv:2010.08707v2 [cs.RO] 3 Jul 2021

A. H. Qureshi, J. Dong, A. Baig and M. C. Yip are affiliated with University of California San Diego, La Jolla, CA 92093 USA. {a1qureshi, jid103, abaig, yip}@ucsd.edu

Abstract: Constrained motion planning is a challenging field of research, aiming for computationally efficient methods that can find a collision-free path on the constraint manifolds between a given start and goal configuration. These planning problems come up surprisingly frequently, such as in robot manipulation for performing daily life assistive tasks. However, few solutions to constrained motion planning are available, and those that exist struggle with high computational time complexity in finding a path solution on the manifolds. To address this challenge, we present Constrained Motion Planning Networks X (CoMPNetX). It is a neural planning approach, comprising a conditional deep neural generator and discriminator with a neural gradients-based fast projection operator. We also introduce neural task and scene representations, conditioned on which CoMPNetX generates implicit manifold configurations to turbo-charge any underlying classical planner, such as Sampling-based Motion Planning methods, for quickly solving complex constrained planning tasks. We show that our method finds path solutions with high success rates and lower computation times than state-of-the-art traditional path-finding tools on various challenging scenarios.

Fig. 1: CoMPNetX generalized in sphere environment from (a) small cubical obstacles’ geometry to (b) multiple longitudinal obstacle strips and planned near-optimal paths between randomly selected start and goal pairs in sub-second computational times.

I. INTRODUCTION

Constrained Motion Planning (CMP) has a broad range of robotics applications for solving practical problems emerging in domains such as assistance at home, factory floors, disaster sites, and hospitals [1]. In our daily life, most of our activities involve a large number of CMP tasks. For example, at our home, we interact with various objects to perform usual household chores such as cleaning and cooking, including opening doors, carrying a tray or a glass filled with water, and lifting boxes. Likewise, skilled workers manipulate their tools to solve a wide variety of tasks such as assembly at factory floors and advanced-level surgery in the hospitals.

In all of the scenarios mentioned above, our cognitive process decomposes a given task (e.g., cleaning) into sub-tasks (e.g., moving objects to their designated places) and accomplishes them sequentially or concurrently by sending motor commands to the body for physical interaction with the environment under the task-specific constraints [2], [3]. In robotics, this phenomenon is known as Task and Motion Planning (TMP). A task planner decomposes a given task into a sequence of sub-tasks, and a motion planner achieves those sub-tasks by planning feasible robot motion sequences. This paper focuses on the latter part of TMP, i.e., task-constrained motion planning methods, and their integration with the existing state-of-the-art learning-based task programmers.

In the last decade, Sampling-based Motion Planning (SMP) methods have surfaced as prominent motion planning tools in robotics [4]. These algorithms randomly sample the robot joint-configurations to build a collision-free graph, which eventually connects the given start and goal configurations leading to a path solution [4]. However, in CMP, the constraint equations implicitly define a configuration space comprising zero-volume constraint manifolds embedded in a higher-dimensional ambient space of the robot’s joint variables [5]. Therefore, the probability of generating random robot configurations on those manifolds is not just low but zero, which makes the state-of-the-art gold standard SMP methods [6], [7], [8], [9], [10], [11], [12], [13] fail in such problems [14].

Recently, constraint-adherence methods that generate samples on the manifolds have been incorporated into existing SMP algorithms for CMP [14]. These methods include projection and continuation-based approaches. The former uses Jacobian-based gradient descent to project a given configuration to the manifold. The latter takes a known constraint-adhering configuration to compute a tangent space using which new samples are generated closer to the manifold for projection. These advanced planning methods solve a wide range of tasks, but they often exhibit high computational time complexity with high variance, making them frequently impractical for real-world manipulation problems.

A parallel development led to the cross-fertilization of SMP and machine-learning approaches, resulting in learning-based motion planners [15], [16], [17], [18], [19], [20], [21]. These methods learn from an oracle planner and are shown to be scalable and generalizable to new problems with significantly faster computational speed than classical methods. Some of these planners even provide worst-case theoretical guarantees. For instance, Motion Planning Networks (MPNet) [15], [16] generates collision-free paths through divide-and-conquer as it divides the problem into sub-problems and
either replans or outsources them, in worst-case, to a classical planner while still retaining its computational benefits.

In our recent work, we extended MPNet to solve CMP problems by proposing Constrained Motion Planning Networks (CoMPNet) [22]. CoMPNet is a deep neural network-based approach that takes the environment perception information, text-based task specification defining the constraints (e.g., open the door), and robot’s start and goal configurations as an input and outputs a feasible path on the constraint manifolds. CoMPNet connects any two given configurations using a projection-based constraint-adherence operator, and like MPNet, it also performs a divide-and-conquer through bidirectional expansion. However, it avoids replanning, which is a computationally expensive process in CMP, and instead builds an informed tree of possible paths.

This paper presents a unified framework called Constrained Motion Planning Networks X (CoMPNetX)1, which extends CoMPNet and generates informed implicit manifold configurations to speed up any SMP algorithm equipped with their constraint-adherence approach for solving CMP problems. CoMPNetX comprises the conditional neural generator, discriminator, a neural gradient-based projection operator, and sampling heuristics to propose samples for all kinds of SMP methods. Furthermore, compared to our previously proposed CoMPNet, this new approach, i.e., CoMPNetX, has the following novel features:

• CoMPNetX plans in implicit manifold configuration spaces, whereas CoMPNet only considers the robot configuration space. The implicit manifold configuration spaces are formed by the robot configuration and the constraint function. For instance, in the door opening task, the door, represented as a virtual-link manipulator using Task Space Regions (TSRs), and the robot arm form an implicit manifold planning space for CoMPNetX.
• CoMPNet only considers the projection operator for constraint adherence. In contrast, in this paper, we extend CoMPNet, naming it CoMPNetX, to operate with both projection- and continuation-based constraint adherence approaches for enhancing any SMP method, including batch and bidirectional techniques.
• In our previous work, the task sequences were defined by an expert as a text, e.g., open the cabinet and then move an object into the cabinet. CoMPNet sequentially takes the latent embeddings of those text-based task specifications to generate the motion sequences. However, text-based representations are agnostic of the given workspace and the overall planning objective. Therefore, this paper introduces a strategy to combine CoMPNetX with the deep neural network-based task planning approaches that relieve an expert from providing task sequences during execution and provide context-aware neural task representations for CMP.
• Unlike CoMPNet, the proposed approach also comprises a discriminator function that predicts the distances of generated configurations from the constraint manifold and provides gradients to project them to the manifold if needed.

1 The project videos and other supplementary material are available at https://sites.google.com/view/compnetx/home

In summary, CoMPNetX can generate robot configurations for a wide range of SMP algorithms while retaining their worst-case theoretical guarantees. Our generator and discriminator are conditioned on the neural task representation and the environment observation encoding. The conditional generator takes the desired start and goal configurations to output intermediate implicit manifold configurations, and the conditional discriminator predicts their geodesic distances from the underlying manifold. We use the discriminator’s predictions and their gradients as the operator to project the given configurations towards the constraint manifold if needed. CoMPNetX naturally forms a mutual symbiotic relationship with learning-based task programmers and exploits their inner states, representing tasks, to traverse multiple constrained manifolds for finding their path solutions. We show that these task representations from a learning-based task planner can lead to better performance in motion planning than human-defined text-based task representations (as in [22]). We test CoMPNetX with various SMP algorithms using both continuation and projection-based constraint-adherence methods on challenging problems and benchmark them against the state-of-the-art classical CMP algorithms. We also evaluate our models’ generalization capacity to new planning problems and environment structures, such as in the sphere environment, where the models are trained on settings with small obstacle blocks and generalize to the environment with multiple obstacle strips forming various narrow passages (Fig. 1).

The remainder of the paper is organized as follows. Section II presents preliminaries describing general notations and ideas in CMP, such as constraint functions and their constraint-adherence methods. Section III offers a detailed literature review on existing approaches in CMP. Section IV describes our procedure to obtain neural task representations, and Section V presents CoMPNetX with its batch and bidirectional sampling heuristics. Section VI gives implementation details followed by Section VII which is dedicated to experimental results of our comparison, ablation, and extended studies. Section VIII presents a brief discussion about our method inheriting an underlying SMP algorithm’s worst-case theoretical properties. Finally, Section IX concludes our work with pointers to our future directions, and an Appendix provides details on the model architectures, algorithmic implementations, and their related parameters.

II. PRELIMINARIES

In this section, we describe the problem of constrained motion planning with its basic terminologies. We also outline a brief overview of constraint-adherence operators employed by CMP methods for local planning under hard kinematic constraints.

A. Problem Definition

In the classical problem of motion planning, the robot system is defined by a configuration space (C-space) $\mathcal{Q} \in \mathbb{R}^n$ with $n \in \mathbb{N}$ dimensions. The axes of C-space correspond to
the system’s variables that govern their motion, such as robot joint-angles, and hence, the dimension $n$ is equivalent to the robot’s degrees of freedom (DOF). The robot’s surrounding environment is usually described as task-space $\mathcal{X} \in \mathbb{R}^m$ with $m \in \mathbb{N}$ dimensions, comprising obstacle $\mathcal{X}_{obs} \subset \mathcal{X}$ and obstacle-free $\mathcal{X}_{free} = \mathcal{X} \setminus \mathcal{X}_{obs}$ spaces. In the C-space terminology, the spaces $\mathcal{X}_{obs}$ and $\mathcal{X}_{free}$ are represented as $\mathcal{Q}_{obs}$ and $\mathcal{Q}_{free} = \mathcal{Q} \setminus \mathcal{Q}_{obs}$, respectively. In motion planning, a collision-checker InCollision(·) is assumed to be available that takes a robot configuration $q \in \mathcal{Q}$ and $\mathcal{X}_{obs}$, and outputs a boolean indicating if a given configuration lies in $\mathcal{Q}_{obs}$ or not.

We consider a setup where for a given current $x_t \in \mathcal{X}_{free}$ and target $x_T \in \mathcal{X}_{free}$ workspace observations, the high-level task planner, $\pi_H$, at time $t$, outputs an achievable sub-task representation $Z^c$ for the low-level agent $\pi_L$. For each subtask, $Z^c$, we also assume there exists a constraint function $F$. The agent, $\pi_L$, finds motion sequences in $\mathcal{Q}_{free}$ to achieve the given subtask, $Z^c$, under constraints $F$, leading to a next observation $x_{t+1}$. This paper considers deep neural networks-based state-of-the-art task planners as high-level agents, $\pi_H$, and proposes a novel low-level agent, $\pi_L$, i.e., CoMPNetX, that leverages $\{Z^c, F\}$ for motion planning under task-specific constraints.

A fundamental unconstrained motion planning problem for a given start configuration $q_{init} \in \mathcal{Q}_{free}$, a goal region $\mathcal{Q}_{goal} \subset \mathcal{Q}_{free}$, environment obstacles $\mathcal{X}_{obs}$, and a collision-checker, is defined as:

Problem 1 (Unconstrained Motion Planning) Given a planning problem $\{q_{init}, \mathcal{Q}_{goal}, \mathcal{X}_{obs}\}$, and a collision-checker, find a collision-free path solution $\sigma : [0, 1]$, if one exists, such that $\sigma_0 = q_{init}$, $\sigma_1 \in \mathcal{Q}_{goal}$, and $\sigma[0, 1] \mapsto \mathcal{Q}_{free}$.

In constrained motion planning, a planner also has to satisfy a set of hard constraints defined by a function $F(q) : \mathcal{Q} \mapsto \mathbb{R}^k$, such that $F(q) = 0$. Here $k \in \mathbb{N}$ denotes the number of constraints imposed on the robot motion, which induces an $(n - k)$-dimensional space embedded in the robot’s unconstrained ambient C-space, comprising one or more manifolds $\mathcal{M}$, i.e.,

$$\mathcal{M} = \{q \in \mathcal{Q} \mid F(q) = 0\}$$

In practice, a configuration $q$ is assumed to be on the manifold if $\|F(q)\|_2 < \varepsilon$, where $\varepsilon > 0$ is a tolerance threshold. Furthermore, the obstacle and obstacle-free spaces on the manifolds are denoted as $\mathcal{M}_{free} = \mathcal{M} \cap \mathcal{Q}_{free}$ and $\mathcal{M}_{obs} = \mathcal{M} \setminus \mathcal{M}_{free}$, respectively. A CMP problem for a given start configuration $q_{init}$, goal region $\mathcal{Q}_{goal} \subset \mathcal{Q}_{free}$, environment obstacles $\mathcal{X}_{obs}$, function $F$, and a collision-checker, is defined as:

Problem 2 (Constrained Motion Planning) Given a planning problem $\{q_{init}, \mathcal{Q}_{goal}, \mathcal{X}_{obs}, F\}$, and a collision-checker, find a collision-free path solution $\sigma : [0, 1]$, if one exists, such that $\sigma_0 = q_{init}$, $\sigma_1 \in \mathcal{Q}_{goal}$, and $\sigma[0, 1] \mapsto \mathcal{M}_{free}$.

In our work, we show that CoMPNetX solves both unconstrained (Problem 1) and constrained (Problem 2) planning problems. Furthermore, for the latter problem, we only consider kinematic constraints, i.e., the function $F$ solely depends on robot configuration $q \in \mathcal{Q}$, not on other robot properties such as dynamics representing velocity or acceleration. Moreover, we define $F(q)$ as distance to the constraint manifold with domain $s$, i.e.,

$$F(q) = \text{Distance to the constraint manifold}$$

For instance, if the constraint is on the robot’s end-effector to maintain a particular position, then $F(q)$ can be defined as the distance of the robot’s end-effector to that specific position with domain $s \in [0, 1]$, spanning an entire or a fraction of a motion trajectory. Likewise, when the robot is moving, balancing constraints are usually imposed on the whole robot motion trajectory with $s = [0, 1]$.

In the remainder of this section, we describe the two main types of classical constraint-adherence operators that ensure a given configuration or a motion between two configurations lies on the constraint manifold defined by $F$.

B. Projection-based Constraint-Adherence Operator

The projection operator (Proj) maps a given configuration $q \in \mathbb{R}^n$ to the manifold $\mathcal{M}$. It can be formulated as a constrained optimization problem [23]

$$\min_{q'} \frac{1}{2}\|q - q'\|^2 \quad \text{subject to} \quad F(q') = 0,$$

with its dual as:

$$L(q', \lambda) = \frac{1}{2}\|q - q'\|^2 - \lambda F(q'),$$

where $\lambda$ corresponds to Lagrange multipliers. The above system is solved using gradient descent as summarized in Algorithm 1, where $J^+(q)$ is the pseudoinverse of the Jacobian at configuration $q \in \mathcal{Q}$.

Algorithm 1: Projection Operator: Proj($q$)
  for $i \leftarrow 0$ to $N$ do
    $\Delta x \leftarrow F(q)$
    if $\|\Delta x\|_2 < \varepsilon$ then
      return $q$
    else
      $q \leftarrow q - J(q)^+ \Delta x$

Algorithm 2 outlines the local planning procedure using a projection operator [23], [24]. This procedure outputs all the intermediate configurations on the manifold in the given conditions and loop limit $N$, when traversing from a given start configuration ($q_s$) towards the end configuration ($q_e$) in small incremental steps $\gamma \in \mathbb{R}$. The projection-based steering stops if any of the following happens: (i) The loop limit is reached. (ii) The resulting configuration $q_{i+1}$ is in a collision. (iii) The stepping distance is diverging rather than converging to prevent overshooting the target configuration, i.e., either $d_2 > d_1$ or $d > \lambda_1 \gamma$. (iv) The
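The projection operator Proj of Algorithm 1 can be illustrated with a small numerical sketch. The example below is only an illustration under stated assumptions, not the authors' implementation: it uses a toy constraint, the unit sphere in three dimensions with F(q) = ||q|| - 1, and the helper names `constraint`, `jacobian`, and `project` as well as the tolerance and iteration limit are hypothetical.

```python
import numpy as np

def constraint(q):
    """Toy constraint F(q): distance of q from the unit sphere (k = 1)."""
    return np.array([np.linalg.norm(q) - 1.0])

def jacobian(q):
    """Jacobian of the toy constraint with respect to q, shape (1, n)."""
    return (q / np.linalg.norm(q)).reshape(1, -1)

def project(q, eps=1e-8, max_iters=50):
    """Iterate q <- q - J(q)^+ F(q) until ||F(q)||_2 < eps (cf. Algorithm 1)."""
    q = np.asarray(q, dtype=float).copy()
    for _ in range(max_iters):
        dx = constraint(q)
        if np.linalg.norm(dx) < eps:
            return q
        # Pseudoinverse step pulls q back towards the constraint manifold.
        q = q - np.linalg.pinv(jacobian(q)) @ dx
    return None  # tolerance not met within the iteration limit

q_proj = project([2.0, 1.0, -0.5])
```

In this sketch `project` returns `None` when the tolerance is not met within the iteration limit, which plays the role of the loop limit N in Algorithm 1.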
progress in manifold space $D$ becomes greater than a scalar $\lambda_2$ times the progress in the ambient space $d_w = \|q_e - q_s\|$.

Algorithm 2: Projection Integrator ($q_s$, $q_e$)
  $i \leftarrow 0$; $D \leftarrow 0$
  $d_w \leftarrow \|q_e - q_s\|$; $q_i \leftarrow q_s$
  while $i < N$ do
    $q_{i+1} \leftarrow \mathrm{Proj}(q_i + \gamma(q_e - q_i))$
    $d \leftarrow \|q_{i+1} - q_i\|_2$
    $D \leftarrow D + d$
    $d_1 \leftarrow \|q_i - q_e\|_2$; $d_2 \leftarrow \|q_{i+1} - q_e\|_2$
    if InCollision($q_{i+1}$) or $d_2 > d_1$ or $d > \lambda_1\gamma$ or $D > \lambda_2 d_w$ then
      break
    $i \leftarrow i + 1$
  return $\{q_j\}_{j=0}^{i}$

Algorithm 3: Atlas Integrator ($q_s$, $q_e$, $\mathcal{A}_M$)
  $i \leftarrow 0$; $D \leftarrow 0$
  $d_w \leftarrow \|q_e - q_s\|$
  $q_i \leftarrow q_s$
  $C_i \leftarrow$ GetChart($q_i$, $\mathcal{A}_M$)
  $u_i \leftarrow \psi_i^{-1}(q_i)$
  $u_e \leftarrow \psi_i^{-1}(q_e)$
  while $\|u_i - u_e\|_2 > \gamma$ do
    $u_{i+1} \leftarrow u_i + \gamma(u_e - u_i)/\|u_e - u_i\|_2$
    $q_{i+1} \leftarrow \psi_i(u_{i+1})$
    $d \leftarrow \|q_{i+1} - q_i\|_2$
    $D \leftarrow D + d$
    $d_1 \leftarrow \|q_i - q_e\|_2$; $d_2 \leftarrow \|q_{i+1} - q_e\|_2$
    if InCollision($q_{i+1}$) or $d_2 > d_1$ or $d > \lambda_1\gamma$ or $d < \epsilon$ or $D > \lambda_2 d_w$ or $i > N$ then
      break
    $i \leftarrow i + 1$
    if not RegionValidity($u_i$, $q_i$) or $u_i \notin P_{i-1}$ then
      $C_i \leftarrow$ GetChart($q_i$, $\mathcal{A}_M$)
      $u_i \leftarrow \psi_i^{-1}(q_i)$
      $u_e \leftarrow \psi_i^{-1}(q_e)$
  return $\{q_j\}_{j=0}^{i}$

C. Continuation-based Constraint-Adherence Operator

The continuation-based approaches [23], [25], [26] represent the manifold through a set of local parameterizations, known as charts $C$, forming an atlas $\mathcal{A}$.

Fig. 2: (a) A chart $C_i$ operators comprising exponential $\psi_i$ and logarithmic $\psi_i^{-1}$ functions for mapping between the tangent space at $q_i$ and the manifold. (b) The parameters defining the chart validity region.

A chart $C_i = (q_i, \Phi_i(q_i))$, with an index $i \in \mathbb{N}$, locally parameterizes a manifold through a tangent space and its orthonormal basis $\Phi_i$ at a known constraint-adhering configuration $q_i \in \mathcal{M}$. The orthonormal basis $\Phi_i \in \mathbb{R}^{(n-k) \times n}$ is used to define an exponential map $\psi_i : \mathbb{R}^k \mapsto \mathbb{R}^n$ and its inverse, i.e., a logarithmic map $\psi_i^{-1} : \mathbb{R}^n \mapsto \mathbb{R}^k$, between the parameter $u_j^i$ on the tangent space and the manifold around configuration $q_i$ (Fig. 2 (a)). The basis $\Phi_i \in \mathbb{R}^{n \times k}$ is computed by solving the following system of equations:

$$\begin{bmatrix} J(q_i) \\ \Phi_i^\top \end{bmatrix} \Phi_i = \begin{bmatrix} 0 \\ I \end{bmatrix}, \quad (1)$$

where $J(q_i) \in \mathbb{R}^{k \times n}$ is the Jacobian of $F$ at the configuration $q_i$, $0 \in \mathbb{R}^{k \times k}$, and $I \in \mathbb{R}^{k \times k}$ is the identity matrix.

The exponential mapping $\psi_i$ is a two-step process. The first step determines a configuration $q_j^i$ in the ambient space using the mapping $\phi_i$, i.e.,

$$q_j^i = \phi_i(u_j^i) = q_i + \Phi_i u_j^i \quad (2)$$

The second step takes $q_j^i$ and orthogonally projects it to the manifold resulting in $q_j$, by solving the following system:

$$\begin{aligned} F(q_j) &= 0 \\ \Phi_i^\top (q_j - q_j^i) &= 0 \end{aligned} \quad (3)$$

The above equations are usually solved iteratively by a Newton method until the error $\|q_j - q_j^i\|_2 < \epsilon$ is tolerable or the maximum iteration limit is reached.

The inverse logarithmic mapping $\psi_i^{-1}$ from the manifold to the tangent space is straightforward to compute, i.e.,

$$u_j^i = \psi_i^{-1}(q_j) = \Phi_i^\top (q_j - q_i) \quad (4)$$

Note that each chart $C_i$ has a validity region $V_i$ in which it properly parameterizes the manifold, and exceeding that region could lead to divergence when orthogonally projecting configurations to the manifold during the exponential mapping process. This validity region is governed by the following conditions:

$$\|q_j^i - q_j\| \le \varepsilon; \quad \frac{\|u_j^i\|_2}{\|q_i - q_j\|} < \cos(\alpha); \quad \|u_j^i\| \le \rho \quad (5)$$

where $\varepsilon$ and $\alpha$ indicate the maximum allowable distance and curvature, respectively, between the chart $C_i$ and the underlying manifold $\mathcal{M}$, and $\rho$ defines the radius of the sphere around $q_i$ (Fig. 2 (b)). Furthermore, the validity region $V_i$ can have a complex shape and is usually approximated by a convex polytope $P_i \subset V_i$, represented as a set of linear inequalities defined in a tangent space of chart $C_i$.

To realize the local planning using the continuation operator, there exist two types of methods, namely the atlas integrator (Algorithm 3) and the tangent bundle integrator (Algorithm 4). The latter, in contrast to the former, is less strict about the intermediate configurations being on the manifold and
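The projection integrator of Algorithm 2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the manifold is a toy unit sphere, `project` is a closed-form stand-in for Proj, `in_collision` is a hypothetical collision checker that reports no obstacles, and the values chosen for the step size, the two scalar thresholds, and the loop limit are arbitrary.

```python
import numpy as np

def project(q):
    """Stand-in for Proj: closed-form projection onto the unit sphere."""
    return q / np.linalg.norm(q)

def in_collision(q):
    """Hypothetical collision checker; this toy workspace has no obstacles."""
    return False

def projection_integrator(q_s, q_e, gamma=0.05, lam1=2.0, lam2=2.0, n_max=500):
    """Step from q_s towards q_e, projecting every step onto the manifold."""
    path = [q_s]
    travelled = 0.0                    # D: accumulated progress on the manifold
    d_w = np.linalg.norm(q_e - q_s)    # progress measured in the ambient space
    q_i = q_s
    for _ in range(n_max):             # loop limit
        q_next = project(q_i + gamma * (q_e - q_i))
        d = np.linalg.norm(q_next - q_i)
        travelled += d
        d1 = np.linalg.norm(q_i - q_e)
        d2 = np.linalg.norm(q_next - q_e)
        # Stop on collision, on a diverging or oversized step, or when the
        # manifold progress exceeds lam2 times the ambient-space progress.
        if in_collision(q_next) or d2 > d1 or d > lam1 * gamma or travelled > lam2 * d_w:
            break
        path.append(q_next)
        q_i = q_next
    return path

q_start = np.array([1.0, 0.0, 0.0])
q_end = np.array([0.0, 1.0, 0.0])
path = projection_integrator(q_start, q_end)
```

As in Algorithm 2, a configuration that triggers a stopping condition is not added to the returned path, so every returned configuration satisfies the constraint.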
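Equations (1) to (4) can be made concrete with a short numerical sketch. The code below is an illustration under stated assumptions rather than the authors' implementation: the constraint is a toy unit sphere in three dimensions, the tangent basis is taken from the null space of the Jacobian via an SVD (one way to satisfy Eq. (1)), the Newton loop stops on the residual of system (3), and the function names `tangent_basis`, `exp_map`, and `log_map` are hypothetical.

```python
import numpy as np

def F(q):
    """Toy constraint: F(q) = q.q - 1, zero on the unit sphere."""
    return np.array([q @ q - 1.0])

def J(q):
    """Jacobian of the toy constraint, shape (1, n)."""
    return 2.0 * q.reshape(1, -1)

def tangent_basis(q):
    """Orthonormal basis of the null space of J(q), stored as columns."""
    jac = J(q)
    _, _, vt = np.linalg.svd(jac)
    return vt[jac.shape[0]:].T  # assumes J(q) has full row rank

def exp_map(q_i, Phi, u, tol=1e-10, max_iters=50):
    """Tangent-space parameter u -> manifold configuration, Eqs. (2) and (3)."""
    q_amb = q_i + Phi @ u                      # Eq. (2): point in ambient space
    q = q_amb.copy()
    for _ in range(max_iters):                 # Newton iterations on Eq. (3)
        residual = np.concatenate([F(q), Phi.T @ (q - q_amb)])
        if np.linalg.norm(residual) < tol:
            return q
        q = q - np.linalg.solve(np.vstack([J(q), Phi.T]), residual)
    return None

def log_map(q_i, Phi, q_j):
    """Manifold configuration -> tangent-space parameter, Eq. (4)."""
    return Phi.T @ (q_j - q_i)

q_i = np.array([0.0, 0.0, 1.0])
Phi = tangent_basis(q_i)
u = np.array([0.3, -0.2])
q_j = exp_map(q_i, Phi, u)
u_back = log_map(q_i, Phi, q_j)
```

Applying `log_map` to the output of `exp_map` recovers the original parameter `u` here, which is the round trip that the chart relies on inside its validity region.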
performs projections only when needed and does not separate the tangent spaces into half-spaces to prevent overlaps. In our implementations, these integrators assume both start ($q_s$) and end ($q_e$) configurations to be on the manifold. The procedure RegionValidity in the atlas integrator returns False if any of the above-mentioned region validity conditions are violated.

Algorithm 4: Tangent Bundle Integrator ($q_s$, $q_e$, $\mathcal{A}_M$)
  $i \leftarrow 0$; $D \leftarrow 0$
  $d_w \leftarrow \|q_e - q_s\|$
  $q_i \leftarrow q_s$
  $C_i \leftarrow$ GetChart($q_i$, $\mathcal{A}_M$)
  $u_i \leftarrow \psi_i^{-1}(q_i)$
  $u_e \leftarrow \psi_i^{-1}(q_e)$
  while $\|u_i - u_e\|_2 > \gamma$ do
    $u_{i+1} \leftarrow u_i + \gamma(u_e - u_i)/\|u_e - u_i\|_2$
    $q_{i+1} \leftarrow \phi_i(u_{i+1})$
    $d \leftarrow \|q_{i+1} - q_i\|_2$
    $D \leftarrow D + d$
    $d_1 \leftarrow \|q_i - q_e\|_2$; $d_2 \leftarrow \|q_{i+1} - q_e\|_2$
    if InCollision($q_{i+1}$) or $d_2 > d_1$ or $d > \lambda_1\gamma$ or $d < \epsilon$ or $D > \lambda_2 d_w$ or $i > N$ then
      break
    $i \leftarrow i + 1$
    if $\|\phi_{i-1}(u_i) - q_i\|_2 > \varepsilon$ or $u_i \notin P_{i-1}$ then
      $q_i \leftarrow \psi_{i-1}(u_i)$
      $C_i \leftarrow$ GetChart($q_i$, $\mathcal{A}_M$)
      $u_i \leftarrow \psi_i^{-1}(q_i)$
      $u_e \leftarrow \psi_i^{-1}(q_e)$
  return $\{q_j\}_{j=0}^{i}$

III. RELATED WORK

In this section, we present the existing methods that address the problem of CMP, ranging from relaxation-based methods for trajectory optimization and control to strict approaches such as projection and continuation for sampling-based planning algorithms.

The relaxation-based methods represent the hard-constraints as soft-constraints by incorporating them as a penalty into the cost function. The cost function is optimized to get the desired robot behavior. For instance, the IK-based reactive control method [27], [28] used at the DARPA Robotics Challenge operates in the workspace and finds constrained robot motion through convex optimization of the given cost function. However, these approaches often provide incomplete solutions as they are susceptible to local minima. The trajectory optimization methods [29], [30] also optimize the given cost function over the entire trajectory to find a feasible motion plan. However, due to the relaxation, they weakly satisfy the given constraints and are typically only effective on short-horizon problems. Recently, Bonalli et al. [31] proposed a trajectory optimization method for implicitly-defined constraint manifolds, but their approach is yet to be explored and analyzed in practical CMP robotics problems.

To satisfy hard-constraints without relaxation on the robot motion, the SMP algorithms [4], such as multi-query Probabilistic Road Maps (PRMs) [32], and single-query Rapidly-exploring Random Trees (RRTs) [33] with its bidirectional variant [6], have been augmented with constraint-adherence methods, such as projection and continuation, to solve a wide range of CMP problems.

The projection-based method was first utilized with a variant of PRMs for parallel manipulators under specialized loop-closure constraints [34]. The parallel manipulators were treated as active/passive links and were composed into a constraint-adhering configuration using projection. Yakey et al. [35] introduced the Randomized Gradient Descent (RGD) method for closed-chain kinematics constraints that generates C-space samples and projects them to the constraint manifold. However, their approach required significant parameter tuning and was later extended to a generalized framework using RRTs and a Jacobian pseudo-inverse based projection method [36]. In a similar vein, Berenson et al. [24] proposed the Constrained Bidirectional RRT (CBiRRT) with an intuitive constraint representation approach called Task Space Regions (TSRs). TSRs represent general end-effector pose constraints and allow a quick computation of geodesic distances from the constraint manifolds. Another class of sampling-based methods that use projection operators and plan in the task-space includes [37], [38], [39]. These methods find a task-space motion plan and find their corresponding configurations in the C-space, which limits their exploration and thus does not yield completeness guarantees.

The continuation-based methods compute tangent-spaces at a known constraint-adhering configuration to generate new nearby samples for quick projections to the constraint manifold. Yakey et al. [35] used continuation to generate new configuration samples within tangent space, which were projected to the manifold using RGD for closed-chain kinematic constraints. The continuation methods have also been used for general end-effector constraints [40], [41]. Inspired by the definition of differentiable manifolds [42], recent approaches do not discard tangent spaces. Instead, they compose them using data-structures into an atlas for a piece-wise linear approximation of the constraint manifold [43]. These methods include Atlas-RRT [25] and TangentBundle(TB)-RRT [26] with an underlying single-query bidirectional RRTs algorithm [6]. Atlas-RRT ensures all samples to be on the manifold and separates tangent spaces into tangent polytopes using half-spaces for uniform coverage. In contrast, TB-RRT lazily projects the configurations for constraint-adherence, i.e., only when switching the tangent spaces, and has overlapping tangent spaces, which sometimes lead to invalid states. There also exist variants of Atlas-RRT that allow asymptotic optimality [44], [5] and kinodynamic planning [45] under constraints.

Recently, Kingston et al. [23] introduced Implicit MAnifold Configuration Spaces (IMACS) to decouple the choice of constraint-adherence methods from the underlying selection of SMP planners. IMACS highlights that any SMP method equipped with the following two components can solve CMP problems. First, a uniform sampling technique to generate samples on the manifold. Second, a constraint integrator function to connect two configurations on the manifold. IMACS incorporates the constraint function into C-Space, presenting an implicit manifold space to an underlying SMP method. These SMP methods, augmented with a constrained integrator, are shown to solve various CMP problems. Despite these advancements, existing SMP methods are computationally inefficient and take up to several minutes for solving practical problems not just in CMP but also in unconstrained planning problems.

In this paper, we propose CoMPNetX that extends IMACS and our previously proposed CoMPNet [22] and also introduces neural-gradient-based projections to generate informed implicit manifold configurations for underlying SMP methods equipped with any constrained integrator. Our approach can also be interpreted as Neural Informed Implicit MAnifold Configuration Spaces (NIIMACS), which replaces the abstraction layer of IMACS with neural-learned sampling distributions to prioritize sampling in the subsets of a constraint manifold that potentially contain a path solution for a given problem.

Fig. 3: Given a high-level program (e.g., arrange_table), the environment current $x_t$, and target $x_T$ observations, we obtain the Neural Task Representations for CoMPNetX by exploiting a learning-based task programmer’s internal state $Z^d$ and program arguments $a$.

IV. NEURAL TASK REPRESENTATIONS

This section describes the process to obtain the neural task representations, utilized by CoMPNetX to define task-specific constraints in a scalable and generalizable way. These representations come from the internal state of a learning-based task planner. Although various learning-based task planners can be utilized for acquiring these representations, we adapt a variant [46] of the Neural Task Programming (NTP) [47] in our framework.

This variant, which we name NTP2, extends original NTP by relieving the need for task demonstration at the test time. NTP2 uses the goal $x_T$ and current $x_t$ observations of the environment to decompose a given high-level task into a feasible sequence of intermediate sub-tasks. We use NTP2 to obtain the neural task representations and the sub-task sequences for CoMPNetX. It comprises the following modules:

Program Planner: It is a deep neural network-based iterative program predictor that takes a high-level symbolic task $p_t$, the environment’s current $x_t$ and goal $x_T$ observations as an input and outputs a next sub-program $p_{t+1}$ and the end-of-program probability $r$, indicating the accomplishment of a given task.

API Decoder: A program is defined as an API program if it requires arguments for the execution. Given an API program $p$ predicted by the program planner, the neural network-based API Decoder predicts its required arguments $a$. The inputs to the API decoder are the current $x_t$ and goal $x_T$ observations, the API program $p$, and a fixed-size graph encoding representing the program hierarchy.

The overall flow of the algorithm is shown in Fig. 3. The current and goal observations are encoded into latent embeddings using their encoders. The program planner, conditioned on observation encodings, iteratively decomposes the given program (e.g., arrange_table) into subprograms by generating a probability distribution over a set of predefined program instances (e.g., pick and place). The program with maximum probability is selected, which becomes the input to the program planner in the next iteration. This process is repeated until an API-program is selected. For instance, the given program, arrange_table, can lead to the selection of a pick_place program which subsequently results in the selection of either pick or place programs. The pick and place are defined as API programs requiring arguments from the API decoder. This API decoder, conditioned on observation encodings and graph embeddings, predicts the API program’s arguments indicating the object that needs to be grasped (pick) and moved (place). The graph embeddings are given by the graph encoder that takes a list of non-API programs (Fig. 3) and encodes them into a fixed-size latent representation. In our implementation of NTP2, the current observation contains the current poses of the given objects in the environment and the robot end-effector pose. The goal observation includes the final poses of all objects at the end of the task. Furthermore, the program planner and the API decoder were trained using the cross-entropy loss for the given expert demonstration. For more details on the implementations, refer to [46], and Appendix A of this paper.

To generate a neural task representation for CoMPNetX, we take the latent inner embedding $Z^d$ of the API Decoder and the corresponding arguments $a$ (Fig. 3). The internal state $Z^d$ comprises current and goal encodings, graph embedding representing the given task hierarchy, and an API program embedding. Note that the latent state $Z^d$ and arguments $a$ contain sufficient information, i.e., a given high-level task, its sub-task hierarchy, and workspace representation, for CoMPNetX to effectively plan the feasible robot motion path respecting the task constraints at any instant. This is in contrast to the original CoMPNet framework [22] that relied on hand-engineered task plans, and sub-tasks were represented as text-descriptions, making them oblivious of given high-level tasks, their hierarchical structure, and overall workspace setup.

V. CONSTRAINED MOTION PLANNING NETWORKS

This section formally presents CoMPNetX (Fig. 4), comprising a conditional generator, discriminator, neural projection operator, and neural samplers. The neural generator and discriminator are conditioned on the task and scene observation
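The iterative decomposition performed by the program planner can be sketched as a simple loop. The snippet is purely illustrative: the lookup table `TOY_PLANNER` with made-up probabilities is a hypothetical stand-in for the neural program planner, and it ignores the observation encodings and the end-of-program probability. Only the program names arrange_table, pick_place, pick, and place come from the text and Fig. 3.

```python
API_PROGRAMS = {"pick", "place"}  # programs that require arguments

# Hypothetical stand-in for the neural program planner: for each non-API
# program, a made-up probability distribution over candidate sub-programs.
TOY_PLANNER = {
    "arrange_table": {"pick_place": 0.90, "pick": 0.06, "place": 0.04},
    "pick_place": {"pick": 0.70, "place": 0.30},
}

def decompose(program, max_iters=10):
    """Select the most probable sub-program until an API program is reached."""
    non_api_programs = []  # hierarchy that a graph encoder would consume
    for _ in range(max_iters):
        if program in API_PROGRAMS:
            return program, non_api_programs
        non_api_programs.append(program)
        distribution = TOY_PLANNER[program]
        program = max(distribution, key=distribution.get)
    raise RuntimeError("no API program selected within the iteration limit")

api_program, hierarchy = decompose("arrange_table")
```

The list of non-API programs collected along the way corresponds to the list that the graph encoder turns into a fixed-size embedding. In the actual system the distribution would come from a trained network conditioned on the observations rather than from a fixed table.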
encodings to generalize across different environments and planning problems. Our method with a constrained integrator and an underlying SMP algorithm generates feasible motion plans on the constraint manifolds for the given CMP problems.

Fig. 4: CoMPNetX execution traces for the constrained door opening subtask. Our method comprises a conditional neural generator and discriminator that, in conjunction with a planning algorithm, finds a feasible path solution between start q_init (purple) and goal q_goal (green) configurations.

Fig. 5: K-Batch CoMPNetX: The process shows CoMPNetX exploiting neural networks parallelization to generate K = 2 informed manifold configurations from randomly selected nodes in the tree towards the goal configuration(s) for an underlying SMP method equipped with a constrained-integrator.

Fig. 6: Bidirectional CoMPNetX: (a)-(c) show the CoMPNetX bidirectional sample generation, soliciting neural informed-trees from start and goal to quickly march towards each other within a Bidirectional-SMP method. Panels: (a) Informed Sample Generation, (b) T_a and T_b extension, (c) Swapping Roles.

A. Task Encoder

The task-encoder processes the neural task representations given as Z_s = [Z_d, a]. As mentioned earlier, the Z_d is a fixed-sized vector comprising the workspace current and goal observation encodings, the API program embeddings, and the graph encoding (representing the program hierarchy). Our task encoder takes Z_s, comprising Z_d and a, as an input and composes them into a fixed-size latent embedding Z_c ∈ R^{d_1} of size d_1 using a neural network.

B. Scene Encoder

The scene encoder takes the raw environment perception as a 3D depth point-cloud processed into voxels and transforms them to an embedding Z_o ∈ R^{d_2} of dimension d_2. The 3D voxel grids of dimensions L × W × H × C are converted into 2D voxel patches as L × W × (HC), where L, W, H, and C correspond to length, width, height, and the number of
channels, respectively. The voxel patches are encoded into Z_o using a 2D convolutional neural network (CNN). We process 3D voxels into 2D voxel patches as 3D maps require 3D-CNNs, which are known to be computationally intensive and their representations often contain empty volumes [48]. The scene embedding is passed as a fixed-size feature vector describing the environmental obstacles to a subsequent generator and discriminator. Although neural task representations Z_c contain poses of manipulatable objects in their embeddings, scene observation Z_o also includes information about static non-movable objects acting as obstacles in the environment.

C. Conditional Neural Generator

CoMPNetX's generator G_φ, with parameters φ, is a stochastic neural model that outputs a variety of implicit manifold configurations leading to a constrained path solution (Fig. 4). Because the generator is trained on both unconstrained and constrained path demonstration data, the output distribution of the neural model tends to fall on or near the constraint manifolds when conditioned on task-specific constraints. Our generator derives its stochastic behavior from using Dropout [49] during inference, which instantly slices G_φ in a probabilistic manner, inculcating variations in the generated samples. Although other techniques such as input Gaussian noise can be used to foster stochasticity, they require a reparametrization trick and are often hard to train end-to-end [50]. In contrast, Dropout helps capture stochastic behavior from demonstration data, which we observed to be consistently better than hand-crafted input noise distributions in our planning problems.

The generator's input is the task-observation encodings (Z_c and Z_o) that encode the given neural task representations and scene observation, respectively, and the current q_curr and target q_targ manifold configurations. The output is the next configuration q̂_next on/near the constraint manifold that will take the system closer to the given target, i.e.,

\hat{q}_{next} \leftarrow G_\phi(Z_c, Z_o, q_{curr}, q_{targ}) \quad (6)

CoMPNetX uses the discriminator predictions and their gradients as the operator, named NProj, to project the given configurations to the constraint manifold if their predicted distances are greater than a threshold ν, thus discriminating samples based on their distances from the manifold and fixing them accordingly as,

q \leftarrow \hat{q} - \gamma \nabla_{\hat{q}} D_\theta(\hat{q}, Z_c, Z_o), \quad (9)

where γ ∈ R^+ is a hyperparameter denoting a step size.

To train the discriminator network D_θ, we minimize the mean-square loss between its predictions and the true labels. The true labels are the geodesic distances of demonstration trajectories from the constraint manifolds. Furthermore, we introduce a trick to create negative training samples with relatively larger distances from the manifold. The negative training samples comprise the robot configuration from the unconstrained tasks (e.g., reach a given object) and the virtual-link configuration from positive training samples, and their corresponding distances are computed by querying F.

Algorithm 5: CoMPNetX(Z_s, v, q_curr, q_targ)
    Z_c ← GetTaskEncoding(Z_s)
    Z_o ← GetObsEncoding(v)
    q̂_next ← G_φ(Z_c, Z_o, q_curr, q_targ)
    d_M ← D_θ(q̂_next, Z_c, Z_o)
    if d_M > ν then
        q̂_next ← q̂_next − γ ∇_{q̂_next} D_θ(q̂_next, Z_c, Z_o)
    return q̂_next

E. Neural Samplers

Once trained, CoMPNetX can be used in a number of ways to generate informed neural samples for the underlying SMP algorithms equipped with a constrained adherence method. Fig. 4 and Algorithm 5 present an overall flow of information between different neural modules of CoMPNetX.
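The projection step of Algorithm 5 can be illustrated with a minimal Python sketch. This is not the authors' implementation: the learned discriminator D_θ is replaced here by an analytic stand-in (the distance of a point from the unit sphere, borrowed from the sphere constraint F(q) = ||q|| − 1), the task and scene encodings are omitted, the gradient is taken by finite differences rather than by backpropagation, and the function names and the values of ν and γ are hypothetical choices made only for illustration.

```python
import math


def sphere_distance(q):
    """Stand-in for the learned discriminator: distance of q from the unit sphere."""
    return abs(math.sqrt(sum(x * x for x in q)) - 1.0)


def numerical_gradient(f, q, eps=1e-6):
    """Central finite-difference gradient of f at q (a neural model would use autodiff)."""
    grad = []
    for i in range(len(q)):
        q_plus = list(q)
        q_plus[i] += eps
        q_minus = list(q)
        q_minus[i] -= eps
        grad.append((f(q_plus) - f(q_minus)) / (2.0 * eps))
    return grad


def nproj(q_hat, distance_fn, nu=1e-3, gamma=0.2):
    """One projection step: move q_hat against the distance gradient if it is farther than nu."""
    d_m = distance_fn(q_hat)
    if d_m > nu:
        grad = numerical_gradient(distance_fn, q_hat)
        q_hat = [x - gamma * g for x, g in zip(q_hat, grad)]
    return q_hat


# Hypothetical generator output lying off the manifold (illustrative value).
q_hat = [1.4, 0.0, 0.0]
q_proj = nproj(q_hat, sphere_distance)
print(sphere_distance(q_hat), sphere_distance(q_proj))
```

In this toy case the gradient has unit norm, so a single step reduces the distance by about γ; a configuration that is already within ν of the manifold is returned unchanged.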
Given the demonstration trajectories σ* = {q*_0, ..., q*_T} from an oracle planner, we train the generator together with the task and observation encoders in an end-to-end manner using the mean-square loss function, i.e.,

\frac{1}{N_B} \sum_{i=0}^{N} \sum_{j=0}^{T_i - 1} \| \hat{q}_{i,j+1} - q^{*}_{i,j+1} \|^2, \quad (7)

where i and j iterate over the number of given paths and the number of nodes in each path, respectively, and N_B is the averaging term.

D. Conditional Neural Discriminator

CoMPNetX's discriminator D_θ, with parameters θ, is a deterministic neural model that predicts the distance d_M ∈ R of a given configuration q̂ from an implicit constraint manifold M conditioned on the task Z_c and observation Z_o encodings, i.e.,

d_M \leftarrow D_\theta(\hat{q}, Z_c, Z_o) \quad (8)

For a given current q_curr and target q_targ configuration(s), CoMPNetX, conditioned on encodings Z_c and Z_o, generates the next configuration(s) q̂_next and projects them towards the constraint manifold using neural gradients if needed. Thanks to CoMPNetX's informed but stochastic sampling and the built-in parallelization capacity of neural networks, our method can be adapted to most underlying SMP methods. For case studies, we present two sampling strategies named K-Batch CoMPNetX and Bidirectional CoMPNetX, which together cover a wide range of SMP methods.

K-Batch CoMPNetX: Our approach exploits the neural networks' innate parallelization capacity to generate a batch of samples with size K ∈ N_{≥1} using CoMPNetX for the underlying unidirectional (K = 1) and batch (K > 1) SMP methods. In this setup, the input to CoMPNetX is in the form of batches of size K. The K target configurations q_targ are a set of samples from goal region G_goal. The voxel map v and neural task representation Z_s are simply replicated K times. The K current configurations q_curr are obtained by randomly
selecting K nodes in the graph leading to their corresponding next output configurations as follows:

K_{q_{next}} = \mathrm{CoMPNetX}\big(K_{Z_s}, K_{v}, K_{q_{curr}}, K_{q_{targ}}\big), \quad \text{where} \quad K_{q_{next}} = \begin{bmatrix} q^{1}_{next} \\ \vdots \\ q^{K}_{next} \end{bmatrix}, \; \cdots, \; K_{q_{targ}} = \begin{bmatrix} q^{1}_{targ} \\ \vdots \\ q^{K}_{targ} \end{bmatrix} \quad (10)

At the beginning of planning, the graph T might have only one sample, i.e., q_init. In that case, an initial set of K_{q_curr} can be created by randomly sampling the manifold M_free or replicating q_init K times. Fig. 5 shows a case with K = 2, and Algorithm 6 presents the K-Batch CoMPNetX algorithm with an underlying SMP. This approach is not just for batch sampling methods such as FMT* [11] and BIT* [9] but can also be applied to any unidirectional SMP method like RRT [33], [6] and PRMs [32] by setting K = 1. Furthermore, our procedure shifts to traditional sampling techniques, introduced in IMACS [23], after generating neural informed implicit manifold configurations using CoMPNetX for Nsmp iterations. This allows our framework to explore the entire space in the worst case, leading to theoretical guarantees expected from a planning algorithm.

Algorithm 6: K-Batch CoMPNetX
    T ← InitializeSMP(q_init, q_goal)
    K_{q_targ} ← KReplicas(q_goal)
    for i ← 0 to N_max do
        if i < N_ismp then
            K_{q_curr} ← SelectNodes(T, K)
            K_{q_next} ← CoMPNetX(K_{Z_s}, K_v, K_{q_curr}, K_{q_targ})
        else
            K_{q_next} ← TraditionalSMP()
        goal_reached ← SMP(K_{q_next}, T)
        if goal_reached then
            σ ← ExtractPath(T)
    if σ is not empty then
        ExecutePlan(σ)
    else
        return Failure or AskExpert
    return ∅

Bidirectional CoMPNetX: This approach incorporates Bidirectional SMP (BiSMP) methods into CoMPNetX that generate bidirectional trees T_a = (V, E) and T_b = (V, E) originating from the start q_init and goal q_goal configurations, respectively, with vertices V and edges E. Although the following approach can be formulated as K-Batch bidirectional CoMPNetX, we consider K = 1 and drop the K notation introduced in the previous section for brevity. Furthermore, we also show that our approach can be combined with learning-based task planners such as NTP2 that generate neural task representations and intermediate subtasks for CoMPNetX, which in return accomplishes those subtasks, forming a mutually symbiotic relationship.

In this procedure, CoMPNetX alternately generates samples for both trees and greedily expands them towards each other by having current and target configurations in the opposite trees (Fig. 6), i.e.,

\text{Forward:} \quad q^{a}_{next} \leftarrow \mathrm{CoMPNetX}\big(Z_s, v, q^{a}_{curr}, q^{b}_{targ}\big)
\text{Backward:} \quad q^{b}_{next} \leftarrow \mathrm{CoMPNetX}\big(Z_s, v, q^{b}_{curr}, q^{a}_{targ}\big)

where configurations with superscripts a and b correspond to trees T_a and T_b, respectively.

Algorithm 7 presents an overall framework using NTP2 and CoMPNetX with an underlying bidirectional SMP algorithm, like RRTConnect [6], and a constrained-adherence method. NTP2 takes the current environment observation x_t, previous task program p_{t−1}, and the desired goal observation x_T and generates the next program p_t with its representation Z_s. The procedure GetConfigs takes the generated task information (p_t, Z_s) and obtains the corresponding start and goal configurations. These configurations and task-scene representations are given to CoMPNetX-BiSMP to accomplish the given subtask by generating a feasible motion plan.

Algorithm 7: Bidirectional CoMPNetX
    t ← 1; p_0 ← input_program
    while not end_of_program do
        x_t, v_t ← GetObservation()
        p_t, Z_s, end_of_program ← NTP2(x_t, x_T, p_{t−1})
        q_init, q_goal ← GetConfigs(p_t, Z_s)
        T_a, T_b ← InitializeBiSMP(q_init, q_goal)
        q^a_curr, q^b_targ ← q_init, q_goal
        for i ← 0 to N_max do
            if i < N_ismp then
                q^a_next ← CoMPNetX(Z_s, v_t, q^a_curr, q^b_targ)
            else
                q^a_next ← TraditionalSMP()
            q^a_next, path_found ← BiSMP(q^a_next, T_a, T_b)
            if path_found then
                σ_t ← ExtractPath(T_a, T_b)
            q^a_curr ← q^a_next
            Swap(T_a, T_b)
            Swap(q_curr, q_targ)
        if σ_t is not empty then
            ExecutePlan(σ_t)
        else
            return Failure or AskExpert
        t ← t + 1
    return ∅

Fig. 6 illustrates the internal process of a BiSMP, such as RRTConnect, using CoMPNetX generated samples. Let's assume tree T_a's current configuration is being used to generate
the next sample (Fig. 6 (a)). The underlying BiSMP begins by extending T_a towards the next configuration q^a_next and updates q^a_next with the last state reached by the constrained integrator towards the target q^b_targ (Fig. 6 (b)). The process then extends T_b towards q^a_next, and the extension process ends by returning the updated q^a_next and a boolean path_found. The path_found is true when trees T_a and T_b are connected, depending on the trees' connection strategy of an underlying BiSMP, and there exists a path between start and goal that satisfies all the desired constraints. To solicit bidirectional path generation using CoMPNetX, the roles of current and target configurations are also swapped along with the planning trees' roles at the end of each planning iteration (Fig. 6 (c)). Our CoMPNetX-BiSMP quickly finds a path solution by exploiting the moving targets from its own distribution, which improves the stability of the generator to find connectable paths, as satisfying the two-point boundary value problem becomes easier when the two goal states are iteratively sampled from a distribution encoded by the generator, rather than one defined arbitrarily during the problem definition.

Note that the constraint function F is used only by an underlying SMP method. Furthermore, like CoMPNet, CoMPNetX (batch and bidirectional) also extends the planning graph from the nearest node of the newly generated next node since all underlying SMP algorithms rely on the nearest neighbor for their graph extension towards the given configuration sample [4]. It is also in contrast to the MPNet algorithm [15], [16] that greedily finds a path by extending from q_curr to q_next in an overall planning method and repairs any non-connectable nodes via stochastic re-planning. Although the MPNet approach works extremely fast in unconstrained planning problems, re-planning becomes computationally expensive in CMP due to projections performed by the constrained integrator. Moreover, the constraint manifolds are non-Euclidean in topography, and extension from nearest neighbors becomes convenient for geodesic interpolation. This is evident from the experimentation in our previous work [22], showing that leveraging MPNet's greedy path-finding approach, without replanning, often fails in finding a connectable path solution on the manifolds. However, in our extended analysis presented in this work, we show that CoMPNetX, in addition to CMP, can still be used with the MPNet planning algorithm for efficiently solving unconstrained planning problems with low computation times and high success rates in high-dimensional planning problems.

VI. IMPLEMENTATION DETAILS

This section describes the data generation pipeline from setting up scenarios to obtaining expert demonstrations and observation data. We also describe training and testing data splits for all scenarios considered in this work. Furthermore, with this paper, all generated datasets, trained models, and algorithmic implementations will be made publicly available on our project website (https://sites.google.com/view/compnetx/home).

A. Scene setup

We set up the following cluttered environments imposing various hard kinematic constraints on the robot motion:

Sphere Environment: This environment requires the motion planning of a point-mass on the sphere with constraint F(q) = ||q|| − 1, forming a two-dimensional manifold on a three-dimensional ambient space. In this setup, we create two scenarios:

• Scenario 1 - We generate 50 unique scenes by randomly placing 500 small obstacle blocks over the sphere (Fig. 1 (a)). For each scene, we randomly sample 2000 start and goal pairs on the obstacle-free space of the sphere.
• Scenario 2 - This setup requires traversing multiple narrow passages between the randomly selected start and goal configurations (Fig. 1 (b)). We randomly sample 500 unique start and goal pairs from the obstacle-free space, each of which constitutes a CMP problem. This setup is only used to test our model's generalization capacity, trained on sphere - scenario 1, to an entirely different environment.

Bartender Environment: A dataset, named Bartender environment, containing three different scenarios was created to fully capture the complexities of the real-world task and constrained motion planning problems. The environment includes two tables placed perpendicular to each other. The table contains seven objects placed at random, and only five are movable under pre-specified motion constraints. The five movable/manipulatable items include a juice can (green), fuze bottle (purple), soda can (red), kettle, and red mug. The two stationary objects include a tray and a recycling bin that form the movable objects' goal locations. The juice can, soda can, and fuze bottle are to be placed into the recycling bin with only collision-avoidance constraints. The kettle and the red mug are to be placed on the tray with both stability and collision-avoidance constraints, i.e., no tilting is allowed during the robot motion. The three different scenarios are described as follows.

• Scenario 1 - In this scenario, the objects can be moved to their targets in any order. In other words, in most cases, all objects' start and goal configurations are reachable. We generate 1833 unique scenes through the random placement of the movable and non-movable objects on the tables at the robot's right arm's reachable locations. Each scene contains a total of ten (five unconstrained and five constrained) planning problems.
• Scenario 2 - In this scenario, the goal location of either the red mug or the kettle contains an obstacle. The obstacles are formed by placing either juice bottle, fuze bottle, or soda can at the goal location of the kettle or the red mug. For example, if the red mug's goal location contains the juice bottle, the task planner needs to account for this information during the planning process. That is, the juice bottle needs to be moved into the recycling bin before the red mug is attempted to be moved onto the tray. This enforces a constraint on the task planner to account for obstacles. We created 700 scenes in this setup, each with at least two constrained and two unconstrained
planning problems.
• Scenario 3 - In this setup, the kettle and red mug are placed on the tray, and the task is to swap their start locations. In other words, the goal locations of both the kettle and the red mug are occupied by the red mug and the kettle, respectively. Therefore, there is a need for a sub-goal generation for one of the objects. For example, the tea kettle should be moved to a temporary location on the table. This is followed by the pick-place of the plastic mug to its goal location. Finally, the goal location of the tea kettle is now free for its pick-place operation. For this problem, we created 300 unique cases by random placement of the tray, and each case contained at least six planning problems, i.e., three constrained and three unconstrained.

Kitchen Environment: In this scenario we have seven manipulatable objects: soda can, juice can, fuze bottle, cabinet door, black mug, red mug, and pitcher. The objective is to move the cans and bottle to the trash bin, open the cabinet door from any starting angle to a fixed final angle (π/2.7), transfer (without tilting) the black and red mugs from the cabinet to the tray, and move the pitcher from the table into the cabinet. We construct 1633 unique scenarios by the random placement of the trash bin, tray, and manipulatable objects (excluding door) on the table and by randomly selecting the cabinet's door starting angle between 0 and π/4. Each scenario contains a total of 14 planning problems, i.e., seven unconstrained (reach) and seven constrained (manipulation) problems.

B. Training & testing data splits

In the sphere (scenario 1), we use 40 environments for training and 10 for testing. The sphere (scenario 2) is used for testing only. In the bartender (scenario 1) and kitchen environments, we use 10% data for testing, and the remainder is used for training. All training paths were generated by an oracle planner, i.e., Atlas-RRTConnect. To train the neural task programmer on all bartender (scenarios 1, 2 & 3) and kitchen environments, we use the same data split ratio, i.e., about 5% is kept for testing. Note that CoMPNetX is never trained on the sphere (scenario 2) and Bartender scenarios 2 and 3. We use them to evaluate CoMPNetX's generalization capacity across different environment structures and planning problems.

C. Observation data

In the sphere environment, the observation data is a point-cloud converted into a voxel map of size 40 × 40 × 40. However, for the other high-dimensional robot environments (Bartender and Kitchen), there exist workspace and entire scene observations at any time instant t. The workspace observation includes the current x_t ∈ X and the target x_T ∈ X. The current workspace observation x_t at a given time is represented by each object's pose and the robot end-effector pose. The target

and kitchen environments leading to voxel maps of dimensions 33 × 33 × 33 and 32 × 32 × 32, respectively.

D. NTP2: Programs and API Arguments Set

In our NTP2 setup, the list of initial programs includes arrange_table and swap_tray_objs. The Bartender (setup 1 and 2) and Kitchen tasks begin with the former, whereas the Bartender setup 3 begins with the latter program. The initial program can call either pick_place, subgoal_gen, return_arm, or no_op programs followed by their underlying API-programs named pick and place. The API-programs pick and place represent an unconstrained planning problem, requiring a robot to reach a given target/object, and a constrained planning problem, demanding manipulation under manifold constraints, respectively. An API-program also gets an argument, predicted by the API-decoder, which in our cases is one of the objects (e.g., juice can, fuze bottle, soda can, etc.) to be picked or placed in the given scenario. Furthermore, the program return_arm requires a robot to return to its initial default configuration from any starting state, and the program no_op means no operation is needed. Finally, subgoal_gen is executed to move objects acting as obstacles out of the way through pick-place procedures to achieve the desired sub-task.

VII. RESULTS

In this section, we present the results and analysis of the following evaluation studies: (i) A comparison study evaluating CoMPNetX and state-of-the-art classical SMP planning methods with an underlying constraint-adherence approach (projection, atlas, or tangent bundle) on unseen challenging problems in environments named Sphere, Bartender, and Kitchen. (ii) An ablation study comparing CoMPNetX with its ablated models and our previous method CoMPNet [22]. (iii) An extended evaluation to highlight the mutualistic relationship of learning-based task programmers and CoMPNetX and their capacity to generalize across different planning domains.

A. Comparative analysis

This section compares SMP methods augmented with batch and bidirectional CoMPNetX against their classical setups in solving CMP problems. In the batch method, we select FMT* [11], a state-of-the-art classical SMP algorithm that is proven to perform better than standard approaches like RRT* and PRM* [7]. FMT* begins with an initial batch of N_init uniform samples, including a goal configuration. In case the initial set of samples does not yield a path solution, FMT* continues to expand the tree by generating a new random sample in every planning iteration. We choose FMT* to highlight the flexibility offered by CoMPNetX in generating sample batches with different K ≥ 1 according to the given SMP method. In CoMPNetX-FMT*, we generate an initial batch of N_init samples with K