Constrained Motion Planning Networks X - Ahmed H. Qureshi, Jiangeng Dong, Asfiya Baig and Michael C. Yip - arXiv

Constrained Motion Planning Networks X
                                                                     Ahmed H. Qureshi, Jiangeng Dong, Asfiya Baig and Michael C. Yip

                                           Abstract—Constrained motion planning is a challenging field of
research, aiming for computationally efficient methods that can find a collision-free path on the constraint manifolds between a given start and goal configuration. These planning problems come up surprisingly frequently, such as in robot manipulation for performing daily life assistive tasks. However, few solutions to constrained motion planning are available, and those that exist struggle with high computational time complexity in finding a path solution on the manifolds. To address this challenge, we present Constrained Motion Planning Networks X (CoMPNetX). It is a neural planning approach, comprising a conditional deep neural generator and discriminator with a neural gradients-based fast projection operator. We also introduce neural task and scene representations, conditioned on which CoMPNetX generates implicit manifold configurations to turbo-charge any underlying classical planner, such as Sampling-based Motion Planning methods, for quickly solving complex constrained planning tasks. We show that our method finds path solutions with high success rates and lower computation times than state-of-the-art traditional path-finding tools on various challenging scenarios.

arXiv:2010.08707v2 [cs.RO] 3 Jul 2021

Fig. 1: CoMPNetX generalized in the sphere environment from (a) small cubical obstacles' geometry to (b) multiple longitudinal obstacle strips and planned near-optimal paths between randomly selected start and goal pairs in sub-second computational times.

A. H. Qureshi, J. Dong, A. Baig and M. C. Yip are affiliated with University of California San Diego, La Jolla, CA 92093 USA. {a1qureshi, jid103, abaig, yip}

I. INTRODUCTION

Constrained Motion Planning (CMP) has a broad range of robotics applications for solving practical problems emerging in domains such as assistance at home, factory floors, disaster sites, and hospitals [1]. In our daily life, most of our activities involve a large number of CMP tasks. For example, at our home, we interact with various objects to perform usual household chores such as cleaning and cooking, including opening doors, carrying a tray or a glass filled with water, and lifting boxes. Likewise, skilled workers manipulate their tools to solve a wide variety of tasks such as assembly at factory floors and advanced-level surgery in the hospitals.

In all of the scenarios mentioned above, our cognitive process decomposes a given task (e.g., cleaning) into sub-tasks (e.g., moving objects to their designated places) and accomplishes them sequentially or concurrently by sending motor commands to the body for physical interaction with the environment under the task-specific constraints [2], [3]. In robotics, this phenomenon is known as Task and Motion Planning (TMP). A task planner decomposes a given task into a sequence of sub-tasks, and a motion planner achieves those sub-tasks by planning feasible robot motion sequences. This paper focuses on the latter part of TMP, i.e., task-constrained motion planning methods, and their integration with the existing state-of-the-art learning-based task programmers.

In the last decade, Sampling-based Motion Planning (SMP) methods have surfaced as prominent motion planning tools in robotics [4]. These algorithms randomly sample the robot joint-configurations to build a collision-free graph, which eventually connects the given start and goal configurations, leading to a path solution [4]. However, in CMP, the constraint equations implicitly define a configuration space comprising zero-volume constraint manifolds embedded in a higher-dimensional ambient space of the robot's joint variables [5]. Therefore, the probability of generating random robot configurations on those manifolds is not just low but zero, which makes the state-of-the-art gold standard SMP methods [6], [7], [8], [9], [10], [11], [12], [13] fail in such problems [14].

Recently, constraint-adherence methods that generate samples on the manifolds have been incorporated into existing SMP algorithms for CMP [14]. These methods include projection and continuation-based approaches. The former uses Jacobian-based gradient descent to project a given configuration to the manifold. The latter takes a known constraint-adhering configuration to compute a tangent space, using which new samples are generated closer to the manifold for projection. These advanced planning methods solve a wide range of tasks, but they often exhibit high computational time complexity with high variance, making them frequently impractical for real-world manipulation problems.

A parallel development led to the cross-fertilization of SMP and machine-learning approaches, resulting in learning-based motion planners [15], [16], [17], [18], [19], [20], [21]. These methods learn from an oracle planner and are shown to be scalable and generalizable to new problems with significantly faster computational speed than classical methods. Some of these planners even provide worst-case theoretical guarantees. For instance, Motion Planning Networks (MPNet) [15], [16] generates collision-free paths through divide-and-conquer as it divides the problem into sub-problems and
either replans or outsources them, in the worst case, to a classical planner while still retaining its computational benefits.

In our recent work, we extended MPNet to solve CMP problems by proposing Constrained Motion Planning Networks (CoMPNet) [22]. CoMPNet is a deep neural network-based approach that takes the environment perception information, text-based task specification defining the constraints (e.g., open the door), and robot's start and goal configurations as an input and outputs a feasible path on the constraint manifolds. CoMPNet connects any two given configurations using a projection-based constraint-adherence operator, and like MPNet, it also performs a divide-and-conquer through bidirectional expansion. However, it avoids replanning, which is a computationally expensive process in CMP, and instead builds an informed tree of possible paths.

This paper presents a unified framework called Constrained Motion Planning Networks X (CoMPNetX)1, which extends CoMPNet and generates informed implicit manifold configurations to speed up any SMP algorithm equipped with its constraint-adherence approach for solving CMP problems. CoMPNetX comprises the conditional neural generator, discriminator, a neural gradient-based projection operator, and sampling heuristics to propose samples for all kinds of SMP methods. Furthermore, compared to our previously proposed CoMPNet, this new approach, i.e., CoMPNetX, has the following novel features:
  •   CoMPNetX plans in implicit manifold configuration spaces, whereas CoMPNet only considers the robot configuration space. The implicit manifold configuration spaces are formed by the robot configuration and the constraint function. For instance, in the door opening task, the door, represented as a virtual-link manipulator using Task Space Regions (TSRs), and the robot arm form an implicit manifold planning space for CoMPNetX.
  •   CoMPNet only considers the projection operator for constraint adherence. In contrast, in this paper, we extend CoMPNet, naming it CoMPNetX, to operate with both projection- and continuation-based constraint adherence approaches for enhancing any SMP method, including batch and bidirectional techniques.
  •   In our previous work, the task sequences were defined by an expert as a text, e.g., open the cabinet and then move an object into the cabinet. CoMPNet sequentially takes the latent embeddings of those text-based task specifications to generate the motion sequences. However, text-based representations are agnostic of the given workspace and the overall planning objective. Therefore, this paper introduces a strategy to combine CoMPNetX with the deep neural network-based task planning approaches that relieve an expert from providing task sequences during execution and provide context-aware neural task representations for CMP.
  •   Unlike CoMPNet, the proposed approach also comprises a discriminator function that predicts the distances of generated configurations from the constraint manifold and provides gradients to project them to the manifold if needed.

In summary, CoMPNetX can generate robot configurations for a wide range of SMP algorithms while retaining their worst-case theoretical guarantees. Our generator and discriminator are conditioned on the neural task representation and the environment observation encoding. The conditional generator takes the desired start and goal configurations to output intermediate implicit manifold configurations, and the conditional discriminator predicts their geodesic distances from the underlying manifold. We use the discriminator's predictions and their gradients as the operator to project the given configurations towards the constraint manifold if needed. CoMPNetX naturally forms a mutual symbiotic relationship with learning-based task programmers and exploits their inner states, representing tasks, to traverse multiple constrained manifolds for finding their path solutions. We show that these task representations from a learning-based task planner can lead to better performance in motion planning than human-defined text-based task representations (as in [22]). We test CoMPNetX with various SMP algorithms using both continuation and projection-based constraint-adherence methods on challenging problems and benchmark them against the state-of-the-art classical CMP algorithms. We also evaluate our models' generalization capacity to new planning problems and environment structures, such as in the sphere environment, where models trained on settings with small obstacle blocks generalize to an environment with multiple obstacle strips forming various narrow passages (Fig. 1).

The remainder of the paper is organized as follows. Section II presents preliminaries describing general notations and ideas in CMP, such as constraint functions and their constraint-adherence methods. Section III offers a detailed literature review on existing approaches in CMP. Section IV describes our procedure to obtain neural task representations, and Section V presents CoMPNetX with its batch and bidirectional sampling heuristics. Section VI gives implementation details, followed by Section VII, which is dedicated to experimental results of our comparison, ablation, and extended studies. Section VIII presents a brief discussion about our method inheriting an underlying SMP algorithm's worst-case theoretical properties. Finally, Section IX concludes our work with pointers to our future directions, and an Appendix provides details on the model architectures, algorithmic implementations, and their related parameters.

1 The project videos and other supplementary material are available at

II. PRELIMINARIES

In this section, we describe the problem of constrained motion planning with its basic terminologies. We also outline a brief overview of constraint-adherence operators employed by CMP methods for local planning under hard kinematic constraints.

A. Problem Definition

In the classical problem of motion planning, the robot system is defined by a configuration space (C-space) $Q \subset \mathbb{R}^n$ with $n \in \mathbb{N}$ dimensions. The axes of the C-space correspond to
the system's variables that govern their motion, such as robot joint-angles, and hence, the dimension $n$ is equivalent to the robot's degrees of freedom (DOF). The robot's surrounding environment is usually described as task-space $X \subset \mathbb{R}^m$ with $m \in \mathbb{N}$ dimensions, comprising obstacle $X_{obs} \subset X$ and obstacle-free $X_{free} = X \setminus X_{obs}$ spaces. In the C-space terminology, the spaces $X_{obs}$ and $X_{free}$ are represented as $Q_{obs}$ and $Q_{free} = Q \setminus Q_{obs}$, respectively. In motion planning, a collision-checker InCollision(·) is assumed to be available that takes a robot configuration $q \in Q$ and $X_{obs}$, and outputs a boolean indicating if a given configuration lies in $Q_{obs}$ or not.

We consider a setup where, for given current $x_t \in X_{free}$ and target $x_T \in X_{free}$ workspace observations, the high-level task planner, $\pi_H$, at time $t$, outputs an achievable sub-task representation $Z_c$ for the low-level agent $\pi_L$. For each sub-task, $Z_c$, we also assume there exists a constraint function $F$. The agent, $\pi_L$, finds motion sequences in $Q_{free}$ to achieve the given sub-task, $Z_c$, under constraints $F$, leading to a next observation $x_{t+1}$. This paper considers deep neural networks-based state-of-the-art task planners as high-level agents, $\pi_H$, and proposes a novel low-level agent, $\pi_L$, i.e., CoMPNetX, that leverages $\{Z_c, F\}$ for motion planning under task-specific constraints.

A fundamental unconstrained motion planning problem for a given start configuration $q_{init} \in Q_{free}$, a goal region $Q_{goal} \subset Q_{free}$, environment obstacles $X_{obs}$, and a collision-checker, is defined as:

Problem 1 (Unconstrained Motion Planning) Given a planning problem $\{q_{init}, Q_{goal}, X_{obs}\}$, and a collision-checker, find a collision-free path solution $\sigma : [0, 1]$, if one exists, such that $\sigma_0 = q_{init}$, $\sigma_1 \in Q_{goal}$, and $\sigma[0, 1] \mapsto Q_{free}$.

In constrained motion planning, a planner also has to satisfy a set of hard constraints defined by a function $F(q) : Q \mapsto \mathbb{R}^k$, such that $F(q) = 0$. The $k \in \mathbb{N}$ denotes the number of constraints imposed on the robot motion, which induces an $(n - k)$-dimensional space embedded in the robot's unconstrained ambient C-space, comprising one or more manifolds $M$, i.e.,

$$M = \{q \in Q \mid F(q) = 0\}$$

In practice, a configuration $q$ is assumed to be on the manifold if $\|F(q)\|_2 < \varepsilon$, where $\varepsilon > 0$ is a tolerance threshold. Furthermore, the obstacle-free and obstacle spaces on the manifolds are denoted as $M_{free} = M \cap Q_{free}$ and $M_{obs} = M \setminus M_{free}$, respectively. A CMP problem for a given start configuration $q_{init}$, goal region $Q_{goal} \subset Q_{free}$, environment obstacles $X_{obs}$, function $F$, and a collision-checker, is defined as:

Problem 2 (Constrained Motion Planning) Given a planning problem $\{q_{init}, Q_{goal}, X_{obs}, F\}$, and a collision-checker, find a collision-free path solution $\sigma : [0, 1]$, if one exists, such that $\sigma_0 = q_{init}$, $\sigma_1 \in Q_{goal}$, and $\sigma[0, 1] \mapsto M_{free}$.

In our work, we show that CoMPNetX solves both unconstrained (Problem 1) and constrained (Problem 2) planning problems. Furthermore, for the latter problem, we only consider kinematic constraints, i.e., the function $F$ solely depends on robot configuration $q \in Q$, not on other robot properties such as dynamics representing velocity or acceleration. Moreover, we define $F(q)$ as distance to the constraint manifold with domain $s$, i.e.,

$$F(q) = \text{Distance to the constraint manifold}$$

For instance, if the constraint is on the robot's end-effector to maintain a particular position, then $F(q)$ can be defined as the distance of the robot's end-effector to that specific position with domain $s \in [0, 1]$, spanning an entire or a fraction of a motion trajectory. Likewise, when the robot is moving, balancing constraints are usually imposed on the whole robot motion trajectory with $s = [0, 1]$.

In the remaining section, we describe the two main types of classical constraint-adherence operators that ensure a given configuration or a motion between two configurations lies on the constraint manifold defined by $F$.

B. Projection-based Constraint-Adherence Operator

The projection operator (Proj) maps a given configuration $q \in \mathbb{R}^n$ to the manifold $M$. It can be formulated as a constrained optimization problem [23]

$$\min_{q'} \frac{1}{2} \|q - q'\|^2 \quad \text{subject to} \quad F(q') = 0,$$

with its dual as:

$$L(q', \lambda) = \frac{1}{2} \|q - q'\|^2 - \lambda F(q'),$$

where $\lambda$ corresponds to Lagrange multipliers. The above system is solved using gradient descent as summarized in Algorithm 1, where $J^{+}(q)$ is the pseudoinverse of the Jacobian at configuration $q \in Q$. Algorithm 2 outlines the local planning procedure using a projection operator [23], [24]. This procedure outputs all the intermediate configurations on the manifold in the given conditions and loop limit $N$, when traversing from a given start configuration ($q_s$) towards the end configuration ($q_e$) in small incremental steps $\gamma \in \mathbb{R}$. The projection-based steering stops if any of the following happens: (i) The loop limit is reached. (ii) The resulting configuration $q_{i+1}$ is in a collision. (iii) The stepping distance diverges rather than converges, a check that prevents overshooting the target configuration, i.e., either $d_2 > d_1$ or $d > \lambda_1 \gamma$. (iv) The progress in manifold space $D$ becomes greater than a scalar $\lambda_2$ times the progress in the ambient space $d_w = \|q_e - q_s\|$.

Algorithm 1: Projection Operator: Proj($q$)
    for $i \leftarrow 0$ to $N$ do
        $\Delta x \leftarrow F(q)$
        if $\|\Delta x\|_2 < \varepsilon$ then
            return $q$
        else
            $q \leftarrow q - J(q)^{+} \Delta x$
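The projection operator (Algorithm 1) and the projection-based steering with its four stopping conditions can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the unit-sphere constraint, the helper names (`sphere_constraint`, `sphere_jacobian`, `project`, `projection_integrator`), and all parameter values are hypothetical choices made only for the example, and the guard against a failed projection is an addition that the pseudocode does not contain.

```python
import numpy as np


def sphere_constraint(q):
    """Illustrative F(q): zero exactly on the unit sphere (k = 1 constraint)."""
    return np.array([q @ q - 1.0])


def sphere_jacobian(q):
    """Jacobian of sphere_constraint at q, shape (k, n) = (1, n)."""
    return 2.0 * q.reshape(1, -1)


def project(q, F, J, eps=1e-8, max_iters=50):
    """Algorithm 1: iterate q <- q - J(q)^+ F(q) until ||F(q)||_2 < eps.

    Returns the projected configuration, or None if the loop limit is hit.
    """
    q = np.asarray(q, dtype=float).copy()
    for _ in range(max_iters):
        dx = F(q)
        if np.linalg.norm(dx) < eps:
            return q
        q = q - np.linalg.pinv(J(q)) @ dx
    return None


def projection_integrator(q_s, q_e, F, J, in_collision,
                          gamma=0.05, lam1=2.0, lam2=2.0, max_steps=500):
    """Projection-based steering from q_s towards q_e (Algorithm 2)."""
    q_e = np.asarray(q_e, dtype=float)
    path = [np.asarray(q_s, dtype=float)]
    d_w = np.linalg.norm(q_e - path[0])  # progress in the ambient space
    D = 0.0                              # accumulated progress on the manifold
    for _ in range(max_steps):           # (i) loop limit
        q_i = path[-1]
        q_next = project(q_i + gamma * (q_e - q_i), F, J)
        if q_next is None:               # extra guard, not part of Algorithm 2
            break
        d = np.linalg.norm(q_next - q_i)
        D += d
        d1 = np.linalg.norm(q_i - q_e)
        d2 = np.linalg.norm(q_next - q_e)
        # (ii) collision, (iii) diverging step, (iv) too much manifold travel
        if in_collision(q_next) or d2 > d1 or d > lam1 * gamma or D > lam2 * d_w:
            break
        path.append(q_next)
    return path


if __name__ == "__main__":
    start, goal = np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])
    path = projection_integrator(start, goal, sphere_constraint, sphere_jacobian,
                                 in_collision=lambda q: False)
    print(len(path), np.linalg.norm(path[-1] - goal))
```

As in the pseudocode, a configuration that triggers a stopping condition is discarded, so the returned list only contains configurations that passed every check.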

Algorithm 2: Projection Integrator ($q_s$, $q_e$)
    $i \leftarrow 0$; $D \leftarrow 0$
    $d_w \leftarrow \|q_e - q_s\|$; $q_i \leftarrow q_s$
    while $i < N$ do
        $q_{i+1} \leftarrow \text{Proj}(q_i + \gamma (q_e - q_i))$
        $d \leftarrow \|q_{i+1} - q_i\|_2$
        $D \leftarrow D + d$
        $d_1 \leftarrow \|q_i - q_e\|_2$; $d_2 \leftarrow \|q_{i+1} - q_e\|_2$
        if InCollision($q_{i+1}$) or $d_2 > d_1$ or $d > \lambda_1 \gamma$ or $D > \lambda_2 d_w$ then
            break
        $i \leftarrow i + 1$
    return $\{q_j\}_{j=0}^{i}$

C. Continuation-based Constraint-Adherence Operator

The continuation-based approaches [23], [25], [26] represent the manifold through a set of local parameterizations, known as charts $C$, forming an atlas $A$.

A chart $C_i = (q_i, \Phi_i(q_i))$, with an index $i \in \mathbb{N}$, locally parameterizes a manifold through a tangent space and its orthonormal basis $\Phi_i$ at a known constraint-adhering configuration $q_i \in M$. The orthonormal basis $\Phi_i \in \mathbb{R}^{n \times (n-k)}$ is used to define an exponential map $\psi_i : \mathbb{R}^{n-k} \mapsto \mathbb{R}^n$ and its inverse, i.e., a logarithmic map $\psi_i^{-1} : \mathbb{R}^n \mapsto \mathbb{R}^{n-k}$, between the parameter $u^i_j$ on the tangent space and the manifold around configuration $q_i$ (Fig. 2 (a)). The basis $\Phi_i$ is computed by solving the following system of equations:

$$\begin{bmatrix} J(q_i) \\ \Phi_i^\top \end{bmatrix} \Phi_i = \begin{bmatrix} 0 \\ I \end{bmatrix}, \qquad (1)$$

where $J(q_i) \in \mathbb{R}^{k \times n}$ is the Jacobian of $F$ at the configuration $q_i$, $0 \in \mathbb{R}^{k \times (n-k)}$, and $I \in \mathbb{R}^{(n-k) \times (n-k)}$ is the identity matrix.

Fig. 2: (a) A chart $C_i$ with its operators, comprising the exponential $\psi_i$ and logarithmic $\psi_i^{-1}$ functions, for mapping between the tangent space at $q_i$ and the manifold. (b) The parameters defining the chart validity region.

The exponential mapping $\psi_i$ is a two-step process. The first step determines a configuration $q^i_j$ in the ambient space using the mapping $\phi_i$, i.e.,

$$q^i_j = \phi_i(u^i_j) = q_i + \Phi_i u^i_j \qquad (2)$$

The second step takes $q^i_j$ and orthogonally projects it to the manifold, resulting in $q_j$, by solving the following system:

$$F(q_j) = 0, \qquad \Phi_i^\top (q_j - q^i_j) = 0 \qquad (3)$$

The above equations are usually solved iteratively by a Newton method until the error $\|q_j - q^i_j\|_2 < \epsilon$ is tolerable or the maximum iteration limit is reached.

The inverse logarithmic mapping $\psi_i^{-1}$ from the manifold to the tangent space is straightforward to compute, i.e.,

$$u^i_j = \psi_i^{-1}(q_j) = \Phi_i^\top (q_j - q_i) \qquad (4)$$

Note that each chart $C_i$ has a validity region $V_i$ in which it properly parameterizes the manifold, and exceeding that region could lead to divergence when orthogonally projecting configurations to the manifold during the exponential mapping process. This validity region is governed by the following conditions:

$$\|q^i_j - q_j\| \leq \varepsilon; \qquad \frac{\|u^i_j\|}{\|q_i - q_j\|} \geq \cos(\alpha); \qquad \|u^i_j\| \leq \rho \qquad (5)$$

where $\varepsilon$ and $\alpha$ indicate the maximum allowable distance and curvature, respectively, between the chart $C_i$ and the underlying manifold $M$, and $\rho$ defines the radius of the sphere around $q_i$ (Fig. 2 (b)). Furthermore, the validity region $V_i$ can have a complex shape and is usually approximated by a convex polytope $P_i \subset V_i$, represented as a set of linear inequalities defined in the tangent space of chart $C_i$.

To realize local planning using the continuation operator, there exist two types of methods, namely the atlas integrator (Algorithm 3) and the tangent bundle integrator (Algorithm 4). The latter, in contrast to the former, is less strict about the intermediate configurations being on the manifold, performs projections only when needed, and does not separate the tangent spaces into half-spaces to prevent overlaps. In our implementations, these integrators assume both start ($q_s$) and end ($q_e$) configurations to be on the manifold. The procedure RegionValidity in the atlas integrator returns False if any of the above-mentioned region validity conditions are violated.

Algorithm 3: Atlas Integrator ($q_s$, $q_e$, $A_M$)
    $i \leftarrow 0$; $D \leftarrow 0$
    $d_w \leftarrow \|q_e - q_s\|$
    $q_i \leftarrow q_s$
    $C_i \leftarrow \text{GetChart}(q_i, A_M)$
    $u_i \leftarrow \psi_i^{-1}(q_i)$
    $u_e \leftarrow \psi_i^{-1}(q_e)$
    while $\|u_i - u_e\|_2 > \gamma$ do
        $u_{i+1} \leftarrow u_i + \gamma (u_e - u_i) / \|u_e - u_i\|_2$
        $q_{i+1} \leftarrow \psi_i(u_{i+1})$
        $d \leftarrow \|q_{i+1} - q_i\|_2$
        $D \leftarrow D + d$
        $d_1 \leftarrow \|q_i - q_e\|_2$; $d_2 \leftarrow \|q_{i+1} - q_e\|_2$
        if InCollision($q_{i+1}$) or $d_2 > d_1$ or $d > \lambda_1 \gamma$ or $d < \epsilon$ or $D > \lambda_2 d_w$ or $i > N$ then
            break
        $i \leftarrow i + 1$
        if not RegionValidity($u_i$, $q_i$) or $u_i \notin P_{i-1}$ then
            $C_i \leftarrow \text{GetChart}(q_i, A_M)$
            $u_i \leftarrow \psi_i^{-1}(q_i)$
            $u_e \leftarrow \psi_i^{-1}(q_e)$
    return $\{q_j\}_{j=0}^{i}$

To satisfy hard constraints without relaxation on the robot motion, the SMP algorithms [4], such as multi-query Probabilistic Road Maps (PRMs) [32] and single-query Rapidly-exploring Random Trees (RRTs) [33] with its bidirectional variant [6], have been augmented with constraint-adherence methods, such as projection and continuation, to solve a wide range of CMP problems.
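For the continuation-based operator of Section II-C, the chart construction of Eq. (1) and the maps of Eqs. (2)-(4) can be sketched as follows. This is an illustrative NumPy sketch under stated assumptions rather than the authors' code: the unit-sphere constraint and the function names (`constraint`, `jacobian`, `tangent_basis`, `phi`, `psi`, `psi_inv`) are hypothetical, the tangent basis is obtained from an SVD null space (one of several ways to satisfy Eq. (1)), the Jacobian is assumed to have full row rank, and the Newton loop stops on the residual of Eq. (3).

```python
import numpy as np


def constraint(q):
    """Illustrative F(q): zero exactly on the unit sphere (k = 1 constraint)."""
    return np.array([q @ q - 1.0])


def jacobian(q):
    """Jacobian of the constraint at q, shape (k, n)."""
    return 2.0 * q.reshape(1, -1)


def tangent_basis(Jq):
    """Orthonormal basis Phi_i of the null space of J(q_i), satisfying Eq. (1).

    Assumes Jq has full row rank k, so the basis has n - k columns.
    """
    k = Jq.shape[0]
    _, _, vh = np.linalg.svd(Jq)   # vh is (n, n); rows k.. span the null space
    return vh[k:].T


def phi(q_i, Phi, u):
    """Eq. (2): map tangent-space coordinates u into the ambient space."""
    return q_i + Phi @ u


def psi(q_i, Phi, u, F, J, eps=1e-10, max_iters=50):
    """Exponential map: Newton iterations on the system of Eq. (3).

    Returns the configuration on the manifold, or None if Newton fails.
    """
    q_amb = phi(q_i, Phi, u)
    q = q_amb.copy()
    for _ in range(max_iters):
        residual = np.concatenate([F(q), Phi.T @ (q - q_amb)])
        if np.linalg.norm(residual) < eps:
            return q
        system = np.vstack([J(q), Phi.T])   # square (n, n) Newton matrix
        q = q - np.linalg.solve(system, residual)
    return None


def psi_inv(q_i, Phi, q_j):
    """Eq. (4): logarithmic map from the manifold to the tangent space."""
    return Phi.T @ (q_j - q_i)


if __name__ == "__main__":
    q_i = np.array([0.0, 0.0, 1.0])            # chart origin on the sphere
    Phi = tangent_basis(jacobian(q_i))
    u = np.array([0.3, 0.1])
    q_j = psi(q_i, Phi, u, constraint, jacobian)
    print(q_j, psi_inv(q_i, Phi, q_j))
```

Because the second row block of Eq. (3) pins the tangential coordinates, applying `psi_inv` to the output of `psi` recovers the input `u`, which is a convenient sanity check for a chart implementation.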
Algorithm 4: Tangent Bundle Integrator ($q_s$, $q_e$, $A_M$)
    $i \leftarrow 0$; $D \leftarrow 0$
    $d_w \leftarrow \|q_e - q_s\|$
    $q_i \leftarrow q_s$
    $C_i \leftarrow \text{GetChart}(q_i, A_M)$
    $u_i \leftarrow \psi_i^{-1}(q_i)$
    $u_e \leftarrow \psi_i^{-1}(q_e)$
    while $\|u_i - u_e\|_2 > \gamma$ do
        $u_{i+1} \leftarrow u_i + \gamma (u_e - u_i) / \|u_e - u_i\|_2$
        $q_{i+1} \leftarrow \phi_i(u_{i+1})$
        $d \leftarrow \|q_{i+1} - q_i\|_2$
        $D \leftarrow D + d$
        $d_1 \leftarrow \|q_i - q_e\|_2$; $d_2 \leftarrow \|q_{i+1} - q_e\|_2$
        if InCollision($q_{i+1}$) or $d_2 > d_1$ or $d > \lambda_1 \gamma$ or $d < \epsilon$ or $D > \lambda_2 d_w$ or $i > N$ then
            break
        $i \leftarrow i + 1$
        if $\|\phi_{i-1}(u_i) - q_i\|_2 > \varepsilon$ or $u_i \notin P_{i-1}$ then

The projection-based method was first utilized with a variant of PRMs for parallel manipulators under specialized loop-closure constraints [34]. The parallel manipulators were treated as active/passive links and were composed into a constraint-adhering configuration using projection. Yakey et al. [35] introduced the Randomized Gradient Descent (RGD) method for closed-chain kinematics constraints that generates C-space samples and projects them to the constraint manifold. However, their approach required significant parameter tuning and was later extended to a generalized framework using RRTs and a Jacobian pseudo-inverse based projection method [36]. In a similar vein, Berenson et al. [24] proposed the Constrained Bidirectional RRT (CBiRRT) with an intuitive constraint representation approach called Task Space Regions (TSRs). TSRs represent general end-effector pose constraints and allow a quick computation of geodesic distances from the constraint manifolds. Another class of sampling-based methods that use projection operators and plan in the task-space includes [37], [38], [39]. These methods find a task-space motion plan and
17            q i ← ψi−1 (ui )                                      find their corresponding configurations in the C-space, which
18            Ci ← GetChart(q i , AM )                              limits their exploration and thus does not yield completeness
19            ui ← ψi−1 (q i )                                      guarantees.
20            ue ← ψi−1 (q e )                                         The continuation-based methods compute tangent-spaces at
                                                                    a known constraint-adhering configuration to generate new
21    return {q j }ij=0                                             nearby samples for quick projections to the constraint man-
                                                                    ifold. Yakey et. el [35] used continuation to generate new
                                                                    configuration samples within tangent space, which were pro-
                                                                    jected to the manifold using RGD for closed-chain kinematic
                          III. R ELATED W ORK                       constraints. The continuation methods have also been used
   In this section, we present the existing methods that address    for general end-effector constraints [40], [41]. Inspired by the
the problem of CMP, ranging from relaxation-based methods           definition of differentiable manifolds [42], recent approaches
for trajectory optimization and control to strict approaches        do not discard tangent spaces. Instead, they compose them
such as projection and continuation for sampling-based plan-        using data-structures into an atlas for a piece-wise linear ap-
ning algorithms.                                                    proximation of the constraint manifold [43]. These methods in-
   The relaxation-based methods represent the hard-constraints      clude Atlas-RRT [25] and TangentBundle(TB)-RRT [26] with
as soft-constraints by incorporating them as a penalty into         an underlying single-query bidirectional RRTs algorithm [6].
the cost function. The cost function is optimized to get the        Atlas-RRT ensures all samples to be on the manifold and sep-
desired robot behavior. For instance, the IK-based reactive         arates tangent spaces into tangent polytypes using half-spaces
control method [27], [28] used at the DARPA Robotics                for uniform coverage. In contrast, TB-RRT lazily projects
Challenge operates in the workspace and finds constrained           the configurations for constraint-adherence, i.e., only when
robot motion through convex optimization of the given cost          switching the tangent spaces, and has overlapping tangent
function. However, these approaches often provide incom-            spaces, which sometimes lead to invalid states. There also exist
plete solutions as they are susceptible to local minima. The        variants of Atlas-RRT that allow asymptotic optimality [44],
trajectory optimization methods [29], [30] also optimize the        [5] and kinodynamic planning [45] under constraints.
given cost function over the entire trajectory to find a feasible      Recently, Kingston et. el [23] introduced Implicit MAnifold
motion plan. However, due to the relaxation, they weakly            Configuration Spaces (IMACS) to decouple the choice of
satisfy the given constraints and are typically only effective      constraint-adherence methods from the underlying selection
on short-horizon problems. Recently, Bonalli et al. [31] pro-       of SMP planners. IMACS highlights that any SMP method
posed a trajectory optimization method for implicitly-defined       equipped with the following two components can solve CMP
constraint manifolds, but their approach is yet to be explored      problems. First, a uniform sampling technique to generate
and analyzed in practical CMP robotics problems.                    samples on the manifold. Second, a constraint integrator func-
Constrained Motion Planning Networks X - Ahmed H. Qureshi, Jiangeng Dong, Asfiya Baig and Michael C. Yip - arXiv
Neural Task
                         Representations                        Append to list G
                                                                                             No        of a given task.
      input program
                                                                                              Is API
   e.g., arrange_table                                                  Output program       Program
                                                      Program                                    ?
          xt , xT        Object State Encoder   Zp
                                                      Planner            End-of-program
                                                                                                          API Decoder: A program is defined as an API program if
       current & goal                                                                        Yes
        observations                                                                                   it requires arguments for the execution. Given an api program
                                                                                                       p predicted by the program planner, the neural networks
                                                     Zd   API Decoder         Api argument

                            Graph Encoder
                                                                                    a              a
                                                                                                       based API Decoder predicts their required arguments a. The
                                                                                                       inputs to the API decoder are the current xt and goal xT
Fig. 3: Given a high-level program (e.g., arrange table), the                                          observations, the API program p, and a fixed size graph
environment current xt , and target xT observations, we obtain                                         encoding representing the program hierarchy.
the Neural Task Representations for CoMPNetX by exploiting
a learning-based task programmer’s internal state Z d and                                                 The overall flow of the algorithm is shown in the Fig.
program arguments a.                                                                                   3. The current and goal observations are encoded into la-
                                                                                                       tent embeddings using their encoders. The program planner,
                                                                                                       conditioned on observation encodings, iteratively decomposes
tion to connect two configurations on the manifold. IMACS                                              the given program (e.g., arrange table) into subprograms by
incorporates the constraint function into C-Space, presenting                                          generating a probability distribution over a set of predefined
an implicit manifold space to an underlying SMP method.                                                program instances (e.g., pick and place). The program with
These SMP methods, augmented with a constrained integrator,                                            maximum probability is selected, which becomes the input
are shown to solve various CMP problems. Despite these                                                 to the program planner in the next iteration. This process
advancements, existing SMP methods are computationally                                                 is repeated until an API-program is selected. For instance,
inefficient and take up to several minutes for solving practical                                       the given program, arrange table, can lead to the selection
problems not just in CMP but also in unconstrained planning                                            of a pick place program which subsequently results in the
problems.                                                                                              selection of either pick or place programs. The pick and place
   In this paper, we propose CoMPNetX that extends IMACS                                               are defined as API programs requiring arguments from the
and our previously proposed CoMPNet [22] and also intro-                                               API decoder. This API decoder, conditioned on observation
duces neural-gradient-based projections to generate informed                                           encodings and graph embeddings, predicts the API program’s
implicit manifold configurations for underlying SMP methods                                            arguments indicating the object that needed to be grasped
equipped with any constrained integrator. Our approach can                                             (pick) and moved (place). The graph embeddings are given by
also be interpreted as Neural Informed Implicit MAnifold Con-                                          the graph encoder that takes a list of non-API programs (Fig.
figuration Spaces (NIIMACS), which replaces the abstraction                                            3) and encodes them into a fixed-size latent representation. In
layer of IMACS with neural-learned sampling distributions to                                           our implementation of NTP2, the current observation contains
prioritize sampling in the subsets of a contraint manifold that                                        the current poses of the given objects in the environment and
potentially contains a path solution for a given problem.                                              the robot end-effector pose. The goal observation includes the
                                                                                                       final poses of all objects at the end of the task. Furthermore,
                    IV. N EURAL TASK R EPRESENTATIONS                                                  the program planner and the API decoder were trained using
                                                                                                       the cross-entropy loss for the given expert demonstration.
   This section describes the process to obtain the neural task
                                                                                                       For more details on the implementations, refer to [46], and
representations, utilized by CoMPNetX to define task-specific
                                                                                                       Appendix A of this paper.
constraints in a scalable and generalizable way. These rep-
                                                                                                          To generate a neural task representation for the CoMPNetX,
resentations come from the internal state of a learning-based
                                                                                                       we take the latent inner embedding Z d of API Decoder and
task planner. Although various learning-based task planners
                                                                                                       their corresponding arguments a (Fig. 3). The internal state
can be utilized for acquiring these representations, we adapt
                                                                                                       Z d comprises current and goal encodings, graph embedding
a variant [46] of the Neural Task Programming (NTP) [47] in
                                                                                                       representing the given task hierarchy, and an API program
our framework.
                                                                                                       embedding. Note that the latent state Z d and arguments a
   This variant, which we name NTP2, extends original
                                                                                                       contain sufficient information, i.e., a given high-level task,
NTP by relieving the need for task demonstration at the test
                                                                                                       their sub-task hierarchy, and workspace representation, for the
time. NTP2 uses the goal xT and current xt observations
                                                                                                       CoMPNetX to effectively plan the feasible robot motion path
of the environment to decompose a given high-level task
                                                                                                       respecting the task constraints at any instant. This is in contrast
into a feasible sequence of intermediate sub-tasks. We use
                                                                                                       to the original CoMPNet framework [22] that relied on hand-
NTP2 to obtain the neural task representations and the sub-
                                                                                                       engineered task plans, and sub-tasks were represented as text-
task sequences for CoMPNetX. It comprises the following
                                                                                                       descriptions, making them oblivious of given high-level tasks,
                                                                                                       their hierarchical structure, and overall workspace setup.
   Program Planner: It is a deep neural network-based
iterative program predictor that takes a high-level symbolic                                               V. C ONSTRAINED M OTION P LANNING N ETWORKS
task pt , the environment’s current xt and goal xT observations                                           This section formally present CoMPNetX (Fig. 4), compris-
as an input and outputs a next sub-program pt+1 and the                                                ing a conditional generator, discriminator, neural projection
end-of-program probability r, indicating the accomplishment                                            operator, and neural samplers. The neural generator and dis-
                                                                                                       criminator are conditioned on the task and scene observation
Fig. 4: CoMPNetX execution traces for the constrained door opening subtask. Our method comprises a conditional neural generator and discriminator that, in conjunction with a planning algorithm, finds a feasible path solution between start $q_{init}$ (purple) and goal $q_{goal}$ (green) configurations.
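
The generate-then-project step summarized in Fig. 4 (Algorithm 5 with the update of Eq. (9)) can be illustrated with a small, self-contained sketch. Everything here is a stand-in chosen for illustration and is not the authors' implementation: the unit circle plays the role of the constraint manifold, `toy_generator` and `toy_discriminator` are hypothetical replacements for the trained networks $G_\phi$ and $D_\theta$, the task and scene encodings are omitted, and the gradient is taken by finite differences where a real system would presumably use automatic differentiation. The values of `NU` and `GAMMA` are arbitrary.

```python
import math
import random

NU = 0.01     # projection threshold (nu); illustrative value, not from the paper
GAMMA = 0.1   # projection step size (gamma); illustrative value, not from the paper

def toy_generator(q_curr, q_targ, rng, step=0.3, noise=0.05):
    """Hypothetical stand-in for G_phi: a noisy step from q_curr toward q_targ."""
    return [c + step * (t - c) + rng.gauss(0.0, noise)
            for c, t in zip(q_curr, q_targ)]

def toy_discriminator(q):
    """Hypothetical stand-in for D_theta: distance of q from the unit circle."""
    return abs(math.hypot(q[0], q[1]) - 1.0)

def numerical_grad(f, q, h=1e-6):
    """Central finite differences; assumed here in place of autodiff."""
    grad = []
    for i in range(len(q)):
        hi, lo = list(q), list(q)
        hi[i] += h
        lo[i] -= h
        grad.append((f(hi) - f(lo)) / (2.0 * h))
    return grad

def nproj(q_hat, nu=NU, gamma=GAMMA):
    """One gradient step q <- q_hat - gamma * grad D(q_hat), applied only if D(q_hat) > nu."""
    if toy_discriminator(q_hat) <= nu:
        return list(q_hat)
    grad = numerical_grad(toy_discriminator, q_hat)
    return [qi - gamma * gi for qi, gi in zip(q_hat, grad)]

def compnetx_step(q_curr, q_targ, rng):
    """Generate a candidate next configuration, then project it if needed."""
    return nproj(toy_generator(q_curr, q_targ, rng))
```

Note that a single step with a fixed `gamma` only moves the sample toward the manifold; in this sketch, as in Algorithm 5, strict constraint adherence is left to the constrained integrator of the underlying planner.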

Fig. 5: K-Batch CoMPNetX: The process shows CoMPNetX exploiting neural networks parallelization to generate K = 2 informed manifold configurations from randomly selected nodes in the tree towards the goal configuration(s) for an underlying SMP method equipped with a constrained-integrator.
Fig. 6: Bidirectional CoMPNetX: (a) Informed Sample Generation, (b) $T_a$ and $T_b$ extension, (c) Swapping Roles. Panels (a)-(c) show the CoMPNetX bidirectional sample generation, soliciting neural informed-trees from start and goal to quickly march towards each other within a Bidirectional-SMP method.

encodings to generalize across different environments and planning problems. Our method with a constrained integrator and an underlying SMP algorithm generates feasible motion plans on the constraint manifolds for the given CMP problems.

A. Task Encoder

The task-encoder processes the neural task representations given as $Z_s = [Z_d, a]$. As mentioned earlier, $Z_d$ is a fixed-size vector comprising the workspace current and goal observation encodings, the API program embeddings, and the graph encoding (representing the program hierarchy). Our task encoder takes $Z_s$, comprising $Z_d$ and $a$, as an input and composes them into a fixed-size latent embedding $Z_c \in \mathbb{R}^{d_1}$ of size $d_1$ using a neural network.

B. Scene Encoder

The scene encoder takes the raw environment perception as a 3D depth point-cloud processed into voxels and transforms them to an embedding $Z_o \in \mathbb{R}^{d_2}$ of dimension $d_2$. The 3D voxel grids of dimensions $L \times W \times H \times C$ are converted into 2D voxel patches as $L \times W \times (HC)$, where $L$, $W$, $H$, and $C$ correspond to length, width, height, and the number of channels, respectively. The voxel patches are encoded into $Z_o$ using a 2D convolutional neural network (CNN). We process 3D voxels into 2D voxel patches because 3D maps require 3D-CNNs, which are known to be computationally intensive, and their representations often contain empty volumes [48]. The scene embedding is passed as a fixed-size feature vector describing the environmental obstacles to a subsequent generator and discriminator. Although neural task representations $Z_c$ contain poses of manipulatable objects in their embeddings, the scene observation $Z_o$ also includes information about static non-movable objects acting as obstacles in the environment.

C. Conditional Neural Generator

CoMPNetX's generator $G_\phi$, with parameters $\phi$, is a stochastic neural model that outputs a variety of implicit manifold configurations leading to a constrained path solution (Fig. 4). Because the generator is trained on both unconstrained and constrained path demonstration data, the output distribution of the neural model tends to fall on or near the constraint manifolds when conditioned on task-specific constraints. Our generator derives its stochastic behavior from using Dropout [49] during inference, which instantly slices $G_\phi$ in a probabilistic manner, inculcating variations in the generated samples. Although other techniques such as input Gaussian noise can be used to foster stochasticity, they require a reparametrization trick and are often hard to train end-to-end [50]. In contrast, Dropout helps capture stochastic behavior from demonstration data, which we observed to be consistently better than hand-crafted input noise distributions in our planning problems.

The generator's input is the task-observation encodings ($Z_c$ and $Z_o$) that encode the given neural task representations and scene observation, respectively, and the current $q_{curr}$ and target $q_{targ}$ manifold configurations. The output is the next configuration $\hat{q}_{next}$ on/near the constraint manifold that will take the system closer to the given target, i.e.,

$$\hat{q}_{next} \leftarrow G_\phi(Z_c, Z_o, q_{curr}, q_{targ}) \qquad (6)$$

Given the demonstration trajectories $\sigma^* = \{q^*_0, \cdots, q^*_T\}$ from an oracle planner, we train the generator together with the task and observation encoders in an end-to-end manner using the mean-square loss function, i.e.,

$$\frac{1}{N_B} \sum_{i=0}^{N} \sum_{j=0}^{T_i - 1} \| q_{i,j+1} - q^*_{i,j+1} \|^2, \qquad (7)$$

where $i$ and $j$ iterate over the number of given paths and the number of nodes in each path, respectively, and $N_B$ is the averaging term.

D. Conditional Neural Discriminator

CoMPNetX's discriminator $D_\theta$, with parameters $\theta$, is a deterministic neural model that predicts the distance $d_M \in \mathbb{R}$ of a given configuration $\hat{q}$ from an implicit constraint manifold $M$ conditioned on the task $Z_c$ and observation $Z_o$ encodings, i.e.,

$$d_M \leftarrow D_\theta(\hat{q}, Z_c, Z_o) \qquad (8)$$

CoMPNetX uses the discriminator predictions and their gradients as the operator, named NProj, to project the given configurations to the constraint manifold if their predicted distances are greater than a threshold $\nu$, thus discriminating samples based on their distances from the manifold and fixing them accordingly as,

$$q \leftarrow \hat{q} - \gamma \nabla_{\hat{q}} D_\theta(\hat{q}, Z_c, Z_o), \qquad (9)$$

where $\gamma \in \mathbb{R}^+$ is a hyperparameter denoting a step size.

To train the discriminator network $D_\theta$, we minimize the mean-square loss between its predictions and the true labels. The true labels are the geodesic distances of demonstration trajectories from the constraint manifolds. Furthermore, we introduce a trick to create negative training samples with relatively larger distances from the manifold. The negative training samples comprise the robot configuration from the unconstrained tasks (e.g., reach a given object) and the virtual-link configuration from positive training samples, and their corresponding distances are computed by querying $F$.

Algorithm 5: CoMPNetX (Z_s, v, q_curr, q_targ)
    Z_c ← GetTaskEncoding(Z_s)
    Z_o ← GetObsEncoding(v)
    q̂_next ← G_φ(Z_c, Z_o, q_curr, q_targ)
    d_M ← D_θ(q̂_next, Z_c, Z_o)
    if d_M > ν then
        q̂_next ← q̂_next − γ∇_{q̂_next} D_θ(q̂_next, Z_c, Z_o)
    return q̂_next

E. Neural Samplers

Once trained, CoMPNetX can be used in a number of ways to generate informed neural samples for the underlying SMP algorithms equipped with a constraint-adherence method. Fig. 4 and Algorithm 5 present an overall flow of information between different neural modules of CoMPNetX. For a given current $q_{curr}$ and target $q_{targ}$ configuration(s), CoMPNetX, conditioned on encodings $Z_c$ and $Z_o$, generates the next configuration(s) $\hat{q}_{next}$ and projects them towards the constraint manifold using neural gradients if needed. Thanks to CoMPNetX's informed but stochastic sampling and the built-in parallelization capacity of neural networks, our method can be adapted to most underlying SMP methods. For case studies, we present two sampling strategies named K-Batch CoMPNetX and Bidirectional CoMPNetX, which together cover a wide range of SMP methods.

K-Batch CoMPNetX: Our approach exploits the neural networks' innate parallelization capacity to generate a batch of samples with size $K \in \mathbb{N}_{\geq 1}$ using CoMPNetX for the underlying unidirectional ($K = 1$) and batch ($K > 1$) SMP methods. In this setup, the input to CoMPNetX is in the form of batches of size $K$. The $K$ target configurations $q_{targ}$ are a set of samples from goal region $G_{goal}$. The voxel map $v$ and neural task representation $Z_s$ are simply replicated $K$ times. The $K$ current configurations $q_{curr}$ are obtained by randomly
selecting $K$ nodes in the graph leading to their corresponding next output configurations as follows:

$${}^{K}q_{next} = \text{CoMPNetX}\left({}^{K}Z_s, {}^{K}v, {}^{K}q_{curr}, {}^{K}q_{targ}\right), \quad \text{where } {}^{K}q_{next} = \begin{bmatrix} q^{1}_{next} \\ \vdots \\ q^{K}_{next} \end{bmatrix}, \cdots, {}^{K}q_{targ} = \begin{bmatrix} q^{1}_{targ} \\ \vdots \\ q^{K}_{targ} \end{bmatrix} \qquad (10)$$

Algorithm 6: K-Batch CoMPNetX
    T ← InitializeSMP(q_init, q_goal)
    ᴷq_targ ← KReplicas(q_goal)
    for i ← 0 to N_max do
        if i < N_ismp then
            ᴷq_curr ← SelectNodes(T, K)
            ᴷq_next ← CoMPNetX(ᴷZ_s, ᴷv, ᴷq_curr, ᴷq_targ)
        else
            ᴷq_next ← TraditionalSMP()
        goal_reached ← SMP(ᴷq_next, T)
        if goal_reached then
            σ ← ExtractPath(T)
    if σ is not empty then
        ExecutePlan(σ)
    else
        return Failure or AskExpert
    return ∅

Algorithm 7: Bidirectional CoMPNetX
    t ← 1; p_0 ← input_program
    while not end_of_program do
        x_t, v_t ← GetObservation()
        p_t, Z_s, end_of_program ← NTP2(x_t, x_T, p_{t−1})
        q_init, q_goal ← GetConfigs(p_t, Z_s)
        T_a, T_b ← InitializeBiSMP(q_init, q_goal)
        q^a_curr, q^b_targ ← q_init, q_goal
        for i ← 0 to N_max do
            if i < N_ismp then
                q^a_next ← CoMPNetX(Z_s, v_t, q^a_curr, q^b_targ)
            else
                q^a_next ← TraditionalSMP()
            q^a_next, path_found ← BiSMP(q^a_next, T_a, T_b)
            if path_found then
                σ_t ← ExtractPath(T_a, T_b)
            q^a_curr ← q^a_next
            Swap(T_a, T_b)
            Swap(q_curr, q_targ)
        if σ_t is not empty then
            ExecutePlan(σ_t)
        else
            return Failure or AskExpert
        t ← t + 1
    return ∅
                                                   targ             generate neural task representations and intermediate subtasks
At the beginning of planning, the graph T might have only           for CoMPNetX, which in return accomplishes those subtasks,
one sample, i.e., q init . In that case, an initial set of Kqcurr   forming a mutually symbiotic relationship.
can be created by randomly sampling the manifold Mf ree                In this procedure, CoMPNetX alternatively generates sam-
or replicating q init for K times. Fig. 5 shows a case with         ples for both trees and greedily expands them towards each
K = 2, and Algorithm. 6 presents K-Batch CoMPNetX                   other by having current and target configurations in the oppo-
algorithm with an underlying SMP. This approach is not              site trees (Fig. 6), i.e.,
just for batch sampling methods such as FMT* [11] and                   Forward: q anext ← CoMPNetX Z s , v, q acurr , q btarg
BIT* [9] but can also be applied to any unidirectional
                                                                        Backward: q bnext ← CoMPNetX Z s , v, q bcurr , q atarg
SMP method like RRT [33], [6] and PRMs [32] by setting
K = 1. Furthermore, our procedure shifts to traditional
                                                                    where configurations with superscript a and b corresponds to
sampling techniques, introduced in IMACS [23], after
                                                                    the tree Ta and Tb , respectively.
generating neural informed implicit manifold configurations
using CoMPNetX for Nsmp iterations. This allows our
                                                                       Algorithm 7 presents an overall framework using NTP2 and
framework to explore the entire space in worst-case, leading
                                                                    CoMPNetX with an underlying bidirectional SMP algorithm,
to theoretical guarantees expected from a planning algorithm.
                                                                    like RRTConnect [6], and a constrained-adherence method.
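The K-batch input assembly described for Algorithm 6 (replicating Z_s and v, and pairing K randomly selected tree nodes with K targets) can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the authors' implementation: `toy_generator` is a hypothetical stand-in for the trained CoMPNetX generator, and the helper names `k_replicas`, `select_nodes`, and `k_batch_step` are invented here to mirror KReplicas and SelectNodes.

```python
import numpy as np

def k_replicas(x, k):
    """Stack k copies of x along a new leading batch axis."""
    return np.repeat(np.asarray(x, dtype=float)[None, ...], k, axis=0)

def select_nodes(tree_nodes, k, rng):
    """Randomly pick k current configurations from the tree.
    Sampling with replacement also covers the case where the tree
    holds only q_init (it is then effectively replicated k times)."""
    nodes = np.asarray(tree_nodes, dtype=float)
    return nodes[rng.integers(0, len(nodes), size=k)]

def toy_generator(kz_s, kv, kq_curr, kq_targ, step=0.1):
    """Hypothetical stand-in for CoMPNetX: ignores the encodings and
    moves each current configuration a small step towards its target."""
    return kq_curr + step * (kq_targ - kq_curr)

def k_batch_step(tree_nodes, q_goal, z_s, v, k, rng, generator=toy_generator):
    """One K-batch query: build the four batched inputs, call the generator."""
    kq_targ = k_replicas(q_goal, k)             # K target configurations
    kz_s = k_replicas(z_s, k)                   # task representation, replicated
    kv = k_replicas(v, k)                       # voxel map, replicated
    kq_curr = select_nodes(tree_nodes, k, rng)  # K current configurations
    return generator(kz_s, kv, kq_curr, kq_targ)

rng = np.random.default_rng(0)
q_init, q_goal = np.zeros(3), np.ones(3)
kq_next = k_batch_step([q_init], q_goal, z_s=np.zeros(8),
                       v=np.zeros((4, 4, 4)), k=5, rng=rng)
print(kq_next.shape)  # (5, 3)
```

Setting `k=1` reduces the same call to the unidirectional case, which is the flexibility the text attributes to the K-batch formulation.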
                                                                    NTP2 takes the current environment observation xt , previous
   Bidirectional CoMPNetX: This approach incorporates
                                                                    task program pt−1 , and the desired goal observation xT
Bidirectional SMP (BiSMP) methods into CoMPNetX that
                                                                    and generates the next program pt with their representation
generate bidirectional trees Ta = (V, E) and Tb = (V, E)
                                                                    Z s . The procedure GetConfigs takes the generated task
originating from the start q init and goal q goal configurations,
                                                                    information (pt , Z s ) and obtains their corresponding start
respectively, with vertices V and edges E. Although the
                                                                    and goal configurations. These configurations and task-scene
following approach can be formulated as K-Batch bidirec-
                                                                    representations are given to CoMPNetX-BiSMP to accomplish
tional CoMPNetX, we consider K = 1 and drop down the
                                                                    the given subtask by generating a feasible motion plan.
K notations introduced in the previous section for brevity.
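The role swap performed at the end of each planning iteration in Algorithm 7 can be illustrated with a toy example. The sketch below rests on strong simplifying assumptions that are not part of the paper: it works in an unconstrained Euclidean space, keeps each tree as a simple chain, replaces the learned generator with a hypothetical `toy_generator` that steps towards the target, and omits NTP2, nearest-neighbor extension, and the constrained integrator.

```python
import numpy as np

def toy_generator(q_curr, q_targ, step=0.25):
    """Hypothetical stand-in for CoMPNetX(Z_s, v, q_curr, q_targ)."""
    return q_curr + step * (q_targ - q_curr)

def bidirectional_plan(q_init, q_goal, generator=toy_generator,
                       n_max=100, tol=1e-2):
    tree_a = [np.asarray(q_init, dtype=float)]  # rooted at the start
    tree_b = [np.asarray(q_goal, dtype=float)]  # rooted at the goal
    active, other = tree_a, tree_b
    q_curr, q_targ = active[-1], other[-1]
    for _ in range(n_max):
        q_next = generator(q_curr, q_targ)      # sample for the active tree
        active.append(q_next)                   # toy "extension" step
        if np.linalg.norm(q_next - q_targ) < tol:
            # Trees are connected: start chain followed by reversed goal chain.
            return tree_a + tree_b[::-1]
        # Swap roles: the other tree becomes active, and the node just
        # added becomes the moving target for the next iteration.
        q_curr, q_targ = q_targ, q_next
        active, other = other, active
    return None                                 # no connection within n_max

path = bidirectional_plan([0.0, 0.0], [1.0, 1.0])
print(path is not None, len(path))
```

Because the target of each tree is the most recent node of the opposite tree, both chains move towards each other, which is the "moving targets" behaviour the text describes.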
                                                                       Fig. 6 illustrates the internal process of a BiSMP, such
Furthermore, we also show that our approach can be com-
                                                                    as RRTConnect, using CoMPNetX generated samples. Let’s
bined with learning-based task planners such as NTP2 that
                                                                    assume tree Ta current configuration being used to generate
Constrained Motion Planning Networks X - Ahmed H. Qureshi, Jiangeng Dong, Asfiya Baig and Michael C. Yip - arXiv
the next sample (Fig. 6 (a)). The underlying BiSMP begins              A. Scene setup
by extending Ta towards the next configuration q anext and                We setup the following cluttered environments imposing
updates q anext with the last state reached by constrained             various hard kinematic constraints on the robot motion:
integrator towards the target q btarg (Fig. 6 (b)). The process           Sphere Environment: This environment requires the mo-
then extends Tb towards the q anext and the extension process          tion planning of a point-mass on the sphere with constraint
ends by returning updated q anext and a boolean path found.            F(q) = kqk − 1, forming a two-dimensional manifold on a
The path found is true when trees Ta and Tb are connected,             three-dimensional ambient space. In this setup, we create two
depending on trees’ connection strategy of an underlying               scenarios:
BiSMP, and there exists a path between start and goal that
                                                                          • Scenario 1 - We generate 50 unique scenes by randomly
satisfies all the desired constraints. To solicit bidirectional path
                                                                             placing 500 small obstacle blocks over the sphere (Fig. 1
generation using CoMPNetX, the roles of current and target
                                                                             (a)). For each scene, we randomly sample 2000 start and
configurations are also swapped along with planning trees’
                                                                             goal pairs on the obstacle-free space of the sphere.
roles at the end of each planning iteration (Fig. 6 (c)). Our
                                                                          • Scenario 2 - This setup requires transversing multiple
CoMPNetX-BiSMP quickly finds a path solution by exploiting
                                                                             narrow passages between the randomly selected start and
the moving targets from its own distribution which improves
                                                                             goal configurations (Fig. 1 (b)). We randomly sample
the stability of the generator to find connectable paths as
                                                                             the unique 500 start and goal pairs from the obstacle-
satisfying the two-point boundary value problem becomes
                                                                             free space, each of which constitutes a CMP problem.
easier when the two goal states are iteratively sampled from a
                                                                             This setup is only used to test our model’s generalization
distribution encoded by the generator, rather than one defined
                                                                             capacity, trained on sphere - scenario 1, to an entirely
arbitrarily during the problem definition.
                                                                             different environment.
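For the sphere constraint F(q) = ‖q‖ − 1 used in this environment, projecting a point onto the manifold can be sketched with a Newton-style iteration on the constraint Jacobian, a common choice in projection-based constrained planners. The code below is an illustrative assumption only: it is neither the paper's neural projection operator nor its data-generation code, and the function names and the Gaussian-then-project sampling of start and goal points are hypothetical.

```python
import numpy as np

def constraint(q):
    """F(q) = ||q|| - 1, which is zero exactly on the unit sphere."""
    return np.array([np.linalg.norm(q) - 1.0])

def jacobian(q):
    """dF/dq = q^T / ||q|| (undefined at the origin)."""
    return (q / np.linalg.norm(q)).reshape(1, -1)

def project(q, tol=1e-10, max_iter=50):
    """Newton-style projection: q <- q - J^+(q) F(q) until |F(q)| < tol."""
    q = np.asarray(q, dtype=float).copy()
    for _ in range(max_iter):
        f = constraint(q)
        if np.linalg.norm(f) < tol:
            break
        q = q - np.linalg.pinv(jacobian(q)) @ f
    return q

# Hypothetical sampling of a start/goal pair: draw ambient points and
# project them onto the manifold (obstacle checking is omitted here).
rng = np.random.default_rng(0)
q_start, q_goal = (project(rng.normal(size=3)) for _ in range(2))
print(np.linalg.norm(q_start), np.linalg.norm(q_goal))
```

For this particular constraint a single Newton step already lands on the sphere, since the update reduces to q / ‖q‖; the generic loop is kept so the same sketch applies to other constraint functions.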
   Note that the constraint function F is used only by an
                                                                          Bartender Environment: A dataset, named Bartender en-
underlying SMP method. Furthermore like CoMPNet, CoMP-
                                                                       vironment, containing three different scenarios was created
NetX (batch and bidirectional) also extends the planning graph
                                                                       to fully capture the complexities of the real-world task and
from the nearest node of the newly generated next node since
                                                                       constrained motion planning problems. The environment in-
all underlying SMP algorithms rely on the nearest neighbor
                                                                       cludes two tables placed perpendicular to each other. The
for their graph extension towards the given configuration
                                                                       table contains seven objects placed at random, and only five
sample [4]. It is also in contrast to the MPNet algorithm [15],
                                                                       are movable under pre-specified motion constraints. The five
[16] that greedily finds a path by extending from q curr
                                                                       movable/manipulatable items include a juice can (green), fuze
to q next in an overall planning method and repairs any
                                                                       bottle (purple), soda can (red), kettle, and red mug. The two
non-connectable nodes via stochastic re-planning. Although
                                                                       stationary objects include a tray and a recycling bin that form
the MPNet approach works extremely fast in unconstrained
                                                                       the movable objects’ goal locations. The juice can, soda can,
planning problems, re-planning becomes computationally ex-
                                                                       and fuze bottle are to be placed into the recycling bin with only
pensive in CMP due to projections performed by the con-
                                                                       collision-avoidance constraints. The kettle and the red mug
strained integrator. Moreover, the constraint manifolds are non-
                                                                       are to be placed on the tray with both stability and collision-
euclidean in topography, and extension from nearest neighbors
                                                                       avoidance constraints, i.e., no tilting is allowed during the
becomes convenient for geodesic interpolation. This is evident
                                                                       robot motion. The three different scenarios are described as
from the experimentation in our previous work [22], showing
that leveraging MPNet’s greedy path-finding approach, without
                                                                          • Scenario 1 - In this scenario, the objects can be moved to
replanning, often fails in finding a connectable path solution
on the manifolds. However, in our extended analysis presented                their targets in any order. In other words, in most cases,
in this work, we show that CoMPNetX, in addition to CMP,                     all objects start, and goal configurations are reachable.
can still be used with the MPNet planning algorithm for                      We generate 1833 unique scenes through the random
efficiently solving unconstrained planning problems with low                 placement of the movable and non-movable objects on
computation times and high success rates in high-dimensional                 the tables at the robot’s right arm’s reachable locations.
planning problems.                                                           Each scene contains a total of ten (five unconstrained and
                                                                             five constrained) planning problems.
                                                                          • Scenario 2 - In this scenario, the goal location of either

                 VI. I MPLEMENTATION DETAILS                                 the red mug or the kettle contains an obstacle. The
                                                                             obstacles are formed by placing either juice bottle, fuze
   This section describes the data generation pipeline from                  bottle, or soda can, at the goal location of the kettle or
setting up scenarios to obtaining expert demonstrations and                  the red mug. For example, if the red mug’s goal location
observation data. We also describe training, and testing data                contains the juice bottle, the task planner needs to account
splits for all scenarios considered in this work. Furthermore,               for this information during the planning process. That is,
with this paper, all generated datasets, trained models, and                 the juice bottle needs to be moved into the recycling bin
algorithmic implementations will be made publicly available                  before the red mug is attempted to be moved onto the
on our project website2 .                                                    tray. This enforces a constraint on the task planner to
                                                                             account for obstacles. We created 700 scenes in this setup,
  2                              each with at least two constrained and two unconstrained
planning problems.                                             and kitchen environments leading to voxel maps of dimensions
  •   Scenario 3 - In this setup, the kettle and red mug are         33 × 33 × 33 and 32 × 32 × 32, respectively.
      placed on the tray, and the task is to swap their start
      locations. In other words, the goal locations of both the      D. NTP2: Programs and API Arguments Set
      kettle and the red mug are occupied by the red mug and
                                                                        In our NTP2 setup, the list of initial programs includes
      the kettle, respectively. Therefore, there is a need for a
                                                                     arrange table and swap tray objs. The Bartender (setup 1
      sub-goal generation for one of the objects. For example,
                                                                     and 2) and Kitchen tasks begin with the former, whereas
      the tea kettle should be moved to a temporary location on
                                                                     the Bartender setup 3 begins with the latter program. The
      the table. This is followed by the pick-place of the plastic
                                                                     initial program can call either pick place, subgoal gen,
      mug to its goal location. Finally, the goal location of the
                                                                     return arm, or no op programs followed by their underlying
      tea kettle is now free for its pick-place operation. For
                                                                     API-programs named pick and place. The API-programs pick
      this problem, we created 300 unique cases by random
                                                                     and place represent an unconstrained planning problem, re-
      placement of the tray, and each case contained atleast
                                                                     quiring a robot to reach a given target/object, and a constrained
      six planning problems, i.e., three constrained and three
                                                                     planning problem, demanding manipulation under manifold
                                                                     constraints, respectively. An API-program also gets an argu-
Kitchen Environment: In this scenario we have seven manipu-
                                                                     ment, predicted by the API-decoder, which in our cases, is
latable objects: soda can, juice can, fuze bottle, cabinet door,
                                                                     one of the objects (e.g., juice can, fuze bottle, soda can, etc.)
black mug, red mug, and pitcher. The objective is to move the
                                                                     to be picked or placed in the given scenario. Furthermore, the
cans and bottle to the trash bin, open the cabinet door from any
                                                                     program return arm requires a robot to return to its initial
starting angle to a fixed final angle (π/2.7), transfer (without
                                                                     default configuration from any starting state, and the program
tilting) the black and red mugs from the cabinet to the tray, and
                                                                     no op means no operation needed. Finally, the subgoal gen
move the pitcher from the table into the cabinet. We construct
                                                                     is executed to move objects acting as obstacles out of the way
1633 unique scenarios by the random placement of the trash
                                                                     through pick-place procedures to achieve the desired sub-task.
bin, tray, and manipulatable objects (excluding door) on the
table and by randomly selecting the cabinet’s door starting
                                                                                              VII. R ESULTS
angle between 0 to π/4. Each scenario contains a total of 14
planning problems, i.e., seven unconstrained (reach) and seven          In this section, we present the results and analysis of the fol-
constrained (manipulation) problems.                                 lowing evaluation studies: (i) A comparison study evaluating
                                                                     CoMPNetX and state-of-the-art classical SMP planning meth-
B. Training & testing data splits                                    ods with an underlying constraint-adherence approach (projec-
                                                                     tion, atlas, or tangent bundle) on unseen challenging problems
   In the sphere (scenario 1), we use 40 environments for            in environments named Sphere, Bartender, and Kitchen. (ii) An
training and 10 for testing. The sphere (scenario 2) is used         ablation study comparing CoMPNetX with its ablated models
for testing only. In the bartender (scenario 1), and kitchen         and our previous method CoMPNet [22]. (iii) An extended
environments, we use 10% data for testing, and the remainder         evaluation to highlight the mutualistic relationship of learning-
is used for training. All training paths were generated by an        based task programmers and CoMPNetX and their capacity to
oracle planner, i.e., Atlas-RRTConnect. To train neural task         generalize across different planning domains.
programmer on all bartender (scenarios 1, 2 & 3) and kitchen
environments, we use the same data split ratio, i.e., about 5%
is kept for testing. Note that the CoMPNetX is never trained         A. Comparative analysis
on the sphere (scenario 2) and Bartender scenarios 2 and 3.             This section compares SMP methods augmented with batch
We use them to evaluate CoMPNetX generalization capacity             and bidirectional CoMPNetX against their classical setups
across different environment structures and planning problems.       in solving CMP problems. In the batch method, we select
                                                                     FMT* [11], a state-of-the-art classical SMP algorithm, and
C. Observation data                                                  it is proven to perform better than standard approaches like
                                                                     RRT* and PRM* [7]. FMT* begins with an initial batch
   In the sphere environment, the observation data is a point-
                                                                     of Ninit uniform samples, including a goal configuration.
cloud converted into a voxel map of size 40×40×40. However,
                                                                     In case, the initial set of samples does not yield a path
for the other high-dimensional robot environments (Bartender
                                                                     solution, FMT* continues to expand the tree by generating
and Kitchen), there exist workspace and entire scene observa-
                                                                     a new random sample in every planning iteration. We choose
tions at any time instant t. The workspace observation includes
                                                                     FMT* to highlight the flexibility offered by CoMPNetX in
the current xt ∈ X and the target xT ∈ X . The current
                                                                     generating sample batches with different K ≥ 1 according to
workspace observation xt at a given time is represented by
                                                                     the given SMP method. In CoMPNetX-FMT*, we generate an
each objects’ poses and the robot end-effector pose. The target
                                                                     initial batch of Ninit samples with K